YANGHONG

Dell PowerEdge R910 maintenance

Introduction

It can be tiring to focus on one thing for too long. I can feel my body physically resisting the code I have spent so many hours on. So today I spent the whole day on something seemingly boring, but at least different from my previous work.

Google is always my (and our) best friend. If I have left out a detail in the following sections, you can leave a message below or email me.

Update BIOS and firmware in Debian

The default Linux distribution supported by Dell is RHEL, so the update process on Debian/Ubuntu can be a little different. In my case, the server is a PowerEdge R910. The target BIOS is PowerEdge R910 BIOS 2.10.0, and the target PERC firmware is DELL PERC H700 Int v12.10.6-0001,A12.

PERC firmware update

Reference: DELL PERC Firmware Upgrade on Debian amd64

The first thing to do is replace dash with bash as the default shell /bin/sh. Otherwise you will run into errors like "typeset is not found" or an automatic reboot.
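Before running the updater, it can help to check which shell /bin/sh actually resolves to. The following is a minimal sketch; the helper name sh_target is made up for illustration and is not part of any Dell or Debian tool.

```shell
# Print the name of the program that a path resolves to (default: /bin/sh).
# sh_target is a hypothetical helper name used only in this sketch.
sh_target() {
    basename "$(readlink -f "${1:-/bin/sh}")"
}

if [ "$(sh_target)" = "bash" ]; then
    echo "/bin/sh is bash"
else
    echo "/bin/sh is $(sh_target); consider running: sudo dpkg-reconfigure dash"
fi
```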

Here is the list of commands:

sudo dpkg-reconfigure dash      # choose No, make bash the default /bin/sh
sudo aptitude install rpm libstdc++5
sudo ./SAS-RAID_Firmware_C3X7D_LN_12.10.6-0001_A12.BIN --extract /tmp/update
cd /tmp/update
rpm2cpio srvadmin-storelib-sysfs-7.2.0-4.1.1.el4.x86_64.rpm | sudo cpio -idmv
sudo mkdir -p /opt/lsi/3rdpartylibs/x86_64      # the target directory may not exist yet
sudo cp -a /tmp/update/opt/lsi/3rdpartylibs/x86_64/libsysfs.so* /opt/lsi/3rdpartylibs/x86_64
echo /opt/lsi/3rdpartylibs/ | sudo tee -a /etc/ld.so.conf.d/dellfw.conf
echo /opt/lsi/3rdpartylibs/x86_64 | sudo tee -a /etc/ld.so.conf.d/dellfw.conf
# update dynamic linker cache
sudo ldconfig
cd -
sudo ./SAS-RAID_Firmware_C3X7D_LN_12.10.6-0001_A12.BIN

Follow the instructions and everything should be OK.
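If the updater still complains about a missing library, one thing worth checking is whether libsysfs made it into the dynamic linker cache after ldconfig ran. A small sketch, assuming the usual "ldconfig -p" output format; the helper name lib_in_cache is hypothetical.

```shell
# Succeed if the given library name appears in `ldconfig -p` style output on stdin.
# lib_in_cache is a hypothetical helper name used only in this sketch.
lib_in_cache() {
    grep -q -F -- "$1"
}

# Only query the cache when ldconfig is on PATH (on Debian it lives in /sbin).
if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p | lib_in_cache libsysfs.so; then
        echo "libsysfs.so is in the linker cache"
    else
        echo "libsysfs.so is not in the linker cache"
    fi
fi
```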

BIOS update

BIOS update is simpler.

Reference: R610 and R710 fail to install BIOS update, CentOS 5

Follow the instructions in the installation program until you lose patience with the screen full of dots. So WTF is the installation program doing?

Actually, it is stuck in the following process:

/tmp/R910_BIOS_GX7WN_LN_2.10.0.BIN-9979-22192/./UpdRollBack --depcheck--

To avoid the hang, as well as a failure caused by "Unable to get the System Generation", one solution is to add a file that tricks the installation program. Put the following string in /etc/redhat-release:

Red Hat Enterprise Linux Server release 6.3 (Santiago)
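Creating that file could look like the sketch below. The helper name write_fake_release is made up for illustration, and refusing to overwrite an existing file is an added precaution, not something the installer requires. Since other software may read /etc/redhat-release to detect the distribution, it is probably wise to delete the file once the update is done.

```shell
# Write the RHEL release string to a file (default: /etc/redhat-release).
# write_fake_release is a hypothetical helper name used only in this sketch.
write_fake_release() {
    target="${1:-/etc/redhat-release}"
    # Refuse to clobber a file that is already there.
    if [ -e "$target" ]; then
        echo "$target already exists, leaving it alone" >&2
        return 1
    fi
    echo 'Red Hat Enterprise Linux Server release 6.3 (Santiago)' > "$target"
}

# Usage, as root:
#   write_fake_release
# After the BIOS update has finished:
#   rm /etc/redhat-release
```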

To be honest, I don't know how I fixed this. But after searching and some trial and error, I finally got the BIOS updated successfully, once, when I pressed Ctrl+C within 10 seconds of the update process starting. It was definitely a wrong decision, but I was lucky. The machine took about 10 minutes to reboot after the update, and I thought it must be dead.

Set Dell OEM in iDRAC using ipmi

According to http://sourceforge.net/projects/ipmitool/files/ipmitool/1.8.13/, a new sysinfo interface was added in ipmitool 1.8.13, but the version in stable (wheezy) is 1.8.11.

Install ipmitool

So the first thing to do is install a newer version of ipmitool. Make sure you have set up the testing (jessie) repository and the pin priorities correctly.

$ apt-cache policy ipmitool
ipmitool:
  Installed: 1.8.11-5
  Candidate: 1.8.11-5
  Version table:
     1.8.14-4 0
        750 http://ftp.us.debian.org/debian/ jessie/main amd64 Packages
 *** 1.8.11-5 0
        995 http://ftp.us.debian.org/debian/ stable/main amd64 Packages
        100 /var/lib/dpkg/status
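The priorities in the policy output above (995 for stable, 750 for jessie) could come from pin stanzas in /etc/apt/preferences like the ones this sketch prints. The two numbers are taken from that output; the pin expressions a=stable and n=jessie and the helper name pin_stanza are assumptions for illustration, and a matching jessie line in /etc/apt/sources.list is needed as well.

```shell
# Print one APT preferences stanza that pins every package of a release.
# pin_stanza is a hypothetical helper name used only in this sketch.
#   $1: release selector, e.g. a=stable (archive) or n=jessie (codename)
#   $2: pin priority
pin_stanza() {
    printf 'Package: *\nPin: release %s\nPin-Priority: %s\n' "$1" "$2"
}

# Stanzas are separated by a blank line in /etc/apt/preferences.
pin_stanza a=stable 995
echo
pin_stanza n=jessie 750
```

Because stable keeps the higher priority, packages only come from jessie when requested explicitly, as in the install command that follows.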

Install version 1.8.14 from the jessie repo.

sudo apt-get install ipmitool/jessie

After this, you should be able to run commands with ipmitool. The following command uses the OpenIPMI kernel interface to read the management controller data.

$ sudo ipmitool -I open mc getsysinfo os_name
Linux

Get exchange-bmc-os-info script

Reference: Setting iDRAC OS Information with IPMI on Ubuntu Server

Download the init script here: exchange-bmc-os-info

By default, the kernel sets default values in iDRAC through IPMI. This script collects distribution and kernel information, mainly from /etc/os-release, uname, and /etc/lsb-release, and then uses ipmitool setsysinfo to change the values in iDRAC.
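To make the idea concrete, here is a rough sketch of the same approach. It is not the real exchange-bmc-os-info script: the function names are made up, and it only sets os_name, the one parameter queried earlier in this post.

```shell
# Print PRETTY_NAME from an os-release style file (default: /etc/os-release).
# The file is sourced in a subshell so its variables do not leak into the caller.
os_pretty_name() {
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$PRETTY_NAME" )
}

# Hypothetical wrapper: push the distribution name to the BMC.
# Needs root and an ipmitool with the sysinfo interface (1.8.13 or newer).
set_bmc_os_name() {
    ipmitool -I open mc setsysinfo os_name "$(os_pretty_name)"
}

# Usage, as root:
#   set_bmc_os_name
#   ipmitool -I open mc getsysinfo os_name
```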

Bugs

If you have installed "Dell OpenManage Server Administrator", then it is possible that this script won't work.

The reason is:

  • At boot, the kernel loads the ipmi_si module for the first time and creates a kernel thread, kipmi0. At this point the exchange-bmc-os-info script is able to change the BMC values.
  • OMSA's init script /etc/init.d/dataeng then tries to unload and reload the ipmi_si module through the script /etc/init.d/instsvcdrv. I still can't figure out how OMSA uses this script, but a simple print test shows that something invokes it in a way similar to

    /etc/init.d/instsvcdrv disablethread
    

So the second load of the kernel module resets the values previously set by exchange-bmc-os-info.
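When debugging this, a quick check of whether ipmi_si is currently loaded can be handy, for example before and after the OMSA services start. A minimal sketch, assuming the module is built as a loadable module (a driver compiled into the kernel does not show up in /proc/modules); the helper name module_loaded is hypothetical.

```shell
# Succeed if the named module is listed in a /proc/modules style file.
# module_loaded is a hypothetical helper name used only in this sketch.
#   $1: module name
#   $2: file to read (default: /proc/modules)
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

if module_loaded ipmi_si 2>/dev/null; then
    echo "ipmi_si is loaded"
else
    echo "ipmi_si is not listed (not loaded, or built into the kernel)"
fi
```

Reading os_name again with ipmitool at the same two points shows whether the value survived the reload.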

My solution is to remove the operations for disablethread. Change the file /etc/init.d/instsvcdrv as follows:

disablethread)
        # instsvcdrv_openipmi_force_thread disable $2
        # EXIT_STATUS=$?
        EXIT_STATUS=0
        ;;

More problems and solutions can be found in OMSA 6.3.0 on Ubuntu 10.04 LTS (Lucid).
