Installing to the dl360 over ethernet

Synopsis: Instructions for installing Debian on the Compaq dl360 g3 1U server with kernel 2.6, and without using a keyboard, mouse, monitor, floppy, CD, MSDOS, or MSIE. Probably applicable to a lot of the other dl3xx series.

These instructions are for professionals, so don't expect any candy coating. Maybe the next person to use these instructions can use some of the time saved to make the docs a bit more helpful and up-to-date. These instructions are for a slightly older version of Debian before the Debian Installer system was retooled. If you know what you are doing enough to diverge from this recipe, you can probably save yourself a significant number of steps by using the newer installer, which is more supportive of being installed over the network, not to mention including better support for initial installs on reiserfs and ext3. Keeping fbcon support from loading is likely to be hairy in that case, though.

Materials needed:

Two ethernet switchports, at least one, preferably both, 100Mbps. They should be on a switch known not to have autonegotiation issues.
A DHCP server on a separate box
A TFTP server on a separate box (stock tftpd won't do, instead use atftpd or tftpd-hpa)
An NFS server on a separate box
A separate box to compile an i386 kernel on.

... all my servers were Debian based, YMMV.

Step 1: Getting access to the iLOM and ROM interfaces.

Choose an IP address for the iLOM and another one for the server itself.

Connect iLOM port to ethernet (100 Mbps, 10/half doesn't work too well.) Find the MAC from your DHCP logs and add a DHCP config for it. Note the iLOM DHCP client is broken and will get a classful netmask regardless of the netmask you actually hand it.
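To see which netmask the iLOM will end up with, the classful rule can be sketched in shell. This is only an illustration of the traditional class A/B/C rule keyed on the first octet; the function name classful_mask is made up for this example and is not part of any iLOM tool.

```shell
# classful_mask IP: print the netmask the old class A/B/C rule implies.
# Hypothetical helper; only the first octet of the address matters.
classful_mask() {
    first=${1%%.*}
    if [ "$first" -lt 128 ]; then
        echo 255.0.0.0          # class A
    elif [ "$first" -lt 192 ]; then
        echo 255.255.0.0        # class B
    else
        echo 255.255.255.0      # class C (multicast/reserved ranges not handled)
    fi
}
```

For example, "classful_mask 10.1.2.3" prints 255.0.0.0, so an iLOM given a 10.x address on a /24 will still believe the whole 10/8 is local.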

Point mozilla at the iLOM IP over https and ignore the MSIE warnings. Log in as Administrator with the password from the label, which I think is usually stuck to the unit itself, or at least comes packed with it. You can change the password via this interface if desired.

Set the Remote Console hotkeys to T=F8, U=F9, V=F10, W=CTRL-ALT-DELETE, X=F12, Y=F3 and save them.

Telnet to the IP address, preferably from a machine with 100Mbps connectivity. The username and password are the same as above. Note the password in this case will be sent in the clear. You should get a screen saying server is off, unless of course you have turned it on.

Hit the virtual power button using the WWW browser and watch the telnet session attentively.

There are two opportunities to hit F8 (CTRL-T), followed by a third opportunity to hit F9 (CTRL-U), F10 (CTRL-V), or F12 (CTRL-X).

First: F8 (CTRL-T) -- gets into ORCA (RAID configurator)

Second: F8 (CTRL-T) -- gets into iLOM setup (here can lock down iLO NIC stuff)

Third, one of:
F9 (CTRL-U) -- RBMS (PC-like ROM BIOS)
F10 (CTRL-V) -- System Maintenance Menu (RBMS and DIAGS)
F12 (CTRL-X) -- "PXE" boot (boot from ethercard)

Step 2: Apply some BIOS/ROM settings

First go into RBMS and do the following:

System Setup set OS Selection to Linux
Disable Diskette boot (optional)
Disable Virtual Install Disk (optional)
Boot order set disk, cd, floppy, net
Write down the MAC address of the first ethernet card.

Also get into ORCA and set up a logical volume. This may be already automatically done for you. You'll have no choice but to use the whole of each disk -- what I did was create a RAID-1 volume; later we can trick things to downsize it, as long as all partitions created on it are at the top. If you do this, then you have to be careful and leave a swap partition or free space or a /boot partition at the top of the drive. Sorry I don't know exactly how much gets destroyed but a 64M partition at the top kept my reiserfs root partition safe from damage.

An alternate way of doing it, though, is to delete any volume that may have been autocreated (assuming you have no data on the disk yet that you want to keep) and recreating it as a RAID-0 array that only has one disk in it.

The first way saves you some time-consuming data copying and provides a bit of a safety net in case something doesn't go quite right, the second way saves you a couple steps/reboots later on. Of course if you are lucky enough to have more than two hard drives, e.g. a dl380, then you have a lot more shoulder room to fiddle with RAID later.

Step 3: Unpack and export the installer via NFS

Get the root.bin file from the woody disks-i386 area of the Debian archive

mkdir /path/to/nfs/exports/dl360
cp root.bin debroot.gz
gunzip debroot.gz
mount -t ext2 -o loop debroot /floppy/
mv /floppy/* /path/to/nfs/exports/dl360/
umount /floppy
rm debroot
cp drivers.tgz basedebs.tar /path/to/nfs/exports/dl360/
mkdir /path/to/nfs/exports/dl360/images-1.44
cp rescue.bin /path/to/nfs/exports/dl360/images-1.44/

As said above, choose a *different* IP address than the one you gave iLOM, and set up a DHCP entry for the server's first ethernet port using the MAC address you took down in step 2. Restart the dhcp server to load the new config.

Edit /etc/exports to allow this IP address to access the above dl360 directory over NFS, e.g.:

echo '/exports/dl360 X.X.X.X/255.255.255.255(rw,no_root_squash)' >> /etc/exports
/etc/init.d/nfs-user-server reload  # or nfs-kernel-server if you use that

(...place the IP address you chose where the X's are above)

Step 4: Build a 2.6 series kernel.

I assume you know how to build kernels from Debian source or from a kernel.org tarball. Well in this case you want to configure the kernel but then do a normal "make bzImage" rather than a make-kpkg.

The important options, for the purposes of the initial install, are:

build in the cciss driver for the Smart Array 5i (note *not* an SCSI driver).
build in input, build in keyboard, build in serio, build in i8042 keyboard
drivers/char build in virtual terminal and console on virtual terminal
You *must* turn OFF framebuffer support
build in VGA text console.
build in the tg3 gigabit ethernet driver
build in nfs client and nfsroot support
If you want reiserfs like I do, build it in, as well as ext2 for /boot

...the rest of the config should be common sense. The tricks are that iLOM won't display anything if you allow the vga mode to be tickled, and the iLOM emulates an i8042 keyboard, so we have to pretend to have one for now.
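A quick sanity check of a finished .config against the list above might look like the following sketch. The CONFIG_ symbol names are my assumption of how those options are spelled in 2.6 trees (they are not given in this document), so check them against your own Kconfig; the function name is made up.

```shell
# check_dl360_config FILE: report options from the list above that look wrong.
# Symbol names are assumed 2.6-era spellings; verify against your kernel tree.
check_dl360_config() {
    cfg=$1
    status=0
    for sym in BLK_CPQ_CISS_DA INPUT INPUT_KEYBOARD KEYBOARD_ATKBD SERIO \
               SERIO_I8042 VT VT_CONSOLE VGA_CONSOLE TIGON3 IP_PNP NFS_FS ROOT_NFS; do
        # "=y" means built in; "=m" (module) is not good enough here.
        if ! grep -q "^CONFIG_${sym}=y\$" "$cfg"; then
            echo "not built in: CONFIG_${sym}"
            status=1
        fi
    done
    # Framebuffer support must be off entirely, neither built in nor modular.
    if grep -q '^CONFIG_FB=[ym]$' "$cfg"; then
        echo "must be disabled: CONFIG_FB"
        status=1
    fi
    return $status
}
```

Run it as "check_dl360_config .config" from the kernel source directory; no output means the listed options are set as described. Filesystem options (reiserfs, ext2) are left out since they depend on your layout.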

After you "make bzImage", find the file bzImage and copy it to your TFTP directory as /path/to/tftp/kern26. Before you do so, you may want to use rdev to set the root device to 104,2, because the kernel boot parameters occasionally get broken between kernel versions WRT cciss.
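The 104,2 pair is the block major and minor number of /dev/cciss/c0d0p2, and the same numbers written in hex are what a numeric root= boot parameter expects. A small sketch of the arithmetic, assuming controller 0 (major 104) and 16 minor numbers per logical drive; the function name cciss_root is hypothetical:

```shell
# cciss_root DISK PART: print the hex root= value for /dev/cciss/c0d<DISK>p<PART>.
# Assumes controller 0 (block major 104) and 16 minors per logical drive.
cciss_root() {
    printf '%02x%02x\n' 104 $(( $1 * 16 + $2 ))
}
```

"cciss_root 0 2" gives the value for the second partition on the first logical drive, and "cciss_root 1 2" the same partition on the second logical drive.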

Step 5: Prepare pxelinux TFTP boot glue.

The following assumes you do not already use pxelinux, and so we are going to define a default pxelinux config. If you already have pxelinux images defined you can create a configuration specific to the new server by following the instructions in the syslinux documentation.
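For reference, the per-host lookup described in the syslinux documentation uses the client's IP address written in uppercase hex, dropping one digit at a time, before falling back to "default". The sketch below lists those names for a given address; newer pxelinux versions also try a MAC-based name first, which is not covered here. The function name is made up for illustration.

```shell
# pxe_cfg_names IP: list the IP-derived config file names pxelinux tries,
# most specific first, ending with "default".
pxe_cfg_names() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the dotted quad into $1..$4
    IFS=$old_ifs
    name=$(printf '%02X%02X%02X%02X' "$1" "$2" "$3" "$4")
    while [ -n "$name" ]; do
        echo "$name"
        name=${name%?}   # drop the last hex digit
    done
    echo default
}
```

So a server-specific config would be a file in pxelinux.cfg/ named after the first line of "pxe_cfg_names X.X.X.X" for that server's address.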

On the tftp server:

apt-get install syslinux
apt-get install atftpd # or tftpd-hpa, vanilla tftpd won't work

Find the file pxelinux.0 in the syslinux package and copy that file to your TFTP directory (/tftpboot/ on unconfigured systems). In general, keep your tftp pathnames short and simple since they could get munged or clipped. Also make sure any TFTP files you create are world readable and subdirectories are world readable and world executable.
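The permission requirement is easy to get wrong, so here is a sketch of a check using plain find. The function name is hypothetical; it prints anything under the given directory that the TFTP daemon would likely not be able to serve.

```shell
# tftp_perm_problems DIR: print files that are not world readable and
# directories that are not world readable and executable. No output means OK.
tftp_perm_problems() {
    find "$1" \( -type f ! -perm -004 -o -type d ! -perm -005 \) -print
}
```

Run it as "tftp_perm_problems /tftpboot" after copying in pxelinux.0, the kernel and the pxelinux.cfg directory, and chmod whatever it lists.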

In the TFTP directory mkdir pxelinux.cfg

cat > pxelinux.cfg/default

LABEL nfs
  KERNEL kern26
  APPEND root=/dev/nfs nfsroot=/path/to/nfs/exports/dl360/ ip=XXX.XXX.XXX.XXX:YYY.YYY.YYY.YYY:GGG.GGG.GGG.GGG:MMM.MMM.MMM.MMM:phn0rd:eth0:off

LABEL disk1
  KERNEL kern26
  APPEND root=6802

LABEL disk2
  KERNEL kern26
  APPEND root=6812
^D

Where:

X's are the new IP address chosen in step 3
Y's are the IP address of the NFS server
G's are the router gateway address
M's are the netmask
"phn0rd" is garbage filler text.

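The ip= string is easy to mistype, so it can help to assemble it from its parts. The sketch below follows the field order used in the config above (client, NFS server, gateway, netmask, filler, device, autoconf off); the kernel's nfsroot documentation describes the fifth field as a hostname. The function name and the default values are only for illustration.

```shell
# nfs_ip_param CLIENT SERVER GATEWAY NETMASK [HOSTNAME] [DEVICE]
# Prints an ip= kernel parameter with autoconfiguration turned off.
nfs_ip_param() {
    printf 'ip=%s:%s:%s:%s:%s:%s:off\n' \
        "$1" "$2" "$3" "$4" "${5:-phn0rd}" "${6:-eth0}"
}
```

The output can be pasted after nfsroot= on the APPEND line of the "nfs" label.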
Step 6: Booting

Next we need to connect the first onboard ethernet if we have not yet. 10/half will work here but we will be NFS mounting the install disks so faster is better. On some autonegotiating switches I had some trouble getting DHCP to work; you may have to lock the speed or duplex on the switch end.

Above you will have written down the MAC address from RBMS; use that to create a new DHCP entry assigning the server its IP address. Add the "next-server" option to send it to your TFTP server, and feed it the "filename" option /path/to/tftp/pxelinux.0
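If your DHCP server is ISC dhcpd (an assumption; this document does not say which server was used), the entry is a host stanza in dhcpd.conf. The sketch below just prints such a stanza so it can be reviewed and pasted in; the function name, host name and addresses are placeholders.

```shell
# dhcp_host_stanza NAME MAC IP [TFTP_SERVER BOOTFILE]
# Prints an ISC dhcpd host stanza; the PXE options are added only when given.
dhcp_host_stanza() {
    echo "host $1 {"
    echo "  hardware ethernet $2;"
    echo "  fixed-address $3;"
    if [ -n "$4" ]; then
        echo "  next-server $4;"
        echo "  filename \"$5\";"
    fi
    echo "}"
}
```

For example, "dhcp_host_stanza dl360 00:11:22:33:44:55 192.0.2.10 192.0.2.2 pxelinux.0" prints an entry with both PXE options, while leaving off the last two arguments gives a plain fixed-address entry like the one needed for the iLOM port. Whether the filename needs a leading path depends on how your TFTP daemon is rooted.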

Reopen the SSL connection to iLOM and hit the virtual power button.

Now watch your daemon.log(s). The following should happen/have happened:

DHCP assigns the dl360 its IP address
The dl360 TFTPs pxelinux.0
The dl360 tries to TFTP a bunch of filenames from pxelinux.cfg
The dl360 TFTPs pxelinux.cfg/default (boot: prompt here, see below.)
The dl360 TFTPs kern26
The dl360 NFS mounts the /exports/dl360/ directory.

If you still had the telnet session open you will see pxelinux boot. If it has closed, reopen it, and you should be at (or in the process of getting to) a "boot:" prompt where you can type "nfs" to get into the Debian installer menu system. If it hangs at booting from drive C, you need to force a PXE boot by hitting F12(CTRL-X) at the right moment. You can reboot again as above with the virtual power button, or in certain circumstances by using CTRL-ALT-DEL(CTRL-W)

The Debian installer is a bit cranky about being run over NFS so you just have to wait 20 seconds or so every time the system state is being determined and ignore the complaint about the "root fs". Also I have found the "configure network" menu item causes weird mojo to happen with my particular kernel revision, so avoid that.

You can do all the normal stuff -- the boot disks should recognize /dev/cciss/c0d0 as a partitionable block device. As said earlier, if you do not want your whole disk used as RAID1 you should only use partitions at the top of the disk, and you should put swap or an easy-to-reconstruct /boot partition, or just free space, at the top of the drive. Remember exactly what you do here; you will need to recreate the partition table when we start abusing the RAID configuration.

Step 7: Upgrade ROMs (optional)

Before going further we might want to take the time to freshen up the BIOS ROMS. The ROM upgrades can be fetched from HP/Compaq's site as an .scexe file.

The mainboard ROM can be upgraded as such:

On the NFS server, copy the .scexe file to the dl360 directory and perform the following. Look for the number they set _SKIP to at the top of the .scexe file (it is a shell script) and substitute it for _SKIP below:

mkdir mainrom
cd mainrom
tail -n +_SKIP CP00XXXX.scexe | tar zxv
chmod 755 cpqsetup
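The _SKIP lookup can also be scripted. This sketch assumes the script header contains a line of the form _SKIP=<number>, which is how I read the description above; both function names are made up, and the archive name stays whatever HP/Compaq called it.

```shell
# scexe_skip FILE: print the number assigned to _SKIP in the script header.
# Assumes a header line of the form _SKIP=<number>.
scexe_skip() {
    LC_ALL=C sed -n 's/^_SKIP=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# scexe_unpack FILE DIR: unpack the gzipped tar payload of FILE into DIR.
scexe_unpack() {
    skip=$(scexe_skip "$1")
    [ -n "$skip" ] || { echo "no _SKIP line found in $1" >&2; return 1; }
    mkdir -p "$2" || return 1
    # The payload starts at line number _SKIP of the file.
    tail -n +"$skip" "$1" | tar -xzf - -C "$2"
}
```

Usage would be along the lines of "scexe_unpack CP00XXXX.scexe mainrom", followed by the chmod shown above.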

In the iLOM telnet session execute a shell:

cd mainrom
./cpqsetup
./cpqsetup

The first cpqsetup upgrades the ROM, the second upgrades it a second time to force the new BIOS into the backup ROM. If you are unsure about the new ROM image you may want to play it safe and reboot once with the new ROM before running the second upgrade.

If you're lucky and there are no problems mixing the boot-floppies ld-linux.so with any available i386 libs you may have on hand, you can upgrade the Smart Array ROM as such:

cd /exports/dl360
mkdir sa5i
cd sa5i
cp /path/to/i386/libs/libc.so.6 .
cp /path/to/i386/libs/libm.so.6 .
cp /path/to/i386/libs/libpthread.so.0 .
tail -n +_SKIP CP00XXXX.scexe | tar zxv
chmod 755 cpqsetup

In the iLOM telnet session execute a shell:

cd sa5i
LD_LIBRARY_PATH=. ./cpqsetup --log foo.out
exit

Under certain kernel versions, you must have the first logical volume on the first disk group defined in order for this to work (the utility uses IOCTLs on /dev/cciss/c0d0 I think.) This didn't use to be the case and may have been fixed by 2.6.

See the section on extracting offline SoftPaqs below if you want to upgrade the iLOM ROM. This is done through the iLOM HTTPS interface. So far I have managed to extract the image from the softpaq but mozilla doesn't work right (sends file then just hangs.) I have not tried MSIE; I'm still looking for an OpenSource solution.

Step 8: Trick to get reiserfs on root (optional)

I usually set up a small ext2 /boot/ partition and then a reiserfs partition. Note if you want to RAID-0 parts of the disk you need to put both these at the top of the disk and leave room for RAID-0. The boot disks find /dev/cciss/c0d0 for cfdisk OK, which is your first unpartitioned logical volume on your first controller, but remember X in the c0d0pX files is 1-based like IDE and SCSI are, so the first partition is c0d0p1, not c0d0p0.

What worked for me is to do this on the NFS server (assuming it is an x86):

mkdir /exports/dl360/reiser
cp /sbin/mkreiserfs /exports/dl360/reiser/
cp /path/to/i386/libs/libc.so.6 /exports/dl360/reiser/
cp /path/to/i386/libs/libm.so.6 /exports/dl360/reiser/
cp /path/to/i386/libs/libpthread.so.0 /exports/dl360/reiser/

Then execute a shell on the server and:

cd reiser
LD_LIBRARY_PATH=. ./mkreiserfs -v 3.6 /dev/cciss/c0d0p2
exit

...but it seems that if your packages are very fresh then there are some problems with the boot-floppies ld-linux.so if you do this. I believe someone has boot disks premade to make this easy -- try bins from those disks, or if you've been a good boy and read through these instructions first before applying them, you'll know to use a reiserfs disk set in the initial NFS unpack above.

Once you've formatted the filesystem you can mount it by hand or do it through the Debian menu system; either will work as long as the running kernel has reiserfs support.

Step 9: Get a hard-disk based system running.

Above, we loaded the necessary files into our NFS server to perform the base install over NFS. You could also try HTTP. You know the ropes from here.

You'll have to fudge the kernel part, as the install script will put a stock 2.4 kernel in. It is best to do this as I think the next step of the install doesn't go quite right if you do not. For now we will still be booting the kernel via the network, but we will mount the root filesystem from the RAID array. All we need to do is change to typing "disk1" at the boot: prompt, to use the entry that we put in pxelinux.cfg/default a few steps back.

So reboot the system and be ready to force a PXElinux boot with F12(CTRL-X) and then type "disk1" to boot with your new system as root. Now making things boot is simply a matter of installing a kernel-image package or a raw bzImage and configuring lilo.
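As a starting point for the lilo side, a minimal lilo.conf could look like what the sketch below prints. The keywords are standard lilo.conf ones, but the device and kernel paths are assumptions based on the layout used in this document, and whether your lilo version copes with cciss device names is something to check before relying on it. The function name is made up.

```shell
# lilo_conf_sketch ROOTPART KERNEL: print a minimal lilo.conf to stdout.
# boot= assumes the boot loader goes on the first logical volume.
lilo_conf_sketch() {
    cat <<EOF
boot=/dev/cciss/c0d0
root=$1
image=$2
    label=Linux
    read-only
EOF
}
```

Something like "lilo_conf_sketch /dev/cciss/c0d0p2 /boot/vmlinuz > /etc/lilo.conf" followed by running lilo would be the usual sequence; review the file first.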

Step 10: Reconfiguring RAID

Now let's say we want to shrink our raid1 volume and use the rest of the disk space as RAID0, which is very likely. You cannot do that through ORCA, you have to run the Compaq ACU utility. Unfortunately I was not able to get this utility to be happy running under the boot disk environment -- otherwise we might be able to avoid a lot of these steps. You get the ACU through the website; it comes as an rpm package. So install the following packages using alien (they may have newer package versions of course by the time you read this doc):

apt-get install alien rpm
alien --to-deb cpqacuxe-6.40-11.0.i386.rpm
dpkg -i cpqacuxe_6.40-12_i386.deb

cpqacuxe is linked to a specific version of libstdc++, which is not available in Debian. However, it seems to be happily fooled by feeding it one of Debian's:

ln -s /usr/lib/libstdc++-libc6.2-2.so.3 /usr/lib/libstdc++-libc6.1-1.so.2

This package needs to be kept at bay since it wants to run some sort of web server interface -- if you always run it with either the -c or -i options, though, the server will not remain running. When first run, it will ask for some sort of password. I don't know where or how it stores this.

If you are currently stuck with a RAID-1 volume you have to jump through some hoops. If above you opted to create a single-disk RAID-0 volume then you can skip a few paragraphs down.

Gather any data you need to recreate your partition table -- either dd it to the NFS area or remember all the entries you put into cfdisk. As said above some data at the beginning of the disk will get damaged, including some of the first partition. I don't know how much.
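One way to do the dd part is to save just the first sector of the logical volume, which holds the primary partition table. This is a sketch with a made-up function name; note that logical partitions inside an extended partition are described elsewhere on the disk and are not captured by it.

```shell
# save_mbr DEVICE FILE: copy the first 512-byte sector of DEVICE (boot code
# plus the primary partition table) into FILE.
save_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}
```

From the NFS-booted installer shell, something like "save_mbr /dev/cciss/c0d0 /c0d0.mbr" leaves the copy in the exported directory on the NFS server. Writing it back with dd also overwrites the boot code in that sector, so keeping written notes of the cfdisk entries as well is still a good idea.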

Reboot and start ORCA and delete the first logical volume. Then create a new RAID-0 volume with only the second of the two disks in it. Since RAID-1 keeps an exact mirror on both disks, your files should be intact on the new volume, but you need to boot back into the Debian installer on NFS and recreate your partition table and undo the damage to the first partition.

You should then be able to reboot into linux by booting the pxelinux image and using the right kernel boot parameters. Or you can use rdev on the bzImage to set the boot device if you have trouble with the root= directive.

Note that since the data from the other side of your mirror is still on the other disk, if you want to reconstruct the partition tables and use that data, you can, instead of performing the dd command below. But also note that you will be working on a snapshot of your system since before you ran ORCA. So keep in mind you have two separate systems you can boot into and try to keep track of which one you are in -- because the controller and linux can in some conditions disagree as to which drive/partition is "before" the other, this can get very confusing.

Now, regardless how things were partitioned initially, people are for the most part on the same page.

If you run cpqacuxe -c you will get a file acucapt.ini, which should look something like this:

Action= Configure
Method= Custom

; Controller Specifications
; Controller Compaq Smart Array 5i
Controller= Slot 0
ReadCache= 100
WriteCache= 0
RebuildPriority= Low
ExpandPriority= Low
SurfaceScanDelay= 15

; Array Specifications
Array= A
Drive= 1:1
OnlineSpare= No

; Logical Drive Specifications
LogicalDrive= 1
RAID= 0
Size= 17639
Sectors= 32
StripeSize= 128
ArrayAccelerator= Enable

In this file logicaldrive 1 is the one your OS is running off.

We will be creating a new RAID-0 array consisting of the other disk. However, we will be making the volume on it smaller than the whole disk. Copy acucapt.ini to acuinput.ini and:

change Action to Reconfigure
duplicate everything from Array= A on down to create a new section
In the new section, change Array to B
In the new section, change Drive to 1:0 (or 1:1 if it is 1:0 above)
In the new section, change the Size to your desired volume size

Save those changes and run cpqacuxe -i.
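The edits listed above can be done in an editor, or roughly like this with awk. It is a sketch that follows the list literally: it changes Action, then appends a copy of everything from the Array= line down with Array, Drive and Size rewritten. It does not touch LogicalDrive or anything else, the Size unit is whatever the captured file uses, and the function name is made up, so read the result before feeding it to cpqacuxe -i.

```shell
# make_acuinput CAPTURE DRIVE SIZE: print an acuinput.ini candidate built from
# a captured acucapt.ini, adding a second array (B) on DRIVE with size SIZE.
make_acuinput() {
    awk -v drive="$2" -v size="$3" '
        /^Action=/ { print "Action= Reconfigure"; next }
        /^Array=/  { copying = 1 }          # remember from "Array= A" on down
        copying    { saved[++n] = $0 }
        { print }
        END {
            print ""
            for (i = 1; i <= n; i++) {
                line = saved[i]
                if (line ~ /^Array=/)      line = "Array= B"
                else if (line ~ /^Drive=/) line = "Drive= " drive
                else if (line ~ /^Size=/)  line = "Size= " size
                print line
            }
        }
    ' "$1"
}
```

Usage would be something like "make_acuinput acucapt.ini 1:0 4096 > acuinput.ini", where 1:0 and 4096 stand in for your other drive and your desired volume size.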

People who started out with RAID-1 can reboot into the NFS Debian Installer again and repartition /dev/cciss/c0d0 and /dev/cciss/c0d1 back to the way they are supposed to be (one will be fine, the other hosed) and remake the ext2fs on /boot/ if that is what you are using as your damage pad.

People who started out with RAID-0 may be able to partition the new logical volume without rebooting, and then copy the OS over:

dd if=/dev/cciss/c0d0p1 of=/dev/cciss/c0d1p1
dd if=/dev/cciss/c0d0p2 of=/dev/cciss/c0d1p2

[... etc.]

However I haven't tried this and funny things can happen to the logical volumes after reboot. I recommend a reboot into the NFS root system, followed by partitioning any logical volumes that are not partitioned, and, for both RAID-1 and RAID-0 scenarios, if you want to boot the system on /dev/cciss/c0d1p2, you have to mount up that device and adjust etc/fstab by hand to reflect the change in device name before booting.

Note that if you use dd you are copying a live filesystem. If this makes you nervous, you should do so from the installer NFS-boot. Verify your files have copied by mounting them up, then reboot.

Now, after you are all nice and comfortable having these two separate systems, we are going to destroy one of them. Get into ORCA and delete the big whole-disk volume. Note that even though the smaller system may suddenly find itself on /dev/cciss/c0d0 instead of /dev/cciss/c0d1, as far as cpqacuxe and ORCA are concerned, the volume it is on is still volume #2. Have some fun getting booted into the new system.

Now run "cpqacuxe -c" and notice from acucapt.ini that the new Array, which is now the only array, is now A, even though it was B before.

Copy acucapt.ini to acuinput.ini and:

change Action to Reconfigure
change Drive to 1:0,1:1
change the RAID= value on your logical drive to 1.
Add a second logical drive section with RAID= 0 and set its size to the rest of the disk (times two).
run cpqacuxe -i
run cpqacuxe -c

The results should show you are the proud owner of an array with both a RAID-1 and a RAID-0 logical volume. You are free to work on the system and install packages and configure it, but you may want to give the disks a good amount of time to sync before rebooting anything (perhaps by making notes to improve this document :-)
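The "rest of the disk (times two)" figure for the RAID-0 volume comes from the fact that the RAID-1 volume takes the same amount of space on each disk, and the RAID-0 volume stripes the leftover from both. As a sketch, with a made-up function name and sizes in whatever unit the Size= lines use:

```shell
# raid0_size DISK_SIZE RAID1_SIZE: size of a two-disk RAID-0 volume built from
# the space left on each disk after the RAID-1 volume. Ignores any rounding
# or overhead the controller may apply.
raid0_size() {
    echo $(( ($1 - $2) * 2 ))
}
```

With the 17639 whole-disk size from the capture shown earlier and a 4096 RAID-1 volume, "raid0_size 17639 4096" prints 27086. Treat that as an upper bound and be prepared for the controller to want a slightly smaller number.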

From here out you are on your own. You can try shuffling things around to try to get the root filesystem back onto logical volume 1. This might save some commotion in the future, who knows?

Yes, in case you were wondering, compared to many others, this is indeed an extremely cretinous RAID configuration suite.

Optional Trick: Unpack a softpaq ROM image

If you want to try using some of the "offline" ROM upgrade utilities (the only one I found was for the iLO and it failed under mozilla), they usually come in self-extracting MSDOS .exe files. You can use dosemu and freedos to get at the meat.

Start with the softpaq .exe files in your home directory

apt-get install dosemu
apt-get install dosemu-freedos
apt-get install xfonts-dosemu
xdosemu -home

Change dos drive to d: and run each of the softpaq exe files found there. Agree to the license terms. The ROM images will pop out as .img files.

Type exitemu to gracefully close dosemu. You will find the (case folded) .img files in your home directory.

Copyright (C) 2003-2004 Brian S. Julin

This document may be redistributed or integrated into other
documents per the terms of the Open Publication License (same
one applied to the Debian WWW Homepage)