Dell PowerEdge R715 iSCSI Boot with Server Core 2008 R2


Note: This is written as I go, so you'll get to see the failures along the way and, hopefully, a wrap-up explaining how to make it actually work.

Alright, so here’s the project: Get Windows Server Core 2008 R2 booting from iSCSI. The why of it is somewhat simple. I want to be able to iSCSI boot so I can have a set of Hyper-V host servers at my primary location completely configured and perfectly happy. Then I will replicate those LUNs to our offsite SAN hardware. When disaster strikes, I can then just configure those servers at the DR location to boot from the replicated LUNs. (in theory)

Yes, there are easier ways to do this for my scenario, like Citrix Essentials for Hyper-V or other SAN replication software, which would let me simply fail over the setup or configure it as a geo-cluster. But the reality is those cost money… and that wasn't included in the budget for this project. We got the money for the hardware/OS and that's pretty much it.

Server: Dell PowerEdge R715 – 12 Core AMD – (3) 4 port Broadcom BCM5709C NICs

Storage: Dell EqualLogic SAN Group – 2 PS6000s and 2 PS4000s

(2) Dell PowerConnect 6248 switches.

iSCSI Boot Learning Material:

Broadcom NetXtreme User Guide (Dell) // Original by Broadcom

Dell Instructions to Perform Boot from iSCSI (pages 21-24 & 36-38)

In the BIOS, first set the boot order to put the embedded NIC at the top of the list, followed by the DVD and then local storage. Second, you need to enable the embedded NIC (assuming that's what you're using for boot) for iSCSI boot instead of PXE. (UPDATE: SEE BELOW FOR MORE ON THE BIOS CONFIGURATION)
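For reference, here's roughly what the configuration looks like across the two menus. The BIOS part matches the steps above; the Broadcom boot-agent labels (the Ctrl+S menu at POST) are from memory and may differ by boot-agent version, so treat them as a sketch, not gospel:

```
# BIOS (F2) -> Boot Settings
Boot Sequence:      Embedded NIC 1 -> DVD -> local storage
Embedded NIC 1:     Enabled with iSCSI Boot   (instead of PXE)

# Broadcom boot agent (Ctrl+S at POST) -> iSCSI Boot Configuration (labels assumed)
Initiator IP/mask:  static address on the SAN subnet (e.g. 192.168.21.33)
Initiator IQN:      e.g. hyperv01.iscsiboot
Target IP:Port:     the EqualLogic group IP, port 3260
Target IQN:         the IQN of the boot volume
CHAP:               disabled for the first bring-up
```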

Following these instructions, I can get the iSCSI LUN to connect on server boot (still working on the secondary connection, which seems to cause iSCSI to not even load):

But as soon as I start the Windows installer and tell it to load the VBD (driver) per the installation instructions (the disk isn't visible in the available drives list)…

I LOVE WINDOWS. 😦 The other problem I seem to be running into is that the iSCSI session to the disk drops before I get the drivers loaded. It seems like there's a 5 or 6 minute timeout, which isn't enough time for the Windows install DVD to load and for me to install the drivers before the connection goes kaput.

HRM… if I load the iSCSI driver (bxois.inf), then the NDIS driver from the Win2k8 folder (bxnd.inf), and finally the VBD (bxvbd.inf), I don't blue screen. Of course, that takes me well past the 6 minute mark, which is when the iSCSI connection drops. Shizzle.

This is interesting – from the EQL logs:

iSCSI session to target ‘192.168.21.24:3260, iqn.2001-05.com.equallogic:0-8a0906-fa332600a-1fd0000000a4e039-hyperv01-boot’ from initiator ‘192.168.21.33:4428, hyperv01.iscsiboot’ was closed.
iSCSI initiator connection failure.
No response on connection for 6 seconds.
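If you're digging through a lot of these EQL events, pulling the fields apart programmatically beats eyeballing them. This is a quick sketch; the pattern is inferred from the single log entry above (with straight quotes substituted for the blog's curly ones), so it may need adjusting for other event types:

```python
import re

# Pattern for an EqualLogic "session closed" event, inferred from one
# sample entry; field layout for other event types is not covered.
SESSION_CLOSED = re.compile(
    r"iSCSI session to target '(?P<target_addr>[\d.]+:\d+), (?P<target_iqn>[^']+)' "
    r"from initiator '(?P<init_addr>[\d.]+:\d+), (?P<init_name>[^']+)' was closed"
)

def parse_session_closed(line):
    """Return a dict of target/initiator fields, or None if the line doesn't match."""
    m = SESSION_CLOSED.search(line)
    return m.groupdict() if m else None

event = ("iSCSI session to target '192.168.21.24:3260, "
         "iqn.2001-05.com.equallogic:0-8a0906-fa332600a-1fd0000000a4e039-hyperv01-boot' "
         "from initiator '192.168.21.33:4428, hyperv01.iscsiboot' was closed.")
fields = parse_session_closed(event)
print(fields["init_name"])  # hyperv01.iscsiboot
```

With timestamps attached to each event, the same approach makes it easy to measure the gap between session establishment and the drop, which is how you'd confirm the "consistently 6 minutes" pattern.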

I find that interesting simply because it's consistently 6 minutes after the connection is established that the connection drops with this error. I'll have to pop out the disk later and see if it does that because of something the Windows installer is doing or if it's something the system is doing. But for now, I'll keep trying drivers.

Read this:
Main Link Configuring Dell PowerEdge 11G Servers Running Windows Server 2008 for iSCSI SAN Boot (direct link). I believe this explains my blue screen issues. In a nutshell, if there's anything wrong in your configuration, you'll get a blue screen when you load the driver. So, I stripped it down to the absolute basics for connections (should have started with the KISS method): I simply granted the volume access to the IP address I assigned to the NIC instead of using CHAP like I had planned. I also copied their basic settings and removed everything from the second connection I was working on.
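On the EqualLogic side, restricting the volume's access list to the initiator's IP can also be done from the group CLI instead of Group Manager. The session below is a sketch from memory; the exact option names can vary by PS Series firmware, so verify against your firmware's CLI reference:

```
# SSH to the group IP, then (assumed syntax):
GrpName> volume select hyperv01-boot
GrpName(volume_hyperv01-boot)> access create ipaddress 192.168.21.33
GrpName(volume_hyperv01-boot)> show access
```

The point is the same as in the GUI: a bare IP-based ACL entry with no CHAP, so there's as little as possible to get wrong during boot.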

And guess what…

All with no driver installation required. I guess my blue screen issue was that I was even having to load the drivers in the first place. But of course Windows rears its ugly head again and displays the following: (See the error – click = big)

GREAAAAAT. Now what? Turns out it was my fault (yet again). In order to load drivers on the prior attempts, I had plugged in my USB thumb drive. Apparently that was causing problems with the Windows installer, because once I removed the drive and rebooted the box, the installer ran straight to my SAN LUN.

All is now well for the most part. The biggest issue I have to overcome now is getting the secondary iSCSI connection (fault tolerance for the primary) to actually work. The problem is as soon as I configure a secondary device… iSCSI boot fails to initiate and the system just hangs after the ILO configuration. But Windows is installed.

More to come…. as I expect I’ll call Dell on Monday and see what they have to say about the secondary connection issue. But before I do that I’ll save some hassle and see about updating all the firmware.

UPDATE 6/27/11 — Been on the phone for over 2 hours with the Dell storage support team going over my configuration for iSCSI boot. Everything is correct from their perspective, and the firmware is OK. Now trying to bring a server support specialist onto the phone to see if there is something we're missing, because the storage team isn't seeing it. For the record, they've been really helpful so far, but no solution is yet in sight.


After 4.5 hours…. nada. Same issue. ARG!

UPDATE 6/29/11 — I am hesitant to use the word resolved since the servers are now so completely out of date that it’s not even funny (and I’m sure support will ask if I ever call in), but I’ve got iSCSI Boot working using a much older BIOS.

iSCSI Boot on the PE R715 will work (with a secondary configured) using BIOS version 1.2.1, which contains the Broadcom NetXtreme II Ethernet Boot Agent v5.2.7. For reference, the server(s) shipped with BIOS version 1.3.1, which contained the v6.0.11 boot agent. I also tried the latest BIOS, 1.5.1 (forget the boot agent version). Neither the shipping nor the newest version would work.

On another note, I was also able to get past the blue screen / reboot loop issue using the Broadcom NetXtreme I/II Ethernet Drivers v14.2.4 A04 – both the 16.0.0 A00 and 16.2.0 A01 versions would blue screen the system during the driver installation and then cause the system to go into a permanent reboot loop. I tested this on both 2008 R2 SP1 Datacenter Core and Full Install with the same results.

Update 7/1/2011 — One more interesting bit on this – It looks like Broadcom Firmware 5.2.7 (separate from the BIOS Boot Agent 5.2.7) needs to be installed on the NICs as well. I had the correct BIOS version on the other two hosts I'm building, but they wouldn't work. I installed package NETW_FRMW_WIN_R270088 and downgraded the NIC firmware from 6.2.12 to 5.2.7, and iSCSI Boot started working right away on those other systems.

More to come…

Update 8/5/2011 — After a month of emails back and forth with Dell support, some parts replacements and whatnot… Dell can reproduce the issue every time using current BIOS and NIC firmware. My case is now up with internal engineering, as this looks like a bug in the Broadcom Boot Agent. Sitting back waiting for an official solution now. The temporary work-around is to completely configure the second boot NIC – but do not configure it as the secondary.


9 thoughts on “Dell PowerEdge R715 iSCSI Boot with Server Core 2008 R2”

  1. Can you tell me how to install the iscsi boot firmware on a broadcom extreme II? I am guessing it is in the iso they provide but having trouble finding it.

    Cheers

    1. Brian –

      Actually, there's nothing you can or should have to do to install the iSCSI boot software. The firmware that handles this process is baked into the Dell BIOS. Basically what happens is Broadcom creates the software and hands it off to Dell, whose engineers then integrate it into their BIOS firmware. That's why changes (like what I need done) take so long to come by.

      So if your cards (and your system) are iSCSI boot capable, just make sure you have the latest Dell BIOS and pay close attention to your settings and boot prompts.

      1. Thanks Brandon. I am using an IBM HS22 Blade. I found out that IBM does not push the iSCSI firmware into the OEM Broadcom 5709s in my blade. Doesn't look like I can upgrade; waiting to hear back from IBM HW support. Most likely will have to add a card that has the support and firmware built in.

    1. @books – thanks for the link. It looks like that was related to the drivers on the install disk. My issue was that the BIOS itself had/still has a bug which prevents the failover configuration from working. But, again, thanks for the link, because no doubt that will come in handy in the future.

  2. Interesting adventure. Reminds me of the fight we’re having with our 715 and Server 2008R2. Can you tell us how everything finally worked out? If it did, that is.

    1. I had everything booting from a single NIC. Never did get a BIOS that was capable of not failing miserably with the boot config.

      Due to other issues (bad EQL configuration / bad update), I migrated the servers to use the internal HDDs. It was a great experiment; it worked solid as a rock for a year until we made the changes.

  3. Wow after 3 days of banging my head I found this post. Wish I would have discovered it sooner would have saved me lots of time and aggravation. Thanks for your persistence in trying to make this work!
