Dell Linux Engineers work over 5000 bugs with Red Hat

A post today by Dell’s Linux Engineering team announcing support for RHEL 5.8 on PowerEdge 12G servers made me stop and think.  In the post, they included a link to a list of fixes and enhancements worked in preparing RHEL 5.8 for our new servers.  The list was pretty short. But that list doesn’t tell the whole story.

A quick search in Bugzilla for issues which Dell has been involved in since 1999 yields 5420 bugs, 4959 of which are CLOSED, and only 380 of which are still in NEW or ASSIGNED state, many of which look like they’re pretty close to being closed as well.  This is a testament to the hard work Dell puts into ensuring Linux “Just Works” on our servers, straight out of the box, with few to no extra driver disks or post-install updates needed to make your server fully functional.  You want a working new 12G server?  Simply grab the latest RHEL or SLES DVD image and go.  Want a different flavor of Linux?  Just be sure you’re running a recent upstream kernel – we push updates and fixes there regularly too.

Sure, we could make it harder for you, but why?

Congratulations to the Linux Engineering team for launching 12G PowerEdge with full support baked into Linux!  Keep up the good work!

Free Money

This post is aimed at my Dell colleagues in the US.

If you’re like me, you dread the weeks shortly after Back To School.  Sure, the kids are now settled into their daily routines, evening homework, Fall sports and Scouting, but with the start of the new school year comes the start of Fall Fundraising by each and every organization you’re fortunate enough to be a part of or even nearby.  Each organization is worthy in its own right, and as active participants, you bet we’re going to donate.

But did you know you can double your money?  Yep!  For every dollar you donate to a familiar charity, Dell will match that donation dollar-for-dollar up to $10,000/year per employee.  This is an amazing benefit, which I had long put on my Tell Dell survey wishlist, and starting about 7 years ago (maybe more?) it became reality.

Now, there’s one little catch – you can’t simply hand over a check to your familiar charity and let them get it doubled.  You must send your donation through the Dell internal web site (internal home page, You and Dell, Employee Giving), pay via credit card or payroll deduction (or if you’re particularly generous, stock donation), and in a few weeks Dell sends a check for 2x your amount to the charity.  Relatively painless, and a fantastic benefit.  You can give to a bunch of charities, or a few; a little (minimum $25), or a lot (up to $10k matched) any time during the year.  The $10k match resets on January 1.

In addition, Dell wants to encourage employees to volunteer their time, as well as give their money, to charitable causes.  Are you a Scout leader?  A coach?  A board member?  Maybe you help out at the library or at church.  However you volunteer is up to you.  In recognition of your volunteer hours, Dell will give $150 each quarter (yep, that’s $600/year) to charities you designate (they don’t even have to be the same organizations you volunteer for if you want), as long as you log 10 or more hours of volunteer time in the quarter.  So go to the tool (inside home page, You and Dell, Make a Difference), set up your charities, and log your hours.  Then it’s free money for the charities you choose.

So, don’t let that Free Money pass you by.  You know the charities need it, and it’s a simple benefit on top of the activities you’re up to your neck in already.  Take a few minutes to double your contributions, and send that $600 to folks who really need it.

Consistent Network Device Naming updates

Today I released biosdevname v0.3.7, after listening to feedback from all across the web, including NetworkWorld, LWN, and Slashdot.  No, I’m not killing the feature, as some might hope, but some changes are in order.

First, it’s amazing how many people hated the ‘#’ character in device names.  Yes, that was bound to cause some problems, but nothing that couldn’t be fixed given enough time.  But since it’s early in the game, changing that character from ‘#’ to ‘p’ accomplishes the same goal, with less chance of breakage, so that’s done.  pci<slot>p<port>_<vf> it is….

Second, the various virtual machine BIOSes each do something slightly different for the network devices they expose.  VMware exposes the first NIC (traditionally eth0) as being in PCI slot 3.  KVM exposes the first NIC as being in PCI slot 2, but has no information about the second NIC.  Xen doesn’t expose anything, so those NICs all kept the ethX naming convention.

To address these discrepancies, and because there is no physical representation of a (virtual) NIC in a virtual machine, biosdevname no longer suggests a new name for NICs if running in a VM guest.  This means all VM guests keep ethX as their naming convention. Thanks to colleague Narendra K for this fix.

Third, for everyone who still thinks renaming devices is a really bad idea, you get an out.  A new kernel command line option, honored by udev, lets you disable biosdevname.  biosdevname=0 will prevent biosdevname from being invoked, effectively disabling this feature, leaving you with the ethX names.
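To make the opt-out concrete, here is a minimal sketch of how a tool could read that option from the kernel command line.  This is not how udev or biosdevname actually implement it; the function name biosdevname_enabled is made up for illustration, and the only real interface assumed is that /proc/cmdline holds the boot command line.

```python
def biosdevname_enabled(cmdline: str, default: bool = True) -> bool:
    """Return False when the command line carries biosdevname=0.

    Hypothetical helper for illustration only; the real check lives in
    the udev rules shipped with biosdevname.
    """
    enabled = default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "biosdevname" and sep:
            # Anything other than "0" is treated as "leave it enabled".
            enabled = value != "0"
    return enabled


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a running Linux system, biosdevname_enabled(read_cmdline()) would tell you whether the option was passed at boot.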

All this, and the usual assorted bug fixes as biosdevname gets more widespread exposure and testing.

Love it?  Hate it?  Let me know.  You can find me (mdomsch) on IRC on FreeNode in #biosdevname, #udev, or #fedora-devel, as well as the usual mailing lists.

Fedora Test Day today – please join us

Today is the official Fedora Test Day for Consistent Network Device Naming.  Given all the coverage this week on NetworkWorld and Slashdot, I would like to see widespread testing of this feature, to assuage the concerns and misconceptions raised there.  Testing is simple – download and boot the LiveISO, and report success or failure on the wiki page.  You can even try it out on a running Fedora 14 instance if you like.

The Dell engineers who have been working on this for years will be online in #fedora-test-day on FreeNode IRC today if you have any questions.  Please join us.  Thanks for your time and participation.

Consistent Network Device Naming coming to Fedora 15

One of my long-standing pet projects, Consistent Network Device Naming, is finally coming to Fedora (emphasizing two of the Fedora F’s: Features and First), and thereafter, all Linux distributions.  What is this, you ask?

Systems running Linux have long had ethernet network devices named ethX.  Your desktop likely has one ethernet port, named eth0.  This works fine if you have only one network port, but what if, like on Dell PowerEdge servers, you have four ethernet ports?  They are named eth0, eth1, eth2, eth3, corresponding to the labels on the back of the chassis, 1, 2, 3, 4, respectively.  Sometimes.  Aside from the obvious confusion of names starting at 0 versus starting at 1, other race conditions can happen such that each port may not get the same name on every boot, and they may get named in an arbitrary order.  If you add in a network card to a PCI slot, it gets even worse, as the ports on the motherboard and the ports on the add-in card may have their names intermixed.

While several solutions have been proposed over time (detailed at Linux Plumbers Conference last year), none were deemed acceptable, until now.

Enter biosdevname, the tool Dell has developed to bring sanity (and consistency!) to network device names.  Biosdevname is a udev helper, which renames network interfaces based on information presented by system BIOS.

The new naming convention is as follows:

  • em[1-N] for on-board (embedded) NICs (# matches chassis labels)
  • pci<slot>#<port> for cards in PCI slots, port 1..N
  • NPAR & SR-IOV devices add a suffix of _<vf>, from 0..N depending on the number of Partitions or Virtual Functions exposed on each port.
  • Other Linux conventions, such as .<vlan> and :<alias> suffixes remain unchanged and are still applicable.

This provides a sane mapping of Linux network interface name to externally visible network port (RJ-45 jack).
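As a rough illustration of the convention above, the sketch below builds a name from its parts.  The function suggest_name and its parameters are hypothetical, not part of biosdevname itself.  The separator defaults to ‘#’ as described in this post; biosdevname 0.3.7 later switched it to ‘p’, so it is left as a parameter.

```python
from typing import Optional


def suggest_name(embedded: bool,
                 index: Optional[int] = None,
                 slot: Optional[int] = None,
                 port: Optional[int] = None,
                 vf: Optional[int] = None,
                 sep: str = "#") -> str:
    """Build an interface name following the convention listed above.

    Illustrative only: embedded NICs become em<index>, add-in cards
    become pci<slot><sep><port>, and NPAR/SR-IOV devices get _<vf>.
    """
    if embedded:
        name = f"em{index}"
    else:
        name = f"pci{slot}{sep}{port}"
    if vf is not None:
        name += f"_{vf}"
    return name
```

For example, the first on-board port would come out as em1, and virtual function 3 on port 1 of the card in slot 2 as pci2#1_3.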

Where do we get this information?  The algorithm is fairly simple:

  • If system BIOS exposes the new PCI Firmware Specification 3.1 ACPI _DSM method, we get the interface label and index from ACPI, and use those.
  • Else if system BIOS exposes an index and label in SMBIOS 2.6 types 9 and 41, use the index value.
  • Else if system BIOS exposes index via the HP proprietary SMBIOS extension, use that.
  • Else fall back to using the legacy PCI IRQ Routing Table to figure out which slots devices are in, sort the PCI device list in breadth-first order, and assign index values.
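The shape of that algorithm is a simple priority chain: ask the best source first, and fall through to the next one only when it has nothing to say.  Here is a minimal sketch of that pattern.  The probe functions are stubs standing in for the real ACPI, SMBIOS and PCI IRQ Routing Table lookups, and all of the names are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# A source is a (label, probe) pair.  Each probe is a hypothetical stand-in
# for one firmware lookup and returns an index, or None if the BIOS does
# not expose that information.
Source = Tuple[str, Callable[[], Optional[int]]]


def first_available_index(sources: Sequence[Source]) -> Tuple[str, int]:
    """Try each source in priority order and return the first index found."""
    for label, probe in sources:
        index = probe()
        if index is not None:
            return label, index
    raise LookupError("no source produced an index")


# Stubbed example: pretend this BIOS has no ACPI _DSM but does have SMBIOS data.
example_sources = [
    ("acpi_dsm", lambda: None),
    ("smbios", lambda: 2),
    ("hp_smbios", lambda: None),
    ("pirq_table", lambda: 5),
]
```

With these stubs, first_available_index(example_sources) picks the SMBIOS value, because the ACPI probe came back empty and the later sources are never consulted.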

How will this affect you?

If you have scripts that have hard-coded eth0 or have assumptions that ethX is a particular port, your scripts are already broken (you may just not know it yet).  Begin planning on using the new interface names going forward, adjusting your scripts as necessary.

Fedora 15 will be the first distribution to use biosdevname by default.  There will be a Test Day on Thursday, January 27.  I encourage you to download the Live image, boot it on your system, and verify that your network interfaces are now named according to the above convention, and that all works as expected.  You may also take the opportunity to review your custom scripts, looking for hard-coded ethX values, and prepare for the coming name change.
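If you want a head start on that review, something like the following sketch can flag hard-coded ethX names in a script’s text.  It is only a starting point: the helper name find_hardcoded_eth is made up, and a simple pattern match like this will not catch names assembled at run time.

```python
import re
from typing import List, Tuple

# Matches eth followed by digits as a whole word, so "eth0" and the "eth0"
# in "eth0.100" or "eth0:1" match, but "ethtool" does not.
ETH_PATTERN = re.compile(r"\beth\d+\b")


def find_hardcoded_eth(text: str) -> List[Tuple[int, str]]:
    """Return (line number, interface name) for each hard-coded ethX found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ETH_PATTERN.finditer(line):
            hits.append((lineno, match.group()))
    return hits
```

Feed it the contents of each script you care about and review every hit by hand.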

Once we get sufficient exposure and verification using Fedora, I expect to see this change roll into other Linux distributions, and other operating systems, over time.  Consider yourself warned.

Dell introduces RHEL Auto-Entitlement and 5-year subscriptions

Noted on the Dell blog, the auto-entitlement system we rolled out to the US and Europe a few years ago is finally available worldwide.  What is auto-entitlement, you ask?

If you’ve ever purchased a Red Hat Enterprise Linux subscription when purchasing a Dell PowerEdge server, you’ve seen the “registration card” shrink-wrapped alongside the CDs, with a long string of numbers on it.  Upon unboxing your system, you had to a) not throw away that card; b) not lose that card; c) get that card to some responsible party at your organization; d) ensure that responsible party went to http://redhat.com/activate to activate the subscription, using the number on that card.  See how many steps that took?  Can you guess how many ways something could go wrong in the process?

With auto-entitlement, the system administrator is able to simply log their new system into Red Hat Network the first time they use it (as they would to get updates and to manage their system).  Red Hat Network is then smart enough to recognize that the system was purchased from Dell, knows the subscription type and duration, and Bob’s your Uncle.  No registration card to lose, no extra steps to take.  Oh, and if you manage to blow away the hard disk image and re-install RHEL before connecting to Red Hat Network for the first time – no worries – auto-entitlement will still work.

Oh, and while we’re at it, the new 5-year RHEL subscription matches the available 5-year ProSupport hardware service contract, so there’s never any mess with having out-of-sync support subscriptions.

Just two more ways Dell ensures Linux, in this case Red Hat Enterprise Linux, “Just Works”.

Dell at LinuxCon Boston

For the second year in a row, Dell engineers will be on hand at the Linux Foundation’s LinuxCon conference in Boston next week.  While I don’t get to fly a helicopter in the Penguin Bowl this year, we’ll have plenty of face time with the engineers and enthusiasts on hand.

On Wednesday at 10:30am, I’ll be presenting on Network Device Naming, which simplifies this:

PowerEdge R610 with 8 Ethernet ports

by letting the system administrator use better names for their network ports than “eth0”.   Can you guess which is eth0 in that picture?  (Hint: it might be green, it might be red, it might be orange and it may change from time to time.)

Shyam Iyer follows me at 11:30am, presenting “Storage Provisioning with iSCSI for Virtualized Environments”, which describes the work he has been doing with the Open-iSCSI and libvirt teams to simplify iSCSI storage use by virtual machines, to take advantage of all the great hardware acceleration our EqualLogic arrays provide.

On Thursday at 2pm, I return to the stage in a panel moderated by Matt Asay, COO of Canonical, titled “What’s Next for Linux”, alongside James Bottomley of Novell, David Recordon of Facebook, and Ravi Simhambhatla of Virgin America.   I’m especially interested to be on this panel, as my cohorts are pushing the limits of computing, often with Dell’s help, and simultaneously Dell is active in the new worlds they’re creating.

See you in Boston next week!

New uefivars project leverages 9-year-old efibootmgr work

Finnbarr P. Murphy (fpmurphy) posted on his blog yesterday about his new project, uefivars, to retrieve and display information about UEFI variables. UEFI is the new firmware standard, replacing legacy BIOS over time, which is present on Dell 11G PowerEdge servers today. fpmurphy’s work is based largely on my own efibootmgr project which I started back in 2001 when first working on EFI for the Itanium processor. I’m glad to see renewed interest in this work as more people get exposed to UEFI on new systems. Perhaps it’s time, 9 years later, for bits of efibootmgr to turn into a library for use by applications like uefivars.

TPMs are good for something

TPMs (Trusted Platform Modules) have long been avoided on Linux, given that their primary use cases have historically been around licensing and Digital Rights Management, concepts which are mostly foreign to Free and Open Source software.  However, as new use cases, such as “trusted boot”, have emerged, developers have added TPM device drivers to the Linux kernel to enable these uses.  One often-overlooked feature of the TPM is that it has a hardware pseudo-random number generator.

A while back, Jeff Garzik and others were discussing this on the linux-kernel mailing list (summarized on LWN.net), where it was suggested that the TPM could be used to feed the rngd (random number gathering daemon) tool, just as it reads from other hardware random number generators.  The rngd program reads from hardware-based random number generators and feeds entropy into the kernel’s entropy pool.  Easy in concept, but lacking in TPM implementation.

As it happens, quite a few Dell systems include a TPM chip, including the PowerEdge 11G servers such as the R610 and R710.  So, I asked Dell’s crack team of Linux developers to see what they could do.  The result: a patch to rngd which adds the TPM as another source of random numbers for feeding the kernel’s entropy pool.

We’re working with Jeff to get this patch applied to the rng-tools upstream sources, and from there into the various distributions as their schedules permit.

So, should you find yourself running out of entropy on your servers, and not having a keyboard or mouse attached as ways to feed the entropy pool, you can enable the TPM in BIOS SETUP, run rngd, and never lack for randomness again.
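Not sure whether your server is actually running low?  The kernel reports its current estimate in /proc/sys/kernel/random/entropy_avail.  The sketch below reads that value and compares it against a threshold.  The helper names and the threshold of 200 bits are arbitrary choices of mine for illustration, not values taken from rngd.

```python
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"


def parse_entropy(text: str) -> int:
    """Convert the contents of entropy_avail (a decimal bit count) to int."""
    return int(text.strip())


def read_entropy_avail(path: str = ENTROPY_PATH) -> int:
    """Read the kernel's current entropy estimate in bits (Linux only)."""
    with open(path) as f:
        return parse_entropy(f.read())


def entropy_is_low(avail: int, threshold: int = 200) -> bool:
    """Return True when the pool is below a (hypothetical) comfort level."""
    return avail < threshold
```

Check entropy_is_low(read_entropy_avail()) before and after starting rngd to see whether the extra source is making a difference.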