Dell 12G PowerEdge – IPMI interrupt and the death of kipmi0

A seemingly minor feature was added to our 12G PowerEdge servers announced this week: IPMI interrupt handling.  This is the culmination of work I started back in 2005, when we discovered that many IPMI operations, such as polling all the sensors for status during system startup and updating the firmware of the IPMI controller itself, took a very long time.  System startup could be delayed by minutes while OMSA polled the sensors, and firmware updates could take 15 minutes or more.

At the time, hardware rarely had an interrupt line hooked up to the Baseboard Management Controller (BMC), which meant we had to rely on polling the IPMI status register for changes.  The polling interval, by default, was the 100Hz kernel timer, meaning we could transfer no more than 100 characters of information per second, so reading a single sensor could take several seconds.  To speed up the process, I introduced the “kipmi0” kernel thread, which could poll much more quickly, but which PowerEdge users noted consumed far more CPU cycles than they would have liked.
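To make the arithmetic concrete, here is a minimal sketch of the lower bound implied by timer-driven polling, assuming at most one character moves per poll.  The function name and the 300-character exchange size are hypothetical, chosen only for illustration; they are not measurements from real hardware.

```python
def min_transfer_seconds(num_chars: int, polls_per_second: float) -> float:
    """Lower bound on transfer time when each poll moves at most one character."""
    if polls_per_second <= 0:
        raise ValueError("polls_per_second must be positive")
    return num_chars / polls_per_second


# Hypothetical exchange of 300 characters (request plus response),
# polled from the 100Hz kernel timer.
timer_driven = min_transfer_seconds(300, 100)
print(f"100Hz polling: at least {timer_driven:.1f} s")
```

Under these assumptions a single exchange cannot finish in under three seconds, which is consistent with the multi-second sensor reads described above.  Polling faster shrinks that bound, but only by spending more CPU time on polling.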

Over time the Dell engineering team has made several enhancements to the IPMI driver to try to reduce the impact of the kipmi0 polling thread, but it could never quite be eliminated, until now.

With the launch of the 12G PowerEdge servers, we have a hardware interrupt line from the BMC hooked up and plumbed through the device driver.  This eliminates the need for the polling thread completely, and provides the best IPMI command performance without needlessly spending CPU cycles on polling.
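One way to check whether a given system is using the interrupt is to look for the IPMI driver in /proc/interrupts.  The sketch below is a small text filter, not an official tool.  It assumes the interrupt is registered under the name "ipmi_si", which may differ by kernel version, and the sample text is made up for illustration.

```python
from typing import List


def ipmi_irq_lines(proc_interrupts_text: str, driver_name: str = "ipmi_si") -> List[str]:
    """Return the lines of /proc/interrupts-style text that mention driver_name.

    Shared interrupts list several names separated by commas, so trailing
    commas are stripped from each token before comparing.
    """
    matches = []
    for line in proc_interrupts_text.splitlines():
        tokens = [token.strip(",") for token in line.split()]
        if driver_name in tokens:
            matches.append(line.strip())
    return matches


# Made-up sample in the general shape of /proc/interrupts output.
sample = """\
           CPU0       CPU1
  0:         46          0   IO-APIC-edge      timer
 10:       1523          0   IO-APIC-fasteoi   ipmi_si
"""

for line in ipmi_irq_lines(sample):
    print(line)
```

On a live Linux system, the same function can be given the contents of /proc/interrupts read with open("/proc/interrupts").read().  An empty result suggests the driver is still polling, under the naming assumption above.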

Congratulations to the Dell PowerEdge and Linux Engineering teams for finishing this effort!