In 2011 Facebook announced the Open Compute Project to form a community around open-source designs and specifications for data center hardware. Facebook shared the specs of hardware that consumed 38 percent less energy and delivered 24 percent cost savings compared with its existing data centers. What Facebook and other hyperscalers (Google, Microsoft, et al.) donate to the Open Compute Project are their solutions to the agonizing problems that come with running data centers at scale.
Since then, the project has expanded to all aspects of the open data center: baseboard management controllers (BMCs), network interface controllers (NICs), rack designs, power busbars, servers, storage, firmware, and security. This column focuses on the BMC. It is an introduction to a complicated topic; some sections only touch the surface, but the intention is to provide a full picture of the open-source BMC ecosystem, starting with a brief overview of the BMC's role in a system, touching on security concerns around the BMC, and then diving into some of the projects that have developed in the open-source ecosystem.
A good overview.
There is progress, but it is slow in this part of the field.
On the BIOS/EFI firmware side of the computer itself, coreboot and LinuxBoot exist, but adding support for each board or machine still takes time and effort. LinuxBoot (basically a regular Linux kernel) can even boot Windows these days.
One of the funniest things about coreboot, compared with a regular BIOS/EFI firmware, is how fast it boots, even on more complicated hardware. Why? Because it initializes system parts in parallel, whereas regular BIOS/EFI firmware initializes most things in phases, one after another.
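To put a toy example behind that claim, here is a minimal sketch in Go (illustrative only: the device names and timings are invented, and coreboot itself is written in C with a far more involved init flow) showing why initializing independent devices concurrently finishes in roughly the time of the slowest step instead of the sum of all steps.

package main

import (
	"fmt"
	"sync"
	"time"
)

// initDevice stands in for bringing up one piece of hardware.
func initDevice(name string, cost time.Duration) {
	time.Sleep(cost) // pretend this is register pokes, link training, etc.
	fmt.Println(name, "ready")
}

func main() {
	// Hypothetical devices and timings, purely for illustration.
	devices := map[string]time.Duration{
		"memory":  300 * time.Millisecond,
		"pcie":    200 * time.Millisecond,
		"usb":     150 * time.Millisecond,
		"storage": 250 * time.Millisecond,
	}

	// Phase-by-phase init: total time is the sum of every step (~900 ms here).
	start := time.Now()
	for name, cost := range devices {
		initDevice(name, cost)
	}
	fmt.Println("sequential:", time.Since(start).Round(time.Millisecond))

	// Parallel init: total time approaches the slowest single step (~300 ms here).
	start = time.Now()
	var wg sync.WaitGroup
	for name, cost := range devices {
		wg.Add(1)
		go func(n string, c time.Duration) {
			defer wg.Done()
			initDevice(n, c)
		}(name, cost)
	}
	wg.Wait()
	fmt.Println("parallel:", time.Since(start).Round(time.Millisecond))
}

The same idea applies whether the parallelism comes from multiple cores or simply from overlapping waits on hardware that is training itself.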
If you currently want a reasonably locked-down and inexpensive coreboot device, you'd have to buy a Chromebook.
Lennie,
I’d like to mess around with coreboot sometime; I just wish more hardware manufacturers supported it.
Slow bootup is a disappointment for me. Not many of us have written an OS, but when you do, you realize just how quickly hardware responds to probing and initialization. The vast majority of time wasted on boot-up is due to software delays and timeouts while the hardware just sits there waiting (see the sketch after this comment). Given how fast things are, and that we've eliminated things like disk spin-up, the OS should be loaded within 1-2 seconds. The hardware is not the bottleneck so much as poorly optimized software and firmware.
We make so many excuses for things being as slow as they are, but damn it, Ataris and Commodores in the early '80s were able to be "ready" before the monitor was even turned on.
https://www.youtube.com/watch?v=moO2MaW_mPQ
Sure, a modern OS is more complicated, but the hardware is so much faster. Whether it's our phones or our computers, our industry deserves a lot of flak for how long it takes to boot up and initialize system state. Somehow we lost the expectation of instant-on (like VCRs and TVs), and now delays are considered normal.
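A toy sketch of that argument, in Go (the 5 ms "ready" latency, the 500 ms timeout, and the flag are all invented; no real driver interface is implied): a routine that blindly sleeps for a worst-case timeout burns almost all of that time, while one that polls for readiness returns about as fast as the hardware does.

package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// kick pretends to start a device that asserts "ready" 5 ms later.
func kick(ready *atomic.Bool) {
	ready.Store(false)
	go func() {
		time.Sleep(5 * time.Millisecond)
		ready.Store(true)
	}()
}

func main() {
	var ready atomic.Bool

	// Lazy firmware: sleep for a fixed worst-case timeout no matter what.
	kick(&ready)
	start := time.Now()
	time.Sleep(500 * time.Millisecond) // padding like this adds up across a whole boot
	fmt.Println("fixed-timeout wait:", time.Since(start).Round(time.Millisecond))

	// Attentive firmware: poll and move on as soon as the hardware is ready.
	kick(&ready)
	start = time.Now()
	for !ready.Load() {
		time.Sleep(time.Millisecond)
	}
	fmt.Println("poll-until-ready wait:", time.Since(start).Round(time.Millisecond))
}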
This is fantastic from a cost-savings and waste-reduction perspective as well. If a $100 BMC on a $50,000 server dies, you're either doing a motherboard swap or putting the drives in a whole new server. I had to do both many times in my days as a datacenter technician. This is a big piece of why so many otherwise fully functional servers and boards wind up for sale cheap on eBay. Open-source, modular, replaceable software and hardware are going to be vital to extending the life of systems, especially as we expand off-world: when the cost to orbit is measured in hundreds or thousands of dollars per kilogram, replacement chips and soldering irons suddenly seem reasonable.
The security angle is nothing to sneeze at either: most datacenters end up spinning up THREE entirely separate network fabrics, one connected to the Internet, one private, and one for IPMI. Cutting three down to two would save a ton of space, money, and effort.