No more boot loader: please use the kernel instead

Thom Holwerda 2024-07-09 Linux 15 Comments

Most people are familiar with GRUB, a powerful, flexible, fully-featured bootloader that is used on multiple architectures (x86_64, aarch64, ppc64le OpenFirmware). Although GRUB is quite versatile and capable, its features create complexity that is difficult to maintain, and that both duplicate and lag behind the Linux kernel while also creating numerous security holes. On the other hand, the Linux kernel, which has a large developer base, benefits from fast feature development, quick responses to vulnerabilities and greater overall scrutiny.
We (Red Hat boot loader engineering) will present our solution to this problem, which is to use the Linux kernel as its own bootloader. Loaded by the EFI stub on UEFI, and packed into a unified kernel image (UKI), the kernel, initramfs, and kernel command line, contain everything they need to reach the final boot target. All necessary drivers, filesystem support, and networking are already built in and code duplication is avoided.
↫ Marta Lewandowska
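
As a rough illustration of what "packed into a unified kernel image" means, the sketch below models a UKI as a single object carrying the three payloads named in the quote. The class and function names are hypothetical and exist only for this example. The section names `.linux`, `.initrd`, and `.cmdline` are the ones commonly used for these payloads in UKI files, but treat the layout here as a simplification rather than a description of any real build tool.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UnifiedKernelImage:
    """Hypothetical model of a UKI: one EFI binary carrying three payloads."""
    kernel: bytes      # the Linux kernel, assumed to live in the .linux section
    initramfs: bytes   # the initial RAM filesystem, assumed to live in .initrd
    cmdline: str       # the kernel command line, assumed to live in .cmdline

    def sections(self) -> Dict[str, bytes]:
        # Map each payload to the section name it is stored under.
        return {
            ".linux": self.kernel,
            ".initrd": self.initramfs,
            ".cmdline": self.cmdline.encode("utf-8"),
        }


def with_cmdline(image: UnifiedKernelImage, new_cmdline: str) -> UnifiedKernelImage:
    # The image is immutable, so changing the command line means
    # producing a whole new image rather than editing one field in place.
    return UnifiedKernelImage(image.kernel, image.initramfs, new_cmdline)
```

The frozen dataclass is a deliberate choice: it mirrors the fact that the command line travels inside the image, so any change yields a new image.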

I’m not a fan of GRUB. It’s too much of a single point of failure, and since I’m not going to be dual-booting anything anyway I’d much rather use something that isn’t as complex as GRUB. Systemd-boot is an option, but switching over from GRUB to systemd-boot, while possible on my distribution of choice, Fedora, is not officially supported and there’s no guarantee it will keep working from one release to the next.

The proposed solution here seems like another option, and it may even be a better option – I’ll leave that to the experts to discuss. It seems to me that the ideal we should be striving for is to have booting the operating system become the sole responsibility of the UEFI firmware, which usually already contains the ability to load any operating system that supports UEFI without explicitly installing a bootloader. It’d be great if you could set your UEFI firmware to just always load its boot menu, instead of hiding it behind a function key or whatever.

We made UEFI more capable to address the various problems and limitations inherent in BIOS. Why are we still forcing UEFI to pretend it still has the same limitations?

15 Comments

2024-07-09 6:29 pm
Drizzt321
Grub2 has worked, but yeah, from an educated power user’s perspective, it’s a bit of magic that seems fragile at times.
Personally, I’ve been meaning to try out https://github.com/zbm-dev/zfsbootmenu on a spare machine, since I run ZFS root with Debian. So I’ve been wanting an easy way to boot to a snapshot/rollback to a snapshot/etc, in order to easily rollback if I have an issue with an update, since I snapshot before I do updates. Already saved me a couple of times from being in a bad state.
2024-07-09 7:01 pm
Langalf
The only place I dual boot is a laptop I use to test Haiku. I had to use the Haiku boot loader to get it to also boot Linux; I could never get Grub to do it. As long as it is still possible to dual boot both, I have no preference for how it is done.
2024-07-09 8:19 pm
runciblebatleth
The problem with UKI is that the kernel command line is baked into the image. That means you have to generate a new image and write it to your EFISP to change the kernel command line instead of using GRUB or another loader, which makes it significantly harder to use temporary kernel command line arguments to debug boot issues or start single-tasking (init=/bin/sh) to help someone get back into a system to which they lost the password. I suppose the intended replacement for that is chrooting in from a live environment – or telling users that everything they didn’t back up is now gone, if their disk is TPM-encrypted and the OS won’t boot.
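
To make the "temporary kernel command line arguments" point concrete, here is a minimal Python sketch of what a boot loader's edit prompt effectively does: take the stored command line and overlay a one-off argument such as init=/bin/sh. The function names are hypothetical, and the parsing is a simplification (repeated parameters collapse to the last value and quoting is not preserved), so it is not how GRUB or the kernel actually handles the string.

```python
import shlex
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into key/value pairs (flags map to None)."""
    params: Dict[str, Optional[str]] = {}
    for token in shlex.split(cmdline):
        # Split on the first "=" only, so root=UUID=... keeps its full value.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def with_override(cmdline: str, extra: str) -> str:
    """Return the command line with the extra arguments layered on top."""
    merged = parse_cmdline(cmdline)
    merged.update(parse_cmdline(extra))
    return " ".join(k if v is None else f"{k}={v}" for k, v in merged.items())


# A one-off boot into a shell, without touching the stored command line.
temporary = with_override("root=/dev/sda2 ro quiet", "init=/bin/sh")
```

With a boot loader this override lasts for a single boot. With a command line embedded in a UKI, the equivalent change means building and installing a new image.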

2024-07-10 9:26 am
teco.sb
Wouldn’t the fact that the kernel ships with a built-in initramfs mitigate against this type of problem? Back in the initrd days, I know a shell was included (typically dash or busybox ash) and that’s the shell you dropped into when something went haywire in the boot process. Wouldn’t the packaged initramfs have the same capabilities, mitigating the need for init=/bin/sh?
In the places I run Linux (at home only), I have not had a boot issue in at least a dozen years. The last time it happened it was because of a botched experiment and ended up with an incompatible version of a library in /usr/local. As far as I understand, you would still be able to chroot into a system from an initramfs environment and fix things manually, assuming your /bin/sh is operational. No?
Disclaimer: I don’t know about this stuff as much as I would like.

2024-07-10 9:57 am
Alfman verbose=1
It’s not something I normally require, but the initrd is an extremely limited environment and it doesn’t always have the tools you need to fix the problem. This could be fixed now, but as an example I was unable to fix an issue with lvm2 thin volumes from initrd because the tooling required wasn’t installed. So I had to boot from another live OS.
Incidentally the last time I had to use initrd was when I was testing btrfs raid. The boot fails when a drive is removed from the array, waiting for an admin to log in to the console. IMHO this makes btrfs totally useless for production environments since it doesn’t offer the same level of failover redundancy as mdraid.

2024-07-10 12:07 am
Enturbulated
Increases required effort for troubleshooting while providing dubious benefit? Sounds like a Microsoft solution.
2024-07-10 1:41 am
chriscox
Can vary, but generally speaking, UEFI can be brought up (sometimes easily, sometimes not) so that you can “select” what to boot.
Which for many, may provide the main feature they got out of grub.
2024-07-10 3:48 am
nia_netbsd
Translation: “we can’t keep track of our own churn any more”
2024-07-10 10:27 am
tknarr
This is basically how bootloaders used to work. The 1st-stage bootloader (boot ROM) was small and initialized the hardware. It was just smart enough to load the first block from the boot drive. Back then the big factor in what was needed to do that was the kind of I/O bus the drive was on, the type of drive didn’t much matter. The 2nd-stage bootloader (from the boot block) was just smart enough to understand the filesystem on the drive and locate and load the kernel image. It didn’t need to understand every possible filesystem because it was written to the boot block as part of formatting the volume and was set up to be the correct version for the filesystem and OS being installed. The complicated parts were part of the OS kernel and boot scripts, no need to include them in the bootloader at all.
All the complexity in the boot process today seems to come from having the boot ROM now being flash memory that can be rewritten, so the bootloader has to not just load the next stage but verify that that next stage hasn’t been compromised by malware. Maybe it’s time to go back to a stupidly-simple boot ROM that can’t be updated without physical access to the motherboard itself (“If you need to update your computer’s BIOS, please contact a local computer repair shop.”) and won’t load the 2nd-stage bootloader from disk without manual confirmation if it’s been modified since the last boot (“You will only be prompted to confirm booting the first time you start your computer or after re-installing the OS. If you are ever prompted at any other time, answer “No” and contact a local computer repair shop.”).
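
The "prompt only if the 2nd-stage bootloader changed since the last boot" idea can be sketched in a few lines of Python. This is only an illustration of the check described above: the function names, the choice of SHA-256, and the idea of passing the confirmation prompt in as a callback are all assumptions made for the example, not a description of any existing firmware.

```python
import hashlib
from typing import Callable, Optional


def digest(blob: bytes) -> str:
    """Fingerprint of a boot stage (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(blob).hexdigest()


def may_boot(stage2: bytes,
             last_digest: Optional[str],
             confirm: Callable[[], bool]) -> bool:
    """Decide whether the boot ROM should hand control to the 2nd stage.

    last_digest is the fingerprint recorded at the previous boot, or None
    on a first boot. confirm stands in for the manual on-screen prompt.
    """
    if digest(stage2) == last_digest:
        # Unchanged since the last boot: load it without asking.
        return True
    # First boot, reinstall, or tampering: require manual confirmation.
    return confirm()
```

Note that the sketch leaves out the hard part, which is where the previous digest is stored so that malware cannot simply rewrite it along with the loader.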
2024-07-10 2:52 pm
js
I’ve been using systemd-boot with my own Secure Boot keys since Fedora 33: https://blog.nil.im/?7a
And it only ever “broke” with Fedora 39: https://blog.nil.im/?80
And that was only because they started adding official UKI support.
So yeah, if you don’t want to use GRUB, you don’t have to.
2024-07-10 3:18 pm
NaGERST
bring back lilo instead. it was so easy to chainload windows, BeOS, haiku, OS2, AROS, AtheOS, Symbian, Arachne and SEAL
2024-07-10 4:12 pm
bobby.tables
Hmm, isn’t it exactly what LinuxBoot.org is doing already?
2024-07-10 7:18 pm
grindstone
I think Slackware still uses lilo (somehow).

2024-07-10 7:49 pm
Alfman verbose=1
grindstone,
I think Slackware still uses lilo (somehow).
I use syslinux. Grub is fine, but I’d say it’s over-engineered. I’m a fan of the “keep it simple, stupid” rule. This is the philosophy that permeates most of my preferences.

2024-07-14 10:50 am
Geck
So the creators of systemd are arguing Grub is a tiny bit too complex and a security risk, so they will ditch Grub and provide a better solution. A bit of self-reflection wouldn’t hurt, and good luck in providing a better solution than Grub. Another case of NIH syndrome looming, coupled with a false sense of supremacy.