Why Monolithic Kernels Aren’t the End of the World

After the Why I like microkernels article, I thought it’d be useful to have a view from the “other side” of this endless war. While some of the reasons given by microkernel fans are true, the big picture is somewhat different, and I think that is what keeps traditional-style kernels on top. Note: the author is not a native English speaker, so please forgive any grammar or spelling mistakes.

The main advantage of a pure muK (microkernel) is that things like drivers run in separate processes and communicate using some IPC mechanism, which makes it pretty much impossible for them to affect other processes. This improves, in theory and in practice, the reliability of the system. Compare that with a traditional kernel like Linux or Solaris, where a NULL pointer dereference in a mouse driver can bring the system down (and it does; there are bug reports of it). By the way, this is one of the reasons why the quality of some monolithic kernels stays so high and they can reach years of uptime: the fact that even a simple bug like a bad pointer dereference can take the system down forces developers to keep their code stable and bug-free. Even though it would be better to avoid reboots, it is useful as a whip for developers.
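The isolation argument can be pictured with ordinary userspace processes. The following is only a loose analogy, not how any microkernel is implemented: the "drivers" are hypothetical one-line Python programs, `run_driver` is a name made up for this sketch, and the IPC channel is just the child's standard output.

```python
import subprocess
import sys

# Hypothetical "drivers": one aborts the way a fatal bug would,
# the other behaves and reports an event.
CRASHING_DRIVER = "import os; os.abort()"
HEALTHY_DRIVER = "print('mouse moved')"


def run_driver(source):
    """Run a driver in its own process and return (exit_code, output).

    The child has its own address space, so a fatal error in it cannot
    corrupt or terminate the caller; the caller only sees a non-zero
    exit code.
    """
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.strip()


if __name__ == "__main__":
    print(run_driver(HEALTHY_DRIVER))
    code, _ = run_driver(CRASHING_DRIVER)
    print("crashed driver exit code:", code, "(supervisor still running)")
```

Calling `os.abort()` inside the supervising process itself would kill it, which is roughly the situation a driver bug creates in a monolithic kernel.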

Those are facts. One of the reasons microkernels were born was that people thought the expected increase in complexity would make it impossible to keep monolithic kernels working. And complexity has increased a lot: compare the first Unix kernels, at a few hundred KB, with modern Unix-like kernels like Linux, at a few MB. SATA, IPv6, IPsec, 3D-capable hardware, multi-core CPUs, heavy multi-threading, hotplugging at every level, USB, FireWire… How have monolithic kernels managed to keep working? Since complexity has increased so much, why aren’t microkernels ruling the world, as one would expect? How is it possible that, despite all the complexity and all the disadvantages, monolithic kernels are still here?

To understand this, you need to dismantle in your head what I think is the single biggest myth and lie about muKs: their supposed superiority when it comes to modularity and design. No matter how hard I try, I can’t really understand how using separate processes and IPC mechanisms improves the modularity and design of the code itself.

Whether a design is good or bad, modular or not (from a source code point of view), doesn’t depend at all on whether it uses IPC mechanisms. There’s no reason (and that’s what Linus Torvalds has been saying for years) why a monolithic kernel can’t have a modular and extensible design. For example, take a look at the block layer and the I/O scheduler in the Linux kernel. You can load a new home-made I/O scheduler with insmod if you like and tell a device to use it by doing echo nameofmyscheduler > /sys/block/hda/queue/scheduler. This is useful for devices with special needs: for example, flash-based USB disks, where the access time is constant, are better served by the “noop” I/O scheduler. You can also rmmod schedulers when they’re not in use. That ability to insert and remove different I/O schedulers at runtime is a wonderful example of how flexible, modular and well designed that part of the Linux block layer is.
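As a small illustration of the runtime switch described above, here is a minimal sketch in Python. It assumes the scheduler file lives at /sys/block/<device>/queue/scheduler, which is where Linux kernels expose it, and that the file lists the available schedulers with the active one in square brackets (for example "noop anticipatory deadline [cfq]"). The function names and the `sysfs_root` parameter are made up for this sketch; writing to the real file requires root.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split the sysfs file contents into (available, active).

    The active scheduler is the name wrapped in square brackets.
    """
    available = []
    active = None
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return available, active


def scheduler_path(device, sysfs_root="/sys"):
    """Build the path of the scheduler file for a block device."""
    return Path(sysfs_root) / "block" / device / "queue" / "scheduler"


def set_scheduler(device, name, sysfs_root="/sys"):
    """Select an I/O scheduler, refusing names that are not listed."""
    path = scheduler_path(device, sysfs_root)
    available, _ = parse_schedulers(path.read_text())
    if name not in available:
        raise ValueError(f"{name!r} is not available for {device}: {available}")
    # Equivalent of: echo name > /sys/block/<device>/queue/scheduler
    path.write_text(name)


if __name__ == "__main__":
    print(parse_schedulers("noop anticipatory deadline [cfq]"))
```

The `sysfs_root` parameter exists only so the functions can be tried against a scratch directory instead of the real /sys.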

Some people have wasted 20 years saying that monolithic kernels can’t have a good design; other people have spent all that time improving the internal design of monolithic kernels like Linux or Solaris. Sure, in the previous example a buggy I/O scheduler can still bring the system down, but that doesn’t mean that the design and modularity of the code are bad. The same goes for drivers and other parts of the kernel: it’s not as if Linux drivers didn’t use well-defined APIs for every subsystem (PCI, wireless, networking). Writing a device driver for Linux is not a hell that only a few people can endure. On a local level, Linux drivers can be simple and clean, and this keeps getting better with every release. A muK is not going to be more modular or have a better design just because it’s a muK. It’s true, however, that running things in separate processes and communicating over IPC channels forces programmers to define an API and keep things modular. That doesn’t mean the API is good, though, or that there isn’t, for example, a layer violation that you need to fix; and fixing that layer violation may force you to rewrite other modules that depend on that interface.

There’s no radical “paradigm shift” between micro and monolithic kernels when it comes to design and modularity: there’s just the plain old process of designing and writing software. Nothing stops muKs from having a good design, but kernels like Linux and Solaris have had a LOT of time, real-world experience and resources to improve their designs to a higher standard than some microkernels (and it’s not even a choice: the increase in complexity forces them to do so anyway). Some people think that Linux developers like to break APIs with every release for fun, because “design of code” is something hard to appreciate if you aren’t involved and you don’t have some taste for it, but there is only one reason why the APIs are changed: to improve things. Does that mean that it’s not possible to have a good muK? No, but it means that “traditional” kernels like Linux aren’t the monsters that some people say they are.
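To make the point about modularity without IPC concrete, here is a toy sketch of a pluggable interface living inside a single address space. It is not the Linux elevator API; every name in it (`IOScheduler`, `register`, `unregister`, `create`, the two scheduler classes) is hypothetical and chosen only for illustration.

```python
import heapq
from abc import ABC, abstractmethod


class IOScheduler(ABC):
    """The interface every pluggable scheduler must implement."""

    name = "abstract"

    @abstractmethod
    def add_request(self, sector):
        """Queue a request for the given sector."""

    @abstractmethod
    def next_request(self):
        """Return the next sector to service, or None if idle."""


class NoopScheduler(IOScheduler):
    """Services requests strictly in arrival order (FIFO)."""

    name = "noop"

    def __init__(self):
        self._queue = []

    def add_request(self, sector):
        self._queue.append(sector)

    def next_request(self):
        return self._queue.pop(0) if self._queue else None


class SortedScheduler(IOScheduler):
    """Services the lowest pending sector first."""

    name = "sorted"

    def __init__(self):
        self._heap = []

    def add_request(self, sector):
        heapq.heappush(self._heap, sector)

    def next_request(self):
        return heapq.heappop(self._heap) if self._heap else None


_registry = {}


def register(cls):
    """Make a scheduler class available by name (like loading a module)."""
    _registry[cls.name] = cls


def unregister(name):
    """Remove a scheduler by name (like unloading a module)."""
    del _registry[name]


def create(name):
    """Instantiate a registered scheduler; raises KeyError if unknown."""
    return _registry[name]()
```

Callers depend only on `IOScheduler`, so implementations can be added or removed at runtime through plain function calls, with no process boundary involved. A bug in one of them can still crash everything, of course, which is a separate question from how modular the code is.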

There are other “myths”. For example, that only microkernels can update parts of the kernel on the fly. It’s true that this is not what you usually do in Linux; people usually update the whole kernel instead of a single driver, but there’s no reason why a monolithic kernel couldn’t do it. In Linux you can insert and remove modules, which means it’s possible to remove a driver and insert an updated version. The “Linux culture” doesn’t make it easy, due to the API changes, the development process, and some checks that by default prevent modules compiled for one kernel version from being inserted into kernels of other versions. But it could be done, and other monolithic kernels may be doing it already. Then again, there are some parts that can’t reasonably be updated on any kernel without breaking something. An update of the TCP/IP stack, for example, would require resetting all connections (unless you want to save and restore the state of the TCP/IP stack across updates, which would greatly increase complexity for an event that happens very rarely).

There’s also the “CPUs are fast these days, performance is not a critical factor” myth. Imagine a process which takes X cycles to do something and another which takes Y=X+1. The faster a given CPU executes that process, the more cycles you lose with Y. A fast CPU doesn’t help to execute slow things faster; in some cases it could even make slow things slower. The “let’s not care that much about resource usage” attitude may work for userspace (GNOME, KDE, OpenOffice), where functionality matters more than spending N months figuring out how to rewrite things more efficiently. But it’s not a good thing when you’re writing important things like the kernel, libc or other important libraries because, unlike most of userspace, performance and good resource usage are THE feature for that kind of software.
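One way to read the X versus Y=X+1 argument is in terms of cycles lost per second. The sketch below assumes the operation runs back to back on a CPU with a fixed clock rate; the clock rates and cycle counts are made-up numbers, and `wasted_cycles_per_second` is a name invented for this example.

```python
def wasted_cycles_per_second(clock_hz, base_cycles, extra_cycles=1):
    """Cycles per second spent on the overhead rather than the real work.

    Each operation costs base_cycles + extra_cycles, so the CPU completes
    clock_hz // (base_cycles + extra_cycles) operations per second and
    loses extra_cycles on every one of them.
    """
    ops_per_second = clock_hz // (base_cycles + extra_cycles)
    return ops_per_second * extra_cycles


if __name__ == "__main__":
    for clock_hz in (1_000_000_000, 3_000_000_000):
        lost = wasted_cycles_per_second(clock_hz, base_cycles=99)
        print(f"{clock_hz} Hz: {lost} cycles lost per second")
```

Under these assumptions the fraction of time lost stays the same, but the absolute number of cycles lost per second grows in proportion to the clock rate.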

There’s also the “a microkernel never crashes your system” myth. A driver, be it in userspace or kernelspace, can lock up your computer just by touching the wrong register. Playing with the PCI bus or your graphics card can bring your system down. A microkernel can protect you against a software bug, but there are hardware bugs that software can’t fix in any reasonable way, except by working around them. This means that drivers are not just “simple processes”: they’re “special” in some way, just like other parts of the system.

Will microkernels step up some day? (Note: because I know there are still some people who think that Mac OS X is a real microkernel, I recommend reading point 2 of this paper or some Apple documentation.) Maybe, but looking at how hardware is built today, it doesn’t look like it will happen very soon. Then again, who can predict what computers will be like in 2070? The main problem microkernels have today is the lack of functionality: a real, complete, general-purpose kernel takes many years and a lot of resources to write. Even if you write a competitive microkernel for PCs, it won’t be successful, because it will lack support for hardware devices and other features. Whether it’s monolithic or micro, writing a kernel for general-purpose computers is an almost impossible task.

The fact is that as monolithic kernels improve, they’re moving some functionality to userspace: udev, klibc and libusb are examples of this. Even if you don’t look at it that way, the printer drivers you get with CUPS and the 2D X.org drivers are examples of device drivers in userspace. They are not performance-critical, so interfaces have been written to allow them to run in userspace. FUSE is also a good example. There are even some efforts to build a user-space driver framework for Linux. For me that means only one thing: if running drivers as userspace processes ever becomes so important that traditional kernels are no longer viable (and by important I mean that the real world can’t live without it, not just some academics), it could be much easier to progressively move parts of monolithic kernels to userspace than to rewrite the whole thing from scratch.

Just my 2 cents.

–Diego Calleja


