Linked by Thom Holwerda on Sun 16th Apr 2006 15:36 UTC
Right in between a car crash and Easter, I knew I had to write a Sunday Eve Column. So here I am, digesting vast quantities of chocolate eggs (and I don't even like chocolate), craving coffee (for me, about as special as breathing), with the goal of explaining to you my, well, obsession with microkernels. Why do I like them? Why do I think the microkernel paradigm is superior to the monolithic one? Read on.
Thread beginning with comment 115440
nick Member since:
2006-04-17

How can microkernel drivers "not harm your system"?!?

A buggy disk driver can hose your disks. A buggy filesystem driver can corrupt your filesystem. A buggy memory manager can corrupt memory. A buggy network driver can corrupt network traffic.

Aside from corruption, their failure will obviously result in the denial of usually-critical system resources like memory, filesystem, or I/O device access.

So I don't see how microkernel proponents can be lulled into this false sense of security.


AndyZ Member since:
2005-07-05

Corrupting a filesystem is one point of failure, but a failed "server" (as in driver) can be restarted. If it's not possible for the microkernel to restart, say, the network or filesystem driver, then how could they have been started in the first place (at boot time)?

AndyZ


nick Member since:
2006-04-17

I didn't say it wasn't possible. Nor would it be impossible to do a similar reinitialisation of some subsystem or driver in a monolithic kernel.

I don't claim there are no advantages to a microkernel -- obviously there are some, otherwise even their most stubborn supporters would have given up on the idea in the face of all the disadvantages.

But this (immunity of the system and its data from bugs) is not one of them.


Brendan Member since:
2005-11-16

Consider a dodgy driver or service that occasionally writes to random addresses.

In a traditional monolithic system, the driver/service would be implemented as part of the kernel and can trash anything that's running on the computer - nothing will stop it from continuing to trash things, and nothing will help detect which driver or service is faulty.

On a basic micro-kernel the driver/service can't affect anything else in the system, and sooner or later it'd generate a page fault and be terminated. This makes it much easier to find which driver or piece of software was faulty, and means that the damage is limited.

In this case, you're still partially screwed because everything that was relying on that driver or service will have problems when that driver/service is terminated. This isn't always a problem though (it depends on what died) - for example, if the driver for the sound card dies then no-one will care much. If the video driver dies then the local user might get annoyed, but you could still log in via the network, and things like databases and web servers won't be affected.
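To make the isolation claim concrete, here's a minimal POSIX sketch (the "driver" is just a child process here, an illustration rather than any real microkernel's mechanism): a wild write in an isolated address space kills only the offender, and the survivor knows exactly which process faulted.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            /* the buggy "driver": write through a random address */
            volatile int *wild = (int *)0xdeadbeef;
            *wild = 42;                 /* page fault -> SIGSEGV; only we die */
            _exit(0);
        }

        int status;
        waitpid(pid, &status, 0);
        if (WIFSIGNALED(status))        /* we know exactly who faulted */
            printf("driver %d killed by signal %d; system still running\n",
                   (int)pid, WTERMSIG(status));
        return 0;
    }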

The more advanced a micro-kernel is, the more systems it will have in place to handle failures.

For example, if the video driver dies the OS might tell the GUI about it, try to download/install an updated driver, then restart the video driver and eventually tell the GUI that the video is back up and running. The user might lose video for 3 seconds or something but they can still keep working afterwards (and there'd hopefully be an explanation in the system logs for the system administrators to worry about).
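A restart mechanism along those lines could look roughly like the following sketch, assuming drivers run as ordinary processes and using a made-up driver path (/sbin/videodrv); a real microkernel would use its own process and IPC primitives instead of fork/exec:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Spawn a driver as an ordinary process. */
    static pid_t spawn_driver(const char *path)
    {
        pid_t pid = fork();
        if (pid == 0) {
            execl(path, path, (char *)NULL);
            _exit(127);                 /* exec failed */
        }
        return pid;
    }

    int main(void)
    {
        const char *driver = "/sbin/videodrv";  /* hypothetical binary */
        pid_t pid = spawn_driver(driver);

        for (;;) {
            int status;
            if (waitpid(pid, &status, 0) < 0)
                break;
            fprintf(stderr, "log: %s died (status %d); restarting\n",
                    driver, status);
            /* here: tell the GUI video is gone, maybe fetch a new driver */
            sleep(1);                   /* crude backoff between restarts */
            pid = spawn_driver(driver);
            /* here: tell the GUI video is back */
        }
        return 0;
    }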

Another way would be to use "redundancy". For example, have one swap partition on "/dev/hda3" and another on "/dev/hdc3" with 2 separate disk drivers. Writes go to both disk drivers, but reads come from the least loaded disk driver. In this case the system would be able to handle the failure of one swap partition or disk driver (but not both). With fast enough networking, maybe keeping a redundant copy of swap space on another computer is an option...
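As a toy model of that redundancy scheme (the driver structs, load counters, and in-memory "partitions" are all hypothetical stand-ins), writes get mirrored to both drivers and reads fall back to the survivor when one driver has died:

    #include <stdio.h>
    #include <string.h>

    #define BLOCK 512

    struct disk_driver {
        const char *name;
        int alive;                  /* set to 0 when the driver crashes */
        int pending;                /* outstanding requests, a crude load metric */
        char store[16][BLOCK];      /* stand-in for the swap partition */
    };

    static int dd_write(struct disk_driver *d, int blk, const char *buf)
    {
        if (!d->alive) return -1;
        memcpy(d->store[blk], buf, BLOCK);
        return 0;
    }

    static int dd_read(struct disk_driver *d, int blk, char *buf)
    {
        if (!d->alive) return -1;
        memcpy(buf, d->store[blk], BLOCK);
        return 0;
    }

    /* Mirror a write: both replicas get the data. */
    static int swap_write(struct disk_driver *a, struct disk_driver *b,
                          int blk, const char *buf)
    {
        int ra = dd_write(a, blk, buf);
        int rb = dd_write(b, blk, buf);
        return (ra == 0 || rb == 0) ? 0 : -1;   /* survives one failure */
    }

    /* Read from the least-loaded living driver, fall back to the other. */
    static int swap_read(struct disk_driver *a, struct disk_driver *b,
                         int blk, char *buf)
    {
        struct disk_driver *first = (a->pending <= b->pending) ? a : b;
        struct disk_driver *second = (first == a) ? b : a;
        if (dd_read(first, blk, buf) == 0) return 0;
        return dd_read(second, blk, buf);
    }

    int main(void)
    {
        struct disk_driver hda = { "hda3", 1, 0, {{0}} };
        struct disk_driver hdc = { "hdc3", 1, 0, {{0}} };
        char page[BLOCK] = "swapped-out page";

        swap_write(&hda, &hdc, 0, page);
        hda.alive = 0;              /* one driver dies... */

        char back[BLOCK];
        if (swap_read(&hda, &hdc, 0, back) == 0)
            printf("recovered from %s: %s\n", hdc.name, back);
        return 0;
    }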

The point is that for monolithic kernels you don't have these options - if anything in kernel space dies you have to assume that everything in kernel space has become unreliable, and rebooting is the only reliable option (if the code to do a kernel panic and reboot hasn't been trashed too).

Most developers of monolithic systems will say that it's easier to make their drivers and services bug-free than it is to implement systems that recover from failures. They may be right, but it might be "wishful thinking" too...


nick Member since:
2006-04-17

What if the soundcard driver gets corrupted and starts DMA to a random page of memory that was actually some filesystem's pagecache[*]?

What if a driver goes haywire and starts sending the wrong IPC messages down the pipe?

> Another way would be to use "redundancy". For example, have one swap partition on "/dev/hda3" and another on "/dev/hdc3" with 2 separate disk drivers. Writes go to both disk drivers, but reads come from the least loaded disk driver. In this case the system would be able to handle the failure of one swap partition or disk driver (but not both). With fast enough networking, maybe keeping a redundant copy of swap space on another computer is an option...

I don't think so. You have to have at least 3 devices and 3 different drivers, and perform checksumming across all the data that comes out of them, if you really want to be able to discard invalid results from a single driver. Or you could possibly store checksums on disk, but if you don't trust a single driver...
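A sketch of that 2-of-3 voting, with drivers modelled as plain read functions (all names illustrative): accept whichever block at least two drivers agree on, and discard the odd one out.

    #include <stdio.h>
    #include <string.h>

    #define BLOCK 512

    typedef int (*read_fn)(int blk, char *buf);

    static char disk[3][BLOCK];         /* three toy "devices" */
    static int rd0(int b, char *o) { (void)b; memcpy(o, disk[0], BLOCK); return 0; }
    static int rd1(int b, char *o) { (void)b; memcpy(o, disk[1], BLOCK); return 0; }
    static int rd2(int b, char *o) { (void)b; memcpy(o, disk[2], BLOCK); return 0; }

    /* Return a block at least two of the three drivers agree on, else -1. */
    static int vote_read(read_fn d[3], int blk, char *out)
    {
        char buf[3][BLOCK];
        for (int i = 0; i < 3; i++)
            if (d[i](blk, buf[i]) != 0)
                memset(buf[i], 0xff, BLOCK);    /* treat failure as garbage */

        for (int i = 0; i < 3; i++)
            for (int j = i + 1; j < 3; j++)
                if (memcmp(buf[i], buf[j], BLOCK) == 0) {
                    memcpy(out, buf[i], BLOCK); /* majority found */
                    return 0;
                }
        return -1;                              /* all three disagree */
    }

    int main(void)
    {
        memcpy(disk[0], "good data", 10);
        memcpy(disk[1], "good data", 10);
        memcpy(disk[2], "corrupted", 10);       /* one faulty driver */

        read_fn drivers[3] = { rd0, rd1, rd2 };
        char out[BLOCK];
        if (vote_read(drivers, 0, out) == 0)
            printf("majority says: %s\n", out);
        return 0;
    }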

I think in general it would be far better to go with RAID, or a redundant cluster, wouldn't it?

> The point is that for monolithic kernels you don't have these options - if anything in kernel space dies you have to assume that everything in kernel space has become unreliable, and rebooting is the only reliable option (if the code to do a kernel panic and reboot hasn't been trashed too).

A microkernel can fail too, end of story. If you need really high availability, you need failover clusters.

And within a single machine, I happen to think a hypervisor/exokernel + many monolithic kernels is a much nicer solution than a microkernel.

[*] Perhaps you might have DMA services in the kernel and verify that all DMA requests go to/from driver-local pages - yet more overhead... does any microkernel do this?
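For what it's worth, the check that footnote speculates about might look something like this in a kernel-side DMA service (structures and names are hypothetical; modern hardware tends to offload this job to an IOMMU):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct region { uintptr_t base; size_t len; };

    struct driver {
        const struct region *owned;     /* pages this driver may DMA to/from */
        int nowned;
    };

    /* Refuse any transfer that isn't wholly inside a driver-owned region.
       The arithmetic is arranged to avoid overflow on addr + len. */
    static bool dma_allowed(const struct driver *drv, uintptr_t addr, size_t len)
    {
        for (int i = 0; i < drv->nowned; i++) {
            const struct region *r = &drv->owned[i];
            if (addr >= r->base && len <= r->len &&
                addr - r->base <= r->len - len)
                return true;
        }
        return false;                   /* would touch someone else's memory */
    }

    int main(void)
    {
        const struct region pages[] = { { 0x100000, 0x4000 } };
        const struct driver snd = { pages, 1 };
        /* transfer inside the sound driver's own pages: allowed */
        return dma_allowed(&snd, 0x101000, 512) ? 0 : 1;
    }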


Cloudy Member since:
2006-02-15

> The point is that for monolithic kernels you don't have these options - if anything in kernel space dies you have to assume that everything in kernel space has become unreliable, and rebooting is the only reliable option (if the code to do a kernel panic and reboot hasn't been trashed too).

This is true in most implementations, but it is a feature of the implementation rather than a necessity of the system. It is, given reasonable VM design, possible to make the user/supervisor transition distinct from the addressability distinction.

You can have a 'monolithic' kernel in the user/supervisor sense -- meaning that the whole thing is compiled as a unit and all of it runs in supervisor mode -- without having one in the memory-addressability sense, so that each subsystem can only access what it's allowed to.
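One way to visualise that split, using plain POSIX mprotect() as a stand-in for the VM machinery (everything here is illustrative): the whole program is compiled as one unit and runs at one privilege level, yet the "filesystem" data is only addressable while the filesystem code runs, so a stray write from anywhere else faults.

    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static char *fs_private;            /* pages only the "filesystem" may touch */
    static size_t pagesz;

    static void fs_enter(void) { mprotect(fs_private, pagesz, PROT_READ | PROT_WRITE); }
    static void fs_leave(void) { mprotect(fs_private, pagesz, PROT_NONE); }

    static void fs_subsystem_work(void)
    {
        fs_enter();                     /* open the addressability window */
        strcpy(fs_private, "superblock");
        fs_leave();                     /* close it again */
    }

    int main(void)
    {
        pagesz = (size_t)sysconf(_SC_PAGESIZE);
        fs_private = mmap(NULL, pagesz, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        fs_subsystem_work();            /* fine: same privilege, right window */
        fs_private[0] = 'X';            /* any other "subsystem": faults here */
        return 0;
    }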


ma_d Member since:
2005-06-29

I did define "system" as memory + CPU, didn't I?
