posted by Nicholas Blachford on Thu 15th Jul 2004 20:14 UTC

"Next-Gen OS, Page 2/3"

Will this not lead to performance issues?
On single-processor systems there is no doubt that there is a negative performance impact. However, as I explained in part 1, the hardware for this system is based on a multi-core processor, and this changes things.

To understand what will happen, we first need to know exactly what causes the microkernel performance impact in the first place.

A context switch occurs when one task has to stop and let another run. It involves saving the processor "state" (the contents of the data and control registers) to RAM, and this operation can take tens of thousands of clock cycles. Context switches happen more often in a microkernel-based OS because the kernel functionality is broken up into different tasks, which all need to be switched in and out of the CPU to operate.

This is a problem because performing the switch takes time, during which the CPU cannot do any useful work. Obviously, if there are thousands of switches per second, performance is going to suffer. More importantly, a context switch can cause part of the cache to be flushed, and this hurts subsequent performance more than the switch itself does.
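To put rough numbers on the direct cost, here is a small back-of-the-envelope sketch. The figures (5,000 switches per second, 20,000 cycles per switch, a 2 GHz clock) are assumptions picked for illustration, not measurements, and the function name is made up for this example. It covers only the time spent switching and ignores the cache effect.

```python
def direct_switch_overhead(switches_per_second, cycles_per_switch, clock_hz):
    """Fraction of CPU time spent purely on context switching.

    Ignores cache effects, which the text argues are the larger cost.
    All inputs are illustrative assumptions.
    """
    return switches_per_second * cycles_per_switch / clock_hz


if __name__ == "__main__":
    # Assumed figures: 5,000 switches/s, 20,000 cycles each, 2 GHz core.
    fraction = direct_switch_overhead(5_000, 20_000, 2_000_000_000)
    print(f"{fraction:.1%}")  # prints 5.0%
```

Under these assumed figures the switching itself eats about a twentieth of the CPU, which fits the point that the cache disruption afterwards is the bigger worry.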

The macrokernel approach does not suffer these performance issues: everything is in the kernel, so it doesn't need to context switch when instruction flow moves between different internal parts of the kernel.

All that said, a well-designed microkernel need not be slow. It can to some extent get around the context switch speed hit by putting messages together and transferring them en masse (asynchronously), which reduces the number of context switches necessary. Unfortunately, much of the research into microkernels has been on Unix, and the synchronous nature of Unix's APIs [async] means asynchronous messaging is not used, which results in more context switches and thus lower performance. As such, the microkernel's reputation for being slow may be at least partially undeserved.
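The effect of batching can be shown with a toy count. This is only a sketch: it assumes every delivery to a server task costs one switch into the server and one switch back, and the function names and the constant are invented for illustration.

```python
# Toy model: each delivery to a server task costs one switch into the
# server and one switch back to the client (an assumption for illustration).
SWITCHES_PER_DELIVERY = 2


def switches_synchronous(num_messages):
    """Every message is delivered on its own and the sender waits."""
    return num_messages * SWITCHES_PER_DELIVERY


def switches_batched(num_messages, batch_size):
    """Messages are queued and delivered together, batch_size at a time."""
    num_batches = -(-num_messages // batch_size)  # ceiling division
    return num_batches * SWITCHES_PER_DELIVERY


if __name__ == "__main__":
    print(switches_synchronous(1000))  # prints 2000
    print(switches_batched(1000, 50))  # prints 40
```

In this model, sending 1,000 messages in batches of 50 cuts the switches from 2,000 to 40. A real system would not batch this neatly, but the direction of the saving is the same.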

Remember, though, that at the base of this system is an Exokernel. A traditional microkernel passes messages through the kernel to their destination. An Exokernel doesn't deal with the message itself; it just tells the destination there is a message and leaves the destination to deal with it. This reduces the messaging overhead.
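The difference between the two message paths can be sketched roughly as follows. This is a simplified model of the description above, not how any particular kernel is implemented: the class names, the `shared_region` dictionary standing in for memory both tasks can reach, and the byte counter are all made up for illustration.

```python
from collections import deque


class CopyingKernel:
    """Microkernel-style path: the payload travels through the kernel."""

    def __init__(self):
        self.mailbox = deque()
        self.bytes_handled = 0

    def send(self, payload):
        self.mailbox.append(bytearray(payload))  # copy into the kernel
        self.bytes_handled += len(payload)

    def receive(self):
        data = self.mailbox.popleft()
        self.bytes_handled += len(data)  # copy out to the receiver
        return bytes(data)


class NotifyingKernel:
    """Exokernel-style path: the kernel only says that a message exists."""

    def __init__(self):
        self.notifications = deque()
        self.bytes_handled = 0  # stays at zero, the payload is never touched

    def notify(self, slot):
        self.notifications.append(slot)  # just a small slot number

    def next_notification(self):
        return self.notifications.popleft()


def demo():
    payload = b"x" * 4096
    shared_region = {}  # hypothetical memory visible to both tasks

    copying = CopyingKernel()
    copying.send(payload)
    via_copy = copying.receive()

    notifying = NotifyingKernel()
    shared_region[0] = payload  # sender writes the message itself
    notifying.notify(0)
    via_notify = shared_region[notifying.next_notification()]

    return copying.bytes_handled, notifying.bytes_handled, via_copy == via_notify


if __name__ == "__main__":
    print(demo())  # prints (8192, 0, True)
```

Both paths deliver the same 4 KB message, but in this model the copying kernel handles it twice while the notifying kernel handles only a slot number.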

BeOS used asynchronous messaging and is indeed a very fast OS. However, the network stack, being outside the kernel, proved to be a performance burden; it was later moved inside, and this did boost networking performance. This version was never commercially released by Be, but it is part of Zeta [Zeta] (the only legal way to get a full BeOS today).

Using multiple cores
The difference with multiple CPU cores is that separate kernel components can run simultaneously on different cores, so they will not need to context switch as often. Messages still need to be sent between the cores, but this will not have the same overhead as a context switch and will not flush the cache the way a context switch can.

Using multiple cores along with asynchronous message passing could lead to our exo/microkernel-based OS outperforming a synchronous macrokernel OS. Each message pass or function call carries a fixed time cost. Asynchronous message passing allows bigger chunks of data to be passed in one go, which reduces the number of times messages are passed compared to a system that uses synchronous API calls.
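As a rough worked version of this argument (the symbols are introduced here purely for illustration and do not come from any measurement): let N be the number of items to hand over, c the fixed cost of one message pass or call, d the per-item cost of moving the data, and B the number of items carried in each asynchronous message.

```latex
T_{\text{sync}} = N\,(c + d), \qquad
T_{\text{async}} = \left\lceil \frac{N}{B} \right\rceil c + N\,d
```

With B = 1 the two costs are equal. As B grows, the fixed cost c is paid fewer times while the data cost N d stays the same, so the saving depends on how large c is relative to d.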

The very technique which reduces performance on a single core for microkernels and boosts macrokernels may have the complete opposite effect on multicore CPUs, leaving microkernels as the higher-performing OS architecture. This won't happen immediately, but the effect will become more apparent as the number of cores increases and the different parts of the OS can get their own core.

It could be argued that even when spread apart like this, an application running on multiple cores will make the OS components switch out, causing performance loss. This is of course a risk, but anything with high computation needs is more likely to be running on the Cell processors, which will not be handling the OS. Remember, this is a desktop, so the CPU cores are likely to be sitting around doing nothing most of the time. Many like to discuss the relative merits of the performance of OSs and hardware, but very, very few actually utilise that performance.

It should be pointed out that the 2.6 Linux kernel includes asynchronous I/O. Asynchronous messaging is being added to a FreeBSD-based OS by the DragonFly BSD project [Dragon].

So no Linux?
Linux (or *BSD) still has advantages of course, but the microkernel approach looks like it can deliver not only all the inherent advantages of a microkernel design but also a performance advantage in its favour. This approach is also consistent with the guiding principle of simplicity I set out in part 1, and it gives us a chance to explore OS design from a new angle and see what the results are.

While the system will for the most part act like a microkernel-based OS, the fact that it's really using an exokernel adds the ability to have applications almost completely bypass the OS and hit the hardware directly in a safe, shared manner. Hitting the hardware is a somewhat frowned-upon approach, but it will be useful for applications which use the FPGA and has the potential to allow massive application speedups [Exo]. It also allows something else which will be rather useful...
