After the Why I like microkernels article, I thought it'd be useful to have a view from the "other side" of this endless war. While some of the reasons given by microkernel fans are true, the big picture is somewhat different, and it's what I think keeps traditional-style kernels on top. Note: the author is not a native English speaker, so please forgive any grammar or spelling mistakes.
The main advantage of a pure muK (microkernel) is that things like drivers run in separate processes and communicate using some IPC mechanism, which makes it pretty much impossible for them to affect other processes. This improves, in theory and in practice, the reliability of the system. Compare it with a traditional kernel like Linux or Solaris, where a NULL pointer dereference in a mouse driver can bring the system down (and it does; there are bug reports of it). By the way, this is one of the reasons why the quality of some monolithic kernels stays so high and they can have years of uptime: the fact that even simple bugs like a NULL pointer dereference can make your system go down forces developers to keep their code stable and bug-free. Even though it would be better to avoid reboots, it's useful as a whip for developers.
Those are facts. One of the reasons why microkernels were born was that people thought that, with the expected increase in complexity, it would be impossible to keep monolithic kernels working (and the complexity has increased a lot: compare the first Unix kernels, which used a few hundred KB, with modern Unix-like kernels like Linux, which use a few MB). SATA, IPv6, IPsec, 3D-capable hardware, multi-core CPUs, heavy multi-threading usage, hotplugging at every level, USB, firewire… how have monolithic kernels managed to keep working? Since complexity has increased so much, why aren't microkernels ruling the world as one would expect? How is it possible that, despite all the complexity and all the disadvantages, monolithic kernels are still there?
To understand this, you need to dismantle in your head what I think is the single biggest myth and lie about muKs: their supposed superiority when it comes to modularity and design. No matter how much I try, I can't really understand how using separate processes and IPC mechanisms improves the modularity and design of the code itself.
Having a good or a bad design, or being modular or not (from a source code p.o.v.), doesn't depend at all on whether IPC mechanisms are used. There's no reason – and that's what Linus Torvalds has been saying for years – why a monolithic kernel can't have a modular and extensible design. For example, take a look at the block layer and the I/O scheduler in the Linux kernel. You can load new, home-made I/O schedulers with insmod if you like, and tell a device to use one by doing echo nameofmyscheduler > /sys/block/hda/queue/scheduler
(useful for devices which have special needs; for example, flash-based USB disks, where the access time is constant, do want to use the "noop" I/O scheduler). You can also rmmod them when they're not being used. That – the ability to insert and remove different I/O schedulers at runtime – is a wonderful example of how flexible, modular and well designed that part of the Linux block layer is.
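As a concrete (and hedged) illustration of the mechanism this example relies on, here is a bare loadable-module skeleton – roughly what insmod loads and rmmod removes. It is not an actual I/O scheduler (a real one would also register a struct elevator_type with elv_register()); the module name and messages are made up for the sketch.

/* hello_sched.c - a minimal loadable-module skeleton, roughly what
 * insmod loads and rmmod removes. Build with the usual obj-m kbuild
 * makefile against your kernel headers. This is only the module
 * plumbing; a real I/O scheduler would additionally register itself
 * with the block layer (elv_register() and a struct elevator_type). */
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

static int __init hello_sched_init(void)
{
        printk(KERN_INFO "hello_sched: loaded\n");
        return 0;                 /* 0 = success, module stays loaded */
}

static void __exit hello_sched_exit(void)
{
        printk(KERN_INFO "hello_sched: unloaded\n");
}

module_init(hello_sched_init);
module_exit(hello_sched_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal module skeleton for the insmod/rmmod example");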
Some people have wasted 20 years saying that monolithic kernels can't have a good design; other people have spent all that time improving the internal design of monolithic kernels like Linux or Solaris. Sure, a buggy I/O scheduler can still bring the system down in the previous example, but that doesn't mean that the design and modularity of the code is bad. The same goes for drivers and other parts of the kernel: it's not as if Linux drivers don't use well-defined APIs for every subsystem (PCI, wireless, networking). Writing a device driver for Linux is not a hell that only a few people can endure. On a local level, Linux drivers can be simple and clean, and it keeps getting better with every release. A muK is not going to be more modular or have a better design just because it's a muK. It's true, however, that running things in separate processes and communicating over IPC channels forces programmers to define an API and keep things modular. It doesn't mean, however, that the API is good, or that there isn't, for example, a layer violation that you need to fix; and fixing that layer violation may force you to rewrite other modules that depend on that interface. There's no radical "paradigm shift" anywhere between micro and monolithic kernels when it comes to design and modularity: there's just the plain old process of writing and designing software. And nothing stops muKs from having a good design, but kernels like Linux and Solaris have had a LOT of time, real-world experience and resources to improve their designs to a higher standard than some microkernels (and it's not even a choice: the increase in complexity forces them to anyway). Some people think that Linux developers like to break APIs with every release for fun, because code design is something that's hard to appreciate if you aren't involved and don't have some taste for it, but improving things is the one reason why the APIs are changed. Does that mean that it's not possible to have a good muK? No, but it means that "traditional" kernels like Linux aren't the monster that some people say.
There are other "myths". For example, that only microkernels can update parts of the kernel on the fly. It's true that this is not what you usually do in Linux – people usually update the whole kernel instead of updating a single driver – but there's no reason why a monolithic kernel couldn't do it. In Linux you can insert and remove modules, which means it's possible to remove a driver and insert an updated version. The "Linux culture" doesn't make it easy – due to the API changes, the development process, and some checks that by default prevent modules compiled for one kernel version from being inserted into another – but it could be done, and other monolithic kernels may be doing it already. But then there are some parts that can't reasonably be updated anywhere without breaking something, like for example an update of the TCP/IP stack – all the connections would need to be reset (unless you want to save and restore the state of the TCP/IP stack between updates, which would mean greatly increasing the complexity for an event that happens very rarely).
There's also the "CPUs are fast these days, performance is not a critical factor" myth. Imagine a process which takes X cycles to do something and another which takes Y=X+1. The faster a given CPU executes that process, the more cycles you're losing with Y. A fast CPU doesn't help to execute slow things faster; it could even make slow things even slower in some cases. The "let's not care that much about resource usage" attitude may work for userspace (GNOME, KDE, OpenOffice), where functionality is more important than wasting N months trying to figure out how to rewrite things more efficiently, but it's not a good thing when you're writing important things like the kernel, libc or another important library, because, unlike most of userspace, performance and good resource usage are THE feature for that kind of software.
There's also the "a microkernel never crashes your system" myth. A driver, be it in userspace or kernelspace, can lock your computer by just touching the wrong register. Playing with the PCI bus or your graphics card can bring your system down. A microkernel can protect you against a software bug, but there are hardware bugs that software can't fix in any reasonable way, except by working around them. This means that drivers are not just "simple processes": they're "special", in some way, just like other parts of the system.
Will microkernels step up some day (note: because I know there are still some people left who think that Mac OS X is a real microkernel, I recommend reading point 2 of this paper or some Apple documentation)? Maybe, but looking at how hardware is made today, it doesn't look like it will happen very soon – then again, who can predict what computers will be like in 2070? The main problem microkernels have today is the lack of functionality: a real, complete, general-purpose kernel takes many years and resources to write. Even if you write a competitive microkernel for PCs, it won't be successful because of the lack of support for hardware devices and other features. No matter if it's monolithic or micro, writing a kernel for general-purpose computers is an almost impossible task.
The fact is that as monolithic kernels improve, they're moving some parts of their functionality to userspace: udev, klibc and libusb are examples of this. Even if you don't look at it that way, the printing drivers you get with CUPS or the 2D X.org drivers are examples of device drivers in userspace. They are not performance-critical, so interfaces have been written to allow them to run in userspace. FUSE is also a good example. There are even efforts to build a userspace driver framework for Linux. For me that means only one thing: if running drivers as userspace processes gets to be so important that traditional kernels are not viable (and by important I mean: the real world can't live without it, not just some academics), it could be much easier to move parts of monolithic kernels to userspace progressively than to rewrite the whole thing from scratch.
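To give a concrete taste of the FUSE example just mentioned, here is a hedged sketch of a minimal read-only userspace filesystem against the libfuse 2.x API, loosely modelled on libfuse's classic "hello" example. The file name, its contents and the compile line are assumptions made for illustration.

/* hellofs.c - a tiny read-only FUSE filesystem exposing a single file.
 * Compile (assuming pkg-config knows about libfuse 2.x):
 *   gcc -Wall hellofs.c `pkg-config fuse --cflags --libs` -o hellofs
 * Mount:   ./hellofs /some/mountpoint
 * Unmount: fusermount -u /some/mountpoint */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

static const char *hello_str  = "Hello from a userspace filesystem!\n";
static const char *hello_path = "/hello";

/* Report file attributes: one directory ("/") and one read-only file. */
static int hello_getattr(const char *path, struct stat *stbuf)
{
    memset(stbuf, 0, sizeof(struct stat));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode  = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        stbuf->st_mode  = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size  = strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* List the root directory: ".", ".." and the single file. */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void) offset; (void) fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);   /* skip the leading '/' */
    return 0;
}

/* Only allow read-only opens of the one file we export. */
static int hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

/* Copy the requested slice of the file's contents into the caller's buffer. */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    size_t len;
    (void) fi;
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    len = strlen(hello_str);
    if ((size_t) offset >= len)
        return 0;
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_str + offset, size);
    return size;
}

static struct fuse_operations hello_oper = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .open    = hello_open,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* fuse_main() parses the mountpoint and options and runs the event loop. */
    return fuse_main(argc, argv, &hello_oper, NULL);
}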
Just my 2 cents.
–Diego Calleja
If you would like to see your thoughts or experiences with technology published, please consider writing an article for OSNews.
This is not meant to be offensive, but I’m tired of the disclaimer “the author is not a native speaker of English; please excuse grammatical mistakes.” This is what editors are for! Can’t authors find a native English speaker to proofread their work? Since this article is an OSNews “exclusive,” can’t someone at OSNews do a little light editing? That would be the procedure at a print magazine.
And yes, I will happily proofread for anyone who asks me to 😉
We don’t do that. First of all, our articles are often meant to be a community collection. Secondly, it’s often very hard to edit these articles for clarity without injecting any bias. As such, we made the editorial decision – long before Thom even – that editing would be minimal.
As such, we made the editorial decision – long before Thom even – that editing would be minimal.
Exactly. The article is NOT my own writing, and as such, I cannot edit it freely. All we do is remove the really obvious errors, and I personally have an obsession with punctuation. It helps that I study English at university, by the way.
In any case, please discuss the actual article. Posts concerning the English are off topic and will be set to -5. Just so you know.
How pathetic are you, that you have nothing better to do with your life than proofread online articles for spelling and grammar errors? I think many people would agree with me when I say: get a life.
My comment was respectful and politely worded. Sorry if you don’t agree with me.
Print magazines normally come out once a month; you have a whole month to proofread, edit and re-edit. This is the web – be glad people even give a damn and try to make it readable. LOL!
Why was the original post modded up?
jaykayess, have you heard of censorship?
What is to stop an editor totally changing the content of an article that was sent in?
And why are there so many people around here who are so anal they nit-pick at spelling and grammar?
The author does not use English as his native language, but you bitch and moan about this… Would you do it to the guy's face if you met him in the street? I think not.
You all sit in your mum's basement (note: mum, not MOM, as US people get it wrong all the time, along with colour and through, etc.), spitting vengeance from your keyboard at someone who has made valid points that you cannot see, because he forgot to cross the t and dot the i.
What a bunch of muppets
I never criticized the author or his work. In fact, I even offered to proofread (for free!) for anyone who publishes here.
And I’m not sitting in “my mom’s basement;” I’m sitting at my desk at work!
But it reminds me of debates like RISC vs. CISC or procedural vs. OO programming.
While I agree with the article, I'd like to emphasize that a well-designed microkernel is easier to maintain, and that developing such a kernel collaboratively is easier if good interfaces are provided.
I think it's a question of choice, since both microkernels and monolithic kernels have pros and cons.
It is a shame you even have to put a disclaimer in the post about the writer not being a native English speaker. If someone has to tear down an article based on the writer's command of English, then they really have nothing to offer in the way of a rebuttal. Perhaps you should just put the article there in the writer's native language. That would at least weed out those who were just there to deride someone else's lack of total command of the English language.
For those who get upset because someone can't write perfect English: I have news for you, you can't either, so get over it. Native English speakers are a minority in this world. At least the writer is making an attempt to communicate with you in your language. I will now get down off my soapbox, but I have been wanting to say this for some time now. And yes, I speak English as a native tongue (though I am sure the readers in the UK would beg to differ with that statement).
I agree. This was a good article, and as far as I saw there were only one or two grammatical mistakes. When you're writing technical articles, most intelligent people will see past the typos – the ignorant ones will get stuck on petty details. President Bush can't even pronounce most words properly; should we hang people who can't spell? And for that matter, do Americans actually speak English? It's been butchered so badly, I doubt it's proper to insult Britain "wit aor wustirn akzent".
Back to the article, I agree with just about everything written. I do (just for bias’s sake) prefer the microkernel design simply because I’ve had very pleasant experiences with those systems (BeOS, Amiga, QNX) and for their time and place they performed quite responsively (moreso than any flavor of Linux I’ve used to date) and while they could (and did) crash, it was most often the result of trying to do something awful to the system.
But as the author pointed out, this is mostly irrelevant – software is software. APIs get broken and whether the changes are made in smaller modules or not, change happens all the time – so the weakness of the microkernel is that significant changes weave their way through the whole system. And in cases like the three OS’s I mentioned above – the resources simply weren’t there (QNX may be the exception) to rewrite all the affected parts of the system (BeOS’s BONE network kit comes to mind).
- BeOS is considered a hybrid kernel, not a microkernel, but I agree that its responsiveness was good.
- QNX is a microkernel; I don't remember its responsiveness: it didn't have enough applications when I tried it.
- AmigaOS didn't have memory protection, if memory serves; that's a fundamental flaw which makes any performance/responsiveness measurement irrelevant IMHO.
That said, DragonFly has been somewhat inspired by AmigaOS, but I would be very surprised if applications ported to it turned out to be more responsive on DragonFly.
True, I’ve heard the BeOS-is-a-hybrid comment before – and that’s probably accurate. I would debate that it’s hard to specifically define what is the exact size and nature of an ‘authentic’ microkernel – but that’s not really constructive. Of the very few muKs out there, in my gut I feel Be still qualifies – as it was one of Be Inc.’s major marketing buzzwords (but I know marketing’s not worth much).
As for ‘responsiveness’, I’m not sure if you’re referring to raw speed or interface performance (responsive has many meanings in computer terms). I’ve not used (or heard of) DragonFly, I’ll give that one a googling – but my use of ‘responsive’ is mostly in terms of the interface.
ie: When you click or type, does something happen?
The application isn’t as liable for this as the underlying OS is, in the case of Be – you almost (almost is key) never have a system lockup caused by one thread/app gobbling up time. I’ve experienced it once in the case of ‘Cortex’ – some silly media controller – and it was probably because I did something radically dumb that may have caused a hardware issue. I’ve used QNX’s live desktop disc (seems to not be available anymore) but it was almost as responsive – much more so than XP or Lin/*nix. The scheduling nature of QNX is quite odd, but like a good muK it responded rapidly to user input (from the GUI/input perspective). I’m sure there is a significant performance penalty in swapping between kernel/user spaces, but in the world of Ghz machines this isn’t really life-threatening.
I didn’t know Amiga didn’t have memory protection, that’s sad. It was probably sacrificed for performance. I do believe the latest AROS does, I’ve never used it but it’s good to see a ‘dead’ OS walking…. maybe a ‘zombie’?
I didn’t know Amiga didn’t have memory protection, that’s sad. It was probably sacrificed for performance.
Ever heard of a memory-protected OS on a 68000 CPU? Thought so.
However, the reason it wasn't added later, when MMU-enabled 68k CPUs (68030 and onwards) became available, was that Commodore had already stepped down its development efforts considerably. And I think it would have needed quite a radical re-architecting to boot.
I didn’t know Amiga didn’t have memory protection, that’s sad. It was probably sacrificed for performance.
I’m not sure about memory protection, but a lot of things in the AmigaOS were sacrificed for deadlines. Originally, a different operating system was in the works — I definitely recall that it would implement resource tracking — but CAOS (Commodore Amiga Operating System) fell so far behind that MetaComCo was called in to port TripOS, an OS written in BCPL.
http://www.amigau.com/aig/caos.html
http://www.thule.no/haynie/caos.html
(Second article suggests there was some memory management.)
I think “u” looks more like the Greek symbol, so that’s what I use when indicating microseconds or whatever.
I have no problem with other people using crash-proof operating systems, but when I'm programming at home, I like having available such instructions as CLI and STI for enabling or disabling interrupts in user programs. I also like being able to access my parallel port directly. In professional environments, fancy operating systems with security and protection are appropriate, but I think hobbyist programmers at home might prefer full access to their computers. That's why I've written LoseThos, a multitasking operating system with everything running in kernel mode. For a typical computer game on a home computer, the whole machine is available.
I get tired of people suggesting one-size fits all and applying rules for professional computing at home.
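For what it's worth, even a "fancy" protected OS leaves a door open for this kind of tinkering: on Linux/x86 a root process can request raw I/O port access and poke the parallel port directly from userspace. A minimal, hedged sketch follows (the 0x378 LPT1 base address and the bit-walking loop are just illustrative assumptions).

/* portpoke.c - walk a bit across the data lines of a legacy parallel port
 * from userspace on Linux/x86. Requires root (ioperm needs CAP_SYS_RAWIO).
 * Compile with: gcc -O2 -Wall portpoke.c -o portpoke */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/io.h>          /* ioperm(), outb() - x86 Linux only */

#define LPT1_BASE 0x378      /* traditional LPT1 base address (assumption) */

int main(void)
{
    /* Ask the kernel for I/O permission on the three LPT1 registers. */
    if (ioperm(LPT1_BASE, 3, 1) != 0) {
        perror("ioperm (are you root?)");
        return EXIT_FAILURE;
    }

    /* Walk a single high bit across the 8 data lines. */
    for (int i = 0; i < 8; i++) {
        outb(1 << i, LPT1_BASE);
        usleep(100000);      /* 100 ms, slow enough to watch LEDs blink */
    }
    outb(0, LPT1_BASE);      /* clear the data lines again */

    ioperm(LPT1_BASE, 3, 0); /* drop the permission */
    return EXIT_SUCCESS;
}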
Why not just use UTF-8 and then a REAL mu (μ)? It's time to stop bastardising words and names (oh… and let us use the real letters of our names in our government documents!).
Repeat after me: Unicode is your friend.
I'm going to actually talk about the content of the article and not the grammar.
First off, the author doesn't really say anything in the article beyond a few thoughts, none of which are really developed with examples.
I think the big problem with heavily abstracted microkernels is that they're easy to create early on, but they don't evolve easily. There always seems to be some point down the line in development where you realize, oops, I didn't think about this. And then you have to fiddle with a whole chain of modules and whatnot to pass info from module A to module Q and back again. Making such wholesale changes is somewhat easier in a macrokernel since you don't have as many abstracted data-passing structures to mess with. Abstraction can sometimes = obfuscation. With microkernels, once you get it working your job STILL isn't done, because the thing will probably be dog slow and you'll need to then rearrange stuff so it's faster.
Obviously, working, good microkernels can and have been done. I just think macrokernels have a real development time advantage.
All that being said, I think Linus should think about spinning off some of the driver parts into userspace. Take USB, for example: if all the USB drivers were maintained outside the kernel, with several entry points made available to the USB subsystem, I think it would allow USB to move more quickly and save Linus the hassle of dealing with all those little changes. I bet there are complicating factors for this. Anyone know what they are?
Mostly, I agree; except for the ‘dog slow’ issue.
The stability of the overall system (to me) is more important than snagging every last possible compute cycle out of the processor. Considering that a system crash (related to the kernel) will result in a need to reboot, I’d rather sacrifice a clock or two instead…
And having mentioned rebooting, I do think this is one spot where microkernels have traditionally held the advantage (except for maybe DOS) in boot/reboot downtime. From the muKs I've mucked with, boot time is snappy as heck – probably because the kernel isn't as concerned with destroying itself as a macrokernel is. During the boot sequence, the structure of a large(r) kernel (I'm using Linux as an example here) relies on time-expensive checking of its loading resources and cycling one step at a time through all the initialization. muKs, on the other hand, generally seem to have a leaner and more cost-effective way of managing starting processes and services. Since the kernel itself (in a micro) doesn't have all of the dangerous extra modules, it can just keep doing what it does – and worry less about verifying its integrity.
But since I don’t code kernel stuff and I’m just a long-time programmer, this may not actually be the case; it’s just the ‘vibe’ I get from the PC…
The ability to migrate processes also reduces the need for hot-patching a running OS. OpenVZ and Xen can both migrate processes over a network with millisecond-level disruptions in process response time. Migrate the processes off the box, install a new kernel and reboot. To me that is safer than trying to suspend everything using a device, hot-patch and restart the device, and then resume use of it. I'm always afraid the microcode on the device will get messed up.
Any system trying to achieve 100% up-time should have the redundant resources needed to allow the migrations.
PS, my belief is that TLB thrashing dooms muK’s from the start,
> PS, my belief is that TLB thrashing dooms muK’s from the start,
I don’t know: L4 seemed to have interesting performance, but I could be wrong.
What kind of CPU is L4 running on, does it have TLBs? Microcontroller class CPUs don’t have them.
Add up the overhead: one of the most expensive things a CPU can do is mess with the TLB. Linux maps the kernel using minimal TLB entries. With a muK OS, every little driver and subsystem is going to need one or more TLB entries, since they each get their own address space. Now the OS is using a lot of TLB entries, and this pushes out the userspace ones. Now things start to thrash.
Linux is actually faster when compiled -Os than with all of the gcc ‘speed’ optimizations. The speed optimizations make the code larger. Larger code takes more TLB entries. Thrashing the TLB entries is so inefficient that it more than offsets the gains from the gcc speed optimizations. So it is better to make the kernel as small as possible and minimize TLB usage.
Related to this is why Linux maps itself into each process's address space. Check the top 1 GB of a process on a normal x86 box: the kernel is mapped up there.
> What kind of CPU is L4 running on, does it have TLBs? Microcontroller class CPUs don’t have them.
*Sigh* too lazy to google?
L4 is a research muK which runs on x86, Alpha, etc.
And I remember a research paper showing that it had very good performance.
Unless you can show otherwise – i.e., show that their research paper is no good – your theoretical reasons why a muK can't perform are just hot air.
I checked my “hot air” in Google for a couple of hits and found this in a presentation.
Microkernel: Con
- Performance: requires more context switches – each "system call" must switch to the kernel and then to another user-level process
- Context switches are expensive: state must be saved and restored, and the TLB is flushed
Google on “l4 tlb” and you’ll get 137,000 hits. Most of them are trying to deal with the TLB problem.
And the first paper that shows up when googling 'l4 tlb' says:
>>
Blazing fast IPC
[cut]
The author of the L4 paper, Jochen Liedtke, showed that microkernels do not inherently:
- Exhibit slow IPC
- Impose significant processor overhead
Instead, he showed that particular implementations are to blame: Mach was causing lots of cache misses because it's fat, making it look like microkernels have high overhead.
<<
So while TLB management is a problem, apparently L4 managed to avoid too much of a performance hit (a 10% hit in L4Linux).
hmm, hot-patching. Was there not an article not too long ago that talked about the Linux kernel having the ability to hand the computer off to a new version of itself?
hmm, hot-patching. Was there not an article not too long ago that talked about the Linux kernel having the ability to hand the computer off to a new version of itself?
Linux can do this but it is still an experimental feature.
so in theory, if things go bad, one can always have the kernel "reboot" itself without rebooting the hardware?
that is, as long as it's a software failure?
so in theory, if things go bad, one can always have the kernel “reboot” itself without rebooting the hardware?
that is, as long as it's a software failure?
The purpose of the feature is to allow security updates on high availability systems.
If things have gone bad enough to take down the kernel, the processes are likely corrupt too. Most memory corruption errors start scattering writes at random into memory. It may take thousands of corrupt writes before hitting something critical so by the time you detect it a lot of damage has been done.
would that not be an issue with microkernels also?
A higher degree of memory isolation is one of the few microkernel advantages. The higher degree of isolation increases the odds of quickly getting a GPF with a microkernel.
The more I think about it, the less I can see anything a microkernel can do that can't also be achieved in a monolithic kernel in a similar manner. For example, a monolithic kernel could implement subsystem-based write protection on kernel data structures if it wanted to. On each entry/exit of the subsystem, the appropriate regions of memory would be write-protected/unprotected. There just aren't any monolithic kernels doing this, probably because it is of little value.
so in theory, if things go bad, one can always have the kernel "reboot" itself without rebooting the hardware?
http://www-128.ibm.com/developerworks/linux/library/l-kexec.html
In fact, Linux uses exactly that (rebooting into a new kernel automatically when it oopses) to implement a memory dump facility: reboot into a "known safe" zone of memory, and save the memory to a file, i.e. the kernel dump.
hmm, so one still needs to unmount all filesystems and kill all processes?
hmm, so one still needs to unmount all filesystems and kill all processes?
kexec allows you to load a kernel from an already running kernel.
Whether you unmount your filesystems or not depends on how you use kexec. You can use kexec to test new kernels; in those cases you can write a script which unmounts the filesystems, etc. You can also skip unmounting anything and execute the new kernel directly. This last method is what's used when the system oopses: there's no way to "unmount" the filesystems – the kernel has crashed – so you just execute the new kernel, dump the memory and hope that everything works.
PS, my belief is that TLB thrashing dooms muK’s from the start,
This depends on the kernel design and some other factors.
For example, on 32-bit 80x86 (AFAIK) L4 uses segmentation to pack several processes into the same address space, so that it can switch between these processes without any TLB flushing.
There are other techniques – like using buffered IPC, where messages are put into the receiver’s message queue and no address space switch is done as part of sending the message (it’s done later, when the scheduler decides the task has had enough CPU time anyway).
What the author is saying is that a muK does not simplify kernel design, is less efficient than a monolithic kernel, etc.
The 1st point is simply not true. In monolithic kernels, the traditional way to support SMP is a big f–king lock. This means only one CPU at a time can be in kernel mode, and of course this is highly inefficient. The modern Linux kernel implements complex locking and synchronization mechanisms to ensure multiple kernel code paths can run in parallel on multiple CPUs. Making the kernel reentrant makes things an order of magnitude more complex, and it is still very hard to eliminate all deadlocks and race conditions.
On microkernels, SMP is supported naturally. Locking and reentrancy are non-issues, as most functions are scheduled in userspace. There is no need to preempt the kernel to meet real-time constraints, as a minimal amount of time is spent in kernel mode anyway.
The 2nd point is true to some extent. Context switching is the main cause of inefficiency in a muK. However: 1) L4 has shown that a well-designed microkernel can be 98% as efficient as a monolithic kernel on a single-CPU system. 2) The world is moving towards multicore/SMP. One of the cores can always be in kernel mode, thus eliminating those extra context switches compared with single-CPU operation.
IMO a muK/hybrid muK is the way to go for the multicore future, though I am not a hardcore muK zealot who believes everything, including memory management and the scheduler, should be pushed into userspace.
I don't know about your first point: to get good performance you need to use the parallelism correctly, that is to say, put related work on the same CPU, otherwise inter-CPU communication will kill you.
So scheduling could become an issue. Are these theoretical arguments for why muKs are better than monolithic kernels at SMP backed by figures?
1st point is simply not true. In monolithic kernels, the traditional way to support SMP is a big-f–king-lock
What makes you think that?
What you call the "traditional way to support SMP" in monolithic kernels is the method that most monolithic kernels have used to evolve from a UP, non-reentrant model to an SMP, reentrant one. Most of the monolithic kernels we're using today were UP one day, and then they evolved. The big kernel lock is just a method to make such kernels evolve. There's no reason why you can't write a monolithic kernel from scratch optimized for SMP, with no "big kernel lock".
On microkernel’s, SMP is supported naturally. Locking and reentrance are none issue as most functions are scheduled in userspace
Sorry, but this is just plain wrong. In order to make microkernels perform well, server processes need to be multithreaded, and then they get just as hairy as monolithic kernels. Things don't perform well on SMP by default, no matter what kernel model you use.
Take for example the filesystem process. A process tries to read a file, so it asks the filesystem process for data. Unless the filesystem process is multithreaded, other processes asking that same filesystem process for data will have to wait (i.e., a "big kernel lock").
Microkernels are not going to scale to multicore CPUs magically. We're facing that same problem with the rest of the userspace programs, and microkernels can avoid it magically? Duh. The reality here is that most microkernels are not heavily optimized for SMP, but Linux is…
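To make that concrete, here is a toy userspace sketch (the request structure and the fake "disk read" are invented for illustration): a single-threaded server serializes its clients exactly like a big lock would, while the threaded version overlaps them but then has to face the usual locking concerns for any shared state.

/* toy_fs_server.c - compile with: gcc -Wall -pthread toy_fs_server.c */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

struct request { int client; int block; };

/* Pretend this does a slow disk read on behalf of a client. */
static void handle_request(struct request *r)
{
    usleep(50000);                         /* simulate disk latency */
    printf("served client %d, block %d\n", r->client, r->block);
}

/* Single-threaded server: while one request sleeps on the "disk",
 * every other client waits - the userspace analogue of a big lock. */
static void serve_serially(struct request *reqs, int n)
{
    for (int i = 0; i < n; i++)
        handle_request(&reqs[i]);
}

/* Thread-per-request server: requests overlap, but now the server
 * needs locking around any shared state - the same fine-grained
 * locking problem a monolithic kernel has. */
static void *worker(void *arg) { handle_request(arg); return NULL; }

static void serve_threaded(struct request *reqs, int n)
{
    pthread_t tids[n];
    for (int i = 0; i < n; i++)
        pthread_create(&tids[i], NULL, worker, &reqs[i]);
    for (int i = 0; i < n; i++)
        pthread_join(tids[i], NULL);
}

int main(void)
{
    struct request reqs[4] = { {1, 10}, {2, 20}, {3, 30}, {4, 40} };
    serve_serially(reqs, 4);   /* clients are served one after another */
    serve_threaded(reqs, 4);   /* clients are served concurrently      */
    return 0;
}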
diegocg is correct. monolithic or muK design does not change the need for fine grained locking and multithreading to achieve reasonable SMP parallelism. The same complexity is exposed to both architectures and they both have to deal with it.
“”What you call “traditional way to support SMP in monolithic kernels is the method that most of monolithic kernels have used to evolve from a UP, nonreentrant model kernel to a SMP, reentrant one.””
Exactly, and I did say that modern monolithic kernels are fine-grained and don't suffer from this.
“”Most of the monolithic kernels we’re using today were UP one day, then they ere evolved. The big kernel lock is a just a method to make such kernels evolve. There’s no reason why you can’t write a monolithic kernel from scratch optimized for SMP, with no “big kernel lock”””
Sure, one can optimize a monolithic kernel from scratch. The problem is relative complexity. To make a monolithic kernel fine-grained, a lot of locks and checkpoints are necessary. Reentrancy and kernel preemption just make the complexity worse.
“”Sorry but this is just plain wrong. In order to make microkernels perform well, server processes need to be multithreaded and then they get just as hairy as monolhitic kernels. Things don’t perform well in SMP by default, no matter what kernel model you use.””
Scheduling in userspace is still easier, even if it is multithreaded/multitasked.
1) The kernel saves all state when doing context switches, and can provide mechanisms for atomicity.
2) The kernel can reorder all those tasks and make sure they execute in the right order.
3) Messages can be ordered as well.
4) Explicit shared memory or IPC is required for those userspace tasks to talk to each other, so accidental corruption of kernel data by reentrancy/preemption cannot happen.
Of course, signaling and synchronization are still needed in userspace. But atomicity and guaranteed execution/message ordering make writing those task servers far simpler in a muK environment.
In comparison, monolithic kernel threads and tasklets have to be fully aware of all the hazards caused by parallelism, making kernel development far more complex than in a muK.
“”Microkernels are not going to scale to multicore CPUs magically. We’re facing that same problem with the rest of userspace programs, and microkernels can avoid it magically? Duh. The reality here is that most of microkernels are not optimized heavily for SMP, but linux is…””
Sigh. Monolithic kernels are popular and well understood, so it is not surprising that a lot of human effort has been put into them to make SMP happen. This doesn't mean they are easier or better from a design point of view. OpenBSD is still suffering from the big lock, while FreeBSD and Linux have only become fine-grained in the past two years; that tells you everything.
Few people understand muKs and even fewer implement them correctly. The only microkernels that are heavily used these days are aimed at niche markets like embedded or multimedia (e.g. QNX, BeOS), and none of them is focused on SMP throughput.
So does the current situation imply that monolithic is better than muK at SMP by design? No.
it just tends to start wars where nobody really acknowledges the fact that no real microkernels are terribly popular, so it *almost* doesn’t matter.
In a monolithic kernel, there are some "mini-me"s (think Austin Powers) called kernel threads and interrupt handlers. They can be scheduled more or less independently on different cores, in parallel. All those mini-mes share the address space with the mother kernel. When the current execution is disrupted by an interrupt, a scheduling timeout, etc., a mini-me can leave shared data in an inconsistent state. So complex locking mechanisms are written to make sure this never happens. A typical kernel can have dozens of mini-mes running, and you get the picture of how quickly these many-body interactions get complicated.
“So scheduling could become an issue, are these theoretical arguments of why muK are better than monolithic kernel at SMP backed by figures?”
You are misunderstanding my point. Current Linux and FreeBSD do not suffer from big-lock inefficiencies; they have complex mechanisms to make kernel code run in parallel on all cores. The problem now is design complexity.
In the most recent article by Robert Cringely (http://www.pbs.org/cringely/pulpit/pulpit20060420.html) he states that Apple will be getting a new kernel. More precisely, he states that it would be a monolithic kernel due to speed increases in integer calculations.
What does this say about microkernels? I mean, if Apple finds this change to be in their best interest (Cringely states the microkernel is hindering the adoption of xServe), does this reflect upon all microkernels, or just Mach microkernel?
I’m not trying to start a flame war of any type, but I would think real world experience (in this case, Apple) would lend credence to the fact that microkernels still need work to compete with monolithic kernels.
This article says nothing about microkernels and all about Cringely not having a clue:
“Quite simply, a monolithic kernel like the one used in Linux or most of the other Open Source Unix clones is inherently two to three times faster for integer calculations than the Mach microkernel presently used in OS X 10.4.”
The kernel has nothing to do with the speed of integer calculations whatsoever. And if he even dared to do a bit of research about the Xserve, he’d have noticed that they’re not spelled “xServe”.
The kernel has nothing to do with the speed of integer calculations whatsoever. And if he even dared to do a bit of research about the Xserve, he’d have noticed that they’re not spelled “xServe”.
Granted, Cringely didn't offer any proof of the kernel's involvement with integer calculations, but I wouldn't dismiss it right away…
Think back to the open letter posted by Mac games developer/porter Aspyr. They discussed why OpenGL/game performance is lower in Mac OS X as opposed to Windows. One key reason for game performance issues is that the OS X kernel doesn't give any one single process full CPU time. In Windows, if a rogue process gets itself stuck in an endless loop (I'm guilty of making such apps myself, oops) you'll see 100% CPU utilization and the system is unresponsive. Then it becomes an argument of speed versus stability.
So with that in mind, it’s possible to say that the kernel is to blame for lower calculations in a given app.
When you read “integer calculations” you also have to consider that it might mean more than just simple add, sub, mul, div CPU instructions. If a math library was used, many complex equations might wind up in C-style functions. To use those, the process has to set up a stack frame, request memory, call the function, etc.
So unless you’re writing a really tight program in assembly and you’re addressing hardware directly, your compiled code and app will interface the system in some way.
does this reflect upon all microkernels, or just Mach microkernel?
1) Forget Cringely. We link to him and people like him (i.e. Dvorak) just for the fun of it (hey we as staff like a laugh too every now and then, we’re actually real people, you know), and because the things they come up with while imitating the Oracle of Delphi are usually provocative enough to discuss. But for real arguments, forget ‘m.
2) Apple’s kernel is no longer a muK. It’s a hybrid just like the NT kernel. Saying, ‘OSX is slow, so it must be that its muK is slow’ is nonsense because the kernel probably has little to do with the user experience of speed and responsiveness, and because OSX’ kernel is not a muK.
If you want to get a real idea of what a muK can do, download QNX. QNX's Neutrino kernel is a true muK, not a hybrid like OSX's or NT's. And what makes QNX even more interesting is that the windowing engine it uses, PhotonUI, is designed exactly like a muK. To understand more about QNX, read my article on it:
http://www.osnews.com/story.php?news_id=8911
It’s from before I became public piss post– err, managing editor at OSN.
And on a final note, it’s quite obvious I personally don’t agree with everything said in this article. However, since this article is a direct reply to one of mine, I’ll leave a ‘formal’ response to a future article, to be fair.
/boast mode
This whole muK vs. monolithic debate on OSNews does show why OSN is so popular: unbiased news, and even articles which are a direct contradiction to one of the main editor’s views are publicised. I don’t think many editors of other sites can say the same.
/end boast mode
“CPUs are fast these days, performance is not a critical factor” myth. Imagine a process which takes X cycles to do something and another which takes Y=X+1. The faster a given CPU executes that process, the more cycles you’re losing with Y.
Not correct. If process X takes 200 cycles and process Y takes 250 to do the same stuff, then process Y does not magically take MORE than 250 cycles on a faster CPU. The only way you waste more cycles is if you use a faster CPU to execute process Y more times (but this was not stated), and then you have still gained something. A faster CPU makes a process execute in less time (unless the execution depends on other factors like slow memory, and then lower latency can still help so that process Y does not lose as much time sitting idle waiting for the memory reads/writes to complete), which is the complete opposite of your argument.
A fast CPU doesn’t help to execute slow things faster, it could even help to make slow things even slower in some cases.
Not sure what you are smoking here (no offence) but this is not correct.
A microkernel can protect you against a software bug, but there’re hardware bugs that software can’t fix in any reasonable way, except by working around them.
All the more reason to use a microkernel on buggy hardware or when dealing with buggy drivers. If they run in userspace they cannot bring down the kernel. Microkernels are not immune, but they are a damn lot more robust against both buggy hardware and buggy drivers. This is one of the biggest advantages of muKs and one of the stronger arguments against monolithic kernels. Again, as always, the hybrids try to strike a balance, where the more performance-sensitive subsystems like storage I/O run in kernel mode and other drivers run in user space. On Vista, for instance, the sound drivers are going to be running in userspace, unlike in XP (which will probably also yield better latency performance for audio apps, due to not having the sound mixer etc. context-switching into kernel mode).
Not correct. If process X takes 200 cycles and process Y takes 250 to do the same stuff then process Y does not magically take MORE than 250 cycles on a faster CPU
I never said it does.
What I'm saying is that you buy a CPU which is, e.g., 200x faster (just a number to make the calculations easier); X takes one cycle and Y takes more than one cycle – 1 + 1/4, in fact.
Now a better CPU comes along, 400x faster. Surprise: X takes 2 cycles to run, Y takes… 2 + 1/4. Except that the "1/4" now means 100 cycles and keeps growing as the CPU gets faster.
And that's what I wanted to prove: saying "fast CPUs make microkernels more feasible" doesn't fix the problem. Sure, the microkernel will run faster. The problem is that the monolithic kernel will continue to be even faster.
On Vista for instance, the sound drivers are going to be running in userspace compared to XP
Not at all. Drivers will keep running in kernel space. It's the "sound core" (sound mixing etc., lots of crap they never should have put in the kernel) that they're moving to userspace – I don't understand why people keep spreading those false rumours about Vista's sound subsystem. They're doing what Linux does: keep the core driver in the kernel and the rest of the stuff in userspace. Windows had too much crap in the kernel; they're just fixing it.
I never said it does…
Now a better cpu comes, 400x faster. Surprise, X takes 2 cycles to run, Y takes… 2 + 1/4. Except that the “1/4” no means 100 cycles and keeps growing as the CPU gets faster.
No, you just said it again, unless you have your own very specific definition of what a cycle means in terms of CPUs.
Just to clarify: cycle => clock cycle or "tick", measured in Hz. A faster processor completes more cycles per time unit (second). A process usually (unless it depends on "outside" factors) takes a specified number of instructions to complete. Your process just doubled the amount of work required to complete on a twice-as-fast CPU. How is that? And what "magic" is it that automatically makes the same process blow up like that? Please explain your logic, because to most of us it sure does not make any sense.
During the 20th century CPUs were busy doing calculation and I/O. DMA became the norm to minimize the impact of I/O and caching was added to address the poor main memory performance. Memory protection became mainstream and multiprocessor designs were added. All this was designed with monolithic kernels in mind. Enough history 🙂
My point is that talking about the performance of microkernels is not totally fair. Now, I'm not convinced that microkernels are the right way to go, but the discussion should be about whether the microkernel design is good. Performance-wise, I think it has already proved itself on today's hardware to some extent.
Example: the TLB issue is always mentioned in these discussions, and it's a good example of how the CPU design matters. On x86 the way to deal with context switches has been to flush it. This gives an advantage to monolithic kernels. A tagged TLB (this is part of the virtual machine extensions to x86) solves this, given that monolithic kernels and microkernels have an equally sized working set (please tell me if you know).
Times are changing. Virtualization is one of these changes, and it has a lot in common with microkernels. Another is the amount of I/O. Today networking bandwidth has come very close to main memory bandwidth, and this creates problems for the monolithic design (also for others). Memory copying has become an overhead, while it is still the fastest way to move data across protection boundaries. There are many discussions about this on LKML: "If we could just move this page to another address space we could avoid copying and then we will become heroes"… "don't do it! It's slow". The Singularity project has gone back to the days of software-based protection to avoid the MMU altogether.
The Linux FUSE project is proof that microkernels have something good going for them – if not for performance, then at least for functionality.
To sum it up: a lot has happened, and we really should try to be open to new ideas and not just be monolithic in our thinking. These are exciting times…
Well,
The 20th century saw a lot of system architectures, and it wasn't until commodity PC production that cache-based systems came to dominate.
The Cray 2, for example, could move a 64-bit (8 byte) memory word every cycle on a machine with a 2.5ns clock, in 1985, with no caches – *per processor*, and it had four. That's 8 GB/second. You may have access to a 64 gbit/sec network, but I sure don't, and that's a 20-year-old machine.
Virtualization, especially at the hardware level, isn’t particularly new, IBM introduced it on the 360/370 architecture in 1972, and a lot of machines have had it over the years. What is new is that it has come to commodity hardware, and the commodity folk are reinventing the state of the art.
There were machines designed specifically to support microkernels. Mach, after all, comes out of the same research program as CM* and accent.
In the end, the current VM model won for no other reason than the sheer weight of popularity.
[edit: fixed mbit->gbit typo]
After reading these discussions for a few days I'm probably in the exokernel camp. Video works well in the exokernel model and so does networking. Network protocols are very similar to the DRM/Mesa model. TCP/IP could be implemented in user space with little downside that I can see. This would not be a TCP/IP server; instead, each process would run the TCP/IP stack directly. The network packets would be passed to the kernel for transmission/reception. This is the same as what happens with DRM/Mesa when it is direct rendering. You still need the kernel layer to control the DMA hardware and keep everyone separate.
I don't see how filesystems can be done with an exokernel. The problem is security. With video/networking, if you send down the wrong info you just mess up your own session; other users aren't affected. With a filesystem, write a wrong directory entry and everyone is hosed. The work needed to verify filesystem writes is probably equal to the work needed to implement them in the kernel in the first place. Of course, if everyone can be trusted you have an exokernel filesystem already available: just run GFS in each process.
Implementing Lustre locally might be a solution to an exokernel style file system. With Lustre the kernel would control the directory, locking and space allocation. User space would then be in direct control of the blocks owned by the file.
This would not be a TCP/IP server, instead each process would run the TCP/IP stack directly
A reliable tcp/ip stack does really want to live in a “separate process” (the kernel in the case of monolithic kernels)
http://www.ussg.iu.edu/hypermail/linux/kernel/0408.3/2462.html
I see what the problem is: if you run TCP/IP inside each process, it messes up things like epoll(). If the kernel can't tell who is listening for what, it can't tell which process to wake up when a packet arrives. And if you move all of the state info into the kernel so that it does know who to wake up, then you've moved most of TCP/IP there. epoll() is very critical to networking performance.
I was getting too focused on RDMA type transfers when I wrote the first response.
These links about Van Jacobson’s net channels are making me swing back towards exokernel style TCP/IP.
http://lwn.net/Articles/169961/
http://lkml.org/lkml/2006/4/20/169
The thing about Van Jacobson's net channels is that I don't understand how a stateful firewall could work with such a design…
Comments have suggested that a stateful firewall would have a channel coming to it and would then place its results into outgoing channels. You would run the classifier after it. Running a stateful firewall this way would greatly slow the system down.
A better scheme proposed by some is to implement the stateful firewall inside the network card. Of course if the stateful firewall is running in a different box it doesn’t impact the net channel at all.
The stateful firewall problem is similar to the fragmented packet problem. Fragmented packets require the classifier function to adapt dynamically. Dynamic classifiers can grow without bounds.
Some of the arguments seem to answer or contradict themselves. It’s possible that the author’s language is the reason for this; I don’t mean any offense, but I’m pointing this out because I’d like clarification. Maybe it’s my not-so-thorough background in compsci.
the fact that even simple bugs like a pointer reference can make your system go down forces developers to keep their code stable and bug-free. Even if it would be better to avoid reboots it’s useful as whip for developers…
No it isn’t, and I’d say that most of the history of Windows (e.g.) proves it.
I can’t really understand how using separate processes and IPC mechanisms improves the modularity and design of the code itself.
That's like saying, "I can't understand how separating components of a program into different procedures/classes/modules improves the modularity and design of the code itself." You're answering your own question. There is a penalty for modularity in programming, too; hence the popularity of inlining code.
Sure, a buggy I/O scheduler can still bring the system down with the previous example, but that doesn’t means that the design and modularity of the code is bad.
Design and modularity of the code, or design and modularity of the kernel? If a buggy I/O scheduler can bring the system down, that’s bad code design first in the driver (obviously) but also in the OS (indirectly).
And nothing stops muKs from having a good design, but kernels like linux and Solaris have had a LOT of time, real-world experience and resources to improve their designs to a higher standards than some microkernels
That’s silly. I haven’t heard anyone saying that a mukernel will automagically perform better, and have better uptime, than a monolithic kernel, simply by virtue of its betterness. On the other hand, OS-9, QNX, and other mukernel-based OS’s have had a lot of time, real-world experience and resources to improve their designs to higher standards than any monolithic kernels.
In Linux you can insert or remove modules, this means it’s possible to remove a driver, and insert an updated version. The “linux culture” doesn’t makes it easy…
Is it the Linux culture that doesn’t make it easy, or the design of the Linux kernel that doesn’t make it easy?
Imagine a process which takes X cycles to do something and another which takes Y=X+1. The faster a given CPU executes that process, the more cycles you’re losing with Y. A fast CPU doesn’t help to execute slow things faster, it could even help to make slow things even slower in some cases.
An absurd argument. More wasted cycles, so what? The process is still faster. How many cycles are being wasted right now by extra time to render HTML instead of plain text? Sure, the HTML requires overhead, but it’s much easier to read and understand than plain text.
Will microkernels step up some day?
Again, I refer you to OS-9 and QNX.
http://www.microware.com/
http://www.qnx.com/
These aren’t new; they’re older than the 20 years you’ve been talking about, and they’re used all the time in real-world, realtime applications.
The fact is that as monolithic kernels improve, they’re moving some parts of functionality to userspace: udev, klibc or libusb are examples of it.
Thus proving that the basic ideas of mukernel design are correct.
It is due to your lack of background in software engineering (no offense).
Software is designed with known issues and shipped with known bugs. It's a cost/profit sort of issue, although often it's the time available to developers that serves as the cost/profit term, not real money (Linux developers don't have unlimited resources either).
So, to say that monolithic kernels enforce stronger coding is true. With a really well-designed microkernel you could have drivers restart upon failure. And so there would be known times when the driver can't fail, and therefore known times when it's OK to fail. Effort would be put into making sure nothing you do can damage the hardware, with little concern for whether it crashes every few hours.
With the monolithic kernel you can’t have it crash, ever, else you take the computer down. People don’t like drivers that crash their computer (see: Windows ME).
He's right to say that this can be an advantage. No one really wants low-quality drivers: they're inconvenient even if they cause no real damage.
—
It’s the linux license that makes modules difficult. The license and culture are highly interlinked (see fsf.org).
—
HTML rendering is actually very slow. Especially in Mozilla browsers… Watch dillo and firefox do it and you’ll see the difference in cost. Not that this has anything to do with the discussion.
—
Thus proving that the basic ideas of mukernel design are correct.
I don’t believe anyone ever said microkernels were entirely without merit. This is a strawman argument: To say “you said it’s completely wrong” when what was really said was “it’s not the best way.”
Also, in CS, to say something is correct has little to do with any topic of performance: Bubble sort (n^2 time) is correct, but if you sorted New York’s phone directory with it you’d have wasted a lot of time!
Of course, that's even off topic. The concern of microkernels versus monolithic kernels isn't about the theoretical asymptotic runtime. It's about the constant that CS theorists ignore (note: these are different people than those who write microkernels).
Think of it like this, and this is a bad analogy I’m sure. Two ways to talk to someone:
1.) In person, face to face. You can make gestures, talk, touch them, etc. Communication is fairly easy. But they can stab you…
2.) Over the phone. You can only talk, and inflect, no gestures or touching. But they can’t stab you… And talking is a bit slower, although hardly noticeable now, you might notice if you were a bit quicker at talking.
No it isn’t, and I’d say that most of the history of Windows (e.g.) proves it.
What the history of Windows shows me is an OS that used to be buggy and unstable and is progressively getting better (you can easily get months of uptime with XP) despite being a monolithic kernel.
“I can’t understand how separating components of a program into different procedures/classes/modules improves the modularity and design of the code itself. You’re answering your own question.
No, I'm not – maybe re-read the article? A microkernel can have badly designed procedures/classes/modules. And if you think that you can write a kernel and get the procedures/classes/modules right the first (or the second, or the third) time, be it a micro or a monolithic kernel, then you haven't written a kernel. That's the whole point of the article: just because you modularized your kernel doesn't mean you did it right, and just because it's a monolithic kernel doesn't mean you did it wrong. Software design affects every piece of software; the idea of a microkernel being "more modular" just because you intended to modularize it is laughable…
An absurd argument. More wasted cycles, so what? The process is still faster.
Yes, the process is faster. But then the monolithic kernel is even faster, and that will show up in benchmarks: exactly one of the reasons why microkernels aren't successful today – they're slower compared to a monolithic kernel.
Is it the Linux culture that doesn’t make it easy, or the design of the Linux kernel that doesn’t make it easy?
Culture. It's not easy for a microkernel nor for a monolithic kernel. Try designing a filesystem which updates itself while another process is requesting data from disk; good luck saving the state of file descriptors, or the TCP/IP stack, etc… The same applies to every piece of software in the "kernel". I don't see why it should be less of a hell for a microkernel.
These aren’t new; they’re older than the 20 years you’ve been talking about, and they’re used all the time in real-world, realtime applications.
We're talking about general-purpose, real operating systems here. Yes, I'm sure that QNX and OS-9 are used every day by millions of people…
I think a hybrid kernel is best. I look at Windows Vista and I see that it has the best of both worlds: speed, plus drivers that now run in user mode. This means that not all of a driver's code is in the kernel, just the important code that needs to be there for speed reasons.
This means that when you have drivers in user mode, you can upgrade them at will without rebooting the OS. Each driver is separated from the kernel and runs in its own protected area of memory. This also makes it easier for drivers to be tested and created.
Dave Cutler is awesome. I have to admit I respect him a lot more than Linus.
The real reason microkernels are not used is the market (and drivers): there is a Microsoft/Linux duopoly on OSes, and Linux doesn't have enough hardware drivers as it is. If Linus had made Linux a microkernel when he created it, this thread would be different now. Today there is no room for a new OS, microkernel or not.
BTW: microkernels have a lot of advantages: things in userspace are easier to debug, and you avoid a lot of code duplication (e.g. the Linux subsystem for kernel threads is a duplication of work). And if you want to make them efficient you have to think twice about the interfaces among modules, so the resulting code is surely better written.