Hobby OS-deving 3: Designing a Kernel

Neolander 2011-02-20 OS News 34 Comments

Now that you have an idea of where your OS project is heading as a whole, it’s time to go into specifics. The first component of your OS which you’ll have to design, if you’re building it from the ground up, is its kernel, so this article aims at being a quick guide to kernel design, describing the major areas which you’ll have to think about and guiding you to places where you can find more information on the subject.

What is a kernel and what does it do?

I think it’s best to start with a definition. The kernel is the central part of an operating system, whose role is to distribute hardware resources across other software in a controlled way. There are several reasons why such centralized hardware resource management is desirable, including reliability (the less power an application has, the less damage it can cause if it runs amok), security (for the very same reasons, but this time the app goes berserk intentionally), and several low-level system services which require a system-wide hardware management policy (pre-emptive multitasking, power management, memory allocation…).

Beyond these generic considerations, the central goal of most modern kernels is in practice to manage the process and thread abstractions. A process is an isolated software entity which can hold access to a limited amount of hardware resources, generally in an exclusive fashion, in order to avoid concurrency disasters. A thread is a task which may run concurrently with other tasks. Both concepts are independent from each other, although it is common for each process to have at least one dedicated thread in modern multitasking OSs.

Hardware resources

So far, I’ve not given much depth to the “hardware resource” concept. When you read this expression, the first things which come to mind are probably pieces of real hardware which are actually fully independent from each other: mice, keyboards, storage devices, etc.

However, as you know, these peripherals are not directly connected to the CPU. In a simplified picture of the hardware, they are all accessed via the same bus, through a single CPU port. So if you want to make sure that each process only has access to some peripherals, the kernel must be the one in control of the bus. Or you may also decide that the bus is the hardware resource which processes must request access to. Depending on how fine-grained your hardware resource model is, your position on the process isolation vs kernel complexity scale will vary.

To make things even more complex, modern OSs also manage some very useful hardware resources which do not actually exist from a pure hardware point of view. Consider, as an example, memory allocation. From a hardware point of view, there is only one RAM. You may have several RAM modules in your computer, but your CPU still sees them as one single, contiguous chunk of RAM. Yet you regularly want to allocate some part of it to one process, and another part of it to another process.

For this to work, the kernel has to take its finest cleaver and virtually slice the contiguous RAM into smaller chunks which can safely be allocated separately to various processes, based on each one’s needs. There also has to be some mechanism for preventing different processes from peeking into each other’s memory, which can be implemented in various ways but most frequently implies use of special hardware bundled in the CPU, the Memory Management Unit (MMU). This hardware allows the kernel to only give each process access to its own limited region of memory, and to quickly switch between the memory access permissions of various processes while the kernel is switching from one process to another.
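To make the slicing idea more concrete, here is a minimal sketch in Python of a toy memory manager. Everything in it (the `FrameAllocator` class, its method names, the 4 KiB frame size) is made up for illustration. A real kernel works with page tables and virtual addresses, and the access check is done by the MMU hardware, not by a function call.

```python
class FrameAllocator:
    """Toy physical-memory manager: RAM is cut into fixed-size frames."""

    def __init__(self, ram_size, frame_size=4096):
        self.frame_size = frame_size
        self.free_frames = list(range(ram_size // frame_size))
        self.owner = {}  # frame number -> id of the process owning it

    def allocate(self, pid, n_frames):
        """Give n_frames frames to process pid, or fail if RAM is exhausted."""
        if n_frames > len(self.free_frames):
            raise MemoryError("not enough free frames")
        frames = [self.free_frames.pop(0) for _ in range(n_frames)]
        for frame in frames:
            self.owner[frame] = pid
        return frames

    def check_access(self, pid, address):
        """Mimic the MMU's check: may process pid touch this address?"""
        return self.owner.get(address // self.frame_size) == pid

    def release(self, pid):
        """Return every frame of a terminated process to the free list."""
        freed = [f for f, p in self.owner.items() if p == pid]
        for frame in freed:
            del self.owner[frame]
        self.free_frames.extend(freed)
        return len(freed)


ram = FrameAllocator(ram_size=64 * 1024)  # 16 frames of 4 KiB each
a = ram.allocate(pid=1, n_frames=2)       # process 1 gets frames 0 and 1
b = ram.allocate(pid=2, n_frames=1)       # process 2 gets frame 2
```

With this setup, `ram.check_access(1, 5000)` is true because address 5000 falls in frame 1, while `ram.check_access(2, 0)` is false: process 2 may not peek into process 1’s memory.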

Another typical example of abstract hardware resource is CPU time. I assume you have all noticed that desktop operating systems did not wait for multicore chips to appear before letting us run several applications at once. They all made sure that processes would share CPU time in some way, having the CPU frequently switching from one process to another, so that as a whole it looks like simultaneous execution under normal usage conditions.

Of course, this doesn’t work by calling the CPU and telling it “Hi, fella, can you please run process A with priority 15 and process B with priority 8?”. CPUs are fairly stupid, they just fetch an instruction of a binary, execute it, and then fetch the next one, unless some interrupt distracts them from their task. So in modern interactive operating systems, it’s the kernel which will have to make sure that an interrupt occurs regularly, and that each time this interrupt occurs, a switch to another process occurs. This whole process is called pre-emptive multitasking.
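The switching logic itself can be sketched in a few lines of Python. This is only a simulation under simplifying assumptions: the `round_robin` function and its parameters are hypothetical, every task has the same priority, and the "timer interrupt" is just the end of a fixed time slice (the quantum). A real kernel would save and restore CPU state in an interrupt handler instead.

```python
from collections import deque


def round_robin(tasks, quantum):
    """Simulate timer-driven process switching.

    tasks: dict mapping a task name to the CPU ticks it still needs.
    quantum: ticks a task may run before the simulated timer interrupt.
    Returns the list of (name, ticks_run) slices in execution order.
    """
    ready = deque(tasks.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # The timer fired before the task finished: it is preempted
            # and goes back to the end of the ready queue.
            ready.append((name, remaining))
    return timeline


timeline = round_robin({"A": 5, "B": 2, "C": 3}, quantum=2)
# timeline == [("A", 2), ("B", 2), ("C", 2), ("A", 2), ("C", 1), ("A", 1)]
```

No task ever holds the CPU for more than two ticks in a row, which is what makes the three tasks look like they run simultaneously.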

Finally, it is common not to let processes access storage devices directly, but rather to give them access to some places in the file system. Of course, like allocated RAM, the file system is a purely virtual construct, which has no physical basis in HDDs/SSDs, and must be fully managed by the OS at some point.

In short, you’ll have to define which hardware resources your kernel manages and gives processes access to. It is generally not a matter of just giving processes access to hardware x or not: there is often a certain amount of management work to be done in the kernel and compromises to be considered, and sometimes hardware resources must simply be created by the kernel out of nowhere, as is the case for memory allocation, pre-emptive multitasking, and filesystem operation.

Inter-process communication and thread synchronization

Generally speaking, the more isolated processes are from each other, the better. As said earlier, malware can’t do much in a tightly sandboxed environment, and reliability is greatly improved too. On the other hand, there are several occasions where it is convenient for processes to exchange information with each other.

A typical use case for this is a client-server architecture: somewhere in the depths of the system, there’s a “server” process sleeping, waiting for orders. “Client” processes can wake it up and give it some work to do in a controlled way. At some point, the “server” process is done and returns the result to the “client” process. This way of doing things is especially common in the UNIX world. Another use case for inter-process communication is apps which are themselves made of several interacting processes.

There are several ways through which processes may communicate with each other. Here are a few:

Signals: The dumbest form of inter-process communication, akin to an interrupt. Process A “rings a bell” in process B. Said bell, called a signal, has a number associated with it, but nothing more. Process B may be waiting for this signal to arrive at the time, or have defined a function attached to it which is called in a new thread by the kernel when the process receives it.
Pipes and other data streams: Processes also frequently want to exchange data of various types. Most modern OSs provide a facility for doing this, although several only allow processes to exchange data on a byte-by-byte basis, for legacy reasons.
Remote/Distant procedure calls: Once we are able to both send data from one process to another and to send signals to other processes so that one of their methods gets called, it’s highly tempting to combine both and allow one process to call methods from another process (in a controlled way, obviously). This approach allows one to use processes like shared libraries, with the added advantage that contrary to shared libraries, processes may hold access to resources which the caller doesn’t have access to, giving the caller access to these resources in a controlled way.
Shared memory: Although in most cases processes are better isolated from each other, it may sometimes be practical for two processes to share a chunk of RAM and just do whatever they want in it without the kernel getting in the way. This approach is commonly used under the hood by kernels to speed up data transfers and to avoid loading shared libraries several times when several processes want to use them, but some kernels also make this functionality publicly available to processes which may have other uses for it.
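As an illustration of the pipe mechanism in a client-server setting, here is a small Python sketch, seen from the user-space side of an existing OS. The "server" is a throwaway child process whose whole job is to upper-case whatever it receives (the `server_code` string is made up for this example). The client sends it bytes through one pipe and reads the answer from another. Both pipes are created by the host OS kernel on behalf of `subprocess.run`.

```python
import subprocess
import sys

# Code run by the "server" process: read all bytes from its standard
# input (a pipe), transform them, write the result to its standard output
# (another pipe).
server_code = (
    "import sys; "
    "data = sys.stdin.buffer.read(); "
    "sys.stdout.buffer.write(data.upper())"
)

# The "client" side: start the server, push a request down the pipe,
# and collect the reply once the server is done.
result = subprocess.run(
    [sys.executable, "-c", server_code],
    input=b"hello from the client",
    stdout=subprocess.PIPE,
    check=True,
)
reply = result.stdout
```

Notice that the two processes only ever see a stream of bytes. Anything more structured, such as a remote procedure call, has to be built on top of that by agreeing on a message format.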

Another issue, related to inter-process communication, is synchronization, that is situations where threads must act in a coordinated manner.

To get started with this problem, notice that in a multitasking environment, there are several occasions where you want to make sure that only a limited number of threads (generally one) may access a given resource at a time. Imagine, as an example, the following scenario: two word processors are open simultaneously, with different files inside of them. The user then abruptly decides to print everything, and quickly clicks the “print” button of both windows.

Without a mechanism in place to avoid this, here’s what would happen: both word processors start to feed data to the printer, which gets confused and prints garbled output, basically a mixture of both documents. Not a pretty sight. To avoid this, we must put somewhere in the printer driver a mechanism which ensures that only one thread may be printing a document at a time. Or, if we have two printers available and if which one is used does not matter, we can have a mechanism which ensures that at most two threads may be printing a document at a time.

The usual mechanism for this is called a semaphore, and typically works as follows: on the inside, the semaphore has a counter which represents how many more threads may access the resource at the moment. Each time a thread tries to access a resource protected by a semaphore, this counter is checked. If its value is nonzero, it is decreased by one and the thread is permitted to access the resource. If its value is zero, the thread may not access the resource, and usually has to wait until another thread is done with the resource and increases the counter again. Notice that to be perfectly bullet-proof, the mechanism in use must ensure that the value of the semaphore is checked and changed atomically, that is, in a single step which two CPU cores can’t perform at the same time, which requires a bit of help from the hardware. It’s not as simple as just checking and changing an integer value. But how exactly this is done is not our business at this design stage.
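The counter logic can be sketched as follows in Python, using the two-printer scenario. The `CountingSemaphore` class and its method names are invented for this example, and the `threading.Lock` is only a stand-in for the atomic check-and-change which real hardware provides. This variant refuses access instead of putting the caller to sleep, which keeps the sketch short.

```python
import threading


class CountingSemaphore:
    """Toy semaphore: a counter of how many more accesses are allowed."""

    def __init__(self, slots):
        self._count = slots
        # Stand-in for the hardware's atomic operation: the check and
        # the decrement below can never be interleaved between threads.
        self._lock = threading.Lock()

    def try_acquire(self):
        """Take one slot if any is left; report whether access is granted."""
        with self._lock:
            if self._count == 0:
                return False
            self._count -= 1
            return True

    def release(self):
        """Give a slot back once the thread is done with the resource."""
        with self._lock:
            self._count += 1


printers = CountingSemaphore(slots=2)                 # two printers available
granted = [printers.try_acquire() for _ in range(3)]  # three print requests
printers.release()                                    # one print job finishes
granted_after_release = printers.try_acquire()        # the refused job retries
```

Here the first two requests are granted and the third is refused until a slot is released. Python’s own `threading.Semaphore` offers the blocking behaviour, where the third thread would simply sleep until a printer is free.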

Apart from semaphores, another, less frequently used but still well-known synchronization mechanism, is the barrier. It allows N threads to wait until each one has finished its respective work before moving on. This is particularly useful in situations where a task is parallelized into several chunks that may not necessarily take the same time to be completed (think, for example, of rendering a 3D picture by slicing it into a number of parts and having each part be computed by a separate thread).
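Here is what the rendering scenario could look like with Python’s standard `threading.Barrier`. The worker function, the number of workers, and the fake "rendering" are assumptions made for the sake of the example. The point is only that no thread gets past `barrier.wait()` before all of them have reached it.

```python
import threading

N_WORKERS = 4
barrier = threading.Barrier(N_WORKERS)
rendered = [None] * N_WORKERS         # one slot per slice of the picture
seen_at_barrier = [None] * N_WORKERS  # what each worker observes afterwards


def render_slice(i):
    rendered[i] = f"slice-{i}"  # stand-in for the real rendering work
    barrier.wait()              # block until all N workers have arrived
    # Past the barrier, every slice is guaranteed to be complete.
    seen_at_barrier[i] = all(s is not None for s in rendered)


threads = [
    threading.Thread(target=render_slice, args=(i,)) for i in range(N_WORKERS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever order the threads are scheduled in, every worker sees a fully rendered picture once it has passed the barrier.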

So in short, having defined your process model, you’ll have to define how processes will communicate with each other, and how threads’ actions will be synchronized.

Major concerns and compromises

You may have noticed that I’ve done my best to stick to generic concepts and put some care in showing that there are several ways to do a given thing. That’s not just for fun. There are several compromises to play with when designing a kernel, and depending on what your OS project’s goals are, you’ll probably have to consider them in different ways.

Here’s an overview of some major concerns which you should have in mind when designing a kernel (though the importance of each varies depending on what your OS project is):

Performance: Both in terms of hardware resource use and in terms of performance perceived by the user. These are not totally the same. As an example, in a desktop operating system, prioritizing software which the user is directly interacting with over background services improves perceived performance without requiring much optimization work. In a real-time operating system, hardware resource usage does not matter as much as meeting deadlines. And so on…
Maintainability: Kernels are pretty hard to code, therefore you generally want them to last a fairly long time. For this to happen, it must be easy to grab their code and tinker with it as soon as a flaw is found or a feature is to be added. The codebase must thus be kept as short as possible, well-commented, well-documented, well-organized, and leave room for tweaking things without breaking the API.
Portability: In our world of quickly-evolving hardware, operating system kernels should be easily portable to a new architecture. This is especially true in the realm of embedded devices, where the hardware is much more complex and ever-changing than it is on the desktop.
Scalability: The other side of the portability coin is that your kernel should adapt itself to future hardware evolutions in a given architecture. This notably implies optimizing for multi-core chips: you should strive to restrict the parts of the kernel which can only be accessed by a limited number of threads to a minimal amount of code, and aggressively optimize your kernel for multi-threaded use.
Reliability: Processes should not crash. But when they do crash, the impact of their crash should be minimized, and recoverability should be maximized. This is where maximizing process isolation, reducing the amount of code in loosely-isolated processes, and investigating means of backing up process data and restarting even the most critical services without rebooting the OS really shine.
Security: On platforms which allow untrusted third-party software to run, there should be some protection against malware. You must understand right away that things like open-source, antiviruses, firewalls, and having software validated by a bunch of testers, are simply neither enough nor very efficient. These should only be tools for paranoids and fallback methods when system security has failed, and system security should fail as infrequently as possible. Maximal isolation of processes is a way to reach that result, but you must also minimize the probability that system components can be exploited by low-privilege code.
Modularity: You’ll generally want to make kernel components as independent from each other as possible. Aside from improving maintainability, and even reliability if you reach a level of modularity where you can restart failing kernel components on the fly without having the live system take a big hit, it also permits you to make some kernel features optional, a very nice feature, especially when applied to hardware drivers in kernels which include them.
Developer goodies: In the dark ages of DOS, it was considered okay to ask developers to literally code hardware drivers in their software, as the operating system would do nearly nothing for them. This is not the case anymore. For everything which you claim to support, you must provide nice and simple abstractions which hide the underlying complexity of hardware behind a friendly, universal interface.
Cool factor: Who’ll use a new kernel if it’s just the same as others, but in a much superior way? Let’s introduce power-efficient scheduling, rickrolling panic screens, colored and formatted log output, and other fun and unique stuff!

Now, let’s see how they end up in conflict with each other…

The quest for performance conflicts with everything but scalability when taken too far (writing everything in assembly, not using function calls, putting everything in the kernel for better speed, keeping the level of abstraction minimal…)
Maintainability conflicts with scalability, along with anything else that makes the codebase more complex, especially if the extra complexity can’t be put in separate modules.
Portability is in conflict with everything that requires using or giving access to architecture-specific features, particularly when arch-specific code ends up being spread all over the place and not tightly packed in a known corner (as in some forms of performance optimizations).
Scalability is in conflict with any feature or construct which can’t be used on 65536 CPU cores at the same time. Aside from the obvious compromise with maintainability and reliability which are better without hard-to-code and hard-to-debug threads spread all over the place, there’s also a balance with some developer goodies (an obvious example being the libc and its hundreds of blocking system calls).
Reliability is the enemy of anything which adds complexity, as more code statistically means more bugs, especially when said code is hard to debug. The conflict with performance is particularly big, as many performance optimizations require giving code more access to hardware than it actually needs. It is also the sole design criterion in this list to have the awesome property of conflicting with itself, as some system features which improve reliability also add complexity.
Security is a cousin of reliability as far as code complexity is concerned, since bugs can be exploitable. It also doesn’t like low-level code where every single action is not checked (pointer arithmetic, C-style arrays…), which is more prone to exploitable failures than the rest.
Modularity doesn’t like chunks of code which must be placed close together in RAM. This means a serious conflict with performance, since code locality allows optimal use of CPU caches. The relationship between modularity and maintainability is ambiguous: separating system components from each other initially helps maintainability a lot, but extreme forms of modularity like putting the scheduler (part of the kernel which manages multitasking) in a process can make the code quite confusing.
We’ve previously seen that developer goodies and other cool stuff conflict with a large part of the rest for a number of reasons. Notice also an extra side of the feature vs maintainability conflict: it’s easy to add features, but hard to remove them, and you don’t know in advance how useful they will be. If you’re not careful, this results in the phenomenon of feature bloat where the number of useless features grows exponentially with time. A way to avoid this is to keep the feature set minimal in the first release, then examine user feedback to see what is actually lacking. But beware of the “second-system effect”, where you just put everything you’re asked for in the second release, resulting in even worse feature bloat than if you had put a more extensive feature set to start with.

Some examples of kernel design

There are many operating system kernels in existence, though not all meet the same level of success. Here are a few stereotypical designs which tend to be quite frequently encountered (this list is by no means exhaustive):

Monolithic kernels

The way all OS kernels were written long ago, for performance reasons, and still the dominant kernel design as of today. The monolithic kernel model remains quite attractive due to the extreme simplicity of its design. Basically, the kernel is a single big process running with maximum privileges and tending to include everything but the kitchen sink. As an example, it is common for desktop monolithic kernels to include facilities for rendering GPU-accelerated graphics and managing every single filesystem in existence.

Monolithic kernels shine especially in areas where high performance is needed, as everything is part of the same process. They are also easier to design, since the hardware resource model can be made simpler (only the kernel has direct access to hardware, user-space processes only have access to kernel-crafted abstractions), and since user-space is not a major concern until late in the development process. On the other hand, this way of doing things increases the temptation to use bad coding practices, resulting in unmaintainable, non-portable, non-modular code. Due to the large codebase and the full access to hardware, bugs in a monolithic kernel are also more frequent and have a larger impact than in more isolated kernel designs.

Examples of monolithic kernels include Linux and its Android fork, most BSDs’ kernels, Windows NT and XNU (Yes, I know, the two latter call themselves hybrid kernels, but that’s mostly marketing. If you put most services in the same address space, with full access to the hardware, and without any form of isolation between each other, the result is still a monolithic kernel, with the advantages and drawbacks of monolithic kernels).

Micro-kernels

This is the exact opposite of a monolithic kernel in terms of isolation. The part of the kernel which has full access to the hardware is kept minimal (a few thousand lines of executable code in the case of MINIX 3, to be compared with the millions of lines of code of monolithic kernels like Linux or Windows NT), and most kernel services are moved into separate processes whose access to hardware is fine-tuned for their specific purpose.

Microkernels are highly modular by their very nature, and the isolated design favors good coding practices. Process isolation and fine-tuned access to hardware resources also greatly help reliability and security. On the other hand, microkernels are harder to write as much as they are easier to maintain, and the need to constantly switch from one process to another makes the most straightforward implementation perform quite poorly: it takes more optimization work to have a microkernel reach high performance, especially on the IPC side (as IPC becomes a critical mechanism).

Examples of commercial-grade microkernels include QNX, µ-velOSity and PikeOS. On the research side, one can mention MINIX 3, GNU Hurd, the L4 family, and the EROS family (KeyKOS, EROS, Coyotos, CapROS).

VM-based kernels

A fairly recent approach, which at the time of writing has not fully gotten out of research labs and proof-of-concept demos. Maybe you’ll be the one implementing it successfully. The idea here is that since most bugs and exploits in software come from textbook mistakes with native code (buffer overflows, dangling pointers, memory leaks…), native code is evil and should be phased out. The challenge is thus to code a whole operating system, including its kernel, in a managed language like C# or Java.

Benefits of this approach obviously include a very high cool factor and increased reliability and security. In a distant future, it could also potentially reach better performance than microkernels while providing similar isolation, by isolating processes through a purely software mechanism (since all pointers and accesses to the hardware are checked by the virtual machine, no process may access resources which it’s not allowed to access). On the other hand, nothing is free in the world of kernel development, and VM-based kernels have several major drawbacks to compensate for these high promises.

The kernel must include a full-featured interpreter for the relevant language, which means that the codebase will be huge and hard to design, and that very nasty VM bugs are to be expected during implementation.
Making a VM fast enough that it is suitable for running kernel-level code full of hardware access and pointer manipulation is another big programming challenge.
A VM running on top of bare hardware will be harder to write, and thus more buggy and exploitable, than a VM running in the user space of an operating system. At the same time, exploits will have even worse consequences. Currently, the Java Virtual Machine is one of the biggest sources of vulnerabilities in existence on desktop PCs, so clearly something must change in the way we write VMs before they are suitable for inclusion in operating system kernels.

Examples of active VM-based kernel projects include Singularity, JNode, PhantomOS and Cosmos. There are also some interesting projects that are not active anymore, like JX and SharpOS (whose developers are now working in the MOSA project).

Bibliography, links, and friends

Now that the general concepts which you’ll have to care about are defined, I bet you want to get into more details. In fact you should. So here is some material for going deeper than this introductory article on the subject of kernel design:

Modern Operating Systems (Andrew S. Tanenbaum): Should you only read one book on the subject, I strongly recommend this one. It is an enlightening, extensive description of the subject, covering many aspects of kernel design, and you may also use it in many other parts of your OS development work.
You may also find a list of other books, along with some reviews, on the OSdev wiki.
While you’re on said wiki, you might also want to have a look at its main page, more precisely at the “Design Considerations” links in the left column (scroll down a bit). More generally, you should bookmark this website, because you’ll have a vital need for it once you start working on implementation. It’s, simply put, the best resource I know of on the subject.
And when you have questions, also consider asking them in their forum. They are asked hundreds of “how do I?” and “I’m stuck, what should I do?” implementation questions per month, so a bit of theoretical discussion would really please them. But beware of stupid questions which are answered in the wiki, otherwise prepare to face Combuster’s sharp tongue.
Questions for an OS designer is also an interesting read, although it doesn’t go too deeply into specifics. I should have linked to it in my previous article.

And that’s all for now. Next time, we’re going to go a bit more platform-specific, as I’m going to describe basic characteristics of the x86 architecture, which will be used for the rest of this tutorial (I’ll notably explain why).

34 Comments

2011-02-20 2:26 pm
Neolander
And now, let’s go back to my own OS and those damn accommodation issues in Uppsala ^^ I didn’t expect this article to take so long, but you should probably not expect the next one to come next week either
Edited 2011-02-20 14:28 UTC

2011-02-20 2:29 pm
WereCatf
Thanks for yet another great article, I’m sure there’s plenty of people here who have found these a great read
2011-02-22 8:49 am
abstraction
Great choice of University. You won’t be disappointed =) If you have any questions about Uppsala or the uni don’t hesitate to ask me. Drop a PM.

2011-02-22 3:14 pm
Neolander
AFAIK, the PM feature of OSnews is discontinued (look at http://www.osnews.com/messages?op=compose&uid=5 ).
However, as an editor, I’ve got access to the mail address you used for subscribing, so I can contact you this way if you’re okay with that

2011-02-22 10:15 pm
abstraction
No problems!

2011-02-20 3:10 pm
cb88
While you might be able to argue that a VM kernel could be more reliable with GC and array bounds etc…. arguing that it is more secure is a losing battle. VMs can and have been broken out of in the past…. I would actually think they would increase the amount of code where an exploit could occur.
Edited 2011-02-20 15:11 UTC

2011-02-20 6:24 pm
Neolander
Well, I thought that this article displayed too much my preference towards the microkernel approach Thanks for showing me that I did a good job balancing it after all.

2011-02-21 12:59 am
cb88
While any validation of your efforts on my part was wholly unintentional… you’re welcome just the same 🙂
I wish there were more information on exokernels myself…. the idea seems to have stalled back in 2000. XOK doesn’t even compile with a modern compiler. Myself I am looking at going through the MIT OS development course on my own… through the xv6 code and development of JOS which is an exokernel http://pdos.csail.mit.edu/6.828
EDIT: I found an old copy of the code apparently a newer copy is here http://pdos.csail.mit.edu/6.828/xv6/xv6-rev5.tar.gz
It doesn’t require any hacking up to compile it… just make qemu or make bochs
Edited 2011-02-21 01:05 UTC

2011-02-20 5:56 pm
antonone
Thanks for this article, but it appears that the JNode system got missed from the sample OS’es based on VM approach. It even has a GUI so it might be interesting thing to watch, if anyone’s interested.

2011-02-20 6:36 pm
Neolander
Indeed. I’ll correct this along with some English mistakes once I get access to some real internet connection (EDGE with locked ports is the new 56K).
When looking for examples, I’ve been highly displeased to discover that most operating systems which were said to be VM-based (like Inferno or JavaOS) were in fact nothing but a java virtual machine running on top of a regular C/C++ kernel, and have given up a bit too quickly apparently.
SharpOS was also an interesting project, but apparently development has ended.
Edited 2011-02-20 18:45 UTC

2011-02-20 10:33 pm
moondevil
That is because in the area of VM operating systems the author is either not old enough, or did not research the subject enough.
Pascal MicroEngine in 1976, which used to process P-Code as instruction set.
The Lisp machines the early 80s.
The original Smalltalk environment at Xerox in
Forth is a VM, compiler and operating system, all in one, in the early 70s.
Modula-2 based system for the Lilith architecture.
Some versions of Oberon operating system, use the modules in bytecode form and compile them on load.
Granted, these systems still do use some assembly at the core of their implementations, but so do C based OSs.

2011-02-21 7:32 am
Neolander
What was the point of coding a kernel in an interpreted language without even having the benefits of modern interpreted languages like type safety and pointer/array bounds checking, just for the sake of having the code being interpreted ? Apart for proving “yeah, it’s possible”, sounds like a waste to me…

2011-02-21 7:55 am
moondevil
Funny that all the languages I mention have those benefits.
And most of the examples given by me have JIT implementations.
Do you know that Sun’s Java Hotspot was actually developed for Smalltalk (Self)?
Age has some benefits…
2011-02-21 8:02 am
Neolander
So, you are telling me, without a smile, that Forth manages to be type-safe without even having a type system ?
Age has some benefits…
Indeed, especially when trying to understand computer history… These benefits go in both directions of the “age” axis though. I’ve had the chance to know the time where not every game was an FPS and to learn to program with Delphi without having COBOL and FORTRAN giving me nightmares during my childhood, so I’d say I came in this world just at the right time as far as PC evolution is concerned ^^
Edited 2011-02-21 08:22 UTC
2011-02-21 11:53 am
moondevil
Ok, in what concerns Forth you are right.

2011-02-20 8:31 pm
jack_perry
How do loadable kernel modules enter into the micro v. monolithic kernel debate? Is it possible that a microkernel could minimize the penalty of IPC by adding these services as modules? i.e. does “micro” mean merely that the kernel remains small during runtime, or that the kernel’s core codebase is small, but by loading modules it could grow?

2011-02-20 8:47 pm
Neolander
Afaik, microkernels are defined by the way they put some of the services they offer in isolated processes. Microkernels are intrinsically modular, but monolithic kernels *can* also be modular without becoming microkernels.
Whether monolithic kernels are modular or not, most of their main advantages and drawbacks remain, since most code runs in the same address space, in kernel mode, with full access to hardware. Modularity acts independently from that.
Edited 2011-02-20 20:54 UTC

2011-02-20 9:45 pm
jack_perry
Okay. So you could have a monolithic kernel that is actually quite small, and loads modules dynamically into memory as needed. The drawbacks of security would remain, and might even be worsened if the administrator was dumb enough to load an insecure module to extend the OS’ capabilities.
The reason I ask is that modularity could decrease code size significantly. You mention that it is common for desktop monolithic kernels to include facilities for rendering GPU-accelerated graphics and managing every single filesystem in existence, which, okay, could be bad if all these filesystems are in memory all the time, but isn’t so bad if they are loaded only when needed. It likewise could provide a defense against what you cite as unmaintainable, non-portable, non modular code.
It makes me wonder whether most monolithic kernels do this in practice.
I guess I should say that this line of questioning is inspired by my acquaintance with Microware’s OS-9, which I once heard described as neither monolithic nor micro but modular. But I don’t know more than that, and even that is a distant memory.

2011-02-20 9:57 pm
Neolander
Indeed, more modularity can be used as a way to reduce the disadvantages of the monolithic approach on “large” kernels, without going as far as the microkernel approach. It’s an interesting in-between solution.
On the other hand, there must be a policy somewhere that forces kernel devs to put new features in separate modules whenever possible. Otherwise, you get something like Linux : the kernel is modular, but outside of the realm of hardware drivers its modularity capabilities are heavily under-used (and since there’s no standard, stable module interface, third-party module development does not work well… But that’s another story)

2011-02-21 3:12 am
ebasconp
Beautiful article!!!
Some years ago I was fascinated reading about L4 and its several implementations. What I find great about microkernels is the way of implementing everything as servers (isolated processes) or implementing several “personalities” running on top of the microkernel; so you could, in theory, have your microkernel with several “virtual machines” running on top of it. Could you think of Xen or VMware ESX Server as having some microkernel-like design?
2011-02-21 7:21 am
Morph
Windows is more hybrid in recent years since the introduction of User-mode Driver Framework: http://en.wikipedia.org/wiki/User-Mode_Driver_Framework
IIRC, Aero graphics drivers run in usermode.
Another VM-based OS that’s had some coverage on OSNews is Phantom OS: http://dz.ru/en/solutions/phantom/
Their goal is full persistence of all processes, data etc across shutdowns. Last blog post was October 2010 so perhaps it’s not dead yet.

2011-02-21 8:07 am
Neolander
Windows is more hybrid in recent years since the introduction of User-mode Driver Framework: http://en.wikipedia.org/wiki/User-Mode_Driver_Framework
IIRC, Aero graphics drivers run in usermode.
Indeed. That thing crashes so often on my machine that if it was in kernel mode, I’d spend more time rebooting than doing something useful when I’m on Windows, like in the Win9x days…
Don’t know which part exactly of Windows’ graphics stack is in user mode, though. I find it hard to believe that they could just fully move graphics driver in user space. That would break driver compatibility, which is definitely not Microsoft’s thing.
Another VM-based OS that’s had some coverage on OSNews is Phantom OS: http://dz.ru/en/solutions/phantom/
Their goal is full persistence of all processes, data etc across shutdowns. Last blog post was October 2010 so perhaps it’s not dead yet.
Added, along with some links.
Edited 2011-02-21 08:28 UTC

2011-02-21 9:17 am
kaiwai
Indeed. That thing crashes so often on my machine that if it was in kernel mode, I’d spend more time rebooting than doing something useful when I’m on Windows, like in the Win9x days…
Don’t know which part exactly of Windows’ graphics stack is in user mode, though. I find it hard to believe that they could just fully move graphics driver in user space. That would break driver compatibility, which is definitely not Microsoft’s thing.
Microsoft did break compatibility with Windows Vista by pushing more responsibility onto the hardware vendors: WDDM (Windows Display Driver Model) required a virtual rewrite of drivers, which is why the Windows Vista launch was so problematic. Windows 7 has WDDM 1.1, which has brought back hardware acceleration for some GDI functions, given that in 1.0 GDI was totally software driven.
As for Windows crashes: when you have multiple points of failure (crappy hardware, crappy drivers and a difficult-to-understand driver API), things will never work out as planned. With that being said, however, if your hardware vendor produces quality hardware, uses the latest Windows Driver Kit and takes advantage of the ‘tried and tested’ templates that exist, then many of the issues shouldn’t arise (at least in theory).

2011-02-21 11:38 am
Neolander
Well, although there are some random Aero crashes, Aero mainly crashes when running old games and other things running in full screen which do not use DirectX (or at least not the latest releases), so I bet I’m just not in the official test cases
Also, I have an exotic GPU setup (Optimus), which further increases the likelihood of something breaking.
I’ve reported that on Microsoft’s bug reporting tool, but they were not very helpful in terms of telling me who else I should contact if it’s not their fault.
Anyway, I’ve switched back to using Linux as my main OS some months ago, so I don’t care so much about that anymore. And as said before, I must admit that it breaks quite nicely, silently falling back to CPU rendering without a glitch.
Edited 2011-02-21 11:40 UTC

2011-02-21 6:03 pm
AFreeQuark
The people at wiki.osdev.org/Main_Page sure are elitist considering how much of their information assumes the silly behavior of x86.
I think being an obnoxious snob carries certain responsibilities, such as knowing how large a world one lives in.

2011-02-21 7:57 pm
Kochise
Could you… elaborate ?
Kochise

2011-02-22 2:00 am
AFreeQuark
I’d rather take back my comment, actually, as it didn’t really add much to the conversation. I had poked around the FAQ and forum a bit and was dismayed by how discouraging people were sometimes being to newbies.
…but if I had to point something out, there were a lot of articles on the initial boot-up of systems which talked as if the details were universal, when they were really all x86-specific.
Again, though, I wish I could retract my previous comment. The site *does* have a lot of no-doubt useful information.
🙂

2011-02-21 11:45 pm
Brendan
Hi,
The people at wiki.osdev.org/Main_Page sure are elitist considering how much of their information assumes the silly behavior of x86.
This isn’t intentional. Anyone who has information on other platforms is encouraged to add to wiki.osdev.org.
The reality is, most hobbyist OS developers (just like normal users) have easy access to one or more “PC compatible” computers (required for testing, etc), and don’t have easy access to anything else.
Note 1: “testing” means being able to find bugs that exist but only show symptoms in some situations, by being able to test on a (hopefully large) number of different systems.
Note 2: “easy access” means you (and other people that volunteer to test your OS) can mess with system software without special equipment (JTAG cables, flash/EPROM burners, etc) and without worrying about bricking the system.
Most other platforms are either rare (e.g. SPARC), obsolete (e.g. Alpha), or too expensive (e.g. Itanium). This even includes other platforms that use x86 CPUs (UEFI based systems), which are still far less widespread than “PC compatibles”.
Then there’s embedded CPUs (MIPS, ARM) which can’t really be considered a platform because there’s no standardisation for much more than the CPU’s instruction set. For these, you can write an OS for one system and it won’t really work on other systems that have the same CPU type. For example, you can write an OS for one of the ARM development boards, but you’d need to port the OS to any other ARM systems you want to support, which makes it hard to test on a wide variety of systems (especially during the early stages of development).
Anyway, the end result of “all the above” is that most hobbyist OS developers want information for “PC compatible” computers, and eventually become people who are able to add information about “PC compatible” computers to wiki.osdev.org.
I’d also point out that some of the information on wiki.osdev.org applies to any platform. This includes the “theory” sections, some hardware information (PCI and USB), etc.
– Brendan

2011-02-22 2:02 am
AFreeQuark
Good points, all.
Hopefully I will have things to add to the wiki in the near future.

2011-02-22 8:30 am
t3RRa
AFAIK, the NT kernel does include some features in kernel space that pure microkernels move into user space, but it still loads other features as services in user space. That is why the NT kernel is called a hybrid kernel: it is not actually monolithic, and it differs from Linux’s module architecture. (BeOS similarly adopted a hybrid kernel architecture.) So it is not true to call it monolithic kernel architecture because everything is separated.

2011-02-22 10:50 pm
t3RRa
So it is not true to call it monolithic kernel architecture because everything is separated.
I meant “everything is not separated”. common mistake while changing and moving around..

2011-02-23 8:12 am
Alfman
I am absolutely delighted that my suggestion to use a type safe language to provide isolation got attention in the “VM Based Kernels” section.
“The kernel must include a full featured interpreter for the relevant language, which means that the codebase will be huge, hard to design, and that very nasty VM bugs are to be expected during implementation.”
Yes, a type safe VM language will require more work than simply using a traditional compiler.
As for devel effort, with any luck we wouldn’t have to completely re-invent the wheel and could use existing VM implementations like Java or Mono. Writing a VM from scratch would be a ton of work – even if it has merit.
I think the term “interpreter” mischaracterizes the approach. A type safe language capable of isolation is not dependent on being interpreted; it can use JIT and pre-compilation too.
“Making a VM fast enough that it is suitable for running kernel-level code full of hardware access and pointer manipulation is another big programming challenge.”
As discussed in the earlier comments, I don’t think the type safe language would imply any overhead over a correctly implemented unsafe-language version. All we require are safe language constructs which map efficiently over top of the underlying hardware (memory mapped devices, port IO, DMA).
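To make “safe language constructs over the hardware” a bit more concrete, here is a minimal sketch of the interface shape I have in mind. Everything in it is an assumption made up for illustration: the `RegisterWindow` name, the 32-bit little-endian registers, and the `bytearray` standing in for a memory-mapped region. Python cannot demonstrate the zero-overhead claim, and it only hides the view by convention, so this shows the contract (checks at construction, no way to re-base the window afterwards), not an implementation.

```python
import struct


class RegisterWindow:
    """Bounds-checked view over a device's register block.

    The bytearray stands in for a memory-mapped region; on real hardware
    the backing store would be the mapped device memory (assumption).
    """

    def __init__(self, backing: bytearray, base: int, size: int):
        if base < 0 or size < 0 or base + size > len(backing):
            raise ValueError("window exceeds backing region")
        # A memoryview slice has a fixed extent, so accesses made through
        # it cannot land outside [base, base + size).
        self._view = memoryview(backing)[base:base + size]

    def read32(self, offset: int) -> int:
        self._check(offset)
        return struct.unpack_from("<I", self._view, offset)[0]

    def write32(self, offset: int, value: int) -> None:
        self._check(offset)
        struct.pack_into("<I", self._view, offset, value & 0xFFFFFFFF)

    def _check(self, offset: int) -> None:
        # Reject negative, unaligned, or out-of-window register offsets.
        if offset < 0 or offset % 4 != 0 or offset + 4 > len(self._view):
            raise IndexError(f"bad register offset {offset:#x}")


# Two hypothetical drivers share one simulated physical region
# but are handed disjoint windows onto it.
physical = bytearray(64)
nic_regs = RegisterWindow(physical, base=0, size=16)
gpu_regs = RegisterWindow(physical, base=16, size=16)

nic_regs.write32(4, 0xDEADBEEF)
```

In a compiled type safe language, the hope is that the per-access check could be removed or hoisted when the offset is a constant, which is where the “no overhead” argument would come from.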
“A VM running on top of bare hardware will be harder to write, and thus more buggy and exploitable, than a VM running in the user space of an operating system.”
This depends on how deeply integrated Java/Mono are with external dependencies (such as pthreads, or libc, or syscalls). Since I don’t know the answer, I’ll let the criticism stand.
“Currently, the Java Virtual Machine is one of the biggest sources of vulnerabilities in existence on desktop PCs.”
Citation?
Also, remember that, unlike a web browser sandbox, the kernel/VM isn’t required to protect from maliciously altered kernel modules (untrusted code). It only needs to ensure that modules written in type safe code remain isolated.
As long as the compiler only produces in-spec modules, then we can reasonably get away with a VM which produces undefined results for out-of-spec modules.
Features I thought about for my OS many years ago:
It would be very nice to have support for clusters within the OS, such that you could take any running application and migrate it to another node while continuing to run. Similar to what VMware/KVM/Xen do, but would work with arbitrary applications.
Every single kernel interface should have the ability to be virtualized such that all interfaces on PC-A could be transparently redirected/aliased on PC-B without explicit support for this within the drivers.
PC-A: soundblaster, webcam
PC-B: remote alias to peripheral devices from PC-A
This would go hand in hand to make the application migration feature seamless.
This also means that all local applications could bind to remote devices transparently without being explicitly written to do so.
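A rough sketch of what that aliasing could look like, with heavy caveats: the `Node`, `Webcam` and `RemoteAlias` names, the JSON message format and the single `grab_frame` operation are all invented for the example, and the “network” is just an in-process function call. The point is only that the application looks a device up by name and cannot tell whether it is local or an alias.

```python
import json
from typing import Callable, Dict, Protocol


class Device(Protocol):
    def call(self, op: str, *args): ...


class Webcam:
    """A local driver on PC-A (hypothetical device with one operation)."""

    def __init__(self):
        self.frames = 0

    def call(self, op: str, *args):
        if op == "grab_frame":
            self.frames += 1
            return f"frame-{self.frames}"
        raise ValueError(f"unsupported op {op!r}")


class Node:
    """One machine: a registry mapping device names to Device objects."""

    def __init__(self, name: str):
        self.name = name
        self.devices: Dict[str, Device] = {}

    def handle(self, request: bytes) -> bytes:
        # Entry point for requests arriving from another node.
        msg = json.loads(request)
        result = self.devices[msg["device"]].call(msg["op"], *msg["args"])
        return json.dumps({"result": result}).encode()


class RemoteAlias:
    """Stands in for a device that lives on another node."""

    def __init__(self, device: str, transport: Callable[[bytes], bytes]):
        self._device = device
        self._transport = transport

    def call(self, op: str, *args):
        request = json.dumps({"device": self._device, "op": op, "args": list(args)})
        reply = self._transport(request.encode())
        return json.loads(reply)["result"]


def application(node: Node) -> str:
    # The application only knows the device name, not where the device lives.
    return node.devices["webcam"].call("grab_frame")


pc_a = Node("PC-A")
pc_a.devices["webcam"] = Webcam()

pc_b = Node("PC-B")
# In-process call used as the transport (assumption for this sketch).
pc_b.devices["webcam"] = RemoteAlias("webcam", transport=pc_a.handle)

local_frame = application(pc_a)   # served directly by the driver on PC-A
remote_frame = application(pc_b)  # forwarded from PC-B to PC-A
```

Because the alias sits in the registry rather than in the driver, the `Webcam` driver needs no remote-awareness of its own, which is the property asked for above. Migration would then amount to moving the application and leaving an alias behind for each device it had bound.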
Other thoughts:
I hope you’re not planning on copying the *nix signal model, it’s pretty bad for modern requirements.
I also hope you opt for an asynchronous IO design within the kernel over a threaded IO design. This has been one of the weaknesses plaguing linux for years.
While Posix plays an important role in compatibility and standardization, it’s a major impediment to revolutionary designs. Therefore I think strict Posix compliance should be an explicit non-goal, particularly with regards to fs permissions and some of the braindead APIs.
Now if only someone would employ me to work on these things… Is anyone else here woefully under employed?

2011-02-24 11:01 am
alexisread
I don’t think the type safe language would imply any overhead over a correctly implemented unsafe-language version. All we require are safe language constructs which map efficiently over top of the underlying hardware (memory mapped devices, port IO, DMA).
JNode and SqueakNOS are practical examples of VM based kernels – esp. check out Squeak and its associated OS branch SqueakNOS. It looks like development has stalled again, so it’s missing COG integration (JIT compiler) and the latest I/O libraries, and the optimisation is pretty non-existent at this stage. It does run reasonably quickly however.
As per the argument put forward by the Phantom OS team, you should be able to (long term) optimise this sort of OS better than a monolithic kernel thanks to zero context switching.
“A VM running on top of bare hardware will be harder to write, and thus more buggy and exploitable, than a VM running in the user space of an operating system.”
This depends on how deeply integrated Java/Mono are with external dependencies (such as pthreads, or libc, or syscalls). Since I don’t know the answer, I’ll let the criticism stand.
“Currently, the Java Virtual Machine is one of the biggest sources of vulnerabilities in existence on desktop PCs.”
Citation?
Also, remember that, unlike a web browser sandbox, the kernel/VM isn’t required to protect from maliciously altered kernel modules (untrusted code). It only needs to ensure that modules written in type safe code remain isolated.
As long as the compiler only produces in-spec modules, then we can reasonably get away with a VM which produces undefined results for out-of-spec modules.
In addition, the VM would take away a whole class of bugs in user-level code eg. buffer overrun bugs in flash plugins. OS-wide I think you’d have a net gain and any VM bugs would show themselves quickly (and get fixed for an open source project anyway).
Features I thought about for my OS many years ago:
It would be very nice to have support for clusters within the OS, such that you could take any running application and migrate it to another node while continuing to run. Similar to what VMware/KVM/Xen do, but would work with arbitrary applications.
Every single kernel interface should have the ability to be virtualized such that all interfaces on PC-A could be transparently redirected/aliased on PC-B without explicit support for this within the drivers.
PC-A: soundblaster, webcam
PC-B: remote alias to peripheral devices from PC-A
This would go hand in hand to make the application migration feature seamless.
This also means that all local applications could bind to remote devices transparently without being explicitly written to do so.
Other thoughts:
I hope you’re not planning on copying the *nix signal model, it’s pretty bad for modern requirements.
I also hope you opt for an asynchronous IO design within the kernel over a threaded IO design. This has been one of the weaknesses plaguing linux for years.
Check out the Spoon image – it can do this and much more (versioning is handled well), major update due in March!

2011-02-23 6:38 pm
maze
I think you forgot Symbian OS. It might be worth mentioning since it has some interesting concepts, even though it is a bit unclear how long it will exist (and how long you will be able to look at the source code to learn from it).