Linked by Hadrien Grasland on Sat 5th Feb 2011 10:59 UTC
So you have taken the test and you think you are ready to get started with OS development? At this point, many OS-deving hobbyists are tempted to go looking for a simple step-by-step tutorial that would guide them through making a binary boot, doing some text I/O, and other "simple" stuff. The implicit plan is more or less as follows: whenever they think of something that would be cool to implement, they'll implement it. Gradually, feature after feature, their OS would supposedly build up, slowly becoming superior to anything out there. This is, in my opinion, not the best way to get somewhere (if getting somewhere is your goal). In this article, I'll try to explain why, and what I think you should be doing at this stage instead.
Thread beginning with comment 461225
RE[9]: Not always rational
by Alfman on Mon 7th Feb 2011 17:54 UTC in reply to "RE[8]: Not always rational"

"Anyway, what I was referring to is that interpreted languages are intrinsically slower than compiled languages"

Yes.

"in the same way that an OS running in a VM is intrinsically slower than the same OS running on the bare metal"

But most VM implementations do run apps on bare metal (within user space). We're talking about making a VM run on bare metal in kernel space.

A VM guarantees isolation, but beyond that requirement there is absolutely no reason it has to "interpret" code. It's genuine machine code which runs directly on the processor.

You said you've read the Java/C benchmarks, which is why I didn't cite any, but it sounds like you're doubting the results? Why?



"Sure, but if the kernel's VM ends up doing most of the job of a kernel, what's the point of coding a kernel in X at all? The VM, which is generally coded in a compiled language, ends up being close to a full-featured kernel, so I don't see the benefit"

Ok I understand.

We shouldn't underestimate what can be done in 'X' (as you call it). C's memory manager can be written in C; why rule out using X to do the same thing? It's only a question of bootstrapping.

More importantly though, the vast majority of code running in kernel space (whether micro/macro), is device drivers. In a micro-kernel design, the core kernel should be very small and do very little - like switch tasks and help them intercommunicate. If this very small piece cannot be implemented in pure 'X', then so be it. It's like peppering 'C' with assembly.

Even for a macro-kernel design, I'd say a safe language could be beneficial.

Personally, I actually like 'C', but the lack of bounds checking is something that developers have been struggling with since its inception.
It has other shortcomings too: the lack of namespaces causing library collisions, ugly header file semantics, a very weak macro/template system, a lack of standardized strings, etc.

I'm not saying we should not use C, but if we do, then get ready for the "usual suspects".


"Take a linked list. When parsing it, a process ends up looking at lots of pointers without necessarily knowing where they come from. This is the kind of code which I had in mind."

One approach could be to adapt the way many safe languages already handle references (which, let's face it, are "safe" pointers). All references could be dereferenced safely without a check; any other pointers (let's say coming from user space) would need to be validated prior to use.
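To make the distinction concrete, here is a minimal C sketch of the second case: validating an untrusted user-space pointer before the kernel touches it. The region bounds and the `user_range_ok` helper are hypothetical illustrations, not taken from any real kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of the user-space region; a real kernel would
 * get these from its memory layout. */
#define USER_BASE ((uintptr_t)0x00001000u)
#define USER_TOP  ((uintptr_t)0x7fffffffu)

/* An untrusted pointer arriving from user space must be checked
 * against the user region before the kernel dereferences it.
 * Trusted in-kernel references would skip this check entirely. */
static int user_range_ok(uintptr_t addr, size_t len)
{
    return addr >= USER_BASE &&
           addr <= USER_TOP &&
           len  <= USER_TOP - addr;   /* no wrap past the region */
}
```

A reference handed out by the language runtime never needs this test, which is exactly where the safe-language approach saves work.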


RE[10]: Not always rational
by Neolander on Mon 7th Feb 2011 18:34 in reply to "RE[9]: Not always rational"

"in the same way that an OS running in a VM is intrinsically slower than the same OS running on the bare metal"

"But most VM implementations do run apps on bare metal (within user space). We're talking about making a VM run on bare metal in kernel space."

Sorry for the confusion. I was talking about the VirtualBox/VMware kind of virtual machine there: software which "emulates" desktop computer hardware in order to run an OS in the userspace of another OS.

"A VM guarantees isolation, but beyond that requirement there is absolutely no reason it has to "interpret" code. It's genuine machine code which runs directly on the processor."

So you say that it could be possible to envision "safe" code that's not interpreted? What differs between this approach and the way a usual kernel isolates processes from each other?

"You said you've read the Java/C benchmarks, which is why I didn't cite any, but it sounds like you're doubting the results? Why?"

What I've read showed that for raw computation and a sufficiently long running time, there's no difference between Java and C, which means that JIT compilation does work well. On the other hand, I've not seen benchmarks of stuff which uses more language features, like a comparison of linked list manipulation in C and Java or a comparison of GC and manual memory management from a performance and RAM usage point of view. If you have such benchmarks at hand...

"We shouldn't underestimate what can be done in 'X' (as you call it)."

I use X when I think that what I say applies to all "safe" programming languages.

"C's memory manager can be written in C; why rule out using X to do the same thing? It's only a question of bootstrapping."

The problem is that in many safe languages, memory management and other high-level features are taken for granted, as far as I know, which makes living without them difficult. As an example, GC requires memory management to work, and it's afaik a core feature of such languages.

"More importantly though, the vast majority of code running in kernel space (whether micro/macro), is device drivers. In a micro-kernel design, the core kernel should be very small and do very little - like switch tasks and help them intercommunicate. If this very small piece cannot be implemented in pure 'X', then so be it. It's like peppering 'C' with assembly."

There we agree... Except that good micro-kernels try to put drivers in user space when possible without hurting performance.

"Even for a macro-kernel design, I'd say a safe language could be beneficial."

I think I agree.

"Personally, I actually like 'C', but the lack of bounds checking is something that developers have been struggling with since its inception."

Don't know... I hated it initially, but once I got used to it it only became a minor annoyance.

"It has other shortcomings too: the lack of namespaces causing library collisions,"

Fixed in C++

"ugly header file semantics,"

I totally agree there, C-style headers are a mess. The unit/module approach chosen by Pascal and Python is imo much better.

"a very weak macro/template system,"

Fixed in C++

"a lack of standardized strings"

Fixed in C++, but if I wanted to nitpick I'd say that char* qualifies.

"I'm not saying we should not use C, but if we do, then get ready for the "usual suspects"."

Low-level code must always be polished like crazy anyway.

"One approach could be to adapt the way many safe languages already handle references (which, let's face it, are "safe" pointers). All references could be dereferenced safely without a check; any other pointers (let's say coming from user space) would need to be validated prior to use."

Are these working in the same way as the C++ ones? If so, are they suitable for things like linked lists where pointers have to switch targets?


RE[11]: Not always rational
by Alfman on Mon 7th Feb 2011 22:08 in reply to "RE[10]: Not always rational"

"Sorry for the confusion. I was talking about the VirtualBox/VMware kind of virtual machine there".

Well, that would skew the discussion considerably.
For better or worse, the term "virtual machine" has been overloaded for multiple purposes. I intended to use the term as in "Java Virtual Machine", or Microsoft's .NET...


"So you say that it could be possible to envision 'safe' code that's not interpreted ? What differs between this approach and the way a usual kernel isolates process from each other ?"

It would depend on the implementation of course. But theoretically, a single JVM could run several "virtual apps" together under one process such that they are all virtually isolated. Each virtual app would pull from a unified address space but would have its own mark/sweep allocation tree (for example). This would enable the JVM to kill one virtual app without affecting the others.

Take this design, and apply it to the kernel itself using virtually isolated modules. This is different from the 'process' model, where the CPU's protection mechanisms and page tables enforce isolation.
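The per-module bookkeeping described above can be sketched in a few lines of C. This is a toy illustration under my own assumptions (a fixed-size arena, `malloc` standing in for the real allocator), not a real JVM or kernel design: every allocation is tagged with its owning module, so "killing" a module releases all of its memory without touching the others.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ALLOCS 64  /* toy limit on allocations per module */

/* One arena per virtually isolated module: it remembers every block
 * the module allocated out of the shared address space. */
struct arena {
    void  *blocks[MAX_ALLOCS];
    size_t count;
};

static void *arena_alloc(struct arena *a, size_t size)
{
    if (a->count == MAX_ALLOCS)
        return NULL;
    void *p = malloc(size);   /* stand-in for the shared allocator */
    if (p)
        a->blocks[a->count++] = p;
    return p;
}

/* "Kill" a module: free everything it ever allocated, leaving all
 * other modules' memory untouched -- software isolation instead of
 * hardware page-table isolation. */
static void arena_destroy(struct arena *a)
{
    for (size_t i = 0; i < a->count; i++)
        free(a->blocks[i]);
    a->count = 0;
}
```

A real implementation would use a mark/sweep tree rather than a flat list, but the isolation property is the same: the owner of every block is known, so one module's death cannot leak or corrupt another's memory.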

As an aside, I have nothing against the 'process' design, but the overwhelming concern that we always talk about is the IPC cost.


"What I've read showed that for raw computation and a sufficiently long running time..."

Honestly, I haven't used Java in a while; I'm using it here since it is the most popular example of an application VM in use today - another might be Microsoft's CLR. If you have benchmarks showing slow linked lists in Java, I'd like to see them.

"The problem is that in many safe languages, memory management and other high-level features are taken for granted, as far as I know, which makes living without them difficult."

You have to implement memory management in any OS. Language 'X' would simply have to implement it too; what's the difference whether it's implemented in 'C' or 'X'?

"As an example, GC requires memory management to work, and it's afaik a core feature of such languages."

Yes, but we need to implement memory management anyway. Malloc is part of the C spec, and yet malloc is implemented in C. Obviously the malloc implementation cannot use the part of the spec which is malloc. Instead, malloc is implemented using lower-level primitives (i.e. pages) - I have written my own, and it's not so bad.
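A minimal sketch of the bootstrapping point: a toy bump allocator written in C that never calls malloc, only a lower-level "page" primitive. Here the page is just a static buffer - my own simplification; a kernel would pull fresh pages from its physical memory manager:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

static uint8_t page[PAGE_SIZE];  /* stand-in for a page from the PMM */
static size_t  brk_off;          /* current end of allocated space */

/* Allocate by bumping an offset into the page: no malloc anywhere. */
static void *kmalloc(size_t size)
{
    size = (size + 7) & ~(size_t)7;     /* round up to 8-byte alignment */
    if (size > PAGE_SIZE - brk_off)
        return NULL;                    /* out of backing pages */
    void *p = &page[brk_off];
    brk_off += size;
    return p;
}
```

A real allocator would add free lists and request more pages on demand, but the principle stands: the language's own allocator is written in the language, one level of primitives down.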

I'm on the fence with pure Garbage Collection. It seems really lazy to me to let unused objects float around on the heap. I'd lean towards having a delete operator.


Let me just state the biggest reason NOT to go with 'X': we'd have to design and implement the whole language, and that would be a lot of work - we don't want to get off track from developing the OS.

So it makes sense to choose something that's already out there, and the tradition is to go with 'C' - it just sucks that it's such an unsafe language.

An alternative might be the 'D' language, which sounds like a promising 'C' replacement.


"Are these working in the same way as the C++ ones ? If so, are they suitable for things like linked lists where pointers have to switch targets ?"

C++ references are for the benefit of the programmer, but add no functional value. Safe languages often say they don't have "pointers" but "references" (I hate this distinction, since they refer to the same thing).

In Perl, you can do anything you would in C with a pointer except for arithmetic. In principle, a safe language could support arithmetic so long as it is range-checked against an array before being dereferenced. However, Perl has its own primitive arrays, so pointer arithmetic is not supported.
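The range-checked arithmetic idea can be shown in a few lines of C (names and the error convention are my own, purely illustrative): every computed index is checked against the array's bounds before the dereference, so an out-of-range access becomes a detectable error instead of silent memory corruption.

```c
#include <assert.h>
#include <stddef.h>

/* "Safe" pointer arithmetic: refuse the dereference when the computed
 * index falls outside the array, returning 0 instead of reading
 * arbitrary memory. On success the value is written to *out. */
static int checked_get(const int *arr, size_t len, size_t idx, int *out)
{
    if (idx >= len)
        return 0;          /* bounds violation caught here */
    *out = arr[idx];
    return 1;
}
```

This is exactly the check a safe language's runtime would insert automatically; the cost is one comparison per access, which a compiler can often hoist out of loops.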
