Linked by Hadrien Grasland on Sat 5th Feb 2011 10:59 UTC
So you have taken the test and think you are ready to get started with OS development? At this point, many OS-deving hobbyists are tempted to go looking for a simple step-by-step tutorial that would guide them through making a binary boot, doing some text I/O, and other "simple" stuff. The implicit plan is more or less as follows: whenever they think of something that would be cool to implement, they implement it. Gradually, feature after feature, their OS would supposedly build up, slowly becoming superior to anything out there. This is, in my opinion, not the best way to get somewhere (if getting somewhere is your goal). In this article, I'll try to explain why, and what I think you should be doing at this stage instead.
RE[10]: Not always rational
by Morin on Thu 10th Feb 2011 09:53 UTC in reply to "RE[9]: Not always rational"
Morin
Member since:
2005-12-31

> This sounds very much like NUMA architectures, and
> while support for them may be warranted, I don't
> know how this changes IPC?

NUMA is the term I should have used from the beginning to avoid confusion.

> > "As an example to emphasize my point, consider CPU/GPU combinations."
> I expect the typical use case is the GPU caches
> bitmaps once, and doesn't need to transfer them
> across the bus again. So I agree this helps
> alleviate shared memory bottlenecks, but I'm
> unclear on how this could help OS IPC?

It seems that my statement has added to the confusion...

I did *not* mean running anything on the GPU. I was talking about communication between two traditional software processes running on two separate CPUs, each connected to its own RAM. The separate RAMs *can* bring performance benefits if the programs are reasonably independent, and for microkernel client/server IPC, explicit data caching and uploaded bytecode scripts (sketched below) improve performance even further by avoiding round-trips.

The GPU was only meant to emphasize the performance benefit of separate RAMs: if a CPU and a GPU use separate RAMs to increase performance, two CPUs running traditional software processes can do the same, provided those processes are reasonably independent.

Not to say that you *can't* exploit a GPU for such things (folding@home does), but that was not my point.
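
To make the caching idea more concrete, here is a minimal sketch. All names are invented for illustration (the same pattern appears in X11 pixmaps and GPU texture uploads): the client transfers a bitmap to the server once, gets back a small handle, and later requests carry only the handle, never the pixel data.

```c
/* Hypothetical sketch of explicit data caching across the interconnect.
 * Every name here is invented for illustration, not a real GUI protocol. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CACHED 64

typedef struct {
    uint8_t *pixels;     /* lives in the *server's* local RAM */
    size_t   size;
} cached_bitmap;

static cached_bitmap cache[MAX_CACHED];
static int cache_count;

/* One-time transfer: copy the client's data into server-local RAM.
 * This is the only time the full payload crosses the interconnect. */
static int server_cache_bitmap(const uint8_t *data, size_t size)
{
    if (cache_count == MAX_CACHED)
        return -1;
    cached_bitmap *b = &cache[cache_count];
    b->pixels = malloc(size);            /* server-local allocation */
    if (!b->pixels)
        return -1;
    memcpy(b->pixels, data, size);       /* the single bulk copy */
    b->size = size;
    return cache_count++;                /* a small handle goes back */
}

/* Steady state: requests reference the handle only; no bulk transfer,
 * no touching the client's remote RAM, no cache coherency traffic. */
static void server_draw(int handle, int x, int y)
{
    printf("drawing %zu-byte bitmap %d at (%d,%d) from local RAM\n",
           cache[handle].size, handle, x, y);
}

int main(void)
{
    uint8_t icon[4096] = {0};                        /* client-side data */
    int h = server_cache_bitmap(icon, sizeof icon);  /* once, at startup */
    server_draw(h, 10, 20);                          /* cheap, repeatable */
    server_draw(h, 30, 40);
    return 0;
}
```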

> I could be wrong, but I'd still expect RAM access to
> be faster than any hardware on the PCI bus.

Access by a CPU to its own RAM is, of course, fast. Access to another CPU's RAM is a bit slower, but what is much worse is that it contends with that other CPU for access to its RAM *and* creates cache coherency issues.
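
For what it's worth, on Linux you can see the "keep accesses local" half of this with the libnuma API. A minimal sketch (the node number and buffer size are arbitrary assumptions; build with gcc ... -lnuma):

```c
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not supported on this system\n");
        return EXIT_FAILURE;
    }

    /* Pin the calling thread to node 0 so its accesses stay local. */
    numa_run_on_node(0);

    /* Allocate a buffer in the RAM local to node 0's CPUs. A thread
     * pinned to node 0 touching this buffer never crosses the
     * interconnect and never contends with the other node's RAM. */
    size_t len = 1 << 20;
    void *buf = numa_alloc_onnode(len, 0);
    if (!buf) {
        perror("numa_alloc_onnode");
        return EXIT_FAILURE;
    }

    memset(buf, 0, len);   /* all accesses hit local RAM */

    numa_free(buf, len);
    return EXIT_SUCCESS;
}
```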

Explicit data caching and uploaded scripts would allow, for example, a GUI server process to run on one CPU in a NUMA architecture while the client application that wants to show a GUI runs on the other CPU. Caching would let the client send icons and the like over the interconnect once at startup. Bytecode scripts, also transferred once at startup, would then let the GUI server react to most events (keyboard, mouse, whatever) without any IPC to the application process.
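
A rough sketch of that uploaded-handler idea, with the "bytecode" simplified to a plain C function pointer and every name invented for illustration (a real system would upload a verified, sandboxed script over the interconnect instead):

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { EV_KEY, EV_MOUSE, EV_CLOSE } event_type;

typedef struct {
    event_type type;
    int code;            /* key code or button, depending on type */
} event;

/* The "bytecode": simplified here to a function pointer. */
typedef bool (*handler_fn)(const event *ev);

static handler_fn uploaded_handler;   /* set once at client startup */

/* The client uploads its handler once; afterwards no per-event IPC. */
static void upload_handler(handler_fn h) { uploaded_handler = h; }

/* Server-side event loop: every event is handled locally on the
 * server's CPU; only events the script cannot handle (returns false)
 * are forwarded to the client over IPC. */
static void server_dispatch(const event *ev)
{
    if (uploaded_handler && uploaded_handler(ev))
        return;                       /* handled locally, zero IPC */
    printf("forwarding event %d (code %d) to client via IPC\n",
           ev->type, ev->code);
}

/* Example handler: routine events are absorbed locally. */
static bool demo_handler(const event *ev)
{
    return ev->type != EV_CLOSE;      /* only CLOSE needs the client */
}

int main(void)
{
    upload_handler(demo_handler);     /* one-time upload at startup */

    event keypress = { EV_KEY, 42 };
    event close    = { EV_CLOSE, 0 };
    server_dispatch(&keypress);       /* handled without IPC */
    server_dispatch(&close);          /* the rare round-trip */
    return 0;
}
```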

The point is that IPC round-trips increase the latency of the GUI (though they don't affect throughput) and make it feel sluggish, while bulk data transfers degrade both latency and throughput; and in a NUMA architecture you can't fix that with shared memory without causing contention at the RAM and cache coherency issues.

Reply Parent Score: 2