Linked by Hadrien Grasland on Sat 5th Feb 2011 10:59 UTC
OSNews, Generic OSes So you have taken the test and you think you are ready to get started with OS development? At this point, many OS-deving hobbyists are tempted to go looking for a simple step-by-step tutorial which would guide them through making a binary boot, doing some text I/O, and other "simple" stuff. The implicit plan is more or less as follows: any time they think of something which in their opinion would be cool to implement, they'll implement it. Gradually, feature after feature, their OS would supposedly build up, slowly becoming superior to anything out there. This is, in my opinion, not the best way to get somewhere (if getting somewhere is your goal). In this article, I'll try to explain why, and what I think you should be doing at this stage instead.
Thread beginning with comment 461506
RE[7]: Not always rational
by Alfman on Wed 9th Feb 2011 01:12 UTC in reply to "RE[6]: Not always rational"
Alfman Member since:
2011-01-28

Morin,

"There *are* multi-chip x86 systems (e.g. high-end workstations), there *are* ARM systems (much of the embedded stuff, as well as netbooks), and there *are* systems with more than one RAM..."

Sorry, but I'm not really sure what your post is saying.

Reply Parent Score: 1

RE[8]: Not always rational
by Morin on Wed 9th Feb 2011 07:29 in reply to "RE[7]: Not always rational"
Morin Member since:
2005-12-31

Morin, "There *are* multi-chip x86 systems (e.g. high-end workstations), there *are* ARM systems (much of the embedded stuff, as well as netbooks), and there *are* systems with more than one RAM..." Sorry, but I'm not really sure what your post is saying.


Then I might have misunderstood your original post:

However, considering that shared memory is the only form of IPC possible on multicore x86 processors, we can't really view it as a weakness of the OS.


I'll try to explain my line of thought.

My original point was that shared RAM approaches make the RAM a bottleneck. You responded that shared RAM "is the only form of IPC possible on multicore x86 processors".

My point now is that this is true only for the traditional configuration: a single multi-core CPU and a single RAM, with no additional shared storage and no hardware message-passing mechanism. Your conclusion is correct, but it is limited to such traditional systems, and my response was pointing out that there are many systems which do *not* use that configuration. Hence shared memory is *not* the only possible form of IPC on such systems, and making the RAM a bottleneck through such IPC artificially limits system performance.
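To make the shared-memory case concrete, here is roughly what that kind of IPC looks like on a POSIX system. This is only a sketch (the object name is made up, error handling is trimmed), not code from any real kernel:

/* Minimal sketch: two processes share a counter through POSIX shared
 * memory (shm_open/mmap). Error handling trimmed; compile with
 * gcc shm_demo.c -lrt. The object name "/demo_region" is illustrative. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Create (or open) a named shared-memory object. */
    int fd = shm_open("/demo_region", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(int));

    /* Map it; every process that maps the same name sees the same RAM. */
    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);

    /* Both sides now touch the same cache lines -- exactly the traffic
     * that makes the shared RAM (and the coherency protocol) the
     * bottleneck under heavy IPC. */
    __sync_fetch_and_add(counter, 1);
    printf("counter = %d\n", *counter);

    munmap(counter, sizeof(int));
    close(fd);
    return 0;
}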

As an example to emphasize my point, consider CPU/GPU combinations with separate RAMs (gaming systems) vs. those with shared RAM (cheap notebooks). On the latter, RAM bandwidth is limited and quickly becomes a bottleneck (no, I don't have hard numbers).

I wouldn't be surprised to see high-end systems in the near future, powered by two general-purpose (multicore) CPUs and a GPU, each with its own RAM (that is, a total of 3 RAMs) and without transparent cache coherency between the CPUs, only between cores of the same CPU. Two separate RAMs means a certain amount of wasted RAM, but the performance might be worth it.

Now combine that with the idea of uploading bytecode scripts to server processes, possibly "on the other CPU", vs. shared memory IPC.
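Roughly, the contrast I have in mind is the client shipping a small request to a server process instead of both sides hammering the same RAM. Again only a sketch; the socket path, the request text, and the server itself are made up for illustration:

/* Message-passing sketch: send a small "script"/request to a server
 * process over a Unix domain socket and read back the reply. The path
 * "/tmp/demo_server.sock" and the request format are illustrative. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strncpy(addr.sun_path, "/tmp/demo_server.sock", sizeof(addr.sun_path) - 1);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    /* "Upload" the work description; the server runs it next to its own
     * RAM, and only the small request and reply cross between the two. */
    const char request[] = "sum 1..1000000";
    write(fd, request, sizeof(request));

    char reply[64] = { 0 };
    read(fd, reply, sizeof(reply) - 1);
    printf("server replied: %s\n", reply);

    close(fd);
    return 0;
}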

Reply Parent Score: 2

RE[9]: Not always rational
by Alfman on Thu 10th Feb 2011 05:20 in reply to "RE[8]: Not always rational"
Alfman Member since:
2011-01-28

"my response was aiming at the fact that there are many systems that do *not* use the traditional configuration. Hence shared memory is *not* the only possible form of IPC on such systems, and making the RAM a bottleneck through such IPC artificially limits system performance."

"As an example to emphasize my point, consider CPU/GPU combinations."

I expect the typical use case is that the GPU caches bitmaps once and doesn't need to transfer them across the bus again. So I agree this helps alleviate shared-memory bottlenecks, but I'm unclear on how it could help OS IPC.

Maybe Nvidia's CUDA toolkit does something unique for IPC? That's not really my area.

I'm curious, what role do you think GPUs should have in OS development?

"I wouldn't be surprised to see high-end systems in the near future, powered by two general-purpose (multicore) CPUs and a GPU, each with its own RAM (that is, a total of 3 RAMs) and without transparent cache coherency between the CPUs, only between cores of the same CPU."

This sounds very much like NUMA architectures, and while support for them may be warranted, I don't see how this changes IPC. I could be wrong, but I'd still expect RAM access to be faster than going through any hardware on the PCI bus.
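For what it's worth, NUMA awareness at the application level usually just means keeping allocations local to the node whose cores will touch them. A minimal sketch, assuming Linux with libnuma installed (link with -lnuma); the node number is arbitrary:

/* NUMA sketch: allocate a buffer on a specific node with libnuma.
 * Assumes Linux with libnuma; link with -lnuma. Node 0 is illustrative. */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this machine\n");
        return 1;
    }

    /* Place the buffer in the RAM attached to node 0, so threads pinned
     * to that node's cores get local accesses instead of paying the
     * cross-node penalty on every load and store. */
    size_t size = 64 * 1024 * 1024;
    char *buf = numa_alloc_onnode(size, 0);
    if (buf == NULL)
        return 1;

    buf[0] = 42;  /* touch it so a page actually gets faulted in locally */
    printf("allocated %zu bytes on node 0\n", size);

    numa_free(buf, size);
    return 0;
}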

Reply Parent Score: 1

RE[9]: Not always rational
by Alfman on Thu 10th Feb 2011 06:12 in reply to "RE[8]: Not always rational"
Alfman Member since:
2011-01-28

"Now combine that with the idea of uploading bytecode scripts to server processes, possibly 'on the other CPU', vs. shared memory IPC."


I guess I may be thinking of something different than you are. When I say IPC, I mean communication between kernel modules. It sounds like you want the kernel to run certain things entirely on the GPU, thereby eliminating the need to go over a shared bus. That would be great for scalability, but the issue is that the GPU isn't very general-purpose.

Most of what a kernel does is I/O rather than number crunching, so it isn't clear how a powerful GPU would help.

Reply Parent Score: 1