This article looks at two behaviors of the scheduler. The first is how it reacts as more choices are added to its switching decision; the second demonstrates fairness by running a uniform workload in multiple threads. Source code is provided so you can experiment. Last month's column measured bare context switch times using the best primitives available on both Windows and Linux. According to those results, a context switch under Windows takes only half as long as under Linux.
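For reference, here is a minimal sketch of one common way to measure bare thread switch time; it is not the article's code, just an illustration of the idea. Two threads ping-pong the CPU via sched_yield(); run it pinned to one CPU (e.g. under taskset -c 0) so every yield forces a switch. The helper name spinner is my own.

    /* Build: gcc -O2 yield.c -o yield -lpthread
       Run:   taskset -c 0 ./yield   (both threads must share one CPU) */
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <time.h>

    #define YIELDS 1000000

    static void *spinner(void *arg) {
        for (int i = 0; i < YIELDS; i++)
            sched_yield();          /* hand the CPU to the other thread */
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        pthread_create(&a, NULL, spinner, NULL);
        pthread_create(&b, NULL, spinner, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        /* with exactly two runnable threads, roughly 2*YIELDS switches occur */
        printf("~%.0f ns per switch\n", ns / (2.0 * YIELDS));
        return 0;
    }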
I have not read the article. I just want to point out that the graphics layer in WNT started out as a separate user-mode process (microkernel style) but was later moved into the kernel as an application extension to reduce context-switching overhead. I'm not advocating the same fate for the X server, but if Windows is in addition twice as efficient at context switching, then it is important to optimize Linux and the X server in this area wherever possible in order to compete on the desktop.
It seems like he is saying that, without any tweaking of the OSs, W2K responds more quickly under light loads, and Linux does better as the number of threads increases.
I glanced through last month's article… and I got the feeling that it takes a pretty narrow view of context switch measurements. The author was working with threads, not processes, and was apparently experimenting with how fast the OS can switch from one thread to another (within the same process). So the article doesn't really say much about context switching speed as a whole.
When using the best available primitives, the slowness of Linux vs. Windows is explained, I think, by the fact that on Linux a thread is almost the same thing as a process, whereas on Windows a thread is more tightly integrated with its process. Therefore, when switching between threads of the same process, Linux has to do pretty much the same work as if it were switching processes, whereas Windows can optimize a lot because it knows it is working within the same process.
(Or at least Linux threads used to be implemented this way, via the clone() syscall; have things improved?)
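For the curious, a hypothetical sketch of roughly the clone() call LinuxThreads makes: the child shares the parent's address space, file descriptors, filesystem info, and signal handlers, yet the kernel still sees what is essentially another process with its own PID. The function name thread_fn and the stack size are my own choices.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>

    static int thread_fn(void *arg) {
        printf("child runs in the parent's address space\n");
        return 0;
    }

    int main(void) {
        const size_t STACK_SIZE = 64 * 1024;
        char *stack = malloc(STACK_SIZE);
        /* flags roughly as LinuxThreads used them: share memory, files,
           filesystem info, and signal handlers with the parent */
        pid_t pid = clone(thread_fn, stack + STACK_SIZE, /* stack grows down on x86 */
                          CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD,
                          NULL);
        waitpid(pid, NULL, 0);   /* the "thread" still has its own PID */
        free(stack);
        return 0;
    }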
One more point: in the Windows world, threads are used pretty often in applications of all sorts, but traditionally on UNIX systems threads have rarely been used (although thread use is increasing). Therefore it would be really interesting to get actual process switch timings for Windows, too. And also userland-kernel-userland context switch times for both systems (though I'd guess Linux does a stellar job here).
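Measuring that userland-kernel-userland round trip on Linux can be as simple as timing a trivial syscall in a loop; a minimal sketch, using syscall(SYS_getpid) directly to sidestep any caching the C library might do:

    /* Build: gcc -O2 sys.c -o sys   (add -lrt on older glibc) */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <time.h>
    #include <unistd.h>

    int main(void) {
        const long ITER = 1000000;
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long i = 0; i < ITER; i++)
            syscall(SYS_getpid);    /* one user->kernel->user trip each */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("%.1f ns per syscall\n", ns / ITER);
        return 0;
    }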
The kernel version tested in RH7.3 (2.4.18) uses the “old” scheduling algorithm… Shouldn’t we see some improvements in context switch rate with the new O(1) scheduler?
I don’t know about the scheduler. But we should see a more responsive desktop with kernel 2.6:
http://www.ofb.biz/modules.php?name=News&file=article&sid=182
http://news.com.com/2100-1001-963447.html
Wasn’t this threading stuff improved recently for Linux?
(NPTL or so)
Benchmarking context switches in this way has little relevance to how the system performs as a whole, as he points out, and can be thought of as a test of scheduling speed. We knew this was pretty poor in Linux, and Ingo's O(1) scheduler wasn't tested in this article (Next Generation POSIX Threads are also not tested, but they're not really stable yet).
There is little reason to include servers, even graphics servers, in the kernel, as this would bloat it horribly and lead to many more crashes. Frequently on desktop systems there are only a handful of running processes/threads, so X11 can respond within a reasonable time.
BeOS also had the graphics server outside the kernel, and it responded very well. Evidently the problem is not in process/context switching.
There's a lot of context switching: there are at least three processes involved (the app, the X server, and the window manager), and this affects performance. Having a context switch each time you, for example, draw a line is not good.
I would also like to point out that XFree86 uses sockets (which are slow and asynchronous) instead of pipes: http://rikkus.info/sysv_ipc_vs_pipes_vs_unix_sockets.html. In-kernel messaging should be even faster than pipes.
And why do you say that having a graphics system inside the kernel could lead to more crashes? Linux isn't Win95; Linux is a monolithic kernel and has all kinds of stuff in there. The network stack is in there; when was the last time it led to a crash? And you already have the DRI and framebuffer drivers there, so what's the problem?
The only valid reason I can think of not to have a graphics system in the kernel would be the loss of network transparency, but you don't need that for desktop use.
having a context switch each time you draw a line is not good … in-kernel messaging should be even faster than pipes
Undoubtedly. There was an article somewhere examining precisely how many context switches are involved in a typical button press; I think the figure was about 10… but, from the evidence in front of me, it's fast enough. With DRI and MIT-SHM it's really a moot point anyway, as no context switch is involved in drawing a line to shared memory.
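To illustrate the MIT-SHM point, a hypothetical sketch (the helper names make_shm_image and blit are my own, and error handling is omitted): the client allocates a System V shared memory segment, the server attaches to the same segment, and pixels are then written with plain memory stores; only the final blit request crosses the client/server boundary.

    /* Build: gcc -O2 shm.c -o shm -lXext -lX11 */
    #include <X11/Xlib.h>
    #include <X11/extensions/XShm.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    XImage *make_shm_image(Display *dpy, int w, int h, XShmSegmentInfo *info) {
        int screen = DefaultScreen(dpy);
        XImage *img = XShmCreateImage(dpy, DefaultVisual(dpy, screen),
                                      DefaultDepth(dpy, screen), ZPixmap,
                                      NULL, info, w, h);
        info->shmid = shmget(IPC_PRIVATE, img->bytes_per_line * img->height,
                             IPC_CREAT | 0600);
        info->shmaddr = img->data = shmat(info->shmid, NULL, 0);
        info->readOnly = False;
        XShmAttach(dpy, info);     /* the server maps the same segment */
        return img;
    }

    /* The client scribbles into img->data directly (no protocol traffic,
       no context switch); one XShmPutImage then blits it on-screen. */
    void blit(Display *dpy, Window win, GC gc, XImage *img, int w, int h) {
        XShmPutImage(dpy, win, gc, img, 0, 0, 0, 0, w, h, False);
        XFlush(dpy);
    }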
And why do you say that having a graphics system inside the kernel could lead to more crashes?
Just from a probabilistic standpoint, more code almost certainly means more bugs. At the moment it's unlikely that X11 crashing would kill the kernel (although the keyboard and mouse may become unresponsive), and the situation can be fixed over a terminal or network login.
To summarise, the reasons why it is 'a bad thing' to have the graphics inside the kernel:
1) More code (probably) = more bugs, and in kernel space bugs are very, very bad.
2) Flexibility: network transparency, display transparency, multiple servers, and nested servers all become much more difficult.
3) I disagree that a kernel-based server would be significantly faster. I'm not saying that X11 is great; look to BeOS instead. (See the shared memory comment.)
Would you guys mind providing a link where a non-Linux type of guy like me can read up on "Ingo's O(1) scheduler" you keep mentioning? A doc or abstract, please, not (only) source.
Start from here and here and follow the links:
http://kerneltrap.org/node.php?id=517
http://kerneltrap.org/node.php?id=422
Please stop spreading FUD. Thx.
UNIX sockets are as fast as pipes, or even faster. Kernel messaging brings no improvement over good UNIX sockets because the mechanism is the same.
I really wonder where your idea comes from that there is a context switch for each line. It may be true for GDI, but today even MS offers better APIs to improve drawing performance. It seems to me that you know nothing about X11.
UNIX sockets are as fast as pipes, or even faster
I don't know about "faster", but the speed difference is probably negligible.
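It's easy enough to check; a minimal sketch (the helper bench() is my own) that ping-pongs one byte between a parent and a forked child, first over a pair of pipes and then over an AF_UNIX socketpair:

    /* Build: gcc -O2 ipc.c -o ipc */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <time.h>
    #include <unistd.h>

    #define ITERS 50000

    /* parent writes then reads; the forked child echoes (reads then writes) */
    static double bench(int a_rd, int a_wr, int b_rd, int b_wr) {
        char buf = 'x';
        pid_t pid = fork();
        if (pid == 0) {
            for (int i = 0; i < ITERS; i++) {
                read(b_rd, &buf, 1);
                write(b_wr, &buf, 1);
            }
            _exit(0);
        }
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ITERS; i++) {
            write(a_wr, &buf, 1);
            read(a_rd, &buf, 1);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        waitpid(pid, NULL, 0);
        return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / ITERS;
    }

    int main(void) {
        int p1[2], p2[2], sv[2];
        pipe(p1); pipe(p2);                       /* one pipe per direction */
        socketpair(AF_UNIX, SOCK_STREAM, 0, sv);  /* bidirectional */
        printf("pipes:      %.0f ns/round trip\n", bench(p2[0], p1[1], p1[0], p2[1]));
        printf("socketpair: %.0f ns/round trip\n", bench(sv[0], sv[0], sv[1], sv[1]));
        return 0;
    }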
I really wonder where your idea comes from that there is a context switch for each line.
In the worst case, if you XDrawLine() a single line and then XSync(), there are many context switches, but you can speed things up dramatically (indeed, Xlib does this automatically by buffering requests) by passing blocks of commands at once.
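A hypothetical sketch of the difference (it assumes an already open Display, Window, and GC; the function names are mine):

    /* Build: gcc -O2 lines.c -o lines -lX11 */
    #include <X11/Xlib.h>

    void draw_batched(Display *dpy, Window win, GC gc, int n) {
        for (int i = 0; i < n; i++)
            XDrawLine(dpy, win, gc, 0, i, 100, i);  /* queued in Xlib's buffer */
        XFlush(dpy);   /* one write to the server, no reply wait */
    }

    void draw_slow(Display *dpy, Window win, GC gc, int n) {
        for (int i = 0; i < n; i++) {
            XDrawLine(dpy, win, gc, 0, i, 100, i);
            XSync(dpy, False);  /* full round trip: context switches per line */
        }
    }

The worst case only arises if you force a round trip per request, as draw_slow does; left to itself, Xlib batches the whole loop into a single flush.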
As we said before: for the case of shared memory this argument is moot, and likewise for 3D games and anything else using DRI.