Linked by JRepin on Mon 29th Apr 2013 09:24 UTC
After ten weeks of development, Linus Torvalds has announced the release of Linux kernel 3.9. The latest version of the kernel now has a device mapper target which allows a user to set up an SSD as a cache for hard disks to boost disk performance under load. There's also kernel support for multiple processes waiting for requests on the same port, a feature which will allow it to distribute server work better across multiple CPU cores. KVM virtualisation is now available on ARM processors, and RAID 5 and 6 support has been added to Btrfs's existing RAID 0 and 1 handling. Linux 3.9 also has a number of new and improved drivers, which means the kernel now supports the graphics cores in AMD's next generation of APUs and also works with the high-speed 802.11ac Wi-Fi chips which will likely appear in Intel's next mobile platform. Read more about new features in What's new in Linux 3.9.
Permalink for comment 560287
RE[11]: Load of works there
by Brendan on Thu 2nd May 2013 02:07 UTC in reply to "RE[10]: Load of works there"

On older processors it used to be a couple hundred cycles like in the link I supplied. I'm not sure how much they've brought it down since then. Do you have a source for the 50 cycles stat?

I think we're talking about different things. The cost of a "bare" software interrupt or call gate is around 50 cycles, but the benchmarks from your link are probably measuring the bare syscall plus an assembly language "stub" plus a call to a C function (and prologue/epilogue) plus another call (possibly via a table of function pointers) to a minimal "do nothing" function.

"Note that this applies to both micro-kernels and monolithic kernels - they both have the same user space to kernel space context switch costs."

While technically true, the monolithic kernel doesn't need to context switch between modules like a microkernel does. That's the reason microkernels are said to be slower. The microkernel context switches can be reduced by using non-blocking messaging APIs; this is what I thought you were already suggesting earlier, no?

For the overhead of privilege level switches, and for the overhead of switching between tasks/processes, there's no real difference between micro-kernel and monolithic.

Micro-kernels are said to be slower because privilege level switches and switching between tasks/processes tend to happen more often; not because the overhead is higher.
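
To make that arithmetic concrete, here is a minimal sketch. Only the ~50 cycle figure comes from the discussion above; the switch counts are made-up illustrative numbers, and `total_overhead` is a hypothetical helper, not a measurement of any real kernel. It also lumps privilege level switches and task switches into one per-switch cost, which is a simplification.

```python
# Illustrative model only: the switch counts below are hypothetical.
BARE_SWITCH_CYCLES = 50  # rough "bare" switch cost quoted in the discussion


def total_overhead(switches_per_request: int,
                   cycles_per_switch: int = BARE_SWITCH_CYCLES) -> int:
    """Total switching overhead for one request, in cycles."""
    return switches_per_request * cycles_per_switch


# Assumed request: the monolithic kernel is entered and left once (2 switches),
# while the micro-kernel version bounces through two user-space servers
# (assumed 6 switches). The per-switch cost is identical in both cases.
monolithic = total_overhead(2)
microkernel = total_overhead(6)

print(monolithic, microkernel)
```

The per-switch cost is the same on both sides; only the count differs, which is the whole point.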

"Agreed. The other thing I'd mention is that asynchronous messaging can work extremely well on multi-core; as the sender and receiver can be running on different CPUs at the same time and communicate without any task switches at all."

On the other hand, whenever I benchmark things like this I find that the cache-coherency overhead is a significant bottleneck for SMP systems, such that a single processor can often do better with IO-bound processes. SMP is best suited for CPU-bound processing where the ratio of CPU processing to inter-core IO is relatively high. Nothing is ever simple, huh?

That'd be true regardless of how threads communicate. The only way to reduce the cache-coherency overhead is to build a more intelligent scheduler (e.g. make threads that communicate a lot run on CPUs that share the same L2 or L3 cache).
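
As a rough sketch of what "run communicating threads on CPUs that share a cache" could look like, assuming a made-up four-CPU topology: `CACHE_OF_CPU`, `place_pairs` and the thread names are all hypothetical, and a real scheduler would have to weigh load and priorities as well.

```python
from collections import defaultdict

# Hypothetical topology: CPU id -> id of the L2/L3 cache that CPU shares.
CACHE_OF_CPU = {0: 0, 1: 0, 2: 1, 3: 1}


def place_pairs(pairs, cache_of_cpu):
    """Greedily put each communicating thread pair on two free CPUs
    that share a cache. Returns {thread_name: cpu}; pairs that cannot
    be co-located are simply left unplaced."""
    free = defaultdict(list)
    for cpu, cache in sorted(cache_of_cpu.items()):
        free[cache].append(cpu)

    placement = {}
    for a, b in pairs:
        for cpus in free.values():
            if len(cpus) >= 2:
                placement[a] = cpus.pop(0)
                placement[b] = cpus.pop(0)
                break
    return placement


placement = place_pairs(
    [("producer", "consumer"), ("logger", "writer")], CACHE_OF_CPU
)
print(placement)
```

From user space on Linux, a placement like this could be applied per process with Python's `os.sched_setaffinity`; inside a kernel the same idea would live in the scheduler itself.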

"Ironically; for modern kernels (e.g. both Linux and Windows) everything that matters (IO) is asynchronous inside the kernel."

With Linux, file IO uses blocking threads in the kernel; all the FS drivers use threads. These are less scalable than async designs since every request needs a kernel stack until it returns. The bigger problem with threads is that they're extremely difficult to cancel asynchronously. One cannot simply "kill" a thread just anywhere; there could be side effects like locked mutexes, incomplete transactions and corrupt data structures. Consequently, most FS IO requests are not cancelable on Linux. In most cases this isn't observable because most file IO operations return quickly enough, but there are very annoying cases from time to time (most commonly with network shares) where we cannot cancel the blocked IO or even kill the process. We are helpless; all we can do is wait for FS timeouts to elapse.

It's difficult to justify the amount of work that'd be needed to fix these abnormal cases. I'd rather push for a real async model, but that's not likely to happen given the immense scope such a patch would entail.
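
A user-space analogy of why an async model makes cancellation tractable, sketched with Python's `asyncio`. This is not how the Linux kernel works, and `slow_read` is a hypothetical stand-in for a stalled request. The point is that cancellation arrives only at a well-defined suspension point, so cleanup can run there instead of a thread being killed "just anywhere".

```python
import asyncio


async def slow_read(log):
    """Stand-in for an IO request that never completes
    (e.g. a dead network share)."""
    try:
        await asyncio.sleep(3600)  # hypothetical stalled request
        return "data"
    except asyncio.CancelledError:
        # Cancellation is delivered at the await above, so locks can be
        # released and transactions rolled back at a known point.
        log.append("cleanup")
        raise


async def main():
    log = []
    task = asyncio.create_task(slow_read(log))
    await asyncio.sleep(0.01)  # let the request start
    task.cancel()              # ask for cancellation instead of waiting
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return log


result = asyncio.run(main())
print(result)
```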

Sadly, it's easier to keep adding extensions on top of extensions (and end up with an ugly mess that works) than it is to start again with a new/clean design; even when a new/clean design would reduce the total amount of work and improve the quality of the end result in the long run. Most people are too short-sighted for that - they only look at the next few years rather than the next few decades.

- Brendan

Score: 2