Linked by Thom Holwerda on Tue 10th Jul 2007 21:57 UTC, submitted by maverick
Linux The Linux kernel process scheduler, as you know it, has been completely ripped out and replaced with a completely new one called Completely Fair Scheduler. How fair it will be, remains to be seen. Here's what its original creator Ingo Molnar says: "80% Of CFS's design can be summed up in a single sentence: CFS basically models an 'ideal, precise multi-tasking CPU' on real hardware." Learn more about the new scheduler from the CFS design document.
Permalink for comment 254415
To read all comments associated with this story, please click here.
RE[2]: Good
by butters on Wed 11th Jul 2007 01:30 UTC in reply to "RE: Good"
butters
Member since:
2005-07-08

Kinda hard to schedule anything if you can't keep track of time, right? CFS doesn't use jiffies or any other HZ-dependent timescale. But it does keep track of time in terms of nanoseconds.

The theory behind CFS is that each thread runnable on a given CPU is due an equal share of its time. As the CPU runs a particular thread, the scheduler deducts from its time. The thread that is due the most time is the one that deserves to run on the CPU.

That's the 80%. The first caveat is scheduler granularity, the only tunable in CFS. Although CFS does away entirely with the notion of timeslices, the granularity represents the amount of time that a thread may run before a more deserving thread is eligible to preempt it. Of course, a thread can yield the CPU before this period expires. Decreasing this value favors interactivity and increasing it favors throughput.

Another caveat is priorities. Threads may have various priority levels, which entitle them to more or less than an equal share of CPU time. As we know, some threads are more equal than others. Priorities in CFS are static, user-defined levels. There are no runtime heuristics in CFS. Threads wake up with the same level of unfairness as when they went to sleep.

Then there's scheduler groups, which can contain multiple threads and (potentially) scheduler groups. At any level of the resulting tree, scheduler groups get the same fair share as a single thread, no matter how many threads they contain. Threads within a scheduler group get their fair share of the group's fair share. This can be used to implement fairness among users or processes running various numbers of threads, for example.

The CFS commit introduces the notion of scheduler modules, which allows multiple schedulers to run in series. The current implementation runs the realtime scheduler first to potentially select a runnable realtime thread, and failing that, runs the CFS scheduler to select a normal thread. The list of modules is currently static, but in the future, the order of the the modules might be managed dynamically. Scheduler modules implement a clean and simple API.

CFS maintains the per-CPU scheduler design central to the Linux 2.6 architecture, but does away with runqueues and the amortized O(1) algorithms. The "runqueues" in CFS are red-black trees, a kind of balanced search tree. Scheduling involves selecting the left-most thread in the tree, which is an O(log N) operation. It remains to be seen whether O(1) is actually faster in terms of wall-clock than CFS, and it is suggested that the scheduling improvement on heavily-loaded systems will be worth any additional overhead.

Ingo Molnar developed CFS in response to the problems with Con Kolivas' popular Staircase Deadline scheduler. After some heated debate and Ingo's attribution of key CFS ideas inspired by SD, Con essentially conceded that CFS would become the next Linux process scheduler. Ingo definitely has a knack for taking good ideas to their fruition, as he did the same thing assembling the ideas of Robert Love and others into what became the O(1) scheduler. Ingo also created a prototype for CFS that included a "scheduler economy," which allows threads to literally trade CPU time on an open market.

Edited 2007-07-11 01:32

Reply Parent Bookmark Score: 5