Linked by Thom Holwerda on Tue 10th Jul 2007 21:57 UTC, submitted by maverick
Linux The Linux kernel process scheduler, as you know it, has been completely ripped out and replaced with a completely new one called Completely Fair Scheduler. How fair it will be, remains to be seen. Here's what its original creator Ingo Molnar says: "80% Of CFS's design can be summed up in a single sentence: CFS basically models an 'ideal, precise multi-tasking CPU' on real hardware." Learn more about the new scheduler from the CFS design document.
Thread beginning with comment 254375
To view parent comment, click here.
To read all comments associated with this story, please click here.
RE: Good
by maverick on Tue 10th Jul 2007 22:25 UTC in reply to "Good"
maverick
Member since:
2006-12-29

Hm, but I don't see much connection between jiffies and scheduler. Jiffies are for timekeeping, and scheduler for scheduling, right?

Would you care to elaborate?

Reply Parent Bookmark Score: 1

RE[2]: Good
by diegocg on Tue 10th Jul 2007 22:47 in reply to "RE: Good"
diegocg Member since:
2005-07-08

I think he meant a completely tickless kernel (right now the tickless feature only works for idling, the goal would be to run tickless 100% of the time)

Reply Parent Bookmark Score: 5

RE[2]: Good
by Wes Felter on Tue 10th Jul 2007 23:10 in reply to "RE: Good"
Wes Felter Member since:
2005-11-15

IIRC, the scheduler used to calculate times (such as thread quanta and accumulated time) in units of jiffies, while now time is calculated in physical units (ns).

Reply Parent Bookmark Score: 4

RE[2]: Good
by butters on Wed 11th Jul 2007 01:30 in reply to "RE: Good"
butters Member since:
2005-07-08

Kinda hard to schedule anything if you can't keep track of time, right? CFS doesn't use jiffies or any other HZ-dependent timescale. But it does keep track of time in terms of nanoseconds.

The theory behind CFS is that each thread runnable on a given CPU is due an equal share of its time. As the CPU runs a particular thread, the scheduler deducts from its time. The thread that is due the most time is the one that deserves to run on the CPU.

That's the 80%. The first caveat is scheduler granularity, the only tunable in CFS. Although CFS does away entirely with the notion of timeslices, the granularity represents the amount of time that a thread may run before a more deserving thread is eligible to preempt it. Of course, a thread can yield the CPU before this period expires. Decreasing this value favors interactivity and increasing it favors throughput.

Another caveat is priorities. Threads may have various priority levels, which entitle them to more or less than an equal share of CPU time. As we know, some threads are more equal than others. Priorities in CFS are static, user-defined levels. There are no runtime heuristics in CFS. Threads wake up with the same level of unfairness as when they went to sleep.

Then there's scheduler groups, which can contain multiple threads and (potentially) scheduler groups. At any level of the resulting tree, scheduler groups get the same fair share as a single thread, no matter how many threads they contain. Threads within a scheduler group get their fair share of the group's fair share. This can be used to implement fairness among users or processes running various numbers of threads, for example.

The CFS commit introduces the notion of scheduler modules, which allows multiple schedulers to run in series. The current implementation runs the realtime scheduler first to potentially select a runnable realtime thread, and failing that, runs the CFS scheduler to select a normal thread. The list of modules is currently static, but in the future, the order of the the modules might be managed dynamically. Scheduler modules implement a clean and simple API.

CFS maintains the per-CPU scheduler design central to the Linux 2.6 architecture, but does away with runqueues and the amortized O(1) algorithms. The "runqueues" in CFS are red-black trees, a kind of balanced search tree. Scheduling involves selecting the left-most thread in the tree, which is an O(log N) operation. It remains to be seen whether O(1) is actually faster in terms of wall-clock than CFS, and it is suggested that the scheduling improvement on heavily-loaded systems will be worth any additional overhead.

Ingo Molnar developed CFS in response to the problems with Con Kolivas' popular Staircase Deadline scheduler. After some heated debate and Ingo's attribution of key CFS ideas inspired by SD, Con essentially conceded that CFS would become the next Linux process scheduler. Ingo definitely has a knack for taking good ideas to their fruition, as he did the same thing assembling the ideas of Robert Love and others into what became the O(1) scheduler. Ingo also created a prototype for CFS that included a "scheduler economy," which allows threads to literally trade CPU time on an open market.

Edited 2007-07-11 01:32

Reply Parent Bookmark Score: 5

RE[3]: Good
by hechacker1 on Wed 11th Jul 2007 01:44 in reply to "RE[2]: Good"
hechacker1 Member since:
2005-08-01

CFSv19 has many tunables in /proc/sys/kernel besides ns_granularity... Initially yes, it only had one like SD, but like I've previously posted, Ingo has been adding "smoothness" calculations and other "hand tuning" and also exposing more tunables. His "economy" based ideas (actually ideas of a college student) will only provide more heuristics ultimately leading to built-in unfairness. I don't think the economy based ideas made it into CFS.

A Completely Fair Scheduler shouldn't have heuristics if it should be truly fair. Initially both SD and CFS suffered from built in heuristics. SD has fixed those problems. CFS is still being tuned.

Perhaps these extra tunables are temporary until optimal values are found.. but it would seem that SD is better in this regard (inherently more fair by design).

Initially RSDL did have some problems, but they were fixed in (r)SDL -> SD. Part in thanks to Ingo and other developers giving Con some ideas.

btw, in CFSv19, most operations are O(1) time. Ingo's design document is out of date.

Edited 2007-07-11 01:54

Reply Parent Bookmark Score: 5