Fair User Scheduling for Linux

Submitted by JohnnyUtah 2007-10-25 Linux 20 Comments

The Completely Fair Scheduler was merged for the 2.6.23 kernel. One CFS feature which did not get in, though, was the group scheduling facility. Group scheduling makes the CFS fairness algorithm operate in a hierarchical fashion: processes are divided into groups, and, within each group, processes are scheduled fairly against one another. At the higher level, each group as a whole is given a fair share of the processor. The grouping of processes is done in user space in a highly flexible manner; the control groups (formerly ‘process containers’) mechanism allows a management daemon to classify processes according to almost any policy.
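To make the two-level idea concrete, here is a rough sketch of the arithmetic only, not of the kernel's implementation. It assumes a single CPU, processes that are always runnable, and equal weights everywhere; the process and group names are made up for illustration.

```python
from collections import defaultdict


def flat_shares(procs):
    """Without grouping: every runnable process gets an equal slice of the CPU."""
    return {name: 1.0 / len(procs) for name, _group in procs}


def grouped_shares(procs):
    """With grouping: split the CPU equally between groups,
    then equally between the processes inside each group."""
    groups = defaultdict(list)
    for name, group in procs:
        groups[group].append(name)
    shares = {}
    for members in groups.values():
        group_share = 1.0 / len(groups)
        for name in members:
            shares[name] = group_share / len(members)
    return shares


# Hypothetical workload: alice runs one task, bob runs nine.
procs = [("alice-editor", "alice")] + [(f"bob-make-{i}", "bob") for i in range(9)]

flat = flat_shares(procs)
grouped = grouped_shares(procs)
print(f"flat:    alice-editor gets {flat['alice-editor']:.1%}")
print(f"grouped: alice-editor gets {grouped['alice-editor']:.1%}")
```

In this toy model alice's single task goes from a tenth of the CPU to half of it once processes are grouped per user, while bob's nine tasks split the other half between them.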

About The Author

Thom Holwerda

20 Comments

2007-10-25 10:35 am

baadger
I’ve noticed a difference in 2.6.24-rc1 with regard to how applications ‘feel’ under heavy load. As a Gentoo user, whenever I emerged (compiled) something, other apps were always fine; now they seem a bit less responsive.

On the other hand, UT2004 seems to have more consistent frame rates and less jerkiness and when the system isn’t under intense load everything feels wonderfully snappy.

Is this all in my head? Maybe. It’s rather hard to tell; it could be a bug this early in the development cycle, or the result of dynticks on x86_64, or a hundred other things new in 2.6.24.

Interestingly, the summary reminds me quite a bit of Linux’s TCP/IP QoS implementation, which allows you to classify packets into classes and subclasses and choose from various algorithms (such as “Stochastic Fair Queuing” or “Hierarchical Token Bucket”) to manage bandwidth on each of them.

Edited 2007-10-25 10:36
2007-10-25 10:35 am

evert
Further development of this functionality would enable sysadmins to assign higher priority to certain groups of processes that must always be very responsive, like a “webserver” group, including both the (My)SQL database part and the httpd part.

For users, it would be possible to give lower priority to all processes that are running in the background (minimized or on another screen) at once. This could even be automated by a window manager and would result in a more responsive desktop.

2007-10-25 1:06 pm

Ford Prefect
First, system daemons could/should be niced already if needed.

Next, your user-side use case lacks substance. GUI processes that are minimized or on another desktop are usually sleeping anyway. If they are not, for example a music player or a CD-burning application, it could even be fatal to give them bad priority.

It’s not as easy as it looks. On earlier Windows versions (don’t know the current ones) you could give a hint to the scheduler: Equal rights for every process or priority given to the current “foreground app”. I always chose the former…

2007-10-25 3:22 pm

evert
Thanks, I forgot about such things as audio players running in the background 😉 You are absolutely right.

2007-10-25 1:39 pm

siki_miki
For a single-user system this would mean that a game (run in a separate group) gets more timeslice than a background daemon, which is the wanted behavior (control groups aren’t merged yet, though).

Also, I hope it’s possible to integrate something like a guaranteed timeslice mechanism (similar to MMCSS on Windows), which is a useful feature for player apps. The simplest form would be to reserve up to, e.g., a fixed % of CPU time for a process or group, to be used if they want it. Group scheduling seems like one way to do this: implement something like a fixed CPU % group quota and put apps into it.
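A toy sketch of that quota idea, purely as arithmetic: it assumes a single CPU, reservations that add up to no more than 100%, and hypothetical group names. It does not describe MMCSS or any existing Linux interface.

```python
def allocate(demand, reserved):
    """demand:   group -> fraction of the CPU it would like (0.0 to 1.0)
    reserved: group -> guaranteed fraction (assumed to sum to at most 1.0)
    Returns group -> fraction of the CPU actually granted."""
    # Step 1: honour each reservation, but only up to what the group wants.
    alloc = {g: min(demand[g], reserved.get(g, 0.0)) for g in demand}
    spare = 1.0 - sum(alloc.values())

    # Step 2: hand the spare capacity out evenly among groups that still
    # want more, repeating when a group is satisfied before its slice is used.
    unmet = {g: demand[g] - alloc[g] for g in demand if demand[g] > alloc[g]}
    while unmet and spare > 1e-12:
        slice_ = spare / len(unmet)
        for g in list(unmet):
            give = min(slice_, unmet[g])
            alloc[g] += give
            spare -= give
            unmet[g] -= give
            if unmet[g] <= 1e-12:
                del unmet[g]
    return alloc


# Hypothetical groups: a media player with a 30% reservation, two CPU hogs.
reserved = {"player": 0.30}
light = allocate({"player": 0.20, "compile": 1.0, "backup": 1.0}, reserved)
heavy = allocate({"player": 1.00, "compile": 1.0, "backup": 1.0}, reserved)
print(light)
print(heavy)
```

When the player only wants 20% it gets exactly that and the hogs split the rest; when it wants everything, it never drops below its 30% reservation.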

After all, the ability to give a guaranteed timeslice (while taking care of time resolution as well) is generally a requirement for realtime systems.

A reminder, though, that Vista has huge problems with MMCSS, as it seems to starve the network stack. That kind of situation is probably possible to avoid, i.e. by making a better implementation than their obviously rushed and immature scheduling and/or net stack.

Edited 2007-10-25 13:45
2007-10-25 2:58 pm

nulleight
Since 2.6.23 (when the new scheduler was released) I can play ET: Quake Wars and compile something in the background without any frame drop (I have a 120 fps limit anyway). I can even unpack stuff in the background at the same time (which is heavy on I/O and CPU) and I still get decent fps while gaming. I’d say it’s much better than 2.6.22, where the framerate varied greatly and the gameplay became very laggy with the same workload. It is even better than what Windows does: try to 7zip something while playing a game. I think this is a great improvement.
2007-10-25 6:37 pm

Oliver
Well, well, according to Kris Kennaway (a FreeBSD developer), CFS is completely fair to FreeBSD. Well done indeed *g*

http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf, page 19

2007-10-25 10:49 pm

adkk
Oh really.. must have been great for the FreeBSD guys’ egos. Just take a look at the 6.2 scores, they are ridiculous. Linux has been ahead (performance wise) for so many years and now that FreeBSD finally (remember 7.0 isn’t out yet) got something decent they cannot resist.. well 🙂

But please fanboys, keep the following in mind:

1. the latest development code of the scheduler (most of which was merged for 2.6.24) already had some improvements.

2. Ingo already committed a patch to improve performance further, see http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=…

3. Just read the FreeBSD-performance list, there are still cases where the “old” 4BSD scheduler performs better. Remember that CFS is still quite new. It’ll match the performance of the old scheduler in time, let’s see again what happens when 7.0 is out.

2007-10-26 1:20 am

Redeeman
I also seem to remember, not too long ago, some FreeBSD benchmarks where the FreeBSD dude was deliberately using some software versions that had bugs when compiled on Linux, plus various misconfigurations, and where, when actually done properly, Linux beat the living crap out of BSD.

2007-10-26 10:12 am

Oliver
Maybe you should clean up your mind first.

1. some heavy bugs in Linux, discovered thanks to these benchmarks

2. proper configuration from the beginning with the help of some Linux developers

3. even today CFS is sometimes inferior to the new FreeBSD scheduler

4. just a note: NetBSD current beats Linux too, it’s no miracle just proper software engineering

>where when actually done properly, linux beat the living crap out of bsd

Thank god the members of LKML aren’t such zealots 🙂

2007-10-26 10:21 am

Redeeman
No, the case I am remembering was clearly a bug in the database software, which actually was fixed in a later release.

So YOU should clear YOUR mind..

And btw, CFS may not provide as high throughput, but that is the price of increased interactivity. And while CFS is not as good as it could be (as it hasn’t gotten as good as SD), it’s still a lot better interactivity-wise than what we had before.
2007-10-26 6:55 pm

adkk
> 1. some heavy bugs in Linux, discovered thanks to these benchmarks

Yes, that’s true.

> 2. proper configuration from the beginning with the help of some Linux developers

Half true. In his first benchmark Jeff was using an older MySQL version on Linux and a newer one on FreeBSD. He also wasn’t using the latest development version of Linux, only the stable releases.

> 3. even today CFS is sometimes inferior to the new FreeBSD scheduler

That’s true and I didn’t deny that. I only said that the latest version of CFS got some improvements.

> 4. just a note: NetBSD current beats Linux too, it’s no miracle just proper software engineering

Link please? I read @tech-kern, but they were using Linux 2.6.21 and an older glibc version (which had the malloc bug).

> Thank god the members of LKML aren’t such zealots

Well, everything I said was that the latest version of CFS has some improvements (which is true) and that FreeBSD 6.2 and 5.5 don’t scale at all (also true). So who is the zealot? ;D
2007-10-27 4:12 pm

sbergman27
Yes, I’ve noticed that when it comes to SMP related issues some FreeBSD fans like to speak of 7.0 as if it were already a stable release… and yes, compare their unreleased, bleeding edge to Linux’s stable releases. That should probably not be surprising since, as the document you linked to which was written by a FreeBSD dev shows, all currently released FreeBSD versions suck pretty badly in this area. That being the case, I find it a bit amusing that they then turn around and accuse the Linux camp, which currently blows FreeBSD away on SMP performance if one compares officially released stable versions, of caring about hype more than solid engineering.

It’s true that the Linux camp has done a much better job developing “mind-share”, which I believe is what these people are mistaking for “hype”.

But that mind-share is what makes the difference between a platform being usable as the core of my customers’ infrastructures, and not. It’s not the 90% of stuff that they need that is supported that matters. It’s the 10% that isn’t.

It’s challenging enough making Linux do all the things they need, including heavy duty business accounting and shop floor control. FreeBSD is just not an option. And the difference comes down to the level of mind-share that Linux vs the BSDs have in the industry.

I’ve nothing against the BSDs. I’m fine with permissive licensing. And I was a Unix advocate for 8 years before I had even *heard* of Linux. But reality is reality. And I have to use what *works* for my clients. BSD could greatly benefit from some marketing savvy.

2007-10-26 10:05 am

Oliver
>Just read the FreeBSD-performance list, there are still cases where the “old” 4BSD scheduler performs better.

Just read it first before trolling around! Some bugs, nothing more, nothing less.

>Remember that CFS is still quite new.

Remember this for FreeBSD too.

>the latest development code of the scheduler (most of which was merged for 2.6.24) already had some improvements.

Dear Linux zealot, the benchmarks were even in discussion on the LKML and led to some positive development (your nice patches for CFS).

>Linux has been ahead (performance wise) for so many years

Yeah maybe in your very dreams. At high load Linux has sucked for so many years (don’t mention the 2.4 crap at all), even today with the latest CFS. Linux generated some hype about peaks and couldn’t deliver a stable environment in terms of performance at high load. Linux is working with hype and error permissiveness, *BSD is working with quality and reliability in mind. So next time do your homework first. Btw. for all of these benchmarks always the latest patches were used, sometimes with support of the Linux community from LKML.

2007-10-26 6:50 pm

adkk
> Just read it first before trolling around! Some bugs, nothing more, nothing less.

Well, I guess you could say the same about CFS then? That there are still some bugs and once they get fixed it’ll perform better.

> Dear Linux zealot, the benchmarks were even in discussion on the LKML and led to some positive development (your nice patches for CFS).

Not quite true. Someone posted Jeff’s latest results on LKML (Jeff was using 2.6.23) and then Ingo redid the benchmark with the latest development code for CFS and the results were better.

> Yeah maybe in your very dreams.

Please take a look at this:

http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf

As you can see at page 11, FreeBSD 5.5 didn’t scale at all, version 6.2 did a little better, but only for very few threads. 6.2 is still the stable version and Jeff published his first benchmarks of the new ULE scheduler in the beginning of 2007.

> Linux is working with hype and error permissiveness, *BSD is working with quality and reliability in mind.

Yeah right… the old urban legends again. Linux is all crap and BSD is pure quality!

> Btw. for all of these benchmarks always the latest patches were used

You are wrong again. Jeff was using 2.6.23 and testing it against bleeding-edge FreeBSD, even though Ingo’s development branch of CFS had lots of improvements.

2007-10-25 7:48 pm

tyrione
Debian still doesn’t have 2.6.23 even in Experimental.

2007-10-25 8:11 pm

Oliver
Of course, they are always somewhat behind

2007-10-26 1:54 am

adkk
Well, the problem with this particular benchmark is another thing. Who has access to an 8-way SMP machine? I guess most of the core kernel hackers (which includes Ingo) work from home. So he has to call someone at Red Hat to test his patches on a big machine. Usually you have to reserve your spot in advance (it was like this when I did an internship at IBM). At least that’s what I would think. So it’s not that easy to validate the benchmarks.

2007-10-26 4:32 am

bnolsen
$1500 and you can build one with 8GB RAM.

$2000 and you can buy one from Dell with 8GB RAM.

That’s the price of desktop machines 10 years ago.
2007-10-26 9:55 am

Oliver
Most Linux kernel hackers may be working at home, but they are working for a company (more than 60% of the kernel development is done by companies). Last but not least, the benchmarks were in (positive) discussion even at the LKML. It’s obvious that most people in these comments don’t have any clue what they’re talking about, so it will always end in a flamewar.