Book Review: Understanding the Linux Virtual Memory Manager

Guest post by Can Sar, 2004-06-30

Virtual memory is one of the most important subsystems of any modern operating system. It is deeply intertwined with user processes, protection between processes, protection of the kernel from user processes, efficient shared memory, communication with I/O devices (DMA, etc.), paging, swapping, and countless other systems. Understanding the VM subsystem greatly helps in understanding how all other parts of the kernel work and interact. Because of this, “Understanding the Linux Virtual Memory Manager” is a great guide to better understanding and working with the entire kernel.

The book is written in a very precise technical style, and Mel Gorman explains things clearly, if somewhat dryly. Readers are expected to know common hardware and OS terms, and prior knowledge of the kernel helps. Be aware that this is not a book for somebody who has no prior understanding of operating systems or who just wants to understand the basics of what is going on. If, however, you want to really understand how modern operating systems handle memory, you should buy this book immediately. There are almost no other books on memory in Linux, and none of comparable quality.

Rather than giving a rough overview of the kernel and then focusing on individual subsystems, Gorman immediately dives into how physical memory is managed and works his way up from there. This approach works quite well and is consistent with the no-nonsense, no-fluff style of the book, but it can make for a difficult start for beginners. Several terms, such as buddy allocator and slab allocator, are mentioned early on without being explained at the time. These concepts are of course explained in great detail later on, but somebody who has never heard of them will initially be confused. If this is true for you, you might want to skim some of the introductory kernel websites and books first. Otherwise, the approach lets you start understanding the VM immediately, rather than rehashing simple concepts.

Virtual memory is one of the kernel subsystems that interfaces most heavily with the hardware, so many things depend on how each instruction set architecture implements a feature. In order to concentrate on the VM and not get bogged down in hardware issues, the book focuses on the x86 and occasionally points out how things work differently on other architectures. It would, however, have been nice if some specific details and oddities of the x86 had been explained early on; High Memory, for example, remained confusing throughout the book until it was covered in great detail towards the end.

Even experts should be satisfied with the amount of detail Gorman goes into. For instance, many aspects of the implementation of nodes, which matter mostly on NUMA (Non-Uniform Memory Access) machines, are described extensively. The author wrote his master's thesis on the Linux virtual memory architecture, and it shows. I found the chapter on Process Address Space particularly important, even for readers not immediately concerned with the entire VM, because it describes the implementation of the user address space, which is key to understanding how the kernel implements user processes. Kernel programmers (even people only interested in straightforward tasks such as writing drivers) will be very interested in the chapters on Physical Page Allocation and Noncontiguous Memory Allocation, as well as the Slab Allocator, which allows automatic reuse of “objects” (structures in memory, unrelated to objects in the OO sense). The explanation of the Slab Allocator also illustrates why using it can be preferable to simply calling kmalloc.
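
To make the idea of object reuse concrete, here is a minimal userspace sketch. It is not taken from the book or the kernel: it assumes a hypothetical `toy_cache` that keeps released objects on a free list and hands them back on the next allocation instead of going to the general-purpose allocator. All `toy_` names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object cache: a userspace illustration, not the kernel's slab API. */
struct toy_cache {
    size_t obj_size;   /* size of each object in bytes */
    void *free_list;   /* singly linked list of released objects */
    size_t reused;     /* allocations served from the free list */
};

void toy_cache_init(struct toy_cache *c, size_t obj_size)
{
    /* A free object stores the "next" pointer in its own first bytes,
     * so every object must be at least pointer-sized. */
    c->obj_size = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    c->free_list = NULL;
    c->reused = 0;
}

void *toy_cache_alloc(struct toy_cache *c)
{
    if (c->free_list != NULL) {
        /* Fast path: pop a previously released object. */
        void *obj = c->free_list;
        c->free_list = *(void **)obj;
        c->reused++;
        return obj;
    }
    /* Slow path: fall back to the general-purpose allocator. */
    return malloc(c->obj_size);
}

void toy_cache_free(struct toy_cache *c, void *obj)
{
    assert(obj != NULL);
    /* Push the object onto the free list instead of releasing it. */
    *(void **)obj = c->free_list;
    c->free_list = obj;
}

void toy_cache_destroy(struct toy_cache *c)
{
    /* Return every cached object to the general-purpose allocator. */
    while (c->free_list != NULL) {
        void *next = *(void **)c->free_list;
        free(c->free_list);
        c->free_list = next;
    }
}
```

Allocating, freeing, and allocating again from the same `toy_cache` returns the same address the second time, which is the reuse the review refers to. In the kernel the corresponding interface is built around `kmem_cache_alloc` and `kmem_cache_free`; the sketch mirrors only the reuse idea, not the per-slab bookkeeping or object constructors that the real allocator adds.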

Another chapter that is particularly relevant, after the recent discussion on the Linux kernel mailing list of whether swapping is still a good idea, is the one on Swap Management. Linux does not swap in the traditional sense (writing out an entire process at once and then reading it back in one go) unless severe memory pressure demands it. Instead, swapping refers to writing out dirty memory pages to disk, which allows the kernel to evict unused pages in order to free up memory for more important tasks. The implementation touches on many issues, such as how the file system and the VM interact, and this chapter explains them very well.

Only slightly more than 200 of the book’s 730 pages describe the VM in prose. The majority of the book is so-called code commentary: small excerpts of source code, each followed by a short description of its purpose and a line-by-line explanation. This code commentary is especially useful because, even with a good understanding of the general workings of the VM, understanding the actual code without prior exposure to the Linux kernel source is extremely difficult.

The commentary is divided into chapters, one for each corresponding chapter of the description. This makes it easy for interested readers to flip to the back of the book and see how things are implemented. The most important functions in each chapter are described, and longer functions are split into parts for clarity. Less important functions are omitted so that the reader does not get bogged down in details that are easily understood when reading the actual source. This is very different from the famous Stevens approach, where every small macro or typedef is explained. The decision fits the overall style of the book and works very well: by focusing on the core functions, the reader can keep the big picture in mind. The explanations themselves are quite brief but very clear, and are most effective when you browse the actual source code with the code commentary as a guide.

The book also includes a CD-ROM that has some very interesting features, as well as some more standard features with a novel presentation. Instead of simply opening a file on the CD in a web browser and following links from there, the book suggests installing the copy of Apache provided on the CD. This approach does of course require a working copy of Linux, which is very likely given the subject of the book. Users of other operating systems can simply browse the CD directly, though tools like the call graph generator (which was also used to generate the graphs in the book) will not work. Using Apache makes the integration of these programs very elegant; it is, for example, quite easy to generate call graphs for any function in the VM subsystem.

The CD also includes tools for VM regression testing, which test the correctness and performance of your modified code. They are also quite helpful for examining the behavior of the original VM and comparing its performance to the 2.6 VM. In addition, the CD contains the entire book as HTML, browsable and searchable code commentary, a cross-referenced HTML version of the 2.4.22 kernel source, and a program that makes creating patches easy. Some of these programs, such as the call graph generator, were written by the author himself, and after playing around with them for a bit, I was quite impressed. Overall, the CD is a very useful addition to the book.

Finally, it should be noted that the majority of the book (including the code commentary) deals with the 2.4.22 kernel, which is quite a recent iteration of the 2.4 series. While it is true that the 2.6 kernel has recently been released and is now being used in some distributions, pretty much everything in this book is still relevant to the new version. Though it would of course be nice to have a more detailed treatment of the 2.6 kernel, the fast pace of kernel development means that any book will be out of date within a few months. To ease the transition, every chapter contains a “What’s New in 2.6” section that covers all the changes from the 2.4 kernel described in the book. Once you are familiar with the 2.4 implementation, moving to the 2.6 kernel should not be too difficult, at least as far as the VM is concerned.

About the Author:
“Can Sar is a Sophomore in Computer Science at Stanford University where he is focusing on Operating Systems and Networking. He is spending the summer doing independent research on Distributed Virtual Memory and will be busy hacking on the Linux kernel next semester.”

10 Comments

2004-06-30 7:52 am

Anonymous
… if not for too many commas.

Can, you’re only a sophomore? I’m impressed 🙂
2004-06-30 8:25 am

Anonymous
As a note for people interested in VM, the NetBSD paper about UVM is great reading.

Design and Implementation of UVM:

http://www.ccrc.wustl.edu/pub/chuck/psgz/diss.ps.gz
2004-06-30 9:07 am

Anonymous
I was hoping for an “Understanding the Linux Virtual Memory Manager” article. Guess I should have read the title more closely, eh?

Still, excellent review.
2004-06-30 11:03 am

Anonymous
Let me see if I’ve got this Virtual Memory thing straight…

It used to be that the computer copied data from disk to memory, then worked on it, then copied it back to disk again.

Then the data got really big and wouldn’t all fit in memory anymore. But the computer still wanted to think it was working in memory.

So now the computer copies data from disk via memory to somewhere else on disk (which it pretends is memory), then copies the bits it needs to work on from the somewhere else on disk to memory, then works on them, then copies those bits back from memory to the somewhere else on disk, then finally copies them back from the somewhere else on disk via memory to the place on disk they were originally.

Sounds like a game of musical bit-buckets.

Anyway, 640KiB ought to be enough for anyone.

—

James G.
2004-06-30 11:17 am

Anonymous
Well, virtual memory allows you to do what you’ve said. But VM is much more. It’s about presenting virtual memory to applications. That means each application sees its own address space and does not have to bother with the code/data of other processes or the kernel, i.e. it offers protection. All this is usually done by an MMU, which must be given registers and tables for mapping a virtual address onto physical memory, and that’s the hard part: maintaining all these mappings/tables in an efficient way, for all different kinds of usage.

Read http://www.ccrc.wustl.edu/pub/chuck/psgz/diss.ps.gz as mentioned above; it offers insight.
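
A minimal sketch of the kind of table lookup an MMU performs, assuming the classic 32-bit x86 layout without PAE (10-bit directory index, 10-bit table index, 12-bit page offset). The `toy_` names are hypothetical and exist only for illustration; real page table entries carry more flag bits than the single "present" bit modelled here, and the hardware walks physical addresses rather than C pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES    1024u   /* entries per directory and per table */
#define TOY_PAGE_SHIFT 12u     /* 4 KiB pages */
#define TOY_PRESENT    0x1u    /* entry maps a physical frame */

/* Each entry holds a page-aligned frame address plus flag bits. */
struct toy_page_table {
    uint32_t entry[TOY_ENTRIES];
};

/* The directory points at second-level tables; NULL means "not mapped". */
struct toy_page_dir {
    struct toy_page_table *table[TOY_ENTRIES];
};

/*
 * Translate a virtual address to a physical one.
 * Returns 0 and stores the result in *paddr on success,
 * or -1 when no mapping exists (where real hardware would fault).
 */
int toy_translate(const struct toy_page_dir *dir, uint32_t vaddr,
                  uint32_t *paddr)
{
    uint32_t dir_idx = vaddr >> 22;                              /* top 10 bits */
    uint32_t tbl_idx = (vaddr >> TOY_PAGE_SHIFT) & (TOY_ENTRIES - 1u);
    uint32_t offset  = vaddr & ((1u << TOY_PAGE_SHIFT) - 1u);    /* low 12 bits */

    const struct toy_page_table *pt = dir->table[dir_idx];
    if (pt == NULL)
        return -1;

    uint32_t pte = pt->entry[tbl_idx];
    if ((pte & TOY_PRESENT) == 0)
        return -1;

    /* Frame base comes from the entry, the offset from the address. */
    *paddr = (pte & ~((1u << TOY_PAGE_SHIFT) - 1u)) | offset;
    return 0;
}
```

Giving every process its own directory is what yields separate address spaces: the same virtual address resolves through different tables, or not at all.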
2004-06-30 12:29 pm

Anonymous
“of any modern operating system”

*cough*

apparently the writer never heard of DEC’s uh Digital uh Compaq uhm HP’s Virtual Memory System operating system.

as if virtual memory is some kind of new idea.

the great thing about linux is that it is always in a testing/experimental phase
2004-06-30 2:48 pm

Anonymous
VMS is a modern operating system. It has only been around since 1979.
2004-06-30 3:33 pm

Anonymous
So now the computer copies data from disk via memory to somewhere else on disk (which it pretends is memory), then copies the bits it needs to work on from the somewhere else on disk to memory, then works on them, then copies those bits back from memory to the somewhere else on disk, then finally copies them back from the somewhere else on disk via memory to the place on disk they were originally.

No, in modern VMs, that’s not how it works. When you mmap() a file, nothing is loaded into memory. When you touch a page in that mapping, the appropriate data is brought in from disk and put into memory (into the unified VM cache). When memory runs tight, pages from the file that haven’t been used in a while are flushed back to the *same* file. So there is only one location in memory and one location on disk. These days, most VMs do this for *all* file I/O, not just for mmap()’ed regions. When you read 10 bytes from the middle of the file, the whole page is brought in and put into the buffer cache, just as if you’d mmap()’ed it. When you access that region again, the kernel just memcpy()’s the data out of the buffer cache. As memory needs to be reused (or the file is closed), those pages are flushed back to the file.
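
A minimal sketch of that behaviour on a POSIX system, assuming a scratch file under /tmp (the path template and the function name `mmap_roundtrip` are made up for illustration). It stores a string through a shared mapping and reads the same bytes back with an ordinary pread(), with no second copy of the file anywhere on disk.

```c
#define _XOPEN_SOURCE 700   /* expose mkstemp, ftruncate and pread */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes a message through a shared file mapping, then reads it back
 * with pread(). Returns 0 on success and -1 on any failure.
 */
int mmap_roundtrip(char *out, size_t out_len)
{
    static const char msg[] = "hello from the page cache";
    char path[] = "/tmp/vm_demo_XXXXXX";   /* hypothetical scratch file */
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;

    assert(out != NULL);
    if (page <= 0 || out_len <= sizeof msg)
        return -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file lives until fd is closed */

    if (ftruncate(fd, page) == 0) {        /* give the file one page of space */
        /* Creating the mapping sets up the address range only. */
        char *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            /* The first store triggers a page fault that brings the page in. */
            memcpy(map, msg, sizeof msg);

            /* Ask for the dirty page to be written back to the same file. */
            if (msync(map, (size_t)page, MS_SYNC) == 0) {
                ssize_t n = pread(fd, out, out_len - 1, 0);
                if (n >= (ssize_t)sizeof msg) {
                    out[n] = '\0';
                    result = 0;
                }
            }
            munmap(map, (size_t)page);
        }
    }
    close(fd);
    return result;
}
```

The msync() call is there for portability; on systems with a unified cache the pread() would normally see the stored bytes even without it, because both paths go through the same cached page.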
2004-06-30 5:26 pm

Anonymous
since i inserted LOTS of RAM in this compoooooter (768 MB, three 256 MB sticks of DDR), swap is always at 0 and never used; it is still there just to keep my computer happy…
2004-07-04 3:36 pm

Anonymous
I have heard of VMS, and read the original paper describing the VMS VM architecture. Most of the common practices in VM systems actually originated there (e.g. not mapping anything to the first page, and segfaulting when somebody dereferences a null pointer). What I meant was that in any “modern” operating system, VM is sure to play an important role. Sorry about the confusion.

Thanks for all the feedback on the article everyone!