Interview with Solaris Kernel Engineer Andy Tucker

Today we host an interview with Andy Tucker, Distinguished Engineer of the Solaris kernel. We talk about the internals of Solaris, the competition and the future of the Solaris OE. The 9th question was answered by Robert O’Dea, Director of Engineering at Sun Microsystems.

1. Why have the other commercial Unixes all pretty much bitten the dust? Is Solaris that much better, or is it just more important to Sun than HP-UX was to HP, AIX to IBM or IRIX to SGI?


Andy Tucker: I think the most important thing Sun has done to ensure the success of Solaris is simply to remain committed to it. Even in the early days of Solaris, when most Sun customers were still running SunOS 4.x and other companies with UNIX implementations were starting to look at NT, Sun stayed focused on Solaris.


You can also look at some of the “big bets” that were made early in Solaris development. One of the most significant was that of designing in support for multithreading and multiprocessing from the ground up. Doing this work up front allowed Solaris to easily scale on large multiprocessors, and to handle the multithreaded workloads that are increasingly common.


2. Do you think that the proprietary, company-supported development effort that you’re a part of has any specific benefits over the Linux kernel’s Linus-and-his-henchmen method?


Andy Tucker: The main advantage Sun has is that we can make sure our efforts are well integrated and are focused on the needs of Sun’s customers. There’s a lot of great stuff available for Linux, but the decentralized development model means that someone who’s looking for, say, both a fair-share CPU scheduler and network QoS support has to pull the pieces out of different places, build them into a kernel, and hope they work together. Solaris has these as built-in, integrated components that just need to be switched on.


3. Technically speaking, what do you think of the Linux kernel and the Mach kernel? Also, how does FreeBSD 5.x compare to Solaris?


Andy Tucker: I think they’re all fine operating systems, each with their strengths and weaknesses. Mach broke a lot of new ground: it was the first microkernel OS to get widespread use and introduced some basic concepts (such as processor sets) that we’ve since borrowed in Solaris. Linux obviously has a huge developer base, and as a result there’s a tremendous amount of activity and energy around it. FreeBSD (and the other *BSD implementations) are inheritors of the BSD legacy and have been the source of a lot of interesting ideas.


I don’t really like to do head-to-head comparisons, since I like to think of OS development as a collaborative exercise. We’re all working to improve the state of the art and to make life easier for our users. The open source operating systems are often a source of new and interesting ideas; I hope the developers of those operating systems see Solaris similarly.


4. Solaris has some very complex algorithms. STREAMS, page coloring, and multi-level scheduling are all more complex than what is usually implemented in UNIX kernels. In retrospect, which Solaris features have really paid off, despite their complexity, and which ones have not?


Andy Tucker: I’ll note that most modern operating systems incorporate some sort of page coloring and multi-level scheduling algorithms; Solaris is hardly unique in this regard. I think that in most cases the significant work we’ve done has paid off; the complexity (if any) is usually required to meet the customer requirements. We’re also happy to rewrite things if we find a better or simpler way to do something.


On the other hand, there are obviously some features that haven’t really succeeded in the customer base, such as NIS+. And there are also some cases where we took a direction with the underlying technology that turned out to be a mistake. An example is the two-level thread scheduling model, where thread scheduling happens both at user level and in the kernel. Although this approach had some theoretical advantages in terms of thread creation and context switch time, it turned out to be enormously complicated, particularly when dealing with traditional Unix process semantics like signals. In Solaris 8, we made an “alternate” version of the threads library available that relied solely on kernel-based scheduling; it turned out to be not only much simpler and easier to maintain, but also faster in almost every case. It particularly sped up Java code, which is obviously important to us. In Solaris 9 (and later) we switched over to the single-level library as the only one available.


5. What do you think about the Cathedral vs. Bazaar idea when applied to OS kernels, where the programming model is rather different from that of regular application programs?


Andy Tucker: In some ways the Cathedral vs. Bazaar distinction seems a bit artificial. I don’t know about other OS companies, but within Sun we have hundreds of engineers from all over the company working on different parts of the operating system. Many of these people aren’t actually part of the Solaris engineering organization; they work on different hardware platforms, or on storage devices, or in the research labs, or on some other product that touches on Solaris in some way. We continuously release the latest code for internal use throughout the development cycle, and do beta tests to get feedback from customers. So in a way we’re doing “Bazaar” style development, even though it’s a commercial product and all developers are Sun employees.


The difficulty with this type of development, particularly on a large complex piece of software like an OS kernel, is ensuring that changes are architecturally consistent, well integrated, and of appropriate quality. This doesn’t mean there can’t be a large development community; it just means there needs to be some person or persons checking proposed changes to make sure they’re not going to cause a problem. In Linux, this role is filled by Linus and some of the other folks working with him, who review the changes going into the official kernel base. Within Sun, we have groups of senior engineers who similarly review proposed changes for quality, appropriateness, completeness, etc.

6. Is OS research dead? In the past, there was a great deal of research about the basics of OS design, like allocators and scalable scheduling. Today, the focus seems to be on application-level advancements like new virtual machines and user-level development frameworks. Have OS kernels pretty much reached the end of their evolution, or do you see kernels continuing to evolve, perhaps incorporating some of the research techniques like orthogonal persistence or exokernels?


Andy Tucker: The nature of OS research has changed over the years. In the ’80s and early ’90s, there was a lot of “big systems” research; universities and industry labs would start by building an operating system, and then use that as a platform for investigating new ideas. So CMU had Mach, Berkeley had Sprite, Stanford had the V System, etc. This meant that there was a lot of re-examination of basic OS constructs: how to best build an OS from the ground up. As a result we had work on distributed systems, microkernels, etc., but the systems were all aimed at supporting the same applications, essentially the ones running on the researchers’ desktops.


Now most of the research I see is based on existing OS platforms, usually Linux or one of the *BSDs. The focus is often on improving support for new types of applications: multimedia, mobility, etc. So we have fewer people looking at the basic structure of operating systems (with some notable exceptions), but more looking at how to make operating systems perform better from a user’s point of view. The use of existing OS platforms also removes some of the barriers to entry for OS research: universities with small OS groups and budgets can do interesting research without having to build an entirely new operating system.


7. 30 years after UNIX was recoded in C, most people still use C (or in some cases a little bit of C++) for the OS kernel. Is C perfectly adequate, or do you see some of the newer languages (C#, Java, or even modern C++ paradigms) being applied to OS design?


Andy Tucker: There have been various experiments in this area; as an example, Sun has developed operating systems in both C++ (SpringOS) and Java (JavaOS). While object-oriented languages offer a number of advantages in terms of ease of development for higher-level programming abstractions, this doesn’t always benefit OS kernels as much as it would user applications. Since the kernel is the piece of software that most directly interacts with the hardware, the benefit of having a simple mapping between the language and machine instructions is often more compelling than ease-of-development features like garbage collection and templates. There are also issues like runtime support requirements that can be extensive, depending on the language. What we often wind up doing instead is taking some of the concepts from object-oriented languages, such as polymorphism, and finding creative ways to implement them in non-OO languages like C.
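
One widely used way to get polymorphism in plain C is to attach a table of function pointers to each object and call through it. The sketch below is only an illustration of that general technique: `struct file`, `struct file_ops`, and every function name are hypothetical and are not taken from Solaris.

```c
#include <stddef.h>
#include <string.h>

struct file;

/* Table of operations; plays the role a vtable plays in C++.
 * All names here are hypothetical, for illustration only. */
struct file_ops {
    const char *(*type_name)(const struct file *f);
    size_t (*read)(struct file *f, char *buf, size_t len);
};

struct file {
    const struct file_ops *ops; /* selects the implementation */
    const char *data;           /* backing bytes (in-memory files only) */
    size_t pos;                 /* current read offset */
    size_t size;                /* total bytes available */
};

/* Implementation 1: a file backed by a memory buffer. */
static const char *mem_type_name(const struct file *f)
{
    (void)f;
    return "mem";
}

static size_t mem_read(struct file *f, char *buf, size_t len)
{
    size_t left = f->size - f->pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return len;
}

static const struct file_ops mem_ops = { mem_type_name, mem_read };

/* Implementation 2: a file that is always empty. */
static const char *null_type_name(const struct file *f)
{
    (void)f;
    return "null";
}

static size_t null_read(struct file *f, char *buf, size_t len)
{
    (void)f; (void)buf; (void)len;
    return 0;
}

static const struct file_ops null_ops = { null_type_name, null_read };

/* Generic entry points: callers never need to know the concrete type. */
size_t file_read(struct file *f, char *buf, size_t len)
{
    return f->ops->read(f, buf, len);
}

const char *file_type_name(const struct file *f)
{
    return f->ops->type_name(f);
}

/* Constructors that bind an object to its operations table. */
struct file mem_file(const char *data, size_t size)
{
    struct file f = { &mem_ops, data, 0, size };
    return f;
}

struct file null_file(void)
{
    struct file f = { &null_ops, NULL, 0, 0 };
    return f;
}
```

A caller holding a `struct file *` can invoke `file_read()` without knowing which implementation is behind it, and the indirect call still maps onto a handful of machine instructions.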


8. How do you feel Solaris process management technologies like the Fair Share Scheduler will stack up to the Linux O(1) scheduler? Furthermore, has Sun ever attempted to implement an O(1) scheduler for Solaris, and if so, what problems or drawbacks did you encounter that kept it out of the released kernels?


Andy Tucker: Solaris has actually had an O(1) scheduler for a number of years. The run queues are also per-CPU to maximize scalability. This isn’t a secret, but we haven’t talked about the technology itself much; we’ve been mostly focused on the results.
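
To make “O(1)” concrete, here is a minimal sketch of one common design: an array of FIFO queues indexed by priority, plus a bitmap recording which queues are non-empty. It is not the Solaris dispatcher; the 64 priority levels, the convention that a higher number means higher priority, and all type and function names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define NPRI 64 /* assumed number of priority levels; 63 is highest */

struct thread {
    int pri;              /* 0 .. NPRI-1 */
    struct thread *next;  /* link within one priority queue */
};

/* One of these would exist per CPU in a per-CPU run queue design. */
struct runq {
    uint64_t bitmap;            /* bit p set => queue p is non-empty */
    struct thread *head[NPRI];
    struct thread *tail[NPRI];
};

void runq_init(struct runq *rq)
{
    rq->bitmap = 0;
    for (int p = 0; p < NPRI; p++)
        rq->head[p] = rq->tail[p] = NULL;
}

/* Append to the tail of the thread's priority queue: constant time. */
void runq_enqueue(struct runq *rq, struct thread *t)
{
    int p = t->pri;
    t->next = NULL;
    if (rq->tail[p] != NULL)
        rq->tail[p]->next = t;
    else
        rq->head[p] = t;
    rq->tail[p] = t;
    rq->bitmap |= (uint64_t)1 << p;
}

/* Index of the highest set bit, found in a fixed six steps
 * regardless of how many threads are queued. x must be nonzero. */
static int highest_bit(uint64_t x)
{
    int p = 0;
    for (int shift = 32; shift > 0; shift >>= 1) {
        if (x >> shift) {
            x >>= shift;
            p += shift;
        }
    }
    return p;
}

/* Remove and return the oldest thread at the highest non-empty
 * priority, or NULL if nothing is runnable. */
struct thread *runq_dequeue(struct runq *rq)
{
    if (rq->bitmap == 0)
        return NULL;
    int p = highest_bit(rq->bitmap);
    struct thread *t = rq->head[p];
    rq->head[p] = t->next;
    if (rq->head[p] == NULL) {
        rq->tail[p] = NULL;
        rq->bitmap &= ~((uint64_t)1 << p);
    }
    t->next = NULL;
    return t;
}
```

Both operations do a bounded amount of work no matter how many threads are runnable, which is the property the term O(1) refers to. Giving each CPU its own `struct runq` avoids contention on a single shared queue.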


The “fair-share scheduler” is one of several scheduling policies in Solaris, which control how priorities are assigned to individual processes. This is separate from the scheduler, which handles dispatching processes onto processors in priority order.


The fair-share scheduler allows the allocation of CPU in the system to be divided among groups of processes according to proportions defined by an administrator. For example, on a system running both a mail server and a web server, the administrator might decide that if the system is busy, 2/3 of the CPU should go to the mail server, and 1/3 should go to the web server. Although in the past the fair-share scheduler was available only as a separate product (Solaris Resource Manager), we decided that it was important enough technology for our customers to bundle in the core operating system.
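
The arithmetic behind the mail/web example can be sketched as follows, assuming each group is given an integer number of shares and that only groups with runnable work compete for the CPU. The `struct group` type and the `cpu_entitlement()` function are hypothetical names for this illustration, not Solaris interfaces, and a real scheduler would also track recent usage to steer priorities toward these targets.

```c
#include <stddef.h>

/* Hypothetical description of one group of processes. */
struct group {
    const char *name;
    unsigned shares; /* weight assigned by the administrator */
    int active;      /* nonzero if the group currently has runnable work */
};

/* Target fraction of CPU for group i: its shares divided by the
 * total shares of all active groups. Inactive groups get 0. */
double cpu_entitlement(const struct group *g, size_t n, size_t i)
{
    unsigned total = 0;
    for (size_t k = 0; k < n; k++) {
        if (g[k].active)
            total += g[k].shares;
    }
    if (!g[i].active || total == 0)
        return 0.0;
    return (double)g[i].shares / (double)total;
}
```

With the mail server at 2 shares and the web server at 1, both active, the targets are 2/3 and 1/3. Under the assumption above, if the mail server goes idle, the web server's target becomes the whole CPU, so the split only constrains anyone when the system is busy.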


9. Sun killed off the Sun Linux distribution, but are you going ahead with Project Mad Hatter, which includes a Red Hat Linux 9.x distro serving as a cheaper thin client for your existing customers? If yes, when is the project going to be deployed, and how important is the usability of the user interface and OS? Is Sun going to pay extra attention to GNOME and its better integration with the underlying OS in the future?


Robert O’Dea: Project Mad Hatter is moving ahead as planned with a release targeted for the second half of this year. Mad Hatter provides an integrated client solution for Linux based on open source technology. The graphical desktop will be LSB compliant, which is more important to our customers than the particular open source components provided.


We are also focusing our efforts on the user experience, interoperability, and enterprise-specific functionality. Enhancements include a unified Sun look and feel, enhanced Microsoft interoperability [by which we mean Samba integration, ability to access MS file shares and access to Exchange servers], printer management, file management and sync. Additionally, we will be providing enterprise functionality such as system administration and configuration tools, single sign-on and other applications.


As for GNOME, this is an integral part of the Mad Hatter software stack. Sun has been working with GNOME for many years, has made significant contributions to the GNOME community, and has integrated GNOME with the Solaris Operating Environment. With our experience, you can expect GNOME to be tightly integrated with the Linux OS.


10. What does the future hold for Solaris 10? What enhancements are in store at the OS and kernel level? Are there any plans to integrate the Gridengine into Solaris rather than being a separate application?


Andy Tucker: Solaris 10 will have a number of new features that we think are pretty exciting. One is Solaris Zones, which takes an idea that was initially developed for FreeBSD (jails) and extends it to address the needs of our customers. It allows administrators to divide up a single system into a number of separate application environments, called zones, where processes in one zone are not able to see or interact with those in other zones. This means that multiple applications can run on the same system without conflicting with each other, but the administrator only has to deal with one OS kernel for backups, patches, etc.


We’re also looking at ways to improve system reliability and observability. Solaris 10 will include tools that allow tracing not only what’s going on at user level, but also what’s going on in the kernel. So a developer trying to understand why their application is performing poorly can get information from the whole software stack and get a much better picture of what’s really going on. We’re also using these tools internally to improve the performance and reliability of Solaris and other Sun software.
