posted by Rayiner Hashem & Eugenia Loli-Queru on Mon 24th Nov 2003 16:24 UTC

"Jim Gettys on the new X Server"
Rayiner Hashem: Computer graphics systems have changed a lot since X11 was designed. In particular, more functionality has moved from the CPU to the graphics processor, and the performance characteristics of the system have changed drastically because of that. How do you think the current protocol has coped with these changes?

Jim Gettys: This is not true. The first X implementation had a $20,000 external display plugged into a Unibus on a VAX with outboard processor and bit-blit engine. Within 3 years, we went to completely dumb frame buffers.

Over X's life time, the cycle of reincarnation has turned several times, round and round the wheel turns. The tradeoffs of hardware vs. software go back and forth.

As far as X's graphics goes, X mouldered for most of the '90s, and X11's graphics was arguably broken on day 1. The specification adopted forced both ugly and slow wide lines; we had run into the "lumpy line" problem that John Hobby had solved, but unfortunately, we were not aware of his solution in time and X was never fixed. Antialiasing (AA) and image compositing were just gleams in people's eyes when we designed X11. Arguably, X11's graphics has always been lame.

It is only Keith Packard's work recently that has begun to bring it to where it needs to be.

Rob Pike and Russ Cox's work on Plan 9 showed that adopting a Porter-Duff model of image compositing was now feasible. Having machines 100-1000x faster than what we had in 1986 helps a lot :-).
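For readers unfamiliar with the Porter-Duff model mentioned here, the following is a minimal sketch of its best-known operator, "over", applied to a single pixel. It assumes premultiplied RGBA values as floats in the range 0.0 to 1.0; the function name `over` and the sample colors are illustrative only and are not taken from Plan 9 or from the Render extension.

```python
def over(src, dst):
    """Porter-Duff 'over': place src on top of dst.

    Both pixels are premultiplied RGBA tuples with components in [0.0, 1.0].
    Each output channel is src + dst * (1 - src_alpha).
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    k = 1.0 - sa  # fraction of the destination that still shows through
    return (sr + dr * k, sg + dg * k, sb + db * k, sa + da * k)


# Illustrative values: 50% opaque red (premultiplied) over opaque blue.
half_red = (0.5, 0.0, 0.0, 0.5)
blue = (0.0, 0.0, 1.0, 1.0)
print(over(half_red, blue))  # (0.5, 0.0, 0.5, 1.0)
```

The per-pixel arithmetic is only a few multiplies and adds, but it has to run for every pixel of every translucent surface, which is why the available machine speed matters so much for feasibility.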

Overall, the current protocol has done well, as demonstrated by Gnome and KDE's development over 10 years after X11's design, though it is past time to replace the core graphics in X, which is what Render does.

Rayiner Hashem: You mentioned in one of your presentations that a major change from W to X was a switch from structured to immediate mode graphics. However, the recent push towards vector graphics seems to indicate a return of structured graphics systems. Display PDF and XAML, in particular, seem particularly well-suited to a structured API. Do you see the X protocol evolving (either directly or through extensions) to better support structured graphics?

Jim Gettys: That doesn't mean that the window system should adopt structured graphics.

Generally, having the window system do structured graphics requires a duplication of data structures on the X server, using lots of memory and costing performance. The organization of the display lists would almost always be incorrect for any serious application. No matter what you do, you need to let the application do what *it* wants, and it generally has a better idea how to represent its data than the window system can possibly have.

Rayiner Hashem: What impact does the compositing abilities of the new X server have on memory usage? Are there any plans to implement a compression mechanism for idle window buffers to reduce the requirements?

Jim Gettys: The jury is out: one idea we've toyed with is to encourage most applications to use 16bit deep windows as much as possible. This might often save memory over the current situation where windows are typically the depth of the screen (32 bits). The equation is complex, and not all for or against either the existing or new approach.
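To put rough numbers on the 16-bit versus 32-bit trade-off, here is a back-of-the-envelope sketch. It assumes one uncompressed backing buffer per window with no row padding, and the 1024x768 window size is a hypothetical example rather than a figure from the interview.

```python
def window_buffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one uncompressed window buffer (no row padding assumed)."""
    return width * height * (bits_per_pixel // 8)


# Hypothetical 1024x768 window at the two depths discussed above.
deep = window_buffer_bytes(1024, 768, 32)
shallow = window_buffer_bytes(1024, 768, 16)
print(deep, shallow)  # 3145728 1572864
```

Under these assumptions the 16-bit window needs half the memory, about 1.5 MiB instead of 3 MiB, and the saving repeats for every window that keeps its own buffer.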

Anyone who wants to do a compression scheme of idle window buffers is very welcome to do so. Most windows compress *extremely* well. Some recent work on the migration of window contents to and from the display memory should make this much easier, if someone wants to implement this and see how well it works.
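As an illustration of why window contents tend to compress well, here is a minimal run-length encoding sketch over one row of pixels. Run-length encoding is chosen only because it is simple; it is an assumption for illustration, not a scheme the X developers have proposed, and the pixel values are made up.

```python
def rle_encode(pixels):
    """Collapse consecutive equal pixels into (count, value) runs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][1] == p:
            runs[-1][0] += 1  # extend the current run
        else:
            runs.append([1, p])  # start a new run
    return [(count, value) for count, value in runs]


def rle_decode(runs):
    """Expand (count, value) runs back into the original pixel list."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


# Made-up row: white background with a short black stroke in the middle.
row = [0xFFFFFF] * 100 + [0x000000] * 20 + [0xFFFFFF] * 100
runs = rle_encode(row)
print(len(row), len(runs))  # 220 3
```

Large areas of flat color, typical of backgrounds and widget fills, collapse to a handful of runs, while the decode step restores the row exactly when the window becomes active again.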

Rayiner Hashem: What impact does the design of the new server have on performance? The new X server is different from Apple's implementation because the server still does all the drawing, while in Apple's system, the clients draw directly to the window buffers. Do you see this becoming a bottleneck, especially with complex vector graphics like those provided by Cairo?

Jim Gettys: No, we don't see this as a bottleneck.

One of the *really* nice things about the approach that has been taken is that the cost of your eye candy (drop shadows, etc.) is bounded by the update rate to the screen, which never needs to be higher than the frame rate (and is typically further reduced by only having to update the parts of the screen that have been modified). Other approaches often have the cost going up in proportion to the graphics updating, rather than the bounded behavior of this design, and take a constant fraction of your graphics performance.
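The bounded-cost argument can be sketched as follows: drawing operations only record which area changed, and the expensive compositing step runs at most once per frame over the accumulated area. The class and method names are hypothetical, and a single bounding box stands in for whatever finer-grained damage region a real server would track.

```python
class DamageTracker:
    """Accumulates damaged rectangles and composites at most once per frame."""

    def __init__(self):
        self.pending = None  # bounding box (x0, y0, x1, y1) or None
        self.composites = 0  # number of compositing passes performed

    def damage(self, x, y, w, h):
        """Record that a rectangle changed; cheap, no compositing happens here."""
        box = (x, y, x + w, y + h)
        if self.pending is None:
            self.pending = box
        else:
            a = self.pending
            self.pending = (min(a[0], box[0]), min(a[1], box[1]),
                            max(a[2], box[2]), max(a[3], box[3]))

    def frame(self):
        """Called once per screen refresh; composites only the damaged area."""
        if self.pending is None:
            return None  # nothing changed, so no work this frame
        region = self.pending
        self.pending = None
        self.composites += 1  # stand-in for the real compositing work
        return region


tracker = DamageTracker()
for i in range(1000):  # a client drawing far faster than the refresh rate
    tracker.damage(i % 10, 0, 5, 5)
print(tracker.frame(), tracker.composites)  # (0, 0, 14, 5) 1
```

A thousand drawing requests lead to a single compositing pass, and a frame with no damage costs nothing, which is the sense in which the cost is tied to the frame rate rather than to the drawing rate.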

Rayiner Hashem: Could this actually be a performance advantage, allowing the X server to take advantage of hardware acceleration in places Apple's implementation can not?

Jim Gettys: Without knowing Apple's implementation details it is impossible to tell.

Eugenia Loli-Queru: How does your implementation compare to Longhorn's new display system (based on the information available so far)?

Jim Gettys: Too soon to tell. The X implementation is very new, and it is hard enough to keep up with what we're doing, much less keep up with the smoke and mirrors of Microsoft marketing ;-). Particularly sweet is that Keith says the new facilities save code in the X server, rather than making it larger. That is always a good sign :-).

Rayiner Hashem: What impact will the new server have on toolkits?

Jim Gettys: None, unless they want to take advantage of similar compositing facilities internally.

Rayiner Hashem: Will they have to change to better take advantage of the performance characteristics of the new design? In particular, should things like double-buffering be removed?

Jim Gettys: If we provide some way for toolkits to mark stable points in their display, it may be less necessary for applications to ask explicitly for double buffering. We're still exploring this area.

But the current Qt, GTK, and Mozilla toolkits need some serious tuning independent of the X server implementation. See our USENIX paper. Some of the worst problems have been fixed since this work was done last spring, but there is much more to do.

Rayiner Hashem: How are hardware implementations of Render and Cairo progressing? Render, in particular, has been available for a very long time, yet most hardware has poor to no support for it. According to the benchmarks done by Carsten Haitzler (Raster) even NVIDIA's implementation is many times slower in the general case than a tuned software implementation.

Jim Gettys: Without understanding exactly what Raster thinks he's measured, it is hard to tell.

We need better driver support (more along the lines of DRI drivers) to allow the graphics hardware to draw into pixmaps in the X server to take advantage of their compositing hardware.

Some recent work allows for much easier migration of pixmaps to and from the frame buffer where the graphics accelerators can operate.

An early implementation Keith did showed a factor of 30 for hardware assist for image compositing, but it isn't clear if the current software implementation is as optimal as it could be, so that number should be taken with a grain of salt. But fundamentally, the graphics engines have a lot more bandwidth and wires into VRAM than the CPU does into main memory.

Rayiner Hashem: Do you think that existing APIs like OpenGL could form a foundation for making fast Render and Cairo implementations available more quickly?

Jim Gettys: Understand that today's X applications draw fundamentally differently than your parents' X applications; we've found that a much simpler and narrower driver interface is sufficient for 2D graphics: 3D remains hard. The wide XFree86 driver interface is optimizing many graphics requests no longer used by current GTK, Qt or Mozilla applications. For example, core text is now almost entirely unused: I now use only a single application that still uses the old core text primitives; everything else is AA text displayed by Render.

So to answer your question directly, yes we think that this approach will form a foundation for making fast Render and Cairo implementations.

The fully software based implementations we have now are fast enough for most applications, and will be with us for quite a while due to X's use on embedded platforms such as handhelds that lack hardware assist for compositing.

But we expect high performance implementations using graphics accelerators will be running over the next 6 months. The proof will be in the pudding, now in the oven. Stay tuned :-).

Table of contents
  1. "Havoc Pennington on freedesktop.org"
  2. "Keith Packard on the new X Server"
  3. "Jim Gettys on the new X Server"
  4. "David Zeuthen on HAL"
  5. "Waldo Bastian on KDE/Qt and fd.org"