posted by Guillaume Maillard on Thu 10th Oct 2002 05:44 UTC

"Optimizing your XFree86 code"
4. Interesting improvements

One interesting XFree86 extension is SHM (MIT-SHM). It provides an API close to XPutImage for transferring bitmaps through memory shared between the X server and the client. The gain is about 20% in 'my' typical use case, not bad! The 'new' XRender extension lets applications perform complex blending and transparency operations. The functionality is interesting, but the performance is not there: even a small semi-transparent window (100x100) is slow. We need something faster and simpler.

It's a bit of a paradox. I played with hardware-accelerated OpenGL, and it seems that my card can blit 32-bit bitmaps with transparency as fast as XFree blits non-transparent bitmaps. What's more, another gap I didn't mention before seems to be filled on the 3D side: synchronization with the refresh rate of my screen. Where is the magical function "WaitForVerticalBlank()"? It could improve rendering quality and make scrolling feel smooth. Mac OS X uses it, so why not XFree?

I found many answers in the source code of my driver (and its DRI part). It looks like drivers could support acceleration for 32-bit data, but the capability is unused or masked because the X server doesn't use it. I'm interested to hear the reasons for this situation. IMHO the drivers MUST support transparency EVEN IF no extension uses it today.

5. What can be done to improve it

A lot of things can be improved. People always suggest abandoning XFree; what I suggest is something more realistic (doable and efficient):

5.1. Simplify

The X11 functions are too complicated: too much overhead and too many opportunities for mistakes. Developers need basic and efficient APIs. Some developers tried to solve this issue by creating a wrapper, but that only adds limitations and decreases execution efficiency. Modern hardware supports 32-bit displays, so all we need is a new extension, working only with 32-bit local displays, that lets you:
- create 32-bit windows with window_id id=CreateWindow(int width,int height); plus human-readable functions like SetTitle(char*), SetSize(int width,int height), SetPosition(int x,int y), Show(), Hide()...
- create 32-bit bitmaps (in the native ARGB format of the card, always stored in shared memory) with bitmap_id id=CreateBitmap(int width,int height);

- call explicit functions like
bool StoreBitmapInGFXRAM(bitmap_id);
bool RemoveBitmapFromGFXRAM(bitmap_id);
void BlitToWindow(bitmap_id src, window_id dest, ...);
void BlitToBitmap(bitmap_id src, bitmap_id dest, ...);
void WaitForVerticalBlank();
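
To make the proposed bitmap calls a little more concrete, here is a minimal software sketch in plain C. It is not an XFree86 API and not how a driver would do it: the `Bitmap` struct, `DestroyBitmap`, the `dx`/`dy` parameters (standing in for the "..." in the prototypes above) and the two blending helpers are assumptions made for illustration. It uses pointers instead of `bitmap_id` handles and blends on the CPU where real hardware would use the blitter.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bitmap of the proposed extension. */
typedef struct {
    int width, height;
    uint32_t *pixels;            /* ARGB, 8 bits per channel, row-major */
} Bitmap;

/* Allocate a zero-filled (fully transparent) bitmap, or NULL on failure. */
Bitmap *CreateBitmap(int width, int height)
{
    if (width <= 0 || height <= 0) return NULL;
    Bitmap *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->pixels = calloc((size_t)width * (size_t)height, sizeof *b->pixels);
    if (!b->pixels) { free(b); return NULL; }
    b->width = width;
    b->height = height;
    return b;
}

/* Assumed helper, not part of the prototypes listed in the article. */
void DestroyBitmap(Bitmap *b)
{
    if (b) { free(b->pixels); free(b); }
}

/* Blend one 8-bit channel: source over destination, alpha in 0..255. */
static uint32_t blend_channel(uint32_t s, uint32_t d, uint32_t a)
{
    return (s * a + d * (255 - a) + 127) / 255;
}

/* Source-over blend of one ARGB pixel; the destination is treated as opaque. */
static uint32_t blend_argb(uint32_t src, uint32_t dst)
{
    uint32_t a = src >> 24;
    uint32_t r = blend_channel((src >> 16) & 0xFF, (dst >> 16) & 0xFF, a);
    uint32_t g = blend_channel((src >> 8) & 0xFF, (dst >> 8) & 0xFF, a);
    uint32_t b = blend_channel(src & 0xFF, dst & 0xFF, a);
    return 0xFF000000u | (r << 16) | (g << 8) | b;
}

/* Blit src onto dest at (dx, dy), clipping against the dest bounds. */
void BlitToBitmap(const Bitmap *src, Bitmap *dest, int dx, int dy)
{
    for (int y = 0; y < src->height; y++) {
        int ty = dy + y;
        if (ty < 0 || ty >= dest->height) continue;
        for (int x = 0; x < src->width; x++) {
            int tx = dx + x;
            if (tx < 0 || tx >= dest->width) continue;
            uint32_t *d = &dest->pixels[(size_t)ty * dest->width + tx];
            *d = blend_argb(src->pixels[(size_t)y * src->width + x], *d);
        }
    }
}
```

The point of the sketch is how small the surface is: one allocation call and one blit with per-pixel alpha are enough for a semi-transparent window. A real extension would keep the same shape and push the inner loop down to the card.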

5.2. Make it faster

This extension should have fast communication between the server and the client: avoid encoding/decoding and format conversion, and use a fast IPC mechanism like shared memory. (The 3D part of XFree uses DRI, which uses this kind of shortcut.)
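
As an illustration of the shared-memory idea only, the sketch below lets a 'client' process write pixels straight into a buffer that a 'server' process then reads in place, with no encoding and no copy through a socket. It uses plain POSIX `mmap` and `fork`, not the X11 SHM extension and not DRI. The function name `share_frame_demo` is made up for this example, and synchronizing by waiting for the child to exit is a simplification; a real protocol would need proper signalling between the two sides.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns how many pixels the "server" side found
 * set to `color` after the "client" filled the shared frame,
 * or -1 on error.
 */
long share_frame_demo(int width, int height, uint32_t color)
{
    if (width <= 0 || height <= 0) return -1;
    size_t count = (size_t)width * (size_t)height;
    size_t bytes = count * sizeof(uint32_t);

    /* One buffer, visible to both processes after fork(). */
    uint32_t *frame = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (frame == MAP_FAILED) return -1;

    pid_t pid = fork();
    if (pid < 0) { munmap(frame, bytes); return -1; }

    if (pid == 0) {
        /* "Client": draws directly into the shared buffer. */
        for (size_t i = 0; i < count; i++) frame[i] = color;
        _exit(0);
    }

    /* "Server": wait for the client, then read the pixels in place. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(frame, bytes);
        return -1;
    }

    long matches = 0;
    for (size_t i = 0; i < count; i++)
        if (frame[i] == color) matches++;

    munmap(frame, bytes);
    return matches;
}
```

Calling `share_frame_demo(100, 100, 0x80FF0000u)` moves a 100x100 ARGB frame between two processes without a single byte going through a pipe or socket.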

5.3. A typical use case, a window manager

Let's change the subject a bit and talk about window managers (WMs); most of us use one. A WM is a process (an X11 client, like any standard graphical app) that deals with XFree86 for window operations such as moving windows, drawing window borders, managing workspaces, etc. It's the most used X11 application, and the one to blame for the slow refreshes we get on our desktops. Technically, when you move a window by dragging its title bar, the WM asks the X server to move the window, which generates 'redraw events' (Expose events) for all the windows behind the moving one. Each of these windows then redraws itself using X11 functions (which means sending messages to the X server).

In practice, let's take an example: a screen with 10 windows.
1. I click on my preferred window and move it
2. the first 'step' of the move will:
- ask the X server to move the window
- 5 windows are partly covered, so 5 windows receive an Expose event and each redraws the needed part. A 'standard' part of a nice UI needs to send more than 100 drawing requests (line, rect, font...) to the X server. If my GnomeCalc is right, that's about 500 requests at every 'step'. If I want the smoothest desktop on earth on my 100Hz screen, 50,000 requests must be swallowed per second... good luck, it's not going to happen! Remember, this was 'just' to move a window, and it already consumes 100% of my CPU.
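
The estimate above is a simple product, and a tiny sketch makes the arithmetic explicit. The function name and parameters are made up for this illustration; the figures are the ones from the example (5 exposed windows, roughly 100 drawing requests per redraw, one 'step' per refresh on a 100Hz screen).

```c
/* Requests the X server must handle per second while a window is dragged. */
long requests_per_second(long exposed_windows,
                         long requests_per_redraw,
                         long steps_per_second)
{
    /* Every step, each exposed window redraws its uncovered part. */
    long per_step = exposed_windows * requests_per_redraw;
    return per_step * steps_per_second;
}
```

With the numbers from the example, `requests_per_second(5, 100, 100)` gives 500 requests per step and 50,000 requests per second.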

5.4. Make things faster, all the time...

If I look at my benchmarks and at what was described as 'bad' in XFree, a good solution could be to use the memory of my GFX card: by sacrificing 8MB, you can put a built-in window manager IN the X server (one which plays with bitmap blitting functions to give you an impressive result).
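
A rough sketch of why this helps, assuming the server keeps every window's contents cached as a bitmap: moving a window then only changes a position and re-blits the cached bitmaps, and no client has to handle an Expose event or send a drawing request. Everything here (`CachedWindow`, `compose`, the tiny fixed-size screen) is hypothetical and done in software; the real thing would blit from the card's memory.

```c
#include <stdint.h>

#define SCREEN_W 8
#define SCREEN_H 4

/* Hypothetical server-side record: position plus cached contents. */
typedef struct {
    int x, y, w, h;
    const uint32_t *pixels;      /* w*h cached ARGB pixels, row-major */
} CachedWindow;

/* Repaint the whole screen from the cached bitmaps, back to front. */
void compose(uint32_t screen[SCREEN_H][SCREEN_W],
             const CachedWindow *wins, int count, uint32_t background)
{
    for (int y = 0; y < SCREEN_H; y++)
        for (int x = 0; x < SCREEN_W; x++)
            screen[y][x] = background;

    for (int i = 0; i < count; i++) {
        const CachedWindow *win = &wins[i];
        for (int row = 0; row < win->h; row++) {
            int sy = win->y + row;
            if (sy < 0 || sy >= SCREEN_H) continue;
            for (int col = 0; col < win->w; col++) {
                int sx = win->x + col;
                if (sx < 0 || sx >= SCREEN_W) continue;
                screen[sy][sx] = win->pixels[row * win->w + col];
            }
        }
    }
}
```

To move a window, the built-in window manager would update its `x` and `y` and call `compose` again. The area that was hidden reappears from the cache, and the owning application never hears about it.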

5.5. Don't re-invent the wheel

XFree already has great potential. Let's improve it by creating the most efficient API. Who cares about slowing the Xlib functions down a bit if XFree can provide something that outperforms the old standard?

When working on the graphical rendering of B.E.OS, I never modified XFree86, but I used as much as possible of what is good in it. I hope that the result will appeal to you and will push the development of a 'better extension'.

6. It's time to conclude

Yes, XFree86 is fast: just add the appropriate extension in order to use 'shortcuts' (as explained before) and you will have something that performs well and can be used with both the standard X11 API and the proposed one.

About the Author:
Guillaume is a software engineer who wrote his first line of code at 7; he is now 26. Amstrad, Amiga and an x86 PC under BeOS are his preferred digital environments. After working at Philips on MHP technology, he created his own company. Even with his spare time drastically reduced, he continues to find the time to work on exciting projects, like B.E.OS.
