Linked by Thom Holwerda on Sat 28th Jun 2008 22:09 UTC, submitted by diegocg
X11, Window Managers "Maybe I'm just naive, but designing a graphics API such that all image data had to be sent over a socket to another process every time the image needed to be drawn seems like complete idiocy. Unfortunately, that is precisely what the X Window System forces a program to do, and exactly what Cairo does when drawing images in Linux - a full copy of the image data, sent to another process, no less, every time it is drawn. One would think there would be some room for improvement. Unsurprisingly, others felt the same way about X, and decided to write an extension, Xlib Shm or XShm for short, that allows images to be placed in a shared memory segment from which the X server reads, which allows the program to avoid the memory copy. GTK already makes use of the XShm extension, and it seems like a good idea to see if Gecko couldn't do the same."
RE[3]: Exaggerating
by dpeterc on Sun 29th Jun 2008 13:02 UTC in reply to "RE[2]: Exaggerating"
dpeterc Member since:
2007-09-08

openSUSE 10.3, Intel Q6600, Nvidia 8600 GTS with the Nvidia binary drivers, 2560x1600 resolution, 32 bpp

x11perf -putimage500
x11perf - X11 performance program, version 1.5
The X.Org Foundation server version 70200000 on :0.0
from monsta
Sun Jun 29 14:38:21 2008

Sync time adjustment is 0.0413 msecs.

8000 reps @ 0.7415 msec ( 1350.0/sec): PutImage 500x500 square
8000 reps @ 0.8194 msec ( 1220.0/sec): PutImage 500x500 square
8000 reps @ 0.9253 msec ( 1080.0/sec): PutImage 500x500 square
8000 reps @ 0.8722 msec ( 1150.0/sec): PutImage 500x500 square
8000 reps @ 0.8803 msec ( 1140.0/sec): PutImage 500x500 square
40000 trep @ 0.8477 msec ( 1180.0/sec): PutImage 500x500 square

x11perf -shmput500
x11perf - X11 performance program, version 1.5
The X.Org Foundation server version 70200000 on :0.0
from monsta
Sun Jun 29 14:39:16 2008

Sync time adjustment is 0.0385 msecs.

12000 reps @ 0.4442 msec ( 2250.0/sec): ShmPutImage 500x500 square
12000 reps @ 0.4435 msec ( 2250.0/sec): ShmPutImage 500x500 square
12000 reps @ 0.4433 msec ( 2260.0/sec): ShmPutImage 500x500 square
12000 reps @ 0.4429 msec ( 2260.0/sec): ShmPutImage 500x500 square
12000 reps @ 0.4424 msec ( 2260.0/sec): ShmPutImage 500x500 square
60000 trep @ 0.4433 msec ( 2260.0/sec): ShmPutImage 500x500 square


Only about two times faster. In other words, core X is already very fast on a local server and does not use TCP/IP when the client is local.
Don't get me wrong, I have used MIT-SHM in my programs for 14 years. But in optimization, you need to look at the total cost, not just optimize because you can. The actual time spent copying the image buffer is most likely negligible compared with the time spent generating the image data. So you will get a better overall speedup by optimizing other parts of the application.
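
The "total cost" argument can be made concrete with Amdahl's law. The sketch below is only an illustration: the two timings are the totals from the x11perf runs above, while the fractions of frame time spent on image transfer are hypothetical values picked for the example, not measurements.

```python
# Totals copied from the x11perf output above (msec per 500x500 image).
PUTIMAGE_MS = 0.8477
SHMPUT_MS = 0.4433


def overall_speedup(transfer_fraction, transfer_speedup):
    """Amdahl's law: speedup of the whole frame when only the transfer step gets faster."""
    return 1.0 / ((1.0 - transfer_fraction) + transfer_fraction / transfer_speedup)


# How much faster the transfer step itself became.
transfer_speedup = PUTIMAGE_MS / SHMPUT_MS
print(f"transfer step speedup: {transfer_speedup:.2f}x")

# Hypothetical shares of total frame time spent transferring the image.
for fraction in (0.05, 0.25, 0.50):
    total = overall_speedup(fraction, transfer_speedup)
    print(f"transfer = {fraction:.0%} of frame time -> overall speedup {total:.2f}x")
```

If image transfer is only a small share of the work per frame, even a near-2x gain in that step barely moves the total, which is the point being made here.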

In the old days, this data transfer was very slow, so optimizing it with MIT-SHM made a lot of sense. Nowadays, graphics cards are very fast (even at high resolutions like the ones I use), so optimizing with MIT-SHM does not buy you much in total application speed.

Score: 2

RE[4]: Exaggerating
by BSDfan on Sun 29th Jun 2008 14:18 in reply to "RE[3]: Exaggerating"
BSDfan Member since:
2007-03-14

My Pentium 4 system running OpenBSD 4.3 (r128 driver):

$ x11perf -putimage500
x11perf - X11 performance program, version 1.5
The X.Org Foundation server version 10400090 on :0.0
from xxx.xxx
Sun Jun 29 10:04:07 2008

Sync time adjustment is 0.0857 msecs.

320 reps @ 16.6892 msec ( 59.9/sec): PutImage 500x500 square
320 reps @ 16.6960 msec ( 59.9/sec): PutImage 500x500 square
320 reps @ 16.7091 msec ( 59.8/sec): PutImage 500x500 square
320 reps @ 16.7655 msec ( 59.6/sec): PutImage 500x500 square
320 reps @ 16.6939 msec ( 59.9/sec): PutImage 500x500 square
1600 trep @ 16.7108 msec ( 59.8/sec): PutImage 500x500 square

$ x11perf -shmput500
x11perf - X11 performance program, version 1.5
The X.Org Foundation server version 10400090 on :0.0
from xxx.xxx
Sun Jun 29 10:04:44 2008

Sync time adjustment is 0.0856 msecs.

1600 reps @ 3.4037 msec ( 294.0/sec): ShmPutImage 500x500 square
1600 reps @ 3.4092 msec ( 293.0/sec): ShmPutImage 500x500 square
1600 reps @ 3.4257 msec ( 292.0/sec): ShmPutImage 500x500 square
1600 reps @ 3.4064 msec ( 294.0/sec): ShmPutImage 500x500 square
1600 reps @ 3.4086 msec ( 293.0/sec): ShmPutImage 500x500 square
8000 trep @ 3.4107 msec ( 293.0/sec): ShmPutImage 500x500 square

I definitely see a speed increase. Stop thinking only about how much it will improve performance on modern systems and realize that people DO use older systems. (Not that the Pentium 4 is ancient, but I don't have my Alpha system nearby.)
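
For a side-by-side view, here is a small sketch that compares the totals ("trep" lines) from the two sets of runs quoted so far. The numbers are copied from the output above; the machine labels are just shorthand for this example. It should come out at roughly 1.9x (about 0.4 msec saved per image) on the Q6600 and roughly 4.9x (about 13.3 msec saved per image) on the Pentium 4.

```python
# Per-image totals (msec) copied from the "trep" lines quoted in this thread.
RESULTS_MS = {
    "dpeterc (Q6600, Nvidia 8600 GTS)": {"putimage": 0.8477, "shmput": 0.4433},
    "BSDfan (Pentium 4, r128)": {"putimage": 16.7108, "shmput": 3.4107},
}


def compare(putimage_ms, shmput_ms):
    """Return (speedup factor, msec saved per image) for ShmPutImage vs PutImage."""
    return putimage_ms / shmput_ms, putimage_ms - shmput_ms


for machine, timings in RESULTS_MS.items():
    speedup, saved_ms = compare(timings["putimage"], timings["shmput"])
    print(f"{machine}: {speedup:.2f}x faster, {saved_ms:.2f} msec saved per image")
```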

EDIT: This does not in any way mean I'm against the networking principles of X, but for a local workstation that won't be listening on TCP (i.e. -nolisten tcp), local optimizations are a good idea.

Edited 2008-06-29 14:22 UTC

Score: 3

RE[5]: Exaggerating
by Ekorn on Sun 29th Jun 2008 14:55 in reply to "RE[4]: Exaggerating"
Ekorn Member since:
2006-03-17

This is interesting. Running Debian Lenny on a Core 2 Duo 2.4 GHz with integrated Intel GMA965:

$ time x11perf -putimage500
x11perf - X11 performance program, version 1.2
The X.Org Foundation server version 10400090 on :0.0
from yggdrasil
Sun Jun 29 17:02:43 2008

Sync time adjustment is 0.0228 msecs.

8000 reps @ 0.7075 msec ( 1410.0/sec): PutImage 500x500 square
8000 reps @ 0.6736 msec ( 1480.0/sec): PutImage 500x500 square
8000 reps @ 0.6596 msec ( 1520.0/sec): PutImage 500x500 square
8000 reps @ 0.6818 msec ( 1470.0/sec): PutImage 500x500 square
8000 reps @ 0.6738 msec ( 1480.0/sec): PutImage 500x500 square
40000 trep @ 0.6793 msec ( 1470.0/sec): PutImage 500x500 square


real 0m33.953s
user 0m9.421s
sys 0m9.337s

$ time x11perf -shmput500
x11perf - X11 performance program, version 1.2
The X.Org Foundation server version 10400090 on :0.0
from yggdrasil
Sun Jun 29 17:03:29 2008

Sync time adjustment is 0.0230 msecs.

8000 reps @ 0.9115 msec ( 1100.0/sec): ShmPutImage 500x500 square
8000 reps @ 0.9155 msec ( 1090.0/sec): ShmPutImage 500x500 square
8000 reps @ 0.9105 msec ( 1100.0/sec): ShmPutImage 500x500 square
8000 reps @ 0.9106 msec ( 1100.0/sec): ShmPutImage 500x500 square
8000 reps @ 0.9121 msec ( 1100.0/sec): ShmPutImage 500x500 square
40000 trep @ 0.9120 msec ( 1100.0/sec): ShmPutImage 500x500 square


real 0m43.488s
user 0m0.356s
sys 0m0.728s
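
A quick sketch of what these two runs add up to, using only the figures printed above. One caveat, stated as an assumption about the measurement: `time` reports CPU time for the x11perf client process only, so work done inside the X server is not counted, and lower client CPU time does not by itself mean less total work.

```python
# Figures copied from the two `time x11perf` runs above.
RUNS = {
    "PutImage": {"per_image_ms": 0.6793, "real_s": 33.953, "user_s": 9.421, "sys_s": 9.337},
    "ShmPutImage": {"per_image_ms": 0.9120, "real_s": 43.488, "user_s": 0.356, "sys_s": 0.728},
}


def client_cpu_seconds(run):
    """CPU time charged to the x11perf client (user + sys), excluding the X server."""
    return run["user_s"] + run["sys_s"]


put, shm = RUNS["PutImage"], RUNS["ShmPutImage"]

# Values above 1.0 mean ShmPutImage took longer per image than PutImage.
slowdown = shm["per_image_ms"] / put["per_image_ms"]
# How many times more client CPU the PutImage run consumed.
cpu_ratio = client_cpu_seconds(put) / client_cpu_seconds(shm)

print(f"ShmPutImage time per image relative to PutImage: {slowdown:.2f}x")
print(f"client CPU: {client_cpu_seconds(put):.3f} s vs {client_cpu_seconds(shm):.3f} s "
      f"({cpu_ratio:.1f}x less with ShmPutImage)")
```

On this machine ShmPutImage comes out about 1.34x slower per image, while the client process burns roughly 17x less CPU time.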


Edit: Added 'time' to the commands.

Edited 2008-06-29 15:04 UTC

Score: 1