Linked by Tony Bourke on Wed 28th Jan 2004 19:51 UTC
Benchmarks When doing research for my evaluation of Solaris 9 on my Ultra 5, I kept running into one comment over and over again: Sun's C compiler produces much faster code than GCC does. However, I couldn't find one set of benchmarks to back this up. (If you know of any, drop me an email.) Could this be yet another case of rumor-taken-as-fact?

Missed a statistic
by Will on Wed 28th Jan 2004 20:07 UTC

Don't know either way, but it would be interesting to know what the COMPILE times were, to see how the Sun and GCC compilers fared for just daily coding.

Also, all of these programs are (I believe) pure C; some C++ programs would be very interesting as well.

(I know, he goes through all of this effort, and yet all we do is criticize anyway. Not my intention, seemed like a good effort.)

great article
by andy richter on Wed 28th Jan 2004 20:12 UTC

Thanks so much, I've really enjoyed your series so far on your old sparc box. Keep them coming!

My only question, which I have a feeling you're gonna address in future articles: did you breathe new life into the sparc? Or is it gonna sit next to your desk after all this is over?

Future ?
by Anonymous on Wed 28th Jan 2004 20:16 UTC

What would be great now is to have some comments from the gcc team. Why is gcc slower? Is it just some very specific areas (e.g. floating point, long long arithmetic)? What's planned in the future for getting gcc to produce faster code?

Another thing I miss in the comparison is standards compliance. How well do gcc and the Sun compiler conform to the relevant C and C++ standards?

Re: great article
by Wee-Jin Goh on Wed 28th Jan 2004 20:18 UTC

It's definitely a good article. A real joy to read. Keep up the good work.

I think this suggests that the earlier conclusion that 32-bit is faster than 64-bit is in fact entirely dependent on GCC. Sun's compiler seems to consistently generate 64-bit code that is as fast as or faster than its 32-bit code.

Re: Anonymous
by Bascule on Wed 28th Jan 2004 20:29 UTC

What would be great now is to have some comments from the gcc team. Why is gcc slower? Is it just some very specific areas (e.g. floating point, long long arithmetic)? What's planned in the future for getting gcc to produce faster code?

gcc for sparcv9 is relatively new. Thanks to gcc's modular backend, the difference isn't severe in these examples, although from personal experience I've seen much more drastic differences in performance of binaries compiled with the Forte Compiler Collection tools versus gcc. When I tried compiling one of our grid analysis tools here (which makes extensive use of 64-bit integers and floating point math) with both compilers, the version compiled with the Sun compiler ran about 40% faster than the version compiled with gcc 3.3.

Sun has been fine tuning their compilers for sparcv8/v9 for over a decade, and it certainly shows.

Another thing I miss in the comparison is standards compliance. How well do gcc and the Sun compiler conform to the relevant C and C++ standards?

By default gcc has some very odd behavior. With the -ansi -pedantic flags gcc behaves a bit nicer. Obviously Sun's compiler doesn't support a lot of the gcc-specific extensions which some have seen fit to use (e.g. variable argument macros, nested functions, etc.). I believe it's the responsibility of all programmers to ensure their code is portable across a number of compilers and not bound to a particular toolchain.

Most of the problems you'll run into trying to compile code developed primarily with gcc/x86 without a lot of portability testing are going to be in things like endianness issues and addressing misaligned words, which is, in my opinion, a coding error.
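To make the endianness point concrete, here is a minimal sketch. Python is used purely for illustration and the byte values are made up. It shows the same four bytes decoding to different integers under little-endian order (x86) and big-endian order (SPARC), which is what bites code that dumps raw integers to disk or the network without fixing a byte order.

```python
import struct

# Four hypothetical bytes, e.g. read from a file written by another machine.
raw = bytes([0x12, 0x34, 0x56, 0x78])

# "<I" decodes a little-endian unsigned 32-bit integer (x86 byte order);
# ">I" decodes a big-endian one (SPARC byte order).
as_little = struct.unpack("<I", raw)[0]
as_big = struct.unpack(">I", raw)[0]

print(hex(as_little))  # 0x78563412
print(hex(as_big))     # 0x12345678
```

Portable code picks one byte order for its file and wire formats and converts explicitly, rather than relying on whatever the host CPU happens to use.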

may the speed be with you!
by TrollBoy on Wed 28th Jan 2004 20:31 UTC

64-bit versus 32-bit
by TonyB on Wed 28th Jan 2004 20:56 UTC

Yeah, that was a big surprise, which seemed to mean that 64-bit binaries with Sun's compiler, despite applications not specifically written for 64-bit, seem to be faster.

Still, the fact that performance wasn't all that far off says a lot about the quality of GCC, and why it's in such wide use. They've done a fantastic job.

More C++ would be nice
by tuttle on Wed 28th Jan 2004 20:57 UTC

What about a benchmark that tests the more advanced C++ features such as the STL? That would be nice.

But other than that it is a very good article. The performance of GCC is really impressive. I wonder if this is why KDE3.2 is so fast.

Floating Point Benchmarks missing
by Jason Denton on Wed 28th Jan 2004 21:00 UTC

All the programs examined here appear to be integer heavy. Sparc chips are not great integer performers; those who buy them are not looking for integer performance.

Floating point performance is where Sparc really shines, and is a spot where GCC is traditionally bad. The Sparc compiler does amazing things to floating point heavy code. The code I run is fp heavy, and the sparc compilers make a noticeable difference.

The author of this article has really missed the boat. If you want to run standard GNU/Linux type stuff like what was benchmarked here, buy a cheap intel box and run linux. But if you need fp, get the Solaris stuff.

Re: TonyB
by Bascule on Wed 28th Jan 2004 21:00 UTC

Yeah, that was a big surprise, which seemed to mean that 64-bit binaries with Sun's compiler, despite applications not specifically written for 64-bit, seem to be faster.

It's not that much of a surprise considering the majority of executables that Sun ships with Solaris are 32-bit...

Tendra ?
by Anonymous on Wed 28th Jan 2004 21:01 UTC

These comparisons are great !
Now, anyone up for benchmarking Tendra vs gcc on x86/linux/*bsd ?

CFlags
by Siti on Wed 28th Jan 2004 21:15 UTC

First off good article.

But with the benchmarking CFlags for gcc, I think it would be interesting to use: "-O2 -march=ultrasparc -fomit-frame-pointer" instead of "-O3 -mcpu=ultrasparc".
-march generates code for the specified CPU without backwards compatibility, so it should in theory generate faster code ;)
Also, I have found that -O3 slows some programs down, and -fomit-frame-pointer always seems to speed up programs, but only slightly.

Although I think the conclusion will be the same that the sun compiler generates better code...

Re: Tendra
by RonG on Wed 28th Jan 2004 21:16 UTC

Is Tendra even running on Linux?
And if so, where is the rpm?

c++
by craig on Wed 28th Jan 2004 21:22 UTC

I too would like to see some in-depth C++ comparisons, if the author could provide them.
Thanks

Disk I/O
by drsmithy on Wed 28th Jan 2004 21:28 UTC

Wouldn't the easiest way to keep I/O speed out of the equation for the gzip tests be to redirect the output to /dev/null? I.e.: time gzip -dc /some/file.gz > /dev/null.
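The same idea can be scripted so that each run is timed the same way. The following is only a sketch: `time_command` is a hypothetical helper, not something from the article, and it measures wall-clock time rather than the user/sys split that the shell's `time` reports. The stand-in command is there so the sketch runs anywhere.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run argv with stdout sent to the null device; return elapsed seconds."""
    start = time.perf_counter()
    # DEVNULL plays the role of "> /dev/null": output is discarded, so disk
    # write speed does not affect the measurement.
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start

# Stand-in command; on the benchmark machine this would be something like
# ["gzip", "-dc", "/some/file.gz"].
elapsed = time_command([sys.executable, "-c", "print('x' * 1000)"])
print(f"{elapsed:.3f} s")
```

Reading the input file still touches the disk, so this removes only the write side of the I/O cost.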

Disk I/O
by TonyB on Wed 28th Jan 2004 21:42 UTC

Good idea, I hadn't thought of that. I'm traveling now, so I don't have access to the system, but I'll give it a try when I do.

Don't have access? :-) n00b!
by Trollboy on Wed 28th Jan 2004 22:31 UTC

The problem with GCC on alternate platforms is that Sun has the same advantage as SGI: they produce their own hardware, so Sun's compilers will outperform GCC because they are optimized for that platform.

Where Sun has always shined is on IO (IMHO).
by Trollboy on Wed 28th Jan 2004 22:59 UTC

Why no IO benchmarks? The reason for me to get Sun has always been the ability to run a superior OS (SunOS). Now there is Linux for the Intel CPUs, no reason to get Sun. Why get a Sun now? Intel still chokes on heavy IO (I2O? Dunno... never could find a board that supported it OK). How about someone gives Tony a 450 loaded with RAM and disks so we can see some IO benchmarks?

Re: The Problem with GCC on alternate platforms
by tuttle on Wed 28th Jan 2004 23:36 UTC

Yes. If you take that into account, the performance of GCC is even more impressive.

way to go gcc
by Anonymous on Thu 29th Jan 2004 00:11 UTC

Much of the gcc effort revolves around x86. Thus, if gcc indeed comes so close to Workshop's performance, it's very impressive. However, as Jason already pointed out, to make any kind of statement the author should have run the full CPU2000 suite. Just gzip is not enough to evaluate anything.

Please try -Os and -O2 options as well.
by gps on Thu 29th Jan 2004 00:28 UTC

gcc 3.x's -O3 sometimes produces marginally faster code on x86 cpus (it's not worth it; stick to -Os for general things and -O2 for hot spot code that gets executed a lot).

I've found that -O2/-O3 being faster does not hold on other architectures. For instance, on my Alpha, gcc 3.x's -Os produces the fastest code (faster than -O2, -O3 or -O).
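Comparing optimization levels fairly means building the same source once per level and keeping the binaries apart. A small sketch of generating those build commands follows; the file name `bench.c` and the helper `build_command` are made up for illustration, and the commands are only printed, not executed.

```python
import shlex

SOURCE = "bench.c"               # hypothetical benchmark source file
LEVELS = ["-Os", "-O2", "-O3"]   # the levels suggested in this thread

def build_command(level, source=SOURCE):
    """Return the gcc argv that compiles source at the given optimization level."""
    output = "bench" + level.replace("-", "_")  # e.g. bench_O2
    return ["gcc", level, "-o", output, source]

for level in LEVELS:
    # Print the command rather than run it, so the sketch works without gcc.
    print(shlex.join(build_command(level)))
```

Each resulting binary would then be timed under identical conditions, several runs apiece.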

Why use low-end hardware
by Mike on Thu 29th Jan 2004 00:50 UTC

I think the Sun compilers would perform a bit better than gcc on higher end Sun hardware. A U5 even in its day wasn't a terribly impressive machine compared to the U60 workstation line. Today's workstations vs desktops are even more spread out if you compare CPU cache and memory bandwidth. A SunBlade 2k versus a Blade 100 is a significant difference.

RE:Please try -Os and -O2 options as well.
by root on Thu 29th Jan 2004 00:57 UTC

-Os produces buggy code sometimes. It is not advisable to use it for most code. I know Gnumeric, Abiword and some GNOME games segfault and crash with -Os.

Pricing
by MJ on Thu 29th Jan 2004 02:12 UTC

But Tony, Sun's compiler is so expensive! Yes, it is expensive, especially when compared to the free (as in beer and freedom) GCC. However, Sun does offer a free 60-day evaluation license for their compiler suite, which is what I used.

For Sun's compiler, I used the 60-day trial for Sun ONE Studio 7, which can be found here. It includes C, C++, and Fortran compilers, as well as Java and other development tools. The full version lists for $2,995.


It is possible to obtain these compilers for a substantially lower price. See:

http://wwws.sun.com/software/cover/2003-1027/index.html

If you run your own small business, you could get it for as low as $105/yr, which, compared to the other price is quite a deal.

Mr. Bourke,

What's the difference between the compilers when profile directed optimization comes into play?

In regards to I/O bound issues, as an addition to /dev/null, you could try running inside of /tmp and/or switching your UFS filesystems to logging,noatime.

Yours truly,
Jeffrey Boulier

Note on the graphs used
by Geoff Wozniak on Thu 29th Jan 2004 03:19 UTC

I would greatly appreciate it if the actual numbers from the tests were given. By just looking at the graphs, it is difficult to tell just how pronounced the differences are.

It is good to see someone at least trying to verify the "Sun vs. GNU" statements.

Geoff

Re: tuttle
by Bascule on Thu 29th Jan 2004 03:30 UTC

Yes. If you take that into account, the performance of GCC is even more impressive

Not really. The vast majority of code optimizations are not platform specific. gcc sports a modular backend which makes it easy to port to other architectures. When faced with complex mathematical code, I've found (see my previous post in this thread) gcc to be a poor performer on sparcv9 (doing calculations based on large sets of 64-bit fixed point grid coordinates).


Reading your story about how gcc is almost as fast as Sun's compiler, I thought I'd try it for myself given that I have both of them handy. I use Sun's Forte compiler on my product on Solaris/SPARC.

Here are the compilers that I currently have on a 300 MHz Ultra 60.


gcc: gcc version 3.3
cc: Forte Developer 7 C 5.4 2002/03/09

I'm doing 32 bit only, for now.

The application will be a memory manager, so one can safely say that this is integer based through and through. Lots of loads and stores with many register ops in between. There isn't any reading/writing to/from disk, so no bottlenecks there. Just straight load/store/register operations.

Compiler flags used: I don't know if I'm using the right compiler flags for gcc, and in fact, maybe I could use better compiler flags on cc as well, but nevertheless, here's what I'm currently using. If anyone wants to suggest better flags to try, please let me know and I can rerun the tests.

Compiler flags SPARC and INTEL platforms:

gcc flags: -O3
-fexpensive-optimizations
-finline-functions
-ffast-math
-fomit-frame-pointer

cc flags: -fast

The numbers are the time it takes to complete the test, so lower numbers are better:

gcc: 55s
cc: 44s

Did each run 5 times, took the average.

Sun is 25% faster by my math.
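The 25% figure depends on which way the ratio is taken, so it is worth spelling out. A short worked check using the two times quoted above (the variable names are just for illustration):

```python
gcc_seconds = 55.0  # average run time of the gcc build, from the post
cc_seconds = 44.0   # average run time of the Sun cc build, from the post

# Speed ratio: how many times faster the cc binary finishes the same work.
speedup = gcc_seconds / cc_seconds          # 1.25, i.e. "25% faster"

# Time saved: the fraction of the gcc run time that cc avoids.
time_saved = 1 - cc_seconds / gcc_seconds   # 0.2, i.e. "20% less time"

print(f"cc is {100 * (speedup - 1):.0f}% faster; run time is {100 * time_saved:.0f}% shorter")
```

Both statements describe the same measurement: "25% faster" is the speed ratio, while the run time itself drops by 20%.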

Sun's compiler is much faster. It takes longer to compile the application, but that's not what is important. What's important is how fast the resulting binaries are.

On the INTEL side, I get the following results from using the following compilers.

gcc: gcc version 3.3.2
cc: Sun WorkShop 6 update 2 C 5.3 2001/05/15


Execution times (lower times are better):

gcc: 95s
cc: 95s

On INTEL, in my application, they are equal.


Any questions, email me: balson@attbi.com



Jim




More individual benchmark runs?
by MJ on Thu 29th Jan 2004 07:46 UTC

This may well be a minor point, but I think it would be helpful if Tony ran his benchmarks more than 3 times apiece and averaged the results. While he claims that his numbers were pretty much consistent, I think it would be instructive to run the benchmarks a large number of times to be sure that his data is mostly accurate. With a large dataset, you can easily pick off the outlying points and get a better idea of what your distribution is. My experience is that you mostly get a lot of hits around one datapoint; however, finding bi-modal cases can be an indication of complex/interesting behavior that might warrant further investigation. It's a minor point, but it would certainly lend additional credibility to his claims.
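As a rough illustration of picking off outlying points, the sketch below flags runs that sit far from the median. The timings are invented for the example, and the cutoff of 5 median absolute deviations is an arbitrary choice, not a standard taken from the article.

```python
import statistics

# Hypothetical wall-clock times in seconds for ten runs of one benchmark.
runs = [44.1, 43.9, 44.3, 44.0, 44.2, 43.8, 44.1, 51.6, 44.0, 44.2]

median = statistics.median(runs)
# Median absolute deviation: a spread estimate that outliers barely move.
mad = statistics.median(abs(t - median) for t in runs)

# Flag runs more than 5 MADs away from the median (arbitrary cutoff).
outliers = [t for t in runs if abs(t - median) > 5 * mad]
kept = [t for t in runs if t not in outliers]

print(f"median={median:.2f} mad={mad:.2f} outliers={outliers}")
print(f"kept: mean={statistics.mean(kept):.2f} stdev={statistics.stdev(kept):.2f}")
```

A single stray run like the 51.6 s one would be flagged here. A bi-modal distribution would not show up as isolated outliers, so plotting a histogram of the runs is still worthwhile.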

My conclusion is ..
by Arend on Thu 29th Jan 2004 08:51 UTC

that for us it ain't worth investing in proprietary compilers.

I work for a small company producing an open, cross-platform Seismic interpretation platform (www.opendtect.org).

We compile our suite with gcc on all platforms (including win32) and that is where the great benefit of gcc lies: it is the same on all platforms. If it compiles on Linux, it almost certainly also compiles on Solaris, SGI and even win32 if you leave differences in APIs out of the equation.
Whatever you do with templates and whatever clever C++ constructs you might come up with, if it compiles on one platform, it compiles everywhere.

And yes, it might cost some performance, but I don't think a performance loss in the order of 5-10% on average is a big deal compared to the benefits of having a single, standards-compliant compiler across all platforms.
This is beneficial not only to the developer (after all, the customer 'pays' the performance penalty, not the developer) but also to the customer, because if I spend less time porting and debugging, I can spend more time developing new features or writing better documentation.

By the way, besides the compiler there is also gdb, which is also the same across all platforms, even though the SGI one does not support debugging of multi-threaded applications.

So, I personally prefer the cross-platform benefits of gcc over the performance gain of proprietary alternatives without one second of hesitation.

RE: GNU GCC versus Sun's Compiler
by Borje lindh on Thu 29th Jan 2004 08:54 UTC

If you'd like to see a bit more performance difference,
try a really FP intensive code like Dyna, use the latest compiler (currently version 8), enable full optimization
at least "f90 -fast -v9b" or similar, and run on a modern machine (UIIICu, UIIIi or UIV processor).

option
by nico on Thu 29th Jan 2004 12:48 UTC

For the gcc compiler -march=v9 should work better.
-funroll-all-loops too.
-fomit-frame-pointer could give you 10% more performance.

You should also try -Os and -O2

So, okay. This is a decidedly integer-heavy test. Given that at least one other person has commented on the floating point focus of scc and the solaris platform, I'd love to see some good heavy floating point done. Since we're talking about real-world performance, it might also be a good idea to get some fixed-point in there.

I'm a little surprised that you're using GCC 3.3.2, when GCC 3.5 is out. GCC 3.5 produces impacts on ARM7 code speed as much as 15%; I would be excited to learn how it performed on the Sparc, both in comparison to scc and to older GCCs.

I'd also like to see how the compilers hold up to Dhrystone, under general-approach algorithms like the Mersenne Twister or Boost's Lagged Fibonacci RNG generators; to large-scale large-variance code like the Boost regression tests, the Loki tests and some or another STL test suite; to large patterned number manipulation like GIMPS or a GMP rigor test; et cetera.

This is a wonderfully neat page, but it could use some work in the way the tests are done. I suggest a look at David Welch's compiler performance page for the GameBoy Advance, which though just one test is a test done in a rather more rigorous fashion. http://www.dwelch.com/gba/dhry.htm

Great, but three times is not enough...
by Juan Zuluaga on Thu 29th Jan 2004 19:25 UTC

Your articles are great, but I would suggest running the test more than three times, say at least ten times, to be able to use common statistical tools to evaluate significance of your results.
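One common tool for this kind of comparison is Welch's t statistic, which compares two sets of timings without assuming equal variance. The sketch below uses invented timings (ten runs per compiler) and a hypothetical helper `welch_t`; it illustrates the idea and is not a result from the article.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    var_b = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)

# Hypothetical timings in seconds, ten runs per compiler.
gcc_runs = [55.2, 54.8, 55.1, 55.4, 54.9, 55.0, 55.3, 54.7, 55.1, 55.0]
cc_runs = [44.1, 43.9, 44.3, 44.0, 44.2, 43.8, 44.1, 44.4, 44.0, 44.2]

t = welch_t(gcc_runs, cc_runs)
print(f"t = {t:.1f}")
```

With around 18 degrees of freedom, a two-sided test at the 5% level needs |t| above roughly 2.1. Three runs per configuration leave too few degrees of freedom for such a test to say much, which is the point of asking for ten or more.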

No optimization for GCC
by Russ on Thu 29th Jan 2004 23:42 UTC

I'm not an expert (far from it) in compiler technology, but it seems to me that there would be some improvement in performance with the GCC compilers if they had both been built from source with the Sun cc compiler. The problem I see with the setup used here is that a binary version of the GCC 3.2 compiler was installed, so it was likely optimized for a different piece of hardware or not optimized at all, and then the 2.95 version was built with this non-optimized compiler.

SIMD instruction support
by Andre' on Fri 30th Jan 2004 02:25 UTC

I recall that the SPARC processor has an equivalent instruction set to MMX. I also recall that the Sun C compiler supports this and that GCC does not. For those of us involved in signal processing and image processing this would be a big win.

Also this would seemingly be a very big win for the Sun C compiler.

Any thoughts on this?

Also, it would be really nice to see some benchmarks with more floating point intensive operations.

- Andrew

higher-end features not covered
by Derek on Fri 30th Jan 2004 06:10 UTC

While the article is interesting, it neglects to mention that Sun's C compiler supports a lot of things that gcc does not, like automatic parallelization (-xautopar) and OpenMP support. I've found that these can provide big wins on SMP machines (I've done some testing on a 6-processor V880 at my university).

Take a look at the cc man page - http://developers.sun.com/tools/cc/documentation/s1s8cc_documentati.... About half of it is dedicated to the various optimization flags.

Hi Tony,

Your article makes for interesting reading and it is very close to my own experience using SunOS and Solaris for more than 15 years. Yet I must differ with you on the actual rigor of the tests, because the optimization flags you used are *very different*; as it stands, your performance comparisons are equivalent to comparing apples and oranges. In order to have equivalent testing conditions, you should have used "gcc -O3 -march=ultrasparc -mcpu=ultrasparc [-m64]", because using -mcpu alone, you only set up timer switches but take no advantage of particular register optimizations available in the target architecture. In the same vein, "-xfast" is the bane of the SunPro compilers: it creates binaries that, as a matter of fact, contain ABI incompatibilities with system shared libraries!!! Rather, you should have used "cc -xO3 -Olimit=<something very high> -xarch=v8plusa|v9a" to have equivalent binaries and therefore a valid comparative test.

In summary, if I were your technical editor or your academic supervisor, I'd have you repeat all the experiments with an adjusted experimental model.