As for your last sentence, isn't the consensus that GCC output performs poorly compared to commercial compilers such as Intel's?
I would be interested in seeing a fair comparison.
Apparently this is true. I googled a bit and found three benchmarks:
http://macles.blogspot.com/2010/08/intel-atom-icc-gcc-clang.html
http://multimedia.cx/eggs/intel-beats-up-gcc/
http://www.luxrender.net/forum/viewtopic.php?f=21&t=603
They're all from 2009 or 2010, and in all of them ICC beats GCC by quite a large margin, not to mention that ICC is much faster at the actual compiling, too. Quite surprising. What could the reason be, then? Why does an open-source compiler fare so poorly against a commercial one?
Edited 2011-01-30 03:58 UTC
"What could the reason be, then? Why does an open-source compiler fare so poorly against a commercial one?"
Without looking at the assembly, it's just speculation on my part.
I've read that GCC is rather ignorant of code and data locality and of CPU cache lines; therefore binary code placement is arbitrary rather than optimal. In theory, this could make a huge difference.
Function inlining is usually good, but only while the hot code still fits in the cache; beyond that point, further inlining is detrimental. This may be a weakness for GCC.
The compilers inject prefetch hints into the code; maybe GCC predicts the branches less accurately at compile time?
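For anyone curious what such hints look like at the source level, here is a minimal sketch using two builtins that GCC (and Clang) provide, `__builtin_expect` and `__builtin_prefetch`. The function name, the `LIKELY`/`UNLIKELY` macros, and the prefetch distance are made up for illustration, and whether hints like these help at all depends heavily on the CPU and the data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convenience macros around the GCC/Clang builtin. */
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Assumed prefetch distance in elements; a real value needs measuring. */
#define PREFETCH_AHEAD 16

/* Sums the non-negative entries, telling the compiler that negative
 * entries are expected to be rare. */
long sum_nonnegative(const int *values, size_t n)
{
    assert(values != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Ask for data we will need soon: read access (0), low temporal
         * locality (1). This is only a hint and may be ignored. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&values[i + PREFETCH_AHEAD], 0, 1);

        /* Mark the "skip" branch as the unlikely one. */
        if (UNLIKELY(values[i] < 0))
            continue;

        total += values[i];
    }
    return total;
}
```

The hints do not change the result, only how the compiler may lay out the branches and schedule the loads, so a wrong guess costs speed rather than correctness.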
GLIBC is notoriously bloated. I don't know if Intel links in its own streamlined C library? That might make a difference.
As for compilation time, ICC has an "unfair" advantage. If GCC had been compiled with ICC, then GCC itself might perform much better, though I'm not sure the GCC folks would want to admit to that.
Thinking about it further... the GCC bottleneck may not be the compiler at all but just the malloc implementation.
In an earlier post, I had mentioned that I made my own malloc which performs much better than GNU's malloc in multithreaded apps.
I think my implementation fits somewhere between ptmalloc and Hoard on the following chart.
http://developers.sun.com/solaris/articles/multiproc/multiproc.html
I developed mine from scratch, so I have no idea why GNU's malloc is slow, but I'm baffled as to why GNU continues to use a slow implementation.
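To make the multithreading point a bit more concrete, here is a minimal sketch of one common idea: give each thread its own cache of freed blocks so the fast path needs no lock. This is not the allocator described above, and it is not how ptmalloc or Hoard work internally; the names `cache_alloc`, `cache_free`, and `BLOCK_SIZE` are hypothetical, and it assumes a C11 compiler for `_Thread_local`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed block size; a real allocator has many size classes. */
#define BLOCK_SIZE 64

static_assert(BLOCK_SIZE >= sizeof(void *), "a free block must hold a pointer");

/* A free block stores the link to the next free block in its own bytes. */
typedef union block {
    union block *next;
    unsigned char payload[BLOCK_SIZE];
} block;

/* One free list per thread, so alloc/free on it need no locking. */
static _Thread_local block *free_list = NULL;

/* Returns a BLOCK_SIZE-byte block, reusing a cached one when possible. */
void *cache_alloc(void)
{
    block *b = free_list;
    if (b != NULL) {
        free_list = b->next;       /* fast path: pop from the thread's cache */
        return b;
    }
    return malloc(sizeof(block));  /* slow path: fall back to the system malloc */
}

/* Puts a block back on the calling thread's cache instead of freeing it. */
void cache_free(void *p)
{
    if (p == NULL)
        return;
    block *b = p;
    b->next = free_list;
    free_list = b;
}
```

The sketch deliberately ignores the hard parts: cached blocks are never returned to the system, and a block freed by a different thread than the one that allocated it simply migrates to that thread's list.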
Well, *I* am not surprised: remember that distributions very recently switched their JPEG rendering libraries and got a 20% performance improvement.
You can see this in two ways:
- the optimistic view: nice, a 20% improvement!
- the realistic view: JPEG is very old, and the instruction sets the improved library uses are very old too, so why are we only getting the 20% improvement now?
My view is that open-source developers like very flexible software combinations, so GCC compiles many languages for many architectures, but from a performance point of view the situation isn't very good.
1) Because GCC is *old* in terms of computer software. It has decades of cruft to take along for the ride.
2) GCC is portable. Its optimizations can only go so far without breaking compatibility with, say, the 68040.
3) Exactly because it is open source. That model means gradual evolution, almost never rewriting/reinvention.




"Even to this day it's often easy to beat a compiler's output"
"I quite doubt that. There's been plenty of discussion of this and the general consensus nowadays is that it's really, really hard to beat at least GCC's optimizations anymore."
I feel you've copied my phrase entirely out of context. I want to re-emphasize my point that C compilers are constrained to strict calling conventions, which imply shifting more variables between registers and the stack than would be necessary if done by hand.
I'm not blaming GCC or any other compiler for this; after all, calling conventions are very important for both static and dynamic linking. However, they can result in code which performs worse than hand-written code.
As for your last sentence, isn't the consensus that GCC output performs poorly compared to commercial compilers such as Intel's?
I would be interested in seeing a fair comparison.
Edited 2011-01-30 03:37 UTC