Linked by ebasconp on Fri 10th Jun 2011 22:22 UTC
Benchmarks "Google has released a research paper that suggests C++ is the best-performing programming language in the market. The internet giant implemented a compact algorithm in four languages - C++, Java, Scala and its own programming language Go - and then benchmarked results to find 'factors of difference'."
Thread beginning with comment 477053
RE[5]: GCC isn't all that great
by Valhalla on Mon 13th Jun 2011 06:02 UTC in reply to "RE[4]: GCC isn't all that great"
Valhalla Member since:
2006-01-24


"Stupid code? Sure it was a trivial example, but that was deliberate."

Having two comparisons within the loop was obviously poor coding in this example, which you expected the compiler to fix for you.
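For readers who do not have the earlier post at hand, the point about two comparisons can be sketched roughly as below. This is not the original snippet: the function names and the two length parameters (`len_a`, `len_b`) are assumptions made up for illustration. The first version tests both bounds on every iteration; the second computes the smaller bound once before the loop.

```c
#include <stddef.h>

/* Hypothetical reconstruction: both bounds are tested on every iteration. */
void add_two_checks(int *dst, const int *src, size_t len_a, size_t len_b)
{
    for (size_t i = 0; i < len_a && i < len_b; i++)
        dst[i] += src[i];
}

/* Same result, but the smaller bound is computed once up front,
   leaving a single comparison per iteration. */
void add_one_check(int *dst, const int *src, size_t len_a, size_t len_b)
{
    size_t n = len_a < len_b ? len_a : len_b;
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Both functions touch exactly the same elements; whether a particular compiler performs this rewrite on its own is the question being argued in this thread.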


"Do we really want to go down the route of saying programmers need to check up on GCC's assembly output?"

If you find that the performance is not what you'd expect from the given code, you will profile it and look at the assembly output of the performance hotspots. It doesn't matter whether the compiler is GCC, VC, ICC or Clang/LLVM.


"Should students be taught to avoid legal C constructs which give the GCC optimizer a hard time?"

Legal constructs do not equal efficient code. Compilers have never been able to turn shitty code into good code. If you know of one that can, please inform me, I'd buy it in a second.


"I wish someone could test this for us; Intel boasts very aggressive SSE optimization."

I compiled your snippet with Clang 2.9. It didn't vectorize it either until I exchanged the len vars with constants, just like in the case with GCC. Again, I doubt ICC would do it either.
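The change described here, swapping a run-time length for a constant, can be illustrated with the sketch below. It is a stand-in and not the snippet from the earlier post: the function names, the single `len` parameter and the value of `FIXED_LEN` are all assumptions chosen for illustration.

```c
#include <stddef.h>

#define FIXED_LEN 1024  /* assumed value, chosen only for illustration */

/* Trip count known only at run time. */
void add_runtime_len(int *dst, const int *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] += src[i];
}

/* Trip count is a compile-time constant. */
void add_fixed_len(int *dst, const int *src)
{
    for (size_t i = 0; i < FIXED_LEN; i++)
        dst[i] += src[i];
}
```

Compiling a file like this with optimization enabled and `-S` and then reading the generated assembly for each function is one way to repeat the comparison. What a given compiler does with either form depends on its version and flags.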

Again, this function within the context of a whole program would likely yield a different result (definitely if PGO was used). For instance, it would be interesting to dissect the output of some of the micro-benchmarks over at language-shootout.

On a slightly off-topic note, does anyone have any experience with the ekopath4 compiler suite? It appears that it is to be released as open source (GPLv3), and judging by the performance benchmarks it appears to offer some gpgpu solution:

http://www.phoronix.com/scan.php?page=article&item=phoronix_dirndl_...

Score: 2

Alfman Member since:
2011-01-28

"Having two comparisons within the loop was obviously poor coding in this example, which you expected the compiler to fix for you."

I think it surprised you that I was able to come up with an example, and now you're grasping at straws... I won't hold you to your original statements, so don't feel the need to defend them.


"If you find that the performance is not what you'd expect out of the given code, you will profile and look at assembly output of the performance hotspots, doesn't matter if it's GCC, VC, ICC, Clang/LLVM."

The difference is, there is no need to audit the compiler if it can be trusted to do a great job in the first place. The fact that we can reveal shortcomings by looking at GCC's asm dump implies that we are able to do better.

"Legal constructs does not equal efficient code. Compilers have never been able to turn shitty code into good code. If you know of one, please inform me, I'd buy it in a second."

Still more excuses. Why does GCC reorder and optimize some code paths but not others? The developer shouldn't have to mess with clean code just to make it perform better under GCC.


"I compiled your snippet with Clang 2.9, it didn't vectorize it either until I exchanged the len vars with constants just like in the case with GCC. Again I doubt ICC would do it either."

That's not really a sufficient answer.

Score: 2

Valhalla Member since:
2006-01-24


"I think it surprised you that I was able to come up with an example..."

Hardly. Depending on the number of integers to add and the memory alignment of the integer data (both of which are unknown to the compiler in this example), vectorizing this loop may very well turn out slower, afaik. There's a reason all the compilers support SSE intrinsics: the SSE registers are anything but general purpose registers. Both GCC and Clang/LLVM are considered strong compilers, and neither of them vectorized this snippet. You claim this proves them 'not all that great'; I say this 6-line example is anything but conclusive.

"The difference is, there is no need to audit the compiler if it can be trusted to do a great job in the first place. The fact that we can reveal shortcomings by looking at GCC's asm dump implies that we are able to do better."

So you are basically saying that you know a vectorized loop will outperform a non-vectorized loop no matter what the data length and data alignment are?

Because that's what this example entails: the compiler knows nothing about the data length or the data alignment at compile time. Not knowing this, the compilers (GCC and Clang) chose not to vectorize. When I gave them the data length, both compilers vectorized the loop (as I showed in an earlier post).
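To make the length and alignment concern concrete, here is a plain-C picture of the extra structure a vectorized loop implies when the length is unknown. It is only a sketch: no SSE intrinsics are used, and `LANES`, `is_aligned16` and `add_blocked` are hypothetical names introduced for illustration, not anything GCC or Clang actually emits.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* four 32-bit ints fit in one 128-bit SSE register */

/* Returns nonzero if p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* A blocked main loop plus a scalar loop for the leftovers. */
void add_blocked(int *dst, const int *src, size_t len)
{
    size_t i = 0;
    size_t blocked_end = len - (len % LANES);

    /* Main loop: each pass stands in for one 4-wide SIMD add. */
    for (; i < blocked_end; i += LANES) {
        dst[i]     += src[i];
        dst[i + 1] += src[i + 1];
        dst[i + 2] += src[i + 2];
        dst[i + 3] += src[i + 3];
    }

    /* Remainder loop: 0 to 3 elements handled one at a time. */
    for (; i < len; i++)
        dst[i] += src[i];
}
```

With fewer than four elements the main loop never runs and only the setup and the remainder loop are left, which is the short-length case where vectorizing buys nothing. A real vectorizer would also have to deal with pointers for which `is_aligned16` returns zero, for example by using unaligned loads or by peeling iterations until the data is aligned.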

"Still more excuses. Why does GCC reorder and optimize some code paths but not others? The developer shouldn't have to mess with clean code just to make it perform better under GCC."

Because no compiler is perfect, and again clean code does not equal efficient code. There's a reason you don't start questioning the compiler optimizations until you've questioned the actual algorithm.

"That's not really a sufficient answer."

It was a statement: neither GCC nor Clang/LLVM vectorized your snippet, and like I said, I doubt ICC would either.

Despite our bantering (or perhaps because of it!), I have to say it's fun discussing technical stuff here on OSNews once in a while. So thanks alfman, acobar and others for participating in this (imho) interesting discussion ;)

btw I'm surprised f0dder hasn't weighed in, I seem to recall him from win32 assembly forums way back in the day (maybe my memory is playing tricks on me)!

Score: 2