Linked by ebasconp on Fri 10th Jun 2011 22:22 UTC
Benchmarks "Google has released a research paper that suggests C++ is the best-performing programming language in the market. The internet giant implemented a compact algorithm in four languages - C++, Java, Scala and its own programming language Go - and then benchmarked results to find 'factors of difference'."
Thread beginning with comment 477006
RE[4]: GCC isn't all that great
by Alfman on Sun 12th Jun 2011 02:11 UTC in reply to "RE[3]: GCC isn't all that great"
Alfman Member since:
2011-01-28

"Maybe GCC should have been able to optimize one of the len comparisons away, but really it's (deliberately?) stupid code imo and there will always be these cases where the compiler fails to grasp 'the bigger picture'."

Stupid code? Sure it was a trivial example, but that was deliberate. You'll have to take my word that GCC has the same shortcomings on more complex code from real programs.

Even if you want to blame the programmer here, a seasoned programmer will have no reasonable way of knowing if GCC has optimized the loop correctly without looking at the assembly output. Do we really want to go down the route of saying programmers need to check up on GCC's assembly output?

Should students be taught to avoid legal C constructs which give the GCC optimizer a hard time?


"Secondly it seems obvious that gcc doesn't choose to use vectorization since it has no idea of how many integers are to be added, and depending on that it could very well be less efficient to use vectorization"

I have no idea why GCC chose not to use SSE, but the result is still that the assembly language programmer would be able to beat it.

"(I seriously doubt ICC would do so either with this snippet)"

I wish someone could test this for us, Intel boasts very aggressive SSE optimization.

Reply Parent Score: 3

moondevil Member since:
2005-07-08

You are forgetting something in your examples.

It used to be so that most humans could beat compiler generated code. In this day and age it is only true for small code snippets or simple processors.

Most up to date processors use out-of-order execution with superscalar processing units, and translate CISC instructions into microcode RISC like code. And this varies from processor model to processor model within the same family even!

It is very hard for most humans to keep all processor features in their heads while coding assembly and still be able to beat the code generated from high performance compilers. Not GCC, but the ones you pay several thousand euros/dollars for, with years of research put into them.

Reply Parent Score: 2

Alfman Member since:
2011-01-28

moondevil,

"You are forgetting something in your examples."

Fair enough, but what?

"It used to be so that most humans could beat compiler generated code. In this day and age it is only true for small code snippets or simple processors."

I was asked for an example, which I provided. Then I provided another example with division. In these examples, I'm not aware of any processor for which GCC generated optimal code, which is what I set out to demonstrate. Nothing more, nothing less.

"Most up to date processors use out-of-order execution with superscalar processing units, and translate CISC instructions into microcode RISC like code. And this varies from processor model to processor model within the same family even!"

I have some observations:

1. In practice x86 binary code is precompiled and shared among many models. If you have an Intel Core2 Q6600, you cannot simply purchase/download software specifically for that model.

2. I don't believe GCC even allows model-specific compilation (I doubt ICC does either, but I could be wrong). You can specify a processor family, but that's it.


"It is very hard for most humans to keep all processor features in their heads while coding assembly and still be able to beat the code generated from high performance compilers. Not GCC, but the ones you pay several thousand euros/dollars for, with years of research put into them."

Well there you go, in conclusion it seems that you do agree with my argument that we can beat GCC? I'm not moving goalposts here, this is what I said from the beginning - it's even in the title of the thread.

Edited 2011-06-13 01:48 UTC

Reply Parent Score: 2

Valhalla Member since:
2006-01-24


"Stupid code? Sure it was a trivial example, but that was deliberate."

Having two comparisons within the loop was obviously poor coding in this example, which you expected the compiler to fix for you.
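
For illustration only: the snippet under discussion is not reproduced in this part of the thread, so the following is a hypothetical sketch of what "two comparisons within the loop" could look like, next to a version where the bound is computed once before the loop. The function names and the two length parameters are assumptions made for this example, not the code that was originally posted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: both length checks are evaluated on every iteration. */
long sum_two_checks(const int *a, size_t len_a, size_t len_b)
{
    assert(a != NULL);
    long total = 0;
    for (size_t i = 0; i < len_a && i < len_b; i++)
        total += a[i];
    return total;
}

/* Same result, with the smaller bound computed once before the loop. */
long sum_hoisted(const int *a, size_t len_a, size_t len_b)
{
    assert(a != NULL);
    size_t n = len_a < len_b ? len_a : len_b;
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}
```

Both functions return the same sum for any pair of lengths. Whether a given compiler turns the first form into the second on its own is exactly the point being argued here, and it can only be settled by looking at the generated assembly.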


"Do we really want to go down the route of saying programmers need to check up on GCC's assembly output?"

If you find that the performance is not what you'd expect out of the given code, you will profile and look at assembly output of the performance hotspots, doesn't matter if it's GCC, VC, ICC, Clang/LLVM.


"Should students be taught to avoid legal C constructs which give the GCC optimizer a hard time?"

Legal constructs do not equal efficient code. Compilers have never been able to turn shitty code into good code. If you know of one, please inform me, I'd buy it in a second.


"I wish someone could test this for us, Intel boasts very aggressive SSE optimization."

I compiled your snippet with Clang 2.9, it didn't vectorize it either until I exchanged the len vars with constants just like in the case with GCC. Again I doubt ICC would do it either.
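
As a rough sketch of the two variants being compared (again hypothetical, not the original snippet): one summation loop whose trip count is only known at run time, and one whose trip count is a compile-time constant. The names and the value of `FIXED_LEN` are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

enum { FIXED_LEN = 16 };  /* hypothetical constant trip count */

/* Trip count known only at run time: the compiler cannot assume how
 * many elements will be added. */
int sum_runtime_len(const int *v, size_t len)
{
    assert(v != NULL);
    int total = 0;
    for (size_t i = 0; i < len; i++)
        total += v[i];
    return total;
}

/* Trip count fixed at compile time: the compiler knows exactly how
 * many elements there are. */
int sum_fixed_len(const int *v)
{
    assert(v != NULL);
    int total = 0;
    for (size_t i = 0; i < FIXED_LEN; i++)
        total += v[i];
    return total;
}
```

Compiling a file like this with something along the lines of `gcc -O3 -S` and reading the resulting assembly shows whether SSE instructions were emitted for either loop. The outcome depends on the compiler, its version, the flags, and the target, so no particular result is claimed here.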

Again, this function within the context of a whole program would likely yield a different result (definitely if PGO was used). For instance, it would be interesting to dissect the output of some of the micro-benchmarks over at language-shootout.

On a slightly off-topic note, does anyone have any experience with the ekopath4 compiler suite? It appears that it is to be released as open source (GPLv3), and judging by the performance benchmarks it appears to offer some GPGPU solution:

http://www.phoronix.com/scan.php?page=article&item=phoronix_dirndl_...

Reply Parent Score: 2

Alfman Member since:
2011-01-28

"Having two comparisons within the loop was obviously poor coding in this example, which you expected the compiler to fix for you."

I think it surprised you that I was able to come up with an example, and now you're grasping at straws... I won't hold you to your original statements, don't feel the need to defend them.


"If you find that the performance is not what you'd expect out of the given code, you will profile and look at assembly output of the performance hotspots, doesn't matter if it's GCC, VC, ICC, Clang/LLVM."

The difference is, there is no need to audit the compiler if it can be trusted to do a great job in the first place. The fact that we can reveal shortcomings by looking at GCC's asm dump implies that we are able to do better.

"Legal constructs do not equal efficient code. Compilers have never been able to turn shitty code into good code. If you know of one, please inform me, I'd buy it in a second."

Still more excuses. Why does GCC reorder and optimize some code paths but not others? The developer shouldn't have to mess with clean code just to make it perform better under GCC.


"I compiled your snippet with Clang 2.9, it didn't vectorize it either until I exchanged the len vars with constants just like in the case with GCC. Again I doubt ICC would do it either."

That's not really a sufficient answer.

Reply Parent Score: 2

f0dder Member since:
2009-08-05

"Do we really want to go down the route of saying programmers need to check up on GCC's assembly output?"
When you need maximum performance for a piece of code, yes. When you don't, simply don't worry about it - not generating optimal code doesn't matter if the code isn't a hotspot.

Reply Parent Score: 1