Linked by ebasconp on Fri 10th Jun 2011 22:22 UTC
Benchmarks "Google has released a research paper that suggests C++ is the best-performing programming language in the market. The internet giant implemented a compact algorithm in four languages - C++, Java, Scala and its own programming language Go - and then benchmarked results to find 'factors of difference'."
RE[7]: GCC isn't all that great
by Valhalla on Mon 13th Jun 2011 23:29 UTC in reply to "RE[6]: GCC isn't all that great"
Valhalla
Member since:
2006-01-24


"I think it surprised you that I was able to come up with an example,"

Hardly, depending on the number of integers to add and the memory alignment of the integer data (both of which are unknown to the compiler in this example), vectorizing this loop may very well turn out slower, afaik. There's a reason all the compilers support SSE intrinsics, they're anything but general purpose registers. Both GCC and Clang/LLVM are considered strong compilers, neither of them vectorized this snippet. You claim this proves them 'not all that great'; I say this 6-line example is anything but conclusive.

"The difference is, there is no need to audit the compiler if it can be trusted to do a great job in the first place. The fact that we can reveal shortcomings by looking at GCC's asm dump implies that we are able to do better."

You are basically saying that you know that a vectorized loop will outperform a non-vectorized loop no matter what the data length and data alignment are?

Because that's what this example entails, the compiler knows nothing about the data length and the data alignment at compile-time. And not knowing this, the compilers (GCC and Clang) chose not to vectorize. When I gave it the data length both compilers vectorized the loop (as I showed in an earlier post).
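
For readers without the earlier posts at hand, here is a minimal sketch of the kind of loop being argued about. The original 6-line snippet is not reproduced in this comment, so the function names `add_ints` and `add_ints_fixed` and the length 1024 are assumptions for illustration, not the code that was actually compiled. The first function gives the compiler no compile-time knowledge of length or alignment; the second fixes the trip count, which is the kind of change described above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the snippet under discussion: element count
   and pointer alignment are both unknown at compile time. */
void add_ints(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* Same loop with a trip count fixed at compile time.
   FIXED_LEN is an assumed value, chosen only for the example. */
#define FIXED_LEN 1024

void add_ints_fixed(int *dst, const int *src)
{
    for (size_t i = 0; i < FIXED_LEN; i++)
        dst[i] += src[i];
}
```

Comparing the assembly from `gcc -O3 -S` for the two functions is one way to check whether a particular compiler version vectorizes either loop; the outcome depends on the compiler version and target.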

"Still more excuses. Why does GCC reorder and optimize some code paths but not others? The developer shouldn't have to mess with clean code just to make it perform better under GCC."

Because no compiler is perfect, and again, clean code does not equal efficient code. There's a reason you don't start questioning the compiler optimizations until you've questioned the actual algorithm.

"That's not really a sufficient answer."

It was a statement, neither GCC nor Clang/LLVM vectorized your snippet, and like I said, I doubt ICC would either.

Despite our bantering (or perhaps because of it!), I have to say it's fun discussing technical stuff here on OSNews once in a while. So thanks alfman, acobar and others for participating in this (imho) interesting discussion ;)

btw I'm surprised f0dder hasn't weighed in, I seem to recall him from win32 assembly forums way back in the day (maybe my memory is playing tricks on me)!

Reply Parent Score: 2

Alfman Member since:
2011-01-28

Valhalla,


"There's a reason all the compilers support SSE intrinsics, they're anything but general purpose registers."

In theory, one could create intrinsics for every single assembly opcode available and then claim that it is the developer's fault that they don't get used. However, C is used by devs who don't want to program at the opcode level.

If you can get away with using intrinsics instead of inline assembly, then sure, go ahead. But they are not very portable between compilers or architectures.

And not every opcode we want to optimize has intrinsics. You haven't addressed the division example, I'd be very grateful if you could find a way to optimize 64bit / 32bit -> 32bit without using assembly.
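
To make the division case concrete, here is a sketch of what the operation looks like in portable C. The function name `div_64_by_32` is hypothetical, and the remark about code generation is general background rather than something measured in this thread: since plain C cannot express "the quotient is known to fit in 32 bits", a 32-bit x86 compiler will typically fall back to a full 64-bit division (often through a runtime helper) instead of a single `div` instruction.

```c
#include <stdint.h>

/* Hypothetical helper: 64-bit dividend, 32-bit divisor, 32-bit quotient.
   The caller must guarantee divisor != 0 and that the quotient fits in
   32 bits. The language gives no way to tell the compiler about the
   second guarantee, so the cast simply truncates the 64-bit result. */
uint32_t div_64_by_32(uint64_t dividend, uint32_t divisor)
{
    return (uint32_t)(dividend / divisor);
}
```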


"Both GCC and Clang/LLVM are considered strong compilers, neither of them vectorized this snippet."

Well if you say so, GCC usually doesn't score highly on benchmarks.


"Hardly, depending on the number of integers to add and the memory alignment of the integer data (both of which are unknown to the compiler in this example)"

Even so, GCC did a bad job.

Change the example so that the function only accepts one length, and GCC produces the SSE code. So it's clear that an unknown length was not the factor.


"Because that's what this example entails, the compiler knows nothing about the data length and the data alignment at compile-time. And not knowing this, the compilers (GCC and Clang) chose not to vectorize. When I gave it the data length both compilers vectorized the loop (as I showed in an earlier post)."

This is not strictly true, GCC can tell exactly how long the arrays are by looking at the rest of the program, it simply chooses not to optimize that way.

But this brings up another good point, which speaks to my first post (repeated here) "We as programmers can do a better job than GCC at anticipating certain states and algorithmic conditions which give GCC trouble."

There may be times when the developer knows things which we cannot reasonably expect the compiler to derive, nor does the C language provide the means for us to tell it. The result of this uncertainty is poorer optimization.
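
As an illustration of how limited those means are, the sketch below shows two hints that do exist. `restrict` is standard C99; `__builtin_assume_aligned` is a GCC/Clang extension available in reasonably recent versions, not part of the C language. The function name `add_ints_hinted` and the 16-byte alignment are assumptions for the example. Other facts a developer may know (value ranges, typical lengths, that a quotient fits in 32 bits) have no comparable standard spelling.

```c
#include <stddef.h>

/* `restrict` (C99) promises that dst and src do not overlap. */
void add_ints_hinted(int *restrict dst, const int *restrict src, size_t n)
{
#if defined(__GNUC__)
    /* GCC/Clang extension, not standard C: promise 16-byte alignment.
       Passing a pointer that is not 16-byte aligned is undefined behavior. */
    dst = __builtin_assume_aligned(dst, 16);
    src = __builtin_assume_aligned(src, 16);
#endif
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Whether these hints change the generated code at all depends on the compiler and target; they only remove reasons the compiler might have for being conservative.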

"It was a statement, neither GCC nor Clang/LLVM vectorized your snippet, and like I said, I doubt ICC would either."

We'll have to leave it to the unknown.



Your view is too extreme for my liking. What if an optimizer does not eliminate loop invariants because the developer's code calculated them each time? We could blame the "shitty code" instead of the compiler there too. In fact any time the compiler failed to optimize but did a literal translation of logic, we could argue the developer is to blame, right?

If that is not your view, then what are the criteria for non-optimizations which are the developer's fault versus those which are the compiler's fault?

(and I won't accept this answer: if GCC doesn't handle it, then it's the developer's fault)
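
The loop-invariant case mentioned above can be sketched as follows. Both function names and the arithmetic are made up for illustration. The first version is the literal logic a developer might write; the second is what loop-invariant code motion (or the developer, by hand) would produce.

```c
#include <stddef.h>

/* As written: scale * offset is recomputed on every iteration even
   though neither value changes inside the loop. */
void fill_naive(int *out, size_t n, int scale, int offset)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + scale * offset;
}

/* With the invariant hoisted out of the loop. The results are identical;
   only the amount of work per iteration differs. */
void fill_hoisted(int *out, size_t n, int scale, int offset)
{
    const int k = scale * offset;
    for (size_t i = 0; i < n; i++)
        out[i] = (int)i + k;
}
```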


"Despite our bantering (or perhaps because of it!), I have to say it's fun discussing technical stuff here on OSNews once in a while."

I much prefer technical stuff to gadgetry hype, but maybe I'm just jealous that I cannot afford a lifestyle where many gadgets come into play.

Reply Parent Score: 2

Valhalla Member since:
2006-01-24


"In theory, one could create intrinsics for every single assembly opcode available and then claim that it is the developer's fault that they don't get used."

Sure, in theory, but in practice we don't need that. However, pretty much every compiler out there supports SSE intrinsics, and there's a reason for that.


"And not every opcode we want to optimize has intrinsics. You haven't addressed the division example,"

I haven't seen such an example? Show me the code!


"Well if you say so, GCC usually doesn't score highly on benchmarks."

I haven't seen any fresh benchmarks outside Phoronix in quite some time (and GCC does very well there). Sadly, Multimedia Mike stopped doing his smackdowns. Do you know of any fresh online comparisons between GCC and ICC?


"This is not strictly true, GCC can tell exactly how long the arrays are by looking at the rest of the program,"

Is this back again to the two comparisons or are you suggesting that GCC should magically know the length of these arrays at compile-time? I'm not sure I follow.

"Your view is too extreme for my liking..."

Funny, I find your views too extreme. I think you need to be a good assembly programmer in order to beat a current optimizing compiler. Obviously there will be specific instances where the compiler generates subpar code; if that weren't the case, the compiler devs wouldn't have to work so hard on improving the optimization release after release. But generally I believe it holds true, and this 6-line function did not convince me otherwise.

"I much prefer technical stuff to gadgetry hype, but maybe I'm just jealous that I cannot afford a lifestyle where many gadgets come into play."

Heh, man if there's one thing I don't need it's another gadget to distract me ;) Instead of writing this post I should be finishing up as much work as I can before going on vacation, but yet I find myself here ;)


"This C/assembly debate seems to come up all the time, the next time it does, I might just sit it out. It takes so much time to make the case, and at the end it's totally inconsequential."

Please don't, I've found discussing this with you very interesting even if we hold what seems like opposing views, and it's not as if any of the comment discussions here on OSNews are anything but inconsequential.

Again thanks for this discussion Alfman.

Reply Parent Score: 2