Linked by ebasconp on Fri 10th Jun 2011 22:22 UTC
Benchmarks "Google has released a research paper that suggests C++ is the best-performing programming language in the market. The internet giant implemented a compact algorithm in four languages - C++, Java, Scala and its own programming language Go - and then benchmarked results to find 'factors of difference'."
RE[5]: GCC isn't all that great
by Carewolf on Tue 14th Jun 2011 09:32 UTC in reply to "RE[4]: GCC isn't all that great"
Carewolf Member since: 2005-09-08

I also rarely see GCC vectorizing integer loops (it seems to choke on signed vs unsigned). GCC does slightly better with floating-point loops, and it helps if you allow it to break some strict math rules.

As I noted in another comment, the newest version of GCC now enables -ftree-vectorize at -O3, so I was not fully up to date in my first reply.

If you compile for AMD64, SSE is automatically used for all math (scalar, not vectorized: one value at a time). You can also use SSE for math on IA32 using -mfpmath=sse. SSE math is generally much faster, but it removes the 80-bit temporaries quirk that x87 math has.

Score: 2

Alfman Member since: 2011-01-28

Carewolf,

"I also rarely see GCC vectorizing integer loops (it seems to choke on signed vs unsigned). GCC does slightly better with floating-point loops, and it helps if you allow it to break some strict math rules."

Really? I'd be surprised if signed vs unsigned was the reason it chokes. In two's complement, the calculation is identical regardless of sign.

0x43 + 0xfe (this is -2) = 0x41 (+carry flag)
0x43 - 2 = 0x41

0xf0 (this is -0x10) + 0xfd (this is -0x3) = 0xed (this is -0x13) (+carry flag)

0xfe (this is -2) * 0xfb (this is -5) = 0x0a (0xf9 carry)

In other words, GCC doesn't even have to care whether a variable is signed or unsigned in order to do "+ - *".

unsigned int x = -2;
unsigned int y = -10;
unsigned int z = 49;
printf("%u\n", x * y * z); // prints 980, the same as (-2) * (-10) * 49

GCC merely ignores the carry. (This is another one of my optimization gripes, in fact: assembly programmers can use the carry flag, while C programmers have to do the computation in the next larger word size, sometimes wasting a register.)


"If you compile for AMD64, SSE is automatically used for all math (scalar, not vectorized: one value at a time). You can also use SSE for math on IA32 using -mfpmath=sse. SSE math is generally much faster, but it removes the 80-bit temporaries quirk that x87 math has."

Yes, it's good to get away from quirky FP design.


Edit:

This C/assembly debate seems to come up all the time. The next time it does, I might just sit it out. It takes so much time to make the case, and in the end it's totally inconsequential.

Edited 2011-06-14 10:15 UTC

Score: 2