Linked by ebasconp on Fri 10th Jun 2011 22:22 UTC
Benchmarks "Google has released a research paper that suggests C++ is the best-performing programming language in the market. The internet giant implemented a compact algorithm in four languages - C++, Java, Scala and its own programming language Go - and then benchmarked results to find 'factors of difference'."
Thread beginning with comment 477063
RE[3]: GCC isn't all that great
by Carewolf on Mon 13th Jun 2011 09:45 UTC in reply to "RE[2]: GCC isn't all that great"
Carewolf Member since:
2005-09-08

After finding that gcc didn't perform these optimizations, have you considered trying gcc with the options that enable them: -funroll-loops and -ftree-vectorize?

It is not fair to blame the compiler for not doing optimizations it hasn't been asked to do. I know the default optimizations suck, but that is a well-known problem with gcc.

The auto-vectorizer isn't very good with integers, but give it a try.
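To make the flags above concrete, here is a minimal sketch in C. The function name add_arrays, the file name add.c, and the use of restrict are assumptions for illustration and are not taken from the code discussed in this thread. Whether GCC actually unrolls or vectorizes this loop depends on the GCC version and the target.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical kernel (not from the thread): element-wise integer add.
 *
 * Possible ways to build it, assuming the file is saved as add.c:
 *   gcc -std=c99 -O2 -c add.c
 *   gcc -std=c99 -O2 -funroll-loops -ftree-vectorize -c add.c
 *
 * To see what the vectorizer decided, older GCC 4.x releases accept
 * -ftree-vectorizer-verbose=2 and newer releases accept -fopt-info-vec.
 */
void add_arrays(int *restrict dst, const int *restrict a,
                const int *restrict b, size_t n)
{
    /* Sanity check only; it sits outside the loop being optimized. */
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));

    /* 'restrict' promises that the arrays do not overlap, which removes
       one common obstacle to vectorization. */
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Comparing the generated assembly (gcc -S) for the two command lines is one way to check whether the extra flags changed anything for a given GCC version.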

Edited 2011-06-13 09:46 UTC

Score: 2

Neolander Member since:
2010-03-08

Aren't those supposed to be automatically enabled by -O2 or -O3?

Score: 1

Carewolf Member since:
2005-09-08

Depends on the gcc-version.

I just double-checked: according to info:gcc, the most recent version (4.6) enables -ftree-vectorize at -O3, but still not -funroll-loops.

Unrolling is only enabled by default if you compile with profile data (-fprofile-use), which helps the compiler pick the right loops to unroll.

Score: 2

Alfman Member since:
2011-01-28

Carewolf,

Thanks for the feedback.

For me, GCC does use SSE both with and without '-ftree-vectorize' (when I tweak the C source code).

I couldn't get GCC to do vector math without changing the source file to calculate the number of loop iterations for GCC up front. In a case as simple as this, the compiler should have been able to work that out itself.
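The exact source is not shown in this thread, so the following is only a guess at the kind of rewrite meant here, as a minimal sketch in C. The names scale_ptr and scale_counted are hypothetical. Both functions compute the same result; the second one spells out the trip count before the loop starts.

```c
#include <assert.h>
#include <stddef.h>

/* Form A: the loop exit is a pointer comparison, so the number of
   iterations is only implicit in the code. */
void scale_ptr(int *p, const int *end, int k)
{
    assert(p <= end);   /* the range must not be reversed */
    while (p != end) {
        *p *= k;
        p++;
    }
}

/* Form B: the number of iterations is computed once, up front, and the
   loop counts from 0 to n. */
void scale_counted(int *p, const int *end, int k)
{
    assert(p <= end);   /* the range must not be reversed */
    size_t n = (size_t)(end - p);
    for (size_t i = 0; i < n; i++)
        p[i] *= k;
}
```

Which of the two forms a particular GCC release vectorizes is something to check against the vectorizer report or the generated assembly rather than assume.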

I know Valhalla is complaining about this specific example (not sure why?), but I do frequently come across issues like this in much more complex code where GCC misses an equally trivial optimization.

Score: 2

Carewolf Member since:
2005-09-08

I also rarely see GCC vectorize integer loops (it seems to choke on signed vs. unsigned). GCC does slightly better with floating-point loops, and it helps if you allow it to break some strict math rules, for example with -ffast-math.
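A floating-point sum is a simple case where the strict math rules matter. The sketch below is an assumed example, with a hypothetical function name sum_floats. Vectorizing this loop means adding the elements in a different order, and since floating-point addition is not associative, the compiler will not do that unless it is given permission.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical reduction (not from the thread).
 *
 * Possible builds, assuming the file is saved as sum.c:
 *   gcc -std=c99 -O3 -c sum.c              (strict floating-point rules)
 *   gcc -std=c99 -O3 -ffast-math -c sum.c  (reordering of additions allowed)
 */
float sum_floats(const float *a, size_t n)
{
    assert(n == 0 || a != NULL);   /* sanity check outside the loop */

    float sum = 0.0f;
    /* Each iteration depends on the previous value of 'sum'; a vectorized
       version keeps several partial sums and combines them at the end,
       which changes the order of the additions. */
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

With -ffast-math the result can differ in the last bits from the strict version, which is the trade-off being described.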

As I noted in another comment, the newest version of gcc now includes -ftree-vectorize in -O3, so I was not fully up to date in my first reply.

If you compile for AMD64, SSE is automatically used for all floating-point math (not vectorized, one value at a time). You can also use SSE for math on IA32 using -mfpmath=sse. SSE math is generally much faster, but it removes the 80-bit temporaries quirk that x87 math has.
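The 80-bit temporaries quirk can be made visible with a small experiment. This is a sketch under assumptions: the function names eval_method and overflow_roundtrip are hypothetical, FLT_EVAL_METHOD is the standard C99 macro from float.h, and the x87 behaviour described in the comments may vary with the GCC version and flags.

```c
#include <assert.h>
#include <float.h>

/*
 * Possible 32-bit builds, assuming the file is saved as fp.c:
 *   gcc -std=c99 -m32 -mfpmath=387 -c fp.c
 *   gcc -std=c99 -m32 -msse2 -mfpmath=sse -c fp.c
 */

/* Reports how floating-point expressions are evaluated (C99):
   0 means in the operand's own type (typical for SSE math),
   2 means in long double (typical for x87 math). */
int eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Multiplies by 10 and divides by 10 again. When the intermediate is a
   plain double, a large input overflows to infinity and stays there.
   When the intermediate is held in an 80-bit x87 register, the product
   may fit and the result may come back finite. */
double overflow_roundtrip(double big)
{
    assert(big > 0.0);
    volatile double x = big;   /* prevents compile-time folding */
    return (x * 10.0) / 10.0;
}
```

Calling overflow_roundtrip(1e308) under both builds and printing the result is one way to see the difference between the two math modes.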

Score: 2