posted by Tony Bourke on Mon 23rd Feb 2004 21:54 UTC

"SPARC Optimizations With GCC, Page 2/3"
Of course not all applications will benefit to this extent; presumably there are applications that would benefit very little. But for those CPU-intensive operations, this optimization can make a big difference.

The difference is much more dramatic than what we would see with similar optimizations on the x86 platform.

To show how much less such optimizations matter on x86, I ran the same test on a 1 GHz Pentium III system. I compiled OpenSSL once with -march=i386 and once with -march=i686 (the highest effective optimization for my Pentium III system).

The x86 test system is running Linux 2.4, and OpenSSL 0.9.7c was again compiled with GCC 3.3.2. Both builds were compiled with -O3, and each test was run 3 times with the results averaged. Again, there was very little delta between the individual runs.

Since I'm running a Pentium III, I could have used -march=pentium3. I actually did, and found no difference in results between -march=i686 and -march=pentium3. Also, OpenSSL on Linux x86 is often distributed in both i386 and i686 builds.

Remember, we're not comparing the performance of a 1 GHz Pentium III processor with that of a 333 MHz UltraSPARC IIi processor. Rather, we're comparing how much each platform gains when going from the lowest common denominator to the highest (effective) optimization.

As you can see, the i686 flag does indeed give a performance boost as expected, but it's not nearly as dramatic as the difference between V7 and V9 (or even V8) on SPARC. This highlights the importance of optimizations for SPARC.
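The methodology used here (three runs per build, averaged, then compared against the baseline build) can be sketched as follows. This is only an illustration: the function names are hypothetical, the scores are made-up placeholders rather than the article's measurements, and it assumes a benchmark where a higher score is better.

```python
from statistics import mean


def average_score(runs):
    """Average the scores from repeated runs of the same build."""
    if not runs:
        raise ValueError("need at least one run")
    return mean(runs)


def percent_gain(baseline, optimized):
    """Relative improvement of the optimized build over the baseline, in percent."""
    return (optimized - baseline) / baseline * 100.0


# Hypothetical scores (higher is better); NOT the article's results.
baseline_runs = [100.0, 102.0, 98.0]    # e.g. the lowest-common-denominator build
optimized_runs = [120.0, 121.0, 119.0]  # e.g. the CPU-specific build

baseline = average_score(baseline_runs)
optimized = average_score(optimized_runs)
gain = percent_gain(baseline, optimized)

print(f"optimized build is {gain:.1f}% faster than baseline")
```

Comparing the percentage gain on each platform, rather than the raw scores, is what keeps the x86 and SPARC results comparable despite the very different clock speeds.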

Contrasting With x86
You may have noticed that I used -march for x86, yet -mcpu for SPARC. For x86 GCC users this may seem confusing, since -mcpu under x86 only tunes a specific CPU, but doesn't take advantage of any additional instructions or additional functionality.

For SPARC, there is no -march flag; instead it uses -mcpu to specify platform-specific optimizations. The -mtune flag works the way -mcpu has typically been used on the x86 platform, tuning code for a particular processor without taking advantage of additional instructions. (It should be noted that the -mcpu flag has actually been deprecated on x86 GCC in favor of -mtune.)

So while -mtune behaves the same on both x86 and SPARC (it creates tuned binaries that remain backward compatible), -mcpu creates CPU-specific binaries that are not backward compatible on SPARC, and -march does the same for x86.
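The rule above can be written down as a small lookup. This is a rough sketch, not part of any GCC tooling: the helper name `cpu_flag` and the platform keys are hypothetical, and it assumes a GCC version that accepts -mtune on both platforms, as described above.

```python
# Flag that selects the instruction set on each platform.
# Binaries built this way may not run on older CPUs.
INSTRUCTION_SET_FLAG = {
    "x86": "-march",
    "sparc": "-mcpu",
}


def cpu_flag(platform, cpu, backward_compatible):
    """Return the GCC flag for targeting `cpu` on `platform`.

    backward_compatible=True  -> tune only (-mtune on both platforms)
    backward_compatible=False -> use the CPU's extra instructions
    """
    if platform not in INSTRUCTION_SET_FLAG:
        raise ValueError(f"unknown platform: {platform}")
    if backward_compatible:
        return f"-mtune={cpu}"
    return f"{INSTRUCTION_SET_FLAG[platform]}={cpu}"


print(cpu_flag("x86", "i686", backward_compatible=False))
print(cpu_flag("sparc", "ultrasparc", backward_compatible=False))
print(cpu_flag("sparc", "ultrasparc", backward_compatible=True))
```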

For great resources on GCC for x86, check out GCC Facts and Myths by Joao Seabra and the GCC x86 optimization docs from GCC.

The -On Flag
Another optimization option for GCC (universal to all platforms) is the -On flag, which controls many more specific optimization flags.

Further reading on these optimizations can be found on the GCC document site.

To see what effect the -On flag has with GCC, I compiled OpenSSL 0.9.7c with -mcpu=ultrasparc and -On, where n ranged from 0 through 3, the full range for GCC. (There's also -Os, which optimizes as far as possible without doing anything that would tend to dramatically increase code size, but I didn't test that.)
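The set of builds tested here can be enumerated with a few lines of code. This is only a sketch of the flag combinations: the function name is hypothetical, and how the resulting strings are handed to the OpenSSL build is not covered and would depend on your setup.

```python
def optimization_variants(cpu_flag="-mcpu=ultrasparc", levels=range(4)):
    """Build one compiler-flag string per -On level (n = 0 through 3 by default)."""
    return [f"-O{n} {cpu_flag}" for n in levels]


# One entry per build variant to compile and benchmark.
for flags in optimization_variants():
    print(flags)
```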

As before, the tests were run 3 times for each variant, and the results averaged. There was very little delta between the runs. OpenSSL 0.9.7c was used on Solaris 9 (12/03), compiled with GCC 3.3.2.

The results were quite surprising, as I had expected a greater delta between the various optimization levels. As the results show, there wasn't much difference until dropping to -O0.

This was only a single application, and the effectiveness of these optimizations will vary of course depending on your application, so keep that in mind.
