These tips are easy to apply when compiling software where performance is important. As it turns out, the SPARC platform benefits from compiler optimization, often more dramatically than x86 does.
Because compiler optimizations target the hardware architecture, these tips apply to GCC on every operating system that runs on SPARC, including all of the operating systems reviewed in this series of articles (FreeBSD, Linux, Solaris, NetBSD, and OpenBSD). They cover GCC 2.95 through the current GCC (3.3.2 as of this writing).
This article is written from the perspective of a sys admin, and not a developer. System administrators are usually concerned with performance, and these are tips to help when compiling source code.
Basics On The SPARC Platform
For the SPARC platform, there are three basic classes of processors: V7, V8, and V9. The SPARC V7 is the lowest common denominator for the SPARC platform; anything compiled with the SPARC V7 instruction set will run on any SPARC-based system, just as i386 is the lowest common denominator for the x86 platform.
V7-based systems include Sun's sun4 and sun4c systems, such as the SPARCStation 1 and 2, and the SPARCStation IPX for sun4c, and the Sun 4/300 for sun4.
The V8 architecture includes sun4m and sun4d systems. The V8 architecture adds some instructions that really help out with performance, including integer divide and multiply. These benefits will become apparent in later tests.
Sun4m-based systems include the SPARCStation 5, 10, 20, and Classic, and sun4d-based systems include the SPARCServer 1000 and SPARCCenter 2000.
V9 processors are 64-bit (as opposed to the 32-bit V7 and V8 processors) and are fully backwards compatible with the previous architectures. The V9 processors include the UltraSPARC, UltraSPARC II, UltraSPARC III, and the new UltraSPARC IV. V9 systems are known as sun4u, which is what my Sun Ultra 5 is classified as. Sun currently makes only systems based on SPARC V9/sun4u.
|sun4, sun4c|V7|
|sun4m, sun4d|V8|
|sun4u|V9 (64-bit capable)|
On SPARC systems, GCC produces V7 binaries by default, just as it produces i386 binaries by default on the x86 platform.
One way to potentially increase performance dramatically is to set the -mcpu option to your specific processor. Here is a portion of the entry from the GCC docs regarding this option:
-mcpu=cpu_type
Set the instruction set, register set, and instruction scheduling parameters for machine type cpu_type.
Since the processor for my Sun Ultra 5 is a V9-based UltraSPARC IIi, I'll use -mcpu=ultrasparc. Since the only V9 systems are UltraSPARC, there's no real reason to use -mcpu=v9; -mcpu=ultrasparc would work for all UltraSPARC processors and is (theoretically) the more optimized choice.
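As a rough sketch of how the choice of -mcpu value follows from the machine class, the shell function below maps a Sun architecture name to a GCC -mcpu setting. The function name mcpu_for_class is made up for this illustration, and the sketch assumes that `uname -m` prints the machine class (sun4c, sun4m, sun4u, and so on), which is how Solaris behaves; other operating systems may report something different, in which case no -mcpu flag is added.

```shell
#!/bin/sh
# mcpu_for_class is a hypothetical helper, not part of GCC.
# $1 is a Sun machine class such as sun4c, sun4m, or sun4u.
mcpu_for_class() {
    case "$1" in
        sun4|sun4c)  echo "v7" ;;          # V7: lowest common denominator
        sun4m|sun4d) echo "v8" ;;          # V8: adds integer multiply and divide
        sun4u)       echo "ultrasparc" ;;  # V9: UltraSPARC family
        *)           return 1 ;;           # unknown class: let the caller decide
    esac
}

# Assumption: uname -m reports the machine class, as it does on Solaris.
if cpu=$(mcpu_for_class "$(uname -m)"); then
    CFLAGS="-O3 -mcpu=$cpu"
else
    CFLAGS="-O3"   # not a recognized Sun class, so leave -mcpu at GCC's default
fi
echo "CFLAGS=$CFLAGS"
```

The resulting CFLAGS value can then be exported before running a package's configure script or passed to make, depending on how the software is built.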
It should be noted that specifying -mcpu=ultrasparc, or even -mcpu=v9, for the V9/64-bit class of processors will not create 64-bit code. The code will still be tuned for the UltraSPARC processors, but the binaries will remain 32-bit. Creating 64-bit code requires the -m64 flag (-m32, for 32-bit code, is implied by default).
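To make the point that processor tuning and word size are separate choices, here is a small sketch that assembles a flag string for either case. The helper name sparc_cflags is invented for this example; it only builds the string and does not call the compiler.

```shell
#!/bin/sh
# sparc_cflags is a hypothetical helper that only assembles a flag string.
# $1 is the desired word size: 32 (the default) or 64.
sparc_cflags() {
    flags="-O3 -mcpu=ultrasparc"   # tuning is the same for both word sizes
    case "${1:-32}" in
        32) flags="$flags -m32" ;;  # explicit here, although GCC implies it
        64) flags="$flags -m64" ;;  # required for 64-bit code
        *)  echo "unsupported word size: $1" >&2; return 1 ;;
    esac
    echo "$flags"
}

sparc_cflags 32   # prints: -O3 -mcpu=ultrasparc -m32
sparc_cflags 64   # prints: -O3 -mcpu=ultrasparc -m64
```

After building, running the `file` command on the resulting binary is a quick way to confirm whether it came out as a 32-bit or 64-bit ELF executable.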
To show how dramatically -mcpu can affect performance on the SPARC architecture, I ran some comparison tests with OpenSSL 0.9.7c compiled with three different -mcpu settings.
These tests were run under Solaris 9 (12/03), with OpenSSL compiled by GCC 3.3.2 at -O3. Each test was run three times and the results averaged; the individual results varied very little.
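Averaging a handful of runs is easy to script. The sketch below is one possible way to do it, not necessarily how the figures in this article were produced; the function name average and the sample numbers are placeholders, not measured results.

```shell
#!/bin/sh
# average is a hypothetical helper: it prints the arithmetic mean of its
# arguments, rounded to two decimal places.
average() {
    [ "$#" -gt 0 ] || return 1   # nothing to average
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# Placeholder values standing in for three benchmark results.
average 10 20 30   # prints: 20.00
```

In practice the three arguments would be the figures reported by each benchmark run for the same test.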
For the computationally intensive OpenSSL, the -mcpu=ultrasparc optimization doubled performance compared with V7.