posted by Nicholas Blachford on Wed 9th Jul 2003 16:43 UTC

"Law of Diminishing Returns, Performance, Vector Processing and Power Consumption differences"

The Law Of Diminishing Returns (Aka Amdahl's Law)
The law of diminishing returns is not a new phenomenon. It was originally noticed in parallel computers by IBM engineer Gene Amdahl, one of the creators of the IBM System/360 architecture. The original describes the problem in parallel computing terms; however, this simplified version pretty much describes the problem in terms of any modern computer system:

"Each component of a computer system contributes delay to the system. If you make a single component of the system infinitely fast...
...system throughput will still exhibit the combined delays of the other components."
[3]
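
The full form of Amdahl's Law fits in a few lines of code. The following is a minimal sketch for illustration only: the function name `amdahl_speedup` and the 30% figure are assumptions made for the example, not values taken from the quoted source.

```python
import math


def amdahl_speedup(fraction_improved: float, component_speedup: float) -> float:
    """Overall speedup when only part of the run time gets faster.

    fraction_improved: share of the original run time spent in the
        component being sped up (0.0 to 1.0).
    component_speedup: how many times faster that component becomes.
    """
    remaining = 1.0 - fraction_improved
    return 1.0 / (remaining + fraction_improved / component_speedup)


# Hypothetical workload: 30% of the run time is spent in the improved component.
doubled = amdahl_speedup(0.30, 2.0)       # component made twice as fast
limit = amdahl_speedup(0.30, math.inf)    # component made infinitely fast

print(f"2x faster component: {doubled:.2f}x overall")
print(f"infinitely fast component: {limit:.2f}x overall")
```

In this made-up case even an infinitely fast component leaves the system less than 1.5 times faster, because the other 70% of the run time is untouched.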

As clock speed goes up, the actual performance of the CPU does not scale exactly with it. A 2GHz CPU is unlikely to be twice the speed of a 1GHz CPU; indeed, on everyday tasks people seem to have some difficulty telling the difference between these speeds.

The reason for the lack of scaling is that memory performance has not scaled with the CPU, so the CPU sits doing nothing for much of its time (HP estimates this at 70% for server CPUs). Additionally, memory latency has barely improved at all, so any program that requires the CPU to access memory a lot will be badly affected by memory latency and the CPU will not reach anything near its true potential. The CPU memory cache can alleviate this sort of problem to a degree, but its effectiveness depends very much on the type of cache and the software algorithm used.

Many of the techniques used within x86 CPUs may only boost performance by a small amount, but they are used because of the need for AMD and Intel to outdo one another. As clock speeds climb ever higher the scaling problem gets worse, meaning that the additional effort has less and less effect on overall performance. Recent SPEC marks for two Dell workstations show that a greater than 50% increase in CPU speed plus the addition of hyper-threading results in only a 26% increase in SPEC marks [2]. Yet when the Itanium 2 CPU got an 11% clock speed boost and double the cache, the SPEC mark increased by around 50%.
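
As a rough back-of-the-envelope check, Amdahl's Law can be inverted to estimate how much of the run time actually scaled with the clock in the Dell comparison. This is only a sketch: it treats "greater than 50%" as exactly 1.5x, ignores the contribution of hyper-threading, and the function name `clock_bound_fraction` is made up for the example.

```python
def clock_bound_fraction(clock_ratio: float, observed_speedup: float) -> float:
    """Invert Amdahl's Law.

    Returns the share of the original run time that must scale with the
    clock for a `clock_ratio` increase to give `observed_speedup` overall.
    """
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / clock_ratio)


# Figures from the text, read as exact values for the sake of the estimate:
# a 50% clock increase (1.5x) gave a 26% higher SPEC mark (1.26x).
fraction = clock_bound_fraction(clock_ratio=1.5, observed_speedup=1.26)
print(f"Share of run time that scaled with the clock: {fraction:.0%}")
```

Under these assumptions a little over 60% of the run time benefited from the faster clock; the rest was spent waiting on something else.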

Of course there are other factors which affect the performance of CPUs, such as the cache size and design, the memory interface, the compiler and its settings, the language the software is written in and the programmer who wrote it. Changing the language can in fact be shown to have a much greater effect than changing the CPU [4]. Changing the programmer can also have a very large effect [5].

Performance Differences Between The PowerPC And x86
Since AMD began competing effectively with Intel in the late 1990s, both Intel and AMD have been aggressively developing new, faster x86 CPUs. This has led to them becoming competitive with, and sometimes even exceeding, the performance of RISC CPUs (if you believe the benchmarks, see below). However, RISC vendors are now becoming aware of this threat and are responding by making faster CPUs. Ironically, if you were to make all CPUs at the same geometry the Alpha 21364 would be the fastest CPU going, yet it uses a 7 year old core design.

PowerPCs, although initially designed as desktop processors, are primarily used in embedded applications where power usage concerns outweigh raw processing power. Additionally, current G4 CPUs use a relatively slow single data rate bus system which cannot match the faster double or quad data rate buses found on x86 CPUs.

The current (non-G5) PowerPC CPUs do not match up to the level of the top x86 CPUs; however, due to the effects of the law of diminishing returns, they are not massively behind in terms of CPU power. The x86 CPUs are faster but not by as much as you might expect [6]. (Again, see the section on benchmarks below.)

Vector Processing Differences
Vector processing is also known as SIMD (Single Instruction Multiple Data) and it is used for workloads that apply the same operation to many data elements at once. Where it applies, it can speed up operations many times over the normal processing core.
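
To make the idea concrete, here is a small model of what a single SIMD instruction does. It is a sketch of the semantics only: Python does not use the vector unit here, and the names `vector_add` and `simd_sum` are invented for the example. One 128-bit register is modelled as 16 bytes holding four 32-bit floats.

```python
import struct

LANES = 4  # four 32-bit floats fill one 128-bit register


def vector_add(a: bytes, b: bytes) -> bytes:
    """Model one SIMD add: a single operation adds all four lanes."""
    lanes_a = struct.unpack("<4f", a)
    lanes_b = struct.unpack("<4f", b)
    return struct.pack("<4f", *(x + y for x, y in zip(lanes_a, lanes_b)))


def simd_sum(xs, ys):
    """Add two equal-length lists, LANES elements per modelled instruction.

    Returns the element-wise sums and the number of vector adds issued.
    """
    assert len(xs) == len(ys) and len(xs) % LANES == 0
    out, ops = [], 0
    for i in range(0, len(xs), LANES):
        reg_a = struct.pack("<4f", *xs[i:i + LANES])  # load one register
        reg_b = struct.pack("<4f", *ys[i:i + LANES])
        out.extend(struct.unpack("<4f", vector_add(reg_a, reg_b)))
        ops += 1
    return out, ops


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0.5] * 8
result, vector_ops = simd_sum(xs, ys)
print(result)
print(f"{vector_ops} vector adds instead of {len(xs)} scalar adds")
```

Eight additions are covered by two modelled vector instructions, which is where the potential speedup comes from.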

Both x86 and PowerPC have added extensions to support vector instructions. x86 started with MMX and MMX2, then added SSE and SSE2. SSE and SSE2 have eight 128-bit registers, but operations cannot generally be executed at the same time as floating point instructions. However, the x86 floating point unit is notoriously weak and SSE is now used for floating point operations. Intel has also invested in compiler technology which automatically uses the SSE2 unit even if the programmer hasn't specified it, boosting performance.

The PowerPC gained vector processing in one go when Apple, IBM and Motorola revised the PowerPC instruction set and added the Altivec unit, which has 32 128-bit registers. This was added to the G4 CPUs but not to the G3s, although the G3s are now expected to get Altivec in a later revision. Altivec is also present in the 970.

Currently the bus interface of the G4 slows down Altivec, as Altivec is very demanding of memory. However, Altivec has more registers than SSE so it can operate without going to memory as often, which boosts performance over SSE. The Altivec unit can also operate independently from and simultaneously with the floating point unit.

Power Consumption Differences
One very big difference between PowerPC and x86 is in the area of power consumption. Because PowerPCs are designed for and used in the embedded sector, their power consumption is deliberately low. The x86 CPUs, on the other hand, have very high power consumption due to the old, inefficient architecture as well as all the techniques used to raise performance and clock speed. The difference in power consumption is greater than 10X for a 1GHz G4 (7447) compared with the 3GHz Pentium 4. The maximum rating for a G4 is less than 10 Watts, whereas Intel does not appear to give out figures for maximum power consumption, instead referring to a "thermal design rating" which is around 30 Watts lower than the maximum figure. The figure given for the design rating of a 3GHz P4 is 81.9 Watts, so the maximum is closer to, and may even exceed, 100 Watts.
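
The arithmetic behind the "greater than 10X" claim can be laid out explicitly. This sketch uses only the figures quoted above; the 30 Watt gap is approximate and the G4 value is an upper bound, so the result is a rough lower bound rather than a measurement.

```python
# Figures quoted in the text; the G4 value is an upper bound ("less than 10 W").
P4_THERMAL_DESIGN_WATTS = 81.9   # design rating given for the 3GHz Pentium 4
DESIGN_TO_MAX_GAP_WATTS = 30.0   # approximate gap between design rating and maximum
G4_MAX_WATTS = 10.0              # upper bound for the 1GHz G4 (7447)

# Estimated maximum draw of the Pentium 4, and how it compares with the G4.
p4_max_watts = P4_THERMAL_DESIGN_WATTS + DESIGN_TO_MAX_GAP_WATTS
ratio = p4_max_watts / G4_MAX_WATTS

print(f"Estimated Pentium 4 maximum: {p4_max_watts:.1f} W")
print(f"Pentium 4 to G4 ratio: at least {ratio:.1f}x")
```

Because the G4 draws less than 10 Watts, the real ratio would be somewhat higher than the one computed here.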

A single 3GHz Pentium 4 CPU alone consumes more than 4 times the power of a Pegasos PowerPC motherboard including a 1GHz G4.

Table of contents
  1. "History, Architectural differences, RISC Vs CISC, Current state of these CPUs"
  2. "Law of Diminishing Returns, Performance, Vector Processing and Power Consumption differences"
  3. "Low Power x86s, Why The Difference?, To RISC Or Not To RISC, PPC and x86 get more Bits"
  4. "Benchmarks, the Future"
  5. "Conclusion, References"