Linked by MOS6510 on Fri 17th May 2013 22:22 UTC
Hardware, Embedded Systems "It is good for programmers to understand what goes on inside a processor. The CPU is at the heart of our career. What goes on inside the CPU? How long does it take for one instruction to run? What does it mean when a new CPU has a 12-stage pipeline, or 18-stage pipeline, or even a 'deep' 31-stage pipeline? Programs generally treat the CPU as a black box. Instructions go into the box in order, instructions come out of the box in order, and some processing magic happens inside. As a programmer, it is useful to learn what happens inside the box. This is especially true if you will be working on tasks like program optimization. If you don't know what is going on inside the CPU, how can you optimize for it? This article is about what goes on inside the x86 processor's deep pipeline."
Thread beginning with comment 562025
RE[8]: Comment by Drumhellar
by Alfman on Sun 19th May 2013 02:17 UTC in reply to "RE[7]: Comment by Drumhellar"

"Well, FPGAs are just seas of programmable logic cells with somewhat flexible interconnects, so their 'parallelism' depends on the designs being implemented."

Yes, it all depends on the design. It'd be very powerful in the hands of innovative software developers, but I don't know if or when consumer CPUs will provide FPGA-like technology that enables software developers to take advantage of it.

"To be fair, modern CPUs do support most forms of parallelism; whether it be some form of instruction level parallelism (superscalar, SMT, out-of-order, multicore, etc), as well as data parallel structures like SIMD and Vector units."

True, but it's watered down. Every time I look at SSE I ask myself why Intel didn't make SIMD extension instructions that could accommodate much greater parallelism; the x86 SIMD extensions only offer low parallel scaling factors. I know you're right that Intel had to strike a balance somewhere, but nevertheless I feel their whole 'lite SIMD' approach is significantly impeding software scalability.

"In the case of GPUs, they're used to run algorithms with elevated degrees of data parallelism, so they can dedicate most of their area to execution structures"

I like the way GPUs in particular are designed to scale to arbitrary numbers of execution units without otherwise changing the software. This is just awesome for "embarrassingly parallel" algorithms.

"AMDs newer fusion microarchitectures are something that may interest you, since they are starting to support elevated degrees of data parallelism on die."

Yes, maybe. I'll have to read up on it.

Reply Parent Score: 2