Linked by MOS6510 on Fri 17th May 2013 22:22 UTC
Hardware, Embedded Systems
"It is good for programmers to understand what goes on inside a processor. The CPU is at the heart of our career. What goes on inside the CPU? How long does it take for one instruction to run? What does it mean when a new CPU has a 12-stage pipeline, or 18-stage pipeline, or even a 'deep' 31-stage pipeline? Programs generally treat the CPU as a black box. Instructions go into the box in order, instructions come out of the box in order, and some processing magic happens inside. As a programmer, it is useful to learn what happens inside the box. This is especially true if you will be working on tasks like program optimization. If you don't know what is going on inside the CPU, how can you optimize for it? This article is about what goes on inside the x86 processor's deep pipeline."
RE[6]: Comment by Drumhellar
by theosib on Mon 20th May 2013 14:12 UTC in reply to "RE[5]: Comment by Drumhellar"

My opinion is that this is less about raw compute power and more about the limits of compiler developers. This reminds me of Ray Kurzweil's stupid singularity thing, which seems to imply that the instant computers are as fast as the human brain, they'll magically develop human intelligence. It doesn't matter how fast they are if we don't know the algorithms for human intelligence. And we still don't.

There's the same problem with compilers. I'm reminded of two events in computer history. One is LISP machines, and the other is Itanium. In both cases, hardware designers assumed that a "sufficiently smart compiler" would be able to take advantage of their features, but nobody managed to build that sufficiently smart compiler. Consider predicated execution on Itanium. Predication turns out to be a hard problem. On architectures like ARM32, which predicate instructions on a single set of condition flags, it gets used SOME of the time. Itanium has 64 one-bit predicate registers. Humans can hand-craft examples that show off the advantages of the Itanium ISA, but compilers that can do that well in the general case just don't exist.
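
For anyone who hasn't run into predication before, here is a minimal sketch in C of the transformation (if-conversion) the compiler is expected to do. The function names are made up for illustration, and the mask arithmetic only stands in for what predicated or conditional-move instructions would do in hardware; it is not what any particular compiler emits.

```c
#include <stdint.h>

/* Branching form: the CPU has to predict which arm will run. */
int32_t step_branchy(int32_t x, int32_t limit, int32_t step)
{
    if (x < limit)
        x += step;
    else
        x -= step;
    return x;
}

/* If-converted form: both arms are always computed, and the
 * predicate p selects which result is kept. No branch is needed.
 * (Hypothetical illustration; assumes small values so the
 * additions do not overflow.) */
int32_t step_predicated(int32_t x, int32_t limit, int32_t step)
{
    int32_t p = (x < limit);      /* predicate: 1 if true, 0 if false */
    int32_t then_val = x + step;  /* result of the "if" arm */
    int32_t else_val = x - step;  /* result of the "else" arm */
    int32_t mask = -p;            /* all ones when p is 1, zero otherwise */
    return (then_val & mask) | (else_val & ~mask);
}
```

Both functions return the same value for the same inputs. The catch is that the second one always pays for both arms, so it only wins when the branch is hard to predict and the arms are cheap, and the compiler has to make that call without knowing the runtime data.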
