Linked by MOS6510 on Fri 17th May 2013 22:22 UTC
Hardware, Embedded Systems "It is good for programmers to understand what goes on inside a processor. The CPU is at the heart of our career. What goes on inside the CPU? How long does it take for one instruction to run? What does it mean when a new CPU has a 12-stage pipeline, or 18-stage pipeline, or even a 'deep' 31-stage pipeline? Programs generally treat the CPU as a black box. Instructions go into the box in order, instructions come out of the box in order, and some processing magic happens inside. As a programmer, it is useful to learn what happens inside the box. This is especially true if you will be working on tasks like program optimization. If you don't know what is going on inside the CPU, how can you optimize for it? This article is about what goes on inside the x86 processor's deep pipeline."
RE[4]: Comment by Drumhellar
by theosib on Sun 19th May 2013 18:19 UTC in reply to "RE[3]: Comment by Drumhellar"

Sorry for taking so long to reply, and for not giving you a more thorough answer.

The bitcoin problem is interesting, and some friends and I are trying to get better compute/area out of an FPGA, just for the fun of it. As a chip designer, I see FPGAs as an obvious choice for accelerating this kind of thing far beyond what a general-purpose CPU can do.

One of the problems with using FPGAs is a general fear of hardware development. (Well, that and the cost of FPGA hardware, but that doesn't apply to supercomputing.) Another problem is reasoned avoidance. Having worked as a chip designer, I like to just put together solutions straight in Verilog. But we can't retrain all HPC programmers in chip design, and it's sometimes not a good cost/benefit tradeoff. The holy grail is being able to convert software source code into logic gates. There's plenty of work on that, but the results aren't necessarily all that great. There's a huge difference in performance between a custom-designed FPGA circuit (i.e. one built by someone who knows what they're doing) and something that came out of an automatic translator.
