Real programmers don't rely on nebulous optimizations that are less than optimal, as we know from "The Story of Mel, a Real Programmer". :-)
http://www.pbm.com/~lindahl/mel.html
Doc Pain,
"Real programmers don't rely on nebulous optimizations that are less than optimal, as we know from 'The Story of Mel, a Real Programmer'. :-)"
Haha, it's a fun story. But that sounds more like the manifesto of a defiant programmer than anything having practical value.
Hell, even in Intel processor terms I've seen similar patterns, like programmers who relied on the memory-wrapping behaviour of the 8086 at the 1 MB boundary, a side effect of the processor's original limit of 20 address lines. It was ridiculous to rely on that quirk, yet some (Microsoft) programmers did, and consequently there have been a number of hardware hacks to control the A20 address line ever since.
http://www.openwatcom.org/index.php/A20_Line
I do appreciate clever tricks as a form of CS "art"; however, I kind of hope the employees responsible for the A20 mess were fired over it, since it was very irresponsible.
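For anyone who hasn't run into the A20 quirk, here is a rough sketch of the address arithmetic in C. The function names are made up for illustration, and the model is simplified: it only shows how a real-mode segment:offset pair can reach past 1 MB and what happens to bit 20 when the line is missing or forced low.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address: segment * 16 + offset. The result can be as large
   as 0x10FFEF, which needs 21 bits. (Hypothetical helper name.) */
uint32_t linear_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

/* 8086: only 20 address lines (A0..A19), so bit 20 is simply lost and
   addresses past 1 MB wrap around to the bottom of memory. */
uint32_t physical_8086(uint16_t segment, uint16_t offset) {
    return linear_address(segment, offset) & 0xFFFFFu;
}

/* Later machines: a20_enabled == 0 models the gate forcing A20 low,
   which reproduces the 8086 wrap for old software. */
uint32_t physical_with_a20_gate(uint16_t segment, uint16_t offset,
                                int a20_enabled) {
    uint32_t addr = linear_address(segment, offset);
    return a20_enabled ? addr : (addr & ~(1u << 20));
}

/* Small self-check of the classic case FFFF:0010. */
void a20_self_check(void) {
    assert(linear_address(0xFFFF, 0x0010) == 0x100000u);
    assert(physical_8086(0xFFFF, 0x0010) == 0x00000u);
    assert(physical_with_a20_gate(0xFFFF, 0x0010, 1) == 0x100000u);
    assert(physical_with_a20_gate(0xFFFF, 0x0010, 0) == 0x00000u);
}
```

Code that depended on FFFF:0010 landing at address 0 only works when the second behaviour is in effect, which is why the gate had to exist at all.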
I myself have often cited shortcomings of the GCC optimiser, leaving me to contemplate whether to use non-portable assembly or to use GCC's code as is. In most cases suboptimal code is irrelevant in the grand scheme of the program, so it's not even worth looking at. However, in very tight loops such as those in encryption, compression and similar algorithms, hand optimisation can make an observable difference.
In any case, suffice it to say that GCC can turn a multiply into shifts and additions on its own. So my personal preference is to see x*CONST in code.
Is (((x<<1)+x)<<1)+x faster than x*7? I don't know without profiling it. On x86 it'd compile down to two LEA opcodes, which are darn fast. What about (x<<3)-x? Long story short, I'd rather let GCC handle it when it can since it's architecture specific anyway.
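To make the comparison concrete, here is a minimal sketch of the three spellings of "multiply by 7" mentioned above, written as C functions (the function names are mine). It only demonstrates that they compute the same value for unsigned 32-bit inputs; which one is faster still depends on the compiler and the target, so check the generated assembly or profile.

```c
#include <assert.h>
#include <stdint.h>

/* Plain multiply: the form I'd rather see in source code. */
uint32_t mul7_plain(uint32_t x) { return x * 7u; }

/* Shift-and-add: ((2x + x) * 2) + x = 7x. */
uint32_t mul7_shift_add(uint32_t x) { return (((x << 1) + x) << 1) + x; }

/* Shift-and-subtract: 8x - x = 7x. */
uint32_t mul7_shift_sub(uint32_t x) { return (x << 3) - x; }

/* Returns 1 if all three forms agree for every x in [0, limit).
   Unsigned arithmetic wraps modulo 2^32, so they agree on overflow too. */
int mul7_forms_agree(uint32_t limit) {
    for (uint32_t x = 0; x < limit; x++) {
        if (mul7_plain(x) != mul7_shift_add(x)) return 0;
        if (mul7_plain(x) != mul7_shift_sub(x)) return 0;
    }
    return 1;
}

/* Quick self-check over a small range. */
void mul7_self_check(void) {
    assert(mul7_forms_agree(1u << 16));
}
```

Since the results are identical, the compiler is free to pick whichever form suits the target, which is the argument for writing x*7 and leaving the choice to it.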
Thanks, Alfman; very interesting stuff. I must admit that I did not encounter the "x+x+x" issue myself. I *THINK* I heard that story from Herb Sutter and Bjarne Stroustrup at SD West about five years ago, but I could be misremembering. The point remains, however, that you can mis-apply knowledge, even if that knowledge is accurate.




ingraham,
I upvoted your post because I thought it was insightful. However...
"For example, after learning that add instructions are faster than multiply, a developer starts writing 'x+x+x' instead of '3*x' in their code. But adding is not TWICE as fast as multiplying, so he just slowed down his code."
It is silly to manipulate source code to optimise cases that can just as easily be handled by any respectable compiler, but I found your example ironically humorous because I ran this microbenchmark just now, and x+x+x was indeed faster than 3*x on my x86 processor. Only once I reached 4 (x+x+x+x versus 4*x) did it become slower. The fastest solution by far was (x<<1)+x (this can be done in a single instruction on x86, btw).
Edit: Programmers should be aware that the compiler already does these optimisations under the hood.
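For anyone who wants to try this at home, a microbenchmark along these lines could look roughly like the sketch below. This is not the exact code I ran; the harness, the function names and the use of clock() are assumptions for illustration. With optimisation enabled the three variants may well compile to the same instructions, so compare at the optimisation level you actually ship.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* The three spellings of "multiply by 3" being compared. */
uint32_t times3_mul(uint32_t x)   { return 3u * x; }
uint32_t times3_add(uint32_t x)   { return x + x + x; }
uint32_t times3_shift(uint32_t x) { return (x << 1) + x; }

/* Hypothetical harness: calls f on 0..iterations-1, stores the running
   sum in *checksum_out, and returns the elapsed CPU time in seconds.
   The sum goes through a volatile so the loop is not optimised away. */
double time_variant(uint32_t (*f)(uint32_t), uint32_t iterations,
                    uint32_t *checksum_out) {
    assert(f && checksum_out);
    volatile uint32_t sink = 0;
    clock_t start = clock();
    for (uint32_t i = 0; i < iterations; i++) {
        sink += f(i);
    }
    clock_t end = clock();
    *checksum_out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call time_variant once per function with a large iteration count, check that the checksums match, and only then compare the timings. Calling through a function pointer adds overhead of its own, so treat the numbers as relative at best.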
Edited 2012-09-30 04:47 UTC