Linked by Thom Holwerda on Fri 15th Feb 2013 10:40 UTC
General Development "Since I left my job at Amazon I have spent a lot of time reading great source code. Having exhausted the insanely good idSoftware pool, the next thing to read was one of the greatest game of all time: Duke Nukem 3D and the engine powering it named 'Build'. It turned out to be a difficult experience: The engine delivered great value and ranked high in terms of speed, stability and memory consumption but my enthousiasm met a source code controversial in terms of organization, best practices and comments/documentation. This reading session taught me a lot about code legacy and what helps a software live long." Hail to the king, baby.
Thread beginning with comment 552648
RE[5]: Code Review
by Alfman on Fri 15th Feb 2013 17:32 UTC in reply to "RE[4]: Code Review"
Alfman Member since:
2011-01-28

Laurence,

"These days compilers are so good at optimising that you're more likely to write better performing code in C than assembly."

There are still times when I find GCC didn't optimize something as well as it could have, especially when the C code doesn't translate directly to the desired architecture opcode (many bit-manipulation, overflow-flag, and multi-word instructions have no direct equivalent in C). A roundabout algorithm in C will sometimes result in equally roundabout assembly.
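As a minimal sketch of the overflow-flag case: C has no way to read the carry flag, so a multi-word addition has to recover the carry by comparison. The `u128` type and `u128_add` function are hypothetical names made up for this illustration, and whether a given compiler collapses this into a plain add/add-with-carry pair depends on the compiler and version.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit unsigned value built from two 64-bit words. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128;

/* Add two 128-bit values in portable C.
   Unsigned addition wraps, and the wrapped sum is smaller than an
   operand exactly when the addition overflowed, so each carry is
   recovered with a comparison instead of a flag. */
static u128 u128_add(u128 a, u128 b, int *carry_out)
{
    assert(carry_out);

    u128 r;
    r.lo = a.lo + b.lo;
    uint64_t carry_lo = (r.lo < a.lo);   /* carry out of the low word */

    uint64_t hi = a.hi + b.hi;
    uint64_t c1 = (hi < a.hi);           /* carry from adding the high words */
    r.hi = hi + carry_lo;
    uint64_t c2 = (r.hi < hi);           /* carry from adding the low carry */

    *carry_out = (int)(c1 | c2);         /* at most one of the two can be set */
    return r;
}
```

In assembly the same operation is one add followed by one add-with-carry; in C the intent has to be spelled out and then rediscovered by the optimizer.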

That said, most of us have thrown in the towel because the business case for hand optimization skills is virtually non-existent. Most businesses are willing to let grossly inefficient code slide if they can buy better hardware instead of paying for developer time.

Every time this topic comes up someone points out the importance of optimizing the algorithm before resorting to assembly. Let me preempt this by saying "I agree".

Score: 4

RE[6]: Code Review
by moondevil on Fri 15th Feb 2013 19:49 in reply to "RE[5]: Code Review"
moondevil Member since:
2005-07-08

The processors have also become too complex:

- out of order execution
- parallel execution units
- branch prediction
- multiple cache levels
- opcode rewriting
- SIMD
- NUMA
- (put your favourite feature here)

You need to be superhuman to really optimize for a given processor given all the variables, and when you manage to do it, it is only for a specific model.

Only in the embedded space is it still an advantage to code directly in assembly.

Score: 4

RE[7]: Code Review
by Bill Shooter of Bul on Fri 15th Feb 2013 20:58 in reply to "RE[6]: Code Review"
Bill Shooter of Bul Member since:
2006-07-14

Good point. It's a lot more difficult than in the days of the 486 ;)

Score: 2

RE[7]: Code Review
by Alfman on Fri 15th Feb 2013 22:36 in reply to "RE[6]: Code Review"
Alfman Member since:
2011-01-28

It sure is complex, but humans can still have the upper hand in some of the cases you've mentioned. Compilers are often given more credit than they actually deserve. We treat them as though they're godlike, but in reality they can be pretty dumb sometimes.

I think compilers will have to gain more artificial intelligence before they can consistently match and beat the best hand optimizations. I believe this will happen, but it just hasn't happened yet.

Edit: By my criteria, this will have happened once a compiler can consistently generate code listings which no human is able to optimize any further for the given target.

Edited 2013-02-15 22:40 UTC

Score: 2

RE[7]: Code Review
by Megol on Mon 18th Feb 2013 15:54 in reply to "RE[6]: Code Review"
Megol Member since:
2011-04-11

The processors have also become too complex:

- out of order execution


It's like: hey, we (the processor manufacturer) have inserted a little data flow engine in your processor. Sadly this is only for values in registers; memory operations are still mostly in order. And the programmer says: okay, that's nice, I guess I don't have to work as hard to make instructions start in parallel and can instead optimize data flows.


- parallel execution units


See above.


- branch prediction


Okay, care to explain how this could make any difference when coding? This is a mechanism that applies predictions from dynamic patterns when executing code, not something that has to be coded. Current x86 processors don't even support hints.


- multiple cache levels


Unless you code in Fortran I don't think you'll ever see your compiler optimize for this. But yes, multi-dimensional cache blocking/tiling is a pain in the ass, and making it adapt dynamically to the platform's cache hierarchy almost requires runtime code generation.
Which your standard compiler won't do.
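For readers who haven't seen cache blocking, here is a minimal sketch in C: a matrix transpose done tile by tile so that both the source and destination blocks stay in cache while they are being touched. The function name `transpose_blocked` and the `tile` parameter are hypothetical, and the right tile size depends on the cache hierarchy of the target, which is exactly the part that is hard to pick at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Transpose an n x n row-major matrix from src into dst, working on
   tile x tile blocks. Passing tile as a runtime argument is a simple
   stand-in for adapting to the platform's cache sizes. */
static void transpose_blocked(const double *src, double *dst,
                              size_t n, size_t tile)
{
    assert(tile > 0);

    for (size_t ii = 0; ii < n; ii += tile) {
        for (size_t jj = 0; jj < n; jj += tile) {
            /* Clamp the block edges so n need not be a multiple of tile. */
            size_t i_end = (ii + tile < n) ? ii + tile : n;
            size_t j_end = (jj + tile < n) ? jj + tile : n;

            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

The result is identical to a naive two-loop transpose; only the order in which memory is visited changes.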


- opcode rewriting


Don't know what you mean, perhaps fusing? Not a problem even for the beginner.
Macro op fusing -> CMP/TEST+Bcc is treated as one instruction.
Micro op fusing -> Intel finally stopped splitting stuff that never needed to be split. This mostly affects the instruction scheduler, as load+execute instructions don't take two scheduler slots.
MOV elimination -> Some MOV instructions are executed in the renaming stage instead of requiring execution resources.


- SIMD


Can be a problem if one wants to get near-optimal execution on several generations of processors. However, thinking the compiler will make it easier is in most cases wrong. Yes, compilers can generate several versions of code and dynamically select which code path to use, but really critical code often requires changes in data structures to fit the underlying hardware, which compilers will not do.

Doing the same in assembly language isn't a problem.
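A common instance of such a data structure change is going from an array of structures to a structure of arrays, sketched below in C. All type and function names here (`point_aos`, `points_soa`, `translate_x_aos`, `translate_x_soa`) are hypothetical and only illustrate the layout difference; no claim is made about what any particular compiler does with either loop.

```c
#include <assert.h>
#include <stddef.h>

/* Array of structures: x, y and z of one point sit next to each other,
   so consecutive x values are three floats apart in memory. */
typedef struct {
    float x, y, z;
} point_aos;

/* Structure of arrays: all x values are contiguous, which is the layout
   SIMD loads and stores want. */
typedef struct {
    float *x;
    float *y;
    float *z;
    size_t count;
} points_soa;

static void translate_x_aos(point_aos *p, size_t count, float dx)
{
    for (size_t i = 0; i < count; i++)
        p[i].x += dx;            /* strided access */
}

static void translate_x_soa(points_soa *p, float dx)
{
    assert(p && p->x);
    for (size_t i = 0; i < p->count; i++)
        p->x[i] += dx;           /* unit-stride access */
}
```

Both functions do the same work; the difference is that the second layout has to be chosen by the programmer, because a compiler is not free to rearrange a struct that the rest of the program depends on.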


- NUMA


So you think your compiler will make the code NUMA aware?
This is something affected by algorithm choices.


- (put your favourite feature here)

You need to be superhuman to really optimize for a given processor given all the variables, and when you manage to do it, it is only for a specific model.


No, it simply requires knowledge of software-hardware interaction. Assembly language programmers are also better at optimizing e.g. C code, as they know that under the abstraction it's still a von Neumann machine.


Only in the embedded space is it still an advantage to code directly in assembly.


Most embedded code is C, so I guess you are wrong here too.

Score: 2