Linked by David Adams on Thu 20th Jan 2011 19:27 UTC, submitted by Ki5IA
Thread beginning with comment 459322
RE[3]: How does it affect performance?
by Neolander on Fri 21st Jan 2011 15:39
in reply to "RE[2]: How does it affect performance?"
I don't think that memory footprint is increased much by using 64-bit pointers, except maybe on very large code bases (e.g. Qt, the Linux kernel...).
If you look at the Nokia Qt SDK, which is one of the biggest binary downloads I can think of right now:
http://www.forum.nokia.com/info/sw.nokia.com/id/e920da1a-5b18-42df-...
"Linux 32 : 591 MB
Linux 64 : 596 MB"
The 64-bit version of the SDK is five megabytes larger. That's not such a big deal on a download weighing hundreds of MB. For smaller software, the difference will probably be barely noticeable, if noticeable at all.
Edited 2011-01-21 15:43 UTC
RE[4]: How does it affect performance?
by theosib on Fri 21st Jan 2011 15:52
in reply to "RE[3]: How does it affect performance?"
Making your program use 64-bit pointers increases its memory footprint. I'm not sure by how much, but some people seem to be bothered by it. A larger memory footprint decreases cache efficiency: because the memory representation is less compact, cache misses increase, which raises the average memory latency. On a modern x86 processor, L1 hits take around 3 cycles, L2 hits are on the order of 40, and last-level misses cost hundreds of cycles. These aggressive out-of-order architectures can often completely absorb L1 misses by continuing to execute instructions that don't depend on an outstanding memory read, but last-level misses invariably lead to a lengthy stall. In a talk, one Sun engineer characterized modern OOO processors as a race between LLC misses, with the LLC misses being the dominating factor in runtime.
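To put rough numbers on "average memory latency", here is a minimal sketch in C of the usual weighted-average calculation. The function name and the hit rates are my own assumptions for illustration; only the cycle counts come from the figures quoted above, with 300 standing in for "hundreds of cycles".

```c
#include <assert.h>
#include <stdio.h>

/* Average cycles per memory access in a simplified three-outcome model:
   an access hits L1, or misses L1 and hits L2, or misses both.
   l1_hit_rate is a fraction of all accesses; l2_hit_rate is a fraction
   of the accesses that missed L1. Both rates are assumed inputs. */
double avg_access_cycles(double l1_hit_rate, double l2_hit_rate,
                         double l1_cycles, double l2_cycles,
                         double miss_cycles)
{
    assert(l1_hit_rate >= 0.0 && l1_hit_rate <= 1.0);
    assert(l2_hit_rate >= 0.0 && l2_hit_rate <= 1.0);

    double l1_miss_rate = 1.0 - l1_hit_rate;

    return l1_hit_rate * l1_cycles
         + l1_miss_rate * l2_hit_rate * l2_cycles
         + l1_miss_rate * (1.0 - l2_hit_rate) * miss_cycles;
}

/* Print the average for one assumed pair of hit rates, using the
   3 / 40 / 300 cycle costs. */
void print_avg(double l1_hit_rate, double l2_hit_rate)
{
    printf("L1 %.0f%%, L2 %.0f%% -> %.2f cycles per access\n",
           l1_hit_rate * 100.0, l2_hit_rate * 100.0,
           avg_access_cycles(l1_hit_rate, l2_hit_rate, 3.0, 40.0, 300.0));
}
```

With a made-up 95% L1 hit rate and 80% L2 hit rate this works out to 7.45 cycles per access; drop the L1 hit rate to 90% and it becomes 11.9. The model ignores the overlap that out-of-order execution provides, so treat it as an upper-bound intuition rather than a prediction.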
Regarding cache efficiency, here's an experiment to try. Let's say you have a lookup table, and the numbers are small enough to fit in 8 bits. You can use a char array to store it. But due to various overheads, x86 is more efficient at accessing 32-bit words than 8-bit words, so as long as the table is small enough, it's actually faster to use ints. Now enlarge that table to be a few times the size of the L1 cache. At that point the L1 misses start to dominate, and switching from 32-bit words to 8-bit words decreases the cache footprint and thereby speeds up your program (depending on how sensitive your algorithm is to L2 latency).
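If you want to try that experiment, something along these lines should do as a starting point. It is a rough sketch, not a careful benchmark: all the names are mine, the xorshift generator is just a cheap way to get an access pattern the prefetcher can't guess, and `clock()` is a coarse timer. Table sizes must be powers of two so the index can be masked instead of divided.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* xorshift32: cheap pseudo-random index stream. */
static uint32_t next_index(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Sum `lookups` pseudo-random entries of a table of 8-bit values. */
uint64_t sum_u8(const uint8_t *table, size_t entries, size_t lookups)
{
    uint32_t state = 2463534242u;
    uint64_t sum = 0;
    for (size_t i = 0; i < lookups; i++)
        sum += table[next_index(&state) & (entries - 1)];
    return sum;
}

/* Same walk over a table of 32-bit values holding the same numbers. */
uint64_t sum_u32(const uint32_t *table, size_t entries, size_t lookups)
{
    uint32_t state = 2463534242u;
    uint64_t sum = 0;
    for (size_t i = 0; i < lookups; i++)
        sum += table[next_index(&state) & (entries - 1)];
    return sum;
}

/* Build both tables, time both walks, print the timings.
   Returns 0 on success, -1 if allocation fails. */
int compare_tables(size_t entries, size_t lookups)
{
    /* entries must be a non-zero power of two for the index mask. */
    assert(entries != 0 && (entries & (entries - 1)) == 0);

    uint8_t *t8 = malloc(entries * sizeof *t8);
    uint32_t *t32 = malloc(entries * sizeof *t32);
    if (t8 == NULL || t32 == NULL) {
        free(t8);
        free(t32);
        return -1;
    }
    for (size_t i = 0; i < entries; i++) {
        t8[i] = (uint8_t)(i * 31u);   /* arbitrary values that fit in 8 bits */
        t32[i] = t8[i];
    }

    clock_t c0 = clock();
    uint64_t s8 = sum_u8(t8, entries, lookups);
    clock_t c1 = clock();
    uint64_t s32 = sum_u32(t32, entries, lookups);
    clock_t c2 = clock();

    /* Both walks visit the same indices, so the sums must agree. */
    assert(s8 == s32);

    printf("%zu entries: 8-bit %.3f s, 32-bit %.3f s (sum %llu)\n",
           entries,
           (double)(c1 - c0) / CLOCKS_PER_SEC,
           (double)(c2 - c1) / CLOCKS_PER_SEC,
           (unsigned long long)s8);

    free(t8);
    free(t32);
    return 0;
}
```

Call it from `main` with, say, `compare_tables(1u << 12, 200000000)` for a table that fits in a typical 32 KB L1 either way, and `compare_tables(1u << 18, 200000000)` for one where the 32-bit version is 1 MB and the 8-bit version is 256 KB. Compile with optimization on. What you actually measure will depend on your CPU and compiler, so I'm not going to promise a particular ratio.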
Switching from absolute 64-bit addressing to relative 32-bit addressing works the same way: it decreases the cache pressure of your programs, resulting in a speed increase (albeit to a much lesser degree than in the lookup-table case).
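You can see the same footprint effect at the data-structure level. The sketch below is an analogy I'm adding, not the addressing mechanism itself: it compares a list node linked by absolute pointers with one linked by 32-bit indices relative to a pool. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Node linked with absolute pointers: each link is 8 bytes on a
   typical 64-bit target. */
struct ptr_node {
    struct ptr_node *next;
    struct ptr_node *prev;
    int32_t value;
};

/* Same node linked with 32-bit indices relative to the start of a pool. */
struct idx_node {
    uint32_t next;
    uint32_t prev;
    int32_t value;
};

/* Index value that plays the role of a null pointer. */
#define NIL_INDEX UINT32_MAX

/* Walk the index-linked list starting at `head` and sum the values. */
int64_t sum_list(const struct idx_node *pool, uint32_t head)
{
    assert(pool != NULL);

    int64_t sum = 0;
    for (uint32_t i = head; i != NIL_INDEX; i = pool[i].next)
        sum += pool[i].value;
    return sum;
}

/* Print how many bytes each representation takes per node. */
void print_node_sizes(void)
{
    printf("pointer-linked node: %zu bytes\n", sizeof(struct ptr_node));
    printf("index-linked node:   %zu bytes\n", sizeof(struct idx_node));
}
```

On a typical 64-bit Linux target the pointer-linked node is padded to 24 bytes while the index-linked one is 12, so twice as many nodes fit in each cache line. The price is that every node has to live in one pool of at most about 4 billion entries.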