Let's think for a second. The bigger the pointers, the fewer cache lines? Does that make any sense? The number of cache lines is the same regardless of the number of bits in an address. A cache line is identified by a tag, and 32-bit and 64-bit addresses will eventually hash down to similar tags, thus occupying all the cache lines in the cache. Line size and the number of cache lines are constant for a given cache.
Eh, kind of. You're right that there will always be the same number of lines in the cache, but caches are frequently set-associative: a number of blocks can be placed in one cache set. These blocks are selected by a tag, as you pointed out, and also an index. I should have been more specific; my comments were mostly in regard to the TLB, where you will certainly be able to hold fewer entries if you're using a 64-bit VA instead of a 32-bit VA.
SPARC uses a 22-bit immediate field only for the sethi instruction. There are more ways to construct a 64-bit constant. At most you will need 3 instructions to build a 64-bit constant.
Indeed. However, 22 bits is the biggest immediate value you get in the SPARC instruction set. This was more to point out the obvious benefit of being able to do the load directly in x86-64.
The test was done to see if 32-bit binaries run faster on 64-bit-capable systems than 64-bit binaries, so using the maximum available optimizations for 32-bit on the UltraSPARC is entirely appropriate.
You've missed my point entirely. The goals you stated at the beginning of the article were that you wanted to compare the performance of 32-bit applications versus 64-bit applications. However, I point out that there are some caveats, since your testing methodology, in a number of cases, mixes 64-bit things with 32-bit things. You don't even address the issues or the drawbacks and instead insist that every possible optimization should be valid. If this is true, you should state that the purpose of your article is not to divine whether 32-bit is faster than 64-bit, but how users should go about getting their apps to run the fastest on an UltraSPARC chip. Or, that your goal was to determine the difference in speed between applications that use 32-bit pointers and 64-bit pointers. If those were your claims, I would have no issue. However, you've set out with a general goal and only tested a few cases. I argue that the cases you've tested are not sufficiently general for you to offer a correct conclusion on the entirety of 32-bit vs. 64-bit performance.
I'm tired of seeing conjecture, and conjecture-taken-as-fact in regards to OS and platform performance. Even if it's backed up by sound computer science theory (64-bit data paths, cache misses, etc), it's still pure conjecture until it's tested.
I'm not sure what you're trying to say here. However, going into a laboratory and running experiments is never, by itself, going to cure cancer, AIDS, SARS, whatever. It is experimental results coupled with a body of knowledge that allow us to make conclusions that advance the sciences. If you don't know anything about viral pathology, anatomy, physiology, etc., and you go run a medical experiment, you're unlikely to learn much from it. Do you insist that your waiters prove that real numbers exist before you pay your bill at a restaurant? Do you make people prove that gravity exists when they want to talk to you about it? This statement sounds awfully ridiculous. There are plenty of things that are theoretical that we accept as fact because it facilitates the ease with which we communicate about other things. Like any other field, there is an accepted body of knowledge relating to computer systems which is important to understand if you hope to reach meaningful conclusions. Simply dismissing everyone else's points as conjecture because they have not proven them to _you_ is silly. However, if you think you can prove that application performance increases when you increase your cache miss rate, please feel free to test it out. I think you'll find that this is much more solid than conjecture.