*: not 100% true even if both operations have the same performance, because "trap on overflow" allows less reorganization than modulo arithmetic.
renox,
"Uh? MIPS has two version of integer operation: ADD/ADDU, SUB/SUBU (one which trap on overflow, one which doesn't and corresponds to modulo operation)"
I wasn't really disagreeing with you.
"so the big thing here is that there is nearly no difference in performance between 'modulo' computations and 'trap on overflow' computations(*) which isn't the same with other ISA."
To be fair, we'd need to actually test the performance differences on real CPUs; we cannot draw performance conclusions by counting instructions. On some x86 CPUs, jumps are "free" as long as they are predictable.
For example:
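TEST1:                          ; add + JO that is (almost) never taken
    mov ecx, 0x00000000         ; LOOP decrements ecx first, so 0 gives 2^32 iterations
    mov eax, 0x00000000
.again:
    add eax, 0x00000001
    jo .overflow                ; taken only when the add overflows
.overflow:
    loop .again

TEST2:                          ; baseline: add with no overflow check
    mov ecx, 0x00000000
    mov eax, 0x00000000
.again:
    add eax, 0x00000001
    loop .again

TEST3:                          ; add + JNO that is (almost) always taken
    mov ecx, 0x00000000
    mov eax, 0x00000000
.again:
    add eax, 0x00000001
    jno .nooverflow             ; taken whenever the add does not overflow
.nooverflow:
    loop .again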
On my 3 GHz machine, 2^32 iterations per pass and 3 passes per test give the following (three timings -> mean):
Test1=8.597570, 8.597259, 8.597007 -> 8.597278
Test2=8.596904, 8.599724, 8.598505 -> 8.598377
Test3=8.596864, 8.596893, 8.597207 -> 8.596988
Note that adding a jump instruction did not hurt the loop's performance: the differences between the three tests are within a reasonable margin of error.
Here are the same tests with unpredictable branching (adding 0x80000000 instead of 1 forces the branch to toggle each iteration):
Test1=10.694893, 10.742191, 10.759600 -> 10.732228
Test2=8.597308, 8.596457, 8.595678 -> 8.596481
Test3=10.862917, 10.777331, 10.680178 -> 10.773475
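To make the toggling claim concrete, here is a minimal Python sketch that models only the overflow flag of a 32-bit `add`. The helper names (`add32`, `overflow_pattern`) are made up for this illustration, and the model says nothing about timing or about how a real branch predictor behaves.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add32(a, b):
    """Model a 32-bit ADD on bit patterns: return (result, overflow_flag)."""
    result = (a + b) & MASK32
    # Signed overflow: both operands share a sign bit that the result lacks.
    overflow = ((a ^ result) & (b ^ result) & SIGN32) != 0
    return result, overflow


def overflow_pattern(step, iterations):
    """Overflow flag after each `add eax, step`, starting from eax = 0."""
    eax = 0
    flags = []
    for _ in range(iterations):
        eax, overflow = add32(eax, step)
        flags.append(overflow)
    return flags


if __name__ == "__main__":
    print(overflow_pattern(0x80000000, 6))  # step used in the second run
    print(overflow_pattern(0x00000001, 6))  # step used in the first run
```

With a step of 0x80000000 the flag alternates between clear and set on every iteration, so `jo` and `jno` change direction each time. With a step of 1 the flag stays clear until eax reaches 0x7FFFFFFF, which happens once in 2^32 iterations.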
So the unpredictable branches do hurt performance, but I have to question whether a MIPS trap would do any better. Can MIPS do overflow checking without a trap? So long as overflow is exceptional behaviour, I think we can both agree from these tests that the extra jump won't make any significant difference.
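For a rough sense of scale, the timings from the two tables above can be compared against the branch-free Test2. This is just arithmetic on the posted numbers, with a hypothetical helper name (`relative_to_baseline`); it is not a rerun of the benchmark.

```python
from statistics import mean

# Timings copied from the two result tables above (three passes each).
predictable = {
    "Test1": [8.597570, 8.597259, 8.597007],
    "Test2": [8.596904, 8.599724, 8.598505],
    "Test3": [8.596864, 8.596893, 8.597207],
}
unpredictable = {
    "Test1": [10.694893, 10.742191, 10.759600],
    "Test2": [8.597308, 8.596457, 8.595678],
    "Test3": [10.862917, 10.777331, 10.680178],
}


def relative_to_baseline(results, baseline="Test2"):
    """Mean time of each test divided by the mean time of the baseline."""
    base = mean(results[baseline])
    return {name: mean(times) / base for name, times in results.items()}


if __name__ == "__main__":
    for label, results in (("predictable", predictable),
                           ("unpredictable", unpredictable)):
        for name, ratio in relative_to_baseline(results).items():
            print(f"{label:13s} {name}: {ratio:.4f}x of Test2")
```

In the predictable run, Test1 and Test3 land within a small fraction of a percent of Test2. In the toggling run they come out roughly 25% slower.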
Now, maybe it's true that an inordinate amount of silicon has gone to branch prediction in the x86, silicon which in theory might have been put to better use in the MIPS, but you can't deny the x86 seems to do a decent job in this microbenchmark. Unfortunately I don't have a MIPS processor to test with.




renox,
"I also prefer MIPS ISA because it can trap on integer overflow which is nice for efficient Ada compilation. "
You may be right about MIPS handling overflow better than x86, but I actually don't mind the way the x86 does it.
    ADD [edi], dword 5      # cause overflow, then any one of the following:
    JO xyz                  # optionally handle overflow
    INTO                    # generate an interrupt on overflow
    ADC [edi+4], dword 0    # add carry
    CMOVO [edi], ...        # clip the range
                            # do nothing, modulo arithmetic is often desirable.
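As a rough illustration of those choices, the sketch below models a 32-bit add with carry and overflow flags in Python and then builds each policy on top of it. All function names are invented for this example. It mimics what the flag-based idioms compute, not how the instructions are encoded or how fast they run.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add32_flags(a, b, carry_in=0):
    """Model ADD/ADC on 32-bit patterns: return (result, carry, overflow)."""
    total = a + b + carry_in
    result = total & MASK32
    carry = total > MASK32
    overflow = ((a ^ result) & (b ^ result) & SIGN32) != 0
    return result, carry, overflow


def add_wrapping(a, b):
    """Do nothing about overflow: plain modulo 2**32 arithmetic."""
    return add32_flags(a, b)[0]


def add_trapping(a, b):
    """JO / INTO style: treat signed overflow as an error."""
    result, _, overflow = add32_flags(a, b)
    if overflow:
        raise OverflowError("signed 32-bit overflow")
    return result


def add_saturating(a, b):
    """CMOVO style: clip the signed result to the representable range."""
    result, _, overflow = add32_flags(a, b)
    if overflow:
        # On overflow both operands share a sign, so a's sign picks the bound.
        return 0x80000000 if a & SIGN32 else 0x7FFFFFFF
    return result


def add64_via_carry(lo_a, hi_a, lo_b, hi_b):
    """ADD then ADC: propagate the carry into the upper 32-bit word."""
    lo, carry, _ = add32_flags(lo_a, lo_b)
    hi, _, _ = add32_flags(hi_a, hi_b, int(carry))
    return lo, hi


if __name__ == "__main__":
    print(hex(add_wrapping(0x7FFFFFFF, 1)))
    print(hex(add_saturating(0x7FFFFFFF, 1)))
    print(add64_via_carry(0xFFFFFFFF, 0, 5, 0))
```

The point of the sketch is that one add produces both flags, and the policy is chosen afterwards by whichever instruction follows it.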
"But unfortunately nearly nobody use a language with this (nice) behaviour instead of stupid C/C++ or Java's behaviour on integer overflow, so it doesn't really matter"
I have the same gripe with the C language. It offers no way to use or act upon the overflow, which leads to less efficient algorithms.
Also, I think modulo arithmetic should be made explicit. Languages like C/Java that implicitly discard overflow information lead to bugs regardless of architecture. New languages should automatically assert errors on overflow unless told to do otherwise.
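A minimal sketch of what "checked by default, modulo only on request" could look like, written in Python for brevity. The `Int32` class and its `wrapping_add` method are hypothetical names for this illustration and are not taken from any existing language or library.

```python
class Int32:
    """Hypothetical 32-bit signed integer whose `+` never overflows silently."""

    MIN, MAX = -2**31, 2**31 - 1

    def __init__(self, value):
        if not Int32.MIN <= value <= Int32.MAX:
            raise OverflowError(f"{value} does not fit in 32 bits")
        self.value = value

    def __add__(self, other):
        # Default behaviour: checked. The constructor raises if out of range.
        return Int32(self.value + other.value)

    def wrapping_add(self, other):
        # Opt-in modulo arithmetic, spelled out at the call site.
        total = (self.value + other.value) & 0xFFFFFFFF
        return Int32(total - (1 << 32) if total & 0x80000000 else total)

    def __eq__(self, other):
        return isinstance(other, Int32) and self.value == other.value

    def __repr__(self):
        return f"Int32({self.value})"


if __name__ == "__main__":
    top = Int32(Int32.MAX)
    print(top.wrapping_add(Int32(1)))   # explicit wraparound
    try:
        top + Int32(1)                  # implicit wraparound is refused
    except OverflowError as exc:
        print("overflow:", exc)
```

The design choice being illustrated is that the ordinary operator reports the error, while wraparound needs a visibly different call.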