Linked by Christopher W. Cowell-Shah on Thu 8th Jan 2004 19:33 UTC
This article discusses a small-scale benchmark test run on nine modern computer languages or variants: Java 1.3.1, Java 1.4.2, C compiled with gcc 3.3.1, Python 2.3.2, Python compiled with Psyco 1.1.1, and the four languages supported by Microsoft's Visual Studio .NET 2003 development environment: Visual Basic, Visual C#, Visual C++, and Visual J#. The benchmark tests arithmetic and trigonometric functions using a variety of data types, and also tests simple file I/O. All tests took place on a Pentium 4-based computer running Windows XP. Update: Delphi version of the benchmark here.

There is a serious problem with the long math benchmarks, because Python is a dynamically (not statically) typed language.
In C you can say
long long int i;
and get a 64-bit signed integer (in C99).

If you do an operation that makes 'i' too big, it overflows: 'i' then contains an incorrect answer, but its type remains the same.
Python works differently to C (and the others). You can use a (plain) integer type, add to it, and instead of overflowing it is dynamically promoted to a long integer.

$ python
>>> i=1
>>> type(i)
<type 'int'>
>>> i=i+pow(2,32)
>>> i
4294967297L
>>> type(i)
<type 'long'>

Furthermore, the "long integer" does not have 64-bit precision; it has unlimited precision!
For example, try
$ python
>>> n = pow(2,63)-1
>>> n
9223372036854775807L
>>> n = n * 10
>>> n
92233720368547758070L
>>> pow(2,128)
340282366920938463463374607431768211456L
>>> pow(2,256)
115792089237316195423570985008687907853269984665640564039457584007913129639936L

This is why in the long math (64-bit integer) benchmark the LongResult for C and Python differ.
C has 776627965
Python has 10000000000
In an integer benchmark the results must match, otherwise you are not benchmarking the same operations!

For the third iteration through the loop, i = 10000000002
longResult = 10000000001 * 10000000002 = 100000000030000000002
This is more than a 64-bit signed int can handle, so it overflows. Python, however, calculates the correct result for integers bigger than 64 bits.
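
The third-iteration product above can be checked in Python itself. What follows is only a minimal sketch, written for Python 3 (where 'int' and 'long' are a single type). The helper name wrap_int64 is hypothetical, and it assumes the C multiply wraps around in two's complement. That is what compilers typically produce on x86, although the C standard leaves signed overflow undefined.

```python
def wrap_int64(value):
    """Reduce an arbitrary-precision int to what a wrapping signed 64-bit variable would hold."""
    value &= (1 << 64) - 1      # keep only the low 64 bits
    if value >= 1 << 63:        # top bit set means negative in two's complement
        value -= 1 << 64
    return value

exact = 10000000001 * 10000000002   # Python computes the full product
wrapped = wrap_int64(exact)         # what a wrapping 64-bit multiply would keep

print(exact)                # 100000000030000000002
print(exact.bit_length())   # 67, i.e. more than 64 bits
print(wrapped == exact)     # False: the 64-bit result is not the true product
```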

It looks to me like every multiply operation in the long integer test in C overflows 64 bits, so you are benchmarking a quarter of a billion 64-bit integer overflows in C (and the others) against a quarter of a billion 128-bit integer multiplications in Python. That is not a fair comparison.
Python is slower but it gives you the correct result!

A fair benchmark would involve recoding the C version (and the others) so that they check for 64-bit integer overflow and then do 128-bit arithmetic. That is not so easy to do; Python, however, gives you this for free as it's built in.
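
To give an idea of what that check amounts to, here is a minimal Python 3 sketch of the logic. The names fits_int64 and checked_mul are hypothetical and not part of the benchmark. It leans on Python's own unlimited-precision integers to get the exact product, which is the step a C version would have to implement with a wider type or a multi-word routine.

```python
INT64_MIN = -(1 << 63)
INT64_MAX = (1 << 63) - 1

def fits_int64(value):
    """True if value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX

def checked_mul(a, b):
    """Return (exact product, overflowed) for a signed 64-bit multiply."""
    product = a * b             # exact, thanks to unlimited precision
    return product, not fits_int64(product)

print(checked_mul(3, 4))                       # (12, False)
print(checked_mul(10000000001, 10000000002))   # overflow flag is True
```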