
I'm also wondering if it's possible to mathematically reverse a hash algorithm in a way that provides, say, all files within a size range between X and Y that produce a given hash Z when run through the hashing algorithm. This operation could be followed by analyzing those files in the hope of finding one with some desirable properties (like, in the context of secure boot, some simple code which is able to load other code).

And if it is, whether such a "clever" approach would have the potential to beat brute-force injection of random data into a tampered binary until it produces the same hash, in terms of execution speed.

*Edited 2011-09-26 17:41 UTC*

Neolander,

"I'm also wondering if it's possible to mathematically reverse a hash algorithm in a way that provides..."

Let's go on the assumption that our function is an ideal hash function with no mathematical weaknesses, since we already know that a broken hash function narrows the search space. An ideal hash function means that the only way to find a collision is to brute-force various inputs until we generate one.

sha256sum(x1) = y

sha256sum(x2) = y

How would we find x2, such that it produces the same hash as x1? The procedure itself is simple: take a known payload and modify it with a nonce until we generate the collision we're looking for. It's been done with MD5. However, the search becomes exponentially more difficult as bits are added to the hash.
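The nonce search above can be sketched directly. Attacking full SHA-256 this way is infeasible, so the sketch truncates the digest to a hypothetical 16-bit hash (the `trunc_hash` helper and the 16-bit width are illustrative assumptions, not part of any real scheme):

```python
import hashlib

def trunc_hash(data: bytes, bits: int) -> int:
    """SHA-256 truncated to its low `bits` bits (toy hash for demonstration)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)

def find_collision(payload: bytes, target: int, bits: int) -> bytes:
    """Append an incrementing nonce to the payload until the truncated hash matches."""
    nonce = 0
    while True:
        candidate = payload + str(nonce).encode()
        if trunc_hash(candidate, bits) == target:
            return candidate
        nonce += 1

x1 = b"original binary"
y = trunc_hash(x1, 16)                     # 16-bit hash: ~65,536 tries expected
x2 = find_collision(b"tampered binary", y, 16)
assert trunc_hash(x2, 16) == y and x2 != x1
```

At 16 bits this finishes in a fraction of a second; every additional bit doubles the expected number of tries, which is the whole point of the argument below.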

Let's reduce the difficulty of the problem:

1 bit hash function:

sha256sum(x1) & 0x1 = y

sha256sum(x2) & 0x1 = y

Here, y can only be 0 or 1, therefore on average every other X value will produce a collision.

2 bit hash function:

sha256sum(x1) & 0x3 = y

sha256sum(x2) & 0x3 = y

Now y can be 0, 1, 2, or 3, so on average every 4th X value will produce a collision: twice as much work as the 1-bit case.

3 bit hash function:

sha256sum(x1) & 0x7 = y

sha256sum(x2) & 0x7 = y

On average every 8th X value produces a hash collision: twice as much work as the 2-bit case.
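The 1/2/3-bit collision rates above are easy to check empirically by truncating SHA-256 and counting how each bucket fills up (the inputs here are just the decimal strings of a counter, an arbitrary choice):

```python
import hashlib
from collections import Counter

def trunc_hash(x: int, bits: int) -> int:
    """Hash the decimal string of x and keep only the low `bits` bits."""
    digest = hashlib.sha256(str(x).encode()).digest()
    return digest[-1] & ((1 << bits) - 1)

for bits in (1, 2, 3):
    counts = Counter(trunc_hash(x, bits) for x in range(10_000))
    # With an ideal hash, each of the 2**bits values appears ~10_000 / 2**bits times.
    print(bits, sorted(counts.items()))
```

The buckets come out near 5000, 2500, and 1250 hits each, matching the "every 2nd / 4th / 8th value" estimates.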

So with this exponential growth, a 256 bit hash function would collide only every 115792089237316195423570985008687907853269984665640564039457584007913129639936 (2^256) X values on average.

Assuming we have 1 billion computers, each able to forward-hash 1 billion X values every second, we might still expect a collision only every 3.7 x 10^51 years or so on average (double check my math).
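Re-running that arithmetic with exact integers (and a 365-day year) is a one-liner's worth of Python:

```python
hashes_per_second = 10**9 * 10**9          # 1 billion machines x 1 billion hashes/s each
expected_tries = 2**256                    # average work against an ideal 256-bit hash
seconds = expected_tries // hashes_per_second
years = seconds // (365 * 24 * 60 * 60)    # 365-day year
print(years)                               # roughly 3.67e51
```

That comes out to about 3.67 x 10^51 years, so the conclusion is unchanged: brute force against 256 bits is hopeless on classical hardware.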

This is if we stick to classical computing; quantum computing introduces a whole new dimension to the problem. It's too bad quantum computing was not offered at my university, so I don't know that much about it.

Edit: I'd be happy to leave my cheap web development clients to work on this stuff instead, if anyone's willing to pay me to do it.

*Edited 2011-09-26 19:24 UTC*


Neolander,

"Isn't it possible to defeat hash signing by producing a binary which has the same hash, but different code ? After all, the transformation which turns a multi-MB binary into a small, easy to compute and check hash, loses so much information that there's a huge number of possible binaries associated to a given hash."

You are absolutely right about the hash. Furthermore, you could say the same thing about RSA itself: there are only a finite number of possibilities to try, so it's conceptually straightforward to build algorithms that enumerate them.

The key to all digital cryptography is that every known algorithm for cracking the numbers takes time exponential in the key length. Every additional bit is exponentially more difficult to crack.

However, these problems also fit squarely into the class of problems considered "embarrassingly parallel", with no I/O or synchronization overhead. This means shorter bit lengths are vulnerable to massive specialized cracking clusters.
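"Embarrassingly parallel" here means the keyspace can simply be cut into disjoint ranges, one per worker, with zero coordination between them. A toy sketch with a made-up 20-bit "key" (the key format, the 20-bit size, and the `key-N` preimage are all invented for illustration; real keys are vastly longer):

```python
import hashlib
from multiprocessing import Pool

SECRET = 0x54321                                               # hypothetical toy key
TARGET = hashlib.sha256(f"key-{SECRET}".encode()).hexdigest()  # all the attacker sees

def scan(bounds):
    """Exhaustively try every key in one disjoint range; no sync or I/O needed."""
    lo, hi = bounds
    for key in range(lo, hi):
        if hashlib.sha256(f"key-{key}".encode()).hexdigest() == TARGET:
            return key
    return None

if __name__ == "__main__":
    space, workers = 1 << 20, 4
    chunk = space // workers
    ranges = [(i * chunk, (i + 1) * chunk) for i in range(workers)]
    with Pool(workers) as pool:
        hits = [k for k in pool.map(scan, ranges) if k is not None]
    print(hits)  # the recovered toy key, [344865]
```

Doubling the number of workers halves the wall-clock time almost perfectly, which is exactly why short bit lengths fall to dedicated clusters.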

History shows us that we need to be much more conservative with our estimates of cryptographic security. I recall when the EFF deliberately built a DES-cracking machine in 1998 to publicly embarrass the US government over its legal policy of restricting international crypto products to algorithms known to be extremely weak.

http://news.cnet.com/Record-set-in-cracking-56-bit-crypto/2100-1017...

Nowadays we routinely use far larger bit lengths, and our encryption is much more resilient to brute-force attacks. 256-bit cryptography cannot be brute-forced today or in the near future, but we are still occasionally finding flaws in the algorithms which mathematically etch away at their security.

It is an open question whether cryptographers will ever be able to place a lower boundary on the work needed to crack a code, or whether sufficiently clever algorithms will always exist to reduce the search space ever further.