What it does, however, mean is that somebody may be able to
generate another file with the same hash. If they can then
persuade you to download it, the results would appear
authentic. If able to be done in real time, this represents a
in the middle attack.
There are three hurdles, however, to overcome.
First, the attacker would have be able to generate a collision
at will. This is considerably harder than being able to find
a collision - something known as the
paradox. If you gather a random collection of 23 people,
you have a 50% chance of finding two people with the same
birthday. However it takes gathering a total of 253 people
before you have a 50% chance of finding somebody with your
birthday. Most people find these numbers to be a bit
counterintuitive, but the fact is that it is much easier to find a
collision than to generate a collision at will.
Second, the collision would have to be something more than a
random series of bits in order to be dangerous. This
presumably will require the ability to insert without detection a
number of bits roughly the size of the key (128 or 160). This
would not be a problem for a download the size of a Linux
distribution, but would be a bit more noticeable in a typical
PGP signed message.
Third, the collision found dealt with MD5. We are talking
about SHA-1. All other things being equal, if you were able
to construct a computer which could crack MD5 in a second, the same
computer would take over 136 years to crack SHA-1. And we
don't yet have a computer that can crack MD5 in a second, we only
have a collision that was found after an immense amount of
So, given all this, why the concern? The concern is not
that these hashes are vulnerable to a brute force attack, but that
a shortcut can be found. What kind of shortcut? Here's
a trivial example: one can determine that the number 123,456,789 is
divisible by 9 by a simple technique: adding up the digits.
1+2+3+4+5+6+7+8+9=45 and 45 is divisible by 9, so 123,456,789 is
divisible by 9. This shortcut only works for 3 and 9, but it is
much quicker than doing division. Other numbers have
shortcuts, some seemingly nearly as hard as doing the division
outright. These are merely examples of shortcuts, ones that are not
likely to help with the cracking of MD5 or SHA-1, but if an appropriate
shortcut can be found, all bets are off.
That's what people are concerned about. The prevailing
feeling seems to be that this is only a matter of time for MD5, but
that SHA1 might last a little longer.
Bruce Schneier on hash functions, in the wake of the MD5 collisions: "there are rumors, unconfirmed at this writing, of results against SHA-1". [via]......
"And we don't yet have a computer that can crack MD5 in a second, we only have a collision that was found after an immense amount of searching."
This is not entirely true. It's in most cases easier than you'd think. For example: if you break into some kind of server and get a database full of users and their passwords are MD5'ed then it is really simple. You just brute force all the passwords (if you know that for that kind of system you need at least 5 characters, you just start with 'aaaaa' and work from there). Every brute force you check with the database, this way you only have to walk the entire spectrum once.
Depending on a lot of parameters this would be possible to do in a weekend on a couple of home-grade PC's. This way you'd be able to retrieve the passwords of thousands of people...
But then again, that's not really the same as the 'collision' topic you talk about in your post.
A flaw in the MD5 hash function was recently exposed at Crypto 2004. On the surface this seems like pretty bad news (and it will need to be fixed in the very near future), but it turns out that......
It appears that a collision has been found in MD5. What triggered this? A collision in MD5. I am not an expert in cryptography, but I do find this subject fascinating. Effectively this means that there are two strings, which when MD5'ed return the same result. This certainly doesn't mean that MD5 is worthless, but researchers seem to have made a lot of progress towards breaking it. As Bruce Schneier said in a recent ComputerWorld column: This is how the science of cryptography advances: We learn how to design new algorithms by breaking other algorithms. Bruce also argues that its......
So it would seem that SHA-512 is the most secure route for safe hashes ... for now. It sure would be nice, as Bruce stated, if the US would host a competition for the next standard, just as they did for AES.
It has been nearly six months since the Collision in MD5 was found, and now the alert is being sounded that SHA1 is "broken". To put this in perspective, what is being said is that while the original standard required over one septillion hash operati...