It’s just data

Electronic Forgery

Bruce Schneier: It's time for us all to migrate away from SHA-1.

What triggered this?  A collision in MD5.  I am not an expert in cryptography, but I do find this subject fascinating.  Here is what I have pieced together so far:

What does a collision mean?  One thing it does not mean is that somebody can reverse engineer 186 megabyte Debian image knowing only a 128 bit hash.

What it does, however, mean is that somebody may be able to generate another file with the same hash.  If they can then persuade you to download it, the results would appear authentic.  If able to be done in real time, this represents a man in the middle attack.

There are three hurdles, however, to overcome.

First, the attacker would have be able to generate a collision at will.  This is considerably harder than being able to find a collision - something known as the birthday paradox.  If you gather a random collection of 23 people, you have a 50% chance of finding two people with the same birthday.  However it takes gathering a total of 253 people before you have a 50% chance of finding somebody with your birthday.  Most people find these numbers to be a bit counterintuitive, but the fact is that it is much easier to find a collision than to generate a collision at will.

Second, the collision would have to be something more than a random series of bits in order to be dangerous.  This presumably will require the ability to insert without detection a number of bits roughly the size of the key (128 or 160).  This would not be a problem for a download the size of a Linux distribution, but would be a bit more noticeable in a typical PGP signed message.

Third, the collision found dealt with MD5.  We are talking about SHA-1.  All other things being equal, if you were able to construct a computer which could crack MD5 in a second, the same computer would take over 136 years to crack SHA-1.  And we don't yet have a computer that can crack MD5 in a second, we only have a collision that was found after an immense amount of searching.

So, given all this, why the concern?  The concern is not that these hashes are vulnerable to a brute force attack, but that a shortcut can be found.  What kind of shortcut?  Here's a trivial example: one can determine that the number 123,456,789 is divisible by 9 by a simple technique: adding up the digits.  1+2+3+4+5+6+7+8+9=45 and 45 is divisible by 9, so 123,456,789 is divisible by 9.  This shortcut only works for 3 and 9, but it is much quicker than doing division.  Other numbers have other shortcuts, some seemingly nearly as hard as doing the division outright. These are merely examples of shortcuts, ones that are not likely to help with the cracking of MD5 or SHA-1, but if an appropriate shortcut can be found, all bets are off.

That's what people are concerned about.  The prevailing feeling seems to be that this is only a matter of time for MD5, but that SHA1 might last a little longer.

Bruce Schneier on hash functions, in the wake of the MD5 collisions: "there are rumors, unconfirmed at this writing, of results against SHA-1". [via]...... [more]

Trackback from Andrew Grumet's Weblog


MD5 collision

... [more]

Trackback from


"And we don't yet have a computer that can crack MD5 in a second, we only have a collision that was found after an immense amount of searching."

This is not entirely true. It's in most cases easier than you'd think. For example: if you break into some kind of server and get a database full of users and their passwords are MD5'ed then it is really simple. You just brute force all the passwords (if you know that for that kind of system you need at least 5 characters, you just start with 'aaaaa' and work from there). Every brute force you check with the database, this way you only have to walk the entire spectrum once.

Depending on a lot of parameters this would be possible to do in a weekend on a couple of home-grade PC's. This way you'd be able to retrieve the passwords of thousands of people...

But then again, that's not really the same as the 'collision' topic you talk about in your post.

- Jilles

Posted by Jilles Oldenbeuving at

The MD5 collision - an overview

Sam Ruby has written an overview about the recently discovered MD5 collision and what it means....

Excerpt from Joakim Andersson's blog at


A flaw in the MD5 hash function was recently exposed at Crypto 2004. On the surface this seems like pretty bad news (and it will need to be fixed in the very near future), but it turns out that...... [more]

Trackback from Raised By Robots


Md5 Collision found

It appears that a collision has been found in MD5. What triggered this? A collision in MD5. I am not an expert in cryptography, but I do find this subject fascinating. Effectively this means that there are two strings, which when MD5'ed return the same result. This certainly doesn't mean that MD5 is worthless, but researchers seem to have made a lot of progress towards breaking it. As Bruce Schneier said in a recent ComputerWorld column: This is how the science of cryptography advances: We learn how to design new algorithms by breaking other algorithms. Bruce also argues that its...... [more]

Trackback from Jim O'Halloran's Weblog


Sam Ruby: Electronic Forgery

Simon Willison : Sam Ruby: Electronic Forgery - A nice, simple explanation of the collision in MD5....

Excerpt from HotLinks - Level 1 at

Sam Ruby: Electronic Forgery

A simple explanation of the MD5 collision....

Excerpt from at

Hash function weaknesses

... [more]

Trackback from simon's ramblings


So it would seem that SHA-512 is the most secure route for safe hashes ... for now.  It sure would be nice, as Bruce stated, if the US would host a competition for the next standard, just as they did for AES.

Posted by Scott Johnson at

Math Forum: Calculation Tips & Tricks

Math Forum: Calculation Tips & Tricks—nice collection of math shortcuts [via Ruby]...

Excerpt from at

SHA-1 "Broken"

It has been nearly six months since the  Collision in MD5 was found, and now the alert is being sounded that  SHA1 is "broken". To put this in perspective, what is being said is that while the original standard required over one septillion hash operati... [more]

Trackback from Sam Ruby


Add your comment