Spelling words in hashes
I became mildly obsessed with the geeky project of spelling out words in the MD5 checksums of software releases. If MD5 checksums are really random (statistically uncorrelated with every bit of their input), then you would expect that 1/2 of all MD5 checksums end with the bit "0", 1/2 of those end with "00", 1/2 of those with "000", etc. Similarly, 1/(2n) of all MD5 checksums should end (or, if you prefer, begin) with any chosen n-bit sequence. Famously, you can write words in hexadecimal: dead, beef, feed, deaf, EFF. If you allow "0" for "o", you can write c0ffee. Many people say that the longest English word thus expressible is "acceded" (but others claim "fabaceae", a family of beans).
So if you have a way to partially reverse the MD5 hash, you ought to be able to make your software have a hash which starts with a chosen word. (Adam Back calls this a "partial hash collision" in his papers on hashcash. A "hash collision" is a case where two values hash to the same value; a "partial hash collision" is a case where two values hash to related values. Strictly speaking, Back's hashcash and the hashes I'm searching for are not really "collisions" because there is no original hash value with which the desired new hash value can "collide".)
A straightforward approach to this is to take a given existing file and append random bytes to it until the hash comes out the way you want. That is a kind of brute force search and doesn't rely on any knowledge about MD5. (If you want to square the number of operations required, you could try finding something which has the same first few bits in its MD5 hash and in its SHA1 hash -- I did a few experiments with that.) Adam Back mentions that in an ideal cryptographic hash, there is no way of generating collisions faster than brute force. Some other people have pointed out that, since MD5 is not ideal, there might be some known methods of going faster than brute force. You might find mathematical properties of the MD5 function which let you search a little more efficiently. (Maybe there is a correlation between one bit in the MD5 state some time before the conclusion of the MD5 calculation and a bit in the MD5 result?)
I wrote a Python program, and then ported it to C. I originally used Peter Deutsch's MD5 implementation, but Jef pointed out that OpenSSL contains a fast assembly implementation of MD5, so I changed the code slightly so it could link against OpenSSL.
The result of all of this is fun: all of the LNX-BBC builds since the evening May 29 now have checksums starting with "bbcbbc". A typical example is "bbcbbc2a348c82f5f0b907c7a7da4f5d". For official releases, we will try to put the version number in -- for example, LNX-BBC 2.1 ought to have started with "bbc210".
Zack is threatening to write a distributed collaborative partial hash collision client so that people on the Internet can donate computer time to search for how to modify LNX-BBC releases to have checksums starting with longer words. 48 bits is about the most I currently dare to hope for. I can find a 24-bit prefix in a few seconds on one computer!
If you maintain a free software project, why not get my code and spell some words in your next couple of releases' checksums? c0ffee, f00, c0de, maybe "abbe" in honor of my friend Abbey...
- Links from other weblogs:
Sun Jun 15 07:23:08: Marketing Job Canada from Marketing Job Canada
Straight Guys Sucking Cock Straight Guys Sucking Cock
Sat Jun 21 19:15:45: West African Porn from West African Porn
Malchik Gay Lyrics Malchik Gay Lyrics
Tue Jun 24 10:21:14: Selecting Girls Softball Glove from Selecting Girls Softball Glove
Blondes Girls In Pantyhose Blondes Girls In Pantyhose
Thu Jul 17 11:29:51: Kelly Clarkson Nude Pics from Kelly Clarkson Nude Pics
First Gay Experiences First Gay Experiences
Tue Jul 22 03:59:21: End Of Penis Sore After Sex from End Of Penis Sore After Sex
Beautiful Women Sunbathing Nude Beautiful Women Sunbathing Nude Wierd Xxx Wierd Xxx Mission Beach Bikini Contest Main]
Contact: Seth David Schoen