I blogged a while back about a contest where you were asked to find input for an MD5 hash that would produce a result that matched at least some number of bits to a hash value the contest provided.
I found the contest by watching the promotion video on the Novena crowd funding campaign web site. Towards the end of the video you’ll see some text appear for a fraction of a second on the screen behind Andrew Huang. It’s a base64 encoded URL for the contest web site. The contest awarded a Novena laptop to the person who found the most strings whose hashes matched a given number of bits. The contest web site stated the number of bits you had to match might change depending on how difficult the challenge proved to be.
When I found the contest, the number of bits you had to match was up to 107. There was one person at the top who already had several hundred matches under their belt. Either they are and expert at MD5 and its vulnerabilities, or (my guess) they found the contest much sooner when the difficulty was much lower.
The target hash for the contest was “29c986a49abf80e9edf2ffe8efb7e040” which is the MD5 hash of the string “novena”.
The most number of bits I was able to match was 105 after running my HashAway program for 48 hours on a 4-core Intel i5 processor. That I didn’t even find one was a bummer, but it was a great exercise in the search to pull out the best performance from my program. For example I had initially written the program to generate a random string and hash that to search for matches. What I eventually did was hash the result of the previous hash. This had a great boost in performance as the overhead of generating random strings, and converting them to byte values to hash, was gone. A simple trick with big results. It was also a great exercise in becoming familiar with multi-threading in Java.
The last version of the code I had posted was version 1.3. Here is HashAway version 1.4 which is what I used to find my best 105-bit match.
I did do a quick search and found one other person’s source code for this problem. In the source you can see they found the contest when the difficulty was at 105 bits. If only I’d found the contest sooner. The comments in the source also state the contest started out at 90 bits! I can get 90-bit matches at a rate of 4/5 a second even on an old dual-core processor.
Ah well. So it goes.
I saw a video on the internet not too long ago. In it, on a screen in the background, for just a few seconds, displayed a string of text I immediately recognized as being base64 encoded data. I paused the video, typed the text into my base64 decoder ring, and found it was a URL. I put the URL into my browser and found out that I’d stumbled upon a little contest that would begin sometime in the future and that I should keep checking back to see if the contest has started.
Well I forgot about all of that and only went back this evening to check and sure enough the contest had begun. It ends on May 10th.
The contest is fairly straightforward computer geek stuff. An MD5 hash value was provided and the contest was to find strings that, when similarly hashed, shared X number of the 128 bits in the hash on the page. There’s a form asking for a name and e-mail address to register. Once registered, you can submit strings you find that meet the criteria. When the contest is over, the person with the most number of found hashes wins a new, very cool computer.
There’s only 10 people on the leader board and only 2 of them have over 10 found hashes, but those two currently have over 100. The number of bits you need to match can be changed as the contest operators sees fit. It’s currently set at 107 bits. With nothing to do on a Friday night and a desire to take a little Java refresher, I dug in.
It’s multi-threaded to maximize all the cores in my CPU (4). Thread priority is set to minimum, so I can keep doing whatever I want on my computer and it won’t slow down. I’m averaging about 900,000 hashes a second and after an hour I haven’t found a match. This may take some time. The program can be set at the command line to use more or less threads and change the difficulty to. Try it at difficulty of 90 and you’ll see a few start popping up immediately.
I’m not providing a URL to the contest and the hash in the source isn’t the real hash the contest wants you to use. Maybe I’ll mention it after the contest finishes. For now I just want to find my first hash with the program I wrote. My electric company will love me.
— Update 2 Days Later —
It started out at about 900,000 hashes a second. I found some ways to speed things up such as avoiding string manipulation wherever possible and using bit operations instead of Java objects like BitSet and I found a slightly faster RNG than the native java.util.Random. That got me up to 3.5 million hashes a second. I then realized I was running a 32-bit JVM (for browser compatibility). I installed a 64-bit JVM and now I’m up to 5.5 million hashes a second.
The source linked above has been updated to my latest.
I probably have about 36 hours running this thing so far and I haven’t found a match yet. I modified the code to keep track of the highest number of bits matched against so I can get some sense of progress. The highest I’ve matched is 103 bits. So close, but so exponentially far away.
I’ve got six days left.
— Update 3 Days Later —
I’ve removed the use of String objects from the search loop (except when outputting a match, that doesn’t mean much from a performance standpoint) by writing my own code to generate the hexadecimal values instead of using Long.toHexString(). I have also removed the calls to the random number generator in the loop. Now I seed the loop with a random value, but then keep hashing the previous results. This is, in a way, how RNGs work and for my purposes (i.e. this is not a security application) it gets the job done without the overhead of generating new values with every iteration.
This has boosted performance up to 10 million hashes a second. 5 days left to go and no matches yet. Still trying to squeeze out every last drop I can from this code (“SO USE ASM or C!” Yeah, I know, but I want to stick with Java as this is a Java learning experience.)