No one really likes to fix coding mistakes. It’s frustrating enough when you have to find and fix your own, but finding and fixing other’s mistakes is extra frustrating. What were they thinking, why would you code that this way, it would be better to start this from scratch…are all things you probably find yourself saying when you need to take this task on.
Last year I ran across an article about a higher profile cryptocurrency hack that wasn’t the type of smart contract exploit that is more commonly seen. In this case the private keys were compromised and the attackers didn’t steal them…they computed them.
It turns out that Wintermute used an open-source program called profanity to brute force a smart contract address that started with many zeros (0x0000000fe6a514a32abdcdfcc076c85243de899b). This program is known as a vanity address generator so you can customize what your Ethereum Address looks like. Why they chose to do this I’m not sure as there are usage cost reasons for having more zeros in the address (but not leading zeros). However, it turns out there is a major flaw in the way profanity randomly generates it’s private keys. Without going too deep into how profanity works the basic workflow is that a “random” private key is generated, the corresponding Ethereum address is calculated and compared against what the user is looking for. If that’s not found, it increases the private key by one and tries again. The program uses attached GPUs to speed up this process significantly, into the hundreds of millions of checks per second.
The issue that occurs is in the way in which the initial “random” private key is generated. Ethereum private keys are 32 bytes in length, which means there are 2^256 possible keys. Brute forcing that many keys is impossible with current technology so there should be no way to compute someone’s private key. The profanity code uses a pseudo random number generator called mt19937, but that generator only outputs 8 bytes at a time and takes in a 4-byte unsigned int seed (which is fed by a random_device call). So the code has to get 4 outputs and combine them into one random private key. The problem lies in the fact that the mt19937_64 program only gets seeded once so the “random” numbers it outputs don’t change if the input seed is reused (the output of mt19937 is a 19937 bit seed sequence that doesn’t change if the seed is the same). By generating the initial private this way you have reduce the complexity of computation from 2^256 down to 2^32 (the 4 byte seed).
Now this is only the starting keys that are used and the methods used to crack the keys is fairly complex and involve a good understanding of cryptography and private/public key correlation, but the flaw in the program is pretty easy to understand. You can actually generate and save all of the starting private keys that could be generated by this program in just a few hours and less than 2TB of hard drive space.
A quick modification and recompile where we set the seed to a specific number you would see that the private key doesn’t change when run multiple times. Running the program multiple times outputs the same starting private key when the single randomization seed is fixed.
OK, so this goes back to “what were they thinking” when reviewing other’s code. This random number generation is such a basic concept that needs to be done properly that it defies logic that it would be done this way.
OK so how do we fix this so it’s actually random and has the expected randomness of 2^256. There are a few ways, but a for loop seemed to be the easiest for a permanent fix, however to prove that the new program doesn’t have the same issue it’s easier to just create 4 different mt19937 variables and set the first to zero. However, upon closer inspection you are still using only 4 * 4-byte seeds, so you have only increased the randomness to 2^128. That’s certainly a lot better than 2^32, but we are wanting to get a cryptographically secure randomness of 2^256.
So how do we get and feed a random seed sequence that is at least 32-bits long so mt19937 is random? We don’t even need to reseed that function for the multiple calls we need to make if we can feed it a large enough seed (mt19937 will take up to 624 words). In this case I found a blog that talked about how to properly seed mt19937 so I went about modifying the program to use that information.
Pretty simple changes to fix a program flaw that cost a company $160M.
This is a good reminder to everyone that just because a program is open source doesn’t mean that it doesn’t contain flaws. In reality it might be less secure to use something that is open source because everyone has eyes on the code and and attacker can more easily find the flaw and exploit it. In this case making a program to exploit the flaw can actually be done with about 100 lines of code changes in the original profanity code.