When you build any sort of device with firmware, you will want to be able to update it in the field. Bugs have a way of creeping into software. Many times the device hits the market before all features are fully baked. After all, you're selling the potential of the device, right? But nothing will leave your product more vulnerable than a badly designed firmware update process.
A Crash Course on Authentication
To keep firmware updates relatively safe, you will need:
- a means of checking the integrity of an update;
- a means of checking the authenticity of an update.
If you check the integrity of an update, corrupt updates won't get past you. A corrupt update can render your device a (maybe expensive?) paperweight. A CRC32 will do a good enough job of this, and many MCUs have hardware support to speed up CRC32 calculation. This might be adequate for firmware loads in a trusted environment. But how do you authenticate firmware in the field?
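As a rough illustration of the integrity check, here is a minimal sketch using Python's zlib module. The image layout (payload followed by a 4-byte little-endian CRC32 trailer) and the function names are assumptions made for this example, not a format the article prescribes.

```python
import zlib

def append_crc32(firmware: bytes) -> bytes:
    """Return the payload with its CRC32 appended (assumed layout: 4 bytes, little-endian)."""
    crc = zlib.crc32(firmware) & 0xFFFFFFFF
    return firmware + crc.to_bytes(4, "little")

def check_crc32(image: bytes) -> bool:
    """Recompute the CRC32 over the payload and compare it with the trailer."""
    if len(image) < 4:
        return False
    payload, trailer = image[:-4], image[-4:]
    return (zlib.crc32(payload) & 0xFFFFFFFF) == int.from_bytes(trailer, "little")

image = append_crc32(b"example firmware payload")
corrupted = bytes([image[0] ^ 0x01]) + image[1:]  # flip a single bit in transit
print(check_crc32(image), check_crc32(corrupted))  # prints: True False
```

This catches accidental corruption only. Anyone can recompute the CRC over a modified payload, so it says nothing about who produced the image.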
A naive approach might be to keep the initial value (seed) of your CRC32 computation secret and non-standard. This falls to a trivial brute-force attack, since the initial value is only 32 bits long and an attacker holding a single valid image can test every candidate offline. So there must be a better way to prevent tampering with firmware images...
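To make the brute-force point concrete, here is a small sketch. It assumes the attacker has one valid (image, checksum) pair, and it uses the starting-value argument of zlib.crc32 as a stand-in for the custom initial value. So that the demo finishes instantly, the secret is restricted to 16 bits (a demo-only assumption); the full 32-bit search is the same loop run over about 4.3 billion candidates, which is well within reach of native code.

```python
import zlib
from typing import Optional

def seeded_crc32(data: bytes, seed: int) -> int:
    """CRC32 started from a non-standard value (the would-be 'secret')."""
    return zlib.crc32(data, seed) & 0xFFFFFFFF

def recover_seed(data: bytes, observed: int, seed_bits: int) -> Optional[int]:
    """Try every candidate seed until one reproduces the observed checksum."""
    for candidate in range(1 << seed_bits):
        if seeded_crc32(data, candidate) == observed:
            return candidate
    return None

DEMO_SEED_BITS = 16      # demo-only assumption; a real seed would be 32 bits
secret_seed = 0xBEEF     # hypothetical vendor secret
firmware = b"genuine firmware v1.0"
observed = seeded_crc32(firmware, secret_seed)  # what ships with the update

recovered = recover_seed(firmware, observed, DEMO_SEED_BITS)
tampered = b"tampered firmware v6.6"
forged_crc = seeded_crc32(tampered, recovered)  # a valid checksum for the forgery
print(hex(recovered), hex(forged_crc))
```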
Enter the Naive Cryptographer
Let's look at an example of how not to do this. This is a real-world example: an engineer read a Wikipedia page on cryptographic hashing. The scheme they developed was as follows:
- Pick a key that is exactly the size of one SHA-256 block (64 bytes). This is our golden firmware update key, and every device will have this built in.
- Start a SHA-256 hash and feed it the key block first.
- Continue the same hash over the firmware (plaintext), so the result is SHA-256(key || firmware).
- Append the hash output to the end of the firmware update image.
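The steps above can be sketched as follows. This is a reconstruction under assumptions: the function names and the layout (firmware followed by a 32-byte digest) are illustrative, and the key shown is a stand-in, not a real secret. It is shown only to make the flawed scheme concrete; do not use it.

```python
import hashlib

BLOCK_SIZE = 64   # SHA-256 block size in bytes
DIGEST_SIZE = 32  # SHA-256 output size in bytes

def sign_naive(key: bytes, firmware: bytes) -> bytes:
    """Flawed scheme: tag = SHA-256(key || firmware), appended to the firmware."""
    if len(key) != BLOCK_SIZE:
        raise ValueError("key must be exactly one SHA-256 block")
    h = hashlib.sha256()
    h.update(key)       # hash the key block first
    h.update(firmware)  # then continue over the plaintext firmware
    return firmware + h.digest()

def verify_naive(key: bytes, image: bytes) -> bool:
    """Device-side check: recompute the tag and compare it with the trailer."""
    if len(image) < DIGEST_SIZE:
        return False
    firmware, tag = image[:-DIGEST_SIZE], image[-DIGEST_SIZE:]
    return hashlib.sha256(key + firmware).digest() == tag

golden_key = bytes(range(BLOCK_SIZE))  # stand-in for the shared secret
image = sign_naive(golden_key, b"genuine firmware v1.0")
print(verify_naive(golden_key, image))
```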
Let's step back and look at SHA-256. The SHA-256 algorithm is a Merkle-Damgård (MD) design. These algorithms divide a message into equal-sized blocks (512 bits for SHA-256). You then iterate through each block, from first to last, applying a compression function that folds the block into a running internal state. The hash gets 'built-up' by consuming each block in order. After hashing the final block, the 'intermediate' state becomes the 'final' state.
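The block-by-block behaviour is visible even through a high-level API. The sketch below feeds a message to hashlib in 64-byte (512-bit) pieces and gets the same digest as hashing it in one call. Note that hashlib does its own buffering and padding internally, so this only mirrors the order in which data is consumed; the helper name is made up for the example.

```python
import hashlib

def sha256_blockwise(message: bytes, block_size: int = 64) -> bytes:
    """Feed the message to SHA-256 one 512-bit block at a time."""
    h = hashlib.sha256()
    for offset in range(0, len(message), block_size):
        h.update(message[offset:offset + block_size])  # state advances per block
    return h.digest()  # padding and the final block are handled here

message = b"firmware bytes " * 20  # 300 bytes: four full blocks plus a partial one
print(sha256_blockwise(message) == hashlib.sha256(message).digest())
```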
The astute reader would be correct to ask 'But what if the length of the plaintext is not a multiple of 512 bits?' In this case, the final block gets padded with MD-compliant padding. The reason for this specific structure is beyond our scope. But what's important to note: padding can be implicitly done by the tool hashing data, or explicitly inserted into the plaintext. From the perspective of any MD design hash, this is indistinguishable. The hash function output will be identical.
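For reference, the padding SHA-256 uses is a single 0x80 byte, enough zero bytes to leave room for a length field, and then the message length in bits as a 64-bit big-endian integer. A minimal sketch (the function name is chosen for this example):

```python
def md_padding(message_len: int, block_size: int = 64) -> bytes:
    """Padding SHA-256 appends to a message of message_len bytes."""
    # 1 byte for 0x80 and 8 bytes for the length must fit before the block boundary.
    zeros = (block_size - (message_len + 1 + 8) % block_size) % block_size
    return b"\x80" + b"\x00" * zeros + (message_len * 8).to_bytes(8, "big")

for n in (0, 3, 55, 56, 64):
    pad = md_padding(n)
    print(n, len(pad), (n + len(pad)) % 64)  # last column is always 0
```

Note the 56-byte case: the message is short of a full block, but the padding still spills into a second block because the 0x80 byte and the length field no longer fit.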
The final output from most SHA* functions is the state of the hash after processing the final block. This means that the above authentication scheme is vulnerable to a Length Extension Attack. The hash appended to the image is the internal state after consuming the key, the firmware, and the padding, so the attacker knows that state without knowing the key. Their work is easy. If the padding isn't present, insert it into the plaintext (the key is exactly one block long, so the total hashed length is known). Append their payload to the end of the plaintext. Resume hashing from the published state, updating it with the new payload. Replace the old hash with the new one. Their modified firmware is now indistinguishable from real firmware, and they didn't have to grab any secrets at all.
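Here is a sketch of the attack against a SHA-256(key || firmware) tag. Since hashlib does not expose the internal state, the example carries a small pure-Python SHA-256 compression function (round constants derived exactly from cube roots of the first 64 primes, per the SHA-256 definition). All function names, the stand-in key, and the image layout (firmware followed by a 32-byte tag) are assumptions for illustration. The key appears only in the final check that plays the role of the device; the attacker's function never sees it.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def md_padding(message_len: int) -> bytes:
    """SHA-256 padding for a message of message_len bytes."""
    zeros = (64 - (message_len + 9) % 64) % 64
    return b"\x80" + b"\x00" * zeros + (message_len * 8).to_bytes(8, "big")

def _first_primes(count: int) -> list:
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def _icbrt(n: int) -> int:
    """Exact integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# Round constants: first 32 fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _first_primes(64)]

def _rotr(x: int, n: int) -> int:
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(state: list, block: bytes) -> list:
    """One SHA-256 compression step: fold a 64-byte block into the state."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
        big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

def length_extend(image: bytes, key_len: int, payload: bytes) -> bytes:
    """Forge a valid image from a signed one, knowing only the key's length."""
    firmware, tag = image[:-32], image[-32:]
    glue = md_padding(key_len + len(firmware))  # padding the signer's hash added implicitly
    state = list(struct.unpack(">8I", tag))     # the published tag IS the internal state
    consumed = key_len + len(firmware) + len(glue)
    tail = payload + md_padding(consumed + len(payload))
    for offset in range(0, len(tail), 64):
        state = compress(state, tail[offset:offset + 64])
    return firmware + glue + payload + struct.pack(">8I", *state)

# Vendor side (key is a stand-in secret, exactly one block long).
key = bytes(range(64))
firmware = b"genuine firmware v1.0"
image = firmware + hashlib.sha256(key + firmware).digest()

# Attacker side: no key, only the signed image and the known key length.
forged = length_extend(image, key_len=64, payload=b"malicious patch")

# Device side: the naive check accepts the forged image.
forged_fw, forged_tag = forged[:-32], forged[-32:]
print(hashlib.sha256(key + forged_fw).digest() == forged_tag)
```

The forged firmware contains the original bytes, the 'glue' padding, and the attacker's payload. Whether that is exploitable depends on how the device parses the image, but the authentication check itself offers no resistance.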
Enter the Slightly Less Naive Cryptographer
A less vulnerable approach is a keyed-Hash Message Authentication Code (HMAC). HMAC enables authenticating the origin of a message using a key that both parties have. HMAC hashes twice: an inner hash over the key (XORed with one pad constant) followed by the message, then an outer hash over the key (XORed with a second pad constant) followed by the inner result. The attacker never sees a hash state they could continue from, so the resulting hash can't be 'extended'. The attacker would need the key to forge a firmware image. This is a definite improvement. You can authenticate the source of the image (using the key). You can check the integrity of the firmware image (using the HMAC result). This raises the bar for attacks.
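A minimal sketch of HMAC-based signing and verification with Python's standard hmac module follows. The image layout (firmware followed by a 32-byte tag), the function names, and the stand-in key are assumptions for this example. The manual function spells out the two nested hashes so the structure is visible; in practice you would call the library.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA-256 output size in bytes

def sign_hmac(key: bytes, firmware: bytes) -> bytes:
    """Append an HMAC-SHA-256 tag to the firmware."""
    return firmware + hmac.new(key, firmware, hashlib.sha256).digest()

def verify_hmac(key: bytes, image: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    if len(image) < TAG_LEN:
        return False
    firmware, tag = image[:-TAG_LEN], image[-TAG_LEN:]
    expected = hmac.new(key, firmware, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

def hmac_sha256_manual(key: bytes, message: bytes) -> bytes:
    """HMAC written out: H((K ^ opad) || H((K ^ ipad) || message))."""
    if len(key) > 64:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(64, b"\x00")            # then zero-padded to one block
    inner = hashlib.sha256(bytes(k ^ 0x36 for k in key) + message).digest()
    return hashlib.sha256(bytes(k ^ 0x5C for k in key) + inner).digest()

key = bytes(range(64))  # stand-in shared secret
image = sign_hmac(key, b"genuine firmware v1.0")
extended = image[:-TAG_LEN] + b"malicious patch" + image[-TAG_LEN:]
print(verify_hmac(key, image), verify_hmac(key, extended))
```

The constant-time comparison avoids leaking how many leading bytes of a guessed tag were correct, which matters if an attacker can submit many candidate images and time the responses.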
But if the device is in an attacker's hands, neither scheme holds up, because both store the shared key on the device. The attacker could use JTAG or built-in diagnostic tools to read device memory. They can then just use static analysis techniques to extract the key. Or, even if the device prevents JTAG readout, they could find a hole to execute their own code, then exfiltrate the key. If all your devices use the same key, the attacker can forge their own firmware update and use it for any of your devices. Ouch.
So, clearly neither of these is a panacea. I think we'll have to pick a better tool. Stay tuned for Part 2, where we'll investigate public key cryptography as an option.
* SHA-3 is not susceptible to this attack. SHA-3 has a so-called 'sponge structure' with an explicit finalization stage. The 'squeezing' of the sponge does not return the intermediate hash state. MD5, SHA-1 and SHA-2 all use a Merkle-Damgård (MD) construction. The state of the hash function after applying it to a block is the hash of all prior data in the message. (Truncated SHA-2 variants such as SHA-384 and SHA-512/256 output only part of that state, which blunts the attack.)