
What's in a Firmware Load?

Let's continue our case study of the 88MC200-based IoT device. We know that firmware updates get downloaded from an HTTP server. We know the details of the update aren't authenticated. In fact, the update isn't even retrieved over TLS. Things are looking good for this to be an easy hack.

Before we dive into the firmware, let's review some details about the Cortex-M3. All Cortex-M3 implementations include a Nested Vectored Interrupt Controller (NVIC). The NVIC defines a set of entry points, one per interrupt source. The NVIC also manages the device's startup. To do all this, the NVIC needs a fixed "data structure," usually at the start of an executable image:

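#include <stdint.h>

/* Start of the Cortex-M3 vector table: one 32-bit word per entry. */
struct nvic_entry {
  uint32_t initial_sp;          /* loaded into the stack pointer at reset */
  uint32_t reset_handler;       /* first code to run after reset */
  uint32_t nmi_handler;
  uint32_t hard_fault_handler;
  /* ... etc ... */
};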

There are a few things that make this table handy. First, initial_sp is likely somewhere in your data memory. If someone is being pragmatic about it, the initial_sp value will be the top of the MCU's data RAM region. Second, there's a good chance the various handlers (reset, NMI, Hard Fault, Mem Fault, PendSV, SysTick, etc.) will be close together in the code RAM (or flash in other devices). This means there is likely an obvious table of closely spaced 32-bit integers. Finally, all the handler addresses will be odd. An odd branch address tells the ARM core that the instructions at the jump target are Thumb instructions. Since the Cortex-M3 only supports Thumb/Thumb-2, every handler address must be odd. If not, a Usage Fault will occur. This is also an important detail when crafting shellcode for ARMv7-M devices.
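As a minimal sketch of the odd-address convention: bit 0 of a vector entry is the Thumb flag, and the handler's first instruction sits at the address with that bit cleared. The helper names and the sample address below are hypothetical, chosen only for illustration.

```python
def is_thumb_vector(addr: int) -> bool:
    """True if bit 0 (the Thumb flag) is set, as a Cortex-M handler entry must be."""
    return (addr & 1) == 1


def instruction_address(addr: int) -> int:
    """Clear the Thumb flag to get the address of the handler's first instruction."""
    return addr & ~1


# Hypothetical vector entry, for illustration only.
entry = 0x00100201
print(is_thumb_vector(entry))            # True
print(hex(instruction_address(entry)))   # 0x100200
```

The same rule applies in reverse when patching a vector: the value written into the table needs bit 0 set, even though the code itself is halfword aligned.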

We know that the data SRAM for the 88MC200 starts at 0x20000000, and is 128KiB long. The code SRAM starts at 0x00100000 and goes on for another 384KiB. Remember that a Cortex-M3 is a little-endian device, too. Let's have a peek at the initial hexdump:

> hexdump -C 20026.bin | head -n 10

00000000 4d 52 56 4c 7b f1 9c 2e 45 2d 76 57 02 00 00 00 |MRVL{...E-vW....|
00000010 39 01 10 00 02 00 00 00 c8 00 00 00 d0 0e 04 00 |9...............|
00000020 00 00 10 00 46 9d ec 63 02 00 00 00 98 0f 04 00 |....F..c........|
00000030 20 0a 00 00 00 00 00 20 2d 3c de 94 ff ff ff ff | ...... -<......|
00000040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
000000c0 ff ff ff ff ff ff ff ff 00 00 02 20 39 01 10 00 |........... 9...|
000000d0 45 01 10 00 59 01 10 00 45 01 10 00 45 01 10 00 |E...Y...E...E...|
000000e0 45 01 10 00 00 00 00 00 00 00 00 00 00 00 00 00 |E...............|

Remember, when reading little-endian integers in a hexdump, you have to reverse the order of the bytes to get the "human-friendly" value. For example, the bytes 39 01 10 00 at offset 0x10 read as 0x00100139.
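The byte reversal can be checked with a couple of lines of Python. This sketch decodes the four bytes at offset 0x10 of the dump; the variable names are illustrative.

```python
import struct

raw = bytes.fromhex("39 01 10 00")   # bytes as they appear in the hexdump at offset 0x10

# "<I" means: little-endian, unsigned 32-bit integer.
(value,) = struct.unpack("<I", raw)
print(hex(value))                    # 0x100139

# int.from_bytes does the same job without a format string.
assert value == int.from_bytes(raw, "little")
```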

Well, things aren't looking 100% trivial, but maybe they're 99% trivial? First, there's an obvious header attached to this file. The magic "MRVL" is a dead giveaway. Then there are some random looking numbers. Following that is a 32-bit integer '2', and then a 32-bit integer 0x00100139. That looks like a code address. But the rest of the numbers don't line up for this to be the NVIC table. Let's read on.

There's another 32-bit integer '2', followed by 0xc8. This number is interesting: there's a whole pile of 0xff's that end at 0xc8. This is looking promising. Let's skip the rest of the header now and look at the values at 0xc8.

If we treat the value at 0xc8 as a 32-bit address, its value is 0x20020000. That is exactly 0x20000000 plus 128KiB, the top of the data SRAM, so it looks like a stack pointer. The next value should be code then, right? 0x00100139 falls in the code RAM region and is odd, as a Thumb entry point should be. Following that is the NMI handler: 0x00100145 -- tight, but you only need 2-4 bytes to jump to some C code.
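The same eyeball test can be automated. The sketch below is one possible heuristic, not a definitive detector: it assumes the table is word aligned in the file, and it only checks the stack pointer plus the first three handlers. The function name and the `handlers` parameter are made up for this example; the memory map and the sample bytes come from the text and the dump above.

```python
import struct

DATA_BASE, DATA_SIZE = 0x20000000, 128 * 1024   # data SRAM
CODE_BASE, CODE_SIZE = 0x00100000, 384 * 1024   # code SRAM

# The three hexdump lines covering file offsets 0xc0 through 0xef.
DUMP_BASE = 0xC0
dump = bytes.fromhex(
    "ffffffffffffffff" "00000220" "39011000"
    "45011000" "59011000" "45011000" "45011000"
    "45011000" "000000000000000000000000"
)


def find_vector_tables(blob, base=0, handlers=3):
    """Yield file offsets where a stack pointer plus `handlers` vectors look plausible."""
    count = 1 + handlers
    for off in range(0, len(blob) - 4 * count + 1, 4):
        sp, *vectors = struct.unpack_from(f"<{count}I", blob, off)
        # The stack pointer may equal the very top of RAM, so the upper bound is inclusive.
        sp_ok = DATA_BASE < sp <= DATA_BASE + DATA_SIZE and sp % 4 == 0
        # Every handler must be odd (Thumb) and land inside the code region.
        vec_ok = all(
            v & 1 and CODE_BASE <= v < CODE_BASE + CODE_SIZE for v in vectors
        )
        if sp_ok and vec_ok:
            yield base + off


print([hex(o) for o in find_vector_tables(dump, DUMP_BASE)])   # ['0xc8']
```

On a full image, checking more handlers would cut down on false positives, at the cost of missing tables with zeroed or reserved entries.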

So it looks like we've found where the real firmware image starts, at offset 0xc8. Using these as hints, we can start mapping out the contents of the header:

  • 4 bytes magic 'M', 'R', 'V', 'L'
  • 4 bytes unknown (might be magic)
  • 4 bytes timestamp (more on this in a bit)
  • 4 bytes subheader count
  • 4 bytes entry point address (guess, but likely, based on the NVIC vectors we found)

Further analysis shows that this firmware image contains two subimages. Each subheader consists of the following:

  • 4 bytes unknown, maybe flags
  • 4 bytes offset from start of file to start of section
  • 4 bytes length of section
  • 4 bytes load address
  • 4 bytes CRC32

This is enough information to write a simple tool to dump the header in a human-readable form.

> ./fw 20026.bin
Opening input firmware image 20026.bin (size: 268728 bytes)
Firmware image: Magic Valid (2 entries in image list)
Release time: Fri, 01 Jul 2016 04:43:49 -0400 (1467362629 seconds since epoch)
    0000: 0000265936 bytes, load address: 0x00100000 CRC32: 63ec9d46 (offset: 200 bytes)
    0001: 0000002592 bytes, load address: 0x20000000 CRC32: 94de3c2d (offset: 266136 bytes)
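A header dumper along these lines fits in a few lines of Python. This is a sketch built on the guessed layout above, not the tool that produced the output shown; `Section` and `parse_header` are names invented for the example, and the input is the first 0x3c bytes copied from the hexdump.

```python
import struct
from collections import namedtuple

Section = namedtuple("Section", "flags offset length load_addr crc32")

# First 0x3c bytes of 20026.bin, copied from the hexdump.
head = bytes.fromhex(
    "4d52564c 7bf19c2e 452d7657 02000000"
    "39011000 02000000 c8000000 d00e0400"
    "00001000 469dec63 02000000 980f0400"
    "200a0000 00000020 2d3cde94"
)


def parse_header(blob):
    """Decode the main header and its subheaders using the guessed layout."""
    # 20-byte main header: magic, unknown, timestamp, subheader count, entry point.
    magic, unknown, timestamp, count, entry = struct.unpack_from("<4s4I", blob, 0)
    if magic != b"MRVL":
        raise ValueError("bad magic")
    # 20-byte subheaders follow immediately: flags, offset, length, load address, CRC32.
    sections = [
        Section(*struct.unpack_from("<5I", blob, 20 + 20 * i)) for i in range(count)
    ]
    return timestamp, entry, sections


timestamp, entry, sections = parse_header(head)
print(f"timestamp={timestamp} entry={entry:#010x}")
for i, s in enumerate(sections):
    print(f"{i:04d}: {s.length:10d} bytes, load address: {s.load_addr:#010x} "
          f"CRC32: {s.crc32:08x} (offset: {s.offset} bytes)")
```

A useful sanity check on the guessed layout: the first section's offset plus its length (200 + 265936) equals the second section's offset (266136), and the second section ends exactly at the reported file size of 268728 bytes.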

There were two especially interesting fields. Let's talk a bit about divining what a value might mean.

The Divining Rod: Identifying a Timestamp

First was the timestamp. This brings us to Phil's first law of reverse engineering tools:

"Always render a value as many ways as is possible."

OK, this isn't a law, nor should someone's name be attached to it - it's too obvious. Sometimes when you look at a value in base 10 representation it will make more sense than looking at it in hex. Or looking at it in octal might make its meaning clear. When looking at the timestamp field value:

0x57762d45

If I scratched my head a bit I could pick that out as an epoch timestamp. But:

1467362629

That I somehow immediately recognize as a timestamp from 2016. And doing a quick conversion:

Fri, 01 Jul 2016 04:43:49 -0400

Seems that's the case, since the official release of this firmware image was July 2.
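Rendering a value several ways is easy to automate. The helper below is a small sketch (the name `render` and the choice of views are mine, not part of any existing tool) that prints one 32-bit field as hex, decimal, octal, raw ASCII, and a UTC timestamp.

```python
from datetime import datetime, timezone


def render(value):
    """Return several views of one 32-bit value, keyed by representation."""
    raw = value.to_bytes(4, "little")   # byte order as stored in the file
    return {
        "hex": f"{value:#010x}",
        "decimal": str(value),
        "octal": f"{value:#o}",
        "ascii": "".join(chr(b) if 32 <= b < 127 else "." for b in raw),
        "utc": datetime.fromtimestamp(value, tz=timezone.utc).isoformat(),
    }


for name, text in render(0x57762D45).items():
    print(f"{name:>8}: {text}")
```

The UTC view reads 2016-07-01T08:43:49+00:00, which is the same instant as 04:43:49 -0400 above.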

CRC32 Is Not Security

The CRC32s were pretty easy to guess -- the 88MC200 has a CRC acceleration unit, so CRC32 would be a natural tool to reach for. The datasheet for the MCU tells us that it only supports CRC32 using the IEEE 802.3 polynomial.

My guess is that the original engineers (who I believe were at Marvell, and not Company X) had good intentions. The intention of the CRC32 was to be a quick integrity check. Verify that the image came to the device in good shape. Lucky for us, we didn't even have to attack a weak cryptographic system. There is no authentication to speak of at all!
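To see why a CRC is not authentication, consider the short sketch below. It uses Python's zlib.crc32, which implements the common CRC-32 built on the IEEE 802.3 polynomial. Whether the firmware uses exactly the same initial value, bit order, and final XOR is an assumption here, and the byte strings are stand-ins rather than real firmware.

```python
import zlib

# zlib.crc32 implements the common CRC-32 (IEEE 802.3 polynomial, reflected,
# initial value and final XOR of 0xFFFFFFFF). That the 88MC200 firmware uses
# exactly these parameters is an assumption in this sketch.
assert zlib.crc32(b"123456789") == 0xCBF43926   # standard check value

original = b"firmware section contents"   # stand-in data, not real firmware
tampered = b"firmware section c0ntents"

# No key is involved: whoever changes the data can compute the matching CRC.
print(f"{zlib.crc32(original):08x}")
print(f"{zlib.crc32(tampered):08x}")
```

The two checksums differ, which is all an integrity check promises. Nothing stops an attacker from storing the second value alongside the tampered data.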

Forging Ahead(er)

We now know how to make our own firmware image. This means we can append our code to the existing code and write it to flash. We could alter the NVIC entry point so our evil code takes over the execution process early on. Or maybe we can force a spurious interrupt to cause our code to execute.
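As a rough sketch of the simplest variant, here is what patching bytes inside a section and refreshing its checksum could look like. It works on a synthetic one-section image, not the real firmware, and it assumes that the stored CRC covers exactly the section bytes and matches zlib.crc32. `patch_section` and the target address 0x00100201 are hypothetical.

```python
import struct
import zlib

HEADER_SIZE = 20      # magic, unknown, timestamp, count, entry point
SUBHEADER_SIZE = 20   # flags, offset, length, load address, CRC32


def patch_section(image, index, rel_off, new_bytes):
    """Overwrite bytes inside section `index`, then refresh its stored CRC32."""
    sub = HEADER_SIZE + SUBHEADER_SIZE * index
    flags, offset, length, load_addr, _old_crc = struct.unpack_from("<5I", image, sub)
    if rel_off + len(new_bytes) > length:
        raise ValueError("patch does not fit inside the section")
    start = offset + rel_off
    image[start:start + len(new_bytes)] = new_bytes
    # Assumption: the CRC covers exactly the section bytes, with zlib's parameters.
    crc = zlib.crc32(image[offset:offset + length])
    struct.pack_into("<I", image, sub + 16, crc)
    return crc


# Synthetic one-section image for illustration (not the real firmware).
body = struct.pack("<2I", 0x20020000, 0x00100139) + bytes(8)
image = bytearray(struct.pack("<4s4I", b"MRVL", 0, 0, 1, 0x00100139))
image += struct.pack("<5I", 2, 40, len(body), 0x00100000, zlib.crc32(body))
image += body

# Point the reset vector at a hypothetical address (odd, so Thumb).
patch_section(image, 0, 4, struct.pack("<I", 0x00100201))
```

Appending code instead of patching in place takes more bookkeeping: the section length grows, and the offset of every later section shifts with it.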

When you're reverse engineering firmware, it's important to look for artifacts you can use to navigate the code. In this case, the NVIC table was our way in. Sometimes it isn't so obvious, so you need to look at other sources of information. This can include strings and data tables. Either way, you need to have an in-depth understanding of the architecture you're attacking. ARMv7-M just so happens to make this easy.

Break out your rogue access points and tcpdump, because next time we're going to be watching wireless traffic.
