I read the Meltdown Paper so you don't have to

Meltdown is a cache-timing attack on Intel CPUs that lets any process read all of memory, because of how those CPUs do speculative execution. If that sounds like a handful, this post is for you!

This post covers everything you need to know to understand Meltdown as a developer (it assumes no knowledge of CPU internals). I highly recommend reading the original paper alongside it: https://meltdownattack.com/meltdown.pdf. I found it very readable and extremely well written.

Out of Order / Speculative Execution

Modern CPUs do not wait for a branch condition (if/switch etc.) to be evaluated before carrying on. They run ahead and execute instructions speculatively, typically along the branch they predict will be taken, and often out of order. So

if (a+b*c == d) {
  // first branch
}
else {
  // second branch
}

can have the CPU executing the body of a branch while the condition is still being evaluated. Once the CPU has the answer (say “true”), it commits the work if it was done on the first branch and scraps it otherwise. Instructions that are executed ahead of time like this are called “transient instructions” until they are committed.

The Bug

Speculatively executed code can do a lot of things. The assumption is that all of it gets rolled back if the work is discarded. The attack is possible because cache state is not rolled back. This is the crux of both the Meltdown and Spectre attacks.
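To make that concrete, here is a toy model in C. It is only a sketch: a real CPU does nothing like this in software, and every name in it (toy_cpu, toy_speculate_and_rollback, TOY_LINES) is made up for illustration. It shows the one thing that matters here: the register is restored, the cache is not.

```c
#include <stdbool.h>

#define TOY_LINES 8   /* hypothetical: number of cache lines in the toy model */

/* Toy CPU: one register plus a record of which memory lines are cached. */
struct toy_cpu {
    int  reg;                /* architectural state, visible to the program */
    bool cached[TOY_LINES];  /* cache state, normally invisible to the program */
};

/* Speculatively run "reg = memory[line]" and then throw the result away. */
void toy_speculate_and_rollback(struct toy_cpu *cpu, const int *memory, int line)
{
    int saved_reg = cpu->reg;   /* checkpoint the architectural state */

    cpu->reg = memory[line];    /* the transient load ... */
    cpu->cached[line] = true;   /* ... pulls that line into the cache */

    cpu->reg = saved_reg;       /* the rollback restores the register only */
}
```

After the call, the register holds its old value again, but cached[line] is still set. That leftover is what the rest of this post is about.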

Meltdown specifically works because a read of any memory address, even one the program has no permission to read, seems to go through while in a transient instruction. This lets a program access the memory, and thus also the secrets, of other programs and the operating system.

CPU Cache?

Reading data from RAM is slow when you are a CPU. CPU cache access times are on the order of 1-10ns, while a RAM access takes >100ns. Almost any memory read/write ends up in the cache: the cache is a mirror image of recent memory activity on the computer.
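A rough way to picture this is a toy cache that remembers which lines it has seen. This is a simulation, not a measurement: the two latencies are round numbers assumed from the figures above, and toy_cache and toy_read_ns are hypothetical names.

```c
#include <stdbool.h>

#define TOY_CACHE_LINES 16   /* hypothetical size of the toy cache */
#define TOY_HIT_NS      5    /* assumed: somewhere in the 1-10ns range */
#define TOY_MISS_NS     100  /* assumed: a round figure for a trip to RAM */

struct toy_cache {
    bool present[TOY_CACHE_LINES];  /* true once a line has been read */
};

/* Returns the simulated latency of reading `line`, then keeps it cached. */
int toy_read_ns(struct toy_cache *cache, int line)
{
    if (cache->present[line])
        return TOY_HIT_NS;          /* already cached: fast */

    cache->present[line] = true;    /* a miss loads the line into the cache */
    return TOY_MISS_NS;             /* first touch: slow */
}
```

The first read of a line costs 100ns in this model and every later read costs 5ns. That gap is large enough to measure, which is exactly what a cache-timing attack relies on.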

Cache Timing?

Let us say I have this piece of code:

$secrets = ["secret1", "secret2", "secret3", "secret4", "realSecret"];
$realSecret = $secrets[4];

This loads the real secret into memory. An attacker then does the following:

Clears the CPU cache
Runs the above program
Tries to access each candidate memory address, timing every attempt

Each access results in an error and raises an exception. However, the attacker knows that the secret is in one of the 5 possible locations. Since only one of these is ever read by the actual program, the attacker can repeatedly run the program and time the exception to figure out which location was being read. The one that was read is cached, so its exception is raised much faster.

Cache Timing attacks are the building blocks of Meltdown, which uses them as a side channel to leak data.
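The last step, picking the odd one out from a set of timings, can be sketched like this. The function name fastest_slot and the threshold are assumptions for illustration. A real attack has to calibrate the threshold on the machine it runs on and average over many runs.

```c
#include <stddef.h>

/*
 * Returns the index of the fastest access that is below threshold_ns,
 * or -1 if every access looks like a cache miss.
 */
int fastest_slot(const unsigned *latency_ns, size_t count, unsigned threshold_ns)
{
    int best = -1;

    for (size_t i = 0; i < count; i++) {
        if (latency_ns[i] >= threshold_ns)
            continue;                           /* looks like a miss, skip it */
        if (best < 0 || latency_ns[i] < latency_ns[best])
            best = (int)i;                      /* fastest hit so far */
    }
    return best;
}
```

With made-up timings of {120, 110, 130, 105, 8} nanoseconds for the five locations and a threshold of 50, this returns 4, the location of realSecret in the example above.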

The Bug, again

Now that we’ve explained cache-timing attacks (which can tell you “what-memory” is being used by another program), we can get back to Meltdown. Meltdown happens because:

CPUs do not roll back the CPU cache after speculative execution, and
You can manipulate the cache in those transient instructions to create a “side-channel”, and
Intel CPUs allow you to read memory from other processes while in a transient instruction.

From the paper:

Meltdown consists of 3 steps:

Step 1. The content of an attacker-chosen memory location, which is inaccessible to the attacker, is loaded into a register.

Step 2. A transient instruction accesses a cache line based on the secret content of the register.

Step 3. The attacker uses Flush+Reload to determine the accessed cache line and hence the secret stored at the chosen memory location.

In slightly simpler words:

1. Read an inaccessible memory address (this will raise an exception, but we’ll work on suppressing it later)
2. Depending on the value of the byte at the read address, read a specific value from a known memory location. Do this before the exception is raised, relying on Speculative Execution.
3. Use a cache-timing attack to see what value was read in step 2, and use that to infer the value you wanted to read

The trick is in executing Step 2 as a transient instruction, which lets us read any memory address, even from another process.

In code:

c = *kernel_memory_address;
b = probe[c];

There are several caveats:

Exception Suppressing

If you try to actually read kernel-space memory directly, your program will crash. Meltdown works around this by making sure that the memory is only read in transient instructions that will be rolled back.

So you wrap the above code with:

if (check_function()) {
    meltdown();
}

And make sure that check_function always returns false. What happens is that the CPU starts running the code inside the meltdown function before it has the result of the check.

Cache Lines

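The CPU cache is broken down into cache lines: fixed-size blocks (typically 64 bytes) that are loaded and evicted as a unit. Instead of indexing the probe array with the byte directly (probe[c]), Meltdown multiplies the value by 4096, the page size. That way every possible value of c lands on its own page, and therefore its own cache line, far enough from its neighbours that the hardware prefetcher does not blur the measurement. So more like: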

b = probe[c * 4096];

If you’re wondering why we do a read instead of just printing c, or copying it somewhere else, it is because CPU designers considered that and roll those instructions back correctly, so writes cannot be used to exfiltrate data from a transient instruction.
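The arithmetic behind the probe array is small enough to write down. This sketch assumes one 4096-byte page per possible byte value, as described above. The macro and function names are made up for illustration.

```c
#include <stddef.h>

#define PROBE_STRIDE 4096   /* one 4 KiB page per possible byte value */
#define PROBE_SLOTS  256    /* a byte can take 256 different values */
#define PROBE_BYTES  ((size_t)PROBE_SLOTS * PROBE_STRIDE)   /* 1 MiB in total */

/* Offset into the probe array that gets touched for secret byte c. */
size_t probe_offset(unsigned char c)
{
    return (size_t)c * PROBE_STRIDE;   /* same as shifting c left by 12 bits */
}

/* Inverse mapping: which byte value does a cached offset stand for? */
unsigned char probe_value(size_t offset)
{
    return (unsigned char)(offset / PROBE_STRIDE);
}
```

For example, a secret byte of 0x41 touches offset 0x41000. If that is the only part of the probe array that comes back fast, the attacker knows the byte was 0x41.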

Zeroes

Sometimes the exception is raised before the code executes, and the value of c is set to 0 as part of the rollback. This makes the attack unreliable. So the attack ignores zero-value reads and only primes the cache when it reads a non-zero value. The whole code becomes:

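if (check_function()) {
retry:
  c = *kernel_memory_address;   // transient read of the secret byte
  if (c == 0)
    goto retry;                 // a zero may just be a rollback artifact, try again
  b = probe[c * 4096];          // touch the page that corresponds to c
}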

The equivalent assembly (from the paper) is:

; rcx = kernel address
; rbx = probe array
retry:
mov al, byte [rcx] ; try to read the byte at the kernel address in rcx
shl rax, 0xc ; multiply the read value by 4096 by shifting left 12 (0xc) bits
jz retry ; retry if the result is zero
mov rbx, qword [rbx + rax] ; read the matching entry of the probe array

The special case where c really is zero is handled in the cache-timing step: if we notice that no probe entry has been cached, we decide the byte was a zero.
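Here is a toy version of that decoding rule. It is a simulation under stated assumptions, not attack code: the boolean array stands in for "which probe page is cached", and toy_transient_touch and toy_recover_byte are hypothetical names. The real retry loop keeps spinning on a zero until the exception ends it. The toy version just skips the touch.

```c
#include <stdbool.h>

#define SLOT_COUNT 256   /* one probe slot per possible byte value */

/* Toy stand-in for the transient access: touches slot c unless c is zero. */
void toy_transient_touch(bool *slot_cached, unsigned char c)
{
    if (c != 0)
        slot_cached[c] = true;
}

/* Toy stand-in for the cache-timing step: find the one slot that is cached. */
unsigned char toy_recover_byte(const bool *slot_cached)
{
    for (int i = 1; i < SLOT_COUNT; i++) {
        if (slot_cached[i])
            return (unsigned char)i;   /* the cached slot index is the secret */
    }
    return 0;   /* nothing was cached, so assume the secret byte was zero */
}
```

Touching with 'K' and then recovering gives back 'K'. Touching with 0 leaves every slot cold, and the recovery falls through to 0.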

References:

https://spectreattack.com/ (The official page)
https://meltdownattack.com/meltdown.pdf (The paper is very well written and easily readable)

If you’ve read this far, and this sounds interesting, perhaps you’d be interested in joining the security team at Razorpay?

Design

Like the melting CPU design? Want to use it for your blog post? Download it from below:

The above image is released under a Creative Commons Attribution 4.0 International license.