In this lecture, we give a brief overview of x86-64 assembly. A brief reference is found here (from Stanford University).

x86-64/32 Registers

64-bit register   Rnn name   Name (description)                                                     32-bit register
RAX               R0         Accumulator (return value)                                             EAX
RCX               R1         Counter (frequently used as i in iterations/loops)                     ECX
RDX               R2         Data (frequently used as the 3rd argument)                             EDX
RBX               R3         Base (frequently used as base address or counter)                      EBX
RSP               R4         Stack Pointer (keeps track of the top of the CPU stack)                ESP
RBP               R5         Base Pointer (keeps track of the base of the current stack frame)      EBP
RSI               R6         Source Index, 2nd argument                                             ESI
RDI               R7         Destination Index, 1st argument                                        EDI
R8-R15            R8-R15     Additional general-purpose registers, available only in 64-bit mode

In addition, recall that RIP (the instruction pointer) contains the address of the next instruction to be executed.

Hello World!

Consider the following simple C program hello.c.

#include <stdio.h>

int main()
{
  printf("Hello World!\n");
  return 0x1234;
}
We will use GDB to see what the assembly code for this program looks like, and use it to understand x86-64.

Remember to compile the code with the -g option to enable GDB debugging.

gcc -g hello.c -o hello

GDB

As the GDB session below shows, you can use the list, break, and run commands to pause execution right before line 5.

The disassemble command shows the actual assembly code.

lea (load effective address)

We will start with an easy one: lea at <+4>.
lea    0x9f(%rip),%rdi
This works as follows:
  1. When the instruction is executed, $rip contains the address of the next instruction <+11>, i.e., 0x8000645.
  2. So, rip+0x9f evaluates to 0x8000645 + 0x9f = 0x80006e4.
  3. So, rdi will contain 0x80006e4. As the hd (hexdump) command shows, the string "Hello World!" resides at this address.
Overall, this instruction sets rdi to the address of "Hello World!". Note that lea only computes the address; it does not read memory.

callq

Now let's consider the next instruction.
callq   0x8000510 <puts@plt>
This instruction transfers control to the code at 0x8000510 by setting $rip to that address.
Important. The callq instruction also pushes the return address (the address of the next instruction, here 0x800064a) onto the stack before jumping to the target function address.

GDB kindly tells us 0x8000510 is where function puts@plt is.

  • puts is a standard C function that prints a string on the screen.
  • plt stands for Procedure Linkage Table. This technique is used to call external procedures/functions whose address isn't known at link time and is left to be resolved by the dynamic linker at run time.

    The function you're calling is located in another module (typically, libc.so.x), so its actual address can only be supplied once the program is loaded for execution.

$ gdb hello
...
(gdb) list
1       #include <stdio.h>
2
3       int main()
4       {
5         printf("Hello World!\n");
6         return 0x1234;
7       }
(gdb) break 5
Breakpoint 1 at 0x63e: file hello.c, line 5.
(gdb) run
Starting program: .../hello

Breakpoint 1, main () at hello.c:5
5         printf("Hello World!\n");
(gdb) disassemble
Dump of assembler code for function main:
   0x000000000800063a <+0>:     push   %rbp
   0x000000000800063b <+1>:     mov    %rsp,%rbp
=> 0x000000000800063e <+4>:     lea    0x9f(%rip),%rdi    # 0x80006e4
   0x0000000008000645 <+11>:    callq  0x8000510 <puts@plt>
   0x000000000800064a <+16>:    mov    $0x1234,%eax
   0x000000000800064f <+21>:    pop    %rbp
   0x0000000008000650 <+22>:    retq
End of assembler dump.
(gdb) hd 0x80006e4 40
0x80006e4: 48 65 6C 6C       H e l l
0x80006e8: 6F 20 57 6F       o . W o
0x80006ec: 72 6C 64 21       r l d !
0x80006f0: 00 00 00 00       . . . .
0x80006f4: 01 1B 03 3B       . . . ;
0x80006f8: 38 00 00 00       8 . . .
0x80006fc: 06 00 00 00       . . . .
0x8000700: 0C FE FF FF       . . . .
0x8000704: 84 00 00 00       . . . .
0x8000708: 2C FE FF FF       , . . .
(gdb) quit
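Overall, the lea and callq instructions together implement the C code printf("Hello World!\n");. The compiler replaced printf with puts because the format string contains no conversions and ends with a newline; puts appends the newline itself, which is why the string stored at 0x80006e4 has no trailing \n. As you probably guessed, puts reads its string argument from register rdi.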

The two instructions before lea and call

Now, let's look at the two instructions at the top of main.

push

The push instruction takes the content of a given register as input (in this case, rbp) and pushes it onto the stack.

To see better what's going on, we set a breakpoint at the address of the first instruction (i.e., 0x800063a).

Note: If you want to set a breakpoint at an address, you have to add * in front of the address (see the session below).
(gdb) break *0x800063a
Breakpoint 3 at 0x800063a: file hello.c, line 4.
(gdb) c
Continuing.
Hello World!
[Inferior 1 (process 185) exited with code 064]
(gdb) run
Starting program: ../hello

Breakpoint 3, main () at hello.c:4
4       {
(gdb) disassemble
Dump of assembler code for function main:
=> 0x000000000800063a <+0>:     push   %rbp
   0x000000000800063b <+1>:     mov    %rsp,%rbp
   0x000000000800063e <+4>:     lea    0x9f(%rip),%rdi   # 0x80006e4
   0x0000000008000645 <+11>:    callq  0x8000510 <puts@plt>
   0x000000000800064a <+16>:    mov    $0x1234,%eax
   0x000000000800064f <+21>:    pop    %rbp
   0x0000000008000650 <+22>:    retq
End of assembler dump.
(gdb) p $rbp
$3 = (void *) 0x8000660 <__libc_csu_init>
(gdb) hd $rsp 40
0x7ffffffee048: F7 1B 02 FF       . . . .
0x7ffffffee04c: FF 7F 00 00       . . . .
0x7ffffffee050: 01 00 00 00       . . . .
0x7ffffffee054: 00 00 00 00       . . . .
0x7ffffffee058: 28 E1 FE FF       ( . . .
0x7ffffffee05c: FF 7F 00 00       . . . .
0x7ffffffee060: 00 80 00 00       . . . .
0x7ffffffee064: 01 00 00 00       . . . .
0x7ffffffee068: 3A 06 00 08       : . . .
0x7ffffffee06c: 00 00 00 00       . . . .
(gdb) p $rsp
$4 = (void *) 0x7ffffffee048
The session above shows the value of rbp and the stack contents. Note that the top of the stack is at 0x7ffffffee048, which is also the value of rsp.

In the session below, the GDB command stepi executes one assembly instruction, which here is the push instruction.

Note that the hexdump of the stack shows that the value of $rbp (0x8000660) has been pushed onto the top of the stack. The push also decreased $rsp by 8, to 0x7ffffffee040.

(gdb) stepi
0x000000000800063b      4       {
(gdb) hd $rsp 40
0x7ffffffee040: 60 06 00 08       ` . . .
0x7ffffffee044: 00 00 00 00       . . . .
0x7ffffffee048: F7 1B 02 FF       . . . .
0x7ffffffee04c: FF 7F 00 00       . . . .
0x7ffffffee050: 01 00 00 00       . . . .
0x7ffffffee054: 00 00 00 00       . . . .
0x7ffffffee058: 28 E1 FE FF       ( . . .
0x7ffffffee05c: FF 7F 00 00       . . . .
0x7ffffffee060: 00 80 00 00       . . . .
0x7ffffffee064: 01 00 00 00       . . . .
(gdb) p $rsp
$5 = (void *) 0x7ffffffee040

mov

The next instruction is as follows:
mov    %rsp,%rbp
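This instruction copies the value of $rsp into $rbp (in the AT&T syntax GDB uses here, the source operand comes first and the destination second). See the GDB log below to see how $rbp changed.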
(gdb) stepi

Breakpoint 1, main () at hello.c:5
5         printf("Hello World!\n");
(gdb) p $rbp
$6 = (void *) 0x7ffffffee040
(gdb) p $rsp
$7 = (void *) 0x7ffffffee040

Overall, what's going on? Setting up a new stack frame

In essence, right before the actual code of the main function is executed, the above two instructions set up a new stack frame for main: the caller's base pointer is saved on the stack, and rbp is set to point at the base of the new frame.

The three instructions after lea and call

Breakpoint 5, main () at hello.c:6
6         return 0x1234;
(gdb) disassemble
Dump of assembler code for function main:
   0x000000000800063a <+0>:     push   %rbp
   0x000000000800063b <+1>:     mov    %rsp,%rbp
   0x000000000800063e <+4>:     lea    0x9f(%rip),%rdi    # 0x80006e4
   0x0000000008000645 <+11>:    callq  0x8000510 <puts@plt>
=> 0x000000000800064a <+16>:    mov    $0x1234,%eax
   0x000000000800064f <+21>:    pop    %rbp
   0x0000000008000650 <+22>:    retq
End of assembler dump.

mov

mov    $0x1234,%eax
This instruction moves the value 0x1234 (4660 in decimal) into register eax (the 32-bit portion of rax). See the GDB log before and after executing the instruction.
(gdb) p $eax
$9 = 13
(gdb) stepi
7       }
(gdb) p $eax
$10 = 4660

pop

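pop    %rbp
This pop instruction loads the 8-byte value at the top of the stack into rbp and then increases rsp by 8. Here it restores the caller's base pointer that the push at <+0> saved, as the GDB log below shows.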
(gdb) p $rbp
$11 = (void *) 0x7ffffffee040
(gdb) stepi
0x0000000008000650      7       }
(gdb) p $rbp
$12 = (void *) 0x8000660 <__libc_csu_init>
(gdb) p $rsp
$13 = (void *) 0x7ffffffee048

retq

This instruction is used to return to the caller.
The retq instruction pops the return address from the stack into %rip, thus resuming at the saved return address.

What's going on?

These three instructions implement the C code: return 0x1234;. In particular: