Interface

Operating System
Architectural Interface

Reading

Objectives

Interface Definitions

There are three main interfaces that are defined by some combination of the hardware architecture and the operating system:

Instruction Set Architecture (ISA)

Application Binary Interface (ABI)

Application Programming Interface (API)

Data Representation

Endianness

The order in which the bytes of a multi-byte value are stored in memory is called Endianness. There are two choices, Big Endian and Little Endian. [ref1 ref2]

LSB/MSB: The "most significant" byte of a multi-byte value is the byte that carries the largest place value. In normal (Western) written numbers, the most significant digit is the leftmost one. For example, in the decimal number 1992, the digit '1' is the "most significant digit." In the hex number 0xAA55B2E3, the byte AA is the most significant byte (MSB), and E3 is the least significant byte (LSB).

The concept can be applied to bits, as well. In the number 01001001, the most significant bit is occupied by a 0, and the least significant bit is occupied by a 1.
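The MSB/LSB definitions above can be checked with a few shifts and masks. The following is a minimal sketch in C; the function names are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Most significant byte of a 32-bit value: the top 8 bits. */
uint8_t msb_byte(uint32_t v) { return (uint8_t)(v >> 24); }

/* Least significant byte of a 32-bit value: the bottom 8 bits. */
uint8_t lsb_byte(uint32_t v) { return (uint8_t)(v & 0xFFu); }

/* Most significant bit of an 8-bit value (bit 7). */
unsigned msb_bit(uint8_t v) { return (v >> 7) & 1u; }

/* Least significant bit of an 8-bit value (bit 0). */
unsigned lsb_bit(uint8_t v) { return v & 1u; }
```

For 0xAA55B2E3, msb_byte() yields 0xAA and lsb_byte() yields 0xE3. For the bit pattern 01001001 (0x49), msb_bit() yields 0 and lsb_bit() yields 1, matching the text.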

Multi-byte storage is defined as follows:
Example: Suppose a 32-bit value is stored in memory using an instruction like the following pseudo-assembly: mov [R1], 0xAA55B2E3

Suppose register R1 points to memory address 0x100. The bytes would be stored as follows:
32-bit Big-Endian
  • 0x100: AA
  • 0x101: 55
  • 0x102: B2
  • 0x103: E3
64-bit Big-Endian
  • 0x100: 00
  • 0x101: 00
  • 0x102: 00
  • 0x103: 00
  • 0x104: AA
  • 0x105: 55
  • 0x106: B2
  • 0x107: E3

32-bit Little-Endian
  • 0x100: E3
  • 0x101: B2
  • 0x102: 55
  • 0x103: AA
64-bit Little-Endian
  • 0x100: E3
  • 0x101: B2
  • 0x102: 55
  • 0x103: AA
  • 0x104: 00
  • 0x105: 00
  • 0x106: 00
  • 0x107: 00


Note: observe how, in the Little Endian representations, the lower-order bytes do not move when the value beginning at address 0x100 is interpreted as a 32-bit or a 64-bit number. This fact leads to architectural efficiencies that tend to favor Little Endian systems, which are generally more prevalent.
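The byte layouts listed above can be reproduced in portable C by writing each byte explicitly, which works the same on any host. This is a sketch only; store_be32, store_le32, and load_le are hypothetical helper names, and the byte array stands in for memory starting at address 0x100.

```c
#include <stdint.h>

/* Store v with the most significant byte at the lowest address. */
void store_be32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Store v with the least significant byte at the lowest address. */
void store_le32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Read n bytes (n <= 8) starting at p as one little-endian number. */
uint64_t load_le(const uint8_t *p, unsigned n) {
    uint64_t v = 0;
    for (unsigned i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}
```

If an 8-byte buffer is zeroed and store_le32() writes 0xAA55B2E3 at its start, load_le() returns the same value whether it reads 4 bytes or 8 bytes, which is the property the note describes.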

Where Little Endian storage can lead to confusion is in the interpretation of multi-byte values: when they are read one byte at a time, from lower addresses to higher addresses, they appear backwards compared to normal written notation.

In computer systems, Endianness is important when storing multi-byte numbers in RAM, on disk, or even in network packets. In TCP/IP networks, multi-byte values such as the IP address and port number are stored in the packet in Network Order, which is Big Endian, and in RAM in Host Order, which is whatever order the host CPU uses (Little Endian on x86). In socket programming, this requires a conversion between the two formats.
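In C socket code, the conversion between Host Order and Network Order is usually done with htonl(), htons(), ntohl(), and ntohs() from arpa/inet.h. The sketch below assumes a POSIX system; put_net32 and get_net16 are hypothetical helper names used only for illustration.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit host-order value into buf in network (big-endian) order. */
void put_net32(unsigned char buf[4], uint32_t host_value) {
    uint32_t net = htonl(host_value);   /* host order -> network order */
    memcpy(buf, &net, sizeof net);
}

/* Read a 16-bit value (for example a port number) stored in network order. */
uint16_t get_net16(const unsigned char buf[2]) {
    uint16_t net;
    memcpy(&net, buf, sizeof net);
    return ntohs(net);                  /* network order -> host order */
}
```

On a Little Endian host these calls swap the bytes; on a Big Endian host they do nothing. Either way, put_net32() leaves the bytes AA 55 B2 E3 in the buffer for the value 0xAA55B2E3, and get_net16() turns the packet bytes 1F 90 into port 8080.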

A computer architecture may specify the use of only one Endianness (such as x86, which supports only Little Endian), or both (for example, RISC-V supports either Big or Little Endian implementations).
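A program can discover which byte order its host uses by storing a known multi-byte value and inspecting the byte at the lowest address. A minimal sketch, with is_little_endian as a hypothetical function name:

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
int is_little_endian(void) {
    uint16_t probe = 0x0001;
    /* Look at the byte stored at the lowest address of the 16-bit value. */
    return *(const unsigned char *)&probe == 0x01;
}
```

On an x86 machine this function returns 1, because the least significant byte (01) is stored first.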

Integers

Integers may be stored in a number of different internal formats, according to the architectural specification. Because the ALU has to perform mathematical operations on integers, it needs to know how to interpret the bit values, according to their storage format.

Unsigned Integers. Unsigned integers are normally stored as a direct conversion from their decimal value to the binary equivalent. For example, an unsigned 16-bit integer can store the values 0 through 65,535.
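The 16-bit range can be observed directly in C using the fixed-width type uint16_t from stdint.h. The sketch below also shows what happens at the top of the range, where unsigned arithmetic wraps around; add_u16 is a hypothetical helper name.

```c
#include <stdint.h>

/* Add two unsigned 16-bit integers; the result wraps modulo 65,536. */
uint16_t add_u16(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}
```

UINT16_MAX is 65,535, and add_u16(65535, 1) wraps around to 0 because the true sum needs a 17th bit that the type does not have.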

Signed Integers. Signed integers allow positive and negative values, but there are different ways to represent them. Example: The eight bits 10000001 would represent the value 129 as an Unsigned Integer, the value -1 in Sign-Magnitude representation (the leading bit is the sign and the remaining seven bits are the magnitude), or the value -127 in Two's Complement. Same bits, different meaning, depending on the architectural specification and data type.
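The three readings of the same eight bits can be written out as small decoding functions. This is an illustrative sketch with hypothetical function names; real hardware does not call functions like these, it simply wires the ALU for one interpretation.

```c
#include <stdint.h>

/* Unsigned: every bit carries a positive weight. */
int as_unsigned(uint8_t bits) { return bits; }

/* Sign-magnitude: bit 7 is the sign, bits 6..0 are the magnitude. */
int as_sign_magnitude(uint8_t bits) {
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* Two's complement: bit 7 carries a weight of -128. */
int as_twos_complement(uint8_t bits) {
    return (bits & 0x7F) - ((bits & 0x80) ? 128 : 0);
}
```

For the bit pattern 10000001 (0x81) these return 129, -1, and -127 respectively.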

Floating Point Numbers

Numbers with a fractional part can be represented in binary using a "fixed point" format, where the location of the radix point (the binary equivalent of the decimal point) is fixed, or a "floating point" format, where the radix point can be anywhere relative to the significand (the significant digits). Similar to integers, the storage format must specify how to handle negative numbers.
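As an illustration of the fixed point idea, the sketch below assumes a 16-bit format with 8 integer bits and 8 fraction bits (often written Q8.8). The format choice, the type name q8_8, and the function names are assumptions made for this example, not part of any particular architecture.

```c
#include <stdint.h>

/* Hypothetical Q8.8 type: the stored integer is the real value times 256. */
typedef int16_t q8_8;

/* Convert to fixed point, truncating toward zero.
   The input must lie within roughly -128.0 .. 127.99. */
q8_8 q8_8_from_double(double x) { return (q8_8)(x * 256.0); }

/* Convert back to a double. */
double q8_8_to_double(q8_8 q) { return q / 256.0; }

/* Multiply two Q8.8 values: widen, multiply, then remove the extra scale factor. */
q8_8 q8_8_mul(q8_8 a, q8_8 b) {
    return (q8_8)(((int32_t)a * b) / 256);
}
```

The value 1.5 is stored as 384, and multiplying the fixed point forms of 1.5 and 2.0 gives 768, which converts back to 3.0. A value such as 0.1 is stored as 25 and comes back as 0.09765625, which shows the precision limit of a fixed radix point.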

A common example is the float, which is short for a "single precision floating point" number, stored using 32 bits.
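To see how the 32 bits of a float are divided up, the sketch below assumes the IEEE 754 single precision layout (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits), which is what most current hardware uses. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

struct float_fields {
    unsigned sign;       /* 1 bit: 0 for positive, 1 for negative */
    unsigned exponent;   /* 8 bits, stored with a bias of 127 */
    uint32_t fraction;   /* 23 bits of the significand after the leading 1 */
};

/* Split a float into its three fields, assuming IEEE 754 single precision. */
struct float_fields float_decompose(float f) {
    uint32_t bits;
    struct float_fields out;
    memcpy(&bits, &f, sizeof bits);   /* copy the raw 32 bits, no numeric conversion */
    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFF;
    out.fraction = bits & 0x7FFFFF;
    return out;
}
```

Under that assumption, 1.0f decomposes into sign 0, exponent 127, fraction 0, and -0.15625f (which is -1.01 binary times 2 to the power -3) decomposes into sign 1, exponent 124, fraction 0x200000.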

Computer System Main Components

The main components of a computer system are:

In smaller form factor computers, such as phones and tablets, many or all of the above components can be combined and fabricated into a single unit called a System on Chip, or SoC.

Inside the CPU

Processor architectures vary, but several components are common in all general purpose CPUs:

Interrupts and Exceptions

There is a great deal of inconsistency when using the terms interrupt, exception, trap, hardware interrupt, software interrupt, syscall, etc. Our goal here is to adopt some standard definitions to use in this course. The term 'trap' has historically been used to mean a software interrupt, but we'll avoid it since it seems to be the least well defined.

Interrupts are the fundamental way in which execution activity occurs in an OS. There are two distinct types:
Separate from interrupts, we have exceptions:
In all these cases, there is a transfer of control from the running process to some other code. In the case of both hardware and software interrupts, an interrupt service routine (ISR) is invoked. ISRs are kernel-mode code, supplied by the operating system (or perhaps a 3rd party device driver).

In the case of exceptions, the OS runtime system will look to see if there is a defined exception handler for the condition that occurred. Exception handlers can be supplied by the OS or by application code, and they may run in kernel mode of the processor or user mode.

Mode Switch vs. Context Switch

Context Switch. When privileged access is required, a context switch between the user program and the kernel must be performed. A context switch occurs when the user program execution is stopped, the current state is saved and offloaded from the processor, and the kernel is swapped in to complete the protected task. Once the operating system completes the request, the kernel will stage any results to be returned to the user process, and the kernel is swapped out in favor of the user process. Execution continues from that point. A context switch is performed by the operating system.

A context switch may occur, for example, due to a software interrupt, a page fault, or the OS swapping in a new user process to run on the CPU for a while. Context switches are highly optimized for performance; software that causes an excessive number of context switches can incur a tremendous performance penalty.
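One way to get a feel for how often a program is switched out is to ask the OS for its own accounting. The sketch below assumes Linux (or another system that fills in these fields) and uses getrusage(); note that these counters record the times the process gave up or lost the CPU to another task, which is only one of the situations described above. The function name is hypothetical.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Context switches charged to this process so far, or -1 on error.
   ru_nvcsw counts voluntary switches (for example, blocking on I/O);
   ru_nivcsw counts involuntary ones (for example, the time slice ran out). */
long context_switches(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_nvcsw + ru.ru_nivcsw;
}
```

Calling context_switches() before and after a section of code gives a rough count of the switches that section incurred.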

Mode Switch. The term mode switch is used to describe a change in the processor between its unprivileged mode or "ring" (ring 3 on Intel CPUs), and its privileged "ring" (ring 0 on Intel CPUs). In ring 0, all machine instructions and all regions of memory are accessible; in ring 3, only user memory and unprivileged machine instructions are available. A mode switch is performed by the CPU.

Library Calls

The POSIX standard specifies a number of libraries that must be made available to programs. These libraries each have a set of functions that are available for user programs to call, such as printf(). To use a specific library from a C program, we #include its header file at the top of the program, which declares the library's functions; the library code itself is linked to the program and loaded by the OS as needed. When we refer to a function call to invoke services provided by an OS library, we often just refer to it as a library call.

Some library functions perform a calculation or service that can be accomplished entirely in user mode, without a mode switch. Many other library functions provided by the OS are actually just 'wrappers' that, when called, validate their input and then in turn invoke one or more system calls.
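The two kinds of library function can be seen side by side in a few lines of C. In this sketch, strlen() does its work entirely in user mode, while write() is a library wrapper around the write system call; write_string is a hypothetical helper name.

```c
#include <string.h>
#include <unistd.h>

/* Write a NUL-terminated string to a file descriptor.
   Returns the number of bytes written, or -1 on error. */
ssize_t write_string(int fd, const char *s) {
    size_t len = strlen(s);     /* library call: pure computation, no mode switch */
    return write(fd, s, len);   /* library wrapper: invokes the write system call */
}
```

Calling write_string(1, "Hello, World!\n") prints the greeting on standard output and returns 14. Tracing such a program with strace would be expected to show the write but nothing for strlen().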

System Calls

A system call is an entry point for requesting OS services that require privileged access to the hardware. System calls are fundamental to the interface between the architecture and the OS, and are among the first things an OS designer must define.

Tracing Library and System Calls

One method for understanding how programs use system calls or library calls is to trace program execution using ltrace and strace. These two programs monitor execution and report the library calls or the system calls that are made, respectively.

Library Call Tracing

To begin, let's look at a simple program that prints "Hello, World!". The program passes a string literal to puts(), which writes the string, followed by a newline, to standard output. Since we are not doing any formatting, printf() is not required.
/* helloworld.c */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
  /* puts() appends its own newline after the string it is given */
  puts("Hello, World!\n");
  return 0;
}
What we would like to do is see the library calls as they are made during execution. We can normally do this with ltrace. Current versions of Linux will compile programs using gcc in a way that breaks ltrace (preventing it from intercepting library calls), unless we disable the generation of a Position Independent Executable, or PIE, with -no-pie. Doing that, we can see that, yes, indeed this program calls puts(), a library function:
$ gcc -no-pie helloworld.c -o helloworld
$ ltrace ./helloworld 
puts("Hello, World!\n"Hello, World!) = 15
+++ exited (status 0) +++

System Call Tracing

If we use the -S flag to ltrace, or the related utility strace, we get a somewhat different view of our program, showing the system calls it makes (output shortened a bit):
$ ltrace -S ./helloworld
SYS_brk(0)                                                                    = 0x1f73000
SYS_arch_prctl(0x3001, 0x7ffe8a486d90, 0x7f61a1ce6230, 0x7f61a1cefe13)        = -22
SYS_access("/etc/ld.so.preload", 04)                                          = -2
SYS_openat(0xffffff9c, 0x7f61a1cefb80, 0x80000, 0)                            = 3
SYS_fstat(3, 0x7ffe8a485f90)                                                  = 0
SYS_mmap(0, 0x135c7, 1, 2)                                                    = 0x7f61a1cb6000
SYS_close(3)                                                                  = 0
SYS_openat(0xffffff9c, 0x7f61a1cf9e10, 0x80000, 0)                            = 3
SYS_read(3, "\177ELF\002\001\001\003", 832)                                   = 832

...
                          = 0
SYS_mprotect(0x7f61a1cf7000, 4096, 1)                                         = 0
SYS_munmap(0x7f61a1cb6000, 79303)                                             = 0
puts("Hello, World!" <unfinished ...>
SYS_fstat(1, 0x7ffe8a486be0)                                                  = 0
SYS_brk(0)                                                                    = 0x1f73000
SYS_brk(0x1f94000)                                                            = 0x1f94000
SYS_write(1, "Hello, World!\n", 14Hello, World!
)                                           = 14
<... puts resumed> )                                                          = 14
SYS_exit_group(0 <no return ...>
+++ exited (status 0) +++
Notice there are a lot of SYS_* calls. What are these? These are actual system calls being executed. The one of interest to us is SYS_write(1, "Hello, World!\n", 14).

Conclusion. From this example, we conclude that:

Syscall Invocation

Invoking a System Call Indirectly, using a Library Function

The Unix/Linux API (unistd.h) provides a way to cause a particular system call to be invoked: syscall(). The syscall() interface is as follows:
           .--- System Call Number
           v
syscall(long number, ...)
                      ^
                      '---- Remaining Arguments to the system call
The first argument, the system call number, specifies which system call you would like invoked. Each system call has a unique number assigned to it, and the numbering depends on the machine architecture and the operating system. For example, on Linux for the x86_64 (64-bit) Intel architecture, write is system call number 1. Let's rewrite our program to use syscall() to write hello world:
#include <unistd.h>

int main(){
  char hello[] = "Hello, World!\n";
  syscall(1, 1, hello, 14);
  //  1: number of syscall
  //  1: for stdout
  //  14: number of bytes
}
Note that the arguments following the system call number match the arguments to the write() system call, which we learned from doing the ltrace above. Now we can run this program to see the output and do another trace of it (again using -no-pie to ensure ltrace sees the library calls):
$ gcc -no-pie helloworld.c -o helloworld

$ ltrace -S ./helloworld 

(...)
                                          = 0
syscall(1, 1, 0x7ffc49626499, 14 <unfinished ...>
SYS_write(1, "Hello, World!\n", 14Hello, World!
)                                           = 14
<... syscall resumed> )                                                       = 14
SYS_exit_group(0 <no return ...>
+++ exited (status 0) +++
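The literal 1 in the program above is correct only where write happens to be system call number 1. As a variation, the sketch below assumes Linux with glibc and uses the symbolic constant SYS_write from sys/syscall.h, which the headers define with the right number for the target architecture. The helper name write_via_syscall is hypothetical.

```c
#include <sys/syscall.h>   /* SYS_write and the other SYS_* numbers */
#include <unistd.h>        /* syscall() */

/* Invoke the write system call through syscall() using its symbolic number.
   Returns the number of bytes written, or -1 on error. */
long write_via_syscall(int fd, const void *buf, size_t count) {
    return syscall(SYS_write, fd, buf, count);
}
```

Calling write_via_syscall(1, "Hello, World!\n", 14) behaves like the syscall(1, 1, hello, 14) line in the program above on x86_64, and the same source should also build on architectures where the number differs.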
Conclusion. From this example, we conclude that:

Invoking a System Call Directly, using Assembly Language

The fact that there are library functions that will in turn invoke system calls for us is an abstraction, for simplicity and convenience. However, the actual mechanics of a system call, involving a mode switch to the kernel, are normally implemented by assembly-language routines that execute a special machine instruction on the CPU. To invoke a system call directly (with an actual switch to kernel code, that is, a mode switch to ring 0 on the CPU) and without any library function in between, we write our program not in C, but in assembly language. Each architecture defines the machine instruction for system call invocation. Intel architectures, for example, use int 0x80 (interrupt 0x80) or sysenter for the 32-bit binary interface, and syscall for the 64-bit binary interface. In the program below, the mov instructions are assignments: they place values in the registers that correspond to the arguments of syscall().
SECTION .data
  ;;char hello[] = "Hello, World!\n"
        hello   db "Hello, World!",0x0a 

SECTION .text
  global _start
  
_start:
  ;;syscall(1,1,hello,14) 
  mov rax,1
  mov rdi,1
  mov rsi,hello  
  mov rdx,14
  syscall

  ;;syscall(60,0); //exit with status 0
  mov rax,60
  mov rdi,0
  syscall
Compiling and running this assembly program looks a bit different, but the result is the same.
$ nasm -f elf64 hello.asm
$ ld hello.o -o hello
$ ./hello 
Hello, World!
If we run ltrace on this program to see the library calls, we get nothing! That's because we are no longer using a Unix/Linux userspace library at all -- we are now just using the OS system call interface in its purest form, directly to kernel code that communicates with the hardware architecture:
$ ltrace -S -n 3 ./hello > /dev/null 
Couldn't find .dynsym or .dynstr in "/proc/292/exe"
To see what system calls are executing, we can use strace. When we do that, we see that, yes, in fact we are still executing a write(). We also see the execve() system call, which is the system call that starts the execution from the command shell. This illustrates invocation of an OS service at the lowest level possible by an application program:
$ strace ./hello > /dev/null 
execve("./hello", ["./hello"], [/* 16 vars */]) = 0
write(1, "Hello, World!\n", 14)         = 14
exit(0)                                 = ?
+++ exited with 0 +++

Important Linux System Calls

Some important Unix/Linux system calls. (Tanenbaum, Modern Operating Systems)

Summary