OS Runtime Model and Executables



OS Runtime Features

Although there are many compiled languages, basic OS support for compiled executables begins with C. Many languages are derived from or layered upon C. Some important features that are supported in the basic language (and thus must be supported by an OS that supports a C-based runtime):

The following important features are absent from the native runtime support in a basic C-based system (i.e., OS + compiler), and therefore require libraries to implement:

In addition, the following useful features are not present in basic C, and can only be partially supported via a library (full-scale support would require using a different language):

The Tanenbaum text notes that, from a performance standpoint, garbage collection is problematic for the OS. Because a collection cycle could start at any time, it could slow performance at key points. Therefore, OS designers tend to use only explicit memory allocation and de-allocation for OS code. The tradeoff of not using built-in garbage collection inside the OS, of course, is the risk of memory leaks. That said, runtime environments and libraries that run on top of the core OS, like Python and Java, can implement garbage collection, and some attempts have been made to build it into a core OS design.
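As a minimal sketch of what explicit allocation and de-allocation looks like in C, the following pairs every malloc with a matching free. The helper names copy_string and release_string are hypothetical and chosen only for illustration; they are not part of any OS or library API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src, or NULL if
   the allocation fails. The caller owns the result and must release it. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;   /* include the terminating NUL */
    char *dst = malloc(len);        /* explicit allocation */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    return dst;
}

/* Hypothetical helper: frees the buffer and clears the caller's pointer,
   so a stale pointer is not reused by accident. */
void release_string(char **p)
{
    assert(p != NULL);
    free(*p);                       /* explicit de-allocation */
    *p = NULL;
}
```

If a caller forgets to call release_string, the buffer is never returned to the allocator. That is the memory-leak risk mentioned above, and nothing in the C runtime will detect it automatically.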

Program Dependencies and Linking

Header Files. The first pass of the C compiler is the preprocessor. It pulls in a header for each #include directive. The headers represent external dependencies -- library functions that need to be called by the main program. For an OS written in C, the header files have the .h file extension. These files contain declarations and definitions used by one or more code files. They can also include macros, constants, and conditional compilation, such as:
#define BUFFER_SIZE 4096
#define max(a,b) ((a) > (b) ? (a) : (b))

#ifdef PENTIUM
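
The preprocessor features mentioned above (constants, function-like macros, and conditional compilation) can be sketched together as follows. This is an illustrative fragment, not code from any particular OS: TARGET_64BIT, WORD_BYTES, and words_in_buffer are hypothetical names introduced here.

```c
#include <assert.h>

#define BUFFER_SIZE 4096

/* Each argument and the whole expansion are parenthesized so that the
   macro behaves like a function call regardless of operator precedence.
   MAX(6 & 3, 1) yields 2; without the inner parentheses the condition
   would parse as 6 & (3 > 1), which is 0, and the result would be 1. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Conditional compilation: the value depends on whether the
   (hypothetical) TARGET_64BIT symbol is defined at build time. */
#ifdef TARGET_64BIT
#define WORD_BYTES 8
#else
#define WORD_BYTES 4
#endif

/* Checked at compile time, so a bad configuration never builds. */
static_assert(BUFFER_SIZE % WORD_BYTES == 0,
              "buffer must hold a whole number of words");

/* Number of machine words that fit in the buffer. */
static int words_in_buffer(void)
{
    return BUFFER_SIZE / WORD_BYTES;
}
```

Note that macro arguments may still be evaluated more than once, so an argument with side effects (such as i++) remains unsafe even with full parenthesization.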
Makefiles. For large projects with many dependencies, it is inefficient to recompile everything every time any one module is modified. There may be hundreds of modules and millions of lines of code. The way to figure out what needs recompiling and what doesn't is to include a Makefile. The Makefile explicitly names each module's dependencies. When a project gets 'made', using the Unix make utility, only those modules that depend on something that changed get recompiled. For full details, see the definitive reference, the GNU Make Manual.
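
The decision make takes for each target can be modeled in a few lines of C. This is a simplified sketch of the timestamp rule only (rebuild when the target is missing or older than any of its dependencies); it is not how make itself is implemented. The function name needs_rebuild and the convention that a negative time means "file does not exist" are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the target must be rebuilt.
   target_mtime: modification time of the target, or negative if the
                 target does not exist yet (assumed convention).
   dep_mtimes:   modification times of the target's dependencies. */
bool needs_rebuild(long target_mtime, const long *dep_mtimes, size_t ndeps)
{
    assert(dep_mtimes != NULL || ndeps == 0);

    if (target_mtime < 0)
        return true;                    /* target missing: always build */

    for (size_t i = 0; i < ndeps; i++) {
        if (dep_mtimes[i] > target_mtime)
            return true;                /* a dependency is newer */
    }
    return false;                       /* target is up to date */
}
```

For example, an object file stamped 200 that depends on sources stamped 100 and 250 would be rebuilt, because the second source changed after the object file was produced.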

Object Files. The compiler produces object files (.o file extension), which are machine code but not fully-formed executables. In order to become full executables, they need to be linked.

Linking. Linking can be static or dynamic. In static linking, actual executable object modules from libraries are attached to a main executable. Then, cross-referenced addresses are resolved ('fixed up'). In dynamic linking, the actual library code is not attached to the main executable. Instead, pointers to the called library functions are inserted into a table in the executable. The address references are then resolved ('fixed up') during the load process, or just prior to execution.

Dynamic linking requires additional support from the OS. If a referenced library module is not in memory when it is referenced, the OS must block the program that called it, then load the needed library into memory. Most operating systems keep commonly needed dynamic-link libraries loaded and available all the time.
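
The table-based fix-up described under Linking can be modeled with a toy example. This sketch is not how a real dynamic loader works internally; it only illustrates the idea of a table entry that starts out unresolved and is filled in on first use. All names here (lib_double, lib_square, struct import, call_import, and so on) are hypothetical, and the "library" is simply an array in the same program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*func_ptr)(int);

/* Stand-ins for code that would live in a separate library. */
static int lib_double(int x) { return 2 * x; }
static int lib_square(int x) { return x * x; }

/* The "library" advertises its functions by name. */
struct export { const char *name; func_ptr addr; };
static const struct export library_exports[] = {
    { "lib_double", lib_double },
    { "lib_square", lib_square },
};

/* An entry in the executable's table of external references.
   addr stays NULL until the reference has been fixed up. */
struct import { const char *name; func_ptr addr; };

/* Search the library's exports for a symbol name. */
static func_ptr lookup_symbol(const char *name)
{
    size_t n = sizeof library_exports / sizeof library_exports[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(library_exports[i].name, name) == 0)
            return library_exports[i].addr;
    }
    return NULL;
}

/* Call through the table, resolving the address on first use. */
int call_import(struct import *imp, int arg)
{
    assert(imp != NULL);
    if (imp->addr == NULL)
        imp->addr = lookup_symbol(imp->name);   /* the fix-up step */
    assert(imp->addr != NULL);                  /* unresolved symbol */
    return imp->addr(arg);
}
```

The first call pays the cost of the lookup; later calls go straight through the stored pointer. Resolving every entry before the program starts would correspond to fixing up at load time instead.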

The basic model for constructing an executable in an OS based on the compiled C-code runtime model is illustrated in the following diagram:
The process of compiling C and header files to make an executable.

Executable Files

An OS must adopt a file format for binary executables. The diagram below illustrates the history of, and influences among, the major formats. These days, the principal formats are ELF (Unix/Linux) and PE32 (Windows) [now generally just called "PE"; the 64-bit variant is PE32+]. They share many similarities. The macOS format, called Mach-O, is also much the same in structure.
History and influence in common executable file formats.

Linux Executables. Executables may have a variety of file extensions, or none at all. Linux commonly uses no file extension, or any of: .axf, .bin, .elf, .o, .prx, .puff, .so. There are a lot of sections in an executable file. A good reference for the ELF format used by Linux is here.
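
Because the file extension is unreliable, tools identify an ELF file by its first bytes. The sketch below checks the four-byte ELF magic number (0x7f followed by 'E', 'L', 'F') and reads the class byte that follows it (1 for 32-bit, 2 for 64-bit). The function names is_elf and elf_word_size are hypothetical; the caller is assumed to have already read the start of the file into a buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if the buffer starts with the ELF magic number. */
int is_elf(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 4 &&
           buf[0] == 0x7f && buf[1] == 'E' &&
           buf[2] == 'L'  && buf[3] == 'F';
}

/* Returns 32 or 64 according to the ELF class byte (index 4),
   or 0 if the buffer is not ELF or the class is unrecognized. */
int elf_word_size(const unsigned char *buf, size_t len)
{
    if (!is_elf(buf, len) || len < 5)
        return 0;
    switch (buf[4]) {
    case 1:  return 32;     /* ELFCLASS32 */
    case 2:  return 64;     /* ELFCLASS64 */
    default: return 0;
    }
}
```

A Windows PE file, by contrast, begins with the bytes 'M' 'Z', so is_elf rejects it immediately.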

Here are some of the elements of the ELF format:
Tools. There are a number of tools for exploring the contents of an executable in detail:


The linker takes a collection of object modules and produces a load module. After linking, all internal address references are relative to the beginning of the new, combined load module (instead of relative to the original individual object modules). Any external references (such as a function call into a dynamic link library), not shown here, must also be recorded in a table, so they can be resolved at load time or run time.
Linking. Three object modules are on the left; one combined load module is on the right.
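
The address arithmetic behind this step can be sketched in C. This is a deliberately simplified model that assumes modules are laid end to end in order; real linkers also merge sections of the same type and respect alignment. The function names assign_bases and relocate are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Place n object modules one after another in the load module.
   sizes[i] is the size of module i; bases[i] receives its starting
   offset within the load module. Returns the total load module size. */
size_t assign_bases(const size_t *sizes, size_t *bases, size_t n)
{
    assert(sizes != NULL && bases != NULL);
    size_t next = 0;
    for (size_t i = 0; i < n; i++) {
        bases[i] = next;
        next += sizes[i];
    }
    return next;
}

/* Convert an address that was relative to the start of one object
   module into an address relative to the start of the load module. */
size_t relocate(const size_t *bases, size_t module, size_t offset)
{
    assert(bases != NULL);
    return bases[module] + offset;
}
```

With three modules of sizes 100, 250, and 50, the bases come out as 0, 100, and 350, so a reference to offset 20 in the second module becomes address 120 in the combined load module.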

Static Linking

With static linking, all object code is assembled into a single load module. If functions from a library such as libc (which implements the functions declared in headers like stdio.h and unistd.h) are to be called, the full executable code for those functions gets inserted into the main binary, eliminating any runtime (dynamic) external dependencies.

Dynamic Linking

With dynamic linking, external object modules are referenced, but not included in the main executable. Dynamic linking can actually be done at load time or even deferred until run time.
Linking and loading.

Static vs. Dynamic Linking

File Size. Statically linked binary executables have all needed external library functions linked right into the main binary. As a result, the main binary is larger in size, but it is more portable, meaning it does not rely on the presence within the OS of certain specific library functions (because it brings its own code).

Portability. Dynamically linked binary executable files are smaller in file size, but may in certain cases not be as easily ported to other systems, if those systems don't have the requisite dynamic-link libraries available. Generally speaking, this isn't a big concern, but in some cases static linking improves portability.

Copies in Memory. One huge advantage of dynamic linking is that (assuming the library functions are re-entrant) there only needs to be one unique copy of each library in memory for the whole OS, as opposed to one copy per application program (as with static linking). This can be achieved through the mapping of virtual memory pages.
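
A rough back-of-the-envelope model makes the difference concrete. The sketch below assumes that every application uses the whole library, that the shared library is entirely read-only code, and that each application's own code has the same size; real systems also keep some per-process library data, so treat the numbers as an upper bound on the savings. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Memory used when each of napps applications carries its own
   statically linked copy of the library. */
size_t footprint_static(size_t app_bytes, size_t lib_bytes, size_t napps)
{
    return napps * (app_bytes + lib_bytes);
}

/* Memory used when all applications map one shared copy. */
size_t footprint_shared(size_t app_bytes, size_t lib_bytes, size_t napps)
{
    assert(napps > 0);   /* the shared copy is loaded only if something uses it */
    return napps * app_bytes + lib_bytes;
}
```

For ten applications of 1000 bytes each using a 2000-byte library, the static total is 30000 bytes while the shared total is 12000 bytes under these assumptions.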

Security. This issue is less clear cut. If a user application is statically linked, and a vulnerability in a library is identified, the application has to be re-compiled against the updated library and sent back out to all users. Under dynamic linking, if a vulnerable library is identified, the users just need to update their systems to patch the library -- the applications depending on it do not need to be re-compiled and re-distributed.
HOWEVER -- dynamic linking opens up applications to some nefarious attacks, some of which we will discuss later. In particular, if an attacker can corrupt a dynamically linked library on a machine, that single exploit can immediately infect multiple unsuspecting executables.

Linux Linking and Loading

Linux uses the following conventions:

The following commands are useful for dealing with libraries in Linux: