IPC

IPC and Synchronization

Reading

Intro

Ever since computers have had the ability to run multiple processes, there has been a desire for those processes to communicate somehow. Communication and synchronization are closely related challenges, and some of the most common and important problems in computer applications involve them. Importantly, the same issues arise whether processes are sharing resources (which usually must be done explicitly) or threads are sharing resources (which they normally do by default).

Inter-Process Communication

The term IPC stands for inter-process communication, but it refers not only to communication but to synchronization as well. Processes may need to communicate for many reasons: to pass basic information, to coordinate their actions, or to operate on shared resources and data. Basic information passing, as via signals and messages, does not ordinarily lead to problems, but operating on shared resources and data certainly does.

Terms

We begin the discussion with a few important definitions:

Race Conditions

A race condition exists when the outcome of a computation depends on the relative order or timing of execution of concurrent processes or threads. When a race condition exists, program correctness is likely to be violated.

Example 1. Code. Consider the following program. There is a main program and two threads. Each thread appends a different string to the end of a shared input string. A randomly generated number determines which thread runs first. Although this may seem artificial, it reflects reality: absent explicit ordering, an application process has no way to know the relative order of execution of its threads, or to be assured that the order will always be the same. Here, the random number generation simulates the non-deterministic nature of CPU scheduling by the OS.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <pthread.h>
#include <time.h>     // for time(), used to seed rand()

#define MAXLENGTH 32

void * gonavy(void * args){

  char * str = (char *) args;
  char navystring[] = "gonavy! ";
  strcat(str,navystring);
  return NULL;
}

void * beatarmy(void * args){

  char * str = (char *) args;
  char armystring[] = "beatarmy! ";
  strcat(str,armystring);
  return NULL;
}

int main(int argc, char * argv[]){

  char hello[MAXLENGTH] = "Hello World! ";
  pthread_t thread1, thread2; 
  srand((unsigned int)time(NULL));
  float r = (float)rand() / (float)(RAND_MAX);
  printf("r = %f\n",r);

  if(r > .5) {
    pthread_create(&thread1, NULL, gonavy, hello);
    usleep(10000);   // sleep() takes whole seconds; usleep() gives 10 ms
    pthread_create(&thread2, NULL, beatarmy, hello);
  }
  else {
    pthread_create(&thread2, NULL, beatarmy, hello);
    usleep(10000);
    pthread_create(&thread1, NULL, gonavy, hello);
  }

  pthread_join(thread1, NULL);   // wait for both threads to finish
  pthread_join(thread2, NULL);
  printf("%s\n", hello);
  return 0;
}

If we run the program repeatedly, the output changes, depending on the order of thread execution. Therefore, a race condition exists in this example.


Example 2. Consider two example sections of code, called T1 and T2, which might be threads that operate on the shared variables a and b.
// T1
a = a + 1;
b = b + 1;
// T2
b = b * 2;
a = a * 2;

If either T1 or T2 runs in isolation, the property a==b is preserved, as in the following examples:
// T1 then T2 (a==b==1 initially)
a = a + 1;
b = b + 1;
b = b * 2;
a = a * 2;  // a==b==4
// T2 then T1 (a==b==1 initially)
b = b * 2;
a = a * 2;
a = a + 1;
b = b + 1;  // a==b==3

However, suppose the code execution is somehow interleaved between T1 and T2. If initially a==1 and b==1, the following interleaved execution could occur:
a = a + 1;  // T1 (a==b==1)
b = b * 2;  // T2
b = b + 1;  // T1
a = a * 2;  // T2 (a==4, b==3)
At the end of the interleaved execution, a==4 and b==3, so a==b is no longer true. We observe that the value of the output depends on the order of execution, not just on the value of the input data. It might be very rare that T1 and T2 ever execute in such a way, but when they do, the program no longer executes the same. This example involves integer variables, but can be applied in principle to any shared data or object.


Security Vulnerabilities. These examples are innocuous, but a race condition can also lead to a security vulnerability. Suppose the following situation exists somewhere in an application:
# Elevate privileges
# (possible race condition here)
# Begin execution of privileged operation
Situations like this can, and actually do, occur in real code. An attacker who identifies such a situation may be able to exploit it to cause his own code to execute with elevated privileges.

Critical Sections

In the previous example, the problem could have been prevented if we could guarantee that, when either T1 or T2 runs, it maintains exclusive access to the shared objects (a and b). This is the idea behind the concept of a critical section. When a process (or thread) is executing in its critical section, no other process (or thread) may execute in its corresponding critical section. Enforcing this requirement on critical sections is part of ensuring mutual exclusion, as further explained below.
Mutual exclusion of two processes using critical sections.

Mutual Exclusion

The basic requirement for critical sections described above is necessary but not sufficient for providing mutual exclusion. In particular, there are four conditions that must be true for mutual exclusion to be fully successful:

1. No two processes may be simultaneously inside their critical sections.
2. No assumptions may be made about the speed or number of CPUs.
3. No process running outside its critical section may block other processes.
4. No process should have to wait forever to enter its critical section.

Hardware-Based Approaches to Mutual Exclusion

In this section we briefly review a few hardware-based approaches to the problem of mutual exclusion.

Disabling Interrupts

This is possibly the simplest approach. Disable all interrupts before a process/thread enters its critical section, then re-enable them after the process/thread exits its critical section. On the good side, this approach does actually enforce mutual exclusion. The running process can't be interrupted in its critical section, even by the OS!
while (true) 
{
  /* disable interrupts */
  /* critical section */
  /* enable interrupts */

  /* remainder */
}
However, there are two important limitations. First, this only works on a single-CPU system: disabling interrupts on one CPU does not prevent processes running on other CPUs from entering their critical sections. Second, it is unwise to give user processes the power to turn off interrupts; if one never turned them back on, the system would halt. Besides disabling interrupts, there are other hardware-based mutual exclusion strategies. In each case, there are one or more hardware instructions that, when properly used, can implement mutual exclusion correctly.

Hardware Instructions -- Exchange, Test-and-Set

This approach is based on the use of a special hardware instruction provided by a given hardware architecture, typically one that can perform multiple actions atomically (indivisibly). One example is an "exchange" instruction, which swaps the values of two variables.

In the following example assembly code, the XCHG instruction is used to solve the critical section problem. Note that REGISTER is like a local variable that can only be accessed by the executing process, while LOCK is shared. Here, LOCK == 0 implies no process is in the critical section, while LOCK == 1 implies that the critical section is occupied.
enter_critical:
  MOV  REGISTER, 1
  XCHG REGISTER, LOCK  ; swap contents of REGISTER and LOCK
  CMP  REGISTER, 0     ; was LOCK 0, before the swap?
  JNE  enter_critical  ; no: busy loop (LOCK was 1)
  RET                  ; enter critical_section
  
leave_critical: 
  MOV  LOCK, 0         ; open the lock
  RET                  ; leave critical section
And here is a C-style implementation, where exchange() atomically swaps its two arguments:
int key;
while (true) {
  key = 1;                       /* reset key before each attempt */
  do { exchange(&key, &lock); } while (key != 0);
  /* critical section */
  lock = 0;
  /* remainder */
}
An illustration of locks using an Exchange/Swap instruction.
The 'lock' variable must be in a memory area shared by the threads.
The local 'key' variables are thread-specific, and may be in a CPU register or thread-local memory.
More than two threads can be synchronized in this way.
As long as the lock is locked (i.e., the lock variable holds a 1), the action of swapping has no effect -- it continually swaps a 1 with another 1. Only when the lock variable is 0 does the swap have an effect: the lock is quickly reset to 1, and the caller's key receives a 0 value, indicating it has successfully gained access. The calling process knows it is the only one gaining access, because the exchange is atomic. This is generally achieved by briefly locking the memory bus in hardware.

The Intel x86/x64 architecture implements a version of Exchange. It is described in the architectural manual as follows:
Intel XCHG machine instruction for x86/x64. (source: Intel)
There are many other similar instructions. One mentioned in the Tanenbaum text, TSL (Test and Set Lock), functions almost exactly the same as the Exchange example.

Hardware-based solutions can implement mutual exclusion correctly, but they rely on busy waiting. They also fail if processes do not cooperate, and both deadlock and starvation are still possible.

Early Software-Based Approaches to Mutual Exclusion

There are two software-based approaches to mutual exclusion that involve busy waiting. While one process is in its critical section, other processes actively wait, repeatedly checking for their own opportunity to proceed.

Incorrect Attempt to use Lock Variables

In this approach, there is a shared 'lock' variable. If its value is 0, no process is in its critical region. If the value is 1, a process is in its critical region and all others must wait. A process needing to enter its critical section repeatedly checks the value, while the value is 1. As soon as the lock value is 0 again, the process enters its critical section and sets the lock value to 1.
while (true) {
  while (lock == 1)
    /* busy wait: do nothing until lock==0 */ ;
    
      // <- race condition here
  lock = 1;    
  /* critical section */
  lock = 0;
  /* remainder */
}
The problem with this approach is that it relies on a race condition. If two processes read the lock value as 0 and both proceed into their critical sections before the lock value can be set to 1, then mutual exclusion is not enforced successfully.

Strict Alternation

The integer variable turn keeps track of whose turn it is to enter the critical section. Each process will busy wait until its own turn.
while (true) {
  while (turn != 0)
    /* busy wait */ ;
      
  /* critical section */
  turn = 1;
  /* remainder */
}
while (true) {
  while (turn != 1)
    /* busy wait */ ;

  /* critical section */
  turn = 0;
  /* remainder */
}


The problem with this approach is that it requires strict alternation (neither process can enter its critical section two or more times in a row). If one process is significantly faster, it ends up spending a lot of time busy waiting for the slower process to take its turn. Also, it becomes complex to scale beyond two processes. The approach provides mutual exclusion, but is very inefficient.

Peterson's Solution

In 1981, G.L. Peterson devised a software-based solution to mutual exclusion that does not require strict alternation. There are a few variations of how to code Peterson's Solution; the one shown is from the Tanenbaum text.

There are two function calls, one each for entering and leaving a process' critical region. Each process must call enter_region with its own process number as an argument. This will cause it to wait, if necessary, until it is safe to enter. After accessing the shared variables, it calls leave_region to exit the critical region.
#define FALSE 0
#define TRUE 1
#define N 2

int turn;           // Whose turn is it?
int interested[N];  // All initially 0

void enter_region(int process)
{
  int other;            
  other = 1 - process;           // Number of the other process  
  interested[process] = TRUE;    // Show interest
  turn = process;                // Set flag
  // If both processes call enter_region() at the same time, the
  // next statement forces the latter, who writes to turn _last_,
  // to busy wait until the earlier process exits.
  while(turn == process && interested[other] == TRUE); // Busy wait
}

void leave_region(int process)
{
  interested[process] = FALSE;  // Indicate departure from critical region
}
If both processes call enter_region simultaneously, they will each try to modify the variable turn. The value of turn is overwritten by the second process, but by doing so it causes the second process to hang on the while loop and busy wait, until the earlier process finishes and leaves its critical region.

Peterson's solution works correctly if implemented properly, allowing mutual exclusion by two cooperating processes. It can also be scaled to work with more than two processes. However, Peterson's solution employs busy waiting, an inefficient use of the CPU.

An important aside: Message Passing

Another limitation of the approaches discussed above (and also true of the improved "semaphores" that we'll discuss soon) is that they require efficient areas of "shared memory" in order to synchronize (and to share data). On distributed machines, such shared memory may not be easily available. Instead, an alternative is to provide primitives for sending and receiving messages, with some synchronization built in. This is the philosophy behind message passing, which is implemented by all operating systems in some form.

Message passing is based on the send() and receive() primitives, which could just be library functions that are wrappers for system calls:
send(destination, &message);
receive(source, &message);
Design Issues. There are a number of design issues that need to be addressed when implementing a message passing system, including how messages are addressed, how lost messages are handled, and how (or whether) messages are buffered. A buffer that holds messages is called a mailbox. A process may simultaneously have ownership of a number of mailboxes. Mailboxes may be exclusively held by one process or thread, or shared by multiple senders and receivers, as shown in the following diagrams:
Mailboxes: possible combinations of senders and receivers. (Stallings)
If no buffering is used at all, a sender must block on send() until the message is received. Similarly, if a receive() action happens first, the receiver must block until a message is sent. When both sender and receiver block, the arrangement is called a rendezvous.

As noted by the Tanenbaum text, message passing is commonly used in parallel programming systems. A well-known example is MPI (Message Passing Interface). One implementation is Open MPI, which high-performance computing researchers use for fast, parallel computation tasks like modeling and machine learning.

Summary / Key Points

We have discussed a number of approaches to mutual exclusion and the challenges of each. The following table is a summary:
Approach                 Type      Correct?  Limitations
Disabling Interrupts     Hardware  Yes       Single CPU only; could crash OS
Exchange/Test-and-Set    Hardware  Yes       Busy waiting
Lock Variables           Software  No        Race condition
Strict Alternation       Software  Yes       Busy waiting; doesn't scale
Peterson's Solution      Software  Yes       Busy waiting
It was not until the introduction of semaphores that we had an efficient and correct solution to the problem of mutual exclusion. We examine semaphores in the next section.