Class 15: Concurrency II

Reading: These notes.
Homework: Printout the Homework and answer the questions on that paper.

Implementing a mutex

Last class we used pthread's mutex's to protect critical sections in a simple concurrent program (multithreaded, in our case). It's important to look at how something like pthread_mutex's mutual exclusion can be implemented. How can we guarantee:

mutual exlusion (mutex) - that only one process/thread can be in a critical section with respect to the shared resource at the same time,
progress - that no process/thread that is not in a critical section with respect to the shared resource can prevent other processes/threads from entering their critical sections, and
bounded wait - that any process/thread that wants to get into a critical section with respect to the shared resource eventually will enter that critical section.

A solution to mutual exclusion can rely on hardware support, or on the support of the kernel, or neither --- i.e. simply be regular old user-space software. We'll look at software solutions in some detail, briefly at hardware support, and finally at kernel-supported methods.

Failed software approach 1: lock variables

Our first approach will be straightforward. We'll simply mimic pthread mutex's lock/unlock operations. Suppose two threads P0 and P1 have a shared variable lock. They might try to use it to implement mutual exclusion following this protocol:

shared variable lock initialized to 0
lock must be 0 to enter critical section
lock must set to 1 when process enters critical section
lock set to 0 when process leaves critical section

int lock = 0; // Shared variable!

P0	P1
1 while(1) { 2 while(lock == 1) 3 ; 4 lock = 1; 5 critical section here 6 lock = 0; }	7 while(1) { 8 while(lock == 1) 9 ; 10 lock = 1; 11 critical section here 12 lock = 0; }

This doesn't work: Mutual exlusion fails! Why? What if both P0 and P1 execute lines 2 / 8, which test lock == 1, before either gets to lines 4 / 10, which set lock = 1? Then both are in their critical section. This approach fails because testing lock and setting lock are not atomic, i.e. they do not happen together as a single step.

Failed software approach 2: Alternation, i.e. take turns

Suppose two threads P0 and P1 have a shared variable turn. They might try to use it to implement mutual exclusion following this protocol:

shared variable lock initialized to 0
may only enter critical section if it's your "turn", i.e. turn must be 0 for P0 to enter critical section, turn must be 1 for P1 to enter critical section
yield "turn" when you leave critical section, i.e. P0 sets turn to 1, P1 sets turn to 0

int turn = 0; // Shared variable!

1 while(1)
  {
2   while(turn != 0)
3     ;
4   critical section here
5   turn = 1;
6   remainder (non-critical section code)
  }

7 while(1)
  {
8   while(turn != 1)
9     ;
10  critical section here
11  turn = 0
12  remainder (non-critical section code)
  }

Mutual exlusion holds for this, though it takes walking through several scenarios to convince yourself. However, progress is not satisfied. Why? Suppose P0 has a very long remainder, and P1 has a very short remainder. Say P0 goes through its critical section and starts its remainder. P1 goes through its critical section, sets turn to 0, goes through its short remainder section, and is back waiting to get into its critical section, while P0 is still going through its long remainder section. So, P1 is kept from getting into its critical section by P0, even though P0 isn't in its critical section. The progress requirement isn't satisfied. (Think of what a performance problem this would be!)

Failed software approach 3: Defer (the polite algorithm!)

Suppose two threads P0 and P1 have a shared array int flag[2]. They might try to use it to implement mutual exclusion following this protocol:

shared array initialized to 0's
when Pi wants to enter critical section, it sets flag[i] = 1
when Pi leaves critical seciton it sets flag[i] = 0
P1 can only enter its critical section when flag[0] == 0, and P0 can only enter when flag[1] == 0. I.e. each defers to the other.

int flag[2] = {0,0}; // Shared variable!

1 while(1)
  {
2   flag[0] == 1;
3   while(flag[1] == 1)
4     ;
5   critical section here
6   flag[0] = 0;
7   remainder (non-critical section code)
  }

8 while(1)
  {
9   flag[1] == 1;
10  while(flag[0] == 1)
11    ;
12  critical section here
13  flag[1] = 0;
14  remainder (non-critical section code)
  }

Mutual exlusion holds for this, though it takes walking through several scenarios to convince yourself. However, both progress and bounded wait are not satisfied. Why? Suppose P0 executes line 2 and P1 executes line 9 before either P0's line 3 or P1's line 10 get executed? Then both flag[0] and flag[1] are set to 1, and both processes will sit in their inner while loops forever. This is known as deadlock. No process/thread can move forward.

Sucessfull software approach: Peterson's Algorithm

If we combine the alternation and defer approaches we actually get a software solution that works: Peterson's algorithm.

int flag[2] = {0,0}; // Shared variable!
int turn = 1;        // Shared variable!

while(1)
{
  flag[0] = 1;                 // show I want to enter
  turn = 1;                    // but yield to other
  while(flag[1] && turn == 1)  // wait for other to leave
    ;
  critical section here  
  flag[0] = 0;                 // indicate I've left
  remainder (non-critical section code)
}

while(1)
{
  flag[1] = 1;                 // show I want to enter
  turn = 0;                    // but yield to other
  while(flag[0] && turn == 0)  // wait for other to leave
    ;
  critical section here  
  flag[1] = 0;                 // indicate I've left
  remainder (non-critical section code)
}

The above algorithm can be proven to solve the mutual exclusion problem for two threads/processes. moreover, it can be generalized to an arbitrary number of processes.

Hardware solutions

Our original lock-variable attempt at mutual exclusion failed because testing lock == 1 and setting

lock
	  = 1

are separate operations: not atomic. Well ... what if the underlying hardware provided a single test-and-set machine instruction? Then our approach would work. It might look like this:

1 while(1)
  {
2   while(test_and_set(lock))
3     ;
4   critical section here
5   lock = 0;
  }

Alternatively, since (on a single machine) mutual exclusion problems occur from threads/processes getting interrupted, what if we could disable interrupts? Then the following would work:

 while(1)
 {
 START:
   disable_interrupts();
   if (lock == 0)
   {
     lock == 1;
     enable_interrupts();
   }
   else
   {
     enable_interrupts();
     goto START:
   }
   critical section here
   lock = 0;
 }

Now, hopefully you see that allowing user-processes to disable interrupts is asking for real trouble!\ I'll say nothing further about it!

Spin-locks and the case for the kernel

Now, both peterson's algorithm and the hardware solution using test-and-set have one seriously undesireable property: they both rely on spin locks. A spin lock is a trival loop that gets executed over and over and over again until some condition is satisfied and the real computation can continue. The problem with spin-locks is that they eat up CPU cycles without doing any useful work. It's much better to have the thread/process sleep until the desired condition is satisfied, and then have someone/something wake it up. Following this approach, its natural to ask the kernel to take care of mutual exclusion for us. If our locking mechanism is actually a kernel resource, then the kernel put a process/thread asleep if it tries to aquire the lock and the lock is unavailable. The kernel can react to a lock becoming available by choosing one of the threads/processes that are waiting on the lock, waking it up and granting it the lock. After all, the kernels job is to put processes to sleep, and choose one amongst the sleeping processes to wake up. [ NOTE: scheduling of threads within a process may be managed by a separate runtime system that's not part of the kernel, in which case that runtime system to managea the locking mechanisms, not the kernel. ] How does the kernal manage mutual exclusion? Well, we can trust the kernel to disable interrupts, so it can use that method, for instance. The important thing to take away from this is that we as user-level programmers can take care of mutual exclusion with spin-locks and peterson's algorithm, but that's wasteful. It's far more efficient from a system-wide perspective to have processes/threads sleep until the lock they wish to aquire becomes available.

So, the moral of the story is: the kernel / threads-run-time-system implements mutex's for us, so that processes/threads awaiting a lock are suspended until the lock is available, rather that using up CPU cycles in spin locks.