SI335: Sorting Revisited

Reading

These notes and Unit 7 section 7.

Quiz questions

P4 from Week 14 quiz questions (done in clas for those that were there, otherwise do at home)

Partition

Our only divide and conquer sorting algorithm so far is mergesort. When we try mergesort with arrays we find that it has the undesireable property of requiring us to allocate additional memory. You see, if the "divide" step simply uses two halves of the initial array as its smaller subproblems, when we go to merge we won't have anyplace to put results as we merge ... the sublists we're mergeing will be in the way.

So we thought about ways we might do the "divide" step of a divide and conquer sorting method differently so as to be able to sort "in place", i.e. without having to allocate additional memory. We came to the conclusion that a "divide" step that moved all the small values to the left side of the array and all the large values to the right would be perfect, since when the "conquer" step was over, i.e. when the two halves had been sorted, there would be no "merge" step - the array would already be sorted at that point.

The parition algorithm takes an array A, and range i..j in that array, and a split value s, and moves around elements of A in i..j so that everything less than s is on the left side, and everything >= s is on the right side. Additionally, it needs to tell us where the boundary between the "left" and right "partitions" is, so it returns an index k, which is the index of the last element in the left-hand partition.

Algorithm: partition(A,i,j,s)
Inputs: A, an array,
        i ≤ j, integers defining a range A[i],..,A[j] in A
        s, a value from A[i],...,A[j]
Side Effects: The elements of A[i],...,A[j] may be reordered
Output: k such that A[i],...,A[k] all < s, A[k+1],...,A[j] all ≥ s

k ← partition(A,i,j,s)
{
  while(true)
  {
    while (j ≥ i && A[j] ≥ s)
      j--;
    while (A[i] < s)
      i++;
    if (i < j)
      swap(A[i],A[j]);
    else
      return j;
  }
}

Without thinking too hard, you should see that this is a Θ(n) algorithm, where n is the size of the array range i..j. At least if we assume that swapping and comparing elements of A takes constant time.

Quicksort

Using partition, we get a new divide an conquer sorting algorithm called quicksort. The idea is pretty simple: use partition to split the array into two ranges, recursively sort each range and then you're done. The only problem is this: what should we use for a split value? The mose basic quicksort algorithm simply uses the first element in the range as a split value, i.e. A[i]. The idea is that if the input is random, the first element is just as likely to be a good split value as any other.

So quicksort is

quicksort(A,i,j)

   1) call partition on A and the range i..j using the first element of the range as the split value
      Important: Deal with empty partition as per class discussion!
   2) recursively sort the left range
   3) recursively sort the right range

In the best case, our split value will always partition the array into two equal sized pieces. This gives a recurrence relation of

T(n) = 2 T(n/2) + Θ(n)

where n is the size of the range we're sorting. We've seen before that this is T_B(n) = Θ(n lg n). In the worst case, our split value will always partition the array into pieces of size 1 and n-1. This gives a recurrence relation of

T(n) = T(n-1) + Θ(n)

where n is the size of the range we're sorting. As we discussed in class, we simply apply the recurrence over and over until we see a pattern and deduce that T_W(n) = Θ(n^2).