Algorithm: sort(A,i,j) if i == j return m = (i + j)/2 ← divide r1 = sort(A,i,m) ← conquer r2 = sort(A,m+1,j) ????? ← combineSo the question is, how do we efficiently combine these two sorted subarrays to form the sorted version of the original array? The answer: I wish I had an algorithm "merge" that would do that for me!
Algorithm: merge(A,i,m,j)
Inputs : array A, and indices i ≤ m ≤ j such that
A[i..m] and A[(m+1)..j] are idividually in sorted order
Outputs: array A is modified so that A[i..j] contains its original
elements in sorted order
A1 = new array copy of A[i..m]
A2 = new array copy of A[(m+1)..j]
k = i, i1 = 0, j1 = m - i, i2 = 0, j2 = j - (m+1)
while i1 ≤ j1 and i2 ≤ j2 do ← keep adding the smaller of the initial elements of A1 & A2
if A2[i2] < A2[i1]
A[k] = A2[i2], i2 = i2 + 1, k = k + 1
else
A[k] = A1[i1], i1 = i1 + 1, k = k + 1
while i1 ≤ j1 do ← add any remaining elements of A1
A[k] = A1[i1], i1 = i1 + 1, k = k + 1
while i2 ≤ j2 do ← add any remaining elements of A2
A[k] = A2[i1], i1 = i1 + 1, k = k + 1
free A1 and A2
This version of merge should be fairly easy to understand, so I
won't worry about proving correctness. What about its running
time? Best and worst case running times are both $\Theta(n)$,
which we derived in class (and is done in the unit notes).
Algorithm: msort(A,i,j) if i == j return m = (i + j)/2 ← divide r1 = msort(A,i,m) ← conquer r2 = msort(A,m+1,j) merge(A,i,m,j) ← combineIn class we made a start on analyzing mergesort. If we use $T(n)$ to denote mergesort's worst-case running time as a function of the input array size, we end up with the following:
$ T(n) \leq c n + T\left( \lceil n/2 \rceil \right) + T\left( \lfloor n/2 \rfloor \right) $
where the floor and ceiling are required because with an odd-sized range, we cannot divide evenly in half. This is a bit difficult to analyze, so we said instead that we would try to examine a simplified version. We expaned the simpler version out and looked for patterns, and arrived at the hypothesis that $T(n) \in O(n \lg n)$. Next class: prove this hypothess!