## Reading

These notes and
closely review

Unit 2 Section 2.

## Review quiz questions

Both reviews were quite important!
## Review: Applying Divide & Conquer to sorting

Next we decided to try applying Divide and Conquer to the
problem of sorting. This is where we got to:

**Algorithm: **sort(A,i,j)
if i == j
return
m = (i + j)/2 ← divide
r1 = sort(A,i,m) ← conquer
r2 = sort(A,m+1,j)
????? ← combine

So the question is, how do we efficiently combine these two
sorted subarrays to form the sorted version of the original array?
The answer: I wish I had an algorithm "merge" that would do that
for me!

## Merge

Ask and you shall receive, here is an algorithm for merging two
consecutive sorted subranges.

**Algorithm:** merge(A,i,m,j)
Inputs : array A, and indices i ≤ m ≤ j such that
A[i..m] and A[(m+1)..j] are idividually in sorted order
Outputs: array A is modified so that A[i..j] contains its original
elements in sorted order
A1 = new array copy of A[i..m]
A2 = new array copy of A[(m+1)..j]
k = i, i1 = 0, j1 = m - i, i2 = 0, j2 = j - (m+1)
while i1 ≤ j1 and i2 ≤ j2 do ← keep adding the smaller of the initial elements of A1 & A2
if A2[i2] < A2[i1]
A[k] = A2[i2], i2 = i2 + 1, k = k + 1
else
A[k] = A1[i1], i1 = i1 + 1, k = k + 1
while i1 ≤ j1 do ← add any remaining elements of A1
A[k] = A1[i1], i1 = i1 + 1, k = k + 1
while i2 ≤ j2 do ← add any remaining elements of A2
A[k] = A2[i1], i1 = i1 + 1, k = k + 1
free A1 and A2

This version of merge should be fairly easy to understand, so I
won't worry about proving correctness. What about its running
time? Best and worst case running times are both $\Theta(n)$,
which we derived in class (and is done in the unit notes).

## Mergesort completed ... and a start on analysis

So here's mergesort, a divide and conquer sorting algorithm.

**Algorithm: **msort(A,i,j)
if i == j
return
m = (i + j)/2 ← divide
r1 = msort(A,i,m) ← conquer
r2 = msort(A,m+1,j)
merge(A,i,m,j) ← combine

In class we made a start on analyzing mergesort. If we use
$T(n)$ to denote mergesort's worst-case running time as a
function of the input array size, we end up with the following:

$
T(n) \leq c n + T\left( \lceil n/2 \rceil \right) + T\left( \lfloor n/2 \rfloor \right)
$

where the floor and ceiling are required because with an
odd-sized range, we cannot divide evenly in half. This is a bit
difficult to analyze, so we said instead that we would try to
examine a simplified version. We expaned the simpler version
out and looked for patterns, and arrived at the hypothesis that
$T(n) \in O(n \lg n)$. Next class: prove this hypothess!