Reading

Read these notes, and closely review Unit 2 Section 2 and Section 3.

Quiz questions

Review quiz questions

We reviewed P1 from the week04 quiz questions, which was a nice look at proving a particular loop invariant.

Analyzing MergeSort

Recall that in the previous class we decided that the worst-case running time for MergeSort is given by the following recurrence relation:

$ T(n) \leq c n + T\left( \lceil n/2 \rceil \right) + T\left( \lfloor n/2 \rfloor \right) $

We also showed that a simplified version of this recurrence is $O(n \lg n)$, which we took as a hypothesis about the real recurrence. In this class, through a long, exhaustive derivation, we verified that hypothesis; i.e., we showed that MergeSort's worst-case running time is $O(n \lg n)$.
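As a numerical sanity check on that bound (not a substitute for the derivation), the recurrence can be evaluated directly. The sketch below makes assumptions that are not in the notes: it takes the recurrence with equality, sets $c = 1$, and uses the base case $T(1) = 0$. The names `T`, `C`, `ceil_lg`, and `bound_holds` are illustrative only.

```python
from functools import lru_cache

C = 1  # assumed value of the constant c in the c*n term


@lru_cache(maxsize=None)
def T(n):
    """The recurrence taken with equality; T(1) = 0 is an assumed base case."""
    if n <= 1:
        return 0
    # c*n + T(ceil(n/2)) + T(floor(n/2))
    return C * n + T((n + 1) // 2) + T(n // 2)


def ceil_lg(n):
    """ceil(lg n) for n >= 1, computed with exact integer arithmetic."""
    return (n - 1).bit_length()


# Check T(n) <= c * n * ceil(lg n) over a range of input sizes.
bound_holds = all(T(n) <= C * n * ceil_lg(n) for n in range(1, 2049))
print(bound_holds)
```

Under these assumptions the check succeeds for every $n$ tested, which is consistent with the $O(n \lg n)$ result, since $n \lceil \lg n \rceil \leq 2 n \lg n$ for $n \geq 2$.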

Normally, our next step would be to determine a lower bound on the worst-case running time. However, I decided we'd do something more ambitious, that is ...

Prove that any comparison-based sorting algorithm is $\Omega(n \lg n)$

By "prove that any comparison-based sorting algorithm is $\Omega(n \lg n)$", I mean that the bound holds for any such algorithm at all, including any super-clever advanced algorithm produced by future generations. That's pretty ambitious, no? How can I prove what future generations can't do?

One of the things that makes analyzing algorithms difficult is that they are dynamic: an algorithm's state changes over the period of time in which it runs. However, with these sorting algorithms it is possible to talk about the algorithm as a static object, separated from any particular run of the algorithm. We saw how with an example: I showed in class how to construct the tree $T_3$ showing all possible ways insertion sort can run on a three-element array.

One very interesting feature of this tree is that the worst-case running time for insertion sort on three elements is given by the height of the tree. This will be important next class! Another important thing to note is the leaves. These represent the final order resulting from a run of insertion sort. You'll notice that all 3! permutations of x,y,z appear as leaves. This is in fact necessary: any ordering is possible as input, so there must be at least one path that gets you to each permutation.
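The leaves and height of such a tree can be recovered mechanically by running insertion sort on every possible input and recording each comparison and its outcome. The following is a minimal sketch under stated assumptions: it uses one common formulation of insertion sort (swap leftward until in order), treats each comparison as one tree node, and measures height in comparisons. The tree drawn in class may be laid out differently, and the names `insertion_sort_trace`, `leaves`, `num_leaves`, `height`, and `orders` are illustrative only.

```python
from itertools import permutations


def insertion_sort_trace(values):
    """Sort (value, name) pairs by value; return (comparison path, final order of names)."""
    items = list(zip(values, "xyz"))
    path = []
    for i in range(1, len(items)):
        j = i
        while j > 0:
            smaller = items[j][0] < items[j - 1][0]
            # Record which two names were compared and how the comparison came out.
            path.append((items[j][1], items[j - 1][1], smaller))
            if not smaller:
                break
            items[j], items[j - 1] = items[j - 1], items[j]
            j -= 1
    return tuple(path), tuple(name for _, name in items)


# Each distinct comparison path is one root-to-leaf path in the tree.
leaves = {}
for perm in permutations([1, 2, 3]):
    path, order = insertion_sort_trace(perm)
    leaves[path] = order

num_leaves = len(leaves)                        # number of leaves
height = max(len(path) for path in leaves)      # longest root-to-leaf path
orders = set(leaves.values())                   # final orders found at the leaves
print(num_leaves, height)
```

With this formulation, every one of the six orderings of x, y, z shows up at exactly one leaf, and the longest path has three comparisons.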

The key observation is that a sorting algorithm like insertion sort can be characterized by an infinite sequence of such trees: one for each input size.
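To make the "one tree per input size" idea concrete, the sketch below computes the number of leaves and the height of the tree for the first few input sizes. It assumes the same swap-based formulation of insertion sort as is commonly taught, with height counted in comparisons; `comparison_path`, `tree_stats`, and `stats` are hypothetical names introduced for illustration.

```python
from itertools import permutations


def comparison_path(values):
    """Run insertion sort on a copy of values; return the tuple of comparison outcomes."""
    a = list(values)
    path = []
    for i in range(1, len(a)):
        j = i
        while j > 0:
            out_of_order = a[j] < a[j - 1]
            path.append((j, out_of_order))  # position compared, and the result
            if not out_of_order:
                break
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return tuple(path)


def tree_stats(n):
    """Return (number of leaves, height) of the insertion-sort tree for input size n."""
    paths = {comparison_path(p) for p in permutations(range(n))}
    return len(paths), max(len(p) for p in paths)


# One (leaves, height) pair per input size: a finite prefix of the infinite sequence.
stats = {n: tree_stats(n) for n in range(1, 7)}
for n, (num_leaves, height) in stats.items():
    print(n, num_leaves, height)
```

For each size tried here, the number of leaves equals $n!$, matching the observation that every permutation must appear as a leaf.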