These notes closely review Unit 1, Section 9.


## Analyzing the simplest algorithms, part I: linearSearch's worst case

Suppose I want to analyze our good friend linearSearch. Let's try to analyze its worst-case running time, $T_w(n)$, in terms of $n$, the number of elements in $A[i\ldots j]$. Here's the algorithm:
Algorithm: linearSearch
Input: $(A,i,j,x)$, an instance of the Sorted Array Search problem
Output: $k$, an index such that $A[k] = x$, if one exists in $A[i\ldots j]$, and 'NOT FOUND' otherwise

    k = i
    while k < j and A[k] < x
        k = k + 1
    if A[k] == x
        return k
    else
        return 'NOT FOUND'

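To make the pseudocode concrete, here is a sketch of the same algorithm in Python (0-indexed; the function name `linear_search` is mine, not part of the notes):

```python
def linear_search(A, i, j, x):
    """Return an index k with A[k] == x in A[i..j] (inclusive), or 'NOT FOUND'."""
    k = i
    # Scan forward while we haven't reached j and the current element is too small.
    while k < j and A[k] < x:
        k = k + 1
    if A[k] == x:
        return k
    else:
        return 'NOT FOUND'
```

For example, `linear_search([1, 3, 5, 7], 0, 3, 5)` returns `2`, while searching for `4` returns `'NOT FOUND'`.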
Your temptation is to start by saying "the worst case is ...". Don't! In most cases it's actually hard to know what the truly worst-possible input is. Worse, if you claim that a particular input is a worst-case input, you have to prove it. That can be tough. So how should you start?
• [Get an upper bound (Big-O) by analyzing structurally] Well, we will first analyze the algorithm "structurally", deducing upper bounds on how many times we can go through loops, without claiming that there is any input instance that would actually make it happen. In this case, we know that the while loop may execute $n$ times: perhaps fewer, but never more. How can we say something that's always true that expresses this? The number of times we execute the while-loop body in the worst case is $O(n)$. Though not maximally informative, it is irrefutably and obviously true. So that's what we'll go with. Here's our analysis: the statements outside the loop take $O(1)$ time, the while-loop body takes $O(1)$ time per iteration, and the body executes $O(n)$ times, for a total of $O(1) + O(n)\cdot O(1) = O(n)$.

Thus, we deduce that $T_w(n) \in O(n)$.
• [Get a lower bound (Big-$\Omega$) by analyzing a particular instance] Next, we'll try to get a lower bound (i.e. use $\Omega$) on the worst-case running time $T_w(n)$. We'll do that by considering a particular input, deriving a lower bound on the running time for that input, and noting that this must also be a lower bound on the worst-case running time. After all, the worst case must be at least as bad as any particular input instance I choose. I'll choose something I know is pretty bad, namely an input instance $I$ in which $x = A[j]$ and $A[j]$ is larger than all the other elements in the range. Here's our analysis of $T_I(n)$, the running time on this instance: for every $k$ from $i$ to $j-1$ we have $k < j$ and $A[k] < x$, so the while-loop body executes $n-1$ times, each iteration taking $\Omega(1)$ time, giving $T_I(n) \in \Omega(n)$.

Since $T_w(n) \geq T_I(n)$, we deduce that $T_w(n) \in \Omega(n)$.
Since we have both that $T_w(n) \in O(n)$, and $T_w(n) \in \Omega(n)$, we deduce that in fact $T_w(n) \in \Theta(n)$.
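As a sanity check on the $\Omega(n)$ argument, here is a hypothetical instrumented sketch (Python, 0-indexed; `linear_search_count` is a name I'm introducing) that counts how many times the while-loop body runs on the instance where $x = A[j]$ is the largest element:

```python
def linear_search_count(A, i, j, x):
    """Return (result, iterations), where iterations counts while-loop body runs."""
    k, iterations = i, 0
    while k < j and A[k] < x:
        k += 1
        iterations += 1
    result = k if A[k] == x else 'NOT FOUND'
    return result, iterations

n = 100
A = list(range(1, n + 1))                # sorted; A[j] = n is the largest element
result, iters = linear_search_count(A, 0, n - 1, n)   # search for x = A[j]
# The loop body runs n-1 times on this instance, matching T_I(n) ∈ Ω(n).
```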

## Analyzing the simplest algorithms, part II: linearSearch's best case

To do a best-case analysis, we just flip things around. Use the structure of the algorithm to get a lower bound on $T_b(n)$, the best-case performance (using $\Omega$), then choose a concrete instance $I'$ and get an upper bound on $T_{I'}(n)$, the running time for instance $I'$ (using $O$), which in turn is an upper bound for $T_b(n)$. Since we never go through the while loop fewer than zero times, we easily derive $T_b(n) \in \Omega(1)$. Let $I'$ be the instance in which $A[i] = x$. In this case the loop condition $A[i] < x$ fails immediately, so we execute $k = k+1$ zero times, and get an upper bound $T_{I'}(n) \in O(1)$. Since the best case cannot be worse than instance $I'$, we get $T_b(n) \in O(1)$.

Since we showed both that $T_b(n) \in \Omega(1)$, and $T_b(n) \in O(1)$, we deduce that in fact $T_b(n) \in \Theta(1)$.
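The best-case instance can be checked the same way, with an instrumented sketch (again, `linear_search_count` is my own name for this helper): when $x = A[i]$, the loop body never runs, no matter how large $n$ is.

```python
def linear_search_count(A, i, j, x):
    """Return (result, iterations), where iterations counts while-loop body runs."""
    k, iterations = i, 0
    while k < j and A[k] < x:
        k += 1
        iterations += 1
    return (k if A[k] == x else 'NOT FOUND'), iterations

A = [5, 10, 20, 40]
result, iters = linear_search_count(A, 0, 3, 5)   # x = A[i]: found immediately
# Zero loop iterations regardless of n, matching T_{I'}(n) ∈ O(1).
```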

## Everyone's favorite bad algorithm: BubbleSort

Things aren't always as easy as our analyses of linearSearch might lead you to believe, even though the big ideas from those analyses will definitely stay with us. Let's look at a more complicated algorithm, the algorithm everybody loves to hate: BubbleSort.
Algorithm: BubbleSort
Input:  A, an array of integers
        n, the number of elements in A
Output/Side Effect: elements of A are sorted in increasing order

    count = 1, k = n-1
    while count > 0 and k > 0
        count = 0
        for i from 1 to k
            if A[i] < A[i-1]
                swap(A[i-1], A[i])
                count = count + 1
        k = k - 1
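A direct Python transcription of the pseudocode (0-indexed; `bubble_sort` is my naming) looks like this:

```python
def bubble_sort(A):
    """Sort A in place, mirroring the pseudocode above; returns A for convenience."""
    n = len(A)
    count, k = 1, n - 1
    while count > 0 and k > 0:
        count = 0
        # Each pass bubbles the largest remaining element toward index k.
        for i in range(1, k + 1):
            if A[i] < A[i - 1]:
                A[i - 1], A[i] = A[i], A[i - 1]   # swap(A[i-1], A[i])
                count += 1
        k = k - 1
    return A
```

The `count` variable lets the loop stop early on a pass with no swaps, which is why the while condition tests `count > 0` as well as `k > 0`.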

We don't need to worry too much about how BubbleSort works, or that we haven't formally defined the sorting problem. We're just going to focus on the analysis problem.

In class we had a dynamic discussion about doing this analysis. Hopefully you paid close attention and took good notes. The upshot was that we can prove $T_w(n) \in O(n^2)$ pretty easily just by looking at the structure of the algorithm. To get a lower bound, we have to analyze a specific input — hopefully something bad for BubbleSort. A natural candidate is $n, n-1, \ldots, 2, 1$. However, while we might convince ourselves that this input causes the while loop to execute $\Omega(n)$ times (exactly $n-1$ in fact), all we can say in terms of $n$ about the inner for loop is that it executes $\Omega(1)$ times. This yields $T_w(n) \in \Omega(n)$, but it really feels like we should be able to say something stronger!

We can, by taking a slightly different approach. Let's account for the cost of executing the for-loop body separately from everything else. The body runs in $\Omega(1)$ time, and each iteration through the outer while loop executes the for-loop body $k$ times. So, the total number of times it gets executed (over all iterations through the outer while loop) is

	    n-1 ← the first time through the while loop
+ n-2 ← the second time through the while loop
+ n-3 ← the third time through the while loop
...
+  2  ← the second to last time through the while loop
+  1  ← the last time through the while loop

This gives us $\frac{(n-1)n}{2}$ executions of the for-loop body, each taking $\Omega(1)$ time, and so: $T_w(n) \in \Omega\left(\frac{(n-1)n}{2}\right) \Rightarrow T_w(n) \in \Omega(n^2).$ So, since $T_w(n) \in O(n^2)$ and $T_w(n) \in \Omega(n^2)$, we get $T_w(n) \in \Theta(n^2)$.
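The summation can be checked empirically with a hypothetical instrumented sketch (`bubble_sort_body_count` is my own name): on the reversed input $n, n-1, \ldots, 2, 1$, every comparison swaps, so all $n-1$ passes run and the for-loop body executes exactly $\frac{(n-1)n}{2}$ times.

```python
def bubble_sort_body_count(A):
    """Run BubbleSort on A, counting total executions of the for-loop body."""
    n = len(A)
    count, k, body = 1, n - 1, 0
    while count > 0 and k > 0:
        count = 0
        for i in range(1, k + 1):
            body += 1                            # one more for-loop body execution
            if A[i] < A[i - 1]:
                A[i - 1], A[i] = A[i], A[i - 1]
                count += 1
        k -= 1
    return body

n = 50
reversed_input = list(range(n, 0, -1))           # n, n-1, ..., 2, 1
total = bubble_sort_body_count(reversed_input)
# (n-1) + (n-2) + ... + 1 = n(n-1)/2 body executions on this instance.
```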