Reading

Read these notes and closely review Unit 1, section 9.

Quiz questions

Review quiz questions

Analyzing a simple iterative algorithm: findFirstMissing with binarySearch

Dr. Roche, in the Unit 1 notes, kindly analyzes our three search functions so painstakingly that he gives us exact expressions for the worst-case running time in terms of the number of statements executed. From those expressions, we can deduce $O$, $\Theta$ and $\Omega$ bounds. However, this is most decidedly not the way we normally do things! If we know the exact expressions, there's not much reason to bother with $O$, $\Theta$ and $\Omega$. The usual case is that it's too difficult to get exact expressions, and we "settle" for information about asymptotic growth rates. So let's try analyzing findFirstMissing the usual way: we'll get what information we can about the worst-case running time function's growth rate, expressed in terms of $O$/$\Theta$/$\Omega$, without ever trying to get an exact expression for the function.

In class we determined that the worst-case running time function $T_w(n)$ is both $O(n \lg n)$ and $\Omega(n \lg n)$. I.e., it is $\Theta(n \lg n)$.
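To make the object of analysis concrete, here is a minimal sketch of findFirstMissing built on binarySearch. The function names, the loop structure, and the convention of returning the index of the first missing element (or -1 if none) are assumptions of this sketch; the version in the Unit 1 notes may differ in its details, but the cost structure is the same: up to $n$ iterations, each paying $O(\lg n)$ for the binary search.

```python
def binary_search(A, lo, hi, key):
    """Return an index of key in sorted A[lo..hi], or -1 if absent."""
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == key:
            return mid
        elif A[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def find_first_missing(A, B):
    """Return the index of the first element of B not found in sorted A,
    or -1 if every element of B appears in A (a hypothetical interface)."""
    n = len(A)
    for i in range(len(B)):
        if binary_search(A, 0, n - 1, B[i]) == -1:
            return i
    return -1
```

In the worst case (every element of B present in A, or the missing one last), the loop runs $n$ times and each search costs $\Theta(\lg n)$, matching the $\Theta(n \lg n)$ bound above.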

What happens when we replace binarySearch with linearSearch?

Let's replace binarySearch with linearSearch in findFirstMissing. The interesting thing here is that we can't just say the call to linearSearch takes time $\Theta(n)$. We would first have to prove that it's actually possible, in the context of the findFirstMissing algorithm, for every call to linearSearch to be "worst case" for linearSearch, despite the fact that each call searches for a different $B[i]$. In fact, if we don't allow duplicate entries in array $B$, it is not possible for more than two calls to linearSearch to take $n$ steps. If you look at an intermediate iteration of findFirstMissing's while loop (i.e. not the last), the time taken by search(A,0,n-1,B[j]) is strictly greater than the time taken by search(A,0,n-1,B[t]) for any $t < j$. So we can't assume that search takes $\Omega(n)$ for every iteration. So what do we do?
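The point that each call to linearSearch has a different, position-dependent cost can be made concrete with an instrumented version that counts comparisons. This is a sketch only; the name linear_search and the (index, steps) return pair are assumptions introduced here for illustration.

```python
def linear_search(A, lo, hi, key):
    """Scan A[lo..hi] left to right.
    Return (index of key or -1, number of comparisons made)."""
    steps = 0
    for i in range(lo, hi + 1):
        steps += 1          # one comparison per element examined
        if A[i] == key:
            return i, steps
    return -1, steps
```

For A = [1, 2, 3, 4], searching for 1, 2, 3, 4, 5 costs 1, 2, 3, 4, 4 comparisons respectively: only a key at the end of A (or absent from A) incurs the full $n$ comparisons, which is why at most a couple of calls inside findFirstMissing can hit the worst case.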

The first thing we did was to punt, and count each call to search as taking $\Omega(1)$ time, which is correct but not very precise. That gave us $T_w(n) \in \Omega(n)$ as well as $T_w(n) \in O(n^2)$.

The second thing we did was to analyze a very specific input situation, get a lower bound for that case, and note that the worst case had to be at least that bad, so the lower bound was also a lower bound on the worst case. I suggested the input situation in which the first half of B is identical to the second half of A, and the element at index $n/2$ of B is not in A. In this case, we go through the while loop $\Omega(n/2)$ times, and each call to linearSearch takes $\Omega(n/2)$ time, which gives us $\Omega(n^2/4) = \Omega(n^2)$ as a lower bound. Since $T_w(n)$ is $O(n^2)$ and $\Omega(n^2)$, it is $\Theta(n^2)$.
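The adversarial input family above can be written down and checked directly. This sketch makes one assumption beyond the text: A holds the values $1, \dots, n$, so that B's first half can copy A's second half and the value $n+1$ appears nowhere in A. The helper names (bad_input, count_work) are hypothetical.

```python
def bad_input(n):
    """Build the adversarial pair from the argument above: B's first half
    equals A's second half, and B[n//2] = n + 1 is missing from A."""
    A = list(range(1, n + 1))
    B = A[n // 2:] + [n + 1] * (n - n // 2)  # first missing is at index n//2
    return A, B

def count_work(A, B):
    """Total comparisons made by findFirstMissing when it uses linear search,
    stopping (as the algorithm does) at the first element of B not in A."""
    total = 0
    for x in B:
        found = False
        for a in A:
            total += 1          # one comparison per element scanned
            if a == x:
                found = True
                break
        if not found:
            break               # first missing element found; loop ends
    return total
```

Each of the first $n/2$ searches must scan past the first half of A before finding its key, so every one costs at least $n/2$ comparisons, and the total is at least $n^2/4$, confirming the $\Omega(n^2)$ lower bound.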

Improved findFirstMissing

How does linear vs. binary search compare now?