This is the archived website of SI 486H from the Spring 2016 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

Problem 98

Substring search algorithms

Due: May 3
Points: 2

In class, we're going to look at the substring search problem: Given a text file with \(n\) characters, and a string \(s\) of length \(m\) (obviously \(m\le n\)), we want to find the first occurrence of \(s\) within the file.

(This is also sometimes called "string matching".)

The Boyer-Moore algorithm and Knuth-Morris-Pratt (KMP) algorithm are the two most famous deterministic algorithms for this problem. In class, we also looked at the Karp-Rabin randomized algorithm for this problem.

Do a little research and discover the running times of these three algorithms. Report the worst-case running time of BM and KMP, and the expected running time of KR algorithm we'll cover in this class. When do you think one would be more useful than the others?