Problem 79

This is the archived version of this course from the Spring 2013 semester. You might find something more recent by visiting my teaching page.

We know from class and from Problem 76 that the expected size of the largest bucket in a regular hash table with \(s=2n\) is \(O((\log n)/(\log\log n))\).

However, this is not a guarantee: with an unlucky choice of hash function, all of the inserted items might hash to the same bucket, for example.

Describe an algorithm that does regular hashing and *guarantees* that every bucket has size \(O((\log n)/(\log\log n))\), while also maintaining \(s \in O(n)\). (So for example you could have \(s=2n\) or \(s=100n\), but not \(s=n^2\).) The number of elements \(n\) is *not* known in advance; your algorithm has to adapt to the changing size as it goes along.

Specifically, assume that there is a continuous stream of items to insert into your hash table. Describe a way of choosing hash functions, and choosing hash table sizes, so that the size of the hash table never exceeds a constant times the number of things inserted so far, and the size of the largest bucket never exceeds \((\lg n)/(\lg\lg n)\).

The expected total running time of your algorithm, after \(n\) insertions have been performed, should be \(O(n)\).
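To make the setup concrete, here is one possible skeleton of such a scheme (not the required answer, and all names are mine): a chained hash table that picks a fresh random hash function and rebuilds whenever any bucket exceeds a constant multiple of \((\lg n)/(\lg\lg n)\), and doubles the table whenever the item count passes half the table size. The rebuild loop retries with new hash functions until the bound holds; with the constant chosen here, the expected number of retries is small, which is what makes the expected total time linear.

```python
import math
import random


class RehashingTable:
    """Sketch of an adaptive chained hash table (assumes distinct keys).

    Invariants maintained after every insert:
      - table size s satisfies count <= s < 8 * count (so s is O(n)),
      - every bucket has size at most a constant times (lg n)/(lg lg n).
    """

    def __init__(self):
        self.size = 8
        self.count = 0
        self.salt = random.getrandbits(64)   # identifies the current hash function
        self.buckets = [[] for _ in range(self.size)]

    def _bucket_bound(self):
        # A constant multiple of (lg n)/(lg lg n), with a small-n floor so
        # the expression is well defined; the constant keeps the expected
        # number of rebuild retries O(1).
        n = max(self.count, 4)
        return max(6.0, 2 * math.log2(n) / max(math.log2(math.log2(n)), 1))

    def _index(self, key):
        # Mixing a random salt into the hash simulates picking a random
        # hash function from a family.
        return hash((self.salt, key)) % self.size

    def insert(self, key):
        self.buckets[self._index(key)].append(key)
        self.count += 1
        if self.count > self.size // 2:
            self._rebuild(self.size * 2)   # keep s = 2n, still O(n)
        elif len(self.buckets[self._index(key)]) > self._bucket_bound():
            self._rebuild(self.size)       # same size, fresh hash function

    def _rebuild(self, new_size):
        items = [k for bucket in self.buckets for k in bucket]
        self.size = new_size
        while True:
            self.salt = random.getrandbits(64)
            self.buckets = [[] for _ in range(self.size)]
            for k in items:
                self.buckets[self._index(k)].append(k)
            # Retry with a new hash function until the bucket bound holds.
            if max(len(b) for b in self.buckets) <= self._bucket_bound():
                return
```

The design choice worth noting: rebuilding on a bucket overflow does *not* grow the table, it only redraws the hash function, since the size invariant \(s \in O(n)\) would be violated if overflows triggered growth. Each rebuild costs time linear in the current item count, and the doubling schedule plus the low retry probability is what should keep the total expected cost of \(n\) insertions at \(O(n)\).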