Comments or suggestions about the course so far:
Citations (be specific about websites):
1 Sums of Squares
A "perfect square" is an integer multiplied by itself; the first few
are 0, 1, 4, 9, 16, etc. Some integers can be written as the sum of
two perfect squares. For example, $10 = 1^2 + 3^2$. But some cannot:
for example, 7. And some can be written as the sum of 2 squares in
more than one way: for example, $50 = 1^2 + 7^2 = 5^2 + 5^2$.
(Caution: mathematical enrichment irrelevant to the assignment.)
Integers that can be written as the sum of two squares have
interested mathematicians for a long time. For example, Fermat
showed that a prime number $p$ is the sum of two perfect squares
if and only if $p-1$ is divisible by 4.
The following algorithm takes a given integer $n$ and determines
whether it can be written as the sum of two perfect squares. If so,
it returns $a$ and $b$ such that $n = a^2 + b^2$. Otherwise, it
returns "NO".
Algorithm: sumsq(n)
Input : n, a non-negative integer
Output: (a,b), where a and b are integers such that
n = a*a + b*b, if such integers exist, 'NO' otherwise
a = 0, b = n
s = a*a + b*b
while a <= b and s != n:
if s < n
a = a + 1
else
b = b - 1
s = a*a + b*b
if s == n
return (a, b)
else
return 'NO'
-
Use a loop invariant to show that this algorithm is correct.
State the invariant, then go through the three steps from class
to show correctness.
-
Determine the worst-case running time of the algorithm. Give a
$\Theta$ bound on the number of primitive operations, in
terms of the input integer $n$, and simplify as much as
possible. Show your work.
-
Develop an improved algorithm that solves the same problem.
Present your new algorithm, briefly explain why it is correct
(you do not have to do a formal proof with a loop invariant),
and state the worst-case running time.
2 Meet and Greet
There are $n$ strangers in a room. Everyone wants to get acquainted and
meet everyone else. Assume that:
- It takes exactly one minute for a pair of people to meet each other.
- There is plenty of space in the room, and people can move instantly
from one stranger to another.
- Each individual can only meet one other person at a time, but
multiple meetings can be happening simultaneously in the room.
For example, if $n=4$, if we call the people A, B, C, D, then you could
have everyone meet everyone else in the following algorithm:
- A and B meet, and C and D meet
- A and C meet, and B and D meet
- A and D meet, and B and C meet
At this point everyone has met, and the total time taken was 3 minutes.
Your task is to develop an efficient algorithm if there are any number
$n$ of people, not just 4. It doesn't have to match with my algorithm
for $n=4$; that's just an example to see how it could be done. Answer
the following questions to get you on the right track:
- When $n=4$ above, there are a total of 6 meetings that take place.
How many meetings are required for any $n$? Your answer should be a
formula in terms of $n$.
- Only two people can meet at once, so for example when $n=4$ only 2
meetings can take place at once, and therefore the 3-minute solution
above is the best possible for $n=4$.
Using this same logic, what would be the least possible number of
minutes to meet for any value of $n$? (Again, your answer should be
a formula in terms of $n$.)
- Imagine you have split everyone up into two equal-sized groups $G_1$
and $G_2$. Everyone within $G_1$ has met everyone else in $G_1$, and
the same goes for $G_2$, but no one from $G_1$ has met anyone from
$G_2$ yet.
Describe an algorithm for everyone in $G_1$ to meet everyone in
$G_2$. If there are $n/2$ people in each group, your algorithm
should take $n/2$ minutes.
Hint: look up how "Speed Dating" works.
- Come up with an efficient algorithm for everyone in the whole group
of $n$ people to meet each other. You might be able to use your
algorithm from part (c) as a useful subroutine, but that's not
required. (There are many efficient ways to solve this problem.)
Try to get an exact formula in
terms of $n$ for how many minutes your algorithm will take. Ideally, your
formula should match your lower bound from part (b), or at least it
should have the same big-oh.
3 X Marks the Spot
I'm sure you are all missing Turing Machines right now, so let's
revisit them. Suppose you have a 2-way infinite tape Turing
Machine. One cell on the tape has an X on it, the rest are
blank. Your job is to design an algorithm to find the X using
the following three Turing Machine
operations:
- lmove(), which moves one tape square to the left,
- rmove(), which moves one tape square to the right, and
- isblank(), which returns true if the current tape
square is blank, and false otherwise.
In your algorithm, besides the Turing Machine operations and the
usual pseudocode operations (variables,
adding/subtracting/multiplying integers, loops, etc.), I'll let
you write "lmove(k)" as a short-hand for a loop that calls
lmove() k times, and "rmove(j)" as a short-hand for a loop that
calls rmove() k times. Your algorithm should exit with the
Turing Machine tape reader on the square that contains X.
- Present an algorithm for the FindX problem.
- Analyze the worst-case cost of your algorithm. The "measure of difficulty"
will be $d$, the distance (in tape squares) of the square
containing X from the square you start at.
So if the X is on the square you start at, $d = 0$. If it's
on one of the squares adjacent to the start square, the
distance is 1.
The "cost measure" will be the total number of moves,
i.e. the total number of calls to lmove() or
rmove(). [Remember: "lmove(k)" and "rmove(j)" are just
short-hand for multiple calls to lmove()/rmove().]
Give a big-O bound on the worst-case number of moves
in terms of $d$. The best solutions
will have worst-case cost $O(d)$.
Note: this difficulty measure is not typical! Since there is
no "input" to this algorithm, the measure of difficulty cannot be the
input size. And the cost measure is also unusual - we're not counting
primitive operations in the algorithm, rather the number
of moves made.
-
Asymptotics are great and all, but in this case the "hidden
constants" in the big-O really matter — tape moves on
my Turing Machine are really slow!
Refine your analysis from (b) to give an exact upper bound on the number
of miles sailed (i.e. not a big-O bound). There will be a
tangible prize for the group who gets the smallest coefficient in front
of $d$.
4 Swap counting
For this problem, you will study three sorting algorithms you should already
be familiar with: InsertionSort, SelectionSort, and HeapSort. The first two
of these are in the class notes, and here is the version of
HeapSort that you can use for the purposes of this problem:
Algorithm: heapSort(A,n)
Input : array A of n elements
Output: leaves A is sorted order
for i from n-1 to 0 do ← This loop does the heapify operation
bubbleDown(A,i,n)
for i from n-1 to 0 do ← This loop does the heap removals.
swap(A,0,i)
bubbleDown(A,0,i)
And here is a bubbleDown algorithm that you should use:
Algorithm: bubbleDown(A,start,end)
Input : A - an array,
start - the index of the heap root,
end - the index of the last/rightmost element of the heap
Note: A[start],...,A[end] is a heap except that the
element at the root might be out of place (hence the
call to bubbleDown!)
Output: at exit, A[start],...,A[end] is a heap
more = true
while more and 2*start + 1 ≤ end do ← as long as start is not a leaf, loop!
child = 2*start + 1 ← set child to start's left child
if child + 1 ≤ end and A[child+1] > A[child] ← set child to start's right child if
that's larger
child = child + 1
if A[start] < A[child] ← if A[start] out of place, push it
swap(A, start, child) down to A[child]
start = child
else
more = false ← start not out of place, we're done!
- Tell me the worst-case running time of InsertionSort, SelectionSort,
and HeapSort. You should know this already.
- Analyze the number of *swaps* performed by a call to InsertionSort,
in the worst case. To do this, you first have to describe a
worst-case family of examples, i.e., what ordering of a list,
for *any* size, makes InsertionSort do the maximum number of swaps.
Second, you need to analyze the number of swaps that InsertionSort
does on your worst case example. This should be an exact function of
$n$, like $3n^3 - 6$ or something like that. Try to be as precise
as possible.
Finally, turn your formula into a $\Theta$-bound on the number of
swaps.
- Repeat (b) for SelectionSort.
- Repeat (b) for HeapSort. It will be a little more difficult to get
the exact analysis of the number of swaps, so for this one you just
need to give a formula for an *upper bound* on the number of swaps.
But still, try to make your formula as precise as possible.
- Show that *any* sorting algorithm that only uses swaps to move
elements in the input array must always perform at least
$\lceil n/2 \rceil$ swaps, in the worst case. Which of the three
algorithms above comes closest to this lower bound?