A: Solution meets the stated requirements and is completely correct. Presentation is clear, confident, and concise.
B: The main idea of the solution is correct, and the presentation is fairly clear. There may be a few small mistakes in the solution, or some faltering or missteps in the explanation.
C: The solution is acceptable, but there are significant flaws or differences from the stated requirements. Group members have difficulty explaining or analyzing their proposed solution.
D: Group members fail to present a solution that correctly solves the problem. However, there is clear evidence of significant work and progress towards a solution.
F: Little to no evidence of progress towards understanding the problem or producing a correct solution.
Problem | Final assessment |
1 | |
2 | |
3 | |
4 | |
Instructions: Review the course honor policy: you may not use any human sources outside your group, and must document anything you used that's not on the course webpage.
This cover sheet must be the front page of what you hand in. Use separate paper for your written solution outlines, and make sure they are neatly done and in order. Staple the entire packet together.
So here's your task: given a list of numbers `A` of length $n$, find all numbers `x` that occur more than $n/3$ times in $A$. These are the popular numbers.
Note that there can be 0, 1, or 2 popular numbers. (It's impossible to have three things that all occur *more* than one-third of the time.)
    Algorithm: popular_basic(A)
    Input : array A of n numbers
    Output: list of the "popular numbers"

    apop = []
    for i from 0 to n-1 do
      if count(A,A[i],0,n-1) > n/3
        if A[i] not in apop
          apop.append(A[i]);

    Algorithm: count(A,x,i,j)
    Input : array A of numbers, a value x and a range i..j in A
    Output: number of times x occurs in A[i],...,A[j]

    c = 0
    for each a in A[i],...,A[j] do
      if a == x
        c = c+1
    return c

What's the running time of this `popular_basic` algorithm, in terms of $n$?
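For experimenting, the pseudocode above translates almost line for line into Python. This is only a sketch: the function names follow the pseudocode, the final `return apop` is added here so the result can be inspected, and the sample lists are made up for illustration.

```python
def count(A, x, i, j):
    """Number of times x occurs in A[i], ..., A[j] (inclusive range)."""
    c = 0
    for a in A[i:j + 1]:
        if a == x:
            c = c + 1
    return c


def popular_basic(A):
    """Return the values that occur more than n/3 times in A."""
    n = len(A)
    apop = []
    for i in range(n):
        if count(A, A[i], 0, n - 1) > n / 3:
            if A[i] not in apop:
                apop.append(A[i])
    return apop  # added in this sketch so callers can see the result


print(popular_basic([1, 2, 1, 2, 3, 1, 2]))  # two popular numbers
print(popular_basic([4, 4, 4, 5, 6]))        # one popular number
print(popular_basic([1, 2, 3]))              # none: each value occurs exactly n/3 times
```

The last call shows why the definition says *more* than $n/3$: a value that occurs exactly $n/3$ times is not popular.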
    Algorithm: popular_better(A,i,j)
    Input : array A, and indices i <= j that define an n-element range in A
    Output: list of the "popular numbers"

    apop = []
    if i == j
      apop.append(A[i])
    else
      m = (i+j)/2, n = j - i + 1       ← note: n is the number of elements in A[i],...,A[j]
      L = popular_better(A,i,m)
      for each x in L do
        if count(A,x,i,j) > n/3        ← note: this is on the range i..j, not i..m!
          apop.append(x)
      U = popular_better(A,m+1,j)
      for each x in U do
        if x not in apop and count(A,x,i,j) > n/3   ← note: this is on the range i..j, not m+1..j!
          apop.append(x)
    return apop

Convince me that this algorithm is still correct, i.e., it always returns all the popular elements in `A`.
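A runnable transcription can help when testing your argument on small inputs. The sketch below assumes that `m = (i+j)/2` means integer division (rounding down), and it compares the result against a brute-force tally built with `collections.Counter`. That comparison is a spot check on one made-up list, not a proof.

```python
from collections import Counter


def count(A, x, i, j):
    """Number of times x occurs in A[i], ..., A[j] (inclusive range)."""
    return sum(1 for a in A[i:j + 1] if a == x)


def popular_better(A, i, j):
    """Popular values of A[i..j], found by recursing on the two halves."""
    apop = []
    if i == j:
        apop.append(A[i])
    else:
        m = (i + j) // 2          # assumption: integer division
        n = j - i + 1             # number of elements in A[i..j]
        L = popular_better(A, i, m)
        for x in L:
            if count(A, x, i, j) > n / 3:      # counted over i..j, not i..m
                apop.append(x)
        U = popular_better(A, m + 1, j)
        for x in U:
            if x not in apop and count(A, x, i, j) > n / 3:  # again over i..j
                apop.append(x)
    return apop


A = [3, 1, 3, 2, 1, 3, 1]
expected = sorted(x for x, c in Counter(A).items() if c > len(A) / 3)
print(sorted(popular_better(A, 0, len(A) - 1)) == expected)
```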
Your task is to determine who the strong Mids are, to decide who will be on the team. And the only tool you have to determine who is strong and weak is running *contests*. A contest involves pitting some Mids against some others in a tug-of-war, and the outcome can be either that one side wins, or the other side wins, or they tie. This pseudocode might help clarify:
    # Calling this function represents a single contest.
    def winner(group1, group2):
        if group1 wins:
            return group1
        elif group2 wins:
            return group2
        else:
            return 'tie'
There can be any number of Mids in any contest, but it should always be the same number on each side of the contest. The side with more strong Mids (or, equivalently, with fewer weak Mids) wins. For example, here is an algorithm for $n=3$:
    def strongOf3(M0, M1, M2):
        w1 = winner({M0}, {M1})
        if w1 == 'tie':
            if winner({M1}, {M2}) == {M1}:
                return {M0, M1}
            else:
                return {M2}
        else:
            if winner(w1, {M2}) == 'tie':
                return w1 + {M2}
            else:
                return w1

And here's an algorithm for $n=4$:
    def strongOf4(M0, M1, M2, M3):
        w1 = winner({M0}, {M1})
        w2 = winner({M2}, {M3})
        if w1 == 'tie' and w2 == 'tie':
            return winner({M0,M1}, {M2,M3})
        elif w1 == 'tie':
            # w1 is a tie, but w2 is not
            if winner({M0}, w2) == 'tie':
                return {M0, M1} + w2
            else:
                return w2
        elif w2 == 'tie':
            # the opposite here; w1 is not a tie
            if winner({M2}, w1) == 'tie':
                return w1 + {M2, M3}
            else:
                return w1
        else:
            return w1 + w2
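To try such algorithms out, it can help to simulate contests in Python. The sketch below is a hypothetical test harness, not part of the problem: `make_winner` is an invented helper that builds a `winner` function from a hidden set of strong Mids, and `strong_of_3` is the $n=3$ algorithm with `winner` passed in as a parameter and set union written as `|`. The example uses a line-up with at least one strong and at least one weak Mid; that restriction is an assumption of this sketch.

```python
def make_winner(strong):
    """Build a contest function from a hidden set of strong Mids (test harness only)."""
    def winner(group1, group2):
        # Both sides must field the same number of Mids.
        assert len(group1) == len(group2)
        s1 = len(group1 & strong)   # strong Mids on side 1
        s2 = len(group2 & strong)   # strong Mids on side 2
        if s1 > s2:
            return group1
        elif s2 > s1:
            return group2
        else:
            return 'tie'
    return winner


def strong_of_3(winner, M0, M1, M2):
    """The n=3 algorithm, using at most two contests."""
    w1 = winner({M0}, {M1})
    if w1 == 'tie':
        if winner({M1}, {M2}) == {M1}:
            return {M0, M1}
        else:
            return {M2}
    else:
        if winner(w1, {M2}) == 'tie':
            return w1 | {M2}
        else:
            return w1


winner = make_winner(strong={"ann", "cy"})
print(sorted(strong_of_3(winner, "ann", "bo", "cy")))  # ['ann', 'cy']
```

Because the harness hides the strong set behind `winner`, the algorithm under test can only learn about strengths by running contests, which mirrors the rules of the problem.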
State your exact lower bound as a function of $n$, showing all your work. Then state what the asymptotic big-$\Omega$ bound is that results, simplified as much as possible.
As an example of what I'm asking for, in class we showed that sorting requires at least $\lg (n!)$ comparisons (the exact bound), which is $\Omega(n\log n)$.
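For reference, the simplification step in that sorting example can be spelled out as follows. This is the standard argument, stated here for $n \ge 2$; it keeps only the larger half of the terms in the sum and bounds each of them from below:

$$\lg(n!) \;=\; \sum_{k=1}^{n} \lg k \;\ge\; \sum_{k=\lceil n/2 \rceil}^{n} \lg k \;\ge\; \frac{n}{2}\,\lg\frac{n}{2}$$

The right-hand side is $\Omega(n\log n)$, which is how the exact bound $\lg(n!)$ turns into the asymptotic one.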
Analyze the number of contests that your algorithm performs in the worst case (NOT the number of primitive operations, just the number of contests).
To be more specific: you continually read add($n$) messages, which tell you that $n$ is the name of a new person, and same($n1$,$n2$) messages, which tell you that $n1$ and $n2$ have been discovered to be from the same clan. Additionally, you get clan?($n$) messages to which you should respond by giving the clan $n$ belongs to, named by the representative you've chosen for it.
Example:

    add(joe)
    add(sue)
    add(eve)
    add(sam)
    same(joe,sam)
    clan?(sam)     ← might say "clan joe" (these are all "might" because you can choose who is the "representative")
    clan?(eve)     ← would say "clan eve" (at this point, eve is a clan of one, as far as you know)
    same(sam,eve)
    clan?(eve)     ← might say "clan joe"

The last bit shows the interesting part. When you discover two people, $n1$ and $n2$, are in the same clan, that means that the clan you thought $n1$ belonged to is in fact the same as the clan you thought $n2$ belonged to. So you have to "merge" them into one clan so that they all share the same representative.
Here's a very inefficient data-structure (VIDS) that supports `add`, `same` and `clan?` operations:
`init`: create $A$, an extensible array / arraylist of length-2-string-arrays. $A[i]$ will represent a person. $A[i][0]$ is the person's name, $A[i][1]$ is the name of $A[i][0]$'s clan representative.
`add(n)`: is implemented by a "push_back" on $A$ of $[n,n]$. Note: this means that initially $n$ is a clan of one. Time: this operation runs in amortized $O(1)$, since that's the time for a push_back on an extensible array.
`clan?(n)`: is implemented by searching $A$ for entry $i$ such that $A[i][0]$ is $n$, and then returning $A[i][1]$. Time: $O(m k)$, where $m$ is the size of $A$ and $k$ is a bound on the length of the names.
`same(n1,n2)`: is implemented by first letting `c1 = clan?(n1)` and `c2 = clan?(n2)`, and then simply returning if `c1 == c2`, and otherwise by going through $A$ and doing `if (A[i][1] == c2) A[i][1] = c1;` (or vice versa if you want n2's representative to be the overall representative). This ensures that everyone in n2's clan gets joined to n1's clan. Time: $O(m k)$, where $m$ is the size of $A$ and $k$ is a bound on the length of the names.
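The VIDS operations described above can be sketched in Python roughly as follows. The class and method names are choices made for this sketch (`clan?` is not a valid Python identifier, so the method is called `clan`), and raising `KeyError` for an unknown name is an assumption, since the description does not say what should happen in that case.

```python
class VIDS:
    """Very inefficient data structure: a list of [name, representative] pairs."""

    def __init__(self):
        self.A = []                      # A[i][0] is a name, A[i][1] its clan representative

    def add(self, n):
        self.A.append([n, n])            # a new person starts as a clan of one

    def clan(self, n):
        for entry in self.A:             # linear search through all m entries
            if entry[0] == n:
                return entry[1]
        raise KeyError(n)                # assumption: unknown names are an error

    def same(self, n1, n2):
        c1 = self.clan(n1)
        c2 = self.clan(n2)
        if c1 == c2:
            return                       # already known to be the same clan
        for entry in self.A:             # relabel everyone in n2's clan
            if entry[1] == c2:
                entry[1] = c1


v = VIDS()
for name in ["joe", "sue", "eve", "sam"]:
    v.add(name)
v.same("joe", "sam")
print(v.clan("sam"))   # joe
print(v.clan("eve"))   # eve
v.same("sam", "eve")
print(v.clan("eve"))   # joe
```

The usage lines replay the example message sequence; with this choice of representative, the answers match the "might say" responses given there.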
`init`, `add`, `clan?` and `same`.
(d (i,j) val) - where the meaning is that cell (i,j) contains data val, or
(f (i,j) ((i1,j1),...,(ik,jk)) f(x1,x2,...,xk)) - where the meaning is that cell (i,j) is the result of a calculation evaluating function f with argument x1 coming from cell (i1,j1), x2 coming from cell (i2,j2), etc.
Example description | Example rendering as spreadsheet |
(f (0,2) ((0,0),(0,1)) average) | |
(d (1,1) 23) | |
(f (1,2) ((1,1),(0,2)) sum) | |
(d (0,0) 14) | |
(f (2,2) ((0,2),(1,2)) average) | |
(d (0,1) 16) | |
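To make the meaning of such a description concrete, here is a minimal Python sketch that evaluates the example entries. The tuple encoding, the `FUNCTIONS` table and the `evaluate` helper are all invented for this illustration. The sketch assumes every referenced cell is defined and that no cell depends on itself, directly or indirectly; it makes no attempt to detect either problem.

```python
def average(xs):
    return sum(xs) / len(xs)


FUNCTIONS = {"average": average, "sum": sum}

# ("d", cell, value) or ("f", cell, argument_cells, function_name),
# in the same order as the example description.
SHEET = [
    ("f", (0, 2), [(0, 0), (0, 1)], "average"),
    ("d", (1, 1), 23),
    ("f", (1, 2), [(1, 1), (0, 2)], "sum"),
    ("d", (0, 0), 14),
    ("f", (2, 2), [(0, 2), (1, 2)], "average"),
    ("d", (0, 1), 16),
]


def evaluate(sheet):
    """Compute every cell's value, following dependencies recursively."""
    cells = {entry[1]: entry for entry in sheet}
    values = {}

    def value_of(cell):
        if cell not in values:
            entry = cells[cell]          # KeyError here means an empty cell was referenced
            if entry[0] == "d":
                values[cell] = entry[2]
            else:
                args = [value_of(c) for c in entry[2]]
                values[cell] = FUNCTIONS[entry[3]](args)
        return values[cell]

    for cell in cells:
        value_of(cell)
    return values


values = evaluate(SHEET)
print(values[(0, 2)], values[(1, 2)], values[(2, 2)])  # 15.0 38.0 26.5
```

Note that the order of the entries does not matter: cell (0,2) is listed before the data cells it depends on, and the recursion still finds them.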
(f ((8,5),(8,6),(8,7)) sum) but cell (8,7) is empty, then we can't evaluate the spreadsheet completely. More perniciously, we can't have a situation where cell (i1,j1) requires cell (i2,j2)'s value to compute, but cell (i2,j2) requires (i1,j1)'s value to compute. If we tried to evaluate a spreadsheet with a cyclic dependency like this, we'd end up in an infinite loop.

Give an algorithm for determining whether a spreadsheet (described as shown in the example above) has a cyclic dependency that would cause it to be un-evaluatable ... which is a word I just made up.