## Reading

Unit 5 Notes Sections 1.1,1.2.

## Problem 1

Here's an adjacency list describing a graph. Draw it.
Here's an adjacency matrix describing a graph. Draw it.
Note: The two diagrams were given in class.
## Problem 2

Analyse the worst-case running time (in terms of $n$ the number
of vertices in $G$ and $k$ the length of $L$) for the following algorithm,
once assuming the adjacency list representation, once assuming
the adjacency matrix representation

checkTrail(G,n,L,k)
------------------
Input: directed graph G on n vertices {0,1,...,n-1}, array L of k vertices
Output: true if L is a "trail", i.e. in going from each vertex to the
next in L, we only follow valid edges.
check = true, i = 1, curr = L[0]
while i < k and check
next = L[i]
check = *check if there's an edge from curr to next*
curr = next
i = i + 1
return check

## Problem 3

Suppose that we store the neighbors list in the adjacency list
representation as a sorted array. What would your analysis look
like for the previous algorithm in this variant of the adjacency
list?

## Problem 4

Modify checkTrail to make it "checkPath", where we also require
that no vertex is visited more than once. Analyze the worst
case of your modified
algorithm (once again in both representations).

## Problem 5

Let's describe the input size of G more precisely: it has n
vertices and e edges. Do a worst-case analysis of checkPath
(for both representations) using all three paramters (n,e,k) as
the input "size".

## Problem 6

A vertex is a "source" if there are no edges coming into it from
other vertices. Design an algorithm that takes a graph (n
vertices and e edges) and determines all sources in the graph.
Do a worst-case analysis (in terms of $n$ and $e$) of your
algorithm in both representations.
## Problem 7

Consider the algorithm findCommon that takes a graph and an
array L of k source vertices and determines whether there is a
vertex that is a descendent of more than one of the given
sources, returning such a vertex if one exists.
Analyze the worst-case running time of this algorithm in terms
of n (number of vertices) and e (number of edges)
in both representations (i.e. adjacency list and matrix).

## Problem 4 - A Solution

Here is a modification of checkTrail to give "checkPath", as
asked for in Problem 4. Note that this version uses an
adjacency list.

checkPath(G,n,L,k)
------------------
Input: directed graph G on n vertices {0,1,...,n-1}, array L of k vertices
**NOTE:** assume G is given as an adjacency list
**NOTE:** assume k ≤ n, since we can't have a path otherwise
Output: true if L is a "trail", i.e. in going from each vertex to the
next in L, we only follow valid edges.
A = an array of n bool's ← to remember which vertices have been visited
initialize elt's of A to false
check = true, i = 1, curr = L[0], A[curr] = true
while i < k and check
next = L[i]
if (A[next] = true) return false else A[next] = true
check = (linearSearch(G[curr],0,G[curr].length-1,next) ≠ NOT_FOUND)
curr = next
i = i + 1
return check

One analysis for worst-case time in terms of $n$ only goes like this:
Creating and initializing the array $A$ takes $O(n)$ time. We
loop $O(k)$ times, but since $k \leq n$, we can make that $O(n)$
times. The linear search is $O(n)$ since no vertex has more
than $n$ neighbors. So I arrive at $T_w(n) \in O(n^2)$.

## Problem 5 - A better analysis of the Problem 4 Solution

Let's instead do our analysis in terms of $n$ and $e$, where $e$
is the number of edges in the graph. Of course we could just
use the previous bound (it can't be wrong!) and say
$T_w(n,e) \in O(n^2)$. However, we could look at things a bit
differently. If you think about it, we can never have the same
value of

`curr`

twice, which means that we can never
do worse than calling linear search exactly once for each
vertex. This means the amount of work is the sum of the
neighbors list length over all vertices. But this is exactly
$e$, the number of edges! Thus, we get $O(n)$ for the
intializing $A$ and for the worst-case number of times through
the while loop, and we separately get $O(e)$ for the combined
cost of all the linear searches. This gives us a bound of
$O(n + e)$, which we can't simplify further, because $e$ could
be greater or less than $n$. But, $O(n + e)$ is a much more
precise bound, because when graphs have few
edges (e.g. no node has more than four children) $e$ is much
smaller than $n^2$. This is the kind of analysis we'll usually
want with graph algorithms.

**Note:**
$O(f(n) + g(n))$ is just a synonym for $O(\max(f(n),g(n)))$.
So $O(n + e) = O(\max(n,e))$.