First a brief review:
- What is the difference between an Abstract Data Type and a Data
Structure?
- An ADT is a mathematical model of information stored in a
computer.
- A Data Structure is a particular arrangement of data in a
computer.
- An ADT consists of specification of behavior by providing a list
of operations that can be performed on it. It does not say
how this gets done in the computer.
- A data structure describes how an ADT gets implemented.
- In java an Interface is like an ADT- says what needs to be done,
with no instructions of how.
- When you implement the interface, you write the data structure.
- Which of the following is an ADT/DS:
- Linked List
- Stack
- Array
- Queue
- List
- Our first ADT in this class will be the Tree.
- When we say the word "tree" in Computer Science, what we mean is
a tree like a family tree, or a military command structure. We can draw one
like: (insert academy hierarchy)
- What trees represent is hierarchy. This is different from lists
in that lists represent linear structure.
- Other things organized as trees: folders/directories, book
chapters, decision trees.

- In a tree, everything must be connected up to a single root.
- Definitions:
- Node. element of the tree.
- Root. Node at the top.
- Parent. Of 2 nodes, A and B, A is the parent of B is A is
directly above B. Nodes can each have only one parent.
- Child. Reverse of parent.
- Sibling. 2 nodes are siblings if they share the same parent.
- Ancestor. Parent, or parent's parent, or... etc.
- External. Of a node, has no children.
- Internal. Of a node, not external.
- Subtree. A portion of a tree that is still a tree.
- In a filesystem tree, what kind of nodes are the files? what kind
are the directories?
- Tree methods:
- element(v) // returns the content of node v
- root() // returns the root node of the tree.
- parent(v) //returns the prent of node v
- children(v) //returns an iterable container holding the children of node v
- isInternal(v) // returns true if node v is internal
- isExternal(v) // returns true if node v is external
- isRoot(v) // returns true if node v is the root
- size() // returns the number of node in the tree
- isEmpty() // returns true if the tree is empty
- positions() // returns an iterable container of all nodes in
the tree.
- elements() // returns an iterable container of the content of
all node in the tree.
- replace(v,e) // replace the contents of node v with e. return
the old contents of v.
- All of these take O(1) time, except:
- children(v), which takes O(Cv)
- elements(), positions(), which take O(n).
- Lets define the depth of a node to be the number of ancestors of
that node. The depth of the root is 0.
- We can come up with a quick algorihtm for computing the depth of
a node:
int depth(v)
d = 0
while (!isRoot(v))
v = parent(v)
d++
return d
- But we could also do that recursively:
int depth(v)
if (isRoot(v))
return 0
else
return 1 + depth(parent(v))
- Either way, the run time is O(d), the depth of the node.
- depth is easy, but lets say we want to know the height of the
tree, defined as the depth of the deepest node. We know a basic way
to do this:
int height()
allnodes = positions()
max = 0
for each v in all nodes
h = depth(v)
if (h > max)
max = h
return max
- How long does this take? First, how many times is depth(v)
called? Ok, how many steps does each call to depth take? It
depends on the depth of each node, and the depth depends on the
shape of the tree. What shape tree maximizes the number of steps in
depth?
- Exactly. A List shaped one. In this case, the first call to
depth takes 0 passes through the loop, the second takes 1 pass, the
third 2 passes, up to (n-1) passes, giving us 1+2+3+ ... + (n-1) or
O(n2).
- But we can do better:
int height(v)
if (isExternal(v))
return 0
else
h = 0
for each w in children(v)
n = height(w)
if (n > h) h = n
return h + 1
- OK, what is the run time of that? Well, analyzing recursive
algorithms is a little bit harder, of course.
- One way to do it is to count the number times the function gets
called, times the work in the function (not counting the recursive
call).
- How many times is height called? Once per each node node.
- How much work does the function do? In each call, we go through
the loop Cv times, giving us O(nCv). This
doesn't help us too much, because we don't know how big Cv
is. There are 2 ways to fix this:
- Break the multiplication into repeated addition. instead of
nCv, we have Cv1 + Cv2 +
Cv3 + ... + Cv(n-1) + Cvn
this of course equals n-1, because there are n-1 children in the
tree. Thus height is O(n).
- Use the average of all the Cv. Since the average of
the Cv does not grow with the size of the tree, then it
is a constant. Thus nCv is O(n).
- What is the lesson here? The main one is that good recursive
algorithms on trees are often much better than iterative ones.
- Traversals. Another common set of algorithms on trees are
traversals. Traversals are a way of visiting all the nodes of a
tree. A visit is defined as performing an operation on the node.
The simplest visit is printing out the element. There are two basic
traversals: Preorder and Postorder.
- The preorder visits a node, and then performs a pre-order
traversal on its children.
- Lets see...
- First we visit A, then we we do a pre-order on the children.
- So we visit B. There are no children, so we go back to A.
- We go down to C, which we visit. again no children.
- We go to D. We visit that, then where? H, then I. etc.
- The Visit Order is: A B C D H I E J K L F M N G O.
preorder(T,v)
vist(v)
for each w in children(v)
preorder(T,w)
- Pre order of a book section tree visits nodes in the order read.
- Post Order
- Post order visits a node AFTER doing a post-order of all the
children:
- B C H I D J K L E M N F O G A
postorder(T,v)
for each w in children(v)
postorder(T,w)
visit(v)
- Post order is useful for things where we want to visit the
parent last (duh). For example, lets say the the tree is a
directory tree, and leaf nodes are all files. We want to print
out the size of the director- the sum of the size of all files in
the directory (or in some descendant directory).
diskspace(T,v)
sz = 0;
if (isExternal(v))
sz = size(v)
for each w in children(v)
sz = sz + diskspace(T,w)
if (isInternal(v))
print(name(v) + ":" + sz)
return sz;
- Try it out on the tree. Give some numbers to the leaves and go.
- What about other kinds of traversals? Can we visit in between
traversals of various children? Well that's hard to define when the
number of children can vary. But if the number of children is always
fixed, then maybe...