Unit 4: Trees

This is the archived website of IC 312 from the Fall 2014 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

**Credit**:
Gavin Taylor for the
original version of these notes.

First, the mathematical definition of a Tree. A *Graph* is any set of
vertices and a set of edges that connect vertices. A cycle is a path in a
graph such that you can arrive back at a vertex without re-using any edges.
A connected graph, is a graph where a path exists between all possible pairs
of vertices. Finally, any connected graph that has no cycles is a Tree.

Well, that's not very interesting. Here's an example.

This is known as a decision tree. You see that a single node is at the top
("Does it move?"). This is the root. The root has two branches, to "No,
should it?" and "Yes, should it?" Each of nodes also has two branches, one to
"No/Yes, No problem," and the other to "No/Yes, WD-40/Duct Tape."
Since every node in the three has two branches, it's called a
*binary tree*.

Some definitions:

**Node**: An element of the tree.**Root**: The single element at the top of the tree.**Parent**: A node's parent is the node immediately above it. Every node except the root has exactly one parent.**Child**: A node's children are the nodes immediately below it. In a binary tree, every node has at most 2 children.**Sibling**: Two nodes are siblings if they have the same parent.**Ancestor**: Parent, or parent's parent, or...**Leaf**: A node with no children.**Internal Node**: A node with at least one child. Every node is either a leaf or an internal node.**Subtree**: A portion of a tree that is, itself, a tree.**Depth**: The depth of a node is the number of edges that have to be traversed to encounter the root.**Height**: The most number of edges on any path from the root to a leaf. Same as the maximum depth of any node in the tree. A tree with only one node has height 0. An empty tree (where the root is null) has height -1.**Branching Factor**: Maximum number of children for a node in a tree. A binary tree has branching factor 2.

So what does this look like in Java? We'll write this for a binary tree, or a tree where each node has up to two children, with some methods that might be useful.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | public class Tree<T> { private class Node { public Node left = null; public Node right = null; public T data; public Node(T data) { this.data=data; } } private Node root = null; // constructors, some other methods too... public int height() { return heightFrom(root); } private int heightFrom(Node uu) { if (uu == null) return -1; int lh = heightFrom(uu.left); int rh = heightFrom(uu.right); return Math.max(lh, rh) + 1; } } |

Just like we needed to know how to traverse a Linked List, we need to know how to traverse a Tree. In a Linked List, the work we were doing to each Node could come either before or after the recursive call; this determined whether that work was done from the front of the list to the back, or from the back to the front. We have a similar concept here for trees, we just have more options.

**Preorder Traversal**: In preorder traversal, we have the rule that the
node we're looking at must be visited before its children; there are no rules
as to the order of its children, or grandchildren. In the following code, I'm
going to use `process()`

in place of whatever work you're doing to each
node (maybe you're printing each data point, or whatever). Preorder traversal
looks like this:

1 2 3 4 5 | preorder(Node uu) { process(uu.data) for each child cc of uu: preorder(cc) } |

**Postorder Traversal**: In postorder traversal, the node we're looking
at is visit after its children. It looks like this:

1 2 3 4 5 | postorder(Node uu) { for each child cc of uu: postorder(cc) process(uu.data) } |

**Inorder Traversal**: We're going to deal with a lot of binary trees in
this class. With a binary tree, we also have the option of Inorder Traversal,
which is where you visit the left subtree first, then this node, then the
right subtree.

1 2 3 4 5 | inorder(Node uu) { inorder(uu.left) process(uu.data) inorder(uu.right) } |

Consider the following tree:

These traversals would visit the nodes in the following order:

**Preorder**: A B C D F E

**Inorder**: B A F D C E

**Postorder**: B F D E C A

**Level-order Traversal**: Level order traversal is easy to do
visually, but harder to do computationally. Unlike preorder and postorder
traversals which are *depth-first* in nature (they descend all the
way to the leaves of one subtree before moving on to the next), a
level-order traversal is *breadth-first*. And in order to search
in a breadth-first manner, we need a queue:

1 2 3 4 5 6 7 8 9 10 | levelOrder() { nodesQueue = new Queue of Nodes nodesQueue.enqueue(root) while nodesQueue is not empty: uu = nodesQueue.dequeue() if uu is not null: process(uu.data) enqueue(uu.left) enqueue(uu.right) } |

The level-order traversal of the example tree above would be A B C D E F.

We've learned a lot of things about trees, but we haven't actually put any data in them yet. In a Binary Search Tree, we require that our data be sortable, and the rule is this: everything in the left subtree of a node is less than the node's data, and everything in the right subtree is greater. Pretty simple, right?

Here's some code to make that happen, if the tree is storing `int`

s.
`insert()` adds data to the
tree, and `find()` returns a boolean as to whether that data appears in
the tree.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | public class BSTofInts { private class Node { public Node left = null; public Node right = null; public int data; public node(int data) { this.data = data; } } private Node root = null; public boolean find(int needle) { return findFrom(root, needle); } private boolean findFrom(Node uu, int needle) { if (uu == null) return false; else if (needle == uu.data) return true; else if (needle < uu.data)) return findFrom(uu.left, needle); else return findFrom(uu.right, needle); } public void insert(int element) { root = insertBelow(root, element); } private Node insertBelow(Node uu, int element) { if (uu == null) return new Node(element); if (element <= uu.data) uu.left = insertBelow(uu.left, element); else uu.right = insertBelow(uu.right, element); return uu; } } |

Note that we do not know the height of a BST, except to say that it is
somewhere between O(log(n)) and O(n), which is a massive range. The best we
can say, is that both `find`

and `insert`

are O(height)
in the worst case.