Credit: Gavin Taylor for the original version of these notes.
Beginning of Class 13
First, the mathematical definition of a Tree. A Graph is any set of vertices and a set of edges that connect vertices. A cycle is a path in a graph such that you can arrive back at a vertex without re-using any edges. A connected graph, is a graph where a path exists between all possible pairs of vertices. Finally, any connected graph that has no cycles is a Tree.
Well, that's not very interesting. Here's an example.
This is known as a decision tree. You see that a single node is at the top ("Does it move?"). This is the root. The root has two branches, to "No, should it?" and "Yes, should it?" Each of nodes also has two branches, one to "No/Yes, No problem," and the other to "No/Yes, WD-40/Duct Tape." Since every node in the three has two branches, it's called a binary tree.
- Node: An element of the tree.
- Root: The single element at the top of the tree.
- Parent: A node's parent is the node immediately above it. Every node except the root has exactly one parent.
- Child: A node's children are the nodes immediately below it. In a binary tree, every node has at most 2 children.
- Sibling: Two nodes are siblings if they have the same parent.
- Ancestor: Parent, or parent's parent, or...
- Leaf: A node with no children.
- Internal Node: A node with at least one child. Every node is either a leaf or an internal node.
- Subtree: A portion of a tree that is, itself, a tree.
- Depth: The depth of a node is the number of edges that have to be traversed to encounter the root.
- Height: The most number of edges on any path from the root to a leaf. Same as the maximum depth of any node in the tree. A tree with only one node has height 0. An empty tree (where the root is null) has height -1.
- Branching Factor: Maximum number of children for a node in a tree. A binary tree has branching factor 2.
So what does this look like in Java? We'll write this for a binary tree, or a tree where each node has up to two children, with some methods that might be useful.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
class Node: def __init__(self, data): self.data = data self.left = None self.right = None class Tree: def __init__(self): self.root = None def height(self): return self._height(self.root) def _height(self, curnode): if curnode is None: return -1 else: lh = self._height(curnode.left) rh = self._height(curnode.right) return 1 + max(lh, rh)
Beginning of Class 14
Just like we needed to know how to traverse a Linked List, we need to know how to traverse a Tree. In a Linked List, the work we were doing to each Node could come either before or after the recursive call; this determined whether that work was done from the front of the list to the back, or from the back to the front. We have a similar concept here for trees, we just have more options.
Preorder Traversal: In preorder traversal, we have the rule that the
node we're looking at must be visited before its children; there are no rules
as to the order of its children, or grandchildren. In the following code, I'm
going to use
process() in place of whatever work you're doing to each
node (maybe you're printing each data point, or whatever). Preorder traversal
looks like this:
1 2 3 4 5
def preorder(curnode): process(curnode.data) preorder(curnode.left) preorder(curnode.right) }
Postorder Traversal: In postorder traversal, the node we're looking at is visit after its children. It looks like this:
1 2 3 4 5
def postorder(curnode): postorder(curnode.left) postorder(curnode.right) process(curnode.data) }
Inorder Traversal: We're going to deal with a lot of binary trees in this class. With a binary tree, we also have the option of Inorder Traversal, which is where you visit the left subtree first, then this node, then the right subtree.
1 2 3 4 5
def inorder(curnode): inorder(curnode.left) process(curnode.data) inorder(curnode.right) }
Consider the following tree:
These traversals would visit the nodes in the following order:
Preorder: A B C D F E
Inorder: B A F D C E
Postorder: B F D E C A
Level-order Traversal: Level order traversal is easy to do visually, but harder to do computationally. Unlike preorder and postorder traversals which are depth-first in nature (they descend all the way to the leaves of one subtree before moving on to the next), a level-order traversal is breadth-first. And in order to search in a breadth-first manner, we need a queue:
1 2 3 4 5 6 7 8 9 10
def levelorder(root): nodesq = new Queue of Nodes nodesq.enqueue(root) while nodesq is not empty: curnode = nodesq.dequeue() if curnode is not null: process(curnode.data) enqueue(curnode.left) enqueue(curnode.right) }
The level-order traversal of the example tree above would be A B C D E F.
Beginning of Class 15
Binary Search Trees
We've learned a lot of things about trees, but we haven't actually put any data in them yet. In a Binary Search Tree, we require that our data be sortable, and the rule is this: everything in the left subtree of a node is less than the node's data, and everything in the right subtree is greater. Pretty simple, right?
Here's some code to make that happen, if the tree is storing
insert() adds data to the
tree, and find() returns a boolean as to whether that data appears in
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
class Node: def __init__(self, data): self.data = data self.left = None self.right = None class BST: def __init__(self): self.root = None def find(self, needle): return self._find(needle, self.root) def _find(self, needle, curnode): if curnode is None: return False elif needle < curnode.data: return self._find(needle, curnode.left) elif needle > curnode.data: return self._find(needle, curnode.right) else: # needle must be equal to curnode.data return True def insert(self, element): self.root = self._insert(element, self.root) def _insert(self, element, curnode) if curnode is None: return Node(element) elif element < curnode.data: curnode.left = self._insert(element, curnode.left) return curnode else: curnode.right = self._insert(element, curnode.right) return curnode
Note that we do not know the height of a BST, except to say that it is
somewhere between O(log(n)) and O(n), which is a massive range. The best we
can say, is that both
insert are O(height)
in the worst case.