We’re going to learn about two types of self-balancing trees in this class, defined as “trees that always meet some definition of balanced.” There’s a wide variety of these, and I think they’re pretty cool.
AVL Trees define “balanced” as “the heights of the subtrees of any node can differ by at most one.” Note, this is NOT the same as “the tree is as short as possible.” For example, consider this tree:
This tree is NOT complete (meaning totally filled in). However, by the definition of AVL trees, it IS balanced. The height of an AVL tree is still O(log (n)).
AVL trees keep themselves balanced by performing rotations. A tree rotation is a way of rearranging the nodes of a BST that changes the height, but does NOT change the binary-search-ness of "small things to the left, big things to the right." That's really important: the nodes in a BST are always in order, before and after every rotation operation. All a rotation changes is the levels of some of the nodes, which is what fixes up the height.
The following illustration demonstrates a left rotation on the root node, x:
In this illustration, the triangles A, B, C, D represent subtrees which might be empty (null), or they might contain 1 node, or a whole bunch of nodes. The important thing is that the subtrees don’t change, they just get moved around. In fact, there are only really two links that get changed in a rotation; they are highlighted in red in the image above.
Pseudocode for a left rotation is as follows:
```
Node lRotate(Node oldRoot) {       // oldRoot is node "x" in the image above.
    Node newRoot = oldRoot.right;  // newRoot is node "y" in the image above.
    Node middle = newRoot.left;    // middle is the subtree B
    newRoot.left = oldRoot;
    oldRoot.right = middle;
    return newRoot;
}
```

A right rotation on the root node is exactly the opposite of a left rotation:
To implement rRotate, you just take the code for lRotate, and swap right and left!
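Following that recipe, a compilable version of rRotate might look like the sketch below. The minimal Node class here is an assumption of mine; the real one would also carry keys and values.

```java
// Minimal node: just the links (a real Node would also hold a key and value).
class Node {
    Node left, right;
}

class Rotations {
    // Right rotation: lRotate with every "right" and "left" swapped.
    static Node rRotate(Node oldRoot) {
        Node newRoot = oldRoot.left;  // the mirror of oldRoot.right in lRotate
        Node middle = newRoot.right;  // the subtree that changes parents
        newRoot.right = oldRoot;
        oldRoot.left = middle;
        return newRoot;
    }
}
```

As in the left-rotation diagram, only two links change; every subtree moves as a unit.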
The idea that makes AVL trees efficient to implement is that each node is augmented to store the height of the subtree rooted at that node. An empty tree (null) has a height of -1, leaves have a height of 0, and any other node u has a height of 1 + max (u.left.height, u.right.height).
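That convention is easiest to keep straight with a small null-safe helper. A sketch, with class and method names (AvlNode, Heights) of my own choosing:

```java
// Node augmented with a cached subtree height, per the definition above.
class AvlNode {
    AvlNode left, right;
    int height; // height of the subtree rooted here; a leaf is 0
}

class Heights {
    // An empty tree (null) has height -1; this keeps the leaf case uniform.
    static int height(AvlNode u) {
        return (u == null) ? -1 : u.height;
    }

    // Recompute u.height from its children: 1 + max of their heights.
    static void updateHeight(AvlNode u) {
        u.height = 1 + Math.max(height(u.left), height(u.right));
    }
}
```

Treating null as height -1 means a freshly created leaf gets height 1 + max(-1, -1) = 0, exactly as defined.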
We also define balance, where balance(u) = u.left.height - u.right.height (remembering that an empty subtree has height -1).
If, at any point, balance becomes -2 or 2, we must perform at least one rotation to rebalance the tree. There are four cases to consider. When the balance is -2 (the right subtree is too tall):

- If the right subtree leans right (or is level): Left rotate at the root.
- If the right subtree leans left: Right rotate at the right subtree, then left rotate at the root. (This is called a "double rotation").

The other two cases, for a balance of 2, are the mirror images of the above two.
This is no different from find or get in any other binary search tree!
So far, we know that we can implement the Map ADT with an AVL tree, which is a special kind of BST which maintains a balance property at every node. And we know that the cost of get and set in a BST is O(height). If you think about it, each AVL rotation operation is O(1), and so even in an AVL tree, the get and set operations are O(height).
What makes AVL trees useful is that, unlike any old Binary Search Tree, the height of an AVL tree is guaranteed to be O(log (n)), where n is the number of nodes in the tree. That’s amazing! Here’s how you prove it:
Let \(N(h)\) be the minimum number of nodes in an AVL tree of height h. By the definition of an AVL tree we know: $$ \begin{align} N(h) &= N(h-1) + N(h-2) + 1\\ N(h-1) &> N(h-2)\text{, therefore}\\ N(h) &> 2N(h-2)\\ 2N(h-2)&> 4N(h-4)\text{, and so on}\\ N(h) &> 2^{i}N(h-2i)\text{, for all } i\\ N(0) &= 1\\ \text{Let } i &= \frac{h}{2}\\ N(h-2i) &= N(h - 2\tfrac{h}{2}) = N(0) = 1\\ N(h) &> 2^{\frac{h}{2}}\\ \log(N(h)) &>\frac{h}{2}\\ h &< 2\log(N(h)) \le 2\log n\text{, since } n \ge N(h) \end{align} $$
And wouldn’t you know it, that’s O(log n).
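The recurrence in the proof can be checked numerically. This sketch (the name minNodes is mine) computes \(N(h)\) directly from the recurrence, using the base cases \(N(0) = 1\) and \(N(1) = 2\):

```java
class MinAvl {
    // N(h): minimum number of nodes in an AVL tree of height h,
    // straight from the recurrence N(h) = N(h-1) + N(h-2) + 1.
    static long minNodes(int h) {
        if (h == 0) return 1; // a single leaf
        if (h == 1) return 2; // a root with one child
        return minNodes(h - 1) + minNodes(h - 2) + 1;
    }
}
```

For every height h >= 1, h really is less than 2 log2 N(h), matching the bound the proof derives.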
OK! So, AVL trees are guaranteed to be of O(log n) height! Does this mean that they’re necessarily fast? Well, get(k) is unchanged from normal BSTs, so get(k) is O(log n). That’s a great improvement over a normal BST! What about insert?
First, lRotate needs to keep the cached heights up to date:

```
Node lRotate(Node oldRoot) {       // oldRoot is node "x" in the image above.
    Node newRoot = oldRoot.right;  // newRoot is node "y" in the image above.
    Node middle = newRoot.left;    // middle is the subtree B
    newRoot.left = oldRoot;
    oldRoot.right = middle;
    update oldRoot.height          // this order of fixing oldRoot first is
    update newRoot.height          // important! See why?
    return newRoot;
}
```

With that bookkeeping in place, pseudocode for insert is:

```
private Node insert(Node n, K key, V value) {
    if n is null
        return new Node(key, value)  // this Node is a leaf, and so has a height of 0
    if key == n.key
        replace n.value with value
    else if key < n.key
        n.left = insert(n.left, key, value)
    else
        n.right = insert(n.right, key, value)
    n.height = 1 + max(n.left.height, n.right.height)
    if n is unbalanced
        perform necessary rotations
        return new root of this subtree
    return n
}
```

Since rotations are constant time, and the height of the tree is \( < 2\lg n\), the worst case for insert is \(O(\lg n)\). Both insert and get are now O(log n)!
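Putting the pieces together, the insert pseudocode can be fleshed out into a small runnable sketch. It is mine, not the course's reference code: it uses int keys only, omits values and remove, and names like AvlDemo are assumptions.

```java
class AvlDemo {
    static class Node {
        int key;
        Node left, right;
        int height; // height of the subtree rooted here; a leaf is 0
        Node(int key) { this.key = key; }
    }

    static int h(Node u) { return (u == null) ? -1 : u.height; }
    static void fix(Node u) { u.height = 1 + Math.max(h(u.left), h(u.right)); }
    static int balance(Node u) { return h(u.left) - h(u.right); }

    static Node lRotate(Node x) {
        Node y = x.right, middle = y.left;
        y.left = x;
        x.right = middle;
        fix(x); fix(y); // lower node first, as noted above
        return y;
    }

    static Node rRotate(Node x) {
        Node y = x.left, middle = y.right;
        y.right = x;
        x.left = middle;
        fix(x); fix(y);
        return y;
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);       // new leaf, height 0
        if (key < n.key)      n.left  = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        else return n;                             // duplicate: nothing to do here
        fix(n);                                    // recompute height on the way up
        if (balance(n) == 2) {                     // left-heavy
            if (balance(n.left) < 0) n.left = lRotate(n.left);
            n = rRotate(n);
        } else if (balance(n) == -2) {             // right-heavy
            if (balance(n.right) > 0) n.right = rRotate(n.right);
            n = lRotate(n);
        }
        return n;
    }
}
```

Inserting 1,000 keys in sorted order, the worst case for a plain BST, leaves this tree with height below 2 log2(1000) ≈ 20 instead of 999.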
Remove is also \(O(\lg n)\) for the same reasons.

| | Unsorted Array | Sorted Array | Unsorted Linked List | Sorted Linked List | Balanced Binary Search Tree |
|---|---|---|---|---|---|
| insert(k,e) | Amortized O(1) | O(n) | O(1) | O(n) | O(log(n)) |
| get(k) | O(n) | O(log n) | O(n) | O(n) | O(log(n)) |