2-4 Trees

This page contains course notes for Advanced Data Structures taught in the Autumn of 2004. If you have come here from a search engine or the like, you may wish to visit the course home page for more material, or visit my general teaching page for possibly more up to date versions.
  1. We established that binary search trees give us good run-times if we can keep them complete.
  2. We could keep a tree complete by, every time we insert something, dumping everything out of the tree and rebuilding it from scratch.
  3. But that takes O(n) time per insertion, plus extra space. Wouldn't it be nice if we could keep a tree complete (or close to complete) by making simple local alterations each time we add or delete something?
  4. There are multiple ways to do this, such as splay trees, AVL trees, and, most commonly, red-black trees. Before we talk about red-black trees, however, we need to talk about a new kind of tree: the 2-4 tree.
  5. A 2-4 tree is a search tree with special kinds of nodes. These nodes can hold multiple items (one, two, or three), and an internal node has one more child than it has items, so anywhere from 2 to 4 children. Here are some example nodes:
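The example nodes might be sketched in code like this (a hypothetical dict representation, not from the notes): a node holds 1-3 sorted keys, and an internal node has exactly one more child than it has keys.

```python
# Hypothetical node layout: {"keys": sorted keys, "children": child nodes}.
two_node   = {"keys": [10],         "children": []}  # 2 children if internal
three_node = {"keys": [10, 20],     "children": []}  # 3 children if internal
four_node  = {"keys": [10, 20, 30], "children": []}  # 4 children if internal

# An internal 2-node: one key, two children, left keys < 25 < right keys.
internal = {"keys": [25], "children": [
    {"keys": [10, 20],     "children": []},
    {"keys": [30, 40, 50], "children": []},
]}
```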
  6. We will use these nodes to build a tree with the following rules:
    1. For 2-nodes:
      1. ∀m,n LeftDescendant(m,n) → (∀ k ∈ keys(m) → k < key(n))
      2. ∀m,n RightDescendant(m,n) → (∀ k ∈ keys(m) → k > key(n))
    2. For 3-nodes:
      1. ∀m,n LeftDescendant(m,n) → (∀ k ∈ keys(m) → k < leftKey(n))
      2. ∀m,n MiddleDescendant(m,n) → (∀ k ∈ keys(m) → (k > leftKey(n)) ∧ (k < rightKey(n)))
      3. ∀m,n RightDescendant(m,n) → (∀ k ∈ keys(m) → k > rightKey(n))
    3. For 4-nodes:
      1. ∀m,n LeftDescendant(m,n) → (∀ k ∈ keys(m) → k < leftKey(n))
      2. ∀m,n MiddleLeftDescendant(m,n) → (∀ k ∈ keys(m) → (k > leftKey(n)) ∧ (k < middleKey(n)))
      3. ∀m,n MiddleRightDescendant(m,n) → (∀ k ∈ keys(m) → (k > middleKey(n)) ∧ (k < rightKey(n)))
      4. ∀m,n RightDescendant(m,n) → (∀ k ∈ keys(m) → k > rightKey(n))
    4. All leaf nodes are at the same depth.
  7. The ordering rules make sure the tree operates as a search tree, and the last rule makes sure the tree stays perfectly balanced at all times: every leaf is at the same depth.
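The ordering rules above can be checked mechanically. Here is a sketch of a checker over the assumed dict node layout from earlier: every key in child i must lie strictly between keys[i-1] and keys[i] of its parent, which we track as a (lo, hi) window while descending.

```python
# Sketch of an ordering-rule checker for the assumed node layout
# {"keys": [...], "children": [...]}.
def is_ordered(node, lo=float("-inf"), hi=float("inf")):
    # Every key in this node must fall inside the window (lo, hi).
    if not all(lo < k < hi for k in node["keys"]):
        return False
    # Child i's window is bounded by keys[i-1] below and keys[i] above.
    bounds = [lo] + node["keys"] + [hi]
    return all(is_ordered(c, bounds[i], bounds[i + 1])
               for i, c in enumerate(node["children"]))

good = {"keys": [25], "children": [
    {"keys": [10, 20],     "children": []},
    {"keys": [30, 40, 50], "children": []},
]}
bad = {"keys": [25], "children": [
    {"keys": [10, 30], "children": []},   # 30 is on the wrong side of 25
    {"keys": [40, 50], "children": []},
]}
```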
  8. Here's an example:
  9. So, what's the height of this thing? Well, in the worst case (the tree is as tall as possible), all nodes are 2-nodes. What do we have when all nodes are 2-nodes? A binary tree! And we already know that the height of a complete binary tree is O(log(n)).
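A quick sanity check of the worst-case height claim: with all 2-nodes the tree is a complete binary tree, so a tree of height h holds 2^(h+1) - 1 keys, giving height log2(n + 1) - 1 = O(log n).

```python
import math

# With all 2-nodes (the tallest case), a tree of height h is a complete
# binary tree holding 2**(h+1) - 1 keys, so h = log2(n + 1) - 1.
for h in range(5):
    n = 2 ** (h + 1) - 1   # keys in a complete binary tree of height h
    assert math.log2(n + 1) - 1 == h
```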
  10. Interesting exercise: what is the BEST (smallest) possible height of the tree?
  11. So how does find() work? Just like in a BST: compare against the keys in a node and descend into the one child whose range could contain the item. Thus it is O(log(n)).
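A sketch of find() on the assumed dict node layout from earlier: at each node, locate the key's slot among the sorted keys; either the key is there, or we descend into the one child whose range can contain it, which is exactly the BST search, generalized.

```python
from bisect import bisect_left

def find(node, key):
    # Walk down: bisect_left gives both the position to check for a match
    # and the index of the child to descend into.
    while node is not None:
        i = bisect_left(node["keys"], key)
        if i < len(node["keys"]) and node["keys"][i] == key:
            return True
        node = node["children"][i] if node["children"] else None
    return False

tree = {"keys": [20], "children": [
    {"keys": [5, 10],      "children": []},
    {"keys": [30, 40, 50], "children": []},
]}
```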
  12. How does insert work?
    1. This is a little funny because we can't just hang a new node off the bottom: that would break the equal-depth property. But since nodes can hold multiple items, most of the time we just walk down to the right leaf node and stick the new item in there.
    2. Not bad, how long did that take? O(log(n)).
    3. But what if we go to add an item and the node is full? Then what do we do? We move one of the elements up, and split the node:
    4. That promotion just took constant time.
    5. What if we promote, but there's no room above? Just promote from there:
    6. Just keep promoting up until there's room!
    7. What if we get to the root and there's no room? When we promote, make a new node and make that the root.
    8. How long does that take? In the worst case, we need to promote all the way from the leaves to the root, which is O(log(n)).
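The whole insert-and-promote process above can be sketched as follows, again on the assumed dict node layout. The helper returns either the node itself, or a triple (promoted_key, left, right) when the node overflowed and split; the wrapper handles the root-split case by making a new root.

```python
from bisect import bisect_left

def _insert(node, key):
    if not node["children"]:                  # leaf: just put the key in
        node["keys"].insert(bisect_left(node["keys"], key), key)
    else:                                     # internal: recurse, absorb any split
        i = bisect_left(node["keys"], key)
        result = _insert(node["children"][i], key)
        if isinstance(result, tuple):
            mid, left, right = result
            node["keys"].insert(i, mid)       # promoted key lands here
            node["children"][i:i + 1] = [left, right]
    if len(node["keys"]) > 3:                 # overflow: split, promote a middle key
        left  = {"keys": node["keys"][:1], "children": node["children"][:2]}
        right = {"keys": node["keys"][2:], "children": node["children"][2:]}
        return (node["keys"][1], left, right)
    return node

def insert(root, key):
    result = _insert(root, key)
    if isinstance(result, tuple):             # the root itself split:
        mid, left, right = result             # grow the tree with a new root
        return {"keys": [mid], "children": [left, right]}
    return root
```

Notice that the tree only gains height at the root, which is why every leaf stays at the same depth.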
  13. OK, we can insert in log(n) and keep the tree complete. But what about deleting?
    1. A typical delete works just like in a BST: swap the item with its in-order successor, which sits at a leaf, then delete the item at the leaf.
    2. The item we swap with is still the left-most item of the right subtree.
    3. If the item to be removed at a leaf is in a 3-node or 4-node, we can just pull it out, and voila! We're done.
    4. Of course, if we delete from a 2-node, then we have a problem. Sometimes we can fix that by taking an item from the parent and moving it down, and taking an item from an adjacent sibling and moving it up to the parent:

      Note that the dummy node a was actually transferred from one place to another. That's important later.
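That transfer step can be sketched on the assumed dict node layout from earlier: the underfull child borrows through the parent, so the separator key rotates down and the sibling's nearest key rotates up.

```python
def transfer_from_right(parent, i):
    child = parent["children"][i]        # the underfull node
    sib   = parent["children"][i + 1]    # a sibling with a key to spare
    child["keys"].append(parent["keys"][i])    # separator drops down
    parent["keys"][i] = sib["keys"].pop(0)     # sibling's smallest moves up
    if sib["children"]:                  # an internal sibling also hands over
        child["children"].append(sib["children"].pop(0))  # its first subtree

parent = {"keys": [20], "children": [
    {"keys": [],       "children": []},  # just lost its only key
    {"keys": [30, 40], "children": []},  # a 3-node that can spare one
]}
transfer_from_right(parent, 0)
```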
    5. Sometimes when we delete, there are no free items in the adjacent siblings. Instead we drop an element down from the parent, and since the parent now has fewer items, it needs fewer children, so we merge with a sibling:
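The merge step, on the same assumed dict layout: with no key to borrow, the separator drops out of the parent and glues the underfull node to its sibling, leaving the parent with one fewer child.

```python
def merge_with_right(parent, i):
    child = parent["children"][i]              # the underfull node
    sib = parent["children"].pop(i + 1)        # its 2-node right sibling
    child["keys"].append(parent["keys"].pop(i))    # separator drops down
    child["keys"] += sib["keys"]               # absorb the sibling's key...
    child["children"] += sib["children"]       # ...and its children

parent = {"keys": [20, 50], "children": [
    {"keys": [],       "children": []},  # just lost its only key
    {"keys": [30],     "children": []},  # 2-node sibling: nothing to spare
    {"keys": [60, 70], "children": []},
]}
merge_with_right(parent, 0)
```

If the parent itself becomes underfull, the same fix-up repeats one level higher, which is exactly the cascade described in the following steps.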
    6. Piece of cake, huh? Well, what if the parent node is also a 2-node? Then what do we do? Well, if the parent has immediate siblings with some extra items, we can do a transfer:



    7. What if the parent node is also a 2-node, AND it has no siblings we can perform a transfer with? Then we do the same as before: drop an item down from the parent, and merge with a sibling:


    8. What if we get all the way to the root? Well then, we have an empty root with a single child. Would just getting rid of that node be a problem? No: its one child becomes the new root, every leaf loses one level of depth together, and the tree stays balanced. So we're done.
    9. How long does that take? In the worst case, we go up from a leaf all the way to the root of the tree, which we know is just O(log(n)).
  14. That's a lot of work, but we now have a tree that is guaranteed to always be balanced! We know that all our operations are guaranteed to be O(log(n)). w00t!