2-4 Trees
This page contains course notes for Advanced Data Structures taught in
the Autumn of 2004.
- We established that binary search trees give us good run-times if
we can keep them complete.
- We could keep a tree complete by dumping everything out of the tree
and rebuilding it from scratch every time we insert something.
- But that takes O(n) time per operation, plus extra space. Wouldn't
it be nice if we could find a way to keep a tree complete (or close to
complete) by making simple alterations each time we add or delete
something?
- There are multiple ways to do this, such as splay trees, AVL
trees, and, most commonly, red-black trees. Before we talk about
red-black trees, however, we need to talk about a new kind of tree: the
2-4 tree.
- A 2-4 tree is a search tree with special kinds of nodes. A node can
hold one, two, or three items and has anywhere from 2 to 4 children
(one more than its number of items). Here are some example nodes:
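One way to picture these nodes in code is a minimal sketch like the following (the class and field names are illustrative, not part of the notes):

```python
# A minimal sketch of a 2-4 tree node: a sorted list of 1-3 keys,
# plus either no children (leaf) or one more child than keys.
class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)          # 1 to 3 sorted keys
        self.children = children or []  # 0 (leaf) or len(keys)+1 children

    def is_leaf(self):
        return not self.children

    def kind(self):
        # a node holding k keys has k+1 children: a 2-, 3-, or 4-node
        return len(self.keys) + 1

two = Node([5])        # a 2-node: one key, up to two children
three = Node([3, 8])   # a 3-node: two keys, up to three children
four = Node([2, 5, 9]) # a 4-node: three keys, up to four children
```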
- We will use these nodes to build a tree with the following rules:
- For 2-nodes:
- ∀m,n LeftDescendant(m,n) → (∀ k ∈ keys(m) → k < key(n))
- ∀m,n RightDescendant(m,n) → (∀ k ∈ keys(m) → k > key(n))
- For 3-nodes:
- ∀m,n LeftDescendant(m,n) → (∀ k ∈ keys(m) → k < leftKey(n))
- ∀m,n MiddleDescendant(m,n) → (∀ k ∈
keys(m) → (k > leftKey(n)) ∧ (k < rightKey(n)))
- ∀m,n RightDescendant(m,n) → (∀ k ∈ keys(m) → k > rightKey(n))
- For 4-nodes:
- ∀m,n LeftDescendant(m,n) → (∀ k ∈ keys(m) → k < leftKey(n))
- ∀m,n MiddleLeftDescendant(m,n) → (∀ k ∈
keys(m) → (k > leftKey(n)) ∧ (k < middleKey(n)))
- ∀m,n MiddleRightDescendant(m,n) → (∀ k ∈
keys(m) → (k > middleKey(n)) ∧ (k < rightKey(n)))
- ∀m,n RightDescendant(m,n) → (∀ k ∈
keys(m) → k > rightKey(n))
- All leaf nodes are at the same depth.
- The ordering rules make sure the tree operates as a search tree,
and the last rule makes sure the tree stays perfectly height-balanced
at all times.
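The ordering rules above can be sketched as a recursive check. This is a sketch under the assumption that a node is a `(keys, children)` pair; it checks only the ordering rules, not the equal-depth rule:

```python
# Verify the 2-4 tree ordering rules: keys are sorted, each node has
# one more child than keys (or none), and every key in a subtree lies
# strictly between the separator keys surrounding that subtree.
def valid(node, lo=float('-inf'), hi=float('inf')):
    keys, children = node
    if keys != sorted(keys) or any(not (lo < k < hi) for k in keys):
        return False
    if not children:
        return True                      # leaf: nothing more to check
    if len(children) != len(keys) + 1:
        return False
    bounds = [lo] + keys + [hi]          # separators for each subtree
    return all(valid(c, bounds[i], bounds[i + 1])
               for i, c in enumerate(children))

valid(([6, 10], [([2, 4], []), ([8], []), ([12, 14], [])]))  # True
```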
- Here's an example:
- So, what's the height of this thing? Well, in the worst case (the
tree is as tall as possible) all nodes are 2-nodes. What do we have
when all nodes are 2-nodes? A binary tree! And we already know that
the height of a complete binary tree is O(log(n)).
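As a quick sanity check on the worst case, here is the arithmetic as a tiny sketch (the function name is just for illustration):

```python
import math

# Worst case: every node is a 2-node (one key each), so the tree is a
# perfect binary tree. A perfect binary tree of height h holds
# 2^(h+1) - 1 keys, so n keys need height ceil(log2(n+1)) - 1.
def worst_case_height(n):
    return math.ceil(math.log2(n + 1)) - 1

worst_case_height(7)   # 7 keys fill 3 levels of 2-nodes -> height 2
```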
- Interesting exercise: what is the BEST height of the tree?
- So how does find() work? Just like in BSTs, except we may compare
against up to three keys per node. Thus it is O(log(n)).
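The search step can be sketched as follows, again assuming a node is a `(keys, children)` pair (the example keys are illustrative):

```python
# Sketch of find() on a 2-4 tree: scan the node's keys in order; on a
# match we're done, otherwise descend into the child whose key range
# contains the search key.
def find(node, key):
    keys, children = node
    for i, k in enumerate(keys):
        if key == k:
            return True
        if key < k:
            # descend into the child just left of this key (if any)
            return find(children[i], key) if children else False
    # key is larger than every key here: try the rightmost child
    return find(children[-1], key) if children else False

root = ([6, 10], [([2, 4], []), ([8], []), ([12, 14, 16], [])])
find(root, 14)  # True
find(root, 7)   # False
```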
- How does insert work?
- This is a little tricky because we can't just add a node onto the
end: that would break the full-tree property. But since nodes can
hold multiple items, most of the time we just go to the right leaf
node and stick the item in there.
- Not bad, how long did that take? O(log(n)).
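The easy case, inserting into a leaf that still has room, is just a sorted insert into the key list. A minimal sketch (the function name is illustrative):

```python
import bisect

# Sketch: a leaf with fewer than 3 keys has room, so inserting is just
# placing the new key in sorted order; no restructuring needed.
def insert_into_leaf(keys, key):
    assert len(keys) < 3, "leaf is full; a split would be needed"
    bisect.insort(keys, key)
    return keys

insert_into_leaf([3, 8], 5)  # [3, 5, 8]
```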
- But what if we go to add an item and the node is full? Then what
do we do? We move one of the elements up, and split the node:
- That promotion just took constant time.
- What if we promote, but there's no room above? Just promote from
there:
- Just keep promoting up until there's room!
- What if we get to the root and there's no room? When we promote,
make a new node and make that the root.
- How long does that take? In the worst case, we need to promote from
the leaves all the way to the root, which is O(log(n)).
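The split-and-promote step can be sketched like this (names illustrative): a full 4-node holds three keys, the middle key is promoted to the parent, and the node becomes two 2-nodes.

```python
# Sketch of splitting a full (4-) node: promote the middle key and
# split the remaining keys into two new 2-nodes.
def split(keys):
    assert len(keys) == 3, "only a full 4-node gets split"
    left, middle, right = keys
    # (new left node, key promoted to the parent, new right node)
    return [left], middle, [right]

split([2, 5, 9])  # ([2], 5, [9])
```

If the parent is itself full after receiving the promoted key, the same split applies there, which is exactly the "keep promoting up" rule above.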
- OK, we can insert in O(log(n)) and keep the tree complete. But what
about deleting?
- A typical delete works just like in a BST: swap the item with
the next one at some leaf location, then delete the item at the
leaf.
- The item we swap with is still the left-most item of the right
subtree (the in-order successor).
- If the item to be removed at a leaf is in a 3 or 4 node, we can
just pull it out, and voila! We're done.
- Of course if we delete from a 2 node, then we have a problem.
Sometimes we can fix that by taking an item from the parent and
moving it down, and taking an item from an adjacent sibling and
moving it up to the parent:

Note that the dummy node a was actually transferred from one
place to another. That's important later.
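The transfer (sometimes called a borrow) can be sketched on bare key lists, assuming the parent, the sibling, and the underflowing node are given separately (all names illustrative): the separator key drops down from the parent, and a spare key from the sibling moves up to replace it.

```python
# Sketch of a transfer fixing an underflowing (empty) node: the
# parent's separator key drops down into the empty node, and the
# sibling's nearest spare key moves up into the parent.
def transfer(parent_keys, sep_index, sibling_keys, empty_keys,
             sibling_on_left=True):
    if sibling_on_left:
        empty_keys.insert(0, parent_keys[sep_index])
        parent_keys[sep_index] = sibling_keys.pop()    # largest key up
    else:
        empty_keys.append(parent_keys[sep_index])
        parent_keys[sep_index] = sibling_keys.pop(0)   # smallest key up
    return parent_keys, sibling_keys, empty_keys

transfer([10], 0, [3, 6], [], sibling_on_left=True)
# -> ([6], [3], [10])
```

Note that no node is created or destroyed here, which is why this case takes only constant time.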
- Sometimes when we delete, there are no free items in adjacent
siblings. Instead we drop down an element from the parent, and since
the parent now has few items, it needs fewer children, so we merge
with a sibling:
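The merge can be sketched the same way (names illustrative): the separator key is removed from the parent and fused together with the sibling's keys into one node.

```python
# Sketch of a merge: when no sibling has a spare key, the parent's
# separator drops down and the two adjacent nodes fuse into one.
def merge(parent_keys, sep_index, left_keys, right_keys):
    sep = parent_keys.pop(sep_index)            # parent loses one key
    return parent_keys, left_keys + [sep] + right_keys

merge([10, 20], 0, [5], [])
# -> ([20], [5, 10])
```

Because the parent loses a key, the underflow may now repeat one level up, which is exactly the cascading case discussed next.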
- Piece of cake, huh? Well, what if the parent node is also a 2-node?
Then what do we do? Well, if the parent has immediate siblings
with some extra items, we can do a transfer:



- What if the parent node is also a 2-node, AND it has no siblings
we can perform a transfer with? Then we do the same as before: drop
down from the parent, and merge with a sibling:


- What if we get all the way to the root? Well then, we have an
empty root. Would just getting rid of that node be a problem? Why
not? So we're done.
- How long does that take? In the worst case, we go up from a leaf all
the way to the root of the tree, which we know is just O(log(n)).
- That's a lot of work, but we now have a tree that is guaranteed to
always be balanced! We know that all our operations are guaranteed to
be O(log(n)) w00t!