Red-Black Trees
October 2, 2009
A red-black tree is a data structure, similar to a binary tree, which is always approximately balanced, so that individual insert and lookup operations take only O(log n) time. Red-black trees are popular because of their good performance and the relative simplicity of their balancing operations. Our discussion of red-black trees is drawn from Section 3.3 of Chris Okasaki’s book Purely Functional Data Structures.
A red-black tree is a binary search tree in which each node is colored either red or black. A red-black tree maintains two invariants that ensure its balance:
- No red node ever has a red child.
- Every path from the root to an empty node has the same number of black nodes.
Thus, the shortest possible path has only black nodes, and the longest possible path has alternating red and black nodes, so the longest path is never more than twice as long as the shortest path, and the tree is approximately balanced.
Lookup in red-black trees is identical to its binary-tree counterpart; the colors make no difference. The balance condition is maintained by the insert operation. Each new node is initially colored red. If its parent is black, the tree remains balanced, and nothing need be done. However, if its parent is red, the first invariant is violated, and a balancing function is called to repair the violation by rewriting the black-red-red path as a red node with two black children. This may propagate the invariant up the tree, so the balancing function is called recursively until it reaches the root of the tree, which is always recolored black.
Your task is to write functions to maintain red-black trees; you should provide an insert function, a lookup function, and an enlist function that returns a list with the nodes of the tree in order. Each node should contain a key and a value, so the red-black tree can be used as a dictionary, in the manner of the treaps and ternary search tries that we have written in previous exercises. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.