by vijay | tags: [ learn-fp-in-sml functional-programming functional SML OCaml tutorial ]
Functional Algorithm Design - Balanced Trees
In the previous post we learned about binary trees
and how they can be used to implement search algorithms with 𝒪(log(n))
worst-case time complexity. But this performance
characteristic is lost for ordered data.
In this post we will develop a version of binary tree that knows how to keep its balance, in spite of the order in which elements are inserted. Search and insert operations on these trees will always maintain the logarithmic worst-case time complexity.
The type of binary trees that we will study here are known as red-black trees. In a red-black tree, every node with data is colored either red or black. So we have to augment the basic binary tree with this color bit:
datatype color = Red | Black
datatype 'a rbtree =
Empty
| Node of color * 'a rbtree * 'a * 'a rbtree
Searching
The search function for red-black trees is similar to that of the basic binary tree. It’s just that the color field should be ignored:
fun rbtree_search x Empty = false
| rbtree_search x (Node(_, left, v, right)) =
if x = v then true
else if x < v then rbtree_search x left
else rbtree_search x right
Inserting
The insert function is more complex because it needs to maintain the following two invariants:
- No red node should have a red child.
- Every path from the root to an empty node should have the same number of black nodes.
The following function can be called on a node to regain its balance:
fun balance ((Black, Node(Red, Node(Red, a, x, b), y, c), z, d)
| (Black, Node(Red, a, x, Node(Red, b, y, c)), z, d)
| (Black, a, x, Node(Red, Node(Red, b, y, c), z, d))
| (Black, a, x, Node(Red, b, y, Node(Red, c, z, d))))
= Node(Red, Node(Black, a, x, b), y, Node(Black, c, z, d))
| balance x = Node(x)
The balance
function basically fixes a violation of invariant 1, i.e when the parent of a new node is red.
This violation is fixed when balance
deals with the black parent of a red node with a red child.
This can come in four patterns, but the fix is same for all cases - rewrite the node red with two black children.
Now we can implement the function to insert a new node to a red-black tree:
fun rbtree_add x t =
let fun insert Empty = Node(Red, Empty, x, Empty)
| insert (t as Node(color, a, y, b)) =
if x < y then balance(color, insert a, y, b)
else if x > y then balance(color, a, y, insert b)
else t;
val Node(_, a, y, b) = insert t
in
Node(Black, a, y, b)
end
The rbtree_add
function is similar to the btree_add
function, with the following noticeable differences:
- The
balance
function is called for thex < y
andx > y
conditions to enforce the invariants. - The root of the final tree is always forced to be black.
The balance
function recursively balances to the top of the tree, but it might end up with a root red node with a red child.
This case is handled by always forcing the root to be black.
Here is how you would construct a red-black tree and perform searchs on it:
val t = rbtree_add 5 (rbtree_add 4 (rbtree_add 3 (rbtree_add 2 (rbtree_add 1 Empty))));
rbtree_search 5 t
(* val it = true : bool *)
rbtree_search 3 t
(* val it = true : bool *)
rbtree_search 0 t
(* val it = false : bool *)
As an exercise, can you change the rbtree_search
function to print the path it followed?
This should show you that the tree is really balanced.
An Optimization
A small optimization can be applied to balance
- it could be split into two separate functions for fixing violations involving left and
right children respectively.
fun lbalance ((Black, Node(Red, Node(Red, a, x, b), y, c), z, d)
| (Black, Node(Red, a, x, Node(Red, b, y, c)), z, d))
= Node(Red, Node(Black, a, x, b), y, Node(Black,c,z,d))
| lbalance x = Node(x)
fun rbalance ((Black, a, x, Node(Red, Node(Red, b, y, c), z, d))
| (Black, a, x, Node(Red, b, y, Node(Red, c, z, d))))
= Node(Red, Node(Black, a, x, b), y, Node(Black,c,z,d))
| rbalance x = Node(x)
The rbtree_add
function can be re-defined as follows. This revision will get rid of several unnecessary tests for the balancing act:
fun rbtree_add x t =
let fun insert Empty = Node(Red, Empty, x, Empty)
| insert (t as Node(color, a, y, b)) =
if x < y then lbalance(color, insert a, y, b)
else if x > y then rbalance(color, a, y, insert b)
else t;
val Node(_, a, y, b) = insert t
in
Node(Black, a, y, b)
end
References
This post is based on the red-black tree implementation from Purely Functional Data Structures by Chris Okasaki.
The Functional Approach to Programming shows a complete implementation of an AVL tree.
Both these books are great sources for SML programmers looking for practical implementations of functional algorithms.
Note that name and e-mail are required for posting comments