In a previous blog entry, I discussed the
difficulties associated with implementing leftleaning redblack trees. A couple of readers commented that
treaps might be superior to
redblack trees, and as part of some
recent jemalloc optimization work, I had occasion to implement
treaps in order to measure tree operation overhead.
Most of this article discusses performance, but let me first mention implementation difficulty. It took me about 90 hours to design/implement/test/benchmark leftleaning redblack trees, and less than 10 hours for
treaps. Search/insert/delete for redblack trees is O(lg n), versus O(n) for
treaps. However, the average case for
treaps is (lg n), and the chances of worst case behavior are vanishingly small, thanks to (pseudo)randomness. Thus, realworld performance differences are only incremental. To be fair, I made redblack trees harder by avoiding recursion. Regardless however,
treaps are
way easier to implement than redblack trees.
As for benchmarking, I wrote functionally identical benchmark programs for three redblack tree implementations and two
treap implementations. The tree implementations are:
 rb_new: Leftleaning redblack trees.
 rb_old: Standard redblack trees.
 RB: Standard redblack trees, as implemented by the *BSD sys/tree.h.
 trp_hash: Treaps, with priorities computed via pointer hashing.
 trp_prng: Treaps, with priorities computed via pseudorandom number generation (PRNG).
The benchmark programs iteratively generate permutations of
NNODES nodes, for
NSETS node sets. For each node set, the programs iteratively build and tear down a tree using the first [1..
NNODES] nodes in the set. Each insert/remove operation is accompanied by
NSEARCH rounds of searching for every object in the tree, and NITER rounds of iterating over every object in the tree. Don't worry too much about the details; in short the benchmark programs can be configured to predominantly benchmark insertion/deletion, searching, and/or iteration.
The following table summarizes benchmark results as measured on a 2.2 GHz
amd64
Ubuntu 8.10 system. The benchmarks were all compiled with "
gcc O3", and the times are user+system time (fastest of three runs):
NNODES, NSETS, NSEARCH, NITER (focus)  rb_new  rb_old  RB  trp_hash  trp_prng 


1500,25,0,0 (ins/del)  7.60  3.99  4.25  17.57  7.58 
125,25,10,0 (search)  17.74  18.61  16.60  17.84  17.77 
250,25,0,100 (iteration)  18.45  21.06  19.19  20.45  20.40 
Insertion/deletion is fastest for the redblack tree implementations that do lazy
fixup (
rb_old and RB).
rb_new uses a singlepass algorithm, which requires more work.
trp_
prng is about the same speed as
rb_new, but
trp_hash is
way slower, due to the repeated hash computations that are required to avoid explicitly storing node priorities.
Search performance is similar for all implementations, which indicates that there are no major disparities in tree balance.
Iteration performance is similar for all implementations, even though they use substantially different algorithms. If tree size were much larger,
rb_old and RB would suffer, since they use an O(n lg n) algorithm, whereas
rb_new and
trp_* use O(n) algorithms.
rb_new uses a rather complicated iterative algorithm, but
trp_* use recursion and callback functions due to the weak upper bound on tree depth.
Sadly, there is no decisive winner, though any of the five tree implementations is perfectly adequate for the vast majority of applications. The winners according to various criteria are:
 Space: rb_new and trp_hash.
 Speed (insertion/deletion): rb_old and RB.
 Ease of implementation: trp_prng.