Splay Heaps
January 22, 2013
We store a heap element in a three-slot vector and provide functions to access a heap element:
(define heap vector) ; lkid item rkid
(define (lkid h) (vector-ref h 0))
(define (item h) (vector-ref h 1))
(define (rkid h) (vector-ref h 2))
The empty heap has children that point to itself recursively:
(define empty (heap 'empty 'empty 'empty))
(vector-set! empty 1 empty)
(vector-set! empty 2 empty)
(define (empty? h) (eqv? h empty))
The partition
function takes a new element to be added to the tree, which Okasaki calls the pivot element by analogy to quicksort, and returns two sub-trees containing elements smaller and bigger than the pivot, respectively:
(define (partition lt? pivot h)
(if (empty? h) (values h h)
(if (lt? (item h) pivot)
(if (empty? (rkid h))
(values h empty)
(if (lt? (item (rkid h)) pivot)
(call-with-values
(lambda ()
(partition lt? pivot (rkid (rkid h))))
(lambda (small big)
(values (heap (heap (lkid h) (item h)
(lkid (rkid h)))
(item (rkid h)) small)
big)))
(call-with-values
(lambda ()
(partition lt? pivot (lkid (rkid h))))
(lambda (small big)
(values (heap (lkid h) (item h) small)
(heap big (item (rkid h))
(rkid (rkid h))))))))
(if (empty? (lkid h))
(values empty h)
(if (lt? (item (lkid h)) pivot)
(call-with-values
(lambda ()
(partition lt? pivot (rkid (lkid h))))
(lambda (small big)
(values (heap (lkid (lkid h))
(item (lkid h)) small)
(heap big (item h) (rkid h)))))
(call-with-values
(lambda ()
(partition lt? pivot (lkid (lkid h))))
(lambda (small big)
(values small
(heap big (item (lkid h))
(heap (rkid (lkid h))
(item h) (rkid h)))))))))))
This is complicated but correct. After handling an empty input tree, the two branches of the if
handle the case where the top of the heap is less than the pivot (the then clause) or greater than or equal to the pivot (the else clause). Both clauses call partition
recursively as they descend the tree. Notice the (heap (heap …))
in the first small clause, which is where the restructuring occurs.
Given partition
, the three priority queue operators are simple. Insert
partitions around the new element, then builds a new heap. First
charges down the left spine of the tree until it finds the minimum element. Rest
restructures the tree in the same manner as partition, except that there is no comparison because we always take the left path:
(define (insert lt? x h)
(call-with-values
(lambda () (partition lt? x h))
(lambda (a b) (heap a x b))))
(define (first lt? h)
(cond ((empty? h) (error 'first "empty queue"))
((empty? (lkid h)) (item h))
(else (first lt? (lkid h)))))
(define (rest lt? h)
(cond ((empty? h) (error 'rest "empty queue"))
((empty? (lkid h)) (rkid h))
((empty? (lkid (lkid h)))
(heap (rkid (lkid h)) (item h) (rkid h)))
(else (heap (rest lt? (lkid (lkid h)))
(item (lkid h))
(heap (rkid (lkid h)) (item h) (rkid h))))))
We test by writing a simple heapsort function, which inserts each item in input order then extracts each item from the heap in sorted order:
(define (heap-sort lt? xs)
(let loop ((xs xs) (h empty))
(if (pair? xs)
(loop (cdr xs) (insert lt? (car xs) h))
(let loop ((h h) (zs (list)))
(if (empty? h)
(reverse zs)
(loop (rest lt? h) (cons (first lt? h) zs)))))))
All is well:
> (heap-sort < (range 20))
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19)
> (heap-sort > (range 20))
(19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0)
> (heap-sort < (shuffle (range 20)))
(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19)
We used the Standard Prelude functions range
and shuffle
for testing, though they are not needed for implementation of heaps. You can run the program at http://programmingpraxis.codepad.org/b1NndaEK.
[…] Pages: 1 2 […]
Here’s my version in Racket: Sorting via splay heap
It’s not *that* different (I’m not sure how different two Scheme implementations could really be), but there is some. For example, I went with structs rather than wrapping a vector, which at once abstracts over the backend but unfortunately makes the function names a bit longer.
One thing I’m curious about is when you would want to use something like this. It was interesting to build it, but I feel like there would be better options, either for a priority queue (just use a list, insert in linked lists is relatively fast; although the runtime would be O(n) rather than amortized O(log n)) or for sorting (quicksort has a much simpler implementation, at least IMO). I guess if someone else writes the implementation and you just use it, then most of the issues would go away. Hmm.
@jpverkamp: We have used priority queues (heaps) in several exercises: Melissa O’Neill’s method of generating primes, apportioning votes in the U S House of Representatives, Prim’s algorithm for the minimum spanning tree, calculating the streaming median, building treaps, and in puzzle exercises from Wirth, Ullman and Amazon. You can find those by clicking on the Exercises | Search item on the menu bar. Priority queues are not just for sorting!
Perhaps I phrased that incorrectly. I definitely do see the use for priority queues in all sorts of cases. Personally, I remember using them for job scheduling in an industrial situation. I was more getting at this particular implementation (using a tree as the underlying structure rather than an array with power of two offsets).
Out of curiosity, I looked up what Java’s PriorityQueue and it’s apparently implemented using a heap. So I’ve used it several times without actually realizing it. :) That being said, their implementation uses a mutable array rather than a tree to implement the heap. So slightly different.
I went ahead and implemented a mutable version using arrays (as a tree though really) and sifting rather than rotations: Splay heaps redux–imperative model.
It’s based on the OpenJDK implementation of a PriorityQueue. It seems a bit more natural to me, although I wonder if modeling the tree explicitly rather than as an array (yet still performing the swaps) would have helped even more. It ended up taking me about as long to implement this one as the other, mostly because of a pair of really sneaky off-by-one errors that only cropped up in about 5% of randomized tests…
In Clojure. It is noticeably faster than a leftist heap. On my computer, to generate the first 1,000,000 primes using the leftist heap (O’Neil algorithm) takes 7.25 seconds. Replacing the heap used with a splay heap takes 6.4 seconds.
The test is simply to use some reductions to convert to/from a list.