Shuffle
October 20, 2009
The classic description of the array shuffle is given by Donald E. Knuth in The Art of Computer Programming, Volume 2, Section 3.4.2, Algorithm P:
(define (shuffle! v)
(do ((n (vector-length v) (- n 1))) ((zero? n) v)
(let* ((r (randint n)) (t (vector-ref v r)))
(vector-set! v r (vector-ref v (- n 1)))
(vector-set! v (- n 1) t))))
That algorithm has time complexity O(n). Using the same algorithm on a list, replacing vector-ref
with list-ref
, has time complexity O(n²), due to the cost of continually indexing through the list, starting each time at its head. We won’t show the list version of that algorithm.
Joe Marshall proposes shuffling a list by partitioning it into two pieces deterministically, shuffling them recursively, then merging them randomly:
(define (shuffle xs)
(if (or (null? xs) (null? (cdr xs))) xs
(let split ((xs xs) (odd '()) (even '()))
(if (pair? xs)
(split (cdr xs) (cons (car xs) even) odd)
(let merge ((odd (shuffle odd)) (even (shuffle even)))
(cond ((null? odd) even)
((null? even) odd)
((zero? (randint 2)) (cons (car odd) (merge (cdr odd) even)))
(else (cons (car even) (merge odd (cdr even))))))))))
Al Petrofsky proposes a somewhat faster method that first partitions the list randomly, then randomly merges the two pieces:
(define (shuffle xs)
(let shuffle ((xs xs) (acc '()))
(if (null? xs) acc
(if (null? (cdr xs)) (cons (car xs) acc)
(let split ((xs xs) (x1 '()) (x2 '()))
(if (null? xs)
(if (null? x1)
(split x2 '() '())
(shuffle x1 (shuffle x2 acc)))
(if (zero? (randint 2))
(split (cdr xs) (cons (car xs) x1) x2)
(split (cdr xs) x1 (cons (car xs) x2)))))))))
If you want, you can always do Perl’s omigod Schwartzian transform:
(define (shuffle xs)
(map cdr
(sort (lambda (x y) (< (car x) (car y)))
(map (lambda (x) (cons (rand) x)) xs))))
All the list shuffles have time complexity O(n log n); that’s the price of the linked list. But since Knuth’s method has time complexity O(n), the fastest algorithm for sorting a list is to convert it to an array, shuffle the array using Knuth’s method, then convert it back to a list, which is not only fast in theory but also faster in practice than any of the other methods, as for small lists the two type conversions get lost in the overhead of the partition/merge process, and for large lists the asymptotic behavior dwarfs everything else:
(define (shuffle x)
(do ((v (list->vector x)) (n (length x) (- n 1)))
((zero? n) (vector->list v))
(let* ((r (randint n)) (t (vector-ref v r)))
(vector-set! v r (vector-ref v (- n 1)))
(vector-set! v (- n 1) t))))
We used sort
, rand
and randint
from the Standard Prelude. You can run the program at http://programmingpraxis.codepad.org/bgrR41GU.
Here’s a PLT version, which works like the Perl version:
(define (shuffle x) (sort x #:key (lambda _ (random)) #:cache-keys? #t))
Here’s an example of the array shuffle in perl (without using the pre-existing library). Perl doesn’t generally support linked lists as all arrays have the push/pop/etc functionality included.
Perl array shuffle example
Here is an implementation in Clojure. Its native datatypes are immutable, so for the array shuffle I’m using Java’s ArrayList. The collection shuffler works on any ordered Clojure collection (immutable lists as well as vectors); because values are immutable, we are modifying what the variables point to, which has to be done within a software transaction with dosync.
To shuffle a linked list without converting it into an array or treating it as if it were an array, create two temporary lists, left and right, and then for each element in the original list, move that element randomly into either left or right. Then use recursion shuffle each of the left and right lists, combine them, and return the combined list.
Of course, doing two recursive calls can result in lots of memory usage, especially since the random selection could potentially put all elements of the original into left, or all into right… which adds to the stack depth, while accomplishing little work.
Instead, after partitioning into left and right, join the lists back together before shuffling, so that one of the two recursive calls can be changed into a loop.
Here’s some lisp code to do it:
Alternatively, here’s some C code:
This would be used as:
PS: I’ve only tested the lisp code, since I’m still working on installing gcc on my computer.
But since the algorithm is the same, they should work the same.
I haven’t looked closely, but isn’t your algorithm O(n log n), like any other divide-and-conquer algorithm? The advantage of the list->vector->knuth-shuffle->list algorithm is that it works in O(n) time.
MY c IMPLEMENTATION