Array Rotation, Timing Tests
March 9, 2018
We begin by rewriting the three algorithms so they all have a common form, taking a vector and a rotation distance as arguments, with all the supporting code tucked neatly inside the main function:
(define (juggling vec dist)
  (let ((len (vector-length vec)))
    (do ((idx 0 (+ idx 1)))
        ((= idx (gcd dist len)) vec)
      (let ((temp (vector-ref vec idx)))
        (do ((lo idx hi)
             (hi (modulo (+ idx dist) len)
                 (modulo (+ hi dist) len)))
            ((= hi idx) (vector-set! vec lo temp))
          (vector-set! vec lo (vector-ref vec hi)))))))
(define (block-swap vec dist)
  (define (swap a b m)
    (do ((i 0 (+ i 1)))
        ((= i m) vec)
      (let ((t (vector-ref vec (+ a i))))
        (vector-set! vec (+ a i) (vector-ref vec (+ b i)))
        (vector-set! vec (+ b i) t))))
  (let ((len (vector-length vec)))
    (let loop ((i dist) (j (- len dist)))
      (cond ((< i j)
             (swap (- dist i) (+ dist j (- i)) i)
             (loop i (- j i)))
            ((< j i)
             (swap (- dist i) dist j)
             (loop (- i j) j))
            (else
             (swap (- dist i) dist i)
             vec)))))
(define (reversal vec dist)
  (define (swap i j)
    (let ((t (vector-ref vec i)))
      (vector-set! vec i (vector-ref vec j))
      (vector-set! vec j t)))
  (define (reverse lo hi)
    (when (< lo hi)
      (swap lo hi)
      (reverse (+ lo 1) (- hi 1))))
  (let ((len (vector-length vec)))
    (reverse 0 (- dist 1))
    (reverse dist (- len 1))
    (reverse 0 (- len 1)))
  vec)
Next we write a timing function, using the Chez Scheme (cpu-time) function, which reports CPU time in milliseconds, to calculate the times:
(define (timing rotate vec dist)
  (let ((start (cpu-time)))
    (rotate vec dist)
    (- (cpu-time) start)))
And now we are ready for some timings:
> (let ((vec (list->vector (range 1000000))))
    (map (lambda (x) (timing juggling vec 100)) (range 50)))
(109 94 109 94 109 93 94 109 94 109
 109 94 109 94 109 93 110 93 94 109
 94 93 109 94 109 94 109 94 93 125
 94 93 109 94 94 109 93 94 109 94
 93 110 93 125 109 94 109 94 93 109)
> (let ((vec (list->vector (range 1000000))))
    (map (lambda (x) (timing block-swap vec 100)) (range 50)))
(187 171 172 187 187 172 203 203 171 187
 188 187 171 203 172 187 172 187 171 188
 187 171 188 171 203 187 172 187 172 187
 171 188 187 187 187 172 187 187 172 187
 172 187 171 188 187 171 203 187 188 187)
> (let ((vec (list->vector (range 1000000))))
    (map (lambda (x) (timing reversal vec 100)) (range 50)))
(140 156 141 156 140 141 140 156 140 156
 141 140 156 125 156 156 140 156 141 140
 156 141 140 140 172 140 141 140 156 141
 140 156 140 141 156 140 141 156 156 140
 140 141 156 156 156 140 156 156 187 141)
So the juggling algorithm is fastest, the block-swap algorithm is slowest, and the reversal algorithm is somewhere in the middle. Our result differs from Bentley, who found that block-swap was just a little bit faster than reversal, and both were much faster than juggling. Bentley explained the differences based on CPU memory-cache effects; apparently Scheme is a higher-level language than C and is less influenced by caching.
Despite the timings, I will continue to use the reversal algorithm when I need to rotate an array. Its time is competitive with juggling, and the code is very much simpler; in fact, it’s hard to see how you could get the reversal code wrong, whereas the juggling code still baffles me, just a little bit, even after reading Bentley’s explanation and copying his implementation.
You can run the program at https://ideone.com/DQLleZ.
I compiled the code with Gambit and added these declarations
(declare (standard-bindings)
         (extended-bindings)
         (block)
         (fixnum)
         (not safe))
except for
(define (timing rotate vec dist)
  (declare (generic))
  (let ((start (cpu-time)))
    (rotate vec dist)
    (- (cpu-time) start)))
and got times
which is closer to Bentley’s results
Here’s a solution in C.
The output times follow the code.
The swapping approach worked fastest of my implementations.
I was able to improve the speed of the juggle approach by using subtraction to calculate the remainder instead of using the modulus operator.
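The subtraction trick can be sketched as follows. This is an illustration only, not the code from this comment: the names juggle_rotate and gcd_size are made up here, and it assumes a left rotation with 0 <= dist < n. Under that assumption an index advanced by dist stays below 2n, so one conditional subtraction wraps it and the inner loop needs no modulus.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor by Euclid's method (hypothetical helper). */
static size_t gcd_size(size_t a, size_t b) {
    while (b != 0) {
        size_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Rotate arr[0..n-1] left by dist using the juggling method.
   Assumes dist < n, so hi + dist < 2n and one subtraction wraps it. */
void juggle_rotate(int *arr, size_t n, size_t dist) {
    if (n == 0 || dist == 0) return;
    assert(dist < n);
    size_t cycles = gcd_size(dist, n);
    for (size_t start = 0; start < cycles; start++) {
        int temp = arr[start];
        size_t lo = start;
        size_t hi = start + dist;
        if (hi >= n) hi -= n;          /* wrap without % */
        while (hi != start) {
            arr[lo] = arr[hi];
            lo = hi;
            hi += dist;
            if (hi >= n) hi -= n;      /* wrap without % */
        }
        arr[lo] = temp;
    }
}
```

Whether this actually beats the modulus version depends on the compiler and the CPU, so it is worth measuring rather than assuming.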
I also added code to validate that my rotation implementations are working correctly.
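One possible shape for such a check, as a rough sketch rather than the validation code referred to here: fill the array with its own indexes, rotate, and confirm that position i now holds (i + dist) mod n. The names rotate_fn, rotation_is_correct, reverse_range and reversal_rotate are all hypothetical, and the reversal rotation is included only so there is something to validate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed common signature for every rotation under test. */
typedef void (*rotate_fn)(int *arr, size_t n, size_t dist);

/* Reverse the half-open range arr[lo..hi-1] in place. */
static void reverse_range(int *arr, size_t lo, size_t hi) {
    while (lo + 1 < hi) {
        hi--;
        int t = arr[lo];
        arr[lo] = arr[hi];
        arr[hi] = t;
        lo++;
    }
}

/* Left rotation by three reversals; here only as a test subject. */
void reversal_rotate(int *arr, size_t n, size_t dist) {
    reverse_range(arr, 0, dist);
    reverse_range(arr, dist, n);
    reverse_range(arr, 0, n);
}

/* True if rotate() turns 0,1,...,n-1 into dist,dist+1,...,n-1,0,...,dist-1. */
bool rotation_is_correct(rotate_fn rotate, size_t n, size_t dist) {
    assert(n > 0 && dist < n);
    int *arr = malloc(n * sizeof *arr);
    if (arr == NULL) return false;
    for (size_t i = 0; i < n; i++) arr[i] = (int)i;
    rotate(arr, n, dist);
    bool ok = true;
    for (size_t i = 0; i < n && ok; i++) {
        ok = (arr[i] == (int)((i + dist) % n));
    }
    free(arr);
    return ok;
}
```

The sketch prints nothing; it only returns a pass/fail flag, so a caller would run it over a few sizes and distances before trusting any timings.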
Output (-O0):
Output (-O2):
My calc_rotate_time function casts the result to a double in the return statement. That should be removed. An earlier implementation was calculating seconds, and the cast to double was there to bypass integer division. Removing the cast to double won’t have a substantive effect on the numbers reported above.
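For illustration, a timing helper without the cast might look roughly like this. It is a guess at the general shape, not the calc_rotate_time function itself: the names elapsed_ticks, ticks_to_seconds and rotate_one_at_a_time, and the function-pointer signature, are assumptions. The helper returns raw clock() ticks, and a conversion to double happens only if seconds are wanted.

```c
#include <stddef.h>
#include <time.h>

/* Assumed common signature for the rotation being timed. */
typedef void (*rotate_fn)(int *arr, size_t n, size_t dist);

/* Placeholder rotation (slow, obvious) so the helper has something to time. */
static void rotate_one_at_a_time(int *arr, size_t n, size_t dist) {
    if (n == 0) return;
    for (size_t k = 0; k < dist; k++) {
        int first = arr[0];
        for (size_t i = 1; i < n; i++) arr[i - 1] = arr[i];
        arr[n - 1] = first;
    }
}

/* Elapsed processor time in clock() ticks; no cast needed here. */
clock_t elapsed_ticks(rotate_fn rotate, int *arr, size_t n, size_t dist) {
    clock_t start = clock();
    rotate(arr, n, dist);
    return clock() - start;
}

/* Convert to seconds only at reporting time, dividing in floating point. */
double ticks_to_seconds(clock_t ticks) {
    return (double)ticks / CLOCKS_PER_SEC;
}
```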