Hoare’s Partition

October 28, 2016

Here is our version of Hoare’s partition:

(define (partition lt? ary lo hi)
  (let ((x (vector-ref ary lo)) (i (- lo 1)) (j hi))
    (let forever ()
      (do-until (set! j (- j 1)) (le? lt? (vector-ref ary j) x))
      (do-until (set! i (+ i 1)) (le? lt? x (vector-ref ary i)))
      (when (< i j) (swap! ary i j) (forever)))
    (values j ary)))

Here, lo is the index of the first element of the sub-array and hi is the index of the element just beyond the end of the sub-array; we find that more convenient than making hi point to the end of the sub-array, because then we can say (quicksort lt? ary 0 (vector-length ary)) to sort the entire array. We always partition around the first element of the array. Here is the corresponding quick sort:

(define (quicksort lt? ary lo hi)
  (if (< (+ lo 1) hi)
      (call-with-values
        (lambda ()
          (partition lt? ary lo hi))
        (lambda (p ary)
          (cond ((< (- p lo) (- hi p))
                  (quicksort lt? ary lo (+ p 1))
                  (quicksort lt? ary (+ p 1) hi))
          (else (quicksort lt? ary (+ p 1) hi)
                (quicksort lt? ary lo (+ p 1))))))
      ary))

We always recur on the smaller partition first, because otherwise the stack can blow out.

By the way, we use a macro to provide the do-until control flow. Scheme, unlike algol-ish languages, provides few looping constructs. The Standard Prelude provides a while macro, which tests at the top, but no do-while or do-until macro, which tests at the bottom. Here are our versions of do-while and do-until:

(define-syntax do-while
  (syntax-rules (loop)
    ((do-while body ... condition)
      (let loop () body ... (when condition (loop))))))
(define-syntax do-until
  (syntax-rules (loop)
    ((do-until body ... condition)
      (let loop () body ... (unless condition (loop))))))

It is one of the strengths of Scheme that we can create new control flow macros as needed. We also need to convert the lt? predicate to a le? predicate:

(define (le? lt? x y) (not (lt? y x)))

We write the program using le? rather than lt? because that’s the predicate Hoare used, and it’s easier not to change things when translating pseudocode to a working program.

We test the program by writing a shuffle function that randomly re-orders the elements of an array, shuffles and sorts an initially-ordered array a bunch of times, and tests that each result is sorted. Here are sorted? and shuffle:

(define (sorted? lt? ary)
  (let ((len (vector-length ary)))
    (let loop ((i (- len 1)))
      (if (zero? i) #t
        (if (lt? (vector-ref ary (- i 1))
                 (vector-ref ary i))
            (loop (- i 1))
            #f)))))
(define (shuffle ary)
  (do ((v ary) (n (vector-length ary) (- n 1)))
      ((zero? n) ary)
    (let* ((r (random n)) (t (vector-ref ary r)))
      (vector-set! ary r (vector-ref ary (- n 1)))
      (vector-set! ary (- n 1) t))))

And here’s the test code, which performs n tests on arrays of k elements:

(define (test-quicksort k n)
  (let ((ary (list->vector (range k))))
    (do ((n n (- n 1))) ((zero? n))
      (quicksort < ary 0 k)
      (unless (sorted? < ary)
        (display "ERROR ") (display ary) (newline))
      (shuffle ary))))

The test code is purposely evil: the very first test sorts an ordered array, which is the worst-case test for quicksort. Our quicksort is quite fast: (test-quicksort 10000 100) finishes in just over a second, and more than half of that time is spent on the worst-case initially-ordered array. You can run the program at http://ideone.com/ttjYgA, where you will also see swap! and range.

There are several improvements we can make to this code, including switching to insertion sort when the sub-arrays become small and improving the choice of the pivot to make worst-case performance less likely. We’ll look at those improvements, and others, in the next exercise.

Pages: 1 2

7 Responses to “Hoare’s Partition”

  1. Milbrae said

    Found this code on Stackexchange. I found it to be quite fast. Source code is in D…

    void QuickSort(int[] a, int start, int end) {
        if (end <= start) return;
        int q = HoarePartition(a, start, end);
        QuickSort(a, start, q);
        QuickSort(a, q + 1, end);
    }
    
    int HoarePartition (int[] a, int p, int r) {
        int x = a[p], i = p-1, j = r+1;
        while (1) {
            do  j--; while (a[j] > x);
            do  i++; while (a[i] < x);
            if  (i < j)
                swap(a[i], a[j]);
            else
                return j;
        }
    }
    
  2. Milbrae said

    I did a little speed comparison with some implementations of Quicksort. I took the code which I’d posted earlier on PP, and also three implementations from Rosetta Code (https://www.rosettacode.org/wiki/Sorting_algorithms/Quicksort#D). Here’s the code in D:

    import std.stdio;
    import std.datetime;
    import std.algorithm;
    import std.random;
    import std.range;
    import std.array;
    
    void quickSort1(T)(T[] items) pure nothrow @safe @nogc {
        if (items.length >= 2) {
            auto parts = partition3(items, items[$ / 2]);
            parts[0].quickSort1;
            parts[2].quickSort1;
        }
    }
    
    auto quickSort2(T)(T[] items) pure nothrow @safe {
        if (items.length < 2)
            return items;
        immutable pivot = items[0];
        return items[1 .. $].filter!(x => x < pivot).array.quickSort2 ~
               pivot ~
               items[1 .. $].filter!(x => x >= pivot).array.quickSort2;
    }
    
    T[] quickSort3(T)(T[] items) pure nothrow {
        if (items.empty)
            return items;
        T[] less, notLess;
        foreach (x; items[1 .. $])
            (x < items[0] ? less : notLess) ~= x;
        return less.quickSort3 ~ items[0] ~ notLess.quickSort3;
    }
    
    void hoareQuicksort(T)(T[] a, size_t start, size_t end) {
        if (end <= start) return;
        size_t q = hoareQSPartition(a, start, end);
        hoareQuicksort(a, start, q);
        hoareQuicksort(a, q + 1, end);
    }
    
    size_t hoareQSPartition(T)(T[] a, size_t p, size_t r) {
        size_t x = a[p], i = p-1, j = r+1;
        while (1) {
            do  { j--; } while (a[j] > x);
            do  { i++; } while (a[i] < x);
            if  (i < j)
                swap(a[i], a[j]);
            else
                return j;
        }
    }
    
    void main()
    {
        const size_t N = 10_000_000;
    
        size_t[] test = new size_t[](N);
        for (size_t k = 0; k < N; k++) {
            test[k] = uniform(1, N);
        }
    
        StopWatch q1, q2, q3, q4;
    
        q1.reset; q1.start;
        quickSort1(test.dup);
        q1.stop;
    
        q2.reset; q2.start;
        quickSort2(test.dup);
        q2.stop;
    
        q3.reset; q3.start;
        quickSort3(test.dup);
        q3.stop;
    
        q4.reset; q4.start;
        hoareQuicksort(test.dup, 0, N-1);
        q4.stop;
    
        writefln("%d elements\n", N);
        writefln("Quicksort #1 took %d ms", q1.peek.msecs);
        writefln("Quicksort #2 took %d ms", q2.peek.msecs);
        writefln("Quicksort #3 took %d ms", q3.peek.msecs);
        writefln("Quicksort Hoare took %d ms", q4.peek.msecs);
    }
    

    Compiled with [i]dmd -O -release -inline -of”qsort” “qsort.d”[/i] it gives these times…

    10000000 elements
    
    Quicksort #1 took 1693 ms
    Quicksort #2 took 15955 ms
    Quicksort #3 took 18132 ms
    Quicksort Hoare took 1325 ms
    

    Compiled with [i]ldc2 -O3 -enable-inlining -of”qsort” “qsort.d”[/i] it gives these times…

    10000000 elements
    
    Quicksort #1 took 1110 ms
    Quicksort #2 took 8596 ms
    Quicksort #3 took 14627 ms
    Quicksort Hoare took 1017 ms
    

    Maybe there’s an even faster implementation….

  3. Daniel said

    Here’s a solution in C. I had some bugs in my initial implementation that I’ve fixed, but I’m still not feeling confident that this works properly, as I haven’t done much testing. In practice, I’ve used Lomuto partitioning when implementing quicksort. IIRC, I implemented quicksort since I needed to track where each item an unsorted array was placed in the sorted array (this also could have been figured out *after* sorting, but would have been O(n^2), so I tracked it at sort-time).

    #include <stdio.h>
    
    void print_array(int* array, int n) {
        printf("[");
        for (int i = 0; i < n; ++i) {
            if (i > 0) printf(",");
            printf("%d", array[i]);
        }
        printf("]");
    }
    
    void swap(int* a, int* b) {
        int tmp = *a;
        *a = *b;
        *b = tmp;
    }
    
    int partition(int* array, int n) {
        int pivot = array[0];
        for (int i = 1, j = n-1;; ++i, --j) {
            while (i < n && array[i] < pivot) ++i;
            while (j >= 0 && array[j] > pivot) --j;
            if (i >= j) {
                swap(array, array + j);
                return j;
            }
            swap(array + i, array + j);
        }
    }
    
    void qsort(int* array, int n) {
        if (n <= 1) return;
        int p = partition(array, n);
        qsort(array, p);
        qsort(array + p + 1, n - p - 1);
    }
    
    int main() {
        int array[] = {9,4,8,7,1,7,4,10,1,5,5,5,8,4,8,5,1,2,8,5};
        int n = sizeof(array) / sizeof(int);
        printf("array (unsorted):\n  ");
        print_array(array, n);
        printf("\n");
        qsort(array, n);
        printf("array (sorted):\n  ");
        print_array(array, n);
        printf("\n");
    }
    

    Output:

    array (unsorted):
      [9,4,8,7,1,7,4,10,1,5,5,5,8,4,8,5,1,2,8,5]
    array (sorted):
      [1,1,1,2,4,4,4,5,5,5,5,5,7,7,8,8,8,8,9,10]
    
  4. Daniel said

    > “IIRC, I implemented quicksort since I needed to track where each item an unsorted array was placed in the sorted array (this also could have been figured out *after* sorting, but would have been O(n^2), so I tracked it at sort-time).”

    I suppose that using a hash table could have worked as well for what I was trying to do (create a mapping of item to index), but I don’t recall the details, and may not have had hash tables available or even a standard library sorting algorithm. I was using VBA (Visual Basic for Applications) to write a Microsoft Office plugin. It had to run on both Windows and Mac, so some of the Windows-only extensions weren’t available.

  5. […] today’s exercise we will make several improvements to the quicksort program of the previous exercise, working in small steps and measuring our progress throughout. We make the following […]

  6. […] two previous exercises we’ve been working toward a variant of quicksort that has guaranteed O(n log n) […]

  7. […] sorting algorithm that we have been working up to in three previous exercises is introspective sort, or introsort, invented by David Musser in 1997 for the […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: