Binary Search

April 29, 2016

We did this exercise a long time ago, but it’s worth doing again, because binary search is fundamental and because it is such a good teaching tool. Here’s my version, which assumes that the input is sorted in non-decreasing order:

(define (bsearch lt? x xs)
  (let loop ((lo 0) (hi (- (vector-length xs) 1)))
    (let ((mid (quotient (+ lo hi) 2)))
      (cond ((< hi lo) #f)
            ((lt? x (vector-ref xs mid))
              (loop lo (- mid 1)))
            ((lt? (vector-ref xs mid) x)
              (loop (+ mid 1) hi))
            (else mid)))))

For testing, I create vectors containing even numbers, then check that the results are correct when searching for both evens and odds:

(define (test-bsearch n)
  (do ((i 0 (+ i 1))) ((= n i))
    (let ((xs (list->vector (range 0 n 2))))
      (do ((j -1 (+ j 1))) ((< n j))
        (if (and (even? j) (< j n))
            (assert (bsearch < j xs) (/ j 2))
            (assert (bsearch < j xs) #f))))))

No news is good news:

> (test-bsearch 25)
>

We used range and the assert macro from the Standard Prelude. You can run the program at http://ideone.com/qzE6ZT.

Advertisements

Pages: 1 2

6 Responses to “Binary Search”

  1. John Cowan said

    Hell no. Binary search is extraordinarily difficult to get right. Even Bentley’s “proved correct” 2006 algorithm turns out to have a bug involving numeric overflow (which is at least only a performance bug in Lispy languages, where overflow produces bignums, floats, or an error). This is one of those cases like Posix regular expressions, where you absolutely need to use a high-quality library, not the one which comes with your OS (which surely has bugs), still less your own casually written code.

  2. Jussi Piitulainen said

    This is where I always write those invariants and variants out in some detail and still get it wrong at first. It happened again: on the first run I had an infinite loop and had to think; sure enough, one of my comments was false, so I fixed the comment and the code.

    (define (index big? vec)
      ;;; Returns the index k before which the elements of the vector do
      ;;; not satisfy the predicate, and from which on they do, assuming
      ;;; the elements are in a relevant order.
      (let loop ((b 0) (e (vector-length vec)))
        ;;; (<= b e)
        ;;; none before b is big
        ;;; all from e on are big
        (if (= b e)
            b
            (let ((m (+ b (quotient (- e b) 2))))
              ;;; (< b e) (<= b m) (< m e)
              ;;; m is a valid index
              ;;; (< (- m b) (- e b))
              ;;; (<= (- e m) (- e b))
              ;;; (< (- e (+ m 1)) (- e b))
              (if (big? (vector-ref vec m))
                  (loop b m)
                  (loop (+ m 1) e))))))
    

    The proceduce looks for the least index where the element satisfies a predicate. The test predicate asserts that a string is at least as many characters long as a given number, and there are equally long strings and length gaps in the data. The final check is when no element is big enough, so the value is the length of the vector, which is not the index of any element.

    (let ((data (vector "" "" "ab" "abc" "abd" "abcde" "abcdf" "abcdefghij"))
          (pred (lambda (n) (lambda (s) (>= (string-length s) n)))))
      (do ((n 0 (+ n 1)))
          ((> n (string-length "abcdefghij"))
           (write (= (index (pred n) data) (vector-length data)))
           (newline))
        (let ((u (index (pred n) data)))
          (write (list u n (vector-ref data u)))
          (newline))))
    

    Output (least index, sought least length, the string at the found index):

    (0 0 "")
    (2 1 "ab")
    (2 2 "ab")
    (3 3 "abc")
    (5 4 "abcde")
    (5 5 "abcde")
    (7 6 "abcdefghij")
    (7 7 "abcdefghij")
    (7 8 "abcdefghij")
    (7 9 "abcdefghij")
    (7 10 "abcdefghij")
    #t
    
  3. Jussi Piitulainen said

    I translated :)

    def index(isbig, vec):
        'Det är samma på Python.'
        b, e = 0, len(vec)
        while b < e:
            m = b + (e - b) // 2
            if isbig(vec[m]):
                e = m
            else:
                b = m + 1
        else:
            return b
    
    data = ['', '', 'ab', 'abc', 'abd', 'abcde', 'abcdf', 'abcdefghij']
    pred = lambda n: lambda s: len(s) >= n
    for n in range(len('abcdefghij') + 1):
        u = index(pred(n), data)
        print('({} {} {})'.format(u, n, repr(data[u])))
    else:
        print(index(pred(n + 1), data) == len(data))
    
  4. Daniel said

    Here’s a solution in Java.

    Two things I’d probably do differently if implementing this in production, as opposed to implementing for an exercise:
    1) loops instead of recursion
    2) a more generic implementation that works for other types of arrays in addition to integer arrays.

    private static int helper(int[] array, int element, int lo, int hi) {
        if (hi <= lo) {
            return -1; // base case 1
        } else if (hi - lo == 1) {
            return element == array[lo] ? lo : -1; // base case 2
        }
        int mid = lo + (hi-lo) / 2;
        int val = array[mid];
        if (element == val) {
            return mid;  // base case 3
        } else if (element < val) {
            return helper(array, element, lo, mid);
        } else {
            return helper(array, element, lo+1, hi);
        }
    }
    
    public static int search(int[] array, int element) {
        return helper(array, element, 0, array.length);
    }
    
  5. Daniel said

    After implementing, I took a look at how binary search is implemented in Java’s standard library. The version I implemented returns -1 if it doesn’t find the element. The Java built-in also returns a negative number when the element is not found, but the returned number seemingly can be transformed to an insertion point, “the point at which the key would be inserted into the array”. I didn’t consider returning that information, but suppose it could be useful in some cases.

  6. Daniel said

    Also note I avoided the overflow error that John Cowan referenced, having already been familiar with this post:
    http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html

    Note: I’ve quoted from that page when I or someone else has detected bugs in my code:
    “We programmers need all the help we can get, and we should never assume otherwise. Careful design is great. Testing is great. Formal methods are great. Code reviews are great. Static analysis is great. But none of these things alone are sufficient to eliminate bugs: They will always be with us. A bug can exist for half a century despite our best efforts to exterminate it. We must program carefully, defensively, and remain ever vigilant.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: