Binary Search

April 29, 2016

I goofed.

While writing a program (it may appear in a future exercise) I needed to search a sorted array for a target value. I should have copied an existing binary search, but instead I wrote my own, since I’m a good programmer and can certainly write a simple function like that. You won’t have any trouble guessing what happened next.

Your task is to write a binary search function; do it yourself, without looking at any library implementations or searching the internet. You might also want to write a test script to give you confidence in your function. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2

6 Responses to “Binary Search”

  1. John Cowan said

    Hell no. Binary search is extraordinarily difficult to get right. Even Bentley’s “proved correct” 2006 algorithm turns out to have a bug involving numeric overflow (which is at least only a performance bug in Lispy languages, where overflow produces bignums, floats, or an error). This is one of those cases like Posix regular expressions, where you absolutely need to use a high-quality library, not the one which comes with your OS (which surely has bugs), still less your own casually written code.

  2. Jussi Piitulainen said

    This is where I always write those invariants and variants out in some detail and still get it wrong at first. It happened again: on the first run I had an infinite loop and had to think; sure enough, one of my comments was false, so I fixed the comment and the code.

    (define (index big? vec)
      ;;; Returns the index k before which the elements of the vector do
      ;;; not satisfy the predicate, and from which on they do, assuming
      ;;; the elements are in a relevant order.
      (let loop ((b 0) (e (vector-length vec)))
        ;;; (<= b e)
        ;;; none before b is big
        ;;; all from e on are big
        (if (= b e)
            b
            (let ((m (+ b (quotient (- e b) 2))))
              ;;; (< b e) (<= b m) (< m e)
              ;;; m is a valid index
              ;;; (< (- m b) (- e b))
              ;;; (<= (- e m) (- e b))
              ;;; (< (- e (+ m 1)) (- e b))
              (if (big? (vector-ref vec m))
                  (loop b m)
                  (loop (+ m 1) e))))))
    

    The proceduce looks for the least index where the element satisfies a predicate. The test predicate asserts that a string is at least as many characters long as a given number, and there are equally long strings and length gaps in the data. The final check is when no element is big enough, so the value is the length of the vector, which is not the index of any element.

    (let ((data (vector "" "" "ab" "abc" "abd" "abcde" "abcdf" "abcdefghij"))
          (pred (lambda (n) (lambda (s) (>= (string-length s) n)))))
      (do ((n 0 (+ n 1)))
          ((> n (string-length "abcdefghij"))
           (write (= (index (pred n) data) (vector-length data)))
           (newline))
        (let ((u (index (pred n) data)))
          (write (list u n (vector-ref data u)))
          (newline))))
    

    Output (least index, sought least length, the string at the found index):

    (0 0 "")
    (2 1 "ab")
    (2 2 "ab")
    (3 3 "abc")
    (5 4 "abcde")
    (5 5 "abcde")
    (7 6 "abcdefghij")
    (7 7 "abcdefghij")
    (7 8 "abcdefghij")
    (7 9 "abcdefghij")
    (7 10 "abcdefghij")
    #t
    
  3. Jussi Piitulainen said

    I translated :)

    def index(isbig, vec):
        'Det är samma på Python.'
        b, e = 0, len(vec)
        while b < e:
            m = b + (e - b) // 2
            if isbig(vec[m]):
                e = m
            else:
                b = m + 1
        else:
            return b
    
    data = ['', '', 'ab', 'abc', 'abd', 'abcde', 'abcdf', 'abcdefghij']
    pred = lambda n: lambda s: len(s) >= n
    for n in range(len('abcdefghij') + 1):
        u = index(pred(n), data)
        print('({} {} {})'.format(u, n, repr(data[u])))
    else:
        print(index(pred(n + 1), data) == len(data))
    
  4. Daniel said

    Here’s a solution in Java.

    Two things I’d probably do differently if implementing this in production, as opposed to implementing for an exercise:
    1) loops instead of recursion
    2) a more generic implementation that works for other types of arrays in addition to integer arrays.

    private static int helper(int[] array, int element, int lo, int hi) {
        if (hi <= lo) {
            return -1; // base case 1
        } else if (hi - lo == 1) {
            return element == array[lo] ? lo : -1; // base case 2
        }
        int mid = lo + (hi-lo) / 2;
        int val = array[mid];
        if (element == val) {
            return mid;  // base case 3
        } else if (element < val) {
            return helper(array, element, lo, mid);
        } else {
            return helper(array, element, lo+1, hi);
        }
    }
    
    public static int search(int[] array, int element) {
        return helper(array, element, 0, array.length);
    }
    
  5. Daniel said

    After implementing, I took a look at how binary search is implemented in Java’s standard library. The version I implemented returns -1 if it doesn’t find the element. The Java built-in also returns a negative number when the element is not found, but the returned number seemingly can be transformed to an insertion point, “the point at which the key would be inserted into the array”. I didn’t consider returning that information, but suppose it could be useful in some cases.

  6. Daniel said

    Also note I avoided the overflow error that John Cowan referenced, having already been familiar with this post:
    http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html

    Note: I’ve quoted from that page when I or someone else has detected bugs in my code:
    “We programmers need all the help we can get, and we should never assume otherwise. Careful design is great. Testing is great. Formal methods are great. Code reviews are great. Static analysis is great. But none of these things alone are sufficient to eliminate bugs: They will always be with us. A bug can exist for half a century despite our best efforts to exterminate it. We must program carefully, defensively, and remain ever vigilant.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: