Binary Search
April 29, 2016
I goofed.
While writing a program (it may appear in a future exercise) I needed to search a sorted array for a target value. I should have copied an existing binary search, but instead I wrote my own, since I’m a good programmer and can certainly write a simple function like that. You won’t have any trouble guessing what happened next.
Your task is to write a binary search function; do it yourself, without looking at any library implementations or searching the internet. You might also want to write a test script to give you confidence in your function. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Hell no. Binary search is extraordinarily difficult to get right. Even Bentley’s “proved correct” 2006 algorithm turns out to have a bug involving numeric overflow (which is at least only a performance bug in Lispy languages, where overflow produces bignums, floats, or an error). This is one of those cases like Posix regular expressions, where you absolutely need to use a high-quality library, not the one which comes with your OS (which surely has bugs), still less your own casually written code.
This is where I always write those invariants and variants out in some detail and still get it wrong at first. It happened again: on the first run I had an infinite loop and had to think; sure enough, one of my comments was false, so I fixed the comment and the code.
The proceduce looks for the least index where the element satisfies a predicate. The test predicate asserts that a string is at least as many characters long as a given number, and there are equally long strings and length gaps in the data. The final check is when no element is big enough, so the value is the length of the vector, which is not the index of any element.
Output (least index, sought least length, the string at the found index):
I translated :)
Here’s a solution in Java.
Two things I’d probably do differently if implementing this in production, as opposed to implementing for an exercise:
1) loops instead of recursion
2) a more generic implementation that works for other types of arrays in addition to integer arrays.
After implementing, I took a look at how binary search is implemented in Java’s standard library. The version I implemented returns -1 if it doesn’t find the element. The Java built-in also returns a negative number when the element is not found, but the returned number seemingly can be transformed to an insertion point, “the point at which the key would be inserted into the array”. I didn’t consider returning that information, but suppose it could be useful in some cases.
Also note I avoided the overflow error that John Cowan referenced, having already been familiar with this post:
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
Note: I’ve quoted from that page when I or someone else has detected bugs in my code:
“We programmers need all the help we can get, and we should never assume otherwise. Careful design is great. Testing is great. Formal methods are great. Code reviews are great. Static analysis is great. But none of these things alone are sufficient to eliminate bugs: They will always be with us. A bug can exist for half a century despite our best efforts to exterminate it. We must program carefully, defensively, and remain ever vigilant.