## An Array Of Two Symbols

### March 12, 2013

The solution is to start from the left end instead of the middle, doubling the size of the search sub-array at each step: first from 0 to 1, then from 1 to 2, then from 2 to 4, then from 4 to 8, and so on. That will take O(log m) doublings. Once i doublings have occurred and the first copy of the second symbol is bracketed between the 2ith array element and the 2i+1th, the sub-array will be no larger than m, so binary search on the sub-array will find the desired item in O(log m) time, giving a combined O(log m) for the entire solution. Here’s the code, where we use 0 and 1 as the two symbols; beware that it is harder than it looks to get this code right:

```(define (f vec)   (let ((len (vector-length vec)))     (let loop ((lo 0) (hi 1))       (when verbose? (display "first loop: ")         (display lo) (display " ") (display hi) (newline))       (if (zero? (vector-ref vec hi))           (loop hi (min (+ hi hi) (- len 1)))           (let loop ((lo lo) (hi hi))             (let ((mid (quotient (+ lo hi) 2)))               (when verbose? (display "second loop: ")                 (display lo) (display " ") (display mid)                 (display " ") (display hi) (newline))               (cond ((= lo hi) hi)                     ((zero? (vector-ref vec mid))                       (loop (+ mid 1) hi))                     (else (loop lo mid)))))))))```

We assume that the form of the input vector is correct, that it has at least two items, including at least one 0 and one 1, and that all the 0 items appear before all the 1 items. Here’s an example:

```> (set! verbose? #t) > (f '#(0 0 1 1 1 1)) first loop: 0 1 first loop: 1 5 second loop: 1 3 5 second loop: 1 2 3 second loop: 1 1 2 second loop: 2 2 2 2```

The code here is tricky, with several places where it is easy to be off-by-one or to branch the wrong way on equality (yes, I stumbled into one of them while writing this code). Because of that, the code includes debugging output that can be turned on by setting the global variable `verbose?` to `#t`; it is instructive to turn on the verbose output and watch the code run. We also wrote a test program that runs a large number of tests, writing any failures or returning the null list for success:

```(define (test n)   (list-of (list vec result)     (x range 1 n)     (y range 1 n)     (vec is (list->vector (append (make-list x 0) (make-list y 1))))     (result is (f vec))     (or (not (= (vector-ref vec result) 1))         (not (= (vector-ref vec (- result 1)) 0)))))```

``` ```

```> (set! verbose? #f) > (test 100) ()```

My own experience is instructive. I ran `f` by hand for several test cases, and thought all was well. Then I wrote the `test` program, and threw an exception that caused the test to halt midway through. Not seeing the problem, I added the debugging code, and saw that in the problem case lo was greater than hi, which should be impossible. Once I knew what was wrong, the problem was easy to fix, but I would never have found it without the debugging output. Finally, I decided to leave both the test script and the debugging code in the final version of the program, so I could show it to you, and to satisfy a gnawing sense that there might still be some remaining problem.

You can run the program at http://programmingpraxis.codepad.org/gq1oqTqk.

Pages: 1 2

### 14 Responses to “An Array Of Two Symbols”

1. […] today’s exercise, our goal is to write a function that, given a list consisting of m of one symbol followed […]

```import qualified Data.Vector as V
import Test.QuickCheck

search :: V.Vector Char -> Int
search xs = f 0 (V.length xs - 1) where
f start end = case span ((xs V.! 0 ==) . (xs V.!)) . takeWhile (<= end) .
map (start +) \$ 0 : iterate (*2) 1
of   ([_],[n]) -> n
(ms ,[] ) -> f (last ms) end
(ms ,n:_) -> f (last ms) n
```

On a side note, there’s a typo in the description. The complexity for the binary search should be O(log n), not O(log m).

3. cosmin said

A python solution that computes the binary representation of m, and then returns m+1. The algorithm determines sequentially the most significant bit, then the second and so on, constructing 2^k1, 2^k1 + 2^k2, … , 2^k1 + 2^k2 + … + 2^kt = m

```def split(a):
pow2, n = 1, len(a)
while pow2 < n and a == a[pow2]:
pow2 <<= 1
pow2 >>= 1
index = pow2
while pow2 > 0:
if index + pow2 < n and a == a[index + pow2]:
index += pow2
pow2 >>= 1
return index + 1

print split([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1])
```
4. Joachim said

The typo in the text makes it quite confusing.

> You are to find the index of the first copy of the second symbol, m, in time O(log m).

> But that takes time O(log m), which violates the conditions of the exercise.

5. programmingpraxis said

6. howlettk said

i have done it with php:

7. howlettk said

i have dione it with php:

8. […] Question is from here […]

9. howlettk said

i pasted the code so why didnt it show up?
anyway, here it is:

10. pfernie said

Seems like there is a bug in the solution, in the call into the first loop:

``` (loop hi (max (+ hi hi) (- len 1))) ```

This should be a call to (min), otherwise on the first iteration it checks (0 1), but the next jumps directly to the case (1 len – 1), as can be seen from the sample output:
``` > (f '#(0 0 1 1 1 1)) first loop: 0 1 first loop: 1 5 // We blew right past 2... second loop: 1 3 5 second loop: 1 2 3 second loop: 1 1 2 second loop: 2 2 2 2 ```

We should expect:
``` first loop: 0 1 first loop: 1 2 second loop: 1 1 2 second loop: 2 2 2 2 ```

11. This is simple: find a lower bound for min in O(log2(m)), then apply a dichotomy(min,max) which will be in O(log2(m)) too, for a total of O(log2(m)).

``` (defun find-in-log-m (v) (let* ((max (1- (length v))) (symbol-m (aref v max))) (labels ((bound-in-log-m (i) (if (eql symbol-m (aref v (- max i))) (bound-in-log-m (* 2 i)) i))) (let ((min (bound-in-log-m 1))) (1+ (nth-value 1 (com.informatimago.common-lisp.cesarum.utility:dichotomy (lambda (index) (cond ((eql symbol-m (aref v index)) -1) ((eql symbol-m (aref v (1+ index))) 0) (t +1))) min max)))))))```

``` (find-in-log-m #(a a a a a a a a a a a a b b b b b b b b b b b b b b b b)) --> 17 ```

12. programmingpraxis said

pfernie: Fixed. That’s embarrassing.

13. […] Second symbol starting index Array of two symbol […]