## An Array Of Two Symbols

### March 12, 2013

I found today’s exercise on a discussion forum for learning programmers. It looks to me like homework, but the posting was old enough that I don’t think we have to worry about doing someone’s homework for them — unless the teacher reuses the same problems from one year to the next.

You are given an array of length

nthat is filled with two symbols; allmcopies of one symbol appear first, at the beginning of the array, followed by alln−mcopies of the other symbol. You are to find the index of the first copy of the second symbol,m, in timeO(logm).

The obvious solution is binary search. Pick the midpoint of the array; if the item there is the first symbol, search recursively in the right half of the array, otherwise search recursively in the left half of the array, stopping when the sub-array has been reduced to a single item. But that takes time *O*(log *n*), which violates the conditions of the exercise.

Your task is to write the *O*(log *m*) solution. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

[…] today’s exercise, our goal is to write a function that, given a list consisting of m of one symbol followed […]

My Haskell solution (see http://bonsaicode.wordpress.com/2013/03/12/programming-praxis-an-array-of-two-symbols/ for a version with comments):

On a side note, there’s a typo in the description. The complexity for the binary search should be O(log n), not O(log m).

A python solution that computes the binary representation of m, and then returns m+1. The algorithm determines sequentially the most significant bit, then the second and so on, constructing 2^k1, 2^k1 + 2^k2, … , 2^k1 + 2^k2 + … + 2^kt = m

The typo in the text makes it quite confusing.

> You are to find the index of the first copy of the second symbol, m, in time O(log m).

> But that takes time O(log m), which violates the conditions of the exercise.

Fixed. Sorry about that.

i have done it with php:

i have dione it with php:

[…] Question is from here […]

My java solution here.

i pasted the code so why didnt it show up?

anyway, here it is:

Seems like there is a bug in the solution, in the call into the first loop:

(loop hi (max (+ hi hi) (- len 1)))

This should be a call to (min), otherwise on the first iteration it checks (0 1), but the next jumps directly to the case (1 len – 1), as can be seen from the sample output:

> (f '#(0 0 1 1 1 1))

first loop: 0 1

first loop: 1 5 // We blew right past 2...

second loop: 1 3 5

second loop: 1 2 3

second loop: 1 1 2

second loop: 2 2 2

2

We should expect:

first loop: 0 1

first loop: 1 2

second loop: 1 1 2

second loop: 2 2 2

2

This is simple: find a lower bound for min in O(log2(m)), then apply a dichotomy(min,max) which will be in O(log2(m)) too, for a total of O(log2(m)).

(defun find-in-log-m (v)

(let* ((max (1- (length v)))

(symbol-m (aref v max)))

(labels ((bound-in-log-m (i)

(if (eql symbol-m (aref v (- max i)))

(bound-in-log-m (* 2 i))

i)))

(let ((min (bound-in-log-m 1)))

(1+ (nth-value 1 (com.informatimago.common-lisp.cesarum.utility:dichotomy

(lambda (index)

(cond

((eql symbol-m (aref v index)) -1)

((eql symbol-m (aref v (1+ index))) 0)

(t +1)))

min max)))))))

(find-in-log-m #(a a a a a a a a a a a a b b b b b b b b b b b b b b b b))

--> 17

pfernie: Fixed. That’s embarrassing.

[…] Second symbol starting index Array of two symbol […]