An Array Of Two Symbols

March 12, 2013

I found today’s exercise on a discussion forum for learning programmers. It looks to me like homework, but the posting was old enough that I don’t think we have to worry about doing someone’s homework for them — unless the teacher reuses the same problems from one year to the next.

You are given an array of length n that is filled with two symbols; all m copies of one symbol appear first, at the beginning of the array, followed by all nm copies of the other symbol. You are to find the index of the first copy of the second symbol, m, in time O(log m).

The obvious solution is binary search. Pick the midpoint of the array; if the item there is the first symbol, search recursively in the right half of the array, otherwise search recursively in the left half of the array, stopping when the sub-array has been reduced to a single item. But that takes time O(log n), which violates the conditions of the exercise.

Your task is to write the O(log m) solution. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

About these ads

Pages: 1 2

13 Responses to “An Array Of Two Symbols”

  1. [...] today’s exercise, our goal is to write a function that, given a list consisting of m of one symbol followed [...]

  2. My Haskell solution (see http://bonsaicode.wordpress.com/2013/03/12/programming-praxis-an-array-of-two-symbols/ for a version with comments):

    import qualified Data.Vector as V
    import Test.QuickCheck
    
    search :: V.Vector Char -> Int
    search xs = f 0 (V.length xs - 1) where
        f start end = case span ((xs V.! 0 ==) . (xs V.!)) . takeWhile (<= end) .
                           map (start +) $ 0 : iterate (*2) 1
                      of   ([_],[n]) -> n
                           (ms ,[] ) -> f (last ms) end
                           (ms ,n:_) -> f (last ms) n
    

    On a side note, there’s a typo in the description. The complexity for the binary search should be O(log n), not O(log m).

  3. cosmin said

    A python solution that computes the binary representation of m, and then returns m+1. The algorithm determines sequentially the most significant bit, then the second and so on, constructing 2^k1, 2^k1 + 2^k2, … , 2^k1 + 2^k2 + … + 2^kt = m

    def split(a):
    	pow2, n = 1, len(a)
    	while pow2 < n and a[0] == a[pow2]:
    		pow2 <<= 1
    	pow2 >>= 1
    	index = pow2
    	while pow2 > 0:
    		if index + pow2 < n and a[0] == a[index + pow2]:
    			index += pow2
    		pow2 >>= 1
    	return index + 1
    
    print split([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1])
    
  4. Joachim said

    The typo in the text makes it quite confusing.

    > You are to find the index of the first copy of the second symbol, m, in time O(log m).

    > But that takes time O(log m), which violates the conditions of the exercise.

  5. programmingpraxis said

    Fixed. Sorry about that.

  6. howlettk said

    i have done it with php:

  7. howlettk said

    i have dione it with php:

  8. howlettk said

    i pasted the code so why didnt it show up?
    anyway, here it is:

  9. pfernie said

    Seems like there is a bug in the solution, in the call into the first loop:


    (loop hi (max (+ hi hi) (- len 1)))

    This should be a call to (min), otherwise on the first iteration it checks (0 1), but the next jumps directly to the case (1 len – 1), as can be seen from the sample output:

    > (f '#(0 0 1 1 1 1))
    first loop: 0 1
    first loop: 1 5 // We blew right past 2...
    second loop: 1 3 5
    second loop: 1 2 3
    second loop: 1 1 2
    second loop: 2 2 2
    2

    We should expect:

    first loop: 0 1
    first loop: 1 2
    second loop: 1 1 2
    second loop: 2 2 2
    2

  10. This is simple: find a lower bound for min in O(log2(m)), then apply a dichotomy(min,max) which will be in O(log2(m)) too, for a total of O(log2(m)).


    (defun find-in-log-m (v)
    (let* ((max (1- (length v)))
    (symbol-m (aref v max)))
    (labels ((bound-in-log-m (i)
    (if (eql symbol-m (aref v (- max i)))
    (bound-in-log-m (* 2 i))
    i)))
    (let ((min (bound-in-log-m 1)))
    (1+ (nth-value 1 (com.informatimago.common-lisp.cesarum.utility:dichotomy
    (lambda (index)
    (cond
    ((eql symbol-m (aref v index)) -1)
    ((eql symbol-m (aref v (1+ index))) 0)
    (t +1)))
    min max)))))))

    (find-in-log-m #(a a a a a a a a a a a a b b b b b b b b b b b b b b b b))
    --> 17

  11. programmingpraxis said

    pfernie: Fixed. That’s embarrassing.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 576 other followers

%d bloggers like this: