## An Impossible Interview Question

### October 18, 2016

I think the desired solution was Floyd’s algorithm: Pick the first character with probability 1. Then replace it with the nth character in the input with probability 1/n. When the stream is exhausted, output the character that remains:

```(define (floyd xs)
(let loop ((xs xs) (n 1) (x #f))
(if (null? xs) x
(loop (cdr xs) (+ n 1)
(if (< (rand) (/ n)) (car xs) x)))))

> (floyd '(1 2 3 4 5))
4```

We take the input all at once, because Scheme handles lists better than streams, but we don’t make use of anything other than the current character and the count of characters that we have seen, so we do properly solve the problem. You can run the solution at http://ideone.com/Hh4IfH.

Except, of course, that we don’t properly solve the problem because we use a counter n as well as a character. I don’t see any way to solve the problem as given. Do you?

Pages: 1 2

### 10 Responses to “An Impossible Interview Question”

1. Paul said

In Python.

```from random import randrange
from collections import Counter

def random_char(seq):
result = None
for i, c in enumerate(seq, 1):
if randrange(i) == 0:
result = c
return result

test = (random_char("abcd") for i in range(1000000))
print(Counter(test).most_common())
# [('a', 250848), ('d', 250520), ('b', 249523), ('c', 249109)]
```
2. Zack said

@Paul. Kudos for such an elegant piece of code! I have my own implementation of this in Julia but I don’t think it’s any better than yours (though it’s probably faster due to the nature of the language). Anyway, I’m glad there are people like you who don’t take the term “impossible” very seriously. Thank you for sharing.

3. nims11 said

Doesn’t your solution have higher probabilities to return initial numbers (the question asks for equal likeliness) ? For a 3 character sequence, the probabilities would be 1/2, 2/3, 1/3 for the characters, respectively, right?

4. nims11 said

My bad, the probabilities are indeed equal

5. Paul said

@Zack. Thank you. You make me blush.

6. V said

Solution in Ruby

The tricky part of this one is the condition:

> Only 1 character of storage space; in particular, you may not save multiple characters from the input stream

What is storage? variables? ram? disk?

Anyway here is my solution which might not comply with the storage condition.

```
def pick_random(stream)
stream.chars.uniq.sample
end

# test

input = "an impossible interview question"

puts 1_000_000.times
.map { pick_random(input) }
.group_by(&:itself)
.sort_by {|k, v| k }
.map {|k, v| "#{k}: #{v.count}" }

```

Output (How many times each letter was picked?)

: 58765
a: 58743
b: 58609
e: 58692
i: 58765
l: 58512
m: 58883
n: 58760
o: 58570
p: 59486
q: 58672
r: 59039
s: 58658
t: 58681
u: 59169
v: 58940
w: 59056

7. Daniel said

Here’s a solution in Java.

```import java.util.Random;

public class ReservoirSampling {
private static Random rng = new Random();

// a stream of characters with unknown length.
public static class Stream {
// the unknown remaining size of the stream. (stream size starts between 1 and 5000)
private int n = rng.nextInt(5000) + 1;

private static final String alphabet = "abcdefghijklmnopqrstuvwxyz";

public boolean hasNext() {
return n > 0;
}

// get the next character in the stream (or null if stream is empty)
public Character next() {
if (n > 0) {
--n;
return alphabet.charAt(rng.nextInt(alphabet.length()));
} else {
return null;
}
}
}

public static Character sample(Stream stream) {
int i = 1;
Character c = stream.next();
for (Character next = stream.next(); next != null; next = stream.next()) {
++i;
if (rng.nextDouble() < 1.0/i) {
c = next;
}
}
return c;
}

public static void main(String[] args) {
System.out.println("Sample: " + sample(new Stream()));
}
}
```

Output:

```Sample: e
```
8. Jussi Piitulainen said

There was a Praxis exercise on Reservoir Sampling a while back.

9. Nyet Ya Koshka said

The proposed solution stores one character, but it also stores one integer counter worth of state. Not sure it meets the requirements of this shibboleth.

10. John Cowan said

The problem statement is ambiguous. Does it mean that the probability of choosing a character in the running text (a token) is equal, or that the probability of choosing a distinct character (a type) is equal? If the latter, I think the problem truly is impossible, because you must know what characters appear or do not appear.

As for the counter problem, I would weasel it as follows: you are allowed to have only one character, but that says nothing about keeping integer state in addition. (Obviously this doesn’t mean keeping the characters in ints!)

(There’s nothing inherently wrong with inherent or impossible interview questions, especially if they get the answerer to ask counter-questions, something interviewees are normally quite unwilling to do, although they need this skill for actual work.)