Two Amazon Interview Questions
February 26, 2016
For the first question, the obvious answer is a hash table. We’ll take the hash table from the Standard Prelude, adapt it to a set (key only, no associated value) and add a random
operator:
(define (make-set hash eql? size) (let ((table (make-vector size (list)))) (lambda (message . args) (if (eq? message 'random) (let loop ((index (randint size))) (if (pair? (vector-ref table index)) (fortune (vector-ref table index)) (loop (randint size)))) (let* ((key (car args)) (index (modulo (hash key) size)) (bucket (vector-ref table index))) (case message ((member?) (let loop ((bucket bucket)) (cond ((null? bucket) #f) ((eql? (car bucket) key) #t) (else (loop (cdr bucket)))))) ((insert) (vector-set! table index (let loop ((bucket bucket)) (cond ((null? bucket) (list key)) ((eql? (car bucket) key) bucket) (else (cons (car bucket) (loop (cdr bucket)))))))) ((delete) (vector-set! table index (let loop ((bucket bucket)) (cond ((null? bucket) bucket) ((eql? (car bucket) key) (cdr bucket)) (else (cons (car bucket) (loop (cdr bucket)))))))) (else (error 'set "unrecognized message"))))))))
Here are some examples:
> (define s (make-set identity = 7)) > (s 'insert 1) > (s 'insert 2) > (s 'insert 3) > (s 'insert 4) > (s 'insert 5) > (s 'member? 2) #t > (s 'delete 2) > (s 'member? 2) #f > (s 'random) 4
For the second question, we collect counts of each item in a first pass, then make a second pass where we return the first item with an even count:
(define (first-even lt? xs) (define (eql? a b) (if (lt? a b) #f (not (lt? b a)))) (let* ((evens (map car (filter (lambda (x) (even? (cdr x))) (uniq-c eql? (sort lt? xs))))) (firsts (filter (lambda (x) (member x evens)) xs))) (if (pair? firsts) (car firsts) #f)))
Here are some examples:
> (first-even < '(1 2 1 3 1 2 1)) 1 > (first-even < '(1 2 3 4 5)) #f
You can run the programs at http://ideone.com/XO5Q1W, where you will also see contributions from the Standard Prelude.
Hash tables are not constant time. They’re amortized O(1), which is very different – try implementing a hard-realtime system with hash tables.
@Janne – I’d have thought it depends on the situation, if you either don’t need to resize because you know in advance the required table size or you spread the cost of resizing the table over many insertions, and if your hash function is perfect or in practice produces a good distribution, then most people would be happy calling that constant (or at least bounded) time.
Anyway, the suggested random selection seems to be randomly selecting a (non-empty) hash table bucket, then selecting an element in that bucket – this isn’t going to produce a uniform distribution. If you are going for linked list buckets, better to allocate the list nodes from a separate array, then randomly select from that array rather than from the hash bucket array.
Does random operate in constant time? Is not it O(n)?
The answer given here has a nice way of making a uniform selection:
http://stackoverflow.com/questions/8629447/efficiently-picking-a-random-element-from-a-chained-hash-table
If there are N buckets and L is the length of the longest hash table bucket chain, first randomly choose a non-empty bucket, then choose j from 0 to L-1 (inclusive), if bucket has a jth element in its chain, return that, otherwise choose another j.
Assuming items are spread randomly across the hash buckets, average runtime should be O(1).
I’m getting it wrong now: that should be, if there is no jth element in the bucket, start again by choosing another bucket.