Anna’s Homework
January 31, 2020
Our solution iterates through all the elements in the array, accumulating their counts in a hash table, then iterates through the hash table, outputting all those elements that appear n/3 or more times (though the problem doesn’t say so, there may be two such items, and we want to find both); total time is O(n) for the pass through the array and no more than O(n) for the pass through the hash table, for a total time of O(n), and no more than O(n) space for the has table if all the elements are distinct:
(define (anna xs) (let ((ht (make-eq-hashtable)) (n (length xs))) (let loop ((xs xs)) (if (pair? xs) (begin (hashtable-update! ht (car xs) add1 0) (loop (cdr xs))) (call-with-values (lambda () (hashtable-entries ht)) (lambda (keys vals) (let loop ((k 0) (result (list))) (if (= k (vector-length keys)) result (if (< (/ n 3) (vector-ref vals k)) (loop (+ k 1) (cons (vector-ref keys k) result)) (loop (+ k 1) result))))))))))
> (anna '(1 1 2 2 1 1 1 5 6 7)) (1) > (anna '(1 1 1 1 2 2 2 2 3 4)) (2 1) > (anna '(1 2 3 4 5)) ()
If you’re clever, you can cut off the loop on xs
early when it becomes impossible to reach n/3. It is not possible to solve the problem in O(1), as one commenter replied.
You can see the program at https://ideone.com/fltzDn, but it doesn’t run because the version of Scheme at Ideone doesn’t supply R6RS-style hash tables.
Here is my take on this in Julia: https://pastebin.com/xv0xnX5d
I’m not sure if the code is optimal, but I’m open to suggestions on how to improve it.
Have a nice weekend!
Here is a simple solution in R7RS Scheme plus a few helpers from
popular SRFIs. The algorithm is the same as the one by
@programmingpraxis, but the implementation is a bit more general
(although it should probably include a item-hash argument too) and
tries to use built-in procedures as much as possible. It is O(n) with
the usual caveat about hash tables.
Output:
I haven’t checked it, but https://stackoverflow.com/questions/14955634/number-which-appears-more-than-n-3-times-in-an-array has an algorithm that claims to be O(1) space – it’s a variant on the Boyer-Moore majority voting algorithm that is definitely OK.
Here’s a solution in Python.
Output:
I used
Fraction
to avoid floating points. Here’s a simpler solution (same output).Here’s that O(1) algorithm, which does seem to work, plus a test program that generates random inputs to be checked.
Klong version
Klong version #2 – Need to measure number of matches to length of list, and not an arbitrary number