Anna’s Homework
January 31, 2020
Our solution iterates through all the elements in the array, accumulating their counts in a hash table, then iterates through the hash table, outputting all those elements that appear n/3 or more times (though the problem doesn’t say so, there may be two such items, and we want to find both); total time is O(n) for the pass through the array and no more than O(n) for the pass through the hash table, for a total time of O(n), and no more than O(n) space for the has table if all the elements are distinct:
(define (anna xs)
(let ((ht (make-eq-hashtable)) (n (length xs)))
(let loop ((xs xs))
(if (pair? xs)
(begin (hashtable-update! ht (car xs) add1 0)
(loop (cdr xs)))
(call-with-values
(lambda () (hashtable-entries ht))
(lambda (keys vals)
(let loop ((k 0) (result (list)))
(if (= k (vector-length keys)) result
(if (< (/ n 3) (vector-ref vals k))
(loop (+ k 1) (cons (vector-ref keys k) result))
(loop (+ k 1) result))))))))))
> (anna '(1 1 2 2 1 1 1 5 6 7)) (1) > (anna '(1 1 1 1 2 2 2 2 3 4)) (2 1) > (anna '(1 2 3 4 5)) ()
If you’re clever, you can cut off the loop on xs early when it becomes impossible to reach n/3. It is not possible to solve the problem in O(1), as one commenter replied.
You can see the program at https://ideone.com/fltzDn, but it doesn’t run because the version of Scheme at Ideone doesn’t supply R6RS-style hash tables.
Here is my take on this in Julia: https://pastebin.com/xv0xnX5d
I’m not sure if the code is optimal, but I’m open to suggestions on how to improve it.
Have a nice weekend!
Here is a simple solution in R7RS Scheme plus a few helpers from
popular SRFIs. The algorithm is the same as the one by
@programmingpraxis, but the implementation is a bit more general
(although it should probably include a item-hash argument too) and
tries to use built-in procedures as much as possible. It is O(n) with
the usual caveat about hash tables.
(import (scheme base) (scheme write) (only (srfi 1) fold) (only (srfi 69) make-hash-table hash-table-update!/default hash-table-fold)) (define (find-popular-items fraction items item=?) (let* ((ht (make-hash-table item=?)) (n (fold (lambda (x nseen) (hash-table-update!/default ht x (lambda (c) (+ c 1)) 0) (+ nseen 1)) 0 items)) (threshold (ceiling (* n fraction)))) (hash-table-fold ht (lambda (key val lst) (if (>= val threshold) (cons key lst) lst)) '()))) (display (find-popular-items 1/3 '(3 1 4 1 5 1 1) =)) (newline) (display (find-popular-items 1/3 '(3 1 4 1 5 9 2) =)) (newline)Output:
I haven’t checked it, but https://stackoverflow.com/questions/14955634/number-which-appears-more-than-n-3-times-in-an-array has an algorithm that claims to be O(1) space – it’s a variant on the Boyer-Moore majority voting algorithm that is definitely OK.
Here’s a solution in Python.
from collections import Counter from fractions import Fraction def anna(l): return [x for x, c in Counter(l).items() if c >= Fraction(len(l), 3)] print(anna([1, 1, 2, 2, 1, 1, 1, 5, 6, 7])) print(anna([1, 1, 1, 1, 2, 2, 2, 2, 3, 4])) print(anna([1, 2, 3, 4, 5]))Output:
I used
Fractionto avoid floating points. Here’s a simpler solution (same output).from collections import Counter def anna(l): return [x for x, c in Counter(l).items() if c * 3 >= len(l)]Here’s that O(1) algorithm, which does seem to work, plus a test program that generates random inputs to be checked.
import random # Given array a, length N, return n1 and n2 such that if # an array entry n occurs more that N/3 times, then # n = n1 or n = n2. def onethird(a): n1,n2,count1,count2 = 0,0,0,0 for i, n in enumerate(a): #print(i,n,n1,count1,n2,count2) if n == n1: count1 += 1 elif n == n2: count2 += 1 elif count1 == 0: n1,count1 = n,1 elif count2 == 0: n2,count2 = n,1 else: count1 -= 1; count2 -= 1 return n1,n2 # Generate a test array for the onethird function. # Since we don't care when there isn't a majority # item, ensure that there is one, then fill in # the rest of the array randomly. def testcase(K): N = random.randrange(2,K+1) M = random.randrange(1+N//3,N+1) assert(3*M > N) a = [0]*M i,j = 1,M while j < N: K = random.randrange(1,N-j+1) a += [i]*K j += K i += 1 assert(len(a) == N) for i in range(10): # Having constructed the array elements, # shuffle and test a few times. random.shuffle(a) n1,n2 = onethird(a) if n1 != 0 and n2 != 0: print(a,N,M,n1,n2) return for i in range(1000000): testcase(40)Klong version
l Code: {[a a2];a::x;{a@*x}'a2@&{2<#x}'a2::=a}'[[1 1 2 2 1 1 1 5 6 7] [1 1 1 1 2 2 2 2 3 4] [1 2 3 4 5]];"" [[1] [1 2] []] "" Explanation: l :"list" [1 2 3 3 4 5 6 7 3 2 2 2] =l :"find locations of unique members of list" [[0] [1 9 10 11] [2 3 8] [4] [5] [6] [7]] {2<#x}'a2::=l :"find if those member which occur >2 times = winners" [0 1 1 0 0 0 0] &{2<#x}'a2::=l :"narrow it down to only the positions of the winners" [1 2] a2@&{2<#x}'a2::=l :"get the positions of the winners" [[1 9 10 11] [2 3 8]] {a@*x}'a2@&{2<#x}'a2::=l :"get the numbers" [2 3]Klong version #2 – Need to measure number of matches to length of list, and not an arbitrary number
{[a a2 n];a::x;n::(#a)%3;{a@*x}'a2@&{~n>#x}'a2::=a}'[[1 1 2 2 1 1 1 5 6 7] [1 1 1 1 2 2 2 2 3 4] [1 2 3 4 5]] [[1] [1 2] []]