## Anna’s Homework

### January 31, 2020

Our solution iterates through all the elements in the array, accumulating their counts in a hash table, then iterates through the hash table, outputting all those elements that appear n/3 or more times (though the problem doesn’t say so, there may be two such items, and we want to find both); total time is O(n) for the pass through the array and no more than O(n) for the pass through the hash table, for a total time of O(n), and no more than O(n) space for the has table if all the elements are distinct:

```(define (anna xs)
(let ((ht (make-eq-hashtable)) (n (length xs)))
(let loop ((xs xs))
(if (pair? xs)
(begin (hashtable-update! ht (car xs) add1 0)
(loop (cdr xs)))
(call-with-values
(lambda () (hashtable-entries ht))
(lambda (keys vals)
(let loop ((k 0) (result (list)))
(if (= k (vector-length keys)) result
(if (< (/ n 3) (vector-ref vals k))
(loop (+ k 1) (cons (vector-ref keys k) result))
(loop (+ k 1) result))))))))))```
```> (anna '(1 1 2 2 1 1 1 5 6 7))
(1)
> (anna '(1 1 1 1 2 2 2 2 3 4))
(2 1)
> (anna '(1 2 3 4 5))
()```

If you’re clever, you can cut off the loop on `xs` early when it becomes impossible to reach n/3. It is not possible to solve the problem in O(1), as one commenter replied.

You can see the program at https://ideone.com/fltzDn, but it doesn’t run because the version of Scheme at Ideone doesn’t supply R6RS-style hash tables.

Pages: 1 2

### 8 Responses to “Anna’s Homework”

1. Zack said

Here is my take on this in Julia: https://pastebin.com/xv0xnX5d

I’m not sure if the code is optimal, but I’m open to suggestions on how to improve it.

Have a nice weekend!

2. chaw said

Here is a simple solution in R7RS Scheme plus a few helpers from
popular SRFIs. The algorithm is the same as the one by
@programmingpraxis, but the implementation is a bit more general
(although it should probably include a item-hash argument too) and
tries to use built-in procedures as much as possible. It is O(n) with
the usual caveat about hash tables.

``` (import (scheme base) (scheme write) (only (srfi 1) fold) (only (srfi 69) make-hash-table hash-table-update!/default hash-table-fold)) (define (find-popular-items fraction items item=?) (let* ((ht (make-hash-table item=?)) (n (fold (lambda (x nseen) (hash-table-update!/default ht x (lambda (c) (+ c 1)) 0) (+ nseen 1)) 0 items)) (threshold (ceiling (* n fraction)))) (hash-table-fold ht (lambda (key val lst) (if (>= val threshold) (cons key lst) lst)) '()))) (display (find-popular-items 1/3 '(3 1 4 1 5 1 1) =)) (newline) (display (find-popular-items 1/3 '(3 1 4 1 5 9 2) =)) (newline) ```

Output:

``` (1) () ```

3. matthew said

I haven’t checked it, but https://stackoverflow.com/questions/14955634/number-which-appears-more-than-n-3-times-in-an-array has an algorithm that claims to be O(1) space – it’s a variant on the Boyer-Moore majority voting algorithm that is definitely OK.

4. Daniel said

Here’s a solution in Python.

```from collections import Counter
from fractions import Fraction

def anna(l):
return [x for x, c in Counter(l).items() if c >= Fraction(len(l), 3)]

print(anna([1, 1, 2, 2, 1, 1, 1, 5, 6, 7]))
print(anna([1, 1, 1, 1, 2, 2, 2, 2, 3, 4]))
print(anna([1, 2, 3, 4, 5]))
```

Output:

```
[1, 2]
[]
```
5. Daniel said

I used `Fraction` to avoid floating points. Here’s a simpler solution (same output).

```from collections import Counter

def anna(l):
return [x for x, c in Counter(l).items() if c * 3 >= len(l)]
```
6. matthew said

Here’s that O(1) algorithm, which does seem to work, plus a test program that generates random inputs to be checked.

```import random

# Given array a, length N, return n1 and n2 such that if
# an array entry n occurs more that N/3 times, then
# n = n1 or n = n2.
def onethird(a):
n1,n2,count1,count2 = 0,0,0,0
for i, n in enumerate(a):
#print(i,n,n1,count1,n2,count2)
if n == n1: count1 += 1
elif n == n2: count2 += 1
elif count1 == 0: n1,count1 = n,1
elif count2 == 0: n2,count2 = n,1
else: count1 -= 1; count2 -= 1
return n1,n2

# Generate a test array for the onethird function.
# Since we don't care when there isn't a majority
# item, ensure that there is one, then fill in
# the rest of the array randomly.
def testcase(K):
N = random.randrange(2,K+1)
M = random.randrange(1+N//3,N+1)
assert(3*M > N)
a = *M
i,j = 1,M
while j < N:
K = random.randrange(1,N-j+1)
a += [i]*K
j += K
i += 1
assert(len(a) == N)
for i in range(10):
# Having constructed the array elements,
# shuffle and test a few times.
random.shuffle(a)
n1,n2 = onethird(a)
if n1 != 0 and n2 != 0:
print(a,N,M,n1,n2)
return

for i in range(1000000): testcase(40)
```
7. Steve said

Klong version

```        l
Code:
{[a a2];a::x;{a@*x}'a2@&{2<#x}'a2::=a}'[[1 1 2 2 1 1 1 5 6 7] [1 1 1 1 2 2 2 2 3 4] [1 2 3 4 5]];""
[ [1 2] []]
""

Explanation:
l :"list"
[1 2 3 3 4 5 6 7 3 2 2 2]
=l :"find locations of unique members of list"
[ [1 9 10 11] [2 3 8]    ]
{2<#x}'a2::=l :"find if those member which occur >2 times = winners"
[0 1 1 0 0 0 0]
&{2<#x}'a2::=l :"narrow it down to only the positions of the winners"
[1 2]
a2@&{2<#x}'a2::=l :"get the positions of the winners"
[[1 9 10 11] [2 3 8]]
{a@*x}'a2@&{2<#x}'a2::=l :"get the numbers"
[2 3]
```
8. Steve said

Klong version #2 – Need to measure number of matches to length of list, and not an arbitrary number

```        {[a a2 n];a::x;n::(#a)%3;{a@*x}'a2@&{~n>#x}'a2::=a}'[[1 1 2 2 1 1 1 5 6 7] [1 1 1 1 2 2 2 2 3 4] [1 2 3 4 5]]
[ [1 2] []]
```