Anna’s Homework

January 31, 2020

Our solution iterates through all the elements in the array, accumulating their counts in a hash table, then iterates through the hash table, outputting all those elements that appear n/3 or more times (though the problem doesn’t say so, there may be two such items, and we want to find both); total time is O(n) for the pass through the array and no more than O(n) for the pass through the hash table, for a total time of O(n), and no more than O(n) space for the has table if all the elements are distinct:

(define (anna xs)
  (let ((ht (make-eq-hashtable)) (n (length xs)))
    (let loop ((xs xs))
      (if (pair? xs)
          (begin (hashtable-update! ht (car xs) add1 0)
                 (loop (cdr xs)))
          (call-with-values
            (lambda () (hashtable-entries ht))
            (lambda (keys vals)
              (let loop ((k 0) (result (list)))
                (if (= k (vector-length keys)) result
                  (if (< (/ n 3) (vector-ref vals k))
                      (loop (+ k 1) (cons (vector-ref keys k) result))
                      (loop (+ k 1) result))))))))))

> (anna '(1 1 2 2 1 1 1 5 6 7))
(1)
> (anna '(1 1 1 1 2 2 2 2 3 4))
(2 1)
> (anna '(1 2 3 4 5))
()

If you’re clever, you can cut off the loop on xs early when it becomes impossible to reach n/3. It is not possible to solve the problem in O(1), as one commenter replied.

You can see the program at https://ideone.com/fltzDn, but it doesn’t run because the version of Scheme at Ideone doesn’t supply R6RS-style hash tables.

Posted by programmingpraxis

Filed in Exercises

8 Comments »

8 Responses to “Anna’s Homework”

Zack said
January 31, 2020 at 12:09 PM
Here is my take on this in Julia: https://pastebin.com/xv0xnX5d

I’m not sure if the code is optimal, but I’m open to suggestions on how to improve it.

Have a nice weekend!

chaw said

January 31, 2020 at 4:16 PM

Here is a simple solution in R7RS Scheme plus a few helpers from
popular SRFIs. The algorithm is the same as the one by
@programmingpraxis, but the implementation is a bit more general
(although it should probably include a item-hash argument too) and
tries to use built-in procedures as much as possible. It is O(n) with
the usual caveat about hash tables.


(import (scheme base)
        (scheme write)
        (only (srfi 1) fold)
        (only (srfi 69)
              make-hash-table hash-table-update!/default  hash-table-fold))

(define (find-popular-items fraction items item=?)
  (let* ((ht (make-hash-table item=?))
         (n (fold (lambda (x nseen)
                    (hash-table-update!/default ht
                                                x
                                                (lambda (c) (+ c 1))
                                                0)
                    (+ nseen 1))
                  0
                  items))
         (threshold (ceiling (* n fraction))))
    (hash-table-fold ht
                     (lambda (key val lst)
                       (if (>= val threshold)
                           (cons key lst)
                           lst))
                     '())))

(display (find-popular-items 1/3 '(3 1 4 1 5 1 1) =))
(newline)
(display (find-popular-items 1/3 '(3 1 4 1 5 9 2) =))
(newline)

Output:


(1)
()

matthew said
January 31, 2020 at 4:43 PM
I haven’t checked it, but https://stackoverflow.com/questions/14955634/number-which-appears-more-than-n-3-times-in-an-array has an algorithm that claims to be O(1) space – it’s a variant on the Boyer-Moore majority voting algorithm that is definitely OK.

Daniel said

January 31, 2020 at 5:20 PM

Here’s a solution in Python.

from collections import Counter
from fractions import Fraction

def anna(l):
    return [x for x, c in Counter(l).items() if c >= Fraction(len(l), 3)]

print(anna([1, 1, 2, 2, 1, 1, 1, 5, 6, 7]))
print(anna([1, 1, 1, 1, 2, 2, 2, 2, 3, 4]))
print(anna([1, 2, 3, 4, 5]))

Output:

[1]
[1, 2]
[]

Daniel said
January 31, 2020 at 5:24 PM
I used Fraction to avoid floating points. Here’s a simpler solution (same output).
```
from collections import Counter

def anna(l):
    return [x for x, c in Counter(l).items() if c * 3 >= len(l)]
```

matthew said

January 31, 2020 at 5:43 PM

Here’s that O(1) algorithm, which does seem to work, plus a test program that generates random inputs to be checked.

import random

# Given array a, length N, return n1 and n2 such that if
# an array entry n occurs more that N/3 times, then
# n = n1 or n = n2.
def onethird(a):
    n1,n2,count1,count2 = 0,0,0,0
    for i, n in enumerate(a):
        #print(i,n,n1,count1,n2,count2)
        if n == n1: count1 += 1
        elif n == n2: count2 += 1
        elif count1 == 0: n1,count1 = n,1
        elif count2 == 0: n2,count2 = n,1
        else: count1 -= 1; count2 -= 1
    return n1,n2

# Generate a test array for the onethird function.
# Since we don't care when there isn't a majority
# item, ensure that there is one, then fill in
# the rest of the array randomly.
def testcase(K):
    N = random.randrange(2,K+1)
    M = random.randrange(1+N//3,N+1)
    assert(3*M > N)
    a = [0]*M
    i,j = 1,M
    while j < N:
        K = random.randrange(1,N-j+1)
        a += [i]*K
        j += K
        i += 1
    assert(len(a) == N)
    for i in range(10):
        # Having constructed the array elements,
        # shuffle and test a few times.
        random.shuffle(a)
        n1,n2 = onethird(a)
        if n1 != 0 and n2 != 0:
            print(a,N,M,n1,n2)
            return

for i in range(1000000): testcase(40)

Steve said

January 31, 2020 at 9:14 PM

Klong version

        l
Code:
        {[a a2];a::x;{a@*x}'a2@&{2<#x}'a2::=a}'[[1 1 2 2 1 1 1 5 6 7] [1 1 1 1 2 2 2 2 3 4] [1 2 3 4 5]];""
[[1] [1 2] []]
""

Explanation:
        l :"list"
[1 2 3 3 4 5 6 7 3 2 2 2]
        =l :"find locations of unique members of list"
[[0] [1 9 10 11] [2 3 8] [4] [5] [6] [7]]
        {2<#x}'a2::=l :"find if those member which occur >2 times = winners"
[0 1 1 0 0 0 0]
        &{2<#x}'a2::=l :"narrow it down to only the positions of the winners"
[1 2]
        a2@&{2<#x}'a2::=l :"get the positions of the winners"
[[1 9 10 11] [2 3 8]]
        {a@*x}'a2@&{2<#x}'a2::=l :"get the numbers"
[2 3]

Steve said

January 31, 2020 at 10:08 PM

Klong version #2 – Need to measure number of matches to length of list, and not an arbitrary number

        {[a a2 n];a::x;n::(#a)%3;{a@*x}'a2@&{~n>#x}'a2::=a}'[[1 1 2 2 1 1 1 5 6 7] [1 1 1 1 2 2 2 2 3 4] [1 2 3 4 5]]
[[1] [1 2] []]

S	M	T	W	T	F	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

Programming Praxis

Anna’s Homework

January 31, 2020

8 Responses to “Anna’s Homework”

Leave a comment

Categories

Archives

Archives

Programming Praxis

Anna’s Homework

January 31, 2020

Share this:

Related

8 Responses to “Anna’s Homework”

Leave a comment

Categories

Archives

Archives