## Subset Sums

### November 9, 2010

The obvious answer is to enumerate all subsets, then check each to see if it satisfies the summing criteria:

```(define (subsets s)   (if (null? s) (list ())     (let ((rest (subsets (cdr s))))       (append rest (map (lambda (x) (cons (car s) x)) rest)))))```

```(define (greplin3-slow s)   (length     (filter (lambda (x) (= (car x) (apply + (cdr x))))       (map reverse         (filter (lambda (x) (< 2 (length x)))           (subsets s))))))```

`Subsets` generates all the subsets in a list, then the various `map`s and `filter`s massage the list to produce the answer. The problem with this approach is that it requires exponential time to generate all the subsets; for a set of five elements, as in the example problem, that’s fine, but for the 22-element challenge set, it is uncomfortable, taking several minutes to compute a solution. And of course it is impossible to solve a problem with hundreds of elements.

A better approach uses dynamic programming. Let S be the set of numbers S1, S2, …, Sk and D(k,n) be the number of subsets of S1, S2, …, Sk that sum to n. Then the basic recursion is D(k,n) = D(k−1,n) + D(k−1,nSk). The recursion stops with 0 when either k or n is 0, except in the case where k=n=0, when the recursion stops with 1 (the case where 0+0=0):

`(define ss #f)`

```(define (d k n)   (cond ((negative? n) 0)         ((negative? k) (if (zero? n) 1 0))         (else (+ (d (- k 1) n)                  (d (- k 1) (- n (list-ref ss k)))))))```

```(define (greplin3-fast s)   (set! ss s)   (let loop ((s s) (k 0) (c (- (length s))))     (if (null? s) c       (loop (cdr s) (+ k 1) (+ c (d k (car s)))))))```

We save the original set in the global variable `ss` to avoid having to pass it around all the time. This works well on both the example set and the challenge set:

```> (greplin3 '(1 2 3 4 6)) 4 > (greplin3 '(3 4 9 14 15 19 28 37 47 50 54 56 59 61 70 73 78 81 92 95 97 99)) 179```

It’s much faster than the naive recursion because instead of forming every possible subset it cuts off early when a partial sum grows too large. But it’s still too slow if S is large. The fix is to memoize the D function, saving each value as it is computed so that the saved value can be reused when the function is called again with the same arguments:

`(define t #f)`

`(define (hash args) (+ (* (car args) 10000) (cadr args)))`

```(define (d-memo k n)   (define (return k n v) (t 'insert! (list k n) v) v)   (let ((prev (t 'lookup (list k n))))     (cond (prev prev)           ((negative? n) (return k n 0))           ((negative? k) (return k n (if (zero? n) 1 0)))           (else (return k n (+ (d-memo (- k 1) n)             (d-memo (- k 1) (- n (list-ref ss k)))))))))```

```(define (greplin3-memo s)   (set! ss s)   (set! t (make-hash hash equal? #f 10000))   (let loop ((s s) (k 0) (c (- (length s))))     (if (null? s) c       (loop (cdr s) (+ k 1) (+ c (d-memo k (car s)))))))```

Now we can handle even large sets:

```> (greplin3-memo (primes (expt 2 10))) 44586565247```

On my system, that takes less than two seconds. By comparison, the `-fast` version takes over three minutes on the set of primes less than 29, and timing it on the set of primes less than 210 exhausted my patience.

We used filter and hash tables from the Standard Prelude and `primes` from the exercise on the Sieve of Eratosthenes. You can run the program at http://programmingpraxis.codepad.org/8rQnIhhP. You might like to look at http://news.ycombinator.net/item?id=1773838.

Pages: 1 2

### 7 Responses to “Subset Sums”

1. […] Praxis – Subset Sums By Remco Niemeijer In today’s Programming Praxis exercise, we have to solve the third part of the Greplin challenge, which is to […]

```import Data.List

greplin3 :: [Integer] -> Int
greplin3 = length . filter (\s -> sum (init s) == last s) .
tail . subsequences

main :: IO ()
main = print \$ greplin3 [ 3, 4, 9,14,15,19,28,37,47,50,54
,56,59,61,70,73,78,81,92,95,97,99]
```
3. turuthok said

For the 2^10:

```n = 2**10
f =  * n
f, f = 0, 0
p = []
for i in xrange(2, n):
if f[i]:
for j in xrange(i+i, n, i): f[j] = False
p.append(i)

m = {}

def go(idx, sum):
if sum >= n: return 0
if idx >= len(p): return f[sum]
if (idx, sum) in m: return m[(idx, sum)];

r = go(idx+1, sum) + go(idx+1, sum+p[idx])
m[(idx, sum)] = r
return r

print go(0, 0) - len(p)
```

~/praxis\$ time python greplin3.py
44586565247

real 0m0.592s
user 0m0.456s
sys 0m0.132s

4. turuthok said

Should’ve read the HOWTO post source code first before posting … sorry, and I should’ve changed f[j] = False to f[j] = 1 even though it doesn’t seem to affect the result.

5. Christopher Oliver said

I am confused that brute force should take as long as you claim. The following code brute-force searches the powerset, but runs in under four seconds (not minutes) on a
cheap (AMD M520) laptop with stklos. I suspect that since your first algorithm computes
and stores the entire powerset in order to then filter it, you are hitting the swap hard.

(define subsets-problem-data
‘(3 4 9 14 15 19 28 37 47 50 54 56 59 61 70 73 78 81 92 95 97 99))

(define (for-each-in-powerset fn set)
(let recur ((arg-so-far ‘()) (set set))
(if (null? set)
(fn arg-so-far)
(begin
(recur arg-so-far (cdr set))
(recur (cons (car set) arg-so-far) (cdr set))))))

(define (problem-3)
(let ((total 0))
(for-each-in-powerset
(lambda (set)
(if (and (pair? set)
(pair? (cdr set))
(= (car set) (apply + (cdr set))))
(set! total (+ total 1))))
subsets-problem-data)
total))

6. Khanh Nguyen said
```let power_set (list:int list) =
let rec helper l result =
match l with
| [] -> result
| x::xs ->
let ll = List.map (fun s -> x::s) result
helper xs (List.append result ll)
helper list [[]]

let subset_sums (list:int list) =
power_set list
|> List.tail //the first set is the empty set
|> List.filter (fun x -> List.head x = (List.tail x |> List.sum))
|> List.length

subset_sums [3;4;9;14;15;19;28;37;47;50;54;56;59;61;70;73;78;81;92;95;97;99]
```