## Subset Sums

### November 9, 2010

The obvious answer is to enumerate all subsets, then check each to see if it satisfies the summing criteria:

`(define (subsets s)`

(if (null? s) (list ())

(let ((rest (subsets (cdr s))))

(append rest (map (lambda (x) (cons (car s) x)) rest)))))

`(define (greplin3-slow s)`

(length

(filter (lambda (x) (= (car x) (apply + (cdr x))))

(map reverse

(filter (lambda (x) (< 2 (length x)))

(subsets s))))))

`Subsets`

generates all the subsets in a list, then the various `map`

s and `filter`

s massage the list to produce the answer. The problem with this approach is that it requires exponential time to generate all the subsets; for a set of five elements, as in the example problem, that’s fine, but for the 22-element challenge set, it is uncomfortable, taking several minutes to compute a solution. And of course it is impossible to solve a problem with hundreds of elements.

A better approach uses dynamic programming. Let *S* be the set of numbers *S*_{1}, *S*_{2}, …, *S _{k}* and

*D*(

*k*,

*n*) be the number of subsets of

*S*,

_{1}*S*

_{2}, …,

*S*that sum to

_{k}*n*. Then the basic recursion is

*D*(

*k*,

*n*) =

*D*(

*k*−1,

*n*) +

*D*(

*k*−1,

*n*−

*S*). The recursion stops with 0 when either

_{k}*k*or

*n*is 0, except in the case where

*k*=

*n*=0, when the recursion stops with 1 (the case where 0+0=0):

`(define ss #f)`

`(define (d k n)`

(cond ((negative? n) 0)

((negative? k) (if (zero? n) 1 0))

(else (+ (d (- k 1) n)

(d (- k 1) (- n (list-ref ss k)))))))

`(define (greplin3-fast s)`

(set! ss s)

(let loop ((s s) (k 0) (c (- (length s))))

(if (null? s) c

(loop (cdr s) (+ k 1) (+ c (d k (car s)))))))

We save the original set in the global variable `ss`

to avoid having to pass it around all the time. This works well on both the example set and the challenge set:

`> (greplin3 '(1 2 3 4 6))`

4

> (greplin3 '(3 4 9 14 15 19 28 37 47 50 54 56 59 61 70 73 78 81 92 95 97 99))

179

It’s much faster than the naive recursion because instead of forming every possible subset it cuts off early when a partial sum grows too large. But it’s still too slow if *S* is large. The fix is to memoize the *D* function, saving each value as it is computed so that the saved value can be reused when the function is called again with the same arguments:

`(define t #f)`

`(define (hash args) (+ (* (car args) 10000) (cadr args)))`

`(define (d-memo k n)`

(define (return k n v) (t 'insert! (list k n) v) v)

(let ((prev (t 'lookup (list k n))))

(cond (prev prev)

((negative? n) (return k n 0))

((negative? k) (return k n (if (zero? n) 1 0)))

(else (return k n (+ (d-memo (- k 1) n)

(d-memo (- k 1) (- n (list-ref ss k)))))))))

`(define (greplin3-memo s)`

(set! ss s)

(set! t (make-hash hash equal? #f 10000))

(let loop ((s s) (k 0) (c (- (length s))))

(if (null? s) c

(loop (cdr s) (+ k 1) (+ c (d-memo k (car s)))))))

Now we can handle even large sets:

`> (greplin3-memo (primes (expt 2 10)))`

44586565247

On my system, that takes less than two seconds. By comparison, the `-fast`

version takes over three minutes on the set of primes less than 2^{9}, and timing it on the set of primes less than 2^{10} exhausted my patience.

We used filter and hash tables from the Standard Prelude and `primes`

from the exercise on the Sieve of Eratosthenes. You can run the program at http://programmingpraxis.codepad.org/8rQnIhhP. You might like to look at http://news.ycombinator.net/item?id=1773838.

[…] Praxis – Subset Sums By Remco Niemeijer In today’s Programming Praxis exercise, we have to solve the third part of the Greplin challenge, which is to […]

My Haskell solution (see http://bonsaicode.wordpress.com/2010/11/09/programming-praxis-subset-sums/ for a version with comments):

For the 2^10:

~/praxis$ time python greplin3.py

44586565247

real 0m0.592s

user 0m0.456s

sys 0m0.132s

Should’ve read the HOWTO post source code first before posting … sorry, and I should’ve changed f[j] = False to f[j] = 1 even though it doesn’t seem to affect the result.

I am confused that brute force should take as long as you claim. The following code brute-force searches the powerset, but runs in under four seconds (not minutes) on a

cheap (AMD M520) laptop with stklos. I suspect that since your first algorithm computes

and stores the entire powerset in order to then filter it, you are hitting the swap hard.

(define subsets-problem-data

‘(3 4 9 14 15 19 28 37 47 50 54 56 59 61 70 73 78 81 92 95 97 99))

(define (for-each-in-powerset fn set)

(let recur ((arg-so-far ‘()) (set set))

(if (null? set)

(fn arg-so-far)

(begin

(recur arg-so-far (cdr set))

(recur (cons (car set) arg-so-far) (cdr set))))))

(define (problem-3)

(let ((total 0))

(for-each-in-powerset

(lambda (set)

(if (and (pair? set)

(pair? (cdr set))

(= (car set) (apply + (cdr set))))

(set! total (+ total 1))))

subsets-problem-data)

total))

My C Implementation

Follow Link “Download here”