Ullman’s Puzzle
December 7, 2010
This puzzle is due to Jeffrey Ullman:
Given a list of n real numbers, a real number t, and an integer k, determine if there exists a k-element subset of the original list of n real numbers that is less than t.
For instance, given the list of 25 real numbers 18.1, 55.1, 91.2, 74.6, 73.0, 85.9, 73.9, 81.4, 87.1, 49.3, 88.8, 5.7, 26.3, 7.1, 58.2, 31.7, 5.8, 76.9, 16.5, 8.1, 48.3, 6.8, 92.4, 83.0, 19.6, t = 98.2 and k = 3, the 3-element subset 31.7, 16.5 and 19.6 sums to 67.8, which is less than 98.2, so the result is true.
This is a puzzle, so you’re not allowed to look at the suggested solution until you have your own solution.
Your task is to write a function that makes that determination. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
My Haskell solution (see http://bonsaicode.wordpress.com/2010/12/07/programming-praxis-ullman%E2%80%99s-puzzle/ for a version with comments):
My O(n log n) solution is similar to the one given. I did keep one quick check in that’s not
strictly necessary, though: if k times the first (smallest) element is bigger than t, then we
needn’t bother adding things up. This is a remnant from when I first started working on the
problem. I was trying to find things that would reduce the amount of work needed in
O(n log n) time or less before I came upon the final answer. Since it’s a quick check for small
first values, I’ve kept it in.
My Ocaml commented solution: http://www.gleocadie.net/?p=444&lang=en
This is O(n lg(k)) instead of O(n lg(n)).
Here’s my try , not sure if it is right way to do
Also have not checked if it is correct (will do that later).
(define (solve nums t k)
(let ([p-num (/ t k)])
(let ([l1 (filter (lambda (n) (= n p-num)) nums)])
(if (empty? l1)
#f
(let ([len (length l1)])
(if (>= len k)
(take l1 k)
(let ([new-k (- k len)])
(let ([solved (solve l2 (- t (apply + l1)) new-k)])
(if solved
(append l1 solved )
#f)))))))))
oops formatting
hmm , i guess mine is in quadratic , not good :)
A slightly un-optimized and simplistic Python version:
{{{
def ullman(numbers, threshold, k):
count = 0
for x in numbers:
count += 1
if (k – 1) < count and count <= len(numbers):
b = count – k # subsequence start
e = count # subsequence end
if sum(numbers[b:e]) < threshold:
return True
return False
}}}
Formatting fail…
Keramida:
sorry but your code does not return true for this case:
You only select the k-subsets of contiguous elements of the set, which are not all k-subsets of the original set.
An improved version of my previous post. Thanks to F. Carr for the inspiration.
In F#.
Gregory LEOCADIE, you’re right. I didn’t realize that we were looking for non-contiguous subsets too.
;) just call me Greg ;)
I’m not really sure about the complexity of this, but I think it’s O(n k). Also I’m not sure about the take & drop from the standard prelude—my leading & without were O(k) but assumed n > k.
The J solution (and algorithm) is pretty simple: sort the data set, and test if the sum of the K first items sum is less than T.
NB. Three inputs:
NB. DATA – the data set of numbers (in the example, 25 real numbers)
NB. K – the grouping factor (in the example, 3)
NB. LIMIT – the sum limit (in the example, 98.2
up =: dyad def 0
NB. LIMIT K up DATA
({.x)>+/({:x){.(/:{])y
)
NB. Returns 1 for true and 0 for false
testdata =. (25?1000)%1+25?100
testdata
35.8148 1.26667 3.93814 51.8571 3.73034 24.7838 3.74324 4.6 58.1667 5.63291 12.425 12.8333 4.64789 49.125 17.8261 10.1528 14.4167 6.32 11.2436 218 26.5294 4.64211 4.35366 26.9524 4.79104
<./testdata NB. mininum
1.26667
>./testdata NB. maximum
218
(+/ % #)testdata NB. average
24.7117
98.2 3 up testdata
1
10 3 up testdata
1
5 3 up testdata
0
3{.(/:{])testdata NB. 3 smallest numbers
1.26667 3.73034 3.74324
Just getting used to Q :)
/ data
n:18.1 55.1 91.2 74.6 73.0 85.9 73.9 81.4 87.1 49.3 88.8 5.7 26.3 7.1 58.2 31.7 5.8 76.9 16.5 8.1 48.3 6.8 92.4 83.0 19.6
t:98.2
k:3
/ func
df:{y>+/[asc[x][til z]]}
df[n;t;k]
Another J solution.
ullman=: 3 : 0
‘t k list’ =. y
list =. /:~ list NB. ascending sort
set =. k {. list NB. first k items
U =. t > +/set NB. test
U ; U#set NB. result; empty set if false
)
ullman 12.3; 4; 145.3 3 4 4 5 655 _3 6
+-+——–+
|1|_3 3 4 4|
+-+——–+
ullman 2.3; 4; 145.3 3 4 4 5 655 _3 6
+-++
|0||
+-++
I got asked a variant of this question in a phone interview for Google. I basically used the sorting solution above, but my interviewer told me there’s a better solution. Instead, if you use a data structure like a BST or a heap, size-bounded at k elements, and replacing the largest element when you find something smaller than it, you can reduce the complexity from O(n log n) to O(n log k). In the worst case scenario, where k = n, the complexity will become O(n log n), but otherwise it will be quicker. Thanks for the coding puzzles, and keep up the good work!
the ‘better’ solution will not even get the first line coded by the time {t > +/ k take sort V} gets coded. One could use the remaining time to find an idiomatic shortcut to k take sort V, which is what Matt described with the data structure solution, like:
t > +/ k take_sort V
Note, that kind of optimization not being obvious with an algoloid language.