Random Total
June 12, 2015
Our task today is to generate a set of k random positive integers that add up to a given totaln. For instance, if we want 4 random numbers that add up to 9, there are six possible results (not counting permutations of them): {6,1,1,1}, {5,2,1,1}, {4,3,1,1}, {4,2,2,1}, {3,3,2,1} and {3,2,2,2}.
An easy way to do that is to choose k−1 random numbers r with 0 ≤ r < n−k, sort them, calculate the differences between them, calculate the difference between 0 and the smallest, calculate the difference between n−k and the largest, shuffle the differences, and add 1 to each; subtracting k and adding 1 ensures that all the numbers are positive. For our example above, choose three random non-negative integers less than n−k = 5, say 1, 3, and 3, the differences are 1, 2, 0 and 2, and the four resulting numbers are 2, 3, 1 and 3, which form the fifth of the six sets shown above.
Your task is to write the program that generates a random set of integers that adds to a given total, as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
I may be incorrect (I’m feeling slow due to a cold running through my house),
but I think it can be done with a simple recursion if you don’t care about
order. Details in Scheme follow.
First, here’s a function that returns a list of
n
copies ofx
:We’ll use that in a base case of our recursion. Next, the main event:
I’m on a computer without a decent editor, so please pardon any typos or errors in
my counting parentheses. As always, let me know if my “solution” really isn’t one.
@Graham: It worked fine, with no typos. But it’s not very good. Here’s a typical run:
> (random-total 17 1000)
(3 1 1 1 1 1 4 3 10 3 6 6 48 92 29 714 77)
The problem is that once you have a “big” number, like 714 in the example above, all the rest of the numbers have to be “little”, and you end up with 5 ones, 3 threes, 1 four, two sixes, and a ten. The function did indeed produce 17 random numbers that add up to 1000, but it’s probably not what you want.
Don’t feel bad. I did the same thing the first time I solved this problem.
@ProgrammingPraxis: Thanks for responding!
Why would something like this not be what you want, though? Shouldn’t all (permutations of) k-partitions of n be equally likely?
In that case, random totals where all the summands are “close” would have less probability than ones with “big” numbers.
Scratch that, my “solution” looks much more biased in experimentation than I expected.
@Graham: I did 10000 trials of the suggested solution on (rand-total 4 9) and came up with these counts:
And here’s the same thing with your version of rand-total:
So there’s no difference. But when n is very much larger than k, the difference between the two methods is obvious. I’m not sure which is “right”, but I think the suggested solution looks “better” based on my expectations of what the program should do.
I guess the right answer is to choose a random number between 1 and n! / (n-k)! — I don’t think I got that right — and map that to a particular permutation. I’ll have to think about that. But I can tell you that the algorithm of the suggested solution is the “computing folklore” algorithm.
@ProgrammingPraxis: cool stuff, and thanks for running the experiments. I need to pick up more of the folklore.
As for the other route you’re thinking of, check out the combinadic number system.
@Praxis: Here’s a paper describing your idea: http://files.figshare.com/484635/Partitioning_paper.pdf
In summary:
1. Calculate p(n,k) = the number of integer partitions of n, having k parts.
2. Generate a random number r, 1 <= r <= p(n,k), then the r-th partition in lex-order is the randomly selected partition.
3. Map r onto a partition:
a. let s = ceil(n / k), the smallest big number that can be in the partition. For p(9, 4), s = 3, corresponding to the partition (3, 2, 2, 2).
b. let t = n – (k – 1), the largest number that can be in the partition. For p(9, 4), t = 6, corresponding to the partition (6, 1, 1, 1).
c. for j in range [s, t]
d. calculate, x = the number of partitions in which the largest number is j.
e. If x = r, you know j is the first digit in the partition.
e. now let n -= j and k -=1 and goto step a to find the next number in the partition
Here’s a solution based on the paper I linked above.
Test:
My python-fu is still very lacking, but hopefully I got in the ballpark of the right answer. Long time follower, first time contributor, so hopefully I get the formatting and whatnot correct.
To test this, I just ran the algorithm a bunch of times on (9,4), assuming that if run enough times I could get all possible combinations, and printed those to screen. If they matched, at least as a litmus test it should be approximately correct.
Output:
6,1,1,1
5,2,1,1
4,3,1,1
4,2,2,1
3,3,2,1
3,2,2,2
Here’s a run with a larger n:
rand_partition(1000, 17) => [281, 133, 85, 83, 71, 66, 66, 45, 40, 34, 31, 26, 15, 11, 8, 4, 1]
Of interest, there are pk(1000, 17) ~ 3.3 x 1020 partitions. The python docs say the random number generator only has about 53 bits of precision, whereas approx 69 bits of precision are needed to index all 3.3 x 1020 partitions . As written, the function above would not be able to choose some of the partitions.
I don’t think the folklore solution generates numbers with the proper distribution. For example, looking at the case for rand_partition(9,4), here is a comparison odds for the largest number in a partition:
largest #true probfolklore prob
60.1670.006
50.1670.124
40.3330.405
30.3330.467
table tags didn’t work
largest true folklore
number prob prob
6 0.167 0.006
5 0.167 0.124
4 0.333 0.405
3 0.333 0.467
@Mike: No, the folklore solution doesn’t generate numbers with the proper distribution; you saw that in my experiment with Graham. But it does generate solutions that “look good” in a way similar to your solution based on a random partition.
If you need random numbers with lots of bits, use a random number generator to generate as many coin flips as necessary, one at a time.
Here’s another way: a k-partition of n either has smallest member 1, or all elements are > 1. If the smallest member is 1, we can remove this, and get a k-1 partition of n-1; if all elements are > 1, we can subtract 1 from each element and get a k-partition of n-k. If p(n,k) is the number of partitions for a given n and k, this gives a recurrence relation:
p(n,k) = p(n-1,k-1) + p(n-k,k)
The base cases are p(0,0) = 1, otherwise p(n,k) = 0 if n <= 0 or k <= 0.
This enables us to both calculate the number of partitions and also map an integer i, 0 <= i < p(n,k) to a partition. Here's some Haskell:
p n k
| k == 0 && n == 0 = 1
| k <= 0 || n <= 0 = 0
| otherwise = p (n-k) k + p (n-1) (k-1)
gen n k i
| k <= 0 || n <= 0 = []
| otherwise =
if i < t then 1:gen(n-1)(k-1) i
else map (+1) (gen(n-k) k (i-t))
where t = p(n-1)(k-1)
parts n k = map (gen n k) [0..(p n k)-1]
main = print (parts 10 5)
— [[1,1,1,1,6],[1,1,1,2,5],[1,1,1,3,4],[1,1,2,2,4],[1,1,2,3,3],[1,2,2,2,3],[2,2,2,2,2]]
p would benefit from memoization, and I've left out actually selecting a random number in the appropriate range.
And we can turn this in to a nice iterative solution – rather than building up the partition directly, we generate the differences between the elements of the partition. C++ for this one:
@matthew — nice!