## Closest Two-Sum To Zero

### June 23, 2015

Before we write a solution, here is a function that returns a random list of k integers with absolute value less than n; it uses `range` and `randint` from the Standard Prelude.

```(define (rand-list k n)   (map (lambda (r) (randint (- 1 n) n)) (range k)))```

The brute-force solution compares each integer to each other integer and keeps track of the smallest pair:

```(define (brute-force xs)   (let ((min-sum (apply max xs)) (min-pair (list)))     (do ((ys xs (cdr ys))) ((null? ys) min-pair)       (do ((zs (cdr ys) (cdr zs))) ((null? zs))         (let ((s (abs (+ (car ys) (car zs)))))           (when (< s min-sum)             (set! min-sum s)             (set! min-pair (list (car ys) (car zs)))))))))```

That takes time O(k2), and is useful for testing. A better algorithm sorts by the absolute value, then scans:

```(define (sort-and-scan xs)   (define (lt? x y) (< (abs x) (abs y)))   (let loop ((xs (sort lt? xs))             (min-sum (apply max xs))             (min-pair (list)))     (if (null? (cdr xs)) min-pair       (let ((s (abs (+ (car xs) (cadr xs)))))         (if (< s min-sum)             (loop (cdr xs) s (list (car xs) (cadr xs)))               (loop (cdr xs) min-sum min-pair))))))```

That takes time O(k log k). Here is an example:

```> (let ((xs (rand-list 7 100)))     (display xs) (newline)     (display (brute-force xs)) (newline)     (display (sort-and-scan xs)) (newline)) (45 -29 -96 -7 -17 72 -60) (72 -60) (-60 72)```

You can run the program at http://ideone.com/s2iFaQ.

I have a question for the mathematicians who read this: What is the expected minimum sum for a given k and n? For instance, given n = 1000 and k = 17, what is the expected minimum sum? In ten trials I had minimum sums 12, 0, 7, 6, 16, 1, 10, 7, 8 and 8, for an average of 7.5. Is that normal?

Pages: 1 2

### 15 Responses to “Closest Two-Sum To Zero”

1. klettier said
```let d = [|45; -29; -96; -7; -17; 72; -60|]

let sum x y = x + y

let rec closestToZero (ar : int array) start (a1 : int) (a2 : int) (sm1: int) =
| x when x < ar.Length - 1 ->
let s = ar.[start]
let mutable a3, a4, sm2 = a1, a2, sm1
for i in [start + 1 .. ar.Length - 1] do
let smt = sum s ar.[i]
if(System.Math.Abs(smt) < System.Math.Abs(sm2)) then
sm2 <- smt
a3 <- s
a4 <- ar.[i]
closestToZero ar (start + 1) a3 a4 sm2
| _ -> (a1, a2, sm1)

closestToZero d 0 0 0 System.Int32.MaxValue;;
```
2. graham said

I think the expected minimum sum is zero, since you’re picking k independent uniformly from [-n, n].
I’m too under-caffeinated to properly work out the full distribution, though. As for “is that normal?”, I
think you may be falling prey to small sample size; with only 10 trials, it’s hard to get an idea of what
the distribution looks like.

Here’s an implementation in vanilla Python:

```def sort_and_scan(xs):
ys = sorted(xs, key=abs)
return min(map(sum, zip(ys, ys[1:])), key=abs)
```

And for conducting a large number of experiments, we turn to the excellent numpy and pandas
libraries:

```from numpy.random import randint
from pandas import DataFrame

def experiment(k, n, t):
df = DataFrame(randint(low=-n, high=n + 1, size=(k, t)))
return df.apply(sort_and_scan).describe()

if __name__ == '__main__':
print(experiment(17, 1000, int(1e5)))
```

In one run, I got a mean of 0.03161 with a standard deviation of 10.806607.

3. graham said

Thinking a bit more about the distribution, it makes sense that the expected min sum would be zero.
A sum of 2n is only possible if two different entries are both plus or minus n, but there are n different
ways to get a sum of zero: plus and minus x for any x in {0, …, n}. This distribution is also symmetric

4. programmingpraxis said

@Graham: Does k matter? If k is very small, there won’t be very many ways to make 0. Is there some minimum k required to guarantee an expected min sum of 0?

5. graham said

My intuition is that k doesn’t matter for the expected minimum sum to be zero; it should always
be zero. The variance probably plays a big role, though; if it’s relatively the same size as k, the
expected min sum may not be all that much more likely than other values. If I get a bit, I’ll take a
stab at the full conditional distribution.

6. Paul said

If you look at the average of the absolute sums you get something close to 7.45. If you take the average of the sum, you come close to zero, of course. I have googled a little to find papers on problems like this, but did not find any yet.

7. Mike said

My solution – sort the list and scan from each end:

```def min_sum(y):
y.sort()
m = float("+inf")
f, r = 0, len(y)-1
while f < r:
t = y[f] + y[r]
if -m < t < m:
m = abs(t)
sf, sr = f, r
if t < 0:
f += 1
elif t >= 0:
r -= 1

return y[sf], y[sr], m
```
8. Mike said

The problem of finding the expected value of the minimum sum, feels like a “birthday paradox” type problem. In particular, the variation in which you have a room of m men and w women, and want to know the probability that a man and a woman share a birthday. E.g., consider men, women to be the random numbers zero, respectively, and their absolute value to be the birthday. See: https://en.wikipedia.org/wiki/Birthday_problem#Generalization_to_multiple_types

9. Mike said

WordPress ate the less than and greater than symbols. I was trying to say let the random integers less than zero represent the men, and the random integers greater than zero represent women. Then the problem maps fairly well onto the case in the Wikipedia article. Not sure how to handle zero, maybe alternate so first zero is male, second is female, etc.

10. Mike said

@praxis – In your code min-sum is the absolute value of the minimum sum, so your min-sum’s are all positive. I think that explains why your expected min-sum wasn’t what you expected.

In @Graham’s experiment, he used the actual minimum sum, which can be positive or negative. Therefor his expected value is 0.

11. Brett Warren said

Solution – generate array; compare each item with every item after it, and compare to the closest pair of integers.

```def closest_to_zero(int_array):
closest_to_zero = [float("inf")]
for index, operand1 in enumerate(int_array):
for operand2 in list(int_array[index+1:]):
summation_ints = [operand1, operand2]
if abs(sum(summation_ints)) < abs(sum(closest_to_zero)):
closest_to_zero = summation_ints
return closest_to_zero

if __name__ == "__main__":
print(closest_to_zero([45, -29, -96, -7, -17, 72, -60]))
```
12. Paul said

I think this problem is close to the problem to find the smallest distance between the points from an array. A nice explanation of the formula for that problem is given in MathOverflow.
We have m cards (in our case 2000) with k marked cards (17 in our case). If we deal them out in the following way we get a result where the cards have at least a given distance d. We will assume that cards dealt from the bottom are safe (not marked). Start to deal from the top. When a marked ard is dealt we deal d “safe” cards from the bottom. When all the cards are dealt we have the desired result, if the (k – 1) d cards at the bottom were indeed safe cards (not marked). The probabillity that no marked cards are in the bottom (k-1)d cards is the probability that we can deal cards with a minimum distance of d. From this you get a minimum distance value of L / (k^2-1), which is exact, if you let go m to infinity – L is the length of the interval (2000 in our case).
For our problem the situation is slightly different, but we can use the same argumentation. Our deal starts from positions 0, -1, +1, -2, +2, ….-2000, +2000. When we hit a marked card and the position is positive, we deal d cards on the negative side (we fill positions -pos-1, …-pos-d). When we hit a marked card and the position is negative, we deal d+1 cards on the positive side (we fill positions -pos, -pos-1, …,-pos-d). So on average we fill d+1/2 positions i.so. d positions for the minimum distance case. After that we can safely fill d cards at the same side (as the marked card) from the top. Tthere are 2 complications, which are not handled yet. The first is, that if the frist marked card is very early more safe cards have to be used, as a very small positive or a few very small negative numbers can result in a low sum too. The second complication is, that if after a marked card within d cards another marked card follows, the number of safe cards that have to be used the fill positions on the other side is less.
All in all, it is to be expected that the minimum distance problem is close to the closest to zero sum problem . As we have to deal slightly more “safe” cards for our problem, is is expected that the expectation value will be higher. As you can see from the table (m=2000) for all values of k the simulation value for our problem is in between the values for the minimum distance for k-1 and k. (all simulations were done with floats i.s.o. ints; the differences are negligable)

```  k    |sum|     |dist|
2 666.432 666.667
3 333.362 250.000
4 186.892 133.333
5 114.160  83.333
6  75.370  57.143
7  52.583  41.667
8  38.801  31.746
9  29.588  25.000
10  23.401  20.202
11  18.884  16.667
12  15.647  13.986
13  13.189  11.905
14  11.245  10.256
15   9.688   8.929
16   8.464   7.843
17   7.449   6.944
18   6.615   6.192
19   5.900   5.556
20   5.311   5.013
29   2.472   2.381
30   2.305   2.225
40   1.284   1.251
49   0.852   0.833
50   0.818   0.800
99   0.206   0.204
100   0.202   0.200
200   0.050   0.050
300   0.022   0.022
```
13. seth2810 said

I think that the easiest way to solve the problem directly to iterate over all the elements

function closest(input) {
var v1, v2, sum, result;

for (var idx1 = input.length; idx1 >= 0; idx1–) {
for (var idx2 = input.length; idx2 >= 0; idx2–) {
if (idx1 != idx2) {
v1 = input[idx1];
v2 = input[idx2];
sum = v1 + v2;

if (sum >= 0 && (!result || result > sum)) {
result = [v1, v2, sum];
}
}
}
}

return result;
}

var input = [45, -29, -96, -7, -17, 72, -60];
console.log(closest(input));

14. Meisam said
```i = 0
j = 1
arr =  [45, -29, -96, -7, -17, 72, -60]
sum = abs (arr+arr)
while i < len(arr):
while j < len(arr):
if sum > abs (arr[i]+arr[j]):
sum = abs (arr[i]+arr[j])
a = arr[i]
b = arr[j]
j +=1
i += 1
j = i +1
print a,b, sum
```
15. r. clayton said

A solution in (Racket) Scheme.