Closest Two-Sum To Zero

June 23, 2015

Given a random array of integers, both positive and negative, find the pair with sum closest to zero. For instance, in the array [45, -29, -96, -7, -17, 72, -60], the two integers with sum closest to zero are -60 and 72.

Your task is to write a program that finds the two-sum closest to zero. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


Pages: 1 2

15 Responses to “Closest Two-Sum To Zero”

  1. klettier said
    let d = [|45; -29; -96; -7; -17; 72; -60|]
    let sum x y = x + y
    let rec closestToZero (ar : int array) start (a1 : int) (a2 : int) (sm1: int) =
        match start with
        | x when x < ar.Length - 1 ->
            let s = ar.[start]
            let mutable a3, a4, sm2 = a1, a2, sm1
            for i in [start + 1 .. ar.Length - 1] do
                let smt = sum s ar.[i]
                if(System.Math.Abs(smt) < System.Math.Abs(sm2)) then
                    sm2 <- smt
                    a3 <- s
                    a4 <- ar.[i]
            closestToZero ar (start + 1) a3 a4 sm2
        | _ -> (a1, a2, sm1)
    closestToZero d 0 0 0 System.Int32.MaxValue;;
  2. graham said

    I think the expected minimum sum is zero, since you’re picking k independent uniformly from [-n, n].
    I’m too under-caffeinated to properly work out the full distribution, though. As for “is that normal?”, I
    think you may be falling prey to small sample size; with only 10 trials, it’s hard to get an idea of what
    the distribution looks like.

    Here’s an implementation in vanilla Python:

    def sort_and_scan(xs):
        ys = sorted(xs, key=abs)
        return min(map(sum, zip(ys, ys[1:])), key=abs)

    And for conducting a large number of experiments, we turn to the excellent numpy and pandas

    from numpy.random import randint
    from pandas import DataFrame
    def experiment(k, n, t):
        df = DataFrame(randint(low=-n, high=n + 1, size=(k, t)))
        return df.apply(sort_and_scan).describe()
    if __name__ == '__main__':
        print(experiment(17, 1000, int(1e5)))

    In one run, I got a mean of 0.03161 with a standard deviation of 10.806607.

  3. graham said

    Thinking a bit more about the distribution, it makes sense that the expected min sum would be zero.
    A sum of 2n is only possible if two different entries are both plus or minus n, but there are n different
    ways to get a sum of zero: plus and minus x for any x in {0, …, n}. This distribution is also symmetric
    about the middle.

  4. programmingpraxis said

    @Graham: Does k matter? If k is very small, there won’t be very many ways to make 0. Is there some minimum k required to guarantee an expected min sum of 0?

  5. graham said

    My intuition is that k doesn’t matter for the expected minimum sum to be zero; it should always
    be zero. The variance probably plays a big role, though; if it’s relatively the same size as k, the
    expected min sum may not be all that much more likely than other values. If I get a bit, I’ll take a
    stab at the full conditional distribution.

  6. Paul said

    If you look at the average of the absolute sums you get something close to 7.45. If you take the average of the sum, you come close to zero, of course. I have googled a little to find papers on problems like this, but did not find any yet.

  7. Mike said

    My solution – sort the list and scan from each end:

    def min_sum(y):
    	m = float("+inf")
    	f, r = 0, len(y)-1
    	while f < r:
    		t = y[f] + y[r]
    		if -m < t < m:
    			m = abs(t)
    			sf, sr = f, r
    		if t < 0:
    			f += 1
    		elif t >= 0:
    			r -= 1
    	return y[sf], y[sr], m
  8. Mike said

    The problem of finding the expected value of the minimum sum, feels like a “birthday paradox” type problem. In particular, the variation in which you have a room of m men and w women, and want to know the probability that a man and a woman share a birthday. E.g., consider men, women to be the random numbers zero, respectively, and their absolute value to be the birthday. See:

  9. Mike said

    WordPress ate the less than and greater than symbols. I was trying to say let the random integers less than zero represent the men, and the random integers greater than zero represent women. Then the problem maps fairly well onto the case in the Wikipedia article. Not sure how to handle zero, maybe alternate so first zero is male, second is female, etc.

  10. Mike said

    @praxis – In your code min-sum is the absolute value of the minimum sum, so your min-sum’s are all positive. I think that explains why your expected min-sum wasn’t what you expected.

    In @Graham’s experiment, he used the actual minimum sum, which can be positive or negative. Therefor his expected value is 0.

    This IPython notebook shows has more info:

  11. Brett Warren said

    Solution – generate array; compare each item with every item after it, and compare to the closest pair of integers.

    def closest_to_zero(int_array):
    	closest_to_zero = [float("inf")]
    	for index, operand1 in enumerate(int_array):
    		for operand2 in list(int_array[index+1:]):
    			summation_ints = [operand1, operand2]
    			if abs(sum(summation_ints)) < abs(sum(closest_to_zero)):
    				closest_to_zero = summation_ints
    	return closest_to_zero
    if __name__ == "__main__":
    	print(closest_to_zero([45, -29, -96, -7, -17, 72, -60]))
  12. Paul said

    I think this problem is close to the problem to find the smallest distance between the points from an array. A nice explanation of the formula for that problem is given in MathOverflow.
    We have m cards (in our case 2000) with k marked cards (17 in our case). If we deal them out in the following way we get a result where the cards have at least a given distance d. We will assume that cards dealt from the bottom are safe (not marked). Start to deal from the top. When a marked ard is dealt we deal d “safe” cards from the bottom. When all the cards are dealt we have the desired result, if the (k – 1) d cards at the bottom were indeed safe cards (not marked). The probabillity that no marked cards are in the bottom (k-1)d cards is the probability that we can deal cards with a minimum distance of d. From this you get a minimum distance value of L / (k^2-1), which is exact, if you let go m to infinity – L is the length of the interval (2000 in our case).
    For our problem the situation is slightly different, but we can use the same argumentation. Our deal starts from positions 0, -1, +1, -2, +2, ….-2000, +2000. When we hit a marked card and the position is positive, we deal d cards on the negative side (we fill positions -pos-1, …-pos-d). When we hit a marked card and the position is negative, we deal d+1 cards on the positive side (we fill positions -pos, -pos-1, …,-pos-d). So on average we fill d+1/2 positions d positions for the minimum distance case. After that we can safely fill d cards at the same side (as the marked card) from the top. Tthere are 2 complications, which are not handled yet. The first is, that if the frist marked card is very early more safe cards have to be used, as a very small positive or a few very small negative numbers can result in a low sum too. The second complication is, that if after a marked card within d cards another marked card follows, the number of safe cards that have to be used the fill positions on the other side is less.
    All in all, it is to be expected that the minimum distance problem is close to the closest to zero sum problem . As we have to deal slightly more “safe” cards for our problem, is is expected that the expectation value will be higher. As you can see from the table (m=2000) for all values of k the simulation value for our problem is in between the values for the minimum distance for k-1 and k. (all simulations were done with floats i.s.o. ints; the differences are negligable)

      k    |sum|     |dist|
      2 666.432 666.667
      3 333.362 250.000
      4 186.892 133.333
      5 114.160  83.333
      6  75.370  57.143
      7  52.583  41.667
      8  38.801  31.746
      9  29.588  25.000
     10  23.401  20.202
     11  18.884  16.667
     12  15.647  13.986
     13  13.189  11.905
     14  11.245  10.256
     15   9.688   8.929
     16   8.464   7.843
     17   7.449   6.944
     18   6.615   6.192
     19   5.900   5.556
     20   5.311   5.013
     29   2.472   2.381
     30   2.305   2.225
     40   1.284   1.251
     49   0.852   0.833
     50   0.818   0.800
      99   0.206   0.204
    100   0.202   0.200
    200   0.050   0.050
    300   0.022   0.022
  13. seth2810 said

    I think that the easiest way to solve the problem directly to iterate over all the elements

    function closest(input) {
    var v1, v2, sum, result;

    for (var idx1 = input.length; idx1 >= 0; idx1–) {
    for (var idx2 = input.length; idx2 >= 0; idx2–) {
    if (idx1 != idx2) {
    v1 = input[idx1];
    v2 = input[idx2];
    sum = v1 + v2;

    if (sum >= 0 && (!result || result[2] > sum)) {
    result = [v1, v2, sum];

    return result;

    var input = [45, -29, -96, -7, -17, 72, -60];

  14. Meisam said
    i = 0
    j = 1
    arr =  [45, -29, -96, -7, -17, 72, -60]
    sum = abs (arr[0]+arr[1])
    while i < len(arr):
        while j < len(arr):
            if sum > abs (arr[i]+arr[j]):
                sum = abs (arr[i]+arr[j])
                a = arr[i]
                b = arr[j]
            j +=1
        i += 1
        j = i +1
    print a,b, sum
  15. r. clayton said

    A solution in (Racket) Scheme.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: