Closest Pair, Part 2

February 13, 2015

In a previous exercise we studied the brute-force method for finding the closest pair of points in a set of points by forming all pairs, computing the distance between each of them, and choosing the smallest. That algorithm has time complexity O(n2). Today we will look at a divide-and-conquer algorithm that has time complexity O(n log n.

The divide-and-conquer algorithm sorts the pairs along their x-coordinates, splits the list of pairs in two, recursively finds the closest pair in the two halves, then compares all points for the closest pair that crosses the dividing line between the two sets of points, taking the minimum of the three possibilities. The third possibility is the tricky one. It won’t do to consider all possible pairs. Instead, we consider only those points less than d distance from the dividing line, where d is the minimum of the distances of the two recursive calls. It can be proved, though we won’t do so here, that the third step takes only linear time, so the entire algorithm is O(n log n).

Your task is to write a program that computes the closest pair of points in an input set using the divide-and-conquer algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2 3

2 Responses to “Closest Pair, Part 2”

  1. matthew said

    I was jumping the gun a bit with my solution to Part I.

    Do you use the fact that the ys list is sorted?

  2. Mike said
    from collections import namedtuple
    from itertools import combinations, product
    
    class Point(namedtuple("Point_", "x y")):
        def mag(self):
            return (self.x ** 2 + self.y ** 2) ** 0.5
        def __sub__(self, other):
            return Point(self.x - other.x, self.y - other.y)
    
    def closest_pair2(points):
        def cp(ps):
            if len(ps) <= 3:
                d, pr = min(((p - q).mag(), (p, q)) for p,q in combinations(ps, 2))
    
            else:
                m = len(ps) // 2
                mid = ps[m]
    
                d, pr = min(cp(ps[:m]), cp(ps[m:]))
    
                for p in ps[m-1::-1]:
                    if p.x < mid.x - d: break
                    
                    for q in ps[m:]:
                        if q.x > p.x + d: break
                        
                        if p.y - d <= q.y <= p.y + d:
                            td = (p - q).mag()
                            if td < d:
                                d = td
                                pr = p, q
    
            return d, pr
    
        return cp(sorted(points))
    

    Takes about 4.5 seconds to find closest pair in a set of 1000000 points–much faster than brute-force method.
    Example:

    >>> from random import randrange
    >>> from timeit import timeit
    >>> def random_point(a, b=None):
        return Point(randrange(a,b), randrange(a,b))
    
    >>> points = set()
    >>> while len(points) < 1000000:
        points.add(random_point(5000000))
    
    >>> timeit(stmt="print(closest_pair2(points))", setup="from __main__ import points, closest_pair2", number=1)
    (6.4031242374328485, (Point(x=185209, y=328746), Point(x=185214, y=328750)))
    4.447064363699084
    >>> 
    
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: