Algorithm Design Techniques
Divide and Conquer
Closest Pair Problem
Given a set P of n ≥ 2 points in 2D space, find the “closest” pair of points. Closest usually refers to Euclidean distance: for p1 = (x1, y1) and p2 = (x2, y2), the distance is sqrt((x1 - x2)^2 + (y1 - y2)^2).
Why would I want to do something like this?
Consider air or sea traffic control. We continually recalculate the speed, direction and
position of vehicles in relation to one another to detect potential collisions.
How many pairs of points are there in a set containing n points?
n = 3: P = {ab, ac, bc}, |P| = 3
n = 4: P = {ab, ac, ad, bc, bd, cd}, |P| = 6
n = 5: P = {ab, ac, ad, ae, bc, bd, be, cd, ce, de}, |P| = 10
In general there are nC2 = n(n - 1)/2 = (n^2 - n)/2 ≈ n^2/2 pairs.
A brute force method would check all pairs in an exhaustive search, at a cost of O(n^2).
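As a concrete illustration, here is a minimal C sketch of such a brute-force check. It mirrors the Point and ClosestPair structures introduced in the next section; the helper names (Dist, BruteForce) are the ones the later pseudocode assumes, but this body is only a sketch, not the lecture's exact implementation.

#include <math.h>
#include <float.h>

struct Point       { int x; int y; };
struct ClosestPair { float dMin; struct Point p1; struct Point p2; };

/* Euclidean distance between two points. */
float Dist(struct Point a, struct Point b)
{
    float dx = (float)(a.x - b.x);
    float dy = (float)(a.y - b.y);
    return sqrtf(dx * dx + dy * dy);
}

/* Exhaustively compare all n(n - 1)/2 pairs: O(n^2). */
struct ClosestPair BruteForce(struct Point p[], int n)
{
    struct ClosestPair best;
    best.dMin = FLT_MAX;
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++) {
            float d = Dist(p[i], p[j]);
            if (d < best.dMin) {
                best.dMin = d;
                best.p1 = p[i];
                best.p2 = p[j];
            }
        }
    return best;
}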
Closest Pair Algorithm
Data Structures
struct Point
{
int x;
int y;
};
struct ClosestPair
{
float dMin;
Point p1;
Point p2;
};
Point pX[], pY[], pXL[], pYL[], pXR[], pYR[], pYC[];
ClosestPair clPL, clPR, clP;
int n, middle, x, y;
Important preprocessing step
Before the first call to ClosestPair, pX and pY must be sorted: pX is sorted in ascending order by x-coordinate, then pX is copied to pY and pY is sorted in ascending order by y-coordinate.
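One possible way to do this preprocessing in C, assuming the Point struct above and that pX already holds the n input points (the comparator and function names here are illustrative, not part of the lecture's code):

#include <stdlib.h>
#include <string.h>

/* Compare points by x-coordinate. */
int CmpByX(const void *a, const void *b)
{
    const struct Point *p = a, *q = b;
    return (p->x < q->x) ? -1 : (p->x > q->x);
}

/* Compare points by y-coordinate. */
int CmpByY(const void *a, const void *b)
{
    const struct Point *p = a, *q = b;
    return (p->y < q->y) ? -1 : (p->y > q->y);
}

/* Sort pX by x, copy it to pY, then sort pY by y: O(n log n) overall. */
void Preprocess(struct Point pX[], struct Point pY[], int n)
{
    qsort(pX, n, sizeof(struct Point), CmpByX);
    memcpy(pY, pX, n * sizeof(struct Point));
    qsort(pY, n, sizeof(struct Point), CmpByY);
}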
General Idea
[Figure: the point set is split by vertical partitions into regions numbered 1 to 6, with example closest pairs labelled "both on left", "straddling", and "both on right".]
In any pair of partitions, the closest pair lies to the left of the border, to the right of the border, or has one point on the left of the border and one point on the right of the border (i.e. straddling).
Divide
Start by partitioning the points in “half” using the 1st partition, giving a left half and a right half. Further partition the left and right “halves” using the 2nd and 3rd partitions, giving four sub-regions.
Conquer / Combine
In the two sub-regions of the left half, calculate the distance of the closest pair in each.
Combine these two sub-regions: the candidate distance is the minimum of the two sub-region distances.
Determine whether there are any closer pairs that straddle the border between the two sub-regions (one only needs to check those points less than the candidate distance from the border).
The distance is now the minimum over the two sub-regions and the border strip, so the closest pair in the left half has been found.
Repeat for the right half as was done for the left half to find the closest pair in the right half.
Finally, combine the left and right halves in the same way. The resulting distance is the minimum for the whole set of points: the closest pair has been found.
Algorithm
ClosestPair (pX, pY, n)
    if n <= 3
        clP = BruteForce(pX, n)
    else
        // Divide: split around the median x-coordinate
        middle = (n - 1) / 2
        x = pX[middle].x
        y = pX[middle].y
        for i = 0 to middle
            pXL[i].x = pX[i].x
            pXL[i].y = pX[i].y
        j = 0
        for i = middle + 1 to n - 1
            pXR[j].x = pX[i].x
            pXR[j].y = pX[i].y
            j++
        // Split pY into pYL and pYR, preserving the order by y-coordinate
        j = 0
        k = 0
        for i = 0 to n - 1
            if pY[i].x < x or (pY[i].x == x and pY[i].y <= y and j <= middle)
                pYL[j].x = pY[i].x
                pYL[j].y = pY[i].y
                j++
            else
                pYR[k].x = pY[i].x
                pYR[k].y = pY[i].y
                k++
        // Conquer: solve each half recursively and keep the better result
        clPL = ClosestPair(pXL, pYL, j)
        clPR = ClosestPair(pXR, pYR, k)
        clP = Min(clPL, clPR)
        // Combine: collect the points within clP.dMin of the border, in y order
        j = 0
        for i = 0 to n - 1
            if Abs(pY[i].x - x) <= clP.dMin
                pYC[j].x = pY[i].x
                pYC[j].y = pY[i].y
                j++
        // Check straddling pairs; stop once the y gap exceeds clP.dMin
        k = j
        for i = 0 to k - 2
            for j = i + 1 to k - 1
                if Abs(pYC[j].y - pYC[i].y) > clP.dMin
                    break
                d = Dist(pYC[i], pYC[j])
                if d < clP.dMin
                    clP.dMin = d
                    clP.p1.x = pYC[i].x
                    clP.p1.y = pYC[i].y
                    clP.p2.x = pYC[j].x
                    clP.p2.y = pYC[j].y
    return clP
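The pseudocode also relies on two trivial helpers, Min and Abs (Dist and BruteForce were sketched earlier). The names come from the pseudocode; the bodies below are an assumed C rendering:

/* Return whichever candidate pair has the smaller distance. */
struct ClosestPair Min(struct ClosestPair a, struct ClosestPair b)
{
    return (a.dMin <= b.dMin) ? a : b;
}

/* Absolute value, used for the strip test on coordinate differences. */
float Abs(float v)
{
    return (v < 0.0f) ? -v : v;
}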
Complexity
Start with pX and pY. Both arrays need to be sorted in a pre-processing step. This can be done at a cost of O(n log n).
The recursion splits pX and pY in “half” O(log n) times (the depth of the recursion). Copying values from pY into pYL and pYR is done sequentially, placing each element of pY into pYL or pYR, at a cost of O(n) per level. Therefore, the total cost of the recursive splitting of points is O(n log n).
For the merge step, where points are checked to determine whether any pairs straddle the border, the cost is actually O(n). The idea is that for each point within clP.dMin of the border, only a constant number of nearby points (in y order) need to be checked.
The overall complexity of the closest pair algorithm is therefore O(n log n).
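The same bound follows from the usual divide-and-conquer recurrence (a sketch, ignoring constants): the recursive part satisfies T(n) = 2T(n/2) + O(n), which solves to T(n) = O(n log n), and the O(n log n) pre-sort does not change the overall bound.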
Selection Problem
Given an unsorted collection of n elements, select the Kth smallest.
One approach would be to sort the collection using one of the best sorting algorithms (which would take O(n log n) time) and then select the element in the Kth position.
Can we achieve O(n) running time for any value of K?
An algorithm called Randomized Quickselect runs in O(n) in the average case.
The algorithm works much like the Quicksort algorithm. The difference comes after the partition phase: we do not need to examine both partitions, but rather just one, because we know which one contains the Kth element.
Randomized Quickselect Algorithm
RandomizedQuickSelect (a, left, right, K)
    if left == right
        return a[left]
    pivot = RandomizedPartition(a, left, right)
    i = pivot - left + 1        // rank of the pivot within a[left..right]
    if K == i
        return a[pivot]
    else if K < i
        return RandomizedQuickSelect(a, left, pivot - 1, K)
    else
        return RandomizedQuickSelect(a, pivot + 1, right, K - i)
Additional Function:
RandomizedPartition (a, left, right)
    r = rand(left, right)           // random index in [left, right]
    Exchange(a[right], a[r])        // move the chosen pivot to the end
    pivot = a[right]
    i = left - 1
    j = right
    while true
        while a[++i] < pivot
            // empty body
        while pivot < a[--j]
            // empty body
        if i < j
            Exchange(a[i], a[j])
        else
            break
    Exchange(a[i], a[right])        // place the pivot in its final position
    return i
The worst case running time is O(n^2) and occurs when the partitions repeatedly fall around the largest or smallest elements.
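For reference, here is a compact, runnable C sketch of the same idea. It uses the Lomuto partition scheme rather than the two-pointer partition above; the function names match the pseudocode, but the bodies are an assumed implementation, not the lecture's exact code.

#include <stdlib.h>

void Exchange(int *a, int *b)
{
    int t = *a; *a = *b; *b = t;
}

/* Lomuto partition around a randomly chosen pivot; returns the pivot's final index. */
int RandomizedPartition(int a[], int left, int right)
{
    int r = left + rand() % (right - left + 1);
    Exchange(&a[r], &a[right]);              /* move the random pivot to the end */
    int pivot = a[right];
    int i = left - 1;
    for (int j = left; j < right; j++)
        if (a[j] < pivot)
            Exchange(&a[++i], &a[j]);
    Exchange(&a[i + 1], &a[right]);          /* place the pivot in its final position */
    return i + 1;
}

/* Return the Kth smallest element (K is 1-based) of a[left..right]. */
int RandomizedQuickSelect(int a[], int left, int right, int K)
{
    if (left == right)
        return a[left];
    int pivot = RandomizedPartition(a, left, right);
    int i = pivot - left + 1;                /* rank of the pivot within this range */
    if (K == i)
        return a[pivot];
    else if (K < i)
        return RandomizedQuickSelect(a, left, pivot - 1, K);
    else
        return RandomizedQuickSelect(a, pivot + 1, right, K - i);
}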
Example
Use Randomized Quickselect to find the 5th smallest element of
81  43  13  31  92  65  26  57  0  75
Select pivot p = 65. After partitioning, 65 lands in position 7 (six elements are smaller than it). Since K = 5 < 7, recurse on the left partition, which holds 57, 13, 43, 31, 26 and 0.
Select pivot p = 26. 26 lands in position 3 of this sub-array. Since K = 5 > 3, recurse on the right partition (43, 31, 57) with K = 5 - 3 = 2.
Select pivot p = 57. 57 lands in position 3 of this sub-array. Since K = 2 < 3, recurse on the left partition (43, 31).
Select pivot p = 43. 43 lands in position 2, which equals K, so the 5th smallest element is 43.
Final state of the array, with each chosen pivot in its final sorted position:
13  0  26  31  43  57  65  92  75  81
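For reference, a small driver for the C quickselect sketch above reproduces this result (assuming it is compiled together with that sketch):

#include <stdio.h>
#include <time.h>

int main(void)
{
    int a[] = {81, 43, 13, 31, 92, 65, 26, 57, 0, 75};
    srand((unsigned)time(NULL));             /* seed the random pivot choice */
    /* Prints 43, the 5th smallest element. */
    printf("5th smallest = %d\n", RandomizedQuickSelect(a, 0, 9, 5));
    return 0;
}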