Solution to the “all-pairs” function
Assume for simplicity that the keys in the leaf nodes reside in an array.
Naïve Approach: (~ nested loops join)
To find all the key pairs within distance ε, we start by comparing the 1st key
with the keys that follow it (2nd, 3rd, …) until we reach a key that is more than ε away.
Then we move to the 2nd key and compare it with the 3rd, 4th, … again
until we reach a key that is more than ε away, and we do the same for every key until we
reach the last key in the array.
This approach is inefficient because it ends up
doing unnecessary comparisons. For example, if the 5th key is the last key
that is within ε distance of the 1st key, then when we move to the 2nd key
we do not need to compare it with the 3rd, 4th and 5th keys,
because they are guaranteed to be within ε distance of the 2nd key
(they are within ε distance of the 1st key, and the 2nd key in a
B+tree is never smaller than the 1st key, so it lies between the 1st key and each of them).
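The naïve approach can be sketched as follows. This is a minimal illustration, not the assignment's reference code: the names `all_pairs_naive`, `abs_dist`, `keys` and `eps` are hypothetical, and integer keys with absolute difference stand in for the string distance. The comparison counter is there only to make the wasted work visible.

```python
def abs_dist(a, b):
    # Assumption: a stand-in distance for the example, not the course's strdist.
    return abs(a - b)


def all_pairs_naive(keys, eps, dist=abs_dist):
    """Return (pairs, comparisons) for a sorted list of keys."""
    pairs = []
    comparisons = 0
    for i in range(len(keys)):
        # Restart the scan right after keys[i] every time.
        for j in range(i + 1, len(keys)):
            comparisons += 1
            if dist(keys[i], keys[j]) > eps:
                break  # keys are sorted, so later keys are even farther away
            pairs.append((keys[i], keys[j]))
    return pairs, comparisons
```

On the sorted keys `[1, 2, 3, 4, 5, 9, 10]` with `eps = 4`, this sketch finds 12 pairs using 17 comparisons.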
Expected (in terms of efficiency) solution
Sliding Window Approach: (~ sort-merge join)
k1 k2 k3 k4 k5 k6 k7 k8 k9 ….   [k2 to k5: ≤ ε from k1; k6: > ε from k1]
Compare k1 with k2, k3, k4, k5, k6. Assume strdist(k1, k6) > ε. Then print pairs (k1,k2), (k1,k3), (k1,k4), and (k1,k5). Move to k2.
k1 k2 k3 k4 k5 k6 k7 k8 k9 ….   [k3 to k5: ≤ ε from k2; k6 onward: ?]
Print pairs (k2,k3), (k2,k4), (k2,k5) directly (points were deducted for unnecessary comparisons of k2 to k3, k4 and k5).
Continue comparing k2 to k6, k7, … until a key more than ε away is reached. Do the same for k3 and for all the remaining keys. That
is, if kx is the last key within ε distance of k2, print pairs (k3,k4), (k3,k5), …, (k3,kx) without doing comparisons, etc.
Here is a better illustration of the Sliding Window Approach:
k1 k2 k3 k4 k5 k6 k7 k8 ….   [window of width ε starting at k1]
Print pairs (k1,k2), (k1,k3), (k1,k4), and (k1,k5).
k1 k2 k3 k4 k5 k6 k7 k8 ….   [window of width ε starting at k2]
Move to k2 (slide the window). Print pairs (k2,k3), (k2,k4), and (k2,k5)
without comparisons. We still have to compare k2 to k6 and k7 (and k8, to see
where to stop). Move to k3 and so on.
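The sliding window walk-through above can be sketched as follows. This is only an illustration under stated assumptions: the names `all_pairs_sliding`, `abs_dist`, `keys`, `eps` and `end` are hypothetical, and integer keys with absolute difference stand in for the string distance. The shortcut is valid only if the distance never shrinks as keys get farther apart in sorted order, which holds for this stand-in.

```python
def abs_dist(a, b):
    # Assumption: a stand-in distance for the example, not the course's strdist.
    return abs(a - b)


def all_pairs_sliding(keys, eps, dist=abs_dist):
    """Return (pairs, comparisons) for a sorted list of keys."""
    pairs = []
    comparisons = 0
    end = 0  # index of the last key known to be within eps of the current key
    for i in range(len(keys)):
        if end < i:
            end = i  # empty window: start it at the current key
        # Only keys beyond the old window end need a real comparison.
        while end + 1 < len(keys):
            comparisons += 1
            if dist(keys[i], keys[end + 1]) > eps:
                break
            end += 1
        # Everything inside the window is printed without comparing.
        for j in range(i + 1, end + 1):
            pairs.append((keys[i], keys[j]))
    return pairs, comparisons
```

On the sorted keys `[1, 2, 3, 4, 5, 9, 10]` with `eps = 4`, this sketch finds 12 pairs using 11 comparisons. Each key costs at most one failed comparison plus the comparisons that advance `end`, and `end` never moves backwards.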
Notes
• Typical setting: All students used the naïve
approach, doing unnecessary comparisons.
• The sliding window approach can be
implemented in recursive solutions by keeping track of an “end-pointer” as well as a “start-pointer”.
• Keys-in-range were implemented efficiently by
most students by searching for the first key and
then scanning until the second key is reached.
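The keys-in-range strategy from the last note can be sketched over the same sorted-array simplification. The names `keys_in_range`, `low` and `high` are hypothetical, and treating both bounds as inclusive is an assumption. In an actual B+tree the search step would be a descent to the leaf and the scan would follow the leaf pointers; here the standard `bisect_left` call stands in for the search.

```python
from bisect import bisect_left


def keys_in_range(keys, low, high):
    """Return all keys k with low <= k <= high from a sorted list."""
    # Search step: locate the first key that is >= low.
    start = bisect_left(keys, low)
    result = []
    # Scan step: walk forward until a key exceeds the upper bound.
    for k in keys[start:]:
        if k > high:
            break
        result.append(k)
    return result
```

For example, `keys_in_range([1, 3, 5, 7, 9], 3, 7)` returns `[3, 5, 7]`.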