Top-k Spatial Joins
Po-Sung

Survey
• What is a top-k spatial join?

What is a top-k spatial join?
• Map overlays incur high execution cost
• Retrieve the k objects
• Processing such a join is expensive
[Figure: data sets A and B]

Top-k Spatial Joins
• Apply a conventional spatial join algorithm on the two data sets A and B
• Count the number of output pairs in which each object participates
• Return the k objects with the maximum intersection counts

Top-1 join
Result format: <id, count, IL>
• {A1, 3, [B1, B2, B3]}
• {a1, 3, [b1, b5, b10]}
[Figure: entries A1, A2, B1, B2, B3 and objects a1, b1, b5, b10]

Definition 1
$\mathit{maxnum}(e) = C^{e.\mathit{level}}$
• e is an intermediate entry of Ra
• C is the node capacity
• e.level is the level of the node that contains e
• maxnum(e) is an upper bound for the number of objects in the subtree of e
• maxnum(A1) = 5

Definition 2
$\mathit{count}(e) = \sum_{e_i \in R_b,\ e_i \text{ intersects } e} \mathit{maxnum}(e_i)$
• If e is a leaf entry of Ra: the number of objects of Rb that intersect e
• If e is an intermediate entry: an upper bound of the actual count of any object in e
• count(A1) = 3 × 5 = 15

Example
• A1.IL = [B1, B2, B5]
• A2.IL = [B5]
• B1.IL = [A1]
• B5.IL = [A1, A2]
[Figure: entries A1, A2, B1, B2, B5]

Example (cont.)
Heap H with elements E: <e, count, list>
• e is the entry (of Ra or Rb)
• $\mathit{count}(e) = \sum_{e_i \in e.IL} \mathit{maxnum}(e_i)$
• list is e.IL

Example (cont.)
• a1.IL = [b1, b5, b10], a1.key = 3
• a2.IL = [b5], a2.key = 1
[Figure: entries A1, A2, B1, B2, B3 and objects a1, a2, b1, b5, b10]

Pseudocode
TS (Rtree Ra, Rtree Rb, int k)
  Join RTa and RTb to get intersecting pairs (ea, eb)
  // For each entry e that appears in a pair, build e.IL, compute e.count
  // and insert <e, e.count, e.IL> into a heap H (sorted by e.count)
  While number of reported objects < k
    • e = de-heap(H)
    • If e is a leaf entry // actual object
        – Report (<e.id, e.count, e.IL>)
    • Else // e is an intermediate entry pointing to node n
        • For each ei ∈ e.IL
            • Join n and ni // ni is pointed to by ei
            • For each intersecting entry pair (e', e'i)
                • Add e'i to e'.IL
        • Compute e'.count
        • If e'.count > pruning condition // i.e., count of the k-th best object found so far
            • Insert <e', e'.count, e'.IL> into H
            • If e' is a leaf entry // object
                • Update pruning condition
  • return

Algorithm
• Visiting order: count
• Pruning condition

Multiple Expansions Method (ME)

Two binary search trees

Full join vs. semijoin

Comparison environment
1. MCB x LA returns 16,477,244 intersection pairs
2. SKEW x LA returns 19,657,973 intersection pairs

Node accesses versus k (full join).

CPU time versus k (full join).

Total cost versus k (full join, 10 percent cache).

Node accesses versus k (semijoin).

CPU time versus k (semijoin).

Total cost versus k (semijoin, 10 percent cache).

Conclusions
• Bottom-k queries
• Top-k distance (semi) join
• Top-k nearest neighbor (semi) joins
    • Compute the NN (in A) of all objects of B
    • Sort the resulting pairs (ob, oa), where oa is the NN of ob, with respect to oa
    • Report the top-k objects of A
[Figure: data sets A and B]
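
The baseline described on the "Top-k Spatial Joins" slide (join, count, return the k best) and the bounds of Definitions 1 and 2 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the TS algorithm: a nested loop stands in for the spatial join, only the objects of A are ranked, and leaf nodes are assumed to sit at level 0 so that a single object has maxnum 1. The names Rect, maxnum, count_bound and top_k_join are hypothetical, and the rectangle coordinates are made up so that the intersection lists match the a1/a2 example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical stand-in for an object's MBR)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "Rect") -> bool:
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def maxnum(capacity: int, level: int) -> int:
    """Definition 1: at most capacity**level objects lie under an entry.

    Assumption: leaf nodes are at level 0, so an object has maxnum 1.
    """
    return capacity ** level


def count_bound(il_levels: List[int], capacity: int) -> int:
    """Definition 2: sum of maxnum over the entries of an intersection list.

    il_levels holds the level of each intersecting entry (an assumption
    made here to avoid building a real R-tree).
    """
    return sum(maxnum(capacity, level) for level in il_levels)


def top_k_join(a: Dict[str, Rect], b: Dict[str, Rect],
               k: int) -> List[Tuple[str, int, List[str]]]:
    """Baseline: join A with B, count pairs per A object, keep the k best.

    Each result has the slide's <id, count, IL> shape.
    """
    results = []
    for a_id, a_rect in a.items():
        il = [b_id for b_id, b_rect in b.items() if a_rect.intersects(b_rect)]
        results.append((a_id, len(il), il))
    results.sort(key=lambda t: t[1], reverse=True)
    return results[:k]


# Made-up coordinates chosen to mirror the a1/a2 example.
A = {"a1": Rect(0, 0, 10, 10), "a2": Rect(8, 8, 12, 12)}
B = {"b1": Rect(1, 1, 2, 2), "b5": Rect(9, 9, 11, 11), "b10": Rect(4, 4, 5, 5)}

print(top_k_join(A, B, 1))         # top-1 join over the objects of A
print(maxnum(5, 1))                # node capacity 5, entry at level 1
print(count_bound([1, 1, 1], 5))   # three intersecting level-1 entries
```

With these inputs the top-1 result is a1 with count 3 and IL [b1, b5, b10], and the two bounds evaluate to 5 and 15, matching maxnum(A1) and count(A1) on the definition slides. The baseline touches every pair, which is the cost that the heap ordering and pruning condition in the pseudocode are meant to avoid.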