Dept. of Computer Science
Probation Talk:
Assignment Problems in
Spatial Databases
Kamiru Leong Hou, U
Supervisor: Nikos Mamoulis
Outline
• Spatial Assignment Problems
  - Applications
• Continuous Exclusive Closest Pair Monitoring
  - Problem Definition
  - Examples of the solutions
  - Experiments
• Capacity Constrained Assignment Problem
  - Problem Definition
  - Solutions
  - Discussion
Spatial Assignment Problems
• Given two sets of objects, find the k1-to-k2 assignments between the two sets that either
  1. minimize the cost as seen from one set of objects (a stable assignment), or
  2. minimize the total cost (an optimal assignment)
• In spatial applications, both costs can be measured by distance
Spatial Assignment
Problems
[Figure: a stable assignment and an optimal assignment between Set A and Set B]
Stable Assignment
It is a variant of the stable marriage problem
A "stable matching" is
• a matching in which no element of the first set and element of the second set prefer each other to their assigned partners
[Figure: points a, b, c, d]
{b,c} is a stable matching, since neither b nor c can find an element oi of the other set such that
• dist(oi,b/c)<dist(b,c) and
• dist(oi,b/c)<dist(oi,oi.assigned-object)
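The stability condition above can be checked mechanically. The following is a minimal sketch, assuming objects are 2-D points and preference is plain Euclidean distance; the function name `blocking_pairs` and the coordinates in the usage note are illustrative and not taken from the talk.

```python
from math import dist

def blocking_pairs(points_a, points_b, matching):
    """Return pairs (a, b) that are closer to each other than to their partners.

    points_a / points_b map names to coordinates; matching maps a -> b.
    An unassigned object is treated as infinitely far from its partner.
    A matching is stable exactly when the returned list is empty.
    """
    partner_of_b = {b: a for a, b in matching.items()}
    inf = float("inf")
    result = []
    for a, pa in points_a.items():
        cur_a = dist(pa, points_b[matching[a]]) if a in matching else inf
        for b, pb in points_b.items():
            if matching.get(a) == b:
                continue  # already partners
            cur_b = dist(pb, points_a[partner_of_b[b]]) if b in partner_of_b else inf
            d = dist(pa, pb)
            if d < cur_a and d < cur_b:
                result.append((a, b))
    return result
```

With hypothetical points a=(0,0), b=(3,0) in one set and c=(2,0), d=(6,0) in the other, the matching {b-c, a-d} has no blocking pair, while {a-c, b-d} is blocked by (b, c).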
Stable Marriage Problem
The Stable Marriage Problem can be solved by [Gale62][Gusfield89]

men  preference        women  preference
m1   w2, w3, w1        w1     m1, m2
m2   w1, w2, w3        w2     m2, m1
                       w3     m2, m1

men  suggested woman   women  suggested man
m1   w2                w1     m1
m2   w1                w2     m2
                       w3

In general, the stable marriage problem is asymmetric
The results can be symmetric if the preference lists are symmetric
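As an illustration of the proposal-based approach cited above, here is a minimal sketch of the Gale-Shapley algorithm. The function name and data layout are assumptions made for this example; it assumes every receiver ranks every proposer who may propose to it.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Proposer-optimal stable matching; returns {proposer: receiver}."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to try
    engaged = {}                                  # receiver -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] == len(proposer_prefs[p]):
            continue  # p exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p          # r trades up, the old partner is free again
            free.append(current)
        else:
            free.append(p)          # r rejects p
    return {p: r for r, p in engaged.items()}
```

Run with the men as proposers on the preference lists from the slide, it pairs m1 with w2 and m2 with w1, matching the "suggested woman" table.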
Optimal Assignment
It consists of finding a maximum (minimum) weight matching in a weighted bipartite graph.
[Figure: points a, b, c, d with edge distances 5, 12, 2, 5]
The pairs {a,c} and {b,d} are the best assignment, with the minimum total distance
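For very small inputs, the optimal assignment can be found by trying every one-to-one mapping. The sketch below is a brute-force illustration only (it is exponential and is not one of the algorithms discussed in the talk); it assumes 2-D points, Euclidean distance, and that the first set is no larger than the second.

```python
from itertools import permutations
from math import dist

def optimal_assignment(points_a, points_b):
    """Return (mapping a -> b, total distance) with minimum total distance."""
    names_a = list(points_a)
    best_cost, best = float("inf"), None
    # Try every ordered choice of |A| distinct partners from B.
    for chosen in permutations(points_b, len(names_a)):
        cost = sum(dist(points_a[a], points_b[b])
                   for a, b in zip(names_a, chosen))
        if cost < best_cost:
            best_cost, best = cost, dict(zip(names_a, chosen))
    return best, best_cost
```

With hypothetical points a=(0,0), b=(3,0) and c=(2,0), d=(6,0), the optimal assignment is {a-c, b-d} with total distance 5, whereas pairing the closest pair (b, c) first gives a total of 7.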
Application 1
• Given a set of cars and a set of parking slots, which of the following is the best way to assign the slots to the cars?
  1. Stable Assignment
  2. Optimal Assignment
[Figure: two cars and two parking slots, slot a and slot b; one car is told "please go to slot b" and the other "please go to slot a"]
Application 1
[Figure: cars near slot a and slot b; one car is told "please go to slot a", and a car asking "Can I park in?" is answered "Sorry, no!"]
Application 2
• Given a set of wireless routers and a set of clients, which of the following is the best way to let the routers serve the clients?
  1. Stable Assignment
  2. Optimal Assignment
[Figure: routers and clients; one client has no signal]
Each router can serve at most 2 clients
Conclusion
Both problems have received little attention in spatial databases
Although both can be solved by existing algorithms, those algorithms do not exploit spatial features
Continuous Exclusive Closest Pair Monitoring
Exclusive Closest Pair
• For the datasets (A, B)
  - Run CP(A,B) = {(a,b)}
  - A=A-{a}, B=B-{b}
  - Repeat this process until A or B is empty
• ECP can be solved by the stable marriage algorithm
  - but this requires too many computations
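The Exclusive Closest Pair definition (run CP, remove the pair, repeat) translates directly into code. This is a naive sketch that recomputes the closest pair by exhaustive scan in every round, assuming 2-D points; it illustrates the definition, not the optimized solutions of the talk.

```python
from math import dist

def exclusive_closest_pairs(points_a, points_b):
    """Repeatedly take the closest pair (a, b) and remove both objects."""
    remaining_a, remaining_b = dict(points_a), dict(points_b)
    result = []
    while remaining_a and remaining_b:
        # CP(A, B): closest pair among the objects that are still unassigned.
        a, b = min(
            ((a, b) for a in remaining_a for b in remaining_b),
            key=lambda ab: dist(remaining_a[ab[0]], remaining_b[ab[1]]),
        )
        result.append((a, b))
        del remaining_a[a]
        del remaining_b[b]
    return result
```

Each round scans all remaining pairs, so the cost grows quickly with the dataset sizes, which is why an index-based incremental method is attractive.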
Memory Based Solution
CPM (conceptual partitioning monitoring) is the state-of-the-art technique for NN search
• It can be used in an incremental solution of ECP
[Figure: CPM grid around query q, partitioned into rectangles U0-U3, D0-D3, L0-L3, R0-R3, with objects o1, o2, o3]
The first NN candidate (o1) has been found. Stop?
No, since dist(q, o1) > mindist(q, next cell), so the search continues
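The stopping rule in the figure (keep searching while the best distance found exceeds the minimum distance to the next unvisited cells) can be illustrated with a simplified grid scan. This sketch visits cells in square rings around the query cell rather than using CPM's directional rectangles, so it is an approximation of the idea and not CPM itself; the function name, the cell size, and the ring bound are assumptions, and the object set is assumed to be non-empty.

```python
from math import dist, floor

def grid_nn(q, objects, cell=1.0):
    """Nearest object to q, scanning grid cells ring by ring around q's cell.

    Returns (name, distance, number of rings scanned).
    """
    grid = {}
    for name, p in objects.items():
        key = (floor(p[0] / cell), floor(p[1] / cell))
        grid.setdefault(key, []).append(name)
    qx, qy = floor(q[0] / cell), floor(q[1] / cell)
    last_ring = max(max(abs(cx - qx), abs(cy - qy)) for cx, cy in grid)

    best, best_d, ring = None, float("inf"), 0
    while ring <= last_ring:
        # Lower bound on the distance from q to any point in this ring.
        if ring == 0:
            bound = 0.0
        else:
            bound = min(q[0] - (qx - ring + 1) * cell, (qx + ring) * cell - q[0],
                        q[1] - (qy - ring + 1) * cell, (qy + ring) * cell - q[1])
        if best_d <= bound:
            break  # nothing in this ring or beyond can be closer
        for (cx, cy), names in grid.items():
            if max(abs(cx - qx), abs(cy - qy)) == ring:
                for name in names:
                    d = dist(q, objects[name])
                    if d < best_d:
                        best, best_d = name, d
        ring += 1
    return best, best_d, ring
```

In the test data used here, an object in the query's own cell is found first, but it is farther than the boundary of the next ring, so the scan continues and finds a closer object in a neighboring cell.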
Solution 1
[Figure: objects a1, a2 and b1, b2]
(a1,b1) must be in the result
Remove a1 and b1
Enlarge the search distance until all assignments are found
• Drawback
  - Each ai needs a priority queue to store its search order
  - If |A| is very large, there are too many priority queues
Solution 2
[Figure: objects a1, a2 and b1, b2]
(a1,b1) and (a2,b2) are the candidates of the ECP
(a1,b1) is a result, since b1 also sees a1 as its NN
(a2,b2) is not confirmed, since b2 sees a1 as its NN
• Drawback
  - There may be many false alarms
    • too many candidates are found, but too few results can be confirmed
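The candidate-and-confirm idea of Solution 2 can be sketched as a mutual nearest neighbor loop. This is an illustrative, exhaustive-scan version under the assumption of 2-D points; it also records, per round, how many candidates were produced and how many were confirmed, which makes the "false alarm" drawback visible. Names are hypothetical.

```python
from math import dist

def ecp_mutual_nn(points_a, points_b):
    """Confirm a candidate (a, NN(a)) only if NN(NN(a)) is a again.

    Returns (result pairs, [(candidates, confirmed) for each round]).
    """
    rem_a, rem_b = dict(points_a), dict(points_b)
    result, rounds = [], []
    while rem_a and rem_b:
        nn_of_a = {a: min(rem_b, key=lambda b: dist(pa, rem_b[b]))
                   for a, pa in rem_a.items()}
        nn_of_b = {b: min(rem_a, key=lambda a: dist(pb, rem_a[a]))
                   for b, pb in rem_b.items()}
        confirmed = [(a, b) for a, b in nn_of_a.items() if nn_of_b[b] == a]
        rounds.append((len(nn_of_a), len(confirmed)))
        for a, b in confirmed:
            result.append((a, b))
            del rem_a[a]
            del rem_b[b]
    return result, rounds
```

On a layout like the one in the slide (b2 is the NN of a2, but b2 itself is closer to a1), the first round yields two candidates and confirms only (a1, b1); (a2, b2) is confirmed in the second round.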
Discussion
The drawback of solution 1
• too many priority queues if there are too many objects in A
The drawback of solution 2
• fewer results are found in late iterations
Hybrid Solution
• Run solution 2 first, then run solution 1
  - The number of objects in A is reduced by solution 2
  - Solution 2 does not need to run into the late iterations
Experiments
Grid size: 128x128
Strip size: 16
Update module
We focus on the car-to-parking-slot assignment
• Run the Hybrid solution in the initial state
• Monitor the results with the update module
Observation
[Figure: cars and slots a, b at timestamp 1 and timestamp 2; a car is told "please go to slot a"]
The distance between an assigned car and its slot gets shorter over time
The assignment should be changed if and only if the car can find a better slot
Update module
Find a better assignment for the assigned cars
• If there is one, mark the originally assigned slot as empty and the new slot as assigned
Run the Hybrid solution on all unassigned cars (those asking for a parking slot) and all empty slots
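The first step of the update module (moving an assigned car to a better empty slot and freeing its old slot) might look like the sketch below. It covers only this reassignment step, not the subsequent Hybrid run for unassigned cars, and it scans all empty slots exhaustively; the function name and data layout are assumptions.

```python
from math import dist

def update_assignments(cars, slots, assignment, free_slots):
    """Move assigned cars to strictly closer empty slots until no move helps.

    cars / slots map names to coordinates; assignment maps car -> slot.
    Returns the new assignment and the new set of empty slots.
    """
    assignment, free = dict(assignment), set(free_slots)
    improved = True
    while improved:
        improved = False
        for car, slot in list(assignment.items()):
            current = dist(cars[car], slots[slot])
            better = min(free, key=lambda s: dist(cars[car], slots[s]),
                         default=None)
            if better is not None and dist(cars[car], slots[better]) < current:
                free.remove(better)
                free.add(slot)            # the old slot becomes empty
                assignment[car] = better
                improved = True
    return assignment, free
```

Every move strictly shortens one car's distance, so the loop terminates. In the test scenario, c1 keeps s1 while c2 moves from s2 to the closer empty slot s3, leaving s2 empty.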
Update module
[Figure: slots s1, s2, s3 and cars c1, c2, with c1 assigned to s1 and c2 assigned to s2]
c1 cannot change its assignment to s3
c2 can change its assignment to s3, leaving s2 as an empty slot
Repeat until there are no more empty slots
Optimization 1
[Figure: worst assignment (sw, cw) and an empty slot sf]
Assume that (sw,cw) is the worst assignment in the last iteration
Terminate the search from sf, since dist(sf,next cell)>worst dist
Optimization 2
[Figure: assignments (s1, c1) and (s2, c2) and an empty slot sf]
sf does not need to scan c1, since dist(sf, cellc1)>dist(s1,c1)
Experiments
|C|: 40k (number of unassigned cars)
|S_f|: 20k (number of empty slots)
Capacity Constrained
Assignment Problem
Capacity Constrained
Assignment
Given a set of queries, a set of objects, and a parameter k,
find the assignment between queries and objects with minimum cost (distance), such that
• each query is assigned at most k objects
• each object is assigned to at most 1 query
Related works
It is a variant of the optimal assignment problem
Some existing algorithms
 Hungarian Algorithm
 Cost Scaling Algorithm
 Successive Shortest Path Algorithm (SSPA)
Flow Network
[Figure: a flow network and its residual network, before and after augmentation, for a 1-to-2 assignment with exceed node E and deficit node D]
Flow network: each arc (i,j) is labeled "dij, xij/cij"
Residual network: each arc (i,j) is labeled "dij, rij", where
• rij=cij-xij
• rji=xij
• dji=-dij
• arc ij exists if rij>0
Minimum Cost Maximum Flow
Optimal Assignment
[Figure: residual network with exceed node E, queries i and j, objects a-e, and deficit node D; arcs are labeled "d(i,j), rij", with query-object costs 4, 6, 8, 3, 7, 5 and capacity 1]
Each query can be assigned at most 2 objects
Add one exceed node and one deficit node
• The exceed node has arcs to all queries with cost=0 and capacity k=2
• All objects have arcs to the deficit node with cost=0 and capacity 1
• The number of flows from the exceed node to the deficit node is min{k|Q|,|O|}
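The construction above can be written down as a list of arcs. This is a minimal sketch that assumes the distances are given as a nested dictionary and that "E" and "D" are not used as query or object names; the function name is hypothetical.

```python
def build_flow_network(dist_matrix, k):
    """Build the arcs (tail, head, cost, capacity) of the assignment network.

    dist_matrix[q][o] is the distance between query q and object o.
    Returns the arc list and the number of flow units min{k|Q|, |O|}.
    """
    queries = list(dist_matrix)
    objects = sorted({o for row in dist_matrix.values() for o in row})
    arcs = [("E", q, 0, k) for q in queries]              # exceed node -> queries
    arcs += [(q, o, d, 1)
             for q, row in dist_matrix.items()
             for o, d in row.items()]                     # query -> object
    arcs += [(o, "D", 0, 1) for o in objects]             # objects -> deficit node
    max_flow = min(k * len(queries), len(objects))
    return arcs, max_flow
```

For 3 queries, 7 objects, and k=2 this gives 3 + 21 + 7 = 31 arcs and min{6, 7} = 6 units of flow.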
SSPA
[Figure: three snapshots of the residual network (E, queries, objects a-e, D); after the first augmentation, the arc of cost 3 is reversed and carries cost -3]
(1) Find the shortest path from the exceed node to the deficit node
(2) Send a flow from the exceed node to the deficit node along this path, and update the graph
(3) Repeat from (1) until no more augmenting paths can be found
The number of augmenting paths is min{k|Q|, |O|}
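The three steps can be sketched end to end as follows. Because residual arcs may have negative costs, this version uses Bellman-Ford for step (1) instead of Dijkstra; that choice, the function name, and the data layout are assumptions made to keep the example short, and "E" and "D" are assumed not to clash with query or object names.

```python
INF = float("inf")

def sspa(dist_matrix, k):
    """Successive shortest paths; returns ({query: [objects]}, total cost)."""
    queries = list(dist_matrix)
    objects = sorted({o for row in dist_matrix.values() for o in row})
    residual = {n: {} for n in ["E", "D", *queries, *objects]}

    def add_arc(u, v, cost, cap):
        residual[u][v] = [cost, cap]    # forward arc: cost d_ij, residual r_ij
        residual[v][u] = [-cost, 0]     # reverse arc: cost -d_ij, residual x_ij

    for q in queries:
        add_arc("E", q, 0, k)
        for o, d in dist_matrix[q].items():
            add_arc(q, o, d, 1)
    for o in objects:
        add_arc(o, "D", 0, 1)

    total = 0
    while True:
        # (1) Shortest path from E to D over arcs with residual capacity.
        best = {n: INF for n in residual}
        best["E"] = 0
        parent = {}
        for _ in range(len(residual) - 1):
            changed = False
            for u in residual:
                if best[u] == INF:
                    continue
                for v, (cost, cap) in residual[u].items():
                    if cap > 0 and best[u] + cost < best[v]:
                        best[v], parent[v], changed = best[u] + cost, u, True
            if not changed:
                break
        if best["D"] == INF:
            break                       # (3) no augmenting path is left
        # (2) Send one unit of flow along the path and update the graph.
        v = "D"
        while v != "E":
            u = parent[v]
            residual[u][v][1] -= 1
            residual[v][u][1] += 1
            v = u
        total += best["D"]
    assignment = {q: sorted(o for o in dist_matrix[q] if residual[q][o][1] == 0)
                  for q in queries}
    return assignment, total
```

On the 3-query, 7-object distance table used in the SSPA walkthrough of this talk (k=2), the sketch sends 6 units of flow and ends with q1: o1, o2; q2: o5, o6; q3: o3, o4 at a total distance of 101.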
How to find the shortest path?
Dijkstra's algorithm is an efficient algorithm for finding the shortest path
• but it does not allow negative values on the arcs
• since negative values may break the correctness of Dijkstra's algorithm
[Figure: graph with nodes a, b, c, d, e and arc costs -3, 10, -4, 3, 5]
1. S={(a,0)}, a is labeled
2. S={(d,5),(b,10)}, d is labeled
3. S={(e,8),(b,10)}, e is labeled (PROBLEM)
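The failure mode can be reproduced on a tiny graph. The sketch below uses a hypothetical three-node graph (not the one in the figure) and compares a label-setting Dijkstra, which never revisits a labeled node, against Bellman-Ford, which tolerates negative arcs.

```python
import heapq

def dijkstra_label_setting(graph, source):
    """Textbook Dijkstra: a node's distance is final once it is labeled."""
    dist = {source: 0}
    labeled = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in labeled:
            continue
        labeled.add(u)
        for v, w in graph.get(u, {}).items():
            if v not in labeled and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def bellman_ford(graph, source):
    """Reference shortest paths that remain correct with negative arcs."""
    nodes = set(graph) | {v for adj in graph.values() for v in adj}
    dist = {n: float("inf") for n in nodes}
    dist[source] = 0
    for _ in range(len(nodes) - 1):
        for u, adj in graph.items():
            for v, w in adj.items():
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist

# Hypothetical example: b is labeled with cost 2 before the cheaper
# route a -> c -> b (cost 5 - 4 = 1) is discovered.
graph = {"a": {"b": 2, "c": 5}, "c": {"b": -4}}
```

Here Dijkstra reports a distance of 2 for b, while the true shortest distance is 1.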
How to find the shortest path?
• Use reduced costs (d'ij) to solve this problem
• For every arc (i,j), the reduced cost is defined as follows:
  - d'ij=dij-p(i)+p(j)
  - p(i)=p(i)-mindist(i)+mindist(D)
    • for every node i that is labeled by Dijkstra's algorithm
    • p(i) is the potential value of node i
• At the beginning, all p(i)=0
[Figure: three snapshots of a network with nodes E, a, b, c, d, D, showing the reduced costs before and after each potential update]
First run: a, b, and c are labeled by Dijkstra
p(a)=p(a)-0+2=2
p(b)=p(b)-0+2=2
p(c)=p(c)-2+2=0
Second run: a, b, c, and d are labeled
p(a)=p(a)-2+3=3
p(b)=p(b)-0+3=5
p(c)=p(c)-2+3=1
p(d)=p(d)-3+3=0
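Putting the two rules together, SSPA can run Dijkstra on reduced costs and update the potentials after every augmentation. The sketch below follows the update rule on this slide, stopping Dijkstra as soon as D is labeled; the function name, the returned potential history, and the nested-dictionary input are assumptions made for illustration.

```python
import heapq

def sspa_with_potentials(dist_matrix, k):
    """SSPA using Dijkstra on reduced costs d'ij = dij - p(i) + p(j).

    Returns (assignment, total cost, potentials after each augmentation).
    """
    queries = list(dist_matrix)
    objects = sorted({o for row in dist_matrix.values() for o in row})
    nodes = ["E", "D", *queries, *objects]
    residual = {n: {} for n in nodes}       # residual[u][v] = [cost, capacity]

    def add_arc(u, v, cost, cap):
        residual[u][v] = [cost, cap]
        residual[v][u] = [-cost, 0]         # reverse arc, d_ji = -d_ij

    for q in queries:
        add_arc("E", q, 0, k)
        for o, d in dist_matrix[q].items():
            add_arc(q, o, d, 1)
    for o in objects:
        add_arc(o, "D", 0, 1)

    p = {n: 0 for n in nodes}               # at the beginning, all p(i) = 0
    total, history = 0, []
    while True:
        mindist, parent, labeled = {"E": 0}, {}, set()
        heap = [(0, "E")]
        while heap:
            d, u = heapq.heappop(heap)
            if u in labeled:
                continue
            labeled.add(u)
            if u == "D":
                break                       # stop once the deficit node is labeled
            for v, (cost, cap) in residual[u].items():
                reduced = cost - p[u] + p[v]
                if (cap > 0 and v not in labeled
                        and d + reduced < mindist.get(v, float("inf"))):
                    mindist[v], parent[v] = d + reduced, u
                    heapq.heappush(heap, (d + reduced, v))
        if "D" not in labeled:
            break
        for n in labeled:                   # p(i) = p(i) - mindist(i) + mindist(D)
            p[n] = p[n] - mindist[n] + mindist["D"]
        v = "D"
        while v != "E":                     # augment one unit along the path
            u = parent[v]
            total += residual[u][v][0]
            residual[u][v][1] -= 1
            residual[v][u][1] += 1
            v = u
        history.append(dict(p))
    assignment = {q: sorted(o for o in dist_matrix[q] if residual[q][o][1] == 0)
                  for q in queries}
    return assignment, total, history
```

On the 3-query, 7-object distance table of the SSPA walkthrough (k=2), the potentials after the first augmentation are 7 for every query and 0 for o4, and after the fourth they are p(q1)=20, p(q2)=12, p(q3)=20, p(o2)=1, in line with the worked values in the talk.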
SSPA
Each query can be assigned at most 2 objects
[Figure: residual network over queries q1-q3 and objects o1-o7, with reduced costs on the arcs]
dij   o1  o2  o3  o4  o5  o6  o7
q1    20  19  30  35  36  45  56
q2    13  11   9  30  15  17  29
q3    30  22  23   7  42  35  49
(1) Labeled objects = {q1,q2,q3,o4}
p(q1)=p(q1)-mindist(q1)+mindist(D)=0-0+7=7
p(q2)=p(q2)-mindist(q2)+mindist(D)=0-0+7=7
p(q3)=p(q3)-mindist(q3)+mindist(D)=0-0+7=7
p(o4)=p(o4)-mindist(o4)+mindist(D)=0-7+7=0
(2) Labeled objects = {q1,q2,q3,o3}
p(q1)=p(q1)-mindist(q1)+mindist(D)=7-0+2=9
p(q2)=p(q2)-mindist(q2)+mindist(D)=7-0+2=9
p(q3)=p(q3)-mindist(q3)+mindist(D)=7-0+2=9
p(o3)=p(o3)-mindist(o3)+mindist(D)=0-2+2=0
(3) Labeled objects = {q1,q2,q3,o2}
…
SSPA
Each query can be assigned at most 2 objects
[Figure: residual network over queries q1-q3 and objects o1-o7 after step (3), with reduced costs on the arcs]
(4) Discovered objects = {q1,q2,q3,o1,o2}
p(q1)=p(q1)-mindist(q1)+mindist(D)=11-0+9=20
p(q2)=p(q2)-mindist(q2)+mindist(D)=11-8+9=12
p(q3)=p(q3)-mindist(q3)+mindist(D)=11-0+9=20
p(o1)=p(o1)-mindist(o1)+mindist(D)=0-9+9=0
p(o2)=p(o2)-mindist(o2)+mindist(D)=0-8+9=1
SSPA-Range
It is unnecessary to calculate all distances before running SSPA
Assume that we run a range search with a threshold ε
• Run Range-Search(ε) for all queries
• Run SSPA
  - Find the shortest path
  - Send the flow only when the path is guaranteed to be a shortest path; otherwise, break
• ε = ε+k (enlarge the range and repeat)
SSPA-Range
Each query can be assigned at most 2 objects
[Figure: residual network over queries q1-q3 and objects o1-o7, with reduced costs on the arcs]
ε=10
E->q3->o4->D must be the shortest path
dist(E->q3->o4->D)=7 ≤ ε
(1) Labeled objects = {q1,q2,q3,o4} …
E->q2->o3->D must be the shortest path
dist(E->q2->o3->D)=2 ≤ ε-7 (the highest p)
(2) Labeled objects = {q1,q2,q3,o3} …
ε=15
E->q2->o2->D must be the shortest path
dist(E->q2->o2->D)=2 ≤ ε-9 (the highest p)
(3) Labeled objects = {q1,q2,q3,o2} …
Distances discovered by the range search (ε=15):
dij   o1  o2  o3  o4  o5  o6  o7
q1     -   -   -   -   -   -   -
q2    13  11   9   -  15   -   -
q3     -   -   -   7   -   -   -
[Table: reduced costs d'ij and potentials p, animated across steps (1)-(3)]
SSPA-Range
ε=20
dij   o1  o2  o3  o4  o5  o6  o7
q1    20  19   -   -   -   -   -
q2    13  11   9   -  15  17   -
q3     -   -   -   7   -   -   -

Reduced costs before step (4):
d'ij  o1  o2  o3  o4  o5  o6  o7    p
q1     9   8   -   -   -   -   -   11
q2     2   0   2   -   4   6   -   11
q3     -   -   -   4   -   -   -   11
p      0   0   0   0   0   0   0
[Row "dir": arc direction markers]

E->q1->o1->D must be the shortest path
dist(E->q1->o1->D)=9 ≤ ε-11 (the highest p)
(4) Labeled objects = {q1,q2,q3,o1,o2} …

Reduced costs after step (4):
d'ij  o1  o2  o3  o4  o5  o6  o7    p
q1     0   0   -   -   -   -   -   20
q2     1   0   3   -   3   5   -   12
q3     -   -   -  13   -   -   -   20
p      0   1   0   0   0   0   0
[Row "dir": arc direction markers]
SSPA-Range
Repeat the above steps until all assignments are found
It saves I/O accesses (only 30 objects are discovered by range search) in this example, and it can save more if there are more objects and queries
ε=35
dij   o1  o2  o3  o4  o5  o6  o7
q1    20  19  30  35   -   -   -
q2    13  11   9  30  15  17  29
q3    30  22  23   7   -  35   -
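The two ingredients of SSPA-Range, discovering only the pairs within ε and deciding whether a path is guaranteed to be shortest, can be sketched as below. The guarantee test is one reading of the condition "dist ≤ ε - (the highest p)": an undiscovered arc (q,o) has dij > ε, so with non-negative object potentials its reduced cost exceeds ε - p(q), which is at least ε minus the highest query potential. Function names are hypothetical, and the dictionary filter stands in for a range query on a spatial index.

```python
def range_search(dist_matrix, eps):
    """Pairs (q, o) with distance at most eps (stand-in for an index range query)."""
    return {q: {o: d for o, d in row.items() if d <= eps}
            for q, row in dist_matrix.items()}

def is_guaranteed_shortest(path_reduced_cost, eps, query_potentials):
    """True if no undiscovered arc (distance > eps) can give a shorter path."""
    return path_reduced_cost <= eps - max(query_potentials.values())

def discovered_count(dist_matrix, eps):
    """Number of query-object distances known after Range-Search(eps)."""
    return sum(len(row) for row in range_search(dist_matrix, eps).values())
```

With the walkthrough values, a path of reduced cost 2 is accepted at ε=10 when the highest potential is 7, rejected at ε=10 when it is 9, and accepted again once ε is raised to 15.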
SSPA-INN
Each query can be assigned at most 2 objects
[Figure: residual network over queries q1-q3 and objects o1-o7, with reduced costs on the arcs]
First, run NN search for all queries
E->q3->o4->D must be the shortest path
dist(E->q3->o4->D)=7 ≤ possible mindist = 7
(1) Labeled objects = {q1,q2,q3,o4} …
Find next NN for query q3
E->q2->o3->D must be the shortest path
dist(E->q2->o3->D)=2 ≤ possible mindist = 2
(2) Labeled objects = {q1,q2,q3,o3} …
Find next NN for query q2
E->q2->o2->D must be the shortest path
dist(E->q2->o2->D)=2 ≤ possible mindist = 2
(3) Labeled objects = {q1,q2,q3,o2} …
[Table: reduced costs d'ij and potentials p, animated across steps (1)-(3)]
SSPA-INN
Each query can be assigned at most 2 objects
[Figure: residual network over queries q1-q3 and objects o1-o7 after step (3)]
Find next NN for query q2
E->q1->o2->q2->o1->D is not the shortest path
dist(E->…->D)= 10 > mindist = 8
Find next NN for query q1
E->q1->o1->D must be the shortest path
dist(E->q1->o1->D)=9 ≤ mindist = 9
(4) Labeled objects = {q1,q2,q3,o1,o2} …
[Table: reduced costs d'ij and potentials p, animated across step (4)]
Discussion
Objects retrieved before SSPA-INN terminates:
dij   o1  o2  o3  o4  o5  o6  o7
q1    20  19  30   -   -   -   -
q2    13  11   9   -  15  17   -
q3    30  22  23   7   -  35   -
It saves 8/35 I/O accesses in the above example, which is more than SSPA-RANGE
SSPA-INN should be better than SSPA-RANGE, since it accesses the index tree only when it is necessary
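The "only when necessary" behavior rests on an incremental NN interface: each query asks for its next nearest object on demand. The sketch below imitates that interface with a heap over one row of the distance table; in the real setting the objects would be fetched lazily from a spatial index, which is not modeled here. The class and method names are hypothetical.

```python
import heapq

class IncrementalNN:
    """Hands out the objects of one query in increasing distance order."""

    def __init__(self, distances):
        # distances: {object name: distance to this query}
        self._heap = [(d, o) for o, d in distances.items()]
        heapq.heapify(self._heap)
        self.retrieved = 0          # how many objects were actually requested

    def peek_distance(self):
        """Lower bound on the distance of any object not yet retrieved."""
        return self._heap[0][0] if self._heap else float("inf")

    def next_nn(self):
        """Retrieve the next nearest object as (name, distance)."""
        d, o = heapq.heappop(self._heap)
        self.retrieved += 1
        return o, d
```

The `peek_distance` bound plays the role of the "possible mindist" in the walkthrough: a candidate path can be accepted once no unretrieved object could shorten it.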
SSPA-INN-PRUNING
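Observation
[Figure: queries qa, qb and objects oa, oc, oi; one assignment was made in the previous iteration, and an NN search is running; part of the space is shaded blue]
There is no need to search in the blue area, since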
dist(qb,oa)+dist(qa,oi)  dist(qa,oa)+dist(qb,oi)
where oi is any object in the blue area
SSPA-INN-PRUNING
[Figure: queries qa, qb and objects oa, ob, oc; one search finds its next NN in the blue area and the other finds its next NN in the red area; index nodes M1, M2, M3 with entries m1-m9 …]
Conclusion
Three algorithms have been
introduced to solve the capacity
constrained assignment problem
 SSPA-RANGE
 SSPA-INN
 SSPA-INN-PRUNING
All these algorithms aim to reduce the number of I/O accesses by exploiting spatial features
The experiments have not been finished yet