Disjoint Sets

Disjoint Sets (cont)
Disjoint Sets
n elements
Partitioned into k sets (disjoint sets)
13
21
21
X = {x1 , x2 ,K , xn }
S1 , S 2 , K, S k
88
15
15
4
2
66
20
20
12
Operations
16
14 18
14
9
Find-Set(x) - return set containing x
11
3
Union(x,y) - make a new set by combining the sets
containing x and y (destroying them)
5
7
17
MakeSet(x) - add a new element x and make it into
a singleton set
Union(4,16)
Find-Set(4)
1
19
10
→8
1
Each set is identified by its
representative element x
Disjoint Sets (cont)
13
21
8
15
4
2
6
20
12
Disjoint Sets (cont)
13
21
16
3
5
19
10
17
7
Find-Set(4)
→ 13
Find-Set(9)
→9
1
8
15
4
2
6
20
12
14 18
9
11
3
{13,21,15,2,4,20 ,12}
{8,6,16 ,14 ,18}
{9} {1} {11}
{10 ,3,5,17 ,7,19}
16
14 18
9
11
3
5
10
17
19
7
1
22
MakeSet(22)
2
4
Disjoint Sets (cont)
Disjoint Sets (cont)
Version 1
Each set is a linked list (with pointer to tail)
List head is the representative
Each element has a pointer to the head
{13,21,15,2,4,20 ,12}
{8 , 6 ,16 ,14 ,18 }
{9 } {11 } {1}
{10 ,3,5,17 ,7 ,19 }
Version 1
A sequence of m operations can take Θ(m 2 ) time.
for i←1 to m/2
MakeSet(i)
endfor
13
21
12
2
15
for i←2 to m/2
4
20
m/2
∑ i − 1 = Θ( m
i-1 pointer updates
Union(i-1,i)
8
6
18
14
1
16
11
i-1
9
10
5
19
...
i-2
i
17
7
3
i-1
i-2
...
2
1
7
Disjoint Sets (cont)
Version 1
(weighted-union heuristic)
Find-Set(x) - traverse head pointer
O(1)
Version 2
MakeSet(x) - create
O(1)
Modify Version 1 by keeping track of the size of
each set and appending the smaller to the larger.
x
Union(x,y) - Append S x to S y and update head pointers
O( S x )
12
13
2
15
20
)
i
1
2
5
Disjoint Sets (cont)
2
i =2
endfor
Each time x’s head pointer is updated the size of x’s set doubles.
A head pointer has been updated at most lg n times if there are a
total of n elements in the sets.
A sequence of m instructions in which there are n MakeSets
takes time O(m + n lg n).
4
21
6
8
Disjoint Sets (cont)
Disjoint Sets (cont)
Version 3
{13,21,15,2,4,20 ,12}
{8 , 6 ,16 ,14 ,18 }
{9 } {11 } {1}
{10 ,3,5,17 ,7 ,19 }
Each set is a tree with parent pointers only
The root of the tree is the representative
12
2
15
3
21
10
1
4
17
14 18
13
16
12
2
15
3
5
19
5
→ 13
11
6
20
10
14 18
16
Find-Set(4)
8
8
6
FindSet(x) - Traverse parent pointers to reach root
(till x=parent(x))
Worst-case is Θ(n)
13
9
11
Version 3
17
7
9
19
20
21
1
4
7
9
Disjoint Sets (cont)
MakeSet(x) - create
Version 3
Union(x,y) - Link(FindSet(x),FindSet(y))
O(1) + time for 2 FindSets
8
11
6
10
14 18
Union(4,16)
13
16
12
2
17
9
19
20
Union(x,y)
p(x)←x
r1←FindSet(x)
rank(x) ← 0
r2←FindSet(y)
Link(x,y)
if r1≠r2 Link(r1,r2)
if rank(x)>rank(y)
FindSet(x)
if x≠p(x)
else p(x)←y
15
then p(x)←FindSet(p(x))
if rank(x)=rank(y) return(p(x))
then rank(y)++
21
1
7
(weighted-union and path compression)
MakeSet(x)
then p(y)←x
3
5
Disjoint Sets (cont)
Version 4
O(1)
x
11
m MakeSets, FindSets, Unions ⇒ at most 3m MakeSets, FindSets,
Links
4
10
12
Disjoint Sets (cont)
Version 4
Disjoint Sets (cont)
Version 4 Analysis
Let size(x) be the number of nodes in the subtree rooted at x.
FindSet(x) - Traverse parent pointers to reach root
(till x=parent(x))
Find-Set(4)
8
Lemma If x is a root, then size(x) ≥ 2 rank(x).
Proof: By induction on the number of Link operations.
Base: 0 Links. All nodes are roots with size 1 and rank 0.
Step: Assume property holds after i Links and consider the
Link i+1, Link(x,y).
Case 1 rank(x)>rank(y) (or vice versa). Then rank(x) does
not change and size(x) increases so inequality still holds.
Case 2 rank(x)=rank(y)=r.
Then before the link size(x) ≥ 2 r and size(y) ≥ 2 r.
→ 13
11
6
10
14 18
13
16
12
2
15
3
5
17
9
19
20
21
After link rank(y)=r+1 and size(y) ≥ 2 r + 2 r = 2 r +1 = 2 rank ( y ).
1
7
4
QED
13
Disjoint Sets (cont)
15
Disjoint Sets (cont)
Version 4 Analysis
Version 4 Analysis
Lemma If x≠p(x) then rank[x]<rank[p(x)].
Lemma For any r ≥ 0, the # of nodes of rank r is at most
Proof: By induction on the number of operations.
Base: 0 operations. Trivially true since no elements.
Proof: A node gets rank r as the result of a Link of two
trees of rank r-1.
Step: Assume property holds after i operations and consider
operation i+1.
MakeSet No change except for new node with x=p(x).
This node will have size at least
2r
n
.
2r
at that time.
No node can contribute more than once to a rank change from
r-1 to r.
n
nodes can attain rank r.
2r
FindSet If a node gets a new parent, it is the root of its tree
and must have even higher rank than the previous parent.
Since there are n nodes, at most
Link The node that gets a new parent, gets one with higher rank.
Corollary For any node x, rank ( x) ≤ lg n .
QED
QED
14
16
Disjoint Sets (cont)
Version 4 Analysis
Version 4 Analysis
Disjoint Sets (cont)
First try at bounding cost of m instructions
x
Maximum cost of a FindSet is O(lg n).
A sequence of m operations costs at most O(m lg n).
x
Over estimate since FindSets are destructive, reducing height
of trees.
Cannot find sequence with Ω(m) FindSets each with cost Ω(lg n).
Before FindSet(x)
After FindSet(x)
Idea: Combine first and second approaches.
Third try at bounding cost of m instructions
Idea: Bound number of parent updates (as in Version 2).
Charge some parent updates to the node and the
rest to the operation.
17
Disjoint Sets (cont)
Version 4 Analysis
19
Disjoint Sets (cont)
Second try at bounding cost of m instructions
How many times can node x of rank r get a new parent?
At most lg n - r times. (Its parents increase in rank from r+1 to lg n.)
The cost of FindSets is at most O(m + # of parent updates).
lg n
lg n
lg n
n
r
1
≤ n lg n∑ r − n∑ r .
r
2
2
2
r =0
r =0
r =0
1  
lg n + 2 

≤ n lg n 2 − lg n  − n 2 − lg n .
2  
2


≤ 2n lg n − lg n − 2n + lg n + 2 = Θ( n lg n).
# parent updates ≤ ∑ (lg n − r )
Cost of m operations is O(m + n lg n).
Over estimate! Not many branches with consecutive ranks.
Cannot find Ω(n) nodes with Ω(lg n) parent updates.
rank
f (k) = lg n
lg n - 1
.
.
f (i)
.
f (i-1)+1
.
5
4
3
2
1
f (0) = 0
Version 4 Analysis
group
k
group i has ranks in the
range [f (i-1)+1, f (i)]
i
.
f ( ) to be determined
.
.
1
0
f (-1) = -1
18
20
Disjoint Sets (cont)
Version 4 Analysis
Version 4 Analysis
Disjoint Sets (cont)
Amortized Analysis using Aggregate Method
0
N2
f (i ) = 2 f ( i −1)
If FindSet traverses nodes x1 , x2 ,K , x p then the cost for
traversing x is charged
•
•
f −1 (lg n )
n
−1
−1
f (lg n )
f (lg n )
2 f ( i −1)
f (i )
=
=
n
n
1 = n f −1 (lg n)
∑
∑
2 f (i −1)
2 f ( i −1)
i =1
i =1
Total cost = total path charges + total block charges
Total cost = total path charges + total block charges
−1
∑
i =1
Block Charge
Maximum block charges to single FindSet is f
(
)
*
(
O m lg* n
)
21
Disjoint Sets (cont)
Version 4 Analysis
Amortized Analysis using Aggregate Method
To bound path charges:
If node x has a parent in a different group, this will remain the case.
x in group i can be charged at most f (i)-(f (i-1)+1) path charges.
f −1 (lg n )
f (i )
n
( f (i) − ( f (i − 1) + 1))
r
i = 0 r = f ( i −1) +1 2
f −1 (lg n )
f (i )
1
≤ n ∑ f (i ) ∑
r
i =1
r = f ( i −1) +1 2
Total path charges ≤
f −1 (lg n )
≤n
∑
i =1
∑
∑
−1
f (lg n )
1 
f (i )
 1
f (i ) ⋅  f (i −1) − f (i )  ≤ n ∑ f ( i −1)
2
2
2


i =1
Time to choose f ( )
)
*
f (lg n) = lg (lg n) + 1 = lg n
Maximum block charges to a sequence of m operations is
(
(
O m + m f −1 (lg n) + n f −1 (lg n) = O m f −1 (lg n)
(lg n)
−1
O m + m ⋅ f −1 (lg n)

 i times

f −1 ( j ) = lg* j + 1
to the node if x and p(x) are in the same rank group (before) and
p(x) is not the root
Path Charge
otherwise to the operation.
f (i ) = 2 2
f ( 0) = 0
22
)
23

Download Report

Disjoint Sets

Paperzz.com

Your Paperzz