1. Disjoint set
COMP3001/3901 Algorithms
Lecture 11- 1
Advanced Data Structure:
Disjoint sets
(Chapter 21)
University of Sydney
COMP3001 Algorithms, 2002
• Disjoint set data structure: maintain a
collection S = {S1, S2, …, Sk} of disjoint
dynamic sets
– Each set is identified by a representative
Operations:
• MAKE-SET(x): create a new set whose only
member is x (x must not already be in another set)
• UNION(x, y): unite the dynamic sets that
contain x and y, Sx and Sy, into a new set that
is the union of these two sets (destroy Sx, Sy)
• FIND-SET(x): return a pointer to the
representative of the set containing x
Application
• Minimum spanning tree (MST) algorithm: Kruskal’s algorithm
• Finding the connected components of an
undirected graph
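As a rough illustration of the second application, the sketch below computes connected components using only the three operations. The dictionary-based `parent` map and the function names are assumptions made for this example; they are not taken from the lecture.

```python
# Hypothetical minimal disjoint-set helpers built on a plain dict.
def make_set(parent, x):
    parent[x] = x              # x starts as its own representative


def find_set(parent, x):
    while parent[x] != x:      # walk up to the representative
        x = parent[x]
    return x


def union(parent, x, y):
    rx, ry = find_set(parent, x), find_set(parent, y)
    if rx != ry:               # nothing to do if x and y already share a set
        parent[rx] = ry


def connected_components(vertices, edges):
    parent = {}
    for v in vertices:
        make_set(parent, v)
    for u, v in edges:
        union(parent, u, v)
    # Group vertices by the representative of their set.
    groups = {}
    for v in vertices:
        groups.setdefault(find_set(parent, v), []).append(v)
    return list(groups.values())
```

Two vertices end up in the same component exactly when FIND-SET returns the same representative for both.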
Representation
1. Linked list representation
2. Rooted tree representation: better time
complexity
2. Linked list representation
• First object in each linked list: representative
• Each object contains:
– a set member
– a pointer to the object containing the next
set member
– a pointer back to the representative
• Each list has:
– head: pointer to the representative
– tail: pointer to the last object in the list
• MAKE-SET, FIND-SET: O(1) time
• UNION
1. Simple implementation
• UNION(x,y): append x’s list to the end of y’s
list
• Use tail to find where to append
• Must update the pointer to the representative
for each object on x’s list: time linear in the
length of x’s list
• A sequence of m operations on n objects:
O(n^2) time
• Amortized time of an operation: O(n)
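The linked-list representation and the simple UNION can be sketched as follows. This is a minimal illustration, not the lecture's code: the `Node` class, the choice to keep the `tail` pointer on the representative itself, and the returned update count are all assumptions made for the example.

```python
class Node:
    """One set member in the linked-list representation (illustrative)."""

    def __init__(self, value):
        self.value = value
        self.next = None   # next member of the same set
        self.rep = self    # pointer back to the representative
        self.tail = self   # last object in the list; meaningful only on the representative


def make_set(value):
    return Node(value)     # O(1)


def find_set(node):
    return node.rep        # O(1)


def union(x, y):
    """Append x's list to the end of y's list; return the number of pointer updates."""
    rx, ry = find_set(x), find_set(y)
    if rx is ry:
        return 0
    ry.tail.next = rx      # use tail to find where to append
    ry.tail = rx.tail
    updates = 0
    node = rx
    while node is not None:  # linear in the length of x's list
        node.rep = ry
        updates += 1
        node = node.next
    return updates
```

Calling `union(x1, x2)`, `union(x2, x3)`, and so on always appends the longer, growing list, which is the pattern behind the quadratic bound.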
• n MAKE-SET operations: O(n)
• n-1 UNION operations: Σ (i = 1 to n-1) i = O(n^2)
• m = 2n-1 operations in total
• Each operation: O(n) amortized time
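The arithmetic behind this bound can be written out explicitly. The derivation below is a sketch that assumes the i-th UNION updates exactly i representative pointers, which is the worst case for the simple implementation:

```latex
\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = \Theta(n^2),
\qquad
\frac{\Theta(n) + \Theta(n^2)}{m} = \frac{\Theta(n^2)}{2n-1} = \Theta(n)
\ \text{per operation.}
```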
2. Disjoint-set forests
2. Weighted-union heuristic (linked-list representation)
• Each list maintains its length
• Always append the shorter list onto the longer one
• A single UNION: still O(n) in the worst case
• A sequence of m MAKE-SET, UNION, and FIND-SET
operations, n of which are MAKE-SET: O(m + n lg n)
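To see the effect of the weighted-union heuristic, the sketch below counts representative-pointer updates with and without it. Python lists stand in for the linked lists, `len()` stands in for the stored list length, and the function name `run_unions` is hypothetical.

```python
def run_unions(n, weighted):
    """Union n singleton sets in a chain; return the number of representative updates."""
    rep = {i: i for i in range(n)}         # element -> representative
    members = {i: [i] for i in range(n)}   # representative -> members of its list

    def union(x, y):
        rx, ry = rep[x], rep[y]
        if rx == ry:
            return 0
        # Simple rule: always append x's list onto y's list.
        # Weighted rule: append the shorter list onto the longer one.
        if weighted and len(members[rx]) > len(members[ry]):
            rx, ry = ry, rx
        moved = members.pop(rx)
        for v in moved:
            rep[v] = ry                    # one pointer update per moved member
        members[ry].extend(moved)
        return len(moved)

    # Adversarial order for the simple rule: the growing list is always the one appended.
    return sum(union(i, i + 1) for i in range(n - 1))
```

On this particular sequence with n = 1000, the simple rule performs 499500 updates and the weighted rule performs 999, because the weighted rule always moves the single new element.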
• Disjoint-set forest: a faster implementation
• Represent sets by rooted trees
– Each member points only to its parent
– Root of each tree: representative
• Naïve algorithm: no faster than the linked-list representation
• Two heuristics:
– Union by rank
– Path compression
• FIND-SET: follow parent pointers until the root is reached
(find path: nodes visited on the way)
• UNION: make the root of one tree point to the root of
the other
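A forest without either heuristic can be sketched as below. The class name `NaiveForest` and the `depth` helper are assumptions added for illustration; the example also shows how an unlucky sequence of unions produces a single long chain.

```python
class NaiveForest:
    """Disjoint-set forest with no heuristics (illustrative sketch)."""

    def __init__(self):
        self.parent = {}

    def make_set(self, x):
        self.parent[x] = x              # a root is its own parent

    def find_set(self, x):
        while self.parent[x] != x:      # follow parent pointers to the root
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx != ry:
            self.parent[rx] = ry        # root of x's tree points to root of y's tree

    def depth(self, x):
        """Number of edges on the find path from x to its root."""
        edges = 0
        while self.parent[x] != x:
            x = self.parent[x]
            edges += 1
        return edges


forest = NaiveForest()
n = 6
for i in range(n):
    forest.make_set(i)
for i in range(n - 1):
    forest.union(i, i + 1)              # each union hangs the old root under a new one
```

After these n-1 unions, element 0 sits at the bottom of a chain of n nodes, so a FIND-SET on it follows n-1 parent pointers.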
Heuristics
• A sequence of n-1 UNION operations can create a linear chain
of n nodes
• Two heuristics: running time almost linear in the
total number of operations m
1. Union by rank
– Similar to the weighted-union heuristic
– Idea: make the root of the tree with fewer nodes point
to the root of the tree with more nodes
– For each node, we maintain a rank: an upper
bound on the height of the node
– The root with smaller rank points to the root
with larger rank
Heuristics
2. Path compression
– Simple & effective
– Use it during FIND-SET operations
– Make each node on the find path point
directly to the root
Pseudo code
1. Union by rank
– For each node x, rank[x]: upper bound on the
height of x (number of edges in the longest path
between x and a descendant leaf)
– MAKE-SET: initial rank = 0
– FIND-SET: rank unchanged
– UNION
• Roots with unequal rank: the root of higher
rank becomes the parent of the root of lower rank
• Roots with equal rank: arbitrarily choose
one of the roots as the parent and
increment its rank
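The rules above translate fairly directly into code. The sketch below is one possible rendering in Python; the class name `RankedForest` and the dictionary storage are assumptions, and path compression is deliberately left out so that only the rank rule is shown.

```python
class RankedForest:
    """Disjoint-set forest using union by rank only (illustrative sketch)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0                    # initial rank = 0

    def find_set(self, x):
        while self.parent[x] != x:          # ranks are not changed by FIND-SET
            x = self.parent[x]
        return x

    def link(self, rx, ry):
        """Link two distinct roots according to their ranks."""
        if self.rank[rx] > self.rank[ry]:
            self.parent[ry] = rx            # higher rank becomes the parent
        else:
            self.parent[rx] = ry
            if self.rank[rx] == self.rank[ry]:
                self.rank[ry] += 1          # equal ranks: the chosen parent's rank grows

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx != ry:
            self.link(rx, ry)
```

With this rule, the chain-building sequence `union(0, 1)`, `union(1, 2)`, ... no longer produces a chain: every later element is attached directly under the first root.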
Pseudo code
2. Path compression
• FIND-SET: two-pass method
– First pass: up the find path to find the root
– Second pass: back down the find path to update
each node so that it points directly to the root
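A two-pass FIND-SET can be sketched as below. This is an iterative variant written for illustration: its second pass walks up the find path again instead of unwinding a recursion, which leaves the same result (every node on the path points directly to the root). The `parent` dictionary is an assumed representation.

```python
def find_set(parent, x):
    """Return the root of x's tree and compress the find path."""
    # Pass 1: follow parent pointers to locate the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Pass 2: revisit the find path and point each node directly at the root.
    while parent[x] != root:
        next_node = parent[x]
        parent[x] = root
        x = next_node
    return root


# A chain 0 -> 1 -> 2 -> 3 -> 4, with 4 as the root.
parent = {0: 1, 1: 2, 2: 3, 3: 4, 4: 4}
root = find_set(parent, 0)
```

After the call, every node that was on the find path from 0 has the root 4 as its parent, so later FIND-SET calls on those nodes follow a single pointer.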
Time complexity
• Union by rank only: O(m log n)
• Both union by rank & path compression:
O(m α(n)) worst-case time
– α(n): very slowly growing function
– α(n) <= 4 for all practical purposes
– Amortized cost of each MAKE-SET: O(1)
– Amortized cost of each LINK (the root-linking step of UNION): O(α(n))
– Amortized cost of each FIND-SET: O(α(n))
– A sequence of m MAKE-SET, UNION, and FIND-SET
operations (n of them MAKE-SET), with union
by rank and path compression: O(m α(n))
worst-case time
Amortized analysis
COMP3001/3901 Algorithms
Amortized Time Complexity
(Chapter 17)
• Time required to perform a sequence of data
structure operations is averaged over all the
operations performed
• Show that the average cost of an operation is
small
• No probability involved
• Guarantee the average performance of each
operation in the worst case
• Three techniques
1. Aggregate analysis
2. Accounting method
3. Potential method
Amortized analysis
1. Aggregate analysis
– n operations take T(n) time
– Average cost of an operation: T(n)/n
– Imprecise: don’t get separate cost for each
type of operation
Amortized analysis
2. Accounting method
– Charge each operation an (invented)
amortized cost
– Amount not used is stored in a “bank”
– Later operations can use the stored work
– Balance must not go negative
3. Potential method
– “Stored work” (accounting method) viewed
as “potential energy”
– Most flexible & powerful view
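The potential method can be stated as a short derivation. The notation here is the usual textbook one and is not defined on the slides: c_i is the actual cost of the i-th operation, D_i is the data structure after that operation, and Φ is a potential function chosen by the analyst.

```latex
\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}),
\qquad
\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0).
```

If Φ(D_n) >= Φ(D_0), the total amortized cost is an upper bound on the total actual cost, which mirrors the requirement that the bank balance never goes negative.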