Chapter 9

Greedy method
Idea: sequential choices that are locally optimum
combine to form a globally optimum solution. The
choices should be both feasible and irrevocable.
Giving change: we want to return the smallest
number of coins. The greedy algorithm (always
giving the largest coin that fits) works for any
denominations in a fixed ratio c ≥ 2 to one another:
c^0, c^1, …, c^k.
Note: it also works when each denomination is an integral
multiple of the next smaller one (mixed radix).
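The greedy rule for giving change can be sketched as follows. This is a minimal illustration, not a definitive implementation; the function name greedy_change and the sample denominations are assumptions made for the example. The second call uses denominations 1, 3, 4, which are not in integral ratios, to show a case where the greedy answer is not the smallest.

```python
def greedy_change(amount, denominations):
    """Pay `amount` by always taking the largest coin that still fits."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        count, amount = divmod(amount, coin)   # how many of this coin fit
        coins.extend([coin] * count)
    if amount != 0:
        raise ValueError("amount cannot be paid exactly with these coins")
    return coins


# Powers of 3 (fixed ratio c = 3): the greedy choice is optimal.
print(greedy_change(17, [1, 3, 9, 27]))   # [9, 3, 3, 1, 1]

# Hypothetical denominations 1, 3, 4: greedy uses three coins (4 + 1 + 1),
# although two coins (3 + 3) would do.
print(greedy_change(6, [1, 3, 4]))        # [4, 1, 1]
```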
Minimum Spanning Trees
Given: a connected simple graph with weighted edges
Find: a minimum weight subgraph T connecting all nodes
Clearly, T must be acyclic and hence a spanning tree.
Input: G = ‹V, E› with w: E → R+ a weight function.
Output: T ⊆ E connects all vertices, w(T) is minimal.
Weight of T: w(T) = Σ_{(u,v) ∈ T} w(u, v)
[Figure: example weighted graph on vertices a, b, c, d with edge weights 1, 2, 3, 4, 6.]
Generic Approach to MST
Invariant: grow a set of edges A that is always within an MST
1) A ← Ø                                  ► invariant trivially satisfied
2) while A does not span V                ► until it covers the graph
3)     find a “safe” edge (u, v) for A    ► chosen to satisfy invariant
4)     A ← A ∪ {(u, v)}                   ► grow the set of edges
5) return A                               ► correctness by invariant
Analysis: A safe edge in (3) must always exist. Since the set
returned in (5) has |V| − 1 edges, the loop is executed |V| − 1 = O(V) times.
Motivation: An edge (u, v) is safe for A if A ∪ {(u, v)} is
also a subset of some MST.
Prim’s algorithm
Idea: start with any one vertex tree; augment it by
the edge to a ‘nearest’ vertex not in the tree.
T ← ‹{v}, Ø›                                          ► initial tree
for i = 1 to |V| − 1 do                                ► all remaining vertices
    pick a minimum weight edge e = ‹v, u› leaving T
    T ← T + ‹{u}, e›                                   ► add it to the tree
N.b. need a data structure to keep track of the
edges leaving T (use a min-heap priority queue).
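A minimal Python sketch of this idea, assuming the graph is given as a dictionary mapping each vertex to a list of (weight, neighbor) pairs. The function name prim_mst and the four-vertex graph are hypothetical and chosen only for illustration. The heap holds the edges leaving T; entries whose far endpoint has already joined the tree are skipped when popped.

```python
import heapq


def prim_mst(graph, start):
    """graph: dict vertex -> list of (weight, neighbor); undirected, connected."""
    in_tree = {start}
    tree_edges = []
    # Fringe: candidate edges (weight, tree vertex, outside vertex).
    fringe = [(w, start, u) for w, u in graph[start]]
    heapq.heapify(fringe)
    while fringe and len(in_tree) < len(graph):
        w, v, u = heapq.heappop(fringe)
        if u in in_tree:
            continue                      # stale entry: edge no longer leaves T
        in_tree.add(u)
        tree_edges.append((v, u, w))
        for w2, x in graph[u]:
            if x not in in_tree:
                heapq.heappush(fringe, (w2, u, x))
    return tree_edges


# Hypothetical example graph (not the one in the slide's figure).
graph = {
    "a": [(1, "b"), (4, "c")],
    "b": [(1, "a"), (2, "c"), (6, "d")],
    "c": [(4, "a"), (2, "b"), (3, "d")],
    "d": [(6, "b"), (3, "c")],
}
print(prim_mst(graph, "a"))   # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
```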
Example
[Figure: weighted graph on vertices a, b, c, d, e, f with edge weights 1, 2, 3, 4, 4, 5, 5, 6, 6, 8.]
Correctness
Show: Each T_i remains a subset of some MST, M.
Proof: Trivially true for T_0. Suppose T_i = T_{i-1} + e_i is
not part of any MST, where e_i is the minimum
weight edge leaving T_{i-1}. By IH, T_{i-1} ⊆ M, an MST.
Since e_i is not in M, adding it to M creates a cycle, and
that cycle must contain another edge e leaving T_{i-1},
with w(e_i) ≤ w(e). But replacing e by e_i yields a spanning
tree no heavier, hence an MST containing T_i, a contradiction.
[Figure: the tree T_{i-1} inside M; remove e, add e_i.]
Kruskal’s Algorithm
Idea: start with forest of all nodes and no edges,
incrementally joining them by the shortest edge that
doesn't create a cycle.
F ← ‹V, Ø›                       ► a forest of one node trees
for each ‹v, u› in E do           ► smallest to largest weight
    if F + ‹v, u› is acyclic      ► i.e. if Find(u) ≠ Find(v)
        then F ← F + ‹v, u›       ► add it, using Union(u, v)
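The loop above might look like this in Python. It is a sketch under stated assumptions: edges are given as (weight, u, v) triples, the name kruskal_mst is hypothetical, and the Find used here is a plain parent-pointer walk without rank or path compression, kept short for brevity.

```python
def kruskal_mst(vertices, edges):
    """edges: iterable of (weight, u, v). Returns the accepted edges in order."""
    parent = {v: v for v in vertices}     # every vertex starts as its own tree

    def find(x):
        while parent[x] != x:             # walk up to the representative
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):         # smallest to largest weight
        ru, rv = find(u), find(v)
        if ru != rv:                      # adding (u, v) creates no cycle
            parent[ru] = rv               # Union of the two components
            forest.append((u, v, w))
    return forest


# Hypothetical example graph (not the one in the slide's figure).
edges = [(1, "a", "b"), (4, "a", "c"), (2, "b", "c"), (6, "b", "d"), (3, "c", "d")]
print(kruskal_mst("abcd", edges))   # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
```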
Example
[Figure: weighted graph on vertices a, b, c, d, e, f with edge weights 1, 2, 3, 4, 4, 5, 5, 6, 6, 8.]
Correctness
Show: Each F_i remains a subset of some MST T.
Proof: Trivially true for F_0. Suppose F_{i-1} is contained
in some MST T, and that F_i = F_{i-1} + e_i is not contained in
any MST. Then e_i is not in T, so adding the minimum weight
edge e_i to T creates a cycle. Since F_{i-1} + e_i is acyclic,
that cycle contains another edge e of T that is not in F_{i-1}.
The edge e was not rejected earlier (F_{i-1} + e ⊆ T is acyclic),
so w(e_i) ≤ w(e). But replacing e by e_i yields a spanning tree
no heavier, hence an MST containing F_i, a contradiction.
Activity Selection
Problem: Given a set S = {[s_1, f_1), …, [s_n, f_n)}, find a
maximum cardinality subset of pairwise disjoint intervals.
Idea: Think of S as a set of proposed activities, with
start time s_i and finish time f_i, all competing for one
resource (such as a lecture hall). We are trying to
schedule the greatest number of activities.
Solution: Order the activities by finish time: f_1 ≤ … ≤ f_n.
Choose the first activity, and select subsequent
activities in order, provided they don’t conflict with
those activities already chosen (i.e. the last one).
This is greedy because the next activity chosen is
the first compatible one.
Greedy-Activity-Selector
 i    s_i   f_i
 1     1     4
 2     3     5
 3     0     6
 4     5     7
 5     3     8
 6     5     9
 7     6    10
 8     8    11
 9     8    12
10     2    13
11    12    14
The operation of Greedy-Activity-Selector on the 11 activities
listed in the table. Each row of the figure
corresponds to an iteration of the
for loop in lines 4-7. If the
starting time s_i of activity i occurs
before the finishing time f_j of the
most recently selected activity j, it
is rejected. Otherwise it is
accepted and put into set A.
[Figure: timeline of activities 1 to 11, one row per iteration; horizontal axis: time, from 0 to 14.]
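The selection rule can be sketched in Python and run on the 11 activities tabulated in this section. The function name and the (start, finish) tuple representation are assumptions for illustration; the sketch sorts by finish time itself rather than assuming the input is already ordered.

```python
def greedy_activity_selector(activities):
    """activities: list of (start, finish) half-open intervals [start, finish).

    Returns the 1-based indices of the selected activities.
    """
    # Consider activities in order of finish time.
    order = sorted(range(len(activities)), key=lambda i: activities[i][1])
    chosen = []
    last_finish = float("-inf")
    for i in order:
        start, finish = activities[i]
        if start >= last_finish:          # compatible with the last one chosen
            chosen.append(i + 1)
            last_finish = finish
    return chosen


activities = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9),
              (6, 10), (8, 11), (8, 12), (2, 13), (12, 14)]
print(greedy_activity_selector(activities))   # [1, 4, 8, 11]
```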
Correctness (beginning)
Claim: There is an optimal solution A ⊆ S (where
|A| is maximal) containing the first activity
(ordered by finish time).
Proof: Let k be the minimum index such that [s_k, f_k)
is in A. If k = 1 we are done. Otherwise, let B = (A −
{[s_k, f_k)}) ∪ {[s_1, f_1)}. Since f_1 ≤ f_k, and every other
interval in A starts at or after f_k, the intervals in B are
disjoint. So B is also an optimal solution, with |B| = |A|.
So the greedy algorithm makes a correct first
choice. It remains to show that the rest of its
choices are correct – by induction.
Correctness (conclusion)
We want to show there is an optimal solution
containing the first (i + 1) greedy choices. So,
assume by IH that A ⊆ S contains the first i greedy
choices, call them G. Let A′ = A − G, and let S′ = {[s_j,
f_j) ∈ S : s_j ≥ f_g}, where g is the last index in G. The
claim applied to A′ ⊆ S′ yields an equal size B′
containing the first element of S′ (the (i + 1)st
greedy choice). Now B = G ∪ B′ is an optimal
solution of S which contains the first (i + 1) greedy
choices.
Disjoint Set Union-Find
A dynamically combinable equivalence relation. E.g. network
connectivity which is reflexive, symmetric, and transitive.
Find is a function a ↦ [a] returning a unique representative.
Union is the operation [a] ∪ [b] = [a, b] joining [a] and [b].
Observations: the names of elements are irrelevant, and the
names of the equivalence classes are arbitrary.
Application:
• Use find to test a ~ b:
Find(a) = Find(b)
• Use union to join a and b:
Union(Find(a), Find(b))
Approaches
The elements 1, …, n are indices into an array
containing names of equivalence classes. Find is a Θ(1)
lookup, but Union involves scanning the array to
change names, Θ(n). A sequence of n − 1 unions
(maximum possible) would be Θ(n²). To save time,
the classes could be stored as lists instead. But the
worst case for updating classes is still Θ(n²). If we
also keep track of the lengths of the lists, and always
update the smaller list, then the total time for n − 1
merges is Θ(n log n), since each element has its class
changed at most log n times (the class size at least
doubles with each change).
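The "always update the smaller list" idea might be sketched as below. The class name ListUnionFind and the relabels counter are hypothetical, added only to make the log n bound observable. Elements are numbered 0, …, n − 1 here rather than 1, …, n.

```python
class ListUnionFind:
    """Array of class names plus a member list per class (a sketch)."""

    def __init__(self, n):
        self.name = list(range(n))               # name[x] = class of element x
        self.members = [[x] for x in range(n)]   # members[c] = elements of class c
        self.relabels = 0                        # counts class-name changes

    def find(self, x):
        return self.name[x]                      # constant-time lookup

    def union(self, a, b):
        a, b = self.name[a], self.name[b]
        if a == b:
            return
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                          # make b the smaller class
        for x in self.members[b]:                # relabel only the smaller list
            self.name[x] = a
            self.relabels += 1
        self.members[a].extend(self.members[b])
        self.members[b] = []


uf = ListUnionFind(8)
for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    uf.union(a, b)
print(uf.relabels)   # 12, within the n log2 n = 24 bound
```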
More approaches
(this time to make Union O(1)).
Use a tree to represent each class, the root giving the name.
All edges are directed toward the root. A Find simply travels up
the tree until it hits a root. A Union simply causes the root of
one tree to point to the other. Since it is possible to create a
tree of depth n, it still could take Θ(n²) for a sequence of n
Union-Finds.
Tricks:
• Union by rank: shallow tree becomes subtree of deeper
one. Break ties arbitrarily.
• Path compression: make each node on “find path” point
to root.
Implementation
Only uses parent pointers (p) and rank field (r) in arrays.
INITIALIZE                        ► every element is put into a set by itself
1) for x ← 1 to n                 ► elements are just array indices
2)     p[x] ← x                   ► x is root of singleton class {x}
3)     r[x] ← 0                   ► rank (height) is zero
UNION(x, y)                       ► assumes x and y are class reps. (roots)
1) if r[x] > r[y]
2)     then p[y] ← x
3)     else p[x] ← y
4)         if r[x] = r[y]
5)             then r[y] ← r[y] + 1
FIND(x)                           ► returns [x]
1) if p[x] ≠ x
2)     then p[x] ← FIND(p[x])
3) return p[x]
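A possible Python rendering of the three routines is sketched below. The class name DisjointSets is an assumption. Unlike the pseudocode UNION, which assumes its arguments are roots, this union calls find on both arguments first and returns early when they already share a root; elements are numbered from 0.

```python
class DisjointSets:
    """Union by rank with path compression (a sketch)."""

    def __init__(self, n):
        self.p = list(range(n))   # parent pointers: each x is its own root
        self.r = [0] * n          # rank (height bound) of each root is zero

    def find(self, x):
        if self.p[x] != x:
            self.p[x] = self.find(self.p[x])   # path compression
        return self.p[x]

    def union(self, x, y):
        x, y = self.find(x), self.find(y)      # reduce to class representatives
        if x == y:
            return                             # already in the same class
        if self.r[x] > self.r[y]:
            self.p[y] = x                      # shallower tree goes under deeper
        else:
            self.p[x] = y
            if self.r[x] == self.r[y]:
                self.r[y] += 1                 # tie: the new root gets taller


ds = DisjointSets(6)
ds.union(0, 1)
ds.union(2, 3)
ds.union(1, 3)
print(ds.find(0) == ds.find(2))   # True
print(ds.find(4) == ds.find(0))   # False
```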
Single-source shortest-paths
Problem: In a non-negatively weighted directed or
undirected graph, find the shortest paths from a single
source vertex s to each of the other vertices v.
Solution: Dijkstra’s algorithm
1. Start with tree T_1 consisting of one vertex, s = v_0.
2. Construct a series of trees T_1, T_2, … which expand one
edge at a time, keeping track of shortest path
d(v) from the source to each of the vertices in T_i.
3. Find T_{i+1} by adding a “fringe” edge (v, u) with the
lowest d(v) + w(v, u). [This is the greedy step.]
4. Terminate when all vertices are included.
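The four steps above can be sketched with a min-heap of fringe edges keyed by d(v) + w(v, u). The function name, the adjacency-list format, and the small graph are assumptions for illustration. Heap entries for vertices that are already in the tree are discarded when popped, and the source is recorded as its own parent.

```python
import heapq


def dijkstra(graph, s):
    """graph: dict vertex -> list of (neighbor, weight), weights non-negative.

    Returns (dist, parent): shortest distances from s and the tree edges.
    """
    dist, parent = {}, {}
    fringe = [(0, s, s)]                  # (d(v) + w(v, u), u, v)
    while fringe:
        d, u, via = heapq.heappop(fringe)
        if u in dist:
            continue                      # u joined the tree earlier
        dist[u] = d                       # greedy step: closest fringe vertex
        parent[u] = via
        for x, w in graph[u]:
            if x not in dist:
                heapq.heappush(fringe, (d + w, x, u))
    return dist, parent


# Hypothetical example graph (not the one in the slide's figure).
graph = {
    "s": [("a", 2), ("b", 5)],
    "a": [("s", 2), ("b", 1), ("c", 4)],
    "b": [("s", 5), ("a", 1), ("c", 1)],
    "c": [("a", 4), ("b", 1)],
}
dist, parent = dijkstra(graph, "s")
print(dist)     # {'s': 0, 'a': 2, 'b': 3, 'c': 4}
```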
Example
[Figure: weighted graph on vertices a, b, c, d, e with edge weights 1, 2, 3, 4, 5, 6, 7.]
Correctness: prove by induction on i that T_i
contains the i closest vertices to the source, and
that the tree path from s to each of them is a
shortest path.
Efficiency: O(|V|²) for adjacency matrix
(unordered array for priority queue); O(|E| log |V|)
for adjacency list (min-heap priority queue).