
CSCI 58000, Algorithm Design,
Analysis & Implementation
Lecture 12
Greedy Algorithms (Chapter 16)
Greedy Algorithms (Chapter 16)
• For many optimization problems, a dynamic programming solution is overkill; when?
• When we can make the optimal choice before finding the optimal values of the subproblems.
  – We simply make the choice that looks best at the moment; that justifies the name "greedy"
• For many problems a greedy solution does not yield an optimal solution; for many it does
  – Activity-selection problem (greedy works)
  – Minimum spanning tree (greedy works)
  – Vertex cover (greedy does not work)
• In a dynamic programming solution, we solve the subproblems first and then use their values to make optimal choices when solving larger problems
  – What if the optimal choice can be made without knowing the optimal values of the subproblems?
  – In other words, the choice you make greedily is the same as the optimal choice
Activity-Selection problem
• Consider a set of activities {a_1, a_2, ..., a_n} to be scheduled on a machine; each activity has a start time and a finish time, and the activities are sorted in increasing order of finish time
• We want to select a largest subset of mutually compatible activities
• You can think of this as the activity-selection problem we solved with dynamic programming, but with every activity given a weight of 1
i    :  1  2  3  4  5  6  7  8  9 10 11
s(i) :  1  3  0  5  3  5  6  8  8  2 12
f(i) :  4  5  6  7  9  9 10 11 12 14 16

{2, 4, 9, 11} can be one answer
One Solution approach using DP
• Optimal substructure property
  – S_ij: the set of activities that start after activity a_i finishes and finish before activity a_j starts
  – A_ij: an optimal solution for subproblem S_ij
  – If we choose a_k in A_ij, we have two more subproblems to solve: S_ik and S_kj
  – We also have |A_ij| = |A_ik| + |A_kj| + 1
  – Using a cut-and-paste argument it is easy to see that the optimal substructure property holds
• If c(i, j) is the size of an optimal solution for the set S_ij, the following recurrence holds:
  – c(i, j) = 0, if S_ij = ∅
  – c(i, j) = max_{a_k ∈ S_ij} ( c(i, k) + c(k, j) + 1 ), otherwise
A better DP solution (we have shown
its weighted version in an earlier class)
• The structure of the solution on the last slide is similar to the matrix-chain multiplication DP, which has a cost of O(n^3). We have shown a better DP solution that runs in O(n) time (O(n log n) if you include the cost of sorting the activities). We discuss it below:
• S_j = {a_j, a_{j+1}, ..., a_n}: the set of activities from a_j onward. We call activity selection on this set subproblem S_j
• next(j): returns the smallest i with j < i ≤ n such that a_i starts after a_j finishes; if no such i exists, it returns nil
• A_j: an optimal solution for subproblem S_j
• c(j): the size of an optimal solution for S_j; if S_j = ∅, then c(j) = 0
• If a_j ∈ A_j: c(j) = 1 + c(next(j)); otherwise c(j) = c(j + 1)
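The c(j) recurrence above can be sketched in Python. This is a minimal illustration, not from the slides: it is 0-indexed, and next(j) is found by a linear scan, so this sketch runs in O(n²); the O(n) bound quoted above assumes next(j) is obtained more cheaply.

```python
def max_activities_dp(s, f):
    # s[j], f[j]: start/finish times, pre-sorted by finish time (0-indexed).
    # c[j] = size of an optimal selection from activities j..n-1 (subproblem S_j).
    n = len(s)
    c = [0] * (n + 1)                      # c[n] = 0: the empty subproblem
    for j in range(n - 1, -1, -1):
        # next(j): smallest i > j whose start time is at or after f[j] (n if none)
        nxt = next((i for i in range(j + 1, n) if s[i] >= f[j]), n)
        # either a_j is in A_j (take 1 + c[next(j)]) or it is not (take c[j+1])
        c[j] = max(1 + c[nxt], c[j + 1])
    return c[0]

s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(max_activities_dp(s, f))             # -> 4, matching the earlier example
```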
Solution using a greedy approach
• Instead of solving it with DP, we can use the following greedy algorithm
  – Repeatedly choose the activity that finishes first
  – Keep the remaining compatible activities and solve recursively
  – Let S_k be the set of activities that start after activity a_k finishes
Recursive-Activity-Selector(s, f, k, n)
  m = k + 1
  while m ≤ n and s[m] < f[k]
    m = m + 1
  if m ≤ n
    return {a_m} ∪ Recursive-Activity-Selector(s, f, m, n)
  else return ∅
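A runnable Python rendering of the pseudocode, applied to the earlier example. Index 0 serves as a fictitious activity with finish time 0 so that the first call can use k = 0, and the empty-set return for an exhausted subproblem is written out.

```python
def recursive_activity_selector(s, f, k, n):
    # Activities assumed sorted by finish time; arrays are 1-indexed with a
    # sentinel "activity" 0 whose finish time f[0] = 0.
    m = k + 1
    while m <= n and s[m] < f[k]:   # skip activities starting before a_k finishes
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []                        # empty subproblem: no activity fits

# Data from the earlier example, with the index-0 sentinel prepended.
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(recursive_activity_selector(s, f, 0, 11))   # -> [1, 4, 8, 11]
```

Note that [1, 4, 8, 11] is a different maximum-size answer than {2, 4, 9, 11} from the table; both have size 4, and the greedy (earliest finish time first) is guaranteed only to find some maximum-size subset.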
How do we know whether it works
• If we can prove that some optimal solution always includes the element chosen by the greedy choice, then the greedy algorithm yields an optimal solution
• Theorem 16.1: Consider any nonempty subproblem S_k, and let a_m be an activity in S_k with the earliest finish time. Then a_m is included in some maximum-size subset of mutually compatible activities of S_k
Elements of greedy strategy
• Two important criteria need to be satisfied
– Greedy choice property (locally optimal choice will
ultimately provide a global optimal solution)
– Optimal sub-structure property
• Greedy algorithm design steps
– Cast the optimization problem as one in which we can
make a choice using a greedy criterion and are left with only one
subproblem to solve
– Prove that there is always an optimal solution to the
original problem that makes the greedy choice (greedy
choice is safe)
Greedy vs Dynamic
• Dynamic
– Optimal Substructure property
– Smaller sub-problems need to be solved first, as their
optimal value affects the choice we make when we solve
the larger problems
– Sub-problems are overlapping, so memoization is
important
• Greedy
– Optimal substructure property
– Smaller subproblems don't need to be solved first, as the greedy choice lets us solve the problem top-down
– After the greedy choice, only one subproblem exists, so subproblems are not overlapping
Huffman Codes
• Huffman codes are used to compress data by encoding each character as a binary string
  – The compression is lossless.
• Such codes are prefix-free codes (also known as prefix codes), i.e., no codeword is a prefix of any other codeword.
  – Prefix codes can achieve optimal compression among character codes (stated without proof)
  – No delimiter is required between two codewords in the compressed file
• The idea of compression is to use variable-length codes instead of fixed-length codes
  – The most frequent character should have the shortest codeword, and the rarest character the longest
  – Huffman proposed a greedy algorithm that finds a code that is optimal among all prefix-free codes
  – Optimality is defined over the expected length of a codeword
Example
Fixed-length code: 30,000 bits for a 10,000-character file; variable-length (Huffman) code: 22,400 bits for the same file.
Constructing Huffman code
• Given frequency, Huffman code algorithm constructs a
code tree 𝑇
• The cost of a code tree T is B(T) = Σ_{c ∈ C} c.freq ⋅ d_T(c)
  – c.freq = frequency of character c
  – d_T(c) = length of the codeword of character c in tree T (its depth)
• With this cost as the objective function, the Huffman code is an optimal prefix code.
Pseudo-code
Huffman(C)
  n = |C|
  Q = C                        // min-priority queue keyed on freq
  for i = 1 to n − 1
    allocate new node z
    z.left = x = Extract-Min(Q)
    z.right = y = Extract-Min(Q)
    z.freq = x.freq + y.freq
    Insert(Q, z)
  return Extract-Min(Q)        // returns the root
Complexity = 𝑂(𝑛 lg 𝑛)
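One possible Python implementation, using heapq as the min-priority queue. Tie-breaking among equal frequencies and the 0/1 edge labeling are arbitrary choices here, so the exact codewords may differ from any particular figure, but the total cost B(T) is always optimal.

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman code from {character: frequency}; returns {char: codeword}."""
    tie = count()   # tiebreaker so tuples with equal frequency compare cleanly
    # queue entries: (freq, tiebreak, tree); a tree is a char or a (left, right) pair
    q = [(fr, next(tie), ch) for ch, fr in freq.items()]
    heapq.heapify(q)
    for _ in range(len(freq) - 1):          # n - 1 merges, as in the pseudocode
        fx, _, x = heapq.heappop(q)         # the two lowest-frequency nodes
        fy, _, y = heapq.heappop(q)
        heapq.heappush(q, (fx + fy, next(tie), (x, y)))
    root = q[0][2]
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")     # left edge labeled 0
            walk(node[1], prefix + "1")     # right edge labeled 1
        else:
            codes[node] = prefix or "0"     # single-character alphabet edge case
    walk(root, "")
    return codes

freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
codes = huffman(freq)
print(sum(freq[c] * len(codes[c]) for c in freq))   # -> 224, the optimal B(T)
```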
Correctness Proof
• The greedy-choice property holds (Lemma 16.2)
  – Let C be an alphabet in which each character c ∈ C has frequency c.freq.
  – Let x and y be the two characters with the lowest frequencies. Then there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit.
  – To prove it, modify an optimal tree by swapping x and y into the two deepest sibling leaves, and show that the modified tree costs no more.
Correctness Proof (cont.)
• Optimal Substructure Property Holds (Lemma
16.3)
– Let C be a given alphabet in which x and y are the two lowest-frequency characters.
– Let C′ be another alphabet such that C′ = C ∖ {x, y} ∪ {z}, where z.freq = x.freq + y.freq.
– Let T′ be an optimal tree for alphabet C′.
– Then T, an optimal tree for alphabet C, can be obtained by replacing the leaf node for z with an internal node having x and y as children.
Correctness Proof (cont.)
• Since d_T(x) = d_T(y) = d_{T′}(z) + 1, we have x.freq ⋅ d_T(x) + y.freq ⋅ d_T(y) = (x.freq + y.freq) ⋅ (d_{T′}(z) + 1) = z.freq ⋅ d_{T′}(z) + x.freq + y.freq
• Thus B(T) = B(T′) + x.freq + y.freq
• Now assume T is not optimal for alphabet C, but T″ is. By Lemma 16.2, x and y are siblings in T″. Construct T‴ from T″ by replacing x and y with z; then
  – B(T‴) = B(T″) − x.freq − y.freq < B(T) − x.freq − y.freq = B(T′)
  – So T′ would not be optimal for alphabet C′, a contradiction
Knapsack problem
• We have a list of n items, each with a weight w_i and a value v_i. We would like to select a subset of items so that the total value is maximized while the total weight does not exceed a given capacity W
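The slide leaves the variant unspecified; a standard Chapter 16 observation is that choosing greedily by value density v_i/w_i is optimal for the fractional knapsack (items may be taken partially) but can fail for the 0-1 knapsack. A sketch of the fractional greedy, offered as an illustration rather than as material from the slides:

```python
def fractional_knapsack(items, W):
    # items: list of (value, weight) pairs; W: capacity.
    # Greedy by value density v/w: optimal for the fractional variant only.
    total = 0.0
    remaining = W
    for v, w in sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True):
        take = min(w, remaining)        # take as much of the densest item as fits
        total += v * (take / w)
        remaining -= take
        if remaining == 0:
            break
    return total

print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))   # ≈ 240.0
```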
Matroids and Greedy Methods
• A matroid is an ordered pair 𝑀 = (𝑆, 𝐼),
satisfying the following conditions
1. 𝑆 is a finite set
2. I is a nonempty family of subsets of S, called the independent subsets of S, such that if B ∈ I and A ⊆ B, then A ∈ I. This is called the hereditary property of I.
3. If A ∈ I, B ∈ I, and |A| < |B|, then there exists some element x ∈ B ∖ A such that A ∪ {x} ∈ I. This is called the exchange property of a matroid.
Example: Graphic Matroid
• M_G = (S_G, I_G) is the graphic matroid for a graph G = (V, E)
  – S_G is defined to be the edge set E
  – If A ⊆ E, then A ∈ I_G iff A is acyclic
• Proof that M_G is a matroid:
  – S_G is a finite set. I_G satisfies the hereditary property, because if a set of edges does not form a cycle (it is a forest), then no subset of it forms a cycle (it is another forest).
  – For the exchange property, let G_A = (V, A) and G_B = (V, B) be forests of G with |B| > |A|.
  – Lemma: a forest F = (V_F, E_F) contains exactly |V_F| − |E_F| trees.
  – Since forest G_B has fewer trees than forest G_A, some tree T in G_B must contain vertices from two different trees of G_A. Because T is connected, it must contain an edge (u, v) whose endpoints u and v lie in different trees of forest G_A; adding that edge to A keeps A acyclic, which proves the exchange property.
Matroid (cont.)
• An element x ∉ A is an extension of A ∈ I if we can add x to A and still preserve independence.
• If A is an independent subset with no extension, it is called a maximal independent subset.
• All maximal independent subsets in a matroid have the same size.
• A matroid is called weighted if it assigns a strictly positive weight w(x) to each element x ∈ S. The weight function extends to subsets by summation: w(A) = Σ_{x ∈ A} w(x).
• Many problems for which a greedy approach provides an optimal solution can be formulated as finding a maximum-weight independent subset in a weighted matroid.
Greedy algorithm on weighted matroid
Greedy (M, w)
A=∅
Sort 𝑀. 𝑆 into monotonically decreasing order by weight
for each 𝑥 ∈ 𝑀. 𝑆 taken in the above sort order
if A ∪ {x} ∈ M.I
𝐴 = 𝐴 ∪ {𝑥}
return 𝐴
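A direct Python transcription of Greedy(M, w). The independence oracle is passed in as a callback, since how independence is tested depends on the particular matroid; this parameterization is mine, not the slide's.

```python
def matroid_greedy(S, w, independent):
    # S: elements of the matroid; w: weight function; independent(A): oracle
    # answering whether the set A is independent in the matroid.
    A = set()
    for x in sorted(S, key=w, reverse=True):   # monotonically decreasing weight
        if independent(A | {x}):
            A.add(x)
    return A

# Toy example: the uniform matroid over {1..5} (independent iff |A| <= 2)
print(matroid_greedy(range(1, 6), w=lambda x: x,
                     independent=lambda A: len(A) <= 2))   # -> {4, 5}
```

Instantiated on the graphic matroid with an acyclicity test as the oracle, this schema computes a maximum-weight spanning forest; Kruskal's MST algorithm is the same idea with the sort order adapted for minimization.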
• Matroids exhibit the greedy-choice property: let M = (S, I) be a weighted matroid with weight function w, and let S be sorted in decreasing order of weight. If x is the first element of S such that {x} is independent, then there exists an optimal solution that contains x
• Matroids exhibit the optimal substructure property
Scheduling unit-time tasks with
penalty
• Given
  – A set of unit-time tasks S = {a_1, a_2, ..., a_n}
  – A set of n integer deadlines d_1, d_2, ..., d_n with 1 ≤ d_i ≤ n, such that task a_i is supposed to finish by time d_i
  – A set of n non-negative penalties w_1, w_2, ..., w_n, such that we incur a penalty of w_i if task a_i is not finished by time d_i, and no penalty if a task finishes by its deadline
• Objective
  – Find a schedule (time starts at 0 and ends at n) that minimizes the total penalty incurred for missed deadlines
Canonical Scheduling
• An early-first schedule is a schedule in which all early tasks precede the late tasks.
  – If a schedule places an early task after a late task, we can simply swap their positions; the early task will still be early and the late task will still be late
• A canonical schedule is an early-first schedule in which the early tasks appear in order of monotonically increasing deadlines.
  – If, in an early-first schedule, a_i and a_j are two early tasks finishing at times k and k + 1 but d_j < d_i, then we can swap their positions without changing the status of either task:
  – Since a_j is early and finishes at k + 1, we have d_j ≥ k + 1; and since d_i > d_j, we have d_i > k + 1, so a_i can finish at k + 1 without being late. Moving a_j earlier to time k keeps it early.
Lemma 16.12
• A set of tasks A is independent if there exists a schedule for these tasks in which no task is late.
• For t = 0, 1, ..., n, let N_t(A) denote the number of tasks in A whose deadline is t or earlier.
• Then the following statements are equivalent
  1. The set A is independent
  2. For t = 0, 1, ..., n, we have N_t(A) ≤ t
  3. If the tasks in A are scheduled in order of monotonically increasing deadlines, then no task is late
Lemma 16.12 (cont.)
• Proof:
  – (1 → 2) If for some t more than t tasks must finish by time t, there is no way to schedule them all on time, since only t unit-time slots are available by time t.
  – (2 → 3) If N_t(A) ≤ t for every t, then scheduling the tasks in monotonically increasing order of deadline makes every task meet its deadline.
  – (3 → 1) Trivial: the schedule in (3) witnesses independence.
Tasks with deadlines exhibit the matroid
property
• If S is a set of unit-time tasks with deadlines, and I is the set of all independent sets of tasks, then the system (S, I) is a matroid
  – The hereditary property is obvious
  – For the exchange property, assume A and B are independent sets of tasks with |B| > |A|
  – Let k be the largest t such that N_t(B) ≤ N_t(A). Such a t exists because N_0(B) = N_0(A) = 0
  – Since N_n(B) = |B|, N_n(A) = |A|, and |B| > |A|, we have k < n
  – For k + 1 ≤ j ≤ n, we have N_j(B) > N_j(A). In particular, N_{k+1}(B) > N_{k+1}(A) while N_k(B) ≤ N_k(A), so B contains more tasks with deadline exactly k + 1 than A does; take a task a ∈ B ∖ A with deadline k + 1
  – Now take A′ = A ∪ {a}
  – A′ is independent, because for 0 ≤ t ≤ k we have N_t(A′) = N_t(A) ≤ t, and for k < t ≤ n we have N_t(A′) ≤ N_t(B) ≤ t (since B is independent)
Greedy algorithm for unit-time tasks
with penalties
• Use the greedy algorithm for finding a maximum-weight independent set of tasks, with penalty w_i as the weight of task a_i
• Then use an optimal schedule that has the tasks in A as its early tasks, in order of increasing deadline, followed by the remaining (late) tasks
• Total complexity is O(n^2), because each independence check takes O(n) time
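Putting the pieces together, a compact Python sketch (the function name is mine; independence is checked with the N_t(A) ≤ t condition of Lemma 16.12, and each check costs O(n), giving the O(n²) total stated above):

```python
def min_total_penalty(deadlines, penalties):
    # Greedy over the task matroid: consider tasks in decreasing penalty order
    # and keep a task if the accepted set A stays independent. 0-indexed tasks.
    n = len(deadlines)
    A = []
    for i in sorted(range(n), key=lambda i: penalties[i], reverse=True):
        cand = A + [i]
        cnt = [0] * (n + 1)             # cnt[t] = tasks in cand with deadline t
        for j in cand:
            cnt[deadlines[j]] += 1
        N, indep = 0, True
        for t in range(1, n + 1):
            N += cnt[t]                 # N = N_t(cand)
            if N > t:                   # Lemma 16.12: violates N_t(A) <= t
                indep = False
                break
        if indep:
            A = cand
    # tasks outside A are scheduled late and pay their penalties
    return sum(penalties[i] for i in range(n) if i not in A)

# Example: 7 tasks with deadlines and penalties
print(min_total_penalty([4, 2, 4, 3, 1, 4, 6],
                        [70, 60, 50, 40, 30, 20, 10]))   # -> 50
```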