Lecture 8: Merge-Sort

Merge Sort
• Divide-and-conquer
• Recursive merge-sort
• Recurrence equation revisited
• Solving the recurrence equation
  • Substitution
  • Recursion tree (new)
Courtesy of Goodrich, Tamassia and Olga Veksler
Instructor: Yuzhen Xie
Divide-and-Conquer Approach
[Figure: a sequence S split into two subsequences S1 and S2]
• Divide-and-conquer is a general algorithm design paradigm
• Suppose we need to solve some problem on a sequence S
  • our problem will be sorting
• Suppose solving the problem on S directly is hard, but if we have solutions on subsequences S1 and S2 of S, then combining these solutions into a solution on S is easy
• Divide: divide the input data S into two disjoint subsets S1 and S2
• Recur: solve the subproblems associated with S1 and S2
• Conquer: combine the solutions for S1 and S2 into a solution for S
• The base cases for the recursion are subproblems of size 0 or 1
  • A base case is usually trivial to solve
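The three steps above can be sketched as a small generic driver. This is only an illustration: the function names `divide_and_conquer` and `sort_by_divide_and_conquer` and the callback parameters are hypothetical, and the combine step borrows `heapq.merge` from the Python standard library rather than anything defined in these slides.

```python
from heapq import merge  # standard-library merge of already-sorted iterables


def divide_and_conquer(s, is_base, solve_base, divide, combine):
    """Generic divide-and-conquer driver (hypothetical helper, not from the slides)."""
    if is_base(s):                       # base case: size 0 or 1
        return solve_base(s)
    s1, s2 = divide(s)                   # Divide
    r1 = divide_and_conquer(s1, is_base, solve_base, divide, combine)  # Recur
    r2 = divide_and_conquer(s2, is_base, solve_base, divide, combine)
    return combine(r1, r2)               # Conquer


def sort_by_divide_and_conquer(s):
    """Instantiate the driver for the sorting problem."""
    return divide_and_conquer(
        list(s),
        is_base=lambda seq: len(seq) <= 1,
        solve_base=lambda seq: seq,
        divide=lambda seq: (seq[: len(seq) // 2], seq[len(seq) // 2:]),
        combine=lambda a, b: list(merge(a, b)),
    )


print(sort_by_divide_and_conquer([7, 3, 5, 9, 1, 2, 4, 8, 1]))
# prints [1, 1, 2, 3, 4, 5, 7, 8, 9]
```

Passing the four pieces as callbacks keeps the paradigm separate from the problem being solved; for sorting, only the combine step does real work.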
Divide-and-Conquer Example
• Need to sort sequence S
    S = 7 3 5 9 1 2 4 8 1
• Divide: split S into subsequences S1 and S2
    S1 = 7 3 5 9        S2 = 1 2 4 8 1
• Recur: sort subsequences S1 and S2
    S1 = 3 5 7 9        S2 = 1 1 2 4 8
• Conquer: merge sorted subsequences S1 and S2 into sorted S
    S = 1 1 2 3 4 5 7 8 9
• Base case: sorting a sequence of size 1 is trivial
Merge-Sort
• Merge-sort on an input sequence S with n elements consists of three steps:
  • Divide: partition S into two sequences S1 and S2 of about n/2 elements each
  • Recur: recursively sort S1 and S2
  • Conquer: merge S1 and S2 into a single sorted sequence

Algorithm mergeSort(S, C)
    Input: sequence S with n elements, comparator C
    Output: sequence S sorted according to C
    if S.size() > 1
        (S1, S2) ← partition(S, n/2)
        mergeSort(S1, C)
        mergeSort(S2, C)
        S ← merge(S1, S2)
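A minimal Python sketch of mergeSort, assuming the sequence is a Python list and the comparator C is a two-argument function `less(a, b)` that returns True when `a` must come before `b`. Unlike the pseudocode, the helper here takes the comparator explicitly; that signature is an assumption made for the sketch.

```python
import operator


def merge(s1, s2, less):
    """Merge two sorted lists into one sorted list according to `less`."""
    out, i, j = [], 0, 0
    while i < len(s1) and j < len(s2):
        if less(s2[j], s1[i]):       # take from s2 only when it is strictly first
            out.append(s2[j])
            j += 1
        else:
            out.append(s1[i])
            i += 1
    out.extend(s1[i:])               # at most one of these two is non-empty
    out.extend(s2[j:])
    return out


def merge_sort(s, less=operator.lt):
    """Sort list `s` in place according to comparator `less`."""
    if len(s) > 1:
        mid = len(s) // 2
        s1, s2 = s[:mid], s[mid:]    # Divide
        merge_sort(s1, less)         # Recur
        merge_sort(s2, less)
        s[:] = merge(s1, s2, less)   # Conquer


data = [7, 3, 5, 9, 1, 2, 4, 8, 1]
merge_sort(data)
print(data)                          # prints [1, 1, 2, 3, 4, 5, 7, 8, 9]
```

Passing `operator.gt` as the comparator sorts in descending order with no other change.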
Merging Two Sorted Sequences
• The conquer step of merge-sort consists of merging two sorted sequences S1 and S2 into a sorted sequence S
• Consider merging with a linked list implementation of a sequence

Step  S1           S2         S
0     3 4 5 7 9    1 1 2 8    (empty)
1     3 4 5 7 9    1 2 8      1
2     3 4 5 7 9    2 8        1 1
3     3 4 5 7 9    8          1 1 2
4     4 5 7 9      8          1 1 2 3
5     5 7 9        8          1 1 2 3 4
6     7 9          8          1 1 2 3 4 5
7     9            8          1 1 2 3 4 5 7
8     9            (empty)    1 1 2 3 4 5 7 8
9     (empty)      (empty)    1 1 2 3 4 5 7 8 9
Merging Two Sorted Sequences
• Merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list, takes O(n) time
• If implemented by arrays, merging also takes O(n) time

Algorithm merge(A, B)
    Input: sorted sequences A and B with n/2 elements each
    Output: sorted sequence of A ∪ B
    S ← empty sequence
    while ¬A.isEmpty() ∧ ¬B.isEmpty()
        if A.first().element() < B.first().element()
            S.insertLast(A.removeFirst())
        else
            S.insertLast(B.removeFirst())
    while ¬A.isEmpty()
        S.insertLast(A.removeFirst())
    while ¬B.isEmpty()
        S.insertLast(B.removeFirst())
    return S
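The merge pseudocode maps almost line for line onto Python's `collections.deque`, which supports constant-time removal at the front and insertion at the back. The sketch below uses a deque as a stand-in for the linked-list sequence; that substitution is an assumption, and the sample inputs are the two sorted halves from the earlier divide-and-conquer example.

```python
from collections import deque


def merge(a, b):
    """Merge two sorted deques into a new sorted deque, consuming both inputs."""
    s = deque()
    while a and b:                    # both sequences still non-empty
        if a[0] < b[0]:               # compare the first elements
            s.append(a.popleft())     # S.insertLast(A.removeFirst())
        else:
            s.append(b.popleft())     # S.insertLast(B.removeFirst())
    while a:                          # copy whatever is left of A
        s.append(a.popleft())
    while b:                          # copy whatever is left of B
        s.append(b.popleft())
    return s


s1 = deque([3, 5, 7, 9])
s2 = deque([1, 1, 2, 4, 8])
print(list(merge(s1, s2)))            # prints [1, 1, 2, 3, 4, 5, 7, 8, 9]
```

Each loop iteration moves exactly one element, so the total number of iterations equals the combined length of the inputs, which is where the O(n) bound comes from.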
Merge-Sort Tree
• An execution of merge-sort is depicted by a binary tree
  • each node represents a recursive call of merge-sort and stores
    • the unsorted sequence before the execution and its partition
    • the sorted sequence at the end of the execution (after →)
  • the root is the initial call
  • the leaves are calls on subsequences of size 0 or 1

                 7 2 ⏐ 9 4 → 2 4 7 9
      7 ⏐ 2 → 2 7                 9 ⏐ 4 → 4 9
   7 → 7       2 → 2           9 → 9       4 → 4
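The tree can also be produced by the sort itself. The sketch below is a hypothetical helper (`merge_sort_tree` is not from the slides) that records one line per recursive call, indented by depth, in the same "input with partition, then sorted output" form as the nodes above. It writes `|` and `->` in place of the slide's symbols and uses the standard-library `heapq.merge` for the conquer step.

```python
from heapq import merge


def merge_sort_tree(s, depth=0, lines=None):
    """Return (sorted copy of s, list of tree lines), one line per recursive call."""
    if lines is None:
        lines = []
    slot = len(lines)
    lines.append("")                          # reserve this node's line (preorder)
    if len(s) <= 1:                           # leaf: sequence of size 0 or 1
        result = list(s)
        shown = " ".join(map(str, s))
    else:
        mid = len(s) // 2
        left, _ = merge_sort_tree(s[:mid], depth + 1, lines)
        right, _ = merge_sort_tree(s[mid:], depth + 1, lines)
        result = list(merge(left, right))
        shown = " ".join(map(str, s[:mid])) + " | " + " ".join(map(str, s[mid:]))
    lines[slot] = "    " * depth + shown + " -> " + " ".join(map(str, result))
    return result, lines


_, tree = merge_sort_tree([7, 2, 9, 4])
print("\n".join(tree))
```

For the input 7 2 9 4 this prints seven lines: the root `7 2 | 9 4 -> 2 4 7 9`, its two children, and the four leaves, each nested under its parent.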
Execution Example
Input sequence: 7 2 9 4 3 8 6 1

1. Partition: 7 2 9 4 ⏐ 3 8 6 1
2. Recursive call, partition: 7 2 ⏐ 9 4
3. Recursive call, partition: 7 ⏐ 2
4. Recursive call, base case: 7 → 7
5. Recursive call, base case: 2 → 2
6. Merge: 7 ⏐ 2 → 2 7
7. Recursive call, …, base case, merge: 9 → 9 and 4 → 4, then 9 ⏐ 4 → 4 9
8. Merge: 7 2 ⏐ 9 4 → 2 4 7 9
9. Recursive call, …, merge, merge: the right half is sorted the same way, giving 3 ⏐ 8 → 3 8, 6 ⏐ 1 → 1 6, and 3 8 ⏐ 6 1 → 1 3 6 8
10. Merge: 7 2 9 4 ⏐ 3 8 6 1 → 1 2 3 4 6 7 8 9

Completed merge-sort tree:

                        7 2 9 4 ⏐ 3 8 6 1 → 1 2 3 4 6 7 8 9
          7 2 ⏐ 9 4 → 2 4 7 9                        3 8 ⏐ 6 1 → 1 3 6 8
   7 ⏐ 2 → 2 7        9 ⏐ 4 → 4 9           3 ⏐ 8 → 3 8        6 ⏐ 1 → 1 6
  7 → 7    2 → 2     9 → 9    4 → 4        3 → 3    8 → 8     6 → 6    1 → 1
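The order of events in this example (partition, recursive calls, base cases, merges) can be reproduced by logging inside the sort. The following is a rough sketch; `traced_merge_sort` and its `log` parameter are names invented for the illustration, not part of the lecture's algorithm.

```python
def traced_merge_sort(s, log, depth=0):
    """Return a sorted copy of `s`, appending one log entry per event."""
    pad = "  " * depth                      # indent entries by recursion depth
    if len(s) <= 1:
        log.append(f"{pad}base case: {s}")
        return list(s)
    mid = len(s) // 2
    log.append(f"{pad}partition: {s[:mid]} | {s[mid:]}")
    left = traced_merge_sort(s[:mid], log, depth + 1)
    right = traced_merge_sort(s[mid:], log, depth + 1)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]          # append the leftover run
    log.append(f"{pad}merge: {left} + {right} -> {merged}")
    return merged


events = []
result = traced_merge_sort([7, 2, 9, 4, 3, 8, 6, 1], events)
print("\n".join(events))
```

The log begins with the top-level partition, descends through 7 2 9 4 and 7 2 to the base cases 7 and 2, and ends with the final merge producing 1 2 3 4 6 7 8 9. For eight elements it holds 22 entries: a partition and a merge for each of the 7 internal nodes, plus 8 base cases.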
Non-Recursive Merge Sort
• A recursive implementation is less efficient (by a constant factor) than a non-recursive implementation
• Merge-sort can be implemented non-recursively
  • At iteration i, break the sequence into groups of size 2^(i-1)
  • Merge each pair of adjacent groups together, giving sorted groups of size 2, 4, 8, …

i=1:   7 ⏐ 2 ⏐ 9 ⏐ 4 ⏐ 3 ⏐ 8 ⏐ 6 ⏐ 1    →    2 7 ⏐ 4 9 ⏐ 3 8 ⏐ 1 6
i=2:   2 7 ⏐ 4 9 ⏐ 3 8 ⏐ 1 6            →    2 4 7 9 ⏐ 1 3 6 8
i=3:   2 4 7 9 ⏐ 1 3 6 8                →    1 2 3 4 6 7 8 9
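A bottom-up version can be sketched with a single loop over group widths. This is one possible implementation under simple assumptions: the sequence is a Python list, adjacent groups are merged with the standard-library `heapq.merge`, and the function name `bottom_up_merge_sort` and its returned list of intermediate passes are illustrative additions.

```python
from heapq import merge


def bottom_up_merge_sort(s):
    """Return (sorted copy of s, the sequence after each iteration)."""
    a = list(s)
    passes = []
    width = 1                                    # group size 2^(i-1) at iteration i
    while width < len(a):
        for lo in range(0, len(a), 2 * width):   # walk over pairs of adjacent groups
            mid = min(lo + width, len(a))
            hi = min(lo + 2 * width, len(a))
            a[lo:hi] = list(merge(a[lo:mid], a[mid:hi]))
        passes.append(list(a))
        width *= 2                               # merged groups are twice as large
    return a, passes


result, passes = bottom_up_merge_sort([7, 2, 9, 4, 3, 8, 6, 1])
for i, seq in enumerate(passes, start=1):
    print(f"i={i}:", seq)
# i=1: [2, 7, 4, 9, 3, 8, 1, 6]
# i=2: [2, 4, 7, 9, 1, 3, 6, 8]
# i=3: [1, 2, 3, 4, 6, 7, 8, 9]
```

The `min` calls handle lengths that are not a power of two, where the last group in a pass may be short or missing.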
Analysis of Merge-Sort
• Can write down a recurrence equation, with constants c and k:
    T(1) = c
    T(n) = kn + 2T(n/2)
• The term kn covers partitioning S and merging S1 and S2; the term 2T(n/2) covers the two recursive calls

Algorithm mergeSort(S, C)
    if S.size() > 1
        (S1, S2) ← partition(S, n/2)
        mergeSort(S1, C)
        mergeSort(S2, C)
        S ← merge(S1, S2)
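One way to see the recurrence at work is to count the work done by the merges directly. The sketch below counts only the elements written during merging, which corresponds to the recurrence with k = 1 and c = 0; that choice of what to count, and the name `merge_sort_count`, are assumptions made for the illustration.

```python
def merge_sort_count(s):
    """Return (sorted copy of s, number of elements written by all merges)."""
    if len(s) <= 1:
        return list(s), 0                    # base case does no merge work
    mid = len(s) // 2
    left, work_left = merge_sort_count(s[:mid])
    right, work_right = merge_sort_count(s[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]
    # this call writes len(merged) = n elements, plus the work of its two subcalls
    return merged, work_left + work_right + len(merged)


for m in range(1, 5):
    n = 2 ** m
    _, work = merge_sort_count(list(range(n, 0, -1)))
    print(n, work)
# 2 2
# 4 8
# 8 24
# 16 64
```

For n a power of two the count is exactly n times log n (base 2), independent of the input order, since every merge at a given depth of the recursion writes each of its elements once.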
Analysis of Merge-Sort (cont’d)
• Let’s solve it by substitution, with T(1) = c and T(n) = kn + 2T(n/2):
    T(n) = kn + 2T(n/2)
         = kn + 2[kn/2 + 2T(n/4)]
         = kn + kn + 4T(n/4)
         = 2kn + 4T(n/4)
         = 2kn + 4[kn/4 + 2T(n/8)]
         = 3kn + 8T(n/8)
         = …
         = ikn + 2^i T(n/2^i)
• The “unwrapping” will stop when n/2^i = 1, that is, when i = log n (base 2)
• Thus T(n) = (log n) kn + 2^(log n) T(1) = kn log n + cn
• Thus the running time is O(n log n)
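The closed form can be checked numerically against the recurrence. The sketch below assumes n is a power of two and picks arbitrary integer values for the constants c and k; the helper names `make_T` and `closed_form` are hypothetical.

```python
from functools import lru_cache


def make_T(c, k):
    """Build T with T(1) = c and T(n) = k*n + 2*T(n/2), for n a power of two."""
    @lru_cache(maxsize=None)
    def T(n):
        if n == 1:
            return c
        return k * n + 2 * T(n // 2)
    return T


def closed_form(n, c, k):
    """k*n*log2(n) + c*n, valid when n is a power of two."""
    log_n = n.bit_length() - 1        # exact base-2 logarithm of a power of two
    return k * n * log_n + c * n


c, k = 3, 5                           # arbitrary illustrative constants
T = make_T(c, k)
for n in (1, 2, 4, 8, 1024):
    print(n, T(n), closed_form(n, c, k))
```

Both columns agree on every row; for example, n = 8 gives 144 from the recurrence and 5·8·3 + 3·8 = 144 from the closed form.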
Analysis of Merge-Sort (cont’d)
• Let’s solve it using a recursion tree, with T(1) = c and T(n) = kn + 2T(n/2):
    T(n) = kn + 2T(n/2)
         = kn + 2[kn/2 + 2T(n/4)]
         = kn + kn + 4T(n/4)
         = 2kn + 4T(n/4)
         = 2kn + 4[kn/4 + 2T(n/8)]
         = 3kn + 8T(n/8)
         = …
         = ikn + 2^i T(n/2^i)
• Draw a recursion tree on the board.
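As a stand-in for the board drawing, the per-level costs of the recursion tree can be tabulated. This sketch assumes n is a power of two and arbitrary integer constants c and k; the function name `recursion_tree_levels` is made up for the illustration. At depth j the tree has 2^j nodes, each handling a subproblem of size n/2^j at cost k·n/2^j, so every internal level costs kn in total, and the n leaves cost c each.

```python
def recursion_tree_levels(n, c, k):
    """Return (depth, nodes, subproblem size, level cost) rows for n a power of two."""
    levels = []
    depth, size = 0, n
    while size > 1:
        nodes = n // size                              # 2^depth nodes at this depth
        levels.append((depth, nodes, size, nodes * k * size))
        depth += 1
        size //= 2
    levels.append((depth, n, 1, n * c))                # n leaves, each costing T(1) = c
    return levels


c, k, n = 3, 5, 8                                      # arbitrary illustrative values
levels = recursion_tree_levels(n, c, k)
for depth, nodes, size, cost in levels:
    print(f"depth {depth}: {nodes} node(s) of size {size}, cost {cost}")
print("total:", sum(cost for *_, cost in levels))
# depth 0: 1 node(s) of size 8, cost 40
# depth 1: 2 node(s) of size 4, cost 40
# depth 2: 4 node(s) of size 2, cost 40
# depth 3: 8 node(s) of size 1, cost 24
# total: 144
```

Summing log n internal levels of cost kn and one leaf level of cost cn gives kn log n + cn, the same result as substitution.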