10/31/2013
• Matrix-chain multiplication problem: given a chain A_1, A_2, …, A_n of n matrices, where matrix A_i has dimension p_{i-1} × p_i, fully parenthesize the product A_1 A_2 ⋯ A_n in a way that minimizes the number of scalar multiplications
• We are not actually multiplying matrices
• Our goal is only to determine an order for
multiplying matrices that has the lowest cost
• Typically, the time invested in determining this
optimal order is more than paid for by the time
saved later on when actually performing the
matrix multiplications
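Why the order matters: even for three matrices the cost can differ by an order of magnitude. A small sketch (the dimensions 10×100, 100×5, and 5×50 are an illustrative choice, not taken from these slides):

```python
# Cost model: multiplying an (a x b) matrix by a (b x c) matrix
# takes a*b*c scalar multiplications.
def mult_cost(a, b, c):
    return a * b * c

# Chain A1 A2 A3 with illustrative dimensions 10x100, 100x5, 5x50.
cost_left = mult_cost(10, 100, 5) + mult_cost(10, 5, 50)     # ((A1 A2) A3)
cost_right = mult_cost(100, 5, 50) + mult_cost(10, 100, 50)  # (A1 (A2 A3))

print(cost_left, cost_right)  # → 7500 75000
```

Here the left-first order is ten times cheaper, even though both orders compute the same matrix.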
MAT-72006 AA+DS, Fall 2013
31-Oct-13
415
Counting the number of parenthesizations
• Exhaustive checking of all possible parentheses combinations doesn't yield an efficient algorithm
• Let the number of alternative parenthesizations of a sequence of n matrices be P(n)
• When n = 1, we have just one matrix and only one way to fully parenthesize the matrix product
• When n ≥ 2, a fully parenthesized matrix product is the product of two fully parenthesized matrix subproducts, and the split between the two may occur between the kth and (k + 1)st matrices for any k = 1, 2, …, n − 1
• Thus, we obtain the recurrence

  P(n) = 1                            if n = 1
  P(n) = Σ_{k=1..n-1} P(k) P(n − k)   if n ≥ 2

• We leave it as an exercise to show that the solution to this recurrence is Ω(2^n)
• The number of solutions is exponential in n
• The brute-force method of exhaustive search
makes for a poor strategy when determining
how to optimally parenthesize a matrix chain
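The recurrence above is easy to evaluate directly; a memoized sketch (the function name is ours):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def num_parens(n):
    """P(n): the number of full parenthesizations of a chain of n matrices."""
    if n == 1:
        return 1
    # Split between the k-th and (k+1)-st matrices, for k = 1 .. n-1.
    return sum(num_parens(k) * num_parens(n - k) for k in range(1, n))

print([num_parens(n) for n in range(1, 7)])  # → [1, 1, 2, 5, 14, 42]
```

The values 1, 1, 2, 5, 14, 42, … are the Catalan numbers C(n − 1), consistent with the Ω(2^n) growth.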
Applying dynamic programming
Step 1: The structure of an optimal
parenthesization
• Let us adopt the notation A_{i..j}, for i ≤ j, for the matrix that results from evaluating the product A_i A_{i+1} ⋯ A_j
• Observe that if i < j, then to parenthesize the product A_i A_{i+1} ⋯ A_j, we must split the product between A_k and A_{k+1} for some integer k in the range i ≤ k < j
• I.e., for some value of k, we first compute the matrices A_{i..k} and A_{k+1..j} and then multiply them together to produce the final product A_{i..j}
• The cost: cost of computing the matrix A_{i..k} + cost of computing A_{k+1..j} + cost of multiplying them together
• The optimal substructure of this problem:
• Suppose that to optimally parenthesize A_i A_{i+1} ⋯ A_j, we split the product between A_k and A_{k+1}
• Then the way we parenthesize the subchain A_i A_{i+1} ⋯ A_k within this optimal parenthesization of A_i A_{i+1} ⋯ A_j must be an optimal parenthesization of A_i A_{i+1} ⋯ A_k
• If there were a less costly way to parenthesize A_i A_{i+1} ⋯ A_k, then substituting that parenthesization in the optimal parenthesization of A_i A_{i+1} ⋯ A_j would produce a way to parenthesize A_i A_{i+1} ⋯ A_j with lower cost than the optimum: a contradiction
• We can construct an optimal solution to the problem
from optimal solutions to subproblems
• Solution to a nontrivial instance requires us to split
the product, and any optimal solution contains within
it optimal solutions to subproblem instances
• We can build an optimal solution by splitting the problem into two subproblems (optimally parenthesizing A_i A_{i+1} ⋯ A_k and A_{k+1} A_{k+2} ⋯ A_j), finding optimal solutions to these, and then combining these optimal subproblem solutions
• We must ensure that when we search for the correct
place to split the product, we have considered all
possible places, so that we are sure of having
examined the optimal one
Step 2: A recursive solution
• Next, we define the cost of an optimal solution
recursively in terms of the optimal solutions to
subproblems
• For the matrix-chain multiplication problem, we pick as our subproblems the problems of determining the minimum cost of parenthesizing A_i A_{i+1} ⋯ A_j for 1 ≤ i ≤ j ≤ n
• Let m[i, j] be the minimum number of scalar multiplications needed to compute the matrix A_{i..j}; for the full problem, the lowest-cost way to compute A_{1..n} would thus be m[1, n]
We define m[i, j] recursively as follows:
• If i = j, no scalar multiplications are necessary; thus, m[i, i] = 0 for i = 1, 2, …, n
• To compute m[i, j] when i < j, we take advantage of the structure of an optimal solution from step 1
• Assume that to optimally parenthesize, we split the product A_i A_{i+1} ⋯ A_j between A_k and A_{k+1}, where i ≤ k < j
• Then, m[i, j] = the minimum cost for computing the subproducts A_{i..k} and A_{k+1..j}, plus the cost of multiplying these two matrices together
• Recalling that each matrix A_i is p_{i-1} × p_i, we see that computing the matrix product A_{i..k} A_{k+1..j} takes p_{i-1} p_k p_j scalar multiplications
• Thus, we obtain

  m[i, j] = m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j

• This recursive equation assumes that we know the value of k, which we do not
• There are only j − i possible values for k, however, namely k = i, i + 1, …, j − 1
• Since the optimal parenthesization must use one of these values for k, we need only check them all to find the best
• Thus, our recursive definition for the minimum cost of parenthesizing the product A_i A_{i+1} ⋯ A_j becomes

  m[i, j] = 0                                                          if i = j
  m[i, j] = min_{i ≤ k < j} { m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j }  if i < j
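The recursive definition above translates directly into a (deliberately naive, exponential-time) sketch; p holds the dimensions, with A_i of size p[i − 1] × p[i]:

```python
def recursive_matrix_chain(p, i, j):
    """m[i, j] evaluated straight from the recurrence (no memoization)."""
    if i == j:
        return 0
    # Try every split point k and take the cheapest.
    return min(recursive_matrix_chain(p, i, k)
               + recursive_matrix_chain(p, k + 1, j)
               + p[i - 1] * p[k] * p[j]
               for k in range(i, j))
```

For the six-matrix example used later in these slides, p = [30, 35, 15, 5, 10, 20, 25], the call recursive_matrix_chain(p, 1, 6) returns 15125; the exponential blow-up only shows for longer chains.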
• The m[i, j] values give the costs of optimal solutions to subproblems, but they do not provide an optimal solution
• To help us construct one, we define s[i, j] to be a value of k at which we split the product A_i A_{i+1} ⋯ A_j in an optimal parenthesization
• That is, s[i, j] equals a value k in the range i ≤ k < j such that m[i, j] = m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j
Step 3: Computing the optimal costs
• Now we could easily write a recursive algorithm based on the recurrence to compute the minimum cost m[1, n] for multiplying A_1 A_2 ⋯ A_n
• This recursive algorithm takes exponential time
• It is no better than the brute-force method of checking each way of parenthesizing the product
• We have relatively few distinct subproblems: one subproblem for each choice of i and j satisfying 1 ≤ i ≤ j ≤ n, or (n choose 2) + n = Θ(n^2) in all
• We encounter each subproblem many times in different branches of the recursion tree
• We implement the tabular, bottom-up method in the procedure MATRIX-CHAIN-ORDER, which assumes that matrix A_i has dimensions p_{i-1} × p_i for i = 1, 2, …, n
• Its input is a sequence p = ⟨p_0, p_1, …, p_n⟩, where p.length = n + 1
• The procedure uses
  – a table m[1..n, 1..n] for storing the m[i, j] costs and
  – another table s[1..n − 1, 2..n] that records which index k achieved the optimal cost in computing m[i, j]
• We use the table s to construct an optimal solution
• By the recurrence, the cost m[i, j] of computing a matrix-chain product of j − i + 1 matrices depends only on the costs of computing matrix-chain products of fewer than j − i + 1 matrices
• I.e., for k = i, i + 1, …, j − 1, the matrix A_{i..k} is a product of k − i + 1 < j − i + 1 matrices and A_{k+1..j} is a product of j − k < j − i + 1 matrices
• Thus, we should fill in the table m in a manner that corresponds to solving the parenthesization problem on matrix chains of increasing length
• For the subproblem of optimally parenthesizing the chain A_i A_{i+1} ⋯ A_j, we consider the subproblem size to be the length j − i + 1 of the chain
MATRIX-CHAIN-ORDER(p)
1.  n = p.length − 1
2.  let m[1..n, 1..n] and s[1..n − 1, 2..n] be new tables
3.  for i = 1 to n
4.      m[i, i] = 0
5.  for l = 2 to n          // l is the chain length
6.      for i = 1 to n − l + 1
7.          j = i + l − 1
8.          m[i, j] = ∞
9.          for k = i to j − 1
10.             q = m[i, k] + m[k + 1, j] + p_{i-1} p_k p_j
11.             if q < m[i, j]
12.                 m[i, j] = q
13.                 s[i, j] = k
14. return m and s
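A sketch of the same procedure in Python (our transcription; the tables use row and column 0 as unused padding so the pseudocode's 1-based indices carry over):

```python
import math

def matrix_chain_order(p):
    """Bottom-up computation of m and s, as in MATRIX-CHAIN-ORDER.

    p[0..n] holds the dimensions: matrix A_i is p[i-1] x p[i].
    m[i][j] is the minimum scalar-multiplication count for A_i..A_j;
    s[i][j] is a split index k achieving it.
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):          # chain length l
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):           # try every split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s
```

For the dimension sequence p = ⟨30, 35, 15, 5, 10, 20, 25⟩ used in the example that follows, m[1][6] comes out to 15,125.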
• The call PRINT-OPTIMAL-PARENS(s, 1, 6) prints the parenthesization ((A_1 (A_2 A_3)) ((A_4 A_5) A_6))
• The minimum number of scalar multiplications to multiply the 6 matrices is m[1, 6] = 15,125
• Of the darker entries, the pairs that have the same shading are taken together in line 10 when computing

  m[2, 5] = min { m[2, 2] + m[3, 5] + p_1 p_2 p_5 = 0 + 2500 + 35 · 15 · 20 = 13,000,
                  m[2, 3] + m[4, 5] + p_1 p_3 p_5 = 2625 + 1000 + 35 · 5 · 20 = 7125,
                  m[2, 4] + m[5, 5] + p_1 p_4 p_5 = 4375 + 0 + 35 · 10 · 20 = 11,375 }
          = 7125
• The nested loop structure of MATRIX-CHAIN-ORDER yields a running time of O(n^3) for the algorithm
  – The loops are nested three deep, and each loop index (l, i, and k) takes on at most n − 1 values
• The running time of this algorithm is in fact also Ω(n^3)
• The algorithm requires Θ(n^2) space to store the tables m and s
• MATRIX-CHAIN-ORDER is much more efficient than the exponential-time method of enumerating all possible parenthesizations and checking each one
Step 4: Constructing an optimal solution
• Table s[1..n − 1, 2..n] gives us the information we need to multiply the matrices
• Entry s[i, j] records a value of k s.t. an optimal parenthesization of A_i A_{i+1} ⋯ A_j splits the product between A_k and A_{k+1}
• Thus, we know that the final matrix multiplication in computing A_{1..n} optimally is A_{1..s[1,n]} · A_{s[1,n]+1..n}
• s[1, s[1, n]] determines the last matrix multiplication when computing A_{1..s[1,n]}, and s[s[1, n] + 1, n] determines the last matrix multiplication when computing A_{s[1,n]+1..n}
• The following procedure prints an optimal parenthesization of ⟨A_i, A_{i+1}, …, A_j⟩, given the s table computed by MATRIX-CHAIN-ORDER and the indices i and j
• The call PRINT-OPTIMAL-PARENS(s, 1, n) prints an optimal parenthesization of ⟨A_1, A_2, …, A_n⟩
PRINT-OPTIMAL-PARENS(s, i, j)
1. if i == j
2.     print "A"_i
3. else print "("
4.     PRINT-OPTIMAL-PARENS(s, i, s[i, j])
5.     PRINT-OPTIMAL-PARENS(s, s[i, j] + 1, j)
6.     print ")"
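The same procedure in Python, returning the parenthesization as a string rather than printing it piece by piece; s may be any mapping for which s[i][j] gives the optimal split point (such as the table built by MATRIX-CHAIN-ORDER):

```python
def print_optimal_parens(s, i, j):
    """Build the optimal parenthesization of A_i..A_j as a string."""
    if i == j:
        return f"A{i}"
    k = s[i][j]  # optimal split: (A_i..A_k)(A_k+1..A_j)
    return ("(" + print_optimal_parens(s, i, k)
                + print_optimal_parens(s, k + 1, j) + ")")
```

With the s entries of the six-matrix example, print_optimal_parens(s, 1, 6) returns "((A1(A2A3))((A4A5)A6))".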
15.4 Longest common subsequence
• Biological applications often need to compare
the DNA of two (or more) different organisms
• A strand of DNA consists of a string of molecules
called bases, where the possible bases are
Adenine, Guanine, Cytosine, and Thymine
• We express a strand of DNA as a string over the
alphabet {A,C,G,T}
• E.g., the DNA of two organisms may be
  – S_1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
  – S_2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
• By comparing two strands of DNA we determine
how “similar” they are, as some measure of how
closely related the two organisms are
• We can define similarity in many different ways
• E.g., we can say that two DNA strands are similar if one is a substring of the other
  – Neither of the two strands above is a substring of the other
• Alternatively, we could say that two strands are
similar if the number of changes needed to turn
one into the other is small
• Yet another way to measure the similarity of strands S_1 and S_2 is by finding a third strand S_3 in which the bases in S_3 appear in each of S_1 and S_2
  – these bases must appear in the same order, but not necessarily consecutively
• The longer the strand S_3 we can find, the more similar S_1 and S_2 are
• In our example, the longest such strand S_3 is
  – S_1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA
  – S_2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA
  – S_3 = GTCGTCGGAAGCCGGCCGAA
• We formalize this last notion of similarity as the longest-common-subsequence problem
• A subsequence is just the given sequence with zero or more elements left out
• Formally, given a sequence X = ⟨x_1, x_2, …, x_m⟩, another sequence Z = ⟨z_1, z_2, …, z_k⟩ is a subsequence of X if there exists a strictly increasing sequence ⟨i_1, i_2, …, i_k⟩ of indices of X such that for all j = 1, 2, …, k, we have x_{i_j} = z_j
• For example, Z = ⟨B, C, D, B⟩ is a subsequence of X = ⟨A, B, C, B, D, A, B⟩ with corresponding index sequence ⟨2, 3, 5, 7⟩
• We say that a sequence Z is a common subsequence of X and Y if Z is a subsequence of both X and Y
• For example, if X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩, the sequence ⟨B, C, A⟩ is a common subsequence of both X and Y
• ⟨B, C, A⟩ is not a longest common subsequence (LCS) of X and Y
• The sequence ⟨B, C, B, A⟩, which is also common to both X and Y, has length 4
• This sequence is an LCS of X and Y, as is ⟨B, D, A, B⟩; X and Y have no common subsequence of length 5 or greater
• In the longest-common-subsequence problem, we are given sequences X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩ and wish to find a maximum-length common subsequence of X and Y
Step 1: Characterizing a longest common subsequence
• In a brute-force approach, we would enumerate all subsequences of X and check each of them to see whether it is also a subsequence of Y, keeping track of the longest subsequence we find
• Each subsequence of X corresponds to a subset of the indices {1, 2, …, m} of X
• Because X has 2^m subsequences, this approach requires exponential time, making it impractical for long sequences
• The LCS problem has an optimal-substructure
property, however, as the following theorem
shows
• We shall see that the natural classes of subproblems correspond to pairs of "prefixes" of the two input sequences
• Precisely, given a sequence X = ⟨x_1, x_2, …, x_m⟩, we define the ith prefix of X, for i = 0, 1, …, m, as X_i = ⟨x_1, x_2, …, x_i⟩
• For example, if X = ⟨A, B, C, B, D, A, B⟩, then X_4 = ⟨A, B, C, B⟩ and X_0 is the empty sequence
Theorem 15.1 (Optimal substructure of an LCS)
Let X = ⟨x_1, x_2, …, x_m⟩ and Y = ⟨y_1, y_2, …, y_n⟩ be sequences, and let Z = ⟨z_1, z_2, …, z_k⟩ be any LCS of X and Y.
1. If x_m = y_n, then z_k = x_m = y_n and Z_{k-1} is an LCS of X_{m-1} and Y_{n-1}.
2. If x_m ≠ y_n, then z_k ≠ x_m implies that Z is an LCS of X_{m-1} and Y.
3. If x_m ≠ y_n, then z_k ≠ y_n implies that Z is an LCS of X and Y_{n-1}.
Proof (1) If z_k ≠ x_m, then we could append x_m = y_n to Z to obtain a common subsequence of X and Y of length k + 1, contradicting the supposition that Z is an LCS of X and Y. Thus, we must have z_k = x_m = y_n. Now, the prefix Z_{k-1} is a length-(k − 1) common subsequence of X_{m-1} and Y_{n-1}. We wish to show that it is an LCS. Suppose for the purpose of contradiction that there exists a common subsequence W of X_{m-1} and Y_{n-1} with length greater than k − 1. Then, appending x_m = y_n to W produces a common subsequence of X and Y whose length is greater than k, which is a contradiction.
(2) If z_k ≠ x_m, then Z is a common subsequence of X_{m-1} and Y. If there were a common subsequence W of X_{m-1} and Y with length greater than k, then W would also be a common subsequence of X and Y, contradicting the assumption that Z is an LCS of X and Y.
(3) The proof is symmetric to (2).
• Theorem 15.1 tells us that an LCS of two sequences contains within it an LCS of prefixes of the two sequences
• Thus, the LCS problem has an optimal-substructure property
Step 2: A recursive solution
• We examine either one or two subproblems when finding an LCS of X and Y
• If x_m = y_n, we find an LCS of X_{m-1} and Y_{n-1}
• Appending x_m = y_n to this LCS yields an LCS of X and Y
• If x_m ≠ y_n, then we (1) find an LCS of X_{m-1} and Y and (2) find an LCS of X and Y_{n-1}
• Whichever of these two LCSs is longer is an LCS of X and Y
• These cases exhaust all possibilities, and we know that one of the optimal subproblem solutions must appear within an LCS of X and Y
• To find an LCS of X and Y, we may need to find the LCSs of X and Y_{n-1} and of X_{m-1} and Y
• Each of these subproblems has the subsubproblem of finding an LCS of X_{m-1} and Y_{n-1}
• Many other subproblems share subsubproblems
• As in matrix-chain multiplication, a recursive solution to the LCS problem involves a recurrence for the value of an optimal solution
• Let us define c[i, j] to be the length of an LCS of the sequences X_i and Y_j
• If either i = 0 or j = 0, one of the sequences has length 0, and so the LCS has length 0
• The optimal substructure of the LCS problem gives the recurrence

  c[i, j] = 0                                if i = 0 or j = 0
  c[i, j] = c[i − 1, j − 1] + 1              if i, j > 0 and x_i = y_j
  c[i, j] = max(c[i, j − 1], c[i − 1, j])    if i, j > 0 and x_i ≠ y_j

• Observe that a condition in the problem restricts which subproblems we may consider
• When x_i = y_j, we consider finding an LCS of X_{i-1} and Y_{j-1}
• Otherwise, we instead consider the two subproblems of finding an LCS of X_i and Y_{j-1} and of X_{i-1} and Y_j
• In the previous dynamic-programming algorithms (for rod cutting and matrix-chain multiplication) we ruled out no subproblems due to conditions in the problem
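A top-down sketch of this recurrence with memoization (0-based Python strings, so x_i corresponds to X[i − 1]):

```python
from functools import lru_cache

def lcs_length_recursive(X, Y):
    """Length of an LCS of X and Y, straight from the recurrence for c[i, j]."""
    @lru_cache(maxsize=None)
    def c(i, j):
        if i == 0 or j == 0:
            return 0
        if X[i - 1] == Y[j - 1]:            # x_i = y_j
            return c(i - 1, j - 1) + 1
        return max(c(i, j - 1), c(i - 1, j))
    return c(len(X), len(Y))
```

For the running example, lcs_length_recursive("ABCBDAB", "BDCABA") returns 4.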
Step 3: Computing the length of an LCS
• Since the LCS problem has only Θ(mn) distinct subproblems, we can use dynamic programming to compute the solutions bottom up
• LCS-LENGTH stores the c[i, j] values in table c[0..m, 0..n], and it computes the entries in row-major order
  – I.e., the procedure fills in the first row of c from left to right, then the second row, and so on
• The procedure also maintains the table b[1..m, 1..n]
• Intuitively, b[i, j] points to the table entry corresponding to the optimal subproblem solution chosen when computing c[i, j]
• c[m, n] contains the length of an LCS of X and Y
LCS-LENGTH(X, Y)
1.  m = X.length
2.  n = Y.length
3.  let b[1..m, 1..n] and c[0..m, 0..n] be new tables
4.  for i = 1 to m
5.      c[i, 0] = 0
6.  for j = 0 to n
7.      c[0, j] = 0
8.  for i = 1 to m
9.      for j = 1 to n
10.         if x_i == y_j
11.             c[i, j] = c[i − 1, j − 1] + 1
12.             b[i, j] = "↖"
13.         elseif c[i − 1, j] ≥ c[i, j − 1]
14.             c[i, j] = c[i − 1, j]
15.             b[i, j] = "↑"
16.         else c[i, j] = c[i, j − 1]
17.             b[i, j] = "←"
18. return c and b

Running time: Θ(mn)
(Figure omitted: the c and b tables computed by LCS-LENGTH on X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩.)
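A bottom-up sketch in Python of the c table only (the b arrows are omitted here, since the direction can be re-derived from c):

```python
def lcs_length(X, Y):
    """Row-major bottom-up fill of c; c[m][n] is the LCS length."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 / column 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:           # x_i = y_j
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c
```

For X = "ABCBDAB" and Y = "BDCABA", the entry c[7][6] is 4, matching the table in the figure.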
Step 4: Constructing an LCS
• The b table returned by LCS-LENGTH enables us to quickly construct an LCS of X and Y
• We simply begin at b[m, n] and trace through the table by following the arrows
• Whenever we encounter a "↖" in entry b[i, j], it implies that x_i = y_j is an element of the LCS that LCS-LENGTH found
• With this method, we encounter the elements of this LCS in reverse order
• With this method, we encounter the elements of
this LCS in reverse order
• A recursive procedure prints out an LCS of
and in the proper, forward order
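Steps 3 and 4 combined in one self-contained Python sketch: fill in the b arrows as in LCS-LENGTH, then trace them back from b[m][n]:

```python
def lcs(X, Y):
    """Compute the b table, then trace the arrows back from b[m][n]
    to recover one LCS (returned in forward order)."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"            # "↖"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"              # "↑"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"            # "←"
    # Trace back: every "diag" contributes x_i to the LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if b[i][j] == "diag":
            out.append(X[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))  # → BCBA
```

Appending elements while walking backward and reversing at the end plays the role of the recursive procedure's forward-order printing.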
• The square in row i and column j contains the value of c[i, j] and the appropriate arrow for the value of b[i, j]
• The entry 4 in c[7, 6], the lower right-hand corner of the table, is the length of an LCS of X and Y
• For i, j > 0, entry c[i, j] depends only on whether x_i = y_j and the values in entries c[i − 1, j], c[i, j − 1], and c[i − 1, j − 1], which are computed before c[i, j]
• To reconstruct the elements of an LCS, follow the b[i, j] arrows from the lower right-hand corner
• Each "↖" on the shaded sequence corresponds to an entry (highlighted) for which x_i = y_j is a member of an LCS
Improving the code
• Each c[i, j] entry depends on only 3 other table entries: c[i − 1, j], c[i, j − 1], and c[i − 1, j − 1]
• Given the value of c[i, j], we can determine in O(1) time which of these three values was used to compute c[i, j], without inspecting table b
• We can thus reconstruct an LCS in O(m + n) time
• The auxiliary space requirement for computing an LCS does not asymptotically decrease, since we need Θ(mn) space for the c table anyway
• We can, however, reduce the asymptotic space requirements for LCS-LENGTH, since it needs only two rows of table c at a time
  – the row being computed and the previous row
• This improvement works if we need only the length of an LCS
  – if we need to reconstruct the elements of an LCS, the smaller table does not keep enough information to retrace our steps in O(m + n) time
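The two-row idea as a sketch; it returns only the length, as noted above:

```python
def lcs_length_two_rows(X, Y):
    """Length of an LCS of X and Y using only two rows of c at a time."""
    m, n = len(X), len(Y)
    prev = [0] * (n + 1)            # row i-1 of c
    for i in range(1, m + 1):
        curr = [0] * (n + 1)        # row i of c, being computed
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]
```

This keeps O(n) extra space instead of Θ(mn), at the cost of losing the information needed to trace an LCS back.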
15.5 Optimal binary search trees
• We are designing a program to translate text
• Perform lookup operations by building a BST with
words as keys and their equivalents as satellite data
• We can ensure an O(lg n) search time per occurrence by using a red-black tree or any other balanced BST
• In a balanced tree, however, a frequently used word may appear far from the root while a rarely used word appears near the root
• We want frequent words to be placed nearer the root
• How do we organize a BST so as to minimize the number of nodes visited in all searches, given that we know how often each word occurs?
• What we need is known as an optimal binary
search tree
• Formally, given a sequence K = ⟨k_1, k_2, …, k_n⟩ of n distinct sorted keys (k_1 < k_2 < ⋯ < k_n), we wish to build a BST from these keys
• For each key k_i, we have a probability p_i that a search will be for k_i
• Some searches may be for values not in K, so we also have n + 1 "dummy keys" d_0, d_1, …, d_n representing values not in K
• In particular, d_0 represents all values less than k_1, and d_n represents all values greater than k_n
• For i = 1, 2, …, n − 1, the dummy key d_i represents all values between k_i and k_{i+1}
• For each dummy key d_i, we have a probability q_i that a search will correspond to d_i
• Each key k_i is an internal node, and each dummy key d_i is a leaf
• Every search is either successful (finds a key k_i) or unsuccessful (finds a dummy key d_i), and so we have

  Σ_{i=1..n} p_i + Σ_{i=0..n} q_i = 1

• Because we have probabilities of searches for each key and each dummy key, we can determine the expected cost of a search in a given BST T
• Let us assume that the actual cost of a search equals the number of nodes examined, i.e., the depth in T of the node found by the search, plus 1
• Then the expected cost of a search in T is

  E[search cost in T] = Σ_{i=1..n} (depth_T(k_i) + 1) · p_i + Σ_{i=0..n} (depth_T(d_i) + 1) · q_i
                      = 1 + Σ_{i=1..n} depth_T(k_i) · p_i + Σ_{i=0..n} depth_T(d_i) · q_i

• where depth_T denotes a node's depth in the tree T
  Node    Depth   Probability   Contribution
  k_1     1       0.15          0.30
  k_2     0       0.10          0.10
  k_3     2       0.05          0.15
  k_4     1       0.10          0.20
  k_5     2       0.20          0.60
  d_0     2       0.05          0.15
  d_1     2       0.10          0.30
  d_2     3       0.05          0.20
  d_3     3       0.05          0.20
  d_4     3       0.05          0.20
  d_5     3       0.10          0.40
  Total                         2.80

• Each contribution is (depth + 1) · probability; the total 2.80 is the expected search cost of this BST
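The expected cost of this tree can be checked mechanically from the (depth, probability) pairs in the table above; a small sketch:

```python
def expected_search_cost(nodes):
    """E[search cost] = sum of (depth + 1) * probability over all nodes."""
    return sum((depth + 1) * prob for depth, prob in nodes)

# (depth, probability) pairs for k_1..k_5 followed by d_0..d_5,
# transcribed from the table above.
nodes = [(1, 0.15), (0, 0.10), (2, 0.05), (1, 0.10), (2, 0.20),
         (2, 0.05), (2, 0.10), (3, 0.05), (3, 0.05), (3, 0.05), (3, 0.10)]

print(round(expected_search_cost(nodes), 2))  # → 2.8
```

Changing the depths (i.e., the tree shape) while keeping the probabilities fixed is exactly what the optimal-BST construction optimizes over.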
• For a given set of probabilities, we wish to construct
a BST whose expected search cost is smallest
• We call such a tree an optimal binary search tree
• An optimal BST for the probabilities given has
expected cost 2.75
• An optimal BST is not necessarily a tree whose
overall height is smallest
• Nor can we necessarily construct an optimal BST
by always putting the key with the greatest
probability at the root.
• The lowest expected cost of any BST with k_5, the key with the greatest probability, at the root is 2.85