Chomsky Hierarchy Language Operations and Properties

CYK Algorithm
Introduction
• Problem: Given a context-free grammar and a
string s, is it possible to decide whether s can
be generated by the grammar or not?
• If the grammar is in arbitrary form, testing
this is not so efficient.
• If the grammar is in Chomsky Normal Form,
we have an elegant algorithm for testing this:
the CYK algorithm.
The CYK algorithm
• Suppose that we are given a grammar in
Chomsky Normal Form:
S → AB
A → BB | 0
B → AA | 1
• We would like to see if 10110 is generated by
this grammar or not.
Substrings of length 1
• Since the only way to produce a terminal is by
a rule of the form X → a, just replace every
terminal with the variables that produce it.
            1   0   1   1   0
length 1:   B   A   B   B   A
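The length 1 step can be sketched in a few lines of Python. This is only an illustration under assumptions: the name TERMINAL_RULES and the function length_one_row are hypothetical, and the dictionary simply encodes the two terminal rules of the example grammar (A → 0 and B → 1).

```python
# Terminal rules of the example grammar: A -> 0 and B -> 1.
# The dictionary maps each terminal to the variables that produce it.
TERMINAL_RULES = {"0": {"A"}, "1": {"B"}}


def length_one_row(s):
    """Return, for each symbol of s, the set of variables that produce it."""
    return [set(TERMINAL_RULES.get(ch, set())) for ch in s]


print(length_one_row("10110"))
```

A symbol that no rule produces gets an empty set, which corresponds to a "-" entry in the table.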
Substrings of length 2
Suppose now that we want to see how every substring
of length 2 can be generated. This is equivalent to
finding ways to produce all the length 2 substrings
where terminals are replaced with the variables that
represent them. Since every other rule is of the form
X → YZ, it suffices to replace every two consecutive
variables with the variables that produce them.
A "-" marks a substring that no variable produces.
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
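As a rough sketch of the length 2 step, the code below pairs every two neighbouring entries of the length 1 row and looks the pair up among the rules of the form X → YZ. The names PAIR_RULES and length_two_row are assumptions made for this illustration; the dictionary encodes S → AB, A → BB and B → AA from the example grammar.

```python
# Pair rules of the example grammar, indexed by their right-hand side:
# S -> AB, A -> BB, B -> AA.
PAIR_RULES = {("A", "B"): {"S"}, ("B", "B"): {"A"}, ("A", "A"): {"B"}}


def length_two_row(row1):
    """Combine each pair of neighbouring length 1 entries."""
    row2 = []
    for left, right in zip(row1, row1[1:]):
        produced = set()
        for x in left:
            for y in right:
                produced |= PAIR_RULES.get((x, y), set())
        row2.append(produced)
    return row2


row1 = [{"B"}, {"A"}, {"B"}, {"B"}, {"A"}]   # length 1 row for 10110
print(length_two_row(row1))                  # an empty set plays the role of "-"
```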
Substrings of length 3
• To produce the substring 101 (in 10110) we can
either take 1 with 01 or 10 with 1. For 1 with 01 we
get BS, which cannot be produced by any variable.
For 10 with 1 we don't have a pair, since 10 cannot
be produced. So 101 gets a "-".
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -
Substrings of length 3
• To produce the substring 011 (in 10110) we can
either take 0 with 11 or 01 with 1. For 0 with 11 we
get AA, which can be produced by B. For 01 with 1 we
get SB, which cannot be produced by any variable.
So 011 gets B.
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -   B
Substrings of length 3
• To produce the substring 110 (in 10110) we can
either take 1 with 10 or 11 with 0. For 1 with 10 we
don't have a pair, since 10 cannot be produced by a
variable. For 11 with 0 we get AA, which can be
produced by B. So 110 gets B.
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -   B   B
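The three length 3 cases follow one pattern: try every way to cut the substring in two, and look up each pair of variables among the rules. A minimal sketch of that pattern is given below. The function name fill_cell, the 0-based start index and the list-of-rows layout are assumptions for illustration, not part of the slides.

```python
# Pair rules of the example grammar: S -> AB, A -> BB, B -> AA.
PAIR_RULES = {("A", "B"): {"S"}, ("B", "B"): {"A"}, ("A", "A"): {"B"}}


def fill_cell(table, start, length):
    """Variables producing the substring of the given length at 0-based start.

    table[k][i] is the set for the substring of length k + 1 starting at i.
    """
    produced = set()
    for left_len in range(1, length):          # every way to cut in two
        left = table[left_len - 1][start]
        right = table[length - left_len - 1][start + left_len]
        for x in left:
            for y in right:
                produced |= PAIR_RULES.get((x, y), set())
    return produced


rows = [
    [{"B"}, {"A"}, {"B"}, {"B"}, {"A"}],       # length 1 row for 10110
    [set(), {"S"}, {"A"}, set()],              # length 2 row for 10110
]
print(fill_cell(rows, 1, 3))                   # substring 011
```

An empty cell on either side of a cut contributes nothing, which is the "we don't have a pair" case in the slides.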
Substrings of length 4
• To produce the substring 1011 (in 10110) we can
take 1 with 011, 10 with 11, or 101 with 1. For 1 with
011 we get BB, which can be produced by A. For 10 with
11 we don't have a pair, since 10 cannot be produced.
For 101 with 1 we don't have a pair, since 101 cannot
be produced. So 1011 gets A.
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -   B   B
length 4:   A
Substrings of length 4
• To produce the substring 0110 (in 10110) we can
take 0 with 110, 01 with 10, or 011 with 0. For 0 with
110 we get AB, which can be produced by S. For 01 with
10 we don't have a pair, since 10 cannot be produced.
For 011 with 0 we get BA, which cannot be produced by
any variable. So 0110 gets S.
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -   B   B
length 4:   A   S
Combine previous solutions
• In order now to produce the whole string 10110 we
can take 1 with 0110, 10 with 110, 101 with 10,
or 1011 with 0. For 1 with 0110 we get BS, which
cannot be produced. For 10 with 110 and for 101 with
10 we don't have a pair, since 10 cannot be produced
(and neither can 101). For 1011 with 0 we get AA,
which is produced by B.
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -   B   B
length 4:   A   S
length 5:   B
Answer
• If the last line contains the start variable S,
then there is a way to produce the string; otherwise
the string cannot be generated. For our example
the last line contains only B, so 10110 cannot be
generated.
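Putting the steps together, the whole test can be sketched as below. This is one possible illustration, not a definitive implementation: the names TERMINAL_RULES, PAIR_RULES, cyk_table and generates are hypothetical, the grammar is hard-coded to the example, and positions are 0-based.

```python
# Example grammar in Chomsky Normal Form:
#   S -> AB,  A -> BB | 0,  B -> AA | 1
TERMINAL_RULES = {"0": {"A"}, "1": {"B"}}
PAIR_RULES = {("A", "B"): {"S"}, ("B", "B"): {"A"}, ("A", "A"): {"B"}}


def cyk_table(s):
    """table[k][i] = variables producing the substring of length k + 1 at i."""
    n = len(s)
    table = [[set() for _ in range(n - k)] for k in range(n)]
    for i, ch in enumerate(s):                       # substrings of length 1
        table[0][i] = set(TERMINAL_RULES.get(ch, set()))
    for length in range(2, n + 1):                   # longer substrings
        for start in range(n - length + 1):
            for left_len in range(1, length):        # every cut point
                for x in table[left_len - 1][start]:
                    for y in table[length - left_len - 1][start + left_len]:
                        table[length - 1][start] |= PAIR_RULES.get((x, y), set())
    return table


def generates(s, start_variable="S"):
    """True if the last line of the table contains the start variable."""
    return bool(s) and start_variable in cyk_table(s)[-1][0]


print(generates("10110"))
```

The three nested loops over length, start and cut point give a running time cubic in the length of the string, times a factor that depends on the size of the grammar.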
Mechanical way
• Now that we have shown why this method works, let's
give an easy way to compute the table.
• Suppose that we are about to fill in the circled
position. We take the pairs that the arrows designate:
for a substring of length k starting at position i,
the entry for its first j symbols (length j, position i)
is paired with the entry for its remaining k - j symbols
(length k - j, position i + j), for j = 1, ..., k - 1.
• So finally:
            1   0   1   1   0
length 1:   B   A   B   B   A
length 2:   -   S   A   -
length 3:   -   B   B
length 4:   A   S
length 5:   B
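The pairing rule is pure index arithmetic, so it can be written down without the grammar at all. The sketch below lists which two table positions are paired for a given cell. The function name pairs_for_cell and the (length, position) coordinates with 1-based positions are assumptions chosen to match the table rows above.

```python
def pairs_for_cell(position, length):
    """Pairs of cells to combine, each cell given as (length, position).

    Positions are 1-based, so (4, 2) is the substring of length 4 that
    starts at the second symbol.
    """
    return [((j, position), (length - j, position + j))
            for j in range(1, length)]


# Cell for 0110 in 10110: length 4, starting at position 2.
for left, right in pairs_for_cell(2, 4):
    print(left, "with", right)
# (1, 2) with (3, 3)    i.e. 0 with 110
# (2, 2) with (2, 4)    i.e. 01 with 10
# (3, 2) with (1, 5)    i.e. 011 with 0
```

Reading the left cells gives the entries of one column going down the table, and the right cells the entries along a diagonal.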
A string that is produced
• Run the CYK algorithm for the string 10111
            1   0   1   1   1
length 1:   B   A   B   B   B
length 2:   -   S   A   A
length 3:   -   B   S
length 4:   A   A
length 5:   S
A string that is produced
• The last line contains S, so 10111 is generated
by the grammar.
• The derivation is read off the table backwards:
S comes from 1011 (A) with 1 (B), that A from 1 (B)
with 011 (B), that B from 0 (A) with 11 (A), and
that A from 1 (B) with 1 (B). Finally every variable
is replaced by its terminal.
S → AB → BBB → BAAB → BABBB → 10111
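Reading a derivation off the table can also be sketched in code, provided each entry remembers one pair it came from. The version below is an illustration under assumptions: the names back and derivation are hypothetical, positions are 0-based, only the first pair found for each variable is kept, and the expansion order (always the leftmost variable that still covers more than one symbol) is chosen to mirror the slides.

```python
TERMINAL_RULES = {"0": {"A"}, "1": {"B"}}
PAIR_RULES = {("A", "B"): {"S"}, ("B", "B"): {"A"}, ("A", "A"): {"B"}}


def derivation(s, start_variable="S"):
    """Return the sentential forms of one derivation of s, or None."""
    n = len(s)
    # back[(start, length)][X] = (left_len, Y, Z): X -> YZ with Y covering
    # the first left_len symbols. Terminal entries store None.
    back = {}
    for i, ch in enumerate(s):
        back[(i, 1)] = {x: None for x in TERMINAL_RULES.get(ch, set())}
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            cell = back.setdefault((start, length), {})
            for left_len in range(1, length):
                for y in back[(start, left_len)]:
                    for z in back[(start + left_len, length - left_len)]:
                        for x in PAIR_RULES.get((y, z), set()):
                            cell.setdefault(x, (left_len, y, z))
    if n == 0 or start_variable not in back[(0, n)]:
        return None
    form = [(start_variable, 0, n)]            # (variable, start, length)
    steps = ["".join(v for v, _, _ in form)]
    while any(length > 1 for _, _, length in form):
        # Expand the leftmost variable that covers more than one symbol.
        k = next(i for i, (_, _, length) in enumerate(form) if length > 1)
        x, start, length = form[k]
        left_len, y, z = back[(start, length)][x]
        form[k:k + 1] = [(y, start, left_len),
                         (z, start + left_len, length - left_len)]
        steps.append("".join(v for v, _, _ in form))
    steps.append(s)                            # apply the terminal rules last
    return steps


print(" => ".join(derivation("10111")))
# S => AB => BBB => BAAB => BABBB => 10111
```

For a string that is not generated, such as 10110, the function returns None because S is missing from the last line.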