Chomsky Normal Form CYK Algorithm Normal Forms There are some special forms in which I can bring the grammar to work with it more easily. • Chomsky Normal Form • Greibach Normal Form Chomsky Normal Form A Context Free Grammar is in Chomsky Normal Form if the rules are of the form: • A ⟶ BC • A⟶a • S⟶ε with A, B, C being variables (B,C not being the start variable), a being a terminal and S only being the start variable. Chomsky Normal Form There are 5 steps to follow in order to transform a grammar into CNF: 1. Add the a new start variable S0 and the production rule S0 ⟶ S. 2. Eliminate the ε-rules. 3. Eliminate the unary productions A ⟶ B. 4. Add rules of the form Vt ⟶ t for every terminal t and replace t with the variable Vt. 5. Transform the remaining of the rules to the form A ⟶ BC (A, B, C variables). 1. Add a new start variable • We have to make sure that the start variable doesn’t occur to the right side of some rule. • Thus, we add a new start variable S0 and the rule S0 ⟶ S, where S is the old start variable. 2. Eliminate ε-rules • We have to eliminate all productions of the form A ⟶ ε, for A being any non-start variable. • To do so we should remove the rule A ⟶ ε and replace every appearance of A with ε in all other rules. 3. Eliminate unary productions • A unary production is a production of the form A ⟶ B (with both A, B being variables). • There should only be productions of the form V1 ⟶ V2V3 involving variables, thus we have to eliminate unary productions. • To do so, we replace B in A ⟶ B with the right parts of the rules involving B in the left part. 4. Add Vt ⟶ t and replace t with Vt • There should only be rules of the form A ⟶ t involving terminals, thus terminals should disappear from every other rule involving more than just one single literal. • To do so, we add a new variable Vt for every terminal t and we replace every appearance of t with Vt , except those in rules of the form A ⟶ t. 5. Transform rules to A ⟶ BC • All the rules involving only variables should be of the form A ⟶ BC. Thus we should take care of all the rules involving more than 2 variables in the right part • For the rule V ⟶ A1A2A3…An,we start reducing the size of the right part by replacing every two variables with one new variable (resulting in the creation of n-2 new variables). 5. Transform rules to A ⟶ BC V ⟶ A1A2A3A4A5A6…An 5. Transform rules to A ⟶ BC V ⟶ B1A3A4A5A6…An B1 ⟶ A1A2 5. Transform rules to A ⟶ BC V ⟶ B2A4A5A6…An B2 ⟶ B1A3 B1 ⟶ A1A2 5. Transform rules to A ⟶ BC V ⟶ B3A5A6…An B3 ⟶ B2A4 B2 ⟶ B1A3 B1 ⟶ A1A2 5. Transform rules to A ⟶ BC V ⟶ B4A6…An B4 ⟶ B3A5 B3 ⟶ B2A4 B2 ⟶ B1A3 B1 ⟶ A1A2 5. Transform rules to A ⟶ BC V ⟶ Bn-2An Bn-2 ⟶ Bn-3An-1 … B4 ⟶ B3A5 B3 ⟶ B2A4 B2 ⟶ B1A3 B1 ⟶ A1A2 Example S ⟶ CSC | B C ⟶ 00 | ε B ⟶ 01B | 1 Example 1. Add new start variablle S0 ⟶ S S ⟶ CSC | B C ⟶ 00 | ε B ⟶ 01B | 1 Example 2. Eliminate ε-moves S0 ⟶ S S ⟶ CSC | B C ⟶ 00 | ε B ⟶ 01B | 1 Example 2. Eliminate ε-moves S0 ⟶ S S ⟶ CSC | B | CS | SC | S C ⟶ 00 B ⟶ 01B | 1 Example 3. Eliminate Unary Productions S0 ⟶ S S ⟶ CSC | B | CS | SC | S C ⟶ 00 B ⟶ 01B | 1 Example 3. Eliminate Unary Productions S0 ⟶ S S ⟶ CSC | B | CS | SC C ⟶ 00 B ⟶ 01B | 1 Example 3. Eliminate Unary Productions S0 ⟶ S S ⟶ CSC | B | CS | SC C ⟶ 00 B ⟶ 01B | 1 Example 3. Eliminate Unary Productions S0 ⟶ S S ⟶ CSC | 01B | 1 | CS | SC C ⟶ 00 B ⟶ 01B | 1 Example 3. Eliminate Unary Productions S0 ⟶ S S ⟶ CSC | 01B | 1 | CS | SC C ⟶ 00 B ⟶ 01B | 1 Example 3. Eliminate Unary Productions S0 ⟶ CSC | 01B | 1 | CS | SC S ⟶ CSC | 01B | 1 | CS | SC C ⟶ 00 B ⟶ 01B | 1 Example 4. Create Vt for every terminal t S0 ⟶ CSC | 01B | 1 | CS | SC S ⟶ CSC | 01B | 1 | CS | SC C ⟶ 00 B ⟶ 01B | 1 Z⟶0 Example 4. Create Vt for every terminal t S0 ⟶ CSC | Z1B | 1 | CS | SC S ⟶ CSC | Z1B | 1 | CS | SC C ⟶ ZZ B ⟶ Z1B | 1 Z⟶0 Example 4. Create Vt for every terminal t S0 ⟶ CSC | Z1B | 1 | CS | SC S ⟶ CSC | Z1B | 1 | CS | SC C ⟶ ZZ B ⟶ Z1B | 1 Z⟶0 A⟶1 Example 4. Create Vt for every terminal t S0 ⟶ CSC | ZAB | 1 | CS | SC S ⟶ CSC | ZAB | 1 | CS | SC C ⟶ ZZ B ⟶ ZAB | 1 Z⟶0 A⟶1 Example 5. Take care of long rules S0 ⟶ CSC | ZAB | 1 | CS | SC S ⟶ CSC | ZAB | 1 | CS | SC C ⟶ ZZ B ⟶ ZAB | 1 Z⟶0 A⟶1 D ⟶ CS Example 5. Take care of long rules S0 ⟶ DC | ZAB | 1 | CS | SC S ⟶ DC | ZAB | 1 | CS | SC C ⟶ ZZ B ⟶ ZAB | 1 Z⟶0 A⟶1 D ⟶ CS Example 5. Take care of long rules S0 ⟶ DC | ZAB | 1 | CS | SC S ⟶ DC | ZAB | 1 | CS | SC C ⟶ ZZ B ⟶ ZAB | 1 Z⟶0 A⟶1 D ⟶ CS E ⟶ ZA Example 5. Take care of long rules S0 ⟶ DC | EB | 1 | CS | SC S ⟶ DC | EB | 1 | CS | SC C ⟶ ZZ B ⟶ EB | 1 Z⟶0 A⟶1 D ⟶ CS E ⟶ ZA CYK Introduction • Problem: Given a context free grammar and a string s is it possible to decide whether s can be generated by the grammar or not? • If the grammar is not in a very special form this is not so efficient. • If the grammar is in Chomsky Normal Form, we have an elegant algorithm for testing this, the CYK algorithm. The CYK algorithm • Suppose that we are given a grammar in Chomsky Normal form S → AB A → BB | 0 B → AA |1 • We would like to see if 10110 is generated by this grammar or not. Substrings of length 1 • Since the only way to produce terminals is by following the rules A → a, just replace every terminal with the variables that produce it. 1 0 1 1 0 B A B B A Substrings of length 2 Suppose now that we want to see how every substring of length 2 can be generated. This is equivalent with finding ways to produce all the length 2 substrings where terminals are replaced with the variables that represent them. But since every rule is of the form A → BC, it suffices to replace every two consecutive variables with the variables that produce them. 1 B - 0 A S 1 B A 1 B - 0 A Substrings of length 3 • To produce the substring 101 (in 10110) we can either take 1 with 01 or 10 with 1. Here BS cannot be produced by any variable. 1 B - 0 A S 1 B A 1 B - 0 A Substrings of length 3 • To produce the substring 101 (in 10110) we can either take 1 with 01 or 10 with 1. Here we don’t have a pair since 10 cannot be produced. 1 B - 0 A S 1 B A 1 B - 0 A Substrings of length 3 • To produce the substring 011 (in 10110) we can either take 0 with 11 or 01 with 1. Here AA can be produced by B. 1 B - 0 A S B 1 B A 1 B - 0 A Substrings of length 3 • To produce the substring 011 (in 10110) we can either take 0 with 11 or 01 with 1. Here SB cannot be produced by any variable 1 B - 0 A S B 1 B A 1 B - 0 A Substrings of length 3 • To produce the substring 110 (in 10110) we can either take 1 with 10 or 11 with 0. Here we don’t have a pair since 10 cannot be produced by a variable. 1 B - 0 A S B 1 B A - 1 B - 0 A Substrings of length 3 • To produce the substring 110 (in 10110) we can either take 1 with 10 or 11 with 0. Here AA can be produced by B 1 B - 0 A S B 1 B A B 1 B - 0 A Substrings of length 4 • To produce the substring 1011 (in 10110) we can take 1 with 011 or 10 with 11, or 101 with 1. Here BB can be produced by A. 1 B A 0 A S B 1 B A B 1 B - 0 A Substrings of length 4 • To produce the substring 1011 (in 10110) we can take 1 with 011 or 10 with 11, or 101 with 1. Here we don’t have a pair since 10 cannot be produced. 1 B A 0 A S B 1 B A B 1 B - 0 A Substrings of length 4 • To produce the substring 1011 (in 10110) we can take 1 with 011 or 10 with 11, or 101 with 1. Here we don’t have a pair since 101 cannot be produced. 1 B A 0 A S B 1 B A B 1 B - 0 A Substrings of length 4 • To produce the substring 0110 (in 10110) we can take 0 with 110 or 01 with 10, or 011 with 0. Here AB can be produced by S. 1 B A 0 A S B S 1 B A B 1 B - 0 A Substrings of length 4 • To produce the substring 0110 (in 10110) we can take 0 with 110 or 01 with 10, or 011 with 0. Here we don’t have a pair since 10 cannot be produced. 1 B A 0 A S B S 1 B A B 1 B - 0 A Substrings of length 4 • To produce the substring 0110 (in 10110) we can take 0 with 110 or 01 with 10, or 011 with 0. Here BA cannot be produced by any variable. 1 B A 0 A S B S 1 B A B 1 B - 0 A Combine previous solutions • In order now to produce the whole string 10110 we can take 1 with 0110 or 10 with 110 or 101 with 10, or 1011 with 0. Here, BS cannot be produced. 1 B A - 0 A S B S 1 B A B 1 B - 0 A Combine previous solutions • In order now to produce the whole string 10110 we can take 1 with 0110 or 10 with 110 or 101 with 10, or 1011 with 0. Here we don’t have a pair. 1 B A - 0 A S B S 1 B A B 1 B - 0 A Combine previous solutions • In order now to produce the whole string 10110 we can take 1 with 0110 or 10 with 110 or 101 with 10, or 1011 with 0. Here we don’t have a pair. 1 B A - 0 A S B S 1 B A B 1 B - 0 A Combine previous solutions • In order now to produce the whole string 10110 we can take 1 with 0110 or 10 with 110 or 101 with 10, or 1011 with 0. Here, AA is produced by B. 1 B A B 0 A S B S 1 B A B 1 B - 0 A Answer • If the last line contains the start variable S, we can find a derivation for the string following the way the S was produced backwards. • In our example, 10110 cannot be generated since S was not found in the last line. Mechanical way • Now that we showed why this method works lets give an easy way to compute the table Mechanical way • Suppose that we are about to fill in the position with the cycle. We take the pairs that the arrows designate 1 B A 0 A S B S 1 B A B 1 B - 0 A Mechanical way • Suppose that we are about to fill in the position with the cycle. We take the pairs that the arrows designate 1 B A 0 A S B S 1 B A B 1 B - 0 A Mechanical way • Suppose that we are about to fill in the position with the cycle. We take the pairs that the arrows designate 1 B A 0 A S B - 1 B A B 1 B - 0 A Mechanical way • Suppose that we are about to fill in the position with the cycle. We take the pairs that the arrows designate 1 B A 0 A S B - 1 B A B 1 B - 0 A Mechanical way • So finally: 1 B A 0 A S B S 1 B A B 1 B - 0 A A string that is produced • Run the CYK algorithm for the string 10111 1 0 1 1 1 A string that is produced • Run the CYK algorithm for the string 10111 1 B 0 A 1 B 1 B 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A 1 B 1 B 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S 1 B 1 B 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S 1 B A 1 B 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S 1 B A 1 B A 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S 1 B A 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S 1 B A 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S B 1 B A 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S B 1 B A 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S B 1 B A - 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B - 0 A S B 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A 0 A S B 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A 0 A S B 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A 0 A S B 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A 0 A S B - 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A 0 A S B - 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A - 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A - 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A - 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S → AB 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S → AB 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S → AB 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S → AB → BBB 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S → AB → BBB 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 B A S The derivation is: S → AB → BBB 0 A S B A 1 B A S 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 0 1 B A B S A B S A A S The derivation is: S → AB → BBB → BAAB 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 0 1 B A B S A B S A A S The derivation is: S → AB → BBB → BAAB 1 B - 1 B A string that is produced • Run the CYK algorithm for the string 10111 1 0 1 1 1 B A B B B S A B S A A S The derivation is: S → AB → BBB → BAAB → BABBB A string that is produced • Run the CYK algorithm for the string 10111 1 0 1 1 1 B A B B B S A B S A A S The derivation is: S → AB → BBB → BAAB → BABBB A string that is produced • Run the CYK algorithm for the string 10111 1 0 1 1 1 B A B B B S A B S A A S The derivation is: S → AB → BBB → BAAB → BABBB → 10111
© Copyright 2026 Paperzz