Simplifications of Context-Free Grammars
A Substitution Rule
Grammar:
S → aB
A → aaA
A → abBc
B → aA
B → b
Substitute B → b
Equivalent grammar:
S → aB | ab
A → aaA
A → abBc | abbc
B → aA
A Substitution Rule
Grammar:
S → aB | ab
A → aaA
A → abBc | abbc
B → aA
Substitute B → aA
Equivalent grammar:
S → aB | ab | aaA
A → aaA
A → abBc | abbc | abaAc
In general:
A → xBz
B → y1
Substitute B → y1
Equivalent grammar:
A → xBz | xy1z
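The substitution rule can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name substitute and the dictionary representation (each variable mapped to a set of right-hand-side strings, one character per symbol, uppercase for variables) are assumptions made here. The sketch also assumes that the substituted right-hand side does not contain the variable itself.

```python
from itertools import product

def substitute(grammar, var, rhs):
    """Substitute the production var -> rhs into all productions.

    Every occurrence of `var` in a right-hand side is either kept or
    replaced by `rhs`, so A -> xBz becomes A -> xBz | x(rhs)z.
    The production var -> rhs itself is then dropped.
    """
    if var in rhs:
        raise ValueError("sketch assumes rhs does not contain the variable")
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            if head == var and body == rhs:
                continue  # the substituted production is removed
            # For each symbol: keep it, or (if it is var) also try rhs.
            choices = [(sym, rhs) if sym == var else (sym,) for sym in body]
            for pick in product(*choices):
                new_bodies.add("".join(pick))
        result[head] = new_bodies
    return result

# The example grammar from the slides.
g = {"S": {"aB"}, "A": {"aaA", "abBc"}, "B": {"aA", "b"}}
g1 = substitute(g, "B", "b")    # substitute B -> b
g2 = substitute(g1, "B", "aA")  # substitute B -> aA
```

With the example grammar, g1 and g2 correspond to the two equivalent grammars shown on the slides. After the second substitution B has no productions left in this sketch.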
Nullable Variables
λ-production: A → λ
Nullable variable: A ⇒ ... ⇒ λ
Removing Nullable Variables
Example Grammar:
S → aMb
M → aMb
M → λ
Nullable variable: M
Substitute M → λ
Final Grammar:
S → aMb
S → ab
M → aMb
M → ab
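The removal of nullable variables can be sketched as follows. This is an illustrative Python sketch under assumptions made here, not the slides' own code: a grammar is a dictionary from a variable to a set of right-hand-side strings, each symbol is one character, and the empty string stands for λ. The names nullable_variables and remove_nullable are hypothetical.

```python
from itertools import product

def nullable_variables(grammar):
    """Find all variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body made only of nullable symbols (or the empty body)
            # makes its head nullable.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_nullable(grammar):
    """Add every variant with nullable variables omitted, drop λ-productions."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol is either kept or omitted.
            choices = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for pick in product(*choices):
                new_bodies.add("".join(pick))
        new_bodies.discard("")  # remove the λ-production itself
        result[head] = new_bodies
    return result

# The example grammar from the slides: S -> aMb, M -> aMb | λ
g = {"S": {"aMb"}, "M": {"aMb", ""}}
final = remove_nullable(g)
```

For the example grammar this yields S → aMb | ab and M → aMb | ab, matching the final grammar above. If the start variable is nullable, the sketch drops λ from the language, which is why the later slides restrict attention to grammars that do not produce λ.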
Unit-Productions
Unit Production: A → B
(a single variable on both sides)
Removing Unit Productions
Observation: A → A is removed immediately
Example Grammar:
S → aA
A → a
A → B
B → A
B → bb
Substitute A → B:
S → aA | aB
A → a
B → A | B
B → bb
Remove B → B:
S → aA | aB
A → a
B → A
B → bb
Substitute B → A:
S → aA | aB | aA
A → a
B → bb
Remove repeated productions
Final grammar:
S → aA | aB
A → a
B → bb
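A possible Python sketch of unit-production removal follows. Note that it does not use the step-by-step substitution shown on the slides. Instead it uses the common unit-closure technique: for each variable, collect every variable reachable through unit productions alone, then copy their non-unit productions. The representation (dictionary from variable to a set of right-hand-side strings, one character per symbol) and the name remove_unit_productions are assumptions made here.

```python
def remove_unit_productions(grammar):
    """Remove productions of the form A -> B using unit closures."""
    variables = set(grammar)
    result = {}
    for head in grammar:
        # Variables reachable from head through unit productions only.
        closure, stack = {head}, [head]
        while stack:
            for body in grammar[stack.pop()]:
                if body in variables and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # head inherits every non-unit production of its closure.
        result[head] = {
            body
            for var in closure
            for body in grammar[var]
            if body not in variables
        }
    return result

# The example grammar from the slides.
g = {"S": {"aA"}, "A": {"a", "B"}, "B": {"A", "bb"}}
final = remove_unit_productions(g)
```

On the example grammar this sketch produces S → aA, A → a | bb, B → a | bb. That differs in form from the final grammar on the slides (S → aA | aB, A → a, B → bb), but S derives the same two strings, aa and abb, in both.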
Useless Productions
S → aSb
S → λ
S → A
A → aA   (useless production)
Some derivations never terminate:
S ⇒ A ⇒ aA ⇒ aaA ⇒ aaaA ⇒ ...
Another grammar:
S → A
A → aA
A → λ
B → bA   (useless production: not reachable from S)
In general:
if S ⇒* xAy ⇒* w with w ∈ L(G) (w contains only terminals),
then variable A is useful;
otherwise, variable A is useless.
A production A → x is useless if any of its variables is useless.
S → aSb
S → λ
S → A    (useless production)
A → aA   (useless production; variable A is useless)
B → C    (useless production; variable B is useless)
C → D    (useless production; variable C is useless)
Removing Useless Productions
Example Grammar:
S → aS | A | C
A → a
B → aa
C → aCb
First: find all variables that can produce strings with only terminals
Round 1: {A, B}   (from A → a and B → aa)
Round 2: {A, B, S}   (from S → A)
Keep only the variables that produce terminal symbols: {A, B, S}
(the remaining variables are useless)
Remove useless productions:
S → aS | A
A → a
B → aa
Second: find all variables reachable from S
Use a dependency graph: S → A; B is not reachable
Keep only the variables reachable from S
(the remaining variables are useless)
Remove useless productions
Final Grammar:
S → aS | A
A → a
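The two passes above can be sketched in Python. This is a minimal sketch under assumptions made here: a grammar is a dictionary from a variable to a set of right-hand-side strings, each symbol is one character, and uppercase letters are variables. The name remove_useless is hypothetical.

```python
def remove_useless(grammar, start="S"):
    """Remove useless variables and productions in two passes."""

    def usable(body, good):
        # A body is usable if every variable in it is already known good.
        return all(sym in good or not sym.isupper() for sym in body)

    # First: variables that can produce strings with only terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in generating and any(usable(b, generating) for b in bodies):
                generating.add(head)
                changed = True
    kept = {
        head: {b for b in grammar[head] if usable(b, generating)}
        for head in grammar
        if head in generating
    }

    # Second: variables reachable from the start variable
    # (a walk over the dependency graph).
    reachable = set()
    stack = [start] if start in kept else []
    while stack:
        head = stack.pop()
        if head in reachable:
            continue
        reachable.add(head)
        stack.extend(sym for body in kept[head] for sym in body if sym.isupper())

    return {head: kept[head] for head in kept if head in reachable}

# The example grammar from the slides.
g = {"S": {"aS", "A", "C"}, "A": {"a"}, "B": {"aa"}, "C": {"aCb"}}
final = remove_useless(g)
```

The order of the passes matters in this sketch: removing non-generating variables first can make further variables unreachable, so the reachability pass runs second, as on the slides.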
Removing All
Step 1: Remove Nullable Variables
Step 2: Remove Unit-Productions
Step 3: Remove Useless Variables
Normal Forms for Context-Free Grammars
Chomsky Normal Form
Each production has the form:
A → BC   (B and C are variables)
or
A → a   (a is a terminal)
Examples:
Chomsky Normal Form:
S → AS
S → a
A → SA
A → b
Not Chomsky Normal Form:
S → AS
S → AAS
A → SA
A → aa
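The definition can be turned into a small check. The sketch below is illustrative only and assumes one character per symbol, uppercase letters for variables and lowercase letters for terminals; the name is_cnf is hypothetical.

```python
def is_cnf(grammar):
    """Return True if every production is A -> BC or A -> a."""
    for bodies in grammar.values():
        for body in bodies:
            two_variables = len(body) == 2 and body.isupper()
            one_terminal = len(body) == 1 and body.islower()
            if not (two_variables or one_terminal):
                return False
    return True

# The two example grammars from the slides.
in_cnf = {"S": {"AS", "a"}, "A": {"SA", "b"}}
not_in_cnf = {"S": {"AS", "AAS"}, "A": {"SA", "aa"}}
```

Here is_cnf(in_cnf) holds, while is_cnf(not_in_cnf) fails because of S → AAS and A → aa.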
Conversion to Chomsky Normal Form
Example (not in Chomsky Normal Form):
S → ABa
A → aab
B → Ac
Introduce variables for terminals: Ta, Tb, Tc
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
Introduce intermediate variable V1:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c
Introduce intermediate variable V2, which gives the final grammar in Chomsky Normal Form:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c
In general:
From any context-free grammar (which doesn't produce λ) that is not in Chomsky Normal Form, we can obtain an equivalent grammar in Chomsky Normal Form.
The Procedure
First remove:
Nullable variables
Unit productions
Then, for every terminal symbol a:
Add a new variable Ta with the production Ta → a
In productions whose right-hand side has two or more symbols: replace a with Ta
Replace any production
A → C1C2...Cn
with
A → C1V1
V1 → C2V2
...
Vn-2 → Cn-1Cn
New intermediate variables: V1, V2, ..., Vn-2
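The procedure can be sketched in Python. This sketch assumes that nullable variables and unit productions have already been removed, that right-hand sides are tuples of symbol names with lowercase names for terminals, and that the generated names (T followed by the terminal, V followed by a counter) do not clash with existing variables. The function name to_cnf and the representation are assumptions made here.

```python
def to_cnf(grammar):
    """Convert a grammar without λ- and unit-productions to CNF.

    grammar: dict mapping a variable to a list of bodies,
             each body a tuple of symbol names.
    """
    # Step 1: in bodies with two or more symbols, replace each
    # terminal a by a new variable Ta and add Ta -> a.
    term_vars = {}
    stage = {}
    for head, bodies in grammar.items():
        stage[head] = []
        for body in bodies:
            if len(body) >= 2:
                body = tuple(
                    term_vars.setdefault(s, "T" + s) if s.islower() else s
                    for s in body
                )
            stage[head].append(body)
    for terminal, var in term_vars.items():
        stage[var] = [(terminal,)]

    # Step 2: break A -> C1 C2 ... Cn (n > 2) into a chain
    # A -> C1 V1, V1 -> C2 V2, ..., using fresh variables.
    result = {}
    counter = 0
    for head, bodies in stage.items():
        result.setdefault(head, [])
        for body in bodies:
            current = head
            while len(body) > 2:
                counter += 1
                new_var = f"V{counter}"
                result.setdefault(current, []).append((body[0], new_var))
                current, body = new_var, body[1:]
            result.setdefault(current, []).append(body)
    return result

# The example grammar from the slides: S -> ABa, A -> aab, B -> Ac
g = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
cnf = to_cnf(g)
```

On the example grammar the sketch introduces Ta, Tb, Tc, V1 and V2 in the same roles as on the slides.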
Theorem:
For any context-free grammar (which doesn't produce λ) there is an equivalent grammar in Chomsky Normal Form
Observations
• Chomsky normal forms are good
for parsing and proving theorems
• It is very easy to find the Chomsky normal
form for any context-free grammar
Greibach Normal Form
All productions have the form:
A → a V1V2...Vk
where a is a terminal symbol, V1, ..., Vk are variables, and k ≥ 0
Examples:
Greibach Normal Form:
S → cAB
A → aA | bB | b
B → b
Not Greibach Normal Form:
S → abSb
S → aa
Conversion to Greibach Normal Form:
S → abSb
S → aa
becomes
S → aTbSTb
S → aTa
Ta → a
Tb → b
(Greibach Normal Form)
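The example above is the easy case: every right-hand side already starts with a terminal, so it is enough to replace the remaining terminals by new variables. A minimal Python sketch of just this case follows. It is not a general conversion to Greibach Normal Form. The representation (bodies as tuples of symbol names, lowercase names for terminals) and the names tail_terminals_to_variables and Ta, Tb are assumptions made here.

```python
def tail_terminals_to_variables(grammar):
    """Handle only bodies that already start with a terminal.

    Terminals after the first position are replaced by new
    variables Ta, Tb, ... with productions Ta -> a, Tb -> b, ...
    """
    result = {}
    term_vars = {}
    for head, bodies in grammar.items():
        result[head] = []
        for body in bodies:
            if not body or not body[0].islower():
                raise ValueError("sketch only handles bodies starting with a terminal")
            tail = tuple(
                term_vars.setdefault(s, "T" + s) if s.islower() else s
                for s in body[1:]
            )
            result[head].append((body[0],) + tail)
    for terminal, var in term_vars.items():
        result[var] = [(terminal,)]
    return result

# The example from the slides: S -> abSb | aa
g = {"S": [("a", "b", "S", "b"), ("a", "a")]}
gnf = tail_terminals_to_variables(g)
```

Grammars whose right-hand sides start with a variable need additional steps that this sketch deliberately leaves out.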
Theorem:
For any context-free grammar (which doesn't produce λ) there is an equivalent grammar in Greibach Normal Form
Observations
• Greibach normal forms are very good for parsing
• It is hard to find the Greibach normal form of any context-free grammar
The CYK Parser
The CYK Membership Algorithm
Input:
• Grammar G in Chomsky Normal Form
• String w
Output:
find if w ∈ L(G)
The Algorithm
Input example:
• Grammar G:
S → AB
A → BB
A → a
B → AB
B → b
• String w: aabbb
All substrings of aabbb, grouped by length:
Length 1: a, a, b, b, b
Length 2: aa, ab, bb, bb
Length 3: aab, abb, bbb
Length 4: aabb, abbb
Length 5: aabbb
Substrings of length 1 (using A → a and B → b):
a: A
a: A
b: B
b: B
b: B
Substrings of length 2 (using S → AB, B → AB and A → BB):
aa: no variable
ab: S, B
bb: A
bb: A
Substrings of length 3:
aab: S, B
abb: A
bbb: S, B
Substrings of length 4:
aabb: A
abbb: S, B
Substring of length 5:
aabbb: S, B
Therefore: aabbb ∈ L(G)
Time Complexity: |w|^3
Observation: The CYK algorithm can be easily converted to a parser (bottom-up parser)
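The table-filling steps of the CYK membership algorithm can be sketched in Python. This is a minimal sketch under assumptions made here: the grammar is in Chomsky Normal Form and is given as a dictionary from a variable to a set of right-hand-side strings, one character per symbol. The names cyk_table and accepts are hypothetical.

```python
def cyk_table(grammar, w):
    """Fill the CYK table for a CNF grammar.

    table[i, length] is the set of variables that derive
    the substring w[i:i + length].
    """
    n = len(w)
    table = {}
    # Substrings of length 1: use productions A -> a.
    for i, ch in enumerate(w):
        table[i, 1] = {head for head, bodies in grammar.items() if ch in bodies}
    # Longer substrings: try every split into two shorter parts
    # and use productions A -> BC.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            cell = set()
            for split in range(1, length):
                left = table[i, split]
                right = table[i + split, length - split]
                for head, bodies in grammar.items():
                    if any(
                        len(b) == 2 and b[0] in left and b[1] in right
                        for b in bodies
                    ):
                        cell.add(head)
            table[i, length] = cell
    return table

def accepts(grammar, w, start="S"):
    """Return True if w is in L(G) according to the CYK table."""
    return len(w) > 0 and start in cyk_table(grammar, w)[0, len(w)]

# The example from the slides.
g = {"S": {"AB"}, "A": {"BB", "a"}, "B": {"AB", "b"}}
table = cyk_table(g, "aabbb")
```

The three nested loops over substring length, start position and split point are where the cubic dependence on |w| comes from; the work per split also depends on the size of the grammar.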