Formal Theory of Context-Free Grammars
Initial Bachelor Seminar Talk
Jana Hofmann
Advisor: Prof. Dr. Gert Smolka
Saarland University, Computer Science
29th May 2015
Topics
Formalization of Context-Free Grammars in Coq
Verified Algorithm for Normalization
(Decidability of Context-Free Languages)
Sources
Dexter C. Kozen
Automata and Computability
Springer, 1997
John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman
Introduction to automata theory, languages, and computation
Addison-Wesley, 2nd edition, 2001
Denis Firsov and Tarmo Uustalu
Certified Normalization of Context-Free Grammars
Institute of Cybernetics at TUT, 2015
Jan-Oliver Kaiser
Constructive Formalization of Regular Languages
Bachelor thesis, Saarland University, 2012
Content
Motivation and Examples
Context-Free Grammars
Chomsky Normal Form
Formalization
Definitions
Derivation
Transformation into CNF
ε-Elimination
1) Adding Nullable Rules
Correctness
2) Deleting ε-Rules
Outlook
Context-Free Grammars
Context-free grammars are used to
- describe non-regular languages
- define programming languages (Backus-Naur Form)
Example
A −→ ε
A −→ ( A )
A −→ AA
describes {ε, (), (()), ()(), (())(), ...}
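To make the example concrete, the following minimal sketch enumerates the words the grammar generates, bounded by the height of the derivation tree. The encoding is an assumption made for illustration: a grammar is a dictionary from variables to right-hand sides, each symbol is a single character, and the names `GRAMMAR` and `words` are hypothetical.

```python
from itertools import product

# Hypothetical encoding: keys are variables, every other character is a terminal.
# A -> eps | ( A ) | A A
GRAMMAR = {"A": ["", "(A)", "AA"]}


def words(grammar, var, depth):
    """Terminal words derivable from `var` by a derivation tree of height <= depth."""
    if depth == 0:
        return set()
    result = set()
    for rhs in grammar[var]:
        # Each symbol contributes a set of words; a terminal contributes itself.
        parts = [words(grammar, s, depth - 1) if s in grammar else {s} for s in rhs]
        # Concatenate one choice per symbol (the empty right-hand side yields "").
        result |= {"".join(choice) for choice in product(*parts)}
    return result


print(sorted(words(GRAMMAR, "A", 3), key=lambda w: (len(w), w)))
# prints ['', '()', '(())', '()()']
```

The enumeration is exponential in the depth, so it is only suitable for inspecting small examples such as this one.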
Notation
                            notation        example
variable                    A, B, C, ...    A
terminal                    a, b, c, ...    (, )
phrase                      u, v, w         (A), (A(AA)), ε
rule                        r               A −→ (A)
grammar                     G               A −→ ε | (A) | AA
grammar with start symbol   (G, S)
Chomsky Normal Form
- Chomsky Normal Form is the foundation for further reasoning on CFGs (e.g. the CYK algorithm)
(G, S) is in Chomsky Normal Form if every rule in G has one of the following forms:
- A −→ BC
- A −→ a
- S −→ ε
where B, C ≠ S
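The definition can be read as a simple check over all rules. The sketch below assumes a hypothetical encoding (rules as pairs of a left-hand side and a tuple of symbols, plus an explicit set of variables); the name `is_cnf` is not from the talk.

```python
def is_cnf(rules, start, variables):
    """Check that every rule has one of the three shapes of the definition.

    rules:     list of (lhs, rhs) pairs, rhs a tuple of symbols (assumed encoding)
    start:     the start symbol S
    variables: set of symbols treated as variables; all others are terminals
    """
    for lhs, rhs in rules:
        if len(rhs) == 2:
            # A -> BC, where B and C are variables different from S
            ok = all(s in variables and s != start for s in rhs)
        elif len(rhs) == 1:
            # A -> a, where a is a terminal
            ok = rhs[0] not in variables
        else:
            # the only other admissible rule is S -> eps
            ok = len(rhs) == 0 and lhs == start
        if not ok:
            return False
    return True


cnf = [("S", ()), ("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
print(is_cnf(cnf, "S", {"S", "A", "B"}))  # prints True
```

The parentheses grammar A −→ ε | ( A ) | AA with start symbol A fails this check, since ( A ) has three symbols and AA mentions the start symbol on the right.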
Chomsky Normal Form
Example
A −→ ε | ( A ) | AA
A −→ ε | B ) | AA
B −→ ( A
A −→ ε | BC | AA
B −→ DA
C −→ )
D −→ (
Definitions
var     := n   (n ∈ ℕ)
ter     := n   (n ∈ ℕ)
symbol  := var | ter
phrase  := L(symbol)
rule    := var × phrase
grammar := L(rule)
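As an illustration only, the definitions can be mirrored outside Coq. The sketch below assumes that L(X) denotes lists over X and that variables and terminals are distinguished by a tag; the class names `Var` and `Ter` and the helper `terminal` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Var:
    n: int  # variables are numbered


@dataclass(frozen=True)
class Ter:
    n: int  # terminals are numbered as well; the tag keeps them apart


Symbol = Union[Var, Ter]
Phrase = List[Symbol]
Rule = Tuple[Var, Phrase]
Grammar = List[Rule]

# A -> eps | ( A ) | A A, with A = Var(0), "(" = Ter(0), ")" = Ter(1)
A, LP, RP = Var(0), Ter(0), Ter(1)
parens: Grammar = [(A, []), (A, [LP, A, RP]), (A, [A, A])]


def terminal(u: Phrase) -> bool:
    """Assumed reading of `terminal u`: the phrase contains no variables."""
    return all(isinstance(s, Ter) for s in u)
```

Tagging matters because `Var(0)` and `Ter(0)` carry the same number but must remain different symbols.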
Derivation
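- $\Rightarrow$ : grammar → var → phrase → Prop

  $$\frac{A \to u \in G}{A \Rightarrow_G u} \qquad \frac{}{A \Rightarrow_G A} \qquad \frac{A \Rightarrow_G uBw \quad B \Rightarrow_G v}{A \Rightarrow_G uvw}$$

- $L$ : grammar → var → phrase → Prop

  $$L_G^A := \lambda u.\ (A \Rightarrow_G u \wedge \text{terminal } u)$$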
Transformation into CNF
1. eliminate all ε-rules (A −→ ε)
2. eliminate unit-rules (A −→ B)
3. eliminate long-rules (A −→ X1 X2 ... Xk)
4. replace terminals with variables
ε-Elimination
1. add new rules by dropping variables that have an ε-rule
Example
    before                after
    A −→ ε | a            A −→ ε | a
    B −→ ε | b            B −→ ε | b
    C −→ AB | cAc         C −→ AB | ε | A | B | cAc | cc
2. remove all rules of the form A −→ ε (and add S −→ ε if ε is in the language of S)
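The two steps can be replayed on the example grammar. This is a minimal sketch under an assumed encoding (a rule is a pair of a variable and a string, one character per symbol); the function names `add_nullable_rules` and `delete_eps_rules` are hypothetical and this is not the verified Coq algorithm of the talk.

```python
def add_nullable_rules(rules):
    """Step 1: close the rules under dropping a variable X that has a rule X -> eps."""
    closed = set(rules)
    changed = True
    while changed:
        changed = False
        # Variables that currently have an eps-rule
        nullable = {lhs for lhs, rhs in closed if rhs == ""}
        for lhs, rhs in list(closed):
            for i, sym in enumerate(rhs):
                new = (lhs, rhs[:i] + rhs[i + 1:])
                if sym in nullable and new not in closed:
                    closed.add(new)
                    changed = True
    return closed


def delete_eps_rules(rules):
    """Step 2: remove every rule of the form A -> eps."""
    return {(lhs, rhs) for lhs, rhs in rules if rhs != ""}


example = {("A", ""), ("A", "a"), ("B", ""), ("B", "b"), ("C", "AB"), ("C", "cAc")}
closed = add_nullable_rules(example)
print(sorted(rhs for lhs, rhs in closed if lhs == "C"))
# prints ['', 'A', 'AB', 'B', 'cAc', 'cc']
```

The printed right-hand sides for C agree with the "after" column of the example; `delete_eps_rules(closed)` then removes A −→ ε, B −→ ε and C −→ ε.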
1) Adding Nullable Rules
- construct the closure G' of G with respect to

  $$\frac{A \to u \in G}{A \to u \in G'} \qquad \frac{A \to u_1 X u_2 \in G' \quad X \to \varepsilon \in G'}{A \to u_1 u_2 \in G'}$$

- prove that $L_G^A \equiv L_{\mathrm{closure}\ G}^A$
ε-Elimination

$$\frac{A \to u \in G}{A \to u \in G'} \qquad \frac{A \to u_1 X u_2 \in G' \quad X \to \varepsilon \in G'}{A \to u_1 u_2 \in G'}$$

- fixpoint iteration
- termination: the right-hand side of every new rule is a subsequence of an existing right-hand side, so only finitely many rules can be added
- in Coq: bounded recursion with two bounds:
  - number of possible rules
  - |G'| - (number of steps done without adding a rule)
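The idea of bounded recursion can be sketched with a single fuel argument. This is a simplification and not the two-bound scheme named above: the recursion is structural on `fuel`, and the bound counts the subsequences of all right-hand sides. The encoding (rules as pairs of a variable and a string) and the names `step`, `closure` and `rule_bound` are assumptions.

```python
def step(rules):
    """One round: add every rule obtained by dropping one variable that has an eps-rule."""
    nullable = {lhs for lhs, rhs in rules if rhs == ""}
    new = {(lhs, rhs[:i] + rhs[i + 1:])
           for lhs, rhs in rules
           for i, sym in enumerate(rhs) if sym in nullable}
    return rules | new


def closure(rules, fuel):
    """Iterate `step` until nothing changes; the recursion decreases `fuel` each call."""
    if fuel == 0:
        return rules
    nxt = step(rules)
    return rules if nxt == rules else closure(nxt, fuel - 1)


def rule_bound(rules):
    # Every added right-hand side is a subsequence of an original one,
    # so a rule A -> u can give rise to at most 2^|u| rules.
    return sum(2 ** len(rhs) for _, rhs in rules)


G = frozenset({("A", ""), ("A", "a"), ("B", ""), ("B", "b"), ("C", "AB"), ("C", "cAc")})
G_closed = closure(G, rule_bound(G))
print(step(G_closed) == G_closed)  # prints True: a fixpoint was reached
```

Every round that is not the last one adds at least one rule, so the number of possible rules is enough fuel to reach the fixpoint.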
Correctness
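$$A \Rightarrow_G u \leftrightarrow A \Rightarrow_{G'} u$$

→: easy
←: essential:
let $r \in \mathrm{closure}\ G$, $r \notin G$
prove: $A \Rightarrow_{r::G} u \to A \Rightarrow_G u$
induction on $A \Rightarrow_{r::G} u$:
- $u = A$ and we get $A \Rightarrow_G A$
- $A \to u \in r :: G$. So either $A \to u \in G$ (trivial) or $r = A \to u$; then by construction $\exists\, u_1\, u_2\, X.\ A \to u_1 X u_2 \in G \wedge X \to \varepsilon \in G \wedge u = u_1 u_2$. So we get $A \Rightarrow_G u$.
- $u = u_1 u_2 u_3$. By IH: $A \Rightarrow_G u_1 X u_3$ and $X \Rightarrow_G u_2$, so $A \Rightarrow_G u$.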
Correctness (2)
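1) $\text{nullable}_G^A := A \Rightarrow_G \varepsilon$
2) $\text{nullable'}_G^A := A \to \varepsilon \in \mathrm{closure}\ G$
3) $$\frac{\forall X \in u.\ \text{nullable''}_G^X \quad A \to u \in G}{\text{nullable''}_G^A}$$
- 1) ↔ 2): proof not yet done
- 1) ↔ 3): proof by mutual induction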
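Characterisation 3) is an inductive definition, so the set of nullable variables can be computed as a least fixpoint. The sketch below assumes rules encoded as pairs of a variable and a string with one character per symbol; the name `nullable_inductive` is hypothetical.

```python
def nullable_inductive(rules):
    """Least set of variables A such that some rule A -> u has only nullable symbols in u."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            # An empty right-hand side satisfies the premise vacuously;
            # terminals never enter the set, so they block the rule.
            if lhs not in nullable and all(s in nullable for s in rhs):
                nullable.add(lhs)
                changed = True
    return nullable


rules = [("A", ""), ("A", "a"), ("B", ""), ("B", "b"),
         ("C", "AB"), ("C", "cAc"), ("D", "cA")]
print(sorted(nullable_inductive(rules)))  # prints ['A', 'B', 'C']
```

Here D is a hypothetical extra variable added to the running example to show a non-nullable case: its only rule contains the terminal c.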
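2) Deleting ε-Rules
- every rule $B \to \varepsilon$ is superfluous now
- $L_G^A - \{\varepsilon\} \equiv L_{G' - \{B \to \varepsilon\}}^A$
- Proof: Let $A \Rightarrow u$ be of minimal length and $u \neq \varepsilon$. No rule $B \to \varepsilon$ is needed (otherwise the derivation would not be of minimal length).
- if $\varepsilon \in L_G^S$, add $S \to \varepsilon$
  $\Longrightarrow L_G^S \equiv L_{\mathrm{closure}\ G}^S$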
Outlook
- proof for nullable correctness property
- proof for deletion of ε-rules
- finish the algorithm and its verification:
  - deletion of unit-rules
  - deletion of long-rules
  - new rules for terminals
- add other constraints to CNF (e.g. no useless symbols)
- decidability of context-free languages: CYK algorithm
Derivations
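$$\frac{A \to u \in G}{A \Rightarrow_G u} \qquad \frac{}{A \Rightarrow_G A} \qquad \frac{A \Rightarrow_G uBw \quad B \Rightarrow_G v}{A \Rightarrow_G uvw}$$

is equivalent to

$$\frac{}{A \Rightarrow_G [A]} \qquad \frac{A \Rightarrow_G uBw \quad B \to v \in G}{A \Rightarrow_G uvw}$$

proof by straightforward induction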
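The second characterisation corresponds to checking a derivation one step at a time: start from the phrase [A] and repeatedly replace a single variable B by v for some rule B −→ v. The sketch below checks a given sequence of phrases in this way; the string encoding (one character per symbol) and the name `is_derivation` are assumptions.

```python
def is_derivation(rules, start, forms):
    """Check that forms[0] is the start variable and that every later phrase arises
    from its predecessor by rewriting one occurrence of B with v for a rule B -> v."""
    if not forms or forms[0] != start:
        return False
    for prev, cur in zip(forms, forms[1:]):
        one_step = any(prev[:i] + rhs + prev[i + 1:] == cur
                       for i, sym in enumerate(prev)
                       for lhs, rhs in rules if lhs == sym)
        if not one_step:
            return False
    return True


rules = [("A", ""), ("A", "(A)"), ("A", "AA")]
print(is_derivation(rules, "A", ["A", "AA", "(A)A", "()A", "()(A)", "()()"]))
# prints True
```

A sequence such as `["A", "()"]` is rejected, because "()" needs two rule applications.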
Equivalence nullable
strengthen the statement:

$$A \Rightarrow_G u \to \forall X \in u.\ X \Rightarrow_G \varepsilon \to \text{nullable''}_G^A$$

proof by induction on $A \Rightarrow_G u$.