Chapter 3
Context-Free Grammars
By
Dr Zalmiyah Zakaria
•Context-Free Grammars and Languages
•Regular Grammars
Formal Definition of
Context-Free Grammars (CFG)
A CFG can be formally defined by a quadruple
(V, Σ, P, S) where:
– V is a finite set of variables (non-terminals)
– Σ (the alphabet) is a finite set of terminal
symbols, where V ∩ Σ = ∅
– P is a finite set of rules (production rules) written
as A → α for A ∈ V, α ∈ (V ∪ Σ)*
– S is the start symbol, S ∈ V
Formal Definition of CFG
• We can give a formal description to a
particular CFG by specifying each of its
four components, for example,
G = ({S, A}, {0, 1}, P, S) where P
consists of three rules:
S → 0S1
S → A
A → λ
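As a rough illustration, the quadruple can be written down as a small data structure whose checks mirror the side conditions of the definition. This is only a sketch: the class name CFG, the method is_well_formed and the choice to represent λ as an empty tuple are assumptions made for this example, not part of the course material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset  # V
    terminals: frozenset  # Σ
    rules: tuple          # P, as (left, right) pairs; right is a tuple of symbols
    start: str            # S

    def is_well_formed(self) -> bool:
        """Check the side conditions of the formal definition."""
        symbols = self.variables | self.terminals
        return (
            not (self.variables & self.terminals)         # V ∩ Σ = ∅
            and self.start in self.variables              # S ∈ V
            and all(
                left in self.variables                    # A ∈ V
                and all(sym in symbols for sym in right)  # α ∈ (V ∪ Σ)*
                for left, right in self.rules
            )
        )


# G = ({S, A}, {0, 1}, P, S) with P: S → 0S1, S → A, A → λ
# (λ is represented by the empty tuple, an assumption of this sketch).
G = CFG(
    variables=frozenset({"S", "A"}),
    terminals=frozenset({"0", "1"}),
    rules=(("S", ("0", "S", "1")), ("S", ("A",)), ("A", ())),
    start="S",
)
print(G.is_well_formed())  # True
```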
Sept2011
Theory of Computer Science
Context-Free Grammars
• Terminal symbols – elements of the
alphabet
• Variables or non-terminals – additional
symbols used in production rules
• Variable S (start symbol) initiates the
process of generating acceptable
strings.
Terminal or Variable ?
• S → (S) | S + S | S × S | A
• A → 1 | 2 | 3
• The terminal symbols are { (, ), +,
×, 1, 2, 3}
• The variable symbols are S and A
Context-Free Grammars
• A rule is an element of the set
V × (V ∪ Σ)*.
• An A rule:
[A, w] or A → w
• A null rule or lambda rule:
A → λ
Context-Free Grammars
• Grammars are used to generate
strings of a language.
• An A rule can be applied to the
variable A whenever and wherever
it occurs.
• No limitation on applicability of a
rule – it is context free
Context-Free Grammars
• CFGs place no restrictions on the right-hand side of
production rules. All of the following are valid CFG
production rules:
S → aSbS
S → λ
S → ABcDEF
S → c
S → xyz
S → MZK
• Notice that the right-hand side of a production rule can be λ,
one or more terminals, one or more non-terminals, or any
combination of terminals and non-terminals.
Generating Strings with CFG
• Generate a string by applying rules
– Start with the initial symbol
– Repeat:
• Pick any non-terminal in the string
• Replace that non-terminal with the right-hand side of
some rule that has that non-terminal as a left-hand
side
• Repeat until all elements in the string are terminals
• E.g. :
P:
S → uAv
A → w
We can derive the string uwv as:
S ⇒ uAv ⇒ uwv
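The replacement procedure can be sketched in a few lines of Python. The sketch assumes that every symbol is a single character and that each step rewrites the leftmost occurrence of the chosen variable; the helper names apply_rule and derive are hypothetical and chosen only for this illustration.

```python
def apply_rule(form: str, left: str, right: str) -> str:
    """Replace the leftmost occurrence of variable `left` in `form` with `right`."""
    if left not in form:
        raise ValueError(f"variable {left!r} does not occur in {form!r}")
    return form.replace(left, right, 1)


def derive(start: str, steps):
    """Apply (left, right) rules in order and return every intermediate string."""
    forms = [start]
    for left, right in steps:
        forms.append(apply_rule(forms[-1], left, right))
    return forms


# P: S → uAv, A → w
print(" ⇒ ".join(derive("S", [("S", "uAv"), ("A", "w")])))  # S ⇒ uAv ⇒ uwv
```

A rule with λ on the right-hand side can be passed as an empty string, for example ("B", "").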
Generating Strings with CFG
P:
S → aS
S → Bb
B → cB
B → λ
• Generating a string:
S       replace S with aS
aS      replace S with Bb
aBb     replace B with cB
acBb    replace B with λ
acb     final string
Generating Strings with CFG
P:
S → aS
B → cB
S → Bb
B → λ
• Generating a string:
S        replace S with aS
aS       replace S with aS
aaS      replace S with Bb
aaBb     replace B with cB
aacBb    replace B with cB
aaccBb   replace B with λ
aaccb    final string
Generating Strings with CFG
P:
S → aS
B → cB
S → Bb
B → λ
• Regular expression equivalent to
this CFG: a*c*b
• Shortest string: S ⇒ Bb ⇒ b.
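One way to gain confidence in the claimed regular expression is to enumerate every string the grammar generates up to a small length and compare the result with Python's re module. The sketch below is an informal bounded check, not a proof. It relies on the fact that each sentential form of this particular grammar contains at most one variable, so the search is finite; the function name generate and the bound 4 are arbitrary choices for this example.

```python
import re
from itertools import product

RULES = {"S": ["aS", "Bb"], "B": ["cB", ""]}  # "" stands for λ


def generate(max_len: int) -> set:
    """All terminal strings with at most max_len symbols derivable from S."""
    found, seen, frontier = set(), {"S"}, ["S"]
    while frontier:
        form = frontier.pop()
        # Index of the leftmost variable, or None if the form is a terminal string.
        idx = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if idx is None:
            found.add(form)
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Prune forms that already contain too many terminals.
            if sum(ch not in RULES for ch in new) <= max_len and new not in seen:
                seen.add(new)
                frontier.append(new)
    return found


expected = {
    "".join(p)
    for n in range(5)
    for p in product("abc", repeat=n)
    if re.fullmatch(r"a*c*b", "".join(p))
}
print(generate(4) == expected)  # True
```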
Derivation
• A derivation is a listing of how a string is generated –
showing what the string looks like after every
replacement.
S → AB
A → aA | λ
B → bB | λ
S ⇒ AB
⇒ aAB
⇒ aAbB
⇒ abB
⇒ abbB
⇒ abb
Derivation
• A terminal string is in the language of
the grammar if it can be derived from
the start symbol S using the rules of
the grammar:
• Example: S → AASB | AAB
A → a
B → bbb
Derivation
Derivation                     Rule applied
S ⇒ AASB                       S → AASB
⇒ AAAASBB                      S → AASB
⇒ AAAAAASBBB                   S → AASB
⇒ AAAAAAAABBBB                 S → AAB
⇒ aaaaaaaaBBBB                 A → a (applied to each A)
⇒ aaaaaaaabbbbbbbbbbbb         B → bbb (applied to each B)
Derivation
• Let G be the grammar :
S → aS | aA
A → bA | λ
• The derivation of aabb is as shown:
S ⇒ aS
⇒ aaA
⇒ aabA
⇒ aabbA
⇒ aabb
Example
• Let G be the grammar :
S → aSb | ab
• The derivation of aaaabbbb is as shown:
S ⇒ aSb
⇒ aaSbb
⇒ aaaSbbb
⇒ aaaabbbb
Set notation equivalent to this CFG is
L = {a^n b^n | n > 0}
CFG – Generating strings
• E.g.: Formal description of G = ({S}, {a, b}, P, S)
P:
S → aS | bS | λ
• The following strings can be derived:
S ⇒ λ
S ⇒ aS ⇒ a
S ⇒ bS ⇒ b
S ⇒ aS ⇒ aaS ⇒ aa
S ⇒ aS ⇒ abS ⇒ ab
S ⇒ bS ⇒ baS ⇒ ba
S ⇒ aS ⇒ abS ⇒ abbS ⇒ abb
CFG – Generating strings
• E.g.: G = ({S}, {a, b}, P, S)
P: S → aS | bS | λ
• The language above can also be
defined using a regular expression:
L(G) = (a + b)*
CFG – Generating strings
• E.g.:
G = ({S, A}, {a, b}, P, S)
P:
S → AA
A → AAA | bA | Ab | a
• The following string can be derived:
S ⇒ AA
⇒ aA        rule [A → a]
⇒ aAAA      rule [A → AAA]
⇒ abAAA     rule [A → bA]
⇒ abaAA     rule [A → a]
⇒ abaAbA    rule [A → Ab]
⇒ abaabA    rule [A → a]
⇒ abaaba    rule [A → a]
• Shortest string?
CFG – Generating strings
G = (V, Σ, P, S), V = {S, A}, Σ = {a, b},
P:
S → AA
A → AAA | bA | Ab | a
• Four distinct derivations of ababaa in G:
(a) S ⇒ AA ⇒ aA ⇒ aAAA ⇒ abAAA ⇒ abaAA ⇒ ababAA ⇒ ababaA ⇒ ababaa
(b) S ⇒ AA ⇒ AAAA ⇒ aAAA ⇒ abAAA ⇒ abaAA ⇒ ababAA ⇒ ababaA ⇒ ababaa
(c) S ⇒ AA ⇒ Aa ⇒ AAAa ⇒ AAbAa ⇒ AAbaa ⇒ AbAbaa ⇒ Ababaa ⇒ ababaa
(d) S ⇒ AA ⇒ aA ⇒ aAAA ⇒ aAAa ⇒ abAAa ⇒ abAbAa ⇒ ababAa ⇒ ababaa
(a) & (b) leftmost,
(c) rightmost,
(d) arbitrary
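The leftmost and rightmost labels can be checked mechanically. The sketch below assumes single-character symbols and treats a step as leftmost (or rightmost) when it can be explained by rewriting the leftmost (or rightmost) variable with some rule of G. The helper names step_ok, is_leftmost and is_rightmost are hypothetical.

```python
RULES = {"S": ["AA"], "A": ["AAA", "bA", "Ab", "a"]}


def step_ok(form: str, new: str, pick) -> bool:
    """True if `new` results from rewriting the variable of `form` chosen by `pick`."""
    positions = [i for i, ch in enumerate(form) if ch in RULES]
    if not positions:
        return False
    i = pick(positions)  # min → leftmost variable, max → rightmost variable
    return any(form[:i] + rhs + form[i + 1:] == new for rhs in RULES[form[i]])


def is_leftmost(forms) -> bool:
    return all(step_ok(x, y, min) for x, y in zip(forms, forms[1:]))


def is_rightmost(forms) -> bool:
    return all(step_ok(x, y, max) for x, y in zip(forms, forms[1:]))


a = "S AA aA aAAA abAAA abaAA ababAA ababaA ababaa".split()
b = "S AA AAAA aAAA abAAA abaAA ababAA ababaA ababaa".split()
c = "S AA Aa AAAa AAbAa AAbaa AbAbaa Ababaa ababaa".split()
d = "S AA aA aAAA aAAa abAAa abAbAa ababAa ababaa".split()

for name, forms in zip("abcd", (a, b, c, d)):
    print(name, is_leftmost(forms), is_rightmost(forms))
# a True False
# b True False
# c False True
# d False False
```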
Context-Free Grammars
• Sentential forms are the strings derivable
from the start symbol of the grammar.
• Sentences are sentential forms that contain only
terminal symbols.
• E.g. in S ⇒ aS ⇒ aSa ⇒ aba, the strings S, aS, aSa and aba
are sentential forms, and aba is a sentence.
• A set of strings over Σ is a context-free
language if there is a context-free grammar
that generates it.
Derivation Tree, DT
G = ({S, A}, {a, b}, P, S)
P: S → AA
A → AAA | bA | Ab | a
The derivation tree for
S ⇒ AA ⇒ aA ⇒ abA ⇒ abAb ⇒ abab
Derivation Tree
S ⇒ aS ⇒ aSa ⇒ aba
S ⇒ Sa ⇒ aSa ⇒ aba
Derivation Tree
• A CFG G is ambiguous if there exists more
than one derivation tree for some string n ∈ L(G).
• Example: G = ({S}, {a, b}, P, S)
P: S → aS | Sa | b
The string aba can be derived as:
S ⇒ aS ⇒ aSa ⇒ aba or S ⇒ Sa ⇒ aSa ⇒ aba
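For this small grammar the number of derivation trees of a string can be counted directly, because each rule either peels off a leading a, peels off a trailing a, or produces the single symbol b. The following is a minimal sketch written for this one grammar only; the function name count_trees is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(w: str) -> int:
    """Number of derivation trees for w from S, with P: S → aS | Sa | b."""
    total = 0
    if w == "b":
        total += 1                    # S → b
    if len(w) > 1 and w[0] == "a":
        total += count_trees(w[1:])   # S → aS
    if len(w) > 1 and w[-1] == "a":
        total += count_trees(w[:-1])  # S → Sa
    return total


print(count_trees("aba"))  # 2
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous.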
Example
• Let G be the grammar given by
the production
S → aSa | aBa
B → bB | b
• Then L(G) = {a^n b^m a^n | n > 0, m > 0}
Example
• Let L(G) = {a^n b^m c^m d^(2n) | n ≥ 0, m > 0}
• Then the production rules for this
grammar are:
S → aSdd | A
A → bAc | bc
Example
• A string w is a palindrome if w = w^R
• The set of palindromes over {a, b}
can be derived using the rules:
S → a | b | λ
S → aSa | bSb
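The palindrome rules translate almost directly into a recursive membership test: the base cases correspond to S → a | b | λ and the recursive case to S → aSa | bSb. This is an illustrative sketch with a hypothetical function name, checked below against the definition w = w^R.

```python
from itertools import product


def derives_palindrome(w: str) -> bool:
    """True if w is derivable with S → a | b | λ | aSa | bSb."""
    if any(ch not in "ab" for ch in w):
        return False
    if len(w) <= 1:
        return True  # S → a | b | λ
    # S → aSa or S → bSb: the outer symbols must match.
    return w[0] == w[-1] and derives_palindrome(w[1:-1])


words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
print(all(derives_palindrome(w) == (w == w[::-1]) for w in words))  # True
```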
More Examples
• What is the language of this grammar?
• S → aA
A → aA | bA | b
Language: a(a + b)*b
• S → aA
A → aA | bB
B → bB | λ
Language: aa*bb*
• S → aS | bA
A → bB
B → aB | λ
Language: a*bba*
How to convert RE to CFG
• General rules:
• a + b
S → a | b
• ab
S → aA
A → b
• a*
S → aS | λ
Regular Grammars
• Regular grammars are an important
subclass of context-free grammars that
play an important role in the lexical
analysis and parsing of programming
languages.
• Regular grammars are a subset of CFGs.
• Regular grammars are obtained by
placing restrictions on the form of the
right hand side of the rules.
Definition
• A regular grammar is a CFG in which each rule
has one of the following forms:
1. A → a
2. A → aB
3. A → λ
where A, B ∈ V and a ∈ Σ; that is, A and B are non-terminals and a is a single terminal.
Regular Grammars
• There is at most ONE variable in a
sentential form – the rightmost
symbol in the string.
• Each rule application adds ONE
terminal to the derived string.
Derivation is terminated by rules:
A → a OR
A → λ
Example
• Consider the grammar:
G:
S → abSA | λ
A → Aa | λ
• The equivalent regular grammar:
Gr:
S → aB | λ
B → bS | bA
A → aA | λ
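Whether a grammar has the regular form A → a, A → aB or A → λ can be checked rule by rule. The sketch below assumes single-character symbols, represents λ as the empty string, and assumes every variable appears as a key of the rule dictionary; is_regular is a hypothetical helper name.

```python
def is_regular(rules: dict, terminals: set) -> bool:
    """True if every rule has the form A → a, A → aB or A → λ."""
    variables = set(rules)
    for rhs_list in rules.values():
        for rhs in rhs_list:
            ok = (
                rhs == ""                                    # A → λ
                or (len(rhs) == 1 and rhs in terminals)      # A → a
                or (len(rhs) == 2 and rhs[0] in terminals
                    and rhs[1] in variables)                 # A → aB
            )
            if not ok:
                return False
    return True


G = {"S": ["abSA", ""], "A": ["Aa", ""]}
Gr = {"S": ["aB", ""], "B": ["bS", "bA"], "A": ["aA", ""]}
print(is_regular(G, {"a", "b"}), is_regular(Gr, {"a", "b"}))  # False True
```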
Example
Syntax of Pascal in Backus-Naur Form
<assign> → <var> := <exp>
<var> → A | B | C
<exp> → <var> + <exp>
| <var> - <exp>
| (<exp>)
| <var> * <exp>
| <var>
Example
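Is A := B*(A+C) syntactically correct?
<assign> ⇒ <var> := <exp>
⇒ A := <exp>
⇒ A := <var>*<exp>
⇒ A := B*<exp>
⇒ A := B*(<exp>)
⇒ A := B*(<var>+<exp>)
⇒ A := B*(A+<exp>)
⇒ A := B*(A+<var>)
⇒ A := B*(A+C)
Yes: the statement can be derived from <assign>, so it is syntactically correct.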
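The same question can be answered by a small recursive-descent recogniser that follows the BNF rules one by one. This is only a sketch for the grammar fragment shown here: it ignores spaces, handles just the variables A, B and C, and the function names parse_exp and is_assign are hypothetical.

```python
def parse_exp(s: str, i: int) -> int:
    """Return the index just after an <exp> starting at s[i], or -1 on failure."""
    if i < len(s) and s[i] == "(":                # <exp> → (<exp>)
        j = parse_exp(s, i + 1)
        return j + 1 if j != -1 and j < len(s) and s[j] == ")" else -1
    if i < len(s) and s[i] in "ABC":              # <exp> starts with <var>
        if i + 1 < len(s) and s[i + 1] in "+-*":  # <var> op <exp>
            return parse_exp(s, i + 2)
        return i + 1                              # <exp> → <var>
    return -1


def is_assign(text: str) -> bool:
    """True if text can be derived from <assign> → <var> := <exp>."""
    s = text.replace(" ", "")
    return (
        len(s) > 3
        and s[0] in "ABC"
        and s[1:3] == ":="
        and parse_exp(s, 3) == len(s)
    )


print(is_assign("A := B*(A+C)"))  # True
print(is_assign("A := (A+B)*C"))  # False
```

The second call returns False because, in this grammar, a parenthesised expression is a complete <exp> and cannot be followed by an operator.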
How to construct RG from a RE
• Use a new non-terminal for every new
character
• Each loop state turns into a recursive
definition on a non-terminal
• Example: R.E.
ab*ab
RG
S → aA
A → bA
A → aB
B → b
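A regular grammar can be run like a nondeterministic finite automaton whose states are the variables, which gives a quick way to compare it with the regular expression it was built from. The sketch below assumes rules of the form A → a, A → aB or A → λ with single-character symbols, uses None as a marker for "derivation finished by a rule A → a", and compares the result with Python's re module on all strings up to length 6. The function name accepts is hypothetical.

```python
import re
from itertools import product

RG = {"S": ["aA"], "A": ["bA", "aB"], "B": ["b"]}


def accepts(rules: dict, w: str, start: str = "S") -> bool:
    """Membership test that treats the variables as NFA states."""
    FINAL = None  # reached when a rule A → a consumes the last symbol
    states = {start}
    for ch in w:
        nxt = set()
        for var in states:
            for rhs in rules.get(var, []):
                if rhs[:1] == ch:
                    nxt.add(rhs[1] if len(rhs) == 2 else FINAL)
        states = nxt
    # Accept if the derivation finished, or if a remaining variable has a λ rule.
    return FINAL in states or any("" in rules.get(v, []) for v in states)


words = ["".join(p) for n in range(7) for p in product("ab", repeat=n)]
print(all(accepts(RG, w) == bool(re.fullmatch(r"ab*ab", w)) for w in words))  # True
```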
Reg Expression to Reg Grammar
• RE a(a + b)*b
RG S → aA
A → aA
A → bA
A → b
• RE ab*a
RG S → aA
A → bA
A → a
How to construct RG from a RE (cont.)
• E.g.
R.E.
a(a + b)*b
Regular Grammar
S → aA
A → aA
A → bA
A → b
RE ⇒ RG
• Consider the regular expression:
a⁺b*
• The regular grammar is:
S → aS | aR
R → bR | λ
RE ⇒ RG ⇒ RL
• The regular language L = a⁺b*
can be defined by the grammar:
G = (V, Σ, P, S)
where: V = {S, R}
Σ = {a, b}
P: S → aS | aR
R → bR | λ
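Non-Regular Grammar
• The language L = a⁺b*
can also be defined by the grammar:
G = (V, Σ, P, S)
where: V = {S, A, B}
Σ = {a, b}
P: S → AB
A → aA | a
B → bB | λ
This is a non-regular grammar (the rule S → AB is not in regular form),
although the language itself is regular.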
Non-RG to RG
• Since every RG defines a RL and every RL must have a
RE, it is often easier to work out
what the RE is and convert it to the RG.
S → XYX
X → aX | bX | λ
Y → aa | bb
• The equivalent RE for it is: (a + b)*(aa + bb)(a + b)*
• E.g.: CFG that is not regular:
S → aS | bS | X
X → aaY | bbY
Y → aY | bY | λ
• Convert it to RG:
Example
• Grammar for even-length strings over
{a, b}:
S → aE | bE | λ
E → aS | bS
Design example
L = {0^n 1^n | n ≥ 0}
These strings have recursive structure:
000000111111
0000011111
00001111
000111
0011
01
λ
S → 0S1 | λ
Design examples
L = {0^n 1^n 0^m 1^m | n ≥ 0, m ≥ 0}
These strings have two parts:
L = L₁L₂
L₁ = {0^n 1^n | n ≥ 0}
L₂ = {0^m 1^m | m ≥ 0}
rules for L₁: S₁ → 0S₁1 | λ
L₂ is the same as L₁
Allowed strings
010011
00110011
000111
S → AA
A → 0A1 | λ
Design examples
L = {0^n 1^m 0^m 1^n | n ≥ 0, m ≥ 0}
Allowed strings
011001
0011
1100
00110011
These strings have nested structure:
outer part: 0^n ... 1^n
inner part: 1^m 0^m
S → 0S1 | A
A → 1A0 | λ
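The nested design can be mirrored by two mutually dependent recognisers, one per variable. The sketch below uses hypothetical function names in_S and in_A and checks the result by brute force against the set notation for all strings of length up to 8.

```python
from itertools import product


def in_A(w: str) -> bool:
    """A → 1A0 | λ (inner part 1^m 0^m)."""
    if w == "":
        return True
    return len(w) >= 2 and w[0] == "1" and w[-1] == "0" and in_A(w[1:-1])


def in_S(w: str) -> bool:
    """S → 0S1 | A (outer part 0^n ... 1^n wrapped around the inner part)."""
    if len(w) >= 2 and w[0] == "0" and w[-1] == "1" and in_S(w[1:-1]):
        return True
    return in_A(w)


# Every string of the language with at most 8 symbols has n + m <= 4.
expected = {"0" * n + "1" * m + "0" * m + "1" * n for n in range(5) for m in range(5)}
words = ["".join(p) for k in range(9) for p in product("01", repeat=k)]
print(all(in_S(w) == (w in expected) for w in words))  # True
```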
Example of RG
• G = ({S, A, B, C}, {a, b}, P, S)
P:
S → aS | aB
B → bC
C → aC | a
• L(G) = {a^m b a^n | m, n ≥ 1}
Example of CFG
• G = ({S, A, B, C}, {a, b}, P, S)
P:
S → aSbb | abb
• L(G) = {a^n b^(2n) | n ≥ 1}