Grammars
What is a grammar?
• Roughly speaking, a grammar is a set of rules
that produce sentences (strings).
• For example, a simplistic grammar for the
English language is the following:
– <SENTENCE> → <NOUN PHRASE> <VERB>
– <NOUN PHRASE> → <ARTICLE> <NOUN>
– <ARTICLE> → a | the
– <NOUN> → girl | dog
– <VERB> → barks | runs | sleeps
Production of the sentence
“the dog barks”
<SENTENCE> → <NOUN PHRASE> <VERB> →
<ARTICLE> <NOUN> <VERB> →
the <NOUN> <VERB> →
the dog <VERB> →
the dog barks
Production of the sentence
“a girl runs”
<SENTENCE> → <NOUN PHRASE> <VERB> →
<ARTICLE> <NOUN> <VERB> →
a <NOUN> <VERB> →
a girl <VERB> →
a girl runs
Variables and terminals
• Variables: the symbols in angle brackets, i.e. <SENTENCE>, <NOUN PHRASE>, <ARTICLE>, <NOUN>, <VERB>.
• Terminals: the words that appear in the produced sentence, i.e. a, the, girl, dog, barks, runs, sleeps.
Grammars (more formally)
• A grammar is a quadruple (V, T, S, P) with
– V being a set of Variables
– T being a set of Terminals
– S in V being the start variable
– P being a set of production rules
Derivation
• A string of terminals can be derived by the
grammar if we can produce it by starting from
the start variable and repeatedly substituting
parts of the string produced so far, following
the production rules.
• The language that a grammar produces is the
set of all strings of terminals that can be
derived by the grammar.
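The definition of derivation can be sketched in code. The following is a minimal illustration, not a general parser: it takes the simplistic English grammar from the beginning and collects every string of terminals that can be derived from the start variable. The names `GRAMMAR`, `language` and `max_symbols` are assumptions made for this sketch (as is writing NOUN_PHRASE with an underscore), and the length bound is only there to guarantee termination for recursive grammars.

```python
from collections import deque

# The simplistic English grammar: variable -> list of right-hand sides.
# Every symbol that is not a key of the dictionary is treated as a terminal.
GRAMMAR = {
    "SENTENCE": [["NOUN_PHRASE", "VERB"]],
    "NOUN_PHRASE": [["ARTICLE", "NOUN"]],
    "ARTICLE": [["a"], ["the"]],
    "NOUN": [["girl"], ["dog"]],
    "VERB": [["barks"], ["runs"], ["sleeps"]],
}


def language(productions, start, max_symbols=10):
    """Collect every terminal string derivable from `start`.

    Sentential forms longer than `max_symbols` are discarded, so the search
    always terminates (and may miss strings for recursive grammars).
    """
    results = set()
    seen = {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if only terminals remain.
        pos = next((i for i, s in enumerate(form) if s in productions), None)
        if pos is None:
            results.add(" ".join(form))
            continue
        # Substitute the leftmost variable with each of its right-hand sides.
        for rhs in productions[form[pos]]:
            new_form = form[:pos] + tuple(rhs) + form[pos + 1:]
            if len(new_form) <= max_symbols and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


sentences = language(GRAMMAR, "SENTENCE")
print(len(sentences))                # 2 articles x 2 nouns x 3 verbs
print("the dog barks" in sentences)
```

Always expanding the leftmost variable is a design choice of the sketch; for context free grammars it reaches the same set of terminal strings as expanding variables in any other order.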
Another grammar
• S → aSb | ε
• Which language does this grammar produce?
• S → ε
• S → aSb → ab
• S → aSb → aaSbb → aabb
• S → aSb → aaSbb → aaaSbbb → aaabbb
• S → aSb → aaSbb → … → a^n S b^n → a^n b^n
L = {a^n b^n : n ≥ 0}
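The pattern in these derivations can be replayed mechanically. The sketch below applies S → aSb exactly n times and then S → ε, and compares the result with a direct membership test for the language. The function names `derive_anbn` and `in_language` are hypothetical, chosen only for this illustration.

```python
def derive_anbn(n):
    """Apply S -> aSb exactly n times, then S -> ε; return every step."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "aSb")   # rule S -> aSb
        steps.append(form)
    form = form.replace("S", "")          # rule S -> ε
    steps.append(form)
    return steps


def in_language(s):
    """Direct membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


# The empty string is printed as ε to match the notation of the slides.
print(" → ".join(step or "ε" for step in derive_anbn(2)))
# S → aSb → aaSbb → aabb
```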
L_() : “proper opening and closing (, )”
Strings in the language: (), (()), ()(), ((()())())
Strings not in the language: (, )(, (()((
A grammar for the language
L_() : “proper opening and closing (, )”
• S → (S) | SS | ε
• Example: Derivation of the string ((()())())
S → (S) → (SS) → ((S)S) → ((SS)S) → (((S)S)S) →
(((ε)S)S) → ((()S)S) → ((()(S))S) → ((()(ε))S) →
((()())S) → ((()())(S)) → ((()())(ε)) → ((()())())
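To experiment with which strings belong to this language, a small membership test helps. Note that the sketch below does not use the grammar at all: it relies on the well-known characterization that a string of parentheses is properly nested exactly when a running depth counter never goes negative and ends at zero. The function name `is_balanced` is an assumption of this sketch.

```python
def is_balanced(s):
    """Return True if s is a properly nested string of '(' and ')'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # a ')' with no matching '(' before it
                return False
        else:
            return False         # only parentheses are terminals here
    return depth == 0            # every '(' must have been closed


for s in ["()", "(())", "()()", "((()())())", "(", ")(", "(()(("]:
    print(s, is_balanced(s))
```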
Try it yourself
• Find a grammar that produces the language
L_R = {ww^R : w in {a,b}*}
• S → aSa | bSb | ε
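One way to check that this grammar matches the language is to write a recognizer that mirrors the rules one by one. This is a minimal sketch under that reading; the function name `in_LR` is hypothetical.

```python
def in_LR(s):
    """Recognize {w w^R : w in {a,b}*} by mirroring S -> aSa | bSb | ε."""
    if s == "":
        return True                # rule S -> ε
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "ab":
        return in_LR(s[1:-1])      # rule S -> aSa or S -> bSb
    return False


print(in_LR("abba"))   # w = "ab"
print(in_LR("aba"))    # odd length, so not of the form w w^R
```

Each recursive call strips one matching outer pair, which corresponds to undoing one application of S → aSa or S → bSb.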
Context free grammars
All the grammars that we have seen so far are
Context Free (CF) grammars. CF grammars
have a special rule form:
• The production rules must be of the form
A⟶α
– A is in V
– α is in (V∪T)*
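The rule form can be checked mechanically. The sketch below assumes a representation that is not part of the slides: each production is a pair (left side, right side) of tuples of symbols, and the function name `is_context_free` is hypothetical.

```python
def is_context_free(variables, terminals, productions):
    """Check that every rule has the form A -> α with A in V, α in (V ∪ T)*."""
    symbols = variables | terminals
    return all(
        len(lhs) == 1                       # exactly one symbol on the left
        and lhs[0] in variables             # and that symbol is a variable
        and all(x in symbols for x in rhs)  # right side is over V ∪ T
        for lhs, rhs in productions
    )


V = {"S"}
T = {"a", "b"}

# S -> aSb | ε (the empty tuple stands for ε)
cf_rules = [(("S",), ("a", "S", "b")), (("S",), ())]

# aS -> Sa has two symbols on the left side, so it is not context free.
other_rules = [(("a", "S"), ("S", "a"))]

print(is_context_free(V, T, cf_rules))
print(is_context_free(V, T, other_rules))
```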
Context Free Languages
• A language which is produced by a context free
grammar is called a context free language.
• Examples of CF languages:
– L = {a^n b^n : n ≥ 0}
S → aSb | ε
– L_() = set of strings of proper opening and closing (, )
S → (S) | SS | ε
– L_R = {ww^R : w in {a,b}*}
S → aSa | bSb | ε
Parse Trees
• A parse tree for a string of terminals is a tree
with the following properties:
• The root is the start variable.
• Each internal node is a variable A, and its
children, read from left to right, are the
variables and terminals on the right part of one
production rule that has A as its left part.
• The leaves are terminals (or ε) and, read from
left to right, they spell out the string.
Parse Trees
• Consider the grammar: S → (S) | SS | ε
• A parse tree for (()()) is the following
S
├── (
├── S
│   ├── S
│   │   ├── (
│   │   ├── S
│   │   │   └── ε
│   │   └── )
│   └── S
│       ├── (
│       ├── S
│       │   └── ε
│       └── )
└── )
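The properties of a parse tree can be verified on this example. In the sketch below a node is a pair (symbol, children), which is a representation assumed for illustration only; `tree_yield` reads the leaves from left to right, and `follows_grammar` checks that every internal node matches a rule of S → (S) | SS | ε.

```python
EPS = "ε"
RULES = {("(", "S", ")"), ("S", "S"), (EPS,)}   # right-hand sides for S


def leaf(symbol):
    return (symbol, [])


def S(*children):
    return ("S", list(children))


# Parse tree for (()()): S -> (S), inner S -> SS, each S -> (S), S -> ε.
tree = S(
    leaf("("),
    S(
        S(leaf("("), S(leaf(EPS)), leaf(")")),
        S(leaf("("), S(leaf(EPS)), leaf(")")),
    ),
    leaf(")"),
)


def tree_yield(node):
    """Concatenate the leaves from left to right (ε contributes nothing)."""
    symbol, children = node
    if not children:
        return "" if symbol == EPS else symbol
    return "".join(tree_yield(child) for child in children)


def follows_grammar(node):
    """Check that every internal node applies one rule of the grammar."""
    symbol, children = node
    if not children:
        return symbol != "S"       # leaves must be terminals or ε
    rhs = tuple(child[0] for child in children)
    return symbol == "S" and rhs in RULES and all(
        follows_grammar(child) for child in children
    )


print(tree_yield(tree))
print(follows_grammar(tree))
```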
Ambiguity
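• A string in the language can have more than
one parse tree. For example, with the previous
grammar the string ()()() can be split by the
rule S → SS in two different ways.
• A grammar in which some string has more than
one parse tree is called ambiguous.
• Consider the following grammar for
mathematical expressions:
E → E + E | E • E | (E) | N
N → 0 | 1 | … | 9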
Ambiguity
• This grammar is ambiguous because the string
2+2•2 has two different derivation trees.
First parse tree:
E
├── E
│   └── N
│       └── 2
├── +
└── E
    ├── E
    │   └── N
    │       └── 2
    ├── •
    └── E
        └── N
            └── 2
Second parse tree:
E
├── E
│   ├── E
│   │   └── N
│   │       └── 2
│   ├── +
│   └── E
│       └── N
│           └── 2
├── •
└── E
    └── N
        └── 2
Ambiguity
• The problem with that is that the same string
can have two different meanings according to
the way it is parsed.
• Example: 2 + 2•2 can denote:
– 2 + (2 • 2) = 6 (first parse tree)
– (2 + 2) • 2 = 8 (second parse tree)
• This is bad because there are two different
ways to evaluate the same expression.
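The two meanings can be made concrete by evaluating both trees. The sketch below uses a simplified tree shape that is an assumption of this example: an operator node is a triple (operator, left, right) and a number is a plain integer, so the intermediate E and N nodes are left out.

```python
def evaluate(tree):
    """Evaluate an expression tree bottom-up."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r


first = ("+", 2, ("•", 2, 2))     # 2 + (2 • 2)
second = ("•", ("+", 2, 2), 2)    # (2 + 2) • 2

print(evaluate(first), evaluate(second))   # 6 8
```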
Ambiguity
• We can design an unambiguous grammar for
the same language:
E → E + T | T
T → T • F | F
F → (E) | N
N → 0 | 1 | … | 9
Unique parse tree for 2+2•2
E
├── E
│   └── T
│       └── F
│           └── N
│               └── 2
├── +
└── T
    ├── T
    │   └── F
    │       └── N
    │           └── 2
    ├── •
    └── F
        └── N
            └── 2
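The layered structure E, T, F translates almost directly into a small evaluator. The sketch below is a recursive descent parser, which is a technique swapped in for illustration and not something the slides prescribe. Since recursive descent cannot follow the left-recursive rules E → E + T and T → T • F literally, they are read as the loops "T followed by any number of + T" and "F followed by any number of • F", which accept the same strings and keep the left-to-right grouping. The name `parse` is hypothetical.

```python
def parse(text):
    """Evaluate an expression over single digits, +, • and parentheses."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():                 # E -> E + T | T, read as T (+ T)*
        value = term()
        while peek() == "+":
            take()
            value += term()
        return value

    def term():                 # T -> T • F | F, read as F (• F)*
        value = factor()
        while peek() == "•":
            take()
            value *= factor()
        return value

    def factor():               # F -> (E) | N, with N a single digit
        tok = take()
        if tok == "(":
            value = expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok is not None and tok in "0123456789":
            return int(tok)
        raise ValueError(f"unexpected token {tok!r}")

    result = expr()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result


print(parse("2+2•2"))     # 6
print(parse("(2+2)•2"))   # 8
```

Because • is handled one level below +, the product is always grouped first, which is the same grouping the unique parse tree expresses.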
Closure under the reg. operations
Context Free languages are closed under the
regular operations:
• Union ∪
• Concatenation ∘
• Star *
CF languages are closed under ∪
• Assume that we have two CF languages L1, L2.
• There are CF grammars (V1, T1, S1, P1) and (V2,
T2, S2, P2) producing them.
• Create a CF grammar for the union:
– Make sure that the sets of variables V1 and V2
are completely disjoint. If not, change the names
of the common variables.
– Add a new start variable S (not in V1 or V2) and
the rules S ⟶ S1 | S2.
– The grammar for the union is (V1 ∪ V2 ∪ {S},
T1 ∪ T2, S, P1 ∪ P2 ∪ {S ⟶ S1 | S2}).
Example
Say we want to find a grammar for the union
of L = {a^n b^n : n ≥ 0} and L_R = {ww^R : w in
{a,b}*}
G1: S → aSb | ε and G2: S → aSa | bSb | ε
The grammar for L ∪ L_R is:
S → S1 | S2
S1 → aS1b | ε
S2 → aS2a | bS2b | ε
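The construction used in this example can be sketched as a function on grammars. The representation is an assumption of the sketch: a grammar is a quadruple (variables, terminals, start, productions), where productions maps each variable to a list of right-hand sides given as tuples, and the empty tuple stands for ε. The helper names `rename` and `union` are hypothetical, and the sketch assumes that the renamed variables and the new start variable do not clash with any existing symbol.

```python
def rename(grammar, suffix):
    """Append `suffix` to every variable so two grammars become disjoint."""
    variables, terminals, start, productions = grammar
    new_name = {v: v + suffix for v in variables}
    new_productions = {
        new_name[lhs]: [tuple(new_name.get(x, x) for x in rhs) for rhs in alts]
        for lhs, alts in productions.items()
    }
    return set(new_name.values()), set(terminals), new_name[start], new_productions


def union(g1, g2, new_start="S"):
    """Build a grammar for the union: new start S with S -> S1 | S2."""
    v1, t1, s1, p1 = rename(g1, "1")
    v2, t2, s2, p2 = rename(g2, "2")
    productions = {**p1, **p2, new_start: [(s1,), (s2,)]}
    return v1 | v2 | {new_start}, t1 | t2, new_start, productions


G1 = ({"S"}, {"a", "b"}, "S", {"S": [("a", "S", "b"), ()]})
G2 = ({"S"}, {"a", "b"}, "S", {"S": [("a", "S", "a"), ("b", "S", "b"), ()]})

variables, terminals, start, productions = union(G1, G2)
for lhs in sorted(productions):
    print(lhs, "→", " | ".join("".join(rhs) or "ε" for rhs in productions[lhs]))
```

For concatenation, the same sketch would only need the last rule changed to the single right-hand side (s1, s2).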
Try it yourself
Prove that CF languages are closed under ∘
Same as before. Just add a new start variable S
and a new rule S → S1S2
CF languages are closed under *
• Assume that L is a CF language.
• There is a CF grammar (V, T, S, P) for L.
• Create a CF grammar for L*:
– Create a new start variable S’
– Add the rules S’ ⟶ S’S’| S | ε
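The star construction can be sketched in the same style. The grammar representation (a quadruple whose productions map each variable to a list of right-hand-side tuples, with the empty tuple for ε) and the function name `star` are assumptions of this sketch, not notation from the slides.

```python
def star(grammar, new_start="S'"):
    """Build a grammar for L*: new start S' with S' -> S'S' | S | ε."""
    variables, terminals, start, productions = grammar
    if new_start in variables or new_start in terminals:
        raise ValueError("the new start variable must be a fresh symbol")
    new_productions = dict(productions)
    new_productions[new_start] = [(new_start, new_start), (start,), ()]
    return set(variables) | {new_start}, set(terminals), new_start, new_productions


# Grammar for {a^n b^n : n >= 0}: S -> aSb | ε
G = ({"S"}, {"a", "b"}, "S", {"S": [("a", "S", "b"), ()]})

variables, terminals, start, productions = star(G)
print(start, "→", " | ".join("".join(rhs) or "ε" for rhs in productions[start]))
# S' → S'S' | S | ε
```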