Discrete Math. and Logic II. Context-Free Grammars

SFWR ENG 2FA3
Ryszard Janicki
Winter 2014
Acknowledgments: Material partially based on Automata and Computability by Dexter C. Kozen (Chapter 19).
Introduction
An example of a context-free grammar
The objects ⟨xxx⟩ are called nonterminal symbols.
Each nonterminal symbol generates a set of strings over a finite alphabet Σ in a systematic way.
For example, the nonterminal ⟨arith-expr⟩ in
    ⟨assg-stmt⟩ ::= ⟨var⟩ := ⟨arith-expr⟩
generates the set of syntactically correct arithmetic expressions in this language.
The strings corresponding to the nonterminal ⟨xxx⟩ are generated using rules with ⟨xxx⟩ on the left-hand side.
The alternatives on the right-hand side, separated by vertical bars |, describe different ways strings corresponding to ⟨xxx⟩ can be generated.
These alternatives may involve other nonterminals ⟨yyy⟩, which must be further eliminated by applying rules with ⟨yyy⟩ on the left-hand side.
For example, the string
    while x ≤ y do begin x := (x + 1); y := y − 1 end
is generated by the nonterminal ⟨stmt⟩.
We obtain the while statement from ⟨stmt⟩ through a sequence of expressions called sentential forms.
Each sentential form is derived from the previous one by an application of one of the rules:
    ⟨stmt⟩
    ⟨while-stmt⟩
    while ⟨bool-expr⟩ do ⟨stmt⟩
    while ⟨arith-expr⟩ ⟨compare-op⟩ ⟨arith-expr⟩ do ⟨stmt⟩
    while ⟨var⟩ ⟨compare-op⟩ ⟨arith-expr⟩ do ⟨stmt⟩
    while ⟨var⟩ ≤ ⟨arith-expr⟩ do ⟨stmt⟩
    while ⟨var⟩ ≤ ⟨var⟩ do ⟨stmt⟩
    while x ≤ ⟨var⟩ do ⟨stmt⟩
    while x ≤ y do ⟨stmt⟩
    while x ≤ y do ⟨begin-stmt⟩
...
Applying different rules will yield different results, for example:
    begin if z = (x + 3) then y := z else y := x end
The set of all strings not containing any nonterminals generated by the grammar is called the language generated by the grammar.
In general, this set of strings may be infinite, even if the set of rules is finite.
There may also be several different derivations of the same string.
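A grammar is said to be unambiguous if no string has more than one derivation (derivations that differ only in the order in which the same rules are applied are not counted as different).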
The language (a subset of Σ∗) generated by the context-free grammar G is denoted L(G).
A subset of Σ∗ is called a context-free language (CFL) if it is L(G) for some CFG G.
CFLs are good for describing infinite sets of strings in a finite way.
They are particularly useful in computer science for describing the syntax of
    programming languages,
    well-formed arithmetic expressions,
    well-nested begin-end blocks,
    strings of balanced parentheses.
All regular sets are CFLs, but not every CFL is regular.
Pushdown Automata (PDAs): A Preview
A pushdown automaton (PDA) is like a finite automaton, except it has a stack or pushdown store, which it can use to record a potentially unbounded amount of information.
Its input head is read-only and may only move right.
The machine can store information on the stack in a last-in-first-out (LIFO) fashion.
It can push symbols onto the top of the stack or pop them off the top of the stack.
It may not read down into the stack without popping the top symbols off.
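The stack discipline described above can be illustrated with a small Python sketch. It is not a formal PDA, only an assumption-laden illustration: the function name `balanced` is hypothetical, and the language chosen (strings of balanced parentheses) is one of the CFL examples mentioned earlier. The input is read once from left to right, and the only access to stored information is pushing onto or popping from the top of the stack.

```python
def balanced(s):
    """Return True if s is a string of balanced parentheses.

    Illustrative sketch of PDA-style processing: one left-to-right pass
    over the input, with a LIFO stack as the only memory.
    """
    stack = []
    for ch in s:                 # input head moves right only
        if ch == "(":
            stack.append(ch)     # push onto the top of the stack
        elif ch == ")":
            if not stack:        # nothing to match: reject
                return False
            stack.pop()          # pop the top of the stack
        else:
            return False         # symbol outside the alphabet {(, )}
    return not stack             # accept only if the stack is empty


print(balanced("(()())"))
print(balanced("())("))
```

A finite automaton cannot do this in general, because the nesting depth it would have to remember is unbounded.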
[Figure: a pushdown automaton. A finite control (labelled Q) reads the input a1 a2 ... an from left to right, read-only, and performs push/pop operations on a stack, shown containing the symbols A, B, C, B.]
Formal Definition of CFGs and CFL
Definition
A context-free grammar (CFG) is a quadruple G = (N, Σ, P, S), where
    N is a finite set (the nonterminal symbols),
    Σ is a finite set (the terminal symbols) disjoint from N,
    P is a finite subset of N × (N ∪ Σ)∗ (the productions),
    S ∈ N (the start symbol).
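The quadruple can be written down directly as a data structure. The following Python sketch is one possible encoding, not a prescribed one: the class name `CFG`, its field names, and the method `is_well_formed` are assumptions made for illustration. Each production (A, γ) is stored as a pair whose second component is a tuple of symbols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    productions: frozenset   # P: pairs (A, γ), γ a tuple over N ∪ Σ
    start: str               # S

    def is_well_formed(self) -> bool:
        """Check the side conditions of the definition."""
        symbols = self.nonterminals | self.terminals
        return (
            self.nonterminals.isdisjoint(self.terminals)  # Σ disjoint from N
            and self.start in self.nonterminals            # S ∈ N
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions           # P ⊆ N × (N ∪ Σ)*
            )
        )


# A small grammar with productions S → aSb and S → ε (the empty tuple).
G = CFG(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    productions=frozenset({("S", ("a", "S", "b")), ("S", ())}),
    start="S",
)
print(G.is_well_formed())
```

Finiteness of N, Σ and P is automatic here, since the sets are built explicitly.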
We use capital letters A, B, C, ... for nonterminals.
We use a, b, c, ... for terminal symbols.
Strings in (N ∪ Σ)∗ are denoted α, β, γ, ...
Instead of writing productions as (A, α), we write A −→ α.
We often use the vertical bar | to abbreviate a set of productions with the same left-hand side.
Example
Instead of writing
    A −→ α1,
    A −→ α2,
    A −→ α3,
we write
    A −→ α1 | α2 | α3
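The abbreviation can be expanded mechanically. The Python sketch below assumes a hypothetical textual rule format `A -> x | y | z` in which every symbol is a single character; the helper name `expand` is illustrative only.

```python
def expand(rule):
    """Expand 'A -> x | y | z' into the list of pairs (A, x), (A, y), (A, z).

    An empty alternative stands for the empty string.
    """
    lhs, rhs = rule.split("->", 1)
    return [(lhs.strip(), alt.strip()) for alt in rhs.split("|")]


print(expand("A -> aAb | c | d"))
```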
Definition
If α, β ∈ (N ∪ Σ)∗, we say that β is derivable from α in one step and write
    $\alpha \xrightarrow[G]{1} \beta$
if β can be obtained from α by replacing some occurrence of a nonterminal A in α with γ, where A −→ γ is in P.
Example
Suppose A −→ γ is a production in P, and
    α = α1 A α2, where α1, α2 ∈ (N ∪ Σ)∗,
    β = α1 γ α2.
Then we have $\alpha \xrightarrow[G]{1} \beta$.
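One-step derivability can be computed by trying every occurrence of a nonterminal and every production for it. The following Python sketch assumes single-character symbols, so that strings over N ∪ Σ are ordinary Python strings; the names `one_step`, `N` and `P` are chosen for illustration.

```python
def one_step(alpha, productions, nonterminals):
    """Return the set of all β such that α derives β in one step.

    For each position i holding a nonterminal A and each production
    (A, γ), build β = α1 γ α2 with α1 = alpha[:i], α2 = alpha[i+1:].
    """
    results = set()
    for i, symbol in enumerate(alpha):
        if symbol in nonterminals:
            for lhs, gamma in productions:
                if lhs == symbol:
                    results.add(alpha[:i] + gamma + alpha[i + 1:])
    return results


N = {"S"}
P = [("S", "aSb"), ("S", "")]   # S → aSb and S → ε

print(one_step("aSb", P, N))
```

A string with no nonterminals has no one-step successors, so the function returns the empty set for it.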
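Definition
Let $\xrightarrow[G]{*}$ be the reflexive transitive closure of the relation $\xrightarrow[G]{1}$; that is, define
    $\alpha \xrightarrow[G]{0} \alpha$
    $\alpha \xrightarrow[G]{n+1} \beta$ if there exists γ such that $\alpha \xrightarrow[G]{n} \gamma$ and $\gamma \xrightarrow[G]{1} \beta$
    $\alpha \xrightarrow[G]{*} \beta$ if $\exists (n \mid n \ge 0 : \alpha \xrightarrow[G]{n} \beta)$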
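The inductive definition translates into a loop that applies the one-step relation n times. This Python sketch again assumes single-character symbols; the function names and the step bound `max_steps` are illustrative assumptions. The bound is needed because the starred relation quantifies over all n ≥ 0, which a simple loop cannot search exhaustively.

```python
def one_step(alpha, productions, nonterminals):
    """All β derivable from α in exactly one step."""
    return {
        alpha[:i] + gamma + alpha[i + 1:]
        for i, symbol in enumerate(alpha)
        if symbol in nonterminals
        for lhs, gamma in productions
        if lhs == symbol
    }


def derives_in(alpha, beta, n, productions, nonterminals):
    """True if α derives β in exactly n steps."""
    current = {alpha}                       # strings reachable in 0 steps
    for _ in range(n):                      # n steps → n + 1 steps
        current = {b for g in current
                   for b in one_step(g, productions, nonterminals)}
    return beta in current


def derives_within(alpha, beta, max_steps, productions, nonterminals):
    """Bounded approximation of the starred relation: some n ≤ max_steps works."""
    return any(derives_in(alpha, beta, n, productions, nonterminals)
               for n in range(max_steps + 1))


N = {"S"}
P = [("S", "aSb"), ("S", "")]
print(derives_in("S", "aaabbb", 4, P, N))
```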
Terminology:
A string in (N ∪ Σ)∗ derivable from the start symbol S is called a sentential form.
A sentential form is called a sentence if it consists only of terminal symbols.
The language generated by G, denoted L(G), is the set of all sentences:
    $L(G) = \{ x \in \Sigma^* \mid S \xrightarrow[G]{*} x \}$
A subset B ⊆ Σ∗ is a context-free language (CFL) if B = L(G) for some context-free grammar G.
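Since L(G) may be infinite, a program can only list part of it. The Python sketch below collects the sentences derivable from the start symbol in at most `max_steps` steps; that bound, the single-character symbol encoding, and the function names are assumptions made for illustration.

```python
def one_step(alpha, productions, nonterminals):
    """All β derivable from α in exactly one step."""
    return {
        alpha[:i] + gamma + alpha[i + 1:]
        for i, symbol in enumerate(alpha)
        if symbol in nonterminals
        for lhs, gamma in productions
        if lhs == symbol
    }


def sentences(start, productions, nonterminals, max_steps):
    """Sentences (terminal-only strings) derivable in at most max_steps steps."""
    found = set()
    frontier = {start}                       # sentential forms after 0 steps
    for _ in range(max_steps + 1):
        # Keep the sentential forms that contain no nonterminal.
        found |= {s for s in frontier
                  if not any(c in nonterminals for c in s)}
        frontier = {b for a in frontier
                    for b in one_step(a, productions, nonterminals)}
    return found


N = {"S"}
P = [("S", "aSb"), ("S", "")]
print(sorted(sentences("S", P, N, 4), key=len))
```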
Example
The nonregular set $\{a^n b^n \mid n \ge 0\}$ is a CFL.
It is generated by the grammar G = (N, Σ, P, S), where
    N = {S}, Σ = {a, b}, P = {S −→ aSb, S −→ ε}.
Here is a derivation of $a^3 b^3$ in G:
    $S \xrightarrow[G]{1} aSb \xrightarrow[G]{1} aaSbb \xrightarrow[G]{1} aaaSbbb \xrightarrow[G]{1} aaabbb$
In general, $S \xrightarrow[G]{n+1} a^n b^n$ for every n ≥ 0.
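The step count can be checked by replaying the derivation. In this Python sketch (the function name `derivation` is illustrative), the production S → aSb is applied n times and S → ε once, for n + 1 steps in total. Each sentential form contains at most one S, so `str.replace` rewrites exactly one occurrence.

```python
def derivation(n):
    """Sentential forms in the derivation of a^n b^n from S."""
    forms = ["S"]
    for _ in range(n):
        forms.append(forms[-1].replace("S", "aSb"))   # apply S → aSb
    forms.append(forms[-1].replace("S", ""))          # apply S → ε
    return forms


forms = derivation(3)
print(" -> ".join(f if f else "ε" for f in forms))
print(len(forms) - 1)   # number of derivation steps
```

This only confirms individual cases; the general statement follows by induction on n.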