Context Free Grammar (CFG)

Decision, Computation and Language
Context-free Grammar (CFG)
Dr. Muhammad S Khan
([email protected])
Ashton Building, Room G22
http://www.csc.liv.ac.uk/~khan/comp218
The Chomsky Hierarchy
Languages exist which are not regular; Noam Chomsky
categorised regular and other languages as follows:
Language Class   Grammar             Automaton
3                Regular             NFA or DFA
2                Context-Free        Push Down Automaton
1                Context-Sensitive   Linear-Bounded Automaton
0                Unrestricted        Turing machine
M S Khan (Univ. of Liverpool)
COMP218 Decision, Computation and Language
Type 3 - Regular Languages
A regular language is one which can be:
represented by a regular grammar,
described using a regular expression, or
accepted using an NFA or a DFA.
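As a small illustration of two of these equivalent descriptions, the sketch below checks the same language with a regular expression and with a hand-written DFA. The language (strings over {a, b} ending in "ab"), the state names q0, q1, q2 and the function names are assumptions chosen for the example, not taken from the slides.

```python
import re

# Hypothetical example language: strings over {a, b} that end in "ab".
PATTERN = re.compile(r"[ab]*ab")

# DFA transition table (state names are illustrative).
# q0: no progress, q1: last symbol was 'a', q2: last two symbols were "ab".
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q0",
}
START, ACCEPTING = "q0", {"q2"}


def dfa_accepts(word):
    """Run the DFA on word; reject on any symbol outside the alphabet."""
    state = START
    for symbol in word:
        if (state, symbol) not in DELTA:
            return False
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


def regex_accepts(word):
    """Check the same language with the regular expression."""
    return PATTERN.fullmatch(word) is not None
```

Both functions should agree on every input string, which is what "regular" means in practice: the grammar, the expression and the automaton are different descriptions of one language.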
Type 2 - Context-Free Languages
A Context-Free Grammar (CFG) is one whose production
rules are of the form:
𝐴→𝛼
where 𝐴 is any single non-terminal, and 𝛼 is any string of
terminals and non-terminals (possibly empty).
An NFA/DFA cannot recognise every language of this type,
since for some of them (e.g. matching brackets) we must be
able to "remember" an unbounded amount of information.
A context-free language is accepted by a Push-Down Automaton,
which is like an NFA except that it is also allowed to use a
stack (memory).
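The role of the stack can be sketched with the language { a^n b^n : n >= 0 }, which no finite automaton accepts. The function below is only in the spirit of a push-down automaton (it is ordinary Python, not a formal PDA), and the name accepts_anbn and the stack symbol "A" are assumptions made for the example.

```python
def accepts_anbn(word):
    """Stack-based check for { a^n b^n : n >= 0 }."""
    stack = []
    seen_b = False
    for symbol in word:
        if symbol == "a":
            if seen_b:           # an 'a' after a 'b' is out of order
                return False
            stack.append("A")    # remember one unmatched 'a'
        elif symbol == "b":
            seen_b = True
            if not stack:        # more b's than a's
                return False
            stack.pop()          # match this 'b' with an earlier 'a'
        else:
            return False         # symbol outside the alphabet
    return not stack             # accept only if every 'a' was matched
```

The stack is what lets the check count arbitrarily many a's; a DFA has only finitely many states and so cannot keep such a count.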
Type 1 - Context-Sensitive Languages
Context-Sensitive grammars may have more than one
symbol on the left-hand-side of their production rules
(provided that at least one of them is a non-terminal).
However, the production rules must now obey the
following:
The number of symbols on the left-hand side must not exceed the number of symbols on the right-hand side.
We do not allow rules of the form 𝐴 → 𝜀 unless 𝐴 is the start symbol and does not occur on the right-hand side of any rule.
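The two conditions above can be checked mechanically. The sketch below assumes a simple encoding that is not part of the slides: each rule is a pair (lhs, rhs) of strings, uppercase letters are non-terminals, lowercase letters are terminals, and the empty string stands for the empty word. The example grammar is a commonly used one for { a^n b^n c^n : n >= 1 }.

```python
def is_nonterminal(symbol):
    # Assumed convention: uppercase letters are non-terminals.
    return symbol.isupper()


def obeys_context_sensitive_conditions(rules, start):
    """Check the stated conditions on a list of (lhs, rhs) rules."""
    start_on_rhs = any(start in rhs for _, rhs in rules)
    for lhs, rhs in rules:
        if not any(is_nonterminal(s) for s in lhs):
            return False          # left side needs at least one non-terminal
        if rhs == "":
            # An empty right side is allowed only for a start symbol
            # that never appears on any right-hand side.
            if lhs != start or start_on_rhs:
                return False
        elif len(lhs) > len(rhs):
            return False          # left side longer than right side
    return True


# A commonly used grammar for { a^n b^n c^n : n >= 1 }.
ANBNCN = [
    ("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]
```

For instance, obeys_context_sensitive_conditions(ANBNCN, "S") holds, while the context-free rules S → aSb and S → (empty word) fail the second condition because S also occurs on a right-hand side.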
Type 0 - Unrestricted (Free) Languages
Free grammars have absolutely no restrictions on their
grammar rules (except, of course, that there must be at
least one non-terminal on the left-hand side).
The type of automaton which can recognise such a language
is essentially a finite automaton with an infinitely long tape at
its disposal to use as a read/write store; this is called a
Turing machine.
Context-free Grammars (CFG)
Grammar
Context-free Grammars
Show how CFGs can be converted into "normal forms",
i.e. equivalent CFGs that have additional syntactic
restrictions.
Use normal form to show that "pushdown automata" are
the class of machines that accept CFLs.
Parsing is the process of checking that a sequence of
symbols is generated by a context-free grammar.
Consider classes of parsing algorithms expressible as
restricted classes of pushdown automata (note: general
pushdown automata are impractical to implement, unlike
finite automata).
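One way to see why normal forms help with parsing is the CYK membership test, which works on grammars in Chomsky normal form (every rule is X → Y Z or X → a). The slides do not say which normal form or parsing algorithm they use, so both the choice of CYK and the example grammar below (S → AB | AT, T → SB, A → a, B → b, for a^n b^n with n >= 1) are assumptions made for illustration.

```python
def cyk_accepts(word, unit_rules, pair_rules, start):
    """CYK membership test for a grammar in Chomsky normal form.

    unit_rules: set of (X, a) for rules X -> a
    pair_rules: set of (X, Y, Z) for rules X -> Y Z
    """
    n = len(word)
    if n == 0:
        return False  # this sketch ignores a possible start -> epsilon rule
    # table[i][j] holds the variables deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][0] = {x for (x, a) in unit_rules if a == symbol}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for (x, y, z) in pair_rules:
                    if y in left and z in right:
                        table[i][length - 1].add(x)
    return start in table[0][n - 1]


# Hypothetical CNF grammar for { a^n b^n : n >= 1 }.
UNIT = {("A", "a"), ("B", "b")}
PAIR = {("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")}
```

With these rules, cyk_accepts("aabb", UNIT, PAIR, "S") succeeds and cyk_accepts("aab", UNIT, PAIR, "S") fails. The restricted rule shapes are what make the simple table-filling loop possible.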
Context-free Grammars
Example
Example of using the grammar
Another Example
Example of using the grammar
Finite languages
Notation
Notation
Types of rules in CFG
There are three types of rules in a CFG:
1. Union Rule: 𝑆 → 𝐴 | 𝐵
2. Production Rule: 𝑆 → 𝐴𝐵
3. Closure Rule: 𝑆 → 𝐴𝑆 | 𝜀
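To see what each rule type generates, suppose (purely as an assumption for this illustration) that 𝐴 derives only the string a and 𝐵 derives only the string b. Writing L(X) for the set of strings derivable from X, the three rules then give:

```latex
\begin{align*}
\text{Rule 1 (union):}         \quad & S \to A \mid B            && L(S) = L(A) \cup L(B) = \{a,\, b\} \\
\text{Rule 2 (concatenation):} \quad & S \to AB                  && L(S) = L(A)\,L(B) = \{ab\} \\
\text{Rule 3 (closure):}       \quad & S \to AS \mid \varepsilon && L(S) = L(A)^{*} = \{\varepsilon,\, a,\, aa,\, aaa,\, \ldots\}
\end{align*}
```

These correspond to the three operations used to build regular expressions, except that in a CFG the variables may also refer back to themselves in more general ways.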
Examples
More Examples
𝐿 = πœ€, π‘Ž, 𝑏, 𝑏𝑏, π‘Žπ‘Ž, π‘Žπ‘Žπ‘Ž, 𝑏𝑏𝑏, … , π‘Žπ‘› , 𝑏 𝑛 ∢ 𝑛 β‰₯ 0
𝐿 = πœ€, π‘Žπ‘, π‘Žπ‘Žπ‘π‘, π‘Žπ‘Žπ‘Žπ‘π‘π‘, … , π‘Žπ‘› 𝑏 𝑛 ∢ 𝑛 β‰₯ 0
𝐿 = πœ€, π‘Žπ‘, π‘Žπ‘π‘Žπ‘, π‘Žπ‘π‘Žπ‘π‘Žπ‘, … , π‘Žπ‘
𝑛
βˆΆπ‘›β‰₯0
𝐿 = π‘Žπ‘› 𝑏 π‘š ∢ 𝑛, π‘š β‰₯ 0
𝐿 = π‘Žπ‘› 𝑏 π‘š 𝑐 π‘˜ : 𝑛, π‘š, π‘˜ β‰₯ 0
Well-balanced parentheses, e.g. (()(()))()
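Possible grammars for some of these languages can be tried out with a small bounded generator. Everything in the sketch below is an assumption made for illustration: the dictionary encoding of rules, the function name generate, and the particular grammars (each is one possible grammar for its language, not necessarily the one intended in the lecture).

```python
from collections import deque


def generate(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    rules maps each non-terminal to a list of right-hand sides; "" stands
    for the empty word. Termination assumes the grammar cannot grow a
    sentential form forever without adding terminals (true for the
    grammars below).
    """
    results, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if any.
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            results.add(form)      # no non-terminals left: a generated string
            continue
        for rhs in rules[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            terminals = sum(1 for s in new_form if s not in rules)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results


ANBN = {"S": ["aSb", ""]}                                 # a^n b^n
ABN = {"S": ["abS", ""]}                                  # (ab)^n
ANBM = {"S": ["AB"], "A": ["aA", ""], "B": ["bB", ""]}    # a^n b^m
PARENS = {"S": ["(S)S", ""]}                              # balanced parentheses
```

For example, generate(ANBN, "S", 4) yields the empty string, "ab" and "aabb". Expanding only the leftmost non-terminal is enough here, because every string generated by a CFG has a leftmost derivation.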
Regular Languages are Context-free...
Regular Grammar
An equivalent definition
The above definition is used in Hopcroft and Ullman. The previous definition is used
in Tremblay and Sorenson.
Conversion to "restricted" form
Regular Grammar to NFA
How to prove the construction is valid
Example
Example of conversion from grammar to FA
Example of conversion from grammar to FA
Example of conversion from grammar to FA
Another Example
Example (contd.)
Example: some arithmetic expressions
Ambiguity
Parsing is the problem of: given a grammar and a string,
find a derivation of the string using the grammar (see the
example on the previous slide).
Given our claim that programming languages are often
described using grammars, this is a key problem for
compilers.
We prefer "unambiguous" CFGs, i.e. ones where all
derivations of a string are "essentially the same". This
matters because the variables of a CFG often correspond to
specific structures of a program, e.g. arithmetic expression,
procedure, statement, method, etc.
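Ambiguity can be made concrete by counting parse trees. The sketch below uses a standard textbook grammar, E → E+E | E*E | a, which is an assumption here and not necessarily the arithmetic grammar from the previous slide. A string with more than one parse tree shows that the grammar is ambiguous.

```python
from functools import lru_cache


def count_parse_trees(word):
    """Count parse trees of word under the grammar E -> E+E | E*E | a."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving word[i:j] from E.
        total = 1 if word[i:j] == "a" else 0
        # Try every operator position k as the top-level operator,
        # leaving at least one symbol on each side.
        for k in range(i + 1, j - 1):
            if word[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(word))
```

For the string a+a*a the count is 2: one tree groups it as a+(a*a) and the other as (a+a)*a. For a compiler these two trees would mean different evaluation orders, which is why an unambiguous grammar is preferred.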
Ambiguity (continued)
Parse Trees (a.k.a. syntax trees, derivation trees)
Parse Trees (continued)
Rewriting a grammar
Rewriting a grammar (contd.)
Leftmost/rightmost derivations
Leftmost/rightmost derivations (contd.)
Another example of a simple ambiguous grammar
Example (contd.)