faber f10

CD5560
FABER
Formal Languages, Automata
and Models of Computation
Lecture 7
Mälardalen University
2010
1
Content
Midterm results
Regular vs. Nonregular Languages
Context-Free Languages
Context-Free Grammars
Derivation Trees. Ambiguity
Push-Down Automata, PDA
Applications
2
Midterm 1 Solutions
Problem 1. (4 points) Circle true statements among the following:
The number of outgoing arcs from a state of a DFA is always equal
to |Σ|. (True)
The number of outgoing arcs from a state of an NFA is always equal
to |Σ|. (False)
Not all finite languages are regular. (False)
If L is a regular language, then L^R is also a regular language. (True)
If L is a regular language, then L^2 may not be a regular language.
(False)
The family of regular languages is closed under intersection. (True)
The grammar G = ({S}, {a}, S, {S → Saaa | aS | a})
is not regular. (True)
The grammar G = ({S}, {a, b}, S, {S → aS | bS | λ})
is regular. (True)
3
Problem 2. (6 points)
Convert the following NFA into a regular expression:
4
Solution: Create a new initial state and a new, unique
final state, neither of which is part of a loop.
5
Removed state 3:
6
Removed state 2:
Removed state 1:
7
b) Let M be the following DFA.
Minimize M by set partitioning.
8
State Reduction by Set Partitioning
Step 0: Partition the states into two groups,
accepting and non-accepting.
P1 = {1, 3}
P2 = {2, 4, 5, 6}
10
State Reduction by Set Partitioning
Step 1: Get the response of each state for
each input symbol.
Group P1 = {1, 3}
state:  1   3
a:      p2  p2
b:      p2  p2
Group P2 = {2, 4, 5, 6}
state:  2   4   5   6
a:      p1  p2  p2  p2
b:      p2  p1  p2  p1
11
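Step 2: Partition the sets according to the responses,
and go to Step 1 until no partition occurs.
P1 stays whole and P2 splits by response:
P11 = {1, 3}, P21 = {2}, P22 = {4, 6}, P23 = {5}
States 1 and 3 agree (a: p21, b: p22), and states 4 and 6 agree
(a: p23, b: p11), so no further partition occurs.
[Figure: responses of P11, P21, P22 and P23 to the inputs a and b]
So the final partition results are as follows:
{1, 3} {2} {4, 6} {5}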
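The partitioning steps above can be sketched in Python. This is only an illustration: the transition table `DELTA` is a hypothetical DFA chosen to agree with the responses listed in Step 1 (the original figure is not reproduced here), and treating {1, 3} as the accepting group is an assumption.

```python
def minimize_by_partitioning(states, alphabet, delta, accepting):
    """Refine the partition {accepting, non-accepting} until it is stable."""
    # Step 0: two groups, accepting and non-accepting (empty groups dropped).
    partition = [g for g in (set(accepting), set(states) - set(accepting)) if g]
    while True:
        def group_of(state):
            for index, group in enumerate(partition):
                if state in group:
                    return index
        refined = []
        for group in partition:
            # Step 1: response of each state = groups reached on each symbol.
            buckets = {}
            for state in sorted(group):
                response = tuple(group_of(delta[state][c]) for c in alphabet)
                buckets.setdefault(response, set()).add(state)
            # Step 2: states with different responses are separated.
            refined.extend(buckets.values())
        if len(refined) == len(partition):
            return refined
        partition = refined


# Hypothetical transition table (assumption, consistent with Step 1).
DELTA = {
    1: {"a": 2, "b": 4},
    2: {"a": 1, "b": 5},
    3: {"a": 2, "b": 6},
    4: {"a": 5, "b": 1},
    5: {"a": 5, "b": 2},
    6: {"a": 5, "b": 3},
}

result = minimize_by_partitioning([1, 2, 3, 4, 5, 6], "ab", DELTA, {1, 3})
print(sorted(sorted(g) for g in result))
```

Each pass only ever splits groups, so the loop stops as soon as a pass leaves the number of groups unchanged.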
12
The minimized DFA consists of the four states of
the final partition, and its transitions are
the ones corresponding to the starting DFA.
States: {1, 3}, {2}, {4, 6}, {5}
[Figure: starting DFA and minimized DFA]
13
Problem 3 (4 points) Regular or not? If regular,
construct an automaton, regular expression or a
grammar.
If not regular, use the pumping lemma for regular
languages.
a) Set of all strings of the form { 0^k 1^l : k + l = 8 }
over alphabet Σ = {0, 1}.
Solution:
Regular. The language is finite, so a finite state machine
with a finite number of states can track the order of
symbols and count the total number of characters.
14
{ 0^k 1^l : k + l = 8 }
[Figure: automaton with transitions labeled 0 and 1]
Solution by John E.
15
{ 0^k 1^l : k + l = 8 }
[Figure: transition graph with arcs labeled 0 and powers of 1 up to 1^8]
Solution by Otto N.
16
{ 0^k 1^l : k + l = 8 } =
{ 0^8 1^0, 0^7 1^1, 0^6 1^2, 0^5 1^3, 0^4 1^4, 0^3 1^5, 0^2 1^6, 0^1 1^7, 0^0 1^8 }
Solution by Fredrik P.
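Because the language in part a) is finite, membership can be checked directly. The sketch below is illustrative only; the names `in_language` and `ALL_STRINGS` are made up for this example.

```python
import re


def in_language(s):
    """True if s has the form 0^k 1^l with k + l = 8."""
    return re.fullmatch(r"0*1*", s) is not None and len(s) == 8


# The nine strings of the language, from 0^8 1^0 down to 0^0 1^8.
ALL_STRINGS = ["0" * k + "1" * (8 - k) for k in range(8, -1, -1)]
print(ALL_STRINGS)
```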
17
b) Set of all strings of the form { 0^k 1^l : k − l = 8 }
over alphabet Σ = {0, 1}.
Solution:
Not regular. Language L consists of all strings
of the form 0*1* where the number of 0's
is eight more than the number of 1's.
We can show that L is not regular by applying the pumping
lemma.
Let m be the pumping length and let w = 0^(m+8) 1^m.
Since |xy| ≤ m, y must equal 0^k for some k > 0.
We can pump y out once,
which will generate the string 0^(m+8−k) 1^m,
which is not in L because the number of 0's
is less than 8 more than the number of 1's.
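The argument above can be checked by brute force for small values of m. This is a sanity check under stated assumptions, not a proof: `in_L` and `every_split_fails` are illustrative names, and only pumping down (i = 0) is tried.

```python
def in_L(s):
    """True if s = 0^k 1^l with k - l = 8."""
    k = len(s) - len(s.lstrip("0"))   # number of leading 0's
    rest = s[k:]
    return set(rest) <= {"1"} and k - len(rest) == 8


def every_split_fails(m):
    """For w = 0^(m+8) 1^m, check every split w = xyz with |xy| <= m, |y| >= 1."""
    w = "0" * (m + 8) + "1" * m
    for xy_len in range(1, m + 1):
        for y_len in range(1, xy_len + 1):
            x = w[:xy_len - y_len]
            z = w[xy_len:]
            if in_L(x + z):          # x y^0 z would have to leave L
                return False
    return True


print(all(every_split_fails(m) for m in range(1, 8)))
```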
18
A Regular Language
[Figure: DFA with states q0, q1, q2, q3 and transitions labeled a and b]
w = ababbaaab = xyz ∈ L, with x = a, y = bab, z = baaab
There is a cycle in the graph:
q1, q3, q2, q1
The pumped strings are also in L:
x y^i z = a (bab)^i baaab ∈ L, i = 0, 1, 2, ...
19
The Pumping Lemma
Given an infinite regular language L
there exists an integer m such that
for any string w ∈ L with length |w| ≥ m
we can write w = x y z
with |x y| ≤ m and |y| ≥ 1
such that: x y^i z ∈ L for i = 0, 1, 2, ...
20
The Pumping Lemma
States a property of regular languages.
Cannot be used to prove that a language is regular!
An example: if something is a square, it always has
four edges (a property of squares). But having proved
that something has four edges does not necessarily
mean that the object is a square.
21
The Pumping Lemma
So we use the pumping lemma not to prove that a
language is regular, but to show that a language
(one not obeying the pumping lemma) cannot be
regular.
It is usually done using the contrapositive (proof
by contradiction)
22
Applying Pumping Lemma
If for arbitrary m
there exists a string w ∈ L with length |w| ≥ m
such that for all decompositions w = x y z
with |x y| ≤ m and |y| ≥ 1
there exists i ≥ 0 such that: x y^i z ∉ L
then the language is NOT REGULAR
23
Formal Definition
Grammar G = (V, T, S, P)
V: Set of variables
T: Set of terminal symbols
S: Start variable
P: Set of production rules
24
Regular Grammars
A regular grammar is any
right-linear or left-linear grammar
Examples
G1 (right-linear):
S → abS
S → a
G2 (left-linear):
S → Aab
A → Aab | B
B → a
25
A Nonregular Language
L = {a^i b^i : i ≤ n}
[Figure: automaton with a chain of states a0, a1, ..., an reading a and a chain of states b0, b1, ..., b(n−2), b(n−1) reading b]
DFA must have an infinite number of states.
States a_i, a_j are distinct for each i ≠ j
26
n l n l
{a b c
: n, l  0}{a : n  0}
n!
Non-regular languages
Context-Free Languages
n n
R
{a b }
{ww }
Regular Languages
27
Automata theory: formal languages and
formal grammars
Grammar | Languages | Automaton | Production rules
Type-0 | Recursively enumerable | Turing machine | No restrictions
Type-1 | Context-sensitive | Linear-bounded nondeterministic Turing machine | αAβ → αγβ
Type-2 | Context-free | Non-deterministic pushdown automaton | A → γ
Type-3 | Regular | Finite state automaton | A → a and A → aB
28
Context-Free Languages
29
Context-Free Languages
Context-Free
Grammars
Pushdown
Automata
30
Context-Free Grammars
31
Example
A context-free grammar G
S → aSb
S → λ
A derivation
S ⇒ aSb ⇒ aaSbb ⇒ aabb
32
A context-free grammar G
S → aSb
S → λ
Another derivation
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
33
S → aSb
S → λ
L(G) = {a^n b^n : n ≥ 0}
Describes matching parentheses: ( ( ( ( ) ) ) )
34
Example
A context-free grammar G
S → aSa
S → bSb
S → λ
A derivation
S ⇒ aSa ⇒ abSba ⇒ abba
35
A context-free grammar G
S → aSa
S → bSb
S → λ
Another derivation
S ⇒ aSa ⇒ aaSaa ⇒ aaaSaaa
⇒ aaabSbaaa ⇒ aaabbaaa
36
S → aSa
S → bSb
S → λ
L(G) = {w w^R : w ∈ {a, b}*}
37
Example
A context-free grammar G
S → aSb
S → SS
S → λ
A derivation
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ ab
38
A context-free grammar G
S → aSb
S → SS
S → λ
A derivation
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ abaSb ⇒ abab
39
S → aSb
S → SS
S → λ
L(G) = {w : n_a(w) = n_b(w),
and n_a(v) ≥ n_b(v)
in any prefix v}
Describes matched parentheses: ( ) ( ( ( ) ) ) ( ( ) )
40
Definition: Context-Free Grammars
Grammar G = (V, T, S, P)
V: Variables
T: Terminal symbols
S: Start variable
P: Productions of the form A → x,
where x is a string of variables and terminals
41
Definition: Context-Free Languages
A language L is context-free
if and only if there is a context-free grammar G
with L = L(G)
42
Derivation Order
1. S → AB
2. A → aaA
3. A → λ
4. B → Bb
5. B → λ
Leftmost derivation (rule numbers in parentheses)
S ⇒(1) AB ⇒(2) aaAB ⇒(3) aaB ⇒(4) aaBb ⇒(5) aab
43
Derivation Order
1. S → AB
2. A → aaA
3. A → λ
4. B → Bb
5. B → λ
Rightmost derivation (rule numbers in parentheses)
S ⇒(1) AB ⇒(4) ABb ⇒(5) Ab ⇒(2) aaAb ⇒(3) aab
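The two derivation orders can be replayed with a short sketch. It assumes that uppercase letters are variables and lowercase letters are terminals; `RULES` and `derive` are illustrative names, and the empty string stands for λ.

```python
# Rule number -> (variable, right-hand side); "" stands for λ.
RULES = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""), 4: ("B", "Bb"), 5: ("B", "")}


def derive(rule_order, leftmost=True):
    """Apply the rules in order, always rewriting the leftmost (or rightmost) variable."""
    form = "S"
    steps = [form]
    for number in rule_order:
        head, body = RULES[number]
        positions = [i for i, ch in enumerate(form) if ch.isupper()]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"rule {number} does not apply to {form!r}")
        form = form[:i] + body + form[i + 1:]
        steps.append(form)
    return steps


print(" => ".join(derive([1, 2, 3, 4, 5], leftmost=True)))
print(" => ".join(derive([1, 4, 5, 2, 3], leftmost=False)))
```

Both calls end in the same string, aab, although the intermediate sentential forms differ.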
44
S → aAB
A → bBb
B → A | λ
Leftmost derivation
S ⇒ aAB ⇒ abBbB ⇒ abAbB ⇒ abbBbbB
⇒ abbbbB ⇒ abbbb
45
S → aAB
A → bBb
B → A | λ
Rightmost derivation
S ⇒ aAB ⇒ aA ⇒ abBb ⇒ abAb
⇒ abbBbb ⇒ abbbb
46
Derivation Trees
47
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB
Tree so far: S(A, B)
48
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB
Tree so far: S(A(a, a, A), B)
49
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb
Tree so far: S(A(a, a, A), B(B, b))
50
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb
Tree so far: S(A(a, a, A(λ)), B(B, b))
51
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
Derivation Tree: S(A(a, a, A(λ)), B(B(λ), b))
52
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab
Derivation Tree: S(A(a, a, A(λ)), B(B(λ), b))
yield: a a λ λ b = aab
53
Partial Derivation Trees
S → AB
A → aaA | λ
B → Bb | λ
S ⇒ AB
Partial derivation tree: S(A, B)
54
S ⇒ AB ⇒ aaAB
Partial derivation tree: S(A(a, a, A), B)
55
S ⇒ AB ⇒ aaAB
Partial derivation tree: S(A(a, a, A), B)
yield: aaAB (a sentential form)
56
Sometimes, derivation order doesn’t matter
Leftmost:
S ⇒ AB ⇒ aaAB ⇒ aaB ⇒ aaBb ⇒ aab
Rightmost:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aaAb ⇒ aab
Same derivation tree: S(A(a, a, A(λ)), B(B(λ), b))
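A derivation tree and its yield can be sketched with nested tuples. The representation is an assumption made for illustration: a leaf is a string (the empty string for λ) and an inner node is a pair of a variable and its list of children.

```python
# Derivation tree for aab under S -> AB, A -> aaA | λ, B -> Bb | λ.
TREE = ("S", [
    ("A", ["a", "a", ("A", [""])]),
    ("B", [("B", [""]), "b"]),
])


def tree_yield(node):
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(node, str):
        return node
    _variable, children = node
    return "".join(tree_yield(child) for child in children)


print(tree_yield(TREE))
```

The tree records which rule was applied to each variable but not the order of application, which is why the leftmost and the rightmost derivation share it.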
57
Ambiguity
58
E → E + E | E * E | (E) | a
String: a + a * a
(* denotes multiplication)
Leftmost derivation:
E ⇒ E + E ⇒ a + E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a
Derivation tree: E(E(a), +, E(E(a), *, E(a)))
59
E → E + E | E * E | (E) | a
String: a + a * a
Leftmost derivation:
E ⇒ E * E ⇒ E + E * E
⇒ a + E * E ⇒ a + a * E
⇒ a + a * a
Derivation tree: E(E(E(a), +, E(a)), *, E(a))
60
E → E + E | E * E | (E) | a
String: a + a * a
Tree 1: E(E(a), +, E(E(a), *, E(a)))
Tree 2: E(E(E(a), +, E(a)), *, E(a))
61
E → E + E | E * E | (E) | a
String: a + a * a
Two derivation trees
Tree 1: E(E(a), +, E(E(a), *, E(a)))
Tree 2: E(E(E(a), +, E(a)), *, E(a))
62
The grammar
E → E + E | E * E | (E) | a
is ambiguous!
String a + a * a has two derivation trees
Tree 1: E(E(a), +, E(E(a), *, E(a)))
Tree 2: E(E(E(a), +, E(a)), *, E(a))
63
The grammar
E → E + E | E * E | (E) | a
is ambiguous:
string a + a * a has two leftmost derivations
E ⇒ E + E ⇒ a + E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E
⇒ a + a * E ⇒ a + a * a
64
Definition
A context-free grammar G is ambiguous
if some string w ∈ L(G) has
two or more derivation trees
(two or more leftmost/rightmost derivations)
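For the grammar E → E + E | E * E | (E) | a, the number of derivation trees of a string can be counted by trying every operator as the root. This is a rough sketch; `count_trees` is an illustrative name and the input is assumed to be a string of the single-character tokens a, +, *, ( and ).

```python
from functools import lru_cache


def count_trees(tokens):
    """Count the derivation trees of tokens under E -> E+E | E*E | (E) | a."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j <= i:
            return 0
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0          # E -> a
        total = 0
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                 # E -> (E)
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):                  # E -> E+E or E -> E*E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_trees("a+a*a"))
```

A count of two or more for any string is exactly what the definition above calls ambiguity.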
65
Why do we care about ambiguity?
a + a * a, with a = 2
Tree 1: E(E(a), +, E(E(a), *, E(a)))
Tree 2: E(E(E(a), +, E(a)), *, E(a))
66
Why do we care about ambiguity?
2 + 2 * 2
Tree 1: E(E(2), +, E(E(2), *, E(2)))
Tree 2: E(E(E(2), +, E(2)), *, E(2))
67
Why do we care about ambiguity?
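2 + 2 * 2
Tree 1, E(E(2), +, E(E(2), *, E(2))), evaluated bottom-up:
2 * 2 = 4, then 2 + 4 = 6
2 + 2 * 2 = 6
Tree 2, E(E(E(2), +, E(2)), *, E(2)), evaluated bottom-up:
2 + 2 = 4, then 4 * 2 = 8
2 + 2 * 2 = 8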
68
Correct result:
2 + 2 * 2 = 6
Tree: E(E(2), +, E(E(2), *, E(2))), evaluated bottom-up:
2 * 2 = 4, then 2 + 4 = 6
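The effect of the two trees can be reproduced by evaluating them bottom-up. In this sketch a tree is assumed to be either an integer leaf or a triple (left, operator, right); the names are illustrative.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree):
    """Evaluate a derivation tree bottom-up."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))


plus_at_root = (2, "+", (2, "*", 2))     # multiplication grouped first
times_at_root = ((2, "+", 2), "*", 2)    # addition grouped first

print(evaluate(plus_at_root), evaluate(times_at_root))
```

The same input string therefore gets two different values, depending only on which tree the parser happens to build.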
69
Ambiguity is bad
for programming languages
We want to remove ambiguity!
70
We fix the ambiguous grammar
E → E + E | E * E | (E) | a
by introducing new variables T and F
that fix the grouping (precedence);
parentheses ( ) still allow explicit grouping.
Non-ambiguous grammar:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
71
Grammar: E → E + T | T, T → T * F | F, F → (E) | a
String: a + a * a
E ⇒ E + T ⇒ T + T ⇒ F + T ⇒ a + T ⇒ a + T * F
⇒ a + F * F ⇒ a + a * F ⇒ a + a * a
Derivation tree: E(E(T(F(a))), +, T(T(F(a)), *, F(a)))
72
Unique derivation tree
String: a + a * a
Tree: E(E(T(F(a))), +, T(T(F(a)), *, F(a)))
73
The grammar G:
E → E + T
E → T
T → T * F
T → F
F → (E)
F → a
is non-ambiguous:
every string w ∈ L(G) has a unique
derivation tree
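A hand-written parser for this grammar might look like the sketch below. Two assumptions are made for illustration: integers stand in for the terminal a, and the left-recursive rules E → E + T and T → T * F are handled with loops, a standard rewriting that keeps the same grouping. This is not taken from the slides.

```python
def parse(tokens):
    """Evaluate a token list according to E -> E+T | T, T -> T*F | F, F -> (E) | int."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(token):
        nonlocal pos
        if peek() != token:
            raise SyntaxError(f"expected {token!r} at position {pos}")
        pos += 1

    def factor():                       # F -> (E) | int
        nonlocal pos
        token = peek()
        if token == "(":
            pos += 1
            value = expr()
            expect(")")
            return value
        if isinstance(token, int):
            pos += 1
            return token
        raise SyntaxError(f"unexpected token {token!r} at position {pos}")

    def term():                         # T -> T*F | F, written as a loop
        value = factor()
        while peek() == "*":
            expect("*")
            value = value * factor()
        return value

    def expr():                         # E -> E+T | T, written as a loop
        value = term()
        while peek() == "+":
            expect("+")
            value = value + term()
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r} at position {pos}")
    return result


print(parse([2, "+", 2, "*", 2]))
```

Because multiplication is only ever parsed inside `term`, it always binds tighter than addition, so each input has exactly one grouping.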
74
Inherent Ambiguity
Some context-free languages
have only ambiguous grammars!
Example:
L = {a^n b^n c^m} ∪ {a^n b^m c^m}
S → S1 | S2
S1 → S1 c | A
A → aAb | λ
S2 → a S2 | B
B → bBc | λ
75
The string a^n b^n c^n
has two derivation trees
[Figure: one tree starts S ⇒ S1 ⇒ S1 c, the other starts S ⇒ S2 ⇒ a S2]
76
Definition: Context-Free Grammars
Grammar G = (V, T, S, P)
V: Variables
T: Terminal symbols
S: Start variable
P: Productions of the form A → x,
where x is a string of variables and terminals
77
Definition: Regular Grammars
Grammar G = (V, T, S, P)
V: Variables
T: Terminal symbols
S: Start variable
Right or Left Linear Grammars. Productions of the form:
A → xB or A → Bx, or C → x,
where x is a string of terminals
78
n l n l
{a b c
: n, l  0}{a : n  0}
n!
Non-regular languages
Context-Free Languages
n n
R
{a b }
{ww }
Regular Languages
79
Context-Free Languages
Context-Free
Grammars
Pushdown
Automata
stack
automaton
80
Pushdown Automata
PDAs
81
Pushdown Automaton - PDA
Input String
Stack
States
82
The Stack
A PDA can write symbols on the stack
and read them later on.
POP: reading a symbol
PUSH: writing a symbol
[Figure: stack holding y, x, z with y on top]
All access to the stack is only at the top!
(Stack top is written leftmost in the string, e.g. yxz)
A stack is valuable as it can hold an unlimited
amount of information.
The stack allows pushdown automata to
recognize some non-regular languages.
83
The States
q1 --( a, b / c )--> q2
a: input symbol
b: pop (reading) the old stack symbol
c: push (writing) the new stack symbol
84
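q1 --( a, b / c )--> q2
input: ... a ...
stack before (top first): b h e $
Replace
stack after (top first): c h e $
(An alternative is to start and finish with empty stack)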
(An alternative is to start and finish with empty stack)
85
q1 --( a, λ / c )--> q2
input: ... a ...
stack before (top first): b h e $
Push
stack after (top first): c b h e $
86
q1 --( a, b / λ )--> q2
input: ... a ...
stack before (top first): b h e $
Pop
stack after (top first): h e $
87
q1 --( a, λ / λ )--> q2
input: ... a ...
stack before (top first): b h e $
No Change
stack after (top first): b h e $
88
Time 0
Example 3.7 Salling:
A PDA for simple nested parenthesis strings
Input: ( ( ( ) ) )
Stack: empty
States: s (start), q (end)
Transitions:
s --( (, λ / ( )--> s
s --( ), ( / λ )--> q
q --( ), ( / λ )--> q
89
Example 3.7, Time 1 to Time 7
[Figures: the same PDA shown after each step on the input ( ( ( ) ) ). Each ( read in state s pushes ( onto the stack, which grows to ( ( (. Each ) pops one (. At Time 7 the whole input has been read and the stack is empty.]
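The behaviour of this PDA can be imitated with a Python list as the stack. The sketch assumes three transitions (push on ( in state s, pop on ) from s to q, pop on ) in q) and accepts when the input is consumed in state q with an empty stack; that acceptance condition is an assumption, and the function name is made up.

```python
def accepts_nested(s):
    """Simulate a PDA for simple nested parenthesis strings such as ((()))."""
    state, stack = "s", []
    for ch in s:
        if state == "s" and ch == "(":
            stack.append("(")                 # transition (, λ / (  : push
        elif ch == ")" and stack and stack[-1] == "(":
            stack.pop()                       # transition ), ( / λ  : pop
            state = "q"
        else:
            return False                      # no transition applies
    return state == "q" and not stack


print(accepts_nested("((()))"))
```

Once the machine is in state q it can only pop, so a string such as ()() is rejected: the language is simple nesting, not all balanced strings.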
96
Applications:
Compilers
97
Program
v = 5;
if (v > 5)
    x = 12 + v;
while (x != 3) {
    x = x - 3;
    v = 10;
}
......
Compiler
Machine Code
Add v,v,0
cmp v,5
jmplt ELSE
THEN:
add x, 12,v
ELSE:
WHILE:
cmp x,3
...
98
Compiler
input program → Lexical analyzer → parser → output machine code
99
A parser “knows” the grammar
of the programming language
100
Parser
PROGRAM → STMT_LIST
STMT_LIST → STMT; STMT_LIST | STMT;
STMT → EXPR | IF_STMT | WHILE_STMT
| { STMT_LIST }
EXPR → EXPR + EXPR | EXPR - EXPR | ID
IF_STMT → if (EXPR) then STMT
| if (EXPR) then STMT else STMT
WHILE_STMT → while (EXPR) do STMT
101
The parser finds the derivation
of a particular input
Input: 10 + 2 * 5
Parser grammar: E → E + E | E * E | INT
Derivation:
E ⇒ E + E
⇒ E + E * E
⇒ 10 + E * E
⇒ 10 + 2 * E
⇒ 10 + 2 * 5
102
derivation
E ⇒ E + E
⇒ E + E * E
⇒ 10 + E * E
⇒ 10 + 2 * E
⇒ 10 + 2 * 5
derivation tree: E(E(10), +, E(E(2), *, E(5)))
103
derivation tree: E(E(10), +, E(E(2), *, E(5)))
machine code
mult a, 2, 5
add b, 10, a
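The step from derivation tree to machine code can be sketched as a post-order traversal: code for the subtrees is emitted first, then one instruction for the node itself. The tuple representation and the scheme of naming result registers a, b, c, ... are assumptions chosen to match the two instructions shown.

```python
def generate(tree):
    """Emit three-address instructions for a tree of (left, op, right) tuples."""
    code = []
    registers = (chr(c) for c in range(ord("a"), ord("z") + 1))

    def emit(node):
        if isinstance(node, int):
            return str(node)                  # constants are used directly
        left, op, right = node
        left_name = emit(left)                # post-order: children first
        right_name = emit(right)
        target = next(registers)
        mnemonic = "add" if op == "+" else "mult"
        code.append(f"{mnemonic} {target}, {left_name}, {right_name}")
        return target

    emit(tree)
    return code


for line in generate((10, "+", (2, "*", 5))):
    print(line)
```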
104
Parsing
105
Parser
input
string
grammar
derivation
106
Example:
Parser
input: aabb
grammar: S → SS | aSb | bSa | λ
derivation: ?
107
Exhaustive Search
S → SS | aSb | bSa | λ
Find derivation of aabb
Phase 1: all possible derivations of length 1
S ⇒ SS
S ⇒ aSb
S ⇒ bSa
S ⇒ λ
108
Find derivation of aabb
S ⇒ SS
S ⇒ aSb
S ⇒ bSa (rejected: aabb does not start with b)
S ⇒ λ (rejected: λ is not aabb)
109
S → SS | aSb | bSa | λ
Find derivation of aabb
Phase 1 (remaining):
S ⇒ SS
S ⇒ aSb
Phase 2:
S ⇒ SS ⇒ SSS
S ⇒ SS ⇒ aSbS
S ⇒ SS ⇒ bSaS
S ⇒ SS ⇒ S
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb
S ⇒ aSb ⇒ abSab
S ⇒ aSb ⇒ ab
110
S → SS | aSb | bSa | λ
Find derivation of aabb
Phase 2 (remaining):
S ⇒ SS ⇒ SSS
S ⇒ SS ⇒ aSbS
S ⇒ SS ⇒ S
S ⇒ aSb ⇒ aSSb
S ⇒ aSb ⇒ aaSbb
Phase 3:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
111
Final result of exhaustive search
(top-down parsing)
Parser
input: aabb
grammar: S → SS | aSb | bSa | λ
derivation:
S ⇒ aSb ⇒ aaSbb ⇒ aabb
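The phases above amount to a breadth-first search over sentential forms. The sketch below is one possible way to write it, with assumptions stated plainly: only the leftmost S is rewritten, forms whose terminal prefix disagrees with the target are dropped, and forms longer than `max_len` are dropped so that the search terminates. That length bound is a heuristic for this grammar, not a general rule.

```python
from collections import deque

PRODUCTIONS = ["SS", "aSb", "bSa", ""]   # right-hand sides for S; "" stands for λ


def find_derivation(target, max_len=None):
    """Breadth-first search for a leftmost derivation of target from S."""
    if max_len is None:
        max_len = 2 * len(target) + 2    # heuristic bound (assumption)
    queue = deque([["S"]])
    seen = {"S"}
    while queue:
        path = queue.popleft()
        form = path[-1]
        i = form.find("S")
        if i == -1:                      # no variables left
            if form == target:
                return path
            continue
        for body in PRODUCTIONS:
            new = form[:i] + body + form[i + 1:]
            j = new.find("S")
            prefix = new if j == -1 else new[:j]
            if len(new) > max_len or new in seen or not target.startswith(prefix):
                continue                 # pruned, like the rejected lines above
            seen.add(new)
            queue.append(path + [new])
    return None


print(" => ".join(find_derivation("aabb")))
```

Each level of the search corresponds to one phase; pruning by prefix is what removes S ⇒ bSa in Phase 1.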
112
Context Free Art
http://www.contextfreeart.org/wiki/index.php?page=AboutPage
113