CDT314
FABER
Formal Languages, Automata
and Models of Computation
Lecture 4
School of Innovation, Design and Engineering
Mälardalen University
2012
1
Content
Regular Expressions and Regular Languages
NFADFA Subset Construction (Delmängdskonstruktion)
FA RE State Elimination (Tillståndseliminering)
Minimizing DFA by Set Partitioning (Särskiljandealgoritmen)
Grammar: Linear grammar. Regular grammar.
Application: Compiler. Lexical Analysis.
2
Two Basic Theorems
about Finite State Automata
Based on C Busch, RPI, Models of Computation
3
1: KleeneTheorem
NFADFA
Let M be NFA accepting L.
Then there exists DFA M´ that accepts L as well.
For each NFA there exists corresponding DFA
that accepts the same language.
4
II: Finite Language Theorem
Any finite language is regular.
Any finite language is FSA-acceptable.
Proof: every finite language is a union of single strings which
are regular because described by regular expressions
Example L = {abba, abb, abab}
L is expressed by the regular expression “abba|abb|abab“
FSA = Finite State Automaton = DFA/NFA
5
Some Properties of
Regular Languages,
Summary
6
Properties
For regular languages L1 and L2
we will prove that:
Union:
Concatenation:
Star:
L1 L2
L1L2
Are Regular
Languages
L1 *
7
Regular languages are closed under
Union:
Concatenation:
Star:
L1 L2
L1L2
L1 *
8
Regular language
L1
LM1 L1
NFA
M1
Single final state
Regular language
LM 2 L2
NFA
L2
M2
Single final state
9
Example
M1
a
L1 {a b}
n
b
M2
L2 ba
b
a
10
Union (Thompson’s construction)
NFA for
L1 L2
M1
M2
11
Example
NFA for L1 L2 {a b} {ba}
n
L1 {a b}
n
a
b
L2 {ba}
b
a
12
Concatenation (Thompson’s construction)
NFA for
L1L2
M1
M2
13
Example
NFA for
L1L2 {a b}{ba} {a bba}
n
L1 {a b}
n
n
L2 {ba}
a
b
b
a
14
Star Operation
(Thompson’s construction)
NFA for
L1 *
L1 *
M1
15
Example
NFA for
L1* {a b} *
n
L1 {a b}
n
a
b
16
Reverse of a Regular
Language
17
Theorem
R
The reverse L of a regular language L
is a regular language
Proof idea
R
Construct NFA that accepts L :
invert the transitions of the NFA
that accepts L
18
Proof
Since L is regular,
there is NFA that accepts L
Example:
b
L ab * ba
a
b
a
19
b
Invert Transitions
a
b
a
b
a
b
a
20
Make old initial state
a final state
and vice versa
b
a
b
a
b
a
b
a
21
b
a
b
Add a new initial state
a
b
a
b
a
22
R
Resulting machine accepts L
R
L
is regular
L ab * ba
b
a
b
L b * a ab
R
a
23
Regular Expressions
and
Regular Languages
24
Theorem: Regular expressions
are representation of regular
languages and they
are equivalent.
Regular expressions describe regular languages in formal language theory. They
have the same expressive power as regular grammars.
25
Theorem - Part 1
Languages
Generated by
Regular Expressions
Regular
Languages
1. For any regular expression r
the language L(r ) is regular
26
Theorem - Part 2
Languages
Generated by
Regular Expressions
Regular
Languages
2. For any regular language L there is
a regular expression r with L(r ) L
27
Proof - Part 1
For any regular expression r
the language L(r ) is regular
Proof by induction on the size of
r
28
Induction Basis
Primitive Regular Expressions:
NFAs
, ,
(where
)
L( M1) L()
L( M 2 ) {} L( )
a
Regular
Languages
L( M 3 ) {a} L(a)
29
Inductive Hypothesis
Assume
for regular expressions r1 and r2
that L(r1 ) and L(r2 ) are regular languages
30
Inductive Step
We will prove:
L( r1 r2 )
L( r1 r2 )
1
L(r )
regular
languages
L(( r1 ))
31
By definition of regular expressions:
L(r1 r2 ) L(r1 ) L(r2 )
L(r1 r2 ) L(r1 ) L(r2 )
1
L(r ) ( L(r1 ))
*
L(( r1 )) L(r1 )
32
By inductive hypothesis we know:
L(r1 ) and L(r2 ) are regular languages
Regular languages are closed under
union
concatenation
star
Lr1 Lr2
Lr1 Lr2
Lr1 *
33
Therefore
Lr1 r2 Lr1 Lr2
Lr1 r2 Lr1 Lr2
regular
languages
Lr1 * Lr1 *
And trivially L((r1 )) is a regular language
34
Proof – Part 2
For any regular language L there is
a regular expression r with L(r ) L
Proof by construction of regular expression
35
Since L is regular
take the NFA M that accepts it
L( M ) L
Single final state
36
From M construct the equivalent
Generalized Transition Graph
transition labels are regular expressions
Example
M
a
c
a, b
a
c
ab
QED
37
Summary: Operations on Regular Expressions
RE
a+b
(a+b)(a+b)
a*
a*b
(a+b)*
Regular language description
{a,b}
{aa, ab, ba, bb}
{, a, aa, aaa, …}
{b,ab, aab, aaab, …}
{, a, b, aa, ab, ba, aaa, bbb…}
38
Algebraic Properties
Axiom
Description
r +s = s+r
+ is commutative
(r +s)+t = r +(s+t)
+ is associative
(rs)t = r (st)
concatenation is associative
r (s+t) = rs+rt
(s+t)r = sr +tr
r = r
r = r
r* = ( r +)*
concatenation distributes over +
r** = r*
* is idempotent
is the identity element for
concatenation
relation between * and
39
Operator Precedence
1. Kleene star
2. Concatenation
3. Union
allows dropping unnecessary parentheses.
40
Converting Regular
Expression to a DFA
41
Example: From a Regular Expression to an NFA
Example : (a+b)*abb step by step construction
(a+b)
a
3
2
6
1
4
5
b
42
Example: From a Regular Expression to an NFA
Example : (a+b)*abb
(a+b)*
a
0
3
2
1
6
4
5
b
7
43
Example: From a Regular Expression to an NFA
Example : (a+b)*abb
(a+b)*abb
0
2
a
3
1
6
4
5
b
7
a
8
b
9
b
10
44
Converting FA
to a Regular Expression
State Elimination
http://www.math.uu.se/~salling/Movies/FA_to_RegExpression.mov
45
Example
a
a,b
2
1
b
a
b
3
Add a new start (s) and a new final (f) state:
• From s to the ex-starting state (1)
• From each ex-final state (1,3) to f
46
s
a
a,b
2
1
b
a
b
The original:
3
a
a,b
2
1
b
a
b
f
3
Let’s remove state 1!
Each combination of input/output to 1 will generate
a new path once state 1 is gone
; a,b
;
a ; a, b
a;
47
When state 1 is gone we must be able to make all
those transitions!
( a b
a,b
;
s
)
a,b
a
2
1
b
a
Previous:
a
a, b
3
b
a
b
f
b
48
;
( a b)
s
a,b
2
1
a
b
a
b
3
f
49
a ; a, b
( a b)
s
a,b
2
1
a
b
a
b
3
f
a ( a b)
50
A common mistake: having several arrows between the same
pair of states. Join the two arrows (by union of their regular
expressions) before going on to the next step.
( a b)
s
a,b
2
1
a
b
a
3
f
b
a ( a b) b
51
a;
( a b)
s
a,b
2
1
a
b
a
3
f
a ( a b) b
a
52
Union again..
( a b)
s
a,b
2
1
b
a
a
a ( a b) b
3
f
a
53
Without state 1...
( a b)
s
2
a
b
a ( a b) b
3
f
a
54
Now we repeat the same procedure for the state 2...
( a b)
s
a
2
b
a (a
b
b
3
f
a
55
Following the path s-2-3 we concatenate all strings on
our way…don’t forget a* in 2!
( a b)
s
2
( a b) a b
a
b
a ( a b) b
3
f
a
56
When 2 is removed, the path 3 - 2 –3 has also to be preserved,
so we concatenate 3-2 string, take a* loop and go back 2-3 with
string b!
( a b)
s
b
( a b) a b
3
f
a
2
a
a ( a b
b
( a ( a b ) b ) ab
57
This is how the FA looks like without state 2:
s
( a b) a b
3
f
a
( a ( a b ) b ) ab
58
Finally we remove state 3…
s
( a b) a b
f
a
3
( a ( a b ) b ) ab
59
...so we concatenate strings s-3, loop in 3 and 3-f
s
(a b)a * b
3
f
(a b)ab
(a(a b) b)a b
a
((a(a b) b)ab)
( a )
60
Now we can omit state 3!
s
f
(a b)ab (( a(a b) b)ab) ( a )
From s we have two choices, empty string or the long expression
OR is represented by union , as usually
61
So union the arrows...
(a b)a b (( a(a b) b)a b) ( a )
s
f
...and we are done!
62
Converting FA to a Regular
ExpressionAn Algorithm for State
Elimination
63
• We expand our notion of NFA- to allow transitions on
arbitrary regular expressions, not simply single
symbols or .
• Successively eliminate states, replacing transitions that
enter and leave a state with a more complicated
regular expression, until eventually there are only two
states left: a start state, an accepting state, and a
single transition connecting them, labeled with a
regular expression.
• The resulting regular expression is then our answer.
64
• To begin with, the automaton should have a start state
that has no transitions into it (including self-loops), and
which is not accepting.
• If your automaton does not obey this property, add a
new, non-accepting start state s, and add a -transition
from s to the original start state.
• The automaton should also have a single accepting
final state with no transitions leaving it, and no selfloops.
• If your automaton does not have it, add a new final
state q, change all other accepting states to nonaccepting, and add -transitions from them to q.
This change clearly doesn't change the language
accepted by the automaton.
65
Repeat the following steps, which eliminate a state:
1. Pick a non-start, non-accepting state q to eliminate. The
state q will have i transitions in and j transitions out. Each
will be labeled with a regular expression. For each of the
ij combinations of transitions into and out of q, replace:
B
A
p
C
q
r
with
p
And delete state q.
AB * C
r
66
2. If several transitions go between a pair of states, replace
them with a single transition that is the union of the
individual transitions.
E.g. replace:
A
p
r
B
with
p
A B
r
67
Example
a
b
a
1
2
b
b ab*a
1
1
a
3
b
4
a
3
b
(b a b * a)a * b
4
4
68
N.B. common mistake!
a
means
NOT
b
( a b) *
i.e.
a, b
(a * b*)
See example on s 46 and 47 in Sallings book
69
Minimizing DFA
by Set Partitioning
http://www.math.uu.se/~salling/Movies/SubsetConstruction.mov
70
Minimization of DFA
The deterministic finite automata are not always
the smallest possible accepting the source
language.
There may be states with the same
"acceptance behavior". This applies to states
p and q, if for all input words, the automaton
always or never moves to a final state from p
and q.
71
State Reduction by Set Partitioning
(Särskiljandealgoritmen)
The set partitioning technique is similar to one used
for partitioning people into groups based on their
responses to questionnaire.
The following slides show the detailed steps for
computing equivalent state sets of the starting DFA
and constructing the corresponding reduced DFA.
72
b
a
3
a
a
Starting DFA
1
a
b
0
4
a
2
b
b
b
5
b
a
a
b
3
Reduced DFA
a, b
0
1,2
a
b
a, b
4,5
73
State Reduction by Set Partitioning
Step 0: Partition the states into two groups
accepting and non-accepting.
P1
P2
a
a
{ 3, 4, 5 }
a
b
3
a
1
{ 0, 1, 2 }
b
0
a
2
b
b
4
b
b
5
a
74
State Reduction by Set Partitioning
Step 1: Get the response of each state for each input symbol.
Notice that States 3 and 0 show different responses from the
ones of the other states in the same set.
P1
p1
a
{3,
b
p2
p1 p1
4, 5 }
p1 p1
P2
p2 p1
a
{0, 1,
b
p2 p1
P2
p1
2}
p1
P1
a
b
a
a
3
1
a
b
0
a
2
b
b
4
b
b
5
a
75
Step 2: Partition the sets according to the responses,
and go to Step 1 until no partition occurs.
P11
P12
P21
p11 p11
a
b
p12 p12
{4, 5}
P22
a
{3}
{1, 2}
b
p11 p11
{0}
p11 p11
No further partition is possible for the sets P11 and P21 .
So the final partition results are as follows:
{4, 5}
{3}
{1, 2}
{0}
76
Minimized DFA consists of four states of the final partition,
and the transitions are the one corresponding to the
starting DFA.
{4, 5}
{3} {1, 2}
{0}
a
3
a
a, b
0
1,2
a
a
b
a
a, b
2
b
4,5
Minimized DFA
3
1
0
a
b
a
b
b
b
4
b
b
5
a
Starting DFA
77
DFA Minimization Algorithm
The algorithm
Why does this work?
Partition P 2Q (power set)
P { F, {Q-F}}
Start off with 2 subsets of Q
{F} and {Q-F}
while ( P is still changing)
T{}
for each set s P
for each
partition s by
into s1, s2, …, sk
T T s1, s2, …, sk
if T P then
PT
This is a fixed-point algorithm!
While loop takes PiPi+1 by
splitting one or more sets
Pi+1 is at least one step closer to
the partition with |Q | sets
Maximum of |Q | splits
Partitions are never combined
Initial partition ensures that final
states are intact
78
Set Partition Algorithm (Salling book)
Example 2.39 We apply step by step set partition algorithm
on the following DFA:
{a, b}
Two strings x,y are
distinguished by
language (DFA) if
there is a
(distinguishing) string
z such as that only
one of strings xz, yz
ends in accepting
state (belongs to
language).
b
1
a
b 4
b
a
2
a 5
a
3
a
a
b
b
6
79
We search for strings that are
distinguished by DFA!
Non-accepting:
b
1
a
b 4
2 b a
a 5
a
3
a
6
{1,2,3,4,5,6}
Accepting:
{1,2,4,5,6}
{3}
a
b
Distinguishing string:
a
or b
What happens when we read
b
a
?
From 1 and 6 by a we reach 3 which is
accepting. They form a special group.
{1,6} {3}
{2,4,5}
What happens when we read
From 5 we end in 6 by b,
leaving its group.
b?
{2,4}
{5}
{1,6} {3}
80
The minimal automaton has four states:
{2,4}
{5}
{1,6}
{3}
b
1
b
a
a
{5}
b
{2,4}
b 4
2 b a
a 5
a
3
a
a
b
b
6
a
{1,6}
b
{3}
a
a
81
The Chomsky Hierarchy
82
n l n l
{a b c
: n, l 0}{a : n 0}
n!
Non-regular languages
Context-Free Languages
n n
R
{a b }
{ww }
Regular Languages
83
Some Additional Examples
for your excersize
84
Example of subset construction for NFA DFA
2
start
0
a
3
1
6
7
a
8
b
9
b
10
4
b
5
NFA N for (a+b)*abb
85
Example of subset construction for NFA DFA
STATE
A
B
C
D
E
INPUT SYMBOL
a
b
B
B
B
B
B
C
D
C
E
C
Translation table for DFA
86
Example of subset construction for NFA DFA
b
C
b
b
a
start
A
a
B
b
b
D
10
E
a
a
a
Result of applying the subset construction of NFA for (a+b)*abb.
87
Another Example of State Elimination
a
q0
b
b
q1 a, b
q2
b
b
b
a
q0
q1 a b q2
b
88
Another Example of State Elimination
b
a
q0
b
q1 a b q2
b
bb * a
q0
b
bb * (a b)
q2
89
Another Example of State Elimination
Resulting Regular Expression
bb * a
q0
b
bb * (a b)
q2
r (bb * a) * bb * (a b)b *
L( r ) L( M ) L
90
In General
e
c
d
Removing
states
qi
qj
q
a
b
ae * d
ce * b
ce * d
qi
qj
ae * b
91
Obtaining the final regular expression
r4
r1
r3
q0
r2
qf
r r1 * r2 (r4 r3r1 * r2 ) *
L( r ) L( M ) L
92
Example: From an NFA to a DFA
b
states
A
B
C
D
E
a
B
B
B
B
B
b
C
D
C
E
C
C
b
b
a
a
b
A
B
a
b
a
D
E
a
93
Example: From an NFA to a DFA
states
a b
A
B A
B
B D
D
B E
E
B A
b
b
b
a
A
B
a
b
a
D
E
a
94
Example: DFA Minimization
Current Partition Split on a
Split on b
P0
{s4} {s0, s1, s2, s3}
none
{s0, s1, s2} {s3}
P1
{s4}{s3}{s0, s1, s2}
none
{s0, s2}{s1}
P2
{s4}{s3}{s1}{s0, s2}
none
none
final state
a
a
s0
a
b
b
s1
a
s3
a
b
s4
b
b
a
s0 , s2
a
a
s1
b
s3
a
b
s4
b
s2
b
95
Example: DFA Minimization
What about a ( b + c )* ?
q0
a
q1
q2
q4
b
q5
q3
q8
q6
c
q7
q9
First, the subset construction:
-closure(move(s,*))
b
NFA states
a
b
c
s0
q0
none
none
s1
q 1 , q 2 , q3 ,
q4, q6, q9
q 5 , q 8 , q9 ,
q3, q4, q6
q 7 , q 8 , q9 ,
q3, q4, q6
q1, q2, q3,
q4, q6, q9
none
none
q5, q8, q9,
q3, q4, q6
s2
q7, q8, q9,
q3, q4, q6
s3
none
s2
s3
s2
s3
s2
b
s0
a
s1
b
c
c
s3
c
Final states
96
Example: DFA Minimization
b
Then, apply the minimization algorithm
Split on
P0
Current Partition
a
b
c
{ s1, s2, s3} {s0}
none
none
none
final states
s2
b
a
s0
b
s1
c
c
s3
c
To produce the minimal DFA
b,c
a
s0
s1
97
Grammars
98
Grammars
Grammars express languages
Example:
the English language
sentence noun _ phrase
noun _ phrase article
predicate
noun
predicate verb
99
article a
article the
noun bird
noun dog
verb sings
verb barks
100
A derivation of “the bird sings”:
sentence noun _ phrase predicate
article noun predicate
article noun verb
the noun verb
the bird verb
the bird sings
101
A derivation of “a dog barks”:
sentence noun _ phrase
noun _ phrase
predicate
verb
article noun verb
a noun verb
a dog verb
a dog barks
102
The language of the grammar:
L { " a bird barks " ,
" a bird sings " ,
" the bird barks " ,
" the bird sings " ,
}
" a dog barks " ,
" a dog sings " ,
" the dog barks " ,
" the dog sings " }
103
Notation
noun bird
noun dog
Non-terminal
(Variable)
Production rule
Terminal
104
Example
Grammar:
S aSb
S
Derivation of sentence:
ab
S aSb ab
S aSb
S
105
Grammar:
S aSb
S
Derivation of sentence
aabb
S aSb aaSbb aabb
S aSb
S
106
Other derivations
S aSb aaSbb aaaSbbb aaabbb
S aSb aaSbb aaaSbbb
aaaaSbbbb aaaabbbb
107
The language of the grammar
S aSb
S
L {a b : n 0}
n n
108
Formal Definition
Grammar
V:
T:
G V , T , S , P
Set of variables
Set of terminal symbols
S : Start variable
P:
Set of production rules
109
Example
Grammar
G
S aSb
S
G V , T , S , P
V {S}
T {a, b}
P {S aSb, S }
110
Sentential Form
A sentence that contains
variables and terminals
Example
S aSb aaSbb aaaSbbb aaabbb
sentential forms
Sentence
(sats)
111
*
We write:
S aaabbb
Instead of:
S aSb aaSbb
aaaSbbb aaabbb
112
In general we write
*
w1 wn
if
w1 w2 w3 wn
*
( By default
w w
)
113
Example
Grammar
S aSb
S
Derivations
*
S
*
S ab
*
S aabb
*
S aaabbb
114
Example
Grammar
S aSb
S
Derivations
S aaSbb
aaSbb aaaaaSbbbbb
115
Another Grammar Example
Grammar
G
S Ab
A aAb
A
Derivations
S Ab b
S Ab aAbb abb
S aAbb aaAbbb aabbb
116
More Derivations
S Ab aAbb aaAbbb aaaAbbbb
aaaaAbbbbb aaaabbbbb
S aaaabbbbb
S aaaaaabbbbbbb
S a b b
n n
117
The Language of a Grammar
For a grammar G with start variable
S
L(G ) {w : S w}
String of terminals
118
Example
For grammar G
S Ab
A aAb
A
L(G ) {a b b : n 0}
n n
Since
S a b b
n n
119
Notation
A aAb
A
article a
article the
A aAb |
article a | the
120
Linear Grammars
121
Linear Grammars
Grammars with
at most one variable (non-terminal)
at the right side of a production
Examples:
S aSb
S Ab
S
A aAb
A
122
A Non-Linear Grammar
Grammar
G
S SS
S
S aSb
S bSa
L(G ) {w : na ( w) nb ( w)}
123
A Linear Grammar
Grammar
G
SA
A aB |
B Ab
L(G) {a b : n 0}
n n
124
Right-Linear Grammars
All productions have form:
A xB
or
A x
Example
S abS
S a
125
Left-Linear Grammars
All productions have form
A Bx
or
A x
Example
S Aab
A Aab | B
Ba
126
Regular Grammars
A grammar is a regular grammar if and only if it
is right-linear or left-linear.
127
Regular Grammars
Generate
Regular Languages
128
Theorem
Languages
Generated by
Regular Grammars
Regular
Languages
129
Theorem - Part 1
Languages
Generated by
Regular Grammars
Regular
Languages
Any regular grammar generates
a regular language
130
Theorem - Part 2
Languages
Generated by
Regular Grammars
Regular
Languages
Any regular language is generated
by a regular grammar
131
Proof – Part 1
Languages
Generated by
Regular Grammars
Regular
Languages
The language L(G ) generated by
any regular grammar G is regular
132
The case of Right-Linear Grammars
Let
G
be a right-linear grammar
We will prove:
Proof idea
L(G ) is regular
We will construct NFA M
with L( M ) L(G )
133
Grammar
G is right-linear
Example
S aA | B
A aa B
Bb B|a
134
Construct NFA M such that every state
is a grammar variable:
A
S
VF
special
final state
B
S aA | B
A aa B
Bb B|a
135
Add edges for each production:
a
A
S
S aA | B
VF
B
A aa B
Bb B|a
136
a
S
A
VF
B
S aA | B
A aa B
Bb B|a
137
A
a
a
S
S aA | B
a
VF
B
A aa B
Bb B|a
138
A
a
a
S
S aA | B
A aa B
VF
a
B
b
Bb B|a
139
A
a
a
S
S aA | B
A aa B
Bb B|a
a
VF
a
B
b
140
A
a
S
a
a
VF
a
B
b
S aA aaaB aaabB aaaba
141
NFA M
Grammar
A
S aA | B
a
a
G
A aa B
B bB | a
S
a
VF
a
B
b
L( M ) L(G )
aaab * a b * a
142
In General
A right-linear grammar
has variables:
G
V0 ,V1,V2 ,
and productions:
Vi a1a2 amV j
or
Vi a1a2 am
143
We construct the NFA
each variable
V0
V1
Vi
M
such that:
corresponds to a node:
V2
….
VF
special
final state
144
For each production:
Vi a1a2 amV j
we add transitions and intermediate nodes
Vi
a1
a2
………
am V
j
Example:
http://www.cs.duke.edu/csed/jflap/tutorial/grammar/toFA/index.html
Convert Right-Linear Grammar to FA by JFLAP
145
Example
a9
V0 a1V1 | a3V2
V1 a2 a4V3 | a3a4 a8V4
V2 a5V4
V3 a9V1 | a5
V4 a0
a1
V0
a2
V1
a3
a4
M
V3
a5
a4
a3
V2
a5
a8
V4
VF
a9
L(G ) L( M )
146
The case of Left-Linear Grammars
Let G be a left-linear grammar
We will prove: L(G ) is regular
Proof idea
We will construct a right-linear
R
grammar G with L(G ) L(G)
147
Since G is left-linear grammar
the productions look like:
A Ba1a2 ak
A a1a2 ak
148
Construct right-linear grammar
In
G:
G
A Ba1a2 ak
A vB
In
G :
A ak a2a1B
Av B
R
149
Construct right-linear grammar
In
G:
G
A a1a2 ak
Av
In
G :
A ak a2 a1
Av
R
150
It is easy to see that:
Since
L(G) L(G)
R
G is right-linear, we have:
L(G)
L(G)
Regular
Language
Regular
Language
R
L(G )
Regular
Language
151
Proof - Part 2
Languages
Generated by
Regular Grammars
Regular
Languages
Any regular language L is generated
by some regular grammar G
152
Any regular language L is generated
by some regular grammar G
Proof idea
Since L is regular
there is an NFA M such that
L L(M )
Construct from M a regular grammar G
such that L( M ) L(G )
153
Example
b
M
q0
a
q1
a
L ab * ab(b * ab) *
L L(M )
q2
b
q3
154
Convert
M
G
q0 aq1
q1 bq1
q1 aq2
q2 bq3
q3 q1
q3
to a right-linear grammar
b
M
q0
a
q1
a
q2
b
q3
L(G ) L( M ) L
155
In General
For any transition:
Add production:
variable
q
a
p
q ap
terminal
variable
156
For any final state:
Add production:
qf
qf
157
Since
G
G
is right-linear grammar
is also a regular grammar with
L(G ) L( M ) L
158
Regular Grammars
A regular grammar is any
right-linear or left-linear grammar
Examples
G1
G2
S abS
S Aab
S a
A Aab | B
Ba
159
Observation
Regular grammars generate regular languages
Examples
G1
S abS
S a
L(G1) (ab) * a
G2
S Aab
A Aab | B
Ba
L(G2 ) aab(ab) *
160
Chomsky’s Language Hierarchy
Non-regular languages
Regular Languages
161
Application: Compiler
Lexical Analysis
(Lexikalisk analys i Kompilatorteori)
162
What is a compiler?
A compiler is program that translates a source
language into an equivalent target language
while (i > 3) {
a[i] = b[i];
i ++
}
C program
compiler
does this
mov eax, ebx
add eax, 1
cmp eax, 3
jcc eax, edx
assembly
program
163
What is a compiler?
class foo {
int bar;
...
}
Java program
compiler
does this
struct foo {
int bar;
...
}
C program
164
What is a compiler?
class foo {
int bar;
...
}
Java program
compiler
does this
........
.........
........
Java virtual
machine program
165
Phases of a compiler
Source Program
Scanner
Lexical Analyzer
Tokens
Parser
Syntax Analyzer
Parse Tree
Semantic Analyzer
Abstract Syntax Tree with
attributes
166
Compilation in a Nutshell
Source code
(character stream)
if (b == 0) a = b;
Lexical analysis
Token stream if ( b == 0 ) a = b ;
==
Abstract syntax tree
b
(AST)
Parsing
if
0
;
=
a
b
if
boolean
Decorated AST
==
int b
int 0
Semantic Analysis
=
int
;
int a int b
lvalue
167
Stages of Analysis
Lexical Analysis – breaking the input up into
individual words/tokens
Syntax Analysis – parsing the phrase
structure of
the program
Semantic Analysis – calculating the program’s
meaning
168
Lexical Analyzer
Input is a stream of characters
Produces a stream of names, keywords &
punctuation marks
Discards white space and comments
169
Lexical Analyzer
Recognize “tokens” in a program source code.
The tokens can be variable names, reserved
words, operators, numbers, … etc.
Each kind of token can be specified as an RE,
e.g., a variable name is of the form [A-Zaz][A-Za-z0-9]*.
We can then construct an -NFA to recognize it
automatically.
170
Lexical Analyzer
By putting all these -NFA’s together, we obtain
one which can recognize all different kinds of
tokens in the input string.
We can then convert this -NFA to NFA and
then to DFA, and implement this DFA as a
deterministic program - the lexical analyzer.
171
RE
NFA
Thompson’s
Contruction
DFA
Subset
Contruction
Minimal DFA
Hopcroft
Minimization
172
© Copyright 2026 Paperzz