CD5560
FABER
Formal Languages, Automata
and Models of Computation
Lecture 3
Mälardalen University
2005
1
Content
- Applications: Compiler Lexical Analysis
- Regular Expressions and Regular Languages
- NFA → DFA (Subset Construction - Delmängdskonstruktion)
- FA → RE (State Elimination - Tillståndseliminering)
- Minimizing DFA by Set Partitioning (Särskiljandealgoritmen)
2
Application: Compiler
Lexical Analysis
(Lexical analysis in compiler theory)
3
What is a compiler?
A compiler is a program that translates a program written in a source
language into an equivalent program in a target language
while (i > 3) {
a[i] = b[i];
i++;
}
C program
compiler
does this
mov eax, ebx
add eax, 1
cmp eax, 3
jcc eax, edx
assembly
program
4
What is a compiler?
class foo {
int bar;
...
}
Java program
compiler
does this
struct foo {
int bar;
...
}
C program
5
What is a compiler?
class foo {
int bar;
...
}
Java program
compiler
does this
........
.........
........
Java virtual
machine program
6
Phases of a compiler
Source Program
Scanner
Lexical Analyzer
Tokens
Parser
Syntax Analyzer
Parse Tree
Semantic Analyzer
Abstract Syntax Tree with
attributes
7
Compilation in a Nutshell
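Source code (character stream):
if (b == 0) a = b;
Lexical analysis produces the token stream:
if ( b == 0 ) a = b ;
Parsing produces the abstract syntax tree (AST):
an if node with condition == (operands b and 0) and body = (operands a and b)
Semantic analysis produces the decorated AST:
== has type boolean, with operands int b and int 0
= has type int, with operands int a (lvalue) and int b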
8
Stages of Analysis
Lexical Analysis: breaking the input up into individual words/tokens
Syntax Analysis: parsing the phrase structure of the program
Semantic Analysis: calculating the program's meaning
9
Lexical Analyzer
Input is a stream of characters
Produces a stream of names, keywords &
punctuation marks
Discards white space and comments
10
Lexical Analyzer
Recognize “tokens” in a program source code.
The tokens can be variable names, reserved
words, operators, numbers, … etc.
Each kind of token can be specified as an RE,
e.g., a variable name is of the form [A-Za-z][A-Za-z0-9]*.
We can then construct an ε-NFA to recognize it
automatically.
11
Lexical Analyzer
By putting all these ε-NFAs together, we obtain
one that can recognize all the different kinds of
tokens in the input string.
We can then convert this ε-NFA to an NFA and
then to a DFA, and implement this DFA as a
deterministic program: the lexical analyzer.
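A minimal sketch of such a lexical analyzer, assuming Python and its standard re module. The token classes and their names (KEYWORD, IDENT, NUMBER, OP, PUNCT, SKIP) are hypothetical choices for illustration. Note that Python's re uses a backtracking matcher rather than an explicit DFA, so this only illustrates specifying tokens by regular expressions, not the automaton construction itself.

```python
import re

# Hypothetical token classes; each one is specified by a regular expression.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:if|while|else)\b"),
    ("IDENT",   r"[A-Za-z][A-Za-z0-9]*"),
    ("NUMBER",  r"[0-9]+"),
    ("OP",      r"==|\+\+|[=<>+\-*/]"),
    ("PUNCT",   r"[()\[\]{};]"),
    ("SKIP",    r"\s+"),              # white space is discarded
]
# One combined pattern: the alternatives play the role of the combined automaton.
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(text):
    """Turn a character stream into a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("if (b == 0) a = b;"))
```

Listing KEYWORD before IDENT is a design choice: with alternation the first matching class wins, so reserved words are not reported as identifiers.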
12
RE → NFA (Thompson's Construction)
NFA → DFA (Subset Construction)
DFA → Minimal DFA (Hopcroft Minimization)
13
Two Basic Theorems
14
Theorem NFA → DFA (Kleene)
Let M be an NFA accepting L.
Then there exists a DFA M′ that accepts L as well.
15
Finite Language Theorem
Any finite language is FSA-acceptable.
Example L = {abba, abb, abab}
FSA = Finite State Automaton = DFA/NFA
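One way to see why the theorem holds is to build the automaton directly as a tree of prefixes. The sketch below, in Python, is an illustration under that assumption; the function names trie_dfa and accepts are made up here, and missing transitions stand for an implicit dead state.

```python
def trie_dfa(words):
    """Build a partial DFA (a trie) accepting exactly the given finite set of words."""
    delta = {}          # (state, symbol) -> state
    accepting = set()
    next_state = 1      # state 0 is the start state
    for w in words:
        q = 0
        for ch in w:
            if (q, ch) not in delta:
                delta[(q, ch)] = next_state
                next_state += 1
            q = delta[(q, ch)]
        accepting.add(q)
    return delta, accepting

def accepts(dfa, w):
    delta, accepting = dfa
    q = 0
    for ch in w:
        if (q, ch) not in delta:
            return False   # missing transition = implicit dead state
        q = delta[(q, ch)]
    return q in accepting

dfa = trie_dfa(["abba", "abb", "abab"])
print([w for w in ["abba", "abb", "abab", "ab", "abbab"] if accepts(dfa, w)])
```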
16
Regular Expressions
and
Regular Languages
17
Theorem - Part 1
Languages generated by regular expressions $\subseteq$ Regular languages
1. For any regular expression $r$
the language $L(r)$ is regular
18
Theorem - Part 2
Languages generated by regular expressions $\supseteq$ Regular languages
2. For any regular language $L$ there is
a regular expression $r$ with $L(r) = L$
19
Proof - Part 1
For any regular expression $r$
the language $L(r)$ is regular
Proof by induction on the size of $r$
20
Induction Basis
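Primitive regular expressions: $\emptyset$, $\varepsilon$, $a$ (where $a \in \Sigma$)
Each has an NFA, so each denotes a regular language:
$L(M_1) = \emptyset = L(\emptyset)$
$L(M_2) = \{\varepsilon\} = L(\varepsilon)$
$L(M_3) = \{a\} = L(a)$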
21
Inductive Hypothesis
Assume
for regular expressions r1 and r2
that L(r1 ) and L(r2 ) are regular languages
22
Inductive Step
We will prove that
$L(r_1 + r_2)$
$L(r_1 \cdot r_2)$
$L(r_1^*)$
$L((r_1))$
are regular languages
23
By definition of regular expressions:
$L(r_1 + r_2) = L(r_1) \cup L(r_2)$
$L(r_1 \cdot r_2) = L(r_1)\,L(r_2)$
$L(r_1^*) = (L(r_1))^*$
$L((r_1)) = L(r_1)$
24
By the inductive hypothesis we know:
$L(r_1)$ and $L(r_2)$ are regular languages
We also know:
Regular languages are closed under
union: $L(r_1) \cup L(r_2)$
concatenation: $L(r_1)\,L(r_2)$
star: $(L(r_1))^*$
25
Therefore
$L(r_1 + r_2) = L(r_1) \cup L(r_2)$
$L(r_1 \cdot r_2) = L(r_1)\,L(r_2)$
$L(r_1^*) = (L(r_1))^*$
are regular languages
And trivially $L((r_1))$ is a regular language
26
Proof – Part 2
For any regular language $L$ there is
a regular expression $r$ with $L(r) = L$
Proof by construction of regular expression
27
Since $L$ is regular,
take the NFA $M$ that accepts it: $L(M) = L$
Assume $M$ has a single final state
28
From M construct the equivalent
Generalized Transition Graph
transition labels are regular expressions
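Example
[Figure: an NFA M with transition labels a, c and a, b, next to the equivalent generalized transition graph, in which the label a, b becomes the regular expression a + b]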
29
Reverse of a Regular
Language
30
Theorem
The reverse $L^R$ of a regular language $L$
is a regular language
Proof idea
Construct an NFA that accepts $L^R$:
invert the transitions of the NFA
that accepts $L$
31
Proof
Since $L$ is regular,
there is an NFA that accepts $L$
Example: $L = ab^* + ba$
[Figure: NFA for $L$ with transitions labeled a and b]
32
Invert Transitions
[Figure: the NFA for $L$ and the same NFA with every transition reversed]
33
Make the old initial state
a final state
and vice versa
[Figure: the reversed NFA before and after swapping the initial and final states]
34
Add a new initial state
[Figure: the reversed NFA with the new initial state]
35
The resulting machine accepts $L^R$, so $L^R$ is regular
$L = ab^* + ba$
$L^R = b^*a + ab$
[Figure: the NFA for $L$ and the resulting NFA for $L^R$]
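The reversal steps can be sketched in Python as follows. This is an illustration only: the NFA encoding (a dictionary from (state, symbol) to a set of states, with "" standing for an ε-move) and the state names s, p, q, f are assumptions made here, and the NFA is taken to have a single final state, so swapping the start and final states is enough.

```python
def eps_closure(delta, states):
    """All states reachable from `states` through epsilon moves (symbol "")."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, w):
    delta, start, final = nfa
    current = eps_closure(delta, {start})
    for ch in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, ch), set())
        current = eps_closure(delta, moved)
    return final in current

def reverse(nfa):
    """Invert every transition and swap the start and final states."""
    delta, start, final = nfa
    rdelta = {}
    for (p, sym), targets in delta.items():
        for q in targets:
            rdelta.setdefault((q, sym), set()).add(p)
    return rdelta, final, start

# NFA with a single final state for L = ab* + ba (state names are made up here).
M = ({("s", "a"): {"p"}, ("p", "b"): {"p"}, ("p", ""): {"f"},
      ("s", "b"): {"q"}, ("q", "a"): {"f"}}, "s", "f")
MR = reverse(M)
print([w for w in ["a", "ba", "bba", "ab", "abb", "b"] if accepts(MR, w)])
```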
36
Some Properties of
Regular Languages,
Summary
37
Properties
For regular languages $L_1$ and $L_2$
we will prove that:
Union: $L_1 \cup L_2$
Concatenation: $L_1 L_2$
Star: $L_1^*$
are regular languages
38
Regular languages are closed under
Union: $L_1 \cup L_2$
Concatenation: $L_1 L_2$
Star: $L_1^*$
39
Regular language $L_1$: NFA $M_1$ with $L(M_1) = L_1$ and a single final state
Regular language $L_2$: NFA $M_2$ with $L(M_2) = L_2$ and a single final state
40
Example
$M_1$: $L_1 = \{a^n b\}$
$M_2$: $L_2 = \{ba\}$
[Figure: NFAs $M_1$ and $M_2$ with transitions labeled a and b]
41
Union (Thompson’s construction)
NFA for $L_1 \cup L_2$
[Figure: NFA for the union, built from $M_1$ and $M_2$]
42
Example
NFA for $L_1 \cup L_2 = \{a^n b\} \cup \{ba\}$
[Figure: the NFAs for $L_1 = \{a^n b\}$ and $L_2 = \{ba\}$ combined into one NFA for the union]
43
Concatenation (Thompson’s construction)
NFA for $L_1 L_2$
[Figure: NFA for the concatenation, built from $M_1$ followed by $M_2$]
44
Example
NFA for $L_1 L_2 = \{a^n b\}\{ba\} = \{a^n bba\}$
[Figure: the NFA for $L_1 = \{a^n b\}$ followed by the NFA for $L_2 = \{ba\}$]
45
Star Operation
(Thompson’s construction)
NFA for $L_1^*$
[Figure: NFA for the star, built from $M_1$]
46
Example
NFA for $L_1^* = \{a^n b\}^*$
[Figure: the NFA for $L_1 = \{a^n b\}$ extended to an NFA for the star]
47
Summary: Operations on Regular Expressions
RE            Regular language description
a+b           {a, b}
(a+b)(a+b)    {aa, ab, ba, bb}
a*            {ε, a, aa, aaa, …}
a*b           {b, ab, aab, aaab, …}
(a+b)*        {ε, a, b, aa, ab, ba, bb, aaa, …}
48
Algebraic Properties
Axiom                              Description
r + s = s + r                      + is commutative
(r + s) + t = r + (s + t)          + is associative
(rs)t = r(st)                      concatenation is associative
r(s + t) = rs + rt                 concatenation distributes over +
(s + t)r = sr + tr                 concatenation distributes over +
εr = r, rε = r                     ε is the identity element for concatenation
r* = (r + ε)*                      relation between * and ε
r** = r*                           * is idempotent
49
Operator Precedence
1. Kleene star (highest precedence)
2. Concatenation
3. Union (lowest precedence)
This convention allows dropping unnecessary parentheses: a + bc* means a + (b(c*)).
50
Converting Regular
Expression to a DFA
51
Example: From a Regular Expression to an NFA
Example: (a+b)*abb, step-by-step construction
[Figure: NFA fragment for (a+b), states 1 to 6, with one branch on a (state 2 to 3) and one branch on b (state 4 to 5)]
52
Example: From a Regular Expression to an NFA
Example: (a+b)*abb
[Figure: NFA fragment for (a+b)*, states 0 to 7, wrapping the fragment for (a+b)]
53
Example: From a Regular Expression to an NFA
Example: (a+b)*abb
[Figure: complete NFA for (a+b)*abb, states 0 to 10: the fragment for (a+b)* followed by 7 -a→ 8 -b→ 9 -b→ 10]
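The construction can be sketched with three combinators in Python. This is a hedged illustration, not the exact layout of the figures: the helper names (sym, union, concat, star) and the fragment encoding are assumptions made here, and concatenation links the two fragments with an ε-move instead of merging states, so the state numbering differs from the figure.

```python
from itertools import count

_ids = count()

def _new():
    return next(_ids)

# A fragment is (start, final, edges); edges is a list of (src, label, dst),
# where the label "" stands for an epsilon move.
def sym(a):
    s, f = _new(), _new()
    return s, f, [(s, a, f)]

def union(x, y):
    s, f = _new(), _new()
    return s, f, x[2] + y[2] + [(s, "", x[0]), (s, "", y[0]),
                                (x[1], "", f), (y[1], "", f)]

def concat(x, y):
    return x[0], y[1], x[2] + y[2] + [(x[1], "", y[0])]

def star(x):
    s, f = _new(), _new()
    return s, f, x[2] + [(s, "", x[0]), (x[1], "", f),
                         (s, "", f), (x[1], "", x[0])]

def closure(edges, states):
    """Epsilon closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, lab, r) in edges:
            if p == q and lab == "" and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(frag, w):
    start, final, edges = frag
    cur = closure(edges, {start})
    for ch in w:
        cur = closure(edges, {r for (p, lab, r) in edges if p in cur and lab == ch})
    return final in cur

# (a+b)*abb
nfa = concat(concat(concat(star(union(sym("a"), sym("b"))), sym("a")), sym("b")), sym("b"))
print([w for w in ["abb", "aabb", "babb", "ab", "abba"] if accepts(nfa, w)])
```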
54
Converting FA
to a Regular Expression
55
Example
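[Figure: FA with states 1, 2, 3; start state 1; accepting states 1 and 3; transitions 1 -a,b→ 2, 2 -a→ 2, 2 -b→ 3, 3 -a→ 1, 3 -b→ 2]
Add a new start (s) and a new final (f) state, connected by ε-transitions:
• From s to the ex-starting state (1)
• From each ex-final state (1,3) to f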
56
[Figure: the original automaton, and the same automaton with the new start state s and the new final state f]
Let's remove state 1!
Each combination of an incoming and an outgoing transition of state 1 will generate
a new path once state 1 is gone:
ε ; a,b
ε ; ε
a ; a,b
a ; ε
57
When state 1 is gone we must still be able to make all
those transitions!
ε ; a,b gives a new arrow from s to 2 labeled (a+b)
[Figure: the previous automaton and the automaton with the new arrow]
58
ε ; ε gives a new arrow from s to f labeled ε
[Figure: automaton with the arrows s → 2 labeled (a+b) and s → f labeled ε]
59
a ; a,b gives a new arrow from 3 to 2 labeled a(a+b)
[Figure: automaton with the additional arrow 3 → 2 labeled a(a+b)]
60
A common mistake: having several arrows between the same
pair of states. Join the two arrows (by union of their regular
expressions) before going on to the next step.
[Figure: the two arrows from 3 to 2, labeled a(a+b) and b, joined into one arrow labeled a(a+b)+b]
61
a ; ε gives a new arrow from 3 to f labeled a
[Figure: automaton with the additional arrow 3 → f labeled a]
62
Union again: the two arrows from 3 to f, labeled ε and a, become one arrow labeled ε+a
[Figure: automaton with the joined arrow 3 → f]
63
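Without state 1...
[Figure: automaton with states s, 2, 3, f; arrows s → 2 labeled (a+b), s → f labeled ε, 2 → 2 labeled a, 2 → 3 labeled b, 3 → 2 labeled a(a+b)+b, 3 → f labeled ε+a]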
64
Now we repeat the same procedure for state 2...
[Figure: the automaton without state 1, with state 2 marked for removal]
65
Following the path s-2-3 we concatenate all strings on
our way… don't forget the loop a* in 2!
New arrow from s to 3 labeled (a+b)a*b
[Figure: automaton with the new arrow s → 3]
66
When 2 is removed, the path 3-2-3 also has to be preserved,
so we concatenate the 3-2 string, take the a* loop and go back 2-3 with
the string b!
New loop on 3 labeled (a(a+b)+b)a*b
[Figure: automaton with the new loop on state 3]
67
This is what the FA looks like without state 2:
[Figure: automaton with states s, 3, f; arrows s → 3 labeled (a+b)a*b, 3 → 3 labeled (a(a+b)+b)a*b, 3 → f labeled ε+a, s → f labeled ε]
68
Finally we remove state 3…
[Figure: the automaton without state 2, with state 3 marked for removal]
69
...so we concatenate the strings s-3, the loop in 3 and 3-f:
(a+b)a*b ((a(a+b)+b)a*b)* (ε+a)
70
Now we can omit state 3!
[Figure: states s and f with two arrows, one labeled ε and one labeled (a+b)a*b((a(a+b)+b)a*b)*(ε+a)]
From s we have two choices: the empty string or the long expression.
OR is represented by union (+), as usual.
71
So union the arrows...
ε + (a+b)a*b((a(a+b)+b)a*b)*(ε+a)
...and we are done!
72
Converting FA to a Regular
Expression: An Algorithm
73
• We expand our notion of ε-NFA to allow transitions on
arbitrary regular expressions, not simply single
symbols or ε.
• Successively eliminate states, replacing transitions that
enter and leave a state with a more complicated
regular expression, until eventually there are only two
states left: a start state, an accepting state, and a
single transition connecting them, labeled with a
regular expression.
• The resulting regular expression is then our answer.
74
• To begin with, the automaton should have a start state
that has no transitions into it (including self-loops), and
which is not accepting.
• If your automaton does not obey this property, add a
new, non-accepting start state s, and add an ε-transition
from s to the original start state.
• The automaton should also have a single accepting
final state with no transitions leaving it, and no self-loops.
• If your automaton does not have it, add a new final
state q, change all other accepting states to non-accepting,
and add ε-transitions from them to q.
These changes clearly do not change the language
accepted by the automaton.
75
Repeat the following steps, which eliminate a state:
1. Pick a non-start, non-accepting state q to eliminate. The
state q will have i transitions in and j transitions out. Each
will be labeled with a regular expression. For each of the
i·j combinations of transitions into and out of q, replace
p -A→ q -C→ r (where q has a self-loop labeled B)
with
p -AB*C→ r
and delete state q. If q has no self-loop, the new label is simply AC.
76
2. If several transitions go between a pair of states, replace
them with a single transition that is the union of the
individual transitions.
E.g. replace the two arrows from p to r labeled A and B
with a single arrow from p to r labeled A + B
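The two steps can be sketched in Python as below. This is an illustration under stated assumptions, not a definitive implementation: labels are kept as strings in Python re syntax (| instead of +, the empty string for ε, None for "no arrow") so that the result can be checked with re.fullmatch, and the three-state automaton is assumed to match the worked example (1 -a,b→ 2, 2 -a→ 2, 2 -b→ 3, 3 -a→ 1, 3 -b→ 2, start 1, accepting 1 and 3).

```python
import re

def union(a, b):
    """Join two parallel arrows; None means there is no arrow."""
    if a is None:
        return b
    if b is None:
        return a
    return f"(?:{a}|{b})"

def concat(*parts):
    if any(p is None for p in parts):
        return None
    return "".join(parts)

def star(a):
    return "" if a in (None, "") else f"(?:{a})*"

def eliminate(states, edges, start, finals):
    """edges maps (p, r) -> label. Returns one regex for the whole automaton."""
    E = dict(edges)
    E[("s", start)] = ""            # new start state s, epsilon to old start
    for q in finals:
        E[(q, "f")] = ""            # epsilon from each old final state to new final f
    remaining = list(states)
    while remaining:
        q = remaining.pop(0)        # eliminate states in the given order
        loop = star(E.get((q, q)))
        others = remaining + ["s", "f"]
        for p in others:
            for r in others:
                via = concat(E.get((p, q)), loop, E.get((q, r)))   # A B* C
                if via is not None:
                    E[(p, r)] = union(E.get((p, r)), via)          # join parallel arrows
        E = {k: v for k, v in E.items() if q not in k}             # delete state q
    return E.get(("s", "f"))

edges = {(1, 2): "(?:a|b)", (2, 2): "a", (2, 3): "b", (3, 1): "a", (3, 2): "b"}
regex = eliminate([1, 2, 3], edges, 1, [1, 3])
print(regex)
print([w for w in ["", "ab", "bb", "aba", "a", "abb"] if re.fullmatch(regex, w)])
```

Always parenthesizing unions and starred labels keeps the string-level concatenation safe, at the cost of a less readable result.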
77
Example
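[Figure: a four-state automaton (states 1 to 4) reduced in two steps. After eliminating state 2 the arrow from 1 to 3 is labeled b + ab*a; after eliminating state 3 the arrow from 1 to 4 is labeled (b + ab*a)a*b]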
78
N.B. common mistake!
Two self-loops on the same state, labeled a and b, mean
(a+b)*
i.e. a single loop labeled a, b,
NOT (a* + b*)
See the example on pp. 46 and 47 in Salling's book
79
Minimizing DFA
80
Minimization of DFA
A deterministic finite automaton is not always
the smallest possible one accepting its
language.
There may be states with the same
"acceptance behavior". States p and q have the same
acceptance behavior if, for every input word, the automaton
started in p ends in a final state exactly when the
automaton started in q does.
81
State Reduction by Set Partitioning
(the distinguishing algorithm; Swedish: Särskiljandealgoritmen)
The set partitioning technique is similar to one used
for partitioning people into groups based on their
responses to a questionnaire.
The following slides show the detailed steps for
computing equivalent state sets of the starting DFA
and constructing the corresponding reduced DFA.
82
[Figure: starting DFA with states 0 to 5, and reduced DFA with states 0, {1,2}, 3 and {4,5}, both with transitions on a and b]
83
State Reduction by Set Partitioning
Step 0: Partition the states into two groups,
accepting and non-accepting.
P1 = { 3, 4, 5 }
P2 = { 0, 1, 2 }
[Figure: the starting DFA with the two groups marked]
84
State Reduction by Set Partitioning
Step 1: Get the response of each state for each input
symbol. Notice that states 3 and 0 show different responses
from those of the other states in the same set.
[Figure: response table giving, for each state of P1 = {3, 4, 5} and P2 = {0, 1, 2}, the set (p1 or p2) reached on a and on b, next to the starting DFA]
85
Step 2: Partition the sets according to the responses,
and go to Step 1 until no partition occurs.
P11 = {4, 5}
P12 = {3}
P21 = {1, 2}
P22 = {0}
[Figure: response table for the refined sets]
No further partition is possible for the sets P11 and P21.
So the final partition results are as follows:
{4, 5}
{3}
{1, 2}
{0}
86
The minimized DFA consists of the four states of the final partition,
{4, 5}, {3}, {1, 2} and {0},
and its transitions are the ones corresponding to the
starting DFA.
[Figure: minimized DFA with states 0, {1,2}, 3 and {4,5}, next to the starting DFA with states 0 to 5]
87
DFA Minimization Algorithm
The algorithm
P ← { F, Q − F }
while (P is still changing)
    T ← { }
    for each set s ∈ P
        for each α ∈ Σ
            partition s by α into s1, s2, …, sk
            T ← T ∪ { s1, s2, …, sk }
    if T ≠ P then
        P ← T
This is a fixed-point algorithm!
Why does this work?
Partition P ⊆ 2^Q (power set)
Start off with 2 subsets of Q: F and Q − F
The while loop takes Pi → Pi+1 by splitting one or more sets
Pi+1 is at least one step closer to the partition with |Q| sets
Maximum of |Q| splits
Partitions are never combined
The initial partition ensures that final states are intact
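A minimal Python sketch of this fixed-point refinement follows. It is a simplification made for illustration: each block is split by its responses to all input symbols at once, rather than one symbol at a time, and it is not Hopcroft's worklist algorithm. The DFA used is the five-state DFA for (a+b)*abb with states A to E and accepting state E; the function name minimize is made up here.

```python
def minimize(states, alphabet, delta, finals):
    """Refine {F, Q - F} until no block splits; returns a set of frozensets."""
    P = {frozenset(finals), frozenset(set(states) - set(finals))} - {frozenset()}
    while True:
        def block_of(q):
            return next(b for b in P if q in b)
        T = set()
        for s in P:
            groups = {}
            for q in s:
                # The "response" of q: the block reached on each input symbol.
                key = tuple(block_of(delta[(q, a)]) for a in alphabet)
                groups.setdefault(key, set()).add(q)
            T |= {frozenset(g) for g in groups.values()}
        if T == P:          # fixed point reached: nothing was split
            return P
        P = T

DELTA = {("A", "a"): "B", ("A", "b"): "C",
         ("B", "a"): "B", ("B", "b"): "D",
         ("C", "a"): "B", ("C", "b"): "C",
         ("D", "a"): "B", ("D", "b"): "E",
         ("E", "a"): "B", ("E", "b"): "C"}
blocks = minimize("ABCDE", "ab", DELTA, {"E"})
print(sorted(sorted(b) for b in blocks))
```

Each block of the result becomes one state of the minimized DFA; here A and C end up in the same block.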
88
Minimization and distinguishing (Salling)
Example 2.38 A minimal automaton
Theorem 2.4
A DFA with N states is minimal if its language distinguishes N strings
[Figure: six-state automaton with transitions on a and b]
The automaton above is minimal because it has six
states and distinguishes six strings: {ε, a, b, aa, ab, aba}
89
Two strings x, y are distinguished by
the language (DFA) if there is
some (distinguishing) string z
such that exactly one of the
strings xz, yz ends in an
accepting state (belongs to the
language).
[Figure: table of pairwise distinguishing strings for ε, a, b, aa, ab, aba, next to the six-state automaton]
90
A set of strings is
distinguished by the language if
every pair of strings is
distinguished.
[Figure: table of pairwise distinguishing strings for ε, a, b, aa, ab, aba, next to the six-state automaton]
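The definition can be tried out with a short Python sketch. Since the six-state automaton of the example is only given as a figure, the sketch uses a different DFA as an assumption: the four-state DFA for (a+b)*abb with states A, B, D, E, start state A and accepting state E. The function names are made up for illustration.

```python
from itertools import product

def run(delta, start, w):
    q = start
    for ch in w:
        q = delta[(q, ch)]
    return q

def distinguishes(dfa, x, y, z):
    """True if exactly one of xz, yz is accepted."""
    delta, start, finals = dfa
    return (run(delta, start, x + z) in finals) != (run(delta, start, y + z) in finals)

def find_distinguishing(dfa, x, y, alphabet, max_len=4):
    """Shortest z (up to max_len symbols) that distinguishes x and y, or None."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            z = "".join(t)
            if distinguishes(dfa, x, y, z):
                return z
    return None

DFA = ({("A", "a"): "B", ("A", "b"): "A",
        ("B", "a"): "B", ("B", "b"): "D",
        ("D", "a"): "B", ("D", "b"): "E",
        ("E", "a"): "B", ("E", "b"): "A"}, "A", {"E"})

strings = ["", "a", "ab", "abb"]
for i, x in enumerate(strings):
    for y in strings[i + 1:]:
        print(repr(x), repr(y), repr(find_distinguishing(DFA, x, y, "ab")))
```

The four strings are pairwise distinguished and the DFA has four states, which is the situation described by Theorem 2.4.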
91
Why aaa?
We can test shorter strings. Does a work?
No, because both aba and aa
are accepted! And so on. Try it yourself!
[Figure: table of pairwise distinguishing strings for ε, a, b, aa, ab, aba, next to the six-state automaton]
97
The stepwise distinguishing algorithm (Salling)
Example 2.39 We apply the stepwise distinguishing algorithm
to the following wasteful DFA:
[Figure: DFA with states 1 to 6 and transitions on a and b]
Two strings x, y are distinguished by the language
(DFA) if there is some (distinguishing)
string z such that exactly one of the
strings xz, yz ends in an
accepting state
(belongs to the language).
106
Now we look for strings that distinguish the states!
All states: {1,2,3,4,5,6}
Distinguishing string ε:
Non-accepting: {1,2,4,5,6}
Accepting: {3}
Next distinguishing strings: a or b?
What happens when an a is read?
From 1 and 6, an a leads to 3, which is
accepting. They form a group of their own.
{1,6} {3} {2,4,5}
What happens when a b is read?
From 5, a b leads to 6,
so 5 leaves its group.
{2,4} {5} {1,6} {3}
[Figure: the DFA with states 1 to 6]
107
The minimal automaton has four states:
{2,4}
{1,6}
{5}
{3}
[Figure: minimal automaton with states {1,6}, {2,4}, {3} and {5}, with transitions on a and b]
108
The Chomsky Hierarchy
109
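[Figure: nested language classes. Non-regular languages shown: $\{a^n b^l c^{n+l} : n, l \ge 0\}$ and $\{a^{n!} : n \ge 0\}$. Context-free languages shown: $\{a^n b^n\}$ and $\{ww^R\}$. The regular languages form the innermost class.]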
110
Some Additional Examples
111
Example of subset construction for NFA → DFA
[Figure: NFA N for (a+b)*abb, with start state 0 and states 0 to 10; labeled transitions 2 -a→ 3, 4 -b→ 5, 7 -a→ 8, 8 -b→ 9, 9 -b→ 10]
112
Example of subset construction for NFA → DFA
STATE    INPUT SYMBOL
         a    b
A        B    C
B        B    D
C        B    C
D        B    E
E        B    C
Transition table for DFA
113
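Example of subset construction for NFA → DFA
[Figure: DFA with start state A and states A, B, C, D, E, where E contains NFA state 10 and is accepting; transitions as in the transition table]
Result of applying the subset construction to the NFA for (a+b)*abb.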
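The subset construction for this example can be sketched in Python. The labeled moves are those of the NFA for (a+b)*abb with states 0 to 10; the ε-moves listed in EPS are an assumption, following the usual Thompson layout for this expression. The function names and the encoding are made up for illustration.

```python
def eps_closure(eps, states):
    """All NFA states reachable from `states` through epsilon moves."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return frozenset(seen)

def subset_construction(eps, moves, start, alphabet):
    """Return (dstates, dtran): DFA states in discovery order and the transition table."""
    first = eps_closure(eps, {start})
    dstates, dtran, work = [first], {}, [first]
    while work:
        S = work.pop(0)
        for a in alphabet:
            moved = set()
            for q in S:
                moved |= moves.get((q, a), set())
            target = eps_closure(eps, moved)
            dtran[(S, a)] = target
            if target not in dstates:
                dstates.append(target)
                work.append(target)
    return dstates, dtran

# Assumed epsilon moves (Thompson layout) and the labeled moves of the NFA.
EPS = {0: [1, 7], 1: [2, 4], 3: [6], 5: [6], 6: [1, 7]}
MOVES = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}

dstates, dtran = subset_construction(EPS, MOVES, 0, "ab")
name = {S: "ABCDE"[i] for i, S in enumerate(dstates)}
for S in dstates:
    print(name[S], sorted(S), name[dtran[(S, "a")]], name[dtran[(S, "b")]])
```

A DFA state is accepting when its set contains the NFA's accepting state 10, which here is only the last state found.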
114
Another Example of State Elimination
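[Figure: NFA with states q0, q1, q2 and transitions q0 -b→ q1, q1 -b→ q1, q1 -a→ q0, q1 -a,b→ q2, q2 -b→ q2; in the generalized transition graph the label a, b becomes a+b]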
b
115
Another Example of State Elimination
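[Figure: eliminating q1 leaves states q0 and q2, with a loop on q0 labeled bb*a, an arrow from q0 to q2 labeled bb*(a+b), and a loop on q2 labeled b]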
116
Another Example of State Elimination
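Resulting Regular Expression
[Figure: states q0 and q2, with a loop on q0 labeled bb*a, an arrow from q0 to q2 labeled bb*(a+b), and a loop on q2 labeled b]
$r = (bb^*a)^*\, bb^*(a+b)\, b^*$
$L(r) = L(M) = L$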
117
Removing states
In General
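[Figure: state q with a self-loop labeled e, incoming arrows labeled a and c and outgoing arrows labeled b and d, connected to states qi and qj. After removing q, the arrows between qi and qj are labeled ae*d, ae*b, ce*d and ce*b]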
118
Obtaining the final regular expression
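[Figure: two states q0 and qf, with a loop on q0 labeled r1, an arrow from q0 to qf labeled r2, an arrow from qf to q0 labeled r3, and a loop on qf labeled r4]
$r = r_1^*\, r_2\, (r_4 + r_3 r_1^* r_2)^*$
$L(r) = L(M) = L$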
119
Example: From an NFA to a DFA
states   a    b
A        B    C
B        B    D
C        B    C
D        B    E
E        B    C
[Figure: DFA with states A, B, C, D, E and transitions as in the table]
120
Example: From an NFA to a DFA
states   a    b
A        B    A
B        B    D
D        B    E
E        B    A
[Figure: DFA with states A, B, D, E and transitions as in the table]
121
Example: DFA Minimization
     Current Partition         Split on a   Split on b
P0   {s4} {s0, s1, s2, s3}     none         {s0, s1, s2} {s3}
P1   {s4} {s3} {s0, s1, s2}    none         {s0, s2} {s1}
P2   {s4} {s3} {s1} {s0, s2}   none         none
[Figure: starting DFA with states s0 to s4 (final state s4), and minimized DFA with states {s0, s2}, s1, s3, s4]
122
Example: DFA Minimization
What about a ( b + c )* ?
[Figure: NFA with states q0 to q9 and labeled transitions q0 -a→ q1, q4 -b→ q5, q6 -c→ q7]
First, the subset construction, with entries ε-closure(move(s,*)):
DFA state   NFA states                 a      b      c
s0          q0                         s1     none   none
s1          q1, q2, q3, q4, q6, q9     none   s2     s3
s2          q5, q8, q9, q3, q4, q6     none   s2     s3
s3          q7, q8, q9, q3, q4, q6     none   s2     s3
Final states: s1, s2, s3
[Figure: DFA with states s0, s1, s2, s3 and transitions as in the table]
123
Example: DFA Minimization
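Then, apply the minimization algorithm
     Current Partition      Split on a   Split on b   Split on c
P0   {s1, s2, s3} {s0}      none         none         none
(s1, s2, s3 are the final states)
[Figure: DFA with states s0, s1, s2, s3]
To produce the minimal DFA
[Figure: minimal DFA with states s0 and s1, an arrow from s0 to s1 labeled a and a loop on s1 labeled b, c]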
124