Minimizing DFAs

Regular Expressions, equivalence
with FA
Regular Languages
• The regular languages are the languages that
DFAs accept.
• If we find an automaton that accepts the
language L then L is regular.
Regular Operations
The regular operations are the following:
• Union (∪)
• Concatenation (o)
• Star (*)
Regular Operations
• L1 ∪ L2 = {x | x ∈ L1 or x ∈ L2}
• L1 o L2 = {xy | x ∈ L1 and y ∈ L2}
• L^k = L o L o … o L (k times), with L^0 = {ε}
• L* = L^0 ∪ L^1 ∪ L^2 ∪ …, the union of L^k over all k ≥ 0
• L+ = L^1 ∪ L^2 ∪ …, the union of L^k over all k > 0
Regular Operations Examples
L1 = {a, b}, L2 = {b, c}
• L1 ∪ L2 = {a, b, c}
• L1 o L2 = {ab, ac, bb, bc}
• L1^2 = {aa, ab, ba, bb}
• L1* = {ε, a, b, aa, ab, ba, bb, aaa, aab, aba, abb,
baa, bab, bba, bbb, aaaa, aaab, aaba, …}
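The examples above can be reproduced with a small sketch that treats a finite language as a Python set of strings. The function names are illustrative, not from the slides, and since L* is infinite the sketch only computes the finite union L^0 ∪ … ∪ L^max_k.

```python
from itertools import product

def union(l1, l2):
    """L1 ∪ L2."""
    return l1 | l2

def concat(l1, l2):
    """L1 o L2 = {xy | x in L1 and y in L2}."""
    return {x + y for x, y in product(l1, l2)}

def power(l, k):
    """L^k, with L^0 = {ε} (ε is the empty string "")."""
    result = {""}
    for _ in range(k):
        result = concat(result, l)
    return result

def star_up_to(l, max_k):
    """Finite approximation of L*: the union of L^0 .. L^max_k."""
    result = set()
    for k in range(max_k + 1):
        result |= power(l, k)
    return result

L1 = {"a", "b"}
L2 = {"b", "c"}
print(sorted(union(L1, L2)))
print(sorted(concat(L1, L2)))
print(sorted(power(L1, 2)))
print(sorted(star_up_to(L1, 3), key=lambda w: (len(w), w)))
```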
Regular Languages are closed under
the Regular Operations
To show that a set is closed under an operation
we should prove that, if we apply the
operation on members of the set then the
result belongs in the set.
Regular Languages are closed under
union.
• We should prove that if L1 and L2 are regular
then L = L1 ∪ L2 is also regular.
• Since L1 and L2 are regular there exist DFAs M1
and M2 that accept them.
• It suffices to show that there exists a DFA M (or
an NFA) that accepts L.
• We construct a DFA that accepts L.
Proof Idea
• I want to construct a DFA M that will simulate
both M1 and M2. This machine should keep track
of what happened to either machine after
following a symbol a of the alphabet.
• Start from q01 in M1 and q02 in M2. After following
a symbol a the first machine will be in state q1
and the second in q2.
• I will keep information for both transitions by
keeping the pair of next states under each symbol
of the alphabet.
Proof Idea
• So starting from (q01, q02), after following a, M
will be in state (q1, q2).
• I want to compute the union of the languages.
M1 accepts a string s if s is in L1, and M2
accepts s if s is in L2. So M should accept s if
at least one of the two machines is in an accept
state after reading s. Thus the accepting
states of M will be pairs that contain at least
one accept state.
A DFA that accepts L1 ∪ L2
Assume that M1 = (Q1, Σ, δ1, q01, F1) and
M2 = (Q2, Σ, δ2, q02, F2).
Then M = (Q, Σ, δ, q0, F), where
• Q = Q1 x Q2
• δ((q1,q2), a) = (δ1(q1, a), δ2(q2, a))
• q0 = (q01, q02)
• F = (Q1 x F2) ∪ (F1 x Q2)
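The product construction above can be sketched as follows. The tuple layout (states, alphabet, delta, start, accepting states) and the two sample machines EVEN_A and ENDS_B are assumptions made for illustration; both machines are assumed to be complete DFAs over the same alphabet.

```python
from itertools import product

def union_dfa(m1, m2):
    """Product DFA for L1 ∪ L2. delta is a dict keyed by (state, symbol)."""
    q1, sigma, d1, s1, f1 = m1
    q2, _, d2, s2, f2 = m2
    states = set(product(q1, q2))                      # Q = Q1 x Q2
    delta = {((p, q), a): (d1[p, a], d2[q, a])         # run both machines in parallel
             for (p, q) in states for a in sigma}
    final = {(p, q) for (p, q) in states
             if p in f1 or q in f2}                    # F = (F1 x Q2) ∪ (Q1 x F2)
    return states, sigma, delta, (s1, s2), final

def dfa_accepts(m, s):
    """Run DFA m on string s."""
    _, _, delta, state, final = m
    for ch in s:
        state = delta[state, ch]
    return state in final

# Hypothetical M1: strings with an even number of a's.
EVEN_A = ({"e", "o"}, {"a", "b"},
          {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"},
          "e", {"e"})
# Hypothetical M2: strings that end with b.
ENDS_B = ({"n", "y"}, {"a", "b"},
          {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"},
          "n", {"y"})

M = union_dfa(EVEN_A, ENDS_B)
print(dfa_accepts(M, "ab"), dfa_accepts(M, "a"))
```

Replacing `or` with `and` in the definition of `final` would give a DFA for the intersection instead, which is one reason the pair construction is convenient.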
An NFAε that accepts L1 ∪ L2
Suppose M1 and M2 are shown in the first figure.
Add a new start state and ε-moves from this
to the original start states of M1 and M2.
[Figure: M1 and M2, and the combined NFAε with a new start state and two ε-moves]
Regular Languages are closed under
concatenation
• Suppose that L1 and L2 are two regular
languages. Then there exist two NFAs M1 and
M2 that recognize L1 and L2.
• We construct an NFAε that accepts their
concatenation L.
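An NFAε that accepts L1 o L2
Suppose M1 and M2 are shown in the first figure.
Add ε-moves from the accept states of M1 to the
start state of M2 and make the accept states of M1 non-accepting.
[Figure: M1 and M2, and the combined NFAε with ε-moves from the accept states of M1 to the start state of M2]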
Regular Languages are closed under
star operation
• Suppose that L is a regular language. Then
there exists an NFA M that recognizes L.
• We construct an NFAε that accepts L*.
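An NFAε that accepts L*
Suppose M is shown in the first figure.
Add a new start state that is also an accept state, with
an ε-move to the previous start state. Add ε-moves from
the accept states to the previous start state.
[Figure: M and the resulting NFAε for L*]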
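The three closure constructions (union, concatenation, star) can be sketched with a tiny NFAε class. The class, the helper names and the edge-list representation are assumptions for illustration; the sketch also assumes that the machines being combined have disjoint state sets, which the fresh-state counter guarantees as long as each machine is built once.

```python
from itertools import count

_fresh = count()  # source of fresh state names

class NFA:
    def __init__(self, start, accepts, edges):
        self.start = start              # single start state
        self.accepts = set(accepts)     # accept states
        self.edges = edges              # (src, label, dst); label None is an ε-move

def symbol(a):
    s, t = next(_fresh), next(_fresh)
    return NFA(s, {t}, [(s, a, t)])

def union(m1, m2):
    s = next(_fresh)                    # new start state with two ε-moves
    edges = m1.edges + m2.edges + [(s, None, m1.start), (s, None, m2.start)]
    return NFA(s, m1.accepts | m2.accepts, edges)

def concat(m1, m2):
    # ε-moves from the accept states of m1 to the start of m2;
    # only the accept states of m2 remain accepting.
    edges = m1.edges + m2.edges + [(f, None, m2.start) for f in m1.accepts]
    return NFA(m1.start, m2.accepts, edges)

def star(m):
    s = next(_fresh)                    # new start state, also accepting (for ε)
    edges = m.edges + [(s, None, m.start)] + [(f, None, m.start) for f in m.accepts]
    return NFA(s, m.accepts | {s}, edges)

def eclose(m, states):
    """All states reachable from `states` using only ε-moves."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in m.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def nfa_accepts(m, s):
    current = eclose(m, {m.start})
    for ch in s:
        moved = {dst for src, label, dst in m.edges if src in current and label == ch}
        current = eclose(m, moved)
    return bool(current & m.accepts)

# (a(b ∪ c))*
m = star(concat(symbol("a"), union(symbol("b"), symbol("c"))))
print(nfa_accepts(m, "acab"), nfa_accepts(m, "abc"))
```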
What is a regular expression
A regular expression is one of the following:
• a, where a is any symbol of the alphabet Σ
• ε
• ∅
• r1 ∪ r2, where r1 and r2 are regular expressions
• r1r2, where r1 and r2 are regular expressions
• r*, where r is a regular expression
The regular expressions represent
languages
• a represents the language {a}
• ε represents the language {ε}
• ∅ represents the empty language ∅
If r1 represents the language L1 and r2 the
language L2 then:
• r1 ∪ r2 represents the language L1 ∪ L2
• r1r2 represents the language L1 o L2
• r1* represents the language L1*
Precedence of Regular Operations
1. Star
2. Concatenation
3. Union
For example ((0 ∪ 1)1)*
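The precedence rules can be checked quickly with Python's `re` module, which uses the same ordering but writes union as `|`. The helper `matches` is an illustrative name; `re.fullmatch` is used so that the whole string has to belong to the language.

```python
import re

def matches(pattern, s):
    """True if the whole string s is in the language of pattern."""
    return re.fullmatch(pattern, s) is not None

# Star binds tighter than concatenation: 01* means 0(1*), not (01)*.
star_first = (matches("01*", "0111"), matches("01*", "0101"), matches("(01)*", "0101"))

# Concatenation binds tighter than union: 0|10 means 0 ∪ (10), not (0 ∪ 1)0.
concat_first = (matches("0|10", "10"), matches("0|10", "00"), matches("(0|1)0", "00"))

# The parenthesized example ((0 ∪ 1)1)*: every second symbol must be 1.
example = (matches("((0|1)1)*", "0111"), matches("((0|1)1)*", "10"))

print(star_first, concat_first, example)
```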
Regular expressions are equivalent
with FAs (=>)
Suppose that you are given a reg. expr. You want
to construct an NFAε that recognizes the
language that the reg. expr. represents.
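• If the expression is a symbol a then the
language that it represents is {a} and an
automaton that recognizes it has a start state
with a single a-arrow to an accept state.
• If the expression is ε then the language is {ε}
and the automaton that recognizes it is a single
state that is both the start state and an accept state.
• If the expression is ∅ then the language is ∅
and the automaton that recognizes it is a single
start state that is not accepting.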
Regular expressions are equivalent
with FAs (=>)
• We already proved that the regular languages
are closed under the regular operations.
• So by applying several regular operations on
regular expressions we make sure that the
language that the resulting regular expression
represents is also regular.
Example
Given a regular expression, construct an NFAε that
accepts the language that it represents.
1. Construct the NFAεs that accept the symbols
that are used in the regular expression
2. Build larger NFAs by combining the original
ones when a regular operation is applied
(0* ∪ 10)+ = (0* ∪ 10)*(0* ∪ 10)
[Figures: the NFAεs built step by step for 0, 1, 0*, 10, 0* ∪ 10, (0* ∪ 10)*, and finally (0* ∪ 10)*(0* ∪ 10) = (0* ∪ 10)+]
Simpler way
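The NFAε that is produced can be very complicated
and this can easily lead to mistakes. However, you
can immediately eliminate some moves that aren’t
needed.
• 0* creates a state with a loop labeled 0
• 10 creates an arrow labeled 1 followed by an arrow labeled 0
• 0 ∪ 1 creates a single arrow labeled 0,1
• r1+ creates the automaton for r1 with an ε-move from its accept states back to its start state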
[Figures: the simplified automata for 0*, 10 and 0* ∪ 10, combined into an NFAε for (0* ∪ 10)+]
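A hand-simplified automaton is easy to get wrong, so it helps to compare it against the definition on all short strings. The two-state machine below is an assumed simplification for (0* ∪ 10)+, not necessarily the one drawn on the slide: state A is both start and accept and has a 0-loop, reading 1 moves to B, and only 0 leads from B back to A. The reference language is computed directly from the definition of + as repeated concatenation.

```python
from itertools import product

def simple_nfa_accepts(s):
    """Assumed simplified automaton for (0* ∪ 10)+."""
    delta = {("A", "0"): {"A"}, ("A", "1"): {"B"}, ("B", "0"): {"A"}}
    current = {"A"}
    for ch in s:
        current = set().union(*(delta.get((q, ch), set()) for q in current))
    return "A" in current

def language_up_to(n):
    """All strings of length <= n in (0* ∪ 10)+, straight from the definition."""
    base = {"0" * k for k in range(n + 1)} | {"10"}       # 0* ∪ 10, truncated
    result = {w for w in base if len(w) <= n}
    frontier = set(result)
    while frontier:                                       # keep concatenating
        new = {x + y for x in frontier for y in base if len(x + y) <= n} - result
        result |= new
        frontier = new
    return result

def disagreements(n):
    """Strings of length <= n on which the automaton and the definition differ."""
    lang = language_up_to(n)
    return [w for k in range(n + 1)
            for w in map("".join, product("01", repeat=k))
            if simple_nfa_accepts(w) != (w in lang)]

print(disagreements(7))
```

An empty list means the automaton and the expression agree on every string up to the chosen length, which is evidence for the simplification but not a proof.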
Regular expressions are equivalent
with FAs (<=)
Suppose that you are given an NFA. You want to
construct a reg. expr. that represents the
language that the NFA accepts.
• We construct from the NFA a Generalized NFA
(GNFA) with two more states (a start and a
unique accepting state)
• We repeatedly remove all the other states one
by one.
GNFA
A GNFA is an NFA where we have regular
expressions instead of symbols on the arrows.
For example:
[Figure: a GNFA whose arrows carry labels such as 0, ε, 1*, 0+1, (0 ∪ 10*)0 and 01]
Regular expressions are equivalent
with FAs (<=)
1. Construct the GNFA from the NFA.
Add two new states:
• A start state that points with an ε-move to
the start state of the NFA
• A unique final state, with incoming ε-moves
from each accept state of the NFA.
• All the other states are non-accepting.
Notice that the NFA and the resulting GNFA are
equivalent.
[Figure: an NFA with states 1 to 5, and the GNFA obtained from it by adding s, with an ε-move to state 1, and f, with ε-moves from states 2 and 5]
Regular expressions are equivalent
with FAs (<=)
2. Remove a state
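Suppose that we want to remove qrem. We
should take into account the transitions that
were possible through qrem.
[Figure: before the removal, qi reaches qj directly with r1, or through qrem with r2 (qi to qrem), r3 (loop on qrem) and r4 (qrem to qj). After the removal, the arrow from qi to qj is labeled r1 ∪ r2 r3* r4]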
Regular expressions are equivalent
with FAs (<=)
3. When we are left with just the start and the
final states, the regular expression over the
arrow that connects them is the one we are
seeking.
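Steps 1 to 3 can be sketched as a short state-elimination routine. This is a minimal sketch under several assumptions: the NFA has no ε-moves and is given as a list of (source, symbol, destination) edges, the names "s" and "f" are not already used as states, and the output uses Python `re` syntax (`|` for union, an empty string for ε) so that it can be tested. The toy NFA at the end is hypothetical.

```python
import re
from itertools import product

def paren(r):
    return "(?:" + r + ")"

def eliminate(edges, start, accepts, order):
    """Return a Python-re pattern for the NFA, or None if its language is empty."""
    g = {}                                   # GNFA: (src, dst) -> regex string

    def add(src, dst, r):                    # parallel arrows are merged with union
        g[src, dst] = r if (src, dst) not in g else g[src, dst] + "|" + r

    for src, a, dst in edges:
        add(src, dst, re.escape(a))
    add("s", start, "")                      # step 1: new start state, ε-move
    for q in accepts:
        add(q, "f", "")                      # step 1: new final state, ε-moves

    for rem in order:                        # step 2: remove states one by one
        loop = paren(g[rem, rem]) + "*" if (rem, rem) in g else ""
        ins = [(p, r) for (p, q), r in g.items() if q == rem and p != rem]
        outs = [(q, r) for (p, q), r in g.items() if p == rem and q != rem]
        g = {k: r for k, r in g.items() if rem not in k}
        for (p, r_in), (q, r_out) in product(ins, outs):
            add(p, q, paren(r_in) + loop + paren(r_out))   # r2 r3* r4, united with r1

    return g.get(("s", "f"))                 # step 3: the label from s to f

def nfa_accepts(edges, start, accepts, s):
    current = {start}
    for ch in s:
        current = {dst for src, a, dst in edges if src in current and a == ch}
    return bool(current & set(accepts))

# Hypothetical toy NFA: state 1 is start and accept; every 1 must be followed by 0.
TOY = [(1, "0", 1), (1, "1", 2), (2, "0", 1)]
pattern = eliminate(TOY, 1, {1}, order=[2, 1])
print(pattern)
```

The resulting expression depends on the order in which states are removed; different orders give different but equivalent expressions.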
Example
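[Figure: an NFA with states 1 to 5, start state 1 and accept states 2 and 5. Arrows: 1 to 2 on 1, 2 to 2 on 0, 1 to 3 on 0, 3 to 4 on 1, 3 to 5 on 0, 4 to 5 on 0, 5 to 1 on 0]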
1. Construct the GNFA
[Figure: the NFA and its GNFA. The GNFA adds state s with an ε-move to 1, and state f with ε-moves from 2 and 5]
2-1 Eliminate state 1
Find all incoming and outgoing arrows of
state 1. Consider all possible combinations.
[Figure: state 1 has incoming arrows from s (ε) and 5 (0), and outgoing arrows to 2 (1) and 3 (0). After removing it: s to 2 on 1, s to 3 on 0, 5 to 2 on 01, 5 to 3 on 00]
2-2 Eliminate state 2
[Figure: state 2 has incoming arrows from s (1) and 5 (01), a loop on 0, and an outgoing ε-arrow to f. After removing it: s to f on 10*, 5 to f on ε ∪ 010*]
2-3 Eliminate state 3
[Figure: state 3 has incoming arrows from s (0) and 5 (00), and outgoing arrows to 5 (0) and 4 (1). After removing it: s to 5 on 00, s to 4 on 01, 5 to 5 on 000, 5 to 4 on 001]
2-4 Eliminate state 4
[Figure: state 4 has incoming arrows from s (01) and 5 (001), and an outgoing arrow to 5 (0). After removing it: s to 5 on 00 ∪ 010, 5 to 5 on 000 ∪ 0010]
2-5 Eliminate state 5
[Figure: state 5 has an incoming arrow from s (00 ∪ 010), a loop on 000 ∪ 0010, and an outgoing arrow to f (ε ∪ 010*). After removing it, the only arrow left is s to f on 10* ∪ (00 ∪ 010)(000 ∪ 0010)*(ε ∪ 010*)]
3. The regular expression
So the regular expression is
10* ∪ (00 ∪ 010)(000 ∪ 0010)*(ε ∪ 010*). Indeed:
• With 10* the automaton goes to state 2 and accepts.
• 00 and 010 are two different ways to go to 5. From
there, with 000 or 0010 we can go again to 5 (we can
do this as many times as we want, possibly none).
• When in 5 we can stay there (thus the ε) or go to 2
following 010*.
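The result can be sanity-checked by comparing the expression with the automaton on every short binary string. The transition list below is an assumption read off the explanation above (1 to 2 on 1, 2 to 2 on 0, 1 to 3 on 0, 3 to 4 on 1, 3 to 5 on 0, 4 to 5 on 0, 5 to 1 on 0, with start state 1 and accept states 2 and 5), and the expression is transcribed into Python `re` syntax, with `|` for ∪ and an empty alternative for ε.

```python
import re
from itertools import product

# Assumed transitions of the example NFA: (source, symbol, destination).
EDGES = [(1, "1", 2), (2, "0", 2), (1, "0", 3), (3, "1", 4),
         (3, "0", 5), (4, "0", 5), (5, "0", 1)]
START, ACCEPTS = 1, {2, 5}

# 10* ∪ (00 ∪ 010)(000 ∪ 0010)*(ε ∪ 010*)
PATTERN = re.compile(r"10*|(00|010)(000|0010)*(|010*)")

def nfa_accepts(s):
    current = {START}
    for ch in s:
        current = {dst for src, a, dst in EDGES if src in current and a == ch}
    return bool(current & ACCEPTS)

def first_disagreement(max_len):
    """Shortest string on which the NFA and the expression differ, or None."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            s = "".join(t)
            if nfa_accepts(s) != (PATTERN.fullmatch(s) is not None):
                return s
    return None

print(first_disagreement(10))
```

A result of `None` only shows agreement up to the tested length; the correctness argument itself is the elimination procedure.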