Chapter 2 Finite Automata

Chapter 2 Finite Automata
2.1 Deterministic Finite Automata
In this chapter we take up a severely restricted model of an actual computer called finite
automaton. It shares with a real computer the fact that its “central processor” is of fixed finite
capacity, depending on its original design. However, beyond this fixed capacity to handle
information, a finite automaton has no auxiliary memory at all. It receives its input as a strings; the
input string is delivered to it on an input tape. It delivers no output at all, accept an indication of
whether the input is considered acceptable. It is, in other words, a language recognition device.
Strings are fed into the device by means of an input tape, which is divided into squares, with one
symbol inscribed in each tape square (see Figure 2-1).
Input
Tape
a
b
a
a
b
b
a
b
Reading head
q0
q5
q1
q4
q2
Finite Control
q3
Figure 2-1
The main part of the machine itself is a “black box” with innards that can be, at any
specified moment, in one of a finite number of distinct internal states. This black box – called the
finite control – can sense what symbol is written at any position on the input tape by means of a
movable reading head. Initially, the reading head is placed at the leftmost square of the tape and
the finite control is set in a designed initial state. At regular intervals the automaton reads one
symbol from the input tape and then enters a new state that depends only on the current state and
the symbol just read. After reading an input symbol, the reading head moves one square to the right
on the input tape so that on the next move it will read the symbol in the next tape square. This
process is repeated again and again; a symbol is read, the reading head moves to the right, and the
state of the finite control changes, Eventually the reading head reaches the end of the input string.
The automaton then indicates its approval or disapproval of what it has read by the state it is in at the
end: if it winds up in one of a set of final states the input string is considered to be accepted. The
language accepted by the input string is considered to be accepted. The language accepted by
the machine is the set of string it accepts.
Definition 2.1 A deterministic finite automaton is a quintuple M = (K, , , s, F)
Where
K is a finite set of states,
 is an alphabet,
s  K is the initial state,
F  K is the set of final states,
And , the transition function, is a function from K  to K.
Since a deterministic finite automaton is not allowed to move its reading head back into the
part of the input string that has already been read, the portion of the string to the left of the reading
head cannot influence the future operation of the machine. Thus a configuration is determined by the
current state and the unread part of the string being processed. In other words, a configuration of
a deterministic finite automaton (K, , , s, F) is any element of K *. For example, the
configuration illustrated in Figure 2-1 is (q2, aabbab).
The binary relation |-M holds between two configurations of M if and only if the machine can
pass from one to the other as a result of a single move. Thus if (q, w) and q’, w’) are two
configurations of M, then (q, w) |-M (q’, w’) if and only if w = w’ for some symbol , and (q, )
= q’. In this case we say that (q, w) yields (q’, w’) in one step. We denote the reflexive, transitive
closure of |-M by |-M* (yields in zero or more steps) and |-M+ (yields in one or more steps).
A string w * is said to be accepted by M if and only if there is a state q  F such that
(s, w) |-M* (q, ). Finally, the language accepted by M, L(M), is the set of all strings accepted by M.
If M is given the input aabba, its initial configuration is (q0, aabba). Then
(q0, aabba)
|-M (q0, abba)
|-M (q0, bba)
|-M (q1, ba)
|-M (q0, a)
|-M (q0, ).
Therefore (q0, aabba) |-M* (q0,) and so aabba is accepted by M.
Example
(a) Consider the deterministic finite automaton M = (K, , , s, F), where
K = { q0 , q 1 }
 = { 0, 1 }
s = q0
F = { q0 }
And  is the function tabulated below.
q

(q,)
q0
q0
q1
q1
0
1
0
1
q0
q1
q1
q0
Then L(M) is the set of all strings in {0, 1}* that have an even number of 1’s.
The tabular representation of the transition function used in this example is not the clearest
description of a machine. We generally use a more convenient graphical representation called the
state diagram (Figure 2-2). Final states are indicated by double circles, and the initial state is
shown by a >.
1
0
>
0
q0
1
q1
Figure 2-2
(b) For the following state diagram:
0
1
1
0
> q0
q1
q2
1
0
L(M) = { x : x is a binary number, the remainder of x divided by 11 (or 3) is not 0 }. If we changed the
set of final state to {q0}, then L(M) = { x : x is a binary number, the remainder of x divided by 11 (or 3)
is 0 }. If we changed the set of final state to {q1}, then L(M) = { x : x is a binary number, the
remainder of x divided by 11 (or 3) is 1 }. If we changed the set of final state to {q2}, then L(M) = { x :
x is a binary number, the remainder of x divided by 11 (or 3) is 10 (or 2) }.
2.2 Nondeterministic Finite Automata
Whereas deterministic finite automata are restricted to having exactly one transition from a
state for each a , a nondeterministic finite automaton may have any number of transitions for a
given input symbol, including zero transitions.
To see that a nondeterministic finite automaton can be a much more convenient device to
design than a deterministic finite automaton, consider the language L = (ab  aba)*, which is
accepted by the deterministic finite automaton :
a
q0
q1
q2
q3
>
a
b
a
a
b
b
b
a
b
q4
One might hope to find a simpler deterministic finite automaton accepting L, but it can be shown
formally that this is impossible. However, L is accepted by the simple nondeterministic device:
q0
b
q1
>
a
a
b
q2
Sometimes it is convenient to allow state diagrams in which arrows can be labeled either by symbols
in  or by the empty string :
q0
a
q1
>
a

b
q2
Once we have allowed arrows labeled by , we may as well generalize our diagrams in the other
direction and allow as a label any string in *. Thus, the language L is also accepted by the onestate device:
v
ab
aba
q0
Remark! In text book, definition of nondeterministic finite automaton according to the above figure i.e.
, the transition relation, is a finite subset of K K. For example  = { (q0, ab, q0), (q0, aba, q0)}.
But in our following definition,  is a little bit different. In our definition, the equivalent of the above
figure is:
b
a
a
b
>
q0
a
Definition 2.2 A nondeterministic finite automaton is a quintuple M = (K, , , s, F)
Where
K is a finite set of states,
 is an alphabet,
s  K is the initial state,
F  K is the set of final states,
and  : K ({})  2K
Example One of the several possible nondeterministic finite automata that accept the set of all strings
containing an occurrence of the pattern bb or of the pattern bab. Formally, this machine is
M = (K, , , s, F), where
K = { q0 , q 1 , q 2 , q 3 , q 4 }
 = { a, b }
s = q0
F = { q4 }
and  : { q0, q1, q2, q3, q4} ({})  {,{ q0},{ q1},{ q2},{ q3}, { q4}, {q0, q1},{ q0, q2},{q0, q3},
{q0, q4}, … { q0, q1, q2, q3, q4} } is given below:

q0
q1
q2
q3
q4
b

{q0, q1} 
{q2}


{q4}
{q4} 
{q4}

a
{q0}
{q3}


{q4}
a
b
q1
b
q2
q0 >

a
b
b
a
q3
Figure 2-3
q4
b
When M is given the string bababaab as input, several different sequences of moves may ensue. For
example, M may wind up in the nonfinal state q0 in case the only transitions used are (q0, a, q0) and
(q0, b, q0).
(q0, bababaab) |-M (q0, ababaab)
|-M (q0, babaab)
|-M (q0, abaab)
|-M (q0, baab)
|-M (q0, aab)
|-M (q0, ab)
|-M (q0, b)
|-M (q0, )
The same input string may drive M from state q0 to the final state q4, and indeed may do so in three
different ways. One of these ways is the following.
(q0, bababaab) |-M (q1, ababaab)
|-M (q3, babaab)
|-M (q4, abaab)
|-M (q4, baab)
|-M (q4, aab)
|-M (q4, ab)
|-M (q4, b)
|-M (q4, )
A string is accepted by a nondeterministic finite automaton if and only if at least one sequence of
moves leading to a final state, it follows that bababaab  L(M).
2.3 Equivalence of Deterministic and Nondeterministic Finite Automata
Although nondeterministic finite automata appear to be more general than deterministic
finite automata, they are nevetheless no more powerful in terms of the languages they accept: A
nondeterministic finite automaton can always be converted into a deterministic one.
Definition 2.3 Finite automata M1 and M2 are said to be equivalet if and only if L(M1) = L(M2).
Theorem 2.1 For each nondeterministic finite automaton, there is an equivalent deterministic finite
automaton.
Proof: Let M = (K, , , s, F) be a nondeterministic finite automaton. We shall now construct a
deterministic finite automaton M’ = (K’, , , s’, F’). The key idea is to view a nondeterministic finite
automaton as occupying, at any moment, not a single state but a set of states: namely, all the states
that can be reached from the initial state by means of the input consumed thus far. Thus if M had five
states {q0, …, q4} and, after reading a certain input string, it could be in state q0, q2, or q3 but not q1
or q4, its state could be considered to be the set {q0, q2, q3}, rather than an undetermined member of
that set. And if the next input symbol could drive M from q0 to q1 or q2, from q2 to q0, and from q3 to
q2, then the next state of M could be considered to be the set {q0, q1, q2}.
The construction formalizes this idea. The set of states of M’ will be 2K. The set of final states
of M’ will consist of all those subsets of K that contain at least one final state of M. The definition of
the transition function of M’ will be slightly more complicated. The basic idea is that a move of M’ on
reading an input symbol  imitates a move of M on input symbol , followed by some number
of moves of M on which no input is read. To formalize this idea we need a special definition.
For any state q  K, let -CLOSURE(q) be the set of all states of M that are reachable from
state q without reading any input. That is,
-CLOSURE(q) = { p  K : (q, ) |-M* (p, ) }.
Example In the automaton of Figure 2-4, -CLOSURE(q0) = {q0, q1, q2, q3}, -CLOSURE(q1) = {q1, q2,
q3}, -CLOSURE(q2) = {q2}, -CLOSURE(q3) = {q3}, and -CLOSURE(q4) = {q3, q4}.

q1
q3
a

>
q0
b

a
a

q2
b
q4
Figure 2-4
Now define M’ = (K’, , , s’, F’),
where
K’ = 2K,
s’ = -CLOSURE(s),
F’ = { Q K : Q  F  }
And for each Q  K and each symbol , (Q, ) =  {-CLOSURE(p) : p  K and
(q, , p)  for some q  Q}.
For example, if M is the automaton of Figure 2-4, then s’ = -CLOSURE(q0) = {q0, q1, q2, q3}.
Since (q1, a, q0)  and (q1, a, q4) , it follows that ({q1}, a) = -CLOSURE(q0) CLOSURE(q4) = { q0, q1, q2, q3, q4}. Similarly, (q0, b, q2)  and (q2, b, q4) , so ({q0, q2}, b) =
-CLOSURE(q2) -CLOSURE(q4) = { q2, q3, q4}.
It remains to show that M’ is deterministic and equivalent to M. The demonstration that M’ is
deterministic is straightforward since  is single value and well defined by the way it was
constructed.
We now claim that for any string w *, and any states q, p  K,
(q, w) |-M* (p, ) if and only if (-CLOSURE(q), w) |-M’* (P, )
for some set P containing p. From this the theorem will follow easily. To show that M and M’ are
equivalent, consider any string w *. Then w  L(M) if and only if (s, w) |-M* (f, ) for some f  F
(by definition) and if and only if (-CLOSURE(s), w) |-M* (Q, ) for some Q containing f (by the claim);
in other words, if and only if (s’, w) |-M’* (Q, ) for some Q  F’. The last condition is the definition of w
 L(M’).
We prove the claim by induction on |w|.
Basis Step. For |w| = 0 – that is for w =  - we must show that (q, ) |-M* (p, ) if and only if (CLOSURE(q), ) |-M’* (P, ) for some set P containing p. The first statement is equivalent to saying
that p  -CLOSURE(q). Since M’ is deterministic and must therefore read an input symbol on every
move, the second statement is equivalent to saying that P = -CLOSURE(q) and P contains p, that is,
p  -CLOSURE(q). This completes the proof of the basis step.
Induction Hypothesis. Suppose that the claim is true for all strings w of length k or less for some k
 0.
Induction Step. We prove the claim for any string w of length k + 1, Let w = va, where a  and
v *.
(=>) First suppose that (q, w) |-M* (p, ). Then there are states r1 and r2 such that
(q, va) |-M* (r1, a) |-M(r2, ) |-M* (p, ).
That is, M reaches state p from state q by some number of moves during which input v is read,
followed by one move during which input a is read, followed by some number of moves during which
no input is read. Since (q, va) |-M* (r1, a), then (q, v) |-M* (r1, ), and since |v| = k, by the induction
hypothesis (-CLOSURE(q), v) |-M’* (R1, ) for some set R1 containing r1. Since (r1, a) |-M (r2, ), (r1, a,
r2) ; then by the construction of M’, -CLOSURE(r2) (R1, a). But since (r2, ) |-M* (p, ), p 
-CLOSURE(r2) and therefore p (R1, a). Therefore (R1, a) |- M’ (P, ) for some P containing p and
(-CLOSURE(q), va) |- M’* (R1, a) |-M’ (P, ).
(<=) Now suppose that (-CLOSURE(q), va) |-M’*(R1, a) |-M’ (P, ) for some P such that p  P and
some R1 such that (R1, a) = P. Now by the definition of , (R1, a) is the union of all sets CLOSURE(r2), where for some state r1  R1, (r1, a, r2) is a transition of M. Since p  P = (R1, a),
there is some particular r2 such that p  -CLOSURE(r2) and for some r1  R1, (r1, a, r2) is a
transition of M. Then (r2, ) |-M* (p, ) by the definition of -CLOSURE(r2). Also, by the induction
hypothesis, (q, v) |-M* (r1, ), and therefore (q, va) |-M* (r1, a) |-M (r2, ) |-M* (p, ).
This completes the proof of the claim and the theorem.
Example Let M be the automaton of Figure 2-4. Since M has 5 states, M’ will have 25 = 32 states.
However, only a few of these states will be relevant to the operation of M’ –namely, those states that
can be reached from state s’ by reading some input string.
Since s’ = -CLOSURE(q0) = {q0, q1, q2, q3},
(q1, a, q0), (q1, a, q4) , and (q3, a, q4) are all the transitions (q, a, p) for some q  s’. It follows that
(s’, a) = -CLOSURE(q0) -CLOSURE(q4) = {q0, q1, q2, q3, q4}.
Similarly, (q0, b, q2), (q2, b, q4) are all the transitions (q, b, p) for some q  s’, so
(s’, b) = -CLOSURE(q2) -CLOSURE(q4) = { q2, q3, q4}.
Repeating this calculation for the newly introduced states, we have the following.
({ q0, q1, q2, q3, q4}, a) = {q0, q1, q2, q3, q4}
({ q0, q1, q2, q3, q4}, a) = {q2, q3, q4}
({q2, q3, q4}, a) = -CLOSURE(q4) = { q3, q4}
({q2, q3, q4}, b) = -CLOSURE(q4) = { q3, q4}
Finally,
({q3, q4}, a) = -CLOSURE(q4) = { q3, q4}
({q3, q4}, b) = 
and
(, a) = (, b) = .
The relevant part of M’ is illustrated in Figure 2-5. F’, the set of final states, contains each set of states
of which q4 is a member, since q4 is the sole member of F; so in the illustration, the three states {q0,
q1, q2, q3, q4}, {q2, q3, q4}, and {q3, q4} of M’ are final.
> {q0, q1, q2, q3}
a
a
{q0, q1, q2, q3, q4}
b
b
{q2, q3, q4}
a
{q3, q4}
a
b
Figure 2-5
A minimization Algorithm
There is a simple method for finding the minimum state of a deterministic automaton. Rather
than give a formal algorithm for computing, we work through an example. First some terminology is
needed. Let  be the equivalence relation on the states of a deterministic finite automaton, such that
p  q if and only if for each input string x, (p, x) is an accepting state if and only if (q, x) is an
accepting state, we say p is equivalent to q. We say that p is distinguishable from q if there exists an
x such that (p, x) is in F and (q, x) is not, or vice versa.
Next for each pair of states p and q that are not already known to be distinguishable we
consider the pairs of states r = (p, a) and s = (q, a) for each input symbol a. If states r and s have
been shown to be distinguishable by some string x, then p and q are distinguishable by string ax.
Thus if the entry (r, s) in the table has an X, an X is also placed at the entry (p, q). If the entry (r, s)
does not yet have an X, then the pair (p, q) is placed on a list associated with the (r, s) – entry. At
some future time, if the (r, s) entry receives an X, then each pair on the list associated with the (r, s)entry also receives an X.
Example Let M be the finite automaton of Figure 2-6.
0
> a
0
b
1
e
1
1
1
0
1
c
0
0
0
1
f
1
0
g
1
d
h
0
Figure 2-6
We constructed a table with an entry for each pair of states. An X is placed in the table each
time we discover a pair of states that cannot be equivalent. Initially an X is placed in each entry
corresponding to one final state and one nonfinal state. In our example, we place an X in the entries
(a, c), (b, c), (c, d), (c, e), (c, f), (c, g), and (c, h).
Continuing with the example, we place an X in the entry (a, b), since the entry ((b, 1), (a, 1)) =
(c, f) already has an X. Similarly, the (a, d)-entry receives an X since the entry ((a, 0), (d, 0)) =
(b, c) has an X. Consideration of the (a, e)-entry on input 0 results in the pair (a, e) being placed on
the list associated with (b, h). Observe that on input 1, both a and e go to the same state f and hence
no string starting with a 1 can distinguish a from e. Because of the 0-input, the pair (a, g) is placed
on the list associated with (b, g). When the (b, g)-entry is considered, it receives an X on account of a
1-input, and hence the pair (a, g) receives an X since it was on the list for (b, g). The string 01
distinguishes a from g.
On completion of the table in Figure 2-7, we conclude that the equivalent states are a  e,
b  h, and d  f. The minimum-state finite automaton is given in Figure 2-8.
b
c
d
e
f
g
h
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
Figure 2-7
a
b
c
d
e
f
g
>
0
{a, e}
0
1
{b, h}
1
{g}
0
1
1
1
0
{c}
0
{d, f}
Figure 2-8
2.4 Properties of the Languages Accepted by Finite Automata
Theorem 2.2 The class of languages accepted by finite automata is closed under:
(a)
(b)
(c)
(d)
(e)
union;
concatenation;
Kleene star;
Complementation;
Intersection.
Proof:
(a) Union. Let L1 and L2 be languages accepted by nondeterministic automata M1 and M2,
respectively. Let M1 = (K1, , 1, s1, F1) and M2 = (K2, , 2, s2, F2). Without loss of
generality, we may assume that K1 and K2 are disjoint sets.
We construct a nondeterministic finite automaton M that accepts L(M1) L(M2) as follows:
M = (K, , , s, F), where s is a new state not in K1 or K2,
K = K1  K2  {s}
F = F1  F2
 = 1  2  {(s, , s1), (s, , s2)}.
S1 >
s2 >
F1
F2
M1
M2
s
>


s2
s1
F
M
(b) Concatenation. We construct a nondeterministic finite automata as follow:
S1 >
s2

F1
F2
M
(c) Kleene star. We construct a nondeterministic finite automata as follow:
S1 >
F1
s’1 >


s1
M1
M
(d) Complementation. Let M = (K, , , s, F) be a deterministic finite automaton. Then the
complementary language * - L(M) is accepted by the deterministic finite automaton
M = (K, , , s, K - F).
That is, M is identical to M except that final and nonfinal states are interchanged.
(e) Intersection. If L1 and L2 are languages accepted by finite automata M1 and M2, then
L1  L2 = * - ((* - L1)  (* - L2))
and L1  L2 is accepted by a finite automaton by (a) and (d) above.
Theorem 2.3 There are algorithms for answering the following questions about finite automata.
a) Given a finite automaton M and a string w, is w  L(M)?
b) Given a finite automaton M, is L(M) = ?
c) Given a finite automaton M, is L(M) = *?
d) Given a finite automaton M1 and M2, is L(M1)  L(M2)?
e) Given a finite automaton M1 and M2, is L(M1) = L(M2)?
Proof. We may assume that the finite automata are in each case deterministic. Question (a) can be
answered simply by tracing the operation of M on input w for |w| steps, since M consumes one input
symbol on every step.
Question (b) can be answered by checking whether there is any sequence of arrows in the state
diagram that leads from the initial state to a final state; this can be easily done since the state
diagram is finite.
Question (c) find a finite automaton M’ such that L(M’) = * - L(M), and ask whether L(M’) = .
Question (d) ask whether (* - L(M2))  L(M1) = , which can be answered by (b) and closure
under intersection.
Question (e) ask whether L(M1)  L(M2) and L(M2)  L(M1).
2.5 Finite Automata and Regular Expressions
Theorem 2.4 A language is regular if and only if it is accepted by a finite automaton.
Proof: () Since a regular language is represented by regular expression. If r be a regular
expression, then there exists an NFA that accepts L(r).
Recall that the regular expression contains , , and the singletons {a} for some a in 
and closed under union, concatenation, and Kleen star. It is evident (see figure 2-9) that , , or {a}
are accepted by NFA.
a
>
>
>
q0
q0
qf
q0
qf
(a) r = 
(b) r = 
(c) r = a
Figure 2-9
By theorem 2.2 the finite automaton languages are closed under union, concatenation, and Kleene
star. We can construct NFA for union, concatenation, and Kleene star from that operator of regular
expression. Hence every regular language is accepted by some finite automaton.
() Let M = (K, , , s, F) be a deterministic finite automaton; we need to show that there is a
regular language R such that R = L(M). We represent L(M) as the union of many (but a finite number
of) simple languages. Let K = {q1, …, qn}, s = q1. For i, j = 1, …, n and k = 1, …, n + 1, we let Rk(i, j)
be the set of all strings in * that drive M from qi to qj without passing through any state numbered k
or greater. Formally,
Rk(i, j) = { x * : (qi, x) |-M* (qj, ), and, if
(qi, x) |-M* (ql, y) for some y *,
then l < k, or, y =  and l = j,
or, y = x and l = i }.
Note that “without passing through”, we mean both entering and then leaving. Thus i and j may be
greater than k.
When k = n + 1, it follows that
Rn+1(i, j) = { x * : (qi, x) |-M* (qj, ) }.
Therefore L(M) =  { Rn+1(1, j) : qj  F }.
The crucial point is that each set Rk(i, j) is regular, and hence so is L(M). The proof is by induction on
k. For k = 1, we have the following.
{ a  : (qi, a) = qj }
if i  j
1
R (i, j) =
(1)
{  }  {a  : (qi, a) = qj } if i = j
Each of these sets is finite and therefore regular. For k = 1, …, n, provided that all the sets Rk(i, j)
have been defined, each set Rk+1(i, j) can be defined in terms of previously defined languages as
Rk+1(i, j) = Rk(i, j)  Rk(i, k) Rk(k, k)* Rk(k, j).
(2)
This equation states that to get from qi to qj without passing through a state numbered greater than k,
M may either
1. go from qi to qj without passing through a state numbered greater than k – 1; or
2. go (a) from qi to qk; then
(a) from qk to qk repeatedly; then
(b) from qk to qj;
in each case without passing through a state numbered greater than k – 1.
Therefore if each language Rk(i, j) is regular, so is each language Rk+1(i, j). This completes the
induction.
Example
(a) Construct a nondeterministic finite automaton accepting the language denoted by (ab  aab)*.
state1:
a
a
>
b
b
>
state 2:
ab
a

b
a

a
>
aab
>

b
state 3:
ab  aab

a
b

>

a

a
b

state 4:
(ab  aab)*
>


a

b
a

a


b


(b) Write a regular expression that accepting the same language with the following deterministic
finite automaton:
a
b
a
b
q2
a
>
q1
b
q3
Rather than calculating all the sets Rk(i, j), we can reduce the computational labor somewhat by
starting from equation (2) for k = n + 1 and working back to the case k = 1 by using equation (2)
repeatedly. Since F = { q2 }, we must find L(M) = R4(1, 2).
By equation (2),
R4(1, 2) = R3(1, 2)  R3(1, 3) R3(3, 3)* R3(3, 2).
Thus we need four of the sets Rk(i, j) for k = 3. These are as follows.
R3(1, 2) = R2(1, 2)  R2(1, 2) R2(2, 2)* R2(2, 2).
R3(1, 3) = R2(1, 3)  R2(1, 2) R2(2, 2)* R2(2, 3).
R3(3, 3) = R2(3, 3)  R2(3, 2) R2(2, 2)* R2(2, 3).
R3(3, 2) = R2(3, 2)  R2(3, 2) R2(2, 2)* R2(2, 2).
Expanding further to ontain equations for the required sets Rk(i, j) for k = 2 gives these values.
R2(1, 2) = R1(1, 2)  R1(1, 1) R1(1, 1)* R1(1, 2).
R2(2, 2) = R1(2, 2)  R1(2, 1) R1(1, 1)* R1(1, 2).
R2(1, 3) = R1(1, 3)  R1(1, 1) R1(1, 1)* R1(1, 3).
R2(2, 3) = R1(2, 3)  R1(2, 1) R1(1, 1)* R1(1, 3).
R2(3, 3) = R1(3, 3)  R1(3, 1) R1(1, 1)* R1(1, 3).
R2(3, 2) = R1(3, 2)  R1(3, 1) R1(1, 1)* R1(1, 2).
The sets Rk(i, j) for k = 1 are found using equation 1.
R1(1, 1) = R1(2, 2) =R1(3, 3) = a  
R1(1, 2) = R1(2, 3) =R1(3, 1) = b
R1(2, 1) = R1(3, 2) =R1(1, 3) = 
Substituting back into the equation for k = 2, we have the following.
R2(1, 2) = b  (a   ) (a  )* b = a*b
R2(2, 2) = (a  )   (a  )* b = (a  )
R2(1, 3) =   (a  ) (a  )* = 
R2(2, 3) = b   (a  )*  = b
R2(3, 3) = (a  )  b (a  )* = (a  )
R2(3, 2) =   b (a  )* b = ba*b
The equation for k = 3 are as follows.
R3(1, 2) = a*b (a*b)(a  )*(a) = a*ba*
R3(1, 3) =   a*b)(a  )*(b) = a*ba*b
R3(3, 3) = ( a  )  (ba*b) (a  )*b = (a  )  ba*ba*b
R3(3, 2) = ba*b  (ba*b) (a  )*a = ba*ba*
Finally,
L(M) = R4(1, 2) = a*ba*  (a*ba*b)(a    ba*ba*b)* (ba*ba*).
Remark! The following identities for regular expression r, s, and t are true. Here r = s means L ( r ) =
L( s ).
(a) r + s = s + r
(b) (r + s) + t = r + (s + t)
(c) (rs)t = r(st)
(d) r(s + t) = rs + rt
(e) (r +s)t = rt + st
(f) * = 
(g) (r*)* = r*
(h) ( + r)* = r*
(i) (r*s*)* = (r + s)*
2.6 Proofs that Language are and are not Regular
Although we now have a variety of powerful techniques for showing that languages are regular,
as yet we have none for showing that languages are not regular. We know on fundamental
principles that nonregular languages do exist, since the number of regular expressions (or the
number of finite automata) is countable, whereas the number of languages is uncountable.
Pumping Lemma for Regular Languages
Lemma 1. Let L be a regular language. Then there is a constant n such that if z  L and | z|  n, we
may write z = uvw in such a way that | uv |  n, | v |  1, and for all i  0 , uviw is in L. Furthermore,
n is no greather than the number of states of the smallest finite automata accepting L.
Example: n = 3
0
1
1
> q0
0
q1
q2
1
0
z= 010 L and | z |  n, u =  , v = 0, w = 10 ,| v | 1, |uv|=1 n => uviw  L for i  0
or z=101  L and | z|  n, u = 10, v = 1, w =  10 ,| v | 1, |uv|=1 n => uviw  L for i  0.
To proof that L is not a regular language:
- Assume L is regular, therefore there exists some DFA to accept L. This DFA has some number of
state, we will call it n.
-
-
Every string in L which is sufficiently long ( z  L and | z|  n) must be pumpable (have some
loop in DFA) as follows: z = uvw
such that
| uv |  n, | v | > 0 and  i , uviw  L.
Find one long string in L, which there is no way you can break it up into pumpable pieces. Then
L can not be regular language.
Example:
a) L = { a x*x | x  0 } is not a regular language.
Proof: Let z = a n*n  L and z = uvw such that | uv |  n , | v | > 0 .
Consider z’ = uvvw  L,
| z’ | = | z | + | v |
| z | = n2 < | z’ | = | z | + | v |  n2 + n < (n+1)2
That is, the length of uv2w lies properly between n2 and (n+1)2 and is thus not a perfect square. Thus
uv2w is not in L, a contradiction.
b) L = { axbx | x  0 } is not a regular language.
Proof: Let z = anbn L and z = uvw such that | uv |  n , | v | > 0.
Case 1: v consists entirely of a’s i.e. z = uvw such that u = ap, v = aq , w = arbs, where p, r  0, and q,
s > 0.
Consider z’ = uviw = a p+iq+r bs, for i  0, then z’  L because at most one of these strings (i =
1 only) has equal numbers of a’s and b’s, contradiction.
Case 2: v consists entirely of b’s; this is similar to case 1.
Case 3: v contains both a’s and b’s. Then for i > 1, uviw has an occurrence of b preceding an
occurrence of a and therefore cannot be in L, contradiction.
Note! Be careful! L = { axbx | x > 0 }  L’ = { x | x is a string with equal number of 0 and 1}
L’ is a regular language.
L’ = { 01, 0101, 0011, 110010, … }
c) L = {ax | x is prime} is not regular.
Proof: Let z = uvw such that | uv |  n , | v | > 0 and u = ap, v = aq, and w = ar, where p, r  0 ,
q > 0 and p+q+r is prime.
Consider z’ = uviw = a p+iq+r , for i  0, then z’  L because p+iq+r is not necessary be a
prime for every i. For example, i = p+2q+r+2; then p +iq+r = (q+1)(p+2q+r), which is a product
of two natural numbers, each greater than 1.
Two-way Finite Automata
We have viewed the finite automaton as a machine that reads a tape, moving one square
right at each move and changes state. We added nondeterminism to the model, which allowed many
“copies” of the finite control to exist and scan the tape simultaneously and we added -transitions,
which allowed change of state without reading the input symbol or moving the tape head. Another
interesting extension is to allow the tape head the ability to move left as well as right. Such a finite
automaton is called a two-way finite automaton. It accepts an input string if it moves the tape head off
the right end of the tape, at the same time entering an accepting state. We shall see that even this
generalization does not increase the power of the finite automaton; two-way FA accept only regular
language.
A two-way deterministic finite automaton (2DFA) is a quintuple M = (K, , , s, F), where K,
, s, and F are as before, and  is a map from K  to K {L, R}. If (q, a) = (p, L), then in state
q, scanning input symbol a, the 2DFA enters state p and moves its head left one square. If (q, a) =
(p, R), the 2DFA enters state p and moves its head right one square.
Example,
q0
q1
q2
0
(q0, R)
(q1, R)
(q0, R)
Consider the input 101001.
q0101001 |-1q101001
|- 10q11001
|- 1q201001
|- 10q01001
|- 101q1001
|- 1010q101
|- 10100q11
|- 1010q201
|- 10100q01
|- 101001q1
Thus 101001 is in L(M).
1
(q1, R)
(q2, L)
(q2, L)
We leave as an homework to find that one-way and two-way finite automata are equivalence, i.e.
proof that if L is accepted by a 2DFA, then L is a regular language.
Finite Automata with output
One limitation of the finite automaton as we have defined it is that output is limited to a
binary signal: “accept”/ “don’t accept”. Models in which the output is chosen from some other
alphabet have been considered. There are two distinct approaches; the output may be associated
with the state (called a Moore machine) or with the transition (called a Mealy machine). We notice
that the two machine types produce the same input-output mappings.
Moore machines
A Moore machine is a six-tuple (K, , , , , s), where K, , , and s are as in the DFA.  is the
output alphabet and  is a mapping from K to  giving the output associated with each state. The
output of M in response to input a1a2 … an , n0, is (q0) (q1) … (qn) , where q0, q2, … , qn is the
sequence of states such that (q i-1, ai) = qi for 1  i  n. Note that any Moore machine gives output
(q0) in response to input . The DFA may be viewed as a special case of a Moore machine where
the output alphabet is {0, 1} and state q is “accepting” if and only if (q) = 1.
Example, Suppose we wish to determine the residue mod 3 for each binary string treated as a binary
integer. To begin, observe that if i written in binary is followed by a 0, the resulting string has value
2*i, and if i in binary is followed by a 1, the resulting string has value 2*i + 1. If the remainder of i/3 is
p, then the remainder of 2*i/3 is 2*p mod 3. If p = 0, 1, or 2, then 2*p mod 3 is 0, 2, or 1, respectively.
Similarly, the remainder of (2*i + 1)/3 is 1, 0, or 2, respectively.
It suffices therefore to design a Moore machine with three states, q0, q1, and q2, where qj is
entered if and only if the input seen so far has residue j. We define (qj) = j for j = 0, 1, and 2. The
following figure shows the transition diagram, where outputs label the states.
>q0
1
q1
0
q2
0
1
2
0
1
0
1
On input 1010 the sequence of states entered is q0, q1, q2, q2, q1, giving output sequence 01221.
That is ,  (which has “value” 0) has residue 0, 1 has residue 1, 2 (indecimal) has residue 2, 5 has
residue 2, and 10 (in decimal) has residue 1.
Mealy machines
A Mealy machine is a six-tuple (K, , , , , s), where all is as in the Moore machine, except that
 maps K to . That is, (q, a) gives the output associated with the transition from state q on
input a. The output of M in response to input a1a2 … an , n0, is (q0, a1) (q1, a2) … (qn-1, an),
where q0, q2, … , qn is the sequence of states such that (q i-1, ai) = qi for 1  i  n. Note that this
sequence has length n rather than length n + 1 as for Moore machine, and on input  a Mealy
machine gives output .
Example,
0/y
p0
0/n
1/n
0/n
> q0
1/n
p1
1/y
We use the label a/b on an arc from state p to state q to indicate that (p, a) = q and (p, a) = b.
The response of M to input 01100 is nnyny, with the sequence of states entered being q0p0p1p1p0p0.
It is your homework to find out that:
Theory: If M1 is a Moore machine, then there is a Mealy machine M2 equivalent to M1.
AND
Theory: If M1 is a Mealy machine, then there is a Moore machine M2 equivalent to M1.