Kleene's Theorem
We have defined the regular languages,
using regular expressions, which
are convenient to write down and use.
We have also defined the languages which
are accepted by FSAs, which make it easy to
tell whether a string is a member of the language.
Theorem: Kleene's theorem
A language L is accepted by an FSA iff L is regular.
Not only are regular expressions and
FSAs equivalent, there are algorithms
allowing us to translate between the two.
Recall: Regular Expressions (REs)
Let T be an alphabet. A regular expression over T
defines a language over T as follows:
(i) λ denotes {λ}, ∅ denotes {}, t denotes {t} for t ∈ T;
(ii) if r and s are regular expressions denoting languages R and S, then
(r + s) denoting R + S,
(rs) denoting RS, and
(r*) denoting R* are regular expressions;
(iii) nothing else is a regular expression over T.
Note: a recursive definition of the language of REs
Algorithm: Reg Ex => NDFSA (overview page)
Let L be a regular language over T. We will create
A, an NDFSA accepting L. Recall: (Q,I,F,T,E)
if L = {λ}, then A = ({q} , {q} , {q}, T , {})
if L = {}, then A = ({q} , {q} , {} , T, {})
if L = {t}, then A = ({p,q} , {p} , {q} , T , {(p,t,q)})
if L = L1 + L2 then obtain
A1 = (Q1 , {i1} , {f1} , T , E1) with L1 = L(A1)
A2 = (Q2 , {i2} , {f2} , T , E2) with L2 = L(A2)
(Q1 and Q2 are disjoint; i and f are new states)
A = (Q1 ∪ Q2 ∪ {i,f} , {i} , {f} , T , E1 ∪ E2 ∪ {(i,λ,i1),(i,λ,i2),(f1,λ,f),(f2,λ,f)})
if L = L1L2 then obtain A1 and A2 as above
A = (Q1 ∪ Q2 , {i1} , {f2} , T , E1 ∪ E2 ∪ {(f1,λ,i2)})
if L = L1* then obtain A1 as above
A = (Q1 ∪ {i,f} , {i}, {f}, T, E1 ∪ {(i,λ,i1),(i,λ,f),(f1,λ,f),(f1,λ,i1)})
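The six cases above can be sketched in Python. This is only an illustration under some assumptions: the function names, the use of None for λ, and the integer state names are all choices made here, not part of the slides. The sketch also loops over every final state of A1 and A2 instead of assuming a single f1 and f2, so that the L = {} case (which has no final state) needs no special treatment.

```python
from itertools import count

LAMBDA = None          # label used for λ edges (an assumption of this sketch)
_fresh = count(1)      # supplies state names that have not been used yet

def new_state():
    return next(_fresh)

# Every automaton is a tuple (Q, I, F, T, E), as in the slides.

def empty_string(T):                      # L = {λ}
    q = new_state()
    return ({q}, {q}, {q}, T, set())

def empty_language(T):                    # L = {}
    q = new_state()
    return ({q}, {q}, set(), T, set())

def symbol(t, T):                         # L = {t}
    p, q = new_state(), new_state()
    return ({p, q}, {p}, {q}, T, {(p, t, q)})

def union(A1, A2):                        # L = L1 + L2
    (Q1, I1, F1, T, E1), (Q2, I2, F2, _, E2) = A1, A2
    i, f = new_state(), new_state()
    E = E1 | E2
    E |= {(i, LAMBDA, s) for s in I1 | I2}
    E |= {(s, LAMBDA, f) for s in F1 | F2}
    return (Q1 | Q2 | {i, f}, {i}, {f}, T, E)

def concat(A1, A2):                       # L = L1L2
    (Q1, I1, F1, T, E1), (Q2, I2, F2, _, E2) = A1, A2
    E = E1 | E2 | {(f1, LAMBDA, i2) for f1 in F1 for i2 in I2}
    return (Q1 | Q2, set(I1), set(F2), T, E)

def star(A1):                             # L = L1*
    Q1, I1, F1, T, E1 = A1
    i, f = new_state(), new_state()
    E = set(E1) | {(i, LAMBDA, f)}
    E |= {(i, LAMBDA, i1) for i1 in I1}
    E |= {(f1, LAMBDA, f) for f1 in F1}
    E |= {(f1, LAMBDA, i1) for f1 in F1 for i1 in I1}
    return (Q1 | {i, f}, {i}, {f}, T, E)

def closure(states, E):
    """All states reachable from `states` using λ edges only."""
    seen, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for (src, label, dst) in E:
            if src == p and label is LAMBDA and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen

def accepts(A, word):
    """Simulate the NDFSA on `word`, tracking the set of possible states."""
    Q, I, F, T, E = A
    current = closure(I, E)
    for t in word:
        step = {dst for (src, label, dst) in E if src in current and label == t}
        current = closure(step, E)
    return bool(current & F)

T = frozenset("ab")

def b_or_ab():
    return union(symbol("b", T), concat(symbol("a", T), symbol("b", T)))

# NDFSA for (b+ab)(b+ab)*, built bottom-up from the cases above.
A = concat(b_or_ab(), star(b_or_ab()))
```

Calling accepts(A, "bab") simulates the constructed automaton by following λ edges between input symbols, which is one simple way to check a construction by hand.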
Regular Expression => NDFSA (with added comments)
Where necessary, draw the NDFSAs constructed here!
Let L be a regular language over T. We will create
A, an NDFSA accepting L. Recall: (Q,I,F,T,E)
if L = {λ}, then A = ({q},{q},{q},T,{})   (I=F)
if L = {}, then A = ({q},{q},{},T,{})   (F={})
if L = {t}, then A = ({p,q},{p},{q},T, {(p,t,q)})
NB: So far, the NDFSAs we’re constructing have exactly one initial state
and at most one final state. Later constructs will keep it that way!
(We return to the case where L={} later.)
if L = L1 + L2 then obtain
A1 = (Q1,{i1},{f1},T,E1), L1 = L(A1)
A2 = (Q2,{i2},{f2},T,E2), L2 = L(A2)
A = (Q1 ∪ Q2 ∪ {i,f} ,{i},{f},T,
E1 ∪ E2 ∪ {(i,λ,i1),(i,λ,i2),(f1,λ,f),(f2,λ,f)})
(Start with i. Following λ, do
either A1 or A2. End in f.
Nondeterminism (and λ edges)
can be useful!)
[Figure: i has λ edges to i1 and i2; A1 leads from i1 to f1 and A2 from i2 to f2; f1 and f2 have λ edges to f.]
if L = L1L2 then obtain A1 and A2 as above
A = (Q1 ∪ Q2,{i1},{f2},T,E1 ∪ E2 ∪ {(f1,λ,i2)})
(The edge (f1,λ,i2) links the end of A1 with the start of A2.)
[Figure: A1 leads from i1 to f1; a λ edge leads from f1 to i2; A2 leads from i2 to f2.]
Result: a newly constructed FSA,
connecting the FSAs for
L1 and L2 in a pipeline
if L = L1* then obtain A1 as above
A = (Q1 ∪ {i,f} ,{i},{f},T,
E1 ∪ {(i,λ,i1),(i,λ,f),(f1,λ,f),(f1,λ,i1)})
The edge (i,λ,f) stands for 0 strings in L1
The edge (f1,λ,i1) causes a loop
[Figure: λ edges lead from i to i1, from i to f, from f1 to f, and from f1 back to i1; A1 leads from i1 to f1.]
Result: a newly constructed FSA,
allowing the FSA for L1 to be run any number
of times (including zero times)
This concludes the proof
-- All three base cases (for constructing
simple Regular Expressions) have been
addressed
-- Each of the three recursive rules (for
constructing complex Regular Expressions)
has been addressed
-- Every Regular Expression is constructed
by combining these cases/rules a finite
number of times
Example: Regular Expression => NDFSA
Let L = (b+ab)(b+ab)*, T = {a,b}
Find NDFSAs for
1. (b+ab)
1.1. b
1.2. ab
1.2.1. a
1.2.2. b
2. (b+ab)*
2.1. (b+ab) (same as 1.)
1.2.1 = ({1,2},{1},{2},T,{(1,a,2)})
1.2.2 = ({3,4},{3},{4},T,{(3,b,4)})
1.2 = ({1,2,3,4},{1},{4},T,{(1,a,2), (2,λ,3) ,(3,b,4)})
1.1 = ({5,6},{5},{6},T,{(5,b,6)})
1 = ({1,2,3,4,5,6,7,8},{7},{8},T, { (7,λ,1),(7,λ,5),
(1,a,2),(2,λ,3),(3,b,4), (5,b,6), (4,λ,8),(6,λ,8) })
2.1 = ({9,10,11,12,13,14,15,16},{15},{16},T,
{ (15,λ,9),(15,λ,13), (9,a,10),(10,λ,11),
(11,b,12),(13,b,14), (12,λ,16),(14,λ,16) })
Example (cont.)
2. = ({9,10,11,12,13,14,15,16,17,18},{17},{18},T,
{(17,λ,15),(17,λ,18),(15,λ,9),(15,λ,13),(9,a,10),
(10,λ,11),(11,b,12),(13,b,14),(12,λ,16),
(14,λ,16),(16,λ,18),(16,λ,15)})
A = ({1,2,...,18},{7},{18},T,
{ (7,λ,1),(7,λ,5),(1,a,2),(2,λ,3),(3,b,4),(5,b,6),
(4,λ,8),(6,λ,8), (8,λ,17), (17,λ,15),(17,λ,18),
(15,λ,9),(15,λ,13),(9,a,10),(10,λ,11),(11,b,12),
(13,b,14),(12,λ,16),(14,λ,16),(16,λ,18),(16,λ,15) })
[Figure: state diagram of A, with states 1 to 18 and the edges listed above.]
This is nondeterministic ... but
we know that any NDFSA can be
converted into a DFSA (although
we skipped the details of
how this is done)
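For readers who want to see the conversion in action on this example, here is a minimal sketch of the standard subset construction applied to the automaton A above. The slides skip this construction, so everything below is an illustration rather than the course's own method; the names closure, move, determinise and accepts are chosen here, and λ is represented by the string "λ".

```python
LAMBDA = "λ"
ALPHABET = "ab"
INITIAL, FINAL = {7}, {18}

# Edge set of A, copied from the example.
EDGES = {
    (7, LAMBDA, 1), (7, LAMBDA, 5), (1, "a", 2), (2, LAMBDA, 3), (3, "b", 4),
    (5, "b", 6), (4, LAMBDA, 8), (6, LAMBDA, 8), (8, LAMBDA, 17),
    (17, LAMBDA, 15), (17, LAMBDA, 18), (15, LAMBDA, 9), (15, LAMBDA, 13),
    (9, "a", 10), (10, LAMBDA, 11), (11, "b", 12), (13, "b", 14),
    (12, LAMBDA, 16), (14, LAMBDA, 16), (16, LAMBDA, 18), (16, LAMBDA, 15),
}

def closure(states):
    """The given states plus everything reachable from them by λ edges."""
    seen, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for (src, label, dst) in EDGES:
            if src == p and label == LAMBDA and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return frozenset(seen)

def move(states, t):
    """States reachable from `states` by one t edge, then any λ edges."""
    return closure({dst for (src, label, dst) in EDGES
                    if src in states and label == t})

def determinise():
    """Each DFSA state is a set of NDFSA states; only reachable sets are built."""
    start = closure(INITIAL)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for t in ALPHABET:
            target = move(S, t)
            delta[S, t] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    finals = {S for S in seen if S & FINAL}
    return start, finals, delta

START, FINALS, DELTA = determinise()

def accepts(word):
    """Run the deterministic automaton; `word` must use only a and b."""
    state = START
    for t in word:
        state = DELTA[state, t]
    return state in FINALS
```

The empty set of NDFSA states acts as the dead state of the DFSA: once the run reaches it, no continuation of the input can be accepted.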
Where we are at the moment ...
• We’ve seen how for each Regular Expression
one can construct an NDFSA that accepts the
same language
• Now let’s do the reverse: given an NDFSA,
we construct a regular expression
that denotes the same language
Recall: FSAs
A Finite State Automaton (FSA) is a 5-tuple (Q, I, F, T, E) where:
Q = states = a finite set;
I = initial states = a nonempty subset of Q;
F = final states = a subset of Q;
T = an alphabet;
E = edges = a subset of Q × (T + λ) × Q.
FSA = labelled, directed graph
= set of nodes (some final/initial) +
directed arcs (arrows) between nodes +
each arc has a label from the alphabet.
Algorithm: FSA -> Regular Expression
1. create unique initial state
2. create unique final state
3. unique FSA -> Regular Expression
2. Algorithm: create unique final state (informal)
Input: (Q, I, F, T, E)
Q := Q ∪ {f} (where f ∉ Q)
for each q ∈ F, E := E ∪ {(q,λ,f)}
F := {f}
1. A unique initial state is created in the same way.
A regular finite state automaton (RFSA) is an FSA
where the edge labels may be regular expressions.
An edge labelled with the regular expression r
indicates that we can move along that edge on
input of any string in the language denoted by r.
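The two preparatory steps can be written out as a short Python sketch. The function names, the default state names "i" and "f", and the small automaton at the end are all hypothetical, chosen only for illustration; λ is represented by the string "λ".

```python
LAMBDA = "λ"

def unique_final(Q, I, F, T, E, f="f"):
    """Add a new state f, link every old final state to it by a λ edge,
    and make f the only final state."""
    if f in Q:
        raise ValueError("f must be a new state")
    return (Q | {f}, I, {f}, T, E | {(q, LAMBDA, f) for q in F})

def unique_initial(Q, I, F, T, E, i="i"):
    """The mirror image: a new state i with λ edges to every old initial state."""
    if i in Q:
        raise ValueError("i must be a new state")
    return (Q | {i}, {i}, F, T, E | {(i, LAMBDA, q) for q in I})

# Hypothetical FSA with two initial states (1, 2) and two final states (2, 3).
A = ({1, 2, 3}, {1, 2}, {2, 3}, {"a"}, {(1, "a", 3)})
B = unique_initial(*unique_final(*A))
```

Because the new edges are all λ edges, the accepted language is unchanged: any accepting path in A extends to one in B by a λ step at each end, and vice versa.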
(3) Unique FSA -> Regular Expression
Let A be an FSA with
unique initial and final
states. In a number of
steps, A can be
converted to an RFSA
with just one edge,
whose label is the
required Regular
Expression.
[Figure: eliminating a state s. The edges (p,r1,s), (s,r3,s) and (s,r2,q) are replaced by a single edge from p to q labelled r1 r3* r2.]
While there are states in Q\{i,f}
begin
  For each state p in Q with more than one edge
  to itself, labelled r1,r2,...,rn, replace all those
  edges by (p,r1+r2+...+rn, p).
  For each pair of states p,q in Q with more than
  one edge from p to q labelled r1,r2,...,rn, replace
  all those edges by (p,r1+r2+...+rn, q).
  select a state s ∉ {i,f}
  For each pair of states p,q (≠ s) s.t. there are
  edges (p,r1,s) and (s,r2,q)
  begin
    if there is an edge (s,r3,s)
      add the edge (p,r1r3*r2,q)
    else add the edge (p,r1r2,q)
  end
  remove all edges to or from s
  remove all states & edges with no path from i
end
return r, where E = {(i,r,f)}.
Example: FSA -> Regular Expression
[Figure: an FSA over {a,b} with states 1, 2, 3 and 4; its edges are labelled a or b, and state 3 has a loop labelled a,b.]
create unique initial and final states (i and f, attached by λ edges); add a+b loop
remove state 2 - edges are 1-3, 1-4, 4-3, 4-4 (new edge labels: aa, aa, ab, ab)
combine: b+ab, b+ab
remove state 3 - no edges
remove state 4 - edge is 1-f, labelled (b+ab)(b+ab)*
remove state 1 - edge is i-f, labelled (b+ab)(b+ab)*
expression is: (b+ab)(b+ab)*
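The elimination steps of this example can be replayed with a small Python sketch. Two things are assumptions here. First, the edge list EDGES is reconstructed from the step captions of the example (the original diagram is not reproduced in the text), so it may not match the original figure exactly. Second, the helper names and the fixed elimination order (states in sorted order) are choices made for this sketch; the slides leave the order open. The sketch also omits the "remove states with no path from i" step, which is not needed for the result.

```python
import re
from itertools import product

LAMBDA = "λ"

def union(r, s):
    return r if r == s else f"{r}+{s}"

def wrap(r):
    """Parenthesise r if it has a top-level +, so concatenation binds correctly."""
    depth = 0
    for ch in r:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "+" and depth == 0:
            return f"({r})"
    return r

def concat(*parts):
    kept = [wrap(r) for r in parts if r != LAMBDA]   # λ is the unit of concatenation
    return "".join(kept) if kept else LAMBDA

def star(r):
    return LAMBDA if r == LAMBDA else f"({r})*"

def add_edge(label, p, q, r):
    """Insert edge (p,r,q), combining it with any parallel edge using +."""
    label[p, q] = union(label[p, q], r) if (p, q) in label else r

def to_regex(states, initial, final, edges):
    """Assumes `initial` has no incoming and `final` no outgoing edges.
    Returns the label of the last remaining edge, or None if there is none."""
    label = {}
    for p, r, q in sorted(edges):
        add_edge(label, p, q, r)
    for s in sorted(states - {initial, final}):
        loop = label.pop((s, s), None)
        into = {p: r for (p, q), r in label.items() if q == s}
        out = {q: r for (p, q), r in label.items() if p == s}
        for (p, r1), (q, r2) in product(into.items(), out.items()):
            new = concat(r1, star(loop), r2) if loop else concat(r1, r2)
            add_edge(label, p, q, new)
        for p in into:
            del label[p, s]
        for q in out:
            del label[s, q]
    return label.get((initial, final))

def matches(regex, word):
    """Test a word against an expression in the slides' notation by
    translating + to | and λ to the empty pattern for Python's re module."""
    pattern = regex.replace("+", "|").replace(LAMBDA, "")
    return re.fullmatch(pattern, word) is not None

# Reconstructed example automaton (an assumption), with i and f already added.
STATES = {"i", "1", "2", "3", "4", "f"}
EDGES = {
    ("i", LAMBDA, "1"), ("4", LAMBDA, "f"),
    ("1", "a", "2"), ("1", "b", "4"), ("2", "a", "3"), ("2", "b", "4"),
    ("4", "a", "2"), ("4", "b", "4"), ("3", "a", "3"), ("3", "b", "3"),
}
regex = to_regex(STATES, "i", "f", EDGES)
```

The helper matches gives a quick way to compare the computed expression with the one obtained by hand, for instance by checking that both agree on every string over {a,b} up to some length. A different elimination order may produce a differently shaped but equivalent expression.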
Is this a proper proof?
• The first half (RE=>FSA) is unproblematic
– Proof follows the (recursive) definition of
the language of REs
– Wrinkle: A may lack a final state (the case where L={})
• The second half (FSA=>RE):
– Algorithm not fully specified. (e.g., “select a state s”)
Does the order in which states are selected not matter?
– Is the resulting RE always equivalent to the initial FSA?
• These wrinkles can be ironed out