University of Oslo
Department of Informatics
Finite State Automata
Jonathon Read
9 October 2012
INF4820: Algorithms for AI and NLP
Previously
- Regular expressions are powerful tools for describing infinite sets of strings
- The fundamental operations are:
  - Matching characters, wildcards (.) and anchors (^ and $)
  - Disjunction (| and [ ])
  - Quantification (?, *, + and {n, m})
- Precedence can be enforced with brackets (( and ))
- More complex operations include capturing groups
Some examples
- /^a (fox|wolf)$/ ⇒ { a fox, a wolf }
- /^f[aio]x$/ ⇒ { fax, fix, fox }
- /^(fox( |$))+/ ⇒ { fox, fox fox, fox fox fox, ... }
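The three examples can be checked with a short script. This is a minimal sketch using Python's `re` module; the dictionary keys and the `matches` helper are names invented for illustration, and `( |$)` is assumed to be the intended reading of "a space or the end of the string".

```python
import re

# The example patterns, written in Python's `re` syntax.
# Assumption: `( |$)` means "a space or the end of the string".
patterns = {
    "a_fox_or_wolf": re.compile(r"^a (fox|wolf)$"),
    "fax_fix_fox": re.compile(r"^f[aio]x$"),
    "repeated_fox": re.compile(r"^(fox( |$))+"),
}

def matches(name: str, text: str) -> bool:
    """Return True if the named pattern matches `text`."""
    return patterns[name].search(text) is not None

print(matches("a_fox_or_wolf", "a wolf"))      # True
print(matches("fax_fix_fox", "fux"))           # False
print(matches("repeated_fox", "fox fox fox"))  # True
```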
Today
- Describing regular expressions with finite state automata
- Using finite state automata to process strings
- State-space search
- Finite state transducers
- Applications in gaming artificial intelligence
Finite state automata (FSAs)
- Mathematical models of computation
- Can be in any one of a finite number of states
- Change state in response to triggering conditions, defined as sets of labelled transitions from state to state
Defining finite state automata
Jurafsky & Martin 2009
Q = {q_0, q_1, ..., q_{N-1}}   a finite set of N states
Σ                              a finite input vocabulary
q_0                            the start state
F                              the set of final states, F ⊆ Q
δ(q, i)                        the transition function between states. Given a state q ∈ Q and
                               an input symbol i ∈ Σ, δ(q, i) returns a new state q' ∈ Q
Sheeptalk
/baa+!/ ⇒ { baa!, baaa!, baaaa!, baaaaa!, ... }
State transition tables
- Rows represent states (i.e. q ∈ Q)
- Columns represent possible input from the vocabulary Σ
- Cells are transitions given a state and input (i.e. δ(q, i))
- Final states are denoted using a colon

         Input
State    b    a    !
0        1    ∅    ∅
1        ∅    2    ∅
2        ∅    3    ∅
3        ∅    3    4
4:       ∅    ∅    ∅
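One way to hold such a table in a program is a dictionary of dictionaries, where a missing key stands for an empty cell. The sketch below is only an illustration; `TRANSITIONS`, `FINAL_STATES`, `delta` and `trace` are hypothetical names, not part of the lecture material.

```python
# Sheeptalk transition table: state -> {input symbol -> next state}.
# A missing key plays the role of an empty cell.
TRANSITIONS = {
    0: {"b": 1},
    1: {"a": 2},
    2: {"a": 3},
    3: {"a": 3, "!": 4},
    4: {},
}
FINAL_STATES = {4}  # the row marked with a colon in the table

def delta(state, symbol):
    """Look up one cell; None means the cell is empty."""
    return TRANSITIONS[state].get(symbol)

def trace(text):
    """List the states visited while reading `text`, stopping at an empty cell."""
    states = [0]
    for symbol in text:
        next_state = delta(states[-1], symbol)
        if next_state is None:
            break
        states.append(next_state)
    return states

print(trace("baaa!"))  # [0, 1, 2, 3, 3, 4]
```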
Recognising deterministic FSAs
D-Recognise
Input: input string, state-transition-table, final-states
Output: accept or reject
current-state ← 0;
index ← 0;
repeat
    if end of input then
        if current-state is in final-states then
            return accept;
        else
            return reject;
        end
    else if state-transition-table[current-state, input[index]] is empty then
        return reject;
    else
        current-state ← state-transition-table[current-state, input[index]];
        index ← index + 1;
    end
until stop;
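D-Recognise translates almost line for line into Python. The following is a sketch under the assumption that the table is a dictionary of dictionaries with missing keys for empty cells; the function name `d_recognise` and the `SHEEPTALK` constant are chosen here for illustration.

```python
def d_recognise(text, table, final_states, start=0):
    """Return True (accept) or False (reject) for a deterministic FSA."""
    state = start
    for symbol in text:
        next_state = table.get(state, {}).get(symbol)
        if next_state is None:  # empty cell: no legal transition
            return False
        state = next_state
    # End of input: accept only if we stopped in a final state.
    return state in final_states

# Sheeptalk table; state 4 has no outgoing arcs, so it needs no row.
SHEEPTALK = {0: {"b": 1}, 1: {"a": 2}, 2: {"a": 3}, 3: {"a": 3, "!": 4}}

print(d_recognise("baaa!", SHEEPTALK, {4}))  # True
print(d_recognise("ba!", SHEEPTALK, {4}))    # False
```

The `for` loop replaces the explicit index, and falling out of the loop corresponds to the "end of input" branch.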
Non-deterministic FSAs
/b((aa)+|(aaa)+)!/ ⇒ { baa!, baaa!, baaaa!, baaaaaa!, ... }
Abstract approaches to searching
Heuristic
- Look ahead at input beyond the current index
- Try to work out or guess which branch to take
Parallel
- Assume an unlimited number of CPUs etc.
- Copy state and remaining input, search all branches
Backtracking
- Keep track of choice points
- Follow each branch, returning to choice points on failure
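The parallel approach can be simulated on a single processor by tracking the set of all states the machine could be in after each symbol. The sketch below assumes a table whose cells hold sets of next states; `parallel_recognise` and the non-deterministic sheeptalk table `ND_SHEEPTALK` are hypothetical examples, not taken from the slides.

```python
def parallel_recognise(text, table, final_states, start=0):
    """Follow every branch at once by tracking the set of reachable states."""
    current = {start}
    for symbol in text:
        # Union of all states reachable from any current state on `symbol`.
        current = {next_state
                   for state in current
                   for next_state in table.get(state, {}).get(symbol, ())}
        if not current:  # every branch has died
            return False
    return bool(current & final_states)

# Hypothetical non-deterministic sheeptalk: state 2 has two arcs labelled "a".
ND_SHEEPTALK = {0: {"b": {1}}, 1: {"a": {2}}, 2: {"a": {2, 3}}, 3: {"!": {4}}}

print(parallel_recognise("baaaa!", ND_SHEEPTALK, {4}))  # True
print(parallel_recognise("ba!", ND_SHEEPTALK, {4}))     # False
```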
Recognising non-deterministic FSAs
ND-Recognise
Input: input string, state-transition-table, final-states
Output: accept or reject
agenda ← {⟨0, 0⟩};
repeat
    current-state, index ← pop(agenda);
    if end of input then
        if current-state is in final-states then
            return accept;
        end
    else
        for next-state in state-transition-table[current-state, input[index]] do
            agenda ← agenda ∪ {⟨next-state, index + 1⟩};
        end
    end
    if agenda is empty then
        return reject;
    end
until stop;
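A runnable version of ND-Recognise might look like the sketch below. Using a Python list as a stack makes the search depth-first, which gives backtracking behaviour. The `seen` set and the state numbering of the example table for /b((aa)+|(aaa)+)!/ are assumptions added for illustration.

```python
def nd_recognise(text, table, final_states, start=0):
    """Agenda-based search over (state, index) pairs; True means accept."""
    agenda = [(start, 0)]
    seen = set(agenda)  # assumption: skip pairs that were already queued
    while agenda:
        state, index = agenda.pop()  # LIFO pop: depth-first, i.e. backtracking
        if index == len(text):
            if state in final_states:
                return True
            continue  # input exhausted in a non-final state
        for next_state in table.get(state, {}).get(text[index], ()):
            pair = (next_state, index + 1)
            if pair not in seen:
                seen.add(pair)
                agenda.append(pair)
    return False  # agenda empty: every branch failed

# One possible NFA for /b((aa)+|(aaa)+)!/ (state numbering is an assumption).
TABLE = {
    0: {"b": {1}},
    1: {"a": {2, 4}},  # choice point: count in twos or in threes
    2: {"a": {3}},
    3: {"a": {2}, "!": {7}},
    4: {"a": {5}},
    5: {"a": {6}},
    6: {"a": {4}, "!": {7}},
}

print(nd_recognise("baaaa!", TABLE, {7}))   # True
print(nd_recognise("baaaaa!", TABLE, {7}))  # False
```

Replacing `agenda.pop()` with a pop from the front of the list would turn the same code into a breadth-first search.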
ε-transitions
Arcs that do not consume input are called ε-transitions
ε-transitions: Concatenation
ε-transitions: Closure (*)
ε-transitions: Union (|)
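A recogniser can handle ε-transitions by expanding each set of states with everything reachable through ε arcs alone. The sketch below is one way to do this; it assumes the empty string labels ε arcs, and the `UNION` automaton (a new start state with ε arcs into automata for /ab/ and /c/) is a made-up example of the union construction.

```python
EPS = ""  # assumption: the empty string labels arcs that consume no input

def epsilon_closure(states, table):
    """All states reachable from `states` using only epsilon arcs."""
    closure, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for next_state in table.get(state, {}).get(EPS, ()):
            if next_state not in closure:
                closure.add(next_state)
                stack.append(next_state)
    return closure

def recognise(text, table, final_states, start=0):
    """Set-of-states recognition, taking the epsilon closure after each step."""
    current = epsilon_closure({start}, table)
    for symbol in text:
        moved = {n for s in current for n in table.get(s, {}).get(symbol, ())}
        current = epsilon_closure(moved, table)
    return bool(current & final_states)

# Union of /ab/ and /c/: start state 0 has epsilon arcs to both sub-automata.
UNION = {0: {EPS: {1, 4}}, 1: {"a": {2}}, 2: {"b": {3}}, 4: {"c": {5}}}
FINALS = {3, 5}

print(recognise("ab", UNION, FINALS))  # True
print(recognise("ac", UNION, FINALS))  # False
```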
Deterministic and non-deterministic FSAs
Are non-deterministic FSAs more powerful?
No, every non-deterministic FSA has a deterministic equivalent
(Hopcroft and Ullman, 1979).
But they are easier to read: given a non-deterministic FSA with
n nodes, its deterministic equivalent can have up to 2^n nodes.
Hopcroft, J. E. and Ullman, J. D. (1979). Introduction to Automata Theory,
Languages and Computation. Addison-Wesley.
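The deterministic equivalent can be built with the subset construction, in which each deterministic state stands for a set of non-deterministic states. The sketch below assumes an NFA without ε arcs; `determinise` and the example table are illustrative names and data, not the lecture's own code.

```python
def determinise(table, final_states, start=0):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset({start})
    dfa, dfa_finals = {}, set()
    todo = [start_set]
    while todo:
        subset = todo.pop()
        if subset in dfa:
            continue  # already expanded
        if subset & final_states:
            dfa_finals.add(subset)
        dfa[subset] = {}
        # Every symbol that labels an arc leaving some member of the subset.
        symbols = {sym for s in subset for sym in table.get(s, {})}
        for sym in symbols:
            target = frozenset(
                n for s in subset for n in table.get(s, {}).get(sym, ()))
            dfa[subset][sym] = target
            todo.append(target)
    return dfa, dfa_finals, start_set

# Hypothetical non-deterministic sheeptalk: state 2 has two arcs labelled "a".
ND_SHEEPTALK = {0: {"b": {1}}, 1: {"a": {2}}, 2: {"a": {2, 3}}, 3: {"!": {4}}}

dfa, dfa_finals, dfa_start = determinise(ND_SHEEPTALK, {4})
print(len(dfa))  # 5
```

Only subsets reachable from the start state are created, so the result is often far smaller than the 2^n worst case.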
Finite state transducers (FSTs)
Finite state automata represent the strings in a language. Finite state
transducers are a type of FSA that maps between pairs of strings.
They can serve several functions:
Recogniser   Given a pair of strings, determine whether the pair is in the string pair language
Generator    Represent how to construct output in the string pair language
Translator   Read in one string and output another string
Relator      Compute relations between sets
Defining finite state transducers
Jurafsky & Martin 2009
Q = {q_0, q_1, ..., q_{N-1}}   a finite set of N states
Σ                              a finite input vocabulary
Δ                              a finite output vocabulary
q_0                            the start state
F                              the set of final states, F ⊆ Q
δ(q, i)                        the transition function
σ(q, i)                        the output function giving the set of possible output strings for each
                               state and input
Translating Sheeptalk
/baa+!/ becomes /bæ+!/
[Figure: transducer with states q0 to q4 and arcs labelled b:b, a:ε, a:æ (twice) and !:!]
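A transducer of this kind can be run as a translator with a table that maps each state and input symbol to a next state and an output string. The sketch below is deterministic, so each cell holds a single output rather than a set. The arc layout is an assumption: the first "a" is taken to be the one that outputs nothing (a:ε), and `ARCS` and `translate` are hypothetical names.

```python
# (state, input symbol) -> (next state, output string); "" plays the role of ε.
# Assumption: the first "a" maps to ε, and q3 carries the a:æ loop.
ARCS = {
    ("q0", "b"): ("q1", "b"),
    ("q1", "a"): ("q2", ""),
    ("q2", "a"): ("q3", "æ"),
    ("q3", "a"): ("q3", "æ"),
    ("q3", "!"): ("q4", "!"),
}
FINALS = {"q4"}

def translate(text, arcs=ARCS, finals=FINALS, start="q0"):
    """Return the output string, or None if the input is rejected."""
    state, output = start, []
    for symbol in text:
        if (state, symbol) not in arcs:
            return None  # no arc for this input: reject
        state, out = arcs[(state, symbol)]
        output.append(out)
    return "".join(output) if state in finals else None

print(translate("baaa!"))  # bææ!
```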
Exercise: Hunden snakke
Draw a transducer for Norwegian and English dogs, i.e.
/(vo(ff|v)( |$))+/ becomes /(woof( |$))+/
Controlling agent behaviour with FSAs
An FSA can control agents in games, e.g.
ghosts in Pac-Man have four behaviours:
1. Wander the maze
2. Chase Pac-Man
3. Run away from Pac-Man
4. Return to the centre
Each of these behaviours depends on
certain triggering events.
Ghost FSA
[Figure: state diagram of the ghost FSA]
States: q0 (wander maze), q1 (chase Pac-Man), q2 (run from Pac-Man), q4 (return to centre)
Arc labels: sight Pac-Man; lose sight of Pac-Man; Pac-Man eats power-up (on two arcs); power-up wears off; is eaten by Pac-Man; is in centre
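A ghost controller along these lines can be sketched as a lookup from (behaviour, event) pairs to the next behaviour. The wiring below is an assumption: in particular, the target of "power-up wears off" is a guess, and the state names and the `step` and `run` helpers are invented for illustration.

```python
# (current behaviour, triggering event) -> next behaviour.
# Assumption: this wiring is one plausible reading of the arc labels.
GHOST = {
    ("wander", "sight Pac-Man"): "chase",
    ("wander", "Pac-Man eats power-up"): "run",
    ("chase", "lose sight of Pac-Man"): "wander",
    ("chase", "Pac-Man eats power-up"): "run",
    ("run", "power-up wears off"): "wander",  # assumption: could also be "chase"
    ("run", "is eaten by Pac-Man"): "return",
    ("return", "is in centre"): "wander",
}

def step(state, event):
    """Return the next behaviour; events with no arc leave the state unchanged."""
    return GHOST.get((state, event), state)

def run(events, state="wander"):
    """Feed a sequence of events to the ghost and return its final behaviour."""
    for event in events:
        state = step(state, event)
    return state

print(run(["sight Pac-Man", "Pac-Man eats power-up", "is eaten by Pac-Man"]))  # return
```

Unlike a recogniser, this machine has no final states: it simply stays in its current behaviour until an event triggers a transition.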
Summary
- Modelling sequences of words with finite state automata (FSAs)
- FSAs are models of computation that can be in any one of a finite number of states
- We define transitions that change the state of the machine
- Non-deterministic FSAs contain choice points, where for some combination of state and input there is more than one possible action
- These can be handled with a backtracking search
- Finite state transducers produce output in response to input
Next week
- Probability theory: terminology and notation
- Estimating probability of words using corpora
- Handling unseen sequences
- Applications in natural language processing