Outline Learning Goals: In-Class Learning Goals: In

snick
∨
snack
Outline
• Learning Goals
• Formally Specifying DFAs
• Designing DFAs
CPSC 121: Models of Computation
2010 Winter Term 2
DFAs in Depth
Benjamin Israel
[email protected]
Notes heavily borrowed from Steve
Wolfman’s, whose slides were based
on notes by Patrice Belleville and
others
• Analyzing DFAs: Can DFAs Count?
• Non-Deterministic Finite Automata:
Are They More Powerful than DFAs?
2
Learning Goals: In-Class
Learning Goals: In-Class
By the end of this unit, you should be able to
for the exam:
– Build a DFA to recognize a particular
language.
– Identify the language recognized by a
particular DFA.
By the end of this unit, you should be able to
for yourself:
– Use the formalisms that we’ve learned so far
(especially power sets and set union) to illustrate
that nondeterministic finite automata are no more
powerful than deterministic finite automata.
– Demonstrate through a simple contradiction
argument that there are interesting properties of
programs that cannot be computed in general.
– Describe how the theoretical story 121 has been
telling connects to the branch of CS theory called
automata theory.
Reminder:
Formal DFA Specification
Outline
I the (finite) set of
letters in the input
language.
S the (finite) set of
states.
s0 the start state; s0 ∈ S
F the set of accepting
states; F ⊆ S
N : S × I → S is the
transition function.
• Learning Goals
• Formally Specifying DFAs
• Designing DFAs
• Analyzing DFAs: Can DFAs Count?
• Non-Deterministic Finite Automata:
Are They More Powerful than DFAs?
5
a
a,b
a
b
b
a,b
One Way to Formalize
DFA Operation
The algorithm “RunDFA” runs a given
DFA on an input, resulting in “Yes”,
“No”, or (if the input is invalid) “Huh?”
Formal DFA Example (Fixed)
a,b
a
b
b
A DFA is a five-tuple: (I, S, s0, F, N)
a
a,b
RunDFA((I, S, s0, F, N), in)
1. If in is empty, then:
if s0 ∈ F, return “Yes”, else return “No”.
2. Otherwise, let in0 be the first element of in and
inrest be the rest of in.
3. If in0 ∉ I, return “Huh?”
4. Otherwise, let s' = N(s0, in0).
5. Return RunDFA((I, S, s', F, N), inrest).
a,b
a
What is I? S? s0? F? N?
Running the DFA
a,b
a
b
b
a,b
b
Outline
How does the DFA runner work?
a
a
a,b
• Prereqs, Learning Goals, and !Quiz Notes
• Formally Specifying DFAs
• Designing DFAs
• Analyzing DFAs: Can DFAs Count?
• Non-Deterministic Finite Automata:
Are They More Powerful than DFAs?
Input: abbab
b
Only partly on the exam.
(You should be able to simulate a DFA, but needn’t use RunDFA.)
Strategy for Designing a DFA
To design a DFA for some language, try:
1. Should the empty string be accepted? If so, draw
an accepting start state; else, draw a rejecting
one.
2. From that state, consider what each letter would
lead to. Group letters that lead to the same
outcome. (Some may lead back to the start state
if they’re no different from “starting over”.)
3. Decide for each new state whether it’s accepting.
4. Repeat steps 2 and 3 for new states as long as
you have unfinished states, always trying to
reuse existing states!
As you go, give as descriptive names as you can to your
states, to figure out when to reuse them!
10
Problem: Design a DFA to
Recognize “Decimal Numbers”
Rules?
Problem: Design a DFA to
Recognize “Decimal Numbers”
•Can have a negative sign
•Can have a decimal
•If decimal, needs at least one digit
before and one digit after the decimal.
Is this a decimal number?
Let’s build a DFA to check whether an input is a decimal
number. Which of the following is a decimal number?
3.5
2.011
-6
5.
-3.1415926536
.25
--3
3-5
5.2.5
.
Rules for whether something is a decimal number?
Regular Expressions
• Syntax for defining which words to accept
and which to reject
• Can be converted back and forth between
DFAs
• Don’t worry too much about them now,
they will be further introduced in lab
Regular Expressions
Something followed by ? is optional.
Anything followed by a + can be repeated.
[0-9] is a digit, an element of the set
{0, 1, …, 9}.
– is literally –.
\. means.
() are used to group elements together.
Rules for a decimal number:
-?[0-9]+(\.[0-9]+)?
Problem: Design a DFA to
Recognize Telephone Numbers
Problem: Design a DFA to
Recognize Telephone Numbers
Problem: Design a DFA to recognize
telephone numbers
Problem: Design a DFA to recognize
telephone numbers, defined using the
regular expression:
((1\-)?[2-9][0-9][0-9]-)?
[2-9][0-9][0-9]\[0-9][0-9][0-9][0-9]
Step 1: Design a Regular Expression
More DFA Design Problems
Design a DFA that recognizes words that
have two or more contiguous sequences
of at least two vowels. (This is a rough
stab at “multisyllabic” words.)
More DFA Design Problems
Design a DFA that recognizes base 4
numbers with at least one digit in which
the digits are in sorted order.
Accepted: Sweetie, Bearee
Not accepted: Queue
More DFA Design Problems
Design a DFA that recognizes strings
containing the letters a, b, and c in which
all the as come before the first c and no
two bs neighbour each other.
Outline
• Prereqs, Learning Goals, and !Quiz Notes
• Formally Specifying DFAs
• Designing DFAs
• Analyzing DFAs: Can DFAs Count?
• Non-Deterministic Finite Automata:
Are They More Powerful than DFAs?
22
Can a DFA Count?
Problem: Design a DFA to recognize anbn:
the language that includes all strings that
start with some number of as and end with
the same number of bs.
DFAs Can’t Count
Let’s prove it. The heart of the proof is the
insight that a DFA can only have a finite
number of states and so can only “count
up” a finite number of as.
We proceed by contradiction. Assume
some DFA recognizes the language anbn.
DFAs Can’t Count
DFAs Can’t Count
Assume DFA counter recognizes anbn.
counter must have some finite number of
states k = |Scounter|.
Consider the input akbk. counter must
repeat (at least) one state as it processes
the k as in that string. (Because it starts
in s0 and transitions k times; if it didn’t
repeat a state, that would be k+1 states,
which is greater than |Scounter|.)
Consider the input akbk. counter must
repeat (at least) one state as it processes
the k as in that string.
counter accepts akbk, meaning it ends in
an accepting state.
a
a
a
DFAs Can’t Count
a
a
Give counter a(k-r)bk instead.
counter must accept a(k-r)bk, which is not
in the language anbn . Contradiction!
Thus, no DFA recognizes anbn
a
b
a
b
Doesn’t necessarily look like this, but similar.
a
...
b
DFAs Can’t Count
Now, consider the number of as that are
processed between the first and second
time visiting the repeated state. Call it r.
Give counter a(k-r)bk instead.
(Note: r > 0.)
a
a
...
a
...
(In fact, it repeats each state after the first one it repeats up to the b.)
a
...
...
...
a
b
a
a
b
a
a
a
b
Let’s Try Something More
Powerful (?): NFAs
Outline
An NFA is like a DFA except:
• Prereqs, Learning Goals, and !Quiz Notes
• Formally Specifying DFAs
• Designing DFAs
• Any number (zero or more) of arcs can lead from
each state for each letter of the alphabet
• There may be many start states
When we run a DFA, we put a finger on the current
state and then read one letter of the input at a
time, transitioning from state to state.
When we run an NFA, we may need many fingers to
keep track of many current states. When we
transition, we go to all the possible next states.
• Analyzing DFAs: DFAs Can’t Count
• Non-Deterministic Finite Automata:
Are They More Powerful than DFAs?
29
NFAs Continued
Another way to think of NFAs
An NFA is like a DFA except:
• Any number (zero or more) of arcs can lead
from each state for each letter of the alphabet
• There may be many start states
A DFA accepts when we end in an
accepting state.
An NFA accepts when at least one of the
states we end in is accepting.
Decimal Number NFA... Is Easy!
An NFA is a non-deterministic finite
automaton.
Instead of doing one predictable thing at
each state given an input, it may have
choices of what to do.
If one of the choices leads to accepting the
input, the NFA will make that choice. (It
“magically” knows which choice to make.)
Are NFAs More Powerful than
DFAs?
Problem: Prove that every language an NFA can
recognize can be recognized by a DFA as well.
s1
-
s2
0-9
int
.
Strategy: Given an NFA, show how to build a DFA
(i.e., I, S, s0, F, and N) that accepts/rejects the
same strings.
dot
0-9
0-9
-?[0-9]+(\.[0-9]+)?
real
Hint: Formally, what is the “group of states” an NFA
can be in at any given time? Build the parts of the
DFA based on that (i.e., S is the set of all such
things).
0-9
Worked Problem:
Are NFAs More Powerful than DFAs?
Worked Problem:
Are NFAs More Powerful than DFAs?
Problem: Prove that every language an
NFA can recognize can be recognized by
a DFA as well.
C has an alphabet IC. Let ID = IC. (They operate on the
same strings.)
C’s states are SC. Let SD = P(SC). (Each state in D
represents being in a subset of the states in C.)
C starts in some subset S0C of SC. Let s0D = S0C. (D’s
start state represents being in the set of start states in
C.)
C has accepting states FC. Let FD = {S ⊆ SC | S ∩ FC ≠
∅}. (A state in D is accepting if it represents a set of
states in C that contains at least one accepting state.)
C has zero or more arcs leading out of each state in SC
for each letter in IC. ND(sD, i) operates on a state sD
that actually represents a set of states from SC. So:
WLOG, consider an arbitrary NFA “C”. We’ll
build a DFA “D” that does the same thing.
We now build each element of the five-tuple
defining D. We do so in such a way that D
faithfully simulates N’s operation.
N D (sD , i) =
U the set of states connected to by arcs labeled i from s
C
sC ∈s D
Decimal Number NFA...
as a DFA
0-9
{s1, s2}
-
{s2}
0-9 {s2, int}
else
else
0-9
.
{}
.
{dot}
0-9
{real}
else
else
Σ
0-9
LOTS of states left off that are unreachable from the start state.