MA/CSSE 474 Theory of Computation But first - Rose

1/22/2012
MA/CSSE 474
Theory of Computation
PDAs and CFGs
Top-down and Bottom-up parsing
But first … another
Pumping Theorem example
L = {anbman, n, m ≥ 0 and n ≥ m}.
Let w = akbkak
aaa … aaabbb … bbbaaa … aaa
|
1
|
2
|
3
|
1
1/22/2012
Nested and Cross-Serial Dependencies
PalEven = {wwR : w ∈ {a, b}*}
a a b b a a
The dependencies are nested.
WcW = {wcw : w ∈ {a, b}*}
a a b c a a b
Cross-serial dependencies.
Some more examples for you to
consider later
• On the next few slides
2
1/22/2012
WcW = {wcw : w ∈ {a, b}*}
Let w = akbkcakbk.
aaa … aaabbb …
|
1
|
2
bbbcaaa … aaabbb … bbb
|3|
4
|
5
|
Call the part before c the left side and the part after c the right side.
● If v or y overlaps region 3, set q to 0. The resulting string will no
longer contain a c.
● If both v and y occur before region 3 or they both occur after
region 3, then set q to 2. One side will be longer than the other.
● If either v or y overlaps region 1, then set q to 2. In order to make
the right side match, something would have to be pumped into
region 4. Violates |vxy| ≤ k.
● If either v or y overlaps region 2, then set q to 2. In order to make
the right side match, something would have to be pumped into
region 5. Violates |vxy| ≤ k.
• {(ab)nanbn : n > 0}
• {x#y : x, y ∈ {0, 1}* and x ≠ y}
3
1/22/2012
PDAs and Context-Free Grammars
Theorem: The class of languages accepted by PDAs is
exactly the class of context-free languages.
Recall: context-free languages are languages that
can be genrated by context-free grammars.
The hard direction: PDA → CFG (later)
The easy direction: CFG → PDA
The idea: Let the stack do the work.
Two approaches:
• Top-down
• Bottom-up
Top Down
The idea: Let the stack keep track of expectations.
Example: Arithmetic expressions
E→E+T
E→T
T→T∗F
T→F
F → (E)
F → id
(1)
(2)
(3)
(4)
(5)
(6)
(q, ε, E), (q, E+T)
(q, ε, E), (q, T)
(q, ε, T), (q, T*F)
(q, ε, T), (q, F)
(q, ε, F), (q, (E) )
(q, ε, F), (q, id)
(7) (q, id, id), (q, ε)
(8) (q, (, ( ), (q, ε)
(9) (q, ), ) ), (q, ε)
(10) (q, +, +), (q, ε)
(11) (q, ∗, ∗), (q, ε)
4
1/22/2012
A Top-Down Parser
The outline of M is:
M = ({p, q}, Σ, V, ∆, p, {q}), where ∆ contains:
● The start-up transition ((p, ε, ε), (q, S)).
● For each rule X → s1s2…sn. in R, the transition:
((q, ε, X), (q, s1s2…sn)).
● For each character c ∈ Σ, the transition:
((q, c, c), (q, ε)).
Example of the Construction
L=
{anb*an}
(1) S → ε
(2) S → B
(3) S → aSa
(4) B → ε
(5) B → bB
*
input = a a b b a a
trans
0
3
6
3
6
2
5
7
5
7
4
6
6
state
p
q
q
q
q
q
q
q
q
q
q
q
q
q
0 (p, ε, ε), (q, S)
1 (q, ε, S), (q, ε)
2 (q, ε, S), (q, B)
3 (q, ε, S), (q, aSa)
4 (q, ε, B), (q, ε)
5 (q, ε, B), (q, bB)
6 (q, a, a), (q, ε)
7 (q, b, b), (q, ε)
unread input
a a b b
a a b b
a a b b
a b b
a b b
b b
b b
b b
b
b
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
ε
This is here for
later reference.
We did a similar
example with the
expression
grammar.
stack
ε
S
aSa
Sa
aSaa
Saa
Baa
bBaa
Baa
bBaa
Baa
aa
a
ε
5
1/22/2012
Another Example : tracing practice
L = {anbmcpdq : m + n = p + q}
0 (p, ε, ε), (q, S)
(1) S → aSd
1 (q, ε, S), (q, aSd)
(2) S → T
2 (q, ε, S), (q, T)
(3) S → U
3 (q, ε, S), (q, U)
(4) T → aTc
4 (q, ε, T), (q, aTc)
(5) T → V
5 (q, ε, T), (q, V)
(6) U → bUd
6 (q, ε, U), (q, bUd)
(7) U → V
7 (q, ε, U), (q, V)
(8) V → bVc
8 (q, ε, V), (q, bVc)
(9) V → ε
9 (q, ε, V), (q, ε)
10 (q, a, a), (q, ε)
11 (q, b, b), (q, ε)
input = a a b c d d
12 (q, c, c), (q, ε)
13 (q, d, d), (q, ε)
trans
state
unread input
This is here for
later reference.
We did a similar
example with the
expression
grammar.
stack
Notice the Nondeterminism
Machines constructed with the algorithm are often nondeterministic,
even when they needn't be. This happens even with trivial
languages.
Example: AnBn = {anbn: n ≥ 0}
A grammar for AnBn is:
[1] S → aSb
[2] S → ε
A PDA M for AnBn is:
(0)
(1)
(2)
(3)
(4)
((p, ε, ε), (q, S))
((q, ε, S), (q, aSb))
((q, ε, S), (q, ε))
((q, a, a), (q, ε))
((q, b, b), (q, ε))
Transitions 1 and 2 make M nondeterministic.
The manually constructed machine for AnBn that we created last week
is deterministic.
6
1/22/2012
Bottom-Up
The idea: Let the stack keep track of what has been found.
(1) E → E + T
(2) E → T
(3) T → T ∗ F
(4) T → F
(5) F → (E)
(6) F → id
(1)
(2)
(3)
(4)
(5)
(6)
Discover a rightmost derivation in
reverse order. Start with the sentence
and try to "pull it back" to S.
Shift Transitions:
(7) (p, id, ε), (p, id)
(8) (p, (, ε), (p, ()
(9) (p, ), ε), (p, ))
(10) (p, +, ε), (p, +)
(11) (p, ∗, ε), (p, ∗)
Reduce Transitions:
(p, ε, T + E), (p, E)
(p, ε, T), (p, E)
(p, ε, F ∗ T), (p, T)
(p, ε, F), (p, T)
(p, ε, )E( ), (p, F)
When the right side of a production is
(p, ε, id), (p, F)
on the top of the stack, we can replace
it by the left side of that production…
…or not! That's where the nondeterminism comes in:
choice between shift and reduce; choice between two reductions.
A Bottom-Up Parser
The outline of M is:
M = ({p, q}, Σ, V, ∆, p, {q}), where ∆ contains:
● The shift transitions: ((p, c, ε), (p, c)), for each c ∈ Σ.
● The reduce transitions: ((p, ε, (s1s2…sn.)R), (p, X)), for each rule
X → s1s2…sn. in G.
● The finish up transition: ((p, ε, S), (q, ε)).
7
1/22/2012
Sketch of PDA→CFG
Lemma: If a language is accepted by a pushdown automaton M, it is
context-free (i.e., it can be described by a context-free grammar).
Proof (by construction):
Step 1: Convert M to restricted normal form:
● M has a start state s′ that does nothing except push a special
symbol # onto the stack and then transfer to a state s from which
the rest of the computation begins. There must be no transitions
back to s′.
● M has a single accepting state a. All transitions into a pop # and
read no input.
● Every transition in M, except the one from s′, pops exactly one
symbol from the stack.
Second Step - Creating the Productions
Example:
WcW R
M=
The basic idea –
simulate a leftmost derivation of M on any input string.
8
1/22/2012
Step 2 - Creating the Productions
The basic idea: A leftmost derivation simulates the
actions of M on an input string.
Example:
abcba
Halting
It is possible that a PDA may
● not halt,
● not ever finish reading its input.
Let Σ = {a} and consider M =
L(M) = {a}: (1, a, ε) |- (2, a, a) |- (3, ε, ε)
On any other input except a:
● M will never halt.
● M will never finish reading its input unless its input is ε.
9
1/22/2012
Nondeterminism and Decisions
1. There are context-free languages for which no
deterministic PDA exists.
2. It is possible that a PDA may
● not halt,
● not ever finish reading its input.
● require time that is exponential in the length of its
input.
3. There is no PDA minimization algorithm.
It is undecidable whether a PDA is minimal.
Solutions to the Problem
●
For NDFSMs:
● Convert to deterministic, or
● Simulate all paths in parallel.
● For NDPDAs:
● No general solution.
● Formal solutions that usually involve changing the
grammar.
● Such as Chomsky or Greibach Normal form.
● Practical solutions that:
● Preserve the structure of the grammar, but
● Only work on a subset of the CFLs.
10
1/22/2012
Alternative Equivalent Definitions of a
PDA
Accept by accepting state at end of string (i.e., we don't
care about the stack).
From M (in our definition) we build M′ (in this one):
1. Initially, let M′ = M.
2. Create a new start state s′. Add the transition:
((s′, ε, ε), (s, #)).
3. Create a new accepting state qa.
4. For each accepting state a in M do,
4.1 Add the transition ((a, ε, #), (qa, ε)).
5. Make qa the only accepting state in M′.
Example
The balanced parentheses language
11
1/22/2012
What About These?
● FSM plus FIFO queue (instead of stack)?
● FSM plus two stacks?
Comparing Regular and
Context-Free Languages
Regular Languages
● regular exprs.
or
regular grammars
● recognize
● = DFSMs
Context-Free Languages
● context-free grammars
● parse
● = NDPDAs
12