Pushdown Automata Chapter 12 Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no, AND if yes, describe the structure a + b * c Just Recognizing We need a device similar to an FSM except that it needs more power. The insight: Precisely what it needs is a stack, which gives it an unlimited amount of memory with a restricted structure. Example: Bal (the balanced parentheses language) (((())) Definition of a Pushdown Automaton M = (K, , , , s, A), where: K is a finite set of states is the input alphabet is the stack alphabet s K is the initial state A K is the set of accepting states, and is the transition relation. It is a finite subset of (K state ( {}) *) input or string of symbols to pop from top of stack (K state *) string of symbols to push on top of stack Definition of a Pushdown Automaton A configuration of M is an element of K * *. The initial configuration of M is (s, w, ). Manipulating the Stack c will be written as cab a b If c1c2…cn is pushed onto the stack: c1 c2 cn c a b c1c2…cncab Yields Let c be any element of {}, Let 1, 2 and be any elements of *, and Let w be any element of *. Then: (q1, cw, 1) |-M (q2, w, 2) iff ((q1, c, 1), (q2, 2)) . Let |-M* be the reflexive, transitive closure of |-M. C1 yields configuration C2 iff C1 |-M* C2 Computations A computation by M is a finite sequence of configurations C0, C1, …, Cn for some n 0 such that: ● C0 is an initial configuration, ● Cn is of the form (q, , ), for some state q KM and some string in *, and ● C0 |-M C1 |-M C2 |-M … |-M Cn. Nondeterminism If M is in some configuration (q1, s, ) it is possible that: ● contains exactly one transition that matches. ● contains more than one transition that matches. ● contains no transition that matches. Accepting A computation C of M is an accepting computation iff: ● C = (s, w, ) |-M* (q, , ), and ● q A. M accepts a string w iff at least one of its computations accepts. Other paths may: ● Read all the input and halt in a nonaccepting state, ● Read all the input and halt in an accepting state with the stack not empty, ● Loop forever and never finish reading the input, or ● Reach a dead end where no more input can be read. The language accepted by M, denoted L(M), is the set of all strings accepted by M. Rejecting A computation C of M is a rejecting computation iff: = (s, w, ) |-M* (q, w, ), ● C is not an accepting computation, and ● M has no moves that it can make from (q, , ). ●C M rejects a string w iff all of its computations reject. So note that it is possible that, on input w, M neither accepts nor rejects. A PDA for Balanced Parentheses A PDA for Balanced Parentheses M = (K, , , , s, A), where: K = {s} the states = {(, )} the input alphabet = {(} the stack alphabet A = {s} contains: ((s, (, **), (s, ( )) ((s, ), ( ), (s, )) **Important: This does not mean that the stack is empty A PDA for AnBn = {anbn: n 0} A PDA for AnBn = {anbn: n 0} A PDA for {wcwR: w {a, b}*} A PDA for {wcwR: w {a, b}*} M = (K, , , , s, A), where: K = {s, f} the states = {a, b, c} the input alphabet = {a, b} the stack alphabet A = {f} the accepting states contains: ((s, a, ), (s, a)) ((s, b, ), (s, b)) ((s, c, ), (f, )) ((f, a, a), (f, )) ((f, b, b), (f, )) A PDA for {anb2n: n 0} A PDA for {anb2n: n 0} Exploiting Nondeterminism A PDA M is deterministic iff: ● M contains no pairs of transitions that compete with each other, and ● Whenever M is in an accepting configuration it has no available moves. But many useful PDAs are not deterministic. A PDA for PalEven ={wwR: w {a, b}*} S S aSa S bSb A PDA: A PDA for PalEven ={wwR: w {a, b}*} S S aSa S bSb A PDA: A PDA for {w {a, b}* : #a(w) = #b(w)} A PDA for {w {a, b}* : #a(w) = #b(w)} More on Nondeterminism Accepting Mismatches L = {ambn : m n; m, n > 0} Start with the case where n = m: b/a/ a//a b/a/ 1 2 More on Nondeterminism Accepting Mismatches L = {ambn : m n; m, n > 0} Start with the case where n = m: b/a/ a//a b/a/ 1 2 ● If stack and input are empty, halt and reject. ● If input is empty but stack is not (m > n) (accept): ● If stack is empty but input is not (m < n) (accept): More on Nondeterminism Accepting Mismatches L = {ambn : m n; m, n > 0} b/a/ a//a b/a/ 2 1 ● If input is empty but stack is not (m < n) (accept): b/a/ a//a /a/ b/a/ 1 /a/ 2 3 More on Nondeterminism Accepting Mismatches L = {ambn : m n; m, n > 0} b/a/ a//a b/a/ 2 1 ● If stack is empty but input is not (m > n) (accept): b// b/a/ 1 b// b/a/ a//a 2 4 Putting It Together L = {ambn : m n; m, n > 0} ● Jumping to the input clearing state 4: Need to detect bottom of stack. ● Jumping to the stack clearing state 3: Need to detect end of input. The Power of Nondeterminism Consider AnBnCn = {anbncn: n 0}. PDA for it? The Power of Nondeterminism Consider AnBnCn = {anbncn: n 0}. Now consider L = AnBnCn. L is the union of two languages: 1. {w {a, b, c}* : the letters are out of order}, and 2. {aibjck: i, j, k 0 and (i j or j k)} (in other words, unequal numbers of a’s, b’s, and c’s). A PDA for L = AnBnCn Are the Context-Free Languages Closed Under Complement? AnBnCn is context free. If the CF languages were closed under complement, then AnBnCn = AnBnCn would also be context-free. But we will prove that it is not. L = {anbmcp: n, m, p 0 and n m or m p} S NC S QP NA NB Aa A aA A aAb Bb B Bb B aBb C | cC P B' P C' B' b B' bB' B' bB'c C' c | C'c C' C'c C' bC'c Q | aQ /* n m, then arbitrary c's /* arbitrary a's, then p m /* more a's than b's /* more b's than a's /* add any number of c's /* more b's than c's /* more c's than b's /* prefix with any number of a's Reducing Nondeterminism ● Jumping to the input clearing state 4: Need to detect bottom of stack, so push # onto the stack before we start. ● Jumping to the stack clearing state 3: Need to detect end of input. Add to L a termination character (e.g., $) Reducing Nondeterminism ● Jumping to the input clearing state 4: Reducing Nondeterminism ● Jumping to the stack clearing state 3: More on PDAs A PDA for {wwR : w {a, b}*}: What about a PDA to accept {ww : w {a, b}*}? PDAs and Context-Free Grammars Theorem: The class of languages accepted by PDAs is exactly the class of context-free languages. Recall: context-free languages are languages that can be defined with context-free grammars. Restate theorem: Can describe with context-free grammar Can accept by PDA Going One Way Lemma: Each context-free language is accepted by some PDA. Proof (by construction): The idea: Let the stack do the work. Two approaches: • Top down • Bottom up Top Down The idea: Let the stack keep track of expectations. Example: Arithmetic expressions EE+T ET TTF TF F (E) F id (1) (2) (3) (4) (5) (6) (q, , E), (q, E+T) (q, , E), (q, T) (q, , T), (q, T*F) (q, , T), (q, F) (q, , F), (q, (E) ) (q, , F), (q, id) (7) (q, id, id), (q, ) (8) (q, (, ( ), (q, ) (9) (q, ), ) ), (q, ) (10) (q, +, +), (q, ) (11) (q, , ), (q, ) A Top-Down Parser The outline of M is: M = ({p, q}, , V, , p, {q}), where contains: ● The start-up transition ((p, , ), (q, S)). ● For each rule X s1s2…sn. in R, the transition: ((q, , X), (q, s1s2…sn)). ● For each character c , the transition: ((q, c, c), (q, )). Example of the Construction L = {anb*an} (1) S (2) S B (3) S aSa (4) B (5) B bB * input = a a b b a a trans 0 3 6 3 6 2 5 7 5 7 4 6 6 0 (p, , ), (q, S) 1 (q, , S), (q, ) 2 (q, , S), (q, B) 3 (q, , S), (q, aSa) 4 (q, , B), (q, ) 5 (q, , B), (q, bB) 6 (q, a, a), (q, ) 7 (q, b, b), (q, ) state p q q q q q q q q q q q q q a a b b a a b b a a b b a b b a b b b b b b b b b b unread input a a a a a a a a a a a a a a a a a a a a a a a a a S aSa Sa aSaa Saa Baa bBaa Baa bBaa Baa aa a stack Another Example L = {anbmcpdq : m + n = p + q} Another Example L = {anbmcpdq : m + n = p + q} (1) S aSd (2) S T (3) S U (4) T aTc (5) T V (6) U bUd (7) U V (8) V bVc (9) V input = a a b c d d Another Example L = {anbmcpdq : m + n = p + q} 0 (p, , ), (q, S) (1) S aSd 1 (q, , S), (q, aSd) (2) S T 2 (q, , S), (q, T) (3) S U 3 (q, , S), (q, U) (4) T aTc 4 (q, , T), (q, aTc) (5) T V 5 (q, , T), (q, V) (6) U bUd 6 (q, , U), (q, bUd) (7) U V 7 (q, , U), (q, V) (8) V bVc 8 (q, , V), (q, bVc) (9) V 9 (q, , V), (q, ) 10 (q, a, a), (q, ) 11 (q, b, b), (q, ) input = a a b c d d 12 (q, c, c), (q, ) 13 (q, d, d), (q, ) trans state unread input stack The Other Way to Build a PDA - Directly L = {anbmcpdq : m + n = p + q} (1) S aSd (2) S T (3) S U (4) T aTc (5) T V input = a a b c d d (6) U bUd (7) U V (8) V bVc (9) V The Other Way to Build a PDA - Directly L = {anbmcpdq : m + n = p + q} (1) S aSd (2) S T (3) S U (4) T aTc (5) T V a//a (6) U bUd (7) U V (8) V bVc (9) V b//a b//a 1 c/a/ c/a/ 2 d/a/ d/a/ 3 c/a/ 4 d/a/ d/a/ input = a a b c d d Notice Nondeterminism Machines constructed with the algorithm are often nondeterministic, even when they needn't be. This happens even with trivial languages. Example: AnBn = {anbn: n 0} A grammar for AnBn is: [1] S aSb [2] S A PDA M for AnBn is: (0) (1) (2) (3) (4) ((p, , ), (q, S)) ((q, , S), (q, aSb)) ((q, , S), (q, )) ((q, a, a), (q, )) ((q, b, b), (q, )) But transitions 1 and 2 make M nondeterministic. A directly constructed machine for AnBn: Bottom-Up The idea: Let the stack keep track of what has been found. (1) E E + T (2) E T (3) T T F (4) T F (5) F (E) (6) F id Reduce Transitions: (1) (p, , T + E), (p, E) (2) (p, , T), (p, E) (3) (p, , F T), (p, T) (4) (p, , F), (p, T) (5) (p, , )E( ), (p, F) (6) (p, , id), (p, F) Shift Transitions (7) (p, id, ), (p, id) (8) (p, (, ), (p, () (9) (p, ), ), (p, )) (10) (p, +, ), (p, +) (11) (p, , ), (p, ) A Bottom-Up Parser The outline of M is: M = ({p, q}, , V, , p, {q}), where contains: ● The shift transitions: ((p, c, ), (p, c)), for each c . ● The reduce transitions: ((p, , (s1s2…sn.)R), (p, X)), for each rule X s1s2…sn. in G. ● The finish up transition: ((p, , S), (q, )). Going The Other Way Lemma: If a language is accepted by a pushdown automaton M, it is context-free (i.e., it can be described by a context-free grammar). Proof (by construction): Step 1: Convert M to restricted normal form: ● M has a start state s that does nothing except push a special symbol # onto the stack and then transfer to a state s from which the rest of the computation begins. There must be no transitions back to s. ● M has a single accepting state a. All transitions into a pop # and read no input. ● Every transition in M, except the one from s, pops exactly one symbol from the stack. Converting to Restricted Normal Form Example: {wcwR : w {a, b}*} Add s and a: Pop no more than one symbol: M in Restricted Normal Form [1] [3] Pop exactly one symbol: Replace [1], [2] and [3] with: [1] ((s, a, #), (s, a#)), ((s, a, a), (s, aa)), ((s, a, b), (s, ab)), [2] ((s, b, #), (s, b#)), ((s, b, a), (s, ba)), ((s, b, b), (s, bb)), [3] ((s, c, #), (f, #)), ((s, c, a), (f, a)), ((s, c, b), (f, b)) [2] Must have one transition for everything that could have been on the top of the stack so it can be popped and then pushed back on. Second Step - Creating the Productions Example: WcWR M= The basic idea – simulate a leftmost derivation of M on any input string. Second Step - Creating the Productions Example: abcba Nondeterminism and Halting 1. There are context-free languages for which no deterministic PDA exists. 2. It is possible that a PDA may ● not halt, ● not ever finish reading its input. 3. There exists no algorithm to minimize a PDA. It is undecidable whether a PDA is minimal. Nondeterminism and Halting It is possible that a PDA may ● not halt, ● not ever finish reading its input. Let = {a} and consider M = L(M) = {a}: (1, a, ) |- (2, a, a) |- (3, , ) On any other input except a: ● M will never halt. ● M will never finish reading its input unless its input is . Solutions to the Problem ● For NDFSMs: ● Convert to deterministic, or ● Simulate all paths in parallel. ● For NDPDAs: ● Formal solutions that usually involve changing the grammar. ● Practical solutions that: ● Preserve the structure of the grammar, but ● Only work on a subset of the CFLs. Alternative Equivalent Definitions of a PDA Accept by accepting state at end of string (i.e., we don't care about the stack). From M (in our definition) we build M (in this one): 1. Initially, let M = M. 2. Create a new start state s. Add the transition: ((s, , ), (s, #)). 3. Create a new accepting state qa. 4. For each accepting state a in M do, 4.1 Add the transition ((a, , #), (qa, )). 5. Make qa the only accepting state in M. Example The balanced parentheses language What About These? ● FSM plus FIFO queue (instead of stack)? ● FSM plus two stacks? Comparing Regular and Context-Free Languages Regular Languages ● regular exprs. ● or ● regular grammars grammars ● recognize ● = DFSMs Context-Free Languages ● context-free ● parse ● = NDPDAs
© Copyright 2024 Paperzz