
CS 154
Formal Languages and Computability
March 22 Class Meeting
Department of Computer Science
San Jose State University
Spring 2016
Instructor: Ron Mak
www.cs.sjsu.edu/~mak
Context-Free Decidable Properties

There is an algorithm to decide whether or not
a given string w can be derived from a context-free
grammar G. YES

Membership: Is w in L(G)?

There is an algorithm to decide whether or not
L(G) is empty for a context-free grammar G. YES

There is an algorithm to decide whether or not
L(G) is infinite for a context-free grammar G. YES
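
Two of the YES answers above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not necessarily the algorithm presented in class: `cyk` decides membership with the CYK algorithm and assumes the grammar is already in Chomsky normal form, while `is_empty` decides emptiness by marking the variables that can derive some terminal string. The function names, the grammar encoding, and the sample grammars are all chosen here for illustration.

```python
def cyk(word, unit_rules, binary_rules, start):
    """Membership test for a grammar in Chomsky normal form (CYK algorithm).

    unit_rules:   list of (A, a)      for productions A -> a
    binary_rules: list of (A, (B, C)) for productions A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # this sketch ignores a possible S -> lambda rule
    # table[i][j] holds the variables that derive word[i..j] (inclusive).
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][i] = {lhs for lhs, terminal in unit_rules if terminal == symbol}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):  # split into word[i..k] and word[k+1..j]
                for lhs, (left, right) in binary_rules:
                    if left in table[i][k] and right in table[k + 1][j]:
                        table[i][j].add(lhs)
    return start in table[0][n - 1]


def is_empty(productions, terminals, start):
    """Emptiness test: L(G) is empty iff the start variable is not generating."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in generating and all(
                s in terminals or s in generating for s in rhs
            ):
                generating.add(lhs)
                changed = True
    return start not in generating


# Hypothetical CNF grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A X,  X -> S B,  A -> a,  B -> b
UNIT = [("A", "a"), ("B", "b")]
BINARY = [("S", ("A", "B")), ("S", ("A", "X")), ("X", ("S", "B"))]

print(cyk("aabb", UNIT, BINARY, "S"))             # True
print(cyk("aab", UNIT, BINARY, "S"))              # False
print(is_empty([("S", ["a", "S"])], {"a"}, "S"))  # True: S never terminates
print(is_empty([("S", ["a", "S"]), ("S", ["b"])], {"a", "b"}, "S"))  # False
```

CYK runs in time cubic in the length of w, which is why practical parsers restrict the grammar instead of using it directly.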
Computer Science Dept.
Spring 2016: March 22
CS 154: Formal Languages and Computability
© R. Mak
More Context-Free Decision Problems

There is an algorithm to decide whether or not a
given context-free grammar G is ambiguous. NO

There is an algorithm to decide whether or not
two given context-free languages L1 and L2
share a common word. NO

There is an algorithm to decide whether or not
two given context-free grammars G1 and G2
generate the same language. NO
Programming Language Parsers

A parser uses a context-free grammar for a
programming language.

The parser analyzes source program
statements to determine if they can be
generated by the grammar.



Are the statements syntactically correct?
A parser is a major component of a compiler.
Parsers use either a top-down
or a bottom-up approach.
Top-Down Parsers

A top-down parser starts with the higher-level
productions and works its way down to the
lower-level productions.

Example: When Simple Calculator parses an
expression, it starts at the highest level:
<expression> → <simple expression> → <term> → <factor> → <primary> → <number>

Top-down parsers are relatively easy to
understand, write, and debug.


But they are not very efficient.
Lots of recursive calls.
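
A top-down parser is often written as a recursive-descent parser: one function per grammar variable, each calling the functions for the lower-level variables. The sketch below is illustrative only. The Simple Calculator grammar is not reproduced in these slides, so the code assumes a smaller three-level grammar (expression, term, factor) over integers, `+`, `*`, and parentheses, and all class and function names are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[+*()])")


def tokenize(text):
    """Split the input into number, operator, and parenthesis tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Assumed grammar (iteration written with *):
         expression -> term ('+' term)*
         term       -> factor ('*' factor)*
         factor     -> '(' expression ')' | number
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):  # highest level: called first
        value = self.term()
        while self.peek() == "+":
            self.take()
            value += self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.take()
            value *= self.factor()
        return value

    def factor(self):  # lowest level: numbers and parentheses
        token = self.take()
        if token == "(":
            value = self.expression()  # recursive call back to the top level
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError("unexpected input after expression")
    return value


print(evaluate("(2+3)*4"))  # 20
print(evaluate("2+3*4"))    # 14
```

Even the single number `7` costs three nested calls (expression, term, factor), which is the overhead the slide refers to.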
Bottom-Up Parsers

Bottom-up parsers can be very efficient.

But they can also be hard to understand and debug.

Given a context-free grammar G, we can create
a PDA that implements a bottom-up parser for
the language L(G).

The parser should be deterministic.

We will only consider deterministic context-free
languages defined by LR(1) grammars.

L: the input is parsed left to right.
R: the parse traces a rightmost derivation in reverse.
(1): only one symbol of lookahead is needed.
Shift-Reduce Parsing

A shift-reduce bottom-up parser is a PDA.

It has two repeated operations:


Shift: Push the next input symbol onto the stack.

Reduce: If the symbols on top of the stack match
the right side of a production, replace those symbols
on the stack by the variable on the left side of the
production.
Terminate and accept when the input is used up and
only the start variable is left on the stack.
Shift-Reduce Parsing, cont’d

Questions:

Should you reduce now, or read and shift
more symbols from the input?

Use which production to reduce?
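
One way to see how both questions get answered is to hand-code the decisions for a small grammar. The sketch below assumes the expression grammar E → E+T | T, T → T*F | F, F → (E) | n and uses one symbol of lookahead to choose between shifting and reducing. The decision rules in `choose_reduction` are written by hand for this one grammar (they are not a generated LR(1) table), and all names are hypothetical.

```python
def choose_reduction(stack, lookahead):
    """Return (lhs, rhs) for the production to reduce by, or None to shift."""

    def ends(*symbols):
        return stack[-len(symbols):] == list(symbols)

    if ends("n"):
        return "F", ["n"]
    if ends("(", "E", ")"):
        return "F", ["(", "E", ")"]
    if ends("T", "*", "F"):          # prefer T -> T*F over T -> F
        return "T", ["T", "*", "F"]
    if ends("F"):
        return "T", ["F"]
    if lookahead != "*":             # a following * means T is not finished
        if ends("E", "+", "T"):      # prefer E -> E+T over E -> T
            return "E", ["E", "+", "T"]
        if ends("T"):
            return "E", ["T"]
    return None


def shift_reduce(text):
    """Return the list of (operation, stack, remaining input) steps,
    or None if the input is rejected."""
    stack, pos, steps = [], 0, []
    while True:
        lookahead = text[pos] if pos < len(text) else None
        rule = choose_reduction(stack, lookahead)
        if rule is not None:
            lhs, rhs = rule
            del stack[len(stack) - len(rhs):]   # pop the right side
            stack.append(lhs)                   # push the left side
            operation = f"reduce {lhs} -> {''.join(rhs)}"
        elif lookahead is not None:
            stack.append(lookahead)             # shift the next input symbol
            pos += 1
            operation = "shift"
        else:
            # Input used up and nothing to reduce: accept only on a lone E.
            return steps if stack == ["E"] else None
        steps.append((operation, "".join(stack), text[pos:]))


for operation, stack, rest in shift_reduce("(n+n)*n"):
    print(f"{operation:<18}{stack:<8}{rest}")
```

Every reduction replaces the right side of a production with its left side, so an accepted string is guaranteed to be derivable from E. The lookahead test is what keeps the parser from reducing T to E too early when a `*` follows.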
Shift-Reduce Parsing Example

Grammar:
E → E + T | T
T → T * F | F
F → (E) | n

Is (n+n)*n syntactically correct?

Operation  Stack (top at right)  Input     Rule
                                 (n+n)*n
shift      (                     n+n)*n
shift      (n                    +n)*n
reduce     (F                    +n)*n     F → n
reduce     (T                    +n)*n     T → F
reduce     (E                    +n)*n     E → T
shift      (E+                   n)*n
shift      (E+n                  )*n
reduce     (E+F                  )*n       F → n
reduce     (E+T                  )*n       T → F
reduce     (E                    )*n       E → E + T (not E → T)
shift      (E)                   *n
reduce     F                     *n        F → (E)
reduce     T                     *n        T → F
shift      T*                    n         (not E → T)
shift      T*n
reduce     T*F                             F → n
reduce     T                               T → T * F (not T → F)
reduce     E                               E → T

This grammar uses recursion instead of iteration.
Table-Driven Shift-Reduce Parsing

With an LR(1) grammar, one symbol lookahead
of the input is enough for the parser to decide
whether to shift or to reduce.

The decisions can be encoded in a table:


Is the next operation a shift or a reduce?
What is the next state of the parser?
Table-Driven Shift-Reduce Parsing, cont’d

The PDA that underlies the parser has a stack
where the entries alternate between symbols
(both terminals and variables) and states.


Example (top at right): … 4B7a
Given the current state of the parser and the
next input symbol, the table specifies one of
four possible actions: shift, reduce, accept,
or error.
Table-Driven Shift-Reduce Parsing, cont’d

Example stack (top at right): … 4B7a

Shift
1. Push the current state onto the stack (e.g., 7).
2. Push the next input symbol onto the stack (e.g., a).
3. Change to the state specified in the table.
4. Do the next operation specified in the table based on
   the new state and the next input symbol.

Example: “s3” in the table means to shift
and change to state 3.
Table-Driven Shift-Reduce Parsing, cont’d

Stack (top at right): … 4B7a becomes … 4C

Reduce by the specified production

Example: “R1” in the table means to use production 1.
Assume it’s C → Ba.

1. Pop off the right-hand-side symbols B and a
   and the intervening state 7 and push C.
2. Change to the new state in the table based on
   the top two items on the stack, 4 and C.
3. Do the next operation in the table based on
   the new state and the next input symbol.
Table-Driven Shift-Reduce Parsing, cont’d

Accept


Stop and accept the input string
(i.e., it’s syntactically correct)
Error

A blank table entry indicates an error in the input
(i.e., it’s syntactically incorrect)
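
The stack mechanics of the shift and reduce actions can be sketched on the example stack … 4B7a. The code below is only an illustration: the state numbers 8 and 9 and the goto entry for (4, C) are hypothetical, since the slides do not give a table for this fragment, and the reduce step assumes a production with a non-empty right side.

```python
def shift(stack, state, symbol, target):
    """Shift: push the current state and the input symbol, then enter target."""
    return stack + [state, symbol], target


def reduce(stack, lhs, rhs_length, goto):
    """Reduce: pop rhs_length symbols plus the rhs_length - 1 states between
    them, push lhs, and look up the new state from the exposed state and lhs.
    Assumes rhs_length >= 1 (no lambda productions)."""
    remaining = stack[:len(stack) - (2 * rhs_length - 1)]
    exposed_state = remaining[-1]
    return remaining + [lhs], goto[(exposed_state, lhs)]


# Before the shift: stack ... 4 B, current state 7, next input symbol a.
stack, state = shift([4, "B"], 7, "a", target=8)     # target 8 is hypothetical
print(stack, state)                                  # [4, 'B', 7, 'a'] 8

# Reduce by C -> B a; the goto entry (4, C) -> 9 is hypothetical.
stack, state = reduce(stack, "C", 2, {(4, "C"): 9})
print(stack, state)                                  # [4, 'C'] 9
```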
Shift-Reduce Parsing Example #2

1. <stmt> → real <idlist>
2. <idlist> → <idlist> , <id>
3. <idlist> → <id>
4. <id> → A | B | C | D

State  real  ,    A|B|C|D  λ       <stmt>  <id>  <idlist>
0      s1                          2
1                 s3                       4     5
2                          accept
3            R4            R4
4            R3            R3
5            s6            R1
6                 s3                       7
7            R2            R2
Shift-Reduce Parsing Example #2, cont’d

Is real A, B syntactically correct?

State  Stack (top at right)            Input      Operation
0      0                               real A, B  s1
1      0 real                          A, B       s3
3      0 real 1 A                      , B        R4
4      0 real 1 <id>                   , B        R3
5      0 real 1 <idlist>               , B        s6
6      0 real 1 <idlist> 5 ,           B          s3
3      0 real 1 <idlist> 5 , 6 B                  R4
7      0 real 1 <idlist> 5 , 6 <id>               R2
5      0 real 1 <idlist>                          R1
2      0 <stmt>                                   accept
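
The table and the four actions fit in a short driver loop. The sketch below encodes the parsing table for the grammar <stmt> → real <idlist>, <idlist> → <idlist> , <id> | <id>, <id> → A | B | C | D as Python dictionaries and runs it on real A, B. The dictionary layout, the `letter` column name for A|B|C|D, and the function name are assumptions made for illustration; the table entries themselves are the ones from the example.

```python
END = "λ"  # end-of-input marker, as in the table

# Production number -> (left side, number of right-side symbols).
PRODUCTIONS = {1: ("<stmt>", 2), 2: ("<idlist>", 3), 3: ("<idlist>", 1), 4: ("<id>", 1)}

# (state, input column) -> action; a missing entry is an error.
ACTION = {
    (0, "real"): "s1",
    (1, "letter"): "s3",
    (2, END): "accept",
    (3, ","): "R4", (3, END): "R4",
    (4, ","): "R3", (4, END): "R3",
    (5, ","): "s6", (5, END): "R1",
    (6, "letter"): "s3",
    (7, ","): "R2", (7, END): "R2",
}

# (exposed state, variable) -> new state after a reduce.
GOTO = {(0, "<stmt>"): 2, (1, "<id>"): 4, (1, "<idlist>"): 5, (6, "<id>"): 7}


def parse(tokens):
    """Return (accepted, list of operations performed)."""
    tokens = list(tokens) + [END]
    state, stack, pos, operations = 0, [], 0, []
    while True:
        token = tokens[pos]
        column = "letter" if token in ("A", "B", "C", "D") else token
        operation = ACTION.get((state, column))
        if operation is None:               # blank entry: syntax error
            return False, operations
        operations.append(operation)
        if operation == "accept":
            return True, operations
        number = int(operation[1:])
        if operation[0] == "s":             # shift and change state
            stack += [state, token]
            state = number
            pos += 1
        else:                               # reduce by production `number`
            lhs, rhs_length = PRODUCTIONS[number]
            del stack[len(stack) - (2 * rhs_length - 1):]
            stack.append(lhs)
            state = GOTO[(stack[-2], lhs)]


accepted, operations = parse(["real", "A", ",", "B"])
print(accepted, " ".join(operations))
# True s1 s3 R4 R3 s6 s3 R4 R2 R1 accept
```

The driver never consults the grammar itself, only the production lengths and left sides. Everything the parser "knows" about the language is in the two tables.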
How to Construct an LR(1) Parsing Table

Constructing a table for an LR(1) parser is
tantamount to building a deterministic PDA
for the parser.

The construction algorithm first determines
the state change and shift operations, and
then it determines the reduction operations.

Compiler construction utilities like Unix’s yacc
(yet another compiler-compiler) or GNU’s bison
will build these tables based on grammar files
that you provide. (By default both generate LALR(1)
tables, a more compact variant of LR(1).)
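
The first phase, finding the states and the shift and state-change entries, can be sketched with LR(0) item sets. This is a simplified illustration rather than a full LR(1) construction: it ignores lookahead, so it does not compute the reduce entries. The augmented production <start> → <stmt>, the single terminal `letter` standing for A|B|C|D, and all function names are assumptions introduced here.

```python
from collections import deque

GRAMMAR = [
    ("<start>", ("<stmt>",)),                  # augmented start production (added)
    ("<stmt>", ("real", "<idlist>")),
    ("<idlist>", ("<idlist>", ",", "<id>")),
    ("<idlist>", ("<id>",)),
    ("<id>", ("letter",)),                     # letter stands for A|B|C|D
]
VARIABLES = {lhs for lhs, _ in GRAMMAR}


def closure(items):
    """An item is (production index, dot position). For every variable that
    appears right after a dot, add its productions with the dot at the start."""
    result = set(items)
    work = list(items)
    while work:
        production, dot = work.pop()
        rhs = GRAMMAR[production][1]
        if dot < len(rhs) and rhs[dot] in VARIABLES:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    work.append((index, 0))
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item where that is possible."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def build_states():
    start = closure({(0, 0)})
    states = {start: 0}            # item set -> state number
    transitions = {}               # (state, symbol) -> state
    queue = deque([start])
    while queue:
        current = queue.popleft()
        symbols = {GRAMMAR[p][1][d] for p, d in current if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(current, symbol)
            if target not in states:
                states[target] = len(states)
                queue.append(target)
            transitions[(states[current], symbol)] = states[target]
    return states, transitions


states, transitions = build_states()
print(len(states), "states,", len(transitions), "transitions")
# 8 states, 8 transitions
```

Transitions on terminals become shift entries and transitions on variables become the state-change entries used after a reduce. The state numbers this sketch assigns depend on its traversal order and need not match a hand-built table.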
Review for the Midterm

Regular languages
  Kleene’s theorem
  closure properties
  pumping lemma
Review for the Midterm, cont’d

Context-free grammars and languages
  leftmost and rightmost derivations
  derivation trees
  ambiguity
  transforming and simplifying grammars
  Chomsky and Greibach normal forms
  closure properties
  pumping lemma
Review for the Midterm, cont’d

Nondeterministic pushdown automata (NPDA)
  flowchart programming
  transition function
  relationship to context-free languages
  parsers for context-free languages
  JavaCC
    BNF
    expression grammars
    calculators