
Chapter 3-3
Chang Chi-Chung
2015.07.03
Bottom-Up Parsing

LR methods (Left-to-right, Rightmost
derivation)





LR(0), SLR, Canonical LR = LR(1), LALR
Rightmost derivation
No right recursion
Unambiguous grammar
Other special cases:


Shift-reduce parsing
Operator-precedence parsing
Bottom-Up Parsing
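Grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
Rightmost derivation:
E ⇒ T ⇒ T * F ⇒ T * id ⇒ F * id ⇒ id * id
Reduction: bottom-up parsing traces this rightmost derivation in reverse, reducing id * id step by step to E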
LR(0)


An LR parser makes shift-reduce decisions
by maintaining states to keep track.
An item of a grammar G is a production of G
with a dot at some position of the body.

Example
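A → X Y Z
items
A → .X Y Z
A → X.Y Z
A → X Y.Z
A → X Y Z.
In the item A → X.Y Z, the part before the dot (X) is already on the stack, and the part after the dot (Y Z) describes the next derivations expected from the input string.
Note that the production A → ε has only one item, [A → .]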
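The item definition can be illustrated with a minimal Python sketch. The names `lr0_items` and `show` and the `(head, body, dot)` tuple encoding are assumptions made for this example, not part of the slides.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head -> body as (head, body, dot) triples."""
    body = tuple(body)
    # One item per dot position: before each symbol, plus one at the very end.
    return [(head, body, dot) for dot in range(len(body) + 1)]


def show(item):
    """Render an item with '.' marking the dot position."""
    head, body, dot = item
    symbols = list(body[:dot]) + ["."] + list(body[dot:])
    return f"{head} -> {' '.join(symbols)}"


for item in lr0_items("A", "XYZ"):
    print(show(item))

# An empty body (A -> epsilon) yields exactly one item.
print(show(lr0_items("A", "")[0]))
```

A production with n body symbols has n + 1 items, which is why the empty production has exactly one.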
LR(0)

Canonical LR(0) Collection


One collection of sets of LR(0) items
Provide the basis for constructing a DFA that is
used to make parsing decisions.


LR(0) automaton
The canonical LR(0) collection for a grammar

Augmented grammar



If G is a grammar with start symbol S, then G’ is the
augmented grammar for G with new start symbol S’ and
new production S’ → S
Closure function
Goto function
Shift-Reduce Parsing

Shift-Reduce Parsing is a form of bottom-up
parsing



A stack holds grammar symbols
An input buffer holds the rest of the string to be parsed.
Shift-Reduce parser action




shift
reduce
accept
error
Shift-Reduce Parsing

Shift
  Shift the next input symbol onto the top of the stack.
Reduce
  The right end of the string to be reduced must be at the top of the stack.
  Locate the left end of the string within the stack and decide with what nonterminal to replace the string.
Accept
  Announce successful completion of parsing.
Error
  Discover a syntax error and call a recovery routine.
Conflicts During Shift-Reduce Parsing


Conflict types
  shift-reduce
  reduce-reduce
Shift-reduce and reduce-reduce conflicts are caused by
  The limitations of the LR parsing method (even when the grammar is unambiguous)
  Ambiguity of the grammar
Use of the LR(0) Automaton
Function Closure


If I is a set of items for a grammar G, then closure(I) is
the set of items constructed from I.
Create closure(I) by the two rules:



1. Add every item in I to closure(I).
2. If A → α.Bβ is in closure(I) and B → γ is a production, then add the item B → .γ to closure(I). Apply this rule until no more new items can be added to closure(I).
Divide all the sets of items into two classes

Kernel items


initial item S’ → .S, and all items whose dots are not at the left end.
Nonkernel items

All items with their dots at the left end, except for S’ → .S
Example


The grammar G
E’ → E
E →E+T | T
T →T*F | F
F → ( E ) | id
Let I = { E’ → .E } , then
closure(I) = {
E’ →.E
E →.E + T
E →.T
T →.T * F
T →.F
F →.( E )
F →.id }
Exercise


The grammar G
E’ → E
E →E+T | T
T →T*F | F
F → ( E ) | id
Let I = { E → E +. T }
Function Goto

Function Goto(I, X)




I is a set of items
X is a grammar symbol
Goto(I, X) is defined to be the closure of the set of all items [A → αX.β] such that [A → α.Xβ] is in I.
The Goto function is used to define the transitions in the LR(0) automaton for a grammar.
Example


I={
E’ → E .
E → E .+ T }
Goto (I, +) = {
E → E +. T
T →. T * F
T →.F
F →.(E)
F →.id
}
The grammar G
E’ → E
E →E+T | T
T →T*F | F
F → ( E ) | id
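A minimal sketch of Goto for this example, assuming the same hypothetical `(head, body, dot)` item encoding: move the dot over X in every item that allows it, then take the closure.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items):
    """LR(0) closure of a set of (head, body, dot) items."""
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Advance the dot over `symbol` wherever possible, then close the result."""
    moved = {(h, b, d + 1) for (h, b, d) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
J = goto(I, "+")
print(len(J))
```

When no item in I has the dot before X, `goto` returns the empty set, which the construction of the collection treats as "no transition".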
Constructing the LR(0) Collection
1. The grammar is augmented with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → •S]}) }
   (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that GOTO(I, X) ∉ C and GOTO(I, X) ≠ ∅, add the set of items GOTO(I, X) to C
4. Repeat 3 until no more sets can be added to C
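The four steps above can be sketched as follows for the expression grammar. The function name `canonical_collection`, the item encoding, and the order of `SYMBOLS` are assumptions of this sketch; state numbers depend on the order in which symbols are tried.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = ["E", "T", "F", "(", "id", "+", "*", ")"]   # N ∪ T, in a chosen order


def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        _, body, dot = worklist.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            for gamma in GRAMMAR[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def goto(items, symbol):
    return closure({(h, b, d + 1) for (h, b, d) in items
                    if d < len(b) and b[d] == symbol})


def canonical_collection(start_item):
    """Return (states, transitions) of the LR(0) automaton."""
    start = closure({start_item})             # step 2
    states, index, transitions = [start], {start: 0}, {}
    i = 0
    while i < len(states):                    # steps 3-4: run until no new sets
        for x in SYMBOLS:
            target = goto(states[i], x)
            if not target:                    # GOTO(I, X) is empty
                continue
            if target not in index:           # GOTO(I, X) not yet in C
                index[target] = len(states)
                states.append(target)
            transitions[(i, x)] = index[target]
        i += 1
    return states, transitions


states, transitions = canonical_collection(("E'", ("E",), 0))
print(len(states), "states")
```

With this symbol order the sketch yields twelve item sets, numbered 0 to 11 in order of discovery.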
Model of an LR Parser
Structure of the LR Parsing Table

Parsing Table consists of two parts:



A parsing-action function ACTION
A goto function GOTO
The Action function, Action[i, a], has one of four forms:
Shift j, where j is a state.
  The action taken by the parser shifts input a to the stack, but uses state j to represent a.
Reduce A → β.
  The action of the parser reduces β on the top of the stack to head A.
Accept
  The parser accepts the input and finishes parsing.
Error
Structure of the LR Parsing Table(1)

The GOTO function, GOTO[Ii, A], is defined on sets of items.
  If GOTO[Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to state j.
LR-Parser Configurations
Configuration (= LR parser state):
(s0 s1 s2 … sm, ai ai+1 … an $)
  stack: s0 s1 s2 … sm; input: ai ai+1 … an $
  Each state above s0 stands for a grammar symbol, so the stack can also be read as $ X1 X2 … Xm
If action[sm, ai] = shift s then push s (representing ai) and advance the input:
(s0 s1 s2 … sm, ai ai+1 … an $) ⇒ (s0 s1 s2 … sm s, ai+1 … an $)
If action[sm, ai] = reduce A → β and goto[sm-r, A] = s with r = |β| then pop r symbols and push s (representing A):
(s0 s1 s2 … sm, ai ai+1 … an $) ⇒ (s0 s1 s2 … sm-r s, ai ai+1 … an $)
If action[sm, ai] = accept then stop
If action[sm, ai] = error then attempt recovery
Example LR Parse Table
Grammar:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

        action                              goto
state   id    +     *     (     )     $     E   T   F
0       s5                s4                1   2   3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8   2   3
5             r6    r6          r6    r6
6       s5                s4                    9   3
7       s5                s4                        10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

sN = shift and go to state N (e.g. s5 = shift & goto 5)
rN = reduce by production #N (e.g. r1 = reduce by production #1)
Example
Grammar
0. S → E
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → ( E )
6. F → id

Line  STACK       SYMBOLS   INPUT             ACTION
(1)   0                     id * id + id $    shift 5
(2)   0 5         id        * id + id $       reduce 6 goto 3
(3)   0 3         F         * id + id $       reduce 4 goto 2
(4)   0 2         T         * id + id $       shift 7
(5)   0 2 7       T *       id + id $         shift 5
(6)   0 2 7 5     T * id    + id $            reduce 6 goto 10
(7)   0 2 7 10    T * F     + id $            reduce 3 goto 2
(8)   0 2         T         + id $            reduce 2 goto 1
(9)   0 1         E         + id $            shift 6
(10)  0 1 6       E +       id $              shift 5
(11)  0 1 6 5     E + id    $                 reduce 6 goto 3
(12)  0 1 6 3     E + F     $                 reduce 4 goto 9
(13)  0 1 6 9     E + T     $                 reduce 1 goto 1
(14)  0 1         E         $                 accept
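The table-driven loop behind this trace can be sketched in Python. The dictionaries below transcribe the example parse table for the expression grammar; the names `ACTION`, `GOTO`, `PRODUCTIONS` and `parse` are assumptions of this sketch, and error recovery is reduced to raising an exception.

```python
# Production number -> (head, length of body).
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}
REDUCE_ON = ["+", "*", ")", "$"]
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: dict.fromkeys(REDUCE_ON, "r4"),
    4: {"id": "s5", "(": "s4"},
    5: dict.fromkeys(REDUCE_ON, "r6"),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: dict.fromkeys(REDUCE_ON, "r3"),
    11: dict.fromkeys(REDUCE_ON, "r5"),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def parse(tokens):
    """Run the LR driver; return the list of actions taken."""
    stack, pos, trace = [0], 0, []
    tokens = list(tokens) + ["$"]
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                      # empty table entry = error
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            trace.append("accept")
            return trace
        n = int(act[1:])
        if act[0] == "s":                    # shift: push state, advance input
            stack.append(n)
            pos += 1
            trace.append(f"shift {n}")
        else:                                # reduce by production n
            head, length = PRODUCTIONS[n]
            if length:                       # guard: del stack[-0:] would clear it
                del stack[-length:]
            stack.append(GOTO[(stack[-1], head)])
            trace.append(f"reduce {n} goto {stack[-1]}")


for step in parse(["id", "*", "id", "+", "id"]):
    print(step)
```

Only states are kept on the stack here; the SYMBOLS column of the trace is implied by them.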
SLR Grammars


SLR (Simple LR): a simple extension of LR(0) shift-reduce parsing
SLR eliminates some conflicts by populating the parsing table with reductions A → α only on symbols in FOLLOW(A)
Grammar:
S → E
E → id + E
E → id
State I0:
S → •E
E → •id + E
E → •id
State I2 = goto(I0, id):
E → id•+ E    shift on + (goto(I2, +) = I3)
E → id•       FOLLOW(E) = {$}, thus reduce on $
SLR Parsing Table
Reductions do not fill entire rows
Otherwise the same as LR(0)
Grammar:
1. S → E
2. E → id + E
3. E → id

state   id    +     $     E
0       s2                1
1                   acc
2             s3    r3
3       s2                4
4                   r2

State 2: shift on +; FOLLOW(E) = {$}, thus reduce on $
Constructing SLR Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = {I0, I1, …, In} of sets of LR(0) items
3. State i is constructed from Ii
   If [A → α•aβ] ∈ Ii and goto(Ii, a) = Ij then set action[i, a] = shift j
   If [A → α•] ∈ Ii then set action[i, a] = reduce A → α for all a ∈ FOLLOW(A) (apply only if A ≠ S’)
   If [S’ → S•] is in Ii then set action[i, $] = accept
4. If goto(Ii, A) = Ij then set goto[i, A] = j (the goto table)
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S]
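The construction can be sketched for the small grammar S → E, E → id + E | id. This is an illustration under stated assumptions: items are `(production index, dot)` pairs, FOLLOW(E) = {$} is supplied by hand rather than computed, and the helper names are hypothetical.

```python
PRODUCTIONS = [("S", ("E",)), ("E", ("id", "+", "E")), ("E", ("id",))]  # nos. 1..3
NONTERMINALS = {"S", "E"}
SYMBOLS = ["E", "id", "+"]          # order chosen to fix the state numbering
FOLLOW = {"E": {"$"}}               # assumed given, not computed here


def closure(items):
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        body = PRODUCTIONS[p][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for q, (head, _) in enumerate(PRODUCTIONS):
                if head == body[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)


def goto(items, x):
    return closure({(p, d + 1) for p, d in items
                    if d < len(PRODUCTIONS[p][1]) and PRODUCTIONS[p][1][d] == x})


def set_action(action, state, symbol, value):
    """Fill one ACTION entry; two different values mean the grammar is not SLR."""
    if action.get((state, symbol), value) != value:
        raise ValueError(f"conflict in state {state} on {symbol!r}")
    action[(state, symbol)] = value


def slr_table():
    states = [closure({(0, 0)})]            # initial state holds [S -> .E]
    action, goto_table = {}, {}
    i = 0
    while i < len(states):
        for x in SYMBOLS:
            target = goto(states[i], x)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if x in NONTERMINALS:
                goto_table[(i, x)] = j
            else:
                set_action(action, i, x, f"s{j}")
        for p, dot in states[i]:
            head, body = PRODUCTIONS[p]
            if dot == len(body):            # completed item
                if p == 0:
                    set_action(action, i, "$", "acc")
                else:
                    for a in FOLLOW[head]:
                        set_action(action, i, a, f"r{p + 1}")
        i += 1
    return action, goto_table


action, goto_table = slr_table()
print(sorted(action.items()))
```

Raising on a second, different entry for the same cell is one simple way to detect that a grammar is not SLR.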
Example SLR Grammar and LR(0) Items
Augmented grammar:
0. C’ → C
1. C → A B
2. A → a
3. B → a
I0 = closure({[C’ → •C]})
I1 = goto(I0, C) = closure({[C’ → C•]})
…
State I0 (start):
C’ → •C
C → •A B
A → •a
State I1 = goto(I0, C) (final):
C’ → C•
State I2 = goto(I0, A):
C → A•B
B → •a
State I3 = goto(I0, a):
A → a•
State I4 = goto(I2, B):
C → A B•
State I5 = goto(I2, a):
B → a•
Example SLR Parsing Table
Grammar:
0. C’ → C
1. C → A B
2. A → a
3. B → a
State I0 (start): C’ → •C, C → •A B, A → •a
State I1: C’ → C•
State I2: C → A•B, B → •a
State I3: A → a•
State I4: C → A B•
State I5: B → a•

state   a     $     C   A   B
0       s3          1   2
1             acc
2       s5                  4
3       r2
4             r1
5             r3
SLR and Ambiguity


Every SLR grammar is unambiguous, but not every unambiguous grammar is SLR; such a grammar may still be LR(1)
Consider for example the unambiguous grammar
S → L = R | R
L → * R | id
R → L
I0:
S’ → •S
S → •L=R
S → •R
L → •*R
L → •id
R → •L
I1:
S’ → S•
I2:
S → L•=R
R → L•
FOLLOW(R) = {=, $}, so state 2 gets two entries for =:
action[2, =] = s6 (shift, from S → L•=R)
action[2, =] = r5 (reduce by R → L)
Because of this shift-reduce conflict the grammar has no SLR parsing table
I3:
S → R•
I4:
L → *•R
R → •L
L → •*R
L → •id
I5:
L → id•
I6:
S → L=•R
R → •L
L → •*R
L → •id
I7:
L → *R•
I8:
R → L•
I9:
S → L=R•
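The claim FOLLOW(R) = {=, $} can be checked with a small fixed-point computation. This sketch assumes a grammar without ε-productions (true for this example), which keeps FIRST and FOLLOW simple; the function names are hypothetical.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("L", "=", "R")), ("S", ("R",)),
           ("L", ("*", "R")), ("L", ("id",)), ("R", ("L",))]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """FIRST for each nonterminal; valid only without epsilon-productions."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets():
    """FOLLOW for each nonterminal, iterated to a fixed point."""
    first = first_sets()
    follow = {n: set() for n in NONTERMINALS}
    follow["S'"].add("$")                      # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, x in enumerate(body):
                if x not in NONTERMINALS:
                    continue
                if i + 1 < len(body):          # something follows x in the body
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                          # x ends the body: inherit FOLLOW(head)
                    new = follow[head]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


follow = follow_sets()
print(sorted(follow["R"]))
```

Since = is in FOLLOW(R), the SLR rule places a reduce by R → L in the same cell as the shift on =.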
LR(1) Grammars



SLR is too simple
LR(1) parsing uses lookahead to avoid unnecessary conflicts in the parsing table
LR(1) item = LR(0) item + lookahead
LR(0) item: [A → α•β]
LR(1) item: [A → α•β, a]
SLR Versus LR(1)


Split the SLR states by adding LR(1) lookahead
Unambiguous grammar:
S → L = R | R
L → * R | id
R → L
SLR state I2:
S → L•=R
R → L•
splits into
S → L•=R    action[2, =] = s6
R → L•      should not reduce on =, because no right-sentential form begins with R =
LR(1) Items



An LR(1) item
[A → α•β, a]
contains a lookahead terminal a, meaning: α is already on top of the stack, expect to see βa
For items of the form
[A → α•, a]
the lookahead a is used to reduce A → α only if the next input is a
For items of the form
[A → α•β, a]
with β ≠ ε the lookahead has no effect
The Closure Operation for LR(1) Items



1. Start with closure(I) = I
2. If [A → α•Bβ, a] ∈ closure(I) then for each production B → γ in the grammar and each terminal b ∈ FIRST(βa), add the item [B → •γ, b] to closure(I) if not already there
3. Repeat 2 until no new items can be added
The Goto Operation for LR(1) Items


1. For each item [A → α•Xβ, a] ∈ I, add the set of items closure({[A → αX•β, a]}) to goto(I, X) if not already there
2. Repeat step 1 until no more items can be added to goto(I, X)
Example



The grammar G
S’ → S
S →CC
C →cC | d
Let I = { (S’ → •S, $) }
I0 = closure(I) = {
S’ → •S, $
S → • C C, $
C → •c C, c/d
C → •d, c/d
}
goto(I0, S) = closure( {S’ → S •, $ } )
= {S’ → S •, $ } = I1
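The LR(1) closure and goto for this example can be sketched as below. The sketch assumes a grammar without ε-productions, so FIRST(βa) is FIRST of the first symbol of β when β is non-empty and {a} otherwise; `first_of`, `closure1` and `goto1` are hypothetical names, and items are `(head, body, dot, lookahead)` tuples.

```python
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}


def first_of(symbol, seen=frozenset()):
    """FIRST of one symbol; valid because this grammar has no epsilon-productions."""
    if symbol not in GRAMMAR:
        return {symbol}                       # a terminal is its own FIRST set
    if symbol in seen:
        return set()                          # already being expanded (recursion)
    result = set()
    for body in GRAMMAR[symbol]:
        result |= first_of(body[0], seen | {symbol})
    return result


def closure1(items):
    """LR(1) closure of a set of (head, body, dot, lookahead) items."""
    result, work = set(items), list(items)
    while work:
        _, body, dot, a = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            beta = body[dot + 1:]
            lookaheads = first_of(beta[0]) if beta else {a}   # FIRST(beta a)
            for gamma in GRAMMAR[body[dot]]:
                for b in lookaheads:
                    item = (body[dot], gamma, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)


def goto1(items, x):
    """Advance the dot over x, keep each lookahead, then close."""
    return closure1({(h, b, d + 1, a) for h, b, d, a in items
                     if d < len(b) and b[d] == x})


I0 = closure1({("S'", ("S",), 0, "$")})
I1 = goto1(I0, "S")
print(len(I0), len(I1))
```

The shorthand c/d in the example stands for two separate items, which is why the sketch counts six items in I0.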
Exercise



Let I = { (S → C •C, $) }
I2 = closure(I) = ?
I3 = goto(I2, c) = ?
The grammar G
S’ → S
S →CC
C →cC | d
Construction of the sets of LR(1) Items




1. Augment the grammar with a new start symbol S’ and production S’ → S
2. Initially, set C = { closure({[S’ → •S, $]}) }
   (this is the start state of the DFA)
3. For each set of items I ∈ C and each grammar symbol X ∈ (N ∪ T) such that goto(I, X) ∉ C and goto(I, X) ≠ ∅, add the set of items goto(I, X) to C
4. Repeat 3 until no more sets can be added to C
LR(1) Automaton
Construction of the Canonical LR(1)
Parsing Tables
1. Augment the grammar with S’ → S
2. Construct the set C = {I0, I1, …, In} of sets of LR(1) items
3. State i of the parser is constructed from Ii
   If [A → α•aβ, b] ∈ Ii and goto(Ii, a) = Ij then set action[i, a] = shift j
   If [A → α•, a] ∈ Ii then set action[i, a] = reduce A → α (apply only if A ≠ S’)
   If [S’ → S•, $] is in Ii then set action[i, $] = accept
4. If goto(Ii, A) = Ij then set goto[i, A] = j
5. Repeat 3-4 until no more entries are added
6. The initial state i is the Ii holding item [S’ → •S, $]
Example
The grammar G
0. S’ → S
1. S → C C
2. C → c C
3. C → d

        ACTION            GOTO
state   c     d     $     S   C
0       s3    s4          1   2
1                   acc
2       s6    s7              5
3       s3    s4              8
4       r3    r3
5                   r1
6       s6    s7              9
7                   r3
8       r2    r2
9                   r2
Example Grammar and LR(1) Items



Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
Augment with S’ → S
LR(1) items:
I0:  [S’ → •S, $]      goto(I0, S) = I1
     [S → •L=R, $]     goto(I0, L) = I2
     [S → •R, $]       goto(I0, R) = I3
     [L → •*R, =/$]    goto(I0, *) = I4
     [L → •id, =/$]    goto(I0, id) = I5
     [R → •L, $]       goto(I0, L) = I2
I1:  [S’ → S•, $]
I2:  [S → L•=R, $]     goto(I2, =) = I6
     [R → L•, $]
I3:  [S → R•, $]
I4:  [L → *•R, =/$]    goto(I4, R) = I7
     [R → •L, =/$]     goto(I4, L) = I8
     [L → •*R, =/$]    goto(I4, *) = I4
     [L → •id, =/$]    goto(I4, id) = I5
I5:  [L → id•, =/$]
I6:  [S → L=•R, $]     goto(I6, R) = I9
     [R → •L, $]       goto(I6, L) = I10
     [L → •*R, $]      goto(I6, *) = I11
     [L → •id, $]      goto(I6, id) = I12
I7:  [L → *R•, =/$]
I8:  [R → L•, =/$]
I9:  [S → L=R•, $]
I10: [R → L•, $]
I11: [L → *•R, $]      goto(I11, R) = I13
     [R → •L, $]       goto(I11, L) = I10
     [L → •*R, $]      goto(I11, *) = I11
     [L → •id, $]      goto(I11, id) = I12
I12: [L → id•, $]
I13: [L → *R•, $]
Example LR(1) Parsing Table
Grammar:
1. S’ → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L

state   id    *     =     $     S   L   R
0       s5    s4                1   2   3
1                         acc
2                   s6    r6
3                         r3
4       s5    s4                    8   7
5                   r5    r5
6       s12   s11                   10  9
7                   r4    r4
8                   r6    r6
9                         r2
10                        r6
11      s12   s11                   10  13
12                        r5
13                        r4
LALR(1) Grammars



LR(1) parsing tables have many states
LALR(1) parsing (Look-Ahead LR) combines LR(1) states to reduce table size
Less powerful than LR(1)
  Merging states will not introduce shift-reduce conflicts, because shifts do not use lookaheads
  Merging states may introduce reduce-reduce conflicts, but seldom does so for grammars of programming languages
SLR and LALR tables for a grammar always have the same number of states, typically far fewer than the LR(1) table
  For a language like C: SLR and LALR > 100 states, LR(1) > 1000 states
Constructing LALR Parsing Tables

Two ways

Construction of the LALR parsing table from the
sets of LR(1) items.



Union the states
Requires much space and time
Construction of the LALR parsing table from the
sets of LR(0) items


Efficient
Used in practice.
Example
Productions: 1. S → C C   2. C → c C   3. C → d

LR(1)
        ACTION            GOTO
state   c     d     $     S   C
0       s3    s4          1   2
1                   acc
2       s6    s7              5
3       s3    s4              8
4       r3    r3
5                   r1
6       s6    s7              9
7                   r3
8       r2    r2
9                   r2

LALR(1) (states 3 and 6, 4 and 7, 8 and 9 merged)
        ACTION            GOTO
state   c     d     $     S   C
0       s36   s47         1   2
1                   acc
2       s36   s47             5
36      s36   s47             89
47      r3    r3    r3
5                   r1
89      r2    r2    r2
Constructing LALR(1) Parsing Tables


1. Construct sets of LR(1) items
2. Combine LR(1) sets of items that share the same core (the same first part, ignoring lookaheads)
I4:
[L → *•R, =/$]
[R → •L, =/$]
[L → •*R, =/$]
[L → •id, =/$]
I11:
[L → *•R, $]
[R → •L, $]
[L → •*R, $]
[L → •id, $]
Merged state:
[L → *•R, =/$]
[R → •L, =/$]
[L → •*R, =/$]
[L → •id, =/$]
[L → *•R, =/$] is shorthand for two items in the same set: [L → *•R, =] and [L → *•R, $]
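Merging by core can be sketched as a grouping step over LR(1) item sets. The function names and the `(head, body, dot, lookahead)` encoding are assumptions of this sketch; I4, I5 and I11 follow the LR(1) item listing for the grammar S → L = R | R.

```python
from collections import defaultdict


def core(state):
    """The core of an LR(1) item set: its items with lookaheads dropped."""
    return frozenset((head, body, dot) for head, body, dot, _ in state)


def merge_by_core(states):
    """Union all LR(1) item sets that share a core (first-seen order kept)."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return [frozenset(items) for items in groups.values()]


def star_state(lookaheads):
    """Items of the state reached on '*', one copy per lookahead."""
    cores = [("L", ("*", "R"), 1), ("R", ("L",), 0),
             ("L", ("*", "R"), 0), ("L", ("id",), 0)]
    return frozenset(c + (a,) for c in cores for a in lookaheads)


I4 = star_state(["=", "$"])       # lookaheads =/$
I11 = star_state(["$"])           # same core, lookahead $ only
I5 = frozenset({("L", ("id",), 1, "="), ("L", ("id",), 1, "$")})

merged = merge_by_core([I4, I5, I11])
print(len(merged), [len(s) for s in merged])
```

A full LALR construction would also redirect the goto transitions of merged states; this sketch covers only the merging of item sets.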
Example LALR(1) Grammar



Unambiguous LR(1) grammar:
S → L = R | R
L → * R | id
R → L
Augment with S’ → S
LALR(1) items:
I0:  [S’ → •S, $]      goto(I0, S) = I1
     [S → •L=R, $]     goto(I0, L) = I2
     [S → •R, $]       goto(I0, R) = I3
     [L → •*R, =/$]    goto(I0, *) = I4
     [L → •id, =/$]    goto(I0, id) = I5
     [R → •L, $]       goto(I0, L) = I2
I1:  [S’ → S•, $]
I2:  [S → L•=R, $]     goto(I2, =) = I6
     [R → L•, $]
I3:  [S → R•, $]
I4:  [L → *•R, =/$]    goto(I4, R) = I7
     [R → •L, =/$]     goto(I4, L) = I9
     [L → •*R, =/$]    goto(I4, *) = I4
     [L → •id, =/$]    goto(I4, id) = I5
I5:  [L → id•, =/$]
I6:  [S → L=•R, $]     goto(I6, R) = I8
     [R → •L, $]       goto(I6, L) = I9
     [L → •*R, $]      goto(I6, *) = I4
     [L → •id, $]      goto(I6, id) = I5
I7:  [L → *R•, =/$]
I8:  [S → L=R•, $]
I9:  [R → L•, =/$]
[R → L•, =/$] is shorthand for two items: [R → L•, =] and [R → L•, $]
Example LALR(1) Parsing Table
Grammar:
1. S’ → S
2. S → L = R
3. S → R
4. L → * R
5. L → id
6. R → L

state   id    *     =     $     S   L   R
0       s5    s4                1   2   3
1                         acc
2                   s6    r6
3                         r3
4       s5    s4                    9   7
5                   r5    r5
6       s5    s4                    9   8
7                   r4    r4
8                         r2
9                   r6    r6
LL, SLR, LR, LALR Summary

LL parse tables
  Nonterminals × terminals → productions
  Computed using FIRST/FOLLOW
LR parsing tables
  LR states × terminals → shift/reduce actions
  LR states × nonterminals → goto state transitions
  Computed using closure/goto
A grammar is
  LL(1) if its LL(1) parse table has no conflicts
  SLR if its SLR parse table has no conflicts
  LR(1) if its LR(1) parse table has no conflicts
  LALR(1) if its LALR(1) parse table has no conflicts
LL, SLR, LR, LALR Grammars
[Figure: nested grammar classes, LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1), with LL(1) inside LR(1)]
YACC
yacc specification (yacc.y) → Yacc or Bison compiler → y.tab.c
y.tab.c → C compiler → a.out
input stream → a.out → output stream