SLR

More SLR /LR(1)
Professor Yihjia Tsai
Tamkang University
Remember
Computing FIRST (N)
• If N -> ε, First (N) includes ε
• If N -> a A B C, First (N) includes a
• If N -> X1 X2 …, First (N) includes First (X1)
• If N -> X1 X2 … and X1 can derive ε, First (N) includes First (X2)
• Obvious generalization to First (α) where α is X1 X2 ...
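The FIRST rules above can be sketched as a fixed-point iteration. This is a minimal illustration, not a reference implementation: the grammar encoding (a dict of tuples, with `()` standing for an ε-production), the function names, and the toy grammar are all assumptions made for this example.

```python
EPS = "ε"

def first_of_string(symbols, first):
    """FIRST(X1 X2 ...): the generalization to a string of symbols."""
    result = set()
    for x in symbols:
        if x not in first:              # x is a terminal
            result.add(x)
            return result
        result |= first[x] - {EPS}
        if EPS not in first[x]:         # x cannot vanish, so stop here
            return result
    result.add(EPS)                     # every Xi can derive ε (or the string is empty)
    return result

def compute_first(grammar):
    """grammar: nonterminal -> list of right-hand sides (tuples of symbols)."""
    first = {n: set() for n in grammar}
    changed = True
    while changed:                      # repeat until no set grows
        changed = False
        for n, rhss in grammar.items():
            for rhs in rhss:
                new = first_of_string(rhs, first) - first[n]
                if new:
                    first[n] |= new
                    changed = True
    return first

# Hypothetical toy grammar: N -> X1 X2 | a N,  X1 -> b | ε,  X2 -> c
grammar = {
    "N": [("X1", "X2"), ("a", "N")],
    "X1": [("b",), ()],
    "X2": [("c",)],
}
first = compute_first(grammar)
print(first)
```

Because X1 can derive ε, FIRST(N) picks up c from FIRST(X2) in addition to b and a.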
Computing Follow (N)
• Follow (N) is computed from productions in which N appears on the rhs
• For the sentence symbol S, Follow (S) includes $
• if A -> α N β, Follow (N) includes First (β) (except ε)
– because an expansion of N will be followed by an expansion from β
• if A -> α N, Follow (N) includes Follow (A)
– because N will be expanded in the context in which A is expanded
• if A -> α N B and B can derive ε, Follow (N) includes Follow (A)
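The FOLLOW rules can be sketched the same way, as a fixed-point computation on top of FIRST. Everything named here (the grammar encoding, `compute_follow`, the toy grammar) is an assumption made for illustration only.

```python
EPS, END = "ε", "$"

def first_of(symbols, first):
    """FIRST of a string of symbols; a terminal's FIRST is itself."""
    out = set()
    for x in symbols:
        fx = first.get(x, {x})
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    return out | {EPS}

def compute_first(grammar):
    first = {n: set() for n in grammar}
    changed = True
    while changed:
        changed = False
        for n, rhss in grammar.items():
            for rhs in rhss:
                new = first_of(rhs, first) - first[n]
                if new:
                    first[n] |= new
                    changed = True
    return first

def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {n: set() for n in grammar}
    follow[start].add(END)                    # Follow(S) includes $
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                for i, n in enumerate(rhs):
                    if n not in grammar:      # only nonterminals have FOLLOW sets
                        continue
                    tail = first_of(rhs[i + 1:], first)
                    new = tail - {EPS}        # A -> α N β: add First(β)
                    if EPS in tail:           # β empty or nullable: add Follow(A)
                        new |= follow[a]
                    new -= follow[n]
                    if new:
                        follow[n] |= new
                        changed = True
    return follow

# Hypothetical toy grammar: S -> A c,  A -> a N B,  B -> b | ε,  N -> n
grammar = {
    "S": [("A", "c")],
    "A": [("a", "N", "B")],
    "B": [("b",), ()],
    "N": [("n",)],
}
follow = compute_follow(grammar, "S")
print(follow)
```

Here Follow(N) contains b from First(B), and also c, because B can derive ε and so Follow(A) flows into Follow(N).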
Recall our Example
• A grammar to generate all palindromes over Σ = { a, b } with a center marker c
1) S --> P
2) P --> a P a
3) P --> b P b
4) P --> c
• LR parsers work with an augmented grammar in which the start symbol never appears on the right side of a production. Here the original grammar was rules 2-4.
Computing the Items
• S0: S --> .P, P --> .aPa, P --> .bPb, P --> .c
• S1: S --> P.
• S2: P --> a.Pa, P --> .aPa, P --> .bPb, P --> .c
• S3: P --> b.Pb, P --> .aPa, P --> .bPb, P --> .c
• S4: P --> c.
• S5: P --> aP.a
• S6: P --> bP.b
• S7: P --> aPa.
• S8: P --> bPb.
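The item sets above can be reproduced mechanically with the closure and goto operations. The following is a small sketch under assumed encodings: an item is a pair (production index, dot position), and the names `closure`, `goto` and `canonical_collection` are chosen for this example. State numbers depend on the order in which states are discovered.

```python
GRAMMAR = [                     # production 0 is the augmenting rule S -> P
    ("S", ("P",)),
    ("P", ("a", "P", "a")),
    ("P", ("b", "P", "b")),
    ("P", ("c",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add X -> .γ for every nonterminal X that appears right after a dot."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def canonical_collection():
    states, edges = [closure({(0, 0)})], {}
    for state in states:        # the list grows while we iterate over it
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for x in sorted(symbols):
            target = goto(state, x)
            if target not in states:
                states.append(target)
            edges[(states.index(state), x)] = states.index(target)
    return states, edges

states, edges = canonical_collection()
print(len(states), "states")
for (src, sym), dst in sorted(edges.items()):
    print(f"I{src} --{sym}--> I{dst}")
```

Transitions are recorded for terminals and nonterminals alike; the terminal edges later become shift entries and the nonterminal edges become goto entries.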
Finite State Machine
• Draw the FSA. The major
difference is that transitions can be
both terminal and non-terminal
symbols.
• The Goto and Action Parts of the
parsing table come from the FSA
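FSA
I0: S -> .P, P -> .aPa, P -> .bPb, P -> .c
I1: S -> P.
I2: P -> a.Pa, P -> .aPa, P -> .bPb, P -> .c
I3: P -> b.Pb, P -> .aPa, P -> .bPb, P -> .c
I4: P -> c.
I5: P -> aP.a
I6: P -> bP.b
I7: P -> aPa.
I8: P -> bPb.
Transitions:
I0 --P--> I1, I0 --a--> I2, I0 --b--> I3, I0 --c--> I4
I2 --P--> I5, I2 --a--> I2, I2 --b--> I3, I2 --c--> I4
I3 --P--> I6, I3 --a--> I2, I3 --b--> I3, I3 --c--> I4
I5 --a--> I7
I6 --b--> I8
Productions as numbered for the reductions:
1) P -> aPa
2) P -> bPb
3) P -> c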
Parsing Table
state   a     b     c     $     P
0       S2    S3    S4    -     1
1       -     -     -     acc   -
2       S2    S3    S4    -     5
3       S2    S3    S4    -     6
4       R3    R3    -     R3    -
5       S7    -     -     -     -
6       -     S8    -     -     -
7       R1    R1    -     R1    -
8       R2    R2    -     R2    -
Parsing Table Contd
• Si means shift the input symbol and goto state i.
• Rj means reduce by the jth production. Note that we are not storing all the items of a state in our table.
• Example: abcba$
• Running the parsing algorithm on this input gives:
Example Contd
Stack          Input     Action
$0             abcba$    shift
$0a2           bcba$     shift
$0a2b3         cba$      shift
$0a2b3c4       ba$       reduce (P -> c)
$0a2b3P6       ba$       shift
$0a2b3P6b8     a$        reduce (P -> bPb)
$0a2P5         a$        shift
$0a2P5a7       $         reduce (P -> aPa)
$0P1           $         accept
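The trace above can be replayed with a short table-driven driver. This is a sketch, assuming the reduce entries sit in the a, b and $ columns, and keeping only state numbers on the stack; the dictionary layout and the `parse` function are choices made for this illustration.

```python
# rule number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {1: ("P", 3), 2: ("P", 3), 3: ("P", 1)}

ACTION = {(1, "$"): ("acc", None), (5, "a"): ("s", 7), (6, "b"): ("s", 8)}
for state in (0, 2, 3):                       # states that shift a, b, c
    ACTION[(state, "a")] = ("s", 2)
    ACTION[(state, "b")] = ("s", 3)
    ACTION[(state, "c")] = ("s", 4)
for state, rule in ((4, 3), (7, 1), (8, 2)):  # states that only reduce
    for lookahead in ("a", "b", "$"):
        ACTION[(state, lookahead)] = ("r", rule)

GOTO = {(0, "P"): 1, (2, "P"): 5, (3, "P"): 6}

def parse(text):
    """Return (accepted, list of actions taken)."""
    tokens = list(text) + ["$"]
    stack, pos, trace = [0], 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:                       # empty table entry: syntax error
            return False, trace
        kind, arg = act
        if kind == "s":                       # shift: push state, consume token
            stack.append(arg)
            pos += 1
            trace.append(f"shift {arg}")
        elif kind == "r":                     # reduce: pop |rhs| states, then goto
            lhs, length = PRODUCTIONS[arg]
            del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
            trace.append(f"reduce {arg}")
        else:
            trace.append("accept")
            return True, trace

ok, trace = parse("abcba")
print(ok, trace)
```

A reduce does not consume input, which is why the input column stays the same across reduce steps in the trace.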
LR(0) Summary
• LR(0) state: set of LR(0) items
• LR(0) item: a production with a dot in
RHS
• Compute LR(0) states and build DFA
– Use closure operation to compute states
– Use goto operation to compute transitions
• Build LR(0) parsing table from the DFA
• Use LR(0) parsing table to determine
whether to shift or reduce
LR(0) Limitations
• An LR(0) machine only works if
states with reduce actions have a
single reduce action
• With a more complex grammar,
construction gives states with
shift/reduce or reduce/reduce
conflicts
• Need to use lookahead to choose
A Non-LR(0) Grammar
• Grammar for addition of numbers
– S -> S + E | E
– E -> num
• Left-associative version is LR(0)
• Right-associative is not LR(0)
– S -> E + S | E
– E -> num
Shift/Reduce Conflicts
• An LR(0) state contains a conflict if its canonical set has two items that recommend conflicting actions.
• shift/reduce conflict - when one item prompts a shift action and the other prompts a reduce action.
• reduce/reduce conflict - when two items prompt reduce actions by different productions.
• A grammar is said to be an LR(0) grammar if the table does not have any conflicts.
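These definitions can be checked mechanically on the two addition grammars (left-associative S -> S + E | E and right-associative S -> E + S | E, each with E -> num). The sketch below assumes the augmenting rule S' -> S $ with $ treated as an ordinary terminal; the item encoding and the function names are made up for this example.

```python
def lr0_states(grammar):
    """Canonical LR(0) item sets; an item is (production index, dot position)."""
    nonterms = {lhs for lhs, _ in grammar}

    def closure(items):
        items, work = set(items), list(items)
        while work:
            p, d = work.pop()
            rhs = grammar[p][1]
            if d < len(rhs) and rhs[d] in nonterms:
                for i, (lhs, _) in enumerate(grammar):
                    if lhs == rhs[d] and (i, 0) not in items:
                        items.add((i, 0))
                        work.append((i, 0))
        return frozenset(items)

    states = [closure({(0, 0)})]
    for state in states:                      # list grows while iterating
        nexts = {grammar[p][1][d] for p, d in state if d < len(grammar[p][1])}
        for x in sorted(nexts):
            target = closure({(p, d + 1) for p, d in state
                              if d < len(grammar[p][1]) and grammar[p][1][d] == x})
            if target not in states:
                states.append(target)
    return states

def lr0_conflicts(grammar):
    """Report one conflict kind per offending state."""
    nonterms = {lhs for lhs, _ in grammar}
    found = []
    for state in lr0_states(grammar):
        reduces = [p for p, d in state if d == len(grammar[p][1])]
        shifts = [p for p, d in state
                  if d < len(grammar[p][1]) and grammar[p][1][d] not in nonterms]
        if len(reduces) > 1:
            found.append("reduce/reduce")
        elif reduces and shifts:
            found.append("shift/reduce")
    return found

LEFT = [("S'", ("S", "$")), ("S", ("S", "+", "E")), ("S", ("E",)), ("E", ("num",))]
RIGHT = [("S'", ("S", "$")), ("S", ("E", "+", "S")), ("S", ("E",)), ("E", ("num",))]

print("left-associative: ", lr0_conflicts(LEFT))
print("right-associative:", lr0_conflicts(RIGHT))
```

In the right-associative grammar the state reached on E holds both S -> E . + S and S -> E ., which is the shift/reduce conflict; the left-associative grammar never mixes a complete item with a shift item.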
Shift/Reduce Conflict
S’ -> S
S -> A b | d c | b A c
A -> d
A very simple language = {db, dc, bdc}
Follow(S) = {$}, Follow(A) = {b, c}
Part of the SLR(1) parser:
I0: S’ -> .S, S -> .A b, S -> .d c, S -> .b A c, A -> .d
I1 = goto(I0, d): S -> d .c, A -> d.
But since c is in Follow(A), we don’t know whether to shift or reduce in I1.
D1: S’ -> S -> dc
D2: S’ -> S -> bAc -> bdc
Reduce/Reduce Conflict
S’ -> S
S -> b A e | b B d | A c
A -> d
B -> E c
E -> d
S’ -> S -> Ac -> dc
S’ -> S -> bBd -> bEcd -> bdcd
S’ -> S -> bAe -> bde
I0: S’ -> .S, S -> .b A e, S -> .b B d, S -> .A c, A -> .d
I1 = goto(I0, b): S -> b .A e, S -> b .B d, A -> .d, B -> .E c, E -> .d
I2 = goto(I1, d): A -> d., E -> d.
Which reduction should be taken? There is not enough context to decide!
SLR(1) Grammar
• An LR parser using SLR(1) parsing tables for a grammar G is called the SLR(1) parser for G.
• If a grammar G has an SLR(1) parsing table, it is called an SLR(1) grammar (or SLR grammar for short).
• Every SLR grammar is unambiguous, but not every unambiguous grammar is an SLR grammar.
SLR Summary
• Uses DFA to recognize viable prefixes of grammar
G
• Each state in the DFA:
– is the set of LR(0) items valid for a viable prefix
– “encodes” information about the symbols that have been
shifted onto the stack
• Valid LR(0) items are computed by applying the
closure and goto functions to the initial, valid
item
[S’ -> .S] (this is called the canonical collection
of LR(0) items)
• Uses FOLLOW to disambiguate actions
SLR(1) Summary
1. If A -> α . a β is in Ik (a a terminal) and goto(Ik, a) = Ij, then set actions[k,a] to sj
2. If A -> α . is in Ik (A is not S’) then set actions[k,b] to reduce by rule# of A -> α, for all b ∈ FOLLOW(A)
3. If S’ -> S. is in Ik then set actions[k,$] to accept
Rules 1-3 may define conflicting actions for an entry in the actions table. In this case, the grammar is not SLR(1).
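Rules 1-3 can be sketched as a table-filling loop over the LR(0) states. This example uses the right-associative addition grammar S -> E + S | E, E -> num; the FOLLOW sets are written out by hand and are an assumption of the sketch, as are all function names and the tuple encoding of actions.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("E", "+", "S")), ("S", ("E",)), ("E", ("num",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}
FOLLOW = {"S": {"$"}, "E": {"+", "$"}}        # assumed: worked out by hand

def closure(items):
    items, work = set(items), list(items)
    while work:
        p, d = work.pop()
        rhs = GRAMMAR[p][1]
        if d < len(rhs) and rhs[d] in NONTERMS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[d] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)

def build_dfa():
    states, edges = [closure({(0, 0)})], {}
    for k, state in enumerate(states):        # list grows while iterating
        nexts = sorted({GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])})
        for x in nexts:
            target = closure({(p, d + 1) for p, d in state
                              if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == x})
            if target not in states:
                states.append(target)
            edges[(k, x)] = states.index(target)
    return states, edges

def slr_actions():
    states, edges = build_dfa()
    actions, conflicts = {}, []

    def put(k, symbol, act):
        old = actions.setdefault((k, symbol), act)
        if old != act:                        # two different actions for one entry
            conflicts.append((k, symbol, old, act))

    for k, state in enumerate(states):
        for p, d in state:
            lhs, rhs = GRAMMAR[p]
            if d < len(rhs):
                if rhs[d] not in NONTERMS:    # rule 1: shift on a terminal
                    put(k, rhs[d], ("shift", edges[(k, rhs[d])]))
            elif p == 0:                      # rule 3: S' -> S .
                put(k, "$", ("accept",))
            else:                             # rule 2: reduce on FOLLOW(lhs)
                for b in FOLLOW[lhs]:
                    put(k, b, ("reduce", p))
    return actions, conflicts

actions, conflicts = slr_actions()
print(conflicts)
for key in sorted(actions):
    print(key, actions[key])
```

The state holding S -> E . + S and S -> E . now gets a shift under + and a reduce only under $, so the LR(0) conflict disappears. State numbers here follow discovery order and are not meant to match any numbering on the slides.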
LR(0) Limitations (examples)
• OK: a state containing only L -> L , S .
• shift/reduce: a state containing L -> L , S . and S -> S . , L
• reduce/reduce: a state containing L -> S , L . and L -> S .
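LR(0) Parsing Table
Grammar
S -> E + S | E
E -> num
States:
1: S’ -> . S $, S -> . E + S, S -> . E, E -> . num
2: S -> E . + S, S -> E .
3: S -> E + . S, S -> . E + S, S -> . E, E -> . num
4: E -> num .
5: S -> E + S .
6: S’ -> S . $
7: S’ -> S $ .
Transitions: 1 --E--> 2, 1 --S--> 6, 1 --num--> 4, 2 --+--> 3, 3 --E--> 2, 3 --S--> 5, 3 --num--> 4, 6 --$--> 7
Shift or reduce in state 2?
state   num      +           $        E     S
1       s4       -           -        g2    g6
2       S->E     s3/S->E     S->E     -     -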
Solve Conflict With
Lookahead
• 3 popular techniques for employing
lookahead of 1 symbol with bottom-up
parsing
– SLR – Simple LR
– LALR – LookAhead LR
– LR(1)
• Each has a different means of utilizing the lookahead
– Results in different processing capabilities
SLR Parsing
• SLR Parsing = Easy extension of LR(0)
– For each reduction X -> β, look at next symbol C
– Apply reduction only if C is in FOLLOW(X)
• SLR parsing table eliminates some conflicts
– Same as LR(0) table except reduction rows
– Adds reductions X -> β only in the columns of symbols in FOLLOW(X)
Example: FOLLOW(S) = {$}
state   num   +     $       E     S
1       s4    -     -       g2    g6
2       -     s3    S->E    -     -
SLR Parsing Table
• Reductions do not fill entire rows as before
• Otherwise, same as LR(0)
Grammar
S -> E + S | E
E -> num
state   num   +         $          E     S
1       s4    -         -          g2    g6
2       -     s3        S->E       -     -
3       s4    -         -          g2    g5
4       -     E->num    E->num     -     -
5       -     -         S->E+S     -     -
6       -     -         s7         -     -
7       -     -         accept     -     -
Class Problem
Consider:
S -> L = R
S -> R
L -> * R
L -> ident
R -> L
Think of L as l-value, R as r-value, and * as a pointer dereference
When you create the states in the SLR(1) DFA, 2 of the states are the following:
State A: S -> L . = R, R -> L .
State B: S -> R .
Do you have any shift/reduce conflicts?
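Another SLR(1) Example
1. S’ -> S
2. S -> dca
3. S -> dAb
4. A -> c
States:
S0: S’ -> ● S, S -> ● dca, S -> ● dAb
S1: S’ -> S ●
S2: S -> d ● ca, S -> d ● Ab, A -> ● c
S3: S -> dc ● a, A -> c ●
S4: S -> dA ● b
S5: S -> dca ●
S6: S -> dAb ●
Transitions: S0 --S--> S1, S0 --d--> S2, S2 --c--> S3, S2 --A--> S4, S3 --a--> S5, S4 --b--> S6
         Action                          Goto
state    a     b     c     d     $       S     A
S0       -     -     -     S2    -       1     -
S1       -     -     -     -     A       -     -
S2       -     -     S3    -     -       -     4
S3       S5    R4    -     -     -       -     -
S4       -     S6    -     -     -       -     -
S5       -     -     -     -     R2      -     -
S6       -     -     -     -     R3      -     -
(A in the $ column means accept.)
In S3 there is a shift/reduce conflict: it can be R4 or shift. By looking at the Follow set of A, the conflict is removed.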
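NonSLR(1) example
1. S’ -> S
2. S -> dca
3. S -> dAb
4. S -> Aa
5. A -> c
States:
S0: S’ -> ● S, S -> ● dca, S -> ● dAb, S -> ● Aa, A -> ● c
S1: S’ -> S ●
S2: S -> d ● ca, S -> d ● Ab, A -> ● c
S3: S -> dc ● a, A -> c ●
S4: S -> dA ● b
S5: S -> dca ●
S6: S -> dAb ●
S7: S -> A ● a
S8: S -> Aa ●
S9: A -> c ●
Transitions: S0 --S--> S1, S0 --d--> S2, S0 --A--> S7, S0 --c--> S9, S2 --c--> S3, S2 --A--> S4, S3 --a--> S5, S4 --b--> S6, S7 --a--> S8
• S3 has a shift/reduce conflict.
• By looking at Follow(A), both a and b are in the follow set.
• So under column a we still don’t know whether to reduce or shift.
The conflict SLR parsing table
Follow(A) = {a, b}
         Action                             Goto
state    a        b     c     d     $       S     A
S0       -        -     S9    S2    -       1     7
S1       -        -     -     -     A       -     -
S2       -        -     S3    -     -       -     4
S3       S5/R5    R5    -     -     -       -     -
S4       -        S6    -     -     -       -     -
S5       -        -     -     -     R2      -     -
S6       -        -     -     -     R3      -     -
S7       S8       -     -     -     -       -     -
S8       -        -     -     -     R4      -     -
S9       R5       R5    -     -     -       -     -
(A in the $ column means accept.)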
LR(1)
• Solution: keep more information about context. Namely, keep track of what the next input symbol can be as part of the DFA state
• Idea: keep an input look-ahead as part of each item - these are called LR(1) items
• The look-aheads kept for an item of non-terminal A are always a subset of Follow(A) (may not be a proper subset)
• Can give rise to larger parsers (i.e. many more states) than SLR but recognizes a larger class of grammars
LR(k) Items
• The table construction algorithm for an LR(k) parser uses LR(k) items to represent the set of possible states in a parse
• An LR(k) item is a pair [α, β], where
– α is a production from G with a “.” at some position in the rhs
– β is a look-ahead string containing k (where k is typically 1) symbols that are terminals or $
• Example LR(1) item: [A -> X . Y Z , a], where α is A -> X . Y Z and β is a
LR(k) Items
• What’s the point of the look-ahead symbols?
• Carry them along to allow us to choose the correct reduction when there is any choice
• Look-ahead symbols are bookkeeping only, unless the item is a reducing item (i.e. has a “.” at the right end)
[A -> X . Y Z , a ]: look-ahead is of no use yet
[A -> X Y Z . , a ]: look-ahead is used to guide the reduction
The point: for [A -> α ., a] and [B -> α ., b], we can decide between reducing to A or to B by looking at limited right context
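LR(1) DFA Construction
If S’ = goto(S,x) then add an edge labeled x from S to S’
Grammar
S’ -> S $
S -> E + S | E
E -> num
States (item , look-ahead):
1: S’ -> . S , $ ; S -> . E + S , $ ; S -> . E , $ ; E -> . num , +,$
2: S -> E . + S , $ ; S -> E . , $
3: S -> E + . S , $ ; S -> . E + S , $ ; S -> . E , $ ; E -> . num , +,$
on num (from 1 or 3): E -> num . , +,$
on S (from 1): S’ -> S . , $
on S (from 3): S -> E + S . , $
Edges: 1 --E--> 2, 2 --+--> 3, 3 --E--> 2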
LR(1) Reductions
Reductions correspond to LR(1) items of the form (X -> γ . , y)
For the grammar S’ -> S $, S -> E + S | E, E -> num, the reducing items in the DFA are:
E -> num . , +,$
S -> E . , $
S -> E + S . , $
LR(1) Parsing Table Construction
• Same as construction of LR(0), except for reductions
• For a transition S -> S’ on terminal x:
– Table[S,x] += Shift(S’)
• For a transition S -> S’ on non-terminal N:
– Table[S,N] += Goto(S’)
• If I contains {(X -> γ . , y)} then:
– Table[I,y] += Reduce(X -> γ)
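The reduction rule can be illustrated with an LR(1) closure on the grammar S -> E + S | E, E -> num. This is a sketch under stated assumptions: items are triples (production, dot, look-ahead), the augmenting rule is S' -> S with $ as its look-ahead, and the FIRST computation is simplified because this grammar has no ε-productions. All names are invented for the example.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("E", "+", "S")), ("S", ("E",)), ("E", ("num",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def first_sets():
    """FIRST per nonterminal; only correct when no production is empty."""
    first = {n: set() for n in NONTERMS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = (first[head] if head in NONTERMS else {head}) - first[lhs]
            if new:
                first[lhs] |= new
                changed = True
    return first

FIRST = first_sets()

def closure(items):
    items, work = set(items), list(items)
    while work:
        p, d, la = work.pop()
        rhs = GRAMMAR[p][1]
        if d < len(rhs) and rhs[d] in NONTERMS:
            # Item is [A -> α . B β, la]; new items get look-aheads FIRST(β la).
            if d + 1 < len(rhs):
                nxt = rhs[d + 1]
                lookaheads = FIRST[nxt] if nxt in NONTERMS else {nxt}
            else:
                lookaheads = {la}
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[d]:
                    for b in lookaheads:
                        if (i, 0, b) not in items:
                            items.add((i, 0, b))
                            work.append((i, 0, b))
    return frozenset(items)

def goto(items, x):
    return closure({(p, d + 1, la) for p, d, la in items
                    if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == x})

def reductions(state):
    """Pairs (y, production) for every item (X -> γ . , y) in the state."""
    return {(la, p) for p, d, la in state if d == len(GRAMMAR[p][1])}

state1 = closure({(0, 0, "$")})
state2 = goto(state1, "E")
print(sorted(state2))
print(reductions(state2))
```

The state reached on E reduces S -> E only under $, leaving the + column free for the shift, so no FOLLOW set is consulted at all.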
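LR(1) Parsing Table Example
Grammar
S’ -> S $
S -> E + S | E
E -> num
1: S’ -> . S , $ ; S -> . E + S , $ ; S -> . E , $ ; E -> . num , +,$
2: S -> E . + S , $ ; S -> E . , $
3: S -> E + . S , $ ; S -> . E + S , $ ; S -> . E , $ ; E -> . num , +,$
Edges: 1 --E--> 2, 2 --+--> 3
Fragment of the parsing table
state   +     $       E
1       -     -       g2
2       s3    S->E    -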
LALR(1) Grammars
• Problem with LR(1): too many states
• LALR(1) parsing (aka LookAhead LR)
– Constructs the LR(1) DFA and then merges any 2 LR(1) states whose items are identical except for lookahead
– Results in smaller parser tables
– Theoretically less powerful than LR(1)
{ S -> id . , + ; S -> E . , $ } + { S -> id . , $ ; S -> E . , + } = ??
• LALR(1) grammar = a grammar whose LALR(1) parsing table has no conflicts
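The merge step, and the way it can introduce a conflict, can be sketched on the two states shown above. The representation (items as plain strings paired with one look-ahead) and both function names are assumptions made for this illustration.

```python
from collections import defaultdict

def merge_by_core(lr1_states):
    """Merge LR(1) states whose items agree once look-aheads are ignored."""
    merged = defaultdict(set)
    for state in lr1_states:
        core = frozenset(item for item, _ in state)
        merged[core] |= set(state)
    return list(merged.values())

def reduce_reduce_conflicts(state):
    """Look-aheads on which more than one complete item wants to reduce."""
    by_lookahead = defaultdict(set)
    for item, lookahead in state:
        if item.endswith("."):                # dot at the right end
            by_lookahead[lookahead].add(item)
    return {la: items for la, items in by_lookahead.items() if len(items) > 1}

state_a = {("S -> id .", "+"), ("S -> E .", "$")}
state_b = {("S -> id .", "$"), ("S -> E .", "+")}

merged = merge_by_core([state_a, state_b])
print(reduce_reduce_conflicts(state_a))       # no conflict before merging
print(reduce_reduce_conflicts(merged[0]))     # conflicts after merging
```

Each LR(1) state is conflict-free on its own, but the merged state asks for both reductions under + and under $, which is why LALR(1) is less powerful than LR(1).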
LALR Parsers
• LALR(1)
– Generally same number of states as SLR (far fewer than LR(1))
– But with nearly the lookahead capability of LR(1) (much better than SLR)
– Example: Pascal programming language
• In SLR, several hundred states
• In LR(1), several thousand states
LL/LR Grammar Summary
• LL parsing tables
– Non-terminals x terminals -> productions
– Computed using FIRST/FOLLOW
• LR parsing tables
– LR states x terminals -> {shift/reduce}
– LR states x non-terminals -> goto
– Computed using closure/goto operations on LR states
• A grammar is:
– LL(1) if its LL(1) parsing table has no conflicts
– same for LR(0), SLR, LALR(1), LR(1)
Classification of Grammars
[Figure: nested grammar classes LL(1), LR(1), LALR(1), SLR, LR(0); not to scale]
LR(k) ⊆ LR(k+1)
LL(k) ⊆ LL(k+1)
LL(k) ⊆ LR(k)
LR(0) ⊆ SLR
LALR(1) ⊆ LR(1)
Automate the Parsing Process
• Can automate:
– The construction of LR parsing tables
– The construction of shift-reduce parsers based on these parsing tables
• LALR(1) parser generators
– yacc, bison
– Not much difference compared to LR(1) in practice
– Smaller parsing tables than LR(1)
– Augment LALR(1) grammar specification with declarations of precedence, associativity
– Output: LALR(1) parser program
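Associativity
Ambiguous grammar:
E -> E + E
E -> num
Unambiguous (left-associative) grammar:
S -> S + E | E
E -> num
What happens if we run the ambiguous grammar through LALR construction?
A state contains both:
E -> E + E . , +
E -> E . + E , +,$
On input 1+2+3, with + as the next token, this is a shift/reduce conflict
shift: 1+(2+3)
reduce: (1+2)+3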
Associativity (2)
• If an operator is left associative
– Assign a slightly higher value to its precedence if it
is on the parse stack than if it is in the input stream
– Since stack precedence is higher, reduce will take
priority (which is correct for left associative)
• If operator is right associative
– Assign a slightly higher value if it is in the input
stream
– Since input stream is higher, shift will take priority (which is correct for right associative)
Precedence
Unambiguous grammar:
E -> E + E | T
T -> T x T | num | (E)
Ambiguous grammar:
E -> E + E | E x E | num | (E)
What happens if we run the ambiguous grammar through LALR construction?
One state contains: E -> E . + E , ... and E -> E x E . , +
Another state contains: E -> E + E . , x and E -> E . x E , ...
Shift/reduce conflict results
Precedence: attach precedence indicators to terminals
Shift/reduce conflict resolved by:
1. If precedence of the input token is greater than the last terminal on parse stack, favor shift over reduce
2. If the precedence of the input token is less than or equal to the last terminal on the parse stack, favor reduce over shift
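The two resolution rules, combined with the associativity rule from the earlier slide, can be sketched as a small decision function. The precedence numbers, the `resolve` name, and the right-associative operator `^` are hypothetical, added only to exercise the right-associative branch.

```python
# Hypothetical declarations: a larger number binds tighter.
PRECEDENCE = {"+": 1, "x": 2, "^": 3}
ASSOC = {"+": "left", "x": "left", "^": "right"}

def resolve(stack_op, input_op):
    """Choose between shift and reduce for a shift/reduce conflict.

    stack_op: last terminal on the parse stack
    input_op: next input token
    """
    if PRECEDENCE[input_op] > PRECEDENCE[stack_op]:
        return "shift"                        # rule 1: input binds tighter
    if PRECEDENCE[input_op] < PRECEDENCE[stack_op]:
        return "reduce"                       # rule 2: stack binds tighter
    # Equal precedence: associativity breaks the tie.
    return "reduce" if ASSOC[input_op] == "left" else "shift"

print(resolve("+", "x"))   # 1 + 2 . x 3
print(resolve("x", "+"))   # 1 x 2 . + 3
print(resolve("+", "+"))   # 1 + 2 . + 3
print(resolve("^", "^"))   # 1 ^ 2 . ^ 3
```

For left-associative operators the tie case collapses into rule 2 (less than or equal favors reduce), matching the rules as stated on the slide.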
References
• Modern Compiler Implementation in Java,
Andrew Appel, Cambridge University Press