Implementing a Parser

Implementing LR Parsers
LR parsing tables and
building LR(0), SLR(1) and LR(1)
Parsers
Adapted from Notes by
Prof. Saman Amarasinghe (MIT)
11
Actions of a Shift-Reduce Parser
• Shift to sn
– Push input token into the symbol stack
– Push sn into state stack
– Advance to next input symbol
• Reduce
– Pop both stacks as many times as the number of symbols on
the RHS of rule n
– Push LHS of rule n into symbol stack
– Lookup [top of the state stack][top of symbol stack]
– Push that state (in goto k) into state stack
• Accept
– End of stream reached and stack only has the start symbol
• Reject
– End of stream reached but stack has more than the start symbol
53
Building a LR(0) parser engine
• Add the special production S’  S $
• Find the items of the CFG
• Create the DFA
– using closure and goto functions
• Build the parse table
LR(0)
Parser
Engine
Creating the LR(0) DFA
s1
X
<S>  <X> • $
s0
<S>  • <X> $
<X>  • ( <X> )
<X>  • ( )
s2
(
<X>
<X>
<X>
<X>
s5
 ( • <X> )
 (•)
 • ( <X> )
•()
)
<X>  ( ) •
(
s3
<X>  ( <X> • )
X
s4
)
<X>  ( <X> ) •
50
Creating the LR(0) parse tables
• For each state
• Transition to another state using a terminal symbol
is a shift to that state (shift to sn)
• Transition to another state using a non-terminal is
a goto that state (goto sn)
• If there is an item A   • in the state
do a reduction with that production for all
terminals (reduce k)
Building LR(0) Parse Table Example
State
s0
s1
s2
s3
s4
s5
(
shift to s2
error
shift to s2
error
reduce (2)
reduce (3)
X
Goto
$
X
goto s1
error
accept
error
error
reduce (2)
reduce (3)
goto s3
<S>  <X> • $
s0
<S>  • <X> $
<X>  • ( <X> )
<X>  • ( )
ACTION
)
error
error
shift to s5
shift to s4
reduce (2)
reduce (3)
s1
s2
(
<X>
<X>
<X>
<X>
s5
 ( • <X> )
 (•)
 • ( <X> )
•()
)
<X>  ( ) •
(
s3
<X>  ( <X> • )
X
s4
)
<X>  ( <X> ) •
Changing the Language
• String of one more more left parentheses followed
by the same number of right parentheses
<S>  <X> $
<X>  ( <X> )
<X>  ( )
• String of zero or more more left parentheses
followed by the same number of right parentheses
<S>  <X> $
<X>  ( <X> )
<X>  
Building the LR(0) Parse Table
State
s0
s1
s2
s3
s4
ACTION
(
)
shift to s2/reduce (3) reduce (3)
error
error
shift to s2/reduce (3) reduce (3)
error
shift to s4
reduce(2) reduce(2)
X
goto s1
goto s3
<S>  <X> • $
s0
Shift/reduce
conflict
$
reduce(3)
accept
reduce(3)
error
reduce(2)
s1
X
<S>  • <X> $
<X>  • ( <X> )
<X>  •
Goto
s2
(
<X>  ( • <X> )
<X>  • ( <X> )
<X>  •
Shift/reduce
conflict
s3
(
X
<X>  ( <X> • )
s4
)
<X>  ( <X> ) •
16
Idea behind SLR(1) grammars
• Many shift/reduce conflicts in LR(0)
– an item X   • in the current state identifies
a reduction
– But does not select when to reduce
– Thus, have to perform the reduction on all input
symbols
• Do the reduction only when the input symbol
truly follows the reduction
– Need to calculate the terminals that can follow a
non-terminal symbol
Building the SLR(1) Parse Table
State
s0
s1
s2
s3
s4
(
shift to s2
error
shift to s2
error
error
Goto
$
reduce(2)
accept
reduce(3)
error
reduce (2)
X
goto s1
goto s3
s1
X
<S>  <X> • $
s0
<S>  • <X> $
<X>  • ( <X> )
<X>  •
ACTION
)
reduce (2)
error
reduce (3)
shift to s4
reduce (2)
s2
(
follow(<X>) = { ), $ }
follow(<S>) = { $ }
<X>  ( • <X> )
<X>  • ( <X> )
<X>  •
s3
(
X
<X>  ( <X> • )
s4
)
<X>  ( <X> ) •
22
Expanded Example Grammar
<S>  <X> $
(1)
<X>  <Y>
(2)
<X>  (
(3)
<Y>  ( <Y> )
(4)
<Y>  
(5)
• Change language to:
– Zero or more open parentheses followed by
matching # of close parentheses
– or a single open parenthesis
26
Expanded Example DFA
<S>
<X>
<X>
<Y>
<Y>
s0
<S>
<X>
<X>
<Y>
<Y>
 <X> $
 <Y>
 (
 ( <Y> )
 
 • <X> $
 • <Y>
•(
 • ( <Y> )
•
X
s5
<S>  <X> • $
s1
(
Y
<X>
<Y>
<Y>
<Y>
s6
(•
 ( • <Y> )
 • ( <Y> )
•
<X>  <Y> •
(
s2
(
<Y>  ( • <Y> )
<Y>  • ( <Y> )
<Y>  •
Y
s3
Y
<Y>  ( <Y> • )
s4
)
<Y>  ( <Y> ) •
31
State
s0
s1
s2
s3
s4
s5
s6
s0
<S>
<X>
<X>
<Y>
<Y>
(
shift to s1
shift to s2
shift to s2
error
error
error
error
<S>  <X> • $
Goto
$
reduce (5)
X
goto s5
s1
(
Y
<X>
<Y>
<Y>
<Y>
s6
Y
goto s6
goto s3
goto s3
reduce(3 or 5)
reduce (5)
error
reduce (4)
accept
reduce (2)
(•
 ( • <Y> )
 • ( <Y> )
•
<X>  <Y> •
(
s2
(
 • <X> $
 • <Y>
•(
 • ( <Y> )
•
X
s5
ACTION
)
reduce (5)
reduce (5)
reduce (5)
shift to s4
reduce (4)
error
error
<Y>  ( • <Y> )
<Y>  • ( <Y> )
<Y>  •
Y
s3
Y
<Y>  ( <Y> • )
s4
)
<Y>  ( <Y> ) •
31
LR(1) Items
• Items will keep info on
– production
– right-hand-side position (the dot)
– look ahead symbol
• LR(1) item is of the form [A   • 
– A    is a production
– The dot in A   •  denotes the position
– a is a terminal or the end marker ($)
• For the item [A   •
a]
– a is the next symbol after A in the string
i.e. there exist a derivation S   A a 
a]
46
Expanded Example LR(1) DFA
<S>
<X>
<X>
<Y>
<Y>
s0
[<S>
[<X>
[<X>
[<Y>
[<Y>
 <X> $
 <Y>
 (
 ( <Y> )
 
[<Y>  (<Y> • ) )]
s8
[<Y>  (<Y>) • )]
 • <X> $
 • <Y>
•(
 • (<Y>)
•
[<S>  <X> •$ ?]
)
s2
(
?]
$]
$]
$]
$]
X
s5
s7
s1
(
[<X>
[<Y>
[<Y>
[<Y>
(•
$]
 ( • <Y> ) $]
 • ( <Y>) )]
•
)]
Y
s6
[<X>  <Y> •
$]
Y
(
[<Y>  ( • <Y>) )]
[<Y>  • ( <Y> ) )]
[<Y>  •
)]
Y
s3
[<Y>  (<Y> • ) $]
s4
)
[<Y>  (<Y>) • $]
51
State
s0
s1
s2
s3
s4
s5
s6
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
error
error
error
(
shift to s1
shift to s2
shift to s2
error
error
error
error
Goto
$
reduce (5)
reduce (3)
error
error
reduce (4)
accept
reduce (2)
s7 … s8 =>
[<S>
[<X>
[<X>
[<Y>
[<Y>
[<Y>  (<Y> • ) )]
[<Y>  (<Y>) • )]
 • <X> $
 • <Y>
•(
 • (<Y>)
•
s5
[<S>  <X> •$ ?]
)
s2
(
?]
$]
$]
$]
$]
X
Y
goto s6
goto s3
s7
goto s3
s7
s8
s0
X
goto s5
s1
(
[<X>
[<Y>
[<Y>
[<Y>
(•
$]
 ( • <Y> ) $]
 • ( <Y>) )]
•
)]
Y
s6
[<X>  <Y> •
$]
Y
(
[<Y>  ( • <Y>) )]
[<Y>  • ( <Y> ) )]
[<Y>  •
)]
Y
s3
[<Y>  (<Y> • ) $]
s4
)
[<Y>  (<Y>) • $]
51
State
s0
s1
s2
s3
s7
s4
s8
s5
s6
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
s8
error
reduce (5)
error
error
(
shift to s1
shift to s2
shift to s2
error
error
error
error
Goto
$
reduce (5)
reduce (3)
error
error
reduce
error (4)
accept
reduce (2)
s3 … s4 =>
[<S>
[<X>
[<X>
[<Y>
[<Y>
[<Y>  (<Y> • ) )]
[<Y>  (<Y>) • )]
 • <X> $
 • <Y>
•(
 • (<Y>)
•
s5
[<S>  <X> •$ ?]
)
s2
(
?]
$]
$]
$]
$]
X
Y
goto s6
goto s3
s7
goto s3
s7
s8
s0
X
goto s5
s1
(
[<X>
[<Y>
[<Y>
[<Y>
(•
$]
 ( • <Y> ) $]
 • ( <Y>) )]
•
)]
Y
s6
[<X>  <Y> •
$]
Y
(
[<Y>  ( • <Y>) )]
[<Y>  • ( <Y> ) )]
[<Y>  •
)]
Y
s3
[<Y>  (<Y> • ) $]
s4
)
[<Y>  (<Y>) • $]
LR(1)
52
State
s0
s1
s2
s3
s4
s5
s6
(
shift to s1
shift to s2
shift to s2
error
error
error
error
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
error
error
error
Goto
$
reduce (5)
reduce (3)
error
error
reduce (4)
accept
reduce (2)
X
goto s5
Y
goto s6
goto s3
s7
goto s3
SLR(1)
s7 … s8 => similar to s3…s4 except for “reduce” column.
State
s0
s1
s2
s3
s4
s5
s6
(
shift to s1
shift to s2
shift to s2
error
error
error
error
ACTION
)
reduce (5)
reduce (5)
reduce (5)
shift to s4
reduce (4)
error
error
Goto
$
reduce (5)
reduce(3 or 5)
reduce (5)
error
reduce (4)
accept
reduce (2)
X
goto s5
Y
goto s6
goto s3
goto s3
46
Example -- LALR(1) DFA
<S>
<X>
<X>
<Y>
<Y>
s0
[<S>
[<X>
[<X>
[<Y>
[<Y>
 <X> $
 <Y>
 (
 ( <Y> )
 
 • <X> $
 • <Y>
•(
 • (<Y>)
•
?]
$]
$]
$]
$]
X
s5
[<S>  <X> •$ ?]
s1
(
[<X>
[<Y>
[<Y>
[<Y>
(•
$]
 ( • <Y> ) $]
 • ( <Y>) )]
•
)]
Y
s6
[<X>  <Y> •
$]
(
s2
(
Y
[<Y>  ( • <Y>) )]
[<Y>  • ( <Y> ) )]
[<Y>  •
)]
Y
s3
[<Y>  (<Y> • )
s4
{$,)}]
)
[<Y>  (<Y>) • {$,)}]
51
State
s0
s1
s2
s3
s4
s5
s6
s0
[<S>
[<X>
[<X>
[<Y>
[<Y>
(
shift to s1
shift to s2
shift to s2
error
error
error
error
 • <X> $
 • <Y>
•(
 • (<Y>)
•
[<S>  <X> •$ ?]
Goto
$
reduce (5)
reduce (3)
error
error
reduce (4)
accept
reduce (2)
X
goto s5
s1
(
[<X>
[<Y>
[<Y>
[<Y>
(•
$]
 ( • <Y> ) $]
 • ( <Y>) )]
•
)]
Y
s6
[<X>  <Y> •
$]
Y
goto s6
goto s3
goto s3
(
s2
(
?]
$]
$]
$]
$]
X
s5
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
error
reduce(4)
error
error
Y
[<Y>  ( • <Y>) )]
[<Y>  • ( <Y> ) )]
[<Y>  •
)]
Y
s3
[<Y>  (<Y> • )
s4
{$,)}]
)
[<Y>  (<Y>) • {$,)}]
LR(1)
LALR(1)
52
State
s0
s1
s2
s3
s4
s5
s6
State
s0
s1
s2
s3
s4
s5
s6
(
shift to s1
shift to s2
shift to s2
error
error
error
error
(
shift to s1
shift to s2
shift to s2
error
error
error
error
s7 … s8 =>
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
error
reduce(4)
error
error
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
error
error
error
Goto
$
reduce (5)
reduce (3)
error
error
reduce (4)
accept
reduce (2)
X
goto s5
Y
goto s6
goto s3
goto s3
Goto
$
reduce (5)
reduce (3)
error
error
reduce (4)
accept
reduce (2)
X
goto s5
Y
goto s6
goto s3
s7
goto s3
SLR(1)
LALR(1)
52
State
s0
s1
s2
s3
s4
s5
s6
State
s0
s1
s2
s3
s4
s5
s6
(
shift to s1
shift to s2
shift to s2
error
error
error
error
ACTION
)
error
reduce (5)
reduce (5)
shift to s4
error
reduce(4)
error
error
(
shift to s1
shift to s2
shift to s2
error
error
error
error
ACTION
)
reduce (5)
reduce (5)
reduce (5)
shift to s4
reduce (4)
error
error
Goto
$
reduce (5)
reduce (3)
error
error
reduce (4)
accept
reduce (2)
X
goto s5
Y
goto s6
goto s3
goto s3
Goto
$
reduce (5)
reduce(3 or 5)
reduce (5)
error
reduce (4)
accept
reduce (2)
X
goto s5
Y
goto s6
goto s3
goto s3
A Hierarchy of Grammar Classes
LR Languages
• The set of LR languages are independent of the
lookahead distance k.
• For all the languages with SLR(1), LALR(1)
and LR(1) grammars we looked at, we could
have found an LR(0) grammar!!!
• This grammar may not be natural however.
LR Languages
• Given any LR(k) grammar Gk, if L(Gk) is
prefix-free, then there exists an (equivalent)
LR(0) grammar G0 such that L(Gk) = L(G0)
– A language L is prefix-free iff [x in L and xy in L
implies y is ].
– A language can be made prefix-free by
concatenating a special end-marker to each string in
the language.
LR vs LL
In general, LL(k) grammars are
LR(k) grammars, for any k.
LR(0) vs LL(k)
• The following LR(0) grammar (with left
recursion) is not LL(1).
S  S0 | 0
– The following LR(0) grammar (with left recursion)
is not LL(k), for any k.
S  S0 | S1 | 0 | 1
• The following LL(1) grammar is not LR(0).
S Z$
Z aZ | 
Not LR(k) for an k
• The following regular grammar (with left
recursion) is not LR(k), for any k.
S  Cc | Bb
C  Ca | a
B  Ba | a
SLR(1) vs LL(k)
• The following SLR(1) grammar (and hence,
LR(1) grammar) requiring left factoring is not
LL(k), for any k.
E  T+E | T
T  x *T | x
LL(1) vs SLR(1)
• The following grammar is LL(1), but not
SLR(1).
S  1X | 2Ag
 Af | Bg
A3|
B4|
X
LL(1) vs LALR(1)
• The following grammar is LL(1), but not
LALR(1).
S  aF | bG
F  Xc | Yd
 IA
A 
X
 Xd | Yc
Y  IB
I 
G
B

