CSE244 SLR Parsing Aggelos Kiayias - UConn - CSE

SLR Parsing
CSE244
Aggelos Kiayias
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Road, Box U-155
Storrs, CT 06269-1155
http://www.cse.uconn.edu/~akiayias
Items





SLR (Simple LR parsing)
DEF An LR(0) item is a production with a “marker” (a dot) somewhere in its right-hand side.
E.g. S → aA.Be
Intuition: it indicates how much of a certain production
we have seen already (up to the point of the marker)
CENTRAL IDEA OF SLR PARSING: construct a DFA
that recognizes viable prefixes of the grammar.
Intuition: Shift/Reduce actions can be decided based on
this DFA (what we have seen so far & what are our next
options).
Use “LR(0) Items” for the creation of this DFA.
Basic Operations

Augmented Grammar (the original grammar plus the new start production E’ → E):
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
CLOSURE OPERATION of a set of Items:
Function closure(I)
{
J=I;
repeat for each item A → α.Bβ in J and each production
B → γ of G such that B → .γ is not in J: ADD B → .γ to J
until no more items can be added to J
return J
}
EXAMPLE consider I = { E’ → .E }
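A minimal Python sketch of the closure operation, applied to the example above. The encoding of an item as a tuple (head, body, marker position) and the names GRAMMAR, closure and show are illustrative assumptions, not part of the slides.

```python
# Augmented expression grammar; each production is (head, body).
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar):
    """Return the closure of a set of items (head, body, dot)."""
    result = set(items)
    work = list(items)
    while work:
        _, body, dot = work.pop()
        if dot == len(body):
            continue  # marker at the end: nothing to expand
        for head, prod in grammar:
            new_item = (head, prod, 0)
            # Add B -> .gamma for every production of the symbol after the marker.
            if head == body[dot] and new_item not in result:
                result.add(new_item)
                work.append(new_item)
    return frozenset(result)


def show(item):
    """Render an item with '.' as the marker."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot])}.{' '.join(body[dot:])}"


I0 = closure({("E'", ("E",), 0)}, GRAMMAR)
for item in sorted(I0):
    print(show(item))
```

Starting from the single item E’ → .E, the sketch pulls in every production of E, then of T, then of F, each with the marker at the far left.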
GOTO function



Definition.
Goto(I,X) = closure of the set of all items
A → αX.β where A → α.Xβ belongs to I
Intuitively: Goto(I,X) is the set of all items that are
“reachable” from the items of I once X has been
“seen.”
E.g. consider I = { E’ → E. , E → E.+T } and compute
Goto(I,+)
Goto(I,+) = { E → E+.T , T → .T * F , T → .F ,
F → .( E ) , F → .id }
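The Goto definition can be sketched in Python as follows. This is an illustration under assumed conventions: items are tuples (head, body, marker position), and the names GRAMMAR, closure and goto are hypothetical helpers rather than anything defined on the slides.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar):
    """Return the closure of a set of items (head, body, dot)."""
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot == len(body):
            continue
        for head, prod in grammar:
            new_item = (head, prod, 0)
            if head == body[dot] and new_item not in result:
                result.add(new_item)
                work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """Move the marker over `symbol` wherever possible, then close."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved, grammar)


# The example set I = { E' -> E. , E -> E.+T }
I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
result = goto(I, "+", GRAMMAR)
```

Only E → E.+T has + right after the marker, so it alone is advanced; the closure then adds the T and F productions. A symbol that follows no marker in I gives the empty set.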
The Canonical Collections of Items for G
Procedure Items(G’:augmented grammar)
{
C := { closure({ [S’ → .S] }) }
repeat
for each set of items I in C and each
grammar symbol X
such that goto(I,X) is not empty and not in C
do add goto(I,X) to C
until no more sets of items can be added to C
}
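A possible Python rendering of the Items procedure, given as a sketch. The tuple encoding of items, the helper names, and the choice to try grammar symbols in order of first appearance are all assumptions made for illustration; state numbers depend on that enumeration order.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot == len(body):
            continue
        for head, prod in grammar:
            new_item = (head, prod, 0)
            if head == body[dot] and new_item not in result:
                result.add(new_item)
                work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def canonical_collection(grammar):
    """Return (list of item sets, transitions {(state, symbol): state})."""
    # Grammar symbols in order of first appearance (arbitrary but fixed).
    symbols = list(dict.fromkeys(s for _, body in grammar for s in body))
    head, body = grammar[0]
    states = [closure({(head, body, 0)}, grammar)]
    transitions = {}
    i = 0
    while i < len(states):  # states grows while we scan it
        for x in symbols:
            target = goto(states[i], x, grammar)
            if not target:
                continue  # goto(I, X) is empty
            if target not in states:
                states.append(target)
            transitions[i, x] = states.index(target)
        i += 1
    return states, transitions


STATES, TRANSITIONS = canonical_collection(GRAMMAR)
print(len(STATES), "sets of items")
```

For the expression grammar this yields twelve sets of items, which agrees with the I0 … I11 of the slides. The returned transitions are the Goto function restricted to non-empty targets.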
I0
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id
I1
E’ → E.
E → E. + T
I2
E → T.
T → T. * F
… I11
The DFA For Viable Prefixes



States = Canonical Collection of Sets of Items
Transitions defined by the Goto Function.
I0 is the initial state; all states are final (the empty string is also a viable prefix)
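[Figure: part of the transition diagram. I0 goes to I1 on E, to I2 on T, and to I3 on F; I1 has an outgoing edge on +; I2 goes to I7 on *; …]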
Look p. 226
Intuition: Imagine an NFA whose states are all the items
of the grammar, with transitions of the form:
“A → α.Xβ” goes to “A → αX.β” with an arrow
labeled “X” (plus ε-moves from “A → α.Bβ” to each “B → .γ”)
Then the closure used in the Goto function
essentially transforms this NFA into the DFA above
Example





S’ → S
S → aABe
A → Abc
A → b
B → d
Start with I0 = closure(S’ → .S)
Example, II
E’ → E
E → E + T | T
T → T * F | F
F → ( E ) | id
Start with I0 = closure(E’ → .E)
Relation to Parsing

An item A → β1.β2 is valid for a viable prefix
αβ1 if we have a rightmost derivation that yields
αAw, which in one step yields αβ1β2w

An item will be valid for many viable prefixes.

Whether a certain item is valid for a certain viable
prefix helps our decision whether to shift or
reduce when αβ1 is on the stack.

If β2 ≠ ε, it looks like we still need to shift.

If β2 = ε, it looks like we should reduce by A → β1
Not a total solution since two valid items may
tell us different things.
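To make the last point concrete, here is a small Python sketch that lists what each item of a state suggests. The function name hints, the tuple encoding of items, and the TERMINALS set are assumptions for illustration. The sketch reports a shift only when the symbol after the marker is a terminal, since only terminals can actually be shifted.

```python
# Terminals of the expression grammar E -> E+T | T, T -> T*F | F, F -> (E) | id.
TERMINALS = {"+", "*", "(", ")", "id"}


def hints(items, terminals):
    """Collect the shift/reduce suggestions made by a set of valid items."""
    found = set()
    for head, body, dot in items:
        if dot == len(body):
            # Marker at the end (beta2 is empty): suggests reducing by head -> body.
            found.add(("reduce", head, body))
        elif body[dot] in terminals:
            # A terminal follows the marker: suggests shifting it.
            found.add(("shift", body[dot]))
    return found


# I2 = { E -> T. , T -> T.*F }, both valid for the viable prefix T.
I2 = {("E", ("T",), 1), ("T", ("T", "*", "F"), 1)}
print(hints(I2, TERMINALS))
```

For I2 the two items disagree: one suggests reducing by E → T, the other suggests shifting *. Something beyond the items themselves (in SLR, the Follow sets) is needed to choose between them.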
Sanity Check




E+T* is a viable prefix (and the DFA will be at
state I7 after reading it)
Indeed: E’=>E=>E+T=>E+T*F is a rightmost
derivation, T*F is the handle of E+T*F, thus
E+T*F is a viable prefix, thus E+T* is also.
Examine state I7 … it contains
T → T*.F
F → .(E)
F → .id
i.e., precisely the items valid for E+T*:
E’=>E=>E+T=>E+T*F
E’=>E=>E+T=>E+T*F=>E+T*(E)
E’=>E=>E+T=>E+T*F=>E+T*id
There are no other valid items for the viable
prefix E+T*
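The sanity check can also be replayed mechanically. The sketch below follows the Goto function from I0 along the symbols E, +, T, * and inspects the set of items it lands in. The tuple encoding of items and the helper names (closure, goto, state_after) are illustrative assumptions.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot == len(body):
            continue
        for head, prod in grammar:
            new_item = (head, prod, 0)
            if head == body[dot] and new_item not in result:
                result.add(new_item)
                work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def state_after(prefix, grammar):
    """Run the viable-prefix DFA on a sequence of grammar symbols."""
    head, body = grammar[0]
    state = closure({(head, body, 0)}, grammar)  # I0
    for symbol in prefix:
        state = goto(state, symbol, grammar)
    return state  # empty set means: not a viable prefix


reached = state_after(["E", "+", "T", "*"], GRAMMAR)
for head, body, dot in sorted(reached):
    print(head, "->", " ".join(body[:dot]) + "." + " ".join(body[dot:]))
```

The set reached holds exactly the three items listed for I7. A sequence such as E * ends in the empty set, which signals that it is not a viable prefix.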
SLR Parsing Table Construction
Input: the augmented grammar G’
Output: The SLR Parsing table functions ACTION & GOTO
1. Construct C={I0,..,In}, the collection of sets of LR(0) items for G’
2. “State i” is constructed from Ii. Its parsing actions are determined as follows:
   If [A → α.aβ] is in Ii and goto(Ii,a)=Ik then we set
   ACTION[i,a] to be “shift k” (a is a terminal)
   If [A → α.] is in Ii then we set ACTION[i,a] to “reduce A → α”
   for all a in Follow(A) (note: A is not S’)
   If [S’ → S.] is in Ii then we set ACTION[i,$] = accept
3. The goto transitions for state i are constructed as follows: for
all nonterminals A, if goto(Ii,A)=Ik then goto[i,A]=k
4. All entries not defined by rules (2) and (3) are made “error”
5. The initial state of the parser is the one constructed from the
set of items I0
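The construction steps above can be sketched in Python as follows. Everything here is an illustrative assumption: the tuple encoding of items, the helper names, storing each table entry as a set of actions (so that a multiply defined entry would show up as a set with more than one element), and the FOLLOW sets, which are written out by hand for the expression grammar instead of being computed. Missing entries play the role of “error”.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
# FOLLOW sets for the expression grammar, written out by hand.
FOLLOW = {
    "E": {"$", "+", ")"},
    "T": {"$", "+", ")", "*"},
    "F": {"$", "+", ")", "*"},
}


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot == len(body):
            continue
        for head, prod in grammar:
            new_item = (head, prod, 0)
            if head == body[dot] and new_item not in result:
                result.add(new_item)
                work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def canonical_collection(grammar):
    symbols = list(dict.fromkeys(s for _, body in grammar for s in body))
    head, body = grammar[0]
    states, transitions, i = [closure({(head, body, 0)}, grammar)], {}, 0
    while i < len(states):
        for x in symbols:
            target = goto(states[i], x, grammar)
            if target:
                if target not in states:
                    states.append(target)
                transitions[i, x] = states.index(target)
        i += 1
    return states, transitions


def slr_table(grammar, follow):
    """Build ACTION and GOTO; absent keys mean 'error'."""
    nonterminals = {head for head, _ in grammar}
    start = grammar[0][0]
    states, transitions = canonical_collection(grammar)
    action, goto_table = {}, {}
    for (i, x), k in transitions.items():
        if x in nonterminals:
            goto_table[i, x] = k                                  # rule 3
        else:
            action.setdefault((i, x), set()).add(("shift", k))    # shift rule
    for i, state in enumerate(states):
        for head, body, dot in state:
            if dot < len(body):
                continue
            if head == start:
                action.setdefault((i, "$"), set()).add(("accept",))
            else:
                for a in follow[head]:                            # reduce rule
                    action.setdefault((i, a), set()).add(("reduce", head, body))
    return action, goto_table


ACTION, GOTO = slr_table(GRAMMAR, FOLLOW)
```

With the symbol order used in this sketch, the state numbers happen to coincide with I0 … I11 of the slides, so ACTION[0, "("] holds shift 4 and ACTION[1, "$"] holds accept. Every entry contains a single action, i.e. the table for this grammar has no multiply defined entries.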
Example.
I0
E’ → .E
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id
I1
E’ → E.
E → E. + T
I2
E → T.
T → T. * F
Goto(I0,E)=I1
Goto(I0,T)=I2
Goto(I0,( )=I4
I4
F → (.E)
E → .E + T
E → .T
T → .T * F
T → .F
F → .( E )
F → .id
Since F → .( E ) is in I0
and Goto(I0,( )=I4
we set ACTION(0, ( )=s4
Since E’ → E. is in I1
we set ACTION(1,$)=acc
Since E → T. is in I2 and
Follow(E)={$,+,) }
we set ACTION(2,$)=r E → T
ACTION(2,+)=r E → T
ACTION(2,))=r E → T
Follow(T)=Follow(F)={ ) , + , * , $ }
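The Follow sets used in this example can be checked with a short fixed-point computation. The sketch below is a simplification that assumes the grammar has no ε-productions (true for the expression grammar); the function names first_sets and follow_sets are hypothetical.

```python
GRAMMAR = [
    ("E'", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]


def first_sets(grammar):
    """FIRST of each nonterminal, assuming no epsilon-productions."""
    nonterminals = {head for head, _ in grammar}
    first = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            x = body[0]
            new = first[x] if x in nonterminals else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets(grammar):
    """FOLLOW of each nonterminal, assuming no epsilon-productions."""
    nonterminals = {head for head, _ in grammar}
    first = first_sets(grammar)
    follow = {n: set() for n in nonterminals}
    follow[grammar[0][0]].add("$")  # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            for i, x in enumerate(body):
                if x not in nonterminals:
                    continue
                if i + 1 < len(body):
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in nonterminals else {nxt}
                else:
                    new = follow[head]  # x ends the body: inherit FOLLOW(head)
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR)
for name in ("E", "T", "F"):
    print(name, sorted(FOLLOW[name]))
```

The result agrees with the sets stated on the slide: Follow(E) contains $, + and ), while Follow(T) and Follow(F) additionally contain *.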
Construct the whole table..

(SLR table has no multiply defined labels).
Conflicts


Shift/Reduce
Reduce/Reduce
Sometimes even unambiguous
grammars produce multiply defined
labels (s/r, r/r conflicts) in the SLR
table.
Example:
S’ → S
S → L = R | R
L → * R | id
R → L
I0 = { S’ → .S , S → .L = R , S → .R , L → .* R , L → .id , R → .L }
I1 = { S’ → S. }
I2 = { S → L. = R , R → L. }
I3 = { S → R. }
I4 = { L → *.R , R → .L , L → .* R , L → .id }
I5 = { L → id. }
I6 = { S → L = .R , R → .L , L → .* R , L → .id }
… also I7, I8, I9 …
action[2, = ] ?
s6 (because of S → L. = R)
r R → L (because of R → L. and = follows R)
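The conflict in state 2 can be reproduced directly from the items of I2 and the Follow sets. In the sketch below the function name actions_on and the tuple encoding of items are assumptions, and the FOLLOW sets for this grammar are written out by hand (Follow(S) = {$}, Follow(L) = Follow(R) = {=, $}).

```python
# Hand-written FOLLOW sets for S -> L=R | R, L -> *R | id, R -> L.
FOLLOW = {
    "S": {"$"},
    "L": {"=", "$"},
    "R": {"=", "$"},
}

# I2 = { S -> L.=R , R -> L. }
I2 = {
    ("S", ("L", "=", "R"), 1),
    ("R", ("L",), 1),
}


def actions_on(state, symbol, follow):
    """All SLR actions that the items of `state` propose on input `symbol`."""
    proposed = set()
    for head, body, dot in state:
        if dot < len(body) and body[dot] == symbol:
            # The terminal right after the marker matches: shift.
            proposed.add("shift")
        elif dot == len(body) and symbol in follow[head]:
            # Complete item and symbol in Follow(head): reduce.
            proposed.add(("reduce", head, body))
    return proposed


print(actions_on(I2, "=", FOLLOW))
print(actions_on(I2, "$", FOLLOW))
```

On = the state proposes both a shift and a reduction by R → L, which is the multiply defined entry; on $ only the reduction is proposed, so that entry is fine.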
But Why?

Let’s consider a string that will exhibit the conflict.
id=id
Stack     Input     Action
$0        id=id$    s5
$0id5     =id$      r L → id
$0L2      =id$      conflict…
What is the correct move? (recall: the grammar is unambiguous)
Reducing by R → L would leave R=id, which is not a sentential form!!!
Even though = might follow R, it does not in this
case; it does only when R is preceded by *.
So the correct move here is to shift.
SLR finds a conflict because using Follow + LR(0)
items as the guide for when to reduce is not
precise enough.