CSE244 - UCONN School of Engineering

Sections 4.5,4.6
CSE244
Aggelos Kiayias
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Road, Box U-155
Storrs, CT 06269-1155
[email protected]
http://www.cse.uconn.edu/~akiayias
CH4.1
Bottom Up Parsing


“Shift-Reduce” Parsing
Reduce a string to the start symbol of the grammar.
At every step a particular substring is matched (in
left-to-right fashion) to the right side of some
production and then it is substituted by the nonterminal in the left hand side of the production.
Consider:
S → aABe
A → Abc | b
B → d
abbcde
aAbcde
aAde
aABe
S
Rightmost Derivation:
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
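The rightmost derivation above can be replayed mechanically. The following is a minimal sketch, assuming single-character grammar symbols with uppercase letters as nonterminals; the helper names `rightmost_step` and `derive` are illustrative and not part of the slides. Reading the resulting list backwards gives the reduction sequence abbcde, aAbcde, aAde, aABe, S.

```python
# Grammar from the slide. Assumption of this sketch: every symbol is one
# character and uppercase letters are the nonterminals.
GRAMMAR = {"S": ["aABe"], "A": ["Abc", "b"], "B": ["d"]}

def rightmost_step(form, rhs):
    """Rewrite the rightmost nonterminal of `form` using right side `rhs`."""
    for i in range(len(form) - 1, -1, -1):
        if form[i].isupper():
            if rhs not in GRAMMAR[form[i]]:
                raise ValueError(f"{form[i]} -> {rhs} is not a production")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to rewrite")

def derive(choices, start="S"):
    """Apply the chosen right sides one by one, always at the rightmost nonterminal."""
    forms = [start]
    for rhs in choices:
        forms.append(rightmost_step(forms[-1], rhs))
    return forms

derivation = derive(["aABe", "d", "Abc", "b"])
print(" => ".join(derivation))            # the rightmost derivation
print(", ".join(reversed(derivation)))    # the same forms in reduction order
```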
CH4.2
Handles

Handle of a string = substring that matches the
RHS of some production AND whose reduction to
the non-terminal on the LHS is a step along the
reverse of some rightmost derivation.
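Formally: a handle of a right sentential form \( \gamma \)
is a pair \( \langle A \to \beta,\ \text{location of } \beta \text{ in } \gamma \rangle \)
that satisfies the above property.
i.e. \( A \to \beta \) is a handle of \( \alpha\beta w \) at the location immediately
after the end of \( \alpha \), if:
\[ S \Rightarrow^{*}_{rm} \alpha A w \Rightarrow_{rm} \alpha \beta w \]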


A sentential form may have many different handles.
Right sentential forms of an unambiguous grammar
have exactly one handle [but potentially many substrings that
look like handles!].
CH4.3
Example
Consider:
S → aABe
A → Abc | b
B → d
S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
It follows that:
(S →) aABe is a handle of aABe in location 1.
(B →) d is a handle of aAde in location 3.
(A →) Abc is a handle of aAbcde in location 2.
(A →) b is a handle of abbcde in location 2.
CH4.4
Example, II
Grammar:
S → aABe
A → Abc | b
B → d
Consider aAbcde (it is a right sentential form)
Is [A → b, aAbcde] a handle?
If it is, then there must be a rightmost derivation:
\( S \Rightarrow_{rm} \dots \Rightarrow_{rm} aAAcde \Rightarrow_{rm} aAbcde \)
There is no way ever to get two consecutive
A’s in this grammar, so this is impossible.
CH4.5
Example, III
Grammar:
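S → aABe
A → Abc | b
B → d
Consider aAbcde (it is a right sentential form)
Is [B → d, aAbcde] a handle?
If it is, then there must be a rightmost derivation:
\( S \Rightarrow_{rm} \dots \Rightarrow_{rm} aAbcBe \Rightarrow_{rm} aAbcde \)
We try to obtain aAbcBe:
\( S \Rightarrow_{rm} aABe \) ?? aAbcBe
Reaching aAbcBe would require expanding A while B is still
unexpanded, so aAbcBe is not a right
sentential form.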
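The handle tests in these examples can be checked by brute force. The sketch below is illustrative only: it enumerates all right sentential forms up to a length bound and then tests the definition directly. It assumes single-character symbols, uppercase nonterminals, and a grammar without ε-productions (so no derivation step shortens a form); `right_sentential_forms` and `is_handle` are hypothetical helper names.

```python
from collections import deque

# Grammar from the slides (uppercase = nonterminal, an assumption of this sketch).
GRAMMAR = {"S": ["aABe"], "A": ["Abc", "b"], "B": ["d"]}

def right_sentential_forms(max_len, start="S"):
    """Breadth-first enumeration of all right sentential forms of length <= max_len."""
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        idx = max((i for i, s in enumerate(form) if s.isupper()), default=None)
        if idx is None:          # only terminals left: nothing to expand
            continue
        for rhs in GRAMMAR[form[idx]]:
            nxt = form[:idx] + rhs + form[idx + 1:]
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def is_handle(form, lhs, rhs, location):
    """Is lhs -> rhs at 1-based `location` a handle of the right sentential form `form`?"""
    start, end = location - 1, location - 1 + len(rhs)
    if form[start:end] != rhs:
        return False
    if any(s.isupper() for s in form[end:]):   # w must consist of terminals only
        return False
    reduced = form[:start] + lhs + form[end:]
    # No production shrinks a form, so a bound of len(form) is enough here.
    return reduced in right_sentential_forms(len(form))

print(is_handle("aAbcde", "A", "Abc", 2))  # the handle from the first example
print(is_handle("aAbcde", "A", "b", 3))    # Example II
print(is_handle("aAbcde", "B", "d", 5))    # Example III
```

Only the first call reports a handle; the other two substrings match a right side but reducing them does not yield a right sentential form.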
CH4.6
Handle Pruning

A rightmost derivation in reverse can be obtained
by “handle-pruning.”
Apply this to the previous example.
S → aABe
A → Abc | b
B → d

Sentential form    Reduce by
abbcde             A → b
aAbcde             A → Abc
aAde               B → d
aABe               S → aABe
S
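Handle pruning on this example can be sketched as a small search. The code below is not an efficient parser; it is a hedged illustration that tries every substring matching a right side, keeps only candidates followed by terminals (as required in a reversed rightmost derivation), and backtracks when a choice leads nowhere. It assumes single-character symbols and a grammar in which reductions cannot cycle; `prune` is a hypothetical name.

```python
# Productions from the slide as (left side, right side) pairs.
GRAMMAR = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]

def prune(form, start="S", dead=None):
    """Return the reductions (form, lhs, rhs) leading from `form` to `start`, or None."""
    if dead is None:
        dead = set()             # forms already known not to reduce to `start`
    if form == start:
        return []
    if form in dead:
        return None
    for pos in range(len(form)):
        for lhs, rhs in GRAMMAR:
            end = pos + len(rhs)
            # Candidate handle: matches a right side and is followed by terminals only.
            if form[pos:end] == rhs and not any(c.isupper() for c in form[end:]):
                rest = prune(form[:pos] + lhs + form[end:], start, dead)
                if rest is not None:
                    return [(form, lhs, rhs)] + rest
    dead.add(form)
    return None

for form, lhs, rhs in prune("abbcde"):
    print(f"{form:8} reduce by {lhs} -> {rhs}")
```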
CH4.7
Handle Pruning, II

Consider the cut of a parse-tree of a certain right
sentential form.
[Figure: cut of a parse tree with root S and an interior node A. Labels on the frontier: Left part; Handle; (only terminals here); Viable prefix.]
CH4.8
Shift Reduce Parsing with a Stack

The “big” problem: given the sentential form,
locate the handle.
General idea for S-R parsing using a stack:
1. “shift” input symbols into the stack until a
handle is found on top of it.
2. “reduce” the handle to the corresponding nonterminal.
(other operations: “accept” when the input is
consumed and only the start symbol is on the stack,
also: “error”).
Viable prefix: prefix of a right sentential form that
appears on the stack of a Shift-Reduce parser.
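The shift and reduce operations can be sketched with an explicit stack. Since the slides have not yet said how the handle is located, this illustration simply tries a reduction whenever a right side sits on top of the stack and backtracks to a shift if that choice fails; a real parser would decide deterministically. The grammar is the earlier S → aABe example, and `shift_reduce` is a hypothetical name.

```python
# Productions of the running example as (left side, right side) pairs.
GRAMMAR = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]

def shift_reduce(tokens, start="S"):
    """Return a trace of (stack, remaining input, action), or None if no parse exists."""
    def search(stack, pos):
        if pos == len(tokens) and stack == [start]:
            return [("".join(stack), "", "accept")]
        # Try each reduction whose right side is on top of the stack.
        for lhs, rhs in GRAMMAR:
            n = len(rhs)
            if stack[-n:] == list(rhs):
                rest = search(stack[:-n] + [lhs], pos)
                if rest is not None:
                    step = ("".join(stack), tokens[pos:], f"reduce by {lhs} -> {rhs}")
                    return [step] + rest
        # Otherwise (or if every reduction failed) shift the next input symbol.
        if pos < len(tokens):
            rest = search(stack + [tokens[pos]], pos + 1)
            if rest is not None:
                return [("".join(stack), tokens[pos:], "shift")] + rest
        return None
    return search([], 0)

for stack, remaining, action in shift_reduce("abbcde"):
    print(f"{stack:6} {remaining:8} {action}")
```

Every stack content printed by this trace is a viable prefix: the second b is first (wrongly) reduced to A, the search fails, and the backtracking step shifts c instead so that Abc appears on top of the stack.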
CH4.9
What happens with ambiguous grammars
Consider:
E → E+E | E*E | ( E ) | id
Derive id+id*id by two different rightmost
derivations.
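The two rightmost derivations can be found by exhaustive search. The sketch below is illustrative: it expands the rightmost E in every possible way and discards forms that are too long or whose terminal tail no longer matches the target. The function name `rightmost_derivations` and the tuple encoding of sentential forms are assumptions of this example.

```python
# Right sides of E -> E+E | E*E | ( E ) | id, as token lists.
PRODUCTIONS = [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["id"]]

def rightmost_derivations(target, form=("E",)):
    """Yield every rightmost derivation of `target` as a list of sentential forms."""
    if form == target:
        yield [form]
        return
    idx = max((i for i, s in enumerate(form) if s == "E"), default=None)
    if idx is None:
        return                      # all terminals, but not the target
    tail = form[idx + 1:]           # terminals right of the rightmost E are final
    if len(form) > len(target) or (tail and target[-len(tail):] != tail):
        return
    for rhs in PRODUCTIONS:
        nxt = form[:idx] + tuple(rhs) + form[idx + 1:]
        for rest in rightmost_derivations(target, nxt):
            yield [form] + rest

target = ("id", "+", "id", "*", "id")
derivations = list(rightmost_derivations(target))
for d in derivations:
    print(" => ".join(" ".join(f) for f in d))
```

One derivation starts with E ⇒ E+E and the other with E ⇒ E*E, which is exactly the ambiguity the slide asks for.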
CH4.10
Example
STACK    INPUT            Remark
$        id + id * id$    Shift
$ id     + id * id$       Reduce by E → id
$E       + id * id$

E → E+E
  | E*E
  | ( E ) | id
CH4.11
Conflicts
Conflicts [appear in ambiguous grammars]
either “shift/reduce” or “reduce/reduce”

Another Example:
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other (any other statement)

Stack            Input
if … then        else …

Shift/Reduce conflict
CH4.12
More Conflicts
stmt → id ( parameter-list )
stmt → expr := expr
parameter-list → parameter-list , parameter | parameter
parameter → id
expr-list → expr-list , expr | expr
expr → id | id ( expr-list )
Consider the string A(I,J)
Corresponding token stream is id(id, id)
After three shifts:
Stack = id(id
Input = , id)
Reduce/Reduce conflict: reduce the top id by parameter → id
or by expr → id? What to do?
(It really depends on what A is:
an array or a procedure?)
CH4.13
Removing Conflicts

One way is to manipulate the grammar.
- cf. what we did in the top-down approach to
  transform a grammar so that it is LL(1).
Nevertheless:
- We will see that shift/reduce and reduce/reduce
  conflicts can be best dealt with after they are
  discovered.
- This simplifies the design.
CH4.14
Operator-Precedence Parsing

Problems encountered so far in shift/reduce parsing:
- IDENTIFY a handle.
- Resolve conflicts (if they occur).
Operator grammars: a class of grammars where handle
identification and conflict resolution are easy.

Operator Grammars: no production right side is ε
or has two adjacent non-terminals.
E → E - E | E + E | E * E | E / E | E ^ E | - E | ( E ) | id

Note: this is typically an ambiguous grammar.
CH4.15
Basic Technique


Resolving ambiguity:
For the terminals of the grammar,
define the relations <. , .> and .=.
a <. b means that a yields precedence to b
a .=. b means that a has the same precedence as b
a .> b means that a takes precedence over b
E.g. * .> + or + <. *
Many handles are possible. We will use <. , .=. and
.> in a clever way to find the correct handle (i.e., the
one that respects the precedence).
CH4.16
Using Operator-Precedence Relations


GOAL: delimit the handle of a right sentential form.
<. will mark the beginning, .> will mark the end
and .=. will be in between.
Since no two adjacent non-terminals appear in the
RHS of any production, the same is true for
any sentential form.
So we are given \( \beta_0 a_1 \beta_1 a_2 \beta_2 \dots a_n \beta_n \)
where each \( \beta_i \) is either a nonterminal or the empty string
and each \( a_i \) is a terminal.
We drop all non-terminals and we write the corresponding
relation between each consecutive pair of terminals.
Example for $id+id*id$ using standard precedence:
$ <. id .> + <. id .> * <. id .> $
Example for $E+E*id$ … $ <. + <. * <. id .> $
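Dropping the non-terminals and writing the relation between consecutive terminals is easy to sketch in code. The table below covers only id, +, * and $ with the usual precedence (* above +, both left-associative); it and the function name `annotate` are assumptions of this illustration.

```python
# PRECEDENCE[a][b] is the relation between terminal a and the terminal b
# that follows it. Assumed standard precedence: * binds tighter than +.
PRECEDENCE = {
    "id": {"+": ".>", "*": ".>", "$": ".>"},
    "+":  {"id": "<.", "+": ".>", "*": "<.", "$": ".>"},
    "*":  {"id": "<.", "+": ".>", "*": ".>", "$": ".>"},
    "$":  {"id": "<.", "+": "<.", "*": "<."},
}

def annotate(symbols):
    """Drop non-terminals and insert the relation between consecutive terminals."""
    terminals = [s for s in symbols if s in PRECEDENCE]   # e.g. "E" is skipped
    out = [terminals[0]]
    for a, b in zip(terminals, terminals[1:]):
        out += [PRECEDENCE[a][b], b]
    return " ".join(out)

print(annotate("$ id + id * id $".split()))
print(annotate("$ E + E * id $".split()))
```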
CH4.17
Using Operator-Precedence

… Then
1. Scan the string to discover the first .>
2. Scan backwards skipping .=. (if any) until a <. is
found. (we will associate to the right)
3. The handle is the substring delimited by the two
steps above (including any in-between or
surrounding non-terminals).
E.g.
Consider the sentential form E+E*E
we obtain $+*$ and from this the string
$<. + <. * .> $
The handle is E*E
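The three steps can be sketched as a function over a list of symbols. The precedence table here additionally covers parentheses so that the .=. case can be exercised; its entries follow the usual textbook table and, like the name `find_handle`, are assumptions of this example rather than something given on the slide.

```python
# PRECEDENCE[a][b]: relation between terminal a and the next terminal b.
# Missing entries mean "no relation" (an error in a full parser).
PRECEDENCE = {
    "+":  {"+": ".>", "*": "<.", "(": "<.", ")": ".>", "id": "<.", "$": ".>"},
    "*":  {"+": ".>", "*": ".>", "(": "<.", ")": ".>", "id": "<.", "$": ".>"},
    "(":  {"+": "<.", "*": "<.", "(": "<.", ")": ".=.", "id": "<."},
    ")":  {"+": ".>", "*": ".>", ")": ".>", "$": ".>"},
    "id": {"+": ".>", "*": ".>", ")": ".>", "$": ".>"},
    "$":  {"+": "<.", "*": "<.", "(": "<.", "id": "<."},
}

def find_handle(symbols):
    """Return the handle of `symbols` (a list that starts and ends with "$")."""
    pos = [i for i, s in enumerate(symbols) if s in PRECEDENCE]   # terminal positions
    rel = [PRECEDENCE[symbols[a]].get(symbols[b]) for a, b in zip(pos, pos[1:])]
    right = rel.index(".>")                  # step 1: first .>
    left = right - 1
    while left >= 0 and rel[left] == ".=.":  # step 2: skip .=. going backwards
        left -= 1
    if left < 0 or rel[left] != "<.":
        raise ValueError("no <. found to the left of the first .>")
    # step 3: everything between the two terminals that delimit the handle,
    # which includes in-between and surrounding non-terminals.
    return symbols[pos[left] + 1 : pos[right + 1]]

print(find_handle("$ E + E * E $".split()))
print(find_handle("$ ( E ) $".split()))
```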
CH4.18
Operator Precedence Parser
Set ip to point to the first symbol of w$
Stack = $
Repeat forever:
  if top of stack == $ and ip points to $ then accept
  else { a = topmost terminal on the stack; b = symbol pointed to by ip;
    if a <. b or a .=. b then { push(b); advance ip }
    else if a .> b then repeat pop() until the top stack
      terminal is related by <. to the terminal most recently popped
    else error }
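A minimal runnable sketch of this loop follows. It keeps only terminals on the stack, as in the trace on the next slide, and it checks nothing beyond the precedence relations, so it is not a complete syntax checker. The table for id, +, *, $ and the name `parse` are assumptions of this example.

```python
# PRECEDENCE[a][b]: relation between stack terminal a and input terminal b.
PRECEDENCE = {
    "id": {"+": ".>", "*": ".>", "$": ".>"},
    "+":  {"id": "<.", "+": ".>", "*": "<.", "$": ".>"},
    "*":  {"id": "<.", "+": ".>", "*": ".>", "$": ".>"},
    "$":  {"id": "<.", "+": "<.", "*": "<."},
}

def parse(tokens):
    """Run the operator-precedence loop; return a trace of (stack, input, remark)."""
    stack, inp, ip, trace = ["$"], list(tokens) + ["$"], 0, []
    while True:
        a, b = stack[-1], inp[ip]
        row = (" ".join(stack), " ".join(inp[ip:]))
        if a == "$" and b == "$":
            trace.append(row + ("accept",))
            return trace
        rel = PRECEDENCE.get(a, {}).get(b)
        if rel in ("<.", ".=."):
            trace.append(row + (f"{a} {rel} {b}",))
            stack.append(b)                 # shift
            ip += 1
        elif rel == ".>":
            trace.append(row + (f"{a} {rel} {b}",))
            while True:                     # pop one handle
                popped = stack.pop()
                if PRECEDENCE[stack[-1]].get(popped) == "<.":
                    break
        else:
            raise SyntaxError(f"no precedence relation between {a} and {b}")

for stack, remaining, remark in parse(["id", "+", "id", "*", "id"]):
    print(f"{stack:12} {remaining:16} {remark}")
```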
CH4.19
Example
STACK        INPUT            Remark
$            id + id * id$    $ <. id
$ id         + id * id$       id .> +
$            + id * id$       $ <. +
$ +          id * id$         + <. id
$ + id       * id$            id .> *
$ +          * id$            + <. *
$ + *        id$              * <. id
$ + * id     $                id .> $
$ + *        $                * .> $
$ +          $                + .> $
$            $                accept

A sequence of pops corresponds to the application of some of
the productions.
CH4.20
Operator Precedence Table Construction

Basic techniques for operators:
- If operator θ1 has higher precedence than θ2,
  then set θ1 .> θ2 and θ2 <. θ1
- If the operators are of equal precedence (or the
  same operator),
  set θ1 .> θ2 and θ2 .> θ1 if the operators associate
  to the left;
  set θ1 <. θ2 and θ2 <. θ1 if the operators associate
  to the right
- Make θ <. ( and ( <. θ and ) .> θ and θ .> )
- id has higher precedence than any other symbol
- $ has lowest precedence.
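These rules can be turned into a small table builder. The sketch below takes the operators grouped by precedence level, lowest first, each level with its associativity. The entries that relate (, ), id and $ to one another are not spelled out on the slide; the ones used here follow the usual textbook treatment and are an assumption, as is the name `build_table`.

```python
def build_table(levels):
    """levels: list of (operators, "left" | "right"), lowest precedence first."""
    table = {}

    def set_rel(a, rel, b):
        table.setdefault(a, {})[b] = rel

    prec, assoc = {}, {}
    for level, (ops, side) in enumerate(levels):
        for op in ops:
            prec[op], assoc[op] = level, side

    for o1 in prec:
        for o2 in prec:
            if prec[o1] > prec[o2]:
                set_rel(o1, ".>", o2)        # higher precedence takes precedence
            elif prec[o1] < prec[o2]:
                set_rel(o1, "<.", o2)
            else:                            # equal precedence: use associativity
                set_rel(o1, ".>" if assoc[o1] == "left" else "<.", o2)

    for op in prec:
        set_rel(op, "<.", "(");  set_rel("(", "<.", op)
        set_rel(")", ".>", op);  set_rel(op, ".>", ")")
        set_rel(op, "<.", "id"); set_rel("id", ".>", op)   # id has highest precedence
        set_rel(op, ".>", "$");  set_rel("$", "<.", op)    # $ has lowest precedence

    # Assumed fixed entries among (, ), id and $ (standard textbook table).
    for a, rel, b in [("(", ".=.", ")"), ("(", "<.", "("), ("(", "<.", "id"),
                      ("$", "<.", "("), ("$", "<.", "id"), ("id", ".>", ")"),
                      ("id", ".>", "$"), (")", ".>", ")"), (")", ".>", "$")]:
        set_rel(a, rel, b)
    return table

table = build_table([(["+", "-"], "left"), (["*", "/"], "left"), (["^"], "right")])
print(table["+"]["*"], table["*"]["+"], table["^"]["^"])
```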
CH4.21
Unary Operators

Unary operators that are not also used as binary
operators are treated as before.
Problem: the – sign.
Typical solution: have the lexical analyzer return a
different token when it sees a unary minus.
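One way a lexical analyzer might make this distinction is to look at the previous token: a minus at the start of the input, after another operator, or after an opening parenthesis is treated as unary. This is a hedged sketch; the token name `NEG`, the decision rule, and the mapping of both identifiers and numbers to `id` are assumptions of the example.

```python
import re

# One token per match: an identifier, a number, or a single operator/parenthesis.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|[-+*/^()])")
# A minus that follows one of these tokens (or nothing) cannot be binary.
BEFORE_UNARY = {"+", "-", "*", "/", "^", "(", "NEG"}

def tokenize(text):
    """Return a token list in which unary minus is the distinct token NEG."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        lexeme, pos = m.group(1), m.end()
        if lexeme == "-":
            unary = not tokens or tokens[-1] in BEFORE_UNARY
            tokens.append("NEG" if unary else "-")
        elif lexeme in {"+", "*", "/", "^", "(", ")"}:
            tokens.append(lexeme)
        else:
            tokens.append("id")             # identifiers and numbers alike
    return tokens

print(tokenize("-a * (b - -c)"))
```

With this in place NEG can get its own row and column in the precedence table, independent of the binary minus.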
CH4.22