
WELCOME TO A
JOURNEY TO
CS419
Dr. Hussien Sharaf
Dr. Mohammad Nassef
Department of Computer Science,
Faculty of Computers and Information,
Cairo University
DESIGNING A TOP-DOWN PARSER:
Steps:
• Elimination of Ambiguity
• Elimination of Left Recursion
• Left Factoring
• Drawing Transition Diagram (Optional)
• Applying First/Follow operators
• Building the Parsing Table
• Parse the given statements
TRANSITION DIAGRAMS

Transition diagrams can describe recursive parsers, just like
they can describe lexical analyzers, but the diagrams are
slightly different.

Construction:
1. Eliminate left recursion from grammar G
2. Left factor grammar G
3. For each non-terminal A, do:
   1. Create an initial and final (return) state
   2. For each production A -> X1 X2 … Xn, create a path from the initial to
      the final state with edges X1 X2 … Xn.
EXAMPLE TRANSITION DIAGRAMS

An expression grammar
with left recursion and
ambiguity removed:

E -> T E’
E’ -> + T E’ | ε
T -> F T’
T’ -> * F T’ | ε
F -> ( E ) | id





Corresponding transition diagrams: (figure not reproduced)

Example: parse the string “id + id * id”
USING TRANSITION DIAGRAMS
• Begin in the start state for the start symbol.
• When we are in state s with an edge labeled by terminal a to state t,
  if the next input symbol is a, move to state t and advance the input pointer.
• For an edge to state t labeled with non-terminal A, jump to the transition
  diagram for A, and when finished, return to state t.
• For an edge labeled ε, move immediately to t.

PROCEDURE




• Make a transition diagram (like a DFA/NFA) for every rule of the grammar.
• Optimize the diagrams by reducing the number of states, yielding the final
  transition diagram.
• To parse a string, simulate the string on the transition diagram.
• If the transition diagram reaches an accept state after consuming all of the
  input, the string is accepted.
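
The simulation can be sketched as a set of mutually recursive functions, one per transition diagram. This is a minimal Python sketch, not part of the original slides: it assumes the expression grammar E -> T E’, E’ -> + T E’ | ε, T -> F T’, T’ -> * F T’ | ε, F -> ( E ) | id, an input that is already split into tokens, and illustrative names (`parse`, `peek`, `match`).

```python
def parse(tokens):
    """Return True if the token list is a sentence of the expression grammar."""
    tokens = list(tokens) + ["$"]   # "$" marks the end of the input
    pos = 0                         # the input pointer

    def peek():
        return tokens[pos]

    def match(terminal):
        # Follow an edge labeled with a terminal: consume it or fail.
        nonlocal pos
        if tokens[pos] != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {tokens[pos]!r}")
        pos += 1

    def E():                        # E -> T E'
        T()
        E_prime()

    def E_prime():                  # E' -> + T E' | ε
        if peek() == "+":
            match("+")
            T()
            E_prime()
        # otherwise take the ε edge and return immediately

    def T():                        # T -> F T'
        F()
        T_prime()

    def T_prime():                  # T' -> * F T' | ε
        if peek() == "*":
            match("*")
            F()
            T_prime()

    def F():                        # F -> ( E ) | id
        if peek() == "(":
            match("(")
            E()
            match(")")
        else:
            match("id")

    try:
        E()                         # begin in the diagram of the start symbol
        match("$")                  # all input must be consumed
        return True
    except SyntaxError:
        return False
```

Calling a function corresponds to jumping into the diagram of a non-terminal, and returning from it corresponds to reaching that diagram's final state. For example, `parse(["id", "+", "id", "*", "id"])` follows the edges for the example string.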
PREDICTIVE PARSING

Recall the main idea of top-down parsing:
• Start at the root, grow towards leaves
• Pick a production and try to match input
• May need to backtrack

Can we avoid the backtracking?

Given A -> α | β, the parser should be able to choose
between α and β.

How?

What if we do some "preprocessing" to answer the
question: Given a non-terminal A and look-ahead t,
which (if any) production of A is guaranteed to start
with a t?
PREDICTIVE PARSING

Armed with
• FIRST
• FOLLOW

We can build a parser where no backtracking is
required!
EXAMPLE GRAMMAR FOR FIRST/FOLLOW
Original grammar:
E -> E + E
E -> E * E
E -> ( E )
E -> id

This grammar is left-recursive, ambiguous and requires left-factoring. It
needs to be modified before we build a predictive parser for it:

Remove ambiguity:
E -> E + T | T
T -> T * F | F
F -> ( E )
F -> id

Remove left recursion:
E -> T E'
E' -> + T E' | ε
T -> F T'
T' -> * F T' | ε
F -> ( E )
F -> id
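
The left-recursion step follows the standard rule: A -> A α | β becomes A -> β A' and A' -> α A' | ε. The following is a minimal Python sketch of that rule for immediate left recursion only. The function name and the list-of-symbols representation (with the empty list standing for ε) are assumptions made for illustration.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε.

    alternatives is a list of right-hand sides, each a list of symbols;
    the empty list [] stands for ε.
    """
    # The α parts: alternatives that start with A, with the leading A removed.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The β parts: alternatives that do not start with A.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]

    if not recursive:
        return {nonterminal: alternatives}   # nothing to do

    new = nonterminal + "'"                  # fresh non-terminal A'
    return {
        nonterminal: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [[]],
    }
```

Applied to E -> E + T | T, that is `eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])`, the result contains E -> T E' and E' -> + T E' | ε.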
COMPUTING FIRST:

Compute FIRST(X) as follows:
• If X is a terminal, then FIRST(X) = {X}
• If X -> ε is a production, then add ε to FIRST(X)
• If X is a non-terminal and X -> Y1 Y2 ... Yn is a
  production, add FIRST(Y1) - {ε} to FIRST(X)
• If X is a non-terminal and X -> Y1 Y2 ... Yn is a
  production, add FIRST(Yi) - {ε} to FIRST(X) if all the
  preceding Yj’s (j < i) contain ε in their FIRSTs; if every
  Yi contains ε, add ε to FIRST(X)

Focus on L.H.S of productions
CS416 Compiler Design
Fall 2003
FIRST EXAMPLE
E -> TE’
E’ -> +TE’ | ε
T -> FT’
T’ -> *FT’ | ε
F -> (E) | id

FIRST(F) = {(, id}
FIRST(T’) = {*, ε}
FIRST(T) = FIRST(F) = {(, id}
FIRST(E’) = {+, ε}
FIRST(E) = FIRST(T) = {(, id}

FIRST(TE’) = {(, id}
FIRST(+TE’) = {+}
FIRST(ε) = {ε}
FIRST(FT’) = {(, id}
FIRST(*FT’) = {*}
FIRST(ε) = {ε}
FIRST((E)) = {(}
FIRST(id) = {id}
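
The FIRST rules can be applied repeatedly until no set changes. The following is a minimal Python sketch of that fixed-point iteration for the example grammar. The dictionary layout, the names `compute_first` and `first_of_string`, and the use of an empty list for ε are illustrative assumptions.

```python
EPS = "ε"

# Each non-terminal maps to its alternatives; [] stands for ε.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST of a sequence Y1 Y2 ... Yn, given FIRST of the non-terminals."""
    result = set()
    for sym in symbols:
        # For a terminal a, FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result          # this Yi cannot vanish, so stop here
    result.add(EPS)                # every Yi can derive ε (or the string is empty)
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                 # repeat until no FIRST set grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first
```

The result of `compute_first(GRAMMAR)` can be compared entry by entry with the FIRST sets listed in the example.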
COMPUTING FOLLOW

Compute FOLLOW as follows:
• FOLLOW(S) contains EOF (or $), where S is the start symbol
• For productions A -> αBβ, everything in FIRST(β)
  except ε goes into FOLLOW(B)
• For productions A -> αB, or A -> αBβ where FIRST(β)
  contains ε, FOLLOW(B) contains everything that is
  in FOLLOW(A)

Focus on R.H.S of productions
FOLLOW EXAMPLE
E -> TE’
E’ -> +TE’ | ε
T -> FT’
T’ -> *FT’ | ε
F -> (E) | id

FOLLOW(E) = { $, ) }
FOLLOW(E’) = FOLLOW(E) = { $, ) }
FOLLOW(T) = (FIRST(E’) - {ε}) ∪ FOLLOW(E’) = { +, ), $ }
FOLLOW(T’) = FOLLOW(T) = { +, ), $ }
FOLLOW(F) = (FIRST(T’) - {ε}) ∪ FOLLOW(T’) = { +, *, ), $ }
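
The FOLLOW rules can also be run as a fixed-point iteration over the right-hand sides. The following is a minimal Python sketch. To keep it self-contained, the FIRST sets of the example grammar are written in by hand; the names `compute_follow` and `first_of_string` and the data layout are illustrative assumptions.

```python
EPS, EOF = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],      # [] stands for ε
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

# FIRST sets of the non-terminals, copied from the FIRST example.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}

def first_of_string(symbols):
    """FIRST of the string β that follows a non-terminal on a right-hand side."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal is its own FIRST
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}                    # β is empty or can derive ε

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)                   # rule 1
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue             # only non-terminals have FOLLOW sets
                    beta_first = first_of_string(rhs[i + 1:])
                    new = beta_first - {EPS}         # rule 2
                    if EPS in beta_first:
                        new |= follow[a]             # rule 3
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow
```

`compute_follow(GRAMMAR, "E")` can be checked against the FOLLOW sets in the example.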
PREDICTIVE PARSING (W/TABLE)

For each production A -> α do:
• For each terminal a ∈ FIRST(α), add A -> α to entry
  M[A,a]
• If ε ∈ FIRST(α), add A -> α to entry M[A,b] for each
  terminal b ∈ FOLLOW(A).
• If ε ∈ FIRST(α) and EOF ∈ FOLLOW(A), add A -> α to
  M[A,EOF]

Use table and stack to simulate recursion.
LL(1) PARSING TABLE
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FOLLOW(E) = FOLLOW(E') = {$, )}
FOLLOW(T) = FOLLOW(T') = {+, $, )}
FOLLOW(F) = {*, +, $, )}

E -> TE’
E’ -> +TE’ | ε
T -> FT’
T’ -> *FT’ | ε
F -> (E) | id

     id           +            *            (            )            $
E    E -> TE’                               E -> TE’
E’                E’ -> +TE’                             E’ -> ε      E’ -> ε
T    T -> FT’                               T -> FT’
T’                T’ -> ε      T’ -> *FT’                T’ -> ε      T’ -> ε
F    F -> id                                F -> (E)
Is this grammar LL(1)? Yes, because each cell holds at most one production!
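
The table-filling rules and the LL(1) check can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the FIRST and FOLLOW sets are written in by hand from the values above, and `build_table`, `is_ll1` and the dictionary-based table are illustrative names, not a fixed API.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],      # [] stands for ε
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

def first_of_string(symbols):
    """FIRST(α) for a right-hand side α."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal is its own FIRST
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}

def build_table(grammar):
    """Map (non-terminal, terminal) to the list of productions placed in that cell."""
    table = {}
    for a, alternatives in grammar.items():
        for rhs in alternatives:
            rhs_first = first_of_string(rhs)
            targets = rhs_first - {EPS}          # a ∈ FIRST(α)
            if EPS in rhs_first:
                targets |= FOLLOW[a]             # b ∈ FOLLOW(A), including $
            for terminal in targets:
                table.setdefault((a, terminal), []).append(rhs)
    return table

def is_ll1(table):
    """The grammar is LL(1) when no cell holds more than one production."""
    return all(len(productions) == 1 for productions in table.values())
```

Because FOLLOW already contains `$` where appropriate, the separate EOF rule is covered by the same loop in this sketch. Cells that never receive a production are simply absent from the dictionary and correspond to error entries.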
LL(1) PARSER – EXAMPLE
Parsing table: see LL(1) PARSING TABLE.

stack        input      output
$E           id+id$     E -> TE’
$E’T         id+id$     T -> FT’
$E’T’F       id+id$     F -> id
$E’T’id      id+id$
$E’T’        +id$       T’ -> ε
$E’          +id$       E’ -> +TE’
$E’T+        +id$
$E’T         id$        T -> FT’
LL(1) PARSER – EXAMPLE (CONT’D)
stack        input      output
$E’T         id$        T -> FT’
$E’T’F       id$        F -> id
$E’T’id      id$
$E’T’        $          T’ -> ε
$E’          $          E’ -> ε
$            $          accept
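
The trace can be reproduced with a small table-driven parser that uses an explicit stack instead of recursion. The following is a minimal Python sketch. The table is copied by hand from the LL(1) parsing table of the example grammar, and the names `TABLE`, `NONTERMINALS` and `ll1_parse` are illustrative assumptions.

```python
# M[A, a] for the example grammar; [] stands for an ε right-hand side.
TABLE = {
    ("E", "id"): ["T", "E'"],       ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],  ("E'", ")"): [],  ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],       ("T", "("): ["F", "T'"],
    ("T'", "+"): [],                ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],                ("T'", "$"): [],
    ("F", "id"): ["id"],            ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order; raise SyntaxError on bad input."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]            # the top of the stack is the end of the list
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:         # empty table cell: syntax error
                raise SyntaxError(f"no rule for ({top}, {lookahead})")
            output.append(f"{top} -> {' '.join(rhs) or 'ε'}")
            stack.extend(reversed(rhs))   # push the right-hand side, leftmost symbol on top
        elif top == lookahead:
            pos += 1                # match a terminal (or the final $) and advance
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    return output
```

For the input `["id", "+", "id"]`, the returned list contains the same productions, in the same order, as the output column of the trace. The parse is accepted when the loop ends without raising.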