Invitation to Computer Science
5th Edition
Chapter 11
Compilers and Language Translation
Objectives
In this chapter, you will learn about:
• The compilation process
– Phase I: Lexical analysis
– Phase II: Parsing
– Phase III: Semantics and code generation
– Phase IV: Code optimization
Introduction
• Compiler
– Translates high-level language into machine
language prior to execution
• Assembly language and machine language are
related one to one
• One to many
– Relationship between a high-level language and
machine language
• Compiler goals
– Correctness; production of efficient and concise code
The Compilation Process
• Four phases of compilation
– Phase I: Lexical analysis
– Phase II: Parsing
– Phase III: Semantic analysis and code generation
– Phase IV: Code optimization
Figure 11.1 General Structure of a Compiler
Figure 11.2 Overall Execution Sequence of a High-level Language Program
Phase I: Lexical Analysis
• Lexical analyzer
– Groups input characters into units called tokens
• Scanner
– Discards nonessential characters, such as blanks
and tabs
– Groups remaining characters into high-level
syntactical units such as symbols, numbers, and
operators
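The scanner's job described above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's scanner: the token class names (NUMBER, SYMBOL, OPERATOR) and the function name `scan` are assumptions made for this example, and Figure 11.3 defines its own classification.

```python
import re

# Hypothetical token classes, chosen only for illustration.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),           # integer or decimal constants
    ("SYMBOL", r"[A-Za-z_][A-Za-z0-9_]*"),  # names such as x, y, z
    ("OPERATOR", r"==|[=+\-*/();]"),        # operators and punctuation
    ("SKIP", r"[ \t\n]+"),                  # blanks and tabs, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Group the characters of source into (class, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":       # nonessential characters are dropped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(scan("x = x + 12"))
```

The output is a flat list of tokens; deciding whether those tokens fit together grammatically is left to the parser in Phase II.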
Figure 11.3 Typical Token Classifications
Phase II: Parsing
• During the parsing phase:
– Compiler determines whether the tokens recognized
by the scanner during phase I fit together in a
grammatically meaningful way
• Parsing
– Process of diagramming a high-level language
statement
– Done by a program called a parser
Grammars, Languages, and BNF
• Backus-Naur Form
– Named after its designers John Backus and Peter
Naur
– Syntax of a language is specified as a set of rules,
also called productions
– Entire collection of rules is called a grammar
– Left-hand side of a BNF rule is the name of a single
grammatical category
– Operator ::= means “is defined as”; the definition that
follows it is also called the right-hand side
Grammars, Languages, and BNF
(continued)
• Backus-Naur Form
– Nonterminal: intermediate grammatical category
used to help explain and organize the language
– Goal symbol: final nonterminal, the one the parser is
ultimately trying to produce
– Language: collection of all statements that can be
successfully parsed
– Metasymbols: <, >, and ::=
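To make the vocabulary concrete, a grammar can be written down as plain data. The sketch below is an assumption-laden illustration: the three rules are a hypothetical grammar for a simplified assignment statement (not a copy of any figure in the text), the names `GRAMMAR`, `GOAL`, `nonterminals`, `terminals`, and `format_rules` are invented here, and the vertical bar used to separate alternatives is a common BNF convention rather than one of the metasymbols listed above.

```python
# Hypothetical grammar: each key is a left-hand side (a nonterminal),
# each value lists the alternative right-hand sides of its rule.
GRAMMAR = {
    "<assignment statement>": [["<variable>", "=", "<expression>"]],
    "<expression>": [["<variable>"], ["<expression>", "+", "<variable>"]],
    "<variable>": [["x"], ["y"], ["z"]],
}
GOAL = "<assignment statement>"  # the goal symbol of this grammar


def nonterminals(grammar):
    """Every grammatical category that appears on a left-hand side."""
    return set(grammar)


def terminals(grammar):
    """Every right-hand-side symbol that has no rule of its own."""
    found = set()
    for alternatives in grammar.values():
        for rhs in alternatives:
            for symbol in rhs:
                if symbol not in grammar:
                    found.add(symbol)
    return found


def format_rules(grammar):
    """Render the rules in BNF notation, one production per line."""
    lines = []
    for lhs, alternatives in grammar.items():
        rhs_text = " | ".join(" ".join(rhs) for rhs in alternatives)
        lines.append(f"{lhs} ::= {rhs_text}")
    return lines


for line in format_rules(GRAMMAR):
    print(line)
```

Symbols without a rule of their own (here =, +, x, y, z) are the actual tokens supplied by the scanner; everything in angle brackets is a nonterminal.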
Parsing Concepts and Techniques
• Parser
– Receives as input the BNF description of a high-level
language and a sequence of tokens recognized by the
scanner
• Look-ahead parsing algorithms
– “Look down the road” a few tokens to see what
would happen if a certain choice is made
• Ambiguous
– Grammar that allows the construction of two or more
distinct parse trees for the same statement
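Ambiguity can be checked mechanically for small cases by counting parse trees. The sketch below assumes two hypothetical expression grammars, since the grammars of the figures are not reproduced in this text: an ambiguous one, `<expression> ::= <variable> | <expression> + <expression>`, and a left-recursive one, `<expression> ::= <variable> | <expression> + <variable>`. The function names are invented for this example.

```python
from functools import lru_cache

VARIABLES = {"x", "y", "z"}  # assumed set of variable tokens


def count_trees_ambiguous(tokens):
    """Count parse trees under <expression> ::= <variable> | <expression> + <expression>."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of distinct trees deriving tokens[lo:hi] from <expression>.
        if hi - lo == 1:
            return 1 if tokens[lo] in VARIABLES else 0
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] == "+":  # try every + as the top-level operator
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))


def count_trees_left_recursive(tokens):
    """Count parse trees under <expression> ::= <variable> | <expression> + <variable>."""
    tokens = tuple(tokens)

    def count(hi):
        # Number of distinct trees deriving tokens[0:hi] from <expression>.
        if hi == 1:
            return 1 if tokens[0] in VARIABLES else 0
        if hi >= 3 and tokens[hi - 2] == "+" and tokens[hi - 1] in VARIABLES:
            return count(hi - 2)  # only the last + can be the top-level operator
        return 0

    return count(len(tokens))


rhs = ["x", "+", "y", "+", "z"]  # right-hand side of x = x + y + z
print(count_trees_ambiguous(rhs), count_trees_left_recursive(rhs))
```

Under the first grammar the right-hand side of x = x + y + z has two trees, one grouping (x + y) + z and one grouping x + (y + z); under the second it has exactly one, which is the property a usable grammar needs.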
Figure 11.4 First Attempt at a Grammar for
a Simplified Assignment Statement
Figure 11.5 Parse Tree Produced by the Parser
Figure 11.6 Second Attempt at a Grammar
for Assignment Statements
Figure 11.7 Two Parse Trees for the Statement x = x + y + z
Figure 11.8 Third Attempt at a Grammar
for Assignment Statements
Figure 11.9 Grammar for a Simplified Version of an
if-else Statement
Figure 11.10 Parse Tree for the Statement if (x == y) x = z; else x = y;
Phase III: Semantics and Code
Generation
• Semantic record
– Data structure that stores information about a
nonterminal
• First part of code generation
– Involves a pass over the parse tree to determine
whether all branches of the tree are semantically
valid
• Code generation
– Compiler must determine how transformation of
grammatical objects can be accomplished in
machine language
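A rough sketch of the code generation step, under stated assumptions: the parse tree is represented as nested tuples, the target is a hypothetical one-register machine with LOAD, ADD, and STORE instructions, and the "semantic record" of each subtree is reduced to the name of the memory cell holding its value. None of these choices is taken from the textbook's figures.

```python
from itertools import count


def generate(assignment):
    """Translate (target, expression) into a list of instructions.

    An expression is either a variable name or a tuple ("+", left, right).
    """
    code = []
    temps = count(1)  # supplies TEMP1, TEMP2, ... for intermediate results

    def gen_expr(node):
        # Returns the cell that holds the value of this subtree
        # (a stand-in for the subtree's semantic record).
        if isinstance(node, str):
            return node
        op, left, right = node
        if op != "+":
            raise ValueError(f"unsupported operator {op!r}")
        left_cell = gen_expr(left)
        right_cell = gen_expr(right)
        temp = f"TEMP{next(temps)}"
        code.append(f"LOAD {left_cell}")
        code.append(f"ADD {right_cell}")
        code.append(f"STORE {temp}")
        return temp

    target, expression = assignment
    result_cell = gen_expr(expression)
    code.append(f"LOAD {result_cell}")
    code.append(f"STORE {target}")
    return code


# x = x + y + z, parsed as (x + y) + z
for instruction in generate(("x", ("+", ("+", "x", "y"), "z"))):
    print(instruction)
```

Translating each subtree in isolation yields eight instructions for this statement, several of which store a value and immediately reload it. That redundancy is what the optimization phase is meant to remove.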
Phase III: Semantics and Code
Generation (continued)
• Optimization
– Compiler polishes and fine-tunes the translation so
that it runs a little faster or occupies a little less
memory
Figure 11.11 Code Generation for the Assignment Statement x = x + y + z
Phase IV: Code Optimization
• Efficiency
– Ability to write highly optimized programs that
contain no wasted microseconds or unnecessary
memory cells
• Goal in compiler design today
– Provide a wide array of compiler tools to simplify the
programmer’s task and increase productivity
• Integrated development environment
– Compiler is embedded within a collection of
supporting software development routines
Phase IV: Code Optimization
(continued)
• Two types of optimization
– Local optimization and global optimization
• Possible local optimizations
– Constant evaluation
– Strength reduction
– Eliminating unnecessary operations
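The three local optimizations listed above can each be sketched as a small rewrite. Everything here is illustrative and assumed: expressions are nested tuples such as ("+", "x", 3), instructions are strings for a hypothetical LOAD/ADD/STORE machine, cells named TEMP... are taken to be compiler-generated, and the function names are invented for this example.

```python
def fold_constants(node):
    """Constant evaluation: compute constant subexpressions at compile time."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "*":
            return left * right
    return (op, left, right)


def reduce_strength(node):
    """Strength reduction: replace 2 * e with the cheaper e + e."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = reduce_strength(left), reduce_strength(right)
    if op == "*" and left == 2:
        return ("+", right, right)
    if op == "*" and right == 2:
        return ("+", left, left)
    return (op, left, right)


def eliminate_redundant_temps(code):
    """Drop a STORE TEMPn / LOAD TEMPn pair when TEMPn is used nowhere else."""
    result = []
    i = 0
    while i < len(code):
        if i + 1 < len(code):
            op1, cell1 = code[i].split()
            op2, cell2 = code[i + 1].split()
            uses = sum(1 for instr in code if instr.split()[1] == cell1)
            if ((op1, op2) == ("STORE", "LOAD") and cell1 == cell2
                    and cell1.startswith("TEMP") and uses == 2):
                i += 2  # the register already holds the value; skip both
                continue
        result.append(code[i])
        i += 1
    return result


unoptimized = ["LOAD x", "ADD y", "STORE TEMP1", "LOAD TEMP1",
               "ADD z", "STORE TEMP2", "LOAD TEMP2", "STORE x"]
print(eliminate_redundant_temps(unoptimized))
```

On the eight-instruction sequence shown, the last function leaves four instructions: LOAD x, ADD y, ADD z, STORE x. All three rewrites look only at a small neighborhood of the code, which is what makes them local rather than global optimizations.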
Figure 11.12 Optimized Code for the
Assignment Statement x = x + y + z
Summary
• Compiler
– Piece of system software that translates high-level
languages into machine language
• Goals of a compiler
– Correctness and the production of efficient and
concise code
• Source program
– High-level language program
Summary (continued)
• Object program
– The machine language translation of the source
program
• Phases of the compilation process
– Phase I: Lexical analysis
– Phase II: Parsing
– Phase III: Semantic analysis and code generation
– Phase IV: Code optimization