Advanced Compiler Techniques
Inter-procedural Analysis
LIU Xianhua
School of EECS, Peking University
Topics
Up to now
Intra-procedural analysis
Dataflow analysis
PRE
Loops
SSA
Just for individual procedures
Today: Inter-procedural analysis
across/between procedures
Modularity is a Virtue
Decomposing programs into procedures aids
in readability and maintainability
Object-oriented languages have pushed this
trend even further
In a good design, procedures should be:
An interface
A black box
The Catch
This inhibits optimization!
The compiler must assume:
Called procedure may use or change any
accessible variable
Procedure’s caller provides arbitrary values as
parameters
Interprocedural optimizations – use the
calling relationships between procedures to
optimize one or both of them
Recall
Function calls can affect our points-to sets
p1 = &x;
p2 = &p1;
...
foo();
Be conservative
– Lose a lot of information
Applications of IPA
Virtual method invocation
Pointer alias analysis
Parallelization
Detection of software errors and vulnerabilities
SQL injection
Buffer overflow analysis & protection
Basic Concepts
Procedure (Function)
Caller/Callee
Call Site
Call Graph
Call Context
Call Strings
Formal Arguments
Actual Arguments
Terminology
Goal
– Avoid making overly conservative assumptions about the effects
of procedures and the state at call sites
int a, e                    // globals
procedure foo(var b, c)     // formal args
  b := c
end
program main
  int d                     // locals
  foo(a, d)                 // call site with actual args
end
In procedure body
formals and/or globals may be aliased (two names refer to
same location)
formals may have constant value
At procedure call
global vars may be modified or used
actual args may be modified or used
Interprocedural Analysis vs.
Interprocedural Optimization
Interprocedural analysis
Gather information across multiple procedures
(typically across the entire program)
Can use this information to improve
intraprocedural analysis and optimization (e.g.,
CSE)
Interprocedural optimizations
Optimizations that involve multiple procedures
e.g., Inlining, procedure cloning, interprocedural
register allocation
Optimizations that use interprocedural analysis
The Call Graph
Represent procedure call relationship
by call graph
G = (V,E,start)
Each procedure is a unique vertex
Call site = edge between caller & callee
(u,v) = call from u to v (u may call v)
Can label with source line
Cycles represent recursion
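A minimal sketch of the call graph described above, assuming call sites are given as hypothetical (caller, callee, source line) triples. The names build_call_graph and recursive_procedures are illustrative, not part of the lecture. A procedure is reported as recursive when it can reach itself, i.e. it lies on a cycle.

```python
def build_call_graph(calls):
    """Build {procedure: [(callee, source_line), ...]} from call-site triples."""
    graph = {}
    for caller, callee, line in calls:
        graph.setdefault(caller, []).append((callee, line))
        graph.setdefault(callee, [])  # every procedure is a vertex, even leaves
    return graph


def recursive_procedures(graph):
    """Return the procedures that lie on a cycle of the call graph."""
    def reachable(start):
        seen = set()
        stack = [callee for callee, _ in graph[start]]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(callee for callee, _ in graph[node])
        return seen

    return {proc for proc in graph if proc in reachable(proc)}


# Hypothetical program: foo and bar call each other, baz is a leaf.
calls = [("main", "foo", 3), ("main", "baz", 4), ("foo", "bar", 8), ("bar", "foo", 12)]
graph = build_call_graph(calls)
print(sorted(recursive_procedures(graph)))  # ['bar', 'foo']
```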
Call Graph
Super Graph
Validity of Interprocedural
Control Flow Paths
Safety, Precision, and Efficiency
of Data Flow Analysis
Data flow analysis uses a static representation of programs to compute summary information along paths
Ensuring Safety. All valid paths must be covered.
(Valid path: a path which represents legal control flow)
Ensuring Precision. Only valid paths should be covered.
(Subject to merging data flow values at shared program points without creating invalid paths)
Ensuring Efficiency. Only relevant valid paths should be covered.
(Relevant path: a path which yields information that affects the summary information)
Flow and Context Sensitivity
Flow sensitive analysis:
Considers intraprocedurally valid paths
Context sensitive analysis:
Considers interprocedurally valid paths
For maximum statically attainable precision, analysis must be both flow and context sensitive.
Context Sensitivity in
Interprocedural Analysis
Example of Context Sensitivity
Staircase Diagrams of
Interprocedurally Valid Paths
“You can descend only as much as you have ascended!”
Every descending step must match a corresponding
ascending step.
Context Sensitivity in
Presence of Recursion
• For a path from u to v, g must be applied exactly the same number of times as f.
• For a prefix of the above path, g can be applied at most as many times as f.
Interprocedural Analysis
Goals
Enable standard optimizations even with
procedure calls
Reduce call overhead for procedures
Enable optimizations not possible for single
procedures
Optimizations
Register allocation
Loop transformations
CSE, etc.
Analysis Sensitivity
Flow-insensitive
What may happen (on at least one path)
Linear-time
Flow-sensitive
Consider control flow (what must happen)
Iterative data-flow: possibly exponential
Context-insensitive
Call treated the same regardless of caller
“Monovariant” analysis
Context-sensitive
Reanalyze callee for each caller
“Polyvariant” analysis
More sensitivity: more accuracy, but more expensive
Path-sensitive vs. path-insensitive
Computes one answer for every execution path
Subsumes flow-sensitivity
Extremely expensive
Increasing Precision in
Data Flow Analysis
actually, only
caller sensitive
Precision of IPA
Flow-insensitive
result not affected by control flow in procedure
Flow-sensitive
result affected by control flow in procedure
Context Sensitivity
Re-analyze callee as if procedure was inlined
a = id(3);                          // returns 3
b = id(4);                          // returns 4
id(x) { return x; }
a = min(3, 4);                      // ints
s = min("aardvark", "vacuum");      // strings
min(x, y) { if (x <= y) return x; else return y; }
Too expensive in space & time
Recursion?
Approximate context sensitivity:
Reanalyze callee for k levels of calling context
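A small sketch of approximate (k-limited) context sensitivity, assuming a calling context is represented as a tuple of call-site labels. The function names and the labels L1, L2, c1, c2 are hypothetical. With k = 0 every call to id is merged into one context, so the argument is not a single constant; with k = 1 the two calls are kept apart. Truncating to the last k sites also bounds the context under recursion.

```python
def push_call(context, call_site, k):
    """Extend a calling context with a call site, keeping only the last k sites."""
    if k == 0:
        return ()  # context-insensitive: all callers share one context
    return (context + (call_site,))[-k:]


def argument_values_per_context(calls, k):
    """calls: list of (call_site, constant_argument) for one callee."""
    values = {}
    for site, arg in calls:
        ctx = push_call((), site, k)
        values.setdefault(ctx, set()).add(arg)
    return values


# The two calls from the slide: a = id(3) at site L1, b = id(4) at site L2.
id_calls = [("L1", 3), ("L2", 4)]
print(argument_values_per_context(id_calls, k=0))  # {(): {3, 4}}
print(argument_values_per_context(id_calls, k=1))  # {('L1',): {3}, ('L2',): {4}}

# Recursion: the context stays bounded no matter how deep the call chain is.
ctx = ()
for site in ["c1", "c2", "c2", "c2"]:
    ctx = push_call(ctx, site, k=2)
print(ctx)  # ('c2', 'c2')
```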
Path Sensitivity
Path-sensitive analysis
– Computes an answer for every path:
– x is 4 at the end of the left path
– x is 5 at the end of the right path
Path-insensitive analysis
– Computes one answer for all paths:
– x is not constant
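The difference can be sketched with a constant-propagation meet, assuming the two path results from the slide (x is 4 on the left path, 5 on the right). The names meet and NOT_CONSTANT are illustrative only.

```python
from functools import reduce

NOT_CONSTANT = "not constant"


def meet(a, b):
    """Constant-propagation meet: equal constants survive, anything else does not."""
    return a if a == b else NOT_CONSTANT


x_at_end_of_path = {"left": 4, "right": 5}

# Path-sensitive: keep one answer per path.
path_sensitive = dict(x_at_end_of_path)

# Path-insensitive: merge the answers of all paths into one.
path_insensitive = reduce(meet, x_at_end_of_path.values())

print(path_sensitive)    # {'left': 4, 'right': 5}
print(path_insensitive)  # not constant
```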
Key Challenges for Interprocedural Analysis
Compilation time, memory
Key problem: scalability to large programs
Dominated by analysis time/memory
Flow-sensitive analysis: bottleneck often memory, not
time
Often limited to fast but imprecise analysis
Multiple calling environments
Different calls to P() have different properties:
Known constants
Aliases
Surrounding execution context (e.g., enclosing loops)
Function pointer arguments
Frequency of the call
Recursion
Brute Force: Full Context-Sensitive
Interprocedural Analysis
Invocation Graph [Emami94]
Use an invocation graph, which distinguishes all
calling chains
Re-analyze callee for all distinct calling paths
Pro: precise
Cons: exponentially expensive, recursion is tricky
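To see why distinguishing all calling chains is exponentially expensive, the sketch below counts invocation-graph nodes for an acyclic call graph. It assumes a hypothetical program in which each procedure p_i calls p_(i+1) from two call sites; count_invocation_nodes is an illustrative name and does not handle recursion.

```python
from functools import lru_cache


def count_invocation_nodes(call_sites, root):
    """call_sites: {caller: [callee, ...]}, one list entry per call site (acyclic)."""
    @lru_cache(maxsize=None)
    def nodes_below(proc):
        # One node for this invocation plus a separate subtree per call site.
        return 1 + sum(nodes_below(callee) for callee in call_sites.get(proc, []))

    return nodes_below(root)


depth = 10
layered = {f"p{i}": [f"p{i + 1}", f"p{i + 1}"] for i in range(depth)}

print(depth + 1)                              # 11 procedures in the call graph
print(count_invocation_nodes(layered, "p0"))  # 2047 nodes in the invocation graph
```

The call graph grows linearly with the number of procedures, while the invocation graph here doubles at every level.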
Middle Ground: Use Call Graph and
Compute Summaries
Goal
Represent procedure call relationships
Definition
If program P consists of n procedures p1, ..., pn,
the static call graph of P is G_P = (N, S, E, r), where
– N = {p1, ..., pn}
– S = {call-site labels}
– E ⊆ N × N × S
– r ∈ N is the start node
Summary Information
Compute summary information for each procedure
Summarize effect of called procedure for callers
Summarize effect of callers for called procedure
Store summaries in database
Use later when optimizing procedures
Pros
+ Concise
+ Can be fast to compute and use
+ Separate compilation practical
Cons
– Imprecise if only have one summary per procedure
Two Types of Information
Track info that flows into procedures
“Propagation problems”, e.g.:
which formals are constant?
which formals are aliased to globals?
Track info that flows out of procedures
“Side effect problems”, e.g.:
which globals defined/used by procedure?
which locals defined/used by procedure?
which actual parameters defined by procedure?
proc(x, y) { . . . }
Propagation Summaries: Examples
MAY-ALIAS
Formals that may be aliased to globals
MUST-ALIAS
Formals definitely aliased to globals
CONSTANT
Formals that are definitely constant
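A deliberately simplified sketch of a CONSTANT summary: a formal is treated as constant when every call site passes the same literal. It assumes arguments are already known literals (or None for unknown) and does not propagate constants through chains of calls; the name constant_formals and the procedures scale and init are hypothetical.

```python
def constant_formals(call_sites):
    """call_sites: list of (callee, args); each arg is a literal or None if unknown.

    Returns {callee: {formal_position: constant}} for formals that receive
    the same constant at every call site.
    """
    merged = {}
    for callee, args in call_sites:
        if callee not in merged:
            merged[callee] = list(args)
        else:
            # Meet over call sites: keep a value only if both sites agree on it.
            merged[callee] = [a if a == b else None
                              for a, b in zip(merged[callee], args)]
    return {callee: {i: v for i, v in enumerate(vals) if v is not None}
            for callee, vals in merged.items()}


sites = [("scale", (2, None)), ("scale", (2, 7)), ("init", (0,)), ("init", (1,))]
print(constant_formals(sites))  # {'scale': {0: 2}, 'init': {}}
```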
Side-Effect Summaries: Examples
MOD
Variables possibly modified (defined) by
procedure call
REF
Variables possibly referenced (used) by
procedure
KILL
Variables that are definitely killed in
procedure
Computing Summaries
Bottom-up (MOD, REF, KILL)
Summarizes call effects
Top-down (MAY-ALIAS)
Summarizes information about caller
Bi-directional (AVAIL, CONSTANT)
Info to/from caller & callee
Side-Effect Summarization
At procedure boundaries:
Translate formal args to actuals at call
site
Compute:
GMOD, GREF = procedure side effects
MOD, REF = effects at call site
Possibly specific to call
Parameter Binding
At procedure boundaries, we need to translate
formal arguments of procedure to actual arguments
of procedure at call site
int a, b
program main
  foo(b)                // MOD(foo) = b      REF(foo) = a,b
end
procedure foo (var c)   // GMOD(foo) = b     GREF(foo) = a,b
  int d
  d := b
  bar(b)                // MOD(bar) = b      REF(bar) = a
end
procedure bar (var d)   // GMOD(bar) = d     GREF(bar) = a
  if (...)
    d := a
end
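The foo/bar example can be reproduced with a small fixed-point computation. This is a sketch under assumptions: procedures are described by a hypothetical dictionary of formals, locals and call sites, all parameters are passed by reference, and the directly modified or referenced variables of each body are given as input. The helper names summarize and at_call_site are illustrative.

```python
def at_call_site(procs, summary, callee, actuals):
    """Translate the callee's summary into caller names (formal -> actual)."""
    binding = dict(zip(procs[callee]["formals"], actuals))
    return {binding.get(var, var) for var in summary[callee]}


def summarize(procs, direct):
    """Iterate to a fixed point so recursive call graphs are handled too."""
    summary = {name: set() for name in procs}
    changed = True
    while changed:
        changed = False
        for name, proc in procs.items():
            effects = set(direct[name])
            for callee, actuals in proc["calls"]:
                effects |= at_call_site(procs, summary, callee, actuals)
            effects -= set(proc["locals"])  # locals are invisible to callers
            if effects != summary[name]:
                summary[name] = effects
                changed = True
    return summary


procs = {
    "main": {"formals": [], "locals": [], "calls": [("foo", ["b"])]},
    "foo": {"formals": ["c"], "locals": ["d"], "calls": [("bar", ["b"])]},
    "bar": {"formals": ["d"], "locals": [], "calls": []},
}
direct_mod = {"main": set(), "foo": {"d"}, "bar": {"d"}}  # d := b, d := a
direct_ref = {"main": set(), "foo": {"b"}, "bar": {"a"}}

gmod = summarize(procs, direct_mod)
gref = summarize(procs, direct_ref)
print(gmod["bar"], gmod["foo"])                     # {'d'} {'b'}
print(at_call_site(procs, gmod, "bar", ["b"]))      # MOD at bar(b): {'b'}
print(sorted(gref["foo"]))                          # ['a', 'b']
```

The sets only grow and the set of variable names is finite, so the loop terminates.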
Constructing Summary Flow Functions
Iteratively
Termination is possible only if all function compositions
and confluences can be reduced to a finite set of functions
An Example of Interprocedural
Liveness Analysis
e ∈ In_Sp but e ∉ In_c1
Interprocedural Validity and
Calling Contexts
“You can descend only as much as you have ascended!”
Every descending step must match a corresponding ascending step.
Calling context is represented by the remaining descending steps.
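The matching rule can be sketched with a stack, assuming a path is given as a hypothetical sequence of ("call", site) and ("return", site) steps. The function calling_context is an illustrative name: it returns the unmatched call sites that remain (the calling context), or None when a return does not match the most recent call.

```python
def calling_context(path):
    """Return the stack of unmatched call sites, or None if the path is invalid."""
    stack = []
    for kind, site in path:
        if kind == "call":
            stack.append(site)          # ascend one step
        elif kind == "return":
            if not stack or stack[-1] != site:
                return None             # descending without a matching ascent
            stack.pop()
    return tuple(stack)


valid = [("call", "c1"), ("call", "c2"), ("return", "c2")]
invalid = [("call", "c1"), ("return", "c2")]

print(calling_context(valid))    # ('c1',)
print(calling_context(invalid))  # None
```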
Available Expressions Analysis Using Call
Strings Approach
int a, b, t;
void p()
{
    if (a == 0)
    {
        a = a - 1;
        p();
        t = a * b;
    }
}
Is a * b available? YES!
Alternatives to IPA: Inlining
Replaces calls to procedures with copies of
their bodies
Converts calls from opaque objects to local
code
Exposes the “effects” of the called procedure
Extends the compilation region
Language support: the inline attribute
But the compiler can decide per call-site, rather
than per procedure
Inlining Decisions
Must be based on
Heuristics, or
Profile information
Considerations
The size of the procedure body (smaller=better)
Number of call sites (1=usually wins)
If call site is in a loop (yes=more optimizations)
Constant-valued parameters
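One way the considerations above might be combined into a heuristic is sketched below. Everything here is an assumption made for illustration: the CallSite fields, the size limit of 40, the doubling for loops and the bonus of 10 per constant argument are hypothetical and not taken from any real compiler.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    callee_size: int        # size of the callee body, e.g. in instructions
    callee_call_sites: int  # number of static call sites of the callee
    in_loop: bool           # is this call inside a loop?
    constant_args: int      # how many actual arguments are constants


def should_inline(site, size_limit=40):
    """Heuristic sketch: all thresholds are illustrative assumptions."""
    if site.callee_call_sites == 1:
        return True                      # single call site: usually wins
    budget = size_limit
    if site.in_loop:
        budget *= 2                      # loops expose more optimizations
    budget += 10 * site.constant_args    # constants enable specialization
    return site.callee_size <= budget


print(should_inline(CallSite(200, 1, False, 0)))  # True
print(should_inline(CallSite(60, 5, False, 0)))   # False
print(should_inline(CallSite(60, 5, True, 0)))    # True
```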
Inlining Policies
The hard question
– How do we decide which calls to inline?
Many possible heuristics
– Only inline small functions
– Let the programmer decide using an inline directive
– Use a code expansion budget [Ayers, et al ’97]
– Use profiling or instrumentation to identify hot
paths—inline along the hot paths [Chang, et al ’92]
– JIT compilers do this
– Use inlining trials for object oriented languages [Dean
& Chambers ’94]
– Keep a database of functions, their parameter
types, and the benefit of inlining
– Keeps track of indirect benefit of inlining
– Effective in an incrementally compiled language
Study on Real Compilers
Cooper, Hall, Torczon (92)
Eight Programs, five compilers, five processors
Eliminated 99% of dynamic calls in 5 of the
programs
Measured speed of original vs. transformed code
What do you expect?
Results on real compilers
What happened?
Input code violated assumptions made by compiler
writers
Longer procedures
More names
Different code shapes
Exacerbated problems that are unimportant on
“normal” code
Imprecise analysis
Algorithms that scale poorly
Tradeoffs between global and local speed
Limitations in the implementations
The compiler writers were surprised!
Inlining: Summary
Pros
+ Exposes context & side effects
+ Simple
Cons
- Code bloat (bad for caches, branch predictor)
- Can’t decide statically for OOPs
- Library source?
- Recursion?
- How do we decide when to inline?
Alternatives to IPA: Cloning
Cloning: customize procedure for certain call sites
Partition call sites to procedure p into equivalence
classes
e.g., {{call3, call1}, {call4}}
Equivalence based on optimization
Constant propagation: partition based on parameter
value
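A sketch of partitioning call sites for cloning under constant propagation, assuming each call site is described by a tuple of argument values with None for a non-constant argument. The labels reuse call1, call3 and call4 from the example above; the argument values and the name partition_call_sites are hypothetical.

```python
from collections import defaultdict


def partition_call_sites(call_sites):
    """Group call sites that pass the same constant arguments.

    call_sites: {label: tuple of arguments}, None meaning "not a constant".
    Each resulting class could share one specialized clone of the procedure.
    """
    classes = defaultdict(list)
    for label, args in call_sites.items():
        classes[args].append(label)
    return [sorted(labels) for labels in classes.values()]


sites_of_p = {"call1": (4, None), "call3": (4, None), "call4": (None, None)}
print(partition_call_sites(sites_of_p))  # [['call1', 'call3'], ['call4']]
```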
Cloning
Pros
+ Compromise between inlining & IPA
+ Less code bloat compared to inlining
+ No problem with recursion
+ Better caller/callee optimization potential (compared to IPA)
Cons
- Some code bloat (compared to IPA)
- May have to do interprocedural analysis anyway
  e.g., interprocedural constant propagation can guide cloning
Summary
Interprocedural analysis
Difficult and expensive
Need source code, recompilation analysis
Trade-offs for precision & speed/space
Better than inlining
Useful for many optimizations
IPA and cloning likely to become more
important
Java: many small procedures
Summary
Most compilers avoid interprocedural analysis
– It’s expensive and complex
– Not beneficial for most classical optimizations
– Separate compilation + interprocedural analysis requires
recompilation analysis [Burke and Torczon’93]
– Can’t analyze library code
When is it useful?
– Pointer analysis
– Constant propagation
– Object oriented class analysis
– Security and error checking
– Program understanding and re-factoring
– Code compaction
– Parallelization
Several of these are “modern” uses of compilers
Trends
Cost of procedures is growing
– More of them and they’re smaller (OO languages)
– Modern machines demand precise information (memory op aliasing)
Cost of inlining is growing
– Code bloat degrades efficacy of many modern structures
– Procedures are being used more extensively
Programs are becoming larger
Cost of interprocedural analysis is shrinking
– Faster machines
– Better methods
Next Time
Homework
Convert program to SSA form
Exercise 12.1.1
Pointer Analysis
Reading: Dragon chapter 12
Mid-term Review