Control Flow Analysis
Mooly Sagiv
http://www.math.tau.ac.il/~sagiv/courses/pa.html
Tel Aviv University
640-6706
Sunday 18-21 Schreiber 8
Monday 10-12 Schreiber 317
Textbook Chapter 3
(Simplified+OO)
Goals
Understand the problem of Control Flow Analysis
– in Functional Languages
– in Object Oriented Languages
– Function Pointers
Learn the Constraint Based Program Analysis Technique
– General
– Usage for Control Flow Analysis
– Algorithms
– Systems
Similarities between Problems & Techniques
Outline
A Motivating Example (OO)
The Control Flow Analysis Problem
A Formal Specification
Set Constraints
Solving Constraints
Adding Dataflow information
Adding Context Information
Back to the Motivating Example
Conclusions
A Motivating Example
class Vehicle Object { int position = 10;
void move(x1 : int) {
position = position + x1 ;}}
class Car extends Vehicle { int passengers;
void await(v : Vehicle) {
if (v.position < position)
then v.move(position - v.position);
else self.move(10); }}
class Truck extends Vehicle {
void move(x2 : int) {
if (x2 < 55) position = position + x2; }}
void main { Car c; Truck t; Vehicle v1;
new c;
new t;
v1 := c;
c.passengers := 2;
c.move(60);
v1.move(70);
c.await(t) ;}
The Control Flow Analysis (CFA) Problem
Given a program in a functional programming language with higher order functions (functions can serve as parameters and return values)
Find out for each function invocation which functions may be applied
Obvious in C without function pointers
Difficult in C++, Java and ML
The Dynamic Dispatch Problem
An ML Example
let f = fn x => x 1 ;
g = fn y => y + 2 ;
h = fn z => z + 3;
in (f g) + (f h)
An ML Example
let f = fn x => /* {g, h} */ x 1 ;
g = fn y => y + 2 ;
h = fn z => z + 3;
in (f g) + (f h)
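The annotation {g, h} above can be observed by running the program. The following is a minimal Python transliteration of the ML example; the `applied` list is a hypothetical logging device added only for illustration and is not part of the original program.

```python
# Python transliteration of the ML example.
# `applied` is a hypothetical log (an assumption of this sketch) that records
# which function reaches the call site `x 1` inside f.
applied = []

def g(y):
    return y + 2

def h(z):
    return z + 3

def f(x):
    applied.append(x.__name__)  # the callee at the call site `x 1`
    return x(1)

result = f(g) + f(h)
print(result, applied)
```

A control flow analysis must predict the set of callees at `x 1` without running the program, so it has to report at least both g and h.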
The Language FUN
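Notations
– $e \in \mathbf{Exp}$ // expressions (or labeled terms)
– $t \in \mathbf{Term}$ // terms (or unlabeled terms)
– $f, x \in \mathbf{Var}$ // variables
– $c \in \mathbf{Const}$ // constants
– $\mathit{op} \in \mathbf{Op}$ // binary operators
– $l \in \mathbf{Lab}$ // labels
Abstract Syntax
– $e ::= t^l$
– $t ::= c \mid x$
  $\mid \mathtt{fn}\ x \Rightarrow e$ // function definition
  $\mid \mathtt{fun}\ f\ x \Rightarrow e$ // recursive function definition
  $\mid e_1\ e_2$ // function application
  $\mid \mathtt{if}\ e_0\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2$
  $\mid \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 \mid e_1\ \mathit{op}\ e_2$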
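One possible encoding of this abstract syntax is sketched below in Python. The class and field names (`Exp`, `Fn`, `param`, `bound`, and so on) and the helper `labels` are assumptions of this sketch, not names taken from the course material.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Exp:              # e ::= t^l  (a term together with its label)
    term: object
    label: int

@dataclass(frozen=True)
class Const:            # c
    value: object

@dataclass(frozen=True)
class Var:              # x
    name: str

@dataclass(frozen=True)
class Fn:               # fn x => e
    param: str
    body: Exp

@dataclass(frozen=True)
class Fun:              # fun f x => e
    name: str
    param: str
    body: Exp

@dataclass(frozen=True)
class App:              # e1 e2
    fn: Exp
    arg: Exp

@dataclass(frozen=True)
class If:               # if e0 then e1 else e2
    cond: Exp
    then: Exp
    orelse: Exp

@dataclass(frozen=True)
class Let:              # let x = e1 in e2
    name: str
    bound: Exp
    body: Exp

@dataclass(frozen=True)
class Op:               # e1 op e2
    op: str
    left: Exp
    right: Exp

def labels(e):
    """Collect every label occurring in the labeled expression e."""
    found = {e.label}
    for f in fields(e.term):
        child = getattr(e.term, f.name)
        if isinstance(child, Exp):
            found |= labels(child)
    return found

# An application of the identity function on x to the identity function on y,
# with labels 1 to 5.
example = Exp(App(Exp(Fn("x", Exp(Var("x"), 1)), 2),
                  Exp(Fn("y", Exp(Var("y"), 3)), 4)), 5)
print(sorted(labels(example)))
```

Frozen dataclasses are hashable, so terms built this way can be stored directly in the sets used as abstract values.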
A Simple Example
$((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
An Example which Loops
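$(\mathtt{let}\ g = (\mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4)^5\ \mathtt{in}\ (g^6\ (\mathtt{fn}\ z \Rightarrow z^7)^8)^9)^{10}$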
The 0-CFA Problem
Compute for every program a pair $(\hat{C}, \hat{\rho})$ where:
– $\hat{C}$ is the abstract cache associating abstract values with labeled program points
– $\hat{\rho}$ is the abstract environment associating abstract values with variables
Formally
– $\hat{v} \in \widehat{\mathbf{Val}} = \mathcal{P}(\mathbf{Term})$ // abstract values
– $\hat{\rho} \in \widehat{\mathbf{Env}} = \mathbf{Var} \to \widehat{\mathbf{Val}}$ // abstract environment
– $\hat{C} \in \widehat{\mathbf{Cache}} = \mathbf{Lab} \to \widehat{\mathbf{Val}}$ // abstract cache
For a function application $(t_1^{l_1}\ t_2^{l_2})^l$, $\hat{C}(l_1)$ determines which functions can be applied
These maps are finite for a given program
No context is considered for parameters
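Possible Solutions for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
Label or variable | Solution 1 | Solution 2
1 | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\{\mathtt{fn}\ y \Rightarrow y^3\}$
2 | $\{\mathtt{fn}\ x \Rightarrow x^1\}$ | $\{\mathtt{fn}\ x \Rightarrow x^1\}$
3 | $\emptyset$ | $\emptyset$
4 | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\{\mathtt{fn}\ y \Rightarrow y^3\}$
5 | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\{\mathtt{fn}\ y \Rightarrow y^3\}$
x | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\emptyset$
y | $\emptyset$ | $\emptyset$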
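$(\mathtt{let}\ g = (\mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4)^5\ \mathtt{in}\ (g^6\ (\mathtt{fn}\ z \Rightarrow z^7)^8)^9)^{10}$
Shorthand
– $\mathit{sf} = \mathtt{fun}\ f\ x \Rightarrow (f^1\ (\mathtt{fn}\ y \Rightarrow y^2)^3)^4$
– $\mathit{id}_y = \mathtt{fn}\ y \Rightarrow y^2$
– $\mathit{id}_z = \mathtt{fn}\ z \Rightarrow z^7$
$\hat{C}(1) = \{\mathit{sf}\}$, $\hat{C}(2) = \emptyset$, $\hat{C}(3) = \{\mathit{id}_y\}$, $\hat{C}(4) = \emptyset$, $\hat{C}(5) = \{\mathit{sf}\}$
$\hat{C}(6) = \{\mathit{sf}\}$, $\hat{C}(7) = \emptyset$, $\hat{C}(8) = \{\mathit{id}_z\}$, $\hat{C}(9) = \emptyset$, $\hat{C}(10) = \emptyset$
$\hat{\rho}(x) = \{\mathit{id}_y, \mathit{id}_z\}$
$\hat{\rho}(z) = \emptyset$
$\hat{\rho}(y) = \emptyset$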
Relationship to Dataflow Analysis
Expressions are side effect free
– no entry/exit
A single environment
Represents information at different points via maps
A single value for all occurrences of a variable
Function applications act similar to assignments
– “Definition” - Function abstraction is created
– “Use” - Function is applied
A Formal Specification of 0-CFA
A Boolean function defines when a solution is acceptable
$(\hat{C}, \hat{\rho}) \models e$ means that $(\hat{C}, \hat{\rho})$ is acceptable for the expression $e$
Defined by structural induction on $e$
Every function is analyzed once
Every acceptable solution is sound (conservative)
Many acceptable solutions
Generate a set of constraints
Obtain the least acceptable solution by solving the constraints
Syntax Directed 0-CFA
(Simple Expressions)
[const] $(\hat{C}, \hat{\rho}) \models c^l$ always
[var] $(\hat{C}, \hat{\rho}) \models x^l$ if $\hat{\rho}(x) \subseteq \hat{C}(l)$
Syntax Directed 0-CFA
Function Abstraction
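[fn] $(\hat{C}, \hat{\rho}) \models (\mathtt{fn}\ x \Rightarrow e)^l$
if:
$(\hat{C}, \hat{\rho}) \models e$
$\{\mathtt{fn}\ x \Rightarrow e\} \subseteq \hat{C}(l)$
[fun] $(\hat{C}, \hat{\rho}) \models (\mathtt{fun}\ f\ x \Rightarrow e)^l$
if:
$(\hat{C}, \hat{\rho}) \models e$
$\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{C}(l)$
$\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{\rho}(f)$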
Syntax Directed 0-CFA
Function Application
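[app] $(\hat{C}, \hat{\rho}) \models (t_1^{l_1}\ t_2^{l_2})^l$
if:
$(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
$(\hat{C}, \hat{\rho}) \models t_2^{l_2}$
for all $(\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \hat{C}(l_1)$:
$\hat{C}(l_2) \subseteq \hat{\rho}(x)$ and $\hat{C}(l_0) \subseteq \hat{C}(l)$
for all $(\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \hat{C}(l_1)$:
$\hat{C}(l_2) \subseteq \hat{\rho}(x)$ and $\hat{C}(l_0) \subseteq \hat{C}(l)$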
Syntax Directed 0-CFA
Other Constructs
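[if] $(\hat{C}, \hat{\rho}) \models (\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l$
if:
$(\hat{C}, \hat{\rho}) \models t_0^{l_0}$
$(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
$(\hat{C}, \hat{\rho}) \models t_2^{l_2}$
$\hat{C}(l_1) \subseteq \hat{C}(l)$
$\hat{C}(l_2) \subseteq \hat{C}(l)$
[let] $(\hat{C}, \hat{\rho}) \models (\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l$
if:
$(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
$(\hat{C}, \hat{\rho}) \models t_2^{l_2}$
$\hat{C}(l_1) \subseteq \hat{\rho}(x)$
$\hat{C}(l_2) \subseteq \hat{C}(l)$
[op] $(\hat{C}, \hat{\rho}) \models (t_1^{l_1}\ \mathit{op}\ t_2^{l_2})^l$
if:
$(\hat{C}, \hat{\rho}) \models t_1^{l_1}$
$(\hat{C}, \hat{\rho}) \models t_2^{l_2}$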
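Possible Solutions for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
Label or variable | Solution 1 | Solution 2
1 | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\{\mathtt{fn}\ y \Rightarrow y^3\}$
2 | $\{\mathtt{fn}\ x \Rightarrow x^1\}$ | $\{\mathtt{fn}\ x \Rightarrow x^1\}$
3 | $\emptyset$ | $\emptyset$
4 | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\{\mathtt{fn}\ y \Rightarrow y^3\}$
5 | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\{\mathtt{fn}\ y \Rightarrow y^3\}$
x | $\{\mathtt{fn}\ y \Rightarrow y^3\}$ | $\emptyset$
y | $\emptyset$ | $\emptyset$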
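The syntax directed rules can be turned into a checker that decides whether a candidate pair is acceptable. The sketch below is one way to do this in Python, under an assumed tuple encoding: a term is a tuple such as `("fn", x, body)` and an expression is a `(term, label)` pair. The function name `acceptable` and the encoding are assumptions of this sketch.

```python
def acceptable(C, rho, e):
    """Check the syntax directed 0-CFA rules for expression e = (term, label)."""
    t, l = e
    kind = t[0]
    if kind == "const":                       # [const]: always acceptable
        return True
    if kind == "var":                         # [var]: rho(x) is a subset of C(l)
        return rho[t[1]] <= C[l]
    if kind == "fn":                          # [fn]: body acceptable, term in C(l)
        return acceptable(C, rho, t[2]) and t in C[l]
    if kind == "fun":                         # [fun]: as [fn], plus term in rho(f)
        return acceptable(C, rho, t[3]) and t in C[l] and t in rho[t[1]]
    if kind == "app":                         # [app]
        e1, e2 = t[1], t[2]
        if not (acceptable(C, rho, e1) and acceptable(C, rho, e2)):
            return False
        for f in C[e1[1]]:                    # every function that may be applied
            x, body = (f[1], f[2]) if f[0] == "fn" else (f[2], f[3])
            if not (C[e2[1]] <= rho[x] and C[body[1]] <= C[l]):
                return False
        return True
    if kind == "if":                          # [if]
        e0, e1, e2 = t[1:]
        return (all(acceptable(C, rho, s) for s in (e0, e1, e2))
                and C[e1[1]] <= C[l] and C[e2[1]] <= C[l])
    if kind == "let":                         # [let]
        x, e1, e2 = t[1:]
        return (acceptable(C, rho, e1) and acceptable(C, rho, e2)
                and C[e1[1]] <= rho[x] and C[e2[1]] <= C[l])
    if kind == "op":                          # [op]
        return acceptable(C, rho, t[2]) and acceptable(C, rho, t[3])
    raise ValueError(kind)

# The example program with labels 1 to 5, and the two candidate solutions.
idy = ("fn", "y", (("var", "y"), 3))
fx = ("fn", "x", (("var", "x"), 1))
prog = (("app", (fx, 2), (idy, 4)), 5)
C = {1: {idy}, 2: {fx}, 3: set(), 4: {idy}, 5: {idy}}
first = acceptable(C, {"x": {idy}, "y": set()}, prog)
second = acceptable(C, {"x": set(), "y": set()}, prog)
print(first, second)
```

Under these rules the first candidate is acceptable, while the second is rejected because the argument at label 4 does not flow into the environment entry for x.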
Set Constraints
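A set of rules of the form:
– $\mathit{lhs} \subseteq \mathit{rhs}$
– $\{t\} \subseteq \mathit{rhs}' \Rightarrow \mathit{lhs} \subseteq \mathit{rhs}$ (conditional constraint)
– $\mathit{lhs}$, $\mathit{rhs}$, $\mathit{rhs}'$ are
» terms
» $\hat{C}(l)$
» $\hat{\rho}(x)$
The least solution $(\hat{C}, \hat{\rho})$ can be found iteratively
– start with empty sets
– add terms when needed
Efficient cubic graph based solution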
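The iterative scheme (start with empty sets, add terms when needed) can be sketched as a naive fixed-point loop. This is not the cubic graph based algorithm, only the simple iteration. The constraint encoding (`"sub"` and `"cond"` tuples, node names such as `"C(1)"` and `"r(x)"`) is an assumption of this sketch.

```python
from collections import defaultdict

def solve(constraints):
    """Naive iteration: start from empty sets, add terms until nothing changes.

    ("sub", lhs, rhs)            encodes  lhs <= rhs
    ("cond", t, guard, lhs, rhs) encodes  {t} <= guard  =>  lhs <= rhs
    lhs is either a node name (a string) or ("term", t) standing for {t}.
    """
    sol = defaultdict(set)

    def value(lhs):
        return {lhs[1]} if isinstance(lhs, tuple) else sol[lhs]

    changed = True
    while changed:
        changed = False
        for c in constraints:
            if c[0] == "cond":
                _, t, guard, lhs, rhs = c
                if t not in sol[guard]:
                    continue              # guard not (yet) satisfied
            else:
                _, lhs, rhs = c
            new = value(lhs) - sol[rhs]
            if new:
                sol[rhs] |= new
                changed = True
    return dict(sol)

# Hand-written constraints for the program that applies (fn x => x^1)^2
# to (fn y => y^3)^4 at label 5; terms are abbreviated as strings.
fx, idy = "fn x => x^1", "fn y => y^3"
constraints = [
    ("sub", ("term", fx), "C(2)"),
    ("sub", "r(x)", "C(1)"),
    ("sub", ("term", idy), "C(4)"),
    ("sub", "r(y)", "C(3)"),
    ("cond", fx, "C(2)", "C(4)", "r(x)"),
    ("cond", fx, "C(2)", "C(1)", "C(5)"),
    ("cond", idy, "C(2)", "C(4)", "r(y)"),
    ("cond", idy, "C(2)", "C(3)", "C(5)"),
]
sol = solve(constraints)
print(sol["C(5)"], sol["r(x)"])
```

Each pass only ever adds terms, and the set of terms is finite for a given program, so the loop terminates with the least solution of the constraints.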
Syntax Directed Constraint Generation (Part I)
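$\mathcal{C}_\star[\![c^l]\!] = \emptyset$
$\mathcal{C}_\star[\![x^l]\!] = \{\hat{\rho}(x) \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = \mathcal{C}_\star[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = \mathcal{C}_\star[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{C}(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{\rho}(f)\}$
$\mathcal{C}_\star[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = \mathtt{fn}\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = \mathtt{fn}\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = \mathtt{fun}\ f\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = \mathtt{fun}\ f\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$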
Syntax Directed Constraint Generation (Part II)
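$\mathcal{C}_\star[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_0^{l_0}]\!] \cup \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{\hat{C}(l_1) \subseteq \hat{C}(l)\}$
$\cup\ \{\hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{\hat{C}(l_1) \subseteq \hat{\rho}(x)\}$
$\cup\ \{\hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(t_1^{l_1}\ \mathit{op}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$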
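Set Constraints for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
Iterative Solution to the Set Constraints for $((\mathtt{fn}\ x \Rightarrow x^1)^2\ (\mathtt{fn}\ y \Rightarrow y^3)^4)^5$
step | constraint | 1 | 2 | 3 | 4 | x | y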
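The constraints for this program and their least solution can be computed mechanically. The sketch below combines a syntax directed constraint generator with a naive iterative solver (not the cubic algorithm). The tuple encoding of terms, the node names `("C", l)` and `("r", x)`, and the function names are assumptions of this sketch; `if` and `op` are omitted to keep it short.

```python
from collections import defaultdict

def functions(e):
    """All fn/fun terms occurring in e = (term, label)."""
    t, _ = e
    found = {t} if t[0] in ("fn", "fun") else set()
    for child in t[1:]:
        if isinstance(child, tuple):          # sub-expressions are (term, label)
            found |= functions(child)
    return found

def gen(e, fns):
    """Generate set constraints for e; fns is the set of program functions."""
    t, l = e
    kind = t[0]
    if kind == "const":
        return []
    if kind == "var":
        return [("sub", ("r", t[1]), ("C", l))]
    if kind == "fn":
        return gen(t[2], fns) + [("sub", ("term", t), ("C", l))]
    if kind == "fun":
        return gen(t[3], fns) + [("sub", ("term", t), ("C", l)),
                                 ("sub", ("term", t), ("r", t[1]))]
    if kind == "app":
        e1, e2 = t[1], t[2]
        cs = gen(e1, fns) + gen(e2, fns)
        for f in fns:                         # one guarded pair per function
            x, body = (f[1], f[2]) if f[0] == "fn" else (f[2], f[3])
            cs.append(("cond", f, ("C", e1[1]), ("C", e2[1]), ("r", x)))
            cs.append(("cond", f, ("C", e1[1]), ("C", body[1]), ("C", l)))
        return cs
    if kind == "let":
        x, e1, e2 = t[1:]
        return gen(e1, fns) + gen(e2, fns) + [("sub", ("C", e1[1]), ("r", x)),
                                              ("sub", ("C", e2[1]), ("C", l))]
    raise NotImplementedError(kind)

def solve(constraints):
    """Start with empty sets and add terms until no constraint is violated."""
    sol = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for c in constraints:
            if c[0] == "cond":
                _, t, guard, lhs, rhs = c
                if t not in sol[guard]:
                    continue
            else:
                _, lhs, rhs = c
            vals = {lhs[1]} if lhs[0] == "term" else sol[lhs]
            if not vals <= sol[rhs]:
                sol[rhs] |= vals
                changed = True
    return sol

idy = ("fn", "y", (("var", "y"), 3))
fx = ("fn", "x", (("var", "x"), 1))
prog = (("app", (fx, 2), (idy, 4)), 5)
names = {fx: "fn x", idy: "fn y"}
sol = solve(gen(prog, functions(prog)))
for key in [("C", 1), ("C", 2), ("C", 3), ("C", 4), ("C", 5), ("r", "x"), ("r", "y")]:
    print(key, sorted(names[t] for t in sol[key]))
```

The order in which constraints are visited affects only the number of passes, not the resulting least solution.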
Adding Data Flow Information
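Dataflow values can affect control flow analysis
Example
$(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow (\mathtt{if}\ (x^1 > 0^2)^3\ \mathtt{then}\ (\mathtt{fn}\ y \Rightarrow y^4)^5\ \mathtt{else}\ (\mathtt{fn}\ z \Rightarrow 5^6)^7)^8)^9\ \mathtt{in}\ ((f^{10}\ 3^{11})^{12}\ 0^{13})^{14})^{15}$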
Adding Data Flow Information
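Add a finite set of “abstract” values per program: $\mathbf{Data}$
Update $\widehat{\mathbf{Val}} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
– $\hat{\rho} \in \widehat{\mathbf{Env}} = \mathbf{Var} \to \widehat{\mathbf{Val}}$ // abstract environment
– $\hat{C} \in \widehat{\mathbf{Cache}} = \mathbf{Lab} \to \widehat{\mathbf{Val}}$ // abstract cache
Generate extra constraints for data
Obtain a more precise solution
A special case of product domain (4.4)
The combination of two analyses may be more precise than both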
Adding Dataflow Information (Sign Analysis)
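Sign analysis
Add a finite set of “abstract” values per program
$\mathbf{Data} = \{\mathrm{P}, \mathrm{N}, \mathrm{TT}, \mathrm{FF}\}$
Update $\widehat{\mathbf{Val}} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
$d_c$ is the abstract value that represents a constant $c$
– $d_3 = \{\mathrm{P}\}$
– $d_{-7} = \{\mathrm{N}\}$
– $d_{\mathit{true}} = \{\mathrm{TT}\}$
– $d_{\mathit{false}} = \{\mathrm{FF}\}$
Every operator is conservatively interpreted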
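A conservative interpretation of operators over these abstract values might look like the following Python sketch. The tables `TIMES` and `LESS` and the function names are assumptions made for illustration. The Data set on the slide has no element for zero, so this sketch rejects the constant 0 rather than guess an abstraction for it.

```python
from itertools import product

def abstract_const(c):
    """d_c: the abstract value representing the constant c."""
    if isinstance(c, bool):              # test bool first: bool is a subclass of int
        return {"TT"} if c else {"FF"}
    if c > 0:
        return {"P"}
    if c < 0:
        return {"N"}
    raise ValueError("Data = {P, N, TT, FF} has no element for zero")

# Abstract multiplication and comparison on single signs (hypothetical tables).
TIMES = {("P", "P"): {"P"}, ("N", "N"): {"P"},
         ("P", "N"): {"N"}, ("N", "P"): {"N"}}
LESS = {("N", "P"): {"TT"}, ("P", "N"): {"FF"},
        ("P", "P"): {"TT", "FF"}, ("N", "N"): {"TT", "FF"}}

def abstract_op(table, left, right):
    """Lift a table on single signs to sets of abstract values.

    Operands that are not signs (for example function terms) contribute nothing.
    """
    result = set()
    for a, b in product(left, right):
        result |= table.get((a, b), set())
    return result

print(abstract_op(TIMES, abstract_const(3), abstract_const(-7)))
print(abstract_op(LESS, {"P", "N"}, abstract_const(3)))
```

Comparing two values of the same sign yields both TT and FF because the signs alone do not determine the result; that loss of precision is what keeps the interpretation conservative.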
Syntax Directed Constraint Generation (Part I)
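$\mathcal{C}_\star[\![c^l]\!] = \{d_c \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![x^l]\!] = \{\hat{\rho}(x) \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = \mathcal{C}_\star[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = \mathcal{C}_\star[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{C}(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \hat{\rho}(f)\}$
$\mathcal{C}_\star[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = \mathtt{fn}\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = \mathtt{fn}\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_2) \subseteq \hat{\rho}(x) \mid t = \mathtt{fun}\ f\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$
$\cup\ \{\{t\} \subseteq \hat{C}(l_1) \Rightarrow \hat{C}(l_0) \subseteq \hat{C}(l) \mid t = \mathtt{fun}\ f\ x \Rightarrow t_0^{l_0} \in \mathbf{Term}_\star\}$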
Syntax Directed Constraint Generation (Part II)
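$\mathcal{C}_\star[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_0^{l_0}]\!] \cup \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{d_{\mathit{true}} \subseteq \hat{C}(l_0) \Rightarrow \hat{C}(l_1) \subseteq \hat{C}(l)\}$
$\cup\ \{d_{\mathit{false}} \subseteq \hat{C}(l_0) \Rightarrow \hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{\hat{C}(l_1) \subseteq \hat{\rho}(x)\}$
$\cup\ \{\hat{C}(l_2) \subseteq \hat{C}(l)\}$
$\mathcal{C}_\star[\![(t_1^{l_1}\ \mathit{op}\ t_2^{l_2})^l]\!] = \mathcal{C}_\star[\![t_1^{l_1}]\!] \cup \mathcal{C}_\star[\![t_2^{l_2}]\!]$
$\cup\ \{\hat{C}(l_1)\ \widehat{\mathit{op}}\ \hat{C}(l_2) \subseteq \hat{C}(l)\}$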
Adding Context Information
The analysis does not distinguish between different occurrences of a variable (monovariant analysis)
Example
$(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow x^1)^2\ \mathtt{in}\ ((f^3\ f^4)^5\ (\mathtt{fn}\ y \Rightarrow y^6)^7)^8)^9$
Source to source transformation can help (but may lead to code explosion)
Example rewritten
let f1 = fn x1 => x1
in let f2 = fn x2 => x2
in (f1 f2) (fn y => y)
Simplified K-CFA
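Records the last k dynamic calls (for some fixed k)
Similar to the call string approach
Remember the context in which an expression is evaluated
$\widehat{\mathbf{Val}}$ is now $\mathcal{P}(\mathbf{Term}) \times \mathbf{Contexts}$
– $\hat{\rho} \in \widehat{\mathbf{Env}} = \mathbf{Var} \times \mathbf{Contexts} \to \widehat{\mathbf{Val}}$
– $\hat{C} \in \widehat{\mathbf{Cache}} = \mathbf{Lab} \times \mathbf{Contexts} \to \widehat{\mathbf{Val}}$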
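Keeping only the last k call labels can be sketched as a small operation on tuples. The function name `extend` and the tuple representation of contexts are assumptions of this sketch.

```python
def extend(context, call_label, k):
    """Append a call label to the context and keep only the k most recent."""
    if k == 0:
        return ()                      # 0-CFA: a single, empty context
    return (context + (call_label,))[-k:]

# With k = 1 a context is just the label of the most recent application.
empty = ()
after_5 = extend(empty, 5, 1)
after_8 = extend(after_5, 8, 1)
print(after_5, after_8)

# Cache and environment entries are then keyed by a (label, context) or
# (variable, context) pair instead of a label or variable alone.
cache = {(1, after_5): "values seen at label 1 when called from label 5",
         (1, after_8): "values seen at label 1 when called from label 8"}
print(len(cache))
```

The special case for k = 0 is needed because the slice `[-0:]` would return the whole tuple instead of the empty one.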
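1-CFA
$(\mathtt{let}\ f = (\mathtt{fn}\ x \Rightarrow x^1)^2\ \mathtt{in}\ ((f^3\ f^4)^5\ (\mathtt{fn}\ y \Rightarrow y^6)^7)^8)^9$
Contexts
– [] - The empty context
– [5] - The application at label 5
– [8] - The application at label 8
Polyvariant Control Flow
$\hat{C}(1, [5]) = \hat{\rho}(x, [5]) = \hat{C}(2, []) = \hat{C}(3, []) = \hat{\rho}(f, []) = (\{\mathtt{fn}\ x \Rightarrow x^1\}, [])$
$\hat{C}(1, [8]) = \hat{\rho}(x, [8]) = \hat{C}(7, []) = \hat{C}(8, []) = \hat{C}(9, []) = (\{\mathtt{fn}\ y \Rightarrow y^6\}, [])$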
The Motivating Example
class Vehicle Object { int position = 10;
void move(x1 : int) {
position = position + x1 ;}}
class Car extends Vehicle { int passengers;
void await(v : Vehicle) {
if (v.position < position)
then v.move(position - v.position);
else self.move(10); }}
class Truck extends Vehicle {
void move(x2 : int) {
if (x2 < 55) position = position + x2; }}
void main { Car c; Truck t; Vehicle v1;
new c;
new t;
v1 := c;
c.passengers := 2;
c.move(60);
v1.move(70);
c.await(t) ;}
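The same set based reasoning applies to the method calls in this program. The Python sketch below tracks which classes may reach each variable and resolves each call by method lookup up the hierarchy. The tables, the names `pt`, `lookup` and `targets`, and the hand-written binding of the parameters of await are assumptions of this sketch; a full analysis would generate conditional constraints for the call c.await(t) instead.

```python
PARENT = {"Vehicle": "Object", "Car": "Vehicle", "Truck": "Vehicle"}
METHODS = {"Vehicle": {"move"}, "Car": {"await"}, "Truck": {"move"}}

def lookup(cls, method):
    """Walk up the hierarchy to the class whose definition of method runs."""
    while cls is not None:
        if method in METHODS.get(cls, ()):
            return cls + "." + method
        cls = PARENT.get(cls)
    return None

# Classes that may reach each variable: `new c` and `new t` seed the sets.
pt = {"c": {"Car"}, "t": {"Truck"}, "v1": set(), "v": set(), "self": set()}
# v1 := c; the call c.await(t) binds parameter v to t and self to c.
subset = [("c", "v1"), ("t", "v"), ("c", "self")]

changed = True
while changed:                          # iterate until no set grows
    changed = False
    for src, dst in subset:
        if not pt[src] <= pt[dst]:
            pt[dst] |= pt[src]
            changed = True

def targets(var, method):
    """Methods that a call var.method(...) may invoke."""
    return {lookup(cls, method) for cls in pt[var]}

for var in ("c", "v1", "v", "self"):
    print(var + ".move ->", targets(var, "move"))
```

In this sketch only the call v.move inside await can reach Truck.move; the calls through c, v1 and self all resolve to Vehicle.move.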
Missing Material
Efficient Cubic Solution to Set Constraints
www.cs.berkeley.edu/Research/Aiken/bane.html
Experimental results for OO
www.cs.washington.edu/research/projects/cecil
Operational Semantics for FUN (3.2.1)
Defining acceptability without structural induction
– More precise treatment of termination (3.2.2)
– Needs Co-Induction (greatest fixed point)
Using general lattices as Dataflow values instead of powersets (3.5.2)
Lower-bounds
– Decidability of JOP
– Polynomiality
Conclusions
Set constraints are quite useful
– A Uniform syntax
– Can even deal with pointers
But semantic foundation is still based on abstract interpretation
Techniques used in functional and imperative (OO) programming are similar
Control and data flow analysis are related