Advanced Compiler Techniques
Data Flow Analysis
LIU Xianhua
School of EECS, Peking University
REVIEW
Introduction to optimization
Control Flow Analysis
Basic knowledge
Basic blocks
Control-flow graphs
Local Optimizations
Peephole optimizations
Outline
Some Basic Ideas
Reaching Definitions
Available Expressions
Live Variables
Levels of Optimizations
Local
inside a basic block
Global (intraprocedural)
Across basic blocks
Whole procedure analysis
Interprocedural
Across procedures
Whole program analysis
Dataflow Analysis
Last lecture:
How to analyze and transform within a
basic block
This lecture:
How to do it for the entire procedure
An Obvious Theorem
boolean x = true;
while (x) {
. . . // no change to x
}
Doesn’t terminate.
Proof: only assignment to x
is at top, so x is always true.
x = true
if x == true
“body”
Formulation: Reaching Definitions
Each place some
variable x is assigned is
a definition.
Ask: for this use of x,
where could x last have
been defined.
In our example:
only at x=true.
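Flow graph: d1: x = true, then the test if x == true, then the loop body d2: a = 10, which branches back to the test.
Both d1 and d2 reach the test, but d1 is the only definition of x among them.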
Clincher
Since at x == true, d1 is the only
definition of x that reaches, it must be
that x is true at that point.
The conditional is not really a conditional
and can be replaced by a jump.
Not Always That Easy
int i = 2;
int j = 3;
while (i != j) {
if (i < j) i += 2;
else j += 2;
}
Not Always That Easy
Build the Flow Graph
d1: i = 2
d2: j = 3
if i != j (reached by d1, d2, d3, d4)
if i < j (reached by d1, d2, d3, d4)
d3: i = i+2 (leaving this block: d2, d3, d4)
d4: j = j+2 (leaving this block: d1, d3, d4)
DFA Is Sufficient Only
In this example, i can be defined in two
places, and j in two places.
No obvious way to discover that i!=j is
always true.
But OK, because reaching definitions is
sufficient to catch most opportunities for
constant propagation (replacement of a variable
by its only possible value).
Be Conservative!
(Code optimization only)
It’s OK to discover a subset of the
opportunities to make some code-improving
transformation.
It’s not OK to think you have an
opportunity that you don’t really have.
Example: Be Conservative
boolean x = true;
while (x) {
. . . *p = false; . . .
}
Is it possible that p points to x?
If so, d2 is another def of x.
d1: x = true
if x == true
d2: *p = false
Possible Resolution
Just as data-flow analysis of “reaching
definitions” can tell what definitions of x
might reach a point, another DFA can
eliminate cases where p definitely does not
point to x.
Example: the only definition of p is
p = &y
and
there is no possibility that y is an alias of x.
Data-Flow Analysis Schema
Data-flow value: an abstraction of the program state, associated with every program point
Domain: the set of possible data-flow values for this application
IN[s] and OUT[s]: the data-flow values before and after each statement s
Data-flow problem: find a solution to a set of constraints on the IN[s]'s and OUT[s]'s, for all statements s. The constraints are
based on the semantics of the statements ("transfer functions")
based on the flow of control.
Data-Flow Equations (1)
A statement/basic block can generate a
definition.
A statement/basic block can either
1. Kill a definition of x if it surely
redefines x.
2. Transmit a definition if it may not
redefine the same variable(s) as that
definition.
Data-Flow Equations (2)
Variables:
1. IN[s] = set of data-flow values just before a statement s.
2. OUT[s] = set of data-flow values just after a statement s.
Extended to basic blocks (for reaching definitions):
1. IN[B] = set of definitions reaching the beginning of block B.
2. OUT[B] = set of definitions reaching the end of B.
Data-Flow Equations (3)
Two kinds of constraints:
Transfer functions:
Forward: OUT[s] = f_s(IN[s])
Backward (reversed): IN[s] = f_s(OUT[s])
Control-flow constraints:
If B consists of statements s_1, s_2, …, s_n, then
IN[s_{i+1}] = OUT[s_i], for i = 1, 2, …, n-1
Between Blocks
Forward analysis (e.g., reaching definitions)
Transfer equations
OUT[B] = f_B(IN[B])
where f_B = f_{s_n} ∘ ⋯ ∘ f_{s_2} ∘ f_{s_1}
Confluence equations
IN[B] = ∪ OUT[P], over all predecessors P of B
Backward analysis (e.g., live variables)
Transfer equations
IN[B] = f_B(OUT[B])
Confluence equations
OUT[B] = ∪ IN[S], over all successors S of B
Confluence Equations
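IN(B) = ∪ OUT(P), over all predecessors P of B
OUT(B) = ∪ IN(S), over all successors S of B
Example: if B has predecessors P1 and P2 with OUT(P1) = {d1, d2} and OUT(P2) = {d2, d3}, then IN(B) = {d1, d2, d3}.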
Transfer Equations
Forward: OUT[B] = f_B(IN[B])
Backward (reversed): IN[B] = f_B(OUT[B])
Generate a definition in the block if its
variable is not definitely rewritten later in
the basic block.
Kill a definition if its variable is definitely
rewritten in the block.
Example: Gen and Kill
An internal definition may be both killed
and generated.
For any block B:
OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B)
Block B:
d1: y = 3
d2: x = y+z
d3: *p = 10
d4: y = 5
IN = {d2(x), d3(y), d3(z), d5(y), d6(y), d7(z)}
Kill includes {d2(x), d3(y), d5(y), d6(y),…}
Gen = {d2(x), d3(x), d3(z),…, d4(y)}
OUT = {d2(x), d3(x), d3(z),…, d4(y), d7(z)}
Iterative Solution to Equations
For an n-block flow graph, there are 2n
equations in 2n unknowns.
Alas, the solution is not unique.
Standard theory assumes a field of
constants; sets are not a field.
Use iterative solution to get the least
fixed-point.
Identifies any def that might reach a
point.
Example: Reaching Definitions
B1: d1: x = 5
B2: if x == 10
B3: d2: x = 15
IN(B1) = {}, OUT(B1) = {d1}
IN(B2) = {d1, d2}, OUT(B2) = {d1, d2}
IN(B3) = {d1, d2}, OUT(B3) = {d2}
Outline
Some Basic Ideas
Reaching Definitions
Available Expressions
Live Variables
Reaching Definitions
Concept of definition and use
a = x+y
is a definition of a
is a use of x and y
A definition reaches a use if
value written by definition
may be read by use
Reaching Definitions
s = 0;
a = 4;
i = 0;
k == 0
b = 1;
b = 2;
i<n
s = s + a*b;
i = i + 1;
return s
Reaching Def. and Const. Propagation
Is a use of a variable a constant?
Check all reaching definitions
If all assign variable to same constant
Then use is in fact a constant
Can replace variable with constant
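The check described above can be sketched in a few lines of Python. This is only an illustration: the `DEFS` table, the definition names, and the `constant_value` helper are hypothetical and not part of the slides.

```python
from typing import Optional

# Hypothetical table: definition id -> (variable, constant value, or None if not a constant).
DEFS = {
    "d_a": ("a", 4),
    "d_b1": ("b", 1),
    "d_b2": ("b", 2),
}


def constant_value(var: str, reaching: set) -> Optional[int]:
    """Return c if every reaching definition of var assigns the same constant c."""
    values = {DEFS[d][1] for d in reaching if DEFS[d][0] == var}
    if len(values) == 1:
        return next(iter(values))  # None if the single definition is not a constant
    return None  # no definition reaches, or the definitions disagree


reaching_at_use = {"d_a", "d_b1", "d_b2"}  # definitions reaching s = s + a*b
print(constant_value("a", reaching_at_use))  # 4
print(constant_value("b", reaching_at_use))  # None
```

Only `a` can be replaced: both definitions of `b` reach the use and they assign different constants.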
Is a Constant in s = s+a*b?
s = 0;
a = 4;
i = 0;
k == 0
b = 1;
b = 2;
i<n
s = s + a*b;
i = i + 1;
return s
Yes! On all reaching definitions, a = 4.
Constant Propagation Transform
Since a = 4 on all reaching definitions, replace a with 4:
s = 0;
a = 4;
i = 0;
k == 0
b = 1;
b = 2;
i<n
s = s + 4*b;
i = i + 1;
return s
Is b Constant in s = s+a*b?
s = 0;
a = 4;
i = 0;
k == 0
b = 1;
b = 2;
i<n
s = s + a*b;
i = i + 1;
return s
No! One reaching definition has b = 1 and another has b = 2.
Splitting
Preserves Information Lost At Merges
Original flow graph: both branches merge into one shared loop.
s = 0;
a = 4;
i = 0;
k == 0
b = 1;
b = 2;
i<n
s = s + a*b;
i = i + 1;
return s
After splitting: each branch gets its own copy of the loop, so only one definition of b reaches each copy.
s = 0;
a = 4;
i = 0;
k == 0
b = 1;
i<n
s = s + a*1;
i = i + 1;
return s
b = 2;
i<n
s = s + a*2;
i = i + 1;
return s
Computing Reaching Definitions
Compute with sets of definitions
represent sets using bit vectors
each definition has a position in bit
vector
At each basic block, compute
definitions that reach start of block
definitions that reach end of block
Do computation by simulating execution of
program until reach fixed point
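As a rough illustration of the bit-vector representation, the sketch below uses Python integers as bit vectors, with definition 1 as the leftmost bit. The vectors are those of the loop body in the running example; the helper names `bits` and `show` are made up for this sketch.

```python
def bits(text: str) -> int:
    """Parse a vector written with definition 1 as the leftmost character."""
    return int(text, 2)


def show(vector: int, width: int = 7) -> str:
    """Format a vector back into the slide notation."""
    return format(vector, f"0{width}b")


IN_body = bits("1111111")
GEN_body = bits("0000011")   # definitions 6 and 7
KILL_body = bits("1010000")  # definitions 1 and 3

# Transfer function: OUT = (IN - KILL) U GEN becomes AND-NOT followed by OR.
OUT_body = (IN_body & ~KILL_body) | GEN_body
# Merging two predecessors (union) is a bitwise OR.
merged = bits("1111000") | bits("1110100")

print(show(OUT_body))  # 0101111
print(show(merged))    # 1111100
```

Set difference and union each cost one machine operation per word, which is why bit vectors are the usual choice here.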
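Example: reaching definitions as bit vectors (bit positions 1234567 correspond to definitions 1 to 7). Where two vectors are listed, the first is the value after the first pass and the second is the value at the fixed point.
1: s = 0; 2: a = 4; 3: i = 0; k == 0
IN = 0000000, OUT = 1110000
4: b = 1;
IN = 1110000, OUT = 1111000
5: b = 2;
IN = 1110000, OUT = 1110100
i<n
IN = 1111100, then 1111111
6: s = s + a*b; 7: i = i + 1;
IN = 1111100, then 1111111; OUT = 0101111
return s
IN = 1111100, then 1111111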
Formalizing Reaching Definitions
Each basic block has
IN - set of definitions that reach beginning of
block
OUT - set of definitions that reach end of block
GEN - set of definitions generated in block
KILL - set of definitions killed in block
GEN[s = s + a*b; i = i + 1;] = 0000011
KILL[s = s + a*b; i = i + 1;] = 1010000
Compiler scans each basic block to derive
GEN and KILL sets
Dataflow Equations
IN[b] = OUT[b1] U ... U OUT[bn]
where b1, ..., bn are predecessors of b in CFG
OUT[b] = (IN[b] - KILL[b]) U GEN[b]
IN[entry] = 0000000
Result: system of equations
For a block B consisting of statements 1, ..., n:
KILL_B = KILL_1 U KILL_2 U … U KILL_n
GEN_B = GEN_n U (GEN_{n-1} - KILL_n) U (GEN_{n-2} - KILL_{n-1} - KILL_n) U … U (GEN_1 - KILL_2 - KILL_3 - … - KILL_n)
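The block-level GEN and KILL formulas can be evaluated with a single forward scan over the statements. The sketch below assumes each statement is given as a hypothetical pair `(gen_i, kill_i)` of Python sets; `compose_block` is an illustrative name.

```python
def compose_block(statements):
    """statements: list of (gen_i, kill_i) sets in program order."""
    gen_b, kill_b = set(), set()
    for gen_i, kill_i in statements:
        # Earlier definitions survive only if this statement does not kill them.
        gen_b = (gen_b - kill_i) | gen_i
        kill_b |= kill_i
    return gen_b, kill_b


# Loop body of the running example: 6: s = s + a*b kills 1; 7: i = i + 1 kills 3.
loop_body = [({6}, {1}), ({7}, {3})]
gen_b, kill_b = compose_block(loop_body)
print(sorted(gen_b), sorted(kill_b))  # [6, 7] [1, 3]

# Two definitions of the same variable in one block: only the last is generated.
twice = [({"d1"}, {"d2"}), ({"d2"}, {"d1"})]
print(compose_block(twice)[0])  # {'d2'}
```

The first result matches GEN = 0000011 and KILL = 1010000 for the loop body.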
Solving Equations
Use fixed point algorithm
Initialize with solution of OUT[b] = 0000000
Repeatedly apply equations
IN[b] = OUT[b1] U ... U OUT[bn]
OUT[b] = (IN[b] - KILL[b]) U GEN[b]
Until reach fixed point
Until equation application has no further effect
Use a worklist to track which equation
applications may have a further effect
Reaching Definitions Algorithm
for all nodes n in N
    OUT[n] = emptyset; // OUT[n] = GEN[n];
IN[Entry] = emptyset;
OUT[Entry] = GEN[Entry];
Changed = N - { Entry }; // N = all nodes in graph
while (Changed != emptyset)
    choose a node n in Changed;
    Changed = Changed - { n };
    IN[n] = emptyset;
    for all nodes p in predecessors(n)
        IN[n] = IN[n] U OUT[p];
    OUT[n] = GEN[n] U (IN[n] - KILL[n]);
    if (OUT[n] changed)
        for all nodes s in successors(n)
            Changed = Changed U { s };
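A runnable sketch of this worklist algorithm in Python follows, applied to the running example. The block names (`entry`, `b1`, `b2`, `head`, `body`, `ret`) and the dictionary-based CFG encoding are assumptions made for the illustration.

```python
from collections import deque


def reaching_definitions(succ, gen, kill, entry):
    """Worklist solver; succ maps each block to its list of successors."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].append(n)
    IN = {n: set() for n in succ}
    OUT = {n: set() for n in succ}
    OUT[entry] = set(gen[entry])
    changed = deque(n for n in succ if n != entry)
    while changed:
        n = changed.popleft()
        IN[n] = set().union(*(OUT[p] for p in pred[n]))
        new_out = gen[n] | (IN[n] - kill[n])
        if new_out != OUT[n]:
            OUT[n] = new_out
            for s in succ[n]:
                if s not in changed:
                    changed.append(s)
    return IN, OUT


# Definitions: 1: s=0, 2: a=4, 3: i=0, 4: b=1, 5: b=2, 6: s=s+a*b, 7: i=i+1
succ = {"entry": ["b1", "b2"], "b1": ["head"], "b2": ["head"],
        "head": ["body", "ret"], "body": ["head"], "ret": []}
gen = {"entry": {1, 2, 3}, "b1": {4}, "b2": {5},
       "head": set(), "body": {6, 7}, "ret": set()}
kill = {"entry": {6, 7}, "b1": {5}, "b2": {4},
        "head": set(), "body": {1, 3}, "ret": set()}

IN, OUT = reaching_definitions(succ, gen, kill, "entry")
print(sorted(IN["head"]))   # [1, 2, 3, 4, 5, 6, 7]
print(sorted(OUT["body"]))  # [2, 4, 5, 6, 7]
```

The results correspond to the bit vectors 1111111 at the loop test and 0101111 at the end of the loop body.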
Iterative Solution
IN(entry) = ∅;
for each block B do OUT(B)= ∅;
while (changes occur) do
for each block B do {
IN(B) = ∪predecessors P of B OUT(P);
OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B);
}
Outline
Some Basic Ideas
Reaching Definitions
Available Expressions
Live Variables
Available Expressions
An expression x+y is available at a point p
if
every path from the initial node to p must
evaluate x+y before reaching p,
and there are no assignments to x or y after
the evaluation but before p.
Available Expression information can be used
to do global (across basic blocks) CSE
If expression is available at use, no need to
reevaluate it
Example: Available Expression
a=b+c
d=e+f
f=a+c
b=a+d
h=c+f
g=a+c
j=a+b+c+d
Is the Expression Available?
a=b+c
d=e+f
f=a+c
b=a+d
h=c+f
g=a+c
j=a+b+c+d
(Figure: seven slides check one expression occurrence each in this flow graph. The answers, in slide order, are YES, YES, NO, NO, NO, YES, YES.)
Use of Available Expressions
a=b+c
d=e+f
f=a+c
b=a+d
h=c+f
g=a+c
j=a+b+c+d
Use of Available Expressions
a=b+c
d=e+f
f=a+c
b=a+d
h=c+f
g=f
j=f+b+d
a+c is available in f at g=a+c, so g=a+c becomes g=f. Reassociating j=a+b+c+d as j=a+c+b+d exposes a+c again, giving j=f+b+d.
Computing Available Expressions
Represent sets of expressions using bit
vectors
Each expression corresponds to a bit
Run dataflow algorithm similar to reaching
definitions
Big difference
definition reaches a basic block if it comes from
ANY predecessor in CFG
expression is available at a basic block only if it
is available from ALL predecessors in CFG
Gen(B) and Kill(B)
An expression x+y is generated if it is
computed in B, and afterwards there is no
possibility that either x or y is redefined.
An expression x+y is killed if it is not
generated in B and either x or y is possibly
redefined.
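These two rules can be applied statement by statement. The sketch below is one possible encoding, assuming statements of the form `target = left op right` stored as hypothetical tuples, and a `universe` holding every expression of interest; it ignores pointers and aliasing.

```python
def block_gen_kill(block, universe):
    """block: list of (target, (left, op, right)); returns (gen, kill) for the block."""
    gen, kill = set(), set()
    for target, expr in block:
        gen.add(expr)  # the expression is computed here...
        uses_target = {e for e in universe | gen if target in (e[0], e[2])}
        gen -= uses_target   # ...but is not available if an operand is then redefined
        kill |= uses_target
    return gen, kill - gen   # an expression regenerated later is not killed


universe = {("x", "+", "y"), ("w", "*", "x"), ("z", "-", "w"),
            ("x", "+", "z"), ("a", "+", "b")}
block = [("x", ("x", "+", "y")),   # x = x+y
         ("z", ("a", "+", "b"))]   # z = a+b
gen, kill = block_gen_kill(block, universe)
print(sorted(gen))   # [('a', '+', 'b')]
print(sorted(kill))
```

Here `x = x+y` computes x+y but immediately redefines x, so x+y ends up killed rather than generated.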
Example
x = x+y: kills x+y, w*x, etc.
z = a+b: generates a+b; kills z-w, x+z, etc.
Example
Expressions
1: x+y
2: i<n
3: i+c
4: x==0
0000
a = x+y;
x == 0
1001
x = z;
b = x+y;
1000
1001
i = x+y;
1000
i<n
1100
c = x+y;
i = i+c;
1100
d = x+y
Example:Global CSE Transform
Expressions
1: x+y
2: i<n
3: i+c
4: x==0
0000
a = x+y;
t=a
x == 0
must use same temp
for CSE in all blocks
1001
x = z;
b = x+y;
t=b
1000
1001
i = x+y;
1000
i<n
1100
c = x+y;
i = i+c;
1100
d = x+y
Example:Global CSE Transform
Expressions
1: x+y
2: i<n
3: i+c
4: x==0
0000
a = x+y;
t=a
x == 0
must use same temp
for CSE in all blocks
1001
x = z;
b = x+y;
t=b
1000
1001
i = t;
1000
i<n
1100
c = t;
i = i+c;
1100
d=t
Formalizing Analysis
Each basic block has
IN - set of expressions available at start of block
OUT - set of expressions available at end of block
GEN - set of expressions computed in block (and
not killed later)
KILL - set of expressions killed in block (and not
re-computed later)
GEN[x = z; b = x+y] = 1000
KILL[x = z; b = x+y] = 0001
Compiler scans each basic block to derive GEN
and KILL sets
Dataflow Equations
IN[b] = OUT[b1] ∩ ... ∩ OUT[bn]
where b1, ..., bn are predecessors of b in CFG
OUT[b] = (IN[b] - KILL[b]) U GEN[b]
IN[entry] = 0000
Result: system of equations
Solving Equations
Use fixed point algorithm
IN[entry] = 0000
Initialize OUT[b] = 1111
Repeatedly apply equations
IN[b] = OUT[b1] ∩ ... ∩ OUT[bn]
OUT[b] = (IN[b] - KILL[b]) U GEN[b]
Use a worklist algorithm to reach fixed point
Available Expressions Algorithm
for all nodes n in N
    OUT[n] = E; // E is set of all expressions
IN[Entry] = emptyset;
OUT[Entry] = GEN[Entry];
Changed = N - { Entry }; // N = all nodes in graph
while (Changed != emptyset)
    choose a node n in Changed;
    Changed = Changed - { n };
    IN[n] = E;
    for all nodes p in predecessors(n)
        IN[n] = IN[n] ∩ OUT[p];
    OUT[n] = GEN[n] U (IN[n] - KILL[n]);
    if (OUT[n] changed)
        for all nodes s in successors(n)
            Changed = Changed U { s };
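The same algorithm as a runnable Python sketch, on a small diamond-shaped CFG modelled on the earlier example (one branch executes `x = z; b = x+y`). The block names and the reduced two-expression universe are assumptions for the illustration.

```python
def available_expressions(succ, gen, kill, entry, universe):
    """Worklist solver; succ maps each block to its list of successors."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].append(n)
    IN = {n: set(universe) for n in succ}
    OUT = {n: set(universe) for n in succ}  # optimistic start: everything available
    IN[entry] = set()
    OUT[entry] = set(gen[entry])
    changed = [n for n in succ if n != entry]
    while changed:
        n = changed.pop()
        IN[n] = set(universe)
        for p in pred[n]:
            IN[n] &= OUT[p]                 # intersection over ALL predecessors
        new_out = gen[n] | (IN[n] - kill[n])
        if new_out != OUT[n]:
            OUT[n] = new_out
            for s in succ[n]:
                if s not in changed:
                    changed.append(s)
    return IN, OUT


E = {"x+y", "x==0"}
succ = {"entry": ["left", "right"], "left": ["join"], "right": ["join"], "join": []}
# entry: a = x+y; x == 0    left: x = z; b = x+y    right and join: no relevant code
gen = {"entry": {"x+y", "x==0"}, "left": {"x+y"}, "right": set(), "join": set()}
kill = {"entry": set(), "left": {"x==0"}, "right": set(), "join": set()}

IN, OUT = available_expressions(succ, gen, kill, "entry", E)
print(sorted(IN["join"]))  # ['x+y']
```

x+y survives the merge because the left branch recomputes it after redefining x, while x==0 does not.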
Transfer Equations
Transfer is the same idea:
OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B)
Confluence Equations
Confluence involves intersection, because an
expression is available coming into a block if
and only if it is available coming out of each
predecessor.
IN(B) = ∩ OUT(P), over all predecessors P of B
Iterative Solution
IN(entry) = ∅;
for each block B do OUT(B)= ALL;
while (changes occur) do
for each block B do {
IN(B) = ∩predecessors P of B OUT(P);
OUT(B) = (IN(B) – Kill(B)) ∪ Gen(B);
}
Why It Works
An expression x+y is unavailable at point
p iff there is a path from the entry to
p that either:
1. Never evaluates x+y, or
2. Kills x+y after its last evaluation.
IN(entry) = ∅ takes care of (1).
OUT(B) = ALL, plus intersection during
iteration handles (2).
Example
(Figure: paths from Entry to point p. On one path x+y is killed; on another, passing through two blocks, x+y is never gen'd.)
Duality In Two Algorithms
Reaching definitions
Confluence operation is set union
OUT[b] initialized to empty set
Available expressions
Confluence operation is set intersection
OUT[b] initialized to set of all expressions
General framework for dataflow algorithms.
Build parameterized dataflow analyzer once,
use for all dataflow problems
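One way such a parameterized analyzer might look is sketched below in Python. The `solve` function and its parameters (`boundary`, `init`, `meet`, `transfer`, `forward`) are illustrative names, not an interface from the slides, and a simple round-robin loop is used instead of a worklist.

```python
from functools import reduce


def solve(succ, start, boundary, init, meet, transfer, forward=True):
    """Generic round-robin solver. Returns (before, after) in analysis direction:
    (IN, OUT) for a forward analysis, (OUT, IN) for a backward one."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].append(n)
    inputs = pred if forward else succ  # blocks whose results flow into n
    before = {n: set(init) for n in succ}
    after = {n: set(init) for n in succ}
    before[start] = set(boundary)
    after[start] = transfer(start, before[start])
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == start:
                continue
            sources = [after[m] for m in inputs[n]]
            before[n] = reduce(meet, sources) if sources else set(init)
            new_after = transfer(n, before[n])
            if new_after != after[n]:
                after[n], changed = new_after, True
    return before, after


def gen_kill(gen, kill):
    """Build the usual transfer function OUT = GEN U (IN - KILL)."""
    return lambda n, value: gen[n] | (value - kill[n])


succ = {"entry": ["left", "right"], "left": ["join"], "right": ["join"], "join": []}
none = {n: set() for n in succ}

# Reaching definitions: union, start from the empty set.
rd_gen = {**none, "entry": {"d1"}, "left": {"d2", "d3"}}
rd_in, _ = solve(succ, "entry", set(), set(), set.union, gen_kill(rd_gen, none))

# Available expressions: intersection, start from the set of all expressions.
E = {"x+y", "x==0"}
ae_gen = {**none, "entry": set(E), "left": {"x+y"}}
ae_kill = {**none, "left": {"x==0"}}
ae_in, _ = solve(succ, "entry", set(), E, set.intersection, gen_kill(ae_gen, ae_kill))

print(sorted(rd_in["join"]))  # ['d1', 'd2', 'd3']
print(sorted(ae_in["join"]))  # ['x+y']
```

Only the meet operator and the initial value differ between the two calls; passing `forward=False` with a use/def transfer function would give live variables.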
Outline
Some Basic Ideas
Reaching Definitions
Available Expressions
Live Variables
Live Variable Analysis
A variable x is live at point p if
x is used along some path starting at p, and
no definition of x along the path before the use.
When is a variable x dead at point p?
No use of x on any path from p to exit node,
or
If all paths from p redefine x before using x.
What Use is Liveness Information?
Register allocation.
If a variable is dead, can reassign its register
If x is not live on exit from a block, there is no
need to copy x from a register to memory.
Dead code elimination.
Eliminate assignments to variables not read later.
But must not eliminate last assignment to
variable (such as instance variable) visible outside
CFG.
Can eliminate other dead assignments.
Handle by making all externally visible variables
live on exit from CFG
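Within a single block, dead assignments can be found with one backward scan once the live-out set is known. The sketch below assumes statements are hypothetical `(target, operands)` pairs, that a branch test is encoded with target `None`, and that assignments have no side effects.

```python
def eliminate_dead_assignments(block, live_out):
    """block: list of (target, operands); returns the statements that are kept."""
    live = set(live_out)
    kept = []
    for target, operands in reversed(block):
        if target is None or target in live:
            kept.append((target, operands))
            live.discard(target)   # the definition ends the old value's live range
            live.update(operands)  # operands are live just before the statement
        # otherwise the assigned value is never read, so the statement is dropped
    kept.reverse()
    return kept


# a = x+y; t = a; c = a+x; x == 0   with a, b, y, z, t live after the block
block = [("a", ("x", "y")), ("t", ("a",)), ("c", ("a", "x")), (None, ("x",))]
kept = eliminate_dead_assignments(block, {"a", "b", "y", "z", "t"})
print([target for target, _ in kept])  # ['a', 't', None]
```

The assignment to c is removed because c is not live at the end of the block. Making externally visible variables live on exit is what protects their last assignments.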
Conceptual Idea of Analysis
Simulate execution
But start from exit and go backwards in
CFG
Compute liveness information from end to
beginning of basic blocks
Liveness Example
Assume a,b,c visible outside method, so they are live on exit
Assume x,y,z,t not visible
Represent liveness using a bit vector; order is abcxyzt
Block 1: a = x+y; t = a; c = a+x; x == 0
IN = 0101110, OUT = 1100111
Block 2: b = t+z;
IN = 1000111, OUT = 1100100
Block 3: c = y+1;
IN = 1100100, OUT = 1110000
Dead Code Elimination
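Assume a,b,c visible outside method, so they are live on exit
Assume x,y,z,t not visible
Represent liveness using a bit vector; order is abcxyzt
Block 1: a = x+y; t = a; c = a+x; x == 0
IN = 0101110, OUT = 1100111
Block 2: b = t+z;
IN = 1000111, OUT = 1100100
Block 3: c = y+1;
IN = 1100100, OUT = 1110000
c is not live at the end of block 1 (its bit is 0 in OUT = 1100111), so the assignment c = a+x is dead and can be eliminated.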
Formalizing Analysis
Each basic block has
IN - set of variables live at start of block
OUT - set of variables live at end of block
USE - set of variables with upwards exposed uses in
block (use prior to definition)
DEF - set of variables defined in block prior to use
USE[x = z; x = x+1;] = { z } (x not in USE)
DEF[x = z; x = x+1; y = 1;] = {x, y}
Compiler scans each basic block to derive USE and DEF sets
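The scan that derives USE and DEF can be sketched as follows, assuming statements are hypothetical `(target, operands)` pairs. The `assigned` set is a helper introduced here to track every variable written so far.

```python
def use_def(block):
    """block: list of (target, operands) in program order; returns (USE, DEF)."""
    use, defs, assigned = set(), set(), set()
    for target, operands in block:
        # A use is upwards exposed only if no earlier statement assigned the variable.
        use |= {v for v in operands if v not in assigned}
        # DEF holds variables defined before any use in the block.
        if target not in use:
            defs.add(target)
        assigned.add(target)
    return use, defs


block = [("x", ("z",)), ("x", ("x",)), ("y", ())]  # x = z; x = x+1; y = 1
use, defs = use_def(block)
print(sorted(use), sorted(defs))  # ['z'] ['x', 'y']

print(use_def([("x", ("x",))]))  # x = x+1 alone: ({'x'}, set())
```

In the second call x is read before it is written, so it lands in USE and not in DEF.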
Algorithm
for all nodes n in N - { Exit }
    IN[n] = emptyset;
OUT[Exit] = emptyset;
IN[Exit] = use[Exit];
Changed = N - { Exit };
while (Changed != emptyset)
    choose a node n in Changed;
    Changed = Changed - { n };
    OUT[n] = emptyset;
    for all nodes s in successors(n)
        OUT[n] = OUT[n] U IN[s];
    IN[n] = use[n] U (OUT[n] - def[n]);
    if (IN[n] changed)
        for all nodes p in predecessors(n)
            Changed = Changed U { p };
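A runnable Python sketch of this backward worklist algorithm follows, applied to the liveness example with bit-vector order abcxyzt. The block names `B1`, `B2`, `B3`, `EXIT` and the edges between them are assumptions chosen to match the vectors in the example; the externally visible variables a, b, c are modelled as uses in the exit node.

```python
def live_variables(succ, use, define, exit_node):
    """Backward worklist solver; succ maps each block to its list of successors."""
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for s in targets:
            pred[s].append(n)
    IN = {n: set() for n in succ}
    OUT = {n: set() for n in succ}
    IN[exit_node] = set(use[exit_node])
    changed = [n for n in succ if n != exit_node]
    while changed:
        n = changed.pop()
        OUT[n] = set().union(*(IN[s] for s in succ[n]))
        new_in = use[n] | (OUT[n] - define[n])
        if new_in != IN[n]:
            IN[n] = new_in
            for p in pred[n]:
                if p not in changed:
                    changed.append(p)
    return IN, OUT


def vector(live, order="abcxyzt"):
    """Render a set of variables in the bit-vector notation of the example."""
    return "".join("1" if v in live else "0" for v in order)


succ = {"B1": ["B2", "B3"], "B2": ["B3"], "B3": ["EXIT"], "EXIT": []}
use = {"B1": {"x", "y"}, "B2": {"t", "z"}, "B3": {"y"}, "EXIT": {"a", "b", "c"}}
define = {"B1": {"a", "t", "c"}, "B2": {"b"}, "B3": {"c"}, "EXIT": set()}

IN, OUT = live_variables(succ, use, define, "EXIT")
print(vector(IN["B1"]), vector(OUT["B1"]))  # 0101110 1100111
print(vector(IN["B2"]), vector(OUT["B2"]))  # 1000111 1100100
```

Information flows against the edges, so a change to IN[n] re-queues the predecessors of n rather than its successors.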
Transfer Equations
Transfer equations give IN’s in terms of
OUT’s:
IN(B) = (OUT(B) – Def(B)) ∪ Use(B)
Confluence Equations
Confluence involves union over successors, so
a variable is in OUT(B) if it is live on entry
to any of B’s successors.
OUT(B) = ∪ IN(S), over all successors S of B
Iterative Solution
OUT(exit) = ∅;
for each block B do IN(B)= ∅;
while (changes occur) do
for each block B do {
OUT(B) = ∪successors S of B IN(S);
IN(B) = (OUT(B) – Def(B)) ∪ Use(B);
}
Similar to Other Dataflow Algorithms
Backward analysis, not forward
Still have transfer functions
Still have confluence operators
Can generalize framework to work for both
forwards and backwards analyses
Comparison
All three algorithms share the same worklist loop: while (Changed != emptyset), choose a node n in Changed, remove it from Changed, then apply the confluence and transfer steps below.
Reaching Definitions (forward)
Initialize: OUT[n] = emptyset for all n; IN[Entry] = emptyset; OUT[Entry] = GEN[Entry]; Changed = N - { Entry }
Confluence: IN[n] = union of OUT[p] over all predecessors p of n
Transfer: OUT[n] = GEN[n] U (IN[n] - KILL[n])
If OUT[n] changed: add all successors of n to Changed
Available Expressions (forward)
Initialize: OUT[n] = E for all n; IN[Entry] = emptyset; OUT[Entry] = GEN[Entry]; Changed = N - { Entry }
Confluence: IN[n] = intersection of OUT[p] over all predecessors p of n
Transfer: OUT[n] = GEN[n] U (IN[n] - KILL[n])
If OUT[n] changed: add all successors of n to Changed
Live Variables (backward)
Initialize: IN[n] = emptyset for all n; OUT[Exit] = emptyset; IN[Exit] = use[Exit]; Changed = N - { Exit }
Confluence: OUT[n] = union of IN[s] over all successors s of n
Transfer: IN[n] = use[n] U (OUT[n] - def[n])
If IN[n] changed: add all predecessors of n to Changed
Pessimistic vs. Optimistic Analysis
Available expressions is optimistic
(for common sub-expression elimination)
Assume expressions are available at start of
analysis
Analysis eliminates all that are not available
Cannot stop analysis early and use current result
Live variables is pessimistic (for dead code
elimination)
Assume all variables are live at start of analysis
Analysis finds variables that are dead
Can stop analysis early and use current result
Dataflow setup same for both analyses
Optimism/pessimism depends on intended use
Summary
Dataflow Analysis
Control flow graph
IN[b], OUT[b], transfer functions, join points
Paired analyses and transformations
Reaching definitions/constant propagation
Available expressions/common sub-expression
elimination
Live-variable analysis/Reg Alloc & Dead code
elimination
Next Time
Projects
CFG Construction, DFA & Optimization
Due Nov. 20
Data Flow Analysis: Foundation
DragonBook: §9.3