S1: l := new Cons p := l S2: t

Recap: pointer analysis
• Aliases represented by points-to graph
– set of edges: {p->x, p->y, x->z}
x
p
y
• Pointer/alias analysis is important for
subsequent optimizations:
– need to disambiguate memory references:
• Example: a = *x
• if x must point to z, then we can replace with a = z
z
Recap: flow-sensitivity
• Iterative dataflow analysis (with lattice,
monotone flow functions, worklist)
• Each program point has its own fact
• Can do strong updates (delete edges):
• for statements of the form x = y
– delete all edges from x
• for statements of the form *x = y
– if x may-point-to only one location
» then delete all edges from *x
» otherwise, delete no edges
Recap: flow-sensitivity
without allocation-site summaries
S1: l := new Cons
l
p := l
V1
p
l
V1
l
p
t
V1
V2
S2: t := new Cons
l
*p := t
l
p := t
l
p
t
V1
V2
p
t
V1
V2
p
t
V1
V2
p
l
V1
t
V2
p
l
l
V1
V1
V3
t
V2
V3
p
t
V2
V3
weak update
p -> {V1,V2}
...
Recap: flow-sensitivity
with allocation-site summaries
S1: l := new Cons
l
p := l
S1
p
l
S1
l
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
l
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
S2: t := new Cons
l
*p := t
l
p := t
l
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
l
l
l
l
l
l
Recap: flow-insensitivity
• Flow-sensitive analysis is expensive
– propagating many large points-to graphs
– potentially many iterations until convergence
Flow-insensitivity: one points-to graph for the
whole program
Recap: flow-insensitivity
S1: l := new Cons
l
p := l
S1
p
l
S1
l
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
l
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
S2: t := new Cons
l
*p := t
l
p := t
l
p
t
S1
S2
p
t
S1
S2
p
t
S1
S2
l
l
l
l
l
l
Recap: flow-insensitivity
More imprecise due to lack of strong updates
(all our kill sets are empty!)
Flow insensitive loss of precision
S1: l := new Cons
Flow-sensitive Soln
p := l
p
l
S2: t := new Cons
S1
t
S2
p
l
*p := t
S1
t
S2
p
l
S1
p
t
S2
p := t
p
l
S1
Flow-insensitive Soln
t
S2
l
S1
t
S2
Flow insensitive loss of precision
• Flow insensitive analysis leads to loss of
precision!
main() {
x := &y;
...
Flow insensitive analysis tells us that x
may point to z here!
x := &z;
}
• However:
– uses less memory (memory can be a big bottleneck
to running on large programs)
– runs faster
Worst case complexity of Andersen
x
a
b
y
c
d
e
x
*x = y
f
a
b
y
c
d
e
Worst case: N2 per
statement, so at least N3
for the whole program.
Andersen is in
fact O(N3)
f
New idea: one successor per node
• Make each node have only one successor.
• This is an invariant that we want to maintain.
x
y
a,b,c
d,e,f
*x = y
x
y
a,b,c
d,e,f
More general case for *x = y
x
y
*x = y
More general case for *x = y
x
y
x
*x = y
y
x
y
Handling: x = *y
x
y
x = *y
Handling: x = *y
x
y
x
x = *y
y
x
y
Handling: x = y (what about y = x?)
x
y
x = y
Handling: x = &y
x
y
x = &y
Handling: x = y (what about y = x?)
x
y
x
y
x
y
x = y
get the same
for y = x
Handling: x = &y
x
y
x
y
x
x = &y
y,…
Our favorite example, once more!
S1: l := new Cons
p := l
1
2
S2: t := new Cons 3
*p := t
4
p := t
5
Our favorite example, once more!
S1: l := new Cons
l
1
1
l
2
p
S1
S1
3
p := l
2
l
p
t
l
p
t
4
S2: t := new Cons 3
S1
S2
S1
S2
5
*p := t
4
l
p := t
5
p
S1
t
S2
l
p
S1,S2
t
Flow insensitive loss of precision
S1: l := new Cons
Flow-sensitive
Subset-based
p := l
p
l
S2: t := new Cons
S1
t
S2
p
l
S1
Flow-insensitive
Subset-based
t
p
S2
l
*p := t
p
l
S1
t
S2
p := t
p
l
S1
Flow-insensitive
Unificationbased
t
S2
S1
t
l
p
S2
S1,S2
t
Another example
bar() {
1 i := &a;
2 j := &b;
3 foo(&i);
4 foo(&j);
// i pnts to what?
*i := ...;
}
void foo(int* p) {
printf(“%d”,*p);
}
Another example
p
bar() {
1 i := &a;
2 j := &b;
3 foo(&i);
4 foo(&j);
// i pnts to what?
*i := ...;
1
i
2
a
i
j
a
b
3
i
j
a
b
4
}
p
void foo(int* p) {
printf(“%d”,*p);
}
i
j
i,j
a
b
a,b
p
Almost linear time
• Time complexity: O(Nα(N, N))
inverse Ackermann
function
• So slow-growing, it is basically linear in practice
• For the curious: node merging implemented
using UNION-FIND structure, which allows set
union with amortized cost of O(α(N, N)) per op.
Take CSE 202 to learn more!
In Class Exercise!
S1: p := new Cons
S2: q := new Cons
*p = q
r = &q
*q = r
*q = p
s = r
s = p
*r = s
In Class Exercise! solved
S1: p := new Cons
q,S1,s2
S2: q := new Cons
Steensgaard
p
*p = q
r
r = &q
*q = r
*q = p
s = r
s = p
*r = s
p
S1
q
S2
r
s
Andersen
s
Advanced Pointer Analysis
• Combine flow-sensitive/flow-insensitive
• Clever data-structure design
• Context-sensitivity