Is it a Tree, a DAG, or a Cyclic Graph?
A Shape Analysis for Heap-Directed Pointers in C*

Rakesh Ghiya and Laurie J. Hendren
School of Computer Science, McGill University
Montréal, Québec, CANADA H3A 2A7
{ghiya,hendren}@cs.mcgill.ca

Abstract

This paper reports on the design and implementation of a practical shape analysis for C. The purpose of the analysis is to aid in the disambiguation of heap-allocated data structures by estimating the shape (Tree, DAG, or Cyclic Graph) of the data structure accessible from each heap-directed pointer. This shape information can be used to improve dependence testing and in parallelization, and to guide the choice of more complex heap analyses. The method has been implemented as a context-sensitive interprocedural analysis in the McCAT compiler. Experimental results and observations are given for 16 benchmark programs. These results show that the analysis gives accurate and useful results for an important group of applications.

*This work was supported by NSERC, FCAR, and the EPPP project (financed by Industry Canada, Alex Parallel Computers, Digital Equipment Canada, IBM Canada and the Centre de recherche informatique de Montréal).

Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee.
POPL '96, St. Petersburg FLA USA
© 1996 ACM 0-89791-769-3/95/01..$3.50

1 Introduction and Related Work

Pointer analyses are of critical importance for optimizing/parallelizing compilers that support languages like C, C++ and FORTRAN90. The pointer analysis problem can be divided into two distinct subproblems: (i) analyzing pointers that point to statically-allocated objects (typically on the stack), and (ii) analyzing pointers that point to dynamically-allocated objects (typically in the heap). Pointers to stack objects are usually obtained using the address-of (&a) operator, while pointers to heap objects are obtained using a memory allocating function like malloc(). Henceforth we will refer to these analyses respectively as stack analysis and heap analysis.

A considerable amount of work has been done in both of these areas. Initially, the focus was on heap analysis alone, for languages like Lisp and Scheme or for toy imperative languages that did not include all of the complexities of C [3, 4, 5, 12, 17, 18, 19, 20, 22, 23, 27, 30]. A recent trend has been to actually implement pointer analyses in real C and FORTRAN90 compilers, to examine if practical and useful solutions can be obtained. The most recently proposed (and implemented) approaches [1, 6, 26, 29, 32, 33] mostly focus on the stack problem and only give conservative estimates for the heap problem.

These approaches exploit the fact that pointer targets on the stack always possess a compile-time name. Using this property, stack pointer relationships are accurately captured as points-to pairs [6] of the form (p, x) (denoting that pointer variable p points to the data object x), or alternatively as alias pairs [26] of the form (*p, x) (denoting that *p and x are aliased).

Unfortunately, heap objects do not have any fixed name during their lifetime, as they are dynamically allocated and are inherently anonymous. Hence various schemes are used to name them: naming them according to the place (statement) in the program where they are allocated [3, 26, 29], or further qualifying these names with procedure strings to distinguish between objects allocated at the same statement but along different calling chains [1, 33]. These naming schemes can give the same name to completely unrelated heap objects, and hence tend to provide conservative results, and they cannot compute shape information.

Instead of adapting an abstraction actually designed for stack analysis to heap analysis (by naming heap objects), our approach is to decouple the two analyses and provide heap analyses that approximate relationships and attributes of heap-directed pointers.
In our McCAT compiler we first perform a stack analysis called points-to analysis [6, 8] that resolves pointer relationships on the stack. It uses one abstract location called heap for all heap locations and reports all heap-directed pointers to be pointing to it. Depending on the characteristics of the program under analysis, we can then apply the appropriate heap analysis which gives more precise information about the relationships between heap-directed pointers. For programs with few uses of the heap, the level-0 or points-to analysis is enough. For programs that use a number of dynamically-allocated arrays and/or non-recursive structures, the level-1 or connection analysis, which identifies if two heap-directed pointers point to the same structure, is used [10, 11]. Scientific applications written in C typically exhibit this feature, as they use a number of disjoint dynamically-allocated arrays. The level-2 analysis, shape analysis, is designed primarily for programs that use recursive data structures, or a combination of arrays and recursive data structures.

This paper focuses on shape analysis. The goal of shape analysis is to estimate the shape of the data structure accessible from a given heap-directed pointer: is it tree-like, DAG-like, or a general graph containing cycles? More specifically, our focus is on identifying unaliased and acyclic data structures built by the program, and on providing conservative estimates otherwise. Shape information can be gainfully exploited to parallelize such programs [12, 17, 22, 24], or to apply optimizing transformations like loop unrolling [15] and software pipelining [16] on them.

Much of the previous work on shape estimation has focused on more complex abstractions [3, 5, 17, 19, 21, 23, 28, 30]. Rather than start with a complex abstraction, our approach is to use simple abstractions, implement them for real C programs, and examine the results. We believe that the main contribution of this paper is the design and implementation of a practical method that has been fully implemented in an optimizing/parallelizing C compiler, together with empirical results, collected for a set of representative benchmark programs, that indicate the usefulness of these abstractions for an important class of C programs.

The rest of the paper is organized as follows. In Section 2 we give a high-level overview of our analysis rules, assuming a simple model where stack-directed and heap-directed pointers are clearly separated. In Section 3 we give a brief overview of the implementation of shape analysis as a context-sensitive interprocedural analysis in the McCAT compiler, and discuss some pertinent features. Empirical results are given in Section 4, to evaluate the cost and effectiveness of our analysis on real C programs. Conclusions and some directions for future work are given in Section 5.

2 Analysis Rules

For each heap-directed pointer, we approximate the shape attribute of the data structure accessible from it, and for each pair of heap-directed pointers we approximate the direction and interference relationships between them. These three abstractions are computed together at each program point, and are defined formally as follows.

Definition 2.1 Given any two heap-directed pointers p and q, direction matrix D captures the following relationships between them:

  D[p,q] = 1 : An access path possibly exists in the heap, from the heap object pointed to by pointer p, to the heap object pointed to by pointer q. In this case, we simply state that pointer p has a path to pointer q.
  D[p,q] = 0 : No access path exists from the heap object pointed to by p, to the heap object pointed to by q.

Definition 2.2 Given any two heap-directed pointers p and q, interference matrix I captures the following relationships between them:

  I[p,q] = 1 : A common heap object can be possibly accessed, starting from pointers p and q. In this case, we state that pointers p and q can interfere.
  I[p,q] = 0 : No common heap object can be accessed, starting from pointers p and q. In this case, we state that pointers p and q do not interfere.

Direction relationships are used to actually estimate the shape attributes, while interference relationships are used for safely calculating direction relationships.

Definition 2.3 Given a heap-directed pointer p, the shape attribute p.shape is Tree, if in the data structure accessible from p there is a unique (possibly empty) access path between any two nodes (heap objects) belonging to it. It is considered to be DAG (directed acyclic graph), if there can be more than one path between any two nodes in this data structure, but there is no path from a node to itself (i.e., it is acyclic). If this data structure contains a node having a path to itself, p.shape is considered to be Cycle. Note that, as lists are a special case of tree data structures, their shape is also considered as Tree.
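To make Definition 2.3 concrete, the following is a minimal C sketch that classifies an actual, fully built heap structure as Tree, DAG or Cycle by depth-first traversal. It is not the static analysis of this paper, which never inspects a concrete heap. The names `struct node`, `mark`, `visit` and `concrete_shape` are hypothetical, each node is assumed to have exactly two pointer fields f and g, and all `mark` fields are assumed to be zero before the call.

```c
#include <stddef.h>

typedef enum { TREE, DAG, CYCLE } shape_t;   /* ordered: TREE < DAG < CYCLE */

struct node {
    struct node *f, *g;  /* two pointer fields (assumption of this sketch) */
    int mark;            /* 0 = unvisited, 1 = on the DFS stack, 2 = finished */
};

static shape_t max_shape(shape_t a, shape_t b) { return a > b ? a : b; }

/* Depth-first traversal of the structure accessible from n. */
static shape_t visit(struct node *n)
{
    struct node *kids[2];
    shape_t s = TREE;
    kids[0] = n->f;
    kids[1] = n->g;
    n->mark = 1;
    for (int i = 0; i < 2; i++) {
        struct node *k = kids[i];
        if (k == NULL)
            continue;
        if (k->mark == 1)
            s = CYCLE;                     /* k has a path back to itself */
        else if (k->mark == 2)
            s = max_shape(s, DAG);         /* a second path reaches k */
        else
            s = max_shape(s, visit(k));
    }
    n->mark = 2;
    return s;
}

/* Shape of the structure accessible from p, per Definition 2.3. */
shape_t concrete_shape(struct node *p)
{
    return p ? visit(p) : TREE;
}
```

A list or binary tree yields TREE, a node reachable through two different paths yields DAG, and any back edge yields CYCLE. The static analysis has to approximate this answer conservatively at every program point.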
2.1 Illustrative Examples

Figure 1: Example Direction and Interference Matrices. (a) Heap Structure (pointers p, q, r, s, t, u); (b) Direction Matrix; (c) Interference Matrix.
We illustrate the direction and interference abstractions in Figure 1. Part (a) shows a heap structure at a program point, while parts (b) and (c) show the direction and interference matrices for it. In Figure 1, an access path exists from pointer s to t, and from pointer s to u, so the entries D[s,t] and D[s,u] are set to one. No access path exists from pointer r to t, or from pointer u to s, so the entries D[r,t] and D[u,s] are set to zero. Further, starting from both u and s the heap object pointed to by r can be accessed. To indicate this, the interference matrix entries I[s,u] and I[u,s] are set to one.

This example also illustrates that: (i) direction relationships are not symmetric, (ii) interference relationships are symmetric, and (iii) the interference matrix is a superset of the direction matrix. The third property follows from the fact that if an access path exists from pointer p to q, then the heap object pointed to by q can be accessed from both p and q. The second property is used to reduce the storage requirement by half in the actual implementation.

We now demonstrate how direction relationships help estimate the shape of heap data structures. In Figure 2, initially we have both p.shape and q.shape as Tree. Further, D[q,p] is one, as there exists a path from q to p through the next link. The statement p->prev = q sets up a path from p to q through the prev link. From the direction matrix information we already know that a path exists from q to p, and now a path is being set from p to q. Thus, after the statement, D[p,q] = 1, D[q,p] = 1, and we can deduce the creation of a cycle between the heap objects pointed to by p and q: p.shape = Cycle and q.shape = Cycle.

It should be noted that the shape attribute of a pointer only abstracts the shape of the data structure accessible from it. For example, in Figure 3, the overall shape of the data structure pointed to by p and q is DAG. However, if only the part of the data structure accessible from p or q is considered, its shape is Tree. So we have both p.shape and q.shape as Tree.

Knowledge about the shape of the data structure accessible from a heap-directed pointer provides crucial information for disambiguating heap accesses originating from it. For a pointer p, if p.shape is Tree, then any two accesses of the form p->f and p->g will always lead to disjoint subpieces of the tree structure (assuming f and g are distinct fields). If p.shape is DAG, then two distinct field accesses p->f and p->g can lead to a common heap object.
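The direction and interference matrices of Definitions 2.1 and 2.2 can be computed exactly for one concrete heap snapshot, which may help when checking small examples by hand. The sketch below assumes a hypothetical encoding: heap objects are numbered 0..n-1, `edge[i][j]` records a link from object i to object j, and `target[p]` gives the object that pointer p points to. `MAXN`, `closure` and `build_matrices` are illustrative names, not part of the analysis described here.

```c
#include <stdbool.h>

#define MAXN 8   /* hypothetical bound on heap objects and on pointers */

/* reach[i][j] becomes true if object j is reachable from object i
 * (every object reaches itself through the empty path). */
static void closure(int n, bool edge[MAXN][MAXN], bool reach[MAXN][MAXN])
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            reach[i][j] = (i == j) || edge[i][j];
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (reach[i][k] && reach[k][j])
                    reach[i][j] = true;
}

/* Fill D and I for np pointers, pointer p pointing to object target[p]. */
void build_matrices(int n, bool edge[MAXN][MAXN],
                    int np, const int target[],
                    int D[MAXN][MAXN], int I[MAXN][MAXN])
{
    bool reach[MAXN][MAXN];
    closure(n, edge, reach);
    for (int p = 0; p < np; p++)
        for (int q = 0; q < np; q++) {
            /* Definition 2.1: a path from p's object to q's object. */
            D[p][q] = reach[target[p]][target[q]];
            /* Definition 2.2: some object accessible from both. */
            I[p][q] = 0;
            for (int x = 0; x < n; x++)
                if (reach[target[p]][x] && reach[target[q]][x])
                    I[p][q] = 1;
        }
}
```

On any such snapshot the three properties noted above can be observed directly: D need not be symmetric, I always is, and D[p][q] = 1 implies I[p][q] = 1.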
Figure 2: Example Demonstrating Shape Estimation (effect of the statement p->prev = q on a structure linked through next).
However, if a dag-like data structure is traversed using a sequence of links, every subsequence visits a distinct node, as in Figure 4. This information can be used to disambiguate heap accesses in different iterations of a loop, or in different recursive calls, traversing such a data structure. Finally, if p.shape is Cycle, we have effectively no information to disambiguate the heap accesses originating from p. Thus, the goal of our analysis is to identify tree-like and dag-like data structures, and to retain this information as long as possible.

Figure 3: Estimating Shape using Accessibility Criterion

Figure 4: Acyclicity of Dag Data Structures

2.2 Analysis of Basic Statements

The McCAT compiler translates input C programs into a structured and compositional intermediate form called SIMPLE [31]. Using this form, there are eight basic statements that can access or modify heap data structures, as listed in Figure 5(a). Variables p and q and the field f are of pointer type, variable k is of integer type, and op denotes the + and - operations. The overall structure of the analysis is shown in Figure 5(b).

Let H be the set of pointers abstracted. Given the direction and interference matrices D and I at program point x, before the given statement, we compute the matrices D_n and I_n at program point y, after the statement. Additionally, we have the attribute matrix A, where for a pointer p, A[p] gives its shape attribute. The attribute matrix after the statement is represented as A_n.

For each statement we compute the set of direction and interference relationships it kills and generates. Using these sets, the new matrices D_n and I_n are computed as shown in Figure 5(c). Note that elements in the gen and kill sets are denoted as D(p,q) for direction relationships, and I(p,q) for interference relationships. Thus a gen set of the form { D(x,y), D(y,z) } indicates that the corresponding entries in the output direction matrix (D_n[x,y], D_n[y,z]) should be set to one. Further, we assume that updating an interference matrix entry I[p,q] implies identically updating the entry I[q,p], due to the symmetric nature of interference relationships.

Another attribute matrix A_c is used to store the changed attribute of pointers whose shape attribute can be modified by the given statement. The set of pointers belonging to this group is denoted H_s. The attribute matrix A_n is then computed using A and A_c as shown in Figure 5(c).

The actual analysis rules can be divided into three groups: (1) allocations, (2) pointer assignments, and (3) structure updates. In the following three subsections we discuss these rules.

2.2.1 Allocations

p = malloc() : Pointer p now points to a newly allocated heap object. This statement can only change the relationships and attribute of pointer p, and all existing relationships of p get killed. Since the newly allocated object has no incoming or outgoing links, p has an empty path to itself and interferes only with itself, and its shape attribute is Tree.
Figure 5: (a) Basic Statements; (b) The Overall Structure of the Analysis; (c) General Form of Analysis Rules.

(a) Basic Statements

  Allocations
    1. p = malloc();
  Pointer Assignments
    2. p = q;
    3. p = q->f;
    4. p = &(q->f);
    5. p = q op k;
    6. p = NULL;
  Structure Updates
    7. p->f = q;
    8. p->f = NULL;

(c) General Form of Analysis Rules

  Build the new matrices:
    ∀ r,s ∈ H, D_n[r,s] = D[r,s], I_n[r,s] = I[r,s]
    ∀ s ∈ H, A_n[s] = A[s]
  Delete killed relationships:
    ∀ entries D(r,s) ∈ D_kill_set, D_n[r,s] = 0
    ∀ entries I(r,s) ∈ I_kill_set, I_n[r,s] = 0
  Add generated relationships:
    ∀ entries D(r,s) ∈ D_gen_set, D_n[r,s] = 1
    ∀ entries I(r,s) ∈ I_gen_set, I_n[r,s] = 1
  Compute H_s and A_c, and update shape attributes of affected pointers:
    ∀ s ∈ H_s, A_n[s] = A_c[s]

For the statement p = malloc(), we have the following rule:

  D_kill_set = { D(p,s) | s ∈ H ∧ D[p,s] } ∪ { D(s,p) | s ∈ H ∧ D[s,p] }
  I_kill_set = { I(p,s) | s ∈ H ∧ I[p,s] }
  D_gen_set  = { D(p,p) }
  I_gen_set  = { I(p,p) }
  H_s = { p }
  A_c[p] = Tree

2.2.2 Pointer Assignments

The next five basic heap statements (p = q, p = q->f, p = &(q->f), p = q op k, p = NULL) are pointer assignments. They can only change the relationships and attribute of the stack-resident pointer p. So the kill set for all these statements is the same as that for the statement p = malloc(), and in each case we have H_s = { p }. For these statements, we present the rules to calculate the gen set and the attribute A_c[p].

p = q : Pointer p now points to the same heap object as q. It simply inherits the relationships and shape attribute of q. So we have the following rule:

  D_gen_set_from = { D(s,p) | s ∈ H ∧ s ≠ p ∧ D[s,q] }
  D_gen_set_to   = { D(p,s) | s ∈ H ∧ s ≠ p ∧ D[q,s] }
  D_gen_set = D_gen_set_from ∪ D_gen_set_to ∪ { D(p,p) | D[q,q] }
  I_gen_set = { I(p,s) | s ∈ H ∧ s ≠ p ∧ I[q,s] } ∪ { I(p,p) | I[q,q] }
  H_s = { p }
  A_c[p] = A[q]

Note that the entries D[q,q] and I[q,q] being set to one imply that q presently points to a heap object (q does not point to NULL). Only in that case are D(p,p) and I(p,p) generated, since p would also point to a heap object after the statement.
Be-
K
to
object,
of our
a specific
analysis
field
to be pointing
or
we
consider
at
a specific
to the
object
a pointer
offset
itself.
of a
With
this
assumption,
q op
analysis
and
The
and
points
using
p = NULL
not
generate
to NULL
after
relevant
to it.
p = &(q->f
to the statement
are analyzed
statement
does
not
the statements
k are equivalent
the
the
kills
any
p
same
all
new
) ancl p =
attribute
The
is
from
it is set as Tree.
overall
and
link
as shown
in
Figure
6.1
(i)
new
because
of pointer
p having
Relationships
After
from
v and
statement
a path
terfere
of the
6).
from:
6),
second
a number
t
that
due
that
For
example
relationships
path
to r and
from/to,
in
I-gell-set
Figure
p, besides
can
I.
It
is because
the
=
pointers
q has
data
structure
name
data
direction
path
matrix,
to,
to via
to
but
to.
In
identify
link.
a
all
have
link
the
the
aii
the
the
direction
}
a path
f.
to pointers
Note
that
q has
to
q after
structure
the
accessible
or Dag),
is Tree,
path
to q.
Thus,
the
summarized
have
hence
also
statement.
from
p cannot
A[q]
to itself,
p should
q has
is possible
assume
q has
s, where
bacli
reported
(i.e.
and
But
of
fo
direction
gives
belonging
to
accessible
structure
accessible
shape
attribute
of q,
AC:
A[q]
direction
the
this
the
Dag
after
tree-like
and
tree
the
from
to
data
at-
acyclic
p. Note
of p from
we may
separately
proves
dag-like
the
statement,
Thus,
attribute
An [p] is a
lose
attribute
the
of
is Tree.
information
preserve
shape
attribute.
shape
the
accessible
the
shape
link
say that
still
structure
the
f
from
we do not
we
deduce
Structure
be
ab-
critical
in
structures.
Updates
can
updates
= q. These
ments
data
be having
relatioushil]s
the
analysis
a
ity
= Tree
to q. Iu Figure
to
is Dag
data
or
while
q via
conservatively
relationships
Tree
Structure
to the
if
is Cycle,
from
is Tree,
if it
of the
2.2.3
6,
in
can
gen
tical
be
types
6
The
choice
conservative
of statements.
three
shape
discuss
the
estimation
analysis
made
gen and
accurate
in
using
kill
this
a combination
in many
important
statements
kill
complex
sets for
able
two
state-
connectiv-
abstractions
we are still
of the
such
an overly
simple
and
for shape
and
is to get
have
However,
abstractions,
shape
using
our
= NULL
because
goal
we
use
p->f
statements
the
without
is to
overly
curate
low.
stack.
The
form
programs
structures.
technique
the
“nasty”
change
information
with
of
imperative
abstraction.
a
are
are the
drastically
of heap
and
our
the
only
It
to
as:
representing
q.
reshape
p
a path
1In this Figure, for sake of clarit y we have simply labelecf each
node with the stack-resident pointer that points to it, imteacl of
explicitly
the
structure
matrix
detect
must
if we simply
live
set
and
its
A[q]
if A[q]
identifying
to have
A[q]
that
accessible
we cannot
stracting
the path
so according
be reported
Ho\vever,
a path
p is not,
data
=
spurious
objects
p the
attribute
available,
lose
a
p is reported
q is acyclic
} U
affect
from
the data
of the
following
structure
p->f
a path
assumption
not
heap
to assign
However,
its
the
q has a path
pointers
1, r and
Since
p is a subpiece
the
several
does
of the
structure.
property
to
~Fronl
pointers
pointers
statement,
following
}
accessible
one
q, it is safe
the
q has paths
So we conservatively
to
6, after
paths
the
to
from
It
has
p to s is spurious.
above
path
find
a path
Figure
be having
from
we can
cannot
a specific
be having
to via
statement
from
Cycle.
p will
a path
I I[q,q]
AC[p]
I s E H A s # p A I[s,q]
: Pointer
Relationships
To
pointers
the
relationships:
introducing
a new
that
all
So we have
\ s 6 H A s # p A I[q,s]
of the
giving
as follows.
{ D(s,p)
statement,
all the
== { I(p,s)
potentially
tribute,
D.gen_set.from
the
with
interference
this
spuri-
as q also
q, the set of ~rom
be stated
the
be generated.
As all pointers
with
with.
possibilities,
generate
D(s, p),
in-
interference
two
can
unioning
generated
lationships,
this
a path
which
we abstract
to these
and
q (u,
also
q has
6).
6, we will
D(r,p)
a path
to
potentially
pointers
relationships
also interfere
relationships
(ii)
Figure
Note
of spurious
ous
in
possibility
have
to which
and
: After
interfere
interferes
set of newly
respect,
a path
p can
pointers
q (pointer
relationships.
with
p will
have
Further,
(i)
Relationships
{ I(p,p)
= q->f,
by
other point-
relationships
presently
1 in Figure
with
from/to
}
is obtained
p can potentially
Despite
p
that
Figure
(pointer
following
relationships
:
pointers
q in
have
to
the
all
generates
a path
ers, and (ii) new interference
to pointer p.
From
It
direction
D-gen-set
Interference
q presently
of relationships:
I D[q,q]
A
= Cycle}
to sets.
pointer
f,
As+p
I A[q]
p
: This statement
makes pointer
p point
to
P = Iq->f
the heap object
accessible
from pointer
q through
the
types
\ s 6 H A s #q
} U { D(p,q)
U { D(p,p)
of p
Since
shape
case,
{ D(p,s)
rule.
relationships.
As a default
=
D[q,s]
relationships
statement,
D-gen-set-to
= q for shape
to perform
pracand
these
of
ac-
cases.
We
in detail
be-
‘Q ‘P
‘Q ‘P
Figure
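The rule for p = q->f given above can be sketched in C in the same style as the other pointer assignments. The names `state_t`, `kill_all` and `rule_load` are hypothetical, the sketch assumes p and q are distinct pointers, and the gen sets are evaluated on a copy of the input state. It follows the rule as stated in this section and should be read as an illustration only.

```c
#define NP 3   /* hypothetical number of abstracted pointers (the set H) */

typedef enum { TREE, DAG, CYCLE } shape_t;

typedef struct {
    int D[NP][NP];    /* direction matrix */
    int I[NP][NP];    /* interference matrix, kept symmetric */
    shape_t A[NP];    /* shape attribute per pointer */
} state_t;

/* Kill every relationship that involves p. */
static void kill_all(state_t *s, int p)
{
    for (int x = 0; x < NP; x++) {
        s->D[p][x] = s->D[x][p] = 0;
        s->I[p][x] = s->I[x][p] = 0;
    }
}

/* p = q->f, assuming p != q. */
void rule_load(state_t *s, int p, int q)
{
    state_t in = *s;                 /* gen sets use the input matrices */
    kill_all(s, p);
    for (int x = 0; x < NP; x++) {
        if (x == p)
            continue;
        /* Pointers interfering with q may have a path to p. */
        if (in.I[x][q])
            s->D[x][p] = 1;
        /* p may have a path to whatever q has a path to, except q. */
        if (x != q && in.D[q][x])
            s->D[p][x] = 1;
        /* p may interfere with whatever q interferes with. */
        if (in.I[q][x])
            s->I[p][x] = s->I[x][p] = 1;
    }
    if (in.A[q] == CYCLE)
        s->D[p][q] = 1;              /* a path back to q needs a cycle */
    if (in.D[q][q])
        s->D[p][p] = 1;
    if (in.I[q][q])
        s->I[p][p] = 1;
    s->A[p] = in.A[q];               /* safe, possibly imprecise */
}
```

With q pointing to a tree, the result records a path from q to p but none from p to q; once A[q] is Cycle the reverse path is reported as well.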
p->f = NULL : This statement breaks the link f emanating from the heap object pointed to by p. After the statement, p should no longer have paths to the pointers it presently has paths to exclusively via the link f. However, this cannot be detected from the information available in the direction and interference matrices, so no relationships are killed. Further, the statement does not generate any new relationships. The shape attribute of a pointer can also change: a dag-like or cyclic structure may become tree-like on breaking a link. We cannot detect such situations due to the lack of precise information, so we err conservatively and leave the attributes unchanged. (Footnote 2: For example, when swapping the children of a tree, the data structure temporarily becomes a Dag and then becomes a tree again; our analysis would continue to report its shape as Dag.)

p->f = q : This statement first breaks the link f, and then resets it, thereby linking the heap object pointed to by p to the heap object pointed to by q, as shown in Figure 7. As already discussed, no relationships can be killed on breaking the link f with the information available.

Figure 7: Analyzing Basic Heap Statement p->f = q

In generating new relationships, we have the following: all pointers having a path to p (including p itself) will now have paths to all pointers q has paths to (including q itself). In Figure 7, pointers u, v and p will have paths to pointers q, r, s and t after the statement. Further, all pointers having a path to p will now interfere with all the pointers q interferes with. We summarize the set of new relationships as follows:

  D_gen_set = { D(r,s) | r,s ∈ H ∧ D[r,p] ∧ D[q,s] }
  I_gen_set = { I(r,s) | r,s ∈ H ∧ D[r,p] ∧ I[q,s] }

The statement can considerably affect the shape attributes of several pointers, depending on the current direction and interference relationships between p and q. Figure 8 illustrates the following three possibilities.

Figure 8: Direction Relationships Impacting Shape Attribute

q already has a path to p (D[q,p] = 1) : After the statement, p has a path to q via the link f and q has a path back to p, so a cycle is created. The shape attribute of all pointers having a path to p or q (including p and q themselves) becomes Cycle. We summarize this case as follows:

  D[q,p] ⇒ H_s = { s | s ∈ H ∧ (D[s,q] ∨ D[s,p]) }
           ∀ s ∈ H_s, A_c[s] = Cycle

q does not have a path to p, and A[q] = Tree : In this case, if the data structures pointed to by p and q are initially completely disjoint, the statement simply connects two structures and does not affect the shape attribute of any pointer. Otherwise, a pointer that has a path to p and also interferes with q may now reach a common heap object along two paths, so its attribute becomes Dag (if it is presently Tree), while it remains unchanged if it is already Dag or Cycle. This case can be formally summarized as follows, where ⋈ is the attribute merge operator:

  (¬D[q,p]) ∧ (A[q] = Tree) ⇒ H_s = { s | s ∈ H ∧ D[s,p] ∧ I[s,q] }
                               ∀ s ∈ H_s, A_c[s] = A[s] ⋈ Dag

q does not have a path to p, and A[q] ≠ Tree : In this case, the data structure accessible from q becomes a substructure of the data structure accessible from all the pointers that have a path to p. The shape attribute of each of these pointers becomes the merge of its current attribute and the attribute of q. We summarize this case as follows:

  (¬D[q,p]) ∧ (A[q] ≠ Tree) ⇒ H_s = { s | s ∈ H ∧ D[s,p] }
                               ∀ s ∈ H_s, A_c[s] = A[s] ⋈ A[q]

From the rules presented above, it can be noticed that a considerable number of spurious direction and interference relationships can be introduced during the analysis. However, our empirical results, presented in Section 4, indicate that the analysis provides effective shape information for a broad range of programs.
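The structure update p->f = q combines new direction and interference relationships with a three-way case split on the shape attributes. A minimal C sketch of that rule follows, under the assumption that the merge of two attributes is the larger one in the order Tree < Dag < Cycle. `state_t`, `merge_shape` and `rule_store` are hypothetical names, and all conditions are evaluated on a copy of the input state.

```c
#define NP 4   /* hypothetical number of abstracted pointers (the set H) */

typedef enum { TREE, DAG, CYCLE } shape_t;   /* TREE < DAG < CYCLE */

typedef struct {
    int D[NP][NP];    /* direction matrix */
    int I[NP][NP];    /* interference matrix, kept symmetric */
    shape_t A[NP];    /* shape attribute per pointer */
} state_t;

/* Assumed merge of two shape attributes: the less precise one. */
static shape_t merge_shape(shape_t a, shape_t b) { return a > b ? a : b; }

/* p->f = q: nothing is killed; relationships and shapes are added. */
void rule_store(state_t *s, int p, int q)
{
    state_t in = *s;
    for (int r = 0; r < NP; r++) {
        if (!in.D[r][p])
            continue;                       /* r must have a path to p */
        for (int t = 0; t < NP; t++) {
            if (in.D[q][t])
                s->D[r][t] = 1;             /* D_gen_set */
            if (in.I[q][t])
                s->I[r][t] = s->I[t][r] = 1; /* I_gen_set */
        }
    }
    for (int x = 0; x < NP; x++) {
        if (in.D[q][p]) {
            /* q already reaches p: the new link closes a cycle. */
            if (in.D[x][q] || in.D[x][p])
                s->A[x] = CYCLE;
        } else if (in.A[q] == TREE) {
            /* x reaches p and shares an object with q: second path. */
            if (in.D[x][p] && in.I[x][q])
                s->A[x] = merge_shape(in.A[x], DAG);
        } else {
            /* q's structure becomes a substructure of x's structure. */
            if (in.D[x][p])
                s->A[x] = merge_shape(in.A[x], in.A[q]);
        }
    }
}
```

Applied to the situation of Figure 2 (q has a path to p, both Tree), the sketch sets D[p][q] and turns both attributes into CYCLE; applied to two disjoint trees it only adds the new path and leaves both attributes as TREE.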
Analysis
McCAT
collects
in
C Compiler
positional
subset
after
on the
by
implementation
w A[q]
8
is performed
representation
of
subset
C
[7,
13,
easier
analysis.
the
the
31].
Shape
[6,
reported
This
incom-
analysis
8] and
is
focuses
to be pointing
reduces
abstractions,
as well
informaSIMPLE
is a simplified,
analysis
of pointers
for
on
which
points-to
points-to
age requirements
5
program-point-specific
analysis
performed
to heap
D[s,P]}
A (A[q]
for a broad
Implemen~in~
analysis
tion.
only
HA
results
analysis
Shape analysis has been implemented
aa a contextsensitive interprocedural
analysis in the McCAT
opC compiler.
It is a flow-sensitive
timizing/parallelizing
case as follows:
H,={s[sc
our
manner.
in the
}
M Dag
to p is merged
is required
after
A D[s,p])
A (A[ql
= A[s]
3
as follows:
empirical
that
information
an efficient
summarized
Attribute
4, indicate
shape
case can be formally
= q
Shape
analysis.
This
p->f
LFrom the rules presented above, it can be noticed
that a considerable
number of spurious direction
and
interference relationships
can be introduced
during the
attribute
operator
Statement
as more
and
the
makes
efficient.
storthe
The
overall
analysis
points-to
framework
analysis.
other
dataflow
abstraction
tion.
It
Our
rules
presented
control
in
each
Section
calls.
While
presenting
the
checks.
form
both
the
context,
The
subtle
may
refer
stack
and
heap.
a name,
form
while
object
with
rule
name
stack
first
and
priate
heap
of all the
Our
shape
tors
for our
direction
the
logical
may)
Also
interprocedural
(V)
attribute
the
results
depth-first
of the
program.
known
statically
handled
in
represented
by
nodes,
unrolling
function
pointers
possible
as they
on the
rule
merge
the
feasible.
equality
opera-
the
for
both
input
matrices,
x
loop
like
cial
ing
for
condition
(p
Statement
analysis
and
and
every
the
time
the
and
Indirect
matrices
calls
ap-
for some
node
via
using
an
the spe-
in the invocation
by separately
from
are
this
are handled
nodes
are handled
these
graph
computation,
procedure
and then merging
With
invocation
approximate
invocable
with
call is analyzed
a unique
Recursive
calls
procedure.
obtained
call-site.
a procedure
fixed-point
are mapped to
called
is analyzed
to the
exists
to it.
recursive
for
output
returned
there
each
at the call-site
matrices
the
analyz-
given
call-site,
all outputs.
in Section
For example,
test
a while
of the procedure
corresponding
graph.
possi-
operator
body
iuterprocedural
is simply
defined
V Dl[r,s]
rules for our interprocedural
input
Next,
call-chain,
for
operator
are
merge
been
we consider
handle
procedure
analysis
framework
It provides
==
program
which
of the
Since
for
the
as
F;f>l
when
NULL)
g(x) ;
or
special
pairs
{
of recursion.
represented
during
above
interprocedural
by
points-to
strategy
Unmap Process
a
they
calls
and
are
approxi-
represents
calls
nodes
all
through
the
given
analysis).
we use the
depicted
contextin
The
main
cesses,
is
10: The
issue
Interprocedural
related
identifying
tributes/relationships
indicating
from
Figure
is not
calls,
Recursive
invocable
framework,
by
structure
node
}
invoca-
structure
indirect
Indirect
Anal ysis
the
points-to
is constructed
of recursive
the approximate
are
by
use
complete
invocation
and
manner.
we
built
invocation
recursive
a special
calls,
us the
traversal
set of functions
(resolved
merge
= D[r,s]
Function
where
possible
the
unmapped
is illus-
analysis
The
The
when
constructs
9: Analyzing
abstractions)
prepare
of possible
merging
the
operation,
that
of the
simple
Based
our imto resolve
appro-
the
$\forall r,s \in H,\; D[r,s] = D[r,s] \lor D_1[r,s]$
$\forall r,s \in H,\; I[r,s] = I[r,s] \lor I_1[r,s]$
$\forall s \in H,\; A[s] = A[s] \bowtie A_1[s]$
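The merge rules can be read as element-wise operations over the abstracted pointer set H. The following is a minimal C sketch, not the McCAT implementation: the fixed set size `NPTR`, the type and function names, and the encoding of the attribute merge as "keep the more conservative of Tree, DAG, Cycle" are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NPTR 3 /* assumed size of the abstracted pointer set H */

/* Shape attributes, ordered so that a larger value is more conservative
 * (this ordering is an assumption of the sketch). */
typedef enum { TREE = 0, DAG = 1, CYCLE = 2 } Shape;

/* Merge for the direction and interference matrices: element-wise OR. */
static void merge_matrix(bool m[NPTR][NPTR], bool m1[NPTR][NPTR])
{
    for (int r = 0; r < NPTR; r++)
        for (int s = 0; s < NPTR; s++)
            m[r][s] = m[r][s] || m1[r][s];
}

/* Merge for the attribute matrix: keep the more conservative shape. */
static void merge_shape(Shape a[NPTR], const Shape a1[NPTR])
{
    for (int s = 0; s < NPTR; s++)
        if (a1[s] > a[s])
            a[s] = a1[s];
}
```

Calling `merge_matrix(d, d1)` leaves in `d` every relationship that holds in either input, and `merge_shape(a, a1)` never lowers an attribute, so both operations only move towards the conservative side.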
ure 10. Complete
the
defines
has already
[6, 8, 14].
call-site
Figure
analysis
relationships
relationships.
note
graph
also
interference
a pointer
analysis
gives
~
scheme are described in [11]. Here we briefly discuss only the most pertinent issues. The general idea is that, first, the three matrices (for direction, interference and
applies
control
abstractions.
and
‘= @
To accurately
sensitive
It
assignment,
it involves
the
then
10
that
p may
= q. Thus,
rules,
9, which
attribute
a simple
mate
of
a statement
a set
~
Merge(I,I1)
Merge(A,A1)
proach,
three
OR
the shape
are
Merge(D,D1)
to a heap-allocated
into
and
handling
statement.
the
tion
context,
information
D1,I1,A1,H,ign);
I = Merge(I,I2);
calling
appropriate
p->f
analysis
for
Figure
while
2.2.3.
the
D = Merge(D,D2);
A = Merge(A,A2);
/* iterate until none of the three matrices changes */
while ((D != prevD) or (I != prevI) or (A != prevA));
return([D,I,A]);
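The fixed-point structure of the loop rule can be sketched in C as follows. This is an illustration under stated assumptions, not the analysis itself: only the direction matrix is modelled, `NPTR` is an assumed set size, and `analyse_body` is a hypothetical stand-in for the real statement rules (here it simply adds a path from r to t whenever paths from r to s and from s to t exist).

```c
#include <assert.h>
#include <stdbool.h>

#define NPTR 3 /* assumed number of abstracted pointers */

/* Only the direction matrix is modelled, to keep the sketch short. */
typedef struct { bool d[NPTR][NPTR]; } Matrices;

static bool same(const Matrices *a, const Matrices *b)
{
    for (int r = 0; r < NPTR; r++)
        for (int s = 0; s < NPTR; s++)
            if (a->d[r][s] != b->d[r][s])
                return false;
    return true;
}

/* Element-wise OR of two matrices. */
static Matrices merge(Matrices a, const Matrices *b)
{
    for (int r = 0; r < NPTR; r++)
        for (int s = 0; s < NPTR; s++)
            a.d[r][s] = a.d[r][s] || b->d[r][s];
    return a;
}

/* Hypothetical loop body: one step of path composition. */
static Matrices analyse_body(const Matrices *in)
{
    Matrices out = *in;
    for (int r = 0; r < NPTR; r++)
        for (int s = 0; s < NPTR; s++)
            for (int t = 0; t < NPTR; t++)
                if (in->d[r][s] && in->d[s][t])
                    out.d[r][t] = true;
    return out;
}

/* Re-analyse the body and merge until nothing changes;
 * *iters reports how many passes were needed. */
static Matrices process_while(Matrices cur, int *iters)
{
    Matrices prev;
    *iters = 0;
    do {
        prev = cur;
        Matrices body_out = analyse_body(&cur);
        cur = merge(cur, &body_out);
        (*iters)++;
    } while (!same(&cur, &prev));
    return cur;
}
```

Because merging only ever adds relationships and the matrices are finite, the loop in this sketch is guaranteed to terminate.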
to a stack-allocated
is p->f
form
= process_stmt(body,
outputs.
in
ble (or
rule
[D2,I2,A2]
or to
object,
Consider
= A;
each
addi-
in one
calling
if p points
locations,
strategy
trated
example,
uses points-to
of the
simple
For
= I; prevA
in the
heap,
a stack-allocated
= D; prevI
= process_basic_stmt(cond,D,I,A,H);
references
the
=
[D1,I1,A1]
for
some
is that
stack,
x, then
prevD
of stack-directed
to the
body, D, I, A, H,ign)
form
in Section
account
process_while(cond,
do
as
for
rules
fun
sec-
approach
requires
object.
the appropriate
references
this
another
= q, whereas
plementation
rule
/* D,I,A : Input matrices, H : Set of pointers
 * abstracted, ign : Current invocation graph node */
basic
of the
into
= q. If p points
the
is x. f
object,
(P
to
in
p->f
them
point
to a heap-allocated
of the
the
point
the
is structured
presence
and
p->f
p may
point
all
the
for
that
previous
statement
analysis
we take
implementation,
tional
has
basic
consider
However,
actual
the
not
basic
used
use
the
a context-sensitive
procedure
we did
also
in
2, a compositional
and
handling
2.2,
that
however,
implementation
for
construct,
pointers.
could
presented
particular
analysis
to
be noted,
frameworks
and
a simple
is similar
should
Fig-
dure
global
call.
This
in scope,
set
(ii)
with
the
set
can
be
includes
not
Strategy
map and
of
unmap
pointers
modified
by
pointers
which
in the scope
pro-
whose
the
of the callee
at-
proceare:
(i)
but
are
variables),
via an indirect
reference
(Invisible
(iii) not at all accessible in the callee, but have a direction/interference relationship with some pointer
and
accessible
in the callee
(inaccessible
variables).
Special symbolic names must be generated to represent all invisible and inaccessible variables and capture
space
their
requirements
section
gives
in the
program,
names
where
the
variables, and we simply reuse them
For
Inaccessible
variables we generate two symbolic names
for each parameter
or global that is related to some
cation]
to
parameter
inaccessible
vocation
attributes/relationships
Points-to
in the calling
analysis
already
context
generates
symbolic
for invisible
variable(s).
If the name of the parameter
or global
variable
represent
is x, then
all inaccessible
we use the name
pointers
and the name O+x to represent
that
have paths
procedure
from
context
that
with
is stored
and names
O–x to
have paths
all inaccessible
x or interfere
call a mapping
the calling
the
given
context
[11].
Finally,
in order
a reduced
rization
dure
price,
and
its
Every
output
to it.
techniques
to optimize
(i)
vocation
graph,
directed
pointer
in
to
the
neither
all
arises
extensive
calls
to
of the
for
Experimental
4.1
a number
a variety
rizes
The
from
[9],
The
and
by trying
(except
the
characteristics
first
ments,
section
counted
of statements
tion
of
gives
the
using
in the
of small
the
appear
first
the
the
source
wc utility,
SIMPLE
(this
number
gives
the
analysis
point
gives
the
minimum,
maximum
abstracted
by
trices
over
symbolic
all
functions
variables
numbers
indicate
requirements
gram.
Note
that
which
the
the
estimate
of view).
the
by
second
average
of the
our
number
average
varies
reasonable
size
with
DAG
at the
give
references
pointer,
say
data
or
where
Cycle,
given
the
in the
or cyclic
and
the
pro-
7 (stanford)
respect,
*a/
p,
struc-
program
A is
point.
reported
form.2
the
by
the
information
bles have
the
be
and/or
programs
corresponds
are thus
In 9 of these
structures
the
tion
is useful.
observe
in
improving
that
build
and
sim, we conservatively
we can
Trees
remaining
and
that
deta-
The
top
our
data
shape
determine
this
informa-
2 programs,
find
the
Both
tree-like
for
acwith
if
2 groups.
candidates
are in fact
the
the
parallelization.
good
given
builds
we
11 programs,
that
In
we compare
into
refla-
the
and
to programs
and
for
useful
divided
sep-
multi-column
a program
analysis
would
pendence
2(b)
(where
2(a),
indirect
statistics
structure(s)
information
for
The
overall
in Table
of data
in Table
measurements
respective
gives
. b and a [i]
(*a)
pointer)
above
Further,
shape
reverse
the structures
are
shape information
is only slightly
useThe bottom
group of programs
build structures
are inherently
dag-like or graph-like,
and so even
z Note
These
a given
the
Overall
ful.
that
of
includes
analysis).
from
references
respectively
dereferenced
labeled
of the
m~]-
(this
for
a
DAGs, so our
section
number
abstractions
analysis
Thus
evaluating
indirect
dag-like
Tree,
erences
analysis.
representa-
program
of the
is quite
the
=
give
structures,
com-
direction/interference
in the
size
including
of program
The
and
introduced
memory
to 83 (sim),
a good
loader).
for
indirect
matrix
arately
group
sized
programs.
and
A[p]
attribute
program.
1 summa-
intermediate
from
variables
lines
and
columns
the
is a heap-directed
that
benchmark
where
multi-columns
shape
Table
three
a itself
beled
on
and medium
of sources.
of the
(except
heap-
Benchmarks
benchmarks
majority
suited
of heap-related
to a tree-like,
i.e.
the
Results
We have collected
in
substan-
locations
assembler
of heap-related
ture:
in-
tual
4
in-
pointer
1 have
with
heap
in one
C.
C : These
program,
different
analysis
they
to
set is well
number
points
invocation
memoization
call-chain
to
the
access
the
a procedure
pointer
a heap-directed
in Table
a
a formal
benchmark.
number
algorithm,
nor
the
T, D,
other
from
demand
during
The
in the
from
is identical
building
when
lo-
and
Results
Refs:
along
output
functions
as the
more
irrespective
node
exploring
update
(ii)
manner,
of in-
graph
interprocedural
variables,
performing
one)
input
our
contexts
memoize
current
excluding
which
a lazy
invocation
(iii)
if the
stored
for
a stack
a procepairs
call is re-analyzed
use the
We are also currently
include:
graph
in the invocation
both
references,
benchmarks:
references
to a stack
happens
referring
third
We estimate the effectiveness of shape analysis for this
set, using the following
measurements
(Table 2(a)):
memo-
analyzing
to
benchmarks
benchmark
analysis
4.2
at
a simple
computed
this
for
context-sensitivity
we finish
When
node,
input.
full
currently
we simply
invocation
the stored
time
the
rules
be found in
implemented
matrices,
call-chain,
which
to get the
we store
corresponding
this
can
we have
scheme.
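The memoization idea, storing the input and output matrices of an analysed call in its invocation graph node and reusing the stored output when the same input arrives again, can be sketched in C as below. The struct layout, the names `IgNode` and `analyse_call`, the single-matrix state and the body of `analyse_body` are all assumptions made for illustration and are not taken from the McCAT sources.

```c
#include <assert.h>
#include <stdbool.h>

#define NPTR 2 /* assumed number of abstracted pointers */

typedef struct { bool d[NPTR][NPTR]; } Matrices;

/* Hypothetical invocation graph node carrying one memoized input/output pair. */
typedef struct {
    bool has_memo;
    Matrices in, out;
    int analyses; /* how many times the body was really analysed */
} IgNode;

static bool same(const Matrices *a, const Matrices *b)
{
    for (int r = 0; r < NPTR; r++)
        for (int s = 0; s < NPTR; s++)
            if (a->d[r][s] != b->d[r][s])
                return false;
    return true;
}

/* Hypothetical stand-in for analysing the procedure body:
 * makes every relationship symmetric. */
static Matrices analyse_body(Matrices in)
{
    Matrices out = in;
    for (int r = 0; r < NPTR; r++)
        for (int s = 0; s < NPTR; s++)
            out.d[r][s] = in.d[r][s] || in.d[s][r];
    return out;
}

/* Reuse the stored output when the input is identical to the stored one. */
static Matrices analyse_call(IgNode *node, Matrices in)
{
    if (node->has_memo && same(&node->in, &in))
        return node->out;
    node->analyses++;
    node->in = in;
    node->out = analyse_body(in);
    node->has_memo = true;
    return node->out;
}
```

In this sketch a second call with an identical input costs one matrix comparison instead of a full re-analysis of the body.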
call,
put
processes
heap
of indirect
and
and
of indirect
two
references
indirect
can point
typically
references
the
The
of
a stack-directed
the
number
matrices).
number
pointer
location
(this
All
for
map
unmap
tial
the
receives
indirect
and is used
a heap
bit
number
of the function
For each
called
and
location
names in
x.
between
in the
to x,
use
total
dereferenced
another).
pointers
(globals, parameters
and symbolic names),
Complete
description of the
while unmapping.
and
heap
(we
the
a [i]
is
a pointer,
ences
of
the
of
the
as
{
of
10
an
a
pointer
temp
type
arrays
a[i].
*a
=
access
of
reference
form
form
using
to
that
ered
but
x
=
of
;
x
pointers
form
not
a itself
* ( a [i
because
a[i]
the
(and
of
=
is
=
a[i]
not
For
always
simply
when
Indirect
refer-
as indirect
which
this
have
consid-
reference)
a pointer.
counted
}.
is
indirect
simplification
*temp;
may
not
] ) are
our
x
an
reason,
indirect
references
expresses
this
benchmarks
references
Program | Source Lines | SIMPLE Stmts | Min vars | Max vars | Avg vars | Ind Refs | To Stack | To Heap | Stack/Heap
0
bintree
351
xref
153
342
l~g
misr
277
235
chomp
430
476
stanford
885
880
our
enough
a way
heap
40
20
40
24
31
0
31
0
2
PO
10
8
47
39
35
27
27
22
127
45
82
0
14
7
28
0
28
0
4
110
4
6
11
14
7
7
0
641
16
23
18
180
29
151
0
reverse | 123 | 49 | 9 | 18 | 12 | 16 | 0 | 16 | 0
assembler | 3361 | 3071 | 22 | 36 | 24 | 718 | 666 | 52 | 0
loader | 1539 | 1055 | 13 | 28 | 17 | 170 | 106 | 64 | 0
sim | 1422 | 1760 | 76 | 111 | 83 | 374 | 34 | 340 | 0
paraffins | 381 | 180 | 6 | 31 | 21 | 37 | 2 | 35 | 0
blocks2 | 876 | 1070 | 56 | 82 | 61 | 373 | 98 | 275 | 0
nbody
2204
24
36
27
134
24
116
6
sparse
2859
703
1495
60
32
468
3
465
0
pug
2400
2089
24
!Jp
153
48
822
147
688
13
analysis
shape
gives
correct,
shape
( 4 of the
is not
dependence
testing
information
is useful,
however,
that,
a higher-level
determining
should
1: Benchmark
information
of automatically
analysis
10
681
for improving
The
50
257
the shape
lelization.
10
hash
shape
programs),
23
power
Table
when
4
really
5
Just
before
have
detailed
and/or
Characteristics
paral-
oldnode
path
as
not
Based
on the
actually
and
our
make
examination
a manner,
that
analysis
always
structure
structure
attribute.
and
node
to
sem bier
the
a new
node
existing
the
This
does
build
a leaf
build
ified)
shape
ports
code
by
the
shape
fragment
set,
new
provides
misr,
The
and
finds
its final
shape
a
canto
(which
is
its shape
are done
analysis
sets
oldnode
So it reports
insertions
from
analysis
newnode
to benchmarks
data
loader
a new
shape
sively
case the
at-
into
this
becomes
overly
attribute
as Cy-
sire,
blocks2
ships
new
and
item
attribute
structure
nodes
between
conservative
this
is built
existing
estimates
or cyclic.
The
(or
at
for
(all
such
a pointer
teed
that
pointed
nodes,
share
re-
following
and
tree
= q;
= newNode
disjoint
new_node
() ;
S3: new_node->f = q;
arrays
S4: old_node->f = new_node;
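The concrete heap effect of statements S1, S3 and S4 can be reproduced with a few lines of C. This is a sketch under assumptions: S1 is taken to be `old_node->f = q`, the allocation in S2 is left to the caller, and the helper names `parents_of` and `insert_between` are hypothetical. It only tracks the run-time heap, not what the shape analysis reports.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *f; };

/* Number of the two given nodes whose f link points directly to q. */
static int parents_of(const struct node *q, const struct node *a,
                      const struct node *b)
{
    return (a->f == q) + (b->f == q);
}

/*
 * Applies S1, S3 and S4 to caller-supplied nodes. Stores the number of
 * direct parents of q right after S3 in *after_s3 and returns the number
 * of direct parents of q after S4.
 */
static int insert_between(struct node *old_node, struct node *q,
                          struct node *new_node, int *after_s3)
{
    old_node->f = q;                               /* S1 */
    new_node->f = q;                               /* S3: q now has two parents */
    *after_s3 = parents_of(q, old_node, new_node);
    old_node->f = new_node;                        /* S4: old link to q is overwritten */
    return parents_of(q, old_node, new_node);
}
```

Between S3 and S4 the node q is shared by two parents, so the structure is momentarily dag-like; after S4 it is a list again, with q reachable from old_node only through new_node.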
the
11
loop
pointer,
indices
disjoint
from
array
two
the
shape
to
case.
from
represent
be
point
shape
Tree,
other
indices,
the merge
one
attribute
(if
of
relation-
attribute
to tree-like
each
an array
of all the pointers
If the
as
recur-
it
of
is guaranstructures,
the
structures
say
a [i]
for
a is reported
and
a[j]
as
Cycle).
benchmark
lists,
[2, 25]
S2:
this
that
a. So the
attributes
or
analysis
its shape
reverse
all pointers
is reported
dag-like
shape
to report
represents
a [i]).
by
a node,
In our
case:
tree
and
all array
to
DAG or
benchmark
of this
pointers
completely
nlod-
and
and attribute
denotes
continues
, as one pointer
of the relationships
M-
again,
and
abstracts
say a[lo]
becomes
tree-like
The
a binary
analysis
pointers,
stanford
the
temporarily
becomes
this,
swaps
Shape
tree
structure
then
or cyclic.
to linked
S1: old_node->f
q via
structure,
and reports
detect
dag-like
estimated.
data
to be dag-like
illustrates
xref,
by appending
list.
is accurately
and
to
path).
case applies
and
cannot
shape
lose its
appending
cyclic
at the
of this
in this
bintree,
by
hash,
lists
or dag-like
analysis
the shape
temporarily
trees
of the
This
If a data
in such
then
because
even
while
linked
inserting
we
appended
structure,
infers
binary
structure
is always
benchmark
node,
programs
If a tree
cle.
path
newnode
path
f),
Our
information,
dag-like
link
newnode.
as DAG. If further
conservative
2(b),
programs,
data
happens
not
beginning/end
these
and
the
the
nbody.
a tree-like
successfully
In our
chomp
benchmark
of
as Tree.
2(a)
observations.
builds
beginning/end
in Tables
of the
the following
If a program
data
presented
kill
and
S4 kills
is via
to
the only
apparently
data
q (which
the
oldnode
Statement
an additional
tribute
Observations
S4 both
q.
oldnode
detect
have
be applied.
to
to
from
now
4.3
statement
a path
with
that
power
its
subtrees.
as Tree.
set,
and
root
We get
For
builds
hash uses an array
implements
having
the
example,
the
power
an
array
shape
the
tree
of pointers
a power
of
network
pointers
attribute
simplified
to
of these
version
is as follows:
of
*a/
Program
(*a).
Refs
a[i]
c
bintree
36
36
0
0
29
29
0
0
4
~
Overall
D
T
Refs
xref
c
Refs
T
D
c
4
0
0
40
40
0
0
2
0
0
31
31
0
0
misr | 35 | 35 | 0 | 0 | 0 | 0 | 0 | 0 | 35 | 35 | 0 | 0
chomp | 56 | 56 | 0 | 0 | 26 | 26 | 0 | 0 | 82 | 82 | 0 | 0
stanford | 28 | 28 | 0 | 0 | 0 | 0 | 0 | 0 | 28 | 28 | 0 | 0
0
0
u
o
0
0
7
7
0
0
4
0
0
151
151
0
0
0
7
hash
-7
147
147
0
0
4
reverse
16
11
5
u
o
0
0
0
16
11
5
assembler | 45 | 45 | 0 | 0 | 7 | 7 | 0 | 0 | 52 | 52 | 0 | 0
loader | 55 | 55 | 0 | 0 | 9 | 9 | 0 | 0 | 64 | 64 | 0 | 0
sim
96
29
67
0
23
0
340
250
90
0
power
paraffins
blocks2
244
221
26
8
18
0
9
3
6
0
35
11
24
0
119
16
37
66
156
64
43
49
275
80
80
115
74
22
0
52
42
14
0
28
116
36
0
80
sparse
384
14
0
370
0
384
14
0
370
514
16
0
498
0
1
0
pug
0
174
0
173
688
17
0
671
nbody
(a)
Program
b
D
T
Actual
Data
bintree
A binary
tree
xref
A binary
tree
misr
A linked
list
chomp
A game
tree
stanford
A binary
hash
A hash
power
A tree
reverse
A binary
tree
table
Structure(s)
with
and
a linked
for
tree-sort
using
Measurements
list
hanging
Shape
from
each
node
list
an array of
a power
which
for
Built,
a linked
implementing
tree
Empirical
Analysis
Shape
Shape
Reported
Correct
yes
yes
Tree
yes
yes
Tree
yes
yes
Tree
yes
yes
Tree
yes
yes
Tree
yes
yes
Tree
yes
yes
DAG
no
slightly
network
swapped
Info.
Useful
Tree
linked lists
is recursively
Shape
assembler
A linked
list
Tree
yes
yes
loader
A linked
list
Tree
yes
yes
sim
An
DAG
no
slightly
paraffins
3 arrays
DAG
yes
slightly
blocks2
A constraint
graph
partially
slightly
nbody
A leaf-linked
octree
sparse
Sparse
pug
A complex cyclic
array
of linked
lists
of interconnected
matrix
data
using
linked
structure
lists
(DAG)
(DAG)
DAG/Cycle
(DAG)
linked
lists
(cyclic)
structure
(b) Accuracy and Usefulness of Shape Information
Table 2: Experimental Results
Cycle
no
no
Cycle
yes
no
cycle
yes
no
Information
t
= (struct
for
(i
{
root*)
= 0; i
temp_1 =
malloc();
<=N;
(i
*
i
= i+
10);
analysis
temp_2
characteristics
= (*
t).feeders;
above
returns
loop,
a tree
to the
tth
iteration
of the
pointer
most
of the
is performed
in
important
tree
array
allbui.ld-lateral
which
is then
connected
array
temp_2.
Since
pointer
a disjoint
of the
format
functiouc
in each iteration,
index
Further,
the
is connected,
temp-2
which
segment
the
in this
iterates
of this
in
the
array.
simpl~jic
(i=O;
i<=N;
i=i+
the
and
the number
1)
= (*
r).theta_R;
theta_I
= (*
r).theta_I;
the
and
column
per
invocation
,
theta_R, theta_I)
Table
;
fns
that
in
tree,
we
loop,
each
which
parallelized
know
iteration
is
operated
Thus
provided
variables
demonstrates
this
there
(there
how
information
for
from
pointer
then
Compute_Lateral.
to other
1 points
to
the
are
by
can
no
in
analysis
a disjoint
benchmark
to linked
lists.
be effectively
this
can
due
case).
provide
analysis
paraffins
and
This
critical
parallel
benchmarks
in the
with
they
program.
Besides
are
iza-
to one.
the
if data
cyclic
that
get
actually
node.
per
These
number
from
(given
Act
the first
The
arecalcu-
column,
three
in
free
to
currently
then
interference
direction
call
ig
I Rec
I App
safely
(p),
it
if
may
exploring
one
matrix
2
4
2
4
1.07
misr
5
7
7
0
0
1.00
information
deallocating
20
47
12
98
13
7
2
7
2.09
8
4
1.08
hash
5
8
8
0
0
1.00
power
18
31
53
6
6
1.71
reverse
10
11
2
4
1.10
assembler
5
52
loader
30
sim
14
paraffins
7
~o
(live)
not
the
information,
pointer
be a safe
these
0
2.44
2
2
1.52
2
8
1.70
6
7
0
0
1.16
28
28
1
2
1.00
2
2
1.76
118
sparse
76
121
0
0
1.59
pug
41
69
101
0
0
1.46
is set
Static
these
results
not explode,
tion
graphs,
cussed
in Section
have
a large
number
13
simple
same
ously
computed input.
3 that
handling
It is also
scheme,
the
invocation
majority of
that
the
is that
graph
context-
There
very
are, how-
large
invoca-
more
aggressive
memo-
these
programs,
as dis-
interesting
to note
calls
get
only
reuses
node
Finally,
to be made
first
invocation
cost.
do have
of procedure
when
Table
that
The
do a complete
reasonable
for
3.
4.
size of the
so we are exploring
At a
other
observations
we can
benchmarks
techniques
our
with
Measurements
3 and
the
and
analysis
other
rization
applications.
interesting
in Tables
benchmarks,
ever,
with
Interprocedural
are several
the
sensitive
two
of’ diYecslid
0
125
44
67
does
deallocation.
effectiveness
for
can
642
82
Q6
263
34
28
for
can aid
memory.
1.03
stanford
There
to sinl-
I [p, q]
call-site
15
Table 3:
ex-
from
needs
entry
matrix
any
For
accessible
a node,
nodesj
I
32
structure
own.
in
ones
and interference
on their
with
columns
14
nodes,
data
and
31
nbody
data
in
call-site
averages
in the
that
average
analyzed
function,
mem-
calls
colurnnsgivethe
per
an-
8
blocks2
of indirect
allocated
main
structures
direction
and
the
nodes,
as DAG. The
category.
newly
to
q share
Similarly,
p,
Cycle
be useful
interference
a path
tiou
also
if the
We
are
in the
information,
can
like
some
three
calls
17
from
programmer
call-site
share
So majority
represent
say p and
check
of pointers
use inherently
pointers.
fall
to identify
pointers
lists
gets reported
pug
connected
shape
relationships
ply
back
for them
before
ample,
and
Tree category
of the
these
the shape
sparse
references
also uses arrays
However
and consequently
structures
first
of procedure
mef
chomp
function
dependencies
none
dependence
information
upon
loop
are
shape
shape
tiou.
The
interproceduThe
sites
bintree
this
nodes,
3.
}
For
three
number
Program
...
dynamic
calls
the
of
columns
approximate
analysis.
calls
number
number
three
of procedure
graph
the appropriate
some
Act)
by dividing
last
number
last
labeled
total
total
of procedure
actual
The
the
the
three
of func-
call-site.
number
of procedure
the
lated
per
graph
first
number
The
shape
total
number
the
analyzed.
number
a = Compute_Lateral(l, theta_R, theta_I
the
total
and
of nodes
for
The
and
graph.
4 we present
give
invocation
program,
program,
number of recursive
Table
oized,
r).feeders;
theta_R
the
give
the
in the
invocation
in its
alyzed,
d
give
called
interprocedural
the
benchmarks.
table,
nodes
get
{ temp_0 = (*
l = temp_0[i];
this
in
columns
is as follows:
for
in
the
rzd measurements
benchmark,
this
of
actually
In
as Tree.
over
loop
in
3, we present
of call-sites
overall
is deduced
computation
a loop
Measurements
is a context-sensitive
In Table
columns
= 1;
tions
In the
The
Shape
l = build_lateral(temp_1, 20);
1
each
Interprocedural
analysis.
temp_2[i]
shape
4.4
~)
memoized,
is visited
output
with
values
aprevi-
it can be observed
benchmarks,
have
that
even
from
recur-
We
Program
Calls
Analyzed
Mem
Tot
bintree
xref
60
!1
misr
II
chomp
390
stanford
112 \
power
nbody
assembler
16
36
0
42
9
1.25
3.50
7,2(J
2.03
l,~lj
6.17
836
I 1.98
2.50
1
2.00 1
will
1.25
and by using
the
certain
be able
trees
3.60
1.50
l,jg
updates
1,78
Based
1.30
presented
heap
a more
shape
It
structures
of the information
better
on the
in this
analyses
development
of
leaf-linked
from
programs
more
analysis
this
also
re-
trees,
improve
the
we also plan
to larger
that
upon
information.
results
paper,
matrix
with
in the face of structural
kill
positive
to
link
matri-
structure
like
and
the first
attribute
data
the
the direc-
of path
is hoped
pointers,
by giving
to create
about
complex
of the
links.
parent
the accuracy
analysis
by enriching
implementation
to handle
with
1.19
~,~~
3.18
(a partial
abstracts
to
shape
hierarchy
to keep information
path
spect
2.31
1
1
3.13
16.08
that
our
in our
abstraction
ces [17])
1.00
1
4.13
3.75
11
210
221
1
to extend
analysis
on the
3.13
1
1.00
9.70
63
9
252
3.36
1
1.40
[1
10
49 I
1057
5.88
11
30
I
5?
reverse
paraffins
1.-17
11
1
1
I
1.51
194 II
6
I
11
,[
47
[
196
36
1[
hash
~,~~
plan
level-3
tion
0101
1
A\’i
47
13
1
7
AYC
Act
12
59
Avf
efficient
experiments
to apply
and
all of our
to continue
interprocedural
our
strate-
gies,
References
Table 4: Dynamic Interprocedural Measurements
[1] J.
Choi,
M.
Burke,
flow-sensitive
sive
and
approximate
most
of these
they
also
employ
traverse
and
useful,
any
analysis,
recursion
shape
as the
them.
analysis
it
must
accurate
graph
use recursive
modify
and
safe and
invocation
programs
induced
of
that
to
Symposzurn
pages 232-245,
[2] A.
programs
iu
a
Rogers,
M.
March
Conclusions
and
Future
this
paper
proximate
we have
the
C programs.
pointer
piler,
analyses
it
recursive
The
and
tested
mental
structures.
and
show
of lists
information
for
data
erful
enough
sults
will
that
cases
complicated
real
compilers.
simplest
consideration,
class
for
expensive
[3,
5,
major
Our
approach
possible
or complex
The
if the
lists,
J.
on
on
17(2):231-263,
and F. K. Zadeck.
In Proceedings
finite
Language
pages 296–310,
model
Analysis
of the SIGJune
of aliasing
De-
1990.
and its ab-
representations
of right-regular
In Proceedings
of the 1992 ln-
relations.
ternationol
2–13, April
Conference
on Computer
Languages, pages
1992. IEEE Computer
Society Press.
June
useful
changes
is not
[6] M.
to
the
reof
use
program
fits
not
the
into
the
apply
N ’94 Conference
and
analysis
Proceedings
on Programming
Implementation,
pages
for
of the
Lan-
230–241,
Emami,
R. Ghiya,
and
interprocedural
and
L. J. Hendren.
points-to
analysis
Contextin the pres-
In Proceedings
of the ACM
on Programming
Language
Implementation,
pages
242-256,
June
1994.
[7] A. M. Erosa
iu
A structured
cheapest
program
In
1!)94.
Deszgn
substantially
to implement
may-alias
k-limiting.
ence of function
pointers.
SIGPL.4 N ’94 Conference
pow-
some
Interprocedural
Beyond
Deszgn
sensitive
haudle
each
Deutsch.
guage
trees,
provide
are
difficult,
we will
L.
Transactions
on Programming
A storeless
.4 CM SIGPLA
data
although
can
for
analysis,
re-
simple
structural
is to
A CM
and
structures
und Systems,
M. Wegman,
using
pointers:
parallelization.
they
Reppy,
equivalence
[5] A,
experi-
accurate
abstraction
but
Record
Languages,
data
machines.
’90 Conference
stractions
C c,onl-
J. H.
dynamic
and structures.
[4] A. Deutsch.
of
composition
build
results,
more
in
Conference
SIGPLAN-SIGACT
implemented
provide
analyses
analysis
shape
Carlisle,
sign and implementation,
use simple
building
and
30],
and
built
that
accurate
17,
Thus,
our
does
shape
Other
that
we can often
make
to give
more
and
are
programs
our
be safe.
these
it
or trees,
structure,
McCAT
completely
programs
for
the
PLAN
al.>-
structures
programs.
optimization
For programs
the
that
that
of a hierarchy
programs
been
that
those
Thus,
arrays
in
at
has
data
is part
16 benchmark
results
the
dynamic
structures
on
an analysis
implemented
analysis
for
of
analysis
is directed
data
ally.
sults
shape
Shape
and
presented
In
ACM
1995.
[3] D. R. Chase,
Work
Efficient
of pointer-
1993.
Languages
of pointers
In
C.
distributed-memory
manner,
P.
of Programming
January
Supporting
Programming
5
Annual
on Principles
Hendren.
Carini.
computation
and side-effects.
Twentieth
to
interprocedural
recursive
aliases
the
be
structure
implies
handle
handle
Since
structures,
control
This
must
nodes.
data
and
interprocedural
and L. J. Hendren.
approach
Taming
to eliminating
In Proceedings of the 19.$14International
[:onlPtlter
Languages, Pages ~29–240,
-
under
target
[8] M. Emami.
a 111oM
fol
thesis,
analysis.
A practical
interprocedural
an optimizing/parallelizing
McGill
University,
control
Conference
May
1993.
on
1994.
alias
C compiler.
July
flow:
goto statements.
analysis
Master’s
[9] E. Gagnon.
ACAPS
sity,
[10]
R.
A fast-forward
P~oject
May
L.
interprocedural
of the Etght
for
Ghiya.
Parallel
III.
[13]
L.
Hendren,
tiani,
C.
based on a family
resentations.
In
Workshop
number
[14]
analysis
Memo
[15]
72, School
of ComputerScience,
illc(;illITni\e~-
languages
analyzability:
A fresh
for
In Proceedings
Conference
[27]
L. J. Hendren,
Society
J. E. Hummel,
for
recursive
the analysis
and
pointer
In Proceedings
Conference
on progr’anzm~ng
Nicolau.
of the .4CM
Language
June
L. J. Hendren
and A. Nicolau.
Parallelizing
with
data
IEEE
recursive
structures.
and Distributed
Systems,
[29]
programs
S. Horwitz,
P. Pfeiffer,
ysis for pointer
PLAN
and T. Reps.
variables.
on
’89 Cortfevence
Dependence
on Programming
stgn and Implementation,
anal-
[31]
of the SIG’-
June
N. D. Jones and S. S. Muchnick.
sis, Theorg
and Applications,
and Optimization
Structures,
N. D. Jones and S. S. Muchnick.
data
recursive
structures.
data
Workshop
Parallel
of the ACM
SIGPLAN
SIGPLA
on Programming
Language
Design
ion,
June
M,
Sagiv,
T.
Reps,
problems
Program
and
reconsidered.
N ’95 Conference
and
with
Implementat-
Programming
Languages,
B. Sridharan.
An analysis
Solving
of the Twenty
January
thesis,
updat-
Third
Annual
on Principles
of
1996.
framework
McGill
shape-
destructive
CT Symposium
Master’s
1995.
R. Wilhelm.
Record
SIGPLAN-SIGA
on
Ma-
1995,
in languages
In Conference
problem.
Symposium
alias analysis
of the ACM
in 1994.
path
pages 1–11, June
pages 13-22,
pages 37-
Published
and Semantics-Based
Context-insensitive
number
Science,
as a generalized
In
on Lan-
Computing,
1993. Springer-Verlag.
of
execution.
for the McCAT
University,
Septem-
ber 1992.
1989.
Flow .4 nal~-
4, Flow
Notes
Analysis
parallel
of the 6th International
(PEPM),
on
Implementation,
and V. Karamcheti.
for eflicient
Evaluation
E. Ruf.
‘9.2 Conference
and
B. Steensgaard.
time.
Analysis
In
nual .4 Cilf
pages IL)2–
Points-to
Conference
SIGPL.4
ples of Programming
Record
N-SIGA
analysis
in
almost
of the Twenty
CT Symposium
Languages,
January
Third
linear
An-
on Princi1996.
1981.
interprocedural
the Ninth
[’1]
chapter
of LISP-like
131. Prentice-Hall,
[20]
Program
N
Desi~n
in Computer
compiler.
[32]
[19]
Novem-
768 in Lecture
ACM
L[lnguageDe -
pages 28–40,
and
93, pages 243–249,
for
ing.
17, .lanuary
In Proceedings
Culler,
In Pro-
guages and Compilers
analysis
1990.
[18]
D,
pricing.
SIGPLA
A. Chien,
In Proceedings
’92
Tran.sactlons
Li,
power
1992.
structures
nipulcdion
In~-
1992.
1(1)::15-
ACM
June
In Proceedings
De.s&gn ctnd II)(,.
pages 249–260,
De-
1988.
1988.
X.
optimal
Language
56, August
Ab-
SIG’PL.4N
June
Restructuring
Lisp
In Proceedings
of
July
hlurphy,
[28] T. Reps. Shape analysis
of imperative
plernerstation,
Parallei
of the
J. Plevyak,
Partial
structures:
and transformation
programs.
at pointer
[30]
[17]
L.
Proceedings
Press.
A.
data
conflicts
of the SIG-
W. A. Landi and B. G. Ryder.
A safe approximate
algorithm
for interprocedural
pointer
aliasing,
In Pro-
dynamic
Languages jpages?42-
Computer
1991.
Language
pages 21–34,
Decentralized
pages 235–248,
of the 1992 I!lterna-
on Computer
1992. IEEE
stractions
proving
[26]
piogrammiug
look
on Programming
pages 100–110,
Lummetta,
ceedings
.4 C,4PS Technical
Detecting
accesses. In Proceedings
ceechngs of Super computtrrg
Parallel
for C compilers,
exe-
4:29–99,
ber 1993.
Emami,
R. Ghiya,
and C. \~ercontext-sensitive
interprocednral
Designing
251, April
[16]
rep-
1993.
structures.
S.
1. Khalil.
1[)92. Springer-t;ellag.
and G. R. Gao.
tional
[25]
NotesinConll>ute~
L. J. Hendren
P. N. Hilfiuger.
1993.
for parallel
Computation,
J. R. Larus and P. N. Hilfinger.
programs
for concurrent
execution.
Programming
framework
July
data
JusconL-
in 1993.
L. J. Heudren,
M.
brugge.
A practical
sity,
for
and
structure
ancl Systems,
5th Internc{tr’o~~al
Compders
Lecture
August
Gao,
Cr.
J. R. Larus
of Pro-
January
the .4 CM/SIGPLA
N PPEA LS 1988 — Parallel
Programming:
Experience
with Applications,
Languages
1989,
interlllediat,e
oj the
and
757in
and
Lwp
the klccAT
of structured
pages 406–420,
Published
analysis
Emami,
Proceedings
on Languages
Computing,
Science,
hI.
Lisp programs
PL.4 N ’88 Conference
progralns,
Desiguillg
Compiling
stgn and Implementation,
[24]
on Principles
pages 196–205,
Lisp and Symbolic
between
of Computer
2(3/4):179–396,
Douawa,
p~]
1995.
of Scheme
Computatzors,
In
interprocedural
interprocedural
B, Sriciharau.
and
piler
The
C.
1995.
School
May
parallelization
and Symbolzc
for
CT .$gmpostum
Languages,
J. R, Larus.
cution.
on Lungucrgesund
for
thesis,
University,
L. Harrison
analysis
techniques
Master’s
McGill
automatic
heap
Workshop
N-SIGA
gramming
analysis:
Cotllptlting,Allgt{st
Practical
analysis.
Science,
W.
(lnl\Jer-
Connection
J. Henrlren,
Proceedings
R.
SIGPL.4
analysis,
hIc(iill
[~Q
and
A practical
heap
[12]
19$)5,622 B,03,
1995.
Ghiya
Compilers
[11]
and lazy points-to
Report
Annual
flow
ACM
A flexible
analysis
In
approach
and progralus
C~onference
SgmposIunl
on Pr(nczples
.January
N.
Klarlund
Schwartzbach.
C+raph
In
Conference
of the
An naal
ancl
M.
Record
Twentieth
to
pointer
\vit h
Record
Programming
Languages,
pages 66-74.
ACM SIGACT
and SIGPLAN.
[:33] R. P. Wilson
of
of
1982.
types.
A (7h1
and M. S. Lam.
analysis
ACM
SIGPLA
guuge
1995,
Design
Efficient
context-sensitive
for C programs.
In Proceedings
N ’95 Conference
on Programming
Lan-
pages
June
and Implementation,
1–12,
of the