In these notes we will use ⊓, ⊔, ⊥ and ⊤ for meet, join, bottom and top respectively, and ⊑ for the partial order relation.
In the first part of the lesson, a practical problem from the C programming language
will be introduced.
Problem Definition: Points-To Analysis
Let p be a variable of pointer type and q another variable. We need to decide whether p can point to q.
In the C language this problem arises because we can take the address of a variable. Thus, after the assignment p = &q, p points to q.
Motivation
The motivation for solving this problem is that without knowing whether a variable can be affected by destructive updates, i.e., *p = q operations, the compiler's ability to perform several analyses, e.g., constant propagation, and optimizations is hindered.
Example
Suppose we have the following basic block:
x := 1;
*p := 3;
y := x;
Since this is a basic block, if p does not point to x there is no assignment to x between x := 1 and y := x. So the assignment y := x may be replaced by the assignment y := 1, and in particular, the compiler can deduce that after the last assignment both x and y have the constant value 1. However, if p may point to x, the assignment *p := 3 can change the value of x, and the compiler cannot simply perform the above optimization.
Conservative Solution
Do not perform any optimization involving a variable whose address is taken anywhere in the program.
Example
Suppose we have the following basic block:
q := &z;
p := &x;
x := 1;
*q := 3;
y := x;
Using the conservative solution, the compiler will not deduce that after the
assignment y:=x, both x and y have the constant value 1, because the address of x was
taken in the program.
It is easy to see that the conservative solution is very weak: it prevents any analysis or optimization involving a variable whose address is taken anywhere in the program.
Notice that, typically, many variables have their address taken, for example all the variables that we pass to a procedure by reference, so many possible optimizations will not be performed.
Remark
The conservative solution is sound in the sense that any variable whose address is not taken cannot be affected by a destructive update operation.
We will solve the Points-To Analysis problem for an extension of the While language named PWhile.
PWhile definition
a := x | *x | &x | n | a1 opa a2
b := true | false | not b | b1 opb b2 | a1 opr a2
S := x := a | *x := a | skip | S1 ; S2 |
if b then S1 else S2 | while b do S
PWhile Semantic Definition
We extend the semantics of While with the new arithmetic expressions and statements. To do this we represent the state with two mappings:
loc: Var* -> Loc
state: Loc -> Loc ∪ Z
where Loc is a set of locations.
Informally, loc is a static function that maps variables to their locations, and state is a function that, given the location of a variable x, returns the location of a variable y if x points to y, or a number (the value of x) if x is not a pointer.
Now, if we want to get the value of a variable x, we do it in two stages. First we find the location l of x by l = loc(x). Then we find the value by applying the state s to l, i.e., computing s(l).
Having defined the new state, which we denote Pstate, we can define the new semantics.
The semantics of arithmetic expressions
The meaning of an expression depends on the values bound to the locations of the variables that occur in it. Given an arithmetic expression a and a program state s, we can determine the value of the expression by redefining the function A to be:
A : AExp -> (Pstate -> Loc ∪ Z)
A⟦x⟧s = s(loc(x))
A⟦&x⟧s = loc(x)
A⟦*a⟧s = s(A⟦a⟧s)
All other expressions stay unchanged.
Changes in structural operational semantics
The new transition relation of SOS has the form:
⟨S, s⟩ ⇒ γ, where γ is either of the form ⟨S', s'⟩ or of the form s', and s, s' belong to Pstate.
The new rules of ⇒ are:
[ass1_SOS]   ⟨x := a, s⟩ ⇒ s[loc(x) ↦ A⟦a⟧s]
[ass2_SOS]   ⟨*x := a, s⟩ ⇒ s[s(loc(x)) ↦ A⟦a⟧s]
All other rules stay unchanged.
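To make the two-stage state lookup concrete, here is a small Python sketch (illustrative, not part of the notes; the dictionaries loc and store and the helpers eval_a, assign and assign_deref are made-up names) that models Pstate as a pair of maps and implements A⟦·⟧ together with the two assignment rules.

loc = {'x': 0, 'y': 1, 'p': 2}          # static map: Var -> Loc
store = {0: 5, 1: 7, 2: None}           # state: Loc -> Loc ∪ Z (p not yet set)

def eval_a(expr, loc, store):
    """A⟦a⟧s for the PWhile arithmetic expressions used here."""
    kind = expr[0]
    if kind == 'num':                    # A⟦n⟧s = n
        return expr[1]
    if kind == 'var':                    # A⟦x⟧s = s(loc(x))
        return store[loc[expr[1]]]
    if kind == 'addr':                   # A⟦&x⟧s = loc(x)
        return loc[expr[1]]
    if kind == 'deref':                  # A⟦*x⟧s = s(A⟦x⟧s)
        return store[store[loc[expr[1]]]]
    raise ValueError(kind)

def assign(x, a, loc, store):
    """[ass1_SOS]: ⟨x := a, s⟩ ⇒ s[loc(x) ↦ A⟦a⟧s]"""
    s2 = dict(store)
    s2[loc[x]] = eval_a(a, loc, store)
    return s2

def assign_deref(x, a, loc, store):
    """[ass2_SOS]: ⟨*x := a, s⟩ ⇒ s[s(loc(x)) ↦ A⟦a⟧s]"""
    s2 = dict(store)
    s2[store[loc[x]]] = eval_a(a, loc, store)
    return s2

store = assign('p', ('addr', 'x'), loc, store)       # p := &x
store = assign_deref('p', ('num', 3), loc, store)    # *p := 3
assert store[loc['x']] == 3                          # the destructive update reached x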
Analysis Construction
Concrete Space Construction
The concrete problem we want to solve is Points-To Analysis. The concrete semantics identifies a set of maps of the form s: Loc -> Loc ∪ Z. Therefore the lattice of the collecting semantics is L = (P(Loc -> Loc ∪ Z), ⊆). The collecting semantics for each statement l in the PWhile language defines a function f_l such that:
f_l : P(Loc -> Loc ∪ Z) -> P(Loc -> Loc ∪ Z)
For example, if l = [x := a] then:
∀Y ∈ P(Loc -> Loc ∪ Z): f_l(Y) = ⟦x := a⟧Y = {s[loc(x) ↦ A⟦a⟧s] | s ∈ Y}
And in this way we define f_l for each statement l.
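As a sketch of how the collecting transfer function acts on sets of states (an illustration under the assumption that the assigned expression is a constant n; the helper name f_assign_const is made up):

loc = {'x': 0, 'y': 1}                   # static map: Var -> Loc

def f_assign_const(x, n, states):
    """f_[x := n](Y) = { s[loc(x) ↦ n] | s ∈ Y }, on sets of concrete states."""
    out = set()
    for s in states:                     # each state is a hashable tuple of (location, value) pairs
        s2 = dict(s)
        s2[loc[x]] = n
        out.add(tuple(sorted(s2.items())))
    return out

Y = {tuple(sorted({0: 5, 1: 7}.items())), tuple(sorted({0: 9, 1: 7}.items()))}
print(f_assign_const('x', 1, Y))         # both states now map loc(x) to 1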
Abstract Space Construction
We need to define an abstract space of properties such that each element of the space describes which variables may point to which other variables. Therefore the abstract space will be M = (P(Var* × Var*), ⊆), and each element m ∈ M is a set of pairs of variables. The intuitive meaning of such a set is that (x, y) ∈ m if x may point to y.
Galois Connection Construction
We have two lattices, M = (P(Var* × Var*), ⊆) and L = (P(Loc -> Loc ∪ Z), ⊆), and we want to construct a Galois connection between them.
We will do it in two stages. In the first stage we construct a function β : (Loc -> Loc ∪ Z) -> P(Var* × Var*). In the second stage we define a Galois connection using the function β.
Stage 1
The purpose of the function β is to map a value from Loc -> Loc ∪ Z to the best element in P(Var* × Var*) describing it. For each value s from Loc -> Loc ∪ Z, the best element m ∈ P(Var* × Var*) describing it is the set m such that (x, y) ∈ m ⟺ s(loc(x)) = loc(y).
Therefore we define β to be:
∀v ∈ Pstate: β(v) = {(x, y) | v(loc(x)) = loc(y)}
Stage 2
Construction of the functions α and γ using β.
We will see, for the general case, that if we have β : V -> M mapping the values of V to the properties of the complete lattice M, it gives rise to a Galois connection (P(V), α, γ, M) between P(V) and M, where the abstraction and concretization functions are defined by:
∀V' ∈ P(V): α(V') = ⊔{β(v) | v ∈ V'}
∀l ∈ M: γ(l) = {v ∈ V | β(v) ⊑ l}
Lemma
(P(V), α, γ, M) defines a Galois connection.
Proof:
In order to prove that (P(V), α, γ, M) defines a GC, it suffices to prove that:
∀V' ∈ P(V) ∀m ∈ M: α(V') ⊑ m ⟺ γ(m) ⊒ V'
Let V' ∈ P(V) and m ∈ M be arbitrary elements. Then:
α(V') ⊑ m ⟺ ⊔{β(v) | v ∈ V'} ⊑ m ⟺ ∀v ∈ V': β(v) ⊑ m ⟺ V' ⊆ γ(m)
But the order of the lattice (P(V), ⊆) is ⊆, so:
α(V') ⊑ m ⟺ V' ⊑ γ(m)
And since this holds for arbitrary V' and m:
∀V' ∈ P(V) ∀m ∈ M: α(V') ⊑ m ⟺ γ(m) ⊒ V'.
■
In our case β : (Loc -> Loc ∪ Z) -> P(Var* × Var*) and M = P(Var* × Var*), so we can construct (P(Loc -> Loc ∪ Z), α, γ, P(Var* × Var*)) where:
∀V' ∈ P(Loc -> Loc ∪ Z): α(V') = ∪{β(s) | s ∈ V'} = ∪_{s ∈ V'} {(x, y) | s(loc(x)) = loc(y)}
∀m ∈ P(Var* × Var*): γ(m) = {s ∈ Pstate | β(s) ⊑ m} = {s ∈ Pstate | β(s) ⊆ m}
  = {s ∈ Pstate | ∀(x, y) ∈ Var* × Var*: (x, y) ∈ β(s) ⟹ (x, y) ∈ m}
  = {s ∈ Pstate | ∀(x, y) ∈ Var* × Var*: s(loc(x)) = loc(y) ⟹ (x, y) ∈ m}
In addition, as we have seen in the last lemma, (P(Loc -> Loc ∪ Z), α, γ, P(Var* × Var*)) defines a GC.
Note that if s ∈ γ(m), then whenever in s the variable x points to y (i.e., s(loc(x)) = loc(y)) we have (x, y) ∈ m, but not vice versa.
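A small illustrative sketch of β, α and membership in γ for the points-to abstraction (names such as beta, alpha and gamma_contains are assumptions of the sketch; we also assume integer values never collide with locations):

loc = {'p': 0, 'q': 1, 'x': 2}                 # Var -> Loc
rev = {l: v for v, l in loc.items()}           # Loc -> Var (assuming loc is injective)

def beta(s):
    """β(s) = {(x, y) | s(loc(x)) = loc(y)}"""
    return {(x, rev[s[l]]) for x, l in loc.items() if s.get(l) in rev}

def alpha(states):
    """α(Y) = ∪{β(s) | s ∈ Y}"""
    out = set()
    for s in states:
        out |= beta(s)
    return out

def gamma_contains(s, m):
    """s ∈ γ(m) iff β(s) ⊆ m (γ(m) itself is an infinite set, so we only test membership)."""
    return beta(s) <= m

s1 = {0: loc['q'], 1: 7, 2: 3}                 # p points to q; q and x hold plain numbers
print(beta(s1))                                # {('p', 'q')}
print(alpha([s1]))                             # {('p', 'q')}
print(gamma_contains(s1, {('p', 'q'), ('p', 'x')}))   # True: β(s1) ⊆ m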
Now let us define the Abstract Semantics.
Abstract Semantics Construction
As we did in the concrete space, for each statement l in the PWhile language let us define a function f_l# such that:
f_l# : P(Var* × Var*) -> P(Var* × Var*)
For example, let l = [x := &y]:
∀Y ∈ P(Var* × Var*): f#_[x:=&y](Y) = (Y \ {(x, z) | (x, z) ∈ Y}) ∪ {(x, y)}
Or, for example, let l = [*x := y]:
∀Y ∈ P(Var* × Var*): f#_[*x:=y](Y) = Y ∪ {(t, z) | (x, t) ∈ Y ∧ (y, z) ∈ Y}
And in this way we define f_l# for each statement l.
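The two abstract transfer functions can be written directly as set operations. A minimal sketch (the function names f_addr and f_store are illustrative):

def f_addr(x, y, m):
    """f#_[x := &y](m): remove every pair (x, z) from m, then add (x, y)."""
    return {(a, b) for (a, b) in m if a != x} | {(x, y)}

def f_store(x, y, m):
    """f#_[*x := y](m) = m ∪ {(t, z) | (x, t) ∈ m and (y, z) ∈ m}: pairs are only added."""
    targets_of_x = {t for (a, t) in m if a == x}
    targets_of_y = {z for (b, z) in m if b == y}
    return m | {(t, z) for t in targets_of_x for z in targets_of_y}

m = {('p', 'x'), ('q', 'y')}
print(f_addr('p', 'z', m))        # {('p', 'z'), ('q', 'y')}
print(f_store('p', 'q', m))       # adds ('x', 'y'): what p points to may now point to what q points to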
Let us discuss why our analysis is sound.
Soundness Proof
Lemma 1
For each statement l in the PWhile language, f_l is a monotone function.
Proof
For each statement l in PWhile we need to prove that f_l is a monotone function. We will do it for l = [x := a].
Let Y1, Y2 ∈ P(Loc -> Loc ∪ Z) be arbitrary elements and assume that Y1 ⊑ Y2. We will prove that f_l(Y1) ⊑ f_l(Y2).
f_[x:=a](Y1) = {s[loc(x) ↦ A⟦a⟧s] | s ∈ Y1} ⊆ {s[loc(x) ↦ A⟦a⟧s] | s ∈ Y2} = f_[x:=a](Y2)
The inclusion holds since ⊆ is the order of our lattice and Y1 ⊆ Y2, so:
f_[x:=a](Y1) ⊑ f_[x:=a](Y2)
■
Lemma 2
For each statement l in the PWhile language, f_l# is a monotone function.
Proof
For each statement l in PWhile we need to prove that f_l# is a monotone function. We will do it for l = [*x := y].
Let m1, m2 ∈ P(Var* × Var*) be arbitrary elements and assume that m1 ⊑ m2. We will prove that f#_[*x:=y](m1) ⊑ f#_[*x:=y](m2).
f#_[*x:=y](m1) = m1 ∪ {(t, z) | (x, t) ∈ m1 ∧ (y, z) ∈ m1}   (*)
But m1 ⊑ m2 means m1 ⊆ m2, i.e., (a, b) ∈ m1 ⟹ (a, b) ∈ m2, so:
(*) ⊆ m1 ∪ {(t, z) | (x, t) ∈ m2 ∧ (y, z) ∈ m2}   (**)
And again, since m1 ⊆ m2:
(**) ⊆ m2 ∪ {(t, z) | (x, t) ∈ m2 ∧ (y, z) ∈ m2} = f#_[*x:=y](m2)
Hence, for arbitrary m1, m2 ∈ P(Var* × Var*):
m1 ⊑ m2 ⟹ f#_[*x:=y](m1) ⊑ f#_[*x:=y](m2)
■
Lemma 3 (Local Soundness)
∀l ∈ Stm ∀Y ∈ P(Loc -> Loc ∪ Z): α(f_l(Y)) ⊑ f_l#(α(Y))
Proof
We will prove the lemma only for l = [x := &y].
Let Y ∈ P(Loc -> Loc ∪ Z) be an arbitrary element. Then:
α(f_[x:=&y](Y)) = α({s[loc(x) ↦ loc(y)] | s ∈ Y}) = ∪_{e ∈ {s[loc(x) ↦ loc(y)] | s ∈ Y}} {(a, b) | e(loc(a)) = loc(b)}   (*)
Let us calculate f_l#(α(Y)):
α(Y) = ∪_{s ∈ Y} {(c, d) | s(loc(c)) = loc(d)}
and then:
f#_[x:=&y](α(Y)) = (α(Y) \ {(x, z) | (x, z) ∈ α(Y)}) ∪ {(x, y)}
We want to prove that α(f_l(Y)) ⊑ f_l#(α(Y)). Recall that ⊑ here is ⊆, so actually we want to prove that α(f_l(Y)) ⊆ f_l#(α(Y)). Let (t1, t2) ∈ α(f_[x:=&y](Y)) be an arbitrary element. Then, by (*):
(t1, t2) ∈ ∪_{e ∈ {s[loc(x) ↦ loc(y)] | s ∈ Y}} {(a, b) | e(loc(a)) = loc(b)}
so:
∃e ∈ {s[loc(x) ↦ loc(y)] | s ∈ Y}: e(loc(t1)) = loc(t2)   (***)
Therefore:
∃s ∈ Y: e = s[loc(x) ↦ loc(y)] ∧ e(loc(t1)) = loc(t2)   (**)
There are two possibilities: x = t1 or x ≠ t1.
1) x = t1. Then by (**) e(loc(t1)) = loc(t2) and e = s[loc(x) ↦ loc(y)], so we can conclude that e(loc(x)) = loc(t2) and e(loc(x)) = loc(y). Therefore y = t2 and
(t1, t2) = (x, y) ∈ (α(Y) \ {(x, z) | (x, z) ∈ α(Y)}) ∪ {(x, y)} = f#_[x:=&y](α(Y))
2) x ≠ t1. Then by (**)
∃s ∈ Y: e = s[loc(x) ↦ loc(y)] ∧ e(loc(t1)) = loc(t2)
so ∃s ∈ Y: s(loc(t1)) = loc(t2). Then
(t1, t2) ∈ ∪_{s ∈ Y} {(c, d) | s(loc(c)) = loc(d)} = α(Y)
and since x ≠ t1 it holds that:
(t1, t2) ∈ (α(Y) \ {(x, z) | (x, z) ∈ α(Y)}) ∪ {(x, y)} = f#_[x:=&y](α(Y))
In both cases 1) and 2) we get that:
(t1, t2) ∈ f#_[x:=&y](α(Y))
And since (t1, t2) was arbitrary, we can conclude that:
∀(t1, t2) ∈ Var* × Var*: (t1, t2) ∈ α(f_[x:=&y](Y)) ⟹ (t1, t2) ∈ f#_[x:=&y](α(Y))
This implies that α(f_l(Y)) ⊆ f_l#(α(Y)). Y was arbitrary, so:
∀Y ∈ P(Loc -> Loc ∪ Z): α(f_[x:=&y](Y)) ⊑ f#_[x:=&y](α(Y))
■
Let p be a program in PWhile with n nodes in the control flow graph of p.
Definition 1
Let us define f_p : P(Loc -> Loc ∪ Z)^n -> P(Loc -> Loc ∪ Z)^n by the output of the CI algorithm implemented with the lattice (P(Loc -> Loc ∪ Z), ⊆) and the functions f_l on the edges of the control flow graph of p.
Definition 2
Let us define f_p# : P(Var* × Var*)^n -> P(Var* × Var*)^n by the output of the CI algorithm implemented with the lattice (P(Var* × Var*), ⊆) and the functions f_l# on the edges of the control flow graph of p.
Proposition 1
f_p and f_p# are monotone functions.
Proof
Immediate from Lemma 1 and Lemma 2 and the fact that in lattices constructed by Cartesian product:
(l1, l2, ..., ln) ⊑ (l1', l2', ..., ln') iff l1 ⊑ l1' ∧ l2 ⊑ l2' ∧ ... ∧ ln ⊑ ln'
■
Let us define a GC (P(Loc -> Loc ∪ Z)^n, α', γ', P(Var* × Var*)^n) by:
∀X ∈ P(Loc -> Loc ∪ Z)^n: α'(X) = (α(X1), α(X2), ..., α(Xn))
∀Y ∈ P(Var* × Var*)^n: γ'(Y) = (γ(Y1), γ(Y2), ..., γ(Yn))
And indeed it defines a GC, because:
α'(X) ⊑ Y ⟺ ∀1 ≤ i ≤ n: α(Xi) ⊑ Yi ⟺ ∀1 ≤ i ≤ n: Xi ⊆ γ(Yi) ⟺ X ⊑ γ'(Y)
Proposition 2
∀c ∈ P(Loc -> Loc ∪ Z)^n: α'(f_p(c)) ⊑ f_p#(α'(c))
Proof
Immediate from Lemma 3 and the fact that in lattices constructed by Cartesian product:
(l1, l2, ..., ln) ⊑ (l1', l2', ..., ln') iff l1 ⊑ l1' ∧ l2 ⊑ l2' ∧ ... ∧ ln ⊑ ln'
■
From Proposition 1 and Proposition 2 and the fact that (P(Loc -> Loc ∪ Z)^n, α', γ', P(Var* × Var*)^n) defines a GC, using the Soundness Theorem (2) it follows that:
α'(lfp(f_p)) ⊑ lfp(f_p#)
I.e., our analysis is sound.
Example
Let p be the following program:
t := &a;
y := &b;
z := &c;
if x > 0
then p:= &y;
else p:= &z;
*p := t;
In order to demonstrate the run of CI on this program, we will construct its control flow graph.
Control flow graph 1 (nodes 1–8; the statements label the nodes and each edge carries its abstract transfer function):
1 -> 2, t := &a:   λm. (m \ {(t, u) | (t, u) ∈ m}) ∪ {(t, a)}
2 -> 3, y := &b:   λm. (m \ {(y, u) | (y, u) ∈ m}) ∪ {(y, b)}
3 -> 4, z := &c:   λm. (m \ {(z, u) | (z, u) ∈ m}) ∪ {(z, c)}
4 -> 5 and 4 -> 6, the two branches of if (x > 0):   λm. m
5 -> 7, p := &z:   λm. (m \ {(p, u) | (p, u) ∈ m}) ∪ {(p, z)}
6 -> 7, p := &y:   λm. (m \ {(p, u) | (p, u) ∈ m}) ∪ {(p, y)}
7 -> 8, *p := t:   λm. m ∪ {(u, w) | (p, u) ∈ m ∧ (t, w) ∈ m}
8: exit
Now we want to calculate f_p# for the above control flow graph.
The CI algorithm, which computes f_p#, terminates whenever the lattice has finite height. Recall that the lattice of the abstract space is (P(Var* × Var*)^n, ⊑), and it does have finite height: by the lemma which says that a partially ordered set has finite height iff it satisfies both the Ascending and the Descending Chain Condition, (P(Var* × Var*)^n, ⊑) has finite height iff (P(Var* × Var*), ⊆) has finite height. Using the same lemma again, (P(Var* × Var*), ⊆) has finite height iff it satisfies both chain conditions, and it does, because the number of variables in the program is finite. Therefore (P(Var* × Var*), ⊆) has finite height, and therefore the calculation of f_p# terminates.
Calculation of f_p# (the left column is the work list at the start of each iteration, the right column the Dfentry entries computed):

Work list      Dfentry[v]
[1]            all entries initialized to Φ
[2]            Dfentry[2] = {(t,a)}
[3]            Dfentry[3] = {(y,b), (t,a)}
[4]            Dfentry[4] = {(z,c), (y,b), (t,a)}
[5],[6]        Dfentry[5] = {(z,c), (y,b), (t,a)},  Dfentry[6] = {(z,c), (y,b), (t,a)}
[6],[7]        Dfentry[7] = {(p,z), (y,b), (z,c), (t,a)}
[7]            Dfentry[7] = {(p,y), (p,z), (y,b), (z,c), (t,a)}
[8]            Dfentry[8] = {(y,a), (z,a), (p,y), (p,z), (y,b), (z,c), (t,a)}
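The table above can be reproduced with a small worklist implementation. The following sketch is illustrative, not the course's reference implementation; the node numbers and edge functions follow control flow graph 1 as listed above.

def kill_then_gen(x, y):
    """Edge function for x := &y: remove all pairs (x, _) and add (x, y)."""
    return lambda m: frozenset({(a, b) for (a, b) in m if a != x} | {(x, y)})

def store(x, y):
    """Edge function for *x := y: add (t, z) for every (x, t) and (y, z) in m."""
    return lambda m: m | frozenset(
        {(t, z) for (a, t) in m if a == x for (b, z) in m if b == y})

ident = lambda m: m

edges = {                       # (u, v) -> transfer function, as in control graph 1
    (1, 2): kill_then_gen('t', 'a'),
    (2, 3): kill_then_gen('y', 'b'),
    (3, 4): kill_then_gen('z', 'c'),
    (4, 5): ident, (4, 6): ident,          # both branches of if (x > 0)
    (5, 7): kill_then_gen('p', 'z'),
    (6, 7): kill_then_gen('p', 'y'),
    (7, 8): store('p', 't'),
}

dfentry = {v: frozenset() for v in range(1, 9)}
wl = [1]
while wl:
    u = wl.pop(0)
    for (a, v), f in edges.items():
        if a != u:
            continue
        new = dfentry[v] | f(dfentry[u])
        if new != dfentry[v]:
            dfentry[v] = new
            wl.append(v)

print(sorted(dfentry[8]))
# [('p','y'), ('p','z'), ('t','a'), ('y','a'), ('y','b'), ('z','a'), ('z','c')]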
Optimization on the CI Algorithm
Real-world applications are very big: there are applications with more than 1,000,000 lines of code, with many variables and a huge control flow graph. Our lattice is then still finite, but its height is very large. Therefore, in order to run the algorithm on a big program, we introduce an optimization.
When we use the CI algorithm to solve Points-To Analysis we can change the input of the algorithm as follows: if the input to CI is G(V, E), s: node, f: E -> (L -> L), we translate it to another input G(V', E'), s': node, f': E' -> (L -> L) such that:
1. V' contains one node, denote it s'
2. s' is also the initial node
3. for each e in E, add e' to E' such that e' is an edge from s' to s' and f'(e') = f(e)
Intuitively, the graph we build represents a program with all the statements of the original program, but in arbitrary order. For example, if the statements of the original program are s1, s2, s3, ..., sn, then the graph represents any program with those statements in some arbitrary order.
Example
Suppose the program is s1; s2; ...; sn. Its control flow graph is a chain of nodes S1 -> S2 -> ... -> Sn with edge functions fs1#, fs2#, ..., fsn#; it is translated to a graph with a single node S' that carries n self-loop edges labelled fs1#, fs2#, ..., fsn#.
In the worst case the time complexity of the run is not better, but in practice it can be computed in almost linear time. The worst-case space complexity, however, is better, because we need to store only one Dfentry entry.
In real-world applications we can implement elements of the lattice as a graph whose nodes are the variables: we start with no edges, and to add a pair (x, y) to the set of pairs the graph represents, we simply add an edge from x to y.
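A sketch of this optimized, flow-insensitive variant: every statement's transfer function becomes a self-loop and is applied repeatedly to a single points-to element, represented as the edge set of a graph over the variables, until a fixpoint is reached (the names are illustrative). Its result matches the Run Example below.

def addr(x, y):
    """Transfer for x := &y (the same f as in the original graph, including the kill)."""
    return lambda m: {(a, b) for (a, b) in m if a != x} | {(x, y)}

def store(x, y):
    """Transfer for *x := y."""
    return lambda m: m | {(t, z) for (a, t) in m if a == x
                                 for (b, z) in m if b == y}

# The non-identity edges of control graph 1 as self-loops on the single node s'
# (the two identity edges of the if change nothing).
self_loops = [addr('t', 'a'), addr('y', 'b'), addr('z', 'c'),
              addr('p', 'z'), addr('p', 'y'), store('p', 't')]

m = set()                      # the single Dfentry; as a graph: edge set over the variables
changed = True
while changed:
    changed = False
    for f in self_loops:
        new = m | f(m)         # Dfentry := Dfentry ⊔ f(Dfentry); kills never take effect here
        if new != m:
            m, changed = new, True

print(sorted(m))
# [('p','y'), ('p','z'), ('t','a'), ('y','a'), ('y','b'), ('z','a'), ('z','c')]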
Run Example
Suppose we run the optimization above on control graph 1, applying the edge functions in the original program order. The set of pairs represented by the points-to graph at each step is:
Step 1 (initial graph, no edges): represents Φ
Step 2 (after t := &a): represents {(t,a)}
Step 3 (after y := &b): represents {(t,a), (y,b)}
Step 4 (after z := &c): represents {(t,a), (y,b), (z,c)}
Step 5 (after p := &z): represents {(t,a), (y,b), (z,c), (p,z)}
Step 6 (after p := &y): represents {(t,a), (y,b), (z,c), (p,z), (p,y)}
Step 7 (after *p := t): represents {(t,a), (y,b), (z,c), (p,z), (p,y), (y,a), (z,a)}
We see that the expensive operation is
f#_[*x:=y](m) = m ∪ {(t, z) | (x, t) ∈ m ∧ (y, z) ∈ m}
Remark
The analysis as we introduced it is not the best transformer, because of the definition of f#_[*x:=y]. By that definition, if for example x points to t, y points to z and t points to w, then after the statement *x := y we record both that t points to w and that t points to z, because we only add the pair (t, z) and never remove any pair. We can improve this in the original version of the algorithm (in certain cases), but in the optimized version we cannot: we have lost the control-flow information, so we cannot remove elements on some paths while adding elements on others, and an assignment can only add elements without removing any. So in the optimized version this is the best we can get.
In this part we will discuss how to analyze lattices of infinite height. Recall the CI algorithm:
Chaotic(G(V, E): Graph, s: Node, L: Lattice, i: L, f: E -> (L -> L)) {
    for each v in V do dfentry[v] := ⊥
    dfentry[s] := i
    WL := {s}
    while (WL ≠ ∅) do
        select and remove an element u ∈ WL
        for each v such that e = (u, v) ∈ E do
            temp := f(e)(dfentry[u])
            new := dfentry[v] ⊔ temp            (*)
            if (new ≠ dfentry[v]) then          (**)
                dfentry[v] := new
                WL := WL ∪ {v}
}
This algorithm satisfies that at the point (**) it holds that dfentry[v] ⊑ new. Moreover, if we denote the value of dfentry in the i-th iteration by dfentry^i, we get:
dfentry^0 ⊑ dfentry^1 ⊑ ... ⊑ dfentry^n ⊑ ...
If the lattice has finite height, the sequence eventually stabilizes. Nevertheless, if the lattice has infinite height it is possible that the sequence will never stabilize.
Example
The lattice (Interval, ⊑) of intervals over Z may be described as follows. The elements we work with are intervals:
Interval = {[z1, z2] | z1 ≤ z2, z1 ∈ Z ∪ {-∞}, z2 ∈ Z ∪ {∞}} ∪ {⊥}
The order is defined as follows:
Stage 1:
We expand the order relation ≤ defined over Z to ≤' defined over Z ∪ {-∞, ∞} such that:
∀a, b ∈ Z: a ≤' b iff a ≤ b
∀a ∈ Z: -∞ ≤' a and a ≤' ∞
-∞ ≤' ∞
For the sake of simplicity we will use ≤ instead of ≤' for the set Z ∪ {-∞, ∞}.
Stage 2:
We will use int to denote elements of Interval. Intuitively, if int1 and int2 are two intervals, then int1 ⊑ int2 if int1 is contained in int2. To give the formal definition of ⊑ we define two functions:
inf: Interval -> Z ∪ {-∞, ∞}
sup: Interval -> Z ∪ {-∞, ∞}
as follows:
inf(int) = ∞ if int = ⊥, and inf(int) = z1 if int = [z1, z2]
sup(int) = -∞ if int = ⊥, and sup(int) = z2 if int = [z1, z2]
Now we can define the order:
int1 ⊑ int2 iff inf(int1) ≥ inf(int2) and sup(int1) ≤ sup(int2)
The partial ordering on Interval is depicted in a diagram (not reproduced here).
Proposition:
(Interval, ⊑) is a lattice.
Proof
The proof is on page 220 in the book. Here we only define the join operator, without proving that it really is the join operator (recall that once a join operator is defined there is a construction for the meet operator). The least upper bound of a set of intervals is the minimal interval which contains all the intervals in the set. For example:
⊔{[1,2], [4,5]} = [1,5]
To define the operator formally we need two auxiliary functions, inf', sup': P(Z ∪ {-∞, ∞}) -> Z ∪ {-∞, ∞}, defined as follows:
inf'(Y) = z if z = min{a | a ∈ Y} exists, and inf'(Y) = -∞ otherwise
sup'(Y) = z if z = max{a | a ∈ Y} exists, and sup'(Y) = ∞ otherwise
Examples:
1. inf'({1,2,...}) = 1,  sup'({1,2,...}) = ∞
2. inf'({...,-2,-1,0}) = -∞,  sup'({...,-2,-1,0}) = 0
Now define ⊔:
∀Y ∈ P(Interval):
⊔Y = ⊥ if Y ⊆ {⊥}, and ⊔Y = [inf'{inf(int) | int ∈ Y}, sup'{sup(int) | int ∈ Y}] otherwise
For example:
1. ⊔{[1,1],[2,2],[3,3],...} = [1, ∞]
2. ⊔{...,[-2,-2],[-1,-1]} = [-∞, -1]
Let us define a GC between the lattices (Interval, ⊑) and (P(Z), ⊆).
Definition of (P(Z), α, γ, Interval):
γ: Interval -> P(Z)
γ(int) = {z ∈ Z | inf(int) ≤ z ≤ sup(int)}
Thus, for an interval, γ returns all the numbers contained in the interval.
Example:
1. γ([0,3]) = {0,1,2,3}
2. γ([0, ∞]) = {z ∈ Z | z ≥ 0}
α: P(Z) -> Interval
∀Y ∈ P(Z): α(Y) = ⊥ if Y = ∅, and α(Y) = [inf'(Y), sup'(Y)] otherwise
Thus α(Y) is the smallest interval that includes all the elements of the set Y.
For example:
1. α({0,1,3}) = [0,3]
2. α({2z | z > 0}) = [2, ∞]
Lemma
(P(Z), α, γ, Interval) is a Galois connection.
Proof
The proof is on page 233 in the book.
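A minimal sketch of the interval lattice and this Galois connection, assuming intervals are encoded as None for ⊥ or a pair (lo, hi) with lo, hi in Z ∪ {-∞, ∞} (this encoding and the function names are assumptions of the sketch):

NEG, POS = float('-inf'), float('inf')

def leq(i1, i2):
    """i1 ⊑ i2 iff inf(i1) ≥ inf(i2) and sup(i1) ≤ sup(i2); ⊥ (None) is below everything."""
    if i1 is None:
        return True
    if i2 is None:
        return False
    return i2[0] <= i1[0] and i1[1] <= i2[1]

def join(intervals):
    """⊔Y: the least interval containing every interval of Y."""
    ints = [i for i in intervals if i is not None]
    if not ints:
        return None
    return (min(i[0] for i in ints), max(i[1] for i in ints))

def alpha(Y):
    """α(Y) for a finite Y ⊆ Z: ⊥ if Y is empty, otherwise [min Y, max Y]."""
    Y = list(Y)
    return None if not Y else (min(Y), max(Y))

def in_gamma(z, interval):
    """z ∈ γ(interval) iff inf(interval) ≤ z ≤ sup(interval); γ itself may be an infinite set."""
    return interval is not None and interval[0] <= z <= interval[1]

assert join([(1, 2), (4, 5)]) == (1, 5)            # ⊔{[1,2],[4,5]} = [1,5]
assert join([(0, POS), (NEG, -1)]) == (NEG, POS)   # unbounded ends behave as -∞ / ∞
assert alpha({0, 1, 3}) == (0, 3)                  # α({0,1,3}) = [0,3]
assert in_gamma(2, (0, 3)) and not in_gamma(5, (0, 3))
assert leq((1, 2), (0, 3)) and not leq((0, 3), (1, 2))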
The Join-Over-All-Paths (JOP)
We now introduce another solution for the analysis, different from the CI algorithm. In the analysis as we introduced it before, it is difficult to know whether our answer is "good" enough. All we know is that the solution we got is conservative: at each program point l it approximates every state in the set of states that can occur at l. Even when we have a solution whose concretization contains the real solution, it can be the case that it is very far from the real solution.
The new solution, the join-over-all-paths (JOP) solution, gives us a tool to measure whether our solution is the best we can get. In a sense, the JOP solution defines the "best" solution obtainable with the given transfer functions.
The Join-Over-All-Paths (JOP) solution
Let G(V, E): Graph, s: Node, τ: L, g: E -> (L -> L) denote an input to the analysis, as in the CI algorithm.
Definition 1
For each v ∈ V we define:
path(v) = {[v1, v2, ..., v(n-1)] | n ≥ 1 and ∀i < n: (vi, v(i+1)) ∈ E and vn = v and v1 = s}
path(v) is the set of all paths from the initial node s up to node v.
Definition 2
We define id: L -> L to be the identity function λl. l.
For the next definition, recall that g: E -> (L -> L) is the function that for each e ∈ E returns a monotone function fe: L -> L. In addition, recall that each m ∈ path(v) has the form m = [v1, v2, ..., v(n-1)].
Definition 3
Let m ∈ path(v). We define f_m : L -> L by:
f_m(l) = f_e(n-1)(f_e(n-2)(...(f_e1(id(l)))...))
where m = [v1, v2, ..., v(n-1)] and ei = (vi, v(i+1)) for 1 ≤ i < n.
The meaning of f_m is: for an initial element l ∈ L and a path m from the initial node s to a node v, we compute f_m(l) by composing the transfer functions associated with the individual edges along the path m. In addition, we compose all the transfer functions on the path m with the id function, because for the empty path we want f_[] = id.
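Definition 3 is just function composition along the edges of the path. A tiny illustrative sketch (the graph g and the numeric transfer functions are made up for the example):

def f_path(path, v, g, l):
    """f_m(l) = f_e(n-1)(... f_e1(id(l)) ...) for m = [v1, ..., v(n-1)] and vn = v."""
    nodes = list(path) + [v]
    for i in range(len(nodes) - 1):        # ei = (vi, v(i+1))
        l = g[(nodes[i], nodes[i + 1])](l)
    return l                               # the empty path [] gives id(l) = l

g = {(1, 2): lambda l: l + 1, (2, 3): lambda l: l * 10}
print(f_path([1, 2], 3, g, 5))             # (5 + 1) * 10 = 60
print(f_path([], 1, g, 5))                 # 5, since f_[] = id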
Example 1
Consider the constant propagation analysis. Recall that we defined the Galois connection (P(Var* -> Z), α, γ, Var* -> Z ∪ {⊥, ⊤}) where:
∀Y ∈ P(Var* -> Z): α(Y) = ⊔{σ | σ ∈ Y}
∀a ∈ Var* -> Z ∪ {⊥, ⊤}: γ(a) = {σ | σ ⊑ a}
Consider the following program:
while( x!=3 )
if( x = 1) then
x := 1;
else
x := 2;
end if
end while
Its control flow graph has the nodes 1: while (x != 3), 2: if (x = 1), 3: x := 1, 4: x := 2, 5: end if, 6: exit, with the initial value [x -> 0] at node 1 and the following edge functions:
(1, 2), test x != 3:   λe. if e(x) ≠ 3 then e else ⊥
(1, 6), exit x = 3:    λe. e ⊓ [x ↦ 3]
(2, 3), then branch:   λe. e[x ↦ 1]
(2, 4), else branch:   λe. if e(x) ≠ 1 then e else ⊥
(3, 5), x := 1:        λe. e[x ↦ 1]
(4, 5), x := 2:        λe. e[x ↦ 2]
(5, 1), end if:        λe. e
path(6) = {[1], [1,2,3,5,1], [1,2,4,5,1], ...}
f_[1]([x ↦ 0]) = (λe. e ⊓ [x ↦ 3])(id([x ↦ 0])) = (λe. e ⊓ [x ↦ 3])([x ↦ 0]) = [x ↦ 0] ⊓ [x ↦ 3] = ⊥
f_[1,2,3,5,1]([x ↦ 0])
= (λe. e ⊓ [x ↦ 3]) ∘ (λe. e) ∘ (λe. e[x ↦ 1]) ∘ (λe. e[x ↦ 1]) ∘ (λe. if e(x) ≠ 3 then e else ⊥) ∘ id ([x ↦ 0])
= (λe. e ⊓ [x ↦ 3]) ∘ (λe. e) ∘ (λe. e[x ↦ 1]) ∘ (λe. e[x ↦ 1]) ([x ↦ 0])
= (λe. e ⊓ [x ↦ 3]) ∘ (λe. e) ∘ (λe. e[x ↦ 1]) ([x ↦ 1])
= (λe. e ⊓ [x ↦ 3]) ∘ (λe. e) ([x ↦ 1])
= (λe. e ⊓ [x ↦ 3]) ([x ↦ 1]) = [x ↦ 1] ⊓ [x ↦ 3] = ⊥
End of example 1.
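Example 1 can be replayed mechanically. The sketch below is illustrative: the abstract state is reduced to the value of x alone, with None standing for ⊥ and 'top' for ⊤, and the edge functions follow the list above.

BOT, TOP = None, 'top'                     # ⊥ and ⊤ for the value of x

def meet_const(e, n):
    """λe. e ⊓ [x ↦ n]"""
    if e == BOT:
        return BOT
    return n if e in (TOP, n) else BOT

def filter_ne(e, n):
    """λe. if e(x) ≠ n then e else ⊥ (a ⊤ value is kept, since x might differ from n)."""
    return BOT if e in (BOT, n) else e

def const(e, n):
    """λe. e[x ↦ n]"""
    return BOT if e == BOT else n

g = {(1, 2): lambda e: filter_ne(e, 3),    # loop test, x != 3
     (1, 6): lambda e: meet_const(e, 3),   # loop exit, x = 3
     (2, 3): lambda e: const(e, 1),        # then branch, x = 1
     (2, 4): lambda e: filter_ne(e, 1),    # else branch, x != 1
     (3, 5): lambda e: const(e, 1),        # x := 1
     (4, 5): lambda e: const(e, 2),        # x := 2
     (5, 1): lambda e: e}                  # end if

def f_path(path, v, l):
    nodes = list(path) + [v]
    for i in range(len(nodes) - 1):
        l = g[(nodes[i], nodes[i + 1])](l)
    return l

print(f_path([1], 6, 0))                   # None, i.e. ⊥ — as computed for f_[1]
print(f_path([1, 2, 3, 5, 1], 6, 0))       # None, i.e. ⊥ — as computed for f_[1,2,3,5,1]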
Definition 4
Let (L, ⊑) be a lattice and let G(V, E): Graph, s: Node, τ: L, g: E -> (L -> L) be an input to the analysis. Suppose V = {v1, v2, ..., vn}.
Then the join-over-all-paths analysis using lattice L is J_n ∈ L^n such that J_n = (JOP(v1), JOP(v2), ..., JOP(vn)), where JOP(vi) is defined by:
∀i ∈ {1, 2, ..., n}: JOP(vi) = ⊔_{m ∈ path(vi)} f_m(τ)
The meaning of JOP(vi) is: for an initial value τ, compute the effect of each path from the start of the program to node vi by composing the transfer functions along the path (f_m) and evaluating the composed function on the initial value τ; then join the effects of all paths to vi, and this is JOP(vi).
Example 2
In Example 1 we saw that f_[1]([x ↦ 0]) = ⊥ and f_[1,2,3,5,1]([x ↦ 0]) = ⊥.
In the same way we can calculate, for each m ∈ path(6), that f_m([x ↦ 0]) = ⊥.
Therefore JOP(6) = ⊔_{m ∈ path(6)} f_m([x ↦ 0]) = ⊥.
End of Example 2
We can see that the meaning of the value JOP(6) = ⊥ is that there is no path from the start node to node 6 whose effect on the initial value [x ↦ 0] gives a value greater than ⊥; so the analysis determines that the while loop does not terminate.
Theorem 1
The join-over-all-paths analysis for Constant Propagation is undecidable.
The proof is in the book at page 77.
Recall that if we have a lattice (L, ⊑) and G(V, E): Graph, s: Node, τ: L, g: E -> (L -> L) (g(E) is a set of monotone functions) is an input to the analysis, then when L satisfies the ascending chain condition, lfp(f#) is always computable. Under the same conditions, by Theorem 1, the join-over-all-paths analysis is not always computable.
Example 3
Let us calculate df_entry(6) using the CI algorithm on the graph from Example 1, with dfentry[1] initialized to [x -> 0]:

Selected node        dfentry updated
1                    dfentry[2] = [x -> 0]
2                    dfentry[4] = [x -> 0]
4                    dfentry[5] = [x -> 2]
5                    dfentry[1] = [x -> top]
1                    dfentry[6] = [x -> 3], dfentry[2] = [x -> top]
(worklist: 6, 2)     ...

At the end, df_entry(6) = [x -> 3] and df_entry(2) = [x -> top].
We see that at the end JOP(6) = ⊥ ⊏ df_entry(6), so JOP(6) is strictly better than df_entry(6). In this case the least fixpoint analysis fails to discover that the while loop does not terminate.
Definition 5
A function f : L1 -> L2 between two lattices L1 = (L1, ⊑) and L2 = (L2, ⊑) is completely additive if:
∀Y ⊆ L1: f(⊔_{L1} Y) = ⊔_{L2} {f(l) | l ∈ Y}.
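Definition 5 can be checked by brute force on a small finite lattice. The sketch below is only an illustration: it takes L = (P({1, 2}), ⊆), whose join is set union, and verifies that the join function ⊔_L : P(L) -> L is completely additive, which is exactly the example treated next.

from itertools import combinations

def powerset(elems):
    """All subsets of elems, as frozensets."""
    s = list(elems)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# L = (P({1, 2}), ⊆); its join is union and its bottom is the empty set.
L = powerset({1, 2})

def join_L(ys):
    """⊔_L of a collection of elements of L (sets), i.e. their union."""
    out = frozenset()
    for y in ys:
        out |= y
    return out

# The function under test: f = ⊔_L : P(L) -> L, f(Y) = join of the elements of Y.
# Complete additivity: for every family F ⊆ P(L),
#   f(⊔_{P(L)} F) = ⊔_L { f(Y) | Y ∈ F },  where ⊔_{P(L)} F = ∪F.
for family in powerset(powerset(L)):          # every family F ⊆ P(L)
    union_of_family = frozenset().union(*family)
    lhs = join_L(union_of_family)
    rhs = join_L(join_L(Y) for Y in family)
    assert lhs == rhs
print("join_L is completely additive on this lattice")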
Example of a completely additive function
The join function ⊔_L : P(L) -> L is completely additive. To see this, let Y ∈ P(P(L)), say Y = {l1, l2, ..., ln} where li ∈ P(L). We need to see that:
⊔_L (⊔_{P(L)} {l1, l2, ..., ln}) = ⊔_L {⊔_L l1, ⊔_L l2, ..., ⊔_L ln}
But ⊔_{P(L)} {l1, l2, ..., ln} = ∪_{i ∈ {1,...,n}} li, so we need to see that:
⊔_L (∪_{i ∈ {1,...,n}} li) = ⊔_L {⊔_L l1, ⊔_L l2, ..., ⊔_L ln}
Let us denote ⊔_L (∪_{i ∈ {1,...,n}} li) = a1 and ⊔_L {⊔_L l1, ⊔_L l2, ..., ⊔_L ln} = a2. We need to see that a1 = a2.
Stage 1
We show that a1 ⊑ a2. Since ⊔_L {⊔_L l1, ..., ⊔_L ln} = a2, for each 1 ≤ i ≤ n we have ⊔_L li ⊑ a2. Let i be an arbitrary number between 1 and n; then for each l ∈ li, l ⊑ a2. Therefore, for each i with 1 ≤ i ≤ n and for each l ∈ li it holds that l ⊑ a2, so ⊔_L (∪_{i ∈ {1,...,n}} li) ⊑ a2, and we get a1 ⊑ a2.
Stage 2
We show that a2 ⊑ a1. Since ⊔_L (∪_{i ∈ {1,...,n}} li) = a1, for each l ∈ ∪_{i ∈ {1,...,n}} li it holds that l ⊑ a1. Let i be an arbitrary number between 1 and n; then for each l ∈ li, l ⊑ a1, therefore ⊔_L li ⊑ a1. Since i is arbitrary, for each i with 1 ≤ i ≤ n it holds that ⊔_L li ⊑ a1, so ⊔_L {⊔_L l1, ..., ⊔_L ln} ⊑ a1, and then a2 ⊑ a1.
Therefore a1 = a2, i.e.
⊔_L (⊔_{P(L)} {l1, l2, ..., ln}) = ⊔_L {⊔_L l1, ⊔_L l2, ..., ⊔_L ln}
so ⊔_L : P(L) -> L is a completely additive function. In the case that Y is the empty set the claim is trivial.
Example of a non-completely additive function
Consider the function f#_[x := a] : (Var* -> Z ∪ {⊥, ⊤}) -> (Var* -> Z ∪ {⊥, ⊤}) of the Constant Propagation analysis. Recall that f#_[x := a](σ) = σ[x ↦ ⟦a⟧# σ], and consider σ1 = [x ↦ 2, z ↦ 3] and σ2 = [x ↦ 3, z ↦ 2]. Then:
f#_[x := x+z](σ1 ⊔ σ2) = f#_[x := x+z]([x ↦ 2, z ↦ 3] ⊔ [x ↦ 3, z ↦ 2]) = f#_[x := x+z]([x ↦ ⊤, z ↦ ⊤]) = [x ↦ ⊤, z ↦ ⊤]
On the other side:
f#_[x := x+z](σ1) ⊔ f#_[x := x+z](σ2) = f#_[x := x+z]([x ↦ 2, z ↦ 3]) ⊔ f#_[x := x+z]([x ↦ 3, z ↦ 2]) = [x ↦ 5, z ↦ 3] ⊔ [x ↦ 5, z ↦ 2] = [x ↦ 5, z ↦ ⊤]
Therefore f#_[x := x+z](σ1 ⊔ σ2) ≠ f#_[x := x+z](σ1) ⊔ f#_[x := x+z](σ2).
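The counterexample can be checked numerically with a short sketch (illustrative; abstract states are dicts mapping variable names to an int or 'top'):

TOP = 'top'

def join(s1, s2):
    """Pointwise join of two abstract states (assumed to have the same variables)."""
    return {v: (s1[v] if s1[v] == s2[v] else TOP) for v in s1}

def add(a, b):
    """Abstract addition: any ⊤ operand makes the result ⊤."""
    return TOP if TOP in (a, b) else a + b

def f_assign_x_plus_z(s):
    """f#_[x := x + z](σ) = σ[x ↦ ⟦x + z⟧# σ]"""
    s2 = dict(s)
    s2['x'] = add(s['x'], s['z'])
    return s2

s1 = {'x': 2, 'z': 3}
s2 = {'x': 3, 'z': 2}

lhs = f_assign_x_plus_z(join(s1, s2))                       # [x ↦ ⊤, z ↦ ⊤]
rhs = join(f_assign_x_plus_z(s1), f_assign_x_plus_z(s2))    # [x ↦ 5, z ↦ ⊤]
print(lhs, rhs, lhs == rhs)                                 # the two sides differ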
Theorem 2
Let (L, ⊑) be a lattice and let G(V, E): Graph, s: Node, τ: L, g: E -> (L -> L) (g(E) is a set of monotone functions) be an input to the analysis, and let J_n = (JOP(v1), JOP(v2), ..., JOP(vn)) and CI_n = (dfentry(v1), dfentry(v2), ..., dfentry(vn)) be the outputs of the join-over-all-paths analysis and the least fixpoint analysis respectively. Then:
J_n ⊑ CI_n
and, in addition, if for every e ∈ E the function g(e): L -> L is completely additive, then:
J_n = CI_n
The proof is in the book at page 78.
It is sometimes stated that the join over all paths solution is the desired solution and
that one only uses the least fixpoint analysis because the join over all paths solution
might not be computable. In order to validate this belief we would need to prove that
the join over all paths solution is semantically correct (we don’t do this in these
notes).
But we can use the join-over-all-paths solution as a yardstick, to estimate how good the result of the least fixpoint analysis is.