Numerical Abstract Interpretation

Numeric Abstract Domains
Mooly Sagiv
http://www.cs.tau.ac.il/~msagiv/courses/pa16.html
Tel Aviv University
640-6706
Adapted from Antoine Mine
1
Subjects
Goals
Infer inductive invariants on numeric values
 Abstract sets of points in P(Rn)
 Applications:

– Array bound
– Termination
» infer ranking functions with value in N
– Cost Analysis
» time, memory consumption are numeric quantities
– Pointer analysis with pointer arithmetic
» pointer  offest
– String analysis in C
» Length, index
Numeric Semantics
Arithmetic Expressions & Commands
::= V
V  Var|
| - <exp>
| <exp> op <exp>
op {+, -, , /}|
| [c, c’]
c, c’  R  {-, }
 <com> ::= V := <exp>
V  Var
| assume <exp> relop 0
| assert <exp> relop 0
relop {=, , <,>, , }
 Control Flow Graph G(N, E, s) where ENN is
annotated with commands
 <exp>
– s  N is the start node
Example Program
1
1: X := [1, 10] ;
2: Y := 100;
X := [1, 10]
2
while 3: X>= 0 do {
4: X := X – 1;
Y := [100, 100]
assume X<0
3
5: Y := Y + 10
}
6
assume X0
Y := Y +10
4
6:
X := X -1
5
Concrete Operational Semantics
Semantics of Expressions
 = Var  R
 Semantics E
: <exp>    R
E V = V
 E c, c’  = { x R | c  x  c’}
 E -<exp>  = { - x | x  E <exp>  }
 E <exp> op <exp’> ={x op x’ | x 
E <exp> ,
x’
E <exp’> }
op  {+, -, }
 E <exp> / <exp’> =
 States
Semantics of Commands
 = Var  R
 Semantics C
: <com>  P()  P()
 C V := <exp> Z = { [V
x] | Z,
x E <exp> 
}
 C assume <exp> relop 0 Z ={| Z,
x E <exp> : x relop 0 }
 C assert <exp> relop 0 Z
 States
Distributivity
C
exp is distributive
 C exp ( Z) =  Z C exp {}
Concrete Semantics of Programs

G(s, N, E)
: P() N  P()
– The set of reachable states
– D = <P(), , , , , )
 The
smallest simultaneous solution to the set of
equations G(s, N, E) 
CSs = 
CSn = <m, c, n>E C c
s
CSm
n
defined from Tarski’s theorem but not
computable
 Uniquely
Numeric Abstract Domains


Representation: a set D# of representable abstract values
<D#, #, #, #, #, #)
– relating the amount of information given by abstract values


A concretization function
: D#  D = P() = P(VarR)
Required algebraic properties:
–  need to be monotonic: d # d’   d #  d’
– Strictness  # = 
–  # = Var  R

 need not be one-to-one
Numeric Abstract Domain Examples
y
y
x
y
x
y
x
x
signs
intervals
octagons
x 0
polyhedra
x [a, b]
x y c
ai xi c
Requirements on abstract operators

Algorithmic requirements
– For each c  <cmd>, c# c : D# D# is computable
– Algorithm for #
» Used for merging control paths and iterations
– Algorithm for
» Used for assume
– Algorithm for #
» Used for checking termination
Abstract Semantics of Programs

G(s, N, E)
: D# N  D#
– The set of reachable abstract states
– D# = <D#, #, #, #, #, #)
 The
smallest simultaneous solution to the set of
equations G(s, N, E) # #
ASs = #
 Uniquely
ASn =
s
#
<m, c, n>E C
#
c ASm
defined from Tarski’s theorem
n
Soundness
 The
smallest simultaneous solution to the set of
equations G(s, N, E) 
– CS
CSs = 
CSn =
 Any
#
c
CSm
solution AS set of equations
ns
G(s, N, E)
ASs = #
ASn =
s
 CS
<m, c, n>E C
#
<m, c, n>E C
 AS for all n  N
#
c ASm
n
#
Soundness requirement

  #
 For each c  <cmd>, d D#, cc( d)   (c #
c d)
Set of states
command c
C c 
Set of states

D#
 Set of states

command c
C#c 
D#
Optimality (induced operation)
existence of abstraction : D D# such
that <, > form a Galois connection
 Define c c # = d.  (c c ( d)
  may not exist
 c c # may be hard to compute
 Requires
Widening
 Accelerate
the termination of Chaotic iterations by
computing a more conservative solution
 Can handle lattices of infinite heights
  : D#  D#  D# such that
– d # d’ d  d’
– For every increasing chain d#1
d#2
…,
» The sequence s0 = d#0 and si+1 = si  d#i is finite
Chaotic Iterations with widening
for each n in N do AS[v] := #
AS[s] = #
WL = {s}
while (WL   ) do
select and remove an element m  WL
for each n, such that. (m, c, n) E do
temp = cc# AS[m]
if m is a loop header
then new := AS(n)  temp
else new := AS(n) # temp
if (new  AS[n]) then
AS[n] := new;
WL := WL {n}
Non-Relational Abstractions
Cartezian Abstraction
(independent attribute)
 Forget
the relationship between variables
Example Program
1
X := [1, 10]
2
Y := [100, 100]
assume X<0
3
assume X0
Y := Y +10
4
X := X -1
5
The Interval Domain
The Interval Domain [Moore’66,
Cousot’76]





#
D# = {[a, b] | a  b  R or a=- or b=  }
# = [-, ]
d # d’ = if d = # then d’
else if d’ = # then d
else let d=[a, b] and d’=[c, d] in
[min(a, c), max(b, d)]
d # d’ = if d = # then #
else if d’ = # then #
else let d=[a, b] and d’=[c, d] in
let l = max(a, c) and u= min(b, d)
if l > u then # else [l, u]
d d’ = if d = # then d’
else else let d=[a, b] and d’=[c, d] in
[if ac then a else -,
if bd then b else ]
Galois Connection
Abstract Expressions
Abstract Assignments
Optimality (Induced)
Abstract Assume
Example Program
1
X := 0
2
assume X40
assume X<40
X := X+1
3
Relational Domains
The need for relational domains
 Non-relation
domains cannot represent variable
relationships
Y :=0;
while true do {
X:=[-128,128]; D:=[0,16];
S:=Y; Y:=X; R:=X-S;
if R<=-D then Y:=S-D fi;
if R>=D then Y:=S+D fi
}
X: input signal
Y: output signal
S: last output
R: Y-S
D: max allowed for |R|
The need for relational domains
 Infer
strong enough inductive invariants
X:=0; I:=1;
while I<5000 do {
if … then X:=X+1 else X:=X-1 fi;
I:=I+1
}
The need for relational domains
 Modular
analysis of procedures
Z :=X ;
if Y > Z then Z :=Y ;
if Z < 0 then Z :=0;
Weakly Relational Domains
The Zone Domain [Shacham’00, Mine’01]
Constrains of the form
Vi – Vj c
Vi c
Machine Representation
potential constraint has the form Vi – Vj c
 Represented as a directed graph G
A
– Nodes are labeled with variables
– An arc with weight c from Vi to Vj for each constraint
Vi – Vj c
 Difference
Bound Matrix (DBM)
– Adjacency matrix m of G
– mij = c <   Vi – Vj c
– mij =   No such constraints
 Concretization
Machine Representation (cont)
 Unary
–
–
–
–
–
constraints
Add another variable V0
m has size n+1 n+1
Vi  c is denoted as Vi -V0  c, i.e., mi,0 = c
Vi  c is denoted as V0 - Vi  -c, i.e., m0,i = -c
 m = { (v1, v2, …, vn) | (0, v1, v2, …, vn)  m}
V0
V1
V2
V0
+
4
3
V1
-1
+
+
V2
-1
1
+
The DBM Lattice
Relational Domains
The Polyhedra Domain [CH’78]
i  j ai, j xi, j ci
Summary
 Numerical
Domains are Powerful
 Infer interesting invariants
 Cost is an issue
 Need to combine with other domains
 Next week some applications