3-Valued Logic Analyzer
(TVP)
Part II
Tal Lev-Ami and Mooly Sagiv
Outline
• The Shape Analysis Problem
• Solving Shape Analysis with TVLA
– Structural Operational Semantics
– Predicate logic
– Embedding
– (Imprecise) Abstract Interpretation
– Instrumentation Predicates
– Focus
– Coerce
• Bibliography
Shape Analysis
• Determine the possible shapes of a dynamically allocated data structure at a given program point
• Relevant questions:
– Does a variable point to an acyclic list?
– Does a variable point to a doubly-linked list?
– Does a pointer variable p point to an allocated element every time p is dereferenced?
– Can a procedure create a memory leak?
Dereference of NULL pointers
typedef struct element {
    int value;
    struct element *next;
} Elements;

bool search(int value, Elements *c) {
    Elements *elem;
    /* Deliberate bug: the loop tests c, not elem, so elem runs past the end of the list */
    for (elem = c; c != NULL; elem = elem->next)
        if (elem->value == value)   /* NULL dereference */
            return TRUE;
    return FALSE;
}
Memory leakage
Elements *reverse(Elements *c)
{
    Elements *h, *g;
    h = NULL;
    while (c != NULL) {
        g = c->next;
        h = c;          /* leakage of the address pointed to by h */
        c->next = h;
        c = g;
    }
    return h;
}
The SWhile Programming Language
Abstract Syntax
sel ::= car | cdr
a ::= x | x.sel | null | n | a1 op_a a2
b ::= true | false | not b | b1 op_b b2 | a1 op_r a2
S ::= [x := a]^l | [x.sel := a]^l | [x := malloc()]^l | [skip]^l | S1 ; S2 | if [b]^l then S1 else S2 | while [b]^l do S
Dereference of NULL pointers
[elem := c;]^1
[found := false;]^2
while ([c != null]^3 && [!found]^4) (
    if ([elem->car = value]^5)
    then [found := true]^6
    else [elem = elem->cdr]^7
)
NULL dereference: the loop tests c, not elem
Structural Operational Semantics
for languages with dynamically allocated objects
• The program state consists of:
– currently allocated objects
– a mapping from variables into atoms, objects, and null
– a car mapping from objects into atoms, objects, and null
– a cdr mapping from objects into atoms, objects, and null
– …
• malloc() allocates more objects
• assignments update the state
Structural Operational Semantics
• The program state $S = (O, env, car, cdr)$:
– currently allocated objects $O$
– atoms (integers, Booleans) $A$
– $env: Var_* \to A \cup O \cup \{null\}$
– $car: O \to A \cup O \cup \{null\}$
– $cdr: O \to A \cup O \cup \{null\}$
• The meaning of expressions $\mathcal{A}[\![a]\!]: S \to A \cup O \cup \{null\}$
– $\mathcal{A}[\![at]\!](s) = at$
– $\mathcal{A}[\![x]\!]((O, env, car, cdr)) = env(x)$
– $\mathcal{A}[\![x.car]\!]((O, env, car, cdr)) = car(env(x))$
– $\mathcal{A}[\![x.cdr]\!]((O, env, car, cdr)) = cdr(env(x))$
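Structural Semantics for SWhile
axioms
[ass$^{v}_{sos}$] $\langle x := a,\ s = (O, e, car, cdr) \rangle \Rightarrow (O,\ e[x \mapsto \mathcal{A}[\![a]\!]s],\ car,\ cdr)$
[ass$^{car}_{sos}$] $\langle x.car := a,\ s = (O, e, car, cdr) \rangle \Rightarrow (O,\ e,\ car[e(x) \mapsto \mathcal{A}[\![a]\!]s],\ cdr)$
[ass$^{cdr}_{sos}$] $\langle x.cdr := a,\ s = (O, e, car, cdr) \rangle \Rightarrow (O,\ e,\ car,\ cdr[e(x) \mapsto \mathcal{A}[\![a]\!]s])$
[ass$^{m}_{sos}$] $\langle x := malloc(),\ (O, e, car, cdr) \rangle \Rightarrow (O \cup \{n\},\ e[x \mapsto n],\ car,\ cdr)$ where $n \notin O$
[skip$_{sos}$] $\langle skip,\ s \rangle \Rightarrow s$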
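The axioms can be read as state-transforming functions. The following is a minimal Python sketch of that reading, not part of TVLA. The names `State`, `deref`, `eval_expr`, `assign_var`, `assign_sel` and `malloc` are hypothetical. It assumes that objects are strings such as "o0" and atoms are integers or Booleans, and it raises an error on an illegal dereference, which the axioms themselves leave undefined.

```python
from dataclasses import dataclass, field

NULL = None  # stands for the null value


@dataclass
class State:
    """A program state (O, env, car, cdr)."""
    objects: frozenset = frozenset()          # O: allocated objects
    env: dict = field(default_factory=dict)   # variable -> atom | object | NULL
    car: dict = field(default_factory=dict)   # object -> atom | object | NULL
    cdr: dict = field(default_factory=dict)   # object -> atom | object | NULL


def deref(x, s):
    """Return the object env(x); fail if x does not hold an allocated object."""
    target = s.env.get(x, NULL)
    if target not in s.objects:
        raise RuntimeError(f"illegal dereference of {x}")
    return target


def eval_expr(a, s):
    """A[[a]]s for a = ("atom", n) | ("null",) | ("var", x) | ("sel", x, sel)."""
    if a[0] == "atom":
        return a[1]
    if a[0] == "null":
        return NULL
    if a[0] == "var":
        return s.env.get(a[1], NULL)
    if a[0] == "sel":
        selector = s.car if a[2] == "car" else s.cdr
        return selector.get(deref(a[1], s), NULL)
    raise ValueError(f"unknown expression {a!r}")


def assign_var(x, a, s):
    """Assignment to a variable: the environment is updated at x."""
    return State(s.objects, {**s.env, x: eval_expr(a, s)}, s.car, s.cdr)


def assign_sel(x, sel, a, s):
    """Assignment to x.car or x.cdr: the selector map is updated at env(x)."""
    target, value = deref(x, s), eval_expr(a, s)
    if sel == "car":
        return State(s.objects, s.env, {**s.car, target: value}, s.cdr)
    return State(s.objects, s.env, s.car, {**s.cdr, target: value})


def malloc(x, s):
    """Allocation: x is bound to a fresh object n that is not in O."""
    i = 0
    while f"o{i}" in s.objects:
        i += 1
    n = f"o{i}"
    return State(s.objects | {n}, {**s.env, x: n}, s.car, s.cdr)


# x := malloc(); x.car := 7; y := x.car
s = malloc("x", State())
s = assign_sel("x", "car", ("atom", 7), s)
s = assign_var("y", ("sel", "x", "car"), s)
print(s.env)  # {'x': 'o0', 'y': 7}
```

Each function returns a new state instead of mutating the old one, which mirrors the way an axiom relates a configuration to its successor state.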
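Structural Semantics for SWhile
rules
[comp$^{1}_{sos}$] $\dfrac{\langle S_1, s \rangle \Rightarrow \langle S'_1, s' \rangle}{\langle S_1; S_2, s \rangle \Rightarrow \langle S'_1; S_2, s' \rangle}$
[comp$^{2}_{sos}$] $\dfrac{\langle S_1, s \rangle \Rightarrow s'}{\langle S_1; S_2, s \rangle \Rightarrow \langle S_2, s' \rangle}$
[if$^{tt}_{sos}$] $\langle \text{if } b \text{ then } S_1 \text{ else } S_2,\ s \rangle \Rightarrow \langle S_1, s \rangle$ if $\mathcal{B}[\![b]\!]s = tt$
[if$^{ff}_{sos}$] $\langle \text{if } b \text{ then } S_1 \text{ else } S_2,\ s \rangle \Rightarrow \langle S_2, s \rangle$ if $\mathcal{B}[\![b]\!]s = ff$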
Summary
• The SOS is natural
• Can handle:
– errors, e.g., null dereferences
– free
– garbage collection
• But does not lead to an analysis
– The set of potential objects is unbounded
• Solution: Three-Valued Kleene Predicate Logic
Predicate Logic
• Vocabulary
– A finite set of predicate symbols P, each with a fixed arity
– A finite set of function symbols
• Logical structures S provide meaning for predicates
– A set of individuals (nodes) U
– $p^S: (U^S)^k \to \{0, 1\}$ for each predicate p of arity k
• First-order formulas over $\wedge, \vee, \neg, \exists, \forall$ express logical structure properties
Using Predicate Logic to describe states in SOS
• U = O
• For a Boolean variable x define a nullary predicate (proposition) b[x]
– b[x] = 1 when env(x) = 1
• For a pointer variable x define a unary predicate p[x]
– p[x](u) = 1 when env(x) = u and u is an object
• Two binary predicates:
– s[car](u1, u2) = 1 when car(u1) = u2 and u2 is an object
– s[cdr](u1, u2) = 1 when cdr(u1) = u2 and u2 is an object
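The encoding above can be sketched in a few lines of Python. This is an illustration only: the function name `to_structure` and the dictionary layout of the result are assumptions, and objects are taken to be strings so that they cannot be confused with atoms.

```python
def to_structure(objects, env, car, cdr, pointer_vars, bool_vars):
    """Encode a concrete state as a 2-valued logical structure with U = O."""
    structure = {"U": set(objects)}
    for x in bool_vars:
        # b[x] = 1 when env(x) = 1 (True counts as 1)
        structure[f"b[{x}]"] = 1 if env.get(x) == 1 else 0
    for x in pointer_vars:
        # p[x](u) = 1 when env(x) = u and u is an object
        structure[f"p[{x}]"] = {u: (1 if env.get(x) == u else 0) for u in objects}
    for name, mapping in (("car", car), ("cdr", cdr)):
        # s[sel](u1, u2) = 1 when sel(u1) = u2 and u2 is an object
        structure[f"s[{name}]"] = {
            (u1, u2): (1 if mapping.get(u1) == u2 else 0)
            for u1 in objects
            for u2 in objects
        }
    return structure


# A two-element list: c and elem point to o1, o1.cdr = o2, o2.cdr = null.
st = to_structure(
    objects={"o1", "o2"},
    env={"c": "o1", "elem": "o1", "found": False},
    car={"o1": 3, "o2": 5},
    cdr={"o1": "o2", "o2": None},
    pointer_vars=["c", "elem"],
    bool_vars=["found"],
)
print(st["p[c]"]["o1"], st["s[cdr]"][("o1", "o2")], st["b[found]"])  # 1 1 0
```

Atoms stored in car fields do not appear in the structure, because s[car] only records edges whose target is an object.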
Running Example
[elem := c;]^1
[found := false;]^2
while ([c != null]^3 && [!found]^4) (
    if ([elem->car = value]^5)
    then [found := true]^6
    else [elem = elem->cdr]^7
)
%s Pvar {elem, c}
%s Bvar {found}
%s Sel {car, cdr}
#include "pred.tvp"
%%
#include "cond.tvp"
#include "stat.tvp"
%%
/* [elem := c;]^1 */
l_1 Copy_Var(elem, c) l_2
/* [found := false;]^2 */
l_2 Set_False(found) l_3
/* while ([c != null]^3 && [!found]^4) ( */
l_3 Is_Not_Null_Var(c) l_4
l_3 Is_Null_Var(c) l_end
l_4 Is_False(found) l_5
l_4 Is_True(found) l_end
/* if ([elem->car = value]^5) */
l_5 Uninterpreted_Cond() l_6
l_5 Uninterpreted_Cond() l_7
/* then [found := true]^6 */
l_6 Set_True(found) l_3
/* else [elem = elem->cdr]^7 */
l_7 Get_Sel(cdr, elem, elem) l_3
/* ) */
%%
l_1, l_end
pred.tvp
foreach (z in Bvar) {
%p b[z]()
}
foreach (z in Pvar) {
%p p[z](v) unique box
}
foreach (sel in Sel) {
%p s[sel](v1, v2) function
}
Actions
• Use first-order formulae over P to express the SOS
• Every action can have:
– title %t
– focus formula %f
– precondition formula %p
– error messages %message
– new formula %new
– predicate-update formulas {}
– retain formula
cond.tvp (part 1)
%action Uninterpreted_Cond() {
%t "uninterpreted-Condition"
}
%action Is_True(x1) {
%t x1
%p b[x1]()
{
b[x1]() = 1
}
}
%action Is_False(x1) {
%t "!" + x1
%p !b[x1]()
{
b[x1]() = 0
}
}
cond.tvp (part 2)
%action Is_Not_Null_Var(x1) {
%t x1 + " != null"
%p E(v) p[x1](v)
}
%action Is_Null_Var(x1) {
%t x1 + " = null"
%p !(E(v) p[x1](v))
}
stat.tvp (part 1)
%action Skip() {
%t "Skip"
}
%action Set_True(x1) {
%t x1 + " := true"
{
b[x1]() = 1
}
}
%action Set_False(x1) {
%t x1 + " := false"
{
b[x1]() = 0
}
}
stat.tvp (part 2)
%action Copy_Var(x1, x2) {
%t x1 + " := " + x2
{
p[x1](v) = p[x2](v)
}
}
stat.tvp (part 3)
%action Get_Sel(sel, x1, x2) {
%t x1 + " := " + x2 + "." + sel
%message (!E(v) p[x2](v)) ->
"an illegal dereference to " + sel + " component of " + x2
{
p[x1](v) = E(v_1) p[x2](v_1) & s[sel](v_1, v)
}
}
stat.tvp (part 4)
%action Set_Sel_Null(x1, sel) {
%t x1 + "." + sel + " := null"
%message (!E(v) p[x1](v)) ->
"an illegal dereference to " + sel + " component of " + x1
{
s[sel](v_1, v_2) = s[sel](v_1, v_2) & !p[x1](v_1)
}
}
stat.tvp (part 5)
%action Set_Sel(x1, sel, x2) {
%t x1 + "." + sel + " := " + x2
%message (E(v, v1) p[x1](v) & s[sel](v, v1)) ->
"Internal Error! assume that " + x1 + "." + sel + "==NULL"
%message (!E(v) p[x1](v)) ->
"an illegal dereference to " + sel + " component of " + x1
{
s[sel](v_1, v_2) = s[sel](v_1, v_2) | p[x1](v_1) & p[x2](v_2)
}
}
stat.tvp (part 6)
%action Malloc(x1) {
%t x1 + " := malloc()"
%new
{
p[x1](v) = isNew(v)
}
}
3-Valued Kleene Logic
• A logic with 3 values
– 0 - false
– 1 - true
– 1/2 - don’t know
• Operators are conservatively interpreted
– 1/2 means either true or false
• Information order: $0 \sqsubseteq 1/2$ and $1 \sqsubseteq 1/2$, so $0 \sqcup 1 = 1/2$
• Logical order: $0 < 1/2 < 1$
Kleene Interpretation of Operators
(logical-and)
∧ | 0 1 ½
0 | 0 0 0
1 | 0 1 ½
½ | 0 ½ ½
Kleene Interpretation of Operators
(logical-negation)
x | ¬x
0 | 1
1 | 0
½ | ½
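Both tables follow from the two orders: conjunction is the minimum and disjunction the maximum in the logical order 0 < 1/2 < 1, negation maps x to 1 - x, and the join in the information order sends two different values to 1/2. A small Python sketch follows, with the helper names `k_and`, `k_or`, `k_not` and `join` chosen here for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)      # the "don't know" value
VALUES = (0, 1, HALF)


def k_and(x, y):
    """Kleene conjunction: minimum in the logical order 0 < 1/2 < 1."""
    return min(x, y)


def k_or(x, y):
    """Kleene disjunction: maximum in the logical order."""
    return max(x, y)


def k_not(x):
    """Kleene negation: swaps 0 and 1 and leaves 1/2 unchanged."""
    return 1 - x


def join(x, y):
    """Least upper bound in the information order: differing values become 1/2."""
    return x if x == y else HALF


# Rebuild the logical-and table row by row, in the order 0, 1, 1/2.
and_table = [[k_and(x, y) for y in VALUES] for x in VALUES]
```

`Fraction` is used instead of a float so that comparisons with 1/2 are exact.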
3-Valued Predicate Logic
• Vocabulary
– A finite set of predicate symbols P
– A special unary predicate sm
» sm(u) = 0 when u represents a unique concrete node
» sm(u) = 1/2 when u may represent more than one concrete node
• 3-valued logical structures S provide meaning for predicates
– A (bounded) set of individuals (nodes) U
– $p^S: (U^S)^k \to \{0, 1/2, 1\}$ for each predicate p of arity k
• First-order formulas over $\wedge, \vee, \neg, \exists, \forall$ express logical structure properties
• Interpret $\exists$ as maximum on the logical order
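As a worked instance, the update formula of Get_Sel in stat.tvp, p[x1](v) = E(v_1) p[x2](v_1) & s[sel](v_1, v), can be evaluated on a 3-valued structure by reading & as minimum and E as maximum. The sketch below is illustrative: the function name `get_sel_update` and the dictionary encoding of predicates are assumptions, and the two-node structure is chosen by hand.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def get_sel_update(nodes, p_x2, s_sel):
    """Evaluate E(v_1) p[x2](v_1) & s[sel](v_1, v) for every node v.

    '&' is the minimum and 'E' the maximum in the logical order;
    entries missing from a table are read as 0.
    """
    return {
        v: max(min(p_x2.get(v1, 0), s_sel.get((v1, v), 0)) for v1 in nodes)
        for v in nodes
    }


# c points to u0; u is a summary node for the rest of the list.
nodes = ["u0", "u"]
p_c = {"u0": 1}
s_cdr = {("u0", "u"): HALF, ("u", "u"): HALF}

new_p = get_sel_update(nodes, p_c, s_cdr)
print(new_p["u0"], new_p["u"])  # 0 1/2
```

The result is indefinite on the summary node: the formula cannot tell which of the represented concrete nodes is the successor of u0.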
The Blur Operation
• Abstract an arbitrary structure into a structure of bounded size
• Select a set of unary predicates as abstraction predicates
• Map all the nodes with the same values of the abstraction predicates into a single summary node
• Join the values of other predicates
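The four steps can be sketched as follows. This is a simplified illustration under assumptions, not TVLA's implementation: the function name `blur`, the dictionary encoding of predicates, and the use of the tuple of abstraction-predicate values as the name of an abstract node are all choices made here.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def join(x, y):
    """Join in the information order: differing values become 1/2."""
    return x if x == y else HALF


def blur(nodes, unary, binary, abstraction_preds):
    """Merge nodes that agree on all abstraction predicates.

    unary:  name -> {node: value}, binary: name -> {(n1, n2): value};
    missing entries are read as 0. Returns (nodes, unary, binary).
    """
    # The canonical name of a node is its vector of abstraction-predicate values.
    canon = {u: tuple(unary[p].get(u, 0) for p in abstraction_preds) for u in nodes}

    # sm becomes 1/2 as soon as a second node is mapped to the same name.
    sm = {}
    for u in nodes:
        c = canon[u]
        sm[c] = HALF if c in sm else unary.get("sm", {}).get(u, 0)

    new_unary = {}
    for p, table in unary.items():
        if p == "sm":
            continue
        merged = {}
        for u in nodes:
            c, v = canon[u], table.get(u, 0)
            merged[c] = join(merged[c], v) if c in merged else v
        new_unary[p] = merged
    new_unary["sm"] = sm

    new_binary = {}
    for p, table in binary.items():
        merged = {}
        for u1 in nodes:
            for u2 in nodes:
                key, v = (canon[u1], canon[u2]), table.get((u1, u2), 0)
                merged[key] = join(merged[key], v) if key in merged else v
        new_binary[p] = merged
    return set(canon.values()), new_unary, new_binary


# The list c -> o1 -> o2 -> o3, abstracted by the single predicate p[c].
b_nodes, b_unary, b_binary = blur(
    nodes=["o1", "o2", "o3"],
    unary={"p[c]": {"o1": 1}},
    binary={"s[cdr]": {("o1", "o2"): 1, ("o2", "o3"): 1}},
    abstraction_preds=["p[c]"],
)
HEAD, REST = (1,), (0,)  # the node pointed to by c, and the summary node
```

In this example o2 and o3 are merged into one summary node, and the cdr edges into and within it become 1/2, which has the same shape as the prgm.tvs structure shown later in the deck.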
The Embedding Theorem
• If a big structure B can be embedded in a structure S via a surjective (onto) function f such that all predicate values are preserved, i.e., $p^B(u_1, \ldots, u_k) \sqsubseteq p^S(f(u_1), \ldots, f(u_k))$
• Then every formula $\varphi$ is preserved:
– $\varphi = 1$ in S $\Rightarrow$ $\varphi = 1$ in B
– $\varphi = 0$ in S $\Rightarrow$ $\varphi = 0$ in B
– $\varphi = 1/2$ in S $\Rightarrow$ don’t know
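The embedding condition can be checked mechanically on small structures. The sketch below tests only the two conditions stated above, surjectivity and preservation of predicate values in the information order. The name `is_embedding` and the structure encoding are assumptions made for this example, and the treatment of sm is left out.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def info_leq(x, y):
    """x is below y in the information order: y is x itself or 1/2."""
    return x == y or y == HALF


def is_embedding(f, big, small):
    """Check that f embeds `big` into `small`.

    A structure is (nodes, preds); preds maps a predicate name to
    (arity, table), and table maps node tuples to values (default 0).
    """
    big_nodes, big_preds = big
    small_nodes, small_preds = small
    if {f[u] for u in big_nodes} != set(small_nodes):
        return False  # f must be onto
    for name, (arity, table) in big_preds.items():
        small_table = small_preds[name][1]
        for args in product(sorted(big_nodes), repeat=arity):
            image = tuple(f[u] for u in args)
            if not info_leq(table.get(args, 0), small_table.get(image, 0)):
                return False
    return True


# A concrete three-element list c -> o1 -> o2 -> o3 ...
big = (
    {"o1", "o2", "o3"},
    {"p[c]": (1, {("o1",): 1}),
     "s[cdr]": (2, {("o1", "o2"): 1, ("o2", "o3"): 1})},
)
# ... and a two-node abstraction in which u represents o2 and o3.
small = (
    {"u0", "u"},
    {"p[c]": (1, {("u0",): 1}),
     "s[cdr]": (2, {("u0", "u"): HALF, ("u", "u"): HALF})},
)
f = {"o1": "u0", "o2": "u", "o3": "u"}
ok = is_embedding(f, big, small)

# Claiming a definite edge u0 -> u is too precise: o1 has no edge to o3.
too_precise = (
    {"u0", "u"},
    {"p[c]": (1, {("u0",): 1}),
     "s[cdr]": (2, {("u0", "u"): 1, ("u", "u"): HALF})},
)
bad = is_embedding(f, big, too_precise)
```

Here `ok` is true and `bad` is false, so a definite value in the small structure is only allowed when every represented tuple in the big structure agrees with it.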
Naive Program Analysis via 3-valued predicate logic
• Chaotic iterations
• Start with the initial 3-valued structure
• Execute every action in three phases:
– check if precondition is satisfied
– execute update formulas
– execute blur
• Command line: tvla prgm prgm -action pub
prgm.tvs
%n = {u, u0}
%p = {
sm = {u:1/2}
s[cdr] = {u->u:1/2, u0->u:1/2}
p[c] = {u0}
}
More Precise Shape Analysis
• Distinguish between cyclic and acyclic lists
• Use Focus to guarantee that important formulas do not evaluate to 1/2
• Use Coerce to maintain global invariants
• It all works
– Singly linked lists (reverse, insert, delete, del_all)
– Sortedness (bubble-sort, insertion-sort, reverse)
– Doubly linked lists (insert, delete)
– Mobile code (router)
– Java multithreading (interference, concurrent-queue)
The Instrumentation Principle
• Increase precision by storing the truth value of some designated formulae
• Introduce predicate-update formulae to update the extra predicates
Example: Heap Sharing
$is[cdr](v) = \exists v_1, v_2: cdr(v_1, v) \wedge cdr(v_2, v) \wedge v_1 \neq v_2$
[Figure: a list pointed to by x with elements 31, 71, 91, and its abstraction with nodes u1 and u; is = 0 on every node]
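On a concrete heap the sharing predicate is easy to compute: a node is shared when at least two distinct nodes have a cdr edge to it. A minimal Python sketch follows, with the function name `is_shared` and the node names chosen here for illustration.

```python
def is_shared(nodes, cdr):
    """is[cdr](v) = 1 when two distinct nodes have a cdr edge to v."""
    predecessors = {v: set() for v in nodes}
    for source, target in cdr.items():
        if target in predecessors:
            predecessors[target].add(source)
    return {v: (1 if len(preds) >= 2 else 0) for v, preds in predecessors.items()}


# The unshared list 31 -> 71 -> 91 from the example.
plain = is_shared(["n31", "n71", "n91"],
                  {"n31": "n71", "n71": "n91", "n91": None})

# A second cell m whose cdr also points to the cell holding 71.
shared = is_shared(["n31", "n71", "n91", "m"],
                   {"n31": "n71", "n71": "n91", "n91": None, "m": "n71"})
```

In the first heap every node has is = 0; in the second only the cell holding 71 has is = 1.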
Example: Heap Sharing
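$is[cdr](v) = \exists v_1, v_2: cdr(v_1, v) \wedge cdr(v_2, v) \wedge v_1 \neq v_2$
[Figure: a list pointed to by x with elements 31, 71, 91, and its abstraction with nodes u1 and u; one node has is = 1, the others is = 0]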
pred.tvp
foreach (z in Bvar) {
%p b[z]()
}
foreach (z in Pvar) {
%p p[z](v) unique box
}
foreach (sel in Sel) {
%p s[sel](v1, v2) function
}
foreach (sel in Sel) {
%i is[sel](v) = E(v_1, v_2) s[sel](v_1, v) & s[sel](v_2, v) & v_1 != v_2
}
stat.tvp (part 4)
%action Set_Sel_Null(x1, sel) {
%t x1 + "." + sel + " := null"
%message (!E(v) p[x1](v)) ->
"an illegal dereference to " + sel + " component of " + x1
{
s[sel](v_1, v_2) = s[sel](v_1, v_2) & !p[x1](v_1)
is[sel](v) = is[sel](v) & (!(E(v_1) p[x1](v_1) & s[sel](v_1, v)) |
E(v_1, v_2) v_1 != v_2 &
(s[sel](v_1, v) & !p[x1](v_1)) &
(s[sel](v_2, v) & !p[x1](v_2)))
}
}
stat.tvp (part 5)
%action Set_Sel(x1, sel, x2) {
%t x1 + "." + sel + " := " + x2
%message (E(v, v1) p[x1](v) & s[sel](v, v1)) ->
"Internal Error! assume that " + x1 + "." + sel + "==NULL"
%message (!E(v) p[x1](v)) ->
"an illegal dereference to " + sel + " component of " + x1
{
s[sel](v_1, v_2) = s[sel](v_1, v_2) | p[x1](v_1) & p[x2](v_2)
is[sel](v) = is[sel](v) | E(v_1) p[x2](v) & s[sel](v_1, v)
}
}
Additional Instrumentation Predicates
• reachable-from-variable-x(v): $\exists v_1: x(v_1) \wedge cdr^*(v_1, v)$
• cyclic-along-dimension-d(v): $cdr^+(v, v)$
• ordered element, inOrder(v): $\forall v_1: cdr(v, v_1) \rightarrow$ v->d <= v_1->d
• doubly linked lists
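The first two predicates have a direct concrete reading on a heap whose cdr field is a partial function. The sketch below is illustrative only: the names `reachable_from` and `on_cycle` are chosen here, and the heap is a dictionary from a node to its cdr successor or None.

```python
def reachable_from(start, cdr):
    """Nodes v with cdr*(start, v), where start is the node that x points to."""
    seen = set()
    node = start
    while node is not None and node not in seen:
        seen.add(node)
        node = cdr.get(node)
    return seen


def on_cycle(v, cdr):
    """cdr+(v, v): following one or more cdr edges from v leads back to v."""
    seen = set()
    node = cdr.get(v)
    while node is not None and node not in seen:
        if node == v:
            return True
        seen.add(node)
        node = cdr.get(node)
    return False


acyclic = {"a": "b", "b": "c", "c": None, "d": "b"}
cyclic = {"a": "b", "b": "c", "c": "b"}

r_x = reachable_from("a", acyclic)   # x points to a; d is not reachable from x
```

In the cyclic heap, b and c lie on the cycle while a only leads into it, so `on_cycle("a", cyclic)` is false and `on_cycle("b", cyclic)` is true.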
The Focusing Principle
• To increase precision
– “Bring the predicate-update formula into focus” (force 1/2 to 0 or 1)
– Then apply the predicate-update formulas
(1) Focus on $\exists v_1: x(v_1) \wedge cdr(v_1, v)$
[Figure: an input structure with variables x, y and nodes u1, u, and the structures produced by Focus; in one of them u is split into u.1 and u.0]
(2) Evaluate Predicate-Update Formulae
$x(v) = \exists v_1: x(v_1) \wedge cdr(v_1, v)$
[Figure: the focused structures (nodes u1, u, u.1, u.0; variables x, y) before and after the update]
The Coercion Principle
• Increase precision by exploiting some structural properties possessed by all stores (global invariants)
• Structural properties captured by constraints
• Apply a constraint solver
(3) Apply Constraint Solver
[Figure: the updated structures (nodes u1, u, u.1, u.0; variables x, y) before and after constraint solving]
Conclusion
• TVLA allows construction of non-trivial analyses
• But it is no panacea
– Expressing operational semantics using logical formulas is not always easy
– Need instrumentation to be reasonably precise (sometimes helps efficiency as well)
• Open problems:
– A debugger for TVLA
– Frontends
– Algorithmic problems:
» Space optimizations
Bibliography
• Chapter 2.6
• http://www.cs.uni-sb.de/~wilhelm/foiles/ (Invited talk CC’2000)
• http://www.cs.wisc.edu/~reps/#shape_analysis Parametric Shape Analysis based on 3-valued logics (the general theory)
• http://www.math.tau.ac.il/~tla/ The system and its applications