Analyzing Java programs with
Octagons
September 20
Francesco Logozzo
ENS
Analyzing Java programs with Octagons – p.1/41
Goals
We want a static analysis for sequential Java that
Is sound
≠ ESC/Java, ESC/Java 2
Is automatic
≠ Jive
Infers real invariants
≠ Daikon
Analyzes classes in isolation
≠ Excelsior FlawDetector
Does not rely on user annotations
≠ JML-based approaches such as LOOP, Krakatoa
Outline
Concrete Semantics
Abstract Semantics
Comparison with other tools
Conclusions & Future work
Syntax
A class is a triple $\langle \mathrm{init}, F, M \rangle$ where
init is the class constructor
F is a set of variables
M is a set of function definitions
For simplicity assume
Just one constructor
All the fields are protected
– Otherwise use get_f/set_f
Semantic domains
The set of environments is Env = [Var → Addr]
where
Var is a set of variables
Addr is a set of addresses
The set of stores is Store = [Addr → Val] where
Val is a set of values such that Env ⊆ Val
void ∈ Val is the void value
The environment of an object is stored in a certain
memory location
The address of such a location is the object identity
A program state is σ ∈ Env × Store
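A minimal Java sketch of these domains may help fix ideas. It assumes that addresses are plain integers and that addresses are themselves values (Addr ⊆ Val); the class `ConcreteState` and its methods are hypothetical names, not part of the analyzer.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of Env = [Var -> Addr] and Store = [Addr -> Val]; all names are hypothetical. */
final class ConcreteState {
    /** Stand-in for the void value. */
    static final Object VOID = new Object();

    final Map<String, Integer> env = new HashMap<>();   // Var -> Addr
    final Map<Integer, Object> store = new HashMap<>(); // Addr -> Val
    private int nextAddr = 0;

    /** Allocates a fresh address holding the given value. */
    int alloc(Object value) {
        int a = nextAddr++;
        store.put(a, value);
        return a;
    }

    /** Reads variable x: look up its address in env, then its value in store. */
    Object read(String x) {
        return store.get(env.get(x));
    }

    /** Builds a Box-like object with one int field i, referenced by variable b. */
    static ConcreteState demo() {
        ConcreteState st = new ConcreteState();
        Map<String, Integer> boxEnv = new HashMap<>();
        boxEnv.put("i", st.alloc(0));        // the field i lives at its own address
        int identity = st.alloc(boxEnv);     // an environment is a value (Env ⊆ Val)
        st.env.put("b", st.alloc(identity)); // b holds the object identity (assumes Addr ⊆ Val)
        return st;
    }
}
```

Here the object identity is the address at which the object's environment is stored, as stated above.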
Constructor and method semantics
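The semantics of the constructor is a function
$\mathsf{i}\llbracket \mathrm{init} \rrbracket \in [\mathrm{Val} \times \mathrm{Store} \to \mathcal{P}(\mathrm{Env} \times \mathrm{Store})]$
The semantics of a method m is a function
$\mathsf{m}\llbracket m \rrbracket \in [\mathrm{Val} \times \mathrm{Env} \times \mathrm{Store} \to \mathcal{P}(\mathrm{Val} \times \mathrm{Env} \times \mathrm{Store})]$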
If Addr ⊆ Val then the method may expose a part of
object state to the context
Object semantics
The semantics of an object is given by a set of
traces
Each trace represents a possible evolution
history of the object
The first state represents the object right after its
creation
Each further state is the result of an interaction
between the object and a context
The context invokes a method of the object; or
It modifies a memory location that is reachable
from the object environment
The semantics of a class considers all its instances
Initial states
The set of interaction states is
Σ = Env × Store × Val × P(Addr)
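The initial states of an object o are
$S_0\langle v, s \rangle = \{\langle e', s', \mathrm{void}, \emptyset \rangle \mid \langle e', s' \rangle \in \mathsf{i}\llbracket \mathrm{init} \rrbracket(v, s)\}$
The initial states of a class are
$\mathsf{I}\llbracket \mathrm{init} \rrbracket = \bigcup \{S_0\langle v, s \rangle \mid \langle v, s \rangle \in \mathrm{Val} \times \mathrm{Store}\}$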
Direct interaction
Let reachable ∈ [(Val × Store) → P(Addr)]
It determines all the memory addresses
reachable from an address in Val
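The collecting semantics of a method m is
$\mathsf{M}\llbracket m \rrbracket(S) = \{\langle e', s', v', \mathrm{Esc}' \rangle \mid \langle e, s, v, \mathrm{Esc} \rangle \in S,\ v_{in} \in \mathrm{Val},\ \langle v', e', s' \rangle \in \mathsf{m}\llbracket m \rrbracket(v_{in}, e, s),\ \mathrm{Esc}' = \mathrm{Esc} \cup \mathrm{reachable}(v', s')\}$
The transition function for the direct interaction is
$\mathrm{next}_{dir}(\sigma) = \bigcup \{\mathsf{M}\llbracket m \rrbracket(\{\sigma\}) \mid m \in M\}$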
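The function reachable can be sketched as a worklist traversal of the store. This is only an illustration under assumptions: values are modelled as a hypothetical `Ref` wrapper (an address), a `Map` from field names to addresses (an environment), or anything else (a primitive that reaches nothing).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of reachable(v, s): the addresses reachable from a value through the store. */
final class ReachableSketch {
    /** A value that is an address (assumes Addr ⊆ Val); hypothetical helper type. */
    static final class Ref {
        final int addr;
        Ref(int addr) { this.addr = addr; }
    }

    static Set<Integer> reachable(Object v, Map<Integer, Object> store) {
        Set<Integer> seen = new TreeSet<>();
        Deque<Object> work = new ArrayDeque<>();
        if (v != null) work.push(v);
        while (!work.isEmpty()) {
            Object cur = work.pop();
            if (cur instanceof Ref) {
                int a = ((Ref) cur).addr;
                Object content = store.get(a);
                // Visit each address once, then follow what is stored there.
                if (seen.add(a) && content != null) work.push(content);
            } else if (cur instanceof Map) {
                // An environment: every address it mentions is reachable.
                for (Object a : ((Map<?, ?>) cur).values()) {
                    work.push(new Ref((Integer) a));
                }
            }
            // Primitive values (e.g. Integer) reach nothing.
        }
        return seen;
    }

    /** A store with one object at address 1 whose field i lives at address 0. */
    static Set<Integer> demo() {
        Map<Integer, Object> store = new HashMap<>();
        Map<String, Integer> boxEnv = new HashMap<>();
        boxEnv.put("i", 0);
        store.put(0, 7);      // the int field
        store.put(1, boxEnv); // the object's environment; 1 is its identity
        store.put(2, 42);     // an unrelated location
        return reachable(new Ref(1), store);
    }
}
```

If a method returns the object at address 1, the escaped set grows by {0, 1}: Esc' = Esc ∪ reachable(v', s').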
Indirect interaction
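Let update ∈ [Addr × Store → P(Store)] be defined as
$\mathrm{update}(a, s) = \{s' \mid \exists v \in \mathrm{Val}.\ s' = s[a \mapsto v]\}$
The context interactions are
$\mathrm{Context}(S) = \{\langle e, s', \mathrm{void}, \mathrm{Esc} \rangle \mid \langle e, s, v, \mathrm{Esc} \rangle \in S,\ \exists a \in \mathrm{Esc}.\ s' \in \mathrm{update}(a, s)\}$
The transition function for the indirect interaction is
$\mathrm{next}_{ind}(\sigma) = \mathrm{Context}(\{\sigma\})$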
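Class semantics, $\mathsf{c}\llbracket \cdot \rrbracket$
Th. $\mathsf{c}\llbracket A \rrbracket$ can be expressed in fixpoint form as
$\mathsf{c}\llbracket A \rrbracket = \mathrm{lfp}\, \lambda T.\ \mathsf{I}\llbracket \mathrm{init} \rrbracket \cup \{\sigma_0 \to \dots \to \sigma_n \to \sigma' \mid \sigma_0 \to \dots \to \sigma_n \in T,\ \sigma' \in \mathrm{next}_{dir}(\sigma_n) \cup \mathrm{next}_{ind}(\sigma_n)\}$
$\mathsf{c}\llbracket A \rrbracket$ can be shown sound and complete w.r.t. a
whole program object-oriented semantics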
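The reachable states of a class A are
$\mathsf{C}\llbracket A \rrbracket = \mathrm{lfp}\, \lambda S.\ \mathsf{I}\llbracket \mathrm{init} \rrbracket \cup \bigcup_{m \in M} \mathsf{M}\llbracket m \rrbracket(S) \cup \mathrm{Context}(S)$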
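On a finite state space, a least fixpoint of this shape can be computed by iterating from the empty set. The sketch below assumes a toy class whose state is a single int i, with i = 0 after construction, one method that increments i while i < 3, and no exposed locations (so Context adds nothing); `LfpSketch` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

/** Sketch: Kleene iteration for lfp of a monotone function on finite sets. */
final class LfpSketch {
    /** Iterates f from the empty set until f(X) = X; terminates only if the chain is finite. */
    static <T> Set<T> lfp(Function<Set<T>, Set<T>> f) {
        Set<T> x = new HashSet<>();
        while (true) {
            Set<T> next = f.apply(x);
            if (next.equals(x)) return next;
            x = next;
        }
    }

    /** Reachable values of i for the toy class described in the lead-in. */
    static Set<Integer> demo() {
        final int n = 3;
        return LfpSketch.<Integer>lfp(s -> {
            Set<Integer> out = new TreeSet<>();
            out.add(0);                     // initial states: i = 0
            for (int i : s) {
                out.add(i < n ? i + 1 : i); // collecting semantics of the only method
            }
            return out;
        });
    }
}
```

The iterates are ∅, {0}, {0, 1}, {0, 1, 2}, {0, 1, 2, 3}, after which the set is stable.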
Outline
Concrete Semantics
Abstract Semantics
Comparison with other tools
Conclusions & Future work
Abstraction of the reachable states
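Assume an abstract domain D such that
$\langle \mathcal{P}(\Sigma), \subseteq, \emptyset, \Sigma, \cup, \cap \rangle \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} \langle D, \sqsubseteq, \bot, \top, \sqcup, \sqcap \rangle$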
A sound approximation of the initial states
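$\mathsf{I}\llbracket \mathrm{init} \rrbracket \subseteq \gamma(\bar{\mathsf{I}}\llbracket \mathrm{init} \rrbracket)$
A sound approximation of the method semantics
$\forall S \in \mathcal{P}(\Sigma).\ \mathsf{M}\llbracket m \rrbracket(S) \subseteq \gamma(\bar{\mathsf{M}}\llbracket m \rrbracket(\alpha(S)))$
A sound approximation of the context
$\forall S \in \mathcal{P}(\Sigma).\ \mathrm{Context}(S) \subseteq \gamma(\overline{\mathrm{Context}}(\alpha(S)))$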
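The soundness conditions can be checked on a toy instance. The sketch below swaps in intervals over a single int (not octagons, and not the analyzer's domain) and checks f(S) ⊆ γ(f̄(α(S))) for the concrete transformer f(x) = x + c; all names are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: intervals as a toy abstract domain for sets of ints. */
final class IntervalSoundnessSketch {
    final int lo, hi; // gamma(this) = { x | lo <= x <= hi }

    IntervalSoundnessSketch(int lo, int hi) { this.lo = lo; this.hi = hi; }

    /** alpha: the smallest interval containing a non-empty set (bottom is not modelled). */
    static IntervalSoundnessSketch alpha(Set<Integer> s) {
        if (s.isEmpty()) throw new IllegalArgumentException("bottom is not modelled");
        TreeSet<Integer> sorted = new TreeSet<>(s);
        return new IntervalSoundnessSketch(sorted.first(), sorted.last());
    }

    /** Membership in the concretization. */
    boolean gammaContains(int x) { return lo <= x && x <= hi; }

    /** Abstract counterpart of the concrete transformer x -> x + c. */
    IntervalSoundnessSketch plus(int c) { return new IntervalSoundnessSketch(lo + c, hi + c); }

    /** Checks that every concrete result x + c lies in gamma(alpha(S).plus(c)). */
    static boolean isSoundOn(Set<Integer> s, int c) {
        IntervalSoundnessSketch abs = alpha(s).plus(c);
        for (int x : s) {
            if (!abs.gammaContains(x + c)) return false;
        }
        return true;
    }

    static Set<Integer> sample() { return new TreeSet<>(Arrays.asList(1, 5)); }
}
```

For S = {1, 5} and c = 1 the abstract result is [2, 6]: it contains the concrete results 2 and 6, and also 4, which no concrete execution produces. The approximation is sound but not exact.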
Abstract class invariant
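Th. (SAS’03) Let A be a class. Then
$\bar{\mathsf{C}}\llbracket A \rrbracket = \mathrm{lfp}\, \lambda X.\ \bar{\mathsf{I}}\llbracket \mathrm{init} \rrbracket \sqcup \bigsqcup_{m \in M} \bar{\mathsf{M}}\llbracket m \rrbracket(X) \sqcup \overline{\mathrm{Context}}(X)$
is such that
$\mathsf{c}\llbracket A \rrbracket \subseteq \gamma_\Sigma \circ \gamma(\bar{\mathsf{C}}\llbracket A \rrbracket)$
In practice:
We have to define $\bar{\mathsf{I}}\llbracket \cdot \rrbracket$ and $\bar{\mathsf{M}}\llbracket \cdot \rrbracket$
The abstract domain D must track object aliasing
⇒ In particular the exposed objects
⇒ $\overline{\mathrm{Context}}$ sets them to $\top$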
Abstract domain: Abstract environment
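The abstract domain is $D = \overline{\mathrm{Env}} \times \overline{\mathrm{Store}}$
An abstract environment overapproximates the set
of addresses a variable may be tied to
The abstract environments are
$\overline{\mathrm{Env}} = [\mathrm{Var} \to \mathcal{P}(\overline{\mathrm{Addr}})]$
$\overline{\mathrm{Addr}}$ is the set of abstract addresses
The meaning of a $\bar\rho \in \overline{\mathrm{Env}}$ is
$\gamma(\bar\rho) = \{\rho \in \mathrm{Env} \mid \forall x.\ \rho(x) \in \gamma_{addr}(\bar\rho(x))\}$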
Abstract domain: Abstract store
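Assume the only primitive type is int
The abstract stores are
$\overline{\mathrm{Store}} = \underbrace{([\overline{\mathrm{Addr}} \to \mathbb{N}] \times \mathrm{Oct})}_{\text{approximation of ints}} \times \underbrace{[\overline{\mathrm{Addr}} \to \overline{\mathrm{Env}}]}_{\text{approximation of objects}} \times \underbrace{\mathcal{P}(\overline{\mathrm{Addr}})}_{\text{summary locations}}$
Let $\langle\langle g, o \rangle, f, S \rangle \in \overline{\mathrm{Store}}$; then
$\mathrm{dom}(f) \cap \mathrm{dom}(g) = \emptyset$
⇒ An address denotes either objects or integers
⇒ Typed addresses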
S is the set of summary locations
If a ∈ S then a stands for (possibly infinitely) many
concrete addresses
Otherwise a stands for exactly one concrete
address
$f \in [\overline{\mathrm{Addr}} \to \overline{\mathrm{Env}}]$ associates an abstract
environment to each abstract object
$g \in [\overline{\mathrm{Addr}} \to \mathbb{N}]$ maps ints to the octagon’s dimensions
A dimension is associated to each address
Useful in implementation
– Max dimension not known statically
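One possible in-memory layout for such a store is sketched below. It is an assumption-laden illustration, not the analyzer's data structure: abstract addresses are integers, the octagon o is reduced to a dimension counter, and dimensions are numbered from 1 so that the first fresh dimension of an n-dimensional octagon is n + 1.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of an abstract store <<g, o>, f, S>; the octagon o itself is omitted. */
final class AbstractStoreSketch {
    final Map<Integer, Integer> g = new HashMap<>();                   // int address -> octagon dimension
    final Map<Integer, Map<String, Set<Integer>>> f = new HashMap<>(); // object address -> abstract environment
    final Set<Integer> summary = new HashSet<>();                      // S: summary locations
    private int dimensions = 0;                                        // size of the (omitted) octagon

    /** Registers an int location; a new address gets the next free dimension. */
    int addInt(int addr) {
        if (f.containsKey(addr)) {
            throw new IllegalStateException("address already denotes an object");
        }
        Integer dim = g.get(addr);
        if (dim == null) {
            dim = ++dimensions; // grown on demand: the maximum is not known statically
            g.put(addr, dim);
        }
        return dim;
    }

    /** Registers an object location with its abstract environment (field -> possible addresses). */
    void addObject(int addr, Map<String, Set<Integer>> env) {
        if (g.containsKey(addr)) {
            throw new IllegalStateException("address already denotes an int");
        }
        f.put(addr, env);
    }

    /** Marks an address as standing for many concrete addresses. */
    void markSummary(int addr) { summary.add(addr); }

    /** The typing invariant dom(f) ∩ dom(g) = ∅. */
    boolean isWellTyped() {
        for (Integer a : f.keySet()) {
            if (g.containsKey(a)) return false;
        }
        return true;
    }
}
```

Rejecting an address that is already used with the other type is what keeps addresses typed.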
Example
class Box { int i = 0; }
// ...
// State: ⟨ρ, ⟨⟨g, o⟩, f, S⟩⟩, where o has n dimensions
Box b = null;
if( ... )
b = new Box();
// else: skip
// State: ⟨ρ[b ↦ {a}], ⟨⟨g[a ↦ (n + 1)], o[x_{n+1} = 0]⟩, f, S⟩⟩
Note: At the join point we add the constraint $x_{n+1} = 0$
⇒ In the false branch, $x_{n+1} = \bot$ is tacitly assumed
Abstract operations and transfer functions
The abstract operations are defined as one expects
e.g., $\sqcup$, $\sqcap$, $\nabla$ are defined pointwise
Most of the transfer functions are as usual
Sequence, if, while, . . .
Next we describe the most interesting ones
Set up the initial state for the methods’ analysis
Creation of a new object (new)
Assignment (:=)
Projection of exposed addresses
Handling parameters
Let $I^k = \langle \rho, \langle\langle g, o \rangle, f, S \rangle\rangle$ be the state at the k-th
iteration for the class invariant
The initial state $I_0$ for a method m is given by
The class invariant computed so far ($I^k$)
The parameters of the method
Parameters can be ints or objects
If it is an int we simply add it to the state, e.g.:
public void m(int v) { ... }
then $I_0 = \langle \rho[v \mapsto a], \langle\langle g[a \mapsto n + 1], o' \rangle, f, S \rangle\rangle$
where $o'$ is the octagon o extended with an (n + 1)-th
dimension
If they are objects, we have to pay attention to
aliasing, e.g.:
class A { B bRef; }
class B { int i; }
class AnalyzeMe {
// ...
public void m(A a, B b) { ... } }
⇒ a.bRef and b may alias
cycles, e.g.:
class Cycle_1 { Cycle_2 c1to2; // ...}
class Cycle_2 { Cycle_1 c2to1; // ...}
class AnalyzeMeWithCycles {
//
public void n(Cycle_1 c) { ... } }
⇒ c and c.c1to2.c2to1 may alias
We use summary locations (we assume the worst case)
for objects of a type that appears at least twice in the
method’s parameters
Ex: AnalyzeMe: $I_0 = \langle \rho', \langle\langle g', o' \rangle, f', S' \rangle\rangle$ where
$\rho' = \rho[a \mapsto \{10\}, b \mapsto \{13\}]$
$g' = g[15 \mapsto (n + 1)]$
$o' = o$ extended with an (n + 1)-th dimension
$f' = f[13 \mapsto \langle i \mapsto \{15\} \rangle, 10 \mapsto \langle \mathit{bRef} \mapsto \{13\} \rangle]$
$S' = S \cup \{13, 15\}$
(A more understandable picture will come)
Output of the Analyzer
new
Objects are created through the new statement
A a = new A();
We must not create infinitely many objects
Loops and successive invocations of the same
method
We handle the first k invocations of a new exactly
and then we create a summary location
k is a parameter of the analysis
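The k-bounded treatment of new can be sketched as follows. The sketch assumes that the count is kept per allocation site, identified here by a string label; both that choice and the names are illustrative assumptions rather than the analyzer's implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch: exact abstract addresses for the first k allocations, then one summary location. */
final class AllocationSketch {
    private final int k;                                            // parameter of the analysis
    private final Map<String, Integer> count = new HashMap<>();     // site -> times analyzed so far
    private final Map<String, Integer> summaryOf = new HashMap<>(); // site -> its summary address
    final Set<Integer> summary = new HashSet<>();                   // S: summary locations
    private int nextAddr = 0;

    AllocationSketch(int k) { this.k = k; }

    /** Returns the abstract address to use for one more execution of `new` at the given site. */
    int allocate(String site) {
        int n = count.merge(site, 1, Integer::sum);
        if (n <= k) {
            return nextAddr++; // exact: a fresh, non-summary address
        }
        Integer a = summaryOf.get(site);
        if (a == null) {       // first overflow: create the summary location once
            a = nextAddr++;
            summaryOf.put(site, a);
            summary.add(a);
        }
        return a;              // every later allocation reuses it
    }
}
```

With k = 2, four allocations at the same site yield the addresses 0, 1, 2, 2, where 2 is a summary location, so the set of abstract addresses stays finite.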
Example
class Box { int i; }
class Allocation {
Box b;
Allocation() { b = new Box(); }
public void m() { b = new Box(); }
}
Assignment
Let e1 = e2 be an assignment
Two cases: whether e1 is an object or an int
Java is a typed language...
If it is an object, then create an alias for e2
⇒ Update the abstract environment
If it is an int, then we have to update the octagon
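e1 and e2 may evaluate to several addresses, e.g.:
a.x = b.c.y + 2
with, e.g., $I_0 = \langle \rho, \langle\langle g, o \rangle, f, S \rangle\rangle$ and
$\rho = \langle a \mapsto \{a_1, a_2\}, b \mapsto \{a_3\} \rangle$
$g = \langle a_4 \mapsto d_0, a_5 \mapsto d_1, a_7 \mapsto d_2 \rangle$
$f = \langle a_1 \mapsto \langle x \mapsto \{a_4\} \rangle, a_2 \mapsto \langle x \mapsto \{a_5\} \rangle, a_3 \mapsto \langle c \mapsto \{a_6\} \rangle, a_6 \mapsto \langle y \mapsto \{a_7\} \rangle \rangle$
The possible assignments are
$a_4 = a_7 + 2$ and $a_5 = a_7 + 2$
that become the octagon constraints
$d_0 = d_2 + 2$ and $d_1 = d_2 + 2$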
Finally, the octagon after the assignment is the join of
the two possible octagons:
$o' = o.\mathrm{assign}(d_0 = d_2 + 2) \sqcup o.\mathrm{assign}(d_1 = d_2 + 2)$
so that the state after the assignment is
$I_1 = \langle \rho, \langle\langle g, o' \rangle, f, S \rangle\rangle$
Assignment of ints: Summary
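Let e1 = e2 + c and $I_0 = \langle \rho, \langle\langle g, o \rangle, f, S \rangle\rangle$
Let $\mathsf{A}\llbracket e1 \rrbracket(I_0)$ and $\mathsf{A}\llbracket e2 \rrbracket(I_0)$ be the addresses e1 (resp.
e2) may evaluate to in $I_0$
Let
$\mathrm{Ass} = \{g(a_1) = g(a_2) + c \mid a_1 \in \mathsf{A}\llbracket e1 \rrbracket(I_0),\ a_2 \in \mathsf{A}\llbracket e2 \rrbracket(I_0)\}$
Then the octagon after the assignment is
$o' = \bigsqcup_{(d_1 = d_2 + c) \in \mathrm{Ass}} o.\mathrm{assign}(d_1 = d_2 + c)$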
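The join over all possible assignments can be illustrated with a simpler numerical domain. The sketch below uses one interval per dimension as a stand-in for the octagon, so it loses the relational constraints an octagon would keep. The initial bounds in `demo` are made-up values, and all names are hypothetical.

```java
import java.util.Arrays;

/** Sketch: assignment e1 = e2 + c when e1 and e2 may denote several dimensions. */
final class WeakAssignSketch {
    /** box[d] = {lo, hi} bounds dimension d. Returns a copy where d1 := d2 + c. */
    static int[][] assign(int[][] box, int d1, int d2, int c) {
        int[][] out = new int[box.length][];
        for (int d = 0; d < box.length; d++) {
            out[d] = box[d].clone();
        }
        out[d1] = new int[] { box[d2][0] + c, box[d2][1] + c };
        return out;
    }

    /** Pointwise interval join. */
    static int[][] join(int[][] a, int[][] b) {
        int[][] out = new int[a.length][];
        for (int d = 0; d < a.length; d++) {
            out[d] = new int[] { Math.min(a[d][0], b[d][0]), Math.max(a[d][1], b[d][1]) };
        }
        return out;
    }

    /** Joins assign(d1 = d2 + c) over every pair of possible target and source dimensions. */
    static int[][] assignAll(int[][] box, int[] targets, int[] sources, int c) {
        if (targets.length == 0 || sources.length == 0) {
            throw new IllegalArgumentException("no address to assign");
        }
        int[][] result = null;
        for (int d1 : targets) {
            for (int d2 : sources) {
                int[][] r = assign(box, d1, d2, c);
                result = (result == null) ? r : join(result, r);
            }
        }
        return result;
    }

    /** a.x = b.c.y + 2 with targets d0, d1 and source d2; assumed bounds d0 = 0, d1 = 10, d2 = 5. */
    static String demo() {
        int[][] before = { {0, 0}, {10, 10}, {5, 5} };
        return Arrays.deepToString(assignAll(before, new int[] {0, 1}, new int[] {2}, 2));
    }
}
```

In the demo, d0 ends up in [0, 7] and d1 in [7, 10]: each target may or may not have been overwritten with 7, which is the effect of joining the two candidate assignments.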
Method exit point
At method’s exit point we collect the addresses that
escape from the class
Somehow similar to the concept of ownership
types
They are reachable
from the method or constructor parameters
from the method’s return value
We set the corresponding locations to $\top$
The context can do anything to an exposed
location...
Outline
Concrete Semantics
Abstract Semantics
Comparison with other tools
Conclusions & Future work
A Running example
class SimplePosCounter {
private int i, N;
SimplePosCounter(int N0) {
assert -N0 <= -1;
this.N = N0;
this.i = 0; }
public void add() {
if(this.i - this.N < 0)
i = i + 1; }
public void sub() {
if(-this.i - this.N < 0)
i = i - 1; }
public void check() {
assert N - i <= -1; }
}
My analyzer
Discovers that the assertion in check() is always
violated
in 266 ms on my laptop (Centrino 1.2 GHz,
768 MB, Windows XP, Java 1.5_04)
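The reasoning behind this result can be replayed by hand on a single octagonal constraint. The sketch below is not the analyzer: it tracks only an upper bound c such that i - N ≤ c, and assumes that no field of SimplePosCounter escapes, so the context contributes nothing. check() does not modify the fields and is therefore left out of the iteration.

```java
/** Sketch: class invariant of SimplePosCounter restricted to the constraint i - N <= c. */
final class CounterInvariantSketch {
    /** Constructor: N0 >= 1, N = N0, i = 0, hence i - N <= -1. */
    static long init() { return -1; }

    /** add(): if (i - N < 0) i = i + 1; */
    static long add(long c) {
        // Then branch: the guard gives i - N <= -1, and the increment raises the bound by 1.
        long thenBranch = Math.min(c, -1) + 1;
        // Else branch: i - N >= 0, feasible only when c >= 0; the state is unchanged.
        return c >= 0 ? Math.max(thenBranch, c) : thenBranch;
    }

    /** sub() can only decrease i, so the upper bound on i - N is unchanged. */
    static long sub(long c) { return c; }

    /** Kleene iteration of c = init ⊔ add(c) ⊔ sub(c); the join of upper bounds is max. */
    static long invariant() {
        long c = init();
        while (true) {
            long next = Math.max(init(), Math.max(add(c), sub(c)));
            if (next == c) return c;
            c = next;
        }
    }

    /** check() asserts N - i <= -1, i.e. i - N >= 1; no reachable state satisfies it if c < 1. */
    static boolean checkAlwaysViolated() { return invariant() < 1; }
}
```

The iteration stabilizes at c = 0, that is i - N ≤ 0, which contradicts the asserted i - N ≥ 1 in every reachable state. Here the ascending chain is finite; in general a widening would be needed.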
ESC/Java 2
Let us consider ESC/Java 2
Unsound, but not for the example...
It is made to discover errors...
However, the wrong program passes without even
a warning...
< omissis >
SimplePosCounter: check() ...
[0.016 s 6625592 bytes] passed
It is even slower than ours: 297 ms
Daikon
It executes the program, and it infers properties from
the traces
Very popular...
It requires the user to supply test cases
Contrary to what the tool’s home page suggests
⇒ Cannot analyze the example directly
⇒ I had to write the test cases myself
class TesterSimplePosCounter {
public static void main(String[] unused) {
// ... Repeat 10 times
SimplePosCounter s = new SimplePosCounter( ... );
// ... repeat Math.random() times
double tmp = Math.random();
if(tmp <= 0.475)
s.add();
else if(tmp <= 0.999)
s.sub();
else
s.check();
}
}
If Daikon executes s.check() then it stops
⇒ Execute it many times before success
Finds wrong “invariants”, e.g.:
<omissis>
SimplePosCounter:::OBJECT
this.i >= 0
this.i < this.N
Excelsior FlawDetector
A commercial product (free trial) to statically verify:
ArrayIndexOutOfBoundsException
NullPointerException
Simple assertions
etc.
Now: “Excelsior FlawDetector is temporarily
unavailble due to technical issues.”
However, I tried a trial version
It is not compatible with Java 5.0
It requires a class with main
⇒ Cannot analyze classes in isolation
⇒ Just issues a warning for check()
JML-based/style tools
Krakatoa, LOOP, Jack and Spec#
Spec# is like JML, but for C#
They need annotations
In particular the class invariant:
invariant i <= N
to prove the class
In practice these tools require heavy annotations
Conclusions
We presented the theory and some of the abstract
operations of our analyzer
There is a lot of work to be done:
Implementation details
– Some optimizations, some syntax
transformations, etc.
Modular handling of inheritance (VMCAI’04
paper)
Better handling of mutually recursive classes
(AMAST’04 paper)
Does it scale up?
– e.g., does object caching, used in the
implementation, scale up?
Thank you!