Compactly Representing First-Order Structures for Static Analysis
Roman Manevich, Mooly Sagiv (Tel-Aviv University)
Ganesan Ramalingam, John Field, Deepak Goyal (IBM T.J. Watson)
Motivation
TVLA is a powerful and general abstract interpretation system
Abstract interpretation in TVLA
  Operational semantics is expressed with first-order logic formulae
  Program states are represented as sets of Evolving First-Order Structures
Space is a major bottleneck
Desired Properties
Sparse data structures
Share common sub-structures
Inherited sharing
Incidental sharing due to program invariants
But feasible time performance
Phase sensitive data structures
Outline
Background
First-order structure representations
Base representation (TVLA 0.91)
BDD representation
Empirical evaluation
Conclusion
First-Order Logical Structures
Generalize shape graphs
Arbitrary set of individuals
Arbitrary set of predicates on individuals
Dynamically evolving
Usually small changes
Properties are extracted by evaluating first-order formulae, e.g. ∃v1, v: x(v1) ∧ n(v1, v)
Join operator requires isomorphism testing
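To make the formula evaluation above concrete, here is a minimal C sketch of 3-valued (Kleene) evaluation of ∃v1, v: x(v1) ∧ n(v1, v). The type `TinyStructure`, the function names, and the fixed node capacity are assumptions made for illustration; they are not TVLA's actual data structures. Kleene values are stored doubled (0, 1, 2 for 0, ½, 1) so that conjunction is `min` and disjunction is `max`.

```c
#include <assert.h>
#include <stddef.h>

/* Kleene truth values, stored doubled so that 1/2 is an integer. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }   /* min */
Kleene k_or(Kleene a, Kleene b)  { return a > b ? a : b; }   /* max */
Kleene k_not(Kleene a)           { return (Kleene)(K_TRUE - a); }

#define MAX_NODES 4

/* Hypothetical toy structure: one unary predicate x, one binary predicate n. */
typedef struct {
    size_t num_nodes;
    Kleene x[MAX_NODES];
    Kleene n[MAX_NODES][MAX_NODES];
} TinyStructure;

/* Evaluates: exists v1, v : x(v1) and n(v1, v).
 * The existential quantifier is a Kleene "or" over all node pairs. */
Kleene eval_x_has_successor(const TinyStructure *s) {
    assert(s->num_nodes <= MAX_NODES);
    Kleene result = K_FALSE;
    for (size_t v1 = 0; v1 < s->num_nodes; v1++)
        for (size_t v = 0; v < s->num_nodes; v++)
            result = k_or(result, k_and(s->x[v1], s->n[v1][v]));
    return result;
}

/* A two-node structure: node 0 has x=1, and both n edges have value 1/2. */
TinyStructure example_structure(void) {
    TinyStructure s = { .num_nodes = 2 };   /* all predicates start at 0 */
    s.x[0] = K_TRUE;
    s.n[0][1] = K_HALF;
    s.n[1][1] = K_HALF;
    return s;
}
```

On `example_structure()` the formula evaluates to ½ (`K_HALF`): x holds definitely on node 0, but its only outgoing n edge is indefinite.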
First-Order Structure ADT
Structure new() /* empty structure */
SetOfNodes nodeSet(Structure)
Node newNode(Structure)
removeNode(Structure, Node)
Kleene eval(Structure, p(r), <u1, ..., ur>)
update(Structure, p(r), <u1, ..., ur>, Kleene)
Structure copy(Structure)
print_all Example
/* list.h */
typedef struct node {
    struct node *n;
    int data;
} *L;
/* print.c */
#include <stdio.h>
#include "list.h"
void print_all(L y) {
    L x;
    x = y;
    while (x != NULL) {
        /* assert(x != NULL) */
        printf("elem=%d", x->data);
        x = x->n;
    }
}
print_all Example
x = y
x’(v) := y(v)
copy(S0) : S1
nodeSet(S0) : {u1, u}
eval(S0, y, u1) : 1
update(S1, x, u1, 1)
eval(S0, y, u) : 0
update(S1, x, u, 0)
[Figure: S0 has node u1 (y=1) and summary node u (sm=½), with n edges labeled ½. S1 is the same structure with x=1 added on u1.]
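The sequence of ADT calls for x = y can be sketched in C as follows. This is a minimal illustration under stated assumptions: dense fixed-size arrays, three unary predicates (x, y, sm), one binary predicate n, and C function names invented here (`copy_structure`, `eval_unary`, `update_unary`, `assign_x_y`). It omits removeNode and is not the TVLA implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;  /* 0, 1/2, 1 */

enum { MAX_NODES = 8 };
typedef enum { PRED_X, PRED_Y, PRED_SM, NUM_UNARY } UnaryPred;

/* Hypothetical dense structure: unary predicates plus the binary predicate n. */
typedef struct {
    size_t num_nodes;
    Kleene unary[NUM_UNARY][MAX_NODES];
    Kleene n[MAX_NODES][MAX_NODES];
} Structure;

Structure new_structure(void) { Structure s = {0}; return s; }

size_t new_node(Structure *s) {
    assert(s->num_nodes < MAX_NODES);
    return s->num_nodes++;
}

Kleene eval_unary(const Structure *s, UnaryPred p, size_t u) {
    assert(u < s->num_nodes);
    return s->unary[p][u];
}

void update_unary(Structure *s, UnaryPred p, size_t u, Kleene value) {
    assert(u < s->num_nodes);
    s->unary[p][u] = value;
}

Structure copy_structure(const Structure *s) { return *s; }

/* x = y, i.e. x'(v) := y(v): copy S0, then for every node read y in S0
 * and write x in the copy, mirroring the eval/update calls on the slide. */
Structure assign_x_y(const Structure *s0) {
    Structure s1 = copy_structure(s0);
    for (size_t u = 0; u < s0->num_nodes; u++)
        update_unary(&s1, PRED_X, u, eval_unary(s0, PRED_Y, u));
    return s1;
}

/* S0 from the example: u1 (index 0) with y=1, u (index 1) with sm=1/2. */
Structure example_s0(void) {
    Structure s = new_structure();
    size_t u1 = new_node(&s);
    size_t u = new_node(&s);
    update_unary(&s, PRED_Y, u1, K_TRUE);
    update_unary(&s, PRED_SM, u, K_HALF);
    s.n[u1][u] = K_HALF;
    s.n[u][u] = K_HALF;
    return s;
}
```

Calling `assign_x_y` on `example_s0()` leaves S0 untouched and yields a structure in which x holds on u1 only. Copying the whole structure for a one-predicate change is exactly the cost that the sharing techniques in this talk aim to reduce.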
print_all Example
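while (x != NULL)
precondition : ∃v x(v)
x = x->n
focus : ∃v1 x(v1) ∧ n(v1, v)
x’(v) := ∃v1 x(v1) ∧ n(v1, v)
[Figure: structure S1 and the three structures S2.0, S2.1, S2.2 derived from it. S1: u1 (x=1, y=1), u (sm=½), n edges labeled ½. S2.0: u1 (y=1), u (sm=½), n edge labeled ½. S2.1: u1 (y=1), u (x=1), n edges labeled 1 and ½. S2.2: u1 (y=1), u.1 (x=1), u.0 (sm=½), n edges labeled 1, ½, ½, ½.]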
Overview and Main Results
1. Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
2. Implementation techniques
3. Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4–10
  New representations scale better
Base Representation
(Tal Lev-Ami SAS 2000)
Two-Level Map: Predicate → (Node Tuple → Kleene)
Sparse Representation
Limited inherited sharing by “Copy-On-Write”
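A rough C sketch of a two-level map with copy-on-write at the predicate level is given below. It only illustrates the general idea. The names (`PredTable`, `structure_update`, and so on), the reference-counting scheme, and the dense second level are assumptions for brevity, whereas the base representation described here is sparse; nothing in this sketch is taken from the TVLA 0.91 source.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;  /* 0, 1/2, 1 */

enum { NUM_PREDS = 4, MAX_TUPLES = 16 };

/* Second level: tuple index -> Kleene value, shared through a reference count. */
typedef struct {
    int refcount;
    Kleene values[MAX_TUPLES];
} PredTable;

/* First level: predicate index -> table. */
typedef struct {
    PredTable *tables[NUM_PREDS];
} Structure;

Structure structure_new(void) {
    Structure s;
    for (int p = 0; p < NUM_PREDS; p++) {
        s.tables[p] = calloc(1, sizeof(PredTable));   /* all values start at 0 */
        assert(s.tables[p] != NULL);
        s.tables[p]->refcount = 1;
    }
    return s;
}

/* Copying shares every predicate table with the original (inherited sharing). */
Structure structure_copy(const Structure *s) {
    Structure c = *s;
    for (int p = 0; p < NUM_PREDS; p++)
        c.tables[p]->refcount++;
    return c;
}

Kleene structure_eval(const Structure *s, int pred, int tuple) {
    return s->tables[pred]->values[tuple];
}

/* Copy-on-write: a shared table is cloned only when it is first modified. */
void structure_update(Structure *s, int pred, int tuple, Kleene value) {
    PredTable *t = s->tables[pred];
    if (t->refcount > 1) {
        PredTable *mine = malloc(sizeof *mine);
        assert(mine != NULL);
        *mine = *t;
        mine->refcount = 1;
        t->refcount--;
        s->tables[pred] = mine;
        t = mine;
    }
    t->values[tuple] = value;
}

void structure_free(Structure *s) {
    for (int p = 0; p < NUM_PREDS; p++) {
        if (--s->tables[p]->refcount == 0)
            free(s->tables[p]);
        s->tables[p] = NULL;
    }
}
```

The sharing is "limited" in the sense the slide suggests: a table is shared only while a structure and its copy agree on the whole predicate, and two structures that happen to reach the same contents independently share nothing.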
BDDs in a Nutshell (Bryant 86)
Ordered Binary Decision Diagrams
Data structure for Boolean functions
Functions are represented as (unique) DAGs
x1 x2 x3 | f
 0  0  0 | 0
 0  0  1 | 0
 0  1  0 | 0
 0  1  1 | 1
 1  0  0 | 0
 1  0  1 | 1
 1  1  0 | 0
 1  1  1 | 1
[Figure: decision tree for f, branching on x1, then x2, then x3, with leaves 0, 0, 0, 1, 0, 1, 0, 1]
Also achieve sharing across functions
[Figure: reduction of the decision tree over x1, x2, x3 in three panels: "Duplicate Terminals", "Duplicate Nonterminals", "Redundant Tests"]
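The three reduction rules can be sketched in a few lines of C. This is a toy illustration, not a production BDD package: the `Bdd` manager type, the linear-scan unique table, and the function names are assumptions made here. Duplicate terminals are avoided by reserving indices 0 and 1, duplicate nonterminals by looking up an existing (var, low, high) triple, and redundant tests by returning the child when both branches agree.

```c
#include <assert.h>

enum { BDD_ZERO = 0, BDD_ONE = 1, MAX_BDD_NODES = 64 };

typedef struct { int var, low, high; } BddNode;

/* Hypothetical manager: indices 0 and 1 are the two terminals. */
typedef struct {
    BddNode nodes[MAX_BDD_NODES];
    int count;
} Bdd;

void bdd_init(Bdd *b) { b->count = 2; }

/* Returns the unique node testing `var` with the given children. */
int bdd_mk(Bdd *b, int var, int low, int high) {
    if (low == high)                       /* redundant test */
        return low;
    for (int i = 2; i < b->count; i++)     /* duplicate nonterminal */
        if (b->nodes[i].var == var && b->nodes[i].low == low &&
            b->nodes[i].high == high)
            return i;
    assert(b->count < MAX_BDD_NODES);
    b->nodes[b->count] = (BddNode){ var, low, high };
    return b->count++;
}

/* Builds the BDD of a truth table whose rows are ordered with x1 as the
 * most significant bit. `index` accumulates the bits chosen so far. */
int bdd_from_table(Bdd *b, const int *table, int var, int num_vars, int index) {
    if (var > num_vars)
        return table[index] ? BDD_ONE : BDD_ZERO;
    int low = bdd_from_table(b, table, var + 1, num_vars, 2 * index);
    int high = bdd_from_table(b, table, var + 1, num_vars, 2 * index + 1);
    return bdd_mk(b, var, low, high);
}

/* Follows one path from the root; assignment[k - 1] is the value of xk. */
int bdd_eval(const Bdd *b, int root, const int *assignment) {
    int n = root;
    while (n > BDD_ONE) {
        const BddNode *node = &b->nodes[n];
        n = assignment[node->var - 1] ? node->high : node->low;
    }
    return n;
}
```

Building the table f = 0, 0, 0, 1, 0, 1, 0, 1 from the slide with `bdd_from_table(&b, f, 1, 3, 0)` creates three nonterminal nodes instead of the seven of the full decision tree. Building a second function in the same manager reuses existing nodes, which is the sharing across functions mentioned above.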
Encoding Structures Using Integers
Static encoding of
  Predicates
  Kleene values
Dynamic encoding of nodes
  0, 1, …, n-1
Encode predicate p’s values as
  ep(p).en(u1).en(u2). … .en(un).ek(Kleene)
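The concatenated encoding can be illustrated with bit fields packed into one integer. The field widths, the numeric codes chosen for predicates and Kleene values, and the function name `encode_fact` are all assumptions for this sketch; the slides do not specify them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths: 4 bits per predicate id, 4 bits per node id,
 * 2 bits for the Kleene value. */
enum { PRED_BITS = 4, NODE_BITS = 4, KLEENE_BITS = 2, MAX_ARITY = 6 };

/* Assumed static encoding ek of Kleene values. */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Packs ep(p) . en(u1) . ... . en(ur) . ek(value) into one integer,
 * most significant field first. */
uint32_t encode_fact(unsigned pred, const unsigned *nodes, unsigned arity,
                     Kleene value) {
    assert(pred < (1u << PRED_BITS));
    assert(arity <= MAX_ARITY);            /* keeps the code within 32 bits */
    uint32_t code = pred;
    for (unsigned i = 0; i < arity; i++) {
        assert(nodes[i] < (1u << NODE_BITS));
        code = (code << NODE_BITS) | nodes[i];
    }
    return (code << KLEENE_BITS) | (uint32_t)value;
}
```

For example, with predicate id 3 standing for n, the fact n(u0, u1) = ½ becomes the bit string 0011 0000 0001 01. A structure is then the set of such integers, one per predicate tuple. A real encoding would also need to pad tuples of different arities to a common length, which this sketch does not do.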
BDD Representation of Integer Sets
Characteristic function
S = {1, 5}
1 = <001>
5 = <101>
S = (¬x1 ∧ ¬x2 ∧ x3) ∨ (x1 ∧ ¬x2 ∧ x3)
[Figure: BDD of the characteristic function of S over x1, x2, x3]
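As a small worked check, the characteristic function of S = {1, 5} can be written directly in C and compared against set membership for every 3-bit integer. The function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Characteristic function of S = {1, 5} over the bits <x1 x2 x3>,
 * written exactly as the disjunction of the two encoded elements. */
bool in_s(bool x1, bool x2, bool x3) {
    return (!x1 && !x2 && x3) || (x1 && !x2 && x3);
}

/* Logically equivalent form: x1 does not influence the result. */
bool in_s_simplified(bool x1, bool x2, bool x3) {
    (void)x1;
    return !x2 && x3;
}

/* Decodes an integer 0..7 (x1 is the most significant bit) and tests it. */
bool contains(unsigned value) {
    assert(value < 8);
    return in_s((value >> 2) & 1u, (value >> 1) & 1u, value & 1u);
}
```

`contains` returns true exactly for 1 and 5. The simplified form shows why the set is cheap to represent: both elements agree on x2 and x3, so the test on x1 carries no information.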
BDD Representation Example
[Figure: BDDs rooted at S0, S1, and S2.2, encoding the structures S0, S1 (after x = y), and S2.2 (after x = x->n) of the print_all example]
Improved BDD Representation
Using this representation directly doesn’t save space
Observation
  Node names can be arbitrarily remapped without affecting the ADT semantics
Our heuristics
  Use canonical node names to encode nodes
  Increases incidental sharing
  Reduces isomorphism test to pointer comparison
Space is reduced by a factor of 4–10
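One way to picture canonical node names is to renumber nodes by their unary-predicate values, so that structures differing only in node naming end up with identical tables. The C sketch below assumes every node has a distinct unary signature and uses a dense toy structure. It is an illustrative heuristic written for this note, not necessarily the one used in the implementation described here.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;  /* 0, 1/2, 1 */

enum { MAX_NODES = 8, NUM_UNARY = 3 };   /* unary predicates: x, y, sm */

typedef struct {
    size_t num_nodes;
    Kleene unary[MAX_NODES][NUM_UNARY];   /* unary[u][p] */
    Kleene n[MAX_NODES][MAX_NODES];
} Structure;

/* Signature of node u: its unary values read as a base-3 number. */
unsigned signature(const Structure *s, size_t u) {
    unsigned sig = 0;
    for (size_t p = 0; p < NUM_UNARY; p++)
        sig = sig * 3 + (unsigned)s->unary[u][p];
    return sig;
}

/* Returns a copy whose nodes are renumbered in increasing signature order.
 * Assumes all signatures are distinct. */
Structure canonicalize(const Structure *s) {
    size_t order[MAX_NODES];
    assert(s->num_nodes <= MAX_NODES);
    for (size_t i = 0; i < s->num_nodes; i++)
        order[i] = i;
    for (size_t i = 1; i < s->num_nodes; i++) {     /* insertion sort */
        size_t u = order[i], j = i;
        while (j > 0 && signature(s, order[j - 1]) > signature(s, u)) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = u;
    }
    Structure c = {0};
    c.num_nodes = s->num_nodes;
    for (size_t i = 0; i < s->num_nodes; i++) {
        for (size_t p = 0; p < NUM_UNARY; p++)
            c.unary[i][p] = s->unary[order[i]][p];
        for (size_t j = 0; j < s->num_nodes; j++)
            c.n[i][j] = s->n[order[i]][order[j]];
    }
    return c;
}

/* Element-wise equality; no search over node permutations is needed
 * once both structures are canonical. */
int same_structure(const Structure *a, const Structure *b) {
    if (a->num_nodes != b->num_nodes)
        return 0;
    for (size_t i = 0; i < a->num_nodes; i++) {
        for (size_t p = 0; p < NUM_UNARY; p++)
            if (a->unary[i][p] != b->unary[i][p])
                return 0;
        for (size_t j = 0; j < a->num_nodes; j++)
            if (a->n[i][j] != b->n[i][j])
                return 0;
    }
    return 1;
}
```

In this sketch the isomorphism test becomes an element-wise comparison. When the canonical structures are stored as hash-consed BDDs, equal structures are the same BDD node, which is how the comparison can shrink further to a pointer comparison.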
Reducing Time Overhead
Current implementation not optimized
Expensive formula evaluation
Hybrid representation
Distinguish between phases:
mutable phase → Join → immutable phase
Dynamically switch representations
Functional Representation
Alternative representation for first-order structures
Structures represented by maps from integers to
Kleene values
Tailored for representing first-order structures
Achieves better results than BDDs
Techniques similar to the BDD representation
More details in the paper
Empirical Evaluation
Benchmarks:
Cleanness Analysis (SAS 2000)
Garbage Collector
CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks
Mobile Ambients (ESOP 2000)
Stress testing the representations
We use “relational analysis”
Save structures in every CFG location
Space Results
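[Figure: bar chart of space consumption for the Base, OBDD total, and Functional representations on the JFE, KERNEL, CA, MA, and GC benchmarks; vertical axis 0 to 450; bar labels in the extracted text: 402.8, 187.7, 168.2, 51.6, 12.8, 5.5, 22.7, 16.7, 12.9, 9.6]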
Abstract Counters
Ignore language/implementation details
A more reliable measurement technique
Count only crucial space information
Independent of C/Java
Abstract Counters Results
[Figure: bar chart of abstract counter totals for the Base, OBDD, and Functional representations on the JFE, KERNEL, CA, MA, and GC benchmarks; vertical axis 0 to 45,000,000]
Trends in the Cleanness Analysis Benchmark
[Figure: chart comparing the Base, OBDD, and Functional representations; horizontal axis 1 to 10, vertical axis 0 to 600; point labels in the extracted text: 564, 505, 74, 54, 42, 50]
What’s Missing from this Work?
Investigate other node mapping heuristics
Compactly represent sets of structures
Time optimizations
Conclusions
Two novel representations of first-order structures
  New BDD representation
  New representation using functional maps
Implementation techniques
  Normalization techniques are crucial
Empirical evaluation
  Comparison of different representations
  Space is reduced by a factor of 4–10
  New representations scale better
Conclusions
The use of BDDs for static analysis
is not a panacea for space saving
Domain-specific encoding crucial for saving space
Failed attempts
Original implementation of Veith’s encoding
PAG
The End