Compactly Representing First-Order Structures for Static Analysis

Tel-Aviv University: Roman Manevich, Mooly Sagiv
IBM T.J. Watson: Ganesan Ramalingam, John Field, Deepak Goyal

Motivation

- TVLA is a powerful and general abstract interpretation system
- Abstract interpretation in TVLA
  - Operational semantics is expressed with first-order logic formulae
  - Program states are represented as sets of evolving first-order structures
- Space is a major bottleneck

Desired Properties

- Sparse data structures
- Share common sub-structures
  - Inherited sharing
  - Incidental sharing due to program invariants
- But feasible time performance
  - Phase-sensitive data structures

Outline

- Background
- First-order structure representations
  - Base representation (TVLA 0.91)
  - BDD representation
- Empirical evaluation
- Conclusion

First-Order Logical Structures

- Generalize shape graphs
  - Arbitrary set of individuals
  - Arbitrary set of predicates on individuals
- Dynamically evolving
  - Usually small changes
- Properties are extracted by evaluating first-order formulae, e.g. ∃v1, v: x(v1) ∧ n(v1, v)
- Join operator requires isomorphism testing

First-Order Structure ADT

    Structure  : new()    /* empty structure */
    SetOfNodes : nodeSet(Structure)
    Node       : newNode(Structure)
    removeNode(Structure, node)
    Kleene     : eval(Structure, p(r), <u1, ..., ur>)
    update(Structure, p(r), <u1, ..., ur>, Kleene)
    Structure  : copy(Structure)

print_all Example

    /* list.h */
    typedef struct node {
        struct node * n;
        int data;
    } * L;

    /* print.c */
    #include <stdio.h>
    #include "list.h"

    void print_all(L y) {
        L x;
        x = y;
        while (x != NULL) {
            /* assert(x != NULL) */
            printf("elem=%d", x->data);
            x = x->n;
        }
    }

print_all Example

- Statement: x = y
- Update formula: x'(v) := y(v)
- ADT operations:
    copy(S0) : S1
    nodeSet(S0) : {u1, u}
    eval(S0, y, u1) : 1    update(S1, x, u1, 1)
    eval(S0, y, u) : 0     update(S1, x, u, 0)
- [Figure: structure S0 with node u1 (y=1) and node u (sm=½), n edges labelled ½; structure S1 is the same with x=1 added on u1]

print_all Example

- while (x != NULL)
  - precondition : ∃v x(v)
- x = x->n
  - focus : ∃v1 x(v1) ∧ n(v1, v)
  - x'(v) := ∃v1 x(v1) ∧ n(v1, v)
- [Figure: structure S1 and the structures S2.0, S2.1 and S2.2 derived from it. Labels shown: S1: u1 (x=1, y=1), u (sm=½), n=½. S2.0: u1 (y=1), u (sm=½), n=½. S2.1: u1 (y=1), n=1, u (x=1), n=½. S2.2: u1 (y=1), n=1, u.1 (x=1), n=½, n=½, u.0 (sm=½)]

Overview and Main Results

1. Two novel representations of first-order structures
   - New BDD representation
   - New representation using functional maps
2. Implementation techniques
3. Empirical evaluation
   - Comparison of different representations
   - Space is reduced by a factor of 4–10
   - New representations scale better

Base Representation (Tal Lev-Ami, SAS 2000)

- Two-level map : Predicate → (Node Tuple → Kleene)
- Sparse representation
- Limited inherited sharing by "Copy-On-Write"

BDDs in a Nutshell (Bryant 86)

- Ordered Binary Decision Diagrams
- Data structure for Boolean functions
- Functions are represented as (unique) DAGs
- Also achieve sharing across functions

    x1  x2  x3 | f
    -----------+---
     0   0   0 | 0
     0   0   1 | 0
     0   1   0 | 0
     0   1   1 | 1
     1   0   0 | 0
     1   0   1 | 1
     1   1   0 | 0
     1   1   1 | 1

- [Figure: decision tree for f over x1, x2, x3, and its reduction in three steps: merging duplicate terminals, merging duplicate nonterminals, removing redundant tests]

Encoding Structures Using Integers

- Static encoding of
  - Predicates
  - Kleene values
- Dynamic encoding of nodes
  - 0, 1, …, n-1
- Encode predicate p's values as ep(p).en(u1).en(u2). … .en(un).ek(Kleene)

BDD Representation of Integer Sets

- Characteristic function
- S = {1, 5}, with 1 = <001> and 5 = <101>
- S = (¬x1 ∧ ¬x2 ∧ x3) ∨ (x1 ∧ ¬x2 ∧ x3)
- [Figure: BDD for S over the variables x1, x2, x3]

BDD Representation Example

- [Figure: the structures S0, S1 (after x = y) and S2.2 (after x = x->n) from the print_all example, shown together with their BDD representations]

Improved BDD Representation

- Using this representation directly doesn't save space
- Observation
  - Node names can be arbitrarily remapped without affecting the ADT semantics
- Our heuristics
  - Use canonical node names to encode nodes
  - Increases incidental sharing
  - Reduces isomorphism test to pointer comparison
  - Space reduced by a factor of 4–10

Reducing Time Overhead

- Current implementation not optimized
- Expensive formula evaluation
- Hybrid representation
  - Distinguish between phases: mutable phase → Join → immutable phase
  - Dynamically switch representations

Functional Representation

- Alternative representation for first-order structures
- Structures represented by maps from integers to Kleene values
- Tailored for representing first-order structures
- Achieves better results than BDDs
- Techniques similar to the BDD representation
- More details in the paper

Empirical Evaluation

- Benchmarks:
  - Cleanness Analysis (SAS 2000)
  - Garbage Collector
  - CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks
  - Mobile Ambients (ESOP 2000)
- Stress testing the representations
  - We use "relational analysis"
  - Save structures in every CFG location

Space Results

- [Figure: bar chart comparing the Base, OBDD total and Functional representations on the JFE, KERNEL, CA, MA and GC benchmarks; y-axis from 0 to 450. Data labels shown in the chart: 402.8, 187.7, 168.2, 51.6, 12.8, 5.5, 22.7, 16.7, 12.9, 9.6]

Abstract Counters

- Ignore language/implementation details
- A more reliable measurement technique
- Count only crucial space information
- Independent of C/Java

Abstract Counters Results

- [Figure: bar chart comparing the Base, OBDD and Functional representations on the JFE, KERNEL, CA, MA and GC benchmarks; y-axis from 0 to 45,000,000]

Trends in the Cleanness Analysis Benchmark

- [Figure: chart comparing the Base, OBDD and Functional representations; x-axis from 1 to 10, y-axis from 0 to 600. Data labels shown in the chart: 564, 505, 74, 54, 42, 50]

What's Missing from this Work?

- Investigate other node mapping heuristics
- Compactly represent sets of structures
- Time optimizations

Conclusions

- Two novel representations of first-order structures
  - New BDD representation
  - New representation using functional maps
- Implementation techniques
  - Normalization techniques are crucial
- Empirical evaluation
  - Comparison of different representations
  - Space is reduced by a factor of 4–10
  - New representations scale better

Conclusions

- The use of BDDs for static analysis is not a panacea for space saving
- Domain-specific encoding is crucial for saving space
- Failed attempts
  - Original implementation of Veith's encoding
  - PAG

The End
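
The First-Order Structure ADT and the two-level map of the Base Representation slide can be illustrated with a small sketch. This is not TVLA's implementation: the class name Structure, the method names, the use of 0.0, 0.5 and 1.0 for the Kleene values, and the deep copy (instead of copy-on-write) are all assumptions made for illustration. The demo replays the x = y step of the print_all example; the n edges given to S0 are one plausible reading of the figure.

```python
import copy as _copy

# Kleene values: false, unknown, true (representation assumed for this sketch).
ZERO, HALF, ONE = 0.0, 0.5, 1.0


class Structure:
    """Sparse two-level map: predicate name -> {node tuple -> Kleene value}."""

    def __init__(self):
        self.nodes = set()
        self.preds = {}      # only non-zero entries are stored
        self._next = 0       # next fresh node id

    def node_set(self):
        return set(self.nodes)

    def new_node(self):
        node = self._next
        self._next += 1
        self.nodes.add(node)
        return node

    def remove_node(self, node):
        self.nodes.discard(node)
        for table in self.preds.values():
            for tup in [t for t in table if node in t]:
                del table[tup]

    def eval(self, pred, tup):
        # Missing entries default to 0, which keeps the map sparse.
        return self.preds.get(pred, {}).get(tuple(tup), ZERO)

    def update(self, pred, tup, value):
        table = self.preds.setdefault(pred, {})
        if value == ZERO:
            table.pop(tuple(tup), None)
        else:
            table[tuple(tup)] = value

    def copy(self):
        # Deep copy for simplicity; the slides mention copy-on-write instead.
        return _copy.deepcopy(self)


def assign_x_y(s0):
    """Transfer for the statement x = y, i.e. x'(v) := y(v)."""
    s1 = s0.copy()
    for node in s0.node_set():
        s1.update("x", (node,), s0.eval("y", (node,)))
    return s1


# Build S0: u1 is pointed to by y, u is a summary node.
s0 = Structure()
u1, u = s0.new_node(), s0.new_node()
s0.update("y", (u1,), ONE)
s0.update("sm", (u,), HALF)
s0.update("n", (u1, u), HALF)   # assumed edge, read off the figure
s0.update("n", (u, u), HALF)    # assumed edge, read off the figure

s1 = assign_x_y(s0)
print(s1.eval("x", (u1,)), s1.eval("x", (u,)))
```

Because every statement starts from a full copy, two structures that differ in a single predicate value still store everything twice, which is the kind of overhead the sharing-oriented representations in the deck aim to avoid.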
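
The BDD Representation of Integer Sets slide, together with the reduction rules from BDDs in a Nutshell, can be sketched as a tiny reduced ordered BDD manager. Everything here is an assumption for illustration (the class BDD, its methods, integer node ids, x1 taken as the most significant bit); it is not the BDD package used by the authors, and from_set enumerates all bit patterns, so it only suits small widths. The point it shows is that a unique table makes the representation canonical, so equal sets end up as the same node and equality becomes an id comparison.

```python
class BDD:
    """Tiny reduced ordered BDD manager; ids 0 and 1 are the terminals."""

    def __init__(self, nbits):
        self.nbits = nbits
        self.nodes = [None, None]   # id -> (var, low, high)
        self.unique = {}            # (var, low, high) -> id

    def mk(self, var, low, high):
        if low == high:             # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:  # merge duplicate nonterminals
            self.unique[key] = len(self.nodes)
            self.nodes.append(key)
        return self.unique[key]

    def from_set(self, values, var=0, prefix=0):
        """Characteristic function of a set of nbits-wide integers."""
        if var == self.nbits:
            return 1 if prefix in values else 0
        low = self.from_set(values, var + 1, prefix << 1)
        high = self.from_set(values, var + 1, (prefix << 1) | 1)
        return self.mk(var, low, high)

    def contains(self, root, value):
        node = root
        while node > 1:
            var, low, high = self.nodes[node]
            bit = (value >> (self.nbits - 1 - var)) & 1
            node = high if bit else low
        return node == 1


bdd = BDD(3)
root = bdd.from_set({1, 5})      # 1 = <001>, 5 = <101>
again = bdd.from_set({5, 1})     # same set, built a second time

print([v for v in range(8) if bdd.contains(root, v)])
print(root == again)             # canonical form: same node id
```

For S = {1, 5} the value of x1 does not matter, so the redundant-test rule drops the x1 node and only an x2 node and an x3 node remain in this sketch.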