Inferring Heap Abstraction Grammars

Inferring Heap Abstraction Grammars
Alexander Dominik Weinert
RWTH Aachen University
July 30, 2012
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
1 / 33
Model Checking on Unbounded Structures
Model Checking
Verification by exploration of states
Input: int i1 , int i2
while i1 6= 0 do
i1 := i1 − 1
i2 := i2 + 1
end while
Input: List l
Element e := l.first()
while e 6= null do
e := l.next(e)
end while
Infinite number of states χ
Finite number of states X
Idea [Heinen et al., 2009]
Heap Abstraction Grammars ⇒ Finitely many heap states
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
2 / 33
Alphabets
Traditional Case
An alphabet is a finite set of symbols
Definition
An alphabet is a triple: Σ := (N, T , rk),
where
N – nonterminal symbols
T – terminal symbols
rk: N ∪ T → N – ranking function
and N ∩ T = ∅
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
3 / 33
Heap Configurations
Definition
1
A heap configuration over an alphabet Σ
and a finite set of symbols Γ is a tuple
n
G := (V , E , labV , labE , att, ext, ⊥)
r
l
V – nodes
n
E – edges
n
1
labV: V → Γ – node labels
r
labE: E → N ∪ T – edge labels
att: E →
ext ∈
V∗
V∗
– attachment
– external nodes
S
l
2
⊥
⊥ ∈ V – null node
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
4 / 33
Heap Configurations (cont.)
Definitions
Rank of an edge := number of nodes attached
Terminal edge :⇔ labE (e) ∈ T
Terminal heap configuration :⇔ all edges are terminal
Requirements
Terminal edges are of rank 2
Rank of a label determines rank of edges
All outgoing pointers of a class are represented
All pointers are connected according to their type
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
5 / 33
Hyperedge Substitution
Given: heap configurations G , H ,edge e
Assumed: e ∈ EG , labE (e) ∈ N, rk(e) = |extH |.
Substitute e with H, use external nodes as attachment points.
n
l
n
1
n
n
T
l
l
n
n
r
n
l
2
⊥
r
r
1
r
n
l
r
r
l
l
n
l
r
r
⊥
⊥
2
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
6 / 33
Data Structure Grammars
Definition
A Data Structure Grammar (DSG) I over an alphabet Σ and a set of
symbols Γ is a tuple (N, T , P, S),
where
N – nonterminal symbols
T – terminal symbols
P = {X1 → G1 , . . . , Xn → Gn } – production rules
S – axiom
L(I ) := Set of all terminal configurations that can be derived from S
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
7 / 33
Heap Abstraction Grammar
Definition
A Heap Abstraction Grammar (HAG) is a Data Structure Grammar
(DSG), that
1
Produces only valid heap configurations
2
Fulfills other restrictions [Heinen et al., tbp]
Any DSG that fulfils the first condition, but not the second one can be
transformed into a HAG [Jansen, 2010].
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
8 / 33
Problem
Given: Set of Heap Configurations L
Goal: Heap Abstraction Grammar I with L(I ) ⊇ L
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
9 / 33
Overview
L, Σ
Produce Grammar
I
First: L(I ) = L
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
10 / 33
Overview
L, Σ
Init
I
Find Suitable Subgraph
I
I
I , sub
I
Abstract All Occurrences
X
I , sub
Find Fresh Nonterminal
X
Add Rule
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
11 / 33
Overview
L, Σ
I
Init
Get All Subgraphs
I , subs
I
Choose Subgraph
I , sub
I
I
Abstract All Occurrences
X
I , sub
Find Fresh Nonterminal
X
Add Rule
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
12 / 33
Subgraph Enumeration
Problem: Exponentially many subgraphs
Solution: Growing subgraphs [Jonyer et al., 2002]
1
l
l
l
1
1
l
l
n
n
2
n
n
n
n
n
2
A. Weinert (RWTH Aachen)
Inferring HAGs
n
2
July 30, 2012
13 / 33
Duplicate and Isomorphic Subgraphs
a
n
1
a
n
1
a
b
n
c
n
1
n
b
b
n
n
c
2
b
n
1
n
c
n
c
n
d
d
n
n
2
d
2
n
2
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
14 / 33
Duplication vs. Isomorphism
Isomorphic subgraphs =
ˆ Same structure at different points in the graph
Duplicate subgraphs =
ˆ Same structure at same point in the graph
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
15 / 33
Generate Seeds
Get Subgraph
yes
Generate Extensions
duplic.?
yes
no
isom.?
no
no
extend?
yes
Add Structure
Add Embedding
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
16 / 33
Overview
L, Σ
I
Init
Get All Subgraphs
I , subs
I
Choose Subgraph
I , sub
I
I
Abstract All Occurrences
X
I , sub
Find Fresh Nonterminal
X
Add Rule
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
17 / 33
Choosing a Subgraph – Minimum Description Length
Question: Which subgraph to abstract?
Answer: Optimization of cost function
Minimum Description Length [Rissanen, 1978]
sub = argmin(cost(X → H) + cost(L0 )),
where L0 is L under the assumption that the rule X → H is known.
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
18 / 33
Description Length
Definition
Cost of a heap configuration G :
cost(G ) = |V | + |E | +
X
rk(e)
e∈E
Cost of a set of heap configurations L:
X
cost(L) =
cost(G )
G ∈L
Description length of a production rule:
cost(X → H) = 1 + cost(H)
Several definitions possible
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
19 / 33
Simplification
After a single substitution of H
cost(L0 ) = cost(L) − cost(H) + 2 · |extH | + 1
After n substitutions of H
cost(L0 ) = cost(L) − n · (cost(H) + 2 · |extH | + 1)
⇒ Computation of cost without actual replacement
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
20 / 33
Overview
L, Σ
I
Init
Get All Subgraphs
I , subs
I
Choose Subgraph
I , sub
I
I
Abstract All Occurrences
X
I , sub
Find Fresh Nonterminal
X
Add Rule
Now: L(I ) ⊇ L
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
21 / 33
Recursive Structures [Jonyer et al., 2002]
n
n
2
s
s
1
n
n
3
n
n
n
2
s
s
s
1
n
n
n
3
n
n
2
s
s
1
n
n
3
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
22 / 33
n
s
n
s
n
n
n
s
n
s
n
n
Definition
A structure S is recursive, iff there exist two embeddings, such that
There are no nonterminal edges attached to the external nodes
The two embeddings do not share internal nodes
The external nodes can be partitioned into entry- and exit-nodes
There are no incoming pointers to the entry-nodes
There are no outgoing pointers from the exit-nodes
The exit-nodes of one embedding can be reached from the
entry-nodes of the other one
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
23 / 33
Recursive Rules added to the Grammar
Concatenation Rules
2
s
1
n
n
s
X
1
n
2
3
s
1
s
n
n
3
n
2
X
1
n
2
n
3
3
Abstraction in Graph
Finalization Rule
n
2
s
s
1
1
n
3
A. Weinert (RWTH Aachen)
1
2
n
3
n
2
X
3
Inferring HAGs
July 30, 2012
24 / 33
Complete Algorithm
L, Σ
I
Init
Get All Subgraphs
I , subs
I
Abstract Recursion
I
I , subs
Choose Subgraph
I , sub
I
I
Abstract All Occurrences
X
I , sub
Find Fresh Nonterminal
X
Add Rule
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
25 / 33
Singly Linked List
Input: Singly linked lists with 25 to 200 nodes
Nodes
25
50
75
100
125
150
175
200
A. Weinert (RWTH Aachen)
Subgraphs [ms]
90
285
437
642
1 001
1 455
1 884
2 895
Inferring HAGs
Complete [ms]
102
305
500
682
1 040
1 526
2 000
3 028
July 30, 2012
26 / 33
Singly Linked Circular List
Input: Singly linked circular lists with 25 to 200 nodes
Nodes
25
50
75
100
125
150
175
200
A. Weinert (RWTH Aachen)
Subgraphs [ms]
75
261
451
680
979
1 465
1 889
2 805
Inferring HAGs
Complete [ms]
90
281
464
658
1 032
1 511
1 933
2 995
July 30, 2012
27 / 33
Singly Linked Nested List
Input: Singly linked nested lists
n
o
v
v
i
i
n
n
n
n
i
n
o
i
A. Weinert (RWTH Aachen)
⊥
Outer
2
2
4
4
4
5
5
5
Inner
2
4
2
4
5
2
4
5
Inferring HAGs
Inner [ms]
9
19
31
394
1 280
96
2 701
51 608
Outer [ms]
8
5
16
11
20
23
22
23
July 30, 2012
28 / 33
Binary Tree
n
l
r
n
n
l
r
r
l
n
l
n
n
l
n
r
l
r
l
r
l
n
l
n
r
r
n
r
n
l
n
l r l r
n
l
n
r
l
n
r
⊥
A. Weinert (RWTH Aachen)
Inferring HAGs
r
July 30, 2012
l
r
29 / 33
Binary Tree – Rules
1
1
n
n
r
l
X1 →
n
1
n
X2 →
2
n
2
r
l
r
3
l
⊥
⊥
2
3
A. Weinert (RWTH Aachen)
X1
Inferring HAGs
July 30, 2012
30 / 33
Summary
Definition: Heap Configurations and Heap Abstraction Grammars
Problem: Generate grammar from set of Heap Configurations
Solution: Abstract subgraphs step by step
1
2
Find recursive subgraphs
Abstract subgraph with greatest gain in description length
Backed by experimental results
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
31 / 33
Further Research
Subgraph enumeration avoids certain graphs
Efficiently enumerate subgraphs with unconnected internal nodes
Recursion only works for simply concatenations
Allow multiple sets of entry- and exit-points
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
32 / 33
Thank you for your attention
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
33 / 33
Heinen, J., Jansen, C., Katoen, J.-P., and Noll, T. (tbp).
Juggrnaut: Using Graph Grammars for Abstracting Unbounded Heap
Structures.
Heinen, J., Noll, T., and Rieger, S. (2009).
Juggrnaut: Graph Grammar Abstraction for Unbounded Heap
Structures.
In Johnsen, E. B. and Stolz, V., editors, Harnessing Theories for Tool
Support in Software, Preliminary Proceedings, pages 53–67. United
Nations University - International Institute for Software Technology.
Jansen, C. (2010).
Konstruktion und Inferenz von Hypergraphabstraktionsgrammatiken.
Diplomarbeit, RWTH Aachen University.
Jonyer, I., Holder, L. B., and Cook, D. J. (2002).
Concept Formation Using Graph Grammars.
In Proceedings of the KDD Workshop on Multi-Relational Data
Mining.
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
33 / 33
Rissanen, J. (1978).
Modeling by shortest data description.
Automatica, 14(5):465–471.
A. Weinert (RWTH Aachen)
Inferring HAGs
July 30, 2012
33 / 33