Integrating Testing and Theorem Proving
Harsh Raju Chamarthi, Peter C. Dillinger, Matt Kaufmann and
Pete Manolios
Northeastern University and University of Texas at Austin
November 4, 2011
Overview
Motivation
- Teaching freshmen how to reason about programs
- Improved interactive theorem proving experience
Concrete Counterexamples!!
In the spirit of
Combining automated methods with interactive theorem proving
technology results in a more powerful, yet automated method.
Takeaway
Combining automated testing methods with theorem proving
technology results in more powerful, yet still automated, theorem
proving.
A First Try: Random Testing
The simplest approach -- Instantiate free variables and evaluate!
A Conjecture
(implies (and (pos-listp X)
              (integer-listp Y))
         (> (sum-list X)
            (sum-list Y)))
Instance
(implies (and (pos-listp "aab")
              (integer-listp 1))
         (> (sum-list "aab")
            (sum-list 1)))
Vacuous!
Instance
(implies (and (pos-listp 0)
              (integer-listp "abb"))
         (> (sum-list 0)
            (sum-list "abb")))
Vacuous!
Instance
(implies (and (pos-listp NIL)
              (integer-listp '|ko|))
         (> (sum-list NIL)
            (sum-list '|ko|)))
Vacuous!
Instance
(implies (and (pos-listp '|b|)
              (integer-listp T))
         (> (sum-list '|b|)
            (sum-list T)))
Vacuous!
Instance
(implies (and (pos-listp '(0))
              (integer-listp 1/3))
         (> (sum-list '(0))
            (sum-list 1/3)))
Vacuous!
Instance
(implies (and (pos-listp 95)
              (integer-listp '(-1 0)))
         (> (sum-list 95)
            (sum-list '(-1 0))))
Vacuous!
Instance
(implies (and (pos-listp '(2 1))
              (integer-listp '(-3 1)))
         (> (sum-list '(2 1))
            (sum-list '(-3 1))))
Witness!
Instance
(implies (and (pos-listp '(1 3))
              (integer-listp '(9)))
         (> (sum-list '(1 3))
            (sum-list '(9))))
Counterexample!
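The classification used on these instances can be sketched in Python. This is a model of the idea only, not ACL2 code: `classify`, `random_value`, `run_trials`, and the value distribution are hypothetical names and choices made for illustration. An instance is vacuous when a hypothesis is false, a witness when hypotheses and conclusion both hold, and a counterexample when the hypotheses hold but the conclusion fails.

```python
import random


def is_int(v):
    # Python treats bool as a subclass of int, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


def pos_listp(v):
    """True when v is a list of positive integers (the empty list counts)."""
    return isinstance(v, list) and all(is_int(e) and e > 0 for e in v)


def integer_listp(v):
    """True when v is a list of integers."""
    return isinstance(v, list) and all(is_int(e) for e in v)


def classify(x, y):
    """Classify one instantiation of the conjecture."""
    if not (pos_listp(x) and integer_listp(y)):
        return "vacuous"          # a hypothesis is false
    return "witness" if sum(x) > sum(y) else "counterexample"


def random_value(rng):
    """Draw from a mixed, untyped universe (assumed distribution)."""
    kind = rng.choice(["int", "str", "none", "list"])
    if kind == "int":
        return rng.randint(-5, 100)
    if kind == "str":
        return rng.choice(["aab", "abb", "ko"])
    if kind == "none":
        return None
    return [rng.randint(-3, 9) for _ in range(rng.randint(0, 3))]


def run_trials(n, seed=0):
    """Count how many of n random instances fall into each class."""
    rng = random.Random(seed)
    counts = {"vacuous": 0, "witness": 0, "counterexample": 0}
    for _ in range(n):
        counts[classify(random_value(rng), random_value(rng))] += 1
    return counts
```

Calling `run_trials(1000)` gives a rough feel for how many untyped instances are wasted on vacuous cases under this particular distribution.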
Data Definition Framework
Characterizing Types
- Predicate
- Enumerator (enables automatic test data generation)
foo is a defdata type iff
1. predicate foop is defined and
2. either enumerator nth-foo or *foo-values* is defined
Type Combinations
1. Union type -- (defdata pf (oneof pos foo))
2. Product type -- (defdata bar (cons (/ 1 pos) nat-list))
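The predicate-plus-enumerator view of a type, and the way union and product types can be built from existing ones, can be modeled in a few lines of Python. This is a sketch under stated assumptions, not how defdata is implemented: `DataType`, `oneof`, `pair`, and `split` are hypothetical names, and the index-splitting scheme is just one simple choice.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class DataType:
    predicate: Callable[[Any], bool]   # recognizer, in the role of foop
    enumerator: Callable[[int], Any]   # natural number -> value, in the role of nth-foo


def is_int(v):
    # Python treats bool as a subclass of int, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


pos = DataType(lambda v: is_int(v) and v > 0, lambda n: n + 1)
boolean = DataType(lambda v: isinstance(v, bool), lambda n: n % 2 == 1)


def oneof(a, b):
    """Union type: even indices enumerate a, odd indices enumerate b."""
    return DataType(
        lambda v: a.predicate(v) or b.predicate(v),
        lambda n: a.enumerator(n // 2) if n % 2 == 0 else b.enumerator(n // 2),
    )


def split(n):
    """Split one index into two by de-interleaving its bits."""
    i = j = bit = 0
    while n:
        i |= (n & 1) << bit
        n >>= 1
        j |= (n & 1) << bit
        n >>= 1
        bit += 1
    return i, j


def pair(a, b):
    """Product type: a 2-tuple whose components come from a and b."""
    def enum(n):
        i, j = split(n)
        return (a.enumerator(i), b.enumerator(j))
    return DataType(
        lambda v: isinstance(v, tuple) and len(v) == 2
        and a.predicate(v[0]) and b.predicate(v[1]),
        enum,
    )
```

The useful invariant is that every value produced by a type's enumerator satisfies that type's predicate, so generated test data always meets the datatype hypotheses.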
demo
Naive Testing
(top-level-test? (equal (rev (rev x)) x))
Demo
(top-level-test?
 (implies (true-listp x)
          (equal (rev (rev x)) x)))
But often the restrictions on the data are more complex
than simple datatype hypotheses!!
Highly unlikely to satisfy complex data restrictions
Triangle Example
(defdata triple (list pos pos pos))

(defun trianglep (v)
  (and (triplep v)
       (< (third v) (+ (first v) (second v)))
       (< (first v) (+ (second v) (third v)))
       (< (second v) (+ (first v) (third v)))))

(top-level-test?
 (implies (and (triplep x)
               (trianglep x)
               (> (third x) 256)
               (= (third x)
                  (* (second x) (first x))))
          (not (equal "isosceles" (shape x)))))
Low Probability!!
Prob < 1/32768
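To get a feel for how rare satisfying inputs are, the hypotheses can be counted exhaustively over a small range in Python. This is an illustrative sketch, not the calculation behind the bound on the slide: the range 1..300 and the uniform choice of triples are assumptions, and `satisfying_triples` is a hypothetical helper.

```python
def trianglep(a, b, c):
    """Strict triangle inequalities, as in the trianglep definition."""
    return c < a + b and a < b + c and b < a + c


def satisfying_triples(limit):
    """All (a, b, c) with 1 <= a, b, c <= limit meeting every hypothesis."""
    hits = []
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            c = a * b  # hypothesis: (third x) = (second x) * (first x)
            if c <= limit and c > 256 and trianglep(a, b, c):
                hits.append((a, b, c))
    return hits


LIMIT = 300
hits = satisfying_triples(LIMIT)
# Fraction of all triples in the range that satisfy the hypotheses.
fraction = len(hits) / LIMIT ** 3
```

Every triple found this way has a 1 in its first or second position, because a product of two integers that are both at least 2 is never smaller than their sum.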
Better Testing With Theorem Proving
Use the theorem prover to increase the chances of finding
counterexamples/witnesses.
demo
Goal
(implies (and (triplep x)
              (trianglep x)
              (> (third x) 256)
              (= (third x)
                 (* (second x) (first x))))
         (not (equal "isosceles" (shape x))))
[Diagram labels: pool, simplification, dest elim]
Subgoal 3
Subgoal 2
Subgoal 1
Subgoal 3
(IMPLIES (AND (CONSP X)
              (INTEGERP (CAR X))
              (< 0 (CAR X))
              (CONSP (CDR X))
              (INTEGERP (CADR X))
              (< 0 (CADR X))
              (CONSP (CDDR X))
              (INTEGERP (CADDR X))
              (< 0 (CADDR X))
              (NOT (CDDDR X))
              ...trianglep ...
              (< 256 (CADDR X))
              ...
              (NOT (= (CAR X) (CADR X))))
         (NOT (= (CAR X) (CADDR X))))
Subgoal 3''
(IMPLIES (AND (CONSP X)
              (INTEGERP (CAR X))
              (< 0 (CAR X))
              (CONSP (CDR X))
              (CONSP (CDDR X))
              (NOT (CDDDR X))
              (< 1 (* 2 (CAR X)))
              (< 256 (CAR X))
              (EQUAL 1 (CADR X))
              (NOT (= (CAR X) 1)))
         (NOT (= (CAR X) (CADDR X))))
Subgoal 3'''
(IMPLIES (AND (INTEGERP X5)
              (< 0 X5)
              (CONSP ...)
              (INTEGERP X3)
              (< 0 X3)
              (CONSP ...)
              (CONSP ...)
              (INTEGERP X1)
              (< 0 X1)
              (NOT X6)
              (< 1 (* 2 X1))
              (< 256 X1)
              (EQUAL 1 X3)
              (NOT (EQUAL X1 1)))
         (NOT (EQUAL X1 X5)))
Subgoal 3'4'
(IMPLIES (AND (INTEGERP X1)
              (< 0 X1)
              (< 1 (* 2 X1))
              (< 256 X1))
         (EQUAL X1 1))
Subgoal which was refuted
(IMPLIES (AND (INTEGERP X1)
              (< 0 X1)
              (< 1 (* 2 X1))
              (< 256 X1))
         (EQUAL X1 1))
Probability of hitting a counterexample
Prob → 1
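Why testing the simplified subgoal succeeds so easily can be checked directly. The Python sketch below models the refuted subgoal as two plain functions and counts counterexamples among the integers 0..1000; the range and the names `hypotheses`, `conclusion`, and `is_counterexample` are assumptions made for illustration.

```python
def hypotheses(x1):
    """The four hypotheses of the refuted subgoal."""
    return isinstance(x1, int) and 0 < x1 and 1 < 2 * x1 and 256 < x1


def conclusion(x1):
    """The conclusion of the refuted subgoal."""
    return x1 == 1


def is_counterexample(x1):
    """Hypotheses hold but the conclusion fails."""
    return hypotheses(x1) and not conclusion(x1)


candidates = range(0, 1001)
counterexamples = [x for x in candidates if is_counterexample(x)]
# Share of the sampled range that refutes the subgoal.
share = len(counterexamples) / len(candidates)
```

Every integer above 256 is a counterexample, so the share grows toward 1 as the sampled range widens.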
Big Win
The primary simplification is due to libraries of lemmas (arithmetic-5, etc.)
Better Theorem Proving with Testing
(defthm m-=-...
  ...
  :hints (("Goal" :do-not '(generalize) ...)))
Use Testing to discard bad generalizations
In principle, testing can be used to direct the theorem prover or help
search for a proof strategy.
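The idea of discarding a bad generalization by testing can be illustrated with a toy example in Python. This is a sketch of the principle only, not ACL2's generalization heuristics: the goal, both candidate generalizations, and the helpers `find_counterexample` and `keep_generalization` are hypothetical.

```python
from itertools import product


def find_counterexample(formula, arity, values=range(-3, 4)):
    """Return the first argument tuple falsifying formula, or None."""
    for args in product(values, repeat=arity):
        if not formula(*args):
            return args
    return None


def goal(x, y):
    """Original goal: x*x + y*y >= 0 over the integers."""
    return x * x + y * y >= 0


def bad_generalization(a, b):
    """Replaces x*x by a and y*y by b, forgetting that squares are nonnegative."""
    return a + b >= 0


def good_generalization(a, b):
    """Same replacement, but keeps the nonnegativity facts as hypotheses."""
    return not (a >= 0 and b >= 0) or a + b >= 0


def keep_generalization(formula, arity):
    """Keep a generalized goal only if testing finds no counterexample."""
    return find_counterexample(formula, arity) is None
```

Here `keep_generalization(bad_generalization, 2)` is false because testing finds a falsifying assignment, so the prover would be steered away from a goal it could never prove.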
ACL2 Enhancements
- Record reasons for eliding variables
- override-hints -- modify the tentatively selected hint
- :backtrack hint -- applied after a proof process has finished
[Diagram labels: User, formula pool, Simplification, Destructor Elimination, Equality, Generalization, Elimination of Irrelevance, Induction]
(defmacro enable-acl2s-random-testing ()
  `(make-event
    '(add-override-hints
      '((list* :backtrack
               `(test-checkpoint id clause
                                 processor
                                 ',pspv ',hist
                                 state)
               keyword-alist)))))
(demo)
Experience
- Great help to students.
- Experts can benefit as well!
An example from J Moore (square-root i)
I believe this returns the largest integer n such that n*n <= i, for natural
numbers i. But I haven't proved it.
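A believed-but-unproved property like this one is a natural target for testing before any proof attempt. The Python sketch below checks the stated specification against a candidate over a bounded range. The function from the slide is not shown, so `square_root` here is a stand-in built on `math.isqrt`, and `meets_spec` and `first_failure` are hypothetical helpers.

```python
import math


def square_root(i):
    """Stand-in candidate (an assumption): not the function from the slide."""
    return math.isqrt(i)


def meets_spec(i, n):
    """n is the largest integer such that n*n <= i."""
    return n >= 0 and n * n <= i < (n + 1) * (n + 1)


def first_failure(candidate, limit=2000):
    """Smallest natural number below limit where candidate breaks the spec."""
    for i in range(limit):
        if not meets_spec(i, candidate(i)):
            return i
    return None
```

A result of `None` from `first_failure` is evidence, not a proof; a returned number is a concrete counterexample to the belief.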
Pipeline Machine example
Future Work?
- Design rewrite rules to help testing
- Use a more powerful counterexample generation tool than random testing.
- Lemma generation