Integrating Testing and Theorem Proving

Harsh Raju Chamarthi, Peter C. Dillinger, Matt Kaufmann and Pete Manolios
Northeastern University and University of Texas at Austin
November 4, 2011

Overview

Motivation
- Teaching freshmen how to reason about programs
- Improved interactive theorem proving experience

Concrete Counterexamples!!

In the spirit of
Combining automated methods with interactive theorem proving technology results in a more powerful, yet automated method.

Takeaway
Combining automated testing methods with theorem proving technology results in more powerful, yet automated, theorem proving.

A First Try: Random Testing

The simplest approach -- Instantiate free variables and evaluate!

A Conjecture
(implies (and (pos-listp X) (integer-listp Y))
         (> (sum-list X) (sum-list Y)))

Instance
(implies (and (pos-listp "aab") (integer-listp 1))
         (> (sum-list "aab") (sum-list 1)))
Vacuous!

Instance
(implies (and (pos-listp 0) (integer-listp "abb"))
         (> (sum-list 0) (sum-list "abb")))
Vacuous!

Instance
(implies (and (pos-listp NIL) (integer-listp '|ko|))
         (> (sum-list NIL) (sum-list '|ko|)))
Vacuous!

Instance
(implies (and (pos-listp '|b|) (integer-listp T))
         (> (sum-list '|b|) (sum-list T)))
Vacuous!

Instance
(implies (and (pos-listp '(0)) (integer-listp 1/3))
         (> (sum-list '(0)) (sum-list 1/3)))
Vacuous!

Instance
(implies (and (pos-listp 95) (integer-listp '(-1 0)))
         (> (sum-list 95) (sum-list '(-1 0))))
Vacuous!

Instance
(implies (and (pos-listp '(2 1)) (integer-listp '(-3 1)))
         (> (sum-list '(2 1)) (sum-list '(-3 1))))
Witness!

Instance
(implies (and (pos-listp '(1 3)) (integer-listp '(9)))
         (> (sum-list '(1 3)) (sum-list '(9))))
Counterexample!

Data Definition Framework

Characterizing Types
- Predicate
- Enumerator (enables automatic test data generation)

foo is a defdata type iff
1. predicate foop is defined and
2. either enumerator nth-foo or *foo-values* is defined

Type Combinations
1. Union type -- (defdata pf (oneof pos foo))
2. Product type -- (defdata bar (cons (/ 1 pos) nat-list))

demo

Naive Testing

(top-level-test? (equal (rev (rev x)) x))

Demo

(top-level-test? (implies (true-listp x)
                          (equal (rev (rev x)) x)))

But often type restrictions are more complex than datatype hypotheses!!
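
The predicate/enumerator pairing can be modelled outside ACL2. The following is a minimal Python sketch, not the defdata implementation: the names `nth_pos`, `nth_integer` and `nth_list`, and the base-8 list encoding, are assumptions made for illustration. It shows that when test data comes from enumerators matching the hypotheses, no instance of the sum-list conjecture is vacuous.

```python
import random

def pos_listp(x):
    """Predicate: a list whose elements are all positive integers."""
    return isinstance(x, list) and all(
        isinstance(e, int) and not isinstance(e, bool) and e > 0 for e in x)

def integer_listp(x):
    """Predicate: a list whose elements are all integers."""
    return isinstance(x, list) and all(
        isinstance(e, int) and not isinstance(e, bool) for e in x)

def nth_pos(n):
    """Hypothetical enumerator: natural number n -> positive integer."""
    return n + 1

def nth_integer(n):
    """Hypothetical enumerator: 0, -1, 1, -2, 2, ..."""
    return (n + 1) // 2 * (-1 if n % 2 else 1)

def nth_list(n, nth_elem, base=8):
    """Hypothetical list enumerator: 0 is the empty list, otherwise
    peel one element code (0..base-1) off n per iteration."""
    out = []
    while n > 0:
        n, code = divmod(n - 1, base)
        out.append(nth_elem(code))
    return out

def classify(x, y):
    """Classify one instance of the conjecture from the slides."""
    if not (pos_listp(x) and integer_listp(y)):
        return "vacuous"
    return "witness" if sum(x) > sum(y) else "counterexample"

def run_typed_tests(trials, rng):
    """Draw X and Y from the enumerators and tally the outcomes."""
    counts = {"vacuous": 0, "witness": 0, "counterexample": 0}
    for _ in range(trials):
        x = nth_list(rng.randrange(1000), nth_pos)
        y = nth_list(rng.randrange(1000), nth_integer)
        counts[classify(x, y)] += 1
    return counts

if __name__ == "__main__":
    print(classify("aab", 1))            # untyped instance from the slides
    print(run_typed_tests(200, random.Random(0)))
```

Because every generated value satisfies its predicate by construction, the "vacuous" tally stays at zero; how often a counterexample turns up still depends on the sampling range chosen here.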

Highly unlikely to satisfy complex data restrictions

Triangle Example

(defdata triple (list pos pos pos))

(defun trianglep (v)
  (and (triplep v)
       (< (third v) (+ (first v) (second v)))
       (< (first v) (+ (second v) (third v)))
       (< (second v) (+ (first v) (third v)))))

(top-level-test?
 (implies (and (triplep x)
               (trianglep x)
               (> (third x) 256)
               (= (third x) (* (second x) (first x))))
          (not (equal "isosceles" (shape x)))))

Low Probability!! Prob < 1/32768

Better Testing With Theorem Proving

Use the theorem prover to increase the chances of finding counterexamples/witnesses.

demo

Goal
(implies (and (triplep x)
              (trianglep x)
              (> (third x) 256)
              (= (third x) (* (second x) (first x))))
         (not (= "isosceles" (shape x))))

[Figure: proof-process diagram with stages pool, simplification and dest elim; the Goal is replaced by Subgoal 3, Subgoal 2 and Subgoal 1]

Subgoal 3
(IMPLIES (AND (CONSP X)
              (INTEGERP (CAR X)) (< 0 (CAR X))
              (CONSP (CDR X))
              (INTEGERP (CADR X)) (< 0 (CADR X))
              (CONSP (CDDR X))
              (INTEGERP (CADDR X)) (< 0 (CADDR X))
              (NOT (CDDDR X))
              ...trianglep ...
              (< 256 (CADDR X))
              ...
              (NOT (= (CAR X) (CADR X))))
         (NOT (= (CAR X) (CADDR X))))
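
To see why naive testing struggles on the triangle conjecture, its hypotheses can be sampled directly. This is a rough Python sketch, not the ACL2s test driver: the uniform sampling range is an assumption (the slides do not state the distribution behind the 1/32768 bound), and `shape` is left out because only the hypotheses matter for vacuity.

```python
import random

def trianglep(v):
    """Triangle inequality on a triple of positive integers."""
    a, b, c = v
    return c < a + b and a < b + c and b < a + c

def hypotheses(v):
    """All hypotheses of the triangle conjecture (triplep holds by construction)."""
    a, b, c = v
    return trianglep(v) and c > 256 and c == a * b

def hit_rate(trials, bound, rng):
    """Fraction of uniformly sampled triples in 1..bound that satisfy the hypotheses."""
    hits = 0
    for _ in range(trials):
        v = tuple(rng.randint(1, bound) for _ in range(3))
        hits += hypotheses(v)
    return hits / trials

def solutions(bound):
    """Every (a, b, a*b) with a, b in 1..bound that satisfies the hypotheses."""
    return [(a, b, a * b)
            for a in range(1, bound + 1)
            for b in range(1, bound + 1)
            if hypotheses((a, b, a * b))]

if __name__ == "__main__":
    print(hit_rate(100_000, 1000, random.Random(0)))
    print(solutions(300)[:3])
```

The brute-force search explains the low rate: a product `a * b` can only stay below `a + b` when one of the two sides equals 1, so the satisfying triples form a very thin slice of the space that blind sampling rarely lands in.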

Subgoal 3''
(IMPLIES (AND (CONSP X)
              (INTEGERP (CAR X)) (< 0 (CAR X))
              (CONSP (CDR X))
              (CONSP (CDDR X))
              (NOT (CDDDR X))
              (< 1 (* 2 (CAR X)))
              (< 256 (CAR X))
              (EQUAL 1 (CADR X))
              (NOT (= (CAR X) 1)))
         (NOT (= (CAR X) (CADDR X))))

Subgoal 3'''
(IMPLIES (AND (INTEGERP X5) (< 0 X5)
              (CONSP ...)
              (INTEGERP X3) (< 0 X3)
              (CONSP ...)
              (CONSP ...)
              (INTEGERP X1) (< 0 X1)
              (NOT X6)
              (< 1 (* 2 X1))
              (< 256 X1)
              (EQUAL 1 X3)
              (NOT (EQUAL X1 1)))
         (NOT (EQUAL X1 X5)))

Subgoal 3'4'
(IMPLIES (AND (INTEGERP X1)
              (< 0 X1)
              (< 1 (* 2 X1))
              (< 256 X1))
         (EQUAL X1 1))

Better Testing with Theorem Proving

Subgoal which was refuted: Subgoal 3'4' (the formula above)

Probability of hitting a counterexample: Prob → 1

Big Win: primary simplification due to libraries of lemmas (arithmetic-5, etc.)

Better Theorem Proving with Testing

(defthm m-=-... ...
  :hints (("Goal" :do-not '(generalize) ...)))

Use Testing to discard bad generalizations

In principle, testing can be used to direct the theorem prover or help search for a proof strategy.

ACL2 Enhancements
- Record reasons for eliding variables
- override-hints -- modify the tentatively selected hint
- :backtrack hint -- applied after a proof process has finished

[Figure: proof-process diagram with nodes User, formula pool, Simplification, Destructor Elimination, Equality, Generalization, Elimination of Irrelevance and Induction]

(defmacro enable-acl2s-random-testing ()
  `(make-event
    '(add-override-hints
      '((list* :backtrack
               `(test-checkpoint id clause processor ',pspv ',hist state)
               keyword-alist)))))

(demo)

Experience
- Great help to students.
- Experts can benefit as well!

An example from J Moore
(square-root i)
I believe this returns the largest integer n such that n*n <= i, for natural numbers i. But I haven't proved it.

Pipeline Machine example

Future Work?
- Design rewrite rules to help testing
- Use a more powerful counterexample generation tool than random testing.
- Lemma generation
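
The square-root claim from the Experience slide ("the largest integer n such that n*n <= i") is the kind of unproved belief that testing can check cheaply. Here is a small Python sketch of that idea. The slides do not show the definition of `square-root`, so `square_root` below is a hypothetical stand-in (Newton's iteration on integers), and `find_counterexample` is an illustrative helper rather than part of any ACL2 tool.

```python
import random

def square_root(i):
    """Hypothetical stand-in for (square-root i): integer Newton iteration."""
    if i < 2:
        return i
    x = i
    y = (x + 1) // 2
    while y < x:
        x = y
        y = (x + i // x) // 2
    return x

def spec_holds(i, n):
    """n is the largest integer with n*n <= i, for a natural number i."""
    return n >= 0 and n * n <= i < (n + 1) * (n + 1)

def find_counterexample(fn, trials, bound, rng):
    """Return some i in 0..bound-1 where fn violates the spec, or None."""
    for _ in range(trials):
        i = rng.randrange(bound)
        if not spec_holds(i, fn(i)):
            return i
    return None

def rounded_root(i):
    """Deliberately wrong variant: rounds to nearest instead of down."""
    return round(i ** 0.5)

if __name__ == "__main__":
    rng = random.Random(0)
    print(find_counterexample(square_root, 500, 10**6, rng))
    print(find_counterexample(rounded_root, 500, 10**6, rng))
```

A run that returns `None` is evidence for the belief, not a proof of it, while a returned value is a concrete input on which the claim fails.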