The HR Program for Automated Theorem Generation

Simon Colton
Mathematical Reasoning Group
University of Edinburgh
Overview

Start with the axioms of a domain

Produce 100s of theorems about the domain

How do we do this?

Why do we do this?
The HR Program

Machine learning Java program
– With special application to mathematics
– Performs automated theory formation

Uses five processes to generate theorems
– Initialisation from axioms (bootstrapping using MACE)
– Production rule based concept formation
– Empirical conjecture making (with a little reasoning)
– Automated theorem settling (ATP/ModGen)
– Theorem post-processing
Concept Formation
10 general production rules
Example: Abelian groups

a*b=c
compose
a*b=c & b*a=c
exists
∃ c (a*b=c & b*a=c)
forall
∀ a b ∃ c (a*b=c & b*a=c)
Empirical Conjecture Making

Non-existence conjectures
– Invents a concept with no examples

Equivalence conjectures
– Two concepts have exactly the same examples

Implication conjectures
– A concept has all the examples of another
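The three conjecture types can be illustrated by comparing example sets directly. This is a sketch under stated assumptions rather than HR's code: concepts are modelled as named sets of examples, and make_conjectures and the integer concepts below are hypothetical.

```python
from itertools import combinations


def make_conjectures(concepts):
    """Compare example sets and return a list of conjecture tuples.

    concepts maps a concept name to the set of its examples.
    An implication (A, B) is read as: every example of A is an example of B.
    """
    conjectures = []
    for name, examples in concepts.items():
        if not examples:
            conjectures.append(("non-existence", name))

    non_empty = {n: e for n, e in concepts.items() if e}
    for (n1, e1), (n2, e2) in combinations(non_empty.items(), 2):
        if e1 == e2:
            conjectures.append(("equivalence", n1, n2))
        elif e1 < e2:  # proper subset
            conjectures.append(("implication", n1, n2))
        elif e2 < e1:
            conjectures.append(("implication", n2, n1))
    return conjectures


# Toy data: concepts over the integers 1..20 (illustrative only).
NUMBERS = range(1, 21)
CONCEPTS = {
    "even": {n for n in NUMBERS if n % 2 == 0},
    "last_digit_even": {n for n in NUMBERS if n % 10 in (0, 2, 4, 6, 8)},
    "multiple_of_4": {n for n in NUMBERS if n % 4 == 0},
    "even_and_odd": {n for n in NUMBERS if n % 2 == 0 and n % 2 == 1},
}

for conjecture in make_conjectures(CONCEPTS):
    print(conjecture)
```

Conjectures found this way are only supported by the available examples, so they still need to be proved or refuted.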
A Little Reasoning

HR discards many conjectures:
¬(∃ A (p(A) & ¬p(A))) [bad negation]
f(A) = x & f(A) = y & x ≠ y [bad instantiation]
∀ a b (p(a,b) & q(a) → ∃ x (p(a,x) & q(x))) [unification]

HR also has:
– Built-in forward-chaining prover
Settling Conjectures

HR first uses Otter
– To try and prove each theorem

If Otter fails
– HR uses MACE to try to find a counterexample

Other provers via MathWeb
– Bliksem, E, Spass, …
– See Jürgen Zimmer’s PaPS talk on Weds
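The prove-first, refute-second order can be sketched as a small control loop. The code below does not call Otter, MACE or MathWeb; the prove and find_counterexample callables are hypothetical stand-ins, and the toy versions work on integer predicates purely for illustration.

```python
def settle(conjecture, prove, find_counterexample):
    """Try a prover first, then fall back to a counterexample search.

    Returns ("theorem", None), ("non-theorem", counterexample) or ("open", None).
    """
    if prove(conjecture):
        return ("theorem", None)
    counterexample = find_counterexample(conjecture)
    if counterexample is not None:
        return ("non-theorem", counterexample)
    return ("open", None)


# Toy stand-ins (assumptions): a conjecture is a (name, predicate) pair
# over the integers, claimed to hold for every integer n.
KNOWN_THEOREMS = {"n*n >= 0"}


def toy_prove(conjecture):
    """Pretend prover: succeeds only on statements it already knows."""
    name, _ = conjecture
    return name in KNOWN_THEOREMS


def toy_find_counterexample(conjecture):
    """Pretend model finder: bounded search for an integer breaking the claim."""
    _, predicate = conjecture
    for n in range(-50, 51):
        if not predicate(n):
            return n
    return None


print(settle(("n*n >= 0", lambda n: n * n >= 0), toy_prove, toy_find_counterexample))
print(settle(("n*n > n", lambda n: n * n > n), toy_prove, toy_find_counterexample))
print(settle(("n + 1 > n", lambda n: n + 1 > n), toy_prove, toy_find_counterexample))
```

The third call shows the case where neither tool succeeds: the conjecture stays open rather than being counted as a theorem.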
Post-Processing Conjectures

Example: (p(a) & q(a) → r(a) & s(a))

Extracts implicates:
– p(a) & q(a) → r(a), p(a) & q(a) → s(a)

Attempts to find prime implicates
– Tries: p(a) → r(a), then q(a) → r(a)
– Using Otter each time
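A sketch of the two post-processing steps follows. It assumes a conjecture is given as a list of premise names and a list of conclusion names, and that a proves(premises, conclusion) callable is available. In HR that role is played by Otter; here toy_proves is a hypothetical stand-in that merely checks integer predicates over a finite range.

```python
from itertools import combinations


def extract_implicates(premises, conclusions):
    """Split 'premises -> c1 & c2 & ...' into one implicate per conclusion."""
    return [(tuple(premises), c) for c in conclusions]


def prime_implicates(premises, conclusion, proves):
    """Find minimal premise subsets that still yield the conclusion.

    Subsets are tried smallest first; a subset is skipped when a smaller
    proved subset is already contained in it. The empty premise set is
    not tried.
    """
    primes = []
    for size in range(1, len(premises) + 1):
        for subset in combinations(premises, size):
            if any(set(found) <= set(subset) for found, _ in primes):
                continue
            if proves(subset, conclusion):
                primes.append((subset, conclusion))
    return primes


# Toy stand-in for a prover (assumption): predicates over 1..60.
PREDICATES = {
    "p": lambda n: n % 4 == 0,
    "q": lambda n: n % 3 == 0,
    "r": lambda n: n % 2 == 0,
    "s": lambda n: n % 6 == 0,
}


def toy_proves(premises, conclusion):
    """Accept an implication if no integer in 1..60 breaks it."""
    return all(
        PREDICATES[conclusion](n)
        for n in range(1, 61)
        if all(PREDICATES[name](n) for name in premises)
    )


for body, head in extract_implicates(["p", "q"], ["r", "s"]):
    print(prime_implicates(list(body), head, toy_proves))
```

With these toy predicates, the conclusion r needs only p, while s needs both p and q, so the second implicate is already prime.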
Example session

Ring theory axioms RNG-004
– 1000 steps in 6481 seconds
– 275 prime implicates extracted
– 39 with proof length > 10
– 30 examples of rings added as counterexamples
– By size: 2 of size 2, 2 of size 3, 25 of size 4, 1 of size 7

See paper for further details
Applications

Pre-processing AI problems
– CSP()

ATP(?)
ML(??)
Mathematical discovery
– Number theory, algebraic domains

Mathematics tutoring
– See talk at RADM workshop

Testing ATP programs
– HR first non-human to add to TPTP library
– Roughly 15 in this year’s CASC competition
Example TPTP conjecture
Otter and E fail (120 seconds), Spass succeeds:

∀ x y
((∃ z (inv(z)=x & z*y=x) &
∃ u (x*u=y & ∃ v (v*x=u & inv(v)=x)))
↔
(∃ a (inv(a)=x & a*y=x) &
∃ b (b*y=x & inv(b)=y)))
[about pairs of identity elements]
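A brute-force sanity check of this conjecture on small groups is sketched below. It reads the statement as an equivalence between the two bracketed conditions, which is an assumption about the original slide, and it only tests the cyclic groups Z_n under addition mod n. The function name satisfying_pairs is hypothetical, and a finite check of this kind is not a proof.

```python
def satisfying_pairs(n):
    """Return (lhs_pairs, rhs_pairs) for the cyclic group Z_n.

    lhs_pairs holds the pairs (x, y) meeting the first bracketed condition,
    rhs_pairs those meeting the second.
    """
    elems = range(n)

    def op(a, b):
        return (a + b) % n

    def inv(a):
        return (-a) % n

    lhs, rhs = set(), set()
    for x in elems:
        for y in elems:
            left = (
                any(inv(z) == x and op(z, y) == x for z in elems)
                and any(
                    op(x, u) == y
                    and any(op(v, x) == u and inv(v) == x for v in elems)
                    for u in elems
                )
            )
            right = (
                any(inv(a) == x and op(a, y) == x for a in elems)
                and any(op(b, y) == x and inv(b) == y for b in elems)
            )
            if left:
                lhs.add((x, y))
            if right:
                rhs.add((x, y))
    return lhs, rhs


for n in range(1, 8):
    lhs, rhs = satisfying_pairs(n)
    print(n, lhs == rhs, sorted(lhs))
```

On these groups both conditions single out only the pair in which x and y are the identity, which matches the slide's remark about pairs of identity elements.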
Conclusions & Future Work

Automated theory formation
– Produces 100s of conjectures
– Initialisation, concept formation, empirical conjecture making, ATP & MG, post-processing

Many applications
– Pre-processing, TPTP, discovery, tutoring

Applying this to bioinformatics
– Deduction and induction combined
http://www.dai.ed.ac.uk/~simonco/research/hr
Please ask me for a demo!