
Probabilistic Inference Modulo Theories
Rodrigo de Salvo Braz
Artificial Intelligence Center - SRI International
joint work with
Ciaran O’Reilly
Artificial Intelligence Center - SRI International
Vibhav Gogate
University of Texas at Dallas
Rina Dechter
University of California, Irvine
Probabilistic Programming for Advanced Machine Learning
Hybrid Reasoning Workshop – IJCAI 2015 - July 26, 2015
Outline
• Two extremes: logic and probabilistic inference
• Probabilistic inference in Graphical Models
– Factorized joint probability distribution
– Marginalization
– Variable elimination
– Mainstream probabilistic inference representation
• Logic Inference
– Theories
– Inference: satisfiability
– DPLL
– DPLL(T)
• Symbolic Generalized DPLL Modulo Theories (SGDPLL(T))
• Experiments
• Conclusion
Two extremes: logic and probabilistic
inference
• Logic
– Initial mainstream approach to AI: represent knowledge declaratively and
have a system use it
– Can use rich theories: data structures, numeric constraints, functions
– Basic logic does not offer treatment for uncertain knowledge
• Probabilistic Inference
– Follows the idea of representing knowledge declaratively and having a
system use it
– Very good treatment of uncertainty
– Poor representation, equivalent to discrete variables with equality,
lacking interpreted functions (data structures, arithmetic) and even
uninterpreted non-nullary functions (friends(X,Y))
– The way to use such constructs is to ground them into tables or formulas
Probabilistic Inference in Graphical Models
[Figure: Bayesian network — epidemic is the parent of sick_john, sick_mary, and sick_bob, with CPTs P(epidemic) and P(sick_X | epidemic)]
P(sick_john, sick_mary, sick_bob, epidemic)
= P(sick_john | epidemic) * P(sick_mary | epidemic)
  * P(sick_bob | epidemic) * P(epidemic)    (chain rule)
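As a quick sanity check, the chain-rule factorization can be verified to define a normalized joint distribution; the CPT values below are made up for illustration (not from the talk):

```python
from itertools import product

# Hypothetical CPT values, for illustration only.
p_e = {True: 0.1, False: 0.9}                               # P(epidemic)
p_john = {True: {True: 0.8, False: 0.2},                    # P(sick_john | epidemic)
          False: {True: 0.05, False: 0.95}}
p_mary = {True: {True: 0.7, False: 0.3},                    # P(sick_mary | epidemic)
          False: {True: 0.1, False: 0.9}}
p_bob = {True: {True: 0.6, False: 0.4},                     # P(sick_bob | epidemic)
         False: {True: 0.02, False: 0.98}}

def joint(j, m, b, e):
    # P(sick_john, sick_mary, sick_bob, epidemic), by the chain rule
    return p_john[e][j] * p_mary[e][m] * p_bob[e][b] * p_e[e]

total = sum(joint(*v) for v in product([True, False], repeat=4))
print(total)  # approximately 1.0: the factorization is a proper joint distribution
```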
Graphical models as Factor Networks
[Figure: the same network drawn as a factor network — factor nodes P(epidemic), P(sick_john | epidemic), P(sick_mary | epidemic), P(sick_bob | epidemic) connected to their variables]
P(sick_john, sick_mary, sick_bob, epidemic)
= P(sick_john | epidemic) * P(sick_mary | epidemic)
  * P(sick_bob | epidemic) * P(epidemic)
Marginalization
[Figure: the same factor network]
P(sick_john) = Σ_epidemic Σ_sick_mary Σ_sick_bob
  P(sick_john | epidemic) * P(sick_mary | epidemic)
  * P(sick_bob | epidemic) * P(epidemic)
Inference: Marginalization via Variable Elimination
[Figure: the same factor network]
P(sick_john) = Σ_epidemic P(sick_john | epidemic) * P(epidemic)
  * Σ_sick_mary P(sick_mary | epidemic)
  * Σ_sick_bob P(sick_bob | epidemic)
Inference: Marginalization via Variable Elimination
[Figure: sick_bob has been eliminated, leaving a new factor φ₁(epidemic)]
P(sick_john) = Σ_epidemic P(sick_john | epidemic) * P(epidemic)
  * Σ_sick_mary P(sick_mary | epidemic)
  * φ₁(epidemic)
Inference: Marginalization via Variable Elimination
[Figure: the same network, with φ₁(epidemic) in place of sick_bob]
P(sick_john) = Σ_epidemic P(sick_john | epidemic) * P(epidemic)
  * φ₁(epidemic)
  * Σ_sick_mary P(sick_mary | epidemic)
Inference: Marginalization via Variable Elimination
[Figure: sick_mary has also been eliminated, leaving φ₂(epidemic)]
P(sick_john) = Σ_epidemic P(sick_john | epidemic) * P(epidemic)
  * φ₁(epidemic)
  * φ₂(epidemic)
Inference: Marginalization via Variable Elimination
[Figure: only sick_john remains, with the single factor φ₃(sick_john)]
P(sick_john) = φ₃(sick_john)
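The elimination walkthrough above can be sketched as a small table-based variable-elimination routine; the CPT values are made up for illustration:

```python
from itertools import product

# Factors as pairs: (tuple of variable names, table mapping assignments to values).
# CPT numbers below are made up for illustration.
factors = [
    (("epidemic",), {(True,): 0.1, (False,): 0.9}),
    (("sick_john", "epidemic"), {(True, True): 0.8, (False, True): 0.2,
                                 (True, False): 0.05, (False, False): 0.95}),
    (("sick_mary", "epidemic"), {(True, True): 0.7, (False, True): 0.3,
                                 (True, False): 0.1, (False, False): 0.9}),
    (("sick_bob", "epidemic"), {(True, True): 0.6, (False, True): 0.4,
                                (True, False): 0.02, (False, False): 0.98}),
]

def multiply(f1, f2):
    """Pointwise product of two factors, over the union of their variables."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
    table = {}
    for assignment in product([True, False], repeat=len(out_vars)):
        env = dict(zip(out_vars, assignment))
        table[assignment] = (t1[tuple(env[v] for v in vars1)]
                             * t2[tuple(env[v] for v in vars2)])
    return out_vars, table

def sum_out(factor, var):
    """Sum a variable out of a factor, yielding a smaller factor (a new phi)."""
    vars_, t = factor
    out_vars = tuple(v for v in vars_ if v != var)
    table = {}
    for assignment, value in t.items():
        key = tuple(a for v, a in zip(vars_, assignment) if v != var)
        table[key] = table.get(key, 0.0) + value
    return out_vars, table

def eliminate(factors, order):
    """Eliminate variables in the given order; return the remaining factor."""
    for var in order:
        touching = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

vars_, table = eliminate(factors, ["sick_bob", "sick_mary", "epidemic"])
print(vars_, table)  # the marginal over sick_john
```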
A Graphical Model with a Theory
• Consider an undirected graphical model defined on strings:
if announcement = title + speaker + venue + abstract then 0.7 else
if announcement = title + venue + speaker + abstract then 0.3 else 0
if contains(title, "complexity") then 0.1 else 0.9
if contains(title, "efficient") then 0.3 else 0.7
if length(title) > 5 then 0.9 else 0.1
if length(title) < 15 then 0.9 else 0.1
if speaker = "Prof." + name then 0.1 else
if speaker = name then 0.9 else 0
... // more statements, for example defining knowledge about names
and query
P(speaker | announcement = “Efficient PP Solving
                            Prof. Paul Smith
                            We did this and that”) = ?
• Can we do probabilistic inference using theories?
Basic Step: summing a variable out
• A basic operation is summing a variable out
  Σ_epidemic P(sick_john | epidemic) * P(epidemic) * φ₁(epidemic) * φ₂(epidemic)
  = φ₃(sick_john)
• Typically, factors are represented as tables
• This is fine for variables with a few values such as booleans, but what
  about, for example,
  Σ_{x ∈ {1,…,1000}} (if x > 20 then 0.3 else 0.7) * (if x < z then 0.4 else 0.6)
  or
  Σ_{s ∈ Strings} (if length(s) < 40 then 0.8
                   else if length(s) < length(s2) then 0.2 else 0)
                * (if s starts with “Announcement” then 0.7 else 0.3)
• It is wasteful to go over all values of x, and impossible for s
• Results are symbolic in z and s2
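For the bounded-integer sum above, case-splitting on the literals x > 20 and x < z yields a closed form without enumerating the 1000 values of x. A minimal sketch, assuming 21 ≤ z ≤ 1000 so the three cases below partition the domain:

```python
def brute_force(z):
    """What a table-based solver must do: enumerate all 1000 values of x."""
    return sum((0.3 if x > 20 else 0.7) * (0.4 if x < z else 0.6)
               for x in range(1, 1001))

def closed_form(z):
    """Closed form from case analysis on the literals (assumes 21 <= z <= 1000):
      x in [1, 20]:    0.7 * 0.4  (x < z always holds here)
      x in [21, z-1]:  0.3 * 0.4
      x in [z, 1000]:  0.3 * 0.6
    """
    return 20 * 0.28 + (z - 21) * 0.12 + (1001 - z) * 0.18

for z in (21, 500, 1000):
    assert abs(brute_force(z) - closed_form(z)) < 1e-9
```

Note the closed form stays a symbolic function of the free variable z, exactly as the slide's result is symbolic in z.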
A parallel track: Satisfiability
• The Davis-Putnam-Logemann-Loveland (DPLL)
algorithm solves the problem of satisfiability:
∃𝑝 ∃𝑞 ∃𝑟 (𝑝 ∨ 𝑞) ∧ (𝑞 ∨ ¬𝑟) ∧ (¬𝑝 ∨ 𝑟)
• This is similar to what we need, but for
– Existential quantification instead of summation
– Propositional variables (no theories)
– Total quantification (no free variables)
Solving Satisfiability with DPLL
∃x ∃y ∃z (x ∨ y) ∧ (¬x ∨ ¬y ∨ z)
Split on x:
– x = false: ∃y ∃z y
  – y = false: false
  – y = true: ∃z true = true
– x = true: ∃y ∃z (¬y ∨ z)
  – y = false: ∃z true = true
  – y = true: ∃z z
    – z = false: false
    – z = true: true
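The splitting procedure above can be sketched as a bare-bones recursive DPLL (no unit propagation or watched literals, for brevity):

```python
def dpll(clauses):
    """Decide satisfiability of a CNF formula.
    Clauses are frozensets of literals; a literal is ('p', True) or ('p', False).
    """
    if not clauses:
        return True            # no clauses left: satisfied
    if frozenset() in clauses:
        return False           # empty clause: contradiction
    # Pick any variable and try both truth values.
    var = next(iter(next(iter(clauses))))[0]
    for value in (True, False):
        lit = (var, value)
        simplified = set()
        for clause in clauses:
            if lit in clause:
                continue       # clause satisfied by this assignment
            simplified.add(clause - {(var, not value)})
        if dpll(simplified):
            return True
    return False

# (x ∨ y) ∧ (¬x ∨ ¬y ∨ z), the formula from the slide
cnf = {frozenset({("x", True), ("y", True)}),
       frozenset({("x", False), ("y", False), ("z", True)})}
print(dpll(cnf))  # True: e.g. x = true, y = false satisfies it
```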
A parallel track:
Satisfiability Modulo Theories (SMT)
• Satisfiability modulo theories generalizes
satisfiability to non-propositional logic
(includes arithmetic, inequalities, lists,
uninterpreted functions, and others)
∃x ∃y ∃l (x ≤ y ∨ l = [x, 3]) ∧ first(l) = y
• This is closer to what we need (since it works
on theories), but for
– Existential quantification instead of summation
– Total quantification (no free variables)
Solving Satisfiability Modulo Theories with
DPLL(T)
∃x ∃y ∃l (x ≤ y ∨ l = [x, 3]) ∧ first(l) = y
Split on x ≤ y:
– x ≤ y: first(l) = y remains
  – first(l) = y: true
  – first(l) ≠ y: false
– x > y: l = [x, 3] ∧ first(l) = y remains
  – l ≠ [x, 3]: false
  – l = [x, 3]: first(l) = y remains
    – first(l) = y: true
    – first(l) ≠ y: false
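The DPLL(T) loop can be sketched as DPLL splitting on theory atoms while a theory solver prunes theory-inconsistent branches. In this toy sketch the "theory solver" is a brute-force search over a small integer domain and the only atoms are inequalities x ≤ y — a stand-in for a real decision procedure, not the talk's list theory:

```python
from itertools import product

# Toy theory: an atom ('x', 'y') asserts x <= y over integer variables.
# Consistency of a set of signed atoms is checked by brute-force search
# over a small domain (a stand-in for a real theory decision procedure).
def theory_consistent(signed_atoms, domain=range(0, 5)):
    variables = sorted({v for (a, b), _ in signed_atoms for v in (a, b)})
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all((env[a] <= env[b]) == sign for (a, b), sign in signed_atoms):
            return True
    return False

def dpll_t(clauses, assignment=None):
    """DPLL(T) sketch: split on theory atoms as DPLL splits on propositions,
    pruning any branch whose partial assignment is theory-inconsistent.
    Clauses are lists of (atom, sign) pairs."""
    assignment = assignment or {}
    if not theory_consistent(list(assignment.items())):
        return None                        # theory conflict: prune this branch
    remaining = []
    for clause in clauses:
        if any(assignment.get(atom) == sign for atom, sign in clause):
            continue                       # clause already satisfied
        reduced = [(atom, sign) for atom, sign in clause if atom not in assignment]
        if not reduced:
            return None                    # clause falsified
        remaining.append(reduced)
    if not remaining:
        return assignment                  # all clauses satisfied, theory consistent
    atom, _ = remaining[0][0]
    for sign in (True, False):
        result = dpll_t(remaining, {**assignment, atom: sign})
        if result is not None:
            return result
    return None

# x <= y AND y <= z AND NOT (x <= z): propositionally fine, theory-inconsistent
unsat = [[(("x", "y"), True)], [(("y", "z"), True)], [(("x", "z"), False)]]
# (x <= y OR z <= y) AND NOT (x <= z): satisfiable, e.g. z = 0, x = y = 1
sat = [[(("x", "y"), True), (("z", "y"), True)], [(("x", "z"), False)]]
print(dpll_t(unsat))  # None
print(dpll_t(sat))    # a theory-consistent assignment of atoms
```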
Our solution:
Probabilistic Inference Modulo Theories
• Similar to SMT, but based on
– Summation (or other quantifiers) instead of ∃
– Partial quantification (free variables)
Σ_{x ∈ {1,…,100}} Σ_{z ∈ {1,…,100}} (if x > y ∧ y ≠ 5 then 0.1 else 0.9)
                                    × (if z < y ∨ y < 3 then 0.4 else 0.6)
• Note that y is a free variable
• The summed expression is not Boolean
• The language is not propositional (≠, <, …)
Probabilistic Inference Modulo Theories (SGDPLL(T))
Σ_{x ∈ {1,…,100}} Σ_{z ∈ {1,…,100}} [ (if x > y ∧ y ≠ 5 then 0.1 else 0.9)
                                      × (if z < y ∨ y < 3 then 0.4 else 0.6) ]
Split on the literal x > y:
= Σ_{x: x > y} Σ_z (if y ≠ 5 then 0.1 else 0.9) × (if z < y ∨ y < 3 then 0.4 else 0.6)
  + Σ_{x: x ≤ y} Σ_z 0.9 × (if z < y ∨ y < 3 then 0.4 else 0.6)
In the first summand, the literal y ≠ 5 involves only the free variable y, so it moves out as a conditional:
  if y ≠ 5 then Σ_{x: x > y} Σ_z 0.1 × (if z < y ∨ y < 3 then 0.4 else 0.6) else …
Condition on literals until a base case with no literals in the main expression, for example:
  Σ_{x: x > y} Σ_{z: z < y} 0.04
  = Σ_{x: y < x ≤ 100} Σ_{z: 1 ≤ z < y} 0.04
  = Σ_{x: y < x ≤ 100} (y − 1) × 0.04
  = (100 − y) × (y − 1) × 0.04
  = −0.04y² + 4.04y − 4
The final result is the sum of all such branch results (… + …), a symbolic function of the free variable y.
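The base-case branch of the derivation (x > y, y ≠ 5, z < y, over domains {1,…,100}) can be checked against brute-force enumeration; a small sketch:

```python
def branch_brute_force(y):
    """Directly enumerate the branch x > y, z < y of
    sum_x sum_z (if x > y and y != 5 then 0.1 else 0.9)
              * (if z < y or y < 3 then 0.4 else 0.6),
    with y != 5 so the first factor contributes 0.1 on this branch."""
    assert y != 5
    return sum(0.1 * 0.4
               for x in range(1, 101) if x > y
               for z in range(1, 101) if z < y)

def branch_symbolic(y):
    """The closed form for this branch: (100 - y)(y - 1) * 0.04."""
    return -0.04 * y**2 + 4.04 * y - 4

for y in (2, 10, 50, 99):
    assert abs(branch_brute_force(y) - branch_symbolic(y)) < 1e-6
```

The symbolic solver produces the quadratic in y directly, without ever enumerating x and z.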
Unifying Logic and Probabilistic Inference
                          | Satisfiability (∃)  | Sum (Σ), max, and others
Propositional             | DPLL                | Variable Elimination
Modulo Theories           | SMT                 |
Symbolic Modulo Theories  |                     | SGVE(T), with SGDPLL(T) for inner summations
Evaluation
• Generated random graphical models defined
  on equalities over bounded integers
• Evaluated against VEC (Gogate & Dechter 2011),
  a state-of-the-art graphical model solver,
  after grounding into tables, with increasing
  random variable domain sizes
• For domains of size 16, our solver was already
  20 times faster than VEC
Final Remarks on SGDPLL(T)
• It is symbolic (S)
• It is generic (not only Boolean expressions) (G)
• It can use theories (T)
• It can re-use SMT techniques
  – Satisfiability solvers
  – Modern SAT solver techniques: unit propagation
    and watched literals, clause learning
• Requires new solvers on theories for base cases
  (at least as powerful as model counting)
Thanks!