Nonmonotonic Causal Theories

491 Knowledge Representation
‘Nonmonotonic Causal Theories’
Marek Sergot
Department of Computing
Imperial College, London
March 2006 v1.0f; November 2014 v1.1a
The details here are NOT EXAMINABLE. They are provided for background for the action
language C+, and for general interest.
Multi-valued signatures
A multi-valued propositional signature σ consists of:
• a set of symbols called constants,
• for each constant c, a non-empty set dom(c) of values, called the domain of c. For
simplicity we will assume that there are at least two distinct values in every dom(c)
(otherwise c is trivial — meaningful but causes some minor technical complications
in definitions and so on, which I want to avoid).
An atom of a signature σ is an expression of the form c = v where c is a constant in σ and
v ∈ dom(c).
A formula ϕ of signature σ is any truth-functional compound of atoms of σ.
Enrico Giunchiglia, Joohyung Lee, Vladimir Lifschitz, Norman McCain, Hudson Turner.
Nonmonotonic causal theories. Artificial Intelligence, 153:49–104, 2004.
Nonmonotonic causal theories (or just ‘causal theories’ for short) is a general purpose
non-monotonic representation formalism. A causal theory is a set of causal rules of the
form
F ⇐G
where F and G are formulas (defined on next page). Informally, this is to be read as saying
that F is ‘caused’ if G is true (which is not the same as saying that G is the cause of F ).
We don’t need to rely on this informal reading to use the formalism.
‘Causal theories’ are also called ‘the logic of causal explanation’ in [2]. Turner [3] has a
more general formalism which he calls the ‘logic of universal causation’.
ϕ ::= ⊥ | > | any atom c = v | ¬ϕ | ϕ ∧ ϕ | ϕ ∨ ϕ | ϕ → ϕ.
(⊥ and >, representing ‘false’ and ‘true’, and all atoms are formulas. If ϕ is a formula so
is ¬ϕ. If ϕ and ψ are formulas, then so are ϕ ∧ ψ and ϕ ∨ ψ and ϕ → ψ.)
A Boolean constant is one whose domain is the set of truth values {t, f}. If p is a Boolean
constant, p is shorthand for the atom p = t and ¬p for the atom p = f.
An interpretation of σ is a function that maps every constant in σ to an element of its
domain. An interpretation I satisfies an atom c = v, written I |= c = v, if I(c) = v. The
satisfaction relation |= is extended from atoms to formulas in accordance with the standard
truth tables for the propositional connectives. I(σ) stands for the set of all interpretations
of σ. As usual, when X is a set of formulas, I |= X signifies that I is a model of X, i.e.,
that I |= ϕ for every ϕ ∈ X.
It is often convenient to represent an interpretation I by the set of atoms satisfied by I.
The main interest for us is that causal theories are intimately connected to the action
language C+ which we will be looking at later.
An implementation (of both causal theories and C+) is available in the form of the Causal
Calculator (CCalc). See: http://www.cs.utexas.edu/users/tag/cc.
(We also have our own (re-)implementation of CCalc, called iCCalc, which has some
additional features.)
‘Causal theories’ (and C+) allows multi-valued propositions as well as Boolean ones. This
is actually a very minor, but useful, extension. We won’t need it in these notes but it is
useful when we look at C+, so I might as well introduce it now.
Reduction to Boolean signatures
Multi-valued signatures are for convenience. As long as the set of constants is finite, and
the domain of every constant is finite, a multi-valued signature can be translated to an
equivalent Boolean signature.
An atom c = v can be viewed as a classical, propositional atom. Then add the following
set of additional formulas:
_
^
(c = v) ∧
¬(c = v ∧ c = w)
for all c ∈ σ
v
v6=w
There are various optimisations — Details omitted.
1
2
Nonmonotonic causal theories
Syntax A causal theory of signature σ is a set of expressions (‘causal rules’) of the form
F ⇐G
where F and G are formulas of signature σ. A rule of this form is to be read as saying
that F is ‘caused’ if G is true (which is not the same as saying that G is the cause of F ).
Semantics Let Γ be a causal theory and let X be an interpretation of its signature.
ΓX is the set of heads of all rules of Γ whose bodies are satified by the interpretation X:
ΓX =def {F | F ⇐ G is a rule in Γ and X |= G}
Translation to (classical) propositional logic
Definite causal theories (defined above) can be translated via the process of ‘literal completion’ into expressions of (classical) propositional logic. The process is analogous to
the Clark completion that provides the original semantics for negation by failure in logic
programs.
Literal completion
For a definite causal theory Γ, translate to set of (classical) formulas comp(Γ):

F ⇐ G1 

..
becomes F ↔ G1 ∨ · · · ∨ Gn
.

F ⇐G 
n
By definition: X is a model of Γ iff X is the unique (classical) model of ΓX .
If F is an atom and there are no causal rules with F as the head then F ↔ ⊥ (which is
logically equivalent to ¬F ).
ΓX is the set of formulas that are ‘caused’ or ‘causally explained’ according to the rules of
Γ, under interpretation X. Perhaps TΓ (X) would be a better notation — it is the heads
of the rules in Γ whose bodies are true in X. I have kept ΓX because that is what the
inventors choose to write. (They also call ΓX the ‘reduct’ but I won’t.)
A causal rule ⊥ ⇐ G becomes ¬G.
Models of Γ are the (classical) models of the formulas comp(Γ).
Notice that if Γ has no models, or has more than one model, then X is not considered to
be a model of Γ.
(As already observed, for multi-valued signatures the task of finding a model of the completion comp(Γ) of a definite causal theory Γ can be reduced to the task of finding a model
of a set of classical (Boolean) propositional formulas. Details omitted.)
Examples below.
Example Signature: p, q (Boolean)
X
Definite clausal theories A causal theory Γ is definite if
• the head of every rule of Γ is an atom or ⊥, and
• no atom is the head of infinitely many rules of Γ.
Note that this is not the same definition of ‘definite’ as in ‘definite clauses’ or ‘definite logic
programs’.
Note also: as defined earlier, when p is a Boolean constant, ¬p is shorthand for the atom
p = f and so covered by the definition.
Intepretations will be represented by the set of atoms that they satisfy. So the interpretations are: {p, q}, {p, ¬q}, {¬p, q}, {¬p, ¬q}. (¬p and ¬q are being used as shorthand for
the atoms p = f and q = f, remember.)
Γ1 = {p ⇐ q,
q ⇐ q,
¬q ⇐ ¬q}
Compute the models of Γ1 :
X1
X2
X3
X4
1
= {p, q}:
ΓX
1
X2
= {p, ¬q}:
Γ1
3
= {¬p, q}:
ΓX
1
4
= {¬p, ¬q}: ΓX
1
= {p, q}
= {¬q}
= {p, q}
= {¬q}
which
which
which
which
has
has
has
has
one (classical) model, X1 .
two (classical) models, {p, ¬q} and {¬p, ¬q}.
one (classical) model X1 6= X3 .
two (classical) models, {p, ¬q} and {¬p, ¬q}.
So Γ1 has one model: {p, q}.
The literal completion comp(Γ1 ) = {p ↔ q, ¬p ↔ ⊥, q ↔ q, ¬q ↔ ¬q} ≡ {p ↔ q, p}. This
has one model. Correct.
3
4
Example
Defaults
Signature: p, q (Boolean)
Γ2 = {p ⇐ q,
Compute the models of Γ2 :
X1
X2
X3
X4
= {p, q}:
= {p, ¬q}:
= {¬p, q}:
= {¬p, ¬q}:
1
ΓX
2
2
ΓX
2
3
ΓX
2
X4
Γ2
= {p, q}
= {¬q}
= {p, q, ¬p}
= {¬p, ¬q}
In causal theories, a causal rule of the form
q ⇐ q,
¬q ⇐ ¬q,
F ⇐F
¬p ⇐ ¬p}
effectively says ‘F holds by default’.
which
which
which
which
has
has
has
has
one (classical) model, X1 .
two (classical) models, {p, ¬q} and {¬p, ¬q}.
no (classical) models.
one (classical) model, X4 .
So Γ2 has two models: {p, q} and {¬p, ¬q}.
The literal completion comp(Γ2 ) = {p ↔ q, q ↔ q, ¬p ↔ ¬p, ¬q ↔ ¬q} ≡ {p ↔ q}. This
has two models. Correct.
To see why, do the tutorial exercises.
It might help to know that the causal rule F ⇐ G is (nearly exactly) equivalent to the
:G
:F
Reiter default rule
. So F ⇐ F corresponds to default rule
: if it is consistent
F
F
that F , then F , or ‘F holds by default’.
There is also a corresponding translation to (extended) logic programs. For an atom p,
p ⇐ p translates to p ← not ¬p
Example
More details later.
Signature: p, q (Boolean)
Careful: the causal theory {p ⇐ q} has no models. The completion of this theory is not
{p ↔ q}, which has models, but {p ↔ q, ¬p ↔ ⊥, q ↔ ⊥, ¬q ↔ ⊥}, which has no models.
When c is a constant and v is a value in its domain
c=v ⇐ c=v
says that the default value of c is v.
Example
Signature: constants c, g with dom(c) = {1, 2, 3} and dom(g) = {a, b}.
c=1 ⇐ c=1
c=2 ⇐ g=a
c=3 ⇐ g=b
For Boolean constant p:
p ⇐ p represents that p is true by default. (Because p is shorthand for p = t ⇐ p = t.)
¬p ⇐ ¬p represents that p is false by default. (Because ¬p is shorthand for p = f ⇐ p = f.)
g=a ⇐ g=a
g=b ⇐ g=b
There are obviously six possible interpretations:
interpretation X reduct ΓX
{c = 1, g = a}
{c = 1, c = 2, g = a}
{c = 2, g = a}
{c = 2, g = a}
{c = 3, g = a}
{c = 2, g = a}
{c = 1, g = b}
{c = 1, c = 3, g = b}
{c = 2, g = b}
{c = 3, g = b}
{c = 3, g = b}
{c = 3, g = b}
5
models of ΓX X model of Γ?
none
no
{c = 2, g = a}
yes
{c = 2, g = a}
no
none
no
{c = 3, g = b}
no
{c = 3, g = b}
yes
6
Example (‘Nixon diamond’)
Signature: Boolean constants pacifist(X), quaker(X), republican(X) for X ranging
over alan, bill.
pacifist(X) ⇐ pacifist(X) ∧ quaker(X)
¬pacifist(X) ⇐ ¬pacifist(X) ∧ republican(X)
By default, X is not a Quaker:
How do we interpret multiple models?
Usual story:
cautions/sceptical – Γ ` α if α satisfied in all models of Γ
brave/credulous – Γ `credulous α if α satisfied in some model of Γ
Finally, what if there are no Republicans who are Quakers?
¬quaker(X) ⇐ ¬quaker(X)
By default, X is not a Republican:
¬republican(X) ⇐ ¬republican(X)
Forget about Nixon for a minute. How would we express that no-one is both a Quaker and
a Republican?
By adding a causal rule like this:
⊥ ⇐ quaker(X) ∧ republican(X)
And some facts:
quaker(alan) ⇐ >
republican(bill) ⇐ >
This causal theory has one model – (check as an exercise):
{quaker(alan), ¬quaker(bill), republican(bill), ¬republican(alan),
pacifist(alan), ¬pacifist(bill)}
Suppose we add another fact: pacifist(bill) ⇐ >.
This causal theory has a different model (check!):
{quaker(alan), ¬quaker(bill), republican(bill), ¬republican(alan),
pacifist(alan), pacifist(bill)}
What about Nixon?
Signature: same as above, but with X ranging over nixon as well.
Let’s add two facts (to the original)
quaker(nixon) ⇐ >
republican(nixon) ⇐ >
This causal theory has two models (check!):
{quaker(alan), republican(bill), quaker(nixon), republican(nixon),
pacifist(alan), pacifist(bill), pacifist(nixon)}
The ‘Causal Calculator’ CCalc
CCalc works with causal theories and also with action descriptions in the action language
C+ to be described later.
Given a definite causal theory Γ of some specified signature, CCalc
• constructs comp(Γ),
• invokes a standard propositional sat-solver to find (classical) models of comp(Γ), and
then
• post-processes the sat-solver output to show the models obtained.
In practice, the first two steps may be combined into one, possibly with some additional
optimisations to simplify the set of formulas passed to the sat-solver.
(http://www.cs.utexas.edu/users/tag/cc)
In addition, CCalc provides
• a language for specifying the signature (sorts, variables, various shorthand notations)
• various other syntactic shorthands, which we will ignore.
I won’t show the details.
{quaker(alan), republican(bill), quaker(nixon), republican(nixon),
pacifist(alan), pacifist(bill), ¬pacifist(nixon)}
7
8
Relationship to other formalisms
Example (a big person is strong by default; a small (not big) person is not strong by
default):
Reiter default logic (You can ignore the details!!)
A causal theory of a Boolean signature can be viewed as a Reiter default theory.
:G
.
Translate causal rule F ⇐ G to the default rule
F
More precisely: Let Γ be a causal theory of Boolean signature. Let D(Γ) be the set of
default rules obtained by translating every rule in Γ as described above.
An interpretation I is a model of Γ iff th(I) is an extension of the default theory (D(Γ), ∅).
(th(I) stands for the set of all formulas true in I.)
Extended logic programs (IMPORTANT)
Suppose that a causal theory Γ has a Boolean signature, and moreover that the head of
every rule in Γ is of the form c or ¬c (these are shorthand for the atoms c = t and c = f) or
⊥.
Every such causal theory can be written equivalently as a set of causal rules of the form
L ⇐ L1 ∧ · · · ∧ Ln
(n ≥ 0)
⊥ ⇐ L1 ∧ · · · ∧ Ln
(n ≥ 0)
or
where L and every Li is a Boolean atom, i.e., of the form c or ¬c (c = t and c = f). Translate
each such rule to an extended logic programming clause:
L ← not L1 , . . . , not Ln
where as usual Li stands for the literal complementary to Li . (Translate L ⇐ > to L ←.)
Translate
to the constraint
Call this the program lp(Γ).
⊥ ⇐ L1 ∧ · · · ∧ Ln
(n ≥ 0)
← not L1 , . . . , not Ln
Now identify an interpretation I with the set of literals satisfied by I (as usual).
An interpretation I is a model of Γ iff I is an answer set of lp(Γ).
Notice: the above does not say that every answer set of an extended logic program lp(Γ)
is a model of Γ. It says that every interpretation I – every consistent and complete set of
atoms of the signature of Γ – is a model of Γ iff it is an answer set of lp(Γ).
strong ⇐ big ∧ strong
¬strong ⇐ ¬big ∧ ¬strong
translates to the extended logic program
strong ← not ¬big, not ¬strong
¬strong ← not big, not strong
Notice that
¬big ← not big
(¬big unless one can show big)
is the translation of
¬big ⇐ ¬big
(¬big by default; or, the default value of big is f)
And similarly
big ← not ¬big
(big unless one can show ¬big)
is the translation of
big ⇐ big
(big by default; or, the default value of big is t)
In certain circumstances it is possible to translate to a simpler extended logic program
lp0 (Γ) that has fewer occurrences of negation-by-failure.
For instance, it turns out that the extended logic program
strong ← not ¬big, not ¬strong
¬strong ← not big, not strong
is equivalent to (has the same answer sets as):
strong ← big, not ¬strong
¬strong ← ¬big, not strong
For example: the causal theory {p ⇐ q; ¬q ⇐ ¬q} has no models, but translates to the
logic program {p ← not ¬q; ¬q ← not q} which has an answer set – {¬q}. But {¬q} is
not an interpretation. (It gives no value to p.)
What are these circumstances? They are similar to conditions for splitting sets. I will omit
the details. They are not of any practical significance because the simplification does not
seem to affect performance of answer set solver (in clingo anyway).
9
10
Multi-valued signatures
Example Here is the example earlier.
For multi-valued signatures it is still the case that every definite causal theory can be
written equivalently as a set of causal rules of the forms:
L ⇐ L1 ∧ · · · ∧ Ln
⊥ ⇐ L1 ∧ · · · ∧ Ln
Signature: constants c, g with dom(c) = {1, 2, 3} and dom(g) = {a, b}.
c=1 ⇐ c=1
c=2 ⇐ g=a
c=3 ⇐ g=b
(n ≥ 0)
(n ≥ 0)
where L and every Li is an atom, except that now it is not necessarily Boolean. In other
words, these rules will look like this:
c = v ⇐ c1 = v1 ∧ · · · ∧ cn = vn
⊥ ⇐ c1 = v1 ∧ · · · ∧ cn = vn
g=a ⇐ g=a
g=b ⇐ g=b
Translates to the logic program:
(n ≥ 0)
(n ≥ 0)
val(c,1)
val(c,2)
val(c,3)
val(g,a)
val(g,b)
I will write the logic program lp(Γ) in clingo notation to try to make things clearer. Let
the propositional atom val(c,v) in the logic program represent the causal atom c = v.
Translate the causal rules to logic program clauses as before:
-val(c,1)
-val(c,1)
-val(c,2)
-val(c,2)
-val(c,3)
-val(c,3)
val(c,v) :- not -val(c1 ,v1 ), ..., not -val(cn ,vn )
-val(c1 ,v1 ) represents ¬(c1 = v1 ), etc.
Constraints are translated similarly:
:- not -val(c1 ,v1 ), ..., not -val(cn ,vn )
Now we have to add logic program clauses to define -val(c,v) for every constant c and
every v ∈ dom(c). Assuming dom(c) = {v1 , . . . , vn }, these rules will be (in clingo notation):
-val(c,v) :- val(c,v1 ).
..
.
(vi 6= v)
-val(c,v) :- val(c,vn ).
There are other ways to get the same effect but this is the simplest. It also has other
advantages (see below).
because they are automatically satisfied by the way that the -val(c,v) literals are defined.
::::::-
not
not
not
not
not
-val(c,1).
-val(g,a).
-val(g,b).
-val(g,a).
-val(g,b).
val(c,2).
val(c,3).
val(c,1).
val(c,3).
val(c,1).
val(c,2).
% not used in this example
%
"
%
"
%
"
-val(g,a) :- val(g,b).
-val(g,b) :- val(g,a).
Two-valued constants, such as g in this example, can be represented more concisely by
means of a single Boolean atom in the logic program. We could represent g = a and g = a
by means of g(a) and -g(a) respectively. That would save a couple of clauses but is
harder to read. It is a detail, and does not save much.
There are other possible translations that give the same effect. For example, notice that
(in this example)
¬(c = 1) ↔ c = 2 ∨ c = 3
So, instead of defining -val(c,1) etc., we could also translate
Note that we don’t need the constraints
:- val(c,v), -val(c,v).
:::::-
c=1 ⇐ c=1
into
val(c,1) :- not val(c,2), not val(c,3)
And similarly for the other values of c and for g.
The method using explicit -val/2 rules has other advantages.
11
12
In particular: every definite causal theory can also be written equivalently as a set of causal
rules of the form
L ⇐ L1 ∧ · · · ∧ Lk ∧ ¬Lk+1 ∧ . . . ¬Ln
(k ≥ 0, n ≥ k)
⊥ ⇐ L1 ∧ · · · ∧ Lk ∧ ¬Lk+1 ∧ . . . ¬Ln
(k ≥ 0, n ≥ k)
One last little note
The very observant might have noticed that the causal rule
L ⇐ L1 ∧ · · · ∧ Ln
or
translates to the Reiter default
where L and every Li is an atom (not necessarily Boolean). In other words, in this form:
c = v ⇐ c1 = v1 ∧ · · · ∧ ck = vk ∧ ¬(ck+1 = vk+1 ) ∧ · · · ∧ ¬(cn = vn )
: L1 ∧ · · · ∧ Ln
L
Whereas the logic program clause
(k ≥ 0, n ≥ k)
L ← not L1 , . . . , not Ln
translates to the Reiter default
or
⊥ ⇐ c1 = v1 ∧ · · · ∧ ck = vk ∧ ¬(ck+1 = vk+1 ) ∧ · · · ∧ ¬(cn = vn )
(k ≥ 0, n ≥ k)
: L1 , . . . , L n
L
These are not equivalent, in general. Compare, for example
These causal rules are translated, respectively, to the following logic program clauses
L ← not ¬L1 , . . . , not ¬Lk , not Lk+1 , . . . , not Ln
and constraints
← not ¬L1 , . . . , not ¬Lk , not Lk+1 , . . . , not Ln
In clingo notation, logic program clauses:
val(c,v) :- not -val(c1 ,v1 ),..., not -val(ck ,vk ),
not val(ck+1 ,vk+1 ),..., not val(cn ,vn ).
and constraints:
:- not -val(c1 ,v1 ),..., not -val(ck ,vk ),
not val(ck+1 ,vk+1 ),..., not val(cn ,vn ).
:p∧q
r
and
: p, q
r
But they are equivalent when causal rules have only atoms in the head, i.e., when they
can be translated to a logic program.
References
[1] Enrico Giunchiglia, Joohyung Lee, Vladimir Lifschitz, Norman McCain, and Hudson
Turner. Nonmonotonic causal theories. Artificial Intelligence, 153:49–104, 2004.
[2] Vladimir Lifschitz. On the logic of causal explanation. Artificial Intelligence, 96:451–
465, 1997.
[3] Hudson Turner. A logic of universal causation. Artificial Intelligence, 113:87–123,
1999.
-val(c,v) etc are defined as before.
This is essentially the translation used in iCCalc.
Note There is nothing here that guarantees that the answer sets of lp(Γ) will be interpretations — complete and consistent sets of val(c,v) and -val(c,v) literals. They will
be consistent but not necessarily complete.
How to make them complete? We have to add some more constraints. I won’t go into detail
because we are primarily interested in C+ not in causal theories per se. In C+ translations
the necessary constraints will be done in a special way.
13
when (¬p ∨ ¬q)
14