Rethinking Epistemic Logic
Mark Jago
1 Introduction
Hintikka’s logic of knowledge and belief [8] has become a standard logical tool for
dealing with intentional notions in artificial intelligence and computer science.
One reason for this success is the adoption by many computer scientists of modal
logics in general, as tools for reasoning about relational structures. Many areas
of interest to computer scientists, from databases to the automata which underlie
many programs, can be thought of as relational structures; such structures can
be reasoned about using modal logics. Hintikka’s work can be seen as the start
of what is known to philosophers, logicians and computer scientists as formal
epistemology: that branch of epistemology which seeks to uncover the formal
properties of knowledge and belief.
In Reasoning About Knowledge [6], Fagin, Halpern, Moses and Vardi showed
how logics based on Hintikka’s ideas can be used to solve many real-world problems; these were problems about other agents’ knowledge, as well as problems
in which agents reason about the world. Solutions to such problems have applications in distributed computing, artificial intelligence and game theory, to
name but a few key areas. With the appearance of Reasoning About Knowledge, Hintikka’s approach was firmly cemented as the orthodox logical account
of belief for philosophers and computer scientists alike.
It is rare for an orthodox account of such popularity to receive no criticism
and Hintikka’s framework is no exception. One major source of objections is
the so-called problem of logical omniscience whereby, as a result of the modal
semantics applied to ‘knows’ (and ‘believes’), agents automatically know every
tautology as well as every logical consequence of their knowledge. Just how such
a consequence should be viewed is a moot point; perhaps this notion of knowledge applies to ideal agents, or perhaps it is an idealised notion, saying what a
non-ideal agent should believe (given what it already does believe). Although
neither account is entirely satisfactory, defenders of the approach claim that,
in many cases, the assumptions are harmless, and that the applications which
such logics have found speak for themselves. [11] surveys logics for which logical
omniscience is a problem and concludes that an alternative logic of knowledge
and belief is required, if real-world agents are to be modelled with any kind of
fidelity.
However, my aim here is not to criticise Hintikka’s approach on the ground of
logical omniscience. Instead, I show that the assumptions required by Hintikka’s
approach cannot be justified by an acceptable account of belief. I concentrate
throughout on the notion of belief, rather than that of knowledge, as the former
is (usually) regarded as the more primitive notion; an account of knowledge will
usually proceed from an account of belief. Hintikka’s logic is not in itself such
an account, a fact which has been ignored in the recent literature in artificial
intelligence and computer science. Hintikka’s approach cannot be seen as a
definition of belief without vicious circularity. I then argue that any acceptable
account of belief must be able to account for Frege’s problem of informativeness
and settle on a partly representational, partly dispositional account of belief.
Such an account clearly shows the mistaken assumptions in Hintikka’s approach.
In the second half of the paper, I introduce a new approach to epistemic
logic, based on these considerations.1 The logic is introduced for the case of
rule-based agents of the kind common in artificial intelligence but is then extended to a full propositional reasoner. The logic differs from Hintikka’s in using
representational elements to classify belief states and in treating temporal and
alethic modality as essential components of an account of the dynamics of belief
states. The rest of the paper is organised as follows. In the following section,
I present Hintikka’s proposal and, in section 3, discuss the basic notion behind
it, that of an epistemically possible world. In section 4, I discuss and reject accounts of belief which are formulated in terms of Fregean senses, before putting
forward my own account of belief (section 5). Sections 6 and 7 present the
logic of dynamic belief states, including some interesting properties possessed
by models of the logic. An axiomatization of the logic is also given and proved
to be complete with respect to these models. To conclude, I briefly mention
ways in which this approach to epistemic logic can be implemented in real AI
systems, where Hintikka’s approach has proved troublesome.
2 Epistemic Logic
In this section, Hintikka’s logic of knowledge and belief—hereafter, standard
epistemic logic—is presented. Hintikka’s idea in Knowledge and Belief [8]
was to treat knowledge and belief as ways of locating the actual world in a
space of logical possibilities. If I believe that φ, I rule out possibilities in which
φ is not true, for such worlds are not epistemic possibilities for me. On the
other hand, if I am not sure whether ψ, then I cannot say whether the actual
world falls into the class of worlds in which ψ is true, or in which it is not.
In this case, the space of possibilities in which I locate the actual world will
include both worlds in which ψ holds and those in which it does not. If I have
knowledge, rather than just belief, then the actual world must be within the
class of possibilities in which I think I have located it.
A clarification is needed here. People do not always take their beliefs to
rule out incompatible possibilities, as evinced in the commonly heard, ‘I believe
that φ . . . but of course, I could be wrong.’ I believe that I have two hands,
and that there is a screen before my eyes as I type; yet, if the kind of radical
scepticism which first motivated Descartes’ meditations were true, these beliefs
would be false. Since I have no proof that such scepticism is false, I have to
admit that although I believe these things, they could be false. Given I admit
this possibility, would a world in which I were a brain in a vat count as an
epistemic possibility for me, in Hintikka’s sense? It cannot, for if it did, I
would believe very little indeed (perhaps only that my mind exists, and that I
am currently thinking). So Hintikka’s notion must mean something else: supposing
that my beliefs are true, the possibilities that are then left open are the epistemic
possibilities for me (regardless of whether those beliefs are in fact true).

1 I use the term epistemic logic to include what should more properly be termed doxastic
logic. A correct account of knowledge relies on a correct account of belief, so I view these
logical considerations as pertaining to epistemic logic in general.
Hintikka used possible world semantics, with a primitive relation of epistemic
accessibility R holding between worlds. Rww′ says that w′ is epistemically
possible (for the agent in question) from w. Belief is then defined in terms
of this relation. An agent believes φ (at a world w) if φ is true in all worlds
epistemically possible from w, i.e. the worlds w′ such that Rww′ . In the case of an
agent’s actual beliefs, a unique world @ may be distinguished as the actual world.
An agent’s beliefs are then the sentences which hold at all worlds accessible from
the actual world, i.e. the worlds w such that R@w. Much the same holds for
knowing that φ, although then a different accessibility relation RK must be
considered, as knowledge has different properties to belief. At the very least,
RK must relate every world to itself so that, if one knows φ at w, then φ is true
there. Let us leave knowledge to one side; our worry is to give an account of
belief.
Models of belief are relational structures. The domain of the model is a set of
points W , considered to be possible worlds. At each world, the sentences which
are said to be true are closed under some logical consequence relation (say,
that of classical propositional or first-order logic). The accessibility relation
R (⊆ W × W ) holds between worlds in the domain of the model. A model which
represents the beliefs of more than one agent will have a distinct accessibility
relation for each agent; we then talk about a multi-modal logic, in which we
have a belief modality Bi for each agent i. The satisfaction relation ‘|=’ holding
between a model, a world in that model and a sentence of the logic is defined
recursively in the usual way. In the propositional case, truth-values are assigned
to propositions at each state; in the first-order case, constants are assigned an
individual of the domain and relation letters an extension. In either case, the
following clauses define the truth of logically complex sentences, in terms of less
complex sentences, at each world w.
M, w |= ¬φ iff M, w ⊭ φ
M, w |= φ ∧ ψ iff M, w |= φ and M, w |= ψ
M, w |= φ ∨ ψ iff M, w |= φ or M, w |= ψ
M, w |= φ → ψ iff M, w ⊭ φ or M, w |= ψ
Satisfaction (or truth) at a world is clearly standard. The definition of ‘|=’
continues with the following clause for belief (with ‘Bφ’ read as ‘the agent
believes that φ’):
M, w |= Bφ iff, for all w′ ∈ W , Rww′ implies M, w′ |= φ
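To see how this clause behaves on a concrete structure, here is a minimal sketch (in Python; the worlds, valuation and accessibility relation are invented for the example and are not part of Hintikka’s presentation):

# A minimal sketch of the possible-worlds clause for belief: the agent
# believes a sentence at w iff that sentence is true at every world
# accessible from w.  All names below are illustrative only.
worlds = {"w0", "w1", "w2"}
R = {"w0": {"w1", "w2"}, "w1": {"w1"}, "w2": {"w2"}}   # epistemic accessibility
val = {
    "w0": {"p": True,  "q": False},
    "w1": {"p": True,  "q": True},
    "w2": {"p": True,  "q": False},
}

def believes(w, sentence):
    # B(sentence) holds at w iff sentence is true at every R-accessible world
    return all(val[v][sentence] for v in R[w])

print(believes("w0", "p"))   # True: p holds at both w1 and w2
print(believes("w0", "q"))   # False: q fails at w2

On this picture, whatever holds at every accessible world counts as believed, however the agent might actually reason; this is the source of logical omniscience.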
Logically equivalent sentences must have a common truth value at each
world in W and so beliefs which are logically equivalent (in that logic) are
indistinguishable: to believe the one is to believe the other. In the same sense
that a set of premises is said to contain the conclusions which may be drawn
from it, this is a notion of belief in which the belief that φ includes whatever
logical consequences φ may have. Since the consequences of the empty set are
precisely the valid sentences of a logic, the claim that agents believe all such valid
sentences is viewed as unproblematic.
What of a traditional problem for accounts of belief: the substitution of coreferring terms? This has been viewed as a problem because, in a sense, a belief
about Bob Dylan is a belief about Robert Zimmerman, for they are one and
the same person. Yet, I may believe Dylan, but not Zimmerman, to be a great
songwriter. The problem is blocked in Hintikka’s account by requiring that
the worlds in W need only be epistemically possible. That is, the condition
for membership of W is just the epistemic possibility of a world’s logically
primitive true sentences. The truth of logically complex sentences is then given
by the standard recursion clauses for Booleans and quantifiers. The thought is
that even though ‘a’ and ‘b’ are co-denoting terms, it is at least an epistemic
possibility that the two refer to different entities. That is, even though a = b in
actuality, it remains epistemically possible that a ≠ b.
3 Epistemic Possibility
There are many worries raised by this account of belief; not least that, as mentioned in the introduction, agents are modelled as being logically omniscient.
There is perhaps a place for such abstraction and it has proved popular in AI
and computer science; my target here is to concentrate on two problems which
remain even if we allow this idealisation. The first is that we have not been
given an account of what beliefs are at all; the second is that, if the account
were correct, it would give us no reason for thinking that the logical principles
upon which the account is based should hold of belief. Both of these problems
relate to the notion of epistemically possible worlds. Just what kind of entities
are they? Evidently, they are worlds in which Robert Zimmerman need not be
Bob Dylan, even though Robert Zimmerman is in actuality Bob Dylan. Yet,
identity is a matter of de re necessity: entities are necessarily self-identical.
Kripke [12] gives us additional reasons to suppose that identity statements involving distinct rigid designators are either necessarily true or necessarily false.
If ‘a = b’ is true at any world, it is true at all worlds in which a exists. Epistemic possibilities, we must conclude, need not be metaphysical possibilities.
But what right have we to call such possibilities entities?
I take it that, by a world, what is meant is an entity, the truths about which form
a maximal consistent set (a consistent set which, upon the addition of just one
extra formula, would become inconsistent). We might then think of worlds
simply as assignments to the primitive constructs in the language, together
with the closure of such under the satisfaction statements for Booleans and
quantifiers. Does this conception entitle us to think of any such assignment as
giving an entity? It does not; we may describe a logical theory by assigning
different elements of our domain to the constants ‘a’ and ‘b’ at a world w1 ,
and assigning them the same element at w2 . As far as the logical theory goes,
this is fine; but we are then caught in a dilemma. Either we treat epistemically
possible worlds simply as logical notions; or we treat them as genuine entities. If
we take the latter horn, we are left explaining how something, whose existence is
impossible, can exist (I take it that the scope of metaphysical possibility decides
all questions of ontological possibility).
If we take the former horn of the dilemma, we arrive at the original accusation
that we do not have a theory of belief at all. For why should a logical point,
or a logical theory, be considered to be epistemically possible or impossible by
an agent? On what basis would we make such a judgement? It seems that one
would judge such a logical point to be epistemically impossible, in the sense
settled on in the previous section, precisely because one’s beliefs rule it out as
a contender for actuality—and possible otherwise. To put the point another
way, what could count as the truthmaker for a logical point, or a logical theory,
being epistemically possible for an agent, other than what that agent considers
possible? But the sense of epistemically possible settled on in the previous
section is in no way comparable to objective notions of metaphysical possibility,
as the example of scepticism has shown. An agent considers a state of affairs
epistemically possible, in the sense required for the standard epistemic logic to
be useful, precisely when her beliefs do not rule it out. This sense of epistemic
possibility cannot be accounted for without a prior understanding of belief.
Standard epistemic logic, then, is not in itself a theory of belief at all. Given
what an agent considers possible, the theory tells us what beliefs she has and
what follows from those beliefs. But, to fix what an agent considers possible,
it is necessary to fix what her beliefs are first. Belief is the more fundamental
notion here. Now, it is not always incumbent on a logic of some concept to
explain, on its own, what that concept is or means. Logics may help focus
our thoughts when thinking about a particular concept. A logic of belief can
help us to become clearer about the implications of our account (for example by
settling whether one’s believing φ implies believing that one believes φ). But
then we must search elsewhere for an account of beliefs, their meanings and
their identity.
There is a second worry concerning modal epistemic logic, which challenges
its claim even to help us to clarify questions concerning the logic of belief. The
worry is this. Let us grant the required notion of epistemic possibility, either as
genuine entities (ignoring the remarks above) or as purely logical notions. Given
that these possibilities lie outside the scope of the metaphysically possible, what
weight can there be to the insistence on logical necessity that is supposed to
hold at such worlds? When we say that some formula is a logical consequence
of another, we mean that the former cannot be true unless the latter is too,
on pain of contradiction. But, in the realm of the conceptually possible but
metaphysically impossible, why should the threat of contradiction pain us? Why
should the notion of logical consequence hold any weight at all? To put the point
another way, consider the propositional case, in which precisely one value is
assigned to each primitive proposition in the language at each world. Now, what
is our criterion for calling these values truth values, rather than just arbitrary
assignments of one symbol to another? With no notion of metaphysical necessity
in play, it seems we can make no headway with the notion of truth.
Some logicians, e.g. Hintikka [9] and Levesque [13], endorse this thought and
treat it as an advantage, allowing for primitive propositions to be both true and
false at a world. Let us call such a world a paraconsistent world. Allowing paraconsistent worlds in the domain W allows for a logic in which agents may have
inconsistent beliefs and need not believe every classical tautology. In the former
case, all accessible worlds will be paraconsistent. (If a logic is concerned with
which beliefs are true, it must force worlds which are accessible to themselves
to be classical. Otherwise, an agent might have the belief that, say, p as well as
the belief that ¬p; but these beliefs cannot be true at the same time.)
We have seen that we have no reason to suppose classical logic to be the
logic of each epistemically possible world. We equally have no reason to suppose that any logic can fulfil this rôle without conflicting with the given account
of epistemic possibility. Suppose a modal system contains worlds which contain
a non-modal logic Λ (for example, in paraconsistent propositional modal logic,
the theory of each world contains the theorems of paraconsistent propositional
logic).2 Each member of Λ will be epistemically necessary and so must be believed by any agent, regardless of which worlds she considers possible. The
question is, given the account of epistemic possibility, why should any sentence
be considered a universal epistemic necessity? We could always find some element of Λ which someone could take to be false. Suppose we have little reason,
for all we know, to consider a sentence ‘φ’ to be a theorem of Λ but equally little
reason to think that it is not (perhaps Λ is undecidable; perhaps the complexity
of validity checking is too high). It seems that the theoremhood or otherwise of
‘φ’ is epistemically open. Yet, according to the standard epistemic logic over
Λ, ‘φ’ is universally and globally believed iff it is Λ-valid. The only logic which
could avoid this difficulty is the zero logic ∅, which contains no theorems whatsoever; and it is clear that the zero logic cannot provide us with a useful tool
for analysing belief at all.
4 Fregean Senses
Frege [7] discusses two questions which are of interest to us, viz. (i) why is it that
co-denoting terms are not substitutable salva veritate in belief contexts? and
(ii) how is it possible that certain identity statements are informative? The
latter is known as the problem of cognitive significance and clearly impacts on
the former. In summary, Frege’s solution is that senses mediate reference and
that propositions, or thoughts, consist of senses. Frege thought of senses as
mind-independent entities, distinct both from the physical world and the realm
of language. Thoughts qua entities consisting of senses are not, on this view,
mental entities at all. Thoughts are mind-independent and thus the very same
thought (the same token, not just the same type) may be grasped by more
than one person. Understanding simply consists in the grasping of a thought.
Roughly, we may think of the sense of a term ‘a’ as a way in which its referent a
is presented. The problem of cognitive significance then vanishes, for the terms
‘a’ and ‘b’ may have very different senses, even if a = b. One would grasp a
different thought in understanding the sentence ‘a = b’ than the thought grasped
in understanding ‘a = a’.
A major problem with this view is in part caused by the inherently abstract
nature of such entities. The metaphor of grasping a term’s sense lacks any explanatory force; nor does a more informative answer seem possible. One simply
has to posit non-natural mental powers in order to account for our understanding and, since a theory of understanding is a theory of meaning, the meaning
of language is treated as primitive and impossible to analyse further. Secondly,
if the sense of a sentence is a thought, then we should treat the senses of the
constituents of a sentence as the constituents of thought, i.e. as concepts. Frege
allows this by treating the senses of singular terms (by which Frege included descriptions as well as names, demonstratives and indexicals) as primitive, whereas
the senses of predicates and relational terms are to be treated as functions.
2 The theory of a world is just the set of formulas true at that world.
By way of illustration, let us write ‘σ[P ]’ for the sense of the predicate ‘P ’
and ‘σ[a]’ for the sense of a singular term ‘a’. The former is a function, from the
sense of a singular term to a thought. σ[P ], given σ[a] as its argument, returns
the sense of the sentence ‘P a’, none other than the thought that a is P . We
thus have a compositional way of analysing the structure of thought; we may
write the foregoing as σ[P a] = σ[P ](σ[a]), where the latter relatum, structured
as a functional application, displays the structure of the former relatum, i.e. the
thought. However, if we suppose concepts to be primitive senses (those with no
internal structure), no naturalistic explanation of concept formation is possible.
Again, we would have to posit non-natural mental powers to account for concept
formation. As with a theory of understanding, an adequate account of concept
formation must also serve as an explanation; and some such does not seem
possible on the Fregean view.
It seems that the only option, if we are to consider Fregean senses as having
any explanatory value whatsoever, is to credit them with a concrete existence.
One option is to treat senses as descriptions, or as bundles of descriptions. Then,
we treat meanings and concepts as such and understanding as the grasping of
such bundles. But this account faces a well-known objection. Plato, the teacher
of Aristotle, was not so of necessity. Somebody else could have taught Aristotle, even though, in actuality, Plato did. But if ‘Plato’ means ‘the teacher of
Aristotle’, then such a contingent fact as that Plato taught Aristotle would be
a necessary truth. Of course, it is not anything of the sort; hence, senses cannot
be assimilated to descriptions or bundles thereof. There is an interesting idea
here, despite the failure of the account. If thoughts were constituted from senses
qua bundles of descriptions, then thought would have a language-like character. Concepts would contain cognitive information analogous to the information
contained within a description’s descriptive condition. Mental information, on
such an account, would be precisely analogous to linguistic information. We
shall return to this point below.
5 Belief States
An account of what belief is is now owed. Accounts of belief in terms of epistemically possible worlds or Fregean senses have been rejected; so let us consult
our intuitions concerning what would and would not count as a belief. Senses
were originally invoked to solve the problem of informativeness of different yet
co-denoting terms. The problem can be re-cast in terms of sentences by dealing with equivalences rather than identity statements. Where Frege asked how
it can be that ‘a = b’ is informative, we may ask how it is that ‘someone is a
bachelor if and only if he is an unmarried man’ is informative, whereas ‘someone
is a bachelor if and only if he is a bachelor’ is not. One possibility is that, in
combining one’s concept of marriage (or being unmarried) with that of a man,
one does not necessarily arrive at one’s concept of a bachelor. This would be
the case when one has been told that, say, Rob is a bachelor but does not know
what ‘bachelor’ means. Yet, without a prior understanding of what concepts
are, this is simply a re-statement of the problem.
An account of concepts in terms of Fregean senses has been rejected. May
concepts instead be accounted for as abilities? Certainly, to have a concept is
to have an ability, to be able to distinguish things which fall under that concept
from those that do not. However, one need not be disposed to correctly make all
such distinctions in order to possess the relevant concept. My concept of an elm
is distinct from my concept of a beech and yet, presented with one type of tree,
I doubt I could say which it is. This is not a defect in my conceptual scheme;
I simply lack the requisite information to distinguish instances of the one type
from those of the other. This suggests the question, how are the two concepts
distinct at all? Were I asked to list distinct types of tree, I would certainly utter
both ‘beech’ and ‘elm’; this seems evidence that my concepts elm and beech are
distinct. I have a repository of information associated with each term, or with
the mental equivalent thereof. Rather than assuming the concept is nothing
but this information, as the senses-qua-bundle-of-descriptions account does, the
repository of information relating to elms seems intrinsically linked to my uses
of ‘elm’. Were I to learn that elms possess a property P lacking in beech trees, I
would become disposed to assert ‘elms are P’. Notice that this new information might
not provide me with the ability to distinguish elms from beeches, for I might
be unable to tell the P s from the non-P s. The identity of concepts, then, goes
beyond particular distinguishing abilities.
The notion of a concept is best explained on the analogy of a mental repository of information. The information associated with a particular concept may
change, in some cases dramatically, yet what the concept is a concept of stays
the same. My concept of an elm is intrinsically linked to the term ‘elm’; rather
than the word being a label for the concept, as the Fregean account presupposes,
the very identity of the concept appears tied to a linguistic, or at least representational, entity. In fact, in order to account for the problem of informativeness,
concepts must be to a greater or lesser extent representational. There is not the
space here to argue for a representational account of concepts in the detail it
deserves; the aim, after all, is to provide a logic of belief. It suffices to say that,
if this account is correct, then what one believes about elms is what representational information one associates with one’s elm concept. Certain information
disposes one to assert certain sentences and not others; having a certain belief
is therefore a disposition to assert certain sentences and not others.
This latter formulation, if correct, allows an analysis of belief free from
assumptions about the nature of the mind. Yet this formulation is not correct
as it stands and is in a degree of tension with the formulation in terms of
representational mental information. Consider these two cases. You are asked
whether the Eiffel tower is taller than Elvis ever was. You have never made such
a comparison before yet, with relatively little mental effort, you answer in the
affirmative. You believe that things are so and your disposition to answer the
question in the affirmative is evidence of this belief. Now suppose a logician is
asked whether a sentence φ is a theorem of some logic Λ. After a week of a priori
mental toil, she responds that it is. Did she believe it to be so at the instant the
question was posed? Surely not, for then she would not have taken so long and
taken such pains to reply. She was disposed to assert that φ is a theorem but
only after a great deal of mental gymnastics. Her mental state at the posing of
the question cannot be considered a state of belief (about φ), yet we suppose
that one’s mental state in the case of the Eiffel tower-Elvis comparison is a
state of belief. Surely the difference is only a matter of degree?—the former
case requires a great deal of thought before an assertion is made, the latter very
little. In the case of the logician, she arrives at the belief that φ is a theorem
precisely when her deliberating—the kind of thought process which enables one
to assert or deny statements—internally represents φ as being so. It is as if she
internally derives the mental analogue of the sentence ‘φ is a theorem’. This
internal derivation may be a more or less logical—more or less rational—process;
its important feature is that it enables the logician to assert, without further
deliberation, that φ is a theorem.
The disposition to assert a certain sentence characterises the kind of deliberative mental process required for belief in a precise way. In a strict sense, having
the belief that φ consists in being disposed to assert that φ, in virtue of being in
a cognitive state involving the representational mental analogue of φ, without
further deliberation or empirical investigation. In the case of artificial agents,
this account seems undeniable; if we are to credit artificial agents with beliefs,
then they must be accounted for in the following way. The individuation of the
internal states of the agent is theoretically unproblematic. Suppose the agent is
a program which executes by assigning different values to variables at different
times (or, to be more precise, at different cycles of the hardware upon which
the program executes). Such states may be unambiguously translated into a
metalanguage—say, a precise logical language—by assigning values to variables
in a way which mimics variable assignment in the execution of the program.
Then, an agent believes that φ in a certain state s iff the part of s which is
responsible for declarative output, when translated into our metalanguage, includes φ. One might protest that whether or not an agent can actually make
declarative utterances (or truth-apt output of any kind) is irrelevant when considering whether a particular internal state is a state of belief. If we allow such
states to be belief states, then such an agent believes that φ in state s iff the
representational part of s, when translated into the metalanguage, includes φ.
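As a schematic sketch of this translation idea (mine, not the paper’s; the state variables and the translation scheme are invented purely for illustration):

# Schematic sketch: translating an agent's internal state (its variable
# assignments at some cycle) into ground formulas of a metalanguage.
# The state, variable names and translation scheme are illustrative only.
state = {"current_user": "Rob", "user_gender": "male", "matched": False}

def translate(state):
    # map the representational part of a program state to a set of formulas
    formulas = set()
    if "user_gender" in state and "current_user" in state:
        formulas.add(f"{state['user_gender']}({state['current_user']})")
    if state.get("matched") is False:
        formulas.add(f"unmatched({state['current_user']})")
    return formulas

def believes(state, formula):
    # the agent believes formula in this state iff the translation of the
    # representational part of the state includes it
    return formula in translate(state)

print(believes(state, "male(Rob)"))     # True
print(believes(state, "female(Rob)"))   # False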
In the case of human belief states, things are not so simple. They never
are. Upon being asked whether the Eiffel tower is taller than Elvis was, one’s
cognitive state is likely to change ever so slightly, moving into a cognitive state
which includes comparing. The change may be so slight as to be unnoticeable to
the agent so that, on forming the belief (in the latter mental state) that the Eiffel
tower is indeed taller than Elvis and asserting so, it seems that the belief must
have been there in the former state as well. Our practices of reporting beliefs
are not fine-grained enough to distinguish the two states; they cannot be, for
our sensory apparatus has its own granularity. More tellingly, we would have
no reason to make such a distinction. We are usually interested in the beliefs of
others as a guide to future action, or as an explanation of their behaviour, or as a
guide for our own beliefs and actions. The granularity of these considerations is,
of course, far coarser than that of cognitive states. We thus generally disqualify
acts of recalling from memory, quick and easy comparisons and the like as acts
which would distinguish between belief states. Just which cognitive acts we
do include in distinguishing one belief state from another is a vague matter.
We include a week’s worth of deliberation in this category; we do not include
split-second comparisons. At some point in between the two extremes, all bets
will be off. Human belief states have vague borders and so there must be an
inherent vagueness in the notion of human belief itself.
6 A Logic of Belief States
From what has been said above, it should be clear that the aim is to provide
a descriptive model of an agent’s beliefs, rather than a normative account. It
seems the question of whether an artificial agent believes a sentence φ can be
settled by considering the internal state of the agent—by looking at which values
have been assigned to which variables. By limiting the focus to what such an
agent believes now, we are likely to arrive at a very dull logic of belief. It is
therefore necessary to be clear as to why a logic of belief is desirable at all.
One answer is the following. There has been considerable interest in the last
twenty years in verifying properties of programs. That a program will
not enter a loop from which it will never exit, or that a server cannot enter a
state in which two users can change the content of a database at the same time,
are clearly useful things to know. The same kind of knowledge is desirable with
artificial agents in the picture. We judge many current artificial agents to be
stupid because they frequently do stupid things. No doubt, the programmers
did not envisage such possibilities coming about and would like to ensure future
generations of the agent avoid similar mistakes, i.e. to verify that the future
generation satisfies certain properties. One use for a logic of belief is to enable
properties to be verified at the intentional level, the descriptive level at which
the agent is said to have concepts, to believe this, to desire that and so on.
Such a logic cannot just talk of belief; it must include temporal and alethic
notions, allowing for judgements such as ‘the agent may reach a state in which
it believes φ in ten cycles’, ‘the agent can reach its goal in ten cycles’ or ‘the
agent cannot reach a state in which it believes ψ within ten cycles’ (here cycle
means something like the change from one belief state to another).
The logic is illustrated by the various belief states of Doris, who single-handedly runs Doris’ Dating Agency. Doris is a rule-based agent, whose program
incorporates rules such as
suits(x,y), suits(y,x) → match(x,y)
(if x is suited to y and vice versa, then they are a good match for one another).
Such rules are read by Doris as inference rules: given that suits(x,y) and
suits(y,x) have been derived (where x and y are consistently replaced with
constants), infer match(x,y) (with the same constants substituted for x and y).
In reasoning about Doris, replace ‘have been derived’ with ‘are believed’ and
‘infer’ with ‘believe’ in this reading. Such rules are a species of condition-action
rule; the condition is the current state of Doris’ beliefs, the action required is for
Doris to form a new belief. Doris works in cycles. In each cycle, she checks her
beliefs against her rules, trying to find a consistent matching instance. If there
are matches, she picks one and forms the belief it tells her to. More precisely,
an instance of a rule is obtained by uniformly replacing all variables with
constants. Let δ be some substitution function from the set of variables of the
rule into the set of constants and ρδ be our notation for the instance of the rule
ρ under δ. For example, if δ assigns Rob to x and Roberta to y, then
(suits(x,y), suits(y,x) → suited(x,y))δ = suits(Rob,Roberta), suits(Roberta,Rob) → suited(Rob,Roberta)
Given a rule instance ρδ under the substitution δ, Doris can fire that instance,
adding the new belief (the consequent of the rule ρ under the substitution δ,
written cn(ρ)δ ) to her working memory. Only one rule instance may be fired at
a time, for rule firing is strictly serial.
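To make instantiation and firing concrete, here is a minimal sketch (the data structures and function names are my own, chosen for illustration rather than drawn from any particular rule-based system):

# Sketch of rule instantiation and firing for a Doris-like rule-based agent.
# A rule is a pair (premises, conclusion); a substitution delta maps the
# rule's variables (here x and y) to constants.  Names are illustrative only.
rule = (["suits(x,y)", "suits(y,x)"], "match(x,y)")

def instantiate(atom, delta):
    # apply a substitution (variable -> constant) to a single atom
    name, args = atom[:-1].split("(")
    return f"{name}({','.join(delta.get(a, a) for a in args.split(','))})"

def fire(beliefs, rule, delta):
    # if the instance matches (premises believed, conclusion not yet believed),
    # return working memory extended by the new belief; otherwise leave it alone
    premises, conclusion = rule
    prem = [instantiate(p, delta) for p in premises]
    concl = instantiate(conclusion, delta)
    if all(p in beliefs for p in prem) and concl not in beliefs:
        return beliefs | {concl}
    return beliefs

delta = {"x": "Rob", "y": "Roberta"}
memory = {"suits(Rob,Roberta)", "suits(Roberta,Rob)"}
print(fire(memory, rule, delta))   # memory extended with 'match(Rob,Roberta)'

Since firing is strictly serial, one cycle of the agent’s reasoning amounts to one such call on one chosen matching instance.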
In order to reason about Doris’ belief states, let us introduce a further language, containing sentences such as B male(Rob)∧B suits(Roberta,Rob), read
as ‘Doris believes that Rob is male and that Roberta is suited to Rob.’ The
question is, what kind of logical apparatus can be applied to such sentences? To
be sure, the above sentence should be true precisely when Doris has those two
beliefs, i.e. when both male(Rob) and suits(Roberta,Rob) are held in Doris’
working memory; but what logical principles hold of the beliefs themselves?
Suppose Doris believes the rule male(x) → –female(x), which she matches to
produce the instance male(Rob) → –female(Rob). Does this mean that, if Doris
believes male(Rob), then she believes –female(Rob)? It does not, for we can
easily imagine a case in which the former is held in working memory but the latter
is not. Perhaps Doris has discovered many instances of the rule and is working
through them one at a time, checking which may be fired, and has not yet come
to check whether she believes male(Rob). Thus, B(male(Rob) → –female(Rob))
is not equivalent to B male(Rob) → B–female(Rob). We can say that, given
enough time, and provided she has enough room in her memory to add new beliefs, B–female(Rob) will eventually become true. We can also say that Doris
could fire that rule instance and add –female(Rob) to her beliefs at the very
next cycle; nothing prevents Doris’ beliefs evolving this way.
As this brief discussion shows, the interesting features of Doris’ belief states
go beyond what she believes now to include what she will and what she could
could believe in the future, given her current belief state. As mentioned above,
temporal and alethic matters are important in reasoning about belief states, yet
this platitude is ignored by standard epistemic logics. The approach here is to
combine temporal and alethic modalities by using a discrete branching model of
time. Figure 1 shows part of such a model.

[Figure 1: Part of a branching time model, showing a point x with transitions to two possible successor points y and z.]

Time is said to be branching in the
model in the sense that the facts which hold at point x do not determine whether
point y or z will come next; although only one may actually follow x, both are
possible successors. In the same way, given what we have said about rule-based
agents such as Doris, the agent’s current belief state does not determine future
belief states, but does make certain states possible successors to the current
one. Of course, in reality any belief is a possible belief, in the sense that an
agent could perceive or hallucinate something completely disconnected from the
focus of its thoughts and come to entertain the corresponding belief. The use
of possibility here has to be limited to the possibilities afforded by the agent’s
belief states, or those of other agents, considered as a closed system.
Such structures are relational and can be described using a modal logic.
Modal logic has proved popular in describing relational structures because of
its conceptual simplicity and its robust decidability. Below, a modality ‘◊’ is
introduced, with ‘◊Bα’ meaning that the agent can move into a belief state
in its next cycle which includes the belief that α. Times are not referred to
explicitly; however, the ‘◊’ modality may be chained, so that ‘◊◊◊Bα’ means
that the agent can in three cycles move into a belief state containing the belief
that α. The key to modelling evolving belief states is to capture the notion of
a transition from one state to another at the heart of the logic. The semantics
of this logic is based on the kind of relational structure diagrammed in figure 1,
where the points represent internal states of the agent and the lines, read left
to right, are transitions between these states. When an agent in state s can fire
a rule to derive a new belief, this is modelled by a transition from s to a new
state s′ , just like s except for the addition of the new belief. Since firing a rule
produces just the one new belief, states related by a transition may differ only by
a single belief. There is a close correlation between transitions and rule firings,
allowing for a very fine-grained model of belief state change. The remainder of
this section is fairly technical. It presents a logic for reasoning about the kind of
rule-based agents introduced above. To conclude, it is indicated how the logic
can be extended to incorporate multiple agents and agents which use a more
expressive language than that of simple condition-action rules.
6.1 Language
We fix the set of predicate symbols P and the set of constants D. We denote
the set of all possible substitutions δ from the variables occurring in X into D by Σ, where X is any set of
rules. Given a rule ρ = (λ1 , . . . , λn → λ), where each λi is a literal, the instance
of ρ under δ, written ρδ , is λδ1 , . . . , λδn → λδ . Note that given finite sets X and
D, Σ is also finite; then the set of all possible rule instances is finite as well.
The agent’s internal language L(P, D) over P and D contains only rules and
ground literals. Since we assume both P and D are fixed throughout, we may
drop these arguments and refer to the agent’s internal language simply as L.
The following notation is used:
• literals are denoted by λ, λ1 , λ2 , . . .
• ground literals are denoted by λδ , λδ1 , λδ2 , . . ., where δ ∈ Σ
• rules of the form λ1 , . . . , λn → λ are denoted by ρ, ρ1 , ρ2 , . . .
• instances of a rule ρ are denoted ρδ , where δ ∈ Σ
Only ground literals and rules, the λδ s and the ρs, are considered well-formed
formulas of L. Arbitrary formulas of L are denoted α, α1 , . . . . The modal language ML(P, D), which is used to reason about the agent’s beliefs, is built from
formulas of L(P, D). ML(P, D) contains the usual propositional connectives
¬, ∧, ∨, →, the ‘◊’ modality and a belief operator B. Given a literal λ, a rule ρ
and a substitution function δ, ‘Bλδ ’ and ‘Bρ’ are primitive wffs of ML(P, D).
There are no other primitive wffs. If φ1 and φ2 are both ML(P, D) wffs, the
complex wffs of ML(P, D) are then given by
¬φ1 | φ1 ∧ φ2 | φ1 ∨ φ2 | φ1 → φ2 | ◊φ1
The dual modality ‘□’ is introduced by definition: □φ =df ¬◊¬φ. Since P and D
will be fixed throughout, these arguments may informally be dropped, and the
agent’s language referred to as L and the modal language as ML. Note that
the primitive formulas of ML are all of the form Bα, where α is an L-formula,
hence the problem of substitution within belief contexts does not arise in logics
based on ML.
6.2 Models for the single-agent logic
A model M is a structure
⟨S, T, V⟩
where S is a set of states; T ⊆ S × S is the transition (accessibility) relation
on states; and V : S −→ ℘(L) is the labelling function assigning to each state
the set of L-formulas which the agent believes in that state. The definition of
a formula φ of ML being true, or satisfied, by a state s in a model M (written
M, s ⊩ φ) is as follows:
M, s ⊩ Bα iff α ∈ V (s)
M, s ⊩ ¬φ iff M, s ⊮ φ
M, s ⊩ φ1 ∧ φ2 iff M, s ⊩ φ1 and M, s ⊩ φ2
M, s ⊩ φ1 ∨ φ2 iff M, s ⊩ φ1 or M, s ⊩ φ2
M, s ⊩ φ1 → φ2 iff M, s ⊮ φ1 or M, s ⊩ φ2
M, s ⊩ ◊φ iff there exists a state s′ ∈ S such that T ss′ and M, s′ ⊩ φ
By substituting the definition of ‘□’ into the clause for ‘◊’, we get M, s ⊩ □φ
iff, for all states s′ ∈ S such that T ss′ , M, s′ ⊩ φ.
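A minimal sketch of this satisfaction relation, assuming a tuple encoding of ML formulas (the encoding and the example model are illustrative choices of mine, not part of the definition):

# Satisfaction for ML over a model <S, T, V>.  Formulas are encoded as nested
# tuples, e.g. ("B", "male(Rob)") or ("diamond", ("B", "match(Rob,Roberta)")).
S = {"s0", "s1"}
T = {("s0", "s1"), ("s1", "s1")}                 # transition relation
V = {"s0": {"suits(Rob,Roberta)"},
     "s1": {"suits(Rob,Roberta)", "match(Rob,Roberta)"}}

def sat(s, phi):
    op = phi[0]
    if op == "B":                                # B alpha holds iff alpha is in V(s)
        return phi[1] in V[s]
    if op == "not":
        return not sat(s, phi[1])
    if op == "and":
        return sat(s, phi[1]) and sat(s, phi[2])
    if op == "or":
        return sat(s, phi[1]) or sat(s, phi[2])
    if op == "implies":
        return (not sat(s, phi[1])) or sat(s, phi[2])
    if op == "diamond":                          # some T-successor satisfies phi
        return any(sat(u, phi[1]) for (t, u) in T if t == s)
    if op == "box":                              # every T-successor satisfies phi
        return all(sat(u, phi[1]) for (t, u) in T if t == s)
    raise ValueError(f"unknown operator {op}")

print(sat("s0", ("B", "suits(Rob,Roberta)")))               # True
print(sat("s0", ("diamond", ("B", "match(Rob,Roberta)"))))  # True: s1 is a successor
print(sat("s0", ("box", ("B", "match(Rob,Roberta)"))))      # True: s1 is the only successor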
Definition 1 (Global satisfiability and validity) An ML formula φ is globally satisfied in a model M = ⟨S, T, V⟩, notation M ⊩ φ, when M, s ⊩ φ for
each state s ∈ S. Given a class of models C, φ is said to be valid in C or C-valid,
written C ⊩ φ, when M ⊩ φ for any M ∈ C. Validity (simpliciter) is validity
in any class. A set of ML formulas Γ is said to be satisfied at a state s ∈ S,
written M, s ⊩ Γ, when every element of Γ is satisfied at s. Γ is then globally
satisfied, C-valid or valid in a similar way.
This formulation applies to relational structures in general, not just to models of rule-based agents. To get the desired class of model, structures have to
be restricted in the following way. To begin with, the agent’s program—the
set of rules it believes—is finite by definition and does not change; rules are
neither learnt nor forgotten. This is standard practice in rule-based AI systems.
This means that, if the set R contains all rules believed at a state s, then the
rules believed at all states reachable from s (i.e. in some unspecified number of
transitions) will also be precisely those in R (and similarly for all states from
which s is reachable). To say that a rule has an instance (which may be fired)
is to say that the rule is believed and that there is a substitution such that the
premises of the rule are believed (under that substitution) but the consequent
is not (agents do not try to derive what they already believe). Such rules are
said to match. When a rule ρ matches under a substitution δ, we say that ρδ
is a matching instance of ρ under δ.
Definition 2 (Matching rule) Let ρ be a rule of the form λ1 , . . . , λn → λ
and δ a substitution function for ρ. ρ is then said to be s-δ-matching, for some
state s ∈ S, iff ρ ∈ V (s), each λδ1 , . . . , λδn ∈ V (s) but λδ ∉ V (s).
As explained above, transitions from one state to another correspond to the
agent firing a rule instance and adding a new belief to its working memory.
When a rule instance may be fired in a state s, and a transition to a further
state s′ is possible, s′ must then be just like s except for the addition of that
new belief. In such cases, we say that s′ extends s by that new belief.
Definition 3 (Extension of a state) Let δ be a substitution function for a
rule ρ and λδ be the consequent of ρ under δ. Then a state s′ is said to extend
a state s by λδ when V (s′ ) = V (s) ∪ {λδ }.
One exception is made to the stipulation that transitions correspond to a
rule instance being fired, purely for technical reasons. If there are no matching
rules at a state (and so no rule instances to fire), that state is a terminating
state and has a transition to itself (or to another identical state, which amounts
to much the same in modal logic). This ensures that every state has an outgoing
transition; in other words, T is a serial relation. As a consequence, the question
‘what will the agent be doing after n cycles’ may always be answered, even if the
agent runs out of rules to fire in fewer than n cycles.
Definition 4 (Terminating state) A state s is said to be a terminating state
in a model M iff, for all substitution functions δ ∈ Σ, no rule ρ is s-δ-matching.
Transitions may relate terminating states. If, on the other hand, there is a
matching rule at a state s, then a transition should only be possible to a state
s′ when s′ extends s by an appropriate belief (i.e. the consequent of a matching
rule instance at s). We capture such transition systems in the class S (for single
agent models).
Definition 5 The class S contains precisely those models M which satisfy the
following:
S1 for all states s ∈ S, if a rule λ1 , . . . , λn → λ is s-δ-matching, then there is
a state s′ ∈ S such that T ss′ and s′ extends s by λδ .
S2 for any terminating state s ∈ S, there exists a state s′ ∈ S such that V (s′ ) =
V (s) and T ss′ .
S3 for all states s, s′ ∈ S, T ss′ only if either (i) there is an s-δ-matching rule
λ1 , . . . , λn → λ and s′ extends s by λδ ; or (ii) s is a terminating state and
V (s) = V (s′ ).
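A sketch of how the successors required by S1–S3 might be generated from a state’s label (the representation of rules and the helper functions are assumptions of mine; the program is treated as fixed and believed at every state, as the text stipulates):

# Generate the labels of the T-successors of a state, following S1-S3:
# one successor per matching rule instance, and a self-loop for
# terminating states.  Data structures and names are illustrative only.
from itertools import product

rules = [(["suits(x,y)", "suits(y,x)"], "match(x,y)")]   # the fixed program R
constants = ["Rob", "Roberta"]                            # the set D

def apply_sub(atom, delta):
    name, args = atom[:-1].split("(")
    return f"{name}({','.join(delta.get(a, a) for a in args.split(','))})"

def rule_variables(rule):
    premises, conclusion = rule
    vs = set()
    for atom in premises + [conclusion]:
        vs.update(atom[:-1].split("(")[1].split(","))
    return sorted(vs)

def matching_consequents(beliefs):
    # consequents of all s-delta-matching instances: premises believed,
    # conclusion not yet believed (Definition 2)
    for rule in rules:
        vs = rule_variables(rule)
        for combo in product(constants, repeat=len(vs)):
            delta = dict(zip(vs, combo))
            prem = [apply_sub(p, delta) for p in rule[0]]
            concl = apply_sub(rule[1], delta)
            if all(p in beliefs for p in prem) and concl not in beliefs:
                yield concl

def successors(beliefs):
    # S1/S3: one successor per matching instance; S2: terminating states loop
    new = [beliefs | {c} for c in matching_consequents(beliefs)]
    return new if new else [set(beliefs)]

s = {"suits(Rob,Roberta)", "suits(Roberta,Rob)"}
for label in successors(s):
    print(label)   # two successors, adding match(Rob,Roberta) and match(Roberta,Rob)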
There may, of course, be many rules matching at a given state and many matching instances of each (i.e. instances in which the consequent of the rule, under
that substitution, is not already believed). For each such instance of each matching rule at a state s, there will be a state s′ with a transition to it from s. Each
transition may be thought of as corresponding to the agent’s nondeterministic
choice to fire one of these rule instances (i.e. to add the consequent of that rule
instance to its set of beliefs). ‘◊φ’ may then be read as ‘after some such choice,
φ will hold.’ We can think of the agent’s reasoning as a cycle:
1. match rules to produce rule instances;
2. choose a rule instance;
3. add the consequent of that instance to the set of beliefs; repeat.
By chaining diamonds (or boxes), e.g. ‘◊◊◊’, we can express what properties
can (and what will) hold after so many such cycles. We can abbreviate sequences
of n diamonds (or n boxes) as ◊n and □n respectively. ‘□n φ’, for example,
may be read as ‘φ is guaranteed to hold after n cycles.’ Note that the choices
made in each of these cycles are nondeterministic; that the agent’s set of beliefs
grows monotonically state by state and that the agent never revises its beliefs,
even if they are internally inconsistent. In [4], an analysis of an agent which
makes a deterministic choice of which rule instance to fire at each cycle is given.
In [3], it is shown how similar agents can revise inconsistent belief states in
a computationally efficient way. In the deterministic case, models are linear
rather than branching, i.e. each state has a transition to a unique successor
state. Given a program (a finite set of rules) R for the agent, models in the
class SR are those models in which the agent believes all the rules in R and no
further rules. Each program R thus defines a unique class of model, SR , such
that models in this class represent agents which reason based on that program.
Definition 6 (The class SR ) Let R be a set of rules. A model M ∈ SR iff
M ∈ S and, for every state s in M , M, s ⊩ Bρ iff ρ ∈ R. Given a set of ML
formulas Γ and an ML formula φ, we write Γ ⊩R φ iff every model of Γ which
is in the class SR is also a model of φ.
6.3 Some properties of the class SR
This section surveys a few of the more interesting properties of the class SR
(for some fixed program R); a more detailed discussion is given in [10]. A few
definitions need to be given first.
Definition 7 (Label equivalence) Given models M = ⟨S, T, V⟩ and M ′ =
⟨S ′ , T ′ , V ′ ⟩, states s ∈ S and s′ ∈ S ′ are said to be label equivalents, notation
s ≡L s′ , iff V (s) = V ′ (s′ ).
Definition 8 (Modal equivalence) Given models M = ⟨S, T, V⟩ and M ′ =
⟨S ′ , T ′ , V ′ ⟩, the theory of a state s ∈ S, written th(s), is the set of all formulas
satisfied at s, i.e. {φ | M, s ⊩ φ}. States s ∈ S and s′ ∈ S ′ are modally
equivalent, written s ↭ s′ , iff they have the same theory. Similarly for models,
the theory of M is the set {φ | M ⊩ φ} and M is modally equivalent to M ′ ,
M ↭ M ′ , iff their theories are identical.
Definition 9 (Bisimulation) Given two models M = ⟨S, T, V⟩ and M ′ =
⟨S ′ , T ′ , V ′ ⟩, a nonempty binary relation Z ⊆ S × S ′ is a bisimulation between
M and M ′ , written Z : M ⋍ M ′ , when
Label If Zss′ then s and s′ are label identical (i.e. s ≡L s′ )
Forth If Zss′ and T su, then there is a state u′ ∈ S ′ such that T ′ s′ u′ and Zuu′
Back If Zss′ and T ′ s′ u′ , then there is a state u ∈ S such that T su and Zuu′
When these conditions hold, we say s and s′ are bisimilar and write M, s ⋍
M ′ , s′ (or simply s ⋍ s′ if the context makes it clear which models s and s′
belong to). When there exists such a Z, we write M ⋍ M ′ .
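For finite models, the largest bisimulation can be computed as a greatest fixed point: start from all label-identical pairs and repeatedly discard pairs that violate the forth or back conditions. A minimal sketch (the model encoding is my own):

# Naive computation of the largest bisimulation between two finite models.
# T1 and T2 map each state to the set of its successors; V1 and V2 give labels.
def bisimulation(S1, T1, V1, S2, T2, V2):
    Z = {(s, t) for s in S1 for t in S2 if V1[s] == V2[t]}   # Label condition
    changed = True
    while changed:
        changed = False
        for (s, t) in list(Z):
            forth = all(any((u, v) in Z for v in T2.get(t, ()))
                        for u in T1.get(s, ()))
            back = all(any((u, v) in Z for u in T1.get(s, ()))
                       for v in T2.get(t, ()))
            if not (forth and back):
                Z.discard((s, t))
                changed = True
    return Z

# two one-state terminating models with the same label are bisimilar
S1, T1, V1 = {"a"}, {"a": {"a"}}, {"a": frozenset({"p"})}
S2, T2, V2 = {"b"}, {"b": {"b"}}, {"b": frozenset({"p"})}
print(bisimulation(S1, T1, V1, S2, T2, V2))   # {('a', 'b')}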
Proposition 1 Given two models M = ⟨S, T, V⟩ and M ′ = ⟨S ′ , T ′ , V ′ ⟩, for all
s ∈ S and s′ ∈ S ′ , s ⋍ s′ implies s ↭ s′ .
Proof: The proof is standard; see, for example, [5, p.67]. □
A tree model has the form diagrammed in figure 1, such that each state s
has a unique parent (the state from which there is a transition to s) except the
root.
Proposition 2 Every model M has a bisimilar tree model.
Proof: A tree model M ′ can be obtained by unravelling M ; the proof that M
and M ′ are bisimilar is standard. □
What is the significance of this result? It means that any satisfiable ML
formula can be satisfied in a finite model. Suppose M satisfies φ; then there
is a tree model M ′ bisimilar to M , so M ′ must satisfy φ too. Now, consider a
syntax tree for φ with all but the modalities removed from the nodes (so that
the same number of modalities appears on the tree as in φ). Then the greatest
number of modalities which can be traced from a leaf to the root is called the
modal depth of φ. For example, the modal depth of □(◊p ∨ ◊□q) is 3. If a
formula is satisfiable, it is satisfiable in a tree model whose longest branch does
not exceed the modal depth of the formula. To obtain this model, simply take
the tree model for the formula φ and chop off the branches at (the modal depth
of φ) states from the root. The upshot is that modal logics are decidable in an
extremely robust way. It is easy to find models for satisfiable formulas using the
automatic technique of model checking. Of course, chopping a model down in
this way may disqualify it from membership of SR ; but membership can be re-instated
by taking a chopped down model and continuing each branch until no matching
rules are left; then looping the last state on the branch back to itself. Since R
and D are finite, this is guaranteed to happen in finitely many steps.
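A sketch of the modal depth calculation just described, over the same kind of tuple encoding of ML formulas used above (the encoding is an illustrative assumption; the example adapts □(◊p ∨ ◊□q) by writing its atoms as B-formulas, since ML has no bare propositional atoms):

def modal_depth(phi):
    # greatest number of modalities on any branch of the syntax tree
    op = phi[0]
    if op == "B":
        return 0
    if op == "not":
        return modal_depth(phi[1])
    if op in ("and", "or", "implies"):
        return max(modal_depth(phi[1]), modal_depth(phi[2]))
    if op in ("diamond", "box"):
        return 1 + modal_depth(phi[1])
    raise ValueError(f"unknown operator {op}")

phi = ("box", ("or", ("diamond", ("B", "p")), ("diamond", ("box", ("B", "q")))))
print(modal_depth(phi))   # 3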
It is well-known that the converse of proposition 2 does not hold in general.
Given a model M , we can construct a modally equivalent model M ′ containing
an infinite branch for which there can be no bisimulation Z : M ⋍ M ′ (if we
suppose there is, we will eventually come to a point on the infinite branch in
M ′ for which the corresponding point in M has no successor; hence they cannot
be bisimilar states). However, we do have a restricted result in the converse
direction:
Proposition 3 (Hennessy–Milner Theorem) A model is image finite if the
set ⋃s∈S {s′ | T ss′ } is finite. Then, given two image finite models M = ⟨S, T, V⟩ and
M ′ = ⟨S ′ , T ′ , V ′ ⟩, for all s ∈ S and s′ ∈ S ′ , s ↭ s′ implies s ⋍ s′ .
Proof: See, for example, [5, p.69]. □
Corollary 1 Given a finite program R, models M, M ′ ∈ SR and any state s in
M and s′ in M ′ : s ⋍ s′ iff s ↭ s′ .
Proof: Clearly any such model in SR is image finite, for a finite number of
matching rules will only induce a finite number of transitions from a state. Then
the ‘if’ direction is given by the Hennessy-Milner theorem, the ‘only if’ direction
by proposition 1. □
This is a useful result. We can now say that two such models are indistinguishable iff there is a bisimulation between their states.
Definition 10 (Run and Path) A run θ = s0 · · · sn · · · is an infinite sequence
of states with T si si+1 for each i ∈ N. A path is a finite sequence of states
s0 · · · sn , again with T si si+1 for each i < n; then length(θ) = n. We write θ[i]
to denote the state si on θ (with i ≤ length(θ) if θ is a path). We say θ
is a run or path from s when s = θ[0]. θ[0, i] denotes the initial segment of θ
consisting of the states s0 · · · si (with i ≤ length(θ) if θ is a path). We say that two
runs θ and θ′ are label-identical iff θ[i] ≡L θ′ [i] for each i ∈ N and similarly for
paths of equal length (paths of unequal length are trivially non-label-identical).
The following results show that the way a state s is labelled determines which
formulas are true at states reachable from s. This allows us to establish a useful
connection between label identical states, bisimulation and modal equivalence.
The following abbreviation is helpful.
cn(λ1 , . . . , λn → λ) =df λ   (1)
Lemma 1 For tree models M, M ′ ∈ SR and states s in M , s′ in M ′ : if s ≡L s′
then, for any path θ from s in M , there is a path θ′ from s′ in M ′ which is label-identical to θ.
Proof: By induction on positions in θ. Assume θ[0] ≡L θ′ [0] and that
the length of θ is n. The base case is covered by assumption so assume further
that, for all j < i ≤ n, θ′ [j] exists and θ[j] ≡L θ′ [j]. We show this holds when
j = i. Set w = θ[i − 1] and u = θ[i]. Since T wu, there must be a w-δ-matching
rule ρ, for some δ ∈ Σ, such that u extends w by cn(ρ)δ . By hypothesis, w′ = θ′ [i − 1] exists and
w ≡L w′ . Then ρ is also a w′ -δ-matching rule; hence, by S1, there is a state u′ with T w′ u′ which
extends w′ by cn(ρ)δ . Set θ′ [i] = u′ ; then u ≡L u′ . □
This lemma leads to a surprising result, which does not hold of modal systems
in general.
Theorem 1 For any models M, M ′ ∈ SR and all states s in M and s′ in M ′ :
s ≡L s′ iff s ↭ s′ iff s ⋍ s′ .
Proof: Without loss of generality, assume M and M ′ are both tree models
and that s ≡L s′ , but not s ↭ s′ , for s in M and s′ in M ′ . Then there is some
φ such that M, s ⊩ ◊φ but M ′ , s′ ⊮ ◊φ, hence a path θ from s in M such that,
for some i, there is no path θ′ from s′ in M ′ such that θ′ [i] ≡L θ[i]. But this
contradicts lemma 1. □
Corollary 2 Let M ∈ SR be a model in which s ≡L s′ , X = {u ∈ S | T us′ }
and Y be the set of all states reachable from s′ but not from s. Let M ′ be the
model in which S ′ = S − (Y ∪ {s′ }), T ′ is just like T (restricted to S ′ ) except that
T ′ us for each u ∈ X, and V ′ is V restricted to S ′ . Then there is a bisimulation
Z : M ⋍ M ′.
Proof: Given s ≡L s′ , we have s ↭ s′ (theorem 1). So assume there is no
such bisimulation Z. Since M ′ is identical to M except for the descendants of
each u ∈ X, there must be a formula φ and a state u ∈ X such that M, u ⊩ φ but
M ′ , u ⊮ φ. But we have M, u ⊩ ◊φ iff M, s′ ⊩ φ iff M ′ , u ⊩ ◊φ, for each u ∈ X.
Hence our assumption was wrong. □
This last result says that if we take any model and squash together identically labelled states, we get a model which satisfies a formula iff the original model did. As can be seen, models of rule-based agents (those in the class SR, for some program R) have many desirable properties, several of which are not possessed by models of modal logics in general. The computational properties of models of rule-based agents make it easy to verify properties of the agents they model; this can only be a sign of the success of the logic. In the following section, an axiomatization of the logic is given.
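Before turning to the axiomatization, the 'squashing' operation just described can be sketched in a few lines, reusing the dictionary representation from the bisimulation sketch above (the function name is mine): identically labelled states are merged into a single representative and transitions are redirected accordingly.

```python
def quotient_by_labels(T, V):
    """Merge identically labelled states, redirecting transitions.
    T: state -> set of successor states; V: state -> frozenset of beliefs.
    Returns a successor map and labelling with one representative per label."""
    rep = {}                                       # label -> representative
    to_rep = {s: rep.setdefault(V[s], s) for s in V}
    Tq = {r: set() for r in rep.values()}
    Vq = {r: V[r] for r in rep.values()}
    for s, successors in T.items():
        for u in successors:
            Tq[to_rep[s]].add(to_rep[u])
    return Tq, Vq
```

For models in SR, theorem 1 guarantees that each pair of merged states was bisimilar, so the quotient satisfies the same formulas as the original model, as the corollary indicates.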
6.4 Complete and sound axiom system
Fix a program R and a finite set of constants D which may be used in substitutions. To axiomatize the transition systems in which every transition corresponds to firing exactly one rule instance in R, the following abbreviations are
helpful.
match_δ(λ1, . . . , λn → λ) =df Bλ1^δ ∧ · · · ∧ Bλn^δ ∧ ¬Bλ^δ    (2)

match ρ =df ⋁_{δ∈Σ} match_δ ρ    (3)
The axiom system shown in figure 2 is called ΛR. Explanations of the more complicated axiom schemata A6 and A7 are given below.
A6 says that, when a belief is added, it must have been added by some matching rule instance in R. The abbreviation

⋁_{λ1,...,λn→λ ∈ R, λ^δ=α} (Bλ1^δ ∧ . . . ∧ Bλn^δ)

abbreviates the disjunction of all formulas Bλ1^δ ∧ . . . ∧ Bλn^δ for which there is a rule λ1, . . . , λn → λ in the agent's program R whose consequent under δ is α. Intuitively, the A7 schema says that, if all matching rule instances in the current state are ρ1^δ1, . . . , ρn^δm, then each of the successor states should contain the consequent of one of those instances.
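As a hypothetical illustration (the rule, constant and substitution are invented for the example): suppose the only matching rule instance at the current state is ρ = p(x) → q(x) under δ = {x ↦ a}. The corresponding instance of A7 is then match_δ ρ, conjoined with the negations of all other match formulas, implying □B q(a): every successor state must contain the new belief q(a).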
Cl   all classical propositional tautologies
K    □(φ → ψ) → (□φ → □ψ)
A1   Bρ,   where ρ ∈ R
A2   ¬Bρ,   where ρ ∉ R
A3   Bα → □Bα
A4   B(λ1, . . . , λn → λ) ∧ Bλ1^δ ∧ · · · ∧ Bλn^δ → ⋄Bλ^δ,   for each δ ∈ Σ
A5   ⋄(Bα ∧ Bβ) → Bα ∨ Bβ
A6   ⋄Bα → (Bα ∨ ⋁_{ρ∈R, δ∈Σ: cn(ρ)^δ=α} match_δ ρ)
A7   match_δ1 ρ1 ∧ · · · ∧ match_δm ρn ∧ ⋀_{δ≠δi, ρ≠ρj} ¬match_δ ρ → □[B cn(ρ1)^δ1 ∨ · · · ∨ B cn(ρn)^δm],   where 1 ≤ i ≤ m, 1 ≤ j ≤ n
A8   ⋀_{ρ∈R} ¬match ρ → ⋄⊤
MP   from φ and φ → ψ, infer ψ
N    from φ, infer □φ

Figure 2: Axiom schemes for ΛR
A derivation in ΛR is defined in a standard way, relative to R: φ is derivable
from a set of formulas Γ (written Γ ⊢R φ) iff there is a sequence of formulas
φ1 , . . . , φn where φn = φ and each φi is either an instance of an axiom schema,
or a member of Γ, or is obtained from the preceding formulas by MP or N.
Suppose an agent's program R contains the rules ρ1, . . . , ρn. This agent is guaranteed to reach a state in which it believes α in k steps, starting from a state where it believes λ1^δ1, . . . , λm^δm, iff the following statement is derivable in ΛR:

Bρ1 ∧ . . . ∧ Bρn ∧ Bλ1^δ1 ∧ . . . ∧ Bλm^δm → □^k Bα

(Again, □^k α is an abbreviation for □□ · · · □α, k times). Below, a proof is given that ΛR is the logic of the class SR. First, a few lemmas need to be prepared. In each case a sketch of the proof is given; the full proofs, together with more discussion, can be found in [10].
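As a concrete (and purely hypothetical) instance of this criterion: let R consist of just the rules ρ1 = p(x) → q(x) and ρ2 = q(x) → r(x), let D = {a}, and suppose the agent initially believes p(a). At the first step only ρ1 matches (under {x ↦ a}), so every successor state adds q(a); at the second step only ρ2 matches, so every state two steps away contains r(a). The agent is thus guaranteed to believe r(a) in two steps, corresponding to the derivability in ΛR of

Bρ1 ∧ Bρ2 ∧ Bp(a) → □^2 Br(a).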
Lemma 2 (Lindenbaum lemma) Any set of formulas Γ can be expanded to
a ΛR -maximal consistent set Γ+ .
Proof: The proof is standard. □
A canonical model M^R = ⟨S, T, V⟩ is built in the usual way. States in S are ΛR-maximal consistent sets; T su iff {φ | □φ ∈ s} ⊆ u (or equivalently, iff {⋄φ | φ ∈ u} ⊆ s). Finally, V(s) = {α ∈ L | Bα ∈ s}, for each s ∈ S.
Lemma 3 (Existence lemma) For any state s in M^R, if there is a formula ⋄φ ∈ s then there is a state u in M^R such that T su and φ ∈ u.
Lemma 4 (Truth lemma) For any φ and any state s ∈ S: M^R, s ⊨ φ iff φ ∈ s.
Proof: The proofs of lemmas 3 and 4 are standard. □
Lemma 5 Let M^R be a canonical model and let α ∈ L and s, u ∈ S. Then:
(i) if T su and α ∈ V(u) but α ∉ V(s), then V(u) = V(s) ∪ {α};
(ii) α in part (i) must be a ground literal.
Proof: Part (i) follows from the definition of M, s ⊨ Bα as α ∈ V(s), together with the truth lemma and the fact that states are closed under axioms A3 and A5. The former axiom ensures that V(s) is a subset of V(u); the latter ensures that α is the only new belief. For part (ii), if we suppose α were some rule, we would have α ∈ R and so, by A1, α ∈ V(s), contrary to hypothesis. □
Lemma 6 M^R satisfies condition S1.
Proof: Assume there is a matching rule ρ in s under some substitution δ. Given the truth lemma, it is easy to see that each of its (ground) antecedents under δ is a member of V(s), whereas its consequent is not. A4 and the existence lemma guarantee an accessible state u which, given lemma 5, is the extension of s by the consequent of ρ under δ. □
Lemma 7 M^R satisfies condition S3.
Proof: Suppose T su for states s, u in M^R. By definition, {φ | □φ ∈ s} ⊆ u. By axiom A7, there must be one ground literal believed in u but not in s, namely the consequent of either ρ1^δ1 or . . . or ρn^δm. Then by the argument used in lemma 6, it follows that u is the extension of s by this new belief. □
Theorem 2 (Completeness) ΛR is strongly complete with respect to the class SR: given a program R, a set of ML-formulas Γ and an ML-formula φ, Γ ⊨R φ only if Γ ⊢R φ.
Proof: Expand Γ to a ΛR-maximal consistent set Γ+ from which we build a canonical model M^R. From the truth lemma, it follows that M^R, Γ+ ⊨ Γ. It remains only to show that M^R is in the class SR. Given lemmas 6 and 7, we only have to show that M^R satisfies condition S2. So suppose s is a terminating state. By axiom A8, there is an accessible state s′. By axiom A6, α ∈ V(s′) implies α ∈ V(s) for any literal α (this holds because there are no matching rules at s). It then follows from axioms A1–A3 that V(s′) = V(s), hence S2 is satisfied. □
7 Extending the logic
The logic can easily be extended to accommodate a system of agents in communication with one another. To accommodate multiple agents, we replace
the labelling function V with a family of such functions. To model a system
consisting of n agents comprising the set A, a model is an n + 3-tuple
⟨S, A, T, {Vi}i∈A⟩
where each Vi is the labelling function for agent i. The language of these models
is extended, first to include a belief operator Bi for each agent i ∈ A and secondly
to include formulas for communication between agents, such as 'ask(i, j)λ^δ' and 'tell(i, j)λ^δ'. These are read as 'agent i has asked agent j whether λ^δ' and 'agent i has told agent j that λ^δ'; such formulas are called asks and tells, respectively. As above, λ^δ is some ground literal (these are the only types of belief which rule-based agents communicate). The primitive wffs of the modal language ML(P, D) are then

Bi λ^δ | Bi tell(i, j)λ^δ | Bi ask(i, j)λ^δ | Bi ρ
and the complex wffs are as above. The definition of '⊨' changes its first clause to:

M, s ⊨ Bi α iff α ∈ Vi(s), for i ∈ A, α ∈ L
but the remaining clauses stay the same. The definition of a matching rule is
also made relative to each agent; instead of a rule being s-δ-matching, it will
now be Vi (s)-δ-matching, for some agent i. The class M, for multi-agent models,
is defined in much the same way as the class S modulo the amendments just
noted. Although agents share a common language, each has its own unique
program. There are restrictions on which rules may appear in an agent's program, which we summarise here (a small sketch checking these conditions is given after the list). A rule ρ may appear in Ri, for any agent i ∈ A, only if:
1. for any ask or tell α in the antecedent of ρ, α’s second argument is i; or
2. for any ask or tell α in the consequent of ρ, α’s first argument is i
A program for an agent is then a finite set of rules which satisfy these conditions.
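As a small sketch of these conditions (the tuple representation of ask and tell literals is invented here: ('ask' or 'tell', sender, receiver, content), so the text's 'second argument' of ask(i, j) or tell(i, j) is the receiver field; the two conditions are read disjunctively, as stated above):

```python
def is_comm(literal):
    """ask/tell literals, represented here as ('ask'|'tell', sender, receiver, content)."""
    return isinstance(literal, tuple) and len(literal) == 4 and literal[0] in ('ask', 'tell')

def rule_allowed(rule, i):
    """May rule = (antecedents, consequent) appear in agent i's program R_i?
    Condition 1: every ask/tell in the antecedent has i as receiver.
    Condition 2: every ask/tell in the consequent has i as sender."""
    antecedents, consequent = rule
    cond1 = all(lit[2] == i for lit in antecedents if is_comm(lit))
    cond2 = all(lit[1] == i for lit in (consequent,) if is_comm(lit))
    return cond1 or cond2

def valid_program(rules, i):
    """A program for agent i: a finite set of rules all satisfying the conditions."""
    return all(rule_allowed(r, i) for r in rules)
```

So, for instance, a rule whose antecedent contains tell(j, i)λ and whose consequent is an ordinary literal may appear in Ri, since condition 1 is satisfied.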
Given a program Ri for each agent i ∈ A, we define the program set
R = {R1 , . . . , Rn }
for A. Just as in the single agent case above, the class MR contains precisely
those models in which agents believe all the rules in their program (in R) and
no further rules. To axiomatize the resulting logic, the following axiom schemes
need to be added to those in figure 2:
A6-tell   ⋄Bi tell(j, i)λ → Bi tell(j, i)λ ∨ ⋁_{ρ∈Rj, cn(ρ)^δ=tell(j,i)λ} match_δ^j ρ

A6-ask   ⋄Bi ask(j, i)λ → Bi ask(j, i)λ ∨ ⋁_{ρ∈Rj, cn(ρ)^δ=ask(j,i)λ} match_δ^j ρ

A9   Bi tell(i, j)λ ↔ Bj tell(i, j)λ

A10   Bi ask(i, j)λ ↔ Bj ask(i, j)λ
In addition, A7 needs to be replaced with the following similar-looking axiom (the difference is just that the substitution δ is no longer uniform; there may be a distinct substitution for each matching rule of each agent).
A7′   match_δ1^i1 ρ1 ∧ . . . ∧ match_δm^in ρn ∧ ⋀_{(δ,ρ)∉{(δ1,ρ1),...,(δm,ρn)}} ¬match_δ ρ → □[Bi1 cn(ρ1)^δ1 ∨ . . . ∨ Bin cn(ρn)^δm]
Call the resulting logic ΛR . The following result then holds (the proof is much
the same as that of theorem 2 above and can be found in [10]).
Theorem 3 (Completeness) ΛR is strongly complete with respect to the class MR: given a set of programs R = {R1, . . . , Rn}, for n agents, a set of ML-formulas Γ and an ML-sentence φ, Γ ⊨R φ only if Γ ⊢R φ.
Extending the language beyond that of the condition-action rules discussed
above requires a slightly different approach. One such way to extend the language is to introduce a symbol ‘|’ for disjunction into the agent’s internal language, so that information such as man(Rob) | woman(Rob) may be represented
internally. In this way, an agent’s program may include definition-like pairs of
rules, such as
human(x) → man(x) | woman(x)
man(x) | woman(x) → human(x)
The question is, how can this internal disjunction be handled by an agent in
the step-by-step way described above? How would an agent use the information
that man(Rob) | woman(Rob), together with other information, to conclude that
man(Rob)? The answer is: by case-based reasoning. This is just the kind of
reasoning one does to eliminate disjunction from a natural deduction proof (rule
➀ in figure 3). Expanding this rule using the introduction rule for ‘→’ (and
writing ‘[φ]’ for the closed assumption that φ) gives rule ➁, which explicitly
shows the form of case-based reasoning: we see what follows from the left disjunct and then what follows from the right. If something follows from both,
it follows from the disjunction, simpliciter. Reasoning by cases, an agent with
disjunctive beliefs can thus form non-disjunctive beliefs.
[Figure 3 shows two natural deduction rules. Rule ➀ (disjunction elimination): from φ ∨ ψ, a derivation of χ from the assumption [φ], and a derivation of χ from the assumption [ψ], infer χ. Rule ➁: from φ ∨ ψ, φ → χ and ψ → χ, infer χ.]
Figure 3: Eliminating disjunction using case-based reasoning
Case-based reasoning can be captured in the kind of transition system used
above by introducing the notion of a set of alternatives. Alternatives are primitive points to which sentences are assigned—they have many of the properties
of what were called states above—and states are now defined as sets of alternatives. Perhaps a diagrammatic example is best here; see figure 4. To keep
things simple, the example contains just a single agent, whose rules are p → r and q → r. In the diagram, the dots represent alternatives, the circles which enclose
them are the states s1 to s4 . Transitions are from left to right, so that s1 s2 s3 s4
forms a path. The dotted arrows show the reasoning which the agent is doing
in each transition.
Whereas the agents described in the previous section could only perform one
type of mental action, viz. to fire a rule and form a new belief (and in doing
so to move to a state which extends its predecessor), the agents modelled in
this framework can now enter into case-based reasoning. Call these two types
of move extend and split, respectively. Then, the example can be read from left
to right as:
[Figure 4 shows the rules p → r and q → r, and the states s1 to s4 drawn left to right: s1 contains a single alternative labelled p|q; s2 contains two alternatives, labelled p|q, p and p|q, q; in s3 the first of these is extended to p|q, p, r; and in s4 the second is extended to p|q, q, r.]
Figure 4: Case-based reasoning
1. split s1 to move to s2 ;
2. extend s2 ’s top alternative to move to s3 ;
3. finally, extend s3 ’s bottom alternative to move to s4 .
To be precise, an alternative w′ is said to extend another, w, as in definition
3 above (replacing ‘state’ with ‘alternative’). In the example, the top alternative
in s3 extends the top alternative in s2 (by ‘r’). A state s′ is now said to extend
a state s when one alternative in s′ extends one in s and all others remain the
same. In the example, s3 extends s2 . A state s′ is said to split a state s when
they differ only in that an alternative w ∈ s, labelled with a disjunction λ1 |λ2 ,
is replaced by alternatives w1 , w2 ∈ s′ , such that λ1 labels w1 and λ2 labels w2 ;
otherwise, w1 and w2 agree with w. In the example, s2 splits s1 . A transition
is allowed between states s and s′ only when s′ extends s, or s′ splits s, or both
s, s′ are terminating states. Models of a set A of n agents are now n + 3-tuples
⟨W, A, T, {Vi}i∈A⟩

where W is a set of alternatives, T ⊆ ℘(W) × ℘(W) is a serial relation on states (i.e. on sets of alternatives) and each Vi : W → ℘(L) assigns a set of L-formulas to each alternative. Finally, a belief α holds at a state s for agent i when Vi labels every alternative in s with α:

M, s ⊨ Bi α iff α ∈ Vi(w) for all w ∈ s
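To make the two kinds of move concrete, here is a small Python sketch (the representation is mine, purely illustrative: an alternative is a frozenset of formula strings, a state is a frozenset of alternatives, and a string such as 'p|q' marks an internal disjunction); the final lines replay the example of figure 4:

```python
def split(state, alt, disjunction):
    """Split a state on an alternative labelled with 'λ1|λ2': the alternative
    is replaced by two alternatives, one per disjunct, which otherwise agree
    with the original."""
    left, right = disjunction.split('|')
    assert alt in state and disjunction in alt
    return (state - {alt}) | {alt | {left}, alt | {right}}

def extend(state, alt, new_belief):
    """Extend one alternative of a state by a new formula (e.g. the consequent
    of a matching rule); all other alternatives stay the same."""
    assert alt in state
    return (state - {alt}) | {alt | {new_belief}}

def believes(state, formula):
    """A formula is believed at a state iff it labels every alternative."""
    return all(formula in alt for alt in state)

# Figure 4, step by step, for the rules p -> r and q -> r:
s1 = frozenset({frozenset({'p|q'})})
s2 = split(s1, frozenset({'p|q'}), 'p|q')     # alternatives p|q,p and p|q,q
top = next(a for a in s2 if 'p' in a)
s3 = extend(s2, top, 'r')                     # fire p -> r on one alternative
bottom = next(a for a in s3 if 'q' in a)
s4 = extend(s3, bottom, 'r')                  # fire q -> r on the other
assert believes(s4, 'r') and not believes(s3, 'r')
```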
In this way, agents which reason in a full propositional language (with variables and substitution) can be modelled. The restriction to rule-based agents
is not a restriction of this logical approach. Rule-based agents were introduced
because they provide a clear example of agents which form new beliefs as a
deductive step-by-step process, and because their models have interesting computational properties.
8 Conclusion
Standard epistemic logic is not, in itself, an account of belief; yet it appears that
no acceptable account of belief can justify the assumption which the logic makes.
Belief should be thought of in terms of the mental representations disposing an
agent to make the relevant assertion. The logic presented here makes use of the
connection between belief and internal representation, in capturing an agent’s
internal representations as a logical language, termed the internal language. In
the case of artificial agents which do not so represent their environment, it is
nevertheless possible to provide an unambiguous translation, from the values which variables may take whilst the agent executes, into such a language. The logic which was then developed can capture the class of rule-based agents, whose models have favourable computational properties and can be used to verify properties of the modelled agent, but it can also capture agents which reason in a more expressive language.
In the case of human agents, the notion of belief is not so clear-cut. Our practice of belief ascription, and our purposes in so doing, are not so fine-grained as, say, the practice of debugging a program (in which a programmer may want to know the precise point at which the program went wrong). Human belief states thus have vague borders, but are nevertheless genuine mental states. In practice,
the vagueness of belief is just as unproblematic as the inherent vagueness of
medium-sized objects, such as tables and animals.
I hope this discussion has achieved two things. First, I hope it has cleared the
way for similar accounts in epistemology; for example, in analysing the notion
of knowledge, or of information, one first needs a correct account of belief.
Secondly, I hope the logic of belief presented in the second half of the paper will
form the basis of practical accounts of intentional states in AI and computer
science. Many such accounts in these domains, such as the AGM theory of belief
revision [1], have high computational complexity and are notoriously difficult
to implement in a practical real-time system. An account of belief revision for
rule-based agents is given in [3], based on the notion of belief argued for above.
The mechanism proposed in the latter work has been incorporated into the
agent programming language AgentSpeak [2], thus demonstrating the practical
nature of this approach to intentional notions in AI and computer science.
References
[1] Carlos E. Alchourrón, Peter Gärdenfors, and David Makinson. On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50:510–530, 1985.
[2] Natasha Alechina, Rafael Bordini, Mark Jago, and Brian Logan. Belief revision for AgentSpeak agents. Manuscript.
[3] Natasha Alechina, Mark Jago, and Brian Logan. Resource-bounded belief
revision and update. In 3rd International Workshop on Declarative Agent
Languages and Technologies (DALT 05), 2005.
[4] Natasha Alechina, Brian Logan, and Mark Whitsey. Modelling communicating agents in timed reasoning logics. In Proc. JELIA 04, pages 95–107, Lisbon, September 2004.
[5] Patrick Blackburn, Maarten de Rijke, and Yde Venema. Modal Logic.
Cambridge University Press, New York, 2002.
[6] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning About Knowledge. MIT Press, 1995.
[7] Gottlob Frege. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100, 1892.
[8] Jaakko Hintikka. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Cornell University Press, Ithaca, 1962.
[9] Jaakko Hintikka. Impossible possible worlds vindicated. Journal of Philosophical Logic, 4:475–484, 1975.
[10] Mark Jago. Logics for resource-bounded agents. Forthcoming PhD Thesis.
[11] Mark Jago. Logical omniscience: A survey. Technical report, University of
Nottingham, 2003.
[12] Saul Kripke. Naming and Necessity. Blackwell, Oxford, 1980.
[13] H. J. Levesque. A logic of implicit and explicit belief. In National Conference on Artificial Intelligence, pages 198–202, 1984.