LOGICS FOR RESOURCE-BOUNDED
AGENTS
Mark Jago, BA(hons)
Thesis submitted to The University of Nottingham
for the degree of Doctor of Philosophy
July 2006
Abstract
In this thesis, I criticise traditional epistemic logics based on possible worlds semantics, inspired by Hintikka, as a framework for representing the beliefs and knowledge
of agents. The traditional approach suffers from the logical omniscience problem: agents
are modelled as knowing all consequences of their knowledge, which is not an admissible
assumption when modelling real-world reasoning agents. My thesis proposes a new logical framework for representing the knowledge and beliefs of multiple resource-bounded
agents. I begin by arguing that amendments to the possible worlds framework for modelling knowledge and belief cannot successfully overcome the logical omniscience problem
in all its guises, and conclude that a so-called sentential account of belief and knowledge is
to be preferred. Sentential accounts do receive support from the recent literature, but tend
to conflate belief with explicit assent.
In response to this problem, I consider Dennett’s intentional stance to belief ascription, holding that beliefs can only be ascribed to an agent from the point of view of
a particular predictive strategy. However, Dennett’s account itself suffers from logical
omniscience. I offer a sentential account of belief based on Dennett’s that avoids logical omniscience. Briefly, we should only ascribe those sentences to agents as beliefs that
the agent could explicitly assent to, given its situation and an appropriate bound on the
agent’s cognitive resources. In the latter half of the thesis, I concentrate on developing a
logical framework that respects this philosophical account of belief. In order to capture
resource-bounded reasoning, a fine-grained approach capable of modelling individual acts
of inference is required. An agent’s reasoning process is modelled as a non-deterministic
succession of belief states, where each new belief state differs from the previous one by a
single act of inference.
The logic I develop is a modal logic with many interesting model-theoretic properties and a simple, yet complete and decidable proof theory. I focus on the rule-based
agent paradigm from contemporary AI, as well as investigating classical propositional
reasoning-by-assumption. I investigate the complexity of the resulting logics and conclude
by discussing various extensions to the framework, including more expressive languages
and the handling of non-monotonic reasoning.
Acknowledgments
Thank you first and foremost to Natasha Alechina, who put up with my sometimes
woeful attempts at proofs and showed me how to do them properly; and to Brian Logan,
both of whom gave up far too much of their time in helping me. Thank you also to Stephen
Barker for advice and inspiration; to Eros Corazza for teaching me and for many debates
over good food; and to Anna Fellows for reading draft after draft of several papers which
have found their way into the thesis and for putting up with me so far.
For comments on parts of the thesis, or on talks whose content has been incorporated here, thanks go to: Thorsten Altenkirch, Trevor Bench-Capon, Robert Black,
Dov Gabbay, Wiebe van der Hoek, David Mackinson, Stephen Mumford, Paul Noordhof,
Siewert van Otterloo, Michael Wooldridge and the regular attendees of the postgraduate
philosophy seminar at Nottingham.
Finally, thank you to the Arts and Humanities Research Board and the Department
of Philosophy and the School of Computer Science and IT at the University of Nottingham
for funding my studies, and to the School of Humanities and the Graduate School at
Nottingham; the Department of Philosophy, Lund University, Sweden; and the Midlands
Graduate School for various travel grants and bursaries during my doctoral study.
To Anna; and to my parents who will be, and Bill who would have been, proud.
Contents
Introduction

1 Epistemic Logic and Logical Omniscience
  1.1 Epistemic Logic
    1.1.1 Introduction
    1.1.2 Possible worlds semantics
    1.1.3 Systems of Modal Epistemic Logic
  1.2 Logical Omniscience
  1.3 Hintikka’s First Response
    1.3.1 Defensibility and Indefensibility
    1.3.2 Scepticism
    1.3.3 Idealized knowledge

2 Responses to Logical Omniscience
  2.1 Impossible Possible Worlds
    2.1.1 Hintikka’s Problem
    2.1.2 Hintikka’s 1975 Response
  2.2 Nonclassical Worlds
    2.2.1 Cresswell’s Nonclassical Worlds
    2.2.2 Paraconsistent Worlds
    2.2.3 Relevant worlds
    2.2.4 Levesque’s Logic of Explicit and Implicit Belief
    2.2.5 Local Reasoning
    2.2.6 Fagin, Halpern and Vardi’s Nonstandard Worlds
    2.2.7 Evaluating the Nonclassical/Nonstandard Worlds Approach
  2.3 Awareness
    2.3.1 Logics of General Awareness
    2.3.2 What is Awareness?
  2.4 Conclusion

3 Accounts of Belief States
  3.1 Epistemic Possibility
  3.2 The Fregean Account
  3.3 Realist and Representational Theories
  3.4 Predictive Accounts and the Intentional Stance
  3.5 Sentential Accounts
  3.6 Belief and Acceptance
  3.7 Prospects for a Logic of Belief

4 Syntactic logic
  4.1 First-Order Accounts
  4.2 Self-Referentiality and Inconsistency
  4.3 Sentential Logics
    4.3.1 The Deduction Model
    4.3.2 Syntactic Models
    4.3.3 Algorithmic Knowledge
    4.3.4 Dynamic Logic
  4.4 Step Logics
    4.4.1 Agent and Meta Step Logics
    4.4.2 Active Logics
    4.4.3 First-Order Semantics for Step Logics
  4.5 Timed Reasoning Logic
    4.5.1 Reasoning with Assumptions
    4.5.2 Modelling a Natural Deduction Style Reasoner
  4.6 Semantics for TRL
  4.7 Discussion and Related Work

5 Rules and Rule-Based Agents
  5.1 Motivation for the Logic
  5.2 Rules and Rule-Based Agents
  5.3 Modelling Rule-Based Agents
  5.4 Bisimulation
  5.5 Properties of Models
  5.6 Finite Models and Programs
    5.6.1 Finite Models
    5.6.2 Programs
  5.7 Multi-Agent Systems
    5.7.1 Communication Between Agents
    5.7.2 Communication and deduction rules
    5.7.3 Models of Multi-Agent Systems
    5.7.4 Programs

6 Proof Systems
  6.1 Introduction
  6.2 Logic for a program R
  6.3 Logic for a Multi-Agent System
  6.4 Complexity of Satisfiability Checking

7 Expanding the Logic
  7.1 Introduction
  7.2 A Hilbert-Style Reasoner
  7.3 An Assumption-Making Reasoner
  7.4 Investigating the Class A
    7.4.1 Embedding the Class S
    7.4.2 Capturing Natural Deduction Proofs
    7.4.3 Properties of AR
  7.5 Adding Temporal Modalities

8 Conclusions
  8.1 Summary of the Thesis
  8.2 Future Work
    8.2.1 Embedding into Alternating Time Logic
    8.2.2 Non-monotonicity
    8.2.3 Information and Epistemic Possibility

References

A Proofs
Introduction
Background
Logics are powerful tools with which to reason about the behaviour of agents. Ascribing
beliefs to agents allows us to reason about their behaviour without detailed information
about their low-level workings. Reasoning about an agent at a high level of description
(to use Douglas Hofstadter’s phrase) is often useful to designers of artificial agents, for
example in debugging the agent’s program, rather than reasoning about specific low-level
code. The position that Dennett has termed the intentional stance [Den87] has proved
popular in artificial intelligence and computer science. The idea is that, in order to predict
the behaviour of an agent, we treat it as a rational agent and decide what beliefs and desires
it ought to have, given its circumstances and its purpose [Den87, p.17]. Doing so allows us
to make generalizations about the agent’s behaviour that would not necessarily follow from
a finite number of simulations of the agent [McC79a]. The resulting characterization of the
agent’s behaviour, in terms of beliefs and desires, is almost certainly easier to understand
than details of the underlying biological or neurological or electrical phenomena, or its
program features.
Hintikka’s logic of knowledge and belief [Hin62] has become a standard logical
tool for dealing with intentional notions in artificial intelligence and computer science.
One reason for this success is the adoption by many computer scientists of modal logics
in general, as tools for reasoning about relational structures. Many areas of interest to
computer scientists, from databases to the automata that underlie many programs, can
be thought of as relational structures; such structures can be reasoned about using modal
logics. Hintikka’s work can be seen as the start of what is known to philosophers, logicians
and computer scientists as formal epistemology: that branch of epistemology which seeks to
uncover the formal properties of knowledge and belief.
Then, in 1995, Fagin, Halpern, Moses and Vardi’s Reasoning About Knowledge
[FHMV95] showed how logics based on Hintikka’s ideas can be of use in distributed
computing, artificial intelligence and game theory, to name but a few key areas. With
the appearance of Reasoning About Knowledge, Hintikka’s approach was firmly cemented
as the orthodox logical account of belief and knowledge for philosophers and computer
scientists alike.
It is rare for an orthodox account of such popularity to receive no criticism and
Hintikka’s framework is no exception. One major source of objections is the so-called
problem of logical omniscience whereby, as a result of the modal semantics applied to
‘knows’ (and ‘believes’), agents automatically know every tautology as well as every
logical consequence of their knowledge. Just how such a consequence should be viewed
is a moot point; perhaps this notion of knowledge applies to ideal agents, or perhaps it is
an idealized notion, saying what a non-ideal agent should believe (given what it already
does believe). Although neither account is entirely satisfactory, defenders of the approach
claim that, in many cases, the assumptions are harmless, and that the applications which
such logics have found speak for themselves.
Aims and Scope
In this thesis, I aim to develop an account of beliefs that respects an agent’s resource
bounded nature. All humans have resource bounds: we have only so much time and ability
to think, only so much memory and imaginative capacity. Artificial agents, whether
they are physically realized robots or software programs, are similarly resource bounded.
Software agents execute on physical hardware that has a fixed, finite amount of memory
available and are often required to reason and make decisions, given some input, within a
set period of time. An agent allowed to reason for as long as it requires would be of very
limited use!
Logics that suffer from the logical omniscience problem cannot, in general, model
agents in a way that reflects their resource bounds. Although there are applications for
epistemic logics in computer science that do not require resource bounds to be part of the
model of an agent, I argue that there are situations that do not permit the assumption that
agents are ideal reasoners. Logical models of ideal agents must be viewed as heuristic
approximations of real-world agents, rather than an accurate modelling tool applicable in
all situations.
The aim of this thesis is to supplant the prevalent notion that the possible worlds
framework developed by Hintikka is the way to model intentional notions. I argue for a
philosophical account of belief based on considerations from the philosophies of language
and mind that supports the idea that an agent’s state of belief is best characterized in terms
of sentences, rather than in terms of propositions. I argue that we can use knowledge of an
agent’s resource bounds to make the link between what the agent explicitly assents to (or
would explicitly assent to, if asked) and what it believes.
I develop a logical framework for modelling resource bounded agents in terms of
their beliefs. This logic is intended as a solution to the problem of logical omniscience in
epistemic logics. Because of the myriad uses that have been found for the possible worlds
approach in artificial intelligence and computer science, a guiding principle of the logic
I develop is that it should be intuitive and of relatively low complexity: this is, after all,
one of the motivating factors in adopting a modal rather than a first-order logic to model
agents in the first place.
Because of the philosophical analysis I give of belief in terms of sentences, rather
than propositions, the logic I propose falls into the category of what has been termed
syntactic logics—although I prefer the term sentential logic. Several authors, e.g. [FH88]
and [FHMV95], criticize the sentential approach on the grounds that it lacks “elegance and
intuitive appeal” [FH88, p.40]. One of my aims here is to show that this is not necessarily the
case. In fact, not only does the logic I propose share many of the useful properties of other
modal logics; it also has a number of interesting additional properties relating to belief. I
aim to show that the framework I propose is intuitive, philosophically well-motivated and
useful to researchers in artificial intelligence and computer science.
Organization of the Thesis
Chapter 1 sets out the details of Hintikka’s original proposal and discusses the logical
omniscience problem. Hintikka’s original attempt at a solution is rejected, along with
similar attempts to disregard the problem. Chapter 2 continues the theme of responses
to logical omniscience that remain within the spirit of the possible worlds account of
knowledge and belief. I begin with Hintikka’s influential 1975 proposal and critique
several prominent approaches that share Hintikka’s strategy to avoid logical omniscience.
I argue that none of these attempted solutions adequately solves the problem. Much of this
chapter was presented in [Whi03].
In chapter 3, I discuss philosophical approaches to analyzing belief. One conclusion drawn is that an agent’s state of belief is best characterized using sentences, rather
than that traditional philosophical stalwart, the proposition. I discuss the relation between
having a belief and explicitly assenting to a sentence. Despite the many differences between the two notions, I analyze belief in terms of the potential of explicit assent. This
potential must be characterized in terms of an agent’s abilities and the resources available
for reasoning. A preliminary version of the argument presented here was given in [Jag05b],
and also read at a Royal Institute of Philosophy seminar at the University of Nottingham.
Chapter 4 begins by surveying formal logics that deal with belief (or knowledge)
in a syntactic or sentential way, i.e. they provide semantics for belief (or knowledge) in
terms of the sentences that the agent believes (or knows). I highlight the useful aspects
of the logics considered from the point of view of modelling resource boundedness and
discuss their shortfalls. In the latter half of the chapter, I develop a novel logic of belief,
called timed reasoning logic, which combines many of the plus points of the logics discussed
previously in the chapter. The chapter concludes with a discussion of the drawbacks of
timed reasoning logic. The survey part of the chapter is taken from [Whi03] with several
additions. Timed reasoning logic, presented in the second half of the chapter, was first
presented in [Whi04] and [ALW04a, ALW04b]. The idea of modelling reasoning with
assumptions from [Whi04], discussed in the second half of the chapter, was developed
further in [Jag05a].
In chapter 5, I concentrate on the case of rule-based agents, a prominent example
from the current literature in artificial intelligence. The internal reasoning process in rule-based agents is, from a logical point of view, relatively simple. This allows us to abstract
away from logical complexity and concentrate on developing practical, realistic models
of resource-bounded reasoners. Much of this chapter is published in [Jag06b] and forms
the basis of the approach in [AJL06a]. Chapter 6 continues the investigation of the logical
framework developed in chapter 5 by presenting an axiomatization of the logic and proving
it to be sound, complete and decidable. I then give a complexity analysis of the logic.
Finally, chapter 7 considers several ways in which the logic developed in chapters
5 and 6 can be extended. To begin with, I consider a Hilbert-style axiomatic reasoner in full
propositional logic, rather than the restricted rule-based agents considered in chapters 5
and 6. I then present models of an assumption-making reasoner of the kind met in chapter
4, but without the drawbacks of timed reasoning logics encountered there. I then discuss
adding more expressive temporal modalities and tie the notion of explicit belief used in the
logic together with the idea of predictive belief ascriptions, discussed at the end of chapter
3. The chapter concludes with a discussion of how the logical framework can be used to
model epistemic possibility and information, published as [Jag06a], and thoughts about
future work.
Note that papers I published before 2005, namely [Whi03] and [Whi04], were
published under my former surname Whitsey.
Chapter 1
Epistemic Logic and Logical Omniscience
1.1 Epistemic Logic
1.1.1 Introduction
In this section, Hintikka’s logic of knowledge and belief is presented. Throughout, I will
refer to Hintikka’s logic as standard epistemic logic. Hintikka’s guiding idea in Knowledge
and Belief [Hin62] was to treat knowledge and belief as ways of locating the actual world
in a space of logical possibilities. Knowing or believing something, say, that Tony Blair
is currently the Prime Minister, is a state of mind in which an agent rules out conflicting
epistemic possibilities. It is not inconceivable for there to be another Prime Minister and there
is no force of necessity by which Tony Blair currently holds the post. There are here a number
of relevant alternative possibilities—possibilities in which Gordon Brown has supplanted
Tony Blair as Prime Minister, or in which the Tories won the last General Election, or even
possibilities in which Britain has done away with the rôle of a Prime Minister altogether.
But in knowing or believing that Tony Blair is in fact the Prime Minister, I am ruling out all
of these alternative possibilities; and if I know that this is so, then the actual world really
must be like that.
On the other hand, if I am unsure whether the home secretary is Charles Clarke,
then the space of possibilities in which I locate the actual world will include possibilities
in which Charles Clarke is the home secretary and those in which he is not. Following the
usual philosophical terminology that originated with Leibniz, such alternative possibilities
shall be termed possible worlds. There is thus a relationship between what an agent knows,
or believes, and which worlds that agent considers possible. Viewed in one direction,
we can say that worlds in which Tony Blair is not the Prime Minister are not epistemic
possibilities for someone who believes that he is. In the other direction, we can say that an
agent believes whatever is true of all possible worlds that that agent considers epistemically
possible. Just how this thought leads to a logic of knowledge and belief will be explained
below. First, a few words should be said on just what these alternative possibilities are.
Writers differ as to the nature of possible worlds. David Lewis [Lew86] thinks of
possible worlds as concrete entities, each just as real as any other. On this view, each possible
world is considered to be actual from the perspective of that world, so that ‘actual’ functions
as an indexical. David Armstrong [Arm97] presents an opposing view that privileges our
world as the actual world. Possible worlds are then possible re-combinations of the parts
of states of affairs that constitute our world. Saul Kripke [Kri80] thinks of worlds simply
as ways the world could have been. In many ways, exactly what possible worlds consist in
is irrelevant to the logician, who merely uses them as a tool. Possible worlds are usually
thought of as spatio-temporally maximal entities; that is to say, no proper part of a possible
world is itself a possible world. In this respect, possible worlds differ from situations, for
a proper part of a situation may itself be a situation. As a metaphysical starting point, let
us consider worlds to be maximal and consistent collections of states of affairs, lacking de
re logical complexity. That is, possible worlds contain no negative, conjunctive, disjunctive
or universal states of affairs.1

1 Just whether this minimal notion of a world can support a grounded notion of truth is a moot point. Truth is grounded in the sense that the way the world is makes propositions true or false. The worry is, without negative or universal states of affairs, just what is it that makes propositions such as that there is no rhinoceros in the room true? Armstrong [Arm04], for example, opts to include universal states of affairs.
1.1.2 Possible worlds semantics
Hintikka used possible world semantics, with a primitive relation of epistemic accessibility
R holding between worlds. Rww′ says that w′ is epistemically possible, or accessible (for
the agent in question) from w. Knowledge and belief is then defined in terms of such a
relation; it will usually have different properties, depending on whether it is knowledge
or belief under discussion. The following elucidation of Hintikka’s ideas will be put in
terms of knowledge, but the basic ideas apply equally to belief. An agent knows that φ
(at a world w) if φ is true in all worlds epistemically possible from w via R, i.e. if φ is true in all
the worlds w′ such that Rww′ . (If we want to reason about an agent’s actual knowledge, a
unique world @ can be distinguished as the actual world. An agent’s knowledge is then
represented as the sentences that hold at all worlds accessible from the actual world, i.e. the
worlds w such that R@w.)
Standard epistemic logic is a modal logic that uses relational structures to model an
agent’s knowledge and beliefs. The domain of such models is a set of points W, considered
to be possible worlds. At each world, the sentences that are said to be true are closed under
some logical consequence relation (say, that of classical propositional or first-order logic).
The accessibility relation R (⊆ W × W) holds between worlds in the domain of the model.
The satisfaction relation holding between a model, a world in that model and a sentence
of the logic is defined recursively in the usual way. In the propositional case, truth-values
are assigned to propositions at each world; in the first-order case, constants are assigned
an individual of the domain and relation letters an extension.
In this thesis, only the propositional case of standard epistemic logic will be
considered (bar a few occasions when the discussion of a theory requires a first-order
setting). Although issues in first-order modal logic are interesting in their own right, much
of the current literature refers only to the propositional case. The language L of standard
epistemic logic contains a denumerable set P = {p1 , p2 , . . .} of propositional letters, the
Boolean connectives ¬, ∧, ∨ and → and an operator K, where ‘Kφ’ is read as ‘the agent
knows that φ’. If the logic is to be used for reasoning about belief, the language may use
a B operator in place of K. Since the language is parametrized by P, it should properly be
written as LP . However, as it will always be clear which set of propositional letters is in
use, the superscript is unnecessary. All primitive propositions are well-formed formulae
(wffs) of L and, if φ and ψ are wffs of L, then so are ¬φ, φ ∧ ψ, φ ∨ ψ, φ → ψ and Kφ (or
Bφ). Models of standard epistemic logic are structures
M = ⟨W, R, V⟩
where:
• W is a nonempty denumerable set of possible worlds,2 called the domain of the model;
• R ⊆ W × W is the epistemic accessibility relation on worlds described above; and
• V is the labelling function (also called a valuation) of type W −→ 2P , assigning a set
of primitive propositions to each world w ∈ W.3
2 The elements of W are often called states instead of worlds, depending on the application of the logic.
3 This is a slightly unusual way of defining the labelling function V; it is more common to define V as a function mapping pairs from W × P into the set {true, false}. However, the definitions are clearly interchangeable, as V(w, p) = true on this definition precisely when p ∈ V(w) using the definition given above.
The basic modal notions used by standard epistemic logic are as follows:
Definition 1 (Satisfaction) The satisfaction relation ⊨, which holds between a model, a world
in that model and a formula, is defined inductively as follows:

M, w ⊨ p iff p ∈ V(w), for p ∈ P⁴
M, w ⊨ ¬φ iff M, w ⊭ φ
M, w ⊨ φ ∧ ψ iff M, w ⊨ φ and M, w ⊨ ψ
M, w ⊨ φ ∨ ψ iff M, w ⊨ φ or M, w ⊨ ψ
M, w ⊨ φ → ψ iff M, w ⊭ φ or M, w ⊨ ψ
M, w ⊨ Kφ iff, for all w′ ∈ W, Rww′ implies M, w′ ⊨ φ

M, w ⊨ φ is read as ‘w satisfies φ in M’, or alternatively as ‘φ is true at w in M’. Where there is
no possibility of ambiguity between models, we may write w ⊨ φ in place of M, w ⊨ φ. If Γ is a set
of formulae, then Γ is satisfied by a world w in a model M, written M, w ⊨ Γ, when every element
of Γ is satisfied at w in M, i.e. M, w ⊨ φ for each φ ∈ Γ. When a formula φ is satisfied by every
world w in a model M, φ is said to be globally satisfied in M, written M ⊨ φ. A set of formulae is
globally satisfied in a model M when each of its elements is.

4 On the alternative definition of V given in footnote 3, this clause becomes M, w ⊨ p iff V(p, w) = true, for p ∈ P.
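To make these clauses concrete, the following is a small illustrative Python sketch of satisfaction checking in a finite model; it is an illustration of mine rather than part of the thesis, and the particular worlds, relation and valuation are made up for the example. Formulae are represented as nested tuples, and the K clause quantifies over all R-successors, exactly as in Definition 1.

# A minimal sketch of Definition 1 for a finite model M = (W, R, V).
# Formulae are nested tuples, e.g. ('K', ('->', 'p', 'q')) stands for K(p -> q).

W = {'w1', 'w2'}                      # possible worlds
R = {('w1', 'w2'), ('w2', 'w2')}      # epistemic accessibility relation
V = {'w1': {'p'}, 'w2': {'p', 'q'}}   # labelling: world -> set of true letters

def sat(w, phi):
    """Return True iff M, w satisfies phi, following the clauses of Definition 1."""
    if isinstance(phi, str):                     # primitive proposition
        return phi in V[w]
    op = phi[0]
    if op == 'not':
        return not sat(w, phi[1])
    if op == 'and':
        return sat(w, phi[1]) and sat(w, phi[2])
    if op == 'or':
        return sat(w, phi[1]) or sat(w, phi[2])
    if op == '->':
        return (not sat(w, phi[1])) or sat(w, phi[2])
    if op == 'K':                                # phi[1] must hold at every R-successor of w
        return all(sat(v, phi[1]) for (u, v) in R if u == w)
    raise ValueError('unknown connective: ' + str(op))

print(sat('w1', ('K', ('->', 'p', 'q'))))   # True: p -> q holds at w2, the only successor of w1
print(sat('w1', ('K', 'q')))                # True: q holds at w2
print(sat('w1', 'q'))                       # False: q is not in V(w1)

Global satisfaction, M ⊨ φ, then amounts to sat(w, φ) returning True for every w in W.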
A key notion in modal logic is that of validity. First, the notion of a frame is
introduced. A frame F is a pair
⟨W, R⟩
where W and R are as above. A frame contains the ontological information that underlies
a model—which possible worlds exist in the domain and how they are related by R. A
model M can then be viewed as a composite structure, consisting of a frame F and a
labelling function V. M is said to be based on F and F to be the frame underlying M. V
adds contingent information to a frame—which propositional letters are satisfied at which
possible worlds. As with non-modal logics, validity ignores contingent information, for
valid formulae are those that are satisfied however truth values are assigned. Validity is
thus dealt with at the level of frames.
Definition 2 (Validity) Given a frame F = ⟨W, R⟩ and a world w ∈ W, a formula φ is valid at
w in F, written F, w ⊨ φ, if φ is satisfied at w in every model M based on F. φ is valid in F,
notation F ⊨ φ, if it is valid at every world w ∈ W in F. φ is valid on a class of frames F, written
F ⊨ φ, when φ is valid in every frame F ∈ F. Finally, φ is valid, ⊨ φ, when it is valid on the class
of all frames. A set of formulae Γ is then defined as valid at a world in a frame, valid in a frame, in a
class and valid simpliciter in the obvious way.
The set of all formulae valid on a class of frames F is called the logic of F, denoted
ΛF . Below (section 1.1.3) we shall see that certain important classes of frames give rise to
interesting properties of the K and B modalities of epistemic logic. It is worth noting that
the definitions given above are common to all normal modal logics,5 not just epistemic
logic. In one sense, an epistemic logic is simply a standard modal logic with a K or B
operator in place of the usual 2 modality.
Systems that deal with both knowledge and belief will contain two accessibility
relations, which we may call RK and RB . Ascriptions of knowledge will be evaluated
using the former; those of belief using the latter.6 In cases where we have more than one
agent—a multi-agent system—the language will contain a knowledge operator Ki (and/or
a belief operator Bi ) for each agent i. The language containing n such operators over a
set of propositional letters P is denoted LPn, although again L will usually suffice when
the context removes the possibility of ambiguity. Frames and models will then contain an
accessibility relation Ri for each agent, i.e. given n agents, a frame will be an n + 1-tuple
⟨W, R1 , . . . , Rn ⟩
and a model will remain a frame-valuation pair. Formulae containing an operator Ki are
evaluated for satisfiability using the corresponding accessibility relation Ri :
M, w ⊨ Ki φ iff, for all w′ ∈ W, Ri ww′ implies M, w′ ⊨ φ.
1.1.3 Systems of Modal Epistemic Logic
Given a class of frames F, we have said that the logic corresponding to this class is the set of
all formulae valid on F. In this section, we see how certain important classes of frames give
rise to different logics. Because the worlds w in the domain of a frame are simply points
in a relational structure from the point of view of the frame, different classes of frames are
5 A normal modal logic is one that includes K: see section 1.1.3 below.
6 Hintikka discusses the relationship between knowledge and belief at [Hin62, §3.6–§3.9]. A good overview of the issues is found in [Sta06]; a rather different theoretical approach to the relationship is discussed in [Fas03].
obtained by imposing restrictions on R. In doing so, the properties of the K operator of the
corresponding logic are affected, as we shall now see.
To begin with, suppose we are interested in the class of all frames; that is, we
impose no restriction on R whatsoever, other than its definition as a relation on the domain
W. What formulae are valid on this class? To be sure, all propositional tautologies will
be valid at each world, for the definition of ⊨ contains the usual clauses for the Boolean
connectives. For a similar reason, modus ponens preserves validity: if φ and φ → ψ are
valid, then so is ψ. Moreover, if a formula is valid, then by definition it is valid at all worlds
in the domain. So, a valid formula φ will be satisfied at all worlds accessible via R from
any world w; hence Kφ also holds at w. We have shown that Kφ may be inferred from any
validity φ; this is a rule known as generalization. It allows us to enter a modal (belief or
knowledge) context from a non-modal context.
Finally, what about reasoning within modal contexts? Suppose K(φ → ψ) and Kφ
both hold at an arbitrary world w of an arbitrary model. Then, at every world w′ such that Rww′,
both φ → ψ and φ hold, hence so does ψ; so Kψ holds at w as well. Hence the
formula K(φ → ψ) → (Kφ → Kψ) is valid on the class of all frames. This formula is known as the
distribution axiom scheme: K distributes over →. It allows for propositional-style reasoning
within modal contexts. The smallest logic containing propositional logic and all distribution
axioms, and closed under modus ponens and the generalization (necessitation) rule, is known as K. It is the minimal (weakest) logic for
reasoning about frames.
Hilbert-style proofs are sequences of formulae, each of which is valid on a particular class of frames if the previous formulae are. The rules that allow a new line to be
added to the proof must therefore preserve validity. A Hilbert-style proof system for K is
as follows.
Definition 3 (Proofs in K) A K-proof is a finite sequence of formulae, each of which is an axiom
or follows from a previous formula in the sequence by applying either modus ponens (from φ and
φ → ψ infer ψ) or generalization (from φ infer Kφ). The axioms of K are all the propositional
tautologies, plus instances of the distribution axiom: K(φ → ψ) → (Kφ → Kψ). A formula φ
is K-provable from a set of premises Γ, written Γ ⊢K φ, if it is the last formula in some K-proof
beginning with the elements of Γ.
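As an illustration of Definition 3 (an example of mine, not taken from the thesis), here is a sketch of a K-proof of the closure-under-conjunction principle Kp ∧ Kq → K(p ∧ q); the final step compresses a few purely propositional moves (tautology instances plus modus ponens) into one line:

1. p → (q → (p ∧ q))                               (propositional tautology)
2. K(p → (q → (p ∧ q)))                            (generalization, 1)
3. K(p → (q → (p ∧ q))) → (Kp → K(q → (p ∧ q)))    (distribution axiom)
4. Kp → K(q → (p ∧ q))                             (modus ponens, 2, 3)
5. K(q → (p ∧ q)) → (Kq → K(p ∧ q))                (distribution axiom)
6. Kp ∧ Kq → K(p ∧ q)                              (from 4 and 5 by propositional tautologies and modus ponens)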
Stronger logics are obtained by adding extra axioms to K. We now list some of the interesting
axiom schemes from the point of view of epistemic logic.
T: Truth
Adding all instances of the scheme
(T) Kφ → φ
to K results in the logic T (a more systematic naming scheme would have T as ‘KT’; but the
name ‘T’ is used for historical reasons). T is sometimes said to distinguish knowledge from
belief: beliefs need not be true, whereas knowledge must be. However, there is clearly
more to knowledge than simply true belief, so it is better to say that T results in a logic of
true belief. Of course, a logic of knowledge based on K must contain T.
What class of frames are the axioms of T valid on? The (T) axioms tell us that
when a formula is satisfied at all worlds accessible via R from a world w, then it must be
satisfied at w as well. To guarantee this in all models in the class, w must be accessible
from itself, i.e. R must be reflexive: for all worlds w ∈ W, Rww. The axioms of T are thus
valid on the class of reflexive frames. Moreover, these are the only axioms valid on all
reflexive frames, so T is the logic corresponding to the class of all reflexive frames.
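A simple counter-model (an illustration of mine, not from the text) makes the need for reflexivity concrete: let W = {w, w′}, R = {(w, w′)} and V(w) = ∅, V(w′) = {p}. Here R is not reflexive and, at w, Kp is satisfied (p holds at the only accessible world, w′) while p is not, so the instance Kp → p of (T) fails at w. Adding the pair (w, w) to R removes the counterexample, since Kp then also requires p to hold at w itself.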
D: Consistency
The logic KD adds all instances of
(D) Kφ → ¬K¬φ
to K. D is the logic of consistent belief: agents do not believe the negations of their beliefs.
In terms of models, inconsistent beliefs are possible at worlds from which no worlds are
accessible via R. Consider such a world w, such that the set {w′ | Rww′ } is empty. Then
there is no accessible world that fails to satisfy p, and none that fails to satisfy ¬p; therefore both p
and ¬p are (vacuously) believed at w. The frames that the axioms of D are valid on cannot contain such
worlds. They are valid on—and D is the logic corresponding to—the class of serial frames,
i.e. for all worlds w ∈ W, there exists a w′ ∈ W such that Rww′ .
4: Positive introspection
The (4) axioms are of the form
(4) Kφ → KKφ
and read: if the agent believes φ, it believes that it believes φ. It is a moot point whether
either belief or knowledge satisfies the positive introspection principle. In the case of beliefs,
it seems that a person can be confused about what she really believes, so may believe φ
without believing she believes it. Hintikka argues [Hin62] that knowledge satisfies positive
introspection (this is sometimes known as the KK principle). His argument appeals to an
intuitive notion of consistency, rather than one’s supposed ability to introspect on one’s
epistemic states. If {Kφ, ¬K¬ψ} is consistent, Hintikka argues, then so is {Kφ, ψ}. The KK
principle follows by substituting ‘¬Kφ’ for ‘ψ’ and eliminating the double negation: since
{Kφ, ¬Kφ} is clearly not consistent, {Kφ, ¬KKφ} cannot be either. So Kφ implies KKφ.
Of course, the premise may be disputed. I may know that it is raining (Kp) and
not know that some very long sentence φ (which, unbeknown to me is a logical falsehood)
is false (¬K¬φ). Yet this certainly does not show that {Kp, φ} is consistent—it cannot be
since {φ} is not. Another common example that suggests that the KK principle is incorrect
runs as follows. A schoolboy sitting a history test is asked when the Battle of Hastings
took place. He cannot clearly recall the lesson in which the answer was given but, on a
hunch, he writes ‘1066’, but is unsure as to whether this is the right answer. It seems that
he really knows the answer (it would be a very lucky guess otherwise!) but he clearly does
not know that he knows it.7

7 There are other reasons to reject the KK principle; see [Wil00] for a good discussion.
Adding all (4) axioms to K results in K4, the logic of the class of all transitive
frames (those such that, for all worlds w, w′ , w′′ ∈ W, if Rww′ and Rw′ w′′ then Rww′′ ). The
following diagram hints why this is.
[Diagram: three worlds linked in a chain by the accessibility relation R, labelled KKp, Kp and p respectively.]
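In outline (my gloss on the diagram, not a quotation from the text): suppose R is transitive and w ⊨ Kφ. Take any w′ with Rww′ and any w′′ with Rw′w′′. By transitivity, Rww′′, so w′′ ⊨ φ; since w′′ was arbitrary, w′ ⊨ Kφ, and since w′ was arbitrary, w ⊨ KKφ. So every instance of (4) holds at every world of a transitive frame, whichever valuation is chosen.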
5: Negative introspection
Finally, the (5) axioms are instances of the scheme:
(5) ¬Kφ → K¬Kφ
If someone doesn’t believe something, then she believes that she doesn’t believe it (or
similarly with ‘knows’ in place of ‘believes’). This is a very strong condition to place on
either belief or knowledge. Socrates claimed that he only knew that he knew nothing—but
this is not to say that he knew all the things he did not know! Adding all instances of (5)
to K results in the logic K5, the logic of Euclidian frames: those such that, for all worlds
w, w′ , w′′ ∈ W, if Rww′ and Rww′′ then Rw′ w′′ .
Systems of epistemic logic other than those defined above can be obtained by
combining the axioms (K), (T), (D), (4) and (5). Common combinations for epistemic logic,
together with the class of frames (the condition on R) that they characterize (or are sound
and complete with respect to), are given in table 1.1.

Table 1.1: Common epistemic logics

Name    Axioms       Condition on R
KD      K ∪ {D}      Serial
T       K ∪ {T}      Reflexive
S4      T ∪ {4}      Reflexive, transitive
S5      S4 ∪ {5}     Equivalence relation
KD4     KD ∪ {4}     Serial, transitive
KD45    KD4 ∪ {5}    Serial, transitive, Euclidian

In the case of multi-agent logics containing n knowledge operators K1 , . . . , Kn , the names of each logic receive an additional
‘n’ subscript, i.e. KD45n and so on.
1.2 Logical Omniscience
This framework contains some uncomfortable implications. Given the possible worlds
semantics as presented above, any formula satisfied at all worlds must be globally known. A
consequence is that all logical consequences of an agent’s knowledge must be automatically
known by the agent. As an instance of this, all tautologies must be globally known. A
hypothetical agent whose knowledge lived up to such stringent requirements could be
termed logically omniscient; consequently, this feature of certain epistemic logics has been
dubbed the logical omniscience problem [Hin75].
Logical omniscience can take on a variety of forms, but the basic case, which I
shall term full logical omniscience, is that of the closure under logical consequence of an
agent’s knowledge, or beliefs:
Definition 4 (Full logical omniscience) An agent is fully logically omniscient when: if ψ is a
logical consequence of a set of formulae Γ and the agent knows every formula of Γ, then it also knows
ψ.
Why is logical omniscience a problem? Firstly, logics that suffer from the problem
allow conclusions that are unintuitive at best and “clearly inadmissible” [Hin62, p.31] at
worst. Two practical examples of why logical omniscience is problematic run as follows.
Chess players usually know the rules of chess and the positions of the pieces on the board.
So why is it that (even very good) chess players often fail to spot a winning strategy8 when
one exists? It is usually because the effort required to find such a strategy is far too great.
Chess players often do not have the time or mental ability to investigate every possible
strategy to see whether it is a winning one. But whether a given sequence of moves is in fact
a winning strategy for one of the players follows logically from the rules of the game and
the positions of the pieces and so logically omniscient agents cannot fail to spot a winning
strategy.
Prime number factorization in cryptography provides us with another example
(as discussed in [HMV95]). A message is to be sent over an insecure channel and so is
encrypted using two large primes n1 , n2 . The key n1 × n2 is also sent over the channel
and so is also publicly known. Given that an agent knows the rules of basic arithmetic,
the encrypted message and the public key, must it be able to decrypt the message? The
point of prime number factorization is that the complexity of finding the primes n1 , n2 is
so high that the secret key is unlikely to be discovered, despite all the information needed
to do so being available publicly. However, agents modelled in standard epistemic logic
fail to know the secret key only when they fail to know the public key or the rules of basic
arithmetic—this is by no means reasonable.
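To make the asymmetry vivid, here is a small illustrative Python sketch (mine, not the thesis’s, and using toy-sized primes rather than the very large primes used in practice): computing the public key is a single multiplication, while recovering the primes by trial division takes on the order of √(n1 × n2) steps, which becomes infeasible as the primes grow.

# Illustrative only: the computational asymmetry behind the cryptography example.
n1, n2 = 10007, 10009            # two small primes standing in for the large primes n1, n2
key = n1 * n2                    # the public key: one multiplication

def factor(n):
    """Recover a prime factor of n by trial division: roughly sqrt(n) steps."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return n, 1

print(key)           # publicly known
print(factor(key))   # (10007, 10009) here, but hopeless for primes of cryptographic size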
Hintikka himself notes the problem of logical omniscience (although not under
that name) at [Hin62, p.30], “it is readily seen that [‘Kp → Kq’] is valid as soon as p logically
implies q in our ordinary propositional logic.” He goes on to comment that:
it is clearly inadmissible to infer “he knows that q” from “he knows that p” solely
on the basis of the fact that q follows logically from p, for the person in question
may fail to see that p entails q, particularly if p and q are relatively complicated
statements. The state of his knowledge might be comparable with that of a man
who knows the axioms of some sophisticated mathematical theory but who
does not know some distant consequences of the axioms. [Hin62, pp.30–31]
8 i.e. a sequence of moves by one player such that, whatever moves the other player makes, the first is guaranteed to win.
The passage is interesting in that Hintikka explicitly says that such inferences are “clearly
inadmissible”. This is at odds with the literature accompanying many applications of
epistemic logic, which treats the problem of logical omniscience as an irksome feature that
can, in most cases, be ignored.9 But a system that generates genuinely inadmissible inferences is itself inadmissible. Hintikka comments further that “there need not be anything
nonsensical, irrational, or dishonest” [Hin62, p.31] in the following set of sentences:
1. a knows that p
2. p implies q
3. a doesn’t know that q
This is again at odds with a great deal of the current literature, which deems an agent
irrational if it fails to know all consequences of its knowledge. We will look at Hintikka’s
first reply to these difficulties in the following section (1.3); before that, let us continue our
look at the problem itself.
There are also a variety of forms of logical omniscience that (depending on the
formal framework) may be weaker conditions than that of full logical omniscience. The
list given in [vdHvLM99, p.140] includes:
• Knowledge of all valid formulae: If φ is valid, then any agent knows φ. This is often
called the problem of irrelevant beliefs.
• Closure under logical implication: If ψ is a logical consequence of φ and an agent knows
φ, then the agent also knows ψ. This differs from full logical omniscience, which
requires agents to know all consequences of a (possibly infinite) set Γ of known
formulae.
• Closure under logical equivalence: If φ is logically equivalent to ψ and an agent knows
φ, then it also knows ψ.
• Closure under valid implication: If φ → ψ is valid and an agent knows φ, then the agent
also knows ψ.
9 To pick one such application at random, Rao and Georgeff comment of their belief-desire-intentions framework that “like most possible-worlds formalisms, our logic suffers from the logical omniscience problem” but ignore the worry: “we adopt the more traditional modal logic semantics” [RG91, p.8].
• Closure under material implication: If an agent knows both φ and φ → ψ, then the agent
also knows ψ.
• Closure under conjunction: If an agent knows both φ and ψ, it also knows φ ∧ ψ.
The first three forms are implied by full logical omniscience, whereas the latter three depend
on the interpretation of the Boolean connectives. In the standard semantics presented
above, full logical omniscience implies all of these forms of logical omniscience and so all
forms of logical omniscience apply to the standard possible worlds semantics for epistemic
logic. There is a more general notion of logical omniscience, which we may term partial
logical omniscience:
Definition 5 (Partial logical omniscience) An agent is partially logically omniscient when it
knows infinitely many formulae if it knows any at all.
Partial logical omniscience can itself be a problem, even in the absence of full logical
omniscience. Suppose we reason about an agent with a limited amount of memory; we
want to know whether it can reach a certain conclusion before running out of memory,
or perhaps we want to investigate which strategy such an agent should use to make best
use of its available memory. A framework in which the agent appears to know an infinite
number of sentences may well give us the wrong answers, for we will not be able to tell
which the agent actually holds in its memory (its explicit beliefs) and which are attributed
through the closure property of the logic.
1.3 Hintikka’s First Response
1.3.1 Defensibility and Indefensibility
In response to the problems of logical omniscience raised above (which Hintikka raises
at [Hin62, §2.5], although he does not term the problem ‘logical omniscience’ there), Hintikka’s immediate response is to claim that the problem “does not go to show that our
rules [i.e. the syntactic rules governing the K operator, in particular, the (K) axioms and
the necessitation rule] are incorrect” [Hin62, p.31]. He goes on to comment that “[w]hat
it shows is that the notion which they define [entailments between attributions of knowledge] is unlike inconsistency . . . and should be carefully distinguished from it.” It should
be pointed out that, up to this point [Hin62, §2.1–§2.5], Hintikka has been explicating his
rules for knowledge in terms of the consistency or inconsistency of statements such as
It is possible, for all I know, that ¬p
with
I know that p1
I know that p2
...
I know that pk .
If the former sentence is inconsistent with the latter sentences (according to Hintikka’s
criteria), which it is whenever p follows logically from p1 , . . . , pk , then it also follows that
‘I do not know that p’ is also inconsistent with the latter sentences. But, as Hintikka
comments, “I may fail to know, in a perfectly good sense, that p is the case, for I may fail to
see that p follows from what I know” [Hin62, p.30]. Hintikka’s response to this is then to
withdraw the use of the terms ‘consistency’ and ‘inconsistency’ in relation to entailments
between attributions of knowledge, for “our terminology is not appropriate” [Hin62, p.31].
Instead, entailments of the form, ‘if Kp then Kq’ should in “typical cases” be viewed
as asserting “immunity to certain kinds of criticism” [Hin62, p.31, Hintikka’s emphasis]. The
thought is that, if someone can be shown that q follows logically “by means of some
argument which he would be willing to accept” from what he says he knows, then it would
be irrational of him to “persist in saying that he does not know whether q is the case”
[Hin62, p.31]. Hintikka proposes to use the terminology defensibility and indefensibility in
place of consistency and inconsistency; sentences are to be called self-sustaining instead of
valid. ‘Kp → Kq’ is thus self-sustaining whenever p → q is valid (given the usual semantics for →),
for it would be indefensible to claim knowledge of p but not of q.
Several comments are called for at this point. Hintikka claims that the new
notion of indefensibility is “a notion important enough to deserve serious study” [Hin62,
p.32], which time has shown to be true; yet he makes no comment as to whether the
resulting notion does indeed provide us with an analysis of “the way the verb “to know”
is actually used” [Hin62, p.30]. Hintikka has ruled out such an analysis when his rules
are interpreted in terms of consistency; but clearly, changing only part of the terminology
cannot overcome this problem. If the rules expressed in terms of consistency do not give
us an accurate analysis of how the verb “to know” is actually used, then neither will those
same rules when interpreted in terms of defensibility, immunity to criticism or any other
notion, for that matter.
The implication seems to be that the entire terminology prior to [Hin62, §2.6] has to
be thrown out, including the reading of ‘Kq’ as ‘the agent knows that q’. The new reading of
‘Kq’ should be something like: ‘¬Kq is indefensible (given what the agent already knows)’,
meaning that “he would have come to know that q all by himself if he had followed far
enough the consequences of what he already knew” [Hin62, p.31]. Even if we accept this
somewhat vague reading, it can still give us the wrong results in certain cases. Consider
an agent a and suppose that φ is some extremely long tautology such that, given that each
human has a finite cognitive capacity and lives for only so many years,
it would be physically impossible for anyone to ascertain its validity. Ka φ is entailed by
Hintikka’s rules, yet it is certainly not true that a “would have come to know that [φ] all
by himself if he had followed far enough the consequences of what he already knew”.
Should we say that ‘Ka φ’ is a self-sustaining sentence—that ‘¬Ka φ’ is indefensible? If not,
then Hintikka’s rules again appear to be “clearly inadmissible”. If we do say that ‘¬Ka φ’ is
indefensible, then we are left without a clear sense of what ‘indefensible’ means.
Suppose we try another of Hintikka’s elucidations of the concept: ‘¬Ka p’ is indefensible when a can be shown that p follows from what he already knows “by means
of some argument which he would be willing to accept”. ‘Ka p → Ka q’ is self-sustaining
whenever there is such an argument showing that q follows from p (and in particular,
‘Ka q’ is self-sustaining whenever there is a valid argument for q that requires no premises).
Let Φ be a finite set of sentences φ1 , . . . , φn and ‘Ka Φ’ mean Ka φ1 ∧ · · · ∧ Ka φn . Hintikka’s
justification of his rules in terms of defensibility and indefensibility may be put as follows:
• {Ka Φ, ¬Ka φ} is indefensible if and only if there is an argument that a would be willing
to accept which shows that φ follows from Φ.
• Ka Φ entails Ka φ if and only if there is an argument which a would be willing to accept
that shows that φ follows from Φ.
The trouble is, this last claim just isn’t true, so long as ‘entails’ is taken to mean
entailment according to Hintikka’s rules10 or entailment in K.11 The definition given is neither
necessary nor sufficient to account for all K-entailments. It is not necessary because, for
certain φ, Ka φ is entailed by Ka Φ even though there is no argument that a would accept
which shows that φ follows from Φ. For one such example, again assume that a is an agent
with a bounded cognitive capacity and take φ to be so complex that no argument for its
truth could be accepted by a. Alternatively, take a to be a fully signed-up constructive
(or relevance) logician and ‘φ’ to be a statement with a classical but no constructive (or
relevant) proof. Then no argument (classical proof) for φ will be acceptable to a, yet Ka Φ
K-entails Ka φ (remember, K has been defined over classical logic).12
The definition given is not sufficient because there are arguments that show that
φ does follow from Φ, which a would accept, even though Ka Φ does not K-entail Ka φ.
Suppose a is a reasonable agent who knows the meanings of colour predicates, take Φ to be
facts about the meanings of colour predicates and let φ be the statement
Nothing that is red all over can, at the same time and in the same way, be green all
over.
This is a necessary yet not a logical truth (and so Ka φ is not self-sustaining). Only a very
simple argument is required to convince a of φ, given Φ, yet Ka φ is not K-entailed by Ka Φ,
unless one adopts the view that the meaning of ‘red’ somehow excludes red objects being
any other colour (this is surely an implausible view; for it seems one would have to know an
infinite number of exclusions before mastering what ‘red’ means). Again, it will be objected
that what is needed to deal with such knowledge is an epistemic logic supplemented with
axioms about colour exclusion; but these hardly seem a necessary component of a logical
account of knowledge. It is evident that the justification of the axioms of an epistemic logic
should be independent of a theory of colour exclusion, contrary to what the explanation
offered by Hintikka implies.13
10 Hintikka calls such entailments virtual entailments.
11 Hintikka endorses the stronger logic S4 but, since all the problems of logical omniscience are present in K, his proposed solution should be satisfactory for K if it is for any logic.
12 Of course, it will be objected that a constructive logician’s knowledge should be modelled using constructive epistemic logic and a relevance logician’s in relevant epistemic logic. This may be so but misses my point: there can be no general definition of ‘entails’ (and so none of ‘indefensible’) in terms of acceptable arguments. What is indefensible for one agent may be perfectly defensible for another.
13 If an acceptable argument is simply one whose conclusion could not possibly be false whilst its premises are true, then the situation is even worse: any argument whose conclusion is a necessary truth will then be acceptable, so that Hintikka’s justification entails that ‘Ka φ’ whenever ‘φ’ is a necessary truth. Agents modelled in this way are then conceptually as well as logically omniscient, for conceptual truths such as ‘all bachelors are unmarried’ are, of course, necessary.
To make the definition acceptable—to find an acceptable explanation of entailment
between statements of the form ‘Ka φ’—the following amendments are required. Firstly, to
overcome the problem of the cognitive limitations of the agent, the requirement that the
argument must be capable of being accepted by the agent must go. Secondly, there must
be a universal standard of when an argument is acceptable within the theory—i.e. in a
classical epistemic logic, classical arguments must be acceptable to each agent. But these
two amendments amount to dropping the psychological (or cognitive) requirement of
acceptability completely. The definition thus becomes:
Ka Φ entails Ka φ if and only if there exists an argument which shows that φ follows
from Φ.
This definition is necessary but still not sufficient for ‘entailment’ read as K-entailment, as
the above argument from colour exclusion shows. The final amendment is to insist that the
argument mentioned must be a wholly logical one, so as to exclude any argument whose
conclusion is a necessary but not a logical truth. We can then replace ‘argument’ with
‘proof’ (in some logic Λ). The final version of the definition then becomes:
Ka Φ entails Ka φ (in a logic based on Λ) if and only if there exists a proof of φ from Φ
in Λ.
Recall that the purpose of this definition was to provide a notion that would both make the
rules of Hintikka’s logic acceptable and explain why, for example, Ka φ → Ka ψ holds whenever
φ → ψ is valid. The answer now given is that it is because ψ can be proved from φ, i.e. φ → ψ is
valid in the logic in question; but this is clearly no explanation at all. It does not provide
us with any reason to believe Hintikka’s rules if we did not already believe them. If we
ask, why should we believe that Ka φ follows from Ka Φ whenever φ follows from Φ? the
answer will be: because φ follows from Φ.
Hintikka may maintain the reading of ‘K’ in terms of indefensibility but, as we
have seen, saying that {Ka Φ, ¬Ka φ} is indefensible says no more than that φ is logically
entailed by Φ. It fails to explain why indefensibility has anything to do with knowledge.
It also fails to explain why the denial of a necessary truth about colour exclusion shouldn’t
be considered indefensible, as the term ‘indefensible’ suggests.
1.3.2 Scepticism
There is a further objection to the proposed reading of ‘Ka p’ that comes from a rather
different quarter. Suppose we ignore the above considerations and take Hintikka’s original
definition of indefensibility:
{Ka Φ, ¬Ka φ} is indefensible if and only if there is an argument that a would be willing
to accept which shows that φ follows from Φ.
Consider a reasonable agent a and let p be the sentence ‘a has two hands’.14 As Moore
[Moo39] points out, if an agent really does know that she has two hands, then she can
also rule out various kinds of radical scepticism of the kind discussed by Descartes.15 Let
us consider the following sceptical scenario, discussed originally by Nozick and Putnam
[Put77, Noz81, Put81]. A brain is suspended in a vat of nutrients, its nerve endings
connected to a supercomputer capable of stimulating them in precisely the way my central
nervous system would when in contact with the external world. The setup is such that the
mental states generated by the brain are such that they are qualitatively indistinguishable
from mental states generated by a brain connected to the physical world via the usual
bodily arrangement. Given that there is nothing about my experiences that shows that my
mental states are not so generated, how can I say that my brain is not a brain in a vat?16
The barest reflection, which a would surely accept, shows that ‘a has two hands’
(p) entails ‘a is not a brain in a vat’ (q). Then, by definition, {Ka p, ¬Ka q} is indefensible,
hence Ka p entails Ka q. But, the sceptic maintains, a cannot know that she is not a brain in
a vat, ¬Ka q hence, by modus tollens,17 a cannot know that p. But then Hintikka’s notion of
knowledge is subject to radical scepticism about the external world: an agent cannot have
knowledge about the external world which, if true, would entail that she is not a brain in a
vat. Note that this applies to agents situated in physical, non-vat reality, as we assume that
we all are, as well as to brains in vats. Knowledge of the external world, claims the sceptic,
is just not possible, in which case there then seems little point in formulating a logic of
knowledge in the first place.
14. G.E. Moore’s example, [Moo39].
15. First meditation, [CSM85].
16. The thrust of such examples is that, if I cannot rule out such possibilities, how can I even know that I am currently sitting in my house, typing? [Noz81, p.167]
17. Note that modus tollens is a valid rule of inference even with ‘entails’ read as Hintikka’s notion of virtual entailment, because Hintikka’s logic includes classical propositional logic.
It is worth taking the time to see why this problem arises. The sceptic needs
to assume that knowledge is closed under known consequence: that is, whenever an agent
a knows that p and knows that p entails q, a also knows that q.18 Assuming the usual
semantics for →, the principle can be expressed as:
Ka p ∧ Ka (p → q) → Ka q
It is easily seen that this principle follows from full logical omniscience.19 Hence, any epistemic logic that includes all instances of the distribution axiom scheme will be vulnerable
to radical knowledge scepticism.
1.3.3 Idealized knowledge
Before moving on to look at other responses to the logical omniscience problem, it is
worthwhile looking at claims that logical omniscience is an acceptable property of (certain) epistemic logics in certain situations. The claim is that, although such logics may yield unintuitive and disconcerting conclusions, this does not affect their applicability to practical situations. For
example, Fagin, Halpern and Vardi comment that
logical omniscience is not a problem under some conditions (this is true in particular for interpretations of knowledge that are often appropriate for analyzing
distributed systems [Hal87] and certain AI systems [RK86]) [FHV90, p.41].20
In what sense is logical omniscience (which gives rise to conclusions that Hintikka himself termed “inadmissible”) “not a problem”? It is not that the agents being modelled in such cases are in any way better at discovering the consequences of their knowledge than those
considered by Hintikka. Every real-world agent has some kind of resource bound or other
and so cannot possibly know all the consequences of what it knows.
Instead, it is the model of the agent that is idealized. In order to model any phenomenon, certain abstractions and idealizations have to be made. One criterion of a good model is that the idealizations made in the model allow accurate conclusions, of the type the modeller is interested in, to be drawn. The most charitable interpretation of standard
18. [Noz81, pp.204–209]. See [Dan85, pp.10–11] for a discussion.
19. In the possible worlds semantics: for any world w, w ⊨ Ka p and w ⊨ Ka (p → q). Then for each world w′ such that Rww′, we have w′ ⊨ p and w′ ⊨ p → q, hence w′ ⊨ q and so w ⊨ Ka q.
20. They go on to say that “it is certainly not appropriate to the extent that we want to model resource-bounded agents.” [FHV90, p.41]
epistemic logic, therefore, is in terms of idealized models of real agents’ knowledge. This
latter notion can itself be interpreted in different ways:
on the one hand, one might say that the concept of knowledge one is modelling is knowledge in the ordinary sense, but that the theory is intended to
apply only to idealized knowers—those with superhuman logical capacities.
Alternatively, one might say that the theory is intended to model an idealized
sense of knowledge—the information that is implicit in one’s knowledge—that
literally applies to ordinary knowers [Sta06].
In a similar vein, Vardi comments that
[o]ne can accept this situation [logical omniscience] by endowing the epistemic
notions with new pre-systematic interpretations. For example, one can restrict
oneself to idealized agent with unbounded reasoning power [Moo85], or one
can reinterpret knowledge and belief to be implicit rather than explicit, i.e., a
believes p if p follows from a’s explicit beliefs [Var86, p.294].21
However, as Vardi continues: “this leaves us in want of a precise treatment of knowledge and belief in the customary senses”—i.e. of explicit, rather than implicit, knowledge and belief.22 Moreover, it leaves the correctness of the logic dependent on a contingency.
Which idealizations are appropriate in a model of an agent? Well, of course, that depends
on what one wants of the model, what its purpose is. Wooldridge [Woo95] argues that the
possible worlds framework is not useful in practice unless the worlds are given a concrete
interpretation—“grounded”, in his terminology.
Whether or not standard epistemic logic is an acceptable tool depends on how far the use of ‘knowledge’ can be stretched. This is perfectly acceptable—standard epistemic logic has proved successful in many applications in computer science and AI—yet one can’t help feeling that, were there an epistemic logic just as useful as Hintikka’s but without the logical omniscience property, it would immediately be preferable. I hope to set the
groundwork for such a logic in chapters 5–7.
There is one interpretation of standard epistemic logic that should at all costs be
avoided: that it captures the notion of what one should know (given what one actually does
know).23 Such a view might set out to justify the relation between explicit and implicit
knowledge (the latter being that captured by standard epistemic logic): one implicitly
knows what one should explicitly know (given one’s explicit knowledge).
21. See also [HM90, Lev84, RP85].
22. See the discussion in section 1.2 above.
23. Dennett [Den87] hints at a view along these lines.
But how could such a thought ever be justified? In what sense should one believe
the consequences of one’s knowledge?24 The thought might be run as follows. In believing
that p, one should believe whatever couldn’t be false when p is true. A consequence of
this approach is that one should always believe all necessary truths.25 There is a sense in
which mathematicians in the past should have believed Fermat’s last theorem to be true,
namely that it is true. But this same sense can hardly be used as a characterization of what
those mathematicians did believe. Mathematicians in the past were perfectly entitled to
withhold their judgement on the matter of Fermat’s last theorem until a suitable proof had
been found.
24. It makes more sense to talk about what one should believe, as opposed to what one should know, because belief, unlike knowledge, is largely a matter of choice. One chooses, for the most part, what one believes, but not what one knows.
25. This objection is somewhat similar to that raised above against Hintikka’s response in terms of defensibility (section 1.3, especially footnote 13).
Chapter 2
Responses to Logical Omniscience
2.1 Impossible Possible Worlds
2.1.1 Hintikka’s Problem
Hintikka later responds to the problem of logical omniscience (although not to the issues
raised above) in [Hin75], drawing on notions from [Ran75] and [Hin73b]. He casts the
problem of logical omniscience as follows:
1. ‘a knows that φ’ is true at w iff φ is true at every world epistemically accessible from
w;
2. There are a, φ, ψ such that a knows that φ, φ logically implies ψ and yet a does not
know that ψ;
3. A sentence is logically true iff it is true at every possible world;
4. Every epistemically possible world (and so every world epistemically accessible from
any world) is logically possible.
(1-4) are clearly inconsistent; I call this Hintikka’s problem. Hintikka immediately argues
that (2) is not the culprit [Hin75, p.476]—that is, there really are such sentences, so related.
The argument that (2) is false might run as follows. The conclusions of any valid argument
are already contained in its premises; the process of logical deduction is not one that is
capable of providing one with new information. Hence, the logical consequences of one’s
knowledge should be included in what one says one knows, such that no one could know
φ without also knowing ψ when the latter follows from the former. But, of course, new
information is gained through the process of logical deduction: one can hardly claim that
upon finding, say, the first proof of Fermat’s Last Theorem, the worldwide mathematical
community did not learn anything new.1 This is why some logical results are surprising.
2.1.2 Hintikka’s 1975 Response
Instead, Hintikka proposes to reject (4) and claim that not all epistemically possible worlds
are logically possible: “the source of the trouble is obviously the last assumption (4) which
is usually made tacitly, maybe even unwittingly. It is what prejudices the case in favour
of logical omniscience” [Hin75, p.476]. Hintikka’s reason for supposing that epistemically
possible worlds need not be logically possible is as follows.
Just because people . . . may fail to follow the logical consequences of what they
know ad infinitum, they may have to keep a logical eye on options which only
look possible but which contain hidden contradictions [Hin75, p.476].
The worlds that are epistemically possible for a from a world w, then, should not be
thought of as giving us the possibilities left open by what a knows at w; instead, they
should give us the apparent possibilities—apparent, that is, given a’s ability to “follow the
logical consequences of what [he] know[s]”. Hintikka devotes the remainder of his article to describing such “impossible possible worlds” [Hin75, pp.477–483].2
There are two questions to be answered. Firstly, what are ‘impossible’ possible
worlds? and secondly, why would an agent consider such worlds possible, for all she knows?
Hintikka’s response to the first question is that “they are worlds so subtly inconsistent that the inconsistency could not be known (perceived) by an everyday logician, however competent” [Hin75, p.478]. To explain how this is so, a short detour must be taken. Hintikka
expounds the notion in the setting of first-order rather than propositional logic, so some
deviation from the logic described up to this point will be necessary. I will, however, return
to the propositional case shortly.
1. I discuss a number of cases that support the idea of mathematical and logical information—i.e. of mathematical or logical results providing an agent with genuinely new information—in [Jag06a].
2. The terminology ‘impossible possible worlds’ could not have been designed more effectively to annoy. Such worlds are called nonclassical in [Cre72, Cre73] and nonstandard in [RB79]. Levesque claims a different methodology in [Lev84], using a notion of a situation (although Levesque’s situations are remarkably similar to Cresswell’s nonclassical worlds [Cre72, Cre73]).
Following Hintikka [Hin73b] and [Hin73a], quantified formulae may be interpreted in a game theoretic way.3 The world is viewed as an urn4 from which individuals
may be drawn by the two players, called ‘∀’ and ‘∃’. In a game G of the form G[∀xφ(x)],
player ∀ must pick an individual from the urn satisfying φ; if she picks individual a, the
game continues as G[a]. Similarly, the game G[∃xφ(x)] requires ∃ to pick an individual a
satisfying φ and continues as G[a]. Analogously, ∀ decides whether G[φ ∧ ψ] proceeds as
G[φ] or as G[ψ] whereas ∃ decides how G[φ∨ψ] should proceed. The game G[¬φ] proceeds
as the inverse of the game G[φ], in which the players swap rôles.
In this way, nested quantifiers represent constraints on sequences of draws from
the urn. Just as in elementary probability theory, individuals may or may not be replaced
after being drawn from the urn. Models in which individuals are drawn and then replaced
are called invariant models; those without replacement are the variant models. Invariant
models correspond to the classical interpretation of the quantifiers (an individual cannot
disappear from a model whilst a sentence is evaluated there), whereas variant models
correspond to what Hintikka terms the ‘exclusive’ interpretation of the quantifiers.
Now, from a classical point of view, all and only the invariant models count
as genuine possible worlds. Hintikka’s main idea is that, given a formula of a certain game-theoretic complexity, there is a subset of variant models “which vary so subtly as to be indistinguishable from invariant ones at a certain level of logical analysis” [Hin75, p.483]. Hintikka’s notion of ‘logical analysis’ is related to the maximum number of nested
quantifiers found in the sentence to be evaluated in playing the game: this is the depth d
of the formula. For any finite sequence of d draws from the domain of the model, models
that are in fact variant will behave as invariant models with respect to formulae of depth
d. They will agree with the classical models on the truth values of sentences that require no more than that number of draws to evaluate:5
in [a sentence] p [of depth d] we are considering at most d successive draws of
individuals from the model that is supposed to make p true or false. Hence the
question as to whether a person a who knows that p has to know also a certain
logical consequence q of p is naturally discussed by reference to . . . sequences
of at most d draws of individuals from the domain. This many draws he will
3. Hintikka develops ideas found in [Hen61] and [Pei92].
4. The terminology is taken from probability theory.
5. This is a semantic approach to epistemic possibility. Hintikka also re-casts the account in syntactic terms, using the notion of a surface contradiction, such that “only surface tautologies must be known by everybody . . . ‘everybody’ of course means ‘everybody who understands the propositions in question’” [Hin75, p.483].
have to consider in spelling out to himself what p means, whereas there is no
logically binding reason why he should consider sequences of draws of any
greater length. [Hin75, p.482]
In order to investigate these notions in more detail, a few of the technical details
are needed. An important notion is that of an urn model, introduced by Rantala in [Ran75].
Definition 6 (Urn sequence) Let D be a domain of individuals. An urn sequence ∆ is a countable sequence ⟨Di | i ∈ ω⟩ where D1 = D and, for i ≥ 1, Di ⊆ D^i (the ith Cartesian power of the domain D) such that:
⟨a1 · · · ai⟩ ∈ Di iff ∃a′ ∈ D : ⟨a1 · · · ai a′⟩ ∈ Di+1
Definition 7 (Urn model) Let L be a first-order language and M be a first-order structure whose domain is D, assigning an element of D to each constant and a set of n-tuples to each n-ary relation letter of L. Then an urn model M is a pair ⟨M, ∆⟩, where ∆ is an urn sequence ⟨D1 D2 · · ·⟩ with D1 = D.
Each Di in ∆ is a set of sequences of length i.6 Di+1 can then be built from Di as follows. For each sequence σ ∈ Di, choose an individual a ∈ D and add the sequence σ with a appended to Di+1. Di+1 is the smallest set constructed in this way. For example, if D = {a, b}, then each Di will contain sequences of as and bs of length i. The first three elements of two possible urn sequences ∆ and ∆′ are shown in figure 2.1. In the first, the ith element of ∆ is just D^i, the ith Cartesian power of the domain D; but in the second, D′3 is a proper subset of D3. Each Di can be thought of as containing the sequences of individuals that may have been drawn from the urn, in order, in i draws. Since D′3 (in ∆′) is a proper subset of D3 (in ∆) in the example, ∆′ represents an urn with fewer choices after the second draw than ∆. Intuitively, fewer individuals are available in the urn modelled by ∆′ at the third draw than in the urn modelled by ∆.
The notion of an individual being available at a particular draw i can be made precise as follows. If i = 1, then all individuals in the domain are available; otherwise, an individual ai is available at i iff there is a sequence σ = ⟨a1 · · · ai−1 ai⟩ ∈ Di. For i > 1, the set δi of individuals from which to choose at draw i in an urn model M = ⟨M, ∆⟩ is then
δi = {ai | ∃a1 · · · ∃ai−1 : ⟨a1 · · · ai−1 ai⟩ ∈ Di}
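By way of illustration only (the representation below is mine, not Rantala’s), a minimal Python sketch can compute the availability sets δi from a finite initial segment of an urn sequence and read off the largest d for which that segment behaves invariantly; the domain {a, b} and the sequence ∆′ are those of the running example and of figure 2.1 below.

D = {'a', 'b'}

# A finite initial segment of the urn sequence Delta' of figure 2.1: the ith entry
# is D_i, a set of length-i sequences (length-1 sequences are identified with the
# individuals they contain, as in footnote 6).
Delta_prime = [
    {('a',), ('b',)},
    {('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')},
    {('a', 'a', 'b'), ('a', 'b', 'b'), ('b', 'a', 'b'), ('b', 'b', 'b')},
]

def delta(i, urn):
    # Individuals available at draw i: the last elements of the sequences in D_i.
    return {seq[-1] for seq in urn[i - 1]}

def invariance_bound(urn, domain):
    # Largest d (within this segment) such that delta_1 = ... = delta_d = domain,
    # i.e. the segment is d-invariant.
    d = 0
    for i in range(1, len(urn) + 1):
        if delta(i, urn) != domain:
            break
        d = i
    return d

print([delta(i, Delta_prime) for i in (1, 2, 3)])   # delta_1 and delta_2 are {a, b}; delta_3 is {b}
print(invariance_bound(Delta_prime, D))             # 2: the segment is 2-invariant but not 3-invariant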
6. This is only strictly true for i > 1; when i = 1, Di contains individuals, not sequences. But it does not hurt to identify sequences of length 1 with the individual they contain if the type signature of the sequences is not vital.
Figure 2.1: Possible urn sequences
∆ = {a, b}, {⟨aa⟩, ⟨ab⟩, ⟨ba⟩, ⟨bb⟩}, {⟨aaa⟩, ⟨aab⟩, ⟨aba⟩, ⟨abb⟩, ⟨baa⟩, ⟨bab⟩, ⟨bba⟩, ⟨bbb⟩}, · · ·
∆′ = {a, b}, {⟨aa⟩, ⟨ab⟩, ⟨ba⟩, ⟨bb⟩}, {⟨aab⟩, ⟨abb⟩, ⟨bab⟩, ⟨bbb⟩}, · · ·
Note that each δi ⊆ D. δ1 = D, as expected. An invariant model is then simply one in
which δi = δi+1 , for each i ∈ ω:
Definition 8 (Invariant and changing models) Let M be a first-order structure with domain D and M = ⟨M, ⟨D1 D2 · · ·⟩⟩ be an urn model. M is said to be invariant iff, for every i ∈ ω, δi = D. Else, if there exists some i ∈ ω such that δi ⊂ D, then M is a changing model.
It follows immediately that there can be only one invariant urn model M over each first-order structure M. In the case of a changing model, imagine that the urn has a mechanism
that may remove a number of balls (individuals) in between each draw, and may replace
some or all of these balls later on. The first draw is always made with the full stock of balls
and the mechanism can never remove all of the balls.7 In the example in figure 2.1, ∆′ is
a changing model: a is available for the first two draws, but then cannot be drawn on the
third.
The interesting feature of changing models is that, within a certain number of
draws, they behave just like invariant models. Suppose an urn model M over M and D is
a changing model, so for some i, δi ⊂ D. Take the least such i; then D1 · · · Di−1 is identical to the initial segment of length i − 1 of the invariant urn model over M. This means that M will agree with the corresponding invariant model on the truth value of all formulae of depth < i (where the depth of a formula is the maximum number of nested quantifiers in it).
An urn model M satisfies a formula φ, written M |≈ φ, when ∃ has a winning
strategy in M for φ.8 Let us take as our example the sentence ‘someone knows everyone’,
7. The existential condition on sequences in Di+1 ensures that each δi is nonempty.
8. A standard recursive definition of ‘|≈’ might proceed as follows. First, define satisfaction-at-draw-i:
M |≈i R^n(c1, . . . , cn) iff ⟨c1^M · · · cn^M⟩ ∈ R^M and {c1^M, . . . , cn^M} ⊆ δi
i.e. the standard base clause with the proviso that the required individuals are available for selection at draw i. The clauses for Booleans are then perfectly standard. Next, M |≈i ∃xφ(x) iff there is an individual a ∈ δi such that M |≈i φ(c) for c^M = a. Finally, M |≈ φ iff M |≈i φ for some i.
formalized as ∃x∀yRxy. Let a model M with domain {a, b} assign R^M = {⟨aa⟩, ⟨ab⟩, ⟨bb⟩} and let M = ⟨M, ∆⟩ and M′ = ⟨M, ∆′⟩ where ∆, ∆′ are as in figure 2.1. It is clear that whether the game is played in M or M′, ∃ has a winning strategy in picking a. Thus, both M |≈ ∃x∀yRxy and M′ |≈ ∃x∀yRxy.
Now change the example to ‘everyone knows someone who knows everyone’, ∀x∃y∀z(Rxy ∧ Ryz). Now, ∃ only has a winning strategy in M′, for here ∀ is forced to draw b on the third draw. In M, ∀ has a winning strategy in drawing b first of all, for b doesn’t know anyone who knows a. Hence, M′ |≈ ∀x∃y∀z(Rxy ∧ Ryz) but not M |≈ ∀x∃y∀z(Rxy ∧ Ryz). The reason that both urn models agree on the former but disagree on the latter sentence is that the former has depth 2, the latter depth 3, and δ1 = δ2 ⊃ δ3 in M′.
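These verdicts can be checked mechanically. The following minimal Python sketch (illustrative only; it simply hard-codes the two quantifier prefixes as alternating draws from the availability sets δ1, δ2, δ3) reproduces them for the relation R and the two urn sequences of the example.

R = {('a', 'a'), ('a', 'b'), ('b', 'b')}                  # Rxy: x knows y

delta_M       = [{'a', 'b'}, {'a', 'b'}, {'a', 'b'}]      # the invariant model M
delta_M_prime = [{'a', 'b'}, {'a', 'b'}, {'b'}]           # the changing model M'

def someone_knows_everyone(delta):
    # Ex Ay Rxy: an exists-draw followed by a forall-draw.
    return any(all((x, y) in R for y in delta[1]) for x in delta[0])

def everyone_knows_a_universal_knower(delta):
    # Ax Ey Az (Rxy & Ryz): forall-draw, exists-draw, forall-draw.
    return all(any(all((x, y) in R and (y, z) in R for z in delta[2])
                   for y in delta[1])
               for x in delta[0])

for name, d in (('M', delta_M), ("M'", delta_M_prime)):
    print(name, someone_knows_everyone(d), everyone_knows_a_universal_knower(d))
# M satisfies the first sentence but not the second; M' satisfies both.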
It is in some ways surprising that there can be a model of ∀x∃y∀z(Rxy ∧ Ryz)
when R is so defined; this is the motivation behind calling changing urn models impossible
possible worlds in the modal setting. Which of these worlds will count as epistemically
possible for a given agent from a certain world w? Here Hintikka’s notion of logical
competence comes into play. It is only when considering formulae of depth ≥ 3 that M′
can be seen to be a changing model. If an agent’s competence does not extend to formulae
of depth 3 or greater then urn models such as M′ may well seem possible, i.e. be epistemic
possibilities for that agent. In general, epistemic possibility is dependent on the notion of
d-invariance, where d represents the number of draws from the urn required to evaluate a
formula.
Definition 9 (d-invariant urn models) Let M = ⟨M, ⟨D1 D2 · · ·⟩⟩ be an urn model as above such that D1 = δ1 = · · · = δd. Then M is said to be d-invariant.
Clearly, d-invariant urn models are d′ -invariant for all d′ ≤ d and all invariant models are
d-invariant for all d ∈ ω. Each changing model must be d-invariant for some d, the lower
limit being d = 1. d-invariant models behave as invariant models for all formulae of depth
≤ d. Whereas all classically valid formulae are satisfied by all invariant worlds, a valid
formula of depth d need not be satisfied at a d′ -invariant model, for d′ < d.
By considering a changing but d-invariant model to be invariant (i.e. possible), an
agent thus avoids knowing all classically valid formulae. In the same way, that agent will
not know all consequences of what it knows. Instead, agents must know all d-consequences
of what they know, where ψ is a d-consequence of φ iff all d-invariant urn models that satisfy
φ also satisfy ψ. In this way, Hintikka’s solution to what I have termed Hintikka’s problem
is to replace 4 by:
4′ . All epistemically possible worlds are (described by) urn models.
There are two problems to be addressed here: firstly, what happens in cases when quantifiers are ignored? And secondly, does the notion of closure of knowledge under d-consequence fare better than the discredited general principle of closure under consequence?
The first problem encountered is more or less a technical one. Urn models can
only be defined relative to a nonempty domain in a first-order structure, so how can their
use as a tool in propositional epistemic logic be judged? Given the urn models methodology,
the intuitive answer should be: any agent that knows anything at all should know all
tautologies automatically; and indeed this can be shown to be a formal consequence of
Rantala’s semantics: |≈ φ whenever φ is a propositional tautology [Ran75, p.466, theorem
1]. Hence, every agent automatically knows all propositional tautologies; but there is no
reason to suppose that this is the case.
Secondly, it is not clear that talking of an agent’s logical competence in terms of
the depth of formulae is accurate or helpful in a discussion of knowledge. Suppose I know
that everyone knows someone who knows everyone, formalized as ∀x∃y∀z(Rxy ∧ Ryz). It
does not seem to follow that I know all logical consequences of my knowledge that may be
expressed in formulae of depth 3 or less. The problem becomes more acute when we consider a mathematician going through the process of proving some theorem, expressed as a formula
φ of depth 100, say. To begin with, she does not know φ and let us further suppose she
does not know that ψ (of depth 99) either, even though ψ follows from what she does know.
φ and ψ may be completely unrelated. Now, when the mathematician proves and thus
comes to know that φ, all worlds she considers possible must be 100-invariant (else she
may consider some world that does not satisfy φ to be possible; but then she could not be
said to know φ). Consequently, in coming to know that φ, she also comes to know ψ, for
all worlds that the mathematician considers possible are 99-invariant, hence all satisfy ψ.
Again, we have very little reason to suppose this to be the case; certainly, it would
go against the way ‘to know’ is commonly used. It can often happen that a mathematician
finds a proof for a complex theorem but fails to draw the (less complex) corollaries. Suppose
ψ is an important open problem and a mathematician proves a very complicated theorem
φ, of which ψ is a moderately simple corollary. The scenario would most naturally be
described by saying: she knew that the theorem φ was true, but did not know that ψ was a
corollary of it. But it is this very kind of scenario that is ruled out by the Hintikka-Rantala
account of impossible epistemic possibilities in terms of changing urn models.
2.2 Nonclassical Worlds
In this section, logics that attempt to avoid logical omniscience whilst remaining within
Hintikka’s possible worlds framework are reviewed.
2.2.1 Cresswell’s Nonclassical Worlds
Max Cresswell [Cre73] develops an account that has proved popular in the subsequent
literature on propositional attitudes and logical omniscience. Cresswell clearly shares the
motivation behind calling logical omniscience a problem for theories of belief and knowledge:
there is no reason why someone should not take a different propositional attitude (belief, say) to two propositions that are logically equivalent. And when
a mathematician discovers the truth of a mathematical principle he does not
thereby discover the truth of all mathematical principles [Cre73, p.40].
His suggestion is the partitioning of worlds into two disjoint classes: the classical and the
nonclassical.9 Logically equivalent propositions have the same truth-value at all classical
worlds—or, to speak (as Cresswell does) of propositions as sets of worlds, they contain
the same classical worlds. But they need not contain (or receive the same truth-value at)
precisely the same nonclassical worlds. Now, nonclassical worlds are ones that do not obey
all the usual logical laws so that not all logical truths are true at such worlds. Consequently,
logically equivalent propositions p and q can be distinguished in attitude ascriptions. They are distinguished by nonclassical worlds, for example, a world at which p is true but q false. Thus, knowledge and belief can still be analyzed as truth in all epistemically accessible worlds, while including nonclassical worlds in an agent’s epistemic accessibility
relation. In a similar way, agents can be modelled as having knowledge without automatically knowing all consequences of that knowledge, by allowing nonclassical worlds into
the epistemic accessibility relations of those agents.
9. Cresswell coins the term ‘nonclassical’ at [Cre70, p.354], but acknowledges a similar notion due to Richard Montague, under the label ‘designated points of reference’ [Mon70, p.382].
The advantage of this manoeuvre is that the satisfaction clause for sentences of the
form Bφ (or Kφ) remains unchanged, but an agent is now allowed to consider impossible
worlds to be possible. The definition of what it is for a formula to be valid in this logic is
restricted to the classical worlds, so that the valid formulae within this logic coincide with
the classically valid formulae. It is in this respect that the defender of this approach can
claim that an agent modelled in this way need not believe every valid formula.
There are several questions raised by this approach. Just what are nonclassical
worlds? and are they the right logical tool to overcome the problem? Are they even the
correct conceptual tool to use in analyzing knowledge and belief? We shall deal with each
of these questions in turn, beginning with the nature of nonclassical (or impossible) worlds.
Just what kind of being do they have?
Carnap introduces the notion of a possible world as a state description [Car47, p.9],
a set of atomic sentences each of which has a definite truth value independently of any
other atomic sentence.10 Worlds constructed from sentences are often called ersatz worlds. It
is easy to see how this conception could be modified to accommodate nonclassical worlds:
we simply modify the rules for the assignments of truth values to atomic sentences. Since
worlds are viewed as sets of sentences, classical and nonclassical worlds are ontologically
on a par.
However, Cresswell believes that the nonclassical worlds approach can be made
to work even if one denies that possible worlds may be analyzed as sets of sentences:
if possible worlds are taken as primitive then there is nothing to stop us from
taking a subset and saying that these are the ones that are genuinely possible
worlds; the others are in some sense impossible. [Cre73, p.40]
Following the work of Kripke and others on the semantics of modal logic in the 1960s, the
tendency now (at least amongst modal logicians) is to view possible worlds as primitive
elements of the theory. The question is, how can a primitive entity, conceived of as a possible
world, be at the same time nonclassical? Cresswell’s answer is that the nonclassical worlds
are not worlds at which the impossible happens: they are not genuine impossible worlds.
Rather, they are worlds at which the connectives have a nonstandard meaning. There may
be worlds at which, for example, ‘¬’ has an intuitionistic11 or paraconsistent meaning. So a
10. Carnap is appealing to Wittgenstein’s notion of elementary propositions in the Tractatus at §5.3: “Every proposition is the result of truth operations on elementary propositions” and §5.134: “One elementary proposition cannot be deduced from another” [Wit22].
11. This is Cresswell’s example, supposed to explain why double negation can fail at nonclassical worlds.
sentence, which under the classical interpretation of its constants would be a contradiction,
can be true at nonclassical worlds where those connectives receive a different interpretation:
The fact that we can reinterpret [∧] and [¬] so that [φ ∧ ¬φ] is true in a possible
world no more shews us how a contradiction could ever be true than calling
birds ‘pigs’ shews us how pigs could fly. [Cre73, p.41]
This kind of reasoning is confused. Cresswell talks of the meaning of the connectives in terms of “meaning rules”, i.e. in semantic, truth-functional terms, such that it is
correct to talk of a connective such as ‘¬’ denoting a particular truth-function, e.g. classical
negation. The idea then is that the denotation of connectives may alter from world to world.
The classical worlds are then those at which each such symbol, say ‘¬’, denotes the usual classical function. On this view, we might call the connective symbols epistemically non-rigid. That is, in epistemic contexts, symbols such as ‘¬’ act as non-rigid designators. Now, this attempts to explain the epistemic possibility of some logical falsehood, φ, for some agent, by claiming
that the connectives contained in φ are epistemically non-rigid. But this is implausible
for several reasons. Firstly, mathematical functions, including truth functions, are abstract
objects and so do not exist at worlds, as part of worlds. Possible worlds are constituted by
possible states of affairs. Mathematical being is, on the other hand, world-independent.
Now, what we analyze at worlds when we analyze the truth of a sentence is not
the words that constitute that sentence. Instead, we analyze what the sentence says. Take the
following example. It is a contingent fact that Tony Blair is called ‘Tony’. He could have
been named something else by his parents without altering his individuality. So, there are
worlds in which Tony Blair is called ‘Tom’, which evince the falsity of ‘Necessarily, Tony
Blair is called ‘Tony’’. So clearly, in evaluating the truth of this latter statement, we are not
looking for worlds in which there is someone called ‘Tony Blair’ who isn’t called ‘Tony’:
for there can be no such worlds. Similarly, in evaluating whether it is logically possible for
pigs to fly, we are not searching for worlds in which something termed a ‘pig’ flies. Thus,
if it is logically possible for pigs to fly, then there is a world in which pigs fly—that is, a
world at which what we mean by ‘pigs’ can fly.
David Kaplan explains the evaluative procedure in terms of direct reference theory. First, an uttered sentence expresses a proposition, which is a structured entity constituted by the denotations of the uttered referring terms, structured much as the sentence
itself is structured (Kaplan calls such entities Russellian propositions). In the case of sentences containing indexical or demonstrative elements, context determines the contribution
of that term to the proposition. At this stage, temporal or modal operators are ignored.
This proposition is then evaluated at a world or group of worlds and a timepoint or timepoints, depending on any modal or temporal operators in the original sentence, producing
a determinate truth value. Thus, ‘φ ∧ ¬φ’ expresses a proposition that is logically structured by containing whichever functions we mean by ‘∧’ and ‘¬’. For that proposition
to be evaluated as true at some world, that world must genuinely be a world where the
impossible happens, where contradictions are genuinely true.
Zalta [Zal97] argues that a theory of genuine, as opposed to ersatz, impossible worlds is required if we are to make use of the notion at all. He provides a metaphysical
theory of impossible worlds, based on his theory of abstract objects [Zal83, Zal88]. Within
this theory, no contradiction is true; however, contradictions can be true at impossible
worlds.12 A genuine metaphysical theory of impossible (and hence nonclassical) worlds is therefore possible, and I will now use ‘nonclassical worlds’ as a synonym for ‘impossible worlds’ (in the general sense, rather than in Hintikka’s restricted sense).
We must now address the remaining question: are nonclassical worlds the right kind of tool to use in analyzing knowledge and belief? To begin with, it is not clear that they are the right
kind of conceptual tool. Suppose an agent a believes ¬¬φ but not φ. This is explained by
allowing nonclassical worlds—such as intuitionistic worlds—into a’s accessibility relation.
This implies that a holds such worlds to be possible, for all it knows. But this might not be
the case; our agent might be a staunch classical logician, who believes that double negations
are always eliminable, and so would never admit an intuitionistic world as possible. This
need not be viewed as a logical problem for the nonclassical worlds account: perhaps the
classical logician simply has contradictory beliefs about what is possible (which is, after
all, perfectly possible according to this view). However, the example does highlight a certain conceptual discomfort in holding that an agent must count certain types of world as possible, despite her protestations to the contrary.
Regardless of these conceptual issues, the nonclassical worlds approach does not
deliver the right kind of logical results. To see why this is the case I will, in the following
sections, first cash out a logic underlying nonclassical worlds (sections 2.2.2 and 2.2.3) and
present two accounts based on such logics (sections 2.2.4 and 2.2.6). I will then present
criticisms of all such approaches in section 2.2.7.
12. Graham Priest holds that contradictions can be true in actuality; the view is known as dialethism [Pri79, Pri87].
2.2.2 Paraconsistent Worlds
To keep the account as general as possible, I take the logic underlying worlds to be a paraconsistent logic (for simplicity, I shall stick to the propositional case; Priest [Pri02] gives a good overview of paraconsistent logics). To develop a paraconsistent logic from a classical logic, we may take either of the
following options:
1. Alter the valuation function V. Usually, V assigns each proposition p ∈ P a member of the set {true, false} at each world w ∈ W, i.e. V is of type P × W −→ {true, false}. We can define nonclassical valuations as follows:
L3: V3 assigns a non-empty subset of {true, false} to each proposition-world pair.
L4: V4 assigns a subset of {true, false} to each proposition-world pair.
L3−: V3− assigns a proper subset of {true, false} to each proposition-world pair.
The recursion clauses for ¬, ∧, ∨ and → are as usual. We might treat each of the subsets of
{true, false} as truth values in their own right as follows:
{true} is called truth;
{false} is called falsity;
{true, false} is called both;
{} is called neither.
Then L3, L4 and L3− are many-valued logics (there are more than the classical two values available). L3 and L3− are 3-valued logics; L4 is 4-valued.13 See [Dun76, KdC77] for
presentations of many-valued approaches.
2. Alter the properties of negation. Here, a classical valuation may be used and negation
is taken to be non-truth-functional. In general, the value of ¬φ will be independent of φ.
Although we only have the classical two truth values, both a formula and its negation can
be true; or just one of them true, or perhaps neither. Then, although individual formulae
are assigned classical truth values, the values of each formula-negation pair (φ, ¬φ) will be
13. 4-valued logics were introduced by Belnap in [Bel77]. L3− is often used to provide semantics for negation-as-failure, for example in semantics for programming logics such as PROLOG.
(true, false), (false, true), (true, true), (false, false). We may call the latter two both and neither,
in keeping with the many-valued approach sketched above. The properties of negation
can be restricted: for example, by making ¬φ false whenever φ is true (this still allows the
value neither); or by forcing ¬φ to be true whenever φ is false (allowing the value both).
[dC74, dCA77] describe many systems of non-truth-functional paraconsistent logics along
these lines.
Now, suppose sentences are assigned truth-values at worlds according to one
such definition—say, for example, the 4-valued approach. This means that a contradiction
p ∧ ¬p may be true at some world w, in which case w is a paraconsistent world, and some
formula p ∨ ¬p may fail to be true at a world w′ , in which case w′ is said to be paracomplete.14
Obviously, these two characterizations are not mutually exclusive: worlds are allowed to
be both paraconsistent and paracomplete. Validity is restricted to the formulae true at all
classical worlds, i.e. worlds that are neither paraconsistent nor paracomplete. Knowledge
is defined in the usual way:
M, w ⊨ Ki φ iff, for all w′ ∈ W, Ri ww′ implies M, w′ ⊨ φ
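For concreteness, here is a minimal Python sketch (illustrative only) of the 4-valued option, with each formula assigned a subset of {true, false} at a world and the recursion clauses read in the usual Dunn/Belnap fashion, computing truth and falsity conditions separately.

T, F = 'true', 'false'

def neg(v):
    # ~phi is true iff phi is false, and false iff phi is true.
    return ({T} if F in v else set()) | ({F} if T in v else set())

def conj(v, w):
    # phi & psi is true iff both are true, false iff either is false.
    return ({T} if (T in v and T in w) else set()) | ({F} if (F in v or F in w) else set())

def disj(v, w):
    # phi v psi is true iff either is true, false iff both are false.
    return ({T} if (T in v or T in w) else set()) | ({F} if (F in v and F in w) else set())

both, neither = {T, F}, set()
print(conj(both, neg(both)))        # {'true', 'false'}: p & ~p is (also) true at a paraconsistent world
print(disj(neither, neg(neither)))  # set(): p v ~p fails to be true at a paracomplete world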
Before providing criticism of this response to the logical omniscience problem, I
will introduce a number of accounts that share their motivation with Cresswell’s, namely
Levesque’s account of explicit and implicit knowledge and Fagin, Halpern and Vardi’s notion
of nonstandard worlds. Both accounts are based on relevant logic, so I shall begin by briefly
introducing the latter, then describing each of these accounts in more detail. I shall then
provide criticism of all such accounts—both those based on paraconsistent and those on
relevant logics—together.
2.2.3 Relevant worlds
Closely related to paraconsistent logic is the development of relevant logic, initially by
Anderson and Belnap [AB75] and furthered by Routley and Meyer [RM72a, RM72b, RM73].
The aim is to avoid the unintuitive properties of material and strict implication, for example
that (φ → ψ) ∨ (ψ → χ) for any φ, ψ and χ in the case of material implication, and
φ J (ψ J ψ) in the case of strict implication.15
14. Paraconsistent worlds are sometimes characterized as having truth gluts and paracomplete worlds as worlds with truth gaps.
15. Strict implication was introduced by C.I. Lewis [Lew20]. φ strictly implies ψ (φ J ψ) iff the supposition that φ is true whilst ψ is false is necessarily false (Lewis defines φ J ψ in terms of impossibility: ¬◊(φ ∧ ¬ψ)). Lewis and Langford [LL32, chapter 8] then claimed that strict implication captures the notion of logical entailment.
In the semantic interpretation, each world w has an associated world w∗, such that w∗∗ = w and ¬φ is true at w iff φ is false at w∗.16 See [DR02, Res93] for an overview. In decoupling the semantics of negation from that of positive formulae, this approach to relevant logic is on a par with the non-truth-functional approach to paraconsistent logic. Semantics can also be given by treating negation as an extensional operator but evaluating a formula using a truth relation, between the formula and the set {true, false}, rather than a truth function into that set [Dun76]. This approach is analogous to the many-valued approach to paraconsistency.
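A minimal Python sketch (illustrative only, with an ad hoc representation of formulae) shows how the star operator lets a formula and its negation both hold at a world, without any single valuation being incoherent.

star = {'w': 'u', 'u': 'w'}                    # w* = u and u* = w, so w** = w
V = {('p', 'w'): True, ('p', 'u'): False}      # an ordinary two-valued valuation per world

def true_at(formula, world):
    kind = formula[0]
    if kind == 'atom':
        return V[(formula[1], world)]
    if kind == 'not':
        # ~phi is true at a world iff phi is false at its star-mate.
        return not true_at(formula[1], star[world])
    if kind == 'and':
        return true_at(formula[1], world) and true_at(formula[2], world)

p = ('atom', 'p')
print(true_at(('and', p, ('not', p)), 'w'))    # True: p & ~p holds at w, since p is false at w* = u
print(true_at(('and', p, ('not', p)), 'u'))    # False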
Both approaches to relevant logic have motivated responses to the logical omniscience problem. The Routley star approach underlies Fagin, Halpern and Vardi’s nonstandard propositional logic, whereas Dunn’s approach motivates Levesque’s logic of explicit
belief.
2.2.4 Levesque’s Logic of Explicit and Implicit Belief
The idea of distinguishing explicit from implicit belief is intuitively appealing; explicit beliefs
are those beliefs that in some way affect an agent’s state, whereas an agent’s implicit beliefs
are a superset of its explicit beliefs, containing all their logical consequences. Levesque’s
logic of implicit and explicit belief [Lev84] is an attempt to formalize this intuition. The
domain of worlds includes both impossible and incomplete worlds.17 Worlds that are neither
impossible nor incomplete are classical worlds.
Levesque’s account is similar to a 4-valued paraconsistent account of the internal
logic of possible worlds, although valuations are classical and it is negation that provides
the nonclassical element of the account. Levesque’s original account is fairly restricted
compared to more contemporary epistemic logics; for example, belief operators (either for
explicit or implicit belief) cannot occur within the scope of one another; only one agent is
accommodated and quantification is prohibited.18 Levesque’s approach is developed by
16. This latter condition is needed to obtain the standard relevant logics discussed by Anderson and Belnap; but the possible worlds interpretation using the Routley * operator is more general than this.
17. Levesque uses the term situation in place of world, but this is a misleading term, at least in its common usage, for the following reason. No proper part of a world is itself a world, but proper parts of situations may themselves be situations, e.g. a particular sporting match may be part of a larger tournament, but both may be termed ‘situations’. Even in a philosophical situation, Levesque’s notion is sufficiently distant from uses of ‘situation’ in Barwise and Perry’s Situation Semantics [BP83, Dev91] to merit being avoided in this usage.
18. The former is rectified in [Lak87]; the latter in [Lak90] and [PS85].
Lakemeyer in [PS85, Lak86, Lak87, Lak90]. The version I present here is emended to fit
with the presentation of epistemic logics given above. A model is a structure
M = ⟨W, R, Vt, Vf⟩
where W is a set of worlds, R ⊆ W × W is the accessibility relation between worlds and Vt, Vf are both valuation functions of type P × W −→ {true, false}.19 The subset of classical worlds is denoted W∗. R∗ ⊆ W × W∗ is the accessibility relation R restricted to W∗ in its second argument. Two satisfaction relations ⊨t and ⊨f are defined in terms of Vt and Vf
as follows. Firstly, for primitives p and negated formulae:
M, w ⊨t p iff Vt(p, w) = true
M, w ⊨f p iff Vf(p, w) = true
M, w ⊨t ¬φ iff M, w ⊨f φ
M, w ⊨f ¬φ iff M, w ⊨t φ
Note that there is a clear link here between this approach to providing satisfaction conditions for negation and the approach based on the Routley star operator. An equivalent system could be obtained by using a single valuation function V = Vt and setting
V(p, w∗ ) = true iff Vf (p, w) = true.20 Clauses for conjunction and disjunction are then
standard:
M, w ⊨t φ ∧ ψ iff M, w ⊨t φ and M, w ⊨t ψ
M, w ⊨f φ ∧ ψ iff M, w ⊨f φ or M, w ⊨f ψ
M, w ⊨t φ ∨ ψ iff M, w ⊨t φ or M, w ⊨t ψ
M, w ⊨f φ ∨ ψ iff M, w ⊨f φ and M, w ⊨f ψ
M, w ⊨t φ → ψ iff M, w ⊨t ¬φ or M, w ⊨t ψ
M, w ⊨f φ → ψ iff M, w ⊨f ¬φ and M, w ⊨f ψ
This defines satisfaction of extensional formulae within worlds. The clauses for explicit belief that φ (written Bφ, as usual) are:
M, w ⊨t Bφ iff, for all w′ ∈ W, Rww′ implies M, w′ ⊨t φ
M, w ⊨f Bφ iff M, w ⊭t Bφ
Note that M, w ⊨t ¬Bφ iff M, w ⊨f Bφ iff M, w ⊭t Bφ, i.e. B-formulae behave classically. Finally, Levesque defines implicit belief that φ, written Lφ, as:
19. Levesque’s original formulation contained two functions t, f of type P −→ 2^W, assigning a set of worlds to each proposition. For any proposition φ, t(φ) is viewed as the set of worlds at which φ is true; f(φ) as the set of worlds at which it is false. Clearly, these may overlap, and need not together exhaust the domain, whence truth gluts and gaps.
20. The clause for negation would then be: M, w ⊨ ¬φ iff M, w∗ ⊭ φ.
M, w ⊨t Lφ iff, for all w′ ∈ W∗, R∗ww′ implies M, w′ ⊨t φ
M, w ⊨f Lφ iff M, w ⊭t Lφ
A formula φ is satisfied at a world w in a model M iff M, w ⊨t φ, and valid iff satisfied at
all classical worlds w ∈ W ∗ in all models. The set of valid formulae thus coincides with the
set of classically valid formulae. It is easy to see that any sentence explicitly believed must
also be implicitly believed, i.e. Bφ → Lφ. However, while implicit belief is closed under
classical consequence, explicit belief is not. This is due to the inclusion of non-classical
worlds in R.
Let us see how far such approaches bring us in light of the logical omniscience
problem. The following remarks apply equally to Levesque’s setup and to a nonclassical
worlds approach based on a 4-valued logic such as Belnap’s [Bel77]. In our diagram, let
us label each world w both with the primitive propositions and negated primitives that it
satisfies, i.e. p labels w iff Vt(p, w) = true and ¬p labels w iff Vf(p, w) = true.
[Diagram: a world w1 with R-arrows to worlds w2 and w3; w2 is labelled p, q and w3 is labelled p, ¬p.]
Both p and p → q are satisfied by w2 and w3, which are the only worlds accessible via R from w1. Hence, we have w1 ⊨ Bp ∧ B(p → q). However, w3 ⊭ q and so w1 ⊭ Bq. So, Bp ∧ B(p → q) → Bq is not valid: belief (and for the same reason knowledge) is not closed
under believed (or known) implication. But by the same token, we have the following
validity, for any formulae φ, ψ:
Bφ ∧ B(φ → ψ) → B(ψ ∨ (φ ∧ ¬φ))
In our diagram, w2 and w3 represent the only two kinds of worlds that satisfy φ → ψ,
viz. those that satisfy ψ and those that satisfy ¬φ. The latter then satisfy φ ∧ ¬φ and so,
if a world w satisfies Bφ ∧ B(φ → ψ), it must also satisfy B(ψ ∨ (φ ∧ ¬φ)). In this way,
the closure of belief and knowledge is avoided by introducing inconsistent possibilities.
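These claims can be verified directly. The following minimal Python sketch (illustrative only; the encoding of formulae as nested tuples is an arbitrary choice) implements the two satisfaction relations of the emended presentation above and runs them on the three-world model of the diagram.

R = {('w1', 'w2'), ('w1', 'w3')}
Vt = {('p', 'w2'), ('q', 'w2'), ('p', 'w3')}   # (atom, world) pairs made true
Vf = {('p', 'w3')}                             # (atom, world) pairs made false

def sat(sign, formula, w):
    # sign is 't' or 'f'; returns whether M, w satisfies formula under that sign.
    kind = formula[0]
    if kind == 'atom':
        return (formula[1], w) in (Vt if sign == 't' else Vf)
    if kind == 'not':
        return sat('f' if sign == 't' else 't', formula[1], w)
    if kind == 'and':
        combine = all if sign == 't' else any
        return combine(sat(sign, sub, w) for sub in formula[1:])
    if kind == 'or':
        combine = any if sign == 't' else all
        return combine(sat(sign, sub, w) for sub in formula[1:])
    if kind == 'imp':
        # phi -> psi is treated via ~phi v psi, as in the clauses above.
        return sat(sign, ('or', ('not', formula[1]), formula[2]), w)
    if kind == 'B':
        holds = all(sat('t', formula[1], v) for (u, v) in R if u == w)
        return holds if sign == 't' else not holds

p, q = ('atom', 'p'), ('atom', 'q')
print(sat('t', ('B', p), 'w1'))                                    # True
print(sat('t', ('B', ('imp', p, q)), 'w1'))                        # True
print(sat('t', ('B', q), 'w1'))                                    # False
print(sat('t', ('B', ('or', q, ('and', p, ('not', p)))), 'w1'))    # True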
As remarked above (section 2.2.1), there appears to be a conceptual confusion here. Just
because an agent does not believe all consequences of its beliefs does not imply that the
agent considers impossible worlds to be possible. For example, consider an agent with the
following beliefs:
Bφ,
B(φ → ψ),
B¬ψ
It follows from Levesque’s semantics that the agent believes that some contradiction is true: every world the agent considers possible satisfies either φ ∧ ¬φ or ψ ∧ ¬ψ, so B((φ ∧ ¬φ) ∨ (ψ ∧ ¬ψ)) holds. There are two issues here. Firstly, such an agent certainly
seems possible; I am certain that I am such an agent, yet I would never assent to a sentence
of the form φ ∧ ¬φ. So, on Levesque’s account, agents can be said to have beliefs that they
would explicitly deny having.
Secondly, where an agent has an inconsistent belief, all worlds it considers possible must be nonclassical worlds, that is, worlds labelled by both a proposition and its
negation, for some proposition. This detaches the notion of an epistemically possible world
from the situations an agent would class as possible and impossible, in a non-epistemic
sense. For example, I consider the actual world possible, but I am also certain that I have
contradictory beliefs, so the actual world is not included in the worlds I consider epistemically possible.21 As Levesque himself notes [Lev85], there is a difference between believing
φ and also believing ¬φ (when in different frames of mind, for example) and believing
φ ∧ ¬φ. Levesque’s logic presented above is unable to handle this distinction. I return to
this topic in the following section (2.2.5).22
What about other prominent types of logical omniscience: closure under valid
implication and (as an instance of this) knowledge of all tautologies? Fagin and Halpern
claim that Levesque’s agents lack knowledge of valid formulae when they are, in a certain
sense, unaware of some of the primitives occurring in those formulae. The thought seems
to be as follows. An agent may not know whether a particular primitive proposition p is true or false; but, presumably, upon recognizing p to be a declarative proposition, it could not fail to recognize that p must be either true or false. Fagin and Halpern term this awareness
of p:
Let us say that an agent is aware of a primitive proposition p, which we abbreviate
Ap, if B(p ∨ ¬p) holds. Thus Ap is true in exactly those situations [worlds] that
21. I discuss the relation between epistemic possibility and metaphysical possibility in section 3.1 below.
22. The distinction can be captured using Scott-Montague neighbourhood semantics; however, a better motivation for the distinction is included in the sentential account I argue for in section 3.6 below.
support [i.e. satisfy] either the truth or falsity of p (they may of course support
both the truth and falsity of p) [FH88, p.47].
They then extend the notion of awareness to cover all formulae as follows. Let sub(φ) be the
set of all subformulae of φ; then Aφ is the conjunction of Ap, for all primitives p occurring
in φ, i.e.
Aφ =df ⋀{Ap | p ∈ (sub(φ) ∩ P)}
Then, “[a]lthough not every valid proposition is believed, it is the case that a valid formula
is believed provided that an agent is aware of all the primitive propositions that appear
in it” [FH88, p.47]. They show [FH88, p.47, proposition 3.1] that if φ is a valid formula of
classical (2-valued) propositional logic, then Aφ → Bφ.23
Awareness of a formula at a world effectively rules out the truth value neither.
Worlds accessible from that world either satisfy that formula or its negation (or both)
so that, locally to that world and that formula, we have the effect of a 3-valued logic
with the truth value both in addition to the two classical values. Such logics contain all
classical tautologies at every world, plus some extra formulae that would not be allowed
by a classical 2-valued logic. So, according to Fagin and Halpern’s analysis of Levesque’s
account, closure of an agent’s knowledge under valid implication fails precisely when
agents are unaware of the primitive propositions concerned. The concept of awareness is
discussed in more detail below, in section 2.3.
2.2.5 Local Reasoning
As noted above, an agent can only be modelled as having inconsistent beliefs when all
worlds it considers possible are themselves inconsistent. This prizes the analysis of possible
worlds away from Hintikka’s original motivation for it (cf section 1.1.2 above); it also seems
to bar us from thinking of such constructions as possible worlds, epistemic or otherwise,
for an agent’s explicit beliefs may well include the belief that possibilities are necessarily
consistent.24 One may refuse to believe incoherent situations to be possible and yet still
hold inconsistent beliefs. In one sense, this is reflected in the difference between believing
23. In fact, these two claims are not identical; the latter does not imply the former. Recall that only classical worlds are considered when evaluating the validity of a formula. So, it is at least possible for Aφ → Bφ to be valid and yet, contra the former claim, there be a world w that satisfies Aφ but not Bφ, for some tautology φ.
24. I continue this line of criticism in section 3.1 below.
φ and (at the same time) believing ¬φ, and believing φ ∧ ¬φ. One may believe φ in one
frame of mind and yet, in another frame of mind, believe ¬φ.
Levesque’s account can be modified to capture this distinction so as to avoid the
reliance on inconsistent worlds. In [FH88], Fagin and Halpern introduce the notion of
“non-interacting clusters” of beliefs, where a belief held in one cluster or frame of mind
may contradict a belief belonging to another cluster [FH88, p.58]. A Kripke structure for local
reasoning is a tuple
M = ⟨W, V, L1, . . . , Ln⟩
where W is a set of worlds, V is a classical 2-valued valuation and each Li, of type W −→ 2^(2^W), assigns to each world a nonempty set of nonempty subsets of W. If Li(w) = {Loc1, . . . , Lock}, then each Locj (j ≤ k) is the set of worlds that agent i considers possible from w when in state-of-mind j. Bi φ is interpreted as “agent i believes φ in some frame of mind”:
M, w ⊨ Bi φ iff there is some Loc ∈ Li(w) such that, for all w′ ∈ Loc, M, w′ ⊨ φ
A notion of implicit belief can also be defined within this framework by pooling information from an agent’s various frames of mind. An agent i implicitly believes φ at w, written M, w ⊨ Li φ, iff φ is satisfied by all worlds considered possible from w in all frames of mind, i.e. by all worlds w′ ∈ ⋂Li(w):
M, w ⊨ Li φ iff, for all w′ ∈ ⋂Li(w), M, w′ ⊨ φ.
If an agent holds inconsistent beliefs, albeit in different frames of mind, the intersection of
the states considered possible in all frames of mind will be empty, and thus the agent will
implicitly believe ⊥. The set of formulae of the form Bi φ is not closed under implication,
i.e. Bi φ ∧ Bi (φ → ψ) ∧ ¬Bi ψ is satisfiable. Agent i may believe φ in one frame of mind but
believe φ → ψ in another, and never be in a frame of mind where he puts the two together.
An agent can also hold inconsistent beliefs. Bi φ ∧ Bi ¬φ is satisfiable, for agent i may believe
φ in one frame of mind and ¬φ in another. However, Bi (φ∧¬φ) is not satisfiable, because no
agent believes in incoherent worlds. This is clearly more realistic than Levesque’s original
proposal.
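A minimal Python sketch (again purely illustrative; the worlds and clusters below are my own toy example) of belief in some frame of mind, and of implicit belief as truth throughout the intersection of all frames of mind, makes the contrast explicit.

val = {'w1': {'p'}, 'w2': set()}          # atoms true at each (classical) world
L = {'w0': [{'w1'}, {'w2'}]}              # two frames of mind at w0

def true_at(formula, w):
    kind = formula[0]
    if kind == 'atom':
        return formula[1] in val[w]
    if kind == 'not':
        return not true_at(formula[1], w)
    if kind == 'and':
        return true_at(formula[1], w) and true_at(formula[2], w)

def B(formula, w):
    # Belief in some frame of mind: true throughout some cluster.
    return any(all(true_at(formula, v) for v in cluster) for cluster in L[w])

def implicit(formula, w):
    # Implicit belief: true throughout the intersection of all clusters.
    common = set.intersection(*L[w])
    return all(true_at(formula, v) for v in common)

p = ('atom', 'p')
print(B(p, 'w0'), B(('not', p), 'w0'))         # True True: Bp and B~p are jointly satisfiable
print(B(('and', p, ('not', p)), 'w0'))         # False: B(p & ~p) is not satisfiable here
print(implicit(('and', p, ('not', p)), 'w0'))  # True: the intersection is empty, so everything is implicitly believed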
Even so, each agent’s beliefs are still closed under valid implication and, as a
consequence, agents believe all valid formulae. Suppose an agent believes φ at w, i.e. believes it in some frame of mind, and that φ → ψ is valid. Take Loc to be the set of worlds
considered possible from w in that very frame of mind; then by definition, every world
w′ ∈ Loc satisfies φ. Since these worlds are all classical 2-valued worlds, φ → ψ also holds
at each of these worlds, so ψ also holds at every w′ ∈ Loc. Hence the agent believes ψ
in that same frame of mind. As a consequence, logically equivalent propositions cannot be distinguished. Recall that Cresswell considered the ability to distinguish between logically equivalent propositions a feature of belief and knowledge: “there is no
reason why someone should not take a different propositional attitude (belief, say) to two
propositions which are logically equivalent” [Cre73, p.40].
Secondly, each agent’s beliefs are closed under implication within each frame of
mind: if the agent believes φ and φ → ψ in the same frame of mind, then it cannot but
believe ψ in that frame of mind. Each frame of mind thus gives rise to an infinite number of
beliefs, some of which the agent would not be able to acknowledge (or may even explicitly
reject) as beliefs. In a mathematical frame of mind, for example, in which an agent considers
the axiom schemes of some theory, it will thereby be modelled as knowing all theorems of
that theory. As discussed above, this is surely implausible. Let us conclude that an account
of local reasoning, in isolation, does not help with the problem of logical omniscience.
2.2.6 Fagin, Halpern and Vardi’s Nonstandard Worlds
In a similar vein to Levesque, Fagin, Halpern and Vardi [FHV90] consider a system based
on relevant logic, this time formulated in terms of the Routley star operator. For every
world w, there is a world w∗ . They term the logic internal to each world nonstandard
propositional logic (NPL).25 A nonstandard Kripke structure
M = ⟨W, V, R1, . . . , Rn, *⟩
is as usual, with the addition of the * operator whose source and target is W, such that
w∗∗ = w for each w ∈ W. The classical worlds are then the worlds w that are a unit
for *, i.e. for which w∗ = w. The satisfaction conditions for belief and knowledge are
25. [FHV90] begins by defining a propositional logic, NPL, where a model M is a pair of truth assignments to primitive propositions V, V∗ such that V∗∗ = V. Satisfaction becomes relative to one of these valuations; the clause for negation is: M, V ⊨ ¬φ iff M, V∗ ⊭ φ. One would assume that, in allowing dual valuations, the authors endorse a many-valued logic with truth gluts; yet, two years prior to publication of this paper, Fagin and Halpern argued that, insofar as the logic is intended to capture a notion of belief, “it seems unreasonable to allow incoherent situations [i.e. to allow truth gluts]. It is hard to imagine an agent which would consider an incoherent situation possible” [FH88, p.47]. However, as soon as the modal semantics is introduced, the dual valuations can be dispensed with in favour of the standard * operator on worlds.
then standard. These nonstandard Kripke structures are in many respects equivalent to
Levesque’s; the main difference is that Fagin, Halpern and Vardi define validity in terms of all worlds w ∈ W whereas Levesque (along with Cresswell) limits validity to the formulae
satisfied by all classical worlds.
2.2.7 Evaluating the Nonclassical/Nonstandard Worlds Approach
Suppose we take some paraconsistent or relevant propositional logic Λ to be the logic
underlying worlds, of the kind discussed above. Does this move overcome Hintikka’s
problem? The answer is both a yes and a no; it overcomes the contradiction present in 1–4;
but the problem can be reformulated so as to produce a new contradiction. Recall that
premise 2 of Hintikka’s problem is:
2. There are a, φ, ψ such that a knows that φ, φ logically implies ψ and yet a does not
know that ψ.
If “logically implies” is taken to mean classical entailment, then the contradiction is resolved. As discussed above, formulae such as ¬(φ ∧ ¬φ) need not be believed by an agent
in a logic based on a suitable choice of Λ (e.g. Levesque’s system). However, there is a
tension here in basing an epistemic logic on a relevant or paraconsistent model of entailment yet, at the same time, claiming that “logically implies” means classical entailment.
Supporters of relevant logic tend to believe that relevant implication captures the notion
of logical entailment more satisfactorily than classical implication can (cf the discussion in
section 2.2.3 of the paradoxes of material and strict implication). Similarly, subscribers to
dialethism (e.g. Graham Priest [Pri87]) hold that there can be true contradictions, so that
logical consequence should be analyzed as paraconsistent consequence. Indeed, Fagin and
Halpern comment that “[w]hile restricting [validity] to complete situations [i.e. worlds]
ensures that all propositionally valid formulas continue to be valid in Levesque’s logic,
it seems inconsistent with the philosophy of looking at situations” [FHV90, p.48], i.e. the
philosophy behind paraconsistent or relevant logics.
The stance taken by such approaches has to be this: the notion of something
following from something else is properly cashed out by classical entailment (or, at least,
some logic stronger than relevant/paraconsistent logic). There is also a notion of epistemic
entailment, which accounts for the logic within epistemically possible worlds. Of course,
some of these worlds will also be metaphysically possible, whence the genuine (i.e. classical)
notion of validity can be retained, even in epistemic contexts.
However, a version of Hintikka’s problem can then be formulated as follows.
Assume that the epistemic logic in question is based on a relevant/paraconsistent logic Λ
and that Λ is the logic of the class of models C.
1. ‘a knows that φ’ is true at w iff φ is true at every world epistemically accessible from
w;
2′ . There are a, φ, ψ such that a knows that φ, φ C-entails ψ and yet a does not know that
ψ;
3′ . A sentence is C-valid iff it is true at every possible world;
4′ . Every epistemically possible world (and so all the worlds epistemically accessible
from any world) is a model in the class C.
As before, 1–4′ are jointly inconsistent: by 1, φ is true at every world epistemically accessible from w; by 4′, each such world is a model in the class C and so, since φ C-entails ψ, ψ is also true at each of them; hence, by 1 again, a knows that ψ, contradicting 2′. Note that 2′ need not commit one to the view that C-validity
captures the notion of logical truth. 1 and 4’ are assumptions made by the supporter of
the C-possible worlds approach to knowledge, so such accounts must reject 2’. That is,
they must claim that, whenever φ C-entails ψ (e.g. when ψ is a relevant consequence of
φ) and φ is known, then ψ is also known. If C captures a relevant logic, then this notion
of knowledge is closed under relevant entailment. I will call this the assumption of relevant
closure.
However, knowledge is not closed in this way. If one rejects the notion of idealized
knowledge (as I argued that one should; cf section 1.3.3 above), then one has no reason
to accept relevant closure. One of the examples cited against closure of knowledge was
mathematical knowledge. It is possible to know certain mathematical truths, e.g. that
1 + 1 = 2 and yet gain new mathematical knowledge. One can know all the axiom schemes
of a particular theory (and know the rule of substitution), and yet discover that a certain
sentence is a theorem of the theory. Such a discovery constitutes new knowledge for the
agent in question. As Cresswell comments, “when a mathematician discovers the truth
of a mathematical principle he does not thereby discover the truth of all mathematical
principles” [Cre73, p.40].
There are reconstructions of classical mathematics based on paraconsistent and
relevant logics. For example, Meyer produced a variation of Peano arithmetic, based on
the relevant logic R and gave a finitary proof that 0 = 1 is not a theorem of the resulting
relevant arithmetic [MF92].26 The paraconsistent approach to mathematics—often termed
inconsistent mathematics—has also proved popular. Motivation comes from considering
set-theoretic responses to Russell’s paradox (e.g. ZFC or Quine’s NF) to be ad hoc. Instead,
da Costa [dC74], Brady [Bra71], Priest, Routley, and Norman [PRN89] propose to retain
Frege’s original abstraction principle—every predicate determines a class—and base their
mathematics on a paraconsistent logic. In doing so, they reject the principle ex contradictione
quodlibet: that anything follows from a contradiction.27
The claim is not that such reconstructions capture classical mathematics in its
entirety. For example, Meyer and Friedman show that relevant arithmetic does not contain
all of the theorems of classical Peano arithmetic [MF92]. Nevertheless, these reconstructions
contain a large enough proportion of mathematics to warrant the following:
2′′ . There are a, φ, ψ such that ψ follows from φ in relevant/inconsistent mathematics such
that a knows φ but not ψ.
Take φ to be the conjunction of the axiom schemes of relevant Peano arithmetic together
with the rules for substitution; then a suitable ψ can surely be found, for any agent: no
actual or potentially realizable agent knows all the theorems of arithmetic, even if we
restrict our attention to relevant arithmetic. Then, taking our underlying logic Λ to be
the relevant logic R plus Meyer’s axioms for relevant Peano arithmetic (for example), the
claims 1, 2′′ , 3′ , 4′ are together inconsistent. Hintikka’s problem has not been solved. We
must conclude that possible worlds accounts of knowledge and belief cannot be rescued
by weakening the underlying logic to a paraconsistent or relevant logic.
2.3 Awareness
In section 2.2.4, I discussed Fagin and Halpern’s analysis of Levesque’s logic in terms of
awareness. An agent was defined as being aware of a primitive proposition p, written Ap,
if B(p∨ ¬p). For other formulae, Aφ is the conjunction of Ap, for all primitives p occurring in
φ. The thought then was that awareness is what distinguishes explicit from implicit belief
(implicit belief, written using L, being the notion that arises from the standard possible worlds analysis, i.e. closure of the agent’s explicit beliefs).
26. This effectively solves one of Hilbert’s problems, relative to relevant arithmetic.
27. See also [Mor95, Pri97, Pri00].
By way of example, B(p ∨ q) is
equivalent to
(Ap ∧ Lp) ∨ (Aq ∧ Lq) ∨ (Ap ∧ Aq ∧ L(p ∨ q))
I now discuss this notion in more detail by presenting Fagin and Halpern’s logic of awareness (section 2.3.1) and then a critique (section 2.3.2), based partly on Kurt Konolige’s
observations in [Kon86b].
2.3.1 Logics of General Awareness
What could be meant by awareness of a proposition? In an intuitive sense, beliefs about
unmarried men are also beliefs about bachelors, for the two terms are co-extensive. However, this does not necessarily mean that anyone who says they believe that all unmarried
men are happy would also agree that they believe all bachelors to be happy, for they may
not have acquired the concept bachelor. (This is not to say that the beliefs are really distinct,
only that one can claim to have one but deny having the other. But perhaps in more
exotic cases involving obscure scientific or mathematical concepts, one really should say
that two beliefs, differing only in the substitution of such a concept by its definition, are
in fact distinct.) Now, what does it mean to be aware of a concept? It cannot mean being
in possession of a definition in terms of necessary and sufficient conditions for something
falling under that concept; that would be to set the standard for awareness too high. At
the least, being aware of a concept must enable me to ascertain which types of entity the
concept may meaningfully be applied to. For example, ‘is a bachelor’ may be applied to Bob,
a man, but not to the colour red. So, in knowing that ‘Bob is a bachelor’ is meaningful,
one must believe it to be truth-apt, i.e. either true or false, whether one has an opinion
regarding its truth or falsity or not. This would motivate a definition of awareness of a
primitive proposition p in terms of possessing the belief that p ∨ ¬p.
This is not very convincing. There are plenty of concepts of which I am aware
that can meaningfully be combined with one another into meaningful sentences, but which
do not force me to regard those sentences as truth-apt. Unexceptional examples include
a vicar uttering “I now pronounce you husband and wife” in appropriate circumstances
and the Queen uttering “I name this ship . . . ” (again, in appropriate circumstances).
Such performatives are clearly meaningful; they have an observable perlocutionary effect
upon the world [Aus62]. A more contentious example would be found in an expressive
interpretation of moral statements, according to which moral utterances do not express
declarative propositions about moral facts, but instead express the utterer’s feelings about
a given subject. The notion of awareness qua recognizing an utterance to be truth-apt
cannot apply here.
However, let us run with this notion of awareness, restricted to truth-apt sentences,
and ignore its motivation. The thought then in [FH88] is that an agent should only be said to
(explicitly) believe φ when it is aware of all primitives in φ. Formally, a logic of awareness,
so conceived, takes a standard multi-agent Kripke structure with classical 2-valued worlds
as its starting point. It then partitions the primitives into those that the agent is aware of
and those it is unaware of, for each agent, at each world in the domain. Typically, an agent
will not be aware of all primitive propositions at a world w. That world will then behave in
a similar way to an incomplete world of the type employed in Levesque’s logic above. This
approach can then be seen as a way of motivating the semantics of incomplete worlds: the
worlds are in fact perfectly complete—all propositions there are either true or false. But an
agent is only aware of a subset of these propositions and so, for that agent, the world acts
as an incomplete one. Of course it will seem perfectly complete to the agent, for agents do
not in general know what they are not aware of.
For n agents and a set P of primitive propositions, let A1 , . . . , An be the functions of type W −→ 2^P such that Ai (w) is the set of primitive propositions that agent i is aware of at w. A model for n agents is then a tuple

M = ⟨W, V, R1 , . . . , Rn , A1 , . . . , An ⟩
Worlds w ∈ W are classical possible worlds and V is the classical valuation function.
Awareness of a formula is not formulated in the logic; it is left as a meta-logical notion.
Two satisfaction relations are then defined relative to a set Φ of primitive propositions, as
in Levesque’s account, along with a further relation ‘⊨’ (taken to be truth simpliciter). To
begin with:
M, w ⊨_t^Φ p iff V(w, p) = true and p ∈ Φ
M, w ⊨_f^Φ p iff V(w, p) = false and p ∈ Φ
M, w ⊨ p iff V(w, p) = true
For the Boolean connectives, ⊨_t and ⊨_f are defined as in Levesque’s account above, and ⊨ is defined in the standard classical way. Belief at w is then defined as truth and
awareness in all worlds accessible from w; that is, i believes φ at w iff:
i. all worlds accessible from w by Ri support φ; and
ii. i is aware of all primitive propositions in φ at w.
Awareness is not captured in the object language, but the expression

M, w′ ⊨_t^{Φ∩Ai(w)} φ

is used to say both that w′ satisfies φ and that i is aware of φ at w. The clauses for belief are then as follows:

M, w ⊨_t^Φ Bi φ iff, for all w′ ∈ W, Ri ww′ implies M, w′ ⊨_t^{Φ∩Ai(w)} φ
M, w ⊨_f^Φ Bi φ iff, for some w′ ∈ W, Ri ww′ implies M, w′ ⊨_f^{Φ∩Ai(w)} φ
M, w ⊨ Bi φ iff, for all w′ ∈ W, Ri ww′ implies M, w′ ⊨_t^P φ

where P is the set of all primitive propositions in the language.
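The following is a minimal sketch of this definition (my own illustration, not Fagin and Halpern’s code); the model is invented, formulae are tuple-encoded, and support at a world is read classically, which agrees with the ⊨_t clauses whenever the agent is aware of every primitive in the formula:

def primitives(fml):
    # the set of primitive propositions occurring in a tuple-encoded formula
    if fml[0] == 'atom':
        return {fml[1]}
    return set().union(*(primitives(sub) for sub in fml[1:]))

def true_at(fml, w, V):
    op = fml[0]
    if op == 'atom':
        return V[w][fml[1]]
    if op == 'not':
        return not true_at(fml[1], w, V)
    if op == 'and':
        return true_at(fml[1], w, V) and true_at(fml[2], w, V)
    if op == 'or':
        return true_at(fml[1], w, V) or true_at(fml[2], w, V)
    raise ValueError(op)

def believes(i, fml, w, R, V, A):
    aware = primitives(fml) <= A[i][w]                    # clause (ii)
    supported = all(true_at(fml, v, V) for v in R[i][w])  # clause (i), read classically
    return aware and supported

# Toy model: one agent, aware only of p at w1, with two accessible worlds.
V = {'w1': {'p': True, 'q': True}, 'w2': {'p': True, 'q': False}}
R = {1: {'w1': ['w1', 'w2']}}
A = {1: {'w1': {'p'}}}
p, q = ('atom', 'p'), ('atom', 'q')

print(believes(1, ('or', p, ('not', p)), 'w1', R, V, A))  # True: aware of p
print(believes(1, ('or', q, ('not', q)), 'w1', R, V, A))  # False: unaware of q

The first check succeeds because the agent is aware of p and p ∨ ¬p holds at every accessible world; the second fails purely through lack of awareness of q, even though q ∨ ¬q is a tautology.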
This formulation of awareness has the same logical properties as Levesque’s
system: agents do not know all tautologies but, whenever an agent is aware of all the
primitive propositions in a classical tautology φ, it must know φ. Fagin and Halpern
comment that this
may be appropriate for capturing the lack of logical omniscience that arises
through lack of awareness but not for capturing the type that arises due to lack
of computational resources. There may well be a very complicated formula
whose truth is hard to figure out, even if you are aware of all the primitive
propositions that appear in it [FH88, p.48].
What response do Fagin and Halpern give to this problem? To my mind, they
implicitly abandon the possible worlds model of knowledge and belief. They generalize
the awareness functions A1 , . . . , An to be of type W −→ 2^L (where L is the smallest language formed by closing P under the Boolean connectives), such that each Ai (w) is an arbitrary set of formulae, for any w. Ai (w) may, for example, contain both a formula and its negation, or a conjunction without containing both of its conjuncts, and so on. The notion of awareness can,
therefore, be defined so as to have no semantic properties whatsoever.
Fagin and Halpern do not give a characterization of awareness in general, but
provide a formal account as follows. A model M is as before and ⊨, now the only satisfaction relation, is defined classically for primitive propositions and Boolean formulae.
The clauses for awareness, explicit belief and implicit belief are:
M, w ⊨ Ai φ iff φ ∈ Ai (w)
M, w ⊨ Bi φ iff φ ∈ Ai (w) and, for all w′ ∈ W, Ri ww′ implies M, w′ ⊨ φ
M, w ⊨ Li φ iff, for all w′ ∈ W, Ri ww′ implies M, w′ ⊨ φ
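A minimal sketch of these generalized clauses (again my own illustration, with the same tuple encoding; the model and awareness set are invented) shows how an agent can explicitly believe p ∧ q without explicitly believing q ∧ p, while implicitly believing both:

def true_at(fml, w, V):
    op = fml[0]
    if op == 'atom': return V[w][fml[1]]
    if op == 'not':  return not true_at(fml[1], w, V)
    if op == 'and':  return true_at(fml[1], w, V) and true_at(fml[2], w, V)
    if op == 'or':   return true_at(fml[1], w, V) or true_at(fml[2], w, V)
    raise ValueError(op)

def aware(i, fml, w, A):                  # M, w ⊨ Ai φ
    return fml in A[i][w]

def explicit_belief(i, fml, w, R, V, A):  # M, w ⊨ Bi φ
    return aware(i, fml, w, A) and all(true_at(fml, v, V) for v in R[i][w])

def implicit_belief(i, fml, w, R, V):     # M, w ⊨ Li φ
    return all(true_at(fml, v, V) for v in R[i][w])

# Awareness sets are now arbitrary sets of formulae with no closure properties:
# here the agent is aware of p ∧ q but not of q ∧ p.
p, q = ('atom', 'p'), ('atom', 'q')
A = {1: {'w1': {('and', p, q)}}}
V = {'w1': {'p': True, 'q': True}}
R = {1: {'w1': ['w1']}}

print(explicit_belief(1, ('and', p, q), 'w1', R, V, A))  # True
print(explicit_belief(1, ('and', q, p), 'w1', R, V, A))  # False
print(implicit_belief(1, ('and', q, p), 'w1', R, V))     # True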
2.3.2 What is Awareness?
Because of the “clean separation in [the] language between belief . . . and awareness” [FH88,
p.54], it is possible to alter the properties of awareness without modifying the underlying
framework of belief. Indeed, “[o]nce we have a concrete interpretation in mind, we may
want to add some restrictions” on the notion of awareness [FH88, p.54]. For example, if we
deem the order of presentation of conjuncts to be irrelevant, we may close each awareness
set Ai (w) under conjunction permutation, i.e. for all agents i, worlds w and formulae φ, ψ:
φ ∧ ψ ∈ Ai (w) iff ψ ∧ φ ∈ Ai (w). Similarly, we might close each awareness set under
double-negation elimination. The spirit of [FH88] is very much to see what a particular
application of the logic requires, then decide on the properties of awareness.
This is a perfectly acceptable motivation for setting certain closure properties.
Indeed, abstraction from the real case is part of the motivation for developing a formal
model in the first place. A formal model should not aim to capture all aspects of a
particular agent system; it should only capture those that a designer or a testing procedure
is interested in. Thus, if an agent never behaves differently with respect to double negations—that is, a belief ¬¬φ will never trigger a certain action if φ would not, and
vice versa—then it is perfectly acceptable to model the agent as believing ¬¬φ whenever it
believes φ (and vice versa). The agent’s beliefs would be both upwards- and downwards-closed with respect to double negation, and the fact that an agent may be able to represent
φ to itself without representing ¬¬φ in no way counts against our representation of belief.
However, when we take stock of what this account of belief achieves, it is not hard
to see that it is self-defeating as a possible-worlds (as opposed to a syntactic28 ) account.
Recall that we could not develop a realistic account of belief without the notion of awareness
in place. Once we add the notion of awareness, we can tailor its properties to the particular
case; but it seems essential to the success of the awareness model that, in general, awareness
sets have no closure properties whatsoever. As Fagin and Halpern comment,
there is no reason to suppose that Bi (φ ∧ ψ) ≡ Bi (ψ ∧ φ), since . . . people do not
necessarily identify formulas such as ψ ∧ φ and φ ∧ ψ. Order of presentation
does seem to matter. And a computer program that can determine whether
φ ∧ ψ follows from some initial premises in time τ might not be able to determine whether ψ ∧ φ follows from those premises in time τ. [FH88, p.53, their
emphasis]
28. See chapter 4.
Now add to this the following observation. Define agent i’s belief set B^w_i at w to be the formulae that i can truly be said to believe at w, i.e. in a model M,

B^w_i =df {φ | M, w ⊨ Bi φ}

Now define a closure operator Cl of type 2^L −→ 2^L such that Cl(B^w_i) is the closure of i’s belief set at w with respect to some condition or other. For example, Cl¬¬ is defined recursively as B^w_i ⊆ Cl¬¬(B^w_i) and φ ∈ Cl¬¬(B^w_i) iff ¬¬φ ∈ Cl¬¬(B^w_i). Each closure operator must be monotonic, i.e. φ ∈ B^w_i implies φ ∈ Cl(B^w_i), and its own fixed point, i.e. Cl(Cl(B^w_i)) = Cl(B^w_i).
According to the logic of awareness, an agent’s belief set can only be closed under a certain closure operator if awareness, as defined for that agent, is so closed. That is, for any closure operation Cl:

B^w_i = Cl(B^w_i) only if Ai(w) = Cl(Ai(w))
I call this the awareness closure principle. The proof is simple, by induction on the structure
of φ and the definition of M, w ⊨ Bi φ. Of course the converse does not hold; an agent may
be aware of ¬φ whenever it is aware of φ but believe the former and not the latter.
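A toy instance (my own, for illustration): suppose Ai(w) = {p} and p is true at every world Ri-accessible from w. Then p ∈ B^w_i but ¬¬p ∉ B^w_i, since ¬¬p ∉ Ai(w); the belief set fails to be closed under Cl¬¬ precisely because the awareness set is not.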
This shows that the notion of belief can, in practice, only have whatever semantic
(i.e. closure) properties awareness has. This undermines Fagin and Halpern’s motivational
claim that “[t]he syntactic approach [i.e. modelling belief as a set of formulae] lacks the
elegance and intuitive appeal of the semantic [i.e. possible worlds] approach” [FH88, p.40].
Given the awareness closure principle, a realistic notion of belief cannot be defined along
these lines without the essentially syntactic notion of awareness. This shows that, at the
very least, there is an intrinsically syntactic component to any realistic notion of belief.
The issue may be approached from another angle: in what way does awareness
explain the concept of belief? One reason for building formal models of an agent’s beliefs
is to gain a clearer understanding of the concept being modelled and to investigate its
formal properties. But on reflection, in attempting to explain belief in terms of awareness,
the logic of awareness explains the unclear in terms of the truly obscure. How, for example,
should we view the interaction between belief and awareness? Kurt Konolige remarks that:
the logic of general awareness represents agents as perfect reasoners, restricted
to considering some syntactic class of sentences. There don’t seem to be any
clear intuitions that this is the case for human or computer agents [Kon86b,
p.248].
Konolige’s worry is that awareness is not a mechanism that agents actually employ. If
it were, the implication would be that agents first derive all the consequences of their
beliefs, but then discard those that do not fall within the awareness set. This is clearly not
appropriate: no agent is a perfect reasoner. As a result, the notion of awareness remains
obscure. The only point of clarification offered is that the resulting notion of belief has an
intrinsically syntactic element. It may be objected that the logic of awareness is only offered
as a practical tool and not an explanation of the concept of belief; but then the so-called
“intuitive appeal” of the possible worlds account cannot be used as genuine motivation.
Rather, it will begin to look like historical mismotivation.29
Given a concrete formulation of awareness we may ask, why could this notion
not be used to define a notion of belief directly, i.e. by specifying certain conditions or
restrictions on the awareness set? A potential notion of awareness given in [FH88, p.54]
is that the elements of Ai (w) are precisely those formulae for which agent i could determine, within a specified space and/or time bound, whether or not they follow from the information
the agent actually has to begin with in its state at w. But given this notion of awareness,
the worlds that agent i will consider possible from w are precisely those that satisfy the
formulae that follow from the agent’s information in its state at w using whatever method
of reasoning the agent can employ at w. Now, both this information and these methods
of reasoning can be captured syntactically. Consequently, a purely syntactic notion of
belief could be formulated so as to agree in extension with the beliefs ascribed to i at w
by the logic of awareness. Such thoughts motivate Kurt Konolige’s remark that adding
a syntactical restriction to an otherwise semantic account of belief results in a logic that
“is no more powerful than current sentential logics, and can be re-expressed in terms of
them.” [Kon86b, p.248]
2.4 Conclusion
Adding what have variously been termed impossible, nonclassical or nonstandard worlds
to the set of worlds that an agent may consider possible has been a popular move.
29. I suspect there are a number of historical and sociological factors working within the epistemic logic community at present that contribute to the impression that the possible worlds analysis genuinely motivates an account of belief. Vincent Hendricks writes in his preface to the recent re-issue of Hintikka’s Knowledge and belief that, “[a]lthough everybody in logic, mainstream and formal epistemology, game-theory, economics, computer science and social software refers to the book it is very likely that a great many have never literally had their hands on it” [Hen05].
In Hintikka’s 1975 proposal, based on Rantala’s notion of urn models, an agent need only
know the consequences of its knowledge that could be established in a certain number
of draws from the urn, depending on the agent’s logical competence. However, agents
still know all substitution instances of propositional tautologies. Moreover, it seems to be
possible for an agent to significantly raise its competence locally, for example by proving a
complicated mathematical theorem, and yet remain ignorant of less complex consequences
of its knowledge.
Agents that do not know all propositional tautologies can be modelled by defining
satisfaction at a world in the style of paraconsistent or relevant, rather than classical, logic.
However, agents are still modelled as logically omniscient (and hence as perfect reasoners)
within paraconsistent or relevant logic, rather than classical logic.
Finally, a notion of awareness was introduced as a syntactic filter on the semantic
characterization of belief. However, to make the idea work, awareness can be given no
semantic properties whatsoever, undermining the claim that possible worlds accounts shed
light on concepts of knowledge and belief. Moreover, defining knowledge and belief in
terms of possible worlds and then adding the notion of awareness on top is both ad hoc
and unnecessary. Exactly the same results could be reached by taking whatever properties
awareness is supposed to have in a particular application and applying these properties to
a purely syntactic characterization of belief.
Chapter 3
The previous chapter discussed several attempted solutions to the problem of logical
omniscience, all of which retained Hintikka’s framework of epistemically possible worlds.
The assumption common to all these accounts is that Hintikka’s analysis of knowledge and
belief in terms of epistemic possibility is correct; and that there is something wrong with
his original notion of an epistemically possible world. In this chapter, I consider several
philosophical theories of belief and belief ascription that are independent of Hintikka’s
framework. Once we have a firm account of belief, we shall be able to determine what
kind of logic is best suited as a logic of belief.
I begin by showing how it is difficult to develop a realistic account of belief in
terms of epistemic possibility when satisfaction at an epistemically possible world can be
specified recursively. I then review several accounts of belief and reject both the Fregean-inspired approaches and those that seek to reduce belief either to underlying brain processes
or to a language of thought. In section 3.5 I argue that belief states are best characterized
in terms of sentences and discuss, in section 3.6, how we should view the relation between
the agent and the relevant sentence. Finally, in section 3.7, I discuss the prospects for a
useful logic of belief.
3.1 Epistemic Possibility
In this section, I argue that the notions of epistemic possibility met in the previous chapter
are not suitable tools for developing an account of belief. Hintikka’s notion of an epistemic
possibility as being what is possible, given what the agent knows, is clearly dependent on a prior
account of knowledge that (according to most accounts) is itself dependent on an account
of belief.
It is striking that Hintikka wants to base his framework on logical, rather than
psychological notions. One might be tempted to think that, whatever epistemic possibilities
actually are, they should be viewed as (at least partly) psychological entities. But, as we
saw, Hintikka is (initially at least) interested in a notion of consistency: a perfectly objective,
non-psychological notion. Hintikka’s characterization of epistemic possibility is therefore
logical, along the lines of possible worlds. However, epistemically possible worlds cannot
be the same things as metaphysically possible worlds. Since identity is a matter of de re
necessity, i.e. entities are necessarily self-identical, identity statements involving distinct
rigid designators are either necessarily true or necessarily false [Kri80]. If ‘a = b’ is true at
any world, it is true at all metaphysically possible worlds in which a exists. However, I
may believe that Bob Dylan is a great songwriter and still believe that Robert Zimmerman
is not, or even have no belief about Zimmerman, even though Robert Zimmerman is in
actuality Bob Dylan.
This is a problem for direct reference theorists, who claim that the only semantic or
truth-conditional contribution of a name is its bearer. Co-referring terms are not in general
substitutable in psychological contexts. This is not, as a direct reference theorist might
claim, merely a pragmatic feature of language. It is a semantic fact that I have the one belief
but not the other, because to claim that I have these beliefs is true, whereas the claim that I
believe otherwise is false. Epistemically possible worlds must be worlds in which Robert
Zimmerman need not be Bob Dylan. Such worlds need not be metaphysically possible.
On Hintikka’s view, the condition for membership of the set of epistemically possible worlds is just the epistemic possibility of a world’s logically primitive true sentences.
The thought is that even though ‘a’ and ‘b’ are co-denoting terms, it is at least an epistemic
possibility that the two refer to different entities. That is, even though a = b in actuality, it
remains epistemically possible that a ≠ b. The truth of logically complex sentences is then
given by the standard recursion clauses for Booleans and quantifiers. The truths about a
particular epistemically possible world will then form a maximal consistent set (a consistent
set that, upon the addition of just one extra formula, would become inconsistent).
The fact that such a construct results in a maximally consistent theory, as the truths
about any metaphysically possible world should,1 does not license the claim that such constructs are spatio-temporal entities of any kind. One may (for independent reasons) hold
the view that all possible mathematical structures exist as abstract, mind-independent entities and thus that epistemically possible worlds qua maximally consistent sets of sentences
have genuine being. But note that such entities cannot be what we invoke when we say
that we know that such-and-such is epistemically possible for an agent firstly because the
agent does not have epistemic contact with such entities2 and secondly because it seems
at least epistemically possible that Platonism about mathematical structures is false (as the
existence of nominalists, intuitionists and constructivists highlights).
It seems that we must treat epistemic possibilities simply as logical notions or
logical points, i.e. as concepts useful in the explication of a formal theory that may be
treated simply as useful fictions. If this is the case, then we cannot gain any intuition about
the logical properties of belief or knowledge by inspecting the natures of epistemically
possible worlds, precisely because there are no such natures. The theory of a logical point
need not be consistent or deductively closed. The relevant and paraconsistent ‘worlds’ met
in the previous chapter are logical points too. We may associate a point with any theory
we like, where a theory is just a set of sentences. Why, then, should we follow Hintikka in
holding that what holds in a particular epistemic possibility is closed under any notion of
logical consequence?
We have no reason to suppose classical logic to be the logic of each epistemically
possible point. We equally have no reason to suppose that any logic (or at least, any
logic other than a specifically chosen, finite set of sentences) can fulfil this rôle without
conflicting with our intuitions about belief. Without an independent account of belief with
which to support the choice of a logic as the logic of each point, any logic we choose will
be just as arbitrary and just as questionable as any other. For the remainder of this chapter,
I therefore dispense with the possible worlds approach and evaluate several candidate
theories of belief that are independent of Hintikka’s approach.
1. On most philosophical views of truth, an inconsistent set of sentences cannot simultaneously be true. One exception is the view known as dialethism, which (motivated by the semantic paradoxes) holds that there are true contradictions [Pri87].
2. At least, the burden is on the mathematical Platonist to explain how the truths holding of mind-independent
mathematical reality can lead to mathematical knowledge without trivializing the issue, such that the agent
knows all mathematical truths that it believes. Current accounts in terms of truth-tracking (across metaphysically possible worlds), reliability or evidence do not seem capable of such an account.
3.2 The Fregean Account
Frege [Fre92] discusses two questions that are of interest to us, viz. (i) why is it that
co-denoting terms are not substitutable salva veritate in belief contexts? and (ii) how is
it possible that certain identity statements are informative? The latter is known as the
problem of cognitive significance and clearly impacts on the former. In summary, Frege’s
solution is that senses mediate reference and that propositions, or thoughts, consist of senses.
For Frege, senses are mind-independent entities, distinct from the physical world and the
realm of language. Thoughts qua entities consisting of senses are not, on this view, mental
entities at all. Thoughts are mind-independent and thus the very same thought (the same
token, not just the same type) may be grasped by more than one person. Understanding
simply consists in the grasping of a thought. Roughly, we may think of the sense of a term
‘a’ as a way in which its referent a is presented. The problem of cognitive significance then
vanishes, for the terms ‘a’ and ‘b’ may have different senses, even if a = b. One would
grasp a different thought in understanding the sentence ‘a = b’ than the thought grasped
in understanding ‘a = a’.
With this framework in place, an account of belief then presents itself. Belief is
just a relation between a believer and a thought. This explains how it is possible for you and me to entertain exactly the same belief (for thoughts are not psychological entities),
without committing one to the direct reference view that beliefs are to be identified with
their truth-conditional content. As remarked above, one can believe that Bob Dylan but not
Robert Zimmerman is a great songwriter, even though Zimmerman is Dylan. Part of the
beauty of the Fregean framework is the symmetry between names and sentences. Names
have both a referent or denotation (Bedeutung) and a sense (Sinn), which is the way in which
the denotation of the name is presented. Similarly, sentences have a sense, which is the
thought they express and a denotation: either the True or the False (Frege considers the True
and the False to be primitive, indefinable entities). In a belief ascription ‘a believes that Bob
Dylan is F’, the denotation of the embedded sentence ‘Bob Dylan is F’ is taken to be the
sense that that sentence would have, were it used in direct discourse. This is why ‘Robert
Zimmerman’ is not substitutable salva veritate in indirect discourse (e.g. in belief contexts)
for ‘Bob Dylan’.
A major problem with this view is the inherently abstract nature of sense. The
metaphor of grasping a term’s sense lacks any explanatory force; nor does a more informative answer seem possible. One simply has to posit non-natural mental powers in order to
account for our understanding and, since a theory of understanding is a theory of meaning, the meaning of language is treated as primitive and impossible to analyze further.
Secondly, if the sense of a sentence is a thought, then we should treat the senses of the
constituents of a sentence as the constituents of thought, i.e. as concepts. Frege allows this
by treating the senses of singular terms (by which Frege included descriptions as well as
names, demonstratives and indexicals) as primitive, whereas the senses of predicates and
relational terms are to be treated as functions.
By way of illustration, let us write ‘σ[P]’ for the sense of the predicate ‘P’ and
‘σ[a]’ for the sense of a singular term ‘a’. The former is a function, from the sense of
a singular term to a thought. σ[P], given σ[a] as its argument, returns the sense of the
sentence ‘Pa’, none other than the thought that a is P. We thus have a compositional way
of analyzing the structure of thought; we may write the foregoing as σ[Pa] = σ[P](σ[a]),
where the latter relatum, structured as a functional application, displays the structure of
the former relatum, i.e. the thought. However, concepts that are not themselves composite
functions do not permit a naturalistic explanation of their formation. Again, we would
have to posit non-natural mental powers to account for concept formation. As with a
theory of understanding, an adequate account of concept formation must also serve as an
explanation; but this does not seem possible on the Fregean view.
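As a toy rendering of this functional picture (my own illustration, not Frege’s; the names are invented), predicate-senses can be modelled as functions from term-senses to structured thoughts:

def sense_of_predicate(predicate):
    # σ[P]: a function from the sense of a singular term to a thought
    return lambda term_sense: ('thought', predicate, term_sense)

sigma_P = sense_of_predicate('is wise')   # σ[P]
sigma_a = ('sense', 'a')                  # σ[a], taken as primitive
print(sigma_P(sigma_a))                   # σ[Pa] = σ[P](σ[a]): the thought that a is P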
It seems that the only option, if we are to consider Fregean senses as having any
explanatory value whatsoever, is to credit them with a concrete existence. One option is to
treat the sense of a proper name as a description, or a cluster of descriptions. This has been
called the descriptivist view of names and is the view that Kripke ascribes to both Frege and
Russell [Kri80, p.27]. However, Russell did not consider ordinary proper names such as
‘Dylan’ to be logically proper names at all. He reserves the term ‘logically proper name’
for the demonstratives ‘this’ and ‘that’ [Rus17, p.211] that, he claims, have no sense at all,
but connect directly with reality. Ordinary proper names are in fact disguised descriptions:
“in order to understand such propositions [involving ordinary proper names], we need
acquaintance with the constituents of the description, but do not need acquaintance with
its denotation” [Rus17, p.216]. Note that Russell is not claiming (as Kripke would have
us believe) that the sense of an ordinary proper name is a description; rather, such names
are definite descriptions in disguise. Russell thus rejects Frege’s sense-reference distinction
(‘On Denoting’ [Rus05]).
Frege does not follow Russell’s treatment of descriptions as quantified phrases
[Rus05], but instead treats definite descriptions as Eigennamen or proper names (although,
as Kent Bach points out, singular term would be a better rendering [Bac06]). So definite
descriptions have a sense too. The sense of a term imposes a condition that its denotation
must satisfy, but Frege is not committed to this being a descriptive condition (e.g. the
expression of a demonstrative thought might reasonably be associated with a perceptual
condition). However, it would seem that, in the case of an ordinary proper name, the only
handle we can get on its sense is to consider it to be a descriptive condition that its referent
must satisfy. Such senses should then be expressible as a description. This at least is what
Kripke should have said.
Now it seems clear that the sense of a proper name could not be associated with
a single description, for then the sense of the name would merely be a definition and
statements such as ‘Scott wrote Waverley’ would be analytic. But, of course, they are not.
Wittgenstein considers associating a cluster of descriptions with a proper name [Wit02, §79]
and is followed by Searle, who claims that “it is a necessary fact that Aristotle has the logical
sum, inclusive disjunction, of properties commonly attributed to him: any individual not
having at least some of these properties could not be Aristotle” [Sea58, p.172].
It is to this theory that Kripke objects. Plato was the teacher of Aristotle, but was
not so necessarily. Somebody else could have taught Aristotle, even though, in actuality,
Plato did; and it is perfectly meaningful, although false, to assert that Plato did not teach
Aristotle. We would still grasp to whom ‘Plato’ refers. If ‘Plato’ means ‘the teacher of
Aristotle’, then such a contingent fact as ‘Plato taught Aristotle’ would be a necessary
truth. Of course, it is not anything of the sort; hence, senses cannot be assimilated to
descriptions or bundles thereof. In response, adding an actual operator to the description
circumvents this modal problem: ‘Plato was the teacher of Aristotle’ is associated with the
description ‘the actual teacher of Aristotle was the teacher of Aristotle’. In this way, the
modal claim ‘Plato might not have taught Aristotle’ is associated with the true description,
‘the actual teacher of Aristotle might not have taught Aristotle’. The latter is in no way
contradictory.
However, Kripke has a more troublesome objection: suppose that Plato was not
even the actual teacher of Aristotle. That is, suppose that the history books are simply
mistaken about this fact, and perhaps about all facts that we tend to associate with Plato.
Perhaps an unheard-of Ancient Greek philosopher, Fred, wrote all the works that we attribute
to Plato, and also taught Aristotle, but Plato somehow managed to take all the credit.
Our intuition is that we can still refer to Plato using ‘Plato’. That is, ‘Plato’ refers to Plato,
not Fred, even though Fred actually taught Aristotle. This is no longer a modal objection:
we are merely supposing that we have got the facts about our actual history wrong. So,
‘Plato’ cannot refer to Plato in virtue of association with a descriptive condition or cluster
of descriptions.
In sum, senses are dubious notions and so not good candidates for an explanation
of belief. That would be to attempt to explain the unclear in terms of the truly obscure.
There are additional problems associated with the claim that the embedded sentence in
a belief report denotes its customary sense. Frege remarks that “[i]f words are used in
the ordinary way, what one intends to speak of is their denotation” [Fre92, p.58]. But
surely this is the case even in belief contexts, for ‘a believes that Dylan is F’ attributes a
belief about Bob Dylan himself, not the sense of the name ‘Dylan’. The thesis that terms
should retain their customary semantic value when embedded within belief reports has
been termed semantic innocence and has been used to argue against Fregean theories of
belief; see [CP89, Sal86, Soa85, Soa87]. Just how successful the objection from semantic
innocence is, is a moot point and so I will not pursue it any further here.
3.3 Realist and Representational Theories
One reaction to Frege’s problematic notion of sense is to reject his claim that thoughts
are mind-independent entities and try to locate psychological phenomena such as belief
within the brain. The question, then, is of the exact relationship between belief and brain
processes. On Churchland’s eliminativist view, the causes of human behaviour are brain
processes (e.g. motor processes) and therefore “our common-sense psychological framework
is a false and radically misleading conception of the causes of human behaviour and the
nature of cognitive activity” [Chu84, p.43]. As a consequence, our folk-psychological
notions of belief, desire and the like are wildly mistaken. “Folk psychology is not just an
incomplete representation of our inner natures; it is an outright misrepresentation of our
internal states and activities” [Chu84, p.43].
The thought is that scientific progress eventually allows us to replace naïve explanations of behaviour with descriptions of their actual causes. Thus, just as explanations
of behaviour in terms of demonic possession were replaced by scientific, psychological
notions that people were willing to accept, explanations of behaviour in terms of belief
and desire could eventually be replaced by explanations in terms of brain activity. However, this assumes that explanations in terms of psychological concepts are in competition
with explanations in terms of brain processes, as if both cannot coexist without a dualist
ontology. This is of course not the case.
Consider descriptions of a computer program executing: an engineer might describe the electrical activity in the computer’s hardware, whereas a programmer would
view the activity as a series of statements being evaluated, with values assigned to variables and so on. A user might even describe the activity at an intentional level (‘it returned
these results because it knew that I was searching for files beginning with a ‘k’’). We would
not say that any of these accounts are any more or any less correct than the other, for they
all describe the same system at different levels of description, although the descriptions
will have differing explanatory powers. Similarly, an explanation of behaviour in terms
of belief and desire is perfectly compatible with an explanation of the same behaviour in
terms of brain processes.
Following this idea, we could claim, along with Smart, that psychological descriptions and descriptions of brain processes are distinct ways of describing the same
phenomena, just as ‘water’ and ‘H2O’ denote the same substance. Mental phenomena are
nothing over and above brain phenomena; in fact, mental states are identical to brain states.
According to this identity theory, just as ‘water is H2O’ is not analytic, neither is ‘beliefs are
X’, where ‘X’ describes whatever brain process beliefs are identical to. The identity theory
is not “the thesis that, for example, ‘after-image’ or ‘ache’ means the same as ‘brain process
of sort X’” [Sma59, p.144]. We are not tempted to say that ‘water’ means ‘H2O’, for there
may be people who know precisely what ‘water’ means but do not know that water is H2O.
The discovery that water is H2O was a scientific, not a semantic, advance. Rather, Smart
holds that:
in so far as ‘after-image’ or ‘ache’ is a report of a process, it is a report of a
process that happens to be a brain process. It follows that the thesis does not
claim that sensation statements can be translated into statements about brain
processes [Sma59, pp.144-5].
However, the claim that mental states happen to be brain states suggests that it is a contingent
fact that mental states are brain states. The relationship between mental states and brain
states could not then be described as an identity relation, for identity holds of necessity
where it holds at all. Besides, it sounds odd to say that, even though our mental states are
in fact brain states, they could (metaphysically) have been foot states, i.e. caused by the
internal workings of our feet. The theory would at the least need to say why the latter is
not a genuine possibility.
If we alter the claim to be one of genuine identity and hold that mental states are
necessarily brain states, we rule out the possibility of any brainless system having beliefs
and desires. This does not sound right either. We might imagine an encounter with an
alien race who behave very much as we do given the same circumstances, e.g. showing
signs of hunger, rummaging through the kitchen cupboards, calling for take away. We
would naturally say that the alien has the desire to be fed, the belief that the cupboards
would be a good place to look for food, that the take away is a good second option and
so on. Now, it may turn out that the internal workings of the alien are greatly different
from ours, yet we might still want to describe this behaviour in intentional terms. People
make use of intentional descriptions of robots and computers in science fiction without
aggravating our sense of reality. Our intuitions allow consciousness to be a multiply
realizable phenomenon, like a piece of software that may be run on different types of
hardware. But this is what the (strict) identity theory rules out.
It is now common to characterize mental states functionally, in terms of physical
properties (e.g. in Sellars [Sel56, Sel74] and Putnam [Put75]). In particular, mental representations are characterized functionally. If we hold in addition that such representational
properties exhaust the mental properties of consciousness (the representational theory of
mind) and that the direct objects of propositional attitudes are compositional mental representations, we have a version of the language of thought hypothesis. This is an empirical
hypothesis, according to which thought occurs in a symbolic system physically realized in
the brain (or some other functionally similar physical system). Believing that φ then consists in having an appropriate mental representation of φ. Such representations are taken
to be primitive semantic properties, organized into a language-like system by syntactic
rules (see, for example, Fodor [Fod87, Fod90]). On this view, questions of psychological
interpretation are settled by appeal to the primitive semantic properties of the language of
thought.
This account flies in the face of Wittgensteinian considerations concerning privacy,
for it seeks to locate meanings within a private mental sphere. To paraphrase Wittgenstein,
suppose that there existed in the mind some private entity claimed to be a primitive
semantic property. In considering the minds of others, which on this picture are as black
boxes to me, “it would be quite possible for everyone to have something different in his
box”—i.e. each person having a different kind of mental entity claimed as a meaning—“[o]ne
might even imagine such a thing constantly changing” [Wit02, §293]. But if we suppose
that talk of meaning has a use in a public language or, as Wittgenstein would have it, in a
language game, then the purported private semantic entity:
has no place in the language game at all; not even as a something: for the box
might even be empty.—No, one can ‘divide through’ by the thing in the box; it
cancels out, whatever it is [Wit02, §293].
Thus, even if one accepts the existence of symbols in the mind (captured functionally in terms of brain processes) that are governed by syntactic rules, this inner mental
‘language’ would still need to be interpreted. The problems of interpretation are discussed
by Quine, who comments that “[t]he metaphor of a black box, so often useful, can be
misleading here. The problem is not one of hidden facts, such as might be uncovered by
learning more about the brain physiology of thought processes” [Qui70, p.180]. Putnam
is in agreement, for “‘[m]ental representations’ require interpretation just as much as any
other signs do” [Put83, p.154]. Thus, it seems that positing a symbolic system in the mind
does not, in itself, answer our questions concerning the semantics of belief.
3.4 Predictive Accounts and the Intentional Stance
We might call the accounts encountered in the previous section realist accounts of belief,
for they seek to explain away propositional attitudes through some property of the brain
or mind. One has a particular propositional attitude iff one’s brain has a certain property. But although we might agree with the ontological commitments of these accounts
(propositional attitudes are nothing over and above properties of the brain), the way we go
about attributing beliefs and desires is not a matter of trying to discover such properties.
Rather, such ascriptions have a predictive purpose at the level of behaviour, and that is
our purpose in ascribing beliefs and desires to an agent. If so, then intentional ascriptions
need not be eliminable in favour of (and their meaning is not fixed by) certain neurological
descriptions.
This thought is found in Quine’s Word and Object [Qui60]. In a strict ontological
sense, “the canonical scheme for us is the austere scheme” according to which there are
“no propositional attitudes but only the physical constitution and behaviour of organisms” [Qui60, p.221]. However, intentional idioms are “practically indispensable” [Qui60,
p.219].3 There are, of course, disagreements within this general viewpoint. Dennett [Den87,
pp.342–343] divides the resulting accounts into those based on a normative principle, according to which we ascribe the attitudes an agent ought to have, given its circumstances, and
those based on a projective principle, whereby one ascribes those attitudes that one would
have oneself in those circumstances.
The former group of accounts includes those based around Davidson’s principle
of charity [Dav85] and Dennett’s own intentional stance [Den87], which is intended as a way
of bridging the gap between realist and interpretational accounts of intentional attitude
attribution (or rather, of claiming that this is a deeply unhelpful dichotomy). Dennett
holds that, “while belief is a perfectly objective phenomenon . . . it can be discerned only
from the point of view of one who adopts a certain predictive strategy, and its existence can
be confirmed only by an assessment of the success of that strategy” [Den87, p.15]. Here,
Dennett is in agreement with Quine in that determining the truth of belief attributions
could not be reduced to the existence of some underlying physical phenomena:
It will often happen also that there is just no saying whether to count an affirmation of a propositional attitude as true or false, even given full knowledge of
its circumstances and purposes [Qui60, p.218].
Dennett describes his approach as follows:
first you decide to treat the object whose behaviour is to be predicted as a
rational agent; then you figure out what beliefs that agent ought to have, given
its place in the world and its purpose. Then you figure out what desires it ought
to have, on the same considerations, and finally you predict that this rational
agent will act to further its goals in the light of its beliefs. A little practical
reasoning from the chosen set of beliefs and desires will in most instances yield
a decision about what the agent ought to do; that is what you predict the agent
will do [Den87, p.17].
Let us call this method of ascribing beliefs the predictive strategy. A first objection is that it
becomes hard to explain false belief. If we follow Dennett’s claim that we should “attribute
as beliefs all the truths relevant to the system’s interests (or desires) that the system’s
experience to date has made available” [Den87, p.18], then how can we explain where false
beliefs come from?
3. See also Sellars [Sel56].
Dennett considers two paradigmatic cases [Den87, pp.18–19]. In the first, an
agent a has a false belief that p because she (truly) believed that b told her that p and
(perhaps also correctly) believed that b is in general reliable and did not intend to deceive
(but happened to be incorrect) on this occasion. In the second case, a (mis)perceives the
coil of rope in the garden as a snake, so falsely believes that there is a snake in the garden.
Dennett’s point here is that:
The falsehood has to start somewhere; the seed may be sown in hallucination,
illusion, a normal variety of simple misperception, memory deterioration, or
deliberate fraud, for instance, but the false beliefs that are reaped grow in a
culture medium of true beliefs [Den87, p.19].
Stich [Sti81] objects that there are cases of false belief that cannot be explained in
any of the ways that Dennett suggests. One of his examples considers a newspaper vendor
who on an occasion gives the wrong change. The problem is that, in ascribing all the beliefs
that the agent should have, given these circumstances, we end up in a mess. Presumably
we ascribe the belief that £2 − £1.30 = 70p, so why did the vendor only hand over 60p
change for a £1.30 newspaper from a £2 coin? Assuming that this was a case of genuine
error, rather than deceit, none of the potential beliefs we could ascribe would explain the
situation. But, as Dennett replies, what is to be explained? If this was a genuine mistake,
perhaps caused by a temporary brain malfunction or miscalculation, then we would not
expect to rationalize the mistake in terms of beliefs and desires [Den87, pp.83–88]. It was
an irrational mistake and so we should not search for rational reasons.
Dennett’s response is acceptable in the case of an irrational error, such as the
vendor’s mistake. The vendor would more than likely be unable to explain why he made
the mistake himself. However, the situation changes when we consider mistakes that
happen because of bounded resources. Suppose a chess player could win a game by
making a particular series of moves but, because he has limited time in which to think and
can only think a certain number of moves ahead, does not make these moves and ends
up losing. In a sense, he has made a mistake because he did not make the most rational
moves (assuming, of course, that he wanted to win). The problem here is that the agent’s
experience—his knowledge of the rules of chess and the positions of the pieces on the
board—makes available to him information about the winning strategy.4 But we should not be tempted to say that the agent had no reason for acting as he did, i.e. with less than ideal rationality.
4. We might be tempted to say that the information about the winning strategy was not available to the agent at all, since his bounded resources did not permit him to access the information in a useful way. I discuss information briefly in section 8.2.3.
If we pointed out the winning strategy to the agent after the game, he
might claim that he could have discovered it himself, if only he had more time, or an ability
to look more moves ahead. So there are reasons we can cite to explain cases of agents acting
with less than perfect rationality. These reasons are not captured by Dennett’s predictive
strategy, which would predict that the agent chooses the winning strategy every time.
To pursue this criticism, Dennett’s distinction between opinion and belief should
be introduced. Opinion is a matter of assent and as such is a classical on-off affair, whereas
beliefs may come by degree, governed by Bayesian rules [Den81, chapter 16]. The link
between the two concepts is that belief provides the basis for an agent’s opinion. Dennett
agrees with de Sousa [dS71] that a Bayesian-style theory of belief should be used “to explain
(or at least predict statistically) the acts of assent we will make given our animal-level beliefs
and desires.” Such beliefs “explain our proclivity to make these leaps of assent” [Den81,
p.304].
In order to evaluate Dennett’s notion of opinion (and, consequently, of belief),
some auxiliary notions are helpful. When an agent makes the judgement that p, the agent
makes a conscious decision (in the case of human agents, at least) that p is the case and thus
believes that p. In judging the world to be a certain way, an agent rationally commits itself to
the logical consequences of that judgement. If an agent judges p1 , . . . , pn to be the case and
q is a logical consequence of these judgements but is rejected by the agent, then we could
point out some error in the agent’s reasoning. In showing the agent that q is a consequence
of judgements it has made, we would expect the agent to either change its mind about q
or else reject one of the original judgements.5 This notion of rational commitment does
not require an agent to explicitly recognize all of its rational commitments. One can point
out that an agent is rationally committed to some undesirable consequence p (a statement
widely held to be false, for example) as a way of persuading the agent to change its mind
about the judgements that it has explicitly made.
5. In talking about the consequences of an agent’s judgements, we may want to restrict the notion to relevant consequences, perhaps by taking relevant implication as our model. In this way, we can rule out strange commitments involving material implications, such as one’s judgements about what to have for tea committing one to (p → q) ∨ (q → r), for any (completely unrelated) propositions p, q, r. In a similar way, the notion of commitment should avoid the ex contradictione quodlibet principle, or principle of explosion, whereby contradictory judgements would commit an agent to every proposition whatsoever. An acceptable, non-explosive notion of consequence must therefore tolerate a degree of contradiction, as paraconsistent logics do. So, the notion of commitment, given what an agent judges, should be characterized along the lines of a paraconsistent, relevant consequence relation.
If beliefs are governed by Bayesian rules and an agent’s opinions are a direct
result of what it believes, then it follows that an agent’s opinions include all of its rational
commitments. If an agent believes p1 , . . . , pn with a high degree of probability and q is
a logical consequence of these beliefs, then the agent should, according to the Bayesian
theory, believe that q with at least that degree of probability. Thus we should not say that
the agent’s opinions include p1 , . . . , pn but not q. A consequence is that an agent will be
predicted to assent to q whenever it assents to p1 , . . . , pn . As a simple counterexample,
consider a first-year student sitting a logic exam. A question instructs the student to
assume p1 , . . . , pn and then asks whether q follows (in classical propositional logic, say).
On Dennett’s view, if q indeed does follow, then the agent cannot fail to answer correctly.
Dennett’s account of belief (together with our assumption that the agent desires to do well
and its belief that, in order to do well, it should give the correct answers and so on) predicts
that the agent will behave as a perfect reasoner, whereas experience tells us that this is
unlikely to be the case.
The question that needs to be addressed now is whether Dennett’s predictive
strategy can produce a notion of belief that does not entail that the agent believes all of its
rational commitments. Dennett’s suggestion is as follows:
One starts with the idea of perfect rationality and revises downwards as circumstances dictate. That is, one starts with the assumption that people believe
all the implications of their beliefs and believe no contradictory pairs of beliefs.
. . . one is interested only in ensuring that the system is rational enough to get
to the particular implications that are relevant to its behavioural predicament
of the moment [Den87, p.21].
Let us call this the downwards revision approach. Now, one might quite legitimately ask:
just how does one revise downwards, without being completely ad hoc? Dennett is not
forthcoming here.
The problem is analogous to that of logical omniscience in the possible worlds
approach to epistemic logic. As we saw in the previous chapter, the problem cannot be
alleviated by weakening the notion of logical consequence in the model (to a paraconsistent
or relevant consequence relation, say). Nor is it sufficient to simply deem all tautologies
irrelevant to the agent’s “behavioural predicament of the moment”. There remain an
infinite number of logical consequences of the agent’s explicit judgements that will be
treated as opinions of the agent. Since these will include, for example, the opinion that
moving the Queen to square K5 is the first move in a winning strategy whenever it is a fact
that this is so, predictions of the agent’s likely behaviour will not be accurate. Just what
counts as relevant to the agent’s behavioural predicament depends not only on what the
agent has explicitly assented to and what it has observed, it also depends on how the agent
reasons and whatever bounds there may be on the agent’s reasoning process. I return to
this point in section 3.6 below.
A further problem for a downwards revision strategy is that Dennett takes a belief
ascription to relate an agent to a proposition, rather than to a sentence. In “How to change
your mind” [Den81], Dennett claims that belief is “best considered divorced from language
. . . by reference to the selective function of the state of belief in determining behaviour”
[Den81, p.305] and that this “selective function” should be viewed on the minimal possible
worlds model. Dennett follows Stalnaker [Sta76] in holding that “a particular belief is a
function taking possible worlds into truth values” [Den81, p.305], thus identifying a belief
with what many take to be an intension or a meaning.6
However, intensions do not distinguish between logically equivalent sentences.
As a consequence, logically equivalent sentences set a limit on downwards revision within
Dennett’s framework. If the agent’s opinions include p and p is logically equivalent to
q, then the agent’s opinions also include q. This is a form of logical omniscience that is
sure to lead to inaccurate predictions of behaviour in reasoning tasks, such as a game of
chess, however logical equivalence is spelt out. One must conclude that the prospects
for downwards revision on Dennett’s view of belief are not at all promising. In the
following section, I criticize Dennett’s claim that belief is best characterized independently
of language and reject the analysis in terms of propositions.
3.5 Sentential Accounts
Above, a distinction was introduced between belief on the one hand and explicit assent
and judgement on the other. This distinction is argued for by Dennett [Den81, Chapter 16] and de
Sousa [dS71]. One way to make the distinction, following Malcolm [Mal73], would be to
claim that, whilst it certainly seems appropriate to say that a chicken believes (or thinks)
that going to the farmer is a way of getting fed, it certainly has not judged or formed the
opinion that this is so; nor has it assented to that statement. Forming judgements and
6. See Lewis’s [Lew75], for example.
opinions and assenting to statements are conscious mental acts, whereas having beliefs
might be viewed as a different class of mental phenomena altogether, operating on a more
fundamental, sub-personal level. This is why it makes sense to attribute beliefs to an agent
that it has not explicitly considered.
However, this does not license the claim that, whilst judgement, opinion and
assent are to be cashed out in terms of statements (i.e. unambiguous sentences), beliefs are
to be ascribed in terms of non-linguistic entities. It may, of course, be the case that the
processes in an agent’s brain that give rise to the behavioural phenomena via which we
attribute beliefs are themselves non-linguistic. However, we must remember that beliefs
are ascribed at a certain level of description of the agent so that, even if the relevant
processes subvenient to belief are intrinsically non-linguistic, we need not conclude that
our ways of ascribing belief should be propositional, rather than sentential. As discussed
above, there may be no interesting question as to what beliefs really are.7
Moreover, the de re propositional content of the sentence that an agent would
use to express her belief might not be adequate as an explanation or prediction of her
behaviour. Consider an agent perpetually annoyed by mobile phones ringing on public
transport who, upon hearing a phone continuously ring whilst on the train to London,
gets increasingly annoyed. Each time it rings, she tries to locate the source of the annoying
ring. Finally, she realizes that she has left her own phone in her luggage at the end of the
carriage, so comes to have a belief that she would most naturally express as ‘it’s my phone
ringing.’ This belief explains her subsequent actions: embarrassment, motion towards her
luggage, apologies to the other passengers etc. John Perry considers a similar example in
[Per93] and concludes that there is something essential (from an explanatory perspective)
about the use of the indexical ‘my’. By substituting salva veritate a definite description such
as ‘the passenger in seat 12A’ for ‘my’, the explanation of the agent’s subsequent behaviour
could be lost. The true belief that the annoying phone belongs to the passenger in seat 12A
does not in itself explain the agent’s behaviour. We would need to add the belief that the
7. There is a sense in which sentences, taken out of context of utterance, do not individuate beliefs appropriately. Suppose two agents each have a belief that they would express as ‘it’s raining.’ Agent a has the belief in London on Monday, b has it in New York on Wednesday. So a believes that it is raining in London on Monday, whereas b believes it to be raining in New York on Wednesday. They have different beliefs and what distinguishes them is not anything linguistic, but rather the de re fact that London isn’t New York, and Monday isn’t Wednesday. However, for all practical purposes—explaining and making predictions about behaviour—the sentence ‘it’s raining’, understood in its appropriate context, is perfectly adequate. Why did the agent take an umbrella? Because she believed that it was raining.
agent would express as ‘I am the passenger in 12A’. In short, the truth-conditional content
of the agent’s belief is not a sufficient explanation of her behaviour.
Following Perry [Per79], it is useful to distinguish between what the agent believes
and her state of belief in so believing. As our embarrassed agent retrieves her phone, the
other passengers in the carriage may well believe our agent to be the owner of the annoying
phone, but they do not share our agent’s feelings of embarrassment and the like. They
all share the same belief—that she owns the annoying phone—but they entertain that belief in
different ways, and so are in very different belief states. Perry’s conclusion is that there is
something essential about the way we characterize such belief states in an agent centred
way, using I, me, here, now. No substitute for ‘I’ or ‘me’ would allow us to explain the
agent’s egocentric behaviour. It is most natural, then, to classify belief states at a cognitive
level, in terms of I-thoughts; and the way we typically attribute I-thoughts is through
direct quotation: she believed ‘that’s my phone.’ We classify belief states, therefore, using
sentences. The same considerations apply when classifying desire states. If all the runners
in the race want to win, for example, then they are all in the same (local, not total) desire
state. Yet there is no one contender such that all the contenders want that person to win,
so they all have different desires.
However, even with this distinction in place, it is still not correct to say that
what an agent believes in having a belief is a function from worlds to truth values, as
Stalnaker claims it is [Sta99]. On this view, one would believe the same thing in believing
that Fermat’s Last Theorem is true and that 1 + 1 = 2. Similarly, in believing any logical
falsehood to be true, one would believe the same as one would in believing that 1 + 1 = 3,
i.e. the constant function taking any world to false. This same constant function would also
account for beliefs about Superman and Pegasus. Moreover, one would believe the same
thing in believing either that Bob Dylan or that Robert Zimmerman is a great songwriter.
These are all unintuitive results; they do not square with what an ordinary speaker means
by what one believes when one has a belief. What Stalnaker’s view of propositions does
achieve in a particularly elegant way is a characterization of the truth-conditional content
of a belief. We should conclude that whatever it is that people believe when they have a
belief should not be identified with the truth-conditional content of their belief.
One might then argue that belief is a relation between an agent and a proposition
but that propositions should not be understood as functions from worlds to truth values.
An alternative is to consider propositions to be structured entities containing semantic
values, i.e. particulars, properties, relations and descriptive conditions. This is known as
the Russellian view, popularized by Kaplan [Kap89] (amongst others) and adopted by direct
reference theorists. King [Kin96] considers structured propositions, a development on the
Russellian notion that includes the entire syntactic structure of a sentence, represented in
tree form, with semantic values appended to leaves. In Kaplan’s framework, a sentence
(in a context) first expresses a (Russellian) proposition, which is then evaluated for a truth
value at a time and a world.
Such propositions play a rôle intermediate between truth-conditional content and
belief state classification. They do not appear to offer any advantage over Stalnaker’s
view in terms of specifying truth-conditional content and do not represent a complete
solution to the problem of belief state classification. Structured propositions allow one
to distinguish the belief that Fermat’s Last Theorem is true from the belief that 1 + 1 = 2
but not between beliefs that differ only in the salva veritate substitution of one semantic
value for another, without altering syntactic form. For example, the beliefs that Dylan is
F and that Zimmerman is F relate an agent to precisely the same Russellian or structured
proposition.
In fact, it is not necessary to assume the existence of propositions to explain (or
act as the bearers of) the truth of sentences, even in direct discourse. A sentence can only
express a proposition if it can be suitably disambiguated and situated in a relevant context.
The sequence of shapes ‘I am not here today’, worked into the sand by a crab crawling
to and fro, does not express a proposition. So we may assume that we are dealing with
statements when we talk about the truth of a sentence in a context, i.e. a fully disambiguated
sentence. Given that the truth of a statement can be determined at a world, why not use
whatever mechanisms are used to establish which proposition the statement expresses to
establish its truth value directly?
The problem here has to do with the distinction between rigid and non-rigid
terms. Descriptions, for example, pick out whatever satisfies their descriptive content at
the world that they are evaluated at, whereas proper names pick out the same individual
across all worlds (in which that individual exists). Evaluating some sentences directly
across worlds would not give the correct result. For example, there is no world in which
the referent of ‘Tony Blair’ is not called ‘Tony Blair’; yet Blair’s parents did not name
Tony ‘Tony’ out of necessity. Kaplan’s thought is that there need to be two stages to the
truth-determining process: firstly, that of a sentence (in a context) expressing a proposition
and secondly of evaluating that proposition at a world. However, neither a two-stage
evaluation procedure nor the use of propositions is necessary. As discussed in section 3.2
above, our modal intuitions can be captured without the use of propositions, by evaluating
suitably world-insensitive versions of statements. For example, by replacing ‘Tony Blair’
with ‘the actual referent of ‘Tony Blair’’, we can evaluate ‘Tony Blair is called ‘Tony Blair’’
across worlds and obtain the right results (namely that it is a contingent truth).
Returning to belief contexts, the question of whether a particular belief is true or
not is precisely the same as the question of whether the sentence that we use to classify the
agent’s belief state is true in the appropriate context. The picture we have is as follows.
We classify an agent’s belief state using sentences. Moreover, the truth of the beliefs thus
ascribed can be determined by the disambiguation and addition of ‘actual’ operators to
these sentences, without appeal to propositions. Similar remarks apply to the attribution
of desires to an agent.
Dennett’s worry here is that language “forces us on occasion to commit ourselves
to desires altogether more stringent in their conditions of satisfaction than anything we
would otherwise have any reason to endeavour to satisfy.” Language is too specific
for the specification of desire, for “you often cannot say what you want without saying
something more specific than you antecedently mean” [Den87, p.20]. These worries apply
equally to the classification of belief states, “where our linguistic environment is forever
forcing us to give—or concede—precise verbal expression to convictions that lack the hard
edges verbalization endows them with” [Den87, p.21]. We may object here that language
frequently does not look as precise as Dennett would have us believe. Vagueness, in
particular, is an intrinsic feature of natural language. Our predicates tend not to neatly
partition the domain, but instead direct us to a sample to which the present case may be
more or less similar. We understand the meaning of ‘mountain’ perfectly well, despite
not being able to pick out exactly where the plain ends and the mountain begins. One
has grasped the meaning of the concept adequately when one can pick out mountains in
paradigmatic cases, e.g. when one can distinguish clearly mountainous regions from flat
areas, or from gently undulating hillside. In a similar way, we make extensive use of vague
quantifiers such as for most. Furthermore, even when we use a determinate quantifier for
all, the domain of quantification is nearly always contextually specified and need not be
so in a precise way. Many would interpret the announcement, ‘all planes are grounded’,
made at Heathrow airport, to mean that all of the planes at that airport are grounded. It is
not falsified by a plane taking off from Gatwick airport.8
We can point to numerous examples in which an expression of desire suggests
satisfaction conditions broader than our antecedent desire. This does not show that the
desire does not have an intrinsic language-like component, but only that the agent chose
the wrong way of expressing her desire. Moreover, in expressing a desire linguistically,
one can appeal to all the usual pragmatic features usually associated with discourse. A
desire to eat a low-fat meal, which excludes eating dust as a satisfaction condition, is
perfectly well expressed as ‘I’d like something low in fat’ in a restaurant setting. Anyone
thinking that serving the utterer a plate of dust would satisfy the request is not playing
within the conventions of the game. We often say things that, taken literally, are either
more general or more specific than we mean, but this does not imply that meanings cannot
be expressed linguistically. It merely highlights how conventional practice allows us to
express ourselves concisely and efficiently. The same holds for belief and desire. Sentences,
like beliefs and desires, allow a range of precision and there is no reason to suppose that,
in general, sentences need to be more precise than beliefs or desires.
This is the main positive conclusion that I want to draw in this chapter: states of
belief and desire should be characterized in terms of sentences. It also seems possible to
characterize what is thereby believed or desired in terms of sentences as well, rather than
in terms of propositions, although this is not essential to the account of belief that follows.
3.6 Belief and Acceptance
Having argued that intentional attitudes are best characterized in terms of sentences rather
than propositions, I now turn to discussing a number of ways in which we might view the
relationship between an agent and the sentence that we use to classify its attitude. Perry
and Corazza offer the notion of mentally accepting the sentence. Quine puts forward a
view based on direct quotation of one’s own imagined responses to an agent’s behaviour.
Finally, we can amend Dennett’s normative account to accommodate sentences, in place of
propositions.
Perry, a proponent of the sentential view of intentional attitude ascription, writes
8. This latter consideration also applies to definite descriptions, on the Russellian interpretation of ‘the F’ as ‘there is exactly one F such that . . . ’. The range of the existential quantifier is assumed to be contextually specified, such that ‘the book in the corner’ may denote a particular book, even though there are likely to be many books lying in corners in other parts of the world.
that:
One has a belief by accepting a sentence. . . . [Belief] states have typical effects
which we use to classify them. In particular we classify them by the sentences
a competent speaker of the language in question would be apt to think or utter
in certain circumstances when in that state. To accept a sentence S is to be in a
belief state that would lead such a speaker to utter or think S [Per80, p.45].
Acceptance is clearly a relation between a cognitive agent and a sentence. Perry is keen
to emphasize that he is not equating belief with acceptance, for two agents
can have distinct beliefs by accepting the very same sentence. For example, distinct
agents have distinct beliefs in accepting the sentence ‘I am tired’; each believes, of itself,
that it is tired. Yet acceptance plays an important rôle in classifying an agent’s belief state,
as discussed in the previous section.
Talking in terms of acceptance of a sentence highlights the difference between
being cognitively related to ‘Dylan is F’ and to ‘Zimmerman is F’. Corazza asks us to
imagine the subject as having an indefinite number of sentence tokens and tokens of the
psychological verbs ‘believe’, ‘desire’ etc. placed in front of her. Then “[e]ach time our
agent entertains an attitude she is asked to do two things: (i) pick out a psychological verb
and (ii) choose from among the sentences the one she would use to express her attitude.
The sentence she picks out or points to is the sentence she accepts” [Cor04, p.260].9
However, here we are invited to think of acceptance as a conscious mental activity,
along the lines of making a judgement or an act of explicit assent. If accepting a sentence
is a necessary requirement for entertaining a belief one would, on this view of acceptance,
have very few beliefs indeed. I have many beliefs to which there corresponds no explicit
act of acceptance, for example the beliefs that there is a chair in front of me, that it will not
move as I sit down on it, that it will bear my weight and so on. If the notion of acceptance
is to be of any use, then belief should imply a disposition to accept the relevant sentence, or
else the notion of acceptance should itself be spelt out in dispositional terms, such that one
may accept a sentence implicitly. Note that talk of (tacitly) accepting (or being disposed
to accept) each of an infinite number of sentences is compatible with treating the agent as
having finite resource bounds, provided that each sentence could be explicitly accepted
by the agent in question, based on its other beliefs.
9. Corazza uses the term n-acceptance, which does not entail accepting as true. This allows for an analysis of desire, supposition etc. where the agent does not take the n-accepted sentence to be true. However, in the case of belief, n-acceptance does entail accepting a sentence as true.
Corazza analyzes acceptance (and hence tacit belief) in counterfactual terms, making reference to the mental representations of the attributer and attributee. One attributes
the acceptance of a sentence S to another when:
the attributee’s token mental representation (or cognitive particular) is similar
to the one that would cause the attributer, in the attributee’s context, to utter
S. . . . two agents’ token representations are of the same type insofar as they
[accept] the same sentences [Cor04, p.262].
‘Similar’ here means ‘of the same type’. Corazza is explicitly committing himself to a
representational theory of mind: “A mental state . . . is a mental representation plus the
attitude relation (belief, desire, etc.) the agent bears to the proposition. . . . A belief state, for
example, is a mental representation embedded within the operator” [Cor04, p.257].
Corazza thus holds that all belief states are to be analyzed in terms of having an appropriate
mental representation. In order to account for representations of negative facts, as would
be required for the belief that there are no pink elephants dancing on campus, Corazza
must rely on logically complex, compositional representations. Work then needs to be
done to show that this view does not collapse into the language of thought view discussed
in section 3.3 above.
A further problem is that having an appropriate mental representation is not
sufficient for having the corresponding belief, even if the cause of the representation is of
the sort that usually results in belief (such as perception). If I know that my eyes are bad
and that I often mistake coils of rope for snakes on foggy days, for example, I am unlikely
to believe that there is a snake in the garden, even if there is. What such agents believe is
likely to depend on their other attitudes.
A final problem is that, on Corazza’s approach, it would not be correct to ascribe
intentional attitudes to agents that have vastly different kinds of mental representations
to humans. We might imagine interaction with an alien race who behave similarly to
us in similar circumstances, such that we would be tempted to explain and predict their
behaviour by saying that they have such-and-such beliefs and desires. It might turn out
that their mental representations (if they have any at all) are wildly different from ours
and yet our inclination to use intentional vocabulary in describing the agents highlights
the everyday meanings we associate with ‘believes’ and ‘desires’. As Quine says, “[t]he
problem is not one of hidden facts” [Qui70, p.180]. Quine’s own view is that we classify
an agent’s belief state as follows:
we project ourselves into what, from his remarks and other indications, we
imagine the speaker’s state of mind to have been, and then say what, in our
language, is natural and relevant for us in the state thus feigned. [Propositional
attitudes] can be thought of as involving something like quotation of one’s own
imagined verbal response to an imagined situation [Qui60, p.219].
Ascription is thus “an essentially dramatic act” founded in part on our “dramatic virtuosity” [Qui60, p.219]. To some extent, Corazza agrees with this position, writing that “[w]hen
we attribute an attitude to someone, we often imagine ourselves in her situation” [Cor04,
p.259]. Stich is also in agreement that “[i]n saying what someone else believes, we describe
his belief by relating it to one we ourselves might have. And we indicate this potential
belief of our own by uttering the sentence we would use to express it” [Sti83, p.79].
It might be thought that there is an intrinsic difference between quotation and
belief ascription, namely that quotation seeks to report something about the attributee’s
relation to the quoted words, whereas belief reports say something about the world according
to the attributee. This objection does not hold much water, for it is the result of giving too
much consideration to the use-mention distinction which, although useful when discussing
meaning and reference, can often be rather artificial in other contexts. We are not tempted
to say that cases of mixed quotation are merely about words; for example, the sentence:
Quine held that ascription is “an essentially dramatic act”
says that Quine held that ascription is an essentially dramatic act. It also conveys that the
choice of phrasing is Quine’s, not mine. The sentence is just as much about dramatic acts as
it is about Quine and ascription, despite the fact that the reference to the former is enclosed
within quotation marks. So it is with other intentional ascriptions.
However, just what one agent would assent to after projecting itself into another
agent’s position can depend on the first agent’s reasoning ability. Consider two agents
playing chess, both of whom have exactly the same relevant perceptual beliefs about the
position of the pieces on the board, both know the rules of the game and neither has
any prior beliefs about what constitutes a good move in certain situations. Each agent’s
evaluation of how good a move is will be based solely on its perceptual beliefs and its
knowledge of the rules, e.g. that a Queen is more valuable than a Bishop. Nevertheless, the
agents might differ in what they take the best move available to player 1 to be. Suppose
that, after consideration, player 1 decides to take move m. Player 2, a more able reasoner
than player 1 (say, player 2 has more memory available than player 1 and so can look more
moves ahead) can establish that a better strategy is available to player 1 by making move
m′ .
In this situation, player 2 would make move m′ if it were in player 1’s position.
Yet even player 2 should, on seeing player 1 make move m, say that player 1 believed that m
was the best move available. This highlights that, in order to make accurate ascriptions of
beliefs, one needs to take into account the reasoning ability of the agent to whom the belief
is ascribed, including any resource bounds imposed in the situation (such as the agent’s
available memory, or a time limit before a move needs to be made). I discuss this issue
further in section 3.7 below.
Above, I disagreed with Dennett’s analysis of belief in terms of propositions but
did not rule out an amended version that makes use of sentences in place of propositions.
Dennett’s approach held that we should identify an agent’s beliefs with “the truths relevant
to the system’s interests (or desires) that the system’s experience to date has made available”
[Den87, p.18]. However, it is not clear just what counts as an agent’s experience making a
truth available. Suppose m′ is the best move available to player 1 and player 2 can know
this, based on its own experience. Since player 2’s experience of the game is identical to
player 1’s, ‘m′ is the best move’ is a truth that has been made available by player
1’s experience. But again, it is wrong to say that player 1 believed that m′ is the best
move to make, given that it made move m instead. Rather, Dennett needs to appeal to
the apparent truths that have been made available through the agent’s experience, where
making an apparent truth available implies that the agent could explicitly assent to the relevant
sentence, based on that experience. Ascriptions made in this way appeal to some notion
of a resource bound. We count an apparent truth as being made available by experience
when it can be easily inferred from perceptual beliefs, but not when it would take the agent
a lengthy reasoning process to explicitly assent to that sentence, given the same perceptual
beliefs. I return to this point in the following section.
To summarize, Perry’s notion of acceptance is promising, but should not be spelt
out in terms of mental representations as Corazza suggests it should. Quine’s idea is
intuitive, but struggles to account for discrepancies between the responses of agents with
differing reasoning capabilities to the very same situation. Dennett’s normative account
is also intuitive but it is difficult to further cash out the notion of an agent’s experience
making certain truths (or apparent truths) available in a way that does not assume that the
agent believes all logical consequences of its beliefs.
The main advantage of Dennett’s and Quine’s accounts is the holistic treatment
of belief and desire and their relation to behaviour. There is no reason that this view
cannot be combined with aspects of Perry’s. We characterize beliefs in terms of sentences
which the agent in question bears a dispositional relation to. In cases of belief, for instance,
this dispositional relation is what may variously be termed acceptance, assent, or simply
holding true. We take an agent to hold a sentence as true when the agent exhibits certain
behaviour. A paradigmatic case is asserting that very sentence. A full account of belief
may be couched in terms of the disposition to assert the relevant sentence, if questioned,
provided that the agent does not intend to deceive and is not under duress to give that
answer.
An alternative characterization of assent is given by Barker [Bar04], who treats
assent as a commitment to a particular mental property. In asserting the sentence that it
assents to, an agent is defending the associated mental property. This approach seeks to
ground meaning in general in terms of such acts of defence within a theory of speech acts.
A full discussion is thus clearly outside the scope of the present work. For the remainder of
this chapter, I discuss the prospects for a useful logic of belief, given these considerations.
3.7 Prospects for a Logic of Belief
Above, I argued that both Quine’s projective account and Dennett’s normative account
need to appeal to a notion of a resource bound. This is apparent when attitudes are
ascribed to an agent a in a context that depends on a’s reasoning. In predicting how a
student will perform in an exam, or which move a player will make in a game of chess, we
are tacitly predicting what beliefs the agents will come to have (i.e. the beliefs that will lead
to the answers given in the test or to the moves made in the game). To predict what move
an agent will make, given the current state of the game and the agent’s prior knowledge, is
not simply a matter of saying what move I would make in the agent’s situation (as Quine
says). Neither is it simply a matter of what move the agent should make, given the state of
the game (as Dennett suggests it is).
For Quine’s view to be plausible, it must hold that an ascription is an act of
imaginative projection into the agent’s situation, in which the attributer simulates the
agent’s reasoning abilities. Suppose I can infer that my making move m1 would put my
opponent in a strong position, were she to make move m2 on her next turn. However, the
planning that shows m2 to be a good response to m1 is complicated and I judge that my
opponent will not spot the chance. On the basis of such reasoning, I may well make move
m1 although I realize that, against an ideal opponent, it would be inadvisable. Reference to
my opponent’s reasoning ability plays an important part in this chain of reasoning. Had I
believed the opponent to be a more capable reasoner, I would have been less likely to make
move m1 .
Similar remarks apply to Dennett’s normative view. An agent should not be said
to believe whatever truths are made available by its relevant experience, for this prohibits
an agent believing a move m to be a good move, based on a certain state of a game of chess,
when in fact m is a bad move in that situation. Agents undeniably have such erroneous
beliefs; in fact, they play an important rôle in the explanation of behaviour. On Dennett’s
original view, making an inadvisable move in a game of chess is inexplicable, whereas we
often do explain such mistakes in terms of the agent’s ability (or lack thereof).
To make this point more precise, let us interpret Dennett’s somewhat vague notion
of the truths made available by an agent’s experience to be relevant observation sentences in
Quine’s sense [Qui69, QU70]. Quine and Ullian’s methodology in [QU70] mirrors the
present concern with sentences rather than propositions or states of affairs: “let us ask no
longer what counts as an observation, but turn rather to language and ask what counts as
an observation sentence” [QU70, p.23]. The important feature of an observation sentence
for present purposes is that:
any second witness would be bound to agree with me on all points then and
there, granted merely an understanding of my language. . . . In short, an
observation sentence is something that we can depend on other witnesses to
agree to at the time of the event or situation described [QU70, p.23]
such that “all reasonably competent speakers of the language be disposed, if asked, to
assent to the sentence under the same stimulations of their sensory surfaces” [QU70, p.23].
It seems sensible to emend this condition to require instantaneous and unreflective assent, to
rule out cases of Socratic prompting and the like.
On Dennett’s view, if o is a relevant observation sentence corresponding to a
situation experienced by an agent a, then we should take a to believe that o. In the chess
example, this might include sentences detailing the positions of the pieces on the board.
The question then is, given a set O of observation sentences, treated as beliefs, what other
beliefs should be attributed to the agent? The correct answer seems to be: whatever the
agent could infer (or mis-infer) from O, given that agent’s reasoning ability and any other
restrictions on its reasoning (such as a time limit on making moves in a game of chess). This
analysis allows a source of false belief into the account, which Dennett’s original proposal
excludes. The false belief that, say, m is the best move available, given O, can be no part
of an ideal theory that incorporates O. Yet an agent reasoning non-monotonically (taking
m to be a good move unless it can deduce reasons to the contrary, for example) may well
infer (non-monotonically) that m is the best move available from O.
Given these points, the prospects for a logic of belief, in general, depend on the
prospects for modelling the kind of reasoning that the agents in question engage in. In
the case of ideal agents, we may assume that all logical consequences of the relevant set
O of observation sentences are believed. We might say that classical logic is the logic
that describes an ideal agent’s reasoning ability, for part of what we mean by an ideal
agent is that its beliefs are consistent and closed under classical consequence. The classical
consequence relation Cn is a function that, given a set O of observation sentences, returns
the beliefs of an ideal agent whose relevant experience corresponds to O.
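Schematically (my gloss rather than a definition taken from the thesis), the ideal agent’s belief set on relevant experience O is just the classical closure of O:

Cn(O) = {φ : O ⊢ φ}.

Since Cn is idempotent (Cn(Cn(O)) = Cn(O)), this belief set is a fixpoint of further reasoning, and it is consistent whenever O is.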
There appears to be no formal logic that describes human reasoning processes
accurately in this sense. Humans are fallible reasoners in a way that formal logics do not
capture well. Although progress has been made on default logic and other non-monotonic
logics, it is unlikely that a formal consequence relation Cn′ can be defined for a human
agent a such that Cn′ (O) describes a’s beliefs whenever O describes a’s relevant experience.
For the class of artificial agents for which we have specialist knowledge of their
reasoning processes, the prospects for a logic of belief are better. Given knowledge of how
the agent reasons (that is, knowledge of how the agent comes by new beliefs from
old) and how much of this reasoning the agent can perform within a period of time, it
is possible to make general yet accurate statements about what the agent can and cannot
discover by reasoning within that period. Similarly, if we know that a particular solution
to a problem requires a certain amount of memory, but that the agent in question has a
smaller amount of memory available for the current task, then the agent will not be able to
solve the problem.
Consider an agent that executes in sense-think-act cycles, where a single inference
rule is applied to the agent’s current beliefs in the think part of the cycle. If we know
what beliefs the agent begins with and what percepts it will gain at each cycle, then it is
possible to determine which beliefs the agent can have at a particular cycle. If we can
further calibrate the (approximate) length of time it takes for each cycle to complete, then
accurate temporal predictions are possible. In this case, a consequence operator Cn′ can be
defined in terms of the agent’s inference rules which, given the agent’s prior beliefs and
percepts at a cycle, returns the set of beliefs that the agent can have at the next cycle.
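The following is a minimal sketch of such an operator (my own illustration, not the formal definition given later in the thesis): sentences are plain strings, a rule pairs a tuple of premises with a conclusion, and since the think step applies at most a single rule, the operator returns the set of belief states the agent can be in at the next cycle.

```python
# A toy one-cycle belief-update operator for a rule-based agent (illustrative only).
# Sensing adds the cycle's percepts; thinking applies at most one inference rule,
# so the result is the set of possible successor belief states.

from typing import FrozenSet, Set, Tuple

Sentence = str
Rule = Tuple[Tuple[Sentence, ...], Sentence]   # (premises, conclusion)
BeliefState = FrozenSet[Sentence]

def next_states(beliefs: BeliefState,
                percepts: Set[Sentence],
                rules: Set[Rule]) -> Set[BeliefState]:
    """Cn'-style operator: prior beliefs plus this cycle's percepts, followed by
    at most one act of inference by any applicable rule."""
    sensed = frozenset(beliefs | percepts)            # sense step
    successors = {sensed}                             # the agent may infer nothing
    for premises, conclusion in rules:
        if all(p in sensed for p in premises):        # rule instance is applicable
            successors.add(sensed | {conclusion})     # a single act of inference
    return successors

# Example: a modus-ponens-style rule firing once the percept 'p' arrives.
rules = {(("p", "p -> q"), "q")}
print(next_states(frozenset({"p -> q"}), {"p"}, rules))
# e.g. {frozenset({'p -> q', 'p'}), frozenset({'p -> q', 'p', 'q'})}
```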
A logic along these lines is presented in section 4.5 below and the same methodology forms the basis of the logic developed in chapters 5–7. Before that, I review sentential
logics of belief (sections 4.1–4.4) and evaluate them against the background of the points
just raised.
Chapter 4
Sentential Logics of Belief
4.1 First-Order Accounts
The discussion in the previous chapter concluded that an agent’s belief state is best captured
in terms of sentences. It is quite natural, therefore, to look to what has variously been termed
the syntactic or sentential approach to epistemic and doxastic logic in the literature. Our
main task is to find a logical system that is capable of capturing the framework sketched
in section 3.6, but we first need to decide on a suitable language for reasoning about an
agent’s beliefs. Although first-order logic (FOL) is an obvious candidate, I eventually settle
on a modal language.
In a first-order language, ascriptions of belief and knowledge are treated in subject-predicate form. The predicate classifies the type of attitude, whereas the subject classifies
the attitude itself, or the accepted sentence, to use the terminology discussed in section 3.5.
For this discussion, consider a first-order language containing a belief predicate Bi (x) for
each agent i. The variable x ranges over objects of the domain, so that whatever beliefs are
ascribed to the agent must be considered to be objects in some sense. Such objects are not
propositions or states of affairs, for ‘Bi (Dylan = Zimmerman)’ is not a well-formed formula
of FOL. Moreover, whatever objects we take the variable x to range over must somehow
be related to the regular sentences of FOL, so that we can say not only that agent i has this
or that belief, but that this one is true, this one false, this one shared by agent j and so on.
In a sense, we already have the answer to the first predicament in front of us.
We want to classify an agent’s belief state using sentences but, of course, sentence tokens
are themselves objects. One may point to a line of graffiti saying ‘Bush is a moron’ and
comment that Bob believes that. Such sentences are perfectly well-formed in English; in
fact, this was how the ‘that’-clause was introduced, as a way of demonstrating a particular
sentence. Returning to FOL, we might include in the domain a denumerable number of
sentences s1 , s2 , . . . with Bi (sn ) read as ‘agent i believes sn ’, or perhaps more vividly as ‘agent
i believes that’, with the reporter pointing to a token of sn . Here we might even hold that
the term ‘believes’ itself does the pointing, so that sentences superficially of the form ‘i
believes s’ are actually of the form ‘i believes that [s]’. As Quine, Davidson and Lepore
have shown, there is plenty of scope for philosophical ingenuity here [Dav68, LL89].
Capturing an agent’s knowledge in terms of sentences is often called the syntactic
approach to knowledge representation, studied in [Ebe74] and [MH79]. Hayes and McCarthy [HM69] present the first-order situation calculus, containing a predicate ‘holds’ such
that ‘holds(p, s)’ says that proposition p holds in situation s. The mental situation calculus
includes a predicate ‘knows’ that takes concepts as its argument. In their notation, the terms
‘mike’ and ‘telephone’ are regular terms, denoting Mike and the function taking individuals into their telephone number, whereas ‘Mike’ and ‘Telephone’ denote the associated
concepts. Thus, it might well be the case that telephone(mike) = telephone(mary), yet
an agent may well conceptualize these numbers in distinct ways, i.e. Telephone(Mike) ≠
Telephone(Mary). One then expresses knowledge of Mike’s telephone number in a situation s as ‘holds(knowsi (Telephone(Mike)), s)’. The notion of concepts in the mental situation
calculus sounds remarkably similar to Frege’s notion of sense, with concepts of individuals
as primitive concepts and other concepts captured as functions. Moreover, in order to capture knowledge of other agents’ knowledge, such concepts must be common to all agents,
again reminding us of Frege’s insistence that senses are objective. Philosophical objections
to Frege’s account were discussed in section 3.2.
Rather than dealing with the somewhat mysterious notion of concepts, better use
can be made of the idea that sentences are themselves objects. As such, they can be named,
either demonstratively using ‘that’ or, as is more appropriate to FOL, using a constant.
What is then required is a relationship between sentences and the names we give them.
In English, quotation can play this rôle, directing as it does one’s attention to the form of
words employed. A quotation device ⌜·⌝ can be used, such that for any sentence φ, the
term ⌜φ⌝ denotes φ. The argument of the predicates Bi and Ki thus ranges over terms of
this sort. As Quine and Ullian remark, it is perhaps better to read ‘Bi (φ)’ and ‘Ki (φ)’ as
‘agent i believes/knows that the sentence ‘φ’ is true’, thus deflecting the question of what
is believed in the more usual belief reports altogether [QU70, p.12].
Since ⌜φ ∧ ψ⌝ denotes φ ∧ ψ it is not, in itself, in any way connected to either ⌜φ⌝
or ⌜ψ⌝ (or to φ or ψ for that matter). Thus to link the term ⌜φ ∧ ψ⌝ to ⌜φ⌝ or to ⌜ψ⌝, more
machinery is needed. As in English, quotation is related to assertion through the truth
predicate, functioning as a disquotation device. Thus, ‘the sentence ‘Dylan is F’ is true’ just
says that Dylan is F. The two are equivalent, as captured by Tarski’s T-scheme [Tar76].1
Using a truth predicate ‘T’ ranging over terms standing for sentences, we have:
(T)   T(⌜φ⌝) ↔ φ
By adding (T) as an axiom scheme to a theory T and substituting, we obtain
⊢T T(⌜φ ∧ ψ⌝) ↔ φ ∧ ψ ↔ T(⌜φ⌝) ∧ T(⌜ψ⌝), and so ⌜φ ∧ ψ⌝ can be seen to denote a
conjunction. By combining the quotation device ranging over all sentences of the language
with the truth predicate, ranging over quoted sentences, we have an expressive2 way
to formally represent beliefs. This approach allows for both semantic ascent and descent
[Qui60], i.e. quotation and disquotation, allowing full quantification over object language
sentences.
Another option is to represent the structure of sentences directly in the syntax of the term, using functions standing for the Boolean connectives. For example, conj(⌜φ⌝, ⌜ψ⌝) might be used in place of ⌜φ ∧ ψ⌝ to capture the conjunction structurally. More precisely, for every atomic sentence Rn(c1, . . . , cn) in the language, i.e. an n-ary relation symbol followed by n constants, let there be a constant ⌜Rn(c1, . . . , cn)⌝ such that T(⌜Rn(c1, . . . , cn)⌝) ↔ Rn(c1, . . . , cn) is valid. Call these constants q-constants, and call terms that only contain function letters or q-constants q-terms. Then let neg be a function from q-terms to q-terms and conj, disj and imp functions from pairs of q-terms to q-terms. Yet another alternative is to use Gödel numbering in place of both the ⌜·⌝ operator and the structural functions. In what follows, I shall refer to the simplest syntax, using ⌜·⌝ alone.
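As a rough illustration of how q-terms might be realized as a data structure (my own sketch, not part of the thesis: the constructors mirror the neg and conj functions above, disj and imp would be analogous, and the valuation-based true function is a toy stand-in for the disquotational truth predicate):

```python
# Toy realization of q-terms: structured names for object-language sentences,
# together with a disquotation-style evaluation (illustrative assumptions only).

from dataclasses import dataclass
from typing import Dict

class QTerm:
    """A term naming an object-language sentence."""

@dataclass(frozen=True)
class QConst(QTerm):
    name: str                # a q-constant naming an atomic sentence

@dataclass(frozen=True)
class Neg(QTerm):
    arg: QTerm               # neg: q-terms -> q-terms

@dataclass(frozen=True)
class Conj(QTerm):
    left: QTerm
    right: QTerm             # conj: pairs of q-terms -> q-terms

def true(t: QTerm, valuation: Dict[str, bool]) -> bool:
    """T(⌜φ⌝) behaves exactly as φ itself, read off the structure of the q-term."""
    if isinstance(t, QConst):
        return valuation[t.name]
    if isinstance(t, Neg):
        return not true(t.arg, valuation)
    if isinstance(t, Conj):
        return true(t.left, valuation) and true(t.right, valuation)
    raise ValueError("unknown q-term")

# T(conj(⌜Rain⌝, ⌜Cold⌝)) comes out exactly as Rain ∧ Cold would:
print(true(Conj(QConst("Rain"), QConst("Cold")), {"Rain": True, "Cold": False}))  # False
```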
From a formal point of view, the syntactic approach brings with it the large amount
of proof theory that has been developed for first-order logic, making the mechanization of
proof systems for the syntactic approach relatively simple. However, the metalanguage in
which we formalize a syntactic account can become confusing quickly, due to the amount
of quotation necessary when we consider multiple-embedded reports and multi-agent
1. I am assuming that the predicate ‘is true’ has a transparent, disquotational and non-expressive meaning. For the view that truth is an expressive property, see [Bar04, Bar06].
2. I mean ‘expressive’ in its usual sense, rather than in the philosophical sense of expressive theories of truth or moral statements.
systems. We also must surrender compositional semantics with respect to the embedded
clause of belief reports. If we report an agent as believing a sentence φ in our metalanguage,
we cannot simply read off the meaning of φ from the report, for ⌜φ⌝ is a constant, which
contributes no more than its denotation. But perhaps the largest problem for such accounts
is the looming threat of inconsistency, discussed in the next section.
4.2 Self-Referentiality and Inconsistency
We have seen that, to incorporate a quotation device into a logical theory, a truth predicate
is also required to act as a disquotational device. By adding a quotation device (or a system
of Gödel numbering), we allow sentences to represent their own syntax. Sentences may
even refer to themselves. For example,
(*)   T(⌜∗⌝)
says of itself that it is true. That is, the sentence called ‘*’ says that the sentence called ‘*’ is
true. Self-reference is unproblematic in the case of *; but now consider:
(λ)   T(⌜¬λ⌝)
λ is known as the liar. It says of itself that it is not true (assuming, that is, the classical
meaning of ‘¬’). By substituting ¬λ into the T-scheme, we get λ ↔ T(⌜¬λ⌝) ↔ ¬λ, hence
λ ↔ ¬λ. If λ is false, then ¬λ holds, which entails λ, so λ must also be true; but if λ is true, it entails ¬λ and
so must be false. Either way, a contradiction occurs. Thus, classical logics over a language
that can represent its own syntax, containing all instances of the T-scheme, are inconsistent.
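Spelled out step by step (my reconstruction of the standard argument, using the notation above):

1. λ is the sentence T(⌜¬λ⌝) (the definition of λ)
2. T(⌜¬λ⌝) ↔ ¬λ (the instance of (T) for ¬λ)
3. λ ↔ ¬λ (from 1 and 2)
4. λ ∧ ¬λ (from 3, by classical reasoning)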
An immediate reaction is to banish self-reference, so that a name for φ may not
occur anywhere within φ itself. Not only does this sound rather arbitrary, it also does not
avoid inconsistency. Cycles of reference can also produce inconsistency. For example, the
two-cycle:
The sentence to the right is false.
The sentence to the left is false.
(suitably formalized and taken together with all instances of the T-scheme) is inconsistent. Moreover,
even banning all cycles of reference whatsoever, thus respecting what Russell called the
vicious circle principle, is not sufficient to avoid inconsistency, as Yablo’s paradox shows.
Consider an ω-sequence of sentences φ1 , φ2 , . . . such that φn says that all the sentences
φn+1 , φn+2 , . . . are false. Suppose that φn is true, so that all sentences from n + 1 are false.
Now consider some φm>n . It says that all sentences φm+1 , φm+2 , . . . are false, which they are.
So φm is in fact true, contradicting φn . Therefore our assumption was incorrect: φn must
be false, i.e. some φm>n must be true. Consequently, all φl>m must be false. But then φm+1
will turn out to be true, for it says that all sentences φm+2 , φm+3 , . . . are false, which they are.
But this cannot be the case as φm says that φm+1 is false. We have a contradiction on either
assumption.
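Schematically, the nth sentence of the sequence can be written as follows (a standard formalization, not the thesis’s own notation):

φn := ∀k (k > n → ¬T(⌜φk⌝)),   for n = 1, 2, 3, . . .

No φn refers to itself or to any earlier sentence, so no cycle of reference is involved; yet, as the argument above shows, there is no consistent assignment of truth values to the φn once all the relevant instances of the T-scheme are assumed.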
To recap, making use of a quotation device requires use of a truth predicate
for disquotation, governed by instances of the T-scheme. However, this has led us into
inconsistency. The literature contains many examples of approaches that attempt to capture
a truth predicate in a consistent theory. They can be grouped as follows:
1. Restrict the syntax of the predicate in question
2. Restrict the underlying logic
3. Restrict the principles applying to the predicate in question
Tarski’s approach [Tar76], along the lines of option (1), was to stratify the language
into a hierarchy of languages L1 , L2 , . . . and define the predicate true-in-Ln in the next
language in the hierarchy, Ln+1 . The truth predicate Tn−1 appearing in Ln takes names
for Ln−1 sentences as its argument and hence is read as ‘true in Ln−1 ’. Since a sentence
containing the truth predicate Tn can only appear at level n + 1, there is no possibility of
an Ln sentence saying of itself, or of any other Ln sentence, that it is true or false. This
approach has several disadvantages. Firstly, if two formulae appear on the same level
in the hierarchy, then one cannot talk about the truth of the other. We also cannot assert
that all sentences within the hierarchy are true, for this would assume the existence of a
transfinite-level metalanguage above all others. The sentence ‘Everything I say is true’, for
example, could then not be truly uttered by anyone, for there is no transfinite language
in which to express the truthfulness of the utterance. Moreover, we do not seem to have
such a hierarchy in natural language; the English word ‘true’ belongs to exactly the same
language as every other word of English.
A second approach, championed by Saul Kripke [Kri75], takes the second of the
above options. Kripke uses a logic based on Kleene’s strong three-valued logic, in which
sentences can be evaluated to 1 (true), 0 (false) or 1/2 (neither). The existence of truth-valueless
sentences such as the liar is often taken to imply the existence of truth gaps in general and
so it seems fitting that a formal theory should allow for this. However, the liar gets its
revenge:
(λλ)   This sentence is not true
is, like the original liar, assigned 1/2 because at no stage can it be assigned either 0 or 1. In
particular, λλ (the strengthened liar) is not assigned the value 1, so it is not true. But λλ
says of itself that it is not true—and so it is true!3 The problem is not one of inconsistency,
for Kripke’s theory is perfectly consistent but rather that it does not capture all the truths
that it should. This has provided motivation for the dialethist view that there are true
contradictions [Pri79, Pri87], such as the strengthened liar. The view is that the three truth-values of Kripke’s theory must overlap, because some sentences assigned 1/2, for instance,
are true if they are assigned 1/2. The situation might be pictured as in figure 4.1.
Figure 4.1: Overlapping truth value regions (the regions for the values 1, 1/2 and 0 overlap one another)
Recently, [Bea06] has argued that one can make sense of the notion of overlap
without admitting the existence of true contradictions. Sentences assigned a value 0 <
x < 1 are called paranormal sentences; but rather than being assigned only 1/2, paranormal
sentences may be assigned either 1/2, 1/4 or 3/4 (1 and 3/4 are designated values). The 1/4 and 3/4
values capture the overlap between Kripke’s original classifications, i.e. sentences assigned
3/4 are both true and paranormal,
those assigned 1/4 are false and paranormal. The extension
of the truth predicate, which contains sentences assigned either 1 or 3/4 as a truth value, is
negation-consistent, i.e. it contains no pairs φ, ¬φ. Thus, although sentences may be both
true and paranormal, there are no true contradictions.
An example of the third approach (restricting the properties of the truth predicate)
is given by Perlis [Per85, Per88]. He replaces the T-sentences with all instances of the scheme
3. That is, although λλ is not modelled as being true within the theory, we can see (from our viewpoint outside the model) that it is true in the real world.
(T∗)   T(⌜φ⌝) ↔ φ∗
where φ∗ is the result of replacing each occurrence of the form ¬T(⌜ψ⌝) in φ by T(⌜¬ψ⌝).
The law of excluded middle still holds in this theory, for the underlying logic is classical,
yet the predicate T(x) is no longer bivalent, i.e. T(⌜φ⌝) and T(⌜¬φ⌝) may both hold.
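As a rough illustration of the ∗ transformation on a toy formula syntax (my own sketch, not Perlis’s formalism; in particular, whether the rewrite should also apply inside quoted subterms is a simplifying choice made here):

```python
# Toy sketch of Perlis's * transformation: rewrite each subformula of the form
# ¬T(⌜ψ⌝) as T(⌜¬ψ⌝). The formula syntax below is an illustrative assumption.

from dataclasses import dataclass

class Formula: pass

@dataclass(frozen=True)
class Atom(Formula):
    name: str

@dataclass(frozen=True)
class Not(Formula):
    arg: Formula

@dataclass(frozen=True)
class Tr(Formula):
    quoted: Formula          # Tr(psi) stands for T(⌜ψ⌝)

@dataclass(frozen=True)
class And(Formula):
    left: Formula
    right: Formula

def star(phi: Formula) -> Formula:
    """phi*: replace ¬T(⌜ψ⌝) by T(⌜¬ψ⌝), recursing through the formula
    (here also into quoted subterms, which is a simplifying choice)."""
    if isinstance(phi, Not) and isinstance(phi.arg, Tr):
        return Tr(Not(star(phi.arg.quoted)))
    if isinstance(phi, Not):
        return Not(star(phi.arg))
    if isinstance(phi, Tr):
        return Tr(star(phi.quoted))
    if isinstance(phi, And):
        return And(star(phi.left), star(phi.right))
    return phi

# ¬T(⌜p⌝) rewrites to T(⌜¬p⌝):
print(star(Not(Tr(Atom("p")))))    # Tr(quoted=Not(arg=Atom(name='p')))
```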
This sounds as if the contradictions have merely been swept under the carpet—or rather,
within the scope of the truth predicate. It is plausible to claim that whenever the T-scheme
is deviated from to any extent, we end up defining a predicate that does not behave as
it intuitively should. This lends weight to Tarski’s claim that any adequate theory of truth
must entail every T-sentence.4
We have been discussing inconsistent (and consistent) theories involving a truth
predicate, which we have seen is required to make best use of a quotation device in FOL.
However, inconsistency can arise as a problem for syntactic theories of knowledge and
belief even without a truth predicate. As McCarthy [McC79b, p.146] notes, “plausible
axioms for necessity or knowledge expressed in terms of concepts [i.e. syntactically] may
lead to the paradoxes discussed in Kaplan and Montague [KM60] and Montague [Mon73]”.
Suppose we wish to straightforwardly translate the standard possible-worlds account of
idealized knowledge into a first-order theory. It will contain (at least) the truth and distribution
axioms and the necessitation rule of inference:
• K(⌜φ⌝) → φ
• K(⌜φ → ψ⌝) ∧ K(⌜φ⌝) → K(⌜ψ⌝)
• from φ infer K(⌜φ⌝)
Montague [Mon73] shows that any theory containing these axioms is inconsistent. Thomason [Tho80] provides a similar result for theories of ideal belief, replacing ‘K’ by ‘B’ and the
truth scheme by positive introspection:
• B(⌜φ⌝) → B(⌜B(⌜φ⌝)⌝)
He comments that this shows “a coherent theory of idealized belief as a syntactical predicate
to be problematic” [Tho80, p.393].
Our reason for adopting a syntactic approach instead of the possible worlds model
in the first place was to avoid modelling idealized reasoners, so these results should not
⁴ See [Bar03] for an opposing view.
trouble us. Both distribution and necessitation should be rejected in the case of the beliefs
of real agents. However, the ideal account should not be dismissed out of hand. The notion
of perfect rationality is a useful yardstick to incorporate in our model, as the fixpoint of the
lessening of resource bounds. We certainly do not want to say that, as the agent’s resource
bounds tend to zero, the agent’s beliefs tend towards inconsistency. Thomason’s result
shows that this is the model we would arrive at by taking our ideal fixpoint of reasoning
to be governed by the distribution, necessitation and positive introspection axioms in a
syntactic theory.
It is therefore worth reviewing proposed solutions to Montague’s and Thomason’s
problematic results. These results show that the axioms governing the modal systems of
knowledge and belief cannot be imported directly into a first order theory. However, this
does not in itself show that the modal theories cannot be embedded within first order logic,
for such embedding translations can be given. This is because first-order logic is more
expressive than standard modal logic, so that importing a modal axiom scheme directly
into a first-order theory allows it to range over more sentences than it would in the modal
theory. There is, for example, no sentence corresponding to ∃xBi(x) in ordinary
propositional modal logic. The inconsistency proofs in [Mon73] and [Tho80] assume that
the axiom schemes used to derive the inconsistencies range over all sentences of the first-order language, as usual. If these axioms are restricted so as to range over only a particular subset of formulae of the first-order language, which des Rivieres and Levesque term the regular formulae [dRL86], then consistency can be maintained. By defining regular formulae to be precisely those formulae of the first-order language that are expressible in a particular modal language, it is possible to provide a translation function from a modal to a first-order language [dRL86]. This establishes that consistent accounts of idealized knowledge
or belief are possible within first-order logic.
Their approach restricts the expressive power of the first-order theory; it becomes
only as expressive as the modal language and in doing so, some might argue, loses much
of the attraction of the syntactic approach. Morreau and Kraus [MK98] extend the range
of the axiom schemes of des Rivieres and Levesque to include regular formulae with
propositional quantification, which they call RPQ-formulae. Two additional predicates are
added to allow the quantification over propositional attitudes: P (which picks out Gödel
numbers of sentences of the original language L) and T (which picks out the true sentences
of L). Their arguments are variables, which are instantiated with Gödel numbers. In
this way, RPQ-formulae are more expressive than regular formulae and the theories of
knowledge and belief obtained are correspondingly more expressive.
It should be pointed out that rejecting the possible worlds framework does not
entail a rejection of a modal theory for knowledge or belief. Modal logic is conceptually
independent of possible world semantics. Indeed, one can give a semantics for a belief
operator ‘Bi ’ much as one can for a belief predicate ‘Bi (−)’, in terms of a belief set (i.e. the
extension assigned to the belief predicate by an interpretation of the first-order language).
I postpone discussion of the choice of language to section 4.5. First, let us review modal
yet sentential approaches to a logic of belief.
4.3 Sentential Logics
In this section, I discuss several approaches that incorporate a sentential approach to the
semantics of belief within a modal (rather than a first-order) language.
4.3.1 The Deduction Model
The Deduction Model [Kon86a] uses the belief-set semantics of the first-order account and
interprets a formula Bi φ as true iff agent i has φ in its belief set. The operator Bi is a modal
operator, taking a sentence (rather than a term) as its argument. Agent i’s knowledge base
KBi is a set of sentences in i’s internal language. Each agent i is assigned a set of deduction
rules ρi , which need not be logically complete (and in fact must not be to avoid closure of
belief under classical consequence). We write Γ ⊢ρi φ when φ is deducible from Γ using the
rules assigned by ρ(i). A sentence φ is said to be believed by an agent i iff it is in i's belief set Bi, defined as the closure of the agent's knowledge base under the rules assigned
to that agent by ρ, i.e. Bi = {φ | KBi ⊢ρi φ}. Konolige’s terminology here is slightly odd: if
the sentences in KBi are really known by i, and φ is derivable from KBi , then shouldn’t φ
be viewed as knowledge too? The exception would be when ρi contains rules that are invalid with respect to classical semantics; but this is not the norm, even in the case of resource-bounded agents. Resource-bounded agents are not necessarily irrational, and irrationality is not the complement of logical omniscience.
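As a concrete illustration (not Konolige's own formalism), the following Python sketch computes such a belief set as the closure of a small knowledge base under an incomplete rule set; the tuple encoding of formulae and the particular rules chosen are assumptions made purely for the example.

def apply_rules(beliefs):
    """One pass of the agent's rules (conjunction elimination, modus ponens)."""
    derived = set()
    for f in beliefs:
        if isinstance(f, tuple) and f[0] == 'and':
            derived.update({f[1], f[2]})              # conjunction elimination
        if isinstance(f, tuple) and f[0] == 'imp' and f[1] in beliefs:
            derived.add(f[2])                         # modus ponens
    return derived

def belief_set(kb):
    """B_i = closure of KB_i under the agent's rules."""
    beliefs = set(kb)
    while True:
        new = apply_rules(beliefs) - beliefs
        if not new:
            return beliefs
        beliefs |= new

kb = {('and', 'p', ('imp', 'p', 'q'))}
print(belief_set(kb))    # contains p, p -> q and q, but not every classical consequence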
The language of the deduction model, LD , is a standard first-order logic augmented by a belief operator Bi for each agent i, such that if φ is a sentence of a regular
first-order language, then Bi φ is a sentence of LD .
Definition 10 A deduction structure Di for an agent i is a tuple ⟨ρ(i), L⟩ where ρ(i) is a set of
deduction rules and L is the internal language of the agent.
An interpretation takes a standard first-order model and adds a deduction structure for
each agent under consideration. As mentioned above, Konolige assumes the deductive
closure of an agent’s belief set Bi with respect to the agent’s deduction rules. The view
is thus that of an agent’s control strategy that will always apply a deduction rule where
applicable. Agents modelled in the framework of the deduction model avoid being logically
omniscient only when the rules they use to derive new sentences in their internal language
are logically incomplete. This appears to be the kind of agent that Konolige has in mind. By
fixing an agent’s deduction rules, restricted classes of deduction models may be defined.
Konolige defines classes corresponding to the common epistemic/doxastic logics K, D, S4
and S5; for example:
Saturation:
The class BK is the class of belief systems whose deduction rules are both
sound and complete (with respect to classical semantics). These models are saturated in
the sense that an agent’s belief set contains all logical consequences of its elements. This
generates the property of logical omniscience and corresponds to the modal system K.
The deduction model is an attempt to characterize the internal process of a deductive agent by interpreting the operator Bi in terms of those sentences that may actually
be derived by a particular agent. Thus, by altering the deduction rules that a particular
agent uses in its derivations, the set of sentences that may be truthfully prefixed by Bi will
also change. This is meant to account for the differing ‘competencies’ of different agents,
reflected in the deduction rules assigned to each.
However, this does not, in itself, take into account our time-boundedness considerations, i.e. the fact that deduction is a process that consumes time as a resource. We can frame the problem as follows: suppose we supply a deductive agent i with a set of sentences φ1, . . . , φn and ask it to apply its rules, wherever possible, to those sentences (but not to any new ones). Once it has applied every rule wherever applicable to this set, it may add its new conclusions to its set of beliefs and begin the process again. Call each such stage a cycle of the agent's reasoning. Now suppose a particular sentence ψ is derivable from φ1, . . . , φn, given the rules that the agent has, in m cycles. Thus, if we were to interrupt the agent in its reasoning during cycle (m − 1) and query whether it believes ψ or not, it should answer that it does not. However, since ψ is derivable from the initial set of sentences given to the agent using its deduction rules, Bi ψ holds. Thus, we are still in a predicament: our logic
takes ascriptions of belief to an agent to be true in cases where the agent itself would deny
such a belief.
The problem is evidently caused by the insistence on deductive closure. Models
of agents with access to a logically complete set of deduction rules fall into the class of
saturated models and so the beliefs of such agents are closed under classical consequence.
There are many agents that know how to introduce and eliminate the Boolean connectives
in sentences using natural deduction rules, for example. Yet anyone taking first-year logic
knows that these agents can hardly be said to be ideal reasoners. The following sections
present logics that avoid deductive closure.
4.3.2 Syntactic Models
Syntactic models of the kind discussed by [Ebe74] and [MH79] can be embedded in a
possible worlds semantics by providing a syntactic assignment to worlds in the system
[HM90] and [FHMV95, pp.314–316]. In the possible worlds semantics considered up to
this point, the labelling function V assigned a set of primitives to each world w ∈ W. A
syntactic assignment is a generalized labelling function σ, which assigns an arbitrary set
of sentences to each world w ∈ W. Clearly, there are assignments σ that can assign both
φ and ¬φ to a world w. Models are pairs ⟨W, σ⟩, where W is a set of possible worlds and σ is a syntactic assignment. The support relation is then defined directly in terms of σ: M, w ⊨ φ iff φ ∈ σ(w).
In syntactic models, logical omniscience is clearly not a problem. It is easy to define a model in which M, w ⊨ Kp ∧ K¬p, for example. However, these models are too general to be much use without particular restrictions imposed on the syntactic assignment σ. [FHMV95] discuss standard syntactic assignments, in which the Boolean connectives behave as usual, i.e. M, w ⊨ ¬φ iff M, w ⊭ φ, and M, w ⊨ φ ∧ ψ iff M, w ⊨ φ and M, w ⊨ ψ. Note that this does not affect sentences embedded within modal operators such as 'K'; thus 'K(p ∧ q) ∧ ¬Kp' is satisfiable in a syntactic model with a standard assignment.
Models over standard syntactic assignments thus behave much as the first-order
account of knowledge does, in the sense that the Boolean connectives behave as usual in
extensional contexts (i.e. for sentences whose main connective is not a modal operator),
but take on different behaviour when embedded within the 'K' and 'B' operators. As it stands, Boolean connectives have no behaviour whatsoever in knowledge and belief contexts. Epistemic accessibility relations R1, . . . , Rn for n agents can then be introduced to syntactic models, as discussed in [FHMV95, p.315], as a way of capturing locality in syntactic models. If Ri is to capture the local states of agent i, then Ri ww′ can only hold when agent i's knowledge does not distinguish between w and w′, i.e. M, w ⊨ Ki φ iff M, w′ ⊨ Ki φ.
It is possible to formulate the deduction model discussed in the previous section
within the syntactic model approach. The set of formulae of the form Ki φ assigned to a
world w must either be elements of agent i’s knowledge base, or else derivable from that
base using i’s deduction rules. This approach suffers from the criticisms raised against the
deduction model above. The idea of an arbitrary syntactic assignment to states is important
to a number of accounts discussed below: to Ho's dynamic epistemic logic (section 4.3.4), to Timed Reasoning Logic (section 4.5) and to the account I will develop in chapters 5–7.
4.3.3 Algorithmic Knowledge
[HMV95] present a notion of knowledge based on the answers an agent could actually
give when queried, ‘do you know that φ?’ Algorithmic knowledge is defined relative to
an algorithm A that takes an agent’s local data l and a formula φ as its input and outputs an
answer ‘yes’, ‘no’ or ‘?’. Intuitively, a ‘yes’ answer means that the agent does know φ given
the algorithm A and local data l. A ‘no’ answer, on the other hand, means that the agent
does not know φ. The ‘?’ answer is given when the agent is unable to compute whether it
knows φ, given an algorithm and local data.
Formally, a global state of a system S is a tuple ⟨se, s1, . . . , sn⟩, where se represents the state of the environment and each si (1 ≤ i ≤ n) is agent i's local state. Because of the interest in algorithms, each local state si for agent i is represented as a pair ⟨A, l⟩ where A is agent i's local algorithm and l is i's local data at si. [HMV95] use a semantics based on possible runs r of the system, where (r, t) denotes the state of the system on run r at time t. If (r, t) = ⟨se, s1, . . . , sn⟩ and si = ⟨A, l⟩, then algi(r, t) stands for A, i.e. i's algorithm at (r, t), and datai(r, t) stands for l, i.e. i's local data at (r, t). The semantics is defined relative to a model M = ⟨S, V⟩ where S is a system (a set of runs) and V is a truth-assignment to primitives
at each state of each run in the system. Algorithmic knowledge is denoted by the modal
operator X, whose satisfaction clause is:
M, (r, t) ⊨ Xi φ iff A(φ, l) = 'yes', where A = algi(r, t) and l = datai(r, t)
Note that, unlike in the standard possible worlds account of knowledge, the valuation V
plays no rôle in determining what an agent knows (algorithmically); Xi φ is true at a state s
iff i’s local algorithm A returns ‘yes’ when run with φ and i’s local data l as inputs.
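The following Python sketch illustrates the shape of this satisfaction clause; the particular 'membership lookup' algorithm is an invented stand-in, not the [HMV95] definition, which allows any local algorithm whatsoever.

def membership_alg(phi, local_data):
    """A toy local knowledge algorithm: answer by direct lookup in the data."""
    return 'yes' if phi in local_data else '?'

def holds_X(agent_state, phi):
    """X_i(phi) holds iff the agent's algorithm answers 'yes' on phi and its data."""
    algorithm, local_data = agent_state
    return algorithm(phi, local_data) == 'yes'

state = (membership_alg, {'p', 'q'})                  # a local state <A, l>
print(holds_X(state, 'p'), holds_X(state, 'r'))       # True False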
In a recent paper [Puc06], Pucella provides a deductive interpretation of algorithmic knowledge. A deductive system is defined over a term algebra T, containing deduction rules each with premises t1, . . . , tn and conclusion t, where the ti and t are terms of the algebra T. Given a deductive system D, the definition of a deduction of a term t from a set of terms Γ is standard and is written Γ ⊢D t. Pucella gives semantics in terms of Kripke models. States are pairs s = (e, obs) where e is the state of the environment at s and obs is a set of observations.⁵ The satisfaction clause for algorithmic knowledge makes use of the definition of deduction in place of the general notion of an algorithm, and treats obs as the agent's local data:

M, s ⊨ Xp iff s = (e, {t1, . . . , tn}) and t1, . . . , tn ⊢D p
Note that this definition only extends to deductive algorithmic knowledge of primitives p.
To give an account of algorithmic knowledge of arbitrary formulae φ, we need to consider a
translation φ∗ of φ into the term algebra. Deductive algorithmic knowledge is an evolution
of Konolige’s deduction model, formulated using Kripke models and substituting the
observation set obs at each state for the agent’s knowledge base.
Neither formulation of algorithmic knowledge considered here addresses time or
memory boundedness considerations directly. An agent may have an algorithm (deductive
or otherwise) at its disposal that would eventually terminate on a particular input, given
enough time to run, but cannot do so in a particular instance because it must return an
answer within a particular time bound. Cryptography is a good example here: encryption schemes based on the difficulty of factorizing large numbers into primes can in principle be broken, but breaking them would typically take longer than the predicted heat death of the universe. Clearly
having an algorithm available for finding prime factors of a number is not a sufficient
condition for genuine knowledge in this case!
[HMV95] address the issue by dissociating the time parameter t in each pair (r, t)
(denoting the state of the system at time t in run r) from what they call “real time”: t is
⁵ The environment parameter e is used to interpret implicit (idealized) knowledge, as in the standard possible worlds account; it plays no rôle in the account of deductive algorithmic knowledge.
not to be thought to denote a multiple of seconds or minutes. Rather, time in the intended
interpretations “serves as a convenient way to partition the system behaviour into rounds,
where a round is a basic unit of activity” [HMV95, p.260]. Thus, (r, t + 1) is the state of the
system in run r, one round of activity after state (r, t). Evidently, an agent’s “basic activity”
includes running its local algorithm on a formula, given its local data, just once. As soon
as each agent has produced some output, the system moves into its next state.
The problem here is that, if one wants to model anything like a realistic notion
of a system’s resource bounds, one has to consider agents whose algorithms run in at
most polynomial time, given their input. Otherwise, one risks an exponential blowup
in the period of real time in between successive states of the system. This is quite a
restriction on what the algorithmic knowledge framework can model. [HMV95, p.260]
comment that a way out of the problem is to split longer algorithms between successive
states such that, the longer the algorithm takes to run (in complexity terms), the more states
it takes to output an answer. This would be an excellent solution to modelling resource
boundedness; but unfortunately no account of how to do this is provided. The account
I present in chapters 5–7 is in many ways similar to the algorithmic knowledge account;
but rather than considering running an algorithm at each state to determine what an agent
knows, the agent’s computational effort in deriving new knowledge from old is modelled
using transitions between belief states.6
4.3.4 Dynamic Logic
In [Ho95, Ho97], Ho Ngoc Duc presents an epistemic logic based on dynamic logic. If r is an inference rule that the agent can use, then ⟨r⟩ is the usual dynamic modality 'after executing (i.e. reasoning using) r, it is possible that . . . '. For example, if mp is the action corresponding to using modus ponens, then Bφ ∧ B(φ → ψ) → ⟨mp⟩Bψ says that an agent can use modus ponens to derive the new belief that ψ from φ, φ → ψ (but it need not). So far so good: the ⟨r⟩ modalities give us a way to model step-by-step rule applications using composition. For example, r1; r2; . . . ; rn is the action corresponding to inferring new beliefs by using r1
through to rn . [SGdMM96] develops an account with similar motivation, with inferences
made using rules such as modus ponens represented as dynamic modalities. Although
the models presented in [SGdMM96] are more advanced than those in [Ho95], the logical
⁶ These models effectively split the agent's deductive effort in deriving all but the simplest consequences of its beliefs across successive states, allowing for the kind of accurate model of time just discussed.
language makes use of quotation operators and is correspondingly more cumbersome.
However, we are not always interested in what happens after the agent reasons in this or that way. More often we want to know, for example, whether the agent could end up believing some set of beliefs in a given time bound, given the rules available to it, however it reasons. Such questions abstract from the particular mechanisms used by the agent to derive the target beliefs. To accommodate this, Ho introduces a future modality ⟨F⟩, defined by iterating the choice of all actions r1, . . . , rn available to the agent: F = (r1 ∪ · · · ∪ rn)∗. ⟨F⟩Bφ then says that the agent can come to believe that φ, and [F]Bφ says that the agent believes φ at every point in the future, however it reasons. An intuitive relational semantics can then be given by taking sentences of the form Bφ as primitives, assigned to states by a modal labelling function V, such that s ⊨ ⟨r⟩φ holds iff there is a state s′ accessible from s via the transition relation for r such that s′ ⊨ φ.
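The intended reading of ⟨F⟩ can be illustrated with a small Python sketch that searches over all finite sequences of rule applications from an initial belief state; the encoding of belief states as frozensets and the single modus ponens rule are illustrative assumptions only.

def mp_rule(state):
    """All belief states reachable by one application of modus ponens."""
    successors = set()
    for f in state:
        if isinstance(f, tuple) and f[0] == 'imp' and f[1] in state:
            successors.add(frozenset(state | {f[2]}))
    return successors

def can_come_to_believe(state, phi, rules):
    """<F>B(phi): search the reflexive-transitive closure of the rule relations."""
    frontier, seen = [frozenset(state)], set()
    while frontier:
        s = frontier.pop()
        if phi in s:
            return True
        if s in seen:
            continue
        seen.add(s)
        for rule in rules:
            frontier.extend(rule(s))
    return False

start = {('imp', 'p', 'q'), ('imp', 'q', 'r'), 'p'}
print(can_come_to_believe(start, 'r', [mp_rule]))     # True: p, then q, then r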
Unsurprisingly, ⟨F⟩ allows for a temporal version of most of the K-theorems to be given. For example, if p is a propositional (modality-free) tautology, then ⟨F⟩Bp is a theorem of [Ho95], as are Bφ ∧ B(φ → ψ) → ⟨F⟩Bψ and Bψ → ⟨F⟩(Bφ → Bψ). The notion of the future here is thus an idealized one, considering all the states in a temporally unbounded reasoning process. But it is not even correct to read ⟨F⟩Bp as 'the agent can believe p at some point in the (idealized) future'—just consider a tautology p so large that no agent could come to hold the sentence in its memory. The ⟨F⟩ operator ignores resource bounds.
This highlights an important point. Avoiding logical omniscience is not an end
in itself. Ho’s dynamic logic avoids logical omniscience in the sense that, at any particular
state, agents need not believe all consequences of their beliefs or all tautologies. However,
agents are ideal in the sense that they are modelled as having no time or memory bound.
Although reasoning takes place in time, the logic only describes what beliefs the agent has
in some idealized future state. Evidently, what is therefore required is a logic that not only
avoids logical omniscience, but that captures the stages of reasoning, rather than just the
idealized endpoint.
4.4 Step Logics
Step logics [EP90] were introduced in response to these considerations, with the explicit aim
being “to model a common sense agent’s ongoing process of reasoning in a changing world”
[NKP94, p.5]. As in the deduction model, a step logic is characterized by a language and
a set of deduction rules, and also a set of observations. If we assume that the environment
is static, the set of observations plays the rôle of the knowledge base of the deduction
model, i.e. a set of sentences that are taken for granted by the agent’s deductive process
and from which new beliefs can be derived. However, by incorporating an observation
function, more sophisticated models can model dynamic environments. Such logics are
typically non-monotonic, for what an agent observes in its environment at some stage in
its reasoning may well differ from what it observes at a later stage.
4.4.1 Agent and Meta Step Logics
Reasoning over time is characterized using cycles of deduction, as discussed above, where
each cycle is known as a step. Thus the logics abstract from the denseness of time, taking
discrete steps as their fundamental units. Beliefs are thus parametrized by a step index,
which can be thought of as the time (in steps) required by an agent to derive that belief.
Elgot-Drapkin and Perlis remark (e.g. [EP90, pp.1–2], [DP86, pp.2–3]) that logics of idealized belief (in which they include the deduction model) do not allow an agent to reason
about its own reasoning process qua real-world activity, for such activities are inherently
situated in time.
Step logics occur in pairs ⟨SLn, SL̄n⟩, where each SL̄n is the meta-theory corresponding to the agent-theory SLn. Step-logic meta-theories are similar to standard temporal epistemic logic in that they allow us to express the fact that, for example, an agent will believe (i.e. will have derived) φ after so many steps. But this is "simply our assurance that we have been honest in describing what we mean by a particular agent's reasoning" [EP90, p.4]. It is the agent theories (which I will refer to simply as 'step logics') that are of interest
here, for they aim to capture an agent’s ongoing reasoning process. The crucial difference
between step logics and other temporal epistemic logics is the thought that “in order for the
agent to reason about the passage of time that occurs as it reasons, time arguments must be
put into the agent’s own language” [EP90, p.3]. Thus, not only is each belief parametrized
by a step value, but time parameters also play a rôle in an agent’s deduction rules. What
is believed at step t (the t-beliefs) is used to derive the (t + 1)-beliefs. Step logic deduction rules
take the form:
t :      φ    ψ
t + 1 :  φ ∧ ψ              (Conjunction Introduction)

t :      φ
t + 1 :  φ                  (Inheritance)

t :      φ ∧ ψ
t + 1 :  φ                  (∧-Elimination)

t :      φ → ψ    φ
t + 1 :  ψ                  (Modus Ponens)
Observations are inputs that may appear at any step in the deduction process and are
automatically believed at that step. As an illustration of step-based reasoning, suppose
that an agent believes that φ → ψ and ψ → χ at step 1 and that it will observe φ at step 2.
The following table shows a snapshot of the deductive progress of an agent that uses the
deduction rules Inheritance and Modus Ponens:
t = 1:   φ → ψ,   ψ → χ
t = 2:   φ → ψ,   ψ → χ,   φ
t = 3:   φ → ψ,   ψ → χ,   φ,   ψ
t = 4:   φ → ψ,   ψ → χ,   φ,   ψ,   χ
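The derivation in the table can be reproduced with a short Python sketch of step-logic cycles, with p, q, r standing in for φ, ψ, χ; the encoding of implications as tuples is an assumption made purely for the example.

def next_step(beliefs, observed):
    new = set(beliefs)                                    # inheritance
    for f in beliefs:
        if isinstance(f, tuple) and f[0] == 'imp' and f[1] in beliefs:
            new.add(f[2])                                 # modus ponens
    return new | set(observed)                            # observation

beliefs = {('imp', 'p', 'q'), ('imp', 'q', 'r')}          # step 1
observations = {2: {'p'}}                                 # p is observed at step 2
for t in range(2, 5):
    beliefs = next_step(beliefs, observations.get(t, set()))
    print(t, sorted(f for f in beliefs if isinstance(f, str)))
# step 2: ['p']   step 3: ['p', 'q']   step 4: ['p', 'q', 'r']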
Different step logics within this framework are characterized according to three
mechanisms: self-knowledge (i.e. an agent knowing what it does and does not believe);
time; and retraction (contradiction handling). SL0 is the simplest step logic: it has none of these mechanisms and can be expressed in a propositional language. The other logics
studied in [EP90] are more complicated, using a first-order language and are as follows:
SL1:  Self-knowledge
SL2:  Time
SL3:  Retraction
SL4:  Self-knowledge and retraction
SL5:  Self-knowledge and time
SL6:  Retraction and time
SL7:  Self-knowledge, time and retraction
Since we are primarily concerned with the problems of logical omniscience, we can dispense for the moment with logics that incorporate retraction and concentrate on logics that incorporate time and self-knowledge. These are the SL5-logics, defined as triples ⟨L, obs, inf⟩, where L is a set of propositions (including timepoint constants), obs is an observation function from timepoints to sets of sentences, supplying the agent with a finite (possibly empty) set of new beliefs at each step, and inf is a set of inference rules (modelled as taking a set of beliefs at t into a set of beliefs at t + 1). In [EP90], inf contains the stepwise deduction rules given above, plus an Observation rule:
t :      · · ·
t + 1 :  φ                  (Observation, where φ ∈ obs(t + 1))
When φ can be derived at step t in a step logic SLn(obs, inf), we say that φ is a t-theorem of the logic and write SLn(obs, inf) ⊢t φ. For each agent theory SLn, there corresponds a first-order meta-theory SL̄n such that

SL̄n(obs, inf) ⊢ B(t, ⌜φ⌝) iff SLn(obs, inf) ⊢t φ
A semantics is not provided for any step logic in [EMP91]. The only result provided is the "analytic completeness" of SL0: for any step t and any formula φ,

SL̄0 ⊢ B(t, ⌜φ⌝) or SL̄0 ⊢ ¬B(t, ⌜φ⌝)
However, this only holds in the propositional case corresponding to SL0 and leaves us with
no semantic intuitions whatsoever for the more complex agent theories SL1 through to SL7 .
The remainder of this section is devoted to attempts to provide semantics for step logic.
4.4.2 Active Logics
Attempts to give a minimal possible worlds semantics are found in [NKP94] and [EKM+ 99].
They consider a semantics for SL5 where belief is defined as a relation between a world and
a set of sets of worlds, based on Scott-Montague (or neighbourhood/minimal) structures; see
[Che80]. Since the key idea in active logics is modelling time, timelines rather than possible
worlds play the key semantic rôle. Worlds thus become timepoint constants, related such
that timelines are rays, infinite in one direction only. T is the set of timepoints and Tc
contains timepoint constants. Time is modelled as a pair ⟨T, ≺⟩ where ≺ is a total order on T.
The language LA of an active logic contains primitives P × T, i.e. a set of propositional letters indexed with timepoint constants (written pτ for p ∈ P and τ ∈ T). Bτ φ is
well-formed whenever φ ∈ LA and τ ∈ T. Bτ pτ means ‘at time τ, the agent believes p to be
true at τ’, i.e. that, if the time is currently τ, the agent believes that p currently holds. Bτ pτ+1
thus means ‘at time τ, the agent believes that p will be true at the next step’. A structure in
this active logic is M = ⟨L, T, ≺, I, V, R, obs⟩, where
1. L is a set of timelines;
2. ⟨T, ≺⟩ is a time structure;
3. I : Tc −→ T is an interpretation function for timepoint constants;
4. V : P × T −→ 2^L is a truth assignment to each p ∈ P for each timeline l ∈ L and timepoint t ∈ T;
5. R : L × T −→ 2^(2^L) is the accessibility relation, captured as a function assigning a set of sets of timelines to each timeline-timepoint pair (l, t) where l ∈ L, t ∈ T;
6. obs : L × T −→ 2^LA is an observation function.
The intension of a sentence φ in a structure M, denoted ||φ||M, is the set of timelines in M that satisfy φ, i.e. {l | l ∈ L, M, l ⊨ φ}. The intension of a sentence is often treated as the proposition it expresses in a particular model [Sta76] (cf. the discussion in section 3.5 above). Thus, for any timeline-timepoint pair (l, t), R(l, t) is a set of intensions. The satisfaction relation is defined recursively in the usual way, setting:

M, l ⊨ Bτ φ iff ||φ|| ∈ R(l, I(τ))
In order to model agents with step-like reasoning, the following restrictions are placed on models, for all timelines l ∈ L and all timepoints t ∈ T:
1. ∅ ∉ R(l, t), and if ||φ||, ||ψ|| ∈ R(l, t) then ||φ|| ∩ ||ψ|| ≠ ∅
2. If ||φ||, ||ψ|| ∈ R(l, t), then ||φ|| ∩ ||ψ|| ∈ R(l, t + 1)
3. If ||φ|| ∈ R(l, t) and ||ψ|| ⊇ ||φ||, then ||ψ|| ∈ R(l, t + 1)
4. If φ ∈ obs(l, t), then ||φ|| ∈ R(l, t)
5. ||⊤|| ∈ R(l, t0)
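The generative content of restrictions (2)–(4) can be illustrated with a small Python sketch that computes a next-step neighbourhood from a current one in a toy finite model; representing intensions as frozensets of timelines is an illustrative assumption of mine, not part of [NKP94].

from itertools import chain, combinations

def next_neighbourhood(current, observed_intensions, all_timelines):
    nxt = set()
    for a in current:
        for b in current:
            nxt.add(a & b)                       # restriction (2): intersections
    subsets = chain.from_iterable(
        combinations(sorted(all_timelines), k) for k in range(len(all_timelines) + 1))
    for s in map(frozenset, subsets):
        if any(s >= a for a in current):
            nxt.add(s)                           # restriction (3): supersets
    return nxt | set(observed_intensions)        # restriction (4): observations

timelines = {'l1', 'l2', 'l3'}
R_t = {frozenset({'l1', 'l2'})}                  # one believed intension at (l, t)
R_next = next_neighbourhood(R_t, set(), timelines)
print(sorted(sorted(s) for s in R_next))         # [['l1', 'l2'], ['l1', 'l2', 'l3']]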
This active logic is axiomatized in [NKP94]. The axiom schemes and rules follow the above restrictions fairly closely. For example, (1) says that agents do not believe contradictory or pairwise-contradictory sentences, hence ¬Bτ⊥ is valid. (2) and (3) encode stepwise conjunction introduction and modus ponens, and (4) encodes belief in observation sentences; hence

Bτφ ∧ Bτψ → Bτ+1(φ ∧ ψ) and Bτ(φ ∧ ψ) → Bτ+1φ

are valid, and Bτφ is valid whenever φ is observed at τ. Rules include modus ponens and the stepwise rules from above. The chief gain of this framework is that all
consequences of an agent’s belief set at time τ are not necessarily members of that set at τ.
Thus, Bτφ ∧ Bτ(φ → ψ) → Bτψ has a counter-model. In this sense, agents modelled in active
logics are not omniscient with respect to the classical consequences of their beliefs.
However, agents are modelled as believing all propositional tautologies and their
beliefs as closed under equivalence: the set of inference rules includes necessitation; and
Bτ φ ↔ Bτ ψ is valid whenever φ ↔ ψ is. This is in part a limitation of Scott-Montague
semantics but partly a fault of step logics. Intensions of sentences do not differentiate
between equivalent sentences: their intensions are just the worlds (or timelines, in this
case) at which they are satisfied but, since they are equivalent sentences, they are satisfied
at precisely the same worlds (timelines). As a consequence, an agent that believes just one
tautology must believe them all (similarly, an agent believing just one contradiction must
believe all logical falsehoods—see section 3.5 for discussion).
The fault is not just with the active logic semantics for step logic, for without access
to all propositional tautologies, step logic agents cannot derive many interesting beliefs at
all. Step logic agents can reason using Modus Ponens, for example, but from where can they
obtain a major premise φ → ψ? It is surely rather baroque to claim that material implications
are observed, yet this is their only possible source in step logic. Strictly speaking, one cannot
see that the ball will go out if no one stops it—the information is inferred inductively.
Besides, even such strained uses of perceptual verbs do not incorporate perception of valid
implications. Hence, active logic must begin with the assumption that agents believe all
tautologies. Of course, an agent may have access to a complete set of axiom schemes, but
obtaining their instances is a process situated in time, just as conjunction introduction or
elimination is. I return to this general failing of step logics in section 4.5 below.
4.4.3 First-Order Semantics for Step Logics
Grant, Kraus and Perlis provide a first-order axiomatization and model theory for step logic
in [GKP00], along the lines sketched in section 4.1 above. They take the object language–metalanguage approach of step logic, where the object language L is the language in which the agent reasons and the metalanguage L+ is a first-order language in which knowledge is ascribed to agents. The metalanguage is syntactically similar to that of the step logics surveyed above, containing a constant ⌜p⌝ for each primitive p ∈ L, as well as timepoint constants t ∈ T and agent names i ∈ A, predicates Oi for each i ∈ A, binary relation symbols Bi for each i ∈ A and
the usual Boolean connectives. The language is sorted into agent names Si , timepoint
constants St and sentences, S f , such that each Bi takes arguments of the sorts (Si , St , S f ). L+
also contains the structural function letters (section 4.1) neg, conj and imp, interpreted in
the expected way in all models. For simplicity, I shall write ⌜φ⌝ for the structural name of a complex formula φ: for example, ⌜p → q⌝ abbreviates the name imp(⌜p⌝, ⌜q⌝).
Agents observe their environment, modelled using a function obs : A −→ 2L ,
which provides an agent i ∈ A with a (possibly empty) set of observed sentences. The
predicate Oi captures agent i’s observations by setting
Oi(⌜φ⌝) iff φ ∈ obs(i)
An axiomatic theory T contains axiom schemes for stepwise Modus Ponens, Conjunction Introduction, Conjunction Elimination, Inheritance and Observation. By way of illustration, stepwise Modus Ponens is

Bi(t, x1) ∧ Bi(t, imp(x1, x2)) → Bi(t + 1, x2)

and Observation is

Oi(x) → Bi(0, x)
i.e. all observation sentences are treated as knowledge at step 0. We also add that Oi(⌜φ⌝)
whenever φ ∈ obs(i). It can then be shown that T has a minimal Herbrand model H . The
axiom schemes can be written as a definite logic program P:
Bi (t + 1, x2 ) ← B(i, t, x1 ) ∧ Bi (t, imp(x1 , x2 ))
Bi (t + 1, con j(x1 , x2 )) ← B(i, t, x1 ) ∧ Bi (t, x2 )
Bi (t + 1, x1 ) ← Bi (t, con j(x1 , x2 ))
Bi (t + 1, x2 ) ← Bi (t, con j(x1 , x2 ))
Bi (t + 1, x) ← Bi (t, x)
Bi (0, x) ← Oi (x)
together with all facts of the form Oi (pφq) whenever φ ∈ obs(i). Because P is a definite logic
program, it has a minimal Herbrand model H such that {φ | P |= φ} ⊆ H . Conversely,
consider some φ ∈ H . φ ∈ H ′ for every Herbrand model H ′ of P and thus every H ′
is a model of φ, so P |= φ. H is therefore a minimal model of T . This establishes that
constructing the minimal Herbrand model H of T amounts to generating all the atomic
sentences that are logical consequences of T . H can therefore be generated using the
fixpoint of a consequence operator OT as follows. Let I be a Herbrand interpretation of
T . OT (I) is the set of all atomic sentences φ such that, for some φ1 , φ2 :
• φ1 ∧ φ2 → φ ∈ ground(T) and {φ1, φ2} ⊆ I; or else
• φ1 → φ ∈ ground(T) and φ1 ∈ I
where ground(T) is the set of all ground instances of the axioms of T over the Herbrand universe. OT(I) is thus the set of all sentences that can be generated from I by a single application of one of the axiom schemes, or by universal instantiation. The model H is then constructed in stages.
Let OT^n be the set of sentences generated by the nth application of OT, and set:

OT^0 = ⋃_{i∈A, φ∈obs(i)} { Oi(⌜φ⌝), Bi(0, ⌜φ⌝) }    (4.1)

OT^{n+1} = OT(OT^n)    (4.2)

Finally, set

OT^ω = ⋃_{n∈N} OT^n    (4.3)

i.e. OT^ω is the least fixed point of the construction, such that OT(OT^ω) = OT^ω. We then set H = OT^ω, which is a model for T provided ⋃_{i∈A} obs(i) is consistent. We thus have both a guarantee that T is a consistent theory of belief and a way of constructing a model for it.
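A minimal Python sketch of the bottom-up construction may be helpful; it covers only the modus ponens, inheritance and observation clauses of P, uses an artificial step horizon to keep the toy construction finite, and its encodings are my own assumptions rather than anything in [GKP00].

def step(atoms, horizon):
    """One application of the consequence operator to a set of ground atoms."""
    new = set(atoms)
    for a in atoms:
        if a[0] == 'O':
            new.add(('B', a[1], 0, a[2]))                      # B_i(0, x) <- O_i(x)
        elif a[0] == 'B' and a[2] < horizon:
            i, t, phi = a[1], a[2], a[3]
            new.add(('B', i, t + 1, phi))                      # inheritance
            if isinstance(phi, tuple) and phi[0] == 'imp' and ('B', i, t, phi[1]) in atoms:
                new.add(('B', i, t + 1, phi[2]))               # stepwise modus ponens
    return new

def least_fixpoint(obs, horizon=3):
    atoms = {('O', i, phi) for i, phis in obs.items() for phi in phis}
    while True:
        nxt = step(atoms, horizon)
        if nxt == atoms:
            return atoms                                       # fixpoint reached
        atoms = nxt

model = least_fixpoint({'a': [('imp', 'p', 'q'), 'p']})
print(('B', 'a', 1, 'q') in model)                             # True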
Not all models of T are good descriptions of an agent’s beliefs, in that a particular
model may contain sentences Bi(t, ⌜φ⌝) when φ is not derivable at time t from either the
agent’s observations or previous beliefs. Models that avoid these ‘extra’ sentences are
known as knowledge supported models:7
Definition 11 (Knowledge Supported Model) A model M of T is said to be knowledge supported iff M |= Bi(t, ⌜φ⌝) implies that there is an axiom instance ψ → Bi(t, ⌜φ⌝) such that M |= ψ.
Proposition 1 H is a Knowledge Supported model.
Proof: The proof is in [GKP00]. ⊣

⁷ Such models are called knowledge supported rather than belief supported merely because [GKP00] discuss knowledge rather than belief; but this makes no difference to the underlying principles.
4.5 Timed Reasoning Logic
The approach described in the previous section fulfils many of our requirements: it avoids
logical omniscience whilst allowing us to distinguish the beliefs that an agent has at a
particular moment of time from those it will come to believe, given further time in which
to reason. However, syntactic logics soon become unwieldy. Embedded reports become
progressively more untidy with each embedding; matters are not helped by the need to
use names of sentences rather than the sentences themselves. The modal languages used
in the previous chapters (and in the deduction model) are far more appealing.
A more pressing problem, introduced in section 4.4.1, was the limited kinds of
reasoning that can be modelled by a step logic-style formalism. The original step logics,
active logics and the first-order account of the previous section all make use of single-step natural deduction style rules, but a complete natural deduction system requires more
machinery—it requires an agent to make assumptions. Assumption-based reasoning is
highly intuitive; to prove an implication, one assumes the antecedent and tries to derive
the consequent within that assumption. Such ways of reasoning cannot be modelled by
step logic; implications must be dealt with by forming instances of Hilbert axioms. This is
a particularly unintuitive way of reasoning.
This section is therefore concerned with modelling agents who make assumptions
and use them in their deductive reasoning, taking as an illustration an agent who reasons
in a natural deduction style, but who takes time to reach its conclusions. The logic, which
we introduced in [Whi04, ALW04a, ALW04b, Jag05a], is called Timed Reasoning Logic, or
TRL for short.
4.5.1 Reasoning with Assumptions
An assumption is modelled as a sentence entertained by an agent for a particular purpose.
An assumption is not a belief, yet has the psychological effect of a belief whilst the assumption is entertained. In assuming a formula, one reasons and behaves as if it were the
case (this is, then, something like the psychological notion of pretence). Assumptions are
modelled using the notion of a context. Following [GG01], a context is a localized set of
sentences treated as beliefs or assumptions, connected to other contexts by inter-context
bridge rules. Each context represents the epistemic consequences of an act of pretence (i.e. of
assumption making) on behalf of the agent; for example, the context in which an agent
makes the assumption that the moon is made of green cheese contains the sentences that
the agent would believe, were it to consider that assumption to be true.
Contexts are a suitable tool for this purpose as they can be embedded, thus
allowing one to model the making of assumptions within assumptions. For simplicity,
only a single reasoner is considered in this chapter, although it is not difficult to introduce
more into the framework. A unique context is reserved as a model of the agent’s actual
beliefs, i.e. those sentences entertained with no assumption whatever. In a more general
multi-agent setting, we would have a unique context for each agent i corresponding to i’s
assumption-free beliefs. We should perhaps point out that this is a different use of the term
context from the one that most frequently appears in the philosophical literature, which
denotes certain aspects of the world (an example is Kaplan’s [Kap89] use of such contexts
in his analysis of pure indexicals). Here, contexts are comprised of psychological notions,
although they should not be thought of as entities, but rather as potential ways of reasoning.
Consider the beliefs that an agent would come to believe in the hypothetical
circumstance of believing some formula φ. We term these the agent’s φ-beliefs and we
say they constitute the φ-context, which models what happens were the agent to make
the assumption that φ. A model partitions the agent’s internal state into contexts, one
for each assumption made. This allows us to separate the agent’s beliefs entertained
under different assumptions and thus to model multiple assumptions concurrently. The
context containing the sentences that would eventually be believed, were φ to be believed,
provides our model of an agent assuming that φ is the case. By placing a total ordering
on assumptions that can be made, we can view these contexts as a sequence. Since each
context corresponds to assuming some formula, such an ordering simply amounts to a
total order on sentences. Assumptions can be made within assumptions, and so we can
have contexts within contexts, as shown in figure 4.2.
As in step logic, agents are modelled as reasoning in deductive cycles. In each cycle, the agent’s deductive rules are matched against its beliefs to produce a set of additional
sentences, which appear as beliefs at the next cycle of reasoning. Each context is divided
into discrete timeslices, each of which contains the beliefs of the agent at a particular cycle
in its reasoning. Much as in step logic, timeslices are linked by a function inf that describes
how the agent’s beliefs change from one deductive cycle to the next. Given a set of initial
beliefs, we get a sequence of sets of sentences, each set representing the internal state of the
agent at that point in time (figure 4.3).
[Figure 4.2: Embedding contexts. Assumption contexts (φ1-beliefs, φ2-beliefs, φ2φ3-beliefs) are embedded within one another and connected to the agent's actual beliefs.]
[Figure 4.3: Timeslices. Successive belief sets (beliefs at t = 1, t = 2, t = 3, t = 4, . . .) are linked by the inf function.]
Each timeslice is simply a set of sentences; it need not be consistent or deductively closed.
When we combine these temporal and assumption-making aspects of reasoning, we arrive
at a model consisting of a grid of sets of sentences, with rows representing assumption
contexts and columns representing timeslices of those contexts, as shown in figure 4.4.
[Figure 4.4: Model of an agent. A grid of sets of sentences, with rows for the actual beliefs and the assumption contexts (assumption 1, assumption 2, . . .) and columns for the timeslices t = 1, 2, 3, . . .]
In general, contexts are denoted by structural names, reflecting the embedded
structure of the contexts they denote. For example, the name c1 c2 c3 denotes a context
embedded within the context denoted by c1 c2 . Because contexts represent sequences of
assumptions, we identify names of contexts with sequences of sentences. If c is a context
name and φ a sentence, then the sequence cφ is the result of concatenating φ to c and
denotes the context in which the assumption that φ is true is made from the context named
by c. The set of all sequences of sentences is denoted L∗ but, as the sequences we consider
represent assumptions that an agent could make, we restrict the available context names
to the finite set Σ ⊂fin L∗ of finite sequences. The empty sequence is named 'ε'.
The models pictured in figure 4.4 are then easily described using a labelled language Ll . Suppose the agent we are modelling reasons in a propositional language L over
¬, ∧, ∨ and →. Well-formed sentences of Ll then consist of a label l, which describes a
context at a timepoint and a body, a sentence taken from the agent’s language L. A label
l ∈ Σ × T is a context-timepoint pair (c, t). ‘(c, t)’ is used for the remainder of the chapter as
a label metavariable. Well formed sentences of Ll are thus of the form (c, t) : φ. We assume
that T is totally ordered; for our purposes, we can take T to be the set N of natural numbers.
A belief modality B, with B(t, φ) meaning that the agent believes that φ is true at time t, can then be defined by setting

B(t, φ) =df (ε, t) : φ
Bridge rules connecting contexts and timeslices are written in the natural deduction style;
they are of the form:
(ci, t) : φ1    · · ·    (cj, t) : φn
-------------------------------------
          (c, t + 1) : ψ
Here, ci , c j , c are metavariables ranging over contexts, t ranges over timeslices and
φ1, . . . , φn, ψ range over well-formed sentences of the agent's language L.⁸ Note that, while
each φi≤n may belong to a different context than ψ, rules always connect a timeslice t to a
successor timeslice t + 1. It is no accident that these rules look very similar to step logic
rules, for the idea is to model inference as a step-by-step process.
4.5.2 Modelling a Natural Deduction Style Reasoner
In order to see how contextual reasoning in TRL works, this section develops a model of a
natural deduction reasoner. The agent is required to make and later cancel assumptions in
order to derive new sentences (note that this could not be modelled in step logic). The aim
is to model this process using contexts connected by assumption rules, similar to the bridge
⁸ 't + 1' is convenient shorthand for the successor of t; we assume no need for arithmetical operations.
rules of context logic [GG01]. Moreover, we model the agent’s deliberation as a step-bystep process of rule application using timeslices connected by temporal rules, similar to the
step logic rules encountered above.
We begin by looking at how temporal rules relate timeslices. Intuitively, each timeslice label t ∈ T corresponds to a timepoint in a linear, non-branching temporal structure
and the timeslice labelled by t contains the beliefs of our agent at that time (within that
same context). Temporal rules may connect a timeslice labelled ‘t’ to one labelled ‘t + 1’.
Simple operations that our agent can perform from one timeslice to the next include: ∧-introduction and elimination, ¬¬-elimination and ∨-introduction. These inferences are
simple as they operate on formulae in a single context. Example temporal rules are:
(c, t) : φ    (c, t) : ψ
------------------------  ∧int
   (c, t + 1) : φ ∧ ψ

(c, t) : φ ∧ ψ
--------------  ∧elimL
(c, t + 1) : φ

(c, t) : φ ∧ ψ
--------------  ∧elimR
(c, t + 1) : ψ

(c, t) : ¬¬φ
------------  ¬elim
(c, t + 1) : φ
To illustrate how these temporal rules connect timeslices, figure 4.5 below shows
an agent’s beliefs evolving from p ∧ ¬¬q at some time t.
[Figure 4.5: Temporal rules. At t the agent believes p ∧ ¬¬q; at t + 1 it additionally believes p and ¬¬q; at t + 2 it additionally believes q.]
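The evolution pictured in figure 4.5 can be reproduced with a short Python sketch of the inf step within a single context; only persistence and the two elimination rules are included, and the tuple encoding of formulae is an illustrative assumption.

def inf_timeslice(m_t):
    """One deductive cycle within a context: persistence plus the elimination rules."""
    nxt = set(m_t)                                             # persistence
    for f in m_t:
        if isinstance(f, tuple) and f[0] == 'and':
            nxt.update({f[1], f[2]})                           # ∧elimL / ∧elimR
        if isinstance(f, tuple) and f[0] == 'not' \
                and isinstance(f[1], tuple) and f[1][0] == 'not':
            nxt.add(f[1][1])                                   # ¬¬-elimination
    return nxt

m = {('and', 'p', ('not', ('not', 'q')))}                      # timeslice t: p ∧ ¬¬q
for step in range(2):
    m = inf_timeslice(m)
    print(sorted(map(str, m)))
# t + 1 adds p and ¬¬q; t + 2 additionally adds q, as in figure 4.5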
It is important to note that the sentences appearing in timeslices only do so on
the basis of their appearing in the consequent of some inference rule. In particular, there
is nothing inherently monotonic about the Timed Reasoning framework being described
here. ∧int tells us, for example, that φ ∧ ψ holds in a timeslice (t + 1) when φ and ψ both hold
in the timeslice t of the same context, but leaves it open whether φ and ψ themselves hold
in timeslice (t + 1). It may be the case, for example, that each timeslice may only contain
a fixed number of sentences, due to an agent’s memory restrictions. Such a case would
clearly be non-monotonic; that is, φ following from Γ would not ensure that φ follows from
Γ ∪ ∆, for sentences derived from ∆ might take up space in some future timeslice required
to store formulae of Γ. However, since we require the natural deduction agent we present
here to reason monotonically, we need to add an extra temporal rule to ensure just that:
(c, t) : φ
--------------
(c, t + 1) : φ
Assumption rules are the bridge rules that link the contexts representing assumptions. As an example, consider reasoning by reductio ad absurdum and →-introduction:
both require a reasoner to make an assumption that is later withdrawn. To begin with,
consider using reductio ad absurdum to derive ¬(p ∧ ¬p); one first assumes p ∧ ¬p, derives
each conjunct using ∧-elimination and, on noting the contradiction p, ¬p, establishes that
the assumption cannot have been correct and hence derives ¬(p ∧ ¬p). Contexts are used
to model the making of assumptions: sentences entertained within that context are thus
those within the scope of the corresponding assumption. It is therefore useful to label
such contexts with the assumptions they correspond to, i.e. the very sentence assumed in
making the assumption (together with sentences assumed in prior unclosed assumptions).
Thus all sentences φ occurring in a context label c must be (c, t)-theorems for every t. In
addition, we must make sure that all (c, t)-theorems are also (c′ , t)-theorems whenever c is
a subsequence of c′ . This reflects the fact that previous assumptions may be made use of
in the current assumption. These two features are captured in the following rules:
(c = · · · φ · · ·)
------------------
    (c, t) : φ

      (c, t) : φ
------------------------
(· · · c · · · , t + 1) : φ

The rule for reductio ad absurdum is then as follows:

(cφ, t) : χ    (cφ, t) : ¬χ
---------------------------  ¬int
      (c, t + 1) : ¬φ
We may read this rule as saying: having assumed φ from a context c and having derived a
contradiction in cφ (i.e. both some sentence χ and its negation are (cφ, t)-theorems for some
t), we may infer ¬φ in c at the next timeslice. Figure 4.6 shows a model of an agent using
reductio ad absurdum to derive ¬(p ∧ ¬p). In the diagram, only newly-derived sentences are shown; contexts are represented as the rows, and the labels to the left of each row name
that context. An agent derives an implication in much the same way. Having assumed φ
in a context c and having then derived ψ, one may infer φ → ψ in the original context c at
the next timeslice:
     (cφ, t) : ψ
--------------------  →int
 (c, t + 1) : φ → ψ
Figure 4.7 below shows a model of an agent deriving p → (q → p) (I assume bracketing to
the right for ‘→’ and so drop the parentheses). Note that the agent does not jump to the
pq context directly; it first assumes that p and then that q is true. This is always the case
in making assumptions, such that models containing the context cφ must also contain the
context c.
[Figure 4.6: Reductio ad absurdum. In the p∧¬p context the agent derives p and ¬p from the assumption p ∧ ¬p; the contradiction licenses ¬(p ∧ ¬p) in the empty context ε.]
[Figure 4.7: →-introduction. The agent assumes p (context p) and then q (context pq); q → p is derived in context p and p → q → p in the empty context ε.]
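The reductio derivation of figure 4.6 can likewise be sketched in Python; contexts are keyed by the tuple of sentences assumed, and only the rules the example needs (persistence, ∧-elimination and ¬int) are implemented. The encoding is illustrative, not part of TRL's official definition.

def neg(f):
    return ('not', f)

def cycle(contexts):
    """One reasoning cycle over every context."""
    nxt = {c: set(beliefs) for c, beliefs in contexts.items()}   # persistence
    for c, beliefs in contexts.items():
        for f in beliefs:
            if isinstance(f, tuple) and f[0] == 'and':
                nxt[c].update({f[1], f[2]})                      # ∧elim
        for f in beliefs:
            if neg(f) in beliefs and c:                          # contradiction in context c
                parent, assumed = c[:-1], c[-1]
                nxt.setdefault(parent, set()).add(neg(assumed))  # ¬int into the parent context
    return nxt

assumption = ('and', 'p', ('not', 'p'))
contexts = {(): set(), (assumption,): {assumption}}              # assume p ∧ ¬p
for _ in range(3):
    contexts = cycle(contexts)
print(neg(assumption) in contexts[()])                           # True: ¬(p ∧ ¬p) is believed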
This framework is adequate as a model of the natural deduction agent in the
following sense. For any propositional sentence φ derivable in standard natural deduction,
there is a timepoint t such that φ holds in the empty context at t. We show that the labelled
formula (ǫ, t) : φ is derivable using the labelled logic. The converse also holds—a labelled
formulae (ǫ, t) : φ is derivable only when φ is derivable in standard natural deduction.
Recall that ‘ǫ’ is a context label that denotes the empty context, such that the sentences that
hold in ǫ at a timepoint t hold with no uncancelled assumptions. The sentences holding
in ǫ are thus the agent’s actual beliefs and so the agent’s beliefs precisely correspond
to deductions in standard natural deduction. Moreover, if φ is derivable from a set of
sentences Γ but ψ is not, an agent whose initial beliefs are Γ will eventually believe φ but
will never believe ψ.
We concentrate on an agent with an internal language L¬,∧ over ¬, ∧ and a set
of propositional letters P only (of course, the other connectives could be introduced by
definition in the usual way). Let R be the set of bridge rules we have introduced for these
connectives: ∧int, ∧elimL, ∧elimR, ¬int and ¬elim, as well as the persistence rule and the two assumption rules connecting contexts.
Definition 12 (Derivation) A labelled formula (c, t) : φ is derivable from a set of labelled formulae
Γl using the bridge rules in R, written Γl ⊢R (c, t) : φ, if there is a sequence of labelled formulae
(c1, t1) : φ1, . . . , (ck, tk) : φk such that:
1. each labelled formula in the sequence is either a member of Γl , or is obtained from previous
labelled formulae in the sequence by applying one of the bridge rules in R; and
2. the last labelled formula in the sequence is (c, t) : φ.
Let Γ be a set of unlabelled propositional formulae and φ an unlabelled propositional formula, and set Γl = {(ε, 0) : φ | φ ∈ Γ}. Γl corresponds to placing each propositional formula φ ∈ Γ into the empty (i.e. assumption-free) context ε at time 0. Our agent thus
begins its reasoning by believing each member of Γ to be true.
Theorem 1 Let Γ ⊆ L¬,∧, φ ∈ L¬,∧ and Γl = {(ε, 0) : φ | φ ∈ Γ}. Then Γl ⊢R (ε, t) : φ, for some
t ∈ N, iff φ is a consequence of Γ in classical propositional logic.
Proof:
See appendix A for the proof.
⊣
Corollary 1 For any finite set of labelled formulae Γl and a labelled formula φl , it is decidable
whether Γl ⊢R φl .
Proof: This follows immediately from theorem 1 given the decidability of classical propositional entailment.
⊣
Timed Reasoning Logic thus represents an advance on step logic, in that it is
capable of explicitly representing the assumption-making process. In the next section, a
nonstandard semantics is provided that represents an improvement on both the active logic
approach (section 4.4.2) and the approach presented in [GKP00] (section 4.4.3).
4.6 Semantics for TRL
Semantics for TRL could be given along the lines of [GKP00] (section 4.4.3) by first axiomatizing the logic, defining a consequence operator OT on the resulting theory T and
taking its fixpoint on a Herbrand interpretation of T . This fixpoint operation collects the
results of reasoning into a single set of sentences, with step-by-step reasoning represented
syntactically by labels, rather than in the structure of the models themselves. In this section,
I introduce an alternative semantics that maintains the step-by-step structure of reasoning.
The semantics is based on the diagrams used in the previous section and so is very intuitive.
To establish completeness, we introduce the notions of a sufficient model and of a minimal
model. To keep notation uniform, φ, ψ are arbitrary unlabelled sentences and Γ is a set
thereof. Adding the superscript l then denotes the labelled versions, e.g. φl is an arbitrary
labelled formula and Γl a set thereof.
Definition 13 (Models) Let Σ be a finite set of finite sequences of sentences over L and inf be a
function of type (Σ × N) −→ 2^L. A model M is a tuple ⟨Σ, inf, {m^c_t | c ∈ Σ, t ∈ N}⟩ where:
1. each m^c_t ⊂fin L is a finite set of sentences such that:
   m^{cφ}_0 = m^c_0 ∪ {φ} and
   m^c_{t+1} = inf(c, t)
2. inf satisfies:
   if φ1 ∧ φ2 ∈ m^c_t then φ1, φ2 ∈ inf(c, t)
   if φ1, φ2 ∈ m^c_t then φ1 ∧ φ2 ∈ inf(c, t)
   if ¬¬φ ∈ m^c_t then φ ∈ inf(c, t)
   if φ ∈ m^c_t then φ ∈ inf(c, t)
   if φ, ¬φ ∈ m^{cψ}_t then ¬ψ ∈ inf(c, t)
   if φ ∈ c then φ ∈ inf(c, t)
   if φ ∈ m^c_t then φ ∈ inf(c′, t) where c′ = · · · c · · ·
When a model M minimally (w.r.t. ⊆) satisfies these conditions, we say that it is a minimal model.
Intuitively, inf (c, t) looks at a timeslice t and infers which formulae should be in the context
c at the next timeslice.
Definition 14 (Satisfaction) We say a labelled formula of the form (c, t) : φ is satisfied by a model
M, written M |= (c, t) : φ, iff φ ∈ m^c_t.
Definition 15 (Validity and Entailment) A labelled formula φl is (i) Σ-valid, written |=Σ φl iff
M |= φl for all models M over Σ and (ii) a Σ-consequence of a set of labelled formulae Γl , written
Γl |=Σ φl , whenever φl is satisfied by all models M over Σ that also satisfy every ψl ∈ Γl .
In natural deduction derivations, different proofs rely on making different sequences of assumptions. It is therefore useful to introduce the notion of a set Σ of assumptions being sufficient to model a particular derivation. Note that we could not simply
assume that Σ contains all possible sequences over L, for this would allow the possibility of
infinitely many formulae being introduced to a local state by the inf condition, which would
violate the definition of each mct as a finite set. Instead, we use the following definition of
sufficiency:
Definition 16 (Sufficient Models) A set Σ of sequences is said to be sufficient for a pair ⟨Γl, (c, t) : φ⟩ iff either Γl ⊬R (c, t) : φ or else there exists a derivation Γl ⊢R (c, t) : φ such that, for each formula (c′, t′) : ψ appearing in the derivation, c′ ∈ Σ. A model M is said to be sufficient for ⟨Γl, (c, t) : φ⟩ when it is defined over Σ and Σ is sufficient for ⟨Γl, (c, t) : φ⟩.
We can now establish that the bridge rules R are sound and complete with respect to the
semantics just proposed. We begin by preparing the following lemma:
Lemma 1 Let Γl be a set of labelled formulae. For any formula φ, if M is a minimal model of Γl
sufficient for ⟨Γl, (c, t) : φ⟩, then: φ ∈ m^c_t iff Γl ⊢R (c, t) : φ.
Proof:
See appendix A for the proof.
⊣
Theorem 2 (Soundness & Completeness) There is a sufficient Σ such that Γl ⊢R φl iff Γl |=Σ φl .
Proof:
Soundness is standard: clearly, the rules in R preserve validity. For completeness,
suppose Γl |=Σ φl . By definitions 13 and 16, there is a minimal model M over Σ of Γl such
that M |= φl . By Lemma 1, Γl ⊢R φl .
⊣
Corollary 2 φ is a classical consequence of Γ iff Γl |=Σ (ε, t) : φ for some t.
Corollary 3 It is decidable whether Γl |=Σ φl .
In general, models may contain more assumption contexts than is strictly necessary, or too few assumption contexts, for a particular derivation, whereas sufficient models
contain just the right assumption contexts for that derivation. In [Whi04], I show how to
use goal-based reasoning to construct sufficient models. One knows, for example, that
p → q is proved by assuming p and deriving q. This is modelled by opening a p-context
and setting the goal, in that context, to be q. When the goal is reached, the context is
dispensed with. Each context is then associated with a goal context. Note that this is not a
fully-fledged theorem-prover, but merely a model of an agent that knows how to reason,
given its goals.
As mentioned at the beginning of this section, the semantics presented here captures step-by-step reasoning more intuitively than the first-order Herbrand model construction in section 4.4.3 does, yet there is a close correspondence between the two approaches.
Recall that the Herbrand model H in section 4.4.3 of a theory T was constructed in stages,
using a consequence operator OT that works in a similar way to the inf function used here.
If we model an agent that uses rules corresponding to the axioms of the theory T , applying
inf to a set of initial beliefs/observations n times has the same result as applying OT to a
Herbrand interpretation of T n times.
To recap, the agents in [GKP00] do not reason using assumptions—they reason using only persistence, modus ponens, conjunction introduction and conjunction elimination—so only a
single TRL context is required to model each agent. However, [GKP00] allows for multiple
agents. We thus modify our interpretation of the TRL contexts to represent different agents,
rather than different assumptions. Let the set of context labels Σ contain a name i for each
agent to be modelled and Bi be the binary belief modality for each i ∈ Σ, defined as
Bi(t, φ) =df (i, t) : φ
We have not introduced a modality for observation in TRL, so define Oi now as
Oi φ =df (i, 0) : φ
The bridge rule is then
Oi φ
Bi (0, φ)
Let R contain just the bridge rules corresponding to persistence, modus ponens, conjunction introduction and conjunction elimination. Each agent in [GKP00] begins its reasoning with an
initial set of observations, given by the obs function. TRL models of the axiomatic reasoner
of [GKP00] are then as follows:
Definition 17 (Models of the axiomatic reasoner) A TRL model of an axiomatic reasoner M is a tuple ⟨Σ, inf, {m^c_t | c ∈ Σ, t ∈ N}⟩ as above with m^i_0 = obs(i) for each agent i ∈ Σ. Local models m^i_t are as in definition 13, but with condition (2) replaced by:
1. if φ ∈ m^i_t, then φ ∈ inf(i, t + 1)
2. if φ, φ → ψ ∈ m^i_t, then ψ ∈ inf(i, t + 1)
3. if φ1, φ2 ∈ m^i_t, then φ1 ∧ φ2 ∈ inf(i, t + 1)
4. if φ1 ∧ φ2 ∈ m^i_t, then {φ1, φ2} ⊆ inf(i, t + 1)
These models have the following property:
Theorem 3 Let ∗ be a mapping from propositional formulae to structural first-order terms as follows:
p∗ = ⌜p⌝
(¬φ)∗ = neg(φ∗)
(φ1 ∧ φ2)∗ = conj(φ1∗, φ2∗)
(φ → ψ)∗ = imp(φ∗, ψ∗)
Now let M be a model of the type just described and H a Herbrand model of the first-order language of section 4.4.3. Then M |= Bi(t, φ) iff H |= Bi(t, ⌜φ⌝).
Proof:
The proof is in appendix A.
⊣
Corollary 4 For every stage n of the construction of H, we have
O^n_T = ⋃_{i∈Σ} m^i_n
Proof:
Immediate from theorem 3, the construction of M just given and the definition of H as
H = ⋃_{n∈N} O^n_T
⊣
We can thus view the semantics presented here as equivalent to that of [GKP00],
with the added advantage that the step-by-step structure of reasoning is maintained.
4.7 Discussion and Related Work
The logic TRL presented in the two previous sections has advantages over the possible
worlds approaches (it avoids logical omniscience) and first-order logics (the notation is
much simpler and consistency is not an issue). Moreover, the semantics maintains the
structure of step-based reasoning. However, both step logic and TRL share a fault: they
assume that whenever a sentence can be derived in one application of a rule, it is available
as a premise at the next cycle of reasoning. A step corresponds to the agent inferring all
sentences that can be derived from its current beliefs in a single rule application, before
moving on to the next cycle. Of course, this is not at all practical. Agents tend to deduce new
beliefs one by one. Our paradigm here might be a Quine-style derivation [Qui50a, Qui50b],
with a single new sentence added per line. Assuming that agents can use all of their
deduction rules in parallel is an abstraction that, in certain cases, may be unwarranted.
For example, consider the premise set {p1 , . . . , pn , p1 → q1 , . . . , pn → qn , q1 ∧ · · · ∧ qn → r}
for n > 1. It takes n applications of modus ponens to derive q1 , . . . , qn , then a further n − 1
applications of conjunction introduction to derive q1 ∧ · · · ∧ qn and a further application of
modus ponens to derive r. If n = 6, for example, 12 rule applications are required to derive
r, but step logic models the process in just 5 steps.
As an agent gathers more beliefs, the demands on the agent of rule firing in
parallel increase. As more rules need to be applied from one step to the next, the length
of time between each step will increase and so steps cease to be a good approximation
of units of time. [EMP91] attempts a work-around in which only a small subsection of
the agent’s beliefs can be matched against its rules. They call this the agent’s short term
memory, which is a fixed size, and so the number of rules that can be fired (and the number
of new beliefs added) per step stays roughly the same as the agent executes. However, a
mechanism is then required for pulling sentences from the agent’s long term memory (the
store of all the sentences derived) into short term memory, which itself takes time. The
consequence relation also ceases to be monotonic, for adding a new sentence to short term
memory may force another (which may be required as a premise of some rule) out to make
room in the short term memory. The resulting logic is by no means as intuitive as those encountered above. One may feel that the emphasis has shifted from agent modelling to agent design, which is not my concern here. In any case, there are agents that do not have a short term/long term memory architecture, and these need to be modelled in some other way.
In step logic (and TRL), the set X of sentences added in a step of reasoning comprises those sentences that the agent could derive in one application of one of its deduction rules.
However, the sentences added after two steps are not necessarily derivable by the agent in
two rule applications. A better approach would be to non-deterministically pick just one of
the sentences from X to be added as a belief at the next cycle of reasoning. Since the choice
is non-deterministic (either because the agent actually makes a non-deterministic choice of
which deduction rule to use, or because we do not know enough about the agent’s internal
mechanism to build a deterministic model), there is more than one possible next step. The
future branches into all possible belief states that the agent could be in at its next cycle.
This model is developed formally in the next chapter.
Another desideratum for the logic to be developed is that it should automatically
produce what was termed a minimal model in the previous section, related to a knowledge
supported model in [GKP00]. Such models only ascribe a belief to an agent using a sentence
φ when φ is derivable from the agent’s previous beliefs. The models discussed above, both
for TRL and first-order step logic, needed to be modified in order to have this property.
This suggests they are not an ideal framework for modelling belief. The models presented
in the next chapter automatically have the minimality property.
Related Work
In addition to [GKP00], the literature contains, to my knowledge, only one account comparable to TRL. Ågotnes [Ågo04] considers a logic of finite syntactic epistemic states. As with
TRL, the semantics is based on sets of sentences. These sets are situated in a subset lattice
such that, as an agent learns more, it moves up through the lattice. An unusual feature of
[Ågo04] is that syntactic operators take sets of sentences as their arguments. △i {φ1 , . . . , φn }
says that agent i believes (or knows, depending on the interpretation) at least that φ1 , . . . , φn
are true. Similarly, ▽i {φ1 , . . . , φn } says that agent i believes at the most that φ1 , . . . , φn are
true. The syntax of what an agent believes at a time thus closely follows the semantics.9
Each agent also has a set of rules with which it can derive new sentences. △;ij {R1} means that agent i has (at least) rule R1 at its disposal for communicating with agent j. Knowledge of an inference (as opposed to communication) rule R2 is then expressed as △;ii {R2}. Again, ▽;ij {R1, . . . , Rn} says that agent i has at most rules R1, . . . , Rn for communication with j.
9
The idea of knowing at most that φ was previously discussed by Levesque [Lev90], although Levesque’s
semantics are markedly different from Ågotnes’s.
The semantics is provided by game-theoretic structures. To simplify somewhat,
models can be thought of as branching temporal structures. [Ågo04] makes use of the group
modalities from alternating time logic or ATL [AHK02], a generalization of computational
tree logic, in combination with the usual temporal future, next step and until operators F , X
and U. Given a set of agents G, the modality hhGii allows sentences to express co-operation
between members of G to achieve some result. For example, hhGiiX △i {φ} says that the
agents in G can co-operate to allow i to gain the belief that φ at the next step. Note that the
usual CTL modalities E and A can be expressed as hhAii and hh{}ii respectively, where A is
the set of all agents in the system.
As discussed above, branching structures allow for a more appropriate semantics
than the linear structures of TRL, as they allow an agent’s non-deterministic choice of
which rule to use to be modelled. Developing a logic with these benefits, whilst retaining
the intuitions about step-based reasoning discussed already, is the task of the following
chapters.
Chapter 5
Representing Rule-Based Agents
5.1 Motivation for the Logic
To summarize the discussion of the preceding chapters, the belief states of an agent are
best characterized in terms of sentences. Epistemic/doxastic logics that take a sentential or
syntactic approach give semantics for ‘agent i believes that . . . ’ in terms of belief sets, which
need not be closed under the agent’s deduction rules. Such belief sets are often interpreted
as models of the internal state of the agent, either in terms of the answers the agent would
give after querying its internal database of information, or in terms of the values assigned
to the variables in its program. When such an interpretation is possible, the resulting
ascriptions of belief or knowledge are said to be grounded in the agent’s internal state.1
By limiting the focus to what such an agent believes now, we are likely to arrive at
a very dull logic of belief for, as we saw in the previous two chapters, an agent’s belief set
need not be closed under any logical rules whatsoever. It is therefore necessary to be clear
as to why a logic of belief is desirable at all. One answer is the following. There has been
considerable interest in the last twenty years in verifying properties of programs. Knowing
that a program will not enter a loop from which it will never exit, or that a server cannot
enter a state in which two users can change the contents of a database at the same time,
are clearly useful things to know. The same kind of knowledge is desirable with artificial
agents in the picture. Designs in artificial intelligence can often be ad hoc and so it is vital
for researchers to be able to verify the agents they design. One use for a logic of belief is
to enable properties of artificial agents to be verified at the intentional level, the descriptive
1
[Woo00] discusses grounded theories of agency. Just how an epistemic/doxastic logic should be grounded
(and just how grounded it needs to be) depends on its application.
level at which the agent is said to have concepts, to believe this, to desire that and so on.
Such a logic should not just talk of belief; it must include temporal and alethic
notions, allowing for judgements such as ‘the agent can reach a state in which it believes
φ in ten cycles’, ‘the agent must reach its goal in ten cycles’ or ‘the agent cannot reach a
state in which it believes ψ within ten cycles’. Here cycle means the change from one belief
state to another. The concept can be used to give a temporal interpretation to the logic,
just as the notion of a step does in step logic. In section 4.3.4, I made the point that it is
not sufficient to know what an agent would eventually derive with unlimited resources.
Given our interest in resource-bounded agents, it is vital to be able to say what an agent
could derive from its prior beliefs within a certain number of cycles.
The motivation common to step logic, TRL and Ho’s dynamic epistemic logic is
to capture reasoning processes as activities that take place in time. This allows a model to
capture agents that are neither perfectly rational nor irrational reasoners: neither logically
omniscient nor logically ignorant [Ho95]. The more we abstract from an agent’s resource
bounds, e.g. by assuming unbounded memory and allowing time to tend to infinity, the more
an agent’s belief state tends to that of an ideal reasoner; yet at no particular time does the
agent’s belief set contain all tautologies or deductive consequences of the agent’s beliefs.
One difference between the step logic-TRL approach and that of Ho [Ho95, Ho97]
and Ågotnes [Ågo04] is that the latter use a branching model of time, whereas time is
linear in the former. As discussed in the previous chapter (section 4.7), the advantage of
a discrete, branching-time model is that it distinguishes between those belief states that
might obtain and those that have to obtain within a certain time bound. Figure 5.1 shows
the structure of such a model.
Figure 5.1: Part of a branching time model (point s with possible successor states u and v; time runs left to right)
Time is said to be branching in the model in the sense that the facts that hold at point s do
not determine whether point u or v will come next: although only one can actually follow
s, both are possible successors. Such models are adopted in the account presented below.
As in step logic and TRL, units of time are identified with cycles of the agent’s
reasoning, i.e. the time it takes the agent to move from one state to another. We are therefore
obliged to ensure that each cycle of reasoning takes more or less the same length of time
to complete. The models presented in this chapter follow a fine-grained approach: any
change in an agent’s belief state is modelled as a transition from one state of the model to
another. The key to modelling evolving belief states is to capture the notion of a transition
from one state to another in the heart of the logic. In figure 5.1, the points represent internal
states of the agent and the arcs, read left to right, are transitions between these states. When
an agent in state s can use one of its inference rules to derive a new belief, this is modelled
by a transition from s to a new state s′ , just like s except for the addition of that new belief.
Since using a single conclusion inference rule once produces just one new belief, states
related by a transition may differ only by a single belief.
One difference between this approach and that of step logic or TRL is found in the
way an agent’s rules are applied in each cycle of inference.2 The difference can be captured
by analogy with the concept of a rule firing strategy from rule-based systems terminology.
Step logic and TRL model the all rules at each cycle strategy:
1. Match the antecedents of all rules in all possible combinations with previously derived
sentences, marking the sentences that will be added; then
2. Add all the marked sentences to the set of derived sentences;
3. Move on to the next cycle.
This is a rather unnatural (not to mention inefficient) strategy. If one is trying to derive q
and has already derived p and p → q, the sentences p ∧ p or p ∧ (p → q) can be ignored, even
though they can be derived using conjunction introduction. In this situation, the only rule
one is interested in using is modus ponens. A potentially more useful rule firing strategy is
one rule at each cycle:
2
The rules considered here differ from those considered by step logic and TRL in that those accounts
consider standard inference rules, whereas the rules considered here are the specific rules of a rule-based
agent: see section 5.2 below.
1. Match the antecedents of all rules in all possible combinations with previously derived
sentences;
2. Choose one of the resulting instances to fire;
3. Add the consequent of the chosen rule instance to the set of previously derived
sentences and move on to the next cycle.
Thus, for each sentence that would be added in the all rules at each cycle strategy, there is a
transition to a successor state in the branching temporal models of the one rule at each cycle
strategy.
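The difference between the two strategies can be made concrete with a small illustrative sketch; the encoding and function names below are mine and not part of the formal development. A rule is a pair of a frozenset of antecedent literals and a consequent literal, and a belief state is a frozenset of literals.

# Illustrative sketch: the two rule-firing strategies over propositional
# literals. A rule is (frozenset_of_antecedents, consequent); a belief
# state is a frozenset of literals. Names are my own.

def matching(rules, beliefs):
    """Rule instances whose antecedents all hold and whose consequent is new."""
    return [(ants, c) for (ants, c) in rules
            if ants <= beliefs and c not in beliefs]

def all_rules_step(rules, beliefs):
    """'All rules at each cycle': add every derivable consequent at once."""
    return beliefs | {c for (_, c) in matching(rules, beliefs)}

def one_rule_successors(rules, beliefs):
    """'One rule at each cycle': one successor belief state per matching instance."""
    return [beliefs | {c} for (_, c) in matching(rules, beliefs)]

rules = [(frozenset({'p'}), 'q'), (frozenset({'p'}), 'r')]
state = frozenset({'p'})
print(all_rules_step(rules, state))        # both q and r added in a single step
print(one_rule_successors(rules, state))   # two branches: one adding q, one adding r

Each element of the list returned by one_rule_successors corresponds to one transition in the branching temporal models described above.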
The agent considered here is a deductive agent that never revises its beliefs, even if
it discovers them to be inconsistent. Models of belief revision have been studied elsewhere
(e.g. [AGM85]) and, although important in a complete account of an agent’s reasoning,
are not the province of epistemic or doxastic logic per se. The remainder of this chapter is
concerned with developing and exploring the properties of such models. In order to do
this in a simple setting, I take as a working example the case of rule-based agents.
5.2 Rules and Rule-Based Agents
To investigate the kinds of models described in the previous section, I concentrate on the
class of rule-based agents, which consist of a program—a set of condition-action rules—and
a rule interpreter. In general, a rule-based agent’s program will contain condition-action
rules of the form
P1 , . . . , Pn
Q1 , . . . , Qm
or
P1 , . . . , Pn ⇒ Q1 , . . . , Qm
Here, the literals P1 , . . . , Pn are the conditions and Q1 , . . . , Qm are the resulting actions. Each
Pi , Qi may contain unbound variables or possibly even logical connectives. Such rules are
read as: if each of the conditions P1 , . . . , Pn hold, then do each of the actions Q1 , . . . , Qm .3 Actions
might include adding this or that sentence to the agent’s database, or sending a message
to the agent’s motor control or to another agent. Here, I concentrate on rules that result
3
In this respect, these rules differ from multiple-conclusion sequents in that commas on both the left and
right-hand sides stand for conjunction.
in a single action, which I discuss in more detail throughout this chapter. First, I briefly
survey how rule-based systems have been of use both in current AI practice and in the
wider world.
Rule-Based Systems in AI
Rule-based systems have been more or less ignored by the literature on epistemic logic, but
play an important rôle in other areas of AI.4 Rule-based approaches allow a great degree
of abstraction in specifying the behaviour of agents. As a result, there are now several
rule-based agent architectures available, e.g. SOAR [LNR87] and SIM_AGENT [SL99]. Rule-based programming extensions are also increasingly being offered as add-ons to existing,
lower-level agent toolkits, e.g. JADE [BPR01] and the FIPA-OS JessAgent [PBH00]. Much of
this work is based on rule-based technology, such as the Jess rule engine [FH06], originally
developed in the area of knowledge-based systems. Another application of rule-based
approaches in AI is the semantic web. Examples from the current literature are:
• ontological reasoning in DAML/OWL;
• rule extensions to ontologies such as OWL: OWL+Rules [HPS04];
• SWRL [HPSB+ 06], a combination of OWL and RuleML [RML06].
These extensions significantly increase the expressive power of the underlying ontology
languages [HPS04].
As with any new technology, realizing the benefits of rule-based approaches requires formal methods for establishing the properties of the resulting systems with respect to:
Correctness: whether a rule-based agent will produce the correct output for all legal inputs;
Termination: whether a rule-based agent will produce an output at all; and
Response time: how much computation a rule-based agent will have to do before it generates any output.
The models developed in this chapter contribute towards developing these formal tools.
4
An early exception was [Kon86a]; more recent exceptions are [Whi04, ALW04a, ALW04b, AJL06b, AJL06a,
Jag06b].
Rule-Based Systems in Business
Rule-based technology is not limited to research in AI; it has found a home in many areas
of the business world, such as insurance (rating), financial services (loans, fraud detection,
claims routing and management), government (application process and tax calculations)
and e-commerce (personalizing the user’s experience). [Mah05] claims that all of these
areas have benefited from using rule engines:
Rule engines are used in applications to replace and manage some of the business logic. They are best used in applications where the business logic is too
dynamic to be managed at the source code level—that is, where a change in a
business policy needs to be immediately reflected in the application [Mah05].
Business rules are rules that define or constrain an aspect of a business [BRC06], e.g. every
visitor of the conference gets a 20 per cent discount on the first product purchased. They are
being used by companies to analyze the behaviour and improve the efficiency of their
business. As the business rules community puts it, “business rules are the very essence
of a business. They define the terms and state the core business policies. They control
or influence business behaviour. They state what is possible and desirable in running a
business—and what is not” [BRC06].
5.3 Modelling Rule-Based Agents
The architecture of rule-based agents is often divided into an interpreter, which matches
rules from the agent’s program against the agent’s beliefs to produce instances, and a
working memory, which stores the results of firing rule instances. An agent’s program is just
the set of rules with which the agent executes. I use the term belief to cover both the rules
in the agent’s program and literals held in its working memory. Agents have an initial
stock of beliefs (which might be observations) that are neither revised nor added to, other
than by firing rules and adding their consequents as new beliefs. Thus, the set of believed
rules does not change as the agent executes, in keeping with standard AI practice.5 As
discussed in the previous chapter, a modal language containing an operator B is preferable
to a first-order metalanguage. A sentence Bα holds at a state s when the formula α is part
of an agent’s local state at s, either as a rule or as a formula held in working memory.
5
In fact, this is one of the main reasons for developing a logic for rule-based agents, namely to check that a
given program allows an agent with consistent beliefs to execute without running into inconsistencies.
It is instructive to begin with an example. Suppose the agent operates using just
the following two rules:
R1 PremiumCustomer(x), Product(y) ⇒ Discount(x, y, 10%)
R2 Spending(x, > 1000) ⇒ PremiumCustomer(x)
Now suppose that the agent’s initial working memory contains the beliefs
Product(iBook)
Spending(Jones, > 1000)
Product(Sunglasses)
When the agent begins executing, R2 can be matched against Jones to produce
Spending(Jones, > 1000) ⇒ PremiumCustomer(Jones)
(5.1)
Since no other instances of either R1 or R2 are possible, there is then a unique next state in
which
PremiumCustomer(Jones)
is added to the agent’s working memory. At the agent’s next cycle, x in R1 can be matched
against Jones and y against either Sunglasses or iBook to produce the instances:
PremiumCustomer(Jones), Product(Sunglasses) ⇒ Discount(Jones, Sunglasses, 10%)    (5.2)
PremiumCustomer(Jones), Product(iBook) ⇒ Discount(Jones, iBook, 10%)    (5.3)
Note that (5.1) is no longer counted as a matching rule instance, since its consequent has
already been added to the working memory.6 The agent can then move into a state in which
the working memory contains either Discount(Jones, Sunglasses, 10%) or else contains
Discount(Jones, iBook, 10%) in addition to its previous contents. If the agent fires (5.2),
adding Discount(Jones, Sunglasses, 10%) to working memory, (5.3) remains a matching
rule instance and Discount(Jones, iBook, 10%) is added at the next state. Similarly, if
the agent fires (5.3), adding Discount(Jones, iBook, 10%) to working memory, then (5.2)
6
This concept of a matching rule differs slightly from standard rule-based systems terminology. In standard
refractory rule firing, a rule instance is said to be matching so long as its antecedents are in working memory
and the instance has not been fired previously; a rule instance whose consequent is already held in working
memory may be matching, allowing all possible justifications for a particular belief to be collected. Since
justifications do not play an important rôle in the current account, instances whose consequent is already
stored in working memory are not counted as matching here.
remains matching. There is then a next state adding Discount(Jones, Sunglasses, 10%)
to working memory. Figure 5.2 shows a branching time model in which new beliefs
are added to the working memory (only new beliefs are shown). The agent can derive
Discount(Jones, iBook, 10%) in 2 cycles, whereas it must derive it within 3 cycles. If this model is M and its root s, then
M, s ⊩ 33 Discount(Jones, iBook, 10%)
and
M, s ⊩ 222 Discount(Jones, iBook, 10%).
Figure 5.2: New literals added to WM
The formal language describing such models should be expressive enough to
distinguish ‘the agent can derive λ in n cycles’ from ‘the agent must derive λ in n cycles’.
However, we need not be concerned with distinguishing a single state labelled WM from a chain of identically labelled states WM → WM → WM → · · ·
in which the contents of WM does not change; nor should we be concerned with distinguishing between the left and right-hand structures of figure 5.3. Although these structures
are not isomorphic to one another, they are bisimilar (see section 5.4 below). This suggests
that we do not require the full expressive power of first-order logic and that we can work
with a suitable modal language. We will therefore use the standard modal operator 3 and
its dual 2 to express that the agent can or must add a sentence to its working memory at
the next step.
Figure 5.3: Two models of the same rule-based reasoning process
In a language containing predicates, variables and constants, such as the language
used in the example, we can write a substitution instance of a literal (an atomic sentence that may be preceded by a single negation symbol) λ as λδ. When δ is the map x1 ↦ c1, . . . , xn ↦ cn, the instance of the literal P(x1, . . . , xn) under δ is written as P(x1, . . . , xn)δ, i.e. the ground literal P(c1, . . . , cn). Similarly for rules, if ρ is a rule λ1, . . . , λn ⇒ λ, then
ρδ = λ1δ, . . . , λnδ ⇒ λδ
If the language contains a set X of variables and a set C of constants, then Σ is the set of
functions δ : X −→ C. This is the approach taken in [AJL06a]. The resulting metalanguage
is reminiscent of many used in current AI practice.7
This is not the approach taken below. Because there can be at most a denumerable number of constants used in any execution of an agent, a propositional language is
sufficient. The agent in the above example can be modelled by considering the language
containing all instances of R1 and R2 over a denumerable set of constants. In this language,
7
In order to axiomatize the approach adopted in [AJL06a], the set of constants used to instantiate the
variables in rules must be finite.
‘rule’ shall mean an instance of a rule such as R1 or R2; and rules may contain negation
signs immediately preceding a propositional letter but nowhere else; thus p, ¬q ⇒ ¬r is a
rule, whereas ¬(p ⇒ q) is not.
To keep the notation simple, literals λ1 , . . . will be used in place of propositions,
where each λi is either a propositional letter or a propositional letter preceded by a negation
sign. The general form of a rule is thus λ1 , . . . , λn ⇒ λ. Note that the rules presented in
this chapter do not contain disjunction (agents thus have no disjunctive beliefs). This is a
restriction on the expressiveness of the logic presented in this chapter, but is by no means a
limitation of the general framework. Disjunction within the agent’s beliefs is banned here
to reduce the complexity of this initial investigation and is introduced again in chapter 7.
Formal Syntax
We fix a denumerable set of propositions P = {p1 , p2 . . .}. A literal is either a proposition
or its negation; literals are written λ, λ1 , λ2 . . . . Rules are of the form λ1 , . . . , λn ⇒ λ and
in general rules are denoted ρ, ρ1 , ρ2 , . . . . Since it is often useful to know which belief a
rule adds when fired, cn(ρ) abbreviates λ when ρ = (λ1 , . . . , λn ⇒ λ). The agent’s internal
language LP over P contains only rules and literals; no other formulae are considered well-formed. Since P will be fixed throughout, the superscript may be informally dropped.
Arbitrary formulae of L are denoted α, α1 , . . . .
The modal language MLP , which is used to reason about the agent’s beliefs, is
built from formulae of LP (again the superscript may informally be dropped). ML contains
the usual propositional connectives ¬, ∧, ∨, →, the 3 modality and a belief operator B.8
Given a literal λ and a rule ρ, Bλ and Bρ are primitive wffs of ML, and all primitive wffs
are formed in this way. If φ1 and φ2 are both ML wffs, the complex wffs of ML are then
given by
¬φ1 | φ1 ∧ φ2 | φ1 ∨ φ2 | φ1 → φ2 | 3φ1
The dual modality 2 is introduced by definition: 2φ =df ¬3¬φ. Note that the primitive
formulae of ML are all of the form Bα, where α is an L-formula, hence the problem of
substitution within belief contexts does not arise in logics based on ML.
8
Note that ‘→’ is the usual truth-functional implication and appears in ML formulae, whereas ‘⇒’ appears
only in rules in L (and in ML formulae built from rules).
Formal Models
Models are graphs of states, with each arc representing a change in an agent’s belief state.
Although time is not explicitly represented in these models, each arc is thought of as a
transition from an agent’s belief state at one time to a (possible) belief state at a future
moment in time, arrived at by firing a rule and adding its consequent as a new belief.
Formally, a model M is a structure
⟨S, T, V⟩
where
• S is a nonempty set of states;
• T ⊆ S × S is a transition relation on states;
• V : S −→ 2^L is the labelling function, assigning a set of sentences of the agent’s internal
language to each state.
T is called the transition relation as it captures the intuitive interpretation of arcs between
points as temporal transitions between the agent’s belief states from one moment to the
next. Where there is a transition from s to s′ , s′ will be said to be a successor of s; s′ is reachable
from s when there is a sequence of states ss1 s2 · · · sn s′ such that each is the successor of the
one before.
Definition 18 (Labelling) Given a model M = ⟨S, T, V⟩, a sentence α ∈ L is said to label a state s ∈ S when α ∈ V(s). Given models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩ (which need not be distinct), states s ∈ S and s′ ∈ S′ are said to be label identical, written s ≈L s′, when V(s) = V′(s′).
The definition of a formula φ of ML being satisfied by a state s in a model M (written M, s ⊩ φ) is as follows:
M, s ⊩ Bα iff α ∈ V(s)
M, s ⊩ ¬φ iff M, s ⊮ φ
M, s ⊩ φ1 ∧ φ2 iff M, s ⊩ φ1 and M, s ⊩ φ2
M, s ⊩ φ1 ∨ φ2 iff M, s ⊩ φ1 or M, s ⊩ φ2
M, s ⊩ φ1 → φ2 iff M, s ⊮ φ1 or M, s ⊩ φ2
M, s ⊩ 3φ iff there exists a state s′ ∈ S such that Tss′ and M, s′ ⊩ φ
Such models are known as Kripke models. ‘M, s ⊩ φ’ is read as s supports the truth of φ in M, or s supports φ for short, when it is clear which model is being talked about. When ambiguity cannot arise, we may write s ⊩ φ for short.
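For a finite model, the clauses above translate directly into a recursive satisfaction check. The sketch below is illustrative only: ML formulae are encoded as tuples and the model as the triple ⟨S, T, V⟩ of the definition, with names of my own choosing.

# Illustrative satisfaction check for the clauses above. ML formulae are
# tuples: ('B', alpha), ('not', f), ('and', f, g), ('or', f, g),
# ('imp', f, g), ('dia', f). A model is (S, T, V), with T a set of pairs
# of states and V a dict from states to sets of L-sentences.

def sat(model, s, phi):
    S, T, V = model
    op = phi[0]
    if op == 'B':                       # B(alpha) holds iff alpha is in V(s)
        return phi[1] in V[s]
    if op == 'not':
        return not sat(model, s, phi[1])
    if op == 'and':
        return sat(model, s, phi[1]) and sat(model, s, phi[2])
    if op == 'or':
        return sat(model, s, phi[1]) or sat(model, s, phi[2])
    if op == 'imp':
        return (not sat(model, s, phi[1])) or sat(model, s, phi[2])
    if op == 'dia':                     # 3phi: some T-successor satisfies phi
        return any(sat(model, u, phi[1]) for (v, u) in T if v == s)
    raise ValueError(f'unknown connective {op!r}')

def box(f):                             # 2phi, defined as not-3-not-phi
    return ('not', ('dia', ('not', f)))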
Definition 19 (Global satisfiability and validity) An ML formula φ is globally satisfied in a model M = ⟨S, T, V⟩, notation M ⊩ φ, when M, s ⊩ φ for each state s ∈ S. Given a class of models C, φ is said to be valid in C or C-valid, written C ⊩ φ, when M ⊩ φ for any M ∈ C. Validity (simpliciter) is validity in any class. A set of ML formulae Γ is said to be satisfied at a state s ∈ S, written M, s ⊩ Γ, when every element of Γ is satisfied at s. Γ is then globally satisfied, C-valid or valid in a similar way.
Definition 20 (Modal Equivalence) Given models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, states s ∈ S and s′ ∈ S′ are said to be modally equivalent, written s ! s′, when {φ | M, s ⊩ φ} = {ψ | M′, s′ ⊩ ψ}.
Because these models are standard in modal logic, they need to be restricted in
certain ways to model rule-based agents. In particular, the rules that an agent believes do
not change; rules are neither learnt nor forgotten. This is standard practice in rule-based AI
systems (cf condition S4 below). Secondly, T must relate a state s to some state u whenever
there is a rule ρ that can be fired at s, and u is just like s except the agent has gained one
new belief, the consequent of ρ. Here, ρ is said to be an s-matching rule.
Definition 21 (Matching rule) A rule ρ of the form λ1 , . . . , λn ⇒ λ is said to be s-matching, for
some state s ∈ S, iff ρ ∈ V(s), each λ1 , . . . , λn ∈ V(s) but λ ∉ V(s).
Whenever ρ is s-matching for some state s, then the agent can move into a new state in
which it has gained a new belief. That state is said to extend s by the new belief, namely
cn(ρ).
Definition 22 (Extension of a state) For any states s, u ∈ S, u extends s by a literal λ iff V(u) =
V(s) ∪ {λ}.
If there are no matching rules at a state (and so no rule instances to fire), that
state is a terminating state and has a transition to itself (or to another identical state, which
amounts to much the same in modal logic). This ensures that every state has an outgoing
transition; in other words, T is a serial relation. As a consequence, the question ‘what will
the agent be doing after n cycles?’ can always be answered, even if the agent ran out of
rules to fire in less than n cycles.
Definition 23 (Terminating state) A state s is said to be a terminating state in a model M iff no
rule ρ is s-matching.
Transitions relate terminating states to identically labelled terminating states and, whenever there is a matching rule ρ at a state s, a transition should only be possible to a state u
that extends s by cn(ρ). We capture such transition systems in the class S (for single agent
models).
Definition 24 The class S contains precisely those models M that satisfy the following:
S1 for all states s ∈ S, if a rule λ1 , . . . , λn ⇒ λ is s-matching, then there is a state s′ ∈ S such that
Tss′ and s′ extends s by λ.
S2 for any terminating state s ∈ S, there exists a state s′ ∈ S such that s′ ≈L s and Tss′.
S3 for all states s, s′ ∈ S, Tss′ only if either (i) there is an s-matching rule λ1 , . . . , λn ⇒ λ and s′
extends s by λ; or (ii) s is a terminating state and V(s) = V(s′ ).
S4 for all rules ρ and states s, u ∈ S, ρ ∈ V(s) iff ρ ∈ V(u).
It is clear that this definition ensures that T is a serial relation for any model M ∈ S. For
any state s ∈ S, either there is at least one matching rule or there is not. In the former case,
S1 ensures that s is related to some extension of itself by T; otherwise, s is a terminating
state and is related to an identically labelled state by T.
There may, of course, be many matching rules at a given state, and for each there
must be a state u such that Tsu. Each transition may be thought of as corresponding to the
agent’s non-deterministic choice to fire one of these rule instances. ‘3φ’ may then be read
as ‘after some such choice, φ will hold.’ We can think of the agent’s reasoning as a cycle:
1. match rules against literals;
2. choose one matching rule;
3. add the consequent of that rule to the set of beliefs; repeat.
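Conditions S1–S3 and the cycle just described suggest an obvious way of generating the successors of a given belief state. The following sketch is illustrative only, reusing the rule encoding from the earlier sketch of firing strategies, with a belief state now containing both rules and literals; terminating states receive themselves as their only successor, so the generated relation is serial.

# Illustrative sketch of the transitions of definition 24 for a single agent.
# A belief state is a frozenset containing both rules and literals; literals
# are strings and a rule is (frozenset_of_antecedent_literals, consequent).

def s_matching(state):
    """Rules believed at the state whose antecedents are all believed and
    whose consequent is not yet believed (definition 21)."""
    return [r for r in state
            if isinstance(r, tuple) and isinstance(r[0], frozenset)
            and r[0] <= state and r[1] not in state]

def successors(state):
    """One transition per matching rule (S1, S3i); a terminating state has
    itself as its only successor (S2, S3ii), so the relation is serial."""
    matches = s_matching(state)
    if not matches:
        return [state]
    return [state | {cn} for (_, cn) in matches]

# A two-rule program p => q, p => r: at the next cycle the agent can come
# to believe q or come to believe r, one successor state for each choice.
program = {(frozenset({'p'}), 'q'), (frozenset({'p'}), 'r')}
s0 = frozenset({'p'}) | program
for s1 in successors(s0):
    print(sorted(l for l in s1 if isinstance(l, str)))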
By chaining diamonds (or boxes), e.g. ‘333’ we can express what properties can (and
what will) hold after so many such cycles. We can abbreviate sequences of n diamonds (or
n boxes) as 3n and 2n respectively. ‘2n φ’, for example, may be read as ‘φ is guaranteed to
hold after n cycles.’ Note that the agent’s set of beliefs grows monotonically state by state
and that the agent never revises its beliefs, even if they are internally inconsistent.
Before investigating some of the properties that these models possess that distinguish them from standard models of a modal logic, we need to introduce the concept of a
bisimulation relation.
5.4 Bisimulation
Any states s and u related by a bisimulation relation carry the same information as each
other, in the following sense: each is labelled by the same formulae and, whenever it is
possible to make a transition from the former state, there is a similar move available from
the other state. Under our interpretation of the transitions in a model, the submodels
generated by two bisimilar states each describe the same reasoning process.
It is worth mentioning why it is that S1 and S3i cannot be expressed as a biconditional. A model M ∈ S may contain a state s ∈ S labelled by a matching rule ρ and there
may well be a further state u ∈ S that extends s by cn(ρ), yet s and u need not be related by
T. S1 says that there must be some state u′ accessible from (and so extending) s, although
u′ need not be u. However, since u and u′ are label identical, any u′ -matching rule must
also be a u-matching rule. For each successor to u, there will therefore be an identically
labelled successor to u′ and vice versa and similarly for all their successors, and so on.
Figure 5.4 shows a model of an agent whose program is just the rules p ⇒ q and
p ⇒ r. At s0 , the agent believes p (as well as these rules), from which it can infer either q
or r in one cycle. It derives both beliefs within two cycles and so has the same beliefs in
s3 as in s4 . We can easily find a model M′ that captures the same reasoning process as this
model, by gluing together s3 and s4 into a single state, as shown in figure 5.5. Although M
and M′ are distinct, they are bisimilar models describing the same reasoning process.
Definition 25 (Bisimulation) Given models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, a nonempty binary relation Z ⊆ S × S′ is a bisimulation between M and M′, written Z : M ⋍ M′, when:
(Label) If Zss′ then s ≈L s′.
Figure 5.4: A model M with s3 label identical to s4 (the rules p ⇒ q and p ⇒ r label all states)
Figure 5.5: A model M′ bisimilar to M (the rules p ⇒ q and p ⇒ r label all states)
(Forth) If Zss′ and Tsu, then there is a state u′ ∈ S′ such that T′ s′ u′ and Zuu′ .
(Back) If Zss′ and T′ s′ u′ , then there exists a state u ∈ S such that Tsu and Zuu′ .
When these conditions hold, s and s′ are bisimilar, written M, s ⋍ M′ , s′ (or simply s ⋍ s′ , if the
context makes it clear which models s and s′ belong to). When there exists such a Z, we write
M ⋍ M′ .
Proposition 2 Given two models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, for all s ∈ S and s′ ∈ S′,
s ⋍ s′ implies s ! s′ .
Proof:
The proof is standard; see, for example, [BdRV02, p.67].
⊣
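For finite models, the largest bisimulation between two models can be computed as a greatest fixpoint: begin with all label-identical pairs and repeatedly discard pairs that violate the back or forth conditions. The sketch below is illustrative only and assumes the finite-model encoding used in the earlier sketches.

# Illustrative greatest-fixpoint computation of the largest bisimulation
# between two finite models M = (S, T, V) and M2 = (S2, T2, V2).

def bisimulation(M, M2):
    S, T, V = M
    S2, T2, V2 = M2
    succ = lambda rel, s: {u for (v, u) in rel if v == s}
    # start from the pairs satisfying the (Label) condition
    Z = {(s, s2) for s in S for s2 in S2 if V[s] == V2[s2]}
    changed = True
    while changed:
        changed = False
        for (s, s2) in set(Z):
            forth = all(any((u, u2) in Z for u2 in succ(T2, s2))
                        for u in succ(T, s))
            back = all(any((u, u2) in Z for u in succ(T, s))
                       for u2 in succ(T2, s2))
            if not (forth and back):
                Z.discard((s, s2))
                changed = True
    return Z        # s and s2 are bisimilar iff (s, s2) is in the result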
It is sometimes convenient to work with models in which the transition relation T forms a
tree on the states S. Such models are known as tree models.
Figure 5.6: Back & forth conditions
Definition 26 (Tree model) A tree model M = ⟨S, T, V⟩ is a model in which T forms a tree on S, rooted at a unique state s0 ∈ S. That is, T is connected and irreflexive and, for each state s ≠ s0 ∈ S, there is a unique u ∈ S (the predecessor of s) such that Tus. We make use of the terminology ‘is a child of’, ‘is a descendant of’ and ‘is an ancestor of’ with the usual meanings.
It turns out that a technique known as unravelling allows us to convert any model
into a tree model bisimilar to the original. This is extremely useful. Suppose we want
to perform an operation on a model of some kind. We can automatically work with a
tree model, which allows us to make certain structural assumptions and yet be sure that a
formula will be satisfied iff it was satisfied by the original model.
Proposition 3 Every model M has a bisimilar tree model.
Proof:
A tree model M′ can be obtained by unravelling M; the proof that M and M′ are
bisimilar is standard. See, for example, [BdRV02].
⊣
It is well-known that the converse does not hold in general. Given a model M, we can
construct a modally equivalent model N containing an infinite branch for which there can
be no bisimulation Z : M ⋍ N (suppose there is: then there will be a point on the infinite
branch in N for which the corresponding point in M has no successor, hence they cannot
be bisimilar states). However, we do have a restricted result in the converse direction:
Proposition 4 (Hennessy-Milner Theorem) A model is said to be image finite if the set ⋃_{s∈S} {u | Tsu} is finite. Given two image finite models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, for all s ∈ S and s′ ∈ S′, s ! s′ implies s ⋍ s′.
Proof:
See, for example, [BdRV02, p.69].
⊣
5.5 Properties of Models
How well do these models capture a rule-based agent’s reasoning process? Below I present
several original results that highlight some of the interesting properties of these models.
Firstly, there is a strong relationship between the way states are labelled, the modal formulae
that hold at those states and bisimulation in these models. Secondly, such models have the
belief convergence property.
Definition 27 (Modal Depth) The modal depth of a formula φ is the maximum number of modalities embedded one within another in φ. In the syntax tree corresponding to φ, it is the maximum
number of 3s and 2s on a single branch (from root to leaf).
Theorem 4 (Properties of Models in S) Assume that a model M = ⟨S, T, V⟩ is a tree model with root r. Then:
a. For all states s, s′ of depth n, |V(s)| = |V(s′)|. If V(r) is finite and s, s′ are not terminating states, then |V(s)| = n + |V(r)|.
b. If V(r) is finite, then V(s) is finite for all s ∈ S.
c. If s ≈L s′ and s, s′ are not terminating states, then s and s′ are of the same depth.
d. All siblings of terminating nodes are also terminating nodes.
e. If two children s1 and s2 of s are such that V(s1) − {λ1} = V(s2) − {λ2} then each has a child s′ such that V(s′) = V(s) ∪ {λ1, λ2}.
Proof: (a) follows immediately from S3i and S4, (b) follows immediately from (a) and (d)
follows immediately from S2 and S3ii.
(c):
V(s) = V(s′ ) = V(r) ∪ X for some set of sentences X. Hence s, s′ are at least of depth
|X|. If s, s′ are not terminating states, they are at most of depth |X|.
(e): since Tss1 and Tss2, there are s-matching rules ρ1, ρ2 with cn(ρ1) = λ1 and cn(ρ2) = λ2 (from S3i). Then ρ1 is s2-matching and, by S1, there is a state s′2 with Ts2 s′2 and V(s′2) = V(s2) ∪ {λ1} = V(s) ∪ {λ1, λ2}. Similarly, ρ2 is s1-matching, hence s1 has a child s′1 with V(s′1) = V(s1) ∪ {λ2} = V(s) ∪ {λ1, λ2}.
⊣
Lemma 2 For any M, M′ ∈ S and states s in M, s′ in M′: if s ≈L s′ and Tsu, then there is a u′ ∈ S′ such that T′s′u′ and u ≈L u′.
Proof:
If s is a terminating state then this is trivial; so assume that this is not the case. Then there is an s-matching rule ρ such that V(u) = V(s) ∪ {cn(ρ)}. Since s ≈L s′, ρ is also s′-matching, hence there is a state u′ such that T′s′u′ and V(u′) = V(s′) ∪ {cn(ρ)}; hence u ≈L u′.
⊣
Theorem 5 For any models M, M′ ∈ S and all states s in M and s′ in M′: s ≈L s′ iff s ! s′.
Proof:
Clearly, s ! s′ implies s ≈L s′. The converse, that M, s ⊩ φ iff M′, s′ ⊩ φ whenever s ≈L s′, is shown by induction on the complexity of φ. The base case is trivial, so assume that M, v ⊩ ψ iff M′, v′ ⊩ ψ for all v ∈ S, v′ ∈ S′ and ψ of lower complexity than φ whenever v ≈L v′. The cases for the Booleans are also trivial, so consider φ := 3ψ. Then s ≈L s′ and M, s ⊩ 3ψ implies that there is a state u ∈ S such that Tsu and M, u ⊩ ψ. By lemma 2, there is a state u′ ∈ S′ such that T′s′u′ and u ≈L u′. By hypothesis, M′, u′ ⊩ ψ and hence M′, s′ ⊩ 3ψ. The converse holds by a similar argument, hence s ! s′.
⊣
Theorem 6 For any models M, M′ ∈ S and all states s in M and s′ in M′: s ! s′ iff s ⋍ s′.
Proof: From proposition 2, s ⋍ s′ implies s ! s′, so it only remains to show the converse. Assume s ! s′ and that there is a u ∈ S such that Tsu; we must show that there is a state u′ ∈ S′ such that T′s′u′ and u ! u′. If s is a terminating state, this is trivial; so assume that s is non-terminating. Then, by S3, there is an s-matching rule ρ such that u extends s by cn(ρ). Since s′ ≈L s (by theorem 5), ρ must also be s′-matching and so, by S1, there is a state u′ ∈ S′ such that T′s′u′ and u′ extends s′ by cn(ρ). Hence u ≈L u′ and so, by theorem 5, u ! u′.
⊣
Corollary 5 Let M = ⟨S, T, V⟩ ∈ S. For any s, s′ ∈ S and any state u reachable from s, if s ≈L s′ then there is a state u′ reachable from s′ such that u ≈L u′.
Proof:
The proof is immediate from theorem 6.
⊣
These theorems provide a very useful relationship between the way states are labelled, the modal formulae that are satisfied in those states and the way these models capture the reasoning process of a rule-based agent. Note that the Hennessy-Milner theorem (proposition 4) was not required in the proof of theorem 6, so we are not required to assume that the models in question are image-finite. If the root r of a model is labelled by infinitely many sentences then there may well be infinitely many r-matching rules, in which case every state in the model will have infinitely many children. Nevertheless, we have s ≈L s′ iff s ! s′ iff s ⋍ s′. We can thus partition the states s ∈ S into equivalence classes [s] based on V: set s, s′ ∈ [s] whenever V(s) = V(s′). By theorem 6, each set [s] is an equivalence class: s, s′ ∈ [s] implies s and s′ satisfy the same modal formulae. We can thus transform any model M into a bisimilar model M≡, whose states are classes of identically labelled states from M.
Theorem 7 Let M = ⟨S, T, V⟩ and let M≡ = ⟨S≡, T≡, V≡⟩ be obtained from M as follows. The domain of M≡ is the set of label equivalence classes [s] obtained from M, such that u ∈ [s] whenever u ≈L s, for all u ∈ S; T≡[s][u] whenever Tsu; and V≡([s]) = V(s). Then there is a bisimulation relation Z : M ⋍ M≡.
Proof:
For any state s ∈ S, [s] ≈L s. Then, by theorems 5 and 6, [s] ! s and [s] ⋍ s. Hence, there is a relation Z with Z[s]u iff u ∈ [s] such that Z : M≡ ⋍ M.
⊣
M≡ has the useful property that [s] ≈L [u] implies that [s] = [u].
Definition 28 Let M = ⟨S, T, V⟩ ∈ S and n ∈ N. Define T^n su to hold iff there are states s0 · · · sn such that s = s0, u = sn and, for each i < n, Tsi si+1.
Now we show that models in S have the property of belief convergence.
Theorem 8 (Belief Convergence) For any model M = ⟨S, T, V⟩ ∈ S, any state r ∈ S and any n ∈ N, if T^n rs and T^n ru, then there is a state s′ reachable from s and a state u′ reachable from u such that s′ ≈L u′.
Proof:
Without loss of generality, consider a tree model M ∈ S whose root is r. Let s, u both be reachable from r in a finite number of transitions. Then there are equinumerous sets X, Y such that V(s) = V(r) ∪ X and V(u) = V(r) ∪ Y. Now consider the subbranch from r to s: for each transition Tvv′ on the branch, pick a v-matching rule ρ such that v′ extends v by cn(ρ). Enumerate the selected rules ρ for which cn(ρ) ∉ V(u) as ρ1, . . . , ρn (from r to s). It is easy to see that there must be a state u′ reachable from u, on the branch that results from firing first ρ1 and then . . . and then ρn. Thus V(u′) = V(u) ∪ {cn(ρ1), . . . , cn(ρn)} = V(u) ∪ X = V(r) ∪ Y ∪ X. By similar reasoning, there must be a state s′ reachable from s with V(s′) = V(s) ∪ Y = V(r) ∪ X ∪ Y. Hence, s′ ≈L u′.
⊣
⊣
This concludes the preliminary investigation into the properties of models of a
single rule-based agent. In section 5.7, these models are extended to incorporate multiple
agents that may communicate with one another. Before that, the following section investigates two restricted classes of models: firstly, those models in which all states have only
finitely many labels; and secondly models of an agent executing with a specified finite
program.
5.6 Finite Models and Programs
5.6.1 Finite Models
Because of our motivating interest in resource boundedness, we will sometimes want to
restrict ourselves to models in which each state is labelled by only finitely many L-formulae,
for these are the sentences representing the agent’s explicit beliefs, of which any real agent
may have only finitely many at any one time. We capture this intuition in the class of finite
memory models.
Definition 29 (Finite memory model) A model M = ⟨S, T, V⟩ ∈ S is a finite memory model iff V(s) is finite for each s ∈ S. C^fm is the set of all finite memory models in some class C.
Theorem 9 (Finite Model Property) For any finite memory model M = ⟨S, T, V⟩ ∈ S^fm, there is a model M′ containing only finitely many states and a bisimulation Z : M ⋍ M′.
Proof: For any state s ∈ S, if V(s) is finite, s may only have finitely many non-label identical
children, each of which are labelled by only finitely many formulae. Let R be the set of
rules that label each state (by S4, all states are labelled by precisely the same rules); clearly
R is finite. Then any state s ∈ S can have at most |{ cn(ρ) | ρ ∈ R}| matching rules. Thus a
finite memory model with infinitely many states must have an infinite branch, on which
only a finite initial segment is generated by matching rules, i.e. only the first n states on
the branch are non-terminating states, for some n ≤ |{cn(ρ) | ρ ∈ R}|. By S3ii, s ≈L s′
whenever Tss′ and s, s′ are terminating states. A model M′ can be obtained by selecting the
first terminating state s on each branch in M, removing all the descendants of s and adding
a transition Tss. M′ satisfies S2 and is clearly bisimilar to M. Moreover, since s occurred
on a finite initial segment of a branch in M, M′ only contains branches of finite length. It
follows that M′ only contains finitely many states.
⊣
5.6.2 Programs
The above has been a general characterization of rule-based agents that execute a fixed but
unspecified set of rules. However, we are often interested in restricting our attention to
agents that execute using a specified program. A program is just a finite set of rules. Given
a specific program R, the subclass SR of S contains precisely those agents that believe all
rules in R and no further rules.
Definition 30 (The class SR) Let R be a program (i.e. a finite set of rules). A model M = ⟨S, T, V⟩ ∈ SR iff M ∈ S and, for all states s ∈ S, the set {ρ | ρ ∈ V(s) & ρ is a rule} is R. An ML formula φ is said to be SR-satisfiable iff it is satisfied at some state s in some model M ∈ SR, written M, s ⊩R φ. M ⊩R φ (global R-satisfiability) and ⊩R φ (R-validity) are then as definition 19.
The remainder of this section surveys some properties of the class SR , including a decidability result.
Theorem 10 Let R be a program, φ be any ML formula and n = |{cn(ρ) | ρ ∈ R}|. If φ is SR-satisfiable at all, then it is satisfiable in a finite model M ∈ SR containing at most n^n states.
Proof:
Suppose φ is satisfiable at s in a model in SR; then it is satisfied by a tree model M ∈ SR whose root is s. By S3, any state u in M can have at most |R| children. Now, take any state s in M of depth n. No ρ ∈ R can be s-matching, for otherwise, some ancestor of s must have extended its parent by some λ ∉ {cn(ρ) | ρ ∈ R}; but S3 prohibits this. Then any state at depth n or greater must be a terminating state. There is then a model M′ ∈ SR forming a rooted directed acyclic graph, bisimilar to M, in which s ≈L u implies s = u (e.g. by taking equivalence classes from M w.r.t. ≈L). For any state s in M′, |{s′ | Tss′}| ≤ n and, for states u, u′ at depth n or greater, T′uu′ implies u = u′. Therefore M′ can contain at most n^n states.
⊣
In any state in a model M ∈ SR , only the labels in the sets R and
{λ1 , . . . , λn , λ | (λ1 , . . . , λn ⇒ λ) ∈ R}
can have any effect on which rules do and do not match at that state. Thus, it is only these
formulae that affect the structure that T forms on S. Labels that are not from these sets may
be removed without changing which states are accessible from which in the model. We
can combine this with standard techniques to get a notion of filtration for SR models.
Definition 31 Let sub(φ) be the set of subformulae of φ, i.e.:
sub(Bα) = {Bα}
sub(¬φ) = {¬φ} ∪ sub(φ)
sub(3φ) = {3φ} ∪ sub(φ)
sub(φ ◦ ψ) = {φ ◦ ψ} ∪ sub(φ) ∪ sub(ψ)    for ◦ ∈ {∧, ∨, →}
and let Cl(φ) be sub(φ) closed under negation.
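The sets sub(φ) and Cl(φ) are easily computed; the sketch below is illustrative only, in the tuple encoding of ML formulae used earlier, and closes under a single application of negation.

# Subformula set and negation closure of definition 31, for ML formulae
# in the tuple encoding used earlier ('B', 'not', 'and', 'or', 'imp', 'dia').

def sub(phi):
    op = phi[0]
    if op == 'B':
        return {phi}
    if op in ('not', 'dia'):
        return {phi} | sub(phi[1])
    if op in ('and', 'or', 'imp'):
        return {phi} | sub(phi[1]) | sub(phi[2])
    raise ValueError(f'unknown connective {op!r}')

def cl(phi):
    """sub(phi) together with a single negation of each subformula."""
    return sub(phi) | {('not', psi) for psi in sub(phi)}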
Definition 32 (R-filtration) Let Γ be closed under both subformulae and negation, i.e. Γ = Cl(Γ); and set
LΓ = R ∪ {α | Bα ∈ Γ} ∪ {λ1, . . . , λn, λ | (λ1, . . . , λn ⇒ λ) ∈ R}
An R-filtration of M = ⟨S, T, V⟩ through Γ is then a model MΓ = ⟨S, T, VΓ⟩ where VΓ(s) = V(s) ∩ LΓ.
Filtration here is rather different from filtration in regular modal logic. Here, we must ensure that
rules (and the beliefs needed for them to match) are not removed from states when we
filter, hence the use of LΓ .
Lemma 3 Let Γ = Cl(Γ), M = ⟨S, T, V⟩ ∈ SR and MΓ be the R-filtration of M through Γ. Then for any φ ∈ Γ and s ∈ S: M, s ⊩ φ iff MΓ, s ⊩ φ.
Proof:
By induction on the complexity of φ. If φ is an ML primitive this is trivial. So assume that, for all ψ ∈ Γ of complexity k < n and any state s ∈ S: M, s ⊩ ψ iff MΓ, s ⊩ ψ. We show this holds for all φ ∈ Γ of complexity n. The only if direction is trivial; in the if direction, consider these cases:
φ := ¬ψ. Then M, s ⊮ ψ and, by hypothesis, MΓ, s ⊮ ψ, hence MΓ, s ⊩ φ.
φ := ψ1 ∧ ψ2. Then M, s ⊩ ψ1 and M, s ⊩ ψ2. By hypothesis, MΓ, s ⊩ ψ1 and MΓ, s ⊩ ψ2, hence MΓ, s ⊩ φ.
φ := 3ψ. Then there is a s′ ∈ S such that M, s′ ⊩ ψ and Tss′. By hypothesis, MΓ, s′ ⊩ ψ and hence MΓ, s ⊩ φ.
The other Boolean cases are similar; it follows that MΓ, s ⊩ φ.
⊣
Lemma 4 Let Γ = Cl(Γ), M ∈ SR and MΓ be the R-filtration of M through Γ. Then MΓ ∈ SR .
Proof: It follows from lemma 3 that any rule ρ is s-matching in M iff it is s-matching in MΓ
and that ρ ∈ VΓ (s) iff ρ ∈ V(s). Since T is common to both M and MΓ , S1-4 are satisfied and
hence MΓ ∈ SR .
⊣
Theorem 11 (Finite Memory Property) Let R be a program and φ be any ML formula. If φ is R-satisfiable, then it is satisfiable in a finite memory model M ∈ SR^fm.
Proof:
Assume that M, s satisfies φ. Let MΓ be the R-filtration of M through Γ = Cl(φ). By lemma 3, MΓ, s ⊩ φ and, by lemma 4, MΓ ∈ SR. Since Cl(φ) and R are both finite, VΓ(s) is finite for every s ∈ S, hence MΓ ∈ SR^fm.
⊣
Theorem 12 (Decidability) Let R be a program and φ be any ML formula. Then it is decidable whether φ is SR-satisfiable.
Proof: Suppose φ is R-satisfiable; then it is satisfied at the root r of some tree model M ∈ SR. Let MΓ be the R-filtration of M through Γ = Cl(φ). By inspecting the proof of theorem 11, MΓ, r ⊩ φ, MΓ ∈ SR^fm and VΓ(r) = V(r) ∩ LΓ, with LΓ as definition 32. Let n = |{cn(ρ) | ρ ∈ R}|. By inspecting the proof of theorem 10, a model M′Γ can be obtained that has at most n^n states. Thus if φ has a model in SR at all, then one can be found by considering each model in SR with no more than n^n states whose root is labelled by a subset of LΓ. Since LΓ is bounded by the size of φ and R, we have an upper bound on the search for a model. We therefore have a terminating algorithm that will find a model in SR for φ if one exists.
⊣
Given some such program R, it is easy to axiomatize the logic of the class SR , which I do
in the following chapter, section 6.2.
5.7 Multi-Agent Systems
In this section, the account is extended to handle systems containing multiple agents that
may communicate information to one another. In particular, one agent may communicate
with another either by asking a question, or by telling the latter agent something. Agents
handle communication by firing rules, much in the way that they handle the deduction of
new beliefs.
5.7.1 Communication Between Agents
We consider a set of agents A that share a common internal language and assume a unique
name for each agent. As in the single agent case, each agent has a working memory which
holds formulae of its internal language. Agent i believes a formula α when α is a rule in its
program or α is held in i’s working memory. The ascription language ML now contains
a belief operator Bi for each agent i ∈ A, with ‘Bi α’ read as ‘agent i believes α.’ Agents
communicate either by asking or telling each other things. Only definite beliefs may be
communicated between agents, i.e. an agent cannot tell another what rules it believes.
Similarly, an agent cannot ask another agent about the rules it believes. This is because, as
before, we treat an agent’s rules as constituting the agent’s program. Communication thus
always relates to some literal λ.
The mechanism whereby information is transmitted between agents is very simple. We assume that there will always be a reliable and instantaneous communication
channel open between any two agents. Whenever an agent tells another that something is
the case, the latter is aware that it has been told this information. Similarly, when an agent
asks another for some information, the latter always receives this request instantaneously.
That is not to say that communicated information is always believed, or that questions are
always answered. When an agent i tells agent j that λ, j becomes aware that:
i. it has received a message;
ii. the sender was agent i; and
iii. the content of the message is λ.
We introduce two modalities to the agents' internal language L, the ask modality ?ij and the tell modality ⊢ij.9 If 'i' and 'j' denote agents and λ is a literal, then ?ij λ and ⊢ij λ are well-formed formulae of L, read 'agent i has asked agent j that λ' and 'agent i has told agent j that λ' respectively. A formula of the form ?ij λ is called an ask and ⊢ij λ a tell. In the ascription language ML, an ask or tell may occur as a belief in its own right, i.e. Bi ?ji λ or Bi ⊢ji λ (agent i believes that agent j has asked/told her that λ). Agents do not become confused or forgetful about what they have told one another, so that Bi ⊢ji λ implies that j has indeed told i that λ. Similarly, agents are always aware of what they have been asked and told.
Asks and tells may also occur in rules, although the position in which they may
occur depends on which agent’s program the rule belongs to. Agent i may have rules in
which an ask or tell appears as the consequent:
λ1 , . . . , λn ⇒ ? ij λ
λ1 , . . . , λn ⇒ ⊢ ij λ
Agent i interprets these rules as ‘if λ1 , . . . , λn , then ask/tell agent j that λ.’ On the other
hand, agent i may have rules in which an expression ? ji λ or ⊢ ji λ appears as an antecedent,
i.e. for any other agent j, the following are allowed in agent i’s program:
λ1 , . . . , ?ji λ, . . . , λn ⇒ λ′
λ1 , . . . , ⊢ji λ, . . . , λn ⇒ λ′
For example, agent i using the rule ⊢ ji λ ⇒ λ says that i trusts j about λ. Note that in
all of these cases, agent i must be distinct from agent j, for our agents neither ask nor tell
themselves anything. Apart from the rules of the types just listed, no rule containing an
ask or tell may be used by agent i, for an agent has no control over what another agent asks
or tells the others.
5.7.2 Communication and deduction rules
When a rule has either an ask or a tell as its consequent, it is called a communication rule, or
C-rule for short. These are the rules that bring the system’s communicative mechanism into
play, whose job it is to ensure that, if i believes ⊢ ij λ or ? ij λ then j does too and vice versa.
9. These symbols are intended to be reminiscent of the interrogative and assertoric force symbols '?' and '⊢' used by Frege; the latter should not be confused with the use of '⊢' for theoremhood.
Both the sender and recipient of a message are always aware that a message has been sent,
who the sender and recipient are and what the content of the message was. Thus, when
an instance of a C-rule is fired, two new beliefs are produced in the system, one belonging
to the agent that fired the rule and one to the recipient of the ask or tell (agents i and j
respectively in the example above). Both i and j then share the relevant belief, e.g. that
⊢ ij λ. Agent i can then avoid sending the same message to j again and, if j trusts i, j will
accept that λ.
One use that agents can make of C-rules is to cooperate with other agents. If i is
disposed to cooperate with j on the topic of λ, the rule
? ji λ, λ ⇒ ⊢ ij λ
will appear in i’s program Ri . By way of example let us return to the business rules used
in section 5.3. Suppose agent a keeps track of how much customers have spent and agent
b is responsible for issuing discounts. In our propositional language, agent a uses all
propositional instances of the rules:
a1: ? ba PremiumCustomer(x), Spending(x, > 1000) ⇒ ⊢ ab PremiumCustomer(x)
a2: ? ba PremiumCustomer(x), ¬Spending(x, > 1000) ⇒ ⊢ ab ¬PremiumCustomer(x)
Then b issues discounts using all propositional instances of the rule
b1: PremiumCustomer(x), Product(y) ⇒ Discount(x, y, 10%)
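As a rough illustration of how such a program might be represented in an implementation, the following sketch encodes one propositional instance of a1, a2 and b1 as data. The Rule class, the string encoding of asks and tells and the instance names c1 and p1 are assumptions made purely for the example; they are not part of the formal framework.

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class Rule:
        antecedents: Tuple[str, ...]   # literals, asks or tells
        consequent: str                # a literal, an ask or a tell

    # Asks and tells are encoded as prefixed strings: 'ask:b:a:PremiumCustomer(c1)' reads
    # 'agent b has asked agent a whether PremiumCustomer(c1)', and similarly for 'tell:...'.
    R_a = {  # one propositional instance of a1 and a2, for customer c1
        Rule(('ask:b:a:PremiumCustomer(c1)', 'Spending(c1,>1000)'),
             'tell:a:b:PremiumCustomer(c1)'),
        Rule(('ask:b:a:PremiumCustomer(c1)', '~Spending(c1,>1000)'),
             'tell:a:b:~PremiumCustomer(c1)'),
    }
    R_b = {  # one propositional instance of b1, for customer c1 and product p1
        Rule(('PremiumCustomer(c1)', 'Product(p1)'),
             'Discount(c1,p1,10%)'),
    }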
Rules whose consequent is not an ask or a tell are known as deduction rules, or D-rules. These include rules with asks and tells in the antecedent, as well as rules containing
neither an ask nor a tell. The trust scheme for agent i:
⊢ ji λ ⇒ λ
saying that i trusts agent j is thus a D-rule (here, i and j are agent names, whereas λ
is a metavariable, standing for any L formula). In general, an arbitrary rule is written
α1 , . . . , αn ⇒ α, where α and each αi≤n is either a literal, an ask or a tell.
5.7.3 Models of Multi-Agent Systems
There are two approaches to modelling rule-based agents that communicate using asks
and tells. The first is to provide each agent i ∈ A with its own transition relation Ti , which
behaves much as T did in the case of a single agent. The ascription language ML then contains modalities ⟨i⟩ for each i ∈ A, such that s ⊨ ⟨i⟩φ iff there is a u such that Ti su and u ⊨ φ. When modelling a multi-agent system, the interest is usually in how the system behaves as a whole, rather than in considering an agent individually. To this end, a single modality 3 describing the transitions that the system as a whole makes seems more appropriate. I thus devote the remainder of the chapter to developing the latter approach.
Agents have a common internal language L, wffs of which are either literals,
C-rules or D-rules as described in the previous section. ML is as before but with a family
{Bi }i∈A of modalities in place of B.
Definition 33 (Multi-agent models) Given a group of n agents A, a multi-agent model M is an (n + 3)-tuple
⟨S, A, T, {Vi}i∈A⟩
where S is a set of states, T is the transition relation and each Vi : S → 2^L is the labelling function for agent i ∈ A, assigning a set of L-formulae to each state.
The support relation ⊨ is defined as in section 5.3, with the base clause for Bi α (for i ∈ A, α ∈ L) replaced by:
M, s ⊨ Bi α iff α ∈ Vi(s)
In the case of a single agent, it is clear that each transition should correspond to
an atomic inference, resulting in the change from one belief state to another. However,
in the case of multiple agents, it is not clear what a transition of the system as a whole
should correspond to. There are several options, which are made evident by the following
scenarios.
Scenario 1
The agents in the system operate in rounds, taking turns to derive new
formulae, such that when agent i has inferred a new belief, it cannot infer another new
belief until every other agent has taken its turn. In this case, it may be appropriate to take
a transition to be the change in the state of the system from one round of inference to the
next. Individual acts of inference then do not affect the system until each agent has taken
its turn. Along these lines, the order in which agents take their turns is abstracted from
and the agents are modelled as concurrent reasoners. In this approach, it is often helpful
to allow agents a skip move, on which no formula is inferred.
Scenario 2
The agents in the system are independent of one another, although all run
to a global clock. An agent may infer a new formula whenever it has a matching rule,
regardless of what the other agents in the system are doing. In such situations, it is more
appropriate to take each agent’s individual acts of inference to be a transition of the system
as a whole. When agent i infers a new formula from its beliefs, the system moves into a
new state. To keep things uniform, a transition corresponds to exactly one agent inferring
a new belief (or communicating with another agent).
I should stress that neither of these options is the correct (or incorrect) way to
model a system of agents; rather, what counts as a good model will depend on the application. A formal model of a concurrent communicating multi-agent system along the lines
of scenario 1 is given in [AJL06a]. Purely for interest in the unexplored, I discuss models
corresponding to scenario 2 here. Models of such agents are presented in the remainder
of this chapter and an axiomatization and complexity analysis is given in the following
chapter.
To describe these models, the definitions of a matching rule and a terminating state
need to be extended from those given above to fit the multi-agent scenario. This is done as
follows.
Definition 34 (Matching rule) Let s ∈ S and i ∈ A. A rule ρ = α1 , . . . , αn ⇒ α is i-s-matching
iff:
i. ρ ∈ Vi (s);
ii. each α1 , . . . , αn ∈ Vi (s); but
iii. α ∉ Vi(s).
Definition 35 (Terminating state) A state s is said to be a terminating state in a model M iff, for
all agents i ∈ A, no rule ρ is i-s-matching.
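Definitions 34 and 35 translate directly into checks on the labelling functions. The sketch below assumes that a state is represented as a mapping from agent names to the set of formulae in that agent's working memory (literals, asks, tells and Rule objects as in the earlier sketch), and that rules_of maps each agent to its program; these representational choices are assumptions made for the example.

    def is_matching(rule, agent, state):
        # Definition 34: the agent believes the rule and every antecedent at this state,
        # but does not yet believe the consequent.
        beliefs = state[agent]
        return (rule in beliefs
                and all(a in beliefs for a in rule.antecedents)
                and rule.consequent not in beliefs)

    def is_terminating(state, agents, rules_of):
        # Definition 35: no rule is i-s-matching for any agent i.
        return not any(is_matching(r, i, state)
                       for i in agents for r in rules_of[i])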
As remarked above, firing an instance of a C-rule will add two new formulae to
the next state, one to the working memory of the agent firing the rule instance and one
to the working memory of the recipient of the ask/tell. The diagram below shows such
a transition from state s to state u. Agent a uses the rule p ⇒ ⊢ ab p. The row labelled a
represents agent a’s working memory (and similarly for the row labelled b).
          s                u
   a:     p                p, ⊢ab p
   b:     –                ⊢ab p

(the transition T takes s to u)
In state s agent a believes p, which it can match with its rule p ⇒ ⊢ ab p. In the transition
to u, agent a tells b that p and so ⊢ ab p appears in the working memory of both agents at u.
We thus need to update our definition of when one state extends another to account for the
difference between D-rules and C-rules.
Definition 36 (D-extension) Let i be an agent, λ a literal and s, u ∈ S. Then u is a D-i-extension
of s when:
i. Vi(u) = Vi(s) ∪ {λ};
ii. for all agents j ≠ i, Vj(u) = Vj(s).
In such a D-i-extension, agent i fires a D-rule instance at s and gains the belief that λ at u, whereas
the beliefs of all other agents stay the same.
Definition 37 (C-extension) Let i, j be distinct agents, s, u ∈ S and ◦ij λ be either ⊢ij λ or ?ij λ. Then u is a C-i-extension of s by ◦ij λ iff:
i. Vi(u) = Vi(s) ∪ {◦ij λ};
ii. Vj(u) = Vj(s) ∪ {◦ij λ}; and
iii. for all agents k ≠ i, k ≠ j, Vk(u) = Vk(s).
Here, agent i fires an instance of a C-rule at s and both i and j gain the belief ◦ij λ at u, but the beliefs of all other agents stay the same.
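Read together, definitions 36 and 37 give a recipe for generating the successors of a state. The sketch below assumes the same representation as in the previous sketches, with asks and tells encoded as strings of the form 'ask:i:j:lam' or 'tell:i:j:lam'; the encoding and helper names are assumptions made for the example, not the thesis's notation.

    def successors(state, agents, rules_of):
        def matching(rule, i):                      # definition 34, as in the previous sketch
            b = state[i]
            return (rule in b and all(a in b for a in rule.antecedents)
                    and rule.consequent not in b)

        result = []
        for i in agents:
            for rule in rules_of[i]:
                if not matching(rule, i):
                    continue
                c = rule.consequent
                new = {a: set(fs) for a, fs in state.items()}
                new[i].add(c)                        # the firing agent gains the new belief
                if c.split(':')[0] in ('ask', 'tell'):
                    recipient = c.split(':')[2]      # C-i-extension: the recipient gains it too
                    new[recipient].add(c)
                result.append(new)
        if not result:                               # terminating state: a transition to an identical state
            result.append({a: set(fs) for a, fs in state.items()})
        return result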
There are thus two types of extension to a state, D-extensions and C-extensions, corresponding to an instance of a D-rule or C-rule being fired. The former concerns the deduction
of a new belief for some agent, the latter a communication from one agent to another. If
agent i has a matching D-rule, there should be a transition to a D-i-extension of the current
state. On the other hand, if agent i has a matching C-rule, there should be a transition to a
matching C-i-extension of the state. Models in which transitions are restricted in this way,
in addition to the restrictions on which rules an agent may use discussed above, comprise
the class M (for multi-agent models).
Definition 38 The class M contains precisely those models M = ⟨S, A, T, {Vi}i∈A⟩ that satisfy the
following conditions:
M1 For all states s ∈ S and agents i ∈ A, if there is an i-s-matching rule ρ then there is a state
u ∈ S such that Tsu and u is the C-i-extension (D-i-extension) of s by cn(ρ) whenever ρ is a
C-rule (D-rule).
M2 For all states s, u ∈ S and all agents i, Tsu only if either (i) there is an i-s-matching rule ρ and
u is the C-i-extension (D-i-extension) of s by cn(ρ) whenever ρ is a C-rule (D-rule), or else
(ii) s is a terminating state and, for all i ∈ A, Vi (s) = Vi (u).
M3 For any terminating state s, there exists a state u ∈ S such that Tsu and, for all i ∈ A,
Vi (u) = Vi (s).
M4 For all rules ρ, i ∈ A and s, u ∈ S, ρ ∈ Vi (s) iff ρ ∈ Vi (u) and ρ ∈ Vi (s) only if:
i. for any ask or tell α in the antecedent of ρ, α’s second subscript is i and
ii. for any ask or tell α in the consequent of ρ, α’s first subscript is i.
M5 For any state s ∈ S, literal λ and distinct agents i, j ∈ A,
i. ⊢ ij λ ∈ V j (s) whenever ⊢ ij λ ∈ Vi (s) and
ii. ? ij λ ∈ V j (s) whenever ? ij λ ∈ Vi (s).
M1-3 correspond to S1-3; M4 ensures both that the set of rules that an agent believes
does not change throughout a model, and that agents do not believe rules containing
inappropriate communication modalities (e.g. agent i using λ ⇒ ⊢jk λ when j ≠ i). M5
ensures that communicating agents are aware of relevant aspects of the communication:
their force (ask or tell), the participants and the content.
In the case of single agent models S, the way in which a state s is labelled automatically determines the labels on states accessible from s. This led to the result that, for any two states s, u (possibly in different models), s ≡L u iff s ↔ u iff s ≃ u. In multi-agent models, the beliefs of agent i at a state do not determine its beliefs at all accessible states, for the agent might receive information from other agents independently of its beliefs. However, fixing Vi(s) for each i ∈ A does fix the labels on all states accessible from s. Because of this, theorems corresponding to those in the previous sections apply to the class M, as follows.
Theorem 13 For any models M, M′ ∈ M and all states s in M and s′ in M′: s ≡L s′ iff s ↔ s′ iff s ≃ s′.
Proof:
The proof is almost identical to that of theorems 5 and 6 and so I omit it here.
⊣
The other important result from section 5.5 was belief convergence (theorem 8).
The proof rested on the idea that, if a rule ρ whose consequent is λ is s-matching, then ρ will continue to be matching at all states accessible from s unless λ is derived from some rule (which may or may not be ρ). The only amendment that needs to be made in the case
of multi-agent models is that a matching rule ρ continues to be matching for an agent i
until its consequent is either derived or received in communication from a trusted source
(i.e. another agent tells i that cn(ρ) and i believes it). But then ρ remains a matching rule
until its consequent is believed, just as in single agent models. Thus:
Theorem 14 (Multi-Agent Belief Convergence) For any model M = ⟨A, S, T, {Vi}i∈A⟩ ∈ M, any state r ∈ S and any n ∈ N, if T^n rs and T^n ru, then there is a state s′ reachable from s and a state u′ reachable from u such that s′ ≡L u′.
Proof:
Again, the proof is very similar to that of theorem 8, modulo minor allowances for communication between agents. I omit the full proof here.
⊣
5.7.4 Programs
Although our agents share a common language, each has its own unique program. As we
mentioned above, there are restrictions on which rules may appear in an agent’s program,
summarized here. Corresponding to M4, a rule ρ may appear in Ri, for any agent i ∈ A,
only if:
1. for any ask or tell α in the antecedent of ρ, α's second argument is i; and
2. for any ask or tell α in the consequent of ρ, α’s first argument is i
A program for an agent is then a set of rules that satisfy these conditions. Given a program
Ri for each agent i ∈ A, we define the program set
R = {Ri }i∈A
Just as in the single agent case above, we want to pick out those models in which agents
believe all the rules in their program and no further rules. Such models comprise the class
MR .
Definition 39 (The class MR ) For each i ∈ A, Ri is a program as above. Set R = {Ri }i∈A . A
model M ∈ MR iff M ∈ M and, for every state s in M, rule ρ and agent i ∈ A, ρ ∈ Vi (s) iff ρ ∈ Ri .
A sentence φ is MR-satisfiable iff φ is satisfiable at a state s in a model M ∈ MR, written M, s ⊨R φ.
Global MR-satisfiability M ⊨R φ and MR-validity ⊨R φ are then as in definition 19.
Note that when A is a singleton {a}, the definition of the class M{Ra } matches that of the
single agent class SR defined above (assume we drop the indices on R and B). This is
because, for any literal λ, both ⊢ aa λ and ? aa λ are ill-formed. Agent a may then only use
D-rules, states have only D-extensions and so on. Thus SR is a special case of M{Ra } when
A is a singleton {a}.
In fact, each class MR shares many of the logical properties of the classes SR . For
any MR -satisfiable sentence φ, a finite model for φ can be found whose upper bound on
the number of states is equal to that in the case of single agent models.
Definition 40 (Finite memory model) A model M ∈ M over A is a finite memory model iff
Vi(s) is finite for each s ∈ S and i ∈ A. Again, C^fm is the set of all finite memory models in some
class C.
Theorem 15 (Finite Model Property) Let R = {Ri }i∈A be a program set for m agents A, φ be
any ML formula and n = |{ cn(ρ) | ρ ∈ Ri , i ∈ A}|. If φ is MR -satisfiable at all, then it is satisfiable
in a finite model M containing at most n^n states.
Proof: Assume φ has a model M ∈ MR . The set {ρ | ρ ∈ Ri , i ∈ A} contains all rules available
to any agent in A; thus the number of matching rules at any state s in M cannot exceed
|{ cn(ρ) | ρ ∈ Ri , i ∈ A}|. By the argument used in the proof of theorem 10, there is a rooted
model M′ of φ in which no state is more than n − 1 transitions from the root. Hence, M′
can contain at most n^n states.
⊣
As with models of single agents, we can filter a model through a set of formulae Γ
and a program set R to obtain a finite memory model—this is called an R-filtration through
Γ. Such a filtration removes all labels from all states unless they are rules appearing in a
program in R, or a literal/ask/tell appearing in any of those rules, or a sentence α such that Bi α is a subformula of a formula in Γ.
Definition 41 (R-filtration) For a set of agents A, let R be a program set over A, M = ⟨A, S, T, {Vi}i∈A⟩ ∈ MR and Γ be a set of ML formulae closed under both negation and subformulae, i.e. Γ = Cl(Γ). Then define LΓ as:
LΓ = {α | Bi α ∈ Γ, i ∈ A} ∪ ⋃i∈A Ri ∪ {α1 , . . . , αn , α | (α1 , . . . , αn ⇒ α) ∈ Ri , i ∈ A}
The R-filtration of M through Γ is then a model MΓ = ⟨A, S, T, {ViΓ}i∈A⟩ where ViΓ(s) = Vi(s) ∩ LΓ for each s ∈ S.
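A small sketch of the filtration: it assumes that the belief subformulae of Γ have already been collected into the set {α | Bi α ∈ Γ for some i}, that programs maps each agent to its program of Rule objects, and that the model's labelling is given as a mapping from (state, agent) pairs to sets of formulae. All of these representational choices are assumptions made for the example.

    def L_gamma(gamma_beliefs, programs):
        # LΓ of definition 41: the believed sentences occurring in Γ, plus every rule in any
        # agent's program together with its antecedents and its consequent.
        lg = set(gamma_beliefs)
        for Ri in programs.values():
            for rule in Ri:
                lg.add(rule)
                lg.update(rule.antecedents)
                lg.add(rule.consequent)
        return lg

    def filtrate(labelling, gamma_beliefs, programs):
        # The R-filtration restricts each agent's label at each state to LΓ.
        lg = L_gamma(gamma_beliefs, programs)
        return {(s, i): beliefs & lg for (s, i), beliefs in labelling.items()}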
Theorem 16 (Finite Memory Property) Given a set A of agents, let R be a program set over A and φ be any ML formula. If φ is MR-satisfiable, then it is satisfiable in a finite memory model M ∈ MR^fm.
Proof:
Assume that a state s in a model M ∈ MR satisfies φ. Set Γ = Cl(φ) and let MΓ be the R-filtration of M through Γ. By a similar argument to the proof of lemmas 3 and 4, it follows that MΓ, s ⊨ φ and MΓ ∈ MR. Since Cl(φ) and all Ri ∈ R are finite, so is LΓ. Since each ViΓ(s) ⊆ LΓ, it follows that ViΓ(s) is finite, for every s ∈ S and i ∈ A.
⊣
Theorem 17 (Decidability) Given a set A of agents, let R be a program set over A and φ be any
ML formula. Then it is decidable whether φ is MR -satisfiable.
Proof:
Assume φ is satisfied at the root of a tree model M ∈ MR and let MΓ be the R-
filtration of M through Γ = Cl(φ). Again, set n = |{cn(ρ) | ρ ∈ Ri, i ∈ A}| and let LΓ be as in definition 41. Using theorem 10, a model M′Γ can be obtained from MΓ that has at most n^n states, each of which is labelled by a subset of LΓ. Since both n and LΓ are bounded in size
by φ and R, we have a terminating algorithm that will produce a model in MR for φ if one
exists.
⊣
This concludes the investigation into models of multiple rule-based agents in the
class M. The next chapter presents logics corresponding to the classes SR and MR and
discusses various proof techniques for these logics.
Chapter 6
Proof Systems
6.1 Introduction
This chapter is concerned with axiomatizing the logics corresponding to the classes of
models presented in the preceding chapter. Axiomatization of a modal logic is frequently
confined to considerations of frame definability. This is because the interesting notion of
validity in modal logic arises at the level of frames. Axioms characterize a frame by forcing
the associated accessibility relation to have certain properties; well-known examples are
the T axiom 2φ → φ and the S4 axiom 2φ → 22φ, characterizing reflexive and transitive
accessibility relations, respectively.
It might appear, then, that axiomatizing a particular kind of epistemic logic
amounts to stating the conditions on its accessibility relation and using the axioms that
characterize those conditions. However, the task here is not so simple, for frames alone
do not give rise to the class of models that we are interested in. The extra conditions
we enforce on models to make them models of rule-firing agents depend on the labelling
function V as well as on the transition (i.e. accessibility) relation T. If M is a model of a
system of rule-based agents (i.e. M ∈ S or M ∈ M), then certain ways of labelling a state s
in M ensure that there exists a state accessible from s in M that must then be labelled in a
certain way. In the other direction, if two states of M are related by T, then the way one
is labelled greatly restricts the way the other must be labelled. So, unlike modal logics in
general, T and V interact here. As a consequence, merely characterizing a frame will not be
a sufficient axiomatization of the logics we are interested in. We have to somehow capture
the interaction between transitions and labelling as well.
As remarked above, our interest is usually in agents that have a particular fixed
program. Given a program R (or a program set R), it is relatively easy to axiomatize the logic
corresponding to SR (or to MR ). In the remainder of this section, I consider the axioms that
are required for such an axiomatization, regardless of the corresponding agent’s program
R. In the following section, we will see that the task becomes easier once R has been fixed.
To begin with, the logic on top of sentences of the form Bα is a form of the modal logic K
introduced in chapter 1, hence all instances of propositional tautologies over ML are valid;
2 distributes over →; and our inference rules are standard modus ponens and necessitation:
PL All classical propositional tautologies over ML
K 2(φ → ψ) → (2φ → 2ψ)
MP from φ and φ → ψ, infer ψ
N from φ, infer 2φ
Here, ‘φ’ and ‘ψ’ are metavariables ranging over ML.
The remaining axiom schemes capture conditions particular to the class S. In
these schemes, ‘α’, ‘λ’ and ‘ρ’ (with or without subscripts) are metavariables ranging over
all formulae, literals and rules of L respectively. Agents are monotonic reasoners and so
beliefs are not revised or forgotten, and T is serial:
A1 Bα → 2Bα
A2 3⊤
Next, all models in S have the rule matching property, expressible as:
B(λ1 , . . . , λn ⇒ λ) ∧ Bλ1 ∧ · · · ∧ Bλn ∧ ¬Bλ → 3Bλ
However, A1 and A2 entail that the condition ¬Bλ is not required here. We thus express
rule matching as:
A3 B(λ1 , . . . , λn ⇒ λ) ∧ Bλ1 ∧ · · · ∧ Bλn → 3Bλ
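By way of illustration (the rule p, q ⇒ r is hypothetical and belongs to no program discussed here), the A3 instance for that rule is
B(p, q ⇒ r) ∧ Bp ∧ Bq → 3Br
so any state at which the rule and both its antecedents are believed has a successor at which r is believed.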
The next set of axioms reflect restrictions on transitions. Successor states extend their
predecessors by a single new belief, so for any pair α, β of beliefs with α ≠ β, either α or β
must have been believed at the previous state:
A4 3(Bα ∧ Bβ) → Bα ∨ Bβ
These axioms are valid in any class SR , i.e. they are S-valid simpliciter. At present,
I do not know whether it is possible to provide a complete axiomatization of the logic of
S itself. The axiom schemes just given are unlikely to result in a complete proof theory
with respect to S. The canonical model obtained while attempting to show that they are
complete is not a member of S, for the axioms do not rule out models containing transitions
such as the following:
p
•
pq
•
p⇒q
p •
• pq
• p
In the left-hand case, a state extends a terminating state by q, which is not permitted in S.
The problem is not restricted to terminating states; it is, generally, that the axioms do not
rule out Tsu when u extends s by λ, but there is no s-matching rule whose consequent is λ.
Although A4 rules out the addition of more than one new belief in successor states, a finite
axiom scheme cannot be given to rule out the addition of ‘rogue’ new beliefs that do not
correspond to a rule being fired. To do so, axioms would have to force the existence of a
matching rule with consequent λ whenever 3Bλ ∧ ¬Bλ holds. But this cannot be expressed
without quantifying over all rules in the language or by using an infinite disjunction,
neither of which are allowed in ML.
The right-hand case contains a transition from a non-terminating state that does
not result in a new belief. Again, this is not permitted in S, but cannot be ruled out by a finite
axiom scheme (as in the previous case, to do so would require existential quantification
or infinite disjunction over rule formulae). Both of these snags are easily remedied once
a program R has been set, since programs are finite by definition. It is then easy to
supplement these axiom schemes to give a proof theory over a program R whose canonical
model is in the class SR . This is discussed in the following section.
6.2 Logic for a program R
In this section, given a program R, a logic corresponding to the class SR will be presented.
To begin with, agents believe all rules ρ ∈ R at all states and no further rules at any state;
hence Bρ is valid for ρ ∈ R and ¬Bρ is valid for ρ ∉ R. Since R contains a finite number
of rules, finite disjunctions over R can be used as ersatz existential quantifiers and so the
problems encountered in the previous section can be avoided. The first problem that was
encountered there is ruled out by demanding that all beliefs present in any successor state
are either present in the current state, or else the consequent of a currently matching rule.
Enumerate the rules ρ1, . . . , ρn ∈ R whose consequent is λ. Then, if λ is a new belief in a successor state, either all antecedents of ρ1, or all antecedents of ρ2, or . . . or all antecedents of ρn are believed. This is expressible using ⋁-notation as:
⋁(λ1 ,...,λn ⇒α)∈R (Bλ1 ∧ · · · ∧ Bλn)
(note that, in cases in which there is no such rule ρ ∈ R, ⋁{} is defined as ⊥). Then our condition can be captured as:
A7 3Bα → (Bα ∨ ⋁(λ1 ,...,λn ⇒α)∈R (Bλ1 ∧ · · · ∧ Bλn))
The second snag met in the previous section requires an axiom saying: all successors of a non-terminating state extend that state by at least one new belief (which, in
tandem with A4, will guarantee that there will be exactly one new belief). Suppose that
ρ1 , . . . , ρn ∈ R all match at the current state and that no others in R match. Then every successor must either extend the current state by cn(ρ1 ) or by . . . or by cn(ρn ). In other words,
the disjunction B cn(ρ1) ∨ · · · ∨ B cn(ρn) holds in all successor states, so 2(B cn(ρ1) ∨ · · · ∨ B cn(ρn)) must hold in the current state. In expressing the antecedent condition, the abbreviation
match(λ1, . . . , λn ⇒ λ) =df Bλ1 ∧ · · · ∧ Bλn ∧ ¬Bλ
is helpful. Then, for any n ≥ 1:
A8 matchρ1 ∧ · · · ∧ matchρn ∧ ⋀ρ∈R−{ρ1 ,...,ρn } ¬matchρ → 2(B cn(ρ1) ∨ · · · ∨ B cn(ρn))
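By way of a worked illustration (the two-rule program is hypothetical, chosen only to make the schemes concrete), let R = {p ⇒ q, q ⇒ r}. Since q is the consequent of p ⇒ q alone, the A7 instance for q is 3Bq → (Bq ∨ Bp); similarly, the instance for r is 3Br → (Br ∨ Bq); and for a sentence such as p that is the consequent of no rule in R, the empty disjunction gives 3Bp → Bp. For A8, if p ⇒ q is the only rule matching at a state, the corresponding instance is match(p ⇒ q) ∧ ¬match(q ⇒ r) → 2Bq.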
The full axiom system is given in figure 6.1. Call the logic resulting from all
instances of these axiom schemes ΛR . A derivation in ΛR is defined in a standard way,
relative to R. An ML-formula φ is derivable from a set of ML-formulae Γ, written Γ ⊢R φ,
iff there is a sequence of formulae φ1 , . . . , φn where φn = φ and each φi≤n is either an instance
of an axiom scheme, or a member of Γ, or is obtained from the preceding formulae by MP
or N. Suppose an agent’s program R contains the rules ρ1 , . . . , ρn . This agent is guaranteed
to reach a state in which it believes α in k steps, starting from a state where it believes λ1 ,
. . . , λm , iff
Bρ1 ∧ . . . ∧ Bρn ∧ Bλ1 ∧ . . . ∧ Bλm → 2^k Bα
PL all classical propositional tautologies
K 2(φ → ψ) → (2φ → 2ψ)
A1 Bα → 2Bα
A2 3⊤
A3 B(λ1 , . . . , λn ⇒ λ) ∧ Bλ1 ∧ · · · ∧ Bλn → 3Bλ
A4 3(Bα ∧ Bβ) → Bα ∨ Bβ      (α ≠ β)
A5 Bρ      (ρ ∈ R)
A6 ¬Bρ      (ρ ∉ R)
A7 3Bα → (Bα ∨ ⋁(λ1 ,...,λn ⇒α)∈R (Bλ1 ∧ · · · ∧ Bλn))
A8 matchρ1 ∧ · · · ∧ matchρn ∧ ⋀ρ∈R−{ρ1 ,...,ρn } ¬matchρ → 2(B cn(ρ1) ∨ · · · ∨ B cn(ρn))      (n ≥ 1)
MP from φ and φ → ψ, infer ψ
N from φ, infer 2φ

Figure 6.1: Axiom schemes for ΛR
is derivable in ΛR (2^k is an abbreviation for a sequence of k boxes).
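By way of illustration with the same hypothetical two-rule program, if R = {p ⇒ q, q ⇒ r} then, from a state at which p is believed, only p ⇒ q can fire first and only q ⇒ r can fire second, so the agent is guaranteed to believe r within two steps; accordingly,
B(p ⇒ q) ∧ B(q ⇒ r) ∧ Bp → 2^2 Br
should be derivable in ΛR.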
The rest of this section establishes that ΛR is indeed the logic of the class SR : that
is, the proof system given in figure 6.1 is sound and complete with respect to the class SR .
Assume a fixed program R and that all meta-properties are given with respect to the class
SR for the remainder of this section. The semantic sequent Γ ⊨R φ is taken as an abbreviation of: for all models M ∈ SR, M ⊨ Γ implies M ⊨ φ (the fact that R is a program of a single agent allows us to avoid using a further 'S' parameter).
Theorem 18 (Soundness) For any ML-formula φ and set of ML-formulae Γ, if Γ ⊢R φ then
Γ ⊨R φ.
Proof: Given the discussion in the previous section, soundness is more or less immediate.
The validity of instances of PL and K is standard and, given S1–S4, the validity of A1–A6
is trivial. Were A7 not valid, there would be a state s supporting 3Bλ ∧ ¬Bλ where λ is not
the consequent of an s-matching rule; but this is ruled out by S3i. Were A8 not valid,
there would be a transition from a non-terminating state to a state that does not extend the
former; but again, this is ruled out by S3i.
⊣
To demonstrate the completeness of the system shown in figure 6.1, the strategy
is to construct a canonical model for ΛR and show that it is in the class SR . First, a
few lemmas need to be prepared, most of which are standard. The canonical model is
constructed syntactically from maximal consistent sets:
Definition 42 (Maximal Consistent Set) A set of ML-formulae Γ is a maximal consistent set with respect to a logic Λ (a Λ-MCS) iff Γ is Λ-consistent, but any proper superset Γ′ ⊃ Γ over ML is Λ-inconsistent.
Intuitively, any Λ-MCS contains as many ML-formulae as possible, so that adding one more
formula from the language would cause an inconsistency. We need to rely on the following
result:
Lemma 5 (Lindenbaum lemma) Any Λ-consistent set Γ over ML can be extended to a Λ-MCS Γ+.
Proof:
Enumerate all formulae φ1, . . . of ML and set Γ0 = Γ. Given a set Γm, let Γm+1 = Γm ∪ {φm} if φm is Λ-consistent with Γm; otherwise, Γm+1 = Γm. Finally, set Γ+ = ⋃m∈N Γm. Γ+ is then both maximal (in ML) and Λ-consistent. For suppose it were not Λ-consistent; then some Γi must be Λ-inconsistent; but then Γi−1 must be Λ-inconsistent, and so must each Γj<i. But this contradicts our assumption that Γ is Λ-consistent. Suppose Γ+ is not maximal, i.e. some proper superset Γ′ ⊃ Γ+ over ML is Λ-consistent, i.e. Γ′ ⊇ Γ+ ∪ {φn} for some n ∈ N. Since φn is Λ-consistent with Γ+, it must have been added to form Γn+1 ⊆ Γ+. Hence there can be no such set Γ′.
⊣
A canonical model MR = ⟨SR, TR, VR⟩ for ΛR is built from ΛR-MCSs as follows. First, SR = {∆ | ∆ is a ΛR-MCS} and, for each s ∈ SR, VR(s) = {α | Bα ∈ s}. Finally, for all states s, u ∈ SR, TR su iff {φ | 2φ ∈ s} ⊆ u (or equivalently, iff {3φ | φ ∈ u} ⊆ s). This construction guarantees that we have enough states to interpret 3 as we would like:
Lemma 6 (Existence lemma) Let MR = ⟨SR, TR, VR⟩ be a canonical model for ΛR. For any φ ∈ ML and s ∈ SR, if 3φ ∈ s then there is a state u ∈ SR such that TR su and φ ∈ u.
Proof: Suppose that 3φ ∈ s and set ∆ = {φ} ∪ {ψ | 2ψ ∈ s}. If ∆ is inconsistent, there must be ψ1, . . . , ψn ∈ ∆ such that ψ1, . . . , ψn ⊢R ¬φ. This implies that ⊢R 2ψ1 ∧ · · · ∧ 2ψn → ¬3φ (by applying the deduction theorem, necessitation and substituting ¬3 for 2¬). But since s is a ΛR-MCS containing 2ψ1, . . . , 2ψn, it follows that ¬3φ ∈ s. This contradicts our assumption that 3φ ∈ s; hence ∆ must be consistent. By the Lindenbaum lemma, ∆ can therefore be extended to a ΛR-MCS ∆+ with φ ∈ ∆+ and {ψ | 2ψ ∈ s} ⊆ ∆+ and so, by the construction of MR, TR s∆+.
⊣
Lemma 7 (Truth lemma) Let MR = ⟨SR, TR, VR⟩ be a canonical model for ΛR. For any φ ∈ ML and any state s ∈ SR, MR, s ⊨ φ iff φ ∈ s.
Proof: By induction on the complexity of φ. The base case is given by the definition of V R .
Since s is both maximal and consistent, the Boolean cases are trivial. So assume MR, s ⊨ 3ψ; there is then a u such that TR su and, by hypothesis, ψ ∈ u. By the definition of TR, 3ψ ∈ s. In the other direction, the existence lemma guarantees that 3ψ ∈ s implies the existence of a state u containing ψ with TR su. By hypothesis, MR, u ⊨ ψ and hence MR, s ⊨ 3ψ.
⊣
These results are standard in modal logic, but the following is specific to the logic discussed
here:
Lemma 8 Let MR = ⟨SR, TR, VR⟩ be a canonical model for ΛR. For any α ∈ L and s, u ∈ SR, if TR su and α ∈ VR(u) but α ∉ VR(s), then: (i) VR(u) = VR(s) ∪ {α}; and (ii) α must be a literal.
Proof:
Assume that TR su, α ∈ VR(u) but α ∉ VR(s). Since ⊢R Bα → 2Bα, VR(s) ⊆ VR(u). Moreover, for any β ∈ VR(u) with β ≠ α, axiom A4 entails β ∈ VR(s), hence (i) follows. For part (ii), if α were not a literal it must be some rule ρ. Given axioms A5 and A6, this can only be the case if ρ ∈ R; but then α ∈ VR(s), contrary to hypothesis.
⊣
We are now ready to show that the system shown in figure 6.1 is strongly complete.
Theorem 19 (Strong completeness) For any set of ML formulae Γ and any ML formula φ,
Γ ⊨R φ only if Γ ⊢R φ.
Proof:
If Γ is ΛR-inconsistent then the result is trivial, so assume otherwise. Using the Lindenbaum lemma, expand Γ to a ΛR-MCS Γ+ and build a canonical model MR = ⟨SR, TR, VR⟩ as described above. From the truth lemma, it follows that MR, Γ+ ⊨R Γ. It remains only to show that MR is in the class SR, i.e. that it satisfies S1–S4.
MR satisfies S1:
Assume there is an s-matching rule ρ. Given the truth lemma, it is easy
to see that ρ and its antecedents are members of VR(s), whereas its consequent is not. Axiom
A3 and the existence lemma guarantee a successor u with B cn(ρ) ∈ u. Given lemma 8, u
must be the extension of s by cn(ρ).
MR satisfies S2:
Suppose s is a terminating state. By axiom A2, it has a successor s′ . By
axiom A1, V(s) ⊆ V(s′ ) and, since no rules match at s, axiom A7 ensures that V(s′ ) ⊆ V(s).
MR satisfies S3:
Suppose TR su for states s, u in MR . By the definition of TR , {φ | 2φ ∈
s} ⊆ u. By axiom A1, V R (s) ⊆ V R (u). Now take any α ∈ V R (u): by the truth lemma, Bα ∈ u
and, by the definition of TR, 3Bα ∈ s. By axiom A7, either α ∈ VR(s) or else there is at least one
rule ρ = (λ1 , . . . , λn ⇒ α) ∈ R such that {Bλ1 , . . . , Bλn } ⊆ s. In the latter case, ρ is s-matching
and, by lemma 8, u extends s by α. If there is no such rule, for any α ∈ V R (u), then u is a
terminating state with V R (u) = V R (s).
MR satisfies S4:
Given axioms A5 and A6, this is trivial. This completes the proof.
⊣
6.3 Logic for a Multi-Agent System
Given a set of agents A and a program set R for these agents, the logic of the class MR is
called ΛR (again, the fact that R is a program set avoids the need for an extra ‘M’ parameter).
In this section, let A and R be fixed throughout and let ML contain a modality Bi for each
agent i ∈ A.
The logic ΛR is then easily axiomatized, based on the axiom system used for
ΛR above. The schemes PL, K, A1–A3 and A5–A6 are imported from figure 6.1 with no changes, except that each 'B' is replaced by 'Bi' and 'R' is replaced by 'Ri' in the side conditions of A5 and A6. Instances of A4 are only valid when either Bi α ≠ Bi β, or else when both α
and β are not an ask ? ij λ or ? ji λ, or a tell ⊢ ij λ or ⊢ ji λ; for if i asks whether or tells j that
λ, then both i and j gain the new belief that the communication took place.
Similarly, instances of A7 are valid in ΛR so long as α is not a communication from
another agent; that is, if i believes α and α is neither ? ji λ nor ⊢ ji λ, then either α is an old
belief or the consequent of a matching rule:
A7 3Bi α → (Bi α ∨ ⋁(α1 ,...,αn ⇒α)∈Ri (Bi α1 ∧ · · · ∧ Bi αn))      for α ≠ ◦ji λ
where '◦' may be uniformly instantiated by either '?' or '⊢' (the condition α ≠ ◦ji λ means that α is neither ?ji λ nor ⊢ji λ). Because of this restriction, additional axioms are required to handle belief in new asks and tells. These axioms mimic A7 except that they require that a new belief ?ji λ be produced by a rule whose consequent is ?ji λ and whose antecedents are believed by agent j rather than agent i (and similarly for tells ⊢ji λ):
A7-com 3Bi ◦ji λ → (Bi ◦ji λ ∨ ⋁(α1 ,...,αn ⇒◦ji λ)∈Rj (Bj α1 ∧ · · · ∧ Bj αn))
In ΛR, the A8 scheme ensured that at least one consequent of one matching rule (if there is one) appears in each successor state. In ΛR it is, in addition, necessary to ensure that agent i ends up believing the consequents of its own matching rules. It is useful to define a new macro:
matchi(α1, . . . , αn ⇒ α) =df Bi α1 ∧ · · · ∧ Bi αn ∧ ¬Bi α
and redefine matchρ to mean that there is an i such that matchi ρ. For n ≥ 1, the A8 scheme becomes:
A8 matchi1 ρ1 ∧ · · · ∧ matchin ρn ∧ ⋀ρ∈⋃R−{ρ1 ,...,ρn } ¬matchρ → 2(Bi1 cn(ρ1) ∨ · · · ∨ Bin cn(ρn))
Finally, two new schemes are added to deal with the distribution of asks and tells to the
relevant parties:
A9 Bi ◦ij λ ↔ Bj ◦ij λ
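By way of a hypothetical illustration, suppose agent j's program contains the C-rule p ⇒ ⊢ji p, by which j tells i that p whenever j believes p, and no other rule with that consequent. The relevant A7-com instance is then 3Bi ⊢ji p → (Bi ⊢ji p ∨ Bj p): if i may come to believe that j has told her that p, then either i already believes this, or j believes the antecedent of a rule that would send the message. The corresponding A9 instances, Bi ⊢ij λ ↔ Bj ⊢ij λ and Bi ?ij λ ↔ Bj ?ij λ, guarantee that sender and recipient agree on which communications have taken place.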
The full set of axiom schemes is shown in figure 6.2. A derivation in ΛR is defined
as above. When φ is derivable from Γ in ΛR , we write Γ ⊢R φ. The rest of this section
establishes that this system is sound and complete with respect to the class MR .
Theorem 20 (Soundness) For any ML-formula φ and set of ML-formulae Γ, if Γ ⊢R φ then
Γ ⊨R φ.
Proof: As above, it is easy to see that PL, K and A1–A6 are valid on MR . A7 can be seen to
be valid for much the same reasons given in theorem 18. In the case of A7-com, i's new beliefs of the form ◦ji λ can only result from j firing a rule whose consequent is ◦ji λ. This is seen
A1 Bi α → 2Bi α
A2 3⊤
A3 Bi (λ1 , . . . , λn ⇒ λ) ∧ Bi λ1 ∧ · · · ∧ Bi λn → 3Bi λ
A4 3(Bi α ∧ Bi β) → Bi α ∨ Bi β      unless α = β = ◦ij λ or α = β = ◦ji λ or Bi α = Bj β
A5 Bi ρ      (ρ ∈ Ri)
A6 ¬Bi ρ      (ρ ∉ Ri)
A7 3Bi α → (Bi α ∨ ⋁(α1 ,...,αn ⇒α)∈Ri (Bi α1 ∧ · · · ∧ Bi αn))      (α ≠ ◦ji λ)
A7-com 3Bi ◦ji λ → (Bi ◦ji λ ∨ ⋁(α1 ,...,αn ⇒◦ji λ)∈Rj (Bj α1 ∧ · · · ∧ Bj αn))
A8 matchi1 ρ1 ∧ · · · ∧ matchin ρn ∧ ⋀ρ∈⋃R−{ρ1 ,...,ρn } ¬matchρ → 2(Bi1 cn(ρ1) ∨ · · · ∨ Bin cn(ρn))      (n ≥ 1)
A9 Bi ◦ij λ ↔ Bj ◦ij λ
Plus PL, K, MP and N as in figure 6.1.

Figure 6.2: Axiom schemes for ΛR
to be sound by the same reasoning used for A7. In the case of A8, suppose the consequent 2(Bi1 cn(ρ1) ∨ · · · ∨ Bi cn(ρ) ∨ · · · ∨ Bin cn(ρn)) of an instance I is false at some state s in a model M ∈ MR; then ρ cannot be i-s-matching. Hence s ⊭ matchi ρ and so I too is not supported by s; it follows that all instances of A8 are valid. Finally, given M5, A9 is clearly valid.
⊣
Completeness is demonstrated using the same strategy as above, using the Lindenbaum lemma given there. The canonical model MR = ⟨A, SR, TR, {ViR}i∈A⟩ for ΛR is constructed as before, except that SR is now the set of all ΛR-MCSs over ML. The existence and truth lemmas are then exactly as above.
Lemma 9 Let MR be the canonical model for ΛR over a set of agents A and let α ∈ L, s, u ∈ SR and i ∈ A. If TR su and α ∈ ViR(u) but α ∉ ViR(s), then (i) ViR(u) = ViR(s) ∪ {α} and (ii) α must be either a literal, an ask or a tell.
Proof:
Assume that TR su and α ∈ ViR(u) but α ∉ ViR(s). Following the proof of lemma 8, ViR(s) ⊆ ViR(u), ViR(u) ⊆ ViR(s) ∪ {α} and α cannot be a rule ρ; therefore it must be a literal, an ask or a tell.
⊣
Theorem 21 (Strong completeness) Given a program set R over a set of agents A, a set of ML
sentences Γ and an ML sentence φ, Γ ⊨R φ only if Γ ⊢R φ.
Proof:
The strategy is much as before. Expand Γ to a ΛR-MCS Γ+ and build a canonical model MR = ⟨A, SR, TR, {ViR}i∈A⟩. From the truth lemma, it follows that MR, Γ+ ⊨ Γ and so it remains only to be shown that MR is in the class MR, i.e. that MR satisfies M1–M5 and, for any i ∈ A and ρ ∈ Ri ∈ R, Bi ρ is globally satisfied in MR. The latter condition is trivial, given A5 and A6, as are the cases for M4 (the definition of a program excludes rules not conforming to conditions i and ii) and M5 (by A9). The remaining cases are as follows:
MR satisfies M1:
Assume that there is an i-s-matching rule ρ with cn(ρ) = α. By A3 and the truth lemma, 3Bi α ∈ s and so, by the existence lemma, there is a state u such that TR su and Bi α ∈ u. Since ρ is i-s-matching, α ∉ ViR(s) and so, by lemma 9, ViR(u) = ViR(s) ∪ {α}. Since ρ ∈ Ri, α cannot be of the form ◦ji λ, as no such rule is permitted in an agent's program. If α is a literal, or of the form ◦jk λ with j ≠ k, then A1 and A4 entail that VlR(u) = VlR(s) for all l ≠ i ∈ A. So suppose α := ◦ij λ. By A9 and the definition of VjR, ◦ij λ ∈ VjR(u) and so, by A4, VjR(u) = VjR(s) ∪ {α}. Moreover, for any k ∈ A with i ≠ k ≠ j, A4 implies that VkR(u) = VkR(s). Hence u is the appropriate extension of s by α.
MR satisfies M2:
Assume TR su. First, suppose that s is non-terminating. Then there are rules ρ1, . . . , ρn and i1, . . . , im ∈ A such that each ρk≤n is il-s-matching for some il≤m and, for any i-s-matching rule ρ, ρ ∈ {ρ1, . . . , ρn}. Take any i ∈ {i1, . . . , im}. By A8 and the definition of ViR, and since u is a ΛR-MCS, either cn(ρ1) ∈ ViR(u) or . . . or cn(ρn) ∈ ViR(u). Then for some k ≤ n, cn(ρk) ∈ ViR(u) but cn(ρk) ∉ ViR(s) and so, following the proof of the previous case, u is either the C- or D-extension (depending on whether ρk is a C- or D-rule) of s by cn(ρk). On the other hand, if s is a terminating state then there is no i ∈ A and ρ ∈ Ri such that ρ is i-s-matching. Take any i ∈ A and α ∈ ViR(u). If α ≠ ◦ji λ for any j, λ, then A7 implies Bi α ∈ s or, for some rule ρ = (α1, . . . , αn ⇒ α) ∈ Ri, Bi α1 ∧ · · · ∧ Bi αn ∈ s. In the latter case, ρ cannot be matching by assumption and so ¬Bi α ∉ s. Since s is maximal, Bi α ∈ s and so α ∈ ViR(s).
MR satisfies M3:
Assume that s is a terminating state. By A2 and the definition of TR, s has a successor (call it u). By A1, ViR(s) ⊆ ViR(u). Now take any i ∈ A and α ∈ ViR(u) and suppose α ∉ ViR(s). By the definitions of ViR and TR, 3Bi α ∈ s. If α is of the form ◦ji λ then, by A7-com, there is a sentence φ1 ∨ · · · ∨ φn ∈ s where each disjunct has the form Bj α1 ∧ · · · ∧ Bj αn such that α1, . . . , αn ⇒ ◦ji λ ∈ Rj. But then this rule would be j-s-matching, contradicting the assumption that s is a terminating state. Hence each disjunct φm≤n is false; but this cannot be the case either, since s is consistent by definition. It follows that α ∈ ViR(s). If, on the other hand, α is not of the form ◦ji λ then there is a sentence φ1 ∨ · · · ∨ φn ∈ s where each disjunct has the form Bi α1 ∧ · · · ∧ Bi αn such that α1, . . . , αn ⇒ α ∈ Ri. By similar reasoning, there can be no such rule and so it follows that α ∈ ViR(s). Hence, for any agent i ∈ A, ViR(u) = ViR(s).
⊣
6.4 Complexity of Satisfiability Checking
In section 5.6.2, a decidability proof (theorem 12) was given for SR -satisfiability checking;
this was extended to a decidability proof for MR -satisfiability in section 5.7.4 (theorem 17).
The proof relied on generating models that have a number of states exponential in the size
of the sentence being checked plus the size of R or R. At that point, the complexity of
a satisfiability test was not analyzed in any more detail. It is to this that I now turn. To
keep the presentation simple, I concentrate on the single-agent case, which can easily be
extended to the multi-agent logic.
Theorem 22 Given a particular program R, the problem of deciding whether a formula φ is
satisfiable in a model M ∈ SR is NP-complete.
Proof:
Clearly the problem is NP-hard. Let n = |{cn(ρ) | ρ ∈ R}|. From theorem 10, any SR-satisfiable sentence φ has a tree model M ∈ SR containing no more than n^n states that, given the proof of theorem 11, is no larger than |φ| × n^n. Given any Kripke structure M′, state s in M′ and a modal formula ψ, it takes time polynomial in the size of M′ and ψ to check whether M′, s ⊨ ψ [BdRV02]. The crucial point here is that |R|, and hence n^n, is fixed in SR and therefore can be treated as a constant. Thus, we can guess a rooted model M ∈ SR of size no greater than |φ| × n^n and check whether φ is satisfied at the root of M in time polynomial in |φ|. It follows that the problem of deciding whether φ is SR-satisfiable is in NP.
⊣
One of the main practical uses of models in a class SR is to check whether R
satisfies certain properties, specified as an input formula φ. One may want to check a range of different programs against the same property: for example, suppose a developer
requires an agent that can never move into a state in which φ holds. On discovering that
φ is SR1 -satisfiable, she must reject R1 . If R2 is the next generation of the program, then
φ needs to be checked for SR2 -satisfiability. The evolution from R1 to R2 may have added
a large number of rules to the program. This example highlights that it is not just the
scalability of satisfiability checking given φ as an input that should concern us. How the
problem scales with the size of the agent’s program is also crucial.1 An interesting problem
to consider, therefore, is the one that takes both a formula φ and a program R as its input
and determines whether φ is SR satisfiable. I call this the S-SAT problem. The complexity
of the problem should be investigated in terms of |R| and |φ| rather than in terms of |φ|
alone.
Definition 43 Let X be a set of ML sentences. Then L(X, R) contains precisely the L sentences
occurring in either X or R, i.e:
L(X, R) = {α | Bα ∈ X} ∪ {(λ1 , . . . , λn ⇒ λ), λ1 , . . . , λn , λ | (λ1 , . . . , λn ⇒ λ) ∈ R}
Definition 44 (SR -Hintikka sets) Let X ⊆ ML be closed under subformulae and negations
(i.e. X = Cl(Y) for some Y ⊆ ML) and R be a program. A Hintikka set H′ over X is a maximal
subset of X satisfying:
1. ⊥ ∉ H′
2. If ¬φ ∈ X, then ¬φ ∈ H′ iff φ ∉ H′
3. If φ ∧ ψ ∈ X, then φ ∧ ψ ∈ H′ iff {φ, ψ} ⊆ H′
4. If φ ∨ ψ ∈ X, then φ ∨ ψ ∈ H′ iff φ ∈ H′ or ψ ∈ H′
5. If φ → ψ ∈ X, then φ → ψ ∈ H′ iff ¬φ ∈ H′ or ψ ∈ H′
Let A be the smallest set containing all instances of the axiom schemes A1–A8 over L(X, R). An
SR -Hintikka set H over X is then a Hintikka set over Cl(X ∪ A). When an SR -Hintikka set H is
SR -satisfiable, it is called an R-atom.
Definition 45 Let H be an SR-Hintikka set over X and R with 3φ ∈ H. Then
D(H, 3φ) = {φ} ∪ {ψ | 2ψ ∈ H}
is the demand created in H by 3φ. H3φ is then the set of SR-Hintikka sets H′ over Cl(D(H, 3φ)) with D(H, 3φ) ⊆ H′.
1. In fact, this can often be the more important factor of the two, for the size of many programs currently in use far exceeds the size of the formulae that it is useful to check for satisfiability.
The technique adopted here, adapted from [BdRV02], will construct what is essentially an abstract tableaux system for ΛR that never requires more than space polynomial
in the size of φ and R. The key idea is as follows. SR -Hintikka sets need not be satisfiable,
although they contain no explicit contradictions (any inconsistency is thus a ‘modal inconsistency’). However, any SR -Hintikka set H is satisfiable if there is a satisfiable SR -Hintikka
set that meets the demands set by 3-formulae in H. Thus, by recursively checking for the
existence of such a set for each subformula 3φ ∈ H, H can be shown to be satisfiable. This
process is going to be genuinely algorithmic for the following reason. Each time a set H′ is
found that meets the demand created by some 3φ ∈ H, the degree of H′ will be less than
the degree of H (the degree of a set of sentences is simply the maximum of the degrees of
its elements). When the degree of a set H is 0, the task is simply to verify that H is indeed
an SR -Hintikka set.
When an SR -Hintikka set H meets the demand made by a sentence 3φ, H is
called a witness to 3φ and a structure that recursively provides a witness for every such
subformula is called a witness set, generated by H:
Definition 46 (Witness Set) Let X ⊂fin ML, R be a program, A contain precisely the instances of the axiom schemes A1–A8 over L(X, R) and H be an SR-Hintikka set over X. ℋ ⊆ 2^(X∪A) is a witness set generated by H over X and R iff H ∈ ℋ and, for any Y ∈ ℋ:
i. for each 3φ ∈ Y, there is a Y′ ∈ Y3φ ∩ ℋ;
ii. if Y ≠ H then, for some n > 0, there are Y0, . . . , Yn with Y0 = H and Yn = Y and, for each k < n, Yk+1 ∈ (Yk)3φ for some 3φ ∈ Yk.
The first condition ensures that, for any 3φ in any set Y in the witness set, there is a
further set in the witness set that contains the demand D(Y, 3φ), found in Y3φ . The second
condition ensures that all sets in the witness set are forced there by the first condition, such
that the witness set contains as few elements as possible. The existence of a witness set
generated by an SR -Hintikka set containing φ establishes that φ is satisfiable. To show that
this is so, two facts need to be proved:
1. φ is SR -satisfiable iff there is an R-atom H with φ ∈ H.
2. H is an R-atom iff there is a witness set generated by H over a common finite set X
and program R.
The first claim is trivial, given the validity of all instances of A1–A8 on SR (theorem 18)
and so the proof is omitted. The second claim requires more work, for which the following
lemmas are helpful.
Lemma 10 If H is an R-atom containing 3φ, then there is an R-atom in H3φ containing D(H, 3φ).
Proof: Clearly, D(H, 3φ) is SR-satisfiable, for H is an R-atom; then there is a model M ∈ SR and a state s such that M, s ⊨ D(H, 3φ). Then the set
{ψ | M, s ⊨ ψ} ∩ Cl(D(H, 3φ))
is R-satisfiable and clearly contains D(H, 3φ).
⊣
Lemma 11 Let H be an SR-Hintikka set over a finite set X and ℋ be a witness set generated by H over X. Then each Y ∈ ℋ contains only finitely many sentences of the form 3φ.
Proof:
3φ ∈ Y only if either 3φ ∈ X; or φ := Bλ and there is an instance of A3 over
{λ | Bλ ∈ Cl(X)} whose consequent is 3Bλ; or φ := ⊤. Since X is finite, this guarantees that
{3φ | 3φ ∈ Y} is finite too.
⊣
Now for the crucial result:
Lemma 12 Let H be a finite SR -Hintikka set over a finite set X and a program R. Then H is an
R-atom iff there is a witness set generated by H over X and R.
Proof:
The result for K-satisfiability, from which the current proof is adapted, is shown in [BdRV02]. The left to right direction proceeds by induction on the degree of X and, given lemma 10, is no different in the case of SR-satisfiability. The right to left direction, however, does need additional work. It needs to be shown that, if ℋ is a witness set generated by H over X, then H is SR-satisfiable. To do this, we will construct a model Mℋ from ℋ and show it to be a member of SR. The construction of Mℋ proceeds step-by-step. Let S contain countably many states s0, s1, . . . and set S0 = {s0}, T0 = {(s0, s0)} and f0 = {(s0, H)}. Now assume Sn, Tn and fn have been defined. The construction halts at stage n if, for all s ∈ Sn with some 3φ ∈ fn(s), there is a state s′ ∈ Sn with φ ∈ fn(s′) and fn(s′) ∈ fn(s)3φ. Otherwise, there must be some s ∈ Sn with 3φ ∈ fn(s) not meeting this condition. By the definition of ℋ, there is then some Y ∈ ℋ ∩ fn(s)3φ. Then stage n + 1 sets:
Sn+1 = Sn ∪ {sn+1}
Tn+1 = (Tn − {(s, s)}) ∪ {(s, sn+1), (sn+1, sn+1)}
fn+1 = fn ∪ {(sn+1, Y)}
Given lemma 11, the tree constructed is finitely branching. This construction differs from that used in the corresponding structure for K, in that the leaves of the tree constructed here are reflexive points (this is required to make the structure finite). To see that the structure is finite, first note that, whenever Tn ss′, deg(fn(s)) > 1 implies deg(fn(s′)) < deg(fn(s)) and deg(fn(s)) = 1 implies deg(fn(s′)) = 1. If deg(X) = k, then there is a state s on each branch with deg(fn(s)) = 1 reachable in k − 1 transitions from s0, i.e. for any 3ψ ∈ fn(s), ψ := Bλ. Note that no fn(s) has a degree of 0, since each Y ∈ ℋ contains 3⊤. Set k = |R|. By definition, if Tn ss′ and 3Bλ ∈ fn(s) but Bλ ∉ fn(s), then {3Bλ, Bλ} ⊆ fn(s′). Suppose deg(fn(si)) = 1 and that there is a subbranch si1 · · · sik at stage n; then, for any 3Bλ ∈ fn(sik), Bλ ∈ fn(sik) as well. It follows that, for any 3φ ∈ fn(sik), D(fn(sik), 3φ) ⊆ fn(sik). When each leaf of the tree has this latter property, which it must do at some finite stage n, the construction halts. The tree that has been constructed is shallow, for the length of any branch cannot exceed max{deg(H), |R|}.
Suppose the construction halts at stage n. For each s ∈ Sn, set V(s) = {α | Bα ∈ fn(s)} and let Mℋ = ⟨Sn, Tn, V⟩. Showing that Mℋ ∈ SR is then a matter of rehearsing the argument used in the proof of completeness (theorem 19): that is, Mℋ satisfies S1–S4 and, for each state s ∈ Sn, the set of rules {ρ | ρ ∈ V(s)} = R. Note that the definition of ℋ provides an existence lemma and the definition of V provides a truth lemma (in terms of fn(s) rather than s). The remainder of the proof then mimics that of theorem 19 so it is omitted here. It then follows that Mℋ, s0 ⊨ H.
⊣
Now for the algorithm that checks for SR-satisfiability, given two sets H and X and a program R as its input. It is called the Witness algorithm, returning true iff there is a witness set ℋ generated by H over X and R:
function Witness(H, X, R) returns Boolean
begin
  if H is an SR-Hintikka set over X and R
     and for each subformula 3φ ∈ H there is a set Y ∈ H3φ
         such that Witness(Y, Cl(D(H, 3φ)), R)
  then return true
  else return false
end
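A minimal Python rendering of the same recursion, assuming the logic-specific operations (Hintikka-set recognition, the demand D, the closure Cl and the candidate sets H3φ) are supplied by the caller as functions; none of them is defined here, so the sketch only fixes the shape of the algorithm.

    def witness(H, X, R, is_hintikka_set, diamonds, candidates, demand, closure):
        # is_hintikka_set(H, X, R): is H an SR-Hintikka set over X and R?
        # diamonds(H): the formulas φ such that 3φ ∈ H
        # candidates(H, phi): the sets in H_3φ
        # demand(H, phi): D(H, 3φ);  closure(S): Cl(S)
        if not is_hintikka_set(H, X, R):
            return False
        return all(
            any(witness(Y, closure(demand(H, phi)), R,
                        is_hintikka_set, diamonds, candidates, demand, closure)
                for Y in candidates(H, phi))
            for phi in diamonds(H))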
By the reasoning used to prove lemma 12, it can be seen that Witness is computable
and hence an acceptable terminating algorithm. By that same reasoning, it can be seen that
Witness(H, X, R) returns true if and only if there is a witness set generated by H over X
and R. Using the correctness of Witness and lemma 12, Witness can be used to check the
SR -satisfiability of any sentence φ by guessing an SR -Hintikka set H over Cl(φ) and R
containing φ and running Witness(H, Cl(φ), R). φ is SR -satisfiable iff Witness(H, Cl(φ), R)
returns true for some such guess of H. This leads to the desired result:
Theorem 23 S-SAT is in PSPACE.
Proof:
[BdRV02] provides an implementation of Witness on a non-deterministic PSPACE
Turing machine. The important points of the encoding are:
1. Any subset of L(Cl(φ), R) can be encoded using pointers, requiring only space polynomial in |φ| and |R|.
2. The recursion clause can be evaluated by taking one subformula 3φ at a time, thus
requiring only polynomial space. Recursion is implemented using a stack.
3. Because the Turing machine is non-deterministic, it is allowed to guess each set
Y ∈ H3φ .
Given these points, each level of recursion uses only space polynomial in |φ| and |R|. But
the depth of recursion is bounded by max{deg(Cl(φ)), |R|} and hence by max{|φ|, |R|}. The
total amount of space required by a non-deterministic Turing machine to run Witness is
thus bounded by max{|φ|, |R|}^2 and so the algorithm runs in NPSPACE. But, by Savitch's
theorem, NPSPACE = PSPACE. This establishes that S-SAT is in PSPACE.
⊣
Chapter 7
Extending the Logics
7.1 Introduction
The previous two chapters concentrated on rule-based agents operating with what, from
the point of view of full propositional logic, is an inexpressive form of rules. One could
not express rules of the form p ⇒ q ∨ r, for example the rule human(x) ⇒ male(x) ∨
female(x). One way of dealing with rules involving disjunctions is considered in [ABG+ 06a,
ABG+ 06b]. There, each disjunctive belief an agent has is represented as a set of disjunction-free alternatives.
Rules containing disjunctions are more expressive than those without, but are
still far more restricted than a full propositional reasoner. An agent’s program remains
fixed and thus rules cannot appear as the consequents of rules. Rather than dealing with
rules containing disjunctions directly, this chapter considers a full propositional resource-bounded reasoner. The notion of resource-bounded inference is the same as in the previous chapters: that is, each transition corresponds to a (non-deterministic) atomic inference, adding just one new consequent of the agent's beliefs to its belief set. Models of a full propositional resource-bounded reasoner should have the property that, whilst at no point
does an agent believe all propositional tautologies (or all propositional consequences of
its beliefs), any propositional tautology (and any consequence of its beliefs) can become a
belief at some indefinite point in the future. This is interpreted as: given enough time, the
agent would eventually come to believe the formula.
The chapter first considers a Hilbert-style reasoner, along the lines of [GKP00] and
[Ho95, Ho97], which takes time both to instantiate Hilbert axiom schemes and to use modus
ponens to derive new beliefs. Then, a more interesting reasoner is discussed, based on the
assumption-making reasoner modelled in Timed Reasoning Logics in section 4.5. Finally,
after models of these types of reasoning have been presented, the language is expanded
with more expressive temporal modalities than the ones employed in the language so far.
7.2 A Hilbert-Style Reasoner
In this section, a model of a Hilbert-style axiomatic reasoner is presented. To handle
resource bounds in general, a set I of axiom instances is used. Various resource bounds
can then be modelled by restricting the size of I, or by restricting its elements to a fixed
maximum size or a fixed, finite language. Rather than believing all members of I at every
state, axiom instances in I are introduced one by one. This reflects the fact that instantiating
an axiom scheme requires time and effort, just as applying those axioms does.
Once I has been fixed, it plays much the same rôle as an agent’s program does
in the case of rule-based agents: it limits the inferences an agent may make and hence
constrains the transitions appearing in any model over I. Rule-based agents were modelled
by imposing conditions on T corresponding to the inference rule:
from λ1 , . . . , λn and (λ1 , . . . , λn ⇒ λ), infer λ.
An axiomatic agent, on the other hand, is modelled by imposing conditions on T corresponding to the rules: from α and α → β, infer β; and: infer α, provided α ∈ I;
i.e. new beliefs are inferred either by using modus ponens or by instantiating an axiom
scheme.
To keep the presentation simple, suppose the internal language of each agent is
defined over ¬ and →, with the remaining Boolean connectives introduced by definition.
To model a full propositional Hilbert-style reasoner, I should contain precisely the instances
of:
A1 α → (β → α)
A2 (α → (β → γ)) → ((α → β) → (α → γ))
A3 (¬α → ¬β) → (β → α)
over this language. Models of this agent describe a reasoner who can, given enough time,
derive any classical consequence of its initial beliefs. That is, in any such model M, if
V(s) = Γ and φ is a consequence of Γ in propositional logic, then there is a state u in M
reachable from s such that M, u Bφ.
The class of all models of a Hilbert-style reasoner is denoted H, with the class of
models defined over a particular set I of axiom instances denoted HI . As before, models
are triples ⟨S, T, V⟩ and a model M is in the class HI iff:
H1 For all s ∈ S: there is a u ∈ S and a β ∈ L such that β ∉ V(s), V(u) = V(s) ∪ {β} and Tsu if:
i. α, α → β ∈ V(s) and β ∉ V(s); or
ii. β ∈ I.
H2 For all s ∈ S, if I ⊆ V(s) and there is no pair α, β such that α, α → β ∈ V(s) but β ∉ V(s),
then there is a u ∈ S such that Tsu and V(u) = V(s).
H3 For all states s, u ∈ S, Tsu only if:
i. α, α → β ∈ V(s), β ∉ V(s) and V(u) = V(s) ∪ {β}; or
ii. α ∉ V(s), V(u) = V(s) ∪ {α} and α ∈ I; or
iii. I ⊆ V(s); for any α, β, if α, α → β ∈ V(s) then β ∈ V(s); and V(u) = V(s).
Conditions H1i and H3i deal with transitions due to modus ponens, whereas H1ii and H3ii
concern transitions modelling the instantiation of some axiom scheme.
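To illustrate how conditions H1–H3 constrain transitions, the following sketch (in Python; the tuple-based formula representation and the function name are my own illustrative choices, not part of the formal framework) enumerates the belief sets that can label a successor state, given a current belief set and a finite stock I of axiom instances.

# A minimal sketch, assuming formulas are nested tuples: an atom is a
# string and an implication is ('->', antecedent, consequent); names and
# representation are illustrative only.

def hilbert_successors(beliefs, axiom_instances):
    """Possible successor belief sets under H1-H3: each successor adds one
    new formula, obtained by modus ponens (H1i/H3i) or by instantiating an
    axiom in I (H1ii/H3ii); if no inference is possible, the belief set
    repeats (H2/H3iii)."""
    beliefs = frozenset(beliefs)
    successors = set()
    # Modus ponens: from alpha and alpha -> beta, add beta if it is new.
    for formula in beliefs:
        if isinstance(formula, tuple) and formula[0] == '->':
            _, alpha, beta = formula
            if alpha in beliefs and beta not in beliefs:
                successors.add(beliefs | {beta})
    # Axiom instantiation: add any not-yet-believed instance from I.
    for alpha in axiom_instances:
        if alpha not in beliefs:
            successors.add(beliefs | {alpha})
    # Terminating case: nothing can be inferred, so the state repeats.
    return successors or {beliefs}

# One instance of A1, namely p -> (q -> p), plus modus ponens on p, p -> q.
I = {('->', 'p', ('->', 'q', 'p'))}
print(hilbert_successors({'p', ('->', 'p', 'q')}, I))

Each call returns every one-step extension that H1 demands and H3 permits, which is exactly the non-deterministic branching that models in HI exhibit.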
When I is unrestricted, containing all instances of the Hilbert axiom schemes A1–
A3 over the agent’s internal language, it is not obvious whether it is possible to provide a
filtration technique along the lines of definition 32 (for it is not known in advance which
axiom instances are required to generate a suitable model). Moreover, modelling axiomatic
reasoners is not a particularly interesting logical exercise.1 The class of models H is defined
here merely to show that the restrictions to rule-based reasoning are by no means essential.
A more interesting approach is discussed in the next section.
1. Complete axiomatizations of Hilbert-style reasoners have been given, for example in [ÅA06], which discusses an S4 reasoner along the lines of [Ho97] (discussed in section 4.3.4 above).
7.3 An Assumption-Making Reasoner
Modelling an agent that can reason by making assumptions, along the lines of the agents
modelled in Timed Reasoning Logic in section 4.5, is a much more interesting task than
modelling a Hilbert-style reasoner. As discussed there, assumption-based reasoning is
analogous to co-operative reasoning within a group of agents and sub-agents. Agent i
assuming that α is analogous to i passing its beliefs to a sub-agent j whose sole belief, in
addition to those passed by i, is α. The sub-agent j then reasons as usual and, when it
derives a sentence α′, this entitles agent i to believe α → α′.
It is beneficial to introduce into the metalanguage a special modality marking beliefs
that hold within a particular assumption, rather than marking assumptions explicitly
in each sub-agent's internal language. The sub-agents themselves have no need for hypothetical moods: they do not have beliefs-within-assumptions, they only have beliefs
simpliciter. The mechanism of making and closing assumptions is handled purely by the
inter-relations between agents and sub-agents, i.e. purely in the structure of a model.2 Talk
of assumptions being made is talk of what is being modelled in the real world, captured
by the interrelations between agents and sub-agents in a model. So as not to confuse what
I have described here as sub-agents with genuinely distinct agents in a multi-agent system,
sub-agents will henceforth be called assumption contexts. Assumptions are thus modelled
by a number of assumption contexts.
Both the ascription language and the semantics are defined relative to a set of
assumption contexts Σi for each agent i under consideration, and each such set contains
a unique zero-assumption context ǫ. For simplicity, I only consider a single assumption-making agent here, but it is easy to generalize to multi-agent cases. As before, assume
that the agent’s internal language is a regular propositional language L. The ascription
language MLΣ is then defined over L and a set of assumption contexts Σ such that, if α ∈ L
and c ∈ Σ, then [c]α is a well-formed formula of MLΣ ; and MLΣ is closed under 3 and
the Boolean connectives. ‘Bα’ is then introduced by definition as ‘[ǫ]α’; intuitively, ‘[ǫ]α’
means that the agent has derived α from its beliefs without any uncancelled assumptions.
Thus the agent’s ‘real’ beliefs correspond to sentences derived from initial beliefs within
the zero-assumption context (i.e. entertained with no uncancelled assumptions).
2. Syntactically, marking assumptions in the metalanguage only is similar to the way in which scope-marking symbols in natural deduction proofs are not themselves part of the language of propositional or first-order logic.
As in section 4.5, Σ contains sequences of formulae from the agent’s language
L. When no limit is placed on the number of assumptions that may be represented in a
model, Σ is the set of all repeated selections (with replacement) from L, denoted L∗ . But
note that Σ need not always be identified with L∗. For example, setting Σ = Ln means that
Σ contains all sequences of L formulae that are exactly n formulae long (the notation L≤n
can also be used to denote all sequences over L that are at most n formulae long).
Setting Σ = L≤n allows us to model a reasoner that can embed assumptions within other
assumptions only up to depth n.
Assumption contexts are modelled in much the same way as agents' internal
states were in section 5.7. Rather than having a labelling function Vc for each assumption
context c ∈ Σ, I consider a single labelling function that takes both an assumption context
c ∈ Σ and a state s ∈ S as its arguments, returning the set of L sentences that label that
context at that state.3 Models are also defined relative to a set of assumption contexts Σ;
they are 4-tuples
⟨Σ, S, T, V⟩
where
• Σ is a set of assumption contexts;
• S is a set of states;
• T ⊆ S × S is the transition relation;
• V : (Σ × S) −→ 2L is the labelling function.
Given a denumerable number of assumption contexts c1 , c2 , . . ., V assigns a set of L sentences to each state s for each assumption context ci∈N . We can view the state s as the
collection of all such sets of sentences, as in the following diagram:
[Diagram: the state s pictured as a collection of the sets V(c1, s), V(c2, s), V(c3, s), V(c4, s), one for each assumption context.]
3. This makes it easier to extend the single-agent case presented here to a multi-agent scenario, by adding functions V1, . . . , Vn for n agents.
Note that this is not to say that a state s is a set of sets of sentences V(c1, s), V(c2, s), . . ..
States are, as usual, primitive logical points, to which sets of sentences are assigned; the
diagram merely gives an intuitive way to visualize the sets of sentences assigned to pairs of states and assumption contexts. The support relation is defined as in section 5.3, except that
the clause for Bα is replaced by the more general clause:
M, s [c]α iff α ∈ V(c, s)
As before, these models need to be restricted to capture step-by-step reasoning in
resource bounded agents. Firstly, the labelling function V has to be restricted to capture
the way in which assumption contexts are related. In particular:
• for all assumption contexts c, α holds in the assumption context cα; and
• whatever holds in an assumption context c holds in the assumption context cα.
Secondly, the transition relation T is restricted to capture the step-by-step process of inferring new sentences from old using the agent’s inference rules. An extension of a state adds
just one formula to just one assumption context:4
Definition 47 (Extension of a state) Let c ∈ Σ be an assumption context, λ an L formula and s, u ∈ S.
Then u is a c-extension of s by λ when:
i. V(c, u) = V(c, s) ∪ {λ};
ii. for all assumption contexts cc′ ∈ Σ, V(cc′, u) = V(cc′, s) ∪ {λ}; and
iii. for all assumption contexts c′ ∈ Σ, if c′ ≠ c and c′ ≠ cc′′ for any c′′ ∈ Σ, then V(c′, u) =
V(c′, s).
When a formula is added to an assumption context c, it is also added to all assumption
contexts cc′ , where cc′ is a sequence of formulae whose initial segment is c. This corresponds
to the fact that whatever is written within the scope of an assumption that α in a natural
deduction proof can also be written later in the proof within the scope of an assumption
that β, so long as the latter assumption is within the scope of the former assumption.
4. This is similar to the definition of an extension in the case of multi-agent models, with agent replaced by assumption context.
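The copying behaviour required by definition 47 can be illustrated with a small sketch (Python; the representation of a labelling as a dictionary from context tuples to sets of formulae is an assumption made purely for illustration):

# Illustrative only: a labelling maps assumption contexts (tuples of
# formulas) to the sets of formulas holding in them.

def c_extend(labelling, c, formula):
    """Definition 47: the c-extension by `formula` adds it to context c and
    to every context cc' that has c as an initial segment; all other
    contexts are left unchanged."""
    return {ctx: fs | {formula} if ctx[:len(c)] == c else set(fs)
            for ctx, fs in labelling.items()}

# Adding q to the context of assuming p also adds it to the context of
# assuming p and then r, but not to the zero-assumption context ().
V = {(): {'p'}, ('p',): {'p'}, ('p', 'r'): {'p', 'r'}}
print(c_extend(V, ('p',), 'q'))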
T is then restricted to pairs of states s, u where either no inference is possible at s
(in which case both s and u are terminating states: see definition 48 below), or else the latter
extends the former by some formula λ. In the latter case, the agent can infer λ from its beliefs
at s (these restrictions are captured in conditions A2 and A3 below, which correspond to
conditions S1-3 in section 5.3). The ways in which an agent may make such inferences are
more involved than the rule-based and Hilbert-style reasoners that have been encountered
so far, for the formulae previously derived in one assumption context may be used to derive
a new formula in a distinct assumption context.
In general, an agent will have introduction and elimination rules for each connective in its internal language,5 each of which requires a condition entailing Tsu for
appropriate states s, u and a separate condition restricting T to appropriate states. Writing
down all the conditions on a class of models can then become cumbersome. It is therefore
useful to have a general way of specifying a class of models. This can be done by encoding
all admissible one-step inferences in a set R (which plays a similar rôle to R in the case
of rule-based agents and I in the case of the Hilbert-style axiomatic reasoners discussed
above). R encodes two types of inference. For inference rules that require no assumptions
to be made, R contains pairs (X, α) where X is a set of admissible inputs to the inference rule
and α is the corresponding output. An instance might be read as ‘infer α from the members
of X.’
For rules that require an assumption that α to be made, R contains triples (α, X, β).
Intuitively, these triples say, ‘if all members of X follow from assuming α, then infer β.’ By
way of example, (p, {q}, p → q) encodes an admissible inference for an agent who uses →
introduction. This rule means that p → q may be inferred in an assumption context c if q has
previously been inferred in the assumption context cp. In general, if all elements of X label
an assumption context, there will be a next state extending the current state appropriately;
otherwise, the current state is a terminating state:
Definition 48 (Terminating states) For any state s, s is a terminating state iff, for any assumption context c ∈ Σ, there is no triple (α, X, β) or pair (X, β) such that X ⊆ V(c, s) and β ∉ V(c, s).
Some examples of sets of inference rules encoded by R are as follows.
5. At least, if we want to model a functionally complete reasoner.
Example 1 A functionally complete classical reasoner whose language only contains →
and ⊥ can be modelled by adding all instances of the following schemes over the agent’s
internal language L to R (the annotation beside each scheme shows the natural deduction
inference rule it corresponds to):
• (α, {β}, α → β): → introduction
• ({α, α → β}, β): modus ponens
• ({(α → ⊥) → ⊥}, α): ¬¬ elimination
• ({⊥}, α): ⊥ elimination
This agent only makes assumptions to introduce implications; it then treats α → ⊥
as the negation of α.
Example 2
To model an agent who deals with ¬ introduction and elimination directly, R
would contain all instances of:
• (α, {β}, α → β): → introduction
• ({α, α → β}, β): modus ponens
• (α, {β, ¬β}, ¬α): ¬ introduction
• ({¬¬α}, α): ¬¬ elimination
Example 3 Conjunction can be added to either of the above systems by adding all instances of:
• ({α, β}, α ∧ β): ∧ introduction
• ({α ∧ β}, α): ∧ left-elimination
• ({α ∧ β}, β): ∧ right-elimination

Example 4 Similarly, disjunction can be accommodated by adding all instances of:
• ({α}, α ∨ β): ∨ introduction
• ({α ∨ β, α → γ, β → γ}, γ): ∨ elimination
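As an illustration only (no such code forms part of the framework), the following sketch shows how the schemes of example 2 determine the admissible one-step inferences for a single context c, given the formulae labelling c and the formulae already derived in its child contexts cα. The formula encoding and the function name are hypothetical.

# Formulas are nested tuples: ('not', a) or ('->', a, b); atoms are strings.

def example2_inferences(context, child_contexts):
    """context: set of formulas labelling an assumption context c.
    child_contexts: dict mapping an assumed formula alpha to the set of
    formulas labelling the context c·alpha.
    Returns the formulas that the pairs/triples of example 2 allow to be
    added to c in a single step."""
    new = set()
    # Pair ({alpha, alpha -> beta}, beta): modus ponens within c.
    for f in context:
        if isinstance(f, tuple) and f[0] == '->' and f[1] in context:
            new.add(f[2])
    # Pair ({not not alpha}, alpha): double negation elimination within c.
    for f in context:
        if isinstance(f, tuple) and f[0] == 'not' \
                and isinstance(f[1], tuple) and f[1][0] == 'not':
            new.add(f[1][1])
    # Triples: conclusions discharged from the child context c·alpha.
    for alpha, derived in child_contexts.items():
        for beta in derived:
            new.add(('->', alpha, beta))      # (alpha, {beta}, alpha -> beta)
            if ('not', beta) in derived:
                new.add(('not', alpha))       # (alpha, {beta, not beta}, not alpha)
    return new - context   # only genuinely new beliefs trigger a transition

# E.g. having derived q under the assumption p licenses p -> q in c.
print(example2_inferences({'p'}, {'p': {'q'}}))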
Given some set R, the subclass AR of A contains a model M iff M ∈ A and M satisfies the
following conditions:
A1 For all s ∈ S and c ∈ Σ:
i. α ∈ V(cα, s); and
ii. α ∈ V(cβ, s) if α ∈ V(c, s).
A2 For all s ∈ S and c ∈ Σ:
i. if there is a triple (α, X, β) ∈ R and X ⊆ V(cα, s) but α → β ∉ V(c, s), then there is
a u ∈ S such that Tsu and u c-extends s by α → β;
ii. if there is a pair (X, α) ∈ R and X ⊆ V(c, s) but α ∉ V(c, s), then there is a u ∈ S
such that Tsu and u c-extends s by α;
iii. if s is terminating, then there is a u ∈ S such that Tsu and V(c, u) = V(c, s).
A3 For all states s, u ∈ S, Tsu only if at least one of the following holds:
i. for some c ∈ Σ and some triple (α, X, β) ∈ R, X ⊆ V(cα, s), α → β ∉ V(c, s) and u
c-extends s by α → β;
ii. for some c ∈ Σ and some pair (X, α) ∈ R, X ⊆ V(c, s), α ∉ V(c, s) and u c-extends s
by α;
iii. s is terminating and, for all c ∈ Σ, V(c, u) = V(c, s).
7.4 Investigating the Class A
Having given a class A of models of assumption-making resource bounded agents in the
previous section, this section aims to establish just what properties this class possesses and,
in particular, how many of the properties of the classes S and M are shared by A.
7.4.1 Embedding the Class S
First, it is relatively easy to embed S and M within A; that is, rule-based agents can be
captured in this general framework. Again, for simplicity, only a single agent is considered
and so I only discuss S here. To do so, we need to consider a common ascription language
MLΣ (recall that the ascription language for rule-based agents in chapter 5 does not contain
any of the assumption modalities [c]). Let the set of assumption contexts Σ be the singleton
{ǫ} (note that, by definition, this is as small as a set of assumption contexts can be) and let
MLǫ be the corresponding ascription language, containing the modalities 3 and [ǫ] only.
In section 5.3, formulae of the form Bα are dealt with directly in the semantics, whereas B
is introduced by definition here. Since this is merely a trivial syntactic difference, let φ∗ be
the result of replacing any occurrence of [ǫ] in φ with B, for any MLǫ formula φ. Clearly,
φ ↔ φ∗ is A-valid. Now fix a program R and let R be the set
{ ({λ1 , . . . , λn , (λ1 , . . . , λn ⇒ λ)}, λ) | (λ1 , . . . , λn ⇒ λ) ∈ R }.
R encodes the inference rule 'from λ1 , . . . , λn and (λ1 , . . . , λn ⇒ λ), infer λ' that lies behind rule-based inference. Then the following holds:
Theorem 24 Let MLǫ be as above, R be a program and R be defined in terms of R as just described.
For any MLǫ formula φ, if φ∗ is SR -satisfiable then φ is AR -satisfiable.
Proof: Assume that φ∗ is satisfied at a state v in a model M = ⟨S, T, V⟩ ∈ SR. Define a model
M′ = ⟨{ǫ}, S, T, V′⟩ with V′(ǫ, s) = V(s) for all s ∈ S. Clearly, M′, v φ∗ and so M′, v φ. It
remains to show that M′ ∈ AR . A1 is trivially satisfied, since Σ = {ǫ}. The remaining cases
are as follows.
A2: A2i is trivially satisfied, as R only contains pairs. Suppose ({λ1 , . . . , λn , (λ1 , . . . , λn ⇒ λ)}, λ) ∈ R and {λ1 , . . . , λn , (λ1 , . . . , λn ⇒ λ)} ⊆ V(ǫ, s) for any state s ∈ S. Then, by definition of R, λ1 , . . . , λn ⇒ λ is s-matching in M and there is a state u ∈ S such that Tsu and u extends s by λ. Hence T′su and A2ii is satisfied. Else, s is terminating and so there is a state u ∈ S with Tsu and s ∼L u; in which case T′su and A2iii is satisfied.

A3: Again, A3i is trivially satisfied. Assume T′su. Then Tsu and either there is an s-matching rule ρ ∈ R in M and u extends s by cn(ρ), or else s is terminating in M and u ∼L s. In the former case, let ρ := λ1 , . . . , λn ⇒ λ; then ({λ1 , . . . , λn , (λ1 , . . . , λn ⇒ λ)}, λ) ∈ R and {λ1 , . . . , λn , (λ1 , . . . , λn ⇒ λ)} ⊆ V(ǫ, s); by the definition of V′, {λ1 , . . . , λn , (λ1 , . . . , λn ⇒ λ)} ⊆ V′(ǫ, s). It follows that A3ii is satisfied. In the latter case, s is terminating and, by the definition of V′, V′(ǫ, u) = V′(ǫ, s); hence A3iii is satisfied.
⊣
7.4.2 Capturing Natural Deduction Proofs
In this section, a class of models of a functionally complete natural deduction-style reasoner
is fixed and a number of results are given.
Definition 49 (Models of a Natural Deduction Reasoner) Let ND be the set of all triples and
pairs from example 2 above, encoding introduction and elimination rules for → and ¬, and set Σ to
be L∗ , i.e. all sequences of formulae in the agent’s language. Let ND ⊂ AND be the class of models
M over Σ satisfying:
i. if M is a tree model whose root is r, then V(cα, r) = V(c, r) ∪ {α};
ii. otherwise, M is bisimilar to a tree model in ND.
In a tree model M ∈ ND whose root is r, V(ǫ, r) acts as a set of premises Γ that the agent infers
further formulae from. Models in ND are in a sense minimal models of natural deduction
reasoning, given a set of (propositional) premises Γ. At the start of the model (i.e. at the
root, in tree models), only the elements of Γ are believed; and the only formulae that hold
in assumption contexts are those forced to hold there by condition A1. Below, I show
that a propositional formula α is derivable from a set of premises Γ in a system of natural
deduction if and only if α is believed, eventually, in the model M ∈ ND whose root is r such
that V(ǫ, r) = Γ.
First, I define a system of propositional natural deduction, which differs notationally from standard methods. Take L to be the propositional language in which proofs are
constructed, with → and ¬ the only primitive Boolean connectives. Proofs are constructed
line-by-line, either by writing a premise on the line, or by applying either modus ponens
or double-negation elimination to formulae written on previous lines of the proof, or by
starting a sub-proof, or by closing a sub-proof. To start a sub-proof, any L formula may be
written on the line; to close a sub-proof, a box is drawn round the sub-proof and a formula
β is written immediately beneath the box that has just been drawn. If this sub-proof has α
written on its first line and γ, ¬γ are written within the sub-proof, then β := ¬α. Otherwise,
if γ is the last line written in the sub-proof, then β := α → γ. If a formula α is written
in a box containing the sub-proof, it may be written within the sub-proof too. The entire
proof has a box drawn around it, labelled ‘ǫ’. If a sub-proof whose first line is α appears
immediately within a box labelled ‘c’, then the box around the sub-proof is labelled ‘cα’.
By way of example, the proof of α → (β → α) from no premises looks as follows:
[Boxed proof diagram: the outermost box, labelled 'ǫ', contains a box labelled 'ǫα' followed by the line α → (β → α); the box labelled 'ǫα' contains the line α, a box labelled 'ǫαβ' and the line β → α; the box labelled 'ǫαβ' contains the lines β and α.]
Here, the box labels appear to the right-hand side of their respective boxes. The proof has
the following line-by-line structure:
1. assume α;
2. assume β;
3. copy ‘α’ into the innermost sub-proof;
4. close the innermost sub-proof, deriving β → α;
5. close the remaining sub-proof, deriving α → (β → α).
Although the notation here is slightly unusual (sub-proofs do not usually require structural
labels on their boxes), this set-up is designed to mimic the branches of models in the class
ND. Call this system of deduction D, and let Γ ⊢D α mean that there is a proof of the sort
just described whose last line is α, written in the outermost ‘ǫ’ box, such that all premises
used in the proof are elements of Γ.
Lemma 13 Let Γ be a set of L formulae and M = ⟨Σ, S, T, V⟩ ∈ ND be a tree model whose root is
r, with V(ǫ, r) = Γ. Then for any L formula α and any c ∈ Σ, α appears within a box labelled ‘c’
in a proof in D whose premises are Γ iff M, r 3m [c]α for some m ∈ N, including the case m = 0,
i.e. M, r [c]α.
Proof: Note that, because ND is defined over Σ = L∗ , the agents being modelled can make
any assumption over L whatsoever.
only if direction:
Let P be a proof in D whose premises are Γ. The proof proceeds by
induction on the number of lines n of P up to and including the line on which α is written.
For the base case n = 1, either α ∈ Γ, in which case α is written in the box labelled 'ǫ' and
M, r [ǫ]α; or else α is an assumption, written in a box labelled 'ǫα'. By A1i, α ∈ V(ǫα, r)
and so M, r [ǫα]α. For the inductive hypothesis assume that, for all k < n, if β is written
on line k of P in a box labelled 'c′', then there is an m ∈ N such that M, r 3m [c′]β. Now
suppose that there is a formula α written on line n of P in a box labelled ‘c’. We need to
consider the following cases:
1. If α was inferred using modus ponens then there are formulae β → α, β written in
the box labelled 'c' on lines numbered k, k′ < n. Then the inductive hypothesis applies,
and so there are m, m′ ∈ N such that M, r 3m [c](β → α) and M, r 3m′ [c]β. There is
then a state s accessible from r such that {β → α, β} ⊆ V(c, s). Moreover, there is a pair
({β → α, β}, α) ∈ ND and so M, s 3[c]α. Hence, M, r 3j [c]α for j = max{m, m′} + 1.
2.
If α was inferred using double negation elimination, then ¬¬α is written on a line
k < n in P. By hypothesis, there is an m with M, r 3m [c]¬¬α and hence a state s accessible
from r with ¬¬α ∈ V(c, s). Moreover, there is a pair ({¬¬α}, α) ∈ ND and hence M, s 3[c]α.
Then M, r 3m+1 [c]α.
3.
If α was inferred using →-introduction, then α is of the form α1 → α2 and α2 is
written in a box labelled ‘cα1 ’ on line n − 1 of P. By hypothesis, M, r 3m [cα1 ]α2 for
some m and so there is a state s accessible from r such that α2 ∈ V(cα1 , s). There is also
a triple (α1 , {α2 }, α1 → α2 ) ∈ ND and so, by A2i, M, s 3[c](α1 → α2 ). It follows that
M, r 3m+1 [c](α1 → α2 ).
4.
If α was inferred using ¬-introduction, then α is of the form ¬β and there are formulae
γ, ¬γ written within a box labelled ‘cβ’ at lines k, k′ < n in P. By a similar argument to the
previous case, there is a state s accessible from r such that {γ, ¬γ} ⊆ V(cβ, s) and a triple
(β, {γ, ¬γ}, ¬β) ∈ ND; hence M, s 3[c]¬β and so there is an m ∈ N such that M, r 3m [c]¬β.
5.
Finally, α may have been copied into the box labelled ‘c’ from elsewhere in P; then α
is written on a line k < n in P in a box that contains the box labelled ‘c’. Say that this outer
box is labelled ‘c0 ’: then there is a sequence of formulae α1 · · · αn such that c = c0 α1 · · · αn .
By hypothesis, there is an m ∈ N such that M, r 3m [c0 ]α and, by condition A1ii, we also
have:
M, r 3m [c0 α1 ]α
M, r 3m [c0 α1 α2 ]α
. . .
M, r 3m [c0 α1 · · · αn ]α
and hence M, r 3m [c]α.
if direction:
The proof is by induction on the number of transitions n from the root r of
M to s. As a base case take n = 0 and assume that M, r [c]α for some c = α1 · · · αn ∈ Σ. By
the definition of the class ND, either α ∈ Γ, in which case it can be written in the outermost
box labelled 'ǫ' in a proof P, or else α = αi for some i ≤ n, in which case we can start a proof and open
assumptions α1 through to αn . The innermost box will be labelled ‘ǫα1 · · · αn ’ and, since it
is contained in a box labelled ‘ǫα1 · · · αi ’, we can copy αi into the box labelled ‘ǫα1 · · · αn ’.
For the induction hypothesis assume that, for all k < n, if M, r 3k [c]α then there
is a proof P in the system D whose premises are Γ, in which α is written in a box labelled
'c'. Now suppose that M, r 3n [c]α, i.e. there are states s, u such that Tn−1 rs, Tsu and
M, u [c]α. If α ∈ V(c, s) then, by hypothesis, we have the desired result. If α ∉ V(c, s), on
the other hand, then α ∉ c; so either (i) there is a pair (X, α) ∈ ND with X ⊆ V(c, s), or else (ii)
there is a triple (β, Y, α) ∈ ND with Y ⊆ V(cβ, s). In case (i), the inductive hypothesis applies
and so there is a proof P with all elements of X written in a box labelled 'c'. Moreover,
either X = {β, β → α}, or else X = {¬¬α}. Either way, P may be extended by writing α in
the box labelled 'c' (using modus ponens or ¬¬-elimination, respectively). In case (ii), the
inductive hypothesis applies and so there is a proof P with all elements of Y written
in a box labelled 'cβ'. Either Y = {γ} and α := β → γ, or else Y = {γ, ¬γ} and α := ¬β.
Either way, P may be extended by writing α in the box labelled 'c' (using →-introduction
or ¬-introduction respectively).
⊣
As an immediate consequence of this result, we have the following:
Theorem 25 For all α ∈ L and Γ ⊆ L, Γ ⊢D α iff there is a tree model M ∈ ND whose root is r and
V(ǫ, r) = Γ such that M, r 3n Bα for some n ∈ N.
Proof:
This is just a special case of lemma 13, for Bα is defined as [ǫ]α and Γ ⊢D α when
there is a proof P in the system D whose premises are Γ and α is written in the box labelled
‘ǫ’.
⊣
7.4.3 Properties of AR
In this section, I show that some of the interesting results that were shown to hold of the
class S in chapter 5, also hold of the class A.
Definition 50 (Label Identity) Let R be any set of pairs (X, α) and triples (α, X, β) and Σ a set of
assumption contexts over L. Given models M = ⟨Σ, S, T, V⟩, M′ = ⟨Σ, S′, T′, V′⟩ ∈ AR and states
s ∈ S, s′ ∈ S′, s ∼L s′ holds with respect to M, M′ iff, for all c ∈ Σ, V(c, s) = V′(c, s′).

Theorem 26 Let R be any set of pairs (X, α) and triples (α, X, β) and Σ a set of assumption contexts
over L. Given any models M = ⟨Σ, S, T, V⟩, M′ = ⟨Σ, S′, T′, V′⟩ ∈ AR and states s ∈ S, s′ ∈ S′,
s ∼L s′ iff s ≡ s′.

Proof: Clearly, s ≡ s′ implies s ∼L s′. The proof that s ∼L s′ implies M, s φ iff
M′, s′ φ (from which the result follows by definition) is by induction on the complexity of
φ. The base case, in which φ := [c]α, is trivial. So assume as the inductive hypothesis, for
all states v ∈ S, v′ ∈ S′ and all formulae ψ of complexity less than φ, that v ∼L v′ implies
M, v ψ iff M′, v′ ψ. The Boolean cases are simple, so take φ := 3ψ. We have M, s 3ψ
and so there is a state u ∈ S such that Tsu and M, u ψ. Given A3, there are three cases i–iii
to consider.

A3i: There is a c ∈ Σ and a triple (α, X, β) ∈ R with X ⊆ V(cα, s), V(c, u) = V(c, s) ∪ {α → β}
and, for all c′ ≠ c ∈ Σ, V(c′, u) = V(c′, s). Since s ∼L s′, X ⊆ V′(cα, s′) and so there is a state
u′ ∈ S′ with T′s′u′, V′(c, u′) = V′(c, s′) ∪ {α → β} and, for all c′ ≠ c ∈ Σ, V′(c′, u′) = V′(c′, s′); hence
u ∼L u′. By hypothesis, M′, u′ ψ and so M′, s′ 3ψ.

A3ii: There is a c ∈ Σ and a pair (X, α) ∈ R with X ⊆ V(c, s), V(c, u) = V(c, s) ∪ {α} and,
for all c′ ≠ c ∈ Σ, V(c′, u) = V(c′, s). Since s ∼L s′, X ⊆ V′(c, s′) and so there is a state
u′ ∈ S′ such that T′s′u′, V′(c, u′) = V′(c, s′) ∪ {α} and, for all c′ ≠ c ∈ Σ, V′(c′, u′) = V′(c′, s′). Hence
u ∼L u′ and so, by hypothesis, M′, u′ ψ. It follows that M′, s′ 3ψ.

A3iii: s is a terminating state and s ∼L u. Then s′ must also be terminating, so there is a
state u′ ∈ S′ with T′s′u′ and u′ ∼L s′. Since ∼L is an equivalence relation, it follows that
u ∼L u′. By hypothesis, M′, u′ ψ and so M′, s′ 3ψ.
⊣
Theorem 27 Let R, Σ and M, M′ ∈ AR be as above. For any states s ∈ S, s′ ∈ S′, s ≡ s′ iff s ⋍ s′.

Proof: s ⋍ s′ implies s ≡ s′ but the converse remains to be shown. Assume that s ≡ s′
and Tsu for some u ∈ S. Since Tsu, either A3i or ii applies. Assume the former; then there
is a c ∈ Σ and a triple (α, X, β) ∈ R with X ⊆ V(cα, s) and u c-extends s by α → β. By the
argument used in the previous proof, there is a u′ ∈ S′ with T′s′u′ and u ∼L u′ and hence, by the
previous result, u ≡ u′. If A3ii applies, on the other hand, then there is a c ∈ Σ and a pair
(X, α) with X ⊆ V(c, s) and u c-extends s by α. Again, by the argument used above, there is
a u′ ∈ S′ such that T′s′u′ and u′ ∼L u; hence u ≡ u′.
⊣
These two results show that the relationship between the way states are labelled,
the formulae supported there, and bisimulation also holds in the case of full propositional
reasoners. It is not a result that depends on attention being restricted to rule-based agents;
rather, it arises whenever the transition relation between states captures atomic inferences.
Models in A also have the belief convergence property.
Theorem 28 (Belief Convergence) Let R and Σ be as above. For any M = ⟨Σ, S, T, V⟩ ∈ AR,
r ∈ S and any n ∈ N, if Tn rs and Tn ru, then there is a state s′ reachable from s and a u′ reachable from
u such that s′ ∼L u′.

Proof: The proof proceeds in much the same way as the proof of theorem 8 (chapter
5). Assume that M is a tree model whose root is r. Then there is a sequence of states
s0 · · · sn with s0 = r, sn = s and, for each i < n, Tsi si+1. For each k < n, either sk is
terminating; or else there is a c ∈ Σ and a set X such that either there is a triple (α, X, β) ∈ R
and sk+1 c-extends sk by α → β, or there is a pair (X, α) ∈ R and sk+1 c-extends sk by α.
The transitions from r to s can be mimicked from u, reaching a state u′ such that, for any c ∈ Σ,
V(c, u′) = V(c, u) ∪ (V(c, s) − V(c, r)) = V(c, r) ∪ (V(c, u) − V(c, r)) ∪ (V(c, s) − V(c, r)). By
applying the same reasoning, a state s′ is reachable from s such that, for any c ∈ Σ,
V(c, s′) = V(c, s) ∪ (V(c, u) − V(c, r)) = V(c, r) ∪ (V(c, s) − V(c, r)) ∪ (V(c, u) − V(c, r)). Hence,
s′ ∼L u′.
⊣
It would be interesting to investigate whether these results hold in the non-monotonic case
(see section 8.2.2 below); this is left for future work.
7.5 Adding Temporal Modalities
So far, the 3 modality (and its dual 2) has been used to describe what the agent can (and
must) believe after the next rule is fired. These modalities may be chained to express what
an agent can (and must) believe after a certain specified number of cycles of inference.
In general, it is not possible to express the beliefs that an agent could (or must) have at
some indefinite point in the future using these modalities. An additional modality relating
to what will eventually hold is required to express such facts; such a modality cannot
be defined in the existing language. Since time is branching in the present account, the
additional modalities must also distinguish between those properties that may eventually
hold and those that must hold, sooner or later.
Computation Tree Logic (CTL) provides just such modalities. They
are complex modalities, consisting of a temporal part prefixed by a path quantifier:
• EF φ: on some branch, φ will eventually be true
• EGφ: on some branch, φ will always be true
• EXφ: on some branch, φ will be true at the next step
• EφUψ: on some branch, φ will be true until ψ becomes true
• AF φ: on all branches, φ will eventually be true
• AGφ: on all branches, φ will always be true
• AXφ: on all branches, φ will be true at the next step
• AφUψ: on all branches, φ will be true until ψ becomes true
Note that U is a binary operator, but is usually written with infix rather than prefix notation.
Since each sequence of states in a model intuitively represents a possible future, E-formulae
may be read as ‘in some possible future . . . ’ and A-formulae as ‘in all possible futures . . . ’.
The 3 modality used in the language ML above thus corresponds to the EX modality: it
says what will hold at the next step on some branch (in the same way, 2 corresponds to
AX).
Adding EF , on the other hand, does alter the logic; EF essentially works as a
provability modality for the agent’s logic. If EF Bα is valid in a class of models C ⊆ A (in
the language extended by temporal operators), then α is provable in the internal logic of
the agent modelled by C. As a consequence, if the agent’s logic is undecidable, then so is
the logic of the corresponding subclass of A.
Semantics is given by treating the branches of a model as first-order entities
(branches are sometimes called runs or computations in CTL). A branch θ is an infinite
sequence of states s0 s1 · · · such that, for each n ∈ N, T sn sn+1; 'θ[i]' denotes si. Let the
set of all branches over some relation T be B, and let B : S −→ 2B be the function such that
B(s) = {θ ∈ B | θ[0] = s}, i.e. B(s) is the set of branches starting from the state s. Then semantics for
the new compound modalities is given as follows:
M, s EXφ iff ∃θ∈B(s) M, θ[1] φ
M, s EF φ iff ∃θ∈B(s) ∃i∈N M, θ[i] φ
M, s EGφ iff ∃θ∈B(s) ∀i∈N M, θ[i] φ
M, s EφUψ iff ∃θ∈B(s) ∃i∈N ∀ j<i∈N M, θ[j] φ and M, θ[i] ψ
M, s AXφ iff ∀θ∈B(s) M, θ[1] φ
M, s AF φ iff ∀θ∈B(s) ∃i∈N M, θ[i] φ
M, s AGφ iff ∀θ∈B(s) ∀i∈N M, θ[i] φ
M, s AφUψ iff ∀θ∈B(s) ∃i∈N ∀ j<i∈N M, θ[j] φ and M, θ[i] ψ
Given the classes S, M, H and A and their corresponding logics, it is then easy to
add these CTL modalities. The definitions of and restrictions on models remain the same
in each case: extra cases of well-formed formulae and the extra satisfaction clauses are all
that need to be added. The EF modality provides a way to express the beliefs that an agent
would eventually derive, if all time and memory restrictions on its reasoning were lifted.
In the case of full propositional reasoners, EF tells us what an unbounded agent
who can make any assumption whatsoever (or use any instance of a Hilbert axiom) could
eventually derive. Thus, Γ ⊢D α if and only if there is a model M ∈ ND whose root r has V(ǫ, r) = Γ
and M, r EF Bα. As a special case, α is a propositional tautology iff ND EF Bα. In
this way, EF corresponds to the dynamic modality ⟨F⟩ of Ho's dynamic epistemic logic,
discussed in section 4.3.4.
One advantage of incorporating CTL syntax into the logic (in addition to the
expressiveness thereby gained) is that it opens up the possibility of using automated
model checking techniques to verify properties of the modelled agents. In the case of rule-based agents, for example, a programmer developing a rule-based agent could use model
checking software to automatically verify whether her program satisfies certain desirable
criteria. While no model checker exists for the current framework at present, developing
one is simply a matter of encoding the restrictions appropriate to T and V in the class under
investigation in an existing model checker. This is left for future work.
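As a reminder of why such an encoding is plausible, the following toy sketch (Python; not a model checker for the framework, and the state and transition representations are assumptions of mine) checks EF φ on an explicit finite model by simple reachability along T:

# A toy illustration: on an explicit finite model, EF phi is just
# reachability along T from a state to some state satisfying phi.
from collections import deque

def satisfies_EF(transitions, phi_states, start):
    """transitions: dict mapping a state to an iterable of T-successors.
    phi_states: set of states where phi holds.
    Returns True iff some state satisfying phi is reachable from start
    (including start itself), i.e. start satisfies EF phi."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if s in phi_states:
            return True
        for u in transitions.get(s, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return False

# Example: 0 -> 1 -> 2, and B-alpha holds only at state 2.
T = {0: [1], 1: [2], 2: [2]}
print(satisfies_EF(T, {2}, 0))   # True: EF B-alpha holds at state 0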
Chapter 8
8.1 Summary of the Thesis
In chapters 1 and 2, I established that logical omniscience is a genuine problem for epistemic
logic and that all of the approaches that seek to avoid logical omniscience within the
possible worlds framework fail to do so. Logical omniscience, in some form or another,
is always present in such accounts. The minimal possible worlds account, which uses
Scott-Montague neighbourhood semantics, models belief as a relation between an agent
and a proposition (treated as a set of possible worlds). However, belief is not best captured
in terms of propositions. As I argued in chapter 3, classifying an agent’s belief states in
terms of sentences is to be preferred. Thus, sentential (or ‘syntactic’) logics are an ideal
choice of a logic for modelling belief. They are the philosophically motivated account of
belief, whether we are dealing with ideal or resource bounded agents.
Throughout chapters 5–7, I have developed a logical framework for modelling
resource bounded agents. I began by considering rule-based agents, but extended the
account to cover full propositional reasoners in chapter 7. A general objection to
epistemic logics that take the syntactic or sentential approach to modelling knowledge or
belief is that they merely give us “ways of representing knowledge [and belief] rather than
modelling knowledge [and belief]”. If so, the thought runs, “[o]ne gains very little intuition
about knowledge [or belief] from studying syntactic structures” [FHMV95, p.320]. This is
because the syntactic approach, it is claimed, “lacks the elegance and intuitive appeal of the
semantic [possible worlds] approach” [FH88, p.40]. I have shown that the logic proposed
here has many interesting and useful properties. A number of these are due to the logic
being a modal logic: the modal approach can be used without recourse to the possible
worlds analysis of belief. The interpretation of transitions in models as the agent’s acts of
atomic inference provides a number of other useful properties. I believe that the logic I
have developed, far from lacking the “elegance and intuitive appeal” of the possible worlds
approach, is intuitive, philosophically well-motivated and will be useful to those working
in artificial intelligence and computer science, who need to produce realistic models of
real-world, resource-bounded agents.
8.2 Future Work
8.2.1 Embedding into Alternating Time Logic
There are extensions of CTL that increase its expressiveness. CTL∗ allows the temporal
parts of the CTL modalities, F , G, X and U, to occur separately from the path quantifiers
E, A. The resulting logic is more complex, but the semantics is more or less the same. The
extension I want to consider very briefly here is the yet more expressive alternating time
logic or ATL [AHK02]. The syntax of ATL contains group modalities ⟨⟨G⟩⟩ for some subset
of agents G, in place of the CTL path quantifiers E and A. Thus, ⟨⟨G⟩⟩X, ⟨⟨G⟩⟩F, ⟨⟨G⟩⟩U are
all ATL modalities. The interpretation of ⟨⟨G⟩⟩φ is that the agents in the group G can
cooperate to achieve φ. If A is the set of all agents under consideration, then ⟨⟨A⟩⟩ acts as
the existential and ⟨⟨{}⟩⟩ as the universal path quantifier.
Semantics for ATL is given in terms of concurrent game structures that, despite
the superficial differences to Kripke structures, provide a very natural interpretation of the
notions of rule selection and firing discussed above. I will give only a very brief sketch of
how this might be done for n rule-based agents A whose programs are R = {R1 , . . . , Rn }.
First, a very brief introduction to the game semantics used by ATL is needed. For each
agent i ∈ A, mi : S −→ N assigns a natural number to agent i at each state and the set
{0, . . . , mi(s)} is the set of moves available to i at s. Combining the available moves for all
agents, m⃗(s) = {0, . . . , m1(s)} × · · · × {0, . . . , mn(s)} is the set of move vectors at s. The transition
function δ : S × m⃗ −→ S takes a state s and a move vector m⃗ ∈ m⃗(s) and returns the next
state, δ(s, m⃗). Intuitively, if the agents 1, . . . , n choose moves m1, . . . , mn at a state s, then
δ(s, (m1, . . . , mn)) is the next state of the system.
A strategy σi : S −→ N for agent i picks a move for i at each state s ∈ S. Given a
group of agents G ⊆ A, σ⃗G is a strategy vector for the group G. The set of all such strategy
vectors for G is denoted ΣG. A computation is an infinite sequence of states s0 · · · sn · · ·. For
a computation c = s0 · · ·, 'c[i]' denotes si. Given a group of agents G and a state s, each
strategy vector σ⃗G ∈ ΣG induces a set of outcomes, out(s, σ⃗G), from s. For a group G of n
agents, a computation c is in out(s, σ⃗G) iff:
i. c[0] = s; and
ii. for every integer k ≥ 0, there is a move vector m⃗ = (m1, . . . , mn) ∈ m⃗(c[k]) such that
mi = σi(c[k]) for each agent i ∈ G and c[k + 1] = δ(c[k], m⃗).
Let M be a structure ⟨A, S, {Vi, mi}i∈A, δ⟩ of the kind just described (S is a set of states and the Vi
are labelling functions) and let the quantifier phrase Q(s, c) abbreviate ∃σ⃗G ∈ ΣG ∀c ∈ out(s, σ⃗G).
The support relation holding between a model, a state and a formula is defined as follows:
M, s ⟨⟨G⟩⟩Xφ iff Q(s, c) (M, c[1] φ)
M, s ⟨⟨G⟩⟩Fφ iff Q(s, c) ∃n∈N (M, c[n] φ)
M, s ⟨⟨G⟩⟩Gφ iff Q(s, c) ∀n∈N (M, c[n] φ)
M, s ⟨⟨G⟩⟩φUψ iff Q(s, c) ∃n∈N (M, c[n] ψ & ∀k<n (M, c[k] φ))
Now these notions can be adapted to the specific case of rule-based agents. Moves
are associated with choosing and firing a particular matching rule. For an agent i and a
state s, let ρ1, . . . , ρmi(s) be an enumeration of the i-s-matching rules. Given a strategy σi ∈ σ⃗G
for an agent i ∈ G, i fires the rule ρσi(s) at state s (taking σi(s) = 0 to mean that i fires no rule).
Recall that a transition from one state to
another represents exactly one agent firing one rule. A condition that must be imposed on
any move vector m⃗ ∈ m⃗(s) for any state s is that, if agent i chooses a move mi > 0 then mj = 0
for each j ≠ i, j ≤ n. The transition function δ is then defined as expected: if δ(s, m⃗) = u
and m⃗ = (0, . . . , k, . . . , 0) where k > 0 is in the ith position in m⃗, then u i-extends s by the
consequent of the kth i-s-matching rule.
This is a very preliminary sketch of how game semantics could be given for
rule-based agents, but the formulation does appear to be natural and so constitutes an
interesting direction for future work to take.
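For concreteness, here is a rough sketch (Python; the rule and state representations are my own assumptions, not the thesis's definitions) of the move and transition structure just described: agent i's moves at a state are 0 (fire nothing) or the index of an i-s-matching rule, at most one agent may choose a non-zero move, and δ fires the chosen rule.

def matching_rules(beliefs, program):
    """Rules are (premises, consequent) pairs; here a rule is taken to match
    when its premises are believed and its consequent is not yet believed."""
    return [r for r in program if set(r[0]) <= beliefs and r[1] not in beliefs]

def moves(state, i, programs):
    """Moves available to agent i at `state` (a tuple of belief sets):
    0 (fire nothing), or 1..k for the k i-state-matching rules."""
    return range(len(matching_rules(state[i], programs[i])) + 1)

def delta(state, move_vector, programs):
    """Transition function: at most one agent fires one matching rule."""
    firing = [(i, m) for i, m in enumerate(move_vector) if m > 0]
    assert len(firing) <= 1, "at most one agent may fire a rule per step"
    if not firing:
        return state
    i, m = firing[0]
    premises, consequent = matching_rules(state[i], programs[i])[m - 1]
    new = list(state)
    new[i] = state[i] | {consequent}     # i-extend the state by the consequent
    return tuple(new)

# One agent whose program has the single rule with premise p and consequent q.
programs = [[(('p',), 'q')]]
s = (frozenset({'p'}),)
print(list(moves(s, 0, programs)))       # [0, 1]
print(delta(s, (1,), programs))          # (frozenset({'p', 'q'}),)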
8.2.2 Non-monotonicity
The assumption running throughout this work has been that agents are monotonic reasoners. The task was to model resource bounded reasoning and the assumption of monotonicity has allowed the discussion to focus on that issue. However, non-monotonic reasoning
is important in many areas of AI: see [Gin94], for example. In fact, a good deal of practical
reasoning is non-monotonic. Makinson comments that “[n]onmonotonic reasoning is not
something strange and esoteric. In fact, almost all of our everyday reasoning is nonmonotonic; purely deductive, monotonic inference takes place only in rather special contexts,
notably pure mathematics” [Mak05, p.19]. Dropping the monotonicity requirement would
constitute a challenging development, but the payoff would be a logical framework with
much wider applications in AI and computer science.
Non-monotonic reasoning in rule-based systems can arise in a number of ways.
One is when certain conditions determine which rule should be fired in the next cycle.
Situations can arise in which ρ could be fired but would not be if the agent were to know
more information. For example, suppose the agent’s rules are ordered such that firing
a rule ρ′ (whose consequent differs from that of ρ) takes precedence over firing ρ. If ρ
matches but ρ′ does not because the agent lacks the relevant beliefs, then beliefs may be
derived from firing ρ that would not be derived were ρ′ a matching rule. The resulting
consequence relation is non-monotonic. Another route to non-monotonicity in rule-based
systems is to consider rules of the form
P1 , . . . , Pn ⇒ ∼Q
where ∼Q instructs the agent to remove Q from its working memory. Firing such a rule does
not lead to a new belief; but it can lead to the agent having one less belief.
A starting point is to amend the requirement that one state extends another when
there is a transition to the first from the second. Instead, we can define an amend operation
‘◦’ on 2L × L such that X ◦ p = X ∪ {p} and X ◦ ∼p = X − {p}. Then, whenever there is an
s-matching rule ρ, there is a state u such that Tsu and u amends s by cn(ρ). That is, the
transition relation must be constrained by (at least) the two following conditions:
1. whenever there is an s-matching rule ρ, there is a state u such that Tsu and V(u) =
V(s) ◦ cn(ρ).
2. whenever Tsu, either s is terminating and s ∼L u, or else there is an s-matching rule
ρ and V(u) = V(s) ◦ cn(ρ).
When a rule whose consequent is ∼p is fired, the resulting state is just like the previous
one, except that p does not feature in the latter (whereas it may or may not feature in the
former).
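A small sketch of the amend operation (Python; the representation of rules and of ∼p as a tagged tuple is assumed purely for illustration) shows how firing a rule with consequent ∼q removes q from the belief set while other consequents are added as before.

def amend(beliefs, consequent):
    """The amend operation: X ◦ p = X ∪ {p} and X ◦ ∼p = X − {p},
    with ∼p represented here as the tuple ('~', p)."""
    if isinstance(consequent, tuple) and consequent[0] == '~':
        return beliefs - {consequent[1]}
    return beliefs | {consequent}

def successors(beliefs, program):
    """Rules are (premises, consequent) pairs.  Whenever a rule matches
    (its premises are believed), there is a transition to a state labelled
    by the amended belief set (condition 1); if no rule matches, the
    state repeats, as in the terminating case of condition 2."""
    out = [amend(beliefs, c) for premises, c in program
           if set(premises) <= beliefs]
    return out or [beliefs]

# Firing the rule p ⇒ ∼q removes q from working memory.
print(successors(frozenset({'p', 'q'}), [(('p',), ('~', 'q'))]))
# [frozenset({'p'})]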
In this system, the order in which rules fire matters. Moreover, it is no longer the
case that if Γ entails φ then Γ ∪ {ψ} entails φ. It would be interesting to see which of the
properties discussed above hold of this logic; this is left for future work. This approach
covers the case in which beliefs are removed from an agent’s working memory. It does not
cover the case in which non-monotonicity arises by placing an order on the agent’s rules,
for example. Again, investigating this cause of non-monotonicity is left for future work.
8.2.3 Information and Epistemic Possibility
Having developed a framework for modelling resource-bounded reasoning, it is interesting
to investigate how far the ideas may be applied. One such application is the analysis of
epistemic possibility and information. Consider Bob, a famous mathematician who, until
a few years ago, did not know whether Fermat’s theorem was true or false: both were
epistemic possibilities for Bob because, as far as he could see, nothing he already believed
entailed (or strongly suggested) either that the theorem is true or that it is false. In other
words, Bob was uncertain as to the truth of the theorem because he is a resource-bounded
agent. What counts as an epistemic possibility is relative to an agent’s resource bounds.
Agents that observe exactly the same phenomena, give identical interpretations to their
observations and reason in precisely the same way as one another can nevertheless differ
on what they consider epistemically possible. A superhuman agent vastly more capable
of processing mathematical information than Bob might agree with Bob on all the facts
and share all his beliefs but nevertheless be able to rule out the possibility of Fermat’s last
theorem being false.
An epistemic possibility, then, is no more than an agent’s inability to detect a
conflict between her beliefs and what that possibility claims is the case. Let Γ be an arbitrary
set of sentences. An agent i is likely to hold that Γ may well be a correct description of
the world if she cannot find an explicit contradiction between the sentences in Γ and her
own beliefs. Let us take an explicit contradiction to be either a pair α, ¬α or a conjunction
α ∧ ¬α. Then, a set of sentences Γ is not epistemically possible for agent i if i can derive
either α, ¬α or α ∧ ¬α from Γ.
Suppose i is a natural deduction reasoner that can make any assumption she likes
and use all the usual natural deduction inference rules—that is, i’s reasoning is modelled
by models in the class A∗ND . Suppose M ∈ A∗ND contains a state s such that V(ǫ, s) = Γ and
that i's resource bound allows her to reason for n steps. If M, s 3n (Bi α ∧ Bi ¬α) or M, s 3n Bi (α ∧ ¬α) for any α, then i can detect an explicit contradiction in Γ (within her resource
bound) and hence should not consider Γ to be possible. If M, s 3n Bi α ∧ 3n Bi ¬α, on the
other hand, the matter is not yet settled: 3n Bi α ∧ 3n Bi ¬α
says that i can derive α from its current beliefs within its resource bound, and it can also
derive ¬α; but it leaves open whether the agent could derive both within its resource bound.
It is only when an agent could note an explicit contradiction in Γ within its resource bound
that Γ is said to be an epistemic impossibility for that agent.
This relation of epistemic possibility can be modelled by a relation Ri between
states. Let δi ∈ N be agent i's resource bound. Rather than restricting Ri to states that
are logically possible worlds, Ri is restricted to states s in a model M at which neither
3δi Bi (α ∧ ¬α) nor 3δi (Bi α ∧ Bi ¬α) holds, for any α. An epistemic modality E may then be introduced,
with:
M, s Eα iff there is a state u such that Ri su and M, u α
In [Jag06a], I use this idea to give an analysis of being informed that does not assume
that agents are logically omniscient. Traditional accounts of information1 assume that an
agent cannot be genuinely informed by the consequences of its information and so cannot
be informed by a tautology. If β is a logical consequence of α and i is informed that α, then i
has also been informed that β. Agents automatically have infinite amounts of information
(or rather, β does not count as information over and above α). This is unintuitive. I argue
that, for instance, being informed that α is a tautology (where α is in fact a tautology) may
well be informative, for example to an agent sitting her logic exam. The informative nature
of certain consequences of an agent’s current information can be modelled using the notion
of epistemic possibility I have just sketched. Agent i is informed that α (in the static sense
that i has the information that α) at state s iff α holds at u for all states u such that Ri su.
An information modality ‘I’ can then be introduced. So long as ‘information’ is
used in the inclusive sense in which information may be false, I is the dual of E.2 This
is a static notion of information: it captures the information that an agent possesses at a
moment in time. A dynamic notion of information, associated with the act of becoming
informed that β, is then modelled as an update on the relation Ri, restricting it to pairs
(s, u) such that u supports β.

1. See [VB03] for an overview and [Flo06] for a recent possible-worlds account of information.
2. [Flo05, Flo06] argue that information must be true by definition. On this view, the 'I' modality captures apparent information, which stands in the same relation to genuine information as belief stands to true belief.

The effect of a (true) public announcement that β in a society
of trusting agents will be that each agent that did not already have the information will
become informed that β. For these agents, who previously considered ¬β to be an epistemic
possibility, the announcement has the effect of ruling out, from their take on how the world
actually is, any states that do not support β. This is captured by updating the R relation associated
with these agents such that these states are no longer epistemically accessible.
To the agents that already had the information that β, the public announcement
is not informative in the slightest. This fact is already captured in the model, for updating
the R relation of these agents causes no change in the states that it relates. The model
equates the informative content of a sentence, for a particular agent a, with the effect of
updating Ra in the way described. An update that produces no change to Ra means that
the announcement was not informative to agent a in the slightest.
The above discussion, although brief, appears to be a philosophically well-motivated analysis of epistemic possibility, which in turn leads to a workable notion of
both static and dynamic information. An investigation of the particular properties of this
logic and an analysis of its scope in related areas is left for future work.
References
[ÅA06]
Thomas Ågotnes and Natasha Alechina. Semantics for dynamic syntactic epistemic logics. In P. Doherty, J. Mylopoulos, and C. Welty, editors, Proceedings
of the 10th International Conference on Principles of Knowledge Representation and
Reasoning, pages 411–419, 2006.
[AB75]
A. R. Anderson and N. D. Belnap. Entailment—the logic of relevance and necessity.
Princeton University Press, Princeton, NJ, 1975.
[ABG+ 06a] N. Alechina, P. Bertoli, C. Ghidini, M. Jago, B. Logan, and L. Serafini. Model
checking space and time requirements for resource-bounded agents. In Proceedings of the Fourth Workshop on Model Checking and Artificial Intelligence (MoChArt
06), 2006.
[ABG+ 06b] N. Alechina, P. Bertoli, C. Ghidini, M. Jago, B. Logan, and L. Serafini. Verifying space and time requirements for resource-bounded agents. In Proceedings
AAMAS 2006, 2006.
[AGM85] Carlos E. Alchourrón, Peter Gärdenfors, and D. Makinson. On the logic of theory
change: Partial meet contraction and revision functions. Journal of Symbolic Logic,
50:510–530, 1985.
[Ågo04]
Thomas Ågotnes. A Logic of Finite Syntactic Epistemic States. PhD thesis, Department of Informatics, University of Bergen, Norway, April 2004.
[AHK02] R. Alur, T. Henzinger, and O. Kupferman. Alternating-time temporal logic.
Journal of the ACM, 49:672–713, 2002.
[AJL06a] N. Alechina, M. Jago, and B. Logan. Modal logics for communicating rule-based
agents. In Proceedings of ECAI 06, 2006. To appear.
[AJL06b] Natasha Alechina, Mark Jago, and Brian Logan. Resource-bounded belief revision and contraction. In M. Baldoni, U. Endriss, A. Omicini, and P. Torroni,
editors, Declarative Agent Languages and Technologies III, Selected and Revised Papers, LNCS 3904, pages 141–154. Springer, 2006.
[ALW04a] N. Alechina, B. Logan, and M. Whitsey. A complete and decidable logic for
resource-bounded agents. In Proceedings of the Third International Joint Conference
on Autonomous Agents and Multi-Agent Systems (AAMAS 2004), pages 606–613.
ACM Press, July 2004.
[ALW04b] Natasha Alechina, Brian Logan, and Mark Whitsey. Modelling communicating
agents in timed reasoning logics. In proceedings of JELIA 04, pages 95–107, 2004.
[Arm97]
D.M. Armstrong. A World of States of Affairs. Cambridge University Press,
Cambridge, 1997.
[Arm04]
D.M. Armstrong. Truth and truthmakers. Cambridge University Press, 2004.
[Aus62]
J.L. Austin. How to Do Things With Words. Oxford University Press, Oxford,
1962.
[Bac06]
Kent Bach. Comparing Frege and Russell. http://userwww.sfsu.edu/∼kbach/
FregeRus.html, April 2006.
[Bar03]
Stephen Barker. Truth and conventional implicature. Mind, 112:1–33, 2003.
[Bar04]
Stephen Barker. Renewing Meaning: A Speech-Act Theoretic Approach. Oxford
University Press, 2004.
[Bar06]
Stephen Barker. An expressivist theory of truth. Manuscript, 2006.
[BdRV02] Patrick Blackburn, Maarten de Rijke, and Yde Venema. Modal Logic. Cambridge
University Press, New York, 2002.
[Bea06]
J.C. Beall. True, false and paranormal. Analysis, 66(2):102–113, 2006.
[Bel77]
N.D. Belnap. A useful four-valued logic. In J.M. Dunn and G. Epstein, editors,
Modern Use of Multiple-valued Logic. D.Reidel, Dordrecht, 1977.
[BP83]
J. Barwise and J. Perry. Situations and Attitudes. Bradford Books, MIT press, 1983.
[BPR01]
F. Bellifemine, A. Poggi, and G. Rimassa. Developing multi-agent systems with
a FIPA-compliant agent framework. Software Practice and Experience, 21(2):103–
128, 2001.
[Bra71]
R. Brady. The consistency of the axioms of abstraction and extensionality in a
three-valued logic. Notre Dame Journal of Formal Logic, 12:447–453, 1971.
[BRC06]
Business Rules community website, http://www.brcommunity.com/, accessed
13th March, 2006.
[Car47]
R. Carnap. Meaning and Necessity. University of Chicago Press, 1947.
[Che80]
B. Chellas. Modal logic : an introduction. Cambridge University Press, London,
1980.
[Chu84]
P. M. Churchland. Matter and Consciousness. MIT Press, Cambridge, Mass, 1984.
[Cor04]
Eros Corazza. Reflecting the Mind: Indexicality and Quasi-Indexicality. Oxford
University Press, 2004.
[CP89]
Mark Crimmins and John Perry. The prince and the phone-booth: Reporting
puzzling beliefs. Journal of Philosophy, 86:685–711, 1989.
[Cre70]
M.J. Cresswell. Classical intensional logics. Theoria, 36:347–72, 1970.
[Cre72]
M.J. Cresswell. Intensional logics and logical truth. Journal of Philosophical Logic,
1:2–15, 1972.
[Cre73]
M.J. Cresswell. Logics and Languages. Methuen and Co., 1973.
[CSM85] J. Cottingham, R. Stoothoff, and D. Murdoch, editors. The Philosophical Writings
of Descartes. Cambridge University Press, Cambridge, 1985.
[Dan85]
Jonathan Dancy. Introduction to Contemporary Epistemology. Blackwells, Oxford,
1985.
[Dav68]
Donald Davidson. Truth and Interpretation, chapter On Saying That, pages 93–
108. Basil Blackwell, Oxford, 1968.
[Dav85]
D. Davidson. Inquiries into Truth and Interpretation. Clarendon Press, Oxford,
1985.
[dC74]
N. da Costa. On the theory of inconsistent formal systems. Notre Dame Journal
of Formal Logic, 15(4):497–510, 1974.
[dCA77]
N. da Costa and E.H. Alves. Semantical analysis of the calculi cn. Notre Dame
Journal of Formal Logic, 18(4):621–630, 1977.
[Den81]
D. Dennett. Brainstorms. MIT Press, Harvard, MASS., 1981.
[Den87]
Daniel C. Dennett. The Intentional Stance. MIT Press, 1987.
[Dev91]
K. Devlin. Logic and Information. Cambridge University Press, New York, 1991.
[DP86]
J. Drapkin and D. Perlis. A preliminary excursion into Step-Logics. Proceedings
of the SIGART International Symposium on Methodologies for Intelligent Systems,
pages 262–269, 1986.
[DR02]
J.M. Dunn and G. Restall. Relevance logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, volume 6, pages 1–136. Kluwer Academic, Dordrecht,
2002.
[dRL86]
J. des Rivieres and H. J. Levesque. The consistency of syntactical treatments of
knowledge. Computational Intelligence, 4(1):31–41, 1986.
[dS71]
R. de Sousa. How to give a piece of your mind: or, the logic of belief and assent.
Review of Metaphysics, 25:52–79, 1971.
[Dun76]
J.M. Dunn. Intuitive semantics for first degree entailment and coupled trees.
Philosophical Studies, 29:149–68, 1976.
[Ebe74]
R.A. Eberle. The logic of believing, knowing, and inferring. Synthese, 26:356–382,
1974.
[EKM+ 99] J. Elgot-Drapkin, S. Kraus, M. Miller, M. Nirkhe, and D. Perlis. Active logics:
A unified formal approach to episodic reasoning. Technical Report CS-TR-4072,
University of Maryland, Department of Computer Science, 1999.
[EMP91] J. Elgot-Drapkin, M. Miller, and D. Perlis. Memory, reason and time: the Step-Logic approach. In R. Cummins and J. Pollock, editors, Philosophy and AI: Essays
at the Interface, pages 79–103. MIT Press, Cambridge, Mass., 1991.
[EP90]
J. Elgot-Drapkin and D. Perlis. Reasoning situated in time I: Basic concepts.
Journal of Experimental and Theoretical Artificial Intelligence, 2(1):75–98, 1990.
[Fas03]
M. Fasli. Reasoning about knowledge and belief: A syntactic treatment. Logical
Journal of the IGPL, 11(2):247–284, 2003.
[FH88]
R. Fagin and J.Y. Halpern. Belief, awareness and limited reasoning. Artificial
Intelligence, 34:39–76, 1988.
[FH06]
E. Friedman-Hill. Jess: the rule engine for the Java platform website. http:
//www.jessrules.com/, March 2006.
[FHMV95] R. Fagin, J.Y. Halpern, Y. Moses, and M.Y. Vardi. Reasoning About Knowledge.
MIT press, 1995.
[FHV90] R. Fagin, J.Y. Halpern, and M.Y. Vardi. A nonstandard approach to the logical
omniscience problem. In R. Parikh, editor, Proceedings of the Third Conference on
Theoretical Aspects of Reasoning about Knowledge, pages 41–55. Morgan Kaufmann,
1990.
[Flo05]
L. Floridi. Is information meaningful data?
Philosophy and Phenomenological
Research, 70(2):351–370, 2005.
[Flo06]
L. Floridi. The logic of being informed. Logique et Analyse, 49, December 2006.
[Fod87]
Jerry Fodor. Psychosemantics: The Problem of Meaning in the Philosophy of Mind.
MIT Press, Cambridge, Mass., 1987.
[Fod90]
Jerry Fodor. A Theory of Content and Other Essays. MIT Press, Cambridge, Mass.,
1990.
[Fre92]
Gottlob Frege. Über Sinn und Bedeutung. Zeitschrift für Philosophie und
philosophische Kritik, 100, 1892.
[GG01]
C. Ghidini and F. Giunchiglia. Local models semantics, or contextual reasoning
= locality + compatability. Artificial Intelligence, 127(2):221–259, 2001.
[Gin94]
M. Ginsberg. AI and nonmonotonic reasoning. In D.M. Gabbay et al, editor, Handbook of Logic in Artificial Intelligence and Logic Programming. Volume 3:
Nonmonotonic Reasoning and Uncertain Reasoning, pages 1–33. Clarendon Press,
Oxford, 1994.
[GKP00]
John Grant, Sarit Kraus, and Donald Perlis. A logic for characterizing multiple
bounded agents. Autonomous Agents and Multi-Agent Systems, pages 351–387,
2000.
[Hal86]
J.Y. Halpern, editor. Proceedings of the First Conference on Theoretical Aspects of
Reasoning About Knowledge. Morgan Kaufman, 1986.
[Hal87]
J.Y. Halpern. Using reasoning about knowledge to analyze distributed systems.
In J. Traub et al., editors, Annual Review of Computer Science, volume 2, pages
37–68. Annual Reviews Inc., 1987.
[Hen61]
Leon Henkin. Some remarks on infinitely long formulas. In Infinitistic Methods,
pages 167–183. Pergamon Press, Oxford, 1961.
[Hen05]
Vincent F. Hendricks. Preface to Knowledge and Belief: An Introduction to the Logic
of the Two Notions, expanded edition, edited by Vincent F. Hendricks and John
Symons. King’s College Publications, 2005.
[Hin62]
J. Hintikka. Knowledge and belief: an introduction to the logic of the two notions.
Cornell University Press, Ithaca, N.Y., 1962.
[Hin73a] J. Hintikka. Logic, Language-Games and Information: Kantian Themes in the Philosophy of Logic. Clarendon Press, Oxford, 1973.
[Hin73b] J. Hintikka. Surface semantics and its motivation. In H. Leblanc, editor, Truth,
Syntax and Modality. North-Holland, Amsterdam, 1973.
[Hin75]
J. Hintikka. Impossible possible worlds vindicated. Journal of Philosophical Logic,
4:475–484, 1975.
[HM69]
P. Hayes and J. McCarthy. Some philosophical problems from the standpoint of
artificial intelligence. Machine Intelligence, 4:463–502, 1969.
[HM90]
J.Y. Halpern and Y. Moses. Knowledge and common knowledge in a distributed
environment. Journal of the ACM, 37(3):549–587, 1990.
[HMV95] J.Y. Halpern, Y. Moses, and M.Y. Vardi. Algorithmic knowledge. In R. Fagin,
editor, Theoretical Aspects of Reasoning about Knowledge: Proceedings of the Fifth
Conference (TARK 1994), pages 255–266. Morgan Kaufmann, San Francisco, 1995.
[Ho95]
D.N. Ho.
Logical omniscience vs. logical ignorance.
In C.P. Pereira and
N. Mamede, editors, Proceedings of EPIA’95, volume 990 of LNAI, pages 237–
248. Springer, 1995.
[Ho97]
D.N. Ho. Reasoning about rational, but not logically omniscient, agents. Journal
of Logic and Computation, 5:633–648, 1997.
[HPS04]
Ian Horrocks and Peter F. Patel-Schneider. A proposal for an OWL rules language. In Proceedings of the 13th international conference on World Wide Web, WWW
2004, pages 723–731. ACM, 2004.
[HPSB+ 06] I. Horrocks, P. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, and M. Dean.
SWRL: A semantic web rule language combining OWL and RuleML. http:
//www.w3.org/Submission/SWRL/, April 2006.
[Jag05a]
Mark Jago. Modelling assumption-based reasoning using contexts. In Proceedings of the Contextual Representation and Reasoning workshop. CEUR electronic
proceedings, 2005.
[Jag05b]
Mark Jago. Rethinking epistemic logic. Read at The First World Colloquium on
Universal Logic, April 2005.
[Jag06a]
Mark Jago. Imagine the possibilities: Information without overload. Logique et
Analyse, 49, December 2006.
[Jag06b]
Mark Jago. Rule-based and resource-bounded: A new look at epistemic logic.
In Proceedings of the Logics for Resource-Bounded Agents workshop, ESSLLI 06, 2006.
To appear.
[Kap89]
D. Kaplan. Demonstratives. In J. Almog, J. Perry, and H. Wettstein, editors,
Themes from Kaplan, chapter 17, pages 481–563. Oxford University Press, New
York, 1989.
[KdC77]
J. Kotas and N. da Costa. On the problem of Jaskowski and the logics of
Łukasiewicz. In A.I. Arruda, N. da Costa, and R. Chuaqui, editors, Non-Classical
Logic, Model Theory and Computability, pages 127–39. North Holland, Amsterdam,
1977.
[Kin96]
Jeffrey King. Structured propositions and sentence structure. Journal of Philosophical Logic, 25:495–521, 1996.
[KM60]
D. Kaplan and R. Montague. A paradox regained. Notre Dame Journal of Formal
Logic, 1:79–90, 1960.
[Kon86a] K. Konolige. A Deduction Model of Belief. Morgan Kaufman, 1986.
[Kon86b] K. Konolige. What awareness isn’t: a sentential view of implicit and explicit
belief. In Halpern [Hal86], pages 241–250.
[Kri75]
Saul Kripke. Outline of a theory of truth. Journal of Philosophy, 72:690–716, 1975.
[Kri80]
Saul Kripke. Naming and Necessity. Blackwell, Oxford, 1980.
[Lak86]
G. Lakemeyer. Steps towards a first-order logic of explicit and implicit belief.
In J. Y. Halpern, editor, Proceedings of the First Conference on Theoretical Aspects
of Reasoning About Knowledge, pages 325–340, San Francisco, California, 1986.
Morgan Kaufmann.
[Lak87]
G. Lakemeyer. Tractable metareasoning in propositional logic of belief. In
Proceedings of the Tenth International Joint Conference on Artificial Intelligence, pages
401–408, 1987.
[Lak90]
G. Lakemeyer. A computationally attractive first-order logic of belief. In Proceedings of JELIA 90, pages 333–347, Heidelberg, 1990. Springer.
[Lev84]
H. J. Levesque. A logic of implicit and explicit belief. In National Conference on
Artificial Intelligence, pages 198–202, 1984.
[Lev85]
H. J. Levesque. Global and local consistency and completeness of beliefs. Unpublished manuscript, 1985.
[Lev90]
H. J. Levesque. All I know: a study in autoepistemic logic. Artificial Intelligence,
42:263–309, 1990.
[Lew20]
C.I. Lewis. Strict implication–an emendation. Journal of Philosophy, 17(11):300–
302, 1920.
[Lew75]
David Lewis. Language and languages. In K. Gunderson, editor, Language,
Mind and Knowledge, pages 3–35. University of Minnesota Press, 1975.
[Lew86]
David Lewis. On the Plurality of Worlds. Basil Blackwell, Oxford & New York,
1986.
[LL32]
C.I. Lewis and C.H. Langford. Symbolic Logic. The Appleton-Century Company,
New York, 1932.
[LL89]
B. Loewer and E. Lepore. You can say that again. Midwest Studies in Philosophy,
14:338–356, 1989.
[LNR87] J. E. Laird, A. Newell, and P. S. Rosenbloom. SOAR: An architecture for general
intelligence. Artificial Intelligence, 33:1–64, 1987.
[Mah05]
Q.H. Mahmoud. Getting started with the Java Rule Engine API (JSR 94): Toward rule-based applications. http://java.sun.com/developer/technicalArticles/J2SE/JavaRule.html?feed=JSC, July 2005.
[Mak05]
David Makinson. Bridges from Classical to Nonmonotonic Logic, volume 5 of Texts
in Computing. King’s College Publications, 2005.
[Mal73]
Norman Malcolm. Thoughtless brutes. In Proceedings and Addresses of the American Philosophical Association, volume 46, pages 5–20, 1973.
[McC79a] J. McCarthy. Ascribing mental qualities to machines. In M. Ringle, editor,
Philosophical Perspectives in Artificial Intelligence, pages 161–195. Harvester Press,
1979.
[McC79b] John McCarthy. First order theories of individual concepts and propositions. In
D. Michie J.E. Hayes and L.I. Mikulick, editors, Machine Intelligence, volume 9,
pages 129–147. Halstead Press, New York, 1979.
[MF92]
R.K. Meyer and H. Friedman. Whither relevant arithmetic? The Journal of Symbolic Logic, 57:824–831, 1992.
[MH79]
R. C. Moore and G. Hendrix. Computational models of beliefs and the semantics
of belief sentences. Technical Note 187, SRI International, Menlo Park, Calif.,
1979.
[MK98]
M. Morreau and S. Kraus. Syntactical treatments of propositional attitudes.
Artificial Intelligence, 106:161–177, 1998.
[Mon70]
R. Montague. Universal grammar. Theoria, 36:373–98, 1970.
[Mon73]
R. Montague. The proper treatment of quantification in ordinary English. In
R. Thomason, editor, Formal Philosophy, Selected Papers of Richard Montague, pages
247–270. Yale University Press, 1973.
[Moo39]
G.E. Moore. Proof of an external world. Proceedings of the British Academy,
25:247–270, 1939.
[Moo85]
R.C. Moore. A formal theory of knowledge and action. In J. Hobbs and R. C.
Moore, editors, Formal Theories of the Commonsense World, pages 319–358. Ablex
Publishing Corp., Norwood, N.J., 1985.
[Mor95]
C. Mortensen. Inconsistent Mathematics. Mathematics and Its Applications.
Kluwer, Dordrecht, 1995.
[NKP94] M. Nirkhe, S. Kraus, and D. Perlis. Thinking takes time: a modal active-logic
for reasoning in time. Technical Report CS-TR-3249, University of Maryland,
Department of Computer Science, 1994.
[Noz81]
Robert Nozick. Philosophical Explanations. Clarendon press, Oxford, 1981.
[PBH00]
S. Poslad, P. Buckle, and R. G. Hadingham. The FIPA-OS agent platform: Open
source for open standards. In Proceedings of the Fifth International Conference
and Exhibition on the Practical Application of Intelligent Agents and Multi-Agents
(PAAM2000), pages 355–368, Manchester, April 2000.
[Pei92]
Charles Sanders Peirce. Reasoning and the Logic of Things: The Cambridge Conferences Lectures of 1898. Harvard University Press, Cambridge Mass., 1992.
[Per79]
John Perry. The problem of the essential indexical. Noûs, 13:3–21, 1979.
[Per80]
John Perry. Belief and acceptance. Midwest Studies in Philosophy, 5:553–54, 1980.
[Per85]
D. Perlis. Languages with self-reference I: Foundations. Artificial Intelligence,
25:301–322, 1985.
[Per88]
D. Perlis. Languages with self-reference II: Knowledge, belief and modality.
Artificial Intelligence, 34:179–212, 1988.
[Per93]
John Perry. The Problem of the Essential Indexical. Oxford University Press, Oxford,
1993.
[Pri79]
G. Priest. Logic of paradox. Journal of Philosophical Logic, 8:219–241, 1979.
[Pri87]
G. Priest. In Contradiction: A Study of the Transconsistent. Martinus Nijhoff,
Dordrecht, 1987.
[Pri97]
G. Priest. Inconsistent models for arithmetic I: Finite models. The Journal of
Philosophical Logic, 26:223–235, 1997.
[Pri00]
G. Priest. Inconsistent models for arithmetic II: the general case. The Journal of
Symbolic Logic, 65:1519–29, 2000.
[Pri02]
G. Priest. Paraconsistent logic: Essays on the inconsistent. In D. Gabbay and
F. Guenthner, editors, Handbook of Philosophical Logic, volume 6, pages 287–393.
Kluwer Academic, Dordrecht, 2002.
[PRN89] G. Priest, R. Routley, and J. Norman, editors. Paraconsistent Logic: Essays on the
Inconsistent. Philosophia Verlag, München, 1989.
[PS85]
P.F. Patel-Schneider. A decidable first-order logic for knowledge representation.
In Proceedings of the 9th international joint conference on artificial intelligence, pages
455–458, 1985.
[Puc06]
Riccardo Pucella. Deductive algorithmic knowledge. Journal of Logic and Computation, 16(2):287–309, 2006.
[Put75]
Hilary Putnam. Mind, Language and Reality. Cambridge University Press, Cambridge, 1975.
[Put77]
Hilary Putnam. Realism and reason. Proceedings and Addresses of the American
Philosophical Association, 50(6):483–498, 1977.
[Put81]
Hilary Putnam. Reason, truth and history. Cambridge University Press, 1981.
[Put83]
Hilary Putnam. Computational psychology and interpretation theory. In Realism
and Reason, Philosophical Papers III. Cambridge University Press, Cambridge,
1983.
[QU70]
W.V.O. Quine and J.S. Ullian. The Web of Belief. Random House, New York,
second edition, 1970.
[Qui50a] W.V.O. Quine. Methods of Logic. Henry Holt & Co., NY, 1950.
[Qui50b] W.V.O. Quine. On natural deduction. Journal of Symbolic Logic, 15:93–102, 1950.
[Qui60]
W.V.O. Quine. Word and Object. MIT Press, Cambridge, Mass., 1960.
[Qui69]
W.V.O. Quine. Ontological Relativity and Other Essays, chapter Epistemology
Naturalized, pages 69–89. Columbia University Press, 1969.
[Qui70]
W.V.O. Quine. On the reasons for indeterminacy of translation. Journal of Philosophy, 67:178–83, 1970.
[Ran75]
V. Rantala. Urn models. Journal of Philosophical Logic, 4:455–474, 1975.
[RB79]
N. Rescher and R. Brandon. The Logic of Inconsistency. Rowman and Littlefield,
1979.
[Res93]
G. Restall. Simplified semantics for relevant logics (and some of their rivals).
Journal of Philosophical Logic, pages 481–511, 1993.
[RG91]
A.S. Rao and M.P. Georgeff. Modeling rational agents within a BDI-architecture.
In Proceedings of the Second International Conference on Principles of Knowledge
Representation and Reasoning, pages 473–484, 1991.
[RK86]
S.J. Rosenschein and L.P. Kaelbling. The synthesis of digital machines with
provable epistemic properties. In Halpern [Hal86].
[RM72a] R. Routley and R. Meyer. The semantics of entailment II. Journal of Philosophical
Logic, 1:53–73, 1972.
[RM72b] R. Routley and R.K. Meyer. The semantics of entailment III. Journal of philosophical
logic, 1:192–208, 1972.
[RM73]
R. Routley and R. Meyer. The semantics of entailment I. In H. Leblanc, editor,
Truth, Syntax, and Semantics, pages 194–243. North-Holland, 1973.
[RML06] The RuleML website. http://www.ruleml.org/, April 2006.
[RP85]
S.J. Rosenschein and F. Pereira. Knowledge and belief and action in situated
automata. unpublished manuscript, 1985.
[Rus05]
Bertrand Russell. On denoting. Mind, 14:479–493, 1905.
[Rus17]
Bertrand Russell. Mysticism and Logic and Other Essays, chapter Knowledge by
Acquaintance and Knowledge by Description, pages 197–218. Pelican, London,
1917.
[Sal86]
Nathan Salmon. Frege’s Puzzle. MIT press/Bradford books, 1986.
[Sea58]
John Searle. Proper names. Mind, 67:26–54, 1958.
[Sel56]
W. Sellars. Empiricism and the philosophy of mind. In H. Feigl and M. Scriven,
editors, The Foundations of Science and the Concepts of Psychology and Psychoanalysis.
University of Minnesota press, 1956.
[Sel74]
W. Sellars. Meaning as functional classification: a perspective on the relation of
syntax to semantics. Synthese, 27:417–38, 1974.
[SGdMM96] Carles Sierra, Lluı́s Godo, Ramon López de Màntaras, and Mara Manzano.
Descriptive dynamic logic and its application to reflective architectures. Future
Gener. Comput. Syst., 12(2-3):157–171, 1996.
[SL99]
A. Sloman and B. Logan. Building cognitively rich agents using the sim agent
toolkit. Communications of the ACM, 42(3):71–77, March 1999.
[Sma59]
J.C.C. Smart. Sensations and brain processes. Philosophical Review, 68:141–56,
1959.
[Soa85]
Scott Soames. Lost innocence. Linguistics And Philosophy, 18:59–72, 1985.
[Soa87]
Scott Soames. Direct reference, propositional attitudes and semantic content.
Philosophical Topics, 15, 1987.
[Sta76]
R. Stalnaker. Propositions. In A. MacKay and D. Merrill, editors, Issues in the
Philosophy of Language. New Haven, Yale, 1976.
[Sta99]
R. Stalnaker. Context and Content: Essays on Intentionality in Speech and Thought.
Oxford University Press, Oxford, 1999.
[Sta06]
R. Stalnaker. On logics of knowledge and belief. Philosophical Studies, 128(1):169–
199, 2006.
[Sti81]
S. Stich. Dennett on intentional systems. Philosophical Topics, 12:38–62, 1981.
[Sti83]
S. Stich. From Folk Psychology to Cognitive Science. MIT press, Cambridge, Mass,
1983.
[Tar76]
A. Tarski. The Concept of Truth in Formalised Languages. Oxford Clarendon Press,
1976.
[Tho80]
R.H. Thomason. A note on syntactical treatments of modality. Synthese, 44:391–
395, 1980.
[Var86]
M.Y. Vardi. On epistemic logic and logical omniscience. In Halpern [Hal86].
[VB03]
J. Van Benthem. Logic and the dynamics of information. Minds and Machines,
13(4):503–519, 2003.
[vdHvLM99] W. van der Hoek, B. van Linder, and J.-J. Ch. Meyer. An integrated modal
approach to rational agents. In M. Wooldridge and A. Rao, editors, Foundations
of Rational Agency, pages 133–168. Kluwer Academic, Dordrecht, 1999.
[Whi03]
M. Whitsey. Logical omniscience: a survey. Technical Report NOTTCS-WP2003-2, School of Computer Science and IT, University of Nottingham, 2003.
[Whi04]
M. Whitsey. Modelling resource bounded reasoners: An example. In Proceedings
of the Logic and Communication in Multi-Agent Systems workshop (LCMAS 04),
pages 118–137. Loria, 2004.
[Wil00]
T. Williamson. Knowledge and its limits. Oxford University Press, Oxford, 2000.
[Wit22]
Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Kegan Paul, 1922.
[Wit02]
Ludwig Wittgenstein. Philosophical Investigations. Blackwell, 2002.
[Woo95]
M. Wooldridge. An abstract general model and logic of resource-bounded
believers. In M. Cox and M. Freed, editors, Representing Mental States and
Mechanisms—Proceedings of the 1995 AAAI Spring Symposium, pages 136–141.
AAAI Press, 1995.
[Woo00]
Michael Wooldridge. Computationally grounded theories of agency. In E. Durfee, editor, Proceedings of the Fourth International Conference on Multi-Agent Systems
(ICMAS-2000), pages 13–20. IEEE Press, 2000.
[Zal83]
Edward N. Zalta. Abstract Objects: An Introduction to Axiomatic Metaphysics.
D. Reidel, Dordrecht, 1983.
[Zal88]
Edward N. Zalta. Intensional Logic and the Metaphysics of Intentionality. Bradford
Books, the MIT Press, Cambridge, Mass., 1988.
[Zal97]
Edward N. Zalta. A classically-based theory of impossible worlds. Notre Dame
Journal of Formal Logic, 38(4):640–660, 1997.
Appendix A
Proofs
Proof of theorem 1
Theorem 1, section 4.5.2:
Let Γ ⊆ L¬,∧ , φ ∈ L¬,∧ and Γl = {(ǫ, 1) : φ | φ ∈ Γ}. Then
Γl ⊢R (ǫ, t) : φ, for some t ∈ N, iff φ is a consequence of Γ in classical propositional logic.
only if direction: We show that, for any t and any sequence c = ψ1 · · · ψk , Γl ⊢R (c, t) : φ
implies that Γ ∪ {ψ1 . . . ψk } ∪ {¬φ} is inconsistent. The only if direction of the
proof is a special case of this, with c = ǫ. The proof is by induction on t. When t = 0, either
φ ∈ Γ, in which case Γ ∪ {¬φ} is clearly inconsistent, or else φ := ψ j for some j ≤ k, in which
case {ψ1 . . . ψk } ∪ {¬φ} is inconsistent. For the induction case, assume that the desired result
holds for all i < t. Now consider a proof P using the rules R whose premises are Γ and
whose last line is (c, t) : φ. This line must have been obtained using one of the rules in R.
The case for follows immediately from the inductive hypothesis, and the remaining
Boolean cases are simple for they mimic standard natural deduction rules. I list these cases
here for the sake of completeness:
∧int : Then φ := φ1 ∧φ2 and both (c, t−1) : φ1 and (c, t−1) : φ2 are written in P. By hypothesis,
Γ ∪ {ψ1 . . . ψk } entails both φ1 and φ2 , hence Γ ∪ {ψ1 . . . ψk } ∪ {¬(φ1 ∧ φ2 )} is inconsistent.
∧elimL : Then a formula (c, t − 1) : φ ∧ ψ is written in P and, by hypothesis, Γ ∪ {ψ1 . . . ψk } ∪
{¬(φ ∧ ψ)} is inconsistent, hence Γ ∪ {ψ1 . . . ψk } ∪ {¬φ} is inconsistent as well. A similar
argument applies in the ∧elimR case.
¬elim : Then (c, t − 1) : ¬¬φ is written in P and, by hypothesis, Γ ∪ {ψ1 . . . ψk } ∪ {¬¬¬φ} is
inconsistent; hence Γ ∪ {ψ1 . . . ψk } ∪ {¬φ} is also inconsistent.
¬int : Then φ := ¬ψ and, for some unlabelled formula χ, both (cψ, t − 1) : χ and (cψ, t − 1) : ¬χ
are written in P. By hypothesis, both Γ ∪ {ψ1 . . . ψk , ψ} ∪ {¬χ} and Γ ∪ {ψ1 . . . ψk , ψ} ∪ {¬¬χ}
are inconsistent, so Γ ∪ {ψ1 . . . ψk , ψ} is inconsistent, hence Γ ∪ {ψ1 . . . ψk } ∪ {¬¬ψ} is also inconsistent.
: Then there is a subsequence c′ = χ1 · · · χi of c such that (c′ , t) : φ is written in P. By
hypothesis, Γ ∪ {χ1 · · · χi } ∪ {¬φ} is inconsistent. But, since each χm≤i = ψn for some
n ≤ k, Γ ∪ {ψ1 · · · ψk } ∪ {¬φ} is inconsistent too.
: Then φ := ψn for some n ≤ k, hence {ψ1 , . . . , ψk , ¬φ} is inconsistent.
if direction:
Again, set c = ψ1 · · · ψk and suppose that φ is a classical propositional
consequence of Γ. Then it is possible to construct a proof P of φ whose premises (written at
line 0) are the elements of Γ using the standard propositional rules for natural deduction.
We will show that, if a formula ψ appears in P at a point in the proof at which the unclosed
assumptions, from the start of the proof to the point at which ψ is written, are ψ1 , . . . , ψn ,
then there is a t ∈ N such that Γl ⊢R (ψ1 · · · ψn , t) : ψ. The proof proceeds by induction on
the line number l of ψ. If l = 0, then ψ ∈ Γ and hence Γl ⊢R (ǫ, 0) : ψ. For the induction
hypothesis, assume that the desired result holds for all n < l. Now consider ψ written
at line l of P, within unclosed assumptions ψ1 , . . . , ψn . It must have been obtained using
one of the propositional natural deduction rules from formulae written at lines n < l in P.
If the rule used was a rule for ∧, or ¬-elimination, then the desired result follows easily
by applying the inductive hypothesis. If the rule used to obtain ψ was ¬-introduction,
then ψ := ¬χ and, for some formula ξ, both ξ and ¬ξ are written at lines k, k′ < l of P
within unclosed assumptions ψ1 , . . . , ψn , χ. By hypothesis, there are t1 , t2 ∈ N such that
Γl ⊢R (ψ1 · · · ψn χ, t1 ) : ξ and Γl ⊢R (ψ1 · · · ψn χ, t2 ) : ¬ξ. Set t = max{t1 , t2 }; then, by applying
, there is a proof using the rules R from Γl of (ψ1 · · · ψn χ, t) : ξ and (ψ1 · · · ψn χ, t) : ¬ξ.
Finally, by a single application of ¬int , we obtain Γl ⊢R (ψ1 · · · ψn , t + 1) : ¬χ.
⊣
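For illustration, consider a small instance of this correspondence (writing t0 for the time stamp which the definition of Γl attaches to premises): let Γ = {q ∧ r}, so that q is a classical consequence of Γ. Then Γl contains (ǫ, t0 ) : q ∧ r, a single application of ∧elimL yields (ǫ, t0 + 1) : q, and so Γl ⊢R (ǫ, t) : q for some t ∈ N, as the theorem requires.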
Proof of lemma 1
Lemma 1, section 4.6: Let M be minimal for Γl and sufficient for ⟨Γl , (c, t) : φ⟩. Then for every
formula φ, φ ∈ mct iff Γl ⊢R (c, t) : φ.
only if direction:
The proof is by induction on t. Assume that φ ∈ mc0 ; then either
(c, 0) : φ ∈ Γl or else c = . . . φ . . . (definition 13). In the former case, Γl ⊢R (c, 0) : φ by definition
of ⊢R . In the latter case, Γl ⊢R (c, 0) : φ by . For the inductive hypothesis assume that, for
all k < t, φ ∈ mck only if Γl ⊢R (c, k) : φ. Now consider any φ ∈ mct ; then φ ∈ inf (c, t − 1). Since
M is a minimal model, one of the following cases must apply:
c = · · · φ · · ·, in which case a single application of gives Γl ⊢R (c, t) : φ.
For some ψ ∈ {φ, φ ∧ χ, χ ∧ φ, ¬¬φ}, ψ ∈ mct−1 . Then, by hypothesis, there is a proof of
(c, t − 1) : ψ from Γl using the rules in R. Moreover, ψ is the premise of one of the rules
, ∧elimL , ∧elimR or ¬elim and hence Γl ⊢R (c, t) : φ.
φ := φ1 ∧ φ2 and φ1 , φ2 ∈ mct−1 . By hypothesis, Γl ⊢R (c, t − 1) : φ1 and Γl ⊢R (c, t − 1) : φ2
and so, by an application of ∧int , Γl ⊢R (c, t) : φ1 ∧ φ2 .
ψ, ¬ψ ∈ m^{cχ}_{t−1} and φ := ¬χ. By hypothesis, Γl ⊢R (cχ, t − 1) : ψ and Γl ⊢R (cχ, t − 1) : ¬ψ.
The proofs may be combined and, by a single application of ¬int , Γl ⊢R (c, t) : ¬χ.
φ ∈ m^{c′}_{t−1} where c = · · · c′ · · ·. By hypothesis, Γl ⊢R (c′ , t − 1) : φ and, by an
application of , Γl ⊢R (c, t) : φ.
if direction:
Again, the proof is by induction on t. For the base case, suppose Γl ⊢R (c, 0) :
φ. Then either (c, 0) : φ ∈ Γl , in which case φ ∈ mc0 , or else c = · · · φ · · · in which case φ ∈ mc0
(by point 1, definition 13). For the induction hypothesis suppose that, for all k < t, φ ∈ mck
if Γl ⊢R (c, k) : φ. Now suppose Γl ⊢R (c, t) : φ, i.e. there is a proof P from Γl using the rules in
R whose last line is (c, t) : φ, which must have been obtained by one of the rules in R. Then
one of the following must apply:
Γl ⊢R (c, t−1) : ψ for ψ ∈ {φ, φ∧χ, χ∧φ, ¬¬φ}. By hypothesis, ψ ∈ mct−1 , so φ ∈ inf (c, t−1)
and hence φ ∈ mct .
φ := φ1 ∧ φ2 , Γl ⊢R (c, t − 1) : φ1 and Γl ⊢R (c, t − 1) : φ2 . By hypothesis, φ1 , φ2 ∈ mct−1 , so
φ1 ∧ φ2 ∈ inf (c, t − 1) and hence φ1 ∧ φ2 ∈ mct .
φ := ¬ψ and, for some χ, Γl ⊢R (cψ, t − 1) : χ and Γl ⊢R (cψ, t − 1) : ¬χ. By hypothesis,
χ, ¬χ ∈ m^{cψ}_{t−1} . Then ¬ψ ∈ inf (c, t − 1) and so ¬ψ ∈ mct .
Γl ⊢R (c′ , t − 1) : φ and c = · · · c′ · · ·. By hypothesis, φ ∈ m^{c′}_{t−1} , so φ ∈ inf (c, t − 1) and
hence φ ∈ mct .
Finally, if c = · · · φ · · · then φ ∈ mct by definition.
⊣
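For illustration, one step of the equivalence runs as follows: if p ∧ q ∈ mct−1 then p ∈ inf (c, t − 1) and so p ∈ mct , while on the proof-theoretic side the lemma applied at t − 1 gives Γl ⊢R (c, t − 1) : p ∧ q and one application of ∧elimL gives Γl ⊢R (c, t) : p; the two sides track each other exactly as in the cases above.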
Proof of theorem 3
Theorem 3, section 4.6: Let ∗ be a mapping from propositional formulae to structural first-order
functions as follows:
p∗ = ⌜p⌝
(¬φ)∗ = neg(φ∗ )
(φ1 ∧ φ2 )∗ = conj(φ∗1 , φ∗2 )
(φ → ψ)∗ = imp(φ∗ , ψ∗ )
and let M be a model of an axiomatic reasoner, as in definition 17, and H a Herbrand model of the
first-order language of section 4.4.3. Then M |= Bi (t, φ) iff H |= Bi (t, φ∗ ).
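For instance, (p ∧ ¬q)∗ = conj(⌜p⌝, neg(⌜q⌝)) and (p → (q ∧ r))∗ = imp(⌜p⌝, conj(⌜q⌝, ⌜r⌝)); each propositional formula is thus mirrored by a closed first-order term built from the function symbols neg, conj and imp.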
only if direction:
We show that, for any t and i, φ ∈ mit only if H |= Bi (t, φ∗ ). The proof is by induction on t.
For the base case, set t = 0. For any i, φ ∈ mi0 only if φ ∈ obs(i) only if {Oi (φ∗ ), Bi (0, φ∗ )} ⊆ O0T
only if H |= Bi (0, φ∗ ). For the induction hypothesis assume that, for any k < t and any
i, H |= Bi (k, ψ∗ ) holds if ψ ∈ mik . Now consider some φ ∈ mit ; one of the following cases
applies:
φ ∈ mit−1 : Then H |= Bi (t − 1, φ∗ ) by hypothesis and so Bi (t − 1, φ∗ ) ∈ O^{t−1}_T . By
construction of H , Bi (t, φ∗ ) ∈ O^t_T and hence H |= Bi (t, φ∗ ).
ψ, ψ → φ ∈ mit−1 : By hypothesis, H |= Bi (t − 1, ψ∗ ) and H |= Bi (t − 1, imp(ψ∗ , φ∗ )). Then
{Bi (t − 1, ψ∗ ), Bi (t − 1, imp(ψ∗ , φ∗ ))} ⊆ O^{t−1}_T and, by the construction of H , Bi (t, φ∗ ) ∈ O^t_T
and thus H |= Bi (t, φ∗ ).
φ1 , φ2 ∈ mit−1 and φ := φ1 ∧ φ2 : By hypothesis, H |= Bi (t − 1, φ∗1 ) and H |= Bi (t − 1, φ∗2 ).
Then {Bi (t − 1, φ∗1 ), Bi (t − 1, φ∗2 )} ⊆ O^{t−1}_T . By the construction of H , Bi (t, conj(φ∗1 , φ∗2 )) ∈ O^t_T
and thus H |= Bi (t, φ∗ ).
φ∧ψ ∈ mit−1 : By hypothesis, H |= Bi (t−1, conj(φ∗ , ψ∗ )) and Bi (t−1, conj(φ∗ , ψ∗ )) ∈ O^{t−1}_T .
Then Bi (t, φ∗ ) ∈ O^t_T and hence H |= Bi (t, φ∗ ). A similar argument holds for the
case ψ ∧ φ ∈ mit−1 .
if direction:
Again, the proof is by induction on t. For the base case take t = 0. Then,
for any i, H |= Bi (0, φ∗ ) only if Bi (0, ⌜φ⌝) ∈ O^0_T only if φ ∈ obs(i) only if φ ∈ mi0 . For
the induction step assume that, for all k < t and any i, ψ ∈ mik if H |= Bi (k, ψ∗ ). By the
construction of H , if H |= Bi (t, φ∗ ) then one of the following cases must apply:
H |= Bi (t − 1, φ∗ ): By hypothesis, φ ∈ mit−1 and, by definition 17 condition 1, φ ∈
inf (i, t − 1) and so φ ∈ mit .
φ := φ1 ∧ φ2 , H |= Bi (t − 1, φ∗1 ) and H |= Bi (t − 1, φ∗2 ): By hypothesis, {φ1 , φ2 } ⊆ mit−1
and so φ ∈ inf (i, t − 1), hence φ ∈ mit .
H |= Bi (t−1, conj(φ∗ , ψ∗ )) for some ψ: By hypothesis, φ∧ψ ∈ mit−1 and so φ ∈ inf (i, t−1),
hence φ ∈ mit . Similar reasoning applies if H |= Bi (t − 1, conj(ψ∗ , φ∗ )).
H |= Bi (t − 1, imp(ψ∗ , φ∗ )) and H |= Bi (t − 1, ψ∗ ) for some ψ: By hypothesis, {ψ →
φ, ψ} ⊆ mit−1 and so φ ∈ inf (i, t − 1). It follows that φ ∈ mit .
⊣
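For illustration, the simulation in the modus ponens case runs as follows: if p, p → q ∈ mit−1 then q ∈ inf (i, t − 1) and so q ∈ mit ; correspondingly, {Bi (t − 1, ⌜p⌝), Bi (t − 1, imp(⌜p⌝, ⌜q⌝))} ⊆ O^{t−1}_T , the construction of H puts Bi (t, ⌜q⌝) into O^t_T , and hence H |= Bi (t, q∗ ), matching the ψ, ψ → φ case of each direction above.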