Rethinking Epistemic Logic

Mark Jago

1 Introduction

Hintikka's logic of knowledge and belief [8] has become a standard logical tool for dealing with intentional notions in artificial intelligence and computer science. One reason for this success is the adoption by many computer scientists of modal logics in general, as tools for reasoning about relational structures. Many areas of interest to computer scientists, from databases to the automata which underlie many programs, can be thought of as relational structures; such structures can be reasoned about using modal logics. Hintikka's work can be seen as the start of what is known to philosophers, logicians and computer scientists as formal epistemology: that branch of epistemology which seeks to uncover the formal properties of knowledge and belief. In Reasoning About Knowledge [6], Fagin, Halpern, Moses and Vardi showed how logics based on Hintikka's ideas can be used to solve many real-world problems; these were problems about other agents' knowledge, as well as problems in which agents reason about the world. Solutions to such problems have applications in distributed computing, artificial intelligence and game theory, to name but a few key areas. With the appearance of Reasoning About Knowledge, Hintikka's approach was firmly cemented as the orthodox logical account of belief for philosophers and computer scientists alike.

It is rare for an orthodox account of such popularity to receive no criticism and Hintikka's framework is no exception. One major source of objections is the so-called problem of logical omniscience whereby, as a result of the modal semantics applied to 'knows' (and 'believes'), agents automatically know every tautology as well as every logical consequence of their knowledge. Just how such a consequence should be viewed is a moot point; perhaps this notion of knowledge applies to ideal agents, or perhaps it is an idealised notion, saying what a non-ideal agent should believe (given what it already does believe). Although neither account is entirely satisfactory, defenders of the approach claim that, in many cases, the assumptions are harmless, and that the applications which such logics have found speak for themselves. [11] surveys logics for which logical omniscience is a problem and concludes that an alternative logic of knowledge and belief is required, if real-world agents are to be modelled with any kind of fidelity.

However, my aim here is not to criticise Hintikka's approach on the ground of logical omniscience. Instead, I show that the assumptions required by Hintikka's approach cannot be justified by an acceptable account of belief. I concentrate throughout on the notion of belief, rather than that of knowledge, as the former is (usually) regarded as the more primitive notion; an account of knowledge will usually proceed from an account of belief. Hintikka's logic is not in itself such an account, a fact which has been ignored in the recent literature in artificial intelligence and computer science. Hintikka's approach cannot be seen as a definition of belief without vicious circularity. I then argue that any acceptable account of belief must be able to account for Frege's problem of informativeness and settle on a partly representational, partly dispositional account of belief. Such an account clearly shows the mistaken assumptions in Hintikka's approach.
In the second half of the paper, I introduce a new approach to epistemic logic, based on these considerations.[1] The logic is introduced for the case of rule-based agents of the kind common in artificial intelligence but is then extended to a full propositional reasoner. The logic differs from Hintikka's in using representational elements to classify belief states and in treating temporal and alethic modality as essential components of an account of the dynamics of belief states.

The rest of the paper is organised as follows. In the following section, I present Hintikka's proposal and, in section 3, discuss the basic notion behind it, that of an epistemically possible world. In section 4, I discuss and reject accounts of belief which are formulated in terms of Fregean senses, before putting forward my own account of belief (section 5). Sections 6 and 7 present the logic of dynamic belief states, including some interesting properties possessed by models of the logic. An axiomatization of the logic is also given and proved to be complete with respect to these models. To conclude, I briefly mention ways in which this approach to epistemic logic can be implemented in real AI systems, where Hintikka's approach has proved troublesome.

[1] I use the term epistemic logic to include what should more properly be termed doxastic logic. A correct account of knowledge relies on a correct account of belief, so I view these logical considerations as pertaining to epistemic logic in general.

2 Epistemic Logic

In this section, Hintikka's logic of knowledge and belief—hereafter, standard epistemic logic—is presented. Hintikka's idea in Knowledge and Belief [8] was to treat knowledge and belief as ways of locating the actual world in a space of logical possibilities. If I believe that φ, I rule out possibilities in which φ is not true, for such worlds are not epistemic possibilities for me. On the other hand, if I am not sure whether ψ, then I cannot say whether the actual world falls into the class of worlds in which ψ is true, or in which it is not. In this case, the space of possibilities in which I locate the actual world will include both worlds in which ψ holds and those in which it does not. If I have knowledge, rather than just belief, then the actual world must be within the class of possibilities in which I think I have located it.

A clarification is needed here. People do not always take their beliefs to rule out incompatible possibilities, as evinced in the commonly heard, 'I believe that φ . . . but of course, I could be wrong.' I believe that I have two hands, and that there is a screen before my eyes as I type; yet, if the kind of radical scepticism which first motivated Descartes' meditations were true, these beliefs would be false. Since I have no proof that such scepticism is false, I have to admit that although I believe these things, they could be false. Given I admit this possibility, would a world in which I were a brain in a vat count as an epistemic possibility for me, in Hintikka's sense? It cannot, for if it did, I would believe very little indeed (perhaps only that my mind exists, and that I am currently thinking). So Hintikka's notion must mean something else: the possibilities left open by my beliefs are the epistemic possibilities for me (regardless of whether those beliefs are true or not). Hintikka used possible world semantics, with a primitive relation of epistemic accessibility R holding between worlds.
Rww′ says that w′ is epistemically possible (for the agent in question) from w. Belief is then defined in terms of this relation. An agent believes φ (at a world w) if φ is true in all worlds epistemically possible from w, the worlds w′ such that Rww′. In the case of an agent's actual beliefs, a unique world @ may be distinguished as the actual world. An agent's beliefs are then the sentences which hold at all worlds accessible from the actual world, i.e. the worlds w such that R@w. Much the same holds for knowing that φ, although then a different accessibility relation RK must be considered, as knowledge has different properties to belief. At the very least, RK must relate every world to itself so that, if one knows φ at w, then φ is true there. Let us leave knowledge to one side; our worry is to give an account of belief.

Models of belief are relational structures. The domain of the model is a set of points W, considered to be possible worlds. At each world, the sentences which are said to be true are closed under some logical consequence relation (say, that of classical propositional or first-order logic). The accessibility relation R (⊆ W × W) holds between worlds in the domain of the model. A model representing the beliefs of more than one agent will have a distinct accessibility relation for each agent; we then talk about a multi-modal logic, in which we have a belief modality Bi for each agent i. The satisfaction relation '⊨' holding between a model, a world in that model and a sentence of the logic is defined recursively in the usual way. In the propositional case, truth-values are assigned to propositions at each state; in the first-order case, constants are assigned an individual of the domain and relation letters an extension. In either case, the following clauses define the truth of logically complex sentences, in terms of less complex sentences, at each world w.

M, w ⊨ ¬φ iff M, w ⊭ φ
M, w ⊨ φ ∧ ψ iff M, w ⊨ φ and M, w ⊨ ψ
M, w ⊨ φ ∨ ψ iff M, w ⊨ φ or M, w ⊨ ψ
M, w ⊨ φ → ψ iff M, w ⊭ φ or M, w ⊨ ψ

Satisfaction (or truth) at a world is clearly standard. The definition of '⊨' continues with the following clause for belief (with 'Bφ' read as 'the agent believes that φ'):

M, w ⊨ Bφ iff, for all w′ ∈ W, Rww′ implies M, w′ ⊨ φ

Logically equivalent sentences must have a common truth value at each world in W and so beliefs which are logically equivalent (in that logic) are indistinguishable: to believe the one is to believe the other. In the same sense that a set of premises is said to contain the conclusions which may be drawn from it, this is a notion of belief in which the belief that φ includes whatever logical consequences φ may have. Since the consequences of the empty set are precisely the valid sentences of a logic, being said to believe the latter is viewed as unproblematic.

What of a traditional problem for accounts of belief: the substitution of co-referring terms? This has been viewed as a problem because, in a sense, a belief about Bob Dylan is a belief about Robert Zimmerman, for they are one and the same person. Yet, I may believe Dylan, but not Zimmerman, to be a great songwriter. The problem is blocked in Hintikka's account by requiring that the worlds in W need only be epistemically possible. That is, the condition for membership of W is just the epistemic possibility of a world's logically primitive true sentences. The truth of logically complex sentences is then given by the standard recursion clauses for Booleans and quantifiers.
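To make the clauses concrete, here is a minimal sketch of this satisfaction relation for the propositional single-agent case. It is an illustration only, not anything from Hintikka's texts: the tuple encoding of formulas and all names are my own.

```python
# A minimal sketch (illustrative, not from the paper) of the
# possible-worlds semantics just described.

class KripkeModel:
    def __init__(self, worlds, access, val):
        self.worlds = worlds    # set of worlds W
        self.access = access    # dict: w -> set of worlds w' with Rww'
        self.val = val          # dict: (w, p) -> bool for atoms p

    def sat(self, w, phi):
        """Formulas are tuples: ('atom','p'), ('not',f), ('and',f,g),
        ('or',f,g), ('imp',f,g), ('B',f)."""
        op = phi[0]
        if op == 'atom':
            return self.val[(w, phi[1])]
        if op == 'not':
            return not self.sat(w, phi[1])
        if op == 'and':
            return self.sat(w, phi[1]) and self.sat(w, phi[2])
        if op == 'or':
            return self.sat(w, phi[1]) or self.sat(w, phi[2])
        if op == 'imp':
            return (not self.sat(w, phi[1])) or self.sat(w, phi[2])
        if op == 'B':   # Bφ: φ holds at every world accessible from w
            return all(self.sat(v, phi[1]) for v in self.access[w])
        raise ValueError(op)

# The agent believes p at @ iff p holds at all worlds accessible from @:
M = KripkeModel(
    worlds={'@', 'w1', 'w2'},
    access={'@': {'w1', 'w2'}, 'w1': set(), 'w2': set()},
    val={('@', 'p'): True, ('w1', 'p'): True, ('w2', 'p'): True},
)
assert M.sat('@', ('B', ('atom', 'p')))
```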
The thought is that even though 'a' and 'b' are co-denoting terms, it is at least an epistemic possibility that the two refer to different entities. That is, even though a = b in actuality, it remains epistemically possible that a ≠ b.

3 Epistemic Possibility

There are many worries raised by this account of belief; not least that, as mentioned in the introduction, agents are modelled as being logically omniscient. There is perhaps a place for such abstraction and it has proved popular in AI and computer science; my aim here is to concentrate on two problems which remain even if we allow this idealisation. The first is that we have not been given an account of what beliefs are at all; the second is that, if the account were correct, it would give us no reason for thinking that the logical principles upon which the account is based should hold of belief.

Both of these problems relate to the notion of epistemically possible worlds. Just what kind of entities are they? Evidently, they are worlds in which Robert Zimmerman need not be Bob Dylan, even though Robert Zimmerman is in actuality Bob Dylan. Yet, identity is a matter of de re necessity: entities are necessarily self-identical. Kripke [12] gives us additional reasons to suppose that identity statements involving distinct rigid designators are either necessarily true or necessarily false. If 'a = b' is true at any world, it is true at all worlds in which a exists. Epistemic possibilities, we must conclude, need not be metaphysical possibilities. But what right have we to call such possibilities entities? I take it that, by world, what is meant is an entity, the truths about which form a maximal consistent set (a consistent set which would become inconsistent upon the addition of any formula not already contained in it). We might then think of worlds simply as assignments to the primitive constructs in the language, together with the closure of such under the satisfaction clauses for Booleans and quantifiers. Does this conception entitle us to think of any such assignment as giving an entity? It does not; we may describe a logical theory by assigning different elements of our domain to the constants 'a' and 'b' at a world w1, and assigning them the same element at w2. As far as the logical theory goes, this is fine; but we are then caught in a dilemma. Either we treat epistemically possible worlds simply as logical notions; or we treat them as genuine entities. If we take the latter horn, we are left explaining how something, whose existence is impossible, can exist (I take it that the scope of metaphysical possibility decides all questions of ontological possibility). If we take the former horn of the dilemma, we arrive at the original accusation that we do not have a theory of belief at all. For why should a logical point, or a logical theory, be considered to be epistemically possible or impossible by an agent? On what basis would we make such a judgement? It seems that one would judge such a logical point to be epistemically impossible, in the sense settled on in the previous section, precisely because one's beliefs rule it out as a contender for actuality—and possible otherwise. To put the point another way, what could count as the truthmaker for a logical point, or a logical theory, being epistemically possible for an agent, other than what that agent considers possible?
But the sense of epistemically possible settled on in the previous section is in no way comparable to objective notions of metaphysical possibility, as the example of scepticism has shown. An agent considers a state of affairs epistemically possible, in the sense required for the standard epistemic logic to be useful, precisely when her beliefs do not rule it out. This sense of epistemic possibility cannot be accounted for without a prior understanding of belief. Standard epistemic logic, then, is not in itself a theory of belief at all. Given what an agent considers possible, the theory tells us what beliefs she has and what follows from those beliefs. But, to fix what an agent considers possible, it is necessary to fix what her beliefs are first. Belief is the more fundamental notion here. Now, it is not always incumbent on a logic of some concept to explain, on its own, what that concept is or means. Logics may help focus our thoughts when thinking about a particular concept. A logic of belief can help us to become clearer about the implications of our account (for example by settling whether one's believing φ implies believing that one believes φ). But then we must search elsewhere for an account of beliefs, their meanings and their identity.

There is a second worry concerning modal epistemic logic, which challenges its claim even to help us to clarify questions concerning the logic of belief. The worry is this. Let us grant the required notion of epistemic possibility, either as genuine entities (ignoring the remarks above) or as purely logical notions. Given that these possibilities lie outside the scope of the metaphysically possible, what weight can there be to the insistence on logical necessity that is supposed to hold at such worlds? When we say that some formula is a logical consequence of another, we mean that the former cannot be true unless the latter is too, on pain of contradiction. But, in the realm of the conceptually possible but metaphysically impossible, why should the threat of contradiction pain us? Why should the notion of logical consequence hold any weight at all? To put the point another way, consider the propositional case, in which precisely one value is assigned to each primitive proposition in the language at each world. Now, what is our criterion for calling these values truth values, rather than just arbitrary assignments of one symbol to another? With no notion of metaphysical necessity in play, it seems we can make no headway with the notion of truth. Some logicians, e.g. Hintikka [9] and Levesque [13], endorse this thought and treat it as an advantage, allowing for primitive propositions to be both true and false at a world. Let us call such a world a paraconsistent world. Allowing paraconsistent worlds in the domain W allows for a logic in which agents may have inconsistent beliefs and need not believe every classical tautology. In the former case, all accessible worlds will be paraconsistent. (If a logic is concerned with which beliefs are true, it must force worlds which are accessible from themselves to be classical. Otherwise, an agent might have the belief that, say, p as well as the belief that ¬p; but these beliefs cannot be true at the same time.) We have seen that we have no reason to suppose classical logic to be the logic of each epistemically possible world. We equally have no reason to suppose that any logic can fulfil this rôle without conflicting with the given account of epistemic possibility.
Suppose a modal system contains worlds which contain a non-modal logic Λ (for example, in paraconsistent propositional modal logic, the theory of each world contains the theorems of paraconsistent propositional logic).[2] Each member of Λ will be epistemically necessary and so must be believed by any agent, regardless of which worlds she considers possible. The question is, given the account of epistemic possibility, why should any sentence be considered a universal epistemic necessity? We could always find some element of Λ which someone could take to be false. Suppose we have little reason, for all we know, to consider a sentence 'φ' to be a theorem of Λ but equally little reason to think that it is not (perhaps Λ is undecidable; perhaps the complexity of validity checking is too high). It seems that the theoremhood or otherwise of 'φ' is epistemically open. Yet, according to the standard epistemic logic over Λ, 'φ' is universally and globally believed iff it is Λ-valid. The only logic which could avoid this difficulty is the zero logic ∅, which contains no theorems whatsoever; and it is clear that the zero logic cannot provide us with a useful tool for analysing belief at all.

[2] The theory of a world is just the set of formulas true at that world.

4 Fregean senses

Frege [7] discusses two questions which are of interest to us, viz. (i) why is it that co-denoting terms are not substitutable salva veritate in belief contexts? and (ii) how is it possible that certain identity statements are informative? The latter is known as the problem of cognitive significance and clearly impacts on the former. In summary, Frege's solution is that senses mediate reference and that propositions, or thoughts, consist of senses. Frege thought of senses as mind-independent entities, distinct both from the physical world and the realm of language. Thoughts qua entities consisting of senses are not, on this view, mental entities at all. Thoughts are mind-independent and thus the very same thought (the same token, not just the same type) may be grasped by more than one person. Understanding simply consists in the grasping of a thought. Roughly, we may think of the sense of a term 'a' as a way in which its referent a is presented. The problem of cognitive significance then vanishes, for the terms 'a' and 'b' may have very different senses, even if a = b. One would grasp a different thought in understanding the sentence 'a = b' than the thought grasped in understanding 'a = a'.

A major problem with this view is in part caused by the inherently abstract nature of such entities. The metaphor of grasping a term's sense lacks any explanatory force; nor does a more informative answer seem possible. One simply has to posit non-natural mental powers in order to account for our understanding and, since a theory of understanding is a theory of meaning, the meaning of language is treated as primitive and impossible to analyse further. Secondly, if the sense of a sentence is a thought, then we should treat the senses of the constituents of a sentence as the constituents of thought, i.e. as concepts. Frege allows this by treating the senses of singular terms (among which Frege included descriptions as well as names, demonstratives and indexicals) as primitive, whereas the senses of predicates and relational terms are to be treated as functions. By way of illustration, let us write 'σ[P]' for the sense of the predicate 'P' and 'σ[a]' for the sense of a singular term 'a'.
The former is a function, from the sense of a singular term to a thought. σ[P], given σ[a] as its argument, returns the sense of the sentence 'Pa', none other than the thought that a is P. We thus have a compositional way of analysing the structure of thought; we may write the foregoing as σ[Pa] = σ[P](σ[a]), where the latter relatum, structured as a functional application, displays the structure of the former relatum, i.e. the thought. However, if we suppose concepts to be primitive senses (those with no internal structure), no naturalistic explanation of concept formation is possible. Again, we would have to posit non-natural mental powers to account for concept formation. As with a theory of understanding, an adequate account of concept formation must also serve as an explanation; and no such account seems possible on the Fregean view.

It seems that the only option, if we are to consider Fregean senses as having any explanatory value whatsoever, is to credit them with a concrete existence. One option is to treat senses as descriptions, or as bundles of descriptions. Then, we treat meanings and concepts as such and understanding as the grasping of such bundles. But this account faces a well-known objection. Plato, the teacher of Aristotle, was not so of necessity. Somebody else could have taught Aristotle, even though, in actuality, Plato did. But if 'Plato' means 'the teacher of Aristotle', then such a contingent fact as 'Plato taught Aristotle' would be a necessary truth. Of course, it is not anything of the sort; hence, senses cannot be assimilated to descriptions or bundles thereof. There is an interesting idea here, despite the failure of the account. If thoughts were constituted from senses qua bundles of descriptions, then thought would have a language-like character. Concepts would contain cognitive information analogous to the information contained within a description's descriptive condition. Mental information, on such an account, would be precisely analogous to linguistic information. We shall return to this point below.

5 Belief States

An account of what belief is is now owed. Accounts of belief in terms of epistemically possible worlds or Fregean senses have been rejected; so let us consult our intuitions concerning what would and would not count as a belief. Senses were originally invoked to solve the problem of informativeness of different yet co-denoting terms. The problem can be re-cast in terms of sentences by dealing with equivalences rather than identity statements. Where Frege asked how it can be that 'a = b' is informative, we may ask how it is that 'someone is a bachelor if and only if he is an unmarried man' is informative, whereas 'someone is a bachelor if and only if he is a bachelor' is not. One possibility is that, in combining one's concept of marriage (or being unmarried) with that of a man, one does not necessarily arrive at one's concept of a bachelor. This would be the case when one has been told that, say, Rob is a bachelor but does not know what 'bachelor' means. Yet, without a prior understanding of what concepts are, this is simply a re-statement of the problem. An account of concepts in terms of Fregean senses has been rejected. May concepts instead be accounted for as abilities? Certainly, to have a concept is to have an ability, to be able to distinguish things which fall under that concept from those that do not. However, one need not be disposed to correctly make all such distinctions in order to possess the relevant concept.
My concept of an elm is distinct from my concept of a beech and yet, presented with one type of tree, I doubt I could say which it is. This is not a defect in my conceptual scheme; I simply lack the requisite information to distinguish instances of the one type from those of the other. This suggests the question, how are the two concepts distinct at all? Were I asked to list distinct types of tree, I would certainly utter both 'beech' and 'elm'; this seems evidence that my concepts elm and beech are distinct. I have a repository of information associated with each term, or with the mental equivalent thereof. Rather than assuming the concept is nothing but this information, as the senses-qua-bundle-of-descriptions account does, the repository of information relating to elms seems intrinsically linked to my uses of 'elm'. Were I to learn that elms possess a property P lacking in beech trees, I would become disposed to assert 'elms are P'. Notice that this new information might not provide me with the ability to distinguish elms from beeches, for I might be unable to distinguish the Ps from the non-Ps. The identity of concepts, then, goes beyond particular distinguishing abilities. The notion of a concept is best explained on the analogy of a mental repository of information. The information associated with a particular concept may change, in some cases dramatically, yet what the concept is a concept of stays the same. My concept of an elm is intrinsically linked to the term 'elm'; rather than the word being a label for the concept, as the Fregean account presupposes, the very identity of the concept appears tied to a linguistic, or at least representational, entity. In fact, in order to account for the problem of informativeness, concepts must be to a greater or lesser extent representational. There is not the space here to argue for a representational account of concepts in the detail it deserves; the aim, after all, is to provide a logic of belief. It suffices to say that, if this account is correct, then what one believes about elms is what representational information one associates with one's elm concept. Certain information disposes one to assert certain sentences and not others; having a certain belief is therefore a disposition to assert certain sentences and not others.

This latter formulation, if correct, allows an analysis of belief free from assumptions about the nature of the mind. Yet this formulation is not correct as it stands and is in a degree of tension with the formulation in terms of representational mental information. Consider these two cases. You are asked whether the Eiffel tower is taller than Elvis ever was. You have never made such a comparison before yet, with relatively little mental effort, you answer in the affirmative. You believe that things are so and your disposition to answer the question in the affirmative is evidence of this belief. Now suppose a logician is asked whether a sentence φ is a theorem of some logic Λ. After a week of a priori mental toil, she responds that it is. Did she believe it to be so at the instant the question was posed? Surely not, for then she would not have taken so long and taken such pains to reply. She was disposed to assert that φ is a theorem but only after a great deal of mental gymnastics. Her mental state at the posing of the question cannot be considered a state of belief (about φ), yet we suppose that one's mental state in the case of the Eiffel tower-Elvis comparison is a state of belief.
Surely the difference is only a matter of degree?—the former case requires a great deal of thought before an assertion is made, the latter very little. In the case of the logician, she arrives at the belief that φ is a theorem precisely when her deliberating—the kind of thought process which enables one to assert or deny statements—internally represents φ as being so. It is as if she internally derives the mental analogue of the sentence 'φ is a theorem'. This internal derivation may be a more or less logical—more or less rational—process; its important feature is that it enables the logician to assert, without further deliberation, that φ is a theorem. The disposition to assert a certain sentence characterises the kind of deliberative mental process required for belief in a precise way. In a strict sense, having the belief that φ consists in being disposed to assert that φ, in virtue of being in a cognitive state involving the representational mental analogue of φ, without further deliberation or empirical investigation.

In the case of artificial agents, this account seems undeniable; if we are to credit artificial agents with beliefs, then they must be accounted for in the following way. The individuation of the internal states of the agent is theoretically unproblematic. Suppose the agent is a program which executes by assigning different values to variables at different times (or, to be more precise, at different cycles of the hardware upon which the program executes). Such states may be unambiguously translated into a metalanguage—say, a precise logical language—by assigning values to variables in a way which mimics variable assignment in the execution of the program. Then, an agent believes that φ in a certain state s iff the part of s which is responsible for declarative output, when translated into our metalanguage, includes φ. One might protest that whether or not an agent can actually make declarative utterances (or truth-apt output of any kind) is irrelevant when considering whether a particular internal state is a state of belief. If we allow such states to be belief states, then such an agent believes that φ in state s iff the representational part of s, when translated into the metalanguage, includes φ.

In the case of human belief states, things are not so simple. They never are. Upon being asked whether the Eiffel tower is taller than Elvis was, one's cognitive state is likely to change ever so slightly, moving into a cognitive state which includes the act of comparing. The change may be so slight as to be unnoticeable to the agent so that, on forming the belief (in the latter mental state) that the Eiffel tower is indeed taller than Elvis and asserting so, it seems that the belief must have been there in the former state as well. Our practices of reporting beliefs are not fine-grained enough to distinguish the two states; they cannot be, for our sensory apparatus has its own granularity. More tellingly, we would have no reason to make such a distinction. We are usually interested in the beliefs of others as a guide to future action, or as an explanation of their behaviour, or as a guide for our own beliefs and actions. The granularity of these considerations is, of course, far coarser than that of cognitive states. We thus generally disqualify acts of recalling from memory, quick and easy comparisons and the like as acts which would distinguish between belief states. Just which cognitive acts we do include in distinguishing one belief state from another is a vague matter.
We include a week's worth of deliberation in this category; we do not include split-second comparisons. At some point in between the two extremes, all bets will be off. Human belief states have vague borders and so there must be an inherent vagueness in the notion of human belief itself.

6 A Logic of Belief States

From what has been said above, it should be clear that the aim is to provide a descriptive model of an agent's beliefs, rather than a normative account. It seems the question of whether an artificial agent believes a sentence φ can be settled by considering the internal state of the agent—by looking at which values have been assigned to which variables. By limiting the focus to what such an agent believes now, we are likely to arrive at a very dull logic of belief. It is therefore necessary to be clear as to why a logic of belief is desirable at all. One answer is the following. There has been considerable interest in the last twenty years in verifying properties of programs. Knowing that a program will not enter a loop from which it will never exit, or that a server cannot enter a state in which two users can change the content of a database at the same time, are clearly useful things to know. The same kind of knowledge is desirable with artificial agents in the picture. We judge many current artificial agents to be stupid because they frequently do stupid things. No doubt, the programmers did not envisage such possibilities coming about and would like to ensure future generations of the agent avoid similar mistakes, i.e. to verify that the future generation satisfies certain properties. One use for a logic of belief is to enable properties to be verified at the intentional level, the descriptive level at which the agent is said to have concepts, to believe this, to desire that and so on. Such a logic cannot just talk of belief; it must include temporal and alethic notions, allowing for judgements such as 'the agent may reach a state in which it believes φ in ten cycles', 'the agent can reach its goal in ten cycles' or 'the agent cannot reach a state in which it believes ψ within ten cycles' (here cycle means something like the change from one belief state to another).

The logic is illustrated by the various belief states of Doris, who single-handedly runs Doris' Dating Agency. Doris is a rule-based agent, whose program incorporates rules such as

suits(x,y), suits(y,x) → match(x,y)

(if x is suited to y and vice versa, then they are a good match for one another). Such rules are read by Doris as inference rules: given that suits(x,y) and suits(y,x) have been derived (where x and y are consistently replaced with constants), infer match(x,y) (with the same constants substituted for x and y). In reasoning about Doris, replace 'have been derived' with 'are believed' and 'infer' with 'believe' in this reading. Such rules are a species of condition-action rule; the condition is the current state of Doris' beliefs, the action required is for Doris to form a new belief. Doris works in cycles. In each cycle, she checks her beliefs against her rules, trying to find a consistent matching instance. If there are matches, she picks one and forms the belief it tells her to. More precisely, an instance of a rule is obtained by uniformly replacing all variables with constants. Let δ be some substitution function from the set of variables of the rule into the set of constants and ρ^δ be our notation for the instance of the rule ρ under δ.
For example, if δ assigns Rob to x and Roberta to y, then

(suits(x,y), suits(y,x) → match(x,y))^δ = suits(Rob,Roberta), suits(Roberta,Rob) → match(Rob,Roberta)

Given a rule instance ρ^δ under the substitution δ, Doris can fire that instance, adding the new belief (the consequent of the rule ρ under the substitution δ, written cn(ρ)^δ) to her working memory. Only one rule instance may be fired at a time, for rule firing is strictly serial.

In order to reason about Doris' belief states, let us introduce a further language, containing sentences such as B male(Rob) ∧ B suits(Roberta,Rob), read as 'Doris believes that Rob is male and that Roberta is suited to Rob.' The question is, what kind of logical apparatus can be applied to such sentences? To be sure, the above sentence should be true precisely when Doris has those two beliefs, i.e. when both male(Rob) and suits(Roberta,Rob) are held in Doris' working memory; but what logical principles hold of the beliefs themselves? Suppose Doris believes the rule male(x) → ¬female(x), which she matches to produce the instance male(Rob) → ¬female(Rob). Does this mean that, if Doris believes male(Rob), then she believes ¬female(Rob)? It does not, for we can easily imagine a case in which the former is held in working memory but the latter is not. Perhaps Doris has discovered many instances of the rule and is working through them one at a time, checking which may be fired, and has not yet come to see whether she believes male(Rob). Thus, B(male(Rob) → ¬female(Rob)) is not equivalent to B male(Rob) → B ¬female(Rob). We can say that, given enough time, and provided she has enough room in her memory to add new beliefs, B ¬female(Rob) will eventually become true. We can also say that Doris could fire that rule instance and add ¬female(Rob) to her beliefs at the very next cycle; nothing prevents Doris' beliefs evolving this way. As this brief discussion shows, the interesting features of Doris' belief states go beyond what she believes now to include what she will and what she could believe in the future, given her current belief state.

As mentioned above, temporal and alethic matters are important in reasoning about belief states, yet this platitude is ignored by standard epistemic logics. The approach here is to combine temporal and alethic modalities by using a discrete branching model of time. Figure 1 shows part of such a model.

[Figure 1: Part of a branching time model. A point x has transitions to two possible successor points, y and z.]

Time is said to be branching in the model in the sense that the facts which hold at point x do not determine whether point y or z will come next; although only one may actually follow x, both are possible successors. In the same way, given what we have said about rule-based agents such as Doris, the agent's current belief state does not determine future belief states, but does make certain states possible successors to the current one. Of course, in reality any belief is a possible belief, in the sense that an agent could perceive or hallucinate something completely disconnected from the focus of its thoughts and come to entertain the corresponding belief. The use of possibility here has to be limited to the possibilities afforded by the agent's belief states, or those of other agents, considered as a closed system. Such structures are relational and can be described using a modal logic.
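The match-and-fire cycle just described can be sketched in a few lines of Python. This is my own illustration of the mechanism, not code from the paper; Rule, matching_instances and the other names are invented for the example.

```python
from itertools import product

class Rule:
    def __init__(self, premises, conclusion):
        self.premises = premises      # e.g. [('suits','x','y'), ('suits','y','x')]
        self.conclusion = conclusion  # e.g. ('match','x','y')

def variables(rule):
    # convention for this sketch: lowercase terms are variables
    return {t for lit in rule.premises + [rule.conclusion]
              for t in lit[1:] if t.islower()}

def instantiate(lit, delta):
    return (lit[0],) + tuple(delta.get(t, t) for t in lit[1:])

def matching_instances(rules, beliefs, constants):
    """Consequents of all rule instances whose premises are believed but
    whose conclusion is not (Doris never re-derives what she believes)."""
    out = []
    for rule in rules:
        vs = sorted(variables(rule))
        for combo in product(constants, repeat=len(vs)):
            delta = dict(zip(vs, combo))
            prems = [instantiate(l, delta) for l in rule.premises]
            concl = instantiate(rule.conclusion, delta)
            if all(p in beliefs for p in prems) and concl not in beliefs:
                out.append(concl)
    return out

# One cycle: match, nondeterministically pick one instance, fire it.
rules = [Rule([('suits', 'x', 'y'), ('suits', 'y', 'x')], ('match', 'x', 'y'))]
beliefs = {('suits', 'Rob', 'Roberta'), ('suits', 'Roberta', 'Rob')}
candidates = matching_instances(rules, beliefs, {'Rob', 'Roberta'})
assert ('match', 'Rob', 'Roberta') in candidates
beliefs.add(candidates[0])   # fire one instance: add its consequent
```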
Modal logic has proved popular in describing relational structures because of its conceptual simplicity and its robust decidability. Below, a modality '◇' is introduced, with '◇Bα' meaning that the agent can move into a belief state in its next cycle which includes the belief that α. Times are not referred to explicitly; however, the '◇' modality may be chained, so that '◇◇◇Bα' means that the agent can in three cycles move into a belief state containing the belief that α. The key to modelling evolving belief states is to capture the notion of a transition from one state to another at the heart of the logic. The semantics of this logic is based on the kind of relational structure diagrammed in figure 1, where the points represent internal states of the agent and the lines, read left to right, are transitions between these states. When an agent in state s can fire a rule to derive a new belief, this is modelled by a transition from s to a new state s′, just like s except for the addition of the new belief. Since firing a rule produces just the one new belief, states related by a transition may differ only by a single belief. There is a close correlation between transitions and rule firings, allowing for a very fine-grained model of belief state change.

The remainder of this section is fairly technical. It presents a logic for reasoning about the kind of rule-based agents introduced above. To conclude, it is indicated how the logic can be extended to incorporate multiple agents and agents which use a more expressive language than that of simple condition-action rules.

6.1 Language

We fix the set of predicate symbols P and the set of constants D. We denote the set of all possible substitutions δ : X → D by Σ, where X is the set of variables. Given a rule ρ = (λ_1, . . . , λ_n → λ), where each λ_i is a literal, the instance of ρ under δ, written ρ^δ, is λ_1^δ, . . . , λ_n^δ → λ^δ. Note that given finite sets X and D, Σ is also finite; then the set of all possible rule instances is finite as well. The agent's internal language L(P, D) over P and D contains only rules and ground literals. Since we assume both P and D are fixed throughout, we may drop these arguments and refer to the agent's internal language simply as L. The following notation is used:

• literals are denoted by λ, λ_1, λ_2, . . .
• ground literals are denoted by λ^δ, λ_1^δ, λ_2^δ, . . ., where δ ∈ Σ
• rules of the form λ_1, . . . , λ_n → λ are denoted by ρ, ρ_1, ρ_2, . . .
• instances of a rule ρ are denoted ρ^δ, where δ ∈ Σ

Only ground literals and rules, the λ^δs and the ρs, are considered well-formed formulas of L. Arbitrary formulas of L are denoted α, α_1, . . . .

The modal language ML(P, D), which is used to reason about the agent's beliefs, is built from formulas of L(P, D). ML(P, D) contains the usual propositional connectives ¬, ∧, ∨, →, the '◇' modality and a belief operator B. Given a literal λ, a rule ρ and a substitution function δ, 'Bλ^δ' and 'Bρ' are primitive wffs of ML(P, D). There are no other primitive wffs. If φ_1 and φ_2 are both ML(P, D) wffs, the complex wffs of ML(P, D) are then given by

¬φ_1 | φ_1 ∧ φ_2 | φ_1 ∨ φ_2 | φ_1 → φ_2 | ◇φ_1

The dual modality '□' is introduced by definition: □φ =df ¬◇¬φ. Since P and D will be fixed throughout, these arguments may informally be dropped, and the agent's language referred to as L and the modal language as ML. Note that the primitive formulas of ML are all of the form Bα, where α is an L-formula, hence the problem of substitution within belief contexts does not arise in logics based on ML.
6.2 Models for the single-agent logic

A model M is a structure ⟨S, T, V⟩ where S is a set of states; T ⊆ S × S is the transition (accessibility) relation on states; and V : S → ℘(L) is the labelling function assigning to each state the set of L-formulas which the agent believes in that state. The definition of a formula φ of ML being true, or satisfied, by a state s in a model M (written M, s ⊨ φ) is as follows:

M, s ⊨ Bα iff α ∈ V(s)
M, s ⊨ ¬φ iff M, s ⊭ φ
M, s ⊨ φ_1 ∧ φ_2 iff M, s ⊨ φ_1 and M, s ⊨ φ_2
M, s ⊨ φ_1 ∨ φ_2 iff M, s ⊨ φ_1 or M, s ⊨ φ_2
M, s ⊨ φ_1 → φ_2 iff M, s ⊭ φ_1 or M, s ⊨ φ_2
M, s ⊨ ◇φ iff there exists a state s′ ∈ S such that Tss′ and M, s′ ⊨ φ

By substituting the definition of '□' into the clause for '◇', we get: M, s ⊨ □φ iff, for all states s′ ∈ S such that Tss′, M, s′ ⊨ φ.

Definition 1 (Global satisfiability and validity) An ML formula φ is globally satisfied in a model M = ⟨S, T, V⟩, notation M ⊨ φ, when M, s ⊨ φ for each state s ∈ S. Given a class of models C, φ is said to be valid in C or C-valid, written ⊨_C φ, when M ⊨ φ for any M ∈ C. Validity (simpliciter) is validity in any class. A set of ML formulas Γ is said to be satisfied at a state s ∈ S, written M, s ⊨ Γ, when every element of Γ is satisfied at s. Γ is then globally satisfied, C-valid or valid in a similar way.

This formulation applies to relational structures in general, not just to models of rule-based agents. To get the desired class of models, structures have to be restricted in the following way. To begin with, the agent's program—the set of rules it believes—is finite by definition and does not change; rules are neither learnt nor forgotten. This is standard practice in rule-based AI systems. This means that, if the set R contains all rules believed at a state s, then the rules believed at all states reachable from s (i.e. in some unspecified number of transitions) will also be precisely those in R (and similarly for all states from which s is reachable). To say that a rule has an instance (which may be fired) is to say that the rule is believed and that there is a substitution such that the premises of the rule are believed (under that substitution) but the consequent is not (agents do not try to derive what they already believe). Such rules are said to match. When a rule ρ matches under a substitution δ, we say that ρ^δ is a matching instance of ρ under δ.

Definition 2 (Matching rule) Let ρ be a rule of the form λ_1, . . . , λ_n → λ and δ a substitution function for ρ. ρ is then said to be s-δ-matching, for some state s ∈ S, iff ρ ∈ V(s), each λ_1^δ, . . . , λ_n^δ ∈ V(s) but λ^δ ∉ V(s).

As explained above, transitions from one state to another correspond to the agent firing a rule instance and adding a new belief to its working memory. When a rule instance may be fired in a state s, and a transition to a further state s′ is possible, s′ must then be just like s except for the addition of that new belief. In such cases, we say that s′ extends s by that new belief.

Definition 3 (Extension of a state) Let δ be a substitution function for a rule ρ and λ^δ be the consequent of ρ under δ. Then a state s′ is said to extend a state s by λ^δ when V(s′) = V(s) ∪ {λ^δ}.

One exception is made to the stipulation that transitions correspond to a rule instance being fired, purely for technical reasons. If there are no matching rules at a state (and so no rule instances to fire), that state is a terminating state and has a transition to itself (or to another identical state, which amounts to much the same in modal logic).
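As an illustration, the satisfaction definition can be transcribed almost directly into code. The sketch below is my own: it reuses the tuple encoding of formulas from the earlier sketch, with ('dia', φ) for ◇φ, and represents T as a map from each state to its set of successors.

```python
def sat_ml(T, V, s, phi):
    """Satisfaction for ML formulas: ('B', alpha), ('not', f),
    ('and', f, g), ('or', f, g), ('imp', f, g), ('dia', f)."""
    op = phi[0]
    if op == 'B':           # M, s |= B alpha iff alpha is in V(s)
        return phi[1] in V[s]
    if op == 'not':
        return not sat_ml(T, V, s, phi[1])
    if op == 'and':
        return sat_ml(T, V, s, phi[1]) and sat_ml(T, V, s, phi[2])
    if op == 'or':
        return sat_ml(T, V, s, phi[1]) or sat_ml(T, V, s, phi[2])
    if op == 'imp':
        return (not sat_ml(T, V, s, phi[1])) or sat_ml(T, V, s, phi[2])
    if op == 'dia':         # some T-successor of s satisfies phi
        return any(sat_ml(T, V, u, phi[1]) for u in T[s])
    raise ValueError(op)

# The dual: box(phi) can be checked as ('not', ('dia', ('not', phi))).
```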
This ensures that every state has an outgoing transition; in other words, T is a serial relation. As a consequence, the question 'what will the agent be doing after n cycles?' may always be answered, even if the agent ran out of rules to fire in fewer than n cycles.

Definition 4 (Terminating state) A state s is said to be a terminating state in a model M iff, for all substitution functions δ ∈ Σ, no rule ρ is s-δ-matching.

Transitions may relate terminating states. If, on the other hand, there is a matching rule at a state s, then a transition should only be possible to a state s′ when s′ extends s by an appropriate belief (i.e. the consequent of a matching rule instance at s). We capture such transition systems in the class S (for single agent models).

Definition 5 The class S contains precisely those models M which satisfy the following:

S1 for all states s ∈ S, if a rule λ_1, . . . , λ_n → λ is s-δ-matching, then there is a state s′ ∈ S such that Tss′ and s′ extends s by λ^δ.

S2 for any terminating state s ∈ S, there exists a state s′ ∈ S such that V(s′) = V(s) and Tss′.

S3 for all states s, s′ ∈ S, Tss′ only if either (i) there is an s-δ-matching rule λ_1, . . . , λ_n → λ and s′ extends s by λ^δ; or (ii) s is a terminating state and V(s) = V(s′).

There may, of course, be many rules matching at a given state and many matching instances of each (i.e. instances in which the consequent of the rule, under that substitution, is not already believed). For each such instance of each matching rule at a state s, there will be a state s′ with a transition to it from s. Each transition may be thought of as corresponding to the agent's nondeterministic choice to fire one of these rule instances (i.e. to add the consequent of that rule instance to its set of beliefs). '◇φ' may then be read as 'after some such choice, φ will hold.' We can think of the agent's reasoning as a cycle: 1. match rules to produce rule instances; 2. choose a rule instance; 3. add the consequent of that instance to the set of beliefs; repeat. By chaining diamonds (or boxes), e.g. '◇◇◇', we can express what properties can (and what will) hold after so many such cycles. We can abbreviate sequences of n diamonds (or n boxes) as ◇^n and □^n respectively. '□^n φ', for example, may be read as 'φ is guaranteed to hold after n cycles.' Note that the choices made in each of these cycles are nondeterministic; that the agent's set of beliefs grows monotonically state by state and that the agent never revises its beliefs, even if they are internally inconsistent. In [4], an analysis of an agent which makes a deterministic choice of which rule instance to fire at each cycle is given. In [3], it is shown how similar agents can revise inconsistent belief states in a computationally efficient way. In the deterministic case, models are linear rather than branching, i.e. each state has a transition to a unique successor state.

Given a program (a finite set of rules) R for the agent, models in the class S_R are those models in which the agent believes all the rules in R and no further rules. Each program R thus defines a unique class of models, S_R, such that models in this class represent agents which reason based on that program.

Definition 6 (The class S_R) Let R be a set of rules. A model M ∈ S_R iff M ∈ S and, for every state s in M, M, s ⊨ Bρ iff ρ ∈ R. Given a set of ML formulas Γ and an ML formula φ, we write Γ ⊨_R φ iff every model of Γ which is in the class S_R is also a model of φ.
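Conditions S1-S3 amount to a recipe for computing the possible successor labels of a state. Here is a sketch of that recipe, reusing matching_instances and the Doris rule from the earlier sketch (again my own names, under the assumption that a state's label is stored as a set of ground literals):

```python
def successors(label, rules, constants):
    """Given a label V(s), return the labels V(s') of all states s'
    reachable in one transition, per S1-S3."""
    new = matching_instances(rules, label, constants)
    if not new:                              # terminating state: loop (S2)
        return [set(label)]
    # one successor per matching instance, extending s by exactly
    # that one new belief (S1 and S3)
    return [label | {concl} for concl in new]

s_label = {('suits', 'Rob', 'Roberta'), ('suits', 'Roberta', 'Rob')}
succs = successors(s_label, rules, {'Rob', 'Roberta'})
# '◇B match(Rob,Roberta)' holds at s iff some successor label contains it:
assert any(('match', 'Rob', 'Roberta') in v for v in succs)
```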
6.3 Some properties of the class S_R

This section surveys a few of the more interesting properties of the class S_R (for some fixed program R); a more detailed discussion is given in [10]. A few definitions need to be given first.

Definition 7 (Label equivalence) Given models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, states s ∈ S and s′ ∈ S′ are said to be label equivalent, notation s ∼_L s′, iff V(s) = V′(s′).

Definition 8 (Modal equivalence) Given models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, the theory of a state s ∈ S, written th(s), is the set of all formulas satisfied at s, i.e. {φ | M, s ⊨ φ}. States s ∈ S and s′ ∈ S′ are modally equivalent, written s ≡ s′, iff they have the same theory. Similarly for models, the theory of M is the set {φ | M ⊨ φ} and M is modally equivalent to M′, M ≡ M′, iff their theories are identical.

Definition 9 (Bisimulation) Given two models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, a nonempty binary relation Z ⊆ S × S′ is a bisimulation between M and M′, written Z : M ⋍ M′, when

Label: if Zss′ then s and s′ are label equivalent (i.e. s ∼_L s′)
Forth: if Zss′ and Tsu, then there is a state u′ ∈ S′ such that T′s′u′ and Zuu′
Back: if Zss′ and T′s′u′, then there is a state u ∈ S such that Tsu and Zuu′

When these conditions hold, we say s and s′ are bisimilar and write M, s ⋍ M′, s′ (or simply s ⋍ s′ if the context makes it clear which models s and s′ belong to). When there exists such a Z, we write M ⋍ M′.

Proposition 1 Given two models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, for all s ∈ S and s′ ∈ S′, s ⋍ s′ implies s ≡ s′.

Proof: The proof is standard; see, for example, [5, p.67]. ∎

A tree model has the form diagrammed in figure 1, such that each state s, except the root, has a unique parent (the state from which there is a transition to s).

Proposition 2 Every model M has a bisimilar tree model.

Proof: A tree model M′ can be obtained by unravelling M; the proof that M and M′ are bisimilar is standard. ∎

What is the significance of this result? It means that any satisfiable ML formula can be satisfied in a finite model. Suppose M satisfies φ; then there is a tree model M′ bisimilar to M, so M′ must satisfy φ too. Now, consider a syntax tree for φ with all but the modalities removed from the nodes (so that the same number of modalities appears on the tree as in φ). Then the greatest number of modalities which can be traced from a leaf to the root is called the modal depth of φ. For example, the modal depth of □(◇p ∨ ◇□q) is 3. If a formula is satisfiable, it is satisfiable in a tree model whose longest branch does not exceed the modal depth of the formula. To obtain this model, simply take the tree model for the formula φ and chop off the branches at (the modal depth of φ) states from the root. The upshot is that modal logics are decidable in an extremely robust way. It is easy to find models for satisfiable formulas using the automatic technique of model checking. Of course, chopping a model down in this way may disqualify it from membership of S_R; but membership can be re-instated by taking a chopped-down model and continuing each branch until no matching rules are left, then looping the last state on the branch back to itself. Since R and D are finite, this is guaranteed to happen in finitely many steps.
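For concreteness, here is a small sketch of the modal-depth computation on the tuple encoding used in the earlier sketches (my own illustration):

```python
def modal_depth(phi):
    op = phi[0]
    if op in ('atom', 'B'):          # Balpha is primitive in ML
        return 0
    if op == 'not':
        return modal_depth(phi[1])
    if op in ('and', 'or', 'imp'):
        return max(modal_depth(phi[1]), modal_depth(phi[2]))
    if op in ('dia', 'box'):         # one modality deeper
        return 1 + modal_depth(phi[1])
    raise ValueError(op)

# The example from the text: box(dia p or dia box q) has modal depth 3.
p, q = ('atom', 'p'), ('atom', 'q')
phi = ('box', ('or', ('dia', p), ('dia', ('box', q))))
assert modal_depth(phi) == 3
```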
It is well known that the converse of proposition 1 does not hold in general: given a model M, we can construct a modally equivalent model M′ containing an infinite branch for which there can be no bisimulation Z : M ⋍ M′ (if we suppose there is, we will eventually come to a point on the infinite branch in M′ for which the corresponding point in M has no successor; hence they cannot be bisimilar states). However, we do have a restricted result in the converse direction:

Proposition 3 (Hennessy-Milner Theorem) A model is image finite if, for each s ∈ S, the set {s′ | Tss′} is finite. Given two image finite models M = ⟨S, T, V⟩ and M′ = ⟨S′, T′, V′⟩, for all s ∈ S and s′ ∈ S′, s ≡ s′ implies s ⋍ s′.

Proof: See, for example, [5, p.69]. ∎

Corollary 1 Given a finite program R, models M, M′ ∈ S_R and any state s in M and s′ in M′: s ⋍ s′ iff s ≡ s′.

Proof: Clearly any such model in S_R is image finite, for a finite number of matching rules will only induce a finite number of transitions from a state. Then the 'if' direction is given by the Hennessy-Milner theorem, the 'only if' direction by proposition 1. ∎

This is a useful result. We can now say that two such models are indistinguishable iff there is a bisimulation between their states.

Definition 10 (Run and Path) A run θ = s_0 · · · s_n · · · is an infinite sequence of states with Ts_is_{i+1} for each i ∈ N. A path is a finite sequence of states s_0 · · · s_n, again with Ts_is_{i+1} for each i < n; then length(θ) = n. We write θ[i] to denote the (i+1)-th state on θ, i.e. s_i (with i ≤ length(θ) if θ is a path). We say θ is a run or path from s when s = θ[0]. θ[0, i] denotes the path s_0 · · · s_i consisting of the first i + 1 states of θ (with i ≤ length(θ) if θ is a path). We say that two runs θ and θ′ are label-identical iff θ[i] ∼_L θ′[i] for each i ∈ N, and similarly for paths of equal length (paths of unequal length are trivially non-label-identical).

The following results show that the way a state s is labelled determines which formulas are true at states reachable from s. This allows us to establish a useful connection between label identical states, bisimulation and modal equivalence. The following abbreviation is helpful:

cn(λ_1, . . . , λ_n → λ) =df λ    (1)

Lemma 1 For tree models M, M′ ∈ S_R and states s in M, s′ in M′: if s ∼_L s′ then, for any path θ from s in M, there is a path θ′ from s′ in M′ which is label-identical to θ.

Proof: By induction on the position of s in θ. Assume θ[0] ∼_L θ′[0] and that the length of θ is n. The base case is covered by assumption, so assume further that, for all j < i ≤ n, θ[j] exists and θ[j] ∼_L θ′[j]. We show this holds when j = i. Set w = θ[i−1] and u = θ[i]. Since Twu, there must be a w-δ-matching rule ρ, for some δ ∈ Σ, such that u extends w by cn(ρ)^δ. By hypothesis, w′ = θ′[i−1] exists and w ∼_L w′. Then ρ is also a w′-δ-matching rule; hence u′ = θ′[i] exists and T′w′u′. Then u′ extends w′ by cn(ρ)^δ, so u ∼_L u′. ∎

This lemma leads to a surprising result, which does not hold of modal systems in general.

Theorem 1 For any models M, M′ ∈ S_R and all states s in M and s′ in M′: s ∼_L s′ iff s ≡ s′ iff s ⋍ s′.

Proof: Without loss of generality, assume M and M′ are both tree models and that s ∼_L s′, but not s ≡ s′, for s in M and s′ in M′. Then there is some φ such that M, s ⊨ ◇φ but M′, s′ ⊭ ◇φ, hence a path θ from s in M such that, for some i, there is no path θ′ from s′ in M′ such that θ′[i] ∼_L θ[i]. But this contradicts lemma 1. ∎
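Since all models in S_R are finite and image finite, the greatest bisimulation between two of them can be computed by the obvious fixpoint iteration: start from all label-equivalent pairs and discard pairs violating Forth or Back. A sketch (my own; T1 and T2 map states to successor sets, V1 and V2 are the labelling functions):

```python
def greatest_bisimulation(S1, T1, V1, S2, T2, V2):
    """Naive greatest-fixpoint computation of the largest bisimulation
    between two finite models; returns the set of bisimilar pairs."""
    Z = {(s, t) for s in S1 for t in S2 if V1[s] == V2[t]}   # Label clause
    changed = True
    while changed:
        changed = False
        for (s, t) in set(Z):
            forth = all(any((u, v) in Z for v in T2[t]) for u in T1[s])
            back  = all(any((u, v) in Z for u in T1[s]) for v in T2[t])
            if not (forth and back):
                Z.discard((s, t))
                changed = True
    return Z
```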
Corollary 2 Let M ∈ S_R be a model in which s ∼_L s′, X = {u ∈ S | Tus′} and Y be the set of all states reachable from s′ but not from s. Let M′ be the model in which S′ = S − (Y ∪ {s′}), T′ is just like T (restricted to S′) except that T′us for each u ∈ X, and V′ = V restricted to S′. Then there is a bisimulation Z : M ⋍ M′.

Proof: Given s ∼_L s′, we have s ≡ s′ (theorem 1). So assume there is no such bisimulation Z. Since M′ is identical to M except for the descendants of each u ∈ X, there must be a formula φ and a state x ∈ X such that M, x ⊨ φ but M′, x ⊭ φ. But we have M, u ⊨ ◇φ iff M, s′ ⊨ φ iff M′, u ⊨ ◇φ, for each u ∈ X. Hence our assumption was wrong. ∎

This last result says that if we take any model and squash together similarly labelled states, we get a model which satisfies a formula iff the original model did. As can be seen, models of rule-based agents (those in the class S_R, for some program R) have many desirable properties, several of which are not possessed by models of modal logics in general. The computational properties of models of rule-based agents make it easy to verify properties of the agents they model; this can only be a sign of the success of the logic. In the following section, an axiomatization of the logic is given.

6.4 Complete and sound axiom system

Fix a program R and a finite set of constants D which may be used in substitutions. To axiomatize the transition systems in which every transition corresponds to firing exactly one rule instance in R, the following abbreviations are helpful:

match^δ(λ_1, . . . , λ_n → λ) =df Bλ_1^δ ∧ · · · ∧ Bλ_n^δ ∧ ¬Bλ^δ    (2)

match ρ =df ⋁_{δ∈Σ} match^δ ρ    (3)

The axiom system shown in figure 2 is called Λ_R. Explanations of the more complicated axiom schemata A6 and A7 are given below. A6 says that, when a belief is added, it must have been added by some matching rule instance in R. The abbreviation

⋁_{λ_1,...,λ_n→λ ∈ R, λ^δ=α} Bλ_1^δ ∧ . . . ∧ Bλ_n^δ

abbreviates the disjunction of all formulas Bλ_1^δ ∧ . . . ∧ Bλ_n^δ for which there is a rule λ_1, . . . , λ_n → λ in the agent's program R whose consequent under δ is α. Intuitively, the A7 schema says that, if all matching rule instances in the current state are ρ_1^{δ_1}, . . . , ρ_n^{δ_m}, then each of the successor states should contain the consequent of one of those instances.

Cl   all classical propositional tautologies
K    □(φ → ψ) → (□φ → □ψ)
A1   Bρ, where ρ ∈ R
A2   ¬Bρ, where ρ ∉ R
A3   Bα → □Bα
A4   B(λ_1, . . . , λ_n → λ) ∧ Bλ_1^δ ∧ · · · ∧ Bλ_n^δ → ◇Bλ^δ, for each δ ∈ Σ
A5   ◇(Bα ∧ Bβ) → Bα ∨ Bβ
A6   ◇Bα → (Bα ∨ ⋁_{ρ∈R, δ∈Σ: cn(ρ)^δ=α} match^δ ρ)
A7   match^{δ_1} ρ_1 ∧ · · · ∧ match^{δ_m} ρ_n ∧ ⋀_{δ≠δ_i, ρ≠ρ_j} ¬match^δ ρ → □(B cn(ρ_1)^{δ_1} ∨ · · · ∨ B cn(ρ_n)^{δ_m}), for 1 ≤ i ≤ m, 1 ≤ j ≤ n
A8   ⋀_{ρ∈R} ¬match ρ → ◇⊤
MP   from φ and φ → ψ, infer ψ
N    from φ, infer □φ

Figure 2: Axiom schemes for Λ_R

A derivation in Λ_R is defined in a standard way, relative to R: φ is derivable from a set of formulas Γ (written Γ ⊢_R φ) iff there is a sequence of formulas φ_1, . . . , φ_n where φ_n = φ and each φ_i is either an instance of an axiom schema, or a member of Γ, or is obtained from the preceding formulas by MP or N.

Suppose an agent's program R contains the rules ρ_1, . . . , ρ_n. This agent is guaranteed to reach a state in which it believes α in k steps, starting from a state where it believes λ_1^{δ_1}, . . . , λ_m^{δ_m}, iff the following statement is derivable in Λ_R:

Bρ_1 ∧ . . . ∧ Bρ_n ∧ Bλ_1^{δ_1} ∧ . . . ∧ Bλ_m^{δ_m} → □^k Bα

(Again, □^k α is an abbreviation for □□ · · · □α, k times.) Below, a proof is given that Λ_R is the logic of the class S_R. First, a few lemmas need to be prepared. In each case a sketch of the proof is given; the full proofs, together with more discussion, can be found in [10].
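By way of a worked instance (my own, using the Doris example and assuming D = {Rob, Roberta}, a program containing only the rule ρ = suits(x,y), suits(y,x) → match(x,y), and no other initial beliefs): starting from a state where Doris believes suits(Rob,Roberta) and suits(Roberta,Rob), two instances of ρ match, one under δ = {x ↦ Rob, y ↦ Roberta} and one under δ′ = {x ↦ Roberta, y ↦ Rob}. By A4, each consequent is possible at the next cycle:

Bρ ∧ B suits(Rob,Roberta) ∧ B suits(Roberta,Rob) → ◇B match(Rob,Roberta)

The corresponding one-cycle □-claim fails, since Doris may fire the δ′ instance first. But whichever instance she fires first, the other still matches at the resulting state, so the belief is guaranteed after two cycles:

Bρ ∧ B suits(Rob,Roberta) ∧ B suits(Roberta,Rob) → □□B match(Rob,Roberta)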
Below, a proof is given that ΛR is the logic of the class SR. First, a few lemmas need to be prepared. In each case a sketch of the proof is given; the full proofs, together with more discussion, can be found in [10].

Lemma 2 (Lindenbaum lemma) Any ΛR-consistent set of formulas Γ can be expanded to a ΛR-maximal consistent set Γ⁺.

Proof: The proof is standard. ∎

A canonical model M^R = ⟨S, T, V⟩ is built in the usual way. States in S are ΛR-maximal consistent sets; Tsu iff {φ | □φ ∈ s} ⊆ u (or, equivalently, iff {◇φ | φ ∈ u} ⊆ s). Finally, V(s) = {α ∈ L | Bα ∈ s}, for each s ∈ S.

Lemma 3 (Existence lemma) For any state s in M^R, if ◇φ ∈ s then there is a state u in M^R such that Tsu and φ ∈ u.

Lemma 4 (Truth lemma) For any φ and any state s ∈ S: M^R, s ⊨ φ iff φ ∈ s.

Proof: The proofs of lemmas 3 and 4 are standard. ∎

Lemma 5 Let M^R be a canonical model and let α ∈ L and s, u ∈ S. Then (i) if Tsu and α ∈ V(u) but α ∉ V(s), then V(u) = V(s) ∪ {α}; and (ii) α in part (i) must be a ground literal.

Proof: Part (i) follows from the definition of M, s ⊨ Bα as α ∈ V(s), together with the truth lemma and the fact that states are closed under axioms A3 and A5. The former axiom ensures that V(s) is a subset of V(u); the latter ensures that α is the only new belief. For part (ii), if we suppose α were some rule, we would have α ∈ R and so α ∈ V(s), contrary to hypothesis. ∎

Lemma 6 M^R satisfies condition S1.

Proof: Assume there is a matching rule ρ in s under some substitution δ. Given the truth lemma, it is easy to see that each of its (ground) antecedents under δ is a member of V(s), whereas its consequent is not. A4 and the existence lemma guarantee an accessible state u which, given lemma 5, is the extension of s by the consequent of ρ under δ. ∎

Lemma 7 M^R satisfies condition S3.

Proof: Suppose Tsu for states s, u in M^R. By definition, {φ | □φ ∈ s} ⊆ u. By axiom A7, there must be some ground literal believed in u but not in s, namely the consequent of one of ρ₁^{δ₁}, …, ρₙ^{δₘ}. Then, by the argument used in lemma 6, u is the extension of s by this new belief. ∎

Theorem 2 (Completeness) ΛR is strongly complete with respect to the class SR: given a program R, a set of ML-formulas Γ and an ML-formula φ, Γ ⊨R φ only if Γ ⊢R φ.

Proof: Expand Γ to a ΛR-maximal consistent set Γ⁺, from which we build a canonical model M^R. From the truth lemma, it follows that M^R, Γ⁺ ⊨ Γ. It remains only to show that M^R is in the class SR. Given lemmas 6 and 7, we only have to show that M^R satisfies condition S2. So suppose s is a terminating state. By axiom A8, there is an accessible state s′. By axiom A6, α ∈ V(s′) implies α ∈ V(s) for any literal α (this holds because there are no matching rules at s). It then follows from axioms A1–A3 that V(s′) = V(s); hence S2 is satisfied. ∎
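Completeness means that semantic guarantees of the kind just discussed are always mirrored by derivations. As a small worked instance (my own example, not taken from the paper), take the single-rule ground program R = {p → q}; the k = 1 guarantee Bp → □Bq can be derived as follows:

```latex
% Illustrative derivation, assuming R = {p -> q} with p, q ground, so that
% match(p -> q) = Bp \land \lnot Bq by (2) and A7's side conjunction is vacuous.
\begin{array}{lll}
1. & Bp \land \lnot Bq \to \Box Bq & \text{(A7: $p \to q$ is the only rule in $R$)}\\
2. & Bq \to \Box Bq                & \text{(A3)}\\
3. & Bp \to \Box Bq                & \text{(1, 2, reasoning by cases on $Bq \lor \lnot Bq$)}
\end{array}
```

Together with A1, which supplies B(p → q) outright, line 3 is the k = 1 instance of the guarantee schema above.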
7 Extending the logic

The logic can easily be extended to accommodate a system of agents in communication with one another. To accommodate multiple agents, we replace the labelling function V with a family of such functions. To model a system consisting of n agents comprising the set A, a model is an n+3-tuple ⟨S, A, T, {Vᵢ}ᵢ∈A⟩, where each Vᵢ is the labelling function for agent i. The language of these models is extended, first to include a belief operator Bᵢ for each agent i ∈ A and, secondly, to include formulas for communication between agents, such as 'ask(i, j)λ^δ' and 'tell(i, j)λ^δ'. These are read as 'agent i has asked agent j whether λ^δ' and 'agent i has told agent j that λ^δ'; such formulas are called asks and tells, respectively. As above, λ^δ is some ground literal (these are the only types of belief which rule-based agents communicate).

The primitive wffs of the modal language ML(P, D) are then

Bᵢλ^δ | Bᵢ tell(i, j)λ^δ | Bᵢ ask(i, j)λ^δ | Bᵢρ

and the complex wffs are as above. The first clause of the definition of '⊨' becomes:

M, s ⊨ Bᵢα iff α ∈ Vᵢ(s), for i ∈ A, α ∈ L

but the remaining clauses stay the same. The definition of a matching rule is also made relative to each agent: instead of a rule being s-δ-matching, it is now Vᵢ(s)-δ-matching, for some agent i. The class M, of multi-agent models, is defined in much the same way as the class S, modulo the amendments just noted.

Although agents share a common language, each has its own unique program. There are restrictions on which rules may appear in an agent's program, which we summarise here (a sketch of the corresponding check is given after theorem 3 below). A rule ρ may appear in Rᵢ, for any agent i ∈ A, only if:

1. any ask or tell α in the antecedent of ρ has i as its second argument; and
2. any ask or tell α in the consequent of ρ has i as its first argument.

A program for an agent is then a finite set of rules which satisfy these conditions. Given a program Rᵢ for each agent i ∈ A, we define the program set R = {R₁, …, Rₙ} for A. Just as in the single-agent case above, the class MR contains precisely those models in which agents believe all the rules in their program (in R) and no further rules.

To axiomatize the resulting logic, the following axiom schemes need to be added to those in figure 2:

A6-tell   ◇Bᵢ tell(j, i)λ → Bᵢ tell(j, i)λ ∨ ⋁_{ρ∈Rⱼ, cn(ρ)^δ = tell(j,i)λ} matchⱼ^δ ρ
A6-ask    ◇Bᵢ ask(j, i)λ → Bᵢ ask(j, i)λ ∨ ⋁_{ρ∈Rⱼ, cn(ρ)^δ = ask(j,i)λ} matchⱼ^δ ρ
A9        Bᵢ tell(i, j)λ ↔ Bⱼ tell(i, j)λ
A10       Bᵢ ask(i, j)λ ↔ Bⱼ ask(i, j)λ

In addition, A7 needs to be replaced with the following similar-looking axiom (the difference is just that the substitution δ is no longer uniform; there may be a distinct substitution for each matching rule of each agent):

A7′   match_{i₁}^{δ₁} ρ₁ ∧ ⋯ ∧ match_{iₙ}^{δₘ} ρₙ ∧ ⋀_{(δ,ρ) ∉ {(δ₁,ρ₁), …, (δₘ,ρₙ)}} ¬match^δ ρ → □(B_{i₁} cn(ρ₁)^{δ₁} ∨ ⋯ ∨ B_{iₙ} cn(ρₙ)^{δₘ})

Call the resulting logic ΛR. The following result then holds (the proof is much the same as that of theorem 2 above and can be found in [10]).

Theorem 3 (Completeness) ΛR is strongly complete with respect to the class MR: given a set of programs R = {R₁, …, Rₙ} for n agents, a set of ML-formulas Γ and an ML-formula φ, Γ ⊨R φ only if Γ ⊢R φ.
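As promised above, here is a minimal sketch of the check that conditions 1 and 2 impose on an agent's program; the tuple encoding of asks and tells is my own assumption, not the paper's notation:

```python
# Sketch: syntactic restrictions on an agent's program. Communications in
# antecedents must be addressed to the agent; communications in consequents
# must be sent by it. A literal is a tuple ("ask", i, j, body),
# ("tell", i, j, body), or ("lit", body) for an ordinary ground literal.

def is_comm(lit):
    return lit[0] in ("ask", "tell")

def rule_allowed(rule, agent):
    """Check conditions 1 and 2 on a rule (antecedents, consequent)."""
    antecedents, consequent = rule
    ok_in = all(lit[2] == agent            # second argument: the receiver
                for lit in antecedents if is_comm(lit))
    ok_out = (not is_comm(consequent)
              or consequent[1] == agent)   # first argument: the sender
    return ok_in and ok_out

# Agent 1 may react to tell(2,1)p and reply with tell(1,2)q; agent 2 may not:
rule = ((("tell", 2, 1, "p"),), ("tell", 1, 2, "q"))
print(rule_allowed(rule, 1))  # True
print(rule_allowed(rule, 2))  # False
```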
Extending the language beyond that of the condition-action rules discussed above requires a slightly different approach. One way to extend the language is to introduce a symbol '|' for disjunction into the agent's internal language, so that information such as man(Rob) | woman(Rob) may be represented internally. In this way, an agent's program may include definition-like pairs of rules, such as

human(x) → man(x) | woman(x)
man(x) | woman(x) → human(x)

The question is: how can this internal disjunction be handled by an agent in the step-by-step way described above? How would an agent use the information that man(Rob) | woman(Rob), together with other information, to conclude that man(Rob)? The answer is: by case-based reasoning. This is just the kind of reasoning one does to eliminate a disjunction from a natural deduction proof (rule ➀ in figure 3). Expanding this rule using the introduction rule for '→' (and writing '[φ]' for the closed assumption that φ) gives rule ➁, which explicitly shows the form of case-based reasoning: we see what follows from the left disjunct, and then what follows from the right. If something follows from both, it follows from the disjunction, simpliciter. Reasoning by cases, an agent with disjunctive beliefs can thus form non-disjunctive beliefs.

Figure 3: Eliminating disjunction using case-based reasoning. [Rule ➀ is ∨-elimination: from φ∨ψ, a derivation of χ from the assumption [φ], and a derivation of χ from the assumption [ψ], infer χ. Rule ➁ is its expanded form: from φ∨ψ, φ→χ and ψ→χ, infer χ.]

Case-based reasoning can be captured in the kind of transition system used above by introducing the notion of a set of alternatives. Alternatives are primitive points to which sentences are assigned (they have many of the properties of what were called states above), and states are now defined as sets of alternatives. Perhaps a diagrammatic example is best here; see figure 4. To keep things simple, the example contains just a single agent, whose rules are p → r and q → r. In the diagram, the dots represent alternatives, the circles which enclose them are the states s₁ to s₄. Transitions run from left to right, so that s₁ s₂ s₃ s₄ forms a path. The dotted arrows show the reasoning which the agent is doing in each transition.

Figure 4: Case-based reasoning. [s₁ contains a single alternative labelled p|q; in s₂ it is split into an alternative labelled p|q, p and one labelled p|q, q; in s₃ the first is extended to p|q, p, r; in s₄ the second is extended to p|q, q, r.]

Whereas the agents described in the previous section could perform only one type of mental action, viz. to fire a rule and form a new belief (and, in doing so, to move to a state which extends its predecessor), the agents modelled in this framework can also enter into case-based reasoning. Call these two types of move extend and split, respectively. The example can then be read, from left to right, as:

1. split s₁ to move to s₂;
2. extend s₂'s top alternative to move to s₃;
3. finally, extend s₃'s bottom alternative to move to s₄.

To be precise, an alternative w′ is said to extend another, w, as in definition 3 above (replacing 'state' with 'alternative'). In the example, the top alternative in s₃ extends the top alternative in s₂ (by 'r'). A state s′ is now said to extend a state s when one alternative in s′ extends one in s and all others remain the same. In the example, s₃ extends s₂. A state s′ is said to split a state s when they differ only in that an alternative w ∈ s, labelled with a disjunction λ₁|λ₂, is replaced by alternatives w₁, w₂ ∈ s′, such that λ₁ labels w₁ and λ₂ labels w₂; otherwise, w₁ and w₂ agree with w. In the example, s₂ splits s₁. A transition is allowed between states s and s′ only when s′ extends s, or s′ splits s, or both s and s′ are terminating states.

Models of a set A of n agents are now n+3-tuples ⟨W, A, T, {Vᵢ}ᵢ∈A⟩, where W is a set of alternatives, T ⊆ ℘(W) × ℘(W) is a serial relation on states (i.e. on sets of alternatives) and each Vᵢ : W → ℘(L) assigns a set of L-formulas to each alternative. Finally, a belief α holds at a state s for agent i when Vᵢ labels every alternative in s with α:

M, s ⊨ Bᵢα iff α ∈ Vᵢ(w) for all w ∈ s

In this way, agents which reason in a full propositional language (with variables and substitution) can be modelled. The restriction to rule-based agents is not a restriction of this logical approach. Rule-based agents were introduced because they provide a clear example of agents which form new beliefs in a deductive, step-by-step process, and because their models have interesting computational properties.
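The split and extend moves, and the every-alternative clause for Bᵢ, admit a direct sketch (my own simplification: a single agent and ground formulas; states are modelled as frozensets of alternatives, themselves frozensets of formulas):

```python
# Sketch: states as sets of alternatives. A disjunction is the pair
# ("|", lam1, lam2); everything else is an ordinary formula.

def split(state, alt, disj):
    """Replace one alternative carrying lam1|lam2 by two alternatives,
    one per disjunct (the 'split' move)."""
    assert alt in state and disj in alt
    _, lam1, lam2 = disj
    return (state - {alt}) | {alt | {lam1}, alt | {lam2}}

def extend(state, alt, literal):
    """Extend a single alternative by a newly derived literal, leaving
    the other alternatives untouched (the 'extend' move)."""
    assert alt in state
    return (state - {alt}) | {alt | {literal}}

def believes(state, formula):
    """B alpha holds at a state iff every alternative is labelled alpha."""
    return all(formula in alt for alt in state)

# Figure 4 replayed: rules p -> r and q -> r, initial state {{p|q}}.
pq = ("|", "p", "q")
s1 = frozenset({frozenset({pq})})
s2 = split(s1, frozenset({pq}), pq)          # two alternatives: p and q
s3 = extend(s2, frozenset({pq, "p"}), "r")   # fire p -> r on one branch
s4 = extend(s3, frozenset({pq, "q"}), "r")   # fire q -> r on the other
print(believes(s1, "r"), believes(s4, "r"))  # False True
```

Replaying figure 4 this way, the agent comes to believe r outright even though neither p nor q is ever believed outright: exactly the effect of reasoning by cases.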
8 Conclusion

Standard epistemic logic is not, in itself, an account of belief; yet it appears that no acceptable account of belief can justify the assumptions which the logic makes. Belief should instead be thought of in terms of the mental representations disposing an agent to make the relevant assertion. The logic presented here makes use of this connection between belief and internal representation, capturing an agent's internal representations as a logical language, termed the internal language. In the case of artificial agents which do not represent their environment in this way, it is nevertheless possible to provide an unambiguous translation into such a language from the values which the agent's variables may take whilst it executes. The logic developed on this basis captures the class of rule-based agents, whose models have favourable computational properties and can be used to verify properties of the modelled agent; but it can also capture agents which reason in a more expressive language.

In the case of human agents, matters are not so clear-cut. Our practice of belief ascription, and our purposes in ascribing beliefs, are not so fine-grained as, say, the practice of debugging a program (in which a programmer may want to know the precise point at which the program went wrong). Human belief states thus have vague borders, but are nevertheless genuine mental states. In practice, the vagueness of belief is just as unproblematic as the inherent vagueness of medium-sized objects, such as tables and animals.

I hope this discussion has achieved two things. First, I hope it has cleared the way for similar accounts in epistemology; for example, in analysing the notion of knowledge, or of information, one first needs a correct account of belief. Secondly, I hope the logic of belief presented in the second half of the paper will form the basis of practical accounts of intentional states in AI and computer science. Many such accounts in these domains, such as the AGM theory of belief revision [1], have high computational complexity and are notoriously difficult to implement in a practical real-time system. An account of belief revision for rule-based agents, based on the notion of belief argued for above, is given in [3]. The mechanism proposed there has been incorporated into the agent programming language AgentSpeak [2], demonstrating the practical nature of this approach to intentional notions in AI and computer science.

References

[1] Carlos E. Alchourrón, Peter Gärdenfors, and David Makinson. On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50:510–530, 1985.

[2] Natasha Alechina, Rafael Bordini, Mark Jago, and Brian Logan. Belief revision for AgentSpeak agents. Manuscript.

[3] Natasha Alechina, Mark Jago, and Brian Logan. Resource-bounded belief revision and update. In 3rd International Workshop on Declarative Agent Languages and Technologies (DALT 05), 2005.

[4] Natasha Alechina, Brian Logan, and Mark Whitsey. Modelling communicating agents in timed reasoning logics. In Proc. JELIA 04, pages 95–107, Lisbon, September 2004.

[5] Patrick Blackburn, Maarten de Rijke, and Yde Venema. Modal Logic. Cambridge University Press, New York, 2002.

[6] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning About Knowledge. MIT Press, 1995.

[7] Gottlob Frege. Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100, 1892.

[8] Jaakko Hintikka. Knowledge and Belief: An Introduction to the Logic of the Two Notions. Cornell University Press, Ithaca, 1962.

[9] Jaakko Hintikka. Impossible possible worlds vindicated. Journal of Philosophical Logic, 4:475–484, 1975.

[10] Mark Jago. Logics for resource-bounded agents. Forthcoming PhD thesis.
[11] Mark Jago. Logical omniscience: A survey. Technical report, University of Nottingham, 2003.

[12] Saul Kripke. Naming and Necessity. Blackwell, Oxford, 1980.

[13] H. J. Levesque. A logic of implicit and explicit belief. In National Conference on Artificial Intelligence, pages 198–202, 1984.