Utility functions and Concrete architectures: deductive agents

CS7ET03: AI for IET
Lecturer: S Luz ([email protected])
September 29, 2014

Abstract Architectures (ctd.)

- An agent's behaviour is encoded as its history, an alternating sequence of environment states and actions:

      h : s0 --a0--> s1 --a1--> s2 --a2--> ... --a(u-1)--> su --au--> ...

- Define:
  - H_A: the set of all histories which end with an action
  - H_S: the set of all histories which end with an environment state
- A state transformer function τ : H_A → ℘(S) represents the effect an agent has on an environment.
- So we may represent environment dynamics by a triple: Env = <S, s0, τ>
- And, similarly, agent dynamics as Ag : H_S → A

Utility functions

- Problem: how to "tell agents what to do"? (when exhaustive specification is impractical)
- Decision theory (see [Russell and Norvig, 1995, ch. 16]):
  - associate utilities (a performance measure) to states: u : S → R
  - or, better yet, to histories: u : H → R

Example: The Tileworld

[Figure: Tileworld grids showing an agent (Ag), tiles and holes appearing and disappearing over time]

- The utility of a course of action can be given by:

      u(h) = (number of holes filled in h) / (number of holes that appeared in h)

- When the utility function has an upper bound (as above) we can speak of optimal agents.

Optimal agents

- Let P(h | Ag, Env) denote the probability that history h occurs when agent Ag is placed in environment Env.
- Clearly

      Σ_{h ∈ H} P(h | Ag, Env) = 1                                      (1)

- An optimal agent Ag_opt in an environment Env will maximise expected utility:

      Ag_opt = arg max_{Ag} Σ_{h ∈ H} u(h) P(h | Ag, Env)               (2)

From abstract to concrete architectures

- Moving from abstract to concrete architectures is a matter of further specifying the action function (e.g. by means of an algorithmic description) and choosing an underlying form of representation.
- Different ways of specifying the action function and representing knowledge:
  - Logic-based: decision function implemented as a theorem prover (plus a control layer)
  - Reactive: (hierarchical) condition → action rules
  - BDI: manipulation of data structures representing Beliefs, Desires and Intentions
  - Layered: combination of logic-based (or BDI) and reactive decision strategies

Logic-based architectures

- AKA deductive architectures
- Background: symbolic AI
  - knowledge representation by means of logical formulae
  - "syntactic" symbol manipulation
  - specification in logic ⇒ executable specification
- "Ingredients":
  - Internal states: sets of (say, first-order logic) formulae, e.g.

        ∆ = {temp(roomA, 20), heater(on), ...}

  - Environment state and perception
  - Internal state seen as a set of beliefs
  - Closure under logical implication (⇒):

        closure(∆, ⇒) = {ϕ | ϕ ∈ ∆ ∨ ∃ψ. ψ ∈ ∆ ∧ ψ ⇒ ϕ}

    (is this a reasonable model of an agent's beliefs?)

Representing deductive agents

- We will use the following objects:
  - L: a set of sentences of a logical system, as defined, for instance, by the usual well-formedness rules of first-order logic
  - D = ℘(L): the set of databases of L
  - ∆0, ..., ∆n ∈ D: the agent's internal states (or beliefs)
  - |=ρ: a deduction relation described by the deduction rules ρ chosen for L; we write ∆ |=ρ ϕ if ϕ ∈ closure(∆, ρ)

Describing the architecture

- A logic-based architecture is described by the following structure:

      Arch_L = <L, A, P, D, action, env, see, next>

- The update function consists of additions and removals of facts from the current database of internal states:

      next : D × P → D                                                  (3)

  - old: removal of "old" facts
  - new: addition of new facts (brought about by action)

Pseudo-code for action

    function action(∆ : D) : A
    begin
        for each a ∈ A do
            if ∆ |=ρ do(a) then
                return a
            end-if
        end-for
        for each a ∈ A do
            if ∆ ⊭ρ ¬do(a) then
                return a
            end-if
        end-for
        return noop
    end function action
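As a rough illustration, the following Python sketch implements this decision loop. It is a minimal sketch under simplifying assumptions: ground atoms are plain strings, and the deduction relation ∆ |=ρ do(a) is approximated by two caller-supplied functions (here called prescribes and forbids); these names and representations are illustrative, not part of the formal architecture above.

    from typing import Callable, List, Optional, Set

    Fact = str     # ground atoms as strings, e.g. "in(0,0)", "dirt(0,0)" (an assumption)
    Action = str   # action names, e.g. "clean", "forward", "turn"

    # Stand-in for the deduction relation |=_rho: given the database Delta,
    # return the set of actions a for which do(a) (resp. ¬do(a)) is derivable.
    Derives = Callable[[Set[Fact]], Set[Action]]

    def action(delta: Set[Fact],
               actions: List[Action],
               prescribes: Derives,
               forbids: Derives) -> Optional[Action]:
        """Decision function of a deductive agent (cf. the pseudo-code above)."""
        prescribed = prescribes(delta)        # actions a with Delta |= do(a)
        for a in actions:                     # first pass: an explicitly prescribed action
            if a in prescribed:
                return a
        forbidden = forbids(delta)            # actions a with Delta |= ¬do(a)
        for a in actions:                     # second pass: any action not explicitly forbidden
            if a not in forbidden:
                return a
        return None                           # noop: no consistent action found

    # Toy usage
    if __name__ == "__main__":
        delta = {"in(0,0)", "dirt(0,0)", "facing(north)"}
        acts = ["clean", "forward", "turn"]

        def prescribes(db: Set[Fact]) -> Set[Action]:
            # single hard-coded rule: in(0,0) ∧ dirt(0,0) ⇒ do(clean)
            return {"clean"} if {"in(0,0)", "dirt(0,0)"} <= db else set()

        def forbids(db: Set[Fact]) -> Set[Action]:
            return set()                      # no prohibitions in this toy example

        print(action(delta, acts, prescribes, forbids))   # prints: clean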
Environment and Belief states

- The environment change function, env, remains as before.
- The belief database update function could be further specified as follows:

      next(∆, p) = (∆ \ old(∆)) ∪ new(∆, p)                             (4)

  where old(∆) represents beliefs no longer held (as a consequence of action), and new(∆, p) the new beliefs that follow from facts perceived about the new environmental conditions.

Example: The Vacuum world

[Figure: a 3×3 grid world, cells labelled by coordinates such as (0,0), (1,0), (0,1), (0,2), with a vacuum-cleaning agent moving among them]

[Russell and Norvig, 1995, Weiss, 1999]

Describing The Vacuum world

- Environment
  - perceptual input: dirt, null
  - directions (facing): north, south, east, west
  - position: coordinate pairs (x, y)
- Actions
  - move forward, turn 90° left, clean
- Perception: P = {{dirt(x, y), in(x, y), facing(d), ...}, ...}
- Deduction rules
  - part of the decision function
  - format (for this example): P(...) ⇒ Q(...)
  - PROLOG fans may think of these rules as Horn clauses.
  - Examples:
    - in(x, y) ∧ dirt(x, y) ⇒ do(clean)
    - in(x, 2) ∧ ¬dirt(x, 2) ∧ facing(north) ⇒ do(turn)
    - in(0, 0) ∧ ¬dirt(0, 0) ∧ facing(north) ⇒ do(forward)
    - ...

Updating the internal state database

- next(∆, p) = (∆ \ old(∆)) ∪ new(∆, p)
- old(∆) = {P(t1, ..., tn) | P ∈ {in, dirt, facing} ∧ P(t1, ..., tn) ∈ ∆}
- new(∆, p): update the agent's position, update the agent's orientation, etc.

The "Wumpus World"

- See [Russell and Norvig, 2003, section 7.2]
- BTW, chapter 7 is available online (last accessed Oct 2012) at http://aima.cs.berkeley.edu/newchap07.pdf

Shortcomings of logic-based agents

- Expressivity issues: problems encoding percepts (e.g. visual data), etc.
- Calculative rationality in dynamic environments
- Decidability issues
- Semantic elegance vs. performance:
  - weakening the system vs. temporal specification
  - loss of "executable specification"
  - etc.

Existing (??) logic-based systems

- MetateM, Concurrent MetateM: specifications in temporal logic, model checking as inference engine (Fisher, 1994)
- ConGolog: based on the situation calculus
- Situated automata: compiled logical specifications (Kaelbling and Rosenschein, 1990)
- AgentSpeak, ...
- (see [Weiss, 1999] or [Wooldridge, 2002] for more details)

BDI Agents

- Implement a combination of:
  - deductive reasoning (deliberation), and
  - planning (means-ends reasoning)

Planning

- Planning formalisms describe actions in terms of (sets of) preconditions (P_a), delete lists (D_a) and add lists (A_a): <P_a, D_a, A_a>
- E.g. an action encoded in STRIPS (for the blocks-world example):

      Stack(x,y):
        pre: clear(y), holding(x)
        del: clear(y), holding(x)
        add: armEmpty, on(x,y)

Planning problems

- A planning problem <∆, O, γ> is determined by:
  - the agent's beliefs about the initial environment (a set ∆)
  - a set of operator descriptors corresponding to the actions available to the agent: O = {<P_a, D_a, A_a> | a ∈ A}
  - a set of formulae representing the goal/intention to be achieved (say, γ)
- A plan π = <a1, ..., an> determines a sequence ∆0, ..., ∆n where ∆0 = ∆ and ∆i = (∆_{i-1} \ D_{ai}) ∪ A_{ai}, for 1 ≤ i ≤ n.
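This state-transition rule translates into code directly. The sketch below is a minimal illustration under stated assumptions: facts are plain strings, operators are simple named tuples, and the Operator/apply_plan names are illustrative rather than part of any standard planner API.

    from typing import FrozenSet, List, NamedTuple, Optional, Set

    Fact = str   # ground atoms as strings, e.g. "clear(b)", "holding(a)" (an assumption)

    class Operator(NamedTuple):
        name: str
        pre: FrozenSet[Fact]      # P_a: preconditions
        delete: FrozenSet[Fact]   # D_a: delete list
        add: FrozenSet[Fact]      # A_a: add list

    def apply_plan(delta: Set[Fact], plan: List[Operator]) -> Optional[Set[Fact]]:
        """Compute Delta_0, ..., Delta_n with Delta_i = (Delta_{i-1} \\ D_ai) | A_ai.
        Returns the final database, or None if a precondition fails along the way."""
        state = set(delta)                        # Delta_0 = Delta
        for op in plan:
            if not op.pre <= state:               # P_ai must hold in Delta_{i-1}
                return None
            state = (state - op.delete) | op.add  # Delta_i
        return state

    # Toy blocks-world usage with the Stack(x,y) operator instantiated as Stack(a,b)
    if __name__ == "__main__":
        stack_a_b = Operator(
            name="Stack(a,b)",
            pre=frozenset({"clear(b)", "holding(a)"}),
            delete=frozenset({"clear(b)", "holding(a)"}),
            add=frozenset({"armEmpty", "on(a,b)"}),
        )
        delta0 = {"clear(b)", "holding(a)", "onTable(b)"}
        goal = {"on(a,b)"}                        # the goal γ to be achieved
        final = apply_plan(delta0, [stack_a_b])
        print(final is not None and goal <= final)   # prints: True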
Suggestions

- Investigate the use of BDI systems and agents in games.
- See, for instance, [Norling and Sonenberg, 2004], which describes the implementation of interactive BDI characters for Quake 2.
- And [Wooldridge, 2002, ch. 4], for some background.
- There are a number of BDI-based agent platforms around. 'Jason', for instance, seems interesting: http://jason.sourceforge.net/

References

Norling, E. and Sonenberg, L. (2004). Creating interactive characters with BDI agents. In Proceedings of the Australian Workshop on Interactive Entertainment IE'04.

Russell, S. J. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs.

Russell, S. J. and Norvig, P. (2003). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, 2nd edition.

Weiss, G. (1999). Multiagent Systems. MIT Press, Cambridge.

Wooldridge, M. (2002). An Introduction to MultiAgent Systems. John Wiley & Sons.