Utility functions and Concrete architectures: deductive agents
CS7ET03: AI for IET
Lecturer: S Luz
[email protected]
September 29, 2014
Abstract Architectures (ctd.)
- An agent's behaviour is encoded as its history:
    h : s0 --a0--> s1 --a1--> ... --a(u-1)--> su --au--> ...
- Define:
  - H^A: the set of all histories which end with an action
  - H^S: the set of histories which end in environment states
- A state transformer function τ : H^A → ℘(S) represents the effect an agent has on an environment.
- So we may represent environment dynamics by a triple:
    Env = ⟨S, s0, τ⟩
- And, similarly, agent dynamics as:
    Ag : H^S → A
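As a minimal illustration of how these definitions fit together (the names run, agent and tau below are invented for this sketch; τ is approximated by a callable that returns a set of possible successor states), an agent-environment run could be simulated as:

  # Sketch of the abstract agent-environment loop (Python).
  # `agent` stands in for Ag : H^S -> A, `tau` for the state transformer
  # tau : H^A -> P(S); both are hypothetical callables, not given here.
  import random

  def run(s0, agent, tau, steps=10):
      # a history alternates states and actions: [s0, a0, s1, a1, ...]
      history = [s0]
      for _ in range(steps):
          a = agent(history)                # Ag maps a state-ended history to an action
          successors = tau(history + [a])   # tau maps an action-ended history to a set of states
          if not successors:                # empty set: the run has ended
              break
          history.extend([a, random.choice(list(successors))])
      return history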
Utility functions
- Problem: how to "tell agents what to do"? (when exhaustive specification is impractical)
- Decision theory (see [Russell and Norvig, 1995, ch. 16]):
  - associate utilities (a performance measure) to states: u : S → R
  - or, better yet, to histories: u : H → R
Example: The Tileworld
[Figure: a Tileworld grid showing the agent (Ag), tiles and holes]
- The utility of a course of action can be given by:
    u(h) = (number of holes filled in h) / (number of holes that appeared in h)
- When the utility function has an upper bound (as above), we can speak of optimal agents.
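As a rough sketch of this bounded utility (the event labels and the representation of a history as a list of events are invented for illustration, not part of the Tileworld definition):

  # Sketch: bounded Tileworld utility of a history (Python).
  # A history is represented here simply as a list of event labels; the labels
  # "hole_filled" and "hole_appeared" are illustrative assumptions.
  def utility(history):
      filled = history.count("hole_filled")
      appeared = history.count("hole_appeared")
      # if no holes appeared, treat the history as (vacuously) optimal
      return filled / appeared if appeared else 1.0   # bounded above by 1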
Optimal agents
- Let P(h | Ag, Env) denote the probability that history h occurs when agent Ag is placed in Env.
- Clearly
    Σ_{h∈H} P(h | Ag, Env) = 1                                   (1)
- An optimal agent Ag_opt in an environment Env will maximise expected utility:
    Ag_opt = arg max_{Ag} Σ_{h∈H} u(h) P(h | Ag, Env)            (2)
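A small sketch of equation (2), assuming each candidate agent comes with an enumeration of its possible histories and their probabilities (the helper histories(ag) and the overall representation are assumptions made for illustration):

  # Sketch: choosing the agent that maximises expected utility (Python).
  # `histories(ag)` is a hypothetical helper returning pairs (h, P(h | ag, Env))
  # whose probabilities sum to 1, as in equation (1).
  def expected_utility(ag, histories, utility):
      return sum(utility(h) * p for h, p in histories(ag))

  def optimal_agent(agents, histories, utility):
      # Ag_opt = arg max_Ag  sum_h u(h) P(h | Ag, Env)
      return max(agents, key=lambda ag: expected_utility(ag, histories, utility))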
From abstract to concrete architectures
- Moving from abstract to concrete architectures is a matter of further specifying action (e.g. by means of algorithmic description) and choosing an underlying form of representation.
- Different ways of specifying the action function and representing knowledge:
  - Logic-based: decision function implemented as a theorem prover (plus control layer)
  - Reactive: (hierarchical) condition → action rules
  - BDI: manipulation of data structures representing Beliefs, Desires and Intentions
  - Layered: combination of logic-based (or BDI) and reactive decision strategies
Logic-based architectures
- AKA deductive architectures
- Background: symbolic AI
  - Knowledge representation by means of logical formulae
  - "Syntactic" symbol manipulation
  - Specification in logic ⇒ executable specification
- "Ingredients":
  - Internal states: sets of (say, first-order logic) formulae
  - Environment state and perception
  - Internal state seen as a set of beliefs
  - Closure under logical implication (⇒):
    - ∆ = {temp(roomA, 20), heater(on), ...}
    - closure(∆, ⇒) = {ϕ | ϕ ∈ ∆ ∨ ∃ψ. ψ ∈ ∆ ∧ ψ ⇒ ϕ}
    - (is this a reasonable model of an agent's beliefs?)
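A small sketch of closure(∆, ⇒) for a simplified propositional case, assuming rules are given as (premises, conclusion) pairs; this is plain forward chaining, not a full first-order prover:

  # Sketch: closure of a belief set under simple implication rules (Python).
  # Facts are atoms (strings); each rule is a (premises, conclusion) pair.
  # First-order formulae and unification are deliberately left out.
  def closure(delta, rules):
      closed = set(delta)
      changed = True
      while changed:
          changed = False
          for premises, conclusion in rules:
              if set(premises) <= closed and conclusion not in closed:
                  closed.add(conclusion)
                  changed = True
      return closed

  # e.g. closure({"temp(roomA,20)", "heater(on)"},
  #              [(["heater(on)"], "warming(roomA)")])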
Representing deductive agents
- We will use the following objects:
  - L: a set of sentences of a logical system
    (as defined, for instance, by the usual well-formedness rules for first-order logic)
  - D = ℘(L): the set of databases of L
  - ∆0, ..., ∆n ∈ D: the agent's internal states (or beliefs)
  - |=ρ: a deduction relation described by the deduction rules ρ chosen for L;
    we write ∆ |=ρ ϕ if ϕ ∈ closure(∆, ρ)
Describing the architecture
- A logic-based architecture is described by the following structure:
    Arch_L = ⟨L, A, P, D, action, env, see, next⟩
- The update function consists of additions and removals of facts from the current database of internal states:
    next : D × P → D                                             (3)
  - old: removal of "old" facts
  - new: addition of new facts (brought about by action)
Pseudo-code for action
function action(∆ : D) : A
begin
    for each a ∈ A do
        if ∆ |=ρ do(a) then
            return a
        end if
    end for
    for each a ∈ A do
        if ∆ ⊭ρ ¬do(a) then
            return a
        end if
    end for
    return noop
end function action
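The same decision procedure as a minimal Python sketch; the entailment check entails stands in for |=ρ and would have to be supplied (e.g. by a Prolog-style prover), and the tuple representation of formulae is an assumption made for illustration:

  # Sketch of the logic-based action function (Python).
  # `entails(delta, phi)` is a stand-in for Delta |=_rho phi; formulae are
  # represented as tuples such as ("do", a) and ("not", ("do", a)).
  NOOP = "noop"

  def action(delta, actions, entails):
      # first, return an action the belief database explicitly prescribes
      for a in actions:
          if entails(delta, ("do", a)):
              return a
      # otherwise, return any action that is not explicitly forbidden
      for a in actions:
          if not entails(delta, ("not", ("do", a))):
              return a
      return NOOP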
Environment and Belief states
- The environment change function, env, remains as before.
- The belief database update function could be further specified as follows:
    next(∆, p) = (∆ \ old(∆)) ∪ new(∆, p)                       (4)
  where old(∆) represents beliefs no longer held (as a consequence of action), and new(∆, p) the new beliefs that follow from facts perceived about the new environmental conditions.
Example: The Vacuum world
[Figure: a grid world with cells at coordinates (0,0), (0,1), (0,2), (1,0), ..., over which the vacuum agent moves, cleaning dirt]
[Russell and Norvig, 1995, Weiss, 1999]
Describing The Vacuum world
- Environment:
  - perceptual input: dirt, null
  - directions: (facing) north, south, east, west
  - position: coordinate pairs (x, y)
- Actions:
  - move forward, turn 90° left, clean
- Perception:
    P = {{dirt(x, y), in(x, y), facing(d), ...}, ...}
Deduction rules
- Part of the decision function
- Format (for this example):
  - P(...) ⇒ Q(...)
  - PROLOG fans may think of these rules as Horn clauses.
- Examples:
  - in(x, y) ∧ dirt(x, y) ⇒ do(clean)
  - in(x, 2) ∧ ¬dirt(x, y) ∧ facing(north) ⇒ do(turn)
  - in(0, 0) ∧ ¬dirt(x, y) ∧ facing(north) ⇒ do(forward)
  - ...
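A small sketch of these example rules in Python, assuming the belief database is a set of ground tuples such as ("in", 0, 1) or ("facing", "north"); variable matching is hard-coded here rather than done by unification, so this is an illustration of the rules, not of the general prover:

  # Sketch: the example vacuum-world rules as condition -> action checks (Python).
  def decide(delta):
      x, y = next((f[1], f[2]) for f in delta if f[0] == "in")
      facing = next(f[1] for f in delta if f[0] == "facing")
      if ("dirt", x, y) in delta:          # in(x,y) ∧ dirt(x,y) ⇒ do(clean)
          return "clean"
      if ("dirt", x, y) not in delta and y == 2 and facing == "north":
          return "turn"                    # in(x,2) ∧ ¬dirt(x,y) ∧ facing(north) ⇒ do(turn)
      if ("dirt", x, y) not in delta and (x, y) == (0, 0) and facing == "north":
          return "forward"                 # in(0,0) ∧ ¬dirt(x,y) ∧ facing(north) ⇒ do(forward)
      return "noop"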
Updating the internal state database
- next(∆, p) = (∆ \ old(∆)) ∪ new(∆, p)
- old(∆) = {P(t1, ..., tn) | P ∈ {in, dirt, facing} ∧ P(t1, ..., tn) ∈ ∆}
- new(∆, p):
  - update the agent's position,
  - update the agent's orientation,
  - etc.
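A corresponding sketch of the update functions, with the same tuple-based representation of beliefs; the percept format (a dict carrying position, orientation and a dirt flag) is an assumption made for illustration:

  # Sketch: belief-database update for the vacuum world (Python).
  PERCEPTUAL = {"in", "dirt", "facing"}

  def old(delta):
      # old(Δ): every fact built from in, dirt or facing
      return {f for f in delta if f[0] in PERCEPTUAL}

  def new(delta, percept):
      # new(Δ, p): rebuild position, orientation and dirt facts from the percept
      facts = {("in", *percept["pos"]), ("facing", percept["facing"])}
      if percept.get("dirt"):
          facts.add(("dirt", *percept["pos"]))
      return facts

  def next_db(delta, percept):
      # next(Δ, p) = (Δ \ old(Δ)) ∪ new(Δ, p), with Δ a set of ground tuples
      return (delta - old(delta)) | new(delta, percept)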
The “Wumpus World”
- See [Russell and Norvig, 2003, section 7.2]
- BTW, Ch. 7 is available online (last accessed Oct 2012) at http://aima.cs.berkeley.edu/newchap07.pdf
Shortcomings of logic-based agents
- Expressivity issues: problems encoding percepts (e.g. visual data), etc.
- Calculative rationality in dynamic environments
- Decidability issues
- Semantic elegance vs. performance:
  - loss of "executable specification"
  - weakening the system vs. temporal specification
  - etc.
Existing (??) logic-based systems
- MetateM, Concurrent MetateM: specifications in temporal logic, model-checking as inference engine (Fisher, 1994)
- CONGOLOG: situation calculus
- Situated automata: compiled logical specifications (Kaelbling and Rosenschein, 1990)
- AgentSpeak, ...
- (see [Weiss, 1999] or [Wooldridge, 2002] for more details)
BDI Agents
- Implement a combination of:
  - deductive reasoning (deliberation) and
  - planning (means-ends reasoning)
Planning
- Planning formalisms describe actions in terms of (sets of) preconditions (P_a), delete lists (D_a) and add lists (A_a):
    ⟨P_a, D_a, A_a⟩
- E.g. an action encoded in STRIPS (for the "blocks world" example):
    Stack(x,y):
      pre: clear(y), holding(x)
      del: clear(y), holding(x)
      add: armEmpty, on(x,y)
Planning problems
- A planning problem ⟨∆, O, γ⟩ is determined by:
  - the agent's beliefs about the initial environment (a set ∆)
  - a set of operator descriptors corresponding to the actions available to the agent: O = {⟨P_a, D_a, A_a⟩ | a ∈ A}
  - a set of formulae representing the goal/intention to be achieved (say, γ)
- A plan π = ⟨a1, ..., an⟩ determines a sequence of n + 1 belief databases ∆0, ..., ∆n, where ∆0 = ∆ and ∆_i = (∆_{i−1} \ D_{a_i}) ∪ A_{a_i}, for 1 ≤ i ≤ n.
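A brief sketch of how a plan transforms the belief database, following the definition above; operators are assumed to be (pre, delete, add) triples of sets of ground facts, which is adequate for the propositional case only:

  # Sketch: applying a STRIPS-style plan to a belief database (Python).
  # Each operator is a triple (pre, delete, add) of sets of ground facts.
  def applicable(delta, op):
      pre, _, _ = op
      return pre <= delta                   # all preconditions hold in Δ

  def apply_op(delta, op):
      _, delete, add = op
      return (delta - delete) | add         # Δ_i = (Δ_{i−1} \ D_a) ∪ A_a

  def execute(delta, plan, operators, goal):
      # plan is a sequence of action names; returns True if the goal holds at the end
      for a in plan:
          op = operators[a]
          if not applicable(delta, op):
              return False
          delta = apply_op(delta, op)
      return goal <= delta

  # e.g. operators["Stack(a,b)"] = ({"clear(b)", "holding(a)"},
  #                                 {"clear(b)", "holding(a)"},
  #                                 {"armEmpty", "on(a,b)"})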
Suggestions
- Investigate the use of BDI systems and agents in games.
- See, for instance, [Norling and Sonenberg, 2004], which describes the implementation of interactive BDI characters for Quake 2.
- And [Wooldridge, 2002, ch. 4], for some background.
- There are a number of BDI-based agent platforms around. 'Jason', for instance, seems interesting: http://jason.sourceforge.net/
References
Norling, E. and Sonenberg, L. (2004). Creating interactive characters with BDI agents. In Proceedings of the Australian Workshop on Interactive Entertainment IE'04.
Russell, S. J. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs.
Russell, S. J. and Norvig, P. (2003). Artificial Intelligence: A Modern Approach. Prentice-Hall, Englewood Cliffs, 2nd edition.
Weiss, G. (1999). Multiagent Systems. MIT Press, Cambridge.
Wooldridge, M. (2002). An Introduction to MultiAgent Systems. John Wiley & Sons.