Lecture 4: Incomplete Information

Lecture 4:
Incomplete Information
Multiple Player Games and other Games
with Incomplete Information
State-Space Search with Multiple Players
Minimax with --Cutoff
Planning under Uncertainty
General Game Playing
Incomplete Information
1
Checkers
General Game Playing
Incomplete Information
2
Bughouse Chess
General Game Playing
Incomplete Information
3
Backgammon
General Game Playing
Incomplete Information
4
Maze World
?
General Game Playing
Incomplete Information
5
Kriegspiel
General Game Playing
Incomplete Information
6
Poker
General Game Playing
Incomplete Information
7
State Machine Model
a
General Game Playing
b
e
h
c
f
i
d
g
j
Incomplete Information
k
8
Initial State Uncertainty (e.g., Poker)
a
General Game Playing
b
e
h
c
f
i
d
g
j
Incomplete Information
k
9
Nondeterministic Moves (e.g., Rolling a Dice)
a
b
a
a
a
b
d
h
a
b
a
a
f
a
b
General Game Playing
e
a
c
a
a
g
a
i
b
a
a
Incomplete Information
b
k
a
b
j
10
Action Uncertainty (any n ≥ 2-Player Game)
a/a
b
a/a
a/b
a
b/a
b/b
h
a/a
b/b
a/a
a/b
f
a/b
d
General Game Playing
e
a/b
c
a/b
a/a
g
a/b
i
b/a
b/a
a/a
Incomplete Information
b/b
k
a/a
b/b
j
11
State Model Uncertainty
b
a
e
h
c
f
i
d
g
j
b
e
h
c
f
i
g
j
a
a
a
k
k
a
d
General Game Playing
a
Incomplete Information
12
Termination and Goal Uncertainty
a
a
General Game Playing
b
e
h
c
f
i
d
g
j
b
e
h
c
f
i
d
g
j
Incomplete Information
k
k
13
Problems with Incomplete Information
Dealing with incomplete information can be costly, as multiple
options must be considered.
In the face of incomplete information, there may be no way of
knowing that one has succeeded.
In GGP, it is customary to ensure that the players have enough
information to determine legality of moves, termination, and goals.
(Note: Kriegspiel requires an arbiter to ensure legal play.)
General Game Playing
Incomplete Information
14
State-Space Search with Multiple Players
s2
a/a
a/b
s1
b/a
b/b
s3
a/b
e
a/b
h
a/a
b/b
a/a
a/b
f
a/b
s4
General Game Playing
a/a
a/a
g
a/b
i
b/a
b/a
a/a
Incomplete Information
b/b
k
a/a
b/b
j
15
Single Player Game Graph
s2
s1
s3
s4
General Game Playing
Incomplete Information
16
Multiple Player Game Graph
s2
aa
ab
ba
s1
s3
bb
s4
General Game Playing
Incomplete Information
17
Bipartite Game Graph
s2
aa
a
ab
s1
s3
b
ba
bb
General Game Playing
Incomplete Information
s4
18
Move Lists
Simple move list
[(a,s2),(b,s3)]
Multiple player move list
[([a,a],s2),([a,b],s1),
([b,a],s3),([b,b],s4)]
Bipartite move list
[(a,[([a,a],s2),([a,b],s1)]),
(b,[([b,a],s3),([b,b],s4)])]
General Game Playing
Incomplete Information
19
Single Player Node Expansion (from Previous Lecture)
function expand (node)
var match,role,old,data,al,nl,a
begin
match := node.match; role := match.role; al := []; nl := [];
for a in legals(role,node) do
data := simulate(node,{does(role,a)});
new := create_node(match,data,node.theory,node,[],-1);
if terminal(new) then new.score := goal(role,new);
nl := {new} ∪ nl;
al := {(a,new)} ∪ al
end-for;
node.alist := al; return nl
end
General Game Playing
Incomplete Information
20
Multiple Player Node Expansion
function expand (node)
var match,role,old,data,al,jl,nl,a
begin
match := node.match; role := match.role; al := []; jl := []; nl := [];
for a in legals(role,node) do
for j in joints(role,a,match.roles,node) do
data := simulate(node,jointactions(match.roles,j));
new := create_node(match,data,node.theory,node,[],-1);
if terminal(new) then new.score := goal(role,new);
nl := {new} ∪ nl;
jl := {(j,new)} ∪ jl
end-for;
al := {(a,jl)} ∪ al
end-for;
return nl
end
General Game Playing
Incomplete Information
21
New Subroutines
function joints (role,action,roles,node) % returns the combinatorial list of all joint actions
var jl :=[]
% where role does action
begin
if roles =[] then return [[]];
for al in joints(role,action,tail(roles),node) do
if head(roles) = role then jl := {{action} ∪ al} ∪ jl
else for x in legals(head(roles),node) do
jl := {{x} ∪ al} ∪ jl;
return jl
end
function jointactions(roles,j)
% returns set of does atoms
begin
% for joint action j
if roles = [] then return []
else return {does(head(roles),head(j))} ∪ jointactions(tail(roles),tail(j))
end
General Game Playing
Incomplete Information
22
Example
Call: joints(1,a,[1,2],s1)
Call: joints(1,a,[2],s1)
Call: joints(1,a,[],s1)
Exit:[[]]
Exit:[[a],[b]]
Exit:[[a,a],[a,b]]
Call: jointactions([1,2],[a,b])
Call: jointactions([2],[b])
Call: jointactions([],[])
Exit:[]
Exit:[does(2,b)]
Exit:[does(1,a),does(2,b)]
General Game Playing
Incomplete Information
23
Best Move (Multiple Player Games)
function bestmove (node)
var max,score,best,a,jl
begin
max := 0;
(best,jl) := head(node.alist);
for (a,jl) in node.alist do
score := minscore(jl);
if score = 100 then return a;
if score > max then
max := score; best := a
end-if
end-for;
return best
end
Note: This makes the pessimistic assumption that the other players make
the most harmful (for us) joint move.
General Game Playing
Incomplete Information
24
Node Evaluation (Joint Moves)
function minscore (jl)
var min,score,j,child
begin
min := 100;
for (j,child) in jl do
score := maxscore(child);
if score = 0 or score = -1 then return score;
if score < min then min := score
end-for;
return min
end
General Game Playing
Incomplete Information
25
Node Evaluation (Our Moves)
function maxscore (node)
var max,score,a,jl,child
begin
if node.score > -1 then return node.score;
if node.alist = [] then return -1;
max := 0;
for (a,jl) in node.alist do
score := minscore(jl);
if score = 100 or score = -1 then return score;
if score > max then max := score
end-for;
return max
end
General Game Playing
Incomplete Information
26
Minimax for Two-Person Zero-Sum Games
40
max
40
75
General Game Playing
40
40
50
80
10
40
60
Incomplete Information
35
min
20
10
27
The --Principle: -Cutoffs
40
max
40
75
General Game Playing
≤ 40
40
50
80
≤ 35
40
60
Incomplete Information
35
min
20
10
28
The --Principle: - and -Cutoffs
=0
 = 100
=0
 = 100
=0
 = 100
60
60  = 0
 = 60
45
General Game Playing
 = 60
 = 100
60
75
≥ 60
90
max
≤ 60
 = 60
 = 100
50
10
50
≤ 60
min
40
35
30
Incomplete Information
35
40
max
20
15
29
Non Zero-Sum Games
Problem:
All these techniques assume that the other players together choose
the joint move that is most harmful for us. This is often too pessimistic
a model for the opponents in other than two-person zero-sum games.
Solution:
Game Theory (Lecture 7)
General Game Playing
Incomplete Information
30
Maze World
d
a
?
c
b
Initial State: ac (robot in a, gold in c)
General Game Playing
Incomplete Information
31
Environment Model
General Game Playing
ca
da
cb
db
ba
aa
bb
ab
ai
bi
di
ci
cd
dd
cc
dc
bd
ad
bc
ac
Incomplete Information
32
Agent Actions
g, d
ca
m
ba
m
g, d
da
cb
m
m
g, d
g
ai
m
cd
m
bd
g, d
General Game Playing
m
m
g g
g g
d
bb
bi
g
g
di
m ci
dd
d
g
db
m
ab
g, d
d
cc
m
ad
bc
g, d
g, d
m
m
m
m
m
m
aa
d
g, d
g, d
g, d
Incomplete Information
g, d
m
dc
m
m
ac
g, d
33
Agent Percepts
g, d
ca
m
ba
m
g, d
da
cb
m
m
g, d
g
ai
m
cd
m
bd
g, d
General Game Playing
m
m
g g
g g
d
bb
bi
g
g
di
m ci
dd
d
g
m
ab
g, d
d
cc
m
ad
bc
g, d
db
m
m
m
m
g, d
m
m
aa
d
g, d
g, d
g, d
Incomplete Information
g, d
m
dc
m
m
ac
g, d
34
Initial States and Goals
g, d
ca
m
ba
m
g, d
da
cb
m
m
g, d
g
ai
m
cd
m
bd
g, d
General Game Playing
m
m
g g
g g
d
bb
bi
g
g
di
m ci
dd
d
g
m
ab
g, d
d
cc
m
ad
bc
g, d
db
m
m
m
m
g, d
m
m
aa
d
g, d
g, d
g, d
Incomplete Information
g, d
m
dc
m
m
ac
g, d
35
Planning
Planning is the process of finding a transition
diagram for our agent that causes its environment
to go from any initial state to a goal state.
s1
s2
s3
s4
s5
s6
m
m
g
m
m
d
Planning can be done offline and the resulting plan/
program installed in the agent or the planning can
be done online followed by execution.
General Game Playing
Incomplete Information
36
State Space Planning
g, d
ca
m
ba
m
g, d
da
cb
m
m
g, d
g
ai
m
cd
m
bd
g, d
General Game Playing
m
g
di
dd
d
m
g g
g g
d
bb
bi
g
m ci
m
ab
g, d
g
d
cc
m
ad
bc
g, d
db
m
m
m
m
g, d
m
m
aa
d
g, d
g, d
g, d
Incomplete Information
g, d
m
dc
m
m
ac
g, d
37
Incompleteness
Incompleteness
Initial state
Transition diagram for environment
Goal
Complete Planning Techniques
Coercion (e.g. do the grab move at all locations)
Conditional plan (e.g. if see the gold grab it; else go)
Postponement Techniques
Delayed planning
Assumption
General Game Playing
Incomplete Information
38
Initial State Uncertainty
g, d
ca
m
ba
m
g, d
da
cb
m
m
g, d
g
ai
m
cd
m
bd
g, d
General Game Playing
m
m
g g
g g
d
bb
bi
g
g
di
m ci
dd
d
g
db
m
ab
g, d
d
cc
m
ad
bc
g, d
g, d
m
m
m
m
m
m
aa
d
g, d
g, d
g, d
Incomplete Information
g, d
m
dc
m
m
ac
g, d
39
Sequential State Set Progression
bi
bc
bd
g
bi
bb
bc
bd
m
ci
cb
cc
cd
d
bb
bc
bd
General Game Playing
Incomplete Information
40
Sequential State Set Plan
aa
ab move
ac
ad
ba
bb
bc
bd
da
di
dd
General Game Playing
grab
grab
ba
bi move
bc
bd
ca
ci
cc
cd
da move
di
aa drop
ai
Incomplete Information
grab
ca
ci
cd
move
aa
41
Execution
g, d
ca
m
ba
m
g, d
da
cb
m
m
g, d
g
ai
m
cd
m
bd
g, d
General Game Playing
m
m
g g
g g
d
bb
bi
g
g
di
m ci
dd
d
g
db
m
ab
g, d
d
cc
m
ad
bc
g, d
g, d
m
m
m
m
m
m
aa
d
g, d
g, d
g, d
Incomplete Information
g, d
m
dc
m
m
ac
g, d
42
Conditional State Set Progression
bi
g
bi
bb
1
bi
bb
bc
bd
m
ci
cb
m
cc
cd
d
bb
0
bc
bd
General Game Playing
Incomplete Information
43
Conditional State Set Plan
m
m
1/g
0/m
1/g
0/m
m
g
0/m
d
General Game Playing
Incomplete Information
44
Comparison
Sequential plan
possible that no plan exists
plan may contain redundant moves
Conditional plan
large search space
Delayed planning
irreversibility problematic
General Game Playing
Incomplete Information
45
Moral of this Story
As we can see from this analysis, it is sometimes desirable for an agent to do
only a portion of its planning up front, secure in the knowledge that it can do
more later as necessary.
Planning can be done offline and the resulting plan/program executed during
play or the planning can be done online and interleaved with execution.
General Game Playing
Incomplete Information
46

Download Report

Lecture 4: Incomplete Information

Paperzz.com

Your Paperzz