General Game Playing
Lecture 3
Complete Search
Michael Genesereth
Spring 2012
Game Description
init(cell(1,1,b))
init(cell(1,2,b))
init(cell(1,3,b))
init(cell(2,1,b))
init(cell(2,2,b))
init(cell(2,3,b))
init(cell(3,1,b))
init(cell(3,2,b))
init(cell(3,3,b))
init(control(x))

legal(P,mark(X,Y)) :-
  true(cell(X,Y,b)) &
  true(control(P))
legal(x,noop) :- true(control(o))
legal(o,noop) :- true(control(x))

next(cell(M,N,P)) :- does(P,mark(M,N))
next(cell(M,N,Z)) :-
  does(P,mark(J,K)) &
  true(cell(M,N,Z)) & Z#b
next(cell(M,N,b)) :-
  does(P,mark(J,K)) &
  true(cell(M,N,b)) &
  (M#J | N#K)
next(control(x)) :- true(control(o))
next(control(o)) :- true(control(x))

row(M,P) :-
  true(cell(M,1,P)) &
  true(cell(M,2,P)) &
  true(cell(M,3,P))
column(N,P) :-
  true(cell(1,N,P)) &
  true(cell(2,N,P)) &
  true(cell(3,N,P))
diagonal(P) :-
  true(cell(1,1,P)) &
  true(cell(2,2,P)) &
  true(cell(3,3,P))
diagonal(P) :-
  true(cell(1,3,P)) &
  true(cell(2,2,P)) &
  true(cell(3,1,P))

line(P) :- row(M,P)
line(P) :- column(N,P)
line(P) :- diagonal(P)

open :- true(cell(M,N,b))
draw :- ~line(x) & ~line(o)

terminal :- line(P)
terminal :- ~open

goal(x,100) :- line(x)
goal(x,50) :- draw
goal(x,0) :- line(o)
goal(o,100) :- line(o)
goal(o,50) :- draw
goal(o,0) :- line(x)
Game Manager
[Diagram: the Game Manager maintains game descriptions, temporary state data, and match records, provides graphics for spectators, and communicates with the players over TCP/IP.]
Game Playing Protocol
Start - manager sends a Start message to each player
  start(id,role,game,startclock,playclock)
  player replies: ready
Play - manager sends Play messages to the players
  play(id,[action,...,action])
  player replies: action
Stop - manager sends a Stop message to the players
  stop(id,[action,...,action])
  player replies: done
Your Mission
Your mission is to write definitions for the three
basic event handlers (start, play, stop).
To help you get started, we provide subroutines for
accessing the implicit game graph instead of the
explicit description.
NB: The game graph is virtual - it is computed on
the fly from the game description. Hence, our
subroutines are not cheap. Best to cache results
rather than call repeatedly.
Basic Player Subroutines
findroles(game)                          → [role,...,role]
findbases(game)                          → {proposition,...,proposition}
findinputs(game)                         → {action,...,action}
findinitial(game)                        → state
findterminal(state,game)                 → boolean
findgoal(role,state,game)                → number
findlegals(role,state,game)              → {action,...,action}
findnext([action,...,action],state,game) → state
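As a sketch of how these subroutines fit together, the following runs a random playout ("depth charge") from a given state to a terminal state. The bundle g of subroutines and the name depthCharge are our own assumptions, not part of the provided interface.

```javascript
// A random playout ("depth charge") from a state to a terminal state.
// The object g bundling the subroutines is an assumption of this sketch.
function depthCharge (role, state, game, g)
 {while (!g.findterminal(state, game))
     {var acts = g.findlegals(role, state, game);
      var act = acts[Math.floor(Math.random() * acts.length)];
      state = g.findnext([act], state, game)};
  return g.findgoal(role, state, game)}
```

Note that the subroutines are not cheap, so repeated depth charges are exactly the situation in which caching results pays off.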
Legal Play
Starting
var game;
var player;
var state;
function start (id,role,rules,startclock,playclock)
 {game = rules;
  player = role;
  state = null;
  return 'ready'}
Playing
function play (id,move)
 {if (move.length == 0)
     {state = findinitial(game)}
  else {state = findnext(move,state,game)};
  var actions = findlegals(player,state,game);
  return actions[0]}
Stopping
function stop (id,move)
{return 'done'}
Single Player Games
Game Graph
[Figure: a game graph with states s1-s11 connected by actions a, b, c, and d.]
Depth First Search
Visit order: a b e f c g h d i j

[Figure: a tree rooted at a, with nodes a-j expanded in depth-first order.]
Advantage: Small intermediate storage
Disadvantage: Susceptible to garden paths
Disadvantage: Susceptible to infinite loops
Breadth First Search
Visit order: a b c d e f g h i j

[Figure: the same tree, with nodes a-j expanded level by level.]
Advantage: Finds shortest path
Disadvantage: Consumes large amount of space
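The two visit orders above can be reproduced with a short sketch. The tree shape (a with children b, c, d; b with e, f; c with g, h; d with i, j) is inferred from the example; the children map is our own encoding.

```javascript
// Tree from the example, encoded as a child map (our own encoding).
var children = {a: ['b','c','d'], b: ['e','f'], c: ['g','h'], d: ['i','j'],
                e: [], f: [], g: [], h: [], i: [], j: []};

// Depth-first: recurse into each child before moving to the next sibling.
function dfs (node, visit)
 {visit(node);
  for (var i = 0; i < children[node].length; i++)
     {dfs(children[node][i], visit)}}

// Breadth-first: expand nodes in FIFO order, one level at a time.
function bfs (root, visit)
 {var queue = [root];
  while (queue.length > 0)
     {var node = queue.shift();
      visit(node);
      queue = queue.concat(children[node])}}
```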
Time Comparison
Analysis for branching 2, depth d, and solution at depth k.

         Best    Worst
Depth    k       2^(d+1) - 2^(d-k+1)
Breadth  2^k     2^(k+1) - 1
Time Comparison
Analysis for branching b, depth d, and solution at depth k.

         Best                     Worst
Depth    k                        (b^(d+1) - b^(d-k+1)) / (b - 1)
Breadth  (b^k - 1)/(b - 1) + 1    (b^(k+1) - 1) / (b - 1)
Space Comparison
Analysis for total depth d and solution depth k.

         Binary     General
Depth    d          (b - 1)(d - 1) + 1
Breadth  2^(k-1)    b^(k-1)
Iterative Deepening
Run depth-limited search repeatedly,
starting with a small initial depth,
incrementing the depth on each iteration,
until it succeeds or runs out of alternatives.
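A minimal sketch of the procedure, reusing the tree from the depth-first example (the children map and the goal test are our own assumptions):

```javascript
// Tree from the earlier search examples (our own encoding).
var children = {a: ['b','c','d'], b: ['e','f'], c: ['g','h'], d: ['i','j'],
                e: [], f: [], g: [], h: [], i: [], j: []};

// Depth-limited search: succeed only if the goal is within the limit.
function depthLimited (node, goal, limit)
 {if (node === goal) {return [node]};
  if (limit === 0) {return null};
  for (var i = 0; i < children[node].length; i++)
     {var path = depthLimited(children[node][i], goal, limit - 1);
      if (path !== null) {return [node].concat(path)}};
  return null}

// Iterative deepening: rerun with growing limits until success or cutoff.
function iterativeDeepening (root, goal, maxDepth)
 {for (var limit = 0; limit <= maxDepth; limit++)
     {var path = depthLimited(root, goal, limit);
      if (path !== null) {return path}};
  return null}
```

Because the first limit at which the goal appears is tried first, the returned path is a shortest one, while storage stays proportional to the depth.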
Example
Iteration 1: a
Iteration 2: a b c d
Iteration 3: a b e f c g h d i j

[Figure: the same tree as before, with nodes a-j.]
Advantage: Small intermediate storage
Advantage: Finds shortest path
Advantage: Not susceptible to garden paths
Advantage: Not susceptible to infinite loops
Time Comparison
Node counts for branching factor 2.

Depth    Iterative          Depth-First
1        1                  1
2        4                  3
3        11                 7
4        26                 15
5        57                 31
n        2^(n+1) - n - 2    2^n - 1
General Results
Theorem [Korf]: The cost of iterative deepening search
is b/(b-1) times the cost of depth-first search (where b
is the branching factor).
Theorem: The space cost of iterative deepening is no
greater than the space cost for depth-first search.
Game Graph I
[Figure: a game graph with states a through n.]
Game Graph II
[Figure: a second game graph with states a through n, in which states b and c appear more than once.]
Bidirectional Search
Search forward from the start state and backward from the goal state simultaneously, hoping the two frontiers meet in the middle.

[Figure: bidirectional search between states a and m.]
Minimax
Basic Idea
Intuition - Select a move that is guaranteed to
produce the highest possible return no matter what
the opponents do.
In the case of a one move game, a player should
choose an action such that the value of the resulting
state for any opponent action is greater than or
equal to the value of the resulting state for any other
action and opponent action.
In the case of a multi-move game, minimax goes to
the end of the game and “backs up” values.
Bipartite Game Graph/Tree
[Figure: a bipartite game graph alternating max nodes and min nodes.]
State Value
The value of a max node for player p is either the utility
of that state if it is terminal or the maximum of all values
for the min nodes that result from its legal actions.
value(p,x) = goal(p,x)                                       if terminal(x)
value(p,x) = max({value(p,minnode(p,a,x)) | legal(p,a,x)})   otherwise

The value of a min node is the minimum value that
results from any legal opponent action.

value(p,n) = min({value(p,maxnode(P-p,b,n)) | legal(P-p,b,n)})
Bipartite Game Tree
[Figure: a bipartite game tree with leaf utilities and backed-up minimax values; the root's value is 3.]
Best Action
A player should choose an action a that leads to a
min node with maximal value, i.e. the player should
prefer action a to action a' if and only if the
following holds.

value(p,minnode(p,a,x)) > value(p,minnode(p,a',x))
Best Action
function bestaction (role,state)
 {var acts = legals(role,state);
  var best = acts[0];
  var score = minscore(role,best,state);
  for (var i = 1; i < acts.length; i++)
     {var s = minscore(role,acts[i],state);
      if (s > score) {best = acts[i]; score = s}};
  return best}
Scores
function maxscore (role,state)
 {if (terminalp(state)) {return goal(role,state)};
  var value = [];
  for (act in legals(role,state))
     {value[act] = minscore(role,act,state)};
  return max(value)}

function minscore (role,action,state)
 {var value = [];
  for (move in moves(role,action,state))
     {value[move] = maxscore(role,next(move,state))};
  return min(value)}
Bounded Minimax
If the minvalue for an action is determined to be
100, then there is no need to consider other actions.
When computing the minvalue for a state, if the value
to the first player of some opponent action is 0, then
there is no need to consider other possibilities.
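A sketch of the idea over an explicit tree whose leaves are utilities in {0, 50, 100}; the tree encoding is our own, not the GDL interface.

```javascript
// Max node: stop as soon as some action achieves the maximal utility 100.
function maxscoreBounded (node)
 {if (typeof node === 'number') {return node};   // leaf utility
  var v = 0;
  for (var i = 0; i < node.length; i++)
     {v = Math.max(v, minscoreBounded(node[i]));
      if (v === 100) {return v}};                // no action can do better
  return v}

// Min node: stop as soon as the opponent can force the minimal utility 0.
function minscoreBounded (node)
 {if (typeof node === 'number') {return node};
  var v = 100;
  for (var i = 0; i < node.length; i++)
     {v = Math.min(v, maxscoreBounded(node[i]));
      if (v === 0) {return v}};                  // no possibility is worse
  return v}
```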
Bounded Minimax Example
[Figure: a minimax tree with values 0, 50, and 100, showing branches that need not be examined.]
Alpha Beta Pruning
Alpha-Beta Search - Same as Bounded Minimax
except that bounds are computed dynamically and
passed along as parameters. See Russell and
Norvig for details.
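A compact sketch of the scheme over an explicit tree of leaf utilities (our own encoding; see Russell and Norvig for the full algorithm):

```javascript
// Max node: alpha is the best value guaranteed to max along the path so far.
function maxValue (node, alpha, beta)
 {if (typeof node === 'number') {return node};   // leaf utility
  var v = -Infinity;
  for (var i = 0; i < node.length; i++)
     {v = Math.max(v, minValue(node[i], alpha, beta));
      if (v >= beta) {return v};                 // beta cutoff: min avoids this node
      alpha = Math.max(alpha, v)};
  return v}

// Min node: beta is the best value guaranteed to min along the path so far.
function minValue (node, alpha, beta)
 {if (typeof node === 'number') {return node};
  var v = Infinity;
  for (var i = 0; i < node.length; i++)
     {v = Math.min(v, maxValue(node[i], alpha, beta));
      if (v <= alpha) {return v};                // alpha cutoff: max avoids this node
      beta = Math.min(beta, v)};
  return v}
```

The bounds replace the fixed 0/100 cutoffs of bounded minimax with values computed dynamically during the search.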
Alpha Beta Example
[Figure: an alpha-beta example tree; the root's value is 15, and several subtrees are pruned.]
Benefits
In the best case, alpha-beta pruning reduces the
search space to the square root of the unpruned
search space, dramatically increasing the depth
searchable within a given time bound.
Caching
Virtual Game Graph
The game graph is virtual - it is computed on the fly
from the game description. Hence, our subroutines
are not cheap.
May be good to cache results (build an explicit
game graph) rather than call repeatedly.
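A sketch of such a cache: a transposition table keyed on a serialized state. The key scheme and the names cachedScore/rawScore are our own assumptions.

```javascript
// Transposition table: map from (role, state) to a previously computed value.
var cache = {};

// rawScore stands in for the real search routine (e.g. maxscore).
function cachedScore (role, state, game, rawScore)
 {var key = role + '|' + JSON.stringify(state);
  if (cache.hasOwnProperty(key)) {return cache[key]};  // duplicate state: reuse
  var value = rawScore(role, state, game);
  cache[key] = value;
  return value}
```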
Benefit - Avoiding Duplicate Work
Consider a game in which the actions allow one to
explore an 8x8 board by making moves left, right,
up, and down.
Game tree has branching factor 4 and depth 16.
Fringe of tree has 4^16 = 4,294,967,296 nodes.
Only 64 states. Space searched much faster if
duplicate states detected.
Disadvantage - Extra Work
Finding duplicate states takes time proportional to
the log of the number of states and takes space
linear in the number of states.
Work wasted when duplicate states rare. (We shall
see two common examples of this next week.)
Memory might be exhausted. Need to throw away
old states.