Efficient belief-state AND–OR search, with application to Kriegspiel
Stuart Russell and Jason Wolfe
UC Berkeley
From Efficient belief-state AND–OR search, with application to Kriegspiel.
IJCAI 2005 (in press).
UCB 5/18/2005
The Real World…

Contains multiple agents with disparate goals
Requires adversarial decision making
Is partially observable
Agents must make decisions, despite having incomplete knowledge about the state of their environment
Agents should consider both their own information state (to gather information) and their opponent’s information state (to hide information)
Commonly studied through fully observable games, such as chess or backgammon
Optimal strategies are randomized
Can thus be modeled by partially observable games (e.g., poker)
Kriegspiel (“war-game”)

A partially observable variant of chess – opponent pieces invisible, moves secret
Referee observes actions, provides percepts
Players attempt possibly-legal actions until one is legal
Symmetric player percepts:
  Move illegal; repeated identical illegal moves disallowed; attempts (except pawn captures) must be legal in absence of opponent pieces
  Capture occurred on <square>
  Check occurred in <directions>. <directions> is one or more of Rank, File, Long Diagonal, Short Diagonal, or Knight (from king’s perspective)
  Checkmate and stalemate
Task/Metric

Task: given a Kriegspiel move and observation history for White, determine whether White can guarantee a win within 3-ply (actual moves)
  Can assume that Black has full observation; consider only deterministic strategies
Two sub-problems:
  State estimation: determine the belief state, i.e., the set of possible board positions consistent with the move and observation history
  Move selection: given the belief state, determine a move plan for White that guarantees a win within 3-ply
Metric: performance on a Kriegspiel checkmate problem database
  Database generated by playing two different Kriegspiel agents: 500 “mate instances” with guaranteed wins for White and 500 “near-miss instances” that “almost” have guaranteed wins for White
  Measure accuracy in classifying database problem instances as mates or near-misses within some fixed time limit
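
The state-estimation sub-problem above can be read as a filtering step: keep only the positions that could have produced the percepts actually received. The following is a minimal Python sketch on a toy hidden-cell game, not on Kriegspiel itself. The names `update_belief`, `estimate`, and `toy_result`, and the "hit"/"miss" percepts, are hypothetical and introduced only for illustration.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List, Tuple

State = Hashable
Percept = Hashable
Outcome = Tuple[State, Percept]  # (next state, percept announced)


def update_belief(
    belief: FrozenSet[State],
    move: Hashable,
    percept: Percept,
    result: Callable[[State, Hashable], Iterable[Outcome]],
) -> FrozenSet[State]:
    """Return the successor states consistent with the percept received."""
    return frozenset(
        nxt
        for state in belief
        for nxt, seen in result(state, move)
        if seen == percept
    )


def estimate(
    belief: FrozenSet[State],
    history: List[Tuple[Hashable, Percept]],
    result: Callable[[State, Hashable], Iterable[Outcome]],
) -> FrozenSet[State]:
    """Fold update_belief over a (move, percept) history."""
    for move, percept in history:
        belief = update_belief(belief, move, percept, result)
    return belief


def toy_result(state: int, move: int) -> Iterable[Outcome]:
    """Toy model (assumption): a hidden piece sits on cell `state`;
    probing cell `move` reports "hit" or "miss" and nothing moves."""
    return [(state, "hit" if state == move else "miss")]


initial = frozenset(range(4))          # piece could be on cells 0..3
history = [(2, "miss"), (0, "miss")]   # probes and the referee's replies
final_belief = estimate(initial, history, toy_result)
print(sorted(final_belief))            # prints [1, 3]
```

In a real Kriegspiel estimator the `result` function would also have to enumerate Black's hidden replies, which is what makes belief states large.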
Belief-State AND–OR Search


Test for guaranteed checkmate by searching a tree whose nodes correspond to White’s belief states
Common algorithms are depth-first search (DFS) and proof-number search (PNS)
Both treat a belief state as a “black box”
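
A depth-limited DFS over belief states might look like the sketch below: OR nodes choose a move, AND nodes branch on every percept the move could produce, and each belief state is handled only as an opaque set. This is an illustration on a toy hidden-cell game, not the authors' implementation. All names (`guaranteed_win`, `split_by_percept`, the `toy_*` helpers) are hypothetical, depth here counts White's moves, and any opponent replies are assumed to be folded into the outcomes returned by `result`.

```python
from typing import Dict, FrozenSet, Hashable, Set

State = Hashable


def split_by_percept(belief, move, result) -> Dict[Hashable, FrozenSet[State]]:
    """AND branching: one successor belief state per possible percept."""
    groups: Dict[Hashable, Set[State]] = {}
    for state in belief:
        for nxt, percept in result(state, move):
            groups.setdefault(percept, set()).add(nxt)
    return {p: frozenset(s) for p, s in groups.items()}


def guaranteed_win(belief, depth, moves, result, is_win) -> bool:
    """Depth-limited DFS: OR over moves, AND over percepts."""
    if all(is_win(s) for s in belief):
        return True
    if depth == 0:
        return False
    for move in moves(belief):                        # OR node
        children = split_by_percept(belief, move, result)
        if children and all(                          # AND node
            guaranteed_win(child, depth - 1, moves, result, is_win)
            for child in children.values()
        ):
            return True
    return False


# Toy game (assumption): a hidden piece sits on one of three cells;
# probing its cell ends the game with a "mate" percept.
WIN = "WIN"


def toy_moves(belief):
    return range(3)


def toy_result(state, move):
    if state == move:
        return [(WIN, "mate")]
    return [(state, "miss")]


def toy_is_win(state):
    return state == WIN


start = frozenset({0, 1, 2})
print(guaranteed_win(start, 2, toy_moves, toy_result, toy_is_win))  # False
print(guaranteed_win(start, 3, toy_moves, toy_result, toy_is_win))  # True
```

Two probes cannot cover three cells in the worst case, so the search reports no guaranteed win at depth 2 and finds one at depth 3.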
Algorithmic Contribution

Developed a new family of “incremental” belief-state AND–OR search algorithms that “look inside” the belief state
Treat uncertainty as a new search dimension in addition to familiar depth and breadth
G-DBU and IPNS (black and blue at right) are two incremental algorithms
Both can be orders of magnitude faster than previous algorithms
[Figure: two plots, “Mate Instance Solve Time” and “Near-Miss Instance Solve Time”]
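
The slides do not spell out G-DBU or IPNS, so the sketch below is not either algorithm. It illustrates only one sound consequence of examining individual states inside a belief state: if any single state has no forced win on its own, the whole belief state has none, so a near-miss can sometimes be refuted without expanding the full belief-state tree. The toy game and the names `result`, `wins`, and `find_refuting_state` are assumptions made for illustration.

```python
WIN = "WIN"
MOVES = range(3)


def result(state, move):
    """Toy model (assumption): probing the hidden piece's cell wins;
    the special state "X" can never be hit."""
    if state == move:
        return [(WIN, "mate")]
    return [(state, "miss")]


def wins(belief, depth):
    """Plain belief-state AND-OR DFS that treats the belief as a black box."""
    if all(s == WIN for s in belief):
        return True
    if depth == 0:
        return False
    for move in MOVES:                                   # OR over moves
        groups = {}
        for s in belief:
            for nxt, percept in result(s, move):
                groups.setdefault(percept, set()).add(nxt)
        if all(wins(frozenset(g), depth - 1) for g in groups.values()):
            return True                                  # AND over percepts
    return False


def find_refuting_state(belief, depth):
    """Return one state that has no forced win on its own, or None.

    A non-None answer disproves a guaranteed win for the whole belief
    state; None proves nothing, because the check is only necessary."""
    for s in belief:
        if not wins(frozenset({s}), depth):
            return s
    return None


near_miss = frozenset({0, 1, "X"})
print(find_refuting_state(near_miss, 3))             # prints X
print(find_refuting_state(frozenset({0, 1, 2}), 2))  # prints None
print(wins(frozenset({0, 1, 2}), 2))                 # prints False
```

The last two lines show why the per-state check is only a disproof shortcut: every state in `{0, 1, 2}` is winnable alone within two moves, yet no single plan covers all three.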
New algorithms can solve 3-ply database instances with 98% accuracy in 10 s.
[Figure annotations: 100% @ 2.66 s; 98% @ 10 s]
Demonstrations
Demo 1: state estimation for a 3-ply mate instance
Demo 2: move selection for this same problem instance
Demo 3: play against our Kriegspiel agent
