Depth-First Proof-Number Search in the Presence of Repetitions

Akihiro Kishimoto
Tokyo Institute of Technology, Japan
Outline of this Talk

• Background
• Depth-First Proof-Number Search (DFPN)
• DFPN Dealing with Repetitions
• Experimental Results in Tsume-Shogi (Checkmating Problem in Japanese Chess)
• Conclusions and Future Work
AI and Games

Why computer games?
– Ideal test bed for search algorithms
  • Simple rules
  • Computationally intensive
  • Need real-time response
– A lot of applications
  • Bioinformatics, theorem-proving systems
– Strong and clear motivation
  • Develop algorithms outperforming best humans
Why Computer Shogi?

Deep Blue won against Garry Kasparov in 1997
• Shogi (Japanese chess) is the next target
– Popular game in Japan
  • 15 million players and 150-200 professional players
  • Annual computer shogi championship with over 40 participants
– Harder game for computers to play well
  • Chess: 10^40 possible positions (branching factor 35)
  • Shogi: 10^70 possible positions (branching factor 80-100)
Tsume-Shogi (Checkmating Problem) in Computer Shogi

Problem of finding mate for opponent king
• Restricted to moves that check the king or escape from check
– Branching factor 5 on average

Ideal domain to investigate ideas on AND/OR tree search
– Important component used in strong programs
– Many hard problems created by humans over 400 years
  • E.g. Microcosmos: 1525 ply (= depth) to mate
Solving Tsume-Shogi by AND/OR Tree Search

• 1st player needs one move leading to win (OR node)
• 2nd player verifies all the moves leading to loss (AND node)
• Expand search tree until finding a solution
[Figure: AND/OR tree with nodes A to M, showing the root node, interior nodes, leaf nodes and terminal nodes, each labeled Win (W), Loss (L) or Unknown (U). Legend: OR node, AND node]
Importance for Enhancements

Search space: O(b^d)
– b: branching factor
– d: search depth

Proof tree: O(b^(d/2))
– Tree that proves a win or a loss

Trade-off: speed versus search space reduction
[Figure: example proof tree rooted at A (Win), drawn inside the AND/OR tree with nodes A to M labeled Win, Loss or Unknown. Legend: OR node, AND node]
How to Improve Search Efficiency?

Game independent techniques
– Applicable to AND/OR tree search in general

Game dependent techniques
– Applicable to that game only

High-performance solvers usually use both
• This talk focuses only on shogi-independent techniques
Depth-First Proof-Number Search (DFPN) [Nagai:2002]

One of the most successful AND/OR tree search
algorithms adapted to many domains
– Checkers, shogi (Japanese chess), Go etc

Use proof and disproof numbers [Allis:94]
– Proof number: minimal number of leaf nodes that must be proven to prove a win at a node
– Disproof number: minimal number of leaf nodes that must be disproven to prove a loss at a node

Select most promising node based on proof and disproof numbers in current search tree
• Expand the leaf node selected by this strategy and re-compute proof and disproof numbers
Proof Numbers
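Example (OR node A with AND children B and C; B has leaf children D, E, F; C has leaf child G; every leaf has pn = 1)
• pn(B) = pn(D) + pn(E) + pn(F) = 1 + 1 + 1 = 3
• pn(C) = pn(G) = 1
• pn(A) = min(pn(B), pn(C)) = min(3, 1) = 1
[Figure: tree A; B, C; D, E, F, G annotated with proof numbers. Legend: OR node, AND node]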
Disproof Numbers

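Example (every leaf has dn = 1; disproof numbers are summed at OR nodes and minimized at AND nodes, the reverse of proof numbers)
• dn(B) = dn(D) + dn(E) + dn(F) = 1 + 1 + 1 = 3
• dn(C) = dn(G) = 1
• dn(A) = min(dn(B), dn(C)) = min(3, 1) = 1
[Figure: tree A; B, C; D, E, F, G annotated with disproof numbers. Legend: OR node, AND node]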
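The two slides above can be condensed into one recursive rule. The following is a minimal sketch in Python, assuming a plain tree (no transpositions) in which every unexpanded leaf has pn = dn = 1; the `Node` class and the function name are illustrative and do not come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    is_or: bool                                   # True: OR node, False: AND node
    children: list = field(default_factory=list)


def proof_numbers(node):
    """Return (pn, dn) for `node`; an unexpanded leaf counts as (1, 1)."""
    if not node.children:
        return 1, 1
    values = [proof_numbers(child) for child in node.children]
    pns = [pn for pn, _ in values]
    dns = [dn for _, dn in values]
    if node.is_or:
        return min(pns), sum(dns)   # one winning move is enough; all must fail to lose
    return sum(pns), min(dns)       # all replies must be answered; one refutation loses


# The tree of the proof-number example: OR root A, AND nodes B and C.
B = Node("B", False, [Node("D", True), Node("E", True), Node("F", True)])
C = Node("C", False, [Node("G", True)])
A = Node("A", True, [B, C])
print(proof_numbers(B))  # (3, 1)
print(proof_numbers(A))  # (1, 2)
```

The values for A agree with pn(A)=1 and dn(A)=2 for the same tree shape.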
Depth-First Proof-Number Search (DFPN) (Cont’d)

Use two thresholds, one for proof numbers and one for disproof numbers
– Reformulate proof-number search (PNS) [Allis:94] into depth-first search
– Equivalent to PNS in its best-first search behavior
– Lower memory requirement and fewer re-expansions of interior nodes

Leverage transposition table (TT)
– Cache proof and disproof numbers of expanded nodes to reduce re-expansion
– Also reuse search results for transposed positions by ignoring the path taken to reach them
– Implemented as a hash table
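The threshold mechanism can be sketched as follows. This is a minimal illustration, not the solver from the talk: it assumes a finite tree without repetitions, an OR node at the root, and the common formulation in which `phi` is the proof number at OR nodes and the disproof number at AND nodes, with `delta` the other one. All function and variable names are hypothetical.

```python
INF = 10**9  # stands in for infinity


def dfpn(root, children, first_player_wins):
    """Solve a finite AND/OR tree whose root is an OR node.

    children: dict mapping a node to the list of its child nodes
    first_player_wins: dict mapping every terminal node to True or False
    Returns True if the root is a proven win for the first player.
    """
    tt = {}  # transposition table: node -> (phi, delta)

    def look_up(n):
        return tt.get(n, (1, 1))  # unexpanded nodes start at (1, 1)

    def mid(n, th_phi, th_delta, is_or):
        kids = children.get(n, [])
        if not kids:  # terminal node: store a proven value
            good_for_mover = first_player_wins[n] == is_or
            tt[n] = (0, INF) if good_for_mover else (INF, 0)
            return
        while True:
            phi = min(look_up(c)[1] for c in kids)
            delta = min(INF, sum(look_up(c)[0] for c in kids))
            tt[n] = (phi, delta)
            if phi >= th_phi or delta >= th_delta:
                return  # a threshold is reached: back up to the parent
            ranked = sorted(kids, key=lambda c: look_up(c)[1])
            best = ranked[0]
            delta2 = look_up(ranked[1])[1] if len(ranked) > 1 else INF
            # Stay below `best` until it stops being the most promising child.
            mid(best,
                th_delta - delta + look_up(best)[0],
                min(th_phi, delta2 + 1),
                not is_or)

    mid(root, INF, INF, True)
    return tt[root][0] == 0  # phi is the proof number at an OR node


tree = {"A": ["B", "C"], "B": ["D", "E", "F"], "C": ["G"]}
leaves = {"D": True, "E": True, "F": False, "G": True}
print(dfpn("A", tree, leaves))  # True
```

On this small tree the first call below A searches B with thpn(B)=2 and thdn(B)=INF-1, which is how the thresholds keep the depth-first search on the same node order that best-first PNS would follow.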
Behavior of DFPN
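Example
[Figure: DFPN search trace on an AND/OR tree with nodes A to J, each annotated with proof and disproof numbers (pn, dn) and thresholds (thpn, thdn). Legend: OR node, AND node]
• Root: pn(A)=1, dn(A)=2, thpn(A)=INF, thdn(A)=INF
• B is searched with thpn(B)=2, thdn(B)=INF-1 and is left when pn(B)=3 >= thpn(B)=2
• C is searched with thpn(C)=4, thdn(C)=INF-1
• G is searched with thpn(G)=3, thdn(G)=2 and is left when dn(G)=2 >= thdn(G)=2
• H is searched with thpn(H)=3, thdn(H)=3
• Leaves D, E, F start with pn = dn = 1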
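Transposition Table – Crucial Enhancement to DFPN

• Use properties of DAG
• Cache previous search efforts
Example
[Figure: DAG over nodes A, B, C, D in which D is reached more than once. Once D is stored in the transposition table as a Win, reaching it again needs no duplicate search. Legend: OR node, AND node]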
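The saving from a transposition table can be illustrated with a small sketch. It assumes an acyclic graph and a plain Python dict as the hash table. The node types and the terminal node W below are made up for the illustration, since only the node names A to D come from the slide.

```python
def evaluate(node, graph, tt, visits):
    """Return True if `node` is a win for the first player.

    graph maps a node to ("OR" or "AND", [children]), or to a bool for terminals.
    tt caches results per position, ignoring the path that reached it
    (pass None to disable caching). visits records every real evaluation.
    """
    if tt is not None and node in tt:
        return tt[node]              # transposition: reuse the stored result
    visits.append(node)
    entry = graph[node]
    if isinstance(entry, bool):
        result = entry
    else:
        kind, kids = entry
        results = (evaluate(k, graph, tt, visits) for k in kids)
        result = any(results) if kind == "OR" else all(results)
    if tt is not None:
        tt[node] = result
    return result


graph = {
    "A": ("AND", ["B", "C"]),
    "B": ("OR", ["D"]),
    "C": ("OR", ["D"]),   # D is reached via both B and C
    "D": ("AND", ["W"]),
    "W": True,
}
with_tt, without_tt = [], []
evaluate("A", graph, {}, with_tt)
evaluate("A", graph, None, without_tt)
print(len(with_tt), len(without_tt))  # 5 7
```

With the table, D and everything below it is evaluated once. Without it, the subtree below D is searched again for each path that reaches D.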
DFPN in the Presence of Repetitions

DFPN is equivalent to PNS
– This property holds only if search space is a tree

Search space of many games is a directed cyclic graph (DCG) involving repetitions
• DFPN has several issues with repetitions
– Correctness
– Incompleteness
– Inefficiency
Graph-History Interaction (GHI) Problem [Palay:83]

• Assume that a move leading to a previous position is illegal
• Transposition table ignores history
– May contain incorrect results
Example
[Figure: graph with nodes A, B, C, D. Legend: OR node, AND node]
• Is D a win or a loss?
– ABD(B): Win
– ACDB(D): Loss
• Result at D is dependent on path
Solution to GHI Problem [Kishimoto & Mueller:AAAI2004]

• Store an encoded position and an encoded path in each transposition table entry
• Reuse proof and disproof numbers for unproven node
• Save win/loss via path if repetitions are involved
• Save win/loss with no condition if repetitions are not involved
Example
[Figure: graph with nodes A, B, C, D. Transposition table entries: D via ABD: Win; D via ACD: Loss]
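A table entry along these lines might look as follows. This is only a sketch of the storage rules listed on the slide, not the published implementation: the class and method names are invented, and the path is keyed by the full move sequence as a tuple, whereas a real solver would store a compact encoded signature.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    pn: int = 1
    dn: int = 1
    unconditional: Optional[str] = None          # "win"/"loss" valid on any path
    by_path: dict = field(default_factory=dict)  # path key -> "win"/"loss"


class PathAwareTable:
    def __init__(self):
        self.entries = {}

    def _entry(self, position):
        return self.entries.setdefault(position, Entry())

    def store_numbers(self, position, pn, dn):
        entry = self._entry(position)
        entry.pn, entry.dn = pn, dn

    def store_result(self, position, path, result, repetition_involved):
        entry = self._entry(position)
        if repetition_involved:
            entry.by_path[tuple(path)] = result  # valid only via this path
        else:
            entry.unconditional = result         # valid regardless of history

    def lookup(self, position, path):
        """Return (result or None, pn, dn) for `position` reached via `path`."""
        entry = self.entries.get(position)
        if entry is None:
            return None, 1, 1
        result = entry.unconditional or entry.by_path.get(tuple(path))
        return result, entry.pn, entry.dn


table = PathAwareTable()
table.store_result("D", "ABD", "win", repetition_involved=True)
table.store_result("D", "ACD", "loss", repetition_involved=True)
print(table.lookup("D", "ABD")[0])  # win
print(table.lookup("D", "ACD")[0])  # loss
```

A lookup of D via any other path returns no result but still returns the stored proof and disproof numbers, which matches the rule that unproven information can be reused independently of the path.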
Infinite Loop in DFPN [Kishimoto & Mueller:2003,2008]

• Search space of many games is DCG
• No new leaf node is expanded
• (Dis)proof numbers are over-counted due to repetitions
Example
– dn(O) = dn(I) + dn(P) >= thdn(O)
– dn(N) = dn(O) >= thdn(N)
[Figure: graph with nodes A to P containing a repetition]
DFPN(r) Algorithm [Kishimoto & Mueller:ACG2003]

• Keep the minimal distance (md) from root
• Modify computation of pn & dn
Example
• dn(O) = dn(P) if P is unproven
• dn(O) = dn(P) + dn(I) in standard computation of disproof number
[Figure: graph with nodes A to P, with levels labeled md=0 to md=5]
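The modified summation can be sketched as below. This is a simplified reading of the slide, not the published algorithm: a child is treated as "old" when its minimal distance is smaller than its parent's, and old children are left out of the sum as long as some other child is still unsolved. The names and the md and dn values in the example are hypothetical.

```python
from typing import NamedTuple


class Child(NamedTuple):
    name: str
    md: int        # minimal distance from the root
    dn: int        # current disproof number
    solved: bool   # True once the child is proven or disproven


def dn_standard(children):
    """Standard rule at a node where disproof numbers are summed."""
    return sum(c.dn for c in children)


def dn_dfpn_r(parent_md, children):
    """Sum only over non-old children while one of them is still unsolved."""
    normal = [c for c in children if c.md >= parent_md]
    if any(not c.solved for c in normal):
        return sum(c.dn for c in normal)
    return sum(c.dn for c in children)


# O has a normal child P and an old child I that lies closer to the root.
kids_of_O = [Child("P", md=5, dn=2, solved=False),
             Child("I", md=3, dn=4, solved=False)]
print(dn_standard(kids_of_O))      # 6
print(dn_dfpn_r(4, kids_of_O))     # 2
```

Leaving I out breaks the feedback loop that inflates dn(O) through the repetition, at the price of possibly underestimating dn(O).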
Underestimation Problem in DFPN(r)

• Underestimating (dis)proof numbers for directed acyclic graph (DAG)
Example
• dn(C) must be dn(D) + dn(E)
• DFPN(r) computes dn(C) = dn(D)
[Figure: DAG with A (md=0), B (md=1), C (md=2), D (md=3), E (md=1). Legend: OR node, AND node]
Threshold Controlling Algorithm (TCA) [Kishimoto:AAAI2010]

• Don’t change the way of computing (dis)proof numbers
• Increase threshold if node n has a child with smaller minimal distance than that of n
Example
• dn(C) = dn(D) + dn(E)
• thdn(C) = max(thdn(C), dn(C) + 1) = max(thdn(C), dn(D) + dn(E) + 1)
[Figure: DAG with A (md=0), B (md=1), C (md=2), D (md=3), E (md=1). Legend: OR node, AND node]
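The threshold rule of the example can be written as a small helper. This is a sketch under the slide's wording only: the function name and arguments are invented, and the dn values are illustrative.

```python
def raise_threshold(threshold, value, node_md, child_mds):
    """Return the threshold to use at a node.

    If some child has a smaller minimal distance than the node itself,
    the threshold is raised so that it exceeds the node's current
    (dis)proof number `value`; otherwise it is returned unchanged.
    """
    if any(md < node_md for md in child_mds):
        return max(threshold, value + 1)
    return threshold


# Node C (md=2) with children D (md=3) and E (md=1); illustrative numbers.
dn_D, dn_E = 2, 3
dn_C = dn_D + dn_E                   # computed in the standard way
print(raise_threshold(4, dn_C, node_md=2, child_mds=[3, 1]))  # 6
print(raise_threshold(4, dn_C, node_md=2, child_mds=[3, 3]))  # 4
```

Because the (dis)proof numbers themselves are untouched, the underestimation of DFPN(r) does not arise. The raised threshold only guarantees that the search continues below the node instead of returning at once.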
Overestimation Problem

• Classical problem pointed out by many researchers [Allis:94]
• Occurs more frequently in TCA
Example
• True dn(A) must be dn(D) + dn(E)
• dn(A) = dn(B) + dn(E) = dn(C) + dn(E) = dn(D) + dn(E) + dn(E) = dn(D) + 2dn(E)
[Figure: DAG with A (md=0), B (md=1), C (md=2), D (md=3), E (md=1)]
Can We Just Ignore Overestimation Problem?

No, we can’t
• Many tsume-shogi problems with very long solutions involve DAGs
• (Dis)proof numbers sometimes blow up
• Solving the overestimation problem completely is NP-hard
Source Node Detection Algorithm (SNDA) [Kishimoto:AAAI2010]

• Detect a source of DAG
• Use max instead of sum for moves that may cause overestimation
Example
dn(A) = max(dn(B), dn(E)) + dn(G)
      = max(dn(C), dn(E)) + dn(G)
      = max(dn(D) + dn(E), dn(E)) + dn(G)
      = dn(D) + dn(E) + dn(G)
[Figure: DAG with nodes A, B, C, D, E, G annotated with md values from 0 to 3. Legend: OR node, AND node]
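The effect of replacing sum by max can be checked numerically. The sketch below covers only the combination step: the groups of children that share a DAG source are supplied by hand, whereas detecting them is the actual job of SNDA and is not shown. Function names and dn values are illustrative.

```python
def dn_sum(child_dns):
    """Standard summation over all children."""
    return sum(child_dns.values())


def dn_with_shared_groups(child_dns, groups):
    """Take max within each group of children sharing a DAG source,
    then sum the group maxima and the remaining children."""
    grouped = set().union(*groups)
    total = sum(max(child_dns[c] for c in group) for group in groups)
    total += sum(dn for c, dn in child_dns.items() if c not in grouped)
    return total


# Illustrative leaf values: dn(D)=2, dn(E)=3, dn(G)=1.
dn_D, dn_E, dn_G = 2, 3, 1
dn_B = dn_D + dn_E                      # dn(B) = dn(C) = dn(D) + dn(E)
children_of_A = {"B": dn_B, "E": dn_E, "G": dn_G}

print(dn_sum(children_of_A))                               # 9
print(dn_with_shared_groups(children_of_A, [{"B", "E"}]))  # 6
```

The plain sum counts dn(E) twice, while the grouped version returns dn(D) + dn(E) + dn(G), as in the derivation on the slide.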
Other Techniques

Heuristic initialization & non-uniform threshold control [Kishimoto & Mueller:AAAI2005]
– Use evaluation functions to initialize proof and disproof numbers at leaf nodes

Garbage collection scheme
– Discard 70% of entries whose sub-tree sizes are small when TT entries are used up [Nagai:1999,2002]

Three-ply depth-first search + forward pruning
• Proof pieces effectively reusing proven results with dominance relations [Seo:1999]
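The garbage collection item can be illustrated with a few lines. This is a naive sketch that assumes the table is a Python dict and that each entry records the size of the subtree searched below it under a hypothetical `subtree_size` key; the cited scheme may select entries differently.

```python
def collect_garbage(tt, percent=70):
    """Delete the `percent` of entries with the smallest subtree sizes.

    Modifies `tt` in place and returns the number of discarded entries.
    """
    n_discard = len(tt) * percent // 100
    victims = sorted(tt, key=lambda key: tt[key]["subtree_size"])[:n_discard]
    for key in victims:
        del tt[key]
    return n_discard


table = {f"pos{i}": {"pn": 1, "dn": 1, "subtree_size": i} for i in range(1, 11)}
print(collect_garbage(table))  # 7
print(sorted(entry["subtree_size"] for entry in table.values()))  # [8, 9, 10]
```

Keeping the entries with large subtrees preserves the results that would be most expensive to recompute.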
Experimental Results (1/3)

Conditions
– 2.66GHz Xeon PC
– 2GB transposition table
– 50,000 seconds per instance

Test suite
– 78 notoriously difficult tsume-shogi instances
– Solution length >= 300 ply
Experimental Results (2/3)
Algorithm          # Unsolved
DFPN(r)            14
DFPN(r)+SNDA       20
DFPN+TCA           8
DFPN+TCA+NAGAI     4
DFPN+TCA+WPNS      1
DFPN+TCA+SNDA      1

NAGAI: Approach dealing with overestimation [Nagai:2002]
WPNS: Approach dealing with overestimation [Ueda et al.:2008]
Experimental Results (3/3)
Execution Time for Solving Hardest Instances

Instance         TCA+WPNS (s)   TCA+SNDA (s)   Solution Length (ply)
Meta-Shinsekai   45,356         7,298          941
Sekitoba         18,545         7,241          525
Megalopolis      48,907         13,590         515

Positions unsolved previously by any other solver
– Journey to Jupiter: 411 ply, 5,480 seconds
– Megalopolis: 515 ply, 13,590 seconds
– Atlantis: 951 ply, 210 seconds
Asuka (703 Ply to Mate) – Notoriously Difficult Position

Solved by one of the development versions leveraging TCA and SNDA
• Node expansions: 83,990,970,117
• 6.6 days to solve
Conclusions and Future Work
Conclusions
• Presented DFPN and techniques for dealing with issues related to repetitions, used to develop a strong tsume-shogi solver
Future Work
• Prove completeness of DFPN + TCA
– DFPN is complete if search space is DAG [Kishimoto & Mueller:CG2008]
– Open question if search space is DCG
• Experiments in other domains
– Tsume-Go [Kishimoto & Mueller:AAAI2005]