Depth-First Proof-Number Search in the Presence of Repetitions
Akihiro Kishimoto
Tokyo Institute of Technology, Japan

Outline of this Talk
Background
Depth-First Proof-Number Search (DFPN)
DFPN Dealing with Repetitions
Experimental Results in Tsume-Shogi (Checkmating Problem in Japanese Chess)
Conclusions and Future Work

AI and Games
Why computer games?
  – Ideal test bed for search algorithms
      Simple rules
      Computationally intensive
      Need real-time response
  – A lot of applications
      Bioinformatics, theorem-proving systems
  – Strong and clear motivation
      Develop algorithms outperforming best humans

Why Computer Shogi?
Deep Blue won against Garry Kasparov in 1997
Shogi (Japanese chess) is next target
  – Popular game in Japan
      15 million players and 150-200 professional players
      Annual computer shogi championship with over 40 participants
  – Harder game for computers to play well
      Chess: 10^40 possible positions (branching factor 35)
      Shogi: 10^70 positions (branching factor 80-100)

Tsume-Shogi (Checkmating Problem) in Computer Shogi
Problem of finding mate for opponent king
Restricted to play moves checking king/escaping from checks
  – Branching factor 5 on average
Ideal domain to investigate ideas on AND/OR tree search
  – Important component used in strong programs
  – Many hard problems created by humans over 400 years
      E.g.
      Microcosmos: 1,525 ply (= depth) to mate

Solving Tsume-Shogi by AND/OR Tree Search
1st player needs one move leading to win (OR node)
2nd player verifies all the moves leading to loss (AND node)
Expand search tree until finding a solution
[Figure: AND/OR tree with root node A and nodes B to M; root, interior, leaf, and terminal nodes are labeled, with node values W (win), L (loss), and U (unknown)]

Importance for Enhancements
Search space O(b^d)
  – b: branching factor
  – d: search depth
Proof tree O(b^(d/2))
  – Tree that proves a proven win/loss
Trade off: speed versus search space reduction
[Figure: example proof tree inside the AND/OR tree rooted at A]

How to Improve Search Efficiency?
Game independent techniques
  – Applicable to AND/OR tree search in general
Game dependent techniques
  – Applicable to that game only
High-performance solvers usually use both
This talk focuses only on shogi independent techniques

Depth-First Proof-Number Search (DFPN) [Nagai:2002]
One of the most successful AND/OR tree search algorithms adapted to many domains
  – Checkers, shogi (Japanese chess), Go etc
Use proof and disproof numbers [Allis:94]
  – Proof number: minimal number of leaf nodes to prove a win
  – Disproof number: minimal number of leaf nodes to prove a loss
Select most promising node based on proof and disproof numbers in current search tree
Expand a leaf node selected by the above strategy and re-compute proof and disproof numbers

Proof Numbers
Example
  pn(B) = pn(D) + pn(E) + pn(F) = 1 + 1 + 1 = 3
  pn(C) = pn(G) = 1
  pn(A) = min(pn(B), pn(C)) = min(3, 1) = 1
[Figure: tree with root A (pn 1), children B (pn 3) and C (pn 1), and leaves D, E, F under B and G under C (pn 1 each); OR and AND nodes are marked]

Disproof Numbers
Example
  dn(B) = dn(D) + dn(E) + dn(F) = 1 + 1 + 1 = 3
  dn(C) = dn(G) = 1
  dn(A) = min(dn(B), dn(C)) = min(3, 1) = 1
[Figure: tree with root A (dn 1), children B (dn 3) and C (dn 1), and leaves D, E, F under B and G under C (dn 1 each); OR and AND nodes are marked]

Depth-First Proof-Number Search (DFPN) (Cont’d)
Use two thresholds of proof and disproof numbers
  – Reformulate proof-number search (PNS) [Allis:94] into depth-first search
  – Equivalent to PNS in its best-first search
      behavior
  – Smaller memory requirement and fewer re-expansions of interior nodes
Leverage transposition table (TT)
  – Cache proof and disproof numbers of expanded nodes to reduce re-expansion
  – Also reuse search results for transposed positions by ignoring the path taken to reach them
  – Constructed as hash table

Behavior of DFPN
[Figure: example DFPN trace on a tree with nodes A to J (OR and AND nodes), annotated with pn, dn and the thresholds thpn, thdn. The root A starts with pn(A)=1, dn(A)=2, thpn(A)=INF, thdn(A)=INF; the search returns from a node once a threshold is reached, e.g. pn(B)=3 >= thpn(B)=2 and dn(G)=2 >= thdn(G)=2.]

Transposition Table – Crucial Enhancement to DFPN
Use properties of DAG
Cache previous search efforts
[Figure: example DAG with nodes A, B, C, D (OR and AND nodes); D is stored in the transposition table as a win, so there is no duplicate search]

DFPN in the Presence of Repetitions
DFPN is equivalent to PNS
  – This property holds only if search space is a tree
Search space of many games is directed cyclic graph (DCG) involving repetitions
DFPN has several issues with repetitions
  – Correctness
  – Incompleteness
  – Inefficiency

Graph-History Interaction (GHI) Problem [Palay:83]
Assume that move leading to previous position is illegal
Transposition table ignores history
  – May contain incorrect results
Example (graph with nodes A, B, C, D; OR and AND nodes): is D a win or a loss?
  ABD(B): Win
  ACDB(D): Loss
Result at D is dependent on path

Solution to GHI Problem [Kishimoto & Mueller:AAAI2004]
Prepare encoded position and encoded path to transposition table entry
Reuse proof and disproof numbers for unproven node
Save win/loss via path if repetitions are involved
Save win/loss with no condition if repetitions are not involved
Example (graph with nodes A, B, C, D)
  D via ABD: Win
  D via ACD: Loss

Infinite Loop in DF-PN [Kishimoto & Mueller:2003,2008]
Search space of many games is DCG
No new leaf node is expanded
(Dis)proof numbers are over-counted due to repetitions
  dn(O) = dn(I) + dn(P) >= thdn(O)
  dn(N) = dn(O) >= thdn(N)
[Figure: cyclic graph with nodes A to P]

DFPN(r) Algorithm [Kishimoto & Mueller:ACG2003]
Keep the minimal distance from root
Modify computation of pn & dn
Example
  dn(O) = dn(P) if P is unproven
  dn(O) = dn(P) + dn(I) in standard computation of disproof number
[Figure: graph with nodes A to P, annotated with minimal distances from the root, md=0 to md=5]

Underestimation Problem in DFPN(r)
Underestimating (dis)proof numbers for directed acyclic graph (DAG)
Example
  dn(C) must be dn(D) + dn(E)
  DFPN(r) computes dn(C) = dn(D)
[Figure: DAG with A (md=0), B (md=1), C (md=2), D (md=3), E (md=1); OR and AND nodes are marked]

Threshold Controlling Algorithm (TCA) [Kishimoto:AAAI2010]
Don’t change the way of computing (dis)proof numbers
Increase threshold if node n has child with smaller minimal distance than that of n
Example
  dn(C) = dn(D) + dn(E)
  thdn(C) = max(thdn(C), dn(C) + 1)
          = max(thdn(C), dn(D) + dn(E) + 1)
[Figure: DAG with A (md=0), B (md=1), C (md=2), D (md=3), E (md=1); OR and AND nodes are marked]

Overestimation Problem
Classical problem pointed out by many researchers [Allis:94]
Occurs more frequently in TCA
Example
  True dn(A) must be dn(D) + dn(E)
  dn(A) = dn(B) + dn(E)
        = dn(C) + dn(E)
        = dn(D) + dn(E) + dn(E)
        = dn(D) + 2dn(E)
[Figure: DAG with nodes A, B, C, D, E, annotated with minimal distances md=0 to md=3]

Can We Just Ignore Overestimation Problem?
No, we can’t
Many tsume-shogi problems with very long solutions involve DAGs
(Dis)proof numbers sometimes blow up
NP-hard to solve completely

Source Node Detection Algorithm (SNDA) [Kishimoto:AAAI2010]
Detect a source of DAG
Use max instead of sum for moves that may cause overestimation
Example
  dn(A) = max(dn(B), dn(E)) + dn(G)
        = max(dn(C), dn(E)) + dn(G)
        = max(dn(D) + dn(E), dn(E)) + dn(G)
        = dn(D) + dn(E) + dn(G)
[Figure: DAG with nodes A, B, C, D, E, G, annotated with minimal distances md=0 to md=3; OR and AND nodes are marked]

Other Techniques
Heuristic initialization & non-uniform threshold control [Kishimoto & Mueller:AAAI2005]
  – Use evaluation functions to initialize proof and disproof numbers at leaf nodes
Garbage collection scheme
  – Discard 70% of entries whose sub-tree sizes are small when TT entries are used up [Nagai:1999,2002]
Three-ply depth-first search + forward pruning
Proof pieces effectively reusing proven results with dominance relations [Seo:1999]

Experimental Results (1/3)
Conditions
  – 2.66GHz Xeon PC
  – 2GB transposition table
  – 50,000 seconds per instance
Test suite
  – 78 notoriously difficult tsume-shogi instances
  – Solution length >= 300 ply

Experimental Results (2/3)
Algorithm         # Unsolved
DFPN(r)           14
DFPN(r)+SNDA      20
DFPN+TCA           8
DFPN+TCA+NAGAI     4
DFPN+TCA+WPNS      1
DFPN+TCA+SNDA      1
NAGAI: Approach dealing with overestimation [Nagai:2002]
WPNS: Approach dealing with overestimation [Ueda et al.:2008]

Experimental Results (3/3)
Execution Time for Solving Hardest Instances (seconds)
Instance          TCA+WPNS   TCA+SNDA   Solution Length (ply)
Meta-Shinsekai      45,356      7,298   941
Sekitoba            18,545      7,241   525
Megalopolis         48,907     13,590   515
Positions unsolved previously by any other solver
  Journey to Jupiter: 411 ply, 5,480 seconds
  Megalopolis: 515 ply, 13,590 seconds
  Atlantis: 951 ply, 210 seconds

Asuka (703 Ply to Mate) – Notoriously Difficult Position
Solved by one of the development versions leveraging TCA and SNDA
Node expansions: 83,990,970,117
6.6 days to solve

Conclusions and Future Work
Conclusions
  Presented DFPN + dealing with issues related to
  repetitions to develop a strong tsume-shogi solver
Future Work
  Prove completeness of DFPN + TCA
    – DFPN is complete if search space is DAG [Kishimoto & Mueller:CG2008]
    – Open question if search space is DCG
  Experiments in other domains
    – Tsume-Go [Kishimoto & Mueller:AAAI2005]
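
Sketch: computing proof and disproof numbers
The proof and disproof number rules from the Proof Numbers and Disproof Numbers slides can be sketched in Python. This is a minimal illustration, not the solver's implementation. The names `Node`, `pn_dn`, and `example_tree` are hypothetical, and the leaf initialization (1/1 for unknown, 0/INF for win, INF/0 for loss) is the usual proof-number search convention assumed here rather than something stated on the slides. The tree is the one from the examples: A with children B and C, leaves D, E, F under B and G under C. With A as an OR node the sketch reproduces pn(A) = 1. The disproof formulas on the slide (sum at B, min at A) correspond to the dual case where A is an AND node.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

INF = float("inf")


@dataclass
class Node:
    """Hypothetical AND/OR tree node used only for this illustration."""
    kind: str                                  # "OR" or "AND"
    children: List["Node"] = field(default_factory=list)
    value: str = "unknown"                     # leaf status: "win", "loss", "unknown"


def pn_dn(node: Node) -> Tuple[float, float]:
    """Return (proof number, disproof number) of a node."""
    if not node.children:
        # Assumed leaf initialization (standard convention, not from the slides).
        if node.value == "win":
            return (0, INF)
        if node.value == "loss":
            return (INF, 0)
        return (1, 1)
    pairs = [pn_dn(child) for child in node.children]
    pns = [p for p, _ in pairs]
    dns = [d for _, d in pairs]
    if node.kind == "OR":
        # One winning child suffices; every child must lose to disprove.
        return (min(pns), sum(dns))
    # AND node: every child must win; one losing child suffices to disprove.
    return (sum(pns), min(dns))


def example_tree(root_kind: str) -> Node:
    """Build the slide tree: A -> B, C; B -> D, E, F; C -> G."""
    child_kind = "AND" if root_kind == "OR" else "OR"
    b = Node(child_kind, [Node(root_kind), Node(root_kind), Node(root_kind)])
    c = Node(child_kind, [Node(root_kind)])
    return Node(root_kind, [b, c])


or_root = example_tree("OR")     # A is an OR node, B and C are AND nodes
and_root = example_tree("AND")   # dual tree: A is an AND node

print(pn_dn(or_root))    # (1, 2): pn(A) = min(3, 1)
print(pn_dn(and_root))   # (2, 1): dn(A) = min(3, 1)
```

With A as an OR node, dn(A) = dn(B) + dn(C) = 2, which matches the root values pn(A)=1, dn(A)=2 shown at the start of the DFPN trace.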
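
Sketch: overestimation on a DAG
The double counting described on the Overestimation Problem slide can be reproduced on a five-node graph. The sketch below is only an illustration under stated assumptions. The node kinds are inferred from the slide's equations (dn is summed at A and C, so they are treated as OR nodes; B passes dn(C) through, so it is treated as an AND node with one child). The names `GRAPH`, `naive_dn`, and `disproof_leaves` are hypothetical, and every leaf is assumed to have dn = 1. The set-based count is a small-graph device for comparison, not SNDA and not a general solution, since the slides note that the problem is NP-hard to solve completely.

```python
from typing import Dict, FrozenSet, List, Tuple

# node name -> (kind, children); E is reachable from both A and C (a DAG).
GRAPH: Dict[str, Tuple[str, List[str]]] = {
    "A": ("OR", ["B", "E"]),
    "B": ("AND", ["C"]),
    "C": ("OR", ["D", "E"]),
    "D": ("LEAF", []),
    "E": ("LEAF", []),
}


def naive_dn(name: str) -> int:
    """Tree-style disproof number: sum at OR nodes, min at AND nodes."""
    kind, children = GRAPH[name]
    if kind == "LEAF":
        return 1  # assumed dn of an unknown leaf
    values = [naive_dn(child) for child in children]
    return sum(values) if kind == "OR" else min(values)


def disproof_leaves(name: str) -> FrozenSet[str]:
    """Distinct leaves needed for a disproof, counting shared leaves once.

    At AND nodes the smallest child set is picked greedily, which is exact
    for this tiny graph but not guaranteed to be optimal in general.
    """
    kind, children = GRAPH[name]
    if kind == "LEAF":
        return frozenset([name])
    sets = [disproof_leaves(child) for child in children]
    if kind == "OR":
        return frozenset().union(*sets)
    return min(sets, key=len)


print(naive_dn("A"))                 # 3, i.e. dn(D) + 2 dn(E)
print(len(disproof_leaves("A")))     # 2, i.e. dn(D) + dn(E)
```

The naive value counts leaf E twice, once through B and once directly from A, which is the dn(D) + 2dn(E) result on the slide.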