Monte-Carlo Tree Search (MCTS) for Computer Go

Yang-Qiao Meng
University of Toronto
Nov 17th, 2016
Outline
• Background of Go
• MCTS
• UCT
• RAVE
• Heuristic MC-RAVE
Acknowledgement
• Bruno Bouzy, Université Paris Descartes
• Adrien Couetoux, Martin Muller and Olivier Teytaud
The game of Go
• Originated in China, 4th century BC
• Spread to Korea and Japan, 5th-7th century CE
• 19x19 board
• 9x9 board for beginners
• State space: about 10^170 positions
• Rule: encirclement & occupation
Example
Monte Carlo Tree Search
Upper Confidence Tree (UCT)
• Each state is a multi-armed bandit
• Each action is a bandit arm
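The bandit view above can be sketched in a few lines. This is a minimal illustration that assumes the standard UCB1 rule; the names `ucb1_score` and `select_action`, the `stats` layout and the default constant `c` are illustrative choices, not taken from the slides.

```python
import math

def ucb1_score(mean_value, action_visits, state_visits, c=math.sqrt(2)):
    """UCB1 score of one arm: mean value plus an exploration bonus."""
    if action_visits == 0:
        return math.inf  # untried arms are explored first
    return mean_value + c * math.sqrt(math.log(state_visits) / action_visits)

def select_action(stats, c=math.sqrt(2)):
    """Pick the arm with the highest UCB1 score.

    stats maps action -> (total_reward, visits) for one state.
    """
    state_visits = sum(visits for _, visits in stats.values())

    def score(action):
        total, visits = stats[action]
        mean = total / visits if visits else 0.0
        return ucb1_score(mean, visits, state_visits, c)

    return max(stats, key=score)
```

With `c = 0` the choice is purely greedy; a larger `c` favours rarely tried moves.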
Example
Rapid Action Value Estimation (RAVE)
• All-moves-as-first (AMAF)
• A general value for each move, regardless of when it is played
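A rough sketch of the AMAF idea: after one simulation, every move the same player made later in the playout is credited as if it had been played first. The function name, the `(player, move)` playout format and the once-per-simulation counting are assumptions made for illustration.

```python
from collections import defaultdict

def amaf_update(amaf_stats, moves, first_index, outcome):
    """Credit every move the same player made at or after first_index.

    moves: list of (player, move) pairs in playout order.
    outcome: result from the viewpoint of the player to move at first_index.
    amaf_stats: maps move -> (total_outcome, visits).
    """
    player = moves[first_index][0]
    seen = set()
    for p, move in moves[first_index:]:
        if p == player and move not in seen:
            seen.add(move)  # count each move once per simulation
            total, visits = amaf_stats[move]
            amaf_stats[move] = (total + outcome, visits + 1)
    return amaf_stats

# Hypothetical playout: Black wins (outcome 1.0 for Black).
stats = defaultdict(lambda: (0.0, 0))
playout = [("B", "D4"), ("W", "C3"), ("B", "E5"), ("W", "D4"), ("B", "D4")]
amaf_update(stats, playout, 0, 1.0)
```

Only Black's moves D4 and E5 receive credit here; White's moves are ignored for Black's statistics.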
RAVE
MC-RAVE
• Weighted sum of the MC value and the AMAF value
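The weighted sum can be written as a one-line function. The weight name `beta` and the convention that `beta = 1` means "AMAF only" are assumptions for this sketch.

```python
def mc_rave_value(q_mc, q_amaf, beta):
    """Blend the Monte Carlo value and the AMAF value with weight beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return (1.0 - beta) * q_mc + beta * q_amaf
```

For example, `mc_rave_value(0.4, 0.8, 0.25)` leans mostly on the MC estimate and lands between the two inputs.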
UCT-RAVE
• MC + UCT + RAVE
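One common way to write this combination, stated here as an assumption about notation rather than the slide's own formula, adds the UCT exploration bonus to the blended MC-RAVE value. Q is the MC value, Q-tilde the AMAF value, beta the mixing weight, N the visit counts and c the exploration constant.

```latex
Q^{\oplus}(s,a) = Q^{\star}(s,a) + c\,\sqrt{\frac{\log N(s)}{N(s,a)}},
\qquad
Q^{\star}(s,a) = \bigl(1-\beta(s,a)\bigr)\,Q(s,a) + \beta(s,a)\,\tilde{Q}(s,a)
```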
Schedule
• Hand-selected schedule
• Minimum-MSE schedule
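Both schedules can be sketched as functions that return the weight beta. The formulas follow the forms usually given for MC-RAVE; the parameter names `k` and `bias`, their default values, and the fallback when no statistics exist are assumptions for illustration.

```python
import math

def beta_hand_selected(state_visits, k=1000):
    """beta = sqrt(k / (3 N(s) + k)); beta equals 1/2 when N(s) == k."""
    return math.sqrt(k / (3 * state_visits + k))

def beta_min_mse(n, n_amaf, bias=0.1):
    """beta = n~ / (n + n~ + 4 n n~ b^2), with b the assumed AMAF bias.

    n: MC visit count N(s, a); n_amaf: AMAF visit count.
    """
    denom = n + n_amaf + 4 * n * n_amaf * bias ** 2
    # Assumption: with no data at all, rely fully on the AMAF side.
    return n_amaf / denom if denom > 0 else 1.0
```

In both cases beta starts near 1 (trust AMAF) and decays toward 0 as real simulations accumulate.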
UCT vs MC-RAVE
Heuristic MC-RAVE
• Heuristic evaluation function: H(s,a)
• Heuristic confidence function: C(s,a)
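A minimal sketch of how the two functions might be used, assuming H(s,a) seeds the value of a new node and C(s,a) seeds its visit count as a number of "virtual" simulations. The names `init_node` and `mc_update` and the dictionary layout are hypothetical; the same seeding could be applied to the AMAF statistics.

```python
def init_node(state, legal_actions, H, C):
    """Seed each action's statistics with prior value H(s, a) and confidence C(s, a)."""
    return {a: {"Q": H(state, a), "N": C(state, a)} for a in legal_actions}

def mc_update(stats, action, outcome):
    """Incremental mean update; the prior behaves like N earlier simulations."""
    entry = stats[action]
    entry["N"] += 1
    entry["Q"] += (outcome - entry["Q"]) / entry["N"]

# Hypothetical prior: every move looks even (0.5), worth 9 virtual simulations.
node = init_node("root", ["D4", "Q16"], lambda s, a: 0.5, lambda s, a: 9)
mc_update(node, "D4", 1.0)
```

A high confidence makes the prior slow to overturn; a confidence of 0 lets the first real simulation replace it.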
Questions?