Machine Learning for Go
Jung-Yun Lo
Dept. of Computer Science and Information Engineering
National Dong Hwa University
Outline
• A survey of the application of machine
learning to the game of Go
• A learning architecture for the game of Go
A survey
• Some possible directions of research
– Global approaches
– Learning in search
– Learning in the endgame
– Learning in the opening
• The representation language
Global approaches
• Learning a function from a board position
and a move to a reward
• On large boards, more specific methods are
probably needed for the different
subproblems of the game
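A minimal sketch of what "a function from a board position and a move to a reward" could look like. The linear model, the four hand-made features, and the names `features`, `predict`, and `sgd_step` are assumptions made for illustration; the slides do not specify a model.

```python
# Hypothetical linear model: reward ~ w . features(board, move, color)

def features(board, move, color):
    """Toy features of a (position, move) pair: bias plus neighbour counts."""
    size = len(board)
    x, y = move
    neighbours = [
        (x + dx, y + dy)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if 0 <= x + dx < size and 0 <= y + dy < size
    ]
    own = sum(board[ny][nx] == color for nx, ny in neighbours)
    enemy = sum(board[ny][nx] not in (".", color) for nx, ny in neighbours)
    empty = len(neighbours) - own - enemy
    return [1.0, own, enemy, empty]


def predict(weights, feats):
    """Predicted reward for one (position, move) pair."""
    return sum(w * f for w, f in zip(weights, feats))


def sgd_step(weights, feats, reward, lr=0.05):
    """One gradient step on the squared error between prediction and reward."""
    error = reward - predict(weights, feats)
    return [w + lr * error * f for w, f in zip(weights, feats)]


board = ["...",
         "X.O",
         "..."]
feats = features(board, (1, 1), "X")
weights = sgd_step([0.0, 0.0, 0.0, 0.0], feats, reward=1.0)
```

Repeating `sgd_step` over many observed (position, move, reward) triples would fit the weights; a real system would need far richer features than these neighbour counts.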
Learning in search
• Candidate move ordering
• Leaf node static evaluation
• Temperature
– High → deeper search
– Low → stop search
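The three ideas on this slide (ordering candidate moves, evaluating leaf nodes statically, and letting temperature decide whether to search deeper) can be combined in one small search routine. The sketch below is an illustration under assumptions: the node fields `prior`, `temp`, and `value` stand in for outputs of learned models and are hypothetical, and the toy tree is made up.

```python
def search(node, threshold, alpha=float("-inf"), beta=float("inf")):
    """Negamax with alpha-beta; values are from the side to move at each node."""
    children = node.get("children", [])
    # Low temperature (a quiet position) or no moves: stop and evaluate statically.
    if not children or node["temp"] < threshold:
        return node["value"]
    best = float("-inf")
    # Try the most promising candidate moves first (learned move ordering).
    for child in sorted(children, key=lambda c: c["prior"], reverse=True):
        score = -search(child, threshold, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: remaining moves cannot change the result
    return best


# Made-up toy tree for illustration only.
root = {
    "temp": 2.0, "value": 0, "children": [
        {"prior": 0.9, "temp": 0.1, "value": -3,
         "children": [{"prior": 0.5, "value": 10}]},
        {"prior": 0.2, "temp": 0.0, "value": 1},
    ],
}
shallow = search(root, threshold=1.0)   # first child treated as quiet
deep = search(root, threshold=0.05)     # first child searched one ply deeper
```

With the higher threshold the first child is evaluated statically; lowering the threshold makes the same node "hot" enough to expand, which changes the result.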
Learning in the endgame
• Each local endgame position is evaluated,
then the whole game is considered as a
sum of games
– Decomposition search
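To make "sum of games" concrete: in a sum, the player to move picks one local game and moves only there. The sketch below solves small sums by brute force and is not decomposition search itself, which evaluates local games separately before combining them. It assumes the normal-play convention (a player who cannot move loses); the tuple encoding and the names `successors` and `friend_wins` are illustrative choices.

```python
from functools import lru_cache

# A local game is a pair (friend_options, opponent_options) of tuples of games.
ZERO = ((), ())                    # nobody can move
STAR = ((ZERO,), (ZERO,))          # either player can move to ZERO
ONE = ((ZERO,), ())                # only Friend has a move
MINUS_ONE = ((), (ZERO,))          # only Opponent has a move


def successors(position, friend_to_move):
    """All positions reachable by moving in exactly one local game."""
    side = 0 if friend_to_move else 1
    result = []
    for i, local in enumerate(position):
        for option in local[side]:
            result.append(position[:i] + (option,) + position[i + 1:])
    return result


@lru_cache(maxsize=None)
def friend_wins(position, friend_to_move):
    """True if Friend wins the sum (assumed rule: no move available = loss)."""
    nxt = successors(position, friend_to_move)
    if friend_to_move:
        return any(friend_wins(p, False) for p in nxt)
    return all(friend_wins(p, True) for p in nxt)
```

For example, `friend_wins((ONE, MINUS_ONE), True)` is `False`: the two local games cancel, so whoever moves first loses the sum.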
Learning in the opening
• The value of opening moves is hard to quantify
• Using joseki
– Depends on the surrounding situation
• Learning global rules for opening moves
The representation language
• Difficult to express more high-level
concepts such as liberty, atari, ladder and
eye
• Making the representation language more
expressive
The representation language
• block( BlockID, Color, Size, LibertyCount)
• board( X, Y, GroupID)
• adjacent( BlockID1, BlockID2)
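A rough sketch of how the three relations above could be derived from a raw board. It assumes a board given as rows of "X", "O", and "." characters, numbers blocks in scan order, and creates blocks only for stones, not for empty points; the function name `relational_facts` is hypothetical.

```python
def relational_facts(rows):
    """Return (block, board, adjacent) facts for a board given as strings."""
    height, width = len(rows), len(rows[0])

    def neighbours(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                yield nx, ny

    block_of = {}   # (x, y) -> block id
    blocks = []     # (BlockID, Color, Size, LibertyCount)
    for y in range(height):
        for x in range(width):
            color = rows[y][x]
            if color == "." or (x, y) in block_of:
                continue
            block_id = len(blocks)
            block_of[(x, y)] = block_id
            stack, stones, liberties = [(x, y)], 0, set()
            while stack:  # flood fill over same-coloured neighbours
                cx, cy = stack.pop()
                stones += 1
                for nx, ny in neighbours(cx, cy):
                    if rows[ny][nx] == ".":
                        liberties.add((nx, ny))
                    elif rows[ny][nx] == color and (nx, ny) not in block_of:
                        block_of[(nx, ny)] = block_id
                        stack.append((nx, ny))
            blocks.append((block_id, color, stones, len(liberties)))

    board = sorted((x, y, b) for (x, y), b in block_of.items())
    adjacent = set()
    for (x, y), b in block_of.items():
        for nx, ny in neighbours(x, y):
            other = block_of.get((nx, ny))
            if other is not None and other != b:
                adjacent.add((min(b, other), max(b, other)))
    return blocks, board, sorted(adjacent)


blocks, board, adjacent = relational_facts(["XX.",
                                            "O..",
                                            "..."])
```

Here the two X stones form block 0 with two liberties, the O stone forms block 1, and the two blocks are adjacent.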
The representation language
The Common Fate Graph (Enzenberger, 1996)
The representation language
Loss of information in the CFG
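A minimal sketch of building a common fate graph, assuming the usual construction: start from the grid graph of board points and merge neighbouring stones of the same colour into a single node, leaving each empty point as its own node. The board encoding ("X", "O", ".") and the function name `common_fate_graph` are assumptions for illustration.

```python
def common_fate_graph(rows):
    """Return (nodes, edges): nodes maps a representative point to its colour."""
    height, width = len(rows), len(rows[0])
    parent = {(x, y): (x, y) for y in range(height) for x in range(width)}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    links = []
    for y in range(height):
        for x in range(width):
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < width and ny < height:
                    links.append(((x, y), (nx, ny)))
                    same = rows[y][x] == rows[ny][nx]
                    if same and rows[y][x] != ".":
                        # Same-coloured neighbouring stones share one node.
                        parent[find((nx, ny))] = find((x, y))

    nodes = {find(p): rows[p[1]][p[0]] for p in parent}
    edges = {
        frozenset((find(a), find(b)))
        for a, b in links
        if find(a) != find(b)
    }
    return nodes, edges


nodes, edges = common_fate_graph(["XX.",
                                  "O..",
                                  "..."])
```

Each node keeps only a colour, so the number of stones in a block and the block's shape cannot be recovered from the graph, which is one way to see the information loss noted above.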
Conclusions
• Learning results are promising, but the
field is nearly unexplored and offers many
opportunities for research
A learning architecture for the
game of Go
• Combinatorial Game Theory
• The HUGO Architecture
• Three Components of HUGO
– Choice of Subgames
– Initiative Engine
– Computing Game Value
Combinatorial game theory
• G = {F|O}
• F : the set of options that player Friend can
reach with one legal move
• O : the set of options that player Opponent can
reach with one legal move
• F can take one of 2 possible values
– W : win for Friend
– L : loss for Friend
Combinatorial game theory
• 4 possible outcomes for a combinatorial game :
WW, WL, LL, and LW
• WW : won by Friend, irrespective of who moves
first
• WL : an unsettled game, won by the player who
moves first
• LL : lost for Friend even if Friend moves first
• LW : the player who moves first will lose the
game
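The four outcome classes can be computed mechanically for a small game G = {F|O}. The sketch below assumes the normal-play convention (a player with no legal move loses), which the slides do not state; the `Game` class and function names are illustrative. The first letter is the result for Friend when Friend moves first, the second when Opponent moves first.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Game:
    """G = {F|O}: options for Friend and for Opponent."""
    friend_options: Tuple["Game", ...] = ()
    opponent_options: Tuple["Game", ...] = ()


def friend_wins(game, friend_to_move):
    """True if Friend wins with best play (assumed rule: no move = loss)."""
    if friend_to_move:
        return any(friend_wins(g, False) for g in game.friend_options)
    return all(friend_wins(g, True) for g in game.opponent_options)


def outcome_class(game):
    """Return 'WW', 'WL', 'LL', or 'LW' from Friend's point of view."""
    first = "W" if friend_wins(game, True) else "L"
    second = "W" if friend_wins(game, False) else "L"
    return first + second


ZERO = Game()                                            # no moves for anyone
ONE = Game(friend_options=(ZERO,))                       # only Friend can move
MINUS_ONE = Game(opponent_options=(ZERO,))               # only Opponent can move
STAR = Game(friend_options=(ZERO,), opponent_options=(ZERO,))
```

Under this assumption the empty game `ZERO` is in class LW (whoever must move first loses), while `STAR` is in class WL (whoever moves first wins).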
The HUGO architecture
• Can be applied to any 2-player, deterministic,
perfect-information, partizan combinatorial game
3 components of HUGO
• Choice of subgames
– Select a collection of well-defined subgames
– Ensure high discriminative ability
• Initiative engine
– Find the move that yields the most points
– Prefer holding initiative
• Computing game values
– Compute the game-theoretic value of a particular game
Future work
• Study more references on machine
learning
References
• A Survey of the Application of Machine Learning to the Game of Go. Jan Ramon, Hendrik Blockeel / Katholieke Univ. Leuven
• A Learning Architecture for the Game of Go. A.B. Meijer, H. Koppelaar / Delft Univ. of Tech.
• Computer Go and Machine Learning. Thore Graepel
• http://bbs2.xilubbs.com/cgi-bin/bbs/view?forum=godknows&message=105