Slides

Combinatorial Games
Martin Müller
Contents
• Combinatorial game theory
• Thermographs
• Go and Amazons as combinatorial games
Combinatorial Games
• Basics
• Example: Domineering
• Simplifying games
• Sums of games
• Hot games
What is a Game?
• 2 players, Left and Right
• Set of positions, starting
position
• Moves defined by rules
• Alternating moves
• Player who cannot move loses
(no draws)
Conway's plan:
find the simplest
possible definition
Properties of Games
• Complete information
• Perfect information
• No random element
(no dice, coin throws, …)
Definition of a Game
G = { L1,…,Ln | R1,…,Rm }
• Move options of players
• Each move leads to a game
• Player who cannot move loses
• Example: Left can move to A, B or C; Right can move to D or E
{ A,B,C | D,E }
Creating Games
G = { L1,…,Ln | R1,…,Rm }
• Simplest possible game:
{|}
• Next step:
{{ | } | }
{ | { | }}
{{ | } | { | }}
• Continue...
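The construction can be continued mechanically. The sketch below is one possible way to enumerate game forms day by day. It assumes a game is stored as a pair of tuples (Left options, Right options); the names `subsets` and `next_day` are illustrative and not from the slides. It counts forms, not values: many of the forms it generates are equal as games.

```python
from itertools import chain, combinations

ZERO = ((), ())  # the simplest game { | }

def subsets(items):
    """All subsets of a list, each returned as a tuple."""
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def next_day(games):
    """All game forms whose Left and Right options are taken from `games`."""
    pool = sorted(games)  # fixed order, so each subset has exactly one tuple form
    return {(left, right) for left in subsets(pool) for right in subsets(pool)}

day0 = {ZERO}
day1 = next_day(day0)  # the four games listed on the slide
day2 = next_day(day1)
print(len(day0), len(day1), len(day2))  # prints 1 4 256
```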
Games and Numbers
• Insight: some games represent a number of
free moves for one player
0={|}
1={0|}
2={1|}
-1 = { | 0 }
-2 = { | -1}
Infinite Games
• Recursion: option
leads back to game
G
G = { A,B | C }
A = { |G }
The Domineering Game
[Figure: Domineering board with the moves of Right (R) and Left (L)]
Domineering Examples
Inverse Game
• Swap all Left and Right moves
• Compute inverse for all options recursively
G = { L1,…,Ln | R1,…,Rm }.
• Inverse:
-G = { -R1,…,-Rm | -L1,…,-Ln }
• Property of inverses:
-(-G) = G
Examples of Inverses
-(0) = -({ | }) = { | } = 0
-(1) = -({0 | }) = { | -0} = { | 0} = -1
-({0|0}) = {-0 | -0} = {0|0}
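The recursive definition of the inverse translates almost directly into code. This is a minimal sketch, assuming a game is a pair of tuples (Left options, Right options); the constant names are chosen here for illustration.

```python
ZERO = ((), ())            # 0 = { | }
ONE = ((ZERO,), ())        # 1 = {0 | }
MINUS_ONE = ((), (ZERO,))  # -1 = { | 0}
STAR = ((ZERO,), (ZERO,))  # {0 | 0}, often written *

def neg(g):
    """Inverse -G: swap the two sides and negate every option recursively."""
    left, right = g
    return (tuple(neg(r) for r in right), tuple(neg(l) for l in left))

print(neg(ONE) == MINUS_ONE, neg(STAR) == STAR)  # prints True True
```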
Domineering Example
• Inverse of domineering position:
rotate by 90°
[Figure: position G and its inverse -G, obtained by a 90° rotation]
Classification of Games
• G > 0: Left wins
• G < 0: Right wins
• G = 0: Second player wins
• G || 0: First player wins
Classification Examples
• 0 = { | }: first player loses
• {0|0}: first player wins
• { 0 | { 0 | 0 } }: Left always wins
• {{ 0 | 0 } | 0 }: Right always wins
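The four classes can be computed for small loopfree games with the rule that a player who cannot move loses. The following is a sketch under the assumption that a game is a pair of tuples (Left options, Right options); `wins_moving_first` and `outcome` are hypothetical helper names.

```python
from functools import lru_cache

ZERO = ((), ())

@lru_cache(maxsize=None)
def wins_moving_first(g, left_to_move):
    """True if the player to move wins; a player with no move loses."""
    options = g[0] if left_to_move else g[1]
    return any(not wins_moving_first(o, not left_to_move) for o in options)

def outcome(g):
    """Classify g into one of the four outcome classes."""
    left_first = wins_moving_first(g, True)
    right_first = wins_moving_first(g, False)
    if left_first and right_first:
        return "G || 0"  # first player wins
    if left_first:
        return "G > 0"   # Left wins whoever starts
    if right_first:
        return "G < 0"   # Right wins whoever starts
    return "G = 0"       # second player wins

STAR = ((ZERO,), (ZERO,))  # {0|0}
examples = [ZERO, STAR, ((ZERO,), (STAR,)), ((STAR,), (ZERO,))]
print([outcome(g) for g in examples])
```

Run on the four slide examples in order, this classifies them as G = 0, G || 0, G > 0 and G < 0.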
Comparing Games
• G > H if G - H > 0:
Left wins the difference game
• G < H if G - H < 0:
Right wins the difference game
• G = H if G - H = 0:
second player wins the difference game
• G || H if G - H || 0:
first player wins the difference game
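Comparison by playing the difference game can be sketched as follows. The code assumes games are pairs of tuples (Left options, Right options), builds G - H as G + (-H), and then checks who wins when each player starts. All function names are illustrative; this brute-force search is only practical for very small games.

```python
from functools import lru_cache

ZERO = ((), ())

def neg(g):
    """-G: swap sides and negate all options."""
    return (tuple(neg(r) for r in g[1]), tuple(neg(l) for l in g[0]))

@lru_cache(maxsize=None)
def add(g, h):
    """G + H: move in exactly one of the two components."""
    left = tuple(add(gl, h) for gl in g[0]) + tuple(add(g, hl) for hl in h[0])
    right = tuple(add(gr, h) for gr in g[1]) + tuple(add(g, hr) for hr in h[1])
    return (left, right)

@lru_cache(maxsize=None)
def wins_moving_first(g, left_to_move):
    options = g[0] if left_to_move else g[1]
    return any(not wins_moving_first(o, not left_to_move) for o in options)

def compare(g, h):
    """Return '>', '<', '=' or '||' according to the outcome of G - H."""
    d = add(g, neg(h))
    left_first = wins_moving_first(d, True)
    right_first = wins_moving_first(d, False)
    if left_first and right_first:
        return "||"
    if left_first:
        return ">"
    if right_first:
        return "<"
    return "="

ONE = ((ZERO,), ())
STAR = ((ZERO,), (ZERO,))
print(compare(ONE, ZERO), compare(STAR, ZERO), compare(STAR, STAR))  # prints > || =
```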
Canonical Form of Games
• Loopfree games have canonical form
• Two operations:
– Delete dominated options
– Reversing reversible options
• Apply as long as possible
• End result: unique canonical form
Deleting Dominated Options
• Example:
{2, -5, 6, 3 | -2, 6, 13, -8} = {6|-8}
• General problem: compare games
• Complete algorithm implemented in David
Wolfe's games package
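For the numeric example above, deleting dominated options reduces to keeping Left's largest and Right's smallest option. The sketch below is written against a generic comparison so that incomparable options would both be kept; it is an illustration only and not the algorithm of the games package mentioned on the slide. The name `undominated` and its parameters are assumptions.

```python
def undominated(options, at_least_as_good):
    """Drop every option that some other option is at least as good as."""
    kept = []
    for option in options:
        if any(at_least_as_good(k, option) for k in kept):
            continue  # dominated by (or equal to) an option already kept
        # the new option may in turn dominate options kept earlier
        kept = [k for k in kept if not at_least_as_good(option, k)] + [option]
    return kept

left = undominated([2, -5, 6, 3], lambda a, b: a >= b)     # Left prefers larger options
right = undominated([-2, 6, 13, -8], lambda a, b: a <= b)  # Right prefers smaller options
print(left, right)  # prints [6] [-8]
```

For options that are games instead of numbers, the comparison function would have to decide G ≥ H, for example by playing the difference game.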
Sums of Games
• Two games, G and H
• Choice: play either in G or in H
G+H = { G+H^L, G^L+H | G+H^R, G^R+H }
• Example (-5 has no Left option, 3 has no Right option):
-5+3 = { -5+3^L, (-5)^L+3 | -5+3^R, (-5)^R+3 }
= { -5+2 | -4+3 } = { -3 | -1 } = -2
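The sum rule can be checked on this example with a short sketch. It assumes games are pairs of tuples (Left options, Right options) and integers are built as chains of free moves; `integer`, `add` and `is_zero` are illustrative names. Equality with a number n is tested by checking that the second player wins the sum with -n.

```python
from functools import lru_cache

ZERO = ((), ())

def integer(n):
    """The integer n as a game: |n| free moves for Left (n > 0) or Right (n < 0)."""
    g = ZERO
    for _ in range(abs(n)):
        g = ((g,), ()) if n > 0 else ((), (g,))
    return g

@lru_cache(maxsize=None)
def add(g, h):
    """G + H: the player to move picks one component and moves there."""
    left = tuple(add(gl, h) for gl in g[0]) + tuple(add(g, hl) for hl in h[0])
    right = tuple(add(gr, h) for gr in g[1]) + tuple(add(g, hr) for hr in h[1])
    return (left, right)

@lru_cache(maxsize=None)
def wins_moving_first(g, left_to_move):
    options = g[0] if left_to_move else g[1]
    return any(not wins_moving_first(o, not left_to_move) for o in options)

def is_zero(g):
    """G = 0 exactly when the second player wins."""
    return not wins_moving_first(g, True) and not wins_moving_first(g, False)

s = add(integer(-5), integer(3))
print(len(s[0]), len(s[1]))        # one Left option (-5+2), one Right option (-4+3)
print(is_zero(add(s, integer(2))))  # True: -5+3 equals -2
```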
Sum of Domineering Positions
Fractions
• Example: {0|1} + {0|1} = 1
{-1,0|1} = {0|1} = 1/2
[Figure: number line with -1, 0 and 1 marked]
Hot Games
• First player gets extra moves
• Both are eager to play
• Example: {1|-1}
The 2x2 square is hot
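The claim about the 2x2 square can be checked by generating its game tree. The sketch below assumes the common convention that Left places vertical dominoes and Right places horizontal ones (the slide's figure is not available here), and represents a position as a frozenset of free cells. The function name `domineering` is illustrative.

```python
from functools import lru_cache

ZERO = ((), ())
ONE = ((ZERO,), ())
MINUS_ONE = ((), (ZERO,))

@lru_cache(maxsize=None)
def domineering(cells):
    """Game tree of a Domineering position given as a frozenset of free (row, col) cells."""
    left, right = set(), set()
    for (r, c) in cells:
        if (r + 1, c) in cells:  # Left: vertical domino (assumed convention)
            left.add(domineering(cells - {(r, c), (r + 1, c)}))
        if (r, c + 1) in cells:  # Right: horizontal domino
            right.add(domineering(cells - {(r, c), (r, c + 1)}))
    # duplicates are merged and options sorted so equal positions give equal tuples
    return (tuple(sorted(left)), tuple(sorted(right)))

square = frozenset((r, c) for r in range(2) for c in range(2))
print(domineering(square) == ((ONE,), (MINUS_ONE,)))  # True: the square is {1|-1}
```

Whoever moves first in the square gains one further free move, which is what makes it hot.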
Sums of Hot Games
• Can be much more complex than summands
• Example:
a = {1|-1}, b = {2|-2}, c = {3|-3}, d = {4|-4}
• Sums:
a+b = {{3|1}|{-1|-3}}
a+b+c = {{{6|4}|{2|0}}|{{0|-2}|{-4|-6}}}
a+b+c+d = {{{{10|8}|{6|4}}|{{4|2}|{0|-2}}}
|{{{2|0}|{-2|-4}}|{{-4|-6}|{-8|-10}}}}
Mean
• Mean m
• Average outcome
• Means add
Examples:
m(4|-4) = 0
m(6|-4) = 1
m(4|{-4|-10}) = -3/2
m(4|{-4|-20}) = -4
Theorem: m(a+b) = m(a) + m(b)
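The first two examples are simple switches {x|y} between two numbers with x ≥ y. For such a game the mean is the midpoint of the two numbers and the temperature is half their distance. A minimal sketch (the function name is illustrative; deeper games such as 4|{-4|-10} need a full thermograph):

```python
from fractions import Fraction

def switch_mean_and_temperature(x, y):
    """Mean and temperature of the switch {x|y} for numbers x >= y."""
    if x < y:
        raise ValueError("{x|y} with x < y is a number, not a switch")
    x, y = Fraction(x), Fraction(y)
    return (x + y) / 2, (x - y) / 2

print(switch_mean_and_temperature(4, -4))  # mean 0, temperature 4
print(switch_mean_and_temperature(6, -4))  # mean 1, temperature 5
```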
Temperature
• Measures urgency of move
• Sum does not become hotter:
temp(a+b) ≤ max(temp(a), temp(b))
Examples:
temp(4|-4) = 4
temp(4|{-4|-10}) = 11/2
temp(4|{-4|-20}) = 8
temp(4|{-4|-100}) = 8
Example
• a = 4|-4, b = 5|-5, c = 5|{-4|-6}
• temp(a) = 4, temp(b) = 5, temp(c) = 5
• temp(a + b) = 5
• temp(b + c) = 1
• temp(b + b) = 0
Leftscore and Rightscore
• Also called LeftStop and RightStop
• Minimax value of the game if Left (Right) plays
first
• Assumption: play stops in numbers
• Base points of thermograph (see next
slides)
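The two stops are a plain minimax over the game tree. The sketch below assumes that leaves are numbers, that every other position is a pair of tuples (Left options, Right options) with at least one option per side, and that play stops as soon as a number is reached. Function names are illustrative.

```python
def is_number(g):
    return not isinstance(g, tuple)

def left_stop(g):
    """Minimax value when Left moves first; play stops at numbers."""
    return g if is_number(g) else max(right_stop(gl) for gl in g[0])

def right_stop(g):
    """Minimax value when Right moves first; play stops at numbers."""
    return g if is_number(g) else min(left_stop(gr) for gr in g[1])

gote = ((4,), (0,))                    # 4|0
one_sided = ((((22,), (4,)),), (0,))   # 22|4||0
print(left_stop(gote), right_stop(gote))            # prints 4 0
print(left_stop(one_sided), right_stop(one_sided))  # prints 4 0
```

Both games have the same stops, although their thermographs differ.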
Thermograph
[Figure: thermograph plotting temperature t against score, showing the left scaffold, the right scaffold and the mean]
Thermograph (TG)
• Consists of left and right scaffold
• May coincide in a mast
• Leaf nodes: the TG of a number is a mast
• Constructed from TG of followers
– Tax right scaffold of left follower by t
– Tax left scaffold of right follower by -t
– Compute max (min) over all left (right) followers
– Cut off above the intersection of the left and right scaffolds, add mast
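The construction steps can be approximated numerically. The sketch below does not build the scaffolds as exact piecewise-linear curves; instead it evaluates them at a given tax t and finds the lowest t at which they meet by bisection, which is a swapped-in shortcut. It assumes leaves are numbers with temperature 0, that games are pairs of tuples (Left options, Right options), and that every non-leaf position has left stop ≥ right stop. All names are illustrative.

```python
from functools import lru_cache

def is_number(g):
    return not isinstance(g, tuple)

def scaffolds(g, t):
    """Left and right scaffold of a non-number g at tax t, before adding the mast."""
    lefts, rights = g
    left = max(walls(gl, t)[1] for gl in lefts) - t    # right walls of Left followers, taxed by t
    right = min(walls(gr, t)[0] for gr in rights) + t  # left walls of Right followers, taxed by -t
    return left, right

def walls(g, t):
    """Thermograph of g at tax t: scaffolds below temp(g), the mast at or above it."""
    if is_number(g):
        return g, g
    mean, temp = mean_and_temperature(g)
    return (mean, mean) if t >= temp else scaffolds(g, t)

@lru_cache(maxsize=None)
def mean_and_temperature(g):
    """Approximate (mean, temperature): lowest tax at which the scaffolds meet."""
    if is_number(g):
        return float(g), 0.0
    left, right = scaffolds(g, 0.0)
    if left < right:
        raise ValueError("sketch only handles left stop >= right stop")
    # Above every follower's temperature the gap closes at slope 2,
    # which yields a tax that is certainly high enough.
    t0 = max(mean_and_temperature(o)[1] for o in g[0] + g[1])
    left, right = scaffolds(g, t0)
    lo, hi = 0.0, t0 + max(left - right, 0.0) / 2 + 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        left, right = scaffolds(g, mid)
        if left <= right:
            hi = mid
        else:
            lo = mid
    left, right = scaffolds(g, hi)
    return (left + right) / 2, hi

print(mean_and_temperature(((4,), (((-4,), (-10,)),))))  # about (-1.5, 5.5)
```

Results are floating-point approximations; an exact implementation would intersect the piecewise-linear scaffolds directly.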
Sente and Gote Thermographs
• Three examples
– Gote
– One-sided sente
– Double sente
• All examples: leftScore - rightScore = 4.
• Appear the same to a local minimax search
• But they are very different!
Gote
• Game: 4|0
• leftScore 4
• rightScore 0
• Mean: 2
• Temperature: 2
One-sided Sente
• Game: 22|4||0
• leftScore 4
• rightScore 0
• Mean: 4
• Temperature: 4
Double Sente
• Game: 12|3 || -1|-11.5
• leftScore 3
• rightScore -1
• Mean: 0.5
• Temperature: 7
Extensions (1)
• Sub-zero thermography
– Problem: hard to check when game is number
– extend TG to range [-1..0]
– “colored ground” rule for zugzwang-like games
– Can now construct TG from options in a uniform
way
– TG = makeTG(left-option-TGs,right-option-TGs)
Extensions (2)
• TG for games including loops
– Defined in Berlekamp’s “Economist’s View”
paper
– I did the first practical algorithm and
implementation
– Much more complex…
– Caves, hills, bent masts, backward masts,…
Some Wild Ko Thermographs
Stable and Unstable Positions
• Position H in game G is called stable if its
temperature is lower than that of all its ancestors
• H is unstable if it has an ancestor with
lower temperature
• H is semistable if it is not unstable and has an
ancestor of the same temperature
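The three definitions amount to comparing a position's temperature with the temperatures along the path from the root. A minimal sketch, assuming the temperatures are already known; the function name and argument layout are illustrative.

```python
def stability(temp, ancestor_temps):
    """Classify a position from its temperature and those of its ancestors."""
    if any(a < temp for a in ancestor_temps):
        return "unstable"    # some ancestor is cooler
    if any(a == temp for a in ancestor_temps):
        return "semistable"  # not unstable, but an ancestor has the same temperature
    return "stable"          # cooler than every ancestor (vacuously true at the root)

print(stability(2, [4, 3]), stability(3, [4, 2]), stability(4, [5, 4]))
# prints stable unstable semistable
```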
Subtree of Stable Followers
• Root of a game tree is stable by definition
• Find first stable node on each line of play
• Go on recursively
• This subtree of stable followers is a (very
good) small summary of the whole game
Mainlines and Sidelines
• Given G, play n copies of G optimally
• Let n go to infinity
• Some lines of play will be played more and
more often
– Mainlines
• Other lines played only finitely often
– Sidelines
Stable Followers in Mainlines
• Stable mainline gote position:
has two stable followers, one for each color
• Stable mainline one-sided sente position:
– Only stable follower of one color (sente)
• In a “rich environment” (e.g. coupon stack),
play follows mainlines.
Playing Sum Games
• Choose one subgame
• Choose move in that subgame
• Brute force algorithm:
– Compute sum
– Find move retaining minimax value
– Problem: computing sum is slow
Fast Approximate Methods
• Goal: identify good move without
computing sum
• Two parameters: mean and temperature
• Hottest games usually most urgent
• Refinement: Thermostrat
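The "hottest first" idea can be illustrated on a sum of simple switches. The sketch below is only that heuristic, not Thermostrat. It assumes every subgame is a switch {x|y} with numbers x ≥ y that is settled by a single move, so its temperature is (x - y)/2; the function name is hypothetical.

```python
def play_hottest_first(switches, left_starts=True):
    """Play a sum of switches (x, y), both players always moving in the hottest one.

    Returns the final score: Left's move in {x|y} yields x, Right's move yields y.
    """
    # sort by temperature (x - y) / 2, hottest first
    order = sorted(switches, key=lambda s: s[0] - s[1], reverse=True)
    score, left_to_move = 0, left_starts
    for x, y in order:
        score += x if left_to_move else y
        left_to_move = not left_to_move
    return score

games = [(1, -1), (2, -2), (3, -3), (4, -4)]  # a, b, c, d from the earlier slide
print(play_hottest_first(games))                     # prints 2
print(play_hottest_first(games, left_starts=False))  # prints -2
```

No sum is computed: each decision uses only the temperatures of the individual subgames.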