Game playing

Types of games
                         Deterministic    Chance
Perfect information      Chess            Backgammon
Imperfect information                     Bridge
Here we consider Games with Perfect Information
Example - Tic-Tac-Toe
[Figure: partial game tree for tic-tac-toe. Board positions for X and O are annotated with values including 10, 0, -1 and -10.]
Minimax search
[Figure: game tree for minimax search. Legend: MAX node, MIN node, f value, value computed by minimax.]
Minimax algorithm
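As a minimal sketch of the minimax recursion, assuming a game tree stored as nested Python lists (a number is a leaf holding its f value, a list is an interior node). The tree `example_tree` and the function name `minimax` are illustrative and not taken from the slides.

```python
from typing import Union

# A game tree as nested lists: a number is a leaf (its f value),
# a list holds the children of an interior node.
Tree = Union[int, float, list]

def minimax(node: Tree, is_max: bool) -> float:
    """Return the minimax value of `node`; `is_max` says whose turn it is."""
    if not isinstance(node, list):      # leaf: return its static value
        return node
    # Levels alternate, so the children belong to the other player.
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Hypothetical two-ply tree: MAX at the root, three MIN nodes below it.
example_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = minimax(example_tree, is_max=True)
```

In this example the three MIN nodes take the values 3, 2 and 2, so MAX picks the first move.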
Properties of minimax
(b = branching factor, m = maximum depth of the tree)
Complete? Yes (if the tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for "reasonable" games
⇒ exact solution completely infeasible
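As a rough estimate of the scale involved, treating b and m as exact, the number of nodes in a tree with branching factor b and depth m grows as

\[
b^{m} \approx 35^{100} = 10^{100 \log_{10} 35} \approx 10^{154}.
\]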
Alpha-beta pruning

We can improve on the performance of the minimax
algorithm through alpha-beta pruning.
[Figure: small game tree with node values 2, 2, 1, 7, 1 and one node marked "?".]
• We don’t need to compute the value at the node marked "?".
• No matter what it is, it can’t affect the value of the root node.
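The argument can be written as a one-line check, assuming (as the figure suggests) a MAX root whose first MIN child has value 2 and whose second MIN child has a first leaf of 1 and an unexplored leaf x:

\[
\max\bigl(2, \min(1, x)\bigr) = 2 \quad \text{for every } x, \text{ since } \min(1, x) \le 1 < 2 .
\]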
Why is it called α-β?
• α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max
• If v is worse than α, max will avoid it
  ⇒ prune that branch
• Define β similarly for min
The Alpha-Beta Procedure
There are two rules for terminating
search:
– Search can be stopped below any MIN
node having a beta value less than or
equal to the alpha value of any of its
MAX ancestors.
– Search can be stopped below any MAX
node having an alpha value greater
than or equal to the beta value of any
of its MIN ancestors.
The α-β algorithm
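The two termination rules can be sketched in Python as follows. This is a minimal version under stated assumptions, not the pseudocode from the slides: the tree is a nested list (numbers are leaves), and the `visited` parameter is an illustrative addition that records which leaves were actually evaluated.

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta cutoffs.

    `visited`, if given, collects the leaves that were actually evaluated.
    """
    if not isinstance(node, list):          # leaf
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:               # a MIN ancestor will never allow this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:                   # a MAX ancestor already has better
            break
    return value

# Hypothetical two-ply tree: MAX at the root, three MIN nodes below it.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
root_value = alphabeta(tree, True, visited=seen)
```

On this tree the search returns the same root value as plain minimax but skips the leaves 4 and 6: once the second MIN node has seen the leaf 2, its beta value is below the root's alpha value of 3.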
Properties of α-β
• Pruning does not affect the final result
• Good move ordering improves the effectiveness of pruning
• With "perfect ordering," time complexity = O(b^(m/2))
  ⇒ doubles the depth that can be searched in the same time
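One way to read the perfect-ordering bound, using the chess figure b ≈ 35 as an illustration, is as a reduced effective branching factor:

\[
b^{m/2} = \bigl(\sqrt{b}\bigr)^{m}, \qquad \sqrt{35} \approx 5.9 .
\]

Under that assumption the search behaves as if each position had about 6 moves instead of 35.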
The Alpha-Beta Procedure
Example:
[Figure sequence: step-by-step alpha-beta search of a game tree with alternating max, min, max and min levels. Leaf values appear from left to right as 4, 5, 3, 1, then 8, 6, 7, then 2, then 5, 4, 4, then 6, 7, 7. The value shown at the root changes from 3 to 4 by the time the search is done.]
Games that include chance
[Figure: game tree with chance nodes.]
• Possible moves: (5-10,5-11), (5-11,19-24), (5-10,10-16) and (5-11,11-16)
• Doubles [1,1] through [6,6] each have chance 1/36; all other rolls have chance 1/18
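The dice probabilities can be checked by enumerating the distinct rolls of two dice. This is a small sketch; the names `rolls` and `roll_probability` are illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Each unordered roll [a, b] with a <= b is one outcome of a chance node.
rolls = list(combinations_with_replacement(range(1, 7), 2))

def roll_probability(a: int, b: int) -> Fraction:
    """Doubles arise from one ordered outcome out of 36, other rolls from two."""
    return Fraction(1, 36) if a == b else Fraction(2, 36)

probabilities = {roll: roll_probability(*roll) for roll in rolls}
```

There are 21 distinct rolls (6 doubles and 15 others), and their probabilities sum to 1.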

Cannot calculate definite minimax value, only expected value
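Expected minimax value
\[
\text{EXPECTED-MINIMAX-VALUE}(n) =
\begin{cases}
\text{UTILITY}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \text{successors}(n)} \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a max node} \\
\min_{s \in \text{successors}(n)} \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a min node} \\
\sum_{s \in \text{successors}(n)} P(s) \cdot \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a chance node}
\end{cases}
\]
These equations can be backed up recursively all the way to the root of the game tree.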
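The recursive back-up can be sketched directly from the four cases. The node encoding below (a number for a terminal, otherwise a `(kind, children)` pair) and the example tree are assumptions made for illustration.

```python
def expectiminimax(node):
    """Expected minimax value of a node.

    A node is a number (terminal utility) or a pair (kind, children) where
    kind is "max", "min" or "chance". Children of a chance node are
    (probability, child) pairs.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")

# Hypothetical tree: a max root over two chance nodes with 50/50 branches.
tree = ("max", [
    ("chance", [(0.5, ("min", [3, 4])), (0.5, ("min", [-1, 2]))]),
    ("chance", [(0.5, ("min", [5, -4])), (0.5, ("min", [1, 0]))]),
])
value = expectiminimax(tree)
```

Here the first chance node is worth 0.5 * 3 + 0.5 * (-1) = 1.0 and the second 0.5 * (-4) + 0.5 * 0 = -2.0, so the root takes the value 1.0.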
Position evaluation with chance nodes
[Figure: two game trees with chance nodes. Left: A1 wins. Right: A2 wins.]
The move chosen may change when evaluation values are scaled differently, even if their order is preserved.
Behavior is preserved only by a positive linear transformation of EVAL.
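A small numeric sketch of this effect follows. The probabilities and leaf values are illustrative assumptions, not read from the slide's figure; `best_move` and `rescale` are hypothetical helper names.

```python
def best_move(moves):
    """Return the move whose chance node has the highest expected value.

    `moves` maps a move name to a list of (probability, evaluation) pairs.
    """
    expected = {name: sum(p * v for p, v in outcomes)
                for name, outcomes in moves.items()}
    return max(expected, key=expected.get)

def rescale(moves, f):
    """Apply f to every leaf evaluation, keeping the probabilities."""
    return {name: [(p, f(v)) for p, v in outcomes]
            for name, outcomes in moves.items()}

# Assumed example: each move leads to a chance node with two outcomes.
moves = {"A1": [(0.9, 2), (0.1, 3)], "A2": [(0.9, 1), (0.1, 4)]}

original = best_move(moves)
# Positive linear transformation: the choice is unchanged.
linear = best_move(rescale(moves, lambda v: 2 * v + 3))
# Order-preserving but non-linear rescaling: the choice flips.
monotone = best_move(rescale(moves, {1: 1, 2: 20, 3: 30, 4: 400}.get))
```

With the original values A1 has the higher expected value (2.1 against 1.3). After the non-linear rescaling A2 does (40.9 against 21), although the ordering of the leaf values is the same.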
Alpha-beta pruning in games with chance

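We can improve on the performance of the minimax algorithm for games with chance through alpha-beta pruning, provided the evaluation function is bounded.
Suppose -5 ≤ e(n) ≤ 5
[Figure: game tree whose chance branches each have probability 0.5. Leaf values 3, 4, -1, 5 and -4 are shown and three leaves are marked "?". Node annotations include 1, 0.5 and 2.5.]
Expected value at the chance node already searched: 3 * 0.5 + (-1) * 0.5 = 1.0
• We don’t need to compute the value at the nodes marked "?".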
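The bound on e(n) is what makes pruning possible at a chance node: unsearched children can be replaced by the largest value they could take. The sketch below assumes a second chance node whose first child is worth -4 and whose other child (probability 0.5) has not been searched; this reading of the figure is an assumption, and `chance_upper_bound` is a hypothetical helper name.

```python
def chance_upper_bound(evaluated, remaining_probability, hi):
    """Upper bound on a chance node's expected value.

    `evaluated` holds (probability, value) pairs for children already
    searched; every unsearched child is assumed to be worth at most `hi`.
    """
    return sum(p * v for p, v in evaluated) + remaining_probability * hi

HI = 5                          # assumed upper bound on e(n)
alpha = 3 * 0.5 + (-1) * 0.5    # value of the chance node already searched

# Assumed second chance node: one child searched and worth -4,
# the other child (probability 0.5) not yet searched.
bound = chance_upper_bound([(0.5, -4)], remaining_probability=0.5, hi=HI)
can_prune = bound <= alpha
```

Under these assumptions the second chance node can be worth at most 0.5 * (-4) + 0.5 * 5 = 0.5, which is below the 1.0 that max already has, so its remaining leaves need not be evaluated.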