Two-Player Games

Iterative Deepening A*

Algorithm A* has memory demands that increase exponentially with the depth of the goal node (unless our estimates are perfect).

You may remember that we improved the space efficiency of the breadth-first search algorithm by applying iterative deepening.

Can we do a similar thing for Algorithm A*? Sure!

• In the first iteration, we determine a “cost cut-off” f’(n0) = g’(n0) + h’(n0) = h’(n0), where n0 is the start node.
• We expand nodes using the depth-first algorithm and backtrack whenever f’(n) for an expanded node n exceeds the cut-off value.
• If this search does not succeed, determine the lowest f’-value among the nodes that were visited but not expanded.
• Use this f’-value as the new cut-off value and do another depth-first search.
• Repeat this procedure until a goal node is found.
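The procedure above can be sketched in a few lines. This is an illustrative implementation, not code from the lecture; the interface (`successors` yielding (child, step-cost) pairs, `h` as the heuristic, `is_goal` as the goal test) is our own choice:

```python
from math import inf

def ida_star(start, h, successors, is_goal):
    """Iterative Deepening A*: repeated depth-first searches with an
    increasing f-cost cut-off (illustrative sketch)."""
    cutoff = h(start)                      # first cut-off: f'(n0) = h'(n0)
    while True:
        next_cutoff = inf                  # lowest f' seen beyond the cut-off
        stack = [(start, 0, [start])]      # depth-first search
        while stack:
            node, g, path = stack.pop()
            f = g + h(node)
            if f > cutoff:
                next_cutoff = min(next_cutoff, f)
                continue                   # backtrack: f'(n) exceeds cut-off
            if is_goal(node):
                return path
            for child, cost in successors(node):
                if child not in path:      # avoid cycles on the current path
                    stack.append((child, g + cost, path + [child]))
        if next_cutoff == inf:
            return None                    # no goal reachable at any cut-off
        cutoff = next_cutoff               # repeat with the new cut-off value
```

Because each iteration is a plain depth-first search, memory use stays linear in the depth of the goal node, at the price of re-expanding shallow nodes.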
February 23, 2016
Introduction to Artificial Intelligence
Lecture 9: Two-Player Games I
Two-Player Games with Complete Trees

Let us now investigate two-player games. We can use search algorithms to write “intelligent” programs that play games against a human opponent.

Just consider this extremely simple (and not very exciting) game:

• At the beginning of the game, there are seven coins on a table.
• Player 1 makes the first move, then player 2, then player 1 again, and so on.
• One move consists of removing 1, 2, or 3 coins.
• The player who removes all remaining coins wins.
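This is a classic subtraction game: a position with n coins is lost for the player to move exactly when n is a multiple of 4, because whatever k ∈ {1, 2, 3} that player removes, the opponent can answer with 4 − k. A brute-force check (our own sketch, not from the slides) confirms this:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def wins(n):
    """True if the player to move can force a win with n coins left."""
    if n == 0:
        return False       # the previous player took the last coin and won
    # A move is winning if it leaves the opponent in a losing position.
    return any(not wins(n - k) for k in (1, 2, 3) if k <= n)

# The player to move wins exactly when n is not a multiple of 4.
assert all(wins(n) == (n % 4 != 0) for n in range(40))
```

With seven coins (7 mod 4 = 3), the first player wins by taking three coins, which matches the conclusion drawn from the game tree below.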
Two-Player Games with Complete Trees

The computer wants to make decisions that guarantee its victory (in this simple game).

The underlying assumption is that the human always finds the optimal move.

[Figure: the complete game tree for the seven-coin game, alternating between computer (C) and human (H) moves, from 7 coins at the root down to 0.]
Two-Player Games with Complete Trees
Let us assume that the computer has the first move.
Then, the game can be described as a series of
decisions, where the first decision is made by the
computer, the second one by the human, the third
one by the computer, and so on, until all coins are
gone.
[Figure: the same game tree drawn as alternating layers of computer (C) and human (H) decision nodes.]
Two-Player Games with Complete Trees

So the computer will start the game by taking three coins and is guaranteed to win the game.

The most practical way of implementing such an algorithm is the Minimax procedure:

• Call the two players MIN and MAX.
• Mark each leaf of the search tree with -1 if it shows a victory of MIN, and with 1 if it shows a victory of MAX.
• Propagate these values up the tree using the rules:
– If the parent state is a MAX node, give it the maximum value among its children.
– If the parent state is a MIN node, give it the minimum value among its children.
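The rules above translate directly into code for the coin game. This is an illustrative sketch (the function names are our own); MAX is the computer:

```python
def minimax(coins, max_to_move):
    """Minimax value of a position: 1 if MAX can force a win, -1 if MIN can."""
    if coins == 0:
        # The player who took the last coin won.  If MAX is to move now,
        # MIN made that final move, so the leaf value is -1 (and vice versa).
        return -1 if max_to_move else 1
    values = [minimax(coins - k, not max_to_move)
              for k in (1, 2, 3) if k <= coins]
    # MAX nodes take the maximum of their children, MIN nodes the minimum.
    return max(values) if max_to_move else min(values)

def best_move(coins):
    """MAX's best move: the k whose resulting position has maximum value."""
    return max((k for k in (1, 2, 3) if k <= coins),
               key=lambda k: minimax(coins - k, False))
```

Running `minimax(7, True)` yields 1, and `best_move(7)` yields 3: the computer opens by taking three coins, as the game tree shows.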
Two-Player Games with Complete Trees

[Figure: the seven-coin game tree with the minimax values -1 and 1 propagated to every node; the root (7 coins, computer to move) receives value 1.]
Two-Player Games

The previous example shows how we can use the Minimax procedure to determine the computer’s best move.

It also shows how we can apply depth-first search and a variant of backtracking to prune the search tree.

Before we formalize the idea for pruning, let us move on to more interesting games.

For such games, it is impossible to check every possible sequence of moves. The computer player then only looks ahead a certain number of moves and estimates the chance of winning after each possible sequence.

Therefore, we need to define a static evaluation function e(p) that tells the computer how favorable the current game position p is from its perspective.

In other words, e(p) will assume large values if a position is likely to result in a win for the computer, and low values if it predicts its defeat.

In any given situation, the computer will make a move that guarantees a maximum value for e(p) after a certain number of moves.

For this purpose, we can use the Minimax procedure with a specific maximum search depth (ply-depth k for k moves of each player).

February 23, 2016
Introduction to Artificial Intelligence
Lecture 9: Two-Player Games I
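Depth-limited Minimax with a static evaluation function can be sketched generically as follows. The helper interface (`moves` listing legal moves, `apply_move` producing the successor position) is an assumption of ours, not from the lecture:

```python
def minimax_eval(p, depth, maximizing, e, moves, apply_move):
    """Depth-limited Minimax: look `depth` plies ahead, then fall back on
    the static evaluation e(p).  (Illustrative sketch; helper names are
    our own.)"""
    legal = moves(p)
    if depth == 0 or not legal:
        return e(p)                        # cut off: estimate the position
    values = (minimax_eval(apply_move(p, m), depth - 1, not maximizing,
                           e, moves, apply_move)
              for m in legal)
    return max(values) if maximizing else min(values)
```

For a ply-depth of k moves per player, the search would be called with depth 2k, alternating between maximizing and minimizing levels.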
Two-Player Games

For example, let us consider Tic-Tac-Toe (although it would still be possible to search the complete game tree for this game).

What would be a suitable evaluation function for this game?

We could use the number of lines that are still open for the computer (X) minus the ones that are still open for its opponent (O).

[Figure: three sample boards with e(p) = 8 – 8 = 0, e(p) = 6 – 2 = 4, and e(p) = 2 – 2 = 0; the last board shows the weakness of this e(p).]

How about these? For a position where X has completed a line, e(p) = ∞; for a position where O has completed a line, e(p) = –∞.

[Figure: two boards illustrating these terminal cases.]
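The open-lines heuristic is easy to write down. This is a sketch with a board representation of our own choosing (a 9-character string of 'X', 'O', or ' '):

```python
# The eight winning lines of a 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),     # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),     # columns
         (0, 4, 8), (2, 4, 6)]                # diagonals

def e(board):
    """Static evaluation: open lines for X minus open lines for O.
    A line is open for a player if the opponent has no mark on it."""
    open_x = sum(all(board[i] != 'O' for i in line) for line in LINES)
    open_o = sum(all(board[i] != 'X' for i in line) for line in LINES)
    return open_x - open_o

# Empty board: all 8 lines are open for both players, so e(p) = 8 - 8 = 0.
assert e(' ' * 9) == 0
```

This version still ignores completed lines; the ∞/–∞ terminal cases from the slide would be checked separately before computing the difference.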
The Alpha-Beta Procedure

Now let us specify how to prune the Minimax tree in the case of a static evaluation function.

• Use two variables alpha (associated with MAX nodes) and beta (associated with MIN nodes).
• These variables contain the best (highest or lowest, respectively) e(p) value at a node p that has been found so far.
• Notice that alpha can never decrease, and beta can never increase.
The Alpha-Beta Procedure

There are two rules for terminating search:

• Search can be stopped below any MIN node having a beta value less than or equal to the alpha value of any of its MAX ancestors.
• Search can be stopped below any MAX node having an alpha value greater than or equal to the beta value of any of its MIN ancestors.

Alpha-beta pruning thus expresses a relation between nodes at level n and level n+2 under which entire subtrees rooted at level n+1 can be eliminated from consideration.
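The two termination rules correspond to the usual alpha-beta formulation: stop expanding a node's children as soon as beta ≤ alpha. A sketch, using the same generic interface as before (helper names `moves` and `apply_move` are our own):

```python
from math import inf

def alphabeta(p, depth, alpha, beta, maximizing, e, moves, apply_move):
    """Depth-limited Minimax with alpha-beta pruning (illustrative sketch)."""
    legal = moves(p)
    if depth == 0 or not legal:
        return e(p)
    if maximizing:
        value = -inf
        for m in legal:
            value = max(value, alphabeta(apply_move(p, m), depth - 1,
                                         alpha, beta, False,
                                         e, moves, apply_move))
            alpha = max(alpha, value)   # alpha can never decrease
            if beta <= alpha:
                break                   # a MIN ancestor will never allow this
        return value
    else:
        value = inf
        for m in legal:
            value = min(value, alphabeta(apply_move(p, m), depth - 1,
                                         alpha, beta, True,
                                         e, moves, apply_move))
            beta = min(beta, value)     # beta can never increase
            if beta <= alpha:
                break                   # a MAX ancestor will never allow this
        return value
```

The initial call uses alpha = −∞ and beta = +∞; the pruned result is always identical to the plain Minimax value, only computed with fewer leaf evaluations.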
The Alpha-Beta Procedure

Example:

[Figure: a sequence of slides working through alpha-beta pruning step by step on a four-level max/min tree. Leaves are evaluated left to right in groups (4, 5, 3, 1; 8, 6, 7; 2; 5, 4, 4; 6, 7, 7); the alpha and beta values at each node are updated after every evaluation, and search below a node stops as soon as the termination rules apply. One annotation explains a cut: “Propagated from grandparent – no values below 3 can influence MAX’s decision any more.” The final slide is marked “Done!”]