Why study games?
Game Playing
Game playing: an idealized world of hostile agents attempting to
diminish one's well-being.
Reasons to study games:
Modeling strategic and adversarial problems is of general interest (e.g.,
economic situations).
Handling opponents introduces uncertainty and requires contingency
plans.
The problems are usually complex and are often viewed as an indicator of
intelligence.
Why Study Games?
Games offer:
Intellectual engagement
Abstraction
Representability
A performance measure
Characteristics:
Well-formalized problems: clear description of the environment.
Common-sense knowledge is not required.
Rules are fixed.
The number of nodes in the tree might be high, but memorizing the past is
not needed.
Game Playing as Search
Playing a game involves searching for the best move.
Board games clearly involve notions like start state,
goal state, operators, etc. We can thus usefully import
problem-solving techniques that we have already met.
Not all games are suitable for AI research. We will
restrict ourselves to two-person, perfect-information
board games.
There are nevertheless important differences from
standard search problems.
Special Characteristics of Game-Playing Search
Until now we assumed the situation is not going to change
while we search. However, the main differences here are the
uncertainties introduced by:
Presence of an opponent. One does not know what the
opponent will do until he/she does it. Game-playing
programs must solve the contingency problem.
Complexity. Most interesting games are simply too
complex to solve by exhaustive means. Chess, for
example, has an average branching factor of 35.
Uncertainty also arises from not having the resources to
compute a move which is guaranteed to be the best.
AI and game playing
Game playing (especially chess and
checkers) was the first test application of AI
It involves a different type of search
problem than we have considered up to now
– a solution is not a path, but simply the
next move
The best move depends on what the
opponent might do (adversarial search)
Two-player games: motivation
Previous heuristics and search procedures are
only useful for single-player games:
no notion of turns: one or more cooperative agents
adversarial moves are not taken into account
Games are ideal for exploring adversarial
strategies:
well-defined, abstract rules
most can be formulated as search problems
really hard combinatorial problems -- chess!!
Minimax search strategy
Search for A's best next move, so that no
matter what B does (in particular, choosing its
best move) A will be better off.
At each step, evaluate the value of all
descendants: take the maximum if it is A's
turn, or the minimum if it is B's turn.
We need the estimated values d moves ahead:
generate all nodes to level d (BFS)
propagate min/max values up from the leaves (a sketch follows below)
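A minimal, self-contained sketch of this value propagation in Python, assuming the tree has already been generated to depth d and that the leaves hold estimated values (the nested-list representation is an assumption made for illustration):

```python
# Minimax value propagation over an explicit game tree: internal nodes are
# lists of children, leaves are (estimated) numeric values at depth d.
def minimax_value(node, maximizing):
    if isinstance(node, (int, float)):
        return node                                  # leaf: estimated value
    child_values = [minimax_value(child, not maximizing) for child in node]
    # Player A (MAX) takes the maximum of its children, player B (MIN) the minimum.
    return max(child_values) if maximizing else min(child_values)

# Example: a two-ply tree with A (MAX) to move at the root and B (MIN) below.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(tree, maximizing=True))          # -> 3
```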
How to play a game
A way to play such a game is to:
Consider all the legal moves you can make
Compute the new position resulting from each move
Evaluate each resulting position and determine which is
best
Make that move
Wait for your opponent to move and repeat
Key problems (see the sketch below) are:
Representing the “board”
Generating all legal next boards
Evaluating a position
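A rough sketch of this one-ply loop (not from the slides); the game-specific pieces are passed in as functions, and every name here is a hypothetical stand-in, with the three callables corresponding to the three key problems listed above:

```python
# Hypothetical one-ply game loop: legal_moves, apply_move and evaluate
# correspond to the three "key problems" (board representation, move
# generation, and position evaluation).
def play(board, legal_moves, apply_move, evaluate, get_opponent_move, game_over):
    while not game_over(board):
        # Consider all legal moves, compute each resulting position,
        # evaluate it, and keep the best one.
        move = max(legal_moves(board),
                   key=lambda m: evaluate(apply_move(board, m)))
        board = apply_move(board, move)              # make that move
        if game_over(board):
            break
        board = apply_move(board, get_opponent_move(board))   # opponent replies
    return board
```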
Two-player games
Search tree for each player remains the same
Even levels i are moves of player A
Odd levels i+1 are moves of player B
Each player searches for a goal (different for
each) at their level
Each player evaluates the states according to
their heuristic function
A’s best move brings B to the worst state
A searches for its best move assuming B will
also search for its best move
Typical case
2-person game
Players alternate moves
Zero-sum: one player’s loss is the other’s gain
Perfect information: both players have access to
complete information about the state of the game.
No information is hidden from either player.
No chance (e.g., using dice) involved
Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim,
Othello
Not: Bridge, Solitaire, Backgammon, ...
Ingredients of 2-Person Games
Players: We call them Max and Min.
Initial State: Includes board position and whose turn it is.
Operators: These correspond to legal moves.
Terminal Test: A test applied to a board position which
determines whether the game is over. In chess, for
example, this would be a checkmate or stalemate situation.
Utility Function: A function which assigns a numeric
value to a terminal state. For example, in chess the
outcome is win (+1), lose (-1), or draw (0). Note that by
convention, we always measure utility relative to Max.
(These ingredients are sketched as an interface below.)
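As a rough illustration (not part of the original slides), these ingredients can be collected into a small Python interface; the class and method names are assumptions:

```python
from typing import Any, Iterable, Protocol

class TwoPersonGame(Protocol):
    """Hypothetical interface bundling the ingredients listed above."""
    def initial_state(self) -> Any: ...                       # board position + whose turn (Max or Min)
    def legal_moves(self, state: Any) -> Iterable[Any]: ...   # the operators
    def result(self, state: Any, move: Any) -> Any: ...       # state after applying an operator
    def is_terminal(self, state: Any) -> bool: ...            # terminal test (e.g. checkmate/stalemate)
    def utility(self, state: Any) -> float: ...               # value of a terminal state, relative to Max
```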
Normal and Game Search Problem
Normal search problem: Max searches for a sequence
of moves yielding a winning position and then makes
the first move in the sequence.
Game search problem: Clearly, this is not feasible in
a game situation where Min's moves must be taken into
consideration. Max must devise a strategy which leads
to a winning position no matter what moves Min makes.
Chess as a First Choice
It provides proof that a machine can actually do something
that was thought to require intelligence.
It has simple rules.
The world state is fully accessible to the program.
The computer representation can be correct in every
relevant detail.
Some games
• tic-tac-toe
• checkers
• Go
• Othello
• chess
• poker
• bridge
Complexity of Searching
The presence of an opponent makes the decision problem
more complicated.
Games are usually much too hard to solve.
Games penalize inefficiency very severely.
Things to Come…
Perfect Decisions in Two-Person Games
Imperfect Decisions
Alpha-Beta Pruning
Games That Include an Element of Chance
Games as Search Problem
Some games can naturally be defined in the form of a tree.
The branching factor is usually the average of the possible
number of moves at each node.
This is a simple search problem: a player must search this
tree and reach a leaf node with a favorable outcome.
Two Player Game
Two players: Max and Min
The objective of both Max and Min is to optimize winnings
Max must reach a terminal state with the highest utility
Min must reach a terminal state with the lowest utility
The game ends when either Max or Min has reached a
terminal state
Upon reaching a terminal state, points may be awarded or
sometimes deducted
Search Problem Revisited
The simple problem is to reach a favorable terminal state.
The game-playing problem is not so simple:
An opponent tries to thwart your every move.
Max must reach a terminal state with as high a utility as
possible regardless of Min's moves.
Max must develop a strategy that determines the best possible
move for each move Min makes.
Game Playing - Minimax
1944 - John von Neumann outlined a search
method (Minimax) that maximised your
position whilst minimising your opponent's.
Game Playing – Example
Nim (a simple game)
Start with a single pile of tokens.
At each move the player must select a pile and divide
the tokens into two non-empty, non-equal piles.
Starting with 7 tokens, the game is small
enough that we can draw the entire game
tree.
The "game tree" describing all possible
games follows:
Game Playing - Minimax
[Game tree for 7-token Nim. Level 0: 7. Level 1: 6-1, 5-2, 4-3.
Level 2: 5-1-1, 4-2-1, 3-2-2, 3-3-1. Level 3: 4-1-1-1, 3-2-1-1, 2-2-2-1.
Level 4: 3-1-1-1-1, 2-2-1-1-1. Level 5: 2-1-1-1-1-1.]
Conventionally, in discussions of minimax, we
have two players, "MAX" and "MIN".
The utility function is taken to be the
utility for MAX.
Larger values are better for MAX.
Game Playing – Nim
Remember that larger values are taken to be better for
MAX.
Assume that we use a utility function of:
1 = a win for MAX
0 = a win for MIN
We only compare values ("larger or smaller"), so the
actual sizes do not matter;
in other games we might use {+1, 0, -1} for
{win, draw, lose}. (A small Nim sketch follows below.)
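A small, self-contained sketch of this Nim variant (not from the slides): a state is a tuple of pile sizes, and the utility follows the 1/0 convention just given, under the assumption (not stated explicitly above) that the player who cannot move loses:

```python
# Nim (pile-splitting) sketch.  A state is a tuple of pile sizes.
# Assumption: the player who cannot move loses.
def nim_moves(piles):
    """All states reachable by splitting one pile into two unequal, non-empty piles."""
    successors = []
    for i, p in enumerate(piles):
        for k in range(1, (p + 1) // 2):        # k < p - k guarantees unequal piles
            rest = piles[:i] + piles[i + 1:]
            successors.append(tuple(sorted(rest + (k, p - k), reverse=True)))
    return successors

def nim_utility(piles, max_to_move):
    """1 = a win for MAX, 0 = a win for MIN, at a terminal (no-move) state."""
    assert not nim_moves(piles)
    return 0 if max_to_move else 1              # the side to move cannot move and loses

print(nim_moves((7,)))                          # -> [(6, 1), (5, 2), (4, 3)]
```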
Game Playing – Minimax
Basic idea of minimax:
Player MAX is going to take the best move
available
Will select the next state to be the one with
the highest utility
Hence, the value of a MAX node is the
MAXIMUM of the values of the next possible
states, i.e. the maximum of its children in the search tree.
Game Playing – Minimax
Player MIN is going to take the best move available
for MIN,
i.e. the worst available for MAX.
Will select the next state to be the one with the lowest
utility
(recall, higher utility values are better for MAX and
so worse for MIN).
Hence, the value of a MIN node is the MINIMUM of the
values of the next possible states,
i.e. the minimum of its children in the search tree.
Game Playing – Minimax
Summary
A "MAX" move takes the best move for MAX –
so takes the MAX utility of the children.
A "MIN" move takes the best for MIN – hence
the worst for MAX – so takes the MIN utility of
the children.
Games alternate in play between MIN and
MAX.
Game Playing – Use of Minimax
[Figure: the 7-token Nim game tree annotated with minimax values, with
MIN and MAX levels alternating; terminal states are labelled 1 (a win for
MAX) or 0 (a loss for MAX), and the values are backed up to the root.]
The MIN node shown has value +1:
all moves by MIN lead to a state of value
+1 for MAX;
MIN cannot avoid losing.
From the values on the tree one can read
off the best moves for each player;
make sure you know how to extract these
best moves ("perfect lines of play").
Game Playing – Bounded Minimax
For real games, search trees are much
bigger and deeper than Nim.
Cannot possibly evaluate the entire tree;
have to put a bound on the depth of the
search (a depth-bounded sketch follows below).
The terminal states are no longer a definite win/loss
(actually they are really a definite win/draw/loss, but
with reasonable computer resources we cannot
determine which).
Have to heuristically/approximately evaluate the
quality of the positions of the states.
Evaluation of the utility function is expensive if it is not
a clear win or loss.
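A rough depth-bounded variant of the earlier minimax sketch; the game helpers legal_moves, result, is_terminal and evaluate are hypothetical, and the evaluation function stands in for the true utility at the search horizon:

```python
# Depth-bounded minimax: when the depth bound d is reached (or the game is
# over), fall back on a heuristic evaluation instead of the true utility.
def bounded_minimax(state, d, maximizing, legal_moves, result, is_terminal, evaluate):
    if d == 0 or is_terminal(state):
        return evaluate(state)                        # heuristic estimate at the horizon
    values = [bounded_minimax(result(state, m), d - 1, not maximizing,
                              legal_moves, result, is_terminal, evaluate)
              for m in legal_moves(state)]
    return max(values) if maximizing else min(values)
```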
Game Playing – Bounded Minimax
Next Slide:
An artificial example of bounded minimax.
[Figure: a small tree in which the "terminal positions" reached after all
possible moves by MAX are evaluated; the utility values of these "terminal"
positions are obtained by an evaluation function. The numbers are invented,
and just illustrate the working of minimax. Legend: terminal position,
agent (MAX), opponent (MIN).]
Game Playing – Bounded Minimax
Example of minimax with bounded depth.
Evaluate the "terminal position" after all possible moves
in the order:
1. MAX (a.k.a. the "agent")
2. MIN (a.k.a. the "opponent")
3. MAX
(The numbers are invented, and just illustrate the
working of minimax.)
Assuming MAX plays first, complete the MIN/MAX
tree.
Game Playing – Bounded Minimax
[Figure: the completed bounded-minimax tree, with the agent (MAX) at the
root, the opponent (MIN) below, and evaluated "terminal" positions at the
leaves; positions labelled A–G carry invented evaluation values (e.g. 4, 1,
2, -3, -5, -7, -8), and MIN/MAX values are backed up to the root. Legend:
terminal position, agent (MAX), opponent (MIN).]
If both players play their best moves,
then which "line" does the play follow?
Game Playing – Perfect Play
Note that the line of perfect play leads to a terminal
node with the same value as the root node.
All intermediate nodes also have that same value.
Essentially, this is the meaning of the value at the
root node.
Caveat: this only applies if the tree is not expanded
further after a move, because then the terminals will
change and so the values can change.
Two-Ply Game: Revisited
[Figure: a two-ply game tree (MAX at the root, MIN below) with leaf values
such as 3, 12, 8, 2, 4, 6, 14, 5, 2, and 7; the backed-up minimax value at
the root is 3.]
An Analysis
In general, the complexity is O(b^d),
where b = the average branching factor
and d = the number of plies.
This algorithm is only good for games with a low
branching factor. Why?
Take chess: on average it has
35 branches and
usually at least 100 moves,
so the game space is:
• 35^100
Is There Another Way?
Is this a realistic game space to search?
Since time is an important factor in game playing, searching this
game space is highly undesirable.
Imperfect Decisions
Many games produce very large search trees.
Cutoffs must be implemented due to time restrictions,
imposed either by the computer or by the game situation.
Why is it Imperfect?
Without knowledge of the terminal states, the program is
taking a guess as to which path to take.
Evaluation Functions
A function that returns an estimate of the expected utility
of the game from a given position.
Evaluation functions must agree with the utility function
on the terminal states.
The performance of a game-playing program is dependent
on the quality of its evaluation functions.
How to Judge Quality
Given the present situation, give an estimate of the value of the
next move.
It must not take too long (trade-off between accuracy and
time cost).
It should reflect the actual chance of winning.
Design
Encode the quality of a position in a number that is
representable within the framework of the given language.
Design a heuristic for the value of any
object in any given position of the game.
Different Types
Different evaluation functions must depend on the nature
of the game.
Material Advantage Evaluation Functions
Values of the pieces are judged independently of other pieces on
the board. A value is returned based on the material value of the
computer's pieces minus the material value of the player's pieces.
Example
Weighted Linear Functions
• w1*f1 + w2*f2 + … + wn*fn
The w's are the weights of the pieces.
The f's are features of the particular position.
A sketch of such a function appears below.
Different Types
Use the probability of winning as the value to return.
If A has a 100% chance of winning, then the value to return is 1.00.
Chess: material value – each piece on the board is worth
some value (Pawn = 1, Knight = 3, etc.)
www.imsa.edu/~stendahl/comp/txt/gnuchess.txt
Othello: a value is given to the number of pieces of a certain color on
the board and the number of pieces that will be converted.
lglwww.epfl.ch/~wolf/java/html/Othello-desc.html
Cutoff Search
Cutting off searches at a fixed depth, dependent on time.
The deeper the search, the more information is available to the
program and the more accurate the evaluation function.
Iterative deepening – when time runs out, the
program returns the deepest completed search
(a sketch follows below).
Is searching a node deeper better than searching more nodes?
Consequences
The evaluation function might return an incorrect value.
If the search is cut off and the next move involves a capture,
then the value that is returned may be incorrect.
Horizon problem
Moves that are pushed deeper into the search tree may result in an
oversight by the evaluation function.
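A rough sketch of iterative deepening under a time budget; `search(depth)` stands in for a depth-limited (minimax) search returning a best move, and the names and the between-iterations time check are assumptions (a real engine would also abort mid-search):

```python
import time

# Iterative deepening: repeatedly run a depth-limited search, keeping the
# result of the deepest search that completed before time ran out.
def iterative_deepening(search, time_limit_s, max_depth=64):
    deadline = time.monotonic() + time_limit_s
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                     # time is up: return the deepest completed result
        best = search(depth)
    return best

# Usage sketch (hypothetical): iterative_deepening(lambda d: best_move(state, d), 5.0)
```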
Improvements to Cutoff
Evaluation functions should only be applied to quiescent
positions.
Quiescent position: a position that is unlikely to exhibit wild
swings in value in the near future.
Non-quiescent positions should be expanded until one is
reached. This extra search is called a quiescence search
(sketched below).
It will provide more information about that one node in the search
tree, but may result in the loss of information about the other nodes.
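A rough sketch of the idea (not from the slides), using the simplifying assumptions that "non-quiescent" means "captures are available" and that evaluate returns values from the point of view of the side to move (negamax convention); capture_moves and apply_move are hypothetical helpers:

```python
# Quiescence search sketch: rather than evaluating a "noisy" position
# directly, keep expanding loud continuations (here, only captures) until a
# quiet position is reached, then apply the evaluation function.
def quiescence(state, evaluate, capture_moves, apply_move):
    stand_pat = evaluate(state)                       # value if we stop searching here
    captures = capture_moves(state)
    if not captures:                                  # quiescent: nothing wild pending
        return stand_pat
    # The side to move may either stand pat or play one of the captures.
    return max(stand_pat,
               max(-quiescence(apply_move(state, m), evaluate, capture_moves, apply_move)
                   for m in captures))
```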
Pruning
What is pruning?
The process of eliminating a branch of the search tree from
consideration without examining it.
Why prune?
To avoid searching nodes that cannot affect the final choice.
To speed up the search process.
Alpha-Beta Pruning
A particular technique to find the optimal solution
according to a limited-depth search using evaluation
functions.
Returns the same choice as minimax with cutoff decisions, but
examines fewer nodes.
Gets its name from the two variables that are passed along
during the search, which restrict the set of possible
solutions.
Definitions
Alpha – the value of the best (highest-value) choice so far
along the path for MAX.
Beta – the value of the best (lowest-value) choice so far
along the path for MIN.
Implementation
Set the root node's alpha to negative infinity and beta to positive
infinity.
Search depth-first, propagating alpha and beta values down
to all nodes visited, until reaching the desired depth.
Apply the evaluation function to get the utility of this node.
If the parent of this node is a MAX node, and the utility
calculated is greater than the parent's current alpha value,
replace that alpha value with this utility.
Implementation (Cont'd)
If the parent of this node is a MIN node, and the utility
calculated is less than the parent's current beta value, replace
that beta value with this utility.
Based on these updated values, compare the alpha and
beta values of the parent node to determine whether to
look at any more children or to backtrack up the tree.
Continue the depth-first search in this way until all
potentially better paths have been evaluated
(a code sketch follows below).
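A minimal sketch of alpha-beta on an explicit game tree (the same nested-list representation as the earlier minimax sketch); this is a standard formulation rather than a literal transcription of the steps above:

```python
# Alpha-beta pruning on an explicit tree: leaves are numbers, internal
# nodes are lists of children.  Returns the same value as plain minimax
# but skips children that cannot affect the final choice.
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):
        return node                          # leaf: evaluated position
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)        # best choice so far for MAX
            if alpha >= beta:
                break                        # beta cutoff: MIN will avoid this branch
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)          # best (lowest) choice so far for MIN
            if alpha >= beta:
                break                        # alpha cutoff: MAX will avoid this branch
        return value

# Same two-ply example as before; the value agrees with minimax (3),
# but the leaves after the 2 in the middle branch are pruned.
print(alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]], maximizing=True))  # -> 3
```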
Example: Depth = 4
[Figure: a depth-4 alpha-beta example showing how α (initially −∞) and
β (initially +∞) are passed down the tree and updated at MAX and MIN
nodes as the depth-first search proceeds, producing cutoffs.]
Effectiveness
The effectiveness depends on the order in which the search
progresses.
If b is the branching factor and d is the depth of the search,
the best case for alpha-beta is O(b^(d/2)), compared to
minimax, which is O(b^d).
Problems
If there is only one legal move, this algorithm will still
generate an entire search tree.
Designed to identify a "best" move, not to differentiate
between other moves.
Overlooks moves that forfeit something early for a better
position later.
Evaluation of utility is usually not exact.
Assumes the opponent will always choose the best possible
move.
Chance Nodes
Games that Include an Element
of Chance
Many games have unpredictable outcomes caused by such
actions as throwing dice or randomizing a condition.
Such games must include chance nodes in addition to MIN
and MAX nodes.
For each node, instead of a definite utility or evaluation,
we can only calculate an expected value.
Inclusion of Chance Nodes
Calculating Expected Value
For the terminal nodes, we apply the utility function.
We can calculate the expected value of a MAX move by
applying an expectimax value to each chance node at the
same ply.
After calculating the expected value of a chance node, we
can apply the normal minimax-value formula.
Expectimax Function
Provided we are at a chance node preceding MAX's turn,
we can calculate the expected utility for MAX as follows:
Let d_i be a possible dice roll or random event, where P(d_i)
represents the probability of that event occurring.
If we let S(d_i) denote the set of legal positions generated by that dice
roll, we have the expectimax function defined as follows:
expectimax(C) = Σ_i P(d_i) · max_{s ∈ S(d_i)} utility(s)
where max_{s ∈ S(d_i)} returns the value of the move MAX will pick out
of all the choices available.
Alternately, you can define an expectimin function for chance
nodes preceding MIN's move.
Together they are called the expectiminimax function.
(A code sketch follows below.)
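A minimal, self-contained sketch of expectiminimax over an explicit tree, where each node is tagged as "max", "min", or "chance" (chance children carry probabilities) and leaves are utilities; the tuple-based representation is an assumption made for illustration:

```python
# Expectiminimax sketch.  A node is either a number (utility of a leaf),
# ("max", [children]), ("min", [children]), or
# ("chance", [(probability, child), ...]).
def expectiminimax(node):
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # Chance node: expected value, weighting each outcome by its probability.
    return sum(p * expectiminimax(c) for p, c in children)

# Example: MAX chooses between two chance nodes (dice outcomes with P = 0.6 / 0.4).
tree = ("max", [
    ("chance", [(0.6, ("min", [3, 5])), (0.4, ("min", [2, 6]))]),   # 0.6*3 + 0.4*2 = 2.6
    ("chance", [(0.6, ("min", [4, 7])), (0.4, ("min", [1, 9]))]),   # 0.6*4 + 0.4*1 = 2.8
])
print(expectiminimax(tree))   # -> 2.8
```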
Application to an Example
[Figure: an expectiminimax tree with MAX at the root, chance nodes
(branch probabilities 0.6 and 0.4), MIN nodes, and further chance nodes
above the leaves; expected values such as 3.0, 3.6, 4.4, and 3.56 are
backed up toward the root.]
Chance Nodes: Differences
For minimax, any order-preserving transformation of the leaf
values does not affect the decision.
However, when chance nodes are introduced, only positive linear
transformations will keep the decision the same.
Complexity of Expectiminimax
Where minimax is O(b^m), expectiminimax takes
O(b^m n^m), where n is the number of distinct rolls.
The extra cost makes it unrealistic to look too far ahead.
How much this affects our ability to look ahead depends
on how many random events can occur (i.e., possible
dice rolls).
Things to Consider
Wrapping Things Up
Calculating optimal decisions is intractable in most cases;
thus all algorithms must make some assumptions and
approximations.
The standard approach based on minimax, evaluation
functions, and alpha-beta pruning is just one way of doing
things.
These search techniques do not reflect how humans
actually play games.
Demonstrating A Problem
Given this two-ply tree, the minimax algorithm will select
the right-most branch, since it forces a minimum value of
no less than 100.
This relies on the assumption that 100, 101, and 102 are in
fact better than 99.
Summary
We defined game playing in terms of search.
Discussion of two-player games with perfect information
(minimax).
Using cutoffs to meet time constraints.
Optimizations using alpha-beta pruning to arrive at the
same conclusion as minimax would have.
Complexity of adding chance to the decision tree.