TWO-PERSON ZERO-SUM GAME THEORY

by

WAYNE O'NEIL EVANS

B. A., Adams State College of Colorado, 1962

A MASTER'S REPORT

submitted in partial fulfillment of the
requirements for the degree

MASTER OF SCIENCE

Department of Mathematics

KANSAS STATE UNIVERSITY
Manhattan, Kansas

1964

Approved by:

Major Professor
TABLE OF CONTENTS

INTRODUCTION
    Terminology and Classification
    Mathematical Formulation
MATRIX GAMES
    Games with Saddle-Points
    Games with Perfect Information
    Games without Saddle-Points
    Graphical Representation of Mixed Strategies
THE FUNDAMENTAL THEOREM OF GAME THEORY
SOLVING MATRIX GAMES
    Algebraic Solution
    Linear Programming
    Iterative Solution of a Game by Fictitious Play
TOPICS FOR FURTHER STUDY
    N-Person Games
    Non-Zero-Sum Games
    Infinite Games
ACKNOWLEDGMENT
BIBLIOGRAPHY
INTRODUCTION

The theory of games of strategy may be described as a mathematical theory of decision-making by participants in a competitive environment. Some common examples of games of strategy are such parlor games as chess, bridge, and poker, where the players make use of their ingenuity to outwit each other. Game theory is gaining importance because of its general applicability to real-life situations which involve conflicting interests, in which the outcome is partially controlled by one side and partially by the opposing side of the conflict. Military attack and defense of targets against attack, and economic price competition between two sellers, are real-life games of strategy.

The theory does not describe how a game should be played, but rather what strategy a player should select assuming that his opponent chooses his best possible strategy. The theory of games assumes that a player attempts to select a strategy which maximizes his smallest gain (security level).
Games of chance have been studied mathematically for many years. The mathematical theory of probability has resulted from such study. The French mathematician Emile Borel, in 1921, made one of the first attempts to abstract games of strategy into a mathematical theory with the formulation of what is now part of the minimax theory, but he was held back by his failure to prove the minimax theorem (8). John von Neumann, on December 7, 1926, gave a talk to the Mathematical Society in Gottingen in which he proved parts of the minimax theorem (8). However, it was not until 1944, with the publication of Theory of Games and Economic Behavior, by von Neumann and Oskar Morgenstern, that the mathematical theory of games received much attention (4). Dr. Morgenstern is a professor of economics (at Princeton); some chapters of the book stress the economic significance of the results of game theory.
Terminology and Classification

A game of strategy is described by its set of rules which specify clearly what each person, called a "player," is allowed to do under all possible circumstances. The rules of any game must specify in advance which moves, known as "information sets," are indistinguishable to the players. When an information set consists of a single move, the player is totally informed. When all the moves are of this type, the game is said to have perfect information. Ticktacktoe and chess are examples of games whose rules result in perfect information.

The word "play" will be employed to denote the number of times a particular game is played. The word "move" will mean a point in the game at which one of the players selects one of a set of alternatives. The word "choice" will mean the alternative selected. The following example uses this terminology: Black won the third play of the (chess) game by a clever choice on his tenth move.
A game is finite if each player has a finite number of choices available at each move; other games are called infinite. Games are classified according to the number of players, as 2-person, 3-person, etc. It is also convenient to distinguish between games whose pay-offs are zero-sum and those which are not. Consider a play of an n-person game with players P_1, ..., P_n, and let e_i (i = 1, 2, ..., n) be the payment made to P_i at the end of the play. If

    e_1 + e_2 + ... + e_n = 0,

we call the play zero-sum. If every possible play of a game is zero-sum, the game itself is called a zero-sum game. All other games are called non-zero-sum games.

Mathematical Formulation

In the mathematical formulation of a game, each player formulates in advance a plan for playing the game from the beginning to the end, instead of making his decision at each move. Such a complete plan of play is called a strategy of that player. A strategy must be complete and cover all possible conditions that may arise in the play.

Suppose that one of the players, Blue, has m strategies, which correspond to the numbers i = 1, 2, ..., m. Suppose the other player, Red, has n strategies, which correspond to j = 1, 2, ..., n. Every pair of strategies, one strategy for each player, determines a play of the game. Thus a play of a game consists of each player making one decision, the selection of a strategy. These two choices determine a play of the game and a pay-off to the two players.
Let a_ij be the pay-off to Blue. The pay-off to Red is -a_ij in a two-person zero-sum game. The game is thus determined by Blue's pay-off matrix

    A = (a_ij) =  | a_11  a_12  ...  a_1n |
                  | a_21  a_22  ...  a_2n |
                  |  .     .          .   |
                  | a_m1  a_m2  ...  a_mn |

In this m x n matrix each Blue strategy is represented by a row; each Red strategy is represented by a column. If Blue chooses the i-th strategy, or row i, and Red chooses the j-th strategy, or column j, then Red is to pay Blue the amount a_ij. Blue wants a_ij to be as large as possible, but he controls only the choice of his strategy i. Red wants a_ij to be as small as possible, but he controls only the choice of j. Hence, we have a conflict: Blue maximizing a_ij by his choice of i, and Red minimizing a_ij by his choice of j. Blue will be referred to as the maximizing player and Red the minimizing player.
As an illustration of a game of strategy, consider the following simplified version of the game known as NIM, for which general rules and description can be found in many places including (1, pp. 36-38). Let two piles, P_1 and P_2, of two items each be given. Red takes either one or two items from one pile; let this pile be called P_1. Player Blue then draws one or two items from either pile. The drawing continues until the player picking up the last item loses. The following game tree shows the possible succession of moves of Red and Blue. Here, the notation (x, y) means that the first pile contains x items and the second pile contains y items. On Red's first move he is faced with (2, 2), which means there are two items in each pile. After Red's first move, Blue is faced with either (1, 2), which means there is one item in the pile P_1 and two in the pile P_2, or (0, 2).
    [Game tree. From (2,2) Red moves to (1,2) or (0,2); play passes through
    positions such as (1,1), (0,2), (0,1), and (1,0) until (0,0) is reached.
    Each terminal (0,0) is marked "win for Red" or "win for Blue" according
    to which player was forced to pick up the last item.]
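The outcome recorded in the tree can be checked mechanically. The following sketch (a modern illustration, not part of the original report) enumerates the positions of this misere variant recursively; the function name and position encoding are my own.

```python
from functools import lru_cache

# Misere Nim on two piles of two: a move removes one or two items
# from a single pile; the player who picks up the last item loses.
@lru_cache(maxsize=None)
def first_player_wins(piles):
    """Return True if the player to move from `piles` can force a win."""
    replies = []
    for i, p in enumerate(piles):
        for take in (1, 2):
            if take <= p:
                nxt = list(piles)
                nxt[i] -= take
                replies.append(tuple(sorted(nxt)))
    for nxt in replies:
        if nxt == (0, 0):
            continue            # taking the last item loses outright
        if not first_player_wins(nxt):
            return True         # move to a position that loses for the opponent
    return False                # every move loses (or takes the last item)

print(first_player_wins((2, 2)))   # False: Red, moving first, loses
print(first_player_wins((1, 2)))   # True: whoever moves from (1,2) wins
```

This agrees with the analysis of the tables that follow: the second player (Blue) holds a strategy that wins against every choice by Red.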
Each player decides, in advance, what he will do in any possible situation, and all of these decisions together form a strategy. The players' strategies are given in the following tables. The column headed Move Number is the number of the move by the player. The column headed Condition Before Move lists all the possible conditions which the player might meet on the particular move. The columns under the heading Condition After Move list, in each column, the condition in which the player leaves the game if he uses the strategy above the column. If such a position is not possible using a certain strategy, this is shown by dashes (--).

                                  Condition After Move
                                   (Red's Strategies)
    Move      Condition
    Number    Before Move      1      2      3      4      5
      1       (2,2)          (1,2)  (1,2)  (1,2)  (1,2)  (0,2)
      2       (0,2)          (0,1)  (0,0)  (0,1)  (0,0)   --
      2       (1,1)          (0,1)  (0,1)  (1,0)  (1,0)   --
      2       (1,0)          (0,0)  (0,0)  (0,0)  (0,0)   --
      2       (0,1)           --     --     --     --    (0,0)

For example, Red's first strategy consists of the following moves: To begin with, he is faced with position (2, 2). He decides that he will take one item from P_1, thus changing the position to (1, 2). At Red's second move he may be faced with (0, 2) if Blue has taken the remaining item from P_1, with (1, 1) if Blue has taken one item from P_2, and with (1, 0) if Blue has taken two items from P_2. If he were faced with (0, 2), he would reduce this to (0, 1); if faced with (1, 1), he would reduce this to (0, 1); and if faced with (1, 0), he would reduce this to (0, 0). Red will not be faced with (0, 1), as shown by the dashes in the strategy column. Red's other strategies are similarly given in the table.
Blue's strategies are tabulated as follows:

                                  Condition After Move
                                   (Blue's Strategies)
    Move      Condition
    Number    Before Move      1      2      3      4      5      6
      1       (1,2)          (0,2)  (0,2)  (1,1)  (1,1)  (1,0)  (1,0)
      1       (0,2)          (0,0)  (0,1)  (0,0)  (0,1)  (0,0)  (0,1)
      2       (0,1)          (0,0)  (0,0)  (0,0)  (0,0)   --     --
      2       (1,0)           --     --    (0,0)  (0,0)   --     --

The reduction of the moves of a game to strategies is called normalization. There exists an elaborate theory referring to games in their extensive form (i.e., taking account of the succession of moves and, in particular, of the pattern of information). This paper will not discuss this aspect of game theory.
The pay-off matrix to Blue is as follows (where 1 denotes a win and -1 a loss by Blue):

                         Red Strategy                Row
                      1     2     3     4     5      Min.
             1       -1     1    -1     1    -1       -1
             2       -1     1    -1     1     1       -1
     Blue    3       -1    -1    -1    -1    -1       -1
             4       -1    -1    -1    -1     1       -1
             5        1     1     1     1    -1       -1
             6        1     1     1     1     1        1
     Col.
     Max.             1     1     1     1     1

In addition to the pay-off matrix, an extra column and row have been added giving the row minimum and column maximum respectively. The row minimum represents the least amount a player can receive from a strategy and is called the security level of that strategy. For Red, the column maximum represents the worst that could happen if he uses each strategy. For Blue's sixth strategy the row minimum is +1, so the worst that could happen if Blue uses his sixth strategy is that he would win. Thus, Blue maximizes his security level by selecting strategy six.

The game just represented is a game of perfect information because every previous move is at all times known to each of the players. This game possesses a special feature in that there was an entry in the pay-off matrix which was at the same time the smallest in its row and the largest in its column. Such an element is called a saddle point. If Blue were to announce in advance that he planned to play strategy six, Red could not take advantage and reduce Blue's pay-off. Similarly, if Red were to announce which strategy he was using, Blue could not increase his own pay-off.
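The saddle-point test described here, an entry that is at once the minimum of its row and the maximum of its column, is easy to mechanize. A short illustrative sketch (not from the report); the matrix entries repeat the pay-off table above and should be read as part of this example only.

```python
def saddle_points(A):
    """Indices (i, j) where A[i][j] is minimal in row i and maximal in column j."""
    pts = []
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            col = [r[j] for r in A]
            if a == min(row) and a == max(col):
                pts.append((i, j))
    return pts

# Pay-off matrix to Blue for the NIM example: rows are Blue's six
# strategies, columns Red's five (entries as reconstructed above).
A = [[-1,  1, -1,  1, -1],
     [-1,  1, -1,  1,  1],
     [-1, -1, -1, -1, -1],
     [-1, -1, -1, -1,  1],
     [ 1,  1,  1,  1, -1],
     [ 1,  1,  1,  1,  1]]

print(saddle_points(A))   # every entry of Blue's sixth row (index 5)
```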
MATRIX GAMES

A two-person zero-sum game Γ consists of a pair of sets I and J and a real valued function φ defined on the pairs (i, j), where i∈I and j∈J. The elements of I and J are called the strategies for players Blue and Red respectively. The function φ is called the pay-off function. A game is represented by the game matrix A = (a_ij).

Consider the general pay-off matrix A = (a_ij). For any strategy i which Blue chooses, he can be sure that he receives at least

    min_{j∈J} a_ij,

where the minimum is taken over all of Red's strategies. Since Blue is at liberty to choose i, he can make his choice in such a way as to insure receiving at least

    max_{i∈I} min_{j∈J} a_ij,

called the maxmin of Γ. Similarly, for any strategy j which Red may choose, he can be sure Blue gets no more than

    max_{i∈I} a_ij.

Since Red is at liberty to choose j, he can choose it in such a way that Blue gets at most

    min_{j∈J} max_{i∈I} a_ij,

called the minmax of Γ. These two quantities in general are different, and a relationship between them is given by the theorem:

Theorem 1: If φ is a real valued function of x and y for which

    min_{y∈Y} max_{x∈X} φ(x, y)   and   max_{x∈X} min_{y∈Y} φ(x, y)

exist, then

    max_{x∈X} min_{y∈Y} φ(x, y)  ≤  min_{y∈Y} max_{x∈X} φ(x, y).
Proof: From the definition of minimum, given any x∈X, one has

    min_{y∈Y} φ(x, y)  ≤  φ(x, y).

From the definition of maximum, given any y∈Y, one has

    φ(x, y)  ≤  max_{x∈X} φ(x, y).

Hence, combining the above two inequalities,

    min_{y∈Y} φ(x, y)  ≤  φ(x, y)  ≤  max_{x∈X} φ(x, y).

Since the right-hand side of the preceding inequalities is independent of x, taking the maximum over x gives, for every y∈Y,

    max_{x∈X} min_{y∈Y} φ(x, y)  ≤  max_{x∈X} φ(x, y).

Since the left-hand side of this inequality is independent of y, taking the minimum over y yields

    max_{x∈X} min_{y∈Y} φ(x, y)  ≤  min_{y∈Y} max_{x∈X} φ(x, y).
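The two quantities compared in Theorem 1 can be computed directly for any matrix game. A small sketch (illustrative; the function names are my own):

```python
def maxmin(A):
    """Blue's security level: the largest row minimum of the pay-off matrix."""
    return max(min(row) for row in A)

def minmax(A):
    """Red's guarantee: the smallest column maximum of the pay-off matrix."""
    return min(max(row[j] for row in A) for j in range(len(A[0])))

A = [[1, 3],
     [4, 2]]          # the 2x2 game used later in the report
print(maxmin(A), minmax(A))   # 2 3 -- strict inequality, so no saddle-point
```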
Games with Saddle-Points

If the preceding inequality becomes an equality,

    max_{i∈I} min_{j∈J} a_ij  =  min_{j∈J} max_{i∈I} a_ij  =  v,

then Blue can choose a strategy so as to receive at least the common value v, and Red can keep Blue from getting more than v. The strategies i* and j* for Blue and Red respectively which guarantee this value v are referred to as optimal strategies. This pair of strategies is called the solution of the game.

Suppose φ(i, j) is a real valued function defined for i∈I and j∈J; then a point (i*, j*), where i*∈I and j*∈J, is called a saddle-point if

    1).  φ(i, j*)  ≤  φ(i*, j*)   for all i∈I,
    2).  φ(i*, j*) ≤  φ(i*, j)    for all j∈J.

A necessary and sufficient condition for a game to have a saddle-point is that there exists an element of the pay-off matrix which is simultaneously the minimum of its row and the maximum of its column. A game may have several saddle-points. In such a case all the saddle-points have the same value.

Theorem 2: The equality

    max_{i∈I} min_{j∈J} φ(i, j)  =  min_{j∈J} max_{i∈I} φ(i, j)

holds if, and only if, φ has a saddle-point.
Proof: If (i*, j*) is a saddle-point, then for all j∈J and i∈I,

    φ(i, j*)  ≤  φ(i*, j*),                                        (1)
    φ(i*, j*) ≤  φ(i*, j).                                         (2)

Since the inequality (1) is true for all i∈I,

    max_{i∈I} φ(i, j*)  ≤  φ(i*, j*).

Since the inequality (2) is true for all j∈J,

    φ(i*, j*)  ≤  min_{j∈J} φ(i*, j).

Combining the above inequalities,

    max_{i∈I} φ(i, j*)  ≤  φ(i*, j*)  ≤  min_{j∈J} φ(i*, j).      (3)

From the definitions of minimum and maximum it follows that

    min_{j∈J} max_{i∈I} φ(i, j)  ≤  max_{i∈I} φ(i, j*)

and

    min_{j∈J} φ(i*, j)  ≤  max_{i∈I} min_{j∈J} φ(i, j).

Therefore,

    min_{j∈J} max_{i∈I} φ(i, j)  ≤  φ(i*, j*)  ≤  max_{i∈I} min_{j∈J} φ(i, j).   (4)

But by Theorem 1, the left member of (4) is not less than max_{i∈I} min_{j∈J} φ(i, j). Hence, all three members of (4) are equal:

    min_{j∈J} max_{i∈I} φ(i, j)  =  φ(i*, j*)  =  max_{i∈I} min_{j∈J} φ(i, j).

Conversely, let i*∈I and j*∈J be such that

    max_{i∈I} min_{j∈J} φ(i, j)  =  min_{j∈J} φ(i*, j)

and

    min_{j∈J} max_{i∈I} φ(i, j)  =  max_{i∈I} φ(i, j*).

Since

    min_{j∈J} max_{i∈I} φ(i, j)  =  max_{i∈I} min_{j∈J} φ(i, j),

the equations above lead to the result:

    min_{j∈J} φ(i*, j)  =  max_{i∈I} φ(i, j*).                     (5)

From the definition of a minimum,

    min_{j∈J} φ(i*, j)  ≤  φ(i*, j*),                              (6)

and of a maximum,

    φ(i*, j*)  ≤  max_{i∈I} φ(i, j*).                              (7)

Combining (5), (6), and (7), one has

    max_{i∈I} φ(i, j*)  =  min_{j∈J} φ(i*, j)  ≤  φ(i*, j*)  ≤  max_{i∈I} φ(i, j*),

so that equality holds throughout. Therefore,

    φ(i, j*)  ≤  φ(i*, j*)   for all i∈I

and

    φ(i*, j*)  ≤  φ(i*, j)   for all j∈J,

which satisfy the definition of a saddle-point.
The game corresponding to the matrix

                                     Row Min.
        2     6     1     1             1
        3     5     2*    4             2*
        3     6     2     4             2

    Col.
    Max.  3   6     2     4

has a saddle-point. The element 2 in the second row and third column is a saddle-point for the game.
Games with Perfect Information

In order to prove the next theorem, it is convenient to introduce the notion of the truncation of a game of perfect information. Truncations of a game are those games which arise from a given game if the first move is deleted. The number of truncations of a game is the number of alternatives available at the first move. The strategy of a truncation of a game picks out the same alternatives at the branch points as does the original strategy.

Theorem 3: If the game Γ with matrix A is a game of perfect information, then Γ has a saddle-point.
Proof: The proof is by induction on the length of the game. If Γ is of length one (i.e., only one move), the theorem is obvious. Suppose the theorem is true for all games of length less than K, and let Γ be a game of length K. Suppose that there are r alternatives on the first move, and let Γ_1, Γ_2, ..., Γ_r be the r truncations of Γ. For each game Γ_u, let φ_u be the corresponding pay-off function, and let I_u and J_u be the sets of pure strategies for Blue and Red respectively. By the induction hypothesis, there is an equilibrium point in each of the games Γ_u; for each Γ_u, let (i_u*, j_u*) be a saddle-point.
Then, for u = 1, 2, ..., r and i_u ∈ I_u, j_u ∈ J_u,

    φ_u(i_u, j_u*)  ≤  φ_u(i_u*, j_u*)  ≤  φ_u(i_u*, j_u).         (8)

The first move of Γ has two cases: (1), the first move is made by chance; or (2), the first move is made by one of the players.

Case 1. First move made by chance. If q is a branch point of the truncated game Γ_u and corresponds to a move made by Blue or Red, set

    i*(q) = i_u*(q),     j*(q) = j_u*(q).

Since the first move of Γ is made by chance, i* is defined over all branch points of Γ corresponding to moves made by Blue and is a member of I. Similarly, j* is a member of J. It is sufficient to show that (i*, j*) is a saddle-point of Γ. Let the probabilities assigned to the r alternatives at the first move be α_1, α_2, ..., α_r; then

    φ(i, j) = Σ_{u=1}^{r} α_u φ_u(i_u, j_u).

In particular, since j_1*, ..., j_r* are truncations of j*, and i_1*, ..., i_r* are truncations of i*,

    φ(i, j*)  = Σ_{u=1}^{r} α_u φ_u(i_u, j_u*),
    φ(i*, j)  = Σ_{u=1}^{r} α_u φ_u(i_u*, j_u),
    φ(i*, j*) = Σ_{u=1}^{r} α_u φ_u(i_u*, j_u*).

From (8) it follows that

    φ(i, j*) = Σ_{u=1}^{r} α_u φ_u(i_u, j_u*) ≤ Σ_{u=1}^{r} α_u φ_u(i_u*, j_u*) = φ(i*, j*)

and

    φ(i*, j*) = Σ_{u=1}^{r} α_u φ_u(i_u*, j_u*) ≤ Σ_{u=1}^{r} α_u φ_u(i_u*, j_u) = φ(i*, j).

Hence (i*, j*) is a saddle-point of Γ.

Case 2. First move made by one of the players. The first move may be made by either player; since the proof for Red's move is analogous to the proof for Blue's move, and vice versa, assume the first move, q_0, is made by Blue. Let m be an index for which

    φ_m(i_m*, j_m*) = max_{u ≤ r} φ_u(i_u*, j_u*).

Define a function i* by setting

    i*(q_0) = m   and   i*(q) = i_u*(q)                            (9)

for any point q in the truncated game Γ_u which corresponds to a move made by Blue. The j* is defined as in the previous case. The i* and j* thus defined are strategies of Γ, with i*∈I for Blue. It will now be shown that these strategies yield a saddle-point of Γ.

If j∈J and j_m is its truncation to Γ_m, then

    φ(i*, j) = φ_m(i_m*, j_m).

In particular,

    φ(i*, j*) = φ_m(i_m*, j_m*).                                   (10)

Thus, if j∈J and j_m is its truncation to Γ_m,

    φ(i*, j*) = φ_m(i_m*, j_m*) ≤ φ_m(i_m*, j_m) = φ(i*, j).

Suppose that i(q_0) = u, and let i_u be the truncation of i to Γ_u. Then, if j∈J and j_u is its truncation,

    φ(i, j) = φ_u(i_u, j_u).

In particular,

    φ(i, j*) = φ_u(i_u, j_u*).

Now from (9) and the choice of m,

    φ_u(i_u*, j_u*) ≤ φ_m(i_m*, j_m*).

Hence, using equation (10),

    φ(i, j*) = φ_u(i_u, j_u*) ≤ φ_u(i_u*, j_u*) ≤ φ_m(i_m*, j_m*) = φ(i*, j*).

Thus (i*, j*) is a saddle-point.

The existence of a saddle-point in the game of chess follows from the fact that chess is a game of perfect information. If all the possible strategies for chess were enumerated, optimal strategies could be found. However, because of the large number of strategies for chess, saddle-points have not been computed. The game ticktacktoe has a small number of strategies, and so an optimal strategy can be found.
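The induction in Theorem 3 is constructive: backward induction over the game tree produces the saddle-point value. A minimal sketch, using a hypothetical nested-tuple encoding of a perfect-information tree (not a format from the report):

```python
def value(node):
    """Backward induction on a perfect-information zero-sum game tree.

    A node is either a number (terminal pay-off to Blue) or a pair
    (mover, children) with mover 'B' (maximizer) or 'R' (minimizer).
    """
    if isinstance(node, (int, float)):
        return node
    mover, children = node
    vals = [value(c) for c in children]
    return max(vals) if mover == 'B' else min(vals)

# Tiny tree: Blue chooses a branch, then Red replies.
tree = ('B', [('R', [3, 5]), ('R', [2, 9])])
print(value(tree))   # 3: Blue picks the branch whose worst reply is best
```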
Games without Saddle-Points

Consider games whose pay-off matrix is such that

    max_{i∈I} min_{j∈J} a_ij  <  min_{j∈J} max_{i∈I} a_ij.

The left-hand side of the inequality represents Blue's minimum security level (the least amount Blue can receive), and the right-hand side represents the negative of Red's minimum security level (the most Blue can receive).

The game defined by the pay-off matrix

    A =  | 1  3 |
         | 4  2 |

does not have a saddle-point, for

    min_{j∈J} max_{i∈I} a_ij = 3,     max_{i∈I} min_{j∈J} a_ij = 2.

Since the game matrix has no saddle-point, the previous methods do not determine optimum ways for Blue and Red to play. If Red can discover Blue's optimal strategy, Red can drive his winning down to 1 if the strategy is his first, or 2 if the strategy is the second. However, Blue is trying to get either 3 or 4. Therefore, if Blue's strategies were discovered, his winnings would be 1 or 2.

Thus, in a game without a saddle-point, the player's strategy will depend on his opponent's choice. Therefore, each player's strategy should be kept unknown to his opponent. One way to do this is to play certain strategies by using a random device for selecting a strategy. A player may choose a probability distribution over his set of strategies and then use an associated random device for selecting a strategy for the play of the game. Such a probability distribution over the whole set of strategies of a player is called a mixed strategy.
Suppose, in the above game, Blue plays strategy 1 with frequency x and plays strategy 2 with frequency 1-x, and suppose Red plays strategy 1 with frequency y and plays strategy 2 with frequency 1-y.

                     Red
                  y      1-y
    Blue    x     1       3
           1-x    4       2

The mathematical expectation of Blue is

    E(x, y) = 1·xy + 3x(1-y) + 4(1-x)y + 2(1-x)(1-y)
            = -4(x - 1/2)(y - 1/4) + 5/2.

When E(x, y) is written in the above form, it is easily seen that if Blue takes x = 1/2 he can insure that the expectation will be at least 5/2. Red can insure that the expectation of Blue will be no more than 5/2 by playing y = 1/4. Since Blue's maxmin is 2, he will settle for 5/2 and play x = 1/2. Similarly, Red's minmax is -3, and he will reconcile himself to getting -5/2 and play y = 1/4. Thus the optimal mixed strategy for Blue is to play strategy 1 with probability 1/2 and strategy 2 with probability 1/2. Red's optimal strategy is to play strategy 1 with probability 1/4 and strategy 2 with probability 3/4. The value of the game is 5/2. The two optimal strategies are called the solution of the game.
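The algebra above can be verified numerically: with x = 1/2 or y = 1/4 the expectation is pinned at 5/2 no matter what the other player does. An illustrative check (not part of the report):

```python
# E(x, y) for the saddle-point-free game [[1, 3], [4, 2]]:
# Blue plays row 1 with probability x, Red plays column 1 with probability y.
def E(x, y):
    return 1*x*y + 3*x*(1-y) + 4*(1-x)*y + 2*(1-x)*(1-y)

# At x = 1/2 the expectation is 5/2 whatever Red does ...
print(E(0.5, 0.0), E(0.5, 1.0))     # 2.5 2.5
# ... and at y = 1/4 it is 5/2 whatever Blue does.
print(E(0.0, 0.25), E(1.0, 0.25))   # 2.5 2.5
```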
Mixed strategies will be represented by row and column matrices. Let x_i be the probability of selecting strategy i. Then a mixed strategy, or probability distribution X, for Blue may be represented by the row vector

    X' = (x_1, x_2, ..., x_m),

where

    x_i ≥ 0,  i = 1, 2, ..., m,   and   Σ_{i=1}^{m} x_i = 1.

Similarly, if y_j is the probability of selecting strategy j, then a mixed strategy, or probability distribution Y, for Red is a column vector

    Y = (y_1, y_2, ..., y_n)',

where

    y_j ≥ 0,  j = 1, 2, ..., n,   and   Σ_{j=1}^{n} y_j = 1.

If x_i = 1 for some i, then X is called a pure strategy.
Suppose Blue chooses strategy i and Red chooses mixed strategy Y; the expected pay-off to Blue is

    h_i = Σ_{j=1}^{n} a_ij y_j,

which is the i-th component of the column vector

    H = AY = (h_1, h_2, ..., h_m)'.

If Red uses strategy j and Blue uses mixed strategy X, the expected pay-off to Red is

    k_j = Σ_{i=1}^{m} a_ij x_i,

which is the j-th component of the row vector

    K' = X'A = (k_1, k_2, ..., k_n).

If Blue and Red use mixed strategies X and Y respectively, then the expected pay-off to Blue is

    v = X'AY = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij x_i y_j = K'Y = X'H.

The following example of the Colonel Blotto Game, taken from Dresher (3, pp. 7-8), demonstrates the application of mixed strategies.

Colonel Blotto Game.
Colonel Blotto and his enemy each try to occupy two
posts by distributing their forces suitably. Let us assume
that Colonel Blotto has 4 regiments and the enemy has 3
regiments which are to be divided between the two posts.
Define the pay-off to Colonel Blotto at each post as follows:
If Colonel Blotto has more regiments than the enemy at
the post, Colonel Blotto receives the enemy's regiments
plus one (the occupation of the post is equivalent to capturing one regiment); if the enemy has more regiments
than Colonel Blotto at the post, then Colonel Blotto loses
one plus his regiments at the post; if each side places the
same number of regiments, it is a draw and each side gets
zero. The total pay-off is the sum of the pay-offs at the
two posts.
Colonel Blotto has 5 strategies, or five different ways
of dividing 4 regiments between the two posts. The enemy
has 4 strategies, or four different ways of dividing his 3
regiments. There are, therefore, twenty ways for the two
sides to distribute their forces.
It is evident that if Colonel Blotto places 3 regiments
at the first post and 1 at the second, and if the enemy places
2 regiments at the first post and 1 at the second, then Blotto
wins what amounts to 3 regiments. However, if Colonel
Blotto places 2 regiments at each post and the enemy places
all of his 3 regiments at either post, then Colonel Blotto
loses 2 regiments. The following pay-off matrix summarizes
the payment to Colonel Blotto for each of the twenty possible
distributions:

                      Colonel Blotto Pay-off

                              Enemy Strategies
                          (3,0)  (0,3)  (2,1)  (1,2)
                 (4,0)      4      0      2      1
    Colonel      (0,4)      0      4      1      2
    Blotto       (3,1)      1     -1      3      0
    Strategies   (1,3)     -1      1      0      3
                 (2,2)     -2     -2      2      2

In the Colonel Blotto pay-off matrix, if the enemy uses the mixed strategy

    Y = (1/4, 0, 1/2, 1/4)'

and Colonel Blotto uses a pure strategy, Blotto's expectation for each of his pure strategies is H = AY, or
        |  4   0   2   1 |   | 1/4 |     | 2 1/4 |
        |  0   4   1   2 |   |  0  |     |   1   |
    H = |  1  -1   3   0 | · | 1/2 |  =  | 1 3/4 |
        | -1   1   0   3 |   | 1/4 |     |  1/2  |
        | -2  -2   2   2 |               |   1   |

where the components of H represent Colonel Blotto's receipts corresponding to each one of his five pure strategies.

Now, if the two players' strategies are

    X' = (1/4, 0, 0, 1/4, 1/2)   and   Y = (1/4, 0, 1/2, 1/4)',

then Blotto's expectation is

    E = X'AY = X'H = (1/4)(2 1/4) + (1/4)(1/2) + (1/2)(1) = 19/16.

Colonel Blotto's expectation for other combinations of strategies can be evaluated in a similar fashion.
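The computation H = AY and E = X'H can be reproduced directly. A sketch (illustrative only) using the Blotto matrix as tabulated above:

```python
# Colonel Blotto pay-off matrix: rows are Blotto's distributions
# (4,0),(0,4),(3,1),(1,3),(2,2); columns the enemy's (3,0),(0,3),(2,1),(1,2).
A = [[ 4,  0,  2,  1],
     [ 0,  4,  1,  2],
     [ 1, -1,  3,  0],
     [-1,  1,  0,  3],
     [-2, -2,  2,  2]]

Y = [0.25, 0.0, 0.5, 0.25]                            # enemy's mixed strategy
H = [sum(a*y for a, y in zip(row, Y)) for row in A]   # H = AY
print(H)                                              # [2.25, 1.0, 1.75, 0.5, 1.0]

X = [0.25, 0.0, 0.0, 0.25, 0.5]                       # Blotto's mixed strategy
print(sum(x*h for x, h in zip(X, H)))                 # X'AY = 19/16 = 1.1875
```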
Graphical Representation of Mixed Strategies

It is possible to represent graphically the expectation of a player as a function of his mixed strategies. If one player has two strategies and the other has any number of strategies, it is possible to solve the game graphically in two dimensions. Consider a game with pay-off matrix

                          Red Strategies
                           R_1     R_2
    Blue         B_1      a_11    a_12
    Strategies   B_2      a_21    a_22

Any randomized strategy X = (x_1, x_2) for Blue can be identified with a point (x_1, x_2) on the segment of length one, as in Figure 1. If Blue chooses (x_1, x_2) and Red chooses strategy R_1, the pay-off to Blue is

    φ(X, R_1) = a_11 x_1 + a_21 x_2.

Geometrically, φ(X, R_1) may be represented as the vertical height from (x_1, x_2) to the R_1 line.

    [Figure 1. The strategy segment from (1, 0), pure strategy B_1, to
    (0, 1), pure strategy B_2, with the line R_1 joining height a_11
    over B_1 to height a_21 over B_2.]
Similarly, if (x_1, x_2) is used against R_2, the pay-off

    φ(X, R_2) = a_12 x_1 + a_22 x_2

may be represented as the vertical height from (x_1, x_2) to the R_2 line. Blue's security level is the smaller of these two heights. Blue may maximize his security level against Red's best play by playing X* = (x_1*, x_2*).

    [Figure 2. The lines R_1 and R_2 over the strategy segment; the
    optimal point (x_1*, x_2*) lies beneath their intersection.]
Since the two lines intersect at the point (x_1*, x_2*, v),

    a_11 x_1* + a_21 x_2*  =  v  =  a_12 x_1* + a_22 x_2*.         (11)

Since x_2* = 1 - x_1*, the system (11) above may be reduced to

    a_21 + (a_11 - a_21) x_1* = v,
    a_22 + (a_12 - a_22) x_1* = v.                                 (12)

The system (12) has solution

    x_1* = (a_22 - a_21) / (a_11 + a_22 - a_12 - a_21),
    x_2* = (a_11 - a_12) / (a_11 + a_22 - a_12 - a_21),            (13)
    v = (a_11 a_22 - a_12 a_21) / (a_11 + a_22 - a_12 - a_21).
To minimize Blue's pay-off, Red must use a mixed strategy Y = (y_1, y_2)'. The pay-off from these mixed strategies is

    φ(X, Y) = a_11 x_1 y_1 + a_12 x_1 y_2 + a_21 x_2 y_1 + a_22 x_2 y_2.   (14)

The minimax theorem states that if both Red and Blue use optimal strategies, the pay-off is the value of the game v, i.e.,

    x_1*(a_11 y_1* + a_12 y_2*) + x_2*(a_21 y_1* + a_22 y_2*) = v.         (15)

The values of x_1*, x_2*, and v have already been found in (13); hence, equation (15) has the solution

    y_1* = (a_22 - a_12) / (a_11 + a_22 - a_12 - a_21),
    y_2* = (a_11 - a_21) / (a_11 + a_22 - a_12 - a_21).

Equation (14) above may be represented by a line which is a weighted average of the lines R_1 and R_2, and hence must pass through the point of their intersection, (x_1*, x_2*, v). Since y_1 + y_2 = 1, the line must always lie between R_1 and R_2. If Red used any member of this family of lines except the horizontal one through v, then Blue, by choosing either pure strategy B_1 or B_2, could make his return exceed v. This is impossible by the minimax theorem; hence, Red's optimal strategy may be represented geometrically as the horizontal line through a minimum ordinate of the intersections of Blue's pure strategy lines.

Red's optimal strategy Y* = (y_1*, y_2*) is therefore a weighted average of two intersecting pure strategy lines, R_s and R_t, which have the minimum ordinate at the point of their intersection:

    v = y_s* R_s + y_t* R_t.

To find y_s* and y_t*, let m and n be given by the equations

    m = | R_s - v |   and   n = | R_t - v |.

Then

    y_s* = n / (m + n)   and   y_t* = m / (m + n).

An extension of this graphical analysis to games where one player has more than two strategies is extremely simple. Consider the case where Blue has two strategies (B_1, B_2) and Red has six (R_1, R_2, ..., R_6), as in the diagram below.

    [Figure. Six lines R_1, ..., R_6 over the segment between Blue
    Strategy 1 and Blue Strategy 2; the lower envelope of the lines
    attains its maximum at the intersection of two of them.]

If Red wishes to hold Blue down to at most v, he must use a randomized strategy involving only the two lines through that intersection, and thus this case is reduced to the case previously studied.
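The graphical argument, maximizing the lower envelope of Red's lines over Blue's strategy segment, can be approximated on a grid. A coarse illustrative sketch (the example matrix is my own, not from the report):

```python
# For a 2 x n game the value is max over x in [0, 1] of the lower envelope
#   min_j [ a1j*x + a2j*(1 - x) ],
# where x is Blue's weight on his first strategy.
def solve_2xn(A, steps=10000):
    best_x, best_v = 0.0, float('-inf')
    for k in range(steps + 1):
        x = k / steps
        v = min(A[0][j]*x + A[1][j]*(1 - x) for j in range(len(A[0])))
        if v > best_v:
            best_x, best_v = x, v
    return best_x, best_v

# Two strategies for Blue against four for Red; only two of Red's
# lines matter at the optimum, as the geometric argument predicts.
print(solve_2xn([[1, 3, 0, 2],
                 [4, 2, 5, 1]]))
```

For this matrix the envelope peaks near x = 2/3 with value 5/3, determined by the third and fourth columns alone.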
Lest the reader assume that all games have the same graphical representation as the one just analyzed, the following examples taken from Luce and Raiffa (6) present different features that might occur.

    [Figures (a), (b), (c). In (a), A Unique Optimal Strategy; in (b)
    and (c), Many Optimal Strategies.]

In (a), Red has a unique optimal mixed strategy. In cases (b) and (c), strict dominance by one strategy is shown. In these cases, Blue's optimal strategy may fall anywhere within the indicated interval.

Bennion (2) and Vajda (12) give examples of graphical solutions of games in three dimensions. Luce and Raiffa (6) present an alternate method of geometrical representation.
THE FUNDAMENTAL THEOREM OF GAME THEORY

Before proving the minimax theorem, the previous discussion is formalized by the generalization of any two-person zero-sum finite game Γ as follows:

i. There are two players, Blue and Red. Both players are malevolent (i.e., each is concerned with maximizing his own gains, or minimizing his own losses).

ii. Blue has a set I = (i_1, i_2, ..., i_m) of m pure strategies.

iii. Red has a set J = (j_1, j_2, ..., j_n) of n pure strategies.

iv. Associated with each pair of strategies (i, j) is the pay-off φ(i, j) of units from Red to Blue. φ(i, j) is abbreviated by a_ij.

v. Both players are aware of, and intelligent enough to evaluate accurately, the pay-offs associated with both players' alternative strategies.

vi. Blue may adopt a mixed strategy by employing i with probability x_i, where

    x_i ≥ 0  for i = 1, 2, ..., m   and   Σ_{i=1}^{m} x_i = 1.

Such a strategy is represented by the row vector X' = (x_1, x_2, ..., x_m).

vii. The set of all randomized strategies for Blue is designated by X_m. Similarly, Red's mixed strategy is denoted by a column vector Y = (y_1, y_2, ..., y_n)', where

    y_j ≥ 0  for j = 1, 2, ..., n   and   Σ_{j=1}^{n} y_j = 1.

The set of all randomized strategies for Red is designated by Y_n.

viii. For each mixed strategy pair (X, Y), the pay-off φ(X, Y) is defined to be

    φ(X, Y) = X'AY = Σ_{i=1}^{m} Σ_{j=1}^{n} x_i a_ij y_j.
ix. The pure strategy game may be denoted by the triplet (φ, I_m, J_n), which designates the two pure strategy spaces and the pay-off function φ. The extension of Γ to mixed strategies is denoted by the triplet (φ, X_m, Y_n).

x. Blue's objective is to select a mixed strategy X from X_m so as to maximize his security level (return). This strategy, called the optimal strategy, is denoted as X*. Because the game is zero-sum, Red's objective is to minimize Blue's return by the selection of a strategy Y from Y_n; this strategy is denoted as Y*. The set {X*, Y*} is called a solution for the game Γ; X* is the optimal strategy for Blue and Y* is the optimal strategy for Red.

The following proof of the minimax theorem makes use of a theorem from Glickman ((5), Theorem 2.5, p. 31).
Lemma 2.5: Let

    A =  | a_11  ...  a_1n |
         |  .          .   |
         | a_m1  ...  a_mn |.

Then either

(i) there exists an element X' = (x_1, ..., x_m) of X_m such that

    a_1j x_1 + a_2j x_2 + ... + a_mj x_m ≥ 0   for j = 1, ..., n,

or

(ii) there exists an element Y = (y_1, ..., y_n) of Y_n such that

    a_i1 y_1 + a_i2 y_2 + ... + a_in y_n < 0   for i = 1, ..., m.

Theorem 4: If X and Y are mixed strategies of the game Γ, then

    min_{Y∈Y_n} max_{X∈X_m} φ(X, Y)  =  v  =  max_{X∈X_m} min_{Y∈Y_n} φ(X, Y).
Proof: If condition (i) of Lemma 2.5 holds, there is an element X = (x_1, ..., x_m) ∈ X_m such that

    a_1j x_1 + a_2j x_2 + ... + a_mj x_m ≥ 0    for j = 1, 2, ..., n,

and hence for every Y ∈ Y_n,

    φ(X, Y) = Σ_j (a_1j x_1 + a_2j x_2 + ... + a_mj x_m) y_j ≥ 0.         (16)

Since (16) holds for every Y ∈ Y_n,

    min_{Y ∈ Y_n} φ(X, Y) ≥ 0

and, hence,

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) ≥ 0.                              (17)
If condition (ii) of Lemma 2.5 holds, there is an element Y = (y_1, ..., y_n) ∈ Y_n such that

    a_i1 y_1 + a_i2 y_2 + ... + a_in y_n < 0    for i = 1, 2, ..., m,

and hence for every X ∈ X_m,

    φ(X, Y) = Σ_i (a_i1 y_1 + a_i2 y_2 + ... + a_in y_n) x_i < 0.

Since the above holds for every X ∈ X_m,

    max_{X ∈ X_m} φ(X, Y) < 0

and, hence,

    min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y) < 0.                              (18)

Since either condition (i) or condition (ii) of Lemma 2.5 holds, at least one of the inequalities (17) or (18) must hold, and hence the following cannot be true:

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) < 0 < min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y).    (19)
Let A_v be the matrix which arises from A by subtracting v from each element of A:

    A_v = [ a_11 - v  ...  a_1n - v ]
          [    .              .     ]
          [ a_m1 - v  ...  a_mn - v ].

Let φ_v be the expectation function for A_v, so that for any X and Y that are members of X_m and Y_n, respectively,

    φ_v(X, Y) = Σ_i Σ_j (a_ij - v) x_i y_j                                (20)

and

    φ_v(X, Y) = φ(X, Y) - v.

Since the inequalities (19) do not hold for A_v, the following conditions do not hold:

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ_v(X, Y) < 0 < min_{Y ∈ Y_n} max_{X ∈ X_m} φ_v(X, Y).    (21)
Thus, from (20) and (21) the following conditions do not hold:

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) - v < 0 < min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y) - v.

Hence, for every v the following do not hold:

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) < v < min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y).        (22)
Since inequalities (22) are false for every v, the relation

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) ≥ min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y)             (23)

is true. From Theorem 3,

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) ≤ min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y).            (24)

Therefore, it follows from (23) and (24) that

    max_{X ∈ X_m} min_{Y ∈ Y_n} φ(X, Y) = min_{Y ∈ Y_n} max_{X ∈ X_m} φ(X, Y) = v.        (25)
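The equality asserted by the theorem can be illustrated numerically on a small game. The sketch below uses an arbitrary 2 × 2 matrix with no saddle-point (its mixed value is 3/2) and scans a fine grid over each player's single mixing parameter; the two iterated optimizations agree:

```python
# Brute-force check of max min = min max for a 2x2 mixed-strategy game.
def phi(A, p, q):
    # p = probability of Blue's first row, q = probability of Red's first column.
    x, y = (p, 1 - p), (q, 1 - q)
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

A = [[2, 0],
     [1, 3]]
grid = [k / 200 for k in range(201)]
maximin = max(min(phi(A, p, q) for q in grid) for p in grid)
minimax = min(max(phi(A, p, q) for p in grid) for q in grid)
print(round(maximin, 6), round(minimax, 6))  # both 1.5
```

The grid contains the optimal mixtures x* = (1/2, 1/2) and y* = (3/4, 1/4) exactly, which is why the two sides match to rounding error.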
SOLVING MATRIX GAMES

From the minimax theorem, it follows that each player has an optimal strategy. Using an optimal strategy, a player can expect to win a fixed amount regardless of the strategy selected by his opponent, and this fixed amount is as large (small) as is strategically possible. A player may win more (lose less) than this fixed amount from his opponent if his opponent does not use an optimal strategy. A pair of optimal strategies, one for each player, is the solution for the game. The value of a game is the average amount v that one player can expect to win from (and the other to pay to) his opponent if both use their optimal strategies.
41
The computations required for the solution
games are so extensive
that
it
any but the simplest
of
would be virtually impossible
The advent
solutions without the use of automatic computers.
matic computers has made
problems
in a
solution of
it
of auto-
possible to obtain an answer to some
This paper will discuss the
reasonable length of time.
games without
to obtain
specific reference to the solution by automatic
computers.
Since the amount of computation required to obtain a solution depends upon the number of strategies, it is important to reduce their number whenever possible. Sometimes it is possible to tell by direct inspection of the matrix that certain strategies will always have probability zero in an optimal strategy. A poor strategy of a player is defined as some pure strategy which appears with probability zero in every optimal mixed strategy of that player. If a player has a poor strategy, that strategy may be eliminated from the set of pure strategies, and the resulting game will have the same solution.
Poor strategies are found by examining the pay-off matrix for dominances. Suppose some row i of the pay-off matrix A = (a_ij) is such that

    a_ij > a_kj    for j = 1, 2, ..., n,

i.e., the elements of row i are larger than the corresponding elements of another row k. Then strategy i of Blue is said to dominate strategy k of Blue strictly, and strategy k is a poor strategy. For Red, however, if the elements of a column r are larger than the corresponding elements of some column s,

    a_ir > a_is    for i = 1, 2, ..., m,

then strategy r strictly dominates strategy s, and strategy r is a poor strategy.
For example, if

    A = [ 1  7  2 ]
        [ 6  2  7 ]
        [ 5  1  6 ]

is the pay-off matrix of some game, then no optimal strategy for Blue should assign a positive probability to the third row. No matter what Red does, Blue can improve his pay-off by choosing the second row rather than the third row. In a similar manner, since every element of the first column of the above matrix is less than the corresponding element in the third column, and since Red wants to minimize the pay-off, the third column may be eliminated, obtaining

    [ 1  7 ]
    [ 6  2 ]

for the simplified game matrix. The solution of the original game may be obtained by solving the simplified game. The optimal mixed strategy of the original game is obtained by assigning probability zero to the poor strategies; the remaining strategies are assigned the same probabilities as in the solution of the simplified game. Hence, the value of the original game is the same as the value of the simplified game.

If some strategy k of Blue is dominated by a convex linear combination of his strategies r and s, i.e.,

    a_kj < c a_rj + (1 - c) a_sj,    0 ≤ c ≤ 1,    for all j = 1, 2, ..., n,

then strategy k is a poor strategy and may be eliminated, simplifying the computations required to solve the game. Similarly, columns may also exhibit convex linear dominance and be eliminated.
For example, consider a game whose pay-off matrix is

    [ 24  0 ]
    [  0  8 ]
    [  4  5 ].

Notice that

    4 < (1/4)(24) + (3/4)(0)    and    5 < (1/4)(0) + (3/4)(8).

Hence, Blue would never be wise to play strategy three, for he could always do better by dividing between the first two strategies any probability that he might consider assigning to the third strategy. Thus the game might be reduced to a simpler game whose matrix is

    [ 24  0 ]
    [  0  8 ].
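A check of this kind is mechanical and easy to automate. The sketch below (exact rational arithmetic from the standard library) tests whether one of Blue's rows is strictly dominated by a convex combination of two others, using the matrix above and the weight c = 1/4:

```python
from fractions import Fraction as F

# Row k is a poor strategy for Blue if, in every column j,
# a_kj < c * a_rj + (1 - c) * a_sj for some fixed 0 <= c <= 1.
def convexly_dominated(A, k, r, s, c):
    return all(A[k][j] < c * A[r][j] + (1 - c) * A[s][j]
               for j in range(len(A[0])))

A = [[24, 0],
     [0, 8],
     [4, 5]]
print(convexly_dominated(A, 2, 0, 1, F(1, 4)))  # True: row three may be dropped
print(convexly_dominated(A, 0, 1, 2, F(1, 2)))  # False: this mix does not dominate row one
```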
The following sections will be devoted to the solution of matrix games by different methods. The first method consists of the algebraic solution of a large system of inequalities and equalities. The second method, linear programming, is the most common and is applicable to automatic computers. An approximation method of solving games by fictitious play concludes the section. For the reader interested in different methods of solving matrix games, Dresher (3), McKinsey (7), and Williams (14) present a matrix solution, Dresher (3) discusses a mapping method, and Luce and Raiffa (6) illustrate the use of differential equations for the solution of matrix games.

To solve a game it suffices to find vectors X and Y whose elements satisfy the following conditions:
    x_1 + x_2 + ... + x_m = 1,    x_i ≥ 0    for i = 1, 2, ..., m,

    y_1 + y_2 + ... + y_n = 1,    y_j ≥ 0    for j = 1, 2, ..., n,        (26)

    a_1j x_1 + a_2j x_2 + ... + a_mj x_m ≥ v    for all j = 1, 2, ..., n,

    a_i1 y_1 + a_i2 y_2 + ... + a_in y_n ≤ v    for all i = 1, 2, ..., m.
Algebraic Solution

The usual methods of elementary algebra do not suffice to solve systems like (26) above, containing inequalities as well as equalities. The minimax theorem guarantees that there is a solution to this system, and the algebraic method enables one to find this solution by separately considering all possible cases that arise when a ≥ sign is replaced by an = or > sign and a ≤ sign is replaced by an = or < sign. The two following examples from McKinsey (7) illustrate how this method is applied.
Example 1: To find the value and optimal strategies for the game whose pay-off matrix is

    [  1  -1  -1 ]
    [ -1  -1   3 ]
    [ -1   2  -1 ],

find x_1, x_2, x_3, y_1, y_2, y_3, and v which satisfy the following conditions:

    x_1 + x_2 + x_3 = 1,    y_1 + y_2 + y_3 = 1,
    0 ≤ x_1 ≤ 1,  0 ≤ x_2 ≤ 1,  0 ≤ x_3 ≤ 1,                              (27)
    0 ≤ y_1 ≤ 1,  0 ≤ y_2 ≤ 1,  0 ≤ y_3 ≤ 1,

    (1) x_1 + (-1) x_2 + (-1) x_3 ≥ v        (1) y_1 + (-1) y_2 + (-1) y_3 ≤ v
    (-1) x_1 + (-1) x_2 + (2) x_3 ≥ v        (-1) y_1 + (-1) y_2 + (3) y_3 ≤ v
    (-1) x_1 + (3) x_2 + (-1) x_3 ≥ v        (-1) y_1 + (2) y_2 + (-1) y_3 ≤ v.

A solution can be found by separately considering the cases that arise when each ≤ sign is replaced by an = sign or a < sign and each ≥ sign is replaced by an = sign or a > sign. To solve the above game, replace the last six inequalities by equalities; the result obtained using elementary algebra is

    x_1 = 6/13,  x_2 = 3/13,  x_3 = 4/13,    y_1 = 6/13,  y_2 = 4/13,  y_3 = 3/13,

and v = -1/13.
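The stated solution can be confirmed by direct substitution. The sketch below uses exact rational arithmetic to check that these strategies equalize every column and every row of the pay-off matrix at v = -1/13:

```python
from fractions import Fraction as F

A = [[1, -1, -1],
     [-1, -1, 3],
     [-1, 2, -1]]
x = [F(6, 13), F(3, 13), F(4, 13)]   # Blue's optimal strategy
y = [F(6, 13), F(4, 13), F(3, 13)]   # Red's optimal strategy

# With all six game inequalities taken as equalities, each column
# against x and each row against y must equal the value v.
cols = [sum(A[i][j] * x[i] for i in range(3)) for j in range(3)]
rows = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
print(cols, rows)  # every entry equals -1/13
```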
Example 2: To find the value and optimal strategies of the game whose pay-off matrix is

    [  3  -2   4 ]
    [ -1   4   2 ]
    [  2   2   6 ].

Again it suffices to find vectors X and Y which satisfy the conditions of (27) and which also satisfy

    (3) x_1 + (-1) x_2 + (2) x_3 ≥ v        (3) y_1 + (-2) y_2 + (4) y_3 ≤ v
    (-2) x_1 + (4) x_2 + (2) x_3 ≥ v        (-1) y_1 + (4) y_2 + (2) y_3 ≤ v
    (4) x_1 + (2) x_2 + (6) x_3 ≥ v         (2) y_1 + (2) y_2 + (6) y_3 ≤ v.

However, considering the first case, where all six inequalities are replaced by equalities, no solution for these equations exists which simultaneously makes x_1, x_2, x_3, y_1, y_2, and y_3 all non-negative. To obtain a solution to the game, replace the ≥ by > or = and the ≤ by < or = in the remaining inequalities and solve the resulting system. Continuing in this way by trial and error, finally the case

    3x_1 - x_2 + 2x_3 = v          3y_1 - 2y_2 + 4y_3 < v
    -2x_1 + 4x_2 + 2x_3 = v        -y_1 + 4y_2 + 2y_3 = v
    4x_1 + 2x_2 + 6x_3 > v         2y_1 + 2y_2 + 6y_3 = v
    x_1 + x_2 + x_3 = 1            y_1 + y_2 + y_3 = 1

is found, which has a solution. Since 4x_1 + 2x_2 + 6x_3 > v, Red will not include y_3 in his optimal strategy; this implies y_3 = 0. Since 3y_1 - 2y_2 + 4y_3 < v, Blue will not include x_1 in his optimal strategy, implying x_1 = 0. Solve the remaining system. The set

    x_1 = 0,  x_2 = 0,  x_3 = 1,    y_1 = 2/5,  y_2 = 3/5,  y_3 = 0,    v = 2

satisfies all equalities and inequalities and is non-negative. Thus, the optimal strategies are X = (0, 0, 1) and Y = (2/5, 3/5, 0), and the value of the game is 2.
For one wishing to read further on this topic, reference (7) contains a more complete description and some additional examples worked out completely.

The algebraic method has the disadvantage that the number of possible systems of equations grows extremely large for a matrix of only medium size. The next method, linear programming, is an adaptation of the algebraic method, but it has the advantage that it moves from an infeasible solution toward a feasible solution in a systematic manner until the solution is determined.
Linear Programming

To show that an arbitrary game may be solved by the methods of linear programming, let a pay-off matrix A = (a_ij) be given. If Blue chooses the mixed strategy X' = (x_1, ..., x_m), then he can be certain of obtaining at least

    min_j Σ_i a_ij x_i = v.

Therefore,

    a_1j x_1 + a_2j x_2 + ... + a_mj x_m ≥ v    for j = 1, 2, ..., n,
    x_1 + x_2 + ... + x_m = 1,                                            (28)
    x_i ≥ 0.

Blue wants to make v as large as possible. This value is not necessarily positive, but a constant large enough to make v positive may be added to all the entries of the pay-off. This increases the value of the game by the same constant but does not change the solution. Therefore, v may be assumed to be positive, and the new variables

    x_i' = x_i / v,    i = 1, 2, ..., m,

may be defined. Dividing the inequalities of (28) by v gives

    Σ_i a_ij x_i' ≥ 1    for j = 1, ..., n,                               (29)

and

    Σ_i x_i' = 1/v.                                                       (30)

The right-hand side of equation (30) must be minimized. Thus, the problem has been reduced to a linear programming problem in the usual form.
Repeating the same argument for Red, the set of inequalities

    Σ_j a_ij y_j' ≤ 1    for i = 1, ..., m

holds, and Σ_j y_j' is to be maximized. These two problems are dual to one another; by solving one of them, the other is solved implicitly. Having found the x_i*' and y_j*', the minimum of Σ x_i*', which equals the maximum of Σ y_j*', is 1/v, where v is the value of the game; and

    x_i* = v x_i*'    and    y_j* = v y_j*'

indicate the best strategies, constituting a solution.
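Blue's side of this reduction is an ordinary linear program. As a self-contained sketch (standard library only, exact rationals; the 2 × 2 matrix is an illustrative game, not one of the report's examples): when Blue has only two strategies the optimum of (29)-(30) lies where two constraint boundaries meet, so exact vertex enumeration suffices:

```python
from fractions import Fraction as F
from itertools import combinations

# Blue's side of the reduction: minimize x1' + x2' subject to
# a_1j*x1' + a_2j*x2' >= 1 for every column j, with x1', x2' >= 0;
# then v = 1/(x1' + x2') and x_i = v * x_i'.  With two variables the
# optimum sits at the intersection of two constraint boundaries, so
# vertex enumeration (Cramer's rule on each pair) is enough here.
def solve_blue_2(A):
    lines = [(F(A[0][j]), F(A[1][j]), F(1)) for j in range(len(A[0]))]
    lines += [(F(1), F(0), F(0)), (F(0), F(1), F(0))]   # the two axes
    best = None
    for (a, b, c), (d, e, f) in combinations(lines, 2):
        det = a * e - b * d
        if det == 0:
            continue                        # parallel boundaries
        x1 = (c * e - b * f) / det          # Cramer's rule
        x2 = (a * f - c * d) / det
        if x1 < 0 or x2 < 0:
            continue
        if all(p * x1 + q * x2 >= r for p, q, r in lines):
            if best is None or x1 + x2 < best[0] + best[1]:
                best = (x1, x2)             # feasible vertex, smaller sum
    x1p, x2p = best
    v = 1 / (x1p + x2p)
    return v, (v * x1p, v * x2p)

# Illustrative game; its entries are non-negative, so v > 0 as required.
A = [[2, 0],
     [1, 3]]
print(solve_blue_2(A))  # value 3/2, optimal X = (1/2, 1/2)
```

For larger games the same program is what the simplex method solves; the vertex search here merely makes the geometry explicit.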
Example: The Colonel Blotto Game (page 19) may be solved by linear programming. Three units may be added to each element of the pay-off matrix without changing the solution. The resulting pay-off matrix becomes

    A = [ 7  3  5  4 ]
        [ 3  7  4  5 ]
        [ 4  2  6  3 ]
        [ 2  4  3  6 ]
        [ 1  1  5  5 ].

The value of the game is increased by three, but the optimal strategies remain the same.

Denote any (pure or mixed) strategy for Blue (Blotto) by the row vector

    X' = [x_1, x_2, x_3, x_4, x_5],

so that

    x_i ≥ 0    for i = 1, 2, ..., 5                                       (31)

and

    x_1 + x_2 + x_3 + x_4 + x_5 = 1.                                      (32)
Blue's expectation against each of Red's four strategies is given respectively by the elements of the column vector

    A'X = [ 7x_1 + 3x_2 + 4x_3 + 2x_4 + 1x_5 ]
          [ 3x_1 + 7x_2 + 2x_3 + 4x_4 + 1x_5 ]
          [ 5x_1 + 4x_2 + 6x_3 + 3x_4 + 5x_5 ]
          [ 4x_1 + 5x_2 + 3x_3 + 6x_4 + 5x_5 ].

Let v denote the smallest element of the vector, or their common value if there is no unique smallest element. Hence,

    7x_1 + 3x_2 + 4x_3 + 2x_4 + 1x_5 ≥ v
    3x_1 + 7x_2 + 2x_3 + 4x_4 + 1x_5 ≥ v                                  (33)
    5x_1 + 4x_2 + 6x_3 + 3x_4 + 5x_5 ≥ v
    4x_1 + 5x_2 + 3x_3 + 6x_4 + 5x_5 ≥ v.

Blue wishes to choose his strategy X so as to maximize v. This can be done by minimizing 1/v. Change notation as follows:
    x_i' = x_i / v    and    m = 1/v.

Since v is positive, the division of (31) and (33) by v gives

    x_i' ≥ 0    for i = 1, 2, ..., 5

and

    7x_1' + 3x_2' + 4x_3' + 2x_4' + 1x_5' ≥ 1
    3x_1' + 7x_2' + 2x_3' + 4x_4' + 1x_5' ≥ 1                             (34)
    5x_1' + 4x_2' + 6x_3' + 3x_4' + 5x_5' ≥ 1
    4x_1' + 5x_2' + 3x_3' + 6x_4' + 5x_5' ≥ 1.

Then Blue wants to determine x_1', ..., x_5' subject to the constraints of (34) so that

    x_1' + x_2' + x_3' + x_4' + x_5' = m

is a minimum. Therefore, Blue's problem reduces to a linear programming problem which can be solved by the simplex method, or by other methods. After finding x_1*', ..., x_5*' and m, Blue may find his optimal strategy by

    x_i* = x_i*' / m    for i = 1, ..., 5.

Blue's optimal strategy for the Colonel Blotto Game is

    X*' = [4/9, 4/9, 0, 0, 1/9].                                          (35)
Red's problem is the dual of Blue's problem. Denoting Red's strategy by

    Y' = [y_1, y_2, y_3, y_4],

so that

    y_j ≥ 0    for j = 1, 2, 3, 4    and    y_1 + y_2 + y_3 + y_4 = 1,

Red's expectation is the negative of the column vector

    AY = [ 7y_1 + 3y_2 + 5y_3 + 4y_4 ]
         [ 3y_1 + 7y_2 + 4y_3 + 5y_4 ]
         [ 4y_1 + 2y_2 + 6y_3 + 3y_4 ]
         [ 2y_1 + 4y_2 + 3y_3 + 6y_4 ]
         [ 1y_1 + 1y_2 + 5y_3 + 5y_4 ].

Let v denote the largest element of the vector, or their common value if there is no unique largest element. Hence,

    7y_1 + 3y_2 + 5y_3 + 4y_4 ≤ v
    3y_1 + 7y_2 + 4y_3 + 5y_4 ≤ v
    4y_1 + 2y_2 + 6y_3 + 3y_4 ≤ v
    2y_1 + 4y_2 + 3y_3 + 6y_4 ≤ v
    1y_1 + 1y_2 + 5y_3 + 5y_4 ≤ v.

Red wishes to choose his strategy Y so as to minimize v. This can be done by maximizing 1/v. Change the notation again:

    y_j' = y_j / v    and    M = 1/v.

Then Red wishes to determine y_1', y_2', y_3', y_4' ≥ 0 so that

    7y_1' + 3y_2' + 5y_3' + 4y_4' ≤ 1
    3y_1' + 7y_2' + 4y_3' + 5y_4' ≤ 1
    4y_1' + 2y_2' + 6y_3' + 3y_4' ≤ 1
    2y_1' + 4y_2' + 3y_3' + 6y_4' ≤ 1
    1y_1' + 1y_2' + 5y_3' + 5y_4' ≤ 1

and so that

    y_1' + y_2' + y_3' + y_4' = M

is a maximum. Red's problem has been reduced to a linear programming problem, and he can find his optimal strategy by

    y_j* = y_j*' / M    for j = 1, 2, 3, 4.

Red's optimal strategy is

    Y*' = [1/18, 1/18, 4/9, 4/9].                                         (36)

When these optimal mixed strategies are used on the original pay-off matrix, the value of the game is 14/9 to Blotto.
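The Blotto solution can be verified directly with exact rational arithmetic: Blue's mixture should make every column of the shifted matrix worth at least 41/9, and Red's should hold every row to at most 41/9, giving 41/9 - 3 = 14/9 on the original matrix:

```python
from fractions import Fraction as F

# Shifted Colonel Blotto pay-off matrix (three added to every entry).
A = [[7, 3, 5, 4],
     [3, 7, 4, 5],
     [4, 2, 6, 3],
     [2, 4, 3, 6],
     [1, 1, 5, 5]]
x = [F(4, 9), F(4, 9), F(0), F(0), F(1, 9)]     # Blue's optimal mixture
y = [F(1, 18), F(1, 18), F(4, 9), F(4, 9)]      # Red's optimal mixture

cols = [sum(A[i][j] * x[i] for i in range(5)) for j in range(4)]
rows = [sum(A[i][j] * y[j] for j in range(4)) for i in range(5)]
print(min(cols), max(rows))  # 41/9 and 41/9, i.e. 14/9 + 3
```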
Display (37) summarizes the solution on the original Colonel Blotto pay-off matrix:

    X*' = [4/9, 4/9, 0, 0, 1/9],    Y*' = [1/18, 1/18, 4/9, 4/9],

        [  4   0   2   1 ]
        [  0   4   1   2 ]
        [  1  -1   3   0 ]        v = 14/9.                               (37)
        [ -1   1   0   3 ]
        [ -2  -2   2   2 ]

Bennion (2), Glicksman (5), Tucker (11), and Vajda (12 and 13) have a more complete development of game theory as a linear problem. Luce and Raiffa (6) develop an alternate proof of the minimax theorem as a consequence of the duality theorem of linear programming.
Iterative Solution of a Game by Fictitious Play

The iterative method can be characterized by the fact that it rests on the traditional statistician's philosophy of basing future decisions on relevant past history. One might expect a statistician, perhaps ignorant of game theory, to keep track of his opponent's past plays and to choose at each play the optimal pure strategy against the mixture represented by all of the opponent's past plays.

This method can best be illustrated by an example. Consider the game defined by the pay-off matrix

                          Red Strategies
                          R_1   R_2   R_3
                    B_1  [  1     2     3 ]
    Blue Strategies B_2  [  4     0     1 ]
                    B_3  [  2     3     0 ],

where B_1, B_2, and B_3 represent Blue's strategies and R_1, R_2, and R_3 represent Red's strategies.
Assume that Blue begins the series of plays by selecting strategy B_1 and Red chooses the pure strategy R_1. At step two, Blue should choose the pure strategy that is best against Red's mixed strategy to this point (i.e., against R_1). Red likewise chooses the pure strategy that is best against Blue's mixed strategy to this point (i.e., against B_1). The process is then repeated at each step: the player chooses the optimal pure strategy against his opponent's mixed strategy to that point. If this instruction is ambiguous because of non-uniqueness, the player may choose any one of the possible pure strategies which satisfy the requirement.
The column headings for the following table are defined as follows. The number of the play is designated by N, and i(N) represents the pure strategy chosen by Blue on the Nth play. Likewise, j(N) represents the pure strategy chosen by Red on the Nth play. B_i is equal to the total receipts of Blue after N plays if Blue uses his B_i strategy constantly, and similarly R_j is the receipts of Red after N plays if Red uses his R_j strategy constantly. Finally, v(N) is the least that Blue can expect to receive on the average after N plays, while v̄(N) is the most that Blue can expect to receive on the average after N plays.
    v(N) = -(1/N) max_j R_j    and    v̄(N) = (1/N) max_i B_i.

[The table in the original report records twenty plays of fictitious play for this game: for each play N it lists i(N), j(N), the cumulative receipts B_1, B_2, B_3 and R_1, R_2, R_3, and the bounds v(N) and v̄(N). On the twentieth play the bounds are v(20) = 1.650 and v̄(20) = 2.200.]

It can be shown that v is the greatest lower bound of v̄(N) and the least upper bound of v(N). This fact insures that, by carrying the approximation far enough, the value v can be found to any degree of accuracy:

    v = lim v̄(N) = lim v(N).
By considering the number of times each pure strategy is played in N steps of the above approximation method, an approximation to an optimal strategy may be found. Thus, in the first eight rows of the table, Blue plays strategy B_1 once, strategy B_2 three times, and strategy B_3 four times; hence, an approximation to an optimal strategy for Blue is

    X = (1/8, 3/8, 4/8).

After N steps, the approximations to the optimal strategies will be

    X(N) = (1/N) Σ_{K=1}^{N} i(K),    Y(N) = (1/N) Σ_{K=1}^{N} j(K),      (38)

where i(K) and j(K) are identified with the corresponding unit vectors. For the preceding game the approximate optimal strategies are

    X(20) = (13/20, 3/20, 4/20),    Y(20) = (6/20, 4/20, 10/20).

It can be verified that the exact optimal strategies are

    X* = (11/20, 4/20, 5/20),    Y* = (8/20, 7/20, 5/20).
If lim X(N) and lim Y(N) exist, then these limits are a solution of the game. However, the strategies X(N) and Y(N) may not converge. If the strategies fail to converge, the cause is generally the oscillating character of the X(N) and Y(N) around a solution. It can be shown, however, that, in any case, every convergent subsequence of (38) converges to an optimal strategy.
Historically, the method of fictitious play was proposed as a means for actually computing the value of a game. As a computational procedure, however, the method is impractical, since the rate of convergence is extremely slow. A number of variants of the method have been proposed which have better convergence properties; nevertheless, the method remains only of general theoretical importance. The convergence of this method is proved in Gale (4).
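The procedure itself is easy to program. Below is a minimal sketch of a fictitious-play loop (ties are broken toward the lowest index, an arbitrary choice that need not reproduce the table's plays), run for 2,000 plays on the 3 × 3 example game above, whose value works out to 37/20 = 1.85:

```python
# A minimal sketch of fictitious play.  Each player best-responds to the
# opponent's whole empirical history; ties are broken toward the lowest
# index.  B[r] holds Blue's total receipts had he always played row r;
# R[c] holds Blue's total receipts had Red always played column c.
def fictitious_play(A, n_plays):
    m, n = len(A), len(A[0])
    blue_counts, red_counts = [0] * m, [0] * n
    B, R = [0] * m, [0] * n
    i = j = 0                                   # both open with strategy one
    for _ in range(n_plays):
        blue_counts[i] += 1
        red_counts[j] += 1
        for r in range(m):
            B[r] += A[r][j]
        for c in range(n):
            R[c] += A[i][c]
        i = max(range(m), key=lambda r: B[r])   # Blue's best reply so far
        j = min(range(n), key=lambda c: R[c])   # Red's best reply so far
    X = [k / n_plays for k in blue_counts]      # empirical mixture X(N)
    Y = [k / n_plays for k in red_counts]       # empirical mixture Y(N)
    return X, Y, min(R) / n_plays, max(B) / n_plays

A = [[1, 2, 3],
     [4, 0, 1],
     [2, 3, 0]]
X, Y, v_low, v_high = fictitious_play(A, 2000)
print(v_low, v_high)   # the value 37/20 = 1.85 lies between these bounds
```

Whatever the tie-breaking rule, the returned bounds always bracket the true value; only the speed at which they close depends on the game.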
TOPICS FOR FURTHER STUDY

The following sections contain a brief description of topics which the reader may wish to investigate further. References where more details may be found are listed.

N-Person Games

The previous discussion has considered only two-person zero-sum games, but the problem may be extended to any number of players. A finite n-person zero-sum game may be thought of as a game in which each player, P_i, makes just one choice of a strategy, x_i, from a finite set C_i of possible strategies, without being informed about the choice of any of the previous players.
After each of the n players has chosen a strategy, the pay-off to each player P_i is

    φ_i(x_1, x_2, ..., x_n).

Since the game is zero-sum, the pay-off functions φ_1, φ_2, ..., φ_n satisfy

    Σ_{i=1}^{n} φ_i(x_1, ..., x_n) = 0.

The theory of n-person games is largely concerned with the questions of what combinations of coalitions will be formed and what payments the players can be expected to make to each other as inducements to join the various coalitions. More than half of von Neumann's book (9) is devoted to this topic.
Non-Zero-Sum Games

Until now all discussion has assumed that the gain of one player is the loss of another (zero-sum), but this is not true in all competitive situations. The bargaining of a labor union and an industrial company over a contract may be considered as a two-person game, but it is not zero-sum, for the agreement over a contract is advantageous to both, and the shut-down of a plant is disadvantageous to both, though not necessarily to the same extent. Some non-zero-sum games may be simplified by adding an additional player who will act as a "banker" to keep the pay-off zero-sum. Then the game may possibly be solved by n-person game theory. Nash (8) has dealt extensively with this area of game theory.

Once one leaves two-person zero-sum games, however, there are serious theoretical difficulties. A major obstacle in developing n-person and non-zero-sum game theory is the development of a satisfactory theory of coalition formation and of the assumptions that are made about communication and collusion among players.
Infinite Games

Not every game situation can be described in terms of only a finite number of strategies. A very simple example is the problem of a manufacturer who is faced with the problem of how much of his product to put into a package to compete favorably with other manufacturers and thus to sell many packages, but who does not want to put so much into the package as not to make a profit. The solution of an infinite game is not straightforward; in fact, there are infinite games where no solution exists. Since infinite games cannot be treated with the generality of finite games, the solution of infinite games will not be discussed in this paper. Dresher (3) and McKinsey (7) discuss infinite games in some detail.
ACKNOWLEDGMENT

The writer wishes to express his sincere appreciation to Dr. S. Thomas Parker for his patient guidance and supervision given during the preparation of this report.
BIBLIOGRAPHY

1. Ball, W. W. Rouse. Mathematical Recreations and Essays. Revised by H. S. M. Coxeter. London: 1947.

2. Bennion, Edward G. Elementary Mathematics of Linear Programming and Game Theory. Michigan: Bureau of Business and Economic Research, College of Business and Public Service, Michigan State University, 1960.

3. Dresher, Melvin. Games of Strategy: Theory and Applications. New Jersey: Prentice-Hall, 1961.

4. Gale, David. The Theory of Linear Economic Models. New York: McGraw-Hill, 1960.

5. Glicksman, A. M. An Introduction to Linear Programming and the Theory of Games. New York: John Wiley, 1963.

6. Luce, R. Duncan, and Howard Raiffa. Games and Decisions: Introduction and Critical Survey. New York: John Wiley, 1957.

7. McKinsey, John C. Introduction to the Theory of Games. New York: McGraw-Hill, 1952.

8. Nash, John. "Non-cooperative Games," Annals of Mathematics, 54: 286-295, 1951.

9. von Neumann, John, and Oskar Morgenstern. Theory of Games and Economic Behavior. Princeton: Princeton University Press, 1944.

10. Schelling, Thomas C. The Strategy of Conflict. Massachusetts: Harvard University Press, 1960.

11. Tucker, Albert W. Game Theory and Programming. Oklahoma: Oklahoma State University, 1955.

12. Vajda, S. An Introduction to Linear Programming and the Theory of Games. New York: John Wiley, 1960.

13. Vajda, S. Theory of Games and Linear Programming. New York: John Wiley, 1956.

14. Williams, J. D. The Compleat Strategyst. New York: McGraw-Hill, 1954.
TWO-PERSON ZERO-SUM GAME THEORY

by

WAYNE O'NEIL EVANS

B. A., Adams State College of Colorado, 1962

AN ABSTRACT OF A MASTER'S REPORT

submitted in partial fulfillment of the
requirements for the degree

MASTER OF SCIENCE

Department of Mathematics

KANSAS STATE UNIVERSITY
Manhattan, Kansas

1964
A two-person zero-sum game is a conflict of interest which involves two players, hence the name two-person. One player wins what the other loses; thus the sum of their gains is zero. A game is a collection of rules which determines what the players may do. Two-person zero-sum games are in one sense a special case of linear programming problems. Solving a game amounts to solving a set of equations for non-negative variables in such a way as to maximize (minimize) some function.

The mathematical formulation of a game is illustrated by the reduction of the game of NIM from extensive to normal form. The concept of pure strategies is introduced and games with saddle-points are investigated. Games of perfect information are shown to have saddle-points. Games without saddle-points are discussed and mixed strategies are introduced. The Colonel Blotto game illustrates an application of mixed strategies. The geometrical properties of mixed strategies are illustrated by their graphical representation. The generalization of two-person zero-sum games precedes the proof of the minimax theorem. The solution of matrix games by algebraic methods, linear programming, and fictitious play is illustrated. The report concludes with a section of topics for further investigation.