Computing the
Mixed Strategy Nash Equilibria
for Zero-Sum Games
Tom Brook
BSc (Hons) in Computer Science
University of Bath
2007
Computing the Mixed Strategy Nash Equilibria for Zero-sum Games
submitted by Thomas Brook
COPYRIGHT
Attention is drawn to the fact that copyright of this dissertation rests with its author. The
Intellectual Property Rights of the products produced as part of the project belong to the
University of Bath (see http://www.bath.ac.uk/ordinances/#intelprop). This copy of the
dissertation has been supplied on condition that anyone who consults it is understood to
recognise that its copyright rests with its author and that no quotation from the dissertation
and no information derived from it may be published without the prior written consent of the
author.
Declaration
This dissertation is submitted to the University of Bath in accordance with the requirements
of the degree of Bachelor of Science in the Department of Computer Science. No portion of
the work in this dissertation has been submitted in support of an application for any other
degree or qualification of this or any other university or institution of learning. Except where
specifically acknowledged, it is the work of the author.
This dissertation may be made available for consultation within the University Library and
may be photocopied or lent to other libraries for the purposes of consultation.
Abstract
Game theory is the study of situations with one or more players deciding on what action will
reap the greatest reward for themselves whilst taking into account their knowledge and
expectations of the other players' behaviour.
Game theory can be applied to many academic areas, most commonly Economics and
Sociology, but more recently Politics and Evolutionary Theory. It is also
becoming popular among Computer Scientists due to its use in AI. Despite this wide range of
uses, it was not until Von Neumann and Morgenstern published their book 'Theory of Games and
Economic Behavior' that its potential began to be realised.
The aim of this project is to create a system that solves two-player, zero-sum games. It
concentrates on two-player games because, as J.D. Williams says, 'one player games are
uninteresting' and games with more than two players can be affected by decisions separate
from the game data, such as co-operation.
A game where the sum of the payoffs is zero is known as a zero-sum game. Simply put, one
person wins what the other loses. Games exist that are not zero-sum and solutions for these
games will be looked at on completion of the main aim.
Contents
INTRODUCTION TO GAME THEORY
LITERATURE REVIEW
2.1 Prisoner's Dilemma
2.1.1 Players
2.1.2 Set of Actions
2.1.3 Preference Relation
2.1.4 Payoff Function
2.2 OPEC
2.3 Equilibrium
2.3.1 Nash Equilibrium
2.3.2 Mixed Strategy Nash Equilibrium
2.4 Computing Mixed Strategy Equilibria
2.4.1 Niche Market
2.4.2 Bach or Stravinsky
2.4.3 Computing Mixed Strategy Equilibria for 2 x 2 games
2.5 Game Value
2.6 Dominance
2.7 Other Methods
2.7.1 Graphical
2.7.2 Trial and Error
2.7.3 Gaussian Elimination
2.8 Simplex Method
2.9 Other Implementations
REQUIREMENTS (18.3.2007)
3.1 Functional Requirements
3.2 Non-Functional Requirements
DESIGN & IMPLEMENTATION
4.1 Non-computation Functionality
4.2 Pure Nash Equilibrium
4.3 Mixed Strategy for 2 x 2 Games
4.4 Mixed Strategy for m x n Games
4.4.1 Finding the Pivot
4.4.2 Creating the New Schema
4.5 Dynamic Memory Allocation
TESTING
5.1 Statistical Testing
5.2 Defect Testing
SUMMARY
6.1 Improvements
6.2 Conclusion
6.3 Future Work
REFERENCES
List of Figures
Figure 1.1 Extensive Form for a 3x2 Game
Figure 1.2 Matrix and Extensive Form for a 2x3 Game
Figure 1.3 Difference in Representation of Zero-sum Games
Figure 2.1 Prisoner's Dilemma
Figure 2.2 The Campers
Figure 2.3 Game with no Nash Equilibrium
Figure 2.4 Matching Pennies
Figure 2.5 Niche Market
Figure 2.6 Bach or Stravinsky
Figure 2.7 Computing Mixed Strategies for 2x2 Games (a)
Figure 2.8 Computing Mixed Strategies for 2x2 Games (b)
Figure 2.9 Finding Odds for 2x2 Games
Figure 2.10 Game with no Nash Equilibrium
Figure 2.11 Game with Dominating Strategies
Figure 2.12 Reduced Game
Figure 2.13 Graphical Example
Figure 2.14 Simplex Worked Example
Figure 2.15 After Addition of Constant
Figure 2.16 First Schema
Figure 2.17 Pivot Criterion
Figure 2.18 New Pivot Row and Column
Figure 2.19 Second Schema
Figure 2.20 Final Schema
Figure 4.1 Rock Paper Scissors
Figure 4.2 RPS First Schema
Figure 4.3 4x2 Game First Schema
Figure 4.4 Pointer to Pointer to …
Figure 5.1 Nash Equilibrium Test from Command Line
Figure 5.2 Nash Equilibrium Test Output File
Figure 5.3 2x2 Game Solving from Command Line
Figure 5.4 2x2 Game Solving Output File
Figure 5.5 Rock Paper Scissors Test from Command Line
Figure 5.6 m x n Game Test from Command Line
Figure 5.7 Failure to Input a Player Name
Figure 6.1 Game with Domination
Figure 6.2 Reduced Game
Acknowledgements
Thanks go to my supervisor Dr Marina De Vos for her patience and answers to all my
questions, professors Ffitch and Vorobjov for their input, my friends Jon and John for
pointing me in the right direction and my Girlfriend Adama without whom I wouldn't have
finished.
Chapter 1
Introduction to Game Theory
Game theory is the study of situations with one or more players deciding on what action will
reap the greatest reward for themselves whilst taking into account their knowledge and
expectations of the other players' behaviour. It is most frequently used in economics but the
theory can also be applied to other areas of academia including Political Science, Sociology,
Psychology, Biology and more recently Computer Science. Players are versatile and can
represent a number of different things including people, animals, governments and most
famously suspected criminals.
A strategic game consists of a set of players and for each player a set of actions and a
preference relation over the set of actions.
The Prisoner's Dilemma is the best-known strategic game. For most people learning Game
Theory it will be the first example they come across, and it is used here to run
through the basics of Game Theory. Before that, a few other parts of Game Theory
should be explained.
The number of players in a game is important and can be split, for the sake of computing
strategies, into three clear groups. J.D. Williams says 'There are three values, for the number
of persons, which have special significance: one, two, and more-than-two'. The special
significance spoken of is the differences and difficulties involved in calculating optimal
strategies for games with different numbers of players. The optimal strategy for a game with
only one player is simply a matter of choosing the action that returns the greatest reward for
that player. Finding optimal strategies for two player games is, for the players, a case of
working out the best outcome given the other player's strategy. However, finding the optimal
strategies in games with more than two players is not as simple, as there is the possibility of
coalitions, among other things, which effectively alter the identities of the players involved.
Games can be represented either in extensive or matrix form. The extensive form is a tree
with nodes representing the state of the game and each edge representing a possible action
from the state. This form is often used when there is more than one action to be taken and
when players act one after another. The matrix form, or strategic or normal form, is a matrix
containing each player's payoffs. The standard for the matrix is to have Player 1's actions as
the rows and Player 2's actions as the columns. It is also standard to have Player 1's payoffs
before Player 2's.
Figure 1.1, taken from Garg's online text, is a 3x2 game in extensive form.
Figure 1.1 Extensive Form for a 3x2 Game
As the top half of the tree, where there is more available space, represents Player 1's actions,
the number of actions available to him does not have as much of an effect on the size and
clarity of the game as Player 2's actions. Figure 1.2, from Ferguson's online text, shows a
zero-sum game in matrix and extensive form and highlights the difference when Player 2 has
more actions.
Figure 1.2 Matrix and Extensive Form for a 2x3 Game
Strictly speaking, the extensive form does not allow for players to move simultaneously but if
Player 2 has no knowledge of Player 1's move when taking his, there is nothing more than a
trivial difference between it and simultaneous moves and therefore the extensive form can be
used to represent such games. However, this example shows a clear advantage of the matrix
form: its compactness. For a relatively small game the extensive form is already becoming
large and fairly cluttered. It is clear that a game even slightly larger than those in figures 1.1
and 1.2, say with more than 7 actions in total, would have a very large tree.
The payoffs relate to the type of game being played. Zero-sum games are those in which the
payoffs for each player sum to zero, hence the name. Similarly games in which the payoffs do
not sum to zero are known as non zero-sum games. The added factors of non zero-sum games
make them more difficult to work with, and they will therefore only be tackled in this project if
success is achieved with zero-sum games. Figure 1.3, taken from Wikipedia, shows the differences in
representation of zero-sum and non zero-sum games.
Figure 1.3 Difference in Representation of Zero-sum Games
The first game is the standard representation of a game in matrix form. The second game is
the 'shortened' representation that can be used for zero-sum games. Despite the similarities in
both players' actions these two games are different. The action profile (Stag, Stag) returns
the payoff 3 to both players in the first game, whereas in the second game (the zero-sum
game) the same action profile returns a payoff of 3 for Player 1 but a payoff of –3 for Player
2.
Chapter 2
Literature Review
2.1
Prisoner's Dilemma
Two suspects for a major crime are to be questioned. There is evidence to convict them both
for a minor offence but not enough to convict either suspect for the major offence unless
one gives information against the other (finks). If neither finks they will be convicted for the
minor offence and will each spend one year in prison. If one finks whilst the other remains quiet,
the latter will be convicted for the major crime and will go to prison for 4 years while the
former will go free. If they both fink they will each spend 3 years in prison. This gives
enough information to model the situation, and each aspect of a strategic game is explained
using this example. One assumption is made about the players' preferences on prison time:
they each want to spend as little time in prison as possible.
2.1.1
Players
The participants in the game, in this case the two suspects.
2.1.2
Set of Actions
In a strategic game each player has a set of actions. It is assumed that players know the set of
possible actions available to them. For this example each player has the set of actions {Quiet,
Fink}. The result of both players acting is an action profile, e.g. (Quiet, Quiet) where Player 1
and Player 2 both choose to keep quiet.
2.1.3
Preference Relation
A preference relation is an ordinal ranking of the various action profiles. It shows which
profile is preferred but not by how much. Another assumption made is that each player knows
their preferences over the set of action profiles. A player can be indifferent between two
actions and all preference relations are transitive, such that a player preferring a to b and b to
c must prefer a to c. Instead of 'Player i prefers a to b' it is possible to write $a \succ_i b$, and $a \sim_i b$
in place of 'Player i is indifferent between a and b'. In the Prisoner's Dilemma each player has
the preference ordering:
(Fink, Quiet) $\succ$ (Quiet, Quiet) $\succ$ (Fink, Fink) $\succ$ (Quiet, Fink), where the first action is theirs.
2.1.4
Payoff Function
It is often convenient to represent a preference relation with a payoff function. This is a
function with values corresponding to the rankings in the preference relation of all actions.
For example, a left shoe is not much use without a right; this can be represented using the
function:
U(x, y) = (x + y) – (|x – y|) where U represents the function and stands for utility (pleasure
gained), x represents left shoes and y represents right shoes. The idea of the function is to take
the difference in left and right shoes from the total number of shoes, therefore leaving the
number of shoes that can be paired. Halving the result gives the number of pairs but as the
preferences are ordinal this is unimportant. It is not necessary to specify the function; one can
simply assign values to each action profile. This is what it would be for the Prisoner's
Dilemma:
U(Fink, Quiet) = 3, U(Quiet, Quiet) = 2, U(Fink, Fink) = 1, U(Quiet, Fink) = 0
For simplicity these can be illustrated in a table as the matrix form, shown in figure 2.1.
                       Suspect 2
                       Quiet     Fink
Suspect 1    Quiet     2, 2      0, 3
             Fink      3, 0      1, 1
Figure 2.1 Prisoner's Dilemma
The rows correspond to the actions of Player 1 and the columns correspond to those of Player
2. The numbers in the table are the players' payoffs, with Player 1's payoff first. Zero-sum
games can be represented slightly differently due to the link between the payoffs. For zero-sum
games it is only necessary to have the first player's payoffs in the table because the second
player's payoffs are found simply through negation.
The interesting things about the Prisoner's Dilemma are the effects of co-operation and the
numerous links with the real world. It is clear that if both suspects choose to co-operate by
staying quiet the payoff for both is greater than if they both decide to fink on each other.
Co-operation seems to be the natural choice but it is not what happens. Both players have an
incentive to fink because, should the other player stay quiet, the payoff will be greater than if
they both stay quiet. Equally, should one player decide to fink, the other player will get a greater
payoff if he too finks. From this it becomes clear that, despite the greater overall payoff of
co-operation, the incentive to cheat is too great for any outcome other than both players finking
to happen. This outcome (Fink, Fink) is known as the Nash Equilibrium. The Prisoner's Dilemma
can be used to model the work levels of two teams on a joint project and farmers' decisions for
grazing (the tragedy of the commons); it even models the nuclear arms race of the 1950s. One
conflict the Prisoner's Dilemma models particularly well is the OPEC (Organization of the
Petroleum Exporting Countries) oil price situation.
2.2
OPEC
J.D. Williams says
…games contain many of the ingredients common to all conflicts…
At the time OPEC had eleven member countries with the goal of holding the price per barrel
of crude oil within a reasonable limit. This can be achieved by controlling the amounts of oil
exported. If supply increases above demand, flooding the market, the price of oil will fall;
similarly, if supply decreases and drops below demand, squeezing the market, the price will
increase. There are many factors that make this situation complicated, such as the difference in
member interests. Members with relatively small oil reserves or those with large populations
and few other resources push for higher prices whereas those with large reserves and smaller
populations are happy with lower prices for other economic and technology based reasons.
However, modelling is a simplification of the situation and in this case these differences in
interests are not taken into account; it is simply assumed that the members want to increase profit.
The best way for them to increase profit, as a group, is to keep supply steady and to each
produce at the set levels. However, individuals can increase profits by increasing their
exports, whilst the others do not. It is clear that keeping export levels steady is the equivalent
to keeping quiet in the Prisoner's Dilemma and that raising exports is the equivalent to
finking. Obviously cheating on a fellow member is frowned upon but it has happened in the
past, as there are no legal rules that are enforceable.
For a number of years Saudi Arabia, with one of the largest reserves of oil, made cuts in its
exports to compensate for the cheating of other nations and keep the equilibrium, but
eventually it became too much and it too raised its exports. This caused the market to
flood and the price to fall rapidly. This is exactly how the Prisoner's Dilemma works and
shows that despite being a very simple model, it does a good job of modelling this situation.
In the Prisoner's Dilemma the players both agree on the best outcome but have an incentive to
deviate from it. Not all models are like this. In some models both players agree that taking
the same action would be best but don't agree on which action to take. In zero-sum games the
players' interests are diametrically opposed, for example a game where one player wants to
take the same action as the other whilst the other wishes to take the opposite action. This
could be used to model an established firm attempting to keep its market share from a new
firm.
2.3
Equilibrium
Equilibrium is defined as a state of balance due to an equal action of opposing forces. In
Game Theory the forces are the need to increase one's payoff and the other players' actions. A
game is in equilibrium at a point that both players have been drawn to by these forces.
2.3.1
Nash Equilibrium
The Nash Equilibrium is named after John Nash, an American mathematician born in 1928. He
completed his PhD at the age of 22 and his thesis contained his work on the equilibrium. He
went on to work at MIT, later sharing the Nobel Prize in economics.
The action profile (Fink, Fink) is a Nash Equilibrium, or Pure Nash Equilibrium, for the
Prisoner's Dilemma. An action profile, a, is a Nash Equilibrium if for every player, there is no
action that would give a greater payoff, given the other players' actions. This allows for a
player's indifference between two or more actions as long as the equilibrium action is at least
as good as all others.
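The definition above can be checked mechanically. The following is a minimal sketch, not the project's implementation: the function name pure_nash_equilibria and the list-of-pairs layout are assumptions made for illustration, with Player 1 choosing the row and Player 2 the column, and payoffs taken from figure 2.1.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """Return every (row, col) profile from which neither player gains by deviating alone.

    payoffs[r][c] is a pair (u1, u2); Player 1 picks the row, Player 2 the column.
    (Hypothetical helper for illustration only.)
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        u1, u2 = payoffs[r][c]
        # Player 1 compares this row with every alternative row in the same column.
        best_for_1 = all(payoffs[alt][c][0] <= u1 for alt in range(n_rows))
        # Player 2 compares this column with every alternative column in the same row.
        best_for_2 = all(payoffs[r][alt][1] <= u2 for alt in range(n_cols))
        if best_for_1 and best_for_2:
            equilibria.append((r, c))
    return equilibria

ACTIONS = ["Quiet", "Fink"]
PRISONERS_DILEMMA = [
    [(2, 2), (0, 3)],  # Player 1 plays Quiet
    [(3, 0), (1, 1)],  # Player 1 plays Fink
]
found = [(ACTIONS[r], ACTIONS[c]) for r, c in pure_nash_equilibria(PRISONERS_DILEMMA)]
print(found)  # [('Fink', 'Fink')]
```

The comparisons use <= rather than <, which matches the remark that a player may be indifferent between the equilibrium action and another action.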
Nash proved that all finite games with any number of players must have a Nash equilibrium,
which until then had only been proven for 2 player zero-sum games. The difference was that
Nash included a mixed strategy, which is explained later. Although this pure equilibrium
point was not a new concept, it is now recognised as Nash Equilibrium because of his work
with mixed strategy and the area in general. Before this it was known as a saddle point.
J.D. Williams says
A saddle point (Nash Equilibrium) exists if the larger of the row minima is equal to
the smaller of the column maxima
The method of finding a Nash Equilibrium is known as the minmax or minimax method. This is
shown in figure 2.2 using a matrix game from J.D. Williams' The Compleat Strategyst.
                    Player 2
                    1    2    3    4
Player 1     1      7    2    5    1
             2      2    2    3    4
             3      5    3    4    4
             4      3    2    1    6
Figure 2.2 The Campers
The idea of the minmax method is that Player 1 will select the action whose minimum payoff,
given all of Player 2's possible actions, is highest. Player 2 will act in the same way. It is
called minmax and not minmin because, although Player 2 selects his action on the basis of his
own minimum payoff, the matrix only holds Player 1's payoffs, so on the payoff matrix this
appears as the smallest of the column maxima.
In this example Player 1 will choose action 3P1 because its minimum gain is 3, which is
better than the minimum gains for the other actions. Player 2 will choose action 2P2 because
its maximum loss is 3 (a payoff of -3), which is better than the maximum losses for the other
actions. When the minimum gain for Player 1 is the same as the maximum loss for Player 2 it is
a Nash Equilibrium.
If the value in element [4][2] was 6 instead of 2, the outcome would be very different. Player
1 would still choose 3P1 but Player 2 would choose 3P2 as opposed to 2P2 because now the
maximum loss of 2P2 is 6 whereas the maximum loss for 3P2 is still just 5. Now there is no
Nash Equilibrium.
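The minmax test on the Campers game, including the modified element, can be sketched as follows. This is an illustrative sketch only: the name saddle_point and the list-of-lists layout (rows are Player 1's actions, entries are Player 1's payoffs) are assumptions, not taken from the project code.

```python
def saddle_point(matrix):
    """Return (maximin, minimax, has_saddle) for a matrix of Player 1's payoffs."""
    row_minima = [min(row) for row in matrix]
    column_maxima = [max(col) for col in zip(*matrix)]
    maximin = max(row_minima)      # the largest gain Player 1 can guarantee
    minimax = min(column_maxima)   # the smallest loss Player 2 can guarantee
    return maximin, minimax, maximin == minimax

CAMPERS = [
    [7, 2, 5, 1],
    [2, 2, 3, 4],
    [5, 3, 4, 4],
    [3, 2, 1, 6],
]
print(saddle_point(CAMPERS))   # (3, 3, True)

# Change element [4][2] (row 4, column 2, counting from 1) from 2 to 6.
modified = [row[:] for row in CAMPERS]
modified[3][1] = 6
print(saddle_point(modified))  # (3, 5, False)
```

In the original game both values are 3, so the saddle point exists; after the change the values are 3 and 5, which is why no pure Nash Equilibrium remains.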
As an aside, it should be noted that this game is not what is known as a 'fair' game because
Player 2 can never win unless there are some side rules that can't be represented in the table. In
this case it would be fair to say that Player 2 wins if he loses less than 3. Another way side
rules are sometimes implemented is with the unfairly treated player receiving a set amount at
the end of the game; in this case 3 would be a reasonable value. If he is up having received
this he is the winner. In this project side rules will not be implemented.
Whether a game is fair or not depends on the value of the game. If the value of the game is 0
the game is fair; if it is positive the game favours Player 1 and if negative, the game favours
Player 2. If there is a Nash Equilibrium, its payoff is the value of the game. This will be
explained further.
For a 2 x 2 game, let a be element (1,1), b be (1,2), c be (2,2) and d be (2,1). Assuming there
is no Nash Equilibrium and a >= b, it follows that b < c (otherwise b is the Nash Equilibrium),
c > d (otherwise c is the Nash Equilibrium), d < a (otherwise d is the Nash Equilibrium) and
a > b (otherwise a is the Nash Equilibrium); that is, a > b < c > d < a. By symmetry, if a <= b,
then a < b > c < d > a.
It is not uncommon to find a game with no Pure Nash Equilibrium.
                    Player 2
                    1    2
Player 1     1      3    6
             2      5    4
Figure 2.3 Game with no Nash Equilibrium
The minmax method returns 4 and 5; as these differ, there is no Pure Nash Equilibrium. If Player 2
chose to play 1P2 then he would expect to have to pay 5, as Player 1 would soon pick up on
the best way to maximise his payoff against this action. However, if Player 2 decided to toss a
coin to decide which action to take he would improve his expected situation. Against 1P1 he
would expect to pay 3 half of the time and 6 half of the time, and against 2P1 he would pay 5
half of the time and 4 half of the time, so either way his expected payment would be 4.5, an
improvement on 5. It may seem rash to let a coin make his decision but it clearly improves his
expected outcome. This is known as a Mixed Strategy.
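The coin-toss calculation for the game in figure 2.3 can be reproduced with exact fractions. This is a sketch under stated assumptions: the helper name expected_payoffs_per_row is hypothetical, and the matrix holds Player 1's payoffs (Player 2's payments) with rows as Player 1's actions.

```python
from fractions import Fraction

def expected_payoffs_per_row(matrix, column_mix):
    """Expected payoff to Player 1 for each row when Player 2 mixes over columns."""
    return [sum(p * v for p, v in zip(column_mix, row)) for row in matrix]

GAME = [
    [3, 6],
    [5, 4],
]

# Pure strategy 1P2: Player 1 answers with the row that pays him most in column 1.
pure_worst = max(row[0] for row in GAME)

# Coin toss: Player 2 plays each column with probability 1/2.
coin = [Fraction(1, 2), Fraction(1, 2)]
per_row = expected_payoffs_per_row(GAME, coin)
worst_case = max(per_row)

print(pure_worst)   # 5
print(per_row)      # [Fraction(9, 2), Fraction(9, 2)]
print(worst_case)   # 9/2
```

Both rows give an expected payment of 9/2 = 4.5 against the coin toss, so Player 1 has no reply that pushes Player 2's expected payment back up to 5.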
2.3.2
Mixed Strategy Nash Equilibrium
So far the games discussed have seen each player always choose the same action, the Nash
Equilibrium or pure strategy equilibrium. This is known as a deterministic steady state. The
other state is known as the stochastic steady state, where each player always chooses the same
probability distribution over their actions, the Mixed Strategy Equilibrium. (A pure strategy
equilibrium is the special case of a mixed strategy equilibrium in which each player chooses an
action with probability 1.) This means that players of the game don't know what action the
other players will choose, only the probability with which they will choose each action. This is
illustrated using the game Matching Pennies in figure 2.4.
                      Player 2
                      Heads    Tails
Player 1     Heads    1        -1
             Tails    -1       1
Figure 2.4 Matching Pennies
In this game the players' interests are diametrically opposed. Therefore the payoffs shown are
those of Player 1; Player 2's payoffs are their negatives. This is known as a strictly competitive
or zero-sum game.
Using the minmax method it is clear to see that this game has no pure Nash Equilibrium and,
therefore, has no deterministic steady state. However, it does have a stochastic steady state. It
must be assumed that both players prefer to win more money than less. Should Player 1
decide to choose Heads with probability greater than a half, Player 2 will choose Tails with
probability 1, but then Player 1 would change their strategy to Tails. This would continue for
the duration of the game or until both players realised that they can do no better than choosing
Heads and Tails with probability 1/2 each. This is the Mixed Strategy Equilibrium. The proof of
this can be found in Appendix E.
When the game matrix is mirrored and negated through the diagonal from the first to the last
element, both players have the same strategy. This does not by itself help with calculating the
strategy, but if the values in every row and every column sum to 0, the value of the game is 0
and playing all actions equally is optimal, as is the case with Matching Pennies (figure 2.4) and
Rock Paper Scissors.
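The steady state for Matching Pennies can be checked numerically. The sketch below is illustrative only; the two helper names are assumptions, and the matrix holds Player 1's payoffs from figure 2.4 with rows Heads, Tails and columns Heads, Tails.

```python
from fractions import Fraction

MATCHING_PENNIES = [
    [1, -1],
    [-1, 1],
]

def payoff_against_each_column(matrix, row_mix):
    """Player 1's expected payoff against each pure column when he mixes over rows."""
    n_cols = len(matrix[0])
    return [sum(p * row[j] for p, row in zip(row_mix, matrix)) for j in range(n_cols)]

def payoff_against_each_row(matrix, column_mix):
    """Player 1's expected payoff from each pure row when Player 2 mixes over columns."""
    return [sum(q * v for q, v in zip(column_mix, row)) for row in matrix]

half = [Fraction(1, 2), Fraction(1, 2)]
print(payoff_against_each_column(MATCHING_PENNIES, half))  # [Fraction(0, 1), Fraction(0, 1)]
print(payoff_against_each_row(MATCHING_PENNIES, half))     # [Fraction(0, 1), Fraction(0, 1)]

# If Player 1 favours Heads, Player 2 does best by always playing Tails.
biased = [Fraction(3, 5), Fraction(2, 5)]
print(payoff_against_each_column(MATCHING_PENNIES, biased))  # [Fraction(1, 5), Fraction(-1, 5)]
```

With the even mix neither player can move the expected payoff away from 0, whereas the biased mix lets Player 2 hold Player 1 to -1/5 by choosing Tails, as described in the text.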
2.4
Computing Mixed Strategy Equilibria
One method for computing mixed strategies uses the players' expected utility to find an
optimal strategy. The idea is to select a strategy that wins no matter what the opposition does.
Here are two examples to explain expected utility and this method: an asymmetric game
modelling a niche market and another interesting game known as Bach or Stravinsky.
2.4.1 Niche Market
This game, shown in figure 2.5, models the real-life situation of two firms entering into a niche market. If both enter they will not be able to break even, shown by the negative payoffs. If one firm enters while the other stays out, the firm entering the market will be able to make a profit. From the payoffs we can see that Firm 1 has an advantage over Firm 2, perhaps due to better technology, and therefore has more incentive to enter.

                          Firm 2
                     Enter       Stay Out
Firm 1   Enter       -50, -50    150, 0
         Stay Out    0, 100      0, 0
Figure 2.5 Niche Market
EU1 (Enter) = -50x + 150(1 – x)
EU1 (Stay Out) = 0
EU2 (Enter) = -50y + 100(1 – y)
EU2 (Stay Out) = 0
Where x is the probability that Firm 2 enters the market and y is the probability that Firm 1
enters the market.
If one action's expected utility is greater than the other's, the Firm has a pure strategy. However, if the expected utilities are the same, the Firm will be indifferent between its actions. This is beneficial to the other Firm as it minimises the competition's payoff, indirectly maximising its own. Therefore, Firm 1 will base its strategy on Firm 2's payoffs: the probability, y, that Firm 1 enters the market will be the value that sets EU2 (Enter) = EU2 (Stay Out). Similarly, Firm 2 forms its strategy from Firm 1's payoffs: x will be the value that sets EU1 (Enter) = EU1 (Stay Out).
Finding the values for x and y involves solving two linear equations, one for each variable.
For x: EU1 (Enter) = EU1 (Stay Out)
-50x + 150(1 - x) = 0
-50x + 150 - 150x = 0
200x = 150
x = 3/4
For y: EU2 (Enter) = EU2 (Stay Out)
-50y + 100(1 - y) = 0
-50y + 100 - 100y = 0
150y = 100
y = 2/3
Firm 1 enters the market with probability 2/3 and Firm 2 enters with probability 3/4.
Conversely, Firm 1 stays out of the market with probability 1/3 and Firm 2 stays out with
probability 1/4.
Firm 1's mixed strategy is: 2/3e + 1/3so
Firm 2's mixed strategy is: 3/4e + 1/4so
The mixed strategies are used to compute the payoffs each firm will expect to receive in equilibrium. If Firm 1's strategy is correct, Firm 2's payoff for entering the market will be the same as its payoff for staying out, and vice versa.
Given Firm 1's strategy:
Firm 2's expected payoff for entering the market = 2/3(-50) + 1/3(100) = 0
Firm 2's expected payoff for staying out of the market = 0
Given Firm 2's strategy:
Firm 1's expected payoff for entering the market = 3/4(-50) + 1/4(150) = 0
Firm 1's expected payoff for staying out of the market = 0
This confirms that the mixed strategies for both firms are correct.
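The indifference calculation can be sketched in C, the language chosen for the project. This is a minimal illustration rather than the project's own code; the function name indifference_prob and its parameter names are assumptions introduced here. It solves u11 x q + u12 x (1 – q) = u21 x q + u22 x (1 – q) for q, the probability with which the opponent plays their first action.

```c
#include <assert.h>

/* Hypothetical helper: returns the probability q with which the OPPONENT
 * must play their first action so that this player is indifferent
 * between their own two actions.
 * u11, u12: payoffs for own action 1 against opponent actions 1 and 2.
 * u21, u22: payoffs for own action 2 against opponent actions 1 and 2. */
double indifference_prob(double u11, double u12, double u21, double u22)
{
    double denom = (u11 - u12) - (u21 - u22);
    assert(denom != 0.0);   /* no single indifference point otherwise */
    return (u22 - u12) / denom;
}
```

Called with Firm 1's payoffs (-50, 150, 0, 0) it returns x = 3/4, and with Firm 2's payoffs (-50, 100, 0, 0) it returns y = 2/3. A result outside the range [0,1] would indicate that no mixed equilibrium of this kind exists.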
2.4.2 Bach or Stravinsky
This game, in figure 2.6, models the situation of two people going to see a performance. One
prefers Bach and the other Stravinsky but they would rather go together to either than to their
preferred performance alone. This is not a zero-sum game.
                            Player 2
                       Bach      Stravinsky
Player 1   Bach        2, 1      0, 0
           Stravinsky  0, 0      1, 2
Figure 2.6 Bach or Stravinsky
EU1 (Bach) = 2x
EU1 (Stravinsky) = 1 – x
EU2 (Bach) = y
EU2 (Stravinsky) = 2(1 – y)
Where x is the probability that Player 2 chooses Bach and y is the probability that Player 1
chooses Bach.
For x: EU1 (Bach) = EU1 (Stravinsky)
2x = 1 – x
3x = 1
x = 1/3
For y: EU2 (Bach) = EU2 (Stravinsky)
y = 2(1 – y)
y = 2 – 2y
3y = 2
y = 2/3
Player 1's mixed strategy is: 2/3b + 1/3s
Player 2's mixed strategy is: 1/3b + 2/3s
Given Player 1's strategy:
Player 2's expected payoff from choosing Bach = 2/3(1) + 1/3(0) = 2/3
Player 2's expected payoff from choosing Stravinsky = 2/3(0) + 1/3(2) = 2/3
Given Player 2's strategy:
Player 1's expected payoff from choosing Bach = 1/3(2) + 2/3(0) = 2/3
Player 1's expected payoff from choosing Stravinsky = 1/3(0) + 2/3(1) = 2/3
This game also has two pure strategy equilibria, (Bach, Bach) and (Stravinsky, Stravinsky). Pure strategies are special cases of mixed strategies, but this method will only pick up a pure strategy if the only mixed strategies are pure. For example, Prisoner's Dilemma has just the one pure strategy equilibrium (see Appendix D). In this case the strategy found is of no importance.
2.4.3 Computing Mixed Strategy Equilibria for 2 x 2 games
These two examples show that the best strategy is to make the opposition indifferent as to which action to choose. For 2 x 2 zero-sum games an equation can be derived from this knowledge that finds the mixed strategies without the need to solve the simultaneous equations. Let a and b be the payoffs in the first row of the matrix and d and c those in the second row, and let p be the probability with which Player 1 plays 1P1. The expected payoff when Player 2 plays 1P2 is ap + d(1 – p) and when Player 2 plays 2P2 is bp + c(1 – p). Equating these gives ap + d(1 – p) = bp + c(1 – p), which is rearranged to p = (c – d) / (a – b + c – d).
                  Player 2
                  1     2
Player 1    1     3     6
            2     5     4
Figure 2.7 Computing Mixed Strategies for 2x2 Games (a)
Putting the values into the formula gives p = (4 – 5) / (3 – 6 + 4 – 5) = 1/4. That is the probability with which Player 1 should play 1P1. It follows that he should play 2P1 with probability 3/4. For Player 2 the matrix must be read slightly differently.
                  Player 2
                  1     2
Player 1    1     d     a
            2     c     b
Figure 2.8 Computing Mixed Strategies for 2x2 Games (b)
This gives p = (5 – 3) / (6 – 4 + 5 – 3) = 1/2.
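The same result can be sketched in C. This is an illustration only: the function name solve_2x2 is an assumption, and both probabilities are computed directly from the indifference conditions rather than by re-reading the matrix for Player 2. The sketch assumes the game has no pure Nash Equilibrium, since the formula is not valid otherwise.

```c
#include <assert.h>

/* m[i][j] is the payoff to Player 1 when Player 1 plays row i and
 * Player 2 plays column j. On return:
 *   *p     probability that Player 1 plays the first row
 *   *q     probability that Player 2 plays the first column
 *   *value expected payoff to Player 1 under those strategies */
void solve_2x2(double m[2][2], double *p, double *q, double *value)
{
    double denom = m[0][0] - m[0][1] - m[1][0] + m[1][1];
    assert(denom != 0.0);                 /* zero here means a saddle point exists */
    *p = (m[1][1] - m[1][0]) / denom;     /* makes Player 2 indifferent */
    *q = (m[1][1] - m[0][1]) / denom;     /* makes Player 1 indifferent */
    *value = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) / denom;
}
```

For the matrix with rows (3, 6) and (5, 4) this gives p = 1/4, q = 1/2 and a value of 4.5.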
Another method for calculating the mixed strategies for a 2 x 2 game involves finding the odds of the game. The first step is to look for a pure Nash Equilibrium. If there is one there is no need to continue looking for the odds, as they will be 1 for the row and column containing the equilibrium point; moreover, this method will normally give incorrect strategies if there is a pure Nash Equilibrium. If there is no such equilibrium then the best strategy will be mixed.
The method for finding the odds for Player 2 is to subtract the 2nd row value from the 1st row value in each column. One result will be negative and the other positive, because the game has no pure Nash Equilibrium, but the sign of the values is not important. The odds for each action are the result of the subtraction for the opposite action.
                  Player 2
                  1     2
Player 1    1     3     6
            2     5     4
Figure 2.9 Finding Odds for 2x2 Games
In this case the odds for 1P2 are 6 – 4 = 2 and the odds for 2P2 are 3 – 5 = -2. The odds for Player 2's actions are therefore 2:2 or 1:1, equivalent to p = 1/2.
Player 1's odds are found, similarly, by subtracting the 2nd column value from the 1st column value in each row. In this case the calculation for 1P1 is 5 – 4 = 1 and for 2P1 is 3 – 6 = -3. The odds for Player 1 are 1:3, which is equivalent to p = 1/4.
So for this game the mixed strategies would be for Player 1:
1/4 1P1 + 3/4 2P1
and for Player 2:
1/2 1P2 + 1/2 2P2
With these strategies Player 1's average payoff is 4.5 and Player 2's average payoff is –4.5. Again the sign is not important, simply that they both have the same absolute value. This confirms that the strategies are optimal and also gives the value of the game.
2.5 Game Value
The value of a game is the payoff that can be won with both players playing well. If it is positive then it is the amount Player 1 should, on average, win; if it is negative then it is the amount Player 2 should win; and if it is 0 then the game is fair. Parlour games, such as Rock, Paper, Scissors, are mostly fair.
If a game has a pure Nash Equilibrium, the payoff at that point is the value of the game, but not many games have such an equilibrium. When a game has none a different method must be used to find the value.
In 2 x 2 games with no pure Nash equilibrium point the value is the average payoff resulting from good play against either pure strategy of the opposition.
                  Player 2
                  1     2
Player 1    1     3     6
            2     5     4
Figure 2.10 Game with no Nash Equilibrium
In the above game Player 2 can improve his expected utility from losing at least 5, if he were to choose a pure strategy, to losing, on average, just 4.5 by playing each action half of the time.
Calculating the average payoff is simple but first it is necessary to change the probabilities for each action to odds. For instance, in this case, where Player 2 will play both actions with probability 0.5, the odds are 1:1, i.e. 1P2 will occur once for every occurrence of 2P2. Now to calculate the average payoff we divide the sum of each payoff multiplied by its respective odds by the sum of the odds. In this case for Player 1 playing 1P1 the equation would look like this
(1 x 3 + 1 x 6) / (1 + 1) = 4.5
and for 2P1
(1 x 5 + 1 x 4) / (1 + 1) = 4.5
The same value results when the calculation is made with Player 1's odds against each of Player 2's actions, provided the mixed strategies are correct. If different values are calculated, the mixed strategies are incorrect and should be checked.
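The odds-weighted average is short enough to sketch in C. The function name odds_average is an assumption made for illustration and is not taken from the project code.

```c
#include <assert.h>

/* Average payoff of one pure action against an opponent who mixes
 * according to the given odds: sum(payoff * odds) / sum(odds). */
double odds_average(const double payoffs[], const double odds[], int n)
{
    double weighted = 0.0, total = 0.0;
    int k;
    for (k = 0; k < n; k++) {
        weighted += payoffs[k] * odds[k];
        total += odds[k];
    }
    assert(total != 0.0);   /* odds must not all be zero */
    return weighted / total;
}
```

With Player 2's odds 1:1 the rows (3, 6) and (5, 4) both average 4.5. With Player 1's odds 1:3 the columns (3, 5) and (6, 4) also both average 4.5, which is the check described above.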
Sometimes a game may be altered by either adding a constant to the payoffs or by multiplying them by a positive constant. This does not affect the play of the game but it does affect the value. The value changes in the same way as the payoffs, i.e. if 2 was added to all payoffs the value of the game would be the value of the original game + 2, and if all payoffs were doubled the value would double. Adding a constant alters the fairness of a game whilst multiplying by a constant can represent a change in currency.
2.6 Dominance
In 2 x m games, where m is greater than 2, finding dominated strategies is an important reduction. If no pure Nash Equilibrium is found, reducing the game makes calculating the mixed strategy a lot simpler. A strategy is dominated by another strategy if, for each action taken by the other player, the payoff from the other strategy is at least as good and, for at least one action, strictly better.
                  Player 2
                  1     2
Player 1    1     2     5
            2     4     3
            3     3     6
            4     5     4
            5     4     4
Figure 2.11 Game with Dominating Strategies
In this game there are a number of dominated strategies for Player 1. Comparing 1P1 and 3P1 shows that 3P1 is better against both of Player 2's actions, so 3P1 dominates 1P1. Also 4P1 dominates 2P1 and 5P1, so the game can be reduced to a 2 x 2 game.
                  Player 2
                  1     2
Player 1    1     3     6
            2     5     4
Figure 2.12 Reduced Game
This game can now be solved as any 2 x 2 game would be solved with 0 being the probability
for the removed actions.
It should be noted that not every 2 x m game can be reduced to a 2 x 2 game and many games
will not reduce at all.
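The search for dominated strategies can be sketched in C as below. This is a hedged illustration: the function name mark_dominated_rows and the fixed column count COLS are assumptions, and the sketch only considers the row player (Player 1, who prefers larger payoffs). It treats a row as dominated when another row is never worse and at least once better.

```c
#define COLS 2   /* Player 2's actions (assumed fixed for this sketch) */

/* Sets keep[i] to 0 for every row of a that is dominated by some other
 * row, and to 1 otherwise. Returns the number of rows kept. */
int mark_dominated_rows(double a[][COLS], int rows, int keep[])
{
    int i, j, k, kept = 0;
    for (i = 0; i < rows; i++) {
        keep[i] = 1;
        for (k = 0; k < rows && keep[i]; k++) {
            int never_worse = 1, sometimes_better = 0;
            if (k == i) continue;
            for (j = 0; j < COLS; j++) {
                if (a[k][j] < a[i][j]) never_worse = 0;
                if (a[k][j] > a[i][j]) sometimes_better = 1;
            }
            if (never_worse && sometimes_better) keep[i] = 0;  /* row k dominates row i */
        }
        kept += keep[i];
    }
    return kept;
}
```

Applied to the five rows of figure 2.11, only 3P1 and 4P1 are kept, which gives the reduced game of figure 2.12.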
2.7 Other Methods
Here are some other methods that can be used to solve or help solve matrix games.
2.7.1 Graphical
The graphical solution, described in Williams (1966), can be used to solve 2 x m games like
the one in figure 2.13.
                                Red
               1     2     3     4     5     6     7
Blue     1     -6    -1    1     4     7     4     3
         2     7     -2    6     3     -2    -5    7
Figure 2.13 Graphical Example
The method works by plotting the m strategies, in this case Red's, as lines on a graph with two vertical axes representing the 2 strategies of Blue. From here the strategies for Red in the optimal mix can be found at the highest point of the line segments that bound the graph from below. The lines intersecting this point are the strategies that should be used. This reduces the game to 2 x 2, which can be easily solved using one of the methods described earlier. If Blue (Player 1) has m strategies as opposed to Red, the strategies are found from the intersecting lines at the lowest point of the line segments bounding the graph from above. This is more a method of reducing the game, similar to removing dominated strategies.
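A numerical counterpart of the graphical method can be sketched in C for the case where Blue has 2 strategies. This is an assumption-laden illustration, not Williams's procedure as printed: the names lower_envelope and graphical_2xm are invented here, and the sketch simply evaluates the lower boundary at the two axes and at every crossing of two lines, keeping the highest point found.

```c
/* Height of the lower boundary at p, where p is the probability that
 * Blue plays strategy 1. r1[j] and r2[j] are the payoffs to Blue for
 * strategies 1 and 2 against Red's strategy j. */
static double lower_envelope(const double r1[], const double r2[], int m, double p)
{
    double lowest = r1[0] * p + r2[0] * (1.0 - p);
    int j;
    for (j = 1; j < m; j++) {
        double v = r1[j] * p + r2[j] * (1.0 - p);
        if (v < lowest) lowest = v;
    }
    return lowest;
}

/* Returns the highest point of the lower boundary and stores the
 * probability at which it occurs in *best_p. */
double graphical_2xm(const double r1[], const double r2[], int m, double *best_p)
{
    double best = lower_envelope(r1, r2, m, 0.0);
    double at_one = lower_envelope(r1, r2, m, 1.0);
    int i, j;
    *best_p = 0.0;
    if (at_one > best) { best = at_one; *best_p = 1.0; }
    for (i = 0; i < m; i++) {
        for (j = i + 1; j < m; j++) {
            double slope_diff = (r1[i] - r2[i]) - (r1[j] - r2[j]);
            double p, v;
            if (slope_diff == 0.0) continue;           /* parallel lines never cross */
            p = (r2[j] - r2[i]) / slope_diff;          /* crossing of lines i and j */
            if (p < 0.0 || p > 1.0) continue;          /* outside the graph */
            v = lower_envelope(r1, r2, m, p);
            if (v > best) { best = v; *best_p = p; }
        }
    }
    return best;
}
```

The two Red strategies whose lines cross at the returned point form the reduced 2 x 2 game.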
2.7.2 Trial and Error
Trial and Error, another method described in Williams (1966), should not be recommended
but does work and is very simple at the same time, albeit tedious. It involves testing 2 x 2
games from within the 2 x m game using the 2 x 2 methods described earlier until one is
found that contains the optimal strategy. This is similar to the Graphical solution in that it is
more a method of simplifying the game for other methods to work on.
2.7.3 Gaussian Elimination
This method, described in Dantzig (1963), uses 3 matrices to solve a system of linear equations. The first matrix contains the coefficients, the second contains the right hand side (RHS) and the third contains the variables. The RHS and variable matrices are both column matrices; there is only one number in each row. The system can be written as Ax = b, where A is the coefficient matrix, x is the variable matrix and b is the RHS matrix. Using simple algebra, x = A^-1 b. The values of the variables in the column matrix, x, can be found using the augmented matrix Ab = [A|b], that is the A matrix with an extra column containing the b matrix. There are three distinct parts to this method: partial pivoting, forward elimination and backward elimination.
Forward Elimination systematically eliminates non-zero elements below the diagonal of the
coefficient matrix using three operations, known as elementary row operations, which
preserve the solution set of the system. These are
1. Multiply a row by a non-zero scalar, c
2. Interchange rows
3. Multiply a row by a non-zero scalar, c, and add the result to another row
If a matrix is created from another using only these three operations they are said to be row
equivalent and the solution of the two will be the same.
Using the elementary row operations, the goal is to reduce the augmented matrix Ab to the matrix Ab' = [U|c], where U is upper triangular. That is, all elements of the matrix U below the diagonal are zero. In order to complete the process all elements above the diagonal must also be eliminated, leaving non-zero elements only on the diagonal. This is Backward Elimination, creating the matrix Ab'' = [UL|c']. Then all that is left to do is divide each element of c' by the corresponding diagonal element of UL.
There is one problem that may arise and that is if a diagonal element becomes zero. If this were to happen the final stage of the algorithm would require division by zero. The method used to stop this is known as pivoting. Partial pivoting is preferred here to the more stable complete pivoting because it is better documented and less expensive to implement. The difference between the two is that partial pivoting only checks the elements in the same column below the current diagonal, whereas complete pivoting checks all remaining rows and columns. The partial pivoting algorithm checks the lower elements in the column of each diagonal to make sure that the diagonal element is larger in absolute value than those below it. Absolute values are compared because the elimination step divides by the diagonal element, so it is the size of that element, not its sign, that matters; a large negative element is a more reliable pivot than a small positive one. If the diagonal is not the largest the algorithm will swap all elements of the row containing the diagonal and the row containing the largest lower element in the column. This must take place for each column before elimination is applied to it.
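A compact C sketch of elimination with partial pivoting follows. It is illustrative only: the names gauss_solve and magnitude and the fixed size N are assumptions, and the sketch finishes with back substitution instead of the backward elimination described above, which yields the same solution with fewer operations.

```c
#define N 3   /* number of unknowns (assumed fixed for this sketch) */

static double magnitude(double v) { return v < 0.0 ? -v : v; }

/* Solves the augmented system [A|b] stored in a[N][N+1] and writes the
 * solution to x. Returns 0 on success, -1 if no non-zero pivot exists. */
int gauss_solve(double a[N][N + 1], double x[N])
{
    int i, j, k;
    for (k = 0; k < N; k++) {
        /* Partial pivoting: largest magnitude in column k, rows k..N-1 */
        int best = k;
        for (i = k + 1; i < N; i++)
            if (magnitude(a[i][k]) > magnitude(a[best][k])) best = i;
        if (a[best][k] == 0.0) return -1;
        if (best != k) {
            for (j = 0; j <= N; j++) {
                double tmp = a[k][j]; a[k][j] = a[best][j]; a[best][j] = tmp;
            }
        }
        /* Forward elimination: clear column k below the diagonal */
        for (i = k + 1; i < N; i++) {
            double factor = a[i][k] / a[k][k];
            for (j = k; j <= N; j++) a[i][j] -= factor * a[k][j];
        }
    }
    /* Back substitution on the upper triangular system */
    for (i = N - 1; i >= 0; i--) {
        double sum = a[i][N];
        for (j = i + 1; j < N; j++) sum -= a[i][j] * x[j];
        x[i] = sum / a[i][i];
    }
    return 0;
}
```

The augmented matrix is overwritten during the computation, so a caller that still needs the original system should pass a copy.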
2.8 Simplex Method
The following algorithm for solving finite games is the simplex method used for solving zero-sum, m x n games as described in J.D. Williams (1966).
1   Add a constant to all elements of the game matrix, if necessary, to ensure that the value is positive. After computation this constant must be subtracted from the value of the new matrix game to get the value of the original matrix game. The strategies are not affected.
2   Create the first schema by augmenting the game matrix with a new row of -1s on the lower edge, a new column of 1s on the right edge and zero in the lower right corner. Label Player 1's strategies on the left from x1 to xm and Player 2's strategies on the top from y1 to yn.
3   Select an entry from the interior of the schema to be the pivot, say row p column q, subject to the following properties:
    3.1 The number in the final row of the potential pivot column, r, must be negative.
    3.2 The potential pivot, p, must be positive.
    3.3 Compute for each potential pivot the quantity –(r x c) / p, where c is the final value in the row containing the potential pivot. This value is known as the pivot criterion.
    3.4 Find the smallest criterion in each valid column (those containing a negative value in the final row).
    3.5 Find the largest criterion from the set of smallest column criteria. The element corresponding to this criterion is the pivot.
4   The numbers for the next schema are found as follows:
    4.1 The number corresponding to the pivot is the value, D, of the preceding schema.
    4.2 Numbers that correspond to those of the preceding pivot row stay the same.
    4.3 The numbers that correspond to those of the preceding pivot column are the same value negated, except the pivot.
    4.4 All other values, except the pivot, are computed from (N x p – pR x pC) / D, where N is the original number and pR and pC are the numbers in the pivot row and pivot column that share a column and a row, respectively, with N.
    4.5 The next value of D is the pivot value, p, of the preceding schema. N.B. D is 1 for the first schema.
5   Player 1's strategies are represented by the rows from the original game matrix and Player 2's strategies are, similarly, represented by the columns from the original game matrix.
    The two strategies represented by the row and column containing the pivot are part of the optimal strategy for each player respectively. However, when pivoting, the strategies swap the position that represents them. For example, if p = a[1][4], Player 1's action represented by row 1 is part of his optimal strategy but is now represented by the value in the final element of column 4. Similarly, Player 2's action that is represented by column 4 is now represented by the value in the final element of row 1.
6   Check the final row of the new schema for negative numbers.
    6.1 If one exists, return to step 3.
    6.2 If not, the computation is complete and the strategies can be found.
The numbers in the final row that are part of the optimal strategy for Player 1 are the odds for the action they represent.
The numbers in the final column that are part of the optimal strategy for Player 2 are the odds for the action they represent.
The probabilities are easily calculated by dividing each individual odd by the sum of the odds. For example, if Player 1's optimal strategy is represented by 0:1:2, the probability for action 1 is 0, for action 2 is 1/3 and for action 3 is 2/3.
The game value is D divided by the number in the final element of the final schema, less any constant added in step 1.
Figure 2.14 is taken from Williams (1966) and is used in the following worked example.

                    Player 2
                 1      2      3
Player 1   1     6      0      3
           2     8      -2     3
           3     4      6      5
Figure 2.14 Simplex Worked Example
The first step is to clear the matrix of any negative numbers by adding a constant, equal to the magnitude of the most negative value, to all elements. In figure 2.14 that value is –2, for the action profile (2, 2), so 2 is added to every element. This leaves figure 2.15.

                    Player 2
                 1      2      3
Player 1   1     8      2      5
           2     10     0      5
           3     6      8      7
Figure 2.15 After Addition of Constant
This matrix is now augmented with a column of 1s, a row of –1s and a 0 where the two meet. This leaves the matrix in figure 2.16.

                    Player 2
                 1      2      3
Player 1   1     8      2      5      1
           2     10     0      5      1
           3     6      8      7      1
                 -1     -1     -1     0
Figure 2.16 First Schema
The next step is to find the pivot. This involves finding the largest of the smallest column pivot criteria. The 0 in row 2, column 2 is not positive and so has no criterion.

                 1      2      3
           1     1/8    1/2    1/5
           2     1/10   -      1/5
           3     1/6    1/8    1/7
Figure 2.17 Pivot Criterion
In figure 2.17 the largest pivot criterion from the set of smallest column pivot criteria is 1/7, in the final row and column; this will be the first pivot.
                    Player 2
                 1      2
Player 1   1                   -5
           2                   -5
                 6      8      1      1     3P2
                               1
                               3P1
Figure 2.18 New Pivot Row and Column
The pivot column values are negated and the pivot row values are left as they are. The exception to these two rules is the pivot itself. The pivot takes the value of D, which for the first schema is 1. D then takes the value of the previous pivot, in this case 7.
                    Player 2
                 1      2
Player 1   1     26     -26    -5     2
           2     40     -40    -5     2
                 6      8      1      1     3P2
                 -1     1      1      1
                               3P1
Figure 2.19 Second Schema
Figure 2.19 is the completed second schema. All values that are not part of the pivot row or column are found using the same formula, (N x P – R x C) / D, where N is the value taken from the same element of the previous schema, P is the pivot and R and C are the values in the pivot row and pivot column that share a column and a row, respectively, with N. For the first element the calculation is (8 x 7 – 6 x 5) / 1 = 26.
This method is repeated until there are no more negative numbers left in the final row. Figure 2.19 shows that there is still a negative number in the final row and therefore another pivot must be found from the columns containing negative numbers in the last row, in this case only the first column.
                    Player 2
                        2
Player 1   1     -26    0      -10    4
                 7      -40    -5     2     1P2
                 -6     80     10     4     3P2
                 1      0      5      6
                 2P1           3P1
Figure 2.20 Final Schema
Once the final schema has been found, as in figure 2.20 where there are no negative numbers in the final row, the strategies are ready to be taken from the matrix. Player 1's mixed strategy is 2P1 and 3P1 in the ratio 1:5 or 0:1:5. Player 2's mixed strategy is 1P2 and 3P2 in the ratio 1:2 or 1:0:2. The value of the augmented game is found by dividing D, which is 40, by the final element of the matrix, 6, leaving 6 2/3. Subtracting the constant 2 added in the first step gives the value of the original game, 4 2/3.
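The pivot steps of section 2.8 can be sketched in C as follows. This is an illustrative sketch, not the project's implementation: the name pivot_solve, the fixed 3 x 3 size, the signed label arrays and the iteration cap are all assumptions made for brevity.

```c
#include <assert.h>

#define ROWS 3   /* Player 1's actions (assumed fixed for this sketch) */
#define COLS 3   /* Player 2's actions */

/* Writes each player's odds to odds1/odds2 and returns the value of the
 * original game. Labels: +k is Player 1's action k, -k is Player 2's. */
double pivot_solve(double game[ROWS][COLS], double odds1[ROWS], double odds2[COLS])
{
    double s[ROWS + 1][COLS + 1], t[ROWS + 1][COLS + 1];
    int rowLab[ROWS], colLab[COLS];
    double shift = 0.0, D = 1.0;
    int i, j, iter;

    /* Step 1: find the constant that removes all negative payoffs */
    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            if (-game[i][j] > shift) shift = -game[i][j];
    /* Step 2: first schema, column of 1s on the right, row of -1s below */
    for (i = 0; i < ROWS; i++) {
        for (j = 0; j < COLS; j++) s[i][j] = game[i][j] + shift;
        s[i][COLS] = 1.0;
        rowLab[i] = i + 1;
        odds1[i] = 0.0;
    }
    for (j = 0; j < COLS; j++) { s[ROWS][j] = -1.0; colLab[j] = -(j + 1); odds2[j] = 0.0; }
    s[ROWS][COLS] = 0.0;

    for (iter = 0; iter < 100; iter++) {      /* arbitrary cap on pivots */
        int rP = -1, cP = -1, tmp;
        double best = 0.0;
        /* Step 3: largest of the smallest column criteria */
        for (j = 0; j < COLS; j++) {
            int minRow = -1;
            double minCrit = 0.0;
            if (s[ROWS][j] >= 0.0) continue;
            for (i = 0; i < ROWS; i++) {
                double crit;
                if (s[i][j] <= 0.0) continue;
                crit = -s[ROWS][j] * s[i][COLS] / s[i][j];
                if (minRow < 0 || crit < minCrit) { minCrit = crit; minRow = i; }
            }
            if (minRow >= 0 && (cP < 0 || minCrit > best)) { best = minCrit; rP = minRow; cP = j; }
        }
        if (rP < 0) break;                    /* Step 6: no valid pivot remains */
        /* Step 4: build the next schema */
        for (i = 0; i <= ROWS; i++)
            for (j = 0; j <= COLS; j++) {
                if (i == rP && j == cP) t[i][j] = D;
                else if (i == rP)       t[i][j] = s[i][j];
                else if (j == cP)       t[i][j] = -s[i][j];
                else t[i][j] = (s[i][j] * s[rP][cP] - s[rP][j] * s[i][cP]) / D;
            }
        D = s[rP][cP];
        for (i = 0; i <= ROWS; i++)
            for (j = 0; j <= COLS; j++) s[i][j] = t[i][j];
        /* Step 5: the pivot row and column exchange the strategies they carry */
        tmp = rowLab[rP]; rowLab[rP] = colLab[cP]; colLab[cP] = tmp;
    }
    /* Player 1's odds sit in the final row, Player 2's in the final column */
    for (j = 0; j < COLS; j++) if (colLab[j] > 0) odds1[colLab[j] - 1] = s[ROWS][j];
    for (i = 0; i < ROWS; i++) if (rowLab[i] < 0) odds2[-rowLab[i] - 1] = s[i][COLS];
    assert(s[ROWS][COLS] != 0.0);
    return D / s[ROWS][COLS] - shift;
}
```

For the game in figure 2.14 the sketch performs two pivots and returns odds 0:1:5 for Player 1, 2:0:4 (that is, 1:0:2) for Player 2 and a value of 4 2/3 for the original game.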
2.9 Other Implementations
Despite Game Theory's usefulness in economic theory and other areas of academia, software that finds Nash Equilibria is not abundant. Some programming environments, such as Maple, provide a simplex function, and on the Internet there is a library of tools known as Gambit and some applets for solving zero-sum games in normal form. There is also a downloadable application for certain calculators, and there are many sites that explain how to find the Nash Equilibrium.
Chapter 3
Requirements (18.3.2007)
Requirements set out what the program should do, defining constraints on the operation and
implementation. For a larger system this document can be very large and will be checked and
used by managers, engineers and the software developers but in this case the system is small
scale and relatively simple and with one person acting in all of the above positions there will
not be many requirements.
There are a number of benefits a small-scale project, such as this, has in terms of the
requirements stage. Familiarity avoids confusion over requirement wording and conflicting
interpretations of natural language that can lead to problems in the final system. It also
removes the possibility of misinterpreting system goals that have not been clearly verified.
Despite these benefits it is important to keep an updated requirements document. In small projects it is easy to write the requirements and then not use them. This can lead to alterations of the requirements at the end of a project to fit the final system, which is bad practice for future large-scale projects.
The program has one function, computing the mixed strategy Nash Equilibrium for the
payoffs inputted by the user. It has a place in industry for improving theoretic knowledge of
situations but in reality the equilibrium model is too simple for use as a sole advisor. The
software's main use will be in academia, aiding the teaching of game theory in economics and
various sciences.
This document was regularly updated until it was decided to implement the Simplex method, at which point it was finalised to include the algorithm requirements. The game data has changed slightly, as the player names are now entered by the user at the command line instead of being part of the input file. The output file is also specified at the command line instead of being hard coded, to allow the user more freedom and the ability to easily store a number of results.
3.1 Functional Requirements
1.  User Input – In the form: P1 Action Count, P2 Action Count, Payoffs…
    1.1 Game data can be separated by either a comma or whitespace
    1.2 User can input any m x n matrix
    1.3 Payoffs can be positive and negative numbers
    1.4 Payoffs can be floating-point numbers
    1.5 Payoffs can be fractions
2.  System Method
    2.1 Look for Pure Nash Equilibrium
        2.1.1 Return results if Pure Nash Equilibrium is found and terminate
    2.2 Check size of game
        2.2.1 Do 2 x 2 method if possible and terminate
    2.3 Begin Simplex Method
        2.3.1 Find Pivot point
        2.3.2 Create new schema
        2.3.3 Exchange Player Actions
        2.3.4 Search final row for negative numbers
        2.3.5 Return to 2.3.1 if a negative number is found
        2.3.6 Return Solution
3.  System Output
    3.1 Results will go to the command line and to the output file
    3.2 User gives the system the name of the output file
    3.3 System will output the optimal strategies for both players and the value of the game
        3.3.1 Output will be in the form:
              Player 1's strategy: a Action1 + b Action2 + …
              Player 2's strategy: c Action1 + d Action2 + …
              Value of Game = gV
              The game favours Player …
    3.4 Game value will be either a floating-point number or fraction
    3.5 Game value can be positive or negative and zero
    3.6 The Action probabilities must be in the range [0,1]
    3.7 The sum of the Action probabilities for each player must be 1
3.2 Non-Functional Requirements
1.  The system must be written in C
2.  The system must run on the BUCS machines
3.  The system should run efficiently
4.  The system should be easy to use
5.  The system must be reliable
    5.1 Error messages should result from any error
    5.2 Error messages must be clear and easily understood
    5.3 The system will terminate if an error is found
6.  Data must be held in a plain text file
7.  Poster presentation due 16.4.07
8.  Final Draft due 18.4.07
9.  Completed project due 3.5.07
Chapter 4
Design & Implementation
This chapter describes in more detail the functionality of the algorithms mentioned in Chapter 2 and discusses how they will be implemented, as well as any problems that may arise and why C was chosen. It also explains the design decisions made for data entry and the system output.
4.1 Non-computation Functionality
It is considered bad practice to hard-code the name of the file used for storing results. Therefore, the first task is to get the name of the file the user wants to store the results in. As there are no real users it is impossible to know whether the filename should be read with the game data from file or entered at the command line. The difference in coding is minimal so it has been done at the command line to show a mix of input possibilities. Along with the filename the user is prompted to input the names of the two players, which are then used for printing out the strategy on completion of the computation.
Before the computation can begin the game data must be retrieved from the data file. The game data is the number of actions for each player and their payoffs. As the data is stored as a string it is necessary to split this string into tokens, at every occurrence of whitespace or a comma, and convert each token to a numeric type.
chPtr = strtok(str," ,");
col = atoi(chPtr);
...
chPtr = strtok(NULL," ,");
row = atoi(chPtr);
The first two tokens are the number of actions Player 1 and Player 2 have respectively. These
also specify the size of the game and are therefore very important for limiting iteration in the
system. Once the numbers of actions are stored, Player 1's as col and Player 2's as row, the
payoffs can be converted to floating point numbers and assigned to the game matrix.
for(i = 0; i < row; i++){
    for(j = 0; j < col; j++){
        /* atof rather than atoi, so floating-point payoffs are not truncated */
        Pay[i][j] = atof(chPtr);
        chPtr = strtok(NULL," ,");
    }
}
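The tokenising step can be sketched as a self-contained C function. This is a hedged illustration rather than the project's code: the name parse_game, its parameters and its error convention are assumptions. It uses strtod so that a token that is not a number can be detected, and it does not handle payoffs written as fractions such as 1/2.

```c
#include <stdlib.h>
#include <string.h>

/* Splits str (which is modified by strtok) into two action counts
 * followed by count1 * count2 payoffs. Returns the number of payoffs
 * read, or -1 if the data is missing, malformed or too large. */
int parse_game(char *str, int *count1, int *count2, double pay[], int max_pay)
{
    const char *delims = " ,\t\r\n";
    char *tok = strtok(str, delims);
    int n = 0, expected;

    if (tok == NULL) return -1;
    *count1 = atoi(tok);
    tok = strtok(NULL, delims);
    if (tok == NULL) return -1;
    *count2 = atoi(tok);
    if (*count1 <= 0 || *count2 <= 0) return -1;
    expected = (*count1) * (*count2);
    if (expected > max_pay) return -1;

    while (n < expected && (tok = strtok(NULL, delims)) != NULL) {
        char *end;
        pay[n] = strtod(tok, &end);
        if (end == tok) return -1;    /* token did not start with a number */
        n++;
    }
    return (n == expected) ? n : -1;
}
```

The payoffs are returned in the order they appear in the input; how they map onto rows and columns is left to the caller.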
4.2 Pure Nash Equilibrium
As explained in the literature review a zero-sum game has a Nash Equilibrium if one of the
elements is the largest in the column and smallest in the row. The system uses a couple of for
loops to iterate over the game matrix and finds, for each row, the smallest value and stores it
and the column that it lies in. It then finds, for each column, the largest value and stores it and
the row that it lies in. Having found the minima and maxima for each row and column
respectively they are then checked to see if any of the elements occur in both the row minima
and column maxima. If there is such an element it is the Pure Nash Equilibrium and all that is
left to do is to calculate the value of the game and print the pure strategy.
for(i = 0; i < row; i++){
    column = rowSt[i];
    if(colSt[column] == i){
        ...
The arrays rowSt[] and colSt[] store the column and row numbers for the minima and
maxima of each row and column, respectively. For example rowSt[i] stores the column
number that contains the minimum value from the ith row. If rowSt[i] points to the jth
column and colSt[j] points to the ith row, then a[i][j] is the Nash Equilibrium.
findNash();
if(!checkFlag){
    /* If Nash Equilibrium does not exist continue else finish */
    if(col == 2 && row == 2)
        /* If game is 2x2 do short method else do simplex */
        find2Strat();
    else{
        /* Run the Simplex Method */
If a Nash equilibrium is found the function sets the checkFlag so that the program does not
continue to look for mixed strategies. If not the program checks the size of the game. If the
game is 2 x 2 the quick method can be used to solve it, otherwise the simplex method is used.
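The test for a pure Nash Equilibrium can be sketched as a standalone C function. This is an illustration under assumptions: the name find_saddle and the row-major array layout are invented here, and the sketch compares payoff values directly rather than storing row and column indices, so that tied minima or maxima cannot hide an equilibrium.

```c
/* a holds the payoffs to Player 1 in row-major order (rows x cols).
 * Returns 1 and sets *r and *c if some element is the smallest in its
 * row and the largest in its column; returns 0 if there is none. */
int find_saddle(const double *a, int rows, int cols, int *r, int *c)
{
    int i, j, k;
    for (i = 0; i < rows; i++) {
        for (j = 0; j < cols; j++) {
            double v = a[i * cols + j];
            int ok = 1;
            for (k = 0; k < cols && ok; k++)
                if (a[i * cols + k] < v) ok = 0;   /* not the row minimum */
            for (k = 0; k < rows && ok; k++)
                if (a[k * cols + j] > v) ok = 0;   /* not the column maximum */
            if (ok) { *r = i; *c = j; return 1; }
        }
    }
    return 0;
}
```

For the 2 x 2 game with rows (3, 6) and (5, 4) the function returns 0, so a mixed strategy must be computed.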
4.3 Mixed Strategy for 2 x 2 Games
Should no pure Nash equilibrium exist in the game the next check is on the size of the game, remembering that 2 x 2 games are very simple to solve. It is unnecessary to go into any more detail than to say that the odds for both players' actions are stored in separate arrays, from which the probability distribution over the actions is calculated and the optimal strategy is printed.
4.4 Mixed Strategy for m x n Games
If the game has no pure Nash equilibrium and it is larger than 2 x 2, the simplex method will
be used to compute the mixed strategy. There are two parts to this method, finding the pivot
and creating the new schema using the pivot.
                            Player 2
                       Rock    Paper   Scissors
Player 1   Rock        0       -1      1
           Paper       1       0       -1
           Scissors    -1      1       0
Figure 4.1 Rock Paper Scissors
The first action is creating the first schema, which involves adding one extra row and column to the payoff matrix, the row containing –1s and the column containing 1s; the element where the new row and column meet is 0. No element in the schema taken from the original payoff matrix may be negative, so it is necessary to subtract a constant from all elements before entering them into the schema. The value of the constant will be the largest negative element in the original payoff matrix; if there are no negative elements the constant will be 0. The largest negative value in the Rock Paper Scissors game is –1, so 1 is added to every element.
                            Player 2
                       Rock    Paper   Scissors
Player 1   Rock        1       0       2          1
           Paper       2       1       0          1
           Scissors    0       2       1          1
                       -1      -1      -1         0
Figure 4.2 RPS First Schema
In figure 4.2, the first schema for the game Rock Paper Scissors, the original game matrix occupies the first three rows and columns and the constant has been subtracted from the values. From here the pivot finding and schema augmentation repeat in a loop until there are no negative values in the final row.
                  Player 2
                  1     2
Player 1    1     2     5     0     0     1
            2     4     3     0     0     1
            3     3     6     0     0     1
            4     5     4     0     0     1
                  -1    -1    0     0     0
Figure 4.3 4x2 Game First Schema
Figure 4.3 shows how the program creates a square matrix for the simplex method to work on
from an uneven game. The game in question is 4 x 2 and in order to make it square two
columns of 0s have been inserted. These columns are never used in calculations; they are
simply there to simplify the strategy swapping. Similarly for a 2 x 4 game, two rows of 0s
would be inserted and again are never used in the calculations.
4.4.1 Finding the Pivot
Potential pivots are searched for column by column and there are two checks that must be made before looking for a pivot. A potential pivot must be positive and the final element of its column must be negative. Checking the final row for negative elements saves unnecessary checks and is therefore done before checking the sign of each column element.
for(j = 0; j < col; j++){
    ...
    if(Sc1[rLim][j] < 0){
        for(i = 0; i < row; i++){
            if(Sc1[i][j] > 0){
                ...
When a valid element is found, the code calculates the pivot criterion, r x c / p, and stores the
value in an array, crits[], which holds the minimum criterion for each column. If a pivot
criterion in the jth column is smaller than the current value stored in crits[j], it replaces that
value and its row number is stored in an array, colSt[]. Once all columns have been
checked and the smallest pivot criterion from each valid column has been found, the next
step is to find the largest of these values; the element that this criterion corresponds to is the
pivot. The pivot column, cP, is set during the search for the largest criterion and the pivot
row, rP, is set to the value in colSt[cP].
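The selection rule described above can be sketched as a standalone function. This is a minimal sketch rather than the program's own findPiv: the name choosePivot, the flat row-major layout and the use of DBL_MAX as the "no candidate yet" marker are assumptions made for illustration. As in the Appendix A code, the criterion is negated because the final-row element of a candidate column is negative.

```c
#include <float.h>

/* Hypothetical helper: picks the pivot in an n x n schema stored row-major
 * in s, where row n-1 is the final row and column n-1 the final column.
 * Returns 1 and sets *rP and *cP if a pivot exists, 0 otherwise. */
int choosePivot(const double *s, int n, int *rP, int *cP)
{
    int lim = n - 1, found = 0;
    double best = 0.0;

    for (int j = 0; j < lim; j++) {
        if (s[lim * n + j] >= 0)
            continue;                     /* final element must be negative */
        double colMin = DBL_MAX;
        int colRow = -1;
        for (int i = 0; i < lim; i++) {
            double p = s[i * n + j];
            if (p <= 0)
                continue;                 /* a pivot must be positive */
            double crit = -(s[i * n + lim] * s[lim * n + j]) / p;
            if (crit < colMin) {          /* smallest criterion in column j */
                colMin = crit;
                colRow = i;
            }
        }
        if (colRow >= 0 && (!found || colMin > best)) {
            best = colMin;                /* largest of the column minima */
            *rP = colRow;
            *cP = j;
            found = 1;
        }
    }
    return found;
}
```

On the Rock Paper Scissors first schema every column has a minimum criterion of 0.5, so the first column wins the tie and the pivot is the 2 in the third row.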
Once the pivot has been found there is one thing that must be done before the computation
can begin. Each player's strategy is heavily affected by the pivot and the changes must be
recorded in order to find the optimal strategy in the final schema. The arrays
P1s[rLim][0] and P2s[cLim][0] are updated every time the pivot takes place with the
relevant information.
P1s[rP][0] = cP + 1;
P2s[cP][0] = rP + 1;
For example, Player 1's first action is originally represented by the first row. If the first pivot
is element[0][2] this action is now represented by the 3rd column.
4.4.2 Creating the New Schema
The second part of the Simplex method involves the calculation based on the pivot. There are
four groups of numbers that need to be found: those in the pivot row, those in the
pivot column, the number corresponding to the pivot and all other numbers corresponding to
the original game matrix. The number corresponding to the pivot takes the value of D, which
for the first schema is 1 but for all following schema is the value of the pivot from the
preceding schema. The numbers in the pivot row stay unchanged. The numbers in the pivot
column, apart from the pivot, are negated. All other numbers from the original game matrix are
computed from (N x p – pR x pC) / D, where N is the original number, p is the pivot, and pR
and pC are the numbers in the pivot row and pivot column that share a column and a row,
respectively, with N.
    for(i = 0; i < row; i++){
        if(i != rP){
            for(j = 0; j < col; j++){
                if(j != cP)
                    Sc1[i][j] = (Sc1[i][j]*prevPiv - Sc1[rP][j]*Sc1[i][cP]) / d;
            }
        }
    }
Any additional rows or columns inserted to make the game square are not altered. Once these
changes have been made D can be assigned the value of the previous pivot. A check is then
run to see if there are any negative values in the final row. If there are, a new pivot must be
found.
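A single pivot step following the rules above can be sketched as below. This is a minimal sketch, not the program's pivPart: the name pivotStep and the flat row-major layout are assumptions, and the function applies the (N x p – pR x pC) / D rule uniformly to the final row and final column as well, which the Appendix A code does in separate loops. Padding rows and columns of 0s remain 0 under this rule.

```c
/* Hypothetical helper: applies one pivot step in place to an n x n schema
 * stored row-major in s. (rP, cP) is the pivot position and d the current
 * value of D. Returns the new D, which is the old pivot value. */
double pivotStep(double *s, int n, int rP, int cP, double d)
{
    double p = s[rP * n + cP];

    /* Every element outside the pivot row and pivot column */
    for (int i = 0; i < n; i++) {
        if (i == rP)
            continue;                     /* the pivot row is unchanged */
        for (int j = 0; j < n; j++) {
            if (j == cP)
                continue;
            s[i * n + j] = (s[i * n + j] * p
                            - s[rP * n + j] * s[i * n + cP]) / d;
        }
    }
    /* The pivot column is negated only after its old values were used */
    for (int i = 0; i < n; i++)
        if (i != rP)
            s[i * n + cP] = -s[i * n + cP];
    s[rP * n + cP] = d;                   /* the pivot cell takes D */
    return p;
}
```

Applied to the Rock Paper Scissors first schema with the pivot in the third row and first column and D = 1, the function returns 2 as the new D and the final row becomes 1, –2, –1, 1, so a further pivot is still required.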
4.5 Dynamic Memory Allocation
Arrays cannot be declared without a size parameter. As the games can be any size it is not
possible to statically allocate memory for the multidimensional arrays used to store the
payoffs without guessing at a maximum game size that would undoubtedly lead to either
unnecessary waste or worse, the opposite, not enough space. Therefore memory must be
allocated dynamically. It is simple to dynamically allocate memory for ordinary arrays but
multidimensional arrays are more complex and require the use of pointers to pointers.
Summit's online text gives a good description on how to do this and figure 4.4 is taken from
this text as a visual aid.
    Pay = malloc(row * sizeof(double *));
    for(i = 0; i < row; i++)
        Pay[i] = malloc(col * sizeof(double));
It is probably best to think of it as an array of pointers, one for each row, each of which points
to an array of doubles holding the values for each column of that row.
Figure 4.4 Pointer to Pointer to …
It is now possible to reference the elements as if they were stored in a multidimensional array,
for example the element in the first row and second column is Pay[0][1].
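The allocation above does not check the result of malloc and the program never releases the memory. A minimal sketch of how both could be handled is given below; allocMatrix and freeMatrix are hypothetical names and are not part of the program.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates a rows x cols matrix of doubles, filled
 * with 0s. Returns NULL if any allocation fails, after releasing whatever
 * had already been allocated. */
double **allocMatrix(int rows, int cols)
{
    double **m = malloc(rows * sizeof *m);

    if (m == NULL)
        return NULL;
    for (int i = 0; i < rows; i++) {
        m[i] = calloc(cols, sizeof **m);
        if (m[i] == NULL) {
            while (i-- > 0)               /* free the rows allocated so far */
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Releases a matrix created by allocMatrix. */
void freeMatrix(double **m, int rows)
{
    if (m == NULL)
        return;
    for (int i = 0; i < rows; i++)
        free(m[i]);
    free(m);
}
```

Each row is freed before the array of row pointers, the reverse of the order in which they were allocated.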
Chapter 5
Testing
Testing is a continuous process from requirements gathering to the final program. Software
inspections with the aid of additional print statements were used to make sure that the code
was working as expected throughout the implementation process. This chapter concentrates
on the finished program and how it performed in the testing.
Sommerville (2001) separates software testing into two parts, Defect testing and Statistical
testing. Sommerville says 'Defect testing is intended to find inconsistencies between a
program and its specification' while 'Statistical testing is used to test the program's
performance and reliability'.
The remaining test data can be found in Appendix B.
5.1 Statistical Testing
Statistical testing involves realistic user inputs to gauge if the system satisfies the
requirements. The tests in this chapter are mostly taken from games specified in Williams
(1966) and have been chosen to show the system's full range of functionality.
Test 1: Correct Nash Equilibrium Selection
The game data for this test was [2 2 6, 5, 5 4]. This game has a Nash equilibrium in
a[0][1] and the data was separated by a mix of whitespace and commas to show that both
are acceptable.
Figure 5.1 Nash Equilibrium Test from Command Line
This is taken from the command line. The user is prompted for the output file and the names
of the players. Once these are entered the code gets the game data, prints the game matrix and
begins solving. The test result is a success as the equilibrium point was found and no other
calculations took place.
Figure 5.2 Nash Equilibrium Test Output File
This is the output file, Nash.txt, which was specified by the user at run-time. If the file already
exists the data will be overwritten; if not, the file will be created.
Test 2: 2 x 2 Game Solving
This 2 x 2 game had no Nash Equilibrium and should have been solved using the simple
method described in chapter 2.
Figure 5.3 2x2 Game Solving from Command Line
These results match those in Williams (1966) and have been checked on paper. The slight
distinguishing mark of the simple 2x2 solving method is the extra line between the game
matrix and the strategies, showing that this was calculated using the correct method.
Figure 5.4 2x2 Game Solving Output File
Again the output file has the correct data stored.
Test 3: 3x3 Game Solved using the Simplex Method
Rock Paper Scissors was used for this test as it is large enough, has no Nash equilibrium and
it has negative values. The method should add a constant to the game matrix, calculate the
optimal strategies for the new game, which will be the same as the original game strategies,
and then subtract the constant from the value of the new game to find the value of the original
game, which should be 0 as Rock Paper Scissors is fair.
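The reasoning behind this test can be checked numerically. The sketch below is not part of the program; expectedPayoff is a hypothetical helper that computes the expected payoff, the sum over i and j of x[i] * a[i][j] * y[j], for given mixed strategies x and y. Because the probabilities in each strategy sum to 1, adding a constant to every payoff raises the expected payoff of any pair of strategies by exactly that constant. The optimal strategies are therefore unchanged and the constant can be removed again at the end.

```c
/* Hypothetical helper: expected payoff to the row player when the row
 * player mixes with probabilities x and the column player with y.
 * pay is a rows x cols matrix stored row-major. */
double expectedPayoff(const double *pay, int rows, int cols,
                      const double *x, const double *y)
{
    double sum = 0.0;

    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            sum += x[i] * pay[i * cols + j] * y[j];
    return sum;
}
```

With both players choosing each Rock Paper Scissors action one third of the time, the function gives 0 for the original payoffs and, up to rounding, 1 once 1 has been added to every payoff.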
Figure 5.5 Rock Paper Scissors Test from Command Line
Test 4: m x n Game Solved using the Simplex Method
This tested the program's ability to solve games that are not square.
Figure 5.6 m x n Game Test from Command Line
This test required the program to insert an extra column of 0s in order to calculate the correct
strategies. The program was also tested with games that required rows of 0s.
5.2 Defect Testing
Sommerville (2001) describes Defect testing as demonstrating 'the presence, not absence, of
program faults' and says 'a successful defect test is a test which causes the system to perform
incorrectly'. Again these tests are a selection of the more interesting and useful defect tests
that were run.
Test 5: Failure to Specify an Output File / Player Name
This test simulates a user mistakenly pressing return before having entered an output file. The
system continues and creates an output file named 'Test.txt'. This bit of functionality means
that there will always be an output file created even if a name is not specified by the user for
whatever reason. Similarly, if the user does not enter a player name the program assigns either
Player 1 or Player 2 as a default name.
Figure 5.7 Failure to Input a Player Name
Figure 5.7 shows the output at the command line if a player name is not entered. In this case
Player 1's name has been left blank for one reason or another and the program, instead of
having a blank space has used the default name, Player 1. A name not being given could be a
mistake or the user may not know the name or consider it necessary for the output.
Test 6: Invalid Game Data – Incorrect Number of Inputs
When a game is 3 x 3 or bigger it is easy to give the wrong number of payoffs in the input
file. It is also easy to forget to give both players' action counts at the beginning of the input
file, especially when the game is square and both players have the same number of actions. If
this occurs there is a segmentation fault as the program expects more or fewer tokens than it has
received. This is not an ideal outcome and will be mentioned in the improvements section of
chapter 6.
Test 7: Invalid Game Data – Invalid Data Type
All tokens from the input file should be numbers and the code has been written to expect this.
If an unexpected character is found in the input file the atoi function returns the value 0,
which will often lead to misleading results. This will also be mentioned in chapter 6.
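One way to tell a genuine 0 from an invalid token is to use the standard strtol function, which reports where the conversion stopped. The sketch below is an illustration only; parseInt is a hypothetical helper and is not part of the program. Trailing whitespace is accepted because the last token read by fgets still carries the newline character.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts a token to an int. Returns 1 on success and
 * stores the value in *out. Returns 0 if the token is empty, contains
 * non-numeric characters or is out of range for an int. */
int parseInt(const char *tok, int *out)
{
    char *end;
    long v;

    if (tok == NULL || *tok == '\0')
        return 0;
    errno = 0;
    v = strtol(tok, &end, 10);
    if (end == tok)
        return 0;                         /* no digits were found */
    while (isspace((unsigned char)*end))
        end++;                            /* allow a trailing newline */
    if (*end != '\0')
        return 0;                         /* junk after the number */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;
    return 1;
}
```

The caller can then stop with an error message when parseInt returns 0 instead of silently treating the token as a payoff of 0.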
Despite the defect testing picking up a number of faults in the program the statistical testing
results are encouraging.
Chapter 6
Summary
6.1 Improvements
There are a number of improvements that have either arisen during the coding or that I have
been made aware of by the testing. Some of these have been implemented but due to time
constraints some have not. Because there are no current users for the software the gathering of
requirements was hampered, relying on my own input and a brainstorming session with a
former Economics Teacher. The improvements that became obvious during coding and after
testing would, probably, have been picked up by regular users and, I imagine, there are more
that I still have not thought of. I will discuss all improvements including those that have been
implemented and may not be a part of the requirements document.
A square game should only require one size parameter – This is not a fault but it was
noted in software testing throughout implementation that it is easy to forget the second
parameter when it is the same as the first. It would be possible to implement this functionality
with a test along the lines of:
If number of tokens is not as expected check whether the
length of the tail is the square of the value of the head.
For example the input [2 3 4 5 6] has a tail with length 4, the square of the value of the head.
This input would cause a segmentation fault with the current code. Alternatively, it is possible
that the user has simply forgotten to give the second row of data for a 2 x 3 game, in which case
the code would mistakenly calculate the strategies for the 2 x 2 matrix and the user would be
none the wiser. This problem could be controlled by the following improvement.
A warning message for an unexpected number of tokens – This would require a check before
the memory allocation. If there was the possibility that the game was square and the user had
only given one size parameter the system could print the data as if this was the case and give
the user the option to continue with the calculation or to change it. For example,
[2 3 4 5 6] Is this the 2x2 game [3,4][5,6]? (Y/N):
The user could either confirm this or terminate the code and input the data correctly.
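The check before memory allocation could be sketched as follows. This is an assumption about how the improvement might look, not code from the program: classifyInput and the InputShape names are hypothetical. The head is the first token, second is the token that follows it and tailLen is the number of tokens after the head.

```c
/* Hypothetical result codes for the shape of the input. */
enum InputShape {
    SHAPE_INVALID,          /* token count matches neither form */
    SHAPE_FULL,             /* rows, cols, then rows x cols payoffs */
    SHAPE_SQUARE_ONE_SIZE   /* one size n, then n x n payoffs */
};

/* Hypothetical helper: decides how an input with first token head, second
 * token second and tailLen tokens after the head should be read. */
enum InputShape classifyInput(int head, int second, int tailLen)
{
    if (head < 1 || tailLen < 1)
        return SHAPE_INVALID;
    if (second >= 1 && tailLen == 1 + head * second)
        return SHAPE_FULL;
    if (tailLen == head * head)
        return SHAPE_SQUARE_ONE_SIZE;
    return SHAPE_INVALID;
}
```

For the input [2 3 4 5 6] the call classifyInput(2, 3, 4) gives SHAPE_SQUARE_ONE_SIZE, at which point the confirmation prompt shown above could be printed. Any other mismatch gives SHAPE_INVALID and the program could stop with a warning rather than a segmentation fault.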
Player and Action Names – After running a number of tests it became clear that the strategies
would be clearer and more presentable if the user was able to give the names of the players
and their actions. This would also be useful if the output was to be used in a presentation as
Player 1's 1st action would mean very little to anyone other than the user. The functionality
allowing player names to be given at run time via the command line has been implemented.
Storing Action names would require two arrays, one for each player, that would have to be
dynamically allocated memory in order to store the names.
Default Names – This became clear having implemented the player names functionality. If the
return key was mistakenly pressed or the name of the player was unnecessary or unknown
then a default name is given. This has been implemented with the default names Player 1 and
Player 2.
Same Actions – If the Action names functionality was implemented it would be nice for the
user to be able to specify whether or not Player 2's actions are the same as Player 1's. Often in a
game both players have the same actions and typing 'Rock Paper Scissors Rock Paper
Scissors' is unnecessarily time consuming.
Output File Specification – If the user wanted to save a number of results, the ability to give
the output file a name would be very useful. This has been implemented, like the player
names, at run time. Before this the output file would be overwritten every time the program
was run.
Default Output File – As with the player names, it is easy to mistakenly press return. In
this circumstance it is important that a default file is written to so that results are not lost. In
this situation a message should be printed on the command line, warning the user that the
results have been written to the default file. The default file name has been implemented but
the warning message has not. This just requires a print statement within the default file check.
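A minimal sketch of the default file check with the missing warning is given below. The function name outputName is hypothetical, the default name Test is the one described in chapter 5, and the wording of the warning is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: builds the output file name from the text the user
 * entered (without extension). Falls back to the default name and prints a
 * warning if nothing was entered. Returns 1 if the default was used. */
int outputName(const char *input, char *out, size_t outSize)
{
    int usedDefault = (input == NULL || input[0] == '\0');

    snprintf(out, outSize, "%s.txt", usedDefault ? "Test" : input);
    if (usedDefault)
        fprintf(stderr,
                "Warning: no file name given, results written to %s\n", out);
    return usedDefault;
}
```

Using snprintf rather than strcat also keeps the name within the size of the buffer.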
Output File Optional – There may be occasions when an output file is not wanted. It would be
possible to allow the user to type a certain phrase or word that would stop the output file
creation.
Overwritten Data – With the user giving the name of the output file it would be easy to
overwrite data stored from a previous test. A warning message should be sent if a filename
given is already present, giving the user the opportunity to give another name.
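Standard C has no dedicated call for testing whether a file exists, so one simple approach is to try to open it for reading. The sketch below is an assumption about how the check might be written; fileExists is a hypothetical helper, and a file that exists but cannot be read would be reported as absent.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if a file with this name can be opened
 * for reading, 0 otherwise. */
int fileExists(const char *name)
{
    FILE *f = fopen(name, "r");

    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}
```

The program could call this before opening the output file for writing and ask the user for another name when it returns 1.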
Fractions as Output / Input – One requirement stated that the user should be able to give the
payoffs as fractions. This functionality was not implemented. It would also be useful and
smarter if the strategies were given as fractions. This could be easily implemented.
Differences in Output – During testing it became clear that the outputs of the methods all
differed slightly. When a Nash Equilibrium is found the actions are given as 'A1' as opposed to
the other situations where they are given as 'Action 1'. There is also a difference between outputs
of the 2 x 2 and simplex methods. It would be better to have a standard format for the outputs.
Removal of Dominated Strategies – Although dominance was described in the literature
review it was not implemented. It would require checking each action for domination and
freeing the memory allocated to the dominated action. It could save time on calculations but I
decided that it was not worth the implementation cost when compared with the average time it
would save. The removal of dominated strategies is aimed more at methods for solving 2 x n
games other than the simplex method. As I have used the simplex method for all games larger
than 2 x 2, and most games I have come across do not have dominated strategies, I
decided not to implement this functionality.
Figure 6.1 Game with Domination
Figure 6.1 shows a game that the program solved using the simplex method. This game can
be reduced to a 2 x 2 game, shown in figure 6.2, which can be solved using the quicker 2 x 2
method.
Figure 6.2 Reduced Game
Obviously the dominated action functionality would be useful in this situation and if I was to
spend more time on it this would be implemented.
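The domination check for Player 1 could be sketched as below. This is an illustration under stated assumptions, not code from the program: rowDominated is a hypothetical helper, payoffs are to the row player, and a row counts as dominated when another row is at least as good in every column and strictly better in at least one.

```c
/* Hypothetical helper: returns 1 if row i of the rows x cols payoff matrix
 * (row-major, payoffs to the row player) is dominated by some other row. */
int rowDominated(const double *pay, int rows, int cols, int i)
{
    for (int k = 0; k < rows; k++) {
        if (k == i)
            continue;
        int noWorse = 1, better = 0;
        for (int j = 0; j < cols; j++) {
            if (pay[k * cols + j] < pay[i * cols + j]) {
                noWorse = 0;              /* row k loses in this column */
                break;
            }
            if (pay[k * cols + j] > pay[i * cols + j])
                better = 1;
        }
        if (noWorse && better)
            return 1;
    }
    return 0;
}
```

For Player 2, who wants the payoff to be as small as possible, the same test would be applied to columns with the comparisons reversed. Removing one dominated action can expose another, so the check would need repeating until nothing more can be removed.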
Nash Equilibrium Search – The search method implemented is not infallible, which can be a
problem. If there is a Nash equilibrium the other methods will not necessarily obtain the
correct results and therefore it is important to correctly check for an equilibrium point. The
limitation of my search method is that it only stores one value. If two or more elements shared
a value and a column or row, this search will not always return the correct result. This has not
shown up in any of the tests I have run using games with Nash equilibrium but theoretically it
could.
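A search that does not depend on a single stored position is sketched below. It is not the program's findNash; findSaddle is a hypothetical helper that tests every element directly for being both a minimum of its row and a maximum of its column, so ties cannot hide an equilibrium. As a hand-worked example, in the 2 x 2 game with rows [1 1] and [2 0] the element in the first row and second column is an equilibrium, but a search that remembers only the first minimum of each row would look in the first column and miss it.

```c
/* Hypothetical helper: looks for a pure equilibrium (saddle point) in a
 * rows x cols payoff matrix stored row-major, payoffs to the row player.
 * Returns 1 and sets *r and *c if one exists, 0 otherwise. */
int findSaddle(const double *pay, int rows, int cols, int *r, int *c)
{
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            double v = pay[i * cols + j];
            int ok = 1;
            for (int k = 0; k < cols && ok; k++)
                if (pay[i * cols + k] < v)
                    ok = 0;               /* v is not a row minimum */
            for (int k = 0; k < rows && ok; k++)
                if (pay[k * cols + j] > v)
                    ok = 0;               /* v is not a column maximum */
            if (ok) {
                *r = i;
                *c = j;
                return 1;
            }
        }
    }
    return 0;
}
```

The cost is higher than the stored-position search, but for the game sizes used in this project the difference is unlikely to matter.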
Memory Freeing – I have not implemented any code to free the memory used in my program.
I had intended to add this at the end once full functionality was reached but it wasn't reached
and therefore it is lacking.
6.2 Conclusion
This project has seen the development of a program capable of calculating the optimal
strategies for players of two-player zero-sum games. There is a lot more to game theory than
just two-player, zero-sum games and there are few real life situations that can be successfully
modelled by them. Non-zero-sum games are better for modelling real life, but the research
needed to support them was more than the time available allowed. Despite these limitations,
my program has a purpose and performs it well. This project has improved not only my
knowledge of the topic in general but also my research methods.
Despite the large number of improvements that were found throughout the design and
implementation stage the statistical testing stage produced some pleasing results, suggesting
that the code works correctly. I am personally very interested in economics and especially
game theory and I am happy that I have managed to get this project working so successfully.
I have tried to implement as many of the improvements as possible while eradicating the
problems but have unfortunately not been able to get through them all. The project itself has
given me more confidence in my programming ability and has taught me a lot of useful things
about the C language and I've become much better at searching the C libraries for functions.
I did manage to get some time to look into the possibility of a method to solve non-zero-sum
games and this work on Gaussian elimination can be found in Appendices G and H.
6.3 Future Work
In the future, as well as making the improvements that I did not have time to make, I would
like to introduce the following extra functionality.
Addition of N-player capabilities – It is not common to see examples of Game Theory where
there are more than two players but this is more to do with the complexity of such examples
than their rarity. To make this software complete I would have to include N-player
capabilities.
Multiple Solutions – The system has been implemented to find one basic solution to any given
game. However, all games have either exactly one solution or infinitely many, as explained by
Williams (1966). This may make implementing this functionality sound like an unnecessary
and impossible task, but these solutions are all based on a finite number of basic solutions,
which are countable. The simplex method can be altered to include this functionality. It
requires an extra step of pivot finding when a 0 appears in the final row of the final schema;
remember that the basic Simplex method looks for negative numbers only. This extra pivot step
is repeated until all the basic solutions are found.
Non-Zero-sum Games – The subject was touched upon in the literature review but, as with
many of the updates, the time constraints of the project did not allow me to complete this
functionality.
On an interaction level I would like to have a graphical user interface for the input and output
of all game data. I think this would improve the usability as well as making it a more
enjoyable experience.
References
DANTZIG, G (1963), Princeton University Press – Linear Programming And Extensions
FERGUSON, T – Online Game Theory Text, available at
http://www.math.ucla.edu/~tom/Game_Theory/Contents.html
GARG, R – Online Game Theory Text, available at
http://www.cse.iitd.ac.in/~rahul/cs905/lecture5
OSBORNE, MARTIN J (1997), – An Introduction to Game Theory
http://www.socsci.mcmaster.ca/~econ/faculty/osborne
SOMMERVILLE, I (2001), Addison-Wesley – Software Engineering
SUMMIT, S (1996) – Online Dynamically Allocating Multidimensional Arrays Text,
available at
http://www.eskimo.com/~scs/cclass/int/sx9b.html
VON NEUMANN, MORGENSTERN (1953), Princeton University Press – Theory of Games
and Economic Behaviour
WIKIPEDIA, The Free Online Encyclopedia, available at
http://en.wikipedia.org/wiki/Game_Theory
WILLIAMS, J.D (1966), McGraw Hill Book Company – The Compleat Strategyst
Bibliography
C++ RESOURCES NETWORK, available at http://www.cplusplus.com
CONITZER, SANDHOLM (2006) – A Technique for Reducing Normal-Form Games to
Compute a Nash Equilibrium
DASKALAKIS, GOLDBERG, PAPADIMITRIOU (2006) – The Complexity of Computing a
Nash Equilibrium
DATTA, RUCHIRA S (2003) – Using Computer Algebra to find Nash Equilibria
EPPERSON, JAMES (2002) – An Introduction to Numerical Methods and Analysis
KOLLER, MEGIDDO, VON STENGEL (1994) – Fast Algorithms for Finding Randomised
Strategies in Game Trees
MCKELVEY, RICHARD, MCLENNAN, ANDREW, TUROCY, THEODORE – Gambit:
Software tools for Game Theory, available at http://econweb.tamu.edu/gambit/
PORTER, NUDELMAN, SHOHAM (2004) – Simple Search Methods for Finding a Nash
Equilibrium
ZHANG, T – Teach Yourself C in 24 Hours, available at http://aelinik.free.fr/c/index.html
Appendix A
Code
Once the game data is in the multidimensional array, the program checks the game for a Nash
equilibrium. If there is one the program sets a flag, if not the program continues on. It checks
the flag, if it has been set then the program terminates otherwise it checks the size of the
game. If the game is 2x2 the program will run the simple 2 x 2 method. If not the Simplex
method will be used. The Simplex method requires the game matrix to be augmented,
creating the first schema. It then searches for a pivot and creates new schema recursively until
the final row has no negative values.
/*
* Tom Brook - 27.4.07
*
* Computing Mixed Strategy Nash Equilibria
*
* Version Spx7 - Williams (1966) Simplex Method
*
* Final Version
*/
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void getData(void);     /* Gets the game data and enters it into the game matrix */
void printM(double **Mat, int row, int col);    /* Prints a given matrix */
void createSch(void);   /* Dynamically allocates memory to store the schema
                           and the strategies */
void fillSch(void);     /* Creates the first schema with game data */
void findMin(void);     /* Finds constant to make all values >= 0 */
void findNash(void);    /* Searches for Nash Equilibrium */
void findPiv(void);     /* Finds the pivot */
void find2Strat(void);  /* Finds mixed strategy for 2x2 games */
void pivPart(void);     /* Augments schema */
void getStrat(void);    /* Calculates the mixed strategy from final schema */
void printStrat(double **Arr, int max, int p);  /* Prints given player's strategy */

FILE *toFile;
int row, col, sRow, sCol, rLim, cLim, dif, cP, rP, checkFlag = 0;
/* row and col store the size of the original game
 * sRow and sCol are the size of the Schema
 * rLim and cLim are the size of the Schema - 1, used for rectangular games
 * dif is the difference between row and col, used to set sRow and sCol
 * cP and rP are the pivot column and row
 * checkFlag is set if a Nash Equilibrium is found
 */
double **Pay, **Sc1, piv, prevPiv, val, **P1s, **P2s, c, P1Sum = 0,
       P2Sum = 0, d = 1;
/* Pay is the game matrix
 * Sc1 is the schema
 * piv is the current pivot value
 * prevPiv is the previous pivot value
 * val is the value of the game
 * P1s and P2s store the used strategies
 * c is the constant
 * P1Sum and P2Sum store the sum of each player's odds
 * d is the D value of the game
 */
char **Names;
/* Names contains the players' names */
/* Reads one line from stdin into buf (at most size - 1 characters) and
 * removes the trailing newline. Used instead of gets(), which cannot
 * limit the length of the input */
static void readLine(char *buf, int size){
    if(fgets(buf, size, stdin) == NULL)
        buf[0] = '\0';
    buf[strcspn(buf, "\n")] = '\0';
}

int main(void){
    char filename[20], P1[20], P2[20];

    printf("Enter file name for results without extension: ");
    readLine(filename, sizeof(filename) - 4);   /* leaves room for ".txt" */
    strcat(filename, ".txt");

    /* Mallocs memory to store the names of the players */
    Names = malloc(2 * sizeof(char *));
    printf("Enter Player 1's name: ");
    readLine(P1, sizeof(P1));
    Names[0] = malloc(strlen(P1) + 1);  /* + 1 for the terminating '\0' */
    strcpy(Names[0], P1);
    printf("Enter Player 2's name: ");
    readLine(P2, sizeof(P2));
    Names[1] = malloc(strlen(P2) + 1);
    strcpy(Names[1], P2);

    getData();
    printM(Pay, row, col);
    toFile = fopen(filename, "w");
    if(toFile == NULL){
        perror("Error opening output file");
        return 1;
    }
    findNash();
    if(!checkFlag){
        /* If Nash Equilibrium does not exist continue else finish */
        if(col == 2 && row == 2)
            /* If game is 2x2 do short method else do simplex */
            find2Strat();
        else{
            findMin();
            createSch();
            findPiv();
            getStrat();
            printStrat(P1s, row, 0);
            printStrat(P2s, col, 1);
        }
        printf("Game Value = %f\n", val);
        fprintf(toFile, "Game Value = %f\n", val);
    }
    /* Calculates who the game favours */
    if(val > 0){
        printf("Game favours %s\n", Names[0]);
        fprintf(toFile, "Game favours %s\n", Names[0]);
    }
    else if(val < 0){
        printf("Game favours %s\n", Names[1]);
        fprintf(toFile, "Game favours %s\n", Names[1]);
    }
    else{
        printf("Game is fair\n");
        fprintf(toFile, "Game is fair\n");
    }
    fclose(toFile);
    return 0;
}
void getData(void){
    FILE *nuFile;
    int i, j;
    char str[100], *chPtr;

    nuFile = fopen("In.txt","r");
    if(nuFile == NULL){
        /* Without an input file there is no game to solve */
        perror("Error opening input file");
        exit(EXIT_FAILURE);
    }
    /* Stores game data in the string 'str' */
    if(fgets(str, 100, nuFile) == NULL)
        str[0] = '\0';
    fclose(nuFile);

    /* Tokenises 'str' and converts to integers. The number of tokens is
       not checked, see Test 6 in chapter 5 */
    chPtr = strtok(str," ,");
    row = atoi(chPtr);
    sRow = row + 1;
    chPtr = strtok(NULL," ,");
    col = atoi(chPtr);
    sCol = col + 1;
    chPtr = strtok(NULL," ,");
    dif = row - col;
    /* Creates a square schema in case of rectangular game */
    if(dif < 0)
        sRow = sRow - dif;
    else
        sCol = sCol + dif;
    rLim = sRow - 1;
    cLim = sCol - 1;

    /* Mallocs memory for game matrix and assigns the payoffs */
    Pay = malloc(row * sizeof(double *));
    for(i = 0; i < row; i++)
        Pay[i] = malloc(col * sizeof(double));
    for(i = 0; i < row; i++){
        for(j = 0; j < col; j++){
            Pay[i][j] = atoi(chPtr);
            chPtr = strtok(NULL," ,");
        }
    }
}
void findMin(void){
    int i, j;

    /* c becomes the smallest payoff, or 0 if no payoff is negative */
    c = Pay[0][0];
    for(i = 0; i < row; i++){
        for(j = 0; j < col; j++){
            if(c > Pay[i][j])
                c = Pay[i][j];
        }
    }
    if(c > 0)
        c = 0;
}
void printM(double **Mat, int row, int col){
    int i, j;

    printf("Matrix\n");
    for(i = 0; i < row; i++){
        if(i > 0)
            printf("\n");
        for(j = 0; j < col; j++)
            printf("%f\t", Mat[i][j]);
    }
    printf("\n");
}
void createSch(void){
    int i, j, max;

    Sc1 = malloc(sRow * sizeof(double *));
    for(i = 0; i < sRow; i++)
        Sc1[i] = malloc(sCol * sizeof(double));
    max = sRow - 1;
    /* P1s and P2s hold doubles, so the allocations are sized for double
       and double * rather than int and int * */
    P1s = malloc(max * sizeof(double *));
    for(i = 0; i < max; i++)
        P1s[i] = malloc(sizeof(double));
    P2s = malloc(max * sizeof(double *));
    for(j = 0; j < max; j++)
        P2s[j] = malloc(sizeof(double));
    for(i = 0; i < max; i++)
        P1s[i][0] = 0;
    for(j = 0; j < max; j++)
        P2s[j][0] = 0;
    fillSch();
}
void fillSch(void){
    int i, j;

    /* Fills final column with 1s, if a game is rectangular the unused
       rows have 0s */
    for(i = 0; i < rLim; i++){
        if(i < row)
            Sc1[i][cLim] = 1;
        else
            Sc1[i][cLim] = 0;
    }
    /* Fills final row with -1s, if a game is rectangular the unused
       columns have 0s */
    for(j = 0; j < cLim; j++){
        if(j < col)
            Sc1[rLim][j] = -1;
        else
            Sc1[rLim][j] = 0;
    }
    /* Final element = 0 */
    Sc1[rLim][cLim] = 0;
    /* Inner elements contain original game data and 0s for unused rows
     * or columns.
     * Game data is adjusted using the constant so all values are >= 0 */
    for(i = 0; i < rLim; i++){
        for(j = 0; j < cLim; j++)
            if(i < row && j < col)
                Sc1[i][j] = Pay[i][j] - c;
            else
                Sc1[i][j] = 0;
    }
}
void findPiv(void){
    int i, j, colSt[col];
    double crits[col], min, checkVal;

    /* Finds the minimum criterion for all columns with a negative final
       element. The value 100 marks a column with no valid pivot */
    for(j = 0; j < col; j++){
        crits[j] = 100;
        colSt[j] = -1;
        if(Sc1[rLim][j] < 0){
            for(i = 0; i < row; i++){
                if(Sc1[i][j] > 0){
                    min = -(Sc1[i][cLim] * Sc1[rLim][j]) / Sc1[i][j];
                    if(min < crits[j]){
                        crits[j] = min;
                        colSt[j] = i;
                    }
                }
            }
        }
    }
    /* Finds largest of the minimum criteria and its column position */
    j = 0;
    while(j < col){
        if(crits[j] < 100){
            min = crits[j];
            cP = j;
            j = col;
        }
        else
            j++;
    }
    for(j = 1; j < col; j++){
        if(crits[j] < 100){
            checkVal = crits[j];
            if(checkVal > min){
                min = checkVal;
                cP = j;
            }
        }
    }
    /* Sets the pivot value and sets the values of the pivot row and
       column */
    piv = min;
    rP = colSt[cP];
    P1s[rP][0] = cP + 1;
    P2s[cP][0] = rP + 1;
    pivPart();
}
void pivPart(void){
    int i, j, count = 0;

    /* Sets the previous pivot point */
    prevPiv = Sc1[rP][cP];
    /* Augments all elements of the game matrix that aren't in the pivot
       row or column */
    for(i = 0; i < row; i++){
        if(i != rP){
            for(j = 0; j < col; j++){
                if(j != cP)
                    Sc1[i][j] = ((Sc1[i][j] * prevPiv)
                        - (Sc1[rP][j] * Sc1[i][cP])) / d;
            }
        }
    }
    /* Augments the pivot column and the final column */
    for(i = 0; i < row; i++){
        if(i != rP){
            Sc1[i][cLim] = ((Sc1[i][cLim] * prevPiv)
                - (Sc1[rP][cLim] * Sc1[i][cP])) / d;
            Sc1[i][cP] = -Sc1[i][cP];
        }
    }
    /* Augments the final row */
    for(j = 0; j < col; j++){
        if(j != cP){
            Sc1[rLim][j] = ((Sc1[rLim][j] * prevPiv)
                - (Sc1[rP][j] * Sc1[rLim][cP])) / d;
        }
    }
    /* Augments the final element */
    Sc1[rLim][cLim] = ((Sc1[rLim][cLim] * prevPiv)
        - (Sc1[rP][cLim] * Sc1[rLim][cP])) / d;
    /* Augments the final pivot column element */
    Sc1[rLim][cP] = -Sc1[rLim][cP];
    /* Assigns new value to pivot */
    Sc1[rP][cP] = d;
    /* Assigns new value to D */
    d = prevPiv;
    /* Checks for negative values in the final row */
    for(j = 0; j < col; j++){
        if(Sc1[rLim][j] < 0)
            count++;
    }
    /* Finds the game value */
    val = (d / Sc1[rLim][cLim]) + c;
    /* Finds new pivot if there is a negative value in the final row */
    if(count > 0)
        findPiv();
}
void getStrat(void){
    int i, j;

    /* Sums the odds for both players */
    for(i = 0; i < row; i++){
        if(P1s[i][0] > 0)
            P2Sum = P2Sum + Sc1[i][cLim];
    }
    for(j = 0; j < col; j++){
        if(P2s[j][0] > 0)
            P1Sum = P1Sum + Sc1[rLim][j];
    }
    /* Calculates the probability distribution for both players using the
       odds */
    for(i = 0; i < row; i++){
        if(P1s[i][0] > 0){
            j = P1s[i][0] - 1;
            P1s[i][0] = Sc1[rLim][j] / P1Sum;
        }
    }
    for(j = 0; j < col; j++){
        if(P2s[j][0] > 0){
            i = P2s[j][0] - 1;
            P2s[j][0] = Sc1[i][cLim] / P2Sum;
        }
    }
}
void printStrat(double **Arr, int max, int p){
    int i;

    printf("%s's Mixed Strategy = ", Names[p]);
    fprintf(toFile, "%s's Mixed Strategy = ", Names[p]);
    for(i = 0; i < max - 1; i++){
        printf("%f Action %d + ", Arr[i][0], i + 1);
        fprintf(toFile, "%f Action %d + ", Arr[i][0], i + 1);
    }
    printf("%f Action %d\n", Arr[max - 1][0], max);
    fprintf(toFile, "%f Action %d\n", Arr[max - 1][0], max);
}
void findNash(void){
int i, j, rowSt[row], colSt[col], thisRow, column;
double rMin[row], cMax[col];
/* Finds minima and maxima in rows and columns respectively */
for(i = 0; i < row; i++){
rMin[i] = Pay[i][0];
rowSt[i] = 0;
}
for(j = 0; j < col; j++){
cMax[j] = Pay[0][j];
colSt[j] = 0;
}
for(i = 0; i < row; i++){
for(j = 0; j < col; j++){
if(Pay[i][j] < rMin[i]){
rMin[i] = Pay[i][j];
rowSt[i] = j;
}
}
}
for(j = 0; j < col; j++){
for(i = 0; i < row; i++){
if(Pay[i][j] > cMax[j]){
cMax[j] = Pay[i][j];
colSt[j] = i;
}
}
}
/* Searches for an element that is both a row minima and a column
maxima */
for(i = 0; i < row; i++){
column = rowSt[i];
if(colSt[column] == i){
thisRow = colSt[column];
val = Pay[thisRow][column];
printf("Pure Nash Equilibrium = %f at [%d][%d]\n",
val, colSt[column], column);
printf("%s's pure strategy = A%d\n", Names[0],
colSt[column] + 1);
printf("%s's pure strategy = A%d\n", Names[1],
column + 1);
printf("Value of Game = %f\n", val);
fprintf(toFile, "Pure Nash Equilibrium = %f at [%d][%d]\n", val,
colSt[column], column);
fprintf(toFile, "%s's pure strategy = A%d\n", Names[0], colSt[column]
+ 1);
fprintf(toFile, "%s's pure strategy = A%d\n", Names[1], column + 1);
fprintf(toFile, "Value of Game = %f\n", val);
/* Sets check flag to signal that the game is in equilibrium */
checkFlag = 1;
}
}
}
void find2Strat(void){
int i, j;
double bOdds[2], rOdds[2], bProb, rProb, total, minus;
/* bOdds and rOdds contain the odds for each action for both players
* bProb, rProb are the probabilities for each player's first action
* total is used as sum of each player's odds in order to calculate
* the probability minus is the value of each player's second action
*/
for(i = 0; i < 2; i++){
bOdds[1 - i] = Pay[i][0] - Pay[i][1];
rOdds[1 - i] = Pay[0][i] - Pay[1][i];
}
/* fabs (from <math.h>) is used because the odds are doubles; abs
would truncate them to integers */
total = fabs(bOdds[0]) + fabs(bOdds[1]);
bProb = fabs(bOdds[0]) / total;
minus = 1 - bProb;
printf("\n%s's Mixed Strategy = %f Action 1 + %f Action 2\n",
Names[0], bProb, minus);
fprintf(toFile, "%s's Mixed Strategy = %f Action 1 + %f Action 2\n",
Names[0], bProb, minus);
total = fabs(rOdds[0]) + fabs(rOdds[1]);
rProb = fabs(rOdds[0]) / total;
minus = 1 - rProb;
printf("%s's Mixed Strategy = %f Action 1 + %f Action 2\n\n",
Names[1], rProb, minus);
fprintf(toFile, "%s's Mixed Strategy = %f Action 1 + %f Action 2\n",
Names[1], rProb, minus);
val = ((fabs(bOdds[0]) * Pay[0][0]) + (fabs(bOdds[1]) *
Pay[1][0])) / (fabs(bOdds[0]) + fabs(bOdds[1]));
}
Appendix B
Test Data
2x2 Tests
'2x4' Games
'4x2' Games
3x3+ Games
The output files can be found on the CD
Appendix C
Original Requirements
This document has been included to enable a comparison between the first and final
requirements documents.
Requirements (12.3.2007)
Functional Requirements
• Accept a text file, created by the user, containing the information for the game:
  The text file is a list of numbers and words separated by whitespace. It contains the
  player names, the number of actions and their names for both players and the payoffs
  for each action profile. This information will be ordered with Player 1's data first. E.g.
  the data for the matching pennies game:
  2 2 Tom Dave Heads Tails Heads Tails 1 -1 -1 1
• Check syntax of text file: The code expects to read two numbers first (the number
  of actions for each player). From these numbers it works out how many words should
  follow and how many integers come at the end. For the above data it will expect 6
  words to follow. There will always be two player names, followed by the action
  names, in this case 4. This number is the sum of the actions for each player,
  i.e. the two numbers read at the beginning of the file. It will then expect 4 integers.
  This is calculated by multiplying the number of actions for both players, in this case
  2x2.
    o Token check: The first check it will do is to make sure that, having split the
      file into tokens, the number of tokens is as expected. For the matching
      pennies game it would expect 12 tokens: 2 action counts, 2 names, 4 action
      names and 4 payoffs. If it finds more or fewer tokens than expected it will
      report an error via the command line reading ‘Error – Incorrect number of
      tokens. Make sure there is no unnecessary whitespace’ and exit.
    o Token type check: If the number of tokens is correct it will proceed to check
      the type of the tokens. It will accept anything for the names of players and
      actions, as restricting them to characters would be a mistake: players will not
      necessarily be human and may have a number in their name. The concern
      is that a character should not be found where an action count or payoff is
      expected. If it finds a problem it will report an error via the command line
      reading ‘Error – Unexpected type of Action count/Payoff’ and exit.
• Compute the mixed strategy Nash Equilibrium and output the results via the
  command line and write them to a text file: The output will give both players'
  strategies in this form – Player Name strategy = a Action 1 + b Action 2, where a and
  b are the probabilities for each action.
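
The two syntax checks described in the requirements above could be sketched roughly as follows. This is an illustration only and not the code that was finally written: the function names expectedTokens and isIntegerToken are hypothetical, and it assumes the file has already been split into whitespace-separated tokens.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: number of tokens the input file should contain
   when Player 1 has p1Actions actions and Player 2 has p2Actions. */
size_t expectedTokens(size_t p1Actions, size_t p2Actions){
    size_t counts  = 2;                     /* the two action counts */
    size_t names   = 2;                     /* the two player names */
    size_t actions = p1Actions + p2Actions; /* one name per action */
    size_t payoffs = p1Actions * p2Actions; /* one payoff per action profile */
    return counts + names + actions + payoffs;
}

/* Hypothetical helper: returns 1 if tok is a complete base-10 integer
   (as required for action counts and payoffs), 0 otherwise. */
int isIntegerToken(const char *tok){
    char *end;
    if(tok == NULL || *tok == '\0')
        return 0;
    (void)strtol(tok, &end, 10);
    /* Valid only if at least one character was consumed and nothing
       is left over after the number */
    return end != tok && *end == '\0';
}
```

For the matching pennies data, expectedTokens(2, 2) gives the 12 tokens mentioned above, and isIntegerToken rejects a word such as "Heads" where a payoff is expected.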
Appendix D
Prisoner's Dilemma
                       Suspect 2
                    Quiet     Fink
Suspect 1  Quiet    2, 2      0, 3
           Fink     3, 0      1, 1
EU1 (Quiet) = 2x
EU1 (Fink) = 3x + (1 - x)
EU2 (Quiet) = 2y
EU2 (Fink) = 3y + (1 - y)
Where x is the probability that Suspect 2 is quiet and y is the probability that Suspect 1 is
quiet.
For x: EU1 (Quiet) = EU1 (Fink)
2x = 3x + (1 - x)
2x = 2x + 1
For y: EU2 (Quiet) = EU2 (Fink)
2y = 3y + (1 - y)
2y = 2y + 1
Neither equation has a solution (each reduces to 0 = 1), so no probability makes either
suspect indifferent between the two actions. This game therefore has only the pure strategy
equilibrium (Fink, Fink). The players will always choose the action that maximises their
payoff, and the expected utility from finking exceeds the expected utility from staying quiet
by exactly 1 for both players, whatever the other suspect does. This means that x and y are
both 0.
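
As a small numerical check of this argument, the sketch below evaluates Suspect 1's two expected utilities over a range of values of x. The function names are hypothetical and the sampling is only an illustration, not a proof.

```c
/* Expected utilities for Suspect 1, where x is the probability that
   Suspect 2 stays quiet (taken from the equations above). */
double euQuiet(double x){ return 2.0 * x; }
double euFink(double x){ return 3.0 * x + (1.0 - x); }

/* Hypothetical helper: samples x = 0, 1/steps, ..., 1 and returns 1 if
   Fink is strictly better than Quiet at every sampled point. */
int finkDominates(int steps){
    int k;
    if(steps < 1)
        steps = 1; /* avoid dividing by zero */
    for(k = 0; k <= steps; k++){
        double x = (double)k / steps;
        if(euFink(x) <= euQuiet(x))
            return 0;
    }
    return 1;
}
```

Because euFink(x) - euQuiet(x) simplifies to 1 for every x, finkDominates returns 1 for any number of steps. The same holds for Suspect 2 with y in place of x.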
Appendix E
Matching Pennies
                       Player 2
                    Heads    Tails
Player 1   Heads      1       -1
           Tails     -1        1
EU1 (Heads) = x - (1 - x)
EU1 (Tails) = -x + (1 - x)
EU2 (Heads) = -y + (1 - y)
EU2 (Tails) = y - (1 - y)
Where x is the probability that Player 2 chooses heads and y is the probability that Player 1
chooses heads.
For x: EU1 (Heads) = EU1 (Tails)
x - (1 - x) = -x + (1 - x)
2x - 1 = 1 - 2x
4x = 2
x = 1/2
For y: EU2 (Heads) = EU2 (Tails)
-y + (1 - y) = y - (1 - y)
1 - 2y = 2y - 1
4y = 2
y = 1/2
This is a zero-sum game with each player having the same action choices. This situation
always results in the above probabilities but not all games, zero-sum or not, have the same
action choices. These games may have similarities between the probabilities but they will
have no useful link.
Using the other method the results are the same but are reached differently. The minmax
method for finding saddle points returns -1 (the largest row minimum) and 1 (the smallest
column maximum); because these differ there is no pure equilibrium. Finding the odds gives
2 and -2 for both players, therefore 2:2, or 1:1, which equates to 1/2 Action 1 + 1/2
Action 2.
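
The saddle point check and the odds calculation just described can be sketched for any 2x2 zero-sum game as follows. This is a simplified, self-contained illustration rather than the program's own code: the names hasSaddlePoint and solve2x2 are hypothetical, pay holds Player 1's payoffs, and solve2x2 assumes the game has no saddle point (otherwise the odds could sum to zero).

```c
#include <math.h>

static double min2(double a, double b){ return a < b ? a : b; }
static double max2(double a, double b){ return a > b ? a : b; }

/* Returns 1 if the largest row minimum equals the smallest column
   maximum, i.e. the game has a pure equilibrium. */
int hasSaddlePoint(double pay[2][2]){
    double maximin = max2(min2(pay[0][0], pay[0][1]),
                          min2(pay[1][0], pay[1][1]));
    double minimax = min2(max2(pay[0][0], pay[1][0]),
                          max2(pay[0][1], pay[1][1]));
    return maximin == minimax;
}

/* Odds method: the odds for a player's first action come from the
   payoff difference in the other row (or column). Stores the
   probability of each player's first action in *p1 and *p2 and
   returns the value of the game. */
double solve2x2(double pay[2][2], double *p1, double *p2){
    double rowOdds0 = fabs(pay[1][0] - pay[1][1]);
    double rowOdds1 = fabs(pay[0][0] - pay[0][1]);
    double colOdds0 = fabs(pay[0][1] - pay[1][1]);
    double colOdds1 = fabs(pay[0][0] - pay[1][0]);
    *p1 = rowOdds0 / (rowOdds0 + rowOdds1);
    *p2 = colOdds0 / (colOdds0 + colOdds1);
    return *p1 * pay[0][0] + (1.0 - *p1) * pay[1][0];
}
```

For matching pennies, with pay = {{1, -1}, {-1, 1}}, hasSaddlePoint returns 0 and solve2x2 gives a probability of 1/2 for each player's first action with a game value of 0.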
Appendix F
Rock Paper Scissors
                           Player 2
                      Rock     Paper    Scissors
Player 1  Rock        0, 0     0, 1     1, 0
          Paper       1, 0     0, 0     0, 1
          Scissors    0, 1     1, 0     0, 0
EU1 (Rock) = 1 - x - y
EU1 (Paper) = x
EU1 (Scissors) = y
EU2 (Rock) = 1 - w - z
EU2 (Paper) = w
EU2 (Scissors) = z
Where: x = P (Player 2 chooses Rock)
y = P (Player 2 chooses Paper)
w = P (Player 1 chooses Rock)
z = P (Player 1 chooses Paper)
For x and y: EU1 (Rock) = EU1 (Paper) = EU1 (Scissors)
x = y = 1 - x - y = 1/3
This answer is easy to compute because the probabilities must sum to 1; three equal
probabilities must therefore each be 1/3. The same holds for w and z.
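
The indifference condition can be checked numerically with a short sketch. The array P1PAY and the function expectedUtilities are hypothetical names; the payoffs are Player 1's entries from the Rock Paper Scissors table, with both players' actions ordered Rock, Paper, Scissors.

```c
/* Player 1's payoffs: rows are Player 1's action, columns are
   Player 2's action (Rock, Paper, Scissors). */
static const double P1PAY[3][3] = {
    {0, 0, 1},  /* Rock beats Scissors */
    {1, 0, 0},  /* Paper beats Rock */
    {0, 1, 0}   /* Scissors beats Paper */
};

/* Fills eu[i] with the expected utility of Player 1's action i when
   Player 2 plays Rock, Paper, Scissors with probabilities q[0..2]. */
void expectedUtilities(const double q[3], double eu[3]){
    int i, j;
    for(i = 0; i < 3; i++){
        eu[i] = 0.0;
        for(j = 0; j < 3; j++)
            eu[i] += P1PAY[i][j] * q[j];
    }
}
```

With q = {1/3, 1/3, 1/3} all three expected utilities are equal, so Player 1 has no reason to favour any action. With an uneven q such as {0.5, 0.25, 0.25} they differ (0.25, 0.5 and 0.25), matching EU1 (Rock) = 1 - x - y, EU1 (Paper) = x and EU1 (Scissors) = y.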
If you specify that the loser will actually lose one of the points they have earned earlier it is
possible to model this as a zero-sum game:
                           Player 2
                      Rock     Paper    Scissors
Player 1  Rock         0        -1        1
          Paper        1         0       -1
          Scissors    -1         1        0
but this will not change the result.
Appendix G
Gaussian Elimination
This is the method that I was planning on implementing to solve non-zero-sum games.
4x1 + 2x2 - x3 = 5
x1 + 4x2 + x3 = 12
2x1 - x2 + 4x3 = 12
Matrix A =
[ 4   2  -1 ]
[ 1   4   1 ]
[ 2  -1   4 ]
Matrix b =
[  5 ]
[ 12 ]
[ 12 ]
Matrix Ab =
[ 4   2  -1 |  5 ]
[ 1   4   1 | 12 ]
[ 2  -1   4 | 12 ]
R2: R2 - 1/4R1
R3: R3 - 1/2R1
[ 4    2    -1  |    5 ]
[ 0   7/2   5/4 | 43/4 ]
[ 0   -2    9/2 | 19/2 ]
R3: R3 + 4/7R2
[ 4    2     -1   |      5 ]
[ 0   7/2    5/4  |   43/4 ]
[ 0    0    73/14 | 219/14 ]
This process is known as forward elimination. At this point I know that (73/14)x3 = 219/14 and
therefore x3 = 3. It is now fairly easy to work out the other variables using back
substitution: the second row gives (7/2)x2 + (5/4)(3) = 43/4, so x2 = 2, and the first row
gives 4x1 + 2(2) - 3 = 5, so x1 = 1.
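
The worked example can be reproduced with a compact sketch of Gaussian elimination for a square system. This is an illustration under stated assumptions, not the prototype itself: the name gaussSolve and the fixed size N are hypothetical, and partial pivoting is applied column by column inside the elimination loop.

```c
#include <math.h>

#define N 3

/* Solves a*x = b for an N x N system. a and b are overwritten.
   Returns 0 on success or -1 if a zero pivot shows the matrix is
   singular. */
int gaussSolve(double a[N][N], double b[N], double x[N]){
    int i, j, k, p;
    double hold, m, sum;

    for(i = 0; i < N; i++){
        /* Partial pivoting: find the largest absolute value in
           column i on or below the diagonal */
        p = i;
        for(j = i + 1; j < N; j++)
            if(fabs(a[j][i]) > fabs(a[p][i]))
                p = j;
        if(a[p][i] == 0.0)
            return -1;
        if(p != i){
            for(k = 0; k < N; k++){
                hold = a[i][k]; a[i][k] = a[p][k]; a[p][k] = hold;
            }
            hold = b[i]; b[i] = b[p]; b[p] = hold;
        }
        /* Forward elimination: clear column i below the diagonal */
        for(j = i + 1; j < N; j++){
            m = a[j][i] / a[i][i];
            for(k = i; k < N; k++)
                a[j][k] -= m * a[i][k];
            b[j] -= m * b[i];
        }
    }
    /* Back substitution, starting from the last row */
    for(i = N - 1; i >= 0; i--){
        sum = 0.0;
        for(j = i + 1; j < N; j++)
            sum += a[i][j] * x[j];
        x[i] = (b[i] - sum) / a[i][i];
    }
    return 0;
}
```

Called with the matrix A and vector b above, gaussSolve should give x1 = 1, x2 = 2 and x3 = 3, up to floating point rounding.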
Appendix H
Solving Non-Zero-sum Games using Gaussian Elimination prototype code
This code was not completed due to time constraints. The idea was to find the mixed
strategies by solving linear systems of equations found from the expected utilities of both
players. I have removed the parts that are similar to my completed code such as the reading of
data.
void fillM(void){
int i, j;
/* Gets the coefficient values from P1 and P2 respectively */
for(i = 0; i < P1col; i++){
for(j = 0; j < (P2row - 1); j++)
A1[i][j] = P1[i][j] - P1[i][P2row - 1];
}
for(i = 0; i < P2row; i++){
for(j = 0; j < (P1col - 1); j++)
A2[i][j] = P2[i][j] - P2[i][P1col - 1];
}
/* Again gets the values from P1 and P2 for the B matrix */
for(i = 0; i < P1col; i++)
B1[i][0] = (P1[i][P1col - 1]) * -1;
for(i = 0; i < P2row; i++)
B2[i][0] = (P2[i][P2row - 1]) * -1;
}
void gElim(double **a, double **b, double **x, int rMax, int cMax){
int i, j, k, p;
double am, hold, sum, m, min, max;
/* Finds the smallest and largest of the two integer limiters passed
to the function. Used when the matrix is not square */
if(rMax > cMax){
max = rMax;
min = cMax;
}
else{
max = cMax;
min = rMax;
}
/* Partial Pivoting - Searches columns below the diagonal for larger
absolute values and stores the row number if one is found, P*/
for(i = 0; i < min; i++){
/* fabs (from <math.h>) is used because abs would truncate doubles */
am = fabs(a[i][i]);
p = i;
for(j = i + 1; j < rMax; j++){
if(fabs(a[j][i]) > am){
am = fabs(a[j][i]);
p = j;
}
}
/* Checks P to see if it needs to swap rows. Swaps the rows in the
coefficient and the b matrix */
if(p > i){
for(k = 0; k < cMax; k++){
hold = a[i][k];
a[i][k] = a[p][k];
a[p][k] = hold;
}
hold = b[i][0];
b[i][0] = b[p][0];
b[p][0] = hold;
}
}
/* Forward Elimination - Eliminates the values below the diagonal by
subtracting a scalar multiple of the pivot row from each row below it */
for(i = 0; i < cMax; i++){
for(j = i + 1; j < rMax; j++){
m = a[j][i] / a[i][i];
for(k = i + 1; k < cMax; k++)
a[j][k] = a[j][k] - m * a[i][k];
b[j][0] = b[j][0] - m * b[i][0];
}
}
/* Backward Elimination - Gets the final value for the x Matrix from
the bottom diagonal and then eliminates all upper diagonal elements
using the values from the row below */
x[rMax - 1][0] = b[rMax - 1][0] / a[rMax - 1][cMax - 1];
for(i = cMax - 2; i >= 0; i--){
sum = 0;
for(j = i + 1; j < cMax; j++)
sum = sum + a[i][j] * x[j][0];
x[i][0] = (b[i][0] - sum) / a[i][i];
}
/* Increments the static count used as ownership for the strategies
and resets sum to 0 so it can be used to find the value of the final
action*/
count = count + 1;
sum = 0;
printf("Player %d : ", count);
fprintf(toFile, "Player %d : ", count);
/* Takes the first rMax - 1 elements from the x Matrix and prints
them in the form of a strategy and writes them to Results.txt*/
for(i = 0; i < rMax - 1; i++){
if(x[i][0] == 0){
printf("0 A%d + ", i + 1);
fprintf(toFile, "0 A%d + ", i + 1);
}else{
printf("%f A%d + ", x[i][0], i + 1);
fprintf(toFile, "%f A%d + ", x[i][0], i + 1);
}
}
/* Calculates the value of the final action by subtracting the sum of
the other actions from 1, prints it and writes it to Results.txt */
for(i = 0; i < cMax - 1; i++)
sum = sum + x[i][0];
printf("%f A%d\n", 1 - sum, cMax);
fprintf(toFile, "%f A%d\n", 1 - sum, cMax);
}