A general technique for computing evolutionarily stable strategies based on errors in decision-making

John M. McNamara¹, James N. Webb¹, E.J. Collins¹, Tamás Székely² and Alasdair I. Houston²

¹ School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW
² School of Biological Sciences, University of Bristol, Woodland Road, Bristol BS8 1UG

Correspondence to John McNamara: email: [email protected]

running head: games with errors in decision making

Abstract

Realistic models of contests between animals will often involve a series of state-dependent decisions by the contestants. Computation of evolutionarily stable strategies for such state-dependent dynamic games is usually based on damped iterations of the best response map. Typically this map is discontinuous, so that iterations may not converge, and even if they do converge it may not be clear whether the limiting strategy is a Nash equilibrium. We present a general computational technique based on errors in decision making that removes these computational difficulties. We show that the technique works for a simple example (the Hawk-Dove game) where an analytic solution is known, and prove general results about the technique for more complex games. We also argue that there is biological justification for including the types of errors we have introduced.

1 Introduction

Frequency-dependent effects are ubiquitous in the natural world. To understand the action of natural selection in such circumstances, a game theoretic approach is required. The end point of the evolutionary process is an evolutionarily stable strategy or ESS (Maynard Smith 1982, Hammerstein 1996, Weissing 1996). Early game-theoretical models were usually highly schematic in that they considered a single decision by the contestants. Realistic models will often require the investigation of a series of state-dependent decisions by each contestant. A problem with such models is that they are difficult to analyse. In particular, the ESS usually has to be found by computation. The techniques used have not always yielded an ESS (e.g. Houston & McNamara 1987, Crowley & Hopper 1994, Holmgren & Hedenström 1995, Lucas & Howard 1995, Lucas et al. 1996), and it is then not clear whether there is an ESS that might be found by other methods or whether no ESS exists. For example, it is not clear which of these alternatives applies in the work of Lucas et al. (1996). The computational techniques used are based on finding the best response for a mutant in a resident population. Problems arise because the best response does not vary continuously with the behaviour of the resident population, i.e. the best response map is discontinuous.

In this paper we describe a computational technique for finding ESSs. It is based on the assumption that there are errors in the decisions made by an animal, but that the probability of an error decreases as its cost (in terms of reproductive success) increases. Once errors have been introduced, a suitably defined best response map is continuous. As we show, this smoothing of the best response map obviates many of the problems associated with computing ESSs. We argue that in addition to its computational advantages, the assumption of errors is also biologically realistic.

2 Computational Problems

An ESS analysis for a large population can be split into two components.

The environment.
The strategies adopted by members of the population together with the physical environment determine the environment experienced by population members.

Best mutant. Given an environment we can consider the fitness of all possible "mutant" strategies within this environment. A strategy maximising fitness will be referred to as a best mutant strategy. For some environments there is a unique best mutant; for others there is a set of best mutants, with all members of this set doing equally well.

Given these components we may define the best response map, B, as follows. Suppose almost all population members use strategy π. This resident strategy creates an environment. Let B(π) be the set of all best mutant strategies within this environment. We will call a strategy in B(π) a best response to π. When there is a unique best response to π we will, with a slight abuse of notation, denote the best response by B(π). A necessary condition for a strategy π* to be an ESS is that it is a Nash equilibrium; i.e. π* is a best response to itself:

    π* = B(π*).    (1)

Computations typically seek to find a solution to equation (1), and it is this computational problem we will focus on here. Since conditions for evolutionary stability are stronger than the Nash equilibrium conditions, having found a solution to (1) one must then further investigate the stability of the solution.

The Hawk-Dove Game. Concepts and computational problems are illustrated using the following standard example of the Hawk-Dove game (Maynard Smith & Price 1973, Maynard Smith 1982). The Hawk-Dove game can be solved analytically, but we can use this simple game to illustrate problems which occur when attempting to compute Nash equilibria for more complex games where an analytic solution is not possible.

Two animals contest a resource of value V. Each must decide whether to be aggressive (i.e. play "Hawk") or display (i.e. play "Dove"). Each animal makes its choice before it knows the choice of its opponent. If a Hawk meets a Dove the Hawk wins the resource. If two Doves meet, they share the resource. If two Hawks meet, each wins the resource with probability 1/2. The loser pays a cost C, representing the cost of injury (we assume 0 < V < C).

For the Hawk-Dove game a strategy is specified by a number π, where 0 ≤ π ≤ 1. Under strategy π, an animal plays Dove with probability 1 − π and plays Hawk with probability π. Let W(π′, π) be the payoff to a mutant strategy π′ when the resident population strategy is π. Then

    W(π′, π) = (1 − π′)(1 − π)(V/2) + π′(1 − π)V + π′π(1/2)(V − C)

so that

    W(π′, π) = (1 − π)(V/2) + (1/2)(V − πC)π′.    (2)

This payoff is an increasing function of π′ when V − πC > 0 and a decreasing function when V − πC < 0. Thus the best response map is given by

    B(π) = 1 if 0 ≤ π < V/C    (3)
    B(π) is the set of all strategies when π = V/C    (4)
    B(π) = 0 if V/C < π ≤ 1.    (5)

From equations (3) - (5) it can be seen that the strategy π* = V/C is the unique strategy satisfying condition (1). It can easily be verified that π* also satisfies the stronger ESS condition of Maynard Smith (1982).

Although the Hawk-Dove game can be solved analytically, complex games usually need to be solved by numerical computation. Typically, numerical methods employ an iterative scheme which generates a sequence of strategies π0, π1, π2, . . .. Hopefully this sequence converges to a solution, π*, of equation (1). The simplest scheme is as follows.

Iteration of the best response map. Take any strategy as the initial strategy π0.
Set π1 = B(π0), set π2 = B(π1), and so on. In this way one obtains a sequence π0, π1, π2, . . . of strategies where each strategy is the best response to the previous strategy in the sequence.

The sequence obtained by the above scheme may converge to a strategy π* which is the best response to itself and so satisfies equation (1); this method has been successfully used to find the solution of a dynamic game (Houston & McNamara 1988). Often, however, the sequence π0, π1, π2, . . . fails to converge at all. To illustrate failure of convergence consider the Hawk-Dove game. Suppose that the initial choice of strategy π0 satisfies π0 < V/C. Then by equations (3) and (5) π1 = B(π0) = 1, π2 = B(π1) = 0, π3 = B(π2) = 1, etc. The sequence obtained is thus π0, 1, 0, 1, 0, 1, 0, . . .. Similarly if π0 > V/C the sequence is π0, 0, 1, 0, 1, 0, 1, . . .. Thus the sequence π0, π1, π2, . . . never converges unless one has been fortunate enough to choose π0 exactly equal to V/C.

One can attempt to stop the sequence of strategies π0, π1, π2, . . . from oscillating by modifying the above iterative scheme as follows.

Best response with damping. Let λ lie in the range 0 < λ ≤ 1. Construct the sequence π0, π1, π2, . . . by taking πn to be the randomised strategy which chooses strategy πn−1 with probability 1 − λ and strategy B(πn−1) with probability λ. With this interpretation we write

    πn = (1 − λ)πn−1 + λB(πn−1).    (6)

The previous scheme is obtained by setting λ = 1. We can give a loose interpretation of this iterative scheme by supposing that we are really following the evolution of a population and that πn is the resident population strategy in generation n. Equation (6) then says that generation n is formed from generation n − 1 by replacing a proportion λ of the population in generation n − 1 by individuals whose behavioural strategy was the fittest mutant strategy in generation n − 1. When λ = 1 there is a complete replacement of the population in each generation, and not surprisingly the evolutionary process can oscillate. Replacing only a small fraction of the population tends to stabilise the evolutionary process. Computations of some complex games (e.g. Houston & McNamara 1988, Lucas & Howard 1995, Lucas et al. 1996) illustrate cases in which the sequence π0, π1, π2, . . . oscillates for λ = 1 but converges to a solution π* of equation (1) for λ sufficiently small.

It is easy, however, to find examples for which the scheme (6) fails to work no matter how small a value of λ is used. The Hawk-Dove game provides such an example. For the Hawk-Dove game the sequence π0, π1, π2, . . . of strategies fails to converge for all but a countable non-generic set of π0 (Appendix 1). Table 1 illustrates this effect.
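To make the behaviour of this scheme concrete, the following minimal sketch (our illustration, not the program behind Table 1; the tie-breaking rule at π = V/C is an arbitrary choice) applies scheme (6) to the Hawk-Dove game with V/C = 1/√2, the case shown in Table 1.

```python
# A minimal sketch of the damped best-response iteration, scheme (6),
# applied to the Hawk-Dove game. V and C are chosen so that
# pi* = V/C = 1/sqrt(2), as in Table 1.

import math

V, C = math.sqrt(2.0), 2.0
PI_STAR = V / C

def best_response(pi):
    # Equations (3)-(5); at pi = V/C every strategy is a best response,
    # and we arbitrarily return 0 there.
    return 1.0 if pi < PI_STAR else 0.0

def damped_iteration(pi0, lam, n_steps):
    # pi_n = (1 - lam) pi_{n-1} + lam B(pi_{n-1}), equation (6)
    pi = pi0
    for _ in range(n_steps):
        pi = (1.0 - lam) * pi + lam * best_response(pi)
    return pi

# With lam = 0.1 the iterates never settle down: after many steps they
# still cycle in a band around pi* (cf. the first column of Table 1).
for n in (50, 51, 52, 53):
    print(n, damped_iteration(0.0, 0.1, n))
```

The persistent band of oscillation is exactly what inequality (A1.6) of Appendix 1 predicts: with constant λ, successive iterates remain a bounded distance apart.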
An obvious modification of the scheme (6) allows λ to depend on n.

Best response with increased damping. Let

    πn = (1 − λn)πn−1 + λnB(πn−1)    (7)

where the sequence λ0, λ1, λ2, . . . satisfies λn > λn+1 and λn → 0 as n → ∞.

Applying this scheme to the Hawk-Dove game always produces a sequence π0, π1, π2, . . . of strategies which converges (Appendix 1). If the sequence of λn's tends to zero too rapidly, the limit of the sequence of strategies may not be the ESS strategy π* = V/C (Table 1). This is because excessive damping does not allow the population to "evolve". In particular it is shown in Appendix 1 that if

    ∑_{n=1}^{∞} λn < ∞    (8)

then there is no initial π0 such that πn → V/C for all possible V/C. Conversely, provided the λn's tend to zero slowly enough that

    ∑_{n=1}^{∞} λn = ∞    (9)

the sequence π0, π1, π2, . . . of strategies will converge to the ESS strategy π* = V/C whatever the π0 chosen and whatever the value of V/C (Appendix 1).

Although the numerical scheme (7) works for the Hawk-Dove game when condition (9) holds, there are problems with using it as a general method. First, even if πn tends to π* as n tends to infinity, convergence is liable to be very slow. This is because condition (9) means that λn tends to zero very slowly as n tends to infinity. Table 2 illustrates convergence of πn to π* in the Hawk-Dove game when λn = 1/n. As can be seen, even after 1000 iterations |πn − π*| is still of the order of 5 × 10⁻⁴.

A second drawback of the scheme given by (7) is that there is no reason to suppose that the scheme works for complex problems. The Hawk-Dove game is rather special in that the set of possible strategies is the one-dimensional interval [0, 1]. For complex games the set of possible strategies is typically a subset of Rⁿ for large n, and the analytic argument, valid for the Hawk-Dove game and presented in Appendix 1, is not applicable.

Thirdly, if a sequence π0, π1, π2, . . . of strategies is calculated, how do we recognise that the sequence is converging to a limit? Of course, since in finite time one can only compute a finite number of strategies to some prescribed limit of accuracy, one never knows. It seems a reasonable working practice to assume that convergence is occurring when, say, there is a θ < 1 such that

    |πn − πn−1| < θⁿ    (10)

appears to be true for all sufficiently large n. (Here |πn − πn−1| is some appropriate measure of the distance between strategies πn and πn−1.) Although this criterion is usually satisfied when a scheme with constant λ converges, when there is increased damping and the damping used satisfies condition (9), |πn − πn−1| will typically tend to zero too slowly for (10) to hold. It is then much more difficult to decide whether the sequence π0, π1, π2, . . . is really converging.

The final difficulty may arise with any scheme in which a sequence π0, π1, π2, . . . of strategies is calculated. Suppose that a numerical calculation yields a sequence which appears to be converging. If we accept that the sequence really is converging to a limiting strategy π*, how do we know that π* satisfies equation (1)? It may be that the scheme employed forces convergence, as with Table 1. We might have confidence that the limit π* satisfies π* = B(π*) if the calculation also suggests that

    |πn − B(πn)| → 0    (11)

as n → ∞. But for many games the best response map is discontinuous at the ESS, so that we would not expect condition (11) to hold even when π* = B(π*). The Hawk-Dove game illustrates this point. Table 2 shows a case in which the sequence {πn} converges to the ESS strategy π* = V/C, but the sequence {B(πn)} does not converge. For the Hawk-Dove game analytic arguments tell us that the ESS is V/C. If we did not have these analytic arguments but only had the numerical results in Table 2, then even if we were prepared to accept that {πn} was converging, we would not know from Table 2 that the limit π* satisfied equation (1).

3 Introducing Errors into Decision Making

We now introduce errors into the choice of action made by contestants in a game. We will do so in such a way that the probability of costly errors is small, while the probability of errors with virtually no cost is large.
The resulting "best response with error" turns out to be a much better behaved function than the corresponding best response function without error. Consequently, many of the computational problems described in the previous section disappear. In this section we restrict attention to the case where contestants make a single choice of action and all contestants are in the same state. Games in which contestants make sequences of state-dependent choices are discussed later.

Consider a game in which contestants make a choice between the K actions a1, a2, . . . , aK. A strategy, π, for this game is a vector π = (p1(π), p2(π), . . . , pK(π)), where pi(π) is the probability that action ai is chosen under π. We assume a large population, and refer to a strategy as the resident population strategy if almost all population members adopt this strategy. Suppose the resident population strategy is π. Let Wi(π) denote the expected reproductive value of an individual which chooses action ai within this population. Under an optimal choice of action the expected reproductive value is then

    Ŵ(π) = max_{1≤i≤K} Wi(π).    (12)

For each i let

    Ci(π) = Ŵ(π) − Wi(π).    (13)

Then Ci(π) = 0 if the choice of action ai is optimal and Ci(π) > 0 if the choice of action ai is suboptimal. The quantity Ci(π) is a measure of the loss in reproductive value as a result of choosing action ai, and is referred to by McNamara and Houston (1986) as the canonical cost of action ai.

To introduce errors let H1 be a function of a non-negative real variable which satisfies

    H1(x) > 0 for x ≥ 0    (14)
    H1(x) is continuous and strictly decreasing in x    (15)
    H1(x) → 0 as x → ∞.    (16)

For some applications it is also necessary to ensure that H1 is sufficiently smooth by imposing a condition such as

    |H1(x) − H1(y)| ≤ H1(0)|x − y| for all x, y.    (17)

Let the resident population strategy be π. Then we can assign weight βi(π) = H1(Ci(π)) to the choice of action i within this population. Let

    p̂i(π) = βi(π) / ∑_{j=1}^{K} βj(π).    (18)

Then the best response, B1(π), to π with error function H1 is defined to be the strategy which chooses action i with probability p̂i(π); i.e. B1(π) = (p̂1(π), p̂2(π), . . . , p̂K(π)). Under strategy B1(π) there is a positive probability of choosing each action. The optimal action is chosen with the highest probability, and action ai is more likely to be chosen than action aj if the canonical cost of choosing ai is less than the canonical cost of choosing aj. As the canonical cost of choosing an action increases, the probability that the action is chosen decreases, tending to zero as the cost tends to infinity.

To control the amount of error for given canonical costs we introduce a parameter δ, where δ > 0. Define the function Hδ by

    Hδ(x) = H1(x/δ).    (19)

For example if H1(x) = e^{−x}, then Hδ(x) = e^{−x/δ}. Then we can use Hδ rather than H1 to generate errors. By setting βi^{(δ)}(π) = Hδ(Ci(π)) and p̂i^{(δ)}(π) = βi^{(δ)}(π) / ∑_{j=1}^{K} βj^{(δ)}(π) we obtain the best response with error function Hδ given by Bδ(π) = (p̂1^{(δ)}(π), p̂2^{(δ)}(π), . . . , p̂K^{(δ)}(π)). For given canonical costs the probability of error declines as δ decreases. In particular, if an action ai has positive canonical cost, then the probability this action is chosen tends to zero as δ tends to zero.
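The computation of Bδ(π) from a vector of payoffs is straightforward. The sketch below (our illustration; the exponential choice H1(x) = e^{−x} and the payoff values are assumptions) implements equations (12), (13), (18) and (19):

```python
# Sketch of the best response with error, equations (12)-(19), for a
# single-decision game. payoffs[i] plays the role of W_i(pi) for the
# current resident strategy; H1 is assumed exponential.

import math

def best_response_with_error(payoffs, delta):
    """Return the probabilities p_hat_i of equation (18), using
    H_delta(x) = exp(-x/delta), i.e. H1(x) = exp(-x) in equation (19)."""
    w_hat = max(payoffs)                           # equation (12)
    costs = [w_hat - w for w in payoffs]           # canonical costs, eq. (13)
    weights = [math.exp(-c / delta) for c in costs]  # strictly positive, cf. (14)
    total = sum(weights)
    return [b / total for b in weights]            # equation (18)

# Example with three actions: the optimal action gets the highest
# probability, and costly errors become rarer as delta shrinks.
print(best_response_with_error([1.0, 0.9, 0.2], delta=0.1))
print(best_response_with_error([1.0, 0.9, 0.2], delta=0.01))
```

With this exponential choice of H1 the rule is a "softmax" of the payoffs with temperature δ, since p̂i ∝ e^{−Ci(π)/δ} ∝ e^{Wi(π)/δ}.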
For given δ the best response with error function Hδ has two especially useful properties.

Property 1. Bδ(π) is uniquely defined for each strategy π.

This contrasts with the best response without error, where there may be many best responses to a strategy π.

To obtain the second property we assume that the payoffs Wi(π) are continuous functions of π for each i. This is likely to be true for any reasonable game. Assuming it does hold we have

Property 2. Bδ(π) is a continuous function of π.

Again this contrasts with the best response without error, which may be discontinuous, particularly at an ESS.

We define the ESS with error function Hδ to be a strategy πδ* which satisfies

    Bδ(πδ*) = πδ*;    (20)

that is, πδ* is the best response to itself with error function Hδ. The continuity of Bδ (Property 2) allows us to deduce two general properties of πδ*.

Property 3. There is always at least one solution to equation (20).

Property 4. Let πn be a sequence of strategies which converges to some limiting strategy, π∞. Then π∞ = πδ* if and only if

    |πn − Bδ(πn)| → 0 as n → ∞.    (21)

Suppose that in a game without error a sequence of iterates π0, π1, π2, . . . is calculated. Suppose also that the sequence appears to be converging. We noted previously that, assuming the sequence is converging, it is difficult to know whether convergence is to an ESS. The main problem is that, even if π0, π1, π2, . . . is converging to an ESS π*, the sequence of best responses B(π0), B(π1), B(π2), . . . may not converge, or may converge to a limit other than π*. Property 4 shows that these difficulties disappear when attempting to calculate an ESS with error, since criterion (21) gives a necessary and sufficient condition that convergence is to the correct limit.

Other properties of an ESS with error may depend on the specific nature of the game under consideration. For any particular game one would like to know which iterative schemes are liable to work in finding πδ*, and which of the ESSs without error are limits, as δ tends to 0, of ESSs with error. In this paper we do not attempt a general analysis of these issues, but look at them in detail in the Hawk-Dove game.

4 The Hawk-Dove Game with Error

When each contestant must choose between just two actions, as in the Hawk-Dove game, the formulae of the previous section can be re-expressed in a different form. We present this new form, indicate how it is related to the previous definitions, and then use the new form to analyse decision errors in the Hawk-Dove game.

We begin by defining a class, {Gδ}, of error functions. Let G1 be a function of a real variable which satisfies:

    0 < G1(x) < 1 for all x.    (22)
    G1 is continuous and strictly increasing.    (23)
    G1(x) + G1(−x) = 1 (and hence G1(0) = 1/2).    (24)
    G1(x) → 1 as x → ∞.    (25)
    |G1(x) − G1(y)| ≤ |x − y| for all x, y.    (26)

Now for each δ in the range 0 < δ ≤ 1 define the error function Gδ by

    Gδ(x) = G1(x/δ).    (27)

As δ decreases, the function Gδ becomes more step-like (Figure 1). Formally Gδ(x) → G0(x) as δ → 0, where by G0 we mean the function

    G0(x) = 0 for x < 0;  G0(x) = 1/2 for x = 0;  G0(x) = 1 for x > 0.    (28)

Now suppose that there are just two actions, a1 and a2, to choose from in a game. A strategy is thus defined by a pair of numbers π = (p1, p2). However, since p1 + p2 = 1 we can define a strategy by a single number π, where π is the probability of choosing action a2. As before let Wi(π) be the expected reproductive value of an individual which plays action i when the resident population strategy is π.
In the absence of errors the best mutant response to resident population strategy π is to choose action a2 with probability 0 when W2(π) − W1(π) < 0 and to choose action a2 with probability 1 when W2(π) − W1(π) > 0. In other words the best mutant response is to choose action a2 with probability

    G0(W2(π) − W1(π)),    (29)

where the function G0 is given by equation (28). Now let Gδ be an error function as defined above. Motivated by formula (29) we define the best response to π with error function Gδ to be the strategy under which action a2 is chosen with probability

    Bδ(π) = Gδ(W2(π) − W1(π)).    (30)

An ESS with error function Gδ is then a strategy πδ* satisfying

    Bδ(πδ*) = πδ*.    (31)

Although we have chosen to define the best response with error here in a seemingly different way to its definition in the last section, it is not difficult to show that the two definitions agree provided the functions G1 and H1 are related by

    G1(x) = H1(0) / (H1(0) + H1(x)) for x ≥ 0,    (32)

with G1(x) for x < 0 being given by condition (24). For example when H1(x) = e^{−x} we have

    G1(x) = 1 / (1 + e^{−x}), −∞ < x < ∞.    (33)

To introduce errors into the Hawk-Dove game we can equate "play Dove" with action a1 and "play Hawk" with action a2. By equation (2) we then have

    W2(π) − W1(π) = (1/2)(V − πC),    (34)

so that if the resident population plays Hawk with probability π the best response with error function Gδ is to play Hawk with probability

    Bδ(π) = Gδ((1/2)(V − πC)).    (35)

Unlike the best response in the standard Hawk-Dove game, the best response with error is a uniquely defined strategy. Under this strategy errors are made, but the probability of error decreases with the fitness cost of the error and tends to zero as the fitness cost increases. For a given fitness cost, the probability of error decreases as δ decreases, and tends to zero as δ → 0. Figure 2 shows the best response with error as a function of the resident population strategy π.

For the Hawk-Dove game it is easy to show that, for a given function Gδ, equation (31) has a unique solution (Appendix 2). Thus there is a unique ESS with error for each function Gδ. For given Gδ one can show that:

    If 0 < V/C < 1/2 then V/C < πδ* < 1/2.    (36)
    If V/C = 1/2 then πδ* = V/C.    (37)
    If 1/2 < V/C < 1 then 1/2 < πδ* < V/C.    (38)

(Appendix 2). Thus πδ* always lies between 1/2 and the ESS without error, π* = V/C. One might hope that as the probability of error decreases, the ESS with error tends to the ESS without error. In Appendix 2 it is shown that this is indeed the case:

    πδ* → V/C as δ → 0.    (39)

Figure 3 shows the dependence of πδ* on δ for two different initial error functions G1.

In the Hawk-Dove game without error the best response, B(π), is a discontinuous function of the resident population strategy, π. As we have seen, the jump discontinuity at π = V/C leads to problems when attempting to compute the ESS by numerical schemes based on iteration of the best response map. When there is error, the best response Bδ(π) is a continuous function of π for all π. Consequently, in computing πδ* almost all the previous problems disappear. Let the sequence of strategies π0, π1, π2, . . . be given by

    πn = (1 − λ)πn−1 + λBδ(πn−1)    (40)

(cf. equation (6)). Here, as before, the replacement factor λ satisfies 0 < λ ≤ 1. It can then be shown (Appendix 2) that

    πn → πδ* as n → ∞    (41)

provided that

    0 < λ ≤ δ/(1 + C/4).    (42)

Thus, unlike the case without error, the iterative scheme with a fixed level of damping works provided that there is sufficient damping.
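The following sketch (ours; the logistic error function of equation (33) and the values V = 2, C = 3, δ = 0.1 are illustrative) implements scheme (40) with a damping level satisfying inequality (42):

```python
# Sketch of scheme (40) for the Hawk-Dove game with error, using the
# logistic error function G_delta(x) = 1/(1 + exp(-x/delta)) of eq. (33).

import math

V, C = 2.0, 3.0        # illustrative values, as in Figures 2 and 3
delta = 0.1            # error parameter

def B_delta(pi):
    # Equation (35): best response with error to resident strategy pi.
    return 1.0 / (1.0 + math.exp(-0.5 * (V - pi * C) / delta))

lam = delta / (1.0 + C / 4.0)   # damping satisfying condition (42)

pi = 0.0
while abs(pi - B_delta(pi)) > 1e-10:          # criterion (21) as stopping rule
    pi = (1.0 - lam) * pi + lam * B_delta(pi) # equation (40)

print("ESS with error:", pi)        # pi*_delta, a little below V/C here
print("ESS without error:", V / C)  # pi* = 2/3
```

The stopping rule uses criterion (21), which, as shown in Appendix 2, holds if and only if the sequence converges to πδ*; unlike criterion (10), it is therefore a principled test of convergence to the correct limit.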
As inequality (42) shows, the level of damping required depends on the amount of error. When the probability of error decreases (δ decreases), the amount of damping required increases (the replacement factor λ decreases).

Finally, suppose an iterative scheme is employed in an attempt to calculate πδ*, and suppose the scheme generates the sequence of strategies π0, π1, π2, . . .. Then it can be shown that this sequence converges to πδ* if and only if condition (21) holds (Appendix 2). This is a stronger result than the general result given in Property 4, as we do not demand that the sequence π0, π1, π2, . . . is already known to be converging.

5 State-dependent Dynamic Games

Many realistic game-theoretic models of animal behaviour consider animals that engage in repeated interactions whose consequences for an animal depend on its state. For example, consider the behaviour of a small bird in winter that has to engage in contests with other birds in order to gain access to food. During a winter's day, a bird will typically be involved in many contests. In each contest, the bird has to decide on its level of aggression. In this context, a strategy is a rule that specifies how a bird's level of aggression depends on its energetic reserves and time. Given a resident population strategy, we can determine the best response for a mutant bird. To do so, however, we cannot start by considering contests in isolation from one another. This is because the value to the mutant of winning a contest and hence obtaining food will depend on the amount of food that it is likely to obtain in the future. This future food will depend on both the mutant's behaviour and the behaviour of other members of the population. At a given time in the future, the level of aggression shown by other population members will depend both on the population strategy at that time (which specifies how aggression depends on reserves) and on reserves at that time, which depend on the behaviour of all population members prior to this time. An example in which each contest over food is modelled as a Hawk-Dove game is analysed by Houston & McNamara (1988) and McNamara et al. (1991).

This dependence of the best current action on both the behaviour of the focal individual in the future and the resident's behaviour at past and future times occurs in many games of biological interest. Examples include information exchange during extended contests (Enquist & Leimar 1983, 1987, Leimar & Enquist 1984), calling to attract mates (Houston & McNamara 1987, Lucas & Howard 1995, Lucas et al. 1996) and growth and cannibalism (Crowley & Hopper 1994). We now present a general model that applies to all of these cases, and provide a framework for introducing errors in such games.

The Model

We model behaviour over a finite time interval with decision epochs t = 0, 1, 2, . . . , T − 1. For each fixed t the set of possible states, E(t), of an organism is finite. For each fixed time t and state x the set of available actions, A(x, t), is finite. A strategy, π, is a Markov rule for choosing actions as a function of state and time. This rule may be probabilistic, so that for each state, x, and time, t, π specifies the probability, pi(x, t; π), that each action ai is chosen.

Suppose that the resident strategy is π and consider a single individual in this population following a possibly different strategy. Suppose this individual chooses action ai at time t when in state x.
Then the individual obtains an immediate contribution ri(x, t; π) to its reproductive success, and is in state y at time t + 1 with probability γi(x, y, t; π). If the individual is in state xT at final time T, its reproductive value is R(xT; π). The total payoff to the individual is the expected sum of the immediate contributions to reproductive success at times t = 0, 1, 2, . . . , T − 1 and the final reproductive value at time T.

The best response with error

We introduce errors in decision making into this game by assuming that an organism has a probability of making an error in every state and at every time. As before, the probability of making an error depends on the cost of the error. Costs are, however, dependent on future expectations and hence on errors in decision making in the future. We thus find the best response with error by working backwards, as in dynamic programming.

Take an error function Hδ given by equations (19) and (14) - (17). Let π be a resident population strategy. The best response, Bδ(π), to π with error function Hδ is defined inductively by working backwards from final time T. Suppose that behaviour under Bδ(π) has already been found for every state at times t + 1, t + 2, . . . , T − 1. An individual which uses strategy Bδ(π) from time t + 1 onwards has reproductive value W(y, t + 1; π) at time t + 1, where y is its state at this time. Now focus on an individual in state x at time t. If this individual chooses action ai at this time and then uses strategy Bδ(π) from time t + 1 onwards, its reproductive value is

    Wi(x, t; π) = ri(x, t; π) + ∑_y γi(x, y, t; π)W(y, t + 1; π);    (43)

here the sum is over all possible states y at time t + 1. Set

    Ŵ(x, t; π) = max_i Wi(x, t; π)    (44)

(cf. equation (12)) and set

    Ci(x, t; π) = Ŵ(x, t; π) − Wi(x, t; π)    (45)

(cf. equation (13)). Then we can assign a weight

    βi(x, t; π) = Hδ(Ci(x, t; π))    (46)

to the choice of action ai. Strategy Bδ(π) prescribes that this focal individual chooses action ai at time t with probability

    p̂i(x, t; π) = βi(x, t; π) / ∑_j βj(x, t; π)    (47)

(cf. equation (18)). The reproductive value of this individual is thus

    W(x, t; π) = ∑_i p̂i(x, t; π)Wi(x, t; π).    (48)

Equations (43) - (48) define W at time t in terms of W at time t + 1. Since

    W(x, T; π) = R(x; π)    (49)

we can find W(x, t; π) for all states x and all times t by backward induction. Equation (47) then specifies the action chosen under Bδ(π) for every state and time.

Properties of the best response function

Let Bδ, defined by equation (47) above, be the best response function for our general state-dependent dynamic game. Property 1, that Bδ(π) is uniquely defined for each strategy π, follows directly from equations (43) - (49) and the strict positivity of Hδ. For Property 2, that Bδ(π) is a continuous function of π, we require continuity conditions on the immediate reward functions and transition functions defined above. Sufficient conditions are given in Appendix 3. The general versions of Properties 3 and 4 then follow directly from the continuity of Bδ (Appendix 3).

If we have found a convergent sequence of strategies πn which converges to some limiting strategy π∞, then Property 4 enables us to check easily whether or not π∞ provides a Nash equilibrium. We do not address the problem of finding such a convergent sequence for the general state-dependent dynamic game in this paper.
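A skeleton of this backward induction is sketched below. It is our illustration only: the `game` object with methods `states`, `actions`, `r`, `gamma` and `R` is a hypothetical interface for the quantities E(t), A(x, t), ri, γi and R under a fixed resident strategy π, and the exponential H1(x) = e^{−x} is assumed.

```python
# Skeleton of the backward induction defining the best response with
# error, equations (43)-(49). The `game` object is a hypothetical
# interface supplying E(t), A(x,t), r_i, gamma_i and R for a fixed
# resident strategy pi; H_delta is exponential, as in equation (19).

import math

def best_response_with_error(game, T, delta):
    W = {(x, T): game.R(x) for x in game.states(T)}   # eq. (49)
    policy = {}
    for t in range(T - 1, -1, -1):
        for x in game.states(t):
            actions = game.actions(x, t)
            # eq. (43): value of each action given error-prone future play
            W_i = [game.r(x, t, a) +
                   sum(game.gamma(x, y, t, a) * W[(y, t + 1)]
                       for y in game.states(t + 1))
                   for a in actions]
            w_hat = max(W_i)                                      # eq. (44)
            beta = [math.exp(-(w_hat - w) / delta) for w in W_i]  # eqs. (45)-(46)
            total = sum(beta)                                     # >= 1, since the
            p_hat = [b / total for b in beta]                     # best action has
            policy[(x, t)] = dict(zip(actions, p_hat))            # weight 1; eq. (47)
            W[(x, t)] = sum(p * w for p, w in zip(p_hat, W_i))    # eq. (48)
    return policy, W
```

The returned `policy` holds the choice probabilities p̂i(x, t; π) of equation (47) for every state and time, and the table `W` holds the corresponding reproductive values W(x, t; π).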
An example: Brood desertion. We illustrate the model with an example based on brood desertion. In some species, if the brood receives no care it is likely to die, whereas the extra advantage of biparental care over uniparental care may be small. Under these circumstances, it might be worthwhile for a parent to desert in order to try to mate and obtain another brood. The best decision for one parent clearly depends on the behaviour of that parent's partner. It also depends on how easy it is to get further mates and on future desertion decisions. The probability of future matings depends on the number of males and females looking for mates, which in turn depends on the previous desertion decisions of all population members.

Székely et al. (unpub) present a particular model of desertion that is relevant to birds. Females differ in terms of the clutch size that they lay. Once a clutch is laid, both the male and the female must decide whether to care for the brood or desert. For each sex a strategy specifies how the desertion decision depends on clutch size and time during the breeding season. For some parameter values, the iteration scheme based on the best response with a fixed level of damping fails to converge whatever the level of damping. This occurs because optimal desertion decisions depend discontinuously on the sex ratio amongst birds looking for mates, so that the best response map is discontinuous. Introducing errors into decision making in the manner explained above smooths the best response map, so that the iterative scheme now converges provided there is sufficient damping. As with the Hawk-Dove game, the level of damping required decreases as the probability of error increases, as is shown in Figure 4. The figure also shows that when the replacement factor λ is small (high damping), convergence is slow even when it occurs.

6 Discussion

Previous analyses of dynamic games in a biological context have typically used some damped iteration of the best response map in the search for Nash equilibria. This technique has not always been successful (e.g. Houston & McNamara 1987, Crowley & Hopper 1994, Holmgren & Hedenström 1995, Lucas & Howard 1995, Lucas et al. 1996, Johnstone pers. comm.). The difficulties that we experienced in our attempt to find equilibria in dynamic models of desertion led us to develop the computational technique based on errors that is presented here. The technique is very general and applicable to any game. Introducing errors in the way that we describe will eliminate discontinuities in the best response map. Another advantage is that errors should obviate difficulties that arise from representing state variables on a discrete grid. Using the technique in desertion games, we have always been able to find equilibria. The technique has also been useful in other games (Johnstone 1997, Henson pers. comm., Lucas pers. comm.).

Although we have introduced the idea of errors in decision-making as a computational tool for finding ESSs, there is an alternative, biological, justification for their use. It is clearly unreasonable to expect animals to behave in exactly the way predicted by the simple models used in behavioural ecology. In the context of prey choice, models typically predict that a prey type is either always accepted or always rejected (i.e. choice is all-or-none), whereas the data show that under given conditions a prey type may be sometimes accepted and sometimes rejected, i.e. animals show partial preferences (McNamara & Houston, 1987).
There is, however, a pattern to the deviations from the predictions of optimisation models: costly deviations tend to be rare (see Houston 1987 for a review of some examples). It is precisely this aspect of real behaviour which is captured by our procedure.

Economists have introduced errors into games not as a computational tool but to eliminate Nash equilibria which are thought to be unrealistic (see, for example, Fudenberg & Tirole 1991). In particular, the definition of a proper Nash equilibrium (Myerson 1978) is based on a similar concept to our ESS with error. Biological games may also have Nash equilibria which are unrealistic in that they disappear when errors are introduced. Errors have been used in biology to stabilise or otherwise resolve situations where drift may erode neutrally stable solutions (cf. Parker & Rubenstein 1981; Hammerstein & Parker 1982). Thus, given that animals make errors, our technique is both computationally useful and ensures that the predicted outcomes are biologically realistic.

Acknowledgements

JNW was supported by a BBSRC grant to JMMcN, AIH and EJC. TS was supported by a Leverhulme Grant to AIH, Innes Cuthill and JMMcN.

Appendix 1. Iterative Scheme for the Hawk-Dove Game

We consider the scheme

    πn = (1 − λn)πn−1 + λn if πn−1 < π*    (A1.1)
    πn = πn−1 if πn−1 = π*    (A1.2)
    πn = (1 − λn)πn−1 if πn−1 > π*,    (A1.3)

where π* = V/C. Three cases are analysed.

Case I. λn constant. Let λn = λ where 0 < λ ≤ 1. Let A ⊆ [0, 1] be the set of π0 for which πn = π* for some n. Note that by equations (A1.1) - (A1.3), for any πn there are at most three values of πn−1 which give rise to this πn. Thus there are at most 3ⁿ initial points π0 such that πn = π*. It follows that the set A is countable.

Suppose πn−1 < π*. Then by equation (A1.1)

    πn − πn−1 = λ(1 − πn−1) > λ(1 − π*).    (A1.4)

Conversely, suppose πn−1 > π*. Then by equation (A1.3)

    πn−1 − πn = λπn−1 > λπ*.    (A1.5)

By inequalities (A1.4) and (A1.5)

    |πn − πn−1| > λ min{1 − π*, π*}    (A1.6)

for πn−1 ≠ π*. Thus if π0 ∉ A, inequality (A1.6) holds for all n and the sequence π0, π1, π2, . . . does not converge.

Case II. λn ↓ 0, Σλn = ∞. We first show that there is a sequence n0 < n1 < n2 < · · · such that for even k

    π_{n_k} ≥ π_{n_k + 1} ≥ · · · ≥ π_{n_{k+1} − 1} ≥ π* ≥ π_{n_{k+1}},    (A1.7)

and for odd k

    π_{n_k} ≤ π_{n_k + 1} ≤ · · · ≤ π_{n_{k+1} − 1} ≤ π* ≤ π_{n_{k+1}}.    (A1.8)

To see this, suppose that for some n we have πn > π*. Then if π_{n+1}, π_{n+2}, . . . , π_{n+r} > π* we have

    π_{n+s} = ∏_{m=n+1}^{n+s} (1 − λm) πn,  1 ≤ s ≤ r + 1,    (A1.9)

by equation (A1.3). Thus πn ≥ π_{n+1} ≥ · · · ≥ π_{n+r+1}. Since (1 − λm) ≤ e^{−λm} we also have

    π_{n+s} ≤ exp{ − ∑_{m=n+1}^{n+s} λm }.    (A1.10)

If π_{n+s} > π* for all s we would have

    lim_{s→∞} π_{n+s} ≤ exp{ − ∑_{m=n+1}^{∞} λm } = 0

since Σλm is divergent. Thus, since π* > 0, there would be an s such that π_{n+s} ≤ π*, a contradiction. It follows that π_{n+s} ≤ π* for some s. This shows that given n_k with π_{n_k} ≥ π* there exists n_{k+1} > n_k such that (A1.7) holds. Construction of the sequence (A1.8) is similar. It can then be seen that if n0 is the first value of n for which πn ≥ π*, one can construct the whole sequence by induction on k.

Now note that πn−1 ≥ π* ≥ πn implies that

    πn ≥ (1 − λn)πn−1 ≥ (1 − λn)π*  by (A1.2) and (A1.3),

so that 0 ≤ π* − πn ≤ λnπ* ≤ λn. Similarly, if πn−1 ≤ π* ≤ πn we have

    πn ≤ (1 − λn)πn−1 + λn ≤ (1 − λn)π* + λn  by (A1.1) and (A1.2),

so that 0 ≤ πn − π* ≤ λn(1 − π*) ≤ λn.
Applying this to the sequence satisfying conditions (A1.7) and (A1.8) shows that for k ≥ 1 we have |πn − π*| ≤ λ_{n_k} for n_k ≤ n ≤ n_{k+1} − 1. Thus πn → π* as n → ∞.

Case III. λn ↓ 0, Σλn < ∞. We first show that the sequence {πn} is convergent. If πn−1 < π*, then by equation (A1.1) πn − πn−1 = λn(1 − πn−1). If πn−1 = π*, then by equation (A1.2) πn − πn−1 = 0. Finally, if πn−1 > π*, then by equation (A1.3) πn−1 − πn = λnπn−1. Thus |πn−1 − πn| ≤ λn for all n. Now let η > 0 be given. Since ∑_{n=1}^{∞} λn is convergent we can choose an N such that ∑_{n=N}^{∞} λn ≤ η. Thus if m ≥ n ≥ N we have

    |πm − πn| ≤ ∑_{s=n+1}^{m} |π_{s−1} − πs| ≤ ∑_{s=n+1}^{m} λs ≤ ∑_{s=N}^{∞} λs ≤ η.

This shows that {πn} is a Cauchy sequence in [0, 1]. Since [0, 1] is closed, the sequence is convergent.

We now show that, given any π0, there is a range of values for π* (depending on π0) such that the sequence {πn} converges to a limit which is not equal to π*. Suppose π0 ≥ 1/2. Let

    α = ∏_{n=1}^{∞} (1 − λn).

Since Σλn is convergent, it is easy to show that 1 ≥ α > 0. Suppose π* < α/2. Then π0 ≥ α/2. Suppose by induction that π0 ≥ π1 ≥ · · · ≥ πn−1 ≥ α/2. Then, since π* < α/2, by equation (A1.3) we have

    πn = π0 ∏_{s=1}^{n} (1 − λs) ≥ π0 ∏_{s=1}^{∞} (1 − λs) ≥ α/2.

Thus πn ≥ α/2 for all n. Since π* < α/2, πn does not converge to π*. Similarly, if π0 ≤ 1/2 then lim_{n→∞} πn ≤ 1 − α/2 provided π* > 1 − α/2.

Appendix 2. The Hawk-Dove Game with Error

Theorem A2.1 The equation Bδ(π) = π has a unique solution π = πδ*, which satisfies conditions (36) - (38).

Proof. Let b : [0, 1] → R be given by b(π) = π − Bδ(π). Then by conditions (35), (26) and (23), b is continuous and strictly increasing. By condition (22), b(0) < 0 and b(1) > 0. Thus there is a unique π such that b(π) = 0; i.e. Bδ(π) = π.

Suppose V/C < 1/2. Then

    b(V/C) = V/C − G1(0) = V/C − 1/2 < 0.

Also

    b(1/2) = 1/2 − G1((1/(2δ))(V − C/2)).

But V − C/2 < 0. Thus by conditions (23) and (24), G1((1/(2δ))(V − C/2)) < 1/2, and hence b(1/2) > 0. Since b(πδ*) = 0, it follows that V/C < πδ* < 1/2. Results (37) and (38) follow similarly.

Theorem A2.2 For a given G1 there exists ε > 0 such that

    |πδ* − V/C| ≤ 2δε/C for 0 < δ ≤ 1.    (A2.1)

Proof. Choose ε such that

    G1(ε) = 1/2 + |V/C − 1/2|.    (A2.2)

First suppose V/C < 1/2. Then G1(ε) = 1 − V/C and hence G1(−ε) = V/C by equation (24). Let π̃ satisfy V − π̃C = −2δε. Then π ≥ π̃ implies that V − πC ≤ −2δε and hence

    Bδ(π) = G1((1/(2δ))(V − πC)) ≤ G1(−ε) = V/C.

But πδ* > V/C by Theorem A2.1. Thus Bδ(πδ*) = πδ* > V/C, so that πδ* < π̃, i.e. πδ* ≤ V/C + 2δε/C. Since πδ* > V/C, inequality (A2.1) follows. The case V/C > 1/2 is similar, and the case V/C = 1/2 is trivial.

Property (39) follows directly from this theorem.

Theorem A2.3 Let 0 < δ ≤ 1 and define h : [0, 1] → [0, 1] by

    h(π) = (1 − λ)π + λBδ(π).    (A2.3)

Suppose λ satisfies

    0 < λ ≤ δ/(1 + C/4).    (A2.4)

Then

    |h(π1) − h(π2)| ≤ (1 − λ)|π1 − π2|    (A2.5)

for all strategies π1 and π2.

Proof. Without loss of generality suppose 0 ≤ π1 < π2 ≤ 1. Then

    h(π2) − h(π1) = (1 − λ)(π2 − π1) − λ[G1((1/(2δ))(V − π1C)) − G1((1/(2δ))(V − π2C))].    (A2.6)

But

    0 < G1((1/(2δ))(V − π1C)) − G1((1/(2δ))(V − π2C)) ≤ (C/(2δ))(π2 − π1)    (A2.7)

by conditions (23) and (26). Thus by (A2.6) and (A2.7)

    (1 − λ − Cλ/(2δ))(π2 − π1) ≤ h(π2) − h(π1) ≤ (1 − λ)(π2 − π1).    (A2.8)

Now 1 − λ − Cλ/(2δ) = 1 + λ − λ(2 + C/(2δ)). But

    2 + C/(2δ) = (2/δ)(δ + C/4) ≤ (2/δ)(1 + C/4) since δ ≤ 1, and (2/δ)(1 + C/4) ≤ 2/λ by inequality (A2.4).

Thus λ(2 + C/(2δ)) ≤ 2, so that 1 − λ − Cλ/(2δ) ≥ λ − 1. Inequality (A2.5) then follows from this and inequality (A2.8).

By Theorem A2.3, h is a contraction mapping provided inequality (A2.4) holds.
Since the fixed point of h is the fixed point, πδ*, of Bδ, it follows that the sequence π0, π1, π2, . . . of strategies given by condition (40) satisfies condition (41) provided that inequality (42) holds.

Theorem A2.4 Let {πn} be a sequence of strategies. Then πn → πδ* ⟺ Bδ(πn) − πn → 0.

Proof. First suppose that πn → πδ*. Since Bδ is continuous, Bδ(πn) → Bδ(πδ*). Thus Bδ(πn) − πn → πδ* − πδ* = 0.

To prove the converse, let π be any strategy. First suppose π ≥ πδ*. Then since Bδ is a decreasing function, Bδ(π) ≤ Bδ(πδ*) = πδ* ≤ π. Thus

    0 ≤ π − πδ* ≤ π − Bδ(π).    (A2.9)

Now suppose π ≤ πδ*. Then since Bδ is decreasing, Bδ(π) ≥ Bδ(πδ*) = πδ* ≥ π. Thus

    0 ≤ πδ* − π ≤ Bδ(π) − π.    (A2.10)

From inequalities (A2.9) and (A2.10),

    |π − πδ*| ≤ |π − Bδ(π)| for all π.

Thus Bδ(πn) − πn → 0 implies πn − πδ* → 0.

Appendix 3

For a given resident strategy π, let φ(x, t; π) denote the probability that a randomly selected member of the resident population is in state x at time t. We assume that the initial frequency distribution is given by φ(x, 0; π) = q(x) and is independent of π. Let φt(π) denote the vector with components φ(x, t; π), x ∈ E(t), and let pt(π) denote the matrix with components pi(x, t; π), x ∈ E(t), i ∈ A(x, t). Throughout this appendix, Bδ denotes the function defined by equation (47).

For each fixed x and t, let ∆(x, t) denote the simplex

    ∆(x, t) = {(p1, . . . , p_{K(x,t)}) : 0 ≤ pj ≤ 1 for j = 1, . . . , K(x, t) and ∑_{j=1}^{K(x,t)} pj = 1},

where K(x, t) denotes the number of possible actions in A(x, t). Let ∆(t) denote the Cartesian product over x ∈ E(t) of the sets ∆(x, t), and let ∆ denote the Cartesian product over t ∈ {0, 1, . . . , T − 1} of the sets ∆(t). Note that ∆ is a compact, convex subset of R^M, where M = ∑_{t,x} K(x, t). A strategy π is defined in terms of the probabilities pi(x, t; π), so each strategy π corresponds to a unique point in ∆. The rule specified by π for choosing the action at time t is defined in terms of the matrix pt(π), and each pt(π) corresponds to a unique point in ∆(t). We define the distance between two strategies π and π′ by taking

    ‖π − π′‖ = max_{t,x,i} |pi(x, t; π) − pi(x, t; π′)|

and the corresponding distance between two rules pt(π) and pt(π′) by

    ‖pt(π) − pt(π′)‖ = max_{x,i} |pi(x, t; π) − pi(x, t; π′)|.

Similarly, let Γ(t) denote the simplex corresponding to distributions on E(t), so each vector φt(π) corresponds to a unique point in Γ(t). Again the distance between two vectors φt(π) and φt(π′) is defined by

    ‖φt(π) − φt(π′)‖ = max_x |φ(x, t; π) − φ(x, t; π′)|.

Continuity of functions on ∆ and ∆(t) × Γ(t) is defined in the usual way. In particular, for two strategies π and π′ we have

    ‖Bδ(π) − Bδ(π′)‖ = max_{t,x,i} |p̂i(x, t; π) − p̂i(x, t; π′)|,

where the p̂i's are given by equation (47). Since the maximum is over a finite number of terms, Bδ(π) is a continuous function of π if p̂i(x, t; π) is a continuous function of π for each fixed t, x and i.

Theorem A3.1. For each fixed t, x, y and i, consider ri(x, t; π), R(x; π) and γi(x, y, t; π) as functions of the resident strategy π. Let F denote the set of all these functions, for t ∈ {0, 1, . . . , T − 1}, x ∈ E(t), y ∈ E(t + 1) and i ∈ A(x, t).

(A) If each function in F is a continuous function of π then Bδ(π) is a continuous function of π.

Alternatively, consider each ri(x, t; π) to be a function ri(x, t; φt(π), pt(π)), and similarly for the other functions in F.
(B) If each function in F is a continuous function of (φt, pt) then Bδ(π) is a continuous function of π.

Proof (i) Assume (A) holds. Then R(x; π) is a continuous function of π for each x ∈ E(T) and hence W(x, T; π) is a continuous function of π. Now assume W(x, s; π) is a continuous function of π for each s = t + 1, . . . , T and x ∈ E(s). From equations (44) - (48), the standard properties of continuous functions and the positivity of the denominator in equation (47), we have that p̂i(x, t; π) and W(x, t; π) are continuous in π for each x ∈ E(t) and i ∈ A(x, t). Proceeding by induction, we have that p̂i(x, t; π) is a continuous function of π for each t, x and i. Hence Bδ(π) is a continuous function of π.

(ii) Assume (B) holds, so each function in F is a continuous function of (φt, pt). Continuity of Bδ in π will then follow from assumption (A) if both φt(π) and pt(π) are continuous functions of π. The continuity of pt(π) follows from its definition, so we only need to show that φt(π) is continuous in π for each t ∈ {0, 1, . . . , T}. The initial frequency distribution of the resident population is given by φ(x, 0; π) = q(x), independent of π, so φ0(π) is continuous in π. Now assume that φt(π) is continuous in π for some t ∈ {0, . . . , T − 1}. Then for each y ∈ E(t + 1),

    φ(y, t + 1; π) = ∑_x ∑_i φ(x, t; π) pi(x, t; π) γi(x, y, t; π),

so φ(y, t + 1; π) is continuous in π, and hence φt+1(π) is continuous in π. Hence, by induction, φt(π) is continuous in π for each t ∈ {0, 1, . . . , T}.

Theorem A3.3 If Bδ(π) is a continuous function of π, then there is always at least one solution to the equation Bδ(π) = π.

Proof Bδ : ∆ → ∆, where ∆ is a non-empty, compact, convex subset of R^M, and Bδ(π) is a well-defined continuous function of π. The existence of at least one fixed point of the mapping Bδ then follows directly from the Brouwer fixed-point theorem.

Theorem A3.4 Let πn be a sequence of strategies which converges to some limiting strategy, π∞, and assume Bδ(π) is a continuous function of π. Then π∞ satisfies Bδ(π∞) = π∞ if and only if ‖πn − Bδ(πn)‖ → 0 as n → ∞.

Proof (i) Assume ‖πn − Bδ(πn)‖ → 0 as n → ∞. Now

    ‖π∞ − Bδ(π∞)‖ ≤ ‖π∞ − πn‖ + ‖πn − Bδ(πn)‖ + ‖Bδ(πn) − Bδ(π∞)‖.

As n → ∞ the first term tends to zero since πn → π∞, the second term tends to zero by assumption, and the third tends to zero by continuity of Bδ. Hence we must have ‖π∞ − Bδ(π∞)‖ = 0, so π∞ = Bδ(π∞).

(ii) Assume π∞ = Bδ(π∞). Then

    ‖πn − Bδ(πn)‖ ≤ ‖πn − π∞‖ + ‖Bδ(π∞) − Bδ(πn)‖.

As n → ∞, the first term tends to zero since πn → π∞, and the second term tends to zero by continuity.

References

Crowley, P.H. & Hopper, K.R. (1994). How to behave around cannibals: a density-dependent dynamic game. Am. Nat. 143, 117-154.

Enquist, M. & Leimar, O. (1983). Evolution of fighting behaviour: decision rules and assessment of relative strength. J. theor. Biol. 102, 387-410.

Enquist, M. & Leimar, O. (1987). Evolution of fighting behaviour: the effect of variation in resource value. J. theor. Biol. 127, 187-205.

Fudenberg, D. & Tirole, J. (1991). Game Theory. Cambridge, MA: MIT Press.

Hammerstein, P. (1996). Darwinian adaptation, population genetics and the streetcar theory of evolution. J. Math. Biol. 34, 511-532.

Hammerstein, P. & Parker, G.A. (1982). The asymmetric war of attrition. J. theor. Biol. 96, 647-682.

Holmgren, N. & Hedenström, A. (1995). The scheduling of moult in migratory birds. Evol. Ecol. 9, 354-368.

Houston, A.I. (1987). The control of foraging decisions.
In Quantitative Analyses of Behavior (Commons, M.L., Kacelnik, A. & Shettleworth, S.J., eds), Vol. 6: Foraging. Lawrence Erlbaum, New York.

Houston, A.I. & McNamara, J.M. (1987). Singing to attract a mate: a stochastic dynamic game. J. theor. Biol. 129, 57-68.

Houston, A.I. & McNamara, J.M. (1988). Fighting for food: a dynamic version of the Hawk-Dove game. Evol. Ecol. 2, 51-64.

Johnstone, R.A. (1997). The tactics of mutual mate choice and competitive search. Behavioural Ecology and Sociobiology 40, 51-59.

Leimar, O. & Enquist, M. (1984). Effects of asymmetries in owner-intruder conflicts. J. theor. Biol. 111, 475-491.

Lucas, J.R. & Howard, R.D. (1995). On alternative reproductive tactics in anurans: dynamic games with density and frequency dependence. Am. Nat. 146, 365-397.

Lucas, J.R., Howard, R.D. & Palmer, J.G. (1996). Callers and satellites: chorus behaviour in anurans as a stochastic dynamic game. Anim. Behav. 51, 501-518.

McNamara, J.M. & Houston, A.I. (1986). The common currency for behavioral decisions. Am. Nat. 127, 358-378.

McNamara, J.M. & Houston, A.I. (1987). Partial preferences and foraging. Anim. Behav. 35, 1084-1099.

McNamara, J.M., Merad, S. & Collins, E.J. (1991). The Hawk-Dove game as an average-cost problem. Adv. Appl. Prob. 23, 667-682.

Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge: Cambridge University Press.

Maynard Smith, J. & Price, G.R. (1973). The logic of animal conflict. Nature 246, 15-18.

Myerson, R.B. (1978). Refinements of the Nash equilibrium concept. International J. Game Theory 7, 73-80.

Parker, G.A. & Rubenstein, D.I. (1981). Role assessment, reserve strategy, and acquisition of information in asymmetric animal conflicts. Anim. Behav. 29, 221-240.

Weissing, F.J. (1996). Genetic versus phenotypic models of selection: can genetics be neglected in a long-term perspective? J. Math. Biol. 34, 533-555.

Table 1. The results of some attempts to solve the Hawk-Dove game by iterations of the best response map. Our computations illustrate the case π* = 1/√2 ≈ 0.707107. The sequence πn = (1 − λn)πn−1 + λnB(πn−1) fails to converge for λn fixed. With λn decreasing such that condition (8) holds, the sequence π0, π1, π2, . . . converges to π* for π0 = 0 but not for π0 = 1.

Table 1

    n       λn = 0.1      λn = 1/n², π0 = 0   λn = 1/n², π0 = 1
            πn            πn                  πn
    0       0.000000      0.000000            1.000000
    1       0.100000      1.000000            0.000000
    2       0.190000      0.750000            0.250000
    3       0.271000      0.666667            0.333333
    4       0.343900      0.687500            0.375000
    5       0.409510      0.700000            0.400000
    50      0.719438      0.707020            0.490000
    51      0.647494      0.707132            0.490196
    52      0.682744      0.706871            0.490385
    53      0.714470      0.706975            0.490566
    54      0.643023      0.707076            0.490741
    55      0.678721      0.707173            0.490909
    1000    0.719472      0.707106            0.499500
    1001    0.647525      0.707107            0.499501
    1002    0.682773      0.707107            0.499501
    1003    0.714495      0.707106            0.499502
    1004    0.643046      0.707107            0.499502
    1005    0.678741      0.707107            0.499503

Table 2. Iterative solution of the Hawk-Dove game with λn decreasing such that condition (9) holds. In this case the sequence πn = (1 − λn)πn−1 + λnB(πn−1) converges to π* for all π0. However, the convergence is very slow, and the sequence of best responses need not converge. The table illustrates the case π* = 1/√2 ≈ 0.707107.
Table 2 (λn = 1/n)

    n       πn          πn − π*       B(πn)
    0       0.000000    −0.707107     1.000000
    1       1.000000    +0.292893     0.000000
    2       0.500000    −0.207107     1.000000
    3       0.666667    −0.040440     1.000000
    4       0.750000    +0.042893     0.000000
    5       0.600000    −0.107107     1.000000
    50      0.700000    −0.007107     1.000000
    51      0.705882    −0.001224     1.000000
    52      0.711538    +0.004432     0.000000
    53      0.698113    −0.008993     1.000000
    54      0.703704    −0.003403     1.000000
    55      0.709091    +0.001984     0.000000
    1000    0.707000    −0.000107     1.000000
    1001    0.707292    +0.000186     0.000000
    1002    0.706586    −0.000520     1.000000
    1003    0.706879    −0.000228     1.000000
    1004    0.707171    +0.000064     0.000000
    1005    0.706467    −0.000639     1.000000

Figure captions

Figure 1. (a) The error function Gδ(x) = 1/(1 + e^{−x/δ}) for δ = 1.0 (solid line), δ = 0.5 (dotted line) and δ = 0.1 (dashed line). (b) The error function Gδ(x) = (1/2)(1 + (x/δ)/(1 + |x/δ|)) for δ = 1.0 (solid line), δ = 0.5 (dotted line) and δ = 0.1 (dashed line).

Figure 2. The best response with error for the Hawk-Dove game. For resident strategy π the best response with error is Gδ((1/2)(V − πC)). Here the error function is given by Gδ(x) = 1/(1 + e^{−x/δ}). The figure shows the best response for δ = 0.1 and δ = 0.5. For given δ, the ESS with error function Gδ satisfies Bδ(πδ*) = πδ*, and is the value of π at which the solid 45° line intersects the best response curve. The best response without error is also shown as a solid line. The ESS without error is π* = 2/3 (V = 2, C = 3).

Figure 3. The ESS with error, πδ*, as a function of the error parameter δ for the Hawk-Dove game. Two error functions are illustrated: Gδ(x) = 1/(1 + e^{−x/δ}) (solid line) and Gδ(x) = (1/2)(1 + (x/δ)/(1 + |x/δ|)) (dotted line). The ESS without error is π* = 2/3 (V = 2, C = 3).

Figure 4. Attempts to find an ESS for the state-dependent desertion game (Székely et al. unpub) by iterating the best response map with fixed level of replacement λ. For the case considered, iterations without error fail to converge for any λ ≥ 0.01. The figure shows the outcome for various combinations of λ and error parameter δ. White cells indicate cases in which convergence occurs in fewer than 200 iterations. Grey cells indicate cases in which between 200 and 1000 iterations were required. Black cells indicate no convergence after 1000 iterations.