A general technique for computing evolutionarily stable strategies based on errors in decision-making

John M. McNamara¹, James N. Webb¹, E.J. Collins¹, Tamás Székely² and Alasdair I. Houston²

¹ School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW
² School of Biological Sciences, University of Bristol, Woodland Road, Bristol BS8 1UG

Correspondence to John McNamara: email: [email protected]

running head: games with errors in decision making

Abstract

Realistic models of contests between animals will often involve a series of state-dependent decisions by the contestants. Computation of evolutionarily stable strategies for such state-dependent dynamic games is usually based on damped iterations of the best response map. Typically this map is discontinuous, so that iterations may not converge, and even if they do converge it may not be clear whether the limiting strategy is a Nash equilibrium. We present a general computational technique based on errors in decision making that removes these computational difficulties. We show that the technique works for a simple example (the Hawk-Dove game) where an analytic solution is known, and prove general results about the technique for more complex games. We also argue that there is biological justification for including the types of errors we have introduced.

1 Introduction

Frequency-dependent effects are ubiquitous in the natural world. To understand the action of natural selection in such circumstances, a game theoretic approach is required. The end point of the evolutionary process is an evolutionarily stable strategy or ESS (Maynard Smith 1982, Hammerstein 1996, Weissing 1996). Early game-theoretical models were usually highly schematic in that they considered a single decision by the contestants. Realistic models will often require the investigation of a series of state-dependent decisions by each contestant. A problem with such models is that they are difficult to analyse. In particular, the ESS usually has to be found by computation. The techniques used have not always yielded an ESS (e.g. Houston & McNamara 1987, Crowley & Hopper 1994, Holmgren & Hedenström 1995, Lucas & Howard 1995, Lucas et al. 1996), and it is then not clear whether there is an ESS that might be found by other methods or whether no ESS exists. For example, it is not clear which of these alternatives applies in the work of Lucas et al. (1996). The computational techniques used are based on finding the best response for a mutant in a resident population. Problems arise because the best response does not vary continuously with the behaviour of the resident population, i.e. the best response map is discontinuous.

In this paper we describe a computational technique for finding ESSs. It is based on the assumption that there are errors in the decisions made by an animal, but that the probability of an error decreases as its cost (in terms of reproductive success) increases. Once errors have been introduced, a suitably defined best response map is continuous. As we show, this smoothing of the best response map obviates many of the problems associated with computing ESSs. We argue that in addition to its computational advantages, the assumption of errors is also biologically realistic.

2 Computational Problems

An ESS analysis for a large population can be split into two components.

The environment.
The strategies adopted by members of the population together with the physical environment determine the environment experienced by population members.

Best mutant. Given an environment we can consider the fitness of all possible "mutant" strategies within this environment. A strategy maximising fitness will be referred to as a best mutant strategy. For some environments there is a unique best mutant; for others there is a set of best mutants, with all members of this set doing equally well.

Given these components we may define the best response map, B, as follows. Suppose almost all population members use strategy π. This resident strategy creates an environment. Let B(π) be the set of all best mutant strategies within this environment. We will call a strategy in B(π) a best response to π. When there is a unique best response to π we will, with a slight abuse of notation, denote the best response by B(π). A necessary condition for a strategy π* to be an ESS is that it is a Nash equilibrium; i.e. π* is a best response to itself:

    π* = B(π*).    (1)

Computations typically seek to find a solution to equation (1), and it is this computational problem we will focus on here. Since conditions for evolutionary stability are stronger than the Nash equilibrium conditions, having found a solution to (1) one must then further investigate the stability of the solution.

The Hawk-Dove Game. Concepts and computational problems are illustrated using the following standard example of the Hawk-Dove game (Maynard Smith & Price 1973, Maynard Smith 1982). The Hawk-Dove game can be solved analytically, but we can use this simple game to illustrate problems which occur when attempting to compute Nash equilibria for more complex games where an analytic solution is not possible.

Two animals contest a resource of value V. Each must decide whether to be aggressive (i.e. play "Hawk") or display (i.e. play "Dove"). Each animal makes its choice before it knows the choice of its opponent. If a Hawk meets a Dove the Hawk wins the resource. If two Doves meet, they share the resource. If two Hawks meet, each wins the resource with probability 1/2. The loser pays a cost C, representing the cost of injury (we assume 0 < V < C).

For the Hawk-Dove game a strategy is specified by a number π, where 0 ≤ π ≤ 1. Under strategy π, an animal plays Dove with probability 1 − π and plays Hawk with probability π. Let W(π′, π) be the payoff to a mutant strategy π′ when the resident population strategy is π. Then

    W(π′, π) = (1 − π′)(1 − π)(V/2) + π′(1 − π)V + π′π(1/2)(V − C)

so that

    W(π′, π) = (1 − π)(V/2) + (1/2)(V − πC)π′.    (2)

This payoff is an increasing function of π′ when V − πC > 0 and a decreasing function when V − πC < 0. Thus the best response map is given by

    B(π) = 1 if 0 ≤ π < V/C    (3)
    B(π) is the set of all strategies when π = V/C    (4)
    B(π) = 0 if V/C < π ≤ 1.    (5)

From equations (3) - (5) it can be seen that the strategy π* = V/C is the unique strategy satisfying condition (1). It can easily be verified that π* also satisfies the stronger ESS condition of Maynard Smith (1982).

Although the Hawk-Dove game can be solved analytically, complex games usually need to be solved by numerical computation. Typically, numerical methods employ an iterative scheme which generates a sequence of strategies π0, π1, π2, . . .. Hopefully this sequence converges to a solution, π*, of equation (1). The simplest scheme is as follows.

Iteration of the best response map. Take any strategy as the initial strategy π0.
Set π1 = B(π0), set π2 = B(π1), and so on. In this way one obtains a sequence π0, π1, π2, . . . of strategies where each strategy is the best response to the previous strategy in the sequence.

The sequence obtained by the above scheme may converge to a strategy π* which is the best response to itself and so satisfies equation (1); this method has been successfully used to find the solution of a dynamic game (Houston & McNamara 1988). Often, however, the sequence π0, π1, π2, . . . fails to converge at all. To illustrate failure of convergence consider the Hawk-Dove game. Suppose that the initial choice of strategy π0 satisfies π0 < V/C. Then by equations (3) and (5) π1 = B(π0) = 1, π2 = B(π1) = 0, π3 = B(π2) = 1, etc. The sequence obtained is thus π0, 1, 0, 1, 0, 1, 0, . . .. Similarly if π0 > V/C the sequence is π0, 0, 1, 0, 1, 0, 1, . . .. Thus the sequence π0, π1, π2, . . . never converges unless one has been fortunate enough to choose π0 exactly equal to V/C.

One can attempt to stop the sequence of strategies π0, π1, π2, . . . from oscillating by modifying the above iterative scheme as follows.

Best response with damping. Let λ lie in the range 0 < λ ≤ 1. Construct the sequence π0, π1, π2, . . . by taking πn to be the randomised strategy which chooses strategy πn−1 with probability 1 − λ and strategy B(πn−1) with probability λ. With this interpretation we write

    πn = (1 − λ)πn−1 + λB(πn−1).    (6)

The previous scheme is obtained by setting λ = 1. We can give a loose interpretation of this iterative scheme by supposing that we are really following the evolution of a population and that πn is the resident population strategy in generation n. Equation (6) then says that generation n is formed from generation n − 1 by replacing a proportion λ of the population in generation n − 1 by individuals whose behavioural strategy was the fittest mutant strategy in generation n − 1. When λ = 1 there is a complete replacement of the population in each generation, and not surprisingly the evolutionary process can oscillate. Replacing only a small fraction of the population tends to stabilise the evolutionary process. Computations of some complex games (e.g. Houston & McNamara 1988, Lucas & Howard 1995, Lucas et al. 1996) illustrate cases in which the sequence π0, π1, π2, . . . oscillates for λ = 1 but converges to a solution π* of equation (1) for λ sufficiently small.

It is easy, however, to find examples for which the scheme (6) fails to work no matter how small a value of λ is used. The Hawk-Dove game provides such an example. For the Hawk-Dove game the sequence π0, π1, π2, . . . of strategies fails to converge for all but a countable non-generic set of π0 (Appendix 1). Table 1 illustrates this effect.
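To make the behaviour of this scheme concrete, the following minimal sketch (our illustration, not the program behind Table 1; the tie-breaking rule at π = V/C is an arbitrary choice) applies scheme (6) to the Hawk-Dove game with V/C = 1/√2, the case shown in Table 1.

```python
# A minimal sketch of the damped best-response iteration, scheme (6),
# applied to the Hawk-Dove game. V and C are chosen so that
# pi* = V/C = 1/sqrt(2), as in Table 1.

import math

V, C = math.sqrt(2.0), 2.0
PI_STAR = V / C

def best_response(pi):
    # Equations (3)-(5); at pi = V/C every strategy is a best response,
    # and we arbitrarily return 0 there.
    return 1.0 if pi < PI_STAR else 0.0

def damped_iteration(pi0, lam, n_steps):
    # pi_n = (1 - lam) pi_{n-1} + lam B(pi_{n-1}), equation (6)
    pi = pi0
    for _ in range(n_steps):
        pi = (1.0 - lam) * pi + lam * best_response(pi)
    return pi

# With lam = 0.1 the iterates never settle down: after many steps they
# still cycle in a band around pi* (cf. the first column of Table 1).
for n in (50, 51, 52, 53):
    print(n, damped_iteration(0.0, 0.1, n))
```

The persistent band of oscillation is exactly what inequality (A1.6) of Appendix 1 predicts: with constant λ, successive iterates remain a bounded distance apart.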
An obvious modification of the scheme (6) allows λ to depend on n.

Best response with increased damping. Let

    πn = (1 − λn)πn−1 + λnB(πn−1)    (7)

where the sequence λ0, λ1, λ2, . . . satisfies λn > λn+1 and λn → 0 as n → ∞.

Applying this scheme to the Hawk-Dove game always produces a sequence π0, π1, π2, . . . of strategies which converges (Appendix 1). If the sequence of λn's tends to zero too rapidly, the limit of the sequence of strategies may not be the ESS strategy π* = V/C (Table 1). This is because excessive damping does not allow the population to "evolve". In particular it is shown in Appendix 1 that if

    ∑_{n=1}^{∞} λn < ∞    (8)

then there is no initial π0 such that πn → V/C for all possible V/C. Conversely, provided the λn's tend to zero slowly enough that

    ∑_{n=1}^{∞} λn = ∞    (9)

the sequence π0, π1, π2, . . . of strategies will converge to the ESS strategy π* = V/C whatever the π0 chosen and whatever the value of V/C (Appendix 1).

Although the numerical scheme (7) works for the Hawk-Dove game when condition (9) holds, there are problems with using it as a general method. First, even if πn tends to π* as n tends to infinity, convergence is liable to be very slow. This is because condition (9) means that λn tends to zero very slowly as n tends to infinity. Table 2 illustrates convergence of πn to π* in the Hawk-Dove game when λn = 1/n. As can be seen, even after 1000 iterations |πn − π*| is still of the order of 5 × 10⁻⁴.

A second drawback of the scheme given by (7) is that there is no reason to suppose that the scheme works for complex problems. The Hawk-Dove game is rather special in that the set of possible strategies is the one-dimensional interval [0, 1]. For complex games the set of possible strategies is typically a subset of Rⁿ for large n, and the analytic argument, valid for the Hawk-Dove game and presented in Appendix 1, is not applicable.

Thirdly, if a sequence π0, π1, π2, . . . of strategies is calculated, how do we recognise that the sequence is converging to a limit? Of course, since in finite time one can only compute a finite number of strategies to some prescribed limit of accuracy, one never knows. It seems a reasonable working practice to assume that convergence is occurring when, say, there is a θ < 1 such that

    |πn − πn−1| < θⁿ    (10)

appears to be true for all sufficiently large n. (Here |πn − πn−1| is some appropriate measure of the distance between strategies πn and πn−1.) Although this criterion is usually satisfied when a scheme with constant λ converges, when there is increased damping and the damping used satisfies condition (9), |πn − πn−1| will typically tend to zero too slowly for (10) to hold. It is then much more difficult to decide whether the sequence π0, π1, π2, . . . is really converging.

The final difficulty may arise with any scheme in which a sequence π0, π1, π2, . . . of strategies is calculated. Suppose that a numerical calculation yields a sequence which appears to be converging. If we accept that the sequence really is converging to a limiting strategy π*, how do we know that π* satisfies equation (1)? It may be that the scheme employed forces convergence, as with Table 1. We might have confidence that the limit π* satisfies π* = B(π*) if the calculation also suggests that

    |πn − B(πn)| → 0    (11)

as n → ∞. But for many games the best response map is discontinuous at the ESS, so that we would not expect condition (11) to hold even when π* = B(π*). The Hawk-Dove game illustrates this point. Table 2 shows a case in which the sequence {πn} converges to the ESS strategy π* = V/C, but the sequence {B(πn)} does not converge. For the Hawk-Dove game analytic arguments tell us that the ESS is V/C. If we did not have these analytic arguments but only had the numerical results in Table 2, then even if we were prepared to accept that {πn} was converging, we would not know from Table 2 that the limit π* satisfied equation (1).

3 Introducing Errors into Decision Making

We now introduce errors into the choice of action made by contestants in a game. We will do so in such a way that the probability of costly errors is small, while the probability of errors with virtually no cost is large.
The resulting "best response with error" turns out to be a much better behaved function than the corresponding best response function without error. Consequently, many of the computational problems described in the previous section disappear. In this section we restrict attention to the case where contestants make a single choice of action and all contestants are in the same state. Games in which contestants make sequences of state-dependent choices are discussed later.

Consider a game in which contestants make a choice between the K actions a1, a2, . . . , aK. A strategy, π, for this game is a vector π = (p1(π), p2(π), . . . , pK(π)), where pi(π) is the probability that action ai is chosen under π. We assume a large population, and refer to a strategy as the resident population strategy if almost all population members adopt this strategy. Suppose the resident population strategy is π. Let Wi(π) denote the expected reproductive value of an individual which chooses action ai within this population. Under an optimal choice of action the expected reproductive value is then

    Ŵ(π) = max_{1≤i≤K} Wi(π).    (12)

For each i let

    Ci(π) = Ŵ(π) − Wi(π).    (13)

Then Ci(π) = 0 if the choice of action ai is optimal and Ci(π) > 0 if the choice of action ai is suboptimal. The quantity Ci(π) is a measure of the loss in reproductive value as a result of choosing action ai, and is referred to by McNamara and Houston (1986) as the canonical cost of action ai.

To introduce errors let H1 be a function of a non-negative real variable which satisfies

    H1(x) > 0 for x ≥ 0    (14)
    H1(x) is continuous and strictly decreasing in x    (15)
    H1(x) → 0 as x → ∞.    (16)

For some applications it is also necessary to ensure that H1 is sufficiently smooth by imposing a condition such as

    |H1(x) − H1(y)| ≤ H1(0)|x − y| for all x, y.    (17)

Let the resident population strategy be π. Then we can assign weight βi(π) = H1(Ci(π)) to the choice of action i within this population. Let

    p̂i(π) = βi(π) / ∑_{j=1}^{K} βj(π).    (18)

Then the best response, B1(π), to π with error function H1 is defined to be the strategy which chooses action i with probability p̂i(π); i.e. B1(π) = (p̂1(π), p̂2(π), . . . , p̂K(π)). Under strategy B1(π) there is a positive probability of choosing each action. The optimal action is chosen with the highest probability, and action ai is more likely to be chosen than action aj if the canonical cost of choosing ai is less than the canonical cost of choosing aj. As the canonical cost of choosing an action increases, the probability that the action is chosen decreases, tending to zero as the cost tends to infinity.

To control the amount of error for given canonical costs we introduce a parameter δ, where δ > 0. Define the function Hδ by

    Hδ(x) = H1(x/δ).    (19)

For example if H1(x) = e^{−x}, then Hδ(x) = e^{−x/δ}. Then we can use Hδ rather than H1 to generate errors. By setting βi^{(δ)}(π) = Hδ(Ci(π)) and p̂i^{(δ)}(π) = βi^{(δ)}(π) / ∑_{j=1}^{K} βj^{(δ)}(π) we obtain the best response with error function Hδ given by Bδ(π) = (p̂1^{(δ)}(π), p̂2^{(δ)}(π), . . . , p̂K^{(δ)}(π)). For given canonical costs the probability of error declines as δ decreases. In particular, if an action ai has positive canonical cost, then the probability this action is chosen tends to zero as δ tends to zero.
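The computation of Bδ(π) from a vector of payoffs is straightforward. The sketch below (our illustration; the exponential choice H1(x) = e^{−x} and the payoff values are assumptions) implements equations (12), (13), (18) and (19):

```python
# Sketch of the best response with error, equations (12)-(19), for a
# single-decision game. payoffs[i] plays the role of W_i(pi) for the
# current resident strategy; H1 is assumed exponential.

import math

def best_response_with_error(payoffs, delta):
    """Return the probabilities p_hat_i of equation (18), using
    H_delta(x) = exp(-x/delta), i.e. H1(x) = exp(-x) in equation (19)."""
    w_hat = max(payoffs)                           # equation (12)
    costs = [w_hat - w for w in payoffs]           # canonical costs, eq. (13)
    weights = [math.exp(-c / delta) for c in costs]  # strictly positive, cf. (14)
    total = sum(weights)
    return [b / total for b in weights]            # equation (18)

# Example with three actions: the optimal action gets the highest
# probability, and costly errors become rarer as delta shrinks.
print(best_response_with_error([1.0, 0.9, 0.2], delta=0.1))
print(best_response_with_error([1.0, 0.9, 0.2], delta=0.01))
```

With this exponential choice of H1 the rule is a "softmax" of the payoffs with temperature δ, since p̂i ∝ e^{−Ci(π)/δ} ∝ e^{Wi(π)/δ}.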
For given δ the best response with error function Hδ has two especially useful properties.

Property 1. Bδ(π) is uniquely defined for each strategy π.

This contrasts with the best response without error, where there may be many best responses to a strategy π.

To obtain the second property we assume that the payoffs Wi(π) are continuous functions of π for each i. This is likely to be true for any reasonable game. Assuming it does hold we have

Property 2. Bδ(π) is a continuous function of π.

Again this contrasts with the best response without error, which may be discontinuous, particularly at an ESS.

We define the ESS with error function Hδ to be a strategy πδ* which satisfies

    Bδ(πδ*) = πδ*;    (20)

that is, πδ* is the best response to itself with error function Hδ. The continuity of Bδ (Property 2) allows us to deduce two general properties of πδ*.

Property 3. There is always at least one solution to equation (20).

Property 4. Let πn be a sequence of strategies which converges to some limiting strategy, π∞. Then π∞ = πδ* if and only if

    |πn − Bδ(πn)| → 0 as n → ∞.    (21)

Suppose that in a game without error a sequence of iterates π0, π1, π2, . . . is calculated. Suppose also that the sequence appears to be converging. We noted previously that, assuming the sequence is converging, it is difficult to know whether convergence is to an ESS. The main problem is that, even if π0, π1, π2, . . . is converging to an ESS π*, the sequence of best responses B(π0), B(π1), B(π2), . . . may not converge, or may converge to a limit other than π*. Property 4 shows that these difficulties disappear when attempting to calculate an ESS with error, since criterion (21) gives a necessary and sufficient condition that convergence is to the correct limit.

Other properties of an ESS with error may depend on the specific nature of the game under consideration. For any particular game one would like to know which iterative schemes are liable to work in finding πδ*, and which of the ESSs without error are limits, as δ tends to 0, of ESSs with error. In this paper we do not attempt a general analysis of these issues, but look at them in detail in the Hawk-Dove game.

4 The Hawk-Dove Game with Error

When each contestant must choose between just two actions, as in the Hawk-Dove game, the formulae of the previous section can be re-expressed in a different form. We present this new form, indicate how it is related to the previous definitions, and then use the new form to analyse decision errors in the Hawk-Dove game.

We begin by defining a class, {Gδ}, of error functions. Let G1 be a function of a real variable which satisfies:

    0 < G1(x) < 1 for all x.    (22)
    G1 is continuous and strictly increasing.    (23)
    G1(x) + G1(−x) = 1 (and hence G1(0) = 1/2).    (24)
    G1(x) → 1 as x → ∞.    (25)
    |G1(x) − G1(y)| ≤ |x − y| for all x, y.    (26)

Now for each δ in the range 0 < δ ≤ 1 define the error function Gδ by

    Gδ(x) = G1(x/δ).    (27)

As δ decreases, the function Gδ becomes more step-like (Figure 1). Formally Gδ(x) → G0(x) as δ → 0, where by G0 we mean the function

    G0(x) = 0 for x < 0;  G0(x) = 1/2 for x = 0;  G0(x) = 1 for x > 0.    (28)

Now suppose that there are just two actions, a1 and a2, to choose from in a game. A strategy is thus defined by a pair of numbers π = (p1, p2). However, since p1 + p2 = 1 we can define a strategy by a single number π, where π is the probability of choosing action a2. As before let Wi(π) be the expected reproductive value of an individual which plays action i when the resident population strategy is π.
In the absence of errors the best mutant response to resident population strategy π is to choose action a2 with probability 0 when W2(π) − W1(π) < 0 and to choose action a2 with probability 1 when W2(π) − W1(π) > 0. In other words the best mutant response is to choose action a2 with probability

    G0(W2(π) − W1(π)),    (29)

where the function G0 is given by equation (28). Now let Gδ be an error function as defined above. Motivated by formula (29) we define the best response to π with error function Gδ to be the strategy under which action a2 is chosen with probability

    Bδ(π) = Gδ(W2(π) − W1(π)).    (30)

An ESS with error function Gδ is then a strategy πδ* satisfying

    Bδ(πδ*) = πδ*.    (31)

Although we have chosen to define the best response with error here in a seemingly different way to its definition in the last section, it is not difficult to show that the two definitions agree provided the functions G1 and H1 are related by

    G1(x) = H1(0) / (H1(0) + H1(x)) for x ≥ 0,    (32)

with G1(x) for x < 0 being given by condition (24). For example when H1(x) = e^{−x} we have

    G1(x) = 1 / (1 + e^{−x}), −∞ < x < ∞.    (33)

To introduce errors into the Hawk-Dove game we can equate "play Dove" with action a1 and "play Hawk" with action a2. By equation (2) we then have

    W2(π) − W1(π) = (1/2)(V − πC),    (34)

so that if the resident population plays Hawk with probability π the best response with error function Gδ is to play Hawk with probability

    Bδ(π) = Gδ((1/2)(V − πC)).    (35)

Unlike the best response in the standard Hawk-Dove game, the best response with error is a uniquely defined strategy. Under this strategy errors are made, but the probability of error decreases with the fitness cost of the error and tends to zero as the fitness cost increases. For a given fitness cost, the probability of error decreases as δ decreases, and tends to zero as δ → 0. Figure 2 shows the best response with error as a function of the resident population strategy π.

For the Hawk-Dove game it is easy to show that, for a given function Gδ, equation (31) has a unique solution (Appendix 2). Thus there is a unique ESS with error for each function Gδ. For given Gδ one can show that:

    If 0 < V/C < 1/2 then V/C < πδ* < 1/2.    (36)
    If V/C = 1/2 then πδ* = V/C.    (37)
    If 1/2 < V/C < 1 then 1/2 < πδ* < V/C.    (38)

(Appendix 2). Thus πδ* always lies between 1/2 and the ESS without error, π* = V/C. One might hope that as the probability of error decreases, the ESS with error tends to the ESS without error. In Appendix 2 it is shown that this is indeed the case:

    πδ* → V/C as δ → 0.    (39)

Figure 3 shows the dependence of πδ* on δ for two different initial error functions G1.

In the Hawk-Dove game without error the best response, B(π), is a discontinuous function of the resident population strategy, π. As we have seen, the jump discontinuity at π = V/C leads to problems when attempting to compute the ESS by numerical schemes based on iteration of the best response map. When there is error, the best response Bδ(π) is a continuous function of π for all π. Consequently, in computing πδ* almost all the previous problems disappear. Let the sequence of strategies π0, π1, π2, . . . be given by

    πn = (1 − λ)πn−1 + λBδ(πn−1)    (40)

(cf. equation (6)). Here, as before, the replacement factor λ satisfies 0 < λ ≤ 1. It can then be shown (Appendix 2) that

    πn → πδ* as n → ∞    (41)

provided that

    0 < λ ≤ δ/(1 + C/4).    (42)

Thus, unlike the case without error, the iterative scheme with a fixed level of damping works provided that there is sufficient damping.
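The following sketch (ours; the logistic error function of equation (33) and the values V = 2, C = 3, δ = 0.1 are illustrative) implements scheme (40) with a damping level satisfying inequality (42):

```python
# Sketch of scheme (40) for the Hawk-Dove game with error, using the
# logistic error function G_delta(x) = 1/(1 + exp(-x/delta)) of eq. (33).

import math

V, C = 2.0, 3.0        # illustrative values, as in Figures 2 and 3
delta = 0.1            # error parameter

def B_delta(pi):
    # Equation (35): best response with error to resident strategy pi.
    return 1.0 / (1.0 + math.exp(-0.5 * (V - pi * C) / delta))

lam = delta / (1.0 + C / 4.0)   # damping satisfying condition (42)

pi = 0.0
while abs(pi - B_delta(pi)) > 1e-10:          # criterion (21) as stopping rule
    pi = (1.0 - lam) * pi + lam * B_delta(pi) # equation (40)

print("ESS with error:", pi)        # pi*_delta, a little below V/C here
print("ESS without error:", V / C)  # pi* = 2/3
```

The stopping rule uses criterion (21), which, as shown in Appendix 2, holds if and only if the sequence converges to πδ*; unlike criterion (10), it is therefore a principled test of convergence to the correct limit.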
As inequality (42) shows, the level of damping required depends on the amount of error. When the probability of error decreases (δ decreases), the amount of damping required increases (the replacement factor λ decreases).

Finally, suppose an iterative scheme is employed in an attempt to calculate πδ*, and suppose the scheme generates the sequence of strategies π0, π1, π2, . . .. Then it can be shown that this sequence converges to πδ* if and only if condition (21) holds (Appendix 2). This is a stronger result than the general result given in Property 4, as we do not demand that the sequence π0, π1, π2, . . . is already known to be converging.

5 State-dependent Dynamic Games

Many realistic game-theoretic models of animal behaviour consider animals that engage in repeated interactions whose consequences for an animal depend on its state. For example, consider the behaviour of a small bird in winter that has to engage in contests with other birds in order to gain access to food. During a winter's day, a bird will typically be involved in many contests. In each contest, the bird has to decide on its level of aggression. In this context, a strategy is a rule that specifies how a bird's level of aggression depends on its energetic reserves and time. Given a resident population strategy, we can determine the best response for a mutant bird. To do so, however, we cannot start by considering contests in isolation from one another. This is because the value to the mutant of winning a contest and hence obtaining food will depend on the amount of food that it is likely to obtain in the future. This future food will depend on both the mutant's behaviour and the behaviour of other members of the population. At a given time in the future, the level of aggression shown by other population members will depend both on the population strategy at that time (which specifies how aggression depends on reserves) and on reserves at that time, which depend on the behaviour of all population members prior to this time. An example in which each contest over food is modelled as a Hawk-Dove game is analysed by Houston & McNamara (1988) and McNamara et al. (1991).

This dependence of the best current action on both the behaviour of the focal individual in the future and the resident's behaviour at past and future times occurs in many games of biological interest. Examples include information exchange during extended contests (Enquist & Leimar 1983, 1987, Leimar & Enquist 1984), calling to attract mates (Houston & McNamara 1987, Lucas & Howard 1995, Lucas et al. 1996) and growth and cannibalism (Crowley & Hopper 1994). We now present a general model that applies to all of these cases, and provide a framework for introducing errors in such games.

The Model

We model behaviour over a finite time interval with decision epochs t = 0, 1, 2, . . . , T − 1. For each fixed t the set of possible states, E(t), of an organism is finite. For each fixed time t and state x the set of available actions, A(x, t), is finite. A strategy, π, is a Markov rule for choosing actions as a function of state and time. This rule may be probabilistic, so that for each state, x, and time, t, π specifies the probability, pi(x, t; π), that each action ai is chosen.

Suppose that the resident strategy is π and consider a single individual in this population following a possibly different strategy. Suppose this individual chooses action ai at time t when in state x.
Then the individual obtains an immediate contribution ri(x, t; π) to its reproductive success, and is in state y at time t + 1 with probability γi(x, y, t; π). If the individual is in state xT at final time T, its reproductive value is R(xT; π). The total payoff to the individual is the expected sum of the immediate contributions to reproductive success at times t = 0, 1, 2, . . . , T − 1 and the final reproductive value at time T.

The best response with error

We introduce errors in decision making into this game by assuming that an organism has a probability of making an error in every state and at every time. As before, the probability of making an error depends on the cost of the error. Costs are, however, dependent on future expectations and hence on errors in decision making in the future. We thus find the best response with error by working backwards, as in dynamic programming.

Take an error function Hδ given by equations (19) and (14) - (17). Let π be a resident population strategy. The best response, Bδ(π), to π with error function Hδ is defined inductively by working backwards from final time T. Suppose that behaviour under Bδ(π) has already been found for every state at times t + 1, t + 2, . . . , T − 1. An individual which uses strategy Bδ(π) from time t + 1 onwards has reproductive value W(y, t + 1; π) at time t + 1, where y is its state at this time. Now focus on an individual in state x at time t. If this individual chooses action ai at this time and then uses strategy Bδ(π) from time t + 1 onwards, its reproductive value is

    Wi(x, t; π) = ri(x, t; π) + ∑_y γi(x, y, t; π)W(y, t + 1; π);    (43)

here the sum is over all possible states y at time t + 1. Set

    Ŵ(x, t; π) = max_i Wi(x, t; π)    (44)

(cf. equation (12)) and set

    Ci(x, t; π) = Ŵ(x, t; π) − Wi(x, t; π)    (45)

(cf. equation (13)). Then we can assign a weight

    βi(x, t; π) = Hδ(Ci(x, t; π))    (46)

to the choice of action ai. Strategy Bδ(π) prescribes that this focal individual chooses action ai at time t with probability

    p̂i(x, t; π) = βi(x, t; π) / ∑_j βj(x, t; π)    (47)

(cf. equation (18)). The reproductive value of this individual is thus

    W(x, t; π) = ∑_i p̂i(x, t; π)Wi(x, t; π).    (48)

Equations (43) - (48) define W at time t in terms of W at time t + 1. Since

    W(x, T; π) = R(x; π)    (49)

we can find W(x, t; π) for all states x and all times t by backward induction. Equation (47) then specifies the action chosen under Bδ(π) for every state and time.

Properties of the best response function

Let Bδ, defined by equation (47) above, be the best response function for our general state-dependent dynamic game. Property 1, that Bδ(π) is uniquely defined for each strategy π, follows directly from equations (43) - (49) and the strict positivity of Hδ. For Property 2, that Bδ(π) is a continuous function of π, we require continuity conditions on the immediate reward functions and transition functions defined above. Sufficient conditions are given in Appendix 3. The general versions of Properties 3 and 4 then follow directly from the continuity of Bδ (Appendix 3).

If we have found a convergent sequence of strategies πn which converges to some limiting strategy π∞, then Property 4 enables us to check easily whether or not π∞ provides a Nash equilibrium. We do not address the problem of finding such a convergent sequence for the general state-dependent dynamic game in this paper.
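A skeleton of this backward induction is sketched below. It is our illustration only: the `game` object with methods `states`, `actions`, `r`, `gamma` and `R` is a hypothetical interface for the quantities E(t), A(x, t), ri, γi and R under a fixed resident strategy π, and the exponential H1(x) = e^{−x} is assumed.

```python
# Skeleton of the backward induction defining the best response with
# error, equations (43)-(49). The `game` object is a hypothetical
# interface supplying E(t), A(x,t), r_i, gamma_i and R for a fixed
# resident strategy pi; H_delta is exponential, as in equation (19).

import math

def best_response_with_error(game, T, delta):
    W = {(x, T): game.R(x) for x in game.states(T)}   # eq. (49)
    policy = {}
    for t in range(T - 1, -1, -1):
        for x in game.states(t):
            actions = game.actions(x, t)
            # eq. (43): value of each action given error-prone future play
            W_i = [game.r(x, t, a) +
                   sum(game.gamma(x, y, t, a) * W[(y, t + 1)]
                       for y in game.states(t + 1))
                   for a in actions]
            w_hat = max(W_i)                                      # eq. (44)
            beta = [math.exp(-(w_hat - w) / delta) for w in W_i]  # eqs. (45)-(46)
            total = sum(beta)                                     # >= 1, since the
            p_hat = [b / total for b in beta]                     # best action has
            policy[(x, t)] = dict(zip(actions, p_hat))            # weight 1; eq. (47)
            W[(x, t)] = sum(p * w for p, w in zip(p_hat, W_i))    # eq. (48)
    return policy, W
```

The returned `policy` holds the choice probabilities p̂i(x, t; π) of equation (47) for every state and time, and the table `W` holds the corresponding reproductive values W(x, t; π).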
An example: Brood desertion. We illustrate the model with an example based on brood desertion. In some species, if the brood receives no care it is likely to die, whereas the extra advantage of biparental care over uniparental care may be small. Under these circumstances, it might be worthwhile for a parent to desert in order to try to mate and obtain another brood. The best decision for one parent clearly depends on the behaviour of that parent's partner. It also depends on how easy it is to get further mates and on future desertion decisions. The probability of future matings depends on the number of males and females looking for mates, which in turn depends on the previous desertion decisions of all population members.

Székely et al. (unpub) present a particular model of desertion that is relevant to birds. Females differ in terms of the clutch size that they lay. Once a clutch is laid, both the male and the female must decide whether to care for the brood or desert. For each sex a strategy specifies how the desertion decision depends on clutch size and time during the breeding season. For some parameter values, the iteration scheme based on the best response with a fixed level of damping fails to converge whatever the level of damping. This occurs because optimal desertion decisions depend discontinuously on the sex ratio amongst birds looking for mates, so that the best response map is discontinuous. Introducing errors into decision making in the manner explained above smooths the best response map, so that the iterative scheme now converges provided there is sufficient damping. As with the Hawk-Dove game, the level of damping required decreases as the probability of error increases, as is shown in Figure 4. The figure also shows that when the replacement factor λ is small (high damping), convergence is slow even when it occurs.

6 Discussion

Previous analyses of dynamic games in a biological context have typically used some damped iteration of the best response map in the search for Nash equilibria. This technique has not always been successful (e.g. Houston & McNamara 1987, Crowley & Hopper 1994, Holmgren & Hedenström 1995, Lucas & Howard 1995, Lucas et al. 1996, Johnstone pers. comm.). The difficulties that we experienced in our attempt to find equilibria in dynamic models of desertion led us to develop the computational technique based on errors that is presented here. The technique is very general and applicable to any game. Introducing errors in the way that we describe will eliminate discontinuities in the best response map. Another advantage is that errors should obviate difficulties that arise from representing state variables on a discrete grid. Using the technique in desertion games, we have always been able to find equilibria. The technique has also been useful in other games (Johnstone 1997, Henson pers. comm., Lucas pers. comm.).

Although we have introduced the idea of errors in decision-making as a computational tool for finding ESSs, there is an alternative, biological, justification for their use. It is clearly unreasonable to expect animals to behave in exactly the way predicted by the simple models used in behavioural ecology. In the context of prey choice, models typically predict that a prey type is either always accepted or always rejected (i.e. choice is all-or-none), whereas the data show that under given conditions a prey type may be sometimes accepted and sometimes rejected, i.e. animals show partial preferences (McNamara & Houston, 1987).
There is, however, a pattern to the deviations from the predictions of optimisation models: costly deviations tend to be rare (see Houston 1987 for a review of some examples). It is precisely this aspect of real behaviour which is captured by our procedure.

Economists have introduced errors into games not as a computational tool but to eliminate Nash equilibria which are thought to be unrealistic (see, for example, Fudenberg & Tirole 1991). In particular, the definition of a proper Nash equilibrium (Myerson 1978) is based on a similar concept to our ESS with error. Biological games may also have Nash equilibria which are unrealistic in that they disappear when errors are introduced. Errors have been used in biology to stabilise or otherwise resolve situations where drift may erode neutrally stable solutions (cf. Parker & Rubenstein 1981; Hammerstein & Parker 1982). Thus, given that animals make errors, our technique is both computationally useful and ensures that the predicted outcomes are biologically realistic.

Acknowledgements

JNW was supported by a BBSRC grant to JMMcN, AIH and EJC. TS was supported by a Leverhulme Grant to AIH, Innes Cuthill and JMMcN.

Appendix 1. Iterative Scheme for the Hawk-Dove Game

We consider the scheme

    πn = (1 − λn)πn−1 + λn if πn−1 < π*    (A1.1)
    πn = πn−1 if πn−1 = π*    (A1.2)
    πn = (1 − λn)πn−1 if πn−1 > π*,    (A1.3)

where π* = V/C. Three cases are analysed.

Case I. λn constant. Let λn = λ where 0 < λ ≤ 1. Let A ⊆ [0, 1] be the set of π0 for which πn = π* for some n. Note that by equations (A1.1) - (A1.3), for any πn there are at most three values of πn−1 which give rise to this πn. Thus there are at most 3ⁿ initial points π0 such that πn = π*. It follows that the set A is countable.

Suppose πn−1 < π*. Then by equation (A1.1)

    πn − πn−1 = λ(1 − πn−1) > λ(1 − π*).    (A1.4)

Conversely, suppose πn−1 > π*. Then by equation (A1.3)

    πn−1 − πn = λπn−1 > λπ*.    (A1.5)

By inequalities (A1.4) and (A1.5)

    |πn − πn−1| > λ min{1 − π*, π*}    (A1.6)

for πn−1 ≠ π*. Thus if π0 ∉ A, inequality (A1.6) holds for all n and the sequence π0, π1, π2, . . . does not converge.

Case II. λn ↓ 0, Σλn = ∞. We first show that there is a sequence n0 < n1 < n2 < · · · such that for even k

    π_{n_k} ≥ π_{n_k + 1} ≥ · · · ≥ π_{n_{k+1} − 1} ≥ π* ≥ π_{n_{k+1}},    (A1.7)

and for odd k

    π_{n_k} ≤ π_{n_k + 1} ≤ · · · ≤ π_{n_{k+1} − 1} ≤ π* ≤ π_{n_{k+1}}.    (A1.8)

To see this, suppose that for some n we have πn > π*. Then if π_{n+1}, π_{n+2}, . . . , π_{n+r} > π* we have

    π_{n+s} = ∏_{m=n+1}^{n+s} (1 − λm) πn,  1 ≤ s ≤ r + 1,    (A1.9)

by equation (A1.3). Thus πn ≥ π_{n+1} ≥ · · · ≥ π_{n+r+1}. Since (1 − λm) ≤ e^{−λm} we also have

    π_{n+s} ≤ exp{ − ∑_{m=n+1}^{n+s} λm }.    (A1.10)

If π_{n+s} > π* for all s we would have

    lim_{s→∞} π_{n+s} ≤ exp{ − ∑_{m=n+1}^{∞} λm } = 0

since Σλm is divergent. Thus, since π* > 0, there would be an s such that π_{n+s} ≤ π*, a contradiction. It follows that π_{n+s} ≤ π* for some s. This shows that given n_k with π_{n_k} ≥ π* there exists n_{k+1} > n_k such that (A1.7) holds. Construction of the sequence (A1.8) is similar. It can then be seen that if n0 is the first value of n for which πn ≥ π*, one can construct the whole sequence by induction on k.

Now note that πn−1 ≥ π* ≥ πn implies that

    πn ≥ (1 − λn)πn−1 ≥ (1 − λn)π*  by (A1.2) and (A1.3),

so that 0 ≤ π* − πn ≤ λnπ* ≤ λn. Similarly, if πn−1 ≤ π* ≤ πn we have

    πn ≤ (1 − λn)πn−1 + λn ≤ (1 − λn)π* + λn  by (A1.1) and (A1.2),

so that 0 ≤ πn − π* ≤ λn(1 − π*) ≤ λn.
Applying this to the sequence satisfying conditions (A1.7) and (A1.8) shows that for k ≥ 1 we have |πn − π*| ≤ λ_{n_k} for n_k ≤ n ≤ n_{k+1} − 1. Thus πn → π* as n → ∞.

Case III. λn ↓ 0, Σλn < ∞. We first show that the sequence {πn} is convergent. If πn−1 < π*, then by equation (A1.1) πn − πn−1 = λn(1 − πn−1). If πn−1 = π*, then by equation (A1.2) πn − πn−1 = 0. Finally, if πn−1 > π*, then by equation (A1.3) πn−1 − πn = λnπn−1. Thus |πn−1 − πn| ≤ λn for all n. Now let η > 0 be given. Since ∑_{n=1}^{∞} λn is convergent we can choose an N such that ∑_{n=N}^{∞} λn ≤ η. Thus if m ≥ n ≥ N we have

    |πm − πn| ≤ ∑_{s=n+1}^{m} |π_{s−1} − πs| ≤ ∑_{s=n+1}^{m} λs ≤ ∑_{s=N}^{∞} λs ≤ η.

This shows that {πn} is a Cauchy sequence in [0, 1]. Since [0, 1] is closed, the sequence is convergent.

We now show that, given any π0, there is a range of values for π* (depending on π0) such that the sequence {πn} converges to a limit which is not equal to π*. Suppose π0 ≥ 1/2. Let

    α = ∏_{n=1}^{∞} (1 − λn).

Since Σλn is convergent, it is easy to show that 1 ≥ α > 0. Suppose π* < α/2. Then π0 ≥ α/2. Suppose by induction that π0 ≥ π1 ≥ · · · ≥ πn−1 ≥ α/2. Then, since π* < α/2, by equation (A1.3) we have

    πn = π0 ∏_{s=1}^{n} (1 − λs) ≥ π0 ∏_{s=1}^{∞} (1 − λs) ≥ α/2.

Thus πn ≥ α/2 for all n. Since π* < α/2, πn does not converge to π*. Similarly, if π0 ≤ 1/2 then lim_{n→∞} πn ≤ 1 − α/2 provided π* > 1 − α/2.

Appendix 2. The Hawk-Dove Game with Error

Theorem A2.1 The equation Bδ(π) = π has a unique solution π = πδ*, which satisfies conditions (36) - (38).

Proof. Let b : [0, 1] → R be given by b(π) = π − Bδ(π). Then by conditions (35), (26) and (23), b is continuous and strictly increasing. By condition (22), b(0) < 0 and b(1) > 0. Thus there is a unique π such that b(π) = 0; i.e. Bδ(π) = π.

Suppose V/C < 1/2. Then

    b(V/C) = V/C − G1(0) = V/C − 1/2 < 0.

Also

    b(1/2) = 1/2 − G1((1/(2δ))(V − C/2)).

But V − C/2 < 0. Thus by conditions (23) and (24), G1((1/(2δ))(V − C/2)) < 1/2, and hence b(1/2) > 0. Since b(πδ*) = 0, it follows that V/C < πδ* < 1/2. Results (37) and (38) follow similarly.

Theorem A2.2 For a given G1 there exists ε > 0 such that

    |πδ* − V/C| ≤ 2δε/C for 0 < δ ≤ 1.    (A2.1)

Proof. Choose ε such that

    G1(ε) = 1/2 + |V/C − 1/2|.    (A2.2)

First suppose V/C < 1/2. Then G1(ε) = 1 − V/C and hence G1(−ε) = V/C by equation (24). Let π̃ satisfy V − π̃C = −2δε. Then π ≥ π̃ implies that V − πC ≤ −2δε and hence

    Bδ(π) = G1((1/(2δ))(V − πC)) ≤ G1(−ε) = V/C.

But πδ* > V/C by Theorem A2.1. Thus Bδ(πδ*) = πδ* > V/C, so that πδ* < π̃, i.e. πδ* ≤ V/C + 2δε/C. Since πδ* > V/C, inequality (A2.1) follows. The case V/C > 1/2 is similar, and the case V/C = 1/2 is trivial.

Property (39) follows directly from this theorem.

Theorem A2.3 Let 0 < δ ≤ 1 and define h : [0, 1] → [0, 1] by

    h(π) = (1 − λ)π + λBδ(π).    (A2.3)

Suppose λ satisfies

    0 < λ ≤ δ/(1 + C/4).    (A2.4)

Then

    |h(π1) − h(π2)| ≤ (1 − λ)|π1 − π2|    (A2.5)

for all strategies π1 and π2.

Proof. Without loss of generality suppose 0 ≤ π1 < π2 ≤ 1. Then

    h(π2) − h(π1) = (1 − λ)(π2 − π1) − λ[G1((1/(2δ))(V − π1C)) − G1((1/(2δ))(V − π2C))].    (A2.6)

But

    0 < G1((1/(2δ))(V − π1C)) − G1((1/(2δ))(V − π2C)) ≤ (C/(2δ))(π2 − π1)    (A2.7)

by conditions (23) and (26). Thus by (A2.6) and (A2.7)

    (1 − λ − Cλ/(2δ))(π2 − π1) ≤ h(π2) − h(π1) ≤ (1 − λ)(π2 − π1).    (A2.8)

Now 1 − λ − Cλ/(2δ) = 1 + λ − λ(2 + C/(2δ)). But

    2 + C/(2δ) = (2/δ)(δ + C/4) ≤ (2/δ)(1 + C/4) since δ ≤ 1, and (2/δ)(1 + C/4) ≤ 2/λ by inequality (A2.4).

Thus λ(2 + C/(2δ)) ≤ 2, so that 1 − λ − Cλ/(2δ) ≥ λ − 1. Inequality (A2.5) then follows from this and inequality (A2.8).

By Theorem A2.3, h is a contraction mapping provided inequality (A2.4) holds.
Since the fixed point of h is the fixed point, πδ*, of Bδ, it follows that the sequence π0, π1, π2, . . . of strategies given by condition (40) satisfies condition (41) provided that inequality (42) holds.

Theorem A2.4 Let {πn} be a sequence of strategies. Then πn → πδ* ⟺ Bδ(πn) − πn → 0.

Proof. First suppose that πn → πδ*. Since Bδ is continuous, Bδ(πn) → Bδ(πδ*). Thus Bδ(πn) − πn → πδ* − πδ* = 0.

To prove the converse, let π be any strategy. First suppose π ≥ πδ*. Then since Bδ is a decreasing function, Bδ(π) ≤ Bδ(πδ*) = πδ* ≤ π. Thus

    0 ≤ π − πδ* ≤ π − Bδ(π).    (A2.9)

Now suppose π ≤ πδ*. Then since Bδ is decreasing, Bδ(π) ≥ Bδ(πδ*) = πδ* ≥ π. Thus

    0 ≤ πδ* − π ≤ Bδ(π) − π.    (A2.10)

From inequalities (A2.9) and (A2.10),

    |π − πδ*| ≤ |π − Bδ(π)| for all π.

Thus Bδ(πn) − πn → 0 implies πn − πδ* → 0.

Appendix 3

For a given resident strategy π, let φ(x, t; π) denote the probability that a randomly selected member of the resident population is in state x at time t. We assume that the initial frequency distribution is given by φ(x, 0; π) = q(x) and is independent of π. Let φt(π) denote the vector with components φ(x, t; π), x ∈ E(t), and let pt(π) denote the matrix with components pi(x, t; π), x ∈ E(t), i ∈ A(x, t). Throughout this appendix, Bδ denotes the function defined by equation (47).

For each fixed x and t, let ∆(x, t) denote the simplex

    ∆(x, t) = {(p1, . . . , p_{K(x,t)}) : 0 ≤ pj ≤ 1 for j = 1, . . . , K(x, t) and ∑_{j=1}^{K(x,t)} pj = 1},

where K(x, t) denotes the number of possible actions in A(x, t). Let ∆(t) denote the Cartesian product over x ∈ E(t) of the sets ∆(x, t), and let ∆ denote the Cartesian product over t ∈ {0, 1, . . . , T − 1} of the sets ∆(t). Note that ∆ is a compact, convex subset of R^M, where M = ∑_{t,x} K(x, t). A strategy π is defined in terms of the probabilities pi(x, t; π), so each strategy π corresponds to a unique point in ∆. The rule specified by π for choosing the action at time t is defined in terms of the matrix pt(π), and each pt(π) corresponds to a unique point in ∆(t). We define the distance between two strategies π and π′ by taking

    ‖π − π′‖ = max_{t,x,i} |pi(x, t; π) − pi(x, t; π′)|

and the corresponding distance between two rules pt(π) and pt(π′) by

    ‖pt(π) − pt(π′)‖ = max_{x,i} |pi(x, t; π) − pi(x, t; π′)|.

Similarly, let Γ(t) denote the simplex corresponding to distributions on E(t), so each vector φt(π) corresponds to a unique point in Γ(t). Again the distance between two vectors φt(π) and φt(π′) is defined by

    ‖φt(π) − φt(π′)‖ = max_x |φ(x, t; π) − φ(x, t; π′)|.

Continuity of functions on ∆ and ∆(t) × Γ(t) is defined in the usual way. In particular, for two strategies π and π′ we have

    ‖Bδ(π) − Bδ(π′)‖ = max_{t,x,i} |p̂i(x, t; π) − p̂i(x, t; π′)|,

where the p̂i's are given by equation (47). Since the maximum is over a finite number of terms, Bδ(π) is a continuous function of π if p̂i(x, t; π) is a continuous function of π for each fixed t, x and i.

Theorem A3.1. For each fixed t, x, y and i, consider ri(x, t; π), R(x; π) and γi(x, y, t; π) as functions of the resident strategy π. Let F denote the set of all these functions, for t ∈ {0, 1, . . . , T − 1}, x ∈ E(t), y ∈ E(t + 1) and i ∈ A(x, t).

(A) If each function in F is a continuous function of π then Bδ(π) is a continuous function of π.

Alternatively, consider each ri(x, t; π) to be a function ri(x, t; φt(π), pt(π)), and similarly for the other functions in F.
(B) If each function in F is a continuous function of (φt, pt) then Bδ(π) is a continuous function of π.

Proof (i) Assume (A) holds. Then R(x; π) is a continuous function of π for each x ∈ E(T) and hence W(x, T; π) is a continuous function of π. Now assume W(x, s; π) is a continuous function of π for each s = t + 1, . . . , T and x ∈ E(s). From equations (44) - (48), the standard properties of continuous functions and the positivity of the denominator in equation (47), we have that p̂i(x, t; π) and W(x, t; π) are continuous in π for each x ∈ E(t) and i ∈ A(x, t). Proceeding by induction, we have that p̂i(x, t; π) is a continuous function of π for each t, x and i. Hence Bδ(π) is a continuous function of π.

(ii) Assume (B) holds, so each function in F is a continuous function of (φt, pt). Continuity of Bδ in π will then follow from assumption (A) if both φt(π) and pt(π) are continuous functions of π. The continuity of pt(π) follows from its definition, so we only need to show that φt(π) is continuous in π for each t ∈ {0, 1, . . . , T}. The initial frequency distribution of the resident population is given by φ(x, 0; π) = q(x), independent of π, so φ0(π) is continuous in π. Now assume that φt(π) is continuous in π for some t ∈ {0, . . . , T − 1}. Then for each y ∈ E(t + 1),

    φ(y, t + 1; π) = ∑_x ∑_i φ(x, t; π) pi(x, t; π) γi(x, y, t; π),

so φ(y, t + 1; π) is continuous in π, and hence φt+1(π) is continuous in π. Hence, by induction, φt(π) is continuous in π for each t ∈ {0, 1, . . . , T}.

Theorem A3.3 If Bδ(π) is a continuous function of π, then there is always at least one solution to the equation Bδ(π) = π.

Proof Bδ : ∆ → ∆, where ∆ is a non-empty, compact, convex subset of R^M, and Bδ(π) is a well-defined continuous function of π. The existence of at least one fixed point of the mapping Bδ then follows directly from the Brouwer fixed-point theorem.

Theorem A3.4 Let πn be a sequence of strategies which converges to some limiting strategy, π∞, and assume Bδ(π) is a continuous function of π. Then π∞ satisfies Bδ(π∞) = π∞ if and only if ‖πn − Bδ(πn)‖ → 0 as n → ∞.

Proof (i) Assume ‖πn − Bδ(πn)‖ → 0 as n → ∞. Now

    ‖π∞ − Bδ(π∞)‖ ≤ ‖π∞ − πn‖ + ‖πn − Bδ(πn)‖ + ‖Bδ(πn) − Bδ(π∞)‖.

As n → ∞ the first term tends to zero since πn → π∞, the second term tends to zero by assumption, and the third tends to zero by continuity of Bδ. Hence we must have ‖π∞ − Bδ(π∞)‖ = 0, so π∞ = Bδ(π∞).

(ii) Assume π∞ = Bδ(π∞). Then

    ‖πn − Bδ(πn)‖ ≤ ‖πn − π∞‖ + ‖Bδ(π∞) − Bδ(πn)‖.

As n → ∞, the first term tends to zero since πn → π∞, and the second term tends to zero by continuity.

References

Crowley, P.H. & Hopper, K.R. (1994). How to behave around cannibals: a density-dependent dynamic game. Am. Nat. 143, 117-154.

Enquist, M. & Leimar, O. (1983). Evolution of fighting behaviour: decision rules and assessment of relative strength. J. theor. Biol. 102, 387-410.

Enquist, M. & Leimar, O. (1987). Evolution of fighting behaviour: the effect of variation in resource value. J. theor. Biol. 127, 187-205.

Fudenberg, D. & Tirole, J. (1991). Game Theory. Cambridge, MA: MIT Press.

Hammerstein, P. (1996). Darwinian adaptation, population genetics and the streetcar theory of evolution. J. Math. Biol. 34, 511-532.

Hammerstein, P. & Parker, G.A. (1982). The asymmetric war of attrition. J. theor. Biol. 96, 647-682.

Holmgren, N. & Hedenström, A. (1995). The scheduling of moult in migratory birds. Evol. Ecol. 9, 354-368.

Houston, A.I. (1987). The control of foraging decisions.
In Quantitative Analyses of Behavior (Commons, M.L., Kacelnik, A. & Shettleworth, S.J., eds), Vol. 6: Foraging. Lawrence Erlbaum, New York.

Houston, A.I. & McNamara, J.M. (1987). Singing to attract a mate: a stochastic dynamic game. J. theor. Biol. 129, 57-68.

Houston, A.I. & McNamara, J.M. (1988). Fighting for food: a dynamic version of the Hawk-Dove game. Evol. Ecol. 2, 51-64.

Johnstone, R.A. (1997). The tactics of mutual mate choice and competitive search. Behavioural Ecology and Sociobiology 40, 51-59.

Leimar, O. & Enquist, M. (1984). Effects of asymmetries in owner-intruder conflicts. J. theor. Biol. 111, 475-491.

Lucas, J.R. & Howard, R.D. (1995). On alternative reproductive tactics in anurans: dynamic games with density and frequency dependence. Am. Nat. 146, 365-397.

Lucas, J.R., Howard, R.D. & Palmer, J.G. (1996). Callers and satellites: chorus behaviour in anurans as a stochastic dynamic game. Anim. Behav. 51, 501-518.

McNamara, J.M. & Houston, A.I. (1986). The common currency for behavioral decisions. Am. Nat. 127, 358-378.

McNamara, J.M. & Houston, A.I. (1987). Partial preferences and foraging. Anim. Behav. 35, 1084-1099.

McNamara, J.M., Merad, S. & Collins, E.J. (1991). The Hawk-Dove game as an average-cost problem. Adv. Appl. Prob. 23, 667-682.

Maynard Smith, J. (1982). Evolution and the Theory of Games. Cambridge: Cambridge University Press.

Maynard Smith, J. & Price, G.R. (1973). The logic of animal conflict. Nature 246, 15-18.

Myerson, R.B. (1978). Refinements of the Nash equilibrium concept. International J. Game Theory 7, 73-80.

Parker, G.A. & Rubenstein, D.I. (1981). Role assessment, reserve strategy, and acquisition of information in asymmetric animal conflicts. Anim. Behav. 29, 221-240.

Weissing, F.J. (1996). Genetic versus phenotypic models of selection: can genetics be neglected in a long-term perspective? J. Math. Biol. 34, 533-555.

Table 1. The results of some attempts to solve the Hawk-Dove game by iterations of the best response map. Our computations illustrate the case π* = 1/√2 ≈ 0.707107. The sequence πn = (1 − λn)πn−1 + λnB(πn−1) fails to converge for λn fixed. With λn decreasing such that condition (8) holds, the sequence π0, π1, π2, . . . converges to π* for π0 = 0 but not for π0 = 1.

Table 1

    n       λn = 0.1      λn = 1/n², π0 = 0   λn = 1/n², π0 = 1
            πn            πn                  πn
    0       0.000000      0.000000            1.000000
    1       0.100000      1.000000            0.000000
    2       0.190000      0.750000            0.250000
    3       0.271000      0.666667            0.333333
    4       0.343900      0.687500            0.375000
    5       0.409510      0.700000            0.400000
    50      0.719438      0.707020            0.490000
    51      0.647494      0.707132            0.490196
    52      0.682744      0.706871            0.490385
    53      0.714470      0.706975            0.490566
    54      0.643023      0.707076            0.490741
    55      0.678721      0.707173            0.490909
    1000    0.719472      0.707106            0.499500
    1001    0.647525      0.707107            0.499501
    1002    0.682773      0.707107            0.499501
    1003    0.714495      0.707106            0.499502
    1004    0.643046      0.707107            0.499502
    1005    0.678741      0.707107            0.499503

Table 2. Iterative solution of the Hawk-Dove game with λn decreasing such that condition (9) holds. In this case the sequence πn = (1 − λn)πn−1 + λnB(πn−1) converges to π* for all π0. However, the convergence is very slow, and the sequence of best responses need not converge. The table illustrates the case π* = 1/√2 ≈ 0.707107.
Table 2 (λn = 1/n)

    n       πn          πn − π*       B(πn)
    0       0.000000    −0.707107     1.000000
    1       1.000000    +0.292893     0.000000
    2       0.500000    −0.207107     1.000000
    3       0.666667    −0.040440     1.000000
    4       0.750000    +0.042893     0.000000
    5       0.600000    −0.107107     1.000000
    50      0.700000    −0.007107     1.000000
    51      0.705882    −0.001224     1.000000
    52      0.711538    +0.004432     0.000000
    53      0.698113    −0.008993     1.000000
    54      0.703704    −0.003403     1.000000
    55      0.709091    +0.001984     0.000000
    1000    0.707000    −0.000107     1.000000
    1001    0.707292    +0.000186     0.000000
    1002    0.706586    −0.000520     1.000000
    1003    0.706879    −0.000228     1.000000
    1004    0.707171    +0.000064     0.000000
    1005    0.706467    −0.000639     1.000000

Figure captions

Figure 1. (a) The error function Gδ(x) = 1/(1 + e^{−x/δ}) for δ = 1.0 (solid line), δ = 0.5 (dotted line) and δ = 0.1 (dashed line). (b) The error function Gδ(x) = (1/2)(1 + (x/δ)/(1 + |x/δ|)) for δ = 1.0 (solid line), δ = 0.5 (dotted line) and δ = 0.1 (dashed line).

Figure 2. The best response with error for the Hawk-Dove game. For resident strategy π the best response with error is Gδ((1/2)(V − πC)). Here the error function is given by Gδ(x) = 1/(1 + e^{−x/δ}). The figure shows the best response for δ = 0.1 and δ = 0.5. For given δ, the ESS with error function Gδ satisfies Bδ(πδ*) = πδ*, and is the value of π at which the solid 45° line intersects the best response curve. The best response without error is also shown as a solid line. The ESS without error is π* = 2/3 (V = 2, C = 3).

Figure 3. The ESS with error, πδ*, as a function of the error parameter δ for the Hawk-Dove game. Two error functions are illustrated: Gδ(x) = 1/(1 + e^{−x/δ}) (solid line) and Gδ(x) = (1/2)(1 + (x/δ)/(1 + |x/δ|)) (dotted line). The ESS without error is π* = 2/3 (V = 2, C = 3).

Figure 4. Attempts to find an ESS for the state-dependent desertion game (Székely et al. unpub) by iterating the best response map with fixed level of replacement λ. For the case considered, iterations without error fail to converge for any λ ≥ 0.01. The figure shows the outcome for various combinations of λ and error parameter δ. White cells indicate cases in which convergence occurs in fewer than 200 iterations. Grey cells indicate cases in which between 200 and 1000 iterations were required. Black cells indicate no convergence after 1000 iterations.