Appearances Can Be Deceiving: Lessons Learned Reimplementing Axelrod's "Evolutionary Approach to Norms"

Luis R. Izquierdo¹ & José M. Galán²

¹ Macaulay Institute, Craigiebuckler, Aberdeen, United Kingdom
[email protected]
² Universidad de Burgos, Burgos, Spain & INSISOC Group, Valladolid, Spain
[email protected]

Abstract. In this paper we try to replicate the simulation results reported by Axelrod [1] in an influential paper on the evolution of social norms. Our study shows that Axelrod's results are not as reliable as one would desire. We can obtain the opposite results by running the model for longer, by slightly modifying some of the parameters, or by changing some arbitrary assumptions in the model. This re-implementation exercise illustrates the importance of running stochastic simulations several times, of exploring the parameter space adequately, of complementing simulation with analytical work, and of being aware of the scope of our simulation models.

Introduction

In recent years agent-based modelling (ABM) has shifted from being a heterodox modelling approach to becoming a recognised research methodology in a range of scientific disciplines, e.g. Economics [2, 3], Resource Management and Ecology [4-6], Political Science [7], Anthropology [8, 9], Sociology [10-12], and Biology [13]. One of the main advantages of ABM, and what distinguishes it from other modelling paradigms, is the possibility of establishing a more direct correspondence between entities (and their interactions) in the system to be modelled and agents (and their interactions) in our models [14]. This type of abstraction is attractive for a number of reasons: it leads to formal yet more natural descriptions of the target system, enables us to model heterogeneity and to represent space explicitly, allows us to study the bidirectional relationship between individuals and groups, and it can also capture emergent behaviour (see [15-17]). However, this step forward towards descriptive accuracy, transparency, and rigour comes at a price: models constructed in this way are very often intractable using mathematical analysis, so we usually have to resort to computer simulation.

As a matter of fact, agent-based models are usually so complex that we (their own developers) often do not fully understand in exhaustive detail what is going on in our models. Not knowing exactly what to expect makes it impossible to tell whether any unanticipated results derive exclusively from what the researcher believes are the crucial assumptions in the model, or whether they are just artefacts created in the design, the implementation, or the running process. Artefacts in the design can appear when assumptions which are made arbitrarily (possibly because the designer believes they are not crucial to the research question and will not have any significant effect on the results) turn out to have an unanticipated and significant effect on the results (e.g. the effect of using different topological structures or neighbourhood functions). When this actually occurs, we run the risk of interpreting our simulation results (which generalise the assumptions believed to be irrelevant) beyond the scope of the simulation model (see e.g. [18]). Implementation artefacts appear in the potentially ambiguous process of translating a model described in natural language into a computer program [19].
Finally, artefacts can also occur at the stage of running the program, because the researcher might not be fully aware of how the code is executed in the computer (e.g. unawareness of floating-point errors [20, 21]). Two techniques have proved extremely useful to distinguish artefacts from genuine results: replication of experiments by independent researchers [18, 19, 22, 23] and mathematical analysis [21, 24-26]. Using these two techniques we can increase the rigour, the reliability, and the credibility of our models. In this paper we have replicated two influential models of social norms developed by Axelrod [1] and, in doing so, we illustrate the importance of both independent replication and mathematical analysis.

The structure of the paper is as follows: in the next two sections we give some background to Axelrod's models and we explain them in detail. Subsequently, we present the method used to replicate the original models and to understand their dynamics. Results and discussions are then provided for each of the two models; finally, conclusions are presented in the last section.

Background to Axelrod's Models

Social dilemmas have fascinated scientists from a wide range of disciplines for decades. In a social dilemma, decisions that seem to make perfect sense from each individual's point of view can aggregate into outcomes that are unfavourable for all. In its simplest formalisation, a social dilemma can be modelled as a game in which players can either cooperate or defect. The dilemma comes from the fact that everyone is better off defecting given the other players' decisions, but they all prefer universal cooperation to universal defection. In game-theoretic terms, in a dilemma game all players have strictly dominant strategies¹ that result in a deficient equilibrium² [27].

¹ For an agent A, strategy S*A is strictly dominant if, for each feasible combination of the other players' strategies, A's payoff from playing S*A is strictly higher than A's payoff from playing any other strategy.
² An equilibrium is deficient if there exists another outcome which is preferred by every player.

Within the domain of agent-based modelling there is a substantial amount of work devoted to identifying conditions under which cooperation can be sustained in these problematic situations (see Gotts et al. [28] for an extensive review). In particular, some of this work has investigated the role of social norms and how these can be enforced to promote cooperation. Following Axelrod's [1] definition, we understand that "a norm exists in a given social setting to the extent that individuals usually act in a certain way and are often punished when seen not to be acting in this way". Norms to cooperate provide cooperators with a crucial advantage: the option to selectively punish those who defect [29]. If a norm to cooperate is not in place, the only way that punishment can be exercised in these simple games is by withdrawing cooperation, thus giving rise to potential misunderstandings.

In 1986 Axelrod wrote a pioneering and influential paper on the study of norm enforcement in social dilemmas using computer simulation [1]. In his paper, Axelrod investigates the role of metanorms (norms to follow other norms) in promoting cooperation in a simple agent-based model. He argues that in his model "metanorms can prevent defections if the initial conditions are favourable enough".
However, we have re-implemented his model and our study shows that initial conditions are irrelevant for the long-term behaviour of the model and that, using Axelrod's parameters, metanorms do not prevent defections most of the time. Furthermore, Axelrod's results depend on very specific and arbitrary conditions, the absence of which tends to change the conclusions significantly.

In the next section we explain the two models that Axelrod [1] presents in his paper: the Norms model and the Metanorms model.

Axelrod's Models

The Norms Model

The Norms game is played by 20 agents who have to make two decisions:

1. Agents have to decide whether to cooperate or defect. A defecting agent gets a Temptation payoff (T = 3) and inflicts a Hurt payoff (H = −1) on each of the other agents. If, on the other hand, the agent cooperates, no one's payoff is altered. The opportunity to defect given to an agent comes with a known chance of being seen by each of the other agents, called S. This probability of being observed is drawn from a uniform distribution between 0 and 1 every time a certain agent is given the opportunity to defect.

2. For each observed defection, agents have to decide whether to punish the defector or not. Punishers incur an Enforcement cost (E = −2) every time they punish (P = −9) a defector.

The strategy of an agent is defined by its propensity to defect (Boldness), and its propensity to punish agents it has observed defecting (Vengefulness). Agents defect when given the opportunity if their Boldness is higher than the probability of being observed (S); and they punish observed defectors with probability Vengefulness. In this model, each of these propensities is implemented as a 3-bit string denoting eight evenly spaced values from 0 to 1 (0/7, 1/7, …, 7/7). The actual values for each agent's strategy are determined randomly at the beginning of each simulation run.

A round in this model is completed when every agent has been given exactly one opportunity to defect, and also the opportunity to observe (and possibly punish) any defection that has taken place. Figures 1 and 2 show the UML activity diagram of one round.

Fig. 1. UML activity diagram of Axelrod's models. The UML diagram of the method metaNorms(Number, Agent, Agent) is provided in figure 2.

Fig. 2. UML activity diagram of the method metaNorms(Number, Agent, Agent) of the object model. This method is called in the UML activity diagram shown in figure 1. The condition metaNormsActive is false in the Norms model and true in the Metanorms model.
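As a complement to the diagrams, the decision rules just described can be summarised in a short sketch. The following Java fragment is our own illustration (class and variable names are of our choosing; it is neither Axelrod's code nor our actual RePast implementation) and plays one round of the Norms game:

    import java.util.Random;

    /** Minimal sketch of one round of the Norms game (illustration only). */
    public class NormsRound {
        static final double T = 3, H = -1, E = -2, P = -9;  // Axelrod's payoffs
        static final int N = 20;                             // number of agents
        static final Random rng = new Random();

        /** boldness[i] and vengefulness[i] take one of the values 0/7, 1/7, ..., 7/7. */
        public static void playRound(double[] boldness, double[] vengefulness, double[] payoff) {
            for (int i = 0; i < N; i++) {
                double s = rng.nextDouble();              // chance of being seen, drawn anew
                if (boldness[i] > s) {                    // agent i defects
                    payoff[i] += T;
                    for (int j = 0; j < N; j++) {
                        if (j == i) continue;
                        payoff[j] += H;                   // every other agent is hurt
                        boolean seen = rng.nextDouble() < s;
                        if (seen && rng.nextDouble() < vengefulness[j]) {
                            payoff[i] += P;               // j punishes the defector...
                            payoff[j] += E;               // ...and pays the enforcement cost
                        }
                    }
                }
            }
        }
    }

Four such rounds are played before the evolutionary step described next.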
Four rounds constitute a generation. At the beginning of every generation the agents' payoffs are initialised; at the end of every generation the payoff obtained by every agent over the four rounds is computed and two evolutionary forces come into play:

1. Agents with a payoff exceeding the population average by at least one standard deviation are replicated twice; agents who are at least one standard deviation below the population average are eliminated; and the rest of the agents are replicated once. The number of agents is kept constant, but Axelrod [1] does not specify exactly how. After having studied this process in detail, we have come to the conclusion that this ambiguity is not likely to be of major importance.

2. Whenever a bit string is replicated, every bit has a certain probability of being flipped (MutationRate = 0.01).

Using this model, Axelrod [1] comes to the conclusion that the simulations should spend most of the time in states³ of very high Boldness and very low Vengefulness (norm collapse).

³ The term 'state' denotes here a certain particularisation of every agent's strategy.

The Metanorms Model

Having concluded that the norm to cooperate collapses in the previous model, Axelrod investigates the role of metanorms as a way of enforcing norms. The metanorm dictates that one must punish those who do not follow the norm (i.e. those who do not punish observed defectors). However, someone who does not punish an observed defector might not be caught. In the Metanorms game, the chance of being seen not punishing a defection (given that the defection has been seen) by each of the other 18 agents (excluding the defector) is the same as the chance of seeing such a defection. Similarly, the propensity to punish those who do not comply with the norm (meta-punish) is the same as the propensity to punish defectors⁴. As far as payoffs are concerned, meta-punishers incur a Meta-enforcement cost (ME = −2) every time they Meta-punish (MP = −9) someone who has not punished an observed defector. Figures 1 and 2 show the UML activity diagram of one round in the Metanorms model. Using this model, Axelrod argues that "the metanorms game can prevent defections if the initial conditions are favourable enough".

⁴ Yamagishi and Takahashi [30] use a model similar to Axelrod's, but propose a linkage between cooperation (not being bold) and vengefulness.

Method

In this paper we have used the following three tools:

1. Computer models. We have re-implemented Axelrod's models in Java 2 using RePast 2.2 [31], and added extra functionality to our programs so we can relax several assumptions made in Axelrod's models. Using our computer models, we have been able to perform the following tasks:

a. Replicate Axelrod's experiments using our computer models, which fully comply with the specifications outlined in his paper. This exercise was conducted to study the potential presence of ambiguities and artefacts in the process of translating the model described in the paper into computer code (e.g. is the description in the paper sufficient to implement the model? Could there be implementation mistakes?), to look for artefacts in the process of running the program (e.g. could the results be dependent on the modelling paradigm, programming language, or hardware platform used?), and to assess the process by which the results have been analysed and conclusions derived (e.g. do the results change when the program is run for longer?).

b. Conduct an adequate exploration of the parameter space and study the sensitivity of the model. One major disadvantage of using computer simulation is that a single run does not provide any information on the robustness of the results obtained, nor on the scope of the conclusions derived from them. In order to establish the scope of the conclusions derived from a simulation model it is necessary to determine the parameter range over which the conclusions are invariant.

c. Experiment with alternative models which address the relevant research question (e.g. can metanorms promote cooperation?) just as well [18].
It is often the case that an agent-based model instantiates a more general conceptual model that could embrace different implementations equally well (see [32] for examples of how to find possible variations). Only those conclusions which are not falsified by any of the conceptually equivalent models will be valid for the conceptual model.

2. Mathematical analysis of the computer models. Defining a state of the system as a certain particularisation of every agent's strategy, it can be shown that both the Norms model and the Metanorms model are irreducible, positive recurrent and aperiodic discrete-time finite Markov chains (with 64²⁰ possible states). This observation enables us to say that the probability of finding the system in each of its states in the long run⁵ is unique (i.e. initial conditions are immaterial) and non-zero (Theorems 3.7 and 3.15 in [33]). Although calculating such probabilities exactly is infeasible, we can estimate them using the computer models.

⁵ This is also the long-run fraction of the time that the system spends in each of its states.

3. Mathematical abstractions of the computer models. We have developed one mathematical abstraction for each of the two games (the Norms game and the Metanorms game) in which we study every agent's expected payoff in any given state. These mathematical abstractions do not correspond in a one-to-one way with the specifications outlined in the previous section. They are simpler, more abstract models which are amenable to mathematical analysis and graphical representation. In particular, our mathematical models abstract the details of the evolutionary process (the genetic algorithm) and assume continuity of agents' properties. The mathematical abstractions are used to suggest areas of stability and basins of attraction in the computer models, to clarify their crucial assumptions, to assess their sensitivity to parameters, and to illustrate graphically the dynamics of the system. Any results suggested by the mathematical abstractions are always checked by simulation.
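As a concrete illustration of how the long-run probabilities mentioned in point 2 above are estimated, the following Java sketch runs many independent simulations for a large number of generations and records the proportion of runs ending in each set of states. The Model interface is a hypothetical stand-in for one of our re-implementations (it is not part of RePast), and the thresholds used to classify states are those defined in the next section:

    import java.util.Random;

    /** Sketch of the Monte Carlo estimation of long-run behaviour (illustration only). */
    public class LongRunEstimate {
        /** Hypothetical interface: one step() per generation, plus population averages. */
        interface Model {
            void step();
            double averageBoldness();
            double averageVengefulness();
        }

        interface ModelFactory { Model create(Random rng); }

        public static void estimate(ModelFactory factory, int runs, int generations) {
            int collapsed = 0, established = 0;
            for (int r = 0; r < runs; r++) {
                Model m = factory.create(new Random(r));      // independent run
                for (int g = 0; g < generations; g++) m.step();
                if (m.averageBoldness() >= 6.0 / 7 && m.averageVengefulness() <= 1.0 / 7)
                    collapsed++;                               // norm collapse
                else if (m.averageBoldness() <= 2.0 / 7 && m.averageVengefulness() >= 5.0 / 7)
                    established++;                             // norm establishment
            }
            System.out.println("Fraction of runs with norm collapse:      " + (double) collapsed / runs);
            System.out.println("Fraction of runs with norm establishment: " + (double) established / runs);
        }
    }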
The Norms Model: Results and Discussion

Using the Norms model, Axelrod [1] reports results from 5 runs consisting of 100 generations each. Even though the simulation results are not conclusive at all (they show three completely different possible outcomes), Axelrod comes to the correct conclusion that the simulations should spend most of the time in a state of very high Boldness and very low Vengefulness (norm collapse). In this section we provide a series of arguments that corroborate his conclusion. We start by using the mathematical abstraction of the computer model.

Without making any simplifying assumptions so far, we can say that an agent i, with boldness $b_i$ and vengefulness $v_i$, obtains the following payoff:

$$\mathit{Payoff}_i = \mathit{Def}_i\,T + \sum_{j\neq i} \mathit{Def}_j\,H + \sum_{j\neq i} \mathit{Pun}_{ij}\,E + \sum_{j\neq i} \mathit{Pun}_{ji}\,P \qquad (1)$$

where T, H, E and P are the payoffs mentioned in the description of the model, n is the number of agents (all sums run over j = 1, …, n), and

$$\mathit{Def}_i=\begin{cases}1 & \text{if agent } i \text{ defects}\\ 0 & \text{if agent } i \text{ cooperates}\end{cases}\qquad \mathrm{Prob}(\mathit{Def}_i=1)=b_i,\quad \mathrm{Prob}(\mathit{Def}_i=0)=1-b_i$$

$$\mathit{Pun}_{ij}=\begin{cases}1 & \text{if agent } i \text{ punishes agent } j\\ 0 & \text{if agent } i \text{ does not punish agent } j\end{cases}\qquad \mathrm{Prob}(\mathit{Pun}_{ij}=1)=b_j\,\tfrac{b_j}{2}\,v_i,\quad \mathrm{Prob}(\mathit{Pun}_{ij}=0)=1-b_j\,\tfrac{b_j}{2}\,v_i$$

The expected payoff of agent i is then:

$$\mathrm{Exp}(\mathit{Payoff}_i)= b_i T + (n-1)\,B_{-i}\,H + E\,\frac{v_i}{2}\sum_{j\neq i}b_j^2 + (n-1)\,\frac{b_i^2}{2}\,V_{-i}\,P \qquad (2)$$

where

$$B_{-i}=\frac{1}{n-1}\sum_{j\neq i}b_j$$

is the average Boldness of the population observed by agent i, and similarly $V_{-i}$ for the Vengefulness.

We now define a concept of point stability that we call Evolutionary Stable State (ESS). An ESS is a state (determined by every agent's Boldness and Vengefulness) where: a) every agent gets the same expected payoff (so evolutionary selection pressures will not lead the system away from the state), and b) any single (mutant) agent who changes its strategy gets a strictly lower expected payoff than any of the other agents in the incumbent population (so if one single mutation occurs⁶, the mutant agent will not be able to invade the population).

⁶ We refer here to any change in one single agent's strategy, not a single flip of a bit.

If, at this point, we assume continuity of agents' properties, we can write a necessary condition for a state to be evolutionarily stable. Let M be an arbitrary (potential mutant) agent in a given population of agents P (state), and let $b_M$ be its boldness and $v_M$ its vengefulness. Let I be any of the other (incumbent) agents in the population P. Then eq. (3) is a necessary condition for the population of agents to be an ESS:

$$\forall M\in P:\quad \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}=\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M}\;\;\forall I\in P,\,I\neq M
\quad\text{OR}\quad \Big(b_M=1\;\text{AND}\;\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}\geq\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M}\;\;\forall I\in P,\,I\neq M\Big)
\quad\text{OR}\quad \Big(b_M=0\;\text{AND}\;\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}\leq\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M}\;\;\forall I\in P,\,I\neq M\Big)\qquad(3)$$

If every agent has the same expected payoff (which is a necessary condition for an ESS) and eq. (3) does not hold for some M, I, the potential mutant M could get a differential advantage over incumbent agent I by changing its Boldness $b_M$, meaning that the state under study would not be evolutionarily stable. If, for instance, we find some M, I such that ∂Exp(Payoff_M)/∂b_M > ∂Exp(Payoff_I)/∂b_M and $b_M \neq 1$, then agent M could get a higher payoff than agent I by increasing its boldness $b_M$. Similarly, we obtain another necessary condition substituting $v_M$ for $b_M$ in eq. (3):

$$\forall M\in P:\quad \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}=\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}\;\;\forall I\in P,\,I\neq M
\quad\text{OR}\quad \Big(v_M=1\;\text{AND}\;\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}\geq\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}\;\;\forall I\in P,\,I\neq M\Big)
\quad\text{OR}\quad \Big(v_M=0\;\text{AND}\;\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}\leq\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}\;\;\forall I\in P,\,I\neq M\Big)\qquad(4)$$

It is interesting to note that in general there is no direct relationship between the concept of evolutionary stability as defined above and the Nash equilibrium concept: evolution is about relative payoffs, whereas Nash is about absolute payoffs. A necessary condition for a Nash equilibrium would be, e.g.:
$$\forall M\in P:\quad \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}=0
\quad\text{OR}\quad \Big(b_M=1\;\text{AND}\;\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}\geq 0\Big)
\quad\text{OR}\quad \Big(b_M=0\;\text{AND}\;\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}\leq 0\Big)$$

In Appendix A we use equations (3) and (4) to demonstrate that the only ESS in the Norms game (assuming continuity and using Axelrod's parameters) is the state of total norm collapse ($b_i = 1$, $v_i = 0$ for all i). Here, we use these equations to draw figure 3, which illustrates the dynamics of the system under the assumption that every agent has the same properties ($b_i = B$, $v_i = V$ for all i). States where every agent has the same properties will be called homogeneous. Figure 3 has been drawn according to the following procedure: the arrow departing from a certain homogeneous state (B, V) has a horizontal component iff the condition in eq. (3) is false in that state. In that case, the horizontal component is positive if ∂Exp(Payoff_M)/∂b_M > ∂Exp(Payoff_I)/∂b_M and negative otherwise. The vertical component is worked out in a similar way, but using eq. (4) instead. Only vertical lines, horizontal lines, and the four main diagonals are considered. If both equations (3) and (4) are true, then a red point is drawn.

Fig. 3. Graph showing the expected dynamics in the Norms model, using Axelrod's parameters, and assuming continuity and homogeneity of agents' properties (Boldness on the horizontal axis and Vengefulness on the vertical axis, both from 0 to 1). The procedure used to create this graph is explained in the text. The dashed squares represent the states of norm establishment (green, top-left) and norm collapse (red, bottom-right) as defined in the text below. The red point is the only ESS.

As an example, imagine that in a certain state (B, V) where B ≠ 1 and V ≠ 0 we observe that:

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M} > \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M}
\quad\text{AND}\quad
\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M} < \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}$$

We then draw a diagonal arrow pointing towards greater Boldness and less Vengefulness, since a mutant with greater Boldness and less Vengefulness than the (homogeneous) population could invade it.

We will see in the next section how figures constructed in this way can be extremely useful to suggest simulation experiments to run. However, we must bear in mind that they are mathematical abstractions of the computer model, so they can also be misleading. For instance, even though figure 3 (and equation 5, formally) shows that agents can always gain a competitive advantage by decreasing their Vengefulness in any homogeneous state (unless nobody is bold), that is not necessarily the case in heterogeneous states. As an example, in a state where every agent's properties are zero except for two agents who have $b_i = 1$ and $v_i = 0$, each of the two bold agents would become the only agent with the highest expected payoff if it (individually) increased its vengefulness.

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}=\frac{E}{2}(n-1)B^2 \;<\; \frac{P}{2}B^2=\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}\qquad \forall B\neq 0\qquad(5)$$
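The procedure used to draw figure 3 can also be written down compactly. The following Java sketch is our own illustration: it evaluates the partial derivatives of eq. (2) at homogeneous states with Axelrod's parameters and prints the sign of the horizontal and vertical components of each arrow:

    /** Sketch of the computation behind figure 3 (Norms game, homogeneous states). */
    public class Figure3Field {
        static final double T = 3, H = -1, E = -2, P = -9;   // Axelrod's parameters
        static final int N = 20;

        public static void main(String[] args) {
            for (int vStep = 0; vStep <= 10; vStep++) {
                double V = vStep / 10.0;
                for (int bStep = 0; bStep <= 10; bStep++) {
                    double B = bStep / 10.0;
                    // Marginal expected payoffs of a potential mutant M and of an incumbent I
                    // with respect to the mutant's Boldness and Vengefulness (homogeneous state).
                    double dM_dB = T + (N - 1) * B * V * P;
                    double dI_dB = H + E * V * B;
                    double dM_dV = (E / 2) * (N - 1) * B * B;
                    double dI_dV = (P / 2) * B * B;
                    int dB = arrowComponent(dM_dB - dI_dB, B);  // +1 points towards more Boldness
                    int dV = arrowComponent(dM_dV - dI_dV, V);  // +1 points towards more Vengefulness
                    System.out.printf("B=%.1f V=%.1f arrow=(%+d,%+d)%n", B, V, dB, dV);
                }
            }
        }

        /** No component if the derivatives are equal or the move would leave [0,1] (eqs. 3 and 4). */
        static int arrowComponent(double difference, double value) {
            if (difference > 0 && value < 1) return +1;
            if (difference < 0 && value > 0) return -1;
            return 0;
        }
    }

Where both components are zero the state satisfies the necessary conditions (3) and (4) and is drawn as a red point; with Axelrod's parameters the only such point is (B, V) = (1, 0).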
The mathematical analysis shows that in the vast majority of states it is not advantageous in evolutionary terms to be vengeful, particularly if we increased the number of agents (eq. 6). Punishing only one agent can be advantageous for the punisher, since it inflicts more pain (P) on the punished agent than the cost of punishing (E) borne by the punisher (even though the punisher also obtains a lower payoff!). However, if the population is minimally bold, and being vengeful means punishing too many people, the total cost of being vengeful (exclusively borne by the punisher) can exceed the pain inflicted on each individual defector. Therefore vengeful agents tend to be less successful. When the level of vengefulness in the population is low enough, bold agents will tend to get higher payoffs and the system will head towards the state of norm collapse. So when both evolutionary forces are in place the system should spend most of its time in the neighbourhood of the only evolutionarily stable state.

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}=\frac{E}{2}\sum_{j\neq M}b_j^2\qquad\qquad \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}=\frac{P}{2}\,b_I^2\qquad(6)$$

However, it is only by running simulations that we can confidently explore the dynamics of the model. To analyse the simulation runs, we define the following sets of states:

Norm Collapse: we say that the norm has collapsed when the simulation is in states where the average Boldness is at least 6/7 and the average Vengefulness is no more than 1/7 (see fig. 3).

Norm Establishment: we say that the norm has been established when the simulation is in states where the average Boldness is no more than 2/7 and the average Vengefulness is at least 5/7 (see fig. 3).

Figure 4 shows the proportion of runs (out of 576) where the norm has been established, and where the norm has collapsed, after a certain number of generations (up to 10⁶).

Fig. 4. Proportion of runs where the norm has been established, and where the norm has collapsed, calculated over 576 runs up to 10⁶ generations. The small inset in the middle of the graph zooms in on the first 1,000 generations.

As predicted by the previous analysis, the norm collapses almost always, as Axelrod concluded; only now the argument has been corroborated with more convincing evidence. Looking at figure 4, we can also see that it is not surprising that Axelrod found three completely different possible outcomes after having run the simulation 5 times for 100 generations.

The Metanorms Model: Results and Discussion

Using the Metanorms model, Axelrod [1] again reports results from 5 runs consisting of 100 generations each. In all five runs the norm is clearly established, and Axelrod argues that "the metanorms game can prevent defections if the initial conditions are favourable enough". However, as explained in the method section, initial conditions are immaterial for the long-run behaviour of either of the two models under study. In this section, we investigate whether metanorms can actually prevent defections and, if so, how robust such a statement is.

Replication of the Original Experiments

We replicated Axelrod's experiments but ran many more simulations (1000 runs, as opposed to 5) and for longer (10⁶ generations, as opposed to 100). The results are shown in figure 5. We can see now how misleading running the simulation for only 100 generations was. Even though after 100 generations the norm is almost always established, as time goes by the system approaches its limiting distribution, where the norm usually collapses.

Fig. 5. Proportion of runs where the norm has been established, and where the norm has collapsed, calculated over 1000 runs up to 10⁶ generations. The small inset in the middle of the graph zooms in on the first 1,000 generations.

To understand better the dynamics of the system and the sensitivity of the model, we again used a mathematical abstraction of the computer model. Equation 7 shows the expected payoff for an agent i, with boldness $b_i$ and vengefulness $v_i$:
$$\mathrm{Exp}(\mathit{Payoff}_i)= b_i T + (n-1)\,B_{-i}\,H + E\,\frac{v_i}{2}\sum_{j\neq i}b_j^2 + (n-1)\,\frac{b_i^2}{2}\,V_{-i}\,P + ME\,\frac{v_i}{4}\sum_{k\neq i}\;\sum_{j\neq i,k}b_k^3\,(1-v_j) + MP\,\frac{1-v_i}{4}\sum_{k\neq i}\;\sum_{j\neq i,k}b_k^3\,v_j\qquad(7)$$

In Appendix B we demonstrate that in the Metanorms game, assuming continuity, there is now one (and only one) ESS where the norm is established⁷ ($b_i = 4/169$, $v_i = 1$ for all i), and there is also at least one ESS where the norm collapses ($b_i = 1$, $v_i = 0$ for all i).

⁷ This ESS is not a Nash equilibrium.

Figure 6 shows the expected dynamics of the Metanorms game assuming continuity and homogeneity of agents' properties. Looking at figure 6, we find that it is not surprising that running 5 simulations for 100 generations could mislead us into thinking that the norm will always be established. If the initial strategies are random, chances are that the system will move towards the ESS where the norm is established. However, the region to the left of this ESS is a nearby escape route towards the ESS where the norm collapses. Intuitively, for very low levels of boldness the very few defections that occur are those that are very unlikely to be seen⁸, meaning that an agent who happens to observe a defection and who does not punish it is also very unlikely to be caught. And, let's face it, in this model the only reason for which agents may punish defectors is to avoid being meta-punished⁹. So when defections are hard to see, not punishing pays off, because it is very unlikely that the non-punisher will be caught. Thus being vengeful is disadvantageous, and forgiving agents gain a competitive advantage. As the level of vengefulness decreases, the level of boldness below which it is advantageous not to be vengeful increases, since agents meta-punish less than before (vengefulness is both the propensity to punish and to meta-punish). Agents then become less and less vengeful, and consequently bolder and bolder, so the norm eventually collapses.

⁸ Remember that agents defect iff their boldness is higher than the probability of being seen.
⁹ Interestingly enough, recent research suggests that people genuinely enjoy punishing others who have done something wrong [34].

Fig. 6. Graph showing the expected dynamics in the Metanorms model, using Axelrod's parameters, and assuming continuity and homogeneity of agents' properties (Boldness on the horizontal axis and Vengefulness on the vertical axis; an inset magnifies the low-Boldness region around the ESS where the norm is established). Red points are ESSs.

Some Exploration of the Parameter Space

One reason why the transition towards the set of states where the norm collapses is so slow in the Metanorms model, and why that set is not very stable (the set is indeed sometimes abandoned), is the high mutation rate used by Axelrod. Such a high mutation rate does not allow the system to adjust according to evolutionary selection pressures after a single mutation takes place and before the next mutation occurs. Using a lower mutation rate (MutationRate = 0.001, for which we can expect one mutation approximately every 8 generations), the system reaches the states of norm collapse much more quickly, and that set is much more stable. Results are shown in figure 7.

Fig. 7. Proportion of runs where the norm has been established, and where the norm has collapsed, calculated over 312 runs up to 2·10⁵ generations, with MutationRate equal to 0.001.
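The expected waiting time just quoted can be checked directly: each agent's strategy consists of 6 bits (3 for Boldness and 3 for Vengefulness), so with 20 agents and a per-bit flip probability of 0.001 the expected number of mutations per generation is 20 · 6 · 0.001 = 0.12, i.e. roughly one mutation every 8 generations (with Axelrod's original rate of 0.01 it is 1.2 per generation). A sketch of the mutation operator itself is shown below (our own illustration; in particular, the bit layout is an arbitrary choice of ours):

    import java.util.Random;

    /** Sketch of the mutation step applied to each replicated strategy (illustration only). */
    public class Mutation {
        /** A strategy is a 6-bit string: here bits 0-2 encode Boldness, bits 3-5 Vengefulness. */
        public static int mutate(int strategyBits, double mutationRate, Random rng) {
            for (int bit = 0; bit < 6; bit++)
                if (rng.nextDouble() < mutationRate)
                    strategyBits ^= (1 << bit);   // flip this bit with probability mutationRate
            return strategyBits;
        }

        public static double boldness(int strategyBits)     { return (strategyBits & 0b111) / 7.0; }
        public static double vengefulness(int strategyBits) { return ((strategyBits >> 3) & 0b111) / 7.0; }
    }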
Another reason why Axelrod's simulation results turned out to be so misleading is the extreme payoff structure that he used. In every round, agents can get the Temptation payoff at most once (benefit = 3), but they can be punished for being bold up to 19 times (with a total cost of 171), and they can be meta-punished for not being vengeful up to 342 times (with a total cost of 3078)! As an example, we show here how slightly altering the metanorm-related payoffs can significantly change the dynamics of the system. Assume that we divide both the Meta-enforcement cost and the Meta-punishment payoff by 10, leaving their ratio untouched (ME = −0.2; MP = −0.9). Such an adjustment should actually give us a more realistic model; as Yamagishi and Takahashi [30] put it: "if someone is late for a meeting you may grumble at him, but you would seldom grumble at your colleagues for not complaining to the late comer", certainly not with the same intensity! Figure 8 shows that if we use the modified payoffs the area of stability where the norm is established is no longer there, suggesting that the transition to the states of norm collapse will be much quicker.

Fig. 8. Graph showing the expected dynamics in the Metanorms model, with ME = −0.2 and MP = −0.9, and assuming continuity and homogeneity of agents' properties (Boldness on the horizontal axis and Vengefulness on the vertical axis).

The simulation runs corroborate our speculations. As we can see in figure 9, the norm quickly collapses and that state is sustained in the long term. Axelrod's conclusions are reversed if we use (what in our opinion are) more realistic payoffs.

Fig. 9. Proportion of runs where the norm has been established, and where the norm has collapsed, calculated over 318 runs up to 2·10⁵ generations, with ME = −0.2 and MP = −0.9.

The mathematical abstraction of the computer model was also used to uncover a very counterintuitive feature of the original model. Strange as it may appear, the mathematical analysis suggests that increasing the magnitude of the Temptation payoff, or decreasing the magnitude of the Punishment payoff, will increase the chances that the norm is established. Figure 10 shows the expected dynamics when Temptation = 10. The ESS where the norm is established ($b_i = 11/169$, $v_i = 1$ for all i) is now surrounded by a larger basin of attraction, suggesting that the set of states where the norm is established will be more stable.

Fig. 10. Graph showing the expected dynamics in the Metanorms model, with T = 10 and assuming continuity and homogeneity of agents' properties (Boldness on the horizontal axis and Vengefulness on the vertical axis).

We decided to test the hypothesis that a greater Temptation can increase the chances that the norm is established using our computer model. The results obtained, which are shown in figure 11, are unambiguous: the norm is clearly established in almost all runs. The reason is that a higher Temptation¹⁰ means that the optimum level of boldness (in evolutionary terms) in any given situation is higher than before (e.g. assuming $v_i = 1$ for all i, it is now b = 11/169, as opposed to 4/169). As we explained before, in the previous case the system would abandon the states where the norm is established because the level of boldness in the population was so low that agents who did not punish defectors were rarely caught.
However, in this case the optimum level of boldness is not so low, so agents who do not punish defectors are more likely to be observed and meta-punished. Because of this, a very high level of vengefulness is preserved. This basically means that, in the presence of metanorms, it might be easier to enforce norms which people have higher incentives to break (i.e. which are constantly put to the test), because that gives meta-punishers more opportunities to exert their power. However, it is also clear that this argument requires a strong link between the propensity to punish and the propensity to meta-punish. Without such a link it seems unlikely that the norm could be established in the long term in any case, since strategies that follow the norm without incurring the costs of enforcing it would gain a differential advantage over those that enforce it.

¹⁰ A lower Punishment yields very similar results, and the reasoning is the same.

Fig. 11. Proportion of runs where the norm has been established, and where the norm has collapsed, calculated over 1000 runs up to 2·10⁵ generations, with T = 10.

Other Instantiations of the Same Conceptual Model

We also wanted to test the robustness of Axelrod's conclusions using similar computer models which are, in our opinion, equally valid instantiations of the conceptual model that (we believe) Axelrod had in mind. In particular, we implemented three other evolutionary selection mechanisms apart from the one Axelrod used. In all four selection mechanisms the most successful agents at a particular time have the best chance of being replicated in the following generation, which is what we believe the conceptual model would specify. The new selection mechanisms are the following (a sketch of the two stochastic mechanisms is given at the end of this section):

1. Random tournament. This method involves selecting two agents from the population at random and replicating the one with the higher payoff for the next generation. In case of a tie, one of them is selected at random. This process is repeated 20 times to keep the number of agents constant.

2. Roulette wheel. This method involves calculating every agent's fitness, which is equal to its payoff minus the minimum payoff obtained in the generation. Agents are then given a probability of being replicated (in each of the 20 replications) that is directly proportional to their fitness.

3. Average selection. Using this method, agents with a payoff greater than or equal to the population average are replicated twice, and agents who are below the population average are eliminated. The number of agents is then kept constant by randomly eliminating or replicating as many agents as needed.

As we can see in figure 12, the results obtained vary substantially depending on the selection mechanism used. This is so particularly in the short term, but also in the long term. If, for instance, random tournament is chosen, the states where the norm has collapsed are quickly reached, and our experiments indicate that the long-run probability of finding the system in such states is very close to one.

Fig. 12. Proportion of runs where the norm has been established, and where the norm has collapsed, calculated over 300 runs up to 2·10⁴ generations, for different selection mechanisms.
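For reference, the two stochastic selection mechanisms above can be sketched as follows (our own illustration; both methods take the payoffs accumulated over a generation and return the indices of the 20 agents selected for replication):

    import java.util.Random;

    /** Sketch of two of the alternative selection mechanisms (illustration only). */
    public class Selection {
        /** Random tournament: replicate the better of two randomly drawn agents, N times. */
        public static int[] randomTournament(double[] payoffs, Random rng) {
            int n = payoffs.length;
            int[] selected = new int[n];
            for (int k = 0; k < n; k++) {
                int a = rng.nextInt(n), b = rng.nextInt(n);
                if (payoffs[a] > payoffs[b])      selected[k] = a;
                else if (payoffs[b] > payoffs[a]) selected[k] = b;
                else                              selected[k] = rng.nextBoolean() ? a : b; // tie
            }
            return selected;
        }

        /** Roulette wheel: fitness = payoff minus the generation's minimum payoff;
         *  each of the N replications picks an agent with probability proportional to fitness. */
        public static int[] rouletteWheel(double[] payoffs, Random rng) {
            int n = payoffs.length;
            double min = Double.POSITIVE_INFINITY, total = 0;
            for (double p : payoffs) min = Math.min(min, p);
            double[] fitness = new double[n];
            for (int i = 0; i < n; i++) { fitness[i] = payoffs[i] - min; total += fitness[i]; }
            int[] selected = new int[n];
            for (int k = 0; k < n; k++) {
                if (total == 0) { selected[k] = rng.nextInt(n); continue; } // all payoffs equal
                double r = rng.nextDouble() * total, acc = 0;
                int i = 0;
                while (i < n - 1 && (acc += fitness[i]) < r) i++;
                selected[k] = i;
            }
            return selected;
        }
    }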
Conclusions

This paper has provided evidence showing that the results reported by Axelrod [1] are not as reliable as one would desire. We can obtain the opposite results by running the model for longer, by using other mutation rates, by modifying the payoffs slightly, or by using alternative selection mechanisms. As far as the re-implementation exercise is concerned, our study represents yet another illustration of the necessity to revisit and replicate our models in order to clarify the boundaries of validity of our conclusions (see [19] for another striking example). As Axelrod himself claims:

"Replication is one of the hallmarks of cumulative science. It is needed to confirm whether the claimed results of a given simulation are reliable in the sense that they can be reproduced by someone starting from scratch. Without this confirmation, it is possible that some published results are simply mistaken due to programming errors, misrepresentation of what was actually simulated, or errors in analyzing or reporting the results. Replication can also be useful for testing the robustness of inferences from models." [22]

In particular, this paper has illustrated the importance of:

a) Running simulations with stochastic components several times and for many periods, so we can study not only how the system can behave but also how it usually behaves.
b) Exploring the parameter space thoroughly and analysing the sensitivity of the model to its parameters.
c) Complementing simulation with analytical work.
d) Being aware of the scope of our computer models and of the conclusions obtained with them. The computer model is often only one of many possible instantiations of a more general conceptual model, so the conclusions obtained with the computer model do not necessarily apply to the conceptual model.

The importance of the previous points has been exposed before by authors such as Gotts et al. [26, 28] and Edmonds and Hales [19]; the work reported in this paper strongly corroborates these authors' arguments.

Acknowledgement

This work is funded by the Scottish Executive Environment and Rural Affairs Department, and by the Junta de Castilla y León, Grant Ref. VA034/04. We would also like to thank Gary Polhill for some advice and programming work.

References

1. Axelrod, R.M. An Evolutionary Approach to Norms, American Political Science Review 80 (1986) 1095-1111.
2. Arthur, B., Durlauf, S., Lane, D. The Economy as an Evolving Complex System II, Addison-Wesley, Reading, Massachusetts, 1997.
3. Tesfatsion, L. Agent-based computational economics: Growing economies from the bottom up, Artificial Life 8 (2002) 55-82.
4. Bousquet, F., Le Page, C. Multi-agent simulations and ecosystem management: a review, Ecological Modelling 176 (2004) 313-332.
5. Hare, M., Deadman, P. Further towards a taxonomy of agent-based simulation models in environmental management, Mathematics and Computers in Simulation 64 (2004) 25-40.
6. Janssen, M. Complexity and Ecosystem Management: The Theory and Practice of Multi-agent Systems, Edward Elgar, Cheltenham, UK, 2002.
7. Axelrod, R.M. The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration, Princeton University Press, Princeton, NJ, 1997.
8. Kohler, T., Gumerman, G.J. Dynamics in Human and Primate Societies: Agent-Based Modeling of Social and Spatial Processes, Oxford University Press and Santa Fe Institute, New York, 2000.
9. Lansing, J.S. Complex Adaptive Systems, Annual Review of Anthropology 32 (2003) 183-204.
10. Gilbert, N., Conte, R.
Artificial Societies: The Computer Simulation of Social Life, UCL Press, London, 1995.
11. Gilbert, N., Troitzsch, K. Simulation for the Social Scientist, Open University Press, Buckingham, 1999.
12. Suleiman, R., Troitzsch, K.G., Gilbert, N. Tools and Techniques for Social Science Simulation, Physica-Verlag, Heidelberg and New York, 2000.
13. Resnick, M. Turtles, Termites, and Traffic Jams: Explorations in Massively Parallel Microworlds (Complex Adaptive Systems), MIT Press, Cambridge, MA, 1995.
14. Edmonds, B. The Use of Models - making MABS actually work, in S. Moss & P. Davidsson (Eds.), Multi-Agent-Based Simulation, Lecture Notes in Artificial Intelligence 1979, 2001, pp. 15-32.
15. Axtell, R.L. Why Agents? On the Varied Motivations for Agents in the Social Sciences, in C.M. Macal & D. Sallach (Eds.), Proceedings of the Workshop on Agent Simulation: Applications, Models, and Tools, Argonne National Laboratory, Argonne, Illinois, 2000.
16. Bonabeau, E. Agent-based modeling: Methods and techniques for simulating human systems, Proceedings of the National Academy of Sciences of the United States of America 99 (2002) 7280-7287.
17. Epstein, J.M. Agent-based computational models and generative social science, Complexity 4 (1999) 41-60.
18. Edmonds, B., Hales, D. Computational Simulation as Theoretical Experiment, Centre for Policy Modelling Report No. 03-106 (2003) <http://cfpm.org/cpmrep106.html>.
19. Edmonds, B., Hales, D. Replication, replication and replication: Some hard lessons from model alignment, Journal of Artificial Societies and Social Simulation 6 (2003) <http://jasss.soc.surrey.ac.uk/6-4/11.html>.
20. Polhill, J.G., Izquierdo, L.R., Gotts, N.M. The ghost in the model (and other effects of floating point arithmetic), Journal of Artificial Societies and Social Simulation, in press.
21. Polhill, J.G., Izquierdo, L.R., Gotts, N.M. What every agent-based modeller should know about floating point arithmetic, Environmental Modelling and Software, in press.
22. Axelrod, R.M. Advancing the Art of Simulation in the Social Sciences, in R. Conte, R. Hegselmann & P. Terna (Eds.), Simulating Social Phenomena (Lecture Notes in Economics and Mathematical Systems 456), Springer, Berlin, 1997, pp. 21-40.
23. Axtell, R.L., Axelrod, R.M., Epstein, J.M., Cohen, M.D. Aligning Simulation Models: A Case Study and Results, Computational and Mathematical Organization Theory 1 (1996) 123-141.
24. Binmore, K. Review of the book: The Complexity of Cooperation: Agent-Based Models of Competition and Collaboration, by Axelrod, R., Princeton University Press, 1997, Journal of Artificial Societies and Social Simulation 1 (1998) <http://jasss.soc.surrey.ac.uk/1-1/review1.html>.
25. Brown, D.G., Page, S.E., Riolo, R.L., Rand, W. Agent-based and analytical modeling to evaluate the effectiveness of greenbelts, 2004.
26. Gotts, N.M., Polhill, J.G., Adam, W.J. Simulation and Analysis in Agent-Based Modelling of Land Use Change, First Conference of the European Social Simulation Association, 2003. Conference proceedings available online at <http://www.uni-koblenz.de/~kgt/ESSA/ESSA1/proceedings.htm>.
27. Dawes, R.M. Social Dilemmas, Annual Review of Psychology 31 (1980) 169-193.
28. Gotts, N.M., Polhill, J.G., Law, A.N.R. Agent-based simulation in the study of social dilemmas, Artificial Intelligence Review 19 (2003) 3-92.
29. Boyd, R., Richerson, P.J.
Punishment Allows the Evolution of Cooperation (or Anything Else) in Sizable Groups, Ethology and Sociobiology 13 (1992) 171-195.
30. Yamagishi, T., Takahashi, N. Evolution of Norms without Metanorms, in U. Schulz, W. Albers & U. Mueller (Eds.), Social Dilemmas and Cooperation, Springer-Verlag, Berlin, 1994, pp. 311-326.
31. Collier, N. RePast: An Extensible Framework for Agent Simulation, 2003, <http://repast.sourceforge.net/>.
32. Cioffi-Revilla, C. Invariance and universality in social agent-based simulations, Proceedings of the National Academy of Sciences of the United States of America 99 (2002) 7314-7316.
33. Kulkarni, V.G. Modelling and Analysis of Stochastic Systems, Chapman & Hall/CRC, Boca Raton, Florida, 1995.
34. de Quervain, D.J.F., Fischbacher, U., Treyer, V., Schellhammer, M., Schnyder, U., Buck, A., Fehr, E. The Neural Basis of Altruistic Punishment, Science 305 (2004) 1254-1258.

Appendix A

Statement: The only ESS in the Norms game (assuming continuity and using Axelrod's parameters) is the state of total norm collapse ($b_i = 1$, $v_i = 0$ for all i).

Proof: Recall that eq. (3) and eq. (4) must be fulfilled for the state to be an ESS. The following proves that the only state that satisfies eq. (3) and (4) is $b_i = 1$, $v_i = 0$ for all i. All variables are assumed to be within the feasible range. The relevant partial derivatives are:

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}=T+(n-1)\,b_M\,V_{-M}\,P\qquad\qquad \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M}=H+E\,v_I\,b_M$$

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}=\frac{E}{2}\sum_{j\neq M}b_j^2\qquad\qquad \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}=\frac{P}{2}\,b_I^2$$

At $b_M = 0$ we have ∂Exp(Payoff_M)/∂b_M = T > H ≥ ∂Exp(Payoff_I)/∂b_M, so eq. (3) rules out $b_M = 0$; hence $b_M \neq 0$ for every M.

Writing $\overline{b^2}$ for the population average of the squared boldness values, for every M

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial v_M}=\frac{E}{2}\Big(n\,\overline{b^2}-b_M^2\Big)\leq\frac{E}{2}\Big(n\,\overline{b^2}-1\Big)$$

and there is always some I ≠ M with

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial v_M}=\frac{P}{2}\,b_I^2\geq\frac{P}{2}\,\frac{n\,\overline{b^2}}{n-1}$$

If B > 0.26 the first bound is strictly smaller than the second, so for every M there is some I with ∂Exp(Payoff_M)/∂v_M < ∂Exp(Payoff_I)/∂v_M; eq. (4) then forces $v_M = 0$ for every M, and eq. (3) in turn forces $b_M = 1$ for every M.

If B ≤ 0.26, then there is some M with $b_M \leq 0.26$, and eq. (3) implies that there is some M with $V_{-M} \geq 0.09$, which in turn implies $V_{-M} \geq 0.03$ for every M. From $V_{-M} \geq 0.03$ for every M, eq. (3) implies $b_M \neq 1$ for every M. Since also $b_M \neq 0$ for every M, eq. (3) requires ∂Exp(Payoff_M)/∂b_M = ∂Exp(Payoff_I)/∂b_M for all I ≠ M and every M, which implies $v_M = V$ for every M and, together with the same equalities, $b_M = B$ for every M. But ($v_M = V$ AND $b_M = B \neq 0$) for every M implies, by eq. (4), $v_M = 0$ for every M, and then, by eq. (3), $b_M = 1$ for every M, contradicting $b_M \neq 1$ for every M (and B ≤ 0.26 by assumption). Hence there is no ESS with B ≤ 0.26.

It is thus proved that ($b_i = 1$, $v_i = 0$ for all i) is a necessary condition for an ESS in the Norms game. We now prove that it is also sufficient. For a single mutant with boldness $b_M$ and vengefulness $v_M$ in a population of incumbents with b = 1 and v = 0:

$$\mathrm{Exp}(\mathit{Payoff}_{MUTANT}) = b_M\,T + (n-1)\,H + E\,\frac{v_M}{2}(n-1)$$
$$\mathrm{Exp}(\mathit{Payoff}_{INCUMBENT}) = T + (b_M + n - 2)\,H + \frac{v_M}{2}\,P \qquad \forall\,\mathit{INCUMBENT}$$
$$\mathrm{Exp}(\mathit{Payoff}_{MUTANT}) < \mathrm{Exp}(\mathit{Payoff}_{INCUMBENT}) \qquad \forall\, b_M, v_M \ \text{such that}\ (b_M \neq 1 \ \text{OR}\ v_M \neq 0)$$

Appendix B

Statement 1: In the Metanorms game, assuming continuity and using Axelrod's parameters, there is only one ESS where the norm is established ($b_i = 4/169$, $v_i = 1$ for all i).

Proof: We will assume that if V < 0.5 the norm is not established, so we only deal with states where V ≥ 0.5. The following proves that the only state satisfying eq. (3) and (4) with V ≥ 0.5 is $b_i = 4/169$, $v_i = 1$ for all i. All variables are assumed to be within the feasible range.
The relevant partial derivatives in the Metanorms game are:

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M}=T+(n-1)\,b_M\,V_{-M}\,P = 3 - 171\,b_M\,V_{-M}$$

$$\frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M}=H+E\,v_I\,b_M + ME\,\frac{v_I}{4}\,3b_M^2\sum_{j\neq I,M}(1-v_j) + MP\,\frac{1-v_I}{4}\,3b_M^2\sum_{j\neq I,M}v_j$$

Let

$$F(b_M, v_I, V_{-M}) = \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_I)}{\partial b_M} - \frac{\partial\,\mathrm{Exp}(\mathit{Payoff}_M)}{\partial b_M} = \frac{57\,b_M\,V_{-M}\big(12 + b_M(11\,v_I - 9)\big) - v_I\,b_M\big(8 + b_M(81 + 33\,v_I)\big) - 16}{4}$$

V ≥ 0.5 implies $V_{-M} > 0.45$ for every M, which, by eq. (3), implies ($b_M \neq 0$ AND $b_M \neq 1$) for every M. In turn, ($b_M \neq 0$ AND $b_M \neq 1$) for every M implies, by eq. (3), that $F(b_M, v_I, V_{-M}) = 0$ for all I ≠ M and for every M. (Remember that all variables must remain within the feasible range.)

Given $b_M$, there exist at most two different values of $v_I$, namely $[v_I^*]_1$ and $[v_I^*]_2$, such that $F(b_M, [v_I^*]_1, V_{-M}) = 0$ and $F(b_M, [v_I^*]_2, V_{-M}) = 0$. However, ($V_{-M} > 0.45$ AND $[v_I^*]_1 \leq V_{-M} \leq [v_I^*]_2$) implies $[v_I^*]_1 = [v_I^*]_2$. Therefore, given $b_M$, there is a unique $v_I^*$ such that $F(b_M, v_I^*, V_{-M}) = 0$. Similarly, $V_{-M} > 0.45$ implies that, given $v_I$, there is a unique $b_M^*$ such that $F(b_M^*, v_I, V_{-M}) = 0$. Therefore $v_i = V$ for all i AND $b_i = B$ for all i.

($v_i = V$ for all i AND $b_i = B$ for all i AND eq. (3) AND eq. (4) AND V ≥ 0.5) implies $v_i = 1$ for all i AND $b_i = 4/169$ for all i.

It is thus proved that ($b_i = 4/169$, $v_i = 1$ for all i) is a necessary condition for an ESS in the Metanorms game if V ≥ 0.5. Proving that it is also sufficient is tedious but simple. Writing $b_M$ and $v_M$ for a potential mutant agent's boldness and vengefulness, it can be shown that:

$$\mathrm{Exp}(\mathit{Payoff}_{MUTANT}) < \mathrm{Exp}(\mathit{Payoff}_{INCUMBENT}) \qquad \forall\, b_M, v_M \ \text{such that}\ (b_M \neq 4/169 \ \text{OR}\ v_M \neq 1)$$

Statement 2: The state where $b_i = 1$, $v_i = 0$ for all i is an ESS.

Proof:

$$\mathrm{Exp}(\mathit{Payoff}_{MUTANT}) = b_M\,T + (n-1)\,H + E\,\frac{v_M}{2}(n-1) + ME\,\frac{v_M}{4}(n-1)(n-2)$$
$$\mathrm{Exp}(\mathit{Payoff}_{INCUMBENT}) = T + (b_M + n - 2)\,H + \frac{v_M}{2}\,P + MP\,\frac{v_M}{4}(n-2) \qquad \forall\,\mathit{INCUMBENT}$$
$$\mathrm{Exp}(\mathit{Payoff}_{MUTANT}) < \mathrm{Exp}(\mathit{Payoff}_{INCUMBENT}) \qquad \forall\, b_M, v_M \ \text{such that}\ (b_M \neq 1 \ \text{OR}\ v_M \neq 0)$$