
Appearances Can Be Deceiving: Lessons Learned Reimplementing Axelrod’s “Evolutionary Approach to
Norms”
Luis R. Izquierdo^1 & José M. Galán^2
^1 Macaulay Institute, Craigiebuckler, Aberdeen, United Kingdom
[email protected]
^2 Universidad de Burgos, Burgos, Spain & INSISOC Group, Valladolid, Spain
[email protected]
Abstract. In this paper we try to replicate the simulation results reported by
Axelrod [1] in an influential paper on the evolution of social norms. Our study
shows that Axelrod’s results are not as reliable as one would desire. We can
obtain the opposite results by running the model for longer, by slightly
modifying some of the parameters, or by changing some arbitrary assumptions
in the model. This re-implementation exercise illustrates the importance of
running stochastic simulations several times, of exploring the parameter space
adequately, of complementing simulation with analytical work, and of being
aware of the scope of our simulation models.
Introduction
In recent years agent-based modelling (ABM) has shifted from being a heterodox
modelling approach to become a recognised research methodology in a range of
scientific disciplines, e.g. Economics [2, 3], Resource Management and Ecology [4-6], Political Science [7], Anthropology [8, 9], Sociology [10-12], and Biology [13], among others.
One of the main advantages of ABM, and what distinguishes it from other
modelling paradigms, is the possibility of establishing a more direct correspondence
between entities (and their interactions) in the system to be modelled and agents (and
their interactions) in our models [14]. This type of abstraction is attractive for a
number of reasons, e.g. it leads to formal yet more natural descriptions of the target
system, enables us to model heterogeneity and to represent space explicitly, allows us
to study the bidirectional relationship between individuals and groups, and it can also
capture emergent behaviour (see [15-17]). However, this step forward towards
descriptive accuracy, transparency, and rigour comes at a price: models constructed in
this way are very often intractable using mathematical analysis, so we usually have to
resort to computer simulation.
As a matter of fact, agent-based models are usually so complex that we (their own
developers) often do not fully understand in exhaustive detail what is going on in our
models. Not knowing exactly what to expect makes it impossible to tell whether any
unanticipated results derive exclusively from what the researcher believes are the
crucial assumptions in the model, or whether they are just artefacts created in the
design, implementation, or in the running process. Artefacts in the design can appear
when assumptions which are made arbitrarily (possibly because the designer believes
they are not crucial to the research question and they will not have any significant
effect on the results) have an unanticipated and significant effect on the results (e.g. the
effect of using different topological structures or neighbourhood functions). When
this actually occurs, we run the risk of interpreting our simulation results (which
generalise the crucial assumptions believed to be irrelevant) beyond the scope of the
simulation model (see e.g. [18]). Implementation artefacts appear in the potentially
ambiguous process of translating a model described in natural language into a
computer program [19]. Finally, artefacts can also occur at the stage of running the
program because the researcher might not be fully aware of how the code is executed
in the computer (e.g. unawareness of floating-point errors [20, 21]).
To discern an artefact from what is not, two techniques have proved
extremely useful: replication of experiments by independent researchers [18, 19, 22,
23] and mathematical analysis [21, 24-26]. Using these two techniques we can
increase the rigour, the reliability, and the credibility of our models.
In this paper we have replicated two influential models of social norms developed
by Axelrod [1] and, in doing so, we illustrate the importance of both independent
replication and mathematical analysis. The structure of the paper is as follows: in the
next two sections we give some background to Axelrod’s models and we explain
them in detail. Subsequently, we present the method used to replicate the original
models and to understand their dynamics. Results and discussions are then provided
for each of the two models; and, finally, conclusions are presented in the last section.
Background to Axelrod’s Models
Social dilemmas have fascinated scientists from a wide range of disciplines for
decades. In a social dilemma, decisions that seem to make perfect sense from each
individual’s point of view can aggregate into outcomes that are unfavourable for all.
In their simplest formalisation, social dilemmas can be modelled as games in which
players can either cooperate or defect. The dilemma comes from the fact that
everyone is better off defecting given the other players’ decisions, but they all prefer
universal cooperation to universal defection. Using game theory terms, in a dilemma
game all players have strictly dominant strategies^1 that result in a deficient
equilibrium^2 [27].
Within the domain of agent-based modelling there is a substantial amount of work
devoted to identifying conditions under which cooperation can be sustained in these
problematic situations (see Gotts et al. [28] for an extensive review). In particular,
some of this work has investigated the role of social norms and how these can be
^1 For an agent A, strategy S*_A is strictly dominant if, for each feasible combination of the other players' strategies, A's payoff from playing S*_A is strictly more than A's payoff from playing any other strategy.
^2 An equilibrium is deficient if there exists another outcome which is preferred by every player.
enforced to promote cooperation. Following Axelrod’s [1] definition, we understand
that “a norm exists in a given social setting to the extent that individuals usually act in
a certain way and are often punished when seen not to be acting in this way”. Norms
to cooperate provide cooperators with a crucial advantage: the option to selectively
punish those who defect [29]. If a norm to cooperate is not in place, the only way that
punishment can be exercised in these simple games is by withdrawing cooperation,
thus giving rise to potential misunderstandings.
In 1986 Axelrod wrote a pioneering and influential paper on the study of norm
enforcement in social dilemmas using computer simulation [1]. In his paper, Axelrod
investigates the role of metanorms (norms to follow other norms) in promoting
cooperation in a simple agent-based model. He argues that in his model “metanorms
can prevent defections if the initial conditions are favourable enough”. However, we
have re-implemented his model and our study shows that initial conditions are
irrelevant for the long-term behaviour of the model and, using Axelrod’s parameters,
metanorms do not prevent defections most of the time. Furthermore, Axelrod’s results
are dependent on very specific and arbitrary conditions, the absence of which tends to
change the conclusions significantly.
In the next section we explain the two models that Axelrod [1] presents in his
paper: the Norms model and the Metanorms model.
Axelrod’s Models
The Norms Model
The Norms game is played by 20 agents who have to make two decisions:
1. Agents have to decide whether to cooperate or defect. A defecting agent gets a Temptation payoff (T = 3) and inflicts a Hurt payoff (H = −1) on each of the other agents. If, on the other hand, the agent cooperates, no one's payoff is altered.
2. The opportunity to defect given to an agent comes with a known chance of being seen by each of the other agents, called S. This probability of being observed is drawn from a uniform distribution between 0 and 1 every time a certain agent is given the opportunity to defect. For each observed defection, agents have to decide whether to punish the defector or not. Punishers incur an Enforcement cost (E = −2) every time they punish (P = −9) a defector.
The strategy of an agent is defined by its propensity to defect (Boldness), and its
propensity to punish agents it has observed defecting (Vengefulness). Agents
defect when given the opportunity if their Boldness is higher than the probability of
being observed (S); and they punish observed defectors with probability Vengefulness.
In this model, each of these propensities is implemented as a 3-bit string denoting
eight evenly-distributed values from 0 to 1 (0/7, 1/7,…,7/7). The actual values for
each agent’s strategy are determined randomly at the beginning of each simulation
run.
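To make this encoding and the two decision rules concrete, here is a minimal sketch in plain Java (class, field and method names are ours; this is not the RePast 2.2 implementation described later in the paper):

import java.util.Random;

// Minimal sketch of an agent's strategy: two 3-bit integers (0..7), decoded as
// eighths of 1 (0/7, 1/7, ..., 7/7). Names are ours, not Axelrod's.
public class NormsAgent {
    int boldnessBits;      // 0..7, drawn at random at the start of a run
    int vengefulnessBits;  // 0..7, drawn at random at the start of a run
    double payoff = 0.0;   // accumulated over the four rounds of a generation

    double boldness()     { return boldnessBits / 7.0; }
    double vengefulness() { return vengefulnessBits / 7.0; }

    // An agent defects iff its Boldness exceeds the probability S of being seen.
    boolean defects(double s) { return boldness() > s; }

    // An observed defection (or, in the Metanorms game, an observed failure to
    // punish) is punished with probability Vengefulness.
    boolean punishes(Random rng) { return rng.nextDouble() < vengefulness(); }

    static NormsAgent random(Random rng) {
        NormsAgent a = new NormsAgent();
        a.boldnessBits = rng.nextInt(8);
        a.vengefulnessBits = rng.nextInt(8);
        return a;
    }
}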
A round in this model is completed when every agent has been given exactly one
opportunity to defect, and also the opportunity to observe (and maybe punish) any
given defection that has taken place. Figures 1 and 2 show the UML activity diagram
of one round.
Fig. 1. UML Activity diagram of Axelrod’s models. The UML diagram of method
metaNorms(Number, Agent, Agent) is provided in figure 2.
Fig. 2. UML activity diagram of the method metaNorms(Number, Agent, Agent) of
the object model. This method is called in the UML activity diagram shown in figure 1. The
condition metaNormsActive is false in the Norms model and true in the Metanorms
model.
Four rounds constitute a generation. At the beginning of every generation the
agents’ payoffs are initialised; at the end of every generation the payoff obtained by
every agent in the four rounds is computed and two evolutionary forces come into
play:
1. Agents with a payoff exceeding the population average by at least one standard deviation are replicated twice; agents who are at least one standard deviation below the population average are eliminated; and the rest of the agents are replicated once. The number of agents is kept constant, but Axelrod [1] does not specify exactly how. After having studied this process in detail, we have come to the conclusion that this ambiguity is not likely to be of major importance.
2. Whenever a bitstring is replicated, every bit has a certain probability of being flipped (MutationRate = 0.01). A sketch of this selection-and-mutation step is given below.
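The following sketch shows how we read this selection-plus-mutation step (plain Java, building on the NormsAgent sketch above). Since [1] does not specify how the population is brought back to 20 agents, the random trimming/padding at the end is our own assumption:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of the selection and mutation step described above (not the original code).
public class Evolution {
    static final double MUTATION_RATE = 0.01;

    static List<NormsAgent> nextGeneration(List<NormsAgent> pop, Random rng) {
        int n = pop.size();
        double mean = pop.stream().mapToDouble(a -> a.payoff).average().orElse(0);
        double variance = pop.stream()
                .mapToDouble(a -> (a.payoff - mean) * (a.payoff - mean)).sum() / n;
        double sd = Math.sqrt(variance);

        List<NormsAgent> next = new ArrayList<>();
        for (NormsAgent a : pop) {
            int copies = 1;                               // default: replicated once
            if (a.payoff >= mean + sd) copies = 2;        // at least one sd above average
            else if (a.payoff <= mean - sd) copies = 0;   // at least one sd below average
            for (int c = 0; c < copies; c++) next.add(mutate(a, rng));
        }
        // How to keep exactly n agents is not specified in [1]; this is our assumption.
        while (next.size() > n) next.remove(rng.nextInt(next.size()));
        while (next.size() < n) next.add(mutate(pop.get(rng.nextInt(n)), rng));
        return next;
    }

    // Whenever a bitstring is replicated, every bit is flipped with probability 0.01.
    static NormsAgent mutate(NormsAgent parent, Random rng) {
        NormsAgent child = new NormsAgent();
        child.boldnessBits = flipBits(parent.boldnessBits, rng);
        child.vengefulnessBits = flipBits(parent.vengefulnessBits, rng);
        return child;
    }

    static int flipBits(int bits, Random rng) {
        for (int i = 0; i < 3; i++)
            if (rng.nextDouble() < MUTATION_RATE) bits ^= (1 << i);
        return bits;
    }
}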
Using this model, Axelrod [1] comes to the conclusion that the simulations should
spend most of the time in states^3 of very high Boldness and very low Vengefulness
(norm collapse).
The Metanorms Model
Having concluded that the norm to cooperate collapses in the previous model,
Axelrod investigates the role of metanorms as a way of enforcing norms. The
metanorm dictates that one must punish those who do not follow the norm (i.e. those
who do not punish observed defectors). However, someone who does not punish an
observed defector might not be caught. In the Metanorms game, the chance of being
seen not punishing a defection (given that the defection has been seen) by each of the
other 18 agents (excluding the defector) is the same as the chance of seeing such
defection. Similarly, the propensity to punish those who do not comply with the norm
(meta-punish) is the same as the propensity to punish defectors^4. As far as payoffs are
concerned, meta-punishers incur a Meta-enforcement cost (ME = −2) every time they
Meta-punish (MP = −9) someone who has not punished an observed defector. Figures
1 and 2 show the UML activity diagram of one round in the Metanorms model.
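A minimal sketch of one round, as we read Figures 1 and 2, is the following (plain Java, building on the NormsAgent sketch above; setting metaNormsActive to false recovers the Norms game; the exact ordering of the checks is our reading of the diagrams, not a transcription of the original code):

import java.util.List;
import java.util.Random;

// Sketch of one round of the (Meta)norms game with Axelrod's payoffs.
public class MetanormsRound {
    static final double T = 3, H = -1, E = -2, P = -9, ME = -2, MP = -9;

    static void playRound(List<NormsAgent> pop, boolean metaNormsActive, Random rng) {
        for (NormsAgent i : pop) {
            double s = rng.nextDouble();            // chance of this defection being seen
            if (!i.defects(s)) continue;            // cooperation alters no payoffs
            i.payoff += T;
            for (NormsAgent j : pop) if (j != i) j.payoff += H;

            for (NormsAgent j : pop) {
                if (j == i || rng.nextDouble() >= s) continue;   // j did not see the defection
                if (j.punishes(rng)) {
                    j.payoff += E; i.payoff += P;                // punishment and its cost
                } else if (metaNormsActive) {
                    // Every third agent k may see j's failure to punish (same chance s)
                    for (NormsAgent k : pop) {
                        if (k == i || k == j || rng.nextDouble() >= s) continue;
                        if (k.punishes(rng)) {                   // meta-punishment
                            k.payoff += ME; j.payoff += MP;
                        }
                    }
                }
            }
        }
    }
}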
Using this model, Axelrod argues that “the metanorms game can prevent
defections if the initial conditions are favourable enough”.
Method
In this paper we have used the following three tools:
1. Computer models. We have re-implemented Axelrod’s models in Java 2 using
RePast 2.2 [31], and added extra functionality to our programs so we can relax
several assumptions made in Axelrod’s models. Using our computer models, we
have been able to perform the following tasks:
a. Replicate Axelrod’s experiments using our computer models that fully comply
with the specifications outlined in his paper. This exercise was conducted to
study the potential presence of ambiguities and artefacts in the process of
translating the model described in the paper into computer code (e.g. is the
description in the paper sufficient to implement the model? Could there be
implementation mistakes?), look for artefacts in the process of running the
program (e.g. could the results be dependent on the modelling paradigm,
^3 The term 'state' denotes here a certain particularisation of every agent's strategy.
^4 Yamagishi and Takahashi [30] use a model similar to Axelrod's, but propose a linkage between cooperation (not being bold) and vengefulness.
programming language, or hardware platform used?), and assess the process
by which the results have been analysed and conclusions derived (e.g. do the
results change when the program is run for longer?).
b. Conduct an adequate exploration of the parameter space and study the
sensitivity of the model. One major disadvantage of using computer simulation
is that a single run does not provide any information on the robustness of the
results obtained nor on the scope of the conclusions derived from them. In
order to establish the scope of the conclusions derived from a simulation
model it is necessary to determine the parameter range where the conclusions
are invariant.
c. Experiment with alternative models which address the relevant research
question (e.g. can metanorms promote cooperation?) just as well [18]. It is
often the case that an agent based model instantiates a more general conceptual
model that could embrace different implementations equally well (see [32] for
examples of how to find possible variations). Only those conclusions which
are not falsified by any of the conceptually equivalent models will be valid for
the conceptual model.
2. Mathematical analysis of the computer models. Defining a state of the system as a
certain particularisation of every agent’s strategy, it can be shown that both the
Norms model and the Metanorms model are irreducible positive recurrent and
aperiodic discrete-time finite Markov chains (with 64^20 possible states). This
observation enables us to say that the probability of finding the system in each of
its states in the long run^5 is unique (i.e. initial conditions are immaterial) and non-zero (Theorems 3.7 and 3.15 in [33]; see also the note after this list). Although calculating such probabilities is
infeasible, we can estimate them using the computer models.
3. Mathematical abstractions of the computer models. We have developed one
mathematical abstraction for each of the two games (the Norms game and the
Metanorms game) in which we study every agent’s expected payoff in any given
state. These mathematical abstractions do not correspond in a one-to-one way with
the specifications outlined in the previous section. They are simpler, more abstract
models which are amenable to mathematical analysis and graphical representation.
In particular, our mathematical models abstract the details of the evolutionary
process (the genetic algorithm) and assume continuity of agents’ properties. The
mathematical abstractions are used to suggest areas of stability and basins of
attraction in the computer models, to clarify their crucial assumptions, to assess
their sensitivity to parameters, and to illustrate graphically the dynamics of the
system. Any results suggested by the mathematical abstractions are always
checked by simulation.
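To state the property used in point 2 above explicitly (a standard result, written here in our own notation): if $P$ denotes the one-generation transition matrix over the $64^{20}$ states, then irreducibility, aperiodicity and finiteness guarantee a unique limiting distribution $\pi$ with
\[
\pi = \pi P, \qquad \sum_{x} \pi(x) = 1, \qquad \pi(x) > 0 \ \ \forall x,
\]
which does not depend on the initial state and equals the long-run fraction of time the chain spends in each state $x$. It is these probabilities that we estimate by simulation.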
^5 This is also the long-run fraction of the time that the system spends in each of its states.
The Norms Model: Results and Discussion
Using the Norms model, Axelrod [1] reports results from 5 runs consisting of 100
generations each. Even though the simulation results are not conclusive at all (i.e.
they show three completely different possible outcomes), Axelrod comes to the
correct conclusion that the simulations should spend most of the time in a state of
very high Boldness and very low Vengefulness (norm collapse). In this section we
provide a series of arguments that corroborate his conclusion.
We start by using the mathematical abstraction of the computer model. Without
making any simplifying assumption so far, we can say that an agent i, with boldness
bi and vengefulness vi obtains the following payoff:
\[
Payoff_i \;=\; Def_i \cdot T \;+\; \sum_{\substack{j=1\\ j\neq i}}^{n} Def_j \cdot H \;+\; \sum_{\substack{j=1\\ j\neq i}}^{n} Pun_{ij} \cdot E \;+\; \sum_{\substack{j=1\\ j\neq i}}^{n} Pun_{ji} \cdot P
\qquad (1)
\]
where T, H, P, E are the payoffs mentioned in the description of the model, n is the number of agents, and
\[
Def_i =
\begin{cases}
1 & \text{if agent } i \text{ defects}\\
0 & \text{if agent } i \text{ cooperates}
\end{cases}
\qquad
Prob(Def_i = 1) = b_i, \quad Prob(Def_i = 0) = 1 - b_i
\]
\[
Pun_{ij} =
\begin{cases}
1 & \text{if agent } i \text{ punishes agent } j\\
0 & \text{if agent } i \text{ does not punish agent } j
\end{cases}
\qquad
Prob(Pun_{ij} = 1) = b_j \cdot \tfrac{b_j}{2} \cdot v_i, \quad Prob(Pun_{ij} = 0) = 1 - b_j \cdot \tfrac{b_j}{2} \cdot v_i
\]
The expected payoff of agent i is then:
\[
Exp(Payoff_i) \;=\; b_i\, T \;+\; (n-1)\, B_{-i}\, H \;+\; E\,\frac{v_i}{2}\sum_{\substack{j=1\\ j\neq i}}^{n} b_j^2 \;+\; \frac{(n-1)}{2}\, b_i^2\, V_{-i}\, P
\qquad (2)
\]
where
\[
B_{-i} = \frac{1}{n-1}\sum_{\substack{j=1\\ j\neq i}}^{n} b_j
\]
is the population average Boldness observed by agent i, and similarly for $V_{-i}$ and the Vengefulness.
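As a cross-check of eq. (2) against simulation averages, the expected payoff can be computed directly from any strategy profile. A minimal sketch in plain Java (names are ours; for Axelrod's model the parameters are T = 3, H = −1, E = −2, P = −9):

// Sketch: expected payoff of agent i in the Norms game according to eq. (2).
public class ExpectedPayoffNorms {
    static double expectedPayoff(int i, double[] b, double[] v,
                                 double T, double H, double E, double P) {
        int n = b.length;
        double sumBOthers = 0, sumB2Others = 0, sumVOthers = 0;
        for (int j = 0; j < n; j++) {
            if (j == i) continue;
            sumBOthers += b[j];
            sumB2Others += b[j] * b[j];
            sumVOthers += v[j];
        }
        double avgBOthers = sumBOthers / (n - 1);   // B_{-i}
        double avgVOthers = sumVOthers / (n - 1);   // V_{-i}
        return b[i] * T
             + (n - 1) * avgBOthers * H
             + E * (v[i] / 2.0) * sumB2Others
             + ((n - 1) / 2.0) * b[i] * b[i] * avgVOthers * P;
    }
}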
We define now a concept of point stability that we call Evolutionary Stable State
(ESS). An ESS is a state (determined by every agent’s Boldness and Vengefulness)
where:
a) every agent is getting the same expected payoff (so evolutionary selection
pressures will not lead the system away from the state), and
b) any single (mutant) agent who changes its strategy gets a strictly lower
expected payoff than any of the other agents in the incumbent population (so
if one single mutation occurs^6, the mutant agent will not be able to invade the
population).
If, at this point, we assume continuity of agents’ properties, we can write a
necessary condition for a state to be evolutionary stable. Let M be an arbitrary
(potential mutant) agent in a given population of agents P (state), and let bM be its
boldness and vM its vengefulness. Let I be any of the other (incumbent) agents in the
population P. Then eq. (3) is a necessary condition for the population of agents being
an ESS.
\[
\forall M \in P: \quad
\left[\ \frac{\partial\, Exp(Payoff_M)}{\partial b_M} \;=\; \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \quad \forall I \in P,\ I \neq M \right]
\]
\[
\quad \text{OR} \quad
\left[\ b_M = 1 \ \text{ AND } \ \frac{\partial\, Exp(Payoff_M)}{\partial b_M} \;\geq\; \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \quad \forall I \in P,\ I \neq M \right]
\]
\[
\quad \text{OR} \quad
\left[\ b_M = 0 \ \text{ AND } \ \frac{\partial\, Exp(Payoff_M)}{\partial b_M} \;\leq\; \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \quad \forall I \in P,\ I \neq M \right]
\qquad (3)
\]
If every agent has the same expected payoff (which is a necessary condition for
ESS) and eq. (3) does not hold for some M, I, the potential mutant M could get a
differential advantage over incumbent agent I by changing its Boldness bM, meaning
that the state under study would not be evolutionary stable. If, for instance, we find
some M, I such that
\[
\frac{\partial\, Exp(Payoff_M)}{\partial b_M} \;>\; \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \quad \text{AND} \quad b_M \neq 1
\]
then agent M could get a higher payoff than agent I by increasing its boldness bM.
Similarly, we obtain another necessary condition substituting vM for bM in eq. (3).
\[
\forall M \in P: \quad
\left[\ \frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;=\; \frac{\partial\, Exp(Payoff_I)}{\partial v_M} \quad \forall I \in P,\ I \neq M \right]
\]
\[
\quad \text{OR} \quad
\left[\ v_M = 1 \ \text{ AND } \ \frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;\geq\; \frac{\partial\, Exp(Payoff_I)}{\partial v_M} \quad \forall I \in P,\ I \neq M \right]
\]
\[
\quad \text{OR} \quad
\left[\ v_M = 0 \ \text{ AND } \ \frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;\leq\; \frac{\partial\, Exp(Payoff_I)}{\partial v_M} \quad \forall I \in P,\ I \neq M \right]
\qquad (4)
\]
It is interesting to see how in general there is no direct relationship between the
concept of evolutionary stability as defined above and the Nash equilibrium concept.
Evolution is about relative payoffs, but Nash is about absolute payoffs. A necessary
condition to be in Nash equilibrium would be, e.g.:
^6 We refer here to any change in one single agent's strategy, not a single flip of a bit.
\[
\forall M \in P: \quad
\left[\ \frac{\partial\, Exp(Payoff_M)}{\partial b_M} = 0 \ \right]
\ \text{OR}\ 
\left[\ b_M = 1 \ \text{ AND } \ \frac{\partial\, Exp(Payoff_M)}{\partial b_M} \geq 0 \ \right]
\ \text{OR}\ 
\left[\ b_M = 0 \ \text{ AND } \ \frac{\partial\, Exp(Payoff_M)}{\partial b_M} \leq 0 \ \right]
\]
In appendix A, we use equations (3) and (4) to demonstrate that the only ESS in
the Norms game (assuming continuity and using Axelrod’s parameters) is the state of
total norm collapse (bi = 1, vi = 0 for all i). Here, we use these equations to draw
figure 3, which illustrates the dynamics of the system under the assumption that every
agent has the same properties (bi = B, vi = V for all i). States where every agent has the
same properties will be called homogeneous. Figure 3 has been drawn according to
the following procedure: the arrow departing from a certain homogeneous state (B, V)
has a horizontal component iff the condition in eq. (3) is false in that state. In that case,
the horizontal component is positive if
\[
\frac{\partial\, Exp(Payoff_M)}{\partial b_M} \;>\; \frac{\partial\, Exp(Payoff_I)}{\partial b_M}
\]
and negative otherwise. The vertical component is worked out in a similar way but
using eq. (4) instead. Only vertical lines, horizontal lines, and the four main diagonals
are considered. If both equations (3) and (4) are true then a red point is drawn.
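The procedure just described can be sketched in a few lines of plain Java for the Norms game: at each homogeneous state (B, V) we compare the mutant's and the incumbents' marginal expected payoffs and report the sign of the resulting drift. The grid step and the textual output are ours; figure 3 plots the same information as arrows:

// Sketch of the construction of fig. 3 (Norms game, Axelrod's parameters).
public class NormsFieldSketch {
    public static void main(String[] args) {
        int n = 20; double T = 3, H = -1, E = -2, P = -9;
        for (double B = 0.0; B <= 1.0001; B += 0.25) {
            for (double V = 0.0; V <= 1.0001; V += 0.25) {
                // Mutant's minus incumbent's marginal payoff w.r.t. b_M (here: 4 - 169*B*V)
                double db = (T + (n - 1) * B * V * P) - (H + E * V * B);
                // Mutant's minus incumbent's marginal payoff w.r.t. v_M (eqs. 5 and 6)
                double dv = (E / 2) * (n - 1) * B * B - (P / 2) * B * B;
                System.out.printf("B=%.2f V=%.2f  Boldness drift %+d  Vengefulness drift %+d%n",
                        B, V, drift(db, B), drift(dv, V));
            }
        }
    }
    // +1 / -1 / 0, respecting the boundary clauses of eqs. (3) and (4)
    static int drift(double d, double x) {
        if (d > 0 && x < 1 - 1e-9) return 1;
        if (d < 0 && x > 1e-9) return -1;
        return 0;
    }
}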
[Figure 3 here: Boldness on the horizontal axis (0 to 1), Vengefulness on the vertical axis (0 to 1).]
Fig. 3. Graph showing the expected dynamics in the Norms model, using Axelrod’s parameters,
and assuming continuity and homogeneity of agents’ properties. The procedure used to create
this graph is explained in the text. The dashed squares represent the states of norm
establishment (green, top-left) and norm collapse (red, bottom-right) as defined in the text
below. The red point is the only ESS.
As an example, imagine that in a certain state (B, V) where B ≠ 1 and V ≠ 0 we
observe that:
\[
\frac{\partial\, Exp(Payoff_M)}{\partial b_M} \;>\; \frac{\partial\, Exp(Payoff_I)}{\partial b_M}
\quad \text{AND} \quad
\frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;<\; \frac{\partial\, Exp(Payoff_I)}{\partial v_M}
\]
We then draw a diagonal arrow pointing towards greater Boldness and less
Vengefulness, since a mutant with greater Boldness and less Vengefulness than the
(homogeneous) population could invade it.
We will see in the next section how figures constructed in this way can be
extremely useful to suggest simulation experiments to run. However, we must bear in
mind that they are mathematical abstractions of the computer model, so they can also
be misleading. For instance, even though figure 3 (and equation 5, formally) shows
that agents can always gain a competitive advantage by decreasing their Vengefulness
in any homogeneous state (unless nobody is bold), that is not necessarily the case in
heterogeneous states. As an example, in a state where every agent’s properties are
zero except for two agents who have bi = 1 and vi = 0, each of the two bold agents
would become the only agent with the highest expected payoff if they (individually)
increased their vengefulness.
\[
\frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;=\; \frac{E}{2}\,(n-1)\,B^2 \;<\; \frac{P}{2}\,B^2 \;=\; \frac{\partial\, Exp(Payoff_I)}{\partial v_M}
\qquad \forall\, B \neq 0
\qquad (5)
\]
The mathematical analysis shows that in the vast majority of states it is not
advantageous in evolutionary terms to be vengeful, and increasingly so as the
number of agents grows (eq. 6). Punishing only one agent can be advantageous for the
punisher since it inflicts more pain (P) than the cost of punishing (E) (even though the
punisher would also get a lower payoff!). However, if the population is minimally
bold, and being vengeful means punishing too many people, the total cost of being
vengeful (exclusively borne by the punisher) can be higher than each individual
punishment. Therefore vengeful agents tend to be less successful. When the level of
vengefulness in the population is low enough, bold agents will tend to get higher
payoffs and the system will head towards the state of norm collapse. So when both
evolutionary forces are in place the system should spend most of its time in the
neighbourhood of the only evolutionary stable state.
\[
\frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;=\; \frac{E}{2}\sum_{\substack{j=1\\ j\neq M}}^{n} b_j^2,
\qquad
\frac{\partial\, Exp(Payoff_I)}{\partial v_M} \;=\; \frac{P}{2}\, b_I^2
\qquad (6)
\]
However, it is only by running simulations that we can confidently explore the
dynamics of the model. To analyse the simulation runs, we define the following sets
of states:
- Norm Collapse: We say that the norm has collapsed when the simulation is in states where the average Boldness is at least 6/7 and the average Vengefulness is no more than 1/7 (see fig. 3).
- Norm Establishment: We say that the norm has been established when the simulation is in states where the average Boldness is no more than 2/7 and the average Vengefulness is at least 5/7 (see fig. 3). A sketch of this classification is given right after this list.
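A minimal sketch of this classification (plain Java, building on the NormsAgent sketch above, with the thresholds defined in the list):

import java.util.List;

// Labels a population state as norm collapse, norm establishment, or neither.
public class NormState {
    enum Label { NORM_COLLAPSE, NORM_ESTABLISHED, OTHER }

    static Label classify(List<NormsAgent> pop) {
        double b = pop.stream().mapToDouble(NormsAgent::boldness).average().orElse(0);
        double v = pop.stream().mapToDouble(NormsAgent::vengefulness).average().orElse(0);
        if (b >= 6.0 / 7 && v <= 1.0 / 7) return Label.NORM_COLLAPSE;
        if (b <= 2.0 / 7 && v >= 5.0 / 7) return Label.NORM_ESTABLISHED;
        return Label.OTHER;
    }
}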
Figure 4 shows the proportion of runs (out of 576) where the norm has been
established, and where the norm has collapsed, after a certain number of generations
(up to 10^6).
Fig. 4. Proportion of runs where the norm has been established, and where the norm has
collapsed, calculated over 576 runs up to 10^6 generations. The inset in the middle of the
graph zooms in on the first 1,000 generations.
As predicted in the previous analysis, the norm collapses almost always, as
Axelrod concluded; only now the argument has been corroborated with more
convincing evidence. Looking at figure 4, we can also see that it is not surprising
that Axelrod found three completely different possible outcomes after having run the
simulation 5 times for 100 generations.
The Metanorms Model: Results and Discussion
Using the Metanorms model, Axelrod [1] reports again results from 5 runs consisting
of 100 generations each. In all five runs the norm is clearly established and Axelrod
argues that “the metanorms game can prevent defections if the initial conditions are
favourable enough”. However, as explained in the method section, initial conditions
are immaterial for the long-run behaviour of either of the two models under study. In this
section, we investigate whether metanorms can actually prevent defections and, if
so, how robust that claim is.
Replication of the Original Experiments
We replicated Axelrod’s experiments but we ran many more simulations (1000 runs,
as opposed to 5) and for longer (10^6 generations, as opposed to 100). The results are
shown in figure 5. We can see now how misleading running the simulation for only
100 generations was. Even though after 100 generations the norm is almost always
established, as time goes by the system approaches its limiting distribution, where the
norm usually collapses.
Fig. 5. Proportion of runs where the norm has been established, and where the norm has
collapsed, calculated over 1000 runs up to 10^6 generations. The inset in the middle of the
graph zooms in on the first 1,000 generations.
To understand better the dynamics of the system and the sensitivity of the model
we used again a mathematical abstraction of the computer model. Equation 7 shows
the expected payoff for an agent i, with boldness bi and vengefulness vi. In appendix
B we demonstrate that in the Metanorms game, assuming continuity, there is exactly one
ESS where the norm is established^7 (b_i = 4/169, v_i = 1 for all i), and there is also
at least one ESS where the norm collapses (b_i = 1, v_i = 0 for all i).
^7 This ESS is not a Nash equilibrium.
\[
Exp(Payoff_i) \;=\; b_i\, T + (n-1)\, B_{-i}\, H + E\,\frac{v_i}{2}\sum_{\substack{j=1\\ j\neq i}}^{n} b_j^2 + \frac{(n-1)}{2}\, b_i^2\, V_{-i}\, P
\;+\; ME\,\frac{v_i}{4}\sum_{\substack{k=1\\ k\neq i}}^{n}\ \sum_{\substack{j=1\\ j\neq i,k}}^{n} b_k^3\,(1-v_j)
\;+\; MP\,\frac{1-v_i}{4}\sum_{\substack{k=1\\ k\neq i}}^{n}\ \sum_{\substack{j=1\\ j\neq i,k}}^{n} b_k^3\, v_j
\qquad (7)
\]
Figure 6 shows the expected dynamics of the metanorms game assuming
continuity and homogeneity in agents’ properties. Looking at figure 6 we find that it
is not surprising that running 5 simulations for 100 generations could mislead us to think
that the norm will always be established. If the initial strategies are random, chances
are that the system will move towards the ESS where the norm is established.
However, the region to the left of this ESS is a nearby escape route to the ESS where
the norm collapses. Intuitively, for very low levels of boldness the very few
that occur are those that are very unlikely to be seen^8, meaning that an agent who happens to observe a defection and who does not punish it is also very unlikely to be caught. And let's face it: in this model the only reason why agents may punish defectors is to avoid being meta-punished^9. So when defections are hard to see, not punishing pays off because it is very unlikely that the non-punisher will be caught. Thus being vengeful is disadvantageous, and forgiving agents gain a competitive advantage. As the level of vengefulness decreases, the level of boldness below which it is advantageous not to be vengeful increases, since agents meta-punish less than before (vengefulness is both the propensity to punish and to meta-punish). Agents then become less and less vengeful, and consequently bolder and bolder, so the norm eventually collapses.
^8 Remember that agents defect iff their boldness is higher than the probability of being seen.
^9 Interestingly enough, recent research suggests that people genuinely enjoy punishing others if they have done something wrong [34].
[Figure 6 here: Boldness on the horizontal axis (0 to 1), Vengefulness on the vertical axis (0 to 1), with a zoomed inset covering approximately Boldness 0.01–0.04 and Vengefulness 0.9–1.]
Fig. 6. Graph showing the expected dynamics in the Metanorms model, using Axelrod’s
parameters, and assuming continuity and homogeneity of agents’ properties. Red points are
ESS.
Some Exploration of the Parameter Space
One reason for which the transition towards the set of states where the norm collapses
is so slow in the Metanorms model, and for which such set is not very stable (the set
is indeed sometimes abandoned), is the high mutation rate used by Axelrod. Such a
high mutation rate does not allow the system to adjust according to evolutionary
selection pressures after a single mutation takes place and before the next mutation
occurs. Using a lower mutation rate (MutationRate = 0.001, for which we can expect
one mutation approximately every 8 generations) the system reaches the states of
norm collapse much more quickly and such set is much more stable. Results are
shown in figure 7.
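The figure of roughly one mutation every 8 generations follows from simple arithmetic (our own back-of-the-envelope check): with 20 agents and 6 bits per agent,
\[
20 \times 6 \times 0.001 = 0.12 \ \text{expected bit flips per generation}
\;\Longrightarrow\; \text{about one mutation every } 1/0.12 \approx 8.3 \ \text{generations},
\]
whereas Axelrod's MutationRate = 0.01 yields about 1.2 expected bit flips in every single generation.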
Fig. 7. Proportion of runs where the norm has been established, and where the norm has
collapsed, calculated over 312 runs up to 2·10^5 generations, with MutationRate equal to 0.001.
Another reason why Axelrod's simulation results turned out to be so
misleading is the extreme payoff structure that he used. In every round, agents might
get the Temptation payoff at most once (benefit = 3), but they can be punished for
being bold up to 19 times (with a total cost of 171), and they can be meta-punished
for not being vengeful up to 342 times (with a total cost of 3078)! As an example, we
show here how slightly altering the metanorm-related payoffs can significantly
change the dynamics of the system. Assume that we divide both the Meta-enforcement cost and the Meta-punishment payoff by 10, leaving their ratio
untouched (ME = −0.2; MP = −0.9). Such adjustments should actually give us a more
realistic model: as Yamagishi and Takahashi [30] put it: “if someone is late for a
meeting you may grumble at him, but you would seldom grumble at your colleagues
for not complaining to the late comer”, certainly not with the same intensity! Figure 8
shows how if we use the modified payoffs the area of stability where the norm is
established is not there anymore, suggesting that the transition to the states of norm
collapse will be much quicker.
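For reference, the per-round extremes quoted at the beginning of this discussion follow directly from Axelrod's parameters (n = 20; our own arithmetic):
\[
\begin{aligned}
\text{maximum temptation earned} &= T = 3,\\
\text{maximum punishment received} &= (n-1)\,|P| = 19 \times 9 = 171,\\
\text{maximum meta-punishment received} &= (n-1)(n-2)\,|MP| = 19 \times 18 \times 9 = 3078.
\end{aligned}
\]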
[Figure 8 here: Boldness on the horizontal axis (0 to 1), Vengefulness on the vertical axis (0 to 1).]
Fig. 8. Graph showing the expected dynamics in the Metanorms model, with ME = −0.2; MP =
−0.9, and assuming continuity and homogeneity of agents’ properties.
The simulation runs corroborate our speculations. As we can see in figure 9, the
norm quickly collapses and such state is sustained in the long term. Axelrod’s
conclusions are reversed if we use (what in our opinion are) more realistic payoffs.
Fig. 9. Proportion of runs where the norm has been established, and where the norm has
collapsed, calculated over 318 runs up to 2·10^5 generations, with ME = −0.2, and MP = −0.9.
The mathematical abstraction of the computer model was also used to uncover a
very counterintuitive feature of the original model. Strange as it may appear, the
mathematical analysis suggests that increasing the magnitude of the Temptation
payoff or decreasing the magnitude of the Punishment payoff will increase the
chances that the norm is established. Figure 10 shows the expected dynamics when
Temptation = 10. The ESS where the norm is established (bi = 11/169, vi = 1 for all i)
is now surrounded by a larger basin of attraction, suggesting that the set of states
where the norm is established will be more stable.
[Figure 10 here: Boldness on the horizontal axis (0 to 1), Vengefulness on the vertical axis (0 to 1).]
Fig. 10. Graph showing the expected dynamics in the Metanorms model, with T = 10 and
assuming continuity and homogeneity of agents’ properties.
We decided to test the hypothesis that a greater Temptation can increase the chances
that the norm is established using our computer model. The results obtained, which
are shown in figure 11, are unambiguous: the norm is clearly established in almost all
runs. The reason for that is that a higher Temptation^10 means that the optimum level of
boldness (in evolutionary terms) in any given situation is higher than before (e.g.
assuming vi = 1 for all i, it is now b = 11/169, as opposed to 4/169). As we explained
before, in the previous case the system would abandon the states where the norm is
established because the level of boldness in the population was so low that agents
who did not punish defectors were rarely caught. However, in this case the optimum
level of boldness is not so low, so agents who do not punish defectors are more likely
to be observed and meta-punished. Because of this, a very high level of vengefulness
is preserved. This basically means that in the presence of metanorms, it might be
^10 A lower Punishment yields very similar results, and the reasoning is the same.
easier to enforce norms which people have higher incentives to break (i.e. which are
constantly put to the test) because that gives meta-punishers more opportunities to
exert their power. However it is also clear that this argument requires a strong link
between the propensity to punish and the propensity to meta-punish. Without such a
link it seems unlikely that the norm could be established in the long-term in any case,
since strategies that follow the norm without incurring the costs of enforcing it would
gain a differential advantage over those that enforce it.
Fig. 11. Proportion of runs where the norm has been established, and where the norm has
collapsed, calculated over 1000 runs up to 2·10^5 generations, with T = 10.
Other Instantiations of the Same Conceptual Model
We also wanted to test the robustness of Axelrod’s conclusions using similar
computer models which are, in our opinion, equally valid instantiations of the
conceptual model that (we believe) Axelrod had in mind. In particular, we
implemented three other evolutionary selection mechanisms apart from the one
Axelrod used. In all four selection mechanisms the most successful agents at a
particular time have the best chance of being replicated in the following generation,
which is what we believe the conceptual model would specify. The new selection
mechanisms are the following:
1. Random tournament. This method involves selecting two agents from the
population at random and replicating the one with the higher payoff for the next
generation. In case of a tie, one of them is selected at random. This process is repeated 20 times
to keep the number of agents constant.
2. Roulette wheel. This method involves calculating every agent’s fitness, which
is equal to their payoff minus the minimum payoff obtained in the generation. Agents
are then given a probability of being replicated (in each of the 20 replications) that is
directly proportional to their fitness (a sketch of this mechanism is given after this list).
3. Average selection. Using this method, agents with a payoff greater than or
equal to the population average are replicated twice; and agents who are below the
population average are eliminated. The number of agents is then kept constant by
randomly eliminating/replicating as many agents as needed.
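As an illustration, the roulette-wheel mechanism described in point 2 above can be sketched as follows (plain Java, building on the NormsAgent sketch above; the uniform fallback when every agent has zero fitness is our own assumption for the degenerate case):

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of roulette-wheel selection: fitness = payoff minus the generation's
// minimum payoff; each of the 20 slots is filled with probability proportional
// to fitness. Mutation would be applied to the selected copies afterwards.
public class RouletteWheelSelection {
    static List<NormsAgent> select(List<NormsAgent> pop, Random rng) {
        double min = pop.stream().mapToDouble(a -> a.payoff).min().orElse(0);
        double[] fitness = pop.stream().mapToDouble(a -> a.payoff - min).toArray();
        double total = 0;
        for (double f : fitness) total += f;

        List<NormsAgent> next = new ArrayList<>();
        for (int slot = 0; slot < pop.size(); slot++) {
            int chosen;
            if (total == 0) {
                chosen = rng.nextInt(pop.size());          // degenerate case: uniform
            } else {
                double r = rng.nextDouble() * total, acc = 0;
                chosen = pop.size() - 1;
                for (int j = 0; j < fitness.length; j++) {
                    acc += fitness[j];
                    if (r < acc) { chosen = j; break; }
                }
            }
            next.add(pop.get(chosen));
        }
        return next;
    }
}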
As we can see in figure 12, the results obtained vary substantially depending on the
selection mechanism used. This is so particularly in the short term but also in the long
term. If, for instance, random tournament is chosen, the states where the norm has
collapsed are quickly reached and our experiments indicate that the long-run
probability of finding the system in such states is very close to one.
Fig. 12. Proportion of runs where the norm has been established, and where the norm has
collapsed, calculated over 300 runs up to 2·10^4 generations, for different selection mechanisms.
Conclusions
This paper has provided evidence showing that the results reported by Axelrod [1]
are not as reliable as one would desire. We can obtain the opposite results by running
the model for longer, by using other mutation rates, by modifying the payoffs slightly,
or by using alternative selection mechanisms. As far as the reimplementation exercise
is concerned, our study represents yet another illustration of the necessity to revisit
and replicate our models in order to clarify the boundaries of validity of our
conclusions (see [19] for another striking example). As Axelrod himself claims:
“Replication is one of the hallmarks of cumulative science. It is needed to
confirm whether the claimed results of a given simulation are reliable in the
sense that they can be reproduced by someone starting from scratch. Without
this confirmation, it is possible that some published results are simply
mistaken due to programming errors, misrepresentation of what was actually
simulated, or errors in analyzing or reporting the results. Replication can also
be useful for testing the robustness of inferences from models”. [22]
In particular, this paper has illustrated the importance of:
a) Running simulations with stochastic components several times and for many periods, so we can study not only how the system can behave but also how it usually behaves.
b) Exploring thoroughly the parameter space and analysing the model
sensitivity to its parameters.
c) Complementing simulation with analytical work.
d) Being aware of the scope of our computer models and of the conclusions
obtained with them. The computer model is often only one of many possible
instantiations of a more general conceptual model. Therefore the conclusions
obtained with the computer model do not necessarily apply to the conceptual
model.
The importance of the previous points has been emphasised before by authors like
Gotts et al. [26, 28] and Edmonds and Hales [19]; the work reported in this paper
strongly corroborates these authors’ arguments.
Acknowledgement
This work is funded by the Scottish Executive Environment and Rural Affairs
Department, and by the Junta de Castilla y León Grant Ref.: VA034/04. We would
also like to thank Gary Polhill for some advice and programming work.
References
1. Axelrod, R.M. An Evolutionary Approach to Norms, American Political Science Review
80 (1986) 1095-1111.
2. Arthur, B., Durlauf, S., Lane, D. The Economy as an Evolving Complex System II,
Addison-Wesley, Reading, Massachusetts, 1997.
3. Tesfatsion, L. Agent-based computational economics: Growing economies from the
bottom up, Artificial Life 8 (2002) 55-82.
4. Bousquet, F., Le Page, C. Multi-agent simulations and ecosystem management: a review,
Ecological Modelling 176 (2004) 313-332.
5. Hare, M., Deadman, P. Further towards a taxonomy of agent-based simulation models in
environmental management, Mathematics and Computers in Simulation 64 (2004) 25-40.
6. Janssen, M. Complexity and ecosystem management. The theory and practice of multi-agent systems, Edward Elgar, Cheltenham, UK 2002.
7. Axelrod, R.M. The complexity of cooperation. Agent-based models of competition and
collaboration, Princeton University Press, Princeton, N.J 1997.
8. Kohler, T., Gumerman, G.J. Dynamics in human and primate societies: Agent-based
modeling of social and spatial processes, Oxford University Press and Santa Fe
Institute, New York 2000.
9. Lansing, J.S. Complex Adaptive Systems, Annual Review of Anthropology 32 (2003)
183-204.
10. Gilbert, N., Conte, R. Artificial Societies: the Computer Simulation of Social Life, UCL
Press, London 1995.
11. Gilbert, N., Troitzsch, K. Simulation for the social scientist, Open University Press,
Buckingham 1999.
12. Suleiman, R., Troitzsch, K.G., Gilbert, N. Tools and Techniques for Social Science
Simulation, Physica-Verlag, Heidelberg, New York 2000.
13. Resnick, M. Turtles, Termites, and Traffic Jams: Explorations in Massively Parallel
Microworlds (Complex Adaptive Systems), MIT Press, Cambridge, US 1995.
14. Edmonds, B. The Use of Models - making MABS actually work, in S.Moss &
P.Davidsson (Eds.), Multi-Agent-Based Simulation, Lecture Notes in Artificial
Intelligence 1979, 2001, pp. 15-32.
15. Axtell, R.L. Why Agents? On the Varied Motivations for Agents in the Social Sciences,
in C.M.Macal & D.Sallach (Eds.), Proceedings of the Workshop on Agent Simulation:
Applications, Models, and Tools., Argonne National Laboratory, Argonne, Illinois.,
2000.
16. Bonabeau, E. Agent-based modeling: Methods and techniques for simulating human
systems, Proceedings of the National Academy of Sciences of the United States of
America 99 (2002) 7280-7287.
17. Epstein, J.M. Agent-based computational models and generative social science,
Complexity 4 (1999) 41-60.
18. Edmonds, B., Hales, D. Computational Simulation as Theoretical Experiment, Centre for
Policy Modelling Report No.: 03-106 (2003) <http://cfpm.org/cpmrep106.html>.
19. Edmonds, B., Hales, D. Replication, replication and replication: Some hard lessons from
model alignment, Jasss-the Journal of Artificial Societies and Social Simulation 6
(2003) <http://jasss.soc.surrey.ac.uk/6-4/11.html>.
20. Polhill, J.G., Izquierdo, L.R., Gotts, N.M. The ghost in the model (and other effects of
floating point arithmetic), Jasss-the Journal of Artificial Societies and Social
Simulation, In Press.
21. Polhill, J.G., Izquierdo, L.R., Gotts, N.M. What every agent based modeller should know
about floating point arithmetic, Environmental Modelling and Software, In Press.
22. Axelrod, R.M. Advancing the Art of Simulation in the Social Sciences, in R.Conte,
R.Hegselmann & P.Terna (Eds.), Simulating Social Phenomena (Lecture notes in
economics and mathematical systems 456), Springer, Berlin, 1997, pp. 21-40.
23. Axtell, R.L., Axelrod, R.M., Epstein, J.M., Cohen, M.D. Aligning Simulation Models: A
Case Study and Results, Computational and Mathematical Organization Theory 1
(1996) 123-141.
24. Binmore, K. Review of the book: The Complexity of Cooperation: Agent-Based Models
of Competition and Collaboration, by Axelrod, R., Princeton, New Jersey: Princeton
University Press, 1997, Jasss-the Journal of Artificial Societies and Social Simulation 1
(1998) <http://jasss.soc.surrey.ac.uk/1-1/review1.html>.
25. Brown, D.G., Page, S.E., Riolo, R.L., Rand, W. Agent-based and analytical modeling to
evaluate the effectiveness of greenbelts, 2004.
26. Gotts, N.M., Polhill, J.G., Adam, W.J. Simulation and Analysis in Agent-Based
Modelling of Land Use Change, First Conference of the European Social Simulation
Association, 2003. Conference proceedings available online at http://www.uni-koblenz.de/~kgt/ESSA/ESSA1/proceedings.htm.
27. Dawes, R.M. Social Dilemmas, Annual Review of Psychology 31 (1980) 169-193.
28. Gotts, N.M., Polhill, J.G., Law, A.N.R. Agent-based simulation in the study of social
dilemmas, Artificial Intelligence Review 19 (2003) 3-92.
29. Boyd, R., Richerson, P.J. Punishment Allows the Evolution of Cooperation (or Anything
Else) in Sizable Groups, Ethology and Sociobiology 13 (1992) 171-195.
30. Yamagishi, T., Takahashi, N. Evolution of Norms without Metanorms, in U.Schulz,
W.Albers & U.Mueller (Eds.), Social Dilemmas and Cooperation, Springer-Verlag,
Berlin, 1994, pp. 311-326.
31. Collier, N. RePast: An Extensible Framework for Agent Simulation, 2003. <http://repast.sourceforge.net/>.
32. Cioffi-Revilla, C. Invariance and universality in social agent-based simulations,
Proceedings of the National Academy of Sciences of the United States of America 99
(2002) 7314-7316.
33. Kulkarni, V.G. Modelling and Analysis of Stochastic Systems, Chapman & Hall/CRC,
Boca Raton, Florida 1995.
34. de Quervain, D.J.F., Fischbacher, U., Treyer, V., Schellhammer, M., Schnyder, U., Buck, A., Fehr, E. The Neural Basis of Altruistic Punishment, Science 305 (2004) 1254-1258.
Appendix A
Statement: the only ESS in the Norms game (assuming continuity and using
Axelrod’s parameters) is the state of total norm collapse (bi = 1, vi = 0 for all i).
Proof: Please bear in mind that eq. (3) and (4) must be fulfilled for the state to be an ESS.
The following proves that the only state that satisfies eq. (3) and (4) is bi = 1, vi = 0
for all i. All variables are assumed to be within the feasible range.
\[
\frac{\partial\, Exp(Payoff_M)}{\partial b_M} = T + (n-1)\, b_M\, V_{-M}\, P
\qquad\qquad
\frac{\partial\, Exp(Payoff_I)}{\partial b_M} = H + E\, v_I\, b_M
\]
\[
\frac{\partial\, Exp(Payoff_M)}{\partial v_M} = \frac{E}{2}\sum_{\substack{j=1\\ j\neq M}}^{n} b_j^2
\qquad\qquad
\frac{\partial\, Exp(Payoff_I)}{\partial v_M} = \frac{P}{2}\, b_I^2
\]
Evaluating the derivatives with respect to $b_M$ at $b_M = 0$:
\[
\left.\frac{\partial\, Exp(Payoff_M)}{\partial b_M}\right|_{b_M=0} = T \;>\; H = \left.\frac{\partial\, Exp(Payoff_I)}{\partial b_M}\right|_{b_M=0}
\;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; b_M \neq 0 \ \ \forall M
\]
Writing $\bar{B}$ for the population average Boldness, and using $\sum_{j} b_j^2 \geq n\bar{B}^2$, $b_M \leq 1$ and $E < 0$:
\[
\frac{\partial\, Exp(Payoff_M)}{\partial v_M} = \frac{E}{2}\sum_{\substack{j \neq M}} b_j^2 \;\leq\; \frac{E}{2}\bigl(n\bar{B}^2 - b_M^2\bigr) \;\leq\; \frac{E}{2}\bigl(n\bar{B}^2 - 1\bigr)
\]
and, since for every $M$ there is some $I \neq M$ with $b_I \leq n\bar{B}/(n-1)$, and $P < 0$:
\[
\forall M\ \exists I \neq M: \qquad \frac{\partial\, Exp(Payoff_I)}{\partial v_M} = \frac{P}{2}\, b_I^2 \;\geq\; \frac{P}{2}\left(\frac{n\bar{B}}{n-1}\right)^{2}
\]
If $\bar{B} > 0.26$:
\[
\frac{\partial\, Exp(Payoff_M)}{\partial v_M} \;\leq\; \frac{E}{2}\bigl(n\bar{B}^2 - 1\bigr) \;<\; \frac{P}{2}\left(\frac{n\bar{B}}{n-1}\right)^{2} \;\leq\; \frac{\partial\, Exp(Payoff_I)}{\partial v_M}
\qquad \forall M, \text{ some } I
\]
\[
\therefore\ \bar{B} > 0.26 \;\Rightarrow\; \{\text{eq. (4)}\} \;\Rightarrow\; v_M = 0\ \forall M \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; b_M = 1\ \forall M
\]
\[
\bar{B} \leq 0.26 \;\Rightarrow\; \exists M:\ b_M \leq 0.26 \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; \exists M:\ V_{-M} \geq 0.09 \;\Rightarrow\; V_{-M} \geq 0.03\ \forall M
\]
\[
V_{-M} \geq 0.03\ \forall M \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; b_M \neq 1\ \forall M
\]
\[
(b_M \neq 1 \ \text{AND}\ b_M \neq 0)\ \forall M \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; \frac{\partial\, Exp(Payoff_M)}{\partial b_M} = \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \quad \forall I \neq M,\ \forall M
\]
\[
\frac{\partial\, Exp(Payoff_M)}{\partial b_M} = \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \quad \forall I \neq M,\ \forall M \;\Rightarrow\; v_M = V\ \forall M
\]
\[
\left(\frac{\partial\, Exp(Payoff_M)}{\partial b_M} = \frac{\partial\, Exp(Payoff_I)}{\partial b_M} \ \ \forall I \neq M,\ \forall M \ \ \text{AND}\ \ v_M = V\ \forall M\right) \;\Rightarrow\; b_M = B\ \forall M
\]
\[
(v_M = V \ \text{AND}\ b_M = B \neq 0)\ \forall M \;\Rightarrow\; \{\text{eq. (4)}\} \;\Rightarrow\; v_M = 0\ \forall M \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; b_M = 1\ \forall M
\]
But $b_M \neq 1\ \forall M$, and $\bar{B} \leq 0.26$ by assumption $\Rightarrow$ there is no ESS with $\bar{B} \leq 0.26$.
It is proved then that (bi = 1, vi = 0 for all i) is a necessary condition for ESS in the
Norms game. Now we prove that it is sufficient.
In a population where every incumbent plays $b = 1$, $v = 0$, a single mutant with boldness $b_M$ and vengefulness $v_M$ obtains:
\[
Exp(Payoff_{MUTANT}) = b_M\, T + (n-1)\, H + E\,\frac{v_M}{2}\,(n-1)
\]
\[
Exp(Payoff_{INCUMBENT}) = T + (b_M + n - 2)\, H + \frac{v_M}{2}\, P \qquad \forall\ INCUMBENT
\]
\[
Exp(Payoff_{MUTANT}) < Exp(Payoff_{INCUMBENT}) \qquad \forall\ b_M, v_M,\ (b_M \neq 1 \ \text{OR}\ v_M \neq 0)
\]
Appendix B
Statement 1: In the Metanorms game, assuming continuity and using Axelrod’s
parameters, there is only one ESS where the norm is established (bi = 4/169, vi = 1 for
all i).
Proof: We will assume that if V < 0.5 the norm is not established, so we only deal
with states where V ≥ 0.5. The following proves that the only state that satisfies eq.
(3) and (4) is bi = 4/169, vi = 1 for all i (assuming V ≥ 0.5). All variables are assumed
to be within the feasible range.
\[
\frac{\partial\, Exp(Payoff_M)}{\partial b_M} = T + (n-1)\, b_M\, V_{-M}\, P = 3 - 171\, b_M\, V_{-M}
\]
\[
\frac{\partial\, Exp(Payoff_I)}{\partial b_M} = H + E\, v_I\, b_M
+ ME\,\frac{v_I}{4}\, 3 b_M^2 \sum_{\substack{j=1\\ j\neq I,M}}^{n} (1-v_j)
+ MP\,\frac{1-v_I}{4}\, 3 b_M^2 \sum_{\substack{j=1\\ j\neq I,M}}^{n} v_j
\]
Let
\[
F(b_M, v_I, V_{-M}) \;=\; \frac{\partial\, Exp(Payoff_M)}{\partial b_M} - \frac{\partial\, Exp(Payoff_I)}{\partial b_M}
\;=\; 4 - \frac{b_M}{4}\Bigl[\, 57\,\bigl(12 + b_M\,(11 v_I - 9)\bigr)\, V_{-M} \;-\; v_I\,\bigl(8 + b_M\,(81 + 33 v_I)\bigr) \Bigr]
\]
\[
\bar{V} \geq 0.5 \;\Rightarrow\; V_{-M} > 0.45\ \forall M \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; (b_M \neq 0 \ \text{AND}\ b_M \neq 1)\ \forall M
\]
\[
(b_M \neq 0 \ \text{AND}\ b_M \neq 1)\ \forall M \;\Rightarrow\; \{\text{eq. (3)}\} \;\Rightarrow\; F(b_M, v_I, V_{-M}) = 0 \quad \forall I \neq M,\ \forall M
\]
Please remember that all variables must be within the feasible range. Given $b_M$, there are at most two different values of $v_I$, namely $[v_I^*]_1$ and $[v_I^*]_2$, such that
\[
F\bigl(b_M, [v_I^*]_1, V_{-M}\bigr) = 0 \quad \text{and} \quad F\bigl(b_M, [v_I^*]_2, V_{-M}\bigr) = 0.
\]
However,
\[
\bigl(V_{-M} > 0.45 \ \text{AND}\ [v_I^*]_1 \leq V_{-M} \leq [v_I^*]_2\bigr) \;\Rightarrow\; [v_I^*]_1 = [v_I^*]_2
\]
\[
\therefore\ \text{given } b_M,\ \exists\ \text{a unique } v_I^* \text{ such that } F(b_M, v_I^*, V_{-M}) = 0
\]
\[
V_{-M} > 0.45 \;\Rightarrow\; \text{given } v_I,\ \exists\ \text{a unique } b_M^* \text{ such that } F(b_M^*, v_I, V_{-M}) = 0
\]
\[
\therefore\ v_i = V\ \forall i \ \ \text{AND}\ \ b_i = B\ \forall i
\]
\[
\bigl(v_i = V\ \forall i \ \text{AND}\ b_i = B\ \forall i \ \text{AND eq. (3) AND eq. (4) AND } \bar{V} \geq 0.5\bigr) \;\Rightarrow\; v_i = 1\ \forall i \ \text{AND}\ b_i = 4/169\ \forall i
\]
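As a check on the expression for F given above (our own algebra, not reproduced from the original): in a homogeneous state with $v_i = 1$ for all $i$ (so $v_I = 1$ and $V_{-M} = 1$),
\[
F(b, 1, 1) \;=\; 4 - \frac{b}{4}\bigl[\,57\,(12 + 2b) - (8 + 114\,b)\,\bigr] \;=\; 4 - \frac{676}{4}\,b \;=\; 4 - 169\,b,
\]
which vanishes at $b = 4/169$; replacing $T = 3$ by $T = 10$ only changes the constant term ($T - H = 11$), giving the value $b = 11/169$ quoted in the main text.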
It is proved then that (b_i = 4/169, v_i = 1 for all i) is a necessary condition for ESS in
the Metanorms game if $\bar{V} \geq 0.5$. Proving that it is also sufficient is tedious but simple.
Denoting a potential mutant's boldness by $b_M$ and its vengefulness by $v_M$, it can be shown that:
\[
Exp(Payoff_{MUTANT}) < Exp(Payoff_{INCUMBENT}) \qquad \forall\ b_M, v_M,\ (b_M \neq 4/169 \ \text{OR}\ v_M \neq 1)
\]
Statement 2: The state where bi = 1, vi = 0 for all i is ESS.
Proof: In a population where every incumbent plays $b = 1$, $v = 0$, a single mutant with boldness $b_M$ and vengefulness $v_M$ obtains:
\[
Exp(Payoff_{MUTANT}) = b_M\, T + (n-1)\, H + E\,\frac{v_M}{2}\,(n-1) + ME\,\frac{v_M}{4}\,(n-1)(n-2)
\]
\[
Exp(Payoff_{INCUMBENT}) = T + (b_M + n - 2)\, H + \frac{v_M}{2}\, P + MP\,\frac{v_M}{4}\,(n-2) \qquad \forall\ INCUMBENT
\]
\[
Exp(Payoff_{MUTANT}) < Exp(Payoff_{INCUMBENT}) \qquad \forall\ b_M, v_M,\ (b_M \neq 1 \ \text{OR}\ v_M \neq 0)
\]