The Truth Is Out There:
the Evolution of Reliability in Aggressive Communication Systems
Peter de Bourcier (1) and Mike Wheeler (2)
1. CyberLife Technology Ltd, Quern House, Mill Court, Cambridge, CB2 5LD, U.K.
Phone: +44 1223 844894, Fax: +44 1223 844918, E-Mail: [email protected]
2. Dept. of Experimental Psychology, University of Oxford, South Parks Road, Oxford, OX1 3UD, U.K.
Phone: +44 1865 271417, Fax: +44 1865 310447, E-Mail: [email protected]
Abstract
This paper reports on our ongoing research in which
we employ an artificial life methodology to study the
evolution of communication. We perform experiments
using synthetic ecologies in which artificial autonomous
agents (animats) with evolved signalling and receiving
tactics are in competition over food. Synthetic ecologies
permit an investigative strategy in which one relaxes
certain restrictive assumptions that, in the interests of
formal tractability, are made in mathematical models of
biological communication. The experiments described
here are examples of that investigative strategy.
Motivated by suggestions from recent biological
signalling theory, we allow individuals to pay attention
to multiple sources of information about a potential
opponent. Our results indicate that if signals are
unreliable, then individuals will evolve to use another
source of information which is guaranteed to be reliable
(if such a source exists), and signals will fall into disuse.
However, if we place an indirect fitness cost on the
receiving tactic of obtaining guaranteed-to-be-reliable
information, then, as this cost rises, it becomes adaptive
and evolutionarily stable for receivers to pay attention to
unreliable signals.
1. Introduction
In conflict situations, should we expect animals to
produce reliable signals of their aggressive intentions, or
should we expect cheats to prosper? If we should expect
aggressive signals to evolve to be reliable, what
evolutionary factors enforce that reliability? If we should
expect aggressive signals to evolve to be unreliable,
should aggressive communication systems persist at all
in nature? For some time now we have been examining
these sorts of questions through the methods and
techniques of artificial life (henceforth ‘A-Life’). (See
[6, 7, 35] for our previous studies.) This ongoing
research takes place within a theoretical framework that
we call Synthetic Behavioural Ecology (henceforth
‘SBE’). We carry out experiments in simple (although
not trivial) synthetic ecologies, with the specific goal of
making a contribution to the scientific understanding of
how ecological context influences the adaptive
consequences of behaviour. The term ‘synthetic
behavioural ecology’ is designed to suggest a theoretical
link between our approach and behavioural ecology, the
sub-discipline of biology in which researchers aim to
identify the functional roles that particular, ecologically
embedded behaviour patterns play in contributing to
Darwinian fitness. Of course, SBE is also closely related
to certain other approaches within the artificial life
community (e.g., [16, 32, 1]). [Footnote 1]
It seems plausible that studies using well-constructed
synthetic ecologies may, in time, help to bridge an
explanatory gap that exists between real-world adaptive
behaviour in natural environments and idealized
mathematical models of that behaviour. Synthetic
ecologies will always be idealizations of any natural
ecology; but, unlike their natural counterparts, synthetic
ecologies permit the precise variation of the key
parameters affecting the observed behaviour, and the
easy repeatability of experiments (cf. [11]).
The relationship between synthetic ecologies and the
mathematical models to which theoretical biologists are
accustomed (see section 2) is less straightforward.
Mathematical models are undoubtedly powerful
theoretical tools. In particular, they allow biologists to
make assumptions and arguments explicit. And there
seems little doubt that synthetic ecologies do not afford
the transparent formal rigour that mathematical models
often achieve. However, as the formal modellers
themselves sometimes observe, the biological realism of
mathematical models, and therefore their usefulness to
empirical research on animal behaviour, can sometimes
be rather limited [11]. In part at least, this situation
occurs because, in order to make those models
mathematically tractable, certain restrictive assumptions
have to be made (see section 3). Our thought is that
SBE is potentially useful precisely because it provides a
platform for experimentation which sits somewhere
between field biology and formal mathematical
modelling.
Synthetic ecologies permit an investigative strategy in
which the experimenter systematically relaxes the
restrictive assumptions made in mathematical models.
Miller recommends a similar strategy:
A powerful way of using A-Life simulations is to
take an existing formal model from theoretical
biology and relax the assumptions (preferably one
at a time) that were required to make the
mathematics tractable. The results of such a
simulation are then directly comparable to the
results of the existing formal model, and will be
comprehensible and relevant for biologists. [23,
p.10]
Footnote 1: For an introduction to behavioural ecology, see [14].
For a discussion of the theoretical relationship between
SBE and other A-Life-style approaches, as well as a
more detailed presentation of SBE as a general
methodology, see [35].
Copyright © 1997, P. de Bourcier, M. Wheeler
Our primary aim, then, is to use SBE to contribute to the
understanding of biological communication. However,
it is worth mentioning in passing that research into
the functional questions surrounding communication
systems (e.g., the conditions under which honest
signalling is the most adaptive strategy) has
implications for research into collective behaviour in
populations of real robots. The fact that we work in
simulation is therefore no principled reason to regard
our SBE studies as irrelevant to researchers working on
robot communication. For example, McFarland [20]
describes a robot communication system in which honest
signalling is assumed, but in which, as McFarland
himself observes, honesty may not be the best policy. As
far as we can tell, this is precisely the sort of issue that
SBE and related approaches are well-placed to
investigate.
2. Biological Theory: Quality, Cost, and Reliability
Consider the following real-life scenario: The
reproductive success of a red deer stag depends on its
fighting ability. The stronger the stag, the larger its
harem, and the more opportunities it will have to pass
on its genes. Contests between stags in the annual
autumn competition can be a dangerous business.
Between 20 and 30 percent of stags become permanently
injured at some point during their lives, and nearly all
stags endure minor injuries. However, despite the image
of violent conflict that such statistics suggest, all-out
fights are comparatively rare. Why is this? The first
reason is that it pays a stag to avoid those fights that it is
likely to lose. The second is that fights are often
seriously damaging to the victor as well as the
vanquished. Thus there has been a selection pressure for
settling contests by display rather than fighting, and the
red deer world has witnessed the evolution of the
distinctive (and adaptive) phenomena of roaring and
parallel walking. In a typical confrontation, the harem
holder and the challenger begin their contest by roaring
at each other. If the holder roars at a faster rate than the
challenger, then the challenger usually withdraws. If
roaring fails to decide the issue, the two stags commence
a parallel walking display which allows further
assessment. Only if this second phase behaviour still
fails to settle the contest does a serious fight occur [5].
For the student of adaptive behaviour interested in
communication, there are some important lessons to be
learned from this classic example of an animal contest.
Most strikingly, the signalling system in operation is
guaranteed to be honest. (An honest signal is one that
reliably reflects the underlying quality of the signaller
[31].) Honesty is ensured because the stag's signals are
biologically correlated with the phenotypic traits that
determine its ability to win a fight (e.g., size or physical
condition).
Such traits are known as the animal's
resource holding potential or RHP [27], and signals
which are biologically correlated with RHP are called
assessment signals [19]. Roaring is a reliable signal of
fighting ability, because, to roar at a fast rate, and to
continue such roaring for what can sometimes be a
protracted period, a stag must be physically strong. A
weaker stag is simply unable to pay the costs of roaring
at higher rates, so it cannot fool the opposition into
treating it as stronger than it, in fact, is.
According to Zahavi's handicap principle [36, 37],
this idea --- that high signalling costs increase the
reliability of the signals made --- is a fundamental
principle of biological signalling systems. Zahavi
reasons as follows: Biological signals will evolve only if
(a) there is information about a potential signaller that a
potential receiver wants and (b) it is in the potential
signaller's interests to supply that information [38].
Thus the information carried by signals must be reliable
enough to warrant a receiver's attention, and unreliable
signals will be selected out. However, if the signaller has
to make an investment in its signals (where ‘investment’
means the cost in fitness that the animal incurs through,
say, energy loss or risk of predation, as a result of
making the signal) the reliability of signals will
increase. A signal which is, for example, wasteful of
energy is, as a consequence of that wastefulness, reliably
predictive of the possession of energy. So not only is it
the case that the costliness of signals guarantees their
honesty, it is also the case that signals will evolve to be
costly and, therefore, to be honest.
The handicap principle is still controversial within
biology, and has received a number of different
interpretations. [Footnote 2] In his influential ESS-model of the way
in which the handicap principle could operate in mating
displays, Grafen [10] suggests a strategic choice
interpretation, in which signallers of different qualities
will signal at different levels that reveal true quality,
because each signaller ‘chooses’ to endure a level of
handicap appropriate to his or her quality. High quality
individuals produce higher signals, and thus endure
bigger handicaps, than low quality individuals. Grafen
shows that for the strategic choice handicap to evolve,
(i) higher signals must cost more, and (ii) the costs
involved must be differential, in the sense that a specific
signal must be proportionally more costly to a weak
individual than to a strong individual. This is the
interpretation of the handicap principle which we will
adopt in what follows.
Despite all this talk of honesty, it seems that some
animal signals are susceptible to exploitation by cheats.
For example, when in foraging flocks, birds of several
species sometimes give hawk alarms when there are no
predatory hawks present. Other birds who hear the
signal often flee, an event which permits the signaller to
gain better access to the available food [25]. This is a
clear case of qualitative deception. But deception may
also be quantitative [12], and since a paradigmatic
context for quantitative deception is aggressive
signalling --- the focus of this paper --- let's consider the
case of an animal which signals the intention to
intensify an already-existing conflict situation, when, in
fact, that animal would not be prepared to escalate the
conflict to that higher level. It seems that such
quantitative cheats --- creatures who consistently
signalled higher levels of aggressive intent than they in
fact possessed --- would tend to be more successful than
honest signallers when confronting ‘trusting’ opponents
in signalling contests. Thus they would take over the
population. To put this point another way, the honest
signalling of aggressive intentions appears not to be an
evolutionarily stable strategy (or ESS). (An ESS is a
strategy which, when adopted by most members of a
population, means that that population cannot be
invaded by a rare alternative strategy [18]. At the ESS
fitness is maximized in the sense that individuals not
adopting the ESS do worse [26].)
Footnote 2: See [10, 29] for classifications and discussions, and [3, 13]
for recent treatments using formal techniques.
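The verbal definition of an ESS given above is commonly stated as a pair of payoff conditions due to Maynard Smith. As a sketch, let $E(A, B)$ denote the expected payoff to an individual playing strategy $A$ against an opponent playing $B$ (this notation is an assumption introduced here for illustration). A strategy $S$ is then an ESS if, for every alternative strategy $T \neq S$:

```latex
\[
E(S, S) > E(T, S)
\quad \text{or} \quad
\bigl( E(S, S) = E(T, S) \ \text{and} \ E(S, T) > E(T, T) \bigr).
\]
```

The first condition says that a rare mutant does worse than the resident strategy when both face residents. The second is a tie-breaker: if the mutant does equally well against residents, it must do worse than the resident when facing other mutants.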
At first sight, then, it seems that signals of aggressive
intentions ought to be unreliable. Following a line of
argument which is consistent with Zahavi-esque
reasoning, one might conclude from this that animals
should not be expected to signal aggressive intentions at
all, because, eventually, it would pay receivers to ignore
such signals in favour of other sources of information,
and the entire strategy of signalling aggressive
intentions would be selected against [14, 17]. However,
there is ample evidence from animal communication
systems that some animals, such as male African
elephants [28] and male Pere David’s deer [34], do
signal aggressive intent. Of course, the handicap
principle itself might explain this phenomenon. For
example, Enquist [8] uses a modification of the hawk-dove
game [18] to demonstrate that if signals of
aggressive intentions are costly and the costs are
differential in the sense identified above, then the
signalling of aggressive intentions is an ESS. So the
handicap principle might well operate to secure the
reliability of communication systems in which
aggressive intentions are signalled. And if signals of
aggressive intentions are reliable, then, all things being
equal, such signals will not be selected out.
Despite the undoubted theoretical significance of
Enquist’s result, in both of the examples of aggressive
signalling just cited (African elephants and Pere David’s
deer), there are common circumstances in which low
quality males appear to signal higher levels of
aggressive intent than they, in fact, possess. This
strategy enables the cheats to achieve a temporary
domination of better quality males, which, in turn,
permits them to enjoy increased mating success. In
other words, these signalling systems are unreliable, yet
they persist. Perhaps, as Harper suggests [12], receivers
will evolve to ignore deceptive signals altogether only
when those signals are qualitatively, rather than
quantitatively, incorrect. Quantitatively incorrect signals
still carry some information. A threat display --- even
one which has been exaggerated --- is not a submission
display.
Things get more complicated still once one gives due
theoretical weight to the fact that the behavioural
response of a receiving animal will be the result not of
an incoming signal alone, but of that signal plus the
degree of importance which the receiver gives to that
signal. Receiver tactics will thus be an important factor
in the evolution of communication systems. Hence
‘receiver psychology’ is increasingly discussed in the
animal behaviour literature (e.g., [21, 31]). And that is
not all that happens on the receiver-side of
communication. Receivers, as well as signallers, often
face fitness costs [31]. For example, receivers assessing
the quality of potential mates who signal to them will
often pay fitness costs in terms of the time taken over
assessment, time that could be spent on other activities.
And notice that in the case of red-deer roaring discussed
earlier, a challenger has to bear the cost of roaring at an
honest level himself, in order to elicit a signal from the
harem-holder. Other receiver costs might include risks
of predation and disease transmission. On this evidence
Stamp Dawkins and Guilford conclude that “the
receiver’s ideal signal (one that is costly to give, and so
gives an honest indication of quality, but costs nothing
to receive) will be a rare commodity” [31, p.866]. They
go on to argue that receiver costs will have important
evolutionary consequences for communication systems.
If signallers and receivers pay high costs for reliable
signalling, then it may be to the mutual benefit of both
to adopt a signalling system which is less costly and
(therefore) not guaranteed to be reliable. Such signalling
systems, involving so-called ‘conventional’ signals
(signals in which the necessary connection between cost
and honesty has been lost), may be established when, for
example, (i) animals recognize each other as individuals
(so that the full costs of assessment are avoided through
memory of previous assessments), or (ii) animals
recognize each other as members of categories of
signallers who have been encountered previously. Once
receiver costs are factored into our understanding, it
seems much more likely that unreliable signalling
systems which permit some level of cheating can be
evolutionarily stable. If, on average, the fitness benefit to
receivers from paying the costs of honest assessment is
outweighed by the benefit from (a) conventional
signalling coupled to (b) the occasional probing of
signallers to impose costs on cheats, then unreliable
signals may well persist [31].
What all this tells us is that the current biological
theory of aggressive signalling, as rich as it undoubtedly
is, contains gaps, shortfalls, and unanswered questions.
In the SBE experiments described in the remainder of
this paper, we endeavour to address two outstanding
issues: multiple sources of information and the cost of
finding out the truth.
3. The New Experiments: Rationale
We noted earlier that existing ESS-models often make
certain restrictive assumptions that can reduce their
usefulness to empirical studies. We suggested also that
synthetic ecologies permit an investigative strategy in
which these restrictive assumptions are systematically
relaxed.
The study of biological communication
provides a context ripe for the application of this
strategy.
Although ESS-models are powerful
mathematical tools for investigating the logic of animal
signalling systems, even state-of-the-art models are
limited, in that they do not allow for multiple receivers
of one signal [11]. Relatedly, information flow is almost
always assumed to be one-way; that is, there is a
signaller and a receiver, and no scope for the same
individual to be both a signaller and a receiver
simultaneously. Our early SBE-models [6, 7, 35]
demonstrated that the general logic of the handicap
principle can carry over to multi-agent signalling
systems in which these two restrictive assumptions are
relaxed, systems which allow both for two-way
information flow and for an individual signal to be
picked up by many receivers. The synthetic ecology that
we use in the experiments described here preserves these
features. It also takes further the strategy of relaxing
restrictive assumptions.
The sequential assessment game of Enquist and
Leimar [9] is one ESS model which relaxes the
assumption that information flow in signalling contests
is one-way. However, even this highly sophisticated
model embodies a third restrictive assumption, which is
that contestants have no way of ‘choosing’ between
different sources of information. They have access to
direct information about an opponent’s strength, but not
to signalling information [11]. This is not a limitation
of the sequential assessment game alone. As far as we
know, situations in which combinations of signalling
information and ‘direct perception’ of quality might be
used by an animal in aggressive confrontations have yet
to be explored in detail by any concrete ESS-model,
although Grafen and Johnstone have made some
tentative suggestions about how such a model might be
developed [11]. They suggest that an individual in an
ESS-game might possess a number of registers, each of
which contains a real number representing such values
as an estimate of an opponent’s strength based on recent
evidence, or an estimate of an opponent’s willingness to
escalate a conflict. The best use to which such a register
might be put by a particular individual depends on how
other individuals are using their registers.
In the first experiment described below, we extend our
model by making it possible for receivers to evolve to
use one or the other of, or a combination of, signalling
information and direct information about true quality.
This extension was inspired primarily by recent
discussions in biological signalling theory (such as
Grafen and Johnstone’s). However, it is also suggested
by the path that our own work has taken. In our earliest
SBE models [6, 35], the strategy of receivers was
effectively fixed, such that the level of threat which a
receiver registered was determined solely by the values
of incoming aggressive signals. However, as stressed
earlier, to understand communication systems, we need
to consider not only the strategies of signallers, but also
the strategies of receivers. Hence we introduced the
concurrent evolution of individual signalling and
receiving strategies [7]. The results from this more
complex ecological scenario suggested the following
hypothesis (see [7, pp.766-71]): Where the fitness costs
of signalling are low, signallers will tend to produce
signals indicating levels of aggression well in excess of
actual aggression. If there is no alternative source of
relevant information, receivers may still pay heed to
those signals. Where the fitness costs of signalling are
high, the pressure on signallers to reduce the level of
signalling may still lead to communication systems in
which signals are not direct reflections of quality, in that
signallers may tend to produce signals indicating levels
of aggression lower than actual aggression. However, if
receivers evolve to give a high degree of importance to
those signals, the effect would be to compensate for the
actual values of the signals. As receivers, individuals
would still behave just as if signals were direct
reflections of aggression (so they would benefit from not
being drawn into costly conflicts); as signallers,
individuals would benefit from the low level of
signalling. Given this result, the obvious next move is to
give receivers the potential to respond to an alternative
source of information about the quality of prospective
opponents (a source that is guaranteed to be reliable),
and to investigate the effects that the existence of such a
source has, at various costs of signalling, on the
evolution of the communication system.
The second experiment described below extends the
model again, this time to connect our work directly with
the recent biological literature on receiver costs (as
discussed in section 2). In all of our previous synthetic
ecologies, the costs of communication were borne
entirely by signalling. In section 6 we describe what
happens when an indirect fitness cost is placed on the
strategy of finding out about the true RHP of potential
opponents. We then observe the effects on the evolution
of the communication system at various costs of
signalling, and at different levels of this novel indirect
cost.
4. Experimental Model
A number of mobile animats (all with equal energy
levels) and a number of stationary food particles (all
with equal energy values) are distributed randomly
throughout a two-dimensional world. This world is
1000 by 1000 units square (each animat being round and
12 units in diameter). Space is continuous, and the
edges of the world are barriers to movement. When an
animat lands on a food particle, the animat’s energy
level is incremented by the energy value of that particle,
and the particle is deemed to have been ‘eaten’. The
food resource is replenished by new food particles which
are added (with a random distribution) at each time-step;
but the resource is also ‘capped,’ so that food is
never more plentiful than at the beginning of the run.
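The capped replenishment rule just described can be sketched as follows. This is a minimal illustration, not the authors' implementation: the number of particles added per time-step (`new_per_step`) and the initial food count (`cap`) are hypothetical values, since the text does not state them.

```python
import random

WORLD_SIZE = 1000.0  # side length of the square world, from the text


def random_position(rng):
    """Draw a uniformly random point inside the world."""
    return (rng.uniform(0.0, WORLD_SIZE), rng.uniform(0.0, WORLD_SIZE))


def replenish_food(food, cap, new_per_step, rng):
    """Add up to new_per_step particles, never exceeding the initial count (cap)."""
    room = max(0, cap - len(food))
    for _ in range(min(new_per_step, room)):
        food.append(random_position(rng))
    return food


rng = random.Random(0)
cap = 50                                   # hypothetical initial food count
food = [random_position(rng) for _ in range(cap)]
del food[:8]                               # pretend eight particles were eaten

counts = []
for _ in range(3):                         # three time-steps of replenishment
    replenish_food(food, cap, new_per_step=5, rng=rng)
    counts.append(len(food))
print(counts)
```

Under these assumed numbers the food count climbs back towards the cap and then stays there, so food is never more plentiful than at the start of the run.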
Each animat has two highly idealized sensory
modalities, which, for convenience, we label ‘vision’
and ‘olfaction’. The visual system is based on a 36-pixel
eye providing information over a full 360 degrees
around the animat, with an arbitrarily imposed
maximum range of 165 units. Each pixel returns a value
corresponding to the proportion of that pixel’s receptive
field containing other animats. The olfactory system is
sensitive to food. Its principles are similar to those of the
visual system, the only differences being that the
olfactory range is only 35 units, and food particles are
treated as point sources.
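One way to picture the visual system is the sketch below. It approximates the proportion of each pixel's receptive field that contains other animats by sampling a few rays per pixel. Several details are assumptions made for illustration and are not taken from the text: the range is measured between animat centres, pixel 0 starts at the positive x-axis and pixels are numbered counter-clockwise, occlusion is ignored, and `SUBRAYS` is a hypothetical sampling resolution.

```python
import math

N_PIXELS = 36          # from the text
VISUAL_RANGE = 165.0   # from the text
ANIMAT_RADIUS = 6.0    # animats are 12 units in diameter (from the text)
SUBRAYS = 10           # hypothetical number of sample rays per pixel


def vision(observer, others):
    """Return 36 values in [0, 1]: the approximate covered fraction of each pixel."""
    ox, oy = observer
    pixel_width = 2.0 * math.pi / N_PIXELS

    # Bearing and angular half-width of every animat within visual range.
    visible = []
    for x, y in others:
        d = math.hypot(x - ox, y - oy)
        if 0.0 < d <= VISUAL_RANGE:
            bearing = math.atan2(y - oy, x - ox)
            half_width = math.asin(min(1.0, ANIMAT_RADIUS / d))
            visible.append((bearing, half_width))

    pixels = []
    for i in range(N_PIXELS):
        hits = 0
        for k in range(SUBRAYS):
            angle = (i + (k + 0.5) / SUBRAYS) * pixel_width
            for bearing, half_width in visible:
                # Smallest absolute angular difference, wrapped into [0, pi].
                diff = abs((angle - bearing + math.pi) % (2.0 * math.pi) - math.pi)
                if diff <= half_width:
                    hits += 1
                    break  # count each sample ray at most once
        pixels.append(hits / SUBRAYS)
    return pixels


near = vision((500.0, 500.0), [(550.0, 500.0)])   # animat 50 units to the east
far = vision((500.0, 500.0), [(900.0, 500.0)])    # animat beyond visual range
```

An olfactory sketch would follow the same pattern with a range of 35 units and food particles treated as points rather than discs.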
Although animats receive energy from food, they also
lose energy in various ways. A small existence-cost is
deducted at each time-step, and animats also lose energy
for fighting, moving, signalling, and reproducing (see
below). If an animat’s energy level sinks to 0, it is
removed from the world; so food-finding is essential to
survival. To encourage foraging, each animat has a
hunger level (a disposition to move towards food) which
changes in a way inversely proportional to its energy
level.
What we think of as fights take place when animats
touch. Fighting animats suffer large energy reductions.
It is therefore plausible to regard an animat’s energy
level as a measure of its RHP. It is a characteristic of
RHP that having a high RHP will generally be costly in
contexts other than fighting. For example, being large
may well be a sign of high RHP, but it also costs the
animal in terms of growth and maintenance [12]. To
reflect this fact in our synthetic ecology, we impose a
cost in energy to making a movement, such that each
movement an animat makes results in a reduction in
energy proportional to the amount of energy that the
animat has. A movement made by an animat whose
energy-level is near the maximum possible (see below)
costs twice as much as a movement made by an animat
who is on the verge of death.
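As a rough sketch of this movement cost, the function below interpolates linearly between a base cost at zero energy and twice that cost at maximum energy. The linear form is only one reading of the description above, and `base_cost` and the maximum energy value are hypothetical parameters, not figures from the text.

```python
def movement_cost(energy, max_energy, base_cost):
    """Cost of one movement: base_cost near death, rising to 2 * base_cost at max energy."""
    fraction = min(max(energy / max_energy, 0.0), 1.0)  # clamp to [0, 1]
    return base_cost * (1.0 + fraction)


MAX_ENERGY = 1000.0   # hypothetical maximum energy level
BASE_COST = 1.0       # hypothetical cost of a movement at near-zero energy

cost_near_death = movement_cost(1.0, MAX_ENERGY, BASE_COST)
cost_half = movement_cost(500.0, MAX_ENERGY, BASE_COST)
cost_at_max = movement_cost(MAX_ENERGY, MAX_ENERGY, BASE_COST)
```

The point of the rule is that a high RHP is expensive to carry around: the same movement costs a well-fed animat up to twice what it costs a starving one.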
The behaviour in our previous SBE models became
increasingly difficult to interpret, as the ecological
contexts became more complicated. If SBE studies are
to be of any real use to theoretical biologists, then the
experimental advantages offered by (a) the capacity for
the precise variation of parameters and (b) the easy
repeatability of experiments (see section 1) must not be
outweighed by the disadvantages of a level of complexity
which makes an understanding of behaviour extremely
difficult to attain. Indeed, in the limit, although
synthetic ecologies will always be idealizations of any
natural ecology, a horrendously complex simulation
might offer little or no experimental advantage over
nature itself. [Footnote 3] With this in mind, we began our latest
round of experiments by simplifying our experimental
model. Previously the behaviour of an individual was,
in part, determined by a separate and specific aggression
module with the following effects: When an animat
made an aggressive movement, its aggression level was
increased by an amount proportional to the previous
aggression level. Conversely, a non-aggressive
movement resulted in a decrease in an animat’s
aggression level, by an amount proportional to the
previous aggression level. An individual’s aggression
level was used as the basis for the production of
aggressive signals. Although there were good reasons
for organizing things in this manner (see our reasoning
in [35]) precisely how this aggression module interacted
with other factors relevant to the production of
behaviour (e.g., energy level) was sometimes hard to
track. So, in our latest model, animats do not have a
separate aggression module controlling a separate
aggression value. Rather we define an aggressive
movement as a movement in which a first animat moves
directly towards a second (which makes sense because
fights occur when animats touch). Aggressive signals
(indications of the apparent tendency that a signaller has
to move directly towards a receiver) are calculated using
the RHP (energy level) of the signaller (as explained
below).
Footnote 3: Many thanks to David Krakauer, in particular, for forcefully
pointing this out to us.
If one is to be serious about evolutionary-functional
explanations of animal contests, then aggression cannot
be thought of as an end in itself or as a ‘spontaneous
appetite’ [15], such that, in the absence of the
performance of aggressive acts, the tendency to behave
aggressively increases with time. Rather, aggressive
behaviour needs to be conceptualized as a form of
adaptive behaviour, with an adaptive purpose, such as to
win or to defend a resource (see, e.g., [1, 8, 33]).
In our synthetic ecological context, the food supply is
limited and foraging is essential for survival. Thus
animats are effectively in competition for the available
resources. Therefore it benefits an individual to inhabit
an area which is not being foraged by other animats. It
is here that aggressive movements play their adaptive
role.
Although aggressive behaviour is reactively
triggered by the presence of other animats within visual
range - so animats do not plan aggressive responses with
the explicit goal of preserving an exclusive foraging area
- nevertheless aggressive behaviour does help an
individual to ‘defend’ just such an area, precisely by
driving away approaching animats. Hence aggressive
movements serve the adaptive purpose of helping an
individual to defend a resource, even though that
purpose is not internally represented as an explicit goal
in the mechanisms controlling that individual’s
aggressive behaviour. [Footnote 4] Each member of the population
has three evolved strategies which, along with that
individual’s hunger and energy levels, determine its
behaviour.
Signalling strategy: Animats signal whenever at least
one other animat is within visual range. Signals are
produced in accordance with the calculation S = E.C,
where S is the value of the signal made, E is that
individual’s current energy level (RHP), and C is an
individual-specific constant, in the range 0-2. A C of 0
is equivalent to not making any signal, a C of 1 is
equivalent to producing indicators of actual RHP, and a
C of 2 is equivalent to producing signals indicating
twice actual RHP. Notice that this allows the existence
of individuals who produce signals indicating levels of
RHP lower than actual RHP. Although an individual’s
signalling strategy remains constant throughout its
lifetime, the actual values of the signals produced by any
one animat will vary across time, because each
individual’s energy level changes dynamically as a
function of its activity. If, in some population, signal D
is usually followed by some action, DD, then any
individual which signals D but does not intend to
perform DD can be thought of as a bluffer. [Footnote 5] An animat
which evolves to have a very high value of C can be
thought of as a bluffer who produces exaggerated signals
of aggressive intent (so this animat’s strategy is to
indicate a stronger disposition to approach another
animat than it, in fact, has). An animat which evolves to
have a very low value of C can be thought of as a
different kind of bluffer, one who produces suppressed
signals of aggressive intent (so this animat’s strategy is
to indicate a weaker disposition to approach another
animat than it, in fact, has).
Footnote 4: In a previous paper [35], we presented a quantitative
analysis (based on ethological techniques) which
suggested that the spatial behaviour of animats in one of
our earlier and simpler synthetic ecological contexts was
plausibly regarded as a minimal form of territoriality.
We have yet to carry out a similar analysis for the
current model.
Footnote 5: This is equivalent to Maynard Smith’s notion of ‘lying’ [17].
[Figure 1a: Signalling Strategy (Low ). Plot not reproduced; the surviving axis ticks run from 10000 to 20000.]
Aggressive signals are displays for which a signalling
animat has to pay, via a deduction in energy. (This is an
appropriate tax since an animat’s energy level is
effectively equivalent to its RHP. The ways in which
costs are paid in natural environments may be more
complex, although the notion of an ‘appropriate link’
still seems to apply [30].) At the beginning of a run, the
units of energy deducted per unit of aggressive signal are
set by the experimenter. Thus the absolute amount of
energy deducted increases with the values of aggressive
signals. In this way, the cost of signalling is differential,
in the sense required by the handicap principle, because,
given a specific signal made by a high-energy
individual, it will cost a low-energy individual
proportionally more to produce that same signal.
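The differential nature of this cost can be illustrated with a short sketch. It assumes, following the definitions above, that the signal is S = E.C and that the energy deducted is the experimenter-set rate multiplied by S. The function names and the example energy levels are ours and are not taken from the original model; the rate of 0.1 is the high-cost setting used in the experiments reported below.

```python
def signal_value(energy: float, c: float) -> float:
    """Signal S made by an animat with energy E and signalling constant C (range 0-2)."""
    if not 0.0 <= c <= 2.0:
        raise ValueError("C must lie in the range 0-2")
    return energy * c


def signalling_cost(signal: float, cost_per_unit: float) -> float:
    """Energy deducted for a signal, at a fixed rate per unit of signal."""
    return signal * cost_per_unit


def relative_cost(signal: float, cost_per_unit: float, energy: float) -> float:
    """Cost of producing `signal` as a fraction of the signaller's current energy."""
    return signalling_cost(signal, cost_per_unit) / energy


if __name__ == "__main__":
    rate = 0.1                      # units of energy deducted per unit of signal
    s = signal_value(800.0, 1.0)    # accurate signal from a high-energy animat
    print(relative_cost(s, rate, 800.0))  # share of energy paid by the high-energy animat
    print(relative_cost(s, rate, 400.0))  # the same signal costs a low-energy animat a larger share
```

The absolute deduction is the same for both animats; only the proportion of available energy differs, which is the sense of 'differential' used here.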
Receiving strategy: Animats receive the signals of any
other animats within visual range. The effect that a
signal has on the behaviour of a receiving animat is a
function of that animat’s receiving strategy. This
strategy is determined by an individual-specific constant,
K, that ‘weights’ incoming signals in order to calculate a
threat value. So T = R.K, where T is the threat, R is the
incoming signal, and K is an individual-specific
constant, in the range 0-2. A K of 0 would result in that
individual ignoring incoming signals; a K of 1 means
that the value of the incoming signal itself is used as the
threat value; and a K of 2 results in incoming signals
being doubled, and that resulting value being used as the
threat value. So the higher the value of K, the higher the
degree of importance that an individual is giving to
incoming signals.
Sensitivity to RHP: A receiving animat is also
sensitive, to some degree, to the energy level (RHP) of
any animat within visual/signalling range. This
sensitivity is determined by an individual-specific
constant, P, that ‘weights’ any energy value picked up in
order to calculate what we call ‘assessed RHP’. The
higher the value of P, the more sensitive the behaviour
of an individual is to the RHP of other animats. So W =
H.P, where W is assessed RHP, H is the visible animat’s
actual energy level (RHP), and P is an individual-specific
constant, in the range 0-2. A P of 0 would result
in that individual ignoring another animat’s RHP; a P of
1 means that that RHP value itself is used as assessed
RHP; and a P of 2 results in that RHP value being
doubled, and that resulting value being used as assessed
RHP.
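The two receiver-side weightings, T = R.K and W = H.P, can be sketched as follows. This is a minimal illustration under the definitions given above; the class name, method names and example values are hypothetical and do not come from the original implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receiver:
    k: float  # receiving constant K, range 0-2
    p: float  # RHP-sensitivity constant P, range 0-2

    def threat(self, incoming_signal: float) -> float:
        """T = R.K: the incoming signal weighted by the receiving constant."""
        return incoming_signal * self.k

    def assessed_rhp(self, actual_energy: float) -> float:
        """W = H.P: the visible animat's energy weighted by the sensitivity constant."""
        return actual_energy * self.p


if __name__ == "__main__":
    # A bluffer with energy 300 and C = 2 emits a signal of 600.
    signal, energy = 600.0, 300.0
    believer = Receiver(k=1.0, p=0.0)   # attends only to the signal
    sceptic = Receiver(k=0.0, p=1.0)    # attends only to actual RHP
    print(believer.threat(signal), believer.assessed_rhp(energy))
    print(sceptic.threat(signal), sceptic.assessed_rhp(energy))
```

The example shows why the two sources of information can come apart: the first receiver responds to the exaggerated signal, while the second responds only to the bluffer's true energy level.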
[Figure 1b: Receiving Strategy (Low). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
[Figure 1c: RHP Sensitivity (Low). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
[Figure 2a: Signalling Strategy (High). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
At each time-step, the direction in which each animat
will move (one of 36 possible directions) is calculated
using the following equation:
$$
p(d) = \frac{h\,s(d) + e\,v(d) + t(o)\,v(o) + w(o)\,v(o) + c}
            {\sum_{i=1}^{n} \left[ h\,s(d_i) + e\,v(d_i) + t(o_i)\,v(o_i) + w(o_i)\,v(o_i) + c \right]}
$$
where p(d) is the probability that the particular animat
will move in the direction d; n is the number of possible
directions of movement; h is the animat's hunger level;
s(d) is the value returned by the olfactory system in
direction d; e is the animat's energy level; v(d) is the
value returned by the visual system for direction d; o is
the direction 180 degrees off d, i.e., the opposite
direction to d; t(o) is the threat that the animat perceives
from other animats from the opposite direction to d; w(o)
is the RHP assessment value for animats from the
opposite direction to d; v(o) is the value returned by the
visual system in the opposite direction to d; and c is a
small constant which prevents zero probabilities.
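The movement rule above can be sketched in code as follows. This is an illustration rather than the authors' implementation: it assumes that the directions are evenly spaced and indexed 0 to n-1, so that the opposite of direction i is i + n/2 (mod n), and that all sensor values are non-negative. The function and argument names are ours.

```python
from typing import List, Sequence


def movement_probabilities(
    h: float,                 # hunger level
    e: float,                 # energy level
    smell: Sequence[float],   # s(d): olfactory value per direction
    vision: Sequence[float],  # v(d): visual value per direction
    threat: Sequence[float],  # t(d): perceived threat per direction
    rhp: Sequence[float],     # w(d): assessed RHP per direction
    c: float = 1.0,           # small constant preventing zero probabilities
) -> List[float]:
    """Return p(d) for every direction d, normalised to sum to 1."""
    n = len(smell)
    if n == 0 or n % 2 != 0:
        raise ValueError("need an even, non-zero number of directions")
    if not (len(vision) == len(threat) == len(rhp) == n):
        raise ValueError("all sensor arrays must have the same length")
    scores = []
    for d in range(n):
        o = (d + n // 2) % n  # direction opposite to d
        scores.append(
            h * smell[d] + e * vision[d]
            + threat[o] * vision[o] + rhp[o] * vision[o] + c
        )
    total = sum(scores)
    return [score / total for score in scores]


if __name__ == "__main__":
    n = 36
    zeros = [0.0] * n
    vision = [0.0] * n
    threat = [0.0] * n
    vision[0], threat[0] = 1.0, 50.0   # a threatening animat seen in direction 0
    probs = movement_probabilities(0.0, 0.0, zeros, vision, threat, zeros)
    print(max(range(n), key=lambda d: probs[d]))  # most likely heading
```

Because the threat and assessed-RHP terms are read from the opposite direction, a threat perceived in one direction raises the probability of moving directly away from it.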
Each of the three evolved strategies is encoded in 8
bits of a 24-bit genotype. So the genotype as a whole
specifies the set of strategies adopted by the particular
individual in question. This is analogous to Grafen and
Johnstone’s notion of a register (see section 2 above). At
the beginning of a run, a random population of
genotypes is created, producing a random distribution of
signalling strategies, receiving strategies, and
RHP-sensitivities. When an animat achieves a pre-defined
(high) energy level, it will asexually reproduce. The
selection pressures imposed by the ecological context
mean that different strategies (or different combinations
of strategies) will have different fitness consequences,
because only those individuals adopting adaptively fit
strategies will have a high probability of becoming
strong enough, in energy terms, to reproduce.
The result of reproduction is a single offspring placed
randomly in the world. This only child is given the
same initial energy level as each member of the
population had at the start of the run, and the
corresponding amount of energy is deducted from the
parent. The parent’s genotype is copied over to the
offspring, but there is a small probability that a genetic
mutation will take place (a 0.05 chance that a bit-flip
mutation will occur as each bit is copied). So it is
possible that the child will adopt different strategies to
its parent.
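The encoding and reproduction scheme just described might be sketched as below. Several details are assumptions made for illustration only, since the text does not specify them: the gene order (C, K, P), the linear mapping from an 8-bit value to the range 0-2, and the use of 'greater than or equal to' for the reproduction threshold. The threshold (1000) and offspring energy (300) are the parameter values listed in this section.

```python
import random
from typing import List, Optional, Tuple

GENE_BITS = 8          # bits per strategy (from the text)
GENOTYPE_BITS = 24     # three strategies in total (from the text)
MUTATION_RATE = 0.05   # per-bit flip probability during copying (from the text)


def decode(genotype: List[int]) -> Tuple[float, float, float]:
    """Map a 24-bit genotype to (C, K, P), each in the range 0-2.

    Assumed here: gene order C, K, P and a linear 8-bit scaling.
    """
    if len(genotype) != GENOTYPE_BITS:
        raise ValueError("genotype must contain 24 bits")
    values = []
    for start in range(0, GENOTYPE_BITS, GENE_BITS):
        gene = genotype[start:start + GENE_BITS]
        as_int = int("".join(str(bit) for bit in gene), 2)
        values.append(2.0 * as_int / (2 ** GENE_BITS - 1))
    return values[0], values[1], values[2]


def copy_with_mutation(
    genotype: List[int], rng: random.Random, rate: float = MUTATION_RATE
) -> List[int]:
    """Copy a genotype bit by bit, flipping each bit with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genotype]


def reproduce(
    parent_energy: float,
    parent_genotype: List[int],
    rng: random.Random,
    threshold: float = 1000.0,
    offspring_energy: float = 300.0,
    rate: float = MUTATION_RATE,
) -> Optional[Tuple[float, float, List[int]]]:
    """Return (new parent energy, child energy, child genotype), or None below threshold."""
    if parent_energy < threshold:
        return None
    child = copy_with_mutation(parent_genotype, rng, rate)
    return parent_energy - offspring_energy, offspring_energy, child


if __name__ == "__main__":
    rng = random.Random(1997)
    parent = [rng.randint(0, 1) for _ in range(GENOTYPE_BITS)]
    print(decode(parent))
    print(reproduce(1000.0, parent, rng))
```

With a per-bit rate of 0.05 and 24 bits, an offspring differs from its parent in a little over one bit on average, so most children inherit strategies close to those of the parent.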
[Figure 2b: Receiving Strategy (High). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
[Figure 2c: RHP Sensitivity (High). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
[Figure 3a: Signalling Strategy (Low). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
The values of the various parameters for the synthetic
ecology were set (largely as a result of trial and error) as
follows: initial supply of food = 1200 particles; initial
size of population = 30; initial energy level = 300;
energy level at which reproduction takes place = 1000;
energy value of 1 particle of food = 45; rate of food
replenishment = a maximum of 16 particles per time
step; maximum supply of food at any one time = 1200;
existence-cost = 1; movement-cost = 1; cost of fighting =
50 units of energy per time step of fight; constant
preventing zero probabilities = 1.
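For readers who wish to re-implement the ecology, the parameter values listed above could be collected in a single configuration object, as in the sketch below. The field names are ours. The helper function is an illustrative lower bound only: it ignores existence, movement, signalling and fighting costs, all of which raise the amount of food actually needed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EcologyParams:
    initial_food: int = 1200                 # particles
    initial_population: int = 30
    initial_energy: float = 300.0
    reproduction_energy: float = 1000.0
    food_energy: float = 45.0                # energy value of one particle
    max_food_replenished_per_step: int = 16
    max_food: int = 1200
    existence_cost: float = 1.0
    movement_cost: float = 1.0
    fight_cost_per_step: float = 50.0
    zero_probability_constant: float = 1.0


def min_particles_to_reproduce(params: EcologyParams) -> int:
    """Fewest food particles a newborn must eat to reach the reproduction level,
    ignoring all energy costs (so a strict lower bound)."""
    gap = params.reproduction_energy - params.initial_energy
    return math.ceil(gap / params.food_energy)


if __name__ == "__main__":
    print(min_particles_to_reproduce(EcologyParams()))
```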
[Figure 3b: Receiving Strategy (Low). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
5. Experiment 1: The Triumph of Reliability
As described earlier, the first of our new experiments
was designed to investigate what would happen to the
communication system if receivers had available to them
a guaranteed-to-be-reliable source of information about
the RHP of potential opponents. In part we conceived of
this experiment as the next logical stage of our own
ongoing investigations (see section 3). However, as
indicated in section 4, we have, since our last reported
studies, simplified our model by removing the separate
and specific aggression module.
Our first step was therefore to ensure that the
behaviour of the new model was similar to that observed
in our most recent previous study, when receivers did
not have access to any such guaranteed-to-be-reliable
source of information (see section 4). We accordingly
began the experiment by running the simulation with
sensitivity to RHP suppressed.
The results of these runs were very similar to those
reported in [7]. When the cost of signalling was high, the
dominant strategy was to make very small signals, and the
dominant receiving strategy was to give these signals a
correspondingly high weighting. When the cost of
signalling was low, the dominant strategy was for animats
to make large signals. As in the previous experiment, we
found that the dominant receiving strategy was to give
(surprisingly) high weighting to these signals.
We then enabled the capacity for animats to be
sensitive to RHP, via the evolved strategy mechanism
described in section 4. The cost of signalling was set to
be low (0 units of energy deducted per unit of aggressive
signal). To expose the trends in signalling and receiving
behaviour, we partitioned the total population into four
sub-populations on the basis of signalling strategy, four
sub-populations on the basis of receiving strategy, and
four sub-populations on the basis of RHP sensitivity.
The signalling groups were identified by ranges in the
evolved value of the individual-specific signalling-constant,
C (Group 1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2).
The receiving groups were identified by ranges in the
evolved value of the individual-specific receiving-constant,
K (Group 1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2).
Finally, the RHP sensitivity groups
were identified by ranges in the evolved value of the
individual-specific RHP sensitivity-constant, P (Group
1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2).
[Figure 3c: RHP Sensitivity (Low). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
[Figure 4a: Signalling Strategy (High). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
[Figure 4b: Receiving Strategy (High). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
The total energy present in each sub-population (a
reasonable guide to adaptive success) was then recorded
against time. Each sub-population, at any one time,
included all individuals adopting a strategy from the
appropriate band, including any offspring.6
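The banding of strategies into four groups and the recording of total energy per group could be sketched as follows. The stated group ranges share their end-points (for example, 0.5 closes Group 1 and opens Group 2), so the treatment of boundary values here is an assumption: a boundary value is assigned to the higher group, except that 2.0 stays in Group 4. The function names are ours.

```python
from typing import Dict, Iterable, Tuple


def band(value: float) -> int:
    """Return the group (1-4) for an evolved constant in the range 0-2.

    Group 1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2.
    """
    if not 0.0 <= value <= 2.0:
        raise ValueError("strategy constants lie in the range 0-2")
    return min(int(value // 0.5) + 1, 4)


def energy_by_group(animats: Iterable[Tuple[float, float]]) -> Dict[int, float]:
    """Sum energy per group for (strategy constant, energy) pairs at one time step."""
    totals = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
    for constant, energy in animats:
        totals[band(constant)] += energy
    return totals


if __name__ == "__main__":
    # Hypothetical snapshot: (signalling constant C, current energy)
    snapshot = [(0.2, 300.0), (1.9, 500.0), (1.6, 250.0), (0.5, 100.0)]
    print(energy_by_group(snapshot))
```

Calling `energy_by_group` once per time step, separately for C, K and P, would yield the three families of time series described above.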
We ran the simulation many times, setting various
values for the cost of signalling. Below we discuss two
typical examples of the results obtained. Figure 1a shows
the signalling strategies present in the population over
64000 time steps when the cost of signalling is low (0).
The dominant group is group 4, which suggests that
‘bluffing’ with exaggerated signals is the best policy.
[Figure 4c: RHP Sensitivity (High). Axis: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]
6
In the following discussions we shall often speak of
‘signallers’ and ‘receivers’; but it should be remembered that
each individual is both a signaller and a receiver, and, in the
later experiments, directly sensitive to RHP too.
Figure 1b shows a mix of all receiving strategies, and
Figure 1c shows that group 4 (the group that gives the
highest weight) is dominant for RHP sensitivity.
When the cost of signalling is increased to 0.1 units of
energy deducted per unit of aggressive signal produced,
we see that signalling strategies are significantly lower.
Figure 2a shows group 1 to be the dominant signallers.
Figure 2b shows that the population on the whole is
giving moderate to low weighting to these signals
(groups 2 and 1). RHP sensitivity is again high (group
4).
These results imply that, whether signals are exaggerated
or suppressed, if there is a reliable source
of information about RHP, then unreliable signals will
be ignored. Further, the unreliable communication
system has near enough collapsed, since receivers were
paying virtually no attention to signals. In the low cost
case signallers still give signals:
since the cost is so low, there is virtually no
disadvantage to continuing to signal, and an animat
might even gain an occasional benefit when mutant high
believers are born into the world. As in previous
experiments, when the cost of signalling is high there is
a strong selective pressure to make signals as low as
possible.
6. Experiment 2: Stable Unreliability
Thus far, the costs in our synthetic ecology had been
borne entirely by signallers. In order to relax this
restrictive assumption, we proceeded to place a fitness
cost on the strategy of finding out about the true RHP of
potential opponents. In essence this is a variation on the
idea of receiver costs. We assume (i) that signals are
cost-free to receive (although possibly costly to produce),
but (ii) that it is possible, although costly, for receivers
to find out the true RHP of signallers. In natural
ecologies these conditions might be met if, for example,
(i) receivers could perceive an aggressive signal whilst
maintaining a safe distance from the signaller, but (ii)
receivers could find out about the true RHP of that
signaller by approaching or probing it in some way,
thereby incurring costs by risking attack, predation, or
disease.
Respecting this line of reasoning, we implemented this
new cost in our synthetic ecology by not allowing
animats to pick up values for RHP from as far away as
signals. In the experiments shown here, the
RHP visible range was limited to 80 units (half the range
of signal perception). As in the last experiment, we ran
the simulation in both high cost and low cost scenarios,
recording the signalling strategies, receiving strategies,
and RHP sensitivities. Figure 3a shows that in the low
cost case high signalling (group 4) is dominant. Figure
3b contrasts with the previous experiment: here,
instead of a mix of strategies, it is clear that high
weighting is given to these signals (group 4). The
sensitivity to RHP has been reduced to group 2
(figure 3c). When the cost of signalling is increased, we
see that low signalling is again dominant (group 1 and
group 2 in figure 4a). Figure 4b shows that group 4 is
the dominant receiving group, meaning that high weight is
given to these signals. Finally, figure 4c shows that the
sensitivity to RHP is lower than in the previous
experiment (group 4 and group 3).
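The range restriction used in this experiment can be sketched as below. The RHP range of 80 units is stated above, and the signal range of 160 units follows from the remark that 80 is half the range of signal perception. The use of straight-line distance in a flat plane, and the treatment of out-of-range values as zero, are assumptions made for illustration; the function name is ours.

```python
import math
from typing import Tuple

SIGNAL_RANGE = 160.0  # implied by "half the range of signal perception"
RHP_RANGE = 80.0      # stated RHP visible range in this experiment


def perceive(
    receiver_pos: Tuple[float, float],
    k: float,
    p: float,
    other_pos: Tuple[float, float],
    other_signal: float,
    other_energy: float,
) -> Tuple[float, float]:
    """Return (threat, assessed RHP) for one visible animat.

    Signals are available out to SIGNAL_RANGE, but true RHP only within
    RHP_RANGE, so a receiver must come closer to learn the truth.
    """
    distance = math.dist(receiver_pos, other_pos)
    threat = other_signal * k if distance <= SIGNAL_RANGE else 0.0
    assessed = other_energy * p if distance <= RHP_RANGE else 0.0
    return threat, assessed


if __name__ == "__main__":
    # A bluffer (energy 300, signal 600) seen from 100 units and from 50 units.
    print(perceive((0.0, 0.0), 1.0, 1.0, (100.0, 0.0), 600.0, 300.0))
    print(perceive((0.0, 0.0), 1.0, 1.0, (50.0, 0.0), 600.0, 300.0))
```

Between 80 and 160 units the signal is the only information a receiver has, which is what makes attending to unreliable signals potentially adaptive in this setting.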
Comparison of these results with those of the previous
experiments suggests that as it becomes costly to find
out the truth, the system reverts to an unreliable yet
stable signalling system. This demonstrates that even
where a guaranteed-to-be-reliable source of information
is available, if it is too costly for receivers to use that
source, the more adaptive strategy is to pay attention to
unreliable signals.
7. Conclusion
There are two related ways in which the results reported
here might be of theoretical interest. First, they lend
support to a particular view in biology, by showing that
that view has theoretical purchase in ecological contexts
which are not easily studied using the formal machinery
of mathematical modelling. To decide whether or not
one ought to expect an observed biological aggressive
communication system to be reliable, one should
endeavour to identify how the various costs in the
ecological context in question are distributed amongst
the various strategies available to individuals. If costs
are not borne by signallers alone, then stable
unreliability may be not merely possible, but likely. It
seems that this principle will apply not only to
aggressive communication systems, but also to
mate-signalling systems, and to other signalling systems as
well. Second, we have pursued further the investigative
strategy of using A-Life-style synthetic ecologies to relax
certain restrictive assumptions made in formal
mathematical models. Our results suggest that this
investigative strategy is a potentially fruitful one.
Acknowledgments
Peter de Bourcier is an employee of CyberLife
Technology Ltd. Michael Wheeler is supported by a
Junior Research Fellowship at Christ Church, Oxford,
with additional assistance from the McDonnell-Pew
Centre for Cognitive Neuroscience, Oxford.
References
[1] J. Archer. The Behavioural Biology of Aggression. Cambridge
University Press, Cambridge, 1988.
[2] R. Brooks and P. Maes, editors. Artificial Life IV: Proceedings of the
Fourth International Workshop on the Synthesis and Simulation of
Living Systems, Cambridge, Mass. and London, England, 1994. MIT
Press / Bradford Books.
[3] S. Bullock. Are ‘handicap equilibria’ merely “quirky possibilities”?
Paper presented to the Animal Behaviour Research Group, University of
Oxford, February 1997.
[4] D. Cliff, P. Husbands, J.-A. Meyer, and S.W. Wilson, editors. From
Animals to Animats 3: Proceedings of the Third International
Conference on Simulation of Adaptive Behavior, Cambridge, Mass.,
1994. MIT Press / Bradford Books.
[5] T.H. Clutton-Brock and S.D. Albon. The roaring of red deer and the
evolution of honest advertisement. Behaviour, 69:145--170, 1979.
[6] P. de Bourcier and M. Wheeler. Signalling and territorial aggression:
An investigation by means of synthetic behavioural ecology. In [4],
463--72, 1994.
[7] P. de Bourcier and M. Wheeler. Aggressive signaling meets adaptive
receiving: further experiments in synthetic behavioural ecology. In [24],
760--71, 1995.
[8] M. Enquist. Communication during aggressive interactions with
particular reference to variation in choice of behaviour. Animal
Behaviour, 33:1152--1161, 1985.
[9] M. Enquist and O. Leimar. The evolution of fatal fighting. Animal
Behaviour, 39:1--9, 1990.
[10] A. Grafen. Biological signals as handicaps. Journal of Theoretical
Biology, 144:517--546, 1990.
[11] A. Grafen and R.A. Johnstone. Why we need ESS signalling
theory. Philosophical Transactions of the Royal Society: Biological
Sciences, 340:245--250, 1993.
[12] D.G.C. Harper. Communication. In J. R. Krebs and N. B. Davies,
editors, Behavioural Ecology: An Evolutionary Approach,
chapter 12, pages 374--397. Blackwell Scientific, Oxford, 3rd edition,
1991.
[13] P.L. Hurd. Communication in discrete action-response games.
Journal of Theoretical Biology, 174:217--22, 1995.
[14] J. R. Krebs and N. B. Davies. An Introduction to Behavioural
Ecology. Blackwell Scientific, Oxford, 2nd edition, 1987.
[15] K. Lorenz. On Aggression. Methuen, London, 1966.
[16] B. MacLennan and G. Burghardt. Synthetic ethology and the
evolution of cooperative communication. Adaptive Behavior,
2(2):161--188, 1994.
[17] J. Maynard Smith. Do animals convey information about their
intentions? Journal of Theoretical Biology, 97:1--5, 1982.
[18] J. Maynard Smith. Evolution and the Theory of Games. Cambridge
University Press, Cambridge, 1982.
[19] J. Maynard Smith and G.A. Parker. The logic of asymmetric
contests. Animal Behaviour, 24:159--175, 1976.
[20] D. McFarland. Towards robot cooperation. In [4], 440--4, 1994.
[21] P.K. McGregor. Signalling in territorial systems: a context for
individual identification, ranging and eavesdropping. Philosophical
Transactions of the Royal Society: Biological Sciences, 340:237--244,
1993.
[22] J.-A. Meyer and S.W. Wilson, editors. From Animals to Animats:
Proceedings of the First International Conference on Simulation of
Adaptive Behavior, Cambridge, Mass., 1991. MIT Press / Bradford
Books.
[23] G.F. Miller. Artificial life as theoretical biology: how to do real
science with computer simulation. Cognitive Science Research Paper 378,
University of Sussex, 1995.
[24] F. Moran, A. Moreno, J.J. Merelo, and P. Chacon, editors. Advances
in Artificial Life: Proceedings of the Third European Conference on
Artificial Life, Berlin and Heidelberg, 1995. Springer-Verlag.
[25] C. A. Munn. Birds that ‘cry wolf’. Nature, 319:143--5, 1986.
[26] G. A. Parker and J. Maynard Smith. Optimality theory in
evolutionary biology. Nature, 348:27--33, 1990.
[27] G.A. Parker. Assessment strategy and the evolution of fighting
behaviour. Journal of Theoretical Biology, 47:223--243, 1974.
[28] J. H. Poole. Announcing intent: Aggressive state of musth in African
elephants. Animal Behaviour, 37:140--52, 1988.
[29] J. Maynard Smith. Mini review: Sexual selection, handicaps, and
true fitness. Journal of Theoretical Biology, 115:1--8, 1985.
[30] M. Stamp Dawkins. Are there general principles of signal design?
Philosophical Transactions of the Royal Society: Biological Sciences,
340:251--255, 1993.
[31] M. Stamp Dawkins and T. Guilford. The corruption of honest
signalling. Animal Behaviour, 41(5):865--73, 1991.
[32] I.J.A. te Boekhorst and P. Hogeweg. Effects of tree size on
travelband formation in orang-utans: Data analysis suggested by a model
study. In [2], 119--29, 1994.
[33] F. Toates and P. Jensen. Ethological and psychological models of
motivation: towards a synthesis. In [22], 194--203, 1991.
[34] C. Wemmer, L. R. Collins, B. B. Beck, and B. Rettberg. The
ethogram. In B. B. Beck and C. Wemmer, editors, The Biology and
Management of an Extinct Species: Pere David's Deer, pages 91--121.
Noyes, New Jersey, 1983.
[35] M. Wheeler and P. de Bourcier. How not to murder your neighbor:
using synthetic behavioral ecology to study aggressive signaling. Adaptive
Behavior, 3(3):273--309, 1995.
[36] A. Zahavi. Mate selection: a selection for a handicap. Journal of
Theoretical Biology, 53:205--214, 1975.
[37] A. Zahavi. The cost of honesty (further remarks on the handicap
principle). Journal of Theoretical Biology, 67:603--605, 1977.
[38] A. Zahavi. The fallacy of conventional signalling. Philosophical
Transactions of the Royal Society: Biological Sciences, 340:227--230,
1993.