The Truth Is Out There: the Evolution of Reliability in Aggressive Communication Systems

Peter de Bourcier1 and Mike Wheeler2

1. CyberLife Technology Ltd, Quern House, Mill Court, Cambridge, CB2 5LD, U.K. Phone: +44 1223 844894, Fax: +44 1223 844918, E-Mail: [email protected]
2. Dept. of Experimental Psychology, University of Oxford, South Parks Road, Oxford, OX1 3UD, U.K. Phone: +44 1865 271417, Fax: +44 1865 310447, E-Mail: [email protected]

Abstract

This paper reports on our ongoing research in which we employ an artificial life methodology to study the evolution of communication. We perform experiments using synthetic ecologies in which artificial autonomous agents (animats) with evolved signalling and receiving tactics are in competition over food. Synthetic ecologies permit an investigative strategy in which one relaxes certain restrictive assumptions that, in the interests of formal tractability, are made in mathematical models of biological communication. The experiments described here are examples of that investigative strategy. Motivated by suggestions from recent biological signalling theory, we allow individuals to pay attention to multiple sources of information about a potential opponent. Our results indicate that if signals are unreliable, then individuals will evolve to use another source of information which is guaranteed to be reliable (if such a source exists), and signals will fall into disuse. However, if we place an indirect fitness cost on the receiving tactic of obtaining guaranteed-to-be-reliable information, then, as this cost rises, it becomes adaptive and evolutionarily stable for receivers to pay attention to unreliable signals.

1. Introduction

In conflict situations, should we expect animals to produce reliable signals of their aggressive intentions, or should we expect cheats to prosper? If we should expect aggressive signals to evolve to be reliable, what evolutionary factors enforce that reliability?
If we should expect aggressive signals to evolve to be unreliable, should aggressive communication systems persist at all in nature? For some time now we have been examining these sorts of questions through the methods and techniques of artificial life (henceforth ‘A-Life’). (See [6, 7, 35] for our previous studies.) This ongoing research takes place within a theoretical framework that we call Synthetic Behavioural Ecology (henceforth ‘SBE’). We carry out experiments in simple (although not trivial) synthetic ecologies, with the specific goal of making a contribution to the scientific understanding of how ecological context influences the adaptive consequences of behaviour. The term ‘synthetic behavioural ecology’ is designed to suggest a theoretical link between our approach and behavioural ecology, the sub-discipline of biology in which researchers aim to identify the functional roles that particular, ecologically embedded behaviour patterns play in contributing to Darwinian fitness. Of course, SBE is also closely related to certain other approaches within the artificial life community (e.g., [16, 32, 1]).1 It seems plausible that studies using well-constructed synthetic ecologies may, in time, help to bridge an explanatory gap that exists between real-world adaptive behaviour in natural environments and idealized mathematical models of that behaviour. Synthetic ecologies will always be idealizations of any natural ecology; but, unlike their natural counterparts, synthetic ecologies permit the precise variation of the key parameters affecting the observed behaviour, and the easy repeatability of experiments (cf. [11]). The relationship between synthetic ecologies and the mathematical models to which theoretical biologists are accustomed (see section 2) is less straightforward. Mathematical models are undoubtedly powerful theoretical tools. In particular, they allow biologists to make assumptions and arguments explicit. 
And there seems little doubt that synthetic ecologies do not afford the transparent formal rigour that mathematical models often achieve. However, as the formal modellers themselves sometimes observe, the biological realism of mathematical models, and therefore their usefulness to empirical research on animal behaviour, can sometimes be rather limited [11]. In part at least, this situation occurs because, in order to make those models mathematically tractable, certain restrictive assumptions have to be made (see section 3). Our thought is that SBE is potentially useful precisely because it provides a platform for experimentation which sits somewhere between field biology and formal mathematical modelling. Synthetic ecologies permit an investigative strategy in which the experimenter systematically relaxes the restrictive assumptions made in mathematical models. Miller recommends a similar strategy:

A powerful way of using A-Life simulations is to take an existing formal model from theoretical biology and relax the assumptions (preferably one at a time) that were required to make the mathematics tractable. The results of such a simulation are then directly comparable to the results of the existing formal model, and will be comprehensible and relevant for biologists. [23, p.10]

1 For an introduction to behavioural ecology, see [14]. For a discussion of the theoretical relationship between SBE and other A-Life-style approaches, as well as a more detailed presentation of SBE as a general methodology, see [35].

Copyright © 1997, P. de Bourcier, M. Wheeler

Our primary aim, then, is to use SBE to contribute to the understanding of biological communication.
However, it is worth mentioning in passing that since research into the functional questions surrounding communication systems (e.g., the conditions under which honest signalling is the most adaptive strategy) will have implications for research into collective behaviour in populations of real robots, there is no principled reason why our SBE studies should be irrelevant to researchers working on robot communication, simply because we are working in simulation. For example, McFarland [20] describes a robot communication system in which honest signalling is assumed, but in which, as McFarland himself observes, honesty may not be the best policy. As far as we can tell, this is precisely the sort of issue that SBE and related approaches are well-placed to investigate.

2. Biological Theory: Quality, Cost, and Reliability

Consider the following real-life scenario: The reproductive success of a red deer stag depends on its fighting ability. The stronger the stag, the larger its harem, and the more opportunities it will have to pass on its genes. Contests between stags in the annual autumn competition can be a dangerous business. Between 20 and 30 percent of stags become permanently injured at some point during their lives, and nearly all stags endure minor injuries. However, despite the image of violent conflict that such statistics suggest, all-out fights are comparatively rare. Why is this? The first reason is that it pays a stag to avoid those fights that it is likely to lose. The second is that fights are often seriously damaging to the victor as well as the vanquished. Thus there has been a selection pressure for settling contests by display rather than fighting, and the red deer world has witnessed the evolution of the distinctive (and adaptive) phenomena of roaring and parallel walking. In a typical confrontation, the harem holder and the challenger begin their contest by roaring at each other.
If the holder roars at a faster rate than the challenger, then the challenger usually withdraws. If roaring fails to decide the issue, the two stags commence a parallel walking display which allows further assessment. Only if this second phase behaviour still fails to settle the contest does a serious fight occur [5]. For the student of adaptive behaviour interested in communication, there are some important lessons to be learned from this classic example of an animal contest. Most strikingly, the signalling system in operation is guaranteed to be honest. (An honest signal is one that reliably reflects the underlying quality of the signaller [31].) Honesty is ensured because the stag's signals are biologically correlated with the phenotypic traits that determine its ability to win a fight (e.g., size or physical condition). Such traits are known as the animal's resource holding potential or RHP [27], and signals which are biologically correlated with RHP are called assessment signals [19]. Roaring is a reliable signal of fighting ability, because, to roar at a fast rate, and to continue such roaring for what can sometimes be a protracted period, a stag must be physically strong. A weaker stag is simply unable to pay the costs of roaring at higher rates, so it cannot fool the opposition into treating it as stronger than it, in fact, is. According to Zahavi's handicap principle [36, 37], this idea --- that high signalling costs increase the reliability of the signals made --- is a fundamental principle of biological signalling systems. Zahavi reasons as follows: Biological signals will evolve only if (a) there is information about a potential signaller that a potential receiver wants and (b) it is in the potential signaller's interests to supply that information [38]. Thus the information carried by signals must be reliable enough to warrant a receiver's attention, and unreliable signals will be selected out. 
However, if the signaller has to make an investment in its signals (where ‘investment’ means the cost in fitness that the animal incurs through, say, energy loss or risk of predation, as a result of making the signal) the reliability of signals will increase. A signal which is, for example, wasteful of energy is, as a consequence of that wastefulness, reliably predictive of the possession of energy. So not only is it the case that the costliness of signals guarantees their honesty, it is also the case that signals will evolve to be costly and, therefore, to be honest. The handicap principle is still controversial within biology, and has received a number of different interpretations.2 In his influential ESS-model of the way in which the handicap principle could operate in mating displays, Grafen [10] suggests a strategic choice interpretation, in which signallers of different qualities will signal at different levels that reveal true quality, because each signaller `chooses' to endure a level of handicap appropriate to his or her quality. High quality individuals produce higher signals, and thus endure bigger handicaps, than low quality individuals. Grafen shows that for the strategic choice handicap to evolve, (i) higher signals must cost more, and (ii) the costs involved must be differential, in the sense that a specific signal must be proportionally more costly to a weak individual than to a strong individual. This is the interpretation of the handicap principle which we will adopt in what follows. Despite all this talk of honesty, it seems that some animal signals are susceptible to exploitation by cheats. For example, when in foraging flocks, birds of several species sometimes give hawk alarms when there are no predatory hawks present. Other birds who hear the signal often flee, an event which permits the signaller to gain better access to the available food [25]. This is a clear case of qualitative deception. 
But deception may also be quantitative [12], and since a paradigmatic context for quantitative deception is aggressive signalling --- the focus of this paper --- let's consider the case of an animal which signals the intention to intensify an already-existing conflict situation, when, in fact, that animal would not be prepared to escalate the conflict to that higher level. It seems that such quantitative cheats --- creatures who consistently signalled higher levels of aggressive intent than they in fact possessed --- would tend to be more successful than honest signallers when confronting ‘trusting’ opponents in signalling contests. Thus they would take over the population. To put this point another way, the honest signalling of aggressive intentions appears not to be an evolutionarily stable strategy (or ESS). (An ESS is a strategy which, when adopted by most members of a population, means that that population cannot be invaded by a rare alternative strategy [18]. At the ESS fitness is maximized in the sense that individuals not adopting the ESS do worse [26].)

2 See [10, 29] for classifications and discussions, and [3, 13] for recent treatments using formal techniques.

At first sight, then, it seems that signals of aggressive intentions ought to be unreliable. Following a line of argument which is consistent with Zahavi-esque reasoning, one might conclude from this that animals should not be expected to signal aggressive intentions at all, because, eventually, it would pay receivers to ignore such signals in favour of other sources of information, and the entire strategy of signalling aggressive intentions would be selected against [14, 17]. However, there is ample evidence from animal communication systems that some animals, such as male African elephants [28] and male Pere David’s deer [34], do signal aggressive intent. Of course, the handicap principle itself might explain this phenomenon.
For example, Enquist [8] uses a modification of the hawk-dove game [18] to demonstrate that if signals of aggressive intentions are costly and the costs are differential in the sense identified above, then the signalling of aggressive intentions is an ESS. So the handicap principle might well operate to secure the reliability of communication systems in which aggressive intentions are signalled. And if signals of aggressive intentions are reliable, then, all things being equal, such signals will not be selected out. Despite the undoubted theoretical significance of Enquist’s result, in both of the examples of aggressive signalling just cited (African elephants and Pere David’s deer), there are common circumstances in which low quality males appear to signal higher levels of aggressive intent than they, in fact, possess. This strategy enables the cheats to achieve a temporary domination of better quality males, which, in turn, permits them to enjoy increased mating success. In other words, these signalling systems are unreliable, yet they persist. Perhaps, as Harper suggests [12], receivers will evolve to ignore deceptive signals altogether only when those signals are qualitatively, rather than quantitatively, incorrect. Quantitatively incorrect signals still carry some information. A threat display --- even one which has been exaggerated --- is not a submission display. Things get more complicated still once one gives due theoretical weight to the fact that the behavioural response of a receiving animal will be the result not of an incoming signal alone, but of that signal plus the degree of importance which the receiver gives to that signal. Receiver tactics will thus be an important factor in the evolution of communication systems. Hence ‘receiver psychology’ is increasingly discussed in the animal behaviour literature (e.g., [21, 31]). And that is not all that happens on the receiver-side of communication.
Receivers, as well as signallers, often face fitness costs [31]. For example, receivers assessing the quality of potential mates who signal to them will often pay fitness costs in terms of the time taken over assessment, time that could be spent on other activities. And notice that in the case of red-deer roaring discussed earlier, a challenger has to bear the cost of roaring at an honest level himself, in order to elicit a signal from the harem-holder. Other receiver costs might include risks of predation and disease transmission. On this evidence Stamp Dawkins and Guilford conclude that “the receiver’s ideal signal (one that is costly to give, and so gives an honest indication of quality, but costs nothing to receive) will be a rare commodity” [31, p.866]. They go on to argue that receiver costs will have important evolutionary consequences for communication systems. If signallers and receivers pay high costs for reliable signalling, then it may be to the mutual benefit of both to adopt a signalling system which is less costly and (therefore) not guaranteed to be reliable. Such signalling systems, involving so-called ‘conventional’ signals (signals in which the necessary connection between cost and honesty has been lost), may be established when, for example, (i) animals recognize each other as individuals (so that the full costs of assessment are avoided through memory of previous assessments), or (ii) animals recognize each other as members of categories of signallers who have been encountered previously. Once receiver costs are factored into our understanding, it seems much more likely that unreliable signalling systems which permit some level of cheating can be evolutionarily stable. If, on average, the fitness benefit to receivers from paying the costs of honest assessment is outweighed by the benefit from (a) conventional signalling coupled to (b) the occasional probing of signallers to impose costs on cheats, then unreliable signals may well persist [31]. 
What all this tells us is that the current biological theory of aggressive signalling, as rich as it undoubtedly is, contains gaps, shortfalls, and unanswered questions. In the SBE experiments described in the remainder of this paper, we endeavour to address two outstanding issues: multiple sources of information and the cost of finding out the truth.

3. The New Experiments: Rationale

We noted earlier that existing ESS-models often make certain restrictive assumptions that can reduce their usefulness to empirical studies. We suggested also that synthetic ecologies permit an investigative strategy in which these restrictive assumptions are systematically relaxed. The study of biological communication provides a context ripe for the application of this strategy. Although ESS-models are powerful mathematical tools for investigating the logic of animal signalling systems, even state-of-the-art models are limited, in that they do not allow for multiple receivers of one signal [11]. Relatedly, information flow is almost always assumed to be one-way; that is, there is a signaller and a receiver, and no scope for the same individual to be both a signaller and a receiver simultaneously. Our early SBE-models [6, 7, 35] demonstrated that the general logic of the handicap principle can carry over to multi-agent signalling systems in which these two restrictive assumptions are relaxed, systems which allow both for two-way information flow and for an individual signal to be picked up by many receivers. The synthetic ecology that we use in the experiments described here preserves these features. It also takes further the strategy of relaxing restrictive assumptions. The sequential assessment game of Enquist and Leimar [9] is one ESS model which relaxes the assumption that information flow in signalling contests is one-way.
However, even this highly sophisticated model embodies a third restrictive assumption, which is that contestants have no way of ‘choosing’ between different sources of information. They have access to direct information about an opponent’s strength, but not to signalling information [11]. This is not a limitation of the sequential assessment game alone. As far as we know, situations in which combinations of signalling information and ‘direct perception’ of quality might be used by an animal in aggressive confrontations have yet to be explored in detail by any concrete ESS-model, although Grafen and Johnstone have made some tentative suggestions about how such a model might be developed [11]. They suggest that an individual in an ESS-game might possess a number of registers, each of which contains a real number representing such values as an estimate of an opponent’s strength based on recent evidence, or an estimate of an opponent’s willingness to escalate a conflict. The best use to which such a register might be put by a particular individual depends on how other individuals are using their registers. In the first experiment described below, we extend our model by making it possible for receivers to evolve to use one or the other of, or a combination of, signalling information and direct information about true quality. This extension was inspired primarily by recent discussions in biological signalling theory (such as Grafen and Johnstone’s). However, it is also suggested by the path that our own work has taken. In our earliest SBE models [6, 35 ], the strategy of receivers was effectively fixed, such that the level of threat which a receiver registered was determined solely by the values of incoming aggressive signals. However, as stressed earlier, to understand communication systems, we need to consider not only the strategies of signallers, but also the strategies of receivers. 
Hence we introduced the concurrent evolution of individual signalling and receiving strategies [7]. The results from this more complex ecological scenario suggested the following hypothesis (see [7, pp.766-71]): Where the fitness costs of signalling are low, signallers will tend to produce signals indicating levels of aggression well in excess of actual aggression. If there is no alternative source of relevant information, receivers may still pay heed to those signals. Where the fitness costs of signalling are high, the pressure on signallers to reduce the level of signalling may still lead to communication systems in which signals are not direct reflections of quality, in that signallers may tend to produce signals indicating levels of aggression lower than actual aggression. However, if receivers evolve to give a high degree of importance to those signals, the effect would be to compensate for the actual values of the signals. As receivers, individuals would still behave just as if signals were direct reflections of aggression (so they would benefit from not being drawn into costly conflicts); as signallers, individuals would benefit from the low level of signalling. Given this result, the obvious next move is to give receivers the potential to respond to an alternative source of information about the quality of prospective opponents (a source that is guaranteed to be reliable), and to investigate the effects that the existence of such a source has, at various costs of signalling, on the evolution of the communication system. The second experiment described below extends the model again, this time to connect our work directly with the recent biological literature on receiver costs (as discussed in section 2). In all of our previous synthetic ecologies, the costs of communication were borne entirely by signalling. In section 6 we describe what happens when an indirect fitness cost is placed on the strategy of finding out about the true RHP of potential opponents. 
We then observe the effects on the evolution of the communication system at various costs of signalling, and at different levels of this novel indirect cost.

4. Experimental Model

A number of mobile animats (all with equal energy levels) and a number of stationary food particles (all with equal energy values) are distributed randomly throughout a two-dimensional world. This world is 1000 by 1000 units square (each animat being round and 12 units in diameter). Space is continuous, and the edges of the world are barriers to movement. When an animat lands on a food particle, the animat’s energy level is incremented by the energy value of that particle, and the particle is deemed to have been ‘eaten’. The food resource is replenished by new food particles which are added (with a random distribution) at each time-step; but the resource is also ‘capped’, so that food is never more plentiful than at the beginning of the run. Each animat has two highly idealized sensory modalities, which, for convenience, we label ‘vision’ and ‘olfaction’. The visual system is based on a 36-pixel eye providing information over a full 360 degree field around the animat, with an arbitrarily imposed maximum range of 165 units. Each pixel returns a value corresponding to the proportion of that pixel’s receptive field containing other animats. The olfactory system is sensitive to food. Its principles are similar to those of the visual system, the only differences being that the olfactory range is only 35 units, and food particles are treated as point sources. Although animats receive energy from food, they also lose energy in various ways. A small existence-cost is deducted at each time-step, and animats also lose energy for fighting, moving, signalling, and reproducing (see below). If an animat’s energy level sinks to 0, it is removed from the world; so food-finding is essential to survival.
To encourage foraging, each animat has a hunger level (a disposition to move towards food) which changes in a way inversely proportional to its energy level. What we think of as fights take place when animats touch. Fighting animats suffer large energy reductions. It is therefore plausible to regard an animat’s energy level as a measure of its RHP. It is a characteristic of RHP that having a high RHP will generally be costly in contexts other than fighting. For example, being large may well be a sign of high RHP, but it also costs the animal in terms of growth and maintenance [12]. To reflect this fact in our synthetic ecology, we impose a cost in energy to making a movement, such that each movement an animat makes results in a reduction in energy proportional to the amount of energy that the animat has. A movement made by an animat whose energy-level is near the maximum possible (see below) costs twice as much as a movement made by an animat who is on the verge of death.

The behaviour in our previous SBE models became increasingly difficult to interpret, as the ecological contexts became more complicated. If SBE studies are to be of any real use to theoretical biologists, then the experimental advantages offered by (a) the capacity for the precise variation of parameters and (b) the easy repeatability of experiments (see section 1) must not be outweighed by the disadvantages of a level of complexity which makes an understanding of behaviour extremely difficult to attain. Indeed, in the limit, although synthetic ecologies will always be idealizations of any natural ecology, a horrendously complex simulation might offer little or no experimental advantage over nature itself.3 With this in mind, we began our latest round of experiments by simplifying our experimental model.
Previously the behaviour of an individual was, in part, determined by a separate and specific aggression module with the following effects: When an animat made an aggressive movement, its aggression level was increased by an amount proportional to the previous aggression level. Conversely, a non-aggressive movement resulted in a decrease in an animat’s aggression level, by an amount proportional to the previous aggression level. An individual’s aggression level was used as the basis for the production of aggressive signals. Although there were good reasons for organizing things in this manner (see our reasoning in [35]), precisely how this aggression module interacted with other factors relevant to the production of behaviour (e.g., energy level) was sometimes hard to track. So, in our latest model, animats do not have a separate aggression module controlling a separate aggression value. Rather we define an aggressive movement as a movement in which a first animat moves directly towards a second (which makes sense because fights occur when animats touch). Aggressive signals (indications of the apparent tendency that a signaller has to move directly towards a receiver) are calculated using the RHP (energy level) of the signaller (as explained below).

3 Many thanks to David Krakauer, in particular, for forcefully pointing this out to us.

If one is to be serious about evolutionary-functional explanations of animal contests, then aggression cannot be thought of as an end in itself or as a ‘spontaneous appetite’ [15], such that, in the absence of the performance of aggressive acts, the tendency to behave aggressively increases with time. Rather, aggressive behaviour needs to be conceptualized as a form of adaptive behaviour, with an adaptive purpose, such as to win or to defend a resource (see, e.g., [1, 8, 33]). In our synthetic ecological context, the food supply is limited and foraging is essential for survival.
Thus animats are effectively in competition for the available resources. Therefore it benefits an individual to inhabit an area which is not being foraged by other animats. It is here that aggressive movements play their adaptive role. Although aggressive behaviour is reactively triggered by the presence of other animats within visual range - so animats do not plan aggressive responses with the explicit goal of preserving an exclusive foraging area - nevertheless aggressive behaviour does help an individual to ‘defend’ just such an area, precisely by driving away approaching animats. Hence aggressive movements serve the adaptive purpose of helping an individual to defend a resource, even though that purpose is not internally represented as an explicit goal in the mechanisms controlling that individual’s aggressive behaviour.4 Each member of the population has three evolved strategies which, along with that individual’s hunger and energy levels, determine its behaviour. Signalling strategy: Animats signal whenever at least one other animat is within visual range. Signals are produced in accordance with the calculation S = E.C, where S is the value of the signal made, E is that individual’s current energy level (RHP), and C is an individual-specific constant, in the range 0-2. A C of 0 is equivalent to not making any signal, a C of 1 is equivalent to producing indicators of actual RHP, and a C of 2 is equivalent to producing signals indicating twice actual RHP. Notice that this allows the existence of individuals who produce signals indicating levels of RHP lower than actual RHP. Although an individual’s signalling strategy remains constant throughout its lifetime, the actual values of the signals produced by any one animat will vary across time, because each individual’s energy level changes dynamically as a function of its activity. 
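The signalling rule S = E.C can be illustrated with a short Python sketch. This is not the code used in the experiments: the function names and the category labels are hypothetical, and only the formula and the 0-2 range of C are taken from the description above.

```python
def signal_value(energy: float, c: float) -> float:
    """Return the signal S = E * C for an animat with energy E and constant C."""
    if not 0.0 <= c <= 2.0:
        raise ValueError("C is expected to lie in the range 0-2")
    return energy * c


def signal_relative_to_rhp(c: float) -> str:
    """Classify a signalling constant by how its signals compare with actual RHP.

    The labels are illustrative only; they are not terms from the model.
    """
    if c == 0.0:
        return "no signal"
    if c < 1.0:
        return "below actual RHP"
    if c == 1.0:
        return "actual RHP"
    return "above actual RHP"


# The strategy (C) is fixed for life, but the signal tracks the current energy level.
for energy in (200.0, 120.0, 40.0):
    print(energy, signal_value(energy, c=1.5), signal_relative_to_rhp(1.5))
```

The loop shows the point made in the text: a constant C produces different signal values over an animat's lifetime because E changes with its activity.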
If, in some population, signal D is usually followed by some action, DD, then any individual which signals D but does not intend to perform DD can be thought of as a bluffer.5 An animat which evolves to have a very high value of C can be thought of as a bluffer who produces exaggerated signals of aggressive intent (so this animat’s strategy is to indicate a stronger disposition to approach another animat than it, in fact, has). An animat which evolves to have a very low value of C can be thought of as a different kind of bluffer, one who produces suppressed signals of aggressive intent (so this animat’s strategy is to indicate a weaker disposition to approach another animat than it, in fact, has).

4 In a previous paper [35], we presented a quantitative analysis (based on ethological techniques) which suggested that the spatial behaviour of animats in one of our earlier and simpler synthetic ecological contexts was plausibly regarded as a minimal form of territoriality. We have yet to carry out a similar analysis for the current model.

5 This is equivalent to Maynard Smith’s notion of ‘lying’ [17].

[Figure 1a: Signalling Strategy (Low).]

Aggressive signals are displays for which a signalling animat has to pay, via a deduction in energy. (This is an appropriate tax since an animat’s energy level is effectively equivalent to its RHP. The ways in which costs are paid in natural environments may be more complex, although the notion of an ‘appropriate link’ still seems to apply [30].) At the beginning of a run, the units of energy deducted per unit of aggressive signal are set by the experimenter. Thus the absolute amount of energy deducted increases with the values of aggressive signals.
In this way, the cost of signalling is differential, in the sense required by the handicap principle, because, given a specific signal made by a high-energy individual, it will cost a low-energy individual proportionally more to produce that same signal.

Receiving strategy: Animats receive the signals of any other animats within visual range. The effect that a signal has on the behaviour of a receiving animat is a function of that animat’s receiving strategy. This strategy is determined by an individual-specific constant, K, that ‘weights’ incoming signals, in order to calculate a threat value. So T = R.K, where T is the threat, R is the incoming signal, and K is an individual-specific constant, in the range 0-2. A K of 0 would result in that individual ignoring incoming signals; a K of 1 means that the value of the incoming signal itself is used as the threat value; and a K of 2 results in incoming signals being doubled, and that resulting value being used as the threat value. So the higher the value of K, the higher the degree of importance that an individual is giving to incoming signals.

Sensitivity to RHP: A receiving animat is also sensitive, to some degree, to the energy level (RHP) of any animat within visual/signalling range. This sensitivity is determined by an individual-specific constant, P, that ‘weights’ any energy value picked up, in order to calculate what we call ‘assessed RHP’. The higher the value of P, the more sensitive the behaviour of an individual is to the RHP of other animats. So W = H.P, where W is assessed RHP, H is the visible animat’s actual energy level (RHP), and P is an individual-specific constant, in the range 0-2. A P of 0 would result in that individual ignoring another animat’s RHP; a P of 1 means that that RHP value itself is used as assessed RHP; and a P of 2 results in that RHP value being doubled, and that resulting value being used as assessed RHP.
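The three weighting rules above (S = E.C, T = R.K, W = H.P) and the differential cost of signalling can be illustrated with a minimal sketch. The function names and the example energy values are our own illustrative assumptions, not the implementation used in the experiments; only the three formulas and the per-unit signalling cost come from the text.

```python
def signal_value(energy: float, c: float) -> float:
    """Signal S = E.C produced by an animat with energy E and constant C (0-2)."""
    return energy * c


def signal_cost(signal: float, cost_per_unit: float) -> float:
    """Energy deducted for one signal; cost_per_unit is set by the experimenter."""
    return signal * cost_per_unit


def threat_value(incoming_signal: float, k: float) -> float:
    """Threat T = R.K perceived by a receiver with constant K (0-2)."""
    return incoming_signal * k


def assessed_rhp(visible_energy: float, p: float) -> float:
    """Assessed RHP W = H.P for a receiver with constant P (0-2)."""
    return visible_energy * p


def relative_cost(energy: float, c: float, cost_per_unit: float) -> float:
    """Fraction of the signaller's own energy spent on producing one signal."""
    return signal_cost(signal_value(energy, c), cost_per_unit) / energy


if __name__ == "__main__":
    # Hypothetical energies: a strong animat signalling its actual RHP (C = 1)
    # and a weak animat bluffing (C = 2) so that both emit the same signal.
    strong = relative_cost(energy=800.0, c=1.0, cost_per_unit=0.1)
    weak = relative_cost(energy=400.0, c=2.0, cost_per_unit=0.1)
    print(f"strong pays {strong:.0%} of its energy, weak pays {weak:.0%}")
```

In this illustration both animats emit the same signal, but the weaker one gives up twice the fraction of its energy to do so, which is the sense in which the cost is differential.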
[Figure 1b: Receiving Strategy (Low). Figure 1c: RHP Sensitivity (Low). Figure 2a: Signalling Strategy (High). Axis label: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]

At each time-step, the direction in which each animat will move (one of 36 possible directions) is calculated using the following equation:

$$p(d) = \frac{h\,s(d) + e\,v(d) + t(o)\,v(o) + w(o)\,v(o) + c}{\sum_{i=1}^{n}\bigl(h\,s(d_i) + e\,v(d_i) + t(o_i)\,v(o_i) + w(o_i)\,v(o_i) + c\bigr)}$$

where p(d) is the probability that the particular animat will move in the direction d; n is the number of possible directions of movement; h is the animat's hunger level; s(d) is the value returned by the olfactory system in direction d; e is the animat's energy level; v(d) is the value returned by the visual system for direction d; o is the direction 180 degrees off d (i.e., the opposite direction to d); t(o) is the threat that the animat perceives from other animats from the opposite direction to d; w(o) is the RHP assessment value for animats from the opposite direction to d; v(o) is the value returned by the visual system in the opposite direction to d; and c is a small constant which prevents zero probabilities.

Each of the three evolved strategies is encoded in 8 bits of a 24 bit genotype. So the genotype as a whole specifies the set of strategies adopted by the particular individual in question. This is analogous to Grafen and Johnstone’s notion of a register (see section 2 above). At the beginning of a run, a random population of genotypes is created, producing a random distribution of signalling strategies, receiving strategies, and RHP sensitivities.
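The movement rule given earlier in this section can be sketched as follows. This is only an illustration under stated assumptions: the sensory inputs are taken to be non-negative lists with one entry per direction, directions are indexed 0 to n-1 so that the opposite of d is d + n/2 (mod n), and the function names are hypothetical. The default value of c follows the parameter list in this section.

```python
import random


def movement_probabilities(h, e, smell, vision, threat, rhp, c=1.0):
    """Return p(d) for every direction d.

    h, e        -- the animat's hunger and energy levels
    smell[d]    -- olfactory value s(d)
    vision[d]   -- visual value v(d)
    threat[d]   -- perceived threat t(d) from animats in direction d
    rhp[d]      -- assessed RHP w(d) of animats in direction d
    c           -- small constant preventing zero probabilities
    """
    n = len(smell)
    weights = []
    for d in range(n):
        o = (d + n // 2) % n  # direction 180 degrees off d
        weights.append(
            h * smell[d] + e * vision[d]
            + threat[o] * vision[o] + rhp[o] * vision[o] + c
        )
    total = sum(weights)
    return [w / total for w in weights]


def choose_direction(probs, rng=random):
    """Sample one direction index according to the probabilities p(d)."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    n = 36
    vision = [0.0] * n
    threat = [0.0] * n
    vision[0] = 1.0   # another animat is visible in direction 0 ...
    threat[0] = 5.0   # ... and is perceived as threatening
    probs = movement_probabilities(0.0, 0.0, [0.0] * n, vision, threat, [0.0] * n)
    print("most likely direction:", max(range(n), key=probs.__getitem__))
```

Because the threat and assessed-RHP terms are read from the opposite direction o, a threatening animat seen in one direction raises the probability of moving directly away from it.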
When an animat achieves a pre-defined (high) energy level, it will asexually reproduce. The selection pressures imposed by the ecological context mean that different strategies (or different combinations of strategies) will have different fitness consequences, because only those individuals adopting adaptively fit strategies will have a high probability of becoming strong enough, in energy terms, to reproduce. The result of reproduction is a single offspring placed randomly in the world. This only child is given the same initial energy level as each member of the population had at the start of the run, and the corresponding amount of energy is deducted from the parent. The parent’s genotype is copied over to the offspring, but there is a small probability that a genetic mutation will take place (a 0.05 chance that a bit-flip mutation will occur as each bit is copied). So it is possible that the child will adopt different strategies to its parent.

[Figure 2b: Receiving Strategy (High). Figure 2c: RHP Sensitivity (High). Figure 3a: Signalling Strategy (Low). Axis label: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]

The values of the various parameters for the synthetic ecology were set (largely as a result of trial and error) as follows: initial supply of food = 1200 particles; initial size of population = 30; initial energy level = 300; energy level at which reproduction takes place = 1000; energy value of 1 particle of food = 45; rate of food replenishment = a maximum of 16 particles per time step; maximum supply of food at any one time = 1200; existence-cost = 1; movement-cost = 1; cost of fighting = 50 units of energy per time step of fight; constant preventing zero probabilities = 1.
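The genotype encoding and the reproduction step described above can be sketched as follows, using the parameter values just listed. Two details are assumptions made purely for illustration, since the text does not specify them: the order of the three 8-bit fields within the genotype, and a linear mapping of each 8-bit value onto the range 0-2. The function names are likewise hypothetical.

```python
import random

BITS_PER_GENE = 8
GENE_NAMES = ("C", "K", "P")   # assumed field order: signalling, receiving, RHP sensitivity
MUTATION_RATE = 0.05           # chance of a bit flip as each bit is copied
INITIAL_ENERGY = 300           # energy given to the offspring, deducted from the parent
REPRODUCTION_ENERGY = 1000     # energy level at which reproduction takes place


def decode(genotype):
    """Map a 24-bit genotype (list of 0/1) to the constants C, K and P.

    Assumption: each 8-bit field is scaled linearly onto the range 0-2.
    """
    if len(genotype) != BITS_PER_GENE * len(GENE_NAMES):
        raise ValueError("genotype must contain 24 bits")
    strategies = {}
    for i, name in enumerate(GENE_NAMES):
        bits = genotype[i * BITS_PER_GENE:(i + 1) * BITS_PER_GENE]
        raw = int("".join(str(b) for b in bits), 2)   # 0..255
        strategies[name] = 2.0 * raw / 255
    return strategies


def mutate(genotype, rng, rate=MUTATION_RATE):
    """Copy a genotype, flipping each bit independently with probability rate."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genotype]


def reproduce(parent_energy, parent_genotype, rng):
    """Return (parent energy after reproduction, child genotype or None)."""
    if parent_energy < REPRODUCTION_ENERGY:
        return parent_energy, None
    child = mutate(parent_genotype, rng)
    return parent_energy - INITIAL_ENERGY, child


if __name__ == "__main__":
    rng = random.Random(42)
    parent = [rng.randint(0, 1) for _ in range(24)]
    energy, child = reproduce(1000, parent, rng)
    print(decode(parent), decode(child), energy)
```

With a per-bit mutation rate of 0.05 and 24 bits, an offspring differs from its parent by 1.2 bits on average, so most children adopt at least slightly different strategies.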
[Figure 3b: Receiving Strategy (Low).]

5. Experiment 1: The Triumph of Reliability

As described earlier, the first of our new experiments was designed to investigate what would happen to the communication system if receivers had available to them a guaranteed-to-be-reliable source of information about the RHP of potential opponents. In part we conceived of this experiment as the next logical stage of our own ongoing investigations (see section 3). However, as indicated in section 4, we have, since our last reported studies, simplified our model by removing the separate and specific aggression module. Thus our first step had to be to ensure that the behaviour of the new model was similar to that observed in our most recent previous study, when receivers did not have access to any such guaranteed-to-be-reliable source of information (see section 4). We therefore began the experiment by running the simulation with sensitivity to RHP suppressed. The results of these runs were very similar to those reported in [7]. When the cost of signalling was high, the dominant strategy was to make very small signals, and the dominant receiving strategy was to give those signals a correspondingly high weighting. When the cost of signalling was low, the dominant strategy was for animats to make large signals. As in the previous experiment, we found that the dominant receiving strategy was to give (surprisingly) high weighting to these signals.

We then enabled the capacity for animats to be sensitive to RHP, via the evolved strategy mechanism described in section 4. The cost of signalling was set to be low (0 units of energy deducted per unit of aggressive signal).
To expose the trends in signalling and receiving behaviour, we partitioned the total population into four sub-populations on the basis of signalling strategy, four sub-populations on the basis of receiving strategy, and four sub-populations on the basis of RHP sensitivity. The signalling groups were identified by ranges in the evolved value of the individual-specific signalling-constant, C (Group 1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2). The receiving groups were identified by ranges in the evolved value of the individual-specific receiving-constant, K (Group 1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2). Finally, the RHP sensitivity groups were identified by ranges in the evolved value of the individual-specific RHP sensitivity-constant, P (Group 1: 0-0.5, Group 2: 0.5-1, Group 3: 1-1.5, Group 4: 1.5-2).

[Figure 3c: RHP Sensitivity (Low). Figure 4a: Signalling Strategy (High). Figure 4b: Receiving Strategy (High). Axis label: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]

The total energy present in each sub-population (a reasonable guide to adaptive success) was then recorded against time. Each sub-population, at any one time, included all individuals adopting a strategy from the appropriate band, including any offspring.6 We ran the simulation many times, setting various values for the cost of signalling. Below we discuss two typical examples of the results obtained. Figure 1a shows the signalling strategies present in the population over 64000 time steps when the cost of signalling is low (0). The dominant group is group 4, which shows ‘bluffing’ to be the best policy.
Figure 1b shows a mix of all receiving strategies, and Figure 1c shows that group 4 (the group that gives the highest weight) is dominant for RHP sensitivity. When the cost of signalling is increased to 0.1 units of energy deducted per unit of aggressive signal produced, we see that signalling strategies are significantly lower. Figure 2a shows group 1 to be the dominant signallers. Figure 2b shows that the population on the whole is giving moderate to low weighting to these signals (groups 2 and 1). RHP sensitivity is again high (group 4).

These results imply that, whether signals are exaggerated or suppressed, if there is a reliable source of information about RHP, then unreliable signals will be ignored. Further, the unreliable communication system has near enough collapsed, since receivers were paying virtually no attention at all to signals. In the low-cost case signallers still give signals: since the cost is so low, there is virtually no disadvantage to continuing to signal, and an animat might even get the odd benefit as mutant high believers are occasionally born into the world. As in previous experiments, when the cost of signalling is high there is a strong selective pressure to make signals as low as possible.

6 In the following discussions we shall often speak of ‘signallers’ and ‘receivers’; but it should be remembered that each individual is both a signaller and a receiver, and, in the later experiments, directly sensitive to RHP too.

[Figure 4c: RHP Sensitivity (High). Axis label: Time; legend: 0->50%, 50->100%, 100->150%, 150->200%.]

6. Experiment 2: Stable Unreliability

Thus far, the costs in our synthetic ecology had been borne entirely by signallers. In order to relax this restrictive assumption, we proceeded to place a fitness cost on the strategy of finding out about the true RHP of potential opponents.
In essence this is a variation on the idea of receiver costs. We assume (i) that signals are cost-free to receive (although possibly costly to produce), but (ii) that it is possible, although costly, for receivers to find out the true RHP of signallers. In natural ecologies these conditions might be met if, for example, (i) receivers could perceive an aggressive signal whilst maintaining a safe distance from the signaller, but (ii) receivers could find out about the true RHP of that signaller only by approaching or probing it in some way, thereby incurring costs by risking attack, predation, or disease. Following this line of reasoning, we implement the new cost in our synthetic ecology by not allowing animats to pick up values for RHP from as far away as signals. In the experiments shown here, the RHP visible range was limited to 80 units (half the range of signal perception).

As in the last experiment, we ran the simulation in both high-cost and low-cost scenarios, recording the signalling strategies, receiving strategies, and RHP sensitivities. Figure 3a shows that in the low-cost case high signalling (group 4) is dominant. Figure 3b contrasts with the previous experiment: instead of a mix of strategies, it is clear that high weighting is given to these signals (group 4). The sensitivity to RHP has been reduced to group 2 (Figure 3c). When the cost of signalling is increased, we see that low signalling is again dominant (group 1 and group 2 in Figure 4a). Figure 4b shows that group 4 is the dominant receiving group, meaning that high weight is given to these signals. Finally, Figure 4c shows that the sensitivity to RHP is less high than in the previous experiment (group 4 and group 3). Comparison of these results with those of the previous experiments suggests that as it becomes costly to find out the truth, the system reverts to an unreliable yet stable signalling system.
This demonstrates that even where a guaranteed-to-be-reliable source of information is available, if it is too costly for receivers to use that source, the more adaptive strategy is to pay attention to unreliable signals.

7. Conclusion

There are two related ways in which the results reported here might be of theoretical interest. First, they lend support to a particular view in biology, by showing that that view has theoretical purchase in ecological contexts which are not easily studied using the formal machinery of mathematical modelling. To decide whether or not one ought to expect an observed biological aggressive communication system to be reliable, one should endeavour to identify how the various costs in the ecological context in question are distributed amongst the various strategies available to individuals. If costs are not borne by signallers alone, then stable unreliability may be not merely possible, but likely. It seems that this principle will apply not only to aggressive communication systems, but also to mate-signalling systems, and to other signalling systems as well. Second, we have pursued further the investigative strategy of using A-Life-style synthetic ecologies to relax certain restrictive assumptions made in formal mathematical models. Our results suggest that this investigative strategy is a potentially fruitful one.

Acknowledgments

Peter de Bourcier is an employee of CyberLife Technology Ltd. Michael Wheeler is supported by a Junior Research Fellowship at Christ Church, Oxford, with additional assistance from the McDonnell-Pew Centre for Cognitive Neuroscience, Oxford.

References

[1] J. Archer. The Behavioural Biology of Aggression. Cambridge University Press, Cambridge, 1988.
[2] R. Brooks and P. Maes, editors. Artificial Life IV: Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems, Cambridge, Mass. and London, England, 1994. MIT Press / Bradford Books.
[3] S. Bullock. Are ‘handicap equilibria’ merely “quirky possibilities”? Paper presented to the Animal Behaviour Research Group, University of Oxford, February 1997.
[4] D. Cliff, P. Husbands, J.-A. Meyer, and S.W. Wilson, editors. From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior, Cambridge, Mass., 1994. MIT Press / Bradford Books.
[5] T.H. Clutton-Brock and S.D. Albon. The roaring of red deer and the evolution of honest advertisement. Behaviour, 69:145-170, 1979.
[6] P. de Bourcier and M. Wheeler. Signalling and territorial aggression: An investigation by means of synthetic behavioural ecology. In [4], 463-72, 1994.
[7] P. de Bourcier and M. Wheeler. Aggressive signaling meets adaptive receiving: further experiments in synthetic behavioural ecology. In [24], 760-71, 1995.
[8] M. Enquist. Communication during aggressive interactions with particular reference to variation in choice of behaviour. Animal Behaviour, 33:1152-1161, 1985.
[9] M. Enquist and O. Leimar. The evolution of fatal fighting. Animal Behaviour, 39:1-9, 1990.
[10] A. Grafen. Biological signals as handicaps. Journal of Theoretical Biology, 144:517-546, 1990.
[11] A. Grafen and R.A. Johnstone. Why we need ESS signalling theory. Philosophical Transactions of the Royal Society: Biological Sciences, 340:245-250, 1993.
[12] D.G.C. Harper. Communication. In J. R. Krebs and N. B. Davies, editors, Behavioural Ecology: An Evolutionary Approach, chapter 12, pages 374-397. Blackwell Scientific, Oxford, 3rd edition, 1991.
[13] P.L. Hurd. Communication in discrete action-response games. Journal of Theoretical Biology, 174:217-22, 1995.
[14] J. R. Krebs and N. B. Davies. An Introduction to Behavioural Ecology. Blackwell Scientific, Oxford, 2nd edition, 1987.
[15] K. Lorenz. On Aggression. Methuen, London, 1966.
[16] B. MacLennan and G. Burghardt. Synthetic ethology and the evolution of cooperative communication. Adaptive Behavior, 2(2):161-188, 1994.
[17] J. Maynard Smith. Do animals convey information about their intentions? Journal of Theoretical Biology, 97:1-5, 1982.
[18] J. Maynard Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, 1982.
[19] J. Maynard Smith and G.A. Parker. The logic of asymmetric contests. Animal Behaviour, 24:159-175, 1976.
[20] D. McFarland. Towards robot cooperation. In [4], 440-4, 1994.
[21] P.K. McGregor. Signalling in territorial systems: a context for individual identification, ranging and eavesdropping. Philosophical Transactions of the Royal Society: Biological Sciences, 340:237-244, 1993.
[22] J.-A. Meyer and S.W. Wilson, editors. From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, Cambridge, Mass., 1991. MIT Press / Bradford Books.
[23] G.F. Miller. Artificial life as theoretical biology: how to do real science with computer simulation. Cognitive Science Research Paper 378, University of Sussex, 1995.
[24] F. Moran, A. Moreno, J.J. Merelo, and P. Chacon, editors. Advances in Artificial Life: Proceedings of the Third European Conference on Artificial Life, Berlin and Heidelberg, 1995. Springer-Verlag.
[25] C. A. Munn. Birds that ‘cry wolf’. Nature, 319:143-5, 1986.
[26] G. A. Parker and J. Maynard Smith. Optimality theory in evolutionary biology. Nature, 348:27-33, 1990.
[27] G.A. Parker. Assessment strategy and the evolution of fighting behaviour. Journal of Theoretical Biology, 47:223-243, 1974.
[28] J. H. Poole. Announcing intent: Aggressive state of musth in African elephants. Animal Behaviour, 37:140-52, 1988.
[29] J. Maynard Smith. Mini review: Sexual selection, handicaps, and true fitness. Journal of Theoretical Biology, 115:1-8, 1985.
[30] M. Stamp Dawkins. Are there general principles of signal design? Philosophical Transactions of the Royal Society: Biological Sciences, 340:251-255, 1993.
[31] M. Stamp Dawkins and T. Guilford. The corruption of honest signalling. Animal Behaviour, 41(5):865-73, 1991.
[32] I.J.A. te Boekhorst and P. Hogeweg. Effects of tree size on travelband formation in orang-utans: Data analysis suggested by a model study. In [2], 119-29, 1994.
[33] F. Toates and P. Jensen. Ethological and psychological models of motivation: towards a synthesis. In [22], 194-203, 1991.
[34] C. Wemmer, L. R. Collins, B. B. Beck, and B. Rettberg. The ethogram. In B. B. Beck and C. Wemmer, editors, The Biology and Management of an Extinct Species: Pere David's Deer, pages 91-121. Noyes, New Jersey, 1983.
[35] M. Wheeler and P. de Bourcier. How not to murder your neighbor: using synthetic behavioral ecology to study aggressive signaling. Adaptive Behavior, 3(3):273-309, 1995.
[36] A. Zahavi. Mate selection: a selection for a handicap. Journal of Theoretical Biology, 53:205-214, 1975.
[37] A. Zahavi. The cost of honesty (further remarks on the handicap principle). Journal of Theoretical Biology, 67:603-605, 1977.
[38] A. Zahavi. The fallacy of conventional signalling. Philosophical Transactions of the Royal Society: Biological Sciences, 340:227-230, 1993.

Copyright © 1997, P. de Bourcier, M. Wheeler