The importance of rapid cultural convergence in the evolution of learned symbolic communication Kenny Smith Language Evolution and Computation Research Unit, Department of Theoretical and Applied Linguistics, The University of Edinburgh, Adam Ferguson Building, 40 George Square, Edinburgh EH8 9LL [email protected] Abstract. Oliphant [6, 7] contends that language is the only naturally- occurring, learned symbolic communication system, because only humans can accurately observe meaning during the cultural transmission of communication. This paper outlines several objections to Oliphant's argument. In particular, it is argued that learning mechanisms necessary to support learned symbolic communication may be unlikely to evolve in non-human species, irrespective of their capacity for observing meaning. This is due to a delay between the emergence of the appropriate learning mechanism and a tness payo to individuals possessing that learning mechanism. Two factors which reduce this delay and increase the likelihood of learned symbolic communication emerging are investigated. These factors emphasise the important role played by cultural processes in gene-culture coevolution during the evolution of communication. 1 Introduction Language is unique among the communication systems of the natural world it is culturally transmitted, the relationship between basic lexical tokens and their meanings is arbitrary and those basic lexical tokens are combined to form structured forms which are used to communicate complex structured meanings. How did language come to be as it is and why is it unique? Much recent work in the eld has focused on the evolution of syntactic communication. Explanations of the human capacity for syntax have placed emphasis on two contrasting adaptive processes: Genetic adaptation of the genetically-encoded human language acquisition device to support syntactic communication due to tness advantages oered by syntactic communication (e.g. [5, 8]). If this is the key adaptive process, the uniqueness of language can be explained in terms of a unique set of selection pressures experienced by the ancestors of modern humans. Cultural adaptation of language in favour of compositionality, due to cultural selection resulting from language learner biases during cultural transmission of communication (e.g. [1, 3]). Such models are not primarily concerned with the origin of the language learner's biases. If cultural adaptation is the key 2 adaptive process, there are at least two possible explanations for the uniqueness of human language - only our ancestors underwent the mutation or set of mutations which equipped them with the necessary mental apparatus, or only the unique set of circumstances experienced by our ancestors resulted in selection for the mental apparatus required for the cultural evolution of language to begin. Recent work by Oliphant [6, 7], building on pioneering work by Hurford [2], focuses on the more basic issue of the emergence of arbitrary and conventionalised word meaning (see [4] for an alternative approach to a similar issue). Oliphant works within the cultural adaptation framework and makes two claims. Firstly, human language is the only learned symbolic communication system. Secondly, language is unique in this respect due to the human capacity to read the communicative intentions of other language users. Once this capacity is established optimal, learned symbolic communication reliably follows through cultural evolution of communication systems. This paper is primarily concerned with integrating the genetic and cultural adaptation styles of explanation and applying this integrated approach to an investigation of Oliphant's second claim. 2 Oliphant's argument Oliphant [6, 7] argues that human language is the only learned symbolic communication system occurring in the natural world and that this fact can be explained in terms of the problem of observing meaning during cultural transmission of communication systems. In order to learn a mapping from a signal to its meaning a learner must observe the meaning-signal pairs used by other individuals. While observing a signal may be fairly straightforward, observing the meaning that signal is intended to convey may not be - the relationship between meaning and signal is arbitrary in a symbolic communication system. Oliphant argues that only humans have mastered the trick of guessing which meaning a signal is intended to convey during acquisition of arbitrary meaning-signal mappings only humans can observe meaning reliably. Oliphant [7] supports this claim with a computational model. Bidirectional associative networks capable of mapping from meanings to signals are used to model individual communicative agents and Oliphant considers how dierent weight-update rules for such networks will inuence the cultural development of communication systems within the framework of iterated cultural transmission within a population of such agents, if the ability to accurately observe meaning is assumed. Oliphant denes a measure of communicative accuracy for these agents, which calculates the probability of any agent using a signal s to communicate a meaning m and having s interpreted as m by another agent. A population's communicative accuracy has a maximum of 1.0, representing the case where any meaning can be successfully communicated by any agent to any other agent in 3 the population. Such populations are said to possess an optimal communication system. Oliphant evaluates weight-update rules in the context of this model with respect to three criteria: Acquisition: Can a network using the rule acquire an optimal communication system? Maintenance: Can a network using the rule acquire an optimal system against reasonable levels of noise? Construction: Can a network using the rule improve a non-optimal system in such a way that a population of such agents will eventually construct an optimal system? Rules capable of only acquisition will be termed learners. Rules capable of only maintenance and acquisition will be termed maintainers and rules capable of construction, maintenance and acquisition will be termed constructors. Oliphant shows that a simple weight-update rule implementing a form of Hebbian learning is capable of constructing optimal systems. He argues that this type of rule is well within the learning capabilities of non-human species, suggesting that learned symbolic communication systems should be common in the natural world. According to Oliphant the rarity of such systems can therefore best be explained in terms of the diculty of observing meaning during cultural transmission of communication systems. 3 Objections to Oliphant's conclusions There are three main objections to Oliphant's argument that the uniqueness of human language is best explained in terms of a uniquely human ability to observe meaning. Firstly, as illustrated in [9], a closer analysis of constructor weight-update rules shows that they are strongly biased in favour of one-toone mappings between meanings and signals. This kind of learning bias may be specic to humans - even if other species were capable of observing meaning, they do not possess a constructor-type learning bias and therefore would be incapable of developing a learned symbolic communication system. Secondly, constructor weight-update rules are unlikely to evolve in simulated populations, even under apparently ideal conditions, due to a delay between the emergence of constructor rules and a tness advantage to individuals using those rules. This suggests that the capacity to observe meaning does not inevitably lead to the emergence of optimal learned communication systems. This argument is more closely considered in Section 3.1. Finally, work in progress suggests that learned symbolic communication systems can be maintained, and to a lesser extent constructed, even if the capacity of learners to observe meanings is diminished - the ability to observe meaning accurately is not a necessary precondition for the emergence of learned communication. 4 3.1 Unreliable emergence If it is the case, as argued in [9], that only humans possess the constructor-type bias in favour of one-to-one mappings between meanings and signals, a modied version of Oliphant's argument seems appealing - only humans possess the ability to observe meaning, and once this trick has been mastered natural selection quickly identies the learning biases which will support symbolic communication. But do constructor-type learning biases reliably emerge given a pre-existing ability to observe meaning? A modied version of Oliphant's model suggests that natural selection may have diculty in identifying constructor learning biases, even under apparently ideal conditions. The model consists of three elements: Communication systems: As in Oliphant's model, unanalysed meanings and unstructured signals are modelled using unit vectors. Communication systems therefore consist of a mapping from meaning vectors to signal vectors. Communicative agents: As in Oliphant's model, individual agents are modelled using bidirectional associative networks (see Figure 1)1 . The mapping between meanings and signals in such networks is determined by the connection weights in the network. Prior to learning the networks have uniform weights across all their connections - the agents have no preference for any particular communication system. During training these agents are presented with meaning-signal pairs (accurate observation of meaning is assumed) and adjust their network connection weights according to a weight-update rule. Whereas in Oliphant's model the weight-update rule used by each agent is determined by the experimenter, in the new model each individual's weightupdate rule, and therefore their learning bias, is genetically encoded 2 . To map from a meaning to a signal, or from a signal to a meaning, the appropriate input vector is presented to the network and the output unit receiving the highest activation is activated, while the other output units remain inactive. This is a winner-take-all strategy. Given multiple output units receiving equally high activation, one winner is randomly selected. Population model: A population consists of 100 agents. At each time step one agent is selected and removed. A new agent is introduced and receives three exposures to the communication systems of other members of the population. 1 10 by 10 networks are used for all models outlined in this paper. 2 The weight-update rule used by each agent is specied in a 4-locus genome corresponding to the phenotype rule (OnOn OnO OOn OO), where the value in the OnOn slot indicates the change to be made to the connection weight between coactive units, OnO is the change to be made when the meaning unit is on and the signal unit is o and so on. There are 3 possible alleles for each locus: +1, 0 or ?1, corresponding to 3 possible changes in connection strength in the phenotype network - increase connection strength, leave connection strength unaltered or decrease connection strength. Each genotype therefore encodes one of 81 possible weight update rules. Of these, 9 can be classied as constructors, 9 can be classied as maintainers, 13 can be classied as learners and 50 can be classied as non-learners, incapable of acquiring an optimal system. The initial population in all simulations consists of agents with random genotypes. 5 Meaning Layer The new agent adjusts their connection weights according to their weightupdate rule. This replacement process is repeated at each time step. Whereas in Oliphant's model removal from the population (death) is entirely random and there was no notion of breeding, in the new model selection pressures on death and breeding are introduced - less successful communicators are more likely to be removed from the population and more successful communicators are more likely to breed to produce new ospring3 , to who they transmit their genetically encoded weight-update rule4. The population is therefore under natural selection for communicative success. Signal Layer Fig. 1. The associative network. Dashed lines indicate the omission of further nodes. In common with Oliphant's model, the agents were assumed to be able to observe meaning during the cultural transmission process. This model therefore seems perfect for evolving learning biases conducive to communicative success - agents can observe meaning, weight-update rules are genetically encoded and there is strong selection pressure for communicative accuracy. Do populations under these conditions reliably converge on constructor rules and develop optimal, learned symbolic communication? At each time step the population are evaluated for communicative accuracy according to the measure described in [7]. Each individual is evaluated for their ability to successfully transmit and receive 20 randomly selected meanings with randomly selected members of the population. Tournament selection is used for both death and breeding. For death, a three-agent tournament is held, with the agent whose communicative accuracy is rated lowest being removed. For breeding, two parents are selected. Each is selected using a three-agent tournament, with the agent whose communicative accuracy is rated highest being allowed to breed. 4 One-point crossover occurs during breeding with probability 0 95. Mutation occurs at each locus on the ospring genotype with probability 0.01. Mutation results in replacement of the allele at the mutated locus with one of the other two possible alleles, randomly selected. 3 : 6 Figure 2 illustrates the progress of a run which converged on a (very near) optimal communication system. This is a typical example of a successful run. However, only 5 of 100 runs were successful in this respect - optimal communication is unlikely to emerge, even under this apparently ideal set of circumstances. 1 Average communicative accuracy Constructor proportion Maintainer proportion Learner proportion Non-learner proportion 0.8 0.6 0.4 0.2 0 0 500 1000 1500 2000 2500 Generation 3000 3500 4000 4500 5000 Fig. 2. A successful run. Why does natural selection have such diculty in identifying weight-update rules which lead to optimal communication? The problem is that constructors need time to converge on an optimal system - cultural selection over repeated time steps gradually moves their communication systems into increasingly optimal overlapping areas of communication system space until they are all converged on an optimal system. In the new model, where weight-update rules are genetically transmitted, in the early stages of this construction process individuals using constructor rules have little tness advantage over other individuals. As a consequence the genetic transmission process will be essentially random the population will undergo genetic drift. In successful runs, such as that shown in Figure 2, genetic drift preserves constructors, by chance, in sucient numbers for sucient time to allow the construction process to get well under way. Constructors then show increased communicative accuracy which leads to steady selection for constructor genes, constructor numbers in the population increase and the population's average communicative accuracy increases sharply. This switch from genetic drift to directed genetic change occurs at around generation 2000 in the run shown in Figure 2. Interestingly, when the population's communicative accuracy nears optimal levels, selection for constructors cuts out - the population enters a second stage of genetic drift, where constructor numbers uctuate randomly. This is due to the fact that maintainers, and to a lesser extent learners, are capable of putting the nishing touches on a communication system and maintaining that system once it is established, given a little assistance by selective removal of poorly- 7 communicating agents. Constructors loose their advantage over maintainers and learners and genetic transmission becomes semi-random once more. However, non-learners never drift back into the population - they are incapable of learning an optimal system and suer a severe tness penalty. This switch from selection to drift occurs around generation 3000 in the run shown in Figure 2. This threestage drift-selection-drift pattern is common over all successful runs. These simulation results suggest that the key factor in explaining the uniqueness of human language is not necessarily the human capacity for observing meaning. Even if other species could observe meaning accurately the evolution of a learning mechanism which supports optimal learned symbolic communication would not necessarily follow, due to the delay between the emergence of constructor-type learning biases and an increase in communicative accuracy for those suitably biased individuals. 4 Speeding up cultural convergence The simulation results outlined above suggest that if it were possible to speed up the construction process then agents with constructor weight-update rules would receive a tness payo more rapidly, reducing the vulnerability of the population to genetic drift and allowing optimal learned communication systems to emerge more reliably. The aim of the research outlined in this section is to identify what kind of modications to the basic model might speed up the construction process. The question of whether these modications correspond to any uniquely human state of aairs can then be addressed. If so, the uniqueness of language may be explicable in terms of these factors. Two methods of speeding up the construction process are outlined below. 4.1 Cultural subpopulations Smaller populations of constructor agents converge on optimal learned systems of communication more rapidly [7]. However, in the context of the simulation model used in Section 3.1, reducing the size of the simulated populations will have two eects. Firstly, cultural convergence will be speeded up. Secondly, genetic drift will be more pronounced in smaller populations. Since both genetic drift and speed of cultural convergence play a role in the emergence of optimal learned communication systems in this model, if we wish to isolate the inuence of speed of cultural convergence on the emergence of communication we must decouple cultural population size and genetic population size. This aim is achieved by introducing spatial organisation into the population in the arena of cultural transmission and use of communication systems, but not in the arena of genetic transmission of weight-update rules5. The parameter d determines the level of spatial organisation of the population. For small 5 For the purposes of evaluation of communicative accuracy, which determines removal from the population and opportunity to breed, the population is assumed to be organised in a ring. Agents are evaluated for their ability to communicate with other 8 d the population is eectively split into several small subpopulations during communication and cultural transmission, but consists of a single homogeneous population in genetic terms. For large d (d 20) the population behaves as a homogeneous whole for both cultural and genetic purposes. Does splitting the population into several smaller cultural subpopulations speed up cultural convergence, therefore making optimal communication systems more likely to emerge? Figure 3 plots the number of successful runs (out of 100) where the population's average communicative accuracy (ca) over the last 500 generations of a 5000 generation run exceeds a certain threshold, for various values of d. There is a clear relationship between d and the number of successful runs, with low d and rapid cultural convergence leading to more reliable emergence of optimal communication systems. It is also clear that there is little dierence in the number of successful runs regardless of whether \success" is dened as ca > 0:9, ca > 0:5 or ca > 0:1 - emergence of successful communication appears to be an all-or-nothing process. 70 ca > 0.9 ca > 0.5 ca > 0.1 60 no. runs 50 40 30 20 10 0 0 2 4 6 8 10 d 12 14 16 18 20 Fig. 3. Number of runs achieving a given level of communicative accuracy for various levels of cultural spatial organisation. 4.2 More learning Increasing the number of exposures each new agent receives to the communication systems of other members of the population also speeds up cultural agents picked from a normal distribution around the position they occupy, with a standard deviation of slots. Similarly, cultural transmission is assumed to be spatially organised, with learners observing the communicative behaviour of individuals picked from a normal distribution around the learner's position, again with standard distribution . However, there is no restriction on the spatial relationship between breeding pairs and their ospring. d d 9 convergence. In all previous simulations outlined in this paper, each agent observes the communication systems of three other members of the population and adjusts their network connection weights according to their weight-update rule. In the simulations outlined in this section, each agent observes and learns from the communicative behaviour of n other agents. All other details of the model are the same as the model outlined in Section 3.1. Figure 4 plots the number of successful runs (out of 100) for various values of n. There is a clear relationship between n and the number of runs achieving a given level of communicative accuracy, with larger values of n resulting in more reliable emergence of nearoptimal communication systems due to faster cultural convergence, although the eect becomes less marked for n > 15. 70 60 no. runs 50 40 30 20 10 ca > 0.9 ca > 0.5 ca > 0.1 0 0 5 10 15 n 20 25 30 Fig. 4. Number of runs achieving a given level of communicative accuracy for various amounts of learning. 5 Conclusions Oliphant [7] argues that optimal, learned symbolic communication will trivially emerge given a capacity to observe meaning during cultural transmission and a commonly-occurring learning bias, and that language is the only naturally occurring instance of such a system because only humans have the ability to observe meanings accurately. The rst part of this paper raises three objections to Oliphant's proposal. Firstly, the learning bias necessary to construct an optimal, learned communication system, a bias in favour of one-to-one mappings between meanings and signals, may be specic to humans. Secondly, even if other species were capable of observing meaning, the correct learning bias might be unlikely to evolve due to a delay between the emergence of the bias and a tness payo to individuals possessing it. Thirdly, near-optimal communication systems 10 may emerge and be maintained in populations of agents incapable of accurately observing meaning. In the second part of this paper two factors inuencing the speed of cultural convergence of populations were considered - organisation of the population into cultural subpopulations and increase of the amount of learning done during cultural transmission. An increase in the speed of cultural convergence due to either of these factors results in more reliable emergence of genetically-encoded constructor weight-update rules, due to a reduced delay between emergence of such rules and a tness payo to individuals possessing them. It is not the intention of this paper to argue that increased learning or cultural spatial organisation characterise crucial phenomenon in human evolution and therefore explain why only humans evolved the appropriate learning bias to support learned symbolic communication, although parallels could be drawn between increased learning in the model and the lengthy immature period in humans. Rather, these results should be taken as emphasising the rather complex and unpredictable interactions between genetic and cultural adaptation in models which do not discount one adaptive process as a starting assumption. Specically, even if a particular learning bias results in the emergence of optimal communication systems through processes of cultural selection we cannot assume that natural selection will reliably nd that bias. References 1. J. Batali. The negotiation and acquisition of recursive grammars as a result of competition among exemplars. In E. Briscoe, editor, Linguistic Evolution through Language Acquisition: Formal and Computational Models. Cambridge University Press, Cambridge, 2001. 2. James R. Hurford. Biological evolution of the saussurean sign as a component of the language acquisition device. Lingua, 77:187{222, 1989. 3. S. Kirby. Syntax without natural selection: how compositionality emerges from vocabulary in a population of learners. In Chris Knight, Michael Studdert-Kennedy, and James R. Hurford, editors, The Evolutionary Emergence of Language: Social Function and the Origins of Linguistic Form, pages 303{323. Cambridge University Press, Cambridge, 2000. 4. N.L. Komarova and M.A. Nowak. The evolutionary dynamics of the lexical matrix. To appear in Bull. Math. Biol. 5. M. A. Nowak, J. B. Plotkin, and V. A. A. Jansen. The evolution of syntactic communication. Nature, 404:495{498, 2000. 6. M. Oliphant. Rethinking the language bottleneck: Why don't animals learn to communicate? presented at 2nd International Conference on the Evolution of Language, 1998. 7. M. Oliphant. The learning barrier: Moving from innate to learned systems of communication. Adaptive Behavior, 7(3/4):371{384, 1999. 8. S. Pinker and P. Bloom. Natural language and natural selection. Behavioral and Brain Sciences, 13:707{784, 1990. 9. K. Smith. The evolution of learning mechanisms supporting symbolic communication. Submitted to CogSci2001, the 23rd Annual Conference of the Cognitive Science Society.
© Copyright 2026 Paperzz