Bulletin of Mathematical Biology (2002) 00, 1–30
doi:10.1006/bulm.2002.0315
Available online at http://www.idealibrary.com

A Model for the Emergence of Adaptive Subsystems

H. DOPAZO∗
Departamento de Biología, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Pabellón 2, Ciudad Universitaria, 1428 Buenos Aires, Argentina
E-mail: [email protected]

M. B. GORDON
Laboratoire Leibniz-IMAG, 46, ave. Félix Viallet, 38031 Grenoble Cedex, France

R. PERAZZO
Departamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Pabellón 1, Ciudad Universitaria, 1428 Buenos Aires, Argentina

S. RISAU-GUSMAN
Zentrum für Interdisziplinäre Forschung, Universität Bielefeld, Wellenberg 1, D-33615, Bielefeld, Germany

∗ Corresponding address: Bioinformatica, CNIO, c/Melchor Fernandez Almagro 3, Madrid 28029, Spain. E-mail: [email protected]

0092-8240/02/000001 + 30 $35.00/0 © 2002 Published by Elsevier Science Ltd on behalf of Society for Mathematical Biology.

We investigate the interaction of learning and evolution in a changing environment. A stable learning capability is regarded as an emergent adaptive system evolved by natural selection of genetic variants. We consider the evolution of an asexual population. Each genotype can have ‘fixed’ and ‘flexible’ alleles. The former express themselves as synaptic connections that remain unchanged during ontogeny, and the latter as synapses that can be adjusted through a learning algorithm. Evolution is modelled using genetic algorithms, and the changing environment is represented by two optimal synaptic patterns that alternate a fixed number of times during the ‘life’ of the individuals. The amplitude of the change is related to the Hamming distance between the two optimal patterns, and the rate of change to the frequency with which the two patterns exchange roles. This model is an extension of that of Hinton and Nowlan, in which the fitness is given by a probabilistic measure of the Hamming
distance to the optimum. We find that two types of evolutionary pathway are possible, depending upon how difficult (costly) it is to cope with the changes of the environment. In one case the population loses the learning ability, and the individuals inherit fixed synapses that are optimal in only one of the environmental states. In the other case a flexible subsystem emerges that allows the individuals to adapt to the changes of the environment. The model helps us to understand how an adaptive subsystem can emerge as the result of the tradeoff between the exploitation of a congenital structure and the exploration of adaptive capabilities through learning.

1. INTRODUCTION

Survival and reproduction of living beings depend upon access to scarce resources that are in general distributed irregularly in space and time. In order to survive and reproduce, individuals have to adapt to the environment, and this is achieved by developing complex behavioural patterns. Adaptation may take place across generations or within the lifespan of an individual. In the former case natural selection acts upon the genetic variation within a gene pool. In the latter case, adaptation is usually referred to as learning and takes place during ontogeny. The capacity to learn, that is, to change behaviour on the basis of past experience, is acquired because it contributes positively to the individuals’ reproductive success. Both solutions stem from the same evolutionary process, through the generation of variations and the selection of the fittest alternative, involving the interplay of two nested adaptive systems (Edelman, 1987). The interaction of learning and evolution was put into a Darwinian framework by Baldwin. In his seminal paper, Baldwin (1896) addressed the question of whether the adaptive capabilities of individuals guide evolution.
Baldwin’s argument is as follows: bearing a learning ability may help individuals of a population to survive in conditions where individuals lacking it are eliminated by natural selection. This means that populations composed of individuals capable of learning survive while mutations accumulate, so that a function that had to be learned in former generations eventually becomes congenital (Baldwin, 1896). Several models have been developed to study the Baldwin effect (Hinton and Nowlan, 1987; Fontanari and Meir, 1990; Ackley and Littman, 1992; French and Messinger, 1994; Ansel, 1999; Dopazo et al., 2001). The general conclusion is that learning and biological evolution do indeed interact, but the interplay is far from having a unique outcome. As far as the indirect transcription of environmental data into genetic information is concerned, the Baldwin effect entails a kind of quandary. On the one hand, if learning is very efficient there is no selection pressure to fix information from the environment: the more efficient the learning, the less effective the transcription. On the other hand, if learning bears an excessive cost, the inherited plasticity becomes useless and the Baldwin effect may never take place. Learning can be regarded as a common feature used to respond to the challenges that adaptive systems have to face during evolution. Let us mention a few of these, as considered by Frank (1996). Such responses appear as the result of the interplay of nested adaptive systems, and their evolution therefore falls within the group of problems related to the Baldwin effect. One response is the construction of ‘simple rules’ to generate complex phenotypes. This arises when the genetic information is not enough to encode most of the detailed constitutive phenotypical information, such as that which goes into, say, the ‘wiring’ of the central nervous system of higher vertebrates.
An example of such a rule is the Hebbian process of stabilizing synaptic connections between neurons through their repeated mutual stimulation. This has been extensively mentioned in the literature as an example of a process that can lead to long-term potentiation and the fixation of memories. This adaptive process is a particular case of the simple ‘generative rules’ that are able to construct complex phenotypical patterns to face a changing environment. A second response is the emergence of an ‘adaptive subsystem’. Frequently offspring have to face new environments and therefore new threats. It is then useless for one generation to transfer to the next the best information to survive in its own environment. Adaptation is possible if an adaptive subsystem is available that generates variations in response to environmental stimuli and later selects the fittest alternatives. An example of this is the immune system of vertebrates, which ‘learns’ to distinguish the self from the foreign through the mechanism of clonal selection. A third possible response is a balance between ‘open’ systems, such as the immune system, that shape themselves during ontogeny, and ‘closed’ or ‘wired’ systems that are part of the constitutive information of the living being. Both exist, for example, within the central nervous system: the latter as automatic reflexes, the former as acquired behaviours. Any living being that undergoes an adaptation process bears a cost. There are costs due to the time involved in adaptation, due to mistakes made during that stage, or simply due to organic limitations before the moment of reproductive maturity. There is always a balance between the exploration of possible new responses and the exploitation of successful solutions that have already been acquired. There are two ingredients common to all the above examples. One is learning, under different forms.
We have mentioned the fixation of memories through a Hebbian process, the adaptation of the immune system to the biotic environment through clonal selection, and the balance between automatic reflexes and acquired behaviours. The second key ingredient that is always present is a changing environment. In a fixed environment, neither learning nor the genesis and subsequent exploitation of a flexible subsystem is required. It is much ‘cheaper’ from an evolutionary point of view to ‘hardwire’ into the genetic information all the features that are relevant for survival, avoiding the overhead of a plastic phenotype that requires a costly adaptation during ontogeny. The purpose of this paper is to develop a working model to investigate the origin of adaptive subsystems as a consequence of the interplay of learning and evolution in a changing environment. The kind of questions that we address are: What types of environmental challenge favour learning abilities? What types of challenge favour a genetic system that spawns a subsystem of variation and selection? It is clear, for instance, that the emergence of an adaptive subsystem has to be tuned in some way to the amplitude or to the rate of the changes in the environment (Ansel, 1999). To put this in different words, the learning capability should be considered the reflected image of the challenges arising from the changing environment, as faced through the evolutionary process. If one considers the transcription of environmental data into genetic information as in the Baldwin effect, the frequency of environmental changes gives rise to several situations. For example, it may be that the information is only partially transferred because it is too costly to cope with the changes, or it may be that the variability of the environment is the very feature that is extracted and transcribed into the genetic information, under the form of an adaptive subsystem.
In Section 2 we introduce the GHN model, in which we generalize the model used by Hinton and Nowlan (1987) to include a changing environment. The individuals are characterized by genetic information that specifies the connections of an idealized neural network. The strengths of the connections can take only two values (±1), and may be either inherited, encoded by corresponding ‘fixed alleles’, or plastic, encoded by ‘flexible alleles’. The latter can change during the lifetime of the individual. The environment is represented by a binary string, and is assumed to alternate between two different configurations during the lifetime of each generation. Each string represents the optimal synaptic connections of a network that would be perfectly adapted to the corresponding environment. The model therefore allows us to consider environmental changes that differ both in ‘amplitude’ (the number of bits in which the two optimal connection patterns differ from each other) and in frequency (the number of changes that occur in the environment during the lifetime of the individual). It also allows us to investigate how the learning (adaptive) capability of the individuals is tuned to the environmental changes. In Section 3 we present a statistical approximation to the fitness function of the GHN model, which helps in understanding its behaviour. In Section 4 we present the numerical methods that we use to simulate the evolutionary process with a genetic algorithm (GA). The results are presented and discussed in Section 5. We show that there are two possible regimes, depending upon the difficulty of adapting to the changes of the environment. When the difficulty of the learning task is low enough, a stable learning or adaptive subsystem emerges that is tuned to the changing environment. If that is not the case, the individuals have one of the possible configurations of the environment ‘hardwired’ into their genomes.
We discuss these possible evolutionary pathways in terms of the features of the fitness landscape of the model. In Section 6 we provide a further generalization of the GHN model to the case of a population of perceptrons, as in Dopazo et al. (2001). This is done with the purpose of discussing the evolutionary consequences of the presence of epistatic effects other than those attributable to learning. The conclusions are drawn in Section 7.

2. THE GHN MODEL

Our model, a generalization of the one proposed by Hinton and Nowlan (1987) (hereafter H & N) for the Baldwin effect, considers a population of individuals, each one having a neural network with L connections or synapses whose strengths $\vec{w} = (w_1, \dots, w_L)$ result from the expression of L genes. Like H & N, we consider three possible alleles for each locus in the genotype, labelled −1, +1 and ? respectively. The alleles −1 and +1 express themselves through two types of connections, say inhibitory (−1) and excitatory (+1), that remain fixed during ontogeny.† The alleles ? express themselves as adaptive or flexible synaptic connections that undergo changes during the lifetime of the individuals. This process is hereafter loosely called learning (Hinton and Nowlan, 1987).‡ Fixed and flexible alleles are inherited by the next generation. Notice that what is transmitted to the offspring is not the value acquired by each flexible synapse during ontogeny, but only the information that the corresponding synapse is flexible. Thus, this is not a Lamarckian inheritance process. H & N assumed that there is a single optimal phenotype, well adapted to a fixed environment, and that the individuals devote a time period of G ‘days’ to learning it. The ‘learning protocol’ is the following: each ‘day’ the individual performs a random assignment of all its flexible synapses to either +1 or −1 with equal probability.
This process stops either when the optimal connection scheme is found, or when the maximal allowed learning time G is over. The fitness of an individual is a decreasing function of the number of trials needed to find the optimal connections. If the right combination is not found within the G allowed trials, the search is stopped and a minimum fitness equal to 1 is assigned to the individual. In our model, we assume that the environment oscillates periodically between two different states, so that during the life of an individual the optimal synaptic connections change 2F times. The extension to more than two different environment states is straightforward. The allowed learning time for each environment state is T = G/2F trials. Notice that, since the total learning time G is fixed, any increase in the frequency of environmental changes, F, reduces the number of allowed learning trials. The optimal synaptic connections corresponding to each environment are represented by vectors of dimension L, labelled $\vec{w}_1$ and $\vec{w}_2$ respectively, which differ from each other only in the first $L_v$ loci, the remaining $L_f = L - L_v$ being equal. For the sake of concreteness we take all the synaptic strengths in $\vec{w}_1$ to be +1, while the first $L_v$ values of $\vec{w}_2$ are −1. Namely,

$$\vec{w}_1 = (\overbrace{1, 1, \dots, 1}^{L_v}, \overbrace{1, 1, \dots, 1}^{L_f}), \qquad \vec{w}_2 = (\underbrace{-1, -1, \dots, -1}_{L_v}, \underbrace{1, 1, \dots, 1}_{L_f}). \qquad (1)$$

In the present model, as well as in that of H & N, learning is the only way for the individuals to increase their fitness, which depends on the learning proficiency. The fitness of an individual is defined by

$$\phi = \frac{1}{2F} \sum_{i=1}^{2F} \phi_i, \qquad (2)$$

where

$$\phi_i = 1 + (L - 1)\left(1 - \frac{t_i}{T}\right)\Theta(T - t_i) \qquad (3)$$

is the partial fitness acquired in period i.

† We denote genotypes with bold characters, and the corresponding synaptic strengths with normal characters.
‡ In Section 6 we consider a case where learning is a more appropriate term.
The latter depends on $t_i$, the number of trials used in period i to find the corresponding optimal weights. Due to the Heaviside step function§ in equation (3), the partial fitness is strictly positive and takes values between 1 and L. If in period i the optimum is not found within the allowed number of trials T, the contribution $\phi_i$ to the total fitness is reduced to its minimum value. The total fitness of the individual, given by equation (2), is an average over all partial fitnesses, equation (3). As the total learning time G ≡ 2FT is kept constant, one expects the learning success of the individuals to decrease with the total number 2F of environmental changes. One difference from the models discussed by H & N and Dopazo et al. (2001) is that here the fitness of an individual depends not only on the number of alleles of each kind, but also upon their particular location in the genotype. By introducing a variable and a fixed part in equation (1) we aim at representing a situation in which some features of the environment change while others remain constant. For instance, the colour of the food may change while its smell, size or location remains constant. As a result, the neural network of the individuals must adapt in order to still recognize what is edible from what is not, despite the environmental change. Note that any individual having even a single −1 allele in the last $L_f$ loci of the genome has the minimal possible fitness of 1, because it will be unable to find the optimal connection weights in spite of its learning possibilities. Individuals whose genotype does not contain any ? are unable to learn, and their fitness is determined at birth. If the synaptic weights match $\vec{w}_1$ or $\vec{w}_2$, a maximum partial fitness of L is reached in the corresponding environment, while the partial fitness is equal to the minimum value of 1 during the rest of the time.

§ $\Theta(x) = 1$ for $x \ge 0$; $\Theta(x) = 0$ for $x < 0$.
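The learning protocol and the fitness of equations (2)–(3) can be sketched in a few lines of code. This is an illustrative sketch only: the function name `learn_and_score` and the list encoding of genomes are our own choices, not part of the authors’ published implementation. Alleles are encoded as −1, +1 or ’?’, and `targets` lists the optimal weight vector of each of the 2F environment periods.

```python
import random

def learn_and_score(genome, targets, T, L):
    """Mean fitness over one 'life', equation (2): in each environment
    period the flexible ('?') synapses are redrawn at random until the
    current optimum is hit or the T allowed trials run out (equation (3))."""
    partial = []
    for target in targets:
        # A wrong *fixed* allele makes this period's optimum unreachable.
        if any(a != '?' and a != t for a, t in zip(genome, target)):
            partial.append(1.0)
        elif '?' not in genome:
            partial.append(float(L))   # congenital match: maximal partial fitness
        else:
            for t in range(1, T + 1):  # random-search 'learning' protocol
                guess = [random.choice((-1, 1)) if a == '?' else a
                         for a in genome]
                if guess == list(target):
                    partial.append(1.0 + (L - 1) * (1.0 - t / T))
                    break
            else:
                partial.append(1.0)    # optimum not found within T trials
    return sum(partial) / len(partial)
```

With, say, L = 5 and targets alternating between the two optima, a genome hardwired to $\vec{w}_1$ scores exactly (L + 1)/2, in line with equation (5) below.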
All other genetic epistatic effects are neglected in this model (as well as in that of H & N). A variant of the present GHN model, in which the fitness is a smoother function of the allelic composition due to such effects, is discussed extensively in Section 6. As we explain below, in the simulation of the evolutionary process each new population is obtained from the preceding generation through a selection process in which each individual leaves a number of descendants proportional to its total fitness. Alleles −1, +1 or ? of the offspring are randomly mutated to either of the two other possible states with a small probability $p_{mut}$ before the offspring begin the learning process.

3. A PROBABILISTIC APPROXIMATION OF THE GHN MODEL

We first present some analytic results, obtained through a probabilistic treatment developed by Fontanari and Meir (1990) and also used by Dopazo et al. (2001), which provide a frame for the interpretation of the numerical simulations of our model, explained in detail in the next section. This approach is valid in the limit of large populations, with also large values of T and $L_v$.

3.1. The genotype mean fitness. We estimate the fitness of each individual¶ through the mean value $\varphi$ of $\phi$, equation (2), averaged over all the possible outcomes of the T allowed trials per environment state. The genome is specified by the numbers $P_{v(f)}$, $Q_{v(f)}$ and $R_{v(f)}$ of alleles +1, ? and −1 respectively, in the first $L_v$ (the last $L_f$) loci.‖ The average, $\varphi(P_v, Q_v, R_v, P_f, Q_f, R_f)$, is calculated as in Fontanari and Meir (1990). One gets

$$\varphi(P_v > 0, Q_v, R_v = 0, P_f, Q_f, R_f = 0) = \varphi(P_v = 0, Q_v, R_v > 0, P_f, Q_f, R_f = 0) = \frac{1}{2}\left[1 + L - (L - 1)\,\frac{1 - (1 - 2^{-Q_f - Q_v})^T}{2^{-Q_f - Q_v}\,T}\right]; \qquad (4)$$

$$\varphi(P_v = 0, Q_v = 0, R_v = L_v, P_f, Q_f, R_f = 0) = \varphi(P_v = L_v, Q_v = 0, R_v = 0, P_f, Q_f, R_f = 0) = \frac{L + 1}{2}; \qquad (5)$$

¶ We borrow the term individual from the parlance of the numerical simulation.
Actually, we make no distinction between individual and genotype.

‖ In fact, only four of the six parameters are independent, as they must fulfil the two relations $P_{v(f)} + Q_{v(f)} + R_{v(f)} = L_{v(f)}$.

$$\varphi(P_v = 0, Q_v = L_v, R_v = 0, P_f, Q_f, R_f = 0) = L - (L - 1)\,\frac{1 - (1 - 2^{-Q_f - L_v})^T}{2^{-Q_f - L_v}\,T}. \qquad (6)$$

Any other values of the variables $P_v, Q_v, R_v, P_f, Q_f, R_f$ yield the minimum fitness $\phi = 1$, so that also $\varphi = 1$. The genotype mean fitness $\varphi$ cannot easily be depicted in sequence space, because it depends on four parameters. Since it is minimal and completely flat in very large subspaces, the evolutionary process is expected to take place mainly within the restricted region where $\varphi > 1$. The same situation arises in the original H & N model, in which the sequence space is parametrized by three numbers, related to those of the present model through $P = P_v + P_f$, $Q = Q_v + Q_f$ and $R = R_v + R_f$. This was extensively discussed by Fontanari and Meir (1990) and Dopazo et al. (2001). Since in that model $P + Q + R = L$, all the realizable sequences lie within the triangle $P + Q \le L$, and the fitness differs from 1 only in the subspace $P + Q = L$. In the present case, the relevant subspaces have $R_f = 0$ and either $R_v = 0$ and $P_v \ne 0$, or $R_v \ne 0$ and $P_v = 0$. These two regions are entirely symmetrical with respect to each other. We therefore restrict ourselves to the landscape of the genotype mean fitness in the subspace defined by $0 \le Q_f \le L_f$ and $0 \le Q_v \le L_v$, with $R_f = R_v = 0$. The mean fitness $\varphi$, defined by equations (4)–(6), has three peaks. Two of them, which correspond to individuals that are unable to learn, are symmetrically located at $(P_v, R_v) = (L_v, 0)$ and $(P_v, R_v) = (0, L_v)$, with $Q_f = Q_v = 0$. The corresponding synaptic connections are $\vec{w}_1$ and $\vec{w}_2$ respectively, which are optimal in either of the two possible environments, but not in both. The third peak corresponds to $(Q_v, Q_f) = (L_v, 0)$ and $R_f = 0$.
The corresponding neural network has its first $L_v$ synaptic connections adaptive; the last $L_f$ are equal to the optimal connections, which are the same in both environments. From here on we refer to the first two maxima as the fixed maxima and to the latter as the flexible maximum. The genotype mean fitness of both fixed maxima is $\varphi = (L + 1)/2$, whereas that of the flexible maximum depends upon the learning time T. There is a critical value $T_c$ such that for $T > T_c$ the absolute maximum of the fitness landscape is the flexible one, while for $T < T_c$ the highest fitness corresponds to either of the two fixed maxima, which are degenerate. The change of regime can easily be recognized in the fitness landscapes shown in Fig. 1.

Figure 1. Landscape of the genotype mean fitness $\varphi$ given by equations (4)–(6). The many-dimensional sequence space has been reduced to the subspace with $R_f = 0$ and $R_v = 0$; the only two coordinates needed are $0 \le Q_f \le L_f$ and $0 \le Q_v \le L_v$. Panels (a), (b) and (c) correspond to three different values of T, namely $T = 30 < T_c$, $T = 50 = T_c$ and $T = 200 > T_c$ respectively. The value $T_c = 50$ corresponds to $L_v = 5$, $L_f = 16$. At every grid point corresponding to a realizable genotype, the value of the mean fitness is displayed as a bar.

The critical value $T_c$ is obtained by solving the equation that results from equating the fitness of the three peaks, namely

$$2^{-L_v - 1}\,T_c - 1 + (1 - 2^{-L_v})^{T_c} = 0. \qquad (7)$$

The value of $T_c$ scales exponentially with $L_v$, which may be considered a measure of the ‘amplitude’ of the environmental variation. In fact, for large $L_v$, the solution of equation (7) can be approximated by $T_c \approx 1.59 \cdot 2^{L_v}$. The existence of two different regimes has a simple interpretation. Within this model, if the total learning time G is kept constant, a short learning time per environment state T is equivalent to a rapidly changing environment (large F).
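Equations (6) and (7) are easy to check numerically. The sketch below uses our own function names, and a simple bisection stands in for whatever root-finder one prefers; it evaluates the mean fitness of the flexible maximum (equation (6) with $Q_f = 0$) and solves equation (7) for $T_c$:

```python
def phi_flex(L_v, T, L):
    """Equation (6) with Qf = 0: mean fitness of the flexible maximum,
    whose first L_v synapses are all adaptive."""
    p = 2.0 ** (-L_v)                     # per-trial success probability
    return L - (L - 1) * (1.0 - (1.0 - p) ** T) / (p * T)

def critical_T(L_v):
    """Bisection on equation (7): 2^(-L_v-1) T - 1 + (1 - 2^-L_v)^T = 0."""
    f = lambda T: 2.0 ** (-L_v - 1) * T - 1.0 + (1.0 - 2.0 ** (-L_v)) ** T
    lo, hi = 1.0, 4.0 * 2.0 ** L_v        # f(lo) < 0 and f(hi) > 0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)
```

For $L_v = 5$ this gives a $T_c$ of the order of 50, consistent with Fig. 1, and the ratio $T_c / 2^{L_v}$ approaches approximately 1.59 as $L_v$ grows.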
In such situations, the individuals do not have enough time to take advantage of a large number of flexible alleles. Learning then becomes a burden, and the best option is to be equal to either of the two fixed maxima at birth, thus being optimally tuned to the environment, but only half of the time. During the other half the individuals have minimal partial fitness. The other regime corresponds to a large number of allowed learning trials. In this case the individuals can fruitfully take advantage of the flexible alleles. By bearing only flexible alleles in the first $L_v$ sites of the genome, an individual can adapt successfully to both possible environments. The flexible maximum then becomes the absolute maximum of the fitness landscape. The two regimes can also be thought of as differing in the cost of learning, measured in terms of the fraction of the existence that has to be devoted to learning.∗ Within this interpretation, the two possible regimes can therefore be associated with ‘expensive’ or ‘cheap’ learning. In the first case either of the fixed maxima is the best evolutionary outcome, while in the latter case a flexible subsystem arises in response to the changing environment. In both regimes a transfer of information from the environment to the genotype takes place. The genotype segments into two portions of lengths $L_f$ and $L_v$, thus reflecting the nature of the environmental variation. The final population is either homogeneous, with all the individuals having ? in the first $L_v$ loci, or it splits into two groups, both bearing only rigid alleles, with either $L_v$ alleles +1 or $L_v$ alleles −1 in the first $L_v$ loci. It is important to stress that the landscapes in the neighbourhood of the fixed and the flexible maxima are very different.

∗ Note that within this model one assumes that reproduction takes place once the learning period has been accomplished.
While the latter peak remains rather isolated, there is a gradual road of increasing fitness towards either of the former. This is due to the presence of the ? alleles, which play a role similar to the one they play in the original model of H & N. These alleles provide a suitable ‘fitness road’ to both fixed maxima, which would otherwise remain completely isolated (the ‘needle in a haystack’ scenario).

3.2. The evolutionary process. In the case of a very large population we can use the approach introduced by Fontanari and Meir (1990) to obtain an analytic description of the evolutionary process. We assume that the genetic composition of the population corresponds to a distribution of maximum entropy (or minimal bias). This amounts to stating that at any generation g, the fraction of genotypes with $P_{v(f)}$ alleles +1, $Q_{v(f)}$ alleles ? and $R_{v(f)}$ alleles −1 in the first $L_v$ (last $L_f$) loci is

$$\Pi(g; P_v, Q_v, R_v, P_f, Q_f, R_f) = \frac{L_v!\,(p_v(g))^{P_v}(q_v(g))^{Q_v}(r_v(g))^{R_v}}{P_v!\,Q_v!\,R_v!} \cdot \frac{L_f!\,(p_f(g))^{P_f}(q_f(g))^{Q_f}(r_f(g))^{R_f}}{P_f!\,Q_f!\,R_f!}. \qquad (8)$$

In (8) we have introduced the probabilities $p_{v(f)}(g)$, $q_{v(f)}(g)$ and $r_{v(f)}(g)$ of the different kinds of alleles at generation g. In the limit of an infinite population, these can be approximated by the frequencies $P_{v(f)}/L_{v(f)}$, $Q_{v(f)}/L_{v(f)}$ and $R_{v(f)}/L_{v(f)}$ respectively. The evolution is described through six recursive equations which determine, at each generation, the values of these six probabilities in terms of those of the preceding one. In this process one has to bear in mind that there is a mutation rate $p_{mut}$ that modifies the values of P, Q and R. The recursive equations can be deduced in exactly the same way as in Fontanari and Meir (1990). For $p_v(g+1)$ we obtain

$$p_v(g + 1) = p_{mut} + \frac{1 - 3 p_{mut}}{\langle \varphi(g) \rangle} \sum \frac{P_v}{L_v}\,\Pi(g; P_v, Q_v, R_v, P_f, Q_f, R_f)\,\varphi(P_v, Q_v, R_v, P_f, Q_f, R_f), \qquad (9)$$
where the summations extend over all the possible values of the P, Q and R, and

$$\langle \varphi(g) \rangle = \sum \Pi(g; P_v, Q_v, R_v, P_f, Q_f, R_f)\,\varphi(P_v, Q_v, R_v, P_f, Q_f, R_f) \qquad (10)$$

is the genotype mean fitness of the population at generation g, averaged over the distribution of alleles. The remaining equations, for the probabilities $p_f(g+1)$, $q_v(g+1)$, $q_f(g+1)$, $r_v(g+1)$ and $r_f(g+1)$, have expressions similar to (9), with $P_f/L_f$, $Q_v/L_v$, $Q_f/L_f$, $R_v/L_v$ or $R_f/L_f$ in the place of $P_v/L_v$, respectively. These equations have been used to determine the curves in Fig. 2, corresponding to the two different regimes reached with an allowed number of learning trials above and below $T_c$. We do not detail the analytic calculations any further, as the results are easily obtained by iteration, starting from any initial composition of the population. As a summary, Table 1 gives a complete list of the quantities considered in the analytic approach, as well as in the numerical simulations described in the next section.

Table 1. Quantities considered in the analytic approach and in the numerical simulations.

  $p_v$: probability of occurrence of a +1 in any of the first $L_v$ loci
  $q_v$: probability of occurrence of a ? in any of the first $L_v$ loci
  $r_v$: probability of occurrence of a −1 in any of the first $L_v$ loci
  $p_f$: probability of occurrence of a +1 in any of the last $L_f$ loci
  $q_f$: probability of occurrence of a ? in any of the last $L_f$ loci
  $r_f$: probability of occurrence of a −1 in any of the last $L_f$ loci
  $\varphi$: fitness of a genotype averaged over the learning trials
  $\langle\varphi\rangle$: fitness averaged over the learning trials and the allele distribution
  $\langle\phi\rangle$: fitness averaged over the population

4. THE NUMERICAL TREATMENT OF THE GHN MODEL

The evolutionary process is simulated using a GA (Goldberg, 1989; Mitchell, 1996) which, at variance with the statistical approach discussed above, allows us to consider only finite populations of N individuals. Throughout this article we have used N = 10 000.
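For modest $L_v$ and $L_f$, the recursion (9)–(10) can be iterated directly over the multinomial distribution (8). The sketch below is our own illustrative implementation (all function names are ours), with the genotype mean fitness of equations (4)–(6) hard-coded:

```python
from math import comb

def genotype_phi(Pv, Qv, Rv, Pf, Qf, Rf, Lv, Lf, T):
    """Genotype mean fitness, equations (4)-(6); phi = 1 elsewhere."""
    L = Lv + Lf
    def learn(Q):                         # mean fitness of a random search
        p = 2.0 ** (-Q)                   # with Q flexible alleles
        return L - (L - 1) * (1.0 - (1.0 - p) ** T) / (p * T)
    if Rf > 0 or (Pv > 0 and Rv > 0):
        return 1.0
    if Qv == Lv:                          # equation (6): can learn both optima
        return learn(Lv + Qf)
    if Qv + Qf == 0:                      # equation (5): congenital match
        return (L + 1) / 2.0
    return 0.5 * (1.0 + learn(Qv + Qf))  # equation (4): learns one optimum only

def step(probs, Lv, Lf, T, p_mut):
    """One generation of the recursion, equation (9), for all six probabilities."""
    pv, qv, rv, pf, qf, rf = probs
    num, Z = [0.0] * 6, 0.0
    for Pv in range(Lv + 1):
        for Qv in range(Lv + 1 - Pv):
            Rv = Lv - Pv - Qv
            wv = (comb(Lv, Pv) * comb(Lv - Pv, Qv)
                  * pv ** Pv * qv ** Qv * rv ** Rv)
            for Pf in range(Lf + 1):
                for Qf in range(Lf + 1 - Pf):
                    Rf = Lf - Pf - Qf
                    w = wv * (comb(Lf, Pf) * comb(Lf - Pf, Qf)
                              * pf ** Pf * qf ** Qf * rf ** Rf)
                    w *= genotype_phi(Pv, Qv, Rv, Pf, Qf, Rf, Lv, Lf, T)
                    Z += w                # accumulates <phi(g)>, equation (10)
                    fracs = (Pv / Lv, Qv / Lv, Rv / Lv,
                             Pf / Lf, Qf / Lf, Rf / Lf)
                    for k in range(6):
                        num[k] += fracs[k] * w
    return [p_mut + (1.0 - 3.0 * p_mut) * n / Z for n in num]
```

Starting from a flexibility-rich composition (e.g., $q_v = q_f$ large) with $T > T_c$, iterating `step` should drive the population towards the flexible maximum, i.e., the upper branch discussed in Section 5.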
Each individual or genotype is represented by a string of L alleles −1, +1 or ?, encoding the corresponding synaptic connections as −1, +1 or flexible, respectively. The genomes in the initial population are generated at random, with probability q for each allele of being ? and 1 − q of being fixed. Among the latter, we select at random, with probability 1/2, one of the two possible environments, and the individual is attributed fixed alleles that match the selected environment (i.e., that encode the corresponding optimal synapses) with probability p. Since both environments are selected with equal probability, there is no bias in the initial composition of the population.

Figure 2. (a) $\langle\phi_{as}\rangle$ as a function of T for two different initial populations, obtained with $p_v = 0.225$, $q_v = 0.55$, $p_f = 0.38$, $q_f = 0.55$ (full circles) and $p_v = 0.045$, $q_v = 0.91$, $p_f = 0.91$, $q_f = 0.045$ (empty circles), respectively. The full curves correspond to the statistical formulation of Section 3. (b) Asymptotic probabilities of the different alleles (as defined in Table 1) plotted as a function of T.

At each generation g, the adaptive synapses of each individual are determined using the learning scheme described in Section 2, a process that allows one to evaluate the individual’s fitness through equation (2). The successive generations are determined through selection and mutation (Goldberg, 1989). First, all the members of the population are ranked by their fitness, and each individual leaves descendants identical to itself with a probability proportional to its fitness. Mutations are then introduced by changing each allele at random into one of the two other possibilities with probability $p_{mut}$. As usual, $p_{mut}$ has to be properly tuned.
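A single generation of this GA (fitness-proportional selection followed by allele-wise mutation) can be sketched as follows. The function name and the use of Python’s `random.choices` as a roulette wheel are our own choices, not the authors’ code:

```python
import random

def next_generation(population, fitness, p_mut, rng=random):
    """One GA step: fitness-proportional selection of N parents, then each
    allele of each offspring is mutated into one of the two other states
    (-1, +1, '?') with probability p_mut."""
    N = len(population)
    parents = rng.choices(population, weights=fitness, k=N)  # roulette wheel
    states = (-1, 1, '?')
    return [[rng.choice([s for s in states if s != a])
             if rng.random() < p_mut else a
             for a in genome]
            for genome in parents]
```

In the simulations of this paper, N = 10 000 and (as stated below) $p_{mut} = 0.005$; cross-over is deliberately omitted.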
It should introduce variations at a rate large enough to produce changes in the composition of the population, but also small enough to allow the fixation of successful mutations. In all the simulations we have used $p_{mut} = 0.005$. Among the rules of GAs it is usual to consider the operation of cross-over (Goldberg, 1989; Mitchell, 1996), which mimics sexual reproduction. Cross-over is an important source of variation and usually speeds up the evolutionary process. We will, however, not use it here, because it adds no relevant conceptual ingredient to the present model. We monitor the evolutionary process through the probabilities already introduced in the analytic approach of Section 3 and defined in Table 1. Probabilities are here estimated through averages of the numbers of the different kinds of alleles over the finite population obtained in the numerical simulation. We also calculate $\langle\phi(g)\rangle$, the fitness of the individuals in generation g averaged over the population. This quantity has a large dispersion in the first stages of the evolution, but is expected to converge to the value $\langle\varphi(g)\rangle$ given by equation (10) when the population reaches its optimum: either a population in which the majority of individuals have a genotype close to the adaptive maximum, or the mixed population of fixed alleles. The results of the simulations are discussed in the next section.

5. RESULTS

5.1. The effect of the learning time T. In order to analyse the effects of a change in the number of learning trials, we compare the evolution of different populations in which the individuals are allowed to search during increasing learning times T. This is equivalent to comparing populations with the same learning time G but evolving in environments that change with decreasing frequency F. We restrict ourselves to environments characterized by $L_v = 5$ and $L_f = 16$ [see equation (1)]. Considering other values of $L_v$ and $L_f$ does not introduce any qualitative changes in the results.
Only the time scales of the different regimes described here are modified. We first consider the statistical formulation of the model, described in Section 3. The evolution is obtained by running the recursive equations (9) for as many generations as necessary to reach a population with a stationary composition, and we then calculate the average fitness of the whole population ⟨ϕ_∞⟩. The control parameters of these simulations are the probabilities of fixed and flexible alleles in the composition of the initial population. For T < T_c ≃ 51 the system converges to a population in which the value of ⟨ϕ_∞⟩ grows very slowly as a function of T. But for T > 51 we find that two different behaviours are possible. If the initial probability of flexible alleles is small, the system converges to the same kind of population as for T < 51. On the other hand, if the initial probability of flexible alleles is large enough, the population stabilizes at a value of ⟨ϕ_∞⟩ that grows much faster with T. As the populations in the statistical model are infinite, all the systems are expected to converge to populations with the largest value of ⟨ϕ_∞⟩, depicted in the upper branch. The fact that some systems converge to a lower value of fitness is an artifact of the calculation of the recursions, equation (9): it is the lack of numerical precision that causes certain systems to get 'stuck' in the lower branch. These two behaviours correspond to the two different regimes discussed in Section 3. The values of ⟨ϕ_∞⟩ as a function of T are displayed as continuous lines in Fig. 2(a). The value T_c ≃ 51 agrees with the solution of equation (7) for the values of L_v and L_f considered here. We next compare these results with the numerical simulation of the evolutionary process as described in Section 4. We let the population evolve during 400 generations.
The (approximate) asymptotic value ⟨φ_as⟩ of the fitness of each population, determined by averaging the fitness of all the individuals over the last 100 generations, is represented by symbols in the same Fig. 2(a), as a function of T. The different symbols correspond to two different choices of p and q in the initial population. The numerical results are thus seen to agree with the probabilistic estimates to a good degree of accuracy: both approaches yield the same two asymptotic regimes. The structure of the genotypes in the population is given in Fig. 2(b), where we plot the probabilities listed in Table 1 (estimated by the fraction of the corresponding alleles in the simulated populations, also averaged over the last 100 generations), as a function of T. The value of r_f has not been represented because it is vanishingly small for all T. The value of p_f, instead, remains always close to unity. This is so because in all the considered environments the last L_f synaptic weights are +1, and any individual having even a single −1 allele in this part of the genome is severely penalized. In the limit of small T only individuals with alleles encoding for fixed synaptic strengths close to either w⃗_1 or w⃗_2 have an appreciable fitness. Bearing too many ? is a disadvantage because the individuals do not have enough time for learning. The population therefore tends to be composed of individuals as similar as possible to either of the two fixed maxima, with no flexible synapses. The corresponding values of p_v and r_v for T < T_c are not well defined, because the evolutionary process splits the population into two arbitrary fractions, each one similar to one of the two fixed optima (p_v = 0, r_v = 1, or p_v = 1, r_v = 0). We discuss this point in greater detail in the next section. The structure of the population for large g changes drastically at T ≃ T_c.
The populations corresponding to the upper branch are composed of individuals with a mixture of flexible and fixed alleles. The former are located in the first L_v loci, while the latter are all +1 and occupy the last L_f loci. Correspondingly, the value of q_v undergoes a drastic variation at T ≃ T_c, from q_v ≃ 0.1 to q_v ≃ 1, while p_v and r_v drop to zero. This corresponds to the transition from the fixed to the flexible optimum.

Figure 3. GHN model in the regime of high learning cost. Evolution of the mean fitness and allele probabilities as a function of the number of generations g for a population with T = 30 and initial conditions specified by p = 0.5, q = 0.25 and r = 0.25.

The small discrepancy between ⟨φ_as⟩ and its analytical counterpart ⟨ϕ_∞⟩ is due to the presence, in the simulations of the finite-size population, of a small fraction of ? in this part of the genomes (q_f ≃ 0.1) that have not yet been eliminated. Simulations with different values of p and q defining the initial population scatter the points between the two branches in different ways. However, the existence of the two distinct regimes is robust against such changes in the initial conditions. When the initial population has a large fraction of flexible alleles [as, for example, for p_v = 0.045, q_v = 0.91, depicted in Fig. 2(a)], a clear change can be observed at T = T_c from one evolutionary branch to the other. For initial populations with fewer flexible alleles the change is not so drastic. In the latter case, and close to the critical value T_c, the two evolutionary branches are seen to coexist. This is an effect of the finiteness of both the size of the population and the number of generations considered in the numerical simulations. The points in the lower branch have to be considered as truly metastable populations.
Except for genetic drift effects, and given an infinite number of generations, all the results that appear in the lower branch are expected to move to the upper one, which corresponds to the absolute maximum of the fitness landscape. Indeed, the fact that the lower branch fades away as T grows beyond T_c is an indication that a larger fraction of numerical simulations converge to the flexible optimum.

5.2. The rigid optimum. In this section we discuss the evolutionary process in the regime of short learning times T. This corresponds to a situation in which individuals face a difficult learning task, because they have few learning trials to adapt to the changing environment. We consider as an example the case of L_v = 5, L_f = 16 and a number of learning trials T = 30 < T_c ≃ 51. In Fig. 3 we show the values of the different variables (listed in Table 1) obtained in the numerical simulation of the evolutionary process, as a function of the number of generations g. The average fitness of the population is seen to grow rapidly, approaching an asymptotic value of ⟨φ⟩ ≃ 9. Within the first ∼25 generations the −1 alleles are eliminated from the last L_f loci of the genotype, because their presence gives rise to a minimal fitness. The ? alleles are eliminated from the same part somewhat more gradually. This happens because an individual with fewer flexible alleles needs less time to learn and has therefore a higher fitness. The rise in the average fitness also corresponds to the elimination of individuals with a mixture of fixed alleles in the first L_v loci of the genotype. In fact, any individual bearing a combination of +1 and −1 in these sites also has minimal fitness, because it cannot adapt to either of the two possible environments. The elimination of ? follows for the same reasons that hold for the last L_f sites of the genotype.
Within the settings considered in this example, the individuals that survive after ∼200 generations have the maximum possible partial fitness, but only for half of the time. When the environment changes, these individuals fail completely to adapt to the new situation. Each time the environment changes, all the individuals have to engage in a learning process. Since having too many ? requires a long search time, which implies a low fitness, the ? become a burden. Thus, the flexible alleles facilitate the evolution towards a fixed optimum but are progressively eliminated. This effect is stronger for lower values of T (or larger frequency F): the fewer the allowed learning trials, the less probable it is to find the optimal connection scheme. For T < T_c the flexible alleles play the same role as in the H & N model (Hinton and Nowlan, 1987; Maynard Smith, 1987): they guide the evolution towards a population in which the individuals have imprinted in their genotypes what previous generations had to learn, but they finally lose their learning ability (Dopazo et al., 2001). The initial population is chosen in such a way that its individuals have the same probability of resembling either reference string. This symmetry is broken during the evolutionary process by random mutations. As a consequence, after the first ∼75 generations the population is essentially split into two different subpopulations, having either only +1 alleles or only −1 alleles in the first L_v loci of the genome. This follows (see Fig. 3) from the fact that p_v and r_v remain approximately linked to each other, verifying p_v + r_v = const ≃ 0.90 ≃ 1 − q_v. Individuals with the same number of fixed alleles have the same fitness, provided that these are either all +1 or all −1 in the first L_v loci, and only +1 in the last L_f. Both subpopulations can therefore coexist in a stable situation. In the example shown in Fig.
3, the subsequent ∼200 generations of the evolutionary process only give rise to a partial replacement of one subpopulation at the expense of the other, as a result of the minute balance between the different distributions of the best-fit individuals within each subpopulation. This process takes place with no appreciable change in the total average fitness. The symmetry imposed on the initial population, which is broken during the evolution, is of course restored if the evolutionary process is averaged over an ensemble of equivalent initial conditions.

Figure 4. GHN model in the regime of long learning time. Evolution of the mean fitness and gene frequencies. Plot of p_v, q_v, r_v, p_f, q_f, r_f and ⟨φ⟩ as a function of the number of generations g for a population with T = 112 and initial conditions specified by p = 0.5, q = 0.25 and r = 0.25.

5.3. The flexible optimum. In this section we discuss the evolutionary process in the regime of a long learning time or, equivalently, an easy learning task. We consider as an example the case of L_v = 5, L_f = 16 and a number of learning trials T = 112 > T_c ≃ 51.†† In Fig. 4 we show the results of the numerical simulation of the evolutionary process, with the same conventions as in Fig. 3. In the first place one notes that the average fitness of the population grows by steps, remaining for an appreciable number of generations in the lowest 'plateau'. This is reached after a few generations (g ∼ 20) and is similar to that of Fig. 3. The second plateau, attained around generation g ∼ 230, implies a further evolutionary change in the composition of the population. The first drastic increase of the fitness occurs at the same time that r_f drops to 0 and the population is arbitrarily split into two fractions that can adapt to either environment, as described in the previous section.
Individuals with any mixture of +1 and −1 alleles in the first L_v and the last L_f loci of the genotype are therefore eliminated. At the same time the number of flexible alleles in the last L_f loci shrinks (but does not vanish), in a process similar to the one discussed in the previous section. The second plateau is reached when q_v → 1, corresponding to changes that take place in the first L_v loci and that go one step further than what has been discussed so far. The drastic increase in q_v is associated with an equally drastic drop of p_v and r_v. The population thus departs from both rigid maxima. The process taking place in the first L_v loci is the opposite of the one occurring in the last L_f sites: fixed alleles are eliminated at the expense of flexible ones.

†† The choice of the value T = 112 is a matter of practical convenience, to better display the main features of the evolutionary process.

Figure 5. Fraction of the population corresponding to each possible value of the fitness (solid squares), for two different generations. For the subpopulations associated with each fitness bin we also show the allele probabilities (listed in Table 1), restricted to the corresponding subpopulation.

The surviving individuals then tend to have only ? in the first L_v loci. Such adaptable individuals rapidly take over, owing to the much higher fitness that stems from their ability to fit both possible configurations of the environment. The abrupt change in the structure of the population is shown in Figs 5 and 6, where we display the fraction of individuals in the population, distributed according to their fitness. Within the subpopulation associated with each fitness bin, we also show the probabilities listed in Table 1. The first plateau corresponds to a population that is split into two fractions, each one capable of adapting to one of the two possible environments but not to the other.
This is similar to what has already been discussed in the previous section. In the example considered in our simulations, this takes place between generations g ≃ 20 and g = 225. Since both fractions have similar fitnesses, the distribution of the population according to the fitness is concentrated in a single peak located at ⟨φ⟩ ≃ 11 (see Fig. 5). This persists until g ∼ 227, the generation at which the first 'superfit genotype', with Q_v = L_v, appears. Since the learning time T is large, this individual has a high probability of adapting to both environments. Such individuals turn out to be highly fit and have therefore a high probability of leaving descendants; they end up taking over the whole population. The precise moment at which this transition takes place depends upon the occurrence of a random event, and therefore the length of the first fitness plateau may vary significantly from one numerical experiment to another. It is also expected to increase, on average, with the 'amplitude' L_v of the environmental change.

Figure 6. Fraction of the population corresponding to each possible value of the fitness (solid squares), for two different generations. For the subpopulations associated with each fitness bin we also show the allele probabilities (listed in Table 1), restricted to the corresponding subpopulation.

In the simulations, after ∼270 generations the population is composed of four clearly separated subpopulations (see Fig. 6). The main, highly fit group is in the neighbourhood of the flexible optimum. Two other, rather smaller groups both have the same intermediate fitness of φ ≃ 11 and are well adapted to one of the two possible environments. The last group has a minimal fitness of 1. This pattern is easy to understand.
All the minor peaks (at φ = 1 and 11) correspond to subpopulations whose genotypes are only one mutation away from the flexible optimum, and are kept in equilibrium with it by the competing processes of random mutation and selection.

5.4. The population fitness landscape. We have shown two typical examples of the evolutionary process within the GHN model, leading respectively to the flexible and rigid optima. Both situations can be put into a single comprehensive picture by introducing a population fitness landscape. In order to do so we use the same simplification explained for Fig. 1. However, instead of using as free parameters the numbers of alleles Q_v and Q_f, we use the fractions of such alleles in the population, denoted q_v and q_f. The same considerations concerning symmetries remain valid in this case. In Fig. 7 we show examples of the population fitness landscapes and the evolutionary paths for learning times T above and below the critical value T_c. These are evaluated by setting a grid for values 0 ≤ q_v ≤ 1 and 0 ≤ q_f ≤ 1. For each node of the grid, a random population of 10 000 individuals is generated with a composition determined only by the corresponding probabilities of each kind of allele. Equation (2) is used to obtain the fitness of each individual, which is then used to calculate the corresponding average fitness. In the figure we plot the contours of equal average fitness. Note that in this case, at variance with Fig. 1 where the number of alleles can only take integer values, the surface is smooth, because any point in the (q_v, q_f) plane corresponds to a realizable population with a given average fitness.

Figure 7. Population fitness landscapes of the GHN model, showing the levels of equal fitness, projected onto the (q_v, q_f) plane. Left panel: T < T_c; right panel: T > T_c.
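The construction of the population fitness landscape described above can be sketched as follows; the `fitness` argument is a stand-in for equation (2), which is not reproduced here, and all parameter values are placeholders:

```python
import random

def average_fitness_landscape(fitness, L_v, L_f, q_grid, n_pop=10_000):
    """For each node (q_v, q_f) of the grid, draw a random population whose
    composition is fixed only by the probabilities of each kind of allele,
    and average the individual fitnesses over the population."""
    surface = {}
    for q_v in q_grid:
        for q_f in q_grid:
            total = 0.0
            for _ in range(n_pop):
                # first L_v loci: '?' with prob q_v, otherwise a random fixed allele
                head = ["?" if random.random() < q_v else random.choice([-1, 1])
                        for _ in range(L_v)]
                # last L_f loci: '?' with prob q_f, otherwise a random fixed allele
                tail = ["?" if random.random() < q_f else random.choice([-1, 1])
                        for _ in range(L_f)]
                total += fitness(head + tail)
            surface[(q_v, q_f)] = total / n_pop
    return surface
```

The contour plot of Fig. 7 is then obtained by plotting the levels of equal average fitness over this grid.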
The evolutionary paths discussed in the previous sections are depicted as lines on top of the contour plots of the fitness landscapes. The insets show a magnification of the evolutionary path in the neighbourhood of the fixed optimum. With the conventions of this plot, the maximum located at (q_v, q_f) = (1, 0) corresponds to a population where all individuals are equal to the flexible optimum. The fixed maximum with p_v = 1 corresponds to (q_v, q_f) = (0, 0). The effect of increasing the learning time T can easily be recognized. For T > T_c there is only a single absolute optimum of the fitness function, which corresponds to the flexible maximum. For T < T_c only one of the two degenerate local optima having no flexible alleles can be shown in this plot. Two examples of evolutionary paths, obtained with the same initial conditions as in the preceding sections, are shown as paths on top of the fitness landscape.‡‡ The way in which the population approaches the fixed optima is clearly depicted in these examples. In order to appreciate the speed of the evolutionary process we draw one symbol for each generation along the evolutionary path. In the initial stages the evolutionary path is nearly orthogonal to the contour lines. Flat regions of the landscapes, which correspond to a mild selection pressure, are associated with the plateaux displayed in Figs 3 and 4. A detail of the evolutionary paths in the neighbourhood of the fixed maximum (q_v, q_f) = (0, 0) is presented as an inset in both panels. In these regions of very low selection pressure, selection and mutation produce a wandering of the evolutionary path that is similar to a random walk.

‡‡ Although these paths adjust to the principal features of the landscape, they are not expected to fit precisely onto that surface.
The average fitness obtained in the numerical simulations shows the influence of random mutations, which cause a change in the composition of the population and cannot be accounted for in the average fitness landscape. In the left panel, corresponding to T < T_c, the population remains confined to a small neighbourhood of the fixed optimum. In the right panel, where T > T_c, the beginning of the evolutionary path is entirely similar. However, after some time wandering near the fixed maxima, the system finds an exit path towards the flexible optimum. This is the signature that a 'superfit genotype' has appeared in the population. The rapid take-over by these highly fit individuals gives rise to a path that seems to jump from the fixed maximum to the flexible one, bridging a local depression in the fitness surface. This happens because the path does not lie exactly on the average surface (see footnote). In fact, the average fitness of the population along the evolutionary path is a nondecreasing function.

6. THE GENERALIZED PERCEPTRON MODEL (GP)

In this section we present an extension of the GHN model, along the same lines as the one presented by Dopazo et al. (2001). In this framework, we consider that the neural network of each individual is a perceptron (Minsky and Papert, 1969) of L synaptic weights w⃗ ≡ (w_1, w_2, …, w_L). We restrict our model by considering only 'Ising perceptrons', in which w_k = ±1. This assumption is consistent with those of the GHN model described before. The inputs of the network are assumed to convey data from the environment. These inputs are represented by vectors x⃗ = (x_1, …, x_L). We also assume that the environment provides the 'classification' of the input patterns into one of two possible groups, labelled by y ∈ {+1, −1}.
Such classification is assumed to be given by two reference perceptrons, which represent the two possible states of the environment, whose synaptic connections are given by equation (1). To be specific, we assume that the class of each pattern depends on the weighted sum of the input vector as follows:

y_n = sign(w⃗_n · x⃗),   (11)

where n is an integer that denotes the environment's state. In turn, the class given to input x⃗ by the individual's perceptron, of weights w⃗, is

y = sign(w⃗ · x⃗).   (12)

At variance with the GHN model, the one considered here brings in epistatic effects that are not exclusively due to learning. This is so because it includes the processing made by the network, considered as an additional phenotypic trait. Such processing means that even if a perceptron is not identical to the optimum, it may nevertheless succeed in classifying properly some fraction of the patterns [see Dopazo et al. (2001) for an extensive discussion of this point]. With reference to equation (12), y may differ from y_n, the 'correct' class associated with each environment. The fitness of each individual depends upon its ability to properly classify new input patterns, which are different from the examples that have been used during the learning stage. Although within the GHN model a stable, flexible subsystem is seen to appear, this process may involve an unreasonably large number of generations, because the final approach to the flexible optimum can only take place as a consequence of the random appearance of a 'superfit genotype'. As we shall see, this process can be greatly accelerated by the introduction of the additional epistatic effects implied in the more complex phenotypic structure that appears in the GP model. As in the GHN model, we assume that the alleles encode for fixed or flexible synaptic weights. The latter can be adapted during ontogeny through a learning protocol.
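The classification rule of equations (11) and (12) is simply the sign of a dot product; a minimal sketch (the tie-breaking convention sign(0) = +1 is our assumption, since a zero sum cannot occur for odd L with ±1 weights and inputs):

```python
def classify(w, x):
    """Class assigned to input x by a perceptron of weights w, eqs (11)-(12):
    the sign of the weighted sum w . x."""
    s = sum(wk * xk for wk, xk in zip(w, x))
    return 1 if s >= 0 else -1
```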
However, instead of directly searching at random for an optimal value of the adaptive weights, as in the H & N model, we consider a more realistic learning scenario in which these weights are determined by learning from examples. This may be gauged by the generalization probability p_g, which is the probability that an input pattern be correctly classified by the (trained) perceptron.∗∗ It is well known from the literature on neural networks (Hertz et al., 1991; Dopazo et al., 2001) that p_g is a function of the normalized overlap between the individual perceptron weight vector w⃗ and that of the reference perceptron associated with the environment state, w⃗_n:

p_g(w⃗, w⃗_n) = 1 − (1/π) arccos[(w⃗ · w⃗_n)/(|w⃗| |w⃗_n|)].   (13)

In general, the overlaps in equation (13) depend upon the number of examples used for learning, and also upon the learning algorithm. In the best case, the perceptron is identical to that associated with the environment, the normalized overlap is 1, and the generalization probability is p_g = 1. A different situation arises when the weight vector w⃗ of the individual is orthogonal to that of the reference. In that case p_g = 0.5, meaning that the classification is correct only by chance. The worst case arises when the normalized overlap is −1; this extreme case corresponds to an individual that systematically misclassifies the input patterns. By way of illustration, consider the case in which the perceptron classifies sensory input concerning, say, a prey into edible or not. Each component of any L-dimensional input pattern refers to the quality of some particular attribute of the prey (smell, colour, size, taste, etc.). The prey may change some of its attributes, e.g., colour or size, in different 'seasons' of the year, while others remain unaltered. In each season there is an optimal classification of prey into edible or not, and this is provided by the current reference perceptron. This changes for the following season.
In order to be consistent with the conventions used for the GHN model, we define the individual fitness as the sum of the partial classification performances of the individual over all the environment states. We choose the same normalization as for the GHN model. We thus write

ϕ = (1/2F) Σ_{i=1}^{2F} [1 + (L − 1) p_g(w⃗, w⃗_i)].   (14)

∗∗ The use of p_g to define the fitness implies a statistical average over many sets of input patterns. In this sense a fitness defined using p_g should truly be considered an average genotype fitness.

In the following, we restrict ourselves to the case of only one environmental change (F = 1), and two limiting learning protocols.† In the one that we call random learning, the flexible synaptic connections are chosen at random and are left unchanged during the whole 'season'. In the other, which we call perfect learning, all the adaptive synapses are optimally assigned to either w_k = 1 or w_k = −1. These two limiting cases provide extreme bounds within which the results of any other learning protocol fall. If we consider the case of individuals with no flexible alleles, having P_v (P_f) alleles +1 and R_v (R_f) alleles −1 in the first L_v (last L_f) loci of the genome respectively, the scalar products entering equation (13) are

w⃗ · w⃗_1/(|w⃗| |w⃗_1|) = [(P_v − R_v) + (P_f − R_f)]/L,
w⃗ · w⃗_2/(|w⃗| |w⃗_2|) = [(−P_v + R_v) + (P_f − R_f)]/L.   (15)

Since we have assumed that the neural networks are Ising perceptrons, it always holds that |w⃗| = √L. For perceptrons with flexible alleles, the values of P_v(f) and R_v(f) that enter the scalar products [see equation (13)] depend upon how the adaptation protocol assigns the Q_v(f) flexible weights. In the case of 'random learning', on average half of them are assigned to +1 and the other half to −1. Therefore the individual fitness is

ϕ_rnd = 1 + (L − 1) {1 − (1/2π) [arccos((P_v − R_v + P_f − R_f)/L) + arccos((−P_v + R_v + P_f − R_f)/L)]}.   (16)

† Although we do not consider any particular learning scheme, we mention here how it may proceed. Together with each change of the environment one may assume that an individual undergoes a session of 'batch' learning [see e.g., Hertz et al. (1991)]. This may be thought of as tasting several potential prey to determine which are acceptable. During this process only the flexible synapses are adapted. With the present conventions each component of the M training inputs, x_k^μ (k = 1, …, L; μ = 1, …, M), can be selected at random. Their 'correct' classifications are provided by the environment's reference perceptron through equation (11). The adjustable weights of the individual are changed so as to minimize the number of mistakes. Next, each individual is asked to classify new, randomly generated input patterns until the 'season' is over. This procedure is repeated each time the environment changes. Thus, each 'season' is partly devoted to training and partly to classifying new inputs. There is a compromise here, because a longer learning time, and therefore a better classification performance, entails a shorter time devoted to testing new examples or, equivalently, to finding edible food.

In 'perfect learning', when the current environment is w⃗_1 all the synapses encoded by the alleles ? are set to w_k = 1, while if the current environment is w⃗_2 the adaptive weights in the first L_v loci are assigned to −1 and those in the last L_f loci to 1. The corresponding fitness therefore is

ϕ_prf = 1 + (L − 1) {1 − (1/2π) [arccos((P_v + Q_v − R_v + P_f + Q_f − R_f)/L) + arccos((−P_v + Q_v + R_v + P_f + Q_f − R_f)/L)]}.   (17)

As an example of the effects of the perceptron processing we may note that even in the unfavourable circumstance in which P_f = Q_f = Q_v = 0 and P_v = R_v, i.e., when the last L_f loci of the genome carry only −1 and there is an even mixture of −1 and +1 in the first L_v loci, the values of ϕ_rnd and ϕ_prf are larger than 1.
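Equations (16) and (17) can be evaluated directly; a minimal sketch (function names are ours):

```python
import math

def phi_rnd(P_v, R_v, P_f, R_f, L):
    """Random-learning fitness, eq. (16): the randomly assigned flexible
    weights average out and do not contribute to the overlaps."""
    o1 = (P_v - R_v + P_f - R_f) / L
    o2 = (-P_v + R_v + P_f - R_f) / L
    return 1 + (L - 1) * (1 - (math.acos(o1) + math.acos(o2)) / (2 * math.pi))

def phi_prf(P_v, Q_v, R_v, P_f, Q_f, R_f, L):
    """Perfect-learning fitness, eq. (17): every '?' is optimally assigned
    in each environment, so the Q terms always add to the overlaps."""
    o1 = (P_v + Q_v - R_v + P_f + Q_f - R_f) / L
    o2 = (-P_v + Q_v + R_v + P_f + Q_f - R_f) / L
    return 1 + (L - 1) * (1 - (math.acos(o1) + math.acos(o2)) / (2 * math.pi))
```

With L_v = 5, L_f = 16 (L = 21), the flexible optimum (Q_v = L_v, P_f = L_f) under perfect learning has both overlaps equal to 1, and hence the maximal fitness ϕ_prf = L.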
It is practical to consider the reduced subspace in which R_v = R_f = 0. Making use of the fact that P_v + Q_v = L_v and P_f + Q_f = L − L_v, the two above expressions for the fitness can be written in terms of the variables Q_f and Q_v, namely

ϕ_rnd = 1 + (L − 1) {1 − (1/2π) [arccos((L − Q_v − Q_f)/L) + arccos((L_f − L_v + Q_v − Q_f)/L)]},   (18)

ϕ_prf = 1 + (L − 1) [1 − (1/2π) arccos((L − 2L_v + 2Q_v)/L)].   (19)

In Fig. 8 we illustrate the landscapes associated with ϕ_prf and ϕ_rnd, following the same conventions as in Fig. 1, i.e., restricting the plot to the subspace in which R_f = R_v = 0. The absolute maximum of ϕ_rnd is attained within this subspace when Q_f = Q_v = 0, corresponding to the fixed maximum with P_v = L_v and P_f = L_f. This has its symmetric counterpart outside the subspace, for R_v = L_v and P_f = L_f. On the other hand, the maximum of ϕ_prf is found in the set of points Q_v = L_v, ∀Q_f. This is consistent with the fact that equation (19) is independent of Q_f. All the points of this set correspond to genotypes in which all the first L_v loci are occupied by flexible alleles, and therefore to individuals with a flexible subsystem tuned to the changes of the environment. A comparison of both fitnesses shows that the flexible optimum corresponds to the absolute maximum of ϕ_prf, while it is in addition a local minimum of ϕ_rnd. The GP model therefore confirms the results of the GHN model, in the sense that the emergence of an adaptive subsystem critically depends upon the effectiveness of the adaptive protocol that prevails during ontogeny. If learning is only partially effective, a point can be reached at which it may be worthwhile, from an evolutionary point of view, to give up the possibility of following the changes of the environment and exploit instead the possibility of being fully adapted to one of its possible configurations.

Figure 8. GP model. Mean fitness landscapes using the two extreme learning protocols.
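A quick numerical check of equations (18) and (19) confirms the locations of the maxima quoted above: ϕ_prf is maximal on the line Q_v = L_v regardless of Q_f, while ϕ_rnd peaks at Q_v = Q_f = 0 (a sketch; function names are ours):

```python
import math

def phi_prf_reduced(Q_v, L_v, L):
    """Perfect-learning fitness in the subspace R_v = R_f = 0, eq. (19);
    note that it is independent of Q_f."""
    return 1 + (L - 1) * (1 - math.acos((L - 2 * L_v + 2 * Q_v) / L) / (2 * math.pi))

def phi_rnd_reduced(Q_v, Q_f, L_v, L):
    """Random-learning fitness in the same subspace, eq. (18)."""
    L_f = L - L_v
    o1 = (L - Q_v - Q_f) / L
    o2 = (L_f - L_v + Q_v - Q_f) / L
    return 1 + (L - 1) * (1 - (math.acos(o1) + math.acos(o2)) / (2 * math.pi))
```

For the values L_v = L_f = 10 used in Fig. 8, ϕ_prf reaches its maximum value L at Q_v = L_v, where the remaining arccos argument equals 1.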
The plot is made with the same conventions as in Fig. 1. Panel (a) corresponds to perfect learning and panel (b) to random learning. Note the different scales on the vertical axes of the two panels. The calculation has been performed for L_v = L_f = 10, in order to better display the slopes in panel (b). Stagnation is revealed by the absence of any slope along the Q_f direction in panel (a).

The fact that the maximum of ϕ_prf is attained in a set of points and not in a single point can be understood as follows. For perfect learning, all the ? in the last L_f loci of the genome are always optimally assigned. It is therefore just as effective to have an allele 1 as a ?, and there is consequently no selection pressure to replace the latter by the former. This situation gives rise to a stagnation of the Baldwinian transcription of environmental data into the genotype, the same as found and discussed by Dopazo et al. (2001). It is worth stressing that such stagnation is instead absent for the process of replacing fixed alleles by flexible ones in the first L_v loci of the genotype, i.e., for the process of evolving a flexible subsystem. We therefore find within the GP model the same two families of evolutionary pathways as in the GHN model. For a difficult learning task the evolutionary path breaks the initial symmetry with respect to both environmental configurations, thus leading to a mixed population in which there are individuals that resemble either possible environment. For an easy learning task the evolutionary path preserves the original symmetry, leading instead to a population that resembles the flexible optimum and therefore bears no bias with respect to the possible environments. There is, however, an important difference in the nature of the evolutionary paths of the GP model as compared with the GHN model. The landscapes for ϕ_prf and ϕ_rnd are always smooth, as a consequence of the epistatic effects brought in by the processing of the perceptron.
The evolutionary paths are therefore expected to always lead gradually to the corresponding optima, either fixed or flexible, without the abrupt changes displayed in Figs 3 or 4. We have found that the evolutionary process in the GHN has two distinct steps. The first is governed by the Baldwin effect, which acts primarily upon the last L_f alleles. The second consists of an almost random search for the rather isolated flexible optimum. This latter situation should be considered an artifact of the GHN model. When richer epistatic effects are brought in, as in the GP model, a smoother fitness landscape is found in which a situation such as the search for a 'needle in a haystack' may never happen. The evolutionary process then involves a gradual and cumulative allelic substitution, without the need of relying on the random occurrence of a 'superfit genotype'.

7. CONCLUSIONS

We have presented two working models to discuss the emergence of adaptive subsystems and their relationship with the Baldwin effect in a changing environment. One is a generalization of the well known framework developed by Hinton and Nowlan (1987) and the second is an extension of the perceptron model discussed by Dopazo et al. (2001), in which the individuals are endowed with a simple neural network having a limited processing ability. We considered a changing environment that is represented by two reference states that exchange roles a fixed number of times during the 'life' of the individuals. To fix ideas, some features of a potentially edible prey may change with the season of the year, while the others remain constant. Individuals are therefore faced with the problem of learning which are the edible prey each time that the environment changes. We encode the environment in two strings of information bits that represent the weights of two hypothetical neural networks.
Each one is assumed to discriminate perfectly which prey is edible when it is given the corresponding attributes as inputs to the network. We identify the 'amplitude' of the changes with the Hamming distance between the reference strings, while the 'frequency' of such variation is identified with the number of times that both strings exchange roles during the life of the individual. The model can easily be extended to more complex situations in which there are several reference strings. However, the case considered here has all the necessary ingredients to capture the essential consequences of adaptation in a changing environment. The genotype of each individual involves fixed and flexible alleles. These express themselves, respectively, as fixed and flexible synaptic connections of a hypothetical neural network. The presence of flexible alleles in the genome enables the individual to attempt a searching (learning) process that tries to match its synapses to the current environment. This search has to be repeated every time the environment changes. A greater fitness is associated with a shorter searching process, and a minimal one is attributed to an individual that is unable to match the current environment in a prescribed number of learning trials (Hinton and Nowlan, 1987). The situation is such that an individual that has mostly correct fixed alleles has great success in matching one of the two possible environmental configurations but performs very badly when this changes. Such an individual can thus be thought of as succeeding in identifying its prey during one season of the year but starving during the next. On the other hand, an individual with flexible alleles that is able to match the changing features of the environment through learning is in the position of adapting well in both circumstances, but always has to spend a fraction of its life learning (searching) how to match the current environment.
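A minimal encoding of this setup can be sketched as follows (the names `E1`, `E2`, `hamming` and `environment` are illustrative, not the paper's notation): two reference strings of ±1 weights that differ on the first L_v loci, whose Hamming distance gives the amplitude of the change, and that exchange roles periodically during an individual's life.

```python
import random

rng = random.Random(0)

L, L_v = 20, 10                         # genome length and changing part
E1 = [rng.choice([-1, 1]) for _ in range(L)]
E2 = list(E1)
for i in range(L_v):                    # flip the first L_v loci, so that the
    E2[i] = -E2[i]                      # amplitude = Hamming distance = L_v

def hamming(a, b):
    """Number of loci at which two weight strings disagree."""
    return sum(x != y for x, y in zip(a, b))

def environment(t, period):
    """E1 and E2 exchange roles every `period` steps of an individual's life;
    1/period plays the role of the frequency of environmental change."""
    return E1 if (t // period) % 2 == 0 else E2

print(hamming(E1, E2))                  # → 10, the amplitude of the change
```

With several reference strings, `environment` would simply cycle through a longer list, which is the extension mentioned above.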
A statistical estimation (Fontanari and Meir, 1990) can be made of the fitness associated with this learning process, which is related to the 'amplitude' and the 'rate of change' of the environment. When the changes occur too fast, when they have a large amplitude, or, equivalently, when the number of learning trials is too small, the (reproductive) cost of learning is high. On the other hand, when the frequency of environmental change or its amplitude is low, or when a large number of learning trials are allowed, the learning cost is low. The initial population is chosen with minimum bias, i.e., having the same resemblance to either of the reference strings. In spite of this, the symmetry is not preserved during evolution, due to random mutations. The two regimes mentioned above are associated, respectively, with evolutionary pathways that lead to populations in which the above symmetry is broken or recovered. In the case of a high learning cost, the best evolutionary strategy is to give up flexible alleles in favour of those that match either of the two possible environmental reference configurations. The resulting individuals have a very high fitness only during half of the time. The final population has lost the initial symmetry with respect to both environmental strings. In the low learning cost regime, upon evolution the individual transcribes into its genetic information the fact that some of the features of the environment remain unchanged while the rest are constantly changing. The latter features are matched by a string of flexible alleles. The resulting population recovers the initial symmetry because its individuals bear no bias in favour of either environment. When the process of allelic substitution is not able to cope with a high rate of environmental change, natural selection preserves genetic plasticity in such a way that the possibility of learning is tuned to those features of the environment that are not constant.
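The dependence of the learning cost on the trial budget can be illustrated with a guessing search in the spirit of Hinton and Nowlan's construction, where fitness grows with the number of trials left after the first perfect match. The function name and parameters below are ours; the fitness form 1 + (L − 1)(G − t)/G follows Hinton and Nowlan (1987).

```python
import random

def lifetime_fitness(genotype, target, G, rng):
    """Guess the flexible ('?') loci at random for up to G trials; return
    1 + (L-1)*(trials left)/G on the first perfect match, else the minimal
    fitness 1. A wrong fixed allele can never match, whatever the guesses."""
    L = len(target)
    if any(g != '?' and g != t for g, t in zip(genotype, target)):
        return 1.0
    for trial in range(G):
        guess = [g if g != '?' else rng.choice([-1, 1]) for g in genotype]
        if guess == list(target):
            return 1 + (L - 1) * (G - trial) / G
    return 1.0

rng = random.Random(1)
target = [1] * 10
print(lifetime_fitness([1] * 10, target, 100, rng))        # → 10.0 (all fixed, correct)
print(lifetime_fitness([1] * 9 + [-1], target, 100, rng))  # → 1.0 (one wrong fixed allele)
```

A genotype with n flexible loci matches with probability 2^(−n) per trial, so the expected number of wasted trials (the learning cost) grows exponentially with the amplitude of the part that must be learned, which is why fast or large environmental changes make learning expensive.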
In either case the information that there is a changing and a fixed part in the environment is properly transcribed into the genotypes of the population. In the high-cost regime the balance between the 'exploration' of the different alternatives through learning and the 'exploitation' of a 'hard-wired' (or 'closed') solution is decided in favour of the latter (Frank, 1996). The resulting population avoids the risk of misadaptation due to a poor learning performance in both environmental states and prefers an inherited perfect adaptation to one of these states. On the other hand, when learning has a low cost the evolutionary strategy is to end in a population of individuals with a stable adaptive subsystem tuned to the changing environment. The solution is then to 'explore' with an 'open' system geared to adapt to all possible environmental conditions, which is more efficient at all times. In this case the emergence of a stable adaptive subsystem is enhanced by selection.

We also presented an alternative model, the GP model, with the aim of introducing different epistatic effects, such as those that can be attributed to a complex phenotype. Within this framework it is possible to analyse two limiting cases in which the fitness function can be formulated analytically. In one, which we call 'random learning', the flexible alleles are randomly assigned to either 1 or −1 each time that the environment changes. This mimics the case of a very difficult (costly) learning process. The other extreme situation can be considered to represent instead a vanishing learning cost. In this case, which we call 'perfect learning', the flexible alleles are always optimally tuned. The fitness landscape for 'random learning' has its largest value in coincidence with the fixed maxima. The evolutionary process within the GP model for 'random learning' involves the symmetry-breaking pathways that we have mentioned above.
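The two limiting protocols can be written down directly as a development rule mapping a genotype over {+1, −1, ?} to a synaptic pattern. This is a sketch with our own naming; in the paper the two protocols enter only through the analytic fitness expressions (18) and (19).

```python
import random

def develop(genotype, environment, protocol, rng=None):
    """Turn a genotype over {+1, -1, '?'} into a synaptic pattern."""
    if protocol == 'perfect':
        # vanishing learning cost: every flexible allele is optimally tuned
        return [e if g == '?' else g for g, e in zip(genotype, environment)]
    if protocol == 'random':
        # very costly learning: flexible alleles are drawn at random
        # each time the environment changes
        return [rng.choice([-1, 1]) if g == '?' else g for g in genotype]
    raise ValueError(protocol)

env = [1, -1, 1, -1]
geno = ['?', '?', 1, -1]
print(develop(geno, env, 'perfect'))   # → [1, -1, 1, -1], matches the environment
```

Under 'perfect' the phenotype always matches the current environment on the flexible loci, whatever the environment is, which is why the flexible optimum dominates; under 'random' the flexible loci contribute nothing on average, so only the fixed alleles matter.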
The fitness landscape for 'perfect learning' contains instead a single absolute maximum that corresponds to the emergence of an adaptive subsystem. The corresponding evolutionary pathway therefore preserves the original symmetry. The decision between the two alternatives depends only upon the reproductive cost that has to be incurred by the individuals that engage in a learning process, i.e., on the effectiveness of the learning protocol. However, an important difference is found with respect to the GHN model. The flexible optimum never appears isolated, thanks to the epistatic effects involved in the GP model. The approach to it is therefore expected to take place through gradual, cumulative allelic substitution and not by the random appearance of a 'superfit genotype' as in the GHN model. Within 'perfect learning' we find in addition a stagnation similar to that of the Baldwin transcription process as discussed by Dopazo et al. (2001). In both the GHN and the GP models, with difficult learning tasks an accelerated transcription of environmental features into genetic information—the classical interpretation of the Baldwin effect—takes place in two ways. On the one hand, the flexible alleles that were originally allocated to the last L_f loci of the genome are replaced by 1, thus encoding the (constant) environmental features. On the other hand, individuals with a mixture of 1 and −1 in the first L_v loci are eliminated. These processes are accelerated at the expense of the flexible alleles, which progressively disappear. If these were not present, the evolutionary process would take place through an inefficient random walk in sequence space. Within the GHN model the process leading to the emergence of a flexible subsystem follows a different pattern. During the first stage of the evolution the population tends to resemble either of the fixed reference connection patterns.
Once this situation is reached the selection pressure becomes very low, and random mutations produce essentially no improvement in the average fitness of the population. This process may continue for a significant time (in fact, it scales exponentially with the 'amplitude' L_v of the environmental changes) until a 'superfit genotype' is found by an accidental mutation. This triggers a second evolutionary stage in which the population is rapidly driven towards the flexible optimal genotype. The nature of the evolutionary process within the GP model is different. The epistatic effects that are brought in by the processing of the perceptron give rise to a smooth fitness landscape that has no isolated maxima. The evolutionary process therefore always entails a gradual and cumulative allelic substitution. The exponential search time is thereby dramatically reduced. It is amusing that there is a close analogy between the occurrence of the two regimes of high and low learning difficulty, as pictured through the behaviour of the population in different fitness landscapes, and physical systems that undergo a first-order phase transition. In the physical systems, the equivalent of the fitness is (the negative of) the free energy, a quantity that is minimal when the system is at equilibrium at a finite temperature. The latter introduces noise, playing a role that is in some senses similar to that of random mutations. The microscopic state of the system is therefore a random variable, just as is the composition of the population evolving through the GA. When the free energy presents two or more minima, the system may get trapped in one of higher free energy, from which it can only escape through the random modifications of the microscopic variables allowed by the temperature. In physics, such a sudden change is called a first-order phase transition.
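The analogy can be made concrete with a toy Metropolis simulation. Everything here is illustrative and not part of the paper's models: a one-dimensional double-well 'free energy' stands in for the two fitness optima, temperature for mutation noise, and all numerical values are arbitrary.

```python
import math
import random

def F(x):
    """Toy 'free energy': an asymmetric double well with a metastable
    minimum near x = -1 and a deeper, global minimum near x = +1."""
    return (x**2 - 1)**2 - 0.3 * x

def metropolis(x, T, steps, rng):
    """Metropolis dynamics at temperature T. Returns the final position and
    whether the walker ever escaped the metastable well (crossed x = 0)."""
    crossed = False
    for _ in range(steps):
        y = x + rng.uniform(-0.2, 0.2)
        # accept downhill moves always, uphill moves with Boltzmann probability
        if F(y) <= F(x) or rng.random() < math.exp((F(x) - F(y)) / T):
            x = y
        crossed = crossed or x > 0.0
    return x, crossed

rng = random.Random(7)
x, crossed = metropolis(-1.0, T=0.5, steps=500_000, rng=rng)
print(crossed)   # thermal noise eventually carries the walker over the barrier
```

At low temperature the escape time from the metastable well grows exponentially with the barrier height, mirroring the exponentially long wait for the 'superfit genotype' in the GHN model.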
This process bears strong similarities to the way in which the flexible optimum is approached in the GHN model: the final stage of evolution is essentially driven by the occurrence of favourable mutations. To summarize, we find a twofold situation. For a high learning difficulty the traditional Baldwin effect is in full operation. The population ends up resembling either of the two reference strings, giving up genetic flexibility. For a low learning difficulty, a flexible optimum is always attained. In the GHN model the final stages of evolution are governed by a random search for a rather isolated optimal configuration pattern. This may cast doubt upon the robustness of such an evolutionary outcome. However, the emergence of a flexible subsystem does not necessarily depend on the random emergence of a 'superfit genotype'. The GP model helps one to understand how the epistatic effects introduced by the (phenotypic) processing ability of the perceptron ease the emergence of a flexible subsystem, giving rise to a gradual approach to the same optimal genotype through cumulative allelic substitutions. In this case natural selection, acting upon the genetic basis of behavioural traits, originates a fine-tuned adaptive subsystem able to cope with the uncertainties of a changing environment.

ACKNOWLEDGEMENTS

MBG and RP acknowledge support from the ZIF (Bielefeld, Germany), where part of this work was performed, in the frame of the Research Group 'The Sciences of Complexity: From Mathematics to Technology to a Sustainable World'. HD, SR-G, MBG and RP acknowledge economic support from the EU research contract ARG/B7-3011/94/97. HD and RP hold a UBA research contract UBACYT PS021, 1998/2000/2001. MBG is a member of the CNRS.

REFERENCES

Ackley, D. and M. Littman (1992). Interactions between learning and evolution, in Artificial Life II, C. G. Langton, C. Taylor, J. Farmer and S. Rasmussen (Eds), Redwood City, CA, USA: Addison-Wesley.
Ancel, L. W. (1999). A quantitative model of the Simpson–Baldwin effect. J. Theor. Biol. 196, 197–209.
Baldwin, J. M. (1896). A new factor in evolution. Am. Nat. 30, 441–451.
Dopazo, H., M. Gordon, R. P. J. Perazzo and S. Risau-Gusman (2001). A model for the interaction of learning and evolution. Bull. Math. Biol. 63, 117–134.
Edelman, G. M. (1987). Neural Darwinism. The Theory of Neuronal Group Selection, Oxford: Oxford University Press.
Fontanari, J. F. and R. Meir (1990). The effect of learning on the evolution of asexual populations. Complex Syst. 4, 401–414.
Frank, S. A. (1996). The design of natural and artificial adaptive systems, in Adaptation, M. R. Rose and G. V. Lauder (Eds), New York: Academic Press.
French, R. and A. Messinger (1994). Genes, phenes and the Baldwin effect: learning and evolution in a simulated population, in Artificial Life IV, R. Brooks and P. Maes (Eds), Cambridge, MA: MIT Press.
Goldberg, D. (1989). Genetic Algorithms in Search, Optimization and Machine Learning, Redwood City, CA, USA: Addison-Wesley.
Hertz, J., A. Krogh and R. G. Palmer (1991). Introduction to the Theory of Neural Computation, Redwood City, CA, USA: Addison-Wesley.
Hinton, G. E. and S. J. Nowlan (1987). How learning can guide evolution. Complex Syst. 1, 495–502.
Maynard Smith, J. (1987). When learning guides evolution. Nature 329, 761–762.
Minsky, M. and S. Papert (1969). Perceptrons, Cambridge, MA: MIT Press.
Mitchell, M. (1996). An Introduction to Genetic Algorithms, Cambridge, MA: MIT Press.

Received 13 February 2002 and accepted 1 August 2002