Evolution of Adaptive Behaviour in a Simulated Single-Celled Organism

Paul J. Kennedy, Thomas R. Osborn
School of Computing Sciences, University of Technology, Sydney
PO Box 123, Broadway NSW 2007, Australia
[email protected]

Abstract

A model of a single-celled organism comprising two interrelated parts (a genome and metabolism) is presented. The genome encodes operons that specify enzymes for the metabolism. The artificial metabolism regulates the genome and constructs proteins (among other processes). The structure of operons in our model is governed by a parallel genomic language. Protein construction is accomplished with an abstraction of mRNA and ribosomal machinery called "spiders". Adaptive behaviour occurs at two different levels within the model: in enzyme-catalysed reactions and by the regulation of genes. Adaptive behaviour is passed among individuals of a population via the evolution of genomes and the Lamarckian evolution of initial metabolic conditions. Results are given for evolving cells for two simple environments and the adaptive behaviour of four cells is examined.

1. Introduction

In this paper we present a model of a single-celled organism. Our motivation for development of this detailed evolutionary model is to build a tool to explore the use of biological ideas in simulations of adaptive behaviour and in the broader field of evolutionary computation.

Our cell model may be divided into two closely interrelated parts: a genome and an artificial metabolism. The genome specifies enzymes to control reactions in the metabolism, whilst the metabolism regulates the genome and constructs the specified proteins (among other processes). Like a real cell, the phenotype in our model (i.e. the metabolism) is time-varying and the genome exerts control over the cell for its entire lifetime. This permits the cell model to exhibit adaptive behaviour in relation to some environment. An individual cell is constructed from a genome and an initial set of chemicals.
From these chemicals and genome, an arbitrarily complex metabolism results.

Presented at SAB2000 - Paris, France - September 11-16, 2000

Cell models have been devised by other workers for use as a testbed. See, for example, (Fleischer and Barr, 1994). Our model differs from their model both in the kinds of simulation undertaken (Fleischer and Barr are more interested in multicellular development whereas we examine single cells) and the kind of genome (we use a biologically plausible model whereas theirs is a simpler ad hoc differential equation). The closest previous model to ours is probably that of (Rosenberg, 1967) although the constraints imposed by the computers of the day limited the detail of the model. Another similar approach is taken by (Jacobi, 1995).

Heuristics for adaptive behaviour are encoded into a genome and initial chemical ensemble, both of which are inherited from parents. This allows us to breed cells to live in particular environments. We present results of cell simulations evolved to adapt to two simple environments and examine the adaptive behaviour found in four case studies of individual cells.

2. Overview of Cell Model

Two levels of adaptive behaviour are designed into our cell simulations. The first level relates to regulation of genes on the genome. Genes may be switched on and off throughout the lifetime of a cell by varying the types and concentrations of chemicals in the artificial metabolism. Some of these chemicals may have diffused in from the environment, thus allowing direct adaptation to environmental stimuli. The other method of adaptive behaviour is swifter and occurs in the metabolism itself. Enzyme-catalysed chemical reactions cause the cell to modify its behaviour based on environmental (and internal) cues.

As stated above, there are two distinct sections of our cell model: a genome and a simulated cellular metabolism. The genome encodes genes that specify enzymes for use in the metabolism.
Following the operon model of regulation in prokaryote cells (Alberts et al., 1994), we group genes into blocks (called operons) that are regulated in the same way. Operons start with a section called the promoter region that contains information used to regulate the operon. Following the promoter region are one or more genes each encoding a protein.

The cellular metabolism encompasses five processes: enzyme-mediated chemical reactions; protein production (by gene expression and regulation); protein degradation; cell growth; and diffusion of chemicals through a semipermeable cell membrane. These processes are modelled in the hope that they will form a canonical set enabling the cell to be used as a testbed in a variety of environments and experiments. Processes in the cellular metabolism are represented with chemical reactions and the kinetics of these reactions are encoded in a large system of coupled nonlinear ordinary differential equations.

The genome and initial metabolic conditions evolve in different ways. The genome evolves control structures (i.e. genes) in a Darwinian fashion using a genetic algorithm (GA) (Holland, 1992). Evolution of the initial chemical ensemble, however, occurs only through the maternal cell line. As well, it is Lamarckian, as changes to chemicals in a cell that occurred throughout its lifetime may be passed to offspring. The initial chemicals for a cell are the final set of chemicals from its mother cell. A cell, then, is the result of the coevolution of a genome and initial chemical ensemble (Kennedy and Osborn, 1999). This Lamarckian strategy allows adaptations to be passed along a cell line and permits the "problem" posed by an environment to be decomposed into smaller problems solved along a germ line.

Cells are simulated in an environment. For the experiments in this paper we chose to use very simple environments.

3. The Genome

The genome used by our cells is an extension of the simple fixed length bit string commonly used in GAs.
Although a simple bit string is sufficient, we found that search proceeded more quickly with the use of a more complicated genome: one modelled on the double-stranded (but haploid) DNA molecule (Kennedy, 1998).

The genome is a sentence from a parallel genomic language. This language permits operons to be encoded on the genome as a string of tokens. Sixteen kinds of tokens are encoded in blocks of four bits (nibbles). Our language, however, uses only fourteen distinct tokens: ten digits in the range $[0, 9]$ and four control codes: <start operon> ($1010_2$, $1011_2$), <start enzyme> ($1100_2$, $1101_2$), <start carrier> ($1110_2$) and <end operon> ($1111_2$). However, the <start carrier> token is not used in experiments in this paper.

Following the <start operon> token is information used to regulate the operon. This is the promoter region. Next is a list of genes with each gene beginning with a <start enzyme> token. Bases in $[0, 9]$ following <start enzyme> represent the monomers required (and the order) to produce the protein. Figure 1 shows the layout and encoding of an operon using our parallel genomic language.

[Figure 1: Structure of operon and sample encoding. An operon consists of a promoter followed by gene 1 ... gene i ... gene n, and is followed by a non-coding region. An example with bases: <start operon> Switch Data <start enzyme> 5 3 2 <start enzyme> 0 1 7 <end operon> (or <start operon>).]

The promoter region describes how the operon may be regulated and provides a template for the shape of chemical species able to regulate the operon. Three classes of regulation in operons are modelled: constitutive (where the operon is always active and may be expressed at any time); repressible (where the operon is active, unless a "blocking" chemical is bound to the promoter region); and inducible (where the operon is inactive except when an "activator" chemical is bound to the promoter). Chemicals may diffuse in from the environment and regulate inducible or repressible operons.

Operon sentences are read from the genome and parsed into operons.
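The token encoding and operon layout described above can be illustrated with a short sketch. It is not the authors' parser: the names `token_of`, `tokenize`, `Operon` and `parse_operons` are hypothetical, the promoter is kept as an uninterpreted digit string (the switch encoding is not given here), digits after an unused <start carrier> token are simply skipped, and an operon left unterminated at the end of the genome is dropped. All of these are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

START_OPERON = "<start operon>"
START_ENZYME = "<start enzyme>"
START_CARRIER = "<start carrier>"
END_OPERON = "<end operon>"


def token_of(nibble: int) -> str:
    """Map a 4-bit value to one of the fourteen tokens of the language."""
    if nibble <= 9:
        return str(nibble)          # ten digit tokens 0..9
    if nibble in (0b1010, 0b1011):
        return START_OPERON
    if nibble in (0b1100, 0b1101):
        return START_ENZYME
    if nibble == 0b1110:
        return START_CARRIER
    return END_OPERON               # 0b1111


def tokenize(bits: str) -> List[str]:
    """Split a bit string into nibbles; trailing bits that do not fill a nibble are ignored."""
    usable = len(bits) - len(bits) % 4
    return [token_of(int(bits[i:i + 4], 2)) for i in range(0, usable, 4)]


@dataclass
class Operon:
    promoter: str = ""                              # raw promoter digits (assumption)
    genes: List[str] = field(default_factory=list)  # one monomer string per enzyme


def parse_operons(tokens: List[str]) -> List[Operon]:
    """Group tokens into operons; tokens outside any operon are non-coding."""
    operons: List[Operon] = []
    current: Optional[Operon] = None
    mode = "promoter"
    for tok in tokens:
        if tok == START_OPERON:
            if current is not None:     # a new <start operon> also closes an operon
                operons.append(current)
            current, mode = Operon(), "promoter"
        elif current is None:
            continue                    # non-coding region
        elif tok == END_OPERON:
            operons.append(current)
            current = None
        elif tok == START_ENZYME:
            current.genes.append("")
            mode = "gene"
        elif tok == START_CARRIER:
            mode = "skip"               # carriers are unused in this paper
        elif mode == "gene":
            current.genes[-1] += tok
        elif mode == "promoter":
            current.promoter += tok
    return operons


# The sample encoding of Figure 1 with a made-up promoter "066".
example_bits = "".join([
    "1010", "0000", "0110", "0110",   # <start operon> 0 6 6
    "1100", "0101", "0011", "0010",   # <start enzyme> 5 3 2
    "1101", "0000", "0001", "0111",   # <start enzyme> 0 1 7
    "1111",                           # <end operon>
])
example_operons = parse_operons(tokenize(example_bits))
```

Here `example_operons` holds a single operon with promoter digits "066" and two genes, "532" and "017". Keeping tokenizing and parsing separate means the same parser could be reused if the bit-level encoding changed.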
From this, a sequence of chemical reactions is determined to build the proteins.

4. The Environment

Experiments in this paper are confined to a simple environment that exposes a single cell to a fixed effect and that allows no reactions to occur outside the cell. Additionally, the environment is assumed to be very large compared to the cell. This means that the concentration of chemicals in the environment is constant. Consequently, no differential equations are required to model the environment. Although the environment is constant it still has an effect on the cell due to diffusion of chemicals through the semipermeable membrane.

5. Metabolic Processes

5.1 Enzyme-Catalysed Reactions

Chemicals, in our model, are an abstraction of polymers based on (Farmer et al., 1986) and (Bagley and Farmer, 1991). However, complementary matching is used throughout our simulation. The shape of a polymer appears as a string of digits (monomers) such as 4138. Each chemical species has a concentration that is defined as the ratio of the number of molecules of the chemical to the number of molecules of water (in the cell).

Chemicals in the artificial cell are modified via enzyme-catalysed reactions. This metabolic process is the most direct way that a cell responds to environmental stimuli. Chemicals that diffuse into the cell may alter the way that the cell acts by changing which reactions occur. Our model of these reactions and subsequent expression in differential equations follows that of (Bagley and Farmer, 1991).

Each enzyme-catalysed reaction models the equilibrium of joining and breaking polymers. For example, the equation

$$\mathrm{o123} + \mathrm{o456} + \mathrm{e345} \rightleftharpoons \mathrm{o123456} + \mathrm{e345} + H \tag{1}$$

describes an equilibrium joining polymers with shape "123" and "456" into a longer polymer with shape "123456" under catalytic pressure of an enzyme with shape "345" (or alternately splitting the longer molecule into two shorter molecules). One molecule of water ($H$) is released.
Note that the initial part of the enzyme matches the final part of the first substrate and that the final part of the enzyme matches the start of the second substrate. The closer this match, the faster the reaction proceeds. The notation onnn refers to an ordinary chemical (i.e. not an enzyme) with shape nnn and the notation ennn denotes an enzyme with shape nnn.

Each possible reaction leads to two differential equations for each chemical species in the cell: one governing the concentration of the chemical species ($x_i$ for species $i$) and another ($\bar{x}_i$) governing the sum of bound enzyme/substrate or enzyme/product complexes containing the chemical species. This latter variable and equation are required to solve the "saturation problem" as per (Farmer et al., 1986). For detailed description of the differential equations see (Kennedy, 1998).

5.2 Protein Production and Degradation

Protein production in real cells is a complex multistage process. In this model, we are interested mainly in the simple notions (i) that genes specify enzymes; (ii) that our artificial cell provides machinery to build the enzymes; and (iii) that the cell can regulate the rate of production of the enzymes (as the direct or indirect result of environmental stimuli). Consequently, we combine the processes of transcription and translation into a single process of expression. This operation is carried out with a new entity called a "spider" which may be viewed as a combination of mRNA and ribosomal machinery. Spiders walk along the genome reading each base and appending the matching monomer to a growing protein chain. In our simulation, spiders are chemicals with a shape similar to a particular string (arbitrarily 12312).

Operon expression is modelled as a chain of irreversible chemical reactions with one reaction for each step along the genome.

$$S + M \rightarrow S' + P \tag{2}$$

Here a spider molecule ($S$) is bound to a monomer ($M$, i.e. matching the base being read).
The substrate spider molecule ($S$) may be either an unbound spider (when this reaction models the first base of the operon) or a spider bound to the genome (when the reaction models a spider partially along the operon). A modified spider molecule ($S'$) and perhaps a protein ($P$, i.e. an enzyme) result. An enzyme is only produced when the reaction models the step from one gene to another or is the last of the operon. The product spider molecule will be either an unbound spider (if the reaction is the last of the operon) or a spider bound to the next position along the genome.

Differential equations follow readily from the chain of chemical reactions. A rate constant of 1 is used for all reactions except the first of an operon. That reaction requires special treatment because it is where operon regulation is taken into account. For this first reaction of an operon, the rate "constant" used is $G_i K_T$ where $G_i$ is the activation of operon $i$ (a value in $[0, 1]$) and $K_T$ is the transcription rate constant for the given spider. The closer the shape of the spider is to the "ideal" spider, the higher the value of $K_T$ and the faster the spider can initiate expression.

The activation of an operon ($G_i$) depends on the kind of switch in the operon. For constitutive operons $G_i$ is set to 1. Inducible operons are active only when one of $n$ competing species of "switch" chemical is bound to the promoter region. This value is $\hat{\rho}_n$. Repressible operons, on the other hand, are active only when a switch chemical is not bound to the promoter region. So $G_i$ has value $1 - \hat{\rho}_n$. Of course, different chemicals will switch each operon.

The probability of one of $n$ competing switch chemicals being bound to a given promoter region ($\hat{\rho}_n$) is derived in (Kennedy, 1998) and has value

$$\hat{\rho}_n = \frac{\rho_1 + (1 - \rho_1)\, I(n)}{1 + (1 - \rho_1)\, I(n)} \tag{3}$$

$$I(n) = \sum_{i=2}^{n} \frac{\rho_i}{1 - \rho_i} \tag{4}$$

where $\rho_i$ is the probability that a molecule of a particular chemical species $i$ is bound to the promoter region. This is given by

$$\rho_i = 1 - e^{-K_i s_i} \tag{5}$$

where $s_i$ is the concentration of species $i$ and

$$K_i = \frac{K}{(1 - q_i)\, e^{-n_i b}}. \tag{6}$$

$q_i$ is the probability that switching chemical $i$ will not immediately bind to the promoter region. That is, the probability of a bounce. The exponential part of the denominator of $K_i$ specifies how long the chemical $i$ will bind to the promoter region. This time derives from the Boltzmann distribution and depends on the average number of bonds between the chemical and the promoter region ($n_i$) and the relative strength of each bond ($b$, typically 0.25). $K$ is a constant used to calibrate the concentration of a switching chemical with the probability that the chemical will be bound to the promoter region. The actual value used is arbitrary but its general relationship with the other parameters is important. We typically use $1.0 \times 10^{3}$. There is a variable $\hat{\rho}_n$ for each operon and its derivative is added to the system of differential equations.

This set of biological pathways will produce proteins but the molecules will accumulate until they poison the cell: there is no way to break proteins down. Therefore we add a simple model of protein degradation. For example, the breakdown of the enzyme e343 into its constituent monomers is modelled with the following reaction:

$$\mathrm{e343} \rightarrow 2\,\mathrm{o3} + \mathrm{o4} \tag{7}$$

5.3 Modelling the Cell Membrane

Each cell is represented as a cell membrane filled with as many water molecules as possible to form a plump sphere. No organelles are modelled. Cell membrane molecules do not appear explicitly as chemical species. Instead, there is a variable in the system of differential equations that directly represents the number of cell membrane molecules.

[Figure 2: Cell membrane sphere packing scheme. Each sphere represents a single cell membrane molecule. Labels: radius r, tetrahedron height h.]

As a first approximation to the semipermeable bilipid membrane of a real cell, we model the cell membrane with two layers of spheres packed together as shown in figure 2.
Each sphere represents a single cell membrane molecule. The number of membrane molecules covering the cell may be expressed as

$$N_M = 0.74\, \frac{V_W}{V_1} \approx \frac{3 V_W}{4 V_1} \tag{8}$$

where $N_M$ is the number of membrane molecules associated with the cell, 0.74 is the packing density of spheres (Kittel, 1971), $V_W$ is the volume of the cell membrane and $V_1$ is the volume of one cell membrane molecule. The approximation of 0.75 for 0.74 implies packing of slightly squishy spheres. Reorganising, we get

$$V_W = \frac{4 V_1 N_M}{3} \tag{9}$$

Another way to approximate the volume of the cell membrane is to multiply the surface area of the cell by the width of the membrane. This ignores the curving of the membrane.

$$V_W = 4 \pi R^2 w \tag{10}$$

where $R$ is the radius of the cell and $w$ is the width of the cell membrane. Equating equations (9) and (10), using simple trigonometry to determine the width of the cell membrane from the packing scheme (i.e. $2r + h$) and substituting $V_1 = \frac{4}{3} \pi r^3$, we get

$$R^2 = \frac{2 r^2 N_M}{9 \left( 1 + \sqrt{2/3} \right)} \tag{11}$$

where $r$ is the radius of a cell membrane molecule. From equation (11) and the formula for the volume of a sphere we may determine the volume of the cell as a function of the number of cell membrane molecules.

$$V_C = \frac{4}{3} \pi R^3 = \frac{4}{3} \pi \left( \frac{2 r^2 N_M}{9 \left( 1 + \sqrt{2/3} \right)} \right)^{3/2} \tag{12}$$

The surface area of the cell, $A_C$, may be determined in a similar way. Given $V_C$, the volume of the cell, we can determine the number of water molecules contained in the cell as

$$N_W = \frac{N_A}{A_W}\, 10^{6}\, \frac{4}{3} \pi \left( \frac{2 r^2 N_M}{9 \left( 1 + \sqrt{2/3} \right)} \right)^{3/2} \tag{13}$$

where $N_A$ is Avogadro's number and $A_W$ is the atomic mass of (one molecule of) water. Derivation of this equation is in (Kennedy, 1998). The size of the cell, the number of water molecules it contains and hence concentrations of chemicals in the cell are a function of the number of cell membrane molecules associated with the cell.

5.4 Cell Growth

As cell size is expressed in terms of the number of cell membrane molecules, growth or shrinkage of the cell occurs when this number of molecules changes.
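The dependence of cell size on the number of membrane molecules (equations (11) to (13) of Section 5.3) can be made concrete with a small sketch. The function names are hypothetical, and the units are an assumption not stated in the text: $r$ is taken in metres, $A_W$ as the molar mass of water in g/mol, and the factor $10^6$ as the density of water in g/m^3.

```python
import math

AVOGADRO = 6.02214076e23        # N_A, molecules per mole
WATER_MOLAR_MASS = 18.015       # A_W in g/mol (assumed unit)
WATER_DENSITY = 1.0e6           # the 10^6 factor, read here as g/m^3 (assumption)

# Width of the two-layer membrane is 2r + h, with h the height of a regular
# tetrahedron of edge 2r, giving w = 2r * (1 + sqrt(2/3)).
PACKING_TERM = 1.0 + math.sqrt(2.0 / 3.0)


def cell_radius(n_m: float, r: float) -> float:
    """Equation (11): cell radius R from the membrane molecule count N_M."""
    return math.sqrt(2.0 * r * r * n_m / (9.0 * PACKING_TERM))


def cell_volume(n_m: float, r: float) -> float:
    """Equation (12): V_C = (4/3) * pi * R^3."""
    return 4.0 / 3.0 * math.pi * cell_radius(n_m, r) ** 3


def cell_surface_area(n_m: float, r: float) -> float:
    """Surface area A_C of a sphere of radius R."""
    return 4.0 * math.pi * cell_radius(n_m, r) ** 2


def water_molecules(n_m: float, r: float) -> float:
    """Equation (13): number of water molecules filling the cell."""
    return AVOGADRO / WATER_MOLAR_MASS * WATER_DENSITY * cell_volume(n_m, r)
```

Because $V_C$ scales as $N_M^{3/2}$ while $A_C$ scales as $N_M$, quadrupling the number of membrane molecules multiplies the volume (and hence $N_W$) by eight but the surface area only by four.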
As an initial approximation to the complex process of membrane formation, we introduce two families of chemical species that may change the number of cell membrane molecules associated with the cell. One family ("builders") increases the amount of membrane and the other family ("breakers") reduces the membrane. The actual processes of building and destroying membrane are not directly modelled. A simple matching algorithm tests whether each of the chemical species in the cell (enzymes and ordinary chemicals) is a member of the family of builders or breakers (Kennedy, 1998). We typically use an (arbitrary) builder shape of 8441 and breaker shape of 0307.

The differential equation governing the number of cell membrane molecules is

$$\frac{dN_M}{dt} = N_W \left( k_1 \sum_{a \in A} x_a - k_2 \sum_{b \in B} x_b \right) \tag{14}$$

where $k_1$ is the rate at which building of the cell membrane occurs and $k_2$ is the rate at which reduction of the cell membrane happens. Typically, we use the (arbitrary) values $k_1 = k_2 = 1.0 \times 10^{-6}$. $A$ is the set of chemical species that are members of the cell membrane building family and $B$ is the set of species that may act as breakers. A species may act as both a builder and a breaker if it has an appropriate shape (for example 03078441). $x_a$ is the concentration of the $a$th chemical species.

The reader may note that equation (13) shows that $N_W$ is a function of the number of cell membrane molecules in the cell. However, here, we use it as a constant with the last computed value (the value at the last time step). This is valid because $N_W$ changes very slowly. Multiplication by $N_W$ converts the builder and breaker concentrations to raw numbers of molecules.

5.5 Communication with the Environment

Communication with the environment is the way environmental stimuli come into the cell. After chemicals diffuse into the cell they become part of the chemical ensemble and may take part in reactions, regulate or express operons or change the size of the cell.
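The membrane growth rate of equation (14) in Section 5.4 can be sketched as a single function. This is an illustration only: the name `membrane_growth_rate` is hypothetical, the builder and breaker sets are passed in rather than derived (the matching algorithm is described in (Kennedy, 1998) and is not reproduced here), and the concentrations and water count in the example are made-up values.

```python
from typing import Dict, Iterable


def membrane_growth_rate(
    n_w: float,
    conc: Dict[str, float],
    builders: Iterable[str],
    breakers: Iterable[str],
    k1: float = 1.0e-6,
    k2: float = 1.0e-6,
) -> float:
    """Equation (14): dN_M/dt = N_W * (k1 * sum of builders - k2 * sum of breakers).

    n_w is held at its last computed value, as the text describes.
    A species listed in both sets contributes to both sums.
    """
    build = sum(conc[s] for s in builders)
    destroy = sum(conc[s] for s in breakers)
    return n_w * (k1 * build - k2 * destroy)


# Made-up state: more builder (o44) than breaker (o30), so the membrane grows.
example_conc = {"o44": 5.75e-7, "o30": 5.0e-7}
example_rate = membrane_growth_rate(1.0e10, example_conc, ["o44"], ["o30"])
```

With $k_1 = k_2$, a species that belongs to both families adds the same amount to each sum and so has no net effect on the membrane.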
Communication between the cell and its environment involves diffusion of chemicals across the semipermeable membrane. Not all chemicals may diffuse through the membrane: proteins and partially built proteins are assumed to be too large.

Diffusion of chemicals across the semipermeable membrane is modelled by an additional term subtracted from the differential equation of each ordinary chemical ($x_i$ above). The diffusion term is not applied to $\bar{x}_i$, the variable for the sum of bound complexes containing species $i$, because the bound complexes involve an enzyme molecule which is assumed too large to pass through the membrane.

Three factors underlie our model of background diffusion: the rate of diffusion is (i) proportional to the concentration gradient of the permeant; (ii) proportional to the surface area of the membrane ($A_C$); and (iii) inversely proportional to the size of the permeant molecule. This last factor is estimated by the cube of the length of the polymer undergoing diffusion. Differential equations, then, are modified as follows:

$$\frac{dx_i}{dt} = \ldots - \frac{K A_C}{l_i^3} \left( x_{i,\mathrm{IN}} - x_{i,\mathrm{OUT}} \right) \tag{15}$$

where $l_i$ is the number of monomers comprising the $i$th chemical, $K$ is a parameter used to weight the diffusion (usually $2.0 \times 10^{4}$) and $x_{i,\mathrm{IN}}$ and $x_{i,\mathrm{OUT}}$ are the concentrations of the $i$th ordinary chemical inside and outside the cell respectively.

6. Simulating a Cell

The first step in simulating the actions of a single cell is to build the phenotype. First, the genome is parsed into a list of operons. One of the two parents is randomly designated as the mother cell. The initial chemical species are then read from this mother cell. A reaction graph consisting of metabolic and protein production and degradation reactions is then determined by matching the operon parse list to the chemical species. Next, the set of variables is determined and terms in the differential equations are calculated from the reaction graph. Diffusion terms are also added to the differential equations.
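A minimal sketch of the diffusion term of equation (15) follows. The function names are hypothetical, and the convention that a species name is a one-letter kind ("o" for ordinary chemicals, "e" for enzymes) followed by its monomer digits is an assumption taken from the notation of Section 5.1; partially built proteins are not represented.

```python
def diffusion_term(x_in: float, x_out: float, length: int, area: float,
                   k: float = 2.0e4) -> float:
    """Equation (15): contribution of diffusion to dx_i/dt.

    Negative when the chemical is more concentrated inside (net outflow),
    positive when it is more concentrated outside (net inflow).
    """
    return -k * area / length ** 3 * (x_in - x_out)


def species_diffusion(name: str, x_in: float, x_out: float, area: float,
                      k: float = 2.0e4) -> float:
    """Apply the term to ordinary chemicals only; enzymes are too large to diffuse."""
    if not name.startswith("o"):
        return 0.0
    length = len(name) - 1      # number of monomers after the kind prefix
    return diffusion_term(x_in, x_out, length, area, k)
```

The cubic dependence on length means a polymer of four monomers diffuses eight times more slowly than one of two monomers under the same gradient, which is why long products tend to stay inside the cell.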
The variables are now initialised, in most cases, to the final chemical concentrations in the mother cell. Finally, the simulation starts and a numerical integration algorithm (Runge-Kutta with adaptive step size) finds values of the variables at time steps. When chemicals appear with concentration greater than that of one molecule, new reactions become possible and the system of differential equations is updated (Farmer et al., 1986). Simulation continues until the maximum time ($1.5 \times 10^{5}$) is reached, the maximum number of steps is taken, or the cell dies (when a chemical concentration exceeds an arbitrary but high threshold of $1.0 \times 10^{-4}$).

A typical cell simulation might contain around 185 enzyme-catalysed reactions, 700 differential equations each containing 10 to 25 terms, around 200 chemical species and 10 enzymes. This may seem large, but compared to an actual cell, our simulations are mere caricatures.

7. Evolution of the Cell

A simple GA evolves the cell models. The GA runs the population of cell simulations over a network of 20 processors, spawning each cell simulation on a separate processor. Simulations queue until a processor becomes available. Fitness proportionate selection with the roulette wheel algorithm is used to find breeding pairs. Mutation (with probability of bit mutation 0.005), crossover (approximately four points per genome) and inversion (with probability 0.15 per genome) are applied. A steady-state GA with one population (size 100) is used. Breeding occurs only when at least 75 members of the population have finished simulation.

A fitness function (Kennedy, 1998) scores each cell simulation. This function is a combination of six metrics in $[0, 1]$. The six components to fitness were derived with the intention they would form a canonical set. Qualitatively they are:

- the change in volume of the cell over the course of its lifetime relative to a target value (cells that stay the same size or grow very slightly are rewarded)
- the "time" the cell lived
- how closely correlated the operon switching regions are to the membrane builder and breaker chemicals
- the complexity of the metabolic reaction graph
- the number of membrane building and breaking species available to the cell
- the number of differential equations in the cell

Additionally, the fitness of cells that died during the simulation is halved. Fitness values are bounded above at 63. The metrics conflict with one another, consequently the maximum fitness score cannot be realised.

8. Results

Experiments were conducted breeding populations of cells to adapt to an environment. Two different environments were examined. A first environment "Grow" makes the cell grow larger and a second environment "Shrink" causes cells to get smaller if the cell takes no action. As well as containing chemicals that affect the cell growth process, both environments also contain a number of other chemical species held at the same concentration as the cell. These chemicals (table 1) are used as building blocks for the cell. The only difference in experiments is the concentration of chemicals.

8.1 Environment "Grow"

This environment causes cell growth by maintaining a higher concentration of membrane building chemicals than inside the cell. When membrane building chemicals diffuse into the cell the number of water molecules in the cell increases (because the membrane is larger) and the concentrations of chemicals in the cell decrease (because concentrations are defined as the ratio of molecules of a substance to molecules of water in the cell). This causes reactions to occur more slowly. At the same time, the surface area of the cell increases, in turn causing diffusion to increase. If the cell does not act on the influx of membrane building chemicals in time, it can soon find itself unable to arrest the increasing rate of diffusion because reactions (including enzyme production) have slowed too much.

[Figure 3: Fitness in Environment "Grow". Average fitness and maximum fitness (y-axis: Fitness) against cell ID or time (x-axis).]

Figure 3 is a graph of the fitness attained during the run. The X-axis represents each experiment run. Values near the maximum fitness were reached very early. This occurs soon after breeding begins (after 75 experiments have run). The average fitness is calculated over all finished experiments currently in the population, excluding those that aborted. Fitness values on the lower half of the graph mostly represent cells that died, since the fitness function halves the fitness of all cells that died. Most of the population becomes fit and finds strategies to adapt to the environment.

There is a large diversity of genes in the cells in these experiments. Environments do not have one "right" solution with maximum fitness. There is a variety of valid approaches to the environment and all have similar (high) fitness. This allows the population to maintain different genes to solve the problem as long as valid contexts for their use (i.e. initial metabolic conditions) are available. The variety of valid strategies suggests that our evolution would benefit from some form of speciation. We plan to do this in future research.

We calculate the correlation between the fitness of a cell and each of its parents. This is done only for cells that lived till the end of their simulation period and whose parent also lived. The coefficient of determination ($r^2$) between a cell and its mother is 0.46 and between a cell and its father is 0.05. This suggests that offspring fitness is more closely correlated to mothers than fathers. The difference between parents is that initial chemical species are inherited only from the mother cell, whilst the genome is inherited from both parents. The context of the genome, then, has an important effect on the fitness of a cell.
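The parent and offspring comparison above can be sketched as follows. This is a generic computation of the coefficient of determination as the squared Pearson correlation, with the filtering rule taken from the text; the function names and the record layout (child fitness, parent fitness, child lived, parent lived) are hypothetical, and no data from the experiments is reproduced.

```python
from typing import List, Sequence, Tuple


def r_squared(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def parent_child_r_squared(
    records: List[Tuple[float, float, bool, bool]]
) -> float:
    """r^2 between child and parent fitness, keeping only pairs where both cells lived.

    Each record is (child_fitness, parent_fitness, child_lived, parent_lived).
    """
    kept = [(child, parent) for child, parent, c_ok, p_ok in records if c_ok and p_ok]
    return r_squared([p for _, p in kept], [c for c, _ in kept])
```

Running `parent_child_r_squared` once with mother records and once with father records would give the two values compared in the text.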
Since children often scored different fitness values than their parents we were interested in the stability of the solutions found. Taking the final population of cells, we continued running each cell for a further time of $1.5 \times 10^{5}$ and calculated another fitness value. When we examined the cells we found that there seemed to be two kinds: myopic cells with short-term solutions (that decreased in fitness when run for longer); and cells with conservative longer term solutions (that either increased or maintained fitness). The myopic cells implementing short-term solutions tend to have slightly higher fitness than the more stable cells. The higher fitness is not realised in later generations when the usefulness of the short-term strategy starts to fail.

Table 1: Primary chemicals in the environments and in initial population of cells and their concentrations.

| Species | Env. "Grow" | Env. "Shrink" | Cell | Purpose |
|---|---|---|---|---|
| o0 - o9 | $5.0 \times 10^{-5}$ | $5.0 \times 10^{-5}$ | $5.0 \times 10^{-5}$ | Monomers |
| o01, o23 | | | $5.0 \times 10^{-7}$ | Spider building blocks |
| o30 | $5.0 \times 10^{-7}$ | $5.75 \times 10^{-7}$ | $5.0 \times 10^{-7}$ | Membrane breaker |
| o44 | $5.75 \times 10^{-7}$ | $5.0 \times 10^{-7}$ | $5.0 \times 10^{-7}$ | Membrane builder |
| o0123 | $5.0 \times 10^{-5}$ | $5.0 \times 10^{-5}$ | $5.0 \times 10^{-5}$ | Spider |
| o5432, o2468, o9673, o7345 | $1.0 \times 10^{-6}$ | $1.0 \times 10^{-6}$ | | Other building blocks |

8.1.1 Case Studies of Individual Cells in Environment "Grow"

We will now examine two cells living in environment "Grow". Usually we found that many chemicals exist in the cell in small concentration and that it is often difficult to determine exactly what strategy is used by a cell. In the following case studies we present greatly abbreviated views of each cell. We show all operons, but describe only the pertinent chemicals in each cell.

The first cell we discuss is the one that scored the maximum fitness in the final population. It scored 31.0092 initially and 30.4144 for the second time period. Table 2 shows its two operons. Note that the number between braces in the table is the regulation information for the operon.

Table 2: Operons in first example cell in environment "Grow"

| Kind of Switch | Genes | % active |
|---|---|---|
| Constitutive | e51709 | 100 |
| Inducible {066} | e137 | 21 |

Three spider species are available for transcription in this cell. These are o0123 (with concentration $2.8 \times 10^{-5}$), o01237 ($7.54 \times 10^{-14}$) and o10123 ($7.54 \times 10^{-14}$). Table 3 lists the membrane building and breaking species in the cell. Enzyme e137 is used to build new spiders but its operon does not express well. Both enzymes build membrane breaking molecules to fight against the o44 molecules that are diffusing in. This appears to be a sound strategy since the second fitness is similar to the original fitness.

Table 3: Membrane building and breaking chemicals in first example cell in environment "Grow"

| Building chemicals | Conc. |
|---|---|
| o44 | $5.43 \times 10^{-7}$ |
| o446 | $1 \times 10^{-15}$ |
| o447 | $1 \times 10^{-15}$ |
| o73451, o7345130 | Virtually none |
| Breaking chemicals | |
| o30 | $4.73 \times 10^{-7}$ |
| o130 | $9.8 \times 10^{-16}$ |
| o1309, o5130, ... and 11 more complicated breakers including o7345130 | Virtually none |

There are, however, three weaknesses with it. The same enzyme (e51709) that is used to build many of the membrane breaking species also makes builder molecules by appending polymers starting with 1 to the polymer o7345. The enzymes are not sufficiently specific. As well, the builder molecules that are constructed are long, and therefore, will diffuse out very slowly. The next flaw in the strategy is the molecule o7345130 that is made. This molecule acts as both a builder and breaker molecule. As such, its production wastes valuable resources. The final weakness is that many of the breaker molecules are produced using the existing breaker species o30. This chemical becomes bound and will reduce the overall fitness of the cell because it is not available to break down the membrane. Note also that o30 will diffuse into the cell because it has lower concentration than the environment. However, o44 will also diffuse in and is still in higher concentration than o30 causing the cell to grow. It is interesting to note that the task faced by a cell changes as it reacts to the environment.
Fit cells need to be able to adapt their response to the environment. The other cell in this environment that we will examine is the cell in the nal population that recorded the Fitness in Environment "Shrink" Table 4: Operons in second example cell in environment \Grow" Genes % active e07 100 e619 15 e53 93 e38 55 30 25 20 Fitness Kind of Switch Constitutive Inducible f0g Repressible f6g Inducible f751718g 35 15 10 Table 5: Membrane building and breaking chemicals in the second example cell in environment \Grow" Average Fitness Max Fitness Fitness 5 Building chemicals Conc. o44 5:43 10?7 Breaking chemicals o30 4:7 10?7 o96730123 6:1 10?8 o07; o075; o306; ::: ... and 44 more Virtually complicated none breakers. 0 0 200 400 600 Cell ID or time 800 1000 1200 Figure 4: Fitness in Environment \Shrink" Comparative Enivironments 35 30 25 largest increase in tness when it was run for the additional time period. Initially it scored 24.3847. After continuing the run, its tness rose to 28.9469. Table 4 shows its four operons. As well as the four enzymes encoded on its genome, this cell has inherited another enzyme from its mother (e06) but does not have the gene to produce it. This chemical will soon degrade into its two constituent molecules. Cells often have small amounts of chemicals they cannot produce. These are inherited from cells in the maternal cell-line. This cell has ve spiders at its disposal: o0123 (2:46 10?5), o01238 (virtually none), o96730123 (6:1 10?8 ), o196730123 (virtually none) and o967301238 (virtually none). Table 5 shows how it can modify the membrane. The cell has managed to build only membrane breaking chemicals, many of which do not rely solely on modifying the important chemical o30. This strategy would seem to be superior because it is more specic. Unfortunately this strategy scores less in the short term than the other cell. 
8.2 Environment "Shrink"

The second environment we simulate acts in the opposite way: it promotes cell shrinkage by bathing the cell in "breaker" chemicals. Any decrease in cell size causes an increase in concentration for all chemicals in the cell because the number of water molecules in the cell lessens. This increase in concentration causes two effects in the cell: an increase in the chemical reaction speed; and the possibility that one of the chemical species with high concentration now moves over the threshold ($1.0 \times 10^{-4}$) to poison the cell. If the cell cannot respond to the environmental pressure, it can quickly move towards death, gathering speed as it continues to shrink in a positive feedback cycle. The longer the cell waits before acting, or takes to act, the more difficult it is to stop the shrinkage.

Figure 4 shows the fitness attained in environment "Shrink". As in environment "Grow", values near the maximum fitness were reached very early. Figure 5 plots the graphs for both environments together. The average fitness for both environments starts out similarly and at the end both populations are mostly filled with fit individuals. Fitness values of cells in environment "Grow", however, are consistently higher than those of cells in environment "Shrink". As the membrane building and breaking coefficients ($k_1$ and $k_2$ in equation 14) are set to the same values, the explanation for this is in the formulation of the fitness function: the volume component of fitness for cells that grow a little is higher than for cells that shrink.

[Figure 5: Fitness in Both Experiments. Plot titled "Comparative Environments" showing average and maximum fitness for environments "Shrink" and "Grow" (vertical axis: Fitness) against cell ID or time (horizontal axis).]

Results of the relationship between offspring and parent fitness values and the stability of solutions are similar to those for the other environment and are not reproduced here.
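The positive feedback cycle described at the start of this section can be illustrated with a small Python sketch. It holds the amount of each chemical fixed, so that concentration is amount divided by volume, and shrinks the volume at a rate proportional to the excess of breaker over builder concentration. Only the poisoning threshold comes from the text. The volume rule, the constant k, the time step, the initial volume and the amounts are hypothetical choices for illustration, and reactions and diffusion are ignored entirely.

```python
POISON_THRESHOLD = 1.0e-4  # poisoning threshold quoted in the text


def concentrations(amounts, volume):
    """Concentration of each chemical for fixed amounts in a given volume."""
    return {name: amount / volume for name, amount in amounts.items()}


def simulate_shrinkage(amounts, builders, breakers, k=1.0e4, dt=1.0, max_steps=10_000):
    """Euler-integrate the assumed rule dV/dt = k * (builders - breakers).

    Returns (step at which a chemical crosses the threshold, or None if
    that never happens within max_steps; list of volumes over time).
    """
    volume = 1.0  # hypothetical initial volume in arbitrary units
    history = [volume]
    for step in range(max_steps):
        conc = concentrations(amounts, volume)
        if max(conc.values()) > POISON_THRESHOLD:
            return step, history
        build = sum(conc[name] for name in builders)
        brk = sum(conc[name] for name in breakers)
        volume += dt * k * (build - brk)
        history.append(volume)
    return None, history


# Hypothetical amounts: a builder, a slightly more abundant breaker,
# and a spider that starts closest to the poisoning threshold.
AMOUNTS = {"o44": 5.4e-7, "o30": 6.2e-7, "o0123": 3.0e-5}

poison_step, volumes = simulate_shrinkage(AMOUNTS, builders=["o44"], breakers=["o30"])
first_drop = volumes[0] - volumes[1]
last_drop = volumes[-2] - volumes[-1]
print(f"poisoned at step {poison_step}, final volume {volumes[-1]:.3f}")
print(f"volume lost in first step {first_drop:.2e}, in last step {last_drop:.2e}")
```

Because both concentrations rise by the same factor as the volume falls, their difference grows too, so each step removes more volume than the one before until the most concentrated chemical crosses the threshold.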
8.2.1 Case Studies of Individual Cells in Environment "Shrink"

The first cell we dissect is the one in the final population that scored the highest fitness (31.0556). This cell contains two operons (see Table 6). It has only one spider molecule (o0123), which it gets by default from the initial conditions. The membrane building and breaking molecules available are listed in Table 7.

Table 6: Operons in the first example cell in environment "Shrink"

Kind of Switch      Genes     % active
Constitutive        e59       100
Inducible {124}     e53       44

Table 7: Membrane building and breaking chemicals in the first example cell in environment "Shrink"

Building chemicals                     Conc.
o44                                    $5.4 \times 10^{-7}$
o744                                   $2.45 \times 10^{-15}$

Breaking chemicals                     Conc.
o30                                    $6.2 \times 10^{-7}$
o302, o630, o8302                      $2.5 \times 10^{-15}$
o830                                   $1.3 \times 10^{-8}$
o530, o5302, o734530, o7345302         Virtually none

Only two enzymes exist in the cell: e53 and e59. Enzyme e53 is used to build many of the breaking chemicals (o530, o5302, o734530 and o7345302). The other building and breaking chemicals were inherited from the mother. This cell line must have existed for some time, because the concentrations of o30 and o44 are higher than in the environment, which only occurs if the cell has shrunk considerably. As the total of the membrane breaking concentrations is greater than the total of the building concentrations, the cell will continue to shrink. The strategy used by this cell seems strange because the genome instructs the metabolism to build membrane breaking molecules, which will cause the cell to shrink further instead of grow. However, it reduces the amount of available o30 by putting it into a bound state while joining it to other chemicals. We never envisioned such a contrary strategy when designing the model. In the short term, this is a good strategy, but it breaks down after some time. This is because the new chemicals produced eventually become unbound and, since they are longer, diffuse out of the cell more slowly. Initially, this cell achieved a fitness of 31.0556.
When it was run longer, the fitness fell to 25.1239. Unfortunately, the cell has lost the ability to build o744. Presumably, this is because that ability did not translate into an immediate large increase in fitness. This is most probably because the act of producing o744 temporarily puts o44 and o744 into a bound state and therefore decreases the fitness of the cell. In the long term, however, the fitness will recover. This strategy, however, does not compete well with the short-term solution already mentioned.

The second cell we will examine is the cell in the final population whose fitness improved the most when it was run for the further time period: from 27.8078 to 28.8712. This cell, as shown in Table 8, contains two operons. Additional spider molecules are available to this cell. They are o0123 ($3 \times 10^{-5}$), o01238 ($2 \times 10^{-8}$), o70123 (virtually none) and o701238 (virtually none). The last two of these are constructed using the enzyme e770 but not enough has been made to be of any practical use.

Table 8: Operons in the second example cell in environment "Shrink"

Type of Switch      Genes        % active
Inducible {3}       e770         17
Constitutive        e53, e59     100

Table 9: Membrane building and breaking chemicals in the second example cell in environment "Shrink"

Building chemicals                          Conc.
o44                                         $5.28 \times 10^{-7}$

Breaking chemicals                          Conc.
o30                                         $6 \times 10^{-7}$
o07345                                      $2.4 \times 10^{-15}$
o530, o073453, o073459, o707345,
o734530, o0734530, o073459673,
o2468734530                                 Virtually none

The strategy used by this cell does not differ much from that of the first cell. It uses enzymes e53 and e59 to bind more of the existing membrane catalysing species into complexes so that they cannot shrink the cell. The cell achieved a higher fitness when the run was continued because it can bind o30 and o07345 into many different complexes. We would expect this strategy to start to break down after a longer period. In general, cells are unable to use strategies that break o30, for example, into its constituent monomers.
This is because the concentration of monomers is much higher than that of o30 and therefore the reversible reaction builds more o30 rather than breaking it down.

9. Conclusion

A complex model of a cell was presented. Populations of cells evolved to adapt to an environment in favourable ways: control systems (genomes, spiders and metabolism) were evolved to achieve adaptation to two environments. Reaction graphs of the cells are very complicated systems and only simplified descriptions were given in this paper for reasons of clarity.

Cells frequently chose short-term strategies over more stable longer-term strategies. This is because such strategies score higher fitness in the short term. However, children of these cells are, in general, less fit. The fitness function cannot distinguish between short-term and long-term solutions. To see why, consider how to score the parent. The fitness function must either examine the fitness of offspring or search for short-term solutions in the metabolism of the parent. The former task is impossible because a cell must be scored before it produces offspring. The latter task is difficult because it presupposes solutions to the environment and therefore particular kinds of cells. One of our goals with the design of the fitness function was to keep it as abstract as possible. Consequently, the fitness should be determined from the actions of the cell rather than details of the actual enzymes coded or the reaction graph embodied in the metabolic pathways.

We have also used the tool developed for these experiments in a number of other contexts: (i) exploration of the use of gene expression and regulation algorithms in evolutionary computation (Kennedy and Osborn, 2000); (ii) examination of the effects of Lamarckian evolution of the cell models (Kennedy and Osborn, 1999); and (iii) preliminary examination of the effects of genetic operators on the evolution of single-celled organisms (Kennedy, 1998).
In future work we wish to explore (i) the evolution of adaptive behaviour in more dynamic environments; (ii) application of the model to solving problems using a computational paradigm of chemical reactions; and (iii) modification of the model to apply it to broader problems in evolutionary computation (rather than strictly biological kinds of problems).

Regulation of genes is not emphasised in the current work. Most of the time, we find that operons work well if they are constitutive. More complex environments that change the stimuli presented to cells over time (i.e. dynamic environments) would make regulation more important. We wish to see how evolution of control of the adaptive behaviour in cells changes in such conditions.

In order to apply the model to a wider range of problems in evolutionary computation, we plan to discretise the cell model. That is, we plan to move from an encoding of differential equations to finite state machines and from floating point concentrations to integers or symbols. We believe this would make the model more applicable to other areas. The cell model would become more like (linear) genetic programming or a classifier system.

References

Alberts, Bray, et al. (1994). Molecular Biology of the Cell. Garland Publishing, New York, third edition.

Bagley, R. J. and Farmer, J. D. (1991). Spontaneous emergence of a metabolism. In Langton, C. G., Farmer, J. D., and Rasmussen, S., (Eds.), Artificial Life II, volume x of SFI Studies in the Sciences of Complexity. Addison-Wesley.

Farmer, J. D., Kauffman, S. A., and Packard, N. H. (1986). Autocatalytic replication of polymers. Physica 22D, pages 50-67.

Fleischer, K. and Barr, A. H. (1994). A simulation testbed for the study of multicellular development: The multiple mechanisms of morphogenesis. In Langton, C. G., (Ed.), Artificial Life III, volume xvii of SFI Studies in the Sciences of Complexity. Addison-Wesley.

Holland, J. H. (1992). Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, Massachusetts, first MIT Press edition.

Jacobi, N. (1995). Harnessing morphogenesis. Cognitive Science Research Paper 423, School of Cognitive and Computing Sciences, University of Sussex.

Kennedy, P. J. (1998). Simulation of the Evolution of Single Celled Organisms with Genome, Metabolism, and Time-Varying Phenotype. PhD thesis, University of Technology, Sydney.

Kennedy, P. J. and Osborn, T. R. (1999). A coevolutionary model of a single-celled organism with double-stranded genome and time-varying phenotype. In McKay, B., Tsujimura, Y., Sarker, R., Namatame, A., Yao, X., and Gen, M., (Eds.), Proceedings of The Third Australia-Japan Joint Workshop on Intelligent and Evolutionary Systems, pages 145-152.

Kennedy, P. J. and Osborn, T. R. (2000). Operon expression and regulation with spiders. To be presented at Gene Expression workshop at the 2000 Genetic and Evolutionary Computation Conference (GECCO-2000), Las Vegas.

Kittel, C. (1971). Introduction to Solid State Physics. John Wiley and Sons, New York, fourth edition.

Rosenberg, R. S. (1967). Simulation of Genetic Populations with Biochemical Properties. PhD thesis, University of Michigan.