Evolution of Adaptive Behaviour in a Simulated Single-Celled Organism

Paul J. Kennedy
Thomas R. Osborn
School of Computing Sciences
University of Technology, Sydney
PO Box 123 Broadway
NSW 2007 Australia
[email protected]
Abstract
A model of a single-celled organism comprising two interrelated parts (a genome and a metabolism) is presented. The genome encodes operons that specify enzymes for the metabolism. The artificial metabolism regulates the genome and constructs proteins (among other processes). The structure of operons in our model is governed by a parallel genomic language. Protein construction is accomplished with an abstraction of mRNA and ribosomal machinery called "spiders". Adaptive behaviour occurs at two different levels within the model: in enzyme-catalysed reactions and in the regulation of genes. Adaptive behaviour is passed among individuals of a population via the evolution of genomes and the Lamarckian evolution of initial metabolic conditions. Results are given for evolving cells in two simple environments, and the adaptive behaviour of four cells is examined.
1. Introduction
In this paper we present a model of a single-celled organism. Our motivation for developing this detailed evolutionary model is to build a tool for exploring the use of biological ideas in simulations of adaptive behaviour and in the broader field of evolutionary computation.
Our cell model may be divided into two closely interrelated parts: a genome and an artificial metabolism. The genome specifies enzymes to control reactions in the metabolism, whilst the metabolism regulates the genome and constructs the specified proteins (among other processes). Like a real cell, the phenotype in our model (i.e. the metabolism) is time-varying and the genome exerts control over the cell for its entire lifetime. This permits the cell model to exhibit adaptive behaviour in relation to some environment. An individual cell is constructed from a genome and an initial set of chemicals. From these chemicals and the genome, an arbitrarily complex metabolism results.
Presented
at SAB2000 - Paris, France - September 11-16, 2000
Cell models have been devised by other workers for use as a testbed; see, for example, (Fleischer and Barr, 1994). Our model differs from theirs both in the kinds of simulation undertaken (Fleischer and Barr are more interested in multicellular development, whereas we examine single cells) and in the kind of genome (we use a biologically plausible model, whereas theirs is a simpler ad hoc differential equation). The closest previous model to ours is probably that of (Rosenberg, 1967), although the constraints imposed by the computers of the day limited the detail of the model. Another similar approach is taken by (Jacobi, 1995).
Heuristics for adaptive behaviour are encoded into a
genome and initial chemical ensemble, both of which are
inherited from parents. This allows us to breed cells to
live in particular environments. We present results of cell
simulations evolved to adapt to two simple environments
and examine the adaptive behaviour found in four case
studies of individual cells.
2. Overview of Cell Model
Two levels of adaptive behaviour are designed into our cell simulations. The first level relates to regulation of genes on the genome. Genes may be switched on and off throughout the lifetime of a cell by varying the types and concentrations of chemicals in the artificial metabolism. Some of these chemicals may have diffused in from the environment, thus allowing direct adaptation to environmental stimuli. The other method of adaptive behaviour is swifter and occurs in the metabolism itself. Enzyme-catalysed chemical reactions cause the cell to modify its behaviour based on environmental (and internal) cues.
As stated above, there are two distinct sections
of our cell model: a genome and a simulated cellular metabolism. The genome encodes genes that
specify enzymes for use in the metabolism. Following the operon model of regulation in prokaryote cells
(Alberts et al., 1994), we group genes into blocks (called
operons) that are regulated in the same way. Operons
start with a section called the promoter region that contains information used to regulate the operon. Following
the promoter region are one or more genes each encoding
a protein.
The cellular metabolism encompasses five processes: enzyme-mediated chemical reactions; protein production (by gene expression and regulation); protein degradation; cell growth; and diffusion of chemicals through a semipermeable cell membrane. These processes are modelled in the hope that they will form a canonical set enabling the cell to be used as a testbed in a variety of environments and experiments. Processes in the cellular metabolism are represented with chemical reactions, and the kinetics of these reactions are encoded in a large system of coupled nonlinear ordinary differential equations.
The genome and initial metabolic conditions evolve in different ways. The genome evolves control structures (i.e. genes) in a Darwinian fashion using a genetic algorithm (GA) (Holland, 1992). Evolution of the initial chemical ensemble, however, occurs only through the maternal cell line. As well, it is Lamarckian, as changes to chemicals in a cell that occurred throughout its lifetime may be passed to offspring. The initial chemicals for a cell are the final set of chemicals from its mother cell. A cell, then, is the result of the coevolution of a genome and initial chemical ensemble (Kennedy and Osborn, 1999). This Lamarckian strategy allows adaptations to be passed along a cell line and permits the "problem" posed by an environment to be decomposed into smaller problems solved along a germ line.
Cells are simulated in an environment. For the experiments in this paper we chose to use very simple environments.
3. The Genome
The genome used by our cells is an extension of the simple fixed-length bit string commonly used in GAs. Although a simple bit string is sufficient, we found that search proceeded more quickly with the use of a more complicated genome: one modelled on the double-stranded (but haploid) DNA molecule (Kennedy, 1998).
The genome is a sentence from a parallel genomic language. This language permits operons to be encoded on the genome as a string of tokens. Blocks of four bits (nibbles) can encode sixteen kinds of tokens. Our language, however, uses only fourteen distinct tokens: ten digits in the range [0, 9] and four control codes: <start operon> ($1010_2$, $1011_2$), <start enzyme> ($1100_2$, $1101_2$), <start carrier> ($1110_2$) and <end operon> ($1111_2$). However, the <start carrier> token is not used in experiments in this paper. Following the <start operon> token is information used to regulate the operon. This is the promoter region. Next is a list of genes, with each gene beginning with a <start enzyme> token. Bases in [0, 9] following <start enzyme> represent the monomers required (and their order) to produce the protein. Figure 1 shows the layout and encoding of an operon using our parallel genomic language.

Figure 1: Structure of operon and sample encoding. [Diagram: an operon consists of a promoter followed by gene 1, ..., gene i, ..., gene n, and is followed by a non-coding region. Example with bases: <start operon>, switch data, <start enzyme> 5 3 2, <start enzyme> 0 1 7, then <end operon> or <start operon>.]
The promoter region describes how the operon may be regulated and provides a template for the shape of chemical species able to regulate the operon. Three classes of regulation in operons are modelled: constitutive (where the operon is always active and may be expressed at any time); repressible (where the operon is active, unless a "blocking" chemical is bound to the promoter region); and inducible (where the operon is inactive except when an "activator" chemical is bound to the promoter). Chemicals may diffuse in from the environment and regulate inducible or repressible operons.
Operon sentences are read from the genome and parsed into operons. From this, a sequence of chemical reactions is determined to build the proteins.
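As an illustration of how such a sentence might be read, the following minimal sketch decodes a bit string into nibble tokens and groups them into operons. It is not the parser used in our simulations: it reads a single strand only, stores the promoter as raw digits (how the switch kind is encoded within the promoter is not modelled here), skips <start carrier> tokens, and discards an operon left open at the end of the genome. All function and class names are hypothetical, and the promoter digits 0 6 6 in the example are chosen arbitrarily.

```python
from dataclasses import dataclass, field

# Illustrative labels for the four control codes of the genomic language.
START_OPERON = "<start operon>"
START_ENZYME = "<start enzyme>"
START_CARRIER = "<start carrier>"
END_OPERON = "<end operon>"


def decode_nibble(value):
    """Map a 4-bit value to a token: a digit 0-9 or one of the control codes."""
    if value <= 9:
        return value
    if value in (0b1010, 0b1011):
        return START_OPERON
    if value in (0b1100, 0b1101):
        return START_ENZYME
    if value == 0b1110:
        return START_CARRIER
    return END_OPERON  # 0b1111


def tokenize(bits):
    """Split a bit string into nibbles; trailing bits that do not fill a nibble are dropped."""
    usable = len(bits) - len(bits) % 4
    return [decode_nibble(int(bits[i:i + 4], 2)) for i in range(0, usable, 4)]


@dataclass
class Operon:
    promoter: list = field(default_factory=list)  # raw digits of the promoter region
    genes: list = field(default_factory=list)     # one list of monomer digits per gene


def parse_operons(tokens):
    """Group tokens into operons; tokens outside any operon are non-coding."""
    operons, current = [], None
    for tok in tokens:
        if tok == START_OPERON:
            if current is not None:      # a new operon also closes the previous one
                operons.append(current)
            current = Operon()
        elif current is None:
            continue                     # non-coding region
        elif tok == END_OPERON:
            operons.append(current)
            current = None
        elif tok == START_ENZYME:
            current.genes.append([])     # start collecting a new gene
        elif tok == START_CARRIER:
            continue                     # assumption: carriers are skipped in this sketch
        elif current.genes:
            current.genes[-1].append(tok)
        else:
            current.promoter.append(tok)
    return operons                       # assumption: an unterminated operon is discarded


# The sample operon of Figure 1, with hypothetical switch data 0 6 6.
example = ("1010" "0000" "0110" "0110"
           "1100" "0101" "0011" "0010"
           "1101" "0000" "0001" "0111"
           "1111")
operons = parse_operons(tokenize(example))
```

Here `operons` holds one operon with two genes whose monomer sequences are 5 3 2 and 0 1 7.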
4. The Environment
Experiments in this paper are confined to a simple environment that exposes a single cell to a fixed effect and that allows no reactions to occur outside the cell.
Additionally, the environment is assumed to be very large compared to the cell. This means that the concentration of chemicals in the environment is constant. Consequently, no differential equations are required to model the environment. Although the environment is constant, it still has an effect on the cell due to diffusion of chemicals through the semipermeable membrane.
5. Metabolic Processes
5.1 Enzyme-Catalysed Reactions
Chemicals, in our model, are an abstraction of polymers based on (Farmer et al., 1986) and (Bagley and Farmer, 1991). However, complementary matching is used throughout our simulation. The shape of a polymer appears as a string of digits (monomers) such as 4138. Each chemical species has a concentration that is defined as the ratio of the number of molecules of the chemical to the number of molecules of water (in the cell). Chemicals in the artificial cell are modified via enzyme-catalysed reactions. This metabolic process is the most direct way that a cell responds to environmental stimuli. Chemicals that diffuse into the cell may alter the way that the cell acts by changing which reactions occur. Our model of these reactions and their subsequent expression as differential equations follows that of (Bagley and Farmer, 1991). Each enzyme-catalysed reaction models the equilibrium of joining and breaking polymers. For example, the equation

$$ o_{123} + o_{456} + e_{345} \rightleftharpoons o_{123456} + e_{345} + H \tag{1} $$

describes an equilibrium joining polymers with shapes "123" and "456" into a longer polymer with shape "123456" under catalytic pressure of an enzyme with shape "345" (or alternately splitting the longer molecule into two shorter molecules). One molecule of water ($H$) is released. Note that the initial part of the enzyme matches the final part of the first substrate and that the final part of the enzyme matches the start of the second substrate. The closer this match, the faster the reaction proceeds. The notation $o_{nnn}$ refers to an ordinary chemical (i.e. not an enzyme) with shape nnn and the notation $e_{nnn}$ denotes an enzyme with shape nnn.
Each possible reaction leads to two differential equations for each chemical species in the cell: one governing the concentration of the chemical species ($x_i$ for species $i$) and another ($\bar{x}_i$) governing the sum of bound enzyme/substrate or enzyme/product complexes containing the chemical species. This latter variable and equation are required to solve the "saturation problem" as per (Farmer et al., 1986). For a detailed description of the differential equations see (Kennedy, 1998).
5.2 Protein Production and Degradation
Protein production in real cells is a complex multistage process. In this model, we are interested mainly in the simple notions (i) that genes specify enzymes; (ii) that our artificial cell provides machinery to build the enzymes; and (iii) that the cell can regulate the rate of production of the enzymes (as the direct or indirect result of environmental stimuli). Consequently, we combine the processes of transcription and translation into a single process of expression. This operation is carried out with a new entity called a "spider", which may be viewed as a combination of mRNA and ribosomal machinery. Spiders walk along the genome reading each base and appending the matching monomer to a growing protein chain. In our simulation, spiders are chemicals with a shape similar to a particular string (arbitrarily 12312). Operon expression is modelled as a chain of irreversible chemical reactions with one reaction for each step along the genome.

$$ S + M \longrightarrow S' + P \tag{2} $$

Here a spider molecule ($S$) is bound to a monomer ($M$, i.e. the one matching the base being read). The substrate spider molecule ($S$) may be either an unbound spider (when this reaction models the first base of the operon) or a spider bound to the genome (when the reaction models a spider partially along the operon). A modified spider molecule ($S'$) and perhaps a protein ($P$, i.e. an enzyme) result. An enzyme is only produced when the reaction models the step from one gene to another or is the last of the operon. The product spider molecule will be either an unbound spider (if the reaction is the last of the operon) or a spider bound to the next position along the genome.
Differential equations follow readily from the chain of chemical reactions. A rate constant of 1 is used for all reactions except the first of an operon. That reaction requires special treatment because it is where operon regulation is taken into account. For this first reaction of an operon, the rate "constant" used is $G_i K_T$, where $G_i$ is the activation of operon $i$ (a value in [0, 1]) and $K_T$ is the transcription rate constant for the given spider. The closer the shape of the spider is to the "ideal" spider, the higher the value of $K_T$ and the faster the spider can initiate expression.
The activation of an operon ($G_i$) depends on the kind of switch in the operon. For constitutive operons $G_i$ is set to 1. Inducible operons are active only when one of $n$ competing species of "switch" chemical is bound to the promoter region, so $G_i$ takes the value $\hat{\rho}_n$, the probability of that event. Repressible operons, on the other hand, are active only when a switch chemical is not bound to the promoter region, so $G_i$ has value $1 - \hat{\rho}_n$. Of course, different chemicals will switch each operon.
The probability of one of $n$ competing switch chemicals being bound to a given promoter region ($\hat{\rho}_n$) is derived in (Kennedy, 1998) and has value

$$ \hat{\rho}_n = \frac{\rho_1 + (1 - \rho_1)\, I(n)}{1 + (1 - \rho_1)\, I(n)} \tag{3} $$

$$ I(n) = \sum_{i=2}^{n} \frac{\rho_i}{1 - \rho_i} \tag{4} $$

where $\rho_i$ is the probability that a molecule of a particular chemical species $i$ is bound to the promoter region. This is given by

$$ \rho_i = 1 - e^{-K_i s_i} \tag{5} $$

where $s_i$ is the concentration of species $i$ and

$$ K_i = \frac{K}{(1 - q_i)\, e^{-n_i b}}. \tag{6} $$

$q_i$ is the probability that switching chemical $i$ will not immediately bind to the promoter region; that is, the probability of a bounce. The exponential part of the denominator of $K_i$ specifies how long the chemical $i$ will bind to the promoter region. This time derives from the Boltzmann distribution and depends on the average number of bonds between the chemical and the promoter region ($n_i$) and the relative strength of each bond ($b$, typically 0.25). $K$ is a constant used to calibrate the concentration of a switching chemical with the probability that the chemical will be bound to the promoter region. The actual value used is arbitrary but its general relationship with the other parameters is important. We typically use $1.0 \times 10^{3}$. There is a variable $\hat{\rho}_n$ for each operon and its derivative is added to the system of differential equations.

Figure 2: Cell membrane sphere packing scheme. Each sphere represents a single cell membrane molecule. [Diagram labels: radius r; tetrahedron height h.]
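The operon activation described above can be sketched numerically as follows. This is a minimal illustration rather than the simulation code: the names `rho` (per-species binding probability) and `competing_bound_probability` (the combined probability of equations (3) and (4)) are chosen here for readability, $q_i < 1$ is assumed, and a per-species probability that rounds to 1 is treated as certain binding to avoid dividing by zero.

```python
import math


def per_species_rate(K, q, n_bonds, b=0.25):
    """K_i = K / ((1 - q_i) * exp(-n_i * b)), following equation (6); assumes q < 1."""
    return K / ((1.0 - q) * math.exp(-n_bonds * b))


def binding_probability(K_i, s):
    """rho_i = 1 - exp(-K_i * s_i), following equation (5)."""
    return 1.0 - math.exp(-K_i * s)


def competing_bound_probability(rhos):
    """Probability that one of the competing switch chemicals is bound, equations (3)-(4)."""
    if not rhos:
        return 0.0
    if any(r >= 1.0 for r in rhos[1:]):
        return 1.0  # assumption: treat saturation as certain binding
    i_n = sum(r / (1.0 - r) for r in rhos[1:])          # equation (4)
    first = rhos[0]
    return (first + (1.0 - first) * i_n) / (1.0 + (1.0 - first) * i_n)  # equation (3)


def operon_activation(kind, rhos):
    """Activation G_i for the three classes of operon regulation."""
    if kind == "constitutive":
        return 1.0
    bound = competing_bound_probability(rhos)
    if kind == "inducible":
        return bound
    if kind == "repressible":
        return 1.0 - bound
    raise ValueError(f"unknown operon kind: {kind}")
```

With a single switch species the combined probability reduces to that species' own binding probability, and the result does not depend on the order in which the competing species are listed.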
This set of biological pathways will produce proteins, but the molecules will accumulate until they poison the cell: there is no way to break proteins down. Therefore we add a simple model of protein degradation. For example, the breakdown of the enzyme $e_{343}$ into its constituent monomers is modelled with the following reaction:

$$ e_{343} \longrightarrow 2\, o_3 + o_4 \tag{7} $$
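The degradation products of an arbitrary enzyme follow directly from counting its monomers. The short sketch below illustrates this under the assumption that an enzyme is represented simply by its shape string; the function names are hypothetical.

```python
from collections import Counter


def degradation_products(enzyme_shape):
    """Count the monomers released when an enzyme breaks down completely."""
    return Counter(enzyme_shape)


def format_degradation(enzyme_shape):
    """Write the degradation reaction in the style of equation (7)."""
    counts = degradation_products(enzyme_shape)
    terms = []
    for monomer in sorted(counts):
        n = counts[monomer]
        terms.append(f"{n} o{monomer}" if n > 1 else f"o{monomer}")
    return f"e{enzyme_shape} -> " + " + ".join(terms)
```

For the enzyme of equation (7), `format_degradation("343")` gives `e343 -> 2 o3 + o4`.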
5.3 Modelling the Cell Membrane
Each cell is represented as a cell membrane filled with as many water molecules as possible to form a plump sphere. No organelles are modelled. Cell membrane molecules do not appear explicitly as chemical species. Instead, there is a variable in the system of differential equations that directly represents the number of cell membrane molecules.
As a first approximation to the semipermeable bilipid membrane of a real cell, we model the cell membrane with two layers of spheres packed together as shown in figure 2. Each sphere represents a single cell membrane molecule.
The number of membrane molecules covering the cell may be expressed as

$$ N_M = 0.74\,\frac{V_W}{V_1} \approx \frac{3 V_W}{4 V_1} \tag{8} $$

where $N_M$ is the number of membrane molecules associated with the cell, 0.74 is the packing density of spheres (Kittel, 1971), $V_W$ is the volume of the cell membrane and $V_1$ is the volume of one cell membrane molecule. The approximation of 0.75 for 0.74 implies packing of slightly squishy spheres. Reorganising, we get

$$ V_W = \frac{4 V_1 N_M}{3} \tag{9} $$

Another way to approximate the volume of the cell membrane is to multiply the surface area of the cell by the width of the membrane. This ignores the curving of the membrane.

$$ V_W = 4 \pi R^2 w \tag{10} $$

where $R$ is the radius of the cell and $w$ is the width of the cell membrane. Equating equations (9) and (10), using simple trigonometry to determine the width of the cell membrane from the packing scheme (i.e. $w = 2r + h$, with $h = 2r\sqrt{2/3}$ the height of a regular tetrahedron of edge $2r$) and substituting $V_1 = \frac{4}{3}\pi r^3$, we get

$$ R^2 = \frac{2 r^2 N_M}{9\left(1 + \sqrt{2/3}\right)} \tag{11} $$

where $r$ is the radius of a cell membrane molecule.
From equation (11) and the formula for the volume of a sphere we may determine the volume of the cell as a function of the number of cell membrane molecules.

$$ V_C = \frac{4}{3}\pi R^3 = \frac{4}{3}\pi \left( \frac{2 r^2 N_M}{9\left(1 + \sqrt{2/3}\right)} \right)^{3/2} \tag{12} $$

The surface area of the cell, $A_C$, may be determined in a similar way. Given $V_C$, the volume of the cell, we can determine the number of water molecules contained in the cell as

$$ N_W = \frac{N_A}{A_W} \times 10^{6} \times \frac{4}{3}\pi \left( \frac{2 r^2 N_M}{9\left(1 + \sqrt{2/3}\right)} \right)^{3/2} \tag{13} $$

where $N_A$ is Avogadro's number and $A_W$ is the atomic mass of (one molecule of) water. Derivation of this equation is in (Kennedy, 1998). The size of the cell, the number of water molecules it contains and hence concentrations of chemicals in the cell are a function of the number of cell membrane molecules associated with the cell.
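The dependence of cell size on the number of membrane molecules can be sketched as below. This is an illustration only: the membrane molecule radius passed as `r` is a free parameter whose value is not given in this paper, the factor $10^6$ in equation (13) is read here as the density of water in grams per cubic metre (so that `r` is in metres), and the surface area is taken as that of a sphere of radius $R$. All function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23    # molecules per mole
WATER_MOLAR_MASS = 18.015   # grams per mole (assumed value for A_W)


def cell_radius_squared(n_membrane, r):
    """R^2 as a function of the number of membrane molecules, equation (11)."""
    return 2.0 * r ** 2 * n_membrane / (9.0 * (1.0 + math.sqrt(2.0 / 3.0)))


def cell_volume(n_membrane, r):
    """V_C, the volume of a sphere of radius R, equation (12)."""
    return (4.0 / 3.0) * math.pi * cell_radius_squared(n_membrane, r) ** 1.5


def cell_surface_area(n_membrane, r):
    """A_C, taken here as the surface area of a sphere of radius R."""
    return 4.0 * math.pi * cell_radius_squared(n_membrane, r)


def water_molecules(n_membrane, r):
    """N_W, the number of water molecules filling the cell, equation (13)."""
    return AVOGADRO / WATER_MOLAR_MASS * 1.0e6 * cell_volume(n_membrane, r)
```

Under these formulas the surface area grows linearly with $N_M$ while the volume, and hence $N_W$, grows as $N_M^{3/2}$, so quadrupling the membrane multiplies the water content by eight.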
5.4 Cell Growth
As cell size is expressed in terms of the number of cell membrane molecules, growth or shrinkage of the cell occurs when this number of molecules changes. As an initial approximation to the complex process of membrane formation, we introduce two families of chemical species that may change the number of cell membrane molecules associated with the cell. One family ("builders") increases the amount of membrane and the other family ("breakers") reduces it. The actual processes of building and destroying membrane are not directly modelled. A simple matching algorithm tests whether each of the chemical species in the cell (enzymes and ordinary chemicals) is a member of the family of builders or breakers (Kennedy, 1998). We typically use an (arbitrary) builder shape of 8441 and breaker shape of 0307.
The differential equation governing the number of cell membrane molecules is

$$ \frac{dN_M}{dt} = N_W \left( k_1 \sum_{a \in A} x_a - k_2 \sum_{b \in B} x_b \right) \tag{14} $$

where $k_1$ is the rate at which building of the cell membrane occurs and $k_2$ is the rate at which reduction of the cell membrane happens. Typically, we use the (arbitrary) values $k_1 = k_2 = 1.0 \times 10^{-6}$. $A$ is the set of chemical species that are members of the cell membrane building family and $B$ is the set of species that may act as breakers. A species may act as both a builder and a breaker if it has an appropriate shape (for example 03078441). $x_a$ is the concentration of the $a$th chemical species. The reader may note that equation (13) shows that $N_W$ is a function of the number of cell membrane molecules in the cell. However, here, we use it as a constant with the last computed value (the value at the last time step). This is valid because $N_W$ changes very slowly. Multiplication by $N_W$ converts the builder and breaker concentrations to raw numbers of molecules.
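The right-hand side of the membrane growth equation can be sketched as a small function. The sketch assumes that membership of the builder and breaker families has already been decided (the matching algorithm itself is described in (Kennedy, 1998) and is not reproduced here) and that concentrations are held in a dictionary keyed by species name; these representations and the function name are illustrative choices.

```python
def membrane_growth_rate(conc, builders, breakers, n_water, k1=1.0e-6, k2=1.0e-6):
    """dN_M/dt = N_W * (k1 * sum of builder conc. - k2 * sum of breaker conc.)."""
    build = sum(conc[s] for s in builders)
    reduce_ = sum(conc[s] for s in breakers)
    # n_water is treated as a constant: the value from the last time step.
    return n_water * (k1 * build - k2 * reduce_)


# Illustrative values: builder o44 slightly more concentrated than breaker o30.
conc = {"o44": 5.75e-7, "o30": 5.0e-7}
rate = membrane_growth_rate(conc, builders={"o44"}, breakers={"o30"}, n_water=1.0e12)
```

A positive rate means the membrane, and therefore the cell, is growing. A species that belongs to both families contributes to both sums, so with $k_1 = k_2$ its net effect on the membrane is zero.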
5.5 Communication with the Environment
Communication with the environment is the way environmental stimuli come into the cell. After chemicals diffuse into the cell they become part of the chemical ensemble and may take part in reactions, regulate or express operons or change the size of the cell. Communication between the cell and its environment involves diffusion of chemicals across the semipermeable membrane. Not all chemicals may diffuse through the membrane: proteins and partially built proteins are assumed to be too large.
Diffusion of chemicals across the semipermeable membrane is modelled by an additional term subtracted from the differential equation of each ordinary chemical ($x_i$ above). The diffusion term is not applied to $\bar{x}_i$, the variable for the sum of bound complexes containing species $i$, because the bound complexes involve an enzyme molecule which is assumed too large to pass through the membrane. Three factors underlie our model of background diffusion: the rate of diffusion is (i) proportional to the concentration gradient of the permeant; (ii) proportional to the surface area of the membrane ($A_C$); and (iii) inversely proportional to the size of the permeant molecule. This last factor is estimated by the cube of the length of the polymer undergoing diffusion. Differential equations, then, are modified as follows:

$$ \frac{dx_i}{dt} = \ldots - \frac{K A_C}{l_i^3} \left( x_{i,\mathrm{IN}} - x_{i,\mathrm{OUT}} \right) \tag{15} $$

where $l_i$ is the number of monomers comprising the $i$th chemical, $K$ is a parameter used to weight the diffusion (usually $2.0 \times 10^{4}$) and $x_{i,\mathrm{IN}}$ and $x_{i,\mathrm{OUT}}$ are the concentrations of the $i$th ordinary chemical inside and outside the cell respectively.
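The diffusion term can be sketched as a single function. The function name and argument names are illustrative, and the units of the surface area are left unspecified, as they are in the text.

```python
def diffusion_term(x_in, x_out, length, area, K=2.0e4):
    """Term subtracted from dx_i/dt: K * A_C / l_i^3 * (x_in - x_out)."""
    return K * area / length ** 3 * (x_in - x_out)


# A two-monomer chemical that is more concentrated outside the cell than inside.
term = diffusion_term(x_in=5.0e-7, x_out=5.75e-7, length=2, area=1.0)
```

Because the term is subtracted, a negative value (outside concentration higher than inside) raises the internal concentration, which corresponds to diffusion into the cell. The cubic dependence on length means that a polymer twice as long diffuses eight times more slowly, all else being equal.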
6. Simulating a Cell
The first step in simulating the actions of a single cell is to build the phenotype. First, the genome is parsed into a list of operons. One of the two parents is randomly designated as the mother cell. The initial chemical species are then read from this mother cell. A reaction graph consisting of metabolic and protein production and degradation reactions is then determined by matching the operon parse list to the chemical species. Next, the set of variables is determined and terms in the differential equations are calculated from the reaction graph. Diffusion terms are also added to the differential equations. The variables are now initialised, in most cases, to the final chemical concentrations in the mother cell. Finally, the simulation starts and a numerical integration algorithm (Runge-Kutta with adaptive step size) finds values of the variables at time steps. When chemicals appear with concentration greater than that of one molecule, new reactions become possible and the system of differential equations is updated (Farmer et al., 1986). Simulation continues until the maximum time ($1.5 \times 10^{5}$) is reached, the maximum number of steps is taken, or the cell dies (when a chemical concentration exceeds an arbitrary but high threshold of $1.0 \times 10^{-4}$).
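The stopping rules and the one-molecule presence test can be sketched as follows. The sketch is illustrative: the maximum number of steps is not stated in this paper and is therefore a parameter, the order in which the conditions are checked is a choice made here, and the one-molecule concentration is taken as $1/N_W$ because concentrations are defined relative to the number of water molecules. Function names are hypothetical.

```python
MAX_TIME = 1.5e5          # maximum simulated time
DEATH_THRESHOLD = 1.0e-4  # a concentration above this kills the cell


def is_present(concentration, n_water):
    """A species counts as present once it exceeds the concentration of one molecule."""
    return concentration > 1.0 / n_water


def stop_reason(t, steps, concentrations, max_steps):
    """Return why the simulation should stop, or None if it should continue."""
    if any(c > DEATH_THRESHOLD for c in concentrations):
        return "died"
    if t >= MAX_TIME:
        return "max_time"
    if steps >= max_steps:
        return "max_steps"
    return None
```

An integration loop would call `stop_reason` after each accepted step and use `is_present` to decide when new reactions need to be added to the system of equations.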
A typical cell simulation might contain around 185 enzyme-catalysed reactions, 700 differential equations each containing 10 to 25 terms, around 200 chemical species and 10 enzymes. This may seem large, but compared to an actual cell, our simulations are mere caricatures.
7. Evolution of the Cell
A simple GA evolves the cell models. The GA runs the population of cell simulations over a network of 20 processors, spawning each cell simulation on a separate processor. Simulations queue until a processor becomes available. Fitness-proportionate selection with the roulette wheel algorithm is used to find breeding pairs. Mutation (with probability of bit mutation 0.005), crossover (approximately four points per genome) and inversion (with probability 0.15 per genome) are applied. A steady-state GA with one population (size 100) is used. Breeding occurs only when at least 75 members of the population have finished simulation.
A fitness function (Kennedy, 1998) scores each cell simulation. This function is a combination of six metrics in [0, 1]. The six components of fitness were derived with the intention that they would form a canonical set. Qualitatively they are:
- the change in volume of the cell over the course of its lifetime relative to a target value (cells that stay the same size or grow very slightly are rewarded)
- the "time" the cell lived
- how closely correlated the operon switching regions are to the membrane builder and breaker chemicals
- the complexity of the metabolic reaction graph
- the number of membrane building and breaking species available to the cell
- the number of differential equations in the cell
Additionally, the fitness of cells that died during the simulation is halved. Fitness values are bounded above at 63. The metrics conflict with one another; consequently, the maximum fitness score cannot be realised.
8. Results
Experiments were conducted breeding populations of cells to adapt to an environment. Two different environments were examined. The first environment, "Grow", makes the cell grow larger and the second environment, "Shrink", causes cells to get smaller if the cell takes no action. As well as containing chemicals that affect the cell growth process, both environments also contain a number of other chemical species held at the same concentration as in the cell. These chemicals (table 1) are used as building blocks for the cell. The only difference between experiments is the concentration of chemicals.
8.1 Environment "Grow"
This environment causes cell growth by maintaining a higher concentration of membrane building chemicals than inside the cell. When membrane building chemicals diffuse into the cell, the number of water molecules in the cell increases (because the membrane is larger) and the concentrations of chemicals in the cell decrease (because concentrations are defined as the ratio of molecules of a substance to molecules of water in the cell). This causes reactions to occur more slowly. At the same time, the surface area of the cell increases, in turn causing diffusion to increase. If the cell does not act on the influx of membrane building chemicals in time, it can soon find itself unable to arrest the increasing rate of diffusion because reactions (including enzyme production) have slowed too much.

Figure 3: Fitness in Environment "Grow". [Plot of average fitness and maximum fitness (vertical axis "Fitness", 0 to 35) against "Cell ID or time" (horizontal axis, 0 to 1200).]
Figure 3 is a graph of the fitness attained during the run. The X-axis represents each experiment run. Values near the maximum fitness were reached very early. This occurs soon after breeding begins (after 75 experiments have run). The average fitness is calculated over all finished experiments currently in the population, excluding those that aborted. Fitness values on the lower half of the graph mostly represent cells that died, since the fitness function halves the fitness of all cells that died. Most of the population becomes fit and finds strategies to adapt to the environment.
There is a large diversity of genes in the cells in these experiments. Environments do not have one "right" solution with maximum fitness. There is a variety of valid approaches to the environment and all have similar (high) fitness. This allows the population to maintain different genes to solve the problem as long as valid contexts for their use (i.e. initial metabolic conditions) are available. The variety of valid strategies suggests that our evolution would benefit from some form of speciation. We plan to do this in future research.
We calculate the correlation between the fitness of a cell and each of its parents. This is done only for cells that lived till the end of their simulation period and whose parent also lived. The coefficient of determination ($r^2$) between a cell and its mother is 0.46 and between a cell and its father is 0.05. This suggests that offspring fitness is more closely correlated to mothers than fathers. The difference between parents is that initial chemical species are inherited only from the mother cell, whilst the genome is inherited from both parents. The context of the genome, then, has an important effect on the fitness of a cell.
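For reference, the coefficient of determination between paired fitness values can be computed as the squared Pearson correlation. The sketch below assumes that is the statistic meant here; the function name is hypothetical and no data from the experiments are reproduced.

```python
def coefficient_of_determination(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy * sxy / (sxx * syy)
```

Here `xs` would hold the fitness of each surviving parent (mother or father) and `ys` the fitness of the corresponding surviving offspring.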
Since children often scored different fitness values than their parents, we were interested in the stability of the solutions found. Taking the final population of cells, we continued running each cell for a further time of $1.5 \times 10^{5}$ and calculated another fitness value. When we examined the cells we found that there seemed to be two kinds: myopic cells with short-term solutions (that decreased in fitness when run for longer); and cells with conservative longer-term solutions (that either increased or maintained fitness). The myopic cells implementing short-term solutions tend to have slightly higher fitness than the more stable cells. The higher fitness is not realised in later generations, when the usefulness of the short-term strategy starts to fail.

Table 1: Primary chemicals in the environments and in initial population of cells and their concentrations.
| Species | Env. "Grow" | Env. "Shrink" | Cell | Purpose |
|---|---|---|---|---|
| o0 - o9 | 5.0 × 10^-5 | 5.0 × 10^-5 | 5.0 × 10^-5 | Monomers |
| o01, o23 | | | 5.0 × 10^-7 | Spider building blocks |
| o30 | 5.0 × 10^-7 | 5.75 × 10^-7 | 5.0 × 10^-7 | Membrane breaker |
| o44 | 5.75 × 10^-7 | 5.0 × 10^-7 | 5.0 × 10^-7 | Membrane builder |
| o0123 | 5.0 × 10^-5 | 5.0 × 10^-5 | 5.0 × 10^-5 | Spider |
| o5432, o2468, o9673, o7345 | 1.0 × 10^-6 | 1.0 × 10^-6 | | Other building blocks |

8.1.1 Case Studies of Individual Cells in Environment "Grow"
We will now examine two cells living in environment "Grow". Usually we found that many chemicals exist in the cell in small concentration and that it is often difficult to determine exactly what strategy is used by a cell. In the following case studies we present greatly abbreviated views of each cell. We show all operons, but describe only the pertinent chemicals in each cell.
The first cell we discuss is the one that scored the maximum fitness in the final population. It scored 31.0092 initially and 30.4144 for the second time period. Table 2 shows its two operons. Note that the number between braces in the table is the regulation information for the operon.

Table 2: Operons in first example cell in environment "Grow"
| Kind of Switch | Genes | % active |
|---|---|---|
| Constitutive | e51709 | 100 |
| Inducible {066} | e137 | 21 |

Three spider species are available for transcription in this cell. These are o0123 (with concentration $2.8 \times 10^{-5}$), o01237 ($7.54 \times 10^{-14}$) and o10123 ($7.54 \times 10^{-14}$). Table 3 lists the membrane building and breaking species in the cell.

Table 3: Membrane building and breaking chemicals in first example cell in environment "Grow"
| Kind | Species | Conc. |
|---|---|---|
| Building | o44 | 5.43 × 10^-7 |
| Building | o446 | 1 × 10^-15 |
| Building | o447 | 1 × 10^-15 |
| Building | o73451, o7345130 | Virtually none |
| Breaking | o30 | 4.73 × 10^-7 |
| Breaking | o130 | 9.8 × 10^-16 |
| Breaking | o1309, o5130, ... and 11 more complicated breakers including o7345130 | Virtually none |

Enzyme e137 is used to build new spiders but its operon does not express well. Both enzymes build membrane breaking molecules to fight against the o44 molecules that are diffusing in. This appears to be a sound strategy since the second fitness is similar to the original fitness. There are, however, three weaknesses with it. The same enzyme (e51709) that is used to build many of the membrane breaking species also makes builder molecules by appending polymers starting with 1 to the polymer o7345. The enzymes are not sufficiently specific. As well, the builder molecules that are constructed are long and therefore will diffuse out very slowly. The next flaw in the strategy is the molecule o7345130 that is made. This molecule acts as both a builder and breaker molecule. As such, its production wastes valuable resources. The final weakness is that many of the breaker molecules are produced using the existing breaker species o30. This chemical becomes bound and will reduce the overall fitness of the cell because it is not available to break down the membrane. Note also that o30 will diffuse into the cell because it has lower concentration than the environment. However, o44 will also diffuse in and is still in higher concentration than o30, causing the cell to grow. It is interesting to note that the task faced by a cell changes as it reacts to the environment. Fit cells need to be able to adapt their response to the environment.
The other cell in this environment that we will examine is the cell in the final population that recorded the
largest increase in fitness when it was run for the additional time period. Initially it scored 24.3847. After
continuing the run, its fitness rose to 28.9469. Table 4
shows its four operons.
Table 4: Operons in second example cell in environment "Grow"
Kind of Switch         Genes    % active
Constitutive           e07      100
Inducible {0}          e619     15
Repressible {6}        e53      93
Inducible {751718}     e38      55
Table 5: Membrane building and breaking chemicals in the
second example cell in environment "Grow"
Building chemicals                               Conc.
o44                                              5.43 × 10^-7
Breaking chemicals                               Conc.
o30                                              4.7 × 10^-7
o96730123                                        6.1 × 10^-8
o07; o075; o306; ... and 44 more complicated
breakers                                         Virtually none
Figure 4: Fitness in Environment "Shrink" (average and maximum fitness plotted against cell ID or time)
As well as the four enzymes encoded on its genome,
this cell has inherited another enzyme from its mother
(e06) but does not have the gene to produce it. This
chemical will soon degrade into its two constituent
molecules. Cells often have small amounts of chemicals
they cannot produce. These are inherited from cells in
the maternal cell-line.
This cell has five spiders at its disposal: o0123 (2.46 × 10^-5), o01238 (virtually none), o96730123 (6.1 × 10^-8),
o196730123 (virtually none) and o967301238 (virtually
none). Table 5 shows how it can modify the membrane.
The cell has managed to build only membrane breaking
chemicals, many of which do not rely solely on modifying
the important chemical o30.
This strategy would seem to be superior because it is
more specific. Unfortunately this strategy scores less in
the short term than that of the other cell.
8.2 Environment "Shrink"
Figure 5: Fitness in Both Experiments (plot titled "Comparative Environments": average and maximum fitness for environments "Shrink" and "Grow" against cell ID or time)
The second environment we simulate acts in the opposite way: it promotes cell shrinkage by bathing the cell in
"breaker" chemicals. Any decrease in cell size causes an
increase in concentration for all chemicals in the cell because the number of water molecules in the cell lessens.
This increase in concentration causes two effects in the
cell: an increase in the chemical reaction speed; and the
possibility that one of the chemical species with high
concentration now moves over the threshold (1.0 × 10^-4)
to poison the cell. If the cell cannot respond to the environmental pressure, it can quickly move towards death,
gathering speed as it continues to shrink in a positive
feedback cycle. The longer the cell waits before acting,
or takes to act, the more difficult it is to stop the shrinkage.
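The effect of shrinkage on concentration described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the amount of a chemical stays fixed while the volume falls by a constant fraction per step, so that concentration scales as 1/volume. The function name, the shrink rate and the starting concentration are hypothetical and are not taken from the model; only the poison threshold (1.0 × 10^-4) comes from the text.

```python
POISON_THRESHOLD = 1.0e-4  # threshold quoted in the text


def steps_until_poisoned(concentration, shrink_per_step=0.1, max_steps=1000):
    """Count the shrink steps taken before a chemical crosses the threshold.

    Assumption (not from the paper): the amount of chemical is constant,
    so its concentration is amount / volume and rises as the cell shrinks.
    Returns None if the threshold is never crossed within max_steps.
    """
    volume = 1.0                      # hypothetical starting volume
    amount = concentration * volume   # fixed amount of the chemical
    for step in range(1, max_steps + 1):
        volume *= 1.0 - shrink_per_step
        if amount / volume > POISON_THRESHOLD:
            return step
    return None


# A chemical at half the threshold becomes poisonous once the
# volume has fallen below half of its starting value.
print(steps_until_poisoned(5.0e-5, shrink_per_step=0.1))
```

In this toy setting a cell that stops shrinking early never reaches the threshold, which matches the observation that delay makes the shrinkage harder to stop.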
Figure 4 shows the fitness attained in environment
"Shrink". As in environment "Grow", values near the
maximum fitness were reached very early.
Figure 5 plots the graphs for both environments together. The average fitness for both environments starts
out similarly and at the end both populations are mostly
filled with fit individuals. The fitness of cells in environment
"Grow", however, is consistently higher than that of cells
in environment "Shrink". As the membrane building and
breaking coefficients (k1 and k2 in equation 14) are set
to the same values, the explanation for this is in the formulation of the fitness function: the volume component
of fitness for cells that grow a little is higher than for
cells that shrink.
Results of the relationship between offspring and parent fitness values and the stability of solutions are similar
to the other environment and are not reproduced here.
Table 6: Operons in the first example cell in environment "Shrink"
Kind of Switch         Genes    % active
Constitutive           e59      100
Inducible {124}        e53      44
Table 7: Membrane building and breaking chemicals in the
first example cell in environment "Shrink"
Building chemicals                               Conc.
o44                                              5.4 × 10^-7
o744                                             2.45 × 10^-15
Breaking chemicals                               Conc.
o30                                              6.2 × 10^-7
o302; o630; o8302                                2.5 × 10^-15
o830                                             1.3 × 10^-8
o530; o5302; o734530; o7345302                   Virtually none
8.2.1 Case Studies of Individual Cells in Environment "Shrink"
The first cell we dissect is the one in the final population that scored the highest fitness (31.0556). This cell
contains two operons (see Table 6).
This cell has only one spider molecule (o0123), which
it gets by default from the initial conditions. The
membrane building and breaking molecules available are
listed in Table 7. Only two enzymes exist in the cell: e53
and e59. Enzyme e53 is used to build many of the breaking chemicals (o530, o5302, o734530 and o7345302). The
other building and breaking chemicals were inherited
from the mother.
This cell line has existed for some time because the
concentrations of o30 and o44 are higher than in the environment. This only occurs if the cell has shrunk considerably. As the total of the membrane breaking concentrations is greater than the building concentrations
the cell will continue to shrink.
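As a rough check of this comparison, the sketch below sums the concentrations listed in Table 7. It assumes, hypothetically, that the membrane's tendency to grow is k1 times the total builder concentration minus k2 times the total breaker concentration, with k1 = k2 as stated for these experiments. The actual form of equation 14 is not restated here, and the function name is illustrative only.

```python
# Concentrations as listed in Table 7. Species reported as
# "virtually none" are omitted; o302, o630 and o8302 are listed
# together in the table and are kept as a single entry here.
building = {"o44": 5.4e-7, "o744": 2.45e-15}
breaking = {"o30": 6.2e-7, "o302; o630; o8302": 2.5e-15, "o830": 1.3e-8}


def net_membrane_tendency(builders, breakers, k1=1.0, k2=1.0):
    """Weighted builder total minus weighted breaker total.

    Assumption (not from the paper): a negative value means the
    membrane is being broken down faster than it is being built.
    """
    return k1 * sum(builders.values()) - k2 * sum(breakers.values())


tendency = net_membrane_tendency(building, breaking)
print("shrinks" if tendency < 0 else "grows or holds")
```

With these values the breakers outweigh the builders, consistent with the statement that the cell will continue to shrink.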
The strategy used by this cell seems strange because
the genome instructs the metabolism to build membrane
breaking molecules which will cause the cell to shrink further instead of grow. However, it is reducing the amount
of o30 by putting it into a bound state whilst joining it
to other chemicals. Such a contrary strategy was never
envisioned by us when designing the model. In the short
term, this is a good strategy, but it breaks down after
some time. This is because the new chemicals produced
eventually become unbound and, since they are longer,
diffuse out of the cell more slowly. Initially, this cell
achieved a fitness of 31.0556. When it was run longer,
the fitness reduced to 25.1239.
Unfortunately, the cell has lost the ability to build
o744. Presumably, this is because that ability did not
translate to an immediate large increase in fitness. This
is most probably because the act of producing o744 temporarily puts o44 and o744 into a bound state and therefore decreases the fitness of the cell. In the long term,
however, the fitness will recover. This strategy, however,
does not compete well with the short-term solution already mentioned.
Table 8: Operons in the second example cell in environment "Shrink"
Type of Switch         Genes       % active
Inducible {3}          e770        17
Constitutive           e53; e59    100
Table 9: Membrane building and breaking chemicals in the
second example cell in environment "Shrink"
Building chemicals                               Conc.
o44                                              5.28 × 10^-7
Breaking chemicals                               Conc.
o30                                              6 × 10^-7
o07345                                           2.4 × 10^-15
o530; o073453; o073459; o707345;
o734530; o0734530; o073459673;
o2468734530                                      Virtually none
The second cell we will examine is the cell in the final
population whose fitness improved by the most when it
was run for the further time period: from 27.8078 to
28.8712.
This cell, as shown in Table 8, contains two operons.
Additional spider molecules are available to this cell.
They are o0123 (3 × 10^-5), o01238 (2 × 10^-8), o70123
(virtually none) and o701238 (virtually none). The last
two of these are constructed using the enzyme e770 but
not enough has been made to be of any practical use.
The strategy used by this cell does not differ much from
that of the first cell. It uses enzymes e53 and e59 to bind more of
the existing membrane catalysing species into complexes
so that they cannot shrink the cell. The cell achieved a
higher fitness when the run was continued because it
can bind o30 and o07345 into many different complexes.
We would expect that this strategy would start to break
down after a longer period.
In general, cells are unable to use strategies that break
o30, for example, into its constituent monomers. This is
because the concentration of monomers is much higher
than that of o30 and therefore the reversible reaction
builds more o30 rather than breaking it down.
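The argument above can be illustrated with a minimal sketch that assumes simple mass-action kinetics for a reversible joining reaction a + b <-> ab. This is an assumption made for illustration rather than the rate law of the model, and the rate constants and concentrations (in arbitrary units) are hypothetical.

```python
def net_breakdown_rate(polymer, monomer_a, monomer_b, k_join=1.0, k_split=1.0):
    """Net rate at which the polymer is split into its two parts.

    Assumption (not from the paper): mass-action kinetics, with
    splitting proportional to the polymer concentration and joining
    proportional to the product of the two part concentrations.
    Positive: polymer is broken down. Negative: more polymer is built.
    """
    return k_split * polymer - k_join * monomer_a * monomer_b


# Parts far more concentrated than the polymer: the reaction runs
# in the joining direction and the polymer accumulates.
print(net_breakdown_rate(polymer=1.0, monomer_a=10.0, monomer_b=10.0))

# Parts scarce relative to the polymer: splitting dominates.
print(net_breakdown_rate(polymer=1.0, monomer_a=0.1, monomer_b=0.1))
```

Under these assumptions an enzyme that only speeds up the reversible reaction cannot lower the polymer concentration while its parts remain abundant.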
9. Conclusion
A complex model of a cell was presented. Populations of cells evolved to adapt to an environment in
favourable ways: control systems (genomes, spiders and
metabolism) were evolved to achieve adaptation to two
environments. Reaction graphs of the cells are very complicated systems and only simplified descriptions were
given in this paper for reasons of clarity.
Cells frequently chose short-term strategies over more
stable longer term strategies. This is because such
strategies score higher fitness in the short term. However, children of these cells are, in general, less fit. Distinguishing between short and long-term solutions is not
possible for the fitness function. To see why, consider
how to score the parent. The fitness function must either
examine the fitness of offspring or search for short-term
solutions in the metabolism of the parent. The former
task is impossible because a cell must be scored before
it issues offspring. The latter task is difficult because it
presupposes solutions to the environment and therefore
particular kinds of cells. One of our goals with the design of the fitness function was to keep it as abstract as
possible. Consequently, the fitness should be determined
from the actions of the cell rather than details of the actual enzymes coded or the reaction graph embodied in
the metabolic pathways.
We have also used the tool developed for these experiments in a number of other contexts: (i) exploration of
the use of gene expression and regulation algorithms in
evolutionary computation (Kennedy and Osborn, 2000);
(ii) examination of the effects of Lamarckian evolution
of the cell models (Kennedy and Osborn, 1999); and
(iii) preliminary examination of the effects of genetic
operators on the evolution of single-celled organisms
(Kennedy, 1998). In future work we wish to explore (i)
the evolution of adaptive behaviour in more dynamic
environments; (ii) application of the model to solving
problems using a computational paradigm of chemical
reactions; and (iii) modification of the model to apply it
to broader problems in evolutionary computation (rather
than strictly biological kinds of problems).
Regulation of genes is not emphasised in the current
work. Most of the time, we find that operons work well if
they are constitutive. More complex environments that
change the stimuli presented to cells over time (i.e. dynamic environments) would make regulation more important. We wish to see how evolution of control of the
adaptive behaviour in cells changes in such conditions.
In order to apply the model to a wider range of problems in evolutionary computation, we plan to discretise
the cell model, that is, to move from an encoding of differential equations to finite state machines and from floating point concentrations to integers or symbols. We believe this would make the model more applicable to other
areas. The cell model would become more like (linear)
genetic programming or a classifier system.
References
Alberts, Bray, et al. (1994). Molecular Biology of the
Cell. Garland Publishing, New York, third edition.
Bagley, R. J. and Farmer, J. D. (1991). Spontaneous
emergence of a metabolism. In Langton, C. G.,
Farmer, J. D., and Rasmussen, S., (Eds.), Artificial Life II, volume x of SFI Studies in the Sciences
of Complexity. Addison-Wesley.
Farmer, J. D., Kauffman, S. A., and Packard, N. H.
(1986). Autocatalytic replication of polymers. Physica 22D, pages 50-67.
Fleischer, K. and Barr, A. H. (1994). A simulation
testbed for the study of multicellular development:
The multiple mechanisms of morphogenesis. In
Langton, C. G., (Ed.), Artificial Life III, volume
xvii of SFI Studies in the Sciences of Complexity.
Addison-Wesley.
Holland, J. H. (1992). Adaptation in Natural and
Artificial Systems. MIT Press, Cambridge, Massachusetts, first MIT Press edition.
Jacobi, N. (1995). Harnessing morphogenesis. Cognitive Science Research Paper 423, School of Cognitive and Computing Sciences, University of Sussex.
Kennedy, P. J. (1998). Simulation of the Evolution of
Single Celled Organisms with Genome, Metabolism,
and Time-Varying Phenotype. PhD thesis, University of Technology, Sydney.
Kennedy, P. J. and Osborn, T. R. (1999). A coevolutionary model of a single-celled organism with
double-stranded genome and time-varying phenotype. In McKay, B., Tsujimura, Y., Sarker, R., Namatame, A., Yao, X., and Gen, M., (Eds.), Proceedings of The Third Australia-Japan Joint Workshop
on Intelligent and Evolutionary Systems, pages 145-152.
Kennedy, P. J. and Osborn, T. R. (2000). Operon expression and regulation with spiders. To be presented at Gene Expression workshop at the 2000
Genetic and Evolutionary Computation Conference
(GECCO-2000), Las Vegas.
Kittel, C. (1971). Introduction to Solid State Physics.
John Wiley and Sons, New York, fourth edition.
Rosenberg, R. S. (1967). Simulation of Genetic Populations with Biochemical Properties. PhD thesis,
University of Michigan.