Mathematical Biosciences 253 (2014) 63–71
On the derivation of approximations to cellular automata models and
the assumption of independence
K.J. Davies ⇑, J.E.F. Green, N.G. Bean, B.J. Binder, J.V. Ross
School of Mathematical Sciences, University of Adelaide, South Australia, Australia
Article info
Article history:
Received 15 November 2013
Received in revised form 10 April 2014
Accepted 15 April 2014
Available online 24 April 2014
Keywords:
Cellular automata
Continuum approximations
Agent-based simulation
Motility and proliferation
Abstract
Cellular automata are discrete agent-based models, generally used in cell-based applications. There is
much interest in obtaining continuum models that describe the mean behaviour of the agents in these
models. Previously, continuum models have been derived for agents undergoing motility and proliferation processes; however, these models hold only under restricted conditions. In order to narrow down
the reason for these restrictions, we explore three possible sources of error in deriving the model. These
sources are the choice of limiting arguments, the use of a discrete-time model as opposed to a continuous-time model and the assumption of independence between the state of sites. We present a rigorous
analysis in order to gain a greater understanding of the significance of these three issues. By finding a limiting regime that accurately approximates the conservation equation for the cellular automata, we are
able to conclude that the inaccuracy between our approximation and the cellular automata is completely
based on the assumption of independence.
© 2014 Elsevier Inc. All rights reserved.
1. Introduction
Cellular automata (CA) are discrete agent-based mathematical
models that allow for an individual agent’s behaviour to depend
upon the state of its neighbourhood. As such they are often an ideal
tool for modelling discrete systems composed of interacting individuals. A discipline in which they have seen widespread
use is cell biology, where they have been used, for example, to
model and understand processes such as tissue and tumour growth
[5,16,17,24,26], and wound healing [8,19]. We focus herein on CA
models appropriate for such applications, in which the biological
processes are cell motility and cell proliferation.
As the cell biological processes that we seek to understand are
likely to be evolving continuously in time, we believe a
continuous-time model to be most appropriate. However, the
literature extensively considers discrete-time CA models
[1,4–6,11,18,22,23], in particular when deriving approximations
to the average behaviour of these processes, and so we begin our
analysis with discrete-time CA models in Section 2 before
considering continuous-time CA models in Section 3.
Of much interest, from both a practical and a theoretical perspective, is the derivation of approximations which capture the average
behaviour of the CA. Such continuum models might allow for new
insight and understanding of these important biological processes.
It has been shown by numerical experiments that these continuum
approximations are only valid under restrictive conditions on
the probabilities of cell movement and proliferation (that is, where
cell movement dominates cell proliferation) [22], limiting
the range of scenarios and applications which may be considered
[2,4,7,15,18,22].
⇑ Corresponding author. Tel.: +61 8 8313 1606.
E-mail address: [email protected] (K.J. Davies).
http://dx.doi.org/10.1016/j.mbs.2014.04.004
0025-5564/© 2014 Elsevier Inc. All rights reserved.
A careful analysis of the development of existing continuum
approximations is undertaken in this paper. This gives insight into
the implicit assumptions regarding the magnitudes of the motility
and proliferation probabilities which underlie their derivation, and
shows how new approximations can be developed when these
assumptions are relaxed.
We show that the assumption of independence between
the state of different sites in the CA is a key issue with
regard to the inaccuracy of existing continuum approximations, so long as proliferation is present; and that, when there
is no proliferation, the approximation obtained by assuming
independence is identical to that found when the independence
assumption is relaxed. However, the earlier continuum models
are shown to perform unexpectedly well in approximating
the behaviour of the CA even when proliferation is included.
We show that this is largely due to a fortuitous cancellation
of errors.
2. Discrete-time CA model
2.1. Defining the model
We begin by introducing our one-dimensional CA model. The
model is a lattice-based system in which each site on the lattice
is in one of two states: occupied or vacant. Each occupied site contains an agent whose behaviour is determined by some process.
Two processes will be considered in our model, motility and
proliferation.
We say that the cell size and hence the lattice spacing is $\Delta x$, and
so the position of the $i$th lattice site is $x_i = i\Delta x$ for $i = 1, 2, \ldots, X$. For
simplicity, we use periodic boundary conditions, resulting in a connection between site 1 and site $X$. We define a time step $\Delta t$ and
consider the state of the process at discrete times $t_k = k\Delta t$ for
$k = 0, 1, \ldots$.
Consider an agent at site $i$ on the lattice who is to undergo a
motility or proliferation event. In both cases, an adjacent site is
chosen uniformly at random, that is, site $i - 1$ or site $i + 1$. If the
adjacent site is vacant then the event will be carried out, otherwise
the event will be aborted [9]. In the case of motility, the agent will
move from site $i$ to the new site, resulting in site $i$ becoming unoccupied and the new site becoming occupied. In the case of proliferation, both the new site and site $i$ will become occupied.
There are many ways in which these events can and have been
implemented [3,12,22,23,25,26]. Each of these different implementations can produce different average results, although in most
cases the differences in the average CA data are minor. For example, in many of these models, the order in which events take place
in the model is arbitrarily chosen, unmotivated or in some cases
not made clear. The importance of this will be discussed in greater
detail in Section 3.1.
For the purposes of comparison, we use the following implementation as outlined in Simpson et al. [22]. We choose $2N(t_k)$
agents uniformly at random with replacement, where $N(t_k)$ is the
number of agents in the system at time $t_k$. The first $N(t_k)$ agents
are each given the opportunity to perform a motility event. The
probability of each agent performing the event is $P_m$. If the motility
event is aborted, this is still regarded as an event taking place (and
similarly for proliferation events). The remaining $N(t_k)$ agents are
then given the opportunity to perform a proliferation event, each
with probability $P_p$. The time step is completed by moving from
$t_k$ to $t_{k+1}$ and this procedure is repeated. A realisation of the model
can be seen in Fig. 1.
In order to analyse the mean behaviour of the system described
above we consider the ensemble average occupancy at position $x_i$
at time $t_k$, denoted $C_i^k$. This value can be calculated numerically
by averaging the occupancy over many realisations of the CA system. This allows us to easily validate the results obtained when
deriving continuum approximations.
2.2. Deriving a continuum approximation
We aim to derive a partial differential equation (PDE) for the
ensemble average occupancy. However, it is axiomatic that a continuous PDE model is only likely to provide a reasonable approximation of the discrete system when large numbers of agents are
present, and the average occupancy of sites varies over length
and time scales which are much larger than the agent size ($\Delta x$)
and time step ($\Delta t$). In order to derive a continuum model, we must
hence identify these characteristic macroscopic length and time
scales for the system [14]. For the length scale, denoted $L$, it would
be natural to take the size of the region of the domain initially
occupied by cells whilst for the time scale, denoted $T$, it could naturally be the population doubling time or the average time taken
for a cell to move a distance $L$.
We wish to approximate the ensemble average occupancy,
$C_i^k$, of the CA model with the continuous function $C(x, t)$, such
that $C_i^k \approx C(x_i, t_k)$, where $x_i = i\Delta x$ and $t_k = k\Delta t$, and desire that
this provides a reasonable approximation when the occupancy
varies over the macroscopic scales. We hence assume that the
ratios of the micro- and macroscopic length and time scales
are small, i.e.
$$
\epsilon = \frac{\Delta x}{L} \ll 1, \qquad \delta = \frac{\Delta t}{T} \ll 1,
$$
and exploit this separation of scales to derive the PDE model.
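We now consider the change in the ensemble average occupancy of site $i$, $C_i^k$, from time $t_k$ to time $t_{k+1}$. Assuming that the state
of each lattice site is independent of the state of every other lattice
site, we obtain the following discrete conservation equation
$$
\begin{aligned}
C_i^{k+1} &= C_i^k + \left(\frac{P_m}{2} + \frac{P_p}{2}\right) C_{i-1}^k \left(1 - C_i^k\right) + \left(\frac{P_m}{2} + \frac{P_p}{2}\right) \left(1 - C_i^k\right) C_{i+1}^k \\
&\quad - \frac{P_m}{2} \left(1 - C_{i-1}^k\right) C_i^k - \frac{P_m}{2} C_i^k \left(1 - C_{i+1}^k\right) + \mathrm{HoT} \\
&= C_i^k + \frac{P_m}{2} \left(C_{i-1}^k - 2 C_i^k + C_{i+1}^k\right) + \frac{P_p}{2} \left(1 - C_i^k\right) \left(C_{i-1}^k + C_{i+1}^k\right) + \mathrm{HoT},
\end{aligned}
\tag{2.1}
$$
where $i = 1, \ldots, X$ and HoT denotes higher order terms.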
Eq. (2.1) says that the new average occupancy of each site will
be the old average occupancy plus some terms that describe the
change over that time step. It is derived by considering all different
possible transitions into and out of site $i$. Consider, specifically,
deriving the probability of an agent at site $i - 1$ moving to site $i$.
An agent has a probability of motility $P_m$ and has probability of
$1/2$ of moving in a given direction. Further, $C_{i-1}^k$ is the probability
of site $i - 1$ being occupied at time $t_k$ and $(1 - C_i^k)$ is the probability
of site $i$ being vacant at time $t_k$. The term $\frac{P_m}{2} C_{i-1}^k (1 - C_i^k)$ can be
obtained by assuming independence between sites and taking
the product of each of these probabilities.
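The step between the two forms of Eq. (2.1) can be checked by expanding the four motility terms. As a short worked check (using only the symbols already defined), the products of occupancies cancel in pairs:
$$
\frac{P_m}{2}\left[ C_{i-1}^k \left(1 - C_i^k\right) + \left(1 - C_i^k\right) C_{i+1}^k - \left(1 - C_{i-1}^k\right) C_i^k - C_i^k \left(1 - C_{i+1}^k\right) \right] = \frac{P_m}{2}\left( C_{i-1}^k - 2 C_i^k + C_{i+1}^k \right).
$$
The motility contribution is therefore linear in the occupancies, whereas the proliferation gain terms have no matching loss terms, so the products $(1 - C_i^k) C_{i \pm 1}^k$ survive.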
Fig. 1. A single realisation of the CA model progressing with time. Each row (from top to bottom) corresponds to $k = 0, 1000, 3000, 5000$ where $k$ is the number of time steps
since the start of the realisation. This process is purely proliferation, with $P_m = 0$ and $P_p = 1/200$. This realisation contains $X = 50$ sites of which 11 are initially occupied by
agents.
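A realisation of the kind shown in Fig. 1 could be generated along the following lines. This is a minimal Python sketch of the implementation described in Section 2.1, not the authors' code. The function names, the 0-based site indices, the placement of the 11 initial agents on sites 20 to 30, and the choice to let daughters be selected later in the same step are all assumptions made for illustration.

```python
import random

def ca_step(agents, occ, p_m, p_p, rng):
    """One time step on a periodic lattice; agents and occ are updated in place."""
    X = len(occ)
    n = len(agents)  # N(t_k), fixed at the start of the step
    # First N(t_k) draws attempt motility, the remaining N(t_k) proliferation.
    for prob, is_proliferation in ((p_m, False), (p_p, True)):
        for _ in range(n):
            a = rng.randrange(len(agents))  # agent chosen with replacement
            if rng.random() >= prob:
                continue  # the agent does not attempt the event
            i = agents[a]
            j = (i + rng.choice((-1, 1))) % X  # adjacent site, chosen uniformly
            if occ[j]:
                continue  # target occupied: the event is aborted
            occ[j] = 1
            if is_proliferation:
                agents.append(j)  # daughter placed at j, parent stays at i
            else:
                occ[i] = 0  # agent moves from i to j
                agents[a] = j

def simulate(X=50, occupied=range(20, 31), p_m=0.0, p_p=1 / 200,
             steps=5000, seed=0):
    """Return the lattice occupancy (list of 0/1) after the given number of steps."""
    rng = random.Random(seed)
    occ = [0] * X
    agents = list(occupied)
    for i in agents:
        occ[i] = 1
    for _ in range(steps):
        ca_step(agents, occ, p_m, p_p, rng)
    return occ
```

Averaging `simulate` outputs over many seeds gives a numerical estimate of the ensemble average occupancy $C_i^k$.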
Note that we have explicitly included only interactions between
adjacent pairs of sites as we only seek a first order accurate equation.
By first order accurate, we mean that we only consider terms of
order $P_m$ and $P_p$. The higher order terms (HoT) that exist are of the
form $P_m^a P_p^b$ where $a + b \geq 2$ and $a$ and $b$ are non-negative integers.
These higher order terms may appear, in discrete time, when one
site undergoes multiple interactions in a single time step: for example if an agent at site $i - 1$ tries to move to site $i$ and an agent at site
$i + 1$ attempts to proliferate and deposit a daughter at site $i$. We will
be able to quantify these explicitly upon defining a limiting regime.
The conservation equation may change form depending on the
specific details of the implementation of the motility and proliferation events. This effect is particularly seen in the higher order
terms, with the first order conservation Eq. (2.1) being valid for
many implementations.
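We assume the existence of a continuous approximation,
$C(x_i, t_k)$, for the ensemble average occupancy, $C_i^k$. Upon substituting
this into the simplified Eq. (2.1) and expanding the $C(x_i \pm \Delta x, t_k)$
and $C(x_i, t_k + \Delta t)$ terms using Taylor series, we obtain the following
expression
$$
\begin{aligned}
C + \Delta t \frac{\partial C}{\partial t} + O\left(\Delta t^2\right) - C &= \frac{P_m}{2} \left( \sum_{j=0}^{\infty} \frac{(-\Delta x)^j}{j!} \frac{\partial^j C}{\partial x^j} - 2C + \sum_{j=0}^{\infty} \frac{\Delta x^j}{j!} \frac{\partial^j C}{\partial x^j} \right) \\
&\quad + \frac{P_p}{2} (1 - C) \left( \sum_{j=0}^{\infty} \frac{(-\Delta x)^j}{j!} \frac{\partial^j C}{\partial x^j} + \sum_{j=0}^{\infty} \frac{\Delta x^j}{j!} \frac{\partial^j C}{\partial x^j} \right) + \mathrm{HoT},
\end{aligned}
\tag{2.2}
$$
where all quantities are evaluated at $(x, t) = (x_i, t_k)$.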
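This expression simplifies to
$$
\Delta t \frac{\partial C}{\partial t} = P_m \sum_{j=1}^{\infty} \frac{\Delta x^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} + P_p (1 - C) \sum_{j=0}^{\infty} \frac{\Delta x^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} + O\left(\Delta t^2\right) + \mathrm{HoT}.
\tag{2.3}
$$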
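The simplification rests on the cancellation of the odd-order terms when the two Taylor series are added. As a brief worked step:
$$
\sum_{j=0}^{\infty} \frac{(-\Delta x)^j}{j!} \frac{\partial^j C}{\partial x^j} + \sum_{j=0}^{\infty} \frac{\Delta x^j}{j!} \frac{\partial^j C}{\partial x^j} = 2 \sum_{j=0}^{\infty} \frac{\Delta x^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}}.
$$
In the motility bracket the $-2C$ removes the $j = 0$ term, so that sum starts at $j = 1$. The proliferation bracket keeps its $j = 0$ term, which supplies the source term $P_p (1 - C) C$.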
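We now nondimensionalise by setting
$$
t = T\tilde{t}, \qquad x = L\tilde{x},
$$
where tildes indicate dimensionless variables. Our dimensionless
equation for $C$ is then
$$
\delta \frac{\partial C}{\partial \tilde{t}} = P_m \sum_{j=1}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial \tilde{x}^{2j}} + P_p (1 - C) \sum_{j=0}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial \tilde{x}^{2j}} + O\left(\delta^2\right) + \mathrm{HoT}.
\tag{2.4}
$$
For notational convenience, tildes will be dropped in further
equations.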
Our aim is now to obtain a PDE from (2.4) in the limit as
$\epsilon, \delta \to 0$. However, in order to do this, we must consider the
behaviour of $P_m$ and $P_p$ in this limit. We distinguish two possible
scenarios below.
2.3. Type A model
A sensible place to begin is the limiting regime for the motility
event based on previous work in the literature [22] where it is
shown that an agent on a random walk has a mean-squared displacement that scales with time. The motility mechanism
described in Section 2.1 is that of a random walk and hence the following scaling for the probability of motility seems appropriate
$$
\lim_{\epsilon, \delta \to 0} \frac{\epsilon^2 P_m}{2\delta} = \hat{P}_m^{*},
\tag{2.5}
$$
where $\hat{P}_m^{*}$ is $O(1)$.
Although the proliferation event does not have the same relation of mean-squared displacement scaling with time, it is clear
that the number of proliferation events will increase as the time
step increases. Hence, we adopt the natural scaling
$$
\lim_{\delta \to 0} \frac{P_p}{\delta} = \hat{P}_p,
\tag{2.6}
$$
where $\hat{P}_p$ is $O(1)$.
Thus $P_m$ is $O(\delta/\epsilon^2)$ and $P_p$ is $O(\delta)$ (and hence also $O(\delta/\epsilon^2)$). As
the higher order terms in Eq. (2.4) are of the form $P_m^a P_p^b$ where
$a + b \geq 2$ it can be concluded that the higher order terms are
$O(\delta^2/\epsilon^4)$ as $\epsilon, \delta \to 0$. Upon dividing Eq. (2.4) by $\delta$ the higher order
terms become $O(\delta/\epsilon^4)$.
Substituting these scalings into (2.4), and taking the limit
$\epsilon, \delta \to 0$, assuming $\delta \ll \epsilon^4$, the following expression is obtained
at leading order
$$
\frac{\partial C}{\partial t} = \hat{P}_m^{*} \frac{\partial^2 C}{\partial x^2} + \hat{P}_p (1 - C) C.
\tag{2.7}
$$
This is the expression obtained by Simpson et al. [22].
We consider the ratio of $P_m$ and $P_p$ under the limiting regime
given by Eqs. (2.5) and (2.6), obtaining
$$
\frac{P_m}{P_p} = O\left(\frac{1}{\epsilon^2}\right),
\tag{2.8}
$$
and as $\epsilon \ll 1$ is required to make the continuum approximation, this
implies that
$$
\frac{P_m}{P_p} \gg 1,
\tag{2.9}
$$
or equivalently, $P_m \gg P_p$. The latter was observed by Simpson et al.
[22], through numerical investigation, to be required for (2.7) to
give an accurate approximation to the average CA behaviour.
An immediately obvious problem with this approximation is
that when there is no motility ($P_m = 0$), the lack of a diffusion term
means the cells cannot spread from the region initially occupied.
However, as demonstrated in Fig. 1, proliferation alone does result
in the agents spreading.
Our goal is to develop a model that will operate accurately
without any restrictions on the parameters. We consider an alternative limiting regime below that achieves this. This result is not
only of mathematical interest but is important to many biological
systems, for example in breast cancer cell migration where motility and proliferation are estimated to be of the same order [21].
2.4. Type B model
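We now consider a second limiting regime in which both the
probabilities of motility and proliferation are scaled with
the time step, that is
$$
\lim_{\delta \to 0} \frac{P_m}{\delta} = \hat{P}_m,
\tag{2.10}
$$
and $\hat{P}_p$ is defined as in Eq. (2.6). Here, $\hat{P}_m$ and $\hat{P}_p$ are $O(1)$. In this
derivation, $P_m$ and $P_p$ are both $O(\delta)$ and hence the higher order
terms in (2.4) are $O(\delta^2)$. Dividing both sides of Eq. (2.4) by $\delta$ and
taking the limit as $\delta \to 0$ results in the continuous-time equation
$$
\begin{aligned}
\frac{\partial C}{\partial t} &= \hat{P}_m \sum_{j=1}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} + \hat{P}_p (1 - C) \sum_{j=0}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} \\
&= \hat{P}_p (1 - C) C + \left( \hat{P}_m + \hat{P}_p (1 - C) \right) \sum_{j=1}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}}.
\end{aligned}
\tag{2.11}
$$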
If we take the limit as $\epsilon \to 0$ in Eq. (2.11), we are left with only a
source term, which means that our model will be independent of
the motility process as $\hat{P}_m$ is removed from the equation. If we consider the CA process, it is clear that the solution must depend on
the motility process and hence requires more than just a source
term due to proliferation.
This discrepancy can be resolved by noting that the limit $\epsilon \to 0$
is singular. Hence, we anticipate the existence of a boundary layer
in the form of an invading front in which the average occupancy,
$C(x, t)$, changes very rapidly. By denoting the initial edge of the
front as $x_0$, we are able to introduce the coordinate $\zeta$, such that
$x = x_0 + \epsilon\zeta$. This transformation results in the following equation
$$
\frac{\partial C}{\partial t} = \hat{P}_p (1 - C) C + \left( \hat{P}_m + \hat{P}_p (1 - C) \right) \sum_{j=1}^{\infty} \frac{1}{(2j)!} \frac{\partial^{2j} C}{\partial \zeta^{2j}}.
\tag{2.12}
$$
This expression is of little practical utility, due to the presence
of the infinite sum. However, the presence of the $1/(2j)!$ in the
sum suggests that the contribution of the higher derivative terms
may be negligible. Whilst, in general, neglecting these terms may
be problematic (since it is another singular perturbation problem),
in practice, as we shall shortly demonstrate (see Fig. 2 in
Section 4.1), a good approximation of Eq. (2.1) is obtained
by retaining only the first term in the series. This results in the
nonlinear Fisher’s equation
$$
\frac{\partial C}{\partial t} = \hat{P}_p (1 - C) C + \frac{1}{2} \left( \hat{P}_m + \hat{P}_p (1 - C) \right) \frac{\partial^2 C}{\partial \zeta^2}.
\tag{2.13}
$$
In order to obtain an approximation to the ensemble average
occupancy, this equation is then solved numerically and transformed back into $x$ coordinates using $x = x_0 + \epsilon\zeta$. In order to do
this, $\epsilon$ will be considered as a small finite number (as opposed to
the above arguments where we consider $\epsilon \to 0$) and $x_0$ is the initial
position of the wave front. Note that $x_0$ is simply a fixed constant
and does not change as the wave front moves.
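One way to carry out this numerical solution is sketched below in Python. It is an illustration under stated assumptions, not the scheme used by the authors: an explicit finite-difference method on a periodic grid, written directly in $x$ coordinates. Substituting $\zeta = (x - x_0)/\epsilon$ into Eq. (2.13) gives a diffusivity $\frac{\epsilon^2}{2}(\hat{P}_m + \hat{P}_p(1 - C))$. Setting `type_b=False` drops the proliferation part of the diffusivity, which recovers Eq. (2.7) with $\hat{P}_m^{*} = \epsilon^2 \hat{P}_m / 2$. The function name, the argument names and the time-step rule are hypothetical.

```python
import math

def solve_pde(C0, h, eps, pm_hat, pp_hat, t_end, type_b=True):
    """Explicit finite differences on a periodic grid with spacing h (t_end > 0)."""
    C = list(C0)
    n = len(C)
    d_max = 0.5 * eps ** 2 * (pm_hat + (pp_hat if type_b else 0.0))
    # Time step limited by diffusive stability and by the reaction rate.
    dt = 0.2 * min(h * h / d_max if d_max > 0 else float("inf"),
                   1.0 / pp_hat if pp_hat > 0 else float("inf"),
                   t_end)
    steps = max(1, math.ceil(t_end / dt))
    dt = t_end / steps
    for _ in range(steps):
        new = []
        for i in range(n):
            c = C[i]
            # Centred second difference; C[i - 1] wraps around at i = 0.
            lap = (C[i - 1] - 2.0 * c + C[(i + 1) % n]) / (h * h)
            prolif_diffusion = pp_hat * (1.0 - c) if type_b else 0.0
            diffusivity = 0.5 * eps ** 2 * (pm_hat + prolif_diffusion)
            new.append(c + dt * (pp_hat * (1.0 - c) * c + diffusivity * lap))
        C = new
    return C
```

With one grid point per lattice site (`h = eps`) and no motility, the Type A variant leaves the initial block of cells unchanged, while the Type B variant spreads into the neighbouring vacant sites.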
This derivation is more general than using the limiting regime
in the Type A model as
$$
\lim_{\delta \to 0} \frac{P_m}{P_p} \sim O(1).
$$
As a consequence, it appears as though the Type B model has an
advantage over the Type A model due to reduced restrictions. For
the Type A model given by Eq. (2.7), we require both $P_m \gg P_p$
and $\epsilon^4 \gg \delta$, both of which are implicit assumptions in the derivation. In the Type B model, there exist no such restrictions, making
it a more flexible result. Furthermore, not being restricted to
$P_m \gg P_p$ allows the possibility that we may be able to predict the
average behaviour of the CA when proliferation is dominant in
the system.
Scaling arguments of the type presented in this section are a
common tool in physical applied mathematics [14], and we have
shown their usefulness in deriving continuum approximations
for CA models. As well as allowing us to be explicit about the
restrictions on the parameter values relevant to the two approximations (Type A and Type B), the size of neglected terms is also
made clear. Previously, to our knowledge, only the Type A approximation has been derived, in which the diffusion term depends
only on the probability of cell movement. In the absence of cell
motility, it predicts no spreading of the initial cluster of cells,
whereas spreading does occur in the CA as a result of proliferation
(see Section 4.1). By reconsidering the scalings, we obtained the
Type B approximation, which includes an additional diffusion term
that depends on the probability of proliferation, and hence permits
the cell cluster to spread even when there is no cell movement. It is
worth emphasising at this point that both Type A and Type B models are approximations of the conservation Eqs. (2.1), which we
obtained assuming independence between states. We observe in
Fig. 2, Section 4.1 that the extra terms obtained in the Type B
model are essential to obtain an accurate approximation to the
conservation equations which further demonstrates the importance of the choice of limiting regime.
In the next section, we relax all assumptions, except for that of
independence between states, by describing and implementing a
continuous-time CA model. This will allow us to determine the significance of the errors introduced by each process.
3. Continuous-time CA model
3.1. Defining a continuous-time model
We make two approximations through the use of Taylor series
in the derivation of the continuum approximation to the CA model
in Section 2. The existence of these approximations produces
errors between our Type A and Type B models and the true ensemble average occupancy. The magnitude of these errors can be controlled by the choice of $\Delta x$ and $\Delta t$. However, here we completely
eliminate these errors in order to further improve our approximation, leaving only the assumption of independence.
Eq. (2.1) shows the first order accurate, discrete conservation
equation. In deriving this, we made an assumption that only terms
of order $P_m$ and $P_p$ would be used and any term of the form $P_m^a P_p^b$
where $a + b \geq 2$ would be removed in the limit. This is essentially
equivalent to saying that on a single time step, site $i$ will be
involved in a maximum of one event, which will occur when $\delta$ is
sufficiently small.
In order to increase the accuracy of our approximation, we
could keep some of these terms, giving us a higher order conservation equation. Unfortunately, the terms involved in these derivations become complicated and abundant very quickly, making
this infeasible. Furthermore, as each model depends strongly on
the fine details of the implementation, a new model would need
to be derived each time the process is slightly changed. Instead,
we can define a similar type of model, in which time is continuous,
rather than discrete. In continuous time, multiple events cannot
occur at once as we consider the exact times that events occur
and hence these higher order terms do not appear in the
derivation.
To describe this continuous-time CA model, we recognise that
the discrete-time CA model described in Section 2 can be represented by a discrete-time Markov chain. The state space of this
Markov chain is given by the possible different configurations of
the sites. As each site can be either occupied or vacant, the size
of the state space is $2^X$. The probability of moving from one state
to another is only dependent on the current state and not the previously visited states, thus satisfying the memoryless property of
Markov chains.
In a similar way we can describe a continuous-time Markov
chain (CTMC) to represent our continuous-time CA. This is done
by considering the rate at which events will occur. The rate at
which agents implement the motility rule in a given direction is
$\hat{P}_m/2$ and the rate at which agents implement the proliferation rule
is $\hat{P}_p/2$. The motility and proliferation rules can be implemented
whenever an agent is adjacent to a vacant site.
We sum the total number of adjacent occupied and vacant sites
(as this is the criterion for motility to be applied) and multiply this
by $\hat{P}_m$ to get the total motility rate. This is also done for proliferation (as proliferation also requires adjacent occupied and vacant
sites) with multiplication by $\hat{P}_p$ giving the total proliferation rate.
Summing the two rates will give the rate at which the first transition will occur, $\lambda$, for the given state. In our CA model, $\lambda$ is calculated and an exponentially distributed time with rate $\lambda$ is
generated. Of the possible transitions that can occur, one is chosen at random with probability proportional to the rate at which it can occur. Note
that this is not the only method, nor is it necessarily the most efficient method to simulate the continuous-time CA model. Rather,
we have chosen a method that clearly illustrates what is occurring
in the CA model.
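The simulation procedure just described can be sketched as follows. This is an illustrative Python version, not the authors' implementation. It assumes the per-direction rates $\hat{P}_m/2$ and $\hat{P}_p/2$ quoted above, so that each (occupied site, vacant neighbour) pair contributes $(\hat{P}_m + \hat{P}_p)/2$ to $\lambda$. The function name and the 0-based periodic indexing are hypothetical, and the list of admissible transitions is rebuilt after every event for clarity rather than speed.

```python
import random

def gillespie(occ, pm_hat, pp_hat, t_end, rng):
    """Simulate the continuous-time CA up to time t_end; returns the final 0/1 list."""
    occ = list(occ)
    X = len(occ)
    t = 0.0
    while True:
        # Every (occupied site, direction) pair whose target site is vacant.
        moves = [(i, (i + d) % X) for i in range(X) if occ[i]
                 for d in (-1, 1) if not occ[(i + d) % X]]
        lam = len(moves) * 0.5 * (pm_hat + pp_hat)  # total transition rate
        if lam == 0.0:
            break  # no transition is possible from this state
        t += rng.expovariate(lam)  # exponentially distributed waiting time
        if t > t_end:
            break
        i, j = rng.choice(moves)  # all pairs carry the same rate
        occ[j] = 1
        # Motility with probability pm_hat / (pm_hat + pp_hat), else proliferation.
        if rng.random() * (pm_hat + pp_hat) < pm_hat:
            occ[i] = 0
    return occ
```

Because every admissible pair carries the same rate, choosing a pair uniformly and then choosing the event type in proportion to $\hat{P}_m$ and $\hat{P}_p$ is equivalent to choosing one transition in proportion to its rate.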
Whilst by definition [10] a CA model is discrete in time, we term
our model a continuous-time CA, as it has many analogous qualities to a standard CA model. This model is an example of an interacting particle system [13].
Note that as well as removing the errors that occur due to the
higher order terms, we have also dealt with how the events should
be implemented as discussed in Section 2. In the discrete-time CA,
the order in which agents implement their events is not clear. In
our discrete-time implementation we chose for all motility events
to occur before proliferation events; however, it is possibly more
sensible to mix the two together. Upon mixing the two together,
should the processes then alternate, or should the order be randomly chosen via some distribution? Questions such as these have
been implicitly addressed by moving to a continuous-time CA
model.
3.2. Ordinary and partial differential equation approximations
Using the continuous time approach and the assumption of
independence between states, we obtain the following conservation equation
$$
\begin{aligned}
\frac{dC_i}{dt} &= \left( \frac{\hat{P}_m}{2} + \frac{\hat{P}_p}{2} \right) C_{i-1} (1 - C_i) + \left( \frac{\hat{P}_m}{2} + \frac{\hat{P}_p}{2} \right) (1 - C_i) C_{i+1} \\
&\quad - \frac{\hat{P}_m}{2} (1 - C_{i-1}) C_i - \frac{\hat{P}_m}{2} C_i (1 - C_{i+1}) \\
&= \frac{\hat{P}_m}{2} \left( C_{i-1} - 2 C_i + C_{i+1} \right) + \frac{\hat{P}_p}{2} (1 - C_i) \left( C_{i-1} + C_{i+1} \right),
\end{aligned}
\tag{3.1}
$$
for $i = 1, \ldots, X$.
Note that this is the exact equation that would be obtained if we
were to simply make the continuum approximation $C_i^k \approx C_i(t_k)$,
take a Taylor series expansion with respect to time and take the
limit as $\Delta t \to 0$ of Eq. (2.1), the conservation equation. From these
equations, we can still derive both Type A and Type B PDEs.
In order to derive the Type B model, we simply assume that $\hat{P}_m$
and $\hat{P}_p$ are $O(1)$ and obtain our solution using an identical method
to that shown in Section 2. In order to derive the Type A model, we
take $\hat{P}_m$ to be $O(1/\epsilon^2)$ and make the following limiting argument
$$
\lim_{\epsilon \to 0} \epsilon^2 \hat{P}_m = \hat{P}_m^{*},
\tag{3.2}
$$
where $\hat{P}_m^{*}$ is $O(1)$.
By considering this continuous-time CA, we have removed the
limiting error with respect to time; however, there still exists a
spatial truncation error that arises when taking Taylor series
expansions to obtain the PDE. An alternative is simply to solve
the system of $X$ ordinary differential equations (ODE) given in Eq.
(3.1). Such a system can easily be solved numerically using
software such as MATLAB. In fact, the PDE too must be solved
numerically. This is likely to require a discretisation finer than
the number of cells in order to resolve regions where the
occupancy changes rapidly, hence it is likely that it will be more
computationally efficient to solve the ODE model. However, the
PDE model still has potential benefits, being more amenable to
analysis.
Solving the system of ODEs in Eq. (3.1), and thus removing the
truncations in both time and space, means that only the assumption of
independence is required to derive our ODE approximation. By
analysing the range of models against the simulated CA system,
we are able to analyse the differences and deduce the effects that
each assumption has on the accuracy of the approximation.
4. Analysis of models
4.1. Analysis of discrete-time models
We now compare the models to see whether either of the discrete-time continuum models can accurately approximate the
CA. In order to do this, we have considered a range of values of
$P_m$ and $P_p$. In each example we use an initial condition with similar
structure to that used in Fig. 1.
We begin by explicitly solving the difference equations given by
Eq. (2.1) (ignoring the higher order terms) and comparing them to
our two approximations. Fig. 2 shows that the Type B model and
the difference equations agree strongly whereas the Type A model
only agrees under the restricted condition $P_m \gg P_p$. When the difference equations are derived, the only assumption that has been
made is independence between states. Fig. 2 implies that so long
as the independence assumption is valid, the Type B model should
accurately approximate the average properties of the CA for all
parameter values.
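Solving the difference equations amounts to iterating the second form of Eq. (2.1) with the higher order terms dropped. A minimal Python sketch is given below; the function names and the 0-based periodic indexing are assumptions made for illustration.

```python
def difference_step(C, p_m, p_p):
    """Apply Eq. (2.1), without higher order terms, once on a periodic lattice."""
    X = len(C)
    new = []
    for i in range(X):
        left, right = C[i - 1], C[(i + 1) % X]  # C[-1] wraps to the last site
        new.append(C[i]
                   + 0.5 * p_m * (left - 2.0 * C[i] + right)
                   + 0.5 * p_p * (1.0 - C[i]) * (left + right))
    return new

def iterate(C0, p_m, p_p, steps):
    """Return the average occupancy after the given number of time steps."""
    C = list(C0)
    for _ in range(steps):
        C = difference_step(C, p_m, p_p)
    return C
```

With the scalings stated for Fig. 2 ($T = 500\Delta t$), the solution at $t = 6$ corresponds to `steps=3000`, starting from 40 occupied sites in the centre of a lattice of 400 sites.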
We then compare our models to the average CA. Fig. 3 shows
average results from the discrete-time CA and the two approximations used when the probability of motility is much greater than
the probability of proliferation ($P_m \gg P_p$). Just as was shown in
[22], the model accurately predicts the behaviour of the CA when
motility dominates. Both models agree so well that it is impossible
to see the difference between them. This is not surprising as the
two models are equivalent when $P_m \gg P_p$.
Fig. 2. Comparisons between the solution to the difference equations (green), the Type A model (red) and the Type B model (black) (note that the green and black curves are
indistinguishable). Each plot shows the solution at $t = 6$. The comparison is over three different regimes (from left to right): $(P_m, P_p) = (1/10, 1/100), (3/20, 3/20), (0, 1/50)$,
corresponding to the three regimes $P_m \gg P_p$, $P_m = P_p$, $P_m = 0$ respectively. The total number of sites is $X = 400$. The initial condition consists of 40 agents centrally located,
that is, occupying sites 181–220. This initial condition corresponds to $L = 40\Delta x$ which is scaled to 1, giving $\epsilon = \Delta x / L = 1/40$. The macroscopic timescale $T = 500\Delta t$ has been
chosen to approximately correspond to the time taken for a leading cell to move $L$ units. This is scaled to 1, giving $\delta = \Delta t / T = 1/500$. Consequently, $\hat{P}_m^{*} = \epsilon^2 P_m / (2\delta)$, $\hat{P}_m = P_m / \delta$ and
$\hat{P}_p = P_p / \delta$ following Eqs. (2.5), (2.10) and (2.6), respectively. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this
article.)
Fig. 3. Comparisons between simulated data averaged over 2000 realisations (blue), the Type A model (red) and the Type B model (black) (note that the red and black curves
are indistinguishable). From left to right we have times $t = 2, 6, 10$ all with $(P_m, P_p) = (1, 1/1000)$. The initial condition and all scalings are as specified in the caption of Fig. 2.
(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. Comparisons between simulated data averaged over 2000 realisations (blue), the Type A model (red) and the Type B model (black). Rows show simulations
progressing with times $t = 0, 2, 6, 10$. For each column $(P_m, P_p) = (1/20, 1/100), (1/100, 3/200), (0, 1/50)$ (from left to right). The initial condition and all scalings are as
specified in the caption of Fig. 2. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Observing Fig. 4 it is immediately clear that neither the Type A
nor the Type B model accurately approximates the average properties of the
CA for all parameter values. When the probability of motility, $P_m$,
is greater (but not significantly greater) than the probability of proliferation, $P_p$, (as in the first column of Fig. 4) we see that the two
continuum models give similar results. When the two parameters
are of a similar value (as in the second column of Fig. 4) we see that
the two models begin to separate. The Type B model (Eq. (2.13))
predicts more spread than the Type A model (Eq. (2.7)). There is
an absence of diffusion due to proliferation in the Type A model
and hence we expect less spread.
When motility is turned off (as in the third column of Fig. 4) we
see that the Type A model does not change, whereas the Type B
model continues to spread. Although it vastly overestimates the
rate at which this spread occurs, it does qualitatively agree with
the long term dynamics of the problem. That is, we expect
the entire domain to reach carrying capacity after some finite
amount of time. However, this will never happen in the Type A
model. These results support our findings with respect to the
parameter restrictions for the Type A model.
Given that the Type A model does not give correct long term
dynamics when $P_m = 0$, the fact that it appears to be better than
the Type B model at approximating the average properties of the
CA for small but non-zero $P_m$ is disappointing.
The reason that the Type A model appears to give a better
approximation to the CA is only due to multiple errors existing
in the approximation. The independence assumption gives an
overestimate of the spread whereas the absence of diffusion
due to proliferation gives an underestimate of the spread. The
superposition of these two errors does, in some parameter regimes,
decrease the overall error. However, this is fortuitous rather than
by design. We seek to eliminate both errors to increase the accuracy
in approximating the average CA properties. We therefore choose
to analyse only the Type B model from here on.
4.2. Analysis of continuous-time models
We now compare the continuous-time models. Fig. 5 shows
that the ODE and PDE models match up so well that they are
almost indistinguishable from each other. Despite removing the
truncation error in time and, in the case of the ODE model, in space
as well, the models still vastly overestimate the average
properties of the CA in all cases.
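The overestimate can be reproduced in a small experiment. The following is a minimal sketch, not the authors' code, for the proliferation-only case. It assumes that each agent attempts to proliferate at rate `RATE_P`, choosing the left or right neighbour with equal probability and aborting if that site is occupied, and that the ODE model takes the mean-field form $dq_i/dt = (\hat{P}_p/2)(q_{i-1} + q_{i+1})(1 - q_i)$. The lattice size, the single-agent initial condition, the vacant boundary treatment and all names are hypothetical choices made for illustration only.

```python
import random

def simulate_ca(n_sites, seed_site, rate_p, t_end, rng):
    """One realisation of an assumed proliferation-only, continuous-time CA."""
    state = [0] * n_sites
    state[seed_site] = 1
    t = 0.0
    while True:
        agents = [i for i, s in enumerate(state) if s]
        # Gillespie step: attempts occur at total rate rate_p * (number of agents).
        t += rng.expovariate(rate_p * len(agents))
        if t > t_end:
            return state
        target = rng.choice(agents) + rng.choice((-1, 1))
        # The attempt is aborted if the target is occupied or off the lattice.
        if 0 <= target < n_sites and not state[target]:
            state[target] = 1

def mean_field(n_sites, seed_site, rate_p, t_end, dt=1e-3):
    """Forward-Euler integration of dq_i/dt = (rate_p/2)(q_{i-1} + q_{i+1})(1 - q_i)."""
    q = [0.0] * n_sites
    q[seed_site] = 1.0
    for _ in range(round(t_end / dt)):
        padded = [0.0] + q + [0.0]  # sites beyond the lattice are treated as vacant
        q = [q[i] + dt * 0.5 * rate_p * (padded[i] + padded[i + 2]) * (1.0 - q[i])
             for i in range(n_sites)]
    return q

# Hypothetical parameters chosen for illustration only.
N_SITES, SEED_SITE, RATE_P, T_END, N_RUNS = 101, 50, 1.0, 10.0, 200
rng = random.Random(1)
ca_total = sum(sum(simulate_ca(N_SITES, SEED_SITE, RATE_P, T_END, rng))
               for _ in range(N_RUNS)) / N_RUNS
ode_total = sum(mean_field(N_SITES, SEED_SITE, RATE_P, T_END))
print(f"mean number of agents in the CA at t = {T_END}: {ca_total:.2f}")
print(f"mean-field prediction of the same quantity: {ode_total:.2f}")
```

Under these assumptions the mean-field total should come out well above the simulated mean, which is the qualitative behaviour reported for the third column of the figures.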
Note that there is minimal difference between Figs. 4 and 5.
This shows that moving from discrete to continuous time has a
minimal effect on results. Despite this, there are still numerous
reasons that continuous time is beneficial. As well as the removal
Fig. 5. Comparisons between simulated data averaged over 2000 realisations (blue), the ODE model (red) and the PDE model (black) (note that the red and black curves are
indistinguishable). Rows show simulations progressing with times $t = 0, 2, 6, 10$. For each column $(\hat{P}_m, \hat{P}_p) = (25, 5), (5, 15/2), (0, 10)$ (from left to right). The initial condition
and spatial scaling are as specified in the caption of Fig. 2. There is no time scaling for the continuous CA. (For interpretation of the references to colour in this figure legend,
the reader is referred to the web version of this article.)
of small errors, it also makes the CA implementation clearer. The
biggest factor is that it simplifies the analysis, allowing us to
explore the assumption of independence between sites in the next
section.
At this point, we have removed all apparent possible sources of
error from our model, except for the assumption of independence
between sites. Despite this, we note that when $\hat{P}_m \gg \hat{P}_p$ the
approximation will give accurate results. This suggests that the
assumption of independence is causing problems for the proliferation mechanism but not the motility mechanism. We now explain
this phenomenon.
5. The independence assumption
In deriving our approximation for the average occupancy, we
have assumed that the probability of a given site being in a particular state is independent of the state of all other sites. Assuming
this, we may state that the probability of site $i$ being in some state
$a$ and site $i+1$ being in some state $b$ is equal to the product of the
probability of site $i$ being in state $a$ and the probability of site $i+1$ being in state $b$.
The existence of independence between the states of sites for
motility and proliferation has been explored in the literature [20]
by calculating the correlation between sites. We demonstrate an
alternative method by considering site i and all sites that may
interact with site i in a single event. We then use marginal probabilities of site occupancy to obtain meaningful expressions for the
rate of change of the probability of occupancy of site i and are able
to show that it is impossible to write the expression for proliferation in terms of marginal probabilities.
In continuous time, site $i$ can only interact with sites $i-1$ and
$i+1$ over a single event for both the motility and proliferation
events. Hence we only need to consider the possible states of the
triple made up by the sites $i-1$, $i$ and $i+1$. As we are considering
three sites, each of which has two possible states, we conclude that
the triple will be in one of eight possible states. We say that the
probability of a triple being in one of these states is $p_j$ for
$j = 0, 1, \ldots, 7$, where each $j$ corresponds to the binary representation of that number as shown in Table 1. We define $q_i$ to be the
probability that site $i$ is occupied. Note that when considering a
single site, $q_i$ is equivalent to the ensemble average
occupancy of site $i$.
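The labelling can be made concrete with a short enumeration. This is a minimal sketch that assumes the binary digits of $j$ give the occupancy of sites $i-1$, $i$ and $i+1$ from most to least significant, an ordering inferred from the marginal sums used later in this section. The names `TRIPLES` and `marginal_labels` are hypothetical.

```python
from itertools import product

# Label each triple (s_{i-1}, s_i, s_{i+1}) by the integer whose binary digits
# it spells out; 0 denotes a vacant site and 1 an occupied site.
TRIPLES = {4 * left + 2 * mid + right: (left, mid, right)
           for left, mid, right in product((0, 1), repeat=3)}

def marginal_labels(position):
    """Labels j whose triple has the site at `position` occupied.

    position 0 is site i-1, position 1 is site i, position 2 is site i+1.
    """
    return [j for j, triple in sorted(TRIPLES.items()) if triple[position] == 1]

for j, triple in sorted(TRIPLES.items()):
    print(f"p{j}: {triple}")
print("q_{i-1} sums p_j over j in", marginal_labels(0))
print("q_i     sums p_j over j in", marginal_labels(1))
print("q_{i+1} sums p_j over j in", marginal_labels(2))
```

Each marginal probability is then the sum of the four triple probabilities whose label appears in the corresponding list.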
We first consider a motility event. To perform calculations
without assuming independence we must explicitly consider the
triples themselves. We write down the rate of change of site i by
considering any triple that is in one state in site i and a different
state in an adjacent site
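\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_m}{2}\left[(p_4 + p_5) + (p_1 + p_5) - (p_2 + p_3) - (p_2 + p_6)\right] \\
&= \frac{\hat{P}_m}{2}\left[(p_4 + 2p_5 + p_1) - (p_3 + 2p_2 + p_6)\right].
\end{aligned}
\tag{5.1}
\]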
Table 1
The possible states of the triples and their corresponding labels based on the binary
interpretation of the number. In the binary representation, a 0 represents a vacant site
and a 1 represents an occupied site.
We wish to show that this is the same expression that is
obtained if we simply consider the single sites and assume independence by taking the product of probabilities when site i and
an adjacent site are in a different state. Doing this yields the
expression
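\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_m}{2}\left[(q_{i-1} + q_{i+1})(1 - q_i) - (2 - q_{i-1} - q_{i+1})\,q_i\right] \\
&= \frac{\hat{P}_m}{2}\left[q_{i-1} - 2q_i + q_{i+1}\right].
\end{aligned}
\tag{5.2}
\]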
From here, we can show that these two expressions are equivalent
by considering marginal probabilities. The marginal probability
of a given site being in a given state is the
sum of the probabilities of all triples in which that
site is in that state. For instance, $q_i = p_2 + p_3 + p_6 + p_7$
as site $i$ is occupied in states 2, 3, 6 and 7. Hence, we can rewrite
Eq. (5.2) as
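\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_m}{2}\left[(p_4 + p_5 + p_6 + p_7) - 2(p_2 + p_3 + p_6 + p_7) + (p_1 + p_3 + p_5 + p_7)\right] \\
&= \frac{\hat{P}_m}{2}\left[p_4 + 2p_5 + p_1 - (p_3 + 2p_2 + p_6)\right],
\end{aligned}
\tag{5.3}
\]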
which is equivalent to Eq. (5.1). This proves that for the unbiased
motility mechanism described, we can simply calculate Eq. (5.2)
rather than explicitly calculating the triples and we will obtain
the desired change in qi due to motility.
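The equivalence can also be checked numerically. The sketch below, which is illustrative and not part of the original analysis, draws random joint distributions over the eight triple states with exact rational arithmetic and compares the bracketed terms of Eqs. (5.1) and (5.2). The function names and the sampling scheme are hypothetical.

```python
from fractions import Fraction
import random

def random_joint(rng):
    """A random probability vector (p0, ..., p7) with exact rational entries."""
    weights = [rng.randint(1, 20) for _ in range(8)]
    total = sum(weights)
    return [Fraction(w, total) for w in weights]

def marginals(p):
    """(q_{i-1}, q_i, q_{i+1}) as sums over the triple probabilities."""
    q_left = p[4] + p[5] + p[6] + p[7]
    q_mid = p[2] + p[3] + p[6] + p[7]
    q_right = p[1] + p[3] + p[5] + p[7]
    return q_left, q_mid, q_right

def motility_exact(p):
    """Bracketed term of Eq. (5.1), written in terms of the triples."""
    return (p[4] + 2 * p[5] + p[1]) - (p[3] + 2 * p[2] + p[6])

def motility_independent(p):
    """Bracketed term of Eq. (5.2), written in terms of the marginals."""
    q_left, q_mid, q_right = marginals(p)
    return (q_left + q_right) * (1 - q_mid) - (2 - q_left - q_right) * q_mid

rng = random.Random(0)
samples = [random_joint(rng) for _ in range(1000)]
all_equal = all(motility_exact(p) == motility_independent(p) for p in samples)
print("Eqs. (5.1) and (5.2) agree on every sample:", all_equal)
```

The sampled distributions are generally not product measures, so agreement here does not rely on the sites being independent.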
When attempting the same argument for proliferation (in
the absence of motility), we do not get the same result.
Consider the rate of change of site i without assuming
independence
\[
\frac{dq_i}{dt} = \frac{\hat{P}_p}{2}\left[(p_4 + p_5) + (p_1 + p_5)\right] = \frac{\hat{P}_p}{2}\left[p_4 + 2p_5 + p_1\right].
\tag{5.4}
\]
We now write down the expression obtained by considering single
sites only
\[
\frac{dq_i}{dt} = \frac{\hat{P}_p}{2}\left[(q_{i-1} + q_{i+1})(1 - q_i)\right],
\tag{5.5}
\]
and using marginal probabilities, Eq. (5.5) becomes
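\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_p}{2}\left[\left((p_4 + p_5 + p_6 + p_7) + (p_1 + p_3 + p_5 + p_7)\right)(1 - p_2 - p_3 - p_6 - p_7)\right] \\
&= \frac{\hat{P}_p}{2}\left[(p_4 + 2p_5 + p_1) + (p_6 + 2p_7 + p_3) - \left((p_4 + p_5 + p_6 + p_7) + (p_1 + p_3 + p_5 + p_7)\right)(p_2 + p_3 + p_6 + p_7)\right].
\end{aligned}
\tag{5.6}
\]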
It is clear that the extra terms do not cancel each other out in Eq.
(5.6) and hence Eqs. (5.4) and (5.5) are not equivalent. Observing
Figs. 4 and 5, it is clear that this assumption of independence is
not a reasonable assumption to make, as it gives rise to extremely
inaccurate results from the system of ODEs (Eq. (3.1)) for the average occupancy probabilities of the CA.
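A concrete case makes the discrepancy visible. The following sketch compares the bracketed terms of Eqs. (5.4) and (5.5) for two hypothetical joint distributions: a fully clustered one, in which the triple is either entirely vacant or entirely occupied, and a product measure with every site occupied with probability 1/2. Both distributions and all names are illustrative assumptions, not data from the simulations.

```python
from fractions import Fraction

def marginals(p):
    """(q_{i-1}, q_i, q_{i+1}) as sums over the triple probabilities."""
    q_left = p[4] + p[5] + p[6] + p[7]
    q_mid = p[2] + p[3] + p[6] + p[7]
    q_right = p[1] + p[3] + p[5] + p[7]
    return q_left, q_mid, q_right

def proliferation_exact(p):
    """Bracketed term of Eq. (5.4), written in terms of the triples."""
    return p[4] + 2 * p[5] + p[1]

def proliferation_independent(p):
    """Bracketed term of Eq. (5.5), written in terms of the marginals."""
    q_left, q_mid, q_right = marginals(p)
    return (q_left + q_right) * (1 - q_mid)

half = Fraction(1, 2)
# Hypothetical clustered state: triple is 000 or 111, each with probability 1/2.
clustered = [half, 0, 0, 0, 0, 0, 0, half]
# Product measure: sites independent, each occupied with probability 1/2.
independent = [Fraction(1, 8)] * 8

for name, p in (("clustered", clustered), ("independent", independent)):
    print(name, proliferation_exact(p), proliferation_independent(p))
```

In the clustered case no proliferation into site $i$ is possible, yet the single-site expression predicts growth, which is in line with the overestimate seen in the figures. For the product measure the two expressions coincide, as expected.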
Furthermore, we can show that no power series solution of
the marginal probabilities $q_{i-1}$, $q_i$ and $q_{i+1}$ can be used to
calculate the rate of change of the marginal probability $q_i$. First,
we note that the probability of site $i$ being vacant is written as
$1 - q_i$ and as a result, the only terms that may appear are of
the form $q_{i-1}^a q_i^b q_{i+1}^c$ for $a, b, c \in \{0, 1, 2, \ldots\}$. Second, we note that
as the expression (5.4) is linear in each value of $p_j$, and $q_i$ is
linear in each value of $p_j$, we may ignore all terms that are
nonlinear.
Hence we are left with the expression
\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_p}{2}\left[a_1 q_{i-1} + a_2 q_i + a_3 q_{i+1}\right] \\
&= \frac{\hat{P}_p}{2}\left[a_1 (p_4 + p_5 + p_6 + p_7) + a_2 (p_2 + p_3 + p_6 + p_7) + a_3 (p_1 + p_3 + p_5 + p_7)\right] \\
&= \frac{\hat{P}_p}{2}\left[a_3 p_1 + a_2 p_2 + (a_2 + a_3) p_3 + a_1 p_4 + (a_1 + a_3) p_5 + (a_1 + a_2) p_6 + (a_1 + a_2 + a_3) p_7\right].
\end{aligned}
\tag{5.7}
\]
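There is no set $\{a_1, a_2, a_3\}$ such that this expression is equal to
Eq. (5.4): matching the coefficients of $p_1$, $p_2$ and $p_4$ requires
$a_3 = 1$, $a_2 = 0$ and $a_1 = 1$, but the coefficient of $p_3$ is then
$a_2 + a_3 = 1$ rather than the required 0. Therefore we have a contradiction, and hence it is impossible
to write the rate of change of the probability of occupancy of site $i$
due to proliferation in terms of marginal probabilities alone.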
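The coefficient matching can be written out explicitly. In the sketch below, each row records how $a_1$, $a_2$ and $a_3$ enter the coefficient of $p_j$ in Eq. (5.7), together with the coefficient of $p_j$ required by Eq. (5.4). The table `ROWS` and the function `residuals` are hypothetical names used only for this illustration.

```python
# For each j: (multipliers of a1, a2, a3 in the coefficient of p_j in Eq. (5.7),
#              coefficient of p_j in Eq. (5.4)).
ROWS = {
    1: ((0, 0, 1), 1),
    2: ((0, 1, 0), 0),
    3: ((0, 1, 1), 0),
    4: ((1, 0, 0), 1),
    5: ((1, 0, 1), 2),
    6: ((1, 1, 0), 0),
    7: ((1, 1, 1), 0),
}

def residuals(a):
    """Mismatch between Eq. (5.7) and Eq. (5.4) for each p_j, given (a1, a2, a3)."""
    return {j: sum(c * x for c, x in zip(coeffs, a)) - target
            for j, (coeffs, target) in ROWS.items()}

# The rows for p_4, p_2 and p_1 each involve a single unknown, so they fix
# a1, a2 and a3 uniquely.
forced = (ROWS[4][1], ROWS[2][1], ROWS[1][1])
violated = {j: r for j, r in residuals(forced).items() if r != 0}
print("coefficients forced by p_1, p_2, p_4:", forced)
print("rows that cannot then be satisfied:", violated)
```

Because three of the seven conditions already determine the coefficients, any remaining violated row shows that the linear system is inconsistent.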
6. Conclusion
The purpose of this paper was to rigorously explore the derivation of continuum approximations to cellular automata, ensuring
that no issues have been overlooked. By choosing an appropriate
limiting regime we have derived a model where there are no implicit restrictions on the parameter values. By using a continuous-time CA model we are able to eliminate truncation errors with
respect to time and by then considering an ODE approximation,
we are also able to remove truncation errors with respect to space.
The continuous-time CA also made the implementation of the
process clear as well as allowing us to analyse independence in
Section 5 without making any approximations.
We showed that the continuum approximation Type B (see
Section 2.4) was an excellent approximation to the conservation
equation, confirming the choice of limiting regime and showing
that the conservation equation itself is a poor approximation of
the CA. As the only approximation involved in the derivation of
the conservation equation is that of independence between sites,
we can conclude that this assumption is not appropriate whenever
proliferation is present.
Research has already begun moving in the direction of alleviating the assumption of independence [2,15,20] and we have now
confirmed that it is the sole issue with regard to the accuracy of
CA approximations. In order to move forward in the field of
approximations to cellular automata models, it is clear that the
independence issue must be resolved. In future work, we too
intend to present our ideas and research in this direction.
Acknowledgements
JEFG gratefully acknowledges funding from an ARC Discovery
Early Career Researcher Award (DE 130100031). The work of
NGB and JVR is supported by ARC Discovery Project Funding
(DP 110101929). BJB was supported by an Australian Government
National Health and Medical Research Council Project Grant
(APP1069757).
References
[1] D.J.G. Agnew, J.E.F. Green, T.M. Brown, M.J. Simpson, B.J. Binder, Distinguishing
between mechanisms of cell aggregation using pair-correlation functions, J.
Theor. Biol. 352 (2014) 16–23.
[2] R.E. Baker, M.J. Simpson, Correcting mean-field approximations for birth-death-movement processes, Phys. Rev. E 82 (2010) 041905.
[3] C. Beauchemin, J. Samuel, J. Tuszynski, A simple cellular automaton model for
influenza A viral infections, J. Theor. Biol. 232 (2) (2005) 223–234.
[4] B.J. Binder, K.A. Landman, Exclusion processes on a growing domain, J. Theor.
Biol. 259 (3) (2009) 541–551.
[5] B.J. Binder, K.A. Landman, M.J. Simpson, Modeling proliferative tissue
growth: a general approach and an avian case study, Phys. Rev. E 78 (2008)
031912.
[6] B.J. Binder, J.V. Ross, M.J. Simpson, A hybrid model for studying spatial aspects
of infectious diseases, ANZIAM J. 54 (2012) 37–49.
[7] J.M. Bloomfield, J.A. Sherratt, K.J. Painter, G. Landini, Cellular automata and
integrodifferential equation models for cell renewal in mosaic tissues, J. Roy.
Soc. Interface 7 (52) (2010) 1525–1535.
[8] A.Q. Cai, K.A. Landman, B.D. Hughes, Multi-scale modeling of a wound-healing
cell migration assay, J. Theor. Biol. 245 (3) (2007) 576–594.
[9] D. Chowdhury, A. Schadschneider, K. Nishinari, Physics of transport and traffic
phenomena in biology: from molecular motors and cells to organisms, Phys.
Life Rev. 2 (2005) 318–352.
[10] G.B. Ermentrout, L. Edelstein-Keshet, Cellular automata approaches to
biological modeling, J. Theor. Biol. 160 (1993) 97–133.
[11] E.J. Hackett-Jones, K.J. Davies, B.J. Binder, K.A. Landman, Generalized index for
spatial data sets as a measure of complete spatial randomness, Phys. Rev. E 85
(2012) 061908.
[12] Y. Lee, S. Kouvroukoglou, L.V. McIntire, K. Zygourakis, A cellular automaton
model for the proliferation of migrating contact-inhibited cells, Biophys. J. 69
(1995) 1284–1298.
[13] T.M. Liggett, Interacting Particle Systems, Springer, Berlin Heidelberg, 2005.
[14] C.C. Lin, L.A. Segel, Mathematics Applied to Deterministic Problems in the
Natural Sciences, SIAM, 1974.
[15] D.C. Markham, M.J. Simpson, R.E. Baker, Simplified method for including
spatial correlations in mean-field approximations, Phys. Rev. E 87 (2013)
062702.
[16] M. Markus, D. Böhm, M. Schmick, Simulation of vessel morphogenesis using
cellular automata, Math. Biosci. 156 (1–2) (1999) 191–206.
[17] A.A. Patel, E.T. Gawlinski, S.K. Lemieux, R.A. Gatenby, A cellular automaton
model of early tumor growth and invasion: the effects of native tissue
vascularity and increased anaerobic tumor metabolism, J. Theor. Biol. 213
(2001) 315–331.
[18] M.J. Plank, M.J. Simpson, Models of collective cell behaviour with crowding
effects: comparing lattice-based and lattice-free approaches, J. Roy. Soc.
Interface 9 (76) (2012) 2983–2996.
[19] B.G. Sengers, C.P. Please, R.O.C. Oreffo, Experimental characterization and
computational modelling of two-dimensional cell spreading for skeletal
regeneration, J. Roy. Soc. Interface 4 (17) (2007) 1107–1117.
[20] M.J. Simpson, R.E. Baker, Corrected mean-field models for spatially
dependent advection–diffusion–reaction phenomena, Phys. Rev. E 83 (2011)
051922.
[21] M.J. Simpson, B.J. Binder, P. Haridas, B.K. Wood, K.K. Treloar, D.L.S. McElwain,
R.E. Baker, Experimental and modelling investigation of monolayer
development with clustering, Bull. Math. Biol. 75 (5) (2013) 871–889.
[22] M.J. Simpson, K.A. Landman, B.D. Hughes, Cell invasion with
proliferation mechanisms motivated by time-lapse data, Physica A 389
(2010) 3779–3790.
[23] M.J. Simpson, A. Merrifield, K.A. Landman, B.D. Hughes, Simulating invasion
with cellular automata: Connecting cell-scale and population-scale properties,
Phys. Rev. E 76 (2007) 021918.
[24] Z. Wang, T.S. Deisboeck, Computational modeling of brain tumors: discrete,
continuum or hybrid?, Sci. Model. Simul. 15 (1–3) (2008) 381–393.
[25] S.H. White, A.M. del Rey, G.R. Sànchez, Modeling epidemics using cellular
automata, Appl. Math. Comput. 186 (2007) 193–202.
[26] D. Wodarz, A. Hofacre, J.W. Lau, Z. Sun, H. Fan, N.L. Komarova, Complex spatial
dynamics of oncolytic viruses in vitro: mathematical and experimental
approaches, PLOS Comput. Biol. 8 (6) (2012) e1002547.