Mathematical Biosciences 253 (2014) 63–71

Journal homepage: www.elsevier.com/locate/mbs

On the derivation of approximations to cellular automata models and the assumption of independence

K.J. Davies*, J.E.F. Green, N.G. Bean, B.J. Binder, J.V. Ross

School of Mathematical Sciences, University of Adelaide, South Australia, Australia

Article history: Received 15 November 2013. Received in revised form 10 April 2014. Accepted 15 April 2014. Available online 24 April 2014.

Keywords: Cellular automata; Continuum approximations; Agent-based simulation; Motility and proliferation

Abstract

Cellular automata are discrete agent-based models, generally used in cell-based applications. There is much interest in obtaining continuum models that describe the mean behaviour of the agents in these models. Previously, continuum models have been derived for agents undergoing motility and proliferation processes; however, these models only hold under restricted conditions. In order to narrow down the reason for these restrictions, we explore three possible sources of error in deriving the model. These sources are the choice of limiting arguments, the use of a discrete-time model as opposed to a continuous-time model, and the assumption of independence between the states of sites. We present a rigorous analysis in order to gain a greater understanding of the significance of these three issues. By finding a limiting regime that accurately approximates the conservation equation for the cellular automata, we are able to conclude that the inaccuracy between our approximation and the cellular automata is due entirely to the assumption of independence.

© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Cellular automata (CA) are discrete agent-based mathematical models that allow for an individual agent's behaviour to depend upon the state of its neighbourhood.
As such they are often an ideal tool for modelling discrete systems composed of interacting individuals. A discipline in which they have seen widespread use is cell biology, where they have been used, for example, to model and understand processes such as tissue and tumour growth [5,16,17,24,26], and wound healing [8,19]. We focus herein on CA models appropriate for such applications, in which the biological processes are cell motility and cell proliferation.

As the cell biological processes that we seek to understand are likely to be evolving continuously in time, we believe a continuous-time model to be most appropriate. However, the literature extensively considers discrete-time CA models [1,4–6,11,18,22,23], in particular when deriving approximations to the average behaviour of these processes, and so we begin our analysis with discrete-time CA models in Section 2 before considering continuous-time CA models in Section 3.

* Corresponding author. Tel.: +61 8 8313 1606. E-mail address: [email protected] (K.J. Davies). http://dx.doi.org/10.1016/j.mbs.2014.04.004

Of much interest from both a practical and theoretical perspective, is the derivation of approximations which capture the average behaviour of the CA. Such continuum models might allow for new insight and understanding of these important biological processes. It has been shown by numerical experiments that these continuum approximations are only valid under restrictive conditions on the probabilities of cell movement and proliferation (that is, where cell movement dominates cell proliferation) [22], limiting the range of scenarios and applications which may be considered [2,4,7,15,18,22]. A careful analysis of the development of existing continuum approximations is undertaken in this paper.
This gives insight into the implicit assumptions regarding the magnitudes of the motility and proliferation probabilities which underlie their derivation, and shows how new approximations can be developed when these assumptions are relaxed. We show that the assumption of independence between the states of different sites in the CA is a key issue with regard to the inaccuracy of existing continuum approximations, so long as proliferation is present; and that, when there is no proliferation, the approximation obtained by assuming independence is identical to that found when the independence assumption is relaxed. However, the earlier continuum models are shown to perform unexpectedly well in approximating the behaviour of the CA even when proliferation is included. We show that this is largely due to a fortuitous cancellation of errors.

2. Discrete-time CA model

2.1. Defining the model

We begin by introducing our one-dimensional CA model. The model is a lattice-based system in which each site on the lattice is in one of two states: occupied or vacant. Each occupied site contains an agent whose behaviour is determined by some process. Two processes will be considered in our model, motility and proliferation. We say that the cell size and hence the lattice spacing is $\Delta x$, and so the position of the $i$th lattice site is $x_i = i\Delta x$ for $i = 1, 2, \ldots, X$. For simplicity, we use periodic boundary conditions, resulting in a connection between site 1 and site $X$. We define a time step $\Delta t$ and consider the state of the process at discrete times $t_k = k\Delta t$ for $k = 0, 1, \ldots$.

Consider an agent at site $i$ on the lattice that is to undergo a motility or proliferation event. In both cases, an adjacent site is chosen uniformly at random, that is, site $i-1$ or site $i+1$. If the adjacent site is vacant then the event will be carried out, otherwise the event will be aborted [9]. In the case of motility, the agent will move from site $i$ to the new site, resulting in site $i$ becoming unoccupied and the new site becoming occupied. In the case of proliferation, both the new site and site $i$ will become occupied.

There are many ways in which these events can be, and have been, implemented [3,12,22,23,25,26]. Each of these different implementations can produce different average results, although in most cases the differences in the average CA data are minor. For example, in many of these models, the order in which events take place in the model is arbitrarily chosen, unmotivated or in some cases not made clear. The importance of this will be discussed in greater detail in Section 3.1.

For the purposes of comparison, we use the following implementation as outlined in Simpson et al. [22]. We choose $2N(t_k)$ agents uniformly at random with replacement, where $N(t_k)$ is the number of agents in the system at time $t_k$. The first $N(t_k)$ agents are each given the opportunity to perform a motility event. The probability of each agent performing the event is $P_m$. If the motility event is aborted, this is still regarded as an event taking place (and similarly for proliferation events). The remaining $N(t_k)$ agents are then given the opportunity to perform a proliferation event, each with probability $P_p$. The time step is completed by moving from $t_k$ to $t_{k+1}$ and this procedure is repeated. A realisation of the model can be seen in Fig. 1.

Fig. 1. A single realisation of the CA model progressing with time. Each row (from top to bottom) corresponds to $k = 0, 1000, 3000, 5000$ where $k$ is the number of time steps since the start of the realisation. This process is purely proliferation, with $P_m = 0$ and $P_p = 1/200$. This realisation contains $X = 50$ sites of which 11 are initially occupied by agents.

In order to analyse the mean behaviour of the system described above we consider the ensemble average occupancy at position $x_i$ at time $t_k$, denoted $C_i^k$. This value can be calculated numerically by averaging the occupancy over many realisations of the CA system. This allows us to easily validate the results obtained when deriving continuum approximations.

2.2. Deriving a continuum approximation

We aim to derive a partial differential equation (PDE) for the ensemble average occupancy. However, it is axiomatic that a continuous PDE model is only likely to provide a reasonable approximation of the discrete system when large numbers of agents are present, and the average occupancy of sites varies over length and time scales which are much larger than the agent size ($\Delta x$) and time step ($\Delta t$). In order to derive a continuum model, we must hence identify these characteristic macroscopic length and time scales for the system [14]. For the length scale, denoted $L$, it would be natural to take the size of the region of the domain initially occupied by cells whilst for the time scale, denoted $T$, it could naturally be the population doubling time or the average time taken for a cell to move a distance $L$.

We wish to approximate the ensemble average occupancy, $C_i^k$, of the CA model with the continuous function $C(x, t)$, such that $C_i^k \approx C(x_i, t_k)$, where $x_i = i\Delta x$ and $t_k = k\Delta t$, and desire that this provides a reasonable approximation when the occupancy varies over the macroscopic scales. We hence assume that the ratios of the micro- and macroscopic length and time scales are small, i.e.

$$\epsilon = \frac{\Delta x}{L} \ll 1, \qquad \delta = \frac{\Delta t}{T} \ll 1,$$

and exploit this separation of scales to derive the PDE model.

We now consider the change in the ensemble average occupancy of site $i$, $C_i^k$, from time $t_k$ to time $t_{k+1}$. Assuming that the state of each lattice site is independent of the state of every other lattice site, we obtain the following discrete conservation equation

$$\begin{aligned}
C_i^{k+1} &= C_i^k + \left(\frac{P_m}{2} + \frac{P_p}{2}\right) C_{i-1}^k \left(1 - C_i^k\right) + \left(\frac{P_m}{2} + \frac{P_p}{2}\right) \left(1 - C_i^k\right) C_{i+1}^k \\
&\quad - \frac{P_m}{2} \left(1 - C_{i-1}^k\right) C_i^k - \frac{P_m}{2} C_i^k \left(1 - C_{i+1}^k\right) + \text{HoT} \\
&= C_i^k + \frac{P_m}{2} \left(C_{i-1}^k - 2C_i^k + C_{i+1}^k\right) + \frac{P_p}{2} \left(1 - C_i^k\right)\left(C_{i-1}^k + C_{i+1}^k\right) + \text{HoT},
\end{aligned} \tag{2.1}$$

where $i = 1, \ldots, X$ and HoT denotes higher order terms. Eq. (2.1) says that the new average occupancy of each site will be the old average occupancy plus some terms that describe the change over that time step. It is derived by considering all different possible transitions into and out of site $i$. Consider, specifically, deriving the probability of an agent at site $i-1$ moving to site $i$. An agent has a probability of motility $P_m$ and has probability of $1/2$ of moving in a given direction. Further, $C_{i-1}^k$ is the probability of site $i-1$ being occupied at time $t_k$ and $(1 - C_i^k)$ is the probability of site $i$ being vacant at time $t_k$. The term $\frac{P_m}{2} C_{i-1}^k (1 - C_i^k)$ can be obtained by assuming independence between sites and taking the product of each of these probabilities.

Note that we have explicitly included only interactions between adjacent pairs of sites as we only seek a first order accurate equation. By first order accurate, we mean that we only consider terms of order $P_m$ and $P_p$. The higher order terms (HoT) that exist are of the form $P_m^a P_p^b$ where $a + b \geq 2$ and $a$ and $b$ are non-negative integers. These higher order terms may appear, in discrete time, when one site undergoes multiple interactions in a single time step: for example if an agent at site $i-1$ tries to move to site $i$ and an agent at site $i+1$ attempts to proliferate and deposit a daughter at site $i$. We will be able to quantify these explicitly upon defining a limiting regime. The conservation equation may change form depending on the specific details of the implementation of the motility and proliferation events.
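The first order part of Eq. (2.1) can be iterated directly. The following is a minimal sketch, assuming the higher order terms are simply dropped and the lattice is periodic as in Section 2.1; the function names, the lattice size and the parameter values are illustrative choices, not taken from the paper.

```python
def conservation_step(c, p_m, p_p):
    """Advance the average occupancy by one time step using the first
    order terms of Eq. (2.1) on a periodic lattice (HoT are dropped)."""
    n = len(c)
    new_c = []
    for i in range(n):
        left = c[i - 1]          # C_{i-1}^k; index -1 wraps to the last site
        right = c[(i + 1) % n]   # C_{i+1}^k; wraps to the first site
        motility = 0.5 * p_m * (left - 2.0 * c[i] + right)
        proliferation = 0.5 * p_p * (1.0 - c[i]) * (left + right)
        new_c.append(c[i] + motility + proliferation)
    return new_c


def iterate(c0, p_m, p_p, steps):
    """Apply conservation_step repeatedly, starting from occupancy c0."""
    c = list(c0)
    for _ in range(steps):
        c = conservation_step(c, p_m, p_p)
    return c


# Illustrative initial condition: 50 sites with 10 central sites occupied.
initial = [1.0 if 20 <= i < 30 else 0.0 for i in range(50)]
proliferation_only = iterate(initial, p_m=0.0, p_p=1.0 / 200.0, steps=1000)
motility_only = iterate(initial, p_m=0.1, p_p=0.0, steps=100)
```

With motility alone the total occupancy is conserved, whereas proliferation alone both increases it and carries occupancy into sites that were initially vacant.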
This dependence on the implementation is seen mainly in the higher order terms; the first order conservation Eq. (2.1) is valid for many implementations.

We assume the existence of a continuous approximation, $C(x_i, t_k)$, for the ensemble average occupancy, $C_i^k$. Upon substituting this into the simplified Eq. (2.1) and expanding the $C(x_i \pm \Delta x, t_k)$ and $C(x_i, t_k + \Delta t)$ terms using Taylor series, we obtain the following expression

$$\begin{aligned}
C + \Delta t \frac{\partial C}{\partial t} + O\left(\Delta t^2\right) - C &= \frac{P_m}{2} \left( \sum_{j=0}^{\infty} \frac{(-\Delta x)^j}{j!} \frac{\partial^j C}{\partial x^j} - 2C + \sum_{j=0}^{\infty} \frac{\Delta x^j}{j!} \frac{\partial^j C}{\partial x^j} \right) \\
&\quad + \frac{P_p}{2} (1 - C) \left( \sum_{j=0}^{\infty} \frac{(-\Delta x)^j}{j!} \frac{\partial^j C}{\partial x^j} + \sum_{j=0}^{\infty} \frac{\Delta x^j}{j!} \frac{\partial^j C}{\partial x^j} \right) + \text{HoT},
\end{aligned} \tag{2.2}$$

where all quantities are evaluated at $(x, t) = (x_i, t_k)$. The odd-order terms in the two sums cancel, so this expression simplifies to

$$\Delta t \frac{\partial C}{\partial t} = P_m \sum_{j=1}^{\infty} \frac{\Delta x^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} + P_p (1 - C) \sum_{j=0}^{\infty} \frac{\Delta x^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} + O\left(\Delta t^2\right) + \text{HoT}. \tag{2.3}$$

We now nondimensionalise by setting

$$t = T\tilde{t}, \qquad x = L\tilde{x},$$

where tildes indicate dimensionless variables. Our dimensionless equation for $C$ is then

$$\delta \frac{\partial C}{\partial \tilde{t}} = P_m \sum_{j=1}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial \tilde{x}^{2j}} + P_p (1 - C) \sum_{j=0}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial \tilde{x}^{2j}} + O\left(\delta^2\right) + \text{HoT}. \tag{2.4}$$

For notational convenience, tildes will be dropped in further equations. Our aim is now to obtain a PDE from (2.4) in the limit as $\epsilon, \delta \to 0$. However, in order to do this, we must consider the behaviour of $P_m$ and $P_p$ in this limit. We distinguish two possible scenarios below.

2.3. Type A model

A sensible place to begin is the limiting regime for the motility event based on previous work in the literature [22] where it is shown that an agent on a random walk has a mean-squared displacement that scales with time. The motility mechanism described in Section 2.1 is that of a random walk and hence the following scaling for the probability of motility seems appropriate

$$\lim_{\epsilon, \delta \to 0} \frac{\epsilon^2 P_m}{2\delta} = \hat{P}_m^{*}, \tag{2.5}$$

where $\hat{P}_m^{*}$ is $O(1)$.

Although the proliferation event does not have the same relation of mean-squared displacement scaling with time, it is clear that the number of proliferation events will increase as the time step increases.
Hence, we adopt the natural scaling

$$\lim_{\delta \to 0} \frac{P_p}{\delta} = \hat{P}_p, \tag{2.6}$$

where $\hat{P}_p$ is $O(1)$.

Thus $P_m$ is $O(\delta/\epsilon^2)$ and $P_p$ is $O(\delta)$ (and hence also $O(\delta/\epsilon^2)$). As the higher order terms in Eq. (2.4) are of the form $P_m^a P_p^b$ where $a + b \geq 2$, it can be concluded that the higher order terms are $O(\delta^2/\epsilon^4)$ as $\epsilon, \delta \to 0$. Upon dividing Eq. (2.4) by $\delta$ the higher order terms become $O(\delta/\epsilon^4)$. Substituting these scalings into (2.4), and taking the limit $\epsilon, \delta \to 0$, assuming $\delta \ll \epsilon^4$, the following expression is obtained at leading order

$$\frac{\partial C}{\partial t} = \hat{P}_m^{*} \frac{\partial^2 C}{\partial x^2} + \hat{P}_p (1 - C) C. \tag{2.7}$$

This is the expression obtained by Simpson et al. [22].

We consider the ratio of $P_m$ and $P_p$ under the limiting regime given by Eqs. (2.5) and (2.6), obtaining

$$\frac{P_m}{P_p} = O\left(\frac{1}{\epsilon^2}\right), \tag{2.8}$$

and as $\epsilon \ll 1$ is required to make the continuum approximation, this implies that

$$\frac{P_m}{P_p} \gg 1, \tag{2.9}$$

or equivalently, $P_m \gg P_p$. The latter was observed by Simpson et al. [22], through numerical investigation, to be required for (2.7) to give an accurate approximation to the average CA behaviour.

An immediately obvious problem with this approximation is that when there is no motility ($P_m = 0$), the lack of a diffusion term means the cells cannot spread from the region initially occupied. However, as demonstrated in Fig. 1, proliferation alone does result in the agents spreading. Our goal is to develop a model that will operate accurately without any restrictions on the parameters. We consider an alternative limiting regime below that achieves this. This result is not only of mathematical interest but is important to many biological systems, for example in breast cancer cell migration where motility and proliferation are estimated to be of the same order [21].

2.4. Type B model

We now consider a second limiting regime in which the probabilities of both motility and proliferation are scaled with the time step, that is

$$\lim_{\delta \to 0} \frac{P_m}{\delta} = \hat{P}_m, \tag{2.10}$$

and $\hat{P}_p$ is defined as in Eq. (2.6). Here, $\hat{P}_m$ and $\hat{P}_p$ are $O(1)$. In this derivation, $P_m$ and $P_p$ are both $O(\delta)$ and hence the higher order terms in (2.4) are $O(\delta^2)$. Dividing both sides of Eq. (2.4) by $\delta$ and taking the limit as $\delta \to 0$ results in the continuous-time equation

$$\begin{aligned}
\frac{\partial C}{\partial t} &= \hat{P}_m \sum_{j=1}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} + \hat{P}_p (1 - C) \sum_{j=0}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}} \\
&= \hat{P}_p (1 - C) C + \left(\hat{P}_m + \hat{P}_p (1 - C)\right) \sum_{j=1}^{\infty} \frac{\epsilon^{2j}}{(2j)!} \frac{\partial^{2j} C}{\partial x^{2j}}.
\end{aligned} \tag{2.11}$$

If we take the limit as $\epsilon \to 0$ in Eq. (2.11), we are left with only a source term, which means that our model will be independent of the motility process as $\hat{P}_m$ is removed from the equation. If we consider the CA process, it is clear that the solution must depend on the motility process and hence requires more than just a source term due to proliferation.

This discrepancy can be resolved by noting that the limit $\epsilon \to 0$ is singular. Hence, we anticipate the existence of a boundary layer in the form of an invading front in which the average occupancy, $C(x, t)$, changes very rapidly. By denoting the initial edge of the front as $x_0$, we are able to introduce the coordinate $\zeta$, such that $x = x_0 + \epsilon\zeta$. This transformation results in the following equation

$$\frac{\partial C}{\partial t} = \hat{P}_p (1 - C) C + \left(\hat{P}_m + \hat{P}_p (1 - C)\right) \sum_{j=1}^{\infty} \frac{1}{(2j)!} \frac{\partial^{2j} C}{\partial \zeta^{2j}}. \tag{2.12}$$

This expression is of little practical utility, due to the presence of the infinite sum; however, the presence of the $1/(2j)!$ in the sum suggests that the contribution of the higher derivative terms may be negligible. Whilst, in general, neglecting these terms may be problematic (since it is another singular perturbation problem), in practice, as we shall shortly demonstrate (see Fig. 2 in Section 4.1), a good approximation of Eq. (2.1) is obtained by retaining only the first term in the series.
This results in the nonlinear Fisher's equation

$$\frac{\partial C}{\partial t} = \hat{P}_p (1 - C) C + \frac{1}{2} \left(\hat{P}_m + \hat{P}_p (1 - C)\right) \frac{\partial^2 C}{\partial \zeta^2}. \tag{2.13}$$

In order to obtain an approximation to the ensemble average occupancy, this equation is then solved numerically and transformed back into $x$ coordinates using $x = x_0 + \epsilon\zeta$. In order to do this, $\epsilon$ will be considered as a small finite number (as opposed to the above arguments where we consider $\epsilon \to 0$) and $x_0$ is the initial position of the wave front. Note that $x_0$ is simply a fixed constant and does not change as the wave front moves.

This derivation is more general than using the limiting regime in the Type A model as

$$\lim_{\delta \to 0} \frac{P_m}{P_p} \sim O(1).$$

As a consequence, it appears as though the Type B model has an advantage over the Type A model due to reduced restrictions. For the Type A model given by Eq. (2.7), we require both $P_m \gg P_p$ and $\epsilon^4 \gg \delta$, both of which are implicit assumptions in the derivation. In the Type B model, there exist no such restrictions, making it a more flexible result. Furthermore, not being restricted to $P_m \gg P_p$ allows the possibility that we may be able to predict the average behaviour of the CA when proliferation is dominant in the system.

Scaling arguments of the type presented in this section are a common tool in physical applied mathematics [14], and we have shown their usefulness in deriving continuum approximations for CA models. As well as allowing us to be explicit about the restrictions on the parameter values relevant to the two approximations (Type A and Type B), the size of neglected terms is also made clear. Previously, to our knowledge, only the Type A approximation has been derived, in which the diffusion term depends only on the probability of cell movement. In the absence of cell motility, it predicts no spreading of the initial cluster of cells, whereas spreading does occur in the CA as a result of proliferation (see Section 4.1).
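The contrast between the two approximations in the absence of motility can be checked numerically. The sketch below integrates Eqs. (2.7) and (2.13) with an explicit Euler scheme and central differences on a periodic grid. It assumes one grid point per lattice site, so that the grid spacing is $\epsilon$ in $x$ and 1 in $\zeta$. The function names, grid size, time step and parameter values are illustrative choices, not taken from the paper.

```python
def type_a_rhs(c, pm_star, pp_hat, eps):
    """Right-hand side of Eq. (2.7); the grid spacing in x is eps."""
    n = len(c)
    rhs = []
    for i in range(n):
        d2 = (c[i - 1] - 2.0 * c[i] + c[(i + 1) % n]) / eps ** 2
        rhs.append(pm_star * d2 + pp_hat * (1.0 - c[i]) * c[i])
    return rhs


def type_b_rhs(c, pm_hat, pp_hat):
    """Right-hand side of Eq. (2.13); the grid spacing in zeta is 1."""
    n = len(c)
    rhs = []
    for i in range(n):
        d2 = c[i - 1] - 2.0 * c[i] + c[(i + 1) % n]
        source = pp_hat * (1.0 - c[i]) * c[i]
        diffusion = 0.5 * (pm_hat + pp_hat * (1.0 - c[i])) * d2
        rhs.append(source + diffusion)
    return rhs


def euler_solve(rhs, c0, t_end, dt):
    """Explicit Euler time stepping; dt must be small enough for stability."""
    c = list(c0)
    for _ in range(int(round(t_end / dt))):
        c = [ci + dt * di for ci, di in zip(c, rhs(c))]
    return c


# Illustrative setup: 100 sites, sites 45-54 occupied, no motility.
eps = 1.0 / 40.0
initial = [1.0 if 45 <= i < 55 else 0.0 for i in range(100)]
type_a = euler_solve(lambda c: type_a_rhs(c, 0.0, 10.0, eps), initial, 1.0, 1e-3)
type_b = euler_solve(lambda c: type_b_rhs(c, 0.0, 10.0), initial, 1.0, 1e-3)
```

In this setup `type_a` stays equal to the initial condition, because the logistic source term vanishes wherever the occupancy is 0 or 1. In `type_b` the proliferation-dependent diffusion term moves occupancy into neighbouring sites.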
By reconsidering the scalings, we obtained the Type B approximation, which includes an additional diffusion term that depends on the probability of proliferation, and hence permits the cell cluster to spread even when there is no cell movement.

It is worth emphasising at this point that both Type A and Type B models are approximations of the conservation Eq. (2.1), which we obtained assuming independence between states. We observe in Fig. 2, Section 4.1, that the extra terms obtained in the Type B model are essential to obtain an accurate approximation to the conservation equations, which further demonstrates the importance of the choice of limiting regime.

In the next section, we relax all assumptions, except for that of independence between states, by describing and implementing a continuous-time CA model. This will allow us to determine the significance of the errors introduced by each process.

3. Continuous-time CA model

3.1. Defining a continuous-time model

We make two approximations through the use of Taylor series in the derivation of the continuum approximation to the CA model in Section 2. The existence of these approximations produces errors between our Type A and Type B models and the true ensemble average occupancy. The magnitude of these errors can be controlled by the choice of $\Delta x$ and $\Delta t$. However, here we completely eliminate these errors in order to further improve our approximation, leaving only the assumption of independence.

Eq. (2.1) shows the first order accurate, discrete conservation equation. In deriving this, we made an assumption that only terms of order $P_m$ and $P_p$ would be used and any term of the form $P_m^a P_p^b$ where $a + b \geq 2$ would be removed in the limit. This is essentially equivalent to saying that in a single time step, site $i$ will be involved in a maximum of one event, which will be the case when $\delta$ is sufficiently small.
In order to increase the accuracy of our approximation, we could keep some of these terms, giving us a higher order conservation equation. Unfortunately, the terms involved in these derivations become complicated and abundant very quickly, making this infeasible. Furthermore, as each model depends strongly on the fine details of the implementation, a new model would need to be derived each time the process is slightly changed. Instead, we can define a similar type of model, in which time is continuous, rather than discrete. In continuous time, multiple events cannot occur at once as we consider the exact times that events occur and hence these higher order terms do not appear in the derivation.

To describe this continuous-time CA model, we recognise that the discrete-time CA model described in Section 2 can be represented by a discrete-time Markov chain. The state space of this Markov chain is given by the possible different configurations of the sites. As each site can be either occupied or vacant, the size of the state space is $2^X$. The probability of moving from one state to another is only dependent on the current state and not the previously visited states, thus satisfying the memoryless property of Markov chains.

In a similar way we can describe a continuous-time Markov chain (CTMC) to represent our continuous-time CA. This is done by considering the rate at which events will occur. The rate at which agents implement the motility rule in a given direction is $\hat{P}_m/2$ and the rate at which agents implement the proliferation rule is $\hat{P}_p/2$. The motility and proliferation rules can be implemented whenever an agent is adjacent to a vacant site. We sum the total number of adjacent occupied and vacant sites (as this is the criterion for motility to be applied) and multiply this by $\hat{P}_m$ to get the total motility rate. This is also done for proliferation (as proliferation also requires adjacent occupied and vacant sites) with multiplication by $\hat{P}_p$ giving the total proliferation rate. Summing the two rates will give the rate at which the first transition will occur, $\lambda$, for the given state. In our CA model, $\lambda$ is calculated and an exponentially distributed time with rate $\lambda$ is generated. Of the possible transitions, one is then chosen at random with probability proportional to the rate at which it can occur. Note that this is not the only method, nor is it necessarily the most efficient method, to simulate the continuous-time CA model. Rather, we have chosen a method that clearly illustrates what is occurring in the CA model.

Whilst by definition [10] a CA model is discrete in time, we term our model a continuous-time CA, as it has many analogous qualities to a standard CA model. This model is an example of an interacting particle system [13].

Note that as well as removing the errors that occur due to the higher order terms, we have also dealt with how the events should be implemented as discussed in Section 2. In the discrete-time CA, the order in which agents implement their events is not clear. In our discrete-time implementation we chose for all motility events to occur before proliferation events; however, it is possibly more sensible to mix the two together. Upon mixing the two together, should the processes then alternate or should the order be randomly chosen via some distribution? Questions such as these have been implicitly addressed by moving to a continuous-time CA model.

3.2. Ordinary and partial differential equation approximations

Using the continuous time approach and the assumption of independence between states, we obtain the following conservation equation

$$\begin{aligned}
\frac{dC_i}{dt} &= \left(\frac{\hat{P}_m}{2} + \frac{\hat{P}_p}{2}\right) C_{i-1} (1 - C_i) + \left(\frac{\hat{P}_m}{2} + \frac{\hat{P}_p}{2}\right) (1 - C_i) C_{i+1} \\
&\quad - \frac{\hat{P}_m}{2} (1 - C_{i-1}) C_i - \frac{\hat{P}_m}{2} C_i (1 - C_{i+1}) \\
&= \frac{\hat{P}_m}{2} \left(C_{i-1} - 2C_i + C_{i+1}\right) + \frac{\hat{P}_p}{2} (1 - C_i)\left(C_{i-1} + C_{i+1}\right),
\end{aligned} \tag{3.1}$$

for $i = 1, \ldots, X$.
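Eq. (3.1) is a closed system of $X$ ordinary differential equations and can be integrated directly. The following is a minimal sketch using a classical fourth order Runge-Kutta step on a periodic lattice. The integrator, the function names, the step size and the parameter values are illustrative assumptions rather than the authors' implementation. As a sanity check, a spatially uniform initial condition reduces Eq. (3.1) to the logistic equation $dC/dt = \hat{P}_p C(1 - C)$, whose solution is known in closed form.

```python
import math


def ode_rhs(c, pm_hat, pp_hat):
    """Right-hand side of Eq. (3.1) on a periodic lattice."""
    n = len(c)
    out = []
    for i in range(n):
        left, right = c[i - 1], c[(i + 1) % n]
        out.append(0.5 * pm_hat * (left - 2.0 * c[i] + right)
                   + 0.5 * pp_hat * (1.0 - c[i]) * (left + right))
    return out


def rk4_step(c, dt, pm_hat, pp_hat):
    """One classical Runge-Kutta (RK4) step of size dt."""
    def advance(k, h):
        return [ci + h * ki for ci, ki in zip(c, k)]

    k1 = ode_rhs(c, pm_hat, pp_hat)
    k2 = ode_rhs(advance(k1, dt / 2.0), pm_hat, pp_hat)
    k3 = ode_rhs(advance(k2, dt / 2.0), pm_hat, pp_hat)
    k4 = ode_rhs(advance(k3, dt), pm_hat, pp_hat)
    return [ci + dt / 6.0 * (a + 2.0 * b + 2.0 * d + e)
            for ci, a, b, d, e in zip(c, k1, k2, k3, k4)]


def integrate(c0, t_end, dt, pm_hat, pp_hat):
    """Integrate Eq. (3.1) from t = 0 to t = t_end with fixed step dt."""
    c = list(c0)
    for _ in range(int(round(t_end / dt))):
        c = rk4_step(c, dt, pm_hat, pp_hat)
    return c


# Uniform occupancy 1/2 with pp_hat = 1 follows the logistic curve.
uniform = integrate([0.5] * 20, t_end=1.0, dt=0.01, pm_hat=25.0, pp_hat=1.0)
logistic_exact = 1.0 / (1.0 + math.exp(-1.0))

# Motility alone redistributes occupancy without changing its total.
clustered = [1.0 if 5 <= i < 15 else 0.0 for i in range(20)]
diffused = integrate(clustered, t_end=1.0, dt=0.01, pm_hat=25.0, pp_hat=0.0)
```

A fixed-step scheme is used here only to keep the sketch self-contained; an adaptive library solver would serve equally well.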
Note that Eq. (3.1) is exactly the equation that would be obtained if we were to simply make the continuum approximation $C_i^k \approx C_i(t_k)$, take a Taylor series expansion with respect to time and take the limit as $\Delta t \to 0$ of Eq. (2.1), the conservation equation. From these equations, we can still derive both Type A and Type B PDEs.

In order to derive the Type B model, we simply assume that $\hat{P}_m$ and $\hat{P}_p$ are $O(1)$ and obtain our solution using an identical method to that shown in Section 2. In order to derive the Type A model, we take $\hat{P}_m$ to be $O(1/\epsilon^2)$ and make the following limiting argument

$$\lim_{\epsilon \to 0} \epsilon^2 \hat{P}_m = \hat{P}_m^{*}, \tag{3.2}$$

where $\hat{P}_m^{*}$ is $O(1)$.

By considering this continuous-time CA, we have removed the limiting error with respect to time; however, there still exists a spatial truncation error that arises when taking Taylor series expansions to obtain the PDE. An alternative is simply to solve the system of $X$ ordinary differential equations (ODE) given in Eq. (3.1). Such a system can easily be solved numerically using software such as MATLAB. In fact, the PDE too must be solved numerically. This is likely to require a discretisation finer than the number of cells in order to resolve regions where the occupancy changes rapidly, hence it is likely that it will be more computationally efficient to solve the ODE model. However, the PDE model still has potential benefits, being more amenable to analysis.

Solving the system of ODEs in Eq. (3.1), and thus removing the truncations in both time and space, means that only the assumption of independence is required to derive our ODE approximation. By analysing the range of models against the simulated CA system, we are able to analyse the differences and deduce the effects that each assumption has on the accuracy of the approximation.

4. Analysis of models

4.1. Analysis of discrete-time models

We now compare the models to see whether either of the discrete-time continuum models can accurately approximate the CA.
In order to do this, we have considered a range of values of $P_m$ and $P_p$. In each example we use an initial condition with similar structure to that used in Fig. 1.

We begin by explicitly solving the difference equations given by Eq. (2.1) (ignoring the higher order terms) and comparing them to our two approximations. Fig. 2 shows that the Type B model and the difference equations agree strongly whereas the Type A model only agrees under the restricted condition $P_m \gg P_p$. When the difference equations are derived, the only assumption that has been made is independence between states. Fig. 2 implies that so long as the independence assumption is valid, the Type B model should accurately approximate the average properties of the CA for all parameter values.

Fig. 2. Comparisons between the solution to the difference equations (green), the Type A model (red) and the Type B model (black) (note that the green and black curves are indistinguishable). Each plot shows the solution at $t = 6$. The comparison is over three different regimes (from left to right); $(P_m, P_p) = (1/10, 1/100), (3/20, 3/20), (0, 1/50)$ corresponding to the three regimes, $P_m \gg P_p$, $P_m = P_p$, $P_m = 0$ respectively. The total number of sites is $X = 400$. The initial condition consists of 40 agents centrally located, that is, occupying sites 181–220. This initial condition corresponds to $L = 40\Delta x$ which is scaled to 1, giving $\epsilon = \Delta x / L = 1/40$. The macroscopic timescale $T = 500\Delta t$ has been chosen to approximately correspond to the time taken for a cell to move $L$ units. This is scaled to 1, giving $\delta = \Delta t / T = 1/500$. Consequently, $\hat{P}_m^{*} = \epsilon^2 P_m / (2\delta)$, $\hat{P}_m = P_m / \delta$ and $\hat{P}_p = P_p / \delta$ following Eqs. (2.5), (2.10) and (2.6), respectively. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

We then compare our models to the average CA. Fig. 3 shows average results from the discrete-time CA and the two approximations used when the probability of motility is much greater than the probability of proliferation ($P_m \gg P_p$). Just as was shown in [22], the model accurately predicts the behaviour of the CA when motility dominates. Both models agree so well that it is impossible to see the difference between them. This is not surprising as the two models are equivalent when $P_m \gg P_p$.

Fig. 3. Comparisons between simulated data averaged over 2000 realisations (blue), the Type A model (red) and the Type B model (black) (note that the red and black curves are indistinguishable). From left to right we have times $t = 2, 6, 10$ all with $(P_m, P_p) = (1, 1/1000)$. The initial condition and all scalings are as specified in the caption of Fig. 2.

Fig. 4. Comparisons between simulated data averaged over 2000 realisations (blue), the Type A model (red) and the Type B model (black). Rows show simulations progressing with times $t = 0, 2, 6, 10$. For each column $(P_m, P_p) = (1/20, 1/100), (1/100, 3/200), (0, 1/50)$ (from left to right). The initial condition and all scalings are as specified in the caption of Fig. 2.

Observing Fig. 4 it is immediately clear that neither the Type A nor the Type B model accurately approximates the average properties of the CA for all parameter values. When the probability of motility, $P_m$, is greater (but not significantly greater) than the probability of proliferation, $P_p$ (as in the first column of Fig. 4), we see that the two continuum models give similar results.
When the two parameters are of a similar value (as in the second column of Fig. 4) we see that the two models begin to separate. The Type B model (Eq. (2.13)) predicts more spread than the Type A model (Eq. (2.7)). There is an absence of diffusion due to proliferation in the Type A model and hence we expect less spread.

When motility is turned off (as in the third column of Fig. 4) we see that the Type A model does not change, whereas the Type B model continues to spread. Although it vastly overestimates the rate at which this spread occurs, it does qualitatively agree with the long term dynamics of the problem. That is, we expect the entire domain to reach carrying capacity after some finite amount of time. However, this will never happen in the Type A model. These results support our findings with respect to the parameter restrictions for the Type A model.

Given that the Type A model does not give correct long term dynamics when $P_m = 0$, the fact that it appears to be better than the Type B model at approximating the average properties of the CA for small but non-zero $P_m$ is disappointing. The reason that the Type A model appears to give a better approximation to the CA is only due to multiple errors existing in the approximation. The independence assumption gives an overestimate of the spread whereas the absence of diffusion due to proliferation gives an underestimate of the spread. The superposition of these two errors does, in some parameter regimes, decrease the overall error. However, this is fortuitous rather than by design. We seek to eliminate both errors to increase the accuracy in approximating the average CA properties. We therefore choose to analyse only the Type B model from here on.

4.2. Analysis of continuous-time models

We now compare the continuous-time models. Fig. 5 shows that the ODE and PDE models match up so well that they are almost indistinguishable from each other.
Despite removing the truncation error in time and, in the case of the ODE model, in space as well, the models still vastly overestimate the average properties of the CA in all cases. Note that there is minimal difference between Figs. 4 and 5. This shows that moving from discrete to continuous time has a minimal effect on the results. Despite this, there are still numerous reasons that continuous time is beneficial. As well as the removal of small errors, it also makes the CA implementation clearer. The biggest factor is that it simplifies the analysis, allowing us to explore the assumption of independence between sites in the next section.

Fig. 5. Comparisons between simulated data averaged over 2000 realisations (blue), the ODE model (red) and the PDE model (black) (note that the red and black curves are indistinguishable). Rows show simulations progressing with times $t = 0, 2, 6, 10$. For each column $(\hat{P}_m, \hat{P}_p) = (25, 5)$, $(5, 15/2)$, $(0, 10)$ (from left to right). The initial condition and spatial scaling are as specified in the caption of Fig. 2. There is no time scaling for the continuous CA.

At this point, we have removed all apparent possible sources of error from our model, except for the assumption of independence between sites. Despite this, we note that when $\hat{P}_m \gg \hat{P}_p$ the approximation will give accurate results. This suggests that the assumption of independence is causing problems for the proliferation mechanism but not the motility mechanism. We now explain this phenomenon.

5. The independence assumption

In deriving our approximation for the average occupancy, we have assumed that the probability of a given site being in a particular state is independent of the state of all other sites.
Assuming this, we may state that the probability of site $i$ being in some state $a$ and site $i+1$ being in some state $b$ is equal to the product of the probability of site $i$ being in state $a$ and the probability of site $i+1$ being in state $b$. The existence of independence between the states of sites for motility and proliferation has been explored in the literature [20] by calculating the correlation between sites. We demonstrate an alternative method by considering site $i$ and all sites that may interact with site $i$ in a single event. We then use marginal probabilities of site occupancy to obtain meaningful expressions for the rate of change of the probability of occupancy of site $i$, and are able to show that it is impossible to write the expression for proliferation in terms of marginal probabilities.

In continuous time, site $i$ can only interact with sites $i-1$ and $i+1$ in a single event, for both motility and proliferation. Hence we only need to consider the possible states of the triple made up of the sites $i-1$, $i$ and $i+1$. As we are considering three sites, each of which has two possible states, the triple will be in one of eight possible states. We denote the probability of the triple being in state $j$ by $p_j$, where $j = 0, 1, \ldots, 7$ and each $j$ labels the state given by its binary representation, as shown in Table 1. We define $q_i$ to be the probability that site $i$ is occupied. Note that $q_i$ is equivalent to the ensemble average occupancy of site $i$.

We first consider a motility event. To perform calculations without assuming independence we must explicitly consider the triples themselves.
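The triple labelling and the marginal occupancy probabilities can be sketched in a few lines. The function names are hypothetical. The bit ordering (site $i-1$ as the most significant bit, site $i+1$ as the least significant) is the one implied by the statement that site $i$ is occupied in states 2, 3, 6 and 7.

```python
def triple_state(j):
    """Occupancies (s_{i-1}, s_i, s_{i+1}) for label j, read as a 3-bit binary number."""
    return ((j >> 2) & 1, (j >> 1) & 1, j & 1)


def marginals(p):
    """Marginal occupancies (q_{i-1}, q_i, q_{i+1}) from triple probabilities p[0..7]."""
    q = [0.0, 0.0, 0.0]
    for j, p_j in enumerate(p):
        for site, occupied in enumerate(triple_state(j)):
            if occupied:
                q[site] += p_j  # state j contributes to every site it occupies
    return tuple(q)


if __name__ == "__main__":
    for j in range(8):
        print(j, triple_state(j))  # 0 = vacant, 1 = occupied
    # Labels of the states in which the middle site i is occupied.
    print([j for j in range(8) if triple_state(j)[1]])
```

Summing $p_j$ over the labels with a given bit set is all that "taking a marginal" means here, so each $q$ is a linear function of the eight triple probabilities.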
We write down the rate of change of $q_i$ by considering every triple in which site $i$ and an adjacent site are in different states:

\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_m}{2}\left[(p_4 + p_5) + (p_1 + p_5) - (p_2 + p_3) - (p_2 + p_6)\right] \\
&= \frac{\hat{P}_m}{2}\left[(p_4 + 2p_5 + p_1) - (p_3 + 2p_2 + p_6)\right].
\end{aligned}
\tag{5.1}
\]

Table 1. The possible states of the triples and their corresponding labels based on the binary interpretation of the number. In the binary representation, a 0 represents a vacant site and a 1 represents an occupied site.

We wish to show that this is the same expression that is obtained if we simply consider the single sites and assume independence, taking the product of probabilities whenever site $i$ and an adjacent site are in different states. Doing this yields the expression

\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_m}{2}\left[(q_{i-1} + q_{i+1})(1 - q_i) - (2 - q_{i-1} - q_{i+1})\,q_i\right] \\
&= \frac{\hat{P}_m}{2}\left[q_{i-1} - 2q_i + q_{i+1}\right].
\end{aligned}
\tag{5.2}
\]

From here, we can show that these two expressions are equivalent by considering marginal probabilities. The marginal probability of a given site being in a given state is the sum of the probabilities of all triples in which that site is in that state. For instance, $q_i = p_2 + p_3 + p_6 + p_7$, as site $i$ is occupied in states 2, 3, 6 and 7. Hence, we can rewrite Eq. (5.2) as

\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_m}{2}\left[(p_4 + p_5 + p_6 + p_7) - 2(p_2 + p_3 + p_6 + p_7) + (p_1 + p_3 + p_5 + p_7)\right] \\
&= \frac{\hat{P}_m}{2}\left[p_4 + 2p_5 + p_1 - (p_3 + 2p_2 + p_6)\right],
\end{aligned}
\tag{5.3}
\]

which is equivalent to Eq. (5.1). This proves that, for the unbiased motility mechanism described, we can simply calculate Eq. (5.2) rather than explicitly calculating the triples, and we will obtain the desired change in $q_i$ due to motility.

When attempting the same argument for proliferation (in the absence of motility), we do not get the same result.
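Both outcomes can be previewed numerically before working through the algebra. The sketch below evaluates the bracketed terms of the triple-based and single-site expressions for motility, Eqs. (5.1) and (5.2), and the corresponding proliferation gain terms. The function names and the choice of test distributions are illustrative assumptions.

```python
import random


def marginals(p):
    """(q_{i-1}, q_i, q_{i+1}) from the triple probabilities p[0..7]."""
    q_left = p[4] + p[5] + p[6] + p[7]
    q_mid = p[2] + p[3] + p[6] + p[7]
    q_right = p[1] + p[3] + p[5] + p[7]
    return q_left, q_mid, q_right


def motility_from_triples(p):
    """Bracketed term of Eq. (5.1)."""
    return (p[4] + 2 * p[5] + p[1]) - (p[3] + 2 * p[2] + p[6])


def motility_from_marginals(p):
    """Bracketed term of Eq. (5.2), which treats the sites as independent."""
    q_left, q_mid, q_right = marginals(p)
    return (q_left + q_right) * (1 - q_mid) - (2 - q_left - q_right) * q_mid


def proliferation_from_triples(p):
    """Site i vacant, weighted by its number of occupied neighbours."""
    return p[4] + 2 * p[5] + p[1]


def proliferation_from_marginals(p):
    """The same gain term with the sites treated as independent."""
    q_left, q_mid, q_right = marginals(p)
    return (q_left + q_right) * (1 - q_mid)


def random_triple_distribution(rng):
    """A random probability vector over the eight triple states."""
    weights = [rng.random() for _ in range(8)]
    total = sum(weights)
    return [w / total for w in weights]


# Fully correlated example: the triple is either empty (state 0) or full (state 7).
p_corr = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5]

if __name__ == "__main__":
    print(motility_from_triples(p_corr), motility_from_marginals(p_corr))
    print(proliferation_from_triples(p_corr), proliferation_from_marginals(p_corr))
```

For the fully correlated distribution both motility expressions vanish, whereas the triple-based proliferation term is 0 and the single-site version is 1/2: no vacant site ever has an occupied neighbour, yet the product of marginals predicts growth.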
Consider the rate of change of $q_i$ without assuming independence:

\[
\frac{dq_i}{dt} = \frac{\hat{P}_p}{2}\left[(p_4 + p_5) + (p_1 + p_5)\right] = \frac{\hat{P}_p}{2}\left[p_4 + 2p_5 + p_1\right].
\tag{5.4}
\]

We now write down the expression obtained by considering single sites only,

\[
\frac{dq_i}{dt} = \frac{\hat{P}_p}{2}\,(q_{i-1} + q_{i+1})(1 - q_i),
\tag{5.5}
\]

and using marginal probabilities, Eq. (5.5) becomes

\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_p}{2}\left[\big((p_4 + p_5 + p_6 + p_7) + (p_1 + p_3 + p_5 + p_7)\big)(1 - p_2 - p_3 - p_6 - p_7)\right] \\
&= \frac{\hat{P}_p}{2}\left[(p_4 + 2p_5 + p_1) + (p_6 + 2p_7 + p_3) - \big((p_4 + p_5 + p_6 + p_7) + (p_1 + p_3 + p_5 + p_7)\big)(p_2 + p_3 + p_6 + p_7)\right].
\end{aligned}
\tag{5.6}
\]

It is clear that the terms beyond $p_4 + 2p_5 + p_1$ in Eq. (5.6) do not cancel each other out, and hence Eqs. (5.4) and (5.5) are not equivalent. Observing Figs. 4 and 5, it is clear that this assumption of independence is not a reasonable one to make, as it gives rise to extremely inaccurate results from the system of ODEs (Eq. (3.1)) for the average occupancy probabilities of the CA.

Furthermore, we can show that no power series in the marginal probabilities $q_{i-1}$, $q_i$ and $q_{i+1}$ can be used to calculate the rate of change of the marginal probability $q_i$. First, we note that the probability of site $i$ being vacant is written as $1 - q_i$ and, as a result, the only terms that may appear are of the form $q_{i-1}^a q_i^b q_{i+1}^c$ for $a, b, c \in \{0, 1, 2, \ldots\}$. Second, we note that as the expression (5.4) is linear in each $p_j$, and $q_i$ is linear in each $p_j$, we may ignore all terms that are nonlinear. Hence we are left with the expression

\[
\begin{aligned}
\frac{dq_i}{dt} &= \frac{\hat{P}_p}{2}\left[a_1 q_{i-1} + a_2 q_i + a_3 q_{i+1}\right] \\
&= \frac{\hat{P}_p}{2}\left[a_1(p_4 + p_5 + p_6 + p_7) + a_2(p_2 + p_3 + p_6 + p_7) + a_3(p_1 + p_3 + p_5 + p_7)\right] \\
&= \frac{\hat{P}_p}{2}\left[a_3 p_1 + a_2 p_2 + (a_2 + a_3)p_3 + a_1 p_4 + (a_1 + a_3)p_5 + (a_1 + a_2)p_6 + (a_1 + a_2 + a_3)p_7\right].
\end{aligned}
\tag{5.7}
\]

There is no set $\{a_1, a_2, a_3\}$ such that this expression is equal to Eq. (5.4). For example, matching the coefficients of $p_1$, $p_2$ and $p_3$ would require $a_3 = 1$, $a_2 = 0$ and $a_2 + a_3 = 0$, which cannot hold simultaneously.
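The non-existence claim for Eq. (5.7) can be checked mechanically by comparing the coefficient of each $p_j$ with the corresponding coefficient in Eq. (5.4). The sketch below is one way to do so; the function name is hypothetical. It fixes $a_1$, $a_2$ and $a_3$ from the three states that involve a single unknown and then reports the states whose coefficients cannot be matched.

```python
def coefficient_mismatches():
    """Compare coefficients of p_j in Eq. (5.7) with those in Eq. (5.4)."""
    # Row j holds the multipliers of (a1, a2, a3) on p_j in Eq. (5.7):
    # the bits of j, since a1 goes with q_{i-1}, a2 with q_i and a3 with q_{i+1}.
    rows = {j: ((j >> 2) & 1, (j >> 1) & 1, j & 1) for j in range(1, 8)}
    # Coefficient of p_j in p4 + 2*p5 + p1, the bracketed term of Eq. (5.4).
    target = {1: 1, 2: 0, 3: 0, 4: 1, 5: 2, 6: 0, 7: 0}

    # p4, p2 and p1 each involve one unknown, so they determine a1, a2 and a3.
    a1, a2, a3 = target[4], target[2], target[1]

    mismatches = {}
    for j, (c1, c2, c3) in rows.items():
        implied = c1 * a1 + c2 * a2 + c3 * a3
        if implied != target[j]:
            mismatches[j] = (implied, target[j])
    return (a1, a2, a3), mismatches


if __name__ == "__main__":
    coefficients, mismatches = coefficient_mismatches()
    print(coefficients)
    print(mismatches)  # states whose coefficients cannot be matched
```

The only candidate is $(a_1, a_2, a_3) = (1, 0, 1)$, and it fails for $p_3$, $p_6$ and $p_7$. These are exactly the states in which site $i$ is occupied alongside at least one occupied neighbour, where a sum of marginals counts neighbours that cannot proliferate into site $i$.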
Therefore we have a contradiction, and hence it is impossible to write the rate of change of the probability of occupancy of site $i$ due to proliferation in terms of marginal probabilities alone.

6. Conclusion

The purpose of this paper was to rigorously explore the derivation of continuum approximations to cellular automata, ensuring that no issues have been overlooked. By choosing an appropriate limiting regime we have derived a model where there are no implicit restrictions on the parameter values. By using a continuous-time CA model we are able to eliminate truncation errors with respect to time, and by then considering an ODE approximation we are also able to remove truncation errors with respect to space. The continuous-time CA also made the implementation of the process clear, as well as allowing us to analyse independence in Section 5 without making any approximations.

We showed that the continuum approximation Type B (see Section 2.4) was an excellent approximation to the conservation equation, confirming the choice of limiting regime and showing that the conservation equation itself is a poor approximation of the CA. As the only approximation involved in the derivation of the conservation equation is that of independence between sites, we can conclude that this assumption is not appropriate whenever proliferation is present.

Research has already begun moving in the direction of alleviating the assumption of independence [2,15,20], and we have now confirmed that it is the sole issue with regard to the accuracy of CA approximations. In order to move forward in the field of approximations to cellular automata models, it is clear that the independence issue must be resolved. In future work, we too intend to present our ideas and research in this direction.

Acknowledgements

JEFG gratefully acknowledges funding from an ARC Discovery Early Career Researcher Award (DE 130100031). The work of NGB and JVR is supported by ARC Discovery Project Funding (DP 110101929).
BJB was supported by an Australian Government National Health and Medical Research Council Project Grant (APP1069757).

References

[1] D.J.G. Agnew, J.E.F. Green, T.M. Brown, M.J. Simpson, B.J. Binder, Distinguishing between mechanisms of cell aggregation using pair-correlation functions, J. Theor. Biol. 352 (2014) 16–23.
[2] R.E. Baker, M.J. Simpson, Correcting mean-field approximations for birth-death-movement processes, Phys. Rev. E 82 (2010) 041905.
[3] C. Beauchemin, J. Samuel, J. Tuszynski, A simple cellular automaton model for influenza A viral infections, J. Theor. Biol. 232 (2) (2005) 223–234.
[4] B.J. Binder, K.A. Landman, Exclusion processes on a growing domain, J. Theor. Biol. 259 (3) (2009) 541–551.
[5] B.J. Binder, K.A. Landman, M.J. Simpson, Modeling proliferative tissue growth: a general approach and an avian case study, Phys. Rev. E 78 (2008) 031912.
[6] B.J. Binder, J.V. Ross, M.J. Simpson, A hybrid model for studying spatial aspects of infectious diseases, ANZIAM J. 54 (2012) 37–49.
[7] J.M. Bloomfield, J.A. Sherratt, K.J. Painter, G. Landini, Cellular automata and integrodifferential equation models for cell renewal in mosaic tissues, J. Roy. Soc. Interface 7 (52) (2010) 1525–1535.
[8] A.Q. Cai, K.A. Landman, B.D. Hughes, Multi-scale modeling of a wound-healing cell migration assay, J. Theor. Biol. 245 (3) (2007) 576–594.
[9] D. Chowdhury, A. Schadschneider, K. Nishinari, Physics of transport and traffic phenomena in biology: from molecular motors and cells to organisms, Phys. Life Rev. 2 (2005) 318–352.
[10] G.B. Ermentrout, L. Edelstein-Keshet, Cellular automata approaches to biological modeling, J. Theor. Biol. 160 (1993) 97–133.
[11] E.J. Hackett-Jones, K.J. Davies, B.J. Binder, K.A. Landman, Generalized index for spatial data sets as a measure of complete spatial randomness, Phys. Rev. E 85 (2012) 061908.
[12] Y. Lee, S. Kouvroukoglou, L.V. McIntire, K. Zygourakis, A cellular automaton model for the proliferation of migrating contact-inhibited cells, Biophys. J. 69 (1995) 1284–1298.
[13] T.M. Liggett, Interacting Particle Systems, Springer, Berlin Heidelberg, 2005.
[14] C.C. Lin, L.A. Segel, Mathematics Applied to Deterministic Problems in the Natural Sciences, SIAM, 1974.
[15] D.C. Markham, M.J. Simpson, R.E. Baker, Simplified method for including spatial correlations in mean-field approximations, Phys. Rev. E 87 (2013) 062702.
[16] M. Markus, D. Böhm, M. Schmick, Simulation of vessel morphogenesis using cellular automata, Math. Biosci. 156 (1–2) (1999) 191–206.
[17] A.A. Patel, E.T. Gawlinski, S.K. Lemieux, R.A. Gatenby, A cellular automaton model of early tumor growth and invasion: the effects of native tissue vascularity and increased anaerobic tumor metabolism, J. Theor. Biol. 213 (2001) 315–331.
[18] M.J. Plank, M.J. Simpson, Models of collective cell behaviour with crowding effects: comparing lattice-based and lattice-free approaches, J. Roy. Soc. Interface 9 (76) (2012) 2983–2996.
[19] B.G. Sengers, C.P. Please, R.O.C. Oreffo, Experimental characterization and computational modelling of two-dimensional cell spreading for skeletal regeneration, J. Roy. Soc. Interface 4 (17) (2007) 1107–1117.
[20] M.J. Simpson, R.E. Baker, Corrected mean-field models for spatially dependent advection–diffusion–reaction phenomena, Phys. Rev. E 83 (2011) 051922.
[21] M.J. Simpson, B.J. Binder, P. Haridas, B.K. Wood, K.K. Treloar, D.L.S. McElwain, R.E. Baker, Experimental and modelling investigation of monolayer development with clustering, Bull. Math. Biol. 75 (5) (2013) 871–889.
[22] M.J. Simpson, K.A. Landman, B.D. Hughes, Cell invasion with proliferation mechanisms motivated by time-lapse data, Physica A 389 (2010) 3779–3790.
[23] M.J. Simpson, A. Merrifield, K.A. Landman, B.D. Hughes, Simulating invasion with cellular automata: connecting cell-scale and population-scale properties, Phys. Rev. E 76 (2007) 021918.
[24] Z. Wang, T.S. Deisboeck, Computational modeling of brain tumors: discrete, continuum or hybrid?, Sci. Model. Simul. 15 (1–3) (2008) 381–393.
[25] S.H. White, A.M. del Rey, G.R. Sànchez, Modeling epidemics using cellular automata, Appl. Math. Comput. 186 (2007) 193–202.
[26] D. Wodarz, A. Hofacre, J.W. Lau, Z. Sun, H. Fan, N.L. Komarova, Complex spatial dynamics of oncolytic viruses in vitro: mathematical and experimental approaches, PLOS Comput. Biol. 8 (6) (2012) e1002547.