Journal of Theoretical Biology 295 (2012) 86–99

Within-host demographic fluctuations and correlations in early retroviral infection

T.G. Vaughan a,*, P.D. Drummond a, A.J. Drummond b,c

a Centre for Atom Optics and Ultrafast Spectroscopy, Swinburne University of Technology, Melbourne, Australia
b Department of Computer Science, The University of Auckland, Auckland, New Zealand
c Allan Wilson Centre for Molecular Ecology and Evolution, The University of Auckland, Auckland, New Zealand

Article history: Received 12 July 2011; received in revised form 16 November 2011; accepted 17 November 2011; available online 25 November 2011.

Abstract

In this paper we analyze the demographic fluctuations and correlations present in within-host populations of viruses and their target cells during the early stages of infection.
In particular, we present an exact treatment of a discrete-population, stochastic, continuous-time master equation description of HIV or similar retroviral infection dynamics, employing Monte Carlo simulations. The results of calculations employing Gillespie’s direct method clearly demonstrate the importance of considering the microscopic details of the interactions which constitute the macroscopic dynamics. We then employ the τ-leaping approach to study the statistical characteristics of infections involving realistic absolute numbers of within-host viral and cellular populations, before going on to investigate the effect that the initial viral population size has on these characteristics. Our main conclusion is that cross-correlations between infected cell and virion populations alter dramatically over the course of the infection. We suggest that these statistical correlations offer a novel and robust signature for the acute phase of retroviral infection. © 2011 Elsevier Ltd. All rights reserved.

Keywords: HIV; Population dynamics; Monte Carlo simulation

1. Introduction

The infection of a macroscopic organism by a viral agent is an extremely complex process involving a vast number of both intracellular and intercellular microscopic events, constituting the interaction between an invading within-host viral population and the cellular populations of which the host’s immune system is comprised. This is particularly true of the Human Immunodeficiency Virus (HIV), which targets the immune system directly (refer to Levy, 2007, for a good review).
Due to this complexity, the discrete nature of these events is often ignored in the development of mathematical models, many popular examples of which (Perelson et al., 1993; Nowak and Bangham, 1996; Weinberger et al., 2009) assume instead that the magnitudes of the relevant within-host viral and cellular populations can be adequately represented by continuous variables evolving deterministically under the influence of a system of ordinary differential equations (ODEs). This approximation is mathematically equivalent to the use of deterministic ‘rate equations’ in chemistry to describe the dynamics of reacting chemical solutions, despite the fact that the solutions always contain integer numbers of particles. It is usually justified on the basis that the within-host populations are so extremely large that neglecting the discrete nature of the populations and interactions does not significantly affect the accuracy of the models.

* Corresponding author. Tel.: +61 3 9214 8465. E-mail address: [email protected] (T.G. Vaughan). doi:10.1016/j.jtbi.2011.11.016

However, while this approximation is certainly useful in providing a practically manageable and fairly accurate description of the macroscopic dynamics – one that is often used to study the effects of various treatment regimes (e.g. Bonhoeffer et al., 1997; Leenheer, 2009) – it is certainly not without cost.
Indeed, important features of the early infection – including the observed between-patient fluctuations in the viral load (Bonhoeffer et al., 2003; Fraser et al., 2007) and the potential for within-patient fluctuations to result in premature extinction of the viral population (Tan and Wu, 1998; Kamina et al., 2001; Khalili and Armaou, 2008) – cannot be described using this approximation.^1 To allow studies of the effects of discrete numbers of virions, the continuous-population assumption can be replaced by a more realistic assumption: that discretely sized cell and virus populations are influenced by microscopic events which occur at random times. This leads directly to probabilistic descriptions of the within-host infection dynamics, and is the source of what population biologists commonly refer to as demographic fluctuations.

^1 Although spontaneous clearance of an established HIV infection is almost never observed, we are referring here to clearances immediately following the receipt by the host of very small numbers of viral particles. Such clearances doubtless do occur, but in cases never clinically classed as infections.

We emphasize at the outset that the impact of these demographic fluctuations extends beyond uncertainty with respect to the timing of peak viremia, and that conditioning on the survival of the infection is not enough to eliminate the stochastic element they introduce. On the contrary, we will demonstrate that cross-correlations between the viral and cellular populations persist even under these conditions.

A large variety of stochastic approaches to modeling within-host infection dynamics have appeared in the literature over the last two decades, mostly in the context of modeling the progress of HIV infection, which we regard as a prototypical retroviral infection in terms of early stage behavior. These can be grouped into two very broad classes.
The first of these encompasses those models which are expressed analytically, an example being the continuous-time branching process description of early HIV infection presented by Merrill (1989) (see Merrill, 2005, for a more recent description of this model). A more sophisticated branching process model was recently presented by Conway and Coombs (2011) in their work on characterising the apparently random appearance of detectable viral loads (known as ‘blips’) in patients undergoing anti-retroviral treatment. The continuous-time diffusion model employed by Tuckwell and le Corfec (1998) is another example of an analytical model which offers a stochastic description of the dynamics of both the viral and target cell populations, in this case presented as a set of coupled stochastic differential equations (SDEs) closely analogous to the ODEs employed by Perelson et al. (1993) in their deterministic model. It is capable of describing the full equilibration phase of the infection. However, while their model does incorporate the stochasticity arising from the fundamentally discrete dynamics, their diffusion representation necessarily employs continuous variables to represent the population sizes. As has been pointed out by Kamina et al. (2001), this is an approximation which is only valid in the limit that the viral and target cell populations are large, preventing the model from correctly predicting the probability of viral extinction in the early initial phase of an infection.^2

The second class includes stochastic models formulated directly in terms of a computational algorithm. This includes the prominent Monte Carlo study of Tan and Wu (1998), which arguably constitutes the first full stochastic treatment of within-host HIV infection dynamics. Although their paper does demonstrate that the model can be written in terms of generalized – i.e. non-Langevin – stochastic differential equations, these equations are not used in generating numerical results.
The class also includes the work of Heffernan and Wahl (2005), which describes the results of stochastically modeling the dynamics of individual viral particles (virions), target cells and other relevant components of the immune system, each of these units possessing a unique identity under the model. This approach permits a very detailed analysis of the effects of non-exponential probability distributions for the time between events, the only problem being the computational burden created by the necessity of individually tracking arbitrarily large numbers of interacting particles. Finally, the algorithmic class includes many descriptions of infection dynamics in terms of stochastic cellular automata (SCA). Prominent examples of the application of such modeling to the study of HIV infection can be found in the work of Ruskin et al. (2002), Castiglione et al. (2004), and Lin and Shuai (2010), although models of the dynamics of other infectious pathogens also exist. (For example, Beauchemin et al., 2005 use an SCA to model the infection dynamics of Influenza A.) Due to the algorithmic simplicity of SCA-based models, which describe the discrete-time dynamics of a spatially distributed population in terms of a set of ‘rules’ which govern local interactions between adjacent ‘cells’, they have been used as the foundation for biologically complex large-scale simulations of the response of the human immune system to viral infection (Halling-Brown et al., 2009).

^2 While one might consider the continuous-time Markov chain model of disease progression in HIV patients developed by Kousignian et al. (2003) as a further example, here we are only interested in models of the microscopic within-patient infection dynamics.
The highly intuitive nature of these algorithmic descriptions of infection dynamics together with the fact that models posed in this fashion provide an explicit link between the model and the data predicted by the model are the two main advantages that algorithmic models hold over their analytically formulated counterparts, and likely go a long way toward explaining their popularity. However, we argue that carefully posed analytical descriptions are more useful in the long term for two reasons. Firstly, being expressed in the language of mathematics, analytical models can be investigated using a wide variety of analytical techniques. The branching process model of Conway and Coombs (2011) mentioned earlier, for instance, was shown to possess analytical solutions. This is an important fact, as even approximate analysis can yield valuable insights into the dynamics. Secondly, the fact that such descriptions are not explicitly tied to a particular method of numerical solution can be an advantage in its own right, as it is sometimes the case that a number of specific algorithms must be tried before an optimal means of solution can be found. For these reasons we focus here on an analytically formulated model (as opposed to algorithmically formulated) of viral infection dynamics. While biologically less sophisticated than some existing models (in particular lacking any explicit treatment of the immune response), our model provides a logical extension to the common deterministic models by explicitly modeling these dynamics in terms of discrete events occurring at unknown times. It also has the capacity to be extended further. Specifically, we treat a stochastic extension to a simple form of the deterministic models originally developed by Perelson et al. (1993) (see Perelson, 2002, for a more recent review) and as described in the book by Nowak and May (2000), which we argue is a sufficient approximation over the initial phase of the infection. 
We present this model in terms of a chemical master equation (CME), a particular variety of forward Kolmogorov equation – an equation of motion for a probability distribution over accessible system states – commonly used to describe the Markovian stochastic dynamics of reacting chemical agents (see van Kampen, 2007, for an excellent overview). After detailing its relationship to deterministic models we demonstrate that, for small pathogen populations, the CME can be solved numerically using the stochastic simulation algorithm (SSA) (Gillespie, 1976, 1977). This same algorithm was also recently employed by Conway and Coombs (2011) in the study of their branching process-based model of infection dynamics. In our case the solutions reveal, in a consistent way free from unnecessary approximation, the extent to which the random timing of the underlying discrete events constituting the infection process affects the macroscopic features of the dynamics.

We then widen our study to include infections involving within-host viral and cellular population sizes comparable to those actually present in humans during the initial phases of HIV infection. These larger systems are impractical to study using the SSA due to the way the computational complexity of that algorithm depends on the number of particles. We therefore employ the τ-leaping algorithm (Gillespie, 2001), which is essentially a finite time-step integration algorithm for birth–death processes and offers large efficiency gains over the SSA. Importantly, we show that demographic fluctuations give rise to significant cross-correlations between different populations contained within the model, even when the absolute numbers of virions exceed $10^{13}$. This demonstrates that the effects of these fluctuations are not limited to small within-host pathogen populations.
We note that τ-leaping has been used recently in another study of early-time stochastic infection dynamics by Khalili and Armaou (2008), although that work focuses exclusively on fluctuation-driven infection extinction involving much smaller within-host populations.

Finally, we investigate the effect of the size of the initial within-host viral population on the stochastic infection dynamics. We find that although the enhanced probability of extinction of the viral and proviral populations dramatically alters the expected dynamics in this regime, the dynamics conditional on the ‘survival’ of the infection are very similar to those stemming from larger initial viral populations. We investigate the dependence of the cross-correlations on this initial size, and show how virion-infected cell cross-correlations can be used as a robust indicator for the acute stage of the infection.

2. A stochastic virus model

As stated in the previous section, the focus of this paper is a stochastic extension to the standard deterministic model of viral dynamics (Perelson et al., 1993; Perelson, 2002) in the form presented by Nowak and May (2000). This model involves three distinct populations: an uninfected target cell population, the population of infected cells and the virus population. Members of these populations (denoted respectively X, Y and V) behave and interact according to a number of elementary processes, which will be described below. We consider this model in the particular context of primary HIV-1 infection.

It is important to bear in mind that HIV and other rapidly evolving RNA viruses such as HCV exhibit strong evolutionary dynamics during the course of infection (Rambaut et al., 2004; Farci et al., 2000), driven by high rates of genetic diversification. Our model therefore appears quite at odds with reality, as the viral populations described by the model clearly lack any internal genetic structure. However, it has been shown by Keele et al.
(2008) that, at least in the case of HIV-1, little or no selection favouring particular viral strains occurs within the early phase of the infection preceding peak viremia. In other words, the evolution is entirely neutral over the initial phase of the infection, and can therefore have little influence on the dynamics of the total population sizes. We are therefore justified in ignoring this additional structure for the time being, provided that we limit our investigation to short times following infection.

Additionally, the adaptive immune response to the infection is omitted from this tri-population model. Again, however, while clearly important over longer time-scales, the success of simple deterministic models which ignore immune response suggests that its effect on the within-host population size dynamics during the initial phase of the infection must be negligible, as discussed in the review by Perelson (2002).

In the model, a schematic of which is provided in Fig. 1, the uninfected target cell population is continuously replenished via a constant-rate birth process. Borrowing from the notation of chemical kinetics, we can express this process in the form of the reaction

$$ \emptyset \xrightarrow{\lambda} X, \tag{1} $$

where $\emptyset$ is a stand-in for the target-cell progenitor (we ignore this population in our model) and $\lambda$ is the constant which specifies the rate of the replenishment process in units of inverse time. In the case of HIV-1, the target cell population consists primarily of CD4+ T lymphocytes, which are produced by the thymus.^3 While the exact rate at which these cells are produced is known to vary throughout their host’s life (Bains et al., 2009), this variation can be ignored over the far shorter time-scale of the primary infection dynamics. For an adult human, this rate has been measured to be approximately $\lambda = 10^{8}$ cells per day (Vrisekoop et al., 2008; Clark et al., 1999).
Secondly, the target cell production process is countered by the natural death of these cells, which occurs via the following decay process:

$$ X \xrightarrow{d} \emptyset. \tag{2} $$

For human CD4+ T cells, this decay rate was recently measured by Vrisekoop et al. (2008) to be on the order of $5 \times 10^{-4}$ per cell per day. Together with the thymic output rate, this implies that the healthy adult human body maintains a population of $\simeq 2 \times 10^{11}$ CD4+ T cells.

In the same parlance, the process by which a virion and a target cell combine to form a productively infected cell can be written

$$ X + V \xrightarrow{\beta} Y. \tag{3} $$

In retroviruses, this implicitly involves the reverse-transcription of the viral RNA into DNA via the viral enzyme reverse transcriptase (RT). The action of this enzyme is hugely error-prone and is a major source of genetic diversification within the viral population. In our model, however, we ignore this diversity and instead focus only on the bulk dynamics of the viral population. In the case of HIV-1, the infection rate $\beta$ is not often measured directly, but can be inferred from estimates of the steady state infected cell population.

Once infected, cells in our model emit a constant stream of viral particles via the reaction

$$ Y \xrightarrow{k} Y + V. \tag{4} $$

For HIV, the rate at which virions are produced is on the order of $k = 10^{3}$ particles per infected cell per day (Hockett et al., 1999). This process is a further source of genetic diversity in retroviral populations, as the cellular enzyme RNA polymerase (RNAP) – responsible for transcribing the proviral DNA into RNA – occasionally induces errors, although at a rate which is thought to be much lower than the RT error rate.

Fig. 1. Schematic of the model used in this paper, detailing the microscopic processes involving target cells (X), infected cells (Y) and virions (V). Each arrow represents a single process occurring at the rate given by its label, with its tail(s) indicating the one or more bodies which instigate the process and the head indicating the product.
The dashed line indicates that infected cells are not consumed in the production of virions.

^3 It should be noted that CD4+ T cells are actually produced by two distinct processes: T cell generation by the thymus and T cell proliferation. While a constant-rate birth process is a reasonable description of the thymus output, proliferation is a nonlinear process which involves T cells replenishing their own population via cell division. While this complication is often completely ignored, it is important to note that the relative contributions of each of these processes to maintaining homeostasis in the pre-infection population of T cells and the subsequent dynamics of this population post-infection are still largely unknown (see Borghans and de Boer, 2007, for an insightful discussion).

Just as for uninfected cells, the production of infected target cells is countered by an associated decay process

$$ Y \xrightarrow{a} \emptyset. \tag{5} $$

Compared with the decay of uninfected cells, infected cells perish at a greatly enhanced rate due to the burden that viral production places on the cellular machinery, together with the active targeting of such cells by the immune system. In the case of HIV, this decay rate has been measured by Markowitz et al. (2003) to be close to 1 cell per infected cell per day, more than three orders of magnitude greater than the uninfected T cell death rate.

The final elementary process in our model describes the combined effects of removal of virions by the immune system and the natural decay of the viral particles themselves:

$$ V \xrightarrow{u} \emptyset. \tag{6} $$

The decay of HIV virions is known to occur at at least two distinct rates: between 9 and 36 virions per day for every virion which exists in the blood plasma (Ramratnam et al., 1999), and approximately 3 virions per day for every virion localised within lymphatic tissue (Müller et al., 2001).
As the latter describes the vast bulk of the total viral population (Haase et al., 1996), we assume that $u = 3$ per day is the most sensible clearance rate to use in the simple three-mode model.

Before we proceed with our analysis, we feel it is important to once more emphasize that this model presents a drastically simplified view of even the short-time dynamics of the infection process. For instance, there is the obvious fact that real human bodies are highly heterogeneous entities, the implications of which have been studied theoretically by Funk et al. (2005). In contrast, our model assumes that the populations interact in a completely homogeneous environment, essentially treating the host as a biological analogue of a well-mixed chemical reactor vessel. Furthermore, the implicit assumption that each of the interacting populations can be regarded as a monoculture is quite far from the truth.^4 This said, it is equally important to note that overly sophisticated models can be lacking in their ability to generate useful insight. The significance of the particular model that we have chosen to consider lies in the fact that, while it is only loosely connected to the biological reality, it allows us to focus on the specific task of treating the underlying discrete nature of the populations involved in the early dynamics of the infection.

occupy at each time:

$$
\begin{aligned}
\frac{\partial}{\partial t} P(\vec{n},t) ={}& \lambda\left[P(\vec{n}^{-X},t) - P(\vec{n},t)\right] \\
&+ \beta\left[(n_X+1)(n_V+1)\,P(\vec{n}^{+X+V-Y},t) - n_X n_V\,P(\vec{n},t)\right] \\
&+ k\left[n_Y\,P(\vec{n}^{-V},t) - n_Y\,P(\vec{n},t)\right] \\
&+ d\left[(n_X+1)\,P(\vec{n}^{+X},t) - n_X\,P(\vec{n},t)\right] \\
&+ a\left[(n_Y+1)\,P(\vec{n}^{+Y},t) - n_Y\,P(\vec{n},t)\right] \\
&+ u\left[(n_V+1)\,P(\vec{n}^{+V},t) - n_V\,P(\vec{n},t)\right].
\end{aligned}
\tag{7}
$$

Here $\vec{n}^{\pm X} \equiv (n_X \pm 1, n_Y, n_V)$, and both $\vec{n}^{\pm Y}$ and $\vec{n}^{\pm V}$ are defined similarly. It is important to note the subtle but fundamental distinction between the true (but unknown) population sizes $N_i(t)$ and the integer variables $n_i$ which range over all of the possible values that the true population sizes may hold.
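As an illustrative aside (not part of the original analysis), the gain and loss terms of Eq. (7) can be read as six reaction channels, each with a propensity and a state change. The Python sketch below encodes them under that reading. The names `PARAMS`, `reactions` and `mean_drift` are hypothetical, and the parameter values are those listed in Table 1. The function `mean_drift` sums propensity times state change at a single state, which gives the net rate of change of each population when the state is treated as sharply defined.

```python
# Hypothetical encoding of reactions (1)-(6); all names are illustrative only.
# The state is n = (n_X, n_Y, n_V); parameter values follow Table 1.
PARAMS = {"lam": 2.5e8, "beta": 5e-13, "k": 1e3, "d": 1e-3, "a": 1.0, "u": 3.0}


def reactions(p):
    """Return a list of (propensity function, state-change vector) pairs."""
    return [
        (lambda n: p["lam"],                (+1,  0,  0)),  # 0 -> X
        (lambda n: p["d"] * n[0],           (-1,  0,  0)),  # X -> 0
        (lambda n: p["beta"] * n[0] * n[2], (-1, +1, -1)),  # X + V -> Y
        (lambda n: p["k"] * n[1],           ( 0,  0, +1)),  # Y -> Y + V
        (lambda n: p["a"] * n[1],           ( 0, -1,  0)),  # Y -> 0
        (lambda n: p["u"] * n[2],           ( 0,  0, -1)),  # V -> 0
    ]


def mean_drift(n, p=PARAMS):
    """Sum of propensity * state change over all six channels at state n."""
    drift = [0.0, 0.0, 0.0]
    for rate, change in reactions(p):
        r = rate(n)
        for i in range(3):
            drift[i] += r * change[i]
    return tuple(drift)


# Uninfected state: production of target cells should balance their death.
healthy = (PARAMS["lam"] / PARAMS["d"], 0, 0)
print(mean_drift(healthy))
```

At the uninfected state $n_X = \lambda/d$ with no infected cells or virions, the target cell component of the drift should vanish up to floating-point rounding, and the other two components are zero.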
This chemical master equation (CME) constitutes the full description of the stochastic model. Given an initial condition (in the form of either a particular known initial state $\vec{n}_0$ or a probability distribution $P(\vec{n},0)$ over possible initial states), the CME can be used to determine the probability distribution over the states through which the system is likely to pass at any future time. However, there is no known general solution to Eq. (7). Furthermore, although direct numerical integration of the CME is possible for small numbers of cells and virions, this approach is also infeasible for numbers comparable to what we expect to find in a human body (on the order of $10^{10}$ for each population during HIV infection) due to the prohibitively large volume of accessible state-space, which goes as

$$ V = N_X^{\max} N_Y^{\max} N_V^{\max}, \tag{8} $$

where $N_i^{\max}$ are the maximum population sizes to be considered in the calculation. Thus, a full-sized calculation using this very simplistic dynamical model would involve the integration of over $10^{30}$ coupled ordinary differential equations. We therefore seek alternative approaches.

2.2. Summary statistics for cellular and viral populations

An easy first step in simplifying the problem of solving Eq. (7) can be made by explicitly shifting our focus from the dynamics of the entire probability distribution to the dynamics of derived moments or ‘summary statistics’. The simplest of these are the expected population sizes, defined as

$$ \langle N_i(t) \rangle = \sum_{\vec{n}} n_i\,P(\vec{n},t), \tag{9} $$

2.1. Chemical master equation

In order to investigate the dynamics of the populations involved in our model of the infection process, we describe the state of the system using the vector form $\vec{N} \equiv (N_X, N_Y, N_V)$, where $N_i$ are non-negative integers describing the sizes of the populations involved in the dynamics.
During the course of an infection, this vector follows a trajectory $\vec{N}(t) \equiv (N_X(t), N_Y(t), N_V(t))$ through the accessible state space, with $t$ being the (possibly fractional) number of days following inoculation. Making explicit our assumption of homogeneity and adding a further assumption that the processes involved in the model occur at constant rates and at independent random intervals, we can express the reactions given in Eqs. (1)–(6) in terms of the following equation of motion for the probability distribution $P(\vec{n},t) \equiv P(\vec{N}(t) = \vec{n})$ over the possible states that the system may

where again we have used the $i$ to stand in for X, Y or V. These can be thought of as the average of the population sizes measured at a particular time in each of a large ensemble of identically infected hosts. Similarly, we define the expected product of population sizes

$$ \langle N_i(t) N_j(t) \rangle = \sum_{\vec{n}} n_i n_j\,P(\vec{n},t). \tag{10} $$

^4 The CD4+ T cell population, for instance, consists of naïve, effector and memory sub-populations, all of which can be in either active or inactive states. Each of these sub-populations possesses unique characteristics with regard to their interaction with HIV virions.

It is important to note that an expected product of the population sizes, $\langle N_X(t) N_V(t) \rangle$ for example, is not in general equivalent to the product of the expected population sizes, $\langle N_X(t) \rangle \langle N_V(t) \rangle$, due to the interdependence of the population sizes which we anticipate arises naturally from the reactions incorporated into Eq. (7). This difference is quantified by the inter-population covariance, defined in our context by

$$ \mathrm{Cov}(N_i(t), N_j(t)) = \langle N_i(t) N_j(t) \rangle - \langle N_i(t) \rangle \langle N_j(t) \rangle = \langle \Delta N_i(t)\,\Delta N_j(t) \rangle, \tag{11} $$

where $\Delta N_i(t) \equiv N_i(t) - \langle N_i(t) \rangle$. We can develop a more intuitive grasp of what such covariances represent by considering the second form of the definition
above, where we see that the covariance is a measure of the expected product of the differences between the real population sizes and their individual expected values. If these differences are expected to be both positive or both negative simultaneously, the covariance will be a positive number. That is, when fluctuations in one of the populations are to some degree synchronous and of the same sign as fluctuations in the other, we should find that $\mathrm{Cov}(N_i(t), N_j(t)) > 0$. If, on the other hand, the fluctuations are synchronous but of opposite signs in each population, we expect this to be reflected by a negative covariance, $\mathrm{Cov}(N_i(t), N_j(t)) < 0$. Lastly, if the fluctuations are completely asynchronous (uncorrelated), the product $\langle \Delta N_i(t)\,\Delta N_j(t) \rangle$ will factorize and the covariance will vanish, as $\langle \Delta N_i(t) \rangle = 0$ by definition.

Another way of thinking about the covariance is by considering a simple thought experiment in which two T cells are sampled (with replacement) from the system at time $t$ without regard for their respective types. The joint probability that the two individuals sampled are of types $i$ and $j$ is

$$ P(i,j) = \left\langle \frac{N_i(t) N_j(t)}{N_T(t)^2} \right\rangle, \tag{12} $$

where we have used $N_T(t) = N_X(t) + N_Y(t)$ to represent the total number of individual T cells present in the system at that time. In the case that the two populations are completely independent, the joint probability factorizes into $P(i)P(j)$, where $P(i) = \langle N_i(t)/N_T(t) \rangle$ is simply the probability of sampling an individual T cell from population $i$. Assuming that the relative variance in $N_T(t)$ is negligible and that the denominators can be factored out of $P(i)$, $P(j)$ and $P(i,j)$, the difference between the joint probability and the factorized form is

$$ P(i,j) - P(i)P(j) \simeq \frac{\mathrm{Cov}(N_i(t), N_j(t))}{\langle N_T(t) \rangle^2}. \tag{13} $$

of the stochastic variables they enclose, while $\mathrm{Cov}(N_i, N_j)$ represents the inter-population covariance defined in Eq. (11).
These covariance terms prevent us from solving Eqs. (15)–(17) exactly, and show that even the dynamics of the expected population sizes depend explicitly on the correlations which can exist between the populations. Neglecting these covariances for a moment, we recover a closed system of ODEs which form the standard deterministic model of virus infection described by Perelson (2002). The mean steady state infected cell population predicted by this model is

$$ \langle N_Y^{*} \rangle = \frac{\lambda}{a}\left(1 - \frac{1}{R_0}\right), \tag{18} $$

where $R_0$ is the effective reproductive ratio of the virus, defined by

$$ R_0 = \frac{\lambda \beta k}{d a u}\left(1 - \frac{a}{k}\right). \tag{19} $$

In the case of HIV, the steady state number of productively infected cells has been estimated to be on the order of $10^{8}$ in total (Cavert et al., 1997). In combination with the parameter estimates already noted and Eq. (18), this yields an order-of-magnitude estimate for the infection rate of $\beta \approx 10^{-13}$ per cell per virion per day.

Fig. 2 illustrates the dynamics predicted by the approximate deterministic model using the parameters tabulated in Table 1. The precise values of the cellular production rate $\lambda$ and the uninfected cell death rate $d$ have been chosen such that the uninfected steady state of the target cell population is $2.5 \times 10^{11}$, the value estimated by Clark et al. (1999). The transient oscillations present in the figure are probably unrealistic, as they do not clearly show up in time series measurements such as those made of the blood plasma viral load

That is, the covariance is proportional to the difference between the joint and factorized sampling probabilities, as expressed by Eq. (14). Thus, a positive covariance implies that the joint probability of sampling individuals from the two populations (infected and uninfected) involved is greater than the probability one would anticipate if the dynamics of the two population sizes were completely independent. Conversely, a negative covariance suggests that the joint sampling probability is less than it should be if the populations are independent and is thus evidence of an anticorrelation between the sizes of the two populations.
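To make the sign convention of the covariance concrete, the following Python sketch estimates $\mathrm{Cov}(N_i, N_j)$ from a finite ensemble using the first form of Eq. (11). It is an illustration only: the function name `covariance` is hypothetical, the sample values are invented for the example rather than taken from any simulation or measurement, and the estimator divides by the ensemble size (the simple plug-in form, not the bias-corrected one).

```python
def covariance(samples_i, samples_j):
    """Ensemble estimate of Cov(N_i, N_j) = <N_i N_j> - <N_i><N_j>."""
    if len(samples_i) != len(samples_j) or not samples_i:
        raise ValueError("need two equally sized, non-empty samples")
    m = len(samples_i)
    mean_i = sum(samples_i) / m
    mean_j = sum(samples_j) / m
    mean_ij = sum(x * y for x, y in zip(samples_i, samples_j)) / m
    return mean_ij - mean_i * mean_j


# Invented population sizes at one time point in three hypothetical hosts.
infected = [10, 20, 30]
virions = [100, 210, 290]   # rises with the infected-cell count
targets = [900, 800, 700]   # falls as the infected-cell count rises

print(covariance(infected, virions) > 0)  # fluctuations of the same sign
print(covariance(infected, targets) < 0)  # fluctuations of opposite sign
```

Both comparisons evaluate to `True` for these invented samples: populations that fluctuate together give a positive estimate, and populations that fluctuate in opposition give a negative one.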
$$ \mathrm{Cov}(N_i(t), N_j(t)) \propto P(i,j) - P(i)P(j). \tag{14} $$

Finally, we note that in the case that $i$ and $j$ refer to the same population, the covariance reduces to the variance. This describes our degree of uncertainty in the magnitude of that population due to the stochastic nature of the model.

2.3. Dynamics of expected population sizes

The exact dynamics of the mean population numbers can be derived from the CME using the approach described (for example) in Gardiner (2004). This yields

$$ \frac{d\langle N_X \rangle}{dt} = \lambda - \beta\left[\langle N_X \rangle \langle N_V \rangle + \mathrm{Cov}(N_X, N_V)\right] - d\,\langle N_X \rangle, \tag{15} $$

$$ \frac{d\langle N_Y \rangle}{dt} = \beta\left[\langle N_X \rangle \langle N_V \rangle + \mathrm{Cov}(N_X, N_V)\right] - a\,\langle N_Y \rangle, \tag{16} $$

$$ \frac{d\langle N_V \rangle}{dt} = k\,\langle N_Y \rangle - \beta\left[\langle N_X \rangle \langle N_V \rangle + \mathrm{Cov}(N_X, N_V)\right] - u\,\langle N_V \rangle, \tag{17} $$

where for the sake of brevity and visual clarity we have omitted the time-dependence of the stochastic population variables (i.e. $N_i \equiv N_i(t)$ in these equations). As explained above in Section 2.2, the angled brackets $\langle \ldots \rangle$ are used to denote the expected value

Fig. 2. Results obtained by integrating deterministic equations of motion, using the parameters given in Table 1. (Axes: total cell/virion number against t (days); curves: uninfected T cell, infected T cell, viral load.)

Table 1
Descriptions of model parameters and the values used in our simulations.

Parameter   Description                          Value
$\lambda$   Target cell production rate          $2.5 \times 10^{8}$/day
$\beta$     Target cell infection rate           $5 \times 10^{-13}$/virus/target cell/day
$k$         Virion production rate               $10^{3}$/infected cell/day
$d$         Uninfected target cell death rate    $10^{-3}$/target cell/day
$a$         Infected target cell death rate      1/infected cell/day
$u$         Virion clearance rate                3/virion/day

3. Stochastic simulations using the direct method

Arguably the most common numerical alternative to directly integrating master equations such as Eq. (7) is the stochastic simulation algorithm (SSA) developed by Daniel Gillespie to treat the dynamics of evolving chemical systems (Gillespie, 1976, 1977).
This approach involves numerically generating individual ~ ðiÞ ðtÞ, which can then be considered indesystem ‘trajectories’ N pendent samples drawn from the probability distribution Pð~ n ,tÞ. The probability distribution can then be approximated in terms of a finite number of these trajectories by way of the estimator N 1 X P^ N ð~ n ,tÞ ¼ d3 ðiÞ , N i ¼ 1 ~n , N~ ðtÞ 3 where d~ ~ ðiÞ n , N ðtÞ ð20Þ represents a three-dimensional Kronecker delta ðiÞ ~ ðtÞ ¼ ~ n and zero otherwise. function that is unity whenever N This converges to the true distribution in the limit of an infinite number of trajectories, N . This estimator allows us to express moments such as population means and variances in terms of the trajectory ensemble. The average uninfected target cell number, for example, can be calculated in the following way: N X 1 X P^ N ð~ n ,tÞnX ¼ lim NðiÞ X ðtÞ: N -1 N -1 N ~ i¼1 n /NX ðtÞS ¼ lim ð21Þ While in principle an infinite number of trajectories are needed to obtain exact results from the algorithm, the strength of the SSA – and the strength of all Monte Carlo algorithms – lies in the fact that relatively small trajectory ensembles can often be used to calculate sample moments which are good approximations of the true moment values. Coupled with the fact that one can easily obtain statistical estimates of the error associated with these finite-ensemble approximations, results obtained using finite ensembles generated using the SSA can be regarded essentially exact in the same way that results of finite step size numerical integration algorithms used for integrating ordinary differential equations are considered exact, provided the associated errors are kept in check. reactions correspond to the simplified CME @ Pð~ n ,tÞ ¼ l½Pð~ n X ,tÞPð~ n ,tÞ þd½ðnX þ 1ÞPð~ n þ X ,tÞnX Pð~ n ,tÞ @t ð22Þ and therefore to the following equation of motion for the mean target cell population dynamics: d /N X ðtÞS ¼ ld/N X ðtÞS: dt ð23Þ The graph shown in Fig. 
3 displays the time that was taken by a 2.8 GHz Intel Xeon processor to simulate each of a series of trajectories possessing a variety of initial conditions N X ð0Þ ranging between 109 and 1010. In every case, the cellular death rate was fixed at d ¼ 103 while the replenishment rates were set by l ¼ dNX ð0Þ in order that the initial condition corresponded to the expected steady state population size obtained from Eq. (23) above. To demonstrate the reproducibility of these results, the graph actually displays the mean and standard deviation of sets of 10 trajectories, identically conditioned apart from the initial state of the pseudo-random number generator. The linear dependence of the calculation time on the population size which is clearly evident in the graph is the principal short-coming of the standard SSA approach to deal with chemical master equations. This fact alone ensures that SSA calculations will become infeasible at a large enough population size. The extent of this problem is quantified by the slope of the line of best fit shown as the dashed line in Fig. 3, which is approximately 108 s per additional target cell. For the number of target uninfected T cells present in a healthy individual, which is thought to be on the order of 1011, generation of 100 day simulation trajectories are expected to take approximately a quarter of an hour (103 s). As thousands of trajectories are needed to accurately estimate moments, using the SSA to accurately determine the expected number of T cells over the course of the 100 day simulation period is likely to take many hundreds of CPU hours, even using this simplified healthy-state model. Using the SSA to analyze the full stochastic model given in Eq. (7) involves two additional populations, together with another four stochastic processes. 
This greatly increases the complexity and effectively rules out the possibility of using the SSA to analyze the statistical properties of the full-sized stochastic model of HIV infection defined by the parameters listed in Table 1. We will address this issue in Section 4 through the use of the t-leaping 120 100 80 ttraj (s) made by Lifson et al. (1997) in their study of the early infection dynamics of the closely related simian immunodeficiency virus (SIV) in macaques. Neither do they appear in the longitudinal studies of HIV viral load presented by Stafford et al. (2000), who conclude that the oscillations predicted by the simpler viral dynamics models are suppressed in reality by either CTLmediated clearance of infected target cells, or the influence of CD8þ T cell antiviral factor. Additionally, we hypothesize that such oscillations may be dampened if the true structural heterogeneity of the human body were to be taken into account. The other features of the dynamics predicted here – such as the time to peak viremia and the steady state viral load – are sensible. Regardless, it is not the mean dynamics which are the concern of this paper. Instead, we are interested in the statistical dynamics which take place over the time-scale of the initial infection. This will be the focus of the following sections. 91 60 40 20 3.1. The scaling problem Despite the fact that this approach is clearly superior to direct integration of Eq. (7), it still suffers from the problem of rapidly increasing computational complexity. To demonstrate this, consider the uninfected dynamics of a host resulting from the reactions given in Eqs. (1) and (2) alone. On their own, these 2e+09 4e+09 4e+09 6e+09 8e+09 1e+10 <Nx> Fig. 3. Dependence of trajectory simulation time using the SSA on the steady-state target cell population size. For each of these sizes, the error bars indicate the standard deviation in times obtained from 10 independent simulations generated using equivalent parameters. 
algorithm. In order to gain some initial insight into the statistical dynamics predicted by Eq. (7), however, we first use Gillespie's SSA to analyze a smaller version of the same problem.

3.2. Population correlation dynamics

A more approachable set of rate constants for the deterministic variant of the infection model is found in Nowak and May (2000, Chapter 3), and is shown in Table 2. The primary difference between these parameters and those shown in Table 1 is that $\lambda$ and $\beta$ in the former are scaled so that the dynamical variables they influence represent population counts per milliliter rather than absolute numbers (i.e. per host). As a result, when used in conjunction with the equivalent stochastic model, these parameters describe the infection dynamics of a scaled-down host system occupying a volume on the order of 1 cm$^3$. Aside from allowing us to obtain a preliminary intuitive understanding of the stochastic dynamics of the full system, small-volume models such as this may also be directly applicable to certain ex vivo HIV infection experiments, such as those carried out by Blauvelt et al. (2000).

Table 2. Small-scale parameters used for the cut-down infection model which was analyzed using Gillespie's SSA. These parameters were obtained from Nowak and May (2000).

Parameter | Description | Value
$\lambda$ | Target cell production rate | $1.0 \times 10^{5}$/day
$\beta$ | Target cell infection rate | $2 \times 10^{-7}$ ml/virus/target cell/day
$k$ | Virion production rate | $10^{2}$/infected cell/day
$d$ | Uninfected target cell death rate | 0.1/target cell/day
$a$ | Infected target cell death rate | 0.5/infected cell/day
$u$ | Virion clearance rate | 5/virion/day

Using these parameters and following the algorithm presented in Gillespie (1976), we generated an ensemble of $4 \times 10^{3}$ stochastic trajectories corresponding to the first 30 days of the infection post inoculation.
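For illustration, the direct method applied to this model can be sketched as follows. This is a minimal Python sketch and not the implementation used to obtain the results reported here: the function name `gillespie_direct`, the argument layout and the use of Python's standard `random` module are assumptions made for the example. It assumes the six reactions of the model are target cell production and death, cellular infection (taken to consume one virion, consistent with the loss term in Eq. (17)), virion production, infected cell death and virion clearance.

```python
import random


def gillespie_direct(state, rates, t_end, rng):
    """Advance one trajectory of the infection model up to time t_end.

    state: (x, y, v) = uninfected cells, infected cells, free virions.
    rates: (lam, beta, k, d, a, u), named as in Table 2.
    Returns the state (x, y, v) reached at time t_end.
    """
    x, y, v = state
    lam, beta, k, d, a, u = rates
    t = 0.0
    while True:
        # Propensity of each of the six reactions in the current state.
        propensities = (
            lam,           # 0: target cell production  (x + 1)
            d * x,         # 1: uninfected cell death   (x - 1)
            beta * x * v,  # 2: infection               (x - 1, y + 1, v - 1)
            k * y,         # 3: virion production       (v + 1)
            a * y,         # 4: infected cell death     (y - 1)
            u * v,         # 5: virion clearance        (v - 1)
        )
        total = sum(propensities)
        if total <= 0.0:
            break  # no reaction can fire any more
        # Exponentially distributed waiting time until the next event.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose the reaction that fired with probability propensity / total.
        threshold = rng.random() * total
        cumulative = 0.0
        chosen = None
        for index, propensity in enumerate(propensities):
            if propensity <= 0.0:
                continue  # never select a reaction that cannot occur
            chosen = index
            cumulative += propensity
            if threshold < cumulative:
                break
        if chosen == 0:
            x += 1
        elif chosen == 1:
            x -= 1
        elif chosen == 2:
            x, y, v = x - 1, y + 1, v - 1
        elif chosen == 3:
            v += 1
        elif chosen == 4:
            y -= 1
        else:
            v -= 1
    return x, y, v
```

Each pass through the loop processes exactly one reaction event, which is why the cost of a trajectory grows with the number of events and hence with the population sizes. A pure Python loop of this kind is therefore only practical for systems much smaller than those considered in this paper.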
In order to account for the natural uncertainty with respect to initial sizes due to the limitations in the precision with which such preparations and measurements can be made, we draw the initial population sizes of each trajectory from Poissonian distributions having the following means (and hence variances):

\[ \langle N_X(0) \rangle = 10^{6}, \tag{24} \]

\[ \langle N_Y(0) \rangle = 0, \tag{25} \]

\[ \langle N_V(0) \rangle = 10^{2}. \tag{26} \]

As a way of estimating the uncertainty introduced by the finite number of trajectories employed in the calculation of the moments discussed below, the ensemble was evenly divided into 10 sub-ensembles, yielding 10 independent estimates of each of the calculated quantities. These were then used to estimate the standard error in the mean of each of the quantities measured.

The results obtained from these simulations are summarized in Fig. 4. Firstly, Fig. 4a illustrates the mean dynamics of each of the populations on a logarithmic vertical scale. Like the deterministic dynamics shown earlier in Fig. 2, the commonly observed qualitative features of the infection are again present, including the initial exponential increase in viral load, the presence of a point of peak viremia after a few days, and the subsequent relaxation to an apparently stable set point viral load. In an important improvement over the earlier results, however, the population dynamics displayed in Fig. 4a correspond exactly to what is predicted by the governing CME in Eq. (7), and include the effects of the covariance terms in Eqs. (15)–(17).

Fig. 4. Results of Gillespie simulations of the stochastic virus model using the small-system parameters given in Table 2. These include (a) the mean population sizes, (b) the relative variances of the fluctuations in those populations, and (c) the relative covariances between those populations over the course of the 30 day simulation period. The shading in (b) and (c) indicates uncertainty due to the finite trajectory ensemble sizes. (In (a) this uncertainty is too small to show.)

While informative, population averages are only first order moments, and as such provide a very limited picture of the complex statistical dynamics that occur over this time period. As an example of these additional details, we have used the simulated trajectory ensemble to estimate the ratio between the population variances and their corresponding means at each of the 1001 sampled points spanning the simulation period. These ratios, known as dispersion indices, are unity in the case of Poissonian fluctuations, with greater and lesser values indicating super- and sub-Poissonian fluctuations, respectively. The dispersion indices shown in Fig. 4b therefore demonstrate that the initially Poissonian fluctuations in each of the three populations rapidly grow throughout the acute phase, reaching a maximum around the time of peak viremia, then fall a little to settle on apparently constant values. While the dispersion indices of each of the populations display very similar qualitative characteristics, it is the uninfected target cell population that exhibits the largest relative variance, with the virion population coming in closely behind. These highly super-Poissonian fluctuations provide a strong motivation for quantitative statistical modeling of the infection process, as they imply that the uncertainty in regard to the timing of the individual microscopic interactions translates to a relatively large uncertainty in the viral and cellular population sizes.
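The dispersion index and the sub-ensemble error estimate described above can be sketched as follows. This is an illustrative Python sketch only: the function names are hypothetical, the input is assumed to be a list holding one population count per trajectory at a single sampled time, and the use of the unbiased sample variance is an assumption, since the normalization is not specified in the text.

```python
import math
import statistics


def dispersion_index(counts):
    """Variance-to-mean ratio of a sample of population counts.

    Values near 1 indicate Poissonian fluctuations; larger values
    indicate super-Poissonian fluctuations.
    """
    return statistics.variance(counts) / statistics.fmean(counts)


def subensemble_estimate(counts, statistic, n_sub=10):
    """Estimate a statistic and its standard error from sub-ensembles.

    The sample is split evenly into n_sub consecutive sub-ensembles, the
    statistic is evaluated on each, and the spread of those independent
    estimates gives the standard error of their mean.
    """
    size = len(counts) // n_sub
    estimates = [
        statistic(counts[i * size:(i + 1) * size]) for i in range(n_sub)
    ]
    mean = statistics.fmean(estimates)
    standard_error = statistics.stdev(estimates) / math.sqrt(n_sub)
    return mean, standard_error
```

For example, `subensemble_estimate(counts, dispersion_index)` would return the dispersion index at one time point together with an estimate of its sampling error.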
It is interesting to note that, while the rapid growth of the fluctuations in the viral population immediately following inoculation (which has been considered by Heffernan and Wahl, 2005) is to be expected due to the random nature of the birth process, the stabilization of the variance beyond this point is somewhat surprising. It suggests that, once the set point is reached, the underlying probability distribution $P(\vec{n},t)$ achieves something approaching quasi-stationarity under the present model. Note that this obviously excludes the impact of the immune system, and this particular quasi-stationary distribution is therefore likely to be short-lived in reality.

Finally, we consider how the infection process influences the covariances between the cellular and viral populations. In order to make the comparison between the populations easier, we define the following 'relative' covariances:

\[ \mathrm{Cov}_{\mathrm{rel}}(N_i(t), N_j(t)) = \frac{\mathrm{Cov}(N_i(t), N_j(t))}{\langle N_i(t) \rangle \langle N_j(t) \rangle} = \frac{\langle N_i(t) N_j(t) \rangle}{\langle N_i(t) \rangle \langle N_j(t) \rangle} - 1, \tag{27} \]

where $i$ and $j$ are the possible pairs of non-equivalent populations. As explained in Section 2.2, the covariances discussed in this paper are measures of the correlation between the sizes of the populations at a given instant in time. The relative covariances defined above are equivalent in this regard, but are normalized to reduce the influence of the absolute sizes of the populations involved. This allows us to more meaningfully compare covariances between different pairs of populations and at different times.

The result of using our ensemble of stochastic trajectories to calculate these quantities is shown in Fig. 4c. As one might expect, the variables $N_X(t)$ and $N_Y(t)$ develop strong anti-correlations during the infection process, due to the fact that the cellular infection process (described by the reaction in Eq. (3)) increments $N_Y(t)$ at the direct expense of $N_X(t)$.
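As a concrete illustration of Eq. (27), the relative covariance at a single time point can be estimated from a trajectory ensemble as sketched below. The function name and the input layout (two equal-length lists holding the sizes of populations $i$ and $j$ in each trajectory at the same instant) are assumptions made for this example.

```python
import statistics


def relative_covariance(counts_i, counts_j):
    """Ensemble estimate of Cov_rel(N_i, N_j) as defined in Eq. (27).

    counts_i[m] and counts_j[m] are the sizes of populations i and j in
    trajectory m, sampled at the same time.
    """
    mean_i = statistics.fmean(counts_i)
    mean_j = statistics.fmean(counts_j)
    mean_product = statistics.fmean(
        [ni * nj for ni, nj in zip(counts_i, counts_j)]
    )
    return mean_product / (mean_i * mean_j) - 1.0
```

Populations that rise and fall together across the ensemble give a positive value, populations that move in opposition give a negative value, and statistically independent populations give a value close to zero.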
Similarly, anti-correlations develop between $N_X(t)$ and $N_V(t)$, although in this case the coupling between the populations is due to a combination of the cellular infection process and the virion production process (described by the reaction in Eq. (4)). On the other hand, the remaining pair of population variables $N_Y(t)$ and $N_V(t)$ develop strong positive correlations, especially leading up to peak viremia. This is due to the fact that the virion production process in the model increases the viral population size without penalizing the infected cell population at all (see footnote 5).

4. Scalable and exact stochastic simulation using τ-leaping

While the numerical results presented in the previous section clearly showed that even our simplified stochastic model of viral infection is capable of generating relatively rich statistical dynamics, these results were strongly limited by the computational complexity inherent in the stochastic simulation algorithm employed. As discussed in the introductory section, however, the fact that we have chosen to represent our model analytically in terms of Eq. (7) allows us to attempt another method of numerical solution.

The so-called τ-leaping approach to CME integration was initially proposed by Gillespie (2001) as an alternative to the traditional SSA. Rather than incrementing the system state according to events separated by exponentially distributed waiting times, he proposed dividing the time domain into a finite number of fixed intervals of length $\tau$ and estimating the number of each kind of event that is likely to occur during each interval. While this approach is sometimes regarded as merely an approximation to the 'true' stochastic simulation algorithm, it is in fact only approximate in the same sense that any finite time step integration algorithm (such as Euler's method, Runge–Kutta, and the plethora of algorithms available for integrating stochastic differential equations; Kloeden and Platen, 1999) is approximate.
In other words, the algorithm can be considered exact from a practical standpoint, as it can be used to generate results accurate to within any required precision by using a small enough time step $\tau$. The convergent nature of the algorithm can easily be shown using a path integral representation of the system dynamics, a procedure which is presented in Appendix A.

4.1. Time-complexity of τ-leaping calculations

The primary advantage of τ-leaping over the SSA is that its time-complexity is not explicitly tied to the number of reactions which occur over the course of the simulation. Just as for the previous algorithms, we can demonstrate this scaling empirically by considering the time $t_{\mathrm{traj}}$ taken to calculate a single trajectory for the simplified infection-free model discussed earlier in Section 3.1. A series of 10 day simulations using a fixed time step of $\tau = 10^{-5}$ days and a variety of population sizes were conducted, the results of which are shown in Fig. 5. The plateau in the calculation times is clear evidence that τ-leaping does not suffer from the same scaling problem that plagues the SSA. The nonlinear behavior of this time at smaller populations is merely due to a change in the way in which Poissonian pseudo-random numbers below a certain mean are generated.

Note that as the time-complexity of τ-leaping depends directly on the time step $\tau$, sensible choices for which can depend quite strongly on the particulars of the system being modeled, there is still an underlying connection between the time-complexity and the population size (or, more accurately, the state of the host–virus system). This dependence is more subtle than that of the SSA, however, especially as for τ-leaping it is more often systems with populations close to zero that require smaller step sizes to avoid the appearance of negative populations (Gillespie, 2007, presents a useful discussion of these problems).
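A fixed-step τ-leaping update for the simplified infection-free model of Section 3.1 can be sketched as follows. This is a minimal Python sketch under stated assumptions rather than the implementation used here: the function names are hypothetical, the Poisson sampler is a stand-in (Knuth's multiplication method for small means and a rounded normal approximation for large means, the latter being an approximation), and clipping the population at zero is only a crude guard against the negative populations mentioned above.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson-distributed integer (illustrative stand-in sampler)."""
    if mean <= 0.0:
        return 0
    if mean < 30.0:
        # Knuth's multiplication method: exact, but the work grows with mean.
        limit = math.exp(-mean)
        count = 0
        product = rng.random()
        while product > limit:
            count += 1
            product *= rng.random()
        return count
    # Normal approximation for large means, rounded and clipped at zero.
    return max(0, round(rng.gauss(mean, math.sqrt(mean))))


def tau_leap_birth_death(x0, lam, d, tau, n_steps, rng):
    """Fixed-step tau-leaping for target cell production and death only.

    In each interval of length tau the numbers of production and death
    events are drawn from Poisson distributions whose means are the
    reaction propensities at the start of the interval multiplied by tau.
    """
    x = x0
    for _ in range(n_steps):
        births = poisson(lam * tau, rng)
        deaths = poisson(d * x * tau, rng)
        x = max(0, x + births - deaths)  # crude guard, see lead-in
    return x
```

The number of loop iterations is fixed by `n_steps` alone, so the work per trajectory does not grow with the number of reaction events, in contrast to the direct method.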
Fig. 5. Dependence of τ-leaping simulation time $t_{\mathrm{traj}}$ (s) on steady-state target cell population size $\lambda/d$. As in Fig. 3 the steady state population size was changed by holding $d$ constant at $10^{-3}$ and varying $\lambda$. Simulations were initialized at the deterministic steady state value. Error bars indicate the standard deviation in the times obtained from 10 independent simulations generated for each single set of parameters.

Footnote 5: Note that in reality, this process may lead to weak negative correlation due to the established link between virion production and a reduced lifetime of the infected cell.

4.2. Full-scale population correlation dynamics

We are now in a position to address the stochastic dynamics of the full-scale viral infection model using the parameters listed in Table 1. As with the smaller calculation discussed in Section 3, we consider initial cellular and viral population sizes subject to Poissonian fluctuations. In this case, however, the mean values of the initial Poissonian distributions are chosen to be

\[ \langle N_X(0) \rangle = 2.5 \times 10^{11}, \tag{28} \]

\[ \langle N_Y(0) \rangle = 0, \tag{29} \]

and

\[ \langle N_V(0) \rangle = 10^{3}. \tag{30} \]

Again, the mean of the initial target cell population corresponds to the expected steady state $\lambda/d$.

With these parameters, the τ-leaping algorithm was used to generate an ensemble of 20 480 individual trajectories, using a fixed time step of $4 \times 10^{-4}$ days. Just as in the earlier SSA calculation, the ensemble was evenly divided into 10 sub-ensembles which were in turn used to generate 10 independent estimates of each calculated quantity, allowing assessment of the sampling error associated with each result. A second independent ensemble was also generated using a half-sized time step of $2 \times 10^{-4}$ days. The magnitude of the difference between results calculated from this ensemble and those calculated using the ensemble of full-sized time step trajectories was used as an estimate of the uncertainty in those results due to finite time step errors.

Fig. 6. Results of τ-leaping stochastic integration of the stochastic infection model, using the full-sized system parameters given in Table 1. As in Fig. 4, these include the (a) mean population sizes, (b) dispersion indices of those populations, and (c) relative covariances between those populations over the course of the simulation period. The shading in (c) indicates the combined uncertainty due to the finite time step and finite trajectory ensemble sizes. (The corresponding uncertainties in (a) and (b) are too small to show.)

The moments which were calculated are shown in Fig. 6, and are directly comparable to those shown earlier in Fig. 4. Firstly, Fig. 6a displays the dynamics of the mean population sizes, illustrating a close agreement with the deterministic prediction shown in Fig. 2. We will later discover that part of the reason for this close qualitative agreement is the large inoculation size used in this calculation. Secondly, Fig. 6b shows the dynamical behavior of the relative variance for each of the populations. As in Fig. 4b, we again see that each of the populations rapidly develops fluctuations many orders of magnitude above those of Poissonian distributions of the same means, with the largest fluctuations appearing at the time of peak viremia. Finally, Fig.
6c illustrates the dynamics of the three distinct relative inter-population covariances as defined by Eq. (27). While the magnitudes of these covariances are somewhat less than those of the scaled-down model, the qualitative behavior is very similar. Overall, these results suggest that the size of the host/virus system does not greatly influence many of the qualitative features of the stochastic dynamics.

4.3. Dependence on the size of the initial viral population

All of the stochastic calculations presented up to this point have used relatively large initial viral populations. These sizes were specifically chosen to avoid complications relating to fluctuation-driven extinction of the infection. We will now directly investigate the effects that very small initial viral population sizes impose on the stochastic dynamics. The precise initial conditions we seek to consider in this section are

\[ \langle N_X(0) \rangle = 2.5 \times 10^{11}, \tag{31} \]

\[ \langle N_Y(0) \rangle = 0, \tag{32} \]

\[ N_V(0) = 10^{m} \quad \text{where } m = 0, 1, 2, 3. \tag{33} \]

While all of the initial population sizes in the stochastic calculations described so far in this paper have been drawn from Poissonian distributions, the initial viral populations discussed in this section are presumed to be perfectly determined. For each of the four sets of initial conditions, we have generated 20 480 independent stochastic τ-leaping trajectories, using a time step of $10^{-2}$ days. Note that in generating these trajectories, we have employed the modified τ-leaping algorithm of Cao et al. (2005), which avoids the negative-population problems which plague the basic τ-leaping algorithm when generating trajectories involving small populations. It does this by reverting to the standard SSA whenever populations stray within a critical number of reactions $N_C$ of reaching the origin. In our simulations we have set $N_C = 100$. As before, sub-ensembles were used to estimate the statistical uncertainty in our results due to the finite ensemble sizes, and separate trajectory ensembles generated using half-sized time steps were used to estimate the finite time step errors.

Fig. 7a illustrates the dependence of the dynamics of the expected viral load on the initial population size of the pathogen, and clearly shows that the dependence is strong: the average viral loads resulting from inoculants of reduced strength are persistently lower than those corresponding to higher initial viral populations, even in the long term.

Varying the initial viral load has an even more pronounced effect on the dynamics of the relative covariance between the infected cell and viral populations, as shown in Fig. 7b. (Note that the vertical axes of Figs. 7b and 9b display $\mathrm{Cov}_{\mathrm{rel}}(N_Y, N_V) + 1$ to permit logarithmic scaling of those axes.) The most striking feature of this result is that reducing the initial viral load greatly increases the positive covariance between these populations, including the 'baseline' value that the covariance returns to following the previously noted drop at peak viremia.

4.4. Clearance probability in small populations

This dependence on initial population size is not tremendously surprising, given the fact that a reduction in the initial viral population size increases the probability that random fluctuations will drive the viral (and proviral) populations extinct before the infection can really take hold. The effect that the initial viral load has on the probability of such extinctions can be seen in Fig. 8a, where the fraction of the stochastic trajectories which have been depleted of infectious particles is shown as a function of time for each of the initial viral population sizes considered in Fig. 7. Combined with Fig.
8b, which shows the total fraction of such trajectories at the end of the 200-day simulation period, these results demonstrate that fluctuation-driven extinction of the infection only occurs with significant probability when the initial number of virions is fewer than approximately 100, but is more likely than not when the total initial virion count is on the order of 10. We also see that when they do occur, such clearances happen exclusively during the first one or two days of the infection. This type of demographic extinction of small populations is also known to occur in a generic stochastic logistic model, when treated using the SSA or related methods (Drummond et al., 2010).

It is important to note that all of the initial viral loads considered in these simulations, including those which give rise to fluctuation-driven extinction of the infectious population, are well below the detection limit of currently available viral load assays, which is currently on the order of 20 virions per ml of sample (Verhofstede et al., 2010). Taking only the blood virus pool into account, this corresponds to an absolute detection limit of $10^{7}$ virions. Thus, the extinction events predicted by the model are not in conflict with the clinical observation that spontaneous clearance of HIV does not occur, as the model does not give rise to these events once the viral load reaches detectable levels. However, these events are still of clinical significance as they contribute to the probability of diagnosable infection arising from a given inter-host interaction.

Fig. 7. Impact of initial viral population size on (a) the expected viral load and (b) the relative covariance between the infected cell and virion populations.
Shading indicates the combined uncertainty due to the finite integration time-steps and finite trajectory ensembles.

Fig. 8. Effects of size of initial infective population on fluctuation-driven extinction of the infection. For various initial viral population sizes, (a) shows the fraction of simulated infections that have been cleared as a function of time post-inoculation, while (b) shows this same fraction at the end of the 200 day simulation period. The infection clearance fractions for infections initiated by infected cell populations of different sizes are shown in (c).

While we are primarily concerned with assessing the influence of this fluctuation-driven extinction on the summary statistics calculated in the previous section, these results are interesting in their own right as they effectively set a lower bound on the size of a viral load likely to result in the establishment of ongoing infection. In this regard, they are in relatively close agreement with similar results obtained a number of years ago by Kamina et al. (2001), who showed that fluctuation-driven extinctions were likely in the stochastic model of Tan and Wu (1998) for initial total viral population sizes of less than 300 virions. We note that Tan and Wu's modeling involved an additional population of latently infected cells, while we ignore such populations in our model. Interestingly, the more recent study of fluctuation-driven extinction of HIV infection of Khalili and Armaou (2008) concludes that 100 virions per ml are required for ongoing infection to be certain. This is many orders of magnitude larger than our threshold of 100 in total.
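As a rough cross-check on the threshold of roughly 100 virions in total, the following sketch evaluates a simple branching-process approximation. This approximation is not part of the simulations reported in this paper and is introduced here purely for illustration: it assumes that the target cell population stays at its healthy steady state $\lambda/d$ during the earliest phase, that each virion either infects a cell or is cleared, and that the lineages founded by different virions are independent. The function name is hypothetical.

```python
def extinction_probabilities(lam, beta, k, d, a, u):
    """Approximate early extinction probabilities (branching-process sketch).

    Returns (q_virion, q_cell): the probability that the lineage founded
    by a single free virion, or by a single infected cell, dies out.
    Parameter names follow Table 1.
    """
    x_healthy = lam / d
    # Probability that a virion infects a target cell before being cleared.
    p_infect = beta * x_healthy / (beta * x_healthy + u)
    # Survival probability of a single-virion lineage; it solves
    # s = p_infect * k * s / (a + k * s), whose non-trivial root is below.
    survive_virion = max(0.0, p_infect - a / k)
    q_virion = 1.0 - survive_virion
    if survive_virion == 0.0:
        return 1.0, 1.0  # subcritical: extinction is certain in this sketch
    # An infected cell produces a geometric number of virions before dying.
    q_cell = a / (a + k * survive_virion)
    return q_virion, q_cell
```

With the Table 1 parameters, `q_virion ** n` gives the approximate clearance probability for an inoculum of `n` free virions under these assumptions. The values obtained are broadly in line with the qualitative statements made above, namely that clearance is more likely than not for about 10 virions and becomes rare above about 100.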
However, Khalili and Armaou arrived at their result by scaling up the threshold they obtained from simulations involving a compartment holding a single millilitre of blood. As extinction probability depends on the absolute virion population size, this procedure likely leads to overestimation, a fact that Khalili and Armaou note in their article and which we believe accounts for the discrepancy we observe here.

As an aside, given that major routes of HIV infection involve the transfer of already-infected cells to a previously uninfected host, it is also interesting to consider the effect of replacing the initial viral population with an initial infected cell population (while setting the initial viral load to zero). We have thus analyzed a second set of ensembles of 20 480 stochastic trajectories, each of which was generated in exactly the same way as those discussed above, but starting from a small and precisely known value of $N_Y(0)$, with $N_V(0)$ set to zero. Fig. 8c shows the clearance fraction for each of these ensembles, demonstrating that fluctuation-driven eradication of the infection is far less likely for cell-initiated infections: a single infected cell has approximately the same chance of leading to ongoing infection of the host as 100 free virions. This is what we expect, as the rate $\beta$ of the cellular infection process is far slower than the rate $k$ of the production of virions by an already infected cell, meaning that individual free virions are much less likely to 'reproduce' (give rise to new free virions) before they are cleared than are individual infected cells.

4.5. Results conditional on infection survival

By repeating the τ-leaping simulations under the condition that any trajectories which experience total viral clearance are discarded, we can obtain a trajectory ensemble drawn from the probability distribution conditional on survival of the infection, thus allowing us to deduce which attributes of the earlier results are a side-effect of stochastic viral clearance.

Fig. 9a shows the mean viral load obtained from this modified ensemble. It is clear from this figure that the long-term reduction in viral load noted in the corresponding non-conditional results in Fig. 7a was simply due to an extinct sub-ensemble drawing down the estimated mean. Likewise, Fig. 9b demonstrates that the presence of the extinct sub-ensemble was also responsible for the increase in the baseline relative covariance between the viral and infected cell populations at small inoculation sizes shown in Fig. 7b. This is the result of the fact that an extinct sub-ensemble causes measurement of the viral load to become much more informative regarding the size of the infected cell population: a non-zero viral load measurement suggests that $N_Y$ is almost certainly also non-zero. In this way, the extinct sub-ensemble drives up the relative covariance of the two populations.

[Figure 9: (a) expected viral load and (b) $\mathrm{Cov}_{\mathrm{rel}}(N_Y, N_V) + 1$, each plotted against time (days), for initial viral populations of 1000, 100, 10 and 1.]

Fig. 9. Influence of variation of the initial viral population size on (a) the expected viral load and (b) the relative covariance between the infected cell and virion populations, conditional on the survival of the infection beyond the initial. Shading again indicates uncertainty due to the effects of the finite ensemble and finite integration time-step.

Significantly, the covariance calculated using the conditional ensemble is qualitatively very similar to the covariance obtained from the unconditional ensembles with higher initial viral population sizes. However, the magnitude of the plateau value attained during the acute phase of the infection remains quite strongly dependent on the initial population size. Additionally, smaller inoculation sizes seem to increase the magnitude of a transient 'recovery' in the strength of this covariance following peak viremia. This feature was not apparent in the simulations involving larger initial viral populations.

Thus, while we have confirmed our earlier suspicion that much of the strong dependence on the initial viral population size of the results presented in Fig. 7 can be removed by accounting for fluctuation-driven clearance of the infection, significant sensitivity to initial conditions still remains in the degree of covariance between the viral and infected cell populations leading up to peak viremia.

5. Summary and discussion

In this paper we have delivered a thorough treatment of a very basic model of within-host viral infection which takes into account the demographic fluctuations resulting from the variable timings between the discrete microscopic events constituting the macroscopic behavior. By presenting this model analytically in terms of a chemical master equation rather than in the form of a heuristically constructed simulation algorithm, we have been free to choose from a variety of methods of analytical and numerical analysis. This has allowed us to perform both small-scale calculations using the stochastic simulation algorithm and large-scale calculations using the τ-leaping algorithm, the latter being directly comparable to the scale of the dynamics occurring during retroviral infection of an adult human involving nearly $10^{14}$ individual virions.
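The conditioning on survival used in Section 4.5 can be illustrated with a short sketch. It assumes that the relative covariance is defined as $(\langle XY \rangle - \langle X \rangle \langle Y \rangle) / (\langle X \rangle \langle Y \rangle)$ and that a trajectory counts as surviving if its final infected cell or virion count is non-zero; both are assumptions made for illustration, as are the function names and the toy numbers. Filtering a stored ensemble after the fact, as done here, stands in for discarding cleared trajectories during simulation.

```python
import numpy as np

def relative_covariance(x, y):
    """Ensemble estimate of (<xy> - <x><y>) / (<x><y>) (assumed definition)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    return (np.mean(x * y) - mean_x * mean_y) / (mean_x * mean_y)

def condition_on_survival(n_y, n_v):
    """Keep trajectories whose final infected-cell or virion count is non-zero.

    n_y, n_v: integer arrays of shape (trajectories, time points).
    """
    survived = (n_y[:, -1] + n_v[:, -1]) > 0
    return n_y[survived], n_v[survived]

# Toy ensemble (hypothetical numbers): the last two trajectories have cleared.
n_y = np.array([[1, 4], [1, 6], [1, 0], [1, 0]])
n_v = np.array([[10, 40], [10, 60], [10, 0], [10, 0]])

# Relative covariance at the final time, with and without the extinct trajectories.
cov_all = relative_covariance(n_y[:, -1], n_v[:, -1])
s_y, s_v = condition_on_survival(n_y, n_v)
cov_cond = relative_covariance(s_y[:, -1], s_v[:, -1])
```

In this toy ensemble the unconditional value is much larger than the conditional one, which mirrors the way an extinct sub-ensemble inflates the relative covariance.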
It is evident from many of the results presented in the previous sections that the microscopic details of the host–virus interaction can have a strong influence on the qualitative dynamics of the infection. This is particularly true in the case of infections beginning from small numbers of virions, where the stochastic predictions can be very different to those produced by the deterministic model shown in Fig. 2. We have seen that these microscopic details give rise to a variety of statistical quantities which possess their own unique dynamics, the study of which is beyond the reach of deterministic models. Particularly interesting among these is the relative covariance between the viral and infected cellular populations, as certain qualitative features of its dynamics appear to be robust against changes in the quantitative details of the model. In all cases studied, for example, $\mathrm{Cov}_{\mathrm{rel}}(N_Y(t), N_V(t))$ rose almost immediately from its initial position to settle at some finite positive value for the duration of the acute phase of the infection. Upon reaching the time of peak viremia, the covariance then fell sharply to a second value, where it remained until the end of the simulated period. In cases where total clearance of the viral and proviral populations was either unlikely or explicitly omitted, the covariance reached at the end of this fall was approximately zero. Given that the strength of the initial covariance is inversely related to the size of the initial viral population (Fig. 7e), we may sensibly presume that the initial level of $\mathrm{Cov}_{\mathrm{rel}}(N_Y(t), N_V(t))$ is a result of the specifics of the rise of these populations being highly dependent on the exact timing of the initial cellular infection events. This leads to shot-dependent increases in the infected cell population and the viral load.
As such incremental increases represent a reduced fraction of the population when larger populations are involved, this hypothesis agrees with the observed reduction in the covariance for such cases. However, the depletion of the uninfected cell population which gives rise to the viral load decay following peak viremia has the additional effect of destroying this covariance. This suggests that once the infection progresses beyond the acute phase, the microscopic details of that phase are prevented from influencing the infection dynamics.

We therefore suggest that, after controlling for total virion clearance, the presence of a non-zero relative covariance between the infected target cell population and the virion population should form a strong statistical signature for the acute phase of retroviral infection. This is due to the fact that the qualitative dynamical behavior of this quantity, as discussed above, seems to be largely independent of the specifics of the model parameters and the initial viral population size.

This prediction could be tested by using blood/tissue samples to measure viral and infected cell concentrations for each of a large number of individuals infected under similar conditions and at similar times. The dynamics of the sample covariances between the population sizes could then be obtained directly and compared to the dynamics presented in this paper. Alternatively, and perhaps more practically, one could seek to draw samples from isolated populations within a single individual. We suggest that this could be achieved by exploiting the compartmentalized nature of the human body, or by considering genetically rather than spatially distinct populations.

Acknowledgements

Two of the authors (TGV and PDD) acknowledge the financial support of the Australian Research Council through a Discovery Project grant.

Appendix A. Derivation of the τ-leaping algorithm

Any probability distribution $P(\vec{n},t)$ corresponding to a continuous time birth–death process can be expanded in terms of probability distributions at earlier times by way of successive applications of the Chapman–Kolmogorov equation, yielding

\[
P(\vec{n},t) = \sum_{\vec{n}_0,\vec{n}_1,\ldots,\vec{n}_N} P(\vec{n},t\,|\,\vec{n}_N,t_N) \left[ \prod_{l=0}^{N-1} P(\vec{n}_{l+1},t_{l+1}\,|\,\vec{n}_l,t_l) \right] P(\vec{n}_0,t_0). \tag{A.1}
\]

As the number of intervals between the present and some fixed earlier time $t_0$ increases, this expansion approaches what could be described as the discrete analogue of a path integral, with each sequence of intermediate states $\vec{n}_0,\ldots,\vec{n}_N$ specifying a possible path to the final state $\vec{n}$. From this point of view, each term in the sum is the probability with which the corresponding path is expected to appear. The goal of each Monte Carlo algorithm used in this paper is to randomly generate paths with these probabilities.

In general, the CME satisfied by each conditional probability can be written

\[
\partial_\tau P(\vec{n}',t+\tau\,|\,\vec{n},t) = \sum_q \left[ T_q(\vec{n}'-\vec{v}_q)\, P(\vec{n}'-\vec{v}_q,t+\tau\,|\,\vec{n},t) - T_q(\vec{n}')\, P(\vec{n}',t+\tau\,|\,\vec{n},t) \right], \tag{A.2}
\]

where $T_q(\vec{n})$ is the full combinatoric rate term for process $q$ and $\vec{v}_q$ is the state change resulting from that same process. In the small $\tau$ limit, $P(\vec{n}',t+\tau\,|\,\vec{n},t)$ approaches $\delta_{\vec{n}',\vec{n}}$ and we can write

\[
\partial_\tau P(\vec{n}',t+\tau\,|\,\vec{n},t) \simeq \sum_q T_q(\vec{n}) \left[ P(\vec{n}'-\vec{v}_q,t+\tau\,|\,\vec{n},t) - P(\vec{n}',t+\tau\,|\,\vec{n},t) \right]. \tag{A.3}
\]

The solution to this short-time CME can be obtained by way of the characteristic function and is

\[
P(\vec{n}',t+\tau\,|\,\vec{n},t) \simeq \sum_{\{m_q\}} \delta_{\vec{n}'-\vec{n},\,\sum_q m_q \vec{v}_q} \prod_q e^{-\tau T_q(\vec{n})}\, \frac{(\tau T_q(\vec{n}))^{m_q}}{m_q!}, \tag{A.4}
\]

where each $m_q$ runs from $0$ to $\infty$. We thus find that for small enough time increments, the conditional probability distributions constituting Eq. (A.1) approach Poisson distributions over the number of times each of the processes occurs during the given increment.
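The Poisson form of Eq. (A.4) translates directly into a simulation step: over a short increment $\tau$, each process $q$ fires a Poisson-distributed number of times with mean $\tau T_q(\vec{n})$, and the state shifts by the corresponding multiples of $\vec{v}_q$. The following is a minimal sketch of one such step. The three processes, their rate constants (`beta`, `k`, `c`) and the function names are hypothetical choices made for illustration, not the model or parameters used in this paper, and the sketch makes no attempt to guard against negative populations (see Cao et al., 2005).

```python
import numpy as np

def tau_leap_step(state, rates, changes, tau, rng):
    """Advance the state by one increment tau using Poisson firing counts.

    state:   integer array of populations
    rates:   function returning the rate T_q(state) of every process q
    changes: integer array, row q is the state change v_q of process q
    """
    mean_counts = tau * rates(state)      # tau * T_q(n) for each process
    m = rng.poisson(mean_counts)          # number of firings of each process
    return state + m @ changes, m

# Hypothetical three-process model with state = [N_X, N_Y, N_V].
beta, k, c = 1e-5, 100.0, 3.0             # illustrative rate constants only

def rates(state):
    n_x, n_y, n_v = state
    return np.array([beta * n_x * n_v,    # infection of an uninfected cell
                     k * n_y,             # virion production by an infected cell
                     c * n_v])            # virion clearance

changes = np.array([[-1, +1, -1],         # infection
                    [0, 0, +1],           # virion production
                    [0, 0, -1]])          # virion clearance

rng = np.random.default_rng(0)
state = np.array([1000, 0, 10])
new_state, m = tau_leap_step(state, rates, changes, tau=0.01, rng=rng)
```

Repeating this step generates one path through the state space; the approximation improves as `tau` is reduced.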
Paths through the system state space can therefore be generated with the appropriate probabilities by assembling sequences according to

\[
\vec{n}_{l+1} = \vec{n}_l + \sum_q m_{q,l}\, \vec{v}_q, \tag{A.5}
\]

where each $m_{q,l}$ is selected from the Poisson distribution with mean $\tau T_q(\vec{n}_l)$. This approach is exact in the limit $\tau \to 0$.

References

Bains, I., Antia, R., Callard, R., Yates, A.J., 2009. Quantifying the development of the peripheral naive CD4+ T-cell pool in humans. Blood 113, 5480–5487.
Beauchemin, C., Samuel, J., Tuszynski, J., 2005. A simple cellular automaton model for influenza A viral infections. J. Theor. Biol. 232, 223–234.
Blauvelt, A., Glushakova, S., Margolis, L.B., 2000. HIV-infected human Langerhans cells transmit infection to human lymphoid tissue ex vivo. AIDS 14, 647–651.
Bonhoeffer, S., May, R.M., Shaw, G.M., Nowak, M.A., 1997. Virus dynamics and drug therapy. Proc. Natl. Acad. Sci. USA 94, 6971–6976.
Bonhoeffer, S., Funk, G.A., Günthard, H.F., Fischer, M., Müller, V., 2003. Glancing behind virus load variation in HIV-1 infection. Trends Microbiol. 11, 499.
Borghans, J.A.M., de Boer, R.J., 2007. Quantification of T-cell dynamics: from telomeres to DNA labeling. Immunol. Rev. 216, 35–47.
Cao, Y., Gillespie, D.T., Petzold, L.R., 2005. Avoiding negative populations in explicit Poisson tau-leaping. J. Chem. Phys. 123, 054104.
Castiglione, F., Poccia, F., D'Offizi, G., Bernaschi, M., 2004. Mutation, fitness, viral diversity, and predictive markers of disease progression in a computational model of HIV type 1 infection. AIDS Res. Hum. Retroviruses 20, 1314–1323.
Cavert, W., Notermans, D.W., Staskus, K., Wietgrefe, S.W., Zupancic, M., Gebhard, K., Henry, K., Zhang, Z.Q., Mills, R., McDade, H., Schuwirth, C.M., Goudsmit, J., Danner, S.A., Haase, A.T., 1997. Kinetics of response in lymphoid tissues to antiretroviral therapy of HIV-1 infection. Science 276, 960–964.
Clark, D.R., de Boer, R.J., Wolthers, K.C., Miedema, F., 1999. T cell dynamics in HIV-1 infection. Adv. Immunol. 73, 301–327.
Conway, J.M., Coombs, D., 2011. A stochastic model of latently infected cell reactivation and viral blip generation in treated HIV patients. PLoS Comput. Biol. 7, e1002033.
Drummond, P.D., Vaughan, T.G., Drummond, A.J., 2010. Extinction times in autocatalytic systems. J. Phys. Chem. A 114, 10481.
Farci, P., Shimoda, A., Coiana, A., Diaz, G., Peddis, G., Melpolder, J.C., Strazzera, A., Chien, D.Y., Munoz, S.J., Balestrieri, A., Purcell, R.H., Alter, H.J., 2000. The outcome of acute hepatitis C predicted by the evolution of the viral quasispecies. Science 288, 339–344.
Fraser, C., Hollingsworth, T.D., Chapman, R., de Wolf, F., Hanage, W.P., 2007. Variation in HIV-1 set-point viral load: epidemiological analysis and an evolutionary hypothesis. Proc. Natl. Acad. Sci. USA 104, 17441–17446.
Funk, G.A., Jansen, V.A.A., Bonhoeffer, S., Killingback, T., 2005. Spatial models of virus–immune dynamics. J. Theor. Biol. 233, 221–236.
Gardiner, C.W., 2004. Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, 3rd ed. Springer-Verlag, Berlin/Heidelberg/New York.
Gillespie, D.T., 1976. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 22, 403.
Gillespie, D.T., 1977. Stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340.
Gillespie, D.T., 2001. Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 115, 1716.
Gillespie, D.T., 2007. Stochastic simulation of chemical kinetics. Annu. Rev. Phys. Chem. 58, 35.
Haase, A.T., Henry, K., Zupancic, M., Sedgewick, G., Faust, R.A., Melroe, H., Cavert, W., Gebhard, K., Staskus, K., Zhang, Z.Q., Dailey, P.J., Balfour, H.H., Erice, A., Perelson, A.S., 1996. Quantitative image analysis of HIV-1 infection in lymphoid tissue. Science 274, 985–989.
Halling-Brown, M.D., Moss, D.S., Sansom, C.E., Shepherd, A.J., 2009. A computational grid framework for immunological applications. Philos. Trans. R. Soc. A 367, 2705–2716.
Heffernan, J.M., Wahl, L.M., 2005. Monte Carlo estimates of natural variation in HIV infections. J. Theor. Biol. 236, 137.
Hockett, R.D., Kilby, J.M., Derdeyn, C.A., Saag, M.S., Sillers, M., Squires, K., Chiz, S., Nowak, M.A., Shaw, G.M., Bucy, R.P., 1999. Constant mean viral copy number per infected cell in tissues regardless of high, low, or undetectable plasma HIV RNA. J. Exp. Med. 189, 1545–1554.
Kamina, A., Makuch, R.W., Zhao, H., 2001. A stochastic modeling of early HIV-1 population dynamics. Math. Biosci. 170, 187–198.
Keele, B.F., Giorgi, E.E., Salazar-Gonzalez, J.F., Decker, J.M., Pham, K.T., Salazar, M.G., Sun, C., Grayson, T., Wang, S., Li, H., Wei, X., Jiang, C., Kirchherr, J.L., Gao, F., Anderson, J.A., Ping, L.H., Swanstrom, R., Tomaras, G.D., Blattner, W.A., Goepfert, P.A., Kilby, J.M., Saag, M.S., Delwart, E.L., Busch, M.P., Cohen, M.S., Montefiori, D.C., Haynes, B.F., Gaschen, B., Athreya, G.S., Lee, H.Y., Wood, N., Seoighe, C., Perelson, A.S., Bhattacharya, T., Korber, B.T., Hahn, B.H., Shaw, G.M., 2008. Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc. Natl. Acad. Sci. USA 105, 7552–7557.
Khalili, S., Armaou, A., 2008. Sensitivity analysis of HIV infection response to treatment via stochastic modeling. Chem. Eng. Sci. 63, 1330.
Kloeden, P.E., Platen, E., 1999. Numerical Solution of Stochastic Differential Equations. Springer-Verlag.
Kousignian, I., Autran, B., Chouquet, C., Calvez, V., Gomard, E., Katlama, C., Rivière, Y., Costagliola, D., IMMUNOCO Study Group, 2003. Markov modelling of changes in HIV-specific cytotoxic T-lymphocyte responses with time in untreated HIV-1 infected patients. Stat. Med. 22, 1675.
Leenheer, P.D., 2009. Within-host virus models with periodic antiviral therapy. Bull. Math. Biol. 71, 189–210.
Levy, J.A., 2007. HIV and the Pathogenesis of AIDS. ASM Press, Washington, DC.
Lifson, J.D., Nowak, M.A., Goldstein, S., Rossio, J.L., Kinter, A., Vasquez, G., Wiltrout, T.A., Brown, C., Schneider, D., Wahl, L., Lloyd, A.L., Williams, J., Elkins, W.R., Fauci, A.S., Hirsch, V.M., 1997. The extent of early viral replication is a critical determinant of the natural history of simian immunodeficiency virus infection. J. Virol. 71, 9508–9514.
Lin, H., Shuai, J.W., 2010. A stochastic spatial model of HIV dynamics with an asymmetric battle between the virus and the immune system. New J. Phys. 12, 043051.
Markowitz, M., Louie, M., Hurley, A., Sun, E., Mascio, M.D., Perelson, A.S., Ho, D.D., 2003. A novel antiviral intervention results in more accurate assessment of human immunodeficiency virus type 1 replication dynamics and T-cell decay in vivo. J. Virol. 77, 5037–5038.
Merrill, S.J., 1989. Modeling the interaction of HIV with cells of the immune system. In: Castillo-Chavez, C. (Ed.), Mathematical and Statistical Approaches to AIDS Epidemiology. Springer, Berlin, pp. 371.
Merrill, S.J., 2005. The stochastic dance of early HIV infection. J. Comput. Appl. Math. 184, 242.
Müller, V., Marée, A.F., Boer, R.J.D., 2001. Release of virus from lymphoid tissue affects human immunodeficiency virus type 1 and hepatitis C virus kinetics in the blood. J. Virol. 75, 2597.
Nowak, M.A., Bangham, C.R., 1996. Population dynamics of immune responses to persistent viruses. Science 272, 74–79.
Nowak, M.A., May, R.M., 2000. Virus Dynamics. Oxford University Press.
Perelson, A.S., 2002. Modelling viral and immune system dynamics. Nat. Rev. Immunol. 2, 28.
Perelson, A.S., Kirschner, D.E., Boer, R.D., 1993. Dynamics of HIV infection of CD4+ T cells. Math. Biosci. 114, 81–125.
Rambaut, A., Posada, D., Crandall, K.A., Holmes, E.C., 2004. The causes and consequences of HIV evolution. Nat. Rev. Genet. 5, 52–61.
Ramratnam, B., Bonhoeffer, S., Binley, J., Hurley, A., Zhang, L., Mittler, J.E., Markowitz, M., Moore, J.P., Perelson, A.S., Ho, D.D., 1999. Rapid production and clearance of HIV-1 and hepatitis C virus assessed by large volume plasma apheresis. Lancet 354, 1782–1785.
Ruskin, H.J., Pandey, R.B., Liu, Y., 2002. Viral load and stochastic mutation in a Monte Carlo simulation of HIV. Physica A 311, 213.
Stafford, M.A., Corey, L., Cao, Y., Daar, E.S., Ho, D.D., Perelson, A.S., 2000. Modeling plasma virus concentration during primary HIV infection. J. Theor. Biol. 203, 285–301.
Tan, W.Y., Wu, H., 1998. Stochastic modeling of the dynamics of CD4+ T-cell infection by HIV and some Monte Carlo studies. Math. Biosci. 147, 173.
Tuckwell, H.C., le Corfec, E., 1998. A stochastic model for early HIV-1 population dynamics. J. Theor. Biol. 195, 451.
van Kampen, N.G., 2007. Stochastic Processes in Physics and Chemistry, 3rd ed. Elsevier, Amsterdam/Boston/Heidelberg.
Verhofstede, C., Van Wanzeele, F., Reynaerts, J., Mangelschots, M., Plum, J., Fransen, K., 2010. Viral load assay sensitivity and low level viremia in HAART treated HIV patients. J. Clin. Virol. 47, 335–339.
Vrisekoop, N., den Braber, I., de Boer, A.B., Ruiter, A.F.C., Ackermans, M.T., van der Crabben, S.N., Schrijver, E.H.R., Spierenburg, G., Sauerwein, H.P., Hazenberg, M.D., de Boer, R.J., Miedema, F., Borghans, J.A.M., Tesselaar, K., 2008. Sparse production but preferential incorporation of recently produced naive T cells in the human peripheral pool. Proc. Natl. Acad. Sci. USA 105, 6115–6120.
Weinberger, A.D., Perelson, A.S., Ribeiro, R.M., Weinberger, L.S., 2009. Accelerated immunodeficiency by anti-CCR5 treatment in HIV infection. PLoS Comput. Biol. 5, e1000467.