Modelling disease spread through random and regular contacts in

ARTICLE IN PRESS
Theoretical Population Biology 73 (2008) 104–111
www.elsevier.com/locate/tpb
Modelling disease spread through random and regular contacts in
clustered populations
K.T.D. Eames
Department of Biological Sciences and Mathematics Institute, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, UK
Received 15 March 2007
Available online 9 October 2007
Abstract
An epidemic spreading through a network of regular, repeated, contacts behaves differently from one that is spread by random
interactions: regular contacts serve to reduce the speed and eventual size of an epidemic. This paper uses a mathematical model to explore
the difference between regular and random contacts, considering particularly the effect of clustering within the contact network. In a
clustered population random contacts have a much greater impact, allowing infection to reach parts of the network that would otherwise
be inaccessible. When all contacts are regular, clustering greatly reduces the spread of infection; this effect is negated by a small number
of random contacts.
r 2007 Elsevier Inc. All rights reserved.
Keywords: Epidemic; Network; Clustering; Pair-wise approximation
1. Introduction
When trying to understand the spread of an infectious
disease through a population, many different factors are
important. Pathogen genetics, host immune response, host
behaviour, and public health interventions all have a role.
Here, we focus on the part played by the social behaviour
of the host population, specifically considering a human
population.
Infections are often transmitted through close contact.
Sexually transmitted infections are spread by intimate
physical contact whereas common airborne infections such
as influenza require less close proximity (Hethcote and
Yorke, 1984; Anderson and May, 1992; Wallinga et al.,
1999). Vector or water-borne diseases do not generally rely
on an interaction between infector and infected but instead
tend to be spatially localised (Anderson and May, 1992).
In this paper we will consider infections that are transmitted through direct interaction between individuals, and
observe that the relevant definition of ‘‘interaction’’ is
pathogen-specific.
E-mail address: [email protected]
0040-5809/$ - see front matter r 2007 Elsevier Inc. All rights reserved.
doi:10.1016/j.tpb.2007.09.007
Many contacts are long lasting. Those who share houses,
workspaces, or those who are long-term sexual partners
interact repeatedly and over a long period of time
(Edmunds et al., 1997; Martin and Yeung, 2006). It is no
surprise that much transmission of infectious diseases takes
place within the close household, since these individuals
interact with high frequency (Eichner, 2003; Ferguson
et al., 2006).
Some contacts can be considered as random events:
interactions that take place once and only once. Those who
sit next to each other on a train, or bump into each other in
the street, for example, would fall into this category
(Eichner, 2003; Ferguson et al., 2006).
Other contacts are harder to categorise: if an infection
lasts for a fortnight then a barber visited each month or an
aunt seen every Christmas is a contact that will not be
repeated during the infectious period, and is, by this
measure, indistinguishable from a random interaction;
however, such contacts may have a place in the wider social
network—linked through mutual friends, for instance—
that gives them more significance than a random event.
For the purposes of this paper, we will categorise
contacts as regular or random: regular contacts are those
that are seen repeatedly during the course of an infection,
ARTICLE IN PRESS
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
random contacts are those that will not be encountered
multiple times by an infectious individual. This categorisation of contacts as regular (repeated) or random (one-off)
interactions simplifies greatly the reality of human social
behaviour, but it allows the tractable exploration of
the effect of different types of interaction without the need
for large scale data collection to parameterise detailed
simulations.
When modelling the spread of an infection, the two
distinct types of interaction require different treatment.
Regular interactions can be viewed as a contact network
(Watts and Strogatz, 1998; Bearman et al., 2004; De et al.,
2004; Eames and Keeling, 2004; Eubank et al., 2004; Andre
et al., 2006), with individuals appearing as nodes and those
individuals who have regular contact with each other being
joined by lines. This network representation illustrates very
clearly that each individual has, in general, regular contact
with only a small subset of the population.
Random, one-off, contacts can be well captured with
more familiar mean-field models (Anderson and May,
1992). These models, which have been applied to a huge
range of infection scenarios, assume that all individuals in a
population are capable of interacting, in contrast to the
network paradigm that limits the number of interactions.
Mean-field models adhere nicely to the idea of brief,
unrepeated, contacts, resulting in the force of infection
experienced by a susceptible individual being proportional
to the level of infection present in the whole population.
If mean-field and network models with the same contact
intensity are compared we would expect infection to spread
more slowly within a network. This is because of a local
‘‘burning out’’ of susceptible individuals and is manifested
by correlations in infection status emerging between
connected individuals—infected individuals tend to be
linked to other infected individuals, by whom they were
infected or to whom they have transmitted infection;
therefore the number of contacts between infected and
susceptible individuals is reduced and the spread of
infection slowed (Diekmann et al., 1998; Keeling, 1999).
It has been shown that the structure of a contact network
has important implications for the spread of infection.
Individuals with particularly high numbers of contacts can
dominate the dynamics and are key targets for interventions (Kretzschmar et al., 1996; Rothenberg et al., 1998; De
et al., 2004; Andre et al., 2006), an effect also shown by
high-mixing subgroups in mean-field models (Hethcote and
Yorke, 1984; Anderson and May, 1992). It has also been
shown that highly clustered networks, those in which two
linked individuals are particularly likely to have other
contacts in common, result in reduced epidemic spread; in
clustered networks infection tends to be confined to highly
interconnected cliques and is less likely to emerge into the
wider population (Rothenberg et al., 1996; Keeling, 1999;
Eubank et al., 2004) although it may be more likely to
infect all of a well-connected cluster (Newman, 2003).
Clustering makes contact tracing a more effective control
measure since it allows multiple routes for intervention
105
efforts to follow to reach infected individuals (Eames and
Keeling, 2003; Tsimring and Huerta, 2003; Kiss et al.,
2005). Further, by increasing the chance that two infected
individuals will compete to infect the same susceptible
individual, clustering alters the evolutionary pressures that
a pathogen experiences (Boots and Sasaki, 1999; Read and
Keeling, 2003; Boots et al., 2004).
In this paper we will examine the impact of regular and
random contacts on disease spread, looking particularly at
the differences between clustered and unclustered social
networks. We will see that disease spreads further and
more rapidly when contacts are random, and that the
difference between regular and random contacts is most
apparent in highly clustered populations.
2. Methods
Mixing patterns within a population display a combination of order
and randomness. This observation is particularly apparent when looking
at the spatial arrangement of contacts; many contacts are with nearby
individuals while some are between individuals a large distance apart
(Read and Keeling, 2003; Ferguson et al., 2006). Modelling approaches
have dealt with this fact in a number of ways: notably, models of foot-andmouth infection using an interaction strength based on separation
(Ferguson et al., 2001; Keeling et al., 2001) with, in these cases, nearby
farms being much more likely to interact than those widely separated; an
alternative approach, sparked by the work of Watts and Strogatz (1998), is
to use small-world networks, i.e. networks that look in the main like
lattices, with individuals connected to their nearest neighbours, but with
some links rewired to allow long-distance connections (Watts and
Strogatz, 1998; Boots and Sasaki, 1999; Tsimring and Huerta, 2003).
Each of these models idealises spatial regularity and randomness as local
and long-distance links. However, none contains the distinction between
social regularity and randomness that we seek to investigate here: all links
are of the same type once they have formed, rather than some being longlasting and others short lived. The models can be seen as spatially
extended versions of either the mean-field or network approach but not as
an amalgam of the two.
Some previous models have been developed to combine network and
mean field interactions (Ball et al., 1997; Ferguson et al., 2001; Eames and
Keeling, 2003; Kiss et al., 2006). The approach taken here follows the path
laid by these. In this paper we will investigate disease dynamics in a
population that has both random and regular contacts, which we will
model as mean-field and network interactions, respectively.
2.1. Models
We will look at two modelling approaches: stochastic and deterministic.
The stochastic approach is more natural but it requires that every link
within the network is known; this is not a problem when using simulated
networks but it will be a concern when applying the model to a real-world
population. Stochastic models allow individual variation to be included
with relative ease providing there are appropriately detailed data (Eubank
et al., 2004; Ferguson et al., 2006). The deterministic approach only
provides an approximation to the true mixing behaviour but requires far
less information to parameterise. By allowing the calculation of analytical
results, it allows a more straightforward evaluation of the impact of a
pathogen over a wide range of population-level behavioural parameters
such as the mean number of contacts and the level of clustering.
We will investigate the case of a pathogen that confers lasting immunity
following infection and which allows individuals to be classified as either
susceptible ðSÞ, infected and infectious ðIÞ or recovered ðRÞ. This
commonly used SIR classification is a useful approximation to the many
complexities of the infection process within an individual (Hethcote and
ARTICLE IN PRESS
106
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
Yorke, 1984; Anderson and May, 1992). Regular contacts are treated as a
static network while random interactions are modelled as mean-field
mixing.
2.1.1. Stochastic model
Combining regular and random interactions in a stochastic model is
straightforward. Regular contacts are represented as a network, through
which infection can spread along links between connected individuals.
Random contacts are thought of as short-lived interactions with members
of the population chosen at random and are modelled as mean-field
interactions.
The rate at which a susceptible individual becomes infected is therefore:
tN # infected regular contacts
þ tM # random contacts
proportion of population infected,
ð1Þ
where tN and tM are the rates of transmission through regular and
random contacts, respectively. Throughout, we will use the subscripts N
and M to represent parameters related to the regular (network) and
random (mean-field) interactions, respectively. In this vein, we define kN
and kM to be the number of regular and random contacts, respectively,
and we define the total number of contacts, K, by K ¼ kN þ kM . In the
mean-field case, kM can be thought of as the number of contact events per
unit time, whereas the kN network contacts are assumed to be fixed.
The appropriate network to use depends on the population being
studied and the relevant definition of a contact—for instance, for a given
population, sexual contact networks will differ from social interaction
networks. Rather than attempting an accurate representation of a
particular population, here we will look at simplified network configurations. Throughout, for simplicity, we will assume that the network is
homogeneous, with each individual having the same number, kN , of
contacts. Following Keeling et al. (1997), we define the clustering
coefficient, f, to be the proportion of triples that form triangles, i.e. the
proportion of contacts of any given individual that are connected to
each other.
We use an iterative process to form clustered networks: first, a
proportion of the desired links are made by connecting randomly chosen
nodes; then, with some probability, triangles are formed by adding links
between individuals with a mutual contact (with the restriction that no
individuals may have more than kN contacts and that no two individuals
may be connected to each other twice). These two steps are repeated until
the desired number of contacts has been reached. Clustering can be
increased by either increasing the probability of forming triangles or by
decreasing the proportion of the desired number of contacts made each
iteration. In contrast to other models (for example Watts and Strogatz,
1998; Ferguson et al., 2001; Read and Keeling, 2003), the probability of
any two individuals becoming connected does not depend on the distance
between them in any abstract or realistic space. Aside from the clustering
introduced the networks formed will resemble random graphs. The
network formation method used, although not enabling a network with a
specified value of f to be formed, permits a range of networks with
different values of f to be simply generated. The mechanism chosen allows
clustering to reach high levels without the network breaking up into a
collection of disconnected components (Newman, 2003).
2.1.2. Deterministic model
Often, for a population of interest, it is impossible to determine
accurately the network of regular contacts; complete network determination requires data about all individuals and all interactions. In such cases
either one must make assumptions about the form of the network or seek
some other approach.
Pair-wise models have been developed to provide a means of modelling
the spread of epidemics on networks without the need for full network
parameterisation (Keeling et al., 1997; Rand, 1999). These deterministic
models form an intermediate step between mean-field and full network
models by including as their variables all possible configurations of pairs
of individuals within the network.
A fuller discussion of the formulation of pair-wise models can be found
elsewhere (Rand, 1999); briefly, we let ½A be the number of individuals of
type A (either S, I, or R to represent susceptible, infected, or recovered,
respectively); ½AB the number of contacts in the network between a type A
individual and a type B individual; and ½ABC the number of A B C
triples within the network. For ease of book-keeping, pairs are counted
once in each direction, thus ½AB ¼ ½BA and ½AA is even. With this
notation, we can form a system of differential equations describing the
spread of infection through the population passing along both regular
network contacts and through random mean-field interactions:
_ ¼ tN ½SI tM kM ½S½I=P,
½S
(2)
_ ¼ tN ½SI þ tM kM ½S½I=P g½I,
½I
(3)
_ ¼ g½I,
½R
(4)
_ ¼ 2tN ½SSI 2tM kM ½SS½I=P,
½SS
(5)
_ ¼ tN ð½SSI ½ISI ½SIÞ þ tM kM ð½SS½I=P
½SI
½SI½I=PÞ g½SI,
ð6Þ
_ ¼ tN ½ISR tM kM ½SR½I=P þ g½SI,
½SR
(7)
_ ¼ 2tN ð½ISI þ ½SIÞ þ 2tM kM ½SI½I=P 2g½II,
½II
(8)
_ ¼ tN ½ISR þ tM kM ½SR½I=P þ gð½II ½IRÞ,
½IR
(9)
_ ¼ 2g½IR,
½RR
(10)
where g is the recovery rate and P ¼ ½S þ ½I þ ½R is the population size.
To close the system we need to be able to approximate the triples that
appear in terms of pairs and singletons. In an unclustered network we can
use the approximation (Keeling et al., 1997)
½ABC ðkN 1Þ ½AB½BC
kN
½B
(11)
which assumes that the triple is made up of two pairs that have a central
individual in common but are otherwise independent. In the case of
clustered networks, this assumption is no longer appropriate and instead
we use
ðkN 1Þ ½AB½BC
½ACP
½ABC ð1 fÞ þ f
,
(12)
kN
½B
½A½CkN
which is made up of two parts: the approximation, as above, for
unclustered triples (i.e. those that do not form a triangle) along with the
correlation between the outside members of the triple if the triple happens
to form a triangle (Keeling et al., 1997). With the application of this
approximation, the system of equations can be closed and numerically
iterated to approximate the behaviour of an epidemic on a network with a
given number of regular contacts, kN , and a given clustering coefficient, f.
One advantage of the deterministic model over its stochastic counterpart is the scope it gives for obtaining easily interpreted analytic results. In
particular here we will derive expressions for the initial growth rate of the
epidemic as functions of the clustering coefficient, f, and the distribution
of contacts between regular (kN ) and random (kM ).
3. Results
3.1. Unclustered populations
We begin by investigating the difference between regular
and random contacts within an unclustered population.
The network of regular contacts in such a population is
approximately tree-like, with very few triangles and therefore very little social clustering.
ARTICLE IN PRESS
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
_ ¼ tN ½SI þ tM kM ½S½I=P g½I
½I
¼ ðtN ½SI=½I þ tM kM gÞ½I.
ð13Þ
ð14Þ
The early exponential growth of the epidemic is
characterised by a phase during which ½SI=½I ¼ l, say,
giving a growth rate of tN l þ tM kM g. From this growth
rate we can define a quantity, R, which has analogous properties to the basic reproductive ratio, R0 (the number of
secondary cases produced by an index case in an otherwise
susceptible population, Anderson and May, 1992):
R ¼ ðtN l þ tM kM Þ=g.
4500
4000
3500
3000
Total cases
Fig. 1 shows typical time series of epidemics as the
proportion of random contacts is varied. We see that the
stochastic and deterministic models are in close agreement
and both predict that there is a difference in the initial
epidemic growth rate, with growth rates being largest when
all contacts are random. This difference is translated into a
difference in the final size of the epidemic, with more
individuals being infected when there are few regular
contacts (Diekmann et al., 1998; Keeling, 1999).
Using the pair-wise approximation, we will explore the
exponential growth rate of the epidemic during its early
stages in more detail. We can derive an expression for the
early stage growth rate of an epidemic: we have that
107
2500
2000
1500
1000
500
0
0
5
10
15
20
25
Time
Fig. 1. Typical time series (number of infected plus recovered individuals)
for stochastic (red) and deterministic (blue) epidemics. Here
tN ¼ tM ¼ 0:2, g ¼ 1, kN þ kM ¼ 10. Shown are the cases where all
contacts are random (kN ¼ 0, solid lines) when all are regular (kM ¼ 0,
dashed lines), and where half are random, half regular (kN ¼ kM ¼ 5,
dash-dot lines). Stochastic epidemics are simulated with a population size
of 5000 and a seed of five cases distributed at random. The mean of 50
stochastic epidemics is plotted.
(15)
R and R0 are not the same thing in this network model,
since the initial growth rate does not translate into number
of secondary cases in the way that it does for simple meanfield models, but it shares some important properties: in
particular, R41 implies that the epidemic initially grows
whereas Ro1 implies that there is no spread.
To evaluate R we must calculate the value taken by
½SI=½I in the early stages of the epidemic. Following the
methods of Keeling (1999) we calculate the early quasiequilibrium value of l as follows:
contacts (kM ¼ 0) we have R ¼ tN ðkN 2Þ=g (Keeling,
1999). We see, as previously observed, that an epidemic
spreads more slowly when restricted to pass through a
network (Diekmann et al., 1998; Keeling, 1999). If we
define RN ¼ tN ðkN 2Þ=g and RM ¼ tM kM =g the expression simplifies to
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
RM þ RN þ ðRM þ RN Þ2 þ 8RM RN =ðkN 2Þ
.
R¼
2
(20)
_
_
l_ ¼ 0 ) ½SI½I
¼ ½I½SI
)
Further straightforward manipulation shows that
RM þ RN oRoRM þ tN kN =g. Thus the growth rate of
an epidemic that spreads through both regular and random
contact types exceeds the sum of the growth rates of an
epidemic when restricted to each contact type in turn; the
random spread speeds up the spread of an epidemic
through a network of regular contacts. As expected,
though, the growth rate is lower than it would be if all
contacts were random.
Fig. 2 shows the effect on an epidemic of transferring
interactions from regular to random. K ¼ kN þ kM is held
constant but the fraction of random contacts is varied. We
see that populations with more random contacts experience
greater transmission, and that the effect is stronger when
there are fewer contacts in total: when there are few contacts
the network and the mean-field models are more different.
(16)
ðtN ð½SSI ½ISI ½SIÞ þ tM kM ½I=Pð½SS
½SIÞ g½SIÞ½I
¼ ðtN ½SI þ tM kM ½S½I=P g½IÞ½SI.
ð17Þ
Using the triples approximation (Eq. (11)) and making
the substitution ½SI ¼ l½I and linearising at the early stage
of the epidemic, when ½I51, ½S P, and ½SS kN P we
obtain
l¼
RgkN
,
2tN þ Rg
(18)
which gives the result:
R¼
tM kM þ tN ðkN 2Þ þ
qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
ðtM kM þ tN ðkN 2ÞÞ2 þ 8tM kM tN
2g
.
(19)
We observe a number of things: first, when there are no
regular contacts (kN ¼ 0) the result simplifies to the
expected result: R ¼ tM kM =g. When there are no random
3.2. Clustered populations
We now look at epidemics on clustered networks. The
models for these networks closely resemble those discussed
ARTICLE IN PRESS
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
above; in the stochastically generated networks, triples are
joined to form triangles (as described above), thus increasing the amount of clustering, and in the deterministic
case the triples approximation (Eq. (12)) is used to include
the effect of clustering in the network of regular contacts.
In the deterministic case similar calculations can be made
to derive an expression for the initial growth rate of the
epidemic. However, the added complexity of the triples
approximation means that a simple closed form for the
growth rate cannot be obtained.
4500
4000
3500
Final size
108
Total contacts=5
Total contacts=10
3000
Total contacts=20
2500
2000
1500
0
Hence, when the clustering coefficient, f, is non-zero we
must also derive the early stage quasi-equilibrium value of
½II=½I. We let a ¼ ½II=½I and the same methodology gives
a¼
2tN l
.
tN l þ g 2tN l2 fðkN 1Þ=k2N
(23)
Substituting this back in we obtain that l is the solution
of the equation:
fl
l ¼ ðkN 1Þ 1 f þ
1
kN
ðkN 1Þf
2tN l
.
ð24Þ
l
2
kN
tN l þ g 2tN l2 fðkN 1Þ=k2N
This can be solved numerically with ease. We see from
Fig. 3b that it leads to the same conclusions as the
stochastic model: that clustering in a network noticeably
impedes the spread of infection. Increasing f leads to a
decrease in l and therefore a decrease in epidemic growth
0.4
0.6
0.8
Proportion of random contacts
1
2
1.9
1.8
1.7
1.6
Total contacts=5
1.5
ðtN ð½SSI ½ISI ½SIÞ g½SIÞ½I ¼ ðtN ½SI g½IÞ½SI.
(21)
In the case of a clustered population, the term ½ISI is no
longer negligible if such a triple forms a triangle (since
infected individuals are highly correlated). In this case, the
expression reduces to
fl
ðkN 1Þf ½II
l ¼ ðkN 1Þ 1 f þ
.
(22)
1l
kN
½I
k2N
0.2
2.1
R
3.2.1. Regular contacts only
We begin by looking at the case of a clustered network of
regular contacts with no random interactions, i.e. kM ¼ 0.
This section considers solely the influence of clustering
within a social network on the spread of infection.
We see in Fig. 3 that the stochastic model predicts that
clustering within a network can have a huge effect on the
spread of an epidemic: particularly when the number of
contacts is small, clustering prevents infection from
emerging from densely connected cliques in to the wider
population, and therefore the progress of infection is
slowed (Rand, 1999). We can obtain the same result by
looking at the initial epidemic growth rate using the
deterministic pair-wise model. As before, we have R ¼
tN l=g, where at the early stage equilibrium value of l we
have
Total contacts=10
Total contacts=20
1.4
1.3
1.2
1.1
0
0.2
0.4
0.6
0.8
Proportion of random contacts
1
Fig. 2. (a) Mean final size of a stochastic epidemic as a function of the
proportion of contacts that are random. g ¼ 1 and tN ¼ tM ¼ 2g=K to
ensure that all cases have the same growth rate when kM ¼ K. Curves are
plotted for K ¼ 5 (blue), K ¼ 10 (green), and K ¼ 20 (red). (b) Initial
growth rate of an epidemic (as calculated using the deterministic result) for
the same parameter values.
rate. Further, we see that the lower the number of contacts
the greater the impact of clustering.
Eq. (24) also allows us to calculate analytically the
critical value of the clustering coefficient f for which an
epidemic cannot take off (see Rand, 1999 for a similar
calculation in the case of an epidemic in which individuals
are once again susceptible on recovery): at this point, R ¼ 1
and l ¼ g=tN giving the critical value of f to be the
solution of the quadratic
f2
2
g ðkN 1Þ
g
g
g
kN k2N
þf 2 þ
tN k N
tN
tN tN
2
kN
g
kN 2 ¼ 0.
þ
tN
ðkN 1Þ
ð25Þ
ARTICLE IN PRESS
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
Fig. 4 shows how the critical value of f depends on both
the transmission parameter and the neighbourhood size.
No matter how large the transmission rate or the neighbourhood size, the model predicts that there is always a
level of clustering that will suffice to prevent the epidemic
from taking off. As tN =g ! 1 or kN ! 1 the critical
value of f approaches 1, i.e. a fully clustered population
with all individuals with a mutual contact being connected.
Such high clustering coefficients would require the network
to be split into multiple separate cliques with everyone
within each clique being connected; this is unlikely to be
seen in any real population, hence clustering is anticipated
to be capable of preventing an epidemic only for smaller
values of the transmission rate or neighbourhood size.
4000
Contacts=5
Contacts=10
Contacts=20
Final size
3500
3.2.2. Regular and random contacts
To the clustered network model described in the
previous section we can add random connections leading
to mean-field spread of infection. We can thereby investigate the difference between regular and random contacts in
a clustered society.
Again, the adjustment to the models is slight, with meanfield spread being included in the models. Once again,
although no simple result is forthcoming, the same
approach as above can be used to explore the initial
growth rate of the deterministic epidemic. Here we have
R ¼ ðtN l þ tM kM Þ=g,
where here l is the solution of
fl
l ¼ ðkN 1Þ 1 f þ
1
kN
ðkN 1Þfa tM kM kN
1 .
þ
l
tN
l
k2N
ð26Þ
3000
As before a ¼ ½II=½I evaluated at the early quasiequilibrium level given by
2500
a¼
2000
1500
1000
500
0
0
0.1
0.2
0.3
0.4
0.5
0.6
Clustering coefficient, φ
1.6
Contacts=5
Contacts=10
Contacts=20
1.5
1.4
1.3
R
109
1.2
1.1
1
0.9
0.8
0
0.2
0.4
0.6
0.8
1
Clustering coefficient, φ
Fig. 3. (a) Mean final size of a stochastic epidemic in a population with
kN ¼ K as a function of the clustering coefficient, f. g ¼ 1 and tN ¼
1:6g=ðK 2Þ to ensure that all cases have the same growth rate when
f ¼ 0. Curves are plotted for K ¼ 5 (blue), K ¼ 10 (green), and K ¼ 20
(red). (b) Initial growth rate of an epidemic (as calculated using the
deterministic result) for the same parameter values.
tM k M
2tN l
.
þ tN l þ g 2tN l2 fðkN 1Þ=k2N
ð27Þ
As before, numerical solution is straightforward. Fig. 5
shows the influence of clustering on an epidemic in a
population with both regular and random contacts. Both
for the stochastic final size and deterministic growth rate,
we reach the same conclusions. When contacts are random
rather than regular in an unclustered network the speed
and final size of an epidemic are increased by a noticeable
but not enormous amount (the effect being greatest when
the total number of contacts is small); in a clustered
network, the impact is far larger; when clustering is
particularly high a few random contacts can be the
difference between an epidemic that fails to take off and
one that infects half the population.
Again, we can look for the critical level of clustering that
reduces R to below one (Fig. 6). We see in this case that the
more random contacts there are the smaller the region of
parameter space in which clustering of regular contacts can
prevent epidemic spread. The presence of random contacts
slightly enlarges the region in parameter space where an
epidemic in an unclustered population can take off and
also dramatically reduces the possible impact of clustering.
The greater the proportion of random contacts, the lower
the capacity of clustering to influence an epidemic.
4. Conclusions
A wide range of interactions is implicated in disease
transmission: from contacts whom we meet every day to
those whom we will never see again. From the point of view
of the infected individual, the distinction is between those
contacts who will be seen multiple times during the
infectious period of the pathogen and those who will not.
ARTICLE IN PRESS
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
110
4500
4000
0.8
3500
0.6
3000
0.4
Final size
Critical clustering
1
0.2
0
10
2500
2000
Random contacts=0%
Random contacts=10%
Random contacts=20%
Random contacts=50%
Random contacts=100%
1500
8
6
kN
4
2
0
0.1
0.2
0.3
0.4
0.5
1000
τN/g
500
0
Fig. 4. Critical value of the clustering coefficient, f, for which no epidemic
can take off as predicted by the deterministic pair-wise model in a
population with no random contacts.
0.1
0.2
0.3
0.4
0.5
Clustering coefficient, φ
0.6
0.7
2
1.5
R
It makes more than merely a descriptive difference
whether contacts are regular or random. Regular, repeated,
contacts constrain the spread of infection since effectively
an infected individual ‘‘wastes’’ some contact events by
meeting again people whom he has already infected.
Therefore, when contacts are random, one-off, events
infection spreads faster and further.
The simple models of social interactions described here
suggest that, in the case of an unclustered network, the
impact of an epidemic is increased by a small but
potentially significant amount when contacts are random
rather than regular. However, when regular contacts are
clustered then random contacts make a huge amount of
difference to the epidemic. In the absence of random
mixing, when interactions are limited to a network of
regular contacts, high levels of clustering can prevent an
epidemic from taking place; a few random contacts allow
wider transmission, even in highly clustered networks. In a
clustered network infection is restricted to tightly knit
cliques with little interaction with the rest of the population; the presence of random contacts allows infection to
emerge and spread from clique to clique.
In the case of unclustered networks, random contacts
serve to reduce the correlation between interacting individuals. In clustered networks, however, random contacts
act more like the long-range links introduced in smallworld models (Watts and Strogatz, 1998), allowing infection to pass between different regions of the network.
However, unlike in small-world models, in the model
investigated here randomness does not relate to the spatial
arrangement of links but to the nature of the interactions.
Both the stochastic and deterministic approaches taken
here support the conclusions that clustering is an important
aspect of social mixing patterns whose impact is reduced by
random, irregular, interactions. However, neither model is
capable of capturing the complex processes that lead to realworld social network formation. In the real world, clustering
may arise from a number of different routes, such as through
0
1
Random contacts=0%
Random contacts=10%
Random contacts=20%
Random contacts=50%
Random contacts=100%
0.5
0
0
0.2
0.4
0.6
0.8
1
Clustering coefficient, φ
Fig. 5. (a) Mean final size of a stochastic epidemic as a function of the
clustering coefficient, f. g ¼ 1, tN ¼ tM ¼ 0:2, and kN þ kM ¼ K ¼ 10
throughout. Curves are plotted for kM ¼ 0 (red), kM ¼ 1 (green), kM ¼ 2
(blue), kM ¼ 5 (maroon), and kM ¼ 10 (black). When all contacts are
random, f is undefined; this line is shown for comparison. (b) Initial
growth rate of an epidemic (as calculated using the deterministic result) for
the same parameter values.
sharing homes, schools, or workplaces, or through spatial
proximity, or through common social activities (Ball et al.,
1997; Rothenberg et al., 1998; Watts and Strogatz, 1998;
Ferguson et al., 2001, 2006; Read and Keeling, 2003; De et
al., 2004; Eubank et al., 2004), and there are many ways to
generate models of clustered networks, some of which are
based on approximations of these social processes. The
models presented here aim to make as few assumptions as
possible about how social structure arises and so provide an
abstract arena in which to investigate particular aspects of
behaviour. While the broad conclusions are expected to
apply more generally, different modelling approaches will
give different quantitative predictions of the effects of
clustering and of interaction patterns.
ARTICLE IN PRESS
K.T.D. Eames / Theoretical Population Biology 73 (2008) 104–111
14
12
K
10
8
6
4
0
0.1
0.2
0.3
0.4
0.5
τ/g
Fig. 6. Region of parameter space for which clustering can reduce R to
below 1. Above the lower line of each type an epidemic can take off in an
unclustered population; above the upper line clustering is unable to
prevent an epidemic from taking off. kM ¼ 0:1K (solid lines), kM ¼ 0:2K
(dashed lines), kM ¼ 0:5K (dash-dot lines). tN ¼ tM ¼ t, g ¼ 1 throughout.
To know the relative importance of regular and random
contacts, it is vital to be able to assess the level of clustering
within social networks. This is no easy task, and requires
extensive surveying of social mixing behaviour (Edmunds et
al., 1997). Information about behaviour within households
or workplaces, for example, though interesting and useful, is
obtained from a naturally clustered subset of the population,
so does not provide a sensible estimate of social clustering.
Accurate surveys are needed to determine both how many
regular and how many random contacts we have and also the
amount of clustering displayed by the contacts.
Acknowledgments
The author thanks EPSRC and Emmanuel College,
Cambridge, for financial support, Matt Keeling for helpful
discussions, and two anonymous referees for their constructive suggestions.
References
Anderson, R.M., May, R.M., 1992. Infectious Diseases of Humans.
Oxford University Press, Oxford.
Andre, M., Ijaz, K., Tillinghast, J.D., Krebs, V.E., Diem, L.A., Metchock, B.,
Crisp, T., McElroy, P.D., 2006. Transmission network analysis to
complement routine tuberculosis contact investigations. Am. J. Pub.
Health 96, 1–11.
Ball, F., Mollison, D., Scalia-Tomba, G., 1997. Epidemics with two levels
of mixing. Ann. Appl. Probab. 7, 46–89.
Bearman, P.S., Moody, J., Stovel, K., 2004. Chains of affection: the structure
of adolescent romantic and sexual networks. Am. J. Soc. 110, 44–91.
Boots, M., Hudson, P.J., Sasaki, A., 2004. Large shifts in pathogen
virulence relate to host population structure. Science 303, 842–844.
Boots, M., Sasaki, A., 1999. ‘‘Small worlds’’ and the evolution of
virulence: infection occurs locally and at a distance. Proc. R. Soc.
London B 266, 1933–1938.
111
De, P., Singh, A.E., Wong, T., Yacoub, W., Jolly, A.M., 2004. Sexual
network analysis of a gonorrhoea outbreak. Sex. Transm. Infect. 80,
280–285.
Diekmann, O., de Jong, M.C.M., Metz, J.A.J., 1998. A deterministic
epidemic model taking account of repeated contacts between the same
individuals. J. Appl. Probab. 35, 448–462.
Eames, K.T.D., Keeling, M.J., 2003. Contact tracing and disease control.
Proc. R. Soc. London B 270, 2565–2571.
Eames, K.T.D., Keeling, M.J., 2004. Monogamous networks and the
spread of sexually transmitted disease. Math. Biosci. 189, 115–130.
Edmunds, W.J., O’Callaghan, C.J., Nokes, D.J., 1997. Who mixes with
whom? A method to determine the contact patterns of adults that may
lead to the spread of airborne infections. Proc. R. Soc. London B 264,
949–957.
Eichner, M., 2003. Case isolation and contact tracing can prevent the
spread of smallpox. Am. J. Epidemiol. 158, 118–128.
Eubank, S., Guclu, H., Kumar, V.S.A., Marathe, M.V., Srinivasan, A.,
Toroczkai, Z., Wang, N., 2004. Modelling disease outbreaks in
realistic social networks. Nature 429, 180–183.
Ferguson, N.M., Donnelly, C.A., Anderson, R.M., 2001. The foot-andmouth epidemic in Great Britain: pattern of spread and impact of
interventions. Science 292, 1155–1160.
Ferguson, N.M., Cummings, D.T., Fraser, C., Cajka, J.C., Cooley, P.C.,
Burke, D.S., 2006. Strategies for mitigating an influenza pandemic.
Nature 442, 448–452.
Hethcote, H.W., Yorke, J.A., 1984. Gonorrhea: Transmission Dynamics
and Control. Springer Lecture Notes in Biomathematics, Springer,
Berlin.
Keeling, M.J., 1999. The effects of local spatial structure on epidemiological invasions. Proc. R. Soc. London B 266, 859–867.
Keeling, M.J., Rand, D.A., Morris, A.J., 1997. Correlation models for
childhood epidemics. Proc. R. Soc. London B 264, 1149–1156.
Keeling, M.J., Woolhouse, M.E.J., Shaw, D.J., Matthews, L., ChaseTopping, M., Haydon, D.T., Cornell, S.J., Kappey, J., Wilesmith, J.,
Grenfell, B.T., 2001. Dynamics of the 2001 UK foot and mouth
epidemic: stochastic dispersal in a heterogeneous landscape. Science
294, 813–817.
Kiss, I.V., Green, D.M., Kao, R.R., 2005. Disease contact tracing in
random and clustered networks. Proc. R. Soc. London B 272,
1407–1414.
Kiss, I.V., Green, D.M., Kao, R.R., 2006. The effect of contact
heterogeneity and multiple routes of transmission on final epidemic
size. Math. Biosci. 203, 124–136.
Kretzschmar, M., van Duynhoven, Y.T.H.P., Severijnen, A.J., 1996.
Modeling prevention strategies for gonorrhea and chlamydia using
stochastic network simulations. Am. J. Epidemiol. 144, 306–317.
Martin, J.L., Yeung, K.-T., 2006. Persistence of close personal ties over a
12-year period. Soc. Net. 28, 331–362.
Newman, M.E.J., 2003. Properties of highly clustered networks. Phys.
Rev. E 68, 026121.
Rand, D.A., 1999. Correlation equations for spatial ecologies. In:
McGlade, J. (Ed.), Advanced Ecological Theory. Blackwell, London,
pp. 99–143.
Read, J.M., Keeling, M.J., 2003. Disease evolution on networks: the role
of contact structure. Proc. R. Soc. London B 270, 699–708.
Rothenberg, R.B., Potterat, J.J., Woodhouse, D.E., 1996. Personal risk
taking and the spread of disease: beyond core groups. J. Inf. Dis. 174,
S144–S149.
Rothenberg, R.B., Sterk, C., Toomey, K.E., Potterat, J.J., Johnson, D.,
Schrader, M., Hatch, S., 1998. Using social network and ethnographic
tools to evaluate syphilis transmission. Sex. Transm. Dis. 25, 154–160.
Tsimring, L.S., Huerta, R., 2003. Modeling of contact tracing in social
networks. Physica A 325, 33–39.
Wallinga, J., Edmunds, W.J., Kretzschmar, M., 1999. Perspective: human
contact patterns and the spread of airborne infectious diseases. Trends
Microbiol. 7, 372–377.
Watts, D.J., Strogatz, S.H., 1998. Collective dynamics of ‘‘small-world’’
networks. Nature 393, 440–442.