EASC presentation UW

A method to estimate contact
probabilities.
Spread of disease and animal transport ….
Associate Professor
Uno Wennergren
Spatio-Temporal Biology
Division of Theoretical Biology
Linköping University
Sweden
Funded by MSB Swedish Civil Contingencies Agency
Contact probabilities
Spread of disease
Animal transport
• Predicting the spread of a disease
Depends on
the contact between entities (persons, animals, farms)
the specifics of the disease
This talk will focus on the contact pattern
what is important
how to estimate
• Structure of the talk
1. Why model the spread of disease?
2. What is important? Characteristics of the contact structure
that affect the spread of disease (according to science).
3. Different models. Advocating a data-driven approach.
4. How to construct a data-driven model: spread of disease
between farms.
Contacts through animal transports/shipments
Why Modeling
Spread of Disease
Why modeling
What is important
Different models
How to
• Spread of disease: health and costs.
Foot and mouth outbreak in the UK, 2001:
– slaughter of 3.4 million animals
– loss of £2 billion
• Modeling spread of disease: a tool to predict
the spread.
May improve decisions and interventions
(calculations show that less than 3% of proactive slaughter
during the FMD outbreak was on infected herds, Chris Ster
2009)
May or may not!
Is it reality or virtual reality?
What is important?
Characteristics in contact structure that
have effect on spread of disease
Spread as diffusion: distance per time unit,
in meters and in number of nodes.
Mathematical analysis tells us what matters:
1. The amount of contacts per unit time: the width of the kernel (variance).
2. The proportion of short-distance vs long-distance contacts: the shape of the
kernel (kurtosis).
If the kernel is bounded by an exponential distribution, then the width
(the variance) is enough (Mollison 1977 and Clark 1998).
Mathematical analysis is only
possible in homogeneous landscapes.
Livestock distribution in the US.
(from USDA 2003)
What contacts are there?
EU database:
all animal (cattle and pig) movements
between farms and from farms to
slaughterhouses.
Different models.
Advocating data driven approach.
What models and
approaches?
Veterinary medicine
Behavioural Modeling
Using databases – cutting edge statistics
Using behavioural modeling
Making assumptions
Use the data that’s
available.
Transport data
Outbreak data
Transport data and other data
+ Most recent structure
+ Available before an outbreak
– Not all transmission paths
– Not how infective
vs
Outbreak data (UK 2001)
+ Actual transmissions
– Not all countries have such data
– Different structures in different countries
– Structure changes over time
Swedish choice:
Not to use UK outbreak data.
No cattle markets in Sweden,
etc.
How to construct a data-driven model:
Spread of disease between farms.
Cutting-edge statistics: Bayesian MCMC method
Combine a set of kernels, reflecting a set of behaviours
12 months - cattle: approximately 1 000 000
reports of sales and purchases
[Figure panels: cattle, pig]
Does the difference matter??
Density – proportion of farms connected
[Figure panels: Model 1, Model 2]
END
Cutting edge statistics
Use the data that’s available
Associate Professor
Uno Wennergren
Theoretical Biology Linköping University
Funded by
MSB Swedish Civil Contingencies Agency
In collaboration with
SVA, National Veterinary Institute (Sweden)
Animal transport
2006
• Aims of different projects
• The context
– Research groups, their expertise
– Data base on animal movements
• Specific research questions
• Estimating probability of animal movements –
Tom Lindström
Projects
- aims
- groups
• Spread of disease: Foot and mouth disease.
– Prepare to optimize intervention
• Animal welfare
– Reduce stress and distance transported
• Spread of disease: Foot and mouth disease.
Funded by Swedish Civil Contingencies Agency (Swedish DHS):
2 grants, PIs: UW and SSL at SVA
– Prepare to optimize intervention
• Spatio-Temporal Biology (4 persons)
– Biology/Ecology
– Mathematics
– Scientific Computing
• National Veterinary Institute (SVA) (3 persons)
– Disease control and epidemiology
– Veterinary medicine
• Animal welfare
– Reduce stress and distance per animal
• Funded by Swedish Board of Agriculture (Swedish USDA) PI: UW
• Spatio-Temporal Biology (3 persons)
– Biology/Ecology
– Mathematics
– Scientific Computing
• Dept. of Animal Environment and Health, Swedish
University of Agricultural Sciences (2 persons)
– Animal welfare
– Veterinary medicine
• Skogforsk, LiU, NHH (3 persons)
– Optimization, logistics
– Route planning
Sweden
Database
• All animal (cattle and pig)
movements between farms and from farms
to slaughterhouses.
• Not per vehicle
– Cattle on individual level: birth, sale, purchase,
export, import, temporarily away (pasture),
return from pasture, slaughter/house, death
– Pig, on group level: as above
• Report within seven days
Farms and slaughterhouses in Sweden. Dots: blue, farms; red,
large slaughterhouses; green, smaller slaughterhouses. From
Håkansson et al. 2007.
Database specifics
• 12 months - cattle: approximately 1 000 000
reports of sales and purchases
• Important: errors in 10% of reports
– Possible to edit the database and reduce this to 1%
error by logical corrections (database cleaning)
Spatial and temporal investigation of reported movements, births and
deaths of cattle and pigs in Sweden. Submitted. Nöremark, Håkansson,
Lindström, Wennergren, and Sternberg Lewerin.
Specific research questions
1. Other contacts between farms: questionnaire to
farmers (SVA)
2. From measured contacts to probability of contact
3. Spread: Modeling disease specifics
4. Route planning of animal transport – effect on contacts and
movement distance.
5. Production units: composition and configuration
6. Networks
   1. Analysing transport network
   2. Testing efficiency of network measures as predictors

   1. Generating networks
   2. Testing link density on network formation
   3. Testing measures as predictors
[Figure: landscapes for Gamma=0, Gamma=1 and Gamma=2. Panels: continuous landscapes viewed from the side; continuous landscapes viewed from above; digitalized landscapes with 10% preferred habitat; digitalized landscapes with 40% preferred habitat.]
Specific research questions
From measured contacts to probability of contact
Estimating probabilities
Tom Lindström
Animal movements between holdings
• Which farms are likely to have contacts
through animal movements?
– Mathematical description.
– Estimation from data.
• Distance
– Contacts between nearby farms are more
common
– Several different processes
– Preventive Veterinary Medicine (any day now…)
A word on the data
• Should be good…
• Pigs reported at transport level by the receiving farmer
• Cattle reported at individual level by farmers at both
origin and end.
– Cattle moved on the same day between same farms
constitute one transport
– Mismatch
– “Cleaning” using the identity of cattle
• Locations of many cattle farms are not in the database, but
areas valid for subsidies are
• Inactive farms in the database
Quantifying distance dependence
• Distance dependence needs two
measurements. Probability of contacts has
– Scale
• Measured as Variance (or Squared Displacement)
– Shape
• Measured as Kurtosis
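The two measures can be illustrated on sample data. The following is a minimal sketch, assuming displacements along one axis centred on zero; the samples are simulated for illustration and are not taken from the transport database. Variance is computed as the mean squared displacement and kurtosis as the fourth moment divided by the squared second moment.

```python
import random

def variance_and_kurtosis(displacements):
    """Return (variance, kurtosis) of displacements centred on zero.

    Kurtosis is the fourth moment divided by the squared second moment,
    so a normal distribution gives a value close to 3.
    """
    n = len(displacements)
    m2 = sum(x ** 2 for x in displacements) / n
    m4 = sum(x ** 4 for x in displacements) / n
    return m2, m4 / m2 ** 2

random.seed(1)
# Hypothetical sample with normal tails (scale 10).
normal_like = [random.gauss(0.0, 10.0) for _ in range(50_000)]
# Hypothetical sample with fatter tails (double exponential, mean distance 10).
fat_tailed = [random.choice((-1, 1)) * random.expovariate(1 / 10.0)
              for _ in range(50_000)]

var_n, kurt_n = variance_and_kurtosis(normal_like)
var_f, kurt_f = variance_and_kurtosis(fat_tailed)
```

The fat-tailed sample gives a clearly higher kurtosis than the normal one, which is the property the shape measure is meant to capture.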
[Figure: Variance; axes: Distance, P]
[Figure: Kurtosis; axes: Distance, P]
Why these measures?
• Important to have quantities for comparison
– Between epidemics
– Between types of contacts
– Between years
• Theoretical connection to biological invasions
– Squared displacement relates to diffusion
constant.
– Discrete representation of space (i.e. farms have X,Y
coordinates) => Fat tails more important
Kernel function
• A generalized normal distribution
$g(d \mid a, b) = \frac{e^{-(d/a)^{b}}}{S}$
• Variance and Kurtosis given by a and b.
• Extended to two dimensions (X,Y coordinates)
– S normalizes the kernel, Volume=1.
$\frac{1}{S} = \frac{b}{2a\,\Gamma(1/b)}$ (one dimension)
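For the two-dimensional extension, the normalizing constant can be worked out in polar coordinates. This is a sketch under the assumption that the kernel $e^{-(d/a)^{b}}/S$ is normalized to unit volume over the continuous plane; the symbols $r$, $\varphi$ and $u$ are introduced here for the derivation only.

```latex
\begin{aligned}
S &= \int_{0}^{2\pi}\!\!\int_{0}^{\infty} e^{-(r/a)^{b}}\, r \,dr\, d\varphi
   = 2\pi \int_{0}^{\infty} r\, e^{-(r/a)^{b}}\, dr \\
  &= \frac{2\pi a^{2}}{b} \int_{0}^{\infty} u^{2/b - 1} e^{-u}\, du
   \qquad \bigl(u = (r/a)^{b}\bigr) \\
  &= \frac{2\pi a^{2}\,\Gamma(2/b)}{b}.
\end{aligned}
```

As a check, $b = 2$ gives $S = \pi a^{2}$, the familiar volume under a two-dimensional Gaussian-shaped surface $e^{-r^{2}/a^{2}}$.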
Kernel function normalization
• With a discrete representation of the farm
distribution, the kernel is normalized over all possible
destination farms:
$S = \sum_{k=1}^{N-1} e^{-(d_{ik}/a)^{b}}$
where d is distance, i is the start farm, k indexes the possible
destination farms (k≠i) and N is the number of
farms.
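The discrete normalization can be sketched in a few lines. This is a minimal illustration, assuming farms are given as X,Y coordinates; the function name, the coordinates and the parameter values are hypothetical and not taken from the study.

```python
import math

def contact_probabilities(coords, i, a, b):
    """Probability that a transport from farm i goes to each other farm k.

    Kernel weights exp(-(d_ik / a) ** b) are divided by S, the sum of the
    weights over all N - 1 possible destination farms (k != i).
    """
    xi, yi = coords[i]
    weights = {}
    for k, (xk, yk) in enumerate(coords):
        if k == i:
            continue  # a farm does not send animals to itself
        d_ik = math.hypot(xk - xi, yk - yi)
        weights[k] = math.exp(-((d_ik / a) ** b))
    s = sum(weights.values())
    return {k: w / s for k, w in weights.items()}

# Hypothetical farm coordinates in km (not taken from the database).
farms = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (30.0, 40.0)]
probs = contact_probabilities(farms, i=0, a=5.0, b=1.0)
```

Because S is summed over the actual farm positions, the probabilities for each start farm add up to one whatever the local farm density is.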
Kernel function normalization
• This separates spatial
pattern of farms from
distance dependence in
contacts.
• Important if farm
distribution is non-random.
• Farm density in Sweden
(farms/km²)
And USA
From Shields and Mathews, 2003
Is the kernel function good enough?
• A single distribution may not be sufficient to
fit data on multiple scales (both short and
long distance contacts).
• An alternative model
– A mixture model
– Part distance dependent and part uniform (Mass
Action Mixing)
• Models applied to pig and cattle transports
(all transports during one year).
An alternative model
$w\, f_1(d_t \mid a, b) + (1 - w)\, f_2(d_t)$
• f1 is the distance dependent part: $f_1(d_t \mid a, b) = \frac{e^{-(d_t/a)^{b}}}{\sum_{k=1}^{N-1} e^{-(d_{ik}/a)^{b}}}$
• f2 is the MAM part: $f_2(d_t) = \frac{1}{N-1}$
• w is the proportion of distance dependence
Fitting to data
• Bayesian approach
• Increasingly common at least in ecological
literature.
Ellison 2008
Markov Chain Monte Carlo
• Parameters obtained through Markov Chain
Monte Carlo (MCMC).
• Well suited to epidemiological problems.
• A simple model can be expanded to include
complexity.
• Drawback is computation time, and effective
parallelization is difficult.
Markov Chain Monte Carlo
• Repeated (correlated) random draws from the
posterior distribution of parameters.
• Gibbs Sampling
– Direct draws from known distributions
conditional on other parameter values
• Metropolis-Hastings
– Values are proposed and subsequently accepted
or rejected dependent on likelihood ratios
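The Metropolis-Hastings step can be sketched for a single parameter. This is a minimal illustration, not the model used in the study: it assumes exponentially distributed distances with one scale parameter a, a flat prior on a > 0, a symmetric random-walk proposal, and simulated data.

```python
import math
import random

def log_posterior(a, distances):
    """Log posterior of scale a for exponentially distributed distances,
    with a flat prior on a > 0 (an assumption made for this sketch)."""
    if a <= 0:
        return -math.inf
    return -len(distances) * math.log(a) - sum(distances) / a

def metropolis_hastings(distances, start, step, n_iter, rng):
    """Random-walk Metropolis-Hastings; returns the chain of a values."""
    chain = [start]
    current_lp = log_posterior(start, distances)
    for _ in range(n_iter):
        proposal = chain[-1] + rng.gauss(0.0, step)   # symmetric proposal
        proposal_lp = log_posterior(proposal, distances)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            chain.append(proposal)
            current_lp = proposal_lp
        else:
            chain.append(chain[-1])
    return chain

rng = random.Random(42)
# Hypothetical transport distances (km), simulated with true scale 20.
data = [rng.expovariate(1 / 20.0) for _ in range(200)]
chain = metropolis_hastings(data, start=5.0, step=2.0, n_iter=20_000, rng=rng)
posterior_sample = chain[5_000:]          # discard burn-in
posterior_mean = sum(posterior_sample) / len(posterior_sample)
```

Successive values in the chain are correlated, which is why many iterations are needed and why the early part is discarded before summarizing the posterior.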
Markov Chain Monte Carlo
• Also allows for model selection by comparing
the full posterior distribution of model
probabilities.
• In our study, the mixture model was a much
better model.
[Figure panels: Cattle, Pigs]
Comparing models and observed data
Cattle
Pigs
Bars: observed transport distances. Dotted line: predictions by
Model 1. Solid line: predictions by Model 2
Network measures
• Will differences have consequences for
estimation of disease spread dynamics?
• Networks generated with the different models
• Network measures
• Nodes (farms) and links (transports)
Network measures
• Density – proportion of farms connected
[Figure panels: Model 1, Model 2]
Network measures
• Clustering Coefficient – proportion of
“triplets”
– If A is connected to B and C, are B and C
connected?
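Density and the clustering coefficient can both be computed from a list of links. The sketch below assumes an undirected network with each link listed once, takes density as links present divided by links possible (the standard definition, assumed here to match the slide), and uses a small hypothetical network.

```python
from itertools import combinations

def density(n_nodes, links):
    """Links present divided by links possible (undirected, no self-links)."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(links) / possible

def clustering_coefficient(links):
    """Proportion of connected triplets (A linked to B and C) that are
    closed, i.e. where B and C are also linked."""
    neighbours = {}
    for u, v in links:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    triplets = closed = 0
    for nbrs in neighbours.values():
        # Every pair of neighbours of one node forms a triplet around it.
        for b, c in combinations(sorted(nbrs), 2):
            triplets += 1
            if c in neighbours[b]:
                closed += 1
    return closed / triplets if triplets else 0.0

# Hypothetical network: farms A-D, links are transports (direction ignored).
links = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}
net_density = density(4, links)
net_clustering = clustering_coefficient(links)
```

In this example four of six possible links exist, and three of the five triplets are closed by the single triangle A, B, C.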
[Figure panels: Model 1, Model 2]
Network measures
• Fragmentation index – measures the amount
of fragments not connected to the rest.
[Figure panels: Model 1, Model 2]
Network measures
• Betweenness centrality – which nodes are more central
[Figure panels: Model 1, Model 2]
Animal transports
• Higher Clustering Coefficient and lower Density for
Model 2
– Depends on difference in short distance contacts
– Depletion of susceptibles
• Group Betweenness higher for Model 1 in Cattle.
– Due to long distance transports being rarer
• Conclusion: Model 2 is a better model (higher
likelihood) and the difference may have an impact
on disease spread prediction.
More than distance?
• Why not compare to observed networks?
• Is there something other than distance that matters?
• Some work in progress…
More than distance?
• Pig industry very structured, production types
– Multiplying herd
– Sow pool central unit
– Sow pool satellite herd
– Fattening herd
– Farrow to finish herd
– Piglet producing herd
– Nucleus herd
http://www.swedishmeats.com
[Figure axes: From, To]
Production types in cattle?
• Dairy and beef producers
• Male calves on dairy farms are often sold to
beef producers (at least in Sweden)
• Other differences in production types?
– Roping?
– Organic farming?
– Climate/geographic factors
More than distance?
• Reality is messy…
– Database not perfect
– Missing production types
– Several production types per farm
• Weights in the model
– A farm is a fraction of each possible type.
• One parameter estimation per combination
(sender/receiver) of production types.
More than distance?
• Size dependence
– Size (Capacity)
– Two different sizes
• Adult sows
• Piglets
• Different production types have different response
– Different for sending or receiving
– Total 64x4 parameters just for size…
– Modeled as power function (Size^θ)
More than distance?
• Distance dependence
– Different for different production types
– Variance, Kurtosis and mixing parameter for each
combination
More than distance?
• Many parameters…
9*64=576
• Some combinations of production types have
few transports => uncertain estimations.
• Variance and kurtosis not clearly different
from ∞.
• Using a prior may help
– But it’s nicer to be objective…
Hierarchical Bayesian
• We can let the data decide the prior
– Hyper parameters
• Hierarchical Bayesian model
• “Borrowing strength”
Animal transports – part 2
[Diagram: hierarchical prior P(θ) over the parameters θ1, θ2, θ3, …, θn, which are linked to the data]
Hierarchical Bayesian
• When would this make sense?
– If parameter values are expected to be different
but not totally different
– E.g. distance…
• Parameter estimations based on much data…
– Little influence of hierarchical prior
• Parameter estimations with little data…
– Highly influenced by the hierarchical prior.
Increases the variance of the prior distribution.
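The "borrowing strength" effect can be illustrated with a simple normal-normal model. This is a sketch under strong assumptions: the mean and variance of the shared prior are fixed here for illustration, whereas in a hierarchical Bayesian model they are hyperparameters estimated from the data; all names and numbers are hypothetical.

```python
def partial_pooling(group_mean, n_obs, obs_var, prior_mean, prior_var):
    """Posterior mean of one group's parameter under a normal-normal model.

    The estimate is a precision-weighted average of the group's own mean
    and the mean of the shared (hierarchical) prior.
    """
    data_precision = n_obs / obs_var
    prior_precision = 1.0 / prior_var
    weight_on_data = data_precision / (data_precision + prior_precision)
    return weight_on_data * group_mean + (1.0 - weight_on_data) * prior_mean

# Hypothetical values: two combinations of production types with the same
# raw estimate but very different numbers of transports.
prior_mean, prior_var, obs_var = 10.0, 4.0, 25.0
many_data = partial_pooling(20.0, n_obs=1000, obs_var=obs_var,
                            prior_mean=prior_mean, prior_var=prior_var)
little_data = partial_pooling(20.0, n_obs=2, obs_var=obs_var,
                              prior_mean=prior_mean, prior_var=prior_var)
```

The estimate based on 1000 transports stays close to its own value of 20, while the estimate based on 2 transports is pulled most of the way towards the prior mean of 10.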
Thank you
• Questions?