A Hybrid Symbolic-Numerical Method for Determining Model Structure
Diana Cole, NCSE, University of Kent
Rémi Choquet, Centre d'Ecologie Fonctionnelle et Evolutive
Ben Hubbard, NCSE, University of Kent
Introduction – Example Capture-Recapture
Herring Gulls (Larus argentatus) capture-recapture data for 1983 to 1986 (Lebreton et al, 1995)
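Numbers released in 1983, 1984 and 1985:
$$\mathbf{R} = \begin{bmatrix} 78 \\ 123 \\ 111 \end{bmatrix}$$
Numbers recaptured (rows: year released, 1983 to 1985; columns: year recaptured, 1984 to 1986):
$$\mathbf{N} = \begin{bmatrix} 67 & 4 & 2 \\ 0 & 103 & 3 \\ 0 & 0 & 91 \end{bmatrix}$$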
$\phi_i$ – probability a bird survives from occasion $i$ to $i+1$
$p_i$ – probability a bird is recaptured on occasion $i$
$\boldsymbol{\theta} = [\phi_1, \phi_2, \phi_3, p_2, p_3, p_4]$
Recapture probabilities:
$$\mathbf{Q} = \begin{bmatrix}
\phi_1 p_2 & \phi_1 (1 - p_2) \phi_2 p_3 & \phi_1 (1 - p_2) \phi_2 (1 - p_3) \phi_3 p_4 \\
0 & \phi_2 p_3 & \phi_2 (1 - p_3) \phi_3 p_4 \\
0 & 0 & \phi_3 p_4
\end{bmatrix}$$
Likelihood:
$$L = \prod_{i=1}^{3} \prod_{j=i}^{3} Q_{ij}^{N_{ij}} \times \prod_{i=1}^{3} \left( 1 - \sum_{j=i}^{3} Q_{ij} \right)^{R_i - \sum_{j=i}^{3} N_{ij}}$$
Only the product $\phi_3 p_4$ can ever be estimated, never $\phi_3$ and $p_4$ separately, so the model is parameter redundant (non-identifiable).
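The redundancy can be seen numerically. The following is a minimal sketch, using only the Python standard library, that evaluates the log-likelihood (up to an additive constant) for the Herring Gull data at two parameter points sharing the same product of the third survival and fourth recapture probabilities. The function names `recapture_probs` and `log_likelihood` and the trial parameter values are illustrative choices, not taken from the slides.

```python
import math

# Herring Gull data: numbers released and recaptured (1983 to 1986)
R = [78, 123, 111]
N = [[67, 4, 2],
     [0, 103, 3],
     [0, 0, 91]]


def recapture_probs(phi, p):
    """Upper-triangular Q for phi = (phi1, phi2, phi3), p = (p2, p3, p4)."""
    Q = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            survive = math.prod(phi[i:j + 1])            # survive every interval
            missed = math.prod(1 - pk for pk in p[i:j])  # not seen before occasion j
            Q[i][j] = survive * missed * p[j]            # first seen at occasion j
    return Q


def log_likelihood(phi, p):
    """Multinomial log-likelihood, omitting the constant term."""
    Q = recapture_probs(phi, p)
    ll = 0.0
    for i in range(3):
        for j in range(i, 3):
            ll += N[i][j] * math.log(Q[i][j])
        never_seen = R[i] - sum(N[i])
        ll += never_seen * math.log(1 - sum(Q[i]))
    return ll


# Two points with the same product phi3 * p4 = 0.4 (hypothetical values)
ll_a = log_likelihood(phi=(0.6, 0.7, 0.8), p=(0.5, 0.6, 0.5))
ll_b = log_likelihood(phi=(0.6, 0.7, 0.5), p=(0.5, 0.6, 0.8))
# A point with a different product phi3 * p4 = 0.64
ll_c = log_likelihood(phi=(0.6, 0.7, 0.8), p=(0.5, 0.6, 0.8))
print(ll_a, ll_b, ll_c)
```

The first two values agree to floating-point precision, so the data cannot distinguish the two points, while the third differs because the product itself has changed.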
Introduction
• In some models it is not possible to estimate all the parameters. Such models are termed parameter redundant or non-identifiable.
• A model is parameter redundant if it can be reparameterised in terms of a smaller number of parameters.
• Capture-recapture example: $\boldsymbol{\theta} = [\phi_1, \phi_2, \phi_3, p_2, p_3, p_4]$ can be reparameterised as $\boldsymbol{\theta}_R = [\phi_1, \phi_2, p_2, p_3, \beta]$, where $\beta = \phi_3 p_4$.
• Parameter redundancy can be due to the model (intrinsic) or the data (extrinsic).
• Sometimes it is obvious that a model is parameter redundant (e.g. the capture-recapture example), but in more complex models it is not necessarily obvious.
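For the capture-recapture example the reparameterisation can be written out directly. As a sketch, substituting $\beta = \phi_3 p_4$ into the recapture probabilities gives a matrix in which $\phi_3$ and $p_4$ no longer appear separately:

```latex
\mathbf{Q} = \begin{bmatrix}
\phi_1 p_2 & \phi_1 (1 - p_2) \phi_2 p_3 & \phi_1 (1 - p_2) \phi_2 (1 - p_3) \beta \\
0 & \phi_2 p_3 & \phi_2 (1 - p_3) \beta \\
0 & 0 & \beta
\end{bmatrix},
\qquad \beta = \phi_3 p_4 .
```

The six original parameters enter only through five quantities, so at most five can be estimated.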
Symbolic Method
• Symbolic methods can be used to detect parameter redundancy in less obvious cases (see, for example, Catchpole and Morgan, 1997; Cole et al, 2010).
• First an exhaustive summary, $\boldsymbol{\kappa}$, is required. An exhaustive summary is a vector of parameter combinations that uniquely defines the model, e.g. the recapture probabilities, $\mathbf{Q}$, since the likelihood $L$ depends on the parameters only through $Q_{ij}$.
• Let $\boldsymbol{\theta}$ denote the vector of the $p$ parameters.
• We then form the derivative matrix,
$$\mathbf{D} = \frac{\partial \boldsymbol{\kappa}}{\partial \boldsymbol{\theta}}.$$
• Then calculate the rank, $r$, of $\mathbf{D}$.
• When $r = p$, the model is full rank; we can estimate all parameters.
• When $r < p$, the model is parameter redundant with deficiency $d = p - r$.
• In parameter redundant models we can also find a set of $r$ estimable parameter combinations by solving $\boldsymbol{\alpha}_j \mathbf{D}^T = \mathbf{0}$ for $\boldsymbol{\alpha}_j$, then solving the partial differential equations
$$\sum_{i=1}^{p} \alpha_{ij} \frac{\partial f}{\partial \theta_i} = 0, \quad j = 1, \ldots, d$$
(Catchpole et al, 1998 or Cole et al, 2010).
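The steps of the symbolic method can be sketched for the capture-recapture example using SymPy (the slides do not name a computer algebra package, so this choice is an assumption). The sketch also assumes that the derivative matrix is laid out with one row per exhaustive summary term and one column per parameter, so that the null vectors are obtained from the null space of `D`. The names `kappa`, `theta`, `D` and `alphas` simply mirror the notation above.

```python
import sympy as sp

# Parameters theta = [phi1, phi2, phi3, p2, p3, p4]
phi1, phi2, phi3, p2, p3, p4 = sp.symbols(
    "phi1 phi2 phi3 p2 p3 p4", positive=True
)
theta = sp.Matrix([phi1, phi2, phi3, p2, p3, p4])

# Exhaustive summary: the non-zero recapture probabilities Q_ij
kappa = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])

# Assumed layout: D[i, j] = d kappa_i / d theta_j
D = kappa.jacobian(theta)

p = theta.shape[0]   # number of parameters
r = D.rank()         # symbolic rank
d = p - r            # deficiency

# alpha D^T = 0 is equivalent to D alpha^T = 0, i.e. the null space of D
alphas = [a.applyfunc(sp.simplify) for a in D.nullspace()]

# Check that f = phi3 * p4 satisfies the PDE  sum_i alpha_i * df/dtheta_i = 0
f = phi3 * p4
pde_check = sp.simplify(
    sum(alphas[0][i] * sp.diff(f, theta[i]) for i in range(p))
)

print("rank:", r, "deficiency:", d)
print("alpha:", list(alphas[0]))
print("PDE residual for phi3*p4:", pde_check)
```

The null vector has zeros in the positions of the first two survival probabilities and the first two recapture probabilities, so those four parameters are individually estimable, and the product of the last survival and recapture probabilities solves the partial differential equation.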
Problems with the Symbolic Method
• In more complex models the derivative matrix is structurally too complex, and the computer runs out of memory calculating its rank.
• Examples:
– Bio-kinetic compartment model of sludge respiration: Dochain et al (1995), Cole et al (2010)
– Wandering Albatross, multi-state models for sea birds: Hunter and Caswell (2009), Cole (2012)
– Striped Sea Bass, tag-return models for fish: Jiang et al (2007), Cole and Morgan (2010)
• How do you proceed?
– Numerically: can give the wrong results.
– Symbolically: involves extending the theory and finding simpler exhaustive summaries (Cole et al, 2010). However, this method is complex.
– Hybrid Symbolic-Numeric Method.
Hybrid Symbolic-Numeric Method
• Calculate the derivative matrix,
$$\mathbf{D} = \frac{\partial \boldsymbol{\kappa}}{\partial \boldsymbol{\theta}},$$
symbolically.
• Evaluate $\mathbf{D}$ at a random point $\boldsymbol{\theta}_k$ to give $\mathbf{D}_k$.
• Calculate $r_k$, the rank of $\mathbf{D}_k$.
• Repeat for 5 random points; the rank of the model is then $r = \max_k r_k$.
• If the model is parameter redundant, then for any $\mathbf{D}_k$ with $r_k = r$ solve $\boldsymbol{\alpha}_k \mathbf{D}_k^T = \mathbf{0}$. The zeros in $\boldsymbol{\alpha}_k$ indicate the positions of parameters that can be estimated.
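The steps above can be sketched for the capture-recapture model as follows. This is an illustration under stated assumptions rather than the implementation used in any software package: it uses SymPy for the symbolic derivative, draws random points as exact rationals in (0, 1) so that ranks and zeros are computed without a floating-point tolerance, and lays out the derivative matrix with one column per parameter. The helper `random_point` and the variable names are hypothetical.

```python
import random
import sympy as sp

phi1, phi2, phi3, p2, p3, p4 = sp.symbols("phi1 phi2 phi3 p2 p3 p4")
theta = [phi1, phi2, phi3, p2, p3, p4]

# Exhaustive summary: the non-zero recapture probabilities Q_ij
kappa = sp.Matrix([
    phi1 * p2,
    phi1 * (1 - p2) * phi2 * p3,
    phi1 * (1 - p2) * phi2 * (1 - p3) * phi3 * p4,
    phi2 * p3,
    phi2 * (1 - p3) * phi3 * p4,
    phi3 * p4,
])

# Step 1: derivative matrix, calculated symbolically
D = kappa.jacobian(theta)

random.seed(1)  # fixed seed so the sketch is reproducible


def random_point():
    """Random rational values strictly inside (0, 1) for every parameter."""
    return {t: sp.Rational(random.randint(1, 99), 100) for t in theta}


# Steps 2 to 4: evaluate D at 5 random points and record each rank
evaluations = []
for _ in range(5):
    D_k = D.subs(random_point())
    evaluations.append((D_k.rank(), D_k))

r = max(r_k for r_k, _ in evaluations)   # model rank
d = len(theta) - r                       # deficiency

# Step 5: null vectors of a D_k that attains the maximal rank
D_k = next(D_k for r_k, D_k in evaluations if r_k == r)
alphas = D_k.nullspace()                 # each alpha satisfies D_k * alpha = 0

# A parameter is estimable if its entry is zero in every null vector
estimable = [t for i, t in enumerate(theta)
             if all(a[i] == 0 for a in alphas)]

print("rank:", r, "deficiency:", d)
print("estimable parameters:", estimable)
```

Using exact rationals is a design choice made here for clarity; a floating-point evaluation with a numerical rank tolerance would follow the same steps but needs a threshold for deciding which entries count as zero.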
Example Capture-Recapture
• $\boldsymbol{\theta} = [\phi_1, \phi_2, \phi_3, p_2, p_3, p_4]$
Example – multi-site capture-recapture model
• Capture-recapture models can be extended to studies with multiple sites (Brownie et al, 1993).
• Example: Canada Geese in 3 different geographical regions, $T = 6$ years.
• Geese tend to return to the same site – memory model.
• Initial state probabilities: $\pi_j^{(t)}$ for $j = 1, 2$ and $t = 1, \ldots, 6$ (with $\pi_3^{(t)} = 1 - \pi_1^{(t)} - \pi_2^{(t)}$).
• Transition probabilities: $\phi_{ij}^{*(t)}$ for $i, j = 1, 2, 3$ and $t = 1, \ldots, 5$, and $\phi_{ijk}^{(t)}$ for $i, j, k = 1, 2, 3$ and $t = 2, \ldots, 5$.
• Capture probabilities: $p_j^{(t)}$ for $j = 1, 2, 3$ and $t = 2, \ldots, 6$ ($p = 180$ parameters).
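The parameter count can be checked by simple arithmetic. The tally below assumes that every probability listed for this model is a free parameter, in particular all $3^2$ first-transition probabilities and all $3^3$ memory transition probabilities at each occasion; this is an assumption made here because it reproduces the stated total.

```latex
p = \underbrace{2 \times 6}_{\pi_j^{(t)}}
  + \underbrace{3^2 \times 5}_{\phi_{ij}^{*(t)}}
  + \underbrace{3^3 \times 4}_{\phi_{ijk}^{(t)}}
  + \underbrace{3 \times 5}_{p_j^{(t)}}
  = 12 + 45 + 108 + 15 = 180 .
```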
Example – Occupancy Models
• Rather than marking animals, occupancy models look at whether or not a species is present at a particular site.
• Parameters: $\psi$ – probability a site is occupied, $p$ – probability the species is detected at an occupied site.
• Species detected at a site with probability $\psi p$.
• Species not detected at a site with probability $\psi (1 - p) + (1 - \psi) = 1 - \psi p$.
• The basic model is parameter redundant, so a robust design was developed in which several surveys are conducted at each site each season, with $\psi$ assumed to be the same for each survey.
• More complex models consider multiple sites and interactions between species.
• These models are not parameter redundant, but this assumes that every possible combination of occupied and unoccupied is observed. However, parameter redundancy can be caused by the data (extrinsic parameter redundancy).
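The contrast between the basic occupancy model and the robust design can be checked with the derivative-matrix rank. The sketch below uses SymPy and assumes a robust design with two surveys per site per season (the number of surveys is not given in the slides). The exhaustive summaries are taken to be the probabilities of each possible detection history, and the variable names are illustrative.

```python
import sympy as sp

psi, p = sp.symbols("psi p", positive=True)
theta = sp.Matrix([psi, p])

# Basic model, one survey: detected / not detected
kappa_basic = sp.Matrix([
    psi * p,
    1 - psi * p,
])

# Robust design, two surveys (assumed): histories 11, 10, 01, 00
kappa_robust = sp.Matrix([
    psi * p**2,                      # detected in both surveys
    psi * p * (1 - p),               # detected in the first survey only
    psi * (1 - p) * p,               # detected in the second survey only
    psi * (1 - p)**2 + (1 - psi),    # never detected
])

rank_basic = kappa_basic.jacobian(theta).rank()
rank_robust = kappa_robust.jacobian(theta).rank()

print("basic model:   rank", rank_basic, "of", theta.shape[0], "parameters")
print("robust design: rank", rank_robust, "of", theta.shape[0], "parameters")
```

With a single survey only the product of occupancy and detection probabilities is estimable, whereas repeated surveys separate the two parameters.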
Example – Occupancy models
• Monitoring of amphibians in the Yellowstone and Grand Teton National Parks, USA (Gould et al, 2012).
• Two species: Columbia Spotted Frogs and Boreal Chorus Frogs.
• $\psi$: occupancy probabilities; $p$: detection probabilities.
• $(s)$ denotes dependence on site, $(t)$ dependence on time, and $(\cdot)$ dependence on neither site nor time.

Model                         Rank   Deficiency   No. pars
$\psi(\cdot)\,p(\cdot)$         20            0         20
$\psi(s)\,p(\cdot)$             65            0         65
$\psi(\cdot)\,p(s)$             35            0         35
$\psi(t)\,p(t)$                 59            0         59
$\psi(t,s)\,p(\cdot)$          161           17        178
$\psi(t,s)\,p(t)$              176           17        193
$\psi(t,s)\,p(t,s)$            236           67        303
Example – Bio-kinetic compartment model of sludge respiration
• Non-linear compartment models can be used to describe the activated sludge process (Dochain et al, 1995).
• The exogenous oxygen uptake, $U$, depends on the biodegradation of two substrates, $S$.
• Parameters: $\boldsymbol{\theta} = [Y_1, S_1(0), \mu_{max1}, K_{m1}, Y_2, S_2(0), \mu_{max2}, K_{m2}, X]$, where $Y_i$ is the fraction of the pollutant $S_i$ which is not oxidised but converted into new biocatalyst, $X$. The parameters $\mu_{max1}$ and $\mu_{max2}$ are rate constants; $K_{m1}$ and $K_{m2}$ are affinity constants.
• From the extended symbolic method, the estimable parameter combinations are, for $i = 1, 2$:
$$\frac{\mu_{max\,i} X (1 - Y_i)}{Y_i}, \quad [K_{mi} + S_i(0)](1 - Y_i), \quad S_i(0)(1 - Y_i)$$
Conclusion and future work
• The hybrid method can be used to find how many parameters can be estimated in a model.
• The hybrid method is much simpler to use than the extended symbolic method.
• It can be added to standard software packages. For ecological models it is available in M-SURGE and E-SURGE.
• It can quickly determine whether a model fitted to a particular data set is parameter redundant, even for several hundred parameters.
• However, it is currently only applicable to a given number of years of data (ecological models) or substrates (sludge model). In the symbolic method there is an extension theorem that allows general results to be developed. Expanding the hybrid method to include the extension theorem is future work.
• In a parameter redundant model the hybrid method can currently only determine which of the original parameters are identifiable. Constraints needed to give an identifiable model can only be obtained by trial and error. By contrast, the symbolic method can also give estimable parameter combinations.
References
Hybrid Symbolic-Numeric Method:
• Choquet, R. and Cole, D.J. (2012) A Hybrid Symbolic-Numerical Method for Determining Model Structure. Mathematical Biosciences, 236, p117.
Symbolic Method:
• Cole, D.J., Morgan, B.J.T. and Titterington, D.M. (2010) Mathematical Biosciences, 228, p16.
• Cole, D.J. and Morgan, B.J.T. (2010) JABES, 15, p431.
• Catchpole, E.A. and Morgan, B.J.T. (1997) Biometrika, 84, p187.
• Catchpole, E.A., Morgan, B.J.T. and Freeman, S.N. (1998) Biometrika, 85, p42.
• Cole, D.J. (2012) Journal of Ornithology, 152, p305.
Other:
• Brownie, C., Hines, J., Nichols, J. et al (1993) Capture-recapture, Biometrics, 49, p1173.
• Dochain, D., Vanrolleghem, P.A. and Van Dale, M. (1995) Water Research, 29, p2571.
• Gould, W.R., Patla, D.A., Daley, R. et al (2012) Wetlands, 32, p379.
• Hunter, C. and Caswell, H. (2009) Environmental and Ecological Statistics, 3, p797.
• Jiang, H.H., Pollock, K.H., Brownie, C. et al (2007) JABES, 12, p177.
• Lebreton, J., Morgan, B.J.T., Pradel, R. and Freeman, S.N. (1995) Biometrics, 51, p1418.