Generalized Linear Modelling for Parasitologists

Techniques
K. Wilson and B.T. Grenfell
Typically, the distribution of macroparasites over their host population is highly aggregated and empirically best described by the negative binomial distribution. For parasitologists, this poses a statistical problem, which is often tackled by log-transforming the parasite data prior to analysis by parametric tests. Here, Ken Wilson and Bryan Grenfell show that this method is particularly prone to type I errors, and highlight a much more powerful and flexible alternative: generalized linear modelling.
A major problem facing any parasitologist is how best to analyse his or her hard-earned data. What is the best way of determining, for example, whether human faecal egg counts decline with age, or whether male and female rabbits differ in their worm burdens? The problem of correctly identifying the best statistical test in parasitology is accentuated by the fact that parasites tend to be aggregated over their host population: most hosts have just a few parasites, or none at all, while others have many (for relevant discussions, see Refs 1-5). As a result, the parasite distribution is right-skewed with a long tail, and fails to conform to the normal or Gaussian distribution assumed by most of the commonly used statistical tests, ie. parametric tests, such as t-tests, analyses of variance, regression analyses, etc. This presents the parasitologist with a fundamental problem.

Kenneth Wilson is at the Department of Biological and Molecular Sciences, University of Stirling, Stirling, UK FK9 4LA. Bryan Grenfell is at the Zoology Department, University of Cambridge, Downing Street, Cambridge, UK CB2 3EJ. Tel: +44 1786 467807, Fax: +44 1786 464994, e-mail: [email protected]

Parasitology Today, vol. 13, no. 1, 1997
Overcoming the problem of non-normality
The majority of parasitologists either ignore the fact that the non-normality of their data is a problem and use parametric tests regardless, or use non-parametric tests, such as Mann-Whitney U-tests, Wilcoxon signed ranks tests, Kruskal-Wallis tests, etc. (see Box 1). While
Box 1. Current Statistical Methods used by Parasitologists
We surveyed the statistical methods reported in 50 papers published in the past five years in the journal Parasitology (K. Wilson, unpublished; and see Table below). All of these papers contained data, such as egg counts and worm burdens, for which we would expect the untransformed distribution to be discrete (rather than continuous) and to conform to the negative binomial, or at least to the Poisson. Remarkably, in 20% of the papers surveyed, no statistical tests were applied at all; 46% used standard parametric tests (such as t-tests, analyses of variance and regression analyses), of which more than half failed to transform the data in any way prior to analysis; 28% used non-parametric tests (such as Mann-Whitney and Kruskal-Wallis tests); and just three of the 50 papers used sophisticated non-linear maximum-likelihood methods. Surprisingly, during this period, there was not one published paper that used generalized linear models.
Table. Statistical tests used in 50 papers (a) published in Parasitology in 1991-1995

Statistical test       Mature       Immature     Both mature and       Total No.
                       parasites    parasites    immature parasites    (and %)
None                   4            5            1                     10 (20)
Parametric (x)         5            3            4                     12 (24)
Parametric (log-x)     3            8            0                     11 (22)
Non-parametric         5            5            4                     14 (28)
Maximum-likelihood     1            2            0                     3 (6)
GLM (b)                0            0            0                     0 (0)
Total                  18           23           9                     50

(a) Papers are divided into those that discussed variation in the burdens of mature parasites (adult worms or ticks), immature parasites (eggs, oocysts, gametocytes, sporozoites, cercariae, microfilariae, etc.) and combined studies. Parametric tests are divided into those that analysed raw counts (x) and those that analysed log-transformed counts (log-x).
(b) GLM refers to generalized linear models and maximum-likelihood to non-linear maximum-likelihood models.
Box 2. Log-transformation often Fails to Normalize Parasite Distributions
A traditional method for normalizing right-skewed overdispersed data is to log10- or loge-transform it, after first adding one to avoid zero counts. However, as illustrated below, this transformation fails when the mean of the distribution is small (a) or the distribution is highly aggregated, as indicated by small negative binomial k values (b).
[Figure: paired frequency histograms. (a) Mean = 1000, 100, 10 and 1. (b) k = 100, 10 and 0.1. Horizontal axes: Parasite load; Log10 (parasite load + 1).]
non-parametric tests make few assumptions about the underlying statistical distributions (Ref. 6), and hence are preferable to parametric tests when the normality assumption is violated, they generally lack the power of equivalent parametric tests (Ref. 6) and so are not an ideal solution.
A common alternative to using non-parametric tests is to use standard parametric tests after first transforming the data so that their distribution becomes approximately normal. Because most macroparasite distributions are empirically best described by the negative binomial distribution (Refs 5, 7-9), an appropriate transformation is generally the log10- or loge-transformation (after first adding one to the parasite count to avoid zeros) (Refs 10, 11). However, this transformation often fails to normalize the distribution, especially when it is highly skewed (see Box 2).
Effect of the mean (a) and the skew (b) of the distribution (as determined by k of the negative binomial) on the efficacy of log10-transformation: both (a) and (b) show the frequency distribution of 5000 random samples taken from negative binomial distributions (using the rnegbin function in Splus), and the subsequent distribution after the samples were log10-transformed. In (a) the average k of the distribution is one and the average population mean varies between 1000 and 1, and in (b) the average population mean is 100 and average k varies between 100 and 0.1.
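The pattern described in Box 2 can be reproduced with a short simulation. The following is a minimal sketch in Python with NumPy (an assumption; the original samples were drawn with the rnegbin function in Splus). The helper names rnegbin, skewness and skew_before_and_after are illustrative only, and skewness is used here as a rough numerical stand-in for inspecting the histograms: a value near zero is necessary, though not sufficient, for normality.

```python
import numpy as np


def rnegbin(rng, mean, k, size):
    """Draw negative binomial counts with the given mean and exponent k.

    NumPy parameterizes the distribution by (n, p); setting n = k and
    p = k / (k + mean) gives variance = mean + mean**2 / k (Eqn 1).
    """
    return rng.negative_binomial(k, k / (k + mean), size=size)


def skewness(x):
    """Moment coefficient of skewness (zero for a symmetric distribution)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5


def skew_before_and_after(mean, k, n=5000, seed=0):
    """Skewness of raw counts and of log10(count + 1) for one simulated sample."""
    rng = np.random.default_rng(seed)
    counts = rnegbin(rng, mean, k, n)
    return skewness(counts), skewness(np.log10(counts + 1))


if __name__ == "__main__":
    # Panel (a) of Box 2: k fixed at one, mean varied.
    for mean in (1000, 100, 10, 1):
        raw, logged = skew_before_and_after(mean, k=1)
        print(f"mean = {mean:>4}: skew raw = {raw:6.2f}, skew log10(x + 1) = {logged:6.2f}")
```

Running the script prints the skewness before and after transformation for the four means used in panel (a); calling skew_before_and_after(100, k) with different k values does the same for panel (b).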
Generalized linear models
A less common alternative to non-parametric tests is the family of generalized linear models (GLMs) (see Box 3). These are generalizations of classical linear models (such as linear regression, analyses of variance, etc.) in which the error distribution is explicitly defined (see Box 3). As emphasized above, classical linear models assume that the error distribution is normal. However, for macroparasite data, the appropriate error distribution is often the negative binomial, which is defined by its mean ($\bar{x}$) and the exponent k. The variance ($s^2$) of a negative binomial distribution is described as follows:
$$s^2 = \bar{x} + \frac{\bar{x}^2}{k} \qquad (1)$$
where $\bar{x}$ and $s^2$ are the mean and variance, respectively, of the sample and k is an inverse measure of the degree of aggregation, such that as aggregation declines so k increases until, as k approaches infinity (or in practice, above about 20), the distribution converges on the Poisson (Ref. 8).
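Equation 1 is easy to explore numerically. The sketch below (plain Python; the function names are hypothetical) evaluates the variance and the variance-to-mean ratio, which works out as 1 + mean/k and equals one for a Poisson distribution. Note that the ratio depends on the mean as well as on k.

```python
def negbin_variance(mean, k):
    """Variance of a negative binomial distribution (Eqn 1)."""
    if mean < 0 or k <= 0:
        raise ValueError("mean must be non-negative and k must be positive")
    return mean + mean ** 2 / k


def variance_to_mean_ratio(mean, k):
    """Ratio variance / mean = 1 + mean / k; equals 1 for a Poisson distribution."""
    return negbin_variance(mean, k) / mean


if __name__ == "__main__":
    # For a fixed mean, the ratio falls towards the Poisson value of one as k grows.
    for k in (0.1, 0.5, 1, 5, 20, 1000):
        ratio = variance_to_mean_ratio(2, k)
        print(f"mean = 2, k = {k:>6}: variance/mean = {ratio:.3f}")
```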
In order to fit the negative binomial distribution, we need to estimate the exponent. The most accurate estimate of k is obtained by maximum-likelihood methods (Ref. 8), but a reasonably accurate moment estimate can be calculated by rearrangement of Eqn 1:
$$\hat{k} = \frac{\bar{x}^2}{s^2 - \bar{x}} \qquad (2)$$
where $\hat{k}$ is the estimated value of k.
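As a hedged illustration of Eqn 2, the following plain-Python sketch computes the moment estimate from a list of counts. The function name and the example counts are invented for illustration; the sample variance is taken with the usual n - 1 denominator, which is an assumption, and returning infinity when the variance does not exceed the mean (no detectable aggregation) is a convention chosen here, not something specified in the text.

```python
from statistics import mean, variance


def moment_estimate_k(counts):
    """Moment estimate of the negative binomial exponent k (Eqn 2)."""
    m = mean(counts)
    s2 = variance(counts)  # sample variance, n - 1 denominator (assumed)
    if s2 <= m:
        # Variance no larger than the mean: no evidence of aggregation,
        # so treat the data as Poisson-like (k tends to infinity).
        return float("inf")
    return m ** 2 / (s2 - m)


if __name__ == "__main__":
    worm_counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 20]  # hypothetical burdens
    print(f"moment estimate of k = {moment_estimate_k(worm_counts):.3f}")
```

A small estimate of k indicates strong aggregation; the text recommends maximum-likelihood estimation where accuracy matters.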
Box 3. Generalized Linear Models
Generalized linear models (GLMs) are generalizations of classical linear models (analyses of variance, t-tests, linear regression analyses, etc.) that allow the error structure (the distribution of residuals about the fitted model) to be explicitly defined by one of a series of distributions, usually from the exponential family. GLMs also use a link function, which maps the expected values of the response variable (eg. faecal egg count) on to the explanatory variables (eg. age and sex). For classical linear models, the error structure is defined by the normal distribution and the link function is the identity link. For the negative binomial distribution (which is not in the exponential family), the GLM error structure is defined by Eqn 1 (in the text) and the link function is generally the log or square-root link (Refs 12, 13). For an accessible introduction to the uses of GLMs in ecological studies, see Ref. 12.
The significance of terms in GLMs is generally tested by comparing the deviances of models with and without those terms. Deviances are analogous to mean squares in classical linear models. For the Poisson and negative binomial distributions, deviances are distributed approximately as Chi-square ($\chi^2$), with degrees of freedom equal to the difference in the number of parameters attributable to each of the models.
A common alternative to explicitly defining a negative binomial error distribution is to assume a Poisson distribution and to adjust the scale parameter (or dispersion parameter) so that the ratio of the residual deviance and its degrees of freedom is approximately equal to one (Refs 12, 17). Thus, instead of assuming that the variance of the parasite distribution is equal to its mean ($s^2 = \bar{x}$, for the Poisson distribution), we assume that it is proportional to it, ie. $s^2 = \phi\bar{x}$, where $\phi$ is referred to as the empirical scale or dispersion parameter. When empirical scale parameters are used, model parameter estimates are not affected but the standard errors are higher and, in a manner similar to analyses of variance and regression models, the scaled deviances for terms in the model are compared using F-tests instead of $\chi^2$-tests (Refs 12, 17).
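For a model containing a single factor, the comparison of deviances described in this box can be carried out by hand, because the fitted values of a Poisson GLM with one factor are simply the group means. The following is a minimal sketch in plain Python, not the implementation used in GLIM or Splus; the function names and the example worm counts are invented for illustration. The scale parameter is estimated as the residual deviance divided by its degrees of freedom, following the description in this box (the Pearson chi-square divided by its degrees of freedom is a common alternative).

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), with 0 * ln(0) = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


def quasi_poisson_f(groups):
    """F statistic for a single factor, Poisson errors, empirical scale parameter.

    groups is a list of lists of counts, one list per level of the factor.
    Returns (F, df for the factor, residual df, scale parameter phi).
    Raises ZeroDivisionError if the factor fits the data perfectly (phi = 0).
    """
    y = [v for g in groups for v in g]
    grand_mean = sum(y) / len(y)
    fitted = [sum(g) / len(g) for g in groups for _ in g]  # group means
    d_null = poisson_deviance(y, [grand_mean] * len(y))    # model without the factor
    d_factor = poisson_deviance(y, fitted)                 # model with the factor
    df_term = len(groups) - 1
    df_resid = len(y) - len(groups)
    phi = d_factor / df_resid                              # deviance-based scale
    f = ((d_null - d_factor) / df_term) / phi
    return f, df_term, df_resid, phi


if __name__ == "__main__":
    males = [0, 1, 1, 2, 6]        # hypothetical worm burdens
    females = [3, 8, 10, 14, 25]   # hypothetical worm burdens
    f, df1, df2, phi = quasi_poisson_f([males, females])
    print(f"phi = {phi:.2f}, F = {f:.2f} on {df1} and {df2} d.f.")
```

The resulting F would then be compared with the tabulated F distribution on the two degrees of freedom returned.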
A number of statistical packages (eg. GLIM, Genstat, SAS and Splus) include GLMs and have a range of error structures already defined (including Gaussian, gamma, binomial and Poisson). Negative binomial errors are not usually included in the available set, but they can be defined by the user or obtained from sources within the public domain (for example, Ref. 12 provides GLIM macros on disk both for estimating k by maximum-likelihood methods and defining negative binomial errors, and equivalent Splus functions are available by anonymous ftp from StatLib; see also Refs 13, 14).
An alternative to defining a negative binomial error distribution explicitly is to generate an empirical estimate based on Poisson errors. Equation 1 can be simplified to:
$$s^2 = \bar{x}\left(1 + \frac{\bar{x}}{k}\right) = \bar{x}\phi \qquad (3)$$
Thus, the negative binomial distribution can be approximated by assuming that $\phi$ is approximately constant over all values of $\bar{x}$. In practice, we define a Poisson error distribution and estimate the value of $\phi$ empirically ($\phi$ is then defined as the empirical dispersion or scale parameter; see Box 3).
A comparison of methods
Wilson, Grenfell and Shaw (Ref. 14) have recently compared the log-transformation method described earlier with these two GLM methods, using (1) simulated data sets (Box 4), and (2) real parasite data from an unmanaged population of Soay sheep on the Scottish island group of St Kilda (Box 5).
From the simulated data sets, they conclude that, when sample sizes are small, the frequency of type II errors, ie. incorrectly accepting the null hypothesis, H0, is generally slightly higher when using standard parametric tests on log-transformed data than when using either of the GLM methods. However, much more importantly, they also conclude that, almost regardless of the sample sizes, when the distributions being compared differ in their degree of aggregation (as indicated by their negative binomial k estimates), type I errors, ie. incorrectly rejecting H0, are likely to be extremely common when using the log-transformation method, but negligible when using either of the GLM methods (Box 4). This strongly suggests that when using the log-transformation method, parasitologists are much more likely to report spurious differences in parasite burdens than when using either of the GLM methods. Fortunately, it appears to matter little which of the two GLM methods is used. This means that standard statistical packages such as GLIM, Genstat, SAS and Splus can be used without recourse to writing specific macros or functions to define the appropriate error structure.
Analyses using real parasite data also indicate that the choice of statistical method is important (Ref. 14). For example, an analysis of the worm burdens of Soay sheep dying during one winter on St Kilda strongly suggests that the log-transformation method is capable of generating type II errors, and an analysis of August faecal egg counts of the same population of sheep suggests that this method may also generate type I errors (see Box 5 for details). In both cases, these results were confirmed by more sophisticated non-linear maximum-likelihood models. This latter method is also needed when analysing patterns in the degree of aggregation, such as changes in k with age (Refs 4, 15).
The use of explicit maximum-likelihood error structures is not restricted to field data. For example, Box 6 shows an analysis of the results of experimental infections of cats with filarial worms. In this case, the method also allows us to estimate the death rate of adult parasites (Box 6).
Conclusions
The analyses summarized here clearly indicate that classical linear regression models using log-transformed data are usually much more likely to generate both type I and type II errors than are generalized linear models. GLMs are increasingly being incorporated into modern statistical packages and used by ecologists and social scientists. With familiarity, they are only marginally more difficult to use and understand than the statistical models currently being employed by parasitologists. Obviously, they are not the be-all-and-end-all in statistical
Box 4. Simulated Data Sets
In order to compare the log-transformation method with the GLM approach, Wilson, Grenfell and Shaw (Ref. 14) used the two methods to analyse a series of randomly generated data sets from the negative binomial distribution. The data sets comprised 20, 100 or 500 samples from distributions with means ranging between one and 2000, and with k values ranging between 0.5 and 20. These ranges cover those of most parasite burdens and faecal egg counts. For each of a pair of data sets, with either identical means or means differing from each other by 100% (1 vs 2, 5 vs 10, 10 vs 20, etc.), the statistical significance of the difference between the means was assessed over 100 trials using three models of increasing sophistication: a classical linear model using log-transformed data (a, in Fig. below); a GLM with Poisson errors and an empirical scale parameter, using untransformed data (b); and a GLM with negative binomial errors, again using untransformed data (c). The explanatory variable of the three models comprised a single factor, which coded for each of the pair of distributions. The number of times that the different models detected significant differences between distributions was scored over 100 trials using F-tests (a and b) or Chi-square tests (c). Thus, by comparing the output of the three models it was possible to assess the probabilities of each making type I errors, ie. incorrectly rejecting the null hypothesis of no difference between the means; and type II errors, ie. incorrectly accepting the null hypothesis.
The biggest differences among the three models came when comparing data sets that had the same means, but different k values (for details of the other comparisons, see Ref. 14). In this series of comparisons, the probability of the log-transformation model producing type I errors ranged between 0 and 75% and increased with sample size (n), sample mean ($\bar{x}$) and difference between the component k values. By comparison, both the GLMs produced many fewer type I errors over all values of n, $\bar{x}$ and k; and both models failed with a probability ranging between about 0 and 15% (see Fig.). Thus, the log-transformation method is much more likely than the GLM method to indicate spurious differences between data sets.
[Figure: three panels, (a) Log transformation, (b) GLM-1, (c) GLM-2. Horizontal axis: Sample mean (1 to 1000). Symbols: k = 0.5 vs k = 1.0; k = 0.5 vs k = 10.0; k = 0.5 vs k = 20.0; k = 1.0 vs k = 10.0; k = 1.0 vs k = 20.0; k = 10.0 vs k = 20.0.]
A comparison of the rate of type I error production when the component distributions have the same mean but different k values: (a), (b) and (c) each show the probability of making a type I error when using each of the three statistical methods (see main text and Ref. 14 for a description of the models and the simulations). Each point refers to the probability of incorrectly rejecting the null hypothesis of no difference between the means, and is based on 100 simulations comprising 20 data points taken from two negative binomial distributions with identical population means but different population k values. The six symbols refer to different combinations of k values, as indicated in the figure. Simulations based on larger samples (100 or 500 data points) produce qualitatively similar trends (see Ref. 14). The results for the log-transformation method, shown in (a), indicate that this has a high probability of incorrectly rejecting the null hypothesis over a range of sample sizes, sample means and component k values. The results for the GLM method with an empirical scale parameter are shown in (b), and for the GLM with negative binomial errors in (c). For both (b) and (c), the probability of type I error is small.
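A scaled-down version of this comparison can be sketched in Python with NumPy. This is an illustration under stated assumptions, not a reconstruction of the simulations in Ref. 14: it assumes 20 data points per distribution, uses a pooled-variance t-test on log10(count + 1) for the log-transformation method, and uses a Poisson GLM with a deviance-based empirical scale parameter for the GLM; the negative binomial GLM is omitted. The critical value is taken from standard t tables for 38 degrees of freedom, so the sketch applies only to 20 counts per group. All function names are hypothetical.

```python
import numpy as np

N_PER_GROUP = 20
T_CRIT_38 = 2.0244            # two-sided 5% point of Student's t, 38 d.f. (tables)
F_CRIT_1_38 = T_CRIT_38 ** 2  # an F(1, v) variate is the square of a t(v) variate


def rnegbin(rng, mean, k, size):
    """Negative binomial counts with the given mean and exponent k."""
    return rng.negative_binomial(k, k / (k + mean), size=size)


def log_t_statistic(a, b):
    """Pooled-variance t statistic computed on log10(count + 1)."""
    la, lb = np.log10(a + 1.0), np.log10(b + 1.0)
    na, nb = len(la), len(lb)
    pooled = ((na - 1) * la.var(ddof=1) + (nb - 1) * lb.var(ddof=1)) / (na + nb - 2)
    return (la.mean() - lb.mean()) / np.sqrt(pooled * (1.0 / na + 1.0 / nb))


def poisson_deviance(y, mu):
    """Poisson deviance, taking y * log(y / mu) as zero when y = 0."""
    ratio = np.where(y > 0, y / mu, 1.0)
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))


def quasi_poisson_f(a, b):
    """F statistic for a two-level factor: Poisson errors, empirical scale."""
    y = np.concatenate([a, b]).astype(float)
    fitted = np.concatenate([np.full(len(a), a.mean()), np.full(len(b), b.mean())])
    d_null = poisson_deviance(y, y.mean())
    d_factor = poisson_deviance(y, fitted)
    phi = d_factor / (len(y) - 2)  # residual deviance / residual d.f.
    return (d_null - d_factor) / phi


def type_one_rates(mean=100.0, k_a=0.5, k_b=20.0, trials=200, seed=1):
    """Share of trials in which each method rejects a true null of equal means."""
    rng = np.random.default_rng(seed)
    log_hits = glm_hits = 0
    for _ in range(trials):
        a = rnegbin(rng, mean, k_a, N_PER_GROUP)
        b = rnegbin(rng, mean, k_b, N_PER_GROUP)
        log_hits += int(abs(log_t_statistic(a, b)) > T_CRIT_38)
        glm_hits += int(quasi_poisson_f(a, b) > F_CRIT_1_38)
    return log_hits / trials, glm_hits / trials


if __name__ == "__main__":
    rate_log, rate_glm = type_one_rates()
    print(f"type I rate, log-transformation: {rate_log:.2f}")
    print(f"type I rate, Poisson GLM with empirical scale: {rate_glm:.2f}")
```

Both simulated distributions share the same population mean, so every rejection counted here is a type I error.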
analyses (Ref. 14), and more sophisticated methods, such as non-linear maximum-likelihood analysis and bootstrapping, may be appropriate when sample sizes are large or the statistical models simple (eg. Refs 4, 11, 16). However, GLMs are a significant improvement on most current methods of analysis, and their use by parasitologists must be encouraged.
Acknowledgements
We thank Steve Albon, Tim Coulson, Mick Crawley, Brian Ripley, Darren Shaw, Bill Venables and Ari Verbyla for positive discussions about negative binomials, Jill Pilkington for collecting most of the Soay sheep parasite data, and David Denham and Edwin Michael for the data on filarial worms. This work was supported by the NERC and the Wellcome Trust.
References
1 Pennycuick, L. (1971) Frequency distributions of parasites in a population of three-spined sticklebacks, Gasterosteus aculeatus L., with particular reference to the negative binomial distribution. Parasitology 63, 389-406
2 Anderson, R.M. (1974) Population dynamics of the cestode Caryophyllaeus laticeps (Pallas, 1781) in the bream (Abramis brama L.). J. Anim. Ecol. 43, 305-321
3 Anderson, R.M. (1978) The regulation of host population growth by parasitic species. Parasitology 76, 119-157
4 Pacala, S.W. and Dobson, A.P. (1988) The relation between the number of parasites/host and host age: population dynamic causes and maximum likelihood estimation. Parasitology 96, 197-210
5 Shaw, D.J. and Dobson, A.P. (1996) Patterns of macroparasite abundance and aggregation in wildlife populations: a quantitative review. Parasitology 111, S111-S134
Box 5. Soay Sheep Parasite Data
The log-transformation method was compared with the two GLM methods using post-mortem worm counts and faecal egg counts of an unmanaged population of Soay sheep on the Scottish island group of St Kilda (see Refs 18, 19). This population exhibits severe 'crashes' every 3-4 years, when up to 70% of the sheep die, due mainly to the population overexploiting its winter food supply (Refs 20, 21), although parasites have also been implicated (Refs 18, 22, 23; and K. Wilson et al., unpublished).
Worm burdens
In the first comparison, Wilson, Grenfell and Shaw (Ref. 14) analysed the worm burdens of sheep that died during the population crash of 1991-1992. The models included two factors potentially affecting the burdens of six species or genera of adult helminth worms: AGECLASS (two levels: lambs and adults) and SEX (two levels: males and females). This analysis identified three qualitative discrepancies between the results of the three models. In two of these (Trichuris ovis and Dictyocaulus filaria), the two GLMs identified SEX as a significant heterogeneity in worm burdens, when the log-transformation method failed to do so. In the third (Teladorsagia spp), the two GLMs identified a significant AGECLASS-SEX interaction, whereas the log-transformation method again failed to do so. Thus, it appears that the log-transformation method committed type II errors in these analyses and this was confirmed by non-linear maximum-likelihood methods.
Faecal egg counts
In the second comparison, Wilson, Grenfell and Shaw (Ref. 14) examined the factors influencing the faecal egg counts of sheep in the month of August between 1988 and 1993. When male egg counts were considered in isolation, the models again had two factors: AGECLASS (four levels: lambs, yearlings, two-year-olds and adults) and YEAR (six levels).
While all three methods indicated important between-year variation in male faecal egg counts, only the log-transformation method identified significant differences between the four age-classes (P<0.01). Thus, the standard log-transformation method appears to have made a type I error that is not made by the two GLMs, and this assertion was further supported using a non-linear maximum-likelihood model.
6 Siegel, S. and Castellan, N.J. (1988) Nonparametric Statistics 2nd edn, McGraw-Hill
7 Crofton, H.D. (1971) A quantitative approach to parasitism. Parasitology 63, 343-364
8 Elliott, J.M. (1977) Statistical Analysis of Samples of Benthic Invertebrates, 2nd edn, Freshwater Biological Association
9 Southwood, T.R.E. (1978) Ecological Methods 2nd edn, Chapman & Hall
10 Gregory, R.D. and Woolhouse, M.E.J. (1992) Quantification of parasite aggregation: a simulation study. Acta Trop. 54, 131-139
11 Fulford, A.J.C. (1994) Dispersion and bias: can we trust geometric means? Parasitol. Today 10, 446-448
12 Crawley, M.J. (1993) GLIM for Ecologists, Blackwell Scientific Publications
13 Venables, W.N. and Ripley, B.D. (1994) Modern Applied Statistics with S-Plus, Springer-Verlag
14 Wilson, K., Grenfell, B.T. and Shaw, D.J. Analysis of aggregated parasite distributions: a comparison of methods. Functional Ecology 10 (in press)
15 Grenfell, B.T. et al. (1991) Frequency distribution of lymphatic filariasis microfilariae in human populations: population processes and statistical estimation. Parasitology 101, 417-427
16 Williams, B.T. and Dye, C. (1994) Maximum likelihood for parasitologists. Parasitol. Today 10, 489-493
17 Aitkin, M. et al. (1989) Statistical Modelling in GLIM, Clarendon Press
18 Gulland, F.M.D. (1992) The role of nematode parasites in Soay sheep (Ovis aries L.) mortality during a population crash. Parasitology 105, 493-503
19 Gulland, F.M.D. and Fox, M. (1992) Epidemiology of nematode infections of Soay sheep (Ovis aries L.) on St Kilda. Parasitology 105, 481-492
20 Grenfell, B.T. et al. (1992) Overcompensation and population cycles in an ungulate. Nature 355, 823-826
21 Clutton-Brock, T.H. et al. Stability and instability in ungulate populations: an empirical analysis. Am. Nat. (in press)
Box 6. Dealing with Aggregation in Experimental Infections
Estimating parasite death rates in cats infected with Brugia pahangi. The Fig. (below) shows a famous parasitological data set (Refs 24, 25): the number of adult worms recovered after infection of cats with a single dose of the filarial nematode Brugia pahangi [initial dose = 100 larvae (dots) or 200 larvae (crosses)]. After a sharp initial decline in recoveries (reflecting the initial establishment of worms; Ref. 26), the proportion of parasites declines gradually, though variably, with time. Given a Poisson-distributed infection rate, we might expect a Poisson distribution for parasite numbers~-~; however, these data are much more aggregated than that (estimated k ± SE = 3.29 ± 0.41), probably reflecting heterogeneities in parasite demographic parameters between hosts (Refs 24, 25). This aggregation is confirmed by a good fit of the negative binomial model (residual deviance = 163, degrees of freedom = 150, P = 0.222), which also indicates no significant difference in proportional recovery between infection doses.
[Figure: worm recoveries plotted against time. Horizontal axis: Days since infection.]
The fitted line in the Fig. (above) is for the regression of worm burden on time for recoveries after 40 days, using a log link function. Comparison with earlier burdens (mean indicated by the triangle) illustrates the initial drop in parasite load. The regression also allows us to estimate the adult parasite's average death rate and lifespan (E. Michael, B.T. Grenfell, D.A. Denham and D. Bundy, unpublished). The death rate is equal to the slope of the regression line (= 0.538 per year) and the lifespan equal to 1/slope (= 1/0.538 = 1.86 years).
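The last step can be written out explicitly. The sketch below assumes, as the log link implies, that the expected worm burden declines exponentially with time after day 40, so that a constant per-capita death rate gives a mean lifespan equal to its reciprocal. The symbols W(t), a and \mu are introduced here for illustration and do not appear in the original analysis.

```latex
\begin{align*}
\ln E[W(t)] &= a - \mu t, \qquad \mu = 0.538\ \text{per year} \\
\text{mean lifespan} &= \frac{1}{\mu} = \frac{1}{0.538} \approx 1.86\ \text{years}
\end{align*}
```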
22 Gulland, F.M.D. et al. (1993) Parasite-associated polymorphism in a cyclic ungulate population. Proc. R. Soc. London Ser. B 254, 7-13
23 Grenfell, B.T. et al. (1996) Modelling patterns of parasite aggregation in natural populations: trichostrongylid nematode-ruminant interactions as a case study. Parasitology 111, S135-S151
24 Denham, D.A. et al. (1972) Studies with Brugia pahangi. I. Parasitological observations on primary infections of cats (Felis catus). Int. J. Parasitol. 2, 239-247
25 Suswillo, R.R., Denham, D.A. and McGreevy, P.B. (1982) The number and distribution of Brugia pahangi in cats at different times after a primary infection. Acta Trop. 39, 151-156
26 Grenfell, B.T., Michael, E. and Denham, D.A. (1991) A model for the dynamics of human lymphatic filariasis. Parasitol. Today 7, 318-323
Letters
Plasmodium cdc2-related Kinases:
Do they Regulate Stage Differentiation?
Reply
I can only add my support to Kinnaird and Mottram's comments (this issue): currently available data on Plasmodium cdc2-related kinases are indeed insufficient to define a role for these enzymes in the parasite's life cycle (as was clearly stated in our original paper on Pfcrk-1; Ref. 1). Nevertheless, a touch of cautious speculation is useful to define working hypotheses whose value can be tested experimentally.
Our hypothesis that Pfcrk-1 may be involved in establishment or maintenance of the differentiated, nondividing state of gametocytes was based on the facts: (1) that it is most homologous to a known downregulator of cell proliferation (p58GTA); and (2) that its mRNA can be detected only in gametocytes. Furthermore, expression of p58GTA during mouse embryonic development coincides with the cessation of cell division that accompanies differentiation (Ref. 2), a pattern similar to that exhibited by Pfcrk-1. However, as mentioned by Kinnaird and Mottram, p58GTA is just one member (namely PITSLRE-β1) of the PITSLRE family of kinases, of which several very closely related isoforms exist; the function of these isoforms is unknown, but most (the exception being PITSLRE-β1 and -α2) appear not to have deleterious effects on cell viability when expressed ectopically, suggesting that they do not act as downregulators of cell division (Ref. 3). The possibility exists that Pfcrk-1 is a gametocyte-specific functional homologue of one of these rather than of PITSLRE-β1 and plays no role in the regulation of gametocytogenesis. The search for the function of p58GTA homologues in Apicomplexans will benefit from the finding that Theileria annulata appears to possess such a gene, as indicated by Southern blot experiments using Pfcrk-1 as a probe; this putative homologue has been cloned and is currently being characterized (G. Langsley, pers. commun.).
Kinnaird and Mottram suggest that cell cycle arrest during gametocytogenesis is more likely to be achieved through the action of a CDK inhibitor (CDI) than through that of an additional kinase activity. While there are cases in which a CDI appears to be sufficient to arrest the cell cycle (eg. in yeast mating-pheromone response; Ref. 4), the fact that, in higher eukaryotes, additional elements (such as p58GTA for example) appear, in some instances, to be required to stop cell proliferation indicates that inhibiting a cell cycle kinase with a CDI may not, by itself, be enough to induce or maintain cell cycle arrest; this may reflect the relatively higher complexity of the cell cycle machinery in such organisms. The complexity of Plasmodium's life cycle is presumably mirrored by an underlying complexity at the level of molecular regulation of the progression of this cycle. The large number of CRKs already identified in other protozoan parasites with complex life cycles (Ref. 5) (versus the apparently much simpler situation of yeast for example) is consistent with this view. Likewise, the harvest of Plasmodium CDKs has probably only just begun, and one can reasonably expect more enzymes to be added to the as yet short list (indeed, we are presently characterizing a novel Pfcrk homologue).
Lack of success in heterologous mutant complementation systems (eg. yeast cdc-2) does not necessarily mean that the genes under investigation do not have a similar function to that lacking in the mutant; absence of functional complementation may well be due to poor expression of the parasite's genes in the heterologous host, or to the inability of the gene product to establish the required interactions with the host's machinery. In other words, the best way to resolve this question is to study the function of these genes in the parasites themselves. Stable transfection protocols for both trypanosomatids and Plasmodium (Ref. 6) are now available, which should make this goal attainable. It would, of course, also be of great interest to identify CDIs in Plasmodium.
Kinnaird and Mottram rightly point out that there are several steps in the parasite's life cycle that require cell cycle arrest. Although there is no reason a priori to expect that the same gene products are involved at different stages (after all, if Plasmodium uses different sets of ribosomes at different developmental stages (Ref. 7), it may well use different enzymes to fulfil similar functions at different stages), it is certainly worth looking at Pfcrk-1 expression during the entire life cycle, especially in nondividing stages (eg. sporozoites). In this respect, preliminary data suggest that a Pfcrk-1 gene product peaks in late gametocytogenesis (Day 15) and is still detectable (but decreasing rapidly) after gametocyte activation has been initiated (M. Kariuki, C. Doerig and S. Martin, unpublished). If confirmed, such data would be consistent with Pfcrk-1 expression being correlated with the cell-division status of the parasite. The absence of detectable Pfcrk-1 in asexual parasites argues against the idea that this enzyme is involved in the developmental commitment to gametocytogenesis, since this commitment appears to occur in the preceding schizont (Ref. 8) (the possibility cannot be excluded that Pfcrk-1 is indeed involved in the developmental decision, but is expressed at subdetectable levels or only in a small subpopulation of asexual parasites). Demonstrating a link between Pfcrk-1 and cell surface components (as is the case for mammalian p58GTA) would lend support to the hypothesis that it may function in sensing and/or transducing environmental changes that are likely to trigger developmental processes such as gametocyte activation; clearly, much more work is required before a picture of Pfcrk-1 function and mode of action emerges.
All colleagues working on the regulation of growth and development of Plasmodium would presumably agree that functional characterization of candidate regulator genes is the obligate next step towards an understanding of this phenomenon. This endeavour will be greatly facilitated by the recent availability of reverse genetics procedures to Plasmodium. In the meantime, we will still favour the working hypothesis that Pfcrk-1 is an important regulator of differentiation: if empirical evidence shows it to be the case, so much the better, we will have gained some insight into the mechanisms of Plasmodium development; if not, we will nevertheless have delighted for a while in the excitement of perhaps uncovering a fundamental aspect of the parasite's biology.