Journal of Applied Statistics, Vol. 29, Nos. 1-4, 2002, 245-264

Evaluation of some random effects methodology applicable to bird ringing data

KENNETH P. BURNHAM (1) & GARY C. WHITE (2), (1) Colorado Cooperative Fish & Wildlife Research Unit, USGS-BRD, Colorado, USA, and (2) Department of Fishery and Wildlife Biology, Colorado State University, USA
ABSTRACT Existing models for ring recovery and recapture data analysis treat temporal variations in annual survival probability (S) as fixed effects. Often there is no explainable structure to the temporal variation in S_1, ..., S_k; random effects can then be a useful model: S_i = E(S) + ε_i. Here, the temporal variation in survival probability is treated as random with average value E(ε²) = σ². This random effects model can now be fit in program MARK. Resultant inferences include point and interval estimation for process variation, σ², estimation of E(S) and var(Ê(S)) where the latter includes a component for σ² as well as the traditional component for v̄ar(Ŝ | S). Furthermore, the random effects model leads to shrinkage estimates, S̃_i, as improved (in mean square error) estimators of S_i compared to the MLE, Ŝ_i, from the unrestricted time-effects model. Appropriate confidence intervals based on the S̃_i are also provided. In addition, AIC has been generalized to random effects models. This paper presents results of a Monte Carlo evaluation of inference performance under the simple random effects model. Examined by simulation, under the simple one group Cormack-Jolly-Seber (CJS) model, are issues such as bias of σ̂², confidence interval coverage on σ², coverage and mean square error comparisons for inference about S_i based on shrinkage versus maximum likelihood estimators, and performance of AIC model selection over three models: S_i ≡ S (no effects), S_i = E(S) + ε_i (random effects), and S_1, ..., S_k (fixed effects). For the cases simulated, the random effects methods performed well and were uniformly better than fixed effects MLE for the S_i.
1 Introduction

The objective of this paper is to evaluate, by simulation, the basic operating characteristics of some simple random effects inference methodology applicable to
Correspondence: K. P. Burnham, USGS-BRD, Colorado Cooperative Fish and Wildlife Research Unit, 201 Wagar Building, Colorado State University, Fort Collins, CO 80523, USA.
ISSN 0266-4763 print; 1360-0532 online/02/010245-20
DOI: 10.1080/02664760120108755
© 2002 Taylor & Francis Ltd
open model capture-recapture data. The reader is assumed to have a basic knowledge of dead-recovery (e.g. Brownie et al., 1985) and live-recapture (e.g. Lebreton et al., 1992) models, referred to here generically just as capture-recapture (for a current review see Schwarz & Seber, 1999). Basic random effects concepts and models are well established in general statistical theory under various names, such as variance components, random effects, random coefficient models, or empirical Bayes estimates (see, for example, Efron & Morris, 1975; Casella, 1985; Searle et al., 1992; Longford, 1993; Carlin & Louis, 1996). There has been only modest application of random effects in ecology (see, for example, Johnson, 1981, 1989; Burnham et al., 1987; Link & Nichols, 1994; Ver Hoef, 1996; Link, 1999; Franklin et al., this issue) despite the fact that these methods are needed, for example, to estimate correctly process variation in survival probabilities over space or time (White, 2000). We assume the reader has some familiarity with basic concepts of random effects, such as process variation versus sampling variation.
The specific random effects methodology for capture-recapture evaluated here is implemented in program MARK (White & Burnham, 1999; White et al., 2002). In this paper we first summarize the relevant theory (i.e. methods evaluated). Next, we give the specific inferential aspects of random effects methods evaluated here. Then we give the design of the Monte Carlo simulation study used for this evaluation. Finally, we summarize results on bias, efficiency, confidence interval coverage, and model selection.

This is perhaps the first performance evaluation of a random effects model for capture-recapture. Consequently, we keep the evaluation simple by restricting it to simulated data based only on the Cormack-Jolly-Seber (CJS) time-specific model {S_t, p_t}, wherein estimated survival and capture probabilities are allowed to be time varying for t = k + 2 capture occasions, equally spaced in time. However, this should entail no loss of generality because both relevant theories are quite general. First, the random effects theory used here is general; it applies to any set of maximum likelihood estimates, Ŝ_1, ..., Ŝ_k, regardless of the type of capture-recapture data analysed. Secondly, time-specific models for recovery and recapture data stem from the same deep, unified theory (Burnham, 1991; Barker, 1997).
2 Inference methods

2.1 Random effects

Our context here is open models capture-recapture wherein k survival probabilities are estimable, corresponding to k equal length time periods, often years. The basic time-specific CJS model is considered as being conditional on the underlying estimable survival probabilities, S_1, ..., S_k. Hence, the MLEs for this model can be considered in the context of a linear model (conditional by definition on S) represented as Ŝ = S + δ (these are k × 1 column vectors). Here, S represents the structural parameters and δ is the model's stochastic component. Given S, the (large sample) expected value of δ is zero, hence we can take E(Ŝ | S) = S. Also, δ has conditional (on S) sampling variance-covariance matrix W, which will be a complicated function of S and other parameters, such as capture probabilities, p, or ring recovery probabilities, f.

For a random effects model, S is modelled as a random vector with expectation Xβ and variance-covariance matrix σ²I; β has r elements. By assumption, the process residuals, ε_i = S_i − E(S_i), are independent with homogeneous variance σ².
Also, we assume mutual independence of sampling errors δ and process errors ε. We envision fitting a capture-recapture model (that does not constrain Ŝ) to get the MLE Ŝ and the usual likelihood-based estimator of W. Only the simplest case of this random effects theory is used here in the simulations, but we present the general theory. The special case used here, in the simulations, is the means model; hence, r = 1, X is a column vector of k ones, and β is just the scalar parameter μ = E(S).
The basic, unconditional, random effects model is

    Ŝ = Xβ + δ + ε,    VC(δ + ε) = D = σ²I + E_S(W)
Here, we let VC denote a variance-covariance matrix. The obvious inference issues are to estimate β, σ², an unconditional variance-covariance matrix for β̂, and compute a confidence interval on σ², on df = k − r degrees of freedom. A non-obvious inference issue is the opportunity to use the shrinkage estimator of S, S̃. In a random effects context the shrinkage estimator has a smaller mean square error than the maximum likelihood estimator (Efron & Morris, 1975), and hence is to be preferred on that basis.
Shrinkage estimators are neither intuitive, nor easy to explain thoroughly. Shrinkage estimators are also called empirical Bayes estimators (Carlin & Louis, 1996). For a simple model with independent MLEs, Ŝ_i, and only one population-level structural parameter, E(S), S̃_i lies between Ê(S) and Ŝ_i. The extent of this shrinkage towards Ê(S) depends upon the variance components proportion

    σ² / (σ² + E_S{var(Ŝ_i | S_i)})

If this proportion is 1, no shrinkage occurs; if it is 0 then for all i, S̃_i ≡ Ê(S). An individual S̃_i may not improve upon the corresponding Ŝ_i in the sense of being nearer to S_i in a given case. However, overall the shrinkage estimators as a set are to be preferred as being closer to the true S_i if the random effects model applies with σ² > 0 (Efron & Morris, 1975; Casella, 1985).
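The pull toward Ê(S) can be made concrete with a small numerical sketch. All numbers below are invented, and the linear pull shown is the textbook empirical Bayes form; the estimator actually used in this paper (Section 2.1) shrinks by the square root of this proportion instead.

```python
# Toy numerical sketch of the variance components proportion and the
# resulting pull toward the estimated mean. All numbers are invented.
sigma2 = 0.0025        # process variance sigma^2
samp_var = 0.0075      # E_S{var(S_hat_i | S_i)}: average sampling variance
prop = sigma2 / (sigma2 + samp_var)   # variance components proportion = 0.25

E_hat_S = 0.80         # estimated mean survival, E_hat(S)
S_hat_i = 0.70         # MLE for occasion i

# Textbook empirical Bayes form: pull the MLE toward E_hat(S).
# (The paper's own estimator shrinks by the square root of prop.)
S_tilde_i = E_hat_S + prop * (S_hat_i - E_hat_S)   # lies between the two
```

With a small proportion (much sampling noise relative to process variation) the estimate is pulled strongly toward the mean; with proportion 1 it is not moved at all.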
From generalized least squares theory, for σ² given, the best linear unbiased estimator of β is

    β̂ = (X′D⁻¹X)⁻¹X′D⁻¹Ŝ    (1)

Assuming normality of Ŝ (approximate normality suffices) then from the same generalized least squares theory the weighted residual sum of squares (Ŝ − Xβ̂)′D⁻¹(Ŝ − Xβ̂) has a central chi-squared distribution on k − r degrees of freedom. Therefore, a method of moments estimator of σ² is obtained by solving the equation

    k − r = (Ŝ − Xβ̂)′D⁻¹(Ŝ − Xβ̂)    (2)

where β̂ comes from equation (1). Note that the critical issue of what Ê_S(W) is will be dealt with below.
An aside about the quantity RSS(σ²) = (Ŝ − Xβ̂)′D⁻¹(Ŝ − Xβ̂), as a function of σ², is in order. It can be shown that in the limit as σ² goes to ∞, RSS(σ²) goes to 0. It can also be shown that RSS(σ²) is monotonically decreasing in σ². Furthermore, the mathematically admissible range of σ² is −λ_1 < σ² < ∞, where λ_1 is the smallest eigenvalue of Ê_S(W). For σ² = −λ_1, matrix D is singular, hence RSS(−λ_1) is, roughly speaking, ∞. For σ² < −λ_1, matrix D is not a valid variance-covariance matrix. These properties of RSS(σ²) mean there is a unique σ² > −λ_1 as the solution to any equation y = RSS(σ²) = (Ŝ − Xβ̂)′D⁻¹(Ŝ − Xβ̂).
Solving equation (2) for σ̂² requires only a 1-dimensional numerical search; a unique numerical solution always exists, but may be negative. If σ̂² < 0 occurs, truncate it to 0, i.e. take σ̂² = 0. The theoretical unconditional sampling variance-covariance of β̂ is

    VC(β̂) = (X′D⁻¹X)⁻¹    (3)

To get a (1 − α)100% confidence interval on σ² we solve for σ²_L and σ²_U, respectively, from

    χ²_{df,1−α/2} = (Ŝ − Xβ̂)′D⁻¹(Ŝ − Xβ̂)    (4a)

    χ²_{df,α/2} = (Ŝ − Xβ̂)′D⁻¹(Ŝ − Xβ̂)    (4b)

Here, χ²_{df,p} is the pth percentile of the central chi-squared distribution on df degrees of freedom. Unique solutions exist to equations (4a) and (4b), although the lower confidence limit can be negative. In fact, even the upper confidence limit can be negative. In practice, any negative solutions are replaced by zero.
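To make equations (1), (2), (4a) and (4b) concrete, here is a minimal numerical sketch (not the MARK implementation). It assumes the means model, a made-up diagonal Ŵ standing in for E_S(W), and invented data, and it exploits the monotonicity of RSS(σ²) noted above to solve by bisection.

```python
import numpy as np
from scipy.stats import chi2

# Numerical sketch of equations (1), (2), (4a), (4b) for the means model.
# W_hat (diagonal, made up) stands in for E_S(W); data are simulated.

def rss(sig2, S_hat, X, W_hat):
    """Weighted residual sum of squares RSS(sigma^2), with beta_hat from (1)."""
    D = sig2 * np.eye(len(S_hat)) + W_hat
    Dinv = np.linalg.inv(D)
    beta = np.linalg.solve(X.T @ Dinv @ X, X.T @ Dinv @ S_hat)
    resid = S_hat - X @ beta
    return float(resid @ Dinv @ resid)

def solve_sigma2(target, S_hat, X, W_hat, tol=1e-10):
    """Solve RSS(sigma^2) = target by bisection (RSS is decreasing in sigma^2).
    A negative solution is truncated to 0, as in the text."""
    lo, hi = 0.0, 1.0
    if rss(lo, S_hat, X, W_hat) < target:
        return 0.0                      # solution would be negative
    while rss(hi, S_hat, X, W_hat) > target:
        hi *= 2.0                       # bracket the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if rss(mid, S_hat, X, W_hat) > target else (lo, mid)
    return 0.5 * (lo + hi)

rng = np.random.default_rng(1)
k, r = 10, 1                            # k survival estimates, means model
X = np.ones((k, r))
W_hat = np.diag(np.full(k, 0.002))      # conditional sampling variances
S_hat = 0.8 + rng.normal(0.0, np.sqrt(0.004 + 0.002), k)

df = k - r
sig2_hat = solve_sigma2(df, S_hat, X, W_hat)                  # equation (2)
sig2_lo = solve_sigma2(chi2.ppf(0.975, df), S_hat, X, W_hat)  # (4a)
sig2_hi = solve_sigma2(chi2.ppf(0.025, df), S_hat, X, W_hat)  # (4b)
```

Because RSS is decreasing in σ², the larger chi-squared percentile in (4a) yields the lower limit and the smaller percentile in (4b) the upper limit.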
The shrinkage estimator, S̃, used in this study requires the matrix

    H = σD^(−1/2) = σ(σ²I + Ê_S(W))^(−1/2) = (I + (1/σ²)Ê_S(W))^(−1/2)

evaluated at σ̂. Then S̃ = H(Ŝ − Xβ̂) + Xβ̂. An alternative formula uses the projection matrix

    G = H + (I − H)AD⁻¹    (5)

where A = X(X′D⁻¹X)⁻¹X′. Then

    S̃ = GŜ    (6)
The theoretical, conditional, variance-covariance matrix of the shrinkage estimator is

    VC(S̃ | S) = G E_S(W) G′

its diagonal elements are var(S̃_i | S). Because the shrinkage estimator is conditionally biased we based confidence intervals on

    r̂mse(S̃_i | S) = √[v̂ar(S̃_i | S) + (S̃_i − Ŝ_i)²]    (7)
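A minimal sketch of the shrinkage computation in equations (5)-(7), with invented inputs. For the diagonal D used here the power D^(−1/2) is elementwise; a general Ê_S(W) would require a matrix square root. This is an illustration, not the MARK implementation.

```python
import numpy as np

# Sketch of the shrinkage computation in equations (5)-(7); inputs invented.
k = 5
W_hat = np.diag(np.full(k, 0.002))       # stands in for E_S(W)
sigma2_hat = 0.004
S_hat = np.array([0.78, 0.85, 0.74, 0.81, 0.88])
X = np.ones((k, 1))

D = sigma2_hat * np.eye(k) + W_hat
Dinv = np.linalg.inv(D)

# H = sigma * D^(-1/2); D is diagonal here, so the root is elementwise.
H = np.sqrt(sigma2_hat) * np.diag(1.0 / np.sqrt(np.diag(D)))

A = X @ np.linalg.inv(X.T @ Dinv @ X) @ X.T
G = H + (np.eye(k) - H) @ A @ Dinv       # equation (5)
S_tilde = G @ S_hat                      # equation (6)

# Conditional covariance of S_tilde and the rmse used for intervals (7).
VC_tilde = G @ W_hat @ G.T
rmse = np.sqrt(np.diag(VC_tilde) + (S_tilde - S_hat) ** 2)
ci = np.column_stack([S_tilde - 1.96 * rmse, S_tilde + 1.96 * rmse])
```

With equal sampling variances, as here, G reduces to pulling each MLE toward the simple mean by the factor √(σ²/(σ² + w)), matching the proportion discussion above.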
The shrinkage estimator (6) used here is such that the sum of squares of the shrunk residuals (i.e. S̃ − Xβ̂), divided by k − r, equals σ̂². This coherent relationship does not hold for the usual shrinkage estimator found in the statistical literature (Morris, 1983; Louis, 1984). The shrinkage estimator defined here is central to being able to obtain a useful, simple extension of AIC for this random effects model, because β̂ and σ̂² are, essentially, computable from Ŝ. Therefore, the likelihood value to associate with this random effects model can be obtained based on the fixed effects likelihood evaluated at S̃ without the need to compute (as via numerical integration) the proper marginal likelihood of the random effects model.
An important uncertainty about these random effects methods is that we do not have formulae for the elements of E_S(W); hence, this optimal weight matrix cannot be used. We cannot take the exact expectations needed. We will have only an estimator of E_S(W), and it may have inherent biases. In simple cases, approximate expectations over S can be found for var(Ŝ_i | S) and cov(Ŝ_i, Ŝ_j | S). However, for a general method we want to just use −1 times the matrix of second partial derivatives of the log-likelihood function, say F, which estimates the Fisher information matrix, and then Ŵ ≡ Ê_S(W) = F⁻¹. We have here used this general method to obtain Ê_S(W), used in place of E_S(W). Given such approximations, and the overall complexities, of this random effects methodology, Monte Carlo simulation evaluation is required to determine actual inference performance.
2.2 AIC for random effects

We will have started with a likelihood for a model at least as general as full time variation on all the parameters, say L(S, θ) = L(S_1, ..., S_k, θ_1, ..., θ_z). For the CJS model, the additional parameters would be just the capture probabilities, p_2, ..., p_k. However, the formulae given here are meant to apply also to other types of models, such as band recoveries, which might be parameterized in terms of S_i and f_i. Therefore, for generality we denote the additional model parameters generically as θ_i. And in general under the model {S_t, θ}, we have the MLEs, Ŝ and θ̂, and the maximized log-likelihood, log L(Ŝ, θ̂), based on K = k + z parameters. Thus, for large sample size, n, AIC for the fixed effects model is −2 log L(Ŝ, θ̂) + 2K (Burnham & Anderson, 1998).
The log-likelihood value for the fitted random effects model on S comes from re-optimizing over θ at the value of S̃. We denote this random effects log-likelihood value as

    log L̃(S̃, θ̂) ≡ log L(S̃, θ̂(S̃)) = max_θ [log L(S̃, θ)]

where S̃ essentially 'contains' β̂ and σ̂². The dimension of the parameter space to associate with this random effects model is K_re, where

    K_re = tr(G) + z

G is the projection matrix (6) mapping Ŝ into S̃, and tr(·) is the matrix trace function (tr(G) = the sum of the diagonal elements of G; see, for example, Schott, 1997, p. 4).
The large-sample AIC for the random effects model on S is

    −2 log L̃(S̃, θ̂) + 2K_re

The small sample corrected version, AIC_c, for this random effects model is

    −2 log L̃(S̃, θ̂) + 2K_re + 2K_re(K_re + 1)/(n − K_re − 1)    (8)

In these simulations we used equation (8) with effective sample size, n, as the total number of animal-release events (Burnham et al., 1994). However, sample sizes in these simulations were always large enough that the small sample term in equation (8) was irrelevant.
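The effective parameter count and formula (8) are simple to sketch. The G below is a toy projection of the means-model form (shrinkage factor c) and the log-likelihood, z, and n are invented numbers, not values from the paper.

```python
import numpy as np

# Sketch of K_re = tr(G) + z and the AIC_c of formula (8); G here is a toy
# means-model projection (shrinkage factor c), and all numbers are invented.

def aicc_random_effects(loglik_tilde, G, z, n):
    """AIC_c with effective parameter count K_re = tr(G) + z."""
    K_re = float(np.trace(G)) + z
    return (-2.0 * loglik_tilde + 2.0 * K_re
            + 2.0 * K_re * (K_re + 1.0) / (n - K_re - 1.0))

c, k = 0.8, 5
G = c * np.eye(k) + (1.0 - c) * np.full((k, k), 1.0 / k)  # tr(G) = 1 + c(k-1)
aicc = aicc_random_effects(loglik_tilde=-1234.5, G=G, z=6, n=2000)
```

Note that tr(G) lies between 1 (complete shrinkage to the mean) and k (no shrinkage), so K_re interpolates between the parameter counts of the {S, p_t} and {S_t, p_t} fixed effects models.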
Results such as equation (8) are, in the literature, giving AIC generalized to semi-parametric smoothing applications; see, for example, Hurvich & Simonoff (1998) and Shi & Tsai (1998). However, those papers are not about random effects models. Instead, those papers note a generalized AIC where the effective number of parameters is the trace of a smoothing matrix. In fact, the mapping GŜ = S̃ is a type of generalized smoothing. It is known that the effective number of parameters to associate with such smoothing is the trace of the smoother matrix (see, for example, Hastie & Tibshirani, 1990, section 3.5; also see pp. 48-49).
3 Study design

The questions one might ask about inference performance under random effects models, and information one could desire, are quite general, certainly more so than the limited simulation evaluation presented here. To clarify the information we sought, we first present the questions we asked. Then we give the design aspects of the particular simulation study reported here.
3.1 Inference issues

(Q1) What is the bias of σ̂² (i.e. the solution of equation (2))? We address this in two parts because the truncation of negative values of σ̂² back to zero will induce bias, especially when process variation is zero, even if the signed variance estimator is unbiased (which can be the case in variance components estimation).

(Q1a) What is the bias of the signed estimator σ̂²?

(Q1b) What is the bias of the zero-truncated process variance estimator? To explore this question we actually use σ̂, i.e. the square root of the zero-truncated estimator (not possible to do with the signed estimator).

(Q2) What is the relative frequency of cases wherein σ̂² < 0?

(Q3) What is the coverage of the nominal 95% confidence interval on estimated process variance? This can be done on the scale of σ² as well as σ; we focus on σ. A secondary question is also considered: what is the frequency of coverage failures above and below the confidence interval? For example, here a coverage failure is said to be 'above' if the confidence interval is entirely above true σ, hence σ is below the lower confidence limit.
The next set of questions relate to inference about the individual S_i. These inferences are computed as conditional on the actual survival probabilities of a given case, but we are then evaluating their performance over the Monte Carlo trials used.
(Q4) What is achieved coverage of the nominal 95% confidence interval on S_i, i = 1, ..., k? The focus is on average coverage over all k survival probabilities (and whether coverage varies much by occasion, i). Based on the MLE under the time-specific CJS model, the interval used is Ŝ_i ± 1.96 ŝe(Ŝ_i | S). For the shrinkage estimator the interval uses r̂mse from equation (7), hence is S̃_i ± 1.96 r̂mse(S̃_i | S).

(Q5) What are the relative lengths of these two confidence intervals? It suffices to compare ratios of average r̂mse(S̃_i | S) to average ŝe(Ŝ_i | S). An overall ratio is based on first computing the average of each quantity over all simulation trials, by occasion, for a set of conditions (occasions, releases, E(S), p, and σ), then looking at

    grand average length ratio = [Σ_{i=1}^k mean r̂mse(S̃_i | S)] / [Σ_{i=1}^k mean ŝe(Ŝ_i | S)]    (9)
If needed, we can also look at length ratio by occasion:

    occasion i average length ratio = mean r̂mse(S̃_i | S) / mean ŝe(Ŝ_i | S)    (10)

If the shrinkage estimator is superior, these ratios will be less than one while confidence interval coverage remains at, or near, the nominal level.

(Q6) What are the relative mean square errors of the MLE and shrinkage estimators? Theory says the shrinkage estimator has the smaller mean square error (MSE). For R Monte Carlo trials (under set conditions; r indexes trial), the estimated MSEs by occasion are

    MSE(Ŝ_i) = (1/R) Σ_{r=1}^R (Ŝ_{r,i} − S_{r,i})²

    MSE(S̃_i) = (1/R) Σ_{r=1}^R (S̃_{r,i} − S_{r,i})²

The ratio of interest is

    [Σ_{i=1}^k MSE(S̃_i)] / [Σ_{i=1}^k MSE(Ŝ_i)]    (11)

and less so, the by-occasion-i ratios,

    MSE(S̃_i) / MSE(Ŝ_i)    (12)

If the ratios given by equation (12) are stable over occasions we can focus on equation (11).

(Q7) On average, within trial, is the MLE or the shrinkage estimator closer to the survival probabilities, S_{r,i}, of that trial? The summary statistics used to investigate this question are

    SSE_r(Ŝ) = Σ_{i=1}^k (Ŝ_{r,i} − S_{r,i})²

    SSE_r(S̃) = Σ_{i=1}^k (S̃_{r,i} − S_{r,i})²

Of interest is the ratio

    RSSE_r = SSE_r(S̃) / SSE_r(Ŝ)    (13)

which will be less than one if the shrinkage estimator is better than the MLE. Questions (6) and (7) involve similar, but not identical summaries of results.
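The summaries in equations (11) and (13) amount to a few array reductions. The sketch below uses invented arrays and a toy constant shrinkage factor (not the paper's estimator) purely to show the bookkeeping over R trials and k occasions.

```python
import numpy as np

# Sketch of summaries (11) and (13) as array reductions. The "estimators"
# are invented stand-ins (a constant toy shrinkage factor), purely to show
# the bookkeeping over R trials and k occasions.

rng = np.random.default_rng(7)
R, k = 500, 5
S_true = rng.beta(50.4, 12.6, size=(R, k))            # true survival, mean 0.8
S_mle = S_true + rng.normal(0.0, 0.05, size=(R, k))   # unconstrained "MLEs"
trial_mean = S_mle.mean(axis=1, keepdims=True)
S_shr = trial_mean + 0.7 * (S_mle - trial_mean)       # toy shrinkage

mse_mle = ((S_mle - S_true) ** 2).mean(axis=0)        # MSE(S_hat_i) by occasion
mse_shr = ((S_shr - S_true) ** 2).mean(axis=0)        # MSE(S_tilde_i)
overall_ratio = mse_shr.sum() / mse_mle.sum()         # equation (11)

sse_mle = ((S_mle - S_true) ** 2).sum(axis=1)         # SSE_r(S_hat), per trial
sse_shr = ((S_shr - S_true) ** 2).sum(axis=1)         # SSE_r(S_tilde)
rsse = sse_shr / sse_mle                              # equation (13)
```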
The final area of investigation concerns the performance of AIC model selection. The most fundamental issue was whether AIC for random effects (i.e. formula (8)) would perform satisfactorily. To answer the question we tabulate the performance of AICc as regards selection among three models fit to each simulated data set. The models fit were {S_t, p_t}, {S, p_t} (i.e. S_i constant), and the random effects model, which is properly considered intermediate between these two fixed effects models. Hence, the question is simply

(Q8) What is the performance here of AICc as regards these three models?
3.2 Simulation design

We simulated single-group live capture-recapture data of the CJS type (Lebreton et al., 1992) as the basis to address the above questions (other choices are possible, such as dead recoveries). The Monte Carlo simulations were done as a factorial treatment design using five factors:

    capture occasions (t = k + 2)                       4 levels (7, 15, 23, 31)
    releases of new animals (u) on each occasion        2 levels (100, 400)
    constant capture probability (p) on each occasion   2 levels (0.6, 0.8)
    mean survival probability, E(S)                     2 levels (0.6, 0.8)
    process variation, σ                                4 levels (0, 0.025, 0.05, 0.1)

All 128 combinations (= 4 × 2 × 2 × 2 × 4) of these levels defined the points used in the design space. At each design point we simulated 500 independent data sets. For cases of σ > 0 the S_1, ..., S_k were generated as a random sample from a beta distribution with mean E(S) and variance σ² (such as was done in Burnham et al., 1995, for random capture probabilities). On each occasion a fixed number of new 'animals' (u_i ≡ 100 or 400) were released into the population. Data sets were generated one at a time in SAS (SAS Institute Inc, 1985) and passed directly to MARK where three models were fit to each data set: {S_t, p_t}, {S, p_t}, and the random effects model, which also required the re-optimization of the model {S_t, p_t} likelihood over p, at fixed S̃, as noted in Section 2.2.
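The survival-generation step can be sketched as follows. The mean/variance-to-shape-parameter conversion is the standard one for the beta distribution; the paper's SAS implementation is not shown and may differ in detail.

```python
import numpy as np

# Sketch of the survival-generation step: S_1,...,S_k drawn from a beta
# distribution with given mean E(S) and variance sigma^2, via the standard
# mean/variance-to-(a, b) conversion.

def beta_params(mean, var):
    """Shape parameters (a, b) of a beta with the given mean and variance."""
    assert 0.0 < mean < 1.0 and 0.0 < var < mean * (1.0 - mean)
    nu = mean * (1.0 - mean) / var - 1.0
    return mean * nu, (1.0 - mean) * nu

rng = np.random.default_rng(42)
a, b = beta_params(0.8, 0.05 ** 2)      # E(S) = 0.8, sigma = 0.05
S = rng.beta(a, b, size=13)             # k = 13 survival probabilities
```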
A final comment on design, in case the reader wonders why the levels for occasion (t) were 7, 15, 23, 31. For this factor we focused on the number of estimable S_i, which is k = t − 2, under the time-specific model, {S_t, p_t}. We first decided on 5 as our minimum for k; and we wanted about 30 for our maximum k. Next we decided to use four levels. Given these choices, an increment to k of 8 was selected. Hence, levels on k are 5, 13, 21, 29, or in terms of occasions t, levels are 7, 15, 23, 31.
3.3 Maximum likelihood estimation

A critical point here is that all the MLEs were computed without being constrained to be ≤ 1. This was done simply by using the identity link in MARK along with the standard likelihood for the CJS time-specific model. If the logical constraint Ŝ ≤ 1 is imposed it will have very undesirable effects on the numerically estimated Fisher information matrix, F, if any instance of Ŝ_j = 1 occurs. The basic problem is that now the partial derivatives with respect to S_j end up being very wrong (usually 0 if a logit link is used). The resultant numerically estimated sampling variances are biased low. Of course now the variation in the set of MLEs is also reduced by the imposed constraint. One might think everything would work out all right in the end, but exploratory results showed this hope to be false.

Applying the variance-components methods given here to MLEs is not recommended when they are bounded at 1 and any of them fall on this bound. Note, however, that if all MLEs are well below 1 (on the scale of their standard errors), then doing the basic fitting for MLEs bounded, as with use of a logit link function, or unbounded gives the same results for random effects inferences.
4 Results

To present results, our strategy is to give an overall performance result first (e.g. over all 128 cases, if appropriate), by question given their ordering in Section 3.1. Then we partition the inferential statistic(s) by any important factors that affect the result. Major influences can be, for example, the value of process variation, σ, number of capture occasions, and number of releases at each occasion (100 or 400). The total number of Monte Carlo trials (64 000) was large enough that trivially small differences can be detected as 'statistically significant'. Therefore, we usually do not present measures of precision on the summary means. In the spirit of first giving the big-picture result we note that the inference procedures examined (i.e. the questions in Section 3.1) performed very well.
First, we present information on the expected value of σ̂², the signed estimator of σ², which is the solution of equation (2). Table 1 gives simulation results for Ê(σ̂²) by σ and occasions, t, the only factors having any noteworthy effects on the results. There is distinct positive bias when process variation is 0, but little or no bias in this study for σ ≥ 0.025 (except at k = 5 and σ = 0.025). We had hoped the signed σ̂² would be more nearly unbiased when true σ was at or near 0; instead it has positive bias.

However, the estimator to use in practice for a single data set would be max{0, σ̂²}, for which there must be positive bias, at least at small values of σ. Therefore, we present in Table 2 results on bias of σ̂ = [max{0, σ̂²}]^(1/2). From Table 2, σ̂ has little bias, in this study, when true σ ≥ 0.025. An unbiased σ̂ when true σ = 0 probably cannot be achieved.
Comparison of results in Tables 1 and 2, to the extent this is meaningful given the non-linearities involved, suggests zero-truncation is not a major cause of bias in σ̂. This observation motivated us to produce Table 3, which gives the proportion of trials in which σ̂² was negative. Especially when σ = 0, we thought perhaps the proportion of negative signed estimates would be much less than 0.5; it was not. However, the distribution of σ̂² is very asymmetric: there are not a lot of negative
Table 1. Mean value of the signed estimator of process variance, σ̂², by true σ and number of capture occasions, t = k + 2, from simulation; means by σ² and t are based on 4000 trials; column means (i.e. by σ² only) are based on 16 000 trials

              True σ² (σ in parentheses)
    t       (0) 0        (0.025) 0.000625   (0.05) 0.00250   (0.1) 0.01
    7       0.0001078    0.000761           0.00249          0.01015
    15      0.0000446    0.000620           0.00253          0.01002
    23      0.0000299    0.000637           0.00251          0.00999
    31      0.0000273    0.000645           0.00251          0.00994
    mean    0.0000524    0.000666           0.00251          0.01002
Table 2. Mean value of σ̂ = [max(0, σ̂²)]^(1/2), by true σ and number of capture occasions, t, from simulation; means by σ and t are based on 4000 trials; column means (i.e. by σ only) are based on 16 000 trials

              True σ
    t       0          0.025      0.05       0.10
    7       0.00916    0.0275     0.0499     0.1007
    15      0.00591    0.0249     0.0502     0.1001
    23      0.00467    0.0252     0.0501     0.0999
    31      0.00462    0.0254     0.0501     0.0997
    mean    0.00609    0.0257     0.0501     0.1001
Table 3. Mean proportion of signed estimates, σ̂², that are negative, by true σ and number of capture occasions, t, from simulation; means by σ and t are based on 4000 trials; column means (i.e. by σ only) are based on 16 000 trials

              True σ
    t       0        0.025    0.05     0.10
    7       0.592    0.339    0.159    0.026
    15      0.516    0.144    0.022    0.001
    23      0.510    0.076    0.004    0.000
    31      0.484    0.047    0.002    0.000
    mean    0.526    0.152    0.047    0.007
values that have large absolute values, where 'large' is with respect to the sampling distribution of the positive values of σ̂². For the record we note here that the proper lower bound to enforce on σ̂² is the negative of the smallest eigenvalue of Ê_S(W).
Next, we consider question (3) in Section 3.1: actual coverage of the nominal 95% confidence interval on σ², or equivalently, on σ. The endpoints of this interval are computed, based on σ², from formulae (4a) and (4b). Results are then zero-truncated; hence, we can take the square root of those endpoints, and coverage is the same whether considered for σ² or σ. Evaluated over all 64 000 computed confidence intervals on σ, coverage was 94.8%. Coverage averaged by σ is shown below (each mean is based on 16 000 trials), and separately by capture occasions, t:

    σ        %coverage        t     %coverage
    0.000    94.6             7     94.9
    0.025    95.0             15    94.9
    0.050    95.1             23    94.8
    0.100    94.7             31    94.8
The other question about coverage is whether the 5% of intervals that failed to cover did so with equal 'tails'. We therefore looked at the percentage of intervals that failed to cover because σ was below the lower limit of the interval (denoted as missed 'above'), or σ was above the interval upper end point (missed 'below'). The answer is simple, there was symmetry: 2.63% of misses were below and 2.53% of misses were above. Even for the 16 000 trials at σ = 0 the results were 2.95% of misses were below and 2.47% of misses were above.
The next set of results are about inference on individual survival rates, S_i (questions (4) to (7) in Section 3.1). We believed, and results support, that shrinkage estimates have smaller mean square error than MLEs under a random effects model. However, did we then suffer big losses in confidence interval coverage? Thus, coverage (question (4)) was paramount in our thinking, so we first present results for coverage of intervals computed, based on use of equation (7), as S̃_i ± 1.96 r̂mse(S̃_i | S), and for MLE-based intervals as Ŝ_i ± 1.96 ŝe(Ŝ_i | S) (from the time-specific CJS model). Conditional on a given trial and occasion (i = 1, ..., k), an interval 'covers' if true S_i for that trial is in the confidence interval. These are the type of conditional intervals one wants for the individual survival probabilities.

Confidence interval coverage results, averaged over all estimable survival probabilities (k) within case (500 trials) and then over all 128 cases, are 95.5% for the shrinkage estimator and 95.0% for the MLE under the time-specific CJS model. No design factors affected the coverage of the MLE-based interval; therefore, we do not look closer at its coverage results. Coverage of the confidence interval based on S̃ was close to the nominal 95% except when σ = 0, wherein coverage was too high. Coverage was relatively lower for seven occasions, k = 5, compared to k ≥ 13. Numbers of releases (u) had only trivial effects on coverage. As Table 4 shows, coverage based on the shrinkage estimator was essentially 95% for the cases examined here when σ > 0 and k > 5.
The only striking result is that coverage based on S̃ is higher than nominal when σ = 0 (Table 4). This result is understandable from formula (7), wherein if S_i ≡ S then shrinkage to the common mean is correct and the term (S̃_i − Ŝ_i)² is neither needed nor appropriate. However, when σ > 0 (0.025 or more here), this added term is needed, even when the analysis in a given trial produces σ̂ = 0. For real data we do not know σ and observing σ̂ = 0 does not reliably mean σ = 0. Hence, we have here an issue in data analysis strategy. We could always use equation (7); this is probably a good strategy if true σ > 0. Or we could use equation (7) unless σ̂ = 0, in which case we then use the traditional ŝe(S̃_i | S). We will comment more on this matter below.
Given that confidence interval coverage is good, we address question (5): what are the relative lengths of these two confidence intervals? First we look at results for the 128 cases based on formula (9), which is the ratio of overall average interval lengths (shrinkage versus MLE based). A ratio less than one favours the shrinkage-based method. The mean ratio over all 128 simulation cases is 0.842. However, this ratio in equation (9) is strongly affected by the value of σ:
Table 4. Average percentage coverage for nominal 95% confidence intervals on survival probabilities, S_i, based on the shrinkage estimator, S̃_i; results here are averaged over i and are based on 4000 independent trials by combination of true σ and number of capture occasions, t, and on 16 000 trials for each column mean

              True σ
    t       0       0.025   0.05    0.10
    7       97.8    93.1    92.1    93.7
    15      99.5    94.0    94.3    94.8
    23      99.7    94.8    94.7    94.7
    31      99.8    95.3    95.0    94.8
    mean    99.2    94.3    94.0    94.5
  σ        confidence interval ratio, formula (9)
  0.000    0.739
  0.025    0.799
  0.050    0.879
  0.100    0.949
There are no other factors with effects worth noting on this ratio of average confidence interval lengths. There is a striking result here. Fitting the CJS model {S_t, p_t} to data wherein the S_i did not vary over time (σ = 0), the confidence interval based on the shrinkage estimator had coverage > 95%, the MLE-based confidence interval covered at 95%, and yet the confidence interval based on the shrinkage method was, on average, about 25% shorter than the interval based on the MLE. For the cases having σ > 0, coverage of both intervals was about 95%; however, the confidence interval based on the shrinkage method was always shorter on average than that from the MLE analysis. Thus, the analysis-strategy issue raised above is easily resolved as regards the use of MLE versus shrinkage when a random effects model is a proper model to consider: random-effects-based inference uniformly beat MLE in these simulations.
A few more analyses are useful. We considered the stability of the ratio of average confidence interval lengths by occasion, within the simulation case. Using formula (10) we computed this ratio of average lengths (from 500 trials). Then we computed the coefficient of variation of these k ratios, hence getting one number for each of the 128 simulation cases. If these CVs are small (5% would be small, we feel), then our analysis of confidence interval lengths based on equation (9) is sufficient to apply to all occasions, i = 1, ..., k. The average of the 128 CVs is 2.0%, and the largest four CVs are (as %) 3.6, 3.7, 4.2 and 6.0. We conclude that the analysis of the ratio of average confidence interval lengths based on equation (9) is sufficient.
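The per-case summary just described (one CV of the k per-occasion length ratios) can be computed as in this small sketch; the ratio values in the usage example are made up for illustration, not taken from the simulations.

```python
import statistics

def cv_percent(ratios):
    """Coefficient of variation, as a percentage, of the k per-occasion
    ratios of average confidence interval lengths (formula (10));
    one such CV summarizes each simulation case."""
    return 100.0 * statistics.stdev(ratios) / statistics.mean(ratios)
```

For example, the hypothetical ratios [0.80, 0.82, 0.84, 0.86] give a CV of about 3.1%, which by the 5% yardstick above would count as small.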
Next, we consider question (6): what are the relative mean square errors of the MLE and shrinkage estimators? First we used equation (11) to obtain ratios of average mean square errors over occasions within case; hence, we get one ratio for each of the 128 cases simulated. A ratio < 1 favours the shrinkage estimator. The average of those 128 ratios is 0.618. The factor with the biggest effect, by far, on this ratio is process variation, σ. Average results of equation (11) by σ are:
  σ        average MSE ratio, formula (11)
  0.000    0.192
  0.025    0.570
  0.050    0.790
  0.100    0.919
Because averages conceal as well as reveal, we also used formula (12) to compute this mean square error ratio for all 2176 combinations of factors and occasions within case. Only three of these 2176 ratios exceeded 1: 1.003, 1.004 and 1.02. All other such ratios were less than 1. This is additional evidence of the superiority of the shrinkage estimator as compared with the MLE in this random effects context.
Table 5. The proportion of simulation trials that had RSSE_r < 1 (formula (13)); if the proportion > 0.5 the shrinkage estimator is preferred to the MLE; results are based on 4000 independent trials by combination of true σ and number of capture occasions, t, and 16 000 trials for each column mean

                 True σ
  t         0      0.025    0.05     0.10
  7       1.000    0.848    0.725    0.648
 15       1.000    0.946    0.854    0.753
 23       1.000    0.971    0.914    0.786
 31       1.000    0.985    0.935    0.824
mean      1.000    0.937    0.857    0.752
The final question about shrinkage versus MLE is question (7): how do these two estimators compare within a data set, on average, in terms of the by-trial ratio of their SSE, RSSE_r (formula (13))? The average of such ratios can be unstable, so for a basis of comparison we tabulated the proportion of the 500 simulation trials, by case, that had RSSE_r < 1. This is the same as the proportion of trials where SSE_r(S̃) < SSE_r(Ŝ). A proportion between 0.5 and 1 favours the shrinkage estimator, and the closer the proportion is to 1, the more favoured is the shrinkage estimator. Averaged over all 128 design points, the proportion of cases wherein SSE_r(S̃) < SSE_r(Ŝ) occurred was 0.887, and the minimum proportion over all 128 cases was 0.558. Results are strongly dependent on true σ and t (occasions), hence results averaged by these factors are given in Table 5. In terms of this measure of closeness of the estimator to true S_i, the shrinkage estimator wins handily as compared to the MLE under CJS model {S_t, p_t}.
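The tabulated quantity, the proportion of trials in which the shrinkage estimates are closer (in SSE) to the true S_i than the MLEs, can be sketched as below; the trial data in the usage example are fabricated purely for illustration.

```python
def sse(estimates, truth):
    """Sum of squared errors of a set of survival estimates."""
    return sum((e - s) ** 2 for e, s in zip(estimates, truth))

def shrinkage_win_proportion(trials):
    """Proportion of trials with SSE(shrinkage) < SSE(MLE), i.e. the
    by-trial SSE ratio below 1.

    `trials` is an iterable of (s_tilde, s_hat, s_true) triples, each
    a sequence of per-occasion survival values.
    """
    wins = sum(1 for st, sh, s in trials if sse(st, s) < sse(sh, s))
    return wins / len(trials)
```

A proportion above 0.5 over many trials indicates the shrinkage estimator is, trial by trial, the closer one more often than not.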
The final area of inference explored here is question (8): what is the performance of AIC for random effects models (formula (8)) when the set of three models contains the random effects model and the two fixed effects models, wherein S_i either varies by occasion or is constant? For all three models, capture probability estimates were allowed to be fully time varying. Use of the random effects model for S means we use the shrinkage estimator, which we expect, and results here show, to be more parsimonious than the MLE under model {S_t, p_t}. Our most striking finding quite surprised us: AIC for the random effects model was always smaller than AIC for the 'parent' fixed effects model, {S_t, p_t}. 'Always' literally means here in all 64 000 simulated trials. (We expended great effort to be sure this was not the result of any programming error.)
We can denote the three AIC_c values computed here by AIC(·) for the constant S model, AIC(RE) for the random effects model, and AIC(t) for the time-varying S model. The difference D = AIC(t) − AIC(RE) had a minimum and maximum over all 64 000 trials of 0.022 and 47.938. Because AIC always selected random effects (i.e. S̃_t) over the full time-variation MLEs (i.e. Ŝ_t), the only other issue is AIC-based selection of the constant S model, {S, p_t}, versus the random effects model. Table 6 gives such selection results: how often the random effects model was selected rather than the simple time-constant survival probability model. Average results for this selection relative frequency by σ = 0, 0.025, 0.05, 0.1 are, respectively, 0.257, 0.736, 0.921, 0.987.
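The degree to which the random effects model is selected "convincingly" can be expressed through Akaike weights. The computation below is the standard textbook formula, not code from MARK; fed the average ΔAIC values of the dramatic example discussed later in the text (13.82, 0.064, 13.71), it puts nearly all the weight on the random effects model.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights from a set of AIC (or delta-AIC) values; smaller
    is better, and the weights depend only on differences among the
    values, so AIC and delta-AIC give identical results."""
    best = min(aic_values)
    rel_likelihoods = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel_likelihoods)
    return [r / total for r in rel_likelihoods]
```

With (AIC(·), AIC(RE), AIC(t)) = (13.82, 0.064, 13.71), the weight on the random effects model exceeds 0.99, i.e. it is selected with near certainty within this three-model set.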
One motivation for wanting a random effects model for capture–recapture was the desire for a parsimonious model intermediate between models {S, p_t} and
Table 6. The proportion of simulation trials in which AIC selected the random effects model rather than the constant S model (note: AIC always selected random effects over S time-varying); results are based on 2000 independent trials for each combination of σ, t and u, 8000 trials by σ and u, and 16 000 by σ value (column means)

                       True σ
  t       u        0      0.025    0.05     0.10
  7      100     0.303    0.445    0.653    0.914
  7      400     0.374    0.760    0.911    0.992
 15      100     0.228    0.550    0.892    0.994
 15      400     0.350    0.892    0.996    1.000
 23      100     0.186    0.632    0.948    1.000
 23      400     0.255    0.954    1.000    1.000
 31      100     0.158    0.677    0.967    1.000
 31      400     0.201    0.976    1.000    1.000
mean     100     0.219    0.576    0.865    0.977
mean     400     0.295    0.896    0.977    0.998
mean             0.257    0.736    0.921    0.987
{S_t, p_t} when there clearly is time variation in survival probability, but no explainable structure to that variation. In particular, the situation arises where AIC(·) ≈ AIC(t), and yet using the simpler model, hence a single common Ŝ, means foregoing separate estimates for k − 1 estimable survival parameters. The random effects model with shrinkage estimates provides the needed intermediate model.
As expected, there are conditions where AIC(·) ≈ AIC(t) but AIC(RE) is substantially less than these other values. Thus, the random effects model gets not only selected, but convincingly so, in the sense that its Akaike weight is nearly 1 in this set of three models. In these same circumstances, when the model choice is restricted to the two fixed effects models, selection relative frequencies tend to be about 0.5 for each of those models. Table 7 gives cases from the simulation study where this scenario was common. These cases tended to be ones where process variance was about the same as mean sampling variance, σ² ≈ v̄ar(Ŝ | S). A dramatic example is the case having t = 31, E(S) = 0.6, σ = 0.025, p = 0.6, u = 400: average ΔAIC(·) = 13.82 and ΔAIC(t) = 13.71, but ΔAIC(RE) = 0.064.
Table 7. Average ΔAIC results, over 500 trials, for some cases where ΔAIC(·) ≈ ΔAIC(t) but ΔAIC(RE) is substantially lower

  t    E(S)     σ       p      u     ΔAIC(·)   ΔAIC(RE)   ΔAIC(t)
 15    0.6    0.025    0.6    400      5.33      0.084       7.24
 15    0.6    0.050    0.6    100      5.19      0.095       7.26
 15    0.8    0.025    0.8    100      5.41      0.097       7.05
 23    0.6    0.025    0.6    400      9.62      0.070      10.29
 23    0.6    0.050    0.6    100      8.95      0.124      10.79
 23    0.8    0.025    0.8    100     10.39      0.103       9.91
 31    0.6    0.025    0.6    400     13.82      0.064      13.71
 31    0.6    0.050    0.6    100     12.64      0.131      14.45
 31    0.8    0.025    0.8    100     15.13      0.077      12.93
5 Discussion
Using random effects as a basis for modelling collections of related parameters is a long-standing approach in statistics and one that can be very effective (see, for example, Link, 1999). Use of the random effects approach in capture–recapture has just started. We strongly support this new dimension of capture–recapture models (which can be likelihood, frequentist, Bayesian, or empirical Bayes). However, we also believe that the methodology needs to be better understood as to any potential pitfalls and as to its operating characteristics. While this thinking (e.g. what is an estimator's bias) is classically frequentist, it should apply as well to Bayesian approaches if our methods are to be considered scientific (see, for example, Dennis, 1996).
In the spirit of evidence, and the potential for disproof, we have herein done a small Monte Carlo evaluation of one approach to implementing random effects models. We are not aware of any other such evaluation of random effects modelling specifically in capture–recapture. The results (such as in Tables 1–7), taken as a whole, show the method performed quite well under the conditions of the study, especially when σ ≥ 0.025. Desirable performance characteristics (such as unbiased σ̂²) may be harder to achieve if σ² is 0, or quite small; this is an area where more methodology research could be done. However, we think it reasonable to believe that for a worthwhile study yielding good data, process variation, σ², will not be too small relative to average sampling variation, and it is for these conditions (of 'good data') that we need effective random effects inference methods.
The simulations generated perfect data; in particular, there was no overdispersion (see, for example, Lebreton et al., 1992; Burnham & Anderson, 1998). In practice, if there is overdispersion, as measured by a scalar often denoted by c, the estimated sampling variance–covariance matrix must be adjusted by a reliable ĉ, to be not Ê_S(W) = F⁻¹, but rather ĉF⁻¹. Franklin et al. (this issue) illustrate this practice with real data.
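A minimal sketch of the ĉ adjustment just described, scaling every element of the estimated sampling variance–covariance matrix:

```python
def adjust_for_overdispersion(vcov, c_hat):
    """Replace the model-based variance-covariance estimate F^-1 with
    c_hat * F^-1 to account for overdispersion measured by c_hat;
    `vcov` is a matrix given as a list of row lists."""
    return [[c_hat * element for element in row] for row in vcov]
```

With ĉ = 1 (no overdispersion) the matrix is unchanged; ĉ > 1 inflates all variances and covariances proportionally.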
A key design feature to focus on for 'good data' when applying random effects is simply k, the number of estimable random effects parameters (these could be sites instead of time intervals). The sample size for estimating σ² is k. Therefore, one must not have k too small; k = 5 (t = 7) is too small. Even if we knew all the underlying S_i, a sample of size 5 is too small for reliable inference about the variation of these parameters (even if we had a random sample of them, which is not required here). Inference performance here was good when k ≥ 15. We did not look at k = 10, but we would guess inference would be acceptable for k ≥ 10. The benefits (including shrinkage estimates) of random effects models become greater as the number of underlying parameters, k, increases.
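To see why k is the effective sample size for σ², consider a deliberately simplified, unweighted method-of-moments version of the variance components idea; the weighted moment equations actually used in MARK (Section 2.1) differ in detail, so treat this as a conceptual sketch only.

```python
def process_variance_moments(s_hats, sampling_vars):
    """Unweighted moments sketch: the empirical variance of the k MLEs
    estimates sigma^2 plus the average sampling variance, so subtract
    the latter and truncate at zero. Only k values inform sigma^2,
    which is why small k gives unreliable estimates."""
    k = len(s_hats)
    mean = sum(s_hats) / k
    empirical_var = sum((s - mean) ** 2 for s in s_hats) / (k - 1)
    avg_sampling_var = sum(sampling_vars) / k
    return max(0.0, empirical_var - avg_sampling_var)
```

The truncation at zero also illustrates how σ̂² = 0 arises whenever sampling variation swamps process variation, as happened with the u = 25 design points mentioned below.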
The other influential basic design feature is animal sample size. Both numbers initially released (u_i) and numbers recaptured (these depended on S_i and p_i) are important to the performance of inferences from random effects models. We have no refined design guidelines here. We initially included in our design factor for releases a level of u = 25. This proved to be too few animals to get useful random effects inferences: sampling variation was relatively so large that all too often the point estimate of σ² was 0. We also initially included levels for S and p of 0.4. We quickly found that these factor levels, even combined with u = 100, too often did not lead to useful random effects inferences, again in the sense that too often σ̂² = 0. This will not cause problems with real data; one simply will find that a model with constant S is best. But we wanted to focus on design points where
'interesting' results would occur for the random effects model, hence we eliminated some of the initial design points envisioned.
The situation where inferences from a random effects model are most advantageous seems to be when σ² is about the same as the average sampling variance, v̄ar = [Σ var(Ŝ_i | S_i)]/k (note that sampling variance is much influenced by sample sizes of animals captured and recaptured, or recovered). If one or the other variance component dominates the total variation in the MLEs Ŝ_i, then the data strongly favour either the simple model {S, p_t} (v̄ar dominates) or the general model {S_t, p_t} (σ² dominates), rather than the random effects model. However, it is not a problem, as regards inference about σ², to have large sample sizes of animals, hence small sampling variances, so that should be one's design goal. If it then turns out that sampling variance is similar to process variance, the random effects model will be quite superior to model {S_t, p_t}. Thus, in a sense, the random effects model is optimal in the 'intermediate' sample size case. As the sample size of animals increases, the random effects model converges to model {S_t, p_t}.
Our results show that the random effects inference methods evaluated here performed well. Hence, we do not further focus on the given results, but rather on issues we think are problematic. A key such issue is a boundary effect, at least under what is basically a likelihood approach. If one enforces S ≤ 1 when the unbounded MLE Ŝ exceeds 1, then standard numerical methods (as in MARK) used to get the observed information matrix fail. As a result, the estimated information matrix is incorrect for any terms concerning the Ŝ that is at the bound of 1 (and the inverse information matrix is likely wrong in all elements). Experience shows that, in this case, the resultant point estimate of σ² can be very different from what one gets when the survival parameter MLEs are allowed to be unbounded. The difference can be substantial. We felt that since the bounded case can lead to biased (low) sampling variances (and biased Ŝ), we would be better off allowing the MLEs used in the random effects formulae to be the unbiased ones that arise by using an identity link for S. By so doing, we have found σ̂² to be unbiased in most cases examined here (Table 1 or 2). Because the simulations are very time-consuming, we did not also run them for the bounded case.
Note that we do not suggest routinely accepting final inferences that include survival estimates exceeding 1. In fact, the shrinkage estimates will generally not exceed 1, so using S̃_i rather than Ŝ_i will be the needed improved inference. However, to get to this final inference it may be desirable to pass through an imaginary space (S > 1), just as imaginary numbers can facilitate real solutions to real problems (or just as conceptualizing a parameter as a random variable allows the power and beauty of the Bayesian approach); models only need to possess utility, not full reality.
It is common to use the logit link function when fitting, as here, generalized linear models in which the parameters are probabilities. Moreover, it is common then to implement the random effects on the logit(S) scale, rather than on S directly. We did not do this for two reasons. First, to do so for this random effects modelling would, in effect, enforce the constraint S ≤ 1 on the MLEs. As noted, boundary effect problems then arise with the empirical information matrix. Secondly, and more importantly, biologists want, and need, σ̂ on the scale of survival probability, S (see, for example, White, 2000), to use in population modelling efforts. Basically, we need inferences about survival much more than we need inferences about logit(survival). This latter reason is the important one, since with good data we rarely observe an unbounded MLE of S that exceeds 1. (Note: MARK will do random effects on logit parameters.)
This boundary effect problem can arise unexpectedly, as with the ring recovery models parameterized as S and r (Seber, 1970; there, λ ≡ r), as opposed to the S and f of Brownie et al. (1985). In this formulation, the likelihood for every multinomial cell involves terms in 1 − S. This effectively constrains S ≤ 1 but can cause undesired boundary effects.
Might the full, proper Bayesian Markov chain Monte Carlo approach to random effects (see, for example, Brooks et al., this issue; Royle & Link, this issue) eliminate such boundary effect biases? We do not know; we doubt it (of course, we are taking this from a frequentist operating-characteristic viewpoint, which a Bayesian might disavow). We have programmed the simple MCMC random effects analysis and looked at some real data wherein the unconstrained MLE Ŝ exceeds 1. It is easy in the MCMC analysis to allow S to have its distribution over an interval such as 0 to 2 (rather than 0 to 1). We did so, and there is a strong effect of the upper bound on the point estimate (and the entire posterior distribution) for σ², and for that particular S. This has implications about the operating characteristics of Bayesian random effects inference.
Since we think the operating characteristics of Bayesian methods should be documented, and we have software to do it, why did we not do the study? The simulations reported here took over 4 months of CPU time on what was then a state-of-the-art PC. The Bayesian MCMC analysis of a data set took about 100 times as much CPU time as the simpler MARK-implemented analyses. Unless we can greatly speed up the MCMC analysis on a single computer, or use many computers in parallel, we might be looking at years to do the corresponding Bayesian simulation study.
Another issue to be aware of, as regards the parameter σ², is the matter of distinctly unequal, rather than equal-length, time intervals (such as 1-month periods in summer, 2 months in spring and fall, 4 months in winter). Let time interval i have length Δ_i. Then we should parameterize the model as S_i = (c_i)^Δ_i, where now each survival probability c_i = S_i^(1/Δ_i) is on the same unit-time basis. It may then make biological sense to consider parameters that are a mean and variation for c_1, ..., c_k. But this may just as well not make sense, because the time intervals are intrinsically not comparable, as they may be in very different parts of the annual cycle. It becomes a subject-matter judgement as to whether random effects analysis will be meaningful with unequal time intervals. It might be necessary to use a more general fixed effects structural model, rather than E(c_i) = μ, to allow for explainable temporal effects on survival (but for which we have no measurable explanatory covariates).
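The reparameterization above amounts to the following conversion; for example, survival of 0.81 over an interval of length 2 corresponds to a unit-time survival of 0.9.

```python
def unit_time_survival(s_interval, delta):
    """Per-unit-time survival c_i from interval survival S_i over an
    interval of length delta_i: c_i = S_i ** (1/delta_i), so that
    S_i = c_i ** delta_i."""
    return s_interval ** (1.0 / delta)
```

Putting all intervals on a common unit-time basis is what allows a single mean and variance to be contemplated for c_1, ..., c_k.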
The name 'random effects' can be misleading, in that a person may think it means that the underlying years or areas (when spatial variation is considered, rather than temporal) must be selected at random. This is neither true nor possible for a set of contiguous years. 'Variance components' is a better name, in that at the heart of the method is the separation of process and sampling variance components. The issue of what inferential meaning we can ascribe to σ̂² is indeed tied to design and subject-matter considerations. However, the shrinkage estimators do not depend on any inferential interpretation of σ̂²; rather, they can always be considered as improvements, in an MSE sense, over the MLEs based on fully time-varying S_t. The random effects model only requires that the conceptual residuals, S_i − x_i′β, are exchangeable. Hence, these residuals should appear like an iid sample; there should be no recoverable structural information left in them. There is no required distributional assumption, such as normality. Indeed, in our simulation, the S_i were
beta-distributed. The formulae used here (Section 2.1) are based only on first and second moments, except there is an assumption that RSS(σ²) (see text near formula (2)) has a central chi-square distribution. That assumption seemed to be true enough to lead to good inference results here about σ².
Consider the simple case examined in our simulations: E(S) = μ. Given the weak assumption of exchangeability, we are just assuming the time variation in the S_i appears as if it were totally random. Thus, a sufficient summary of this variation can be the single parameter σ². Essentially, all the information in S_1 to S_k can be collapsed into just two parameters, μ and σ². This is exactly what the random effects model does, but in a heuristic sense, via the intermediary values of S̃_i. And it is correct to say that the random effects model is intermediate between the models with S constant and S unconstrained time-varying, regardless of issues about whether the underlying S_i are a random sample of any sort. This latter issue is relevant to the meaning of any inference one wants to make about σ² (and μ), but these become context and subject-matter issues. In general, since we cannot select years at random in capture–recapture studies, we have no option except to use such an estimated σ² for some sort of cautious, but assumption-based, inference about temporal variation in survival probabilities.
What of the future of multiple and generalized random effects in capture–recapture models, such as applied to males and females jointly with correlated random variation? It is bright, but probably not to be much found in moment-type equations such as those in Section 2.2. We need easy ways to embed general and flexible random effects into extant capture–recapture models, without each time deriving estimators. There are two candidate approaches: Bayesian hierarchical modelling, and likelihood via h-likelihood (Lee & Nelder, 1996; 'h' stands for hierarchical). The Bayesian approach can be easily used (and has been) to produce results (see, for example, Royle & Link, this issue), but it is computationally intensive, which makes simulation evaluation of the method quite demanding. We think some such evaluations must be done, because we have found random effects methods have pitfalls; we do not expect the Bayesian approach as such will sidestep these pitfalls. As for model selection, an AIC-like selection criterion does exist for Bayesian hierarchical models: DIC, the deviance information criterion (see Spiegelhalter et al., 1998). Finally, theoretically, the h-likelihood approach works, but it has not yet been tried for capture–recapture. It has the advantages of being much faster computationally and of potentially fitting easily into program MARK.
Meanwhile, this study provides the needed evidence that the simple but useful random effects model implemented in MARK can perform well.
Acknowledgements
The authors appreciate helpful comments on an earlier draft from Drs David Anderson, Doug Johnson, Byron Morgan, J. Andy Royle and an anonymous reviewer.
REFERENCES
Barker, R. J. (1997) Joint modeling of live-recapture, tag-resighting and tag-recovery data, Biometrics, 53, pp. 666–677.
Brooks, S. P., Catchpole, E. A. & Morgan, B. J. T. (2002) Bayesian methods for analyzing ringing data, Journal of Applied Statistics, this issue.
Brownie, C., Anderson, D. R., Burnham, K. P. & Robson, D. S. (1985) Statistical Inference from Band Recovery Data: a Handbook, 2nd edn (Washington, DC, US Fish and Wildlife Service Resource Publication 156).
Burnham, K. P. (1991) On a unified theory for release-resampling of animal populations. In: M. T. Chao & P. E. Cheng (Eds), Proceedings of the 1990 Taipei Symposium in Statistics, pp. 11–35 (Taipei, Taiwan, Institute of Statistical Science).
Burnham, K. P. & Anderson, D. R. (1998) Model Selection and Inference: a Practical Information-Theoretic Approach (New York, Springer-Verlag).
Burnham, K. P., Anderson, D. R., White, G. C., Brownie, C. & Pollock, K. H. (1987) Design and Analysis of Fish Survival Experiments Based on Release-Recapture Data (American Fisheries Society Monograph 5).
Burnham, K. P., Anderson, D. R. & White, G. C. (1994) Evaluation of the Kullback–Leibler discrepancy for model selection in open population capture–recapture models, Biometrical Journal, 36, pp. 299–315.
Burnham, K. P., Anderson, D. R. & White, G. C. (1995) Selection among open population capture–recapture models when capture probabilities are heterogeneous, Journal of Applied Statistics, 22, pp. 611–624.
Carlin, B. P. & Louis, T. A. (1996) Bayes and Empirical Bayes Methods for Data Analysis (London, Chapman & Hall).
Casella, G. (1985) An introduction to empirical Bayes data analysis, The American Statistician, 39, pp. 83–87.
Dennis, B. (1996) Discussion: should ecologists become Bayesians?, Ecological Applications, 6, pp. 1095–1103.
Efron, B. & Morris, C. (1975) Data analysis using Stein's estimator and its generalizations, Journal of the American Statistical Association, 70, pp. 311–319.
Franklin, A. B., Anderson, D. R. & Burnham, K. P. (2002) Estimation of long-term trends and variation in avian survival probabilities using random effects models, Journal of Applied Statistics, this issue.
Hastie, T. & Tibshirani, R. (1990) Generalized Additive Models (London, Chapman & Hall).
Hurvich, C. M. & Simonoff, J. S. (1998) Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion, Journal of the Royal Statistical Society, Series B, 60, pp. 271–293.
Johnson, D. H. (1981) Improved population estimates through the use of auxiliary information, Studies in Avian Biology, 6, pp. 436–440.
Johnson, D. H. (1989) An empirical Bayes approach to analyzing recurrent animal surveys, Ecology, 70, pp. 945–952.
Lebreton, J. D., Burnham, K. P., Clobert, J. & Anderson, D. R. (1992) Modeling survival and testing biological hypotheses using marked animals: case studies and recent advances, Ecological Monographs, 62, pp. 67–118.
Lee, Y. & Nelder, J. A. (1996) Hierarchical generalized linear models, Journal of the Royal Statistical Society, Series B, 58, pp. 619–678.
Link, W. A. (1999) Modeling pattern in collections of parameters, Journal of Wildlife Management, 63, pp. 1017–1027.
Link, W. A. & Nichols, J. D. (1994) On the importance of sampling variation to investigations of temporal variation in animal population size, Oikos, 69, pp. 539–544.
Longford, N. T. (1993) Random Coefficient Models (New York, Oxford University Press).
Louis, T. A. (1984) Estimating a population of parameter values using Bayes and empirical Bayes methods, Journal of the American Statistical Association, 79, pp. 393–398.
Morris, C. N. (1983) Parametric empirical Bayes inference: theory and applications, Journal of the American Statistical Association, 78, pp. 47–65.
Royle, J. A. & Link, W. A. (2002) Random effects and shrinkage estimation in capture–recapture methods, Journal of Applied Statistics, this issue.
SAS Institute Inc (1985) SAS Language Guide for Personal Computers, Version 6 edn (Cary, North Carolina, SAS Institute).
Schott, J. R. (1997) Matrix Analysis for Statistics (New York, Wiley).
Schwarz, C. J. & Seber, G. A. F. (1999) Estimating animal abundance: review III, Statistical Science, 14, pp. 427–456.
Searle, S. R., Casella, G. & McCulloch, C. E. (1992) Variance Components (New York, Wiley).
Seber, G. A. F. (1970) Estimating time-specific survival and reporting rates for adult birds from band returns, Biometrika, 57, pp. 313–318.
Shi, P. & Tsai, C.-L. (1998) A note on the unification of the Akaike information criterion, Journal of the Royal Statistical Society, Series B, 60, pp. 551–558.
Spiegelhalter, D. J., Best, N. G. & Carlin, B. P. (1998) Bayesian deviance, the effective number of parameters and the comparison of arbitrarily complex models. Technical Report, MRC Biostatistics Unit, Cambridge, UK.
Ver Hoef, J. M. (1996) Parametric empirical Bayes methods for ecological applications, Ecological Applications, 6, pp. 1047–1055.
White, G. C. (2000) Population viability analysis: data requirements and essential analysis. In: L. Boitani & T. K. Fuller (Eds), Research Techniques in Animal Ecology: Controversies and Consequences, pp. 288–331 (New York, Columbia University Press).
White, G. C. & Burnham, K. P. (1999) Program MARK: survival estimation from populations of marked animals, Bird Study, 46 (Supplement), pp. 120–138.
White, G. C., Burnham, K. P. & Anderson, D. R. (2002) Advanced features of program MARK. In: R. Fields (Ed.), Integrating People and Wildlife for a Sustainable Future, Proceedings of the Second International Wildlife Management Congress (Bethesda, Maryland, The Wildlife Society).