Ferguson, R.A. and J.G. Fryer (1973), "Jackknifing maximum likelihood estimates for a special class of distributions."

JACKKNIFING MAXIMUM LIKELIHOOD ESTIMATES FOR A
SPECIAL CLASS OF DISTRIBUTIONS
By
R. A. Ferguson and J. G. Fryer
Department of Biostatistics
University of North Carolina at Chapel Hill
and
University of Exeter, England
Institute of Statistics Mimeo Series No. 892
SEPTEMBER 1973
The object of this note is to show how series for the lower moments of a maximum likelihood estimate and its jackknifed form can be taken to a higher order than usual when the underlying distribution belongs to a subclass of the usual Koopman exponential family. The moments are available to order n⁻² at present, but in this special case we extend them to cover the n⁻⁴ term for the maximum likelihood estimate and the n⁻³ term for the jackknife. The series expansions are then illustrated using a single-parameter special beta distribution. In this example it turns out that we do not usually need these extra terms for the first two moments, but that they greatly improve the third and fourth.
1. Introduction

In one of the illustrations used in Ferguson and Fryer (1973), moment calculations to the second order for the maximum likelihood estimate and its jackknifed form turned out to be particularly simple. This was largely because the second derivative of the log-likelihood function did not involve the basic random variable. For this type of distribution it seemed likely that we could extend the moment formulae for the two estimates to a higher order and so get a better approximation to the exact values of the moments. The object of this note is to report on our findings. We shall use exactly the same set-up here as we did in our previous work, and the reader is referred to it for the details in order to save space.
Taking ∂² log f(x;θ)/∂θ² to be a function of θ alone implies that in the region of non-zero density f(x;θ) has the form

    f(x;θ) = exp{A(θ) + θB(x) + C(x)}                                        (1)

and so belongs to a sub-class of the usual Koopman exponential family.
The special beta density that we used before, and that we will be using here as an illustration, is given by

    f(x;θ) = (θ + 1)(θ + 2) x^θ (1 - x)        for 0 < x < 1 and θ > 1

and so can be written as

    f(x;θ) = exp{log[(θ + 1)(θ + 2)] + θ log x + log(1 - x)}                 (2)

in the appropriate region.
We note that in this case (1/n) Σᵢ log xᵢ = log G, say, is a single sufficient statistic for θ and that the maximum likelihood estimate

    θ̂ = -[2 + 3 log G + {4 + (log G)²}^½] / (2 log G)                        (3)

is also sufficient.
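Since the density above is exactly that of a Beta(θ + 1, 2) law, a sample is easy to draw and (3) is easy to check numerically. The sketch below (our illustration, not part of the original note; the sample size and seed are arbitrary) computes the closed-form estimate from log G and confirms that it maximises the log-likelihood:

```python
import math
import random

def loglik(theta, xs):
    # log-likelihood of f(x; theta) = (theta+1)(theta+2) x^theta (1-x)
    return (len(xs) * math.log((theta + 1) * (theta + 2))
            + theta * sum(math.log(x) for x in xs)
            + sum(math.log(1 - x) for x in xs))

def theta_mle(xs):
    # closed form (3): theta-hat = -[2 + 3 log G + sqrt(4 + (log G)^2)] / (2 log G)
    logG = sum(math.log(x) for x in xs) / len(xs)  # log of the geometric mean (< 0)
    return -(2 + 3 * logG + math.sqrt(4 + logG * logG)) / (2 * logG)

random.seed(1)
theta = 2.0
xs = [random.betavariate(theta + 1, 2) for _ in range(5000)]  # f is the Beta(theta+1, 2) density
th = theta_mle(xs)
```

With θ = 2 the estimate lands close to 2, and nudging θ̂ in either direction lowers the log-likelihood, as it should at a maximum.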
On the face of it there is evidently a case to be made for using the maximum likelihood estimate in such cases, which justifies taking our moment expansions further. The other question we have to ask ourselves is whether there is any case to be made for jackknifing a sufficient statistic. The problem is that jackknifing will destroy the sufficiency property as a rule, and this is not a thing that we would relish doing in general. However, we feel that it may be a reasonable thing to do when we are unable to find a monotone function of a sufficient statistic with acceptable levels of bias and variance, and there must surely be cases when this is so. This provides the motivation for extending the moments of the jackknifed maximum likelihood estimate to a higher order. The search for a suitable function of the sufficient statistic for this class of distributions is an interesting problem that is beyond the scope of this note. We shall concern ourselves here with the much simpler task of comparing the lower moments of θ̂, a sufficient statistic with known asymptotic properties, with those of its jackknifed form θ̂_p.

2. Moment Formulae for θ̂ and θ̂_p

Expansions for (θ̂ - θ) and (θ̂_p - θ) given in Ferguson and Fryer (1973) allow us to calculate their first four (series) moments up to order n⁻².
Beyond that the algebra is unthinkable in general, but additional terms are within reach for this special Koopman class. With considerable effort we have managed to extend the maximum likelihood results to order n⁻⁴, and the moments of the jackknifed estimate generally to order n⁻³ (n⁻⁴ was too demanding). In order to do this we have to take our basic expansions for θ̂ and θ̂_p to a higher order. Previously we used

    (θ̂ - θ) = Λ/n + ⋯ + O(n^(-5/2))                                          (4)

but the n⁻⁴ results for both estimates call for the next four terms (at most) in the expansion, leaving a remainder of order n^(-9/2).
There are many expectations to be evaluated on the way to the final forms for the moments (for example, E[Λ²(ΣΛⱼ)²] and E(Λ⁸)), and many inter-relationships to exploit (like K = 3I² - ⋯), but to save space we will not go into the details. There is some additional notation needed though if we are to quote the final results in a compact form. In fact we need symbols for higher derivatives of log f, and use M, N, P and Q to denote

    ∂⁶ log f/∂θ⁶,   ∂⁷ log f/∂θ⁷,   ∂⁸ log f/∂θ⁸   and   ∂⁹ log f/∂θ⁹

respectively. With this and previous notation the results for the maximum likelihood estimate, θ̂, reduce to
    E_se(θ̂ - θ) = -J/(2nI²) + (1/n²){⋯} + (1/n³){⋯} + (1/n⁴){⋯} + O(n⁻⁵)     (5)

    E_se[(θ̂ - θ)²] = 1/(nI) + (1/n²){⋯} + (1/n³){⋯} + (1/n⁴){⋯} + O(n⁻⁵)     (6)

with corresponding expansions for the third and fourth moments,

    E_se[(θ̂ - θ)³] = ⋯ + O(n⁻⁵)                                               (7)

    E_se[(θ̂ - θ)⁴] = ⋯ + O(n⁻⁵)                                               (8)

the coefficients at each order being rational multiples of products of J, K, L, M, N, P and Q over powers of I. [The detailed coefficients of (5)-(8) are illegible in this copy.]
The fact that these formulae were derived by both of us working independently almost certainly means that they are correct. However, we also checked them out using a simple example. We considered the problem of estimating θ from an N(0, 1/θ) distribution, for which of course θ̂ = n/Σxᵢ². From the theory of the χ² distribution we know that E(θ̂) = nθ/(n - 2) exactly, and so from this we get

    E_se(θ̂ - θ) = 2θ/n + 4θ/n² + 8θ/n³ + 16θ/n⁴ + O(n⁻⁵).

Evaluating E_se(θ̂ - θ) from (5) gives precisely the same answer, despite the fact that the individual elements like P and Q are non-zero (in fact P = -2520/θ⁸ and Q = 20160/θ⁹). The other three moment formulae check out in a similar way.
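This exact result is also easy to confirm by simulation. The sketch below (ours, with arbitrary choices of θ, n, replication count and seed) draws repeated N(0, 1/θ) samples and compares the average of θ̂ = n/Σxᵢ² with nθ/(n - 2):

```python
import math
import random

random.seed(4)
theta, n, reps = 2.0, 20, 50000
sigma = 1.0 / math.sqrt(theta)          # N(0, 1/theta) standard deviation
total = 0.0
for _ in range(reps):
    s = sum(random.gauss(0.0, sigma) ** 2 for _ in range(n))
    total += n / s                      # the maximum likelihood estimate of theta
mean_mle = total / reps
exact = n * theta / (n - 2)             # chi-square theory: E(theta-hat) = n*theta/(n - 2)
```

The simulated mean sits near nθ/(n - 2) = 2.22 rather than θ = 2, which is exactly the upward bias the series reproduces term by term.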
Summarising now the moment expansions for the jackknife when r is fixed, we find that

    E_se(θ̂_p - θ) = (1/n²){⋯} + (1/n³){⋯} + (1/n⁴){⋯} + O(n⁻⁵)               (9)

    E_se[(θ̂_p - θ)²] = 1/(nI) + (1/n²){⋯} + (1/n³){⋯} + O(n⁻⁴)               (10)

    E_se[(θ̂_p - θ)³] = ⋯ + O(n⁻⁴)                                             (11)

    E_se[(θ̂_p - θ)⁴] = ⋯ + O(n⁻⁴)                                             (12)

[The detailed coefficients of (9)-(12) are illegible in this copy.] Again, both of us produced these results working independently.
Note the relationship between the bias expansions for θ̂ and θ̂_p. The coefficient of n⁻ʳ in the bias of θ̂_p is the same as that of n⁻ʳ in the bias of θ̂ except that it is multiplied by a function of r. For example, in the case of n⁻⁴ the function of r is simply r(3r² - 3r + 1)/(r - 1)³. There are no such obvious relationships for the other three moments. What can be said of them is that to order n⁻³ all of the distinct elements in each series result for θ̂ (like J²K and J², each over the appropriate powers of n and I) are present in the corresponding results for θ̂_p, but take different coefficients. When r is set equal to n and n is not too small, the absolute values of these coefficients for θ̂_p are invariably smaller than those for θ̂, often very much smaller, which augurs well for the second and third moments of θ̂_p at least.
In our earlier work we noted that whether we fixed r or s for θ̂_p did make a difference to the final bias and variance formulae to order n⁻², though the actual value that we used for fixed s did not. However, the value used for s does appear in the formulae for θ̂_p when the order is raised to n⁻³ and the density belongs to this special Koopman class (presumably in general too). On retracing the algebraic steps using fixed s instead of fixed r, we find for example that

    E_se[(θ̂_p - θ)²] = 1/(nI) + ⋯                                             (13)

[detailed coefficients illegible in this copy], which incidentally means that fixing s = 1 minimises the mean squared error to order n⁻³.
However, comparing this result with that for fixed r at (10), we find that there is no need for a separate derivation. The reader will note that if we substitute n/s for r in (10), take s to be fixed and retain terms of up to n⁻³, then we get the result for fixed s at (13), and this is what we would have hoped for. All of the other formulae for fixed s can be deduced in the same way.
Finally we have some extended series results for the moments of two of the standard estimates of the variances of θ̂ and θ̂_p, which we have previously denoted by V̂₁ and V̂₂ (Tukey's estimate S_T² proved much less tractable, so we abandoned it). To order n⁻⁴ we have concluded that in this special Koopman class

    E_se(V̂₁) = 1/(nI) + (1/n²){⋯} + (1/n³){⋯} + (1/n⁴){⋯}                    (14)

with

    E_se[(V̂₁ - 1/nI)²] = ⋯ + O(n⁻⁵).

The formula for E_se(V̂₂) when r is fixed on the other hand has more complicated coefficients which generally depend on r. We think that there is very little point in condensing the following form for it any further:

    E_se(V̂₂) = ⋯                                                              (15)

together with a companion expansion (16). In this case

    E_se[(V̂₂ - 1/nI)²] = ⋯ + O(n⁻⁵)                                           (17)

[The detailed coefficients of (14)-(17) are illegible in this copy.]
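The original forms of V̂₁ and V̂₂ are in our earlier paper; as a point of comparison, the simplest variance approximation for the special beta density is the first-order term 1/(nI) with I = 1/(θ + 1)² + 1/(θ + 2)². The sketch below (ours; the choice 1/(nI) is illustrative and is not necessarily either V̂₁ or V̂₂, and θ, n, the replication count and seed are arbitrary) checks it against the simulated variance of θ̂:

```python
import math
import random

def fisher_info(theta):
    # per-observation information for the special beta density
    return 1.0 / (theta + 1) ** 2 + 1.0 / (theta + 2) ** 2

def theta_mle(xs):
    # closed form (3) from log G, the mean of the logs
    logG = sum(math.log(x) for x in xs) / len(xs)
    return -(2 + 3 * logG + math.sqrt(4 + logG * logG)) / (2 * logG)

random.seed(3)
theta, n, reps = 2.0, 50, 20000
ests = [theta_mle([random.betavariate(theta + 1, 2) for _ in range(n)])
        for _ in range(reps)]
mean_est = sum(ests) / reps
sim_var = sum((e - mean_est) ** 2 for e in ests) / reps
first_order = 1.0 / (n * fisher_info(theta))   # = 0.1152 at theta = 2, n = 50
```

At n = 50 the simulated variance comes out only a few per cent above 1/(nI), which is why the higher-order refinements matter mainly for small n.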
3. An Illustration

First we look at the series for the bias in θ̂ and θ̂_p when the sample is drawn from our special beta distribution. The value of the coefficient of n⁻ⁱ in the bias of θ̂, aᵢ say, is set out in table 1 for each i ≤ 4 and θ = 0, 1, ⋯, 4. These coefficients are not large, damp out quickly as we raise the value of i, and increase in size with the value of θ. In most cases the ratio aᵢ₊₁/aᵢ is less than ½. Plotting the function aᵢ(θ) for several values of θ shows that it is very nearly linear for each value of i (this can actually be proven algebraically for large values of θ). For fixed r the corresponding coefficients for the jackknife, {aᵢ*}, are given by

    a₁* = 0,   a₂* = -ra₂/(r - 1),   a₃* = -r(2r - 1)a₃/(r - 1)²   and   a₄* = -r(3r² - 3r + 1)a₄/(r - 1)³,

and all of them are decreasing functions of r. Evidently aᵢ* is largest when r is 2.
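Because the derivatives of log f with respect to θ do not involve x here, the leading coefficients can be checked by direct arithmetic. The sketch below (ours) uses I = 1/(θ+1)² + 1/(θ+2)², the constant third derivative J = 2[1/(θ+1)³ + 1/(θ+2)³], and a first-order bias coefficient a₁ = J/(2I²); that last expression is an assumption about the notation, but it reproduces the first column of table 2 exactly, and the identity γ₂ = β₂ + a₁² (mean squared error = variance + squared bias at order n⁻²) ties tables 2 and 3 together:

```python
def info(t):
    # I = 1/(t+1)^2 + 1/(t+2)^2, the per-observation information
    return 1.0 / (t + 1) ** 2 + 1.0 / (t + 2) ** 2

def third(t):
    # J = 2[1/(t+1)^3 + 1/(t+2)^3], the (constant) third derivative of log f
    return 2.0 * (1.0 / (t + 1) ** 3 + 1.0 / (t + 2) ** 3)

def a1(t):
    # assumed first-order bias coefficient: bias of the MLE ~ a1/n
    return third(t) / (2.0 * info(t) ** 2)
```

For example a₁(0)/6 = 0.1200, a₁(2)/6 = 0.2912 and a₁(4)/6 = 0.4582 match the n = 6 column of table 2, and 1/I(θ) matches the β₁ column of table 3.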
Some numerical values for the bias in θ̂ and θ̂_p are given in table 2 for selected values of θ and n. In each case we fix r = n for θ̂_p, and this is optimal. They show that only when n is very small would we need to go beyond the approximation to order n⁻² (previous simulations support the levels of bias indicated in the table), and that θ̂_p is virtually unbiased for n ≥ 10, as we have noted before. The series results for fixed s = 1 are very similar to those for fixed r = n, and optimal for choice of s.
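Table 2's pattern (an upward bias in θ̂ of order n⁻¹ that the jackknife with r = n all but removes) can be reproduced by simulation. The sketch below (ours; θ = 2, n = 10, the replication count and seed are arbitrary choices) computes the delete-one jackknife from the closed-form estimate:

```python
import math
import random

def theta_hat(logsum, m):
    # closed-form MLE (3) applied to an m-point sample with the given sum of log x
    u = logsum / m                       # log G for the subsample (always negative)
    return -(2 + 3 * u + math.sqrt(4 + u * u)) / (2 * u)

random.seed(2)
theta, n, reps = 2.0, 10, 20000
bias_mle = 0.0
bias_jack = 0.0
for _ in range(reps):
    logs = [math.log(random.betavariate(theta + 1, 2)) for _ in range(n)]
    s = sum(logs)
    t_full = theta_hat(s, n)
    # delete-one jackknife (r = n): n*theta-hat minus (n-1) times the mean leave-one-out MLE
    t_loo = sum(theta_hat(s - l, n - 1) for l in logs) / n
    t_jack = n * t_full - (n - 1) * t_loo
    bias_mle += t_full - theta
    bias_jack += t_jack - theta
bias_mle /= reps
bias_jack /= reps
```

The simulated biases come out in line with the order-n⁻² entries of table 2: a clearly positive bias for θ̂ and a much smaller one for θ̂_p.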
Moving on now to the second order moments, we find the value of the coefficient β_m in V_se(θ̂), and γ_m the corresponding coefficient for E_se(θ̂ - θ)², set out in table 3 for m ≤ 4 and selected values of θ. These coefficients (which appear to be roughly quadratic in θ for each m) certainly increase with m, but not dramatically so, and this leads us to expect fairly stable series for second moments unless n is minute (or θ massive). Table 4, which contains some series calculations, shows this conjecture to be true. There is little difference between the second and fourth order results when n > 10, and the second order refinement for θ̂ is scarcely needed for n > 50. The results for θ̂_p with fixed r = n show a similar trend, and here the Cramér-Rao bound is approached even faster. Jackknifing evidently leads to useful gains in mean squared error here, especially when n is relatively small, as we have remarked before. Again, like the bias, the size of the second order moments increases with θ. Note that the series results for fixed s = 1 are optimal for choice of s and very similar to those for fixed r = n. Simulated values that we have outlined in our earlier work seem to be very much in line with values from these higher order series even when n is small.
The series coefficients for the third order moments are much larger than those for the bias and variance, and furthermore increase much faster, which suggests less stable expansions. However, the addition of the n⁻³ term to the n⁻² term in E_se(θ̂ - θ)³ brings the series results much closer to the simulated values (described in our earlier work), and the addition of the n⁻⁴ term closer still, as we can see from table 5. From this table (where we have fixed r = n) it is equally clear that the addition of the term of order n⁻³ effectively closes the gap between the simulated value for E(θ̂_p - θ)³ and the series result to order n⁻². This reinforces our previous conclusion that E(θ̂ - θ)³ ≈ 2E(θ̂_p - θ)³. Again, fixing r = n minimises the third order moment and gives very similar answers to those for fixed s = 1.

Just as we might expect, the series approximations for the fourth order moment are less satisfactory than those for the third, and this can be seen in table 6 where we have fixed r = n. Adding the terms of order n⁻³ and n⁻⁴ to that of order n⁻² certainly narrows the gap successively between E_se(θ̂ - θ)⁴ and the simulated value, but it looks as if the term in n⁻⁵ will be needed to bridge the remainder for the lower values of n. It seems also that at least one further term in E_se(θ̂_p - θ)⁴ is necessary to match it with its simulated result, but judging by previous patterns it may be that the term in n⁻⁵ is less critical than it is for θ̂.
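The conclusion E(θ̂ - θ)³ ≈ 2E(θ̂_p - θ)³ can likewise be spot-checked by simulation. The sketch below is ours; θ = 2, n = 20 and the replication count are arbitrary, and third moments converge slowly, so the check is deliberately loose:

```python
import math
import random

def theta_hat(logsum, m):
    # closed-form MLE (3) from the sum of logs of an m-point sample
    u = logsum / m
    return -(2 + 3 * u + math.sqrt(4 + u * u)) / (2 * u)

random.seed(5)
theta, n, reps = 2.0, 20, 20000
m3_mle = 0.0
m3_jack = 0.0
for _ in range(reps):
    logs = [math.log(random.betavariate(theta + 1, 2)) for _ in range(n)]
    s = sum(logs)
    t_full = theta_hat(s, n)
    t_loo = sum(theta_hat(s - l, n - 1) for l in logs) / n
    t_jack = n * t_full - (n - 1) * t_loo       # delete-one jackknife, r = n
    m3_mle += (t_full - theta) ** 3             # third moment about theta
    m3_jack += (t_jack - theta) ** 3
m3_mle /= reps
m3_jack /= reps
ratio = m3_mle / m3_jack
```

Both third moments come out positive, with the θ̂ moment roughly twice the θ̂_p moment, as the tables suggest.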
Summarising then, this special beta distribution is a case where the bias and mean squared error of θ̂ and θ̂_p can often be adequately approximated by the series formula to order n⁻². When n is very small, the addition of the term of order n⁻³ will usually give an acceptable approximation to the exact value of the moment. The third order moment is often considerably improved by the addition of the term in n⁻³, though for very small samples the term in n⁻⁴ may also be needed. As for the fourth moment, the approximation to order n⁻³ will usually be inadequate unless θ is small and n large. The addition of the n⁻⁴ term will always help, but it seems that for θ̂ at least the n⁻⁵ term is needed in the more extreme cases. Whether the improvement that the jackknife brings in mean squared error and other moments is worth more than the sufficiency that it destroys, the reader will have already decided for her(him)self.
Acknowledgement

Much of this work was carried out whilst we were visitors at the Biostatistics Department, Chapel Hill, North Carolina. We would like to record here our gratitude for support from Institute of General Medical Sciences Grant No. G.M.-12868 whilst we were there. In addition, one of us (J.G. Fryer) would like to thank the Fels Research Institute, Yellow Springs, Ohio for support during the summer of 1972. Some of his work was carried out whilst he was a Visiting Research Fellow at that Institute.
References

Ferguson, R.A. and Fryer, J.G. (1973), Jackknifing maximum likelihood estimates in the case of a single parameter. North Carolina Institute of Statistics Mimeo Series No.
Table 2

Some Numerical Comparisons of the Bias in θ̂ and θ̂_p

                        Bias to order n⁻¹    Bias to order n⁻²     Bias to order n⁻³     Bias to order n⁻⁴
Value   Sample
of θ    size (n)         θ̂       θ̂_p          θ̂        θ̂_p          θ̂        θ̂_p          θ̂        θ̂_p

0          6           0.1200    0          0.1293   -0.0117      0.1305   -0.0138      0.1306   -0.0141
          10           0.0720    0          0.0755   -0.0039      0.0757   -0.0043      0.0757   -0.0043
          20           0.0360    0          0.0369   -0.0009      0.0369   -0.0010      0.0369   -0.0010
          50           0.0144    0          0.0145   -0.0001      0.0145   -0.0001      0.0145   -0.0001

2          6           0.2912    0          0.3150   -0.0286      0.3170   -0.0339      0.3172   -0.0347
          10           0.1747    0          0.1833   -0.0095      0.1837   -0.0106      0.1838   -0.0106
          20           0.0874    0          0.0895   -0.0023      0.0896   -0.0024      0.0896   -0.0024
          50           0.0349    0          0.0353   -0.0004      0.0353   -0.0004      0.0353   -0.0004

4          6           0.4582    0          0.4961   -0.04(?)     0.4993   -0.0539      0.4995   -0.0550
          10           0.2749    0          0.2886   -0.0152      0.2892   -0.0168      0.2893   -0.0169
          20           0.1375    0          0.1409   -0.0036      0.1410   -0.0038      0.1410   -0.0038
          50           0.0550    0          0.0555   -0.0006      0.0555   -0.0006      0.0555   -0.0006

((?) marks a value that is illegible in this copy.)
Table 3

The coefficients of n⁻ᵐ in V_se(θ̂) and E_se(θ̂ - θ)²

Value of θ    β₁ = γ₁       β₂        γ₂        β₃        γ₃        β₄        γ₄

0              0.8000     1.9200    2.4384    2.6851    3.1910    3.2240    3.5880
1              2.7692     5.9040    7.4481    8.1788    9.6766    9.73(?)  10.8695
2              5.7600    11.8923   14.9450   16.4332   19.4309   19.4821   21.7414
3              9.7561    19.8862   24.9427   27.4352   32.4335   32.4777   36.2392
4             14.7541    29.8827   37.4412   41.1861   48.6848   48.72(?)  54.3628

Note: the β's are for V_se(θ̂) and the γ's for E_se(θ̂ - θ)². ((?) marks digits that are illegible in this copy.)
Table 4

The Second Order Moments of θ̂ and θ̂_p to Various Orders

                       Variance  Variance to n⁻²     Variance to n⁻³     Variance   M.S.E.   M.S.E.   M.S.E.
Value   Sample         to n⁻¹                                            to n⁻⁴     to n⁻²   to n⁻³   to n⁻⁴
of θ    size (n)                  θ̂        θ̂_p        θ̂        θ̂_p        θ̂          θ̂        θ̂        θ̂

0          6           0.1333   0.1867   0.1679     0.1991   0.1790     0.2016     0.2011   0.2158   0.2186
          10           0.0800   0.0992   0.0935     0.1019   0.0915(?)  0.1022     0.1044   0.1076   0.1079
          20           0.0400   0.0448   0.0427     0.0451   0.0429     0.0452     0.0461   0.0465   0.0465
          50           0.0160   0.0168   0.0164     0.0168   0.0164     0.0168     0.0170   0.0170   0.0170

2          6           0.9600   1.2903   1.1635     1.3664   1.2172     1.3815     1.3751   1.4651   1.4819
          10           0.5760   0.6949   0.6438     0.7114   0.6531     0.7133     0.7255   0.7449   0.7471
          20           0.2880   0.3177   0.3041     0.3198   0.3050     0.3199     0.3254   0.3278   0.3279
          50           0.1152   0.1200   0.1177     0.1201   0.1177     0.1201     0.1212   0.1213   0.1213

4          6           2.4590   3.2891   2.9629     3.4798   3.0936     3.5174     3.4990   3.7244   3.7664
          10           1.4754   1.7742   1.6434(?)  1.8154   1.6658(?)  1.8203     1.8498   1.8985   1.9039
          20           0.7377   0.8124   0.7775     0.8176   0.7798     0.8179     0.8313   0.8374   0.8377
          50           0.2951   0.3070   0.3013     0.3074   0.3014     0.3074(?)  0.3101   0.3104   0.3105

((?) marks values that are illegible, or whose position in the scrambled original is uncertain.)
Table 5

The Third Moments about θ of θ̂ and θ̂_p to Different Orders

                        To order n⁻²        To order n⁻³        To order n⁻⁴    Simulated values
Value   Sample
of θ    size (n)         θ̂       θ̂_p         θ̂       θ̂_p          θ̂              θ̂        θ̂_p

0          6           0.1120   0.0640     0.1850   0.0938      0.2125         0.2194    (?)
          10           0.0403   0.0230     0.0561   0.0293      0.0597         0.0580    (?)
          20           0.0101   0.0058     0.0120   0.0065      0.0123         0.0124    0.0068
          50           0.0016   0.0009     0.0017   0.0010      0.0017         0.0017    0.0010

2          6           1.9569   1.1182     3.0968   1.5625      3.5169         3.6324    1.6674
          10           0.7045   0.4026     0.9507   0.4956      1.0051         1.0983    0.5663
          20           0.1761   0.1006     0.2069   (?)         0.2103         0.2249    0.1231(?)
          50           0.0282   0.0161     0.0301   0.0168      0.0302         0.0331    0.0195

4          6           7.8872   4.5070    12.4280   6.2637     14.0953        14.7205    6.6860
          10           2.8394   1.6225     3.8202   1.9903      4.0363         4.0975    2.1100
          20           0.7099   0.4056     0.8325   0.4505      0.8460         0.9355    0.5393
          50           0.1136   0.0649     0.1214   0.0677      0.1218         0.1164    0.0631

((?) marks values that are illegible in this copy.)
Table 6

Fourth Moments about θ of θ̂ and θ̂_p to Various Orders

                       To order n⁻²     To order n⁻³        To order n⁻⁴    Simulated values
Value   Sample         (θ̂ and θ̂_p)
of θ    size (n)                         θ̂        θ̂_p         θ̂              θ̂         θ̂_p

0          6            0.0533         0.1941    0.1369      0.3105          (?)       (?)
          10            0.0192         0.0496    0.0368      0.0647          0.0671    0.0452
          20            0.0048         0.0086    0.0070      0.0095          0.0098    0.0075
          50            0.0008         0.0010    0.0009      0.0010          0.0010    0.0009

2          6            2.7648         8.8752    6.3537     13.4974         15.9551    9.1698
          10            0.9953         2.3152    1.7518      2.9142          3.5496    2.3980
          20            0.2488         0.4138    0.3418      0.4513          0.5072    0.3966
          50            0.0398         0.0504    0.0457      0.0513          0.0550    0.0492

4          6           18.1403        57.1623   40.9915     86.3987        107.9776   63.5952
          10            6.5305        14.9593   11.3474     18.7483         21.4185   14.9475
          20            1.6326         2.6862    2.2250      2.9230          3.3721    2.6946
          50            0.2612         0.3287    0.2988      0.3347          0.3317(?) 0.300(?)

((?) marks values that are illegible in this copy.)