Saleh, A.K.Md. Ehsanes and Sen, Pranab Kumar; (1984).On Shrinkage M-Estimators of Location Parameters."

•
ON SHRINKAGE M-ESTlMATORS OF LOCATION PARAMETERS
by
A.K.Md. Ehsanes Saleh
Department of Mathematics and Statistics
Carleton University, Ottawa, Canada
and
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1479
December 1984
CN SHRINKAGE M-FSl'IMA'IORS OF r.cx:::ATICN PARAMETERS
...
A.K.M:i.Ehsanes saleh
and
Depart::nent of Mathematics
and Statistics,
carleton University,
Ottawa, canada KlS 5B6
Pranab Kmnar sen
Departrrent of Biostatistics
University of North Carolina,
Chapel Hill,
NC 27514 , USA
Key WoI'ds and PhI'ases : Asymptotic risk; Jcunes-Stein ruZe; ZocaZ
(Pitman type-) aZternatives; M-estimation; muZtivariate Zocation
model,; preZiminary test estimation; shrinkage estimation.
Fbr a general class of continuous and marginally synnetric,
rrultivariate distributions, shrinkage estimators of locations based
...
on suitable M-statistics are considered. These estimators are based
on the Janes-Stein rule ( incorp::>rating the idea of preliminary
test estimators too ), and the main errphasis is laid on the sttrly
of their asyrrptotic ( distributional) risk properties. Asymptotic
admissibility results are also sttrlied under fairly general regularity carxlitions.
1.
INl'RCVUCI'ICN
For a nultinonnal distribution ( of dim::nsion 3 or rrore) , under
a quadratic error Zoss function
~
the classical maximum Zike-
Uhood estimators (MLE) of location are not generally admissible
I c.f. Stein (1956)] and improved (non-linear) estimators are
available in the literature [ viz., Janes and Stein (l96l) and a
series of paPers, referred to in Berger (1980) ]. Though, in the
•
initial stage, the covariance matrix was asstmed to be of certain
specific fOnTI,
~e
case of crnpletely unknown dispersion matrix
has also been treated in generality
r viz.,
Berger am Haff (1983)
where other references have been cited ]. MUle the assurred nonnality of the underlying distribution plays a vital role in the
developnent of the so called exact dominance and admissibility
results, the exact theo:ry stumbles into difficulties when this
nonnality assunption is dispensed with. Nevertheless, under an
r explained
appropriate asymptotic setup
in detail in Sen (1984 )
am Sen am Saleh (1985) ], the theory of shrinkClfJe estimation in
multi-parameter models extends to a wider class of statistics and
a general class of distributions. Sen (1984) considered the case
of U-statistics , while, Sen am Saleh (1985) have considered the
case of R-estimators of locations. The object of the present study
is to consider sene shrinkClfJe M-estimators of location where the
theory of James-Stein rule estimation and preliminary test estim-
ation are incorporated in the fornulation of the prposed estimators.
,
let X.
... , X,~p ) , i = 1, ... , n be indeperrlent and
...~ = (X'l'
~
identically distributed (i. i. d.) raman vectors ( r. v.) having a
•
p ( > 2) variate continuous distribution function Fe' defined on
the Euclidean space
# .
Fe
is ass1..m3d to have sym1etric marginal
distributions F Ij] (around the location paraIreter e j ), for j
•• , p. Thus,
Fe (x) = F ( x - e ) , x
- -
where
~
-
-
£
# ;
-
8 = (8 , ••• ,8 )',
1
-
= 1, •
(1.1)
p
is the vector of (marginal) location paraIreters, and our
interest lies in the estimation of 8 • For an estimator 0 (based
...
...n
on Xl,
•••
,X
),
we
conceive
of
a
quadratic
error
loss
function
...
...n
L( 0_n , _e }
=
(0-n - e ) 'Q{ 0n- _e ) ,
(1.2)
for sene given positive definite (p.d.) matrix Q • Then, the risk
function is given by
pn ( _
0 ,
_e
where
V....
.....
= nE(
0 ;e_
_n
= nEL(
0 - e
-n...
) = Tr(
,
) (
0 ...n
For nomal F, the MLE of e is given by
QV
__n )
,
e) •
0
...n
=X
= n -1 E.n~=lX,
...n
... ~
(1.3)
(1.4)
• For p
J
•
greater than 2, a detailed account of (admissible and minimax )
~
estimation of
( for nonral F ) , based on the JanEs-Stein rule,
is given by Berger (1980). For possibly non-normal F, particularly,
for distributions with heavy tails, these estimators may not be
robust and efficient. Robust shrinkage R-estirnators of locations
in the multivariate case have recently been considera:i by Sen and
saleh (1985) ; parallel results on U-statistics are due to Sen
(1984) • We propose to study the preliminary test (PT) and shrin-
kage versions of M-estim:ltors of location in the multivariate case
and
to present their asymptotic risk properties. Along with the
preliminary notions, the prqx::>sa:i estimators are intrcxluca:i in
section 2. section 3 deals with the general results on their asyrrptotic risks. SOIre general ranarks are made in the concluding
section.
2.
•
THE PROPQSED ESTIMATORS
First, we introduce the score functions. let 1JJ = ( 1JJ , ••• ,1JJ p )
1
with lJJ, = {lJJ, (x); X £ E }, be defined by
J
J
lJJ,(x) = 1JJ· (x) + lJJ'2(x) , x £ E , j=l, ... ,p,
(2.1)
J
J1
J
where the lJJ
, k=1,2, are both nondecreasing and skew-synnetric;
jk
lJJ
is absolutely continuous on any bounda:i interval in E and lJJ
jl
j2
is a step-function having finitely many jlIll1?S, j=l, ••• ,po Further,
we assune that for suitable positive constants c, , for all Ix I>c, ,
J
J
lJJ, (x) = lJJ, (c,) signx , and lJJ, is non-constant on [-c"c.J, so that
J
J J
J
J J
o < a",'I'j = fElJJ~J (x)dF[ J'J (x) < 00 , l<j<p
,
(2.2)
-where F [jJ is the j.!:!! marginal of F. We also denote by F [j~J the
(j,~) th
( bivariate) marginal of F , for j 'I
~
1, ... , p, and let
=«
~1JJ
= « a ,lJJn»
ff~ lJJj(x)lJJ~(y)dF[j~J (x,y»).
(2.3)
lJJ
J
In the sequel, ~1JJ
will be assumed to be positive definite (p.d.).
)J"
•
Note that by vinue of the assurred synneby of the F [j] and the
skew-syrrmatry of the
~j =
1JJ j , we have
f E lJJ j ~}dF IjJ
(x) = 0 , for j = 1, ••• ,po
(2.4)
3
let us raw define for every real t am n
M . (t) =
n)
'if: 1 ljJ. (X.
1=)
0
0
n)
1,
t) , j = l, ••• ,p •
-
1)
Note that for each j, M
~
(2.5)
(t) is nonincreasing in t, and further,
by (2.4), M (6 ) has mean 0; this leads us to the following estinj j
mators of a
a
...n
"
"
( an! ' ... , anp
=
(2.6)
where
A
a.
n)
= (sup{t: M oCt) >
nJ
o}
+ inf(t: M .(t) < 0}}/2,
(2.7)
nJ
for j = 1, ... ,p. In the literature, these are kncMn as the M-estimators I viz., Huber (1981) for a detailed account of the univariate
theory ]. Umer the assunai regularity conditions, it is well known
A
a is a robust, carponentwise na:lian-unbiased., translation...n
invariant and oonsistent estimator of ~ • Further, as n -+ 00 ,
that
1
A
n"i. (a
:0
_.)( ( 0
p ...
- a)
...n . . .
r
where
Yj =
, r-IE r- l
~
-tjJ.....
...
) ,
(2.8)
= Diag( Yl ' ... , Yp ) is defined by letting
•
,
IE ljJj(xH-f(j] (x)/f(j] (x)}dF(j] (x), j=l, ... ,p,
(.2.9)
f
and it is assunej that the marginal probability density functions
(p.d.f) f (j] are all absolutely continuous with finite Fisher> in-
fomations. We may refer to JureCkova am. Sen (l98la,b; 1982) for
some detailed accounts of the asymptotic theory of M-estinators.
In a shrinkage estimation problan, one has ap1'io1'i sate rea-
sons to believe that
a pivotaZ point
~
~
is likely to be in a small neighbourhood of
; by virtue of the translation-invariance of the
estimators, we may set, witb::>ut any loss of generality, that
~ • In
a =
-0
a preliminary test estimation (PTE) problem, one has the sane
belief, and one sets to test for the null hypothesis that
e= e ,
-
first, and on
-0
the basis of this test, the ultimate estimator is
chosen. In either case, one needs a suitable test-statistic which
•
is incorporated in a suitable fonn in the construction of the estimators. Towards this, we define first
S = S (ljJ) =
...n
...n...
«
sOn
n,JN
» ,
(2.10)
4
,
where
s
= n -1 E.n~=1
.n
n,JA.
= 1, ... ,p.
for j ,JI.
'"
,I,. (X •. - 6 .) ,I, n (X. n
"'J ~J
nJ It' A. ~A.
Then, as in Singer and
-
'"
6
n
nA.
)
(2.11)
,
sen (1985),
we may con-
sider a test statistic
£n =n-l(M....n......n
(O»'Swhere M (0)
...n ...
=
(M_ 1 (O), ••• ,M
np
U..L
eM...n (0»,
...
(2.12)
(0»' is defined as in (2.5), and S-
...n
s:
is a (reflexive) generalized inverse of ...n
S • Under H '
has asn
O
yrrptotically chi quare distribution with p degrees of freedan (DF)
when E,l) is p.d. Thus, if.,2
stands for the upper 100a% point of
...~
·p,a
the chi square distribution with p DF, for a e: (0,1), then corres-
ponding to a desired significance level a , the preliminary t.est
for H is based on the following :
O
2
>
<
The PrE of
e...nPT
~
.
Xp,a '
x_2
,
"p,a
reJect HO '
(2.13)
accept H
0
is then defined by
= I(.tn-> ·p,a
x~ )8...n
+ ILl <
n
2
xP,a'"
)0,
(2.14)
where I (A) stands for the irxlicator function of the set A. On the
other hand, in the shrinkage estimation, the statistic.r..
is dire-
n
etly incorporated into the nodified estimator. For this p..1rpOse, we
need an estimator of the dispersion matrix
~ = ~-lEJlI~-l , appearing
in (2.8). Following JureCkova and sen (198lb), we chcx:>se sane positive a and define
y'" .
~
=
~ -1
'"
-~
"'-~
(2an)
{M. (6 . - n a) - M . (6 .+ n a)},
.
= 1, ... ,
for j
~
~
(2.15)
~
P and let
r...n = Diag(
'"
•
~
'"
'"
Y , ••• , Y )
np
nl
and
A
v
...n
= ...n
r-1S...n...n
r-1 .
'"
A
(2.16)
It follows fran the results in Singer and Sen (1985) that under the
v'" is p.d. in probability, and it
asslllred regularity conditions,
converges to
d
~
, in probability, as n
= chp (Qv
) = smallest
--n
'"
n
...n
-+
<Xl.
Let
A
characteristic root of Qv
--n
,
(2.17)
where Q is defined as in (1.2). Since for possibly non-noma1 F and
arbitr~ ~
, the behaviour of
J:.~l,when £.n
is close to 0, is not
5
that precisely known, and hence, we use a small truncation, and as
in
sen
(1984), consider the following version of a shrinkage M-esti-
{O'
=
mator :
AS
~n
!.
< e: ;
-lQ-~-l)~
n.J.. ... ...n ...n
(I-od
...
if
r
£n-> e:
if
n
(2.18)
,
where e: (> 0) is an arbitrary small number and c is a positive number ( 0 < c < 2 (P-2». It is also possible to replace c by a sequence {c } of positive numbers , such that c
o
n
n
converges to a limit c:
< c < 2 (p-2). Ideally, c may be taken as (p-2). OUr primary inte-
rest lies in studying the asyrrptotic risks of the pro{X)sed PrE and
shrinkage estimators and to carpare than in a meaningful way.
3.
ASYMP'IDI'IC RISK EFFICIENCY RESULTS
In view of the boundedness of the score functions, as has been
assumed, we may not need stringent rn:::m:mt conditions on the d.f. F.
we
assure that for sare positive b ( not necessarily.::. 1),
~ II~lllb ~
~ IXlj I
b
E3=1
<
0.1)
00
Then, the following results can directly be adapted fran J~va
and
sen
(1982) :
(i) For every k ( > 0), there exists a positive integer n (k),
k
0
such that E..-.- Ie. 1 exists for every n > n (k) (and j=l,. oo,p) •
~
m
- 0
(ii) For every k ( > 0),
A
l~~
•• '-
nk/ 2 (
E..-.-{
-p
e . - e.
)k } =
n))
~I~
EZ k ,
0/)
(3.2)
where Z has the standard nonnal distribution, and, in particular,
A
nE[(
A
,
e - _e) (e-n - -e )]
-n
~
V
as n
,
~
(3.3)
00.
(iii) For every j( =l, ••• ,p),
A
.(8.)
en). - e.) ) = y:~
)
n)
)
n-~Iw .1 ~ o alnost surely
n)
n(
where
EI
n-~ Wnj
Ik
~
+ w . ,
(3.4)
n)
(a. s. ), as n
0, as n ~
00
,
~
00,
and
for every k > O.
/
•
(3.5)
Now, by virtue of (1.3), (1.4) and (3.3), we obtain that for
the classical M-estiroator,
A
p( 8;0)
~
~
A
= 1tm
{P ( ~n
8 ;0) }
n
n~
~
= Tr(O[ltm
n+oo
-
nEe
A
A
8
_n
- 8)( 8
-
_n
- 8 )1])
_
(3.6)
=Tr(Ov).
•
~
Next, by (2.14), we have ( aPr - a )
_n
_n
n(
=I
(r
~
<
a)n 10- ( -n
8PI'- -n
a ) = n (e_n__n
Qa ) I ( t
aPr -
I
_n
~
)e
'p,a_n
n
<
, so that
x~
).
'p,a
(3.7)
Now, note that by (2.12),
r
max
-~2
~n ~ 1..93' {n nj (O)/sn,jj },
so that
p{
s:. n
< x~
'p,a
<
min
(3.8)
x: }
< 1~
P{n-~2.(0) < s ..
<J.:;p
nJ
n,JJ'p,a
}
-1
l~j~ P{n
Mnj(O) < n
-~ ~
S~,jj Xp,a } •
(3.9)
Note that when 8. is the true location, E{M . (0) I a.} is a negative
J
~
J
or positive number according as 8. is positive or negative, while
_~ ~
J
" x_
converges to 0 (a. s.) as n -+ 00 • Further, urrler e. ,
n,JJ'p,a
J
M . (0) involves Li.d.r.v. IS (bounded valued) on which the ~nennJ
tial inequality [ viz., Hoeffding (1963)] holds • Hence, using Then
s
oran 3.3 of JurOCkova and
sen
(1982) and the above facts, we arrive
at the conclusion that under 8 'I 0 , the right hand side of (3.9)
-2
-can be made O(n ), for n adequately large. On the other hand, by
the lb1der inequality, for every q > 1, by (3. 7) ,
Pr A
APr A
AI
-1
E{n(~n - ~n) 19( ~n - ~n)} ~ n[E(~n9~n)q]q
where
E( alOe)q
2
1-1
[P{(.n< Xp,a ] -q
(3.10)
< ch (0)E( ale)q < pq-~r(Q)E~_lEle .,2q ,
(3.11)
1 ~n-n
- JnJ
and the right hand side of (3.11) is finite for every n > n (q). By
- a
(3.9), (3.10) and (3.11), we conclude that for any (fixed) 8 'I 0 ,
~n~_n
.
(3.7) converges in mean to 0 as n
-+
00
,
so that
asymptotically risk-equivalent for any (fixed)
rr
and 6 - are-
'I
.
_n
~
~
n
Let us next consider the case of the shrinkage estimator. By
U.3), (1.4) and (2.18), we have
5
E{n(S5
)10(6
6 )} = I( ~ n < E){n(SlQ6
_n __n - "'n
_n__n )} +
~
2r-2 2
AIvA-1Q-lvA -1 A
+ I( ~ n -> E)E{ d n ~ n c ( n8_n_n
8 )}.
- ~n-n
a
~n
(3.12)
Now, using the basic inequality in (3.10) of
sen
and saleh (1985)
and proceaiing as in their (3.8) through (3.11), it follows that
for every (fixe:l) e ., 0 ,
limsup
~
so that for every
~timators
e
e)
~{n(-eS
)'Q( as =0 ,
(3.13)
-n _n _ -n
-n
(fixe:l) e:l 0, the shrinkage and the classical
.
- -
are asynptotically risk-equivalent. The situation is,
hc\iever, quite different for local shift alternatives, and, we plan
to explore the sane.
As has been mentione:l earlier in section 1 that both the PrE
and shrinkage estimation works out well only for shrinking neighb-
ourOOods of the pivotal point ( here
~
). In the asymptotic case,
this shrinking neighbourhood coincides with the usual Pitman-type
( local) translation alternatives. Hence, we consider a double sequence
{X. , i=l,oo.,ni n > I} of (row-wise) LLd.r.v.'s, where
-lU.
-
for each n, the distribution of X . is given by (1.1) with
-lU.
1
e = _n
e
-
= n -~ A , and A belongs to a CQ'll)act set C ( containing 0 as an
-
-
....,
inner point). By virtue of the translation-invariance of the
we may still work with the X . _nJ.
A
e ,
-n
e ,and justify all the asynpto-n
tic results considere:l earlier. Ther~fore , (3. 6) remains true for
such local alternatives as well, though the other asymptotic risk
expressions ranain to be worked out. For later use, we denote by
{K } , the sequence of alternatives for which (1.1) holds with
n
1
e n= .n.-~. . ,A , for a given _A ., _0, so that .for the PrE,we have
_
P(PTMiA)
_
= lim
{pn (e
_"PI' i Q)
_
n~
= limn~
p{ . (
<
n
I
~
e=
-
K}
n
'p,a
I
K } (A 'QA) +
n
---
(3.14)
+ limn-+'X' E{I (£., n -> x~
e ) 'Q(6
'p,a )n(6_n- _n
__n - e_n ) IKn }.
Note that
r
pI ~
n
2
--p,a
< X-
I
Kn }
2 i~)
= Hp (x"p,a
(3.15)
where
t::.
and H (Xi <5)
=
(A,\)-lA)
(3.16)
-- stands for a noncentral chi square distribution fu.octi-
q
on with q DF and noncentrality paraneter <5 • For the second term on
the right hand side of (3.14), we make use of (3.2) ( for k = 4) ,
•
(3.3), (3.4) and (2.12), and oonclude that the second teJ:m on the
right harrl side of (3.14) converges to
Tr(Qv) {l - Hp+2 ()(~
·p,a ill)] ~_
(A 'QA)
__ [HP ()(~
'p,a ill)~
2
2
2Hp+2(Xp,aill) + Hp+4 (~,ait,)J i
(3.17)
in this oontext, we have also made use of the fonnulale for the incc:nplete narents of multinonnal distributions [ viz. Sen and Saleh
(1979)]. Thus, fxonl (3.14) through (3.17), we obtain that
P(PTMi~)
=
2
2
[1- ~2(~,ait,)]Tr(~) + [2Hp+2(~,ait,) -
Hp+4(X~,a
it,) ]
(~'9~).
(3.18)
Note that (3.6) remains in tact under {K } as well, and hence, for
n
the classical J.l.l-estimator, we have
p(M i~)
= Tr(g~
) , for every (fixed)
~
£
Ef.
(3.19)
Fran (3.18) and (3.19), we have for every (fixed) A ,
p(~liA)/p(M;A)
-
-
=1
-
Hp+2(~
[(~ '9~)/I'r(9~)] [
'p,a.
+
it,)
2Hp+2 (~,a;ll) - Hp+4 (~,a;ll)]. (3.20)
Since Hq(X;O) ~ Hq+2 (x;o) for every x
£
E and q ~ 1 (0 ::.0), we obt-
ain fran (3.20) that
(i) (A 'QA) /I'r (Qv) > 1
=>
P (PTM; A) /p (M; A) > 1 ,
(3.20)
<~
=>
p(PTM;A)/p(M;A) < 1.
(3.21)
(ii) (A'QA)!Tr(Qv)
In (3.20), the excess over 1 (for the asynptotic risk ratio) may
not be very IlUlch, and this ratio converges to 1 as t,
•
-+
00
•
On the
other hand, the deficit in (3.21) may be rrore appreciablei in par2
ticular, under H : A = 0 , the risk ratio is I-Hp+2 (X
;0) , and
o - p,a
it depends on panda. Thus, the PrE performs better under H and
O
for small deviations fram H ' while, in the tail, the classical
O
M-estimator perfODllS better. None daninates over the other over the
entire range, though the PrE has a rrore robust perfonnance.
Let us row consider the case of the shrinkage estimator. Fram
(2.18), we obtain that under
ditions,
{K} and the assurrai regularity conn
= limn-+<x> lnE{ ( -n
as -.8...n...
) '0
= (A_ 'QA)
__ !limn-+<x> E{ I ( r
P (Sot; A)
...
(
as - -n
8 ) I K } ]
n
-n
-<
~n
£ )
I
Kn }]
e - ...n
8 ) I K
n
+ limn+oo E{ I(JC n > £)n(e...n - ...n......n
8 )'O(
}
-2cI lim
E{d t-ln( e - 8 ),~-l e lei: > £)1 K}]
n-+<x>
n n
_n
_n _n ...n
n
n
2
2
+ c [lim
E{d
-2ne'e-lo-le-le I(£ > £) IK } ]. (3.22)
n-+<x>
n n _n..;.n _ _n _n
n
n
r.
The first tenn on the right hand side (rhs) of (3.22) is equal to
(3.23)
(A'OA)H ( e: ; b. ) •
- -- P
Also, proceeding as in (3.17), the second tenn on the rhs of (3.22)
reduces to
[1-Hp+2 (e:; M ]Tr e9~) - e~ '9~) [Hp (e:;M -2Hpt2 (e:; M+Hp+4 (e:; M] • (3.24)
For the third tenn, we note that
In
(a...n - ...n
8 )' ~-le
_n ...n
d
n
I =
In\ e - 8 )' ~~-~
_n
_n _n _n
< ([nee - 8 ) ,e-l ( e - 8 ) d ]{ d .nEr~le
-n
-n -n
...n -n
n
n ...n-n_n
e - -n
e)
< {n(e - e )'O(
-n
...n... _n
-<
A
•
]}~
}~
ne'oa
-n-~
"""I
' "
n~e_nd n
~
..,
A
[n (8_n - _n
8 ) 0... (8...n - ...n
8 ) + n8...nQ8
__n 1/2
(3.25)
while,.c -1 is bounded fran above by £ -1 « 00). Hence, the integrabin
lity condition may easily be verified by using (3.2) through (3.5),
and, following some routine steps. ( and denoting by War. v. having
the p-variate nonnal distribution with mean ~
matrix I
t
= ~-~~
-
and dispersion
), we conclude that the third tenn on the rhs of (3.22)
...p
is equal to
-2c.ch (Qv)E{ (W - w) 'W(W'W)-l - I (W'W <e:)W' (W-w) (W'W)-l}. (3.26)
P -- - - - - - - - - ~ 'For the second tam under the expectation, we may proceed as in
sen and Saleh (1985) and conclude that it is
oee:(p-l)/2~,while, for
the first tem, we may use the classical Stein identity [viz., App-
endix B of Judge and Bock (1978) ], and conclude that (3.26) is
-2c.ch (Qv) { 1 p ...-
where
~'~ = ~ (Ll)
b.E (X:~2 (M) } + 0 (
.yr
£
(p-l) /2) ,
(3.27)
has the noncentral chi square distribution with
1.0
p OF and noncentrality pararreter /). = w' W = A' \}-IA. Also, noting
2 e'C-l Q-lC-l
that d n_n_n...
S'Qe ,we o~ on ~~- (3.2) through
_n _n -< _n__n
(3.5) that the last tenn on the rhs of (3.22) is equal to
e
[c.ch (Q\})J2
p
~~
{~T(Q-l\}-1}E(X:~2(/).)}
+ /)'*E(X:~4(/).»
_.~
- EI
.~
I (W'W <e:) (W'W) -2 (W' A*W
--- -
where
J }
(3.28)
(3.29)
As in Sen and Saleh (1985), the sea:>nd tenn on the rhs of (3.28) is
o (e:I>-2) ,
and hence, using (3.22) through (3.28), we obtain that
p(Sl-1;~} = Tr(9~)
-:-
2C.chp(9~){
1
-/).E(~2(A»}
[c.chp(9~)]2{ Tr(9-1~-1)E(~2(/).»
+
+
/)'*E(~4(/).»}
+ O(e:p / 2 ) + O(e:(p-l)/2) + O(e:P-2) ,
(3.30)
= O(e:q/2} , for every q -> 1 and 0 - > O. Since
q
p is taken to be greater than 2 ( for the shrinkage estimator), it
as n (e:;o) <
q
-
H (e:;0)
e: adeqUately small, the rhs of (3.30) can
follows that by choosing
be approxill1a:ted by the first three tenns • On this approximation,
the results in Berger et al. (1977) directly applies, and we obtain
that for every c : 0 < c < 2(P-2) and ~
e:
# ,
lim£TLO P (SM; A)
< 1 with strict < at /).
_ /p (M; A)
_-
= O.
(3.31)
Thus, the shrinkage M-estimator daninates over the classical one •
l = ~ ),
c =_iP-2) and 9~ =!' the
left hand side of (3.31) reduces to 4 (p-l)p
,and is < 1 for all
In particular, under HO (i.e.,
p > 2 ; the reduction is substantial for large values of p.
Finally, fran (3.18) and (3.30), we obtain that
lim,Op (SM; A) /p (P'I'Er-1; A) ={ 'l'r (Qv) - 2c.ch lQvlll£T
-
2
-
-1 -1
--
-4
p --
-4
-2
/).E (x_ '2
.~
(A) ) J
+ (c.chp(9~» ITr(9 ~ )E(Xpt2(/).»+/).*E(Xpt4(A»J}.
2
2
2-1
[1-Hp+2 (Xp,a ; /).) J Tr (Qv)
__ + (A'
- QA)
-_ [2Hp+2 (Xp,a ; /).) -Hpt4 (x_
'p,a ; /).)J } ,
By l3.20} and
A e: :E;P , /). > 0, /). * > 0 •
(3.31), we conclude that for all
(3 • 32 )
~
, for which
l
'9~
exceeds Tr ~), (3.32) is less than 1 (though it converges to 1 as
if
-
A noves away fran 0 , i.e., 6.
~
or 6.* converges to +
00
};
however,
the deficit fran the uwer asynptote (i.e., I) is usually quite
small for m:x1erate to large values of 6. • On the other hand, the
picture may be quite different for 6. close to
a;
see (3.21) in this
respect. Letting n. = ch. (Qv) /l'r COv}, fo+ oj =l,... ,p and noting
-1 J
J -~- -1 -1
-1
2
that (i) n < p
and (ii) Tr (Qv) Tr (Q v ) = ( En.) ( En. ) > P
p-- J
J( where the equality sign .ooIds when the n. are all equal), we ob~
tain that under H ( i.e.,
O
= ~ ),
(3.32) reduces to
{l
-2~ + c2~LTrt9~)Tr(9-1~-1)/p(P-2)J}{
~
{l-
2C,\> + c 2 '\>2
(p-2)
•
J
1- Hp+2 (~,a;O) }-l
-1
2}
p }/{ 1 - Hp+2 (Xp,a;O).
(3.33)
The denaninator on the rhs of (3.33) is free fran c, while the nu-
merator is minimized at c
= ca
where c
a
=
(p-2)p-1TL'1 ( > (P-2», so
p-
that (3.33) is bounded fran below by
2 2
-1
2
{ 1- 2co '\> + co'\> (p-2) p }/{ 1 - Hp+2 (Xp,a;O) }
2
-1
= (2/p) { 1 - Hp+2 (Xp,a;O)}
,
(3.34)
and (3.34) exceeds one whenever 1 - Hp+2 ~,a;O) < 2/p i.e.,
I Hp+2 (~,a;O) > (p-2)/p J => I (3.33) is > 1].
(3.35)
Now, looking at the central chi square distribution,
we gather that
(3.33) excee1s 1 for a significant range of values of P and a • For
exaII1?le, for
~
a
= 0.05,
for all p
~
21 and for a
= 0.10,
for all p
11. In fact, the smaller is the significance level a , the larger
is the range of p for which (3.33) is greater than 1. This clearly
indicates that generally the shrinkage M-estimator fails to daninate
over the PrE M-estiroator lU1der H and for smaller values of
O
shall ela1:x>rate this a bit !YOre in the next section.
4.
~
•
we
roo1E GENERAL REMARKS
In canparing the PrE and shrinkage versions of H-estimators,
we may note first that the PrE does not presuppose that p is greater
than 2 , while the shrinkage estimator does so. Thus, the PrE may
have a wider scope of applicability. On the other hand, the shrinkage estimator daninates over the classical one, while the PrE does
/2
so only for
~
close to
~
(i.e., for small values of /:. ). For larg-
er values of /:. , clearly the shrinkage estimator is superior to the
PrE , thJugh both. of these may behave quite closely for /:. not too
snaIl.
Next, we have noted after (2.18) that the constant c may
also be replaced by a sequence {c } for which lim
c = c exists.
n
n""'" n
The existence of tn.is limit will ensure the convergence of the risk
•
fontU1.ae. In this context, naturally, one may raise the question on
the choice of
E:
curl c
n
in the definition of the shrinkage estimator
in (2.18). Basically, for small n, the constant c
chosen small, while, for large n, c
n
should also be
n
may be taken to equal to (P-2)
( as was rec:amended for the no.nna.l rrean case by Berger et ale (1977». since in (3.30), we have neglected tenns of the 0 (e (p-l) /2) ,
the choice of
is naturally dependent on p as well. The larger is
the value of p, the larger is the quantity E: , so that E: (p-l)/2 is
E:
smaller than a specified n > O. It may also be noted that in (2.18)
1\
one needs the estimator v
1\
1\
-n
which in turn depends on the estimators
S curl r . For r , one may use the stronger convergence results
...n
-n
...n
studied in Jur~kova and sen U98la,b; 1982) , while the convergen-
,
ce rates for S
-n
, generally, depends on the underlying F (through
the dependence of the p characters) •
we
have tacitly assurred that
dn ' defined in (2.17), converges in probability (or a. s.) to a tositive limit.
This,in fact, danands that S_n converges (in probabi.
lity or a.s.) to a p.d. matrix as n -+ 00 , and,in order that such a
convergence is uniform ( in";\ ,under {K } ), we may need to assume
-
n
that F is non-singluar ( in the sense that ch (E,,) is strictly
p0-
P -'"
sitive over a suitable class of such F's). For -nonnal d.f.'s, it
suffices to fontU1.ate this oondition in tenns of the covariance matrix, while,
E~
takes on this role for a general F and a set of ch-
osen score :f\mctions. Finally, we have confined ourselves to bounded score :f\mctians on which the H-estimators are defined. It is tossible to relax this assumption and to incorp:>rate ( as in Jureekova and
sen (1981 a,b»
able .narv;mt conditions.
unbounded scores functions satisfying suitAs we need to justify (3.2) through (3.5)
(neErled for the desired integrability conditions to be verfied) ,
13
we may endup , in (3.1), with a value of b greater than k ( > 2).
Alternatively, instead of the asynptotic risk, one may consider the
asynptotic distributional risk (}\DR) ccrcputed directly fran the asynptotic distribution of an estimator (which mayor may not agree
with the asynptotic risk), and , in the light of ADR, we \'.'OUl.d have
the sane picture as in Section 3 , valid for a broader class of sc-
ore functions • However, bounded soore functions are very popular
( an the ground of robustness against gross errors and outliers) in
l-i-estimation, and hence, we need not eatq;>ranise an the weaker ADR
results at the cost of unbounded soores and / or higher order nonents of the underlying d.f. F •
~s
This research has been supported by the Natural. SCiences and
Engineering Research council of Canada, Grant No. A3088 and by the
(U.S.) National Heart, Lung and Blood Institute, contract No. NnINHLBI-2243-L fran the National Institutes of Health.
BIBLICX3RAPHY
Berger, J .0. (1980). A robust generalized Bayes estimation and confidence region for a multivariate nonnal mean. Ann. Statist.
~,
716-761.
Berger, J .0., Bock, M.E., Br<::Mn, L.D., casella, G. and GIeser, L.
(1977). Minimax estimation of a nonnal mean vector for arbitrary quadratic loss and unknown oovariance matrix. Arm. Stati~. ~
, 763-771.
Berger, J.O. and Haff, L.R. (1983). A class of minimax estimators
of a nonnal mean vector for arbitrary quadratic loss and unknown covariance matrix. Statist. Dec.
Janes, W. and stein,
c.
!O,
105-129.
(1961). Estimation with quadratic loss.
Pmc. 4th Berkeley Symp. Math. statist. probability!o, 293-325.
Judge, G.G. and Bock, M.E. (1978) .The Statistical Inplications of
Pretest and Stein-Rule Estimators in Econanics. North Holland,
Amsterdam.
Hoeffding, U. (1963). Probability inequalities for
SlIDlS
of bounded
•
ranckm variables. J. Alter. Statist. Assoc.
2!.,
13-30.
Huber, p.J. (1981). Robust Statistics. John Wiley, New York.
J~, J.J. and
sen,
P .K. (1981a). Invariance principles for
sane stochastic processes related to M-estiroators and their
role in sequential statistical inference. 5ankh.ya, 5er.A
£,
191-210.
JureC'kovci; J.J. and
sen,
P.K. (198lb). sequential procedures based
on M-estimators with discontinuous score fuIx::tions. J. Statist.
Plan. Infer.
2. ,
253-266.
J~, J.J. and Sen, P .K. (1982). M-estimators and L-estiroators
of location: unifonn integrability and asymptotic risk-effici-
sen,
ent sequential versions. S€quen. Anal. !. ' 27-56.
P.K. (1984). A Janes-Stein detour of U-statistics.
atist. Theor. Meth. A
g,
camnm.
St-
2725-2747.
Sen, P.K. and saleh, A.K.Ml.E. (1979) .Nonpararretri.c estimation of
location pararreter after a preliminary test on regression in
the multivariate case. J. Multivar. Anal.
Singer, J.l>1. and
•
sen,
P.K. (1985).
M~thods
node1s • J. Multivar. Anal. 15
~,
322-331-
in multivariate linear
(in press) •
Stein, C. (1956). Inadmissibility of the usual estimator for the
mean of a multivariate nonnal distribution. Pree. 3rd Berkeley
Syrcp. Math. Statist. Probability
!. '
197-206.
l~