Sen, Pranab Kumar (1989). "On the Pitman Closeness of Some Sequential Estimators."

ON THE PITMAN CLOSENESS OF SOME SEQUENTIAL ESTIMATORS
By
PRANAB KUMAR SEN
University of North Carolina at Chapel Hill
For a general class of stopping rules, the connection between median
unbiasedness and the Pitman closeness of statistical estimators is examined
in the sequential case. It is also shown that in the light of the Pitman
closeness, sequential shrinkage estimators of the multinormal mean vector
dominate the classical maximum likelihood estimator.
1. Introduction. The Pitman closeness (or nearness) criterion (PCC) is an intrinsic
measure of the comparative behavior of two estimators without requiring the
finiteness of their second moments. This pairwise comparison has been extended to
suitable classes of (equivariant) estimators by Ghosh and Sen (1989) and Nayak (1989).
In this context, ancillarity and median unbiasedness (MU) play a fundamental role
and provide an easy access to the verification of the PC dominance in the classical
nonsequential case. In the multivariate case, the conventional Stein-rule or shrinkage
estimators may not belong to such a class of equivariant estimators, and hence, the
characterizations mentioned above may not apply directly to such estimators.
Nevertheless, Sen, Kubokawa and Saleh (1989) have shown that for the $p$ ($\ge 2$)-variate
normal distribution, the usual Stein-rule estimators of the mean vector dominate
the classical maximum likelihood estimator (MLE) in the light of the PCC as well.
Also, in the sequential case, under the usual quadratic error loss (incorporating
the cost of sampling), the dominance of the shrinkage estimators of the multinormal
mean vector over the classical MLE has been studied by Ghosh, Nickerson and Sen (1987).
Our main objectives are to extend the results in Ghosh and Sen (1989) to the general
sequential case (covering multivariate situations as well) and to establish the PC
dominance of the sequential shrinkage estimators of the multinormal mean vector.
These are accomplished in the next two sections.

AMS (1980) Subject Classification Nos: 62L12, 62H12, 62C15
Key Words and phrases: Dominance; loss function; Pitman closeness; shrinkage
estimators; sequential estimation; sufficiency and ancillarity; Stein phenomenon.
Short Title: Pitman closeness of sequential estimators.
2. PCD of sequential estimators. Note that for a parameter $\theta$, if $T_1$ and $T_2$ are
two rival estimators, then $T_1$ is said to be Pitman-closer than $T_2$ if
(2.1)    $P_\theta\{ |T_1 - \theta| \le |T_2 - \theta| \} \ge 1/2$, for all $\theta$;
see Pitman (1937). Further, an estimator $T$ of $\theta$ is said to be median unbiased (MU)
if
(2.2)    $P_\theta\{ T \le \theta \} = P_\theta\{ T \ge \theta \}$, for all $\theta$;
see Lehmann (1983, p. 6). Consider the class $C$ of all estimators of the form
(2.3)    $U = T + Z$, $T$ is MU for $\theta$, and $T$ and $Z$ are independently distributed.
Then, it follows from Ghosh and Sen (1989) that
(2.4)    $P_\theta\{ |T - \theta| \le |U - \theta| \} \ge 1/2$, for all $\theta$ and all $U \in C$.
Although not necessary, the sufficiency of $T$ and ancillarity of $Z$ lead to (2.3),
and hence, the PC dominance holds for MU sufficient statistics under very general
conditions. One can imagine an immediate extension of (2.4) to the sequential case
provided (2.3) is shown to be valid in such a setup too.
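
The fixed-sample statement (2.4) can be checked informally by simulation. The Python sketch below is only an illustration under assumed choices that do not come from the paper: $T$ is taken as normal($\theta$, 1), which is symmetric about $\theta$ and hence MU, $Z$ is a shifted exponential variable drawn independently of $T$, and the function name is hypothetical.

```python
import random

def pitman_closeness_frequency(theta=1.5, reps=20000, seed=0):
    """Estimate P{|T - theta| <= |U - theta|} for U = T + Z by simulation."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(reps):
        t = rng.gauss(theta, 1.0)        # assumed T ~ N(theta, 1): MU for theta
        z = rng.expovariate(1.0) - 0.3   # assumed Z: skewed, independent of T
        u = t + z                        # a member of the class C in (2.3)
        if abs(t - theta) <= abs(u - theta):
            wins += 1
    return wins / reps

if __name__ == "__main__":
    print(pitman_closeness_frequency())
```

Under these assumptions the estimated frequency should not fall below 1/2, in line with (2.4); other independent choices of $Z$ can be substituted in the marked line.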
Let $\{X_i, i \ge 1\}$ be a sequence of independent and identically distributed (i.i.d.)
random variables (r.v.) with a distribution function (d.f.) $F_\theta(x)$, $x \in R$,
$\theta \in \Theta \subset R$. A sequential estimation procedure is governed by a stopping rule ($N$)
and an estimation rule ($T$). The stopping number $N$ is a positive integer valued r.v., such that
the event $[N=n]$ depends only on the outcome $X_1,\ldots,X_n$, for every $n \ge 1$. Further, given
that $N=n$, the estimation rule prescribes an estimator, say, $T_n$, which is also based
on $X_1,\ldots,X_n$, for every $n \ge 1$. Thus, a sequential estimator $T_N$ depends on both the
stopping number $N$ and the estimation rule. Now, for every $n \ge 1$, consider the
transformation:
(2.5)    $(X_1,\ldots,X_n) \to [\, T_n, \mathbf{V}_n, \mathbf{W}_n \,]$ (where $\mathbf{V}_n$ could be null).
Recall that transitivity and sufficiency of $T_n$ typically dictate the transformation
in (2.5). Let $\mathcal{B}_T^{(n)}$ and $\mathcal{B}_W^{(n)}$ be the sigma subfields generated by $T_n$ and $\mathbf{W}_n$,
respectively, for $n \ge 1$. We assume that the following three conditions hold:
(2.6)    For every $n \ge 1$, $[N=n]$ is $\mathcal{B}_W^{(n)}$-measurable;
(2.7)    For every $n \ge 1$, $Z_n = v_n(\mathbf{W}_n)$ is $\mathcal{B}_W^{(n)}$-measurable and
         $T_n$ and $\mathbf{W}_n$ are independently distributed;
(2.8)    For every $n \ge 1$, $T_n$ is MU for $\theta$.
Finally, let $C^0$ be the class of all (sequential) estimators of the form $U_N = T_N + Z_N$,
where the stopping number $N$ satisfies (2.6), $Z_N$ satisfies (2.7) and $T_N$ (2.8).
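THEOREM 1. Under (2.5) through (2.8),
$P_\theta\{ |T_N - \theta| \le |U_N - \theta| \} \ge 1/2$, for every $U_N \in C^0$ and $\theta \in \Theta$.
Outline of the proof. Since $|U_N - \theta|^2 - |T_N - \theta|^2 = Z_N^2 + 2Z_N(T_N - \theta)$, the event
$\{ Z_N(T_N - \theta) \ge 0 \}$ implies $\{ |T_N - \theta| \le |U_N - \theta| \}$. Hence, it suffices to show that
(2.9)    $P_\theta\{ Z_N(T_N - \theta) \ge 0 \} \ge 1/2$, $\forall\, \theta \in \Theta$,
or equivalently,
(2.10)   $\sum_{n \ge 1} P_\theta\{N=n\}\, P_\theta\{ Z_n(T_n - \theta) \ge 0 \mid N = n \} \ge 1/2$, $\forall\, \theta \in \Theta$.
Now, (2.6), (2.7) and (2.8) ensure that for every $n$ ($\ge 1$), $P_\theta\{ Z_n(T_n - \theta) \ge 0 \mid N=n \}$
is $\ge 1/2$, and hence, (2.9) follows from (2.10).    Q.E.D.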
In passing, we may remark that (2.6), (2.7) and (2.8) imply (2.3) in the sequential
case. If $\mathcal{B}_Z^{(n)}$ be the sigma subfield generated by $Z_n$, then by (2.7), $\mathcal{B}_Z^{(n)} \subset \mathcal{B}_W^{(n)}$,
for every $n$, and hence, in (2.10), given $N=n$, $Z_n$ and $T_n$ are conditionally independent.
In many applications, $T_N$ may be identified as a function of a sufficient statistic
(in a sequential setup), and we need to choose such a function in such a way that $T_N$
is MU for $\theta$. As an illustration, we consider the following simple example:
Let $\{X_i, i \ge 1\}$ be i.i.d. r.v. with the normal($\theta$, $\sigma^2$) distribution, where both $\theta$ and
$\sigma^2$ are unknown. For every $n$ ($\ge 2$), let $s_n^2 = (n-1)^{-1}\sum_{i=1}^n (X_i - \bar X_n)^2$ be the sample
variance (based on $X_1,\ldots,X_n$), and consider a stopping number $N = N_K$, defined by
(2.11)   $N = \inf\{ n \ge n_0 : \psi(n) \ge K s_n^2 \}$, $K > 0$ (usually taken large),
where $\psi(n)$ is a monotone increasing function of $n$, and $n_0$ ($\ge 2$) is the minimum sample
size. Such a stopping number arises in the context of bounded-width confidence intervals
for $\theta$ (where $\psi(n) = n$) or minimum risk point estimation of $\theta$ (where $\psi(n) = n^2$).
Let us consider the Helmert transformation:
(2.12)   $W_i = [\, X_1 + \cdots + X_{i-1} - (i-1)X_i \,]/[\, i(i-1) \,]^{1/2}$, $i \ge 2$; $W_1 = 0$.
Then, we have
(2.13)   $T_n = \bar X_n = n^{-1}\sum_{i=1}^n X_i$ and $s_n^2 = (n-1)^{-1}\sum_{i=1}^n W_i^2$, for every $n \ge 2$.
Note that for every $n$, $T_n$ is independent of $\mathbf{W}_n = (W_1,\ldots,W_n)'$ [and hence of $s_k^2$, $k \le n$].
Further, for every $n \ge 1$, $T_n$ has a normal distribution with mean $\theta$ and variance
$n^{-1}\sigma^2$, so that $T_n$ is MU for $\theta$. Hence, for any $Z_n = v_n(\mathbf{W}_n)$, (2.6) through (2.8) hold.
This leads to the PC dominance of $T_N = \bar X_N$ within the class $C^0$ of estimators of the
form $T_N + Z_N$ where $\{Z_n\}$ satisfies the $\mathcal{B}_W^{(n)}$-measurability condition. From
considerations of equivariance, we may consider the following group of transformations
(2.14)   $G = \{ g_{a,b}(\mathbf{X}_n) = a\mathbf{1}_n + b\mathbf{X}_n \}$, $a$ real, $b > 0$,
so that $G$-equivariant estimators of $\theta$ [under a loss $L(x,\theta;\sigma) = \rho((x-\theta)/\sigma)$ for a
suitable $\rho$] have the representation:
(2.15)   $m_n(\mathbf{X}_n) = \bar X_n + \psi( \|\mathbf{W}_n\|^{-1}\mathbf{W}_n )\, u(\mathbf{W}_n)$, $n \ge 1$,
where $\mathbf{X}_n = (X_1,\ldots,X_n)'$, $\psi(\cdot)$ and $u(\cdot)$ are suitable functions and $\|\cdot\|$ stands for the
Euclidean norm. Identifying $Z_n$ with the second term on the right hand side of (2.15),
we are led to the class $C^0$ of $G$-equivariant estimators in (2.9). In the context
of estimation of the scale parameter, typically, we have a nonnegative $T_N$, and in
that case, we may set $C^{0*}$ as the class of estimators of the form $U_N = T_N\{1 + Z_N\}$,
where (2.6)-(2.8) pertain to the $T_N$ and $Z_N$. Then along the same line as in Theorem
1, we obtain that within the class $C^{0*}$ of estimators, the nonnegative, MU estimator
$T_N$ is the Pitman closest one. We may also remark that in the conventional nonsequential
case, Brown, Cohen and Strawderman (1976) have shown that a MU estimator not solely
based on a sufficient statistic can be dominated by a version of the sufficient
statistic which is MU. In this respect, they confined themselves to the class of all MU
estimators of $\theta$, whereas our $U_n$ need not be MU for $\theta$. In that respect, we have a larger
class. However, in passing we may note that the Brown-Cohen-Strawderman result
[Corollary 4.1] extends directly to the sequential case under (2.6)-(2.8).
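
The normal example with the stopping rule (2.11) lends itself to a quick Monte Carlo check of Theorem 1. The following Python sketch is only an illustration under stated assumptions: it takes $\psi(n) = n$, uses the particular choice $Z_N = c\, s_N$ (a function of the data through $\mathbf{W}_N$ only, by (2.13)), and all function names and parameter values are hypothetical rather than taken from the paper.

```python
import math
import random

def sample_until_stop(rng, theta, sigma, K, n0):
    """Draw N(theta, sigma^2) data until n >= K * s_n^2, i.e. (2.11) with psi(n) = n."""
    xs = [rng.gauss(theta, sigma) for _ in range(n0)]
    while True:
        n = len(xs)
        mean = sum(xs) / n
        s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance s_n^2
        if n >= K * s2:
            return n, mean, s2
        xs.append(rng.gauss(theta, sigma))

def closeness_frequency(theta=2.0, sigma=1.0, K=20.0, n0=5, c=0.3,
                        reps=5000, seed=0):
    """Estimate P{|T_N - theta| <= |U_N - theta|} for U_N = T_N + c * s_N."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(reps):
        n, mean, s2 = sample_until_stop(rng, theta, sigma, K, n0)
        u = mean + c * math.sqrt(s2)   # assumed Z_N = c * s_N, a function of W_N
        if abs(mean - theta) <= abs(u - theta):
            wins += 1
    return wins / reps

if __name__ == "__main__":
    print(closeness_frequency())
```

With these assumed settings the estimated frequency should stay at or above 1/2, as Theorem 1 asserts for every member of $C^0$; replacing the marked line by another function of $s_N$ gives further members of the class.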
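Let us now consider a multiparameter extension of Theorem 1. We conceive of a
$p$-vector $\boldsymbol\theta = (\theta_1,\ldots,\theta_p)'$, for some $p \ge 1$, so that in (2.5), $\mathbf{T}_n$ is also a $p$-vector.
The condition (2.6) stands as it is; in (2.7), $\mathbf{Z}_n$ is also a $p$-vector, while, we may
modify (2.8) as follows:
(2.8')   For every $n \ge 1$ and arbitrary $\boldsymbol\ell$, $\boldsymbol\ell'(\mathbf{T}_n - \boldsymbol\theta)$ is MU for 0.
Moreover, in this case, for a given positive semi-definite (p.s.d.) matrix $\mathbf{Q}$, we
define the quadratic norm $\|\mathbf{x}\|_Q^2 = \mathbf{x}'\mathbf{Q}\mathbf{x}$, and extend (2.1) as follows: $\mathbf{T}_1$ is Pitman-
closer than $\mathbf{T}_2$, if $P_\theta\{ \|\mathbf{T}_1 - \boldsymbol\theta\|_Q \le \|\mathbf{T}_2 - \boldsymbol\theta\|_Q \} \ge 1/2$, for every $\boldsymbol\theta \in \Theta$. Then
we have the following.
THEOREM 2. Under (2.6), (2.7) and (2.8'), for the class $C^0$ of estimators of the
form $\mathbf{U}_N = \mathbf{T}_N + \mathbf{Z}_N$, we have for all p.s.d. $\mathbf{Q}$,
(2.16)   $P_\theta\{ \|\mathbf{T}_N - \boldsymbol\theta\|_Q \le \|\mathbf{U}_N - \boldsymbol\theta\|_Q \} \ge 1/2$, $\forall\, \mathbf{U}_N \in C^0$, $\boldsymbol\theta \in \Theta$.
Outline of the proof. It suffices to show that
(2.17)   $P_\theta\{ \mathbf{Z}_N'\mathbf{Q}(\mathbf{T}_N - \boldsymbol\theta) \ge 0 \} \ge 1/2$, for every $\boldsymbol\theta \in \Theta$.
As in (2.10), we rewrite (2.17) as
(2.18)   $\sum_{n \ge 1} P_\theta\{N=n\}\, P_\theta\{ \mathbf{Z}_n'\mathbf{Q}(\mathbf{T}_n - \boldsymbol\theta) \ge 0 \mid N=n \}$.
Now, $\mathbf{Z}_n'\mathbf{Q}$ is $\mathcal{B}_W^{(n)}$-measurable, so that by (2.6) and (2.8'), given $N = n$, $\mathbf{Z}_n'\mathbf{Q}(\mathbf{T}_n - \boldsymbol\theta)$
is MU for 0. Hence, (2.18) is $\ge 1/2$, and the proof is complete.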
Remarks. Note that in Theorem 2, $\mathbf{Q}$ is allowed to be arbitrary, so that the PC dominance
holds for all p.s.d. $\mathbf{Q}$. On the other hand, (2.8') is more restrictive than the usual
definition of MU. Often, it may be easier to verify (2.8') by using the possible
diagonal symmetry of $\mathbf{T}_n$ around $\boldsymbol\theta$: $\mathbf{T}_n$ is diagonally symmetric about $\boldsymbol\theta$ if $\mathbf{T}_n - \boldsymbol\theta$ and
$\boldsymbol\theta - \mathbf{T}_n$ both have the same distribution. Note that this diagonal symmetry is not
necessary for (2.8'). Further, (2.8') [or (2.8)] is also a sufficient condition for
Theorem 2 to hold. However, without (2.8'), verification of (2.17) may be highly
dependent on the distribution of $\mathbf{Z}_n$, given $N=n$. Thus, the simplicity of Theorem 2
crucially rests on (2.8'). As an illustration, we consider the following.
Let $\{\mathbf{X}_i, i \ge 1\}$ be i.i.d. r.v. with the multinormal distribution with mean vector
$\boldsymbol\theta$ and dispersion matrix $\boldsymbol\Sigma$ (both unknown). In this case, in the context of sequential
estimation of $\boldsymbol\theta$, Ghosh, Sinha and Mukhopadhyay (1976) and Woodroofe (1977) have
considered a stopping number which may be defined as
(2.19)   $N = \inf\{ n \ge n_0 : \psi(n) \ge K[\,\mathrm{trace}(\mathbf{Q}\mathbf{S}_n)\,] \}$, $K$ ($> 0$) is usually large.
In this context, $\psi(n)$ may be defined as in (2.11) while $\mathbf{S}_n$ is the sample covariance
matrix based on $\mathbf{X}_1,\ldots,\mathbf{X}_n$, for $n \ge 2$. The Helmert transformation in (2.12) extends
directly to the $p$-variate case, so that the characterization in (2.13) also extends
directly to the $p$-variate case; here, $\mathbf{S}_n = (n-1)^{-1}\sum_{i=1}^n \mathbf{W}_i\mathbf{W}_i'$, for $n \ge 2$. The
equivariance in (2.14) also extends to the class of nonsingular linear transformations
$\mathbf{X} \to \mathbf{a} + \mathbf{B}\mathbf{X}$, where $\mathbf{B}$ is nonsingular. Moreover, since $\bar{\mathbf{X}}_n - \boldsymbol\theta$ has a multinormal
law with null mean vector, its distribution is diagonally symmetric about $\mathbf{0}$, independently
of the $\mathbf{W}_n$ [and hence, $\mathbf{Z}_n$ and $N=n$]. Hence, (2.6), (2.7) and (2.8') all hold, and
Theorem 2 yields the PC dominance of $\bar{\mathbf{X}}_N$ within the class of equivariant estimators.

3. PCD of sequential shrinkage estimators. Let us refer to the example considered at
the end of the last section. For $p$, the number of coordinates of $\boldsymbol\theta$, greater than 2,
the MLE $\bar{\mathbf{X}}_n$ is known to be inadmissible [see Stein (1956)], and various other
(shrinkage or Stein-rule) estimators have been considered in the literature which
dominate the MLE in quadratic error loss. Ghosh, Nickerson and Sen (1987) showed that
such a quadratic error risk dominance holds in the sequential case too. Also, for
$p > 1$, Sen, Kubokawa and Saleh (1989) have shown that in the light of the PCC, the
Stein-rule estimators dominate the MLE $\bar{\mathbf{X}}_n$ in the conventional fixed sample size case.
Note that the Stein-rule estimators may not belong to the class $C^0$ in Theorem 2, and
hence, the characterization in Theorem 2 may not apply to these estimators. Thus,
there is a natural interest in studying the PC dominance of such Stein-rule estimators
in the general sequential case. This will be done here.
For the sake of simplicity, we consider the simplest problem in the sequential
estimation of a multivariate normal mean vector $\boldsymbol\theta$ when the dispersion matrix $\boldsymbol\Sigma$ is
of the form $\sigma^2\mathbf{I}_p$, where $\boldsymbol\theta$ and $\sigma^2$ ($> 0$) are unknown. Based on $n$ i.i.d. observations
$\mathbf{X}_1,\ldots,\mathbf{X}_n$, the MLE of $\boldsymbol\theta$ is $\bar{\mathbf{X}}_n = n^{-1}\sum_{i=1}^n \mathbf{X}_i$, $n \ge 1$. Consider the class of James-Stein
estimators $\boldsymbol\delta_n^b = \boldsymbol\delta^b(\mathbf{X}_1,\ldots,\mathbf{X}_n)$ of the form
(3.1)    $\boldsymbol\delta_n^b = \{ 1 - b s_n^2 (n\|\bar{\mathbf{X}}_n\|^2)^{-1} \}\, \bar{\mathbf{X}}_n$,
where
(3.2)    $s_n^2 = (np)^{-1}\sum_{i=1}^n (\mathbf{X}_i - \bar{\mathbf{X}}_n)'(\mathbf{X}_i - \bar{\mathbf{X}}_n)$, $n \ge 2$,
and $b$ is a nonnegative shrinkage factor. In this setup, we conceive of the null pivot
and the adjustments for any other given pivot $\boldsymbol\theta_0$ are routine in nature. Also, in this
setup, the stopping number $N$ in (2.19) simplifies further, and we may consider a well
defined stopping rule $N$, such that for every $n \ge 2$, $[N=n]$ depends only on the $s_k^2$,
$k \le n$. Finally, we define the class of sequential shrinkage estimators $\boldsymbol\delta_N^b$ by (3.1),
allowing $\boldsymbol\delta_N^b = \boldsymbol\delta_n^b$ when $N = n$, for $n \ge 2$. In view of the special structure of the
dispersion matrix, one may consider here a simple quadratic error loss
(3.3)    $L(\boldsymbol\delta_n, \boldsymbol\theta) = (\boldsymbol\delta_n - \boldsymbol\theta)'(\boldsymbol\delta_n - \boldsymbol\theta) + cn$, $n \ge 1$, $c > 0$.
Then, it follows from Theorem 1 of Ghosh, Nickerson and Sen (1987) that under the
loss in (3.3) and the stopping rule $N = \inf\{ n \ge 2 : n \ge (p/c)^{1/2} s_n \}$, the risk of
the sequential estimator $\boldsymbol\delta_N^b$ is smaller than (or equal to) that of $\bar{\mathbf{X}}_N$, for every
$c > 0$ and every $b \in (0, 2(p-2))$, $p \ge 3$, $\boldsymbol\theta \in \Theta \subset R^p$. It is customary to take $c$ small,
so that $p/c$ is large, and this is then comparable to (2.11). Also, in a fixed sample
size case, it follows from Theorem 1 of Sen, Kubokawa and Saleh (1989) that for every
$b \in (0, (p-1)(3p+1)/2p)$, $p \ge 2$, $\boldsymbol\delta_n^b$ dominates $\bar{\mathbf{X}}_n$ in the light of the PC measure.
Our basic goal is to extend the latter result to the sequential case, so that it would
provide a result complementary to Theorem 1 of Ghosh et al. (1987).
Note that $\boldsymbol\delta_N^b$ in (3.1) does not belong to the class of estimators in Theorem 2.
Moreover, when we compare $\boldsymbol\delta_N^b$ and $\bar{\mathbf{X}}_N$, both based on the same stopping number $N$, it is
not necessary to incorporate the second term in (3.3) in this comparison. Hence, we
shall use the conventional PCC: $\boldsymbol\delta_N^b$ dominates $\bar{\mathbf{X}}_N$ in the PC measure if
(3.4)    $P_{\theta,\sigma}\{ \|\boldsymbol\delta_N^b - \boldsymbol\theta\| \le \|\bar{\mathbf{X}}_N - \boldsymbol\theta\| \} \ge 1/2$, $\forall\, \boldsymbol\theta, \sigma$.
For our further analysis, we define the stopping number $N$ by (2.11) where $s_n^2$ is now
defined by (3.2), and for the sake of simplicity of presentation, we take $\psi(n) = n$.
In Section 4, we shall briefly mention the other cases.
THEOREM 3. For the class of Stein-rule estimators in (3.1) and the stopping number in
(2.11), the PC dominance in (3.4) holds, for every $b \in (0, (p-1)(3p+1)/2p)$.
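Proof. Note that by (3.1), for every $n \ge 2$,
(3.5)    $\|\boldsymbol\delta_n^b - \boldsymbol\theta\|^2 = \|\bar{\mathbf{X}}_n - \boldsymbol\theta\|^2 + n^{-2}b^2 s_n^4 \|\bar{\mathbf{X}}_n\|^{-2} - 2b s_n^2 (n\|\bar{\mathbf{X}}_n\|^2)^{-1}\bar{\mathbf{X}}_n'(\bar{\mathbf{X}}_n - \boldsymbol\theta)$.
Hence, it suffices to show that
(3.6)    $P_{\theta,\sigma}\{ N\bar{\mathbf{X}}_N'(\bar{\mathbf{X}}_N - \boldsymbol\theta) \ge (b/2)s_N^2 \} \ge 1/2$, $\forall\, \boldsymbol\theta, \sigma$ and $b \in (0, (p-1)(3p+1)/2p)$.
Let us introduce the following notations. Let $\boldsymbol\xi = \tfrac12\boldsymbol\theta$, $\lambda = \sigma^{-2}\|\boldsymbol\xi\|^2 = \|\boldsymbol\theta\|^2/(4\sigma^2)$, $b_0 = b/2$,
and let $\bar G_p^{(\Delta)}(x) = 1 - G_p^{(\Delta)}(x)$, $x \in R^+$, where $G_p^{(\Delta)}$ is the d.f. of a noncentral chi squared
r.v. with $p$ degrees of freedom (DF) and noncentrality parameter $\Delta$ ($\ge 0$). Then, on
noting that $\bar{\mathbf{X}}_n'(\bar{\mathbf{X}}_n - \boldsymbol\theta) = \|\bar{\mathbf{X}}_n - \boldsymbol\xi\|^2 - \lambda\sigma^2$, we may rewrite the left hand side of
(3.6) as
(3.7)    $P_{\theta,\sigma}\{ N\|\bar{\mathbf{X}}_N - \boldsymbol\xi\|^2/\sigma^2 \ge N\lambda + b_0 s_N^2/\sigma^2 \}$
         $= \sum_{n \ge 2} E\{ I[N=n]\, \bar G_p^{(n\lambda)}(n\lambda + b_0 s_n^2/\sigma^2) \}$
         $= E[\, \bar G_p^{(N\lambda)}(N\lambda + b_0 s_N^2/\sigma^2) \,]$.
Therefore, it suffices to show that
(3.8)    $E[\, \bar G_p^{(N\lambda)}(N\lambda + b s_N^2/\sigma^2) \,] \ge 1/2$, for every $b \in (0, (p-1)(3p+1)/4p)$, $\lambda \ge 0$.
Note that
(3.9)    $(\partial/\partial b) E[\, \bar G_p^{(N\lambda)}(N\lambda + b s_N^2/\sigma^2) \,] = -E[\, (s_N^2/\sigma^2)\, g_p^{(N\lambda)}(N\lambda + b s_N^2/\sigma^2) \,] \le 0$, for every $b \ge 0$,
where $g_p^{(\Delta)}(y)$ stands for the noncentral chi square pdf with $p$ DF and noncentrality
parameter $\Delta$. Thus, if we verify that (3.8) holds for any $b$ arbitrarily close to
$(p-1)(3p+1)/4p$, then it follows that it would also hold for all smaller (but positive)
values of $b$. Thus, if we let $\kappa = (p-1)(3p+1)/4p$, then it suffices to show that
(3.10)   $E[\, \bar G_p^{(N\lambda)}(N\lambda + \kappa s_N^2/\sigma^2) \,] \ge 1/2$, for every $\lambda \ge 0$.
In this context, we may refer to Theorem 1 of Sen, Kubokawa and Saleh (1989) where
it is shown that in the fixed sample size case, for every $n \ge 2$, $p \ge 2$,
(3.11)   $E[\, \bar G_p^{(n\lambda)}(n\lambda + b s_n^2/\sigma^2) \,] \ge 1/2$, for every $b \in (0, \kappa)$, $\lambda \ge 0$.
Thus, the crux of the problem is to verify that (3.11) holds in the sequential case.
The actual proof is lengthy and complicated too. Hence, for the sake of simplicity
of presentation, we shall provide a broad outline of the proof.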
First, we consider the asymptotic setup of Chow and Robbins (1965) or Robbins
(1959) where in (2.11) [with $N$ designated as $N_K$] $K$ is allowed to go to $+\infty$. Note
that $s_n^2/\sigma^2 \to 1$ almost surely (a.s.) as $n \to \infty$, and $N_K \to \infty$ a.s. as $K \to \infty$. Further,
$\bar G_p^{(N\lambda)}(N\lambda + \kappa s_N^2/\sigma^2)$ is a bounded r.v. assuming values in [0,1]. Hence, in this case,
convergence in probability would ensure convergence in the first moment too. Finally,
for any $n \ge 1$ and $\lambda \ge 0$, $\bar G_p^{(n\lambda)}(n\lambda + y)$ has a uniformly bounded and continuous first
derivative (with respect to $y$), for all $p \ge 2$. Therefore, it suffices to show that for $n$
sufficiently large, $\bar G_p^{(n\lambda)}(n\lambda + \kappa)$ is $\ge 1/2$. Now, by Lemma 2.2 of Sen et al. (1989),
$\bar G_p^{(n\lambda)}(n\lambda + \kappa)$ is nonincreasing in $(n\lambda)$, while, proceeding as in Theorem 2 of Sen
(1989), we obtain that $\lim_{(n\lambda) \to \infty} \bar G_p^{(n\lambda)}(n\lambda + \kappa) = 1/2$. Hence, $\bar G_p^{(n\lambda)}(n\lambda + \kappa)$ is $\ge 1/2$,
for every $n$, $\lambda$: $n\lambda \ge 0$. This simple method may not work out for the case where $K$
in (2.11) is held fixed, and hence, a more elaborate proof is necessary.
For every $n \ge 1$ and $\lambda \ge 0$, let $m_p^{(n\lambda)}$ be the median of the d.f. $G_p^{(n\lambda)}$, $p \ge 2$. It
follows from Sen (1989) that $m_p^{(n\lambda)}$ is nondecreasing in $n$ and $\lambda$, and further,
$m_p^{(n\lambda)} \le p + n\lambda$, $\forall\, n \ge 1$, $\lambda \ge 0$. Let $K$ be defined as in (2.11), and let
2
2
n* ={min{ n: Kn ~ pKa }, if/Ka ~ K2;
min{ n : K(n-1) > (p-2)Ka l,if 2Ka < I( •
<,3.12)
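Then the left hand side of (3.10) can be written as
(3.13)   $E[\, \bar G_p^{(N\lambda)}(N\lambda + \kappa s_N^2/\sigma^2) \,] = \sum_{n \ge 2} E\{ I[N=n]\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_n^2/\sigma^2) \}$
         $= \sum_{n < n^*} E\{ I[N \le n][\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_n^2/\sigma^2) - \bar G_p^{((n+1)\lambda)}((n+1)\lambda + \kappa s_{n+1}^2/\sigma^2) \,] \}$
         $+ E[\, \bar G_p^{(n^*\lambda)}(n^*\lambda + \kappa s_{n^*}^2/\sigma^2) \,]$
         $+ \sum_{n > n^*} E\{ I[N \ge n][\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_n^2/\sigma^2) - \bar G_p^{((n-1)\lambda)}((n-1)\lambda + \kappa s_{n-1}^2/\sigma^2) \,] \}$.
At this stage, we may assume without any loss of generality that $n^*$ is $\ge 2$, as
otherwise, the first sum on the right hand side becomes vacuous. The second term is [by
(3.11)] bounded from below by 1/2. Thus, we need to show that each of the two sums
in the right hand side of (3.13) is nonnegative. Towards this, we may note that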
•
•
(a/Cln)G(nA) (nA+ y) = ).,[ g(+n A)(n)., + y) _ g(nA)(nA + y) ]
p
p 2
p
< Q
d.
.
< (nA)
~
, accof1,ng as. nA + y 1,S. ~ mp+2 '
where the last step follows from the unimodality results in Sen (1989). As a. result,
(3.14)
we 9btai,n that
C3.15)_
G~JIl+11).,1(Jn+1 )A+ y) - G~·nA)(nA + y)
< 0, i,f (n+1)A +. y < m(Jn+ 1 1A)
- - p+2
'
> 0 , if nA +. y > m(n A). .
. - p
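Further, note that $U_n = nps_n^2/\sigma^2$ has the central chi square d.f. with $p(n-1)$ DF ($G_{p(n-1)}$),
and $U_{n+1} = (n+1)ps_{n+1}^2/\sigma^2 = U_n + U_{n+1}^*$, where $U_{n+1}^*$ has the d.f. $G_p$, independently of $U_n$.
Also note that $(s_{n+1}^2 - s_n^2)/\sigma^2 = (n+1)^{-1}[\, p^{-1}U_{n+1}^* - s_n^2/\sigma^2 \,]$, so that
$E[\, (s_{n+1}^2 - s_n^2)/\sigma^2 \mid \mathcal{B}_n \,] = (n+1)^{-1}[\, 1 - s_n^2/\sigma^2 \,]$, for every $n \ge 2$, where $\mathcal{B}_n$ is the sigma
subfield generated by the $s_k^2$, $k \le n$. Finally, note that $[N \le n] \Leftrightarrow [\, s_m^2 \le m/K$ for some $m \le n \,]$
is $\mathcal{B}_n$-measurable, and $[N \ge n] \Leftrightarrow [\, s_m^2 > m/K, \forall\, m \le n-1 \,]$ is $\mathcal{B}_{n-1}$-measurable. With
these, we consider a typical term in the last sum on the right hand side of (3.13).
Note that for any $n \ge n^*+1$,
(3.16)   $E\{ I[N \ge n][\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_n^2/\sigma^2) - \bar G_p^{((n-1)\lambda)}((n-1)\lambda + \kappa s_{n-1}^2/\sigma^2) \,] \}$
         $= E\{ I[N \ge n][\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_n^2/\sigma^2) - \bar G_p^{(n\lambda)}(n\lambda + \kappa s_{n-1}^2/\sigma^2) \,] \}$
         $+ E\{ I[N \ge n][\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_{n-1}^2/\sigma^2) - \bar G_p^{((n-1)\lambda)}((n-1)\lambda + \kappa s_{n-1}^2/\sigma^2) \,] \}$.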
Now, by (3.15) and the fact that on [N >n], (n-l}A + Ksn_l/a > (n-l)A + K(n-l)/Ka
~ m~l~-l)A) ~ m~~;A) , we obtain that the second term on the right hand side of
(3.16) is nonnegative. For the first term, we may note that
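(3.17)   $(\partial/\partial y)\bar G_p^{(n\lambda)}(n\lambda + y) = -g_p^{(n\lambda)}(n\lambda + y)$, $y \ge 0$;
(3.18)   $(\partial^2/\partial y^2)\bar G_p^{(n\lambda)}(n\lambda + y) = [\, g_p^{(n\lambda)}(n\lambda + y) - g_{p-2}^{(n\lambda)}(n\lambda + y) \,]/2 \ge 0$, for all $y$: $n\lambda + y \ge m_p^{(n\lambda)}$,
where again the last step follows from the unimodality results in Sen (1989). Thus,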
G~n'A)(n'A + y) is a convex function of y, for all y ~ m~n'A)_ nA
(~p-2). Note that
for n > n*, on the set [N > n], nA + Kn- l (n-l)s2 1/a2 > n'A + (p-2) > m(nA), and
n- p
2
2
1
)
2
2
nA + Ksn/a = nA + Kn (n-l sn_l/a + KUn/np , where Un has the central chi square
d.f. with p OF, independently of s~_l
or [N ~ n]. Hence, using the convexity of
G~_n'A)(nA + Kn-l(n-l)s~_1/a2+ KUn/np) [in Un] along with. the Jensen inequality,
we obtain that on [N
(3.19)
~
n] ,
E[ G~nA)(n'A + Ks~/a2)
I 8n- l ] ~ G~n'A)Cn~
+ Kn-l(Jl-l}s~_1/q2 + Kn... l )
~ G~n'A)(nA + KS~_1/a2) + Kn- l [ s~_1/a2 - 1]9~nA)(nA + KS~_1/a2) ,
where the last step follows. from (3.17)-(3.18) and the fact that Ks.~_ln...·l(n-1)/a2
is
~ m~-~~>'" n'A , for N~ n. Finally, note that by (3 ..1.2), for n > n*, s~_l/i
!:.
p/~
on [N > nJ, and K <p-l < p. Hence, from (3.16) and (3.19), we obtain that
•
-11(3.20.)
> E{
.::
•
G~nA)(nA + KS~/a2) - G~nA)(nA + KS~_1/a2 ) J}
r~N>nJ Kn- l ( s~_1/a2 - 1 )g~nA)(nA + KS~_1/a2 ) }
E{ i[N>n][
O~
for every n > n*.
This shows that the last sum on the right hand side of (3.13) is nonnegative. The
treatment for the first sum is very similar. Note that we would have two terms as in
(3.16), where (3.15) would ensure that the second term is nonnegative. For the first
term, we write $G_p^{(n\lambda)}(\cdot) = 1 - \bar G_p^{(n\lambda)}(\cdot)$, and note that (3.18) for the complementary
part would ensure the convexity of $G_p^{(n\lambda)}$, and hence, the rest of the manipulations
can be carried out as in (3.19) and (3.20). We therefore omit these details. Q.E.D.
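
The statement of Theorem 3 can also be examined by simulation. The Python sketch below is an informal check under stated assumptions, not part of the proof: it uses the estimator (3.1), the variance estimator (3.2) with divisor $np$, and the stopping rule (2.11) with $\psi(n) = n$; the function names, the values of $\boldsymbol\theta$, $\sigma$, $K$, $b$ and the number of replications are all hypothetical choices made for illustration.

```python
import random

def b_upper(p):
    """Upper end of the range of b in Theorem 3: (p-1)(3p+1)/(2p)."""
    return (p - 1) * (3 * p + 1) / (2 * p)

def sequential_js_trial(rng, theta, sigma, K, b, n0=2):
    """One run: stop when n >= K * s_n^2, then compare (3.1) with the MLE."""
    p = len(theta)
    total = [0.0] * p   # running coordinate sums
    sumsq = 0.0         # running sum of squared norms of the observations
    n = 0
    while True:
        x = [rng.gauss(t, sigma) for t in theta]
        n += 1
        total = [a + xi for a, xi in zip(total, x)]
        sumsq += sum(xi * xi for xi in x)
        if n < n0:
            continue
        mean = [a / n for a in total]
        norm2 = sum(m * m for m in mean)
        s2 = (sumsq - n * norm2) / (n * p)   # s_n^2 as in (3.2), divisor np
        if n >= K * s2:                      # rule (2.11) with psi(n) = n
            break
    shrink = 1.0 - b * s2 / (n * norm2)      # shrinkage factor in (3.1)
    delta = [shrink * m for m in mean]
    d_js = sum((d - t) ** 2 for d, t in zip(delta, theta))
    d_mle = sum((m - t) ** 2 for m, t in zip(mean, theta))
    return d_js <= d_mle

def pc_frequency(theta=(0.5, 0.5, 0.5), sigma=1.0, K=20.0, b=2.0,
                 reps=2000, seed=1):
    """Estimate the probability in (3.4) for the sequential James-Stein rule."""
    rng = random.Random(seed)
    wins = sum(sequential_js_trial(rng, theta, sigma, K, b) for _ in range(reps))
    return wins / reps

if __name__ == "__main__":
    print(b_upper(3), pc_frequency())
```

For $p = 3$ the assumed value $b = 2$ lies inside $(0, (p-1)(3p+1)/2p)$, so the estimated frequency should not fall below 1/2. Moving $\boldsymbol\theta$ far from the pivot is expected to bring the frequency closer to 1/2, in line with the limit discussed in the asymptotic part of the proof.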
4. Some general remarks. Our main result is contained in Theorem 3. For the sake of
simplicity, we have made some assumptions which are possibly more restrictive than
actually needed in this context. For example, in (2.11), we have considered a
general $\psi(n)$ while in the proof of Theorem 3, we have taken $\psi(n) = n$. The treatment
of $\psi(n) = n^2$ (or other plausible forms) poses no extra regularity conditions
but extra manipulations. The basic fact is that the stopping time $N$ has a
distribution governed by the sequence $\{ s_n^2 \}$, and the convergence properties of these
$s_n^2$ provide the desired keys to the actual manipulations. Secondly, in (3.2), instead
of the conventional divisor $p(n-1)$, we have taken $np$. A similar modification was
also made in Ghosh, Nickerson and Sen (1987) for dealing with the sequential shrinkage
estimation in the light of the quadratic error loss functions. In the current case,
it may not be necessary to have the divisor $np$; $p(n-1)$ would have been quite valid.
But, then in the definition of $n^*$ in (3.12) we would have needed some adjustments.
Thirdly, in the conventional fixed sample size case, Sen, Kubokawa and Saleh (1989)
considered a more general form of Stein-rule estimators, where in (3.1), the scalar
constant $b$ ($0 < b \le (p-1)(3p+1)/2p$) was replaced by an arbitrary function $\phi(\bar{\mathbf{X}}_n, s_n^2)$
such that for all $p \ge 2$, $n \ge 2$,
(4.1)    $0 \le \phi(\bar{\mathbf{X}}_n, s_n^2) \le (p-1)(3p+1)/4p$, for every $(\bar{\mathbf{X}}_n, s_n^2)$ a.e.
In our case too, we may extend (4.1) to the sequential setup by incorporating
the stopping time $N$ (i.e., taking $\phi(\bar{\mathbf{X}}_N, s_N^2)$), and the steps in (3.6) through (3.8)
remain intact [by virtue of the nonnegativity of $\phi(\cdot)$ and its boundedness from
above]. Thus, (3.10) again pertains to this more general situation, and the proof
provided remains applicable too. Fourthly, under appropriate quadratic error loss,
positive-rule versions of shrinkage estimators are known to dominate the usual
shrinkage estimators, and in the conventional nonsequential case, this dominance
result has also been established in the light of the Pitman closeness [viz., Sen,
Kubokawa and Saleh (1989)]. For the particular model considered in Section 3, a
positive-rule version of (3.1) is
(4.2)    $\boldsymbol\delta_n^{b+} = \{ 1 - b s_n^2 (n\|\bar{\mathbf{X}}_n\|^2)^{-1} \}^+\, \bar{\mathbf{X}}_n$;  $a^+ = \max(a, 0)$,
where $b$, $s_n^2$ and the other notations are borrowed from (3.1). A natural sequential
version of this positive rule estimator is given by $\boldsymbol\delta_N^{b+}$, where the stopping number
$N$ is defined as in (2.11). Note that by (3.1) and (4.2)
(4.3)    $P_{\theta,\sigma}\{ \|\boldsymbol\delta_N^{b+} - \boldsymbol\theta\| \le \|\boldsymbol\delta_N^b - \boldsymbol\theta\| \}$
         $\ge P_{\theta,\sigma}\{ b s_N^2 \le N\|\bar{\mathbf{X}}_N\|^2 \} + P_{\theta,\sigma}\{ [\, b s_N^2 > N\|\bar{\mathbf{X}}_N\|^2 \,] \cap [\, \boldsymbol\theta'\bar{\mathbf{X}}_N \ge 0 \,] \}$.
When $\boldsymbol\theta = \mathbf{0}$, the right hand side of (4.3) is equal to 1, so we need to consider only
the case of $\boldsymbol\theta \ne \mathbf{0}$. Let $\mathbf{A}$ be an orthogonal matrix of order $p$, such that the first
row of $\mathbf{A}$ is $\|\boldsymbol\theta\|^{-1}\boldsymbol\theta'$. For every $n \ge 1$, let $\mathbf{Y}_n = (Y_{1n},\ldots,Y_{pn})' = \mathbf{A}\bar{\mathbf{X}}_n$, so that
$Y_{1n} = \|\boldsymbol\theta\|^{-1}\boldsymbol\theta'\bar{\mathbf{X}}_n$, for every $n \ge 1$. Then, the second term on the right hand side of
(4.3) can be written as
(4.4)    $P_{\theta,\sigma}\{ [\, b s_N^2 > N\|\mathbf{Y}_N\|^2 \,] \cap [\, Y_{1N} \ge 0 \,] \}$
         $= \sum_{n \ge 2} P\{ [N=n] \cap [\, b s_n^2 > n\|\mathbf{Y}_n\|^2 \,] \cap [\, Y_{1n} \ge 0 \,] \}$.
Note that if $\mathcal{B}_n = \mathcal{B}(s_k^2, k \le n)$ be the sigma subfield generated by the $s_k^2$, $k \le n$,
then (i) $[N=n]$ is $\mathcal{B}_n$-measurable, for every $n \ge 2$, (ii) given $N = n$, $Y_{1n}$ has the
normal distribution with mean $\|\boldsymbol\theta\|$ and variance $n^{-1}\sigma^2$, and (iii) $s_n^2$ and $\mathbf{Y}_n$ are
independent. Thus, we may virtually repeat the proof in (2.6)-(2.7) of Sen, Kubokawa
and Saleh (1989), and conclude that for every $n \ge 2$,
(4.5)    $P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n\|\mathbf{Y}_n\|^2 \,] \cap [\, Y_{1n} \ge 0 \,] \}$
         $\ge (1/2)\, P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n\|\mathbf{Y}_n\|^2 \,] \}$
         $= (1/2)\, P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n\|\bar{\mathbf{X}}_n\|^2 \,] \}$.
Thus, (4.4) is bounded from below by $(1/2)P_{\theta,\sigma}\{ b s_N^2 > N\|\bar{\mathbf{X}}_N\|^2 \}$, and hence, the right
hand side of (4.3) is bounded from below by
(4.6)    $P_{\theta,\sigma}\{ b s_N^2 \le N\|\bar{\mathbf{X}}_N\|^2 \} + (1/2)P_{\theta,\sigma}\{ b s_N^2 > N\|\bar{\mathbf{X}}_N\|^2 \}$
         $= 1/2 + (1/2)P_{\theta,\sigma}\{ b s_N^2 \le N\|\bar{\mathbf{X}}_N\|^2 \} \ge 1/2$.
Therefore, $\boldsymbol\delta_N^{b+}$ dominates the usual shrinkage estimator $\boldsymbol\delta_N^b$ in the light of the Pitman
closeness criterion for an arbitrary stopping rule (depending only on the $s_n^2$). Here
also, $p$ is $\ge 2$, and again, we may replace the shrinkage factor $b$ by a more general
term $\phi(\cdot)$ as in (4.1), and the PC dominance remains intact. Fifthly, in the fixed
sample size case, Sen, Kubokawa and Saleh (1989) considered the case of a normal
distribution with mean vector $\boldsymbol\theta$ and dispersion matrix $\sigma^2\mathbf{V}$, where $\mathbf{V}$ is a known positive
definite matrix and $\sigma^2$ is unknown. In this setup, if we assume that the $\mathbf{X}_i$ are
i.i.d. r.v.'s with $N_p(\boldsymbol\theta, \sigma^2\mathbf{V})$ distribution, then the PC dominance in Theorem 3 also
holds; we only need to modify the shrinkage estimator in (3.1) allowing a possibly
more general norm as in Theorem 2. Towards this, we let
(4.7)    $\boldsymbol\delta_n^\phi = [\, \mathbf{I} - \phi(\bar{\mathbf{X}}_n, s_n^2)\, s_n^2\, (n\bar{\mathbf{X}}_n'\mathbf{V}^{-1}\mathbf{Q}^{-1}\mathbf{V}^{-1}\bar{\mathbf{X}}_n)^{-1}\mathbf{Q}^{-1}\mathbf{V}^{-1} \,]\, \bar{\mathbf{X}}_n$, $n \ge 2$,
where $\mathbf{Q}$ appears in the definition of the norm in Theorem 2 and all the other notations
are borrowed from Section 3. As adapted to a stopping rule $N$, the shrinkage estimator
in (4.7) is defined for the sequential case, i.e., $\boldsymbol\delta_N^\phi = \boldsymbol\delta_n^\phi$ if $N = n$. If we
proceed as in (3.5) through (3.9) where we use the $\mathbf{Q}$-norm $\mathbf{d}'\mathbf{Q}\mathbf{d}$ instead of the
Euclidean norm $\mathbf{d}'\mathbf{d}$, then we may show easily that (3.10) again provides the desired
result. Therefore, we omit the details. Finally, we consider the general case of an
arbitrary covariance matrix $\boldsymbol\Sigma$ (positive definite but unknown), and examine the
PC dominance of sequential shrinkage estimators for stopping rules of the type in
(2.19). In this general case, a Stein-rule estimator of $\boldsymbol\theta$ [viz., Stein (1981)]
is of the form
(4.8)    $\boldsymbol\delta_n^* = [\, \mathbf{I} - (n-p)^{-1}\phi(\bar{\mathbf{X}}_n, \mathbf{S}_n)\, d_n\, (n\bar{\mathbf{X}}_n'\mathbf{S}_n^{-1}\bar{\mathbf{X}}_n)^{-1}\mathbf{Q}^{-1}\mathbf{S}_n^{-1} \,]\, \bar{\mathbf{X}}_n$,
where $d_n$ = minimum characteristic root of $\mathbf{Q}\mathbf{S}_n$, $\mathbf{S}_n = \sum_{i=1}^n (\mathbf{X}_i - \bar{\mathbf{X}}_n)(\mathbf{X}_i - \bar{\mathbf{X}}_n)'$ has
a Wishart($\boldsymbol\Sigma$, $p$, $n-1$) distribution, and the other notations are borrowed from Sections
2, 3 and 4. For the fixed sample size case, the PC dominance of $\boldsymbol\delta_n^*$ over $\bar{\mathbf{X}}_n$ has been
studied by Sen, Kubokawa and Saleh (1989), and in this context, it is assumed that
$n > p$ (so that $\mathbf{S}_n$ has a nondegenerate distribution). Thus, in the sequential case
too, we would need that the initial sample size $n_0$ is $> p$. Let us define
(4.9)    $\mathbf{V}_n = (n-p)^{-1}\mathbf{S}_n$, for $n > p$,
and let $\kappa$ be defined as in (3.10). Then, virtually repeating the first part of
the proof of Theorem 2 of Sen, Kubokawa and Saleh (1989), we conclude that for the
desired PC-dominance of $\boldsymbol\delta_N^*$ over $\bar{\mathbf{X}}_N$, it suffices to show that
(4.10)   $P_{\theta,\Sigma}\{ N(\bar{\mathbf{X}}_N - \boldsymbol\theta)'\mathbf{V}_N^{-1}\bar{\mathbf{X}}_N \ge \kappa \} \ge 1/2$, $\forall\, (\boldsymbol\theta, \boldsymbol\Sigma)$,
where the stopping number $N$ is defined by (2.19) [with $\mathbf{S}_n$ being replaced by $\mathbf{V}_n$].
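In this context, we may set
(4.11)   $\boldsymbol\lambda = \boldsymbol\Sigma^{-1/2}\boldsymbol\theta$, $\mathbf{Z}_n = n^{1/2}\boldsymbol\Sigma^{-1/2}(\bar{\mathbf{X}}_n - \boldsymbol\theta)$ and $\mathbf{V}_n^0 = \boldsymbol\Sigma^{-1/2}\mathbf{V}_n\boldsymbol\Sigma^{-1/2}$, for $n > p$.
Note that for every $n \ge 1$, $\mathbf{Z}_n$ has the normal law with null mean vector and
dispersion matrix $\mathbf{I}_p$, and $(n-p)\mathbf{V}_n^0$ has the Wishart($\mathbf{I}_p$, $p$, $n-1$) distribution. Thus, the
left hand side of (4.10) can be written as
(4.12)   $P_{0,I}\{ \mathbf{Z}_N'(\mathbf{V}_N^0)^{-1}\mathbf{Z}_N \ge \kappa - N^{1/2}\boldsymbol\lambda'(\mathbf{V}_N^0)^{-1}\mathbf{Z}_N \}$
         $= \sum_{n \ge n_0} P_{0,I}\{ N=n \} \cdot P_{0,I}\{ \mathbf{Z}_n'(\mathbf{V}_n^0)^{-1}\mathbf{Z}_n \ge \kappa - n^{1/2}\boldsymbol\lambda'(\mathbf{V}_n^0)^{-1}\mathbf{Z}_n \mid N = n \}$,
where the event $[N = n]$ depends on the $\mathbf{V}_k^0$, $k \le n$ and $\mathbf{Q}$, introduced in (2.19), for
$n \ge n_0$. Using the stochastic independence of the $\mathbf{Z}_k$ and $\mathbf{V}_k^0$, we may claim that
given $\mathbf{V}_n^0$ (and $N = n$), $\boldsymbol\lambda'(\mathbf{V}_n^0)^{-1}\mathbf{Z}_n$ has a normal law with mean 0, for any $\boldsymbol\lambda$, and
we can conceive of a set of independent standard normal variables $U_1,\ldots,U_p$, such
that for every $n \ge n_0$,
(4.13)   $P_{0,I}\{ \mathbf{Z}_n'(\mathbf{V}_n^0)^{-1}\mathbf{Z}_n \ge \kappa - n^{1/2}\boldsymbol\lambda'(\mathbf{V}_n^0)^{-1}\mathbf{Z}_n \mid N=n, \mathbf{V}_n^0 \}$
         $= P_{0,I}\{ \sum_{j=1}^p d_{nj}U_j^2 \ge \kappa - n^{1/2}\sum_{j=1}^p d_{nj}^*U_j \mid N=n, \mathbf{V}_n^0 \}$,
where the $d_{nj}$ are the characteristic roots of $(\mathbf{V}_n^0)^{-1}$ and the $d_{nj}^*$ depend on $\boldsymbol\lambda$ as well
as the $d_{nj}$ (which are held fixed); the $U_j$ are independent of the $d_{nj}$ and $d_{nj}^*$.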
Although this form is quite appealing, it does not lead us to the desired result.
The main difficulty is caused by the fact that in (2.19) $N$ involves the sequence
$K(\mathrm{trace}(\mathbf{Q}\mathbf{V}_n))$, $n \ge n_0$ and $K$ ($> 0$) given. With the transformation in (4.11), we
may write $\mathbf{Q}^0 = \boldsymbol\Sigma^{1/2}\mathbf{Q}\boldsymbol\Sigma^{1/2}$, so that $\mathrm{trace}(\mathbf{Q}\mathbf{V}_n) = \mathrm{trace}(\mathbf{Q}^0\mathbf{V}_n^0) = \sum_{j=1}^p d_{nj}^0$, say, where
the $d_{nj}^0$ are the characteristic roots of $\mathbf{Q}^0\mathbf{V}_n^0$, for $n \ge n_0$. The basic difficulty is
caused by the fact that the stopping number $N$ is determined by the sequence $\{d_{nj}^0\}$
while in (4.13) the $d_{nj}$ and $d_{nj}^*$ may not be in general proportional to the $d_{nj}^0$,
and this creates a problem in adapting the breakup in (3.13). Thus, the method
outlined in Section 3 may not work out for this general problem. This is not at all
surprising. In the sequential shrinkage estimation (under quadratic error loss), a
very similar problem cropped up, and hence, in all the works of Ghosh et al. (1987),
Nickerson (1987) and Sriram and Bose (1988), the case of $\boldsymbol\Sigma = \sigma^2\mathbf{V}$ was considered.
Also, if we go back to the classical multivariate sequential problem treated in
Ghosh, Sinha and Mukhopadhyay (1976) and Woodroofe (1977), there also, a stopping
rule of the type (2.19) was considered, allowing $K$ to go to $+\infty$. In this asymptotic
case, we have no difficulty in claiming that $\boldsymbol\delta_N^*$ dominates $\bar{\mathbf{X}}_N$ in the PC criterion.
This follows from the fact that $\mathbf{V}_n \to \boldsymbol\Sigma$ a.s., as $n \to \infty$, so that as $K \to \infty$, $N_K$
[defined as in (2.19) with $K$] goes to $+\infty$ a.s. As such, in (4.8) (for $n = N$),
we may replace $(N-p)\mathbf{S}_N^{-1}$ by $\boldsymbol\Sigma^{-1}$, and with the resulting estimators, the case reduces
to that of a known $\boldsymbol\Sigma$, for which our earlier results would directly apply. However,
we may remark that as $K \to \infty$, the PC-dominance of $\boldsymbol\delta_N^*$ over $\bar{\mathbf{X}}_N$ is perceptible only
in a small neighborhood of the pivot, i.e.,
( 4. 14 ).
~
£.
-1
OK ~ { K ~:
~
£
p}
Compact C CR.
This asymptotic feature of the shrinking neighborhood pertaining to the sequential
shrinkage estimators (under quadratic loss) has already been noticed in Sen (1987),
and in the PC-dominance, the same remains pertinent. The problem of establishing
the dominance (in quadratic error loss or in the PC-criterion) for an arbitrary $\boldsymbol\Sigma$
and for any fixed $K > 0$ remains largely open.
REFERENCES
BROWN, L.D., COHEN, A. and STRAWDERMAN, W.E. (1976). A complete class theorem for
strict monotone likelihood ratio with applications. Ann. Statist. 4, 712-722.
CHOW, Y.S. and ROBBINS, H. (1965). On the asymptotic theory of fixed width sequential
interval for the mean. Ann. Math. Statist. 36, 457-462.
GHOSH, M., NICKERSON, D.M. and SEN, P.K. (1987). Sequential shrinkage estimation.
Ann. Statist. 15, 817-829.
GHOSH, M. and SEN, P.K. (1989). Median unbiasedness and Pitman closeness. J. Amer.
Statist. Assoc. 84, in press.
GHOSH, M., SINHA, B.K. and MUKHOPADHYAY, N. (1976). Multivariate sequential point
estimation. J. Multivar. Anal. 6, 281-294.
LEHMANN, E.L. (1983). Theory of Point Estimation. John Wiley, New York.
NAYAK, T. (1989). Estimation of location and scale parameters using generalized
Pitman nearness criterion. J. Statist. Plan. Infer. (to appear).
NICKERSON, D.M. (1987). Sequential shrinkage estimation of linear regression
parameters. Sequen. Anal. 6, 93-117.
ROBBINS, H. (1959). Sequential estimation of the mean of a normal population. In
Probability and Statistics (Harald Cramér volume), Almqvist & Wiksell, Uppsala,
pp. 235-24 .
SEN, P.K. (1987). Sequential Stein-rule maximum likelihood estimation: General
asymptotics. In Statistical Decision Theory and Related Topics IV (eds. S.S. Gupta and
J.O. Berger), Springer Verlag, New York, Vol. 2, 195-208.
SEN, P.K. (1989). The mean-median-mode inequality and noncentral chi square
distributions. Sankhya, Ser. A 51, 108-116.
SEN, P.K., KUBOKAWA, T. and SALEH, A.K.M.E. (1989). The Stein paradox in the sense
of the Pitman measure of closeness. Ann. Statist. 17, in press.
SRIRAM, T. and BOSE, A. (1988). Sequential shrinkage estimation in the general linear
model. Sequen. Anal. 7, 149-163.
STEIN, C. (1956). Inadmissibility of the usual estimator of the mean of a multivariate
normal distribution. Proc. Third Berkeley Symp. Math. Statist. Probability 1,
197-206, Univ. Calif. Press.
STEIN, C. (1981). Estimation of the mean of a multivariate normal distribution.
Ann. Statist. 9, 1135-1151.
WOODROOFE, M.B. (1977). Second order approximations for sequential point and interval
estimation. Ann. Statist. 5, 984-995.
Pranab K. Sen
Departments of Biostatistics & Statistics,
University of North Carolina,
Chapel Hill, NC 27599-7400 (3260).