ON THE PITMAN CLOSENESS OF SOME SEQUENTIAL ESTIMATORS

By PRANAB KUMAR SEN

University of North Carolina at Chapel Hill

For a general class of stopping rules, the connection between median unbiasedness and the Pitman closeness of statistical estimators is examined in the sequential case. It is also shown that in the light of the Pitman closeness, sequential shrinkage estimators of the multinormal mean vector dominate the classical maximum likelihood estimator.

1. Introduction. The Pitman closeness (or nearness) criterion (PCC) is an intrinsic measure of the comparative behavior of two estimators without requiring the finiteness of their second moments. This pairwise comparison has been extended to suitable classes of (equivariant) estimators by Ghosh and Sen (1989) and Nayak (1989). In this context, ancillarity and median unbiasedness (MU) play a fundamental role and provide an easy access to the verification of the PC dominance in the classical nonsequential case. In the multivariate case, the conventional Stein-rule or shrinkage estimators may not belong to such a class of equivariant estimators, and hence, the characterizations mentioned above may not apply directly to such estimators. Nevertheless, Sen, Kubokawa and Saleh (1989) have shown that for the $p$ ($\ge 2$)-variate normal distribution, the usual Stein-rule estimators of the mean vector dominate the classical maximum likelihood estimator (MLE) in the light of the PCC as well. Also, in the sequential case, under the usual quadratic error loss (incorporating the cost of sampling), the dominance of the shrinkage estimators of the multinormal mean vector over the classical MLE has been studied by Ghosh, Nickerson and Sen (1987). Our main objectives
are to extend the results in Ghosh and Sen (1989) to the general sequential case (covering multivariate situations as well) and to establish the PC dominance of the sequential shrinkage estimators of the multinormal mean vector. These are accomplished in the next two sections.

AMS (1980) Subject Classifications Nos: 62L12, 62H12, 62C15.
Key Words and phrases: Dominance; loss function; Pitman closeness; shrinkage estimators; sequential estimation; sufficiency and ancillarity; Stein phenomenon.
Short Title: Pitman closeness of sequential estimators.

2. PCD of sequential estimators. Note that for a parameter $\theta$, if $T_1$ and $T_2$ are two rival estimators, then $T_1$ is said to be Pitman-closer than $T_2$ if

(2.1)    $P_\theta\{ |T_1 - \theta| \le |T_2 - \theta| \} \ge 1/2$, for all $\theta$;

see Pitman (1937). Further, an estimator $T$ of $\theta$ is said to be median unbiased (MU) if

(2.2)    $P_\theta\{ T \le \theta \} = P_\theta\{ T \ge \theta \}$, for all $\theta$;

see Lehmann (1983, p. 6). Consider the class $C$ of all estimators of the form

(2.3)    $U = T + Z$, where $T$ is MU for $\theta$, and $T$ and $Z$ are independently distributed.

Then, it follows from Ghosh and Sen (1989) that

(2.4)    $P_\theta\{ |T - \theta| \le |U - \theta| \} \ge 1/2$, for all $\theta$ and all $U \in C$.

Although not necessary, the sufficiency of $T$ and the ancillarity of $Z$ lead to (2.3), and hence, the PC dominance holds for MU sufficient statistics under very general conditions. One can imagine an immediate extension of (2.4) to the sequential case provided (2.3) is shown to be valid in such a setup too.

Let $\{X_i, i \ge 1\}$ be a sequence of independent and identically distributed (i.i.d.) random variables (r.v.) with a distribution function (d.f.) $F_\theta(x)$, $x \in R$, $\theta \in \Theta \subset R$. A sequential estimation procedure is governed by a stopping rule ($N$) and an estimation rule ($T$). The stopping number $N$ is a positive integer valued r.v., such that the event $[N=n]$ depends only on the outcome $X_1, \ldots, X_n$, for every $n \ge 1$. Further, given that $N=n$, the estimation rule prescribes an estimator, say, $T_n$, which is also based on $X_1$, ...
, $X_n$, for every $n \ge 1$. Thus, a sequential estimator $T_N$ depends on both the stopping number $N$ and the estimation rule. Now, for every $n \ge 1$, consider the transformation:

(2.5)    $(X_1, \ldots, X_n) \to [\, T_n, V_n, W_n \,]$ (where $V_n$ could be null).

Recall that transitivity and sufficiency of $T_n$ typically dictate the transformation in (2.5). Let $B_T^{(n)}$ and $B_W^{(n)}$ be the sigma subfields generated by $T_n$ and $W_n$, respectively, for $n \ge 1$. We assume that the following three conditions hold:

(2.6)    For every $n \ge 1$, $[N=n]$ is $B_W^{(n)}$-measurable;

(2.7)    For every $n \ge 1$, $Z_n = v_n(W_n)$ is $B_W^{(n)}$-measurable, and $T_n$ and $W_n$ are independently distributed;

(2.8)    For every $n \ge 1$, $T_n$ is MU for $\theta$.

Finally, let $C^0$ be the class of all (sequential) estimators of the form $U_N = T_N + Z_N$, where the stopping number $N$ satisfies (2.6), $Z_N$ satisfies (2.7) and $T_N$ satisfies (2.8).

THEOREM 1. Under (2.5) through (2.8), $P_\theta\{ |T_N - \theta| \le |U_N - \theta| \} \ge 1/2$, for every $U_N \in C^0$ and $\theta \in \Theta$.

Outline of the proof. Since $|U_N - \theta|^2 - |T_N - \theta|^2 = Z_N^2 + 2 Z_N (T_N - \theta)$, it suffices to show that

(2.9)    $P_\theta\{ Z_N (T_N - \theta) \ge 0 \} \ge 1/2$, $\forall\, \theta \in \Theta$,

or equivalently, $\sum_{n \ge 1} P_\theta\{N=n\}\, P_\theta\{ Z_n (T_n - \theta) \ge 0 \mid N = n \} \ge 1/2$, $\forall\, \theta \in \Theta$. Now, (2.6), (2.7) and (2.8) ensure that for every $n$ ($\ge 1$),

(2.10)    $P_\theta\{ Z_n (T_n - \theta) \ge 0 \mid N=n \}$ is $\ge 1/2$,

and hence, (2.9) follows from (2.10). Q.E.D.

In passing, we may remark that (2.6), (2.7) and (2.8) imply (2.3) in the sequential case. If $B_Z^{(n)}$ is the sigma subfield generated by $Z_n$, then by (2.7), $B_Z^{(n)} \subset B_W^{(n)}$, for every $n$, and hence, in (2.10), given $N=n$, $Z_n$ and $T_n$ are conditionally independent. In many applications, $T_N$ may be identified as a function of a sufficient statistic (in a sequential setup), and we need to choose such a function in such a way that $T_N$ is MU for $\theta$.

As an illustration, we consider the following simple example: Let $\{X_i, i \ge 1\}$ be i.i.d.r.v. with the normal($\theta$, $\sigma^2$) distribution, where both $\theta$ and $\sigma^2$ are unknown. For every $n$ ($\ge 2$), let $s_n^2 = (n-1)^{-1} \sum_{i=1}^{n} (X_i - \bar X_n)^2$ be the sample variance (based on $X_1$, ...
, $X_n$), and consider a stopping number $N = N_K$, defined by

(2.11)    $N = \inf\{ n \ge n_0 : \psi(n) \ge K s_n^2 \}$, $K > 0$ (usually taken large),

where $\psi(n)$ is a monotone increasing function of $n$, and $n_0$ ($\ge 2$) is the minimum sample size. Such a stopping number arises in the context of bounded-width confidence intervals for $\theta$ (where $\psi(n) = n$) or minimum risk point estimation of $\theta$ (where $\psi(n) = n^2$). Let us consider the Helmert transformation:

(2.12)    $W_i = [\, X_1 + \cdots + X_{i-1} - (i-1) X_i \,] / [\, i(i-1) \,]^{1/2}$, $i \ge 2$; $W_1 = 0$.

Then, we have

(2.13)    $T_n = \bar X_n = n^{-1} \sum_{i=1}^{n} X_i$ and $s_n^2 = (n-1)^{-1} \sum_{i=1}^{n} W_i^2$, for every $n \ge 2$.

Note that for every $n$, $T_n$ is independent of $W_n = (W_1, \ldots, W_n)'$ [and hence of $s_k^2$, $k \le n$]. Further, for every $n \ge 1$, $T_n$ has a normal distribution with mean $\theta$ and variance $n^{-1}\sigma^2$, so that $T_n$ is MU for $\theta$. Hence, for any $Z_n = v_n(W_n)$, (2.6) through (2.8) hold. This leads to the PC dominance of $T_N = \bar X_N$ within the class $C^0$ of estimators of the form $T_N + Z_N$ where $\{Z_n\}$ satisfies the $B_W^{(n)}$-measurability condition.

From considerations of equivariance, we may consider the following group of transformations

(2.14)    $G = \{\, g_{a,b}(X_n) = a 1_n + b X_n \,\}$, $a$ real, $b > 0$,

so that $G$-equivariant estimators of $\theta$ [under a loss $L(x, \theta; \sigma) = \rho((x - \theta)/\sigma)$ for a suitable $\rho$] have the representation:

(2.15)    $m_n(X_n) = \bar X_n + \omega(\|W_n\|^{-1} W_n)\, u(W_n)$, $n \ge 1$,

where $X_n = (X_1, \ldots, X_n)'$, $\omega(\cdot)$ and $u(\cdot)$ are suitable functions and $\|\cdot\|$ stands for the Euclidean norm. Identifying $Z_n$ with the second term on the right hand side of (2.15), we are led to the class $C^0$ of $G$-equivariant estimators in (2.9).

In the context of estimation of the scale parameter, typically, we have a nonnegative $T_N$, and in that case, we may set $C^0_*$ as the class of estimators of the form $U_N = T_N \{1 + Z_N\}$, where (2.6)-(2.8) pertain to the $T_N$ and $Z_N$.
Then along the same line as in Theorem 1, we obtain that within the class $C^0_*$ of estimators, the nonnegative, MU estimator $T_N$ is the Pitman closest one. We may also remark that in the conventional nonsequential case, Brown, Cohen and Strawderman (1976) have shown that a MU estimator not solely based on a sufficient statistic can be dominated by a version of the sufficient statistic which is MU. In this respect, they confined themselves to the class of all MU estimators of $\theta$, whereas our $U_N$ need not be MU for $\theta$. In that respect, we have a larger class. However, in passing we may note that the Brown-Cohen-Strawderman result [Corollary 4.1] extends directly to the sequential case under (2.6)-(2.8).

Let us now consider a multiparameter extension of Theorem 1. We conceive of a $p$-vector $\theta = (\theta_1, \ldots, \theta_p)'$, for some $p \ge 1$, so that in (2.5), $T_n$ is also a $p$-vector. The condition (2.6) stands as it is; in (2.7), $Z_n$ is also a $p$-vector, while we may modify (2.8) as follows:

(2.8')    For every $n \ge 1$ and arbitrary $p$-vector $\ell$, $\ell'(T_n - \theta)$ is MU for 0.

Moreover, in this case, for a given positive semi-definite (p.s.d.) matrix $Q$, we define the quadratic norm $\|x\|_Q^2 = x'Qx$, and extend (2.1) as follows: $T_1$ is Pitman-closer than $T_2$ if $P_\theta\{ \|T_1 - \theta\|_Q \le \|T_2 - \theta\|_Q \} \ge 1/2$, for every $\theta \in \Theta$. Then for the class $C^0$ of estimators of the form $U_N = T_N + Z_N$, we have the following.

THEOREM 2. Under (2.6), (2.7) and (2.8'), we have for all p.s.d. $Q$,

(2.16)    $P_\theta\{ \|T_N - \theta\|_Q \le \|U_N - \theta\|_Q \} \ge 1/2$, $\forall\, U_N \in C^0$, $\theta \in \Theta$.

Outline of the proof. It suffices to show that

(2.17)    $P_\theta\{ Z_N' Q (T_N - \theta) \ge 0 \} \ge 1/2$, for every $\theta \in \Theta$.

As in (2.10), we rewrite the left hand side of (2.17) as

(2.18)    $\sum_{n \ge 1} P_\theta\{N = n\}\, P_\theta\{ Z_n' Q (T_n - \theta) \ge 0 \mid N = n \}$.

Now, $Z_n' Q$ is $B_W^{(n)}$-measurable, so that by (2.6) and (2.8'), given $N = n$, $Z_n' Q (T_n - \theta)$ is MU for 0. Hence, (2.18) is $\ge 1/2$, and the proof is complete.

Remarks. Note that in Theorem 2, $Q$ is allowed to be arbitrary, so that the PC dominance holds for all p.s.d. $Q$.
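As a numerical illustration of Theorem 1 in the normal example of (2.11)-(2.13), the following is a minimal Monte Carlo sketch, not part of the original argument. It assumes $\psi(n) = n$, takes $Z_n = 0.5\, s_n$ as one arbitrary function of the Helmert residuals, and uses hypothetical names (`sequential_mean`, `pc_frequency`) and arbitrary parameter values ($\theta = 1$, $\sigma = 1$, $K = 10$, $n_0 = 5$).

```python
import math
import random

def sequential_mean(theta, sigma, K, n0, rng):
    """Sample until n >= K * s_n^2 (rule (2.11) with psi(n) = n, an assumption).

    Returns (N, sample mean at N, sample standard deviation at N).
    """
    xs = [rng.gauss(theta, sigma) for _ in range(n0)]
    while True:
        n = len(xs)
        mean = sum(xs) / n
        s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # usual sample variance
        if n >= K * s2:
            return n, mean, math.sqrt(s2)
        xs.append(rng.gauss(theta, sigma))

def pc_frequency(theta=1.0, sigma=1.0, K=10.0, n0=5, shift=0.5, reps=4000, seed=0):
    """Relative frequency of |T_N - theta| <= |U_N - theta| with U_N = T_N + shift * s_N."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(reps):
        _, t, s = sequential_mean(theta, sigma, K, n0, rng)
        u = t + shift * s  # Z_N depends on the data only through s_N
        wins += abs(t - theta) <= abs(u - theta)
    return wins / reps

if __name__ == "__main__":
    print(pc_frequency())
```

Under these assumptions the printed frequency should come out at or above 1/2, in line with Theorem 1; the simulation only illustrates the statement and does not replace the proof.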
On the other hand, (2.8') is more restrictive than the usual definition of MU. Often, it may be easier to verify (2.8') by using the possible diagonal symmetry of $T_n$ around $\theta$: $T_n$ is diagonally symmetric about $\theta$ if $T_n - \theta$ and $\theta - T_n$ both have the same distribution. Note that this diagonal symmetry is not necessary for (2.8'). Further, (2.8') [or (2.8)] is also a sufficient condition for Theorem 2 to hold. However, without (2.8'), verification of (2.17) may be highly dependent on the distribution of $Z_n$, given $N=n$. Thus, the simplicity of Theorem 2 crucially rests on (2.8').

As an illustration, we consider the following. Let $\{X_i, i \ge 1\}$ be i.i.d.r.v. with the multinormal distribution with mean vector $\theta$ and dispersion matrix $\Sigma$ (both unknown). In this case, in the context of sequential estimation of $\theta$, Ghosh, Sinha and Mukhopadhyay (1976) and Woodroofe (1977) have considered a stopping number which may be defined as

(2.19)    $N = \inf\{ n \ge n_0 : \psi(n) \ge K [\, \mathrm{trace}(Q S_n) \,] \}$, $K$ ($> 0$) is usually large.

In this context, $\psi(n)$ may be defined as in (2.11) while $S_n$ is the sample covariance matrix based on $X_1, \ldots, X_n$, for $n \ge 2$. The Helmert transformation in (2.12) extends directly to the $p$-variate case, so that the characterization in (2.13) also extends directly to the $p$-variate case; here, $S_n = (n-1)^{-1} \sum_{i=1}^{n} W_i W_i'$, for $n \ge 2$. The equivariance in (2.14) also extends to the class of nonsingular linear transformations $X \to a + BX$, where $B$ is nonsingular. Moreover, since $\bar X_n - \theta$ has a multinormal law with null mean vector, its distribution is diagonally symmetric about $0$, independently of the $W_k$, $k \le n$, and hence, of $S_n$ and $[N=n]$. Hence, (2.6), (2.7) and (2.8') all hold, and Theorem 2 yields the PC dominance of $\bar X_N$ (the MLE) within the class of equivariant estimators.

3. PCD of sequential shrinkage estimators. Let us refer to the example considered at the end of the last section. For $p$, the number of coordinates of $\theta$, greater than 2, the MLE $\bar X_n$ is known to be inadmissible [see Stein (1956)], and various other (shrinkage or Stein-rule) estimators have been considered in the literature which dominate the MLE in quadratic error loss. Ghosh, Nickerson and Sen (1987) showed that such a quadratic error risk dominance holds in the sequential case too. Also, for $p > 1$, Sen, Kubokawa and Saleh (1989) have shown that in the light of the PCC, the Stein-rule estimators dominate the MLE $\bar X_n$ in the conventional fixed sample size case. Note that the Stein-rule estimators may not belong to the class $C^0$ in Theorem 2, and hence, the characterization in Theorem 2 may not apply to these estimators. Thus, there is a natural interest in studying the PC dominance of such Stein-rule estimators in the general sequential case. This will be done here.

For the sake of simplicity, we consider the simplest problem in the sequential estimation of a multivariate normal mean vector $\theta$ when the dispersion matrix $\Sigma$ is of the form $\sigma^2 I_p$, where $\theta$ and $\sigma^2$ ($> 0$) are unknown. Based on $n$ i.i.d. observations $X_1, \ldots, X_n$, the MLE of $\theta$ is $\bar X_n = n^{-1} \sum_{i=1}^{n} X_i$, $n \ge 1$. Consider the class of James-Stein estimators $\delta_n^b = \delta_n^b(X_1, \ldots, X_n)$ of the form

(3.1)    $\delta_n^b = \{ 1 - b s_n^2 (n \|\bar X_n\|^2)^{-1} \}\, \bar X_n$, $n \ge 2$,

where

(3.2)    $s_n^2 = (np)^{-1} \sum_{i=1}^{n} (X_i - \bar X_n)'(X_i - \bar X_n)$,

and $b$ is a nonnegative shrinkage factor. In this setup, we conceive of the null pivot, and the adjustments for any other given pivot $\theta_0$ are routine in nature. Also, in this setup, the stopping number $N$ in (2.19) simplifies further, and we may consider a well defined stopping rule $N$, such that for every $n \ge 2$, $[N=n]$ depends only on the $s_k^2$, $k \le n$. Finally, we define the class of sequential shrinkage estimators $\delta_N^b$ by (3.1), allowing $\delta_N^b = \delta_n^b$ when $N = n$, for $n \ge 2$.
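The definitions (3.1) and (3.2) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names (`sample_mean`, `pooled_variance`, `stein_rule`) are hypothetical, observations are assumed to be given as a list of equal-length lists, and the sample mean is assumed to be nonzero so that the shrinkage factor is defined.

```python
def sample_mean(xs):
    """Coordinatewise mean of a list of p-vectors."""
    n, p = len(xs), len(xs[0])
    return [sum(x[j] for x in xs) / n for j in range(p)]

def pooled_variance(xs):
    """s_n^2 as in (3.2): total squared deviation divided by n * p."""
    n, p = len(xs), len(xs[0])
    m = sample_mean(xs)
    total = sum((x[j] - m[j]) ** 2 for x in xs for j in range(p))
    return total / (n * p)

def stein_rule(xs, b):
    """James-Stein type estimator (3.1) with null pivot and shrinkage factor b."""
    n = len(xs)
    m = sample_mean(xs)
    norm2 = sum(v * v for v in m)  # squared Euclidean norm of the mean (assumed > 0)
    factor = 1.0 - b * pooled_variance(xs) / (n * norm2)
    return [factor * v for v in m]

if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [2.0, 4.0, 0.0]]
    print(sample_mean(data))      # the MLE
    print(stein_rule(data, 1.0))  # the mean shrunk toward the null pivot
```

With $b = 0$ the function returns the sample mean itself; for the sequential version one would simply evaluate `stein_rule` on the first $N$ observations once the stopping rule has fired.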
In view of the special structure of the dispersion matrix, one may consider here a simple quadratic error loss

(3.3)    $L(\delta_n, \theta) = (\delta_n - \theta)'(\delta_n - \theta) + cn$, $n \ge 1$, $c > 0$.

Then, it follows from Theorem 1 of Ghosh, Nickerson and Sen (1987) that under the loss in (3.3) and the stopping rule $N = \inf\{ n \ge 2 : n \ge (p/c)^{1/2} s_n \}$, the risk of the sequential estimator $\delta_N^b$ is smaller than (or equal to) that of $\bar X_N$, for every $c > 0$ and every $b \in (0, 2(p-2))$, $p \ge 3$, $\theta \in \Theta \subset R^p$. It is customary to take $c$ small, so that $p/c$ is large, and this is then comparable to (2.11). Also, in a fixed sample size case, it follows from Theorem 1 of Sen, Kubokawa and Saleh (1989) that for every $b \in (0, (p-1)(3p+1)/2p)$, $p \ge 2$, $\delta_n^b$ dominates $\bar X_n$ in the light of the PC measure. Our basic goal is to extend the latter result to the sequential case, so that it would provide a result complementary to Theorem 1 of Ghosh et al. (1987).

Note that $\delta_N^b$ in (3.1) does not belong to the class of estimators in Theorem 2. Moreover, when we compare $\delta_N^b$ and $\bar X_N$, both based on the same stopping number $N$, it is not necessary to incorporate the second term in (3.3) in this comparison. Hence, we shall use the conventional PCC: $\delta_N^b$ dominates $\bar X_N$ in the PC measure if

(3.4)    $P_{\theta,\sigma}\{ \|\delta_N^b - \theta\| \le \|\bar X_N - \theta\| \} \ge 1/2$, $\forall\, \theta, \sigma$.

For our further analysis, we define the stopping number $N$ by (2.11) where $s_n^2$ is now defined by (3.2), and for the sake of simplicity of presentation, we take $\psi(n) = n$. In Section 4, we shall briefly mention the other cases.

THEOREM 3. For the class of Stein-rule estimators in (3.1) and the stopping number in (2.11), the PC dominance in (3.4) holds, for every $b \in (0, (p-1)(3p+1)/2p)$.

Proof. Note that by (3.1), for every $n \ge 2$,

(3.5)    $\|\delta_n^b - \theta\|^2 = \|\bar X_n - \theta\|^2 + n^{-2} b^2 s_n^4 \|\bar X_n\|^{-2} - 2 b s_n^2 (n \|\bar X_n\|^2)^{-1} \bar X_n'(\bar X_n - \theta)$.

Hence, it suffices to show that

(3.6)    $P_{\theta,\sigma}\{ N \bar X_N'(\bar X_N - \theta) \ge (b/2) s_N^2 \} \ge 1/2$, $\forall\, \theta, \sigma$ and $b \in (0, (p-1)(3p+1)/2p)$.

Let us introduce the following notations. Let $\eta = \tfrac12 \theta$, $\lambda = \sigma^{-2}\Delta$, $\Delta = \|\theta\|^2/4$,
$b_0 = b/2$, and let $G_p^{(\Delta)}(x) = 1 - \bar G_p^{(\Delta)}(x)$, $x \in R^+$, be the d.f. of a noncentral chi squared r.v. with $p$ degrees of freedom (DF) and noncentrality parameter $\Delta$ ($\ge 0$). Then, on noting that $\bar X_n'(\bar X_n - \theta) = \|\bar X_n - \eta\|^2 - \lambda\sigma^2$, we may rewrite the left hand side of (3.6) as

(3.7)    $P_{\theta,\sigma}\{ N \|\bar X_N - \eta\|^2/\sigma^2 \ge N\lambda + b_0 s_N^2/\sigma^2 \} = \sum_{n \ge 2} E\{ I[N=n]\, \bar G_p^{(n\lambda)}(n\lambda + b_0 s_n^2/\sigma^2) \} = E[\, \bar G_p^{(N\lambda)}(N\lambda + b_0 s_N^2/\sigma^2) \,]$.

Therefore, it suffices to show that

(3.8)    $E[\, \bar G_p^{(N\lambda)}(N\lambda + b s_N^2/\sigma^2) \,] \ge 1/2$, for every $b \in (0, (p-1)(3p+1)/4p)$, $\lambda \ge 0$.

Note that

(3.9)    $(\partial/\partial b)\, E[\, \bar G_p^{(N\lambda)}(N\lambda + b s_N^2/\sigma^2) \,] = -E[\, (s_N^2/\sigma^2)\, g_p^{(N\lambda)}(N\lambda + b s_N^2/\sigma^2) \,] \le 0$, for every $b \ge 0$,

where $g_p^{(\Delta)}(y)$ stands for the noncentral chi square pdf with $p$ DF and noncentrality parameter $\Delta$. Thus, if we verify that (3.8) holds for any $b$ arbitrarily close to $(p-1)(3p+1)/4p$, then it follows that it would also hold for all smaller (but positive) values of $b$. Thus, if we let $\kappa = (p-1)(3p+1)/4p$, then it suffices to show that

(3.10)    $E[\, \bar G_p^{(N\lambda)}(N\lambda + \kappa s_N^2/\sigma^2) \,] \ge 1/2$, for every $\lambda \ge 0$.

In this context, we may refer to Theorem 1 of Sen, Kubokawa and Saleh (1989) where it is shown that in the fixed sample size case, for every $n \ge 2$, $p \ge 2$,

(3.11)    $E[\, \bar G_p^{(n\lambda)}(n\lambda + b s_n^2/\sigma^2) \,] \ge 1/2$, for every $b \in (0, \kappa)$, $\lambda \ge 0$.

Thus, the crux of the problem is to verify that (3.11) holds in the sequential case. The actual proof is lengthy and complicated too. Hence, for the sake of simplicity of presentation, we shall provide a broad outline of the proof.

First, we consider the asymptotic setup of Chow and Robbins (1965) or Robbins (1959) where in (2.11) [with $N$ designated as $N_K$] $K$ is allowed to go to $+\infty$. Note that $s_n^2/\sigma^2 \to 1$ almost surely (a.s.) as $n \to \infty$, and $N_K \to \infty$ a.s. as $K \to \infty$. Further, $\bar G_p^{(N\lambda)}(N\lambda + \kappa s_N^2/\sigma^2)$ is a bounded r.v. assuming values in $[0,1]$. Hence, in this case, convergence in probability would ensure convergence in the first moment too.
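The comparison in (3.4), under the stopping rule (2.11) with $\psi(n) = n$ and $s_n^2$ as in (3.2), can be explored with a small Monte Carlo sketch. Everything below is an illustrative assumption rather than the paper's procedure: the function names (`simulate_once`, `pc_event_frequency`), the parameter values ($p = 3$, $\theta = (0.5, 0.5, 0.5)$, $\sigma = 1$, $b = 2$, $K = 20$, $n_0 = 2$) and the seed are arbitrary choices, with $b$ taken inside the range $(0, (p-1)(3p+1)/2p)$ of Theorem 3.

```python
import random

def simulate_once(theta, sigma, b, K, n0, rng):
    """One sequential run; returns True when the Stein-rule estimator is at least as close."""
    p = len(theta)
    sums = [0.0] * p  # running coordinate sums
    sumsq = 0.0       # running sum of squared entries
    n = 0
    while True:
        x = [rng.gauss(t, sigma) for t in theta]
        n += 1
        for j in range(p):
            sums[j] += x[j]
        sumsq += sum(v * v for v in x)
        if n < n0:
            continue
        mean = [s / n for s in sums]
        norm2 = sum(m * m for m in mean)
        s2 = (sumsq - n * norm2) / (n * p)  # s_n^2 of (3.2), divisor n * p
        if n >= K * s2:                     # stopping rule (2.11) with psi(n) = n
            break
    factor = 1.0 - b * s2 / (n * norm2)     # shrinkage factor of (3.1)
    d_stein = sum((factor * m - t) ** 2 for m, t in zip(mean, theta))
    d_mle = sum((m - t) ** 2 for m, t in zip(mean, theta))
    return d_stein <= d_mle

def pc_event_frequency(theta=(0.5, 0.5, 0.5), sigma=1.0, b=2.0, K=20.0,
                       n0=2, reps=3000, seed=1):
    """Relative frequency of the event in (3.4) over independent sequential runs."""
    rng = random.Random(seed)
    hits = sum(simulate_once(theta, sigma, b, K, n0, rng) for _ in range(reps))
    return hits / reps

if __name__ == "__main__":
    print(pc_event_frequency())
```

By (3.5), the distance comparison in `simulate_once` is the same event as $N \bar X_N'(\bar X_N - \theta) \ge (b/2) s_N^2$ in (3.6), so the reported frequency estimates the left hand side of (3.6). Under these settings it should exceed 1/2, consistent with Theorem 3.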
Finally, for any n > 1 and A >0, GenA)(nA +y) has a uniformly bounded and continuous first p derivati.ve (ji.e.• y1, for all p ~ 2. Therefore, i.t sufftces to show that for n suffi- G~JlA)(nA + K) ci.ent1y large, G~·nA)CnA + is ~ 1/2 • Now, by Lemma 2.2 of Sen et ale (1989), K }. is nonincreasing in (nA), while, proceeding as in Theorem 2 of Sen G~nA)(nA (1989), we obtain that ltm.(nA)-+al for every n, A : nA ~ o. + K ) = 1/2. Hence, G~nA)(nA +1( ) is ~ 1/2, This' simple method may not work out for the case where K in (2.111 i:s: be.ld ftxed, and hence, a more e.l aborate proof is necessary. For every n ~ 1 and A >0., let follows: from Sen (989) th.at m~nA) m~·nA) be the medtan of the d.f. G~nA), p ~ 2. It is nondecreasi"ng i.n n and A , and further, m~~~) ~ p + nA ,~ n ~ 1, A ~ O. Let K be defined as in (2.11), and let 2 2 n* ={min{ n: Kn ~ pKa }, if/Ka ~ K2; min{ n : K(n-1) > (p-2)Ka l,if 2Ka < I( • <,3.12) Then the left hand side of (3.10) can be written as • (3.13) E[ G~NA)(NA + Ks~/(2) ] = Ln<n* E{ I[N<n][ + E[ = Ln>2 E{ I[N=n] G~nA)(nA + Ks~/(2) } G~nA)(nA+ Ks~/(2) - G~(n+l)A)«(n+l)A+ KS~+1/(2)] } G~n*A)( n*A + KS~*la2 ) ] + Ln>n* E{ I[N>n][ G~nA)(nA+ Ks~/(2) - G~(n-l)A)«n_l)A + I(s~_1/a2) At this stage, we may assume without any loss of generality that n* is ~ ]} . 2, as other- wise, the first sum on the right hand side becomes vacuous. The second term is [ by (3.11)] bounded from below by 1/2. Thus, we need to show that each of the two sums in the right hand side of (3.13) is nonnegative. Towards this, we may note that • • (a/Cln)G(nA) (nA+ y) = ).,[ g(+n A)(n)., + y) _ g(nA)(nA + y) ] p p 2 p < Q d. . < (nA) ~ , accof1,ng as. nA + y 1,S. ~ mp+2 ' where the last step follows from the unimodality results in Sen (1989). As a. result, (3.14) we 9btai,n that C3.15)_ G~JIl+11).,1(Jn+1 )A+ y) - G~·nA)(nA + y) < 0, i,f (n+1)A +. y < m(Jn+ 1 1A) - - p+2 ' > 0 , if nA +. y > m(n A). . . - p -10- Further, note that U~ = nps~/a.2 has the central chi. square d.f. 
with p(n-l}OF (6 (n-l» * 2 2 * P * and Un+l = (n+l)psn+l/a' = Un + Un+l where Un+l has the d.f. 6 , independently of U . ~~+1- ~~lIi ~ ~~/a2 E[(5~+1- 5~)/i ~Bn C = (n+1 )-1 p-l Un+1 ]. 50 :hat = (n+l)-l[ 1 - s~/a2 ] , for every n ~ 2. Finally, note that [N<n] <=> [ s~ ~ n/K] Aha note tl1at is 8n-measurable, and [N>n] <=> [s~ > m/K, ~ m~ n-l] is 8n_l -measurable. With these, we consider a typical term in the last sum on the right hand side of (3.13). * Note that for any n ~ n*+l, (3.16) [ ~(nA)(nA + Ks 2/a 2 ) _ ~«n-l)A)«n_l)A + KS 2 /a 2 )]} [N>n] p n p n-l = E{ I[N>n][ ~~nA)(nA + Ks~/a2) - ~~nA)(nA + KS~_1/a2 ) ]} + E{ l E{ I[N>n][~~nA)(nA + KS~_1/a2) - G~(n-l)A)«n_l)A + KS~_1/~2 ) ]} • 2 2 2 Now, by (3.15) and the fact that on [N >n], (n-l}A + Ksn_l/a > (n-l)A + K(n-l)/Ka ~ m~l~-l)A) ~ m~~;A) , we obtain that the second term on the right hand side of (3.16) is nonnegative. For the first term, we may note that (3.17) (a/ay)G(nA)(nA+ y) = _g(nA)(nA + y ) , y > 0 ; p p 2 (3.18) (a / ay 2)G(nA)(nA+ y) = [ g(nA)(nA + y) - g(nA ) (nA+ y)]/2 p p p- 2 > 0, for all y: nA + y > m(nA) , -. - p where again the last step follows from the. urii.modaH.ty results in Sen [i989). Thus, G~n'A)(n'A + y) is a convex function of y, for all y ~ m~n'A)_ nA (~p-2). Note that for n > n*, on the set [N > n], nA + Kn- l (n-l)s2 1/a2 > n'A + (p-2) > m(nA), and n- p 2 2 1 ) 2 2 nA + Ksn/a = nA + Kn (n-l sn_l/a + KUn/np , where Un has the central chi square d.f. with p OF, independently of s~_l or [N ~ n]. Hence, using the convexity of G~_n'A)(nA + Kn-l(n-l)s~_1/a2+ KUn/np) [in Un] along with. the Jensen inequality, we obtain that on [N (3.19) ~ n] , E[ G~nA)(n'A + Ks~/a2) I 8n- l ] ~ G~n'A)Cn~ + Kn-l(Jl-l}s~_1/q2 + Kn... l ) ~ G~n'A)(nA + KS~_1/a2) + Kn- l [ s~_1/a2 - 1]9~nA)(nA + KS~_1/a2) , where the last step follows. from (3.17)-(3.18) and the fact that Ks.~_ln...·l(n-1)/a2 is ~ m~-~~>'" n'A , for N~ n. Finally, note that by (3 ..1.2), for n > n*, s~_l/i !:. 
$p/\kappa$ on $[N \ge n]$, and $\kappa < p-1 < p$. Hence, from (3.16) and (3.19), we obtain that

(3.20)    $E\{ I[N \ge n] [\, \bar G_p^{(n\lambda)}(n\lambda + \kappa s_n^2/\sigma^2) - \bar G_p^{(n\lambda)}(n\lambda + \kappa s_{n-1}^2/\sigma^2) \,] \} \ge E\{ I[N \ge n]\, \kappa n^{-1} ( s_{n-1}^2/\sigma^2 - 1 )\, g_p^{(n\lambda)}(n\lambda + \kappa s_{n-1}^2/\sigma^2) \} \ge 0$, for every $n > n^*$.

This shows that the last sum on the right hand side of (3.13) is nonnegative. The treatment for the first sum is very similar. Note that we would have two terms as in (3.16), where (3.15) would ensure that the second term is nonnegative. For the first term, we write $\bar G_p^{(n\lambda)}(\cdot) = 1 - G_p^{(n\lambda)}(\cdot)$, and note that (3.18) for the complementary part would ensure the convexity of $G_p^{(n\lambda)}$, and hence, the rest of the manipulations can be carried out as in (3.19) and (3.20). We therefore omit these details. Q.E.D.

4. Some general remarks. Our main result is contained in Theorem 3. For the sake of simplicity, we have made some assumptions which are possibly less general than they may actually appear in this context. For example, in (2.11), we have considered a general $\psi(n)$, while in the proof of Theorem 3, we have taken $\psi(n) = n$. The treatment of $\psi(n) = n^2$ (or other plausible forms) poses no extra regularity conditions but extra manipulations. The basic fact is that the stopping time $N$ has a distribution governed by the sequence $\{s_n^2\}$, and the convergence properties of these $s_n^2$ provide the desired keys to the actual manipulations.

Secondly, in (3.2), instead of the conventional divisor $p(n-1)$, we have taken $np$. A similar modification was also made in Ghosh, Nickerson and Sen (1987) for dealing with the sequential shrinkage estimation in the light of the quadratic error loss functions. In the current case, it may not be necessary to have the divisor $np$; $p(n-1)$ would have been quite valid. But then, in the definition of $n^*$ in (3.12), we would have needed some adjustments.

Thirdly, in the
conventional fixed sample size case, Sen, Kubokawa and Saleh (1989) considered a more general form of Stein-rule estimators, where in (3.1), the scalar constant $b$ ($0 < b \le (p-1)(3p+1)/2p$) was replaced by an arbitrary function $\phi(\bar X_n, s_n^2)$ such that for all $p \ge 2$, $n \ge 2$,

(4.1)    $0 < \phi(\bar X_n, s_n^2) < (p-1)(3p+1)/4p$, for every $(\bar X_n, s_n^2)$ a.e.

In our case too, we may extend (4.1) to the sequential setup by incorporating the stopping time $N$ (i.e., taking $\phi(\bar X_N, s_N^2)$), and the steps in (3.6) through (3.8) remain intact [by virtue of the nonnegativity of $\phi(\cdot)$ and its boundedness from above]. Thus, (3.10) again pertains to this more general situation, and the proof provided remains applicable too.

Fourthly, under appropriate quadratic error loss, positive-rule versions of shrinkage estimators are known to dominate the usual shrinkage estimators, and in the conventional nonsequential case, this dominance result has also been established in the light of the Pitman closeness [viz., Sen, Kubokawa and Saleh (1989)]. For the particular model considered in Section 3, a positive-rule version of (3.1) is

(4.2)    $\delta_n^{b+} = \{ 1 - b s_n^2 (n \|\bar X_n\|^2)^{-1} \}^+\, \bar X_n$; $a^+ = \max(a, 0)$,

where $b$, $s_n^2$ and the other notations are borrowed from (3.1). A natural sequential version of this positive rule estimator is given by $\delta_N^{b+}$, where the stopping number $N$ is defined as in (2.11). Note that by (3.1) and (4.2),

(4.3)    $P_{\theta,\sigma}\{ \|\delta_N^{b+} - \theta\| \le \|\delta_N^b - \theta\| \} \ge P_{\theta,\sigma}\{ b s_N^2 \le N \|\bar X_N\|^2 \} + P_{\theta,\sigma}\{ [\, b s_N^2 > N \|\bar X_N\|^2 \,] \cap [\, \theta'\bar X_N \ge 0 \,] \}$.

When $\theta = 0$, the right hand side of (4.3) is equal to 1, so we need to consider only the case of $\theta \ne 0$. Let $A$ be an orthogonal matrix of order $p$, such that the first row of $A$ is $\|\theta\|^{-1}\theta'$. For every $n \ge 1$, let $Y_n = (Y_{1n}, \ldots, Y_{pn})' = A \bar X_n$, so that $Y_{1n} = \|\theta\|^{-1}\theta'\bar X_n$, for every $n \ge 1$.
Then, the second term on the right hand side of (4.3) can be written as

(4.4)    $P_{\theta,\sigma}\{ [\, b s_N^2 > N \|Y_N\|^2 \,] \cap [\, Y_{1N} \ge 0 \,] \} = \sum_{n \ge 2} P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n \|Y_n\|^2 \,] \cap [\, Y_{1n} \ge 0 \,] \}$.

Note that if $B_n = B(s_k^2, k \le n)$ is the sigma subfield generated by the $s_k^2$, $k \le n$, then (i) $[N=n]$ is $B_n$-measurable, for every $n \ge 2$, (ii) given $N = n$, $Y_{1n}$ has the normal distribution with mean $\|\theta\|$ and variance $n^{-1}\sigma^2$, and (iii) $s_n^2$ and $Y_n$ are independent. Thus, we may virtually repeat the proof in (2.6)-(2.7) of Sen, Kubokawa and Saleh (1989), and conclude that for every $n \ge 2$,

(4.5)    $P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n \|Y_n\|^2 \,] \cap [\, Y_{1n} \ge 0 \,] \} \ge (1/2)\, P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n \|Y_n\|^2 \,] \} = (1/2)\, P_{\theta,\sigma}\{ [N=n] \cap [\, b s_n^2 > n \|\bar X_n\|^2 \,] \}$.

Thus, (4.4) is bounded from below by $(1/2)\, P_{\theta,\sigma}\{ b s_N^2 > N \|\bar X_N\|^2 \}$, and hence, the right hand side of (4.3) is bounded from below by

(4.6)    $P_{\theta,\sigma}\{ b s_N^2 \le N \|\bar X_N\|^2 \} + (1/2)\, P_{\theta,\sigma}\{ b s_N^2 > N \|\bar X_N\|^2 \} = 1/2 + (1/2)\, P_{\theta,\sigma}\{ b s_N^2 \le N \|\bar X_N\|^2 \} \ge 1/2$.

Therefore, $\delta_N^{b+}$ dominates the usual shrinkage estimator $\delta_N^b$ in the light of the Pitman closeness criterion for an arbitrary stopping rule (depending only on the $s_n^2$). Here also, $p$ is $\ge 2$, and again, we may replace the shrinkage factor $b$ by a more general term $\phi(\cdot)$ as in (4.1), and the PC dominance remains intact.

Fifthly, in the fixed sample size case, Sen, Kubokawa and Saleh (1989) considered the case of a normal distribution with mean vector $\theta$ and dispersion matrix $V\sigma^2$, where $V$ is a known positive definite matrix and $\sigma^2$ is unknown. In this setup, if we assume that the $X_i$ are i.i.d.r.v.'s with the $N(\theta, V\sigma^2)$ distribution, then the PC dominance in Theorem 3 also holds; we only need to modify the shrinkage estimator in (3.1) allowing a possibly more general norm as in Theorem 2. Towards this, we let

(4.7)    $\delta_n^{\phi} = [\, I - \phi(\bar X_n, s_n^2)\, s_n^2\, (n \bar X_n' V^{-1} Q^{-1} V^{-1} \bar X_n)^{-1} Q^{-1} V^{-1} \,]\, \bar X_n$, $n \ge 2$,

where $Q$ appears in the definition of the norm in Theorem 2 and all the other notations are borrowed from Section 3.
As adapted to a stopping rule N, the shrinkage estimator in (4.7) is defined for the sequential case, i.e., o¢ = o¢ if N = n . If we "'.n proceed as in (3.5) through (3.9) where we use the. ..... Q-norm d'Qd instead of the ",N Euclidean norm d I·d , then we may show easily that (3.10) again provides the desired '" '" result. Therefore, we omit the details. Finally, we consider the general case of an arbitrary covariance matrix ~ (positive definite but unknown) , and examine the PC dominance of sequential shrinkage estimators for stopping rul es' of the type in • (2.19). In this general case, a Stein-rule estimator of e [viZ., Stein (1981)] '" is of the form t (4.8) where * ~ ) (W -1- )-1 -1 -1 ]~ ! - (n-p) -1 ¢(~.n'~.n dn n~n~n ~n' 9 ~n ~n dn = minimum characteristic root of g~n ' ~n :; L~=l qi a Wi,shart ~n = [ Ct: '" - ~n) (~i - ~n) has ,p, n-1) distribution, and the other notations are borrowed from Sections 2, 3 and 4. For the fixed sample size case, the PC dominance of * ~n over - ~n has been -14- studied by Sen, Kubokawa and Saleh (1989), and in this cintext, it is assumed that n > p ( so that S has a nondegenerate distribution). Thus, in the sequential case -n too, we would need that the initial sample size no is > p. Let us define ~n = (n-p)-l~n ' for n > p , (4.9) and 1et • be defi ned as in (3. 10). Th.en, virtually repeating the fi rst part of K the proof of Theorem 2 of Sen, Kubokawa and Sa leh (l989), we conclude that for the desired PC-dominance of * over ~N - ' ~N Pe,~{ N(!N" ~ ) '~Nl~N ~ (4.10) it suffices to show that K > } V (~,~), 1/2, ~n where the stopping number N is defined by (2.19) [ with being replaced by ~n ]. In this context, we may set (4.11) ~ = _k ~ .~~ , ~n = L -k _ n~ ~ z (~n - ~ 0 ~n ) and = k L ~-~~n~-~ , for n > p. Note that for every n -> 1, Z has the normal law with null mean vector and disper-n sion matrix !p , and (n-p)~~ has the Wishart (!p' p, n-1) distribution. 
Thus, the left hand side of (4.10) can be written as (4 .12) I 0 -1 Po., I{ ~N (~N ) ~N ~ - N"2 ~ (~N )- ~N } - - I 0 I = ~n >n PO,t{ N=n }. PO,I{ ~n(~n) - 0 - - - 1 0 1 K -1 ~n ~ K - where the even [N = n] depends on the ~~, k ~ nand - ~ 0 n ~I(~n) g, -1 ~n I N= n }, introduced in (2.19), for n ~ no' Using the stochastic independence of the ~k and ~~ , we may claim that gtven -n VO (and N = n), A'{Vo)-lZ has a normal law with mean a , for any A, and _ -n -n we can conceive of a set of independent standard normal variabl es Ul , ... ,Up , such that for every n ~ no ' (4.13) PO•I { ~~(~~)-l~n ~ - 2 = PO,I{ &~=l dnjU j K - n~~t(~~)-l~n N=n * n"2 ~~=l dnjU j N=n, ~~ } - 1 * depend on where the dnj are the characteristic roots of (~~)- and the dnj as the dnj (which are held fixed) ; the Uj are independent of the dnj and > 1 K - A as well * · dnj Although this form is quite appealing, it does not lead us to the deisred result. The main difficulty is caused by the fact that in (2.19) N involves the sequence K(trace(9~n)l, n ~ no and K ( > 0) gi.ven. With the transformation in (4.11), we • -150 0 P 0 so that trace(gy n) = trace(g ¥n) = ~j=l dnj , say, where the dO. are the characteristic roots of QOv o , for n > n . The basic difficulty is o may write Q = k k ~2Q~2 , nJ - -n - 0 caused by the fact that the stopping number N is determined by the sequence {do.} nJ * may not be in general proporttonal tQ the d , while in (4.13) the dnj and dnj nj , and this creates a problem in adapting the breakup in (3.13). Thus, the method outlined in Section 3 may not work out for this general problem. This is not at all surprising. In the sequential shrinkage estimation (under quadratic error loss), a very similar problem cropped up, and hence, in all the works. of Ghosh et al. (1987), Nickerson(1987) and Sriram and Bose (1988), the case of ~ = (iv... was considered . 
Also, if we go back to the classical multivariate sequential problem treated in Ghosh, Sinha and Mukhopadhyay (1976) and Woodroofe (1977), there also, a stopping rule of the type (2.19) was considered, allowing $K$ to go to $+\infty$. In this asymptotic case, we have no difficulty in claiming that $\delta_N^*$ dominates $\bar X_N$ in the PC criterion. This follows from the fact that $V_n \to \Sigma$ a.s., as $n \to \infty$, and $N_K$ [defined as in (2.19) with $K$] goes to $+\infty$ a.s., as $K \to \infty$. As such, in (4.8) (for $n = N$), we may replace $(N-p) S_N^{-1}$ by $\Sigma^{-1}$, and with the resulting estimators, the case reduces to that of a known $\Sigma$, for which our earlier results would directly apply. However, we may remark that as $K \to \infty$, the PC-dominance of $\delta_N^*$ over $\bar X_N$ is perceptible only in a small neighborhood of the pivot, i.e.,

(4.14)    $\theta \in \Theta_K = \{ K^{-1}\gamma : \gamma \in C \}$, $C$ compact, $C \subset R^p$.

This asymptotic feature of the shrinking neighborhood pertaining to the sequential shrinkage estimators (under quadratic loss) has already been noticed in Sen (1987), and in the PC-dominance, the same remains pertinent. The problem of establishing the dominance (in quadratic error loss or in the PC-criterion) for an arbitrary $\Sigma$ and for any fixed $K > 0$ remains largely open.

REFERENCES

BROWN, L.D., COHEN, A. and STRAWDERMAN, W.E. (1976). A complete class theorem for strict monotone likelihood ratio with applications. Ann. Statist. 4, 712-722.
CHOW, Y.S. and ROBBINS, H. (1965). On the asymptotic theory of fixed width sequential interval for the mean. Ann. Math. Statist. 36, 457-462.
GHOSH, M., NICKERSON, D.M. and SEN, P.K. (1987). Sequential shrinkage estimation. Ann. Statist. 15, 817-829.
GHOSH, M. and SEN, P.K. (1989). Median unbiasedness and Pitman closeness. J. Amer. Statist. Asso. 84, in press.
GHOSH, M., SINHA, B.K. and MUKHOPADHYAY, N. (1976). Multivariate sequential point estimation. J. Multivar. Anal. 6, 281-294.
LEHMANN, E.L. (1983). Theory of Point Estimation. John Wiley, New York.
NAYAK, T. (1989).
Estimation of location and scale parameters using generalized Pitman nearness criterion. J. Statist. Plan. Infer. (to appear).
NICKERSON, D.M. (1987). Sequential shrinkage estimation of linear regression parameters. Sequen. Anal. 6, 93-117.
ROBBINS, H. (1959). Sequential estimation of the mean of a normal population. In Probability and Statistics (Harald Cramér volume), Almqvist & Wiksell, Uppsala, pp. 235-24 .
SEN, P.K. (1987). Sequential Stein-rule maximum likelihood estimation: General asymptotics. In Statistical Decision Theory and Related Topics IV (eds. J.O. Berger), Springer Verlag, New York, Vol. 2, 195-208.
SEN, P.K. (1989). The mean-median-mode inequality and noncentral chi square distributions. Sankhya, Ser. A 51, 108-116.
SEN, P.K., KUBOKAWA, T. and SALEH, A.K.M.E. (1989). The Stein paradox in the sense of the Pitman measure of closeness. Ann. Statist. 17, in press.
SRIRAM, T. and BOSE, A. (1988). Sequential shrinkage estimation in the general linear model. Sequen. Anal. 7, 149-163.
STEIN, C. (1956). Inadmissibility of the usual estimator of the mean of a multivariate normal distribution. Proc. Third Berkeley Symp. Math. Statist. Probability 1, 197-206, Univ. Calif. Press.
STEIN, C. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9, 1135-1151.
WOODROOFE, M.B. (1977). Second order approximations for sequential point and interval estimation. Ann. Statist. 5, 984-995.

Pranab K. Sen
Departments of Biostatistics & Statistics, University of North Carolina, Chapel Hill, NC 27599-7400 (3260).