SHRINKAGE EETDlATI(I( IN A RFETRICfED PARAIIETER SPAa: BY DEBAPRIYA SENGUPTA AND PRANAB KUMAR SEN University of North Carolina. at Chapel Hill In a restricted parameter space. restricted maximum likelihood estimator generally perform better than their unrestricted counterparts. Stein-rule or shrinkage versions of these restricted estimators are considered for the multi-variate normal model when the mean vector is restricted to a positively homogeneous cone and the dispersion matrix mayor may not be specified. In the light of the usual quadratic error loss. dominance of the proposed shrinkage estimators is studied. In this context. a Stein-type identity is established for the positivelyorthant restrained multi-normal distribution. AMS Subject Classification Nos. 62C15. 62FIO. 62H12 Key words and phrases: Admissibility; positively homogeneous cone; positive-rule estimator; quadratic loss; restricted MLE; Stein-rule estimator; Stein-type identity. 1 1. Introduction. Under a quadratic loss: improved estimation of the mean vector (0) of a p(~3)-variate normal distribution has received a great deal of attention; See Stein (1956), James and Stein (1961) and an extensive literature cited in Berger (1985). A variant of this problem relating to a restricted parameter space (9) C RP ) is treated here. the unrestricted case, i.e., for 9 =RP , In shrinkage (or Stein-rule) estimators (SRE) of 0 dominate the classical maximum likelihood estimator (MLE). The pivot (!O), underlying the construction of SRE's, in the restrictecd case, may not be an inner point of 9 [a situation quite comparable to the hypothesis testing problem against a restricted alternative where under the null hypothesis, 0 may lie on the boundary of 9], so that the classical MLE may not be generally optimal. Generally, restricted MLE (RMLE), derived under the parameter restraints, performs better than the unrestricted MLE (UMLE) on the restricted parameter space, although a different picture may emerge on the complementary parameter space (i.e., ~P\9). wonder: One may therefore Can the RMLE be dominated (in risk) in the same manner as the UMLE is dominated by a SRE? The primary objective of the current study is to focus on this basic issue. To motivate, we may mention some typical problems where RMLE are advocated. ~ 1 R . (i) Ordered alternative problem: Here 9 = {O € RP : 0 0p}' so that the pivot (or null hypothesis) relates to 0 1 = 01, ~ 0 € The RMLE computed under the ordered restrictions is termed the isotonic MLE [viz., Barlow et al. (1912)]. problem: Here e (ii) Orthant alternative = {O € RP : ! ~ !O}, for some specified !O € RP . the RMLE is computed under the orthant restraint that ! " ~ !O. Thus, We may refer to Kudo (1963), Nuesch (1966) and Perlman (1969) for various 2 properties of RMLE. in:;Tes1lriot"edrmodels. It is well anticipated (and demonstrated. in some specific cases) that over a restricted.paraJDeter· sp:lce fJ. RMLE performs better than the UMLE. a. This raises the.i-Cluestion:- is the RMLE admissible? showing that for p ~ Under a quadratic loss and confined to We shall provide a negative answer by 3. a Stein-type shrinkage of the RMLE is indeed possible. and we shall provide an explicit formulation of this improved RMLE. An explicit expression for this improved RMLE and its risk function are also provided. Along with the preliminary notions, the RMLE are considered in Section 2. Section 3 deals with the .proposed restricted shrinkage MLE . . 9 established. over a+ ) ~ O}. The dominance of the RSMLE .•over the RMLE is : A general is also < ) olasslo£.re~~ricted minimax consldered;.inothe'~S8:IbEflTlsection. to the case of }; = ..' .- 02v. V mown and 0 2 estimators (of 9 Section 4 is devoted unlmown. and the results of Section 3 are extended for' dliS .moq.,l~· ··i'the9tbore igeneral case of }; arbitrary (and unknown)nts Ulen ttea.tedL'in;section 5. intricate properties of,t~exW~~~~~_~i~t~i,but~on(of Utilizing some sub-matrices). the dominance of the RSMLE over the RMLE is established for this general model too. More general forms of restricted a. arising particularly in some common linear models, ar treated in Section 6. Finally, in Section 1. the usual SMLE and the proposed RSMLE ar compared in the light of their risks. While the SMLE is invariant under non-singular transformations. the RSMLE is not. canonical reduction of ! This eliminates the possibili ty of a (on a+), and hence. the relative picture is drawn for diverse situations dealing with appropriate subspaces of a+ 3 2. Preliminary notions. (2.1) N p For every posi;Uv.e rinteger Ii., let = {l, .... p}. By a eN. we mean a subset of N,p -p or~red 'key D<."ltural fordering. and a' - N \a (also eN). the complement ,of a. isrca}so.l'takeh as a naturally p - p Further. for every a ~ H~. ordered version. stands for the number of elements in a. la' I = p-Ia I. (2.2) t~,L~ the cardinali ty of a. Thus. Icpl = o. 0 ~ lal ~ p and For later use • we also define", N* = {a : a C N p - For a p-vector and p lal > 2}. for p ~ 3. X = (x 1 ..... x p )·. we define for every a eN. X as - p ,.,a the lal-vector consisting of the components Xi' i € a: the case of a = cp can be treated convenHonafly:'''Forevef:9' a eN. such that lal = r (0 ~ - p r ~ pl. i f u and v at~)rZ;i:i:dd"{iPFr':::-J~~l:)6r$~ r6~~Jdvely. then ~ (u.v) a",,,, ~:~ ',"':}Q' is the p-vector such' tMtl":' ~,..I....lr-. ~"... ,''ud'..i I .. ''''r<r<r t'm£> '1\ ,,,,:hL""J.' ,I .t~(.r'; .,.. , 1t,"'~ (2.3) where the order in whichnthe cOlllpQDents.Jof ui.w"enter !is determined by .'-(C~· '\. the natural ordering.' ~ ~ • r .. "y For a p x p posi tivendef,iiJlite i~p..1Cl. }.tliatnix Qand. for a eN. b C - P - N • Q b denotes the lalx'Y~bl'$uhmatl'tx ofrlQ c6n$isting of the rows in a p ..;a '" and columns in b. Further! j}f6r:li;f;-~e~£gi~:x~m;d'8. p.d~ matrix Q. for every a eN. beN • let - p P =~ - <, ",', (2.4) ~:b{= ~:b{~» (2.5) ~:b = ~ - Qab~bb~' "-I ~b~b ~. -1 Let then ~ (2.6) a b Then R (Q) lV = {X IV = UcpCaCN p -1 € R : Q •• x • ~ O. X . • (Q) ..;a a,.,a ,.,a·a IV ~a' IV > O}. cp cae - - N . P where the ~a are disjoint [viz. Kudo (1963)]. - - p We consider here specifically the orthant model for which 4 (2.7) A general class of restricted alternative models can be reduced to this specific model. and we shall JDEd(e. ~re detailed comments in section 6. Suppose now that ~1'.'.'~ a~,e inde~ndent and identically distributed (Ll.d.) random vectors (r.v.) having the p-variate normal distribution with mean vector 9 and dispersion matrix I (p.d.). Note that the UMLE of 9 is given by - ~ = n (2.8) X Note that ,J1 -1 ...n .2. i =1 ~i X and ,J1 ~ N (9.n-1 I). p~ ~ is unbiased for 9 but not admissible for p ~ 3. In the case of I lmown. the RMLE of 9 is given by .:;,' r,:~,' \y~_: 1::>.,. q:HJ.2 ~ , -!-('.;~, , ~_ .-,.'( , (2.9) wi th the notationsadap~:...erQlll3'('2;tlh)(21"31, ,., (1966)]. (2.10) .. and (2.6) [viz .• Nuesch .1.. The RMLE .is~; __not unblas~ for 9. ..Jl.n:~.·1 .~) j.l~;.: n" ~2~"" S _...n ,J1 - .2. i ,ii' (X -V'-IJ)(.VI-t-X-:t'\.!,•. ,-,J""'-,I)-;1 8 =l ~i ~ ~ffiJ."· '~~""';'\~ . n· Note that (X .,S ) are.rJd1n&}~ ,llliltilei:en1;2for (9.1). X and ,J1 S are ,J1 ,Jl ._. ~ ~,J1 mutually independent. and (2.11) The UMLE of 9 is still~. Restdcted to 8+. the RMLE of 9. in this . case is given by [viz .• Nuesch (1966)]. (2.12) A* 9nv ~ = cpCaCN I _ _ {X . . (S ). 0) I{X a ,J1a·a n ~ ~ ~ € ~ a (S ,J1 ». - - p Note that in (2.9) and (2.12). for only one (out of the 2P ) of the indicator functions is equal to 1. so that we have actually a single (random) a. written in this closed form. In the next three sections. we shall study the dominance of the RMLE (over the UMLE) and propose some 5 In this ~ontext;. ("~e' introduce the RSMLE which dominate the RMLE. quadratic loss function so that for an estimator 9 of 9.' '" '" tii~~r{~k t.:r~' gtven by A A Note that an estimator 9 dominates another estimator 0 (of the common 0) '" A '" '" A ~ if R{O.O) R{o.9). V9 € 9. with the strict inequality sign holding for some O. , 3. RSML when! is known. - ~ = ~ '" Np {!':)' 1Q .;,•. rM.:~ (' .. For simplicity of notations. we may take n=l. ~,~ (7) l'Z' ~ ~ij,. Note itiidt;{~;~Y;3'ip S:~~' N~}L:l'~ ~ ~~rti tioning of IRP into 2 P disjoint subsets·.~., Also. OIi'a:aX~h ,the·,~~l).leJrt ().f estimating 0 '" '" ~ O~~~~~h1.·1Yy>~edJ~~s 'to"solving under the constraint that 9 an '" '" lal-dimensional problem);.:: ThUs;til!l'da]i£ntg: tHef~l8.sSi,cal James-Stein (1961) shrinkage vesion indiVidually on each 0t;the 2P subsets ~a (:). 'P Cae N - - ! propose the following RSMLE of 9 (in " (3.1) A 9 on (With the shrinkage'factonIdePihdlng p RSM = ! cpCaCN l{X € '" - - P .t~e , case of known !): '" ' _l A A, ~ the respective lal). we {!»{l-c (Oov! 0ov)} a '" a ",.nn ",,,,.nn _lA 9ou • 1U'l where (3.2) o < c a < 2{lal-2). if lal > 2. and c a = 0 for lal ~ 2. Our main contention is to prove the following. THEOREM 3.1 Under the quadratic loss in (2.13). for every p A (3.3) R{!RsM'!) A ~ R{!RM'!)' V! € 9+. ~ 3. 6 where the strict inequality holds in a neighborhood of the pivot 0 ~,~~,0p,t,~~tes Thus, over 8+, the (CB+). the IOO..E. For the proof of this theorem and some other results (to follow), ,'):, ' ,', we need a basic lemma, which is presented first. Lemma 3.2. Let <X,y> ,....,,....,0 m ,...., ,...., (3.5) X "..., ~ ~1 'V,...., ' y, and, for every m ~ 0, let IV = (2lr) -p/2 I1: ,-1/2 'iJ (a,1:) (3.4) =: X'1: ,...., J... J -m <a,z>~ m IIzll~ z~O""" ~ ,...., ~ r 2 z, exp{1 -2-lIzll~ ,...., ~ Xp,...., (a,1:); 1: p.d. t"V ,...., Proof of Theorem 3.1. (2~14). By (2.13). (3.1) and the disjoint nature of the x (1:), a eN. oN'defintng N*~"'as in" (2.2). we obtain that for p a ~ - ~ p ,::' J.~, fJ p;" 3. A R(anC'll'~) = ~l'l •_ :l ~.... A ' A Ea{lIaDv-:-a-:e.f;&~l.I)I~i\a....~) 1: ~,.,IU'1 cpCaCN '" f a .....,.KJII ~,~ -1.... 2 .,anvll~ l(X € ~ (1:»} ,.,IU'1 ~ ~ a ~ - - p A = R(anv .a)-2 1: ,.,IU'1 (3.7) + * r'W a'-.n - p IV 1: * 8!CN Ea{ca ~ 2 A_2 ca Ea{lIanvll~ l(X ,.,IU'1 ~ IV € ~ a (1:»}. IV IV - P Thus. it suffices to show that for every a € 8 . the net contributions IV of the last three terms is negative. (3.8) 1: * Ea{c l(X € aCN a IV - p ~ a (1:)} IV = aCN 1: * - p + Towards this, note that c -1 a Pa{X. ,(1:) > O. 1: 'a' X ~ ~·a ~ Q ~ Q ~ , ~ O}. ~ 7 = _~* Pa{~:a'(:) >.. £}. • Pail3i~~.~t!!·~"~}. ~'p as X . ,..a.a I and X ,..a (:};) '" 'V I 'V are mutually th~a~f--~'ibr every .L f' L' "; ~.~£.':,: a C N - p Thus, ~' ! by (3.6) with p = lal and r=s=O, (3.8) reduces to (3.9) };*c aCl'f a - p Pa{};-~ 'V ~o}[exp{-~lIa', I{};)II~'·}]. I X, a a,..a 'V ,..a·a • 'V ,..a:a l]] k/2 -1 ~+Ial] / f [Ia 1c~02 (k!) ~(~:a'(~), :a:a fL-2 2 : /. ) ' [ Similarly, A -2 A }; * Ea{c "anvll~ <a,anv > l(X € !: (};»} a 4.0 ~ a 8JCN '" '" ~ (3.10) ,.,IU"l 'V,.,IU"l 'V - p where we have used the identi tieS-: that ~ (3.13) = ~~al and t':_~ :lw-~~~n(:~;>~a~:~l Thus, again using (3.6), (3.11) and (3.12), (3.10) reduces to (3.14) a CN};* c a P - p [1c~0 = a{};~~a~;.~ ,~~~1 (~~$f~~~~~{~}I1~,..a..a }]. I 'V 1 kT 2 (k-1)/2 ~+ la I] /f [M]] 2 . ~+l(~;a'(~)' :.a:a') fL- 2 -1 a [1 2] .> ' CN};* c a Pa{};a'a ' ~I ~ ~} exp{-2"~:a'{~)I}; . ,..a. a - p 'V k (k-2)/2 'lJk(a . ,e};), }; . I) f }; k-,2 [k~l . ,..a·a ,..a. a 'V r- 2 lal ] / f (12-lal ] . 2+ Further. by using (3.6) with s=O, r=-2 and p=lal (so that r+s+lal a eN). * the last term of (3.7) - p (3.15) 2 -1 }; * c P {}; , I X I a a ,..a a,..a a CN - p ~ can similarly be reduced to [1 2] O} exp{-2-lIa . I (};)II~ } ,..a·a 4.0 • ,..a. a 'V 'V I > 0, V 8 :I [k~O 1 2 (k-2)/2 k-' . \II.. (9 . ,(:I). :I . ,) 'k ~·a ~ ~·a r r- 2 + 2 la I] / r [12- 1a I ] . Therefore. by (3.9). (3.14) and (3.15). we obtain from (3.7) that (3.17) = :I * (lal-2)2 aCN E9{1()c~?(~h(i·. " :I-~ ...., ~ a~ ~·a ~·a , X • ~·a ,)-1}. - p This form is very similar to the classical James-Stein estimator in the unrestricted case where the reduction in risk is Thus. we see that in the restricted case (of 8+). the improvements in the various lal-dimensional problems (for lal>2) add up to the total one in (3.17). REMARK 2. As expected. the risk-reduction due to shrinkage is a maximum 9 at the pivot O. and is given by (3.19) conve~ges As 9 moves away from 0 (inside 8+). (3.16) ~ ~ to 0; this is ~ because both the RJn..E and RSMLE are minimax. REMARK 3. In (3.1). we have considered the case of n=l. If X is A -1 '" ~ replaced bY~. the only change in (3.1) would be to replace ~: by -1 A n~: !RM. A As for the reduction in the risk in (3.16) remains in tact if in (2.13) we choose the loss as -1 A n(9-9)'~ A (9-9). A positive-rule version of (3.1) may also be considered as REMARK 4. follows: "'+ (3.20) ~= ~ cfCaCl( l(-k ~:~>(;.;;);,)2S'{~~'I.;.. {l~;-!. B!.~,;;}!:'"1}+ 9 ~ a '" S,"'!JuI': (.~, ::JUl. >.... - - p J ~.J, ~() i ,3" " ~;. . .'t·-- : J' ~ :~A , Here also. the (lower) truncation of th.~ sh¥i~e fa~tor is made to , • - A -1 '" depend on the sets ~a(:)" cp~;,a~.Np:. W!",it:ing,,{l...rca(~: ~ Ca){l-Ca(~ ~) :-1 '~1-1} . Ahd \ri¥i'iJi-i!y rePeating 1 ~ .' -1 + } = the !; - proof of Theorem 3.1. it follows that (3.21) so that c "'+ !RsM dominates ~ over 8+. p r "t +'c for an arbitrary W in -1 ]. (3.21) Jl!B.Y; ,np;~,~~f?ld\ [~d the picture is quite ". cOmParable to the unrestricted case]. (2.3) [instead of ~ How~v~~. ,...., ;' \ ". • .t"~ .•;~ .{ ,\", Motivated by the positive-rule shrinkage estimators and a general class of minimax estimators (in the unrestricted case) introduced by Baranchik (1970) and Strawderman (1971). we are tempted to consider more general minimax estiamtors in the restricted case. case. '" ~ In the unrestricted as well as the other minimax estimators are equivariant under nonsingular transformations X ~ Y = BX. B nonsingular (~ ~ B~' = ~*). 10 This equivariance may not generally hold for the restricted case. i.e .. 8 + is not generally invariant under a ~ ""J redu~ti~n to use a canonical 5 = Ba. ""'J This makes it difficult ,..,.,.,...., (on a.l) to establish the desired result. 'V 'V However. we are able to establish a general result under an additional condition on (a.l). (3.22) 8* + = {a € 8 'V Let us define the : ~k{ a . ,. -a·a + ~k{!':) as in (3.4). Let then * k ~ O}. 1 . ,) ~ O. V a eN. -a·a - p It may be noted that in order for 8* to be non-empty. it suffices to + assume that (3.23) a. I 1 -1. , ~ -a·a -a·a * O. for every a eN. P 'V - and this is. in particular. true for 1 = diagonal. so that -a·a a .. = -a a -> * .i _, r; * O. V a C N and 1 , : ' ~-,. = diagonal. V a C N - p -a:a -aa - p Thus. the theorem to 1.Hi.i:Y 11.t.s:do :cow ,lC5;:J\ ~ follow holds for 8+ when: is a diagonal matrix. For every a : ~ cae N • let~t (y) : R+ ~ (0.2{lal-2)+). be a .- p ,a •• , . ,.'.~... ' . ; j ( J , X nondecreasing. nonnegatf.:v:~;arid bounded' function of y. and for p > 3. let (3.24) "'* a = Note that if we let ta{y) = y * a eN. a ) + c a l{y>c). a - p then l{O~y~c (3.24) reduces to (3.20). so that the positive-rule estimator is a member of this class. TIIEOREM 3.3. Let X ~ 'V N (a.l) where 1 is such that a p,.., I"V I"V € 8*+ in (3.22). ,..., Also, assume that t (y) is non-negative and monotone increasing in y a (€R+) and (3.25) 0 ~ ta{y) ~ 2{lal-2). V a ~ Then, for p ~ N:. !RM 3. any estimator "'* a of the form (3.24) dominates '" * and hence is minimax (over 8+). * 8+) Proof. We very much follow the lines of the proof of Theorem 3.1. (over 11 Parallel to (3.7), we have here where compared to (3.9), (3. 14 )an& (~ 16)'~ fwe' have' here (3.27) = (I) * P9{~~a' Xa"~ o}(eXP{:-~II~l:a:a,.(~lll~ I aCN "V - p ---- "V ---- ..,a: a • "V }). 1 , , ; ' It 2 k-' 'ilk (9 . . . I . • ) EO[IIX . . 11" t (IIX . . II" )]) ..,a·a ..,a·a ..,a·a..G ..G .a• k '~O . ..,a.. a • a ..,a·a ..,a. (I "V (3.28) (II) = I * P9{I-~ , X . ~ ..,a a..,a "",..-v ~~, - p (I k ~'O (3.29) ..,a·a "V I * P9{I-~ • X . ~ ..,a a..,a roM (k~O ie! O} [exp{-2!1I9 . , II~ ..,a·a "V "V b ..- _~ ,.. ~ - P 1 '}. "'k (!!.a,,:;?- .~ ~ :,:- •)j :~~~f.::~,' ','. k-2 ";.a38,,' ..G . , ..,a. a }). 2 2 t a ( II~: a . ":.a: a . ) ] ) , .i.i!;./, IIX . • II ..,a·a "V 2 I a, ..,a: "V'?Ia I' VlPCaCN. - p 2 Therefore, wri ting U = IIX . . 11" ..,a·a..G , ..,a: a , we obtain that for every 0, (3.32) }). = 0, "V (3.31) ..G . a • ..,a. "V a~, Now, under 9 "V 1 k 2 k-' -JJ.. (9 . "I . .) EO[ IIX . . II" t (IIX . ,II" )]) , . 'k ..,a·a ..,a·a ..,a·a ..,a. ..G . a • a ..,a·a ..,a. ..G .a• = (III) O} [exp{-2!1I9 . • (I)II~ "V E {t (U)[2 O a ukl2 - 2 k ukl2- 1 - ukl2- 1 ta(U)]} * k eN, - p ~ ~ r- 12 r[~( lal+k-2»)/I" [~Ial) f{ ta(l(ja l+k-2)[2J(jal+k-2-2( lal+k-2)]}. 1 as by (3.25), t a (u) ~ 2( lal-2). = q, monotone / in y and E >(2 q Now t a (y) and 2y-2( lal+k-2) are both V q ~ O. Therefore, the right hand side of (3.32) is greater than, or ,equal .to (3.33) {2k/2- 1ra( lal+k-2»)/I" alai) fo{ ta(l(jal+k-2)} Eo{2><..lal+k_2-2(lal+k-2)]}, as lal+k-2 ~ 0, Va C N*. - all nonnegative. = 0, Vk ~ * 0, aCN, - p Finally, by (3.22), the ~ (9 . " Za:a') are 'k a·a P Hence, (3.30) is nonnegative too. This completes the proof of the theorem. In passing, we may remark that besides the case of diagonal };, '" :- ~ ,.~ (3.23) [and hence, (3.22)] may als~ hold for other situations (viz., e- ordered al ternative!f~:of:fni~':"'erk1fS"rbbr~etl)n '~els etc.). 4. RSMLE when}; = 'u2vJ. model: 2 'J 'v 'knOwn!'; i' ~~~':r;~~ .. 'i-' tJ;"f-':~,:3; ',l (X, S ) are mutually independent and we want to estimate ! su~_~ect ':_,,' We take s2 = m(m+2)-I S2 A well as lri'Fthis case, we consider the reduced -1 (~: estimator by !RM)' and let A !RsM' ,to _~,-:. "~ ~, 9+,~ wi~h • \ __' , ,J }}.. i"~ '~s2y.'~en, ,..., ,...,. ~_. "': repl~ce :~y we a plausible null pivot. " in (3.1), for !1 (};) as a "" z; and denote A the resulting A Note that !1 (};) = !1 (};), V ~ C a C N (by the a", a", - - p positive homogeneity of these sets), thus the only adjustment is made with the test statistic. If we proceed as in the proof of Theorem 3.1, we obtain here A (4.2) R(9DQM ,9) ,..no = R(9DV'~) - 2 }; * E9 2{ca (s2/a ,2)I(X'" ,..n.I'l ._ I"'"I'J a'-.l~ - p '" ,a € !1 (};»} a '" 13 Next, note that (i) s2 and X are indeperidell.t', and (U), by (4.1), E(s2/0 2) = m/(m+2) and E(s2/0 2)2 = (~2)-lm. (4.3) Thus, we may take the expectation of (s2/0 2) for (s2/0 2)2] and the rest separately. Hence, for the second term on the right hand side of (4.2), we have (3.9) multiplied by m/(m+2) , and the same multiplicative factor appears in (3.14) as well as in (3.15). Hence, here (3.16) holds with an additional factor m/(m+2). Thus, we obtain that 'e''};' ,.: A A A R(~,a) ~ R(~,~),:,i.i'~I.~O~ 80+;;:.1"-' (4.3) In a similar marmer i t 'y:"J" ~9.llQl's, r~t;.~P~~~,f!:-ls,~~~xtendsto case with the specific choice of s 2 = (m+2) -1 m S2 ; this the same multiplicative factor m/(m+2) ~P~I[s. in rQ.:'~~}, {.:L33)and elsewhere. but that does not al ter the relative picture .. Finally, (3.21) can be " handled in the same way. TIIEOREM 4.1. ;'.:AKILQ'f,(·d ',~:J' 'm (.,. Hence, we arrive at the following. {? For the case of }; = 2 2 2 S /0 "'~, independently (,!" o2.j;~jV~o~~d Ifor & A of .~ '" ·N(!.~) ,,'1 "."" " -;1 I" j .,. ~}Ar(';"p , s2 = (m+2)-l m S2, , th~ ~lias'ic "resul ts in Theorems ..".r, 3.1 and 3.3 and in (3.21)~hold. It may be noted that the factor ·mlfm+2) indicates that the cost of not knowing 5. 0 2 is 2/(m+2). and it goes to 0 as m increases. RSMLE when}; is arbitrary (unknown). given by (2.12). In this case. the RMLE of we can reduce the original problem to the following: (5.1) X '" H (a.};). S '" p,...,"¥ is since [by (2.11)]. (X .S ) are jointly sufficient for ,...n ,...n (a.~). '" a ,..., W(p,m,~). X.S mutually independent. where m is a 14 positive integer. and '( ••• ) stands for the Wishart distribution. We want to estimate O. subject to 0 € 9+. when it is plausible that 0 is ..... ,.... "close to" O. In view of the close similarity of (2.9) and (2.12). here. we consider the following RSMLE: (5.2) 9:cv = ~... }; l(X € !: (S»{1-c*(1I9;'vIlS-2)} 9:v . cpCaCN - - p a '" '" a,...nl'l 'v ,...nl'l "'* defined by (2.12). may be written [as in (3.11)]. as where !RM' (5.3) "'* 0nv = ,...nl'l for every (5.4) ~ (X . ,(S) .0) a ,..a·a '" '" "'* 2 , -1 on !: (S); 10nv "s=(X . ,(S» S .• (X . ,(S». a '" ,..nft ,..a·a '" a·a ,..a·a '" cae N . and the shrinkage factors {c* } satisfy - p a ~ 0 < c* < 2(lal-2)/(m-p+3) if lal > 2. and c* = O. otherwise. a THEOREM 5.1. a Under (5.1) and (5.4). the RSMLE in (5.2) dominates the 2\· .. ," RMLE in (2.12) fu.nder"t~gWid~~tic'Fok;1'n'(2.13)]. Proof. Following the1i~~s,~~tfie(pro~fio~DTheorem3.1. we have here (5.5) "'* "'* * "'* 2 "'* 2 R(~.~) = R(!iuI'~) - 2~* E~.~{ca 1(~ € !:a(~»I!RMI!," !RM"~} " + 2 }; * EO' -~ " • a '" P ) .. ~.tei~· L;rfj * 2 a~: E~.~{Ca) + p( ~{c*~ltX':€ ~ ~~ r1\J ~, ,c"', !:" \t '. (S)f9:~ };-1 0)/ 119:v Il 2 } '" ,..nl'l S a '" ,...nl'l '" ....),;"<'10!i -:;' .. . "'*' '2 ~ "'M Note that by virtue of (5~:!>'{,~Or e~rr~'2:;~ Sa (5.6) S . , '" W( la ,..a. a S Np ' I. m-I~" I.,,"~'-~~, }; . .) ." ( i (5.7) 2 1(~ € !:a(~»(~:I: / "!RH"S}' \ ":-;",1 . l . -1 S . , is independent of S , , (or S • ,) and S ,..a. a ,..a a ,..a a -1 aa ,S, a a II regardless of }; [See Anderson (1984. Theorem 7.3.6)]. and Sa:a' is independent of X . ,(S) (=X -S ,..a. a '" -1 -1 ,S, , X ,) and S , , x ,. ,..a,..aa,..a a ,..a a a The last ,..a result follows from the independence of S and X and (5.7). Thus. for * using the classical Wijsman (random) orthogonal every a eN. - p transformation on X . ,(S) and S . ,. we obtain that given X . ,(S) and ,..a·a -1 Sa'a' ,..a X , (X ,.., € !:a (S». '" ,..a. a ,..a·a ,.., e- 15 vecto~s which does not depend on the equali ty of distributions. * a~: E!.:{Ca 1(~ (5.9) = .... , : . Hence.'we'·'ml've·'-L A*. 2 A* 2 ~a(~»(!RMlf~ / II!RMII~}. € :I * c* E()(2p+l) EO ~{I(X . • (S) a -m.~ ..,a·a.... 8!CN ~(O)I(S-~ . X , ~ .... ..,a a..,a ........ - p = t held fixed; here = , stands for the :I * c*(m-P+l)E ~{I(S-~ .X . ~ 0) E[I(X . . (S) a O.~..,a a ..,a.... ..,a·a.... 8!CN ........ - p O)}. .... ~ ....O)IS-~ . X .J}. a a ..,a Next. note that (5.10) I 0-1. • X .} E{k. . (S) IS-1. . X .} = E{X -S • S-1• , X ,S ·a .... ..,a a ..,a ..,a...aa a a ..,a .... a a ..,a ;.Jt Similarly. ,~ i <. ~t',. : : .. /:;,...JY (5.11) VeX . ..,a. a ,(s)ls-~ .x .) =:I .~ X'. ~~ . X .} =:I . ,(X ,) say . ..,a a..,a, 'c"1':~~~J,t.:)--:)l £::?":.~ k . ..,a·a ..,a .{i -1 .r,·· . ,- Further. given S , . X ,. X is normally distributed.:' Hence. (5.9) a a ..,a ..,a, ., . " ... ii\i';.~~; i : (~\:;. .:1 ~)l" }:.<. : ' reduces (by us ing Lelll1la 3.2) to ~, , C:') (5.12) Similarly. * (5.13) a~* E!.:{Ca 1(~ A*' € ~a(~»(~: -1 A*_2 !)II~II~ - p * -1 1 = :I * c (m-p+l)EO ~{I(S . . X . ~ 0) :I k-' -pk(O . aCN a ..... : . . , a a ..,a .... k>O· ..,a. a - p II :I . . (X ,». ..,a·a..,a - exp(-~1I0....a :a.lI~....a:a • (X.» 2k/2r[~(lal+k-2)]/r[~lal). ..,a (5.14) * 2 A*. 2 A* -4 a~* E!.:{Ca ) 1(~ € ~a(~» • II~II:I II~IIS } - p 16 = 1 ~ c*2{m-P+1){m-P+3) EO ~{I(S-! ,X , ~ 0) {I + X', S-~ , X ,)-1. aCN..... a •L. ,.,a a ,.,a '" ,.,a ,.,a a ,.,a -p "'''' 1 1 2 IIO . ,1I~a:a,{Xal»} 1 k-'.JJ.. {O . 1 . ,(X ,)) • exp{-2k~O . 'k ,.,a·a ,.,a·a,.,a ,.,a.a:..... ........ It 2k/2-1f[~{ la l+k-2)]/f [~Ia I]. From (5.5). (5.9). (5.12). (5.13) and (5.14). we obtain that (5.15) A* A* R{!RM'!) - R{!RsM' !) *2 * -1 1 2 = 1 * c (m-p+1) c EO ~{l(S, ,X, ~ 0) exp{-2-IIO . • II~ (X .)}. a a . L. ,.,a a ,.,a '" ,.,a. a L. • , ,.,a a a CN '" '" ,.,a. - p ,-1 -1 1 (I + X . S , , X .) • 1 k-' 'lJk { 0 . •• 1 . (X '» • ,.,a ,.,a a ,.,a k~O',.,a . a ,.,a. a ,.,a I X',S-~ • x ,)(2Ial-4) a a ,.,a f[2!{la l+k-2)]/f[2!la l]-1 [(l + ,.,a - c*{m-p+3)]. a Now. under (5.4). for every a eN. * - (5.16) c*{m-P+3) a p -1 < 2{lal-2) ~ {lal-4){1+X', S , , X ,). ,.,a aa ,.,a ~ A : :{': :1': X ,.,a It (0. ( "") f.h} while we may rewrite (5f115f as (5.17) v 'n~ v' * -1rr,Y I,,~ 0) 1 ~ (m-p+1)~. c --t;'{Er:l'Q (~,) I~'a' ,~,]. . . . . . a,,:"'I • \;:- " "'7.a a' '!a" ,;, 11!RM1I1 ,.,a: a , aCN A* -2 - p [{I ~ '{X,.,a-', 'S7~ ~':xi':-}H~ral-4) ,.,a a ,.,a ,,: ':>.;::;' Q":n ., ~ O. ~)VS.· - • V 0 € 9+. ~".3.1. ~)r!:z t":'::::"";- This completes the proof of the \-;f1 ~ ::)W - c*{m-p+3)]} a r J £~)t nIi i. 1£..;'). theore~. .~ Suppose now that in (5.2). ,we f! < ~e..t (.~; · . <t····:- _, c*.= (lal-2t/{m-p+3). a eN. .':i: ~ .- (,::, -r(r~;; - p Then. we have the following. Corollary 5.1.1. For the speci?ic'~~hoice of c* = (lal-2t/{m-p+3). a C a N* • the reduction of the risk due to shrinkage is p (5.1S) 1 * aCN (m-p+1){{lal-2)+)2 -1 (p+3) EO ~{l(S , ,X , ~ 0) m• L. ,.,a a ,.,a '" - p -2 1I0Dvll~ ,...nl"l L. -1 • ,.,a. a , (X ,)}{1+2 X' ,S , , X )}. ,.,a ,.,a ,.,a a ,.,a The proof follows from (5.17) and particular substitution of the *a c . A*+ A positive-rule version (ORSM) of (5.2) can be considered as in e- 17 (3.20), and (3.21) extends in thisJcase too (with replaced by .....*+ .....* !RsM and !RsM' . . .+ ... !RsM and !RsM being respectivel_y1. :~$.e Let us indicate the modifications fo.r}he where X =~ = n -1 n ~ ~i and the ~i are i.i.d.r.v. with Np(~':)' In this case, in (5.1), i=l we have X - N (O,n- 1 ~). Also, here are replace S by S - W(p,n-1,~). P 'V Note that ..... ~ .Jl. "'-I 1'V,...Jl #'V = (n-1) -1.Jl. S. I'V Thus, here m=n-1, and hence, by Theorem 5.1 and Corollary 5.1.1, we may take the shrinkage estimator (5.19) , where (5.20) aMen) :RM " = ! c;' ... J ... ~ ~a(~:a'(:n),~) O~g~~~~:'):'" - - p ';' !_ .:v and the other notations ~fC;; as in '{~.'~)ii!th'\jfOper ~eplacement. Comparing (3.1) and (5.19), (with q . = (1~1-2t, a ~ N*), we ; :/'.. i S 'I j " a .f.;~" P may note that (by Remark 3) here we have'repl~ced ~ by ~ and have an , tJ :1 t ~'"> i '" ,Jl REMARK 5. ,- ":"':, ~. ." A .- For P=J', the later factor is equal to additional factor (n-1)/(n-p+2). 1, while for p > 3, (n-1)/(n-p+2)' ') I'; adjustments. a.s. i:' indicatini'l~rge f~'6"')9·-i:·'" Further, ..• ' '; ~~ { l''-(" ~~t (n~1)/(ri~p+2) ~ ,. t ':iO f :~~ ~ ~:, 1 1 as "I,'-"(d.~~.: ~1 nJ~~ . - p ~·a ".... 2 ,II~ ~ ~:a , ~~ A (p fixed), and two versions become asymptotically risk equivalent. * nllO . becomes large, for every a eN, shrinkage ._ (as well as in the rth mean, for every (fixed r :1 "; ~ ~ ~ .Jl. > O. Thus, the However, as n , so that there is no effective shrinkage (or reduction in risk due to shrinkage), for large n, for any fixed 0 ~ O. In the unrestricted case, Sen (1984) has advocated the use of Pitman alternatives to the pivot, for which the asymptotic risk has a non-degenerate form, and this consideration remains in tact for the restricted alternative under consideration. 18 6. Some extensions and impcrtant aDplications. In this section, we consider some extensions of the results obtained in the earlier sections to more general models, and also consider some specific applications in some practical problems related to linear models. 6.1 Sub-orthant models. x = (Xl' ,X~) (6.1) 'V ~1 where ~ € ~ '" ~1 and ~ We consider the following model: Nm+ p «~1',~~)' ~,~), = ",,...,£ are m-vectors. '" ~1 '" unrestricted, ~ ~ and are p-vectors, (R+)P, and we assume, for the time-being, that: is a (p+m) x (p+m) p.d. known matrix. We desire to consider improved estimation of ~ under ~ (wheIltl'~,:~ve the above sub-orthant restraint N~ We deno te by = {m+1 , .... , 'ril~ ~; ~q.~, ev~ry the null pivot for a : cp ~ a ~ N~, ~). the complementary sUbs~t ~.~ .den~~ed :blZ ~. ~, . A\so,/we define ~l:a' (:) = ~ £~,_, ~'~ f.:. Xl' .. X . . (~) == X.. ,_~d \~ . " ~·a ~·a ~ ,.".a:'-a., ~·a ~ ,JJ. ,.a , as --in ear lier sections. "' of ~ is given by (6.2) Un.. ~ Proof. (6.3) ::~:.': ...,.,. £:v- C J.... i'.-:' j-;'=~- ....., -1 ~ 0 ~N (X1 .,·X ......".Q) l(~ , .,cacJ'( m,a ~.a ~~J;:~ ~. ~ a - - p = Let us parti tion U(~l'~) = ,X . ~ O. X . , ~ ~·a > 0). ~-l-as (;(ti~»~',;j=1,2' and let ~=1 ~=l{~i-~i}·:ij(~j_~j)' A A Then the RMLE of ~ is given by ~ = (~i RM:' ~ RM:)', where (6.4) Since m +p U(~lRM:' ~RM:) = min{U(~l'~) . ~1 € R , ~ € (R ) }. A ~1 A • is unrestrained, differentiation of (6.3) with respect to yields that (6.5) A ~1 - ~1RM: = -(: 11 -1 12 ) : (~-~RM:)' Also, we may rewrite (6.3) as A ~1 19 (~1:2 - ~1:2) (6.6) -1 I , :1:2{~1:2 ~1:2) + (~-~) = :~~ Noting that (:11)-1 :12 -1 I22{~-~)· :12,_we_obtaln from (6.5) that A ~1:2 - ~1:2 RM = ~. (6.7) . ,. I) (~-~) Hence, it suffices to minimize " I !. A to , -1 I22{~:~) over ~ ~~, leading A ~ RM' and then to use (6.5) to obtain ~, (2.9) with: replaced by ~ RM' we have the solution in so that A ~~nv = (6.S) I ~ ~ 0 ~ a € ~ (I~», (X . 1,0) 1{X_ ~·a ~ a ~ ~ - - p while, by (6.5), A ~1RM = ~1 + (: (6.9) 11 -1 12 ) ':' .. (~ - ~) A -1 ( = ~1 - :11 :1~2 ~r~ f<~';;, A ) , "Lt", ,- -1 -1 ~ = (~1-:11 :12 ~) ,+ :f)11 ~f~' ~,., Next, we note that on (6.10) A X--~2nv ~!U"l Hence, on [i '-1:.1 ..,,: = ~a{I<lI<lI' """"""" I aa I , x; I~ ~ ~~~'~':~r:'b~;'6, '~:a' > O}), :-, t'i.n. _ ~ 0 cp C a'-ti:~N . X i),ivY ~ - - p ~ € ~a{~)' 12 A (6.11) ~a{~) (~{~ , ' ~., :~. ~1RM = ~1 + :11 e2: 1 ~a{3aa' I~'a' ~" ~I)' , ,'\ ' 1 / '\ so that by (6.S) and (6.11), on \!~(~)<~ .. ! ' L'.r.; (6.12) ~u ;..Kl'l = ~Nm,a (X1+I11e2 I12~at(~, ~I:-~ ." X .. X I)' X . 1,0), ~ ~'" ~ a ~ ~ ~·a, ~ for every a : <I> cae NO. - p yield that on (6.13) Finally, SOBle standard matrix manipulations ~a{~)' 12 ~Nm,a{~1+:11e2: = -1 ~a{3aa' :a'a' ~I' ~I)' ~:a" ~) ~Nm,a (Xl·" X . " ~ 0), Y <I> cae ~·a ~·a - NO, p and hence, (6.2) follows from (6.12) and (6.13). Q.E.D. Motivated by the RMLE in (6.2) and the Stein-rule estimators in Section 3, we proceed next to construct improved estimators of ~ for 20 this sub-orthant model. m subspace {~1 € R • ~ In this context. we consider the pivot as the = £}. and the quadratic error loss in (2.13). as amended. for the (p+m)-dimensional case. THEOREM 6.2. The shrinkage estimator ~ ~ -2 ~ ~ = ~Q(O 1(~ € !:a(~»{l-Call~lI: }~. (6.14) - - p ° ~ ca ~ 2(lal+m-2)+, where ~ cae NO. dominates the RMLE in (6.2) p [~ J.! € Rm x (R+)P] when p+m~3. '" Proof. Note that Lelll'lla 3.2 extends directly to the case where the ~ positive orthant [Z 0] is replaced by any positively homogeneous cone, and hence, we may virtually repeat the proof of Theorem 3.1, where we need to replace (3. J) by (6. 2).;" FO,I"\irjn:.tended brevity, the details are omitted. For unknown covariance matrix. whenever we ahve a Wishart matrix S (with M DF), independ~rt:'1yf;~.;,w~J~y, di~e~tly use the results of ,..>.". Section 5. and arrive at the THEOREM 6.3. For the sub-orthant model, the RMLE, given by ,,':) ,.; ° ~Nm.a(Xl' ,.rS) X". ",·a '" ..,a·a AM (6.15) lJ...... ~ followi~ .3.-; ;~."".,.~.;'J = .,r~rN }; bfl':"> ':~Ct:~~ ~lp ~s. jq :~. .".- .)j. :: r (t, " ,(S). 0) l(X_ €!: (S» '" '" ::.;l a '" , is dominated by the shrinkage estimatOr /" (6.16) ~ = ° 1(~ €~!I~(~)rl-b=1I~1I~2};' }; .pCaCN - - p (6.17) ° ~ c*a ~ 2(lal+m-2)+/(M-m-p+3). $ Cae NO. p We may also remark that parallel to (5.19) and (5.20). here also. -1 = X =n #'V,ovIl for X ~ ~i'-l - Xi' Xi i.i.d.r.v. with Np+ (J.!.};). we may construct "'" m '" IV ,...., ~ the RML with}; replaced by}; as in (6.2). and for the SMLE in (6.16), ,..n we may take the shrinkage factors as 21 (6.18) 1 - (n-1)(n-m-p+2) -1 n -1 (m+ Ia I-2) + ° A* -2 ... S a S N . p 1I~lIi' '+' ,Jl ~ = The modifications for the case of done as· in (4.1) by defining s2 = (6.15)-(6.17). 02v. m(m-2)-1~'. with a known V. can be and then proceeding as in For this special case. by virtue of the adjustment for s2. in (6.17). we may allow ° ~ c* ~ 2(lal+m-2)+. <pc a a - C NO. p This special case will be found to be useful for the linear model setup to be considered next. 6.2 Restricted MLE. Univariate linear model: (6.19) x = c~ ~p ,Jl + e ; e ,Jl,Jl ~ Consider the usual model 2 N (0.0 1 ). n ~ ,Jl where C is a (known) matrix (of order ~~) of regression constants. po = (~i' ~) is a pM-vector of.·)tmknOWl'lnregreSs·i~nparameters. ~j is a Pj-vector. j=1.2; p* = P1+P2' 0 2 < 0 < ~) (0 l<'.4'·!~. .-~_¥- the following situation: . :'rl"J.'jri.}~ P2rrr)[ nonnegative orthant in ",).l is unknown. and we consider ~~·,~.C-.:,'-:·!J3·,I.~> j.: .(:,~" -... " :; \ . ~ We may actually parti tion ~ as [~l 22~ and .rewrit,~ (6.15) as J' (6.21) ~ = ~1 ~1 + S ~ and we desire to consdier restraint (6.12) on~. , ~~;: G~") ',:; ,1 i 1:: ; + ~ ._ V ' ,.. +1>2· ~;,.gJ~~~_=~~~i~~_:~~") . ilJlPr-ffY~.~rt~r~ors of p under the set We set P2=P and assume that n ) P* and Rank of S = P2 = p. This condition enables us to use RSMLE for usual RMLE. r::':. j r PI ~ (or ~) which dominates the In passing. we may remark that the K-sample (K location model is a special case of (6.19). where p*=K. ~ 2) In this case. if we want to estimate the means (9 •...• 9 ). subject to the ordered K 1 restraint: 9 1 ~ ... < 9K• we may reparameterize as in (6.19) with P1=1, 22 P2=p=K-1. P1=01 may note that centers on and Pj=Gj-9j_1.j=2 •...• K. behaves as a nuisance parameter. and our main interest ~1 ~. Referring back to (6.21). we me~ Thus. we as well allow ~1 to be possibly of less than full rank. First. consider the case of C being of full rank. Then ~ = (~1" p.). = (C'C)-1C' X r_ ._ ~ '" '" '" ~ * -1 IIX -C PII ~ 2 (6.24) a 2 = (n-p) (6.23) ,...., ,ovIl I"V '" are jointly sufficient for (p.a 2 ). and. moreover. '" (6.25) p'" N (P, a2 {C'C)-1). '" (6.26) pM ,...., * '" '" ~2 {n-p)a I a 2 2 ~ or ~_pM' independent of p. '" , \ ik,. d; '!sb.m· '£J'; In this case. we may use Theorem 6.3 with the modifications for the case of = a2 ~ (6.27) V. V ~* = PDe ,...no = (C'C) -1 • ~ ° 1{P~ ~Cli - - p q ~ a 2 unknown. and obtain the estimator € ~ a ~2 (a (C'C) -1 '" '" { »t 1 - where (6.28) and ~* ~ °~ c a ~ 2(lat+po12}t~{ft~p);tn'~). ~c- a C 1 - is defined by (6.15) '0' .". ~ witK'~':'= ~ c . ;. and .. ~.~= ~2. NO. p a (~'~) -1 . Further. we obtain that under the quadratic;:::,efror c l().s, '(Y-I3) ! {C·C){Y-P)/a2 . We may also consider a positive-rule version of the shrinkage estimator as in (3.20) and verify (3.21). Let us next consider the case where C is not of full rank (P1+P2)' In this context. we write ~ = [~1' ~] (as in before) and assume that 23 ~ Rank[~lJ ~ where 1 S under which !!2 Pl· Essentially. this relates to the full rank of is identifiable. so'- that "'the re'strictions on posed iIi a meaningful manner. The justification for (6.30) may be Yij ~ N{9+a • a }. 1 i 2 visualized by considering a counter example: s. I ~ i ~ !!2 may be r; all these r.v.·s are assumed to be independent. <j ~ In this case. R[~l SJ = r. Rank[~lJ = 1. Rank[SJ = r. so that (6.30) does not hold. Since the 9-space and a-space intersects non-trivially. the hypothesis a ~ 0 is not testable. so that a € (R+}r has no real meaning. If C'C is not of full rank. we need to use a generalized inverse of In this context. we maY{,~o~~j t~~\<r~~~rf[l~~';~.J' ~~y.(~~neralized inverse of C'C is of the form .;~) ~ ;!~'. fI'J{;"'};'"'rAIt(: ,:"~)' ::~.'; "':'f.\ {6.31} where for any generalized inverse "'* of {~i ~1}' the estimator ~ {~i ~i} in (6.27) remains the,.~.,(~SH94hW~}lIIliiy~f?JloW: the same steps as in the case of full rank C IilDcl SJ~iJ.ft {6A~) iy" (6.32) Xij = J.L i + e ij • I where the e (6.33) J.L ij ~ I ~ j ~ ni' I ~ i Ie' lJ; ~ r. are i.i.d.r.v. X{O.a 2 ). and we have the restriction ... ~ J.L r. r which is a posi tively homogeneous subspace of R. (6.34) J.L i = Po + ... + Pi-I' i=I •...• r. then (6.33) may also be expressed as (6.35) If we write If we wri te 24 {6.36} {6.37} then for the given estimation problem. the sufficient statistics are 2 2 2 2 {6.38} {X1.···.Xr ; sO}; {n-r)sO / a or ~-r' where n=~ Thus. we may use {6.16} [with the modification s 2 = n 1.. [{n-4}/{n-r+2}]s~} with ° ~ c: ~ 2(lal=1}. $S a S Nr - 1 and conclude that this restricted shrinkage estimator dominates the restricted MLE [considered in detail in Barlow et al. {1972}]. two-way layout: the RSMLE of 7. ~. Xij = ~i + ~j ~ij' + 1 ~ j ~ We may also consider a n. 1 ~ i ~ r and obtain under {6.35}. Relative performance of SMLE and RSMLE. In earlier sections. we have noticed the similarity of the RSMLE and SMLE in various situations. c. , ,', X"f\-':\'"'" '1:,.1\',1' "'I Also. the shrinkage versi~s'domtnate their respective classical We intendjto comik~~ versions. tHe RSMLE:and SMLE in the light of their In thh respe9h;~w~''fmay'.,g,ote.t'.~pasiC?c.difference between the two risks. :",.1_" .. versions of the Stein-rule estimators. t .,.' r: .~~ \,)::- t.' ~ '1];' The SMLE is invariant under any !;; ::>fltJrl :'l non-singular transformation:· X -t Y = BX. B non-singular. RSMLE is not generally so. Thus. for the simple model X ~ H {9.~}. P 1~ ~ known. the risk of the SMLE depends on 9 only through A = 9' , '),: '1 ( : ) :>~(>' But. the t.,' = ~ ~ ~ 9. while the risk of the RSMLE. computed in {3. 7}. depends on the unknown 9 in a much more involved manner and may depend on the individual elements of 9. Thus. for the SMLE. it suffices to consider the model {7.1} X ~ ~ 1 0 2 H {9 .I} where 9° = {A .0 ..... }'. A ~ 0. p ~ ~ ° where. of course. 9 € 9+. the positive orthant. but on the boundary. On the otherhand. for the RSMLE. the risk at a point 9 {such that 9,~-19 25 = A}, on the A-constant curve, may not remain the same. that the comparison of the risk of SMLE~d contour (or any other one} becomes very~ ..hard p It turns out RSMLE on this A-constant (technically), for general > 3. We attack this problem in the f?llowing order: (I) Comparison of the risks at the pivot (9=O); (2) Study of the differential picture of the risk of the RSMLE in the different subspace of 9+, and (3) Comparison of the risk of SMLE and RSMLE under (7 .l), or parallel models. 7.1. (for Relative risks at the pivot. ~=I) For X ~ R p ~ (9,~), ~ ~ p ~ 3, the SMLE is given by (7.2) ~ T'T,j:":. E);:-;'i' ",:;,j.M.2~ ~~:f; "10 \·"':"11.£1 I[{rj~ <;' so that IIXs _911 2 = IIX-911 2 - 2(p-2}IIXII-2 {X-9}'X + (p_2}2 I1XII 2. "" "" Thus, at the pivot 9=0, ~ = 2, ~ ~ . "" IV, - _ < .; ~~ ,i.;'"C;'IftofutJ ertrc.;:~: '-:;':!"'.'/ '):~":".IV th~;r-i.sIw:p~ '~fl~ ~l}..i s ~qw~J,.. - ~ to p-2(p-2)+(p-2} V p ~ 3. equal to (7.3) 2-P{ ~ ~~p Thus, by (3.19) and (7.3), we obtain that at the pivot 9=0, (7.3) R(~,!} = p/2 - ~(p-4) - (P+2}2-P = 2 - (p+2}2-P . Consequently, at the pivot 9=0, for every p ~ 3, (7.5) In the above derivation, for simplicity, we have taken same result holds for equal to a 2 2-P (p+2). 2 ~=a ~=I. But the I, where the right hand side of (7.5) would be Modifications for general ~ can be made readily. 26 In the computation of the risk. lI ell has to be replaced by lIell~ and for 2 the RMLE. the ~ have to be replaced by ~:a' and relative picture. however. remains the same. modifications for the case of I (or 0 2 ) :.aa by :a:a" The Similarly. the unknown can also be instituted along the same line to establish a similar relative risk picture of the SMLE and RSMLE. 7.2 Directional variation of the risk of the RSMLE. steps. RMLE. We do it in two First. we study the directional variation of the risk of the Then. by virtue of (3.16). we draw the picture for the RSMLE. Let cP(x}. x € R be th~-,7l,.~~t;.~:p'or~~::~."f. and $(x) = cP' (x) be the corresponding densi ~Y:; f~~t~!:Jn. "Then~.~.f?f.- every a : = J4:8."~ 11 ,.,: ~(:-.Aj}..r.~~(x. )fQ} = II 1\,"1,~ • .. :~h ~ . jEa (7.7) Pn{X"., ~ O} (7.S) E {IIX -0 11 2 1(X ~':(j)}' ~';)'~1X>[.:J~\ O} ~ """" o "" -Jl -Jl -Jl '" 0 -Jl '" 4> S a S Np ' cP(9 j }: Lk0 {IIX-Jl-0-Jl 11 2 I -Jl X ~ O} '" Thus. by (7.6). (7.7) and (7.S). we obtain that (7.9) R(9ov .0} = I ~ '" ¢faOf - - P I II cP(Oj}cP(-On}{la l - I OJ 4>(0j}/cP(O.} jEa I!Ea' c; jEa J + 2 I 0l!}' I!Ea' It is quite clear that (7.9) depends on the individual elements 0 •.... 0 in an involved manner (and not simply on 11011 2 = O·O). 1 p ~ ~~ this point clear. we consider the special case of (7.1) where To make 27 1 9=90 =(A2 ,0, ... ,0)'. Through some routine steps, we obtain from (7.9) that (7.10) R(~,!o) = ~p 111 2 2 + [~(A2)_~] - {A ¢(A ) - 1 A[1~(A2)]}, 111 where by virtue of the Mills ratio, A[1~(A2)] ~ A2 <l><A2 ), V A ~ 0, and 111 2 - A[1~(A 2 )] the difference A2 N~A) ° A we conclude that R(!RM'! ) ~ ~ ° as (p+1)/2, V A ~ A ~m. Thus, from (7.10), 0; at A=O, (7.10) equals to p/2, while as A increases, it monotonically increases and attains its upper bound asymptotically as If we look at (7.2) and compute its A~. s risk R(X ,90), then it follows from'the James-Stein (1961) result that s ° \. >]! "t( .. A~:~ X;: 1p at A=O, R(X ,9 ) = 2 and ittitOrioton{~llY""inG)."ease-Jt..arsA increases, and ~ ~ its upper asymptote is equal:/~o p~ {:t,;o - c~{J;i9~t::_lfElY readily be made q from the above results: ~~ )':" ! I '; •. ~ ... :<~.I ;" _, For p=3, under (7.1), the RMLE dominates the SMLE, so that by (i) virtue of (3.16), the R8MLE do~:iila:afs"bbth'dre'>'RMLiand SMLE. (il) For any p ~ 3, the r¥-~k 'ol'}lthe SMLE·dtri~ Y{ot be smaller than that of the RMLE for all A (~hu?.ttl~rlt/Y-ol? large A), and hence, the . ~ ,'.>~' SMLE fails to dominate th~~) vm~~q~~C~,{RSMLE),under (7.1). _.r . ....:),.,. ...... Actually, we shall exploit (3.16)' and (7.10) to establish the . '" .- . ° dominance of the RSMLE over RMLEand SMLE when 9=9 , 9>0. Before that, we consider the other extreme case where (7.11) 9 = 9 1, 9 > ° 2 (so that A = P 9 ). If we let a = ~(9), P = 1-a<J>(9)/~(9) a(1+~)=1), (7.12) and ~ = ~(-9)/~(9) (so that then (7.9) simplifies to aP{p(1-P)(1+~)P-1 + p 92 ~(1+~)p-1} = p a[a(1+~)]p-1 {1-P+9 2 ~) = p{~(9) - 9 ~(9) + p-1A[1~(9)]}; A =p 2 9 28 Note that at 9=0. (7.12) reduces to p/2. and for p than (2) the risk of the SMLE at 9=0. not perform better than the SMLE. ~ 5. this is larger Thus. in this case. the RMLE may To compare (7.10) and (7.12). we note that for g(x) = p{~(x) - x¢(x) + x2[1~(x)]} - [~(xvP) - xvP ¢(xvP) + p x2[1~(xvP)]}. we have g' (x) = p{¢(x)-4l(x)-x ~'(x) + 2x[1~{x}] - x 2 ¢(x)} - {JP ¢(xvP) - JP ¢(xvP) - x p ~'(xvP) + 2p x[1~(xvP}] - p3/2 x 2 ¢(xvP)} = 2 px[~{xvP} - ~{x}] ~ O. V x ~ O. Thus. {7.12} equals to (7.10) at A=O and their difference is a monotone nondecreasing function of A. The upper bound (asymptote) for this difference is (p-1)/2 (~1). In a similar manner. if we take 1 {7.13} ! = !(k) = ({A/k) ~ function of k (1 2 lk· ~p'~~)~. A ~ O. (k=1 ....• p). ~.P)L ~i~,pfctu[elmakes i t k ~,., , , { ., ("',;." 1 L ~ ..:l.. 1_ _ J ~ ..,. n_ ' \ 0: ; " . more appealing to ,_ consider the RMLE instead of the classical MLE when 9 lies on a lower ·~f·(r:··: i~.'. , }" -'l'-~(" c," _ + dimensional edge of e+. the most favorable case being R x 0 (up to r _ , '»." permutations of coordinates). Note that by (7.4~1~d; (ir~). (;~~; ' 'l--. I, \ I '" L." )" r_ ' than the RMLE as well as the SMLE. pivot (inside e+). the risk of the ':.' _ the RSMLE has a smaller risk ",,", t..\:: ; : ' _. .' ,I How~e.r. ~" when 9 deviates from the is a constant. the SMLE has a risk depending only through (A.p). while the risk of the RMLE and RSMLE depend on 9 in much more involved manners. A single picture can not be drawn for the entire domain e+. although the RSMLE dominates the RMLE over this domain. 7.3 Dominance properties of the RSMLE. We have already remarked earlier that for the model (7.1). the RSMLE dominates (over e+) all the other three (i.e .• MLE. RMLE and SMLE) when p=3. the case of p~4 We therefore consider and study this relative dominance picture. 29 A By virtue of (7.10) and (3.16) [and the fact that (p-2) 2 (7.14) -2 E('<p,A), A = 2 R(~,~) = p = " . ' we obtain that under (7.1), II~II ], R(~,~} - R(~,~} = p/2 - (P-2;~(x-2 2) , 2 + {e<k9} - 9 (1-4(9)}} + }; * aCN [~(9}-1/2] :p,9 (lal-2}2 E9{1(X '" € ~ 2 }lIx 11- } a,...a - p First, consider the case of p=4. The right hand side of (7.14) reduces to (7.15) 2-4E(X-2 } - [~(9)-!-~(9) + 92 (1-4(9}] 4,92 2 + }; * aCN (lal-2}2 E9{1(X '" € ~ 2 )lIx 11- } a,...a - p = A4(9} + B4 (9} , ,say. Note that B4 (9} is non-negative, f~r ~v~t~ ~ O. Thus, it suffices to show that A (9) ~ 0, V9 ~ 0.' ,T6iafJf\li?; J8t~;?that' a r Therefore, from (7.5) and (7.6). we obtain that (7.17) A4 (O} = 0: while by definition, -4 _~92 (7.20) 4 E(X ,92 ) = e 6 };r~O 2 1 2 r 1 4 _!9 (29 ) r! (2+2r)(4+2r) > ~ e 2 ,V 9 Thus. for 9 > ~T (which is less than .8), (7.18) is positive. (7.18) is equal to 0, while for 0 < 9 < ~T, we note that 4 6 (7.21) (8/89}{4E(x- 2 ) - [1-4(9)]} =~9) - 169 E(x- 2) 6,9 8,9 ~ 0 At 9=0, 30 = e ~ e _~a2 { _1_ 1 2 r 1 ~ - a 1r~0 (2 a ) r! 1 } (2+2r)(4+2r)(6+2r) a -212{ _1_ _ 1 2 r 1 } a" ~ - 3 1r~0 (2a ) r! 2 !a 2 - = ~(a){l - ~ (a/3)e ~ ~(a){l - (213)e r - 1} > O. 2 !a 2 - ~ ~w as ae e -1 w and e -1 w ~ a ~ ~w. v0 < 3/2. Thus. (7.18) is nonnegative for so that A (9) ~ O. V a ~ o. Hence. (7.15) is ~O. V a ~ o. 4 so that for p=4. the RSMLE dominates the SMLE [under (7.1). for every a every a ~ ~ o. 0]. Consider next the general case of p '" .N(a.1) while Xj .N(0.1). for 2 '" ~ j ~ p. ~ 5. Note taht under (7.1) Xl We denote the last term on e' the right hand side of (7.14) by C (a) and write it as r P "'; "" (7.22) Cp(a) = a~*~}~I"_~~2 ~~~i~U~"~~~)II~II-2 - p + a~* (J~ I~2.:~~,:,~e{1(~ -pc = c~l)(a) ~ 1)} € ~a2.~~n-2 1(a ~ 1) !I.' 4', c~l)(aJ~i- 9~2?-~~hl~X.}br, Note that by direct (7.23) l(a " evaltm,U,n~._ ~ = (p;l) r=2 = ~~~~~2~!~1)(r-¥)2 • ...__.',::- '; ..' ~(-a)2-(p-i) ~ r=2 = un +(-0) {':5 + (r-2)-1 .. (P-l) (r-2) r ;t} . Note that by definition C(2)(0) = (p-3)/4+2-P and C (0) = (p-4)/2 + (p+2)2-P . p P For every r > 2. we denote by (7.24) (7.25) __2 __2 -1 Ar(a) = Ea{(XI + ... + X;) Then. we may write • l(X j ~ O. 1 ~ j ~ r)}. 31 (1.26) c(2)(e) = ~ (p-1)2-(p-r)(r_2)2 A (e). p r=2 r-1 r Noting that the joint pdf of X1 .···.Xr is ¢(x1-e)(~) ... ¢(Xr ). we obtain that (1.21) (8/8e) A (e) = Ee {(X1-e)IIX 11-2 l(X ~ O)}. V r > 2. r ~r ~r Since in (1.21) Xl is confined to R+. the derivative is positive at e=o+ (as well as for e ~ e~r). for some e~r) > 0). while it may be negative for 9 > 90 , Consequently. Cp(2) (9) is nondecreasing in 9 (0. min r € e~r». Further. by (1.21). we immediately obtain that (1.28) and (8/89) Ar(e) > -9 ArCe). V e > 0 r > 2. Consequently. from (1.26) and (1.28). we obtain that (1.29) (8/8e)c~2)(e) > -9 c~2)(e) .', v,e,,> O. The last inequality leads (1.30) _!o2 C(2)(9) > e 2 P uS to, ' L ;: ~: Hil " j ' ('" C(2)(0)'~J-V;~">"0~" d, " . " , ' Pi" ',,' '"J . .:, , I., V" i 1 '1 t " , , } " .. r ~ "11· ';', -~; I -", , "t', I ! ' .;:",.' ... 1 , 1. ~ Thus. by (1.22). (7.23). (7~4)'and (7.36). we have' 5 e2{ :, 3' '1'} ' , p+'l } 1 C (9) > cP(-9) { +'~ + ~ 2 ~ - + - ' p 2 2P-1 4 2P p- (7.31) '.. )-- . , p- (, I . '" Consequently. the right hand side ()fi: (7~ i4):":~'s (1.32) '*,.V_",9 > :-r" o. Do-&rtd~'from below by p/2 - (p_2)2 E()(-2 2) + {elP(e) - e2ti~te)]} "'- ~ P-3 p. + cPr-e) {"""'2 + ~'l-})"1"':le2~~3'1' {i~(}" , 2~1 ,+)~;2 l4i.:'~c2P • V 9 > O. At e=o. (1.32) reduces to (p+2)/2P '> O.~and proceeding as in (1.16) through (1.21) [but replacing 4 by p]. it follows that the derivative (With respect to e) of (1.32) is nonnegative for all 9. and hence. (1.32) is nonnegative for all 9 > O. dominates the SMLE for every p ~ Hence. under (7.1). the RSMLE 3. Let us next consider the relative risk picture under (1.11). this case. (3.16) reduces to (1.33) say. In 32 where (7.34) ° A (9) = 1...1 r Rr+ II~II -2 r IT ep(x j -9)dx 1 ... dx r • j=1 2 <r ~ p. Proceeding as in (7.27) and (7.28). we obtain that (8/89)AO(9) ~ -r 9 AO(9) ~ -p 9 AO(9). V 2 < r ~ p. r r r Consequently. proceeding as in (7.29)-(7.30). we obtain from (7.33). (7.35) (7.34) and (7.35) that 2 2 _!p9 _!p9 [P-4 P+2] (7.36) C*(9) ~ e 2 C*(O) = e 2 -2 + . V9 P P # ~ 0. Next. we note that under (7.11). (7.37) R(9cv .9) - R(~.9) = p - (p-2)~(x-2 2)NOft ~ ~ p.p9 p{~(9)-9cp(9) + 92[1~(9)]} + C*(9) , P ···i· _!p92 ~ p{I~(9) + 9ep(9)-92[1~9j]}F+ e!2 C*(O) - (p-2)~(x-2 2) p p.p9 ; "1~J"1.s1 ~.O-Ol _. ;:> "(.;:::,,·L~[Ilr,' Now the problem is to show thaer f7.37) is nonnegative for all p and 9 posi tive. But. unfortunately further simplification of above expression P\ tL;~ ~C~) becomes highly complicated. .! C ~ jrr~ i.. \) ;.: ;:~J.~:'''c' ~ Available mathematical tools do not work )~ ':)51 (" 8!.- ~ t ,~: 1 well to give us the desired result for all p and 9. r'1 ',,1.1 8 r t, ';ci" ~ "' C Hence we obtain following partial results: : ((; . \ ,., a) Nonnegativity of (7.37) for large 9. To see this we first write: { 1 1 p E9 ----2 = I (p) E9 2 • I(~I,···,Xr," ~ 0. Xr +1 .···.Xp < 0) IIXII r=O r 1I~111 Then rewriting (7.37) we obtain. (7.38) " " R(9SM .9) - R(9RSM .9) = ~(-9)} ~ Now. note that. A(p-r)(9) = o(~(-9» r pl· 2 + 9 ep(9) - 9 (p) A(p-r)(9) + ~ (p) (r-2)2 ~-r(_9) AO(9). r =3r r r=O r rA(p-r)(9) = I_n+ IIxlI-2 ~ <kxj-9) ~ <kxj+9) dX. r ~ ~ j=1 j=r+l ~ _ (p_2)2 Where. p{~(-9) as 9 ~ ~ if p-r > ° (i.e .• if r < e' 33 ( ) Because, A p-r (a) = I IIxlIr ~ r 2 p 11 ¢<xj-a) 11 ¢<xj+a)dX j=1 r+l ~ r 1 ~ I (p-1)+ 2 IT ¢<xj-a) 2 1 = [: (J>-II+X :"" ~+X~l j=1 11 ¢<x .+a)dX J ~ j=r+1 p~1 ¢<x.+a)dX]~(-a) j=r+1 x 1+···+xp_ 1 ~ P J ~ = A(p-1-r) (a) ~(-a). p It is now easy to see that A(p-1-r) (a) ~ 0 as ~. r o(~(-a» as a ~ if r < p. 00 Hence, A~p-r)(a) = Next observe in (7.38) the terms corresponding to r=p are equal in both the sums, hence they cancel each other. Thus we have, A A <: ":' '1' ~ _~. R(aSM,a) - R(aRSM,a) = p{~(-aJ + a(~{-a) ~-~(-a»} + O(~(-a» ~ p ~(-a) + o(~(..:a» a~[~i}j'tt>~;J~,j"(8)"\<' - A A So we have proved that a dominates a for large a and the order of RSM..:", ~! \ ·'-L~Mt.i'.i':~ '.Jit,0Y "'; dominance is ~(-a). b) Nonnegativity of (7.37) for p=3,and 4 (for all a): .:;.~._-lF~n consider the case p=3. (7.39) (}:~Y::'J.:.>:); ... Let us first tItJ In this case (7.38) reduces to [.; t :.G ~,{)') ·. :~~. -::::'C~~, ~e'-;lt:.~·~b :.; R(9RSM,a}'~;"~.P~f~t;: 2, ~~? :;'-'; a~ ~(-a)] R(9SM ,a) - :·i~C..&.;t"i Ea{II~1I2, Ic'~~ Now, " >.?)161~! 3'2'1 '~t;· --a - -lIxll 2 ~ IIxll-2 e 2 3 11 j=1 Lor r't': " and 2 Ea {IIXII- I(X>O)} = I 3+ ,..., ,..., ,..., IR. > 2 ! EaIIXII~ - 8 Thus, R(9 SM ,a) - R(9 ,a) ~ 3[~(-a)+a¢<a) - a2~(-a)] - '!. E IIXII-2 8 a ~ RSM Call the rhs of above equality D(a). Further, 34 so that Thus. -2 E9 1;::" 1 99 = 1 - (39 ) - 3-1 + 2 ~ 1 - 2 9 + 4 -6 2 - 2E{)(7.hA}' ~ 5 4 9 V 0 ~ 0 <h < 1 O. Therefore. note that , 1 7 . ; -"~)1 <P(O) - 24 = <P(9) - cf>(9 ) ";>'(9~EJOl ~h96";to! (1-h)6) S; (9-9 )¢(9 ) 0 0 0 so that D(O} ~ -3(1-02 )(9-00 } ¢loo} + 39[¢l0} - ~ 03 ] positive. for every 9 ~ 1. thus 0(9) >0 V 0 ~ 9 ~ 1. and rhs is 35 2-!nxn nxn-2 e -3/29 2 '" I 3 + 3 IR + + ~ e e ax +~-~ 1 (2lr) -3/2dX " '" I 3 IIxn-2 e -3/20 IR + 2 '" 2 1 2 -ax -ax -Ox_ -211~" e 1 2 ---;3 (2lr) -3/2dX '" 2 -0 3 _!02 2 -2 2 3 -2' 'I -3/20 -2 4 E{x 2} + e 2 E{x 2} + 8 e E{x3 ,O}· 3,0 3,20 Thus, D{O) ~ 3[{~{-0) + M.I ~O) 2 - 0 ~(-O)} - 1 2 E{x -2 1 2 . -20 2)e 3,20 1 _02 -2 1 -3/202 - - e E{X ) - 24 e ]. 4 3,02 Also note that, Finally, Hence after some trivial simplification we have D{O) ~ 0 V0 ~ 1. same treatment holds for p=4 (as, ~ = p-2) but not for p ~ 5 (as ~ p-2). The < Thus we obtain the following result: A Theorem 7.3. The restricted stein rule estimator 0RSM dominates the 36 unrestricted one 8 for p=3 and 4. under the model (7.11). SM p ~ Further if 5 the same conclusion holds for large values of 8. (: ..~-, ( --' ' • ... '. I":" • 37 APPENDIX Proof of Lemma 3.2: (8.1) E{II~II~ < ~.9 >; I(~ ~ ~)} Now by Cauchy-Schewarz on <~.!>.}; and by dominated convergence theorem it can be shown that the above expectation is finite if s ~ 0 and r+s+p)O. also the l.h.s. of (8.1) equals. (8.2) where EO •.}; denotes expectation under Hp(O.'};). Now. i t is a lrnown fact that under H (O •.};). (_X_. I(X ~ 0) is jointly independent of II~II.};. IIxll -p '" So '" .}; (8.2) can be rewritten as (after multiplying and dividing the integrand by II~II~+k): <9 X>s+k . I I(X [ IIXlls+k '" > 0) '" ] E IIXll r +s +k O•.}; '" .}; "'.}; r+s+k 1 11!1I.};} 2 ~~O (k!) -1 "'tc+s (! •.};) 2 = exp{ -2 which completes the proof of the lemma. 2 f [r+s+k+P ] /f 2 [E2 ] 38 REFERENCES Anderson. T.W. (1984) An Introduction to Multivariate Statistical Analysis. 2nd ed.. Wiley. New York. Baranchik. A.J. (1910) A family of minimax estimators of the mean of a multivariate normal distribution. Ann. Math. Stat. 41. pp. 642-645. Barlow. R.E. et al. (1912) Statistical Inference Under Order Restrictions. J. Wiley. London. New York. Berger. J.O. (1985) Statistical Decision Theory and Bayesian Analysis. 2nd ed .• Springer-Verlag. New York. James. W. and Stein. C. (1961) Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium. Univ. of California Press. 1. pp. 361-319. Kudo. A (1963) A multivariate analogue of one-sided test. 50. pp. 403-418. Biometrika. Nuesch. P.E. (1966) On the problem of testing location in multivariate problems for restricted alternatives. Ann. Math. Stat. 37. pp. 113-119. Perlman. M.D. (1969) One sided testing problems in multivariate analysis. Aann. Math Stat. 40. pp. 549-561. Sen. P.K. (1984) A James-Stein detour of U-statistics. Theory Methods 13. pp. 2125-2141. Commun. Stat. A Stein. C. (1956) Inadmissibility of the usual estimator for the mean of a mul tivariate normal distribution. Proc. Third Berkeley Symposium. University of California Press. 1. pp. 191-206. Strawderman. W.E. (1911) Proper Bayes minimax estimators of the mul tivariate normal mean. Ann. Math. Stat .• pp. 385-388. •
© Copyright 2025 Paperzz