THE CONSOLIDATED UNIVERSITY OF NORTH CAROLINA

A NEW GENERALIZATION OF JAMES AND STEIN'S ESTIMATORS
IN MULTIPLE LINEAR REGRESSION

Yong W. Nam

Institute of Statistics Mimeo Series #1104
January 1977

DEPARTMENT OF STATISTICS
Chapel Hill, North Carolina


YONG WHA NAM. A New Generalization of James and Stein's Estimators in Multiple Linear Regression. (Under the direction of Kempton J.C. Smith.)

Assume that, in the multiple linear regression Y = Xβ + ε, Y is n×1, X is n×p and of full rank, β is p×1 (p ≥ 3), and ε is distributed as N(0, σ²I). Define

    β̂*(r) = [I − aσ̂² r{β̂'(X'X)²β̂, σ̂²} (X'X)/(β̂'(X'X)²β̂)] β̂,

where β̂ is the usual least squares estimator and σ̂² = (y−Xβ̂)'(y−Xβ̂)/(n−p) if σ² is unknown and σ̂² = σ² if it is known. It is shown that, under some general conditions on r(·,·), MSE{β̂*(r)} < MSE{β̂} for every β if a satisfies

    0 < a < 2(p−2)A,

where A = (n−p)/(n−p+2) if σ² is unknown and A = 1 if it is known.

Let β̂* be the estimator in the above class with r(·,·) ≡ 1. Then an explicit formula for the mean squared error (MSE) of β̂* is obtained, and it is shown that a satisfies the above inequality if MSE{β̂*} < MSE{β̂} for some β. Moreover, the MSE{β̂*} attains its minimum when a = (p−2)A. This estimator coincides with the James and Stein's [10] estimator if X'X = I. By a simple transformation of the estimator β̂*, a "positive-part-type" estimator β̂*₊ is obtained such that MSE{β̂*₊} < MSE{β̂*} for every β and a > 0.

* This research was supported in part by the Army Office of Research under Contract DM-G29-74-C-0030 and in part by the National Science Foundation under Grant GP-42325.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS ....................................................... iii

I.   INTRODUCTION
     I.1. Statement of the problem ........................................ 1
     I.2. Historical background ........................................... 2
     I.3. Summary of research results ..................................... 7
     I.4. Basic identities ............................................... 10

II.  NEW CLASSES OF ESTIMATORS
     II.1. Introduction .................................................. 14
     II.2. Case I: σ² known .............................................. 15
     II.3. Case II: σ² unknown ........................................... 18

III. GENERALIZATION OF JAMES AND STEIN'S ESTIMATORS
     III.1. Introduction ................................................. 26
     III.2. Case I: σ² known ............................................. 27
     III.3. Case II: σ² unknown .......................................... 30
     III.4. Additional properties of the generalized estimators .......... 32
     III.5. Comparison between the generalized James and Stein's
            estimator and Bock's estimator ............................... 34

IV.  SIMPLE IMPROVEMENTS OF THE GENERALIZED JAMES AND STEIN'S ESTIMATORS
     IV.1. Introduction
     IV.2. Case I: σ² known
     IV.3. Case II: σ² unknown
     IV.4. Comparison between the improved James and Stein's estimator
           and the "improved" estimator of Berger and Bock ............... 49

V.   DISCUSSION AND SOME SUGGESTIONS FOR FUTURE RESEARCH ................. 53

APPENDIX A ............................................................... 58
APPENDIX B ............................................................... 69
APPENDIX C ............................................................... 73
BIBLIOGRAPHY ............................................................. 74


ACKNOWLEDGEMENTS

I wish to acknowledge the advice and guidance given to me by my advisor, Professor Kempton J.C. Smith, during the course of this research.
I wish to express my appreciation to the chairman of my doctoral committee, Professor Raymond J. Carroll, and to the other members of the committee, Professors Indra M. Chakravarti, Norman L. Johnson, and Peter J. Schmidt, for their constructive criticism and suggestions. Appreciation is also extended to Professor Douglas M. Hawkins for his contribution.

I wish to express my special appreciation to my parents, Mr. and Mrs. Chul Kyun Nam, my sisters, and my late brother for their continuous support and encouragement. Finally, I thank my wife, Mi Yong, and my daughter, Mi Hae, for their patience and sacrifices during the course of this research.


CHAPTER I

INTRODUCTION

I.1 Statement of the problem

Consider the multiple linear regression model

(1.1)    Y = Xβ + ε,

where Y = (y₁, y₂, ..., yₙ)' is a vector of n observations on the dependent variable; X = ((x_ij)) is an n×p matrix such that the i-th row (x_i1, x_i2, ..., x_ip) contains the i-th observations on the independent variables (i = 1, 2, ..., n); β = (β₁, β₂, ..., β_p)' is the vector of parameters to be estimated; and ε = (ε₁, ε₂, ..., εₙ)' is assumed to be distributed as N(0, σ²I), 0 being the n×1 zero vector, σ² being a constant, and I being the identity matrix of order n. Without loss of generality, it is assumed that the independent variables are standardized so that X'X is in the form of a correlation matrix. Assume that X'X is of full rank.

Let P be an orthogonal matrix such that

(1.2)    X'X = P'ΛP,

where Λ is the diagonal matrix of the eigenvalues λ₁ ≥ λ₂ ≥ ··· ≥ λ_p > 0 of X'X. Then the usual least squares estimator, β̂ = (X'X)⁻¹X'Y, is the best linear unbiased estimator (BLUE) of β and has a normal distribution with mean β and variance σ²(X'X)⁻¹. In particular, the mean squared error (MSE) of β̂ is given by

(1.3)    MSE{β̂} = tr Var[β̂] = σ² Σ_{i=1}^{p} 1/λ_i.

This does not depend on the scale of X because it is assumed that X'X is in the form of a correlation matrix.

If σ² is unknown it can be estimated by s² such that

(1.4)    (n−p)s² = (y − Xβ̂)'(y − Xβ̂)

is independent of β̂ and distributed as σ² times a chi-square random variable with n−p degrees of freedom. Throughout this thesis, "Case I: σ² known" indicates that σ² is known in (1.1) above, and "Case II: σ² unknown" is defined similarly.

As shown in the next section, a number of (biased) estimators which have smaller mean squared error than the usual least squares estimator have been introduced by various authors. New estimators and new classes of estimators are proposed in this thesis. It is assumed, throughout this thesis, that p ≥ 3.
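As a purely illustrative aside (not part of the original text), the quantities in (1.1)-(1.4) can be checked numerically. The short Python sketch below simulates one arbitrary standardized design, computes the least squares estimator, the variance estimate s² of (1.4), and compares the Monte Carlo average of (β̂−β)'(β̂−β) with the formula (1.3); the sample sizes, coefficient values, and variable names are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma2 = 50, 4, 1.0

# An arbitrary design, standardized so that X'X is in correlation-matrix form.
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / (X.std(axis=0) * np.sqrt(n))
XtX = X.T @ X
lam = np.linalg.eigvalsh(XtX)              # eigenvalues of X'X
beta = np.array([1.0, -0.5, 0.25, 0.0])

mse_formula = sigma2 * np.sum(1.0 / lam)   # MSE{beta_hat} from (1.3)

sq_err = []
for _ in range(20000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    beta_hat = np.linalg.solve(XtX, X.T @ y)          # least squares estimator
    s2 = np.sum((y - X @ beta_hat) ** 2) / (n - p)    # (1.4), unbiased for sigma^2
    sq_err.append(np.sum((beta_hat - beta) ** 2))

print(mse_formula, np.mean(sq_err))        # the two values should agree closely
```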
I.2 Historical background

In 1956 Stein [16] showed, for Case I: σ² known, that there exists an estimator of the parameter β in (1.1) above which has smaller mean squared error than β̂ if X'X = I and p ≥ 3. Since then a number of explicit forms of estimators and classes of estimators have been introduced. Some of those which are relevant to this research are presented below in two parts; one for the case when X'X = I and the other for the case when X'X is not necessarily an identity matrix. For notational convenience, define

    σ̂² = σ², the true value, for Case I;
    σ̂² = s², the estimator of σ² given in (1.4), for Case II.

For the case when X'X = I, James and Stein [10] introduced an estimator of the form

(1.5)    β̃ = [1 − aσ̂²/(β̂'β̂)] β̂,

where a is a nonnegative constant. They first obtained two important identities,

(1.6)    E[σ²/(β̂'β̂)] = Σ_{j=0}^{∞} P_j/(p−2+2j)

and

(1.7)    E[(β̂−β)'β̂/(β̂'β̂)] = (p−2) Σ_{j=0}^{∞} P_j/(p−2+2j)

(see (9) and (16) in [10]), where the P_j's are the Poisson probabilities with mean β'β/(2σ²). By using these identities, they showed that

(1.8)    MSE{β̃} < MSE{β̂} for every β

if a satisfies

(1.9)    0 < a < 2(p−2)A,

and that a satisfies this inequality if MSE{β̃} < MSE{β̂} for some β, where

(1.10)   A = 1 for Case I;  A = (n−p)/(n−p+2) for Case II.

They also showed that the MSE{β̃} attains its minimum,

(1.11)   σ²{p − (p−2)² A Σ_{j=0}^{∞} P_j/(p−2+2j)},

when a = (p−2)A.

Baranchik [1] considered the "positive" part of James and Stein's estimator,

(1.12)   β̃₊ = φ_{[a,∞)}{β̂'β̂/σ̂²} [1 − aσ̂²/(β̂'β̂)] β̂,

where

(1.13)   φ_E(x) = 1 if x ∈ E and φ_E(x) = 0 otherwise,

and showed that MSE{β̃₊} < MSE{β̃} for every β and a > 0.

Later, Baranchik [1,2] also obtained a class of estimators of the form

(1.14)   β̃(r) = [1 − aσ̂² r{β̂'β̂/σ̂²}/(β̂'β̂)] β̂,

where a is a nonnegative constant and r: [0,∞) → [0,1] is monotone nondecreasing, and showed that

(1.15)   MSE{β̃(r)} < MSE{β̂} for every r(·) and β

if a satisfies (1.9) above. The improved estimator, β̃₊, belongs to the class of estimators β̃(r) given in (1.14). If r(·) ≡ 1, Baranchik's class coincides with James and Stein's estimator, that is, β̃(1) = β̃. For Case II: σ² unknown, Strawderman [18] obtained a large class of estimators which contains Baranchik's class.

For the case when X'X is not necessarily an identity matrix, Bhattacharya [5] first obtained an improved estimator in which each component of β̂ is shrunk by a separate factor δ_i (i = 1,2,...,p) arranged in a diagonal matrix, and showed that this estimator has smaller mean squared error than β̂.

Bock [6] introduced a class of estimators of the form

(1.16)   β̃*(r) = [1 − aσ̂² r{β̂'(X'X)β̂/σ̂²}/(β̂'(X'X)β̂)] β̂,

where a is a nonnegative constant and r: [0,∞) → [0,1] is monotone nondecreasing, and showed that

(1.17)   MSE{β̃*(r)} < MSE{β̂} for every r(·) and β

if a satisfies

(1.18)   0 < a < 2(λ_p Σ_{i=1}^{p} 1/λ_i − 2)A,

assuming that

(1.19)   λ_p Σ_{i=1}^{p} 1/λ_i > 2.

If X'X = I, the assumption (1.19) is satisfied, since p ≥ 3, and this class of estimators coincides with Baranchik's class given in (1.14) above. Bock also considered a special case, β̃* = β̃*(1), of the above class of estimators with r(·) ≡ 1,

(1.20)   β̃* = [1 − aσ̂²/(β̂'(X'X)β̂)] β̂,

and showed that

(1.21)   MSE{β̃*} < MSE{β̂} for every β

if and only if a satisfies (1.18), assuming that (1.19) is satisfied. If X'X = I, again, the assumption (1.19) is satisfied and this estimator coincides with James and Stein's estimator given in (1.5) above.

Berger and Bock [4] obtained an estimator which, if X'X = I, coincides with the improved estimator of Baranchik given in (1.12) above. This estimator is compared with the new improved estimator in Section IV.4 of Chapter IV.

For Case I: σ² known, both Berger [3] and Hudson [9] independently obtained some of the results in this thesis; they are referred to when those results are stated later in this thesis.
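The following sketch (again illustrative only, and restricted to the orthonormal case X'X = I with σ² known) implements the James and Stein estimator (1.5) and Baranchik's positive-part estimator (1.12), using the choice a = p − 2 suggested by (1.11), and compares their empirical risks with that of β̂. The dimension, the number of replications, and the point β = 0 are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
p, sigma2 = 6, 1.0
a = p - 2                     # the value minimizing (1.11)
beta = np.zeros(p)            # the risk reduction is largest at beta = 0

def james_stein(bh):
    """Estimator (1.5) with X'X = I and sigma^2 known."""
    return (1.0 - a * sigma2 / (bh @ bh)) * bh

def positive_part(bh):
    """Baranchik's estimator (1.12): the shrinkage factor is truncated at zero."""
    return max(0.0, 1.0 - a * sigma2 / (bh @ bh)) * bh

risk = {"least squares": [], "James-Stein": [], "positive part": []}
for _ in range(50000):
    bh = beta + rng.normal(scale=np.sqrt(sigma2), size=p)   # beta_hat ~ N(beta, sigma^2 I)
    risk["least squares"].append(np.sum((bh - beta) ** 2))
    risk["James-Stein"].append(np.sum((james_stein(bh) - beta) ** 2))
    risk["positive part"].append(np.sum((positive_part(bh) - beta) ** 2))

for name, vals in risk.items():
    print(name, round(np.mean(vals), 3))   # expect least squares > James-Stein > positive part here
```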
I.3 Summary of research results

In Chapter II, a new class of estimators is proposed, of the form

(1.22)   β̂*(r) = [I − aσ̂² r{β̂'(X'X)²β̂, σ̂²} (X'X)/(β̂'(X'X)²β̂)] β̂,

where a is a nonnegative constant and r: [0,∞) × [0,∞) → [0,1] satisfies

    (a) for each fixed σ̂², r(·, σ̂²) is monotone nondecreasing;
    (b) for each fixed β̂'(X'X)²β̂, r{β̂'(X'X)²β̂, ·} is monotone nonincreasing.

It is shown that

(1.23)   MSE{β̂*(r)} < MSE{β̂} for every r(·,·) and β

if a satisfies (1.9) above.

Two particular subclasses are also considered. The first, obtained by taking r = r(F, σ̂²) with F = β̂'(X'X)²β̂/σ̂², coincides with Strawderman's [18] class if X'X = I. The other, obtained by taking r = r(F), coincides with Baranchik's [2] class given in (1.14) above if X'X = I.

Although both Bock's class of estimators given in (1.16) and the above second subclass with r(F) generalize Baranchik's class to the case when X'X is not necessarily an identity matrix, this subclass does not require assumption (1.19) and gives a wider range for a, as given in (1.9), than Bock's class.

For Case I: σ² known, both Berger [3] and Hudson [9] independently obtained the above second subclass and proved (1.23). However, their methods of proof are different from the one used in this thesis.

If r(·,·) ≡ 1, β̂*(r) in (1.22) becomes

(1.24)   β̂* = β̂*(1) = [I − aσ̂² (X'X)/(β̂'(X'X)²β̂)] β̂.

In Chapter III, this particular estimator is further studied to show that

(1.25)   MSE{β̂*} < MSE{β̂} for every β

if a satisfies (1.9), and that a satisfies (1.9) if MSE{β̂*} < MSE{β̂} for some β. It will also be shown that, when a = (p−2)A, the MSE{β̂*} attains its minimum,

(1.26)   σ²{Σ_{i=1}^{p} 1/λ_i − (p−2)² A α⁻¹ Σ_{j=0}^{∞} K_j/(p−2+2j)},

where α is a positive constant (λ_p ≥ α > 0) and the K_j's are defined as in (1.36) below. If X'X = I, this new estimator, β̂*, coincides with James and Stein's estimator, β̃, given in (1.5).

Although both the new estimator, β̂*, and Bock's estimator, β̃*, can be regarded as generalizations of James and Stein's estimator, β̃, to the case when X'X is not necessarily an identity matrix, these generalized estimators have distinct properties. Firstly, the condition (1.19) on the eigenvalues of X'X should be satisfied to define β̃*, while no such condition is necessary to define β̂*. Secondly, β̂* uses the same interval of a, given in (1.9), as in the case of β̃, while β̃* uses a different interval, given in (1.18), which is not so wide as the former in general. Thirdly, the MSE{β̂*} is minimized for every β when a = (p−2)A, the midpoint of the interval given in (1.9), as in the case of the MSE{β̃}, while the value of a which minimizes MSE{β̃*} is a function of β and, in some cases, falls outside the interval given in (1.18), as will be proved in Section III.5 of Chapter III. Fourthly, if MSE{β̂*} < MSE{β̂} for some β, then a must belong to the interval given in (1.9), as in the case of β̃, while this is not so for β̃* unless MSE{β̃*} < MSE{β̂} for every β. Thus, it seems reasonable to say that this new estimator, β̂*, is more like James and Stein's estimator, β̃, than Bock's estimator, β̃*. It is for this reason that the new estimator is named a generalized James and Stein's estimator in this thesis.

For Case I: σ² known, Hudson [9] independently obtained the same estimator and showed, by using a different method of proof, that, when a = (p−2), the MSE{β̂*} is given by

(1.27)   σ²{Σ_{i=1}^{p} 1/λ_i − (p−2)² E[σ²/(β̂'(X'X)²β̂)]}.

By using Corollary 1.1 (i) in the next section, it is readily seen that the two quantities given in (1.26) and (1.27) are the same.

In Chapter IV, the new estimator given in (1.24) above is further improved by using

(1.28)   β̂*₊ = (P'φP) β̂*,

where P is defined as in (1.2) and φ is a diagonal matrix of φ_i (i = 1,2,...,p) such that

    φ_i = φ_{[aλ_i,∞)}{β̂'(X'X)²β̂/σ̂²}   for i = 1,2,...,p.

It is shown that

(1.29)   MSE{β̂*₊} < MSE{β̂*} for every β and a > 0.

It is also shown that this estimator does not belong to the class of estimators given in (1.22) above unless X'X = I, in which case it coincides with the improved estimator of Baranchik given in (1.12) above.

The above improved estimator β̂*₊ was also considered independently for Case I: σ² known by Hudson [9], who stated, without proof, that "it seems clear that the estimator would be superior to β̂*."

I.4 Basic identities

In this section, important new identities are obtained. These are used in the proofs of theorems in the next two chapters. Since the proof of Theorem 1 (ii) is very complicated and lengthy, it is given separately in Appendix B.
The above improved estimator dent1y for Case I: 13A*+ was also considered indepen- cr 2 known by Hudson [9] who stated, without proof, A "it seems clear that the estimator 1.4 would be superior to 13* •" Basic identities In this section, important new identities are obtained. These are used in the proofs of theorems in the next two chapters. Since the proof of Theorem 1 (ii) is very complicated and lengthy, it is given separte1y in Appendix B. 11 .e Theorem 1. Let ex ~ 0) , f (x) = 1 xm/ 2- 1e -x/2 m f(m/2)2m/2 the p.d.f. of a chi-square random variable with dome and let h: [0,00) -+ [0,1] m degrees of free- be monotone nondecreasing. Define OO (1. 30) I J 2 = h Ca.cr X) f (x)dx mOm (A :<!: a. > 0) . p Then (i) (ii) I p _2+ 2j ) , where K.'s Proof. (i) (1. 31) where· P J are defined as in (1.36) below. Let Z = cr -1 k A A 2p f3 and e= cr -1 1 A ~P S , is the orthogonal matrix defined in (1. 2) above. is distributed as N(e,I). Using (1.31), we get (1. 32) where . (1. 33) Let A :<!:a.> 0 p and a.i = A./a.:<!: 1 1 (i = 1,2, ... ,p) • Then Z 12 Then, from (1.33), (1. 34) V= P 2 L A.Z. . 111 =a 1= P 2 L a.Z. . 111 = a 1= P ,2 L a'X 82 . 1 1 1, . 1= 1 '2 "m,d denotes a noncentra1 chi-square random variable with degrees of freedom and noncentra1ity parameter d. By Lemma 1 in where y m Appendix A -1 ~ h(cr 2v) -1 = a £K. f 2.(a v)dv o j=O J v p+ J OO (1. 35) f where the K.'s J are given by n P -8'8/2 K = e a.-~ O .1= 1 1 (1. 36) ". j=1,2, ... , (1. 37) 9 = p P t -1 m t 2 -1 -1 m-1 , £ (I-a. ) + m £ B.a. (I-a.) m '1= l '1 11 1= ~ 1 m=1,2, ... , 00 L K. = 1, j=O J K. ~ 0, J and j = 0, 1, 2, . .. . Now, by the Monotone Convergence Theorem for infinite integrals (see Corollary 1 to Theorem A of Loeve [13], p. 124), the integral and the summation signs on the right hand side of (1.35) can be interchanged. Hence, = a-I = a-I L K.foo. 2 L K. foo. h(acr x) 00 j=O J 0 By Lemma 7 in Appendix A with m + 2k > 0 if P ~ 3) , 2 h(cr v) f . (a- 1v)dv j=O J 0 v p+2J 00 x m= p+2j f and .(x)dx. p+2J k = -1 (note that 13 lOOK. - a.- ~ J .L p-2+2J' J=O lOOK. - a.- ~ J . L p-2+2J' J=O (1. 38) OOh fo 2 (ao X)f 2 2·(x)dx p- + J From (1.32) and (1.38), we get which proves the first identity. (ii) 0 See Appendix B. Corollary 1.1 OOK. l = a. - L~ (i) J - j=O p-2+2j (ii) E Proof. t (S:S)' (X'X) !~ Let h(-) - 1 m then . Then, for all = 1 m. 0 Hence, the proof follows. X'X= I, Kj 2 2' . j=O p- + J in Theorem 1. I If ~ = ( p- 2) a. -1 L S'(X'X) 2S K. = P. J J for all j by Ruben [15]. Hence, in this case, the above identities (i) and (ii) coincide with James and Stein's [9] identities given in (1.6) and (1.7), respectively . • CHAPTER II NEW CLASSES OF ESTIMATORS II.1 Introduction Two classes of estimators are obtained in this chapter for the case when X'X is not necessarily an identity matrix. When 0 2 is unknown the new class of estimators is defined to be of the form where a is a nonnegative constant and the function r(·,·) is assumed to satisfy certain conditions as stated in Theorem 2 below. We first obtain an upper bound on the difference between the mean A squared error of ~ 8* and that of the usual least squares estimator as ,.,.2{n- p +2 2 } -1 ~ Kj n-p a -2(p-2)a a j~O p-2+2j Ap _2+2j , S v where A P 2: a > 0, the K.' s are defined as in (2.7). J From this, we show that MSE{~*(r)} < MSE{~} if a ~ are defined as in (1. 36), and the for every r(·,·) satisfies 0 < a < 2(p-2) (n-p)/(n-p+2) . 
and 8 A's m 15 This new class of estimators, A* S (r), contains Strawderman's [18] class and, in turn, the corresponding class of Baranchik [2] when X'X = I. Thus, this new class may be regarded as a genera1iza- tion of the latter two. 11.2 Case I: cr 2 known As a generalization of Barachik's [1] result to the case when X'X is not necessarily an identity matrix, Bock [6] produced a class of estimators of the form 2 2 [1-acr r{S' (X"X)S/cr }/S' (X'X)S]S and showed that these estimators, like those of Baranchik, have smaller mean squared error than the usual least squares estimator A S if assuming that A IliA. > 2, where A is P Pi=l 1 Pi=l 1 the smallest eigenvalue of X'X. In this section we obtain a new 0< a. < 2 (A IliA .. - 2) generalization of Baranchik's result in such a way that these new estiators have similar property to the above regarding mean squared error but do not require the additional assumption made by Bock. Theorem 2. Assume that a* (r) (2.1) where A 2 {A, , 2 } ~ I _ acr -: S (X X~ S (X' X). ~ , C S'(X'X)2 S ~ [0,1] Then MSE{~*(r)} < MSE{S} for every if a. satisfies (2.3) Define a. is a nonnegative constant and' r: [0,00) nondecreasing. (2.2) = cr 2 is known. o < a. < 2(p-2). r(e) and S is monotone 16 Proof. We have Hence, (2.4) = E[{i3*(r)-B}'{a*(r)-B}] = E[(S-B)' (S-B)] MSE{a*(r)} _Z""ZE rrlf' (X· X) Z!l (a-S) , (X'X) ~ J [ B' (X'X)2 13 + "Z,,zEfc."Zi{a· (X' X) ZaU .. [ a' (X'X)2 a J Now A (2.5) A E [(6-B) , (B-B)] 2 P 1/)... . . . 1 1 1= = MSE{B} = cr l: A By Theorem 1 (i) in Chapter I, \ (2.6) where (2. 7) By Theorem 1 (ii) in Chapter I, (2.8) Bp _2 + 2j ] , where (2.9) 2 B = [r(acr X)f (x)dx . mOm Substitution of (2.5), (2.6), and (2.8) into (2.4) gives 17 ( _ 2 p -1 00 2j (2.10) MSE{SA* (r)}-ez l K. B +2' - 2 2' Bp _2+2j ) { L 1/;\.-2aet i=l 1 j=O J P J p- + J +a.2<l - l ooK.J . A j~O p-2+2J } .• p-2+2J Thus, we have obtained an explicit formula for the mean squared error A* of S (r). Now, Ap _2+2j = ~r2(oa2X)fp_2+2j(X)dX by (2.7] oo s; Jor(<l0'2x)fp- 2 2·(x)dx J + = Bp _2+2j S Bp+·2'J since OSr(e) S 1 by (2.9) by Lemma 6 (i) in Appendix A . Hence, we get, from (2.10) A*. 2 (2.11) MSE{S (r)}sO' {. P ll/;\.-2qct 1 i=1 -1 .( 2j 00 L K.1 - 2 2' ] A 2 2' j=O J p- + J p- + J +a.2~·-1 00 j~O K. J p-2+2j p S0'2 . p-2+2j '-1 1- K. 00 L 1/;\.+{a.2_2(P_2)a.~-1 L L } A '-0 P J- 1 _2 J 2' + J Or, equivalently, 00 (2.12) MSE{~*(r)}-MSE{~}s0'2{a.2_2(p-2)a~-ll j =0 . .r Clearly, the quadratic form in if 0 < a. < 2(p- 2). If XIX = I K. /2' A 22" p- + J p- +. J a. on the right hand side is negative Hence, the proof is completed. 0 A* (r) in (2.1) becomes in Theorem 2, then S 18 which is the same class of estimators obtained by Baranchik [1], given in (1.14) in Chapter I. As stated in Section I.3, both Berger [3] and Hudson [9] independently arrived at the same class of estimators as the ~*(r) in (2.1) and proved Theorem 2 by using a different method of proof. II.3 Case II: 0 2 unknown For the case when X'X is not necessarily an identity matrix, we obtain a wide class of estimators which have smaller mean squared error than the usual least squares estimator ~. The class of estimators in Theorem 3 below contains Strawderman's [18] class and, in turn, the corresponding class of Baranchik [2] if X'X = I. A different generali- zation of Baranchik's result was obtained by Bock [6] under the additional assumption stated in Section II.2 above. The method of proof as used in Section II. 2 above is followed except for minor changes in notation. Theorem 3. Define (2.13) where a. 
is a nonnegative constant and r: [0,00) x [0,00) + [0,1] satisfies (a) for each fixed 2 2 s , r(·,s) (b) for each fixed {A, I 2 ..... 1 I 2 } S (X X) S, r S eX X) S,· is monotone nonincreasing. A is monotone nondecreasing, A 19 Then, MSE{S*(r)} < MSE{S} (2.14) if a. Proof. Since and a satisfies 0 < a. < 2(p-2) (n-p)/(n-p+2) . (2.15) Let for every r(o,-) As in (2.4), we first obtain, Z, 6, and V be defined as in (1.31) and (1.33) in Chapter T. (n_p)s2/cr2 is independent of tion with n-p S and has a chi-square distri- degrees of freedom, we get the conditional expected By Lemma 7 in Appendix A, the integral becomes, fo~r2{S'(X'X)2S, (n_p)-lcr2y} y2 f n-p (y)dy ~ 2 Al I 2 A -1 2 = (n-p) (n-p+2) 0 r {S (X X) 13, (n-p) cr y}fn _ + (y) dy . J = (n-p) (n-p+2)h1{S' (X ' X)2 S} , where p 4 e- 20 00 (2.17) hI (x) = Jr r 2{ x, (n-p) -1 cr 2y }f O By assumption (a), hI: [0,00) ~ [0,1] n _p+4 (y) dy is monotone nondecreasing. Now, the conditional expectation is Hence, we get, (2.18) 1 = (n-p)- (n-p+2)awhere Am , = (Z-8) Z h (Z'1\Z) 2'AZ 2 (2.19) l: j=O K. /2' A p- + J p-2+2j is defined as in (2.7) with hI (0) Similarly, where 1 00 in place of 2 r (0) • 21 By assumption (a), it is readily seen that monotone nondecreasing. h 2 : [0,00) + [0,1] is Now, we get, corresponding to (2.8), (2.20) - -1 ~.( 2j J - a. j~O Kj Bp +2j - p-2+2j Bp _2+2j , where Bm is defined as in (2.9) with h 2 (e) in place of r(e). Substitution of (2.S), (2.18) and (2.20) into (2 .16) gives,. corresponding to (2.10), (2.21) ( "* _ 2p. -1 00 2j MSE{S }-a { Il/A.-2aa. I K. B 2' - 2 2' Bp _2+2j ) i=l 1 j=O J p+ J p- + J +(n-p) -1 . 2 _looK. 2) (n-p+2)a. a. . I 2' j=O p- + J From assumption (b) and Lemma 6 (ii) in Appendix A, hI (x) = Jroor 2{ x, (n-p) -1 a 2y } f n _p+4 (y)dy o oo s Jor{x,(n_p)-la2Y}fn-p+ 4(y)dy s Jor{x,(n_p)-la 2Y}fn-p+ 2(y)dy oo From this inequality and Lemma 6 (i) in Appendix A ( Using these inequalities in (2.21) we get, 22 A* 2 -1 2j MSE{S (r)}~cr {. pl: 1/A.-2aa 00l: K. ( 1- 2 2' JA 2 2' i=l 1 j=O J p- + J p- + J (2.22) + (n-p) -1 } /2' A p- + J p-2+2j 2 _looK. (n-p+2}a ct L j=O ~cr2LI. 1 l/A.+{(n-p) -1(n_p+2)a 2 1= 1 -2(p-2)a'd- . } 1 K. 00 L . J. J=O p-2+2J Or, equivalently, ~cr2{(n_p)-1(n_p+2)a2_2(p_2)a1..'a-lY~j2' . 1 j=O p- + J Ap- 2+ 2'J . Clearly, the quadratic form in a' on the right hand side is negative if 0 <a < 0 2(p-2) (n-p)/(n-p+2). This completes the proof. Two subclasses of S*(r) given in (2.13) are considered in the next two corollaries. Corollary 3.1. Let F = S'(X ' X)2 S/s 2. Define (2.24) where a is a nonnegative constant and r: [0,00) x [0,00) + [0,1] satisfies (a I) for each fixed s 2 , r(e,s 2 ) (b ' ) for each fixed F, r(F,e) Then, the Proof. MSE{(3* (r)} is monotone nondecreasing, is monotone nonincreasing. satisfies (2.14) if a satisfies (2.15). It suffices to show that the above function r(e,e), when , regarded as a function of a' (x x)2 a and s 2 , satisfies conditions 23 (a) and (b) in Theorem 3. Let and consider the transformations Y = X= 2 2 s 2 By the Chain Rule of partial derivatives, = ar (Y1 ' Y2) (..!...] aY l + ar (Y1 ,Y2) X 2 (0) <:: 0 . ay2 2 X = s 2 , r (. , s) is monotone nondecreasing, 2 which is condition (a) in Theorem 3. Similarly, Hence, for any fixed 3rfYI'Y2) • 3r(Y I ,Y 2) [_ Xl] + a,x 2 ay 1 Hence, for any fixed X 1 x2 2 = S'(X ' X)2 S, r{S'(X ' X)2 S,·} increasing, which is condition (b) in Theorem 3. is monotone non- Hence, the proof \ follows. 0 (In the strict sense, the partial derivatives of r exist "except for a set of Lebesgue measure zero". 
in the above proof However, since we are dealing with expected values of continuous random variables, such a distinction is irrelevant to the results.) If XIX = I in Corollary 3.1, then AlA F = S Sis 2 and A* l3 (r) (2.24) becomes which coincides with the corresponding class of estimators by Strawderman [15], in 24 The class of estimators in Corollary 3.2 below is a subclass of the one given in Corollary 3.1 and, in turn, the one given in Theorem 3. Corollary 3.2. Let (2.25) F = S'(X'X)2 S/ s 2. S* (r) = E-ar~F) Define (X'X~ S a is a nonnegative constant and where nondecreasing. Then, the MSE{S*(r)} , r: [0,00) + [0,1] is monotone satisfies (2.14) if a satis- fies (2.15). Proof. Clearly, the above function reo) is a special case of r(o,o) in Corollary 3.1. Hence, the proof follows. If XIX =I . Coro 11 ary 3 . 2 , t hen ln 0 2 F. -_ a'a/ ~ ~ s and the ~* (r) in (2.25) becomes which coincides with the corresponding class of estimators of Baranchik [2], as given in (1.14). The formula for the MSE{S*(r}} as given in (2.21) above, is a function of A and B. An attractive property of S*(r) in Corolm m lary 3.2 is that the method of computing A and B is relatively m m 2 simple and that these quantities do not depend on S or 0 as shown in the following. Let From (2.17), 25 (00 2 -1 2 h l (x) = J r l {x, (n-p) cr w2}fn _p +4 (w 2)dwZ O = (00 2 -2 J r {(n-p)cr x/w 2}fn _p+4 (w2)dw Z . O From (2.7) with Am 2 hl(e) in place of r (e), = ~hl(aWl)fm(Wl)dWl = J~~r2{a(n-p)Wl/W2}fm(Wl)fp_P+4(W2)dWldW2 . 2 Now, this is the expected value of r {a(n-p)W l /W 2} such that Wl and Wz are independent and have chi-square distributions with v l = m and v 2 = n-p+4 degrees of freedom, respectively. On the other hand, it is well known that the ratio W= Wl/WZ has a Pearson Type VI distribution and that the p.d.f. of W is given by f \1 V w;:: 0 • (w) l' 2 Hence, A m can be computed directly as A = [r2{a(n-p)~}f 4(w)dw . m 0 m,n-p+ Similarly, B = r;{<l(n"p)w}f 2(w)dw . m J m,n-p+ o CHAPTER III GENERALIZATION OF JAMES AND STEIN'S ESTIMATORS 111.1 Introduction James and Stein's estimators are generalized for the case when XIX is not necessarily an identity matrix. When cr 2 is unknown, this gen- era1ized estimator is defined as S* = where a. {I _. S' 2 a.s (X 'X) 2 S is a nonnegative constant. (X'X)}S, This estimator is obtained from the class of estimators given in (2.13) of Chapter II by letting r(·,·) :: 1, that • 1S, A* _ A* S =8 (1). This estimator has a number of attractive properties as described below. An explicit formula for the difference between the mean squared error of this estimator and that of the usual least squares estimator is MSE{S*} - MSE{S} and the where I. p+2 = cr 2{n-n-p 2 a. - 2 (p- 2) a.} K.'s J ooK. - l L\' 2 J 2· , j=O p- + J are defined as in (1.36) in Chapter From this, we show that MSE{S*} < MSE{S} if CI. for every S a. satisfies o < a. and that < 2(p-2) (n-p) I (n-p+2) a. satisfies this inequality if MSE{S*} < MSE{S} for some 27 S. In addition, we also show that the mean squared error of this new a = (p-2) (n-p) / (n-p+2). estimator is minimized when coefficient(s) S, we show that this new estimator has the minimum mean squared error when estimator for As for the unknown X'X S =0 as in the case of James and Stein's = I. Thus, it is seen that this new estimator has all the important properties of James and Stein's estimator. 
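A minimal numerical sketch of this estimator for Case II (not part of the original text) is given below. It forms β̂* = {I − a s²(X'X)/(β̂'(X'X)²β̂)}β̂ with the mean-squared-error-minimizing choice a = (p−2)(n−p)/(n−p+2) noted above, and compares its empirical mean squared error with that of the least squares estimator; the design, the value of β, and all names are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 40, 5
beta = np.array([0.5, -0.3, 0.2, 0.0, 0.1])

X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / (X.std(axis=0) * np.sqrt(n))   # X'X in correlation form
XtX = X.T @ X
a = (p - 2) * (n - p) / (n - p + 2)    # midpoint of the allowable interval for Case II

def generalized_js(y):
    """beta* = {I - a s^2 (X'X) / (beta_hat'(X'X)^2 beta_hat)} beta_hat."""
    beta_hat = np.linalg.solve(XtX, X.T @ y)
    s2 = np.sum((y - X @ beta_hat) ** 2) / (n - p)
    quad = beta_hat @ XtX @ XtX @ beta_hat          # beta_hat'(X'X)^2 beta_hat
    return beta_hat - (a * s2 / quad) * (XtX @ beta_hat), beta_hat

ls_err, gjs_err = [], []
for _ in range(20000):
    y = X @ beta + rng.normal(size=n)
    b_star, b_hat = generalized_js(y)
    ls_err.append(np.sum((b_hat - beta) ** 2))
    gjs_err.append(np.sum((b_star - beta) ** 2))

print(round(np.mean(ls_err), 3), round(np.mean(gjs_err), 3))  # empirical MSEs of beta_hat and beta*
```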
In Section 111.5 below, the generalized James and Stein's estimator and Bock's estimator are compared in terms of percentage reductions in mean squared error as compared with the usual least squares estimator. In Examples 1 and 2 in that section, it is pointed out that the new p estimator performs better than Bock's estimator with a = .\ L 1/.\. - 2 . 1 P ~= and that the latter with ~ a = p-2 performs better than the new estima- tor in some cases. 111.2 Case I: cr 2 known As a generalization of James and Stein's result to the case when X'X is not necessarily an identity matrix, Bock [6] produced an esti- mator of the form {l-acr 2/s,(x/X)S}S and obtained a necessary and sufficient condition for which this estimator has smaller mean squared error than the usual least squares estimator. known for what value of the constant this estimator is minimized. However, it has not been a the mean squared error of In this section we obtain a lization of James and Stein's result. new genera- We not only determine the opti- mal value of . a but also obtain an explicit formula for the mean squared error of this new generalized estimator for any given value of a. 28 Theorem 4. cr Assume that 2 is known. Define (3.1) where ~ is a nonnegative constant. MSE{S*} < MSE{S} (3.2) if (3.3) 0 < ~ ~ < 2 (p-2) satisfies this inequality if MSE{S*} < MSE{S} (3.4) ·e S for every satisfies ~ and Then, for some S. Moreover, the MSE{S*} attains its minimum (3.5) cr 2{ P . 2 -1 Kj } II/A. (p-2) ex I 2 2' i=l j=O p- + J 00 , 1 when Proof. ~ = p-2. Let r(e) _ 1 in (2.1) in Chapter II. (3.1) above. Then the Hence, from (2.4) in Chapter II (3.6) By Corollary 1.1 in Chapter I, we get A* S (r) becomes 29 p MSE{S*} = 0'2 { (3.7) L l/A. -2 (p-2)aC4 -1 L i=l· + = 0' K. 00 /2' j=O P- + J 1 a2~-1 ~ Kj } j~O p-2+2j 2fP L l/A. 12-=1 1 ooK. ~ +{.a.2 -2(p-2)a}a.- lj=O L /2·' p- + J Clearly, the quadratic form in' a on the right hand side is negative if and only if 0 < a < 2(p-2) , and is minimized when a =p-2. By sub- stituting this value of a into (3.1) and (3.7), we see that theestimator S* = (3.8) {r - (p-2) 0'2 S,(x,x)213 (X'X)}~ has the smallest mean squared error as given in (3.5) above. completes the proof. If X'X=I, This 0 then K.:: P. J J for all j by Ruben [15]. Hence, the above estimator, in this case, coincides with James and Stein's estimator given in (1.5) in Chapter I and the minimized mean squared error of the former given in (3.5) becomes that of the latter given in (1.10) in Chapter I. As mentioned in Section 1.3, Hudson [9] independently arrived at the same estimator as the different method of proof. a "'* in (3.1) and proved Theorem 4 by using However, Hudson did not obtain such an explicit formula of the MSE{S*} as given in (3.5) above. ,.. 30 111.3 Case II: 0 2 unknown As in the above section, we generalize James and Stein's result to the case when is not necessarily an identity matrix. XIX Bock [6] obtained a different generalization also for Case II in the same way as for Case I. Theorem 5. Define (3.9) a is a nonnegative constant. where Then (3.10) if a satisfies a (3.11) and· a < a < 2(p-2) (n-p)/(n-p+2) satisfies this inequality if (3.12) Moreover, the 02 (3.13) when MSE{S*} attains its minimum p II/A. - (p-2) 2 (n-p) { i=l ~ n-p+2 a.-I I 00 K i .} , j=O p-2+2J a = (p-2) (n-p) / (n-p+2) . Proof. becomes Let A* (3 r(·,·) =1 in (3.9) in (2.13) in Chapter II. above. 
Then the S*(r) Hence, from (2, 16) in Chapter II, 31 (3.14) Since (n_p)s2/cr2 is independent of bution with n-p S and has a chi-square distri- degrees of freedom, we get (3.15) e' By Corollary 1.1 in Chapter I, (3.16) MSE{S*} = cr 2 rI U-=l p 1/)... + {n- +2 -l a. n-p 1 ooK.] j~O ~ p-2+2j Clearly, the quadratic form in a a? -2 (p-2)a} . on the right hand side is negative if and only if 0 < a < 2(p-2) (n-p)/(n-p+2), a = (p-2) (n-p)/(n-p+2). and is minimized when By substituting this value of a into (3.9) and (3.16), we see that the estimator (3.17) S* = . • {I _ (p-2) (n-p) s2 (X' X)}s n-p+2 S'(X'X)2 S 32 has the smallest mean squared error as given in (3.13) above. 0 completes the proof. If X,X = I, This the estimator a* ~ in (3.9) becomes s* = (1 -~s~) S, S'S which coincides with the James and Stein's estimator as given in (1.5) of Chapter I and the minimized mean squared error of the former given in (3.13) above becomes that of the latter given in (1.11) in Chapter I. III.4 Additional properties of the generalized estimators By Theorems 4 and 5, the minimized mean squared errors of the generalized James and Stein's estimators are given by . m~n MSE{S*} = a 2{ P L 1/>... - (p-2) 2Aa -1 L K2j 2' } i=l ~ j=O p- + J 0) (J . where , I for S*' . A:{ cn_PJ!cn_p+z;n Theorem,4 , for S* in Theorem 5 . Clearly, these minimized mean squared errors are small and hence reductions in mean squared error made by the corresponding generalized James and Stein's estimators are large when the term (3.18) is large. In this section we investigate R(A!a,6) as a function of 33 Theorem 6. any fixed R(A/a,8) Let A/a, R(A/a,-) be defined as in (3.18) above. is a decreasing function of and attains its maximum when assuming that Proof. 0 8 = 0, or equivalently, when ; Prob ~ {; = j} = K., S =0 as j = 0, 1 , 2, . .. , J K.'s 8~1 (i = 1,2, ... ,p) 2 is finite. Define a random variable where the Then, for which are defined in (1.36) in Chapter I satisfy J 00 K. ~ 0 (j = 0,1,2, ... ) L K. = 1 and J Then, ; . j=O J is a proper random variable. By Lemma 3 in Appendix A, the distribution function of ;, F (x) = Prob. {; ~ x} = is decreasing in 8~ (i = 1, 2, . . . , p) . L K. j~x J , Now, 1 (3.19) Clearly, the function 1/(p-2+2;) is decreasing in ;. Hence, by Lemma 5 (ii) in Appendix A, the expected value on the right hand side is decreasing in theorem. when Since 8 = O. 8~1 (i = 1,2, ... ,p), which proves the first part of the 8~ ~ 0 for all 1 i, R(A/a, -) By (1.31) of Chapter I, e = 0 -1Ak1>S , attains its maximum 34 where A and Pare nonsingular matrices. Hence, 8 = 0 if and 2 only if S = 0 assuming that 0 is finite. This completes the proof. 0 If X'X mean 8'8/2, = I, K. becomes the J (j+l)-th Poisson probability with that is, j=0,1,2, .... Hence, in this case, R(A/a,-) instead of p is a function of a single quantity quantities 2 2 82 , ... ,8 ' The above p Theorem 6 was obtained by James and Stein [10] for the case when X'X = I. 
111.5 Comparison between the generalized James and Stein's estimator and Bock's estimator The generalized James and Stein's estimator defined in (3.1) above and Bock's estimator given in (1.20) of Chapter I are compared by means of percentage l;'eductions of the mean squared error made by these esti2 mators over the usual least squares estimator for Case I: 0 known, 2 where it is assumed, without loss of generality, that 0 = L In Examples 1 and 2 below, the percentage reductions by the generalized James and Stein's estimator S* are computed by using the formula given in (3.5) above and those by Bock's estimator e* are computed by using the corresponding formula of Bock (see the proof of Theorem 3 of [6]) which, after some computation, can be written as PooP. (3 t20) where MSE{e*} = i~ll/\ + j~O (p+2j){p_2+2j)qj(a) 35 j = 0,1,2, ... the Poisson probabilities with mean 8'8/2, p L 11'A.{:a.2-2CP-2+2j)a.}'if = . 1 ~ ~= C3.21) ={ ~ . 1 ~= 8=0 2 1/A.+2 j a(8)r - 2{CP-2+2 j ) ~ I '1 1/A.-4 j a(8)f if ~ ~= 8~0 and a(8) = 8' A-1 8/ 8 , 8 C3.22) C8 ~ 0) . P a. L a. = 'A l/A. -2, P i=l ~ the midpoint of Bock's interval given in C1.18) of Chapter I, and a.=p-2. Two values of are considered for Bock's estimator: It will first be shown that Bock's estimator S* has the following new properties: CA) MSE fs*} = min MSE{S*} a.=p-2 a MSE fs*} < a.=p-2 CB) where -=1 A if a(8)/A -1 =1 MSEfs*} if p a.='A I/A.-2 Pi=l ~ or a(8)/A- l s 1 , L 1 P =1/t...• P i=1 ~ L Suppose that 8 = O. Po = 1 Hence, C3.20) becomes and Then, P = 0 j 8=0 for all j ~ 0 . 36 Clearly, the quadratic fonn of a a= p-2, when on the right hand side is minimized which proves the second part of (A). Suppose that -=1 = 1. a.(6)/A ) Then, from (3.21, ---=I +2jA ---=I )a 2 -2. { (p-2+2j)PA-::Y-=T} -4jA a q.(a) = (pA J = (p+2j) -=I A {a 2 - 2 (p-2)a } . a = p-2 minimizes Hence, q. (a) for all J j. This and (3.20) prove the first part of (A). From (3.21) it is readily seen that 1 a ='2 h j (8) , q. (a)· J is minimized when where P 2(p-2+2j) ~ 1/A.-8ja.(8) 1 .1 1= p 2: 1/A.1 +2ja.(6) .1 1= Consider 4 h. 1(6) -h.(6) = J+ J p I . 1 1= 1/ A;{ {h. (6) } J This proves (B) because When .. 1/ \;2 j a.(6)} Then, it is seen, from the above, that is a nondecreasing sequence. Example 1. 1 .. p 1=1 0.(6) IA -1 s 1. coefficient of 1/A.-Pa.(6J} {)1/Ai+2(j+1Ja.(6J}{.~ 1=1 Suppose that I 1= 1 1. q. (a) J Hence, is a quadratic fonn in a. and the a2 is positive. 6 = 0: For 12 arbitrarily chosen values of centage reductions of the mean squared error by Bock's estimator A, per- 37 p computed at ct = >.. ~ 1/>... -2 and a. = p-2 are given in the first and p i~l· 1 · second columns of "Bock" in Table 3.1 below a~d those by the genera~ = p-2 lized James and Stein's estmator computed at ~ are given in the last column. . ·TABLE 3.1 Percentage reductions of the mean squared error when A 1.0 Bock New e=a 1.0 1.0 1.0 66.7 (66.7) 66.7 1.2 1.0 1.0 1.0 1.4 1.0 1.0 1.0 1.6 1.0 1.0 1.0 1.8 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.8 0.6 0.4 0.2 6L,3 47.0 25.3 Nd* (66.7) (66.7) (66.7) (66.7) 66.0 63.0 58.0 44.7 1.2 1.4 1.6 1.8 1.2 1.4 1.6 1.8 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.8 0.6 0.4 0.2 0.8 0.6 0.4 0.2 61.9 50.9 36.3 19.1 (66.7) (66. 7) (66.7) (66.7) 65.3 60.8 51.9 35.1 1.2 1.4 1.6 1.8 1.2 1.4 1.6 1.8 1.2 1.4 1.6 1.8 0.8 0.6 0.4 0.2 0.8 0.6 0.4 0.2 0.8 0.6 0.4 0.2 62.5 54.4 45.6 37.0 (66.7) (66.7) (66.7) . (66.7) 64.7 58.4 47.4 30.0 1.0 1.0 e p (*) Nd means "not defined" because a= A L 1/>".-2 < 0 P i=l 1 in this case. 
It is seen in the table that the generalized James and Stein's p estimator performs better than Bock's estimator with for 11 out of 12 values of L 1/>".-2 Pi=l 1 >... It is also seen that Bock's estimator with p better than the one with a=A a = A. L 1/>".1 -2 . Pi::1 a=p-2 performs for all 12 values of A as 38 expected from the property (A) shown above and that the former performs better than the generalized James and Stein's estimator for all these A's. Exapple 2. e;z! 0: For 30 arbitrarily chosen pairs of (A,S), When the percentage reductions of the mean squared error by Bock's estimator p a =A II/A. -2 and a =p-2 are given in the first and computed at Pi=l· ~ second rows of each case in Table 3.2 below and those by the generalized James and Stein's estimator computed at a= p-2 row. (iI' i 2) A(i) and In the second column of the table, (i ) (i ) that A=A 1 e=e and 2 where are given in the third for each case means SCi) are defined as A(i) i 1 2 3 y (1.8 1.4 (1. 5 1.3 (1. 5 1.5 = x/ /3, for 1.2 1.0 1.1 1.0 0.9 1.0 SCi) 0.4 0.7 0.2) 0.5) (x (0 0.5 0.5) (y x =0.5, 1. 0;, 1. 5, 2.0, 4.0 0 0 0 0 0) 0 0 0 x) 0 Y 0 Y 0) , and () 6.0. It is seen in Table 3.2 below that the generalized James and Stein's estimator performs better than Bock's estimator with p A II/A. -2 Pi=1 ~ of Case 2. for 28 out of 30 entries except for x = 4.0 and a = x =6.0 39 TABLE 3.2 Percentage reductions of the mean squared error when 8 0 ;:e a.(8)/\-1 Case (i 1 ~i2) x=0.5 x=1.0 x=1.5 x=2.0 x=4.0 x=6.0 1 0.32 (1, 1) 3.8 Bock { 65.8 39.3 New 3.5 63.1 33.3 3.0 58.6 25.8 2.5 52.6 18.8 1.1 28.2 5.4 0.5 15.1 2.4 2 0.58 (2, 1) 38.1 65.1 57.0 34.5 60.7 49.2 29.7 54.2 39.2 24.6 46.9 29.7 10.7 22.9 9.6 5.4 11.9 4.3 3 1. 00 (3, 3) 42.2 64.0 54.8 37.5 56.8 48.6 31.3 47.4 40.5 25.0 37.8 32.3 9.6 14.6 12.3 4.6 7.0 5.8 4 1. 74 (2, 2) 37.3 62.0 58.2 31.9 50.0 53.6 25.0 35.2 47.1 18.4 21.8 40.2 5.0 - 0.1 19.7 2.0 - 1.8 10.6 5 2.83 (1, 2) 3.7 59.1 40.9 3.1 40.1 38.9 2.4 17.5 36.0 1.6 - 1.6 32.6 0.3 . -21.6 20.4 0.1 -14.6 12.9 e It is also seen that, for all 18 entries of Cases 1,2, and 3, where --=r :::; 1, a.(8)/\ Bock's estimator with a.=p-2 performs better p I 1/\.-2 as expected from property (B) Pi=l 1 above and that the former performs better than the generalized James than the one with a.= \ and Stein's estimator for these entries. where -:y a.(8)/ \ . > 1, However, for Bock's estimator with a.= p-2 C~ses 4 and 5, increases the mean squared error over the usual least squares estimator for 5 out of 12 entries by ranging from 0.1% to as much as 21.6% while Bock's p estimator with . a.=\ L 1/\.-2 Pi=l 1 and the generalized James and Stein's - ~ 40 estimator never increase the mean squared error over the usual least squares estimator. Thus, it seems reasonable to say that the useful- ness of Bock's estimator with a. = p- 2 is limited to the cases where aCe) fA -1 s 1. In the following example, generated data were used to make furthere comparisons. Example 3. Consider a linear model, where the independent variables are standardized and pendently normally distributed with mean 0 and variance are inde1. Let the X~1'X~2'···' 1 ,1 values of the unstandardized independent variables Xi6 EiJ'S and be given as in Appendix C, where 4 L xi·J = 14 j=l 4 L X~. = 12 , j=l 1J (i = 2,3, ... ,40) , is a string of 80 random normal deviates (generated by a computer program). Xii 'X i2 ' ... , and X i6 in (3.23) above be given as the standardized "respectively values of X*il 'X *i2 ' ... 
' and Then, the correlation matrix is given by 1.000 X'X 0.841 1.000 = The eigenvalues of X'X Let the values of -0.315 -0.485 1.000 are -0.592 -0.369 -0.568 1.000 0.134 0.234 -0.124 -0.037 1.000 (i =1,2, ... ,40) ." -0.276 -0.251 -0.054 0.294 0.042 1.000 41 A2 = 1.660 . Al = 0.451 Let A4 = 0.773 " AS = 0.133 13 = 13 (k) (k = 1,2) be given as, , A3 = 0.981 A6 = 0.002 13(1)=( 1.135, -0.875, 0.046, -0.618, 0.257, 0.459) 13(2) = (30.417, 3.558, 32.911, 38.138, 0.645, -0.219) . Then, by direct computations, we get where A string- of 40 random deviates (generated by a computer program) ,was substituted into (3.23) to generate values of dependent variable for each 13 = 13 (k) (k = 1,2) • mator; b, The squared distance between 13 and its esti- defined as (3.24) 6 L2 (b) = (b-I3)'(b-l3) = I (b.-I3.) 1=1 ~ ~ was then computed for each S. e:' X(X' X) -2 x' e: 2 A. A. ,~~ L (13) = (13-13) (13-13) = 2 for both f3t s ~ Entries in'Tab1e 3.3 below are the squared distances, A In particular, ~ A* L (13), L (13*), and L2 (13), computed by using 10 2 2 different strings of random normal deviates (generated by a computer program). ~ 42 TABLE 3.2 Squared distances between 13 Bock and its estimators New Sample BLUE 13=13(1)(13=13(2)) 13=13(1)(13=13(2)) 1 172.974 91. 868 (289.454) 162.290(159.560) 2 33.522 5.353( 59.118) 19.806( 22.972) 3 44.872 9.484(405.900) 45.163( 44.199) 4 187.346 25.210( 17.693) 180.510(176.438) 5 11.181 8.399(297.616) 1O.416( 6 23.762 7.330(544.125) 19.035( 25.397) 7 186.212 18.455 ( 26.413) 182.316 (182.068) 8 36.563 18.482(249.247) 31. 999 (33. 271) 9 119.710 14.778(780.823) 114.947(120.612) 10 29.870 17.704( 40.471) . 25. 443( 24.107) 84.601 21.696(271.086) 79.193( 79.748) ·e Average 8.859) It is seen, in the table, that the average of the squared distances , ~ ~ L2(f3*) is smaller than that of the other two when et (f3)/A =0.023 and bigger when et'(f3)/A- 1 = 2.982. This is a limited but specific example which shows that the mean squared error of Bock's estimator, ~ E[L 2(f3*)], tor, .e may be bigger than that of the usual least squares estima- E[L (S)], 2 if et'(f3)/A- 1 >1. CHAPTER IV SIMPLE IMPROVEMENTS OF THE GENERALIZED JAMES AND STEIN'S ESTIMATORS IV.1 Introduction The generalized James and Stein's estimators defined in Chapter III are further improved by simple transformations. When 0 2 is unknown, this improved estimator is defined as s: = (P' cpP) 8* 2 where estimator, and cp . is the generalized James and Stein's { I-" as 2" (X'X)}13 S' (X'X) S P is the orthogonal matrix defined as (1.2) of Chapter I, is a diagonal matrix of cp. (i 1 CPi . = cp[aA. ,(0) {S' (X'X)2 S/ s 2} = 1,2, ... ,p) for i such that =1,2, ... ,p , 1 where ~(x) = {: if x EO E otherwise . It will be shown that MSE{S*} < MSE{S*} + for every (3 and a> 0 . In Theorem 5 of Chapter III, it was shown that the mean squared error of " S* is minimized when the optimal value of a = (p-2) (n...p) / Cn-p+2) . However, as for "* S+' a has not been determined. 44 IV.2 Case I: cr 2 known Baranchik [1] showed that the James and Stein's estimator 2 2 (1 - acr /S'S)S can be improved by replacing (1 - acr /S'S) by its 2 2 "positive part" (1 - acr /S'S)+ = max {a, (1 - a.cr /S'S)}. In this section we show that a similar improvement can be made for the generalized 2 James and Stein's estimator S* = {I a.cr (X' XJ}i3 by a simple S'(X'X)2 S transformation. (The following theorem can also be proved by applying a Baranchik's theorem (see page 19 of [1]) after reparameterization. 
However, we will use Lemma 2 in Appendix A so as to obtain a formula for the dif. ference in mean squared error as given in (4.14) below.) Theorem 7 •. Assume that of <p.~ (i cr 2 is known. ~ be a diagonal matrix = 1,2, ... ,p) such that (4.1) and let Let for 8* ~ i = 1, 2, ... , p , be the generalized James and Stein's estimator defined in Theorem 4 of Chapter III. Define (4.2) Then (4.3) Proof. MSE{S*} < MSE{i3*} + S and a. > 0 . By definition 13* = where for every {r a. is a nonnegative constant. By using the transformations 45 defined in (1.31) of Chapter I, Hence MSE{~*} = (i (4.4) . rA.: 1.= 1 where a. A d. (Z'AZ) = 1 - Z';Z 1. (4.5) EU{d. (Z'AZ)Z. - 1 11 r 1 (1.}~. 1 if Z' AZ < all.. if Z' AZ 1 0 ~ 0 ~ all.. 1 From (4.1), by using the same transformations, we get (4.6) -e Hence - (4.7) MSE{S*} + 1 = (i . 1A.: EU{0. (Z'AZ)Z. - e.}~. 1 1. 11 r , 1= where o.(Z'AZ) (4.8) 1 = ~.d.(Z'AZ) 1 1 . From (4.4) and (4.7), (4.9) where D. .e 1 (4.10) 2 ] =E[{o.1 (Z'AZ)z.-e.}2]-EHd. (Z'AZ)z.-e.J ·1 1 1 . . 1 1 =E[{O~(Z'AZ)-d~(Z'AZ)}Z~]-2e.EHo. (Z'AZ)-d. (Z'AZ)}Z.] 1 1 111 1 1 • 46 From (4.8) (4.11) e~(Z'AZ)-~(Z'AZ)";{<I> 1 1 . co (Z'AZ)-l}d~(Z'AZ) [aA., ) 1 . 1 2 . 2 V = aa.X' 2 + a. a.Z. (m=1,2), m 1 (1+2m),8. j;z!i J J l: Define ,2 X .2 (1+2m) ,8. h were 1 1 Z~ are independent for all J Appendix A, j;z! i. By (4.11) and Lemma 2 (i) in -28. E[{e. (Z' AZ) -d. (Z'AZ) }Z.] = 28.E{<I>[0 1 1 1 1 and 1 It.) (Z' AZ)d. (Z 'AZ)Z.} ,a i l l (4.12) Similarly, by (4.11) and Lemma 2 (ii) in Appendix A, (4.13) E[{e~(Z'AZ)-d~(Z'AZ)}Z~] = -E{<I> [0 a' ) (Z'AZ)d~(Z'AZ)Z~} 1 1 1 , 1 1 1\. 1 e- Substitutions of (4.12) and (4.13) into (4.10) and then into (4.9) give A* A* MSE{S+}-MSE{S } = (J 2 P i~l \ -1 [ [E { - 8i E{<I>[0,aA ) i 2} <I>[o,a\) (V1)d i (VI) (V2)d~(V2)} + 26iE{$[O,aAi) (VI)di (VI) ~ Thus, we have obtained a formula for the difference of mean squared errors. Now, E{<I>[O _, )(v )d~(V)} > 0 'W\i m 1 m (m=1,2) e. 47 0 Hence, the proof is completed, X' X = I If in the above theorem, then 13* I,> + becomes which coincides with the improved James and Stein's estimator by Baranchik [1], In the following, it will be shown that the improved estimator, "* S+, defined in (4.2) above does not belong to the class of A S*(r), defined in (2,1) of Chapter II unless S: = (PI~P){I a. - A cr 2 '" X~X=It estimators~ From (4.2) • (XIX)}S SI eX IX) 2 e where a diagonal matrix. if and only Hence, the if (4.14) such that r: [O,~) + Yi (i = 1,2, ... ,p) [0,1] is monotone nondecreasing. be the i-th diagonal element of r. Let Then It is readily seen that the above (4,14) does not hold unless XIX = I. 48 ~ in (4,2) was also considered indo. The improved estimator ~* + 2 pendently for Case I: 0 known by Hudson [19] who stated 1 without 8*+ proof, "it seems clear that the estimator Case II: IV.3 o 2 would be supeTior to unknown A simple improvement of Baranchik's [1] type is applied to the generalized James and Stein's estimator Theorem 8. Let S* = {I -" as " ex' X) a' (X'X)2 a be a diagonal matrix of ep " (4.15) 2 2..... 2 <Pi = <P[a.A.,OO){f3'(XIX) Sis} for <Pi(i=1~2"",p) }a. such that i = 1,2, •. .,p 1 A and let 13* be the generalized James and Stein's estimator defined in Theorem 5 of Chapter I I I. Define' (4.16) Then MSE(S*) < MSE(S*) , (4.17) Proof. + for every f3 and a. 
> 0 • From (4.15), by using the transformations defined in (1,30) of Chapter I, (4.18) By comparing the right hand side of (4,6) that Z'Az is replaced by and that of (4.18), we see Z'AZ The same change is made in ~*, ~2/(i • Thus, we can follow the proof of Theorem 7, No additional changes are necessary in computing conditional expected values given s 2 except 49 V Ym is replaced by 2m 2 that (m = 1,2). s !cr 2 taking expectations with respect to s If X~X = I Then, the proof follows by o in the above theorem, then 8*' ,.;. 1-'+ - s*+ becomes ", 2 { a.s 2}", <p[a. 00)(13 Sis) 1--13 , A , .~,~ which coincides with the improved James and Stein's estimator obtained by Baranchik [1]. ~* As for Case I, it can be shown that the improved estimator defined in (4.16) does not belong to the class of estimators, defined in (2.13) of Chapter II unless IV.4 +' 13. . * (r), XIX = 1. Comparison between the improved generalized James and Stein's estimator and the "improved" estimator of Berger and Bock Berger and Bock [4] introduced an estimator of the form (4.19) for Case I: erro~ A* 13, known and showed that it has smaller mean squared (J2 for any a., than the generalized James and Stein's estimator, defined in (3.1) of Chapter III. It was shown, in Theorem 7 above, that the improved estimator, "'* 13+, has smaller mean squared error, for any a, than the estimator a"* . If estimator X' X = I, S* I-' + both of the improved generali zed James and Stein ~ s defined in (4.2) and the "improved tl estimator S: of Berger and Bock given above coincide with the improved James and Stein~s estimator 50 of Bara.nchi.k II] ~ which. has smaller mean squared er'l'or than the James and Steints estimator (1_~cr2/~tS)6 for any a\ In the following, it will be shown, for Ca.se 11 if 2 cr known, that , e = 0, (4.20) Throughout the proof, it is assumed, without loss of generality, that cr 2 = 1. Define, for B. 1 = [O,aA.) 1 i::; 1,2, ' , , ,p, and B. 1 = faA.1. ~oo) , By using the transformations defined in (1,31) of Chapter I, it was shown, in (4.7) above, that (4.21) where MSE{S*} = + dt(ZIAZ) I A~lE~¢-B .11 1= . 1 (ZIAZ)d.1 CZlAZ)z.-e.}] , 11 is defined as in (4,5) above. Define Then, by using the same transformations, (4.22) From this and (4.19) 51 Hence, (4.23) = I A~lE[{¢A(Z'Z)<5. . "I 1= + 1 1 <P;;:(Z' Z) d i (Z'Z,Z'AZ)Z. (Z' Az) Zi - 1 e~}~ , where Z'Z <5i (Z' Z, Z'AZ) = 1 - 'fiAZ \ (4.24) and di(Z'AZ) is defined as in (~.5) above. Let 8=0 inboth(4.21) and (4.23). Then, (4.25) = PI L A~.1 i=l D. 1 where (4.·26) D.1 =E [{¢n -E [{¢A(Z' Z) <5~ (Z 'z ,Z' AZ) }Z~] B• (Z' AZ)d~1 (Z' AZ) }Z~] 1 1 1 1 By multiplying inside of the first expectation on the right hand side of the above and ¢B. (Z'AZ) 1 .e + ~. (Z'AZ) =I 1 inside of the second and the third expectations, we get, after dropping 52 E[{<f>p;(Z'Z)4>[(Z'AZ)di(Z IAZ)}Zi] which appears both positively and nega- tively, (4.27) D. = E[cPA(Z' Z)<Pn (Z 'A'Z) {d~ (Z'1I.Z) -o~ (Z' Z,Z '11. Z) }Z~] 1 B. l' 1 1 1 - E[{cP (Z'Z)cP A B• (Z'AZ)o~(Z'Z,Z'1I.Z)}Z~] 1 1 - E[{eJ>AA(Z'Z)cP 1 (Z'AZ)d~(ZIAZ)}Z~] 1 1 B. 1 . Clearly, both of the second and third terms on the right hand side of the above are negative. Hence, it suffices to show that the first expected value is also negative. E[cPA(ZIZ)<Pn B. 1 [ = and (4.24) , we get (Z'1I.Z){d~(ZIAZ)-O~(Z'Z,ZI1I.Z)}Z~] 1 1 1 E~A(ZIZ)4>[i(ZI1I.Z) . which is negative. From (4.5) { (Z'Z+a.)A.}{(ZIZ-a)A.} ~ 2- . z'Az 1 Z'AZ 1 Z~ , Hence, the proof is c?mpleted. 
0 e CHAPTER V DISCUSSION AND SOME SUGGESTIONS FOR FUTURE RESEARCH Some of the general properties of the proposed estimators are presented in this chapter in order to examine the usefulness of these estimators and to expose some problems which have not been solved. 1. (Standardization of X) TIlroughout all the previous chapters it has been assumed that the XIX independent variables are standardized so that matrix. is the correlation This assumption was necessary to make comparisons between the proposed estimators and the corresponding estimators for XI X = 1. How- ever, these new estimators are defined in exactly the same way even if X is not standardized. 2. (Underclying assumptions in defining the c Let (i) b A S*) b be an estimator of is distributed as S such that N(S,(1 2Q), where Q is a known matrix of full rank, and that (ii) an estimator, independent of b variable with n-p s2, of (12 is available such that and distributed as (12 times a chi-square random degrees of freedom. Then, the generalized James and Stein I s estimator, .e defined as (5,1) (n_p)s2 b* = {I - t s~ blQ 2b Q-l}b, b* , of S is is 54 where t is a nonnegative constant. By Theorems 4 and 5 in Chapter ~ III, MSE{b*} < MSE{b} (5.2) if t satisfies 0 < t < 2 (p-2) (n-p) / (n-p+2) (5.3) and t satisfies this inequality if MSE{b*} < MSE{b} (5.4) Moreover, the MSE{b*} t (S.5) . 3. S for every S. for some is minimized when = (p-2)(n-p)/(n-p+2) . (Generalized least squares estimator) Suppose that, in the multiple linear regression model y = X13+ e: , e: is distributed as estimator, b, N(O,O 2V). Then, the generalized least squares is given by (5.6) An estimator, s 2 , which is distributed as 2 which satisfies the condition (ii) above is given by of 0 (5.7) s 2 = (y-Xb)'V-1 (y-Xb)/(n-p) . Hence, the generalized James and Stein's estimator of as in (5.1) above. S is defined e. 55 (Analysis of variance in a one-way classification) 4. Let y.. 1J for j = 1,2, ... ,n. denote the observations in the i-th 1 group in a one-way classification with I groups. The usual fixed- effects analysis of variance model is (i= 1,2, ... ,1) y .. = 1-I+a. +E .. (5.8) 1J . 1 1J subject to a constraint I . L1n.a. =0 11 , 1= where E.. 1J variance -e is independently normally distributed with mean 2 cr . Let I n y. = . Ln., 111 = 1= 1 ni n.1 L y .. (i = 1,2, ... , I) , j=l 1J and 1 Y= o and I L n.y. . 1 1 1 n 1= Then, A 1-1 A a. = y. (i = Y and 1 1 = 1,2, ... , I) are the usual least squares estimators of respectively. and 1-1 a. (i=1,2, ... ,I) 1 Let and A a It is well known that is distributed as 1 1 --n n 1 n • •• --n1 --n1 1 1 --n n • •• 1 n 1 -n • 1 n l .e (5.9) Q 2 N(a,cr Q), = 2 • • • • • -- • •• 1 1 ---n n I where 56 The usual sum of squares within groups (n-I) s is independent of & and dom variable with n-I n. I 2 that ~ distributed as ~R where cr degrees of freedom. is singular. 3:S;k<I. _ ~ ~ (y .. -y.) i=l j=l 1) 1 = James and Stein's estimator of a because . 1 a Let R times a chi-square ranNow, the generalized be a vector of any k a. 's 1 " 's a. 1 ~ 2 can not be defined in this case Then, is in the form of 2 is distributed as in (5.9) above and is nonsingu1ar. Hence, the generalized James and Stein's estimator of defined as in (5.1) above with (Determination of the 5. If r(e) =1 "'" " ~ in place of ~ r(e)) in the class of estimators, [17] showed for Case I: "" 2 r(I3' S/cr ) (5.10) = a. -1 cr 2 can be b. "'" l3(r) , of Baranchik, "'" coincides with the James and Stein's estimator,S. 
13(1) such known that the S(r) Strawderman with 13 'I3/(2cr 2 ) } 2e- "'" .p+2-2c -. 1 ""2- . { xp/2-c e -xl3' 13/(2cr ldx Jo (0 :s; c < 1, P ~ 6 - 2c) is admissible, by Efron and Morris [7]. However both of these functions seem too complicated to be of practical use. ~(e) A different form was produced The problem of determining the has not been studied in this thesis. e. 57 6. (Distribution of A ~*) The problem studied in this paper is point estimation. Toobtain confidence intervals or regions the distribution of the proposed estimator should first be known. Although an explicit formula for its mean squared error is given in this thesis its distribution has not been determined. 7. (Admissibility conditions) In this thesis, comparison between the new estimator and the usual least squares estimator is made solely in terms of the mean squared errors. -e For the new estimator to be more useful, it needs to be compared in terms of other conditions. Unfortunately, it seems dif- ficult to do so without first obtaining its distribution. APPENDIX A Some known results are reproduced without proofs in the first two lemmas. These and five other lemmas which we obtain in this Appendix are basic tools which are used to prove main results in the preceeding chapters. Lemma 1. as (Press [14] and Ruben [15]), NCB, 1). ex> 0 Let and a 1 2: a 2 2: Z is distributed Suppose that • • • 2: a p 2: 1. Then the p.d.f. of the quadratic form . P V 2 P = ex . L1a1.Z.1 =Cl.La.x' i=l 1= 1 2 2 1,8. 1 is given by -e p (v) = CI.- I (A. 1) V where ,2 Xm,d 00 L K.f 2' (CI.-Iv) j=O J p+ J denotes a chi-square random variable with freedom and noncentrality parameter chi-square random variable with d, f (e) m m degrees of denotes the p.d.f. of a m degrees of freedom, and given by • P -B ' B/2n K = e a .-!z O . 1 1 1.= (A.2) j=1,2, ... , (A.3) p t I. . 1 (I-a.-1 ) m+ m 1= 1 P t I. . 1= 1 B.2a -1 . (I-a.-1 ) m-l 11 m = 1,2, ... 1 \ 00 (A.4) L K. j=O J = 1, and K. ' J 2: 0, j = 0, 1, 2, . .. . K. 's J are 59 This was obtained by Press [14] except for the recursive formulas (A.2) and (A.3) which are due to Ruben [15]. One may refer Johnson and Kotz [11] to trace for different reprsentations of the p.d.f. of quadratic forms in general. If (l =1 and a. 1 =1 for all i, then K. becomes the Poisson J probabi 1i ty j = 0,1,2, . . . . Lemma 2. 1. (Bock [6]). For any fixed Let ,2 and Z, V, i (i = 1,2, ... ,p) , Xm,d be defined as in Lemma define and V 2 ,2 = aa·X 2 1 5 8 + a ,. 1 where X ,2 2 and 3,8. -+ 2 a . Z2. , ~ J.~.1 J J are independent of 5,8. 1 h: [0,00) ,2 X \' Z~ for all J Let 1 (_00,00). Then, (i) E[h(V)Zi] = 8i E[h(V1 )] (ii) E[h(V)Zi] = E[h(V1 )] + , 8iE[h(V2 )] These are generalized versions of Bock's results. (See Theorems A and B of [6].) Lemma 3. able CA.5) ~ Let K. 's J be defined as in Lemma 1. Define a random vari- as Prob. { ~ = j} = K. J j = 0,1,2, . . . . 60 ~, Then the distribution function of = Prob. {~ S x} = I F(x) K. , jSx J e~1 for fixed i is decreasing in Proof. Let prime (') =1,2, ... ,p. of a function denote the partial derivative of 87 the function with respect to for a fixed 1 aKt i. For example, a9 m K' = - - and 9' = -.- a8~ t ae~ m 1 1 It suffices to show that N = I aF(N) (A.6) a8~1 j=O K~ < 0 for any N • J From (A.2), we get K' o = _<3_ ae~ 1 (e -e'e/2frct~~) i=l Hence, (A.6) holds when N=O. (A.7) = - .!. 1 . 2 < 0 • K 0 To proceed, we will first show that j-l ' . l I t K = - -2 K. + -2 L. ct.-1 (I-ct.-1 ) j-l-13KQ J J 13=0 1 ~ I-' j = 1,2, .... From (A.3), we get 9 -1 -1 m-1 = mct. (I-ct. 
Lemma 2. (Bock [6]). Let Z_i, V, and χ'²_{m,d} (i = 1, 2, ..., p) be defined as in Lemma 1. For any fixed i, define

V_1 = α a_i χ'²_{3,θ_i²} + α Σ_{j≠i} a_j Z_j²    and    V_2 = α a_i χ'²_{5,θ_i²} + α Σ_{j≠i} a_j Z_j² ,

where χ'²_{3,θ_i²} and χ'²_{5,θ_i²} are independent of Z_j² for all j ≠ i. Let h: [0,∞) → (-∞,∞). Then

(i)  E[h(V) Z_i] = θ_i E[h(V_1)] ,
(ii) E[h(V) Z_i²] = E[h(V_1)] + θ_i² E[h(V_2)] .

These are generalized versions of Bock's results. (See Theorems A and B of [6].)

Lemma 3. Let the K_j's be defined as in Lemma 1. Define a random variable ξ as

(A.5)    Prob{ξ = j} = K_j ,    j = 0, 1, 2, ... .

Then the distribution function of ξ,

F(x) = Prob{ξ ≤ x} = Σ_{j≤x} K_j ,

for fixed x is decreasing in θ_i², i = 1, 2, ..., p.

Proof. Let the prime (') of a function denote the partial derivative of the function with respect to θ_i² for a fixed i; for example, K_ℓ' = ∂K_ℓ/∂θ_i² and g_m' = ∂g_m/∂θ_i². It suffices to show that

(A.6)    ∂F(N)/∂θ_i² = Σ_{j=0}^{N} K_j' < 0    for any N.

From (A.2), we get

K_0' = ∂/∂θ_i² ( e^{-θ'θ/2} Π_{k=1}^{p} a_k^{-1/2} ) = -(1/2) K_0 < 0 .

Hence (A.6) holds when N = 0. To proceed, we will first show that

(A.7)    K_j' = -(1/2) K_j + (1/2) Σ_{ℓ=0}^{j-1} a_i⁻¹ (1 - a_i⁻¹)^{j-1-ℓ} K_ℓ ,    j = 1, 2, ... .

From (A.3), we get

g_m' = m a_i⁻¹ (1 - a_i⁻¹)^{m-1} ,    m = 1, 2, ... .

By direct computations,

K_1' = ∂/∂θ_i² ((1/2) g_1 K_0) = (1/2)(g_1' K_0 + g_1 K_0') = (1/2) a_i⁻¹ K_0 - (1/2) K_1 .

Hence (A.7) holds when j = 1. Similarly, by using K_0' and K_1' obtained above, (A.7) is verified for j = 2. Now suppose that (A.7) holds for all j = 1, 2, ..., m, for some m. From (A.2),

(A.8)    K_{m+1}' = [2(m+1)]⁻¹ { Σ_{ℓ=0}^{m} g_{m+1-ℓ}' K_ℓ + Σ_{ℓ=0}^{m} g_{m+1-ℓ} K_ℓ' } .

By applying the identity

(A.9)    Σ_{x=1}^{m} Σ_{y=0}^{x-1} u(x,y) = Σ_{a=1}^{m} Σ_{b=0}^{m-a} u(a+b, b)

(which can be proved by expanding both sides) to the double summation arising from the last term on the right-hand side of (A.8), we get

(A.10)   Σ_{ℓ=1}^{m} Σ_{β=0}^{ℓ-1} g_{m+1-ℓ} a_i⁻¹ (1 - a_i⁻¹)^{ℓ-1-β} K_β
         = Σ_{a=1}^{m} a_i⁻¹ (1 - a_i⁻¹)^{a-1} Σ_{b=0}^{m-a} g_{m+1-a-b} K_b
         = Σ_{ℓ=1}^{m} (2ℓ) a_i⁻¹ (1 - a_i⁻¹)^{m-ℓ} K_ℓ ,

where the last equality follows from (A.2). Substitution of (A.10) and the induction hypothesis into (A.8) gives

K_{m+1}' = [2(m+1)]⁻¹ { Σ_{ℓ=0}^{m} (m+1-ℓ) a_i⁻¹ (1 - a_i⁻¹)^{m-ℓ} K_ℓ - (m+1) K_{m+1} + Σ_{ℓ=1}^{m} ℓ a_i⁻¹ (1 - a_i⁻¹)^{m-ℓ} K_ℓ }
         = -(1/2) K_{m+1} + (1/2) Σ_{ℓ=0}^{m} a_i⁻¹ (1 - a_i⁻¹)^{m-ℓ} K_ℓ .

Hence (A.7) holds for j = m+1. Therefore, by the principle of mathematical induction, (A.7) holds for any j. Now, from K_0' and (A.7),

∂F(N)/∂θ_i² = Σ_{j=0}^{N} K_j' = -(1/2) Σ_{j=0}^{N} K_j + (1/2) Σ_{j=1}^{N} Σ_{β=0}^{j-1} a_i⁻¹ (1 - a_i⁻¹)^{j-1-β} K_β .

By applying the identity

(A.11)   Σ_{x=1}^{m} Σ_{y=0}^{x-1} u(x,y) = Σ_{a=0}^{m-1} Σ_{b=0}^{m-1-a} u(a+b+1, a)

to the last term on the right-hand side, we get

∂F(N)/∂θ_i² = -(1/2) Σ_{j=0}^{N} K_j + (1/2) Σ_{j=0}^{N-1} { Σ_{ℓ=0}^{N-1-j} a_i⁻¹ (1 - a_i⁻¹)^ℓ } K_j .

By assumption a_i ≥ 1, so that 0 ≤ 1 - a_i⁻¹ < 1 for all i. If 1 - a_i⁻¹ = 0, then Σ_{ℓ=0}^{N-1-j} a_i⁻¹(1 - a_i⁻¹)^ℓ = 1 (for consistency, it is understood that 0⁰ = 1 and 0^ℓ = 0 for all ℓ ≠ 0); if 0 < 1 - a_i⁻¹ < 1, then Σ_{ℓ=0}^{N-1-j} a_i⁻¹(1 - a_i⁻¹)^ℓ = 1 - (1 - a_i⁻¹)^{N-j} ≤ 1. In either case the inner sum does not exceed one, and hence

∂F(N)/∂θ_i² ≤ -(1/2) K_N < 0 ,

which completes the proof. □

Lemma 4. Let V_1 and V_2 be defined as in Lemma 2. Then the p.d.f.'s of V_1 and V_2 are given by, respectively,

(i)  p_{V_1}(v_1) = α⁻¹ Σ_{j=0}^{∞} { Σ_{ℓ=0}^{j} a_i⁻¹ (1 - a_i⁻¹)^{j-ℓ} K_ℓ } f_{p+2+2j}(v_1/α) ,

(ii) p_{V_2}(v_2) = α⁻¹ Σ_{j=0}^{∞} { Σ_{ℓ=0}^{j} (j-ℓ+1) a_i⁻² (1 - a_i⁻¹)^{j-ℓ} K_ℓ } f_{p+4+2j}(v_2/α) .

Proof. Both (i) and (ii) are proved simultaneously. Let Ψ_m(t) = (1 - 2it)^{-m/2} denote the characteristic function of a chi-square random variable with m degrees of freedom, and let φ_W(t) denote the characteristic function of a random variable W. Let V be defined as in Lemma 1. Then, from (A.1), we get

(A.12)   φ_{V/α}(t) = Σ_{j=0}^{∞} K_j Ψ_{p+2j}(t) = Ψ_p(t) Σ_{j=0}^{∞} K_j [Ψ_2(t)]^j .

On the other hand, the characteristic function of a_i times a chi-square variable with two degrees of freedom is

(1 - 2 a_i it)⁻¹ = { a_i(1 - 2it) - (a_i - 1) }⁻¹ = a_i⁻¹ Ψ_2(t) { 1 - (1 - a_i⁻¹) Ψ_2(t) }⁻¹ .

By using the Taylor series (1 - z)^{-m} = Σ_{ℓ=0}^{∞} \binom{m-1+ℓ}{ℓ} z^ℓ, |z| < 1, we get

(A.13)   (1 - 2 a_i it)^{-m} = a_i^{-m} [Ψ_2(t)]^m Σ_{ℓ=0}^{∞} \binom{m-1+ℓ}{ℓ} (1 - a_i⁻¹)^ℓ [Ψ_2(t)]^ℓ .

Define

(A.14)   V_m = α a_i χ'²_{1+2m,θ_i²} + α Σ_{j≠i} a_j Z_j²    (m = 1, 2) .

Then, by using (A.12) and (A.13),

(A.15)   φ_{V_m/α}(t) = φ_{V/α}(t) (1 - 2 a_i it)^{-m}
         = a_i^{-m} Σ_{j=0}^{∞} Σ_{ℓ=0}^{∞} \binom{m-1+ℓ}{ℓ} (1 - a_i⁻¹)^ℓ K_j Ψ_{p+2m+2(j+ℓ)}(t) .

Application of the inversion formula termwise gives

(A.16)   p_{V_m}(v_m) = α⁻¹ Σ_{j=0}^{∞} { Σ_{ℓ=0}^{j} \binom{m-1+j-ℓ}{j-ℓ} a_i^{-m} (1 - a_i⁻¹)^{j-ℓ} K_ℓ } f_{p+2m+2j}(v_m/α) .

From (A.14) and (A.16) we get (i) and (ii) by letting m = 1 and m = 2, respectively, which completes the proof. □

The above (A.16) may be written as

(A.17)   p_{V_m}(v_m) = α⁻¹ Σ_{j=0}^{∞} K_j^{(m)} f_{p+2m+2j}(v_m/α) ,

which is of the same form as (A.1) in Lemma 1. However, the effect of the specific difference between V_m and V is not explained in (A.17). To perform algebraic operations among expected values of the form given in Lemma 2 above, the relationship between the K_j and the coefficients K_j^{(m)}, as displayed in (A.16), should be known.
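To make the roles of V, V_1, and V_2 concrete, here is a small simulation check of the identities in Lemma 2 (an added illustration, not part of the thesis). The bounded choice of h, the parameter values, and the use of SciPy's noncentral chi-square sampler are assumptions made only for the illustration.

```python
import numpy as np
from scipy.stats import ncx2

rng = np.random.default_rng(2)
alpha = 0.5
a = np.array([3.0, 2.0, 1.0])
theta = np.array([1.0, -0.5, 2.0])
i = 0                                   # the fixed coordinate of Lemma 2
h = lambda v: 1.0 / (1.0 + v)           # any bounded h works well here

N = 400_000
Z = rng.normal(theta, 1.0, size=(N, 3))
V = alpha * (Z**2 @ a)                                         # V as in Lemma 1

rest = alpha * (Z[:, 1:]**2 @ a[1:])                           # alpha * sum_{j != i} a_j Z_j^2
V1 = alpha * a[i] * ncx2.rvs(3, theta[i]**2, size=N, random_state=rng) + rest
V2 = alpha * a[i] * ncx2.rvs(5, theta[i]**2, size=N, random_state=rng) + rest

# Lemma 2 (i):  E[h(V) Z_i]   vs  theta_i * E[h(V1)]
print(np.mean(h(V) * Z[:, i]), theta[i] * np.mean(h(V1)))
# Lemma 2 (ii): E[h(V) Z_i^2] vs  E[h(V1)] + theta_i^2 * E[h(V2)]
print(np.mean(h(V) * Z[:, i]**2), np.mean(h(V1)) + theta[i]**2 * np.mean(h(V2)))
```

The two numbers printed on each line should agree up to Monte Carlo error, which is how Lemma 2 trades a moment of Z_i against two extra degrees of freedom in the noncentral chi-square component.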
Lemma 5. Suppose that G_1 and G_2 are two distribution functions on [0,∞) such that

(A.18)   G_1(x) ≥ G_2(x)    for all x.

Let h: [0,∞) → [0,1] be a monotone function. Then

(i)  E_1[h(X)] ≤ E_2[h(X)]    if h(·) is nondecreasing,
(ii) E_1[h(X)] ≥ E_2[h(X)]    if h(·) is nonincreasing,

where E_i[h(X)] = ∫_0^∞ h(x) dG_i(x), i = 1, 2.

Proof. (i) By (A.18), there exist two random variables X_1 and X_2, defined on a common probability space, such that X_1(z) ≤ X_2(z) for every z and G_i(x) = Prob{X_i(z) ≤ x}, i = 1, 2. (See Lemma 1 of Lehmann [12], p. 73.) Hence we get

E_1[h(X)] = ∫ h{X_1(z)} dP(z) ≤ ∫ h{X_2(z)} dP(z) = E_2[h(X)] ,

which completes the proof. (ii) This follows from (i) by considering the nondecreasing function g = 1 - h: [0,∞) → [0,1]. □

Lemma 6. Let f_m(·) be defined as in Lemma 1 and let h: [0,∞) → [0,1] be a monotone function. Define

A_m = ∫_0^∞ h(x) f_m(x) dx .

Then, for any m,

(i)  A_m ≤ A_{m+1}    if h(·) is nondecreasing,
(ii) A_m ≥ A_{m+1}    if h(·) is nonincreasing.

Proof. (i) Let X_1 and X_m be two independent chi-square random variables with 1 and m degrees of freedom, respectively. Then the random variable X_{m+1} defined by X_{m+1} = X_1 + X_m has a chi-square distribution with m+1 degrees of freedom. Let G_k be the distribution function of X_k, k = 1, m, m+1. Then, for any x,

G_{m+1}(x) = ∫ G_m(x - z) dG_1(z) ≤ ∫ G_m(x) dG_1(z) = G_m(x)    (z ≥ 0).

On the other hand,

A_m = ∫_0^∞ h(x) f_m(x) dx = ∫_0^∞ h(x) dG_m(x) .

Hence the proof follows from Lemma 5 (i). (ii) This is proved as above by using Lemma 5 (ii). □

Lemma 7. Let f_m(·) be defined as in Lemma 1 and let h: [0,∞) → [0,1]. Then

∫_0^∞ x^k h(x) f_m(x) dx = ν_k ∫_0^∞ h(x) f_{m+2k}(x) dx ,

where m + 2k > 0 and

ν_k = 2^k Γ(m/2 + k) / Γ(m/2)

is the k-th moment of a chi-square random variable with m degrees of freedom.

Proof. By direct computations,

∫_0^∞ x^k h(x) f_m(x) dx = [Γ(m/2) 2^{m/2}]⁻¹ ∫_0^∞ h(x) x^{m/2+k-1} e^{-x/2} dx
= \frac{Γ(m/2+k) 2^{m/2+k}}{Γ(m/2) 2^{m/2}} ∫_0^∞ h(x) \frac{x^{(m+2k)/2-1} e^{-x/2}}{Γ((m+2k)/2) 2^{(m+2k)/2}} dx
= ν_k ∫_0^∞ h(x) f_{m+2k}(x) dx .

Hence the proof is completed. □
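A worked instance of Lemma 7 (added for illustration; the choice k = -1 is mine, but it is the case needed in Appendix B below, where a factor 1/v appears in the integrands):

\[
\nu_{-1} = \frac{2^{-1}\,\Gamma(m/2 - 1)}{\Gamma(m/2)} = \frac{1}{m-2},
\qquad
\int_0^\infty x^{-1} h(x) f_m(x)\,dx = \frac{1}{m-2}\int_0^\infty h(x) f_{m-2}(x)\,dx \quad (m > 2).
\]

In particular, h ≡ 1 gives E[1/χ²_m] = 1/(m-2). Applied with m = p + 2 + 2j, the factor 1/v lowers the degrees of freedom to p + 2j and introduces the divisor p + 2j, which is how the quantities I_{p+2j}/(p+2j) arise in (B.2) and (B.3) below.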
APPENDIX B

Proof of Theorem 1 (ii). By using the transformations defined in (1.31) in Chapter I,

(B.1)   E[ (β̂ - β)'(X'X)β̂ · h{β̂'(X'X)²β̂} / (β̂'(X'X)²β̂) ] = E[ (Z - θ)'Z · h(σ²V) / V ] ,

where Z is distributed as N(θ, I) and V = Z'ΛZ = α Σ_{i=1}^{p} a_i Z_i², with α and the a_i's determined by the eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_p of X'X as in (1.31). For a fixed i (i = 1, 2, ..., p), let

V_1 = α a_i χ'²_{3,θ_i²} + α Σ_{j≠i} a_j Z_j²    and    V_2 = α a_i χ'²_{5,θ_i²} + α Σ_{j≠i} a_j Z_j² ,

where both χ'²_{3,θ_i²} and χ'²_{5,θ_i²} are independent of Z_j² for all j ≠ i. Then, by Lemma 2 in Appendix A,

E[ (Z_i - θ_i) Z_i h(σ²V)/V ] = (1 - θ_i²) E[ h(σ²V_1)/V_1 ] + θ_i² E[ h(σ²V_2)/V_2 ] .

By using the p.d.f. of V_1 given in Lemma 4 (i) in Appendix A, we get, as in (1.37),

(B.2)   E[ h(σ²V_1)/V_1 ] = α⁻¹ Σ_{j=0}^{∞} { Σ_{ℓ=0}^{j} a_i⁻¹ (1 - a_i⁻¹)^{j-ℓ} K_ℓ } I_{p+2j} / (p+2j) ,

where I_m is defined in (1.30). The p.d.f. of V_2, given in Lemma 4 (ii) in Appendix A, can be written as

p_{V_2}(v_2) = α⁻¹ Σ_{j=1}^{∞} { Σ_{ℓ=0}^{j-1} (j - ℓ) a_i⁻² (1 - a_i⁻¹)^{j-ℓ-1} K_ℓ } f_{p+2+2j}(v_2/α) .

From this, as in (1.37) and (B.2), we get

(B.3)   E[ h(σ²V_2)/V_2 ] = ∫_0^∞ [ h(σ²v_2)/v_2 ] p_{V_2}(v_2) dv_2
        = α⁻¹ Σ_{j=1}^{∞} { Σ_{ℓ=0}^{j-1} (j - ℓ) a_i⁻² (1 - a_i⁻¹)^{j-ℓ-1} K_ℓ } I_{p+2j} / (p+2j) .

Hence, by (B.2) and (B.3),

(B.4)   E[ (Z_i - θ_i) Z_i h(σ²V)/V ]
        = α⁻¹ Σ_{j=0}^{∞} { (1 - θ_i²) Σ_{ℓ=0}^{j} a_i⁻¹ (1 - a_i⁻¹)^{j-ℓ} K_ℓ + θ_i² Σ_{ℓ=0}^{j-1} (j - ℓ) a_i⁻² (1 - a_i⁻¹)^{j-ℓ-1} K_ℓ } I_{p+2j} / (p+2j) ,

where the second inner sum is understood to be zero for j = 0. Summing over i = 1, 2, ..., p gives

(B.5)   E[ (Z - θ)'Z h(σ²V)/V ]
        = α⁻¹ Σ_{i=1}^{p} Σ_{j=0}^{∞} { (1 - θ_i²) Σ_{ℓ=0}^{j} a_i⁻¹ (1 - a_i⁻¹)^{j-ℓ} K_ℓ + θ_i² Σ_{ℓ=0}^{j-1} (j - ℓ) a_i⁻² (1 - a_i⁻¹)^{j-ℓ-1} K_ℓ } I_{p+2j} / (p+2j) .

Now, application of the recursive formulas (1.36) and (1.37) to the summations of the above on i gives

(B.6)   E[ (Z - θ)'Z h(σ²V)/V ] = α⁻¹ Σ_{j=0}^{∞} K_j { I_{p+2j} - [ 2j / (p-2+2j) ] I_{p-2+2j} } .

(Interchangeability of the order of the summations is justified by the fact that we are dealing with finite summations of infinite series, all of which are convergent.) Hence, from (B.1) and (B.6), we get the identity. □

APPENDIX C

The 40 × 6 matrix X* of unstandardized independent variables (observations i = 1, 2, ..., 40; columns x*_{i1}, ..., x*_{i6}). The entries are reproduced as printed, ten to a line:

 8.000  8.000  8.000  8.000  8.000  8.000  8.000  8.000  8.000  8.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 3.000  3.000  3.000  3.000  3.000  3.000  3.000  3.000  3.000  3.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000
10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000 10.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 2.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
 2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000  2.000
 7.000  7.000  7.000  7.000  7.000  7.000  7.000  7.000  7.000  7.000
12.000 12.000 12.000 12.000 12.000 12.000 12.000 12.000 12.000 12.000
 0.056  0.847 -0.934  1.283  1.350 -1.807 -0.191  1.435 -0.082  0.257
 0.351 -0.083 -1.392  0.311  0.636  0.594 -0.543 -0.189  0.793 -0.483
-0.311 -0.596  0.140 -0.074  1.714  0.923  1.180  0.247  1.131  0.388
 0.608  0.095 -0.273  0.007  0.302 -0.729  0.474 -0.825 -0.669  0.765
 0.465 -0.524  1.948 -0.246 -0.743 -2.320 -1.404 -1.399  0.177 -0.254
 0.906 -1.142 -0.982 -2.182 -0.306  0.265  0.692  1.229  0.966 -0.251
 0.297 -0.499  1.914 -1.577  0.119 -0.659 -1.083  0.590  0.633 -1.355
 0.543 -0.261 -0.502  0.495  0.548  0.546  2.565 -0.838  0.872 -0.124
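Since the thesis assumes throughout that the independent variables are standardized so that X'X is a correlation matrix (see point 1 of Chapter V), the following small sketch (added for illustration; the toy matrix is not taken from X*) shows one way to pass from an unstandardized matrix such as X* to that form.

```python
import numpy as np

def standardize_design(X_unstd):
    """Center each column and scale it to unit Euclidean length,
    so that X'X for the result is the correlation matrix of the columns."""
    Xc = X_unstd - X_unstd.mean(axis=0)
    lengths = np.sqrt((Xc ** 2).sum(axis=0))
    return Xc / lengths

# Toy example (values chosen only to mimic the block structure of X*).
X_unstd = np.array([[8.0,  2.0,  0.30],
                    [8.0,  0.0, -1.10],
                    [0.0,  2.0,  0.70],
                    [0.0, 10.0,  0.20]])
X = standardize_design(X_unstd)
print(np.round(X.T @ X, 3))   # ones on the diagonal, correlations elsewhere
```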
BIBLIOGRAPHY

[1] Baranchik, A.J. (1964). Multiple regression and estimation of the mean of a multivariate normal distribution. Technical Report No. 51, Stanford University.
[2] Baranchik, A.J. (1970). A family of minimax estimators of the mean of a multivariate normal distribution. Annals of Mathematical Statistics 41, 642-645.
[3] Berger, J.O. (1976). Admissible minimax estimation of a multivariate normal mean with arbitrary quadratic loss. Annals of Statistics 4, 223-226.
[4] Berger, J.O. and Bock, M.E. (1976). Eliminating singularities of Stein-type estimators of location vectors. Journal of the Royal Statistical Society, Series B 38, 166-170.
[5] Bhattacharya, P.K. (1966). Estimating the mean of a multivariate normal population with general quadratic loss function. Annals of Mathematical Statistics 37, 1819-1824.
[6] Bock, M.E. (1975). Minimax estimators of the mean of a multivariate normal distribution. Annals of Statistics 3, 209-218.
[7] Efron, B. and Morris, C. (1973). Stein's estimation rule and its competitors - an empirical Bayes approach. Journal of the American Statistical Association 68, 117-130.
[8] Harville, D.A. (1971). On the distribution of linear combinations of non-central chi-squares. Annals of Mathematical Statistics 42, 809-811.
[9] Hudson, H.M. (1974). Empirical Bayes estimation. Technical Report No. 58, Stanford University.
[10] James, W. and Stein, C. (1961). Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1, 361-379.
[11] Johnson, N.L. and Kotz, S. (1970). Continuous Univariate Distributions-2. Houghton Mifflin Co., 149-188.
[12] Lehmann, E.L. (1959). Testing Statistical Hypotheses. John Wiley and Sons, Inc., 73-74.
[13] Loeve, M. (1963). Probability Theory, 3rd Ed. D. Van Nostrand Co., Inc.
[14] Press, S.J. (1966). Linear combinations of non-central chi-square variates. Annals of Mathematical Statistics 37, 480-487.
[15] Ruben, H. (1962). Probability content of regions under spherical normal distributions IV. Annals of Mathematical Statistics 33, 542-570.
[16] Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1, 197-206.
[17] Strawderman, W.E. (1971). Proper Bayes minimax estimators of the multivariate normal mean. Annals of Mathematical Statistics 42, 385-388.
[18] Strawderman, W.E. (1973). Proper Bayes minimax estimators of the multivariate normal mean vector for the case of common unknown variances. Annals of Statistics 1, 1189-1194.