Nam, Y.W.; (1977).A New generalization of James and Stein's estimators in multiple linear regression."

CONSOLIDATED
OF NORTH
A NEW GENERALIZATION OF JAMES AND STEIN'S ESTIMATORS
IN MULTIPLE LINEAR REGRESSION
Yong W. Nam
Institute of Statistics Mimeo Series #1104
January 1977
-e
DEPARTMENT OF STATISTICS
Chapel Hill, North Carolina
YONG WHA NAM.* ANew Generalization of James and Steints Estimatorl3
in MUltiple Linear Regression LUnder th..e direction of KeJ1}pton~ i,c,
Smith.)
\
Assume that
e:
is distributed as
N(0 ,0'2 1)
rank in the mUltiple linear regression
n xl,
X is
n xp
and
and
Y' = XS + e:,
S is p x 1 (p
~
3),
is of full
XIX
where
and e:
y
Define
dS' (x , x)213,&2}a 2
S'(X ' X)2§
a* (r)
1'2
the usual least squares estimator and
A
A
is unknown and
(y-XS) I (y-X[3)! Cn-p)
=
0'
2
and it is shown that under some general conditions on
for every
where
A = (n-p)/(p-p+2)
Let
Then,
13.
S*
~
if 0'2
0'
(:j*
B Cr)
rC·,·),
S if a satisfies o < a < 2 (p ... 2) A,
is unknown and
=1
if it is known.
a satisfies the above inequality if MSE{S*} < MSE{S}
MSE{S*}
= I,
for some
attains its minimum when. a = (P-2)A,
estimator coincides with the James and Stein's [10] estimator if
By a simple transformation of the
estimator,
An
is obtained
(
be the estimator in the above class with r·,·)
Moreover, the
=
if it is known,
explicit formula for the mean squared error CMSE) of
.
are
S~ is obtained such that
A
13*,
This
XIX=!,
a "positive-part-type"
MSE{S:} < MSE{§*}
for everY
B
and a>O.
* This research was supported in part by the Army Office of Research
under Contract DM-G29-74-C-0030 and in part by the National Science
Foundation under Grant GP-42325 .
•
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
."
I.
\
••
tIj
•
"
"
~
,
"
"
•
III
"
...
"
•
"
"
"
•
"
,
"
"
"
"
"
"
"
"
,-.
"
"
..
"
"
"
"
"
III
"
"
iii
INTRODUCTION
I. 3.
Statement of the Problem .•.....••.••••..•.......... 1
Historical background .......•..•..•................ 2
Summary of research results
7
1.4.
Basic identities "" . """" -" " " " " " " " " " " " " " " " " " " " " " " " " "" 10
1.1.
I. 2.
II.
NEW CLASSES OF ESTIMATORS
11.1. Introduction .......•.............................. 14
11.2. Case I: (J2 known
15
2
I I . 3. Case I I : (J.
unknown. . . . . . . . . . . . . . . . . . . . . . . . . . . .. 18
III.
GENERALIZATION OF JAMES AND STEIN'S ESTIMATORS
I I I. 1. Introduction
: . . . . . . • . . . . . .. 26
111.2.
I I I . 3.
111.4.
.
Case I:
0
2
known
27
2
Cas e I I : (J unknown. . . . . . . . . . . . . . . . . . . . . • . . • . . .. 30
Additional properties of the generalized
estimators """""""""""""""""""""""""""""""""""" . """ 32
II 1. 5.
IV.
Comparison between the generali zed James and
Stein's estimator and Bock's estimator
34
SIMPLE IMPROVEMENTS OF THE GENERALIZED JAMES AND STEIN I S
ESTIMATORS
IV.L
2
IV. 2. Case I: (J kn-olffi" "" " """ " "" " "" """ """ .. "" "" " """ " "" "
IV. 3. Case II: cr 2 unkn-.own·" " " " "" " " " " " " " " " " " " " " " " " " " " " " "
IV.4.Comparison between the improved James and Stein's
estimator and the Ilimproved" estimator of Berger
and Bock """""""""""""""""""""""""""""""""""
V.
e-"
0
"
"
""
49
DISCUSSION AND SOME SUGGESTIONS FOR FUTURE RESEARCH
53
APPENDIX A
58
APPEND IX B """",,""""""" ,-. " " " " ,,-. " " " " " " " " " " " " " " " " " " " " " " " " " .- " " " "" 69
APPENDIX C ................•................................. 73
BIB.LIOGRAPHY """"""""""""""""""""
•
e_" " " " " " " " " " " " " " " " " • • • • • • • • • •
74
ACKNOWLEDGEMENTS
I wish to acknowledge the advice and guidance given to me by
,
my advisor, Professor Kempton J.C. Smith, during the course of
this research.
I wish to express my appreciation to chairman of
my doctoral committee, Professor Raymond J. Carroll and other members of the committee, Professors Indra M. Chakravarti, Norman L.
Johnson, and Peter J. Schmidt for their constructive criticism
and suggestions.
Appreciation also is extended to Professor
Douglas M. Hawkins for his contribution.
I wish to express my special appreciation to my parents,
Mr. and Mrs. Chul Kyun Nam, my sisters, and, my late brother for
their continuous support and encouragement.
Finally, I thank my wife, Mi Yong, and my daughter? Mi Hae?
for their patience and sacrifices during the course of this
research.
iii
•
CHAPTER I
INTRODUCTION
I.l
Statement of the problem
Consider the multiple linear regression model
Y = XS +
(1.1)
where
E ,
Y = (Yl'Yz, •.. ,yn )' is a vector of n
x=
dent variable;
((x,,))
'lJ
is an
n xp
observations on the depen-
matrix such that the i-th row
X, (x'l'x,Z""'x.
1
1
1p ) contains the i-th observations on the independent variables (i = 1,Z, ... ,n); S = (Sl'SZ""'Sp)' is the vector of
of
parameters to be estimated; and
distributed as
N(O,
a constant, and
I
of
gen~rality,
(J
2
E
= (E l ,E 2 , ... ,En )'
I), 0 being the
nx1
is assumed to be
zero-vector,
being the identity matrix of order n.
(J
Z b'
e1ng
Without loss
it is assumed that the independent variables are stan-
dardized so that
X'X
is in the form of a correlation matrix.
Assume that
X'X
is of full rank.
Let
P be an orthogonal matrix
such that
= X'X
(1. 2)
P'AP
where
A is the diagonal matrix of the eigenvalues
of X'X.
,
Then the usual least squares estimator,
Al
~
A ~ ••• ~ A > 0
p
Z
2
is the best linear unbiased estimator (BLUE) of e and has a normal
distribution with mean
e and variance crZ(X'X)-l.
the mean squared error (MSE) of
In particular,
A
e is given by
,
(1. 3)
= tr
Var[§]
2
p
= cr . LIlA.
1 J.
J.=
This does not depend on scale of X because it is assumed that XIX
is in the form of a correlation matrix. If cr 2 is unknown it can be
estimated by s2
such that
(n-p)s
(1.4)
is independent of
2
= (y-Xe)'(y-Xe)
A
A
S and distributed
as
cr
2
times a chi-square ran-
dom variable with n-p degrees of freedom. Throughout this thesis,
2
"Case I: cr known" indicates that cr Z is known in (1.1) above and
"Case II: cr 2 unknown" is defined similarly.
As shown in the next section, a number of (biased) estimators
which have smaller mean
squared error than the usual least squares
estimator have been introduced by various authors.
New estimators and
new classes of estimators are proposed in this thesis.
It is assumed, throughout this thesis, that
I.2
p <!: 3.
Historical background
In 1956 Stein [16] showed, for Case I:
exists an estimator of the parameter
e
cr 2 known, that there
in (1.1) above which has
A
smaller mean squared error than
e
if X'X = I
and p <!: 3.
then a number of explicit forms of estimators and classes of
Since
e
3
estimators have been introduced.
Some of those which are relevant
to this research are presented below in two parts; one for the case
when
,
=I
X'X
and the other for the case when
an identity matrix.
A2{=
a
2
a.2
=s
,
,
X'X is not necessarily
For notational conveniences, define
the true value, for Case I
.
an est1mator
For the case when
X'X
0 f-
a2
= I,
given in (1.4) for Case II .
James and Stein [10] introduced an
estimator of the form
a = [1 - a. ~2A] ~ ,
(1.5)
. (3 (3
where
a. is a nonnegative constant.
They first obtained two impor-
tant identities,
P.
00
=. l
(1. 6)
J=O
J
p-2+2j
and
00
(1.7)
.l
(p-2)
)=0
l' .
P-Z{2j ,
(see (9) and (16), in· [10]), where
the Poisson probabilities with mean
2
(31(3!(2a ).
ties, they showed that
MSE{S} < MSE{S}
(1. 8)
..
if a.
satisfies
for every
(3
By using these identi-
4
o<a
(1. 9)
and that
< 2(p-2)A
a satisfies this inequality if
where
= 1 for Case I
A{= (n-p)/(n-p+2)
(1.10)
They also showed that the
MSE{S}
2{
2
for Case II .
attains its minimum,
P.}
cr P - (p-2) A j~O p- 2i2j
(1.11)
00
,
a = (p-2)A.
when
Baranchik [1] considered the "positive" part of James and Stein's
estimator,
(1.12)
where
<l>E (x) = 1
(1.13)
if
x
E:
E
MSE {S+} < MSE (S}
and
= 0
for every
otherwise, and showed that
S and a > 0 •
Later, Baranchik [1,2] also obtained a class of estimators of the
form
(1.14)
where
a is a nonnegative constant and r: [0,(0)
-+-
[0,1]
is monotone
nondecreasing, and showed that
(1.15)
.e
if
a satisfies
(1. 9) above. The improved estimator, S+, belongs to the
5
....
13 (r), given in (1.14) with
class of estimators,
If
= 1,
r(e)
Baranchik's class coincides with James and Stein's
estimator, that is,
For Case II:
=S.
S(l)
cr
2
unknown, Strawderman [18] obtained a large
class of estimators which contains Baranchik's class.
For the case when
is not necessarily an identity matrix,
XIX
Bhattacharya [5] first obtained an estimator
where
:: 0
O.
1 {=
is a diagonal matrix of
!1
for
1
i
=1,2
P
1
,\' ,1 I.. (I\. • 1 P-l+ j=i P-J+
I\.
0
,-1 ) (0)
I\.
0
P-J
such that
O.1 (i:::: 1,2, ••. ,p)
,,2
J -2 A U'
cr
U
(i) (i)
for
i
= 3~ 4, ..• ,p
,
Bhattacharya showed that
Bock [6] introduced a class of estimators of the form
(1.16)
.;--
___ .__.. __.__
~_whEIT!:l._ a.
is a nonnegative constant and
nondecreasing, and showed that
r: [0,00)
-+-
[0,1]
is monotone
6
(lel7)
if a
MSE{S*(r)} < MSE{S}
for every
r(e)
and
S
satisfies
o<
(1.18)
a < 2(A
r
r 1/A.-2)A
Pi=l
1
assuming that
P
(1.19)
ALI/A. > 2 •
Pi=l
1
If X'X = I
is assumed that
the above assumption (1.19) is satisfied, since it
p
~
3,
and this class of estimators coincides with
Baranchik's class given in (1.14) above.
Bock also considered a special case,
class of estimators with
r(e)
S*
=S*(l),
of the above
= 1,
(1.20)
and showed that
(1. 21)
if and only if a
If X'X = I,
satisfies (1.18) assuming that (1.19) is satisfied.
again, the assumption (1.19) is satisfied and this
estimator coincides with James and Stein's estimator given in (1.5)
above.
Berger and Bock [4] obtained an estimator which, if X'X = I,
coincides with the improved estimator of Baranchik given in (1.12)
above.
This estimator is compared with new improved estimator in
Section IV.4 of Chapter IV.
""
7
For Case I:
cr
2
known, both Berger [3] and Hudson [9] indepen-
dently obtained some of the results in this thesis, which are referred
to when these results are stated later in this thesis.
I.3
Summary of research results
In Chapter II, a new class of estimators is proposed as of the
form
~*(r):
(1. 22)
where a
,,2 {~, , 2~ "2}
~
I-a (J r ~ (X X) P, (J (X'X) ~,
~
13' (X'X)2~
is a nonnegative constant and
r: [0,(0) x [0,(0)
-+ [0,1]
satisfies
,,2.
,,2
(a)
for each fixed
(J, r(·,(J)
(b)
for each fixed
S'(X'X)2§,
sing.
is monotone nondecreasing
r{S(X'X)2 S,·}
is monotone nonincrea-
It is shown that
(1.23)
MSE{S*(r)} < MSE{S}
for every r(·,·)
and
13
if a satisfies (1.9) above.
Two particular subclasses are also considered.
obtained by letting
The one which is
S/
F: S'(X'X)2 cr 2 and
coincides with Strawderman's [18] class if X'X: I.
The other which
is obtained by letting
coincides with Baranchik's [2] class given in (1.14) above if
X'X: I.
8
Although both Bock's class of estimators given in (1.16) and the
above second subclass with
the case when
r (F). generalizes Baranchik' s class to
2
is not necessarily an identity matrix, this sub-
XIX
class does not require assumption (1.19) and gives wider range for
~
as given in (1.9) than Bock's class.
For Case I:
cr
2
known, both Berger [3] and Hudson [9] indepen-
dent1y obtained the above second subclass and proved (1.23).
However,
their methods of proof are different from the one used in this thesis.
If r(· ,.)
-
S* (r)
1,
A*
13 - a* (1)
(1. 24)
=
in (1.22) becomes
cr'2
~I - a a' (X'X)2
a
~s
(X' X)
In Chapter III, this particular estimator is further studied to show
that
(1. 25)
if a
satisfies (1.9) and that
a satisfies (1.9) if
a. = (p-2)A,
It will also be shown that, when
the
MSE{a*}
attains
its minimum,
(1. 26)
where
cr
a
~
)..
p
2{ P
L 1/)...1 i=l
2
(p-2) A a
-1
L
j=O
Kj
00
2 2'
}
p- + J
is a positive constant and the
,
K. 's
J
are defined as in
(1.36) below.
If XIX
= I,
this new estimator,
,..,
Stein's estimator,
13,
given in (1. 5) .
A*
13,
coincides with James and
9
Although both the new estimator,
e*,
and Bock's estimator S*,
can be regarded as generalizations of James and Stein's estimator,
to the case when
X'X
""
(3,
is not necessarily an identity matrix, these
generalized estimators have distinct properties.
Firstly, the condition (1.19) on the eigenvalues of X'X
be satisfied to define ""*
(3
should
while no such condition is necessary to
A
define
13*.
A*
(3
Secondly,
uses the same interval of a
in the case of S while ""*
(3
given in (1.9) as
uses a different interval given in
(1.18) which is not so wide as the former in general.
Thirdly, the
MSE{S*}
is minimized for every
when
(3
a= (p-2)A,
the midpoint of the interval given in (1.9), as in the case of the
MSE{S}
while the value of a which minimizes the
function of (3
MSE{S*}
is a
and,' in some cases, falls outside of the interval
given in (1.18) as will be proved in Section III.S of Chapter III.
Fourthly, if MSE{S*} < MSE{S}
for some
(3,
then
a must belong
to the interval given in (1.9) as in the case of the ""(3 while it is
not so for
S*
unless
MSEfS*} < MSE{S}
for every
(3.
Thus, it seems reasonable to say that this new estimator,
is more like James and Stein's estimator,
""*(3.
S,
(3A* ,
than Bock's estimator,
It is for this reason that the new estimator is named a genera-
lized James and Stein's estimator in this thesis.
For Case I: cr 2 known, Hudson [9] independently obtained the
same estimator and showed, by using a different method of proof, that,
A
when a = (p-2),
.,
(1. 27)
MSE{(3*}
cr
is given by
2{ p
. - (p-2) 2
L l/A..
E
i=l
t
2 ~}
cr 2 A
A
~(3' (X'X) (3
.
•
10
By using Corollary 1.1 (i) in the next section, it is readily seen
that the two quantities given in (1.26) and (1.27) are same.
In Chapter IV, the new
estimator given in (1. 24) above is fur-
ther improved by using
(1. 28)
where
P is defined as in (1.2) and
<Pi (i = 1,2, •.. ,p)
~
is a diagonal matrix of
such that
AI
I
2 A A2}
<Pi = <P[aA. ,00) { 13 (X X) 13/cr
for
i = 1, 2, ... ,p .
1
It is shown that
(1.29)
It is also shown that this estimator does not belong to the class
of estimators given in (1.22) above unless
X'X
=I
in which case it
coincides with the improved estimator of Baranchik given in (1.12)
above.
The above improved estimator
dent1y for Case I:
13A*+ was also considered indepen-
cr 2 known by Hudson [9] who stated, without proof,
A
"it seems clear that the estimator
1.4
would be superior to
13* •"
Basic identities
In this section, important new identities are obtained.
These
are used in the proofs of theorems in the next two chapters.
Since
the proof of Theorem 1 (ii) is very complicated and lengthy, it is
given separte1y in Appendix B.
11
.e
Theorem 1.
Let
ex ~ 0) ,
f (x) =
1
xm/ 2- 1e -x/2
m
f(m/2)2m/2
the p.d.f. of a chi-square random variable with
dome and let
h: [0,00) -+ [0,1]
m degrees of free-
be monotone nondecreasing.
Define
OO
(1. 30)
I
J
2
=
h Ca.cr X) f (x)dx
mOm
(A :<!: a. > 0) .
p
Then
(i)
(ii)
I p _2+ 2j ) ,
where
K.'s
Proof.
(i)
(1. 31)
where· P
J
are defined as in (1.36) below.
Let
Z = cr
-1
k
A
A 2p f3
and
e=
cr
-1
1
A ~P S ,
is the orthogonal matrix defined in (1. 2) above.
is distributed as
N(e,I).
Using (1.31), we get
(1. 32)
where
. (1. 33)
Let
A :<!:a.> 0
p
and
a.i = A./a.:<!:
1
1
(i = 1,2, ... ,p) •
Then
Z
12
Then, from (1.33),
(1. 34)
V=
P
2
L A.Z.
. 111
=a
1=
P
2
L a.Z.
. 111
= a
1=
P
,2
L a'X 82
. 1 1 1, .
1=
1
'2
"m,d denotes a noncentra1 chi-square random variable with
degrees of freedom and noncentra1ity parameter d. By Lemma 1 in
where
y
m
Appendix A
-1 ~
h(cr 2v)
-1
= a
£K.
f 2.(a v)dv
o j=O J v
p+ J
OO
(1. 35)
f
where the
K.'s
J
are given by
n
P
-8'8/2
K = e
a.-~
O
.1= 1 1
(1. 36)
".
j=1,2, ... ,
(1. 37) 9
=
p
P
t
-1
m
t 2 -1
-1 m-1 ,
£ (I-a. ) + m £ B.a. (I-a.)
m '1= l
'1 11
1=
~
1
m=1,2, ... ,
00
L K. = 1,
j=O J
K. ~ 0,
J
and
j = 0, 1, 2, . .. .
Now, by the Monotone Convergence Theorem for infinite integrals (see
Corollary 1 to Theorem A of Loeve [13], p. 124), the integral and the
summation signs on the right hand side of (1.35) can be interchanged.
Hence,
= a-I
= a-I
L K.foo.
2
L K. foo.
h(acr x)
00
j=O J 0
By Lemma 7 in Appendix A with
m + 2k > 0
if P ~ 3) ,
2
h(cr v) f
. (a- 1v)dv
j=O J 0
v
p+2J
00
x
m= p+2j
f
and
.(x)dx.
p+2J
k = -1
(note that
13
lOOK.
- a.- ~
J
.L p-2+2J'
J=O
lOOK.
- a.- ~
J
. L p-2+2J'
J=O
(1. 38)
OOh
fo
2
(ao X)f
2 2·(x)dx
p- + J
From (1.32) and (1.38), we get
which proves the first identity.
(ii)
0
See Appendix B.
Corollary 1.1
OOK.
l
= a. - L~
(i)
J
-
j=O p-2+2j
(ii)
E
Proof.
t
(S:S)' (X'X) !~
Let h(-) - 1
m
then
.
Then,
for all
= 1
m.
0
Hence, the proof follows.
X'X= I,
Kj
2 2' .
j=O p- + J
in Theorem 1.
I
If
~
= ( p- 2) a. -1 L
S'(X'X) 2S
K. = P.
J
J
for all
j
by Ruben [15].
Hence, in
this case, the above identities (i) and (ii) coincide with James and
Stein's [9] identities given in (1.6) and (1.7), respectively .
•
CHAPTER II
NEW CLASSES OF ESTIMATORS
II.1
Introduction
Two classes of estimators are obtained in this chapter for the
case when
X'X
is not necessarily an identity matrix.
When
0
2 is
unknown the new class of estimators is defined to be of the form
where a
is a nonnegative constant and the function
r(·,·)
is
assumed to satisfy certain conditions as stated in Theorem 2 below.
We first obtain an upper bound on the difference between the mean
A
squared error of
~
8*
and that of the usual least squares estimator
as
,.,.2{n- p +2
2
} -1 ~
Kj
n-p a -2(p-2)a a j~O p-2+2j Ap _2+2j ,
S v
where
A
P
2:
a
>
0,
the
K.' s
are defined as in (2.7).
J
From this, we show that
MSE{~*(r)} < MSE{~}
if a
~
are defined as in (1. 36), and the
for every
r(·,·)
satisfies
0 < a < 2(p-2) (n-p)/(n-p+2) .
and
8
A's
m
15
This new class of estimators,
A*
S (r), contains Strawderman's
[18] class and, in turn, the corresponding class of Baranchik [2]
when
X'X
= I.
Thus, this new class may be regarded as a genera1iza-
tion of the latter two.
11.2
Case I:
cr 2 known
As a generalization of Barachik's [1] result to the case when
X'X
is not necessarily an identity matrix, Bock [6] produced a class
of estimators of the form
2
2
[1-acr r{S' (X"X)S/cr }/S' (X'X)S]S
and showed
that these estimators, like those of Baranchik, have smaller mean
squared error than the usual least squares estimator
A
S if
assuming that A IliA. > 2, where A is
P
Pi=l
1
Pi=l
1
the smallest eigenvalue of X'X. In this section we obtain a new
0<
a. < 2 (A IliA .. - 2)
generalization of Baranchik's result in such a way that these new estiators have similar property to the above regarding mean squared error
but do not require the additional assumption made by Bock.
Theorem 2.
Assume that
a* (r)
(2.1)
where
A
2 {A, , 2 }
~
I _ acr -: S (X X~ S (X' X). ~ ,
C
S'(X'X)2 S
~
[0,1]
Then
MSE{~*(r)} < MSE{S}
for every
if a. satisfies
(2.3)
Define
a. is a nonnegative constant and' r: [0,00)
nondecreasing.
(2.2)
=
cr 2 is known.
o < a.
< 2(p-2).
r(e)
and
S
is monotone
16
Proof.
We have
Hence,
(2.4)
= E[{i3*(r)-B}'{a*(r)-B}] = E[(S-B)' (S-B)]
MSE{a*(r)}
_Z""ZE rrlf' (X· X) Z!l (a-S) , (X'X) ~
J
[ B' (X'X)2 13
+
"Z,,zEfc."Zi{a· (X' X) ZaU ..
[ a' (X'X)2 a J
Now
A
(2.5)
A
E [(6-B) , (B-B)]
2 P
1/)... .
. . 1
1
1=
= MSE{B} = cr l:
A
By Theorem 1 (i) in Chapter I,
\
(2.6)
where
(2. 7)
By Theorem 1 (ii) in Chapter I,
(2.8)
Bp _2 + 2j ]
,
where
(2.9)
2
B = [r(acr X)f (x)dx .
mOm
Substitution of (2.5), (2.6), and (2.8) into (2.4) gives
17
(
_ 2 p
-1 00
2j
(2.10) MSE{SA* (r)}-ez
l
K. B +2' - 2 2' Bp _2+2j )
{ L 1/;\.-2aet
i=l
1
j=O J P J p- + J
+a.2<l - l
ooK.J
. A
j~O
p-2+2J
}
.•
p-2+2J
Thus, we have obtained an explicit formula for the mean squared error
A*
of S (r). Now,
Ap _2+2j
= ~r2(oa2X)fp_2+2j(X)dX
by (2.7]
oo
s;
Jor(<l0'2x)fp- 2 2·(x)dx
J
+
= Bp _2+2j
S Bp+·2'J
since
OSr(e) S 1
by (2.9)
by Lemma 6 (i) in Appendix A .
Hence, we get, from (2.10)
A*.
2
(2.11) MSE{S (r)}sO' {. P
ll/;\.-2qct
1
i=1
-1
.(
2j
00
L K.1
- 2 2' ] A 2 2'
j=O J
p- + J p- + J
+a.2~·-1
00
j~O
K.
J
p-2+2j
p
S0'2
.
p-2+2j
'-1
1-
K.
00
L 1/;\.+{a.2_2(P_2)a.~-1 L
L
}
A
'-0 P
J-
1
_2 J 2'
+ J
Or, equivalently,
00
(2.12)
MSE{~*(r)}-MSE{~}s0'2{a.2_2(p-2)a~-ll
j =0
. .r
Clearly, the quadratic form in
if 0 < a. < 2(p- 2).
If
XIX
=
I
K.
/2' A 22"
p- + J
p- +. J
a. on the right hand side is negative
Hence, the proof is completed.
0
A* (r) in (2.1) becomes
in Theorem 2, then S
18
which is the same class of estimators obtained by Baranchik [1], given
in (1.14) in Chapter I.
As stated in Section I.3, both Berger [3] and Hudson [9] independently arrived at the same class of estimators as the
~*(r)
in (2.1)
and proved Theorem 2 by using a different method of proof.
II.3
Case II:
0
2 unknown
For the case when
X'X
is not necessarily an identity matrix, we
obtain a wide class of estimators which have smaller mean squared error
than the usual least squares estimator
~.
The class of estimators in
Theorem 3 below contains Strawderman's [18] class and, in turn, the
corresponding class of Baranchik [2] if
X'X = I.
A different generali-
zation of Baranchik's result was obtained by Bock [6] under the additional assumption stated in Section II.2 above.
The method of proof as used in Section II. 2 above is followed
except for minor changes in notation.
Theorem 3.
Define
(2.13)
where a.
is a nonnegative constant and
r: [0,00) x [0,00)
+
[0,1]
satisfies
(a)
for each fixed
2
2
s , r(·,s)
(b)
for each fixed
{A, I 2
..... 1
I
2
}
S
(X X) S, r S eX X) S,· is monotone nonincreasing.
A
is monotone nondecreasing,
A
19
Then,
MSE{S*(r)} < MSE{S}
(2.14)
if a.
Proof.
Since
and
a
satisfies
0 < a. < 2(p-2) (n-p)/(n-p+2) .
(2.15)
Let
for every r(o,-)
As in (2.4), we first obtain,
Z, 6,
and V be defined as in (1.31) and (1.33) in Chapter T.
(n_p)s2/cr2 is independent of
tion with n-p
S
and has a chi-square distri-
degrees of freedom, we get the conditional expected
By Lemma 7 in Appendix A, the integral becomes,
fo~r2{S'(X'X)2S,
(n_p)-lcr2y} y2 f
n-p
(y)dy
~ 2 Al I 2 A
-1 2
= (n-p) (n-p+2) 0 r {S (X X) 13, (n-p) cr y}fn _ + (y) dy .
J
= (n-p) (n-p+2)h1{S' (X ' X)2 S} ,
where
p 4
e-
20
00
(2.17)
hI (x) =
Jr r 2{ x, (n-p) -1 cr 2y }f
O
By assumption (a),
hI: [0,00)
~
[0,1]
n _p+4 (y) dy
is monotone nondecreasing.
Now, the conditional expectation is
Hence, we get,
(2.18)
1
= (n-p)- (n-p+2)awhere Am
,
= (Z-8) Z h (Z'1\Z)
2'AZ
2
(2.19)
l:
j=O
K.
/2' A
p- + J p-2+2j
is defined as in (2.7) with hI (0)
Similarly,
where
1 00
in place of
2
r (0) •
21
By assumption (a), it is readily seen that
monotone nondecreasing.
h 2 : [0,00)
+
[0,1]
is
Now, we get, corresponding to (2.8),
(2.20)
- -1 ~.(
2j
J
- a. j~O Kj Bp +2j - p-2+2j Bp _2+2j ,
where
Bm is defined as in (2.9) with h 2 (e) in place of r(e).
Substitution of (2.S), (2.18) and (2.20) into (2 .16) gives,. corresponding to (2.10),
(2.21)
(
"* _ 2p.
-1 00
2j
MSE{S }-a { Il/A.-2aa. I K. B 2' - 2 2' Bp _2+2j )
i=l
1
j=O J p+ J p- + J
+(n-p)
-1
.
2 _looK.
2)
(n-p+2)a. a. . I
2'
j=O p- + J
From assumption (b) and Lemma 6 (ii) in Appendix A,
hI (x) =
Jroor 2{ x, (n-p) -1 a 2y } f n _p+4 (y)dy
o
oo
s
Jor{x,(n_p)-la2Y}fn-p+ 4(y)dy
s
Jor{x,(n_p)-la 2Y}fn-p+ 2(y)dy
oo
From this inequality and Lemma 6 (i) in Appendix A
(
Using these inequalities in (2.21) we get,
22
A*
2
-1
2j
MSE{S (r)}~cr {. pl: 1/A.-2aa 00l: K. ( 1- 2 2' JA 2 2'
i=l
1
j=O J
p- + J p- + J
(2.22)
+ (n-p)
-1
}
/2'
A
p- + J p-2+2j
2 _looK.
(n-p+2}a ct
L
j=O
~cr2LI. 1 l/A.+{(n-p) -1(n_p+2)a 2
1=
1
-2(p-2)a'd-
.
}
1
K.
00
L
.
J.
J=O
p-2+2J
Or, equivalently,
~cr2{(n_p)-1(n_p+2)a2_2(p_2)a1..'a-lY~j2'
.
1 j=O p- + J Ap- 2+ 2'J
.
Clearly, the quadratic form in a' on the right hand side is negative
if
0 <a
<
0
2(p-2) (n-p)/(n-p+2). This completes the proof.
Two subclasses of
S*(r)
given in (2.13) are considered in the
next two corollaries.
Corollary 3.1.
Let
F = S'(X ' X)2
S/s 2.
Define
(2.24)
where
a is a nonnegative constant and
r: [0,00) x [0,00)
+
[0,1]
satisfies
(a I)
for each fixed
s 2 , r(e,s 2 )
(b ' )
for each fixed
F, r(F,e)
Then, the
Proof.
MSE{(3* (r)}
is monotone nondecreasing,
is monotone nonincreasing.
satisfies (2.14) if a
satisfies (2.15).
It suffices to show that the above function r(e,e), when
,
regarded as a function of a' (x x)2 a and s 2 , satisfies conditions
23
(a) and (b) in Theorem 3.
Let
and consider the transformations
Y = X=
2
2
s
2
By the Chain Rule of partial derivatives,
=
ar (Y1 ' Y2) (..!...]
aY
l
+
ar (Y1 ,Y2)
X
2
(0) <:: 0 .
ay2
2
X = s 2 , r (. , s)
is monotone nondecreasing,
2
which is condition (a) in Theorem 3. Similarly,
Hence, for any fixed
3rfYI'Y2) • 3r(Y I ,Y 2) [_ Xl] +
a,x 2
ay 1
Hence, for any fixed
X
1
x2
2
= S'(X ' X)2 S,
r{S'(X ' X)2 S,·}
increasing, which is condition (b) in Theorem 3.
is monotone non-
Hence, the proof
\
follows.
0
(In the strict sense, the partial derivatives of
r
exist "except for a set of Lebesgue measure zero".
in the above proof
However, since we
are dealing with expected values of continuous random variables, such
a distinction is irrelevant to the results.)
If XIX = I
in Corollary 3.1, then
AlA
F = S Sis
2
and
A*
l3 (r)
(2.24) becomes
which coincides with the corresponding class of estimators by
Strawderman [15],
in
24
The class of estimators in Corollary 3.2 below is a subclass of
the one given in Corollary 3.1 and, in turn, the one given in Theorem
3.
Corollary 3.2.
Let
(2.25)
F
= S'(X'X)2 S/ s 2.
S* (r)
=
E-ar~F)
Define
(X'X~ S
a is a nonnegative constant and
where
nondecreasing.
Then, the
MSE{S*(r)}
,
r: [0,00)
+
[0,1]
is monotone
satisfies (2.14) if a satis-
fies (2.15).
Proof.
Clearly, the above function
reo)
is a special case of r(o,o)
in Corollary 3.1. Hence, the proof follows.
If
XIX
=I
. Coro 11 ary 3 . 2 , t hen
ln
0
2
F. -_ a'a/
~ ~ s
and the
~* (r)
in (2.25) becomes
which coincides with the corresponding class of estimators of Baranchik [2], as given in (1.14).
The formula for the
MSE{S*(r}}
as given in (2.21) above, is a
function of A and B. An attractive property of S*(r) in Corolm
m
lary 3.2 is that the method of computing A and B is relatively
m
m
2
simple and that these quantities do not depend on S or 0
as
shown in the following.
Let
From (2.17),
25
(00 2
-1 2
h l (x) =
J r l {x, (n-p) cr w2}fn _p +4 (w 2)dwZ
O
=
(00 2
-2
J r {(n-p)cr x/w 2}fn _p+4 (w2)dw Z .
O
From (2.7) with
Am
2
hl(e)
in place of r (e),
= ~hl(aWl)fm(Wl)dWl
= J~~r2{a(n-p)Wl/W2}fm(Wl)fp_P+4(W2)dWldW2
.
2
Now, this is the expected value of r {a(n-p)W l /W 2} such that Wl and
Wz are independent and have chi-square distributions with v l = m
and v 2 = n-p+4 degrees of freedom, respectively. On the other hand,
it is well known that the ratio W= Wl/WZ has a Pearson Type VI distribution and that the p.d.f. of W is given by
f
\1
V
w;:: 0 •
(w)
l' 2
Hence,
A
m
can be computed directly as
A = [r2{a(n-p)~}f
4(w)dw .
m
0
m,n-p+
Similarly,
B = r;{<l(n"p)w}f
2(w)dw .
m J
m,n-p+
o
CHAPTER III
GENERALIZATION OF JAMES AND STEIN'S ESTIMATORS
111.1
Introduction
James and Stein's estimators are generalized for the case when XIX
is not necessarily an identity matrix.
When
cr
2 is unknown, this gen-
era1ized estimator is defined as
S* =
where
a.
{I _.
S'
2
a.s
(X 'X) 2
S
is a nonnegative constant.
(X'X)}S,
This estimator is obtained from
the class of estimators given in (2.13) of Chapter II by letting r(·,·)
:: 1,
that
•
1S,
A* _ A*
S
=8
(1).
This estimator has a number of attractive
properties as described below.
An
explicit formula for the difference
between the mean squared error of this estimator and that of the usual
least squares estimator is
MSE{S*} - MSE{S}
and the
where
I.
p+2
= cr 2{n-n-p
2
a. - 2 (p- 2) a.}
K.'s
J
ooK.
- l L\'
2 J 2· ,
j=O p- + J
are defined as in (1.36) in Chapter
From this, we show that
MSE{S*} < MSE{S}
if
CI.
for every
S
a. satisfies
o < a.
and that
< 2(p-2) (n-p) I (n-p+2)
a. satisfies this inequality if
MSE{S*} < MSE{S}
for some
27
S.
In addition, we also show that the mean squared error of this new
a = (p-2) (n-p) / (n-p+2).
estimator is minimized when
coefficient(s)
S, we show that this new estimator has the minimum
mean squared error when
estimator for
As for the unknown
X'X
S
=0
as in the case of James and Stein's
= I. Thus, it is seen that this new estimator has
all the important properties of James and Stein's estimator.
In Section 111.5 below, the generalized James and Stein's estimator and Bock's estimator are compared in terms of percentage reductions
in mean squared error as compared with the usual least squares estimator.
In Examples 1 and 2 in that section, it is pointed out that the new
p
estimator performs better than Bock's estimator with a = .\
L 1/.\. - 2
. 1
P ~=
and that the latter with
~
a = p-2 performs better than the new estima-
tor in some cases.
111.2
Case I:
cr 2 known
As a generalization of James and Stein's result to the case when
X'X
is not necessarily an identity matrix, Bock [6] produced an esti-
mator of the form {l-acr 2/s,(x/X)S}S and obtained a necessary and sufficient condition for which this estimator has smaller mean squared error
than the usual least squares estimator.
known for what value of the constant
this estimator is minimized.
However, it has not been
a the mean squared error of
In this section we obtain a
lization of James and Stein's result.
new genera-
We not only determine the opti-
mal value of . a but also obtain an explicit formula for the mean
squared error of this new generalized estimator for any given value of
a.
28
Theorem 4.
cr
Assume that
2
is known.
Define
(3.1)
where
~
is a nonnegative constant.
MSE{S*} < MSE{S}
(3.2)
if
(3.3)
0 <
~
~
< 2 (p-2)
satisfies this inequality if
MSE{S*} < MSE{S}
(3.4)
·e
S
for every
satisfies
~
and
Then,
for some
S.
Moreover, the MSE{S*} attains its minimum
(3.5)
cr
2{ P .
2 -1
Kj }
II/A.
(p-2) ex
I
2 2'
i=l
j=O p- + J
00
,
1
when
Proof.
~
= p-2.
Let
r(e) _ 1 in (2.1) in Chapter II.
(3.1) above.
Then the
Hence, from (2.4) in Chapter II
(3.6)
By Corollary 1.1 in Chapter I, we get
A*
S (r) becomes
29
p
MSE{S*} = 0'2 {
(3.7)
L l/A. -2 (p-2)aC4 -1 L
i=l·
+
= 0'
K.
00
/2'
j=O P- + J
1
a2~-1 ~
Kj }
j~O p-2+2j
2fP
L l/A.
12-=1
1
ooK. ~
+{.a.2 -2(p-2)a}a.- lj=O
L /2·'
p- + J
Clearly, the quadratic form in' a on the right hand side is negative
if and only if
0 < a < 2(p-2) ,
and is minimized when
a =p-2.
By sub-
stituting this value of a into (3.1) and (3.7), we see that theestimator
S* =
(3.8)
{r - (p-2)
0'2
S,(x,x)213
(X'X)}~
has the smallest mean squared error as given in (3.5) above.
completes the proof.
If
X'X=I,
This
0
then
K.:: P.
J
J
for all
j
by Ruben [15].
Hence,
the above estimator, in this case, coincides with James and Stein's
estimator given in (1.5) in Chapter I and the minimized mean squared
error of the former given in (3.5) becomes that of the latter given
in (1.10) in Chapter I.
As mentioned in Section 1.3, Hudson [9] independently arrived at
the same estimator as the
different method of proof.
a
"'*
in (3.1) and proved Theorem 4 by using
However, Hudson did not obtain such an
explicit formula of the MSE{S*} as given in (3.5) above.
,..
30
111.3 Case II:
0
2
unknown
As in the above section, we generalize James and Stein's result
to the case when
is not necessarily an identity matrix.
XIX
Bock
[6] obtained a different generalization also for Case II in the same
way as for Case I.
Theorem 5.
Define
(3.9)
a is a nonnegative constant.
where
Then
(3.10)
if a
satisfies
a
(3.11)
and· a
< a < 2(p-2) (n-p)/(n-p+2)
satisfies this inequality if
(3.12)
Moreover, the
02
(3.13)
when
MSE{S*}
attains its minimum
p
II/A. - (p-2) 2 (n-p)
{ i=l
~
n-p+2
a.-I
I
00
K
i .} ,
j=O p-2+2J
a = (p-2) (n-p) / (n-p+2) .
Proof.
becomes
Let
A*
(3
r(·,·)
=1
in (3.9)
in (2.13) in Chapter II.
above.
Then the
S*(r)
Hence, from (2, 16) in Chapter II,
31
(3.14)
Since
(n_p)s2/cr2 is independent of
bution with n-p
S and
has a chi-square distri-
degrees of freedom, we get
(3.15)
e'
By Corollary 1.1 in Chapter I,
(3.16)
MSE{S*} = cr
2
rI
U-=l
p
1/)... + {n- +2
-l
a.
n-p
1
ooK.]
j~O
~
p-2+2j
Clearly, the quadratic form in a
a? -2 (p-2)a}
.
on the right hand side is negative
if and only if 0 < a < 2(p-2) (n-p)/(n-p+2),
a
= (p-2) (n-p)/(n-p+2).
and is minimized when
By substituting this value of a into (3.9)
and (3.16), we see that the estimator
(3.17)
S* =
.
•
{I _ (p-2) (n-p)
s2
(X' X)}s
n-p+2
S'(X'X)2 S
32
has the smallest mean squared error as given in (3.13) above.
0
completes the proof.
If X,X = I,
This
the estimator
a*
~
in (3.9) becomes
s* = (1 -~s~)
S,
S'S
which coincides with the James and Stein's estimator as given in (1.5)
of Chapter I and the minimized mean squared error of the former given
in (3.13) above becomes that of the latter given in (1.11) in Chapter I.
III.4
Additional properties of the generalized estimators
By Theorems 4 and 5, the minimized mean squared errors of the
generalized James and Stein's estimators are given by
.
m~n MSE{S*} =
a
2{ P
L 1/>... - (p-2) 2Aa -1 L K2j 2' }
i=l
~
j=O p- + J
0)
(J
.
where
, I
for
S*' .
A:{ cn_PJ!cn_p+z;n
Theorem,4 ,
for
S*
in Theorem 5 .
Clearly, these minimized mean squared errors are small and hence reductions in mean squared error made by the corresponding generalized James
and Stein's estimators are large when the term
(3.18)
is large.
In this section we investigate
R(A!a,6)
as a function of
33
Theorem 6.
any fixed
R(A/a,8)
Let
A/a, R(A/a,-)
be defined as in (3.18) above.
is a decreasing function of
and attains its maximum when
assuming that
Proof.
0
8
= 0,
or equivalently, when
;
Prob ~ {; = j} = K.,
S =0
as
j = 0, 1 , 2, . .. ,
J
K.'s
8~1 (i = 1,2, ... ,p)
2 is finite.
Define a random variable
where the
Then, for
which are defined in (1.36) in Chapter I satisfy
J
00
K. ~ 0 (j = 0,1,2, ... )
L K. = 1
and
J
Then,
;
.
j=O J
is a proper random variable.
By Lemma 3 in Appendix A, the
distribution function of ;,
F (x) = Prob. {; ~ x} =
is decreasing in
8~ (i = 1, 2, . . . , p) .
L K.
j~x J
,
Now,
1
(3.19)
Clearly, the function
1/(p-2+2;)
is decreasing in
;.
Hence, by
Lemma 5 (ii) in Appendix A, the expected value on the right hand side
is decreasing in
theorem.
when
Since
8 = O.
8~1 (i = 1,2, ... ,p), which proves the first part of the
8~ ~ 0 for all
1
i,
R(A/a, -)
By (1.31) of Chapter I,
e = 0 -1Ak1>S
,
attains its maximum
34
where A and Pare nonsingular matrices. Hence, 8 = 0 if and
2
only if S = 0 assuming that 0
is finite. This completes the
proof.
0
If X'X
mean
8'8/2,
= I,
K.
becomes the
J
(j+l)-th Poisson probability with
that is,
j=0,1,2, ....
Hence, in this case,
R(A/a,-)
instead of p
is a function of a single quantity
quantities
2
2
82 , ... ,8 ' The above
p
Theorem 6 was obtained by James and Stein [10] for the case when X'X = I.
111.5
Comparison between the generalized James and Stein's estimator
and Bock's estimator
The generalized James and Stein's estimator defined in (3.1) above
and Bock's estimator given in (1.20) of Chapter I are compared by means
of percentage l;'eductions of the mean squared error made by these esti2
mators over the usual least squares estimator for Case I: 0
known,
2
where it is assumed, without loss of generality, that 0 = L
In Examples 1 and 2 below, the percentage reductions by the generalized James and Stein's estimator
S*
are computed by using the
formula given in (3.5) above and those by Bock's estimator e*
are
computed by using the corresponding formula of Bock (see the proof of
Theorem 3 of [6]) which, after some computation, can be written as
PooP.
(3 t20)
where
MSE{e*} =
i~ll/\ + j~O
(p+2j){p_2+2j)qj(a)
35
j = 0,1,2, ...
the Poisson probabilities with mean
8'8/2,
p
L 11'A.{:a.2-2CP-2+2j)a.}'if
=
. 1
~
~=
C3.21)
={
~
. 1
~=
8=0
2
1/A.+2 j a(8)r - 2{CP-2+2 j )
~
I
'1
1/A.-4 j a(8)f if
~
~=
8~0
and
a(8) = 8' A-1 8/ 8 , 8
C3.22)
C8 ~ 0) .
P
a.
L
a. = 'A
l/A. -2,
P i=l
~
the midpoint of Bock's interval given in C1.18) of Chapter I, and a.=p-2.
Two values of
are considered for Bock's estimator:
It will first be shown that Bock's estimator S* has the following
new properties:
CA)
MSE fs*} = min MSE{S*}
a.=p-2
a
MSE fs*} <
a.=p-2
CB)
where
-=1
A
if
a(8)/A -1 =1
MSEfs*}
if
p
a.='A
I/A.-2
Pi=l
~
or
a(8)/A- l s 1 ,
L
1 P
=1/t...•
P i=1
~
L
Suppose that
8 = O.
Po = 1
Hence, C3.20) becomes
and
Then,
P = 0
j
8=0
for all
j
~
0 .
36
Clearly, the quadratic fonn of a
a= p-2,
when
on the right hand side is minimized
which proves the second part of (A).
Suppose that
-=1 = 1.
a.(6)/A
)
Then, from (3.21,
---=I +2jA
---=I )a 2 -2. { (p-2+2j)PA-::Y-=T}
-4jA
a
q.(a) = (pA
J
= (p+2j) -=I
A {a 2 - 2 (p-2)a } .
a = p-2 minimizes
Hence,
q. (a)
for all
J
j.
This and (3.20) prove
the first part of (A).
From (3.21) it is readily seen that
1
a ='2 h j (8) ,
q. (a)·
J
is minimized when
where
P
2(p-2+2j) ~ 1/A.-8ja.(8)
1
.1
1=
p
2: 1/A.1 +2ja.(6)
.1
1=
Consider
4
h. 1(6) -h.(6) =
J+
J
p
I
. 1
1=
1/ A;{
{h. (6) }
J
This proves (B) because
When
..
1/ \;2 j a.(6)}
Then, it is seen, from the above, that
is a nondecreasing sequence.
Example 1.
1
.. p
1=1
0.(6) IA -1 s 1.
coefficient of
1/A.-Pa.(6J}
{)1/Ai+2(j+1Ja.(6J}{.~
1=1
Suppose that
I
1= 1
1.
q. (a)
J
Hence,
is a quadratic fonn in
a. and the
a2 is positive.
6 = 0:
For 12 arbitrarily chosen values of
centage reductions of the mean squared error by Bock's estimator
A, per-
37
p
computed at ct = >..
~ 1/>... -2 and a. = p-2 are given in the first and
p i~l· 1 ·
second columns of "Bock" in Table 3.1 below a~d those by the genera~ = p-2
lized James and Stein's estmator computed at
~
are given in
the last column.
. ·TABLE 3.1
Percentage reductions of the mean squared error when
A
1.0
Bock
New
e=a
1.0
1.0
1.0
66.7 (66.7)
66.7
1.2 1.0 1.0 1.0
1.4 1.0 1.0 1.0
1.6 1.0 1.0 1.0
1.8 1.0 1.0 1.0
1.0
1.0
1.0
1.0
0.8
0.6
0.4
0.2
6L,3
47.0
25.3
Nd*
(66.7)
(66.7)
(66.7)
(66.7)
66.0
63.0
58.0
44.7
1.2
1.4
1.6
1.8
1.2
1.4
1.6
1.8
1.0
1.0
1.0
1.0
1.0
1.0
1.0
1.0
0.8
0.6
0.4
0.2
0.8
0.6
0.4
0.2
61.9
50.9
36.3
19.1
(66.7)
(66. 7)
(66.7)
(66.7)
65.3
60.8
51.9
35.1
1.2
1.4
1.6
1.8
1.2
1.4
1.6
1.8
1.2
1.4
1.6
1.8
0.8
0.6
0.4
0.2
0.8
0.6
0.4
0.2
0.8
0.6
0.4
0.2
62.5
54.4
45.6
37.0
(66.7)
(66.7)
(66.7) .
(66.7)
64.7
58.4
47.4
30.0
1.0
1.0
e
p
(*) Nd means "not defined" because
a= A
L 1/>".-2 < 0
P i=l
1
in this case.
It is seen in the table that the generalized James and Stein's
p
estimator performs better than Bock's estimator with
for 11 out of 12 values of
L 1/>".-2
Pi=l
1
>...
It is also seen that Bock's estimator with
p
better than the one with
a=A
a = A.
L 1/>".1 -2
. Pi::1
a=p-2
performs
for all 12 values of
A as
38
expected from the property (A) shown above and that the former performs better than the generalized James and Stein's estimator for all
these
A's.
Exapple 2.
e;z! 0: For 30 arbitrarily chosen pairs of (A,S),
When
the percentage reductions of the mean squared error by Bock's estimator
p
a =A II/A. -2 and a =p-2 are given in the first and
computed at
Pi=l· ~
second rows of each case in Table 3.2 below and those by the generalized
James and Stein's estimator computed at
a= p-2
row.
(iI' i 2)
A(i) and
In the second column of the table,
(i )
(i )
that
A=A
1
e=e
and
2
where
are given in the third
for each case means
SCi)
are defined
as
A(i)
i
1
2
3
y
(1.8 1.4
(1. 5 1.3
(1. 5 1.5
= x/ /3,
for
1.2
1.0
1.1
1.0
0.9
1.0
SCi)
0.4
0.7
0.2)
0.5)
(x
(0
0.5
0.5)
(y
x =0.5, 1. 0;, 1. 5, 2.0, 4.0
0
0
0 0 0)
0 0 0 x)
0 Y 0 Y 0) ,
and
()
6.0.
It is seen in Table 3.2 below that the generalized James and
Stein's estimator performs better than Bock's estimator with
p
A II/A. -2
Pi=1
~
of Case 2.
for 28 out of 30 entries except for
x = 4.0
and
a =
x =6.0
39
TABLE 3.2
Percentage reductions of the mean squared error when
8
0
;:e
a.(8)/\-1
Case
(i 1 ~i2)
x=0.5
x=1.0
x=1.5
x=2.0
x=4.0
x=6.0
1
0.32
(1, 1)
3.8
Bock { 65.8
39.3
New
3.5
63.1
33.3
3.0
58.6
25.8
2.5
52.6
18.8
1.1
28.2
5.4
0.5
15.1
2.4
2
0.58
(2, 1)
38.1
65.1
57.0
34.5
60.7
49.2
29.7
54.2
39.2
24.6
46.9
29.7
10.7
22.9
9.6
5.4
11.9
4.3
3
1. 00
(3, 3)
42.2
64.0
54.8
37.5
56.8
48.6
31.3
47.4
40.5
25.0
37.8
32.3
9.6
14.6
12.3
4.6
7.0
5.8
4
1. 74
(2, 2)
37.3
62.0
58.2
31.9
50.0
53.6
25.0
35.2
47.1
18.4
21.8
40.2
5.0
- 0.1
19.7
2.0
- 1.8
10.6
5
2.83
(1, 2)
3.7
59.1
40.9
3.1
40.1
38.9
2.4
17.5
36.0
1.6
- 1.6
32.6
0.3
. -21.6
20.4
0.1
-14.6
12.9
e
It is also seen that, for all 18 entries of Cases 1,2, and 3,
where
--=r :::; 1,
a.(8)/\
Bock's estimator with
a.=p-2
performs better
p
I 1/\.-2 as expected from property (B)
Pi=l
1
above and that the former performs better than the generalized James
than the one with
a.= \
and Stein's estimator for these entries.
where
-:y
a.(8)/ \ . > 1,
However, for
Bock's estimator with
a.= p-2
C~ses
4 and 5,
increases the
mean squared error over the usual least squares estimator for 5 out of
12 entries by ranging from
0.1%
to as much as
21.6%
while Bock's
p
estimator with
.
a.=\
L 1/\.-2
Pi=l
1
and the generalized James and Stein's
-
~
40
estimator never increase the mean squared error over the usual least
squares estimator.
Thus, it seems reasonable to say that the useful-
ness of Bock's estimator with
a. = p- 2 is limited to the cases where
aCe) fA -1 s 1.
In the following example, generated data were used to make furthere comparisons.
Example 3.
Consider a linear model,
where the independent variables are standardized and
pendently normally distributed with mean
0 and variance
are inde1.
Let the
X~1'X~2'···'
1
,1
values of the unstandardized independent variables
Xi6
EiJ'S
and
be given as in Appendix C, where
4
L xi·J = 14
j=l
4
L X~. = 12
,
j=l 1J
(i = 2,3, ... ,40) ,
is a string of 80 random
normal deviates (generated by a computer program).
Xii 'X i2 ' ... ,
and X i6
in (3.23) above be given as the standardized
"respectively
values of X*il 'X *i2 ' ... ' and
Then, the correlation matrix is given by
1.000
X'X
0.841
1.000
=
The eigenvalues of
X'X
Let the values of
-0.315
-0.485
1.000
are
-0.592
-0.369
-0.568
1.000
0.134
0.234
-0.124
-0.037
1.000
(i
=1,2, ... ,40) ."
-0.276
-0.251
-0.054
0.294
0.042
1.000
41
A2 = 1.660
. Al = 0.451
Let
A4 = 0.773
"
AS = 0.133
13 = 13 (k) (k = 1,2)
be given as,
,
A3 = 0.981
A6 = 0.002
13(1)=( 1.135,
-0.875,
0.046,
-0.618,
0.257,
0.459)
13(2) = (30.417,
3.558,
32.911,
38.138,
0.645,
-0.219)
.
Then, by direct computations, we get
where
A string- of 40 random deviates (generated by a computer program) ,was
substituted into (3.23) to generate values of dependent variable for
each
13 = 13 (k) (k = 1,2) •
mator; b,
The squared distance between
13
and its esti-
defined as
(3.24)
6
L2 (b) = (b-I3)'(b-l3) = I (b.-I3.)
1=1 ~ ~
was then computed for each S.
e:' X(X' X) -2 x' e:
2
A.
A.
,~~
L (13) = (13-13) (13-13) =
2
for both f3t s ~ Entries in'Tab1e 3.3 below are the
squared distances,
A
In particular,
~
A*
L (13), L (13*), and L2 (13), computed by using 10
2
2
different strings of random normal deviates (generated by a computer
program).
~
42
TABLE 3.2
Squared distances between
13
Bock
and its estimators
New
Sample
BLUE
13=13(1)(13=13(2))
13=13(1)(13=13(2))
1
172.974
91. 868 (289.454)
162.290(159.560)
2
33.522
5.353( 59.118)
19.806( 22.972)
3
44.872
9.484(405.900)
45.163( 44.199)
4
187.346
25.210( 17.693)
180.510(176.438)
5
11.181
8.399(297.616)
1O.416(
6
23.762
7.330(544.125)
19.035( 25.397)
7
186.212
18.455 ( 26.413)
182.316 (182.068)
8
36.563
18.482(249.247)
31. 999 (33. 271)
9
119.710
14.778(780.823)
114.947(120.612)
10
29.870
17.704( 40.471)
. 25. 443( 24.107)
84.601
21.696(271.086)
79.193( 79.748)
·e
Average
8.859)
It is seen, in the table, that the average of the squared distances
,
~
~
L2(f3*) is smaller than that of the other two when et (f3)/A =0.023
and bigger when et'(f3)/A- 1 = 2.982. This is a limited but specific
example which shows that the mean squared error of Bock's estimator,
~
E[L 2(f3*)],
tor,
.e
may be bigger than that of the usual least squares estima-
E[L (S)],
2
if et'(f3)/A- 1 >1.
CHAPTER IV
SIMPLE IMPROVEMENTS OF THE GENERALIZED
JAMES AND STEIN'S ESTIMATORS
IV.1
Introduction
The generalized James and Stein's estimators defined in Chapter
III are further improved by simple transformations.
When
0
2 is
unknown, this improved estimator is defined as
s:
=
(P' cpP) 8*
2
where
estimator,
and
cp
.
is the generalized James and Stein's
{ I-" as 2" (X'X)}13
S' (X'X) S
P is the orthogonal matrix defined as (1.2) of Chapter I,
is a diagonal matrix of cp. (i
1
CPi
.
= cp[aA. ,(0) {S' (X'X)2 S/ s 2}
=
1,2, ... ,p)
for
i
such that
=1,2, ... ,p
,
1
where
~(x) = {:
if x EO E
otherwise .
It will be shown that
MSE{S*} < MSE{S*}
+
for every
(3
and
a> 0 .
In Theorem 5 of Chapter III, it was shown that the mean squared error
of
"
S*
is minimized when
the optimal value of
a = (p-2) (n...p) / Cn-p+2) . However, as for "*
S+'
a has not been determined.
44
IV.2
Case I:
cr
2
known
Baranchik [1] showed that the James and Stein's estimator
2
2
(1 - acr /S'S)S can be improved by replacing (1 - acr /S'S) by its
2
2
"positive part" (1 - acr /S'S)+ = max {a, (1 - a.cr /S'S)}. In this section we show that a similar improvement can be made for the generalized
2
James and Stein's estimator S* = {I a.cr
(X' XJ}i3 by a simple
S'(X'X)2 S
transformation.
(The following theorem can also be proved by applying a Baranchik's
theorem (see page 19 of [1]) after reparameterization.
However, we
will use Lemma 2 in Appendix A so as to obtain a formula for the dif. ference in mean squared error as given in (4.14) below.)
Theorem 7 •. Assume that
of <p.~ (i
cr
2
is known.
~ be a diagonal matrix
= 1,2, ... ,p) such that
(4.1)
and let
Let
for
8*
~
i = 1, 2, ... , p ,
be the generalized James and Stein's estimator defined
in Theorem 4 of Chapter III.
Define
(4.2)
Then
(4.3)
Proof.
MSE{S*} < MSE{i3*}
+
S and a. > 0 .
By definition
13* =
where
for every
{r
a. is a nonnegative constant.
By using the transformations
45
defined in (1.31) of Chapter I,
Hence
MSE{~*} = (i
(4.4)
.
rA.:
1.=
1
where
a.
A
d. (Z'AZ) = 1 - Z';Z
1.
(4.5)
EU{d. (Z'AZ)Z. -
1 11
r
1
(1.}~.
1
if
Z' AZ < all..
if
Z' AZ
1
0
~
0
~
all..
1
From (4.1), by using the same transformations, we get
(4.6)
-e
Hence
-
(4.7)
MSE{S*}
+
1
= (i . 1A.:
EU{0. (Z'AZ)Z. - e.}~.
1
1.
11
r
,
1=
where
o.(Z'AZ)
(4.8)
1
= ~.d.(Z'AZ)
1 1
.
From (4.4) and (4.7),
(4.9)
where
D.
.e
1
(4.10)
2
]
=E[{o.1
(Z'AZ)z.-e.}2]-EHd.
(Z'AZ)z.-e.J
·1
1
1 . .
1
1
=E[{O~(Z'AZ)-d~(Z'AZ)}Z~]-2e.EHo.
(Z'AZ)-d. (Z'AZ)}Z.]
1
1
111
1
1
•
46
From (4.8)
(4.11) e~(Z'AZ)-~(Z'AZ)";{<I>
1
1
. co (Z'AZ)-l}d~(Z'AZ)
[aA.,
)
1
. 1
2
.
2
V = aa.X'
2 + a.
a.Z. (m=1,2),
m
1 (1+2m),8.
j;z!i J J
l:
Define
,2
X
.2
(1+2m) ,8.
h
were
1
1
Z~ are independent for all
J
Appendix A,
j;z! i.
By (4.11) and Lemma 2 (i) in
-28. E[{e. (Z' AZ) -d. (Z'AZ) }Z.] = 28.E{<I>[0
1
1
1
1
and
1
It.) (Z' AZ)d. (Z 'AZ)Z.}
,a
i l l
(4.12)
Similarly, by (4.11) and Lemma 2 (ii) in Appendix A,
(4.13)
E[{e~(Z'AZ)-d~(Z'AZ)}Z~]
= -E{<I> [0 a' ) (Z'AZ)d~(Z'AZ)Z~}
1
1
1
,
1
1
1\.
1
e-
Substitutions of (4.12) and (4.13) into (4.10) and then into (4.9)
give
A*
A*
MSE{S+}-MSE{S } = (J
2 P
i~l \
-1 [
[E
{
- 8i E{<I>[0,aA )
i
2}
<I>[o,a\) (V1)d i (VI)
(V2)d~(V2)}
+ 26iE{$[O,aAi) (VI)di (VI)
~
Thus, we have obtained a formula for the difference of mean squared
errors.
Now,
E{<I>[O _, )(v )d~(V)} > 0
'W\i
m 1 m
(m=1,2)
e.
47
0
Hence, the proof is completed,
X' X = I
If
in the above theorem, then
13*
I,>
+
becomes
which coincides with the improved James and Stein's estimator by
Baranchik [1],
In the following, it will be shown that the improved estimator,
"*
S+,
defined in (4.2) above does not belong to the class of
A
S*(r),
defined in (2,1) of Chapter II unless
S:
=
(PI~P){I
a.
-
A
cr
2
'"
X~X=It
estimators~
From (4.2)
•
(XIX)}S
SI eX IX) 2 e
where
a diagonal matrix.
if and only
Hence, the
if
(4.14)
such that
r:
[O,~) +
Yi (i = 1,2, ... ,p)
[0,1]
is monotone nondecreasing.
be the i-th diagonal element of
r.
Let
Then
It is readily seen that the above (4,14) does not hold unless
XIX
= I.
48
~
in (4,2) was also considered indo.
The improved estimator ~*
+
2
pendently for Case I: 0
known by Hudson [19] who stated 1 without
8*+
proof, "it seems clear that the estimator
Case II:
IV.3
o
2
would be supeTior to
unknown
A simple improvement of Baranchik's [1] type is applied to the
generalized James and Stein's estimator
Theorem 8.
Let
S* = {I -" as " ex' X)
a' (X'X)2 a
be a diagonal matrix of
ep
"
(4.15)
2
2.....
2
<Pi = <P[a.A.,OO){f3'(XIX) Sis}
for
<Pi(i=1~2"",p)
}a.
such that
i = 1,2, •. .,p
1
A
and let
13*
be the generalized James and Stein's estimator defined in
Theorem 5 of Chapter I I I.
Define'
(4.16)
Then
MSE(S*) < MSE(S*)
, (4.17)
Proof.
+
for every
f3
and
a. > 0 •
From (4.15), by using the transformations defined in (1,30)
of Chapter I,
(4.18)
By comparing the right hand side of (4,6)
that
Z'Az
is replaced by
and that of (4.18), we see
Z'AZ
The same change is made in ~*,
~2/(i •
Thus, we can follow the proof of Theorem 7,
No additional changes are
necessary in computing conditional expected values given
s
2
except
49
V
Ym is replaced by 2m
2
that
(m = 1,2).
s !cr
2
taking expectations with respect to s
If
X~X = I
Then, the proof follows by
o
in the above theorem, then
8*' ,.;.
1-'+ -
s*+
becomes
",
2 {
a.s 2}",
<p[a. 00)(13 Sis) 1--13 ,
A
,
.~,~
which coincides with the improved James and Stein's estimator obtained
by Baranchik [1].
~*
As for Case I, it can be shown that the improved estimator
defined in (4.16) does not belong to the class of estimators,
defined in (2.13) of Chapter II unless
IV.4
+'
13. . * (r),
XIX = 1.
Comparison between the improved generalized James and Stein's
estimator and the "improved" estimator of Berger and Bock
Berger and Bock [4] introduced an estimator of the form
(4.19)
for Case I:
erro~
A*
13,
known and showed that it has smaller mean squared
(J2
for any
a.,
than the generalized James and Stein's estimator,
defined in (3.1) of Chapter III.
It was shown, in Theorem 7 above, that the improved estimator,
"'*
13+,
has smaller mean squared error, for any
a,
than the estimator
a"* .
If
estimator
X' X = I,
S*
I-'
+
both of the improved generali zed James and Stein ~ s
defined in (4.2) and the "improved tl estimator
S:
of
Berger and Bock given above coincide with the improved James and Stein~s
estimator
50
of Bara.nchi.k II] ~ which. has smaller mean squared er'l'or than the James
and Steints estimator
(1_~cr2/~tS)6 for any a\
In the following, it will be shown, for Ca.se 11
if
2
cr
known, that ,
e = 0,
(4.20)
Throughout the proof, it is assumed, without loss of generality,
that
cr
2
= 1.
Define, for
B.
1
= [O,aA.)
1
i::; 1,2, ' , , ,p,
and B.
1
= faA.1. ~oo)
,
By using the transformations defined in (1,31) of Chapter I, it was
shown, in (4.7) above, that
(4.21)
where
MSE{S*}
=
+
dt(ZIAZ)
I A~lE~¢-B
.11
1=
.
1
(ZIAZ)d.1 CZlAZ)z.-e.}]
,
11
is defined as in (4,5) above.
Define
Then, by using the same transformations,
(4.22)
From this and (4.19)
51
Hence,
(4.23)
=
I A~lE[{¢A(Z'Z)<5.
. "I
1=
+
1
1
<P;;:(Z' Z) d i
(Z'Z,Z'AZ)Z.
(Z' Az) Zi -
1
e~}~ ,
where
Z'Z
<5i (Z' Z, Z'AZ) = 1 - 'fiAZ \
(4.24)
and
di(Z'AZ)
is defined as in
(~.5)
above.
Let
8=0 inboth(4.21)
and (4.23). Then,
(4.25)
=
PI
L A~.1
i=l
D.
1
where
(4.·26) D.1 =E [{¢n
-E [{¢A(Z' Z) <5~ (Z 'z ,Z' AZ) }Z~]
B• (Z' AZ)d~1 (Z' AZ) }Z~]
1
1
1
1
By multiplying
inside of the first expectation on the right hand side of the above and
¢B. (Z'AZ)
1
.e
+ ~.
(Z'AZ)
=I
1
inside of the second and the third expectations, we get, after dropping
52
E[{<f>p;(Z'Z)4>[(Z'AZ)di(Z IAZ)}Zi]
which appears both positively and nega-
tively,
(4.27)
D. = E[cPA(Z' Z)<Pn (Z 'A'Z) {d~ (Z'1I.Z) -o~ (Z' Z,Z '11. Z) }Z~]
1
B.
l'
1
1
1
- E[{cP (Z'Z)cP
A
B•
(Z'AZ)o~(Z'Z,Z'1I.Z)}Z~]
1
1
- E[{eJ>AA(Z'Z)cP
1
(Z'AZ)d~(ZIAZ)}Z~]
1 1
B.
1
.
Clearly, both of the second and third terms on the right hand side of
the above are negative.
Hence, it suffices to show that the first
expected value is also negative.
E[cPA(ZIZ)<Pn
B.
1
[
=
and (4.24) , we get
(Z'1I.Z){d~(ZIAZ)-O~(Z'Z,ZI1I.Z)}Z~]
1
1
1
E~A(ZIZ)4>[i(ZI1I.Z)
. which is negative.
From (4.5)
{
(Z'Z+a.)A.}{(ZIZ-a)A.} ~
2- . z'Az 1
Z'AZ 1 Z~ ,
Hence, the proof is c?mpleted.
0
e
CHAPTER V
DISCUSSION AND SOME SUGGESTIONS FOR FUTURE RESEARCH
Some of the general properties of the proposed estimators are presented in this chapter in order to examine the usefulness of these estimators and to expose some problems which have not been solved.
1.
(Standardization of X)
TIlroughout all the previous chapters it has been assumed that the
XIX
independent variables are standardized so that
matrix.
is the correlation
This assumption was necessary to make comparisons between the
proposed estimators and the corresponding estimators for
XI X = 1. How-
ever, these new estimators are defined in exactly the same way even if
X is not standardized.
2.
(Underclying assumptions in defining the
c
Let
(i)
b
A
S*)
b be an estimator of
is distributed as
S such that
N(S,(1 2Q), where Q is a known matrix of
full rank, and that
(ii)
an
estimator,
independent of b
variable with
n-p
s2,
of (12
is available such that
and distributed as
(12
times a chi-square random
degrees of freedom.
Then, the generalized James and Stein I s estimator,
.e
defined as
(5,1)
(n_p)s2
b* = {I - t
s~
blQ 2b
Q-l}b,
b* ,
of
S is
is
54
where
t
is a nonnegative constant.
By Theorems 4 and 5 in Chapter
~
III,
MSE{b*} < MSE{b}
(5.2)
if t
satisfies
0 < t < 2 (p-2) (n-p) / (n-p+2)
(5.3)
and
t
satisfies this inequality if
MSE{b*} < MSE{b}
(5.4)
Moreover, the
MSE{b*}
t
(S.5)
. 3.
S
for every
S.
for some
is minimized when
= (p-2)(n-p)/(n-p+2)
.
(Generalized least squares estimator)
Suppose that, in the multiple linear regression model
y = X13+ e: ,
e:
is distributed as
estimator,
b,
N(O,O 2V).
Then, the generalized least squares
is given by
(5.6)
An estimator, s 2 ,
which is distributed as
2
which satisfies the condition (ii) above is given by
of 0
(5.7)
s
2
= (y-Xb)'V-1 (y-Xb)/(n-p)
.
Hence, the generalized James and Stein's estimator of
as in (5.1) above.
S is defined
e.
55
(Analysis of variance in a one-way classification)
4.
Let
y..
1J
for
j = 1,2, ... ,n.
denote the observations in the i-th
1
group in a one-way classification with I groups.
The usual fixed-
effects analysis of variance model is
(i= 1,2, ... ,1)
y .. = 1-I+a. +E ..
(5.8)
1J
.
1
1J
subject to a constraint
I
.
L1n.a.
=0
11
,
1=
where
E..
1J
variance
-e
is independently normally distributed with mean
2
cr . Let
I
n
y.
= . Ln.,
111
=
1=
1
ni
n.1
L y .. (i = 1,2, ... , I) ,
j=l 1J
and
1
Y=
o
and
I
L n.y.
. 1 1 1
n 1=
Then,
A
1-1
A
a. = y. (i
= Y and
1
1
= 1,2, ... , I)
are the usual least squares estimators of
respectively.
and
1-1
a. (i=1,2, ... ,I)
1
Let
and
A
a
It is well known that
is distributed as
1 1
--n n
1
n
• ••
--n1
--n1
1 1
--n n
• ••
1
n
1
-n
•
1
n
l
.e
(5.9)
Q
2
N(a,cr Q),
=
2
•
•
•
•
•
--
• ••
1 1
---n n
I
where
56
The usual sum of squares within groups
(n-I) s
is independent of
& and
dom variable with n-I
n.
I
2
that
~
distributed as
~R
where
cr
degrees of freedom.
is singular.
3:S;k<I.
_
~ ~ (y .. -y.)
i=l j=l 1) 1
=
James and Stein's estimator of a
because
.
1
a
Let
R
times a chi-square ranNow, the generalized
be a vector of any k a. 's
1
" 's
a.
1
~
2
can not be defined in this case
Then,
is in the form of
2
is distributed as
in (5.9) above and is nonsingu1ar.
Hence, the generalized James and Stein's estimator of
defined as in (5.1) above with
(Determination of the
5.
If r(e) =1
"'"
"
~
in place of
~
r(e))
in the class of estimators,
[17] showed for Case I:
"" 2
r(I3' S/cr )
(5.10)
= a.
-1
cr
2
can be
b.
"'"
l3(r) ,
of Baranchik,
"'"
coincides with the James and Stein's estimator,S.
13(1)
such
known that the
S(r)
Strawderman
with
13 'I3/(2cr 2 )
}
2e- "'"
.p+2-2c -. 1
""2- .
{
xp/2-c e -xl3' 13/(2cr ldx
Jo
(0
:s;
c
<
1,
P ~ 6 - 2c)
is admissible,
by Efron and Morris [7].
However both of these functions seem too
complicated to be of practical use.
~(e)
A different form was produced
The problem of determining the
has not been studied in this thesis.
e.
57
6.
(Distribution of
A
~*)
The problem studied in this paper is point estimation.
Toobtain
confidence intervals or regions the distribution of the proposed estimator should first be known.
Although an explicit formula for its
mean squared error is given in this thesis its distribution has not
been determined.
7.
(Admissibility conditions)
In this thesis, comparison between the new estimator and the
usual least squares estimator is made solely in terms of the mean
squared errors.
-e
For the new estimator to be more useful, it needs to
be compared in terms of other conditions.
Unfortunately, it seems dif-
ficult to do so without first obtaining its distribution.
APPENDIX
A
Some known results are reproduced without proofs in the first two
lemmas.
These and five other lemmas which we obtain in this Appendix
are basic tools which are used to prove main results in the preceeding
chapters.
Lemma 1. (Press [14] and Ruben [15]). Suppose that $Z$ is distributed as $N(\theta, I)$. Let $\alpha > 0$ and $a_1 \ge a_2 \ge \cdots \ge a_p \ge 1$. Then the p.d.f. of the quadratic form

$V = \alpha \sum_{i=1}^{p} a_i Z_i^2 = \alpha \sum_{i=1}^{p} a_i \chi'^2_{1,\theta_i^2}$

is given by

(A.1)   $p_V(v) = \alpha^{-1} \sum_{j=0}^{\infty} K_j\, f_{p+2j}(\alpha^{-1} v),$

where $\chi'^2_{m,d}$ denotes a noncentral chi-square random variable with $m$ degrees of freedom and noncentrality parameter $d$, $f_m(\cdot)$ denotes the p.d.f. of a central chi-square random variable with $m$ degrees of freedom, and the $K_j$'s are given by

(A.2)   $K_0 = e^{-\theta'\theta/2} \prod_{i=1}^{p} a_i^{-1/2}, \qquad K_j = \frac{1}{2j} \sum_{t=1}^{j} g_t K_{j-t}, \quad j = 1,2,\ldots,$

(A.3)   $g_m = \sum_{i=1}^{p} (1 - a_i^{-1})^m + m \sum_{i=1}^{p} \theta_i^2\, a_i^{-1} (1 - a_i^{-1})^{m-1}, \quad m = 1,2,\ldots,$

(A.4)   $\sum_{j=0}^{\infty} K_j = 1, \qquad K_j \ge 0, \quad j = 0,1,2,\ldots.$
59
This was obtained by Press [14] except for the recursive formulas (A.2) and (A.3) which are due to Ruben [15].
One may refer
Johnson and Kotz [11] to trace for different reprsentations of the
p.d.f. of quadratic forms in general.
If
(l
=1
and a.
1
=1
for all
i,
then
K.
becomes the Poisson
J
probabi 1i ty
j = 0,1,2, . . . .
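The following Python sketch illustrates Lemma 1 numerically: it computes the weights $K_j$ by the recursion (A.2)-(A.3), checks (A.4), and compares the series density (A.1) with a crude Monte Carlo estimate. The dimension, the weights $a_i$, and the means $\theta_i$ are hypothetical choices.

    import numpy as np
    from scipy.stats import chi2

    # Hypothetical parameters: p = 3, alpha = 1.5, a_i >= 1, means theta_i.
    alpha = 1.5
    a = np.array([2.0, 1.5, 1.0])
    theta = np.array([0.5, -1.0, 2.0])
    p = len(a)

    # Recursion (A.2)-(A.3) for the mixing weights K_j.
    J = 200                                   # truncation point of the series
    g = np.array([np.sum((1 - 1/a)**m) + m*np.sum(theta**2/a*(1 - 1/a)**(m-1))
                  for m in range(1, J+1)])
    K = np.zeros(J+1)
    K[0] = np.exp(-0.5*np.sum(theta**2)) * np.prod(a**-0.5)
    for j in range(1, J+1):
        K[j] = np.sum(g[:j][::-1]*K[:j]) / (2*j)
    print("sum of K_j:", K.sum())             # should be close to 1, cf. (A.4)

    # Series density (A.1) at a point v, versus a Monte Carlo check.
    def p_V(v):
        dfs = p + 2*np.arange(J+1)
        return np.sum(K*chi2.pdf(v/alpha, dfs)) / alpha

    rng = np.random.default_rng(0)
    Z = rng.normal(theta, 1.0, size=(200000, p))
    V = alpha*(a*Z**2).sum(axis=1)
    v0, eps = np.median(V), 0.05
    mc = np.mean(np.abs(V - v0) < eps) / (2*eps)   # crude density estimate at v0
    print("series density:", p_V(v0), " Monte Carlo:", mc)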
Lemma 2. (Bock [6]). Let $Z$, $V$, and $\chi'^2_{m,d}$ be defined as in Lemma 1. For any fixed $i$ $(i = 1,2,\ldots,p)$, define

$V_1 = \alpha a_i \chi'^2_{3,\theta_i^2} + \alpha \sum_{j \ne i} a_j Z_j^2 \qquad \text{and} \qquad V_2 = \alpha a_i \chi'^2_{5,\theta_i^2} + \alpha \sum_{j \ne i} a_j Z_j^2,$

where $\chi'^2_{3,\theta_i^2}$ and $\chi'^2_{5,\theta_i^2}$ are independent of $Z_j^2$ for all $j \ne i$. Let $h: [0,\infty) \to (-\infty,\infty)$. Then,

(i)   $E[h(V) Z_i] = \theta_i\, E[h(V_1)]$,

(ii)  $E[h(V) Z_i^2] = E[h(V_1)] + \theta_i^2\, E[h(V_2)]$.

These are generalized versions of Bock's results. (See Theorems A and B of [6].)
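A short Monte Carlo sketch of the identities (i) and (ii) of Lemma 2 follows; the constants $\alpha$, $a_i$, $\theta_i$ and the particular bounded function $h$ are hypothetical choices.

    import numpy as np
    from scipy.stats import ncx2

    rng = np.random.default_rng(1)
    alpha = 1.5
    a = np.array([2.0, 1.5, 1.0])
    theta = np.array([0.5, -1.0, 2.0])
    p, i, n = len(a), 0, 400000
    h = lambda v: 1.0/(1.0 + v)               # an arbitrary bounded function

    Z = rng.normal(theta, 1.0, size=(n, p))
    V = alpha*(a*Z**2).sum(axis=1)

    # V_1, V_2: the i-th term replaced by alpha*a_i times a noncentral
    # chi-square with 3 or 5 df and noncentrality theta_i^2.
    rest = alpha*(np.delete(a, i)*rng.normal(np.delete(theta, i), 1.0,
                                             size=(n, p-1))**2).sum(axis=1)
    V1 = alpha*a[i]*ncx2.rvs(3, theta[i]**2, size=n, random_state=rng) + rest
    V2 = alpha*a[i]*ncx2.rvs(5, theta[i]**2, size=n, random_state=rng) + rest

    print(np.mean(h(V)*Z[:, i]),    theta[i]*np.mean(h(V1)))                       # (i)
    print(np.mean(h(V)*Z[:, i]**2), np.mean(h(V1)) + theta[i]**2*np.mean(h(V2)))   # (ii)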
Lemma 3. Let the $K_j$'s be defined as in Lemma 1. Define a random variable $\xi$ as

(A.5)   $\mathrm{Prob.}\{\xi = j\} = K_j, \quad j = 0,1,2,\ldots.$

Then the distribution function of $\xi$,

$F(x) = \mathrm{Prob.}\{\xi \le x\} = \sum_{j \le x} K_j,$

is decreasing in $\theta_i^2$ for each fixed $i = 1,2,\ldots,p$.

Proof. Let a prime ($'$) on a function denote the partial derivative of the function with respect to $\theta_i^2$ for a fixed $i$; for example, $K_t' = \partial K_t/\partial \theta_i^2$ and $g_m' = \partial g_m/\partial \theta_i^2$. It suffices to show that

(A.6)   $\dfrac{\partial F(N)}{\partial \theta_i^2} = \sum_{j=0}^{N} K_j' < 0$   for any $N$.

From (A.2), we get

$K_0' = \dfrac{\partial}{\partial \theta_i^2}\Big( e^{-\theta'\theta/2} \prod_{i=1}^{p} a_i^{-1/2} \Big) = -\dfrac{1}{2} K_0 < 0.$

Hence, (A.6) holds when $N = 0$. To proceed, we will first show that

(A.7)   $K_j' = -\dfrac{1}{2} K_j + \dfrac{1}{2} \sum_{\beta=0}^{j-1} a_i^{-1} (1 - a_i^{-1})^{j-1-\beta} K_\beta, \quad j = 1,2,\ldots.$

From (A.3), we get

$g_m' = m\, a_i^{-1} (1 - a_i^{-1})^{m-1}, \quad m = 1,2,\ldots.$

By direct computations,

$K_1' = \dfrac{\partial}{\partial \theta_i^2}\Big( \dfrac{1}{2}\, g_1 K_0 \Big) = \dfrac{1}{2}\big( g_1' K_0 + g_1 K_0' \big) = -\dfrac{1}{2} K_1 + \dfrac{1}{2}\, a_i^{-1} K_0.$

Hence, (A.7) holds when $j = 1$. Similarly, by using $K_0'$ and $K_1'$ obtained above, (A.7) holds for $j = 2$. Now, suppose that (A.7) holds for $j = 1,2,\ldots,m$, for some $m$. From (A.2),

(A.8)   $K_{m+1}' = \dfrac{1}{2(m+1)} \sum_{\ell=1}^{m+1} \big( g_\ell' K_{m+1-\ell} + g_\ell K_{m+1-\ell}' \big) = \dfrac{1}{2(m+1)} \Big\{ \sum_{\ell=1}^{m+1} \ell\, a_i^{-1}(1-a_i^{-1})^{\ell-1} K_{m+1-\ell} - (m+1) K_{m+1} + \dfrac{1}{2} \sum_{\ell=1}^{m} g_\ell \sum_{\beta=0}^{m-\ell} a_i^{-1}(1-a_i^{-1})^{m-\ell-\beta} K_\beta \Big\}.$

By applying the identity

(A.9)   $\sum_{x=1}^{m} \sum_{y=0}^{x-1} u(x,y) = \sum_{a=1}^{m} \sum_{b=0}^{m-a} u(a+b, b)$

(which can be proved by expanding both sides) to the summation in the last term on the right hand side of (A.8), we get

(A.10)   $\sum_{\ell=1}^{m} g_\ell \sum_{\beta=0}^{m-\ell} a_i^{-1}(1-a_i^{-1})^{m-\ell-\beta} K_\beta = \sum_{\ell=1}^{m} a_i^{-1}(1-a_i^{-1})^{m-\ell} \sum_{t=1}^{\ell} g_t K_{\ell-t} = \sum_{\ell=1}^{m} (2\ell)\, a_i^{-1}(1-a_i^{-1})^{m-\ell} K_\ell,$

where the last equality follows from (A.2). Substitution of (A.10) into (A.8) gives

$K_{m+1}' = \dfrac{1}{2(m+1)} \Big\{ \sum_{\ell=0}^{m} (m+1-\ell)\, a_i^{-1}(1-a_i^{-1})^{m-\ell} K_\ell - (m+1) K_{m+1} + \sum_{\ell=1}^{m} \ell\, a_i^{-1}(1-a_i^{-1})^{m-\ell} K_\ell \Big\} = -\dfrac{1}{2} K_{m+1} + \dfrac{1}{2} \sum_{\ell=0}^{m} a_i^{-1}(1-a_i^{-1})^{m-\ell} K_\ell.$

Hence, (A.7) holds for $j = m+1$. Therefore, by the principle of Mathematical Induction, (A.7) holds for any $j$. Now, from $K_0'$ and (A.7),

$\dfrac{\partial F(N)}{\partial \theta_i^2} = \sum_{j=0}^{N} K_j' = -\dfrac{1}{2} K_0 + \sum_{j=1}^{N} \Big\{ -\dfrac{1}{2} K_j + \dfrac{1}{2} \sum_{\beta=0}^{j-1} a_i^{-1}(1-a_i^{-1})^{j-1-\beta} K_\beta \Big\} = -\dfrac{1}{2} \sum_{j=0}^{N} K_j + \dfrac{1}{2} \sum_{j=1}^{N} \sum_{\beta=0}^{j-1} a_i^{-1}(1-a_i^{-1})^{j-1-\beta} K_\beta.$

By applying the identity

(A.11)   $\sum_{x=1}^{m} \sum_{y=0}^{x-1} u(x,y) = \sum_{a=0}^{m-1} \sum_{b=0}^{m-1-a} u(a+b+1, a)$

to the last term on the right hand side, we get

$\dfrac{\partial F(N)}{\partial \theta_i^2} = -\dfrac{1}{2} \sum_{j=0}^{N} K_j + \dfrac{1}{2} \sum_{j=0}^{N-1} \Big\{ \sum_{\ell=0}^{N-1-j} a_i^{-1}(1-a_i^{-1})^{\ell} \Big\} K_j = -\dfrac{1}{2} \sum_{j=0}^{N} K_j + \dfrac{1}{2} \sum_{j=0}^{N-1} \Big\{ 1 - (1-a_i^{-1})^{N-j} \Big\} K_j.$

By assumption, $a_i \ge 1$ for all $i$; hence $0 \le 1 - a_i^{-1} < 1$ for all $i$. If $1 - a_i^{-1} = 0$, then $\sum_{\ell=0}^{N-1-j} a_i^{-1}(1-a_i^{-1})^{\ell} = a_i^{-1} = 1$. (For consistency, it is understood that $0^0 = 1$ and $0^\ell = 0$ for all $\ell \ne 0$.) If $0 < 1 - a_i^{-1} < 1$, then $\sum_{\ell=0}^{N-1-j} a_i^{-1}(1-a_i^{-1})^{\ell} = 1 - (1-a_i^{-1})^{N-j} < 1$. In either case each bracketed factor is at most 1, so that

$\dfrac{\partial F(N)}{\partial \theta_i^2} \le -\dfrac{1}{2} \sum_{j=0}^{N} K_j + \dfrac{1}{2} \sum_{j=0}^{N-1} K_j = -\dfrac{1}{2} K_N < 0,$

which completes the proof. $\Box$
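The monotonicity asserted in Lemma 3 can also be checked numerically. The sketch below recomputes the $K_j$ of Lemma 1 for increasing values of $\theta_1^2$ and prints the partial sums $F(N)$, which should decrease; the weights $a_i$, the remaining $\theta_i$, and $N$ are hypothetical choices.

    import numpy as np

    def K_weights(a, theta, J=100):
        # Recursion (A.2)-(A.3) for the mixing weights K_j of Lemma 1.
        g = np.array([np.sum((1 - 1/a)**m) + m*np.sum(theta**2/a*(1 - 1/a)**(m-1))
                      for m in range(1, J+1)])
        K = np.zeros(J+1)
        K[0] = np.exp(-0.5*np.sum(theta**2)) * np.prod(a**-0.5)
        for j in range(1, J+1):
            K[j] = np.sum(g[:j][::-1]*K[:j]) / (2*j)
        return K

    a, N = np.array([2.0, 1.5, 1.2]), 5
    for t1sq in (0.0, 0.5, 1.0, 2.0, 4.0):
        theta = np.array([np.sqrt(t1sq), -1.0, 0.5])
        F_N = K_weights(a, theta)[:N+1].sum()
        print(t1sq, F_N)          # F(N) should decrease as theta_1^2 grows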
Lemma 4. Let $V_1$ and $V_2$ be defined as in Lemma 2. Then the p.d.f.'s of $V_1$ and $V_2$ are given by, respectively,

(i)   $p_{V_1}(v_1) = \alpha^{-1} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} a_i^{-1} (1-a_i^{-1})^{j-\ell} K_\ell \Big\}\, f_{p+2+2j}(v_1/\alpha),$

(ii)  $p_{V_2}(v_2) = \alpha^{-1} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} (j-\ell+1)\, a_i^{-2} (1-a_i^{-1})^{j-\ell} K_\ell \Big\}\, f_{p+4+2j}(v_2/\alpha).$
Proof. Both (i) and (ii) are proved simultaneously. Let $\Psi_m(t)$ denote the characteristic function of a chi-square random variable with $m$ degrees of freedom. Then

$\Psi_m(t) = (1 - 2it)^{-m/2} = \Psi_2^{m/2}(t).$

Let $\phi_Z(t)$ denote the characteristic function of a random variable $Z$, and let $V$ be defined as in Lemma 1. Then, from (A.1), we get

(A.12)   $\phi_{V/\alpha}(t) = \sum_{j=0}^{\infty} K_j\, \Psi_{p+2j}(t) = \Psi_p(t) \sum_{j=0}^{\infty} K_j\, \Psi_2^{j}(t).$

On the other hand,

$(1 - 2 a_i i t)^{-1} = \big\{ a_i (1-2it) - (a_i - 1) \big\}^{-1}.$

By using the Taylor series

$(1 - z)^{-m} = \sum_{\ell=0}^{\infty} \binom{m-1+\ell}{\ell} z^{\ell}, \quad |z| < 1,$

we get

(A.13)   $(1 - 2 a_i i t)^{-m} = a_i^{-m}\, \Psi_2^{m}(t) \sum_{\ell=0}^{\infty} \binom{m-1+\ell}{\ell} (1 - a_i^{-1})^{\ell}\, \Psi_2^{\ell}(t).$

Define

(A.14)   $V_m = \alpha a_i \chi'^2_{1+2m,\theta_i^2} + \alpha \sum_{j \ne i} a_j Z_j^2 \qquad (m = 1,2).$

Then, by using (A.12) and (A.13), we get

(A.15)   $\phi_{V_m/\alpha}(t) = \phi_{a_i \chi'^2_{1+2m,\theta_i^2}}(t)\; \phi_{\sum_{j \ne i} a_j Z_j^2}(t) = (1 - 2 a_i i t)^{-m}\, \phi_{V/\alpha}(t) = a_i^{-m} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} \binom{m-1+j-\ell}{j-\ell} (1 - a_i^{-1})^{j-\ell} K_\ell \Big\}\, \Psi_{p+2m+2j}(t).$

Application of the inversion formula termwise gives

(A.16)   $p_{V_m}(v_m) = \alpha^{-1} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} \binom{m-1+j-\ell}{j-\ell} a_i^{-m} (1 - a_i^{-1})^{j-\ell} K_\ell \Big\}\, f_{p+2m+2j}(v_m/\alpha).$

From (A.14) and (A.16) we get (i) and (ii) by letting $m = 1$ and $m = 2$, respectively, which completes the proof. $\Box$
The above (A.16) may be written as

(A.17)   $p_{V_m}(v_m) = \alpha^{-1} \sum_{j=0}^{\infty} K_j'\, f_{p+2m+2j}(v_m/\alpha),$

which is a form of (A.1) in Lemma 1. However, the effect of the specific difference between $V_m$ and $V$ is not explained in (A.17). To perform algebraic operations among expected values of the form given in Lemma 2 above, the relationship between $K_j$ and $K_j'$ should be known.
Lemma 5. Suppose that $G_1$ and $G_2$ are two distribution functions such that

(A.18)   $G_1(x) \ge G_2(x)$   for all $x$.

Let $h: [0,\infty) \to [0,1]$ be a monotone function. Then

(i)   $E_1[h(X)] \le E_2[h(X)]$   if $h(\cdot)$ is nondecreasing,

(ii)  $E_1[h(X)] \ge E_2[h(X)]$   if $h(\cdot)$ is nonincreasing,

where

$E_i[h(X)] = \int_0^{\infty} h(x)\, dG_i(x), \quad i = 1,2.$

Proof. (i) By (A.18), there exist two random variables $X_1$ and $X_2$ such that

$X_1(z) \le X_2(z)$ for all $z$, and $G_i(x) = P\{z: X_i(z) \le x\}, \quad i = 1,2.$

(See Lemma 1 of Lehmann [12], p. 73.) Hence, we get

$E_1[h(X)] = \int h\{X_1(z)\}\, dP(z) \le \int h\{X_2(z)\}\, dP(z) = E_2[h(X)],$

which completes the proof.

(ii) This follows from (i) by considering the nondecreasing function $g = 1 - h: [0,\infty) \to [0,1]$. $\Box$
Lemma 6. Let $f_m(\cdot)$ be defined as in Lemma 1 and let $h: [0,\infty) \to [0,1]$ be a monotone function. Define

$A_m = \int_0^{\infty} h(x)\, f_m(x)\, dx.$

Then, for any $m$,

(i)   $A_m \le A_{m+1}$   if $h(\cdot)$ is nondecreasing,

(ii)  $A_m \ge A_{m+1}$   if $h(\cdot)$ is nonincreasing.

Proof. (i) Let $X_1$ and $X_m$ be two independent chi-square random variables with $1$ and $m$ degrees of freedom, respectively. Then the random variable $X_{m+1}$ defined by $X_{m+1} = X_1 + X_m$ has a chi-square distribution with $m+1$ degrees of freedom. Let $G_k$ be the distribution function of $X_k$, $k = 1, m, m+1$. Then, for any $x$,

$G_{m+1}(x) = \int_0^{\infty} G_m(x - z)\, dG_1(z) \le \int_0^{\infty} G_m(x)\, dG_1(z) = G_m(x) \qquad (z \ge 0).$

On the other hand,

$A_m = \int_0^{\infty} h(x)\, f_m(x)\, dx = \int_0^{\infty} h(x)\, dG_m(x).$

Hence, the proof follows from Lemma 5 (i).

(ii) This is proved as above by using Lemma 5 (ii). $\Box$
Lemma 7. Let $f_m(\cdot)$ be defined as in Lemma 1 and let $h: [0,\infty) \to [0,1]$. Then

$\int_0^{\infty} x^k h(x)\, f_m(x)\, dx = \nu_k \int_0^{\infty} h(x)\, f_{m+2k}(x)\, dx,$

where $m + 2k > 0$ and

$\nu_k = \dfrac{\Gamma(m/2+k)\, 2^k}{\Gamma(m/2)}$

is the $k$-th moment of a chi-square random variable with $m$ degrees of freedom.

Proof. By direct computations,

$\int_0^{\infty} x^k h(x)\, f_m(x)\, dx = \dfrac{1}{\Gamma(m/2)\, 2^{m/2}} \int_0^{\infty} h(x)\, x^{k}\, x^{m/2-1} e^{-x/2}\, dx = \dfrac{\Gamma(m/2+k)\, 2^{m/2+k}}{\Gamma(m/2)\, 2^{m/2}} \cdot \dfrac{1}{\Gamma\big(\tfrac{m+2k}{2}\big)\, 2^{(m+2k)/2}} \int_0^{\infty} h(x)\, x^{(m+2k)/2-1} e^{-x/2}\, dx.$

Hence the proof is completed. $\Box$
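A quick numerical check of the identity in Lemma 7, with a hypothetical nonincreasing $h$ and hypothetical $m$ and $k$, is:

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import chi2
    from scipy.special import gamma

    # Check: int x^k h(x) f_m(x) dx = nu_k * int h(x) f_{m+2k}(x) dx,
    # with nu_k = 2^k Gamma(m/2+k)/Gamma(m/2).
    h = lambda x: 1.0/(1.0 + x)          # a monotone function into [0,1]
    m, k = 5, 2
    nu_k = 2**k * gamma(m/2 + k) / gamma(m/2)
    lhs, _ = quad(lambda x: x**k * h(x) * chi2.pdf(x, m), 0, np.inf)
    rhs, _ = quad(lambda x: h(x) * chi2.pdf(x, m + 2*k), 0, np.inf)
    print(lhs, nu_k * rhs)               # the two numbers should agree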
APPENDIX B
Proof of Theorem 1 (ii). By using the transformations defined in (1.31) in Chapter I,

(B.1)   $E\big[ (\hat{\beta} - \beta)'(X'X)\hat{\beta}\; h\{\hat{\beta}'(X'X)^2\hat{\beta}\} \big] = E\big[ (Z - \theta)'Z\; h(Z'AZ) \big].$

For a fixed $i$ $(i = 1,2,\ldots,p)$, let

$V_1 = \alpha a_i \chi'^2_{3,\theta_i^2} + \alpha \sum_{j \ne i} a_j Z_j^2 \qquad \text{and} \qquad V_2 = \alpha a_i \chi'^2_{5,\theta_i^2} + \alpha \sum_{j \ne i} a_j Z_j^2,$

where both $\chi'^2_{3,\theta_i^2}$ and $\chi'^2_{5,\theta_i^2}$ are independent of $Z_j^2$ for all $j \ne i$. Then, by Lemma 2 in Appendix A,

$E[h(V) Z_i] = \theta_i\, E[h(V_1)] \qquad \text{and} \qquad E[h(V) Z_i^2] = E[h(V_1)] + \theta_i^2\, E[h(V_2)].$

By using the p.d.f. of $V_1$ given in Lemma 4 (i) in Appendix A, we get, as in (1.37),

(B.2)   $E[h(V_1)] = \alpha^{-1} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} a_i^{-1}(1-a_i^{-1})^{j-\ell} K_\ell \Big\} \dfrac{I_{p+2j}}{p+2j},$

where $I_m$ is defined in (1.30). The p.d.f. of $V_2$, given in Lemma 4 (ii) in Appendix A, can be written as

$p_{V_2}(v_2) = \alpha^{-1} \sum_{j=1}^{\infty} \Big\{ \sum_{\ell=0}^{j-1} (j-\ell)\, a_i^{-2}(1-a_i^{-1})^{j-\ell-1} K_\ell \Big\}\, f_{p+2+2j}(v_2/\alpha).$

From this, as in (1.37) and (B.2), we get

(B.3)   $E[h(V_2)] = \alpha^{-1} \sum_{j=1}^{\infty} \Big\{ \sum_{\ell=0}^{j-1} (j-\ell)\, a_i^{-2}(1-a_i^{-1})^{j-\ell-1} K_\ell \Big\} \dfrac{I_{p+2j}}{p+2j}.$
Hence, by (B.2) and (B.3),

$E[h(V) Z_i^2] - \theta_i E[h(V) Z_i] = (1 - \theta_i^2)\, E[h(V_1)] + \theta_i^2\, E[h(V_2)] = \alpha^{-1} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} a_i^{-1}(1-a_i^{-1})^{j-\ell} K_\ell \Big\} \dfrac{I_{p+2j}}{p+2j} - \alpha^{-1} \sum_{j=0}^{\infty} \Big\{ \sum_{\ell=0}^{j} \theta_i^2\, a_i^{-1}(1-a_i^{-1})^{j-\ell} K_\ell \Big\} \dfrac{I_{p+2j}}{p+2j} + \alpha^{-1} \sum_{j=1}^{\infty} \Big\{ \sum_{\ell=0}^{j-1} \theta_i^2\, (j-\ell)\, a_i^{-2}(1-a_i^{-1})^{j-\ell-1} K_\ell \Big\} \dfrac{I_{p+2j}}{p+2j}.$

By substituting in the first and second terms on the right hand side, respectively, we get (B.5). Now, application of the recursive formulas (1.36) and (1.37) to the summations of the above on $\ell$ gives

(B.6)   $= \alpha^{-1} \sum_{j=0}^{\infty} K_j \Big( I_{p+2j} - \dfrac{2j}{p-2+2j}\, I_{p-2+2j} \Big).$

(Interchangeability of the order of the summations is justified by the fact that we are dealing with finite summations of infinite series, all of which are convergent.) Hence, from (B.1) and (B.6), we get the identity. $\Box$
APPENDIX C

The 40 x 6 matrix X* of unstandardized independent variables:

  i    X*_i1    X*_i2    X*_i3    X*_i4    X*_i5    X*_i6
  1    8.000    2.000    2.000    2.000    0.056    0.465
  2    8.000    2.000    2.000    0.000    0.847   -0.524
  3    8.000    2.000    2.000    0.000   -0.934    1.948
  4    8.000    2.000    2.000    0.000    1.283   -0.246
  5    8.000    2.000    2.000    0.000    1.350   -0.743
  6    8.000    2.000    2.000    0.000   -1.807   -2.320
  7    8.000    2.000    2.000    0.000   -0.191   -1.404
  8    8.000    2.000    2.000    0.000    1.435   -1.399
  9    8.000    2.000    2.000    0.000   -0.082    0.177
 10    8.000    2.000    2.000    0.000    0.257   -0.254
 11    0.000    0.000   10.000    2.000    0.351    0.906
 12    0.000    0.000   10.000    2.000   -0.083   -1.142
 13    0.000    0.000   10.000    2.000   -1.392   -0.982
 14    0.000    0.000   10.000    2.000    0.311   -2.182
 15    0.000    0.000   10.000    2.000    0.636   -0.306
 16    0.000    0.000   10.000    2.000    0.594    0.265
 17    0.000    0.000   10.000    2.000   -0.543    0.692
 18    0.000    0.000   10.000    2.000   -0.189    1.229
 19    0.000    0.000   10.000    2.000    0.793    0.966
 20    0.000    0.000   10.000    2.000   -0.483   -0.251
 21    3.000    2.000    0.000    7.000   -0.311    0.297
 22    3.000    2.000    0.000    7.000   -0.596   -0.499
 23    3.000    2.000    0.000    7.000    0.140    1.914
 24    3.000    2.000    0.000    7.000   -0.074   -1.577
 25    3.000    2.000    0.000    7.000    1.714    0.119
 26    3.000    2.000    0.000    7.000    0.923   -0.659
 27    3.000    2.000    0.000    7.000    1.180   -1.083
 28    3.000    2.000    0.000    7.000    0.247    0.590
 29    3.000    2.000    0.000    7.000    1.131    0.633
 30    3.000    2.000    0.000    7.000    0.388   -1.355
 31    0.000    0.000    0.000   12.000    0.608    0.543
 32    0.000    0.000    0.000   12.000    0.095   -0.261
 33    0.000    0.000    0.000   12.000   -0.273   -0.502
 34    0.000    0.000    0.000   12.000    0.007    0.495
 35    0.000    0.000    0.000   12.000    0.302    0.548
 36    0.000    0.000    0.000   12.000   -0.729    0.546
 37    0.000    0.000    0.000   12.000    0.474    2.565
 38    0.000    0.000    0.000   12.000   -0.825   -0.838
 39    0.000    0.000    0.000   12.000   -0.669    0.872
 40    0.000    0.000    0.000   12.000    0.765   -0.124
BIBLIOGRAPHY

[1] Baranchik, A.J. (1964). Multiple regression and estimation of the mean of a multivariate normal distribution. Technical Report No. 51, Stanford University.

[2] Baranchik, A.J. (1970). A family of minimax estimators of the mean of a multivariate normal distribution. Annals of Mathematical Statistics 41, 642-645.

[3] Berger, J.O. (1976). Admissible minimax estimation of a multivariate normal mean with arbitrary loss. Annals of Statistics 4, 223-226.

[4] Berger, J.O. and Bock, M.E. (1976). Eliminating singularities of Stein-type estimators of location vectors. Journal of the Royal Statistical Society, Series B 38, 166-170.

[5] Bhattacharya, P.K. (1966). Estimating the mean of a multivariate normal population with general quadratic loss function. Annals of Mathematical Statistics 37, 1819-1824.

[6] Bock, M.E. (1975). Minimax estimators of the mean of a multivariate normal distribution. Annals of Statistics 3, 209-218.

[7] Efron, B. and Morris, C. (1973). Stein's estimation rule and its competitors - an empirical Bayes approach. Journal of the American Statistical Association 68, 117-130.

[8] Harville, D.A. (1971). On the distribution of linear combinations of non-central chi-squares. Annals of Mathematical Statistics 42, 809-811.

[9] Hudson, H.M. (1974). Empirical Bayes estimation. Technical Report No. 58, Stanford University.

[10] James, W. and Stein, C. (1961). Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability 1, 361-379.

[11] Johnson, N.L. and Kotz, S. (1970). Continuous Univariate Distributions--2. Houghton Mifflin Co., 149-188.

[12] Lehmann, E.L. (1959). Testing Statistical Hypotheses. John Wiley and Sons, Inc., 73-74.

[13] Loeve, M. (1963). Probability Theory. 3rd Ed. D. Van Nostrand Co., Inc.

[14] Press, S.J. (1966). Linear combinations of non-central chi-square variates. Annals of Mathematical Statistics 37, 480-487.

[15] Ruben, H. (1962). Probability content of regions under spherical normal distributions IV. Annals of Mathematical Statistics 33, 542-570.

[16] Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1, 197-206.

[17] Strawderman, W.E. (1971). Proper Bayes minimax estimators of the multivariate normal mean. Annals of Mathematical Statistics 42, 385-388.

[18] Strawderman, W.E. (1973). Proper Bayes minimax estimators of the multivariate normal mean vector for the case of common unknown variances. Annals of Statistics 1, 1189-1194.