A LEAST SQUARES APPROACH
TO QUADRATIC UNBIASED ESTIMATION
OF VARIANCE COMPONENTS
IN GENERAL MIXED MODEL
by
Cheng-Hsuan Yuan
Institute of Statistics
Mimeograph Series No. 1132
Raleigh, N. C.
August 1977
TABLE OF CONTENTS

1.  INTRODUCTION

2.  PRELIMINARY FORMULATIONS
    2.1  The General Mixed Model
    2.2  The Inner Product Space A and the Linear Transformation H

3.  THE LEAST SQUARES APPROACH TO QUADRATIC UNBIASED ESTIMATION OF VARIANCE COMPONENTS
    3.1  Introduction
    3.2  General Results
    3.3  Quadratic Least Squares Estimation of Variance Components
    3.4  Invariant Estimation of Variance Components

4.  ANOTHER LOOK AT SOME OTHER ESTIMATION PROCEDURES
    4.1  Seely's Work on Quadratic Unbiased Estimation in Mixed Models
    4.2  Koch's Symmetric Sum Approach to the Estimation of Variance Components
    4.3  A Least Squares Approach to the MINQUE
    4.4  A Geometric Approach to Quadratic Least Squares Estimation and the MINQUE

5.  CONCLUSIONS

6.  LIST OF REFERENCES
1.  INTRODUCTION
Estimation of variance components has been a long-standing problem
associated with statistical inference in general linear models.
In its
early stage of development, the estimators were derived as linear combinations of sums of squares from analysis of variance tables for testing
linear hypotheses.
The well known methods proposed by Henderson (1953)
and extended to more complex models by Searle (1968) still retained the
form as linear combinations of sums of squares.
In the case of balanced
models, the estimators obtained from the analysis of variance have been
proved to be uniformly minimum variance quadratic unbiased estimators
(cf. Graybill, 1961).
However, for the unbalanced models, the analysis
of variance methods do not, in general, give unbiased estimates for all
quadratic estimable functions.
Searle (1968) pointed out Henderson's
Method 1 does not give unbiased estimates in mixed models and Method 2
cannot be used if there is an interaction between fixed and random
effects.
Harville (1967) gave an example where Henderson's Method 3 does
not give unbiased estimates of estimable variance components in a two-way
classification random model.
The limitation in the estimability of the analysis of variance methods is due to the restriction that the estimators are linear combinations of sums of squares rather than arbitrary quadratic forms of the observations.
Some significant developments took place in the last decade.
Koch
(1967) proposed a symmetric sum approach to the estimation of variance
components.
The underlying idea was to consider the linear combination
of all squares and cross products of the observations, as Koch stated,
"Any estimable parameter can be estimated by forming an appropriate
linear combination of certain symmetric sums of (linearly independent) random variables having the same expectation such that the resulting quadratic form is an unbiased estimate of it."  He considered the observation vector Y from a certain experiment and denoted

    E(Y) = μ  and  Var(Y) = V ,
    E(YY') = μμ' + V = S .

Given the design of the experiment, μ and V have a particular linear structure which can be utilized in constructing estimators.
With this
basic theory Koch then demonstrated the symmetric sums approach in a
class of random models including some nested models, cross classification models and an experiment of mixed type.
Even though the general
form of the symmetric sum estimators was not formulated, the general
feature of the method in different models is clear.
The estimating
procedure consists of calculating the averages of the entries in the
matrix YY' having the same expectation
(i.e.,
the identical entries in
S), equating the averages to their expectations and solving the set of
equations.
For the models Koch used to demonstrate his methods, the coefficient matrices of the equations are non-singular.  Therefore the solutions are unbiased estimators of the parameters.  The estimators were found to be not invariant under shifts in location of the data.
Koch
(1968) modified the symmetric sum method by using squared differences of
observations instead of the squares and cross products of observations.
Forthofer and Koch (1974) extended the method to mixed models, where cross products of differences of observations were used as well as squared differences.
In his work on estimation in finite dimensional linear spaces, Seely (1969, 1970a, 1970b) treated the quadratic unbiased estimation in mixed models as a special case of his general theory.  He also considered the expectation of YY' as Koch did.  Based on the linear structure in E(Y) = Xβ and Var(Y) = Σ_{i=1}^m V_i σ_i², he introduced the representation

    E(YY') = Hθ                                                     (1.1)

where θ is the M×1 vector of parameters and H is a linear transformation from R^M into A, the linear space of all n×n symmetric matrices.  Seely first considered the general case of quadratic estimation of linear functions of θ in (1.1).  He obtained necessary and sufficient conditions for the estimability of linear functions ℓ'θ and also derived equations for the estimation of ℓ'θ.
As will be shown in Chapter 4 of this disser-
tation, Seely's estimation equations lead to estimators identical to
Koch's (1967) in the random models considered by Koch.
Seely also derived
a set of equations for estimation of variance components but the consistency of the equations was not verified.
Above all, Seely's work presents
an important motivation to the present work on the least squares approach
to quadratic estimation of variance components.
Another significant development was Rao's (1970, 1971a, 1971b, 1972)
theory of minimum norm quadratic unbiased estimation (known as the MINQUE).
In the general mixed model

    Y = Xβ + Uε ,

suppose the random effects ε were observable.  Then a natural estimator of Σ_{i=1}^m p_i σ_i² would be ε'Δε with Δ a suitably defined diagonal matrix.  If Y'AY with AX = 0 is an unbiased estimator of Σ_{i=1}^m p_i σ_i², then the difference between Y'AY and ε'Δε is ε'(U'AU - Δ)ε.  The MINQUE, as Rao stated, is to make ε'(U'AU - Δ)ε "small in some sense by minimizing ‖U'AU - Δ‖", or equivalently, by minimizing ‖U'AU‖.  In case the condition AX = 0 is not consistent with unbiasedness, the MINQUE will be obtained by minimizing tr(A(V + 2XX')AV) where V = UU'.  In either case, it will be shown in Chapter 4 that the MINQUE can be considered a special case of the least squares quadratic estimation of variance components.
The aim of the present work is two-fold:
(1)
To extend and apply the theory of least squares estimation to
quadratic estimation of variance components in the general
mixed model.
(2)
To demonstrate that the variance component estimation procedures proposed by some previous authors can be unified in
the general quadratic least squares approach.
Since Gauss' work in the early years of the nineteenth century, the
theory of least squares, through the effort of various authors, has been
established with much generality in linear estimation.
To the present
author, it seems to be a natural extension to apply the least squares
principles to the quadratic estimation in mixed models.
In this dis-
sertation we shall demonstrate this without making special assumptions
on the distribution of the random variables, except for the existence
of the second degree moment.  Quadratic estimates Y'AY are linear functionals of YY'.  Application of least squares estimation with the symmetric matrix YY' as the dependent variable does not fit into the conventional form of least squares estimation, where the dependent variable is a column vector.  To facilitate the application of the principle of least squares estimation, Seely's representation E(YY') = Hθ is adopted, and linear transformations on the linear space A of all n×n symmetric matrices are defined and derived.  These will be presented as part of the results in Chapter 3.
In Chapter 4 we shall verify that Koch's symmetric
sums estimators and the MINQUE are both special cases of least squares
estimators.
Also, Seely's work will be further discussed.
At the completion of the present work, a recent article by Pukelsheim
(1976) was brought to the author's attention.
Pukelsheim considered the model E(Y) = Xβ, Var(Y) = Σ_{i=1}^k V_i t_i, and derived the model

    E(MY ⊗ MY) = D_M t ,                                            (1.2)

where M is the orthogonal projection matrix on the null space of X', and D_M is the known coefficient matrix of the vector t of variance components.  Among the results derived from this model, he pointed out that

    t̂ = D_M⁺ (MY ⊗ MY) ,                                            (1.3)

where D_M⁺ is the Moore-Penrose inverse of D_M, is the ordinary least squares estimate of t, and he also showed that, with a suitably chosen positive definite matrix T, Rao's MINQUE with invariance can be obtained from the normal equation (1.4).

The above results coincide with part of the findings in the present work, which was developed independently of Pukelsheim's work.  In Chapters 3 and 4 of this dissertation, special remarks will be made where the equivalence to Pukelsheim's results is observed.
2.  PRELIMINARY FORMULATIONS

2.1  The General Mixed Model

By the general mixed model we mean the n×1 vector Y of observations with the linear structure

    Y = Xβ + Σ_{i=1}^m U_i ε_i                                      (2.1)

where X is an n×p matrix of known constants; β is a p×1 vector of unknown constants; U_i are n×q_i matrices of known constants; ε_i are q_i×1 vectors of random variables with

    E(ε_i) = 0  and  Var(ε_i) = σ_i² C_i ,

where C_i are q_i×q_i matrices of known constants and σ_1², …, σ_m² are unknown variance components.  With q_m = n and U_m and C_m non-singular, this model is a generalized representation of the so-called fixed, random and mixed models.  When p = 1, (2.1) is a random model and when m = 1, it is a fixed model.
From model
(2.1), it follows immediately that
P
E(Y) = X8 =
I
i=l
X8
i i
(2.2)
m
and
Var(Y) =
I Uiciuicr~
i=l
m
=
I
i=l
2
V.cr.
~
~
where 8.~ and X.~ are the ith entry in 8 and the ith column of X, respectively, and V.
~
= U.C.U i'
~
~
•
Denote W = yy' .
Then
    E(W) = Xββ'X' + Σ_{i=1}^m V_i σ_i² .

Denote

    B_ii = X_i X_i' ,  B_ij = X_i X_j' + X_j X_i'  (i < j) ,  θ_ij = β_i β_j ,  i ≤ j = 1, …, p ,   (2.3)

and

    M = ½ p(p+1) + m .

Then

    E(W) = Σ_{i=1}^p Σ_{j=i}^p B_ij θ_ij + Σ_{i=1}^m V_i σ_i² .
With this fixed set {B_ij, V_i} of M symmetric matrices, Seely (1969, 1970b) introduced a linear transformation H from R^M into the linear space A of all n×n symmetric matrices over the field of real numbers.  The linear transformation H is defined as

    Hρ = Σ_{i=1}^p Σ_{j=i}^p B_ij ρ_ij + Σ_{i=1}^m V_i ρ_i           (2.4)

for every M×1 vector ρ = (ρ_11 … ρ_1p ρ_22 … ρ_pp ρ_1 … ρ_m)' in R^M, where V_i and B_ij are given in (2.2) and (2.3), respectively.  With this definition, we write Seely's representation as

    E(W) = Hθ                                                       (2.5)

where θ = (θ_11 … θ_pp σ_1² … σ_m²)' and the θ_ij are given in (2.3).

To facilitate the later development, it is helpful to make the partition

    Hθ = H_1 θ_1 + H_2 θ_2 ,                                        (2.6)

where

    H_1 θ_1 = Σ_{i=1}^p Σ_{j=i}^p B_ij θ_ij  and  H_2 θ_2 = Σ_{i=1}^m V_i σ_i² .   (2.7)

Accordingly, we partition the parameter space Ω as the product space Ω_1 × Ω_2, where Ω_1 is the set of all ½p(p+1)×1 parametric vectors θ_1 and Ω_2 is the set of all m×1 parametric vectors θ_2.  We adopt Seely's assumption on Ω that λ'θ = 0 for all θ in Ω implies λ = 0.  Actually, this is the consequence of the conventional assumption that

    -∞ < β_i < +∞ ,   0 < σ_i² < +∞ ,

and that β_i, β_j, σ_k², σ_t² are functionally independent whenever i ≠ j and k ≠ t.  (Any linear dependence among the β_i can be removed by reparameterization.)  Because

    Σ_{i=1}^p Σ_{j=i}^p λ_ij β_i β_j = 0  for all β in R^p  implies  λ_ij = 0 ,  1 ≤ i ≤ j ≤ p ,

and

    Σ_{i=1}^m λ_i σ_i² = 0  for all θ_2 in Ω_2  implies  λ_i = 0 ,  1 ≤ i ≤ m .
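The decomposition E(W) = Σ B_ij θ_ij + Σ V_i σ_i² can be checked numerically.  The following is a minimal sketch (not taken from the dissertation), assuming a small hypothetical one-way layout and the concrete choice B_ii = X_iX_i', B_ij = X_iX_j' + X_jX_i' (i < j) consistent with (2.3); the group sizes, β and σ_i² are arbitrary illustrative values.

```python
# Sketch: verify E(W) = X b b' X' + sum V_i s_i^2 = sum B_ij theta_ij + sum V_i s_i^2
import numpy as np

n_i = [2, 3]                                   # hypothetical group sizes
n = sum(n_i)
X = np.column_stack([np.ones(n),               # overall-mean column
                     np.repeat([0.0, 1.0], n_i)])   # a fixed treatment column
U1 = np.zeros((n, len(n_i)))                   # design matrix of the random effect
for g, (start, size) in enumerate(zip(np.cumsum([0] + n_i[:-1]), n_i)):
    U1[start:start + size, g] = 1.0
C1, C2 = np.eye(len(n_i)), np.eye(n)
V = [U1 @ C1 @ U1.T, np.eye(n) @ C2 @ np.eye(n)]    # V_i = U_i C_i U_i'

beta = np.array([1.5, -0.7])                   # arbitrary fixed effects
sigma2 = np.array([2.0, 1.0])                  # arbitrary variance components

p = X.shape[1]
B, theta = [], []
for i in range(p):
    for j in range(i, p):
        Xi, Xj = X[:, [i]], X[:, [j]]
        B.append(Xi @ Xi.T if i == j else Xi @ Xj.T + Xj @ Xi.T)
        theta.append(beta[i] * beta[j])        # theta_ij = beta_i beta_j

lhs = X @ np.outer(beta, beta) @ X.T + sum(s * Vi for s, Vi in zip(sigma2, V))
rhs = sum(t_ * Bij for t_, Bij in zip(theta, B)) + sum(s * Vi for s, Vi in zip(sigma2, V))
print(np.allclose(lhs, rhs))                   # True: the two forms of E(W) agree
```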
2.2  The Inner Product Space A and the Linear Transformation H

We are interested in the quadratic forms Y'AY whose expectation is a certain linear functional ℓ'θ.  Since we intend to deal with the general model (2.1) or (2.5) and the parametric linear functionals ℓ'θ, we shall consider every n×n symmetric matrix as a potential candidate to be used in the quadratic form of Y.  Denote A to be the linear space of all n×n symmetric real matrices.  Since W = YY' is in A for every Y in R^n and we want to express Y'AY as a linear functional of W, we need to define an inner product ( , ) on A such that

    (A, W) = Y'AY

for every A in A and Y in R^n.  Consequently the inner product is to be defined, for any pair A, B in A, as the trace of the product matrix AB, i.e.,

    (A, B) = tr(AB)  for every A, B in A .
Also the inner product < , > on R^M is defined to be

    <a, b> = a'b  for every a, b in R^M .

With the inner products on A and R^M defined, the adjoint H* of H is determined as the unique linear transformation from A into R^M such that

    (Hρ, A) = <ρ, H*A>  for every ρ in R^M and A in A ;

therefore, H* is defined by

    H*A = ( (B_11, A), …, (B_pp, A), (V_1, A), …, (V_m, A) )' .

Corresponding to the partition of H in (2.6), we partition H* as

    H*A = ( H_1*A , H_2*A )' ,

where H_1* and H_2* are the adjoints of H_1 and H_2 respectively.  H_1*A is the upper ½p(p+1)×1 subvector in H*A and H_2*A the lower m×1 subvector.

We denote R(H) to be the range space of H, r(H) the rank of H, N(H*) the null space of H* and n(H*) the nullity of H*.  Based on the fact that (A, Hρ) = <H*A, ρ> for every A in A and ρ in R^M, it can easily be proved that R(H) and N(H*) are orthogonal complements in A.  It follows that for every A in A, there exist unique A_1 in R(H) and A_0 in N(H*) such that A = A_1 + A_0.  Since A_1 is in R(H), A_1 = Hρ for some ρ in R^M.  Therefore

    H*A = H*A_1 = H*Hρ .

This implies R(H*) = R(H*H).  It should be noted that the composite linear transformation H*H maps from R^M into R^M and can be expressed as an M×M matrix.  Seely showed that H*H is the matrix of inner products of the set {B_ij, V_i},

    H*H = [ (B_11, B_11) … (B_11, B_pp)  (B_11, V_1) … (B_11, V_m) ]
          [      ⋮               ⋮             ⋮              ⋮     ]
          [ (B_pp, B_11) … (B_pp, B_pp)  (B_pp, V_1) … (B_pp, V_m) ]
          [ (V_1, B_11)  … (V_1, B_pp)   (V_1, V_1)  … (V_1, V_m)  ]
          [      ⋮               ⋮             ⋮              ⋮     ]
          [ (V_m, B_11)  … (V_m, B_pp)   (V_m, V_1)  … (V_m, V_m)  ] .

Further, since

    ρ'H*Hρ = (Hρ, Hρ) ≥ 0  for every ρ in R^M ,

the matrix H*H is non-negative definite and is positive definite if and only if the set {B_ij, V_i} is linearly independent.
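In the vec coordinates of A, the map H is simply the matrix whose columns are the vectorized B_ij and V_i, and H*H is then their Gram matrix of trace inner products.  The following is a minimal sketch (hypothetical symmetric matrices standing in for {B_ij, V_i}) illustrating this and the non-negative definiteness of H*H.

```python
# Sketch: H*H as the Gram matrix of trace inner products of {B_ij, V_i}
import numpy as np

rng = np.random.default_rng(0)
n, M = 4, 3
basis = []
for _ in range(M):                                # hypothetical symmetric basis matrices
    A = rng.standard_normal((n, n))
    basis.append(A + A.T)

Hmat = np.column_stack([S.reshape(-1) for S in basis])   # H as an n^2 x M matrix
gram = Hmat.T @ Hmat                                     # H*H in these coordinates
direct = np.array([[np.trace(Sk @ Sl) for Sl in basis] for Sk in basis])
print(np.allclose(gram, direct))                         # True: entries are (S_k, S_l) = tr(S_k S_l)
print(np.all(np.linalg.eigvalsh(gram) >= -1e-10))        # True: H*H is non-negative definite
```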
3.  THE LEAST SQUARES APPROACH TO QUADRATIC UNBIASED ESTIMATION OF VARIANCE COMPONENTS

3.1  Introduction

As was stated in Chapter 1, one of the aims of the present work is
to extend and apply the theory of least squares to quadratic estimation
of variance components in general mixed models.
In this section we shall
outline some fundamental principles of least squares estimation and
describe the basic analogy between linear and quadratic estimation in
application of the theory of least squares.
Let us consider a finite dimensional real inner product space U.  Let S be a subspace of U and S⊥ the orthogonal complement of S.  The orthogonal projection of a point u in U on S is the point û in S such that u - û is in S⊥.  Since

    ‖u - û‖ = min_{s∈S} ‖u - s‖ ,

û is called the best approximation to u by elements in S (cf. p. 283, Hoffman and Kunze, 1971).

In the case of linear estimation, we consider the observation Y taken from an experiment to be an n×1 vector.  The expectation μ of Y is unknown but the model implies that μ = Xβ, meaning that μ lies in the range space R(X) of an n×p matrix X, a linear transformation from R^p into R^n.  The least squares estimate μ̂ of μ is the best approximation to Y by elements in R(X) in the sense that

    ‖Y - μ̂‖ = min_{x∈R(X)} ‖Y - x‖ .

Denote P to be the orthogonal projection on R(X), i.e., P: R^n → R^n, P = P' = P² and Px = x for every x in R(X).  Then Ŷ = PY is the best approximation to Y in R(X).  Also, Ŷ has the same expectation as Y, and if t'Y is an unbiased estimate for a certain parametric function, t'Ŷ is the least squares estimate.
In quadratic unbiased estimation in general mixed models, the expectation of W = YY' is S = μμ' + V.  The representation S = Hθ reveals that S lies in the range space of H.  Therefore the quadratic least squares estimate of S is the orthogonal projection W_1 of W on R(H), and if (A, W) is unbiased for a certain parametric function, then the quadratic least squares estimate of that function is (A, W_1).

In the following sections of this chapter we shall work in the following areas:

(1)  The explicit form of the orthogonal projection on R(H) will be presented in terms of H and H*.

(2)  Quadratic least squares estimators of variance components will be derived.  A linear operator T on A such that the expectation of W under the mapping of T is not a function of θ_1 will be introduced.

(3)  Quadratic least squares estimators invariant with respect to the fixed effects β will be obtained.  A quadratic estimator Y'AY is invariant with respect to β if and only if AX = 0.  A linear operator whose range space is {A ∈ A: AX = 0} will be introduced.
3.2  General Results

In this section, we shall deal with the least squares quadratic unbiased estimation of the parametric functions of the form

    <ℓ, θ> = ℓ'θ .                                                  (3.1)

We adopt the following definition of quadratic estimability, which was given by Graybill and Hultquist (1961), Koch (1967) and Seely (1969).

Definition 3.1.  A parametric function of the form (3.1) is said to be quadratic estimable if there is an n×n symmetric matrix A such that

    E(A, W) = <ℓ, θ>  for all θ .
In the following we first outline Seely's results and then undertake our own exploration from there on.

Based on his representation E(W) = Hθ, Seely (1969, 1970b) obtained the following:

(1)  The parametric function <ℓ, θ> is quadratic estimable if, and only if, there exists ρ in R^M such that

    H*Hρ = ℓ .                                                      (3.2)

(2)  If <ℓ, θ> is quadratic estimable and if θ̂ is a solution of the equation

    H*Hθ̂ = H*W ,                                                    (3.3)

then <ℓ, θ̂> is an unbiased estimator for <ℓ, θ>.

Based on these results given by Seely, we now proceed as follows.  An immediate consequence of Definition 3.1 is the following lemma.

Lemma 3.1.  A parametric function <ℓ, θ> is quadratic estimable if, and only if, there exists A in A such that

    H*A = ℓ .                                                       (3.4)

Proof.  Since

    E(A, W) = (A, Hθ) = <H*A, θ>  for every θ in Ω ,

that <ℓ, θ> is quadratic estimable implies H*A = ℓ, and vice versa.
Comparing (3.2) and (3.4), we have that A - Hρ is in N(H*).  Therefore Hρ is the orthogonal projection of A on R(H).  In general, the orthogonal projection of A in A on R(H) is obtained from the equation

    H*Hρ = H*A .                                                    (3.5)

Observe that this equation is consistent since R(H*) = R(H*H).  Therefore the solution exists and is of the form

    ρ = (H*H)⁻H*A ,

where (H*H)⁻ is a generalized inverse of H*H.  When (H*H)⁻ is not unique (i.e., when H*H is singular), ρ = (H*H)⁻H*A is not the unique solution of (3.5).  But Hρ = H(H*H)⁻H*A, the orthogonal projection of A on R(H), is unique regardless of the choice of (H*H)⁻.  Therefore, the linear operator H(H*H)⁻H* on A is the orthogonal projection on R(H).  An alternative proof that H(H*H)⁻H* is the orthogonal projection on R(H) is given in the following.

Theorem 3.1.  The linear operator H(H*H)⁻H* on A is unique, regardless of the choice of (H*H)⁻, symmetric and idempotent; in short, H(H*H)⁻H* is the orthogonal projection on R(H).

Proof.

(1)  Uniqueness: Let ρ and ρ̃ be two distinct solutions of the equation (3.5).  Then

    H*Hρ = H*A = H*Hρ̃ ,

or Hρ - Hρ̃ is in N(H*).  But Hρ - Hρ̃ is also in R(H).  Therefore, Hρ - Hρ̃ = 0 and Hρ = H(H*H)⁻H*A is unique for every A in A.

(2)  Symmetry: Since H*H is symmetric, we can choose (H*H)⁻ to be symmetric.  Since H(H*H)⁻H* is unique, H(H*H)⁻H* must be symmetric, i.e.,

    (A, H(H*H)⁻H*B) = (H(H*H)⁻H*A, B)  for every A, B in A .

(3)  Idempotency: First observe that

    H*H(H*H)⁻H*A = H*A .

This is true because from (3.5)

    H*A = H*Hρ = H*H(H*H)⁻H*A

for every A in A.  It follows that

    (H(H*H)⁻H*)² = H(H*H)⁻H* ,

i.e., H(H*H)⁻H* is idempotent.

That H(H*H)⁻H* is the orthogonal projection on R(H) follows by Theorem 1, §75 of Halmos (1958).
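In matrix coordinates the operator H(H*H)⁻H* can be written down explicitly and its properties checked directly.  The following is a minimal sketch (hypothetical basis matrices; a Moore-Penrose inverse is used as one admissible generalized inverse).

```python
# Sketch: H(H*H)^- H* as the orthogonal projection onto R(H), in vec coordinates
import numpy as np

rng = np.random.default_rng(1)
n, M = 4, 3
basis = []
for _ in range(M):                                # hypothetical symmetric spanning set
    A = rng.standard_normal((n, n))
    basis.append(A + A.T)
Hmat = np.column_stack([S.reshape(-1) for S in basis])

P_H = Hmat @ np.linalg.pinv(Hmat.T @ Hmat) @ Hmat.T      # H (H*H)^- H*
print(np.allclose(P_H, P_H.T))                           # symmetric
print(np.allclose(P_H @ P_H, P_H))                       # idempotent

W = rng.standard_normal((n, n)); W = W + W.T             # a generic symmetric "observation"
W1 = (P_H @ W.reshape(-1)).reshape(n, n)                 # projection of W on R(H)
print(np.allclose(Hmat.T @ (W - W1).reshape(-1), 0))     # residual lies in N(H*)
```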
Now let us go back to the equation (3.3).  The above results have shown that Hθ̂ is the orthogonal projection of W on R(H).  Therefore, for any θ̂ satisfying the equation (3.3), Hθ̂ is the quadratic least squares estimator of Hθ in the sense that

    ‖W - Hθ̂‖ = min_{θ∈R^M} ‖W - Hθ‖ .

In other words, the equation (3.3) is equivalent to the equation

    ∂/∂θ ‖W - Hθ‖² = 0 .                                            (3.6)

However, if we want to minimize ‖W - Hθ‖² with the restriction that θ is in Ω, a subset of R^M, then we have the following non-linear program

    min_{θ∈Ω} ‖W - Hθ‖² .                                           (3.7)

The solution of (3.7) will preserve the functional dependence among the parameters,

    θ_ij² = θ_ii θ_jj ,                                             (3.8)

in the estimates θ̂_ij and assures positive values of the σ̂_i².  In general, the solution of (3.7) is neither quadratic in Y nor unbiased for θ.  Therefore, solving the non-linear program (3.7) is not within the scope of the present work and we do not intend to pursue this problem any further.
3.3 Quadratic Least Squares Estimation of Variance Components
From now on we shall concentrate on the estimation of parametric functions of the form

    <ℓ_2, θ_2> = Σ_{i=1}^m ℓ_i σ_i² .                               (3.9)

If this parametric function is estimable, then by Lemma 3.1, there exists A in A such that

    H_1*A = 0  and  H_2*A = ℓ_2 .

We therefore have the following.

Lemma 3.2.  The parametric function (3.9) is estimable if, and only if, there exists A in N(H_1*) such that

    ℓ_2 = H_2*A .

It should be noted that A in N(H_1*) means

    (B_ij, A) = 0 ,  1 ≤ i ≤ j ≤ p ,

which, in turn, is equivalent to X'AX = 0.  Therefore we have

    N(H_1*) = {A ∈ A: X'AX = 0} .                                   (3.10)

Consider the linear operators on A whose range space coincides with N(H_1*).  Some of these operators will be discussed in Chapter 4 when we discuss the estimation procedures of other authors.  The one of special importance is introduced in the following.

Lemma 3.3.  Let T be a linear operator on A defined by

    TA = A - PAP  for every A in A ,                                (3.11)

where P is the orthogonal projection matrix on the range space of X.
Then,

(1)  R(T) = {A ∈ A: X'AX = 0} ,

(2)  T = T² = T* .

In other words, T is the orthogonal projection on N(H_1*).

Proof.

(1)  Since

    X'(A - PAP)X = X'AX - X'AX = 0 ,

A in R(T) implies A in N(H_1*) and this shows R(T) ⊂ N(H_1*).  If A is in N(H_1*), that is, X'AX = 0, then TA = A since PAP = 0.  Therefore, A is in R(T) and N(H_1*) ⊂ R(T).  Hence R(T) = N(H_1*).

(2)  Since

    T(PAP) = PAP - PAP = 0 ,
    T²(A) = T(A - PAP) = TA - T(PAP) = TA ,

therefore T² = T.  To prove T = T*, let A and B denote any two fixed members of A.  Then,

    (TA, B) = (A - PAP, B) = (A, B) - (PAP, B) = (A, B) - (A, PBP) = (A, B - PBP) = (A, TB) .

Therefore T = T*.
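The defining properties of T can be verified directly for any particular X.  The following is a minimal sketch with a hypothetical design matrix.

```python
# Sketch: the operator TA = A - PAP of Lemma 3.3
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 2
X = rng.standard_normal((n, p))                     # hypothetical design matrix
P = X @ np.linalg.pinv(X.T @ X) @ X.T               # orthogonal projection on R(X)

def T(A):
    return A - P @ A @ P

A = rng.standard_normal((n, n)); A = A + A.T
B = rng.standard_normal((n, n)); B = B + B.T
print(np.allclose(X.T @ T(A) @ X, 0))                         # T(A) lies in N(H_1*)
print(np.allclose(T(T(A)), T(A)))                             # T is idempotent
print(np.isclose(np.trace(T(A) @ B), np.trace(A @ T(B))))     # T is self-adjoint
```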
First of all, the above proof serves the purpose of demonstrating the existence of a linear operator whose range space coincides with N(H_1*).  With this, we can restate Lemma 3.2 as follows.

Lemma 3.4.  Let T be the linear operator given in (3.11).  Then the parametric function <ℓ_2, θ_2> is estimable if, and only if,

    ℓ_2 = H_2*TA

for some A in A.
Consider the composite transformation TH.  Since R(T) = N(H_1*), and N(H_1*) and R(H_1) are orthogonal complements, it follows that TH_1 is a zero transformation, that is, TH_1ρ_1 = 0 for every ρ_1 in R^{p(p+1)/2}.  Therefore,

    THρ = TH_2ρ_2 ,

where ρ_2 is the subvector of ρ of order m.  Consequently, R(TH) = R(TH_2) and N(H*T) = N(H_2*T).  Also, if A_0 is a member of N(H*), then since N(H*) ⊂ N(H_1*) = R(T), TA_0 = A_0.  Therefore,

    H_2*TA_0 = H_2*A_0 = 0

and we have N(H*) ⊂ N(H*T).

For every A in A, put A = A_1 + A_0 with A_1 in R(H) and A_0 in N(H*).  We have

    H_2*TA = H_2*TA_1 + H_2*TA_0 = H_2*THρ = H_2*TH_2ρ_2 .

The above shows that R(H_2*TH_2) = R(H_2*T).  This, together with Lemma 3.4, proves the following.

Lemma 3.5.  Let T be the linear operator defined in (3.11).  Then the parametric function <ℓ_2, θ_2> is estimable if, and only if, there exists ρ_2 in R^m such that

    H_2*TH_2ρ_2 = ℓ_2 .                                             (3.12)

Notice that the transformation H_2*TH_2 is a linear operator on R^m and is expressible in the form of an m×m matrix whose (i,j)th entry is (V_i, TV_j).  This provides us a useful device for examining the estimability of <ℓ_2, θ_2> by the consistency of equation (3.12).

Let <ℓ_2, θ_2> be quadratic estimable and (A, W) be an unbiased estimator.  From the above results, we must have A in R(T), i.e., A = TA.  Therefore,

    (A, W) = (TA, W) = (A, TW)

since T is symmetric.  This implies that, in the quadratic unbiased estimation of variance components, we get identical estimators whether we use W or TW = W - PWP.  Since in general W is in A and TW is in R(T), we can restrict our attention to R(T) rather than the whole space A.  Hence, we shall derive estimators from the transformed model

    E(TW) = TH_2θ_2 .                                               (3.13)

Since TW = W - PWP is the orthogonal projection of W on N(H_1*), PWP = W - TW is the orthogonal projection of W on R(H_1).  In other words,

    PWP = H_1(H_1*H_1)⁻H_1*W .                                      (3.14)

The above expression would be the least squares estimator of H_1θ_1 if the model were E(W) = H_1θ_1.  Of course, this is a biased estimator of H_1θ_1, but by analogy with linear estimation, TW = W - PWP can be considered as "the data adjusted for the fixed effect parameters in θ_1."
Let <ℓ_2, θ_2> be a quadratic estimable function; then by Lemma 3.5 we have

    <ℓ_2, θ_2> = <H_2*TH_2ρ_2, θ_2> = <ρ_2, H_2*TH_2θ_2> .

This shows that every estimable function of the form <ℓ_2, θ_2> can be estimated by a linear functional of H_2*TW.  Consider the equation

    H_2*TH_2θ_2 = H_2*TW .                                          (3.15)

Since R(H_2*TH_2) = R(H_2*T), equation (3.15) is consistent.  The use of a solution θ̂_2 in the estimation of θ_2 is shown in the following.

Theorem 3.2.  Let θ̂_2 be a solution of (3.15).  Then <ℓ_2, θ̂_2> is unbiased for <ℓ_2, θ_2> if, and only if, <ℓ_2, θ_2> is estimable.

Proof.  Let θ̂_2 be a solution of (3.15), i.e.,

    θ̂_2 = (H_2*TH_2)⁻H_2*TW ,

where (H_2*TH_2)⁻ is a generalized inverse of H_2*TH_2, an m×m matrix.  Then, for estimable <ℓ_2, θ_2> with ℓ_2 = H_2*TH_2ρ_2,

    E<ℓ_2, θ̂_2> = ρ_2'H_2*TH_2(H_2*TH_2)⁻H_2*TH_2θ_2 = ρ_2'H_2*TH_2θ_2 = <ℓ_2, θ_2> .

Therefore <ℓ_2, θ̂_2> is unbiased for <ℓ_2, θ_2>.

Theorem 3.3.  TH_2θ̂_2 is the orthogonal projection of TW on R(TH_2) if, and only if, θ̂_2 satisfies (3.15).

Proof.  To prove that

    TH_2θ̂_2 = TH_2(H_2*TH_2)⁻H_2*TW

is the orthogonal projection of TW on R(TH_2), we shall show that TH_2θ̂_2 is unique for all θ̂_2 that satisfy (3.15) and that the linear operator TH_2(H_2*TH_2)⁻H_2*T is symmetric and idempotent.

(1)  Uniqueness: Let θ̂_2 and θ̃_2 be two distinct solutions of equation (3.15).  Then

    H_2*TH_2θ̂_2 = H_2*TW = H_2*TH_2θ̃_2 .

Hence, TH_2(θ̂_2 - θ̃_2) is in N(H_2*T), but TH_2(θ̂_2 - θ̃_2) is also in R(TH_2).  Since N(H_2*T) and R(TH_2) are orthogonal complements, therefore TH_2(θ̂_2 - θ̃_2) = 0 and TH_2θ̂_2 is unique.

(2)  Symmetry: Let A and B be any two members in A.  Since [(H_2*TH_2)⁻]' is also a generalized inverse of H_2*TH_2 and TH_2(H_2*TH_2)⁻H_2*T is unique, then

    (A, TH_2(H_2*TH_2)⁻H_2*TB) = (TH_2(H_2*TH_2)⁻H_2*TA, B) .

This proves symmetry.

(3)  Idempotency: Observe that for every ρ_2 in R^m,

    H_2*TH_2(H_2*TH_2)⁻H_2*TH_2ρ_2 - H_2*TH_2ρ_2 = 0 .

It follows that

    (TH_2(H_2*TH_2)⁻H_2*T)² = TH_2(H_2*TH_2)⁻H_2*T .

We then have that TH_2θ̂_2 is the orthogonal projection of TW on R(TH_2).

Conversely, if TH_2θ̂_2 is the orthogonal projection of TW on R(TH_2), then

    TH_2θ̂_2 = TH_2(H_2*TH_2)⁻H_2*TW .

In other words, θ̂_2 satisfies

    H_2*TH_2θ̂_2 = H_2*TH_2(H_2*TH_2)⁻H_2*TW = H_2*TW .

But the above equation is equivalent to equation (3.15) and the proof is completed.
It should be pointed out that the equations (3.3) and (3.15) together are also consistent, that is, if θ̂' = (θ̂_1' θ̂_2') is a solution of (3.3), then the subvector θ̂_2 is also a solution of (3.15).  This is proved in the following.

Theorem 3.4.  Equations (3.3) and (3.15) are consistent.

Proof.  Let θ̂' = (θ̂_1' θ̂_2') be a solution of (3.3).  Recall that W - Hθ̂ is in N(H*) ⊂ N(H_2*T).  We have

    H_2*TW = H_2*THθ̂ + H_2*T(W - Hθ̂) = H_2*TH_2θ̂_2 .

This shows that θ̂_2 from (3.3) is also a solution of (3.15).

Recall that W_1 = Hθ̂ is the orthogonal projection of W on R(H), and TW_1 = TH_2θ̂_2 is the orthogonal projection of W_1 on R(T).  By Theorem 3.3, TH_2θ̂_2 is the orthogonal projection of TW on R(TH_2).  Notice that

    TH_2θ̂_2 = TH_2(H_2*TH_2)⁻H_2*TW

is also the orthogonal projection of W on R(TH_2) = R(TH).  This shows that the orthogonal projection of W on R(TH) can be taken in two steps.  The projection on R(H) followed by the projection on R(T) gives the same projection of W on R(TH).
Example 1.  One-stage nested random model -- unbiased estimation.

    Y_ij = μ + a_i + ε_ij ,  i = 1, 2, …, t ,  j = 1, 2, …, n_i ,

where n = Σ_{i=1}^t n_i, J_n = 1_n 1_n', 1_n is the n×1 vector of unity and

    V_1 = diag( J_{n_1}, J_{n_2}, …, J_{n_t} ) ,  V_2 = I_n .

Since the set {J_n, V_1, I_n} is linearly independent, H*H is positive definite.  This means that μ², σ_1² and σ² are quadratic estimable.  From the above model we have, with Y_i. the i-th group total and Y.. the grand total,

    P = (1/n) J_n ,   PV_1P = (1/n²)(Σ_{i=1}^t n_i²) J_n ,

    (V_1, W) = Σ_{i=1}^t Y_i.² ,   (V_1, PWP) = (1/n²)(Σ_{i=1}^t n_i²) Y..² ;

therefore,

    (V_1, TW) = Σ_{i=1}^t Y_i.² - (1/n²)(Σ_{i=1}^t n_i²) Y..² ,

    (V_2, TW) = (I, W) - (P, W) = Σ_{i=1}^t Σ_{j=1}^{n_i} Y_ij² - (1/n) Y..² .

Now we have

    H_2*TH_2 = [ (1/n²)(Σ n_i²)(n² - Σ n_i²)    (1/n)(n² - Σ n_i²) ]
               [ (1/n)(n² - Σ n_i²)             n - 1              ] .

Here H_2*TH_2 is non-singular, and solving

    H_2*TH_2 (σ̂_1², σ̂²)' = ( (V_1, TW), (V_2, TW) )'

gives

    σ̂_1² = [ n(n - 1){ Σ Y_i.² - (1/n²)(Σ n_i²) Y..² } - (n² - Σ n_i²){ Σ Σ Y_ij² - (1/n) Y..² } ]
            / [ (n² - Σ n_i²)(Σ n_i² - n) ] ,

    σ̂²  = [ (1/n)(Σ n_i²){ Σ Σ Y_ij² - (1/n) Y..² } - { Σ Y_i.² - (1/n²)(Σ n_i²) Y..² } ]
            / ( Σ n_i² - n ) .
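The estimating equation (3.15) can also be solved directly in matrix form, which gives a convenient numerical check on Example 1.  The following is a minimal simulation sketch (hypothetical group sizes and parameter values); the Monte Carlo averages should come close to the true σ_1² and σ².

```python
# Sketch: unbiased estimation in the one-stage nested random model via (3.15)
import numpy as np

rng = np.random.default_rng(3)
n_i = [3, 5, 4]                                  # hypothetical group sizes
t, n = len(n_i), sum(n_i)
mu, sigma1, sigma = 2.0, 1.5, 1.0                # hypothetical true values

one = np.ones((n, 1))
P = one @ one.T / n                              # projection on R(X), X = 1_n
V1 = np.zeros((n, n))
for start, size in zip(np.cumsum([0] + n_i[:-1]), n_i):
    V1[start:start + size, start:start + size] = 1.0   # block diagonal of J_{n_i}
V = [V1, np.eye(n)]

def T(A):                                        # TA = A - PAP
    return A - P @ A @ P

H2TH2 = np.array([[np.trace(Vi @ T(Vj)) for Vj in V] for Vi in V])

est = []
for _ in range(2000):
    a = rng.normal(0.0, sigma1, size=t)
    e = rng.normal(0.0, sigma, size=n)
    Y = mu + np.repeat(a, n_i) + e
    W = np.outer(Y, Y)
    rhs = np.array([np.trace(Vi @ T(W)) for Vi in V])   # (V_i, TW)
    est.append(np.linalg.solve(H2TH2, rhs))
print(np.mean(est, axis=0))        # approximately (sigma1**2, sigma**2) = (2.25, 1.0)
```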
3.4  Invariant Estimation of Variance Components

In Model (2.1), if we replace Y by Y + Xβ_0, then the model becomes

    Y + Xβ_0 = X(β + β_0) + Σ_{i=1}^m U_i ε_i .                     (3.16)

By invariance in quadratic unbiased estimation of variance components we mean that the estimates (A, W) = Y'AY of <ℓ_2, θ_2> remain identical under the translation of the β-parameters in (3.16), i.e.,

    Y'AY = (Y + Xβ_0)'A(Y + Xβ_0)                                   (3.17)

for all β_0 in R^p.  This means that the value of the estimate is independent of the value of the β-parameters.  The condition (3.17) is equivalent to

    AX = 0 .

Therefore, to say that <ℓ_2, θ_2> is quadratic estimable with invariance, or, in short, that <ℓ_2, θ_2> is invariant estimable, we mean that there exists A in A with AX = 0 such that (A, W) is an unbiased estimate of <ℓ_2, θ_2>.  In Lemma 3.2, we have given a necessary and sufficient condition for the estimability of <ℓ_2, θ_2>.  Now, with the additional requirement of invariance we have the following:

Lemma 3.6.  <ℓ_2, θ_2> is invariant estimable if, and only if, there exists some A in the subspace {A ∈ A: AX = 0} such that

    ℓ_2 = H_2*A .

In the following lemma we shall give a linear operator T_2 on A whose range space is this subspace of A.
Lemma 3.7.  Let T_2 be a linear operator on A defined by

    T_2A = (I - P)A(I - P) ;                                        (3.18)

then T_2 is symmetric and idempotent and

    R(T_2) = {A ∈ A: AX = 0} .                                      (3.19)

In short, T_2 is the orthogonal projection on the subspace {A ∈ A: AX = 0} of A.

Proof.  By the idempotency and symmetry of (I - P), the orthogonal projection on the null space of X', we have

    T_2²A = (I - P)A(I - P) = T_2A ,

and

    (T_2A, B) = tr{(I - P)A(I - P)B} = tr{A(I - P)B(I - P)} = (A, T_2B) .

These prove the idempotency and symmetry of T_2.  To show that

    R(T_2) = {A ∈ A: AX = 0} ,

first observe that if AX = 0, then

    A = (I - P)A = A(I - P) = (I - P)A(I - P) ,

i.e., A is in R(T_2).  Therefore

    {A ∈ A: AX = 0} ⊂ R(T_2) .

Conversely,

    (I - P)A(I - P)X = 0  for every A in A .

Therefore, if A ∈ R(T_2) then AX = 0, and we have

    R(T_2) = {A ∈ A: AX = 0} .
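As with T, the properties of T_2 are easy to verify numerically for a particular design.  The following is a minimal sketch with a hypothetical X; it also confirms that R(T_2) sits inside R(T), since AX = 0 implies X'AX = 0.

```python
# Sketch: the operator T_2 A = (I - P) A (I - P) of Lemma 3.7
import numpy as np

rng = np.random.default_rng(4)
n, p = 6, 2
X = rng.standard_normal((n, p))                     # hypothetical design matrix
P = X @ np.linalg.pinv(X.T @ X) @ X.T
Q = np.eye(n) - P                                   # projection on the null space of X'

def T2(A):
    return Q @ A @ Q

A = rng.standard_normal((n, n)); A = A + A.T
B = rng.standard_normal((n, n)); B = B + B.T
print(np.allclose(T2(A) @ X, 0))                              # (3.19): T_2(A) X = 0
print(np.allclose(X.T @ T2(A) @ X, 0))                        # hence also X' T_2(A) X = 0
print(np.allclose(T2(T2(A)), T2(A)))                          # idempotent
print(np.isclose(np.trace(T2(A) @ B), np.trace(A @ T2(B))))   # self-adjoint
```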
With the linear operator T_2 introduced above we now restate Lemma 3.6 as follows:

Lemma 3.8.  <ℓ_2, θ_2> is invariant estimable if, and only if,

    ℓ_2 = H_2*T_2A

for some A in A.¹

Lemma 3.9.  <ℓ_2, θ_2> is invariant estimable if, and only if,

    ℓ_2 = H_2*T_2H_2ρ_2                                             (3.20)

for some ρ_2 in R^m.

Proof.  Lemma 3.8 says that <ℓ_2, θ_2> is invariant estimable if, and only if, ℓ_2 is in R(H_2*T_2).  For A in A, let T_2H_2ρ_2 be its orthogonal projection on R(T_2H_2).  Then

    H_2*T_2A = H_2*T_2(T_2H_2ρ_2) + H_2*T_2(A - T_2H_2ρ_2) = H_2*T_2H_2ρ_2 ,

since T_2 is symmetric and idempotent.  This shows that R(H_2*T_2) = R(H_2*T_2H_2).  Therefore <ℓ_2, θ_2> is invariant estimable if and only if ℓ_2 is in R(H_2*T_2H_2), i.e., if and only if there exists ρ_2 in R^m such that ℓ_2 = H_2*T_2H_2ρ_2.

Notice that the linear operator H_2*T_2H_2 on R^m can be written as an m×m symmetric matrix with its (i,j)th entry being (V_i, T_2V_j).  Lemma 3.9 provides a useful tool for examining the quadratic estimability with invariance of <ℓ_2, θ_2>.

In general, R(T_2) is a subspace of R(T).  Hence, R(H_2*T_2H_2) = R(H_2*T_2) is a subspace of R(H_2*TH_2) = R(H_2*T).  This means that a quadratic estimable function <ℓ_2, θ_2> is not necessarily invariant estimable.

If <ℓ_2, θ_2> is invariant estimable, and (A, W) is an invariant unbiased estimator of <ℓ_2, θ_2>, then we must have A in R(T_2).  By symmetry and idempotency of T_2, (A, W) = (A, T_2W) for every W in A.  Hence, it suffices to derive invariant unbiased estimators from the model²

    E(T_2W) = T_2H_2θ_2 .                                           (3.21)

Following the approach developed in the last section, we set up the equation³

    H_2*T_2H_2θ_2 = H_2*T_2W .                                      (3.22)

¹ Pukelsheim's results imply that the unbiased estimator of q't exists if q is in R(D_M').  Lemma 3.8 presents equivalent results.

² The model (3.21) is equivalent to Pukelsheim's model E(MY ⊗ MY) = D_M t.

³ The ordinary least squares estimate t̂ = D_M⁺ (MY ⊗ MY) given by Pukelsheim is a solution of Equation (3.22).
This equation is consistent since R(H_2*T_2H_2) = R(H_2*T_2), as shown in the proof of Lemma 3.9.  By analogy with Theorem 3.3, we have the following.

Theorem 3.5.  T_2H_2θ̂_2 is the orthogonal projection of T_2W on R(T_2H_2) if and only if θ̂_2 is a solution of equation (3.22).

Theorem 3.6.  Let θ̂_2 be a solution of equation (3.22).  Then <ℓ_2, θ̂_2> is an invariant unbiased estimator of <ℓ_2, θ_2>⁴ if and only if <ℓ_2, θ_2> is invariant estimable.

Proof.  Unbiasedness follows by analogy with the proof of Theorem 3.2.  Invariance is verified by observing that

    <ℓ_2, θ̂_2> = <H_2*T_2H_2ρ_2, θ̂_2> = <ρ_2, H_2*T_2W> = (T_2H_2ρ_2, W) ,

where ρ_2 is a solution of equation (3.20) and (T_2H_2ρ_2)X = 0.

Equations (3.22) and (3.15) do not have a common solution in general.  A sufficient condition for the equations to have a common solution is given in the following.

Theorem 3.7.  Equations (3.22) and (3.15) have a common solution if R(T_2H_2) ⊂ R(TH_2).

Proof.  It can easily be shown that T_2TA = TT_2A = T_2A for every A in A.  For every θ̂_2 satisfying equation (3.15), we have

    H_2*T_2W = H_2*T_2(W - TH_2θ̂_2) + H_2*T_2TH_2θ̂_2 = H_2*T_2(W - TH_2θ̂_2) + H_2*T_2H_2θ̂_2 .

Since W - TH_2θ̂_2 is in N(H_2*T), the orthogonal complement of R(TH_2), if R(T_2H_2) ⊂ R(TH_2) then

    H_2*T_2(W - TH_2θ̂_2) = 0  and  H_2*T_2W = H_2*T_2H_2θ̂_2

for every solution θ̂_2 of equation (3.15).  In other words, a solution of (3.15) is also a solution of (3.22) if R(T_2H_2) ⊂ R(TH_2).

⁴ <ℓ_2, θ_2> is equivalent to q't in Pukelsheim's notation.

In the following example, the model is the same as that in Example 1.  The invariant estimators of σ_1² and σ² are obtained by solving equation (3.22).  They are different from those given in Example 1.
Example 2.  One-stage nested model -- invariant estimation.

The matrices P, V_1 and V_2 are the same as in Example 1.  Here

    (V_1, T_2V_1) = Σ_{i=1}^t n_i² - (2/n) Σ_{i=1}^t n_i³ + (1/n²)(Σ_{i=1}^t n_i²)² ,

    (V_1, T_2V_2) = n - (1/n) Σ_{i=1}^t n_i² ,

    (V_2, T_2V_2) = (I, I - P) = n - 1 ,

    (V_1, T_2W) = Σ_{i=1}^t ( Y_i. - n_i Ȳ.. )² ,

    (V_2, T_2W) = (I - P, W) = Σ_{i=1}^t Σ_{j=1}^{n_i} Y_ij² - (1/n) Y..² .

Write

    F = Σ n_i² - (2/n) Σ n_i³ + (1/n²)(Σ n_i²)²   and   D = (n - 1)F - ( n - (1/n) Σ n_i² )² .

Then, solving equation (3.22),

    σ̂_1² = (1/D) { (n - 1)(V_1, T_2W) + ( (1/n) Σ n_i² - n )(V_2, T_2W) } ,

    σ̂²  = (1/D) { ( (1/n) Σ n_i² - n )(V_1, T_2W) + F (V_2, T_2W) } .
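The invariance claimed for these estimators can be checked directly: shifting Y by any multiple of X leaves them unchanged.  The following is a minimal sketch (hypothetical group sizes and data) that solves (3.22) in matrix form and applies such a shift.

```python
# Sketch: invariant estimation via (3.22) for the one-stage nested model
import numpy as np

rng = np.random.default_rng(5)
n_i = [3, 5, 4]                                  # hypothetical group sizes
t, n = len(n_i), sum(n_i)
one = np.ones((n, 1))
P = one @ one.T / n
Q = np.eye(n) - P                                # I - P
V1 = np.zeros((n, n))
for start, size in zip(np.cumsum([0] + n_i[:-1]), n_i):
    V1[start:start + size, start:start + size] = 1.0
V = [V1, np.eye(n)]

def invariant_estimate(Y):
    W = np.outer(Y, Y)
    lhs = np.array([[np.trace(Vi @ Q @ Vj @ Q) for Vj in V] for Vi in V])  # (V_i, T_2 V_j)
    rhs = np.array([np.trace(Vi @ Q @ W @ Q) for Vi in V])                 # (V_i, T_2 W)
    return np.linalg.solve(lhs, rhs)

Y = 2.0 + np.repeat(rng.normal(0, 1.5, t), n_i) + rng.normal(0, 1.0, n)
print(invariant_estimate(Y))
print(invariant_estimate(Y + 7.3))   # identical: the estimates ignore shifts by X beta_0
```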
4.  ANOTHER LOOK AT SOME OTHER ESTIMATION PROCEDURES

4.1  Seely's Work on Quadratic Unbiased Estimation in Mixed Models

In his work on quadratic unbiased estimation in mixed models, Seely introduced the model

    E(W) = Hθ

induced from the general mixed linear model

    Y = Xβ + Σ_{i=1}^m U_i ε_i .

He showed that estimable functions of the form <ℓ, θ> can be estimated by <ℓ, θ̂> if θ̂ is a solution of the equation (3.3),

    H*Hθ̂ = H*W .

In the last chapter we proved that if θ̂ is a solution of the above equation, then

(1)  Hθ̂ is the orthogonal projection of W on R(H), and

(2)  <ℓ, θ̂> is the least squares estimate of <ℓ, θ>.
In the next section we shall further prove that, in certain random models,
the estimators obtained from the above equation are identical to the ones
obtained by the symmetric sum method proposed by Koch (1967).
In estimation of variance components Seely proposed a quadratic unbiased estimator without invariance for the general mixed model.  He first introduced a linear operator T_s on A defined by

    T_sA = (I - P)A + A(I - P)  for every A in A                    (4.1)

and proved that the range space of this linear operator coincides with the subspace {A ∈ A: X'AX = 0} of A.  He further showed that T_s is symmetric and non-negative definite, i.e., if (A, T_sA) = 0, then T_sA = 0.  Also he showed that

    R(T_s) = R(T_sH) ⊕ N(H*)                                        (4.2)

and hence the existence of ρ_2 such that H_2*T_sH_2ρ_2 = ℓ_2 is a necessary and sufficient condition for <ℓ_2, θ_2> to be estimable.

Further, Seely stated that if θ̂_2 is a solution of the equation

    H_2*T_sH_2θ̂_2 = H_2*T_sW ,                                      (4.3)

then <ℓ_2, θ̂_2> is an unbiased estimate of <ℓ_2, θ_2> if <ℓ_2, θ_2> is estimable.  However, the consistency of the equation (4.3) was not verified.

We found that the equality in (4.2) and the consistency of equation (4.3) are equivalent.  Therefore, the existence of a solution of (4.3) is assured.  In proving this, we first present the following.
Theorem 4.1.  If the linear operator L on A is such that R(L) ⊃ N(H*), then the following statements are equivalent:

    (1)  R(LH) + N(H*) = R(L) ,                                     (4.4)

    (2)  R(H*LH) = R(H*L) .                                         (4.5)

Proof.  Statement (1) implies that for every A in A there exist ρ in R^M and A_0 in N(H*) such that

    LA = LHρ + A_0 ;                                                (4.6)

therefore

    H*LA = H*LHρ

and this implies R(H*LH) ⊃ R(H*L).  The reversed containment is obvious.  Therefore

    R(H*LH) = R(H*L) .

Conversely, statement (2) implies that for every A in A there exists ρ in R^M such that

    H*LA = H*LHρ .                                                  (4.7)

Therefore (LA - LHρ) is in N(H*).  Write

    LA = LHρ + (LA - LHρ)

for every A in A.  This shows

    R(L) ⊂ R(LH) + N(H*) .

But both R(LH) and N(H*) are subspaces of R(L), so is R(LH) + N(H*).  Hence

    R(L) = R(LH) + N(H*) .

Since R(T_s) = N(H_1*) ⊃ N(H*), the equivalence in the above theorem holds if we substitute T_s for L.  Further, since R(H_1*T_s) = {0} and R(T_sH) = R(T_sH_2), it is easily proved that

    R(H_2*T_sH_2) = R(H_2*T_s)                                      (4.8)

is equivalent to

    R(H*T_sH) = R(H*T_s) .

This shows that the equality in (4.2) and the consistency of equation (4.3) are equivalent.  Therefore the existence of a solution of equation (4.3) is assured.

Next we shall investigate whether the estimators of <ℓ_2, θ_2> derived from equation (4.3) are identical to those from (3.3).  Since (3.3) and (3.15) are consistent, we shall examine whether (3.15) and (4.3) have a common solution.  First, it can be easily proved that

    T_sA = TA + T_2A  for every A in A .                            (4.9)

If Equations (3.15) and (3.22) have common solutions, i.e., if there exists θ̂_2 in R^m such that both

    H_2*TH_2θ̂_2 = H_2*TW  and  H_2*T_2H_2θ̂_2 = H_2*T_2W

hold, then adding these two equations, we have

    H_2*T_sH_2θ̂_2 = H_2*T_sW .

This shows that (3.15) and (4.3) have common solutions if and only if (3.15) and (3.22) do.  And this is when the estimators of variance components derived from (3.3) and (4.3) are identical.
4.2  Koch's Symmetric Sum Approach to the Estimation of Variance Components

The symmetric sum approach proposed by Koch (1967) provides quadratic unbiased estimates of variance components in random models.  To begin with, we summarize Koch's method in the general random model as follows:

(1)  The general random model is

    Y = μ1 + Σ_{i=1}^m U_i ε_i ,

where Y is the n×1 vector of observations, μ is the overall mean and ε_i (i = 1, 2, …, m) are vectors of random effects.  The entries in each of these vectors ε_i are assumed to be independently and identically distributed with zero mean and variance σ_i² (i = 1, 2, …, m).  From these assumptions we have

    E(W) = μ²J + Σ_{i=1}^m V_i σ_i² ,

where J is the n×n matrix of unity and V_i = U_iU_i' (i = 1, 2, …, m).

(2)  Divide the entries W_ij of W into groups such that all the W_ij with the same expectation are in the same group.  Transform the W matrix into a column vector of N = n² entries

    Z = (Z_1', Z_2', …, Z_r')' ,                                    (4.10)

where Z_t (t = 1, 2, …, r) is the vector of the N_t entries of the t-th group of W (Σ_{t=1}^r N_t = N = n²).
(3)  Calculate the average Z̄_t (t = 1, 2, …, r) of each vector Z_t and let

    Z̄ = (Z̄_1, Z̄_2, …, Z̄_r)'                                        (4.11)

be the r-vector of the averages with expectation

    E(Z̄) = Lθ .                                                     (4.12)

(4)  Solve the equation

    Lθ̂ = Z̄ .                                                        (4.13)

The random models considered by Koch (1967) produced non-singular coefficient matrices L.  Therefore θ̂ is the unbiased estimator of θ by the symmetric sum approach.

We shall prove that, when rank(L) = r ≤ m + 1 (i.e., the rows of L are linearly independent), Lθ̂ is the least squares estimator of Lθ.  The proof is as follows.  Let D be an N×r matrix defined by

    D = diag( 1_{N_1}, 1_{N_2}, …, 1_{N_r} ) ,

where 1_{N_t} is the N_t-vector of unity (t = 1, …, r), and let

    F = DL .

Then we have

    E(Z) = DLθ = Fθ .                                               (4.14)

Set the equation

    ∂/∂θ ‖Z - Fθ‖² = 0 ,                                            (4.15)

which is equivalent to

    F'Fθ̂ = F'Z .                                                    (4.16)

Observe that F'F = L'D'DL, D'D = Diag(N_1, …, N_r) and F'Z = L'D'Z = L'D'DZ̄.  By the assumption that L is r×(m+1) of rank r, we know the inverse (LL')⁻¹ of LL' exists.  Pre-multiplying both sides of the normal equation

    F'Fθ̂ = F'Z

by (LL')⁻¹L, we get

    D'DLθ̂ = D'DZ̄ .

Since D'D = Diag(N_1, …, N_r) is non-singular, this is equivalent to

    Lθ̂ = Z̄ .

This shows that Lθ̂ is the least squares estimate of Lθ if Lθ is a set of linearly independent functions of θ.

Further, the transformation from W to Z is a one-to-one correspondence between A and R^N that preserves the inner product, i.e.,

    ‖Z - Fθ‖² = ‖W - Hθ‖²                                           (4.17)

for every θ in R^{m+1}.  Therefore the least squares estimates <ℓ, θ̂> of <ℓ, θ> obtained from the equations

    ∂/∂θ ‖W - Hθ‖² = 0                                              (4.18)

and

    ∂/∂θ ‖Z - Fθ‖² = 0                                              (4.19)

are identical if and only if <ℓ, θ> is estimable.
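The mechanics of steps (2)-(4) are easy to see in the one-way random model, where the entries of W fall into three expectation groups.  The following is a minimal sketch with hypothetical group sizes and parameter values.

```python
# Sketch: Koch's symmetric sum estimates for Y_ij = mu + a_i + e_ij
import numpy as np

rng = np.random.default_rng(6)
n_i = [4, 3, 5]                                   # hypothetical group sizes
t, n = len(n_i), sum(n_i)
mu, sigma1, sigma = 1.0, 1.2, 0.8                 # hypothetical true values
groups = np.repeat(np.arange(t), n_i)

Y = mu + rng.normal(0, sigma1, t)[groups] + rng.normal(0, sigma, n)
W = np.outer(Y, Y)

same_unit = np.eye(n, dtype=bool)                                 # E(W_ii) = mu^2 + s1^2 + s^2
same_group = (groups[:, None] == groups[None, :]) & ~same_unit    # E = mu^2 + s1^2
diff_group = groups[:, None] != groups[None, :]                   # E = mu^2

Zbar = np.array([W[same_unit].mean(), W[same_group].mean(), W[diff_group].mean()])
L = np.array([[1.0, 1.0, 1.0],       # rows give the expectations of the group averages
              [1.0, 1.0, 0.0],       # in terms of (mu^2, sigma_1^2, sigma^2)
              [1.0, 0.0, 0.0]])
print(np.linalg.solve(L, Zbar))      # unbiased estimates of (mu^2, sigma_1^2, sigma^2)
```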
Forthofer and Koch (1974) modified the method and extended it to mixed models.  The general approach of the modified method can be illustrated by an example of a two-way classification mixed model.  First, the differences among the observations within each level of the fixed effect factor are calculated.  If the fixed effect factor has f levels and there are n_i observations within the ith level, then there are Σ_{i=1}^f n_i(n_i - 1)/2 differences altogether.  The differences can be expressed in a vector form DY with a suitably defined matrix D.  Then the vector G is formed by collecting the upper triangle entries of DYY'D', excluding those whose expectation is identically zero.  Denote the expectation of G by Aσ, where A is the known coefficient matrix and σ the vector of variance components.  The estimators are derived by least squares procedures based on the model E(G) = Aσ.  The estimators are invariant with respect to the fixed effect parameters β.  However, the invariant estimators given by Forthofer and Koch and those obtained in Section 3.4 are not identical in general, since the transformation from T_2W to G is not one-to-one and the equations

    ∂/∂θ_2 ‖T_2W - T_2H_2θ_2‖² = 0

and

    ∂/∂σ ‖G - Aσ‖² = 0

are not equivalent.
4.3  A Least Squares Approach to the MINQUE

In this section we shall discuss a general approach to the estimation of variance components known as the MINQUE, developed by Rao (1971a).  We shall demonstrate, with a suitable linear transformation, that the MINQUE can be obtained by the least squares approach developed in Chapter 3.  In the process, we shall also discuss estimability by the MINQUE and its equivalence to the quadratic estimability defined in Sections 3.3 and 3.4.

We first consider the MINQUE with invariance.  Let X, V_i, i = 1, 2, …, m, be the same as given in (2.2).  Define V = Σ_{i=1}^m V_i.  Since V is positive definite, there exists an n×n non-singular matrix C such that C'VC = I or V = C'⁻¹C⁻¹.  (In Rao's notation, C and C' are both denoted by V^{-1/2}, but C is not symmetric unless V is diagonal.)  Let Z = C'X and P_Z the orthogonal projection matrix on the range space of Z, i.e.,

    P_Z = Z(Z'Z)⁻Z' .                                               (4.20)

The procedure of the MINQUE is to find A* in the subset

    A_p = {A ∈ A: AX = 0, H_2*A = p}

of A such that the minimum of ‖C⁻¹AC'⁻¹‖² over A_p is attained at A*.  It should be noted that the subset A_p of A is non-empty if and only if <p, θ_2> is invariant estimable.
47
invariant estimable.
The MINQUE with invariance Y' A*Y given by Rao
(1971) is
A
*
=
m
r AiC(I -
i=l
P )C'ViC(I - P )C' ,
Z
Z
where A' = (1. 1 1. 2 ••• Am) is so chosen such that
H~A*
= p.
One problem
m
that needs clarification is the existence of such a vector A in R •
In
other words, assume that A is non-empty, is there a member of A expresp
p
sible in the form (4.2l)?
We shall answer the question by showing that the MINQUE with invariance can be obtained by the least squares approach developed in Section 3.4.
First we define a linear operator G on
GA
=
A by
(4.22)
C'AC
for every A in A.
Let WM = GW.
Then
(4.23)
where
~
= GH, we make the partition
(4.24)
just as we did in partitioning H
= Hl +
H •
2
Since C is non-singular, G is invertable and for every A in
A, there
exists B in A such that (A,W) and (B,W ) have the same expectation for
M
every value of e. (Actually (A,W) = (B,W ) if, and only if, B = G-1*A,
M
-1*
where G
is the adjoint of the inverse of G.) This means that estimability based on the model E(W)
the model (4.23).
= He
is equivalent to that based on
48
Let T_M be a linear operator on A defined by

    T_MA = (I - P_Z)A(I - P_Z)                                      (4.25)

for every A in A.  Then by analogy with Lemma 3.7 we have T_M = T_M² = T_M* and R(T_M) = {A ∈ A: AZ = 0}.  The least squares estimators based on the model

    E(T_MW_M) = T_MH_{M2}θ_2                                        (4.26)

are obtained from the equation⁵

    H_{M2}*T_MH_{M2}θ̂_2 = H_{M2}*T_MW_M .                           (4.27)

By analogy with Lemma 3.5, a necessary and sufficient condition that <p, θ_2> is invariant estimable is the existence of λ ∈ R^m such that

    H_{M2}*T_MH_{M2}λ = p .                                         (4.28)

Combining the above results and Lemma 3.5, we have the following.

Lemma 4.1.  Let θ̂_2 and p be given in (4.27) and (4.28); then <p, θ̂_2> is the least squares estimate of <p, θ_2> based on the model (4.26).  If λ is a solution of (4.28), then

    <p, θ̂_2> = <λ, H_{M2}*T_MW_M> = (T_MH_{M2}λ, W_M) .             (4.29)

By some elementary but tedious algebraic operations, the right side of (4.29) can be written as (A*, W) or Y'A*Y, where A* is of the form (4.21).

The above development shows that the MINQUE with invariance is identical to the least squares estimate based on the transformed model (4.26).  Since the existence of A* in A_p depends on the consistency of Equation (4.28), this, together with Lemma 4.1, shows that the MINQUE exists for every invariant estimable function <p, θ_2>.

⁵ Pukelsheim proved that the MINQUE with invariance can be obtained from the normal equation (1.4).
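A minimal numerical sketch of this route follows (hypothetical one-way layout and data).  With V = Σ V_i, C = V^{-1/2} and P_Z the projection on R(C'X), the matrix R = C(I - P_Z)C' equals V⁻¹ - V⁻¹X(X'V⁻¹X)⁻X'V⁻¹, and equation (4.27) then reduces to the system tr(RV_iRV_j)σ̂_j² = Y'RV_iRY, which is the familiar form of Rao's invariant MINQUE equations under these choices.

```python
# Sketch: MINQUE with invariance for a one-way layout via the transformation of Section 4.3
import numpy as np

rng = np.random.default_rng(7)
n_i = [3, 4, 5]                                   # hypothetical group sizes
t, n = len(n_i), sum(n_i)
groups = np.repeat(np.arange(t), n_i)
X = np.ones((n, 1))
U1 = (groups[:, None] == np.arange(t)[None, :]).astype(float)
V = [U1 @ U1.T, np.eye(n)]                        # V_1 and V_2
Y = 2.0 + rng.normal(0, 1.5, t)[groups] + rng.normal(0, 1.0, n)

Vsum = sum(V)
w, Qe = np.linalg.eigh(Vsum)
C = Qe @ np.diag(w ** -0.5) @ Qe.T                # symmetric V^{-1/2}, so C'Vsum C = I
Z = C.T @ X
P_Z = Z @ np.linalg.pinv(Z.T @ Z) @ Z.T
R = C @ (np.eye(n) - P_Z) @ C.T                   # C(I - P_Z)C'

S = np.array([[np.trace(R @ Vi @ R @ Vj) for Vj in V] for Vi in V])
q = np.array([Y @ R @ Vi @ R @ Y for Vi in V])
print(np.linalg.solve(S, q))                      # invariant MINQUE estimates of (sigma_1^2, sigma^2)
```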
Under the assumption that the random variables ε_i, i = 1, …, m, are normally distributed, Rao (1971b) showed that, when V is redefined as

    V = Σ_{i=1}^m r_i V_i ,                                         (4.30)

where r_i, i = 1, …, m, are positive numbers, the MINQUE is the locally minimum variance quadratic unbiased estimator (MIVQUE) at the points where θ_2' = (σ_1² … σ_m²) is a multiple of r' = (r_1 … r_m).  This suggests that the MIVQUE can be obtained by weighted least squares procedures by introducing the weights r_i into the model (4.26).

In the case of the MINQUE without invariance, the problem is to minimize tr(A(V + 2XX')AV) under the restriction that A is in

    B_p = {A ∈ A: X'AX = 0 , H_2*A = p} ,  p ∈ R^m ,                (4.31)

where B_p is non-empty if and only if <p, θ_2> is estimable.

The procedure for applying the least squares method developed in Section 3.2 to obtain the MINQUE without invariance is given as follows.  It can be easily proved that if X'AX = 0, then

    tr(A(V + 2XX')AV) = tr(A(V + XX')A(V + XX')) .
Since (V + XX') is positive definite, there exists a non-singular matrix N such that

    N'(V + XX')N = I  or  V + XX' = N'⁻¹N⁻¹ .

Therefore the problem is to minimize

    ‖N⁻¹AN'⁻¹‖²  for A in B_p .

Let

    W_N = N'WN ,  U = N'X ,

and define the linear transformations H_{N2}: R^m → A and T_N: A → A by

    H_{N2}ρ_2 = Σ_{i=1}^m N'V_iN ρ_i  and  T_NA = A - P_UAP_U ,

respectively, where P_U is the orthogonal projection matrix on the range space of U.  Then we have

    E(T_NW_N) = T_NH_{N2}θ_2 .                                      (4.32)

Set the equation

    H_{N2}*T_NH_{N2}θ̂_2 = H_{N2}*T_NW_N                             (4.33)

and solve for θ̂_2.  Let λ be the solution of the equation

    H_{N2}*T_NH_{N2}λ = p .                                         (4.34)

Then <p, θ̂_2> = <λ, H_{N2}*T_NW_N> is the least squares estimate of <p, θ_2> based on the model (4.32).  If we put

    A* = Σ_{i=1}^m λ_i N( N'V_iN - P_U N'V_iN P_U )N' ,             (4.35)

which can also be expressed in terms of (V + XX')⁻¹ and Q = X(X'(V + XX')⁻¹X)⁻X', then

    min_{A∈B_p} tr(A(V + 2XX')AV) = tr(A*(V + 2XX')A*V)

and (A*, W) = Y'A*Y is the MINQUE of <p, θ_2> without invariance.

The matrix A* given in (4.35) can also be obtained by applying Lemma 3.10 of Rao (1971a) with V replaced by V + XX'.
However, in deriving the MINQUE without invariance, Rao (1971a) did not use this lemma, and the matrix he obtained in minimizing tr(A(V + 2XX')AV) is different in form from the matrix A* in (4.35).  Pringle (1974) also obtained a different form of the matrix A that minimizes tr(A(V + 2XX')AV).  The algebraic identity of the three different forms of the matrix has not been established, but in the next section we will prove that A* in (4.35) does minimize tr(A(V + 2XX')AV) and that the MINQUE without invariance is unique for every estimable function <p, θ_2>.
4.4  A Geometric Approach to Quadratic Least Squares Estimation and the MINQUE

In the last section we have demonstrated that the MINQUE can be obtained by the least squares method developed in Chapter 3.  In this section we shall discuss the relationship between the two methods in a geometric approach.  Even though the following discussion is mostly on unbiased estimation without invariance, its implication for invariant estimation should be easily seen.
Let p be a fixed vector in R^m such that B_p is not empty.  By the definition of B_p in (4.31) we know that A is in B_p if and only if

    A = TA = A - PAP  and  H_2*TA = p .

By Lemma 3.5, there exists λ in R^m such that

    H_2*TH_2λ = p .

Therefore TH_2λ is in B_p.  Observe that

    H_2*TH_2λ = H_2*TA  for every A in B_p .

By Theorem 3.3, TH_2λ is the orthogonal projection on R(TH_2) of every A in B_p.  If TH_2ρ is in B_p, then

    H_2*TH_2ρ = p = H_2*TH_2λ

and (TH_2ρ - TH_2λ) is in N(H*).  But (TH_2ρ - TH_2λ) is also in R(TH_2).  Therefore TH_2ρ = TH_2λ and it is the unique point in R(TH_2) ∩ B_p.

Let A be in B_p.  Then

    H_2*TA = p = H_2*TH_2λ

and (A - TH_2λ) is in N(H*), hence in N(H_2*T), the orthogonal complement of R(TH_2).  Therefore

    ‖A‖² = ‖TH_2λ‖² + ‖A - TH_2λ‖² .

Summarizing the above discussion, we have the following.

Theorem 4.2.  If B_p is not empty, then

(1)  there exists λ in R^m such that TH_2λ is in B_p;

(2)  B_p = {TH_2λ} ⊕ N(H*), where λ satisfies H_2*TH_2λ = p;

(3)  ‖A‖² ≥ ‖TH_2λ‖² for every A in B_p, and the equality holds if, and only if, A = TH_2λ.

From Theorem 4.2 we have

    min_{A∈B_p} ‖A‖² = ‖TH_2λ‖² ,

where λ satisfies H_2*TH_2λ = p.  Since

    (TH_2λ, W) = <λ, H_2*TW> = <λ, H_2*TH_2θ̂_2> = <p, θ̂_2> ,

where θ̂_2 satisfies H_2*TH_2θ̂_2 = H_2*TW, it follows that (A, W) is the least squares estimator of <p, θ_2> if A is in B_p and has the minimum norm.

Conversely, if (A, W) is the least squares estimator of <p, θ_2>, i.e.,

    (A, W) = <p, θ̂_2> = <λ, H_2*TW> = (TH_2λ, W)

for every W, then

    A = TH_2λ

and this shows that ‖A‖² is minimized if (A, W) is the least squares estimator of <p, θ_2>.

We now conclude the above discussion in the following.

Theorem 4.3.  (A, W) is the least squares estimator of <p, θ_2> if, and only if, A is in B_p and

    ‖A‖² = min_{B∈B_p} ‖B‖² .

In the unbiased estimation with invariance, the statements in Theorems 4.2 and 4.3 are still true if T and N(H*) are replaced by T_2 and R(T_2) ∩ N(H*), respectively.
In order to obtain the MINQUE through least squares procedures, we have to make a linear transformation on the data.  In the case of the MINQUE without invariance, the linear transformation G was defined as

    GA = N'AN ,

so that, for A with X'AX = 0,

    tr(A(V + 2XX')AV) = ‖N⁻¹AN'⁻¹‖² .                               (4.36)

Corresponding transformations to be made are

    U = N'X ,  P_U = U(U'U)⁻U'  and  T_NA = A - P_UAP_U .

With this set-up we define

    C_p = {B ∈ A: U'BU = 0, H_{N2}*B = p} .

Then there is a one-to-one correspondence between B_p and C_p, and

    min_{A∈B_p} tr(A(V + 2XX')AV) = min_{B∈C_p} ‖B‖² .

This, together with Theorems 4.2 and 4.3, shows that the MINQUE from the model E(W) = Hθ is the least squares estimator from the model E(T_NW_N) = T_NH_{N2}θ_2.
5.  CONCLUSIONS

The application of the theory of least squares to quadratic estimation in the general mixed model has been developed.  In reviewing the results presented in this dissertation, we find it interesting to note the analogy and the dissimilarity between linear and quadratic estimation in the application of least squares principles.  In both cases, we consider the observation as a member or a point in a finite dimensional inner product space.  Its expected value lies in a subspace which we refer to as the estimation space.  The estimation space is spanned by a finite subset specified by the model.  In the model E(Y) = Xβ, the spanning subset consists of the columns of X.  In the model E(W) = Hθ, it is the collection of the B_ij and V_i.  A linear function of parameters is estimable if and only if it can be expressed as a linear function of the expected value of the observation.  The least squares estimator of such a parametric function is a linear function of the orthogonal projection of the observation on the estimation space.  In linear estimation, if α'Y is unbiased for ℓ'β, then α'Ŷ is the least squares estimate, where Ŷ is the orthogonal projection of Y on R(X).  Similarly, if (A, W) is unbiased for <ℓ, θ>, then (A, W_1) is the least squares estimate, where W_1 is the orthogonal projection of W on R(H).
In estimation of variance components, we make the partition Hθ = H_1θ_1 + H_2θ_2 like the partition Xβ = X_1β_1 + X_2β_2 in linear models.  One dissimilarity between the two cases is observed in unbiased estimation with invariance.  In linear estimation, if α'Y is unbiased for ℓ_2'β_2, then α is in the null space of X_1' and α'Y is invariant with respect to the translation of β_1.  However, in quadratic estimation, that (A, W) is unbiased for <ℓ_2, θ_2> does not necessarily mean that it is invariant with respect to β.  Invariance is a desirable property in quadratic estimation of variance components.  However, invariant estimates do not always exist for quadratic estimable functions of the form <ℓ_2, θ_2>.

Least squares estimators of variance components derived from the model E(TW) = TH_2θ_2 are proved to be identical with those from the model E(W) = Hθ.  These estimators are unbiased without invariance.  It is proved that Koch's (1967) symmetric sum approach to estimation of variance components is a special case of least squares estimation based on the model E(W) = Hθ.  Invariant estimators are derived from the model E(T_2W) = T_2H_2θ_2.  Since the range spaces of the linear transformations TH_2 and T_2H_2 do not necessarily coincide, the projections of W on these two subspaces are not identical in general.
It is proved that Rao's (1971a) MINQUE can be derived by the weighted least squares approach.  If <p, θ_2> is quadratic estimable with invariance, then the MINQUE of <p, θ_2> with invariance is the quadratic estimator (A*, W) such that

    ‖C⁻¹A*C'⁻¹‖² = min_{A∈A_p} ‖C⁻¹AC'⁻¹‖² ,

where

    A_p = {A ∈ A: AX = 0 and H_2*A = p}

and C is the non-singular matrix such that Σ_{i=1}^m C'V_iC = I.  If θ_2' = (σ_1² … σ_m²) is a multiple of r' = (r_1 … r_m) and C is such that Σ_{i=1}^m C'V_iC r_i = I, then the MINQUE is the locally minimum variance unbiased estimator under the assumption of normality.  This is analogous to the case in linear estimation where the least squares estimate α*'Y of ℓ'β is such that α* minimizes α'α among all linear unbiased estimates α'Y, and α*'Y is the best linear unbiased estimate if α* minimizes α'Vα, where V = Var(Y).
The theory of least squares leads to minimum variance unbiased
estimator in linear estimation.
In the present work, the least squares
approach to quadratic estimation of variance components is developed and
the analogy and dissimilarity between linear and quadratic estimation
are outlined in the hope that minimum variance unbiased quadratic estimators will be developed in the future research.
Other topics of sug-
gested future research are the admissibility of invariant quadratic
estimators, specification of the class of experimental designs where all
variance components are quadratic estimable with invariance and the
experimental designs for quadratic least squares estimation of variance
components.
LIST OF REFERENCES
Forthofer, R. N. and G. G. Koch. 1974. An extension of the symmetric
sum approach to the estimation of variance components. Biometrische
Zeitschrift 16:3-14.
Graybill, F. A. 1961. An Introduction to Linear Statistical Models.
McGraw-Hill Book Company, Inc., New York.
Graybill, F. A. and R.A. Hultquist. 1961. Theorems concerning
Eisenhart's model II. Annals of Mathematical Statistics 32:261-269.
Halmos, P. R. 1958. Finite Dimensional Vector Spaces, Second Edition.
D. Van Nostrand Company, Princeton.
Harville, D. A. 1967. Estimability of variance components for the
two-way classification with interaction. Annals of Mathematical
Statistics 38:1508-1519.
Hoffman, K. and R. Kunze. 1971. Linear Algebra, Second Edition.
Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
Koch, G. G. 1967. A general approach to the estimation of variance
components. Technometrics 9 :93-118.
Koch, G. G. 1968. Some further remarks concerning 'a general approach
to the estimation of variance components.' Technometrics 10:551-558.
Pringle, R. M. 1974. Some results on the estimation of variance components by MINQUE. Journal of the American Statistical Association
69:987-989.
Pukelsheim, F. 1976. Estimating variance components in linear models.
Journal of Multivariate Analysis 6:626-629.
Rao, C. R. 1970. Estimation of heteroscedastic variances in a linear
model. Journal of American Statistical Association 65:161-172.
Rao, C. R. 1971a. Estimation of variance and covariance components-MINQUE theory. Journal of Multivariate Analysis 1:257-275.
Rao, C. R. 1971b. Minimum variance quadratic unbiased estimation of
variance components. Journal of Multivariate Analysis 1:445-456.
Rao, C. R. 1972. Estimation of variance and covariance components in
linear models. Journal of the American Statistical Association
67: 112-115.
Rao, C. R. 1973. Linear Statistical Inference and Its Applications,
Second Edition. John Wiley and Sons, Inc., New York.
Searle, S. R. 1968. Another look at Henderson's methods of estimating
variance components (with discussion). Biometrics 24:749-787.
Searle, S. R. 1971. Linear Models. John Wiley and Sons, Inc., New York.
Seely, J. 1969. Estimation in finite dimensional vector space with
application to the mixed linear model. Ph.D. Dissertation, Iowa
State University, Ames.
Seely, J. 1970a. Linear spaces and unbiased estimation. Annals of Mathematical Statistics 41:1725-1734.
Seely, J. 1970b. Linear spaces and unbiased estimation--application to the mixed linear model. Annals of Mathematical Statistics 41:1735-1748.