CONTRIBUTIONS TO THE THEORY, COMPUTATION AND
APPLICATION OF GENERALIZED INVERSES

Charles A. Rohde
Institute of Statistics
Mimeograph Series No. 392
May 1964
TABLE OF CONTENTS

                                                                      Page

LIST OF TABLES ....................................................     vi
LIST OF ILLUSTRATIONS .............................................    vii

CHAPTER 1.  INTRODUCTION ..........................................      1

CHAPTER 2.  LITERATURE REVIEW .....................................      4
    2.1  Summary ..................................................      4
    2.2  Generalized Inverses .....................................      4
    2.3  Reflexive Generalized Inverses ...........................     13
    2.4  Normalized Generalized Inverses ..........................     14
    2.5  Pseudo-Inverses ..........................................     16
    2.6  Generalizations to Hilbert Spaces ........................     19
    2.7  Generalized Inverses in Algebraic Structures .............     29

CHAPTER 3.  THEORETICAL RESULTS ...................................     32
    3.1  Summary ..................................................     32
    3.2  Results on Rank ..........................................     32
    3.3  Symmetry Results .........................................     35
    3.4  Eigenvalues and Eigenvectors .............................     37
    3.5  Orthogonal Projections ...................................     41
    3.6  Characterization of Generalized Inverses .................     42

CHAPTER 4.  COMPUTATIONAL METHODS .................................     47
    4.1  Summary ..................................................     47
    4.2  An Expression for a Generalized Inverse of a
         Partitioned Matrix .......................................     47
    4.3  Some Computational Procedures for Finding Generalized
         Inverses .................................................     52
         4.3.1  Generalized Inverses ..............................     52
         4.3.2  Reflexive Generalized Inverses ....................     54
         4.3.3  Normalized Generalized Inverses ...................     56
         4.3.4  Pseudo-Inverses ...................................     56
    4.4  Abbreviated Doolittle Technique ..........................     59
    4.5  Some Results on Bordered Matrices ........................     69

CHAPTER 5.  LEAST SQUARES APPLICATIONS ............................     74
    5.1  Introduction .............................................     74
    5.2  A Review of Some Current Theories ........................     75
    5.3  Weighted Least Squares and the Generalized Gauss-Markoff
         Theorem ..................................................     79
    5.4  Linear Models with Restricted Parameters .................     82
    5.5  Linear Models with a Singular Variance-Covariance
         Matrix ...................................................     87
    5.6  Multiple Analysis of Covariance ..........................     89
    5.7  Use of the Abbreviated Doolittle in Linear Estimation
         Problems .................................................     91
    5.8  Tests of Linear Hypotheses ...............................     92

CHAPTER 6.  SUMMARY AND SUGGESTIONS FOR FUTURE RESEARCH ...........     96
    6.1  Summary ..................................................     96
    6.2  Suggestions for Future Research ..........................     96

CHAPTER 7.  LIST OF REFERENCES ....................................    103
LIST OF TABLES

                                                                      Page

4.1.  The Abbreviated Doolittle format ............................     63
LIST OF ILLUSTRATIONS

                                                                      Page

2.1.  The Pseudo-Inverse in geometric terms .......................     27
CHAPTER 1.  INTRODUCTION
Statistics, which may be viewed as a branch of mathematics, has suggested many interesting mathematical problems. Certain developments in the areas of measure theory, modern algebra, number theory, numerical analysis, etc. have been fostered by research into statistical problems. Matrix theory affords perhaps the best example of the interplay between statistics and mathematics. The most widely used statistical techniques are Analysis of Variance and Regression, and each can be compactly treated using matrix algebra. Application of matrix algebra to these areas led, for example, to Cochran's Theorem, which has as its mathematical counterpart the Spectral Representation Theorem for a symmetric matrix. Matrix theory also has provided relatively simple proofs of the properties of least squares as used in Analysis of Variance and Regression problems. Mathematically these proofs are based on the concepts of projections and linear transformations on (finite dimensional) vector spaces. Recent work in the area of stationary stochastic processes has indicated that the infinite dimensional analogues of Analysis of Variance and Regression techniques will become increasingly important and useful to the applied statistician.
Application of the theory of least squares to the General Linear Model results in a set of linear equations called the normal or least squares equations. When represented in matrix form, these equations frequently have no unique solution, i.e., the coefficient matrix is singular. To cope with this situation, certain authors (Bose [1959] and C. R. Rao [1962]) developed the concept of a Generalized Inverse. Roughly speaking, a Generalized Inverse is an attempt to represent a typical or generic solution to a set of linear equations in a compact form. Several earlier generalizations of the inverse of a matrix developed in certain pure mathematical contexts.
Until the late 1950's little investigation of the theoretical properties of Generalized Inverses was undertaken. Starting in 1955 with an excellent article by Penrose, many authors have investigated certain properties of a Generalized Inverse called the Pseudo-Inverse. The Pseudo-Inverse of a matrix is a uniquely determined matrix which behaves almost exactly like an inverse. Due to its uniqueness it was natural that the properties of the Pseudo-Inverse were studied and extended to apply to other algebraic systems and just recently to Hilbert space.

The Pseudo-Inverse of a matrix, while of interest because of its theoretical properties, is perhaps a little too cumbersome computationally for use in least squares theory. In particular, the results of least squares computations happen to be invariant under the choice of Generalized Inverse. Such invariance, coupled with the computational difficulties present with the Pseudo-Inverse, indicates one need for research on the properties of other types of Generalized Inverses. At present there appear to be four types of Generalized Inverses which deserve recognition and study.
In this work we shall attempt to accomplish five purposes:

1. Define the various types of Generalized Inverses which appear to be important and collate previous research,
2. Investigate inter-relationships between the various types of Generalized Inverses,
3. Discuss some computational aspects of obtaining the various types of Generalized Inverses,
4. Indicate and modify the statistical applications of Generalized Inverses which have appeared in the literature, and
5. Discuss possible areas for future research.
Although topics 1 through 5 were of primary importance in this
work, at least one interesting by-product appeared.
Upon investigating
the computational methods it was found that by a simple modification of
the techniques and theory taught in beginning matrix theory courses
Generalized Inverses could be discussed at an elementary level.
This
suggests that the idea of Generalized Inverses might well be fitted into
beginning matrix theory courses in order to achieve a more general and
more satisfying treatment of systems of linear equations.
CHAPTER 2.  LITERATURE REVIEW

2.1 Summary
In this chapter we investigate five definitions of Generalized Inverses. Each of these definitions, as mentioned in the Introduction, is designed to assist in solving a set of equations. Four of the definitions presented are concerned with linear equations in finite dimensional inner product spaces. Definition 2.1 provides the most general representation of a solution. Each of the other definitions can be considered as a special case of Definition 2.1, which defines what we shall call a Generalized Inverse. Definition 2.2 defines the concept of a Reflexive Generalized Inverse, which is simply a Generalized Inverse with a little symmetry. Normalized Generalized Inverses are introduced in Definition 2.3 and can be considered as Reflexive Generalized Inverses with a certain projection property. The concept of Pseudo-Inverse introduced in Definition 2.4 is simply a unique Generalized Inverse or, alternatively, a Normalized Generalized Inverse with a symmetric projection property.

An extension of the concept of Generalized Inverses to Hilbert spaces is achieved in Definition 2.5, where the Pseudo-Inverse of a bounded linear transformation is defined. Further extensions of the concept of Generalized Inverses are discussed in Section 2.7.
2.2
Generalized Inverses
The definition of a Generalized Inverse formulated by Bose [1959],
who used the term Conditional Inverse, revolved around the need to solve
5
the system of non-homogeneous linear equations
~
(2.1)
where
an
A is an
nx1
and r
A- l
= p,
matrix (n > p),! is a
nx p
vector.
=~ ,
The rank: of
vector and ~
p xl
A is assumed to be
r
~
p.
are consistent
(!.!:..,
If
r
= p,
n = p
= A-l~
then (2.1) admits a unique solution, namely !
denotes the inverse of A.
If
is
where
then provided the equations
possess a solution), a solution is given by
!o=
, )-1
( AA
kz..
Since
we see that (2.2) is a solution of (2.1).
The matrix
(A'A)-lA'
is an
example of the Pseudo-Inverse of the matrix A and will be discussed in
Section 2.4.
A solution of (2.1), provided it exists, is relatively easy to
represent in both the cases mentioned above.
where
r < p < n, a solution of (2.1) is more difficult to represent.
g
If we let· A
r
In the more general case
=p < n
= A-lor
(A' ArlA'
depending on whether
r
=p = n
or
then we see that
This suggests that a particular solution of (2.1), say A~, should
satisfy (2.3).
We thus introduce the following definition due to Bose.
g
Definition 2.1 (Generalized Inverse) A matrix A is said to be
a Generalized Inverse of the matrix A if (2.3) is satisfied,
AAgA
=A
•
!.!:..,
if
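As a quick illustration (not from the thesis), the defining relation of Definition 2.1 is easy to check numerically. A minimal sketch, assuming Python with numpy and using a hypothetical singular A together with one of its many Generalized Inverses:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.]])    # rank 1, hence singular
Ag = np.array([[1., 0.],
               [0., 0.]])    # one of infinitely many generalized inverses of A

# Definition 2.1: A^g is a Generalized Inverse of A iff A A^g A = A
assert np.allclose(A @ Ag @ A, A)
```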
Most of the theory of solving linear equations is based on the following three theorems, which are well known and stated here for future reference.

Theorem 2.1: If A is a square matrix, a necessary and sufficient condition that there exist x ≠ 0 such that Ax = 0 is that A be singular.

Theorem 2.2: A necessary and sufficient condition that the linear equations Ax = y be solvable for x is that Rank [A | y] = Rank [A].

Theorem 2.3: The general solution (more precisely, a typical element in the set of solutions) of the non-homogeneous linear equations Ax = y is obtained by adding to any particular solution, x_0, the general solution of Ax = 0.

Using the above three theorems, Bose proves the following results, the proofs of which we reproduce since they form an integral part of our later work. The first of these shows that any A^g satisfying Definition 2.1 plays a basic role in providing a solution to (2.1).
Theorem 2.4: (i) If A^g is a Generalized Inverse of A and the equations Ax = y are consistent (i.e., solvable for x), then x_1 = A^g y is a solution of Ax = y. (ii) If x_1 = A^g y is a solution of Ax = y for every y for which Ax = y admits a solution, then A A^g A = A, i.e., A^g is a Generalized Inverse of A.

Proof: (i) Since Ax = y is consistent there exists w such that Aw = y. Hence

A A^g y = A A^g (A w) = (A A^g A) w = A w = y ,

so x_1 = A^g y is a solution of Ax = y.

(ii) In particular let y = Ax for an arbitrary x. Then Ax = y is consistent, with A^g y = A^g A x a solution, and thus A A^g A x = A x for arbitrary x. Hence A A^g A = A.
The following result, used by Bose in developing the theory of
least squares, indicates an important set of linear equations which are
consistent and possess a certain uniqueness property.
Theorem 2.5: The equations (X'X)b = X'y are consistent for any y ≠ 0 and Xb is unique.

Proof: Since the rank of a product of matrices is less than or equal to the minimum of the separate ranks we have

Rank [(X'X) | X'y] = Rank [X'(X | y)] ≤ Rank [X'] = Rank [X'X] .

We also have

Rank [(X'X) | X'y] ≥ Rank [X'X] .

Hence Rank [X'X | X'y] = Rank [X'X], and by Theorem 2.2 the equations (X'X)b = X'y are consistent.

Let b_1 and b_2 be any two solutions to (X'X)b = X'y; then

(X'X)(b_1 - b_2) = 0

and hence

(b_1 - b_2)'X'X(b_1 - b_2) = (Xb_1 - Xb_2)'(Xb_1 - Xb_2) = 0 ,

which implies Xb_1 = Xb_2, proving uniqueness of Xb.
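A minimal numerical sketch of Theorem 2.5, assuming numpy; the design matrix is a hypothetical one-way classification chosen so that X'X is singular. The normal equations remain consistent, and the fitted vector Xb agrees across different solutions b:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(6),
                     [1., 1., 1., 0., 0., 0.],
                     [0., 0., 0., 1., 1., 1.]])   # col0 = col1 + col2, so X'X is singular
y = rng.normal(size=6)

XtX, Xty = X.T @ X, X.T @ y
b1 = np.linalg.pinv(XtX) @ Xty       # one solution of the normal equations
v = np.array([1., -1., -1.])         # X v = 0, hence (X'X) v = 0
b2 = b1 + v                          # a second, different solution

assert np.allclose(XtX @ b1, Xty) and np.allclose(XtX @ b2, Xty)
assert np.allclose(X @ b1, X @ b2)   # but the fitted vector Xb is unique
```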
The following expanded version of Theorem 2.5, also due to Bose, finds extensive application in Chapter 4.

Theorem 2.6: If X' is a p × n matrix, there exists a p × n matrix Y such that (X'X)Y = X'. Further:

(i)   XY is unique,
(ii)  (XY)' = XY,
(iii) (XY)^2 = XY.

Proof: Let e_i be an (n × 1) vector with a 1 in the i-th row and 0's elsewhere. By Theorem 2.5, for each i = 1, 2, ..., n there exists a y_i such that (X'X)y_i = X'e_i. Hence

(X'X)[y_1, y_2, ..., y_n] = X'[e_1, e_2, ..., e_n] = X'I = X' ,

which proves the existence of Y by taking Y = [y_1, y_2, ..., y_n]. By Theorem 2.5, Xy_i is unique for each i = 1, 2, ..., n. Hence XY is unique. To prove (ii) and (iii), multiply (X'X)Y = X' on the left by Y' to obtain

(XY)'(XY) = Y'X'XY = Y'X' = (XY)' .

The left-hand side is symmetric; hence (XY)' = XY, and then (XY)^2 = (XY)'(XY) = (XY)' = XY.
Corollary 2.1: The matrix X(X'X)^g X', where (X'X)^g is any Generalized Inverse of (X'X), is uniquely determined, symmetric and idempotent.

Proof: By Theorems 2.5 and 2.6 the equations (X'X)Y = X' are consistent. Hence by Theorem 2.4 a solution is given by Y = (X'X)^g X'. By Theorem 2.6, X(X'X)^g X' has the stated properties.
In view of the importance of Corollary 2.1 an alternative proof will be presented.

(i) [(X'X)(X'X)^g(X'X)]' = X'X implies that (X'X)^g' is also a Generalized Inverse of X'X.

(ii) Since (X'X)(X'X)^g(X'X) = X'X, each of the last three terms of the expansion

[X - X(X'X)^g(X'X)]'[X - X(X'X)^g(X'X)]
    = X'X - X'X(X'X)^g(X'X) - (X'X)(X'X)^g'(X'X) + (X'X)(X'X)^g'(X'X)(X'X)^g(X'X)
    = X'X - X'X - X'X + X'X = 0

equals X'X, so the whole expression vanishes; this implies that X = X(X'X)^g(X'X) for any Generalized Inverse of X'X.

(iii) Let (X'X)^g_1 and (X'X)^g_2 be any two Generalized Inverses of (X'X). Using (i) and (ii), each of the four terms of

[X(X'X)^g_1 X' - X(X'X)^g_2 X']'[X(X'X)^g_1 X' - X(X'X)^g_2 X']

has the form X(X'X)^g_a'(X'X)(X'X)^g_b X' = X(X'X)^g_b X', so the expansion becomes

X(X'X)^g_1 X' - X(X'X)^g_2 X' - X(X'X)^g_1 X' + X(X'X)^g_2 X' = 0 .

Thus X(X'X)^g_1 X' = X(X'X)^g_2 X', i.e., X(X'X)^g X' is uniquely determined.

(iv) X' = [X(X'X)^g(X'X)]' = (X'X)(X'X)^g' X', and hence, by (iii),

[X(X'X)^g X']' = X(X'X)^g' X' = X(X'X)^g X' ,

so X(X'X)^g X' is symmetric.

(v) [X(X'X)^g X'][X(X'X)^g X'] = [X(X'X)^g(X'X)](X'X)^g X' = X(X'X)^g X', so X(X'X)^g X' is idempotent.

Thus X(X'X)^g X' is unique, symmetric and idempotent, X = X(X'X)^g(X'X), and X' = (X'X)(X'X)^g X'.
Thus far we have not settled the question of existence of a Generalized Inverse for an arbitrary matrix A. The following result of Bose settles the existence question and in addition gives a type of canonical representation for a Generalized Inverse.

Theorem 2.7: If A is any matrix then a Generalized Inverse exists and is given by A^g = P_2 B^g P_1, where P_1 and P_2 are non-singular matrices such that

P_1 A P_2 = B = [ I(r)  0 ]
                [ 0     0 ] ,    r = Rank A ,

and

B^g = [ I(r)  U ]
      [ V     W ] ,

U, V and W being arbitrary matrices. Further, every Generalized Inverse of A has the form P_2 B^g P_1 for some Generalized Inverse B^g of B.

Proof: For any matrix A there exist non-singular matrices P_1 and P_2 such that P_1 A P_2 = B, where r is the rank of A. Define

B^g = [ X  U ]
      [ V  W ] ;

then

B B^g B = [ I(r)  0 ] [ X  U ] [ I(r)  0 ]  =  [ X  0 ] .
          [ 0     0 ] [ V  W ] [ 0     0 ]     [ 0  0 ]

Hence B^g is a Generalized Inverse of B if and only if X = I(r). Thus

B^g = [ I(r)  U ]
      [ V     W ] ,

where U, V and W are arbitrary, is a canonical representation for a Generalized Inverse of B. Since A = P_1^{-1} B P_2^{-1} we see that A^g = P_2 B^g P_1 is a Generalized Inverse of A, thus proving existence.

If A^g is any Generalized Inverse of A then the equation A A^g A = A implies that

P_1^{-1} B P_2^{-1} A^g P_1^{-1} B P_2^{-1} = P_1^{-1} B P_2^{-1} ,

or

B (P_2^{-1} A^g P_1^{-1}) B = B .

It follows that P_2^{-1} A^g P_1^{-1} is a Generalized Inverse of B, say B*. Hence A^g = P_2 B* P_1 for some Generalized Inverse B* of B.
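The construction in the proof of Theorem 2.7 can be carried out mechanically. The following sketch, assuming numpy, accumulates the elementary row and column operations into P_1 and P_2 and then forms A^g = P_2 B^g P_1; the function name canonical_form, the example matrix and the pivot tolerance are illustrative choices, not part of the thesis:

```python
import numpy as np

def canonical_form(A, tol=1e-12):
    """Return P1, P2, r with P1 @ A @ P2 = [[I_r, 0], [0, 0]]."""
    A = A.astype(float)
    n, p = A.shape
    P1, P2, B = np.eye(n), np.eye(p), A.copy()
    r = 0
    while True:
        sub = np.abs(B[r:, r:])               # untouched block
        if sub.size == 0 or sub.max() <= tol:
            break
        i, j = divmod(sub.argmax(), p - r)    # pivot position within the block
        i, j = i + r, j + r
        B[[r, i]], P1[[r, i]] = B[[i, r]], P1[[i, r]]              # swap rows
        B[:, [r, j]], P2[:, [r, j]] = B[:, [j, r]], P2[:, [j, r]]  # swap columns
        piv = B[r, r]
        B[r], P1[r] = B[r] / piv, P1[r] / piv                      # scale pivot row
        for k in range(n):                     # clear the pivot column (row ops)
            if k != r:
                f = B[k, r]
                B[k] -= f * B[r]
                P1[k] -= f * P1[r]
        for k in range(p):                     # clear the pivot row (column ops)
            if k != r:
                f = B[r, k]
                B[:, k] -= f * B[:, r]
                P2[:, k] -= f * P2[:, r]
        r += 1
    return P1, P2, r

A = np.array([[2., 4., 2.],
              [1., 2., 1.],
              [3., 6., 3.]])                   # rank 1
P1, P2, r = canonical_form(A)
Bg = np.zeros((3, 3)); Bg[:r, :r] = np.eye(r)  # the choice U = V = W = 0
Ag = P2 @ Bg @ P1
assert np.allclose(A @ Ag @ A, A)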
Bose applies the above results to obtain a solution of the so-called "normal" or "least squares" equations given by

(X'X)b = X'y

and discusses conditions for (linear) estimability, which reduce to the requirements for the uniqueness of a linear function λ'b, and (strong) testability, which reduces to conditions for the uniqueness of Lb where the rank of L is full (L is a rectangular matrix of order s × p (s < p) and rank s).

C. R. Rao [1962], in a recent paper, has discussed what we have defined as a Generalized Inverse. Rao's treatment closely parallels that developed by Bose as given above. Rao defines his Generalized Inverse to be any matrix A^g such that for any y for which Ax = y is consistent, x = A^g y is a solution. He then establishes that A^g is a Generalized Inverse of A if and only if A A^g A = A. Rao discusses briefly the application of Generalized Inverses to the theory of least squares and distributions of quadratic forms in normal variables.
2.3 Reflexive Generalized Inverses

C. R. Rao [1962] established that there exists a Generalized Inverse with the additional property that

A^g A A^g = A^g .

In his proof of this result a constructive procedure for determining such an A^g is given. He simply uses the fact that for any matrix A there exist non-singular matrices P_1 and P_2 such that

P_1 A P_2 = [ I(r)  0 ]  =  B ,
            [ 0     0 ]

where r is the rank of A. The Generalized Inverse given by A^g = P_2 B' P_1 is such that A A^g A = A and A^g A A^g = A^g.

The reflexive property present with such a Generalized Inverse suggests that some interesting results might be obtained by considering the following type of Generalized Inverse.

Definition 2.2: A Reflexive Generalized Inverse of a matrix A is defined as any matrix A^r such that

(2.4)    A A^r A = A    and    A^r A A^r = A^r .

In the case of a symmetric matrix Rao has given a computational method (a modified version of which we discuss in detail in Chapter 4) for obtaining a Reflexive Generalized Inverse. Properties of this type of Generalized Inverse (other than those which a Generalized Inverse possesses) do not appear to have been previously discussed. It is of interest to point out here that Roy [1953], in his discussion of univariate Analysis of Variance, utilizes an approach resting on the concept of the space spanned by the columns of a matrix. It can be easily shown that such an approach is exactly equivalent to working with a judicious choice of a Reflexive Generalized Inverse. This will be discussed briefly in Chapter 5.

Specific properties of the Reflexive Generalized Inverse will be discussed in Chapter 3.
2.4 Normalized Generalized Inverses

Zelen and Goldman [1963] in a recent article have proposed a type of Generalized Inverse which they call a Weak Generalized Inverse.

Definition 2.3 (Normalized Generalized Inverse): If X is an n × p matrix then a Normalized Generalized Inverse of X is a p × n matrix X^n satisfying

X X^n X = X ,    X^n X X^n = X^n ,    (X X^n)' = X X^n .

Zelen and Goldman's proof of the existence of a Normalized Generalized Inverse is based on the following three Lemmas on bordered matrices, which we state without proof.

Lemma 2.1: Let A be a symmetric p × p matrix of rank r < p and let R be a p × (p-r) matrix of rank p - r. Then there exists a p × (p-r) matrix B with the properties

(*)    (a) B'A = 0    and    (b) det (B'R) ≠ 0

if and only if the square symmetric matrix

M = [ A   R ]
    [ R'  0 ]

is non-singular. In this case any B of rank p - r can be used as the B in (*). Furthermore M^{-1} has the form

M^{-1} = [ C             B(R'B)^{-1} ]
         [ (B'R)^{-1}B'  0           ] .

Lemma 2.2: Let A, R, M be as in Lemma 2.1 and assume M non-singular. Then there is a unique p × p symmetric matrix C associated to R, with the property that for at least one B satisfying (*),

(**)    R'C = 0 .

(B bears no relation to the B used in Sections 2.2 and 2.3.) The matrix C satisfies (**) for every B satisfying (*), and has the additional properties:

(a) CAC = C ,
(b) ACA = A .

Also C is of rank r, and C can be written as

C = [I - B(R'B)^{-1}R'][A + RB']^{-1} .

Lemma 2.3: Let A be a symmetric p × p matrix. If C satisfies (a) and (b) of Lemma 2.2 then C can be found using some R in the expression given for C in Lemma 2.2.

Note that C as given above is symmetric since the inverse of the symmetric matrix M is symmetric. Also note that C is a Reflexive Generalized Inverse in the sense of Definition 2.2, but that C is required to be symmetric here.

The following Lemma established by Zelen and Goldman will be used in Chapter 3 to establish the relation between Reflexive Generalized Inverses and Normalized Generalized Inverses.

Lemma 2.4: Let X be an n × p matrix. Then the p × n matrix X^n is a Normalized Generalized Inverse of X if and only if X^n = CX' for some C associated with A = X'X as in Lemma 2.2.

Zelen and Goldman use the concept of Normalized Generalized Inverses to study and extend some of the concepts of minimum variance linear unbiased estimation theory.
2.5 Pseudo-Inverses

The earliest work on Generalized Inverses was done by E. H. Moore [1920]. Moore introduced the concept of the "general reciprocal" of a matrix in 1920 and later, in 1935, in a book entitled General Analysis, further investigated the concept. Penrose [1955, 1956], independently and through a different approach, discovered Moore's "general reciprocal", which he calls the "generalized inverse" of a matrix. Rado [1956] has pointed out the equivalence of the two approaches. Penrose's approach is much closer to the approach we have followed thus far, and we introduce this type of Generalized Inverse in this manner. Due to the recent literature we use the name Pseudo-Inverse for the Generalized Inverse developed by Moore and Penrose.
The definition of the Pseudo-Inverse of a matrix hinges on the following theorem proved by Penrose.

Theorem 2.8: The four equations

(i)   A A^t A = A
(ii)  A^t A A^t = A^t
(iii) (A A^t)' = A A^t
(iv)  (A^t A)' = A^t A

have a unique solution for any A.

Proof: Penrose's proof can be greatly shortened using the results previously mentioned in this Chapter and the spectral representation of a symmetric matrix. We see that (A'A) is a symmetric matrix and hence has a spectral representation

A'A = Σ_i λ_i E_i

where E_i = E_i' = E_i^2, E_i E_j = 0 (i ≠ j) and λ_i is the i-th non-zero eigenvalue of A'A. If we let

X = Σ_i λ_i^{-1} E_i

then

(A'A) X (A'A) = Σ_i λ_i E_i = (A'A)

and similarly X(A'A)X = X. Hence X is a Reflexive Generalized Inverse of A'A, and by Corollary 2.1, A X A'A = A. We now define A^t = XA'. Then

A A^t A = A X A'A = A ,
A^t A A^t = X A'A X A' = X A' = A^t ,
A^t A = X A'A = Σ_i E_i = (Σ_i E_i)' = (A^t A)' ,

and

A A^t = A X A' = (A X A')' = (A A^t)'

by Theorem 2.6. Following Penrose, uniqueness is shown by noting that if B also satisfies (i)-(iv) then

A^t = A^t A A^t = A^t A^t' A' = A^t A^t' A' B' A' = A^t A^t' A' (AB)' = A^t A^t' A' A B
    = A^t A B = A^t A A' B' B = A' B' B = (BA)' B = B A B = B ,

where the relations A = A A^t A, A^t = A^t A A^t, (A A^t)' = A A^t, (A^t A)' = A^t A, and their counterparts for B, have been freely used.

Definition 2.4: The matrix A^t associated with A as in Theorem 2.8 is called the Pseudo-Inverse of A.
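The spectral construction used in the proof above translates directly into a short computation. A sketch assuming numpy; the matrix A is a hypothetical example:

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 0.],
              [0., 1.]])
lam, V = np.linalg.eigh(A.T @ A)           # A'A is symmetric
X = sum(V[:, [i]] @ V[:, [i]].T / lam[i]   # X = sum of lambda_i^{-1} E_i
        for i in range(len(lam)) if lam[i] > 1e-12)
At = X @ A.T                               # A^t = X A'

assert np.allclose(At, np.linalg.pinv(A))  # agrees with the Moore-Penrose inverse
```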
Penrose investigates many properties of the Pseudo-Inverse, the results being summarized in the following theorems.

Theorem 2.9:

(i)    (A^t)^t = A ,
(ii)   (A')^t = (A^t)' ,
(iii)  A^t = A^{-1} if A is non-singular,
(iv)   (λA)^t = λ^t A^t for any scalar λ, where λ^t = λ^{-1} if λ ≠ 0 and λ^t = 0 if λ = 0,
(v)    (A'A)^t = A^t A^t' ,
(vi)   if U and V are orthonormal, (UAV)^t = V'A^t U' ,
(vii)  if A = Σ_i A_i, where A_i A_j' = 0 and A_i'A_j = 0 whenever i ≠ j, then A^t = Σ_i A_i^t ,
(viii) A^t = (A'A)^t A' ,
(ix)   A^t A, A A^t, I - A^t A and I - A A^t are all symmetric idempotent matrices,
(x)    if A is normal then A^t A = A A^t, and A, A^t, A^t A and A A^t all have rank equal to the trace of A A^t.

We recall that a normal matrix is a matrix such that AA' = A'A.
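Property (viii) is the workhorse of Chapter 4, since it reduces pseudo-inversion of an arbitrary A to pseudo-inversion of the symmetric matrix A'A. A one-line numerical check, assuming numpy and a hypothetical (possibly rank-deficient) random A:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3)) @ rng.normal(size=(3, 3))
assert np.allclose(np.linalg.pinv(A),
                   np.linalg.pinv(A.T @ A) @ A.T)   # (viii): A^t = (A'A)^t A'
```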
Theorem 2.10: A necessary and sufficient condition for the equation A X B = C to have a solution for X is

A A^t C B^t B = C ,

in which case the general solution is

X = A^t C B^t + Y - A^t A Y B B^t ,

where Y is arbitrary.

Penrose points out that A^t need only satisfy A A^t A = A in order for Theorem 2.10 to hold. An immediate corollary is that the general solution of the consistent equation Px = c is x = P^t c + (I - P^t P)z, where z is arbitrary.
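The corollary can be checked directly; in the sketch below (numpy assumed, with a hypothetical rank-one P and an arbitrary z) every vector of the stated form solves Px = c:

```python
import numpy as np

P = np.array([[1., 2., 3.],
              [2., 4., 6.]])        # rank 1
c = np.array([6., 12.])             # consistent: c = P @ [1, 1, 1]
Pt = np.linalg.pinv(P)

z = np.array([5., -7., 2.])         # arbitrary
x = Pt @ c + (np.eye(3) - Pt @ P) @ z
assert np.allclose(P @ x, c)        # x solves the consistent system
```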
t
The matrix A B is the unique best approximate
solution of the equation AX
= B.
(The matrix X
o
best approximate solution of the equation f(X)
is said to be a
=G
if for all
X
either
(i)
Ilf(x) - Gil> II f (xo )
or
(ii)
Note that
II Mil
IIf(X) - G
II = II f (xo )
-
Gil
- Gil
and
Ilxll -> II
X0 II
.
is a norm of the matrix M and is commonly given by the
square root of the trace of M'M.).
2.6 Generalizations to Hilbert Spaces

A generalization of the concept of Pseudo-Inverses to infinite dimensional vector spaces, or more specifically to Hilbert space, has recently been achieved by Desoer and Whalen [1963] and also by Foulis [1963] in a more algebraic manner. Whatever is said about Hilbert space naturally holds for a finite-dimensional inner product space. Desoer and Whalen's approach will be discussed in some detail because of its importance in Hilbert space applications and also because the natural specialization of their approach to finite dimensional inner product spaces affords a modern look at the concept of Pseudo-Inverses. Their approach can best be described as a range-null-space approach. Before describing the approach taken by Desoer and Whalen we need to indicate certain basic definitions, theorems, and notations appropriate to the concepts of Hilbert space. The approach we take follows closely that of Simmons [1963] and Halmos [1951].

We recall that a linear space H is a set of points x, y, z, ... such that:

I. There is defined an operation + on H × H to H such that (H, +) is an Abelian group, i.e.,

(2.6')
    (i)   (x + y) + z = x + (y + z) for all x, y and z in H,
    (ii)  there exists 0 ∈ H such that x + 0 = 0 + x = x for any x ∈ H,
    (iii) for each x ∈ H there exists an element -x such that x + (-x) = (-x) + x = 0,
    (iv)  x + y = y + x for all x and y in H;

II. There is defined an operation on H × F (where F is a field of scalars) to H, called scalar multiplication, such that

(2.6'')
    (i)   α(x + y) = αx + αy for x, y ∈ H and α ∈ F,
    (ii)  (α + β)x = αx + βx for x ∈ H and α, β ∈ F,
    (iii) (αβ)x = α(βx) for x ∈ H and α, β ∈ F,
    (iv)  1·x = x for x ∈ H.
The elements of H are called vectors while the zero vector 0 is called the origin.

The linear space H is called a normed linear space if to each vector x ∈ H there corresponds a real number, denoted ||x|| and called the norm of x, such that

(i)   ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0,
(ii)  ||αx|| = |α| ||x||,
(iii) ||x + y|| ≤ ||x|| + ||y||.

It can be shown that a normed linear space H can be made into a metric space by defining the metric as

(2.8)    d(x, y) = ||x - y|| .

The metric defined by (2.8) is called the metric induced by the norm. We recall that a sequence x_n in a metric space H is convergent to x ∈ H, written x_n -> x, if and only if d(x_n, x) -> 0 as n -> ∞. A Cauchy sequence in a metric space is a sequence x_n such that d(x_m, x_n) < ε whenever m, n ≥ n_0(ε). Obviously a convergent sequence is a Cauchy sequence. A metric space H such that each Cauchy sequence in H is convergent to a point in H is called a complete metric space.

A complete normed linear space is defined to be a Banach space. In a Banach space the induced norm provides us with a natural measure of distances, but there is no natural measure of angles as exists in, say, R^2. The concept of a Hilbert space admits such a measure of angles. A linear inner product space is a linear space with a function (called the inner product) such that

(i)   (x, y) = (y, x)*, where * denotes the complex conjugate,
(ii)  (αx + βy, z) = α(x, z) + β(y, z),
(iii) (x, x) ≥ 0, with (x, x) = 0 if and only if x = 0.

If a linear inner product space H is complete and we define ||x|| by the equation

||x||^2 = (x, x)

then H can be shown to be a Banach space. The norm ||x|| is said to be induced by the inner product.

A Hilbert space is thus a complex Banach space whose norm arises from an inner product. The inner product allows a definition of orthogonality which generalizes the usual concept of orthogonality in R^2 and R^3. Vectors x and y in H are said to be orthogonal if (x, y) = 0. The notation x ⊥ y indicates that x is orthogonal to y; x ⊥ A indicates that x ⊥ y for all y ∈ A, while A ⊥ B indicates that x ⊥ y for all x ∈ A and y ∈ B.

It can be shown that the inner product, the induced norm, addition and scalar multiplication are all continuous functions in the topology generated by the metric defined by (2.8).
A subset M of H is said to be a subspace of H if (αx + βy) ∈ M whenever x, y ∈ M and α, β ∈ F. A subspace M of a Hilbert space can be shown to be closed (in the sense of the norm topology) if and only if it is complete. If any vector x ∈ H can be written uniquely as x = x_1 + x_2 + ... where x_j ∈ M_j, a closed subspace of H, for all j, then we write H = M_1 + M_2 + ... and call H the direct sum of the subspaces M_1, M_2, .... It can be shown that if M_1, M_2, ... are pairwise orthogonal and H is spanned by the union of the M_j then H is the direct sum of the M_j.
Reflection on finite dimensional vector spaces reveals that linear transformations are of importance in many applications. Linear transformations in finite dimensional vector spaces are represented by matrices. The generalization of a matrix to Hilbert space results in the concept of a linear transformation A from H to another Hilbert space H'. A linear transformation A is a mapping such that

(2.10)    A(αx + βy) = αAx + βAy

for all x, y ∈ H and α, β ∈ F.

A linear transformation mapping one Hilbert space into another obviously preserves some of the algebraic structure of H, but since H is equipped with a topology it is useful to require that A be continuous in order to preserve the topological structure as well. The linear transformation A: H -> H' is said to be bounded if there exists α > 0 such that

(2.11)    ||Ax|| ≤ α ||x||  for all x ∈ H .

The norm ||A|| of A is defined as

(2.12)    ||A|| = inf { α : ||Ax|| ≤ α ||x|| for all x ∈ H } .

The concept of boundedness is important since it can be proved that a linear transformation is continuous if and only if it is bounded. If A: H -> H and is bounded then we shall call A a (linear) operator. Similarly, if A: H -> R or C and is bounded then we shall call A a (linear) functional.
We shall have occasion to discuss only bounded operators and functionals.
We also note that there are many differences in terminology
used in defining functionals and operators.
The scalar multiple, sum, and product of the operators A and B are defined by

(αA)x = α(Ax) ,
(A + B)x = Ax + Bx ,
(AB)x = A(Bx) ,

and it can be shown that

(2.14)    ||αA|| = |α| ||A|| ,   ||A + B|| ≤ ||A|| + ||B||   and   ||AB|| ≤ ||A|| ||B|| .

It can also be shown that the set B(H) of all linear operators from H into H forms an algebra over the field F, i.e., a linear space over F where the elements (in this case the operators) and scalars can be multiplied in the following natural manner:

A(BC) = (AB)C              for A, B, C ∈ B(H) ,
A(B + C) = AB + AC         for A, B, C ∈ B(H) ,
α(AB) = (αA)B = A(αB)      for A, B ∈ B(H) and α ∈ F .

In addition this algebra has an identity, namely the operator I defined by

(2.16)    Ix = x  for all x ∈ H .

An operator A is said to be invertible if there exists A^{-1} such that

A^{-1}A = I = AA^{-1} .
If A is any operator it can be shown that there exists a unique operator A* with the property that

(Ax, y) = (x, A*y)  for all x, y ∈ H .

The operator A* is called the adjoint of A and plays a role similar to that of the conjugate transpose of a matrix in finite dimensional linear spaces. It can be shown that the adjoint has the following properties:

(2.18)
(A*)* = A ,
(αA)* = α*A*   where α* = complex conjugate of α ,
(A + B)* = A* + B* ,
(AB)* = B*A* ,
(A*)^{-1} = (A^{-1})*   if A is invertible.
There are several specific types of operators of importance in the study of Hilbert spaces. We list these operators and their properties below.

Hermitian Operator: An operator A is hermitian if A = A* (generalizes the concept of a hermitian matrix).

Normal Operator: An operator A is normal if A*A = AA*. A normal operator has the property that ||A*x|| = ||Ax|| for all x ∈ H.

Unitary Operator: An operator U is unitary if U*U = UU* = I (generalizes the concept of a unitary (orthogonal) matrix).

Projection Operator: The projection operator P on a subspace M is the operator P defined, for every x ∈ H of the form x = x_1 + z where x_1 ∈ M and z ∈ M⊥, by Px = x_1. It can be shown that if P is an idempotent (P^2 = P) and hermitian operator then P is the projection on M = { x : Px = x }.

Theorem 2.8 indicates that A A^t and A^t A are projections on suitable subspaces in the finite dimensional case.
One of the main virtues of a Hilbert space is the fact that if M is a closed subspace then there exists a projection operator P such that M = { x : Px = x } and M + M⊥ = H. Further M⊥ = { x : Px = 0 }. In accordance with customary usage M is called the range of P (written R(P)) while M⊥ is called the null space of P (written N(P)). The existence of projections on a given closed subspace plays a fundamental role in the generalization of the concept of a Pseudo-Inverse to Hilbert space.

The final concept of importance is the spectrum of an operator, which is defined as the set of all λ such that A - λI is not invertible. The spectrum of an operator is obviously a generalization of the set of characteristic roots of a matrix.
With the above preliminaries and notation completed we are in a position to discuss Desoer and Whalen's definition of the Pseudo-Inverse of a bounded linear transformation from H to H'. The essential problem is simply to associate with each transformation A in B(H, H') another linear transformation A^t such that Ax = y is true if and only if A^t y = x. If A is invertible then A^{-1} clearly satisfies the requirements. In order to cope with the situation when A^{-1} does not exist, Desoer and Whalen introduce the following definition.
Definition 2.5: Let A be a bounded linear transformation of a Hilbert space H into a Hilbert space H' such that R(A) is closed. A^t is said to be the Pseudo-Inverse of A if

(i)   A^t(Ax) = x for all x ∈ [N(A)]⊥ ,
(ii)  A^t y = 0 for all y ∈ [R(A)]⊥ ,
(iii) if y_1 ∈ R(A) and y_2 ∈ [R(A)]⊥ then A^t(y_1 + y_2) = A^t y_1 + A^t y_2 .

The connection between A and A^t can best be seen by the diagram given in Figure 2.1.

Figure 2.1. The Pseudo-Inverse in geometric terms

We note that since R(A) is closed the mapping A^t is a one-one mapping of R(A) onto [N(A)]⊥. This, coupled with (ii) and (iii), serves to define A^t as a mapping of H' into H. In addition (iii) guarantees that A^t is a linear operator from H' into H with range [N(A)]⊥ and null space [R(A)]⊥.
Using Definition 2.5 Desoer and Whalen prove that A^t has the properties (i) to (iv) of Theorem 2.8, thus indicating that Definition 2.5 provides an extension of the concept of Pseudo-Inverse (cf. Definition 2.4) to Hilbert space. Note that in the case of a finite dimensional vector space the range of any linear transformation is closed and hence in this case Definition 2.5 is equivalent to Definition 2.4.
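In the finite dimensional case the subspaces R(A), [R(A)]⊥ and [N(A)]⊥ of Definition 2.5 can be computed explicitly, for example from a singular value decomposition. A sketch assuming numpy, with a hypothetical rank-deficient A:

```python
import numpy as np

A = np.array([[1., 1.],
              [0., 0.],
              [1., 1.]])
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))           # dim R(A) = dim [N(A)]-perp
# columns of U[:, :r] span R(A); columns of Vt[:r].T span [N(A)]-perp
At = Vt[:r].T @ np.diag(1 / s[:r]) @ U[:, :r].T

y = np.array([3., 1., 3.])
y1 = U[:, :r] @ U[:, :r].T @ y       # component of y in R(A)
assert np.allclose(At @ y, At @ y1)              # (ii): [R(A)]-perp is annihilated
assert np.allclose(At @ A, Vt[:r].T @ Vt[:r])    # A^t A projects onto [N(A)]-perp
```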
Using Definition 2.5 the authors prove that A^t is such that (A^t)^t = A and (A^t)* = (A*)^t. The projection results indicated in Theorem 2.8, namely that A A^t is the projection of H' onto R(A) and that A^t A is the projection of H onto [N(A)]⊥, are also established. Equation (viii) of Theorem 2.9, which plays an important role in the computational methods to be discussed in Chapter 4, is true in the Hilbert space context also and reduces the computation of the Pseudo-Inverse of an operator A to the computation of the Pseudo-Inverse of (A*A).

Several interesting properties of the Pseudo-Inverse are established by Desoer and Whalen. The first states an equivalent definition in terms of an inner product relation and a condition on the null space of A^t. Specifically the equivalent definition is:

Definition 2.5.1: Let x, y be elements of H and write x = x_1 + x_2, y = y_1 + y_2, where x_1, y_1 are in [N(A)]⊥ and x_2, y_2 are in N(A). Then A^t is the Pseudo-Inverse of A if and only if

(i)  (A^t A x, y) = (x_1, y_1) for all x, y in H, and
(ii) N(A^t) = [R(A)]⊥ .
As stated in the introduction, the concept of Generalized Inverse is extremely important in the theory of least squares. Penrose's Theorem (2.11) on best approximate solutions of matrix equations generalizes in the following sense.

Theorem 2.11': If y ∈ H' and x_1 = A^t y then (a) ||Ax_1 - y|| ≤ ||Ax - y|| for all x ∈ H, and (b) ||x_1|| ≤ ||x_0|| for all x_0 satisfying equality in (a). In intuitive terms Theorem 2.11' states that if Ax = y has a solution then A^t y is the solution which has smallest norm.

As is well known (see for example Halmos [1958]) an arbitrary matrix can be factored as A = UQ, where U is an isometry (an orthonormal matrix) and Q is a positive matrix (non-negative definite matrix). Desoer and Whalen proved the following generalization of this decomposition (which is called a polar decomposition because of its relation to the polar representation of a complex number as re^{iθ}).
Theorem 2.12: Let A be a bounded linear mapping from H into H'. If R(A) is closed then A can be factored as

A = UP

where P is a positive self-adjoint operator on H and U maps H onto R(A) with U^t = U*.
2.7 Generalized Inverses in Algebraic Structures

The concept of Generalized Inverses revived by Penrose [1955] in the case of matrices led to immediate generalizations to other algebraic structures. Since the set of all n × n matrices is a ring under multiplication, it is natural to suspect that the concept of Generalized Inverse extends to arbitrary rings. In recent articles Drazin [1958] and Munn [1962] have investigated Generalized Inverses in semi-groups and rings. A semi-group consists of a set S and a mapping of S × S into S such that the mapping is associative, i.e., if x, y and z are elements of S then x(yz) = (xy)z. Drazin and Munn introduce the following definition.

Definition 2.6: An element x in S is said to be pseudo-invertible if there exists an element x̄ in S such that

(i)   x x̄ = x̄ x ,
(ii)  x^n = x^{n+1} x̄ for some positive integer n, and
(iii) x̄ = x̄^2 x .

The uniqueness of x̄ (if existent) has been proved by Drazin.
In terms of the associative ring of n × n matrices under multiplication we see that any normal matrix (any matrix such that X'X = XX') is pseudo-invertible in the sense of Munn and Drazin, with x̄ = X^t, since

X^t X = (X'X)^t X'X = (XX')^t XX' = X'^t X' = (X X^t)' = X X^t ,
X = X X^t X = X^2 X^t ,
X^t = X^t X X^t = (X^t)^2 X .
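The three conditions of Definition 2.6 are easy to verify numerically for a normal matrix, with x̄ taken to be the Pseudo-Inverse; a sketch assuming numpy and a hypothetical symmetric (hence normal) X:

```python
import numpy as np

X = np.array([[2., 2.],
              [2., 2.]])              # symmetric, hence normal
Xbar = np.linalg.pinv(X)              # candidate Drazin pseudo-inverse

assert np.allclose(X @ Xbar, Xbar @ X)        # (i)   x xbar = xbar x
assert np.allclose(X, X @ X @ Xbar)           # (ii)  x^n = x^{n+1} xbar, here n = 1
assert np.allclose(Xbar, Xbar @ Xbar @ X)     # (iii) xbar = xbar^2 x
```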
Using the above definition of a Pseudo-Inverse, Munn and Drazin investigate the implications of the concept of Pseudo-Inverse in the structure of rings and semi-groups. In particular Drazin gives sufficient conditions for x to be pseudo-invertible when (S, ⊗) is a ring, and Munn obtains necessary and sufficient conditions for an element of a semi-group to be pseudo-invertible. This condition is simply that some power of x lies in a subgroup of (S, ⊗).

Foulis [1963] in a recent paper has considered Pseudo-Inverses in a special type of semi-group. His work is highly algebraic in structure and includes as a special case certain of the results obtained by Desoer and Whalen [1963] in Hilbert space contexts.
CHAPTER 3.  THEORETICAL RESULTS

3.1 Summary

In Chapter 2 several definitions of Generalized Inverses were presented. It is the purpose of the present chapter to investigate some of the implications of these definitions. Let g(A), r(A), n(A) and t(A) denote the set of all Generalized, Reflexive Generalized, Normalized Generalized and Pseudo-Inverses, respectively, of a matrix A. We are interested in investigating relationships between a property of A and the corresponding property of a typical element in g(A), r(A), n(A) or t(A). The property of rank is discussed in Section 3.2, symmetry in Section 3.3, and eigenvalues and eigenvectors in Section 3.4. In statistical applications of Generalized Inverses orthogonal projections, represented by idempotent symmetric matrices, play an important role, and these aspects of Generalized Inverses are discussed in Section 3.5. In Section 3.6 several results are proved which indicate that under certain conditions g(A), r(A), n(A) and t(A) can be characterized by the relation of a property of A to the corresponding property of a typical element in i(A), i = g, r, n or t.

3.2 Results on Rank

One of the most important characteristics of a matrix is its rank. In the case of a Generalized Inverse A^g of a matrix A, the rank of A^g satisfies the following inequality:

Rank (A^g) ≥ Rank (A) .

The proof of this result hinges on the following Lemma.

Lemma 3.1: For any Generalized Inverse A^g of A, Rank (A A^g) = Rank (A^g A) = Rank (A).
Proof: Since the rank of a product of two matrices does not exceed the rank of either factor, the conclusions follow from

Rank (A) ≥ Rank (A A^g) ≥ Rank (A A^g A) = Rank (A) ,
and
Rank (A) ≥ Rank (A^g A) ≥ Rank (A A^g A) = Rank (A) .

Using Lemma 3.1 we have

Theorem 3.1: Rank (A^g) ≥ Rank (A^g A) = Rank (A).
If A is a square matrix the following theorem indicates that singularity of A and singularity of its Generalized Inverse, A^g, are not equivalent concepts.

Theorem 3.2: If A is singular then a non-singular Generalized Inverse exists.

Proof: From Theorem 2.7 a Generalized Inverse of A is given by P_2 B^g P_1 where

P_1 A P_2 = B = [ I(r)  0 ]        B^g = [ I(r)  U ]
                [ 0     0 ]  and         [ V     W ] .

Choosing U = 0, V = 0 and W = I makes B^g non-singular, and since P_1 and P_2 are non-singular it follows that A^g = P_2 B^g P_1 is a non-singular Generalized Inverse of A.
One of the most useful results concerning inverses of square matrices is that CAB, where C, A, and B are non-singular, is invertible with inverse given by B^{-1}A^{-1}C^{-1}. It is interesting and very useful (see Chapter 4) to note that a similar result holds for Generalized Inverses.

Theorem 3.3: If C and B are non-singular square matrices and A is any matrix such that CAB exists, then a Generalized Inverse of CAB is given by B^{-1}A^gC^{-1}.

Proof: (CAB)(B^{-1}A^gC^{-1})(CAB) = C A A^g A B = CAB.

One would hope that an analogous result would hold if B and C were replaced by singular matrices and B^{-1}A^gC^{-1} replaced by B^gA^gC^g, but consideration of the equation

(CAB)(B^gA^gC^g)(CAB) = CAB

indicates that such a result cannot hold in general. A special case of some interest occurs when C and B are commutative idempotent matrices which also commute with A. Then, since a Generalized Inverse of an idempotent matrix is itself, we have

(CAB)(BA^gC)(CAB) = CABA^gCAB = CBAA^gACB = CBACB = CBCBA = CBA = CAB .

Such a result could be of interest in the study of idempotent matrices and their associated projections.
When we consider Reflexive Generalized Inverses the inequality of Theorem 3.1 becomes an equality.

Theorem 3.4: If A^r is a Reflexive Generalized Inverse of A then Rank (A^r) = Rank (A).

Proof: Theorem 3.1 applied to A, together with a second application of Theorem 3.1 to A^r (for which A is in turn a Generalized Inverse), yields

Rank (A^r) ≥ Rank (A)  and  Rank (A) ≥ Rank (A^r) ,

and hence Rank (A^r) = Rank (A).

The analogue of Theorem 3.3 holds for Reflexive Generalized Inverses also.
Theorem 3.5: If C and B are non-singular square matrices and A is any matrix such that CAB exists, then a Reflexive Generalized Inverse of CAB is given by B^{-1}A^rC^{-1}.

Proof: In view of Theorem 3.3 the equation

(B^{-1}A^rC^{-1})(CAB)(B^{-1}A^rC^{-1}) = B^{-1}A^rAA^rC^{-1} = B^{-1}A^rC^{-1}

completes the proof.

In view of Theorem 3.4 both Normalized Generalized Inverses and Pseudo-Inverses preserve the rank of the original matrix. Theorems 3.3 and 3.5, however, become somewhat weaker when we restrict the class of Generalized Inverses. Simple computations show that a Normalized Generalized Inverse of BAC, where C is any non-singular square matrix and B is any orthonormal matrix, is C^{-1}A^nB'. As might be expected, the Pseudo-Inverse of BAC when B and C are orthonormal is C'A^tB' (this result was in fact given by Penrose [1955]).
3.3 Symmetry Results

In many applications of matrices square symmetric matrices become extremely important. In view of the importance of symmetry and the simple fact that the inverse of a non-singular symmetric matrix is again symmetric, a natural question which arises is the extent to which Generalized Inverses preserve or destroy symmetry of the original matrix. The first result states that the two concepts are not equivalent for Generalized Inverses and Reflexive Generalized Inverses, while the second is an existence theorem guaranteeing that symmetric Generalized and Reflexive Generalized Inverses of symmetric matrices exist.
Theorem 3.6: If A is symmetric, then it is not necessarily true that A^g or A^r is symmetric.

Proof: By Theorem 2.7 a Generalized Inverse of A is given by

A^g = P_2 [ I(r)  U ] P_1
          [ V     W ]

and it is clear from this form that A^g need not be symmetric. Similarly a Reflexive Generalized Inverse of A is given by the same form with W = VU (see Theorem 3.14), and again it is clear that A^r need not be symmetric.

Theorem 3.7: If A is symmetric then there exist a symmetric A^g and a symmetric A^r.

Proof: In Theorem 2.7 we can choose P_1 and P_2 so that P_2 = P_1'. If we choose B^g and B^r to be symmetric the conclusion follows from Theorem 3.5.
The situation for Normalized Generalized Inverses of symmetric matrices is not quite so simple. Consideration of a matrix such as

A^n = [ 1  0 ]
      [ 1  0 ] ,

which is a Normalized Generalized Inverse of the symmetric matrix

A = [ 1  0 ]
    [ 0  0 ] ,

shows that a Normalized Generalized Inverse of a symmetric matrix need not be symmetric. If the Normalized Generalized Inverse is symmetric then it is the Pseudo-Inverse. Thus the existence result for Normalized Generalized Inverses of symmetric matrices has little importance since, as we shall show below, the Pseudo-Inverse of a symmetric matrix is symmetric.
Theorem 3.8: If A is symmetric then the Pseudo-Inverse of A is symmetric.

Proof: From (ii) of Theorem 2.9 we have (A')^t = (A^t)'. Hence A^t = (A')^t = (A^t)'.

Theorem 3.8 can, using (v) of Theorem 2.9, be strengthened to the statement that A normal implies A^t normal.
3.4 Eigenvalues and Eigenvectors

The importance of the spectrum of a matrix, and more generally the spectrum of a linear transformation on a linear space, is well known. Classical matrix theory discusses the spectrum of a matrix via study of the eigenvectors or characteristic vectors and their associated eigenvalues or characteristic roots. It is known that if x is an eigenvector of an invertible matrix A then x will also be an eigenvector of A^{-1}. The corresponding eigenvalues will be reciprocals. Since Generalized Inverses behave so much like inverses, a natural subject for investigation is the relation of the eigenvectors (values) of a square matrix A and those of an associated Generalized Inverse A^g.

For any square matrix M let Λ_1(M) denote the set of non-zero eigenvalues of M, i.e.,

Λ_1(M) = { λ : λ ≠ 0 and Mx = λx for some x ≠ 0 } ,

and let Λ_1^{-1}(M) = { λ^{-1} : λ ∈ Λ_1(M) }. For any square matrices M and N let the sets Λ_2(M) and Λ_2^{-1}(N) be defined by

Λ_2(M) = { x : x ≠ 0 and Mx = λx for some λ ∈ Λ_1(M) } ,
Λ_2^{-1}(N) = { x : x ≠ 0 and Nx = λ^{-1}x for some λ ∈ Λ_1(M) } .

If A is any square matrix and A^g is any Generalized Inverse of A, then we define properties R_1 and R_2 as follows:

Property R_1:  Λ_1(A^g) = Λ_1^{-1}(A) ,
Property R_2:  Λ_2(A) = Λ_2^{-1}(A^g) .

If property R_1 is satisfied we say that A^g has property R_1, and similarly for property R_2.

Theorem 3.9: Generalized Inverses and Reflexive Generalized Inverses do not necessarily possess properties R_1 and R_2.

Proof: Consider a 3 × 3 matrix A with characteristic roots 0, 1, 3. One can exhibit a Generalized Inverse A^g of A whose eigenvalues are 1, 1, 1, and a Reflexive Generalized Inverse A^r whose non-zero eigenvalues are (3 ± √5)/2; in neither case do we obtain the reciprocals 1 and 1/3 of the non-zero eigenvalues of A. Note that the multiplicity of the zero eigenvalues is not preserved with A^g. Consideration of a symmetric matrix and a suitable Reflexive Generalized Inverse shows, moreover, that a Reflexive Generalized Inverse of a symmetric matrix can possess property R_1 but not necessarily property R_2, since an eigenvector of A^r need not be an eigenvector of A.
Theorem 3.10: If A is symmetric then the non-zero eigenvalues of A and of any Normalized Generalized Inverse of A are reciprocals, i.e., A^n possesses property R_1 if A is symmetric.

Proof: If λ is a non-zero eigenvalue of A there exists x ≠ 0 such that

Ax = λx .

Multiplying both sides of the above equation by A A^n yields A A^n A x = λ A A^n x, or x = A A^n x. Since (A A^n)' = A A^n and A is symmetric, we have

x = (A A^n)'x = A^n' A x = λ A^n' x ,   or   A^n' x = λ^{-1} x .

Hence λ^{-1} is a non-zero eigenvalue of A^n' and therefore of A^n.

If μ is a non-zero eigenvalue of A^n then there exists a non-zero vector y such that A^n y = μ y. Since A^n' A = (A A^n)' = A A^n, we have A^n'(Ay) = A A^n y = μ(Ay). Thus Ay is an eigenvector of A^n' with eigenvalue μ. Multiplying both sides by A and using A A^n' A = A (the transpose of A A^n A = A) gives

Ay = A A^n'(Ay) = μ A(Ay) ,

so A(Ay) = μ^{-1}(Ay), i.e., μ^{-1} is a non-zero eigenvalue of A. Note that Ay ≠ 0, since A^n A A^n = A^n implies A^n A y = y, so that Ay = 0 would give y = 0, which is a contradiction.
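Theorem 3.10 can be observed numerically by taking the Pseudo-Inverse, which is one Normalized Generalized Inverse, of a hypothetical singular symmetric matrix (numpy assumed):

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 2., 0.],
              [0., 0., 0.]])        # symmetric, eigenvalues 3, 1, 0
An = np.linalg.pinv(A)              # one Normalized Generalized Inverse of A

nonzero = lambda M: sorted(x for x in np.linalg.eigvalsh(M) if abs(x) > 1e-12)
assert np.allclose(nonzero(A), [1., 3.])
assert np.allclose(nonzero(An), [1/3., 1.])   # reciprocals of 3 and 1
```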
For non-symmetric matrices the conclusion of Theorem 3.10 is not valid: one can exhibit a non-symmetric matrix A with eigenvalues 0 and 3 whose Normalized Generalized Inverse A^n has eigenvalues 0 and 1/5.
The following Lemma gives a sufficient condition for a Reflexive Generalized Inverse to possess property R_2.

Lemma 3.2: A Reflexive Generalized Inverse A^r of A possesses property R_2 if A and A^r commute (written A <-> A^r).

Proof: If A <-> A^r then for λ ≠ 0 we have

Ax = λx  =>  Ax = λ A A^r x  =>  λx = λ^2 A^r x  =>  λ^{-1} x = A^r x ,

and similarly in the reverse direction.
The following simple corollary to Lemma 3.2 indicates that Pseudo-Inverses of normal matrices are indeed well behaved.

Theorem 3.11: If A is normal then the Pseudo-Inverse A^t possesses property R_2.

Proof: By (x) of Theorem 2.9, A normal implies that A A^t = A^t A. From Lemma 3.2 it follows that A^t possesses property R_2.
It is curious to note that if A is not normal then its Pseudo-Inverse need not even possess property R_1. An example of this is furnished by the matrix

A = [ 1  2 ]
    [ 1  2 ]

and its Pseudo-Inverse

A^t = (1/10) [ 1  1 ]
             [ 2  2 ] .

It is easily shown that A has eigenvalues 0 and 3 while A^t has eigenvalues 0 and 3/10. Necessary conditions for the Pseudo-Inverse of an arbitrary matrix to possess property R_1 or R_2 are not known at present.
3.5 Orthogonal Projections

The topic of orthogonal or perpendicular projections finds wide application in the theory of matrices and linear spaces (such as the spectral representation of a linear operator) as well as in the application of matrix theory to other fields (the sum of squares due to a hypothesis can be defined using the notion of an orthogonal projection on a certain subspace). We recall (see Section 2.6) that an orthogonal projection is simply a symmetric idempotent matrix. In this section we shall see that Generalized Inverses can be classified partly by the projection properties of A A^g and A^g A. Simple matrix multiplication shows that both A A^g and A^g A are idempotent. Neither Generalized Inverses nor Reflexive Generalized Inverses necessarily yield orthogonal projections, because A A^g and A^g A are not necessarily symmetric.

Theorem 3.12: A Reflexive Generalized Inverse is a Normalized Generalized Inverse if and only if A A^r is an orthogonal projection.

Proof: By the definitions.

Theorem 3.13: A Reflexive Generalized Inverse is the Pseudo-Inverse if and only if A A^r and A^r A are orthogonal projections.

Proof: Again obvious from the definitions.

Note that Theorems 3.12 and 3.13 merely put conditions (iii) of Definition 2.3 and (iv) of Definition 2.4 into geometric terms.

For applications to the theory of least squares it is of interest to note that the trace of A^g A and A A^g is equal to the rank of A (see Rao and Chipman [1964] for a proof of the result that the trace of an idempotent matrix is equal to its rank).
3.6 Characterization of Generalized Inverses
Thus far in this Chapter we have investigated the relation between various properties of a matrix A and the corresponding properties of a typical element of g(A), r(A), n(A) or t(A). In this section we carry this investigation further and find criteria to characterize the sets g(A), r(A), n(A) and t(A).

The first result indicates that rank plays an important role in distinguishing between Generalized Inverses and Reflexive Generalized Inverses.
Theorem 3.14: A Generalized Inverse is a Reflexive Generalized Inverse if and only if Rank A^g = Rank A.

Proof: If A^r is a Reflexive Generalized Inverse then by Theorem 3.4, Rank A^r = Rank A. To establish the converse, let A^g be an arbitrary Generalized Inverse of A which is of the same rank as A. By Theorem 2.7 there exist P_1 and P_2 such that A^g = P_2 B^g P_1, where

P_1 A P_2 = B = [ I(r)  0 ]        B^g = [ I(r)  V ]
                [ 0     0 ]  and         [ W     X ] ,

the matrices V, W and X being arbitrary. We see that Rank A^g = Rank B^g and Rank A = Rank B = r. Subtracting W times the first block row of B^g from the second gives

Rank B^g = Rank [ I(r)  V      ]  =  r + Rank (X - WV)  ≥  Rank B .
                [ 0     X - WV ]

If Rank A^g = Rank A then Rank B^g = Rank B, which implies X - WV = 0, or X = WV. Thus

B^g = [ I(r)  V  ]
      [ W     WV ]

if A^g is a Generalized Inverse with Rank A^g = Rank A. We now find that

B^g B B^g = [ I(r)  V  ] [ I(r)  0 ] [ I(r)  V  ]  =  [ I(r)  0 ] [ I(r)  V  ]  =  [ I(r)  V  ]  =  B^g .
            [ W     WV ] [ 0     0 ] [ W     WV ]     [ W     0 ] [ W     WV ]     [ W     WV ]

Hence A^g A A^g = P_2 B^g B B^g P_1 = P_2 B^g P_1 = A^g, and A^g is a Reflexive Generalized Inverse.
As has been the case throughout this Chapter, the properties of the Normalized Generalized Inverse are somewhat nebulous. The following theorem, a slightly stronger version of Lemma 2.4 due to Zelen and Goldman, characterizes Normalized Generalized Inverses in terms of Generalized Inverses.

Theorem 3.15: A Generalized Inverse is a Normalized Generalized Inverse if and only if it can be written in the form

A^n = (A'A)^g A'

for some Generalized Inverse of A'A.

Proof: If A^n = (A'A)^g A', where (A'A)^g is a Generalized Inverse of A'A, then the equations

A A^n A = A(A'A)^g(A'A) = A ,
A^n A A^n = (A'A)^g [(A'A)(A'A)^g A'] = (A'A)^g A' = A^n ,
(A A^n)' = [A(A'A)^g A']' = A(A'A)^g A' = A A^n

(the bracketed identities coming from Corollary 2.1) show that A^n is a Normalized Generalized Inverse of A. The converse follows immediately from Lemma 2.4.
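Theorem 3.15 doubles as a recipe for constructing Normalized Generalized Inverses. A sketch assuming numpy, with a hypothetical rank-one A and a hand-picked Generalized Inverse of A'A:

```python
import numpy as np

A = np.ones((3, 2))                  # rank 1
AtA = A.T @ A                        # = [[3, 3], [3, 3]], singular
AtAg = np.array([[1/3., 0.],
                 [0.,   0.]])        # a generalized inverse of A'A
assert np.allclose(AtA @ AtAg @ AtA, AtA)

An = AtAg @ A.T                      # A^n = (A'A)^g A'
assert np.allclose(A @ An @ A, A)            # generalized inverse of A
assert np.allclose(An @ A @ An, An)          # reflexive
assert np.allclose((A @ An).T, A @ An)       # A A^n symmetric
```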
The property of commutativity of A and A^r serves to characterize the Pseudo-Inverse of a symmetric matrix, as the following theorem indicates.

Theorem 3.16: If A is symmetric then A <-> A^t. Conversely, if A and A^r are symmetric and A <-> A^r then A^r = A^t.

Proof: If A is symmetric, (x) of Theorem 2.9 implies that A^t A = A A^t. If A and A^r are symmetric and commute then

(A A^r)' = A^r' A' = A^r A = A A^r   and   (A^r A)' = A' A^r' = A A^r = A^r A ,

so that A A^r and A^r A are orthogonal projections; by Theorem 3.13, A^r = A^t.
The Pseudo-Inverse of a symmetric matrix can also be characterized in terms of A^n and Property R_2.

Theorem 3.17: If the Normalized Generalized Inverse A^n of a symmetric matrix possesses Property R_2 then A^n = A^t.

Proof: Let λ_1, λ_2, ..., λ_r denote the non-zero eigenvalues of A and E_1, E_2, ..., E_r the corresponding eigenspaces. Also let E_0 denote the eigenspace spanned by the vectors in the null space of A. Property R_2 implies that

A^n x_i = λ_i^{-1} x_i ,   i = 1, 2, ..., r ,

where x_i is an arbitrary element in E_i. We may write any vector x as

x = x_1 + x_2 + ... + x_r + x_0

where x_i ∈ E_i, i = 1, 2, ..., r, and x_0 ∈ E_0. Hence

A A^n x = A A^n(x_1 + ... + x_r) = A(λ_1^{-1} x_1 + ... + λ_r^{-1} x_r) = x_1 + ... + x_r ,

since A A^n x_0 = (A A^n)' x_0 = A^n'(A x_0) = 0, and

A^n A x = A^n A(x_1 + ... + x_r + x_0) = A^n(λ_1 x_1 + ... + λ_r x_r) = x_1 + ... + x_r .

It follows that A A^n = A^n A. Hence A^n A, like A A^n, is an orthogonal projection, and by Theorem 3.13, A^n = A^t.
CHAPTER 4.  COMPUTATIONAL METHODS

4.1 Summary
In this chapter certain computational methods for finding Generalized Inverses are discussed. The subject is as vast as the general theory of solving systems of linear equations and hence no claim to completeness can be made.

In Section 4.2 an expression is obtained for a Generalized Inverse of a suitably partitioned matrix. This result is subsequently used in obtaining Generalized Inverses of Bordered Matrices in Section 4.5. Section 4.3 provides a review of some computational methods. Some of the theory underlying the Abbreviated Doolittle technique is discussed using Generalized Inverses in Section 4.4.

4.2 An Expression for a Generalized Inverse of a Partitioned Matrix
In this section we shall develop a formula for obtaining a Generalized Inverse of a suitably partitioned symmetric matrix. We note that computation of Normalized Generalized Inverses and Pseudo-Inverses of a matrix X can be performed by finding either a Generalized Inverse or the Pseudo-Inverse of A = X'X (see Theorem 3.15 and (viii) of Theorem 2.9). Matrices of the form X'X arise frequently in statistical applications and occur commonly in partitioned form as

(4.1)    X'X = [ X_1'X_1  X_1'X_2 ]
               [ X_2'X_1  X_2'X_2 ] .

For use in algebraic expressions occurring in statistical applications and as a computational method to find Normalized Generalized Inverses and Pseudo-Inverses we shall develop an expression for a Generalized Inverse of a matrix partitioned in the form (4.1).
Theorem 4.1: If a matrix X'X is partitioned as in (4.1) then a Generalized Inverse of X'X is given by

(4.2)    (X'X)^g = [ (X_1'X_1)^g + (X_1'X_1)^g(X_1'X_2)Q^g(X_2'X_1)(X_1'X_1)^g    -(X_1'X_1)^g(X_1'X_2)Q^g ]
                   [ -Q^g(X_2'X_1)(X_1'X_1)^g                                      Q^g                     ]

where Q = X_2'X_2 - (X_2'X_1)(X_1'X_1)^g(X_1'X_2), and Q^g and (X_1'X_1)^g are any Generalized Inverses of Q and X_1'X_1 respectively.

Proof: Formula (4.2) can be directly verified by forming (X'X)(X'X)^g(X'X) and simplifying. It is of some interest, however, to observe that (4.2) can be obtained in a straightforward manner using the results of previous chapters. Using the relations X_1 = X_1(X_1'X_1)^g(X_1'X_1) and X_1' = (X_1'X_1)(X_1'X_1)^g X_1', we see that P_1(X'X)P_2 = Z, where

P_1 = [ I                       0 ]      P_2 = [ I   -(X_1'X_1)^g(X_1'X_2) ]
      [ -(X_2'X_1)(X_1'X_1)^g   I ] ,          [ 0    I                    ] ,

and

Z = [ X_1'X_1  0 ]
    [ 0        Q ] .

Since X'X = P_1^{-1} Z P_2^{-1}, it follows from Theorem 3.3 that a Generalized Inverse of X'X is (X'X)^g = P_2 Z^g P_1, where Z^g is any Generalized Inverse of Z. The Generalized Inverse of Z which yields (4.2) is

Z^g = [ (X_1'X_1)^g  0   ]
      [ 0            Q^g ] .

Note that other choices of Z^g would be permissible, and it appears possible that certain of these choices would lead to simpler expressions for (X'X)^g, although the choice made above is a natural one.
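Formula (4.2) can be verified mechanically. The sketch below, assuming numpy and using hypothetical random blocks X_1 and X_2, assembles (X'X)^g block-wise and checks the defining relation:

```python
import numpy as np

rng = np.random.default_rng(2)
X1, X2 = rng.normal(size=(8, 2)), rng.normal(size=(8, 3))
X = np.hstack([X1, X2])

A11 = X1.T @ X1
A11g = np.linalg.pinv(A11)                  # any generalized inverse will do
A12, A21, A22 = X1.T @ X2, X2.T @ X1, X2.T @ X2
Q = A22 - A21 @ A11g @ A12
Qg = np.linalg.pinv(Q)

G = np.block([[A11g + A11g @ A12 @ Qg @ A21 @ A11g, -A11g @ A12 @ Qg],
              [-Qg @ A21 @ A11g,                     Qg]])
XtX = X.T @ X
assert np.allclose(XtX @ G @ XtX, XtX)      # G is a generalized inverse of X'X
```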
The following paragraph serves as an introduction to the proofs of Theorems 4.2 and 4.3. Using the relations X_1' = (X_1'X_1)(X_1'X_1)^g X_1' and X_1 = X_1(X_1'X_1)^g(X_1'X_1), the definition of Q, and simple matrix multiplications, we find that

(4.3)    (X'X)^g(X'X)(X'X)^g =

  [ (X_1'X_1)^g(X_1'X_1)(X_1'X_1)^g + (X_1'X_1)^g X_1'X_2 Q^g Q Q^g X_2'X_1 (X_1'X_1)^g    -(X_1'X_1)^g X_1'X_2 Q^g Q Q^g ]
  [ -Q^g Q Q^g X_2'X_1 (X_1'X_1)^g                                                           Q^g Q Q^g                    ] .

Using the results of (4.3) the following extended form of Theorem 4.1 can be stated.
Theorem 4.2: If X'X is partitioned as in (4.1) then a Reflexive Generalized Inverse (X'X)^r of X'X is given by (4.2) with (X_1'X_1)^r and Q^r in place of (X_1'X_1)^g and Q^g, where

Q = X_2'X_2 - X_2'X_1(X_1'X_1)^r X_1'X_2

and Q^r and (X_1'X_1)^r are any Reflexive Generalized Inverses of Q and (X_1'X_1) respectively.

Proof: From Theorem 4.1 it follows that (X'X)(X'X)^r(X'X) = X'X. From equation (4.3), with (X_1'X_1)^r(X_1'X_1)(X_1'X_1)^r = (X_1'X_1)^r and Q^r Q Q^r = Q^r, it is clear that (X'X)^r(X'X)(X'X)^r = (X'X)^r, which completes the proof.
The results of Theorem 4.1 and Theorem 4.2 can be extended, under
certain circumstances, to include Normalized Generalized Inverses and
Pseudo-Inverses.
Before indicating these extensions we establish the
following Lemma:
Lemma 4.1: If X'X is partitioned as

X'X = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}

where X_2'X_2 is non-singular (of order q x q), Rank X_1'X_1 = r and Rank X'X = r + q, then Q is non-singular.

Proof: Since Q is q x q it suffices to prove that Rank Q = q. Since the rank of a matrix is unaltered by pre- or post-multiplication by non-singular matrices it follows that the rank of X'X is the same as that of Z, defined in the proof of Theorem 4.1; thus Rank Z = Rank(X'X) = r + q. On the other hand, Rank Z = Rank(X_1'X_1) + Rank Q = r + Rank Q. Hence Rank Q = q.
Using Lemma 4.1 we can establish the following result.

Theorem 4.3: If X'X is partitioned as in (4.1) and the assumptions of Lemma 4.1 are fulfilled, then

(a)    (X'X)^n = \begin{bmatrix} (X_1'X_1)^n + (X_1'X_1)^n(X_1'X_2)Q^{-1}(X_2'X_1)(X_1'X_1)^n & -(X_1'X_1)^n(X_1'X_2)Q^{-1} \\ -Q^{-1}(X_2'X_1)(X_1'X_1)^n & Q^{-1} \end{bmatrix},

where (X_1'X_1)^n is any Normalized Generalized Inverse of X_1'X_1, is a Normalized Generalized Inverse of X'X; and

(b)    (X'X)^t = \begin{bmatrix} (X_1'X_1)^t + (X_1'X_1)^t(X_1'X_2)Q^{-1}(X_2'X_1)(X_1'X_1)^t & -(X_1'X_1)^t(X_1'X_2)Q^{-1} \\ -Q^{-1}(X_2'X_1)(X_1'X_1)^t & Q^{-1} \end{bmatrix},

where (X_1'X_1)^t is the Pseudo-Inverse of X_1'X_1, is the Pseudo-Inverse of X'X.

Proof: (a) (X'X)(X'X)^n(X'X) = X'X and (X'X)^n(X'X)(X'X)^n = (X'X)^n by Theorem 4.2. From (4.3) it follows that (X'X)(X'X)^n = [(X'X)(X'X)^n]'.

(b) By part (a), (X'X)(X'X)^t(X'X) = X'X, (X'X)^t(X'X)(X'X)^t = (X'X)^t and (X'X)(X'X)^t = [(X'X)(X'X)^t]'. From (4.3) it follows that (X'X)^t(X'X) = [(X'X)^t(X'X)]'.
It is to be noted that if X'X is non-singular and X_1'X_1 is non-singular then (4.2) reduces to a commonly given expression for the inverse of a partitioned matrix. The expression given by (4.2) can be used to find Normalized Generalized Inverses of arbitrary partitioned matrices. If a matrix A is partitioned as

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}

we simply apply (4.2) to A'A and form (A'A)^gA', which, according to Theorem 3.15, is a Normalized Generalized Inverse of A.
4.3 Some Computational Procedures for Finding Generalized Inverses

4.3.1 Generalized Inverses

Finding Generalized Inverses is, according to the definition, simply a matter of solving the linear equations AYA = A and expressing a solution Y = A^g; the matrix A^g will then be a Generalized Inverse of A. There is a wealth of literature dealing with the solution of systems of linear equations (see for example Bodewig [1959], Dwyer [1951], and Fadeeva [1959]) and to present a review of this material would be out of place here.
In Section 4.4 we shall present a method which has gained wide acceptance in statistical circles. For the present, the method called sweep-out or pivotal condensation seems to be the simplest, although not necessarily the most compact, way of finding a Generalized Inverse of an arbitrary matrix.
Consider the n x p matrix X_{np} and the augmented matrix [X_{np} | I_{(n)}]. One performs elementary row and column operations on this matrix until it has been reduced to the form

[X* | P_1],    X* = \begin{bmatrix} I_{(r)} & 0_{r,p-r} \\ 0_{n-r,r} & 0_{n-r,p-r} \end{bmatrix},

where r = Rank(X_{np}) and P_1 records the row operations. It follows that P_1XP_2 = X*, where P_2 records the column operations, and using Theorem 3.3 a Generalized Inverse of X is given by

(4.4)    X^g = P_2X*^gP_1,

where a possible X*^g is given by

X*^g = \begin{bmatrix} I_{(r)} & B_{r,n-r} \\ 0_{p-r,r} & 0_{p-r,n-r} \end{bmatrix}

with B_{r,n-r} arbitrary (or, more simply, zero).
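The sweep-out procedure is easy to mechanize. The sketch below is a hypothetical NumPy rendering — full pivoting is added for numerical stability, a detail the text does not prescribe — which records the row operations in P_1 and the column operations in P_2 and returns X^g = P_2X*^gP_1 with the simplest choice of X*^g.

```python
import numpy as np

def sweep_out_ginverse(X, tol=1e-10):
    """Generalized inverse by sweep-out: reduce P1 X P2 = [[I, 0], [0, 0]],
    then return X^g = P2 X*^g P1 (Theorem 3.3)."""
    n, p = X.shape
    A = X.astype(float).copy()
    P1, P2 = np.eye(n), np.eye(p)        # record row and column operations
    r = 0
    for _ in range(min(n, p)):
        sub = np.abs(A[r:, r:])          # pivot search in the untouched block
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        if sub[i, j] < tol:
            break
        i += r; j += r
        A[[r, i]] = A[[i, r]]; P1[[r, i]] = P1[[i, r]]              # row swap
        A[:, [r, j]] = A[:, [j, r]]; P2[:, [r, j]] = P2[:, [j, r]]  # column swap
        piv = A[r, r]
        A[r] /= piv; P1[r] /= piv                                   # scale pivot row
        for k in range(n):                                          # clear pivot column
            if k != r and abs(A[k, r]) > 0:
                P1[k] -= A[k, r] * P1[r]
                A[k] -= A[k, r] * A[r]
        for k in range(r + 1, p):                                   # clear pivot row
            if abs(A[r, k]) > 0:
                P2[:, k] -= A[r, k] * P2[:, r]
                A[:, k] -= A[r, k] * A[:, r]
        r += 1
    Xstar_g = np.zeros((p, n)); Xstar_g[:r, :r] = np.eye(r)
    return P2 @ Xstar_g @ P1

X = np.array([[1., 2., 3.], [2., 4., 6.], [1., 0., 1.]])  # rank 2
G = sweep_out_ginverse(X)
assert np.allclose(X @ G @ X, X)
```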
4.3.2 Reflexive Generalized Inverses

The computation of Reflexive Generalized Inverses can be achieved in a manner analogous to that used to compute Generalized Inverses. If X_{np} is an arbitrary matrix one forms the augmented matrix

M = \begin{bmatrix} X_{np} & I_{(n)} \\ I_{(p)} & 0 \end{bmatrix}.

One then performs elementary row and column operations on M until M becomes

\begin{bmatrix} X*_{np} & P_1 \\ P_2 & 0 \end{bmatrix},    where  X*_{np} = \begin{bmatrix} I_{(r)} & 0_{r,p-r} \\ 0_{n-r,r} & 0_{n-r,p-r} \end{bmatrix}  with r = Rank(X_{np}).

It then follows that P_1XP_2 = X*, and using Theorem 3.5 we see that P_2X*'P_1 is a Reflexive Generalized Inverse of X.
If A is a symmetric matrix written as A = X'X it is possible to give a particularly simple method for computing a Reflexive Generalized Inverse. Pre- and post-multiplication of X'X by elementary matrices results in

E'X'XE = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}

where [X_1'X_1 | X_1'X_2] constitutes a basis for the row space of E'X'XE. (Hence X_1'X_1 is non-singular.) As is well known (see Roy [1957]) it then follows that X_2'X_2 = X_2'X_1(X_1'X_1)^{-1}X_1'X_2. Routine matrix multiplication shows that a Reflexive Generalized Inverse of E'X'XE is given by

Q = \begin{bmatrix} (X_1'X_1)^{-1} & 0 \\ 0 & 0 \end{bmatrix}.

Hence, by Theorem 3.5, a Reflexive Generalized Inverse of X'X is given by

(4.6)    (X'X)^r = EQE',

say. A direct verification of this result is simple since

E'(X'X)(X'X)^r(X'X)E = E'(X'X)[EQE'](X'X)E = [E'(X'X)E]Q[E'(X'X)E] = E'X'XE

and

(X'X)^r(X'X)(X'X)^r = EQ(E'X'XE)QE' = EQE' = (X'X)^r.
Note that (X'X)^r as given by (4.6) is not necessarily a Normalized Generalized Inverse of X'X, but (X'X)^rX' is a Normalized Generalized Inverse of X. Another method of computing a Reflexive Generalized Inverse, which does not require that X'X be partitioned as above, will be given in Section 4.5.
4.3.3 Normalized Generalized Inverses

By Theorem 3.15 any Generalized Inverse of the form (X'X)^gX' is a Normalized Generalized Inverse of X. Hence, the computation of a Normalized Generalized Inverse can be reduced to the computation of a Generalized Inverse.

Alternatively, the method due to Zelen and Goldman [1963] can be used. If one writes A = X'X (A is p x p of rank r) and chooses R (R is p x (p-r) of rank (p-r)) so that B'R is non-singular, where B'A = 0, then a Normalized Generalized Inverse of X is given by CX'. As discussed in Section 2.4, if one finds M^{-1} and partitions it in the same manner as M, then C occupies the same position in M^{-1} as A does in M, where

M = \begin{bmatrix} A & R \\ B' & 0 \end{bmatrix}.
4.3.4 Pseudo-Inverses

Computational methods for finding Pseudo-Inverses have been explored from various viewpoints. Some of the methods previously published seem to have little practical value, however.

Equation (viii) of Theorem 2.9 allows us to concentrate on methods for obtaining the Pseudo-Inverse of a matrix of the form A = X'X. Penrose [1956] showed that the Pseudo-Inverse of A = X'X can be expressed in terms of ordinary inverses via the formula

(4.8)    A^t = \begin{bmatrix} X_1'X_1 \\ X_2'X_1 \end{bmatrix} P \begin{bmatrix} X_1'X_1 & X_1'X_2 \end{bmatrix}

where

P = [(X_1'X_1)^2 + (X_1'X_2)(X_2'X_1)]^{-1}(X_1'X_1)[(X_1'X_1)^2 + (X_1'X_2)(X_2'X_1)]^{-1}.

Since we can rearrange the rows and columns of X'X by pre- and post-multiplication by suitable orthonormal matrices, (vi) of Theorem 2.9 shows that there is no loss of generality in assuming that X'X is partitioned as in (4.1) with [X_1'X_1 | X_1'X_2] forming a basis for the row space of X'X.
Greville [1959] showed that if one factors X_{np} as X_{np} = B_{nr}C_{rp}, where r = Rank X_{np} and B_{nr} and C_{rp} are both of rank r, then X^t can be expressed as

(4.9)    X^t = C'(CC')^{-1}(B'B)^{-1}B'.

In the case A = X'X we can factor A as A = T'T, where T is r x p of rank r, and (4.9) reduces to

(4.10)    A^t = T'(TT')^{-2}T.
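Both (4.9) and (4.10) are easy to exercise numerically. In the NumPy sketch below the rank factorization is built explicitly, T is obtained from an eigendecomposition (one convenient choice among many, not prescribed by the text), and numpy.linalg.pinv serves only as an independent check.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 2))          # B_{nr}, rank r = 2
C = rng.standard_normal((2, 4))          # C_{rp}, rank r = 2
X = B @ C                                # X_{np} of rank 2

Xt = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T   # (4.9)
assert np.allclose(Xt, np.linalg.pinv(X))

A = X.T @ X                              # A = X'X = T'T
w, V = np.linalg.eigh(A)
T = np.diag(np.sqrt(w[-2:])) @ V[:, -2:].T   # T is r x p with T'T = A
M = np.linalg.inv(T @ T.T)
At = T.T @ M @ M @ T                     # (4.10): A^t = T'(TT')^{-2}T
assert np.allclose(At, np.linalg.pinv(A))
```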
In a recent paper van der Vaart [1964] has used a particularly simple matrix factorization to obtain an expression for the Pseudo-Inverse. If X_{np} is a matrix of rank r one simply selects an orthonormal basis for the row space of X_{np}. If the row vectors of this basis are p_1', p_2', ..., p_r' then we can write X = QP', where P' is the r x p matrix with rows p_1', p_2', ..., p_r' and Q = XP. It is then easy to verify that X^t = P(Q'Q)^{-1}Q'.
Boot [1963] has also given a formula for the Pseudo-Inverse of a matrix in terms of ordinary inverses. We write X'_{pn} so that its first r rows are linearly independent (where r = Rank X). Then one defines F'_{rp} and G'_{rn} by the equations

X'X = \begin{bmatrix} F'_{rp} \\ N_{p-r,p} \end{bmatrix},    X' = \begin{bmatrix} G'_{rn} \\ M_{p-r,n} \end{bmatrix};

that is, F' and G' consist of the first r rows of X'X and X' respectively. The Pseudo-Inverse of X_{np} is then given by

(4.11)    X^t_{pn} = F(F'F)^{-1}G'.

Boot gives two proofs of this result. The first is based on the result (see Ben-Israel and Wersan [1962] or Rao and Chipman [1964]) that the Pseudo-Inverse X^t of a matrix X is that matrix which minimizes the trace of X^t(X^t)' subject to the restriction that (X'X)X^t = X'. The alternative proof is algebraic and, like Greville's treatment, rests on a factorization of X.
59
Ze1en and Goldman [1963] mention that one can compute the PseudoInverse of A
2.1.
= X'X
(where rank A
= r)
by choosing R
=B
in Lenuna
The explicit formula
At
(4.12 )
= [I
_ R(R'R)-lR,](A + RR' )-1
is presented.
It is interesting to note that (4.12) can be directly verified
using the properties of the Pseudo-Inverse.
by (iii) and (vii) of Theorem 2.9.
[I - R(R'R)-lR'][A + RR,]-l
Hence
= [I
since it is easily verified that
We see that
- R(R'R)-lR'][A + (RR,)]t
(RR') = R(R'R)-2 R,
=At
and since
R(R'R)-~'At = R(R'R)-lR'At AAt = R(R'R)-lR'AAt At = 2 .
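A numerical verification of (4.12) along the same lines (a NumPy sketch; taking R as an orthonormal basis of the null space of A is one legitimate choice of the matrix B = R of Lemma 2.1, and the tolerance is an implementation detail):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 5))
A = X.T @ X                                   # symmetric, 5 x 5, of rank 2

w, V = np.linalg.eigh(A)
R = V[:, w < 1e-8]                            # columns span the null space: R'A = 0

At = (np.eye(5) - R @ np.linalg.inv(R.T @ R) @ R.T) @ np.linalg.inv(A + R @ R.T)
assert np.allclose(At, np.linalg.pinv(A))     # (4.12) reproduces the Pseudo-Inverse
```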
4.4 Abbreviated Doolittle Technique

In this section we indicate a presentation of the theory behind the Abbreviated Doolittle computing technique widely used by statisticians in dealing with linear equations, and illustrate how this technique can be used to obtain Generalized Inverses. We consider matrices of the form X'X and linear equations of the form (X'X)b = X'y. The development followed here closely follows that of Anderson [1959], who considered the case where X'X is non-singular. In Section 5.7 we return to the Abbreviated Doolittle and its place in both the theory and application of least squares.
60
We recall that a matrix X can be factored as follows
X = ZH
where
Z
is such that
Z' Z
= D,
a diagonal matrix, and
H
right triangular matrix with all diagonal elements unity.
is an upper
To conform
with the usual presentation of the Abbreviated Doolittle technique we
write
X
= ZH
instead of the more natural
X
= HZ.
The above factori-
zation is simply a result of the Gram-Schmidt Orthogonalization Process
applied to the columns of
~
be the columns of
Z
X.
Let
and X
!l'~'
""!p and .!l'
respectively.
Then
zl'
-
~~,
-c;
~,
... ,
..• , -p
z
are determined recursively as
for
i = 1,2, ••• ,p.
of X
Since no assumptions have been made about the rank
some care needs to be taken if !.e
case for
q = 1,2, ••. ,p-.e
we take
.e-q-l
!.e+q = .!.e+q -
j
~l
jrj:.e
It is clear that
ZH = X
where
=Q
for some
! . In this
61
,
~~j
j
= i+1,
1
j
=i
0
j
= 1,
0
j
= i+1,
i~,
1
j
and
0
j
=i
= 1,
I
!i!i
Hij
=
... , P
i~,
2,
... , i-1
if !i
rQ
(4.14)
Hij
Hence
=
2,
... , P
z
-i
... ,
= -0
if !i
=0
i-1
H is upper right triangular, non-singular and has diagonal
elements un1 ty .
The vectors z_1, z_2, ..., z_p are orthogonal by construction and hence

(4.14')    Z'Z = D,

where D is diagonal. Thus we have

(4.14'')    X'X = H'Z'ZH = H'DH.
From Theorem 3.3 we see that a Generalized Inverse of X'X is given by

(4.15)    (X'X)^g = H^{-1}(Z'Z)^gH'^{-1}

where (Z'Z)^g is a Generalized Inverse of (Z'Z). Since (Z'Z) is diagonal a Generalized Inverse is easily found. If we define (Z'Z)^g by

(4.16)    (Z'Z)^g = diag(d_{ii}^g),    d_{ii}^g = \begin{cases} 1/z_i'z_i, & z_i \neq 0 \\ \text{arbitrary}, & z_i = 0 \end{cases}

then (X'X)^g is a Generalized Inverse of (X'X). If (Z'Z)^r is defined by

(4.17)    (Z'Z)^r = diag(d_{ii}^r),    d_{ii}^r = \begin{cases} 1/z_i'z_i, & z_i \neq 0 \\ 0, & z_i = 0 \end{cases}

then (X'X)^r = H^{-1}(Z'Z)^rH'^{-1} is a Reflexive Generalized Inverse of (X'X).
The above remarks indicate that if one can obtain the matrices H and Z one can easily find a Generalized Inverse or a Reflexive Generalized Inverse of X'X. The equations (4.13) and (4.14), however, indicate that direct determination of Z and H is anything but trivial. Fortunately the application of the Abbreviated Doolittle Computational Procedure yields a systematic method for obtaining H. It turns out that Z is not needed explicitly and the procedure yields Z'Z in a simple manner.

The Abbreviated Doolittle computing procedure can be conveniently expressed in the format given in Table 4.1. The quantities a_{ij} = x_i'x_j are the elements of the X'X matrix and a_{iy} = x_i'y for i = 1, 2, ..., p.
Table 4.1. The Abbreviated Doolittle format

A_1 row:  A_{11}  A_{12}  ...  A_{1p} | A_{1y},   where A_{1j} = a_{1j};  j = 1, 2, ..., p, y
B_1 row:  1  B_{12}  ...  B_{1p} | B_{1y},        where B_{1j} = A_{1j}/A_{11};  j = 1, 2, ..., p, y
A_2 row:  0  A_{22}  ...  A_{2p} | A_{2y},        where A_{2j} = a_{2j} - B_{12}A_{1j};  j = 1, 2, ..., p, y
B_2 row:  0  1  ...  B_{2p} | B_{2y},             where B_{2j} = A_{2j}/A_{22};  j = 1, 2, ..., p, y
  ...
A_i row:  0  ...  0  A_{ii}  ...  A_{ip} | A_{iy}, where A_{ij} = a_{ij} - \sum_{k=1}^{i-1} B_{ki}A_{kj};  j = 1, 2, ..., p, y
B_i row:  0  ...  0  1  ...  B_{ip} | B_{iy},      where B_{ij} = A_{ij}/A_{ii};  j = 1, 2, ..., p, y
  ...
A_p row:  0  ...  0  A_{pp} | A_{py},              where A_{pj} = a_{pj} - \sum_{k=1}^{p-1} B_{kp}A_{kj};  j = 1, 2, ..., p, y
B_p row:  0  ...  0  1 | B_{py},                   where B_{pj} = A_{pj}/A_{pp};  j = 1, 2, ..., p, y
Some difficulty arises in using the format indicated in Table 4.1 when A_{ll} = 0. This will occur if and only if x_l is a linear combination of x_1, x_2, ..., x_{l-1}. More precisely, if x_l = \sum_{i=1}^{l-1} c_i x_i then, since the equations X'Xb = X'y are consistent, it can be shown that A_{lj} = 0 for j = 1, 2, ..., p, y. In such a case we define the B_l row as follows:

(4.18)    B_{lj} = \begin{cases} 0, & j < l \\ 1, & j = l \\ 0, & j > l. \end{cases}

Similar procedures are followed if several A_{ii}'s are null. The reason for this somewhat peculiar convention will become clear in what follows.
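The forward solution, including convention (4.18), is short to program. The NumPy sketch below (the zero-pivot tolerance and the small test design are implementation assumptions not fixed by the text) produces the "A" and "B" rows of Table 4.1 and then demonstrates (4.14''), (4.15) and (4.17): the B rows reproduce H, the diagonal of the A rows reproduces Z'Z = D, and B^{-1}D^rB'^{-1} is a Reflexive Generalized Inverse of X'X.

```python
import numpy as np

def doolittle_forward(S, tol=1e-10):
    """Abbreviated Doolittle forward solution on S = X'X (extra columns,
    e.g. X'y, may be appended).  Returns the "A" rows and "B" rows."""
    p, m = S.shape
    Arows, Brows = np.zeros((p, m)), np.zeros((p, m))
    for i in range(p):
        Arows[i] = S[i] - Brows[:i, i] @ Arows[:i]   # A_ij = a_ij - sum_k B_ki A_kj
        if abs(Arows[i, i]) > tol:
            Brows[i] = Arows[i] / Arows[i, i]        # B_ij = A_ij / A_ii
        else:                                        # dependent column of X:
            Arows[i] = 0.0                           #   A_lj = 0 for all j
            Brows[i, i] = 1.0                        #   convention (4.18)
    return Arows, Brows

X = np.array([[1., 1., 2.], [1., 0., 1.], [1., 1., 2.], [1., 0., 1.]])  # col 3 = col 1 + col 2
S = X.T @ X
A_, B_ = doolittle_forward(S)
D = np.diag(np.diag(A_))                             # Z'Z = D, from (4.14')
assert np.allclose(B_.T @ D @ B_, S)                 # X'X = H'DH with H = B  (4.14'')

dr = np.array([1.0 / v if v > 1e-10 else 0.0 for v in np.diag(A_)])
G = np.linalg.inv(B_) @ np.diag(dr) @ np.linalg.inv(B_).T   # (4.15) with (4.17)
assert np.allclose(S @ G @ S, S)                     # a Reflexive Generalized Inverse
```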
Recalling the definition of z_1 given in (4.13) we see that A_{1j} = z_1'x_j for j = 1, 2, ..., p, A_{1y} = z_1'y, and

B_1' = [1, z_1'x_2/z_1'z_1, ..., z_1'x_p/z_1'z_1, z_1'y/z_1'z_1].

Also A_{21} = 0, A_{2j} = z_2'x_j for j = 2, ..., p, A_{2y} = z_2'y; and B_{21} = 0, B_{22} = 1, B_{2j} = z_2'x_j/z_2'z_2 for j = 3, ..., p, B_{2y} = z_2'y/z_2'z_2. In general it follows that

(4.19)    A_{lj} = \begin{cases} 0, & j < l \\ z_l'z_l, & j = l \\ z_l'x_j, & j = l+1, l+2, ..., p \end{cases},    A_{ly} = z_l'y,

B_{lj} = \begin{cases} 0, & j < l \\ 1, & j = l \\ z_l'x_j/z_l'z_l, & j = l+1, l+2, ..., p \end{cases},    B_{ly} = z_l'y/z_l'z_l,

for l = 1, 2, ..., p if z_l \neq 0. If z_l = 0 we follow the convention that

(4.19')    A_{lj} = 0,    B_{lj} = \begin{cases} 0, & j < l \\ 1, & j = l \\ 0, & j > l \end{cases},    B_{ly} = 0.
Hence

(4.20)    B = \begin{bmatrix} B_1' \\ B_2' \\ \vdots \\ B_p' \end{bmatrix} = H,

where the matrices Z and H are as defined by equations (4.13) and (4.14). It is also of interest to note that

(4.21)    (Z'Z)H = (Z'Z)B = \begin{bmatrix} A_1' \\ A_2' \\ \vdots \\ A_p' \end{bmatrix},

the matrix of "A" rows.
The matrix H^{-1} = B^{-1} can easily be found by the following sequential scheme. Remembering that the inverse of a triangular matrix is again triangular, we see that the matrix equation

\begin{bmatrix} 1 & B_{12} & B_{13} & \cdots & B_{1p} \\ 0 & 1 & B_{23} & \cdots & B_{2p} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix} \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1p} \\ 0 & C_{22} & \cdots & C_{2p} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & C_{pp} \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & 1 \end{bmatrix}

yields the following set of equations for determining the elements of the C matrix:

(4.22)    C_{ij} = 1, i = j;    C_{ij} = 0, j < i;    C_{ij} + \sum_{k=i+1}^{j} B_{ik}C_{kj} = 0, j > i;    i = 1, 2, ..., p.

The simplest procedure is to start with the equation C_{p-1,p} + C_{pp}B_{p-1,p} = 0 and proceed in a sequential fashion to determine the other C_{ij}'s. It is clear that this procedure gives a routine method for finding B^{-1}. Thus the computation of a Generalized Inverse or a Reflexive Generalized Inverse can easily be performed within the Abbreviated Doolittle format utilizing equation (4.15), i.e., (X'X)^g = B^{-1}(Z'Z)^gB'^{-1}.
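A sketch of the sequential scheme (4.22) in NumPy (the bottom-to-top loop order is the one suggested in the text):

```python
import numpy as np

def unit_upper_inverse(B):
    """Invert a unit upper right triangular matrix via equations (4.22)."""
    p = B.shape[0]
    C = np.eye(p)
    for i in range(p - 2, -1, -1):                 # rows from bottom to top
        for j in range(i + 1, p):
            C[i, j] = -B[i, i + 1:j + 1] @ C[i + 1:j + 1, j]   # C_ij = -sum B_ik C_kj
    return C

B = np.array([[1., 2., 3.], [0., 1., 4.], [0., 0., 1.]])
assert np.allclose(unit_upper_inverse(B) @ B, np.eye(3))
```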
It is of interest to note that the Doolittle procedure can be used to compute certain matrix products. To be specific we shall indicate how to compute L(X'X)^rL'. Form the augmented matrix

M = [X'X | L']

and carry through the forward solution of the Abbreviated Doolittle on M. Note that since M is rectangular with p rows the Doolittle procedure will stop after p steps. If one forms the matrix M_A consisting of the "A" rows of the Doolittle and partitions it in the same way as M, it follows from M_A = (B')^{-1}M that

(4.23')    L_A = (B')^{-1}L',

where L_A is the block of M_A corresponding to the appended columns L'. If we let L_B designate the analogously partitioned block of "B" rows, it follows from the definition of the Doolittle operations that L_B = (Z'Z)^r(B')^{-1}L'. Hence

(4.24)    L_A'L_B = LB^{-1}(Z'Z)^r(B')^{-1}L' = L(X'X)^rL'.

Equation (4.24) gives a simple procedure for computing a Normalized Generalized Inverse of (X'X): recall from Theorem 2.18 the expression (4.25) for a Normalized Generalized Inverse of (X'X); upon forming the appropriate augmented matrix, the computation of (X'X)^n can be performed by an application of the Doolittle forward solution and equation (4.24). The computation of the matrix product L(X'X)^rL' discussed above is a generalization of Aitken's triple product method [1937] and is discussed in a recent paper by Rohde and Harvey [1964].

The computation of the Pseudo-Inverse of (X'X) can be achieved by the Doolittle using the factorization representation discussed in Section 4.3.4, namely X'X = T'T with Rank(T_{rp}) = r = Rank(X'X). To compute T we let T consist of the r non-zero rows of D^{1/2}B, where D and B are as defined in (4.14') and (4.20); it is easy to see that X'X = T'T. The matrix product T'(TT')^{-2}T can, of course, be computed from the Doolittle utilizing the augmented matrix

(4.26)    [(TT')^2 | T].

Note that the determination of T is identical with the determination of the square root of X'X, and hence the square root method or any similar factorization procedure can be used to obtain the Pseudo-Inverse in a routine fashion.
4.5 Some Results on Bordered Matrices

A widely used technique in the solution of extremal problems is the Lagrangian Multiplier Technique. In this technique one considers the problem of maximizing (minimizing) a function F of p variables \theta_1, \theta_2, ..., \theta_p subject to s < p restrictions. Let the restrictions be given by \psi_i(\theta_1, \theta_2, ..., \theta_p) = r_i for i = 1, 2, ..., s. Define
\theta' = (\theta_1, \theta_2, ..., \theta_p),    \lambda' = (\lambda_1, \lambda_2, ..., \lambda_s),    \psi'(\theta) = (\psi_1(\theta), \psi_2(\theta), ..., \psi_s(\theta)).

A solution to the problem of finding an extremum of F subject to the stated restrictions is often obtained by equating the partial derivatives of F with respect to \theta and \lambda to zero and solving for \theta and \lambda. Under certain circumstances (e.g. when F is a quadratic form in \theta and \psi(\theta) is linear in \theta) the above procedure results in a set of linear equations of the form

(4.27)    \begin{bmatrix} A & R \\ R' & 0 \end{bmatrix} \begin{bmatrix} \theta \\ \lambda \end{bmatrix} = \begin{bmatrix} a \\ r \end{bmatrix}.

The matrix

\begin{bmatrix} A & R \\ R' & 0 \end{bmatrix}

is, for obvious reasons, called a Bordered Matrix. As a useful application of the theory of Generalized Inverses we shall discuss briefly the problem of obtaining a solution to a set of linear equations of the form (4.27). We shall do this by a simple application of the results of Section 4.2. Forming

L = \begin{bmatrix} A & R \\ R' & 0 \end{bmatrix}' \begin{bmatrix} A & R \\ R' & 0 \end{bmatrix} = \begin{bmatrix} A'A + RR' & A'R \\ R'A & R'R \end{bmatrix} = \begin{bmatrix} K & A'R \\ R'A & R'R \end{bmatrix},    say,
we see that a Generalized Inverse of L is given by

L^g = \begin{bmatrix} K^g + K^gA'RQ^gR'AK^g & -K^gA'RQ^g \\ -Q^gR'AK^g & Q^g \end{bmatrix},    where Q = R'R - R'AK^gA'R.

Hence a Normalized Generalized Inverse of the Bordered Matrix is given by

(4.28)    \begin{bmatrix} A & R \\ R' & 0 \end{bmatrix}^n = L^g \begin{bmatrix} A' & R \\ R' & 0 \end{bmatrix} = \begin{bmatrix} K^gA' + K^gA'RQ^gR'(AK^gA' - I) & K^gR + K^gA'RQ^gR'AK^gR \\ Q^gR'(I - AK^gA') & -Q^gR'AK^gR \end{bmatrix}.

The general formula given by (4.28) can be specialized to yield the results concerning Bordered Matrices which are widely known, as well as those which are not.

One application of Bordered Matrices is, as indicated in Section 4.3, the computation of Normalized Generalized Inverses and Pseudo-Inverses. Using (4.28) we shall derive alternative, but equivalent, expressions for the Normalized Generalized Inverses and Pseudo-Inverses given in Section 4.3.
Case I: If we adopt the assumptions used by Zelen and Goldman [1963], i.e., A is symmetric, p x p, of rank r; B is p x (p-r) of rank (p-r) with B'A = 0; and R'B is non-singular, then

F = \begin{bmatrix} A & R \\ B' & 0 \end{bmatrix}

is non-singular. Letting

F^{-1} = \begin{bmatrix} C & U \\ W & V \end{bmatrix}

be partitioned in the same manner as F, the identity FF^{-1} = I yields the equations

AC + RW = I,    AU + RV = 0,    B'C = 0,    B'U = I,

whence V = 0, U = B(B'B)^{-1} and W = (B'R)^{-1}B'. Since the inverse of F is unique we have

F^{-1} = \begin{bmatrix} C & B(B'B)^{-1} \\ (B'R)^{-1}B' & 0 \end{bmatrix},

where C remains to be determined, and where in (4.28) we now have K = A^2 + RR' and Q = R'R - R'AK^gAR.
Comparing the blocks of F^{-1} with the corresponding blocks of (4.28) it follows that

K^gR + K^gARQ^gR'AK^gR = B(R'B)^{-1}    and    Q^gR' - Q^gR'AK^gA = (B'R)^{-1}B'.

Noting that the assumptions imply that K^g = K^{-1} = (A^2 + RR')^{-1}, we have an explicit expression for C given by

C = K^{-1}A - K^{-1}AR[Q^gR' - Q^gR'AK^{-1}A] = K^{-1}A[I - R(B'R)^{-1}B'] = (A^2 + RR')^{-1}A[I - R(B'R)^{-1}B'].

This is an alternative form of the expression given in Section 4.3.3. It is easily checked that C is a Generalized Inverse of A^2, and hence that CA is a Normalized Generalized Inverse of A.
Case II: If we let B = R in Case I we see that the assumptions imply that

\begin{bmatrix} A & R \\ R' & 0 \end{bmatrix},    K = A^2 + RR',    R'R    and    Q = R'R - R'AK^{-1}AR

are all invertible, so that (4.28) becomes the ordinary inverse of the Bordered Matrix. Since the inverse of a symmetric matrix is again symmetric, it follows that the off-diagonal blocks of this inverse are transposes of one another. If we let A^t be the Pseudo-Inverse of A we see that

A^t = C = (A^2 + RR')^{-1}A[I - R(R'R)^{-1}R'] = (A^2 + RR')^{-1}A,    since AR = 0.

This expression involves only one matrix inversion, while that given by Zelen and Goldman involves two inversions.
CHAPTER 5. LEAST SQUARES APPLICATIONS

5.1 Introduction
In an excellent historical review Eisenhart [1963] has pointed out that the analysis of a set of observed data using least squares techniques is by no means new. It has only been in the last thirty years, however, that research into the statistical properties and implications of such analyses has been initiated. Much of this research grew out of the application of least squares analyses to a wide class of models variously called Analysis of Variance, Regression or General Linear Models.

Generally speaking, least squares techniques yield tractable estimators when the model used to represent the observed data is linear in the parameters. There is concern about the properties of least squares estimators when parameters enter into models in a non-linear fashion. However, alternative methods of estimation such as maximum likelihood, minimum chi-square, minimum modified chi-square, etc., can often be viewed as quasi-least squares estimators.

Least squares theory (and indeed most of the important applications) is relatively complete for the class of models which can be fitted into the following definition.
Definition 5.1 A model for a set of observed data y is called a general linear model if

(5.1)    (i) E(y_{n1}) = X_{np}\beta_{p1},   Rank X = r \leq p < n,    and    (ii) Var(y) = V\sigma^2,   Rank V = n,

where \beta and \sigma^2 are unknown parameters.
Of importance is inference about the parameters. We shall devote most of our attention to problems of estimation in this chapter. In the framework of Definition 5.1 not all the parameters are estimable in the following sense.

Definition 5.2 A parametric function l'\beta is said to be linearly estimable if there exists c such that E(c'y) = l'\beta regardless of the value of \beta.
There is some controversy about restricting problems of estimation to those parametric functions which are (linearly) estimable. Two reasons can be given for such a restriction. First, it can be shown (Section 5.2) that estimable functions possess unique minimum variance unbiased linear estimators (called BLUE estimators). The criterion of minimum variance unbiasedness is widely used in other estimation problems (although occasionally it leads to nonsensical estimators, c.f. Lehmann [1950], pages 3-14 to 3-15), and hence such a restriction seems reasonable. Second, most of the common applications require for their solution that only estimable functions in the sense of Definition 5.2 need be considered.

In the next section we review some of the current approaches to least squares theory in general linear models.
5.2 A Review of Some Current Theories

The theory of estimation in the general linear model is well developed and can be compactly presented. In this section we shall describe briefly the essence of this theory. We assume that we have a vector of random variables y with the following properties:

(5.3)    (i) E(y) = X\beta,   Rank X = r \leq p \leq n,    and    (ii) Var(y) = I\sigma^2,

where \beta and \sigma^2 are unknown parameters.

It is desired to estimate l'\beta, a (linearly) estimable function in the sense of Definition 5.2, using a minimum variance linear unbiased estimator or BLUE estimator.

The most elegant proof of the existence and uniqueness of a BLUE estimator for l'\beta uses the theory of projections on finite dimensional vector spaces. The details of this theory are available in many sources. For convenience we follow the line of development used by Scheffé [1959].
Recall that l'\beta is linearly estimable if and only if there exists c such that

(5.2)    l' = c'X.

Scheffé proves that there exists a unique linear unbiased estimator for l'\beta given by c*'y, where the vector c* lies in V_r, the vector space spanned by the columns of X.¹ In addition, if c'y is any linear unbiased estimator of l'\beta then c* is the projection of c on V_r along V_r^⊥. Using these results Scheffé establishes the celebrated Gauss-Markoff Theorem.

Theorem 5.1 (Gauss-Markoff): Under the model (5.3) every (linearly) estimable function l'\beta has a unique linear unbiased estimator c*'y which has minimum variance in the class of all linear unbiased estimators.

¹Bose [1959] calls V_r the "estimation space" and V_r^⊥ the "error space."

The form of the BLUE l'\hat{\beta} is l'b, where b
is any solution to the "least squares" or "normal" equations

(X'X)b = X'y.

From Theorem 2.5 we have b = (X'X)^gX'y. Hence the BLUE of l'\beta is l'b = l'(X'X)^gX'y. It can also be shown that E[SSE] = E[(y - Xb)'(y - Xb)] = q\sigma^2, where q = n - Rank(X'X). Thus an unbiased estimator for \sigma^2 is \hat{\sigma}^2 = SSE/q.
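In modern notation these computations amount to a few lines. The NumPy sketch below (the rank-deficient design matrix and the treatment contrast are illustrative assumptions, and pinv supplies one admissible choice of (X'X)^g) forms b = (X'X)^gX'y, the BLUE of an estimable l'\beta, and \hat{\sigma}^2 = SSE/q.

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]], dtype=float)  # rank 2
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.standard_normal(4)

G = np.linalg.pinv(X.T @ X)               # one choice of (X'X)^g
b = G @ X.T @ y                           # a solution of (X'X)b = X'y
ell = np.array([0.0, 1.0, -1.0])          # estimable: ell' = c'X with c' = (.5, .5, -.5, -.5)
blue = ell @ b                            # BLUE of ell'beta

SSE = np.sum((y - X @ b) ** 2)
q = 4 - np.linalg.matrix_rank(X.T @ X)    # q = n - Rank(X'X) = 2
sigma2_hat = SSE / q                      # unbiased estimator of sigma^2
```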
Other approaches to the theory of least squares are possible, and have certain desirable qualities. The approach of Plackett [1960] will be extended in Section 5.3. An approach due to Roy [1953] is of interest because it is equivalent to using a Reflexive Generalized Inverse of (X'X). Roy's approach is also important in that it often leads to more efficient programming of linear model problems on high-speed computing equipment, as is brought out by the following paragraph.

Roy's approach utilizes a basis of X and is as follows. Rewrite the model (5.3) as

E(y) = [X_1 | X_2]\beta

where X_1 is a basis of X. Note that the elements of \beta have been rearranged to conform with the rearrangement of X. From the condition for linear estimability (5.2) it follows that l'\beta is estimable if and only if there exists c such that l' = c'[X_1 | X_2]. Using this form of the estimability condition Roy shows that the BLUE estimator for l'\beta is given in terms of (X_1'X_1)^{-1}X_1'y, and that this estimator coincides with the least squares estimator of l'\beta. Roy also develops criteria for testing and establishing the testability of hypotheses, and shows that these results are invariant under choice of a basis of X.
It is interesting to note that a simple proof of the invariance of, say, l'\hat{\beta} under choice of a basis of X can be given using Generalized Inverses. Since l'\beta is estimable we have [l_1' | l_2'] = [c_1' | c_2'][X_1 | X_2] for some [c_1' | c_2']. Thus

l'\hat{\beta} = [c_1' | c_2']X(X'X)^rX'y,

and since X(X'X)^rX' is uniquely determined (Theorem 2.6) it follows that another choice of basis for X will yield the same l'\hat{\beta}.
The above approaches to the theory of the general linear model can be generalized in that one can replace the assumption that Var(y) = I\sigma^2 by the assumption that Var(y) = V\sigma^2, where V is positive definite and known. If V is not known, then estimates of its elements must be substituted and iteration will need to be used to find b. The theory necessary to handle the linear model under the assumption that Var(y) = V\sigma^2 with V known will be discussed in Section 5.3.
5.3 Weighted Least Squares and the Generalized Gauss-Markoff Theorem

In Section 5.2 it was mentioned that the assumption of the general linear model could be somewhat relaxed in that Var(y) = I\sigma^2 could be replaced by Var(y) = V\sigma^2. In this section we define the weighted least squares estimator of a linearly estimable function and show the equivalence of the weighted least squares estimator and the BLUE. The results extend Plackett's [1960] results and utilize the theory of Generalized Inverses. The proofs are simple, purely algebraic, and reduce to the case considered in Section 5.2 when V = I\sigma^2.
Definition 5.3 If a set of linear parametric functions L\beta is such that there exists C satisfying E[Cy] = L\beta for all \beta, then L\beta is said to be a (linearly) estimable set of parametric functions.

Theorem 5.2: If L\beta is a set of linearly estimable functions then Cy, the best (minimum variance) linear unbiased estimator for L\beta, is given when

C = L(X'V^{-1}X)^gX'V^{-1}.

Further, the value of Var(Cy) is L(X'V^{-1}X)^gL'.

Before proving Theorem 5.2 we need to establish the following Lemma.

Lemma 5.1: The identity (X'V^{-1}X)(X'V^{-1}X)^gX'V^{-1} = X'V^{-1} holds, and the quantity V^{-(1/2)}X(X'V^{-1}X)^gX'V^{-(1/2)'} is uniquely determined, symmetric and idempotent.

Proof: Let Y = V^{-(1/2)}X where V^{(1/2)'}V^{(1/2)} = V. Then

V^{-(1/2)}X(X'V^{-1}X)^gX'V^{-(1/2)'} = Y(Y'Y)^gY'

is uniquely determined, symmetric and idempotent by Theorem 2.6. Also by Theorem 2.6 it follows that Y(Y'Y)^gY'Y = Y, which yields the stated identity.
Proof of Theorem 5.2: If Cy is to be unbiased for L\beta then E[Cy] = CX\beta = L\beta for all \beta. Hence CX = L. We must show that the diagonal elements of Var(Cy) = CVC', subject to CX = L, are minimized when C = L(X'V^{-1}X)^gX'V^{-1}. Expanding, and using CX = L together with Lemma 5.1, we find

[C - L(X'V^{-1}X)^gX'V^{-1}]V[C - L(X'V^{-1}X)^gX'V^{-1}]' = CVC' - CX(X'V^{-1}X)^gX'C' = CVC' - L(X'V^{-1}X)^gL'.

Hence

Var(Cy) = CVC' = L(X'V^{-1}X)^gL' + [C - L(X'V^{-1}X)^gX'V^{-1}]V[C - L(X'V^{-1}X)^gX'V^{-1}]'.

Thus each diagonal element of Var(Cy) is minimized when C - L(X'V^{-1}X)^gX'V^{-1} = 0, or when C = L(X'V^{-1}X)^gX'V^{-1}. With this choice of C we have Var(L\hat{\beta}) = L(X'V^{-1}X)^gL'.
Having found the linear minimum variance unbiased estimator for L\beta we now show the equivalence of this estimator and the weighted least squares estimator.

Definition 5.4 The weighted least squares estimator for a linearly estimable set of parametric functions L\beta is given by L\tilde{b}, where \tilde{b} is any solution to the equations

(X'V^{-1}X)b = X'V^{-1}y.

Theorem 5.3: The vector \tilde{b} used in determining L\tilde{b} minimizes the weighted error sum of squares

WSSE = (y - Xb)'V^{-1}(y - Xb).

Proof: We first note that

y - Xb = [y - X(X'V^{-1}X)^gX'V^{-1}y] + [X(X'V^{-1}X)^gX'V^{-1}y - Xb].

Expanding WSSE accordingly gives four terms; by Lemma 5.1 the two cross-product terms vanish, since, for example,

[Xb - X(X'V^{-1}X)^gX'V^{-1}y]'V^{-1}[y - X(X'V^{-1}X)^gX'V^{-1}y] = [b - (X'V^{-1}X)^gX'V^{-1}y]'[X'V^{-1}y - (X'V^{-1}X)(X'V^{-1}X)^gX'V^{-1}y] = 0.

Hence

WSSE = [y - X(X'V^{-1}X)^gX'V^{-1}y]'V^{-1}[y - X(X'V^{-1}X)^gX'V^{-1}y] + [Xb - X(X'V^{-1}X)^gX'V^{-1}y]'V^{-1}[Xb - X(X'V^{-1}X)^gX'V^{-1}y],

which is minimized if b is chosen so that Xb = X(X'V^{-1}X)^gX'V^{-1}y. Obviously \tilde{b} = (X'V^{-1}X)^gX'V^{-1}y satisfies this equation. Further, if b satisfies the above equation then b also satisfies

(X'V^{-1}X)b = (X'V^{-1}X)(X'V^{-1}X)^gX'V^{-1}y = X'V^{-1}y;

that is, b is a solution to the "normal" or "least squares" equations.
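A NumPy sketch of Theorem 5.2 and Theorem 5.3 (the diagonal V and the contrast below are illustrative assumptions, and pinv again stands in for an arbitrary Generalized Inverse); it also checks that the weighted least squares estimate coincides with the BLUE:

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]], dtype=float)
V = np.diag([1.0, 2.0, 1.0, 0.5])                 # known positive definite
y = X @ np.array([1.0, 2.0, -1.0]) + rng.multivariate_normal(np.zeros(4), V)
L = np.array([[0.0, 1.0, -1.0]])                  # estimable set (one contrast)

Vinv = np.linalg.inv(V)
Ag = np.linalg.pinv(X.T @ Vinv @ X)               # (X'V^{-1}X)^g
C = L @ Ag @ X.T @ Vinv                           # Theorem 5.2
blue = C @ y
var_blue = L @ Ag @ L.T                           # Var(Cy) = L(X'V^{-1}X)^g L'

b_tilde = Ag @ X.T @ Vinv @ y                     # a weighted least squares solution
assert np.allclose(blue, L @ b_tilde)             # Theorem 5.3: the two coincide
```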
5.4 Linear Models with Restricted Parameters

Occasionally linear models arise in which the parameters \beta are restricted in a linear fashion. That is, it is known that

(5.4)    R'\beta = r

for some matrix R and vector r. In such a restricted model one feels that it is only reasonable that a set of estimators \hat{\beta} should satisfy (5.4).

We are thus led to consider a restricted linear model of the form

(5.5)    (i) E(y) = X\beta,   Rank X = r \leq p < n;    (ii) Var(y) = V\sigma^2,   V positive definite;    and    (iii) R'\beta = r.

Under the model (5.5) C. R. Rao [1952] established the following theorem.
Theorem 5.4: A linear function l'\beta is estimable under (5.5) if and only if there exist c and d such that

c'X + d'R' = l'

holds. If l'\beta is estimable then its BLUE is given by c_0'y + b_0, where c_0' = \lambda'X'V^{-1} and b_0 = d'r. The quantities \lambda and d are any solution to the equations

\begin{bmatrix} X'V^{-1}X & R \\ R' & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ d \end{bmatrix} = \begin{bmatrix} l \\ 0 \end{bmatrix}.
Proof: If c and d are such that c'X + d'R' = l' and d'r = b_0, then l'\beta is estimable since

E(c'y + b_0) = c'X\beta + b_0 = l'\beta - d'R'\beta + d'r = l'\beta.

Conversely, if l'\beta is estimable under (5.5) then there exist c and b_0 such that E(c'y + b_0) = c'X\beta + b_0 = l'\beta for all \beta satisfying R'\beta = r. Thus c'X - l' = -d'R' for some d, and hence b_0 = d'r.

To find the BLUE of l'\beta one simply determines c and b_0 such that c'y + b_0 is unbiased and Var(c'y + b_0) is a minimum. Since Var(c'y + b_0) = c'Vc we are led to form the Lagrangian function

F(c, \lambda_0, \lambda) = c'Vc + 2\lambda_0(b_0 - d'r) - 2\lambda'(X'c + Rd - l).

Taking partials with respect to c, \lambda_0, b_0, \lambda and d yields the equations
\frac{\partial F}{\partial c} = 2c'V - 2\lambda'X',    \frac{\partial F}{\partial \lambda_0} = 2(b_0 - d'r),    \frac{\partial F}{\partial b_0} = 2\lambda_0,    \frac{\partial F}{\partial \lambda} = -2(X'c + Rd - l)',    \frac{\partial F}{\partial d} = -2\lambda_0 r' - 2\lambda'R.

Equating the partial derivatives to zero and simplifying yields

c' = \lambda'X'V^{-1},    b_0 = d'r,    \lambda_0 = 0,    c'X + d'R' = l'    and    \lambda'R = 0'.

Hence \lambda and d are solutions to the equations

\begin{bmatrix} X'V^{-1}X & R \\ R' & 0 \end{bmatrix} \begin{bmatrix} \lambda \\ d \end{bmatrix} = \begin{bmatrix} l \\ 0 \end{bmatrix}.

Since the use of Lagrangian multipliers only yields necessary conditions for extrema, one still needs to verify that the above solutions do in fact yield a minimum. Rao has shown that these solutions do in fact minimize Var(l'\hat{\beta}), and that l'\hat{\beta} is uniquely determined.
Dwyer [1959] has obtained the natural generalization of Rao's formula to the case of a set of linear functions. He has shown that L\beta is a set of estimable functions in the model (5.5) if and only if there exist C and D such that

C'X + D'R' = L    and    D'r = b_0.

If L\beta is an estimable set of linear functions then its BLUE is given by L\hat{\beta} = C'y + b_0, where C' = \Lambda'X'V^{-1} and b_0 = D'r. The quantities \Lambda and D are given as any solution to the equations

\begin{bmatrix} X'V^{-1}X & R \\ R' & 0 \end{bmatrix} \begin{bmatrix} \Lambda \\ D \end{bmatrix} = \begin{bmatrix} L' \\ 0 \end{bmatrix}.

Dwyer also gives an explicit expression for the BLUE in the case that X is of full rank and R is of full rank. He mentions that Rao did not obtain such an explicit expression for the BLUE of a single estimable function. This is not surprising since Rao assumes that X is not necessarily of full rank. Rao assumes, however, that R is of full rank.

Using the theory of Generalized Inverses of bordered matrices we can give an explicit expression for L\hat{\beta} which includes both Dwyer's and Rao's results as special cases. Zelen and Goldman [1963] have achieved such results but their formulae seem unnecessarily complicated and cumbersome.
From Section 4.5 a Generalized Inverse of

\begin{bmatrix} X'V^{-1}X & R \\ R' & 0 \end{bmatrix}

is given by (4.28), where now A = X'V^{-1}X, Q = R'R - R'AK^gA'R and K = A'A + RR'. Hence

\begin{bmatrix} \Lambda \\ D \end{bmatrix} = \begin{bmatrix} X'V^{-1}X & R \\ R' & 0 \end{bmatrix}^g \begin{bmatrix} L' \\ 0 \end{bmatrix},

and thus, in particular,

b_0 = D'r = L(I - AK^gA')RQ^gr.

By allowing the assumptions on X and R to vary, one can obtain as special cases of the above formulae the results of Dwyer and an explicit expression for Rao's situation. Note that since a Generalized Inverse of 0 is 0 there is no problem in treating the case R = 0, as in Dwyer's method.

The variance covariance matrix of the BLUE for L\beta can easily be obtained and is simply

V(L\hat{\beta}) = C'VC.

The expressions given by Dwyer for various special cases can be obtained by simple substitution into the above formula.
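The bordered system of Theorem 5.4 is easy to exercise. The NumPy sketch below (the full-rank X, the single restriction, and the cross-check against the restricted least squares solution are assumptions chosen so that ordinary solves suffice) computes the BLUE as \lambda'X'V^{-1}y + d'r:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.standard_normal((12, 4))
V = np.eye(12)
R = np.array([[1.0, 1.0, 0.0, 0.0]]).T            # restriction: beta_1 + beta_2 = 0
r = np.array([0.0])
y = X @ np.array([1.0, -1.0, 0.5, 0.5]) + 0.05 * rng.standard_normal(12)

A = X.T @ np.linalg.inv(V) @ X
M = np.block([[A, R], [R.T, np.zeros((1, 1))]])   # the bordered matrix
ell = np.array([1.0, 0.0, 0.0, 0.0])

sol = np.linalg.solve(M, np.concatenate([ell, [0.0]]))
lam, d = sol[:4], sol[4:]
blue = lam @ X.T @ np.linalg.inv(V) @ y + d @ r   # c_0'y + b_0 of Theorem 5.4

# agrees with ell' applied to the restricted (weighted) least squares solution
bhat = np.linalg.solve(M, np.concatenate([X.T @ np.linalg.inv(V) @ y, r]))[:4]
assert np.allclose(blue, ell @ bhat)
```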
5.5 Linear Models with a Singular Variance-Covariance Matrix

In this section we generalize the results of Section 5.3 to the case where V is singular. This situation can occur when one of the y's is a linear combination of some of the other y's, or when one of the y's is a constant. Zelen and Goldman [1963] have presented some results in this connection but their results seem too restrictive. The treatment presented here is straight-forward and utilizes the results of Section 5.4 as well as the theory of Generalized Inverses.

The linear model under consideration is now

(i) E(y) = X\beta,    and    (ii) Var(y) = V,    V positive semi-definite of rank q \leq n.

Without loss of generality we may assume that V has been written in the form

V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}

where [V_{11} | V_{12}] is a basis for V. It then follows that V_{11} is non-singular and that V_{22} = V_{21}V_{11}^{-1}V_{12}.
If we let

P' = \begin{bmatrix} I & 0 \\ -V_{21}V_{11}^{-1} & I \end{bmatrix}

then simple matrix multiplications show that

P'VP = \begin{bmatrix} V_{11} & 0 \\ 0 & 0 \end{bmatrix}.

It follows that the transformation w = P'y, with X partitioned conformably into row blocks X_1 and X_2, yields a linear model for w' = (w_1', w_2') given by

(i)    E\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = P'X\beta = \begin{bmatrix} X_1 \\ X_2 - V_{21}V_{11}^{-1}X_1 \end{bmatrix}\beta,    (ii)    Var\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} V_{11} & 0 \\ 0 & 0 \end{bmatrix}.

Hence w_2 = (X_2 - V_{21}V_{11}^{-1}X_1)\beta with probability one.

It is clear that the model has been reduced by the transformation P' to a restricted model of the form considered in Section 5.4. The model is thus

(i) E(w_1) = X_1\beta,    (ii) Var(w_1) = V_{11},    and    (iii) R'\beta = (X_2 - V_{21}V_{11}^{-1}X_1)\beta = w_2.
From the results presented in Section 5.4 it follows that

L\hat{\beta} = C'w_1 + b_0,

where C' = \Lambda'X_1'V_{11}^{-1} and b_0 = D'w_2, the quantities \Lambda and D being any solution of

\begin{bmatrix} A & R \\ R' & 0 \end{bmatrix} \begin{bmatrix} \Lambda \\ D \end{bmatrix} = \begin{bmatrix} L' \\ 0 \end{bmatrix},

with

A = X_1'V_{11}^{-1}X_1,    K = A'A + RR',    Q = R'R - R'AK^gA'R,    R' = X_2 - V_{21}V_{11}^{-1}X_1.

In particular, as in Section 5.4, b_0 = L(I - AK^gA')RQ^gw_2. The formula presented in Section 5.4 for the variance covariance matrix can be used to obtain the variance covariance matrix of the BLUE of L\beta.
5.6 Multiple Analysis of Covariance

The use of the theory of Generalized Inverses allows an uncluttered treatment of multiple analysis of covariance. In particular the results of Section 4.2 can be used to show that the estimators for the regression coefficients of the covariables are unbiased.

Suppose that we have the model

y = [X_1 | X_2]\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \epsilon

where y is an n x 1 vector, \epsilon is an n x 1 vector, X_1 is an n x p matrix of rank r \leq p < n, X_2 is an n x q matrix of covariables of rank q < n, \beta_1 is a p x 1 vector of parameters, and \beta_2 is a q x 1 vector of regression coefficients for the covariables. We make the usual assumptions that E(\epsilon) = 0 and Var(\epsilon) = I\sigma^2, as well as the assumption that [X_1 | X_2] has rank (r + q) \leq n.

The normal equations for estimating \beta_1 and \beta_2 are, in partitioned form,
\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix},

where b_1 and b_2 are the estimates of \beta_1 and \beta_2 respectively. From Theorem 2.6 we have

\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = (X'X)^g \begin{bmatrix} X_1'y \\ X_2'y \end{bmatrix}.

Using (4.2) we have

b_2 = Q^gX_2'[I - X_1(X_1'X_1)^gX_1']y,    Q = X_2'X_2 - X_2'X_1(X_1'X_1)^gX_1'X_2.

From Lemma 4.1 it follows that Q is non-singular, so that b_2 = Q^{-1}X_2'[I - X_1(X_1'X_1)^gX_1']y. In terms of the model the estimator becomes

b_2 = \beta_2 + Q^{-1}X_2'[I - X_1(X_1'X_1)^gX_1']\epsilon,

since [I - X_1(X_1'X_1)^gX_1']X_1 = 0 and X_2'[I - X_1(X_1'X_1)^gX_1']X_2 = Q. It follows that b_2 is an unbiased estimator for \beta_2 with variance covariance matrix given by

V(b_2) = Q^{-1}[X_2' - X_2'X_1(X_1'X_1)^gX_1'][X_2 - X_1(X_1'X_1)^gX_1'X_2]Q^{-1}\sigma^2 = Q^{-1}[X_2'X_2 - X_2'X_1(X_1'X_1)^gX_1'X_2]Q^{-1}\sigma^2 = Q^{-1}QQ^{-1}\sigma^2 = Q^{-1}\sigma^2.
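A sketch of the covariance-adjusted estimator in NumPy (the one-way layout with two covariables is an illustrative assumption, and pinv supplies (X_1'X_1)^g):

```python
import numpy as np

rng = np.random.default_rng(6)
X1 = np.hstack([np.ones((12, 1)), np.kron(np.eye(3), np.ones((4, 1)))])  # rank 3 < 4 cols
X2 = rng.standard_normal((12, 2))                   # covariables
beta2 = np.array([2.0, -1.0])
y = X1 @ np.array([1.0, 0.5, -0.5, 0.0]) + X2 @ beta2 + 0.1 * rng.standard_normal(12)

P = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T           # projection on the column space of X1
Q = X2.T @ (np.eye(12) - P) @ X2                    # non-singular by Lemma 4.1
b2 = np.linalg.solve(Q, X2.T @ (np.eye(12) - P) @ y)
# b2 is unbiased for beta2, with Var(b2) = Q^{-1} sigma^2
```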
5.7 Use of the Abbreviated Doolittle in Linear Estimation Problems

In a recent paper Rohde and Harvey [1964] have generalized a method of computing certain matrix products originally due to Aitken [1937]. The Abbreviated Doolittle procedure described in Chapter 4 can be used to give a compact formulation of Aitken's method, which computes matrix products of the form CA^{-1}B. Of more importance in statistical applications is the computation of the matrix products which arise in the theory of linear estimation.

To be specific, let us indicate how one can use the Doolittle procedure to compute the matrix products L(X'V^{-1}X)^gX'V^{-1}y and L(X'V^{-1}X)^gL', which arise in the theory developed in Section 5.3. Augment the X'V^{-1}X matrix and X'V^{-1}y column of the Doolittle by L' = X'C to obtain

[X'V^{-1}X | X'V^{-1}y | X'C = L'].
If one carries through the forward solution of the Doolittle on the augmented matrix it follows from Section 4.4 that the matrix composed of the "A" rows is

[(Z'Z)B | Z'y | Z'C],

and similarly the matrix composed of the "B" rows is

[B | (Z'Z)^rZ'y | (Z'Z)^rZ'C].

We see immediately that

C'Z(Z'Z)^rZ'y = C'XB^{-1}(B'^{-1}X'V^{-1}XB^{-1})^rB'^{-1}X'V^{-1}y = C'X(X'V^{-1}X)^rX'V^{-1}y = L(X'V^{-1}X)^gX'V^{-1}y,

and similarly C'Z(Z'Z)^rZ'C = L(X'V^{-1}X)^gL'. Thus the computation of the BLUE and variance covariance matrix of an estimable function can be obtained from the forward solution of the properly augmented Doolittle. Note that it is the fact that (X'V^{-1}X)^g can be any Generalized Inverse of (X'V^{-1}X) which allows the above procedure to function properly. By changing the matrices (X'V^{-1}X) and L properly one can use the above method to find the BLUE and variance-covariance matrix of the estimable functions considered in Sections 5.4, 5.5 and 5.6.
5.8 Tests of Linear Hypotheses

Associated with the linear model (5.1) one can consider the testing of various (linear) hypotheses about \beta. Such hypotheses are often expressed in the form

H_0: L\beta = 0    vs.    H_A: L\beta \neq 0.

We confine ourselves to strongly testable hypotheses (hypotheses where L is of full rank and L\beta is a set of (linearly) estimable functions). The statistic used to test the hypothesis H_0 will be composed of two parts. The numerator of the statistic consists of the sum of squares due to the hypothesis, SSH_0, divided by its degrees of freedom, Rank L, and the denominator of the statistic consists of the sum of squares due to error divided by its degrees of freedom, [n - Rank(X'X)]. The results on obtaining such quantities are widely known and available, so we shall be content to present the formulas as follows:

SSH_0 = [L(X'V^{-1}X)^gX'V^{-1}y]'[L(X'V^{-1}X)^gL']^{-1}[L(X'V^{-1}X)^gX'V^{-1}y]

and

SSE = y'[V^{-1} - V^{-1}X(X'V^{-1}X)^gX'V^{-1}]y.

Writing SSH_0 = y'B_0y and SSE = y'B_Ey, we note in passing that both B_0V and B_EV are idempotent and that B_0VB_E = 0.
Intuitively SSH_0 is a measure of that part of the total sum of squares which can be explained by the null hypothesis, while SSE is a measure of the part of the total sum of squares which cannot be explained by the model. The usual F statistic, defined by

(5.6)    F = \frac{SSH_0 / \text{Rank } L}{SSE / (n - \text{Rank } X'X)},

is thus seen to be a reasonable criterion to judge the import of the hypothesis. The divisors, [Rank L] and [n - Rank X'X], simply put the numerator and denominator on a unit basis. This intuitive interpretation of the F statistic can be greatly strengthened when certain additional distributional assumptions are included as part of the model.

More precisely, if y is assumed to have a multinormal distribution with E(\epsilon) = 0 and Var(\epsilon) = V, then the F statistic has, under the null hypothesis L\beta = 0, a central F distribution and allows one to make a judgment about the hypothesis via significance tests. When the null hypothesis is false, say L\beta = \delta \neq 0, then the distribution of F is a non-central F distribution with [Rank L] and [n - Rank X'X] degrees of freedom.
The above distributional results follow directly from the following two Lemmas due to Graybill [1961].

Lemma 5.2: If y has a multinormal distribution with E(y) = \mu and Var(y) = V, and BV is idempotent, then y'By has a non-central chi-square distribution with [Rank B] degrees of freedom and non-centrality parameter \lambda = \frac{1}{2}\mu'B\mu.

Lemma 5.3: If y has a multinormal distribution with E(y) = \mu and Var(y) = V, then Ay and y'By are independent if and only if AVB = 0.

From the above Lemmas it follows from previous remarks that:

(i) SSH_0 has a non-central chi-square distribution with [Rank L] degrees of freedom and non-centrality parameter \frac{1}{2}(L\beta)'[L(X'V^{-1}X)^gL']^{-1}(L\beta);
(ii) SSE has a central chi-square distribution with [n - Rank(X'V^{-1}X)] degrees of freedom;
(iii) SSH_0 and SSE are independent; and
(iv) F, as defined by (5.6), has a non-central F distribution with [Rank L] and [n - Rank(X'V^{-1}X)] degrees of freedom.
It is to be pointed out that the seemingly more general hypothesis H_0: L\beta = m can be tested by the above method simply by replacing the model by

E(y) = [X | 0]\begin{bmatrix} \beta \\ \gamma \end{bmatrix}

and testing the homogeneous hypothesis

[L | -I]\begin{bmatrix} \beta \\ \gamma \end{bmatrix} = 0.

The resulting SSH_0 is found to be

[L(X'V^{-1}X)^gX'V^{-1}y - m]'[L(X'V^{-1}X)^gL']^{-1}[L(X'V^{-1}X)^gX'V^{-1}y - m].

This simple device was pointed out by Dwyer [1959] in another context.
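A sketch of the test computations of this section in NumPy (V = I and the small balanced design are illustrative assumptions; the noncentral distribution theory is of course not exercised here):

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 1],
              [1, 0, 1], [1, 1, 0], [1, 0, 1]], dtype=float)
V = np.eye(6)
y = X @ np.array([1.0, 0.3, 0.0]) + rng.standard_normal(6)
L = np.array([[0.0, 1.0, -1.0]])                    # H0: the two treatments are equal

Vinv = np.linalg.inv(V)
Ag = np.linalg.pinv(X.T @ Vinv @ X)
Lb = L @ Ag @ X.T @ Vinv @ y
SSH = float(Lb @ np.linalg.solve(L @ Ag @ L.T, Lb))
SSE = float(y @ (Vinv - Vinv @ X @ Ag @ X.T @ Vinv) @ y)
df_h = np.linalg.matrix_rank(L)
df_e = 6 - np.linalg.matrix_rank(X.T @ Vinv @ X)
F = (SSH / df_h) / (SSE / df_e)                     # compare with an F(df_h, df_e) table
```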
CHAPTER 6. SUMMARY AND SUGGESTIONS FOR FUTURE RESEARCH

6.1 Summary
In this dissertation four types of Generalized Inverses of matrices were discussed. The sets defined by

g(A) = {A^g: AA^gA = A},
r(A) = {A^r: AA^rA = A and A^rAA^r = A^r},
n(A) = {A^n: AA^nA = A, A^nAA^n = A^n and (AA^n)' = AA^n},    and
t(A) = {A^t: AA^tA = A, A^tAA^t = A^t, (AA^t)' = AA^t and (A^tA)' = A^tA}

define the sets consisting of the Generalized, Reflexive Generalized, Normalized Generalized and Pseudo-Inverses respectively.
It can easily be shown that g(A) \supseteq r(A) \supseteq n(A) \supseteq t(A), with equality holding if and only if A is non-singular. In Chapter 2 the work of various authors regarding the sets i(A) for i = g, r, n and t was reviewed. In particular the fundamental results of Bose [1959] on g(A) and Penrose [1955] on t(A) were emphasized, and extensions to Hilbert space and other algebraic systems were discussed.
In Chapter 3 the various types of Generalized Inverses were studied from a theoretical viewpoint. The principle underlying this investigation was that an intuitively appealing method of studying the sets g(A), r(A), n(A) and t(A) is to relate a property of the matrix A to the corresponding property of a typical element of the set i(A), i = g, r, n and t. Properties investigated from this viewpoint were rank, symmetry, eigenvalues and eigenvectors. In particular it was shown that the rank of any matrix A^r in r(A) is equal to the rank of A,
and that if A is symmetric then the non-zero eigenvalues of any matrix A^n in n(A) are reciprocals of the non-zero eigenvalues of A. It was also shown that if A is symmetric then a symmetric Generalized Inverse always exists, and that the eigenvectors of the Pseudo-Inverse of A coincide with the eigenvectors of A if A is symmetric. Using results on the properties of the sets g(A), r(A), n(A) and t(A), certain relations between the sets were established. It was shown that A^g \in r(A) if and only if Rank(A^g) = Rank(A), and that A^g \in n(A) if and only if A^g = (A'A)^gA' where (A'A)^g \in g(A). This second result is a slight extension of one due to Zelen and Goldman [1963]. Relations were also obtained for t(A) by restricting A to be normal or symmetric.
In Chapter 4 computational methods were reviewed and the results of Chapters 2 and 3 used to develop some useful modifications. Of particular importance in applications is the following expression obtained for a Generalized Inverse of a partitioned matrix. If X'X is partitioned as

X'X = \begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix}

then a Generalized Inverse of X'X is given by

(6.1)    (X'X)^g = \begin{bmatrix} (X_1'X_1)^g + (X_1'X_1)^gX_1'X_2Q^gX_2'X_1(X_1'X_1)^g & -(X_1'X_1)^gX_1'X_2Q^g \\ -Q^gX_2'X_1(X_1'X_1)^g & Q^g \end{bmatrix}

where Q = X_2'X_2 - X_2'X_1(X_1'X_1)^gX_1'X_2. It was also shown, by suitably choosing (X_1'X_1)^g and Q^g, that the above expression can be used to find a Reflexive Generalized Inverse of X'X, and that clever partitioning of X'X and suitable choice of (X_1'X_1)^g permit the computation of Normalized Generalized and Pseudo-Inverses. Application of (6.1) to Bordered Matrices was discussed briefly, and alternative but equivalent expressions for previously known results were developed. The theory of Generalized Inverses was also used to develop the theory of the Abbreviated Doolittle technique when the coefficient matrix is singular.
Application of Generalized Inverses to least squares theory formed the content of Chapter 5. In particular, different formulations of certain well known results were obtained. The use of partitioned matrices and (6.1) allowed a simple treatment of an estimation problem present in multiple analysis of covariance. Also discussed were the Gauss-Markoff Theorem, Weighted Least Squares, Linear Models with restrictions, Linear Models with a Singular Variance-Covariance Matrix, and use of the Abbreviated Doolittle Technique.
6.2 Suggestions for Future Research

There are many possibilities for future research in the area of Generalized Inverses. First of all it is clear that nearly all results hold when the matrices are allowed to have complex elements, provided one interprets the transpose operation as the conjugate transpose operation and replaces symmetric by Hermitian. Of interest would be a careful delineation of the results which do hold for complex matrices and those which do not. The relation of a property of a matrix A and the corresponding property of a typical element of i(A), i = g, r, n and t,
99
can clearly be investigated for properties other than the obvious ones
studied here.
One by-product of such research might be different and
possibly more natural ways of characterizing the sets
n(A)
and
present.
(1)
t (A).
g(A), r(A),
Many minor though not uninteresting problems are also
For example:
If A is not synnnetric but An is, does this inr.ply that An
possesses Property R (the non-zero eigenvalues of Ag are
l
reciprocals of the non-zero eigenvalues of A and conversely)
g
or Property ~ (the eigenvectors of A corresponding to nonzero eigenvalues are identical to the eigenvectors of A with
(2 )
the associated eigenvalues being reciprocals)?
Does An continue to possess Property R if synnnetric is
l
replaced by normal in Theorem 3.l0?
(3) What happens to the multiplicities of the eigenvalues (including the zero eigenvalue) when a Generalized Inverse possesses
(4)
Property R ?
l
For what class of matrices does the Pseudo-Inverse necessarily
possess Properties R or R ?
l
2
An extensive list of such problems could easily be drawn up.
A topic of considerable interest which has not been touched upon in this work is the topic of convergence of a sequence of matrices. As is well known (see John [1956] or Marcus [1960]) there are many ways to define a matrix norm on the set of all square matrices. Among the most common and most natural are

\|A\|^2 = \text{l.u.b.} \frac{x'A'Ax}{x'x} = \max_{\lambda \in \Lambda} \lambda,    where \Lambda = \{\lambda: A'Ax = \lambda x \text{ for some } x \neq 0\},

and

\|A\| = [\text{Trace}(A'A)]^{1/2} = \left[\sum_{\lambda \in \Lambda} \lambda\right]^{1/2}.

The matrix norms defined above satisfy all the requirements of the norms discussed in Section 2.6. A convergence problem of interest can be simply stated as: given A_n \to A in norm, under what conditions does A_n^g \to A^g in (the same) norm?

From the eigenvalue and eigenvector properties discussed in Section 3.5 it would appear that for a Normalized Generalized Inverse and the Pseudo-Inverse one might find a simple solution to the convergence problem.
There are also numerous computational aspects of Generalized Inverses which can and should be investigated. For matrices which occur frequently in certain investigations (such as least squares analyses) it would be useful to have available a compilation of the various types of Generalized Inverses. Variants of formula (4.2) might prove useful in such computations. Little work has been done on the computation of Generalized Inverses in Hilbert Spaces. Recent work by Parzen [1959] indicates that it might be profitable to investigate the computation of Generalized Inverses in Hilbert spaces using the notions of Reproducing Kernel Hilbert Spaces.
The results of Section 5.5 should be investigated to see if they are basis-invariant. Some recent work by Kalman [1963] and C. R. Rao [1962] on the application of Generalized Inverses to singular multinormal distributions indicates that it should be possible to develop the theories of Regression and Analysis of Variance on an underlying singular multinormal distribution.

Aside from least squares it appears that Generalized Inverses might be applicable in quasi-least squares situations, which include the various minimum chi-square estimators and, in some cases, maximum likelihood estimators.
Let x = (x_1, x_2, ..., x_s) be a vector valued random variable with a probability distribution given by P_\theta, where \theta = (\theta_1, \theta_2, ..., \theta_r) is a vector of (real) parameters. The mean vector and the variance covariance matrix of x will be denoted by \mu(\theta) and V(\theta) respectively. Let x_1, x_2, ..., x_n represent independent observations on x and define z_n = (\sum_{i=1}^{n} x_i)/n. It is well known (see Ferguson [1959]) that the common minimum chi-square estimators for \theta can be found by minimizing a quadratic form M(\theta, z_n) in the deviations G(z_n) - G(\mu(\theta)), where

G(w) = (g_1(w_1, w_2, ..., w_s), ..., g_s(w_1, w_2, ..., w_s))

for suitable choices of the functions g_i, the weights of the quadratic form involving the matrix of partial derivatives (\partial g_i/\partial w_j). The usual method of choosing \theta to minimize M(\theta, z_n) consists of expanding G in a Taylor series, ignoring terms higher than the second, and minimizing the resulting quadratic expression. Such a procedure yields linear equations with a coefficient matrix depending on the unknown \theta.
Iteration is usually applied to solve such a system. Often the resulting equations are singular and the usual procedures call for some sort of reparametrization in order to solve the system. It appears to the author that one of the types of Generalized Inverses might prove useful to avoid reparametrization.

The previously mentioned results concerning statistical applications of Generalized Inverses indicate that much work remains to be done in extending, simplifying and unifying these applications. Although certain of these developments might achieve little from a practical viewpoint, it appears that they would be extremely important from the standpoint of presenting a logical development of statistical analysis in the classroom.
CHAPTER 7. LIST OF REFERENCES

Aitken, A. C. 1937. The evaluation of a certain triple product matrix. Proc. of the Royal Soc. of Edinburgh 57:172-181.

Anderson, R. L. 1959. Unpublished lectures on the applications of least squares in economics. N. C. State of the University of North Carolina at Raleigh, Raleigh, N. C.

Ben-Israel, A. and S. J. Wersan. 1962. A least square method for computing the generalized inverse of an arbitrary complex matrix. Northwestern University, O. N. R. Research Memorandum No. 61.

Bodewig, E. 1959. Matrix Calculus, 2nd Edition. North Holland Publishing Co., Amsterdam.

Boot, J. C. G. 1963. The computation of the generalized inverse of singular or rectangular matrices. Amer. Math. Monthly 70(3):302-303.

Bose, R. C. 1959. Unpublished lecture notes on analysis of variance. Univ. of North Carolina at Chapel Hill, Chapel Hill, N. C.

Desoer, C. A. and B. H. Whalen. 1963. A note on Pseudo-Inverses. SIAM Jour. 11(2):442-447.

Drazin, M. P. 1958. Pseudo-Inverses in associative rings and semigroups. Amer. Math. Monthly 65:506-514.

Dwyer, P. S. 1959. Generalizations of a Gaussian theorem. Annals of Math. Stat. 29:106-117.

Eisenhart, C. 1963. The background and evolution of the method of least squares. 34th Session of the Inter. Stat. Inst., Ottawa, Canada, 21-29 August 1963.

Fadeeva, V. N. 1959. Computational Methods of Linear Algebra. Dover Publications, Inc., New York.

Ferguson, T. S. 1959. A method of generating best asymptotically normal estimators with application to the estimation of bacterial densities. Annals of Math. Stat. 29:1046-1062.

Foulis, D. J. 1963. Relative inverses in Baer*-semi-groups. Michigan Math. Jour. 10(1):65-84.

Graybill, F. A. 1961. An Introduction to Linear Statistical Models, Vol. I. McGraw Hill Book Co., New York.

Greville, T. N. E. 1959. The Pseudo-Inverse of a rectangular or singular matrix and its application to the solution of systems of linear equations. SIAM Review 1(1):38-43.

Halmos, P. R. 1951. Introduction to Hilbert Space and the Theory of Spectral Multiplicity. Chelsea Publishing Co., New York.

Halmos, P. R. 1958. Finite-Dimensional Vector Spaces. D. Van Nostrand Co., Inc., Princeton, New Jersey.

John, F. 1956. Advanced Numerical Analysis. New York University Inst. of Math. Sciences.

Kalman, R. E. 1963. New methods in Wiener filtering. Proceedings of the First Symposium on Engineering Applications of Random Function Theory and Probability. John Wiley and Sons, New York.

Lehmann, E. L. 1950. Notes on the Theory of Estimation, Chapters I to IV. Associated Students' Store, Univ. of California, Berkeley 4, California.

Marcus, M. 1960. Basic theorems in matrix theory. National Bureau of Standards, Applied Math. Series 57.

Moore, E. H. 1920. Abstract. Bull. of the Amer. Math. Soc. 26:394-395.

Moore, E. H. 1935. General Analysis. Memoirs of the Amer. Philosophical Soc., Vol. I.

Munn, W. D. 1962. Pseudo-Inverses in semi-groups. Proc. of the Cambridge Philosophical Soc. 57:247-250.

Parzen, E. 1959. Statistical inference on time series by Hilbert space methods, I. Tech. Report No. 23, Applied Math. and Stat. Lab., Stanford University.

Penrose, R. 1955. A generalized inverse for matrices. Proc. of the Cambridge Philosophical Soc. 51:406-413.

Penrose, R. 1956. On best approximate solutions of linear matrix equations. Proc. of the Cambridge Philosophical Soc. 52:17-19.

Plackett, R. L. 1960. Principles of Regression Analysis. Oxford University Press, London.

Rado, R. 1956. Note on generalized inverses of matrices. Proc. of the Cambridge Philosophical Soc. 52:600-601.

Rao, C. R. 1952. Advanced Statistical Methods in Biometric Research. John Wiley and Sons, New York.

Rao, C. R. 1962. A note on a generalized inverse of a matrix with applications to problems in mathematical statistics. Jour. of the Royal Stat. Soc., Series B 24(1):152-158.

Rao, M. M. and J. S. Chipman. 1964. Projections, generalized inverses and quadratic forms. To be published in Jour. of Math. Analysis and Applications.

Roy, S. N. 1953. Some notes on least squares and analysis of variance - I. Inst. of Stat. Mimeo. Series No. 81.

Roy, S. N. 1957. Some Aspects of Multivariate Analysis. John Wiley and Sons, New York.

Rohde, C. A. and J. R. Harvey. 1964. Unified least squares analysis. Submitted for publication to Jour. of the Amer. Stat. Assoc.

Scheffé, H. 1959. The Analysis of Variance. John Wiley and Sons, New York.

Simmons, G. F. 1963. Introduction to Topology and Modern Analysis. McGraw Hill Book Co., New York.

Van der Vaart, R. 1964. Appendix to "Generalizations of Wilcoxon statistic for the case of k samples" by Elizabeth Yen. To be published in Statistica Neerlandica.

Zelen, M. and A. J. Goldman. 1963. Weak generalized inverses and minimum variance linear unbiased estimation. Math. Research Center Tech. Report 314, U. S. Army, Madison, Wisconsin.