COMBINING TIME SERIES AND CROSS-SECTION
DATA IN SIMULTANEOUS LINEAR EQUATIONS

By

ASHIQ HUSSAIN

1966

Institute of Statistics
Mimeograph Series No. 505
Raleigh
1966

NORTH CAROLINA STATE UNIVERSITY
AT RALEIGH
ABSTRACT
HUSSAIN, ASHIQ. Combining Time Series and Cross-Section Data in Simultaneous Linear Equations. (Under the direction of THOMAS DUDLEY WALLACE.)
This thesis is concerned with the estimation of parameters in a system of simultaneous linear equations by the combined use of time series and cross-sectional data. An error component disturbance model is postulated: the disturbance term in each equation of the system is assumed to have three mutually independent random components - one associated with time, another associated with cross-sectional units, and a third representing measurement error. The usual distributional properties are ascribed to these error components, and methods of estimation appropriate for the model are developed.
A two-stage estimation procedure is given for the reduced-form parameters. In the first stage, covariance matrices $\Sigma^*_{\mu\mu'}$ ($\mu, \mu' = 1, 2, \ldots, M$) of the reduced-form disturbances are estimated from the ordinary least squares residuals. In the second stage, two sets of estimators for the reduced-form parameters are derived: (i) single equation estimators, which result when Aitken's two-stage method is applied to each reduced-form equation separately; and (ii) generalized estimators, obtained from the entire set of reduced-form equations. Both sets of estimators compare favorably with the ordinary least squares estimators.
For the estimation of structural parameters, two methods, designated the "Two-Stage Generalized Least Squares Method" and the "Three-Stage Generalized Least Squares Method," are given. These methods parallel respectively the 2-SLS and the 3-SLS methods for the ordinary model of simultaneous linear equations. For a system of exactly identified equations, a third method, called the "Indirect Generalized Least Squares Method," is also available. All three methods are based upon the covariance matrices of reduced-form disturbances; the estimation of covariance matrices of structural disturbances is avoided. The estimators of structural parameters obtained by these methods are found to have some optimal large sample properties.
Finally, some special cases, including the dummy variable model, are considered. In this last case, it is found that, although BLU estimators can be obtained for the reduced-form parameters, the estimation of structural parameters is made difficult by the fact that the number of cross-sectional units and the number of time intervals are both fixed.
COMBINING TIME SERIES AND CROSS-SECTION DATA
IN SIMULTANEOUS LINEAR EQUATIONS
by
ASHIQ HUSSAIN
A thesis submitted to the Graduate Faculty
of North Carolina State University
at Raleigh
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
DEPARTMENT OF EXPERIMENTAL STATISTICS
RALEIGH
1966
APPROVED BY:
Chairman of Advisory Committee
BIOGRAPHY
Born:
Sargodha, West Pakistan
January 1, 1924
Married:
Fahmida, June 27, 1954
Previous Work:
Undergraduate:
B.A. (Mathematics) 1947
University of Punjab, Lahore, Pakistan
Graduate:
M.A. (Mathematics) 1950
University of Punjab, Lahore, Pakistan
M.A. (Economics) 1962
University of Peshawar, Peshawar, Pakistan
M.E.S. 1965
North Carolina State University, Raleigh
Employment:
Lecturer in Mathematics
Murray College, Sialkot, Pakistan
and
Pakistan Military Academy, Kakul, Pakistan
1950 - 1954
Senior Lecturer in Mathematics
University of Peshawar, Peshawar, Pakistan
1955 -
ACKNOWLEDGMENTS
I want to acknowledge my indebtedness to Dr. Thomas Dudley Wallace,
Chairman of my
committee,
who inspired my interest in the subject of
simultaneous linear equations and from whom I received unfailing
guidance and encouragement throughout my stay at North Carolina State
University.
I am also thankful to the other members of my committee for their
advice and helpful criticism.
I must thank the Agency for International Development, Department
of State, United States Government for financing my education here and
the faculty and staff of the Department of Experimental Statistics whose
cooperation and courtesy made my stay a pleasant one.
A special word of thanks should go to Mrs. Jo Ann Beauchaine for her
excellent typing of this thesis.
Lastly, I owe a debt of gratitude to my wife, Fahmida, who had to shoulder the burden of supporting our children for the three years that I was in the United States, earning nothing, but who never failed to send a word of encouragement from across the seas, and to my father, who shared this burden with her.
TABLE OF CONTENTS

1.0  INTRODUCTION
2.0  ORDINARY SIMULTANEOUS LINEAR EQUATION MODELS
     2.1  Stochastic Specifications
     2.2  Identification
     2.3  Reduced-Form Estimation
     2.4  Structural Estimation
          2.4.1  Indirect Least Squares (ILS)
          2.4.2  Two-Stage Least Squares (2-SLS)
          2.4.3  Three-Stage Least Squares (3-SLS)
          2.4.4  Limited Information Single Equation (LISE) or Least Variance Ratio Method
          2.4.5  K-Class Estimators
          2.4.6  Full Information Least Generalized Residual Variance Method
3.0  SOME THEOREMS ON STOCHASTIC CONVERGENCE
     3.1  Useful Inequalities
     3.2  Stochastic Convergence, Univariate Case
     3.3  Convergence of Sequences of Random Vectors
4.0  ERROR COMPONENT MODEL: REDUCED-FORM ESTIMATION
     4.1  Description of the System
     4.2  Estimation of Covariance Matrices of Reduced-Form Disturbances
     4.3  Some Useful Results
     4.4  Estimators of Reduced-Form Parameters
5.0  ERROR COMPONENT MODEL: STRUCTURAL ESTIMATION OF AN EXACTLY IDENTIFIED SYSTEM
     5.1  Notation
     5.2  Indirect Generalized Least Squares
     5.3  Two-Stage Generalized Least Squares
     5.4  Three-Stage Generalized Least Squares
6.0  ERROR COMPONENT MODEL: STRUCTURAL ESTIMATION OF OVERIDENTIFIED SYSTEMS
     6.1  Inadequacy of Indirect Generalized Least Squares Method
     6.2  Two Lemmas
     6.3  Two-Stage Generalized Least Squares Estimators
     6.4  Three-Stage Generalized Least Squares Estimators
7.0  ERROR COMPONENT MODEL: SOME SPECIAL CASES
     7.1  Fixed Cross-Sectional Effects
          7.1.1  Estimators of Reduced-Form Parameters
          7.1.2  Structural Estimation
     7.2  Cross-Sectional and Time Effects Random with Finite Nonzero Expectations
     7.3  Dummy Variable Specifications
8.0  SUMMARY AND CONCLUSIONS
LIST OF REFERENCES
1.0 INTRODUCTION

This thesis is concerned with the estimation of parameters in a system of simultaneous linear equations by the combined use of time series and cross-section data. The problem was suggested by Hildreth¹ in 1950; but so far as the present writer knows, it has received little attention. Most of the work done thus far on simultaneous equations is based on time series data.
The model which is ordinarily used is:

$$y_t'\,A + x_t'\,B + u_t' = 0, \qquad (1.1)$$

where $y_t' = (y_{1t}, y_{2t}, \ldots, y_{Mt})$ is the vector of observed values of the M endogenous variables for the t-th time period; $x_t' = (x_{1t}, x_{2t}, \ldots, x_{Kt})$ is the vector of observed values of the K exogenous variables for the t-th time period; and $u_t' = (u_{1t}, u_{2t}, \ldots, u_{Mt})$ is the vector of values of the M unobservable random variables in the system, specified by the relations

$$E(u_{\mu t}) = 0 \text{ for all } t \quad (\mu = 1, 2, \ldots, M),$$
$$E(u_{\mu t}u_{\mu' t'}) = \sigma_{\mu\mu'}\ (\text{finite}), \text{ if } t' = t; \quad = 0, \text{ if } t' \neq t \qquad (\mu, \mu' = 1, 2, \ldots, M).$$

A is an M x M nonsingular matrix of constants with diagonal elements equal to -1; and B is a K x M matrix of constants.
A few cross-sectional studies have been made using the same general
model (model (1.1)) but data consisting of observations for different
cross-sectional units.
But no attempt has been made to pool time series and cross-section
data for the estimation of simultaneous equations.
¹Clifford Hildreth, Professor, Department of Economics, University of Minnesota, Minneapolis, Minnesota, Combining cross-section data and time series, unpublished Cowles Commission Discussion Paper, 1950.
In the classical single-equation problem two different models have been suggested for combining time series and cross-section data: (i) the "dummy-variable" model, and (ii) the error-component model.

In the "dummy-variable" model, time and cross-sectional effects are assumed constant. A "dummy-variable" version of the simultaneous linear equation system will be the following:
$$y_{it}'\,A + x_{it}'\,B + \lambda_i' + \tau_t' + u_{it}' = 0, \qquad (1.2)$$

where A and B are as defined in (1.1);
$y_{it}$ is an M x 1 vector of observations on the endogenous variables for the i-th cross-section and t-th time period;
$x_{it}$ is a K x 1 vector of observations on the exogenous variables for the i-th cross-section and t-th time period;
$\lambda_i$ is an M x 1 vector of constant effects associated with the i-th cross-section;
$\tau_t$ is an M x 1 vector of constant effects associated with the t-th time period; and
$u_{it}$ is an M x 1 vector of values of unobservable random variables specified by the relations

$$E(u_{\mu it}) = 0,$$
$$E(u_{\mu it}u_{\mu' i't'}) = \sigma_{\mu\mu'}\ (\text{finite}), \text{ if } i' = i \text{ and } t' = t; \quad = 0, \text{ otherwise}, \quad \text{for } \mu, \mu' = 1, 2, \ldots, M.$$
The model is unsuitable for two reasons. First, it does not take cognizance of random effects which may cause variations from time to time and from one cross-sectional unit to another. Second, the number of cross-sectional units as well as the number of time intervals must be assumed fixed; otherwise the model does not have any practical value. But this very assumption stands in the way of structural estimation. For a subset of the explanatory variables in each structural equation consists of endogenous variables which are correlated with the error term. Consequently the principle of least squares cannot be applied.
In the error-component model, it is assumed that the error term has three mutually independent random components - a component associated with time, another associated with cross-sectional units, and a third one representing measurement error. We adopt this model. We assume that the disturbance term of the $\mu$th equation of the system is of the form

$$\varepsilon_{\mu it} = u_{\mu i} + v_{\mu t} + w_{\mu it}, \qquad (1.3)$$

where $u_{\mu i}$, $v_{\mu t}$ and $w_{\mu it}$ are mutually independent random variables with zero means and

$$E(u_{\mu i}u_{\mu' i'}) = \sigma^u_{\mu\mu'}\ (\text{finite}), \text{ if } i' = i; \quad = 0, \text{ if } i' \neq i, \quad \text{for } \mu, \mu' = 1, 2, \ldots, M;$$
$$E(v_{\mu t}v_{\mu' t'}) = \sigma^v_{\mu\mu'}\ (\text{finite}), \text{ if } t' = t; \quad = 0, \text{ if } t' \neq t, \quad \text{for } \mu, \mu' = 1, 2, \ldots, M;$$
$$E(w_{\mu it}w_{\mu' i't'}) = \sigma^w_{\mu\mu'}\ (\text{finite}), \text{ if } i' = i \text{ and } t' = t; \quad = 0, \text{ otherwise}, \quad \text{for } \mu, \mu' = 1, 2, \ldots, M.$$

It is further assumed that the components of the error terms are normally distributed. Our model is, therefore,

$$y_{it}'\,A + x_{it}'\,B + \varepsilon_{it}' = 0, \qquad (1.4)$$

with disturbances $\varepsilon_{it}' = (\varepsilon_{1it}, \varepsilon_{2it}, \ldots, \varepsilon_{Mit})$ as specified in (1.3).
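The following is a minimal simulation sketch, not part of the thesis, intended only to illustrate the three-component error structure (1.3); the variance values and array names are illustrative assumptions.

```python
# Minimal sketch (assumed values): eps[mu,i,t] = u[mu,i] + v[mu,t] + w[mu,i,t],
# with the three normal components drawn independently, as specified in (1.3).
import numpy as np

rng = np.random.default_rng(0)
M, n, T = 2, 50, 20                                   # equations, cross-sections, periods
sigma_u, sigma_v, sigma_w = 1.0, 0.5, 0.25            # illustrative variances

u = rng.normal(0, np.sqrt(sigma_u), size=(M, n, 1))   # cross-sectional component
v = rng.normal(0, np.sqrt(sigma_v), size=(M, 1, T))   # time component
w = rng.normal(0, np.sqrt(sigma_w), size=(M, n, T))   # measurement error
eps = u + v + w                                       # disturbance eps[mu, i, t]

# Disturbances of the same unit i are equicorrelated over t (through u),
# and disturbances at the same t are equicorrelated over i (through v).
print(eps.shape)   # (M, n, T)
```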
A good rationalization of this type of error structure is given by Mundlak [13]. Our object in this research is to develop methods for estimating the parameters of the system of equations - that is, the elements of the matrices A and B - which are appropriate to this error model. It will be shown in the sequel that our methods yield estimators which have some optimal properties.
The plan of this thesis is as follows:
In Chapter 2.0 we present a
review of the existing material on the subject of simultaneous linear
equations, discussing briefly some of the well-known methods of
structural and reduced-form estimation. All these methods are based on model (1.1).
This is followed by a chapter on stochastic convergence;
a number of simple results on the convergence of sequences of random
variables are given, which are helpful in the derivation of large-sample
properties of estimators.
The remaining chapters are devoted to the main
topic - estimation of the reduced-form and structural parameters in the
model (1.4).
2.0 ORDINARY SIMULTANEOUS LINEAR EQUATION MODELS

2.1 Stochastic Specifications

By far the most important contributions to the subject of simultaneous linear equations are those which use the model (1.1). We have a rich and growing body of literature in this area, and a variety of methods for structural and reduced-form estimation are available. Excellent treatment of these methods is given by Theil [17], Goldberger [8] and Johnston [9]. We review some of these methods briefly here merely to provide a frame of reference for what is to follow in succeeding chapters.
Let T be the number of observations made on the observable variables of the system (1.1). We can then write the system of equations compactly as

$$YA + XB + U = 0, \qquad (2.1)$$

where Y is a T x M matrix of observations on the M endogenous variables, X is a T x K matrix of observations on the K exogenous variables, and U is a T x M matrix of values of the M unobservable random variables $(u_1, u_2, \ldots, u_M)$.

The equation (2.1) will be referred to as the structure, while the equations

$$Y = X\Pi + V, \qquad \Pi = -BA^{-1}, \quad V = -UA^{-1}, \qquad (2.2)$$

will be called the reduced form of the system.
The exogenous variables $x_t' = (x_{1t}, x_{2t}, \ldots, x_{Kt})$ are determined outside the system. The observation matrix $X' = (x_1, x_2, \ldots, x_T)$ is assumed to be generated by some mechanism independently of the disturbances, so that

$$E(U|X) = E(U) = 0, \qquad E(Y|X) = E(Y).$$

It is further assumed that $E\!\left(\frac{X'X}{T}\right) = \Sigma_{XX}$ is nonsingular, so that $\operatorname{plim}\frac{1}{T}(X'X)$ exists and is positive definite.

It has been shown that the specifications given above give the same asymptotic results as would be given by the specification that $(x_1, \ldots, x_T)$ are nonstochastic but subject to the condition that the ordinary limit

$$\lim_{T\to\infty}\frac{1}{T}(X'X) \qquad (2.3)$$

exists and is nonsingular. In what follows we shall, for the sake of convenience, ignore the process generating observations on the exogenous variables, regarding them as nonstochastic but subject to the condition (2.3).

It is clear that

$$E(u_t u_t') = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1M} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2M} \\ \vdots & & & \vdots \\ \sigma_{M1} & \sigma_{M2} & \cdots & \sigma_{MM} \end{bmatrix} = \Sigma_{uu}, \qquad (2.4)$$

and that $u_\mu$, the disturbance vector of the $\mu$th structural equation, has covariance matrix

$$E(u_\mu u_\mu') = \sigma_{\mu\mu}\,I_T.$$
Continuing, we note that $V = -UA^{-1}$, so that $E(V) = 0$. Let $v_\mu$ be the disturbance vector of the $\mu$th equation in the reduced form (2.2). Clearly

$$v_\mu = -\sum_{r=1}^{M} a^{\mu r}\,u_r,$$

where $a^{\mu r}$ ($r = 1, 2, \ldots, M$) are the elements of the $\mu$th column of $A^{-1}$. Or $v_{\mu t} = -\sum_{r=1}^{M} a^{\mu r}u_{rt}$, so that

$$E(v_{\mu t}v_{\mu' t'}) = 0 \quad\text{for } t' \neq t \qquad (\mu, \mu' = 1, 2, \ldots, M), \qquad (2.6)$$

and

$$E(v_{\mu t}v_{\mu' t}) = \sum_{r=1}^{M}\sum_{r'=1}^{M} a^{\mu r}\,a^{\mu' r'}\,\sigma_{rr'} \qquad (\mu, \mu' = 1, 2, \ldots, M),$$

which shows that the reduced-form disturbances are temporally uncorrelated.
2.2 Identification

We assume that the system (1.1) is identified - all underidentified equations are deleted from the system. This means that given $\Pi$ there is at least one solution of

$$-B = \Pi A. \qquad (2.7)$$

There must be a sufficient number of a priori restrictions on the elements of A and B if (2.7) is to have at least one solution. We have already imposed one restriction, namely, that the diagonal elements of A are equal to -1.
Let the remaining restrictions take the form of zero restrictions - that is, some of the elements of A and B are zeros. Let $\alpha^*_\mu$ and $\beta^*_\mu$ be the $\mu$th column vectors of A and B, respectively. Let us assume that $\ell_\mu$ elements of $\alpha^*_\mu$ are nonzero. One of these nonzero elements (the $\mu$th element) is -1. Therefore, rearranging these elements we can write

$$\alpha^*_\mu = \begin{bmatrix} \alpha_\mu \\ 0 \end{bmatrix},$$

where $\alpha_\mu$ is $\ell_\mu \times 1$, having one element equal to -1 and all other elements nonzero. Similarly, assuming that $K_\mu$ elements of $\beta^*_\mu$ are nonzero, while all the remaining elements are zeros, we can write

$$\beta^*_\mu = \begin{bmatrix} \beta_\mu \\ 0 \end{bmatrix},$$

where $\beta_\mu$ is $K_\mu \times 1$.
From (2.7) it follows that

$$-\beta^*_\mu = \Pi\,\alpha^*_\mu. \qquad (2.8)$$

Partitioning $\Pi$ as

$$\Pi = \begin{bmatrix} \Pi^\mu_{11} & \Pi^\mu_{12} \\ \Pi^\mu_{21} & \Pi^\mu_{22} \end{bmatrix},$$

where $\Pi^\mu_{11}$ is $K_\mu \times \ell_\mu$, $\Pi^\mu_{21}$ is $(K - K_\mu)\times\ell_\mu$, $\Pi^\mu_{12}$ is $K_\mu\times(M - \ell_\mu)$, and $\Pi^\mu_{22}$ is $(K - K_\mu)\times(M - \ell_\mu)$, and using (2.8), we have

$$-\beta_\mu = \Pi^\mu_{11}\,\alpha_\mu, \qquad 0 = \Pi^\mu_{21}\,\alpha_\mu. \qquad (2.9)$$

The $\mu$th equation in the structure is

$$Y\alpha^*_\mu + X\beta^*_\mu + u_\mu = 0, \quad\text{or}\quad y_\mu = Y^{(\mu)}\alpha_\mu + X_\mu\beta_\mu + u_\mu. \qquad (2.10)$$

Here $Y^{(\mu)}$ is the matrix of the $(\ell_\mu - 1)$ endogenous variables other than $y_\mu$ included in the $\mu$th structural equation, and $X_\mu$ is the matrix of observations on the $K_\mu$ exogenous variables included in this equation.
Suppose that we know $\Pi$. If the equations (2.9) are solvable for $\alpha_\mu$ and $\beta_\mu$, we can immediately find the parameters of the $\mu$th structural equation (2.10). Now, by a theorem of matrix algebra (Goldberger [8], p. 23), a necessary condition for the equation $0 = \Pi^\mu_{21}\alpha_\mu$ to have a solution is that

$$K - K_\mu \geq \ell_\mu - 1, \quad\text{or}\quad K \geq K_\mu + \ell_\mu - 1. \qquad (2.11)$$

The solution of the equation $0 = \Pi^\mu_{21}\alpha_\mu$, when substituted for $\alpha_\mu$ in the first equation of (2.9), will yield a value of $\beta_\mu$. Thus, given a knowledge of the reduced-form parameters $\Pi$, we can at once determine the $\mu$th structural equation.

Three situations arise. When $K = K_\mu + \ell_\mu - 1$, the nullity of the $(K - K_\mu)\times\ell_\mu$ matrix $\Pi^\mu_{21}$ is exactly equal to 1, so that there is a unique solution $(\alpha_\mu, \beta_\mu)$ of equation (2.9), and the $\mu$th structural equation is exactly identified. When $K > K_\mu + \ell_\mu - 1$, the nullity of $\Pi^\mu_{21}$ exceeds 1; in this case the equations (2.9) admit more than one solution and the $\mu$th structural equation is overidentified. Finally, when $K < K_\mu + \ell_\mu - 1$, equations (2.9) have no solution and the $\mu$th structural equation is said to be underidentified. The underidentified relations are indeterminate, so that only the first two cases are relevant to the estimation problem.
2.3 Reduced-Form Estimation

The reduced-form equations are

$$y_\mu = X\Pi_\mu + v_\mu \qquad (\mu = 1, 2, \ldots, M),$$

where $\Pi_\mu$ is the $\mu$th column vector of $\Pi$. We have seen that $E(v_{\mu t}v_{\mu t'}) = 0$ for $t' \neq t$ and that $E(v^2_{\mu t}) = \omega_{\mu\mu} = \sum_{r=1}^{M}\sum_{r'=1}^{M} a^{\mu r}\sigma_{rr'}a^{\mu r'}$ for all t, so that

$$E(v_\mu v_\mu') = \omega_{\mu\mu}\,I_T. \qquad (2.12)$$
Hence the ordinary least squares method applied to each reduced-form equation yields unbiased estimates. We have, therefore,

$$\hat{\Pi}_\mu = (X'X)^{-1}X'y_\mu = \Pi_\mu + (X'X)^{-1}X'v_\mu, \qquad (2.13)$$

which have covariance matrix

$$E(\hat{\Pi}_\mu - \Pi_\mu)(\hat{\Pi}_\mu - \Pi_\mu)' = (X'X)^{-1}\omega_{\mu\mu} \qquad (\mu = 1, 2, \ldots, M). \qquad (2.14)$$

It will be shown that if we take all the reduced-form equations and simultaneously estimate $\Pi_1, \Pi_2, \ldots, \Pi_M$, we obtain the same estimates as are given by the equation-by-equation ordinary least squares method.
Write the reduced-form equations in the form

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} X\Pi_1 \\ X\Pi_2 \\ \vdots \\ X\Pi_M \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_M \end{bmatrix}. \qquad (2.15)$$

If we write

$$\bar{X} = \begin{bmatrix} X & 0 & \cdots & 0 \\ 0 & X & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & X \end{bmatrix} \qquad (MT\times MK),$$

then the equations (2.15) take the form

$$y = \bar{X}\,\Pi + v, \qquad (2.16)$$
where $\Pi = (\Pi_1', \Pi_2', \ldots, \Pi_M')'$ is $MK\times 1$ and $v = (v_1', v_2', \ldots, v_M')'$ is $TM\times 1$, with

$$E(v\,v') = \begin{bmatrix} \omega_{11}I & \omega_{12}I & \cdots & \omega_{1M}I \\ \omega_{21}I & \omega_{22}I & \cdots & \omega_{2M}I \\ \vdots & & & \vdots \\ \omega_{M1}I & \omega_{M2}I & \cdots & \omega_{MM}I \end{bmatrix} = \Omega\otimes I, \qquad \Omega = [\omega_{\mu\mu'}]. \qquad (2.17)$$
By Aitken's generalized least squares method we have

$$\hat{\Pi} = [\bar{X}'(\Omega\otimes I)^{-1}\bar{X}]^{-1}\bar{X}'(\Omega\otimes I)^{-1}y.$$

These estimates are B.L.U. Denoting the elements of $\Omega^{-1}$ by $\omega^{jk}$ ($j, k = 1, 2, \ldots, M$), we have

$$\hat{\Pi} = \begin{bmatrix} \omega^{11}(X'X) & \omega^{12}(X'X) & \cdots & \omega^{1M}(X'X) \\ \omega^{21}(X'X) & \omega^{22}(X'X) & \cdots & \omega^{2M}(X'X) \\ \vdots & & & \vdots \\ \omega^{M1}(X'X) & \omega^{M2}(X'X) & \cdots & \omega^{MM}(X'X) \end{bmatrix}^{-1}\begin{bmatrix} \sum_{j=1}^{M}\omega^{1j}X'y_j \\ \sum_{j=1}^{M}\omega^{2j}X'y_j \\ \vdots \\ \sum_{j=1}^{M}\omega^{Mj}X'y_j \end{bmatrix}. \qquad (2.18)$$

Since

$$\sum_{j=1}^{M}\omega_{\mu j}\,\omega^{j\ell} = \begin{cases} 1, & \text{if } \mu = \ell, \\ 0, & \text{if } \mu \neq \ell, \end{cases} \qquad (2.19)$$

we have from (2.18)

$$\hat{\Pi}_\mu = (X'X)^{-1}X'y_\mu \qquad (\mu = 1, 2, \ldots, M),$$

which shows that the classical least squares method applied to each equation separately yields B.L.U. estimators. Further, the covariance matrix of $\hat{\Pi}_\mu$ is $(X'X)^{-1}\omega_{\mu\mu} = \left(\frac{X'X}{T}\right)^{-1}\frac{\omega_{\mu\mu}}{T}$, which tends to zero as $T\to\infty$. This implies $\hat{\Pi}_\mu \xrightarrow{P} \Pi_\mu$.
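The following is a small numerical check, not from the thesis, of the result just derived: when the stacked disturbance covariance is $\Omega\otimes I_T$, Aitken's generalized least squares estimate coincides with equation-by-equation ordinary least squares. All data and the names X, Y, Omega are simulated, illustrative assumptions.

```python
# Sketch: per-equation OLS equals stacked GLS when E(vv') = Omega kron I_T.
import numpy as np

rng = np.random.default_rng(1)
T, K, M = 200, 3, 2
X = rng.normal(size=(T, K))
Pi = rng.normal(size=(K, M))
Omega = np.array([[1.0, 0.3], [0.3, 0.5]])
Y = X @ Pi + rng.multivariate_normal(np.zeros(M), Omega, size=T)

Pi_ols = np.linalg.solve(X.T @ X, X.T @ Y)          # equation-by-equation OLS

Xbar = np.kron(np.eye(M), X)                        # block-diagonal regressor matrix (2.16)
W = np.kron(np.linalg.inv(Omega), np.eye(T))        # (Omega kron I)^{-1}
y = Y.T.reshape(-1)                                 # stacked (y_1', ..., y_M')'
Pi_gls = np.linalg.solve(Xbar.T @ W @ Xbar, Xbar.T @ W @ y).reshape(M, K).T

print(np.allclose(Pi_ols, Pi_gls))                  # True: the two estimates coincide
```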
2.4 Structural Estimation

2.4.1 Indirect Least Squares (ILS)

Suppose that we have estimated $\Pi$ by $\hat{\Pi}$; $\hat{\Pi}$ is clearly consistent. Let us recall the equations (2.9):

$$-\beta_\mu = \Pi^\mu_{11}\alpha_\mu, \qquad 0 = \Pi^\mu_{21}\alpha_\mu.$$

If the equations are exactly identified, $K - K_\mu = \ell_\mu - 1$. Therefore the nullity of $\hat{\Pi}^\mu_{21}$, which is a $(K - K_\mu)\times\ell_\mu$ matrix, is exactly equal to 1, and there is a unique solution of

$$0 = \hat{\Pi}^\mu_{21}\,\hat{\alpha}_\mu.$$

Denote it by $\hat{\alpha}_\mu$; then $\hat{\alpha}_\mu$ is a consistent estimate of $\alpha_\mu$. Therefore

$$\hat{\beta}_\mu = -\hat{\Pi}^\mu_{11}\,\hat{\alpha}_\mu$$

is a consistent estimate of $\beta_\mu$. This holds for all $\mu = 1, 2, \ldots, M$.

Thus, in order to find consistent estimates of the structural parameters of exactly identified systems, we obtain estimates $\hat{\Pi}$ of the reduced-form parameters $\Pi$ by the ordinary least squares method and then solve equations like those above.

Now if $K > K_\mu + \ell_\mu - 1$, the nullity of $\hat{\Pi}^\mu_{21}$ exceeds one, and there are more than one consistent estimates of $\alpha_\mu$ corresponding to a consistent estimate $\hat{\Pi}^\mu_{21}$ of $\Pi^\mu_{21}$. Each of these estimates, substituted in $\hat{\beta}_\mu = -\hat{\Pi}^\mu_{11}\hat{\alpha}_\mu$, will give a consistent estimate of $\beta_\mu$. One has to arbitrarily discard all but one of the estimates of $\alpha_\mu$ in order to obtain a consistent estimate of the $\mu$th structural equation in the overidentified case.
2.4.2 Two-Stage Least Squares (2-SLS)

The two-stage least squares method was developed by Theil [17] and, independently, by Basmann [3, 4]. The method avoids the arbitrariness and loss of efficiency involved in the ILS method when the system is overidentified.

Suppose we estimate the reduced-form parameters by the ordinary least squares method applied to each reduced-form equation separately. These estimates have been shown to be B.L.U. and consistent. Therefore

$$\hat{Y}_m = X(X'X)^{-1}X'y_m$$

is a consistent estimate of $X\Pi_m$, for $m = 1, 2, \ldots, M$, and

$$\hat{V}_m = y_m - X(X'X)^{-1}X'y_m = [I - X(X'X)^{-1}X']\,y_m \qquad (2.20)$$

is a consistent estimate of $v_m$ ($m = 1, 2, \ldots, M$).

In the reduced-form estimation we obtain the estimator $\hat{Y}^{(\mu)}$ and the estimator $\hat{V}^{(\mu)}$ of the corresponding matrix of reduced-form errors, so that

$$Y^{(\mu)} = \hat{Y}^{(\mu)} + \hat{V}^{(\mu)}, \qquad (2.21)$$
and (2.10) can be written in the form

$$y_\mu = \hat{Y}^{(\mu)}\alpha_\mu + X_\mu\beta_\mu + (u_\mu + \hat{V}^{(\mu)}\alpha_\mu). \qquad (2.22)$$

The two-stage least squares estimators $\hat{\delta}_\mu = \begin{bmatrix}\hat{\alpha}_\mu \\ \hat{\beta}_\mu\end{bmatrix}$ of the structural parameters $\delta_\mu = \begin{bmatrix}\alpha_\mu \\ \beta_\mu\end{bmatrix}$ are

$$\hat{\delta}_\mu = \begin{bmatrix} \hat{Y}^{(\mu)\prime}\hat{Y}^{(\mu)} & \hat{Y}^{(\mu)\prime}X_\mu \\ X_\mu'\hat{Y}^{(\mu)} & X_\mu'X_\mu \end{bmatrix}^{-1}\begin{bmatrix} \hat{Y}^{(\mu)\prime}y_\mu \\ X_\mu'y_\mu \end{bmatrix}. \qquad (2.23)$$
It has been shown that these estimators are consistent and that their asymptotic covariance matrix is

$$\frac{\sigma_{\mu\mu}}{T}\left[\operatorname{plim}_{T\to\infty}\frac{1}{T}\begin{pmatrix} \hat{Y}^{(\mu)\prime}\hat{Y}^{(\mu)} & \hat{Y}^{(\mu)\prime}X_\mu \\ X_\mu'\hat{Y}^{(\mu)} & X_\mu'X_\mu \end{pmatrix}\right]^{-1}, \qquad (2.24)$$

which is consistently estimated by the matrix

$$s_{\mu\mu}\begin{pmatrix} \hat{Y}^{(\mu)\prime}\hat{Y}^{(\mu)} & \hat{Y}^{(\mu)\prime}X_\mu \\ X_\mu'\hat{Y}^{(\mu)} & X_\mu'X_\mu \end{pmatrix}^{-1}, \qquad (2.25)$$

where

$$s_{\mu\mu} = \frac{1}{T}\,(y_\mu - Y^{(\mu)}\hat{\alpha}_\mu - X_\mu\hat{\beta}_\mu)'(y_\mu - Y^{(\mu)}\hat{\alpha}_\mu - X_\mu\hat{\beta}_\mu). \qquad (2.26)$$
An alternative derivation is the following. Writing $H_\mu = (Y^{(\mu)}, X_\mu)$ and premultiplying the structural equation (2.10) by $X'$, we have

$$X'y_\mu = X'H_\mu\,\delta_\mu + X'u_\mu.$$

The covariance matrix of the transformed error vector is $\sigma_{\mu\mu}(X'X)$, which is not diagonal. Ignoring the correlation between $X'Y^{(\mu)}$ and $X'u_\mu$ (which is weak) and applying Aitken's method, we obtain

$$\hat{\delta}_\mu = [H_\mu'X(X'X)^{-1}X'H_\mu]^{-1}\,H_\mu'X(X'X)^{-1}X'y_\mu. \qquad (2.27)$$
If the equation is exactly identified, $X'H_\mu$ is a square matrix. If it is also nonsingular, (2.27) simplifies to

$$\hat{\delta}_\mu = (X'H_\mu)^{-1}X'y_\mu. \qquad (2.28)$$
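The following is a hedged sketch, not the author's code, of the 2-SLS computation in the form (2.27), with the covariance estimate (2.25); the function and array names are illustrative.

```python
# Two-stage least squares for one structural equation y_mu = Y_mu a + X_mu b + u,
# using all exogenous variables X of the system as instruments.
import numpy as np

def two_sls(y_mu, Y_mu, X_mu, X):
    H = np.hstack([Y_mu, X_mu])                     # included regressors (Y^(mu), X_mu)
    P = X @ np.linalg.solve(X.T @ X, X.T)           # projection on the exogenous X
    PH = P @ H
    delta = np.linalg.solve(PH.T @ H, PH.T @ y_mu)  # [H'PH]^{-1} H'P y   (eq. 2.27)
    resid = y_mu - H @ delta
    s_mumu = (resid @ resid) / len(y_mu)            # residual variance, as in (2.26)
    cov = s_mumu * np.linalg.inv(PH.T @ H)          # estimated covariance, as in (2.25)
    return delta, cov
```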
2.4.3 Three-Stage Least Squares (3-SLS)
The three-stage least squares method was developed by Zellner and Theil [20]. Zellner [19] has shown that if the error terms in a set of regression relations are contemporaneously correlated and if the regressors in different relations are different, Aitken's method applied to the entire set of relations yields better estimates than the ordinary least squares method applied to each equation separately. The three-stage least squares method utilizes the same principle.
Writing $H_\mu = (Y^{(\mu)}, X_\mu)$ and

$$H = \begin{bmatrix} H_1 & & 0 \\ & \ddots & \\ 0 & & H_M \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix}, \quad u = \begin{bmatrix} u_1 \\ \vdots \\ u_M \end{bmatrix}, \quad \delta = \begin{bmatrix} \delta_1 \\ \vdots \\ \delta_M \end{bmatrix}, \quad \bar{X} = \begin{bmatrix} X & & 0 \\ & \ddots & \\ 0 & & X \end{bmatrix},$$

and premultiplying each structural equation by $X'$, we obtain

$$\bar{X}'y = \bar{X}'H\,\delta + \bar{X}'u.$$

Since $E(\bar{X}'u\,u'\bar{X}) = \Sigma_{uu}\otimes(X'X)$, Aitken's generalized least squares method is appropriate. Consequently, we have

$$\hat{\hat{\delta}} = [H'\bar{X}(\Sigma_{uu}\otimes X'X)^{-1}\bar{X}'H]^{-1}[H'\bar{X}(\Sigma_{uu}\otimes X'X)^{-1}\bar{X}'y],$$

which reduces to

$$\hat{\hat{\delta}} = \begin{bmatrix} \sigma^{11}H_1'X(X'X)^{-1}X'H_1 & \cdots & \sigma^{1M}H_1'X(X'X)^{-1}X'H_M \\ \vdots & & \vdots \\ \sigma^{M1}H_M'X(X'X)^{-1}X'H_1 & \cdots & \sigma^{MM}H_M'X(X'X)^{-1}X'H_M \end{bmatrix}^{-1}\begin{bmatrix} \sum_{j=1}^{M}\sigma^{1j}H_1'X(X'X)^{-1}X'y_j \\ \vdots \\ \sum_{j=1}^{M}\sigma^{Mj}H_M'X(X'X)^{-1}X'y_j \end{bmatrix}, \qquad (2.29)$$

where $\sigma^{jk}$ denotes the $(j,k)$ element of $\Sigma_{uu}^{-1}$.
Replacing $\sigma^{jk}$ by their estimates $s^{jk}$, obtained from the 2-SLS residuals, we obtain the three-stage least squares estimators

$$\hat{\hat{\delta}} = \begin{bmatrix} s^{11}H_1'X(X'X)^{-1}X'H_1 & \cdots & s^{1M}H_1'X(X'X)^{-1}X'H_M \\ \vdots & & \vdots \\ s^{M1}H_M'X(X'X)^{-1}X'H_1 & \cdots & s^{MM}H_M'X(X'X)^{-1}X'H_M \end{bmatrix}^{-1}\begin{bmatrix} \sum_{j=1}^{M}s^{1j}H_1'X(X'X)^{-1}X'y_j \\ \vdots \\ \sum_{j=1}^{M}s^{Mj}H_M'X(X'X)^{-1}X'y_j \end{bmatrix}. \qquad (2.30)$$
Zellner and Theil have shown that these estimators are consistent and that their asymptotic covariance matrix is

$$\lim_{T\to\infty} E\!\left[T(\hat{\hat{\delta}} - \delta)(\hat{\hat{\delta}} - \delta)'\right] = \operatorname{plim}_{T\to\infty} T\begin{bmatrix} \sigma^{11}H_1'X(X'X)^{-1}X'H_1 & \cdots & \sigma^{1M}H_1'X(X'X)^{-1}X'H_M \\ \vdots & & \vdots \\ \sigma^{M1}H_M'X(X'X)^{-1}X'H_1 & \cdots & \sigma^{MM}H_M'X(X'X)^{-1}X'H_M \end{bmatrix}^{-1}, \qquad (2.31)$$

which is consistently estimated by the matrix on the right-hand side of (2.31). It has been, further, shown that these estimators are asymptotically more efficient than the two-stage least squares estimators. It is, of course, easy to see that if all the equations are exactly identified, the 2-SLS and 3-SLS methods give identical estimators.
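The sketch below, not the author's code, follows the two-round logic of (2.30): 2-SLS residuals supply the weights $s^{jk}$, and a joint Aitken step is then taken on the $X'$-premultiplied system. The list structure and names (eqs, X) are illustrative assumptions.

```python
# Three-stage least squares: eqs is a list of (y_mu, Y_mu, X_mu) triples,
# X holds all exogenous variables of the system.
import numpy as np

def three_sls(eqs, X):
    P = X @ np.linalg.solve(X.T @ X, X.T)
    H = [np.hstack([Ym, Xm]) for _, Ym, Xm in eqs]
    y = [ym for ym, _, _ in eqs]
    # first round: 2-SLS for each equation, residual covariance S = [s_jk]
    d0 = [np.linalg.solve(H[j].T @ P @ H[j], H[j].T @ P @ y[j]) for j in range(len(eqs))]
    U = np.column_stack([y[j] - H[j] @ d0[j] for j in range(len(eqs))])
    Sinv = np.linalg.inv(U.T @ U / len(X))          # [s^{jk}], inverse residual covariance
    # second round: joint GLS step corresponding to (2.30)
    M = len(eqs)
    A = np.block([[Sinv[j, k] * (H[j].T @ P @ H[k]) for k in range(M)] for j in range(M)])
    b = np.concatenate([sum(Sinv[j, k] * (H[j].T @ P @ y[k]) for k in range(M))
                        for j in range(M)])
    delta = np.linalg.solve(A, b)
    return np.split(delta, np.cumsum([h.shape[1] for h in H])[:-1])
```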
2.4.4 Limited Information Single Equation (LISE) or Least Variance Ratio Method

The two methods are essentially the same, though they were developed under different assumptions - the former by Anderson and Rubin [1, 2] under the assumption of normality of the structural disturbances, and the latter by Koopmans and Hood [10] without the normality assumption.
Consider the first structural equation

$$y_1 = Y^{(1)}\alpha_1 + X_1\beta_1 + u_1,$$

so that

$$y_1^* = y_1 - Y^{(1)}\alpha_1 = X_1\beta_1 + u_1.$$

Partition X - the matrix of observations on all exogenous variables in the complete system of equations - as $X = (X_1, X_2)$. The principle of least variance ratio states that the ratio of the residual variance when $y_1^*$ is regressed on $X_1$ to the residual variance when $y_1^*$ is regressed on X should be as small as possible - that is, the addition of the excluded exogenous variables $X_2$ should make a minimum improvement to the explained sum of squares. Let $\lambda$ denote the ratio of the two unexplained sums of squares. Writing $Y_1^* = (y_1, Y^{(1)})$ and $\alpha_1^o = \begin{bmatrix} -1 \\ \alpha_1 \end{bmatrix}$, we have

$$\lambda = \frac{y_1^{*\prime}[I - X_1(X_1'X_1)^{-1}X_1']\,y_1^*}{y_1^{*\prime}[I - X(X'X)^{-1}X']\,y_1^*} = \frac{\alpha_1^{o\prime}W_1^*\,\alpha_1^o}{\alpha_1^{o\prime}W^*\,\alpha_1^o} \ (\text{say}), \qquad (2.32)$$

where $W_1^* = Y_1^{*\prime}[I - X_1(X_1'X_1)^{-1}X_1']Y_1^*$ and $W^* = Y_1^{*\prime}[I - X(X'X)^{-1}X']Y_1^*$.

Minimizing with respect to $\alpha_1^o$, we obtain

$$(W_1^* - \lambda W^*)\,\alpha_1^o = 0. \qquad (2.33)$$

This equation has a nontrivial solution if and only if the nullity of the matrix $W_1^* - \lambda W^*$ is at least equal to 1 - that is, if and only if

$$|W_1^* - \lambda W^*| = 0. \qquad (2.34)$$

This is a polynomial of degree $\ell_1$ in $\lambda$. If $\hat{\lambda}$ is the smallest of the roots of (2.34), we can solve

$$(W_1^* - \hat{\lambda} W^*)\,\hat{\alpha}_1^o = 0 \qquad (2.35)$$

for $\hat{\alpha}_1^o$, obtaining an estimate $\hat{\alpha}_1$ of $\alpha_1$.
The next problem is to find an estimator $\hat{\beta}_1$ of $\beta_1$. Regressing $y_1^*$ on $X_1$, we have the estimate

$$\hat{\beta}_1 = -(X_1'X_1)^{-1}X_1'Y_1^*\,\hat{\alpha}_1^o, \qquad (2.36)$$

where $\hat{\alpha}_1^o = \begin{bmatrix} -1 \\ \hat{\alpha}_1 \end{bmatrix}$.
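A hedged numerical sketch, not the author's code, of the least variance ratio computation (2.32)-(2.35): form $W_1^*$ and $W^*$ and take the eigenvector belonging to the smallest root of the generalized eigenproblem. The function and array names are illustrative.

```python
# Least variance ratio (LVR/LISE) for the first structural equation.
import numpy as np
from scipy.linalg import eigh

def lvr_alpha(Y1_star, X1, X):
    """Y1_star = (y_1, Y^(1)); X1 = included exogenous; X = all exogenous."""
    M1 = np.eye(len(X1)) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)   # I - P_{X1}
    M  = np.eye(len(X))  - X  @ np.linalg.solve(X.T  @ X,  X.T)    # I - P_X
    W1 = Y1_star.T @ M1 @ Y1_star                                  # W_1* in (2.32)
    W  = Y1_star.T @ M  @ Y1_star                                  # W*  in (2.32)
    lam, vec = eigh(W1, W)                  # generalized eigenproblem W1 a = lam W a
    a = vec[:, 0]                           # eigenvector of the smallest root (2.35)
    a = -a / a[0]                           # normalize the first element to -1
    return lam[0], a
```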
2.4.5 K-Class Estimators

Theil [17] defines the K-class estimators by the relations

$$\begin{bmatrix} \hat{\alpha}_\mu(K) \\ \hat{\beta}_\mu(K) \end{bmatrix} = \begin{bmatrix} Y^{(\mu)\prime}Y^{(\mu)} - K\hat{V}^{(\mu)\prime}\hat{V}^{(\mu)} & Y^{(\mu)\prime}X_\mu \\ X_\mu'Y^{(\mu)} & X_\mu'X_\mu \end{bmatrix}^{-1}\begin{bmatrix} (Y^{(\mu)} - K\hat{V}^{(\mu)})'y_\mu \\ X_\mu'y_\mu \end{bmatrix} \qquad (2.37)$$

($\mu = 1, 2, \ldots, M$), where $Y^{(\mu)}$ and $\hat{V}^{(\mu)}$ have been defined before, and K is a scalar which may be a constant, or a stochastic or nonstochastic variable. When $K = 0$, we have the ordinary least squares estimators; when $K = \hat{\lambda}$, the smallest root of (2.34), we have the LVR estimators; and when $K = 1$, we obtain the 2-SLS estimators.

It has been shown by Nagar [14] that if $\operatorname{plim}_{T\to\infty}(K - 1) = 0$, the K-class estimators are consistent; further, if $\operatorname{plim}_{T\to\infty}\sqrt{T}(K - 1) = 0$, they have the asymptotic covariance matrix

$$\frac{\sigma_{\mu\mu}}{T}\left[\operatorname{plim}_{T\to\infty}\frac{1}{T}\begin{pmatrix} Y^{(\mu)\prime}Y^{(\mu)} - \hat{V}^{(\mu)\prime}\hat{V}^{(\mu)} & Y^{(\mu)\prime}X_\mu \\ X_\mu'Y^{(\mu)} & X_\mu'X_\mu \end{pmatrix}\right]^{-1}. \qquad (2.38)$$
2.4.6 Full Information Least Generalized Residual Variance Method
This method was developed by Koopmans, Rubin and Leipnik [11] under
the specification that the structural disturbances are normally distributed.
While in the limited information least variance ratio method
only the restrictions on a single structural equation are utilized, this
method uses the restrictions on the entire system of structural equations.
It is a method for simultaneous estimation of all structural parameters.
In order to estimate the reduced-form system $Y = X\Pi + V$, we minimize

$$\left|\tfrac{1}{T}(V'V)\right| = \left|\tfrac{1}{T}(Y - X\Pi)'(Y - X\Pi)\right|.$$

We may as well minimize $\tfrac{1}{2}\log|V'V|$. But

$$\tfrac{1}{2}\log|V'V| = \tfrac{1}{2}\log|A|^{-2} + \tfrac{1}{2}\log|U'U|. \qquad (2.39)$$

From the structural equations,

$$(Y, X)\begin{bmatrix} A \\ B \end{bmatrix} + U = 0, \quad\text{that is,}\quad Z\Delta + U = 0,$$

we have

$$\tfrac{1}{2}\log|V'V| = -\log|A| + \tfrac{1}{2}\log|\Delta'Z'Z\,\Delta|. \qquad (2.40)$$

This quantity is minimized subject to the constraints on $\Delta = \begin{bmatrix} A \\ B \end{bmatrix}$ to obtain estimates of the structural parameters. The method has not been extensively used because, on equating to zero the partial derivatives of $\log|V'V|$ with respect to the elements of $\Delta$, we are confronted with a complex set of nonlinear equations.
3.0 SOME THEOREMS ON STOCHASTIC CONVERGENCE

3.1 Useful Inequalities

This chapter concerns the convergence of sequences of random variables. Using the standard notation, we shall give some theorems on stochastic convergence which will be useful in the sequel.
First, we state a few inequalities.

(i) Let $x_1, x_2, \ldots, x_n$; $y_1, y_2, \ldots, y_n$ be nonnegative real numbers. Then

$$\sum_{i=1}^{n}x_iy_i \leq \left(\sum_{i=1}^{n}x_i^p\right)^{1/p}\left(\sum_{i=1}^{n}y_i^q\right)^{1/q}, \qquad (3.1)$$

where $p > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$.

(ii) For arbitrary real $x_i$, $y_i$ ($i = 1, 2, \ldots, n$),

$$\left(\sum_{i=1}^{n}x_iy_i\right)^2 \leq \left(\sum_{i=1}^{n}x_i^2\right)\left(\sum_{i=1}^{n}y_i^2\right). \qquad (3.2)\ \text{(Cauchy-Schwarz)}$$

(iii) If X and Y are random variables,

$$E|XY| \leq [E|X|^p]^{1/p}\,[E|Y|^q]^{1/q} \qquad (3.3)\ \text{(Hölder)}$$

for $p > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$, provided that the expectations exist. An extension of (3.3) is the following:

$$E|XYZ\cdots| \leq [E|X|^p]^{1/p}\,[E|Y|^q]^{1/q}\,[E|Z|^r]^{1/r}\cdots, \qquad \frac{1}{p} + \frac{1}{q} + \frac{1}{r} + \cdots = 1. \qquad (3.4)$$

(iv) Let X be a random variable and c a positive constant. Then

$$P[|X| > c] \leq \frac{E|X|^r}{c^r} \quad\text{for all } r > 0. \qquad (3.5)\ \text{(Markov)}$$

(v) Let X and Y be two random variables and $\varepsilon > 0$ an arbitrary number. Then

$$P[|X - Y| > \varepsilon] \leq \frac{E(X - Y)^2}{\varepsilon^2}. \qquad (3.6)\ \text{(Tchebychev)}$$

(vi) For any random variable X,

$$|E(X)| \leq E|X|, \qquad (3.7)$$

provided EX exists.

(For proofs of these inequalities see Rao [15] and Loève [12].)
3.2 Stochastic Convergence, Univariate Case

Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of random variables. A particular member, $X_n$, of this sequence has a distribution function $F_n(x)$ which depends upon the integer n, and moments (if existing) which also depend upon n. The various modes of convergence of sequences of random variables seek an answer to the question: what happens to the random variable $X_n$, its distribution function $F_n(x)$ and its moments when n gets large?

Definition 1. A sequence $\{X_n\}$ converges in distribution to the random variable X if the distribution function $F_n(x)$ of $X_n$ converges to the distribution function $F(x)$ of X at all points of continuity of $F(x)$. We write $X_n \xrightarrow{\text{dist.}} X$.

Definition 2. A sequence $\{X_n\}$ of random variables converges in probability to a constant c if for every $\varepsilon > 0$, $\lim_{n\to\infty}P[|X_n - c| > \varepsilon] = 0$. We write $X_n \xrightarrow{P} c$, or $\operatorname{plim}_{n\to\infty}X_n = c$.

Definition 3. A sequence of random variables $\{X_n\}$ is said to converge in probability to a random variable X if for every $\varepsilon > 0$, $\lim_{n\to\infty}P[|X_n - X| > \varepsilon] = 0$. We write $X_n \xrightarrow{P} X$.

Convergence in probability implies convergence in distribution. For proof, see Wilks [18], pp. 101-102. The converse is not true.
Theorem (3.1). Let $\{X_n\}$ be a sequence of random variables and X a random variable such that $EX_n^2 < \infty$ for all n, $EX^2 < \infty$, and $\lim_{n\to\infty}E(X_n - X)^2 = 0$. Then (i) $X_n \xrightarrow{P} X$; (ii) $\lim_{n\to\infty}EX_n = EX$; (iii) $\lim_{n\to\infty}E(X_n - EX_n)^2 = E(X - EX)^2$.

Proof: (i) See Wilks [18], p. 100.

(ii) By (3.7) and (3.3), $|EX_n - EX| \leq E|X_n - X| \leq \sqrt{E(X_n - X)^2}$. Therefore $0 \leq \lim_{n\to\infty}|EX_n - EX| \leq \lim_{n\to\infty}\sqrt{E(X_n - X)^2} = 0$, so that $\lim_{n\to\infty}EX_n = EX$, which clearly exists because $E(X^2) < \infty$.

(iii) Let $EX = a$ and $EX_n = a_n$, so that by (ii) $\lim_{n\to\infty}a_n = a$. Now

$$E(X_n - EX_n)^2 = E(X_n - X + X - a + a - a_n)^2 = E(X_n - X)^2 + E(X - a)^2 + (a - a_n)^2 + 2E[(X_n - X)(X - a)] + 2(a - a_n)E(X_n - X),$$

since $E(X - a) = 0$. Since $\lim_{n\to\infty}E(X_n - X)^2 = 0$, $E(X_n - X)$ is bounded for all n. Moreover,

$$\lim_{n\to\infty}\bigl|E[(X_n - X)(X - a)]\bigr| \leq \lim_{n\to\infty}\sqrt{E(X_n - X)^2\,E(X - a)^2} = 0.$$

Therefore $\lim_{n\to\infty}E(X_n - EX_n)^2 = E(X - a)^2 = E(X - EX)^2$.
Corollary (3.1.1). Let X be a degenerate random variable, equal to a constant c with probability 1. Then $\lim_{n\to\infty}E(X_n - c)^2 = 0$ implies that

$$\lim_{n\to\infty}EX_n = c = \operatorname{plim}_{n\to\infty}X_n.$$
Corollary (3.1.2). Let $\{X_n\}$ and $\{Y_n\}$ be two sequences of random variables such that $\lim_{n\to\infty}E\,Y_n^2 < \infty$ and $\lim_{n\to\infty}E\,Y_n$ exists. If $\lim_{n\to\infty}E(X_n - Y_n)^2 = 0$, then

(i) $\lim_{n\to\infty}E(X_n) = \lim_{n\to\infty}E\,Y_n$,
(ii) $\lim_{n\to\infty}E(X_n - EX_n)^2 = \lim_{n\to\infty}E(Y_n - EY_n)^2$.

Proof: Since $\lim_{n\to\infty}E(Y_n^2) < \infty$, it follows that $E\,Y_n^2$ and $E(Y_n - EY_n)^2$ both exist and are bounded for all n.

Now $|E(X_n - Y_n)| \leq E|X_n - Y_n| \leq \sqrt{E(X_n - Y_n)^2}$, which tends to zero as $n\to\infty$, so that $\lim_{n\to\infty}E(X_n - Y_n) = 0$. Therefore,

$$\lim_{n\to\infty}E\,X_n = \lim_{n\to\infty}E(X_n - Y_n + Y_n) = \lim_{n\to\infty}E(X_n - Y_n) + \lim_{n\to\infty}E\,Y_n = \lim_{n\to\infty}E\,Y_n.$$

This proves (i) above. Further,

$$E(X_n - EX_n)^2 = E(X_n - Y_n + Y_n - EY_n + EY_n - EX_n)^2 = E(X_n - Y_n)^2 + E(Y_n - EY_n)^2 - (EX_n - EY_n)^2 + 2E[(X_n - Y_n)(Y_n - EY_n)].$$

On taking the limit and using the given condition, we obtain

$$\lim_{n\to\infty}E(X_n - EX_n)^2 = \lim_{n\to\infty}E(Y_n - EY_n)^2 + 2\lim_{n\to\infty}E[(X_n - Y_n)(Y_n - EY_n)].$$

But

$$\lim_{n\to\infty}\bigl|E[(X_n - Y_n)(Y_n - EY_n)]\bigr| \leq \lim_{n\to\infty}\sqrt{E(X_n - Y_n)^2\,E(Y_n - EY_n)^2} = 0.$$

Therefore $\lim_{n\to\infty}E(X_n - EX_n)^2 = \lim_{n\to\infty}E(Y_n - EY_n)^2$. The corollary is proved.
Theorem (3.2). Let $S_n$ be a statistic and $\beta$ a finite constant such that for a given integer $\ell \geq 3$,

$$\lim_{n\to\infty}E|S_n - \beta|^\ell = 0.$$

Let $X_n$ be normal $(0, \sigma_n^2)$ where $\sigma_n^2 < \infty$ and $\lim_{n\to\infty}\sigma_n^2 = \sigma^2$. Then $S_nX_n \xrightarrow{\text{dist.}}$ normal $(0, \beta^2\sigma^2)$, and the mean and variance of $S_nX_n$ converge respectively to the mean and variance of the limiting distribution.

Proof: Using Hölder's inequality, we have $E|S_n - \beta|^2 \leq [E|S_n - \beta|^\ell]^{2/\ell} \to 0$. Hence, by theorem (3.1), $S_n \xrightarrow{P} \beta$. Let X be normal $(0, \sigma^2)$. Then $X_n \xrightarrow{\text{dist.}} X$, and, by Slutsky's theorem (Cramér [7], p. 254), $S_nX_n \xrightarrow{\text{dist.}} \beta X$, which is normal $(0, \beta^2\sigma^2)$.

Further, $E(S_nX_n) = E(S_n - \beta)X_n + \beta EX_n = E(S_n - \beta)X_n$, and

$$|E(S_n - \beta)X_n| \leq \sqrt{E(S_n - \beta)^2\,EX_n^2},$$

which tends to zero as n goes to infinity. Therefore $\lim_{n\to\infty}E(S_n - \beta)X_n = 0$, which implies that $\lim_{n\to\infty}E\,S_nX_n = 0$.

Moreover, using Hölder's inequality, we have

$$E\,X_n^2(S_n - \beta)^2 \leq [E\,X_n^6]^{1/3}\,[E|S_n - \beta|^3]^{2/3} = (15\sigma_n^6)^{1/3}\,[E|S_n - \beta|^3]^{2/3},$$

which tends to zero as $n\to\infty$, and $|E\,X_n^2(S_n - \beta)| \leq \sqrt{E\,X_n^4\,E(S_n - \beta)^2}$, which also tends to zero as $n\to\infty$. Therefore,

$$\lim_{n\to\infty}E\,X_n^2S_n^2 = \lim_{n\to\infty}E\,X_n^2(S_n - \beta)^2 + \beta^2\lim_{n\to\infty}E\,X_n^2 + 2\beta\lim_{n\to\infty}E\,X_n^2(S_n - \beta) = \beta^2\sigma^2.$$

The theorem is proved.
Corollary (3.2.1). Let $X_n$ and $S_n$ be specified as in theorem (3.2). Let $P(x)$ be a polynomial of degree p ($3p \leq \ell$). Then

$$\lim_{n\to\infty}E|P(S_n) - P(\beta)|^r = 0 \quad\text{for } 0 < r \leq \frac{\ell}{p}.$$

Further, $y_n = X_nP(S_n) \xrightarrow{\text{dist.}}$ normal $(0, \sigma^2P^2(\beta))$, and the first two moments of $y_n$ converge to the corresponding moments of the limiting distribution.

Proof: By Taylor's theorem,

$$P(S_n) = P(\beta) + \sum_{v=1}^{p}\frac{P^{(v)}(\beta)}{v!}(S_n - \beta)^v, \qquad\text{where } P^{(v)}(\beta) = \left.\frac{d^vP(x)}{dx^v}\right|_{x=\beta}.$$

Therefore,

$$|P(S_n) - P(\beta)|^r \leq \left[\sum_{v=1}^{p}\left|\frac{P^{(v)}(\beta)}{v!}\right|\,|S_n - \beta|^v\right]^r,$$

and the given conditions ensure that $\lim_{n\to\infty}E|P(S_n) - P(\beta)|^r = 0$ for $0 < r \leq \ell/p$.

The rest of the corollary follows from theorem (3.2), because $P(S_n)$ plays the same role as $S_n$.
Corollary (3.2.2). Let $X_n$, $S_n$ and $P(x)$ be specified as in corollary (3.2.1). Let $R(x)$ be a rational function of the form $R(x) = \frac{P(x)}{1 + Q(x)}$, $Q(x)$ being any polynomial such that $Q(x) > 0$ for all real x. Then

$$\lim_{n\to\infty}E|R(S_n) - R(\beta)|^r = 0 \quad\text{for } 0 < r \leq \frac{\ell}{p}.$$

Further, $Z_n = X_nR(S_n) \xrightarrow{\text{dist.}}$ normal $[0, \sigma^2R^2(\beta)]$, and the first two moments of $Z_n$ converge to the corresponding moments of the limiting distribution.

Proof: Evidently $\frac{Q'(x)}{[1 + Q(x)]^2}$ is bounded for all real x, so that there exists a positive constant $M < \infty$ such that $\frac{|Q'(x)|}{[1 + Q(x)]^2} \leq M$. By Taylor's theorem,

$$\frac{1}{1 + Q(S_n)} = \frac{1}{1 + Q(\beta)} - \frac{Q'(y_n)}{[1 + Q(y_n)]^2}(S_n - \beta), \qquad\text{where } y_n = \beta + \theta(S_n - \beta),\ 0 \leq \theta \leq 1.$$

Therefore,

$$E|R(S_n) - R(\beta)|^r \leq c_r\,E\left|\frac{P(S_n) - P(\beta)}{1 + Q(S_n)}\right|^r + c_r\,|P(\beta)|^r\,E\left|\frac{Q'(y_n)}{[1 + Q(y_n)]^2}(S_n - \beta)\right|^r,$$

by the $c_r$-inequality, $c_r = 1$ or $2^{r-1}$ according as $r \leq 1$ or $r \geq 1$ (Loève [12], p. 155).

Using the results of corollary (3.2.1), we have

$$\lim_{n\to\infty}E|R(S_n) - R(\beta)|^r = 0 \quad\text{for } 0 < r \leq \frac{\ell}{p}.$$

The remaining part of the corollary now follows immediately from the fact that $R(S_n)$ satisfies the conditions stated in theorem (3.2).
3.3 Convergence of Sequences of Random Vectors

Let $y = (y_1, y_2, \ldots, y_K)'$ be a K-dimensional random vector defined on the basic probability space $(\Omega, \mathcal{A}, P)$, the symbols having the usual meanings. We define

$$E\,y = \int_\Omega y\,dP = \left(\int_\Omega y_1\,dP,\ \int_\Omega y_2\,dP,\ \ldots,\ \int_\Omega y_K\,dP\right)'.$$

Definition 4. The integral $\int_\Omega y\,dP$ is said to be finite (or existing) if each of the integrals on the right is finite. In the usual notation, we have

$$\left|\int_\Omega y\,dP\right| \leq \int_\Omega|y|\,dP \leq \sum_{i=1}^{K}\int_\Omega|y_i|\,dP. \qquad (3.8)$$

The following very useful result is probably not new, but a simple proof is given.
Theorem (3.3). Let $y = (y_1, y_2, \ldots, y_K)'$ be a K-dimensional random vector. Then for every $\varepsilon > 0$,

$$P\left[\max_{1\leq i\leq K}|y_i| > \varepsilon\right] \leq P[|y| > \varepsilon] \leq \frac{E(y'y)}{\varepsilon^2}.$$

Proof: Since $|y| \leq \varepsilon$ is a sphere in K-dimensional Euclidean space, $|y| \leq \varepsilon \Rightarrow \max_{1\leq i\leq K}|y_i| \leq \varepsilon$; or $\max_{1\leq i\leq K}|y_i| > \varepsilon \Rightarrow |y| > \varepsilon$. Therefore,

$$P\left[\max_{1\leq i\leq K}|y_i| > \varepsilon\right] \leq P[|y| > \varepsilon] \leq \frac{E(|y|^2)}{\varepsilon^2} = \frac{E(y'y)}{\varepsilon^2}.$$
Theorem (3.4). (i) Let $\{y_n\}$ be a sequence of K-dimensional random vectors and y a K-dimensional random vector such that $E(y'y) < \infty$ and $\lim_{n\to\infty}E(y_n - y)'(y_n - y) = 0$. Then

$$\lim_{n\to\infty}E(y_n) = E\,y \quad\text{and}\quad \lim_{n\to\infty}E(y_n - E\,y_n)(y_n - E\,y_n)' = E(y - E\,y)(y - E\,y)'.$$

(ii) Let c be a $K\times 1$ vector of constants such that $\lim_{n\to\infty}E(y_n - c)'(y_n - c) = 0$. Then

$$\lim_{n\to\infty}E\,y_n = \operatorname{plim}_{n\to\infty}y_n = c.$$

Proof: By Hölder's inequality, $|E\,y| \leq E|y| \leq \sqrt{E(y'y)}$, which is finite by assumption; hence $E\,y$ exists. Moreover, it is evident that $E(y - E\,y)(y - E\,y)'$ exists.

Now let $y_{in}$ and $y_i$ denote the corresponding elements of $y_n$ and y respectively. By the stated conditions, $(y_{in} - y_i)$ is always well-defined. By theorem (3.3), we have, for every $\varepsilon > 0$,

$$P\left[\max_{1\leq i\leq K}|y_{in} - y_i| > \varepsilon\right] \leq \frac{E(y_n - y)'(y_n - y)}{\varepsilon^2}.$$

But $\lim_{n\to\infty}E(y_n - y)'(y_n - y) = \lim_{n\to\infty}\operatorname{tr}E(y_n - y)(y_n - y)' = 0$, since K is fixed. Therefore $\lim_{n\to\infty}P[\max_{1\leq i\leq K}|y_{in} - y_i| > \varepsilon] = 0$, which implies that $y_n \xrightarrow{P} y$.

Further, $\lim_{n\to\infty}E(y_n) = \lim_{n\to\infty}E(y_n - y) + E(y)$. Since

$$\lim_{n\to\infty}|E(y_n - y)| \leq \lim_{n\to\infty}E|y_n - y| \leq \lim_{n\to\infty}\sqrt{E(y_n - y)'(y_n - y)} = 0,$$

it follows that $\lim_{n\to\infty}E(y_n - y) = 0$ and $\lim_{n\to\infty}E\,y_n = E\,y$. Next,

$$\lim_{n\to\infty}E(y_n - E\,y_n)(y_n - E\,y_n)' = \lim_{n\to\infty}E(y - E\,y)(y_n - y)' + \lim_{n\to\infty}E(y_n - y)(y - E\,y)' + E(y - E\,y)(y - E\,y)'.$$

A typical element of $E(y - E\,y)(y_n - y)'$ is $E(y_i - E\,y_i)(y_{jn} - y_j)$. Using Hölder's inequality, we obtain

$$\lim_{n\to\infty}\bigl|E(y_i - E\,y_i)(y_{jn} - y_j)\bigr| \leq \lim_{n\to\infty}\sqrt{E(y_i - E\,y_i)^2\,E(y_{jn} - y_j)^2} = 0.$$

This implies that $\lim_{n\to\infty}E(y - E\,y)(y_n - y)' = 0 = \lim_{n\to\infty}E(y_n - y)(y - E\,y)'$. Hence

$$\lim_{n\to\infty}E(y_n - E\,y_n)(y_n - E\,y_n)' = E(y - E\,y)(y - E\,y)'. \quad\text{Q.E.D.}$$

Replacing y by c, we see that $\lim_{n\to\infty}E(y_n - c)'(y_n - c) = 0$ implies $\operatorname{plim}_{n\to\infty}y_n = \lim_{n\to\infty}E\,y_n = c$. This proves (ii).
Theorem (3.5). Let $\{y_n\}$ and $\{z_n\}$ be two sequences of K-dimensional random vectors such that $\lim_{n\to\infty}E\,y_n$ exists, $\lim_{n\to\infty}E(y_n'y_n) < \infty$, and $\lim_{n\to\infty}E(z_n - y_n)'(z_n - y_n) = 0$. Then

(i) $\lim_{n\to\infty}E\,z_n = \lim_{n\to\infty}E\,y_n$,
(ii) $\lim_{n\to\infty}E(z_n - E\,z_n)(z_n - E\,z_n)' = \lim_{n\to\infty}E(y_n - E\,y_n)(y_n - E\,y_n)'$.

Proof: Since $\lim_{n\to\infty}E(z_n - y_n)'(z_n - y_n) = 0$,

$$\lim_{n\to\infty}|E(z_n - y_n)| \leq \lim_{n\to\infty}\sqrt{\operatorname{tr}E(z_n - y_n)(z_n - y_n)'} = 0,$$

which implies that $\lim_{n\to\infty}E(z_n - y_n) = 0$. Therefore,

$$\lim_{n\to\infty}E(z_n) = \lim_{n\to\infty}E(z_n - y_n) + \lim_{n\to\infty}E\,y_n = \lim_{n\to\infty}E\,y_n,$$

which is finite by assumption. This proves (i) above.

Further, $\lim_{n\to\infty}E(y_n'y_n) < \infty$ implies that $y_n$ has an existing covariance matrix for all n and, moreover, that $\lim_{n\to\infty}E(y_n - E\,y_n)(y_n - E\,y_n)'$ exists. Denote this matrix by $V = [v_{ij}]$. Then, in the manner of corollary (3.1.2), it follows from the given conditions that

$$\lim_{n\to\infty}E(z_{in} - E\,z_{in})(z_{jn} - E\,z_{jn}) = \lim_{n\to\infty}E(y_{in} - E\,y_{in})(y_{jn} - E\,y_{jn}) = v_{ij} \qquad (i, j = 1, 2, \ldots, K).$$

This implies that

$$\lim_{n\to\infty}E(z_n - E\,z_n)(z_n - E\,z_n)' = \lim_{n\to\infty}E(y_n - E\,y_n)(y_n - E\,y_n)'.$$

The theorem is proved.
Theorem (3.6). Let $\tilde{\Pi} = [\tilde{\Pi}_{rs}]$ be a $K\times K$ matrix of statistics depending upon the integer n, and $\bar{\Pi} = [\bar{\Pi}_{rs}]$ a $K\times K$ matrix with real nonstochastic elements, also depending on n, such that

$$\lim_{n\to\infty}\bar{\Pi} = \Pi\ (\text{positive definite}) \quad\text{and}\quad \lim_{n\to\infty}E(\tilde{\Pi}_{rs} - \bar{\Pi}_{rs})^4 = 0 \qquad (r, s = 1, 2, \ldots, K).$$

Further, let $y_n$ be K-variate normal $[0, V = V(n)]$, where $\lim_{n\to\infty}V = \bar{V}$ (positive definite). Then

(i) $\tilde{\Pi}\,y_n$ converges in distribution to normal $(0, \Pi\bar{V}\Pi')$,
(ii) $\lim_{n\to\infty}E(\tilde{\Pi}\,y_n) = 0$, and
(iii) $\lim_{n\to\infty}E(\tilde{\Pi}\,y_n\,y_n'\,\tilde{\Pi}') = \Pi\bar{V}\Pi'$.
Proof: Let $\bar{\Pi}_{rs}$, $v_{rs}$ and $\bar{v}_{rs}$ denote the typical elements of $\bar{\Pi}$, V and $\bar{V}$ respectively. Using Hölder's inequality, we have, for $\ell < 4$,

$$E|\tilde{\Pi}_{rs} - \bar{\Pi}_{rs}|^\ell \leq [E(\tilde{\Pi}_{rs} - \bar{\Pi}_{rs})^4]^{\ell/4},$$

so that $\lim_{n\to\infty}E|\tilde{\Pi}_{rs} - \bar{\Pi}_{rs}|^\ell = 0$ for $\ell = 1, 2, 3, 4$. Further, the $c_r$-inequality (see Loève [12], p. 155) gives

$$E|\tilde{\Pi}_{rs} - \Pi_{rs}|^\ell = E|(\tilde{\Pi}_{rs} - \bar{\Pi}_{rs}) + (\bar{\Pi}_{rs} - \Pi_{rs})|^\ell \leq 2^{\ell-1}\left[E|\tilde{\Pi}_{rs} - \bar{\Pi}_{rs}|^\ell + |\bar{\Pi}_{rs} - \Pi_{rs}|^\ell\right].$$

Since $\lim_{n\to\infty}\bar{\Pi}_{rs} = \Pi_{rs}$, we have $\lim_{n\to\infty}E|\tilde{\Pi}_{rs} - \Pi_{rs}|^\ell = 0$, $\ell = 1, 2, 3, 4$. This implies that $\tilde{\Pi} \xrightarrow{P} \Pi$ (elementwise). Since $y_n \xrightarrow{\text{dist.}}$ normal $(0, \bar{V})$, it follows by Slutsky's theorem that $\tilde{\Pi}\,y_n$ approaches in distribution normal $(0, \Pi\bar{V}\Pi')$.

Now $E(\tilde{\Pi}\,y_n) = E(\tilde{\Pi} - \bar{\Pi})y_n + \bar{\Pi}\,E\,y_n = E(\tilde{\Pi} - \bar{\Pi})y_n$. A typical element of $(\tilde{\Pi} - \bar{\Pi})y_n$ is

$$\delta_r = \sum_{i=1}^{K}(\tilde{\Pi}_{ri} - \bar{\Pi}_{ri})\,y_{in}, \qquad E|\delta_r| \leq \sum_{i=1}^{K}\sqrt{E(\tilde{\Pi}_{ri} - \bar{\Pi}_{ri})^2\,E\,y_{in}^2}.$$

Hence,

$$\lim_{n\to\infty}|E(\delta_r)| \leq \sum_{i=1}^{K}\sqrt{\bar{v}_{ii}\,\lim_{n\to\infty}E(\tilde{\Pi}_{ri} - \bar{\Pi}_{ri})^2} = 0,$$

which implies that $\lim_{n\to\infty}E(\tilde{\Pi}\,y_n) = \lim_{n\to\infty}E(\tilde{\Pi} - \bar{\Pi})y_n = 0$.

Finally,

$$E(\tilde{\Pi}\,y_n\,y_n'\,\tilde{\Pi}') = E[(\tilde{\Pi} - \bar{\Pi})y_ny_n'(\tilde{\Pi} - \bar{\Pi})'] + \bar{\Pi}\,E[y_ny_n'(\tilde{\Pi} - \bar{\Pi})'] + E[(\tilde{\Pi} - \bar{\Pi})y_ny_n']\,\bar{\Pi}' + \bar{\Pi}\,E(y_ny_n')\,\bar{\Pi}'.$$

Consider the matrix $(\tilde{\Pi} - \bar{\Pi})y_ny_n'(\tilde{\Pi} - \bar{\Pi})'$. A typical element of this matrix is

$$\varphi_{rs} = \left\{\sum_{i=1}^{K}(\tilde{\Pi}_{ri} - \bar{\Pi}_{ri})y_{in}\right\}\left\{\sum_{j=1}^{K}(\tilde{\Pi}_{sj} - \bar{\Pi}_{sj})y_{jn}\right\}.$$

Using Hölder's inequality, we obtain

$$\lim_{n\to\infty}|E\,\varphi_{rs}| \leq \sum_{i=1}^{K}\sum_{j=1}^{K}\lim_{n\to\infty}\left[\{E(\tilde{\Pi}_{ri} - \bar{\Pi}_{ri})^4\}^{1/4}\{E(\tilde{\Pi}_{sj} - \bar{\Pi}_{sj})^4\}^{1/4}\{E\,y_{in}^2y_{jn}^2\}^{1/2}\right] = 0,$$

which implies that $\lim_{n\to\infty}E[(\tilde{\Pi} - \bar{\Pi})y_ny_n'(\tilde{\Pi} - \bar{\Pi})'] = 0$. Similarly,

$$\lim_{n\to\infty}E[(\tilde{\Pi} - \bar{\Pi})y_ny_n'] = 0 = \lim_{n\to\infty}E[y_ny_n'(\tilde{\Pi} - \bar{\Pi})'].$$

Moreover,

$$\lim_{n\to\infty}(\bar{\Pi}\,E\,y_ny_n'\,\bar{\Pi}') = \lim_{n\to\infty}(\bar{\Pi}\,V\,\bar{\Pi}') = \Pi\bar{V}\Pi'.$$

Therefore $\lim_{n\to\infty}E(\tilde{\Pi}\,y_n\,y_n'\,\tilde{\Pi}') = \Pi\bar{V}\Pi'$. The theorem is proved.
Corollary (3.6.1). Let $\tilde{\Pi}$, $\bar{\Pi}$ and $y_n$ be specified as in theorem (3.6). Let $z_n$ be a sequence of K-dimensional random vectors such that

$$\lim_{n\to\infty}E(z_{in} - y_{in})^4 = 0 \qquad (i = 1, 2, \ldots, K).$$

Then (i) $\tilde{\Pi}\,z_n \xrightarrow{\text{dist.}}$ normal $(0, \Pi\bar{V}\Pi')$, (ii) $\lim_{n\to\infty}E(\tilde{\Pi}\,z_n) = 0$, and (iii) $\lim_{n\to\infty}E(\tilde{\Pi}\,z_n\,z_n'\,\tilde{\Pi}') = \Pi\bar{V}\Pi'$.

Proof: Using Hölder's inequality, we have

$$\lim_{n\to\infty}E(z_{in} - y_{in})^2 \leq \lim_{n\to\infty}[E(z_{in} - y_{in})^4]^{1/2} = 0 \qquad (i = 1, 2, \ldots, K),$$

so that $\lim_{n\to\infty}E(z_n - y_n)'(z_n - y_n) = \sum_{i=1}^{K}\lim_{n\to\infty}E(z_{in} - y_{in})^2 = 0$. Therefore, by theorem (3.3), $\operatorname{plim}_{n\to\infty}(z_n - y_n) = 0$. Now

$$\tilde{\Pi}\,z_n - \bar{\Pi}\,y_n = \tilde{\Pi}(z_n - y_n) + (\tilde{\Pi} - \bar{\Pi})\,y_n,$$

and $\operatorname{plim}_{n\to\infty}\tilde{\Pi}(z_n - y_n) = 0 = \operatorname{plim}_{n\to\infty}(\tilde{\Pi} - \bar{\Pi})y_n$, so that $\operatorname{plim}_{n\to\infty}(\tilde{\Pi}\,z_n - \bar{\Pi}\,y_n) = 0$. Thus $\tilde{\Pi}\,z_n \xrightarrow{\text{dist.}}$ normal $(0, \Pi\bar{V}\Pi')$, the limiting distribution of $\bar{\Pi}\,y_n$.

Since $y_{in}$ ($i = 1, 2, \ldots, K$) have existing moments of all orders which converge to the corresponding moments of the limiting distribution, the given condition ensures that $z_{in}$ has existing moments of the fourth order, and moreover,

$$\lim_{n\to\infty}E(z_{in}^\ell) = \lim_{n\to\infty}E(y_{in}^\ell) \quad\text{for } \ell \leq 4.$$

The remaining part of the corollary, therefore, follows from the theorem if we replace $y_n$ by $z_n$ and use these results.
Note: In many practical problems, we have to deal with random variables whose distribution depends upon two or more integers. The results established in sections (3.2) and (3.3) can be reworded to take care of such a situation. Suppose that in theorem (3.2) we have the following hypothesis: $S_{n,T}$ is a statistic depending upon the integers n and T, and $\beta$ a finite constant, such that $E|S_{n,T} - \beta|^\ell$ ($\ell \geq 3$) goes to zero as n and T tend to infinity independently. If $X_{n,T}$ is normal $[0, \sigma^2(n, T)]$, such that $\lim_{n\to\infty}\lim_{T\to\infty}\sigma^2(n, T) = \sigma^2$, it is straightforward to show that $S_{n,T}X_{n,T}$ converges in distribution to normal $(0, \beta^2\sigma^2)$, and that the mean and the variance of $S_{n,T}X_{n,T}$ converge respectively to the mean and the variance of the limiting distribution.
4.0 ERROR COMPONENT MODEL: REDUCED-FORM ESTIMATION

4.1 Description of the System

This and the following three chapters are intended to provide methods for the estimation of structural and reduced-form parameters in a system of simultaneous linear equations with the error structure specified in (1.3). Following the common practice, we first consider reduced-form estimation.
To begin with, let us restate in greater detail the assumptions underlying the error component model. The structural equations are

$$y_{it}'\,A + x_{it}'\,B + \varepsilon_{it}' = 0, \qquad (4.1)$$

where $y_{it}' = (y_{1it}, y_{2it}, \ldots, y_{Mit})$, $x_{it}' = (x_{1it}, x_{2it}, \ldots, x_{Kit})$, A is an M x M nonsingular matrix of constants with diagonal elements equal to -1, B is a K x M matrix of constants, and $\varepsilon_{it}' = (\varepsilon_{1it}, \varepsilon_{2it}, \ldots, \varepsilon_{Mit})$ is a vector of unobservable errors with $\varepsilon_{\mu it} = u_{\mu i} + v_{\mu t} + w_{\mu it}$.
We make the following assumptions:

(i) For each i, t,

$$u_i = (u_{1i}, u_{2i}, \ldots, u_{Mi})', \qquad v_t = (v_{1t}, v_{2t}, \ldots, v_{Mt})', \qquad w_{it} = (w_{1it}, w_{2it}, \ldots, w_{Mit})'$$

are mutually independent M-dimensional normally distributed random vectors with zero means and covariance matrices

$$\Omega_u = E(u_iu_i') = [\sigma^u_{\mu\mu'}], \qquad \Omega_v = E(v_tv_t') = [\sigma^v_{\mu\mu'}], \qquad \Omega_w = E(w_{it}w_{it}') = [\sigma^w_{\mu\mu'}] \qquad (\mu, \mu' = 1, 2, \ldots, M),$$

the elements of these matrices being all finite.
(ii) For each $\mu$ ($\mu = 1, 2, \ldots, M$), $u_{\mu 1}, u_{\mu 2}, \ldots$ represent independent drawings from the marginal distribution of $u_\mu$, which is normal $(0, \sigma^u_{\mu\mu})$; $v_{\mu 1}, v_{\mu 2}, \ldots$ represent independent drawings from the marginal distribution of $v_\mu$, which is normal $(0, \sigma^v_{\mu\mu})$; and $w_{\mu 11}, w_{\mu 12}, \ldots$ represent independent drawings from the marginal distribution of $w_\mu$, which is normal $(0, \sigma^w_{\mu\mu})$.

(iii) The exogenous variables $x_{jit}$ ($j = 1, 2, \ldots, K$) are independent of the error terms $\varepsilon_{\mu it}$ ($\mu = 1, 2, \ldots, M$).
Suppose that we have a sample of nT observations for n cross-sectional units and T time intervals. To avoid complications, we shall assume that the nT x K matrix X is nonstochastic and subject to the conditions that

(a) $\displaystyle\lim_{\substack{n\to\infty\\T\to\infty}}\left(\frac{X'X}{nT}\right)$ exists and is positive definite, and further that

(b) $\displaystyle\sum_{i=1}^{n}\sum_{t=1}^{T}(x_{jit} - \bar{x}_{ji.} - \bar{x}_{j.t} + \bar{x}_{j..})^2 > 0 \qquad (j = 1, 2, \ldots, K) \qquad (4.2)$

for all integers n and T > 1.

A nonstochastic X implies the assumption (iii) above, while (4.2) ensures that X does not include a column vector of ones.
In terms of the nT observations, we can write the $\mu$th structural equation in the compact form

$$y_\mu = Y^{(\mu)}\alpha_\mu + X_\mu\beta_\mu + \varepsilon_\mu, \qquad (4.3)$$

where $y_\mu$ is an nT x 1 vector of observations on the $\mu$th endogenous variable, $Y^{(\mu)}$ is an nT x $(\ell_\mu - 1)$ matrix of observations on the $(\ell_\mu - 1)$ endogenous variables (other than $y_\mu$) included in the $\mu$th equation, $X_\mu$ is an nT x $K_\mu$ matrix of observations on the $K_\mu$ exogenous variables included in the $\mu$th equation, $\varepsilon_\mu$ is an nT x 1 vector of errors, $\alpha_\mu$ is an $(\ell_\mu - 1)$ x 1 vector of unknown parameters, and $\beta_\mu$ is a $K_\mu$ x 1 vector of unknown parameters.
'"'"'1J
If we write,
-e
.
1 .Q. 0
A= 0
0
1
-0
0
1
0
0
I
0
,
B
u]..ll
= I
u
Il
= u]..l2
I
1
nTxn
u
nTxT
]..In
v
Il
v]..ll
w]..lll
= v]..l2
and w = w]..l12
Il
v]..lT
nxl
(4.4)
w]..li:lT
Txl
nTxl
where 1 is a Txl vector of ones and I is a TxT identity matrix, we have
e:
""1J
= Au + Bv +
Il
Il
w .
Il
Using the assumptions listed above, we obtain

$$\Sigma_{\mu\mu} = E(\varepsilon_\mu\varepsilon_\mu') = \sigma^u_{\mu\mu}\,\bar{A}\bar{A}' + \sigma^v_{\mu\mu}\,\bar{B}\bar{B}' + \sigma^w_{\mu\mu}\,I_{nT} = \sigma^u_{\mu\mu}(I_n\otimes J) + \sigma^v_{\mu\mu}(J_n\otimes I) + \sigma^w_{\mu\mu}\,I_{nT};$$

that is, $\Sigma_{\mu\mu}$ is made up of n rows and n columns of T x T block matrices, with diagonal blocks

$$\sigma^u_{\mu\mu}J + (\sigma^v_{\mu\mu} + \sigma^w_{\mu\mu})I$$

and off-diagonal blocks $\sigma^v_{\mu\mu}I$. $\qquad (4.5)$

Here J is a T x T matrix with 1 everywhere. Evidently $\Sigma_{\mu\mu'} = E(\varepsilon_\mu\varepsilon_{\mu'}')$, where $\mu' \neq \mu$, can be obtained from (4.5) by replacing one of the $\mu$'s by $\mu'$.
By trial, error and generalization we obtain

$$\Sigma^{-1}_{\mu\mu} = \begin{bmatrix} a_1I + a_2J & a_3I + a_4J & \cdots & a_3I + a_4J \\ a_3I + a_4J & a_1I + a_2J & \cdots & a_3I + a_4J \\ \vdots & & \ddots & \vdots \\ a_3I + a_4J & a_3I + a_4J & \cdots & a_1I + a_2J \end{bmatrix}, \qquad (4.6)$$

where

$$a_1 = \frac{1}{\sigma^w_{\mu\mu}} - \frac{\sigma^v_{\mu\mu}}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})},$$

$$a_2 = -\frac{\sigma^u_{\mu\mu}}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + T\sigma^u_{\mu\mu})}\left[1 - \frac{\sigma^v_{\mu\mu}(2\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}{(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}\right],$$

$$a_3 = -\frac{\sigma^v_{\mu\mu}}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})},$$

$$a_4 = \frac{\sigma^u_{\mu\mu}\sigma^v_{\mu\mu}(2\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + T\sigma^u_{\mu\mu})(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}. \qquad (4.7)$$
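The following is a hedged numerical check, not from the thesis, of the block structure (4.5)-(4.7): it builds $\Sigma_{\mu\mu}$ from Kronecker products and verifies that the block matrix with the coefficients $a_1, \ldots, a_4$ above is its inverse. The small dimensions and variance values are illustrative.

```python
# Error-component covariance Sigma = s_u*(I_n kron J_T) + s_v*(J_n kron I_T) + s_w*I,
# and verification of its block-form inverse as in (4.6)-(4.7).
import numpy as np

n, T = 4, 3
s_u, s_v, s_w = 0.7, 0.4, 1.1
I_n, I_T = np.eye(n), np.eye(T)
J_n, J_T = np.ones((n, n)), np.ones((T, T))

Sigma = s_u * np.kron(I_n, J_T) + s_v * np.kron(J_n, I_T) + s_w * np.eye(n * T)

a1 = 1/s_w - s_v / (s_w * (s_w + n*s_v))
a3 = -s_v / (s_w * (s_w + n*s_v))
a2 = -(s_u / (s_w * (s_w + T*s_u))) * (
        1 - s_v * (2*s_w + n*s_v + T*s_u) / ((s_w + n*s_v) * (s_w + n*s_v + T*s_u)))
a4 = s_u * s_v * (2*s_w + n*s_v + T*s_u) / (
        s_w * (s_w + T*s_u) * (s_w + n*s_v) * (s_w + n*s_v + T*s_u))

Sigma_inv = np.kron(I_n, a1*I_T + a2*J_T) + np.kron(J_n - I_n, a3*I_T + a4*J_T)
print(np.allclose(Sigma_inv, np.linalg.inv(Sigma)))   # True
```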
The reduced form of the system written in (4.1) is

$$y_{it}' = x_{it}'\,\Pi + \varepsilon_{it}^{*\prime}, \qquad (4.8)$$

where $\varepsilon_{it}^{*\prime} = -\varepsilon_{it}'\,A^{-1}$ and $\Pi = -B\,A^{-1}$. Letting $\varepsilon_{it}^{*\prime} = (\varepsilon^*_{1it}, \varepsilon^*_{2it}, \ldots, \varepsilon^*_{Mit})$, we have, for $\mu = 1, 2, \ldots, M$,

$$\varepsilon^*_{\mu it} = -\sum_{r=1}^{M}a^{\mu r}\varepsilon_{rit} = -\sum_{r=1}^{M}a^{\mu r}u_{ri} - \sum_{r=1}^{M}a^{\mu r}v_{rt} - \sum_{r=1}^{M}a^{\mu r}w_{rit} = u^*_{\mu i} + v^*_{\mu t} + w^*_{\mu it}, \qquad (4.9)$$

where $u^*_{\mu i} = -\sum_{r=1}^{M}a^{\mu r}u_{ri}$, etc.
ri
Recalling the assumptions about u., L. and w. , we see that
..
~
M
~
M
~
a lJr alJ s
r=lr=l
,
0, i f i
M M
~
~
~t
,
a
r=ls=l
0, if t
lJr
,
+i
alJ
+t
n
U
"rs
= ou*' if i' = i,
lJlJ'
for all lJ, lJ
s
v*,
v
° rs = °lJlJ ' i f t
for lJ,lJ
= 1, 2,
M M
~
0, i f i
+i
and/or t
+t
... , M·,
,
= t
... , M;
s W
w*,
a lJr alJ
° rs = °lJlJ ' i f i
r=ls=l
~
and E(W*'t w*'.' ') =
lJ~
lJ ~ t
= 1, 2,
,
,
= i and t
= t
(4.10)
e
Thus we can assert that

(i) $u^*_i$, $v^*_t$ and $w^*_{it}$ are mutually independent M-variate normal vectors with zero means and covariance matrices

$$\Omega_{u*} = [\sigma^{u*}_{\mu\mu'}], \qquad \Omega_{v*} = [\sigma^{v*}_{\mu\mu'}], \qquad \Omega_{w*} = [\sigma^{w*}_{\mu\mu'}] \qquad (\mu, \mu' = 1, 2, \ldots, M);$$

(ii) for each $\mu$, $u^*_{\mu 1}, u^*_{\mu 2}, \ldots$ are independent drawings from the marginal distribution of $u^*_\mu$, which is normal $(0, \sigma^{u*}_{\mu\mu})$; $v^*_{\mu 1}, v^*_{\mu 2}, \ldots$ are independent drawings from the marginal distribution of $v^*_\mu$, which is normal $(0, \sigma^{v*}_{\mu\mu})$; and $w^*_{\mu 11}, w^*_{\mu 12}, \ldots$ are independent drawings from the marginal distribution of $w^*_\mu$, which is normal $(0, \sigma^{w*}_{\mu\mu})$.

Consequently the covariance matrix of the $\mu$th reduced-form disturbance vector has the same structure as (4.5), with diagonal T x T blocks

$$\sigma^{u*}_{\mu\mu}J + (\sigma^{v*}_{\mu\mu} + \sigma^{w*}_{\mu\mu})I$$

and off-diagonal blocks $\sigma^{v*}_{\mu\mu}I$, $\qquad (4.12)$

and

$$\Sigma^{*-1}_{\mu\mu} = \begin{bmatrix} a^*_1I + a^*_2J & a^*_3I + a^*_4J & \cdots \\ a^*_3I + a^*_4J & a^*_1I + a^*_2J & \cdots \\ \vdots & & \ddots \end{bmatrix}, \qquad (4.13)$$

where $a^*_1$, $a^*_2$, $a^*_3$ and $a^*_4$ have the same expressions as those of $a_1$, $a_2$, $a_3$, $a_4$ with $\sigma^u_{\mu\mu}$, $\sigma^v_{\mu\mu}$, $\sigma^w_{\mu\mu}$ replaced by $\sigma^{u*}_{\mu\mu}$, $\sigma^{v*}_{\mu\mu}$, $\sigma^{w*}_{\mu\mu}$, respectively.

The formula for $\Sigma^*_{\mu\mu'}$ ($\mu' \neq \mu$) can be derived from (4.12) by changing one of the subscripts $\mu$ into $\mu'$.
The distinctive feature of the error component model is that the covariance matrices of the disturbance vectors of any one of the structural or reduced-form equations are not diagonal. Consequently, many of the techniques which are ordinarily employed for structural and reduced-form estimation are no longer appropriate. In chapter 2.0, for instance, we saw that the ordinary least squares method applied to each reduced-form equation separately would give B.L.U. estimates. This was because the error vector of the $\mu$th reduced-form equation had a covariance matrix of the form $\sigma_{\mu\mu}I$. In the present case, because of the form of the covariance matrices, the ordinary least squares method applied to an equation in the reduced form of the system will give unbiased, but not minimum variance, estimates (within the class of unbiased estimates).
4.2 Estimation of Covariance Matrices of Reduced-Form Disturbances

We can write the reduced-form equations in the form

$$y_{\mu it} = x_{it}'\,\Pi_\mu + \varepsilon^*_{\mu it} \qquad (\mu = 1, 2, \ldots, M).$$

If the covariance matrices $\Sigma^*_{\mu\mu'}$ were known, best estimates of the reduced-form parameters would be given by Aitken's generalized least squares method applied to the entire set of reduced-form equations. However, since $\sigma^{u*}_{\mu\mu'}$, $\sigma^{v*}_{\mu\mu'}$ and $\sigma^{w*}_{\mu\mu'}$ are unknown parameters, the first step in the estimation procedure is to find estimates of the variance components which appear in the covariance matrices.
Ignoring the complicated error structure we obtain the ordinary least squares estimators

$$\hat{\Pi}_\mu = (X'X)^{-1}X'y_\mu \qquad (\mu = 1, 2, \ldots, M). \qquad (4.14)$$

These estimators are unbiased and consistent and their covariance matrix is

$$(X'X)^{-1}(X'\Sigma^*_{\mu\mu}X)(X'X)^{-1}. \qquad (4.15)$$

Writing $\hat{y}_{\mu it} = x_{it}'(X'X)^{-1}X'y_\mu$, we have estimates

$$e^*_{\mu it} = y_{\mu it} - \hat{y}_{\mu it} \qquad (4.16)$$

of the reduced-form disturbances $\varepsilon^*_{\mu it}$.
Theorem (4.1). Under the assumption (4.2), $e^*_{\mu it} \xrightarrow{P} \varepsilon^*_{\mu it}$.

Proof: Write $S = \lim_{\substack{n\to\infty\\T\to\infty}}\left(\frac{X'X}{nT}\right)$. Clearly S has finite elements and is, moreover, positive definite. We have

$$e^*_{\mu it} - \varepsilon^*_{\mu it} = -x_{it}'\left(\frac{X'X}{nT}\right)^{-1}\left[\frac{X'\bar{A}u^*_\mu}{nT} + \frac{X'\bar{B}v^*_\mu}{nT} + \frac{X'w^*_\mu}{nT}\right],$$

and therefore

$$\operatorname{plim}_{\substack{n\to\infty\\T\to\infty}}(e^*_{\mu it} - \varepsilon^*_{\mu it}) = -x_{it}'\,S^{-1}\left[\operatorname{plim}\frac{X'\bar{A}u^*_\mu}{nT} + \operatorname{plim}\frac{X'\bar{B}v^*_\mu}{nT} + \operatorname{plim}\frac{X'w^*_\mu}{nT}\right]. \qquad (4.17)$$

Consider $\xi = \frac{X'w^*_\mu}{nT}$. Then $E(\xi\xi') = \frac{\sigma^{w*}_{\mu\mu}}{(nT)^2}(X'X)$, so that $\lim E(\xi'\xi) = 0$, which implies by theorem (3.4) that $\operatorname{plim}\left(\frac{X'w^*_\mu}{nT}\right) = 0$.

Next, consider $\zeta = \frac{X'\bar{A}u^*_\mu}{nT} = \frac{1}{n}\sum_{i=1}^{n}\bar{x}_{i.}\,u^*_{\mu i}$, where $\bar{x}_{i.} = \frac{1}{T}\sum_{t=1}^{T}x_{it}$. We have

$$\operatorname{tr}E(\zeta\zeta') = E(\zeta'\zeta) = \frac{\sigma^{u*}_{\mu\mu}}{n^2}\sum_{r=1}^{K}\sum_{i=1}^{n}\bar{x}^2_{ri.} \leq \frac{\sigma^{u*}_{\mu\mu}}{n}\left(\frac{1}{nT}\sum_{r=1}^{K}\sum_{i=1}^{n}\sum_{t=1}^{T}x^2_{rit}\right) \to 0 \quad\text{as } n\to\infty,\ T\to\infty.$$

Similarly, $\operatorname{plim}\left(\frac{X'\bar{B}v^*_\mu}{nT}\right) = 0$. Using these results in (4.17) we obtain

$$\operatorname{plim}_{\substack{n\to\infty\\T\to\infty}}(e^*_{\mu it} - \varepsilon^*_{\mu it}) = 0,$$

which implies that $e^*_{\mu it} \xrightarrow{P} \varepsilon^*_{\mu it}$.

Note: In taking the probability limit we let both n and T tend to infinity. It is not sufficient to say that $(nT) \to \infty$. Suppose n is fixed, while $T \to \infty$. Then (nT) tends to infinity, but $\operatorname{tr}E(\zeta\zeta')$ does not necessarily tend to zero. Similarly, if T is fixed while $n \to \infty$, $\operatorname{plim}\left(\frac{X'\bar{B}v^*_\mu}{nT}\right)$ is not necessarily zero. Unless explicitly stated otherwise, we assume that both n and T can be very large.
Using the calculated residuals we can estimate the variance components $\sigma^{u*}_{\mu\mu}$, $\sigma^{v*}_{\mu\mu}$ and $\sigma^{w*}_{\mu\mu}$. Consider the estimators²

$$\tilde{\sigma}^{w*}_{\mu\mu} = \frac{\sum_{i=1}^{n}\sum_{t=1}^{T}(e^*_{\mu it} - \bar{e}^*_{\mu i.} - \bar{e}^*_{\mu .t} + \bar{e}^*_{\mu ..})^2}{(n-1)(T-1)},$$

$$\tilde{\sigma}^{u*}_{\mu\mu} = \frac{\sum_{i=1}^{n}(\bar{e}^*_{\mu i.} - \bar{e}^*_{\mu ..})^2}{n-1} - \frac{\tilde{\sigma}^{w*}_{\mu\mu}}{T}, \qquad \tilde{\sigma}^{v*}_{\mu\mu} = \frac{\sum_{t=1}^{T}(\bar{e}^*_{\mu .t} - \bar{e}^*_{\mu ..})^2}{T-1} - \frac{\tilde{\sigma}^{w*}_{\mu\mu}}{n}. \qquad (4.18)$$

Theorem (4.2). Under the assumption (4.2), the estimators $\tilde{\sigma}^{u*}_{\mu\mu}$, $\tilde{\sigma}^{v*}_{\mu\mu}$ and $\tilde{\sigma}^{w*}_{\mu\mu}$ given by (4.18) are consistent and asymptotically unbiased. Moreover, for every given integer $r \geq 1$,

$$\lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{w*}_{\mu\mu} - \sigma^{w*}_{\mu\mu})^r = \lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{u*}_{\mu\mu} - \sigma^{u*}_{\mu\mu})^r = \lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{v*}_{\mu\mu} - \sigma^{v*}_{\mu\mu})^r = 0. \qquad (4.19)$$

Proof: Let

$$L = I_{nT} - \frac{\bar{A}\bar{A}'}{T} - \frac{\bar{B}\bar{B}'}{n} + \frac{J_{nT}}{nT}, \qquad N = I_{nT} - X(X'X)^{-1}X'.$$

²The formulae (4.18) have a familiar form. We could, however, replace (n-1) and (T-1) by n and T, respectively, and drop the terms $\tilde{\sigma}^{w*}_{\mu\mu}/T$ and $\tilde{\sigma}^{w*}_{\mu\mu}/n$ from the expressions for $\tilde{\sigma}^{u*}_{\mu\mu}$ and $\tilde{\sigma}^{v*}_{\mu\mu}$. The resulting estimates would be consistent and asymptotically unbiased. Further, (4.19) would still hold.
The matrices L and N are idempotent and their ranks are (n-1)(T-1) and (nT-K) respectively. Moreover, $L\,\Sigma^*_{\mu\mu}\,L = \sigma^{w*}_{\mu\mu}L$. Now we can write

$$\tilde{\sigma}^{w*}_{\mu\mu} = \frac{e^{*\prime}_\mu L\,e^*_\mu}{(n-1)(T-1)} = \frac{\varepsilon^{*\prime}_\mu[NLN]\varepsilon^*_\mu}{(n-1)(T-1)}.$$

Since $\tilde{\sigma}^{w*}_{\mu\mu}$ is a quadratic form in $\varepsilon^*_\mu$, which is normal $(0, \Sigma^*_{\mu\mu})$, it follows by a theorem due to Box [6] that the s-th cumulant of $\tilde{\sigma}^{w*}_{\mu\mu}$ is

$$K_s = 2^{s-1}(s-1)!\,\operatorname{tr}\left[\frac{\Sigma^*_{\mu\mu}\,NLN}{(n-1)(T-1)}\right]^s.$$

The matrices appearing in the expression above are all symmetric, so that

$$E(\tilde{\sigma}^{w*}_{\mu\mu}) = K_1 = \sigma^{w*}_{\mu\mu}\,\frac{\operatorname{tr}(LN)}{(n-1)(T-1)} = \sigma^{w*}_{\mu\mu}\,\frac{\operatorname{tr}L - \operatorname{tr}[(X'X)^{-1}X'LX]}{(n-1)(T-1)} = \sigma^{w*}_{\mu\mu}\left[1 - \frac{\operatorname{tr}\left[\left(\frac{X'X}{nT}\right)^{-1}\frac{X'LX}{nT}\right]}{(n-1)(T-1)}\right].$$

Therefore $\lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{w*}_{\mu\mu}) = \sigma^{w*}_{\mu\mu}$, and $\tilde{\sigma}^{w*}_{\mu\mu}$ is asymptotically unbiased. Further,

$$K_2 = 2(\sigma^{w*}_{\mu\mu})^2\,\frac{\operatorname{tr}[(LN)^2]}{(n-1)^2(T-1)^2},$$

which tends to zero as $n\to\infty$, $T\to\infty$.
Since $\tilde{\sigma}^{w*}_{\mu\mu}$ is asymptotically unbiased,

$$\lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{w*}_{\mu\mu} - \sigma^{w*}_{\mu\mu})^2 = \lim_{\substack{n\to\infty\\T\to\infty}}\operatorname{Var}(\tilde{\sigma}^{w*}_{\mu\mu}) = \lim_{\substack{n\to\infty\\T\to\infty}}K_2 = 0.$$

This shows by corollary (3.1.1) that $\tilde{\sigma}^{w*}_{\mu\mu}$ is consistent.

Further, let $\nu_3$ denote the third central moment of $\tilde{\sigma}^{w*}_{\mu\mu}$. Since $\tilde{\sigma}^{w*}_{\mu\mu}$ is asymptotically unbiased, we have

$$\lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{w*}_{\mu\mu} - \sigma^{w*}_{\mu\mu})^3 = \lim(\nu_3) = \lim(K_3) = \lim\ 8(\sigma^{w*}_{\mu\mu})^3\,\frac{\operatorname{tr}[(LN)^3]}{(n-1)^3(T-1)^3} = 0.$$

Continuing in this way, we see that for an arbitrary integer $r \geq 1$,

$$\lim_{\substack{n\to\infty\\T\to\infty}}E(\tilde{\sigma}^{w*}_{\mu\mu} - \sigma^{w*}_{\mu\mu})^r = 0.$$

The statements about $\tilde{\sigma}^{u*}_{\mu\mu}$ and $\tilde{\sigma}^{v*}_{\mu\mu}$ are true by analogy. The theorem is proved.
Substituting σ̃^{u*}_{μμ}, σ̃^{v*}_{μμ} and σ̃^{w*}_{μμ} for σ^{u*}_{μμ}, σ^{v*}_{μμ} and σ^{w*}_{μμ} in (4.12), we obtain an estimate Σ̃*_{μμ} of Σ*_{μμ}. The nonzero elements of Σ*_{μμ} are linear functions of σ^{u*}_{μμ}, σ^{v*}_{μμ} and σ^{w*}_{μμ}. Therefore, if q_{ab} and q̃_{ab} are the corresponding elements of Σ*_{μμ} and Σ̃*_{μμ} respectively, we have by corollary (3.2.1)

lim_{n→∞, T→∞} E(q̃_{ab} − q_{ab})^r = 0   (4.20)

for any given integer r ≥ 1. This implies that the elements of Σ̃*_{μμ} are consistent and asymptotically unbiased estimates of the corresponding elements of Σ*_{μμ}.
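To make the computation in (4.18) concrete, the following sketch (Python with numpy; it is an illustration, not part of the thesis) estimates the three variance components from a two-way array of first-round least squares residuals. The array name `resid` and the assumption that the residuals are arranged with one row per cross-sectional unit and one column per time period are the example's own.

```python
import numpy as np

def variance_components(resid):
    """Estimate (sigma_u, sigma_v, sigma_w) from an n x T array of
    first-round OLS residuals, following the decomposition in (4.18)."""
    n, T = resid.shape
    row_means = resid.mean(axis=1, keepdims=True)    # e_bar_{i.}
    col_means = resid.mean(axis=0, keepdims=True)    # e_bar_{.t}
    grand_mean = resid.mean()                        # e_bar_{..}

    within = resid - row_means - col_means + grand_mean
    sigma_w = (within ** 2).sum() / ((n - 1) * (T - 1))
    sigma_u = ((row_means - grand_mean) ** 2).sum() / (n - 1) - sigma_w / T
    sigma_v = ((col_means - grand_mean) ** 2).sum() / (T - 1) - sigma_w / n
    return sigma_u, sigma_v, sigma_w

# Example with simulated residuals (illustration only).
rng = np.random.default_rng(0)
n, T = 50, 40
e = (rng.normal(0, 1.0, (n, 1)) + rng.normal(0, 0.5, (1, T))
     + rng.normal(0, 0.3, (n, T)))
print(variance_components(e))
```

In small samples the last two estimates can be negative, which is a familiar feature of analysis-of-variance estimators of variance components.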
In the following two sections of this chapter we shall encounter some functions of the variance components σ^{u*}_{μμ}, σ^{v*}_{μμ} and σ^{w*}_{μμ}. In the expression for Σ*_{μμ}^{-1} the following expressions occur:

(1)  f_1 = 1 / σ^{w*}_{μμ},

(2)  f_2 = n σ^{v*}_{μμ} / [σ^{w*}_{μμ} (σ^{w*}_{μμ} + n σ^{v*}_{μμ})],

(3)  f_3 = T σ^{u*}_{μμ} / [σ^{w*}_{μμ} (σ^{w*}_{μμ} + T σ^{u*}_{μμ})],

(4)  f_4 = [nT σ^{u*}_{μμ} σ^{v*}_{μμ} / σ^{w*}_{μμ}] {1 + σ^{w*}_{μμ}/(σ^{w*}_{μμ} + n σ^{v*}_{μμ} + T σ^{u*}_{μμ})} / [(σ^{w*}_{μμ} + n σ^{v*}_{μμ})(σ^{w*}_{μμ} + T σ^{u*}_{μμ})].
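As a numerical check on how these four scalars assemble the inverse, the sketch below (Python/numpy, not part of the thesis) builds Σ*_{μμ} for small n and T from the variance components, then verifies that f_1 I − f_3 (AA'/T) − f_2 (BB'/n) + f_4 (J/nT) reproduces its inverse. The form Σ*_{μμ} = σ^{w*} I + σ^{u*} AA' + σ^{v*} BB' and the ordering of observations (cross-section index slowest) are assumptions of the example.

```python
import numpy as np

n, T = 4, 3
s_u, s_v, s_w = 0.8, 0.5, 1.2              # sigma^{u*}, sigma^{v*}, sigma^{w*}

A = np.kron(np.eye(n), np.ones((T, 1)))    # cross-section dummies (nT x n)
B = np.kron(np.ones((n, 1)), np.eye(T))    # time dummies          (nT x T)
I = np.eye(n * T)
J = np.ones((n * T, n * T))

Sigma = s_w * I + s_u * (A @ A.T) + s_v * (B @ B.T)

f1 = 1.0 / s_w
f2 = n * s_v / (s_w * (s_w + n * s_v))
f3 = T * s_u / (s_w * (s_w + T * s_u))
f4 = (n * T * s_u * s_v / s_w) * (1 + s_w / (s_w + n * s_v + T * s_u)) \
     / ((s_w + n * s_v) * (s_w + T * s_u))

Sigma_inv = f1 * I - f3 * (A @ A.T) / T - f2 * (B @ B.T) / n + f4 * J / (n * T)
print(np.allclose(Sigma_inv, np.linalg.inv(Sigma)))   # expect True
```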
We consider the estimates f̃_s of these functions obtained by replacing σ^{u*}_{μμ}, σ^{v*}_{μμ} and σ^{w*}_{μμ} by σ̃^{u*}_{μμ}, σ̃^{v*}_{μμ} and σ̃^{w*}_{μμ}, respectively, and derive their properties. To simplify the algebra, we shall make some approximations. We can obviously write

σ̃^{u*}_{μμ} = ξ + φ(u),   σ̃^{v*}_{μμ} = η + φ(v),   σ̃^{w*}_{μμ} = ζ + φ(w),   (4.21)

where (i) ξ, η and ζ are mutually independent random variables such that (n−1)ξ/σ^{u*}_{μμ} is distributed as χ² with ν_1 = n−1 degrees of freedom; (T−1)η/σ^{v*}_{μμ} is χ² with ν_2 = T−1 degrees of freedom; (n−1)(T−1)ζ/σ^{w*}_{μμ} is χ² with (n−1)(T−1) degrees of freedom; and (ii) φ(u), φ(v) and φ(w) are random variables such that

lim_{n→∞, T→∞} E[φ(u)]^r = lim E[φ(v)]^r = lim E[φ(w)]^r = 0   (4.22)

for every given integer r ≥ 1.
For n and T both large, we can therefore write

σ̃^{u*}_{μμ} ≈ ξ,   σ̃^{v*}_{μμ} ≈ η,   σ̃^{w*}_{μμ} ≈ ζ.   (4.23)

Using these approximations, it may be shown in a straightforward manner by means of corollaries (3.2.1) and (3.2.2) that

lim_{n→∞, T→∞} E(f̃_s − f_s)^r = 0   (s = 1, 2, 3, 4)   (4.24)

for every given integer r ≥ 1. We may note, however, that (4.24) holds whether or not we make the approximations (4.23), which are meant only to make the derivation easier.
In order to estimate Σ*_{μμ'} (μ ≠ μ'), we use the estimators

σ̃^{w*}_{μμ'} = Σ_{i=1}^{n} Σ_{t=1}^{T} (e*_{μit} − ē*_{μi·} − ē*_{μ·t} + ē*_{μ··})(e*_{μ'it} − ē*_{μ'i·} − ē*_{μ'·t} + ē*_{μ'··}) / [(n−1)(T−1)],

σ̃^{u*}_{μμ'} = Σ_{i=1}^{n} (ē*_{μi·} − ē*_{μ··})(ē*_{μ'i·} − ē*_{μ'··}) / (n−1) − σ̃^{w*}_{μμ'}/T,   (4.25)

σ̃^{v*}_{μμ'} = Σ_{t=1}^{T} (ē*_{μ·t} − ē*_{μ··})(ē*_{μ'·t} − ē*_{μ'··}) / (T−1) − σ̃^{w*}_{μμ'}/n.

Results similar to those given in theorem (4.2) can be established for the estimators (4.25).
54
4.3 ,Some Useful Results
We now establish a number of results which will be used in the
(i) L*
HH
following section.
is a positive definite symmetric matrix and
*-1
so is L
for given nand T. The rank of X(nTxK) is K, so that it follows
HH
by a well known theorem (Goldberger [8], Theorem 2.7.4, p. 35) of matrix
is a positive definite matrix.
(X'~-lHJ;l xl
E
=
nT
We can write
1
+
nTcr 2
w
(X':B'X) +
nT(cr2 + ncr 2 )
w
v
,
e-
+
, ,
1
X AA X
nT(cr 2 + Tcr 2 )
w
u
T
X J X
(nT)2
i
(4.26)
where E is KxK matrix with elements
E
=
rs
n
T
L
L (x.
i=l t=l
r1t
- x. - x
+
r1.
r.t
Xr.. )(x S1't
- x
si.
- x
s.t
+
Xs •. ).
(4.27)
J is nTxnT matrix of l's and i is a scalar function which tends to
zero as n
+
00,
T
+
The conditions (4.2) ensure that
00.
exists and is positive definite.
nu1~
matrices as n
+
00,
T
+
00.
X ,*-1
L
X
. Therefore, lim (
n+oo
T+oo
)JH .
nT
The remaining terms in (4.26) tend to
)
exists and is a positive definite matrix.
55
X'
(
t'* -1X )
.
n~M
, *-1
has the same expression as (
u*
tV
replaced by cr
lJlJ
have
tV
cr
v*
• lJlJ •
w*
tV
cr
( X ,~.-1
L
X)
MM
nT
Plim
n~
respectively.
lJlJ
Plim
j
X
u*
with cr·
lJlJ •
nT
lim
MM
nT
n~
= lim
n~
T~
cr v*
lJlJ·
cr w*
tV
lJlJ
Applying Slutsky's theorem, we
( X , L .-1X)
=
T~
and
X L
T~
(a:>T)
,
(X 't'::1X)-1 =
nT
n~
T~
and 0R.m the
(=i_~.=.l_X_R._:__x_
mi
_ _.) +
.•
By (4.2)
n
T
ILL
i=l t=l
x
R.it
x
mit
I
a4 -
(4.28)
is bounded for all nand T so that
nT
there exists a positive number y such that
~
nT
T
Further.
L x
x I
It=l
R..t m.t
T
<
y <
00
( R.. m
= 1,
2 • • •• , K) •
56
n
Ii:1
x JI,i. xmf.
I
(nE
T
2
Ex.
<
T
2 )
Ex.
t-l ht) i-I t=1 m1t
n
nT
Ix R,. xm•• I
nT ,
<
e
These terms are therefore bounded for all nand T.
Writing ~tm
= (~tm
= ~1
- 0tm)
- ~2 - ~3 + ~4' and using (4.24) we
see that, given any two integers a and b
lim IE(1/Ja 1/Jb)
n~
r
s
T~
I,
.: lim
n~
IE
~
1,
1/J2'a E '2b
r
1/J s
=
0:
(r,s,
= 1,
2, 3, 4)
T~
which ensures that, given any integer c > 1
lim E(~tm)c
(ii)
= lim E(~tm
n~
n~
T~
T~
(4.29)
-
We can, therefore, write
(4.30)
where the elements ~tm of !:I
1
satisfy (4.29).
Using a fundamental result of matrix calculus (Bodewig, E. [5], p. 30),

(X'Σ̃*_{μμ}^{-1}X/nT)^{-1} = (X'Σ*_{μμ}^{-1}X/nT)^{-1} − (X'Σ*_{μμ}^{-1}X/nT)^{-1}(X'Δ_1 X/nT)(X'Σ*_{μμ}^{-1}X/nT)^{-1} + ...   (4.31)

The elements of the K×K matrix (X'Σ*_{μμ}^{-1}X/nT)^{-1} are all bounded. Therefore, the elements η_{ℓm} of Δ_2 = (X'Σ̃*_{μμ}^{-1}X/nT)^{-1} − (X'Σ*_{μμ}^{-1}X/nT)^{-1} have the same properties as the elements d_{ℓm} of (X'Δ_1 X/nT); that is, given any integer r ≥ 1,

lim_{n→∞, T→∞} E(δ̃^{ℓm} − δ^{ℓm})^r = 0,   (4.32)

δ̃^{ℓm} and δ^{ℓm} being the corresponding elements of (X'Σ̃*_{μμ}^{-1}X/nT)^{-1} and (X'Σ*_{μμ}^{-1}X/nT)^{-1} respectively.
We have established the following theorem:

Theorem (4.3). Let Q̃ = X'Σ̃*_{μμ}^{-1}X/nT and Q = X'Σ*_{μμ}^{-1}X/nT. Then

Plim Q̃^{-1} = lim Q^{-1},   and   lim Q = lim [E/(nT σ^{w*}_{μμ})].

Moreover, if δ̃_{ℓm} and δ_{ℓm} are the ℓm-th elements of Q̃ and Q, and δ̃^{ℓm}, δ^{ℓm} the ℓm-th elements of Q̃^{-1} and Q^{-1} respectively,

lim_{n→∞, T→∞} E(δ̃_{ℓm} − δ_{ℓm})^r = 0,   lim_{n→∞, T→∞} E(δ̃^{ℓm} − δ^{ℓm})^r = 0,

for any given integer r ≥ 1.
(iii) Consider the random vectors

λ = X'Σ*_{μμ}^{-1}ε*_μ/√(nT)   and   λ̃ = X'Σ̃*_{μμ}^{-1}ε*_μ/√(nT).

Clearly, λ is normal [0, Q], and

lim Q = lim [E/(nT σ^{w*}_{μμ})].   (4.33)

Theorem (4.3) ensures that the matrices Q̃^{-1} and Q^{-1} meet the conditions stated in theorem (3.6). Therefore,

p̃ = (X'Σ̃*_{μμ}^{-1}X/nT)^{-1} X'Σ̃*_{μμ}^{-1}ε*_μ/√(nT)

is asymptotically normal with mean vector zero and covariance matrix lim (Q^{-1}). Moreover,

lim E(p̃) = 0.   (4.34)

Further, if λ̃_ℓ and λ_ℓ be the ℓ-th elements of λ̃ and λ respectively, one can show that

lim_{n→∞, T→∞} E(λ̃_ℓ − λ_ℓ)⁴ = 0.   (4.35)
Applying corollary (3.6.1), we derive the following result:

Theorem (4.4). p̃ = (X'Σ̃*_{μμ}^{-1}X/nT)^{-1} X'Σ̃*_{μμ}^{-1}ε*_μ/√(nT) converges in distribution to normal [0, lim (Q^{-1})], and

(i) lim E(p̃) = 0,   (ii) lim E(p̃ p̃') = lim (Q^{-1}).   (4.36)
59
Writing X
x
o
=
0
0
X
.,
o
E* =
X
*
Ell
E*
12
*
ElM
E*
21
E*
22
E*
MM
*
EM1
E*
M2
*
LMM
(MnTxMK)
(MnTxMnT)
£1*
and e: *
=
£2*
, we generalize the results of theorem (4.4) in the form
(MnTx1)
of a corollary:
'V(o)
.Eo
Corollary (4.4.1).
converges in
distribution to normal [Q, lim
n~
T~
lim E (.E: (0)
i( 0)
) = lim
n~
n~
T~
T~
4.4
(X~*-\) -1
(4.37)
. nT
4.4 Estimators of Reduced-Form Parameters

We are now in a position to estimate the parameters of the reduced-form equations. Consider the μth equation

y_μ = X Π_μ + ε*_μ.

If the covariance matrix Σ*_{μμ} of ε*_μ were known, Aitken's generalized least squares method would yield the estimators

Π̂^{(o)}_μ = (X'Σ*_{μμ}^{-1}X)^{-1} X'Σ*_{μμ}^{-1} y_μ,   (4.38)
60
which have the covariance matrix
, *-1 ) -1
X I:
X
=
llll
nT
(
(4.39)
On the other hand,the covariance matrix of the ordinary least squares
estimators 11
'"j..I
is
LA
1
(X , X)- 1 (X '*
I: X) (X 'X)-
=
(4.40)
].l].l
11 11:
By Aitken's theorem, the estimators n(o)
-tl
F
=
are B.L.U. so that
,
1 '*
'1
' *-1 -1
(X X) - (XI: X) (X X) - - (X I:
X)
].l].l
].l].l
(4.41)
is nonnegative definite.
We do not know the covariance matrix Σ*_{μμ} and therefore consider the estimators³

Π̃^{(o)}_μ = (X'Σ̃*_{μμ}^{-1}X)^{-1} X'Σ̃*_{μμ}^{-1} y_μ.   (4.42)

The properties of these estimators are given in the following theorem:
Theorem (4.5). If the μth reduced-form equation

y_{μit} = x_it' Π_μ + ε*_{μit}

is considered independently, the estimators Π̃^{(o)}_μ

(i) are consistent and asymptotically unbiased;

(ii) are asymptotically more efficient than the ordinary least squares estimators Π̂_μ; and

(iii) have asymptotic covariance matrix (1/nT) lim Q^{-1}, which is consistently estimated by (X'Σ̃*_{μμ}^{-1}X)^{-1}.

³Note the distinction between Π̂^{(o)}_μ and Π̃^{(o)}_μ.
Proof:

Π̃^{(o)}_μ = (X'Σ̃*_{μμ}^{-1}X)^{-1} X'Σ̃*_{μμ}^{-1} y_μ = Π_μ + (1/√(nT)) p̃.

By theorem (4.4), we have

lim E(Π̃^{(o)}_μ − Π_μ)(Π̃^{(o)}_μ − Π_μ)' = lim E[(1/nT) p̃ p̃'] = lim [(1/nT) Q^{-1}] = 0.

By theorem (3.4) it follows that the Π̃^{(o)}_μ are consistent and asymptotically unbiased. Further,

lim E[nT(Π̃^{(o)}_μ − Π_μ)(Π̃^{(o)}_μ − Π_μ)'] = lim E[p̃ p̃'] = lim (Q^{-1}),   (4.43)

so that the asymptotic covariance matrix of Π̃^{(o)}_μ is (1/nT) lim (Q^{-1}), which is consistently estimated by (X'Σ̃*_{μμ}^{-1}X)^{-1}. We have seen that F is nonnegative definite, which ensures that

lim {E[nT(Π̂_μ − Π_μ)(Π̂_μ − Π_μ)'] − E[nT(Π̃^{(o)}_μ − Π_μ)(Π̃^{(o)}_μ − Π_μ)']}

is nonnegative definite. It follows, therefore, that the estimators Π̃^{(o)}_μ are asymptotically more efficient than the estimators Π̂_μ. The theorem is proved.
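A numerical sketch of the single-equation estimator (4.42) is given below (Python/numpy, illustrative only). It assembles the error component covariance matrix from given variance components exactly as in the earlier sketches and then applies Aitken's formula; in practice one would exploit the Kronecker structure rather than invert an nT×nT matrix, but the direct form keeps the correspondence with (4.42) visible. All array names are the example's own.

```python
import numpy as np

def build_sigma(n, T, s_u, s_v, s_w):
    """Error component covariance matrix (nT x nT), observations ordered i slowest."""
    A = np.kron(np.eye(n), np.ones((T, 1)))
    B = np.kron(np.ones((n, 1)), np.eye(T))
    return s_w * np.eye(n * T) + s_u * (A @ A.T) + s_v * (B @ B.T)

def feasible_gls(X, y, s_u, s_v, s_w, n, T):
    """Two-stage (Aitken) estimator (4.42) with an estimated covariance matrix."""
    Sigma_inv = np.linalg.inv(build_sigma(n, T, s_u, s_v, s_w))
    XtSi = X.T @ Sigma_inv
    return np.linalg.solve(XtSi @ X, XtSi @ y)

# Illustration with simulated data.
rng = np.random.default_rng(1)
n, T, K = 30, 10, 3
X = rng.normal(size=(n * T, K))
beta = np.array([1.0, -0.5, 2.0])
u = np.repeat(rng.normal(0, 1.0, n), T)        # cross-section effects
v = np.tile(rng.normal(0, 0.7, T), n)          # time effects
w = rng.normal(0, 0.5, n * T)                  # measurement error
y = X @ beta + u + v + w

print(feasible_gls(X, y, 1.0, 0.49, 0.25, n, T))   # close to beta
```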
When we apply Aitken's generalized least squares method to each reduced-form equation separately, we evidently assume that the disturbance terms in different equations are mutually uncorrelated, that is,

E(u*_{μi} u*_{μ'i}) = E(v*_{μt} v*_{μ't}) = E(w*_{μit} w*_{μ'it}) = 0   (μ ≠ μ').

This implies that

σ^{u*}_{μμ'} = Σ_{r=1}^{M} Σ_{s=1}^{M} a_{μr} a_{μ's} σ^u_{rs} = 0,   σ^{v*}_{μμ'} = Σ_{r=1}^{M} Σ_{s=1}^{M} a_{μr} a_{μ's} σ^v_{rs} = 0,   σ^{w*}_{μμ'} = Σ_{r=1}^{M} Σ_{s=1}^{M} a_{μr} a_{μ's} σ^w_{rs} = 0.   (4.44)

Even when the structural disturbances are mutually uncorrelated, the relations (4.44) will be true only under heroic conditions. One such condition is that A^{-1}, and therefore A, are diagonal matrices. We have already assumed that the diagonal elements of A are all equal to −1. Therefore, if the matrix A is diagonal with diagonal elements equal to −1, the reduced-form equations will be just the structural equations with uncorrelated error terms. This is generally not true. It is difficult to imagine any other situation under which (4.44) hold without violating the basic assumption that A is nonsingular.

If, as is generally true, different reduced-form disturbances are not mutually uncorrelated, Aitken's generalized estimators obtained by considering each reduced-form equation independently, that is,

Π̃^{(o)}_μ = (X'Σ̃*_{μμ}^{-1}X)^{-1} X'Σ̃*_{μμ}^{-1} y_μ,

are not B.L.U. More efficient estimators are obtained by applying Aitken's method to the whole set of reduced-form equations.
63
We write
.Y1
.Y=
.YZ
I:
0
0
III
0
X
0
II Z + £Z*
X
~
o
= X -II + -E*
~1
X
0
Or .Y
*
X
...
.*
•
.!l.1
*
* 1*
If L = E(£ £ ) were known, B.L.U. estimators of II =
II Z
would
be given by the formula
A
(
II
a
)
= (X
I
*-1 -1 I *-1
L X)
XL . .Y
-1
=
*lL
(X L-· -X)
I
(x '
L*l2x)
..
I
*lM__
(X L" --.x)
M,.
i1 (X L*lJy --,)
j=\l
J
M I
L (X L*Zjy.. )
j=l
J
.
'>
(X I L*MIX)
(X'L*MZX)
(X I
L*MMX)
M,
..
L (X L'*Mj.)
Y.
j=l
J
_( 4.45 )
*tm
th
*-1
where L
represents t-m block submatrix of L
Since L*
~~
I
(~,
~
= 1, Z, ••• , M) are not known, we replace L* by
its estimate tV*
L and consider estimators:
Π̃ = [ (X'Σ̃*^{11}X)  (X'Σ̃*^{12}X)  ...  (X'Σ̃*^{1M}X) ]^{-1} [ Σ_{j=1}^{M} X'Σ̃*^{1j} y_j ]
     [ (X'Σ̃*^{21}X)  (X'Σ̃*^{22}X)  ...  (X'Σ̃*^{2M}X) ]      [ Σ_{j=1}^{M} X'Σ̃*^{2j} y_j ]
     [      ...            ...       ...       ...     ]      [           ...             ]
     [ (X'Σ̃*^{M1}X)  (X'Σ̃*^{M2}X)  ...  (X'Σ̃*^{MM}X) ]      [ Σ_{j=1}^{M} X'Σ̃*^{Mj} y_j ].   (4.46)

Here the Σ̃*^{ℓm}'s are the (nT×nT) block matrices in the matrix Σ̃*^{-1}.

The mere fact that the correlation between reduced-form disturbances is taken into account in deriving the estimators (4.45) constitutes a proof that these are more efficient than the estimators

Π̂^{(o)} = (Π̂_1^{(o)'}, Π̂_2^{(o)'}, ..., Π̂_M^{(o)'})',   (4.47)

which are obtained when the individual equations are considered separately.

Now

Π̃ = (X_o'Σ̃*^{-1}X_o)^{-1} X_o'Σ̃*^{-1} y = Π + (1/√(nT)) p̃^{(o)}.
65
By corollary (4.4.1) we have
'V
lim E (IT. -
LD (1l'V -
II)
'
=
lim
n~
n~
T~
T~
P1l.·m (_'VII) = lim
Th ere·ore,
f
n~
n~
T~
T~
E(~)
= II ,
[~T W- 1 ]
=
o.
which means that estimators ~ are
consistent and asymptotically unbiased.
Moreover,
lim E
,
[d - ll)(~
- ll) 'nT] = lim E (,£(0),£(0) )
n~
n~
T~
T~
-1
= W
(4.48)
•
Thus we have the theorem:
Theorem (4.6).
Estimators (4.46) are consistent and asymptotically
unbiased, and have asymptotic covariance
-1
( X •E•11X )
nT
1
-1
1. 1.
-w
:= riT . l.m
nT
n~
(X<~2X)
(X.•21
(X.•22
E
X)
E
X)
nT
nT
(..•M1
)
X EX
( XL
• oM2 X )
nT
nT
T~
which is consistently estimated by
.,.MII)
(XE nT X
66
-1
'1\;*11X)
(X E
'1\;*21
(X E
'I\;*Ml
(X E
'1\;*12
(X E
'1\;*22
X)
(X E
X)
(X E
'l\;*lMn
X)
(X E
X)
(X E
'I\;*M2 X)
-lC)
'1\;*2M
...
X)
,1\;*Mt-l_
(X E --lC)
(4.49)
Moreover, these estimators are more efficient than the estimators
~(o) and II
e_
=
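The generalized estimator (4.46) stacks the M reduced-form equations and weights them by the inverse of the full MnT×MnT covariance matrix. A compact sketch is given below (Python/numpy, illustrative only); the dictionary `Sigma_blocks` holding the estimated nT×nT blocks of Σ̃* is an assumption of the example.

```python
import numpy as np

def joint_gls(X, Y, Sigma_blocks):
    """Estimator (4.46): X is nT x K, Y is nT x M (column mu holds y_mu),
    Sigma_blocks[(mu, nu)] is the estimated nT x nT block of Cov(eps*_mu, eps*_nu)."""
    nT, M = Y.shape
    Sigma = np.block([[Sigma_blocks[(i, j)] for j in range(M)] for i in range(M)])
    Sigma_inv = np.linalg.inv(Sigma)
    X_o = np.kron(np.eye(M), X)                  # block-diagonal regressor matrix
    y = Y.T.reshape(-1)                          # stacked (y_1', ..., y_M')'
    XtSi = X_o.T @ Sigma_inv
    Pi = np.linalg.solve(XtSi @ X_o, XtSi @ y)
    return Pi.reshape(M, -1)                     # row mu holds the estimate of Pi_mu
```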
We write

ỹ_{μit} = x_it' Π̃_μ   and   e**_{μit} = y_{μit} − ỹ_{μit},   (4.50)

where Π̃_μ is the subvector of Π̃ corresponding to the μth reduced-form equation. Since Π̃ is a consistent and asymptotically unbiased estimate of Π, ỹ_{μit} is a consistent and asymptotically unbiased estimate of E(y_{μit}). The residuals e**_{μit} are the second-round estimates of the errors ε*_{μit}, as opposed to the e*_{μit}, which were obtained in the first round by ordinary least squares.
Clearly, we can write

ỹ_it = C'Π̃   and   y_it = C'Π + ε*_it,   (4.51)

where C = diag(x_it, x_it, ..., x_it) is a (KM×M) matrix. Therefore,

e**_it = y_it − ỹ_it = ε*_it − C'(Π̃ − Π).   (4.52)

Since lim E(Π̃ − Π)(Π̃ − Π)' = lim [(1/nT) W^{-1}] = 0, we have

lim E(ỹ_it − C'Π)(ỹ_it − C'Π)' = lim E[C'(Π̃ − Π)(Π̃ − Π)'C] = lim [C'((1/nT) W^{-1})C] = 0.

It is evident from (4.52) that

e**_it →(P) ε*_it.   (4.53)
Theorem (4.7). The "second-round" estimates e**_it of the residuals are asymptotically uncorrelated with Π̃.

Proof: We have to prove that

lim E{e**_it (Π̃ − Π)'} = 0   (M×KM).   (4.54)
If the covariance matrices Σ*_{μμ'} (μ, μ' = 1, 2, ..., M) were known, we would obtain the estimators

Π̂^{(a)} = (X_o'Σ*^{-1}X_o)^{-1} X_o'Σ*^{-1} y = Π + (X_o'Σ*^{-1}X_o)^{-1} X_o'Σ*^{-1} ε*.

Let e^{(a)**}_it denote the corresponding estimates of the residuals. Then

e^{(a)**} = ε* − X_o(X_o'Σ*^{-1}X_o)^{-1} X_o'Σ*^{-1} ε*,   (4.55)

so that

E[e^{(a)**}(Π̂^{(a)} − Π)'] = E[e^{(a)**} {(X_o'Σ*^{-1}X_o)^{-1} X_o'Σ*^{-1} ε*}'] = 0.   (4.56)

Now let

d_it = (0, ..., 0, 1, 0, ..., 0)'   (nT×1),

with 1 in the it-th element, and

D_o = diag(d_it, d_it, ..., d_it)   (MnT×M).   (4.57)

Then e^{(a)**}_it = D_o'e^{(a)**}, so that

E[e^{(a)**}_it (Π̂^{(a)} − Π)'] = E[D_o'e^{(a)**}(Π̂^{(a)} − Π)'] = 0.

In view of our results about Σ̃*, we have

lim E[e**_it (Π̃ − Π)'] = lim E[e^{(a)**}_it (Π̂^{(a)} − Π)'] = 0.

This proves the theorem.
Corollary (4.7.1).   lim E(e**_it ỹ_it') = 0   (M×M).   (4.58)

Proof: ỹ_it' = Π̃'C, so that

E(e**_it ỹ_it') = E[e**_it (Π̃ − Π)'C] + E[e**_it] Π'C,

and the second term vanishes in the limit since, by (4.52), lim E(e**_it) = 0. Hence

lim E(e**_it ỹ_it') = lim E[e**_it (Π̃ − Π)'C].

Using theorem (4.7), we have lim E[e**_it ỹ_it'] = 0.   Q.E.D.
Corollary (4.7.2).   lim E(ε*_it ỹ_it') = 0.   (4.59)

Proof: Note that ε*_it = D_o'ε*, and

D_o'X_o = diag(x_it', x_it', ..., x_it') = C'.   (4.60)

Therefore

E(ε*_it ỹ_it') = E[D_o'ε* Π̃'C] = E[D_o'ε* (Π̃ − Π)'C] + E[D_o'ε*] Π'C,

and the second term is zero. Hence

lim E(ε*_it ỹ_it') = lim E[D_o'ε* (Π̂^{(a)} − Π)'C] = lim E[D_o'ε* ε*'Σ*^{-1}X_o(X_o'Σ*^{-1}X_o)^{-1}C]
= lim [D_o'X_o(X_o'Σ*^{-1}X_o)^{-1}C] = lim [(1/nT) C'W^{-1}C] = 0.   Q.E.D.
Note that ỹ_it is a consistent and asymptotically unbiased estimate of C'Π, and, clearly, ũ = √(nT)(ỹ_it − C'Π) is asymptotically normal (0, C'W^{-1}C). Further, the asymptotic covariance matrix of ỹ_it is

(1/nT) lim E(ũ ũ') = (1/nT)(C'W^{-1}C),   (4.61)

which is consistently estimated by

C'(X_o'Σ̃*^{-1}X_o)^{-1}C.   (4.62)
71
5.0
ERROR COMPONENT MODEL: STRUCTURAL ESTIMATION
OF AN EXACTLY IDENTIFIED SYSTEM
5.1
Notation

So far our interest has centered on the estimation of reduced-form
coefficients of the system of equations (1.4).
Utilizing the error
structure of the error component model, we have found estimates which
are asymptotically more efficient than the estimates obtained under
assumptions of the traditional model.
We now take up the problem of
structural estimation bearing in mind that only identified relations can
be estimated.
In the present chapter we assume that each equation in the system is
exactly identified.
We write

y_{μit} = the value of the μth endogenous variable for the ith cross-section in the tth time interval;

y^{(μ)}_it = the vector of endogenous variables other than y_{μit} included in the μth structural equation; and

x_{(μ)it} = the vector of exogenous variables included in the μth structural equation.

We assume, as in Chapter 2.0, that the μth structural equation contains ℓ_μ endogenous variables and K_μ exogenous variables, so that y^{(μ)}_it is a (ℓ_μ−1)×1 vector and x_{(μ)it} is a K_μ×1 vector. The μth structural equation is, therefore,

y_{μit} = y^{(μ)}_it' α_μ + x_{(μ)it}' β_μ + ε_{μit},   (5.1)
where α_μ is a (ℓ_μ−1)×1 vector of coefficients and β_μ is a K_μ×1 vector of coefficients.

If this equation is exactly identified, we have, in accordance with the criteria derived in section (2.2),

K = K_μ + ℓ_μ − 1.

We consider the first equation in the structure:

y_{1it} = y^{(1)}_it' α_1 + x_{(1)it}' β_1 + ε_{1it}.   (5.2)

For simplicity, assume that

y^{(1)}_it = (y_{2it}, y_{3it}, ..., y_{ℓ_1 it})'.   (5.3)

We cannot treat (5.2) as an ordinary regression relation, because a subset of the explanatory variables consists of jointly dependent variables.
5.2 Indirect Generalized Least Squares

If the equation (5.2) is exactly identified, there is a one-one correspondence between structural and reduced-form parameters. To see this, partition the matrix Π of reduced-form coefficients as

Π = [ Π_{11}  Π_{12} ]
    [ Π_{21}  Π_{22} ].

Let α_1^{(o)} = [−1; α_1], where α_1 is a (ℓ_1−1)×1 vector. Then from (2.9) we have

Π_{11} α_1^{(o)} = −β_1,   Π_{21} α_1^{(o)} = 0.   (5.4)

The identifiability condition implies that K = K_1 + ℓ_1 − 1, or K − K_1 = ℓ_1 − 1, from which it follows that the nullity of Π_{21} is equal to 1. Therefore, there is a unique nontrivial solution of Π_{21} α_1^{(o)} = 0, and this solution gives a unique vector β_1 by Π_{11} α_1^{(o)} = −β_1.
Now recall the reduced-form estimators given by (4.46):

Π̃ = (X_o'Σ̃*^{-1}X_o)^{-1} X_o'Σ̃*^{-1} y.

These estimators are consistent and asymptotically unbiased. Let Π̃ denote these estimators when rearranged in the original matrix form. Then it follows that corresponding to these estimators we have a unique estimator α̃_1 satisfying

Π̃_{21} [−1; α̃_1] = 0,   (5.5)

and a unique estimator β̃_1 of β_1 given by

β̃_1 = −Π̃_{11} [−1; α̃_1].   (5.6)

Clearly the estimators α̃_1 and β̃_1 are consistent.
Similarly, by rearranging the columns of Π̃ and partitioning it as

Π̃ = [ Π̃_{11}  Π̃_{12} ]
     [ Π̃_{21}  Π̃_{22} ],

we obtain consistent estimators α̃_μ and β̃_μ of the coefficients in the μth structural equation from

Π̃_{21} [−1; α̃_μ] = 0,   β̃_μ = −Π̃_{11} [−1; α̃_μ],   (5.7)

provided that the μth equation is exactly identified. These estimators are at least as efficient asymptotically as those obtained under assumptions of the traditional model. We call these estimators the "Indirect Generalized Least Squares" estimators, because the reduced-form estimators upon which they are based have been derived by a two-stage generalized least squares method.
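Numerically, solving (5.5) and (5.6) amounts to finding the one-dimensional null space of Π̃_{21} and rescaling it so that its first element is −1. The sketch below (Python/numpy, an illustration only; the partitioning of Π̃ into Π̃_{11} and Π̃_{21} is taken as given) uses the singular value decomposition for this purpose.

```python
import numpy as np

def indirect_gls(Pi11, Pi21):
    """Recover (alpha_1, beta_1) from the reduced-form blocks of an exactly
    identified equation, as in (5.5)-(5.6).
    Pi11 : K1 x l1 block, Pi21 : (K - K1) x l1 block with nullity one."""
    _, _, Vt = np.linalg.svd(Pi21)
    a = Vt[-1]                      # basis of the (approximate) null space
    a = a / (-a[0])                 # normalise so the vector is (-1, alpha_1')'
    alpha = a[1:]
    beta = -Pi11 @ a                # beta_1 = -Pi11 (-1, alpha_1')'
    return alpha, beta
```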
5.3
Two-Stage Generalized Least Squares
A method which we shall designate as "Two-Stage Generalized Least
Squares Method" can also be used to estimate structural parameters.
It will be shown that this method yields the same estimators as the indirect
generalized least squares estimators, when the system is exactly
identified.
In terms of nT observations, the first structural equation is

y_1 = Y^{(1)} α_1 + X_1 β_1 + ε_1,   (5.8)

where

Y^{(1)} = (y_2, y_3, ..., y_{ℓ_1}) is the nT×(ℓ_1−1) matrix of observations on the endogenous explanatory variables, and

X_1 = (x_1, x_2, ..., x_{K_1}) is the nT×K_1 matrix of observations on the included exogenous variables.
Let ỹ_{μit} denote, as in section (4.4), the estimate of x_it'Π̃_μ. Then

Ỹ^{(1)} = (ỹ_2, ỹ_3, ..., ỹ_{ℓ_1}).

Further, Y^{(1)} = Ỹ^{(1)} + (e**_2, e**_3, ..., e**_{ℓ_1}), so that (5.8) can be written as

y_1 = Ỹ^{(1)} α_1 + X_1 β_1 + η_1.   (5.9)

Premultiplying by X' = (x_1, x_2, ..., x_K)', we have

X'y_1 = X'Ỹ^{(1)} α_1 + X'X_1 β_1 + X'η_1.   (5.10)

Let G_1 = E(X'η_1 η_1'X), where η_1 = ε_1 + (e**_2, e**_3, ..., e**_{ℓ_1}) α_1. We do not know G_1, but if (5.8) is exactly identified, we need not estimate this matrix. It follows from corollaries (4.7.1) and (4.7.2) that Ỹ^{(1)} is asymptotically uncorrelated with η_1, so that regression analysis is valid. Writing Z̃*_1 = (X'Ỹ^{(1)}, X'X_1), Aitken's generalized least squares method gives

δ̃_1^{(o)} = (Z̃*_1' G_1^{-1} Z̃*_1)^{-1} Z̃*_1' G_1^{-1} X'y_1.   (5.11)

If (5.8) is exactly identified, K = K_1 + ℓ_1 − 1, and the K×(K_1 + ℓ_1 − 1) matrix Z̃*_1 is square and generally nonsingular. We have, therefore,

δ̃_1^{(o)} = (X'Ỹ^{(1)}, X'X_1)^{-1} X'y_1.   (5.12)

This is the same as the estimator given by the indirect generalized least squares method.
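When the equation is exactly identified, the matrix X'Z̃_1 in (5.12) is square and the estimator needs no weighting matrix at all. A minimal sketch follows (Python/numpy; Ỹ^{(1)} is assumed already computed from the generalized reduced-form estimates as in section 4.4, and all array names are the example's own).

```python
import numpy as np

def two_stage_exact(X, Y1_tilde, X1, y1):
    """Estimator (5.12): delta_1 = (X'Z_1)^{-1} X'y_1 for an exactly identified
    equation, where Z_1 = (Y1_tilde, X1) and X'Z_1 is square."""
    Z1 = np.hstack([Y1_tilde, X1])
    return np.linalg.solve(X.T @ Z1, X.T @ y1)
```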
5.4 Three-Stage Generalized Least Squares

Continuing, we premultiply each structural equation by X'. The entire set of equations takes the form

y* = Z̃* δ + η*,   (5.13)

where

y* = (X'y_1; X'y_2; ...; X'y_M),   Z̃* = diag(Z̃*_1, Z̃*_2, ..., Z̃*_M),   η* = (X'η_1; X'η_2; ...; X'η_M),

and δ = (δ_1; δ_2; ...; δ_M). Let G = E(η* η*') = (X_o'Σ_η X_o), where X_o = diag(X, X, ..., X).
The generalized least squares estimate of δ is given by

δ̃ = (Z̃*' G^{-1} Z̃*)^{-1} Z̃*' G^{-1} y*.   (5.14)

Now Z̃* is an MK×[Σ_{μ=1}^{M} (ℓ_μ + K_μ − 1)] matrix. Assuming that all equations in the structure are exactly identified, we see that Z̃* is square and generally nonsingular, so that (5.14) reduces to

δ̃ = Z̃*^{-1} y*,   (5.15)

from which it is evident that if the system of equations is exactly identified, the three methods given above yield identical estimators. We note, once again, that these estimators are asymptotically more efficient than the corresponding estimators obtained for the traditional model in the exactly identified situation.

Estimators (5.14) may be called the "Three-Stage Generalized Least Squares" estimators.
79
6.0 ERROR COMPONENT MODEL: STRUCTURAL ESTIMATION OF OVERIDENTIFIED SYSTEMS

6.1 Inadequacy of the Indirect Generalized Least Squares Method
In this chapter we consider structural estimation of a system of
overidentified simultaneous linear equations with error structure
described in (4.1).
The notation and terminology developed in the
preceding chapters will be carried over to this chapter.
To begin with, note that when the equations are overidentified,
the indirect generalized least squares method cannot be used to
estimate structural parameters.
Consider, for example, the relation

Π_{21} α_μ^{(o)} = 0

obtained in the second chapter. If the μth equation in the structure is overidentified,

K > K_μ + ℓ_μ − 1,

and the (K − K_μ)×ℓ_μ matrix Π_{21} has nullity exceeding one. Therefore, there are more than one nontrivial solutions to this equation. Corresponding to a consistent estimate Π̃ of Π obtained by the methods of Chapter 4.0, there will be, in general, more than one estimate α̃_μ of α_μ satisfying

Π̃_{21} α̃_μ^{(o)} = 0,

and therefore more than one estimate β̃_μ given by

β̃_μ = −Π̃_{11} α̃_μ^{(o)}.

Each of these sets of estimates will be consistent.
One way to circumvent this difficulty is to discard enough rows of Π̃_{21} to bring its nullity to 1. Consistent estimates of α_μ and β_μ will thereby result. But since the choice of rows of Π̃_{21} to throw away is made arbitrarily, there must be some loss of efficiency. Thus the procedure whereby reduced-form estimates are transformed into structural estimates without loss of efficiency is not available when the system of equations is overidentified.
6.2 Two Lemmas

Before developing methods of structural estimation appropriate for an overidentified system of equations, we establish two results:

Lemma (6.1). Let ε^{(μ)*}_it be the vector of reduced-form errors associated with the endogenous explanatory variables y^{(μ)}_it occurring in the μth structural equation. Then

ε*_{μit} = ε_{μit} + ε^{(μ)*'}_it α_μ,   (6.1)

where ε*_{μit} = the disturbance term in the μth reduced-form equation and ε_{μit} = the disturbance term in the μth structural equation.
Proof: We consider ε^{(1)*}_it, which is associated with y^{(1)}_it = (y_{2it}, y_{3it}, ..., y_{ℓ_1 it})', and show that

ε*_{1it} = ε_{1it} + ε^{(1)*'}_it α_1.

We have ε*_it' = −ε_it' A^{-1} from (4.11), so that ε*_it' A = −ε_it'. Let a*_1 denote the first column vector of A. Then

ε*_it' a*_1 = −ε_{1it}.

From the a priori restrictions,

a*_1 = [−1; α_1; 0],

where α_1 is the (ℓ_1−1)×1 vector of nonzero coefficients and 0 is the (M−ℓ_1)×1 vector of zeros. Writing ε*_it = (ε*_{1it}, ε*_{2it}, ..., ε*_{Mit})', we have

ε*_it' a*_1 = −ε*_{1it} + ε^{(1)*'}_it α_1 = −ε_{1it},

so that ε*_{1it} = ε_{1it} + ε^{(1)*'}_it α_1. The lemma is proved.
Lemma (6.2). Let e^{(μ)**}_it be the vector of "second-round" estimates of the reduced-form errors associated with the endogenous variables y^{(μ)}_it occurring in the μth structural equation. Then

ε_{μit} + e^{(μ)**'}_it α_μ →(P) ε*_{μit},   (6.2)

and

lim_{n→∞, T→∞} E(ε_{μit} + e^{(μ)**'}_it α_μ − ε*_{μit})(ε_{μjs} + e^{(μ)**'}_js α_μ − ε*_{μjs}) = 0   (6.3)

for all i, j = 1, 2, ..., and all t, s = 1, 2, ... .
Proof: Again, we consider the e^{(1)**}_it, which are associated with y^{(1)}_it, and establish (6.2) and (6.3) for μ = 1. This will constitute the proof of the lemma. We can write

ε_{1it} + e^{(1)**'}_it α_1 = ε*_{1it} + (e^{(1)**}_it − ε^{(1)*}_it)' α_1.   (6.4)

Now by (4.52) we have

(e**_it − ε*_it) = −C'(Π̃ − Π),

and if C_1 denotes the matrix obtained by replacing the subscripts i and t by j and s respectively, we have

(e**_js − ε*_js) = −C_1'(Π̃ − Π).

Clearly, if j = i and s = t, C_1 = C. Then

lim E(e**_it − ε*_it)(e**_js − ε*_js)' = lim E[C'(Π̃ − Π)(Π̃ − Π)'C_1] = lim [C'((1/nT) W^{-1})C_1] = 0.

(e^{(1)**}_it − ε^{(1)*}_it) is a subvector of (e**_it − ε*_it), so that we have

lim E(e^{(1)**}_it − ε^{(1)*}_it)(e^{(1)**}_js − ε^{(1)*}_js)' = 0.   (6.5)

Using (6.4) and (6.5) we obtain

lim E(ε_{1it} + e^{(1)**'}_it α_1 − ε*_{1it})(ε_{1js} + e^{(1)**'}_js α_1 − ε*_{1js}) = lim E[α_1'(e^{(1)**}_it − ε^{(1)*}_it)(e^{(1)**}_js − ε^{(1)*}_js)' α_1] = α_1' 0 α_1 = 0.

This establishes (6.3). It further follows by theorem (3.1) that

ε_{1it} + e^{(1)**'}_it α_1 →(P) ε*_{1it}.

The lemma is proved.
This lemma gives us a useful approximate result. Writing

η_{μit} = ε_{μit} + e^{(μ)**'}_it α_μ,

we see that the second-order moments and cross moments of η_{μit} converge to the corresponding moments and cross moments of ε*_{μit}. Thus when n and T are large we can write η_{μit} ≈ ε*_{μit}. Since the elements of E(η_μ η_μ') converge to those of Σ*_{μμ}, we can take Σ̃*_{μμ} as an estimate of the covariance matrix of the disturbance vector in the μth structural equation.
6.3 Two-Stage Generalized Least Squares Estimators

Consider again the first equation in the structure, which can be written as

y_{1it} = ỹ^{(1)}_it' α_1 + x_{(1)it}' β_1 + η_{1it}.

If we write z̃_{1it} = (ỹ^{(1)}_it; x_{(1)it}) and δ_1 = (α_1; β_1), we have

y_{1it} = z̃_{1it}' δ_1 + η_{1it}.

For large samples, we can write η_{1it} ≈ ε*_{1it}. Moreover, it follows by corollaries (4.7.1) and (4.7.2) that ỹ^{(1)}_it is asymptotically uncorrelated with η_{1it}, so that the explanatory variables z̃_{1it} are asymptotically uncorrelated with η_{1it}.

Using the estimate Σ̃*_{11} obtained in Chapter 4.0 as an estimate of Σ*_{11}, we obtain the following estimators of the structural parameters δ_1 by Aitken's two-stage method:

δ̃_1^{(o)} = (Z̃_1'Σ̃*_{11}^{-1}Z̃_1)^{-1} Z̃_1'Σ̃*_{11}^{-1} y_1,   (6.6)

where Z̃_1 = (Ỹ^{(1)}, X_1).
We can write

δ̃_1^{(o)} = δ_1 + (Z̃_1'Σ̃*_{11}^{-1}Z̃_1)^{-1} Z̃_1'Σ̃*_{11}^{-1} η_1.   (6.7)

Now Π, the K×M matrix of reduced-form coefficients, can be partitioned as

Π = [Π_1 | Π_2, Π_3, ..., Π_{ℓ_1} | Π_{ℓ_1+1}, ..., Π_M].

Writing Π^{(1)} = [Π_2, Π_3, ..., Π_{ℓ_1}], we have evidently

Y^{(1)} = X Π^{(1)} + (ε*_2, ε*_3, ..., ε*_{ℓ_1})   and   Ỹ^{(1)} = X Π̃^{(1)} (say).   (6.8)

Here Π_μ is the vector of coefficients in the μth reduced-form equation.
We have proved that ỹ_it is a consistent and asymptotically unbiased estimate of C'Π, where Π is the vector obtained by arranging the elements of the matrix Π lexicographically in the form of a vector. Now

lim E(ỹ_it ỹ_it') = lim E(ỹ_it − C'Π)(ỹ_it − C'Π)' + C'Π lim E(ỹ_it − C'Π)' + lim E(ỹ_it − C'Π) Π'C + C'Π Π'C.

Using (5.40) we have

lim E(ỹ_it ỹ_it') = C'Π Π'C,

from which it follows that

lim E(ỹ_{μit} ỹ_{μ'it}) = x_it' Π_μ Π_{μ'}' x_it.   (6.9)

Moreover, ỹ_{μit} is asymptotically uncorrelated with the errors η_{μit}. Therefore, for large samples, ỹ_{μit} plays the same role in regression analysis as x_it'Π_μ. In other words, Ỹ^{(μ)} can be replaced by X Π^{(μ)}.
It follows that, in terms of Z̃_μ = (Ỹ^{(μ)}, X_μ) and Z_μ = (X Π^{(μ)}, X_μ), we have all the results of theorems (4.3) and (4.4) and corollary (4.4.1). We have, for instance,

Plim (Z̃_1'Σ̃*_{11}^{-1}Z̃_1/nT) = lim E(Z_1'Σ*_{11}^{-1}Z_1/nT)
= lim [ Π^{(1)'}(X'Σ*_{11}^{-1}X/nT)Π^{(1)}   Π^{(1)'}(X'Σ*_{11}^{-1}X_1/nT) ]
      [ (X_1'Σ*_{11}^{-1}X/nT)Π^{(1)}        (X_1'Σ*_{11}^{-1}X_1/nT)     ]
= D_1 (say).   (6.10)

Further, if we write

q̃_1 = Z̃_1'Σ̃*_{11}^{-1}η_1/√(nT),

then

lim E(q̃_1) = 0,   (6.11)

and

lim E(q̃_1 q̃_1') = D_1.   (6.12)

Using these results, we have

lim E(δ̃_1^{(o)} − δ_1) = 0   and   lim E(δ̃_1^{(o)} − δ_1)(δ̃_1^{(o)} − δ_1)' = lim E[(1/nT) D_1^{-1} q̃_1 q̃_1' D_1^{-1}] = 0,

so that the δ̃_1^{(o)} are consistent and asymptotically unbiased. Further,

lim E[nT(δ̃_1^{(o)} − δ_1)(δ̃_1^{(o)} − δ_1)'] = D_1^{-1}.

The limit on the right clearly exists and is a positive definite matrix. Similar results hold for the other structural equations. We have, therefore, the following theorem:
Theorem (6.1). The estimators

δ̃_μ^{(o)} = [α̃_μ; β̃_μ] = (Z̃_μ'Σ̃*_{μμ}^{-1}Z̃_μ)^{-1} Z̃_μ'Σ̃*_{μμ}^{-1} y_μ

= [ Ỹ^{(μ)'}Σ̃*_{μμ}^{-1}Ỹ^{(μ)}   Ỹ^{(μ)'}Σ̃*_{μμ}^{-1}X_μ ]^{-1} [ Ỹ^{(μ)'}Σ̃*_{μμ}^{-1}y_μ ]
  [ X_μ'Σ̃*_{μμ}^{-1}Ỹ^{(μ)}       X_μ'Σ̃*_{μμ}^{-1}X_μ    ]      [ X_μ'Σ̃*_{μμ}^{-1}y_μ ]   (6.13)

of the parameters in the μth structural equation are consistent and asymptotically unbiased. Moreover, the asymptotic covariance matrix of δ̃_μ^{(o)} is

(1/nT) D_μ^{-1} = (1/nT) lim [ Π^{(μ)'}(X'Σ*_{μμ}^{-1}X/nT)Π^{(μ)}   Π^{(μ)'}(X'Σ*_{μμ}^{-1}X_μ/nT) ]^{-1}
                             [ (X_μ'Σ*_{μμ}^{-1}X/nT)Π^{(μ)}        (X_μ'Σ*_{μμ}^{-1}X_μ/nT)     ],   (6.14)

which is consistently estimated by

[ Ỹ^{(μ)'}Σ̃*_{μμ}^{-1}Ỹ^{(μ)}   Ỹ^{(μ)'}Σ̃*_{μμ}^{-1}X_μ ]^{-1}
[ X_μ'Σ̃*_{μμ}^{-1}Ỹ^{(μ)}       X_μ'Σ̃*_{μμ}^{-1}X_μ    ].   (6.15)
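A sketch of the two-stage generalized least squares estimator (6.13) follows (Python/numpy, illustrative only). It takes the second-round fitted values Ỹ^{(μ)} as already computed, stacks them with the included exogenous variables, and weights by the inverse of the estimated error component covariance matrix of the μth equation, which is passed in as an argument.

```python
import numpy as np

def two_stage_gls(Y_tilde_mu, X_mu, y_mu, Sigma_mu_inv):
    """Estimator (6.13): delta_mu = (Z'S^{-1}Z)^{-1} Z'S^{-1} y_mu,
    where Z = (Y_tilde_mu, X_mu) and S is the estimated Sigma*_{mu mu}."""
    Z = np.hstack([Y_tilde_mu, X_mu])
    ZtSi = Z.T @ Sigma_mu_inv
    delta = np.linalg.solve(ZtSi @ Z, ZtSi @ y_mu)
    cov = np.linalg.inv(ZtSi @ Z)   # consistent estimate (6.15) of the asymptotic covariance
    return delta, cov
```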
6.4 Three-Stage Generalized Least Squares Estimators

Write

Z̃ = diag(Z̃_1, Z̃_2, ..., Z̃_M),   δ = (δ_1; δ_2; ...; δ_M),   and   η = (η_1; η_2; ...; η_M).
Then the entire set of structural equations can be written compactly as

y = Z̃ δ + η.   (6.16)

If we are dealing with large samples, we have

η ≈ ε*,   Σ_η = E(η η') ≈ E(ε* ε*') = Σ*,

and the structural parameters δ may be estimated by

δ̂ = (Z̃'Σ*^{-1}Z̃)^{-1} Z̃'Σ*^{-1} y

= [ Z̃_1'Σ*^{11}Z̃_1   Z̃_1'Σ*^{12}Z̃_2   ...   Z̃_1'Σ*^{1M}Z̃_M ]^{-1} [ Σ_{μ=1}^{M} Z̃_1'Σ*^{1μ}y_μ ]
  [ Z̃_2'Σ*^{21}Z̃_1   Z̃_2'Σ*^{22}Z̃_2   ...   Z̃_2'Σ*^{2M}Z̃_M ]      [ Σ_{μ=1}^{M} Z̃_2'Σ*^{2μ}y_μ ]
  [       ...               ...          ...        ...        ]      [            ...            ]
  [ Z̃_M'Σ*^{M1}Z̃_1   Z̃_M'Σ*^{M2}Z̃_2   ...   Z̃_M'Σ*^{MM}Z̃_M ]      [ Σ_{μ=1}^{M} Z̃_M'Σ*^{Mμ}y_μ ].   (6.17)

Evidently, the estimators (6.17) are asymptotically equivalent to their feasible counterparts given in (6.18) below, and are, moreover, more efficient asymptotically than the single equation estimators⁴ δ̂_μ^{(o)} (μ = 1, 2, ..., M).

⁴Note the distinction between δ̂^{(o)} and δ̃^{(o)}.
But the formula (6.17) is not operational when Σ* is not known. In this situation, we use the estimate Σ̃* of Σ* obtained in Chapter 4.0 and consider the estimators

δ̃ = (Z̃'Σ̃*^{-1}Z̃)^{-1} Z̃'Σ̃*^{-1} y

= [ Z̃_1'Σ̃*^{11}Z̃_1   Z̃_1'Σ̃*^{12}Z̃_2   ...   Z̃_1'Σ̃*^{1M}Z̃_M ]^{-1} [ Σ_{μ=1}^{M} Z̃_1'Σ̃*^{1μ}y_μ ]
  [ Z̃_2'Σ̃*^{21}Z̃_1   Z̃_2'Σ̃*^{22}Z̃_2   ...   Z̃_2'Σ̃*^{2M}Z̃_M ]      [ Σ_{μ=1}^{M} Z̃_2'Σ̃*^{2μ}y_μ ]
  [       ...               ...           ...        ...        ]      [             ...           ]
  [ Z̃_M'Σ̃*^{M1}Z̃_1   Z̃_M'Σ̃*^{M2}Z̃_2   ...   Z̃_M'Σ̃*^{MM}Z̃_M ]      [ Σ_{μ=1}^{M} Z̃_M'Σ̃*^{Mμ}y_μ ],   (6.18)

where the Σ̃*^{μμ'} (μ, μ' = 1, 2, ..., M) are the nT×nT block submatrices in Σ̃*^{-1}.
Note that if the reduced-form disturbances are mutually uncorrelated, the estimators (6.18) reduce to

δ̃_μ^{(o)} = (Z̃_μ'Σ̃*_{μμ}^{-1}Z̃_μ)^{-1} Z̃_μ'Σ̃*_{μμ}^{-1} y_μ   (μ = 1, 2, ..., M)

derived in section (6.3). Formula (6.18) is based upon observations on all endogenous and exogenous variables in the system of equations and, moreover, takes account of the correlation between disturbances in different equations. Therefore, the estimators given by (6.18) must be at least as efficient asymptotically as the single equation estimators (6.13).

As we observed in section (6.3), for large samples Ỹ^{(μ)} plays the same role in regression analysis as X Π^{(μ)}, and the limiting moments and cross moments of the η_{μit}'s are precisely the corresponding moments of the ε*_{μit}'s, for μ = 1, 2, ..., M and for all i and t. Therefore, writing
Z_μ = (X Π^{(μ)}, X_μ),   Z = diag(Z_1, Z_2, ..., Z_M),   and   D = lim (Z'Σ*^{-1}Z/nT),   (6.19)

we have, in the manner of corollary (4.4.1), the following results:

q̃^{(o)} = (Z̃'Σ̃*^{-1}Z̃/nT)^{-1} Z̃'Σ̃*^{-1}η/√(nT) is asymptotically normal (0, D^{-1}),

lim E(q̃^{(o)}) = 0,   and   lim E(q̃^{(o)} q̃^{(o)'}) = D^{-1}.

Now

δ̃ = (Z̃'Σ̃*^{-1}Z̃)^{-1} Z̃'Σ̃*^{-1} y = δ + (Z̃'Σ̃*^{-1}Z̃)^{-1} Z̃'Σ̃*^{-1} η = δ + (1/√(nT)) q̃^{(o)}.   (6.20)
Therefore,

lim E(δ̃) = δ + lim (1/√(nT)) E(q̃^{(o)}) = δ,

and

lim E[nT(δ̃ − δ)(δ̃ − δ)'] = lim E[q̃^{(o)} q̃^{(o)'}] = D^{-1}.

We have, thus, the following theorem:

Theorem (6.2). The estimators δ̃ of the structural coefficients δ

(i) are consistent and asymptotically unbiased,

(ii) have asymptotic covariance matrix (1/nT) D^{-1}, which is consistently estimated by

(1/nT) D̃^{-1} = (Z̃'Σ̃*^{-1}Z̃)^{-1}
= [ Z̃_1'Σ̃*^{11}Z̃_1   Z̃_1'Σ̃*^{12}Z̃_2   ...   Z̃_1'Σ̃*^{1M}Z̃_M ]^{-1}
  [ Z̃_2'Σ̃*^{21}Z̃_1   Z̃_2'Σ̃*^{22}Z̃_2   ...   Z̃_2'Σ̃*^{2M}Z̃_M ]
  [       ...               ...           ...        ...        ]
  [ Z̃_M'Σ̃*^{M1}Z̃_1   Z̃_M'Σ̃*^{M2}Z̃_2   ...   Z̃_M'Σ̃*^{MM}Z̃_M ],

and (iii) are asymptotically more efficient than the single equation estimators δ̃_μ^{(o)} given by

δ̃_μ^{(o)} = (Z̃_μ'Σ̃*_{μμ}^{-1}Z̃_μ)^{-1} Z̃_μ'Σ̃*_{μμ}^{-1} y_μ   (μ = 1, 2, ..., M).
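The three-stage estimator (6.18) differs from (6.13) only in that the M equations are stacked and the full MnT×MnT matrix Σ̃* is used. A sketch follows (Python/numpy, illustrative only); the list `Z_list` of matrices Z̃_μ = (Ỹ^{(μ)}, X_μ) and the dictionary `Sigma_blocks` of estimated nT×nT blocks of Σ̃* are assumptions of the example.

```python
import numpy as np

def three_stage_gls(Z_list, y_list, Sigma_blocks):
    """Estimator (6.18): stack all structural equations and apply Aitken's
    method with the estimated reduced-form disturbance covariance matrix."""
    M = len(Z_list)
    rows = sum(z.shape[0] for z in Z_list)
    cols = sum(z.shape[1] for z in Z_list)
    Z = np.zeros((rows, cols))
    r = c = 0
    for z in Z_list:                              # block-diagonal Z~
        Z[r:r + z.shape[0], c:c + z.shape[1]] = z
        r += z.shape[0]
        c += z.shape[1]
    y = np.concatenate(y_list)
    Sigma = np.block([[Sigma_blocks[(i, j)] for j in range(M)] for i in range(M)])
    Sigma_inv = np.linalg.inv(Sigma)
    ZtSi = Z.T @ Sigma_inv
    return np.linalg.solve(ZtSi @ Z, ZtSi @ y)
```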
93
7.0
ERROR COMPONENT MODEL: SOME SPECIAL CASES

7.1 Fixed Cross-Sectional Effects
In this chapter we consider the estimation problem when some of the assumptions underlying the error component model are changed. One very common situation is that the number of cross-sectional units is finite. In this case the effects

u_i = (u_{1i}, u_{2i}, ..., u_{Mi})'

associated with the ith cross-sectional unit may be assumed to be constant parameters. Let the time effects v_{μt} (μ = 1, 2, ..., M) be random. Then the error term in each equation may be viewed as consisting of a random time effect and a measurement error which is also random. We can write the system of equations in the following form:

y_it' A + x_it' B + u_i' + ε^{(c)'}_it = 0,   (7.1)

where (i) ε^{(c)}_{μit} = v_{μt} + w_{μit} (μ = 1, 2, ..., M), v_{μt} and w_{μit} being the random variables specified in section (4.1); (ii) for each i, u_i = (u_{1i}, u_{2i}, ..., u_{Mi})' is an M×1 vector of unknown but constant parameters; and (iii) x_it, y_it, A and B are the same as specified before.
Suppose that there are n cross-sectional units in all, and that for
each of these units we can take observations over any number of time
periods.
If we randomly select T time periods, it is clearly implied
that, while n is a fixed integer, T can be made as large as we please.
In terms of nT observations the μth structural equation is

y_μ = Y^{(μ)} α_μ + X_μ β_μ + A u_μ + ε^{(c)}_μ,   (7.2)

where ε^{(c)}_μ = [B v_μ + w_μ] is an nT×1 random vector and

u_μ = (u_{μ1}, u_{μ2}, ..., u_{μn})'

is an n×1 vector of parameters.

It will be noted that the new specifications leave the identification criteria unchanged. For we can regard the nT×n matrix A as the matrix of values of n additional exogenous variables, so that the total number of exogenous variables in the system of equations is (K+n), of which (K_μ + n) are included in the μth equation. The identification condition is

K + n ≥ K_μ + n + ℓ_μ − 1,   or   K ≥ K_μ + ℓ_μ − 1,
which is precisely the condition established in Chapter 2.0.
The reduced-form equations of the system (7.1) are

y_it' = x_it' Π + u*_i' + ε^{(c)*'}_it,   (7.3)

from which it follows that, in terms of nT observations, the μth reduced-form equation can be written as

y_μ = X Π_μ + A u*_μ + ε^{(c)*}_μ.   (7.4)

7.1.1 Estimators of Reduced-Form Parameters

It is easy to verify that

Σ^{(c)*}_{μμ} = E(ε^{(c)*}_μ ε^{(c)*'}_μ)
= [ (σ^{v*}_{μμ} + σ^{w*}_{μμ}) I   σ^{v*}_{μμ} I   ...   σ^{v*}_{μμ} I ]
  [ σ^{v*}_{μμ} I   (σ^{v*}_{μμ} + σ^{w*}_{μμ}) I   ...   σ^{v*}_{μμ} I ]
  [        ...                 ...                  ...        ...      ]
  [ σ^{v*}_{μμ} I   σ^{v*}_{μμ} I   ...   (σ^{v*}_{μμ} + σ^{w*}_{μμ}) I ]   (nT×nT),

where I is the T×T identity matrix and, as before,

σ^{v*}_{μμ} = Σ_{r=1}^{M} Σ_{s=1}^{M} a_{μr} a_{μs} σ^v_{rs},   σ^{w*}_{μμ} = Σ_{r=1}^{M} Σ_{s=1}^{M} a_{μr} a_{μs} σ^w_{rs}.   (7.5)
Further,

Σ^{(c)*-1}_{μμ} = (a*_1 I_n + a*_3 J_n) ⊗ I,   (7.6)

where a*_1 and a*_3 have the same expressions as a_1 and a_3 given in formulae (4.7), with σ^v_{μμ} and σ^w_{μμ} replaced by σ^{v*}_{μμ} and σ^{w*}_{μμ} respectively. If we write

Ω^{(c)*}_{μμ} = [ σ^{v*}_{μμ} + σ^{w*}_{μμ}   σ^{v*}_{μμ}   ...   σ^{v*}_{μμ} ]
               [ σ^{v*}_{μμ}   σ^{v*}_{μμ} + σ^{w*}_{μμ}   ...   σ^{v*}_{μμ} ]
               [       ...              ...                ...       ...     ]
               [ σ^{v*}_{μμ}   σ^{v*}_{μμ}   ...   σ^{v*}_{μμ} + σ^{w*}_{μμ} ]   (n×n),   (7.7)

we have

Σ^{(c)*}_{μμ} = Ω^{(c)*}_{μμ} ⊗ I   and   Σ^{(c)*-1}_{μμ} = Ω^{(c)*-1}_{μμ} ⊗ I.   (7.8)

For μ ≠ μ', we have E(ε^{(c)*}_μ ε^{(c)*'}_{μ'}) = Ω^{(c)*}_{μμ'} ⊗ I, where Ω^{(c)*}_{μμ'} can be obtained from Ω^{(c)*}_{μμ} by replacing one of the μ's by μ'.

In order to estimate Π_μ we need to find the estimate of Ω^{(c)*}_{μμ}, which involves two unknown parameters. To do this we adopt the same technique
as was used in Chapter 4.0.
Ignoring v's, we have, by the ordinary least
squares method, estimators
[Π̂_μ; û_μ] = (P*'P*)^{-1} P*' y_μ   (μ = 1, 2, ..., M),   (7.9)

where P* = (X, A).

Note that, if X contains a column vector of 1's, the nT×(K+n) matrix P* has rank (K+n−1), so that (P*'P*) is singular and its inverse does not exist. In that case we replace (P*'P*)^{-1} by (P*'P*)^{-g} — the generalized inverse of P*'P* — in the formula (7.9). We can assume, however, that X does not contain a vector of 1's. We retain this assumption in what follows.

Note further that, since n is fixed, if lim_{T→∞} (X'X/nT) exists and is positive definite,

lim_{T→∞} (P*'P*/nT)   (7.10)

exists and is positive definite. This is evident from the expression for P*'P*,

P*'P* = [ X'X   X'A   ]
        [ A'X   T I_n ],

where the (k, i)-th element of X'A is Σ_{t=1}^{T} x_{kit} = T x̄_{ki·}.
Clearly,

ŷ_{μit} = (x_it', a_it') [Π̂_μ; û_μ],   (7.11)

where a_it = (0, ..., 0, 1, 0, ..., 0)' is the n×1 vector with 1 in the ith element. Let e^{(c)*}_{μit} = y_{μit} − ŷ_{μit}. Then

e^{(c)*}_{μit} = ε^{(c)*}_{μit} − (x_it', a_it')(P*'P*)^{-1} P*' ε^{(c)*}_μ,   (7.12)

and it can be shown in the manner of Chapter 4.0 that the following are consistent and asymptotically unbiased estimators:
σ̃^{w*}_{μμ'} = Σ_{i=1}^{n} Σ_{t=1}^{T} (e^{(c)*}_{μit} − ē^{(c)*}_{μ·t})(e^{(c)*}_{μ'it} − ē^{(c)*}_{μ'·t}) / [T(n−1)]
             = e^{(c)*'}_μ (I − BB'/n) e^{(c)*}_{μ'} / [T(n−1)],

σ̃^{v*}_{μμ'} = Σ_{t=1}^{T} ē^{(c)*}_{μ·t} ē^{(c)*}_{μ'·t} / T − σ̃^{w*}_{μμ'}/n
             = e^{(c)*'}_μ (BB') e^{(c)*}_{μ'} / (n²T) − σ̃^{w*}_{μμ'}/n   (μ, μ' = 1, 2, ..., M).   (7.13)
These estimators, when substituted for σ^{w*}_{μμ'}, σ^{v*}_{μμ'}, etc., give estimates of Ω^{(c)*}_{μμ'} which are consistent and asymptotically unbiased, and we obtain the estimator

Σ̃^{(c)*} = [ Ω̃^{(c)*}_{11}   Ω̃^{(c)*}_{12}   ...   Ω̃^{(c)*}_{1M} ]
            [ Ω̃^{(c)*}_{21}   Ω̃^{(c)*}_{22}   ...   Ω̃^{(c)*}_{2M} ]  ⊗ I = Ω̃^{(c)*} ⊗ I
            [       ...              ...         ...        ...     ]
            [ Ω̃^{(c)*}_{M1}   Ω̃^{(c)*}_{M2}   ...   Ω̃^{(c)*}_{MM} ]

of Σ^{(c)*} = Ω^{(c)*} ⊗ I. Therefore, writing

R* = diag(P*, P*, ..., P*)   (MnT×(K+n)M)   and   y = (y_1', y_2', ..., y_M')',

we have consistent and asymptotically unbiased estimators given by

γ̃* = (R*'Σ̃^{(c)*-1}R*)^{-1} R*'Σ̃^{(c)*-1} y.   (7.14)
Written out, (7.14) is

γ̃* = [ P*'Σ̃^{(c)*11}P*   P*'Σ̃^{(c)*12}P*   ...   P*'Σ̃^{(c)*1M}P* ]^{-1} [ Σ_{μ=1}^{M} P*'Σ̃^{(c)*1μ}y_μ ]
      [ P*'Σ̃^{(c)*21}P*   P*'Σ̃^{(c)*22}P*   ...   P*'Σ̃^{(c)*2M}P* ]      [ Σ_{μ=1}^{M} P*'Σ̃^{(c)*2μ}y_μ ]
      [       ...               ...           ...        ...        ]      [             ...             ]
      [ P*'Σ̃^{(c)*M1}P*   P*'Σ̃^{(c)*M2}P*   ...   P*'Σ̃^{(c)*MM}P* ]      [ Σ_{μ=1}^{M} P*'Σ̃^{(c)*Mμ}y_μ ],   (7.15)

where the Σ̃^{(c)*ℓm} are the nT×nT block submatrices in Σ̃^{(c)*-1}. These estimators have asymptotic covariance matrix

(1/nT) lim_{T→∞} (R*'Σ^{(c)*-1}R*/nT)^{-1},   (7.16)

which is consistently estimated by

(R*'Σ̃^{(c)*-1}R*)^{-1} = [ P*'Σ̃^{(c)*11}P*   ...   P*'Σ̃^{(c)*1M}P* ]^{-1}
                          [       ...          ...         ...       ]
                          [ P*'Σ̃^{(c)*M1}P*   ...   P*'Σ̃^{(c)*MM}P* ].   (7.17)

These results are similar to those obtained for the estimators Π̃ in Chapter 4.0.
When each reduced-form equation is considered independently, we obtain the estimators

[Π̃_μ^{(o)}; ũ*_μ^{(o)}] = [ X'Σ̃^{(c)*-1}_{μμ}X   X'Σ̃^{(c)*-1}_{μμ}A ]^{-1} [ X'Σ̃^{(c)*-1}_{μμ}y_μ ]
                            [ A'Σ̃^{(c)*-1}_{μμ}X   A'Σ̃^{(c)*-1}_{μμ}A ]      [ A'Σ̃^{(c)*-1}_{μμ}y_μ ]   (μ = 1, 2, ..., M),   (7.18)

so that we have explicitly

Π̃_μ^{(o)} = [X'Σ̃^{(c)*-1}_{μμ}X − (X'Σ̃^{(c)*-1}_{μμ}A)(A'Σ̃^{(c)*-1}_{μμ}A)^{-1}(A'Σ̃^{(c)*-1}_{μμ}X)]^{-1}
            × [X'Σ̃^{(c)*-1}_{μμ}y_μ − (X'Σ̃^{(c)*-1}_{μμ}A)(A'Σ̃^{(c)*-1}_{μμ}A)^{-1}(A'Σ̃^{(c)*-1}_{μμ}y_μ)].   (7.19)
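Computationally, (7.19) is a partitioned (partialled-out) regression: the cross-section dummies A are swept out of both X and y_μ with the GLS weighting before the reduced-form coefficients are computed. A sketch follows (Python/numpy, illustrative only; `Sigma_c_inv` stands for the inverse of the estimated Σ̃^{(c)*}_{μμ}).

```python
import numpy as np

def fixed_effect_gls(X, A, y, Sigma_c_inv):
    """Estimator (7.19): GLS on (X, A) with the dummy coefficients partialled out."""
    XtS = X.T @ Sigma_c_inv
    AtS = A.T @ Sigma_c_inv
    AtSA_inv = np.linalg.inv(AtS @ A)
    M_xx = XtS @ X - (XtS @ A) @ AtSA_inv @ (AtS @ X)
    M_xy = XtS @ y - (XtS @ A) @ AtSA_inv @ (AtS @ y)
    return np.linalg.solve(M_xx, M_xy)
```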
7.1.2 Structural Estimation

The estimation of structural parameters follows the same pattern as in Chapters 5.0 and 6.0. Thus the two-stage generalized least squares estimators are given by

[δ̃_μ^{(o)}; ũ_μ^{(o)}] = [ Z̃_μ'Σ̃^{(c)*-1}_{μμ}Z̃_μ   Z̃_μ'Σ̃^{(c)*-1}_{μμ}A ]^{-1} [ Z̃_μ'Σ̃^{(c)*-1}_{μμ}y_μ ]
                            [ A'Σ̃^{(c)*-1}_{μμ}Z̃_μ    A'Σ̃^{(c)*-1}_{μμ}A  ]      [ A'Σ̃^{(c)*-1}_{μμ}y_μ  ],   (7.20)

whence we obtain explicitly

δ̃_μ^{(o)} = [Z̃_μ'Σ̃^{(c)*-1}_{μμ}Z̃_μ − Z̃_μ'Σ̃^{(c)*-1}_{μμ}A(A'Σ̃^{(c)*-1}_{μμ}A)^{-1}A'Σ̃^{(c)*-1}_{μμ}Z̃_μ]^{-1}
            × [Z̃_μ'Σ̃^{(c)*-1}_{μμ}y_μ − Z̃_μ'Σ̃^{(c)*-1}_{μμ}A(A'Σ̃^{(c)*-1}_{μμ}A)^{-1}A'Σ̃^{(c)*-1}_{μμ}y_μ]   (μ = 1, 2, ..., M).   (7.21)
Further, the three-stage generalized least squares estimators are

[δ̃; ũ] = (Z̃**'Σ̃^{(c)*-1}Z̃**)^{-1} Z̃**'Σ̃^{(c)*-1} y,   (7.22)

where Z̃**_μ = (Z̃_μ, A) and Z̃** = diag(Z̃**_1, Z̃**_2, ..., Z̃**_M). The estimators of δ = (δ_1; δ_2; ...; δ_M) given by (7.22) are asymptotically more efficient than those given by (7.21).
7.2
Cross-Sectional and Time Effects Random
with Finite Nonzero Expectations
Consider the model

y_it' A + x_it' B + u_i' + v_t' + w_it' = 0,   (7.23)

where

u_i = (u_{1i}, u_{2i}, ..., u_{Mi})',   v_t = (v_{1t}, v_{2t}, ..., v_{Mt})',   and   w_it = (w_{1it}, w_{2it}, ..., w_{Mit})'

are mutually independent random vectors such that

E(u_{μi}) = λ_μ for all i = 1, 2, ...;
E(u_{μi} − λ_μ)(u_{μ'i'} − λ_{μ'}) = σ^u_{μμ'} if i = i', and 0 if i ≠ i', for μ, μ' = 1, 2, ..., M;
E(v_{μt}) = θ_μ for all t = 1, 2, ...;
E(v_{μt} − θ_μ)(v_{μ't'} − θ_{μ'}) = σ^v_{μμ'} if t = t', and 0 if t ≠ t', for μ, μ' = 1, 2, ..., M;

and the w_{μit}'s are specified as in (4.1.1).

Assuming that the σ^u_{μμ'} and σ^v_{μμ'} are all finite and that u_i and v_t are multivariate normal, we can obviously write

u_{μi} = λ_μ + u⁺_{μi}   and   v_{μt} = θ_μ + v⁺_{μt}   (μ = 1, 2, ..., M).

The random variables u⁺_{μi}, v⁺_{μt} and w_{μit} have the same distributional properties as the u_{μi}, v_{μt} and w_{μit} of section (4.1), and the μth structural equation (in terms of nT observations) can be written as

y_μ = Y^{(μ)} α_μ + X_μ β_μ + (λ_μ + θ_μ) 1 + ε⁺_μ,   (7.24)

where 1 = (1, 1, ..., 1)' is the nT×1 vector of ones. Clearly,

E(ε⁺_μ ε⁺'_μ) = Σ*_{μμ}   and   E(ε⁺_μ ε⁺'_{μ'}) = Σ*_{μμ'}   (μ ≠ μ').

Let s_μ = λ_μ + θ_μ. The equation (7.24) becomes

y_μ = Y^{(μ)} α_μ + X_μ β_μ + s_μ 1 + ε⁺_μ = Y^{(μ)} α_μ + X⁺_μ γ_μ + ε⁺_μ,   (7.25)

where

X⁺_μ = (X_μ, 1)   and   γ_μ = [β_μ; s_μ].

This is exactly the equation (4.3) with X_μ replaced by X⁺_μ and β_μ replaced by γ_μ. The new specifications merely introduce a constant term in the model. The estimation of reduced-form and structural parameters follows the same lines as in the preceding chapters.
7.3
Dummy Variable Specifications
Finally, let us assume that the time effects and the cross-sectional effects are both fixed, so that in the μth structural equation

y_{μit} = y^{(μ)}_it' α_μ + x_{(μ)it}' β_μ + u_{μi} + v_{μt} + w_{μit},

the error term is w_{μit}. It is clearly implied that the number of cross-sectional units, n, and the number of time intervals, T, are fixed. In this context one cannot speak of asymptotic samples, and consistent estimation really does not make sense.

We can find unbiased estimators for the reduced-form parameters. Since E(w*_μ w*'_μ) = σ^{w*}_{μμ} I_{nT}, the ordinary least squares method applied to each reduced-form equation separately will yield such estimators. Thus, writing f = (A, B), we have

Π̂_μ = [X'X − X'f(f'f)^{-g}f'X]^{-1} [X'y_μ − X'f(f'f)^{-g}f'y_μ]   (μ = 1, 2, ..., M),   (7.26)

as estimators of Π_μ which are B.L.U. We take the generalized inverse (f'f)^{-g} because (f'f) is singular and its inverse does not exist. However, for any choice of (f'f)^{-g}, the matrix on the right-hand side of (7.26) is nonsingular (see Rohde [16], p. 51).
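Formula (7.26) can be evaluated directly, with a Moore–Penrose inverse standing in for the generalized inverse (f'f)^{-g}. A sketch follows (Python/numpy, illustrative only; the array `F` holding the combined cross-section and time dummies f = (A, B) is the example's own).

```python
import numpy as np

def dummy_variable_ols(X, F, y):
    """B.L.U. reduced-form estimator (7.26) for the dummy variable model."""
    FtF_g = np.linalg.pinv(F.T @ F)          # a generalized inverse of f'f
    P = F @ FtF_g @ F.T                      # projection on the dummy space
    Mxx = X.T @ X - X.T @ P @ X
    Mxy = X.T @ y - X.T @ P @ y
    return np.linalg.solve(Mxx, Mxy)
```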
When we attempt to find structural estimators we encounter a difficulty, because a subset of the explanatory variables in each structural equation consists of jointly dependent variables. These being correlated with the error term, the least squares method fails to provide estimators with any desirable property. What enabled us to estimate structural parameters in the error component model was the fact that n or T or both could become very large, and that, asymptotically, the explanatory variables were uncorrelated with the error terms. In the present case, however, n and T are both fixed, and the correlation between explanatory variables and error terms does not permit the estimation of structural parameters.
106
8.0
SUMMARY AND CONCLUSIONS
This thesis is concerned with the problem of estimating the parameters of a system of simultaneous linear equations with the error component disturbance model. The disturbance term in each equation of the system is assumed to have three mutually independent random components: a component associated with time, another representing a random cross-sectional effect, and a measurement error.
The problem is introduced in Chapter 1.0, but its consideration is
postponed until Chapter 4.0.
Chapter 2.0 is a review of existing material on the subject of
simultaneous linear equations.
briefly discussed.
Some well-known methods of estimation are
All these methods use time series and their main
stochastic assumption is that the disturbances are temporally uncorrelated.
In Chapter 3.0 some simple results on the convergence of sequences
of random variables are given, which are used later to derive large sample
properties of structural and reduced-form estimators.
The topic of reduced-form estimation is considered in Chapter 4.0. A two-stage estimation procedure is developed for the error component model. In the first stage, the covariance matrices Σ*_{μμ'} (μ, μ' = 1, 2, ..., M) of the reduced-form disturbances are estimated from the ordinary least squares residuals.
In the second stage, two sets of estimators are derived for
the reduced-form parameters:
(i) Single equation estimators Π̃_μ^{(o)}, which result when Aitken's two-stage method is applied to each equation in the reduced form separately, and

(ii) Generalized estimators Π̃, obtained from the entire set of equations simultaneously.
107
Both sets of estimators compare favorably with the ordinary least
squares estimators.
The structural parameters may be estimated by one of the two methods developed in Chapter 6.0: the "Two-Stage Generalized Least Squares Method" and the "Three-Stage Generalized Least Squares Method," which parallel, respectively, the 2-SLS and the 3-SLS methods for the ordinary model of simultaneous linear equations. For a system of exactly identified equations, a third method called the "Indirect Generalized Least Squares Method" is also available.
The three methods yield identical estimators
when the equations are exactly identified.
A notable feature of these
methods is that they are based on the covariance matrices of reduced-form
disturbances, and do not require the estimation of covariances of structural disturbances.
The properties of the estimators derived in the thesis may be summarized as follows:

Reduced-form estimators:

(i) "Ordinary Least Squares" Π̂: unbiased, consistent, asymptotically inefficient.

(ii) Single equation Π̃^{(o)}: consistent, asymptotically unbiased, asymptotically more efficient than Π̂.

(iii) Generalized Π̃: consistent, asymptotically unbiased, asymptotically more efficient than Π̃^{(o)} as well as Π̂.

Structural estimators:

(iv) "Two-Stage" δ̃^{(o)}: consistent, asymptotically unbiased.

(v) "Three-Stage" δ̃: consistent, asymptotically unbiased, asymptotically more efficient than δ̃^{(o)}.
A number of special cases are considered.
In the case of the
"dummy variable model" it is seen that, although one can obtain B.L.U.
estimators for the reduced-form parameters, there is a difficulty in
the way of structural estimation because of the fact that the number of
cross-sectional units and the number of time intervals are both assumed
fixed.
This study demonstrates that the error component model is an effective device for combining time series and cross-sectional data in the estimation of simultaneous linear equations. The model can be readily adapted to take care of a variety of situations.
The error component model invites further research in a number of
directions:
(i) The problem of prediction;

(ii) Small sample properties of the reduced-form and structural estimators derived in this thesis;
(iii)
The problem of structural estimation in the dummy variable
model; and
(iv)
Derivation of other estimation procedures.
109
LIST OF REFERENCES
•
1.
Anderson, T. W., and H. Rubin. 1949. Estimation of parameters of a
single equation in a complete system of stochastic equations.
Annals of Mathematical Statistics 20:46-63.
2.
Anderson, T. W., and H. Rubin. 1950. Asymptotic properties of
estimators of parameters of a single equation in a complete
system of stochastic equations. Annals of Mathematical
Statistics 21:570-582.
3.
Basmann, R. L. 1957. Generalized classical estimation of coefficients
in a structural equation. Econometrica 25:77-83.
4.
Basmann, R. L. 1960. On asymptotic distribution of generalized
linear estimators. Econometrica 28:97-106.
5.
Bodewig, E. 1956. Matrix Calculus. North-Holland Publishing Company, Amsterdam.
6.
Box, G. E. P. 1954. Some theorems on quadratic forms applied in analysis of variance problems, etc. Annals of Mathematical Statistics 25:290-302.
7.
Cramer, H. 1963. Mathematical Methods of Statistics. Princeton University Press, Princeton, New Jersey.
8.
Goldberger, A. S. 1964. Econometric Theory. John Wiley and Sons, Inc., New York.
9.
Johnston, J. 1963. Econometric Methods. McGraw-Hill Book Company, Inc., New York.
10.
Koopmans, T. C. and W. C. Hood. 1953. The estimation of simultaneous economic relationships, pp. 112-199. In W. C. Hood and
T. C. Koopmans (eds.), Studies in Econometric Method. John
Wiley and Sons, Inc., New York.
11.
Koopmans, T. C., H. Rubin and R. C. Leipnik. 1950. Measuring the
Equation Systems of Dynamic Economics. John Wiley and Sons,
Inc., New York.
12.
Loeve, M. 1963. Probability Theory. D. Van Nostrand Company, Inc., Princeton, New Jersey.
13.
Mundlak, Y. 1963. Estimation of production and behavioral functions, pp. 138-166. In C. F. Christ (ed.), Measurement in Economics. Stanford University Press, Stanford, California.
14.
Nagar, A. L. 1959. The bias and moment matrix of general K-class estimators of the parameters in simultaneous equations. Econometrica 27:575-595.
110
LIST OF REFERENCES (continued)
15.
Rao, C. R. 1965. Linear Statistical Inference and Its Applications.
John Wiley and Sons, Inc., New York.
16.
Rohde, A. R. 1964. Contribution to the theory of generalized inverses,
etc. Institute of Statistics, Mimeograph Series No. 392, North
Carolina State University at Raleigh.
17.
Theil, H. 1961. Economic Forecast and Policy. North-Holland Publishing Company, Amsterdam.
18.
Wilks, S. S. 1962. Mathematical Statistics. John Wiley and Sons, Inc., New York.
19.
Zellner, A. 1962. An efficient method of estimating seemingly
unrelated regressions. Journal of the American Statistical
Association 57:348-368.
20.
Zellner, A., and H. Theil. 1962. Three-stage least-squares:
Simultaneous estimation of simultaneous equations. Econometrica
30:54-78.