Econometrica, Vol. 67, No. 5 (September, 1999), 1057–1111
LINEAR REGRESSION LIMIT THEORY FOR
NONSTATIONARY PANEL DATA$^1$
BY PETER C. B. PHILLIPS AND HYUNGSIK R. MOON$^2$
This paper develops a regression limit theory for nonstationary panel data with large
numbers of cross section ($n$) and time series ($T$) observations. The limit theory allows for
both sequential limits, wherein $T \to \infty$ followed by $n \to \infty$, and joint limits where $T, n \to \infty$
simultaneously; and the relationship between these multidimensional limits is explored.
The panel structures considered allow for no time series cointegration, heterogeneous
cointegration, homogeneous cointegration, and near-homogeneous cointegration. The
paper explores the existence of long-run average relations between integrated panel
vectors when there is no individual time series cointegration and when there is heterogeneous cointegration. These relations are parameterized in terms of the matrix regression
coefficient of the long-run average covariance matrix. In the case of homogeneous and
near homogeneous cointegrating panels, a panel fully modified regression estimator is
developed and studied. The limit theory enables us to test hypotheses about the long run
average parameters both within and between subgroups of the full population.
KEYWORDS: Nonstationary panel data, long-run average relations, multidimensional
limits, panel cointegration regression, panel spurious regression.
1. INTRODUCTION
THERE HAS BEEN MUCH RECENT EMPIRICAL econometric work on economic
models that uses panel data for which the time series component is nonstationary. Testing growth convergence theories in macroeconomics and estimating
long-run relations between international financial series such as relative prices
and exchange rates, and spot and future exchange rates are a few examples. This
work has been facilitated by the construction and availability of a number of
important panel data sets covering different individuals, regions, and countries
over a relatively long time period, a notable example being the Penn World
table. For such cases a new nonstationary panel data limit theory which allows
for large n and large T asymptotics is useful. Much past panel data research has
focused on identifying and estimating effects from stationary panels with a large
cross section data dimension ($n$) but with few time series ($T$) observations. In
$^1$ An earlier version of this paper, Phillips and Moon (1997a), hereafter PMa, was presented at the
inaugural meeting of the New Zealand Econometric Society Group in Auckland, February 1997,
while Phillips was visiting the University of Auckland. Some of the results were also presented in a
training course on panel cointegration given by the first author at the EMBA meetings in Palm
Cove, Australia, August 1996. The present paper is a shortened version of Phillips and Moon
(1997b), hereafter PMb, to which readers will be frequently referred for a full development of the
algebra and limit theory given here.
$^2$ The authors thank a co-editor and three referees for comments on the paper, and Donald
Andrews, David Pollard, and Oliver Linton for helpful discussions. Phillips thanks the NSF for
research support under Grant No. SBR 94-22922, and Moon gratefully acknowledges financial
support from a C.A. Anderson Prize Fellowship. The paper was typed by the authors in Scientific
Word 2.5.
such cases a large $n$, fixed $T$ limit theory is natural and Chamberlain (1984),
Hsiao (1986), Matyas and Sevestre (1992), and Baltagi (1995) review much of
this research.
The purpose of the present contribution is to investigate regressions with
nonstationary panel data for which the time series component is an integrated
process and where both T and n are large. In such cases, panel regressions can
behave very differently from time series regressions. It has long been recognized
by econometricians that panel data can distinguish effects that time series or
cross section data alone cannot identify, and nonstationary panels provide a
further instance of this phenomenon.
Suppose that we have two $I(1)$ random vectors, say $Y_{i,t}$ and $X_{i,t}$. When there
is no cointegrating relation between $Y_{i,t}$ and $X_{i,t}$, and a time series regression
for given $i$ is performed, then the regression coefficient is well known to have a
nondegenerate limit distribution and the regression is characterized as spurious
(Granger and Newbold (1974) and Phillips (1986)). Now suppose that there are
panel observations of $Y_{i,t}$ and $X_{i,t}$ with large cross sectional and time series
components. In this case, even if the noise in the time series regression is strong,
the noise can often be characterized as independent across individuals. Hence,
by pooling the cross section and time series observations, we may attenuate the
strong effect of the residuals in the regression while retaining the strength of the
signal ($X_{i,t}$). In such a case, we can expect a panel-pooled regression to provide
a consistent estimate of some long-run regression coefficient.
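The contrast described above can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes scalar, mutually independent Gaussian random walks for $Y_{i,t}$ and $X_{i,t}$ (so there is no cointegration and the long-run covariance is zero), no intercept, and illustrative values of $n$ and $T$; the function name `spurious_vs_pooled` is hypothetical. Under these assumptions the unit-by-unit slopes should stay widely dispersed, while the pooled slope should settle near zero.

```python
import numpy as np

def spurious_vs_pooled(n=200, T=200, seed=0):
    """Compare unit-by-unit and pooled OLS slopes for independent random walks."""
    rng = np.random.default_rng(seed)
    # Independent I(1) panels: no cointegration and zero long-run covariance (assumed design).
    x = np.cumsum(rng.standard_normal((n, T)), axis=1)
    y = np.cumsum(rng.standard_normal((n, T)), axis=1)
    sxy = (x * y).sum(axis=1)            # sum over t of X_{i,t} Y_{i,t}, one entry per unit
    sxx = (x * x).sum(axis=1)            # sum over t of X_{i,t}^2, one entry per unit
    beta_i = sxy / sxx                   # n separate time series regressions (no intercept)
    beta_pooled = sxy.sum() / sxx.sum()  # one regression pooling all (i, t) observations
    return beta_i, beta_pooled

beta_i, beta_pooled = spurious_vs_pooled()
print("spread of individual slopes:", beta_i.std())
print("pooled slope:", beta_pooled)
```

Pooling averages the numerator and denominator over $i$ before forming the ratio, which is where the cross section independence of the noise is exploited.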
The present paper is concerned with developing a limit theory that is helpful
in understanding and interpreting regressions of this type. In particular, we show
the existence of an interesting long-run relation between panel vectors like $Y_{i,t}$
and $X_{i,t}$ that have no individual time series cointegrating relation. The new
relation is a long-run average relationship over the cross section and it is
parameterized in terms of a matrix regression coefficient derived from the cross
section long-run average covariance matrix.
The following sections consider four possible panel structures for $Y_{i,t}$ and
$X_{i,t}$: (i) no cointegrating relation, (ii) a heterogeneous cointegrating relation, (iii)
a homogeneous cointegrating relation, and (iv) a near-homogeneous relation.
Our analysis shows that in all four cases the pooled estimator is consistent and
has a normal limit distribution. In the no cointegration and heterogeneous
cointegration cases, we also study a limiting cross section estimator and prove
that it is consistent and has a normal limit distribution, but that it is less efficient
than the pooled estimator. In addition, in the case of homogeneous cointegration
and near-homogeneous cointegration, we can construct a consistent estimator for
the long-run regression coefficient, which we call a pooled FM (fully
modified) estimator. This estimator has a faster coefficient convergence rate
than the simple cross section and time series estimators.
Since the beginning of the 1990’s there has been some ongoing research on
nonstationary panel data that connects to our work here. Quah (1994), Levin
and Lin (1993), and recently Im et al. (1996) considered unit root time series
regressions with nonstationary panel data and proposed test statistics for unit
roots. In addition, Pedroni (1995) studied some properties of cointegration
statistics in pooled time series panels, and Robertson and Symons (1992) studied
the biases that are likely to arise in practice with both stationary and
nonstationary panel data. More closely related to our work is Pesaran and Smith (1995),
who examined the impact of nonstationary variables on cross section regression
estimates. They showed that spurious correlation between two $I(1)$ variables
does not arise in the case of cross section regression with a finite number of
time series observations under conditions such as exogenous regressors and iid
disturbances. Our paper extends that result to a very general setting and
provides a limit theory as $T \to \infty$ and $n \to \infty$ in panel regressions. The long-run
relation defined in Pesaran and Smith (1995) is an average of randomly different
cointegrating coefficients and they suggested cross section regression with time
averaged data for consistent estimation. By contrast, our long run relation is the
regression relation associated with the long run average covariance matrix and it
is this regression that is the natural limit of a pooled panel regression. Further,
we show that both pooled panel regression and limiting cross section regression
estimators are consistent for this long-run average relation.
The limit theory developed here allows for both sequential limits, wherein
$T \to \infty$ followed by $n \to \infty$, and joint limits where $T, n \to \infty$ simultaneously. A
detailed discussion of multi-index asymptotic theory is provided and some
general theorems for laws of large numbers and central limit theory are given.
Sequential limit theory is easy to derive and generally leads to quick results for a
variety of model configurations. Under some strengthening of the conditions,
the results obtained under sequential limits are shown to apply also when
$T, n \to \infty$ simultaneously. For a limit distribution theory we need the rate
condition $n/T \to 0$. The latter condition indicates that the limit theory here is
most likely to be useful in practice when $n$ is moderate and $T$ is large. We can
expect such data configurations in multi-country macroeconomic data, for example,
when we restrict attention to groups of countries like OECD nations or
developing countries. The limit theory enables us to test hypotheses about the
long-run average parameters both within and between such subgroups of the full
population.
The paper is organized as follows. Section 2 introduces the basic model, lays
out assumptions, and gives some preliminary results including a multidimensional
Beveridge Nelson (BN) decomposition. Section 3 develops a framework
for asymptotics for double indexed processes that is used in the paper for both
sequential and joint limit theories. Section 4 assumes that there is no cointegration
among the $I(1)$ variables across all the individuals and gives asymptotic
theories for a pooled regression estimator and a limiting cross section estimator.
Section 5 assumes that there exists a cointegrating relation in the $I(1)$ variables
across all the individuals and derives limit theories for the pooled estimators in
three cases: heterogeneous, homogeneous, and near-homogeneous cointegration.
Section 6 indicates some extensions of this theory to allow for models with
individual specific effects. Section 7 concludes the paper. Five appendices are
included to develop the multidimensional limit theory, provide some technical
background, relevant lemmas, and proofs of results in the paper.
Notation is fairly standard. The symbol "$\Rightarrow$" signifies weak convergence,
"$:=$" is definitional equivalence, "$\equiv$" signifies equivalence in distribution,
"$\to_p$" is convergence in probability, and "$\to_{a.s.}$" is convergence almost surely. The
inequality "$> 0$" signifies positive definiteness when applied to matrices.
Stochastic processes such as Brownian motion $W(r)$ on $[0, 1]$ are usually written
as $W$, integrals such as $\int_0^1 W(r)\,dr$ as $\int W$, and stochastic integrals like
$\int_0^1 W(r)\,dW(r)$ as $\int W\,dW$. Also $\mathrm{vec}(A)$ denotes vectorization of the matrix $A$ by
stacking columns, and $\|A\|$ is the Euclidean norm $(\mathrm{tr}(A'A))^{1/2}$.
2. ASSUMPTIONS, LARGE $T$ ASYMPTOTICS, AND THE LONG-RUN AVERAGE COVARIANCE MATRIX
We start with a panel data model based on the vector integrated process
(2.1)    $Z_{i,t} = Z_{i,t-1} + U_{i,t} \qquad (t = 1, \ldots, T;\ i = 1, \ldots, n)$
with common initialization at $t = 0$ satisfying
(2.2)    $Z_{i,0}$ is iid across $i$ with $E\|Z_{i,0}\|^4 < \infty$.
We partition the $m$-vectors $Z_{i,t}$ and $U_{i,t}$ in (2.1) into $m_y$ and $m_x$ components
($m = m_y + m_x$) as $Z_{i,t} = (Y_{i,t}', X_{i,t}')'$ and $U_{i,t} = (U_{y_i,t}', U_{x_i,t}')'$. Condition (2.2) is
made for convenience and could be generalized to allow for remote past
initialization at the cost of some further complications (e.g., Phillips and Lee
(1996)). The error $U_{i,t}$ is assumed to be generated by the random coefficient
linear process
(2.3)    $U_{i,t} = \sum_{s=0}^{\infty} C_{i,s} V_{i,t-s}$,
where: (i) $\{C_{i,t}\}$ is a double sequence of $(m \times m)$ random matrices across $i$ and
over $t$; (ii) the $m$-vectors $V_{i,t}$ are iid across $i$ and over $t$ with $E(V_{i,t}) = 0$,
$E(V_{i,t} V_{i,t}') = I_m$, and, letting $V_{a,i,t}$ be the $a$th element of $V_{i,t}$, the $V_{a,i,t}$ are
assumed to be independent across $a = 1, \ldots, m$ with $E(V_{a,i,t}^4) = v_4$ for all $i$ and
$t$; (iii) $C_{i,s}$ and $V_{j,t}$ are independent for all $i$, $j$, $t$, and $s$.
We make two further assumptions about the random coefficients in (2.3). The
first involves moment conditions and the second is a set of summability
conditions on the moments of the random coefficients.
ASSUMPTION 1 (Random Coefficient Conditions):
(i) $\{C_{i,s}\}_i$ is iid across $i$ for all $s$.
(ii) $E\|C_{i,s}\|^4 < \infty$ for all $s$.
Thus, $C_{i,s}$ is assumed to be iid across individuals and to have finite fourth
moments that may vary over time. We allow $C_{i,s}$ to be dependent over $s$. This is
important, because whenever $U_{i,t}$ is generated by a finite-parameter time series
model like an autoregression, the coefficients in the Wold decomposition (2.3)
will be nonlinear functions of these parameters that are lag ($s$) dependent and
will therefore inevitably be dependent over $s$. Let $C_{a,i,s}$ be the $a$th element of
$\mathrm{vec}(C_{i,s})$. Also let $E(C_{a,i,s}^k) = \sigma_{k,a,s}$. Then we make the following assumption:
ASSUMPTION 2 (Summability Conditions): The following hold for all $a = 1, \ldots, m^2$:
(i) $\sum_{s=0}^{\infty} s^2 \sigma_{2,a,s} < \infty$.
(ii) $\sum_{s=0}^{\infty} s^4 (\sigma_{4,a,s})^{1/4} < \infty$.
Suppose the $U_{i,t}$ in (2.1) are generated by a random coefficient ARMA
process whose characteristic equation has roots $\{\lambda_{ij} : j = 1, \ldots, J\}$. Then the
coefficients $C_{i,s}$ in the Wold decomposition (2.3) are all linear combinations of
powers of these characteristic roots. Under weak conditions on the distribution
of the roots we can now verify Assumptions 1 and 2. Suppose, for instance, that
the support of the distribution of the moduli of these roots is a compact set
inside the stable region, so that $|\lambda_{ij}| \le M_\lambda < 1$ a.s. Then all moments of $\|C_{i,s}\|$
are finite for all $s$, and series such as those in Assumption 2 are easily seen to be
majorized by convergent series. For example,
$\sum_{s=0}^{\infty} s^2 \sigma_{2,a,s} \le M \sum_{s=0}^{\infty} s^2 M_\lambda^{2s} < \infty$
for some constant $M$. Similar conditions will ensure the validity of the
alternative Assumptions 4 and 5 that are used later on in the paper.
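As a worked instance of this majorization argument, consider the simplest special case: a scalar random coefficient AR(1), $U_{i,t} = \lambda_i U_{i,t-1} + V_{i,t}$, with $|\lambda_i| \le M_\lambda < 1$ a.s. This special case is an assumption made here purely for illustration (the text above states the argument for general ARMA processes); in it the bounding constant can be taken to be $M = 1$.

```latex
\begin{align*}
C_{i,s} &= \lambda_i^{s}, \qquad
\sigma_{2,s} = E\lambda_i^{2s} \le M_\lambda^{2s}, \qquad
(\sigma_{4,s})^{1/4} = \left(E\lambda_i^{4s}\right)^{1/4} \le M_\lambda^{s}, \\
\sum_{s=0}^{\infty} s^{2}\sigma_{2,s}
  &\le \sum_{s=0}^{\infty} s^{2} M_\lambda^{2s}
   = \frac{M_\lambda^{2}\,(1+M_\lambda^{2})}{(1-M_\lambda^{2})^{3}} < \infty, \\
\sum_{s=0}^{\infty} s^{4}(\sigma_{4,s})^{1/4}
  &\le \sum_{s=0}^{\infty} s^{4} M_\lambda^{s} < \infty .
\end{align*}
```

The closed form in the second line uses the standard identity $\sum_{s \ge 0} s^2 x^s = x(1+x)/(1-x)^3$ for $|x| < 1$ with $x = M_\lambda^2$; the last series converges because polynomial growth is dominated by geometric decay.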
The following lemma establishes the integrability of terms that appear frequently in our development.
LEMMA 1: Let $C_i(1) = \sum_{s=0}^{\infty} C_{i,s}$, $\tilde C_{i,s} = \sum_{t=s+1}^{\infty} C_{i,t}$, and $\tilde U_{i,t} = \sum_{s=0}^{\infty} \tilde C_{i,s} V_{i,t-s}$.
Under Assumptions 1 and 2, the following hold:
(a) $E[\sum_{s=0}^{\infty} s^2 \|C_{i,s}\|^2] < \infty$.
(b) $E\|U_{i,t}\|^2 < M$ for some $M < \infty$.
(c) $E\|\tilde U_{i,t}\|^4 < M$ for some $M < \infty$.
(d) $E\|C_i(1)\|^4 < \infty$.
The temporal shift operator generating the iid sequence $\{V_{i,t}\}_t$ defines a
measure preserving map on the product probability space induced by the
independent sequences $\{V_{i,t}\}_t$ and $\{C_{i,t}\}_t$ and it generates the sequence $\{U_{i,t}\}_t$.
Also the random coefficient sequence $\{C_{i,t}\}_t$ is square summable a.s. since
$\sum_{t=0}^{\infty} \|C_{i,t}\|^2 < \infty$ a.s. by Lemma 1(a). Hence, the time series sequence $\{U_{i,t}\}_t$ is
square integrable (Lemma 1(b)) and strictly stationary for all $i$. However, the
sequence $\{U_{i,t}\}_t$ is not ergodic. This is because $\mathcal{F}_{c_i} = \sigma(C_{i,0}, \ldots, C_{i,t}, \ldots)$, the
sigma field generated by the sequence $\{C_{i,t}\}_{t=0}^{\infty}$, is an invariant sigma field with
respect to the temporal shift operator and generates events with probability
between zero and unity.
The following lemma shows that, for each $i$, $U_{i,t}$ satisfies a time series BN
decomposition (see Phillips and Solo (1992)) almost surely.
LEMMA 2 (Panel BN decomposition): Under Assumptions 1 and 2 the processes
$U_{i,t} = \sum_{s=0}^{\infty} C_{i,s} V_{i,t-s}$ in (2.3) admit the following BN decomposition:
(2.4)    $U_{i,t} = C_i(1) V_{i,t} + \tilde U_{i,t-1} - \tilde U_{i,t}$    a.s.
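The algebra behind (2.4) can be checked numerically in a simple special case. The sketch below assumes a scalar MA($q$) with fixed (nonrandom) coefficients, so that all infinite sums are finite; the function name `bn_residual` and the example coefficients are hypothetical and not from the paper. It builds $U_t$ directly and via the right-hand side of (2.4) and reports the largest discrepancy, which should be zero up to floating point error.

```python
import numpy as np

def bn_residual(c, T=200, seed=0):
    """Largest gap between U_t and its BN representation for a scalar MA(q), q >= 1."""
    rng = np.random.default_rng(seed)
    c = np.asarray(c, dtype=float)            # c[s] plays the role of C_{i,s}, s = 0..q
    q = len(c) - 1
    v = rng.standard_normal(T + q)            # innovations V_t, including q presample draws
    # c_tilde[s] = sum_{k > s} c[k] for s = 0..q-1, the analogue of C~_{i,s}
    c_tilde = np.cumsum(c[::-1])[::-1][1:]
    u = np.convolve(v, c, mode="valid")              # U_t on the last T dates
    u_tilde = np.convolve(v, c_tilde, mode="valid")  # U~_t on the last T + 1 dates
    # Right-hand side of (2.4): C(1) V_t + U~_{t-1} - U~_t
    rhs = c.sum() * v[q:] + u_tilde[:-1] - u_tilde[1:]
    return np.max(np.abs(u - rhs))

print(bn_residual([1.0, 0.5, 0.25]))
```

Matching coefficients on $V_{t-k}$ shows why this works: the coefficient is $C(1) - \tilde C_0 = C_0$ for $k = 0$ and $\tilde C_{k-1} - \tilde C_k = C_k$ for $k \ge 1$.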
Note that $C_i(1) = \sum_{s=0}^{\infty} C_{i,s} < \infty$ a.s. in view of Lemma 1(d), and $\tilde U_{i,t}$ are well
defined square integrable random vectors by Lemma 1(c). Following Phillips and
Solo (1992), partial sums of $U_{i,t}$ can be written as
(2.5)    $\frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} U_{i,t} =_{a.s.} C_i(1) \frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} V_{i,t} + \frac{1}{\sqrt{T}} \tilde U_{i,0} - \frac{1}{\sqrt{T}} \tilde U_{i,[Tr]}$,
where $[Tr]$ denotes the integer part of $Tr$ and it is a simple matter to establish
that these partial sum processes satisfy functional laws. Indeed, we have the
following large $T$ result.
LEMMA 3 (Panel Functional CLT): Under Assumptions 1 and 2,
$\frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} U_{i,t} \Rightarrow C_i(1) W_i(r)$
as $T \to \infty$ for all $i$.
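A quick Monte Carlo check of the scaling in Lemma 3 at $r = 1$ is sketched below. It assumes a scalar MA(1), $U_t = V_t + \theta V_{t-1}$, with a fixed coefficient $\theta$ (so $C(1) = 1 + \theta$ is nonrandom); this design, the function name `partial_sum_variance`, and the parameter values are illustrative assumptions, not the paper's setting. The variance of the standardized partial sum should be close to the long-run variance $C(1)^2$ rather than to the variance of $U_t$ itself.

```python
import numpy as np

def partial_sum_variance(theta=0.5, T=400, reps=4000, seed=0):
    """Monte Carlo variance of T^{-1/2} * sum_t U_t for the MA(1) U_t = V_t + theta * V_{t-1}."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((reps, T + 1))   # one extra column for the presample V_0
    u = v[:, 1:] + theta * v[:, :-1]         # U_1, ..., U_T in each replication
    scaled_sum = u.sum(axis=1) / np.sqrt(T)  # standardized partial sum at r = 1
    return scaled_sum.var()

theta = 0.5
print("simulated variance:      ", partial_sum_variance(theta))
print("long-run variance C(1)^2:", (1 + theta) ** 2)
print("variance of U_t:         ", 1 + theta ** 2)
```

With random $C_i(1)$, as in the paper, the same comparison would hold conditionally on the coefficients for each unit $i$.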
Let $M_i(r) = (M_{y_i}(r)', M_{x_i}(r)')' = C_i(1) W_i(r) = (C_{y_i}(1)', C_{x_i}(1)')' W_i(r)$. Here,
$M_i(r)$ is a randomly scaled (or mixed) Brownian motion with conditional
covariance matrix $C_i(1) C_i(1)'$, whose expectation is well defined because
$\|E\,C_i(1) C_i(1)'\| < \infty$ in view of Lemma 1(d).
By the continuous mapping theorem and initial condition (2.2) we have
$\frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \Rightarrow C_i(1) \int W_i W_i' \, C_i(1)' = \int M_i M_i'$,
as $T \to \infty$ for all $i$. Then, averaging over $i = 1, \ldots, n$, we have
(2.6)    $\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \Rightarrow \frac{1}{n} \sum_{i=1}^{n} C_i(1) \int W_i W_i' \, C_i(1)' = \frac{1}{n} \sum_{i=1}^{n} \int M_i M_i'$,
as $T \to \infty$ for any fixed $n$. Integrability of the summands in (2.6) follows readily
under the given summability and moment conditions.
LEMMA 4: Under Assumptions 1 and 2, $E\|\int M_i M_i'\|^2 < \infty$.
In consequence, a strong law of large numbers applies to (2.6) as $n \to \infty$ and
so
(2.7)    $\frac{1}{n} \sum_{i=1}^{n} \int M_i M_i' \to_{a.s.} E\left( \int M_i M_i' \right)$.
The limit here depends not on the covariance matrix of $Z_{i,t}$, but on a parameter
matrix that measures the long-run (over $t$) covariance of $Z_{i,t}$ averaged over $i$.
This parameter matrix is constructed as follows.
Let $\Omega_i$ be the long-run conditional covariance matrix of $Z_{i,t} = (Y_{i,t}', X_{i,t}')'$
conditioned on $\mathcal{F}_{c_i}$, i.e.,
$$\Omega_i = \begin{pmatrix} \Omega_{y_i y_i} & \Omega_{y_i x_i} \\ \Omega_{x_i y_i} & \Omega_{x_i x_i} \end{pmatrix} = C_i(1) C_i(1)' = \begin{pmatrix} C_{y_i}(1) C_{y_i}(1)' & C_{y_i}(1) C_{x_i}(1)' \\ C_{x_i}(1) C_{y_i}(1)' & C_{x_i}(1) C_{x_i}(1)' \end{pmatrix},$$
where the partitions of $\Omega_i$ and $C_i(1) C_i(1)'$ are conformable. By Lemma 1(d), $\Omega_i$
is integrable and we denote
$$\Omega = \begin{pmatrix} \Omega_{yy} & \Omega_{yx} \\ \Omega_{xy} & \Omega_{xx} \end{pmatrix} = E(\Omega_i) = E(C_i(1) C_i(1)').$$
We call $\Omega$ the long-run average covariance matrix of $Z_{i,t}$.
It is now apparent from (2.6) that $E(\int M_i M_i') = E[C_i(1) E(\int W_i W_i') C_i(1)']$. A
simple calculation reveals that
$E(\int W_i W_i') = \int_0^1 E(W_i(r) W_i(r)')\,dr = \int_0^1 r\,dr\; I_m = \frac{1}{2} I_m$,
so that (2.7) becomes
(2.8)    $\frac{1}{n} \sum_{i=1}^{n} \int M_i M_i' \to_{a.s.} \frac{1}{2} \Omega$,
showing that the limit of the second moment matrix of the data depends on the
long-run average covariance matrix $\Omega$. Taken together, (2.6) and (2.7) give us an
instance of an asymptotic development in which $T \to \infty$, followed by $n \to \infty$,
leading to the sequence of limits
(2.9)    $X_{n,T} := \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \right) \Rightarrow \frac{1}{n} \sum_{i=1}^{n} \int M_i M_i' \to_{a.s.} \frac{1}{2} \Omega$
for the double indexed process $X_{n,T}$. The next section discusses this type of
sequential asymptotic theory in relation to more general joint asymptotics.
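The limit in (2.9) can be illustrated by simulation. The sketch below is a stylized design assumed for illustration only: $m = 2$, $Z_{i,0} = 0$, and $U_{i,t} = a_i L V_{i,t}$ with $a_i \sim U(0.5, 1.5)$ iid across $i$ and a fixed matrix $L$, so that $C_i(1) = a_i L$ and $\Omega = E(a_i^2) L L' = (13/12) L L'$. The function name `second_moment_average` and all parameter values are hypothetical.

```python
import numpy as np

def second_moment_average(n=400, T=200, seed=0):
    """Return X_{n,T} from (2.9) and Omega for a stylized random-scale design (m = 2)."""
    rng = np.random.default_rng(seed)
    L = np.array([[1.0, 0.0], [0.5, 1.0]])     # fixed factor, chosen for illustration
    a = rng.uniform(0.5, 1.5, size=n)          # random scale: C_i(1) = a_i * L
    V = rng.standard_normal((n, T, 2))         # iid innovations V_{i,t}
    U = a[:, None, None] * (V @ L.T)           # U_{i,t} = a_i L V_{i,t}
    Z = np.cumsum(U, axis=1)                   # integrated process with Z_{i,0} = 0
    # (1/n) sum_i (1/T^2) sum_t Z_{i,t} Z_{i,t}'
    X_nT = np.einsum("itj,itk->jk", Z, Z) / (n * T ** 2)
    omega = (13.0 / 12.0) * (L @ L.T)          # E(a_i^2) = Var(a_i) + 1 = 13/12
    return X_nT, omega

X_nT, omega = second_moment_average()
print(X_nT)
print(0.5 * omega)
```

For finite $T$ the expectation of $X_{n,T}$ in this design is $\frac{T+1}{2T}\Omega$, so a small upward discrepancy relative to $\frac{1}{2}\Omega$ is expected in addition to sampling noise.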
3. LIMIT THEORY FOR MULTIDIMENSIONAL PROCESSES
Throughout this paper, attention is focused on deriving the limit behavior of a
double indexed process $X_{n,T}$, such as that given in (2.9). In general, the limit of
$X_{n,T}$ depends on the treatment of the two indices, $n$ and $T$, and the properties
that link the rows and columns of the process. Several approaches are possible.
One approach is to fix one of the indexes, say $n$, and allow the other ($T$) to pass
to infinity, giving an intermediate limit. By letting $n$ pass to infinity
subsequently, a sequential limit theory is obtained. We write this type of limit process
in the form $(T, n \to \infty)_{\mathrm{seq}}$, where the order of the indices is critical to the
meaning. While they often lead to tractable derivations, sequential limits can give
asymptotic results that are misleading in cases where both indexes pass to
infinity simultaneously. A second approach is to pass to infinity along a specific
diagonal path (in the two dimensional array) determined by a monotonically
increasing function relation of the type $T = T(n)$ while the index $n \to \infty$. (This
approach really requires specificity about the functional dependence only in the
limit, so it includes cases such as those where we assume that $(T/n) \to c \neq 0$, in
which case we simply have $T(n) = cn$.) We write this type of limit process in the
form $(T(n), n \to \infty)_{\mathrm{diag}}$. This approach also simplifies the asymptotic theory by
replacing $X_{n,T}$ with the single indexed process $X_{n,T(n)}$. The drawback of
diagonal path limit theory is that the assumed expansion path $(T(n), n) \to \infty$ is
highly specific and may not provide an appropriate approximation for a given
$(T, n)$ situation. Moreover, the limit theory can depend on the specific functional
relation $T = T(n)$ that is used in the asymptotic development. (A recent
econometric example of this situation is analyzed in Phillips and Lee (1996).) A third
approach is to allow both indexes to pass to infinity simultaneously without
placing specific diagonal path restrictions on the divergence. We write this type
of limit process in the form $(T, n \to \infty)$. Generally speaking, such joint limit
theory requires stronger conditions (linking the rows and columns of the joint
array, and on the moments of the component variates) to establish than
sequential convergence or diagonal path convergence. But, by the same token,
the results are also stronger and may be expected to be relevant to a wider
range of circumstances, provided the conditions hold.
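A stylized example, constructed here for illustration and not taken from the paper, shows how the three approaches can disagree. Suppose each cross section unit contributes a standard normal term plus a deterministic bias of order $T^{-1/2}$:

```latex
\begin{align*}
X_{n,T} &= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \xi_i + \frac{1}{\sqrt{T}} \right)
         = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \xi_i + \sqrt{\frac{n}{T}},
         \qquad \xi_i \sim \text{iid } N(0,1), \\
(T, n \to \infty)_{\mathrm{seq}} &: \quad X_{n,T} \Rightarrow N(0,1), \\
(T(n) = cn,\ n \to \infty)_{\mathrm{diag}} &: \quad X_{n,T} \Rightarrow N(c^{-1/2}, 1), \\
(T, n \to \infty) \text{ with } n/T \to 0 &: \quad X_{n,T} \Rightarrow N(0,1).
\end{align*}
```

In the sequential limit the bias vanishes before $n$ grows, so it leaves no trace. Along the diagonal $T = cn$ it accumulates to $c^{-1/2}$, and the joint limit agrees with the sequential one only under a rate restriction such as $n/T \to 0$.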
The asymptotic development in this paper will involve both sequential limit
theory and joint limit theory arguments. The sequential limits are especially
helpful in extracting quick asymptotics and they are useful because they bring
into play all of the key elements in our final limit theory in a straightforward
way. The joint limit theory is more difficult to derive and applies under stronger
conditions. Fortunately, these conditions do not seem to exclude cases of major
importance for the type of large T and moderate n empirical applications that
we have in mind for our methods.
The following subsections define the convergence concepts that we need and
give some conditions that assure joint convergence.
3.1. Definitions and Some Relations between Sequential and Joint Limits
A typical double index process of the type that occurs in this paper has the
linear form
(3.1)    $X_{n,T} = \frac{1}{k_n} \sum_{i=1}^{n} Y_{i,T}$,
where $Y_{i,T}$ are independent $m$-component random vectors across $i$ defined on a
probability space $(\Omega, \mathcal{F}, P)$. A typical $Y_{i,T}$ component in our case has the form
of a standardized sum
(3.2)    $Y_{i,T} = \frac{1}{d_T} \sum_{t=1}^{T} f(Z_{i,t=[Tr]})$,
where the $Z_{i,[Tr]}$ are random elements in the space$^3$ $D[0, 1]^h$, for some integer $h$,
within the space $(\Omega, \mathcal{F}, P)$, $d_T$ is a standardizing factor, and $f$ is a continuous
functional from $D[0, 1]^h$ to $\mathbb{R}^m$. We alert the reader that the meaning of the
notation $X_{n,T}$ and $Y_{i,T}$ in (3.1) and (3.2) above is different from that of the
symbols $Y_{i,t}$ and $X_{i,t}$ which represent components of $Z_{i,t}$ given in (2.1) that
appear in other sections of this paper. The differences in meaning should be
obvious from the context.
DEFINITION 1: (a) A sequence of $m$-vectors $\{X_{n,T}\}$ on $(\Omega, \mathcal{F}, P)$ is said to
converge in probability to $X$ sequentially, written $X_{n,T} \to_p X$ in sequential limit as
$(T, n \to \infty)_{\mathrm{seq}}$, if
$\lim_{n \to \infty} \lim_{T \to \infty} P\{\|X_{n,T} - X\| > \varepsilon\} = 0 \quad \forall \varepsilon > 0$.
(b) $X_{n,T}$ converges in distribution sequentially to the $m$-vector $X$, written
$X_{n,T} \Rightarrow X$ in sequential limit as $(T, n \to \infty)_{\mathrm{seq}}$, if
$\lim_{n \to \infty} \lim_{T \to \infty} |Ef(X_{n,T}) - Ef(X)| = 0 \quad \forall f \in C$,
where $C$ is the class of all bounded, continuous, real functions on $\mathbb{R}^m$.
In practice, we can find the sequential limits of $X_{n,T}$ in (3.1) as follows. Using
time series limit theory we find the limit behavior of $Y_{i,T}$. Suppose, for example,
that as $T \to \infty$
(3.3)    $Y_{i,T} \Rightarrow Y_i$
or
(3.4)    $Y_{i,T} \to_p Y_i$    for all $i$.
Then, by the independence of $Y_{i,T}$ across $i$ for all $T$, we have $X_{n,T} \Rightarrow X_n$ or
$X_{n,T} \to_p X_n$ as $T \to \infty$ for all $n$, where $X_n = (1/k_n) \sum_{i=1}^{n} Y_i$.
By enlarging the underlying probability space if necessary, we can take it in
the case of (3.3) that all the $Y_i$'s are defined on the same probability space.
$^3$ $D[0, 1]^h$ is a product metric space of $h$ independent copies of $D[0, 1]$, the space of all real
valued functions on the interval $[0, 1]$ that are right continuous and have finite left limits. $D[0, 1]^h$ is
endowed with the metric $\rho_h(f, g) = \max_i \{\rho(f_i, g_i) : i \in \{1, \ldots, h\},\ f_i, g_i \in D[0, 1]\}$, where $\rho$ is the
modified Skorohod metric (see Billingsley (1968, p. 112)) under which $D[0, 1]$ is separable and
complete.
Hence, the sum of the limit random variables $\sum_{i=1}^{n} Y_i$ is well defined on the same
space. Next, we allow $n \to \infty$ and apply a limit theory to the standardized sum
(3.5)    $X_n = \frac{1}{k_n} \sum_{i=1}^{n} Y_i$.
Under some regularity conditions, we can now find the sequential limit, $X$, of
$X_n$. For example, if $k_n = n$, we can apply a law of large numbers (LLN) to $X_n$
and if $k_n = \sqrt{n}$ we can use an appropriate central limit theorem (CLT).
The requirement that the $Y_i$'s in (3.3) are defined on the same probability
space is important, especially when we apply an LLN to $X_n$ in the second stage
(3.5). The reason is as follows. The weak convergence $X_{n,T} \Rightarrow X_n$ as $T \to \infty$
involves only the implication that the distribution of the $X_{n,T}$ converges to the
distribution of the $X_n$, not any properties of the probability space where the $X_n$
are defined. Indeed, if the weak convergence is mixing (e.g., see Hall and Heyde
(1980)), then $X_n$ escapes from the underlying probability space when $T \to \infty$.
However, to employ an LLN to the sequence of $X_n$, these variates need to be
defined on the same probability space. The requirement that the $Y_i$'s in (3.3) are
defined on the same probability space can be accommodated by suitably
enlarging the underlying space. The construction of such a probability space is
provided in Appendix B.
Next, we define the concepts of joint convergence in probability and joint
weak convergence.
DEFINITION 2: (a) Suppose that the $m$-vector random sequence $X_{n,T}$ and $X$
are defined on a probability space $(\Omega, \mathcal{F}, P)$. $X_{n,T}$ is said to converge in
probability jointly to $X$, written $X_{n,T} \to_p X$ as $(T, n \to \infty)$, if
(3.6)    $\lim_{T, n \to \infty} P\{\|X_{n,T} - X\| > \varepsilon\} = 0 \quad \forall \varepsilon > 0$.
(b) $X_{n,T}$ is said to converge in distribution jointly to a $(m \times 1)$ random vector
$X$, written $X_{n,T} \Rightarrow X$ as $(T, n \to \infty)$, if
(3.7)    $\lim_{T, n \to \infty} |Ef(X_{n,T}) - Ef(X)| = 0 \quad \forall f \in C$,
where $C$ is the class of all bounded, continuous real functions on $\mathbb{R}^m$.
REMARKS: (a) Evidently, joint convergence implies diagonal convergence on
all monotonic diagonal paths. Moreover, a version of the converse is also true,
namely that $X_{n,T} \to_p X$ (or $X_{n,T} \Rightarrow X$) as $(T, n \to \infty)$ if $X_{n,T(n)} \to_p X$ as
$(T(n), n \to \infty)_{\mathrm{diag}}$ for all $T(n) \to \infty$ monotonically as $n \to \infty$.
(b) In some of our results, we need to place a condition on the indexes in
joint convergence of the form $n/T \to 0$. Joint convergence as $(T, n \to \infty)$ is then
said to apply subject to this condition. The definitional limits given in (3.6) and
(3.7) above are naturally subject to the same condition regarding the passage of
the indexes to infinity in this case.
Sequential limits are by no means always the same as joint limits or diagonal
path limits. Sometimes, different normalizations are even required to obtain
nondegenerate limits. PMb gives several examples. Nevertheless, under some
circumstances we can establish a relationship between sequential limits and
joint limits. The following two lemmas give some elementary conditions.
LEMMA 5 (Conditions for Joint Convergence to Imply Sequential Convergence):
(a) Suppose there exist random vectors $X_n$ on the same probability space as
$X_{n,T}$ satisfying, for all $n$, $X_{n,T} \to_p X_n$ as $T \to \infty$. If $X_{n,T} \to_p X$ as $(T, n \to \infty)$, then
$X_{n,T} \to_p X$ sequentially as $(T, n \to \infty)_{\mathrm{seq}}$.
(b) Suppose there exist random vectors $X_n$ such that, for any fixed $n$, $X_{n,T} \Rightarrow X_n$
as $T \to \infty$. If $X_{n,T} \Rightarrow X$ as $n, T \to \infty$, then $X_{n,T} \Rightarrow X$ sequentially as $(T, n \to \infty)_{\mathrm{seq}}$.
LEMMA 6 (Conditions for Sequential Convergence to Imply Joint Convergence):
(a) Suppose there exist random vectors $X_n$ and $X$ on the same probability
space as $X_{n,T}$ satisfying, for all $n$, $X_{n,T} \to_p X_n$ as $T \to \infty$ and $X_n \to_p X$ as $n \to \infty$.
Then, $X_{n,T} \to_p X$ as $(n, T \to \infty)$ if and only if
(3.8)    $\limsup_{n,T} P\{\|X_{n,T} - X_n\| > \varepsilon\} = 0 \quad \forall \varepsilon > 0$.
(b) Suppose there exist random vectors $X_n$ such that, for any fixed $n$, $X_{n,T} \Rightarrow X_n$
as $T \to \infty$ and $X_n \Rightarrow X$ as $n \to \infty$. Then, $X_{n,T} \Rightarrow X$ as $(n, T \to \infty)$ if and only if
(3.9)    $\limsup_{n,T} |E(f(X_{n,T})) - E(f(X_n))| = 0 \quad \forall f \in C$.
3.2. Joint Convergence in Probability
Consider a double indexed process $X_{n,T}$ whose typical form is an average of
$(m \times 1)$ random vectors $Y_{i,T}$,
(3.10)    $X_{n,T} = \frac{1}{n} \sum_{i=1}^{n} Y_{i,T}$,
where the $Y_{i,T}$ are independent across $i$ for all $T$. The concern is to establish
conditions under which a probability limit of $X_{n,T}$ in (3.10) exists and to develop
methods of finding this probability limit.
Suppose the $X_{n,T}$ are integrable and let
(3.11)    $\mu_X = \lim_{n,T \to \infty} E X_{n,T} = \lim_{n,T \to \infty} \frac{1}{n} \sum_{i=1}^{n} E Y_{i,T}$    be finite.
By definition it is sufficient for $X_{n,T} \to_p \mu_X$ as $(n, T \to \infty)$ to show that
(3.12)    $\lim_{n,T \to \infty} P\left\{ \left\| \frac{1}{n} \sum_{i=1}^{n} (Y_{i,T} - E Y_{i,T}) \right\| > \varepsilon \right\} = 0$    for all $\varepsilon > 0$.
In some applications (3.12) can be verified by showing that
(3.13)    $\lim_{n,T \to \infty} E \left\| \frac{1}{n} \sum_{i=1}^{n} (Y_{i,T} - E Y_{i,T}) \right\| = 0$
using the Markov inequality. Or, if the $X_{n,T}$ are square integrable, (3.12) follows
by Chebychev's inequality when
(3.14)    $\lim_{n,T \to \infty} E \left\| \frac{1}{n} \sum_{i=1}^{n} (Y_{i,T} - E Y_{i,T}) \right\|^2 = \lim_{n,T \to \infty} \frac{1}{n^2} \sum_{i=1}^{n} E\|Y_{i,T} - E Y_{i,T}\|^2 = 0$,
where the first equality holds because the $Y_{i,T}$ are independent across $i$ for
all $T$.
Sequential probability limits can also be derived. From time series limit theory
we may obtain the limit behavior of $Y_{i,T}$ when $T \to \infty$. Suppose, for instance,
that as $T \to \infty$
(3.15)    $Y_{i,T} \Rightarrow Y_i \quad \forall i$
or
(3.16)    $Y_{i,T} \to_p Y_i \quad \forall i$
so that, by the independence of $Y_{i,T}$ across $i$ for all $T$, it follows $X_{n,T} \Rightarrow X_n$ or
$X_{n,T} \to_p X_n$ as $T \to \infty$ for all $n$, where $X_n = (1/n) \sum_{i=1}^{n} Y_i$.
Suppose also, in the case of (3.15), that the $Y_i$ are defined on the same
probability space for all $i$ so that the sum of the limit random variables
$(1/n) \sum_{i=1}^{n} Y_i$ is meaningful. Appendix B(1) provides a construction for doing this
and, hereafter, we assume that the random vectors $Y_i$ in (3.15) exist on the same
probability space whenever we use sequential limit arguments. By allowing
$n \to \infty$ and applying a standard strong law for independent random variables to
(3.17)    $X_n = \frac{1}{n} \sum_{i=1}^{n} Y_i$,
under some regularity conditions,$^4$ we may find the sequential limit $X$. Let
(3.18)    $\tilde\mu_X = \lim_{n} \frac{1}{n} \sum_{i=1}^{n} E Y_i$.
Then
$X_n = \frac{1}{n} \sum_{i=1}^{n} Y_i \to_{a.s.} \tilde\mu_X = \lim_{n} \frac{1}{n} \sum_{i=1}^{n} E Y_i$.
A fundamental question is whether the joint probability limit $\mu_X$ in (3.11) is
equivalent to the sequential probability limit $\tilde\mu_X$ in (3.18). Lemma 6 provides
$^4$ Simple sufficient conditions are that the $Y_i$ are independent with $\sup_i E\|Y_i - E Y_i\|^2 < \infty$.
one solution. According to Lemma 6, it is enough to verify condition (3.9) with
$X_{n,T} = (1/n) \sum_{i=1}^{n} Y_{i,T}$ and $X_n = (1/n) \sum_{i=1}^{n} Y_i$ to conclude that $X_{n,T} \to_p \tilde\mu_X$
as $n, T \to \infty$, where $\tilde\mu_X = \lim_n (1/n) \sum_{i=1}^{n} E Y_i$. The following theorem gives a set
of sufficient conditions under which condition (3.9) is satisfied, so that the
probability limits $\mu_X$ in (3.11) and $\tilde\mu_X$ in (3.18) are equivalent.
THEOREM 1 (Joint Probability Limits): Suppose the $(m \times 1)$ random vectors
$Y_{i,T}$ are independent across $i$ for all $T$ and integrable. Assume that $Y_{i,T} \Rightarrow Y_i$ as
$T \to \infty$ for all $i$. Let the following hold:
(i) $\limsup_{n,T} (1/n) \sum_{i=1}^{n} E\|Y_{i,T}\| < \infty$;
(ii) $\limsup_{n,T} (1/n) \sum_{i=1}^{n} \|E Y_{i,T} - E Y_i\| = 0$;
(iii) $\limsup_{n,T} (1/n) \sum_{i=1}^{n} E\|Y_{i,T}\| 1\{\|Y_{i,T}\| > n\varepsilon\} = 0 \ \forall \varepsilon > 0$; and
(iv) $\limsup_{n} (1/n) \sum_{i=1}^{n} E\|Y_i\| 1\{\|Y_i\| > n\varepsilon\} = 0 \ \forall \varepsilon > 0$.
(a) Then condition (3.9) holds.
(b) If $\lim_{n \to \infty} (1/n) \sum_{i=1}^{n} E Y_i$ ($:= \tilde\mu_X$) exists and $X_n \to_p \tilde\mu_X$ as $n \to \infty$, then
$X_{n,T} \to_p \tilde\mu_X$ as $(n, T \to \infty)$.
In establishing the existence of a joint probability limit of $(1/n)\sum_{i=1}^{n} Y_{i,T}$, Theorem 1 requires only first moment assumptions on $Y_{i,T}$ and is, in this respect, less demanding than (3.14), which uses second moments of $Y_{i,T}$. Theorem 1 is particularly useful when the first moment condition (3.13) is not so easy to establish.
An important special case arises when the $Y_{i,T}$ are a scaled version of some iid random vectors $Q_{i,T}$.$^5$ Suppose that $Y_{i,T} = C_i Q_{i,T}$, where the $\{Q_{i,T}\}_i$ are iid for all $T$ and the $C_i$ are $(m \times m)$ nonrandom matrices for all $i$. Suppose that $Q_{i,T} \Rightarrow Q_i$ as $T \to \infty$ for all $i$, so that $Y_i = C_i Q_i$. In general, the $Y_{i,T}$ are heterogeneous across $i$ unless the $C_i$ are the same for all $i$. The source of the heterogeneity of $Y_{i,T}$ is the scale effect $C_i$, and the heterogeneity from $C_i$ is smoothed by letting $n \to \infty$. We have the following result for this special case.
COROLLARY 1: Suppose that $Y_{i,T} = C_i Q_{i,T}$, where the $Q_{i,T}$ are iid across $i$ for all $T$, and the $C_i$ are $(m \times m)$ nonrandom matrices for all $i$. Assume that the $Q_{i,T}$ are integrable for all $T$ and $Q_{i,T} \Rightarrow Q_i$ as $T \to \infty$. Assume that $C = \lim_n (1/n)\sum_{i=1}^{n} C_i$ exists. If $\|Q_{i,T}\|$ is uniformly integrable in $T$ for all $i$, and if $\sup_i \|C_i\| < \infty$, then $(1/n)\sum_{i=1}^{n} Y_{i,T} \to_p CE(Q_i)$ as $(n, T \to \infty)$.
$^5$ In many applications, an $I(1)$ process $Z_{i,t}$ can be decomposed into a scaled random walk process plus an error term, that is, $Z_{i,t} = C_i(1) S_{i,t} + \tilde{U}_{i,0} - \tilde{U}_{i,t}$, where $S_{i,t} = S_{i,t-1} + U_{i,t}$ and $C_i(1)$ is the long-run moving average coefficient of $\Delta Z_{i,t}$ (see Phillips and Solo (1992)). Then, the scale factor $C_i$ is $C_i(1)$ and $Q_{i,T}$ corresponds to $f(S_{i,t}/\sqrt{T})$, where $f$ is a continuous functional on some metric space.
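As a rough numerical illustration of Corollary 1, the following sketch simulates a scalar panel in which $Q_{i,T} = T^{-2}\sum_t S_{i,t}^2$ for Gaussian random walks $S_{i,t}$, so that $Q_{i,T} \Rightarrow \int_0^1 W^2$ with mean $1/2$. The scale factors `C`, the choice of functional, and the function name `panel_average` are assumptions made only for this example and do not come from the paper.

```python
import numpy as np

def panel_average(n, T, rng):
    """Return ((1/n) sum_i C_i Q_{i,T}, (1/n) sum_i C_i) for a simulated scalar panel."""
    # Hypothetical bounded, nonrandom scale factors C_i (sup_i |C_i| < infinity).
    C = 1.0 + 0.5 * np.sin(np.arange(1, n + 1))
    # Random walks S_{i,t} with iid N(0, 1) increments, one row per unit i.
    S = np.cumsum(rng.standard_normal((n, T)), axis=1)
    # Q_{i,T} = T^{-2} sum_t S_{i,t}^2 converges weakly to int_0^1 W^2, whose mean is 1/2.
    Q = (S ** 2).sum(axis=1) / T ** 2
    return float(np.mean(C * Q)), float(np.mean(C))

rng = np.random.default_rng(0)
avg, c_bar = panel_average(n=2000, T=200, rng=rng)
print(avg, 0.5 * c_bar)  # the two numbers should be close when n and T are both large
```

The second printed value plays the role of $CE(Q_i)$ with $C$ replaced by its finite-$n$ average; no particular relation between `n` and `T` is imposed.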
3.3. Joint Central Limit Theory
This section considers joint convergence in distribution of the $\sqrt{n}$-standardized double sequence $X_{n,T}$,
$$X_{n,T} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Y_{i,T}, \tag{3.19}$$
where the $Y_{i,T}$ are independent $(m \times 1)$ random vectors across $i$ with $EY_{i,T} = 0$ and $EY_{i,T}Y_{i,T}' = \Omega_{i,T}$.
One approach to the limit distribution of $X_{n,T}$ is to attempt to use a multivariate CLT directly. This approach is particularly appropriate in the case of diagonal path limits where we have $X_{n,T(n)} = (1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T(n)}$ and a suitable multivariate CLT for triangular arrays can be applied. This idea was employed by Quah (1994) and Levin and Lin (1993) in their work on panel unit root tests. But, in general, when $n$ and $T$ go to infinity and no specific expansion relation between $n$ and $T$ is assumed, we cannot use traditional CLT's in this way. In what follows, therefore, we develop a joint CLT for $(n, T \to \infty)$, using a Lindeberg condition for a double indexed process.
First, take the case where the $Y_{i,T}$ in (3.19) are scalar random variables. Let $s_{n,T}^2 = \sum_{i=1}^{n} \Omega_{i,T}$ and define $\xi_{i,n,T} = Y_{i,T}/s_{n,T}$. Then we have the following result.
THEOREM 2 (Joint Limit CLT): Suppose that for all $\varepsilon > 0$,
$$\lim_{n,T\to\infty} \sum_{i=1}^{n} E\,\xi_{i,n,T}^2\, 1\{|\xi_{i,n,T}| > \varepsilon\} = 0. \tag{3.20}$$
Then, as $(n, T \to \infty)$,
$$\sum_{i=1}^{n} \xi_{i,n,T} \Rightarrow N(0, 1).$$
There are some interesting special cases of this joint CLT. The following result, which is related to a theorem of Eicker (1963), arises when the $(m \times 1)$ random vectors $Y_{i,T}$ are scaled versions of iid random vectors $Q_{i,T}$.
THEOREM 3 (Joint Limit CLT for Scaled Variates): Suppose that $Y_{i,T} = C_i Q_{i,T}$, where the $(m \times 1)$ random vectors $Q_{i,T}$ are iid$(0, \Sigma_T)$ across $i$ for all $T$ and the $C_i$ are $(m \times m)$ nonzero and nonrandom matrices. Assume the following conditions hold:
(i) Let $\sigma_T^2 = \lambda_{\min}(\Sigma_T)$ and $\liminf_T \sigma_T^2 > 0$;
(ii) $\max_{i \le n} \|C_i\|^2 / \lambda_{\min}(\sum_{i=1}^{n} C_i C_i') = O(1/n)$ as $n \to \infty$;
(iii) $\|Q_{i,T}\|^2$ are uniformly integrable in $T$;
(iv) $\lim_{n,T} (1/n)\sum_{i=1}^{n} C_i \Sigma_T C_i' = \Omega > 0$.
Then,
$$X_{n,T} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Y_{i,T} \Rightarrow N(0, \Omega)$$
as $n, T \to \infty$.
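A small Monte Carlo sketch can make the setting of Theorem 3 concrete in the scalar case. Here $Q_{i,T} = T^{-1}\sum_t S_{i,t-1}u_{i,t}$ with iid standard normal shocks, which has mean zero and variance $\Sigma_T = (T-1)/(2T)$; the scale factors `C`, this choice of $Q_{i,T}$, and the function name are hypothetical and chosen only for illustration.

```python
import numpy as np

def scaled_sum_draws(n, T, reps, rng):
    """Draw X_{n,T} = n^{-1/2} sum_i C_i Q_{i,T} `reps` times; also return its exact variance."""
    C = 1.0 + 0.5 * np.cos(np.arange(1, n + 1))   # hypothetical nonrandom scales C_i
    u = rng.standard_normal((reps, n, T))         # iid N(0, 1) shocks
    S_lag = np.cumsum(u, axis=2) - u              # S_{i,t-1}, with S_{i,0} = 0
    Q = (S_lag * u).sum(axis=2) / T               # mean zero, variance (T - 1) / (2T)
    X = (Q * C).sum(axis=1) / np.sqrt(n)          # one draw of X_{n,T} per replication
    omega = np.mean(C ** 2) * (T - 1) / (2 * T)   # (1/n) sum_i C_i Sigma_T C_i
    return X, omega

X, omega = scaled_sum_draws(n=100, T=50, reps=1000, rng=np.random.default_rng(0))
print(X.mean(), X.var(), omega)  # sample mean near 0, sample variance near omega
```

The sketch only compares the first two sample moments with $(0, \Omega)$; it does not test normality.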
Sequential weak convergence of $(1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T}$ can be derived in the same way as the sequential probability limit of $(1/n)\sum_{i=1}^{n} Y_{i,T}$ considered earlier. Suppose that, for each $i$ as $T \to \infty$, the random variables $Y_{i,T}$ converge in distribution to $Y_i$, where the $Y_i$ are independent with mean zero and variance $\Omega_i$. Then, $(1/\sqrt{n})\sum_{i=1}^{n} Y_i \Rightarrow N(0, \Omega)$ if for all $\varepsilon > 0$ as $n \to \infty$
$$\sum_{i=1}^{n} E\left[\frac{Y_i^2}{s_n^2}\, 1\left\{\frac{Y_i^2}{s_n^2} > \varepsilon\right\}\right] \to 0, \tag{3.21}$$
and
$$\frac{1}{n} s_n^2 = \frac{1}{n}\sum_{i=1}^{n} \Omega_i \to \Omega. \tag{3.22}$$
In many econometric applications $Y_i$ is a Gaussian random variable or a function of a Gaussian process, so the $Y_i$ usually possess higher moments. Second moment requirements then follow automatically, and the Lindeberg condition (3.21) for $(1/\sqrt{n})\sum_{i=1}^{n} Y_i$ may be verified directly using a Liapounov condition.
Additional Remarks
(a) Sequential weak convergence of $(1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T}$ to $N(0, \Omega)$ under conditions (3.21) and (3.22) does not imply that $(1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T}$ converges in distribution jointly to $N(0, \Omega)$ as $(n, T \to \infty)$. According to Lemma 6(ii), condition (3.9) is a necessary and sufficient condition for $(1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T}$ to converge in distribution jointly to the sequential limit distribution $N(0, \Omega)$. In this case, therefore, condition (3.20) and the condition that $(1/n)\sum_{i=1}^{n} \Omega_{i,T} \to \Omega$ as $(n, T \to \infty)$ provide sufficient conditions for condition (3.9).
(b) When $Y_{i,T}$ in (3.19) does not have mean zero, but instead has mean zero asymptotically as $T \to \infty$ for each $i$, joint CLT's such as Theorem 2 or Corollary 3 cannot be applied. In this case, $T$ needs to increase fast enough to make the $\sqrt{n}$-standardized sum of the biases small. That is, $(1/\sqrt{n})\sum_{i=1}^{n} EY_{i,T}$ should go to zero as $(n, T \to \infty)$. In this case, asymptotic normality of $(1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T}$ will continue to hold provided the expansion rate between $n$ and $T$ allows the bias to go to zero. The next section gives an example where this problem arises (e.g., see Theorem 4).
4. SPURIOUS PANEL REGRESSION
This section considers the case where the two component random vectors $Y_{i,t}$ and $X_{i,t}$ of $Z_{i,t}$ in (2.1) have no cointegrating relation for any $i$. This case is covered by the following assumption.
ASSUMPTION 3 (Spurious Regression): The random matrices $\Omega_i$ are positive definite almost surely.
Suppose that we perform a time series regression of $Y_{i,t}$ on $X_{i,t}$:
$$Y_{i,t} = \hat{\beta}_i X_{i,t} + \hat{U}_{i,t}, \tag{4.1}$$
where $\hat{\beta}_i = \sum_{t=1}^{T} Y_{i,t}X_{i,t}' \left(\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1}$. As is well known (e.g., Phillips (1986)), under Assumption 3 the regression coefficient estimator $\hat{\beta}_i$ has the following nondegenerate limit distribution, as $T \to \infty$:
$$\hat{\beta}_i \Rightarrow \int M_{y_i}M_{x_i}' \left(\int M_{x_i}M_{x_i}'\right)^{-1} \quad \text{for all } i. \tag{4.2}$$
The weak convergence result (4.2) implies that regression (4.1) is spurious in the sense that the regression of $Y_{i,t}$ on $X_{i,t}$ does not identify any fixed long-run relation between $Y_{i,t}$ and $X_{i,t}$. By contrast, the main result in what follows is that, in a panel data set, such regressions are no longer spurious and do, in fact, distinguish a long-run average relation between $Y_{i,t}$ and $X_{i,t}$.
Consider the following linear least-squares regression of $Y_{i,t}$ on $X_{i,t}$ with pooled panel data:
$$Y_{i,t} = \hat{\beta}_{n,T} X_{i,t} + \hat{U}_{i,t}, \tag{4.3}$$
where
$$\hat{\beta}_{n,T} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T} Y_{i,t}X_{i,t}'\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1}. \tag{4.4}$$
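The pooled estimator in (4.4) can be computed directly from stacked panel arrays. The sketch below assumes the data are held as NumPy arrays of shape `(n, T, m_y)` and `(n, T, m_x)`; that layout and the function name `pooled_ols` are choices made for this illustration only.

```python
import numpy as np

def pooled_ols(Y, X):
    """Pooled estimator (4.4): (sum_i sum_t Y_it X_it') (sum_i sum_t X_it X_it')^{-1}.

    Y has shape (n, T, m_y), X has shape (n, T, m_x); the result has shape (m_y, m_x).
    """
    Syx = np.einsum("ita,itb->ab", Y, X)   # sum over i and t of Y_it X_it'
    Sxx = np.einsum("ita,itb->ab", X, X)   # sum over i and t of X_it X_it'
    # Solve beta @ Sxx = Syx instead of forming the inverse explicitly.
    return np.linalg.solve(Sxx.T, Syx.T).T

# Usage: with an exact linear relation the coefficient matrix is recovered.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6, 2))
B = np.array([[1.0, -2.0]])
Y = X @ B.T
print(pooled_ols(Y, X))
```

Solving the linear system rather than inverting the moment matrix is a numerical convenience and does not change the estimator.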
We now proceed to develop an asymptotic theory for $\hat{\beta}_{n,T}$. The approach we adopt is to derive the limit under sequential convergence of the indices $(T, n)$, and then show that the limit continues to hold under joint convergence $(T, n \to \infty)$ provided certain conditions hold. In many cases, this is the simplest way to proceed.
Indeed, for estimators like $\hat{\beta}_{n,T}$, asymptotic results are readily obtained using sequential asymptotics such as $(T, n \to \infty)_{seq}$. According to (2.6) in first stage asymptotics, the pooled estimator $\hat{\beta}_{n,T}$ has the following limit distribution:
$$\hat{\beta}_{n,T} \Rightarrow \left(\frac{1}{n}\sum_{i=1}^{n} \int M_{y_i}M_{x_i}'\right)\left(\frac{1}{n}\sum_{i=1}^{n} \int M_{x_i}M_{x_i}'\right)^{-1}$$
as $T \to \infty$ for any fixed $n$. From Lemma 4 we know that $\int M_{y_i}M_{x_i}'$ and $\int M_{x_i}M_{x_i}'$ have finite second moments. Also, as in (2.8) above, by direct calculation we get $E(\int M_i M_i') = \frac{1}{2}E(\Omega_i) = \frac{1}{2}\Omega$. And then, applying the strong law of large numbers as in (2.7), we have
$$\frac{1}{n}\sum_{i=1}^{n} \int M_{y_i}M_{x_i}' \overset{a.s.}{\to} \frac{1}{2}\Omega_{yx} \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^{n} \int M_{x_i}M_{x_i}' \overset{a.s.}{\to} \frac{1}{2}\Omega_{xx},$$
as $n \to \infty$. By Assumption 3, $\Omega_{x_i x_i}$ is positive definite a.s., and $c'\Omega_{x_i x_i}c > 0$ a.s. for any $c \ne 0$ in $\mathbb{R}^{m_x}$. Thus, $Ec'\Omega_{x_i x_i}c = c'\Omega_{xx}c > 0$, which implies that $\Omega_{xx}$ is positive definite. Hence $\Omega_{xx}^{-1}$ exists, and so we have as $(T, n \to \infty)_{seq}$:
$$\hat{\beta}_{n,T} \to_p \Omega_{yx}\Omega_{xx}^{-1}.$$
Let $\beta = \Omega_{yx}\Omega_{xx}^{-1}$. We will call the parameter $\beta$ the long-run average regression coefficient. It is the matrix regression coefficient (of $y$ on $x$) associated with the long-run average covariance matrix $\Omega$. To find the limit distribution of $\hat{\beta}_{n,T}$ we rescale the centered estimator $(\hat{\beta}_{n,T} - \beta)$ by $\sqrt{n}$ and let $T \to \infty$ for fixed $n$.
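For all fixed $n$ as $T \to \infty$ we have
$$\sqrt{n}(\hat{\beta}_{n,T} - \beta) \Rightarrow \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'\right)\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int M_{x_i}M_{x_i}'\right)^{-1}. \tag{4.5}$$
Note that
$$E\left(\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'\right) = E\left[E\left(\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}' \,\Big|\, \mathcal{F}_{c_i}\right)\right] = \tfrac{1}{2}E\left(\Omega_{y_i x_i} - \Omega_{yx}\Omega_{xx}^{-1}\Omega_{x_i x_i}\right) = 0,$$
where the conditional expectation exists because $\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'$ is square integrable by Lemma 4. Thus, the numerator of (4.5) has mean zero. Also, we know that the numerator has finite second moments from Lemma 4, and with a straightforward calculation the variance matrix is found$^6$ to be
$$\begin{aligned}
&E\left(\mathrm{vec}\left(\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'\right)\mathrm{vec}\left(\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'\right)'\right) \\
&\quad = \tfrac{1}{6}E\left(\Omega_{x_i x_i} \otimes \left(\Omega_{y_i y_i} - \beta\Omega_{x_i y_i} - \Omega_{y_i x_i}\beta' + \beta\Omega_{x_i x_i}\beta'\right)\right) \\
&\qquad + \tfrac{1}{6}E\left(\left(\Omega_{x_i y_i} - \Omega_{x_i x_i}\beta'\right) \otimes \left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)\right)K_{m_y m_x} \\
&\qquad + \tfrac{1}{4}E\left(\mathrm{vec}\left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)\mathrm{vec}\left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)'\right) \\
&\quad = \Theta, \text{ say,}
\end{aligned} \tag{4.6}$$
where $K_{m_y m_x}$ is the $(m_y m_x \times m_y m_x)$ commutation matrix (e.g., see Magnus and Neudecker (1988)). The sequence of random matrices $\{\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'\}_i$ in the numerator of the matrix quotient (4.5) is iid$(0, \Theta)$ across $i$. From the multivariate Lindeberg-Levy theorem, we then get as $n \to \infty$
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\int M_{y_i}M_{x_i}' - \beta\int M_{x_i}M_{x_i}'\right) \Rightarrow N(0, \Theta). \tag{4.7}$$
$^6$ The calculations are given in Appendix C of PM(b).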
Combining (4.7) with the limit
$$\frac{1}{n}\sum_{i=1}^{n}\int M_{x_i}M_{x_i}' \overset{a.s.}{\to} \frac{1}{2}\Omega_{xx} \quad \text{as } n \to \infty,$$
we have the following limit distribution of the pooled estimator $\hat{\beta}_{n,T}$ as $(T, n \to \infty)_{seq}$:
$$\sqrt{n}(\hat{\beta}_{n,T} - \beta) \Rightarrow N\left(0,\, 4\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\right). \tag{4.8}$$
Theorem 4 below shows that these results continue to hold in joint asymptotics as $(T, n \to \infty)$. For the limit distribution (4.8) to hold in this case we need the additional requirement that $n/T \to 0$. No additional condition is required for consistency.
THEOREM 4: Suppose Assumptions 1, 2, and 3 hold.
(a) Then, as $(n, T \to \infty)$, we have $\hat{\beta}_{n,T} \to_p \beta$.
(b) If $(n, T \to \infty)$ and $n/T \to 0$, then
$$\sqrt{n}(\hat{\beta}_{n,T} - \beta) \Rightarrow N\left(0,\, 4\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\right).$$
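The contrast between the individual time series estimates and the pooled estimate can be seen in a small simulation. The data generating process below is a deliberately simple assumption, not the paper's general DGP: $\Delta X_{i,t} = v_{1,i,t}$ and $\Delta Y_{i,t} = c_i v_{1,i,t} + v_{2,i,t}$ with $c_i$ iid Uniform(0, 2), so that $\Omega_i$ is positive definite for every $i$ (no cointegration) and $\beta = E(c_i)/1 = 1$.

```python
import numpy as np

def simulate_spurious_panel(n, T, rng):
    """Return (individual time series estimates, pooled estimate) in a scalar spurious panel."""
    # Hypothetical DGP: Omega_i = [[c_i^2 + 1, c_i], [c_i, 1]], so beta = E(c_i) = 1.
    c = rng.uniform(0.0, 2.0, size=(n, 1))
    v1 = rng.standard_normal((n, T))
    v2 = rng.standard_normal((n, T))
    x = np.cumsum(v1, axis=1)                            # X_{i,t}, a random walk
    y = np.cumsum(c * v1 + v2, axis=1)                   # Y_{i,t}, not cointegrated with X_{i,t}
    beta_i = (y * x).sum(axis=1) / (x * x).sum(axis=1)   # time series estimates as in (4.1)
    beta_pooled = (y * x).sum() / (x * x).sum()          # pooled estimate as in (4.4)
    return beta_i, beta_pooled

beta_i, beta_pooled = simulate_spurious_panel(n=500, T=200, rng=np.random.default_rng(0))
print(beta_i.std(), beta_pooled)  # dispersed individual estimates, pooled estimate near 1
```

Because the shocks in this sketch are iid over $t$ and the initial values are zero, no restriction on the relative size of `n` and `T` is being exercised here.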
REMARKS: (a) The restriction $(n/T) \to 0$ in Theorem 4 controls the effects of bias in the panel regression. Under the assumptions on the DGP given in Section 2, the expectation of the components in the numerator of $\sqrt{n}(\hat{\beta}_{n,T} - \beta)$ is generally nonzero, i.e.,
$$E\left(\frac{1}{T^2}\sum_{t=1}^{T}\left(Y_{i,t}X_{i,t}' - \beta X_{i,t}X_{i,t}'\right)\right) \ne 0,$$
whereas
$$E\left(\frac{1}{T^2}\sum_{t=1}^{T}\left(Y_{i,t}X_{i,t}' - \beta X_{i,t}X_{i,t}'\right)\right) \to 0,$$
as $T \to \infty$ for all $i$. In this case, the condition $(n/T) \to 0$ prevents the bias from having a dominating asymptotic effect on the standardized quantity $\sqrt{n}(\hat{\beta}_{n,T} - \beta)$. But, when $(n/T) \not\to 0$, the bias can dominate and the asymptotic behavior can be very different. For example, suppose that
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n} E\left(\frac{1}{T^2}\sum_{t=1}^{T}\left(Y_{i,t}X_{i,t}' - \beta X_{i,t}X_{i,t}'\right)\right) \to b$$
along some diagonal limit $(T(n), n \to \infty)_{diag}$. In this event, we can expect to have a limit distribution with an asymptotic bias $b$. Further, the required restriction on the expansion rate between $n$ and $T$ will change depending on the underlying assumptions about the DGP. For example, if the shocks $U_{i,t}$ $(= \Delta Z_{i,t})$ are iid over $t$ and $Z_{i,0} = 0$ for all $i$, then our results hold as $(n, T \to \infty)$ without imposing any restriction on the expansion rate between $n$ and $T$.
(b) Theorem 4 holds for any partition of $Z_{i,t}$. If the panel data form a vector unit root model such as (2.1), then we can estimate the average long-run relation between any two subvectors. In effect, therefore, there is an average long-run relationship between any two subvector components of an integrated process over a cross section population.
(c) A key factor in determining these results is that panel data provide iid cross section information that is unavailable in a simple time series context. In consequence, we may expect some version of these results to apply when the regression utilizes only a fraction of the time series data. Suppose we regress $Y_{i,t}$ on $X_{i,t}$ using the cross section observations at time period $t = [Tr]$ with $0 < r \le 1$. The cross section OLS estimator $\tilde{\beta}_{n,[Tr]}$ is then defined by
$$\tilde{\beta}_{n,[Tr]} = \left(\sum_{i=1}^{n} Y_{i,t}X_{i,t}'\right)\left(\sum_{i=1}^{n} X_{i,t}X_{i,t}'\right)^{-1}. \tag{4.9}$$
Using similar arguments to those employed above, we can show that, in sequential asymptotics as $(T, n \to \infty)_{seq}$, we have
$$\tilde{\beta}_{n,[Tr]} \to_p \beta,$$
and
$$\sqrt{n}\left(\tilde{\beta}_{n,[Tr]} - \beta\right) \Rightarrow N\left(0,\, \left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\tilde{\Theta}\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\right),$$
where
$$\begin{aligned}
\tilde{\Theta} ={}& E\left(\Omega_{x_i x_i} \otimes \left(\Omega_{y_i y_i} - \beta\Omega_{x_i y_i} - \Omega_{y_i x_i}\beta' + \beta\Omega_{x_i x_i}\beta'\right)\right) \\
&+ E\left(\left(\Omega_{x_i y_i} - \Omega_{x_i x_i}\beta'\right) \otimes \left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)\right)K_{m_y m_x} \\
&+ E\left(\mathrm{vec}\left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)\mathrm{vec}\left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)'\right).
\end{aligned}$$
(d) Since $\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\tilde{\Theta}\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right) - 4\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right) > 0$, the cross section estimator $\tilde{\beta}_{n,[Tr]}$ is asymptotically less efficient than the pooled estimator $\hat{\beta}_{n,T}$. This is to be expected because the pooled estimator $\hat{\beta}_{n,T}$ uses all the time series information while the cross section estimator $\tilde{\beta}_{n,[Tr]}$ uses only single time period information. It is therefore interesting to note that, although time series regression may be spurious, use of all the time series data does reduce the limiting variance in a panel regression. Heuristically, this is because when we pool the data we average the limiting information and quantities like $\int_0^1 M_{y_i}$ and $\int_0^1 M_{y_i}M_{x_i}'$ have less variation than $M_{y_i}(1)$ and $M_{y_i}(1)M_{x_i}(1)'$. For example, $W(1) \equiv N(0, 1)$, whereas $\int_0^1 W \equiv N(0, \frac{1}{3})$.
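The variance comparison between $W(1)$ and $\int_0^1 W$ has an exact discrete analogue that can be checked without simulation. For a random walk $S_t$ with iid unit-variance steps, $\mathrm{Cov}(S_s, S_t) = \min(s, t)$, so the variances of $S_T/\sqrt{T}$ and $T^{-3/2}\sum_t S_t$ follow from the covariance matrix. The function name below is hypothetical.

```python
import numpy as np

def limit_variances(T):
    """Exact variances of S_T / sqrt(T) and T^{-3/2} sum_t S_t for a unit-variance random walk."""
    t = np.arange(1, T + 1)
    cov = np.minimum.outer(t, t) / T      # Cov(S_s, S_t) / T = min(s, t) / T
    var_endpoint = cov[-1, -1]            # discrete analogue of Var W(1)
    var_average = cov.sum() / T ** 2      # discrete analogue of Var int_0^1 W
    return float(var_endpoint), float(var_average)

v_end, v_avg = limit_variances(500)
print(v_end, v_avg)  # v_end equals 1; v_avg is close to 1/3
```

The averaged quantity has about one third of the variance of the endpoint, which is the heuristic behind the efficiency gain from pooling over time.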
5. PANEL COINTEGRATION
This section considers the case where the variables in $Z_{i,t}$ are cointegrated. As discussed in Phillips (1986), there exists a cointegrating relation among the variables in $Z_{i,t}$ if the conditional long-run variance matrix $\Omega_i$ of $Z_{i,t}$ has deficient rank. We will discuss three particular types of model: (i) heterogeneous panel cointegration, where there exist different cointegrating relations among the variables in $Z_{i,t}$ across individuals; (ii) homogeneous panel cointegration, where the cointegration relation is the same for all the individuals; and (iii) near-homogeneous panel cointegration, where there exist slightly different cointegrating relations across the individuals.
5.1. Heterogeneous Panel Cointegration
We start by strengthening the moment conditions of the random coefficients $C_{i,t}$ in (2.3) and the summability conditions as follows. These conditions help to ensure the existence of a valid BN decomposition for the equation errors in the panel cointegration model (5.2) given below.
ASSUMPTION 4 (Random Coefficient Conditions$'$):
(i) Assumption 1(i) holds.
(ii) $EC_{a,i,t}^{16}$ $(= \sigma_{16,a,t}) < \infty$ for all $a = 1, \ldots, m^2$.
ASSUMPTION 5 (Summability Condition$'$): For all $a = 1, \ldots, m^2$:
(i) $\sum_{t=0}^{\infty} t^2 \sigma_{2,a,t} < \infty$;
(ii) $\sum_{t=0}^{\infty} t^4 (\sigma_{4,a,t})^{1/4} < \infty$;
(iii) $\sum_{t=0}^{\infty} t^2 (\sigma_{8,a,t})^{1/8} < \infty$;
(iv) $\sum_{t=0}^{\infty} (\sigma_{16,a,t})^{1/16} < \infty$.
The previous section assumes that the conditional long-run covariance matrix $\Omega_i$ of the integrated vector $Z_{i,t}$ in (2.1) is positive definite. When $\Omega_i$ is singular, important differences arise in the time series case, as is well known, and a different large $T$ time theory applies for each $i$.
ASSUMPTION 6: The following conditions hold almost surely.
(i) $\Omega_i$ has rank $m_x$.
(ii) Each $(m_x \times m_x)$ leading submatrix $\Omega_{x_i x_i}$ is positive definite.
In this case the generating mechanism (2.1) has a deficient set of unit roots and the vector $Z_{i,t}$ is cointegrated almost surely. To see this, take an arbitrary element of the probability space for which (i) and (ii) of Assumption 6 hold. Then we have $\Omega_{y_i y_i} = \Omega_{y_i x_i}\Omega_{x_i x_i}^{-1}\Omega_{x_i y_i}$. Let $\alpha_i = (I_{m_y}, -\beta_i)$ and $\beta_i = \Omega_{y_i x_i}\Omega_{x_i x_i}^{-1}$. The $(m_y \times m)$ random matrix $\alpha_i$ is well defined because $\Omega_{x_i x_i}$ is positive definite. Since $\Omega_i = C_i(1)C_i(1)'$, the equality $\Omega_{y_i y_i} - \Omega_{y_i x_i}\Omega_{x_i x_i}^{-1}\Omega_{x_i y_i} = 0$ can be written as
$$\alpha_i C_i(1) C_{y_i}(1)' = 0, \tag{5.1}$$
so that $\alpha_i$ is in the row null space of the matrix $C_i(1)$.
Define $E_{i,t} = \alpha_i Z_{i,t} = Y_{i,t} - \beta_i X_{i,t}$. Note that $\Delta E_{i,t} = \alpha_i \Delta Z_{i,t} = \alpha_i U_{i,t} = \alpha_i C_i(1)V_{i,t} - \Delta\alpha_i\tilde{U}_{i,t}$, where the last equality comes from the BN decomposition of $U_{i,t}$. Then, since $\alpha_i C_i(1) = 0$, $\Delta E_{i,t} = -\Delta\alpha_i\tilde{U}_{i,t}$, that is $E_{i,t} = -\alpha_i\tilde{U}_{i,t} = -\alpha_i\sum_{s=0}^{\infty}\tilde{C}_{i,s}V_{i,t-s}$. Lemma 15 in Appendix D shows that $E_{i,t}$ is square integrable and that the random coefficients $\{-\alpha_i\tilde{C}_{i,s}\}_s$ are summable. Hence, Assumption 6 implies the existence of the following panel cointegration model with probability one:
$$Y_{i,t} \overset{a.s.}{=} \beta_i X_{i,t} + E_{i,t}, \qquad X_{i,t} = X_{i,t-1} + U_{x_i,t}, \tag{5.2}$$
where
$$F_{i,t} = \begin{pmatrix} E_{i,t} \\ U_{x_i,t} \end{pmatrix} = \sum_{s=0}^{\infty} G_{i,s}V_{i,t-s}, \qquad G_{i,s} = \begin{pmatrix} -\alpha_i\tilde{C}_{i,s} \\ \Gamma C_{i,s} \end{pmatrix}, \tag{5.3}$$
and
$$\Gamma = (0 \;\vdots\; I_{m_x}) \quad (m_x \times m).$$
The coefficient $\beta_i$ in model (5.2) is random. This means that $\beta_i$ differs randomly across $i$ and so the cointegrating relation between $Y_{i,t}$ and $X_{i,t}$ is heterogeneous. Also, the random coefficients $G_{i,s}$ in the linear process generating $(E_{i,t}', U_{x_i,t}')'$ in model (5.2) each involve the cointegrating matrix $\alpha_i$ whose main component is $\beta_i = \Omega_{y_i x_i}\Omega_{x_i x_i}^{-1}$, which depends on the inverse of $\Omega_{x_i x_i}$. From Assumption 6, $\Omega_{x_i x_i}^{-1}$ exists almost surely. But, additionally, we need some moment conditions on $\Omega_{y_i x_i}\Omega_{x_i x_i}^{-1}$ to ensure the existence of moments of the random coefficients $G_{i,s}$, which help in establishing the validity of a panel BN decomposition. Assumptions 1 and 2 alone do not assure the existence of moments of $\Omega_{x_i x_i}^{-1}$. Hence, to avoid heavy tails in the density of $\Omega_{x_i x_i}^{-1}$, we make the following assumption about the distribution of $\Omega_{x_i x_i}$.
ASSUMPTION 7: The random matrix $\Omega_{x_i x_i}$ has continuous density function $f$ with the following properties.
(i) $f(\Omega) = O(\mathrm{etr}(-c\Omega))$ for some $c > 0$ when $\mathrm{tr}(\Omega) \to \infty$, where $\mathrm{etr}(-c\Omega)$ denotes $\exp\{\mathrm{tr}(-c\Omega)\}$.
(ii) $f(\Omega) = O((\det\Omega)^{\gamma})$ for some $\gamma > 7$ when $\det(\Omega) \to 0$.
REMARKS: (a) Condition (i) implies that the tail of the density $f$ is exponentially small as $\mathrm{tr}(\Omega) \to \infty$. Condition (ii) restricts the behavior of the density $f$ when $\det\Omega \to 0$. Taken together (i) and (ii) ensure that $(\det\Omega)^{s} f(\Omega)$ is integrable for $s \ge -8$.
(b) An example of a density $f$ satisfying conditions (i) and (ii) is the Wishart distribution $W_{m_x}(J, I_{m_x})$ whose probability element is
$$f(\Omega)(d\Omega) = \frac{1}{2^{m_x J/2}\,\Gamma_{m_x}(J/2)}\,\mathrm{etr}\left(-\frac{1}{2}\Omega\right)(\det\Omega)^{(J - m_x - 1)/2}(d\Omega),$$
with degrees of freedom parameter $J > m_x + 15$ and where
$$\Gamma_{m_x}\left(\frac{J}{2}\right) = \int_{\Omega > 0} \mathrm{etr}(-\Omega)(\det\Omega)^{(J - m_x - 1)/2}(d\Omega).$$
In this case, $\Omega^{-1}$ has an inverse Wishart distribution with $(J + m_x + 1)$ degrees of freedom and $(m_x \times m_x)$ parameter matrix $I_{m_x}$, $W_{m_x}^{-1}(J + m_x + 1, I_{m_x})$ (e.g., see Muirhead (1982, p. 113)).
In view of Lemma 15(a) in Appendix D, $\sum_{s=1}^{\infty} s^2\|G_{i,s}\|^2 < \infty$ a.s. Then, as in Lemma 2, it follows from Phillips and Solo (1992) that $F_{i,t}$ has a valid panel BN decomposition of the form
$$F_{i,t} \overset{a.s.}{=} G_i(1)V_{i,t} + \tilde{F}_{i,t-1} - \tilde{F}_{i,t}, \tag{5.4}$$
where $G_i(1)V_{i,t}$ and $\tilde{F}_{i,t}$ are well defined square integrable random vectors in view of Lemma 15. Using (5.4), the partial sum process of $F_{i,t}$ can be written as
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} F_{i,t} \overset{a.s.}{=} G_i(1)\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} V_{i,t} + \frac{1}{\sqrt{T}}\tilde{F}_{i,0} - \frac{1}{\sqrt{T}}\tilde{F}_{i,[Tr]}. \tag{5.5}$$
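The algebra behind a BN decomposition of the form (5.4) can be checked numerically for a finite-order moving average, where all sums are finite. The sketch below uses fixed (nonrandom) coefficient matrices, which is a simplification of the paper's random-coefficient setting, and takes $\tilde{G}_s = \sum_{j>s} G_j$ and $\tilde{F}_t = \sum_s \tilde{G}_s V_{t-s}$ by analogy with $\tilde{C}_s$; the function name `bn_check` is hypothetical.

```python
import numpy as np

def bn_check(G, V):
    """Max abs error in F_t = G(1) V_t + F~_{t-1} - F~_t for F_t = sum_{s<=q} G_s V_{t-s}.

    G has shape (q + 1, m, m) holding G_0, ..., G_q; V has shape (N, m) holding innovations.
    """
    q = G.shape[0] - 1
    # G~_s = sum_{j > s} G_j for s = 0, ..., q (so G~_q = 0).
    G_tilde = np.array([G[s + 1:].sum(axis=0) for s in range(q + 1)])
    G1 = G.sum(axis=0)  # G(1)

    def filt(coefs, t):
        # sum_{s=0}^{q} coefs[s] @ V[t - s]
        return sum(coefs[s] @ V[t - s] for s in range(q + 1))

    err = 0.0
    for t in range(q + 1, V.shape[0]):
        lhs = filt(G, t)
        rhs = G1 @ V[t] + filt(G_tilde, t - 1) - filt(G_tilde, t)
        err = max(err, float(np.abs(lhs - rhs).max()))
    return err

rng = np.random.default_rng(0)
print(bn_check(rng.standard_normal((4, 2, 2)), rng.standard_normal((50, 2))))
```

The returned error is at the level of floating-point rounding, since the identity is exact term by term.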
With this BN decomposition in place, we can use the Phillips-Solo approach to deduce a functional law for partial sums of $F_{i,t}$. In particular, we have the following lemma:
LEMMA 7: If Assumptions 4-7 hold, then
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} F_{i,t} \Rightarrow G_i(1)W_i(r) \quad \text{as } T \to \infty \text{ for all } i, \tag{5.6}$$
where $W_i(r)$ is a standard vector Brownian motion independent of $\mathcal{F}_{c_i}$.
Thus, $(1/\sqrt{T})\sum_{t=1}^{[Tr]} F_{i,t}$ converges in distribution to a randomly scaled (or mixed) Brownian motion $G_i(1)W_i(r)$ as $T \to \infty$ for all $i$. Let $S_{i,t} = \sum_{s=1}^{t} F_{i,s} + S_{i,0}$, where $S_{i,0}$ are iid across $i$ with $E\|S_{i,0}\|^4 < \infty$. The next lemma shows that $(1/T)\sum_{t=1}^{T} S_{i,t}F_{i,t}'$ converges in distribution to a matrix stochastic integral plus an $\mathcal{F}_{c_i}$-measurable random matrix.
LEMMA 8: Suppose the assumptions in Lemma 7 hold. Then,
$$\frac{1}{T}\sum_{t=1}^{T} F_{i,t}S_{i,t}' \Rightarrow G_i(1)\int dW_i W_i'\, G_i(1)' + \Lambda_i \quad \text{as } T \to \infty, \tag{5.7}$$
where $\Lambda_i = \sum_{k=0}^{\infty} E(F_{i,k}F_{i,0}' \mid \mathcal{F}_{c_i}) = \sum_{k=0}^{\infty}\sum_{s=0}^{\infty} G_{i,s+k}G_{i,s}'$.
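Partition $G_i(1)$, $\Lambda_i$, and $G_i(1)W_i(r)$ conformably as follows:
$$G_i(1) = \begin{pmatrix} G_{e_i}(1) \\ G_{x_i}(1) \end{pmatrix}\begin{matrix} m_y \\ m_x \end{matrix}, \qquad \Lambda_i = \begin{pmatrix} \Lambda_{e_i e_i} & \Lambda_{e_i x_i} \\ \Lambda_{x_i e_i} & \Lambda_{x_i x_i} \end{pmatrix},$$
$$G_i(1)W_i(r) = \begin{pmatrix} G_{e_i}(1)W_i(r) \\ G_{x_i}(1)W_i(r) \end{pmatrix} = \begin{pmatrix} M_{e_i}(r) \\ M_{x_i}(r) \end{pmatrix}.$$
Consider the time series regression of $Y_{i,t}$ on $X_{i,t}$. Using (5.6), (5.7), and the continuous mapping theorem, we find the following large $T$ limit distribution for the OLS estimator of the (random) coefficient $\beta_i$:
$$\begin{aligned}
T(\hat{\beta}_i - \beta_i) &\overset{a.s.}{=} \left(\frac{1}{T}\sum_{t=1}^{T} E_{i,t}X_{i,t}'\right)\left(\frac{1}{T^2}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1} \\
&\Rightarrow \left(G_{e_i}(1)\int dW_i W_i'\, G_{x_i}(1)' + \Lambda_{e_i x_i}\right)\left(G_{x_i}(1)\int W_i W_i'\, G_{x_i}(1)'\right)^{-1} \\
&= \left(\int dM_{e_i}M_{x_i}' + \Lambda_{e_i x_i}\right)\left(\int M_{x_i}M_{x_i}'\right)^{-1} \quad \text{as } T \to \infty \text{ for all } i.
\end{aligned} \tag{5.8}$$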
The bias term $\Lambda_{e_i x_i}$ arises in the usual way from the temporal correlation between $E_{i,t}$ and $U_{x_i,t}$ (cf. Phillips and Durlauf (1986)). Thus, time series regression produces a consistent estimator of the cointegrating matrix $\beta_i$, and thereby distinguishes the randomly differing individual long-run relations between $Y_{i,t}$ and $X_{i,t}$.
When both dimensions of the panel data are utilized, a long-run average coefficient $\beta$ is also identified. This can be accomplished as in the previous section, by means of a pooled panel regression or a limiting cross section regression. The following sections concentrate on pooled panel regression and discuss limiting cross section estimators only briefly.
In the heterogeneous panel cointegration model (5.2) the pooled estimator $\hat{\beta}_{n,T}$ has the same form as that defined in (4.4). The limit theory for this pooled estimator is as follows.
THEOREM 5: Let the assumptions of Lemma 15 hold. Then:
(a) as $(n, T \to \infty)$, $\hat{\beta}_{n,T} \to_p \beta = \Omega_{yx}\Omega_{xx}^{-1}$;
(b) as $(n, T \to \infty)$ with $n/T \to 0$,
$$\sqrt{n}(\hat{\beta}_{n,T} - \beta) \Rightarrow N\left(0,\, 4\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\right),$$
where
$$\begin{aligned}
\Theta ={}& \tfrac{1}{6}E\left(\Omega_{x_i x_i} \otimes \left(\Omega_{y_i y_i} - \beta\Omega_{x_i y_i} - \Omega_{y_i x_i}\beta' + \beta\Omega_{x_i x_i}\beta'\right)\right) \\
&+ \tfrac{1}{6}E\left(\left(\Omega_{x_i y_i} - \Omega_{x_i x_i}\beta'\right) \otimes \left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)\right)K_{m_x m_y} \\
&+ \tfrac{1}{4}E\left(\mathrm{vec}\left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)\mathrm{vec}\left(\Omega_{y_i x_i} - \beta\Omega_{x_i x_i}\right)'\right).
\end{aligned}$$
REMARKS: (a) Define $\tilde{E}_{i,t} = (\beta_i - \beta)X_{i,t} + E_{i,t}$. Then the heterogeneous panel cointegration model (5.2) becomes
$$Y_{i,t} = \beta X_{i,t} + \tilde{E}_{i,t}. \tag{5.9}$$
The pooled estimator $\hat{\beta}_{n,T}$ is a least squares estimator of the regression coefficient in (5.9) and consistently estimates the long-run average coefficient $\beta$ between $Y_{i,t}$ and $X_{i,t}$. Note that the noise, $\tilde{E}_{i,t}$, in this regression involves the integrated random vector $X_{i,t}$. By the same logic as that of the spurious regression case, the long-run coefficient $\beta$ is consistently estimated by pooling the panel data because cross section pooling attenuates the strength of the noise $\tilde{E}_{i,t}$ relative to the signal in the regression (5.9).
(b) As seen in Theorem 5, the pooled estimator $\hat{\beta}_{n,T}$ is $\sqrt{n}$ consistent for the average long-run regression coefficient $\beta$ and has a normal limit distribution. Observe that the limit variance matrix for the heterogeneous pooled panel regression estimator in Theorem 5, viz., $4\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1} \otimes I_{m_y}\right)$, has precisely the same form as the limit variance matrix of the spurious regression pooled panel regression estimator in Theorem 4. This equivalence in form is especially interesting because the individual long-run covariance matrix $\Omega_i$ is singular in the heterogeneous cointegration case but nonsingular in the spurious regression case, so that these individual component matrices must be different between the two models. Nevertheless, and in spite of these differences, the average long-run covariance matrix $\Omega$ may well be nonsingular in the heterogeneous cointegration model, in which case there is a basis for direct comparison between the two results. Obviously, the effect of the heterogeneity in the cointegration parameter is to slow down the rate of convergence of the pooled estimator. In particular, the convergence rate is $\sqrt{n}$ and, interestingly, this rate is uninfluenced by the time series sample size in spite of the fact that the individual time series regressions are themselves $T$-consistent (see (5.8)). Thus, there is a correspondence in the limit theory between the heterogeneous cointegration model and the pooled spurious regression model after pooling the data.
(c) In general, of course, $E[\Omega_{y_i x_i}\Omega_{x_i x_i}^{-1}] \ne E[\Omega_{y_i x_i}](E[\Omega_{x_i x_i}])^{-1}$, so there is no reason why the limit of the average of the cointegrating relation $(1/n)\sum_{i=1}^{n} \beta_i$ should equal $\beta$, the average long-run regression coefficient. As we have seen, it is the latter parameter that is the limit of the pooled regression estimator in the heterogeneous cointegration model. One situation where $\lim_{n\to\infty}(1/n)\sum_{i=1}^{n} \beta_i = \beta$ does hold is when $\Omega_{x_i x_i}$ has a degenerate distribution, namely, $\Omega_{x_i x_i} = \Omega_{xx}$ almost surely. Thus, in the heterogeneous panel cointegration case, the parameter being estimated is not the average cointegrating coefficient, but the average long-run regression coefficient, just as in the spurious panel regression case. Again, the two models are much closer than they might appear.
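A two-point example makes the distinction between the average cointegrating coefficient and the long-run average regression coefficient concrete. The numbers below are purely hypothetical: scalar $\Omega_{y_i x_i}$ and $\Omega_{x_i x_i}$ take each of two values with probability one half.

```python
import numpy as np

# Hypothetical two-point distribution for (Omega_{y_i x_i}, Omega_{x_i x_i}), scalar case.
omega_yx = np.array([1.0, 1.0])
omega_xx = np.array([1.0, 4.0])

mean_of_beta_i = float(np.mean(omega_yx / omega_xx))      # E[Omega_yx_i / Omega_xx_i] = 0.625
beta = float(np.mean(omega_yx) / np.mean(omega_xx))       # E[Omega_yx_i] / E[Omega_xx_i] = 0.4
print(mean_of_beta_i, beta)

# If Omega_{x_i x_i} is degenerate, the two parameters coincide.
omega_xx_degenerate = np.array([2.0, 2.0])
print(np.mean(omega_yx / omega_xx_degenerate), np.mean(omega_yx) / np.mean(omega_xx_degenerate))
```

In the first case the pooled regression would target 0.4 rather than the average of the individual coefficients, 0.625.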
(d) As discussed in (a), the heterogeneous panel cointegration model can be reinterpreted in the form of the panel model (5.9). As such, we may be interested in constructing statistical tests about the long-run coefficients $\beta$. For example, to test $\mathbb{H}_0: \varphi(\beta) = 0$, where $\varphi(\cdot)$ is a $p$-vector of smooth functions on a subset of $\mathbb{R}^{m_y \times m_x}$ such that $\partial\varphi/\partial\beta'$ has full rank $p$ $(\le m_y m_x)$, we may use the Wald statistic
$$W_{\varphi} = n\,\varphi(\hat{\beta}_{n,T})'\,\hat{V}_{\varphi}^{-1}\,\varphi(\hat{\beta}_{n,T}),$$
where $\hat{V}_{\varphi} = (\partial\varphi(\hat{\beta}_{n,T})/\partial\beta')\,\hat{V}_{\beta}\,(\partial\varphi(\hat{\beta}_{n,T})'/\partial\beta)$, $\hat{V}_{\beta} = 4\left(\hat{\Omega}_{xx}^{-1} \otimes I_{m_y}\right)\hat{\Theta}\left(\hat{\Omega}_{xx}^{-1} \otimes I_{m_y}\right)$,
$$\hat{\Theta} = \frac{1}{n}\sum_{i=1}^{n}\left\{\frac{1}{T^4}\sum_{s,t=1}^{T} X_{i,t}X_{i,s}' \otimes \hat{E}_{i,t}\hat{E}_{i,s}'\right\}, \qquad \hat{\Omega}_{xx}^{-1} = \left\{\frac{1}{n}\sum_{i=1}^{n}\frac{2}{T^2}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right\}^{-1},$$
and $\hat{E}_{i,t} = Y_{i,t} - \hat{\beta}_{n,T}X_{i,t}$. Some simple manipulations in the case of sequential asymptotics show that this statistic leads to a standard asymptotic $\chi^2$ test as $(T, n \to \infty)_{seq}$. This limit theory also holds very generally under joint limits as $(T, n \to \infty)$ as the next result reveals.
THEOREM 6: Under $\mathbb{H}_0: \varphi(\beta) = 0$ and Assumptions 4-7, $W_{\varphi} \Rightarrow \chi_p^2$, as $(n, T \to \infty)$ with $n/T \to 0$.
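The pieces of the Wald statistic can be sketched in code. The sketch assumes arrays `X` of shape `(n, T, m_x)` and residuals `E` of shape `(n, T, m_y)`, a Jacobian taken with respect to $\mathrm{vec}(\beta)$, and a variance estimate `V_beta` supplied by the user; the function names are hypothetical. The estimator $\hat{\Theta}$ is computed through the identity $(X_t X_s') \otimes (E_t E_s') = (X_t \otimes E_t)(X_s \otimes E_s)'$, which avoids the double sum over $(s, t)$.

```python
import numpy as np

def theta_hat(X, E):
    """(1/n) sum_i T^{-4} sum_{s,t} (X_it X_is') kron (E_it E_is')."""
    n, T, _ = X.shape
    # g_i = T^{-2} sum_t (X_it kron E_it), stored as one row per unit i.
    g = np.einsum("ita,itb->iab", X, E).reshape(n, -1) / T ** 2
    return g.T @ g / n

def wald_statistic(n, phi_value, jacobian, V_beta):
    """W = n * phi' (J V_beta J')^{-1} phi for a p-vector restriction phi(beta) = 0."""
    V_phi = jacobian @ V_beta @ jacobian.T
    return float(n * phi_value @ np.linalg.solve(V_phi, phi_value))

# Usage with toy inputs: identity Jacobian and identity variance give n * ||phi||^2.
print(wald_statistic(10, np.array([1.0, 2.0]), np.eye(2), np.eye(2)))  # 50.0
```

Under the null the statistic would be compared with a $\chi^2_p$ critical value, with $p$ the number of restrictions.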
(e) We may also be interested in testing hypotheses about the coefficients in generalizations of model (5.9) of the following form:
$$Y_{i,t} = \beta_{\mu}X_{i,t} + \tilde{E}_{i,t} \quad \text{with} \quad \begin{cases} \beta_{\mu} = \beta_a & \text{for } i \in I_a, \\ \beta_{\mu} = \beta_b & \text{for } i \in I_b, \end{cases} \tag{5.10}$$
where $I_a$ and $I_b$ are index sets corresponding to subgroups of the cross section population for which the long-run average covariance matrices are $\Omega_a$ and $\Omega_b$, respectively, leading to long-run average regression coefficients $\beta_a = \Omega_{a,yx}\Omega_{a,xx}^{-1}$ and $\beta_b = \Omega_{b,yx}\Omega_{b,xx}^{-1}$ that may differ between the two populations. Models like (5.10) can be readily extended to multi-category models and will be empirically relevant, for example, in cross country panel regressions where countries are partitioned into classes of similar category like developed (OECD) nations and developing and undeveloped nations. Note that in such cases the model (5.10) allows for intra-class variation (i.e., the regression coefficients $\beta_i$ for $i \in I_a$ will differ) but our primary interest lies in the inter-class difference $\beta_a - \beta_b$. A natural hypothesis is then: $\mathbb{H}_0: \beta_a = \beta_b$. Let $n_a = \#(I_a)$ and $n_b = \#(I_b)$, respectively. Suppose that $n_b/n_a \to \kappa < \infty$ as $n_a, n_b \to \infty$. The null hypothesis can be tested by constructing pooled regression coefficients $\hat{\beta}_a$, $\hat{\beta}_b$ in each class and computing the Wald statistic
$$W_{a,b} = n_b\left\{\mathrm{vec}\left(\hat{\beta}_a - \hat{\beta}_b\right)'\hat{V}_{a-b}^{-1}\,\mathrm{vec}\left(\hat{\beta}_a - \hat{\beta}_b\right)\right\},$$
where $\hat{V}_{a-b} = (n_a/n_b)\hat{V}_a + \hat{V}_b$, $\hat{V}_{\mu} = 4\left(\hat{\Omega}_{\mu,xx}^{-1} \otimes I_{m_y}\right)\hat{\Theta}_{\mu}\left(\hat{\Omega}_{\mu,xx}^{-1} \otimes I_{m_y}\right)$,
$$\hat{\Theta}_{\mu} = \frac{1}{n_{\mu}}\sum_{i \in I_{\mu}}\left\{\frac{1}{T^4}\sum_{s,t=1}^{T} X_{i,t}X_{i,s}' \otimes \hat{E}_{i,t}\hat{E}_{i,s}'\right\}, \qquad \hat{\Omega}_{\mu,xx}^{-1} = \left\{\frac{1}{n_{\mu}}\sum_{i \in I_{\mu}}\frac{2}{T^2}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right\}^{-1},$$
and $\hat{E}_{i,t} = Y_{i,t} - \hat{\beta}_{\mu}X_{i,t}$ with $\mu \in \{a, b\}$. Again, this leads to an asymptotic $\chi^2$ test. The following result gives the limit theory under joint limits as $(n_a, n_b, T \to \infty)$ and can be obtained in a simple way from Theorems 5 and 6.
THEOREM 7: Under $\mathbb{H}_0: \beta_a = \beta_b$ and Assumptions 4-7, $W_{a,b} \Rightarrow \chi^2_{m_y m_x}$, as $(n_a, n_b, T \to \infty)$ with $n_a/T, n_b/T \to 0$.
5.2. Homogeneous Panel Cointegration and Pooled FM Estimation
This section considers a homogeneous panel cointegration model, where the cointegrating relations are the same across individuals, and develops an asymptotic theory for a pooled FM estimator. We start with the following simplifying assumption.
ASSUMPTION 8: $C_{i,t} \overset{a.s.}{=} C_t$, where $C_t$ is an $(m \times m)$ nonrandom matrix for all $t$.
Then, under Assumption 6, the panel cointegration model (5.2) becomes
$$Y_{i,t} \overset{a.s.}{=} \beta X_{i,t} + E_{i,t}, \qquad X_{i,t} = X_{i,t-1} + U_{x_i,t}, \tag{5.11}$$
where
$$\beta = \Omega_{yx}\Omega_{xx}^{-1}, \qquad \alpha = (I_{m_y}, -\beta), \qquad G_s = \begin{pmatrix} G_{e,s} \\ G_{x,s} \end{pmatrix} = \begin{pmatrix} -\alpha\tilde{C}_s \\ \Gamma C_s \end{pmatrix},$$
and
$$\begin{pmatrix} E_{i,t} \\ U_{x_i,t} \end{pmatrix} = \sum_{s=0}^{\infty} G_s V_{i,t-s}, \qquad \tilde{C}_s = \sum_{j=s+1}^{\infty} C_j.$$
In this model the same long-run relation between $Y_{i,t}$ and $X_{i,t}$ applies for all $i$. Unlike previous models, the error term in model (5.11) is generated by a linear process with nonrandom coefficients $\{G_s\}$, on which we impose the following summability condition.
ASSUMPTION 9: $\sum_{s=0}^{\infty} s^3\|C_s\| < \infty$.
Define $\tilde{G}_s = \sum_{j=s+1}^{\infty} G_j$. Under Assumption 9, $G(1) = \sum_{s=0}^{\infty} G_s < \infty$ and $\sum_{s=0}^{\infty} s^2\|\tilde{G}_s\|^2 = \sum_{s=0}^{\infty} s^2\|\sum_{j=s+1}^{\infty} G_j\|^2 < \infty$. Write $F_{i,t} = (E_{i,t}', U_{x_i,t}')'$. Then, as $T \to \infty$, we have the functional law $(1/\sqrt{T})\sum_{t=1}^{[Tr]} F_{i,t} \Rightarrow B_i(r) \equiv BM(\Omega_F)$, where $\Omega_F = G(1)G(1)'$ (Theorem 3.4 in Phillips and Solo (1992)). The following assumption is conventional in time series cointegration analysis, although it could be relaxed with some consequential changes in the asymptotics, including changes in convergence rates in directions determined by the singularity.
ASSUMPTION 10: $\Omega_F$ is positive definite.
Partition $B_i(r) = (B_{e_i}(r)', B_{x_i}(r)')'$ conformably with $F_{i,t}$. Set $S_{i,t} = \sum_{s=1}^{t} F_{i,s} + S_{i,0}$, where $S_{i,0}$ are iid across $i$ with $E\|S_{i,0}\|^4 < \infty$. Then, in the usual way (Phillips (1988)), as $T \to \infty$
$$\frac{1}{T}\sum_{t=1}^{T} F_{i,t}S_{i,t}' \Rightarrow \int dB_i B_i' + \Lambda_F,$$
where $\Lambda_F = \sum_{k=0}^{\infty} E(F_{i,k}F_{i,0}') = \sum_{k=0}^{\infty}\sum_{s=0}^{\infty} G_{s+k}G_s'$. Again, conformably partition $\Omega_F$ and $\Lambda_F$ as
$$\begin{pmatrix} \Omega_{ee} & \Omega_{ex} \\ \Omega_{xe} & \Omega_{xx} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} \Lambda_{ee} & \Lambda_{ex} \\ \Lambda_{xe} & \Lambda_{xx} \end{pmatrix},$$
respectively.
For each $i$, model (5.11) is a time series cointegrating regression. The least squares estimator
$$\hat{\beta}_i = \left(\sum_{t=1}^{T} Y_{i,t}X_{i,t}'\right)\left(\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1} \overset{a.s.}{=} \beta + \left(\sum_{t=1}^{T} E_{i,t}X_{i,t}'\right)\left(\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1}$$
has the following asymptotic distribution (Phillips and Durlauf (1986)):
$$T(\hat{\beta}_i - \beta) \Rightarrow \left(\int dB_{e_i}B_{x_i}' + \Lambda_{ex}\right)\left(\int B_{x_i}B_{x_i}'\right)^{-1} \quad \text{as } T \to \infty. \tag{5.12}$$
The time series estimator $\hat{\beta}_i$ is therefore consistent for $\beta$, the common long-run coefficient for all $i$, although there may be a second order bias effect entering through the term $\Lambda_{ex}$ in (5.12) arising from the correlation between $E_{i,t}$ and $X_{i,s}$.
When the panel observations are pooled, as in the estimator $\hat\beta_{n,T}$ defined in (4.4), we get
$$\hat\beta_{n,T} =_{a.s.} \beta + \Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} E_{i,t}X_{i,t}'\Bigr)\Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\Bigr)^{-1}.$$
When $\Lambda_{ex} = 0$, the limit theory of this estimator is as follows.
THEOREM 8: Suppose that Assumptions 6, 8–10 hold. If $\Lambda_{ex} = 0$, then as $(n, T \to \infty)$ with $n/T \to 0$,
$$\sqrt{n}\,T(\hat\beta_{n,T} - \beta) \Rightarrow N\bigl(0, 2(\Omega_{xx}^{-1} \otimes \Omega_{ee})\bigr).$$
Thus, if $E_{i,t}$ and $U_{x_i,s}$ are uncorrelated, so that the one sided long-run covariance $\Lambda_{ex} = 0$, the pooled estimator $\hat\beta_{n,T}$ is $\sqrt{n}\,T$ consistent and has a limiting normal distribution in joint asymptotics as $(n, T \to \infty)$ when $n/T \to 0$.
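The rate and limit variance in Theorem 8 can be illustrated with a small Monte Carlo sketch. This is not part of the paper's argument: it assumes the scalar case $m_y = m_x = 1$, iid standard normal errors and regressor innovations that are independent of each other (so that $\Lambda_{ex} = 0$ and $\Omega_{ee} = \Omega_{xx} = 1$), and $X_{i,0} = 0$. The function names and the sample sizes are hypothetical choices made for illustration. Under these assumptions the sample variance of $\sqrt{n}\,T(\hat\beta_{n,T} - \beta)$ should be roughly 2, with some finite-sample inflation because $n$ is only moderately large.

```python
import numpy as np

def pooled_ols(Y, X):
    """Pooled least squares slope for scalar panels Y, X of shape (n, T)."""
    return np.sum(Y * X) / np.sum(X * X)

def simulate_scaled_errors(n=20, T=200, beta=1.0, reps=500, seed=0):
    """Draws of sqrt(n) * T * (beta_hat - beta) under the assumed design."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for r in range(reps):
        U = rng.standard_normal((n, T))   # regressor innovations, Omega_xx = 1
        E = rng.standard_normal((n, T))   # equation errors, Omega_ee = 1, independent of U
        X = np.cumsum(U, axis=1)          # X_{i,t} = X_{i,t-1} + U_{i,t}, X_{i,0} = 0 (assumed)
        Y = beta * X + E                  # homogeneous cointegrating relation
        out[r] = np.sqrt(n) * T * (pooled_ols(Y, X) - beta)
    return out

scaled = simulate_scaled_errors()
print("sample mean:", scaled.mean())
print("sample variance (theory: about 2):", scaled.var())
```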
When $\Lambda_{ex} \neq 0$, we do not attain $\sqrt{n}\,T$ consistency with the pooled least squares estimator $\hat\beta_{n,T}$, because of the persistence of bias effects. However, we may 'fully modify' the regressand $Y_{i,t}$ to eliminate the serial correlation $\Lambda_{ex}$. Originally, the fully modified (FM) regression method was introduced in Phillips and Hansen (1990) to correct for the presence of endogeneity (the correlation between $B_{e_i}$ and $B_{x_i}$) and serial correlation in the OLS estimator $\hat\beta_i$ of the individual cointegration regression model. Their construction calls for consistent time series estimators $\hat\Omega_F$ and $\hat\Lambda_F$ of $\Omega_F$ and $\Lambda_F$. In our case, consistent estimates may be constructed using averages (over $i = 1, \dots, n$) of the usual consistent (as $T \to \infty$) nonparametric kernel estimates of the corresponding long-run quantities for each $i$. More specifically, let $\hat\Gamma_i(j) = (1/T)\sum_t F_{i,t+j}F_{i,t}'$, where the summation is over $1 \le t, t+j \le T$, and define the averaged kernel estimators
$$\hat\Omega_F = \frac{1}{n}\sum_{i=1}^{n} \hat\Omega_{F,i}, \qquad \hat\Omega_{F,i} = \sum_{j=-T+1}^{T-1} w\Bigl(\frac{j}{K}\Bigr)\hat\Gamma_i(j),$$
$$\hat\Lambda_F = \frac{1}{n}\sum_{i=1}^{n} \hat\Lambda_{F,i}, \qquad \hat\Lambda_{F,i} = \sum_{j=0}^{T-1} w\Bigl(\frac{j}{K}\Bigr)\hat\Gamma_i(j), \tag{5.13}$$
where $w(x)$ is a lag kernel for which $w(0) = 1$, $w(x) = w(-x)$, $\int_{-\infty}^{\infty} w(x)^2\,dx < \infty$, and with Parzen's exponent $q \in (0, \infty)$ such that $k_q = \lim_{x \to 0}(1 - w(x))/|x|^q < \infty$ (e.g., see Hannan (1970) or Andrews (1991)).7
7 In determining asymptotic properties of kernel estimates of the long-run variance we usually also impose a smoothness restriction on the spectral density at the origin. This smoothness condition can be formulated as a summability condition on the autocovariance sequence $\Gamma(h) = E(F_{i,t}F_{i,t+h}')$. The summability conditions in Assumption 9 ensure that $\sum_{h=0}^{\infty} h^2\|\Gamma(h)\| < \infty$, and provide sufficient smoothness for our results here.
As is well known in the nonparametric literature, the choice of the bandwidth $K$ is important in the limit behavior of $\hat\Omega_F$. Under the summability condition given in Assumption 9, it is known that $\hat\Omega_{F,i} \to \Omega_F$ if $K, T \to \infty$ with $K/T \to 0$. However, later in this section (e.g., for Theorem 9) we need the stronger result that $\sqrt{n}(\hat\Omega_F - \Omega_F)$, $\sqrt{n}(\hat\Lambda_F - \Lambda_F) = o_p(1)$ as $(n, T \to \infty)$ with $n/T \to 0$. The following assumption about bandwidth choice is made so that these conditions apply.
ASSUMPTION 11: The lag kernel $w(\cdot)$ in (5.13) has Parzen exponent $q > 1/2$, and the bandwidth parameter $K$ tends to infinity with $K/T \to 0$ and $K^{2q}/T \to \epsilon > 0$.
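The averaged kernel estimators in (5.13) can be sketched numerically as follows. This is an illustrative sketch only. It assumes that the series $F_{i,t}$ is available as an array (in practice $E_{i,t}$ would have to be replaced by regression residuals), and it uses the Bartlett kernel, whose Parzen exponent is $q = 1$, so that a bandwidth of order $\sqrt{T}$ is compatible with Assumption 11. The function names `bartlett`, `sample_autocov`, and `averaged_lrv` are hypothetical.

```python
import numpy as np

def bartlett(x):
    """Bartlett lag kernel w(x) = max(0, 1 - |x|); Parzen exponent q = 1."""
    return max(0.0, 1.0 - abs(x))

def sample_autocov(F, j):
    """Gamma_hat(j) = (1/T) sum_t F_{t+j} F_t' for one unit; F is (T, m), j >= 0."""
    T = F.shape[0]
    return F[j:].T @ F[:T - j] / T

def averaged_lrv(F_panel, K, kernel=bartlett):
    """Cross-section averages of kernel estimates of Omega_F and Lambda_F.

    F_panel has shape (n, T, m). Returns (Omega_hat, Lambda_hat).
    """
    n, T, m = F_panel.shape
    Omega = np.zeros((m, m))
    Lam = np.zeros((m, m))
    for F in F_panel:
        G0 = sample_autocov(F, 0)
        Om_i, La_i = G0.copy(), G0.copy()
        for j in range(1, T):
            w = kernel(j / K)
            if w == 0.0:
                continue
            Gj = sample_autocov(F, j)
            Om_i += w * (Gj + Gj.T)   # two-sided sum, using Gamma(-j) = Gamma(j)'
            La_i += w * Gj            # one-sided sum over j >= 0
        Omega += Om_i
        Lam += La_i
    return Omega / n, Lam / n

# Illustration with iid data, for which Omega_F = Lambda_F = I (assumed design).
rng = np.random.default_rng(0)
n, T, m = 10, 400, 2
F_panel = rng.standard_normal((n, T, m))
K = int(np.sqrt(T))                   # K^2 / T = 1, so K^{2q}/T stays bounded with q = 1
Omega_hat, Lambda_hat = averaged_lrv(F_panel, K)
print(np.round(Omega_hat, 2))
```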
Define
$$Y_{i,t}^{+} = Y_{i,t} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\Delta X_{i,t} \tag{5.14}$$
and
$$\hat\Lambda_{ex}^{+} = \hat\Lambda_{ex} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\hat\Lambda_{xx}. \tag{5.15}$$
Equation (5.14) gives the endogeneity correction and equation (5.15) gives the serial correlation correction.
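Using these corrections, a pooled FM (PFM) estimator can be defined as follows:
$$\hat\beta_{PFM} = \Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} Y_{i,t}^{+}X_{i,t}' - nT\hat\Lambda_{ex}^{+}\Bigr)\Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\Bigr)^{-1} =_{a.s.} \beta + \Bigl(\sum_{i=1}^{n}\Bigl(\sum_{t=1}^{T} \hat{E}_{i,t}^{+}X_{i,t}' - T\hat\Lambda_{ex}^{+}\Bigr)\Bigr)\Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\Bigr)^{-1}, \tag{5.16}$$
where $\hat{E}_{i,t}^{+} = E_{i,t} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\Delta X_{i,t}$. Rescaling $\hat\beta_{PFM} - \beta$ by $\sqrt{n}\,T$ and letting $T \to \infty$ for fixed $n$, we have
$$\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow \Bigl(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\int dB_{e_i.x_i}B_{x_i}'\Bigr)\Bigl(\frac{1}{n}\sum_{i=1}^{n}\int B_{x_i}B_{x_i}'\Bigr)^{-1},$$
where $B_{e_i.x_i}(r) \equiv BM(\Omega_{e.x})$ and $\Omega_{e.x} = \Omega_{ee} - \Omega_{ex}\Omega_{xx}^{-1}\Omega_{xe}$.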
Note that $B_{e_i.x_i}(r)$ and $B_{x_i}(r)$ are independent, so $E\int dB_{e_i.x_i}B_{x_i}' = 0$ and
$$E\Bigl[\mathrm{vec}\Bigl(\int dB_{e_i.x_i}B_{x_i}'\Bigr)\mathrm{vec}\Bigl(\int dB_{e_i.x_i}B_{x_i}'\Bigr)'\Bigr] = \tfrac{1}{2}(\Omega_{xx} \otimes \Omega_{e.x}).$$
Thus, applying the multivariate Lindeberg-Levy theorem to $(1/\sqrt{n})\sum_{i=1}^{n}\int dB_{e_i.x_i}B_{x_i}'$ and combining this with the limit of $(1/n)\sum_{i=1}^{n}\int B_{x_i}B_{x_i}'$, we find that as $n \to \infty$
$$\Bigl(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\int dB_{e_i.x_i}B_{x_i}'\Bigr)\Bigl(\frac{1}{n}\sum_{i=1}^{n}\int B_{x_i}B_{x_i}'\Bigr)^{-1} \Rightarrow N\bigl(0, 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x})\bigr).$$
Thus, $\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow N(0, 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x}))$ in sequential limit as $(T, n \to \infty)_{seq}$. The following theorem shows that these asymptotics also hold for joint limits.
THEOREM 9: Under Assumptions 6, 8–11 we have
$$\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow N\bigl(0, 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x})\bigr)$$
as $(n, T \to \infty)$ with $n/T \to 0$.
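The construction of the pooled FM estimator in (5.14) to (5.16) can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: the long-run matrices are passed in as known inputs (in practice they would be kernel estimates), the blocks of $\Omega_F$ and $\Lambda_F$ are ordered as $(e, x)$, and $X_{i,0} = 0$ is assumed so that $\Delta X_{i,1} = X_{i,1}$. The function name `pooled_fm` and the simulated design (iid innovations with contemporaneous correlation `rho`) are hypothetical.

```python
import numpy as np

def pooled_fm(Y, X, Omega, Lam, my):
    """Pooled FM estimate; Y is (n, T, my), X is (n, T, mx)."""
    n, T, _ = X.shape
    O_ex, O_xx = Omega[:my, my:], Omega[my:, my:]
    L_ex, L_xx = Lam[:my, my:], Lam[my:, my:]
    A = O_ex @ np.linalg.inv(O_xx)          # Omega_ex Omega_xx^{-1}, shape (my, mx)
    dX = np.diff(X, axis=1, prepend=0.0)    # Delta X_{i,t}, assuming X_{i,0} = 0
    Y_plus = Y - dX @ A.T                   # endogeneity correction, as in (5.14)
    Lam_plus = L_ex - A @ L_xx              # serial correlation correction, as in (5.15)
    num = np.einsum("itk,itl->kl", Y_plus, X) - n * T * Lam_plus
    den = np.einsum("itk,itl->kl", X, X)
    return num @ np.linalg.inv(den)         # as in (5.16)

# Assumed design: E_{i,t} = rho * U_{i,t} + eta_{i,t} with iid innovations, so that
# Omega_F = Lambda_F = [[rho^2 + 1, rho], [rho, 1]].
rng = np.random.default_rng(1)
n, T, beta, rho = 50, 100, 2.0, 0.8
U = rng.standard_normal((n, T, 1))
E = rho * U + rng.standard_normal((n, T, 1))
X = np.cumsum(U, axis=1)
Y = beta * X + E
Omega = np.array([[rho**2 + 1.0, rho], [rho, 1.0]])
Lam = Omega.copy()
b_pfm = pooled_fm(Y, X, Omega, Lam, my=1)[0, 0]
b_ols = np.sum(Y * X) / np.sum(X * X)
print("pooled OLS error:", b_ols - beta)
print("pooled FM error: ", b_pfm - beta)
```

In this design the pooled least squares estimate carries a bias of order $1/T$ from the correlation between $E_{i,t}$ and $U_{x_i,t}$, which the corrections remove.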
REMARKS: (a) The pooled FM estimator $\hat\beta_{PFM}$ is $\sqrt{n}\,T$ consistent and has a normal limit distribution.
(b) When $\Lambda_{ex} = 0$, observe that $\hat\beta_{PFM}$ is more efficient than $\hat\beta_{n,T}$ because $\Omega_{e.x} < \Omega_{ee}$. The efficiency gain in $\hat\beta_{PFM}$ is obtained from the endogeneity correction that adjusts $Y_{i,t}$ in the fully modified estimator. This effectively reduces the long-run variance of the noise in the panel cointegrating equation.
(c) Asymptotic $\chi^2$ tests follow from Theorem 9 in the usual way. A consistent estimate of the covariance matrix, $2(\hat\Omega_{xx}^{-1} \otimes \hat\Omega_{e.x})$, can be constructed from $\hat\Omega_F$ in (5.13) by defining $\hat\Omega_{e.x} = \hat\Omega_{ee} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\hat\Omega_{xe}$. A Wald test of $H_0: \varphi(\beta) = 0$, where $\varphi(\cdot)$ is a $p$-vector of smooth functions such that $\partial\varphi/\partial\beta'$ has full rank $p$, can then be formulated in the usual way as
$$W_\varphi = nT^2\,\varphi(\hat\beta_{PFM})'\,\hat{V}_\varphi^{-1}\,\varphi(\hat\beta_{PFM}), \tag{5.17}$$
where
$$\hat{V}_\varphi = \bigl(\partial\varphi(\hat\beta_{PFM})/\partial\beta'\bigr)\bigl(2\hat\Omega_{xx}^{-1} \otimes \hat\Omega_{e.x}\bigr)\bigl(\partial\varphi(\hat\beta_{PFM})/\partial\beta'\bigr)'.$$
(d) As in Remark (e) following Theorem 6, it may be of interest to generalize model (5.11) to allow for subgroups of the population in which the regression coefficient is the same. In effect, we may replace model (5.11) with
$$Y_{i,t} =_{a.s.} \beta_\mu X_{i,t} + E_{i,t} \quad \text{with} \quad \beta_\mu = \begin{cases} \beta_a & \text{for } i \in I_a, \\ \beta_b & \text{for } i \in I_b, \end{cases} \qquad X_{i,t} = X_{i,t-1} + U_{x_i,t}. \tag{5.18}$$
It is then possible to test hypotheses about the vectors $\beta_a$ and $\beta_b$ in the generalized model (5.18). For example, to test $H_0: \beta_a = \beta_b$, letting $n_a = \#(I_a)$ and $n_b = \#(I_b)$, respectively, and assuming that $n_b/n_a \to \kappa \to \infty$ as $n_a, n_b \to \infty$, we may construct pooled FM regression coefficients $\hat\beta_{a,PFM}$, $\hat\beta_{b,PFM}$ in each class and then the Wald statistic
$$W_{a-b,PFM} = n_b T^2\bigl\{\mathrm{vec}(\hat\beta_{a,PFM} - \hat\beta_{b,PFM})'\,\hat{V}_{a-b,PFM}^{-1}\,\mathrm{vec}(\hat\beta_{a,PFM} - \hat\beta_{b,PFM})\bigr\}.$$
Here, $\hat{V}_{a-b,PFM} = \kappa\hat{V}_{a,PFM} + \hat{V}_{b,PFM}$, $\hat{V}_{\mu,PFM} = 2\hat\Omega_{\mu,xx}^{-1} \otimes \hat\Omega_{\mu,e.x}$, and $\hat\Omega_{\mu,xx}$, $\hat\Omega_{\mu,e.x}$ are the respective estimates of the long-run conditional covariance matrices of the regressors and the fully modified error processes in the classes $I_\mu$ with $n_\mu = \#(I_\mu)$ where $\mu \in \{a, b\}$. As in the earlier case of heterogeneous cointegration, this leads to an asymptotic $\chi^2$ test based on the null distribution $W_{a-b,PFM} \Rightarrow \chi^2_{m_y m_x}$, which follows in a manner analogous to that of Theorem 7.
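The Wald statistic (5.17) is simple to compute once the long-run covariance estimate is available. The following sketch covers only the special case $\varphi(\beta) = \mathrm{vec}(\beta - \beta_0)$, for which the Jacobian $\partial\varphi/\partial\beta'$ is the identity; the function name `wald_stat`, the block ordering $(e, x)$ of the input matrix, and the numerical inputs are hypothetical illustrations.

```python
import numpy as np

def wald_stat(beta_hat, beta0, Omega, n, T, my):
    """Wald statistic n T^2 d' V^{-1} d with d = vec(beta_hat - beta0)."""
    O_ee, O_ex = Omega[:my, :my], Omega[:my, my:]
    O_xe, O_xx = Omega[my:, :my], Omega[my:, my:]
    O_e_x = O_ee - O_ex @ np.linalg.inv(O_xx) @ O_xe   # conditional long-run variance
    V = 2.0 * np.kron(np.linalg.inv(O_xx), O_e_x)      # 2 (Omega_xx^{-1} kron Omega_e.x)
    d = (beta_hat - beta0).flatten(order="F")          # vec stacks columns
    return n * T**2 * d @ np.linalg.solve(V, d)

# Scalar illustration with made-up inputs: Omega_e.x = 2 - 1 = 1, so V = 2.
Omega = np.array([[2.0, 1.0], [1.0, 1.0]])
W = wald_stat(np.array([[1.01]]), np.array([[1.00]]), Omega, n=10, T=100, my=1)
print(W)  # compare with a chi-square critical value with my * mx degrees of freedom
```

With one degree of freedom the 5% critical value of the $\chi^2$ distribution is 3.841, so the made-up inputs above would lead to rejection.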
5.3. Near-Homogeneous Panels
The homogeneous panel model (5.11) discussed above is somewhat unrealistic because it assumes that each individual has exactly the same cointegrating relation. Here we study briefly a panel cointegration model with nearly homogeneous cointegrating vectors of the form
$$\beta_i = \beta + \frac{\theta_i}{\sqrt{n}\,T}, \tag{5.19}$$
where the sequence of $(m_y \times m_x)$ random matrices $\theta_i$ is iid across $i$ with mean $\theta$ and finite variance.
ASSUMPTION 12: $\theta_i$ is independent of $(E_{i,t}, U_{x_i,t})$ for all $i$ and $t$.
We again consider the pooled FM estimator $\hat\beta_{PFM}$ given in (5.16), and the limit theory follows as in Theorem 9 above.
THEOREM 10: Suppose there exists near-homogeneous panel cointegration of the form (5.19). Let Assumptions 9–12 hold. Then, as $(n, T \to \infty)$ with $n/T \to 0$
$$\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow N\bigl(\theta, 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x})\bigr). \tag{5.20}$$
Theorem 10 is useful in calculating the asymptotic local power of the test statistic for the null hypothesis
$$H_0: \beta_i = \beta_0 \quad \forall i. \tag{5.21}$$
According to Remark (c) of the previous subsection, the Wald statistic in (5.17) for the null hypothesis in (5.21) is $W_\varphi$ with $\varphi(\beta) = \mathrm{vec}(\beta - \beta_0)$ and its limit distribution is $\chi^2_{m_y m_x}$. A sequence of local alternatives to the null (5.21) can be formulated as
$$H_{LA}: \beta_i = \beta_0 + \frac{\theta_i}{\sqrt{n}\,T}, \tag{5.22}$$
where the $\theta_i$ are iid across $i$ with mean $\theta \neq 0$, have finite variance and satisfy Assumption 12. In this case, under the local alternative hypothesis (5.22) and the assumptions of Theorem 10, the Wald statistic $W_\varphi$ has an asymptotic noncentral chi-square distribution as $(n, T \to \infty)$ with $n/T \to 0$, i.e.,
$$W_\varphi \Rightarrow \chi^2_{m_y m_x}(\lambda),$$
where the noncentrality parameter is $\lambda = \mathrm{vec}(\theta)'V_\varphi^{-1}\mathrm{vec}(\theta)/2$.
6. MODELS WITH INDIVIDUAL EFFECTS
Much of the preceding asymptotic theory can be extended in a straightforward way to panel models with individual specific effects and time trends. We illustrate what is involved in these extensions by taking the case of primary importance where the panel regression equation involves individual specific effects. To motivate the analysis, consider the following model of heterogeneous panel cointegration in place of (5.2):
$$Y_{i,t} =_{a.s.} \gamma_i + \beta_i X_{i,t} + E_{i,t}, \qquad X_{i,t} = X_{i,t-1} + U_{x_i,t}. \tag{6.1}$$
Here, the $\gamma_i$ are individual effects in the cointegrating equation. They could be fixed or random effects. We can also allow for individual effects in the equation for $X_{i,t}$ in (6.1). In that case, the $X_{i,t}$ have individual deterministic trends as well as stochastic trends and in what follows we would proceed using detrended rather than demeaned data in the pooled panel regression, with some associated change in the final formulae.
The individual effect in (6.1) can be eliminated in the usual way by removing individual specific means, i.e., $Y_{i,\cdot} = (1/T)\sum_{t=1}^{T} Y_{i,t}$ and $X_{i,\cdot} = (1/T)\sum_{t=1}^{T} X_{i,t}$. Then pooled panel regression leads to the estimator
$$\tilde\beta_{n,T} = \Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{Y}_{i,t}\tilde{X}_{i,t}'\Bigr)\Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{X}_{i,t}\tilde{X}_{i,t}'\Bigr)^{-1},$$
where $\tilde{Y}_{i,t} = Y_{i,t} - Y_{i,\cdot}$, and $\tilde{X}_{i,t} = X_{i,t} - X_{i,\cdot}$.
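The demeaned pooled estimator can be sketched as below. This is an illustrative sketch with a hypothetical function name (`within_pooled_ols`) and an assumed simulated design; the point it demonstrates is that subtracting individual time averages makes the estimate numerically invariant to the individual effects $\gamma_i$.

```python
import numpy as np

def within_pooled_ols(Y, X):
    """Pooled regression on data demeaned over t; Y is (n, T, my), X is (n, T, mx)."""
    Yd = Y - Y.mean(axis=1, keepdims=True)   # Y_{i,t} minus its time average
    Xd = X - X.mean(axis=1, keepdims=True)   # X_{i,t} minus its time average
    num = np.einsum("itk,itl->kl", Yd, Xd)
    den = np.einsum("itk,itl->kl", Xd, Xd)
    return num @ np.linalg.inv(den)

# Assumed design: scalar panel with large individual effects gamma_i.
rng = np.random.default_rng(2)
n, T = 30, 150
X = np.cumsum(rng.standard_normal((n, T, 1)), axis=1)
Y0 = 1.5 * X + rng.standard_normal((n, T, 1))
gamma = 10.0 * rng.standard_normal((n, 1, 1))
b_without = within_pooled_ols(Y0, X)
b_with = within_pooled_ols(Y0 + gamma, X)    # same estimate despite the added effects
print(b_without, b_with)
```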
As in our earlier theory, some quick asymptotic results for $\tilde\beta_{n,T}$ can be obtained using sequential limits. First consider the case where there is no cointegration and the true data generating mechanism is (2.1), even though it is model (6.1) that is estimated. Define the demeaned limiting process $\tilde{M}_i(r) = (\tilde{M}_{y_i}(r)', \tilde{M}_{x_i}(r)')' = M_i(r) - \int M_i(s)\,ds$. According to (2.6) and the continuous mapping theorem, under Assumptions 1–3, the pooled estimator $\tilde\beta_{n,T}$ has the following limit distribution as $T \to \infty$ for any fixed $n$:
$$\tilde\beta_{n,T} \Rightarrow \Bigl(\frac{1}{n}\sum_{i=1}^{n}\int \tilde{M}_{y_i}\tilde{M}_{x_i}'\Bigr)\Bigl(\frac{1}{n}\sum_{i=1}^{n}\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)^{-1}.$$
A simple calculation shows that $E(\int \tilde{M}_i\tilde{M}_i') = E(\int M_iM_i') - E(\int M_i \int M_i') = \tfrac{1}{6}\Omega$. Thus, applying the strong law of large numbers to $(1/n)\sum_{i=1}^{n}\int \tilde{M}_{y_i}\tilde{M}_{x_i}'$ and $(1/n)\sum_{i=1}^{n}\int \tilde{M}_{x_i}\tilde{M}_{x_i}'$, we get $(1/n)\sum_{i=1}^{n}\int \tilde{M}_{y_i}\tilde{M}_{x_i}' \to_{a.s.} \tfrac{1}{6}\Omega_{yx}$, and $(1/n)\sum_{i=1}^{n}\int \tilde{M}_{x_i}\tilde{M}_{x_i}' \to_{a.s.} \tfrac{1}{6}\Omega_{xx}$. It follows that $\tilde\beta_{n,T} \to_p \beta = \Omega_{yx}\Omega_{xx}^{-1}$ as $(T, n \to \infty)_{seq}$.
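The asymptotic normality of $\tilde\beta_{n,T}$ follows by arguments analogous to those of Section 4. Rescaling the centered estimator $(\tilde\beta_{n,T} - \beta)$ by $\sqrt{n}$ and letting $T \to \infty$ for fixed $n$, we have
$$\sqrt{n}(\tilde\beta_{n,T} - \beta) \Rightarrow \Bigl(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Bigl(\int \tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)\Bigr)\Bigl(\frac{1}{n}\sum_{i=1}^{n}\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)^{-1}.$$
Note that $E(\int \tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int \tilde{M}_{x_i}\tilde{M}_{x_i}') = 0$, so demeaning the data does not affect the asymptotic centering. After some lengthy calculations, we find the variance matrix
$$\begin{aligned}
&E\Bigl[\mathrm{vec}\Bigl(\int \tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)\mathrm{vec}\Bigl(\int \tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)'\Bigr] \\
&\quad = \tfrac{1}{90}E\bigl(\Omega_{x_ix_i} \otimes (\Omega_{y_iy_i} - \beta\Omega_{x_iy_i} - \Omega_{y_ix_i}\beta' + \beta\Omega_{x_ix_i}\beta')\bigr) \\
&\qquad + \tfrac{1}{90}E\bigl\{\bigl[(\Omega_{x_iy_i} - \Omega_{x_ix_i}\beta') \otimes (\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})\bigr]K_{m_ym_x}\bigr\} \\
&\qquad + \tfrac{1}{36}E\bigl(\mathrm{vec}(\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})(\mathrm{vec}(\Omega_{y_ix_i} - \beta\Omega_{x_ix_i}))'\bigr) \\
&\quad = \Theta_f, \ \text{say.}
\end{aligned}$$
Note that this covariance matrix differs in the coefficients of its components from the earlier matrix $\Theta$ given in (4.6) for the case where there is no demeaning to remove possible individual effects. As is clear from the formulae for these two cases (see (6.2) below), $\Theta_f < \Theta$, so one effect of demeaning is to reduce time series variability.
Applying the multivariate Lindeberg-Levy theorem to $(1/\sqrt{n})\sum_{i=1}^{n}(\int \tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int \tilde{M}_{x_i}\tilde{M}_{x_i}')$ and combining this with the limit of $((1/n)\sum_{i=1}^{n}\int \tilde{M}_{x_i}\tilde{M}_{x_i}')^{-1}$ as $n \to \infty$, we have
$$\Bigl(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Bigl(\int \tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)\Bigr)\Bigl(\frac{1}{n}\sum_{i=1}^{n}\int \tilde{M}_{x_i}\tilde{M}_{x_i}'\Bigr)^{-1} \Rightarrow N\bigl(0, 36(\Omega_{xx}^{-1} \otimes I_{m_y})\Theta_f(\Omega_{xx}^{-1} \otimes I_{m_y})\bigr).$$
Hence, as $(T, n \to \infty)_{seq}$ we have
$$\sqrt{n}(\tilde\beta_{n,T} - \beta) \Rightarrow N\bigl(0, 36(\Omega_{xx}^{-1} \otimes I_{m_y})\Theta_f(\Omega_{xx}^{-1} \otimes I_{m_y})\bigr).$$
These sequential limit results can be extended to joint limit results, just as in the proof of Theorem 4, and we merely state the final results here.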
THEOREM 11: Suppose Assumptions 1, 2, and 3 hold and the data generating mechanism is (2.1). Then:
(a) as $(n, T \to \infty)$, we have $\tilde\beta_{n,T} \to_p \beta$;
(b) if $(n, T \to \infty)$ and $n/T \to 0$,
$$\sqrt{n}(\tilde\beta_{n,T} - \beta) \Rightarrow N\bigl(0, 36(\Omega_{xx}^{-1} \otimes I_{m_y})\Theta_f(\Omega_{xx}^{-1} \otimes I_{m_y})\bigr).$$
REMARKS: (a) Comparing the limit variance of $\tilde\beta_{n,T}$ in Theorem 11 to that of $\hat\beta_{n,T}$ in Theorem 4, we find that $\tilde\beta_{n,T}$ has the smaller asymptotic covariance. In fact, it is apparent from the formulae that
$$\begin{aligned}
4\Theta - 36\Theta_f &= \tfrac{4}{15}E\bigl(\Omega_{x_ix_i} \otimes (\Omega_{y_iy_i} - \beta\Omega_{x_iy_i} - \Omega_{y_ix_i}\beta' + \beta\Omega_{x_ix_i}\beta')\bigr) \\
&\quad + \tfrac{4}{15}E\bigl\{\bigl[(\Omega_{x_iy_i} - \Omega_{x_ix_i}\beta') \otimes (\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})\bigr]K_{m_ym_x}\bigr\}
\end{aligned} \tag{6.2}$$
$$= \tfrac{4}{15}E\bigl\{\bigl[C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\bigr](I_{m^2} + K_m)\bigl[C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\bigr]'\bigr\} > 0. \tag{6.3}$$
As remarked above, this reduction in variance occurs because demeaning the data by removing individual effects reduces time series variability. Similar effects occur when higher order time trends are removed from the data in the construction of pooled panel estimators.
(b) In the heterogeneous panel cointegration case, the data are generated by (6.1). The individual effects $\gamma_i$ can now be consistently estimated by time series regression on (6.1) leading to $\hat\gamma_i = Y_{i,\cdot} - \tilde\beta_i X_{i,\cdot}$ and $\tilde\beta_i = (\sum_{t=1}^{T} \tilde{Y}_{i,t}\tilde{X}_{i,t}')(\sum_{t=1}^{T} \tilde{X}_{i,t}\tilde{X}_{i,t}')^{-1}$. These least squares estimates and their fully modified variants have asymptotic properties that are well known (Phillips and Hansen (1990)). Following the same line of argument as in Section 5.1, the pooled panel estimator $\tilde\beta_{n,T}$ can be shown to have the same limit distribution as that given in Theorem 11 for the spurious regression case, although the long run covariance matrices $\Omega_i$ are now singular, just as in Section 5.1. Under the assumptions of Theorem 5, the asymptotic theory holds for joint limits as $(n, T \to \infty)$ and $n/T \to 0$, as well as sequential limits. Again $\tilde\beta_{n,T}$ estimates the long run average coefficient $\beta = \Omega_{yx}\Omega_{xx}^{-1}$. Wald tests like those discussed in Section 5 can now be constructed with obvious modifications to the estimated covariance matrix formulae that allow for elimination of the individual effects by demeaning.
(c) In the homogeneous panel cointegration case, the data are generated by (6.1) with $\beta_i = \beta = \Omega_{yx}\Omega_{xx}^{-1}$ a.s. We can eliminate individual effects by removing individual specific means as above, and may proceed with FM estimation as in Section 5.2. The data are now corrected according to the formula $\tilde{Y}_{i,t}^{+} = \tilde{Y}_{i,t} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\Delta\tilde{X}_{i,t}$ rather than as in (5.14). The pooled FM estimator in this case is given by
$$\tilde\beta_{PFM} = \Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{Y}_{i,t}^{+}\tilde{X}_{i,t}' - nT\tilde\Lambda_{ex}^{+}\Bigr)\Bigl(\sum_{i=1}^{n}\sum_{t=1}^{T} \tilde{X}_{i,t}\tilde{X}_{i,t}'\Bigr)^{-1}.$$
Under the same assumptions as Theorem 9, we find that $\sqrt{n}\,T(\tilde\beta_{PFM} - \beta) \Rightarrow N(0, 6(\Omega_{xx}^{-1} \otimes \Omega_{e.x}))$ as $(n, T \to \infty)$ with $n/T \to 0$. Note that in this case, the effect of eliminating individual specific means is to increase the limit variance matrix in comparison with Theorem 9. Wald tests can be constructed as described in Section 5.2 with obvious modifications for the use of demeaned data, and a noncentral limit theory follows as in Section 5.3.
7. CONCLUSION
This paper has developed a linear regression limit theory for nonstationary
panel data with large numbers of cross section and time series observations. A
central result is the existence of interesting long-run relations between two
integrated panel vectors where there is no individual time series cointegration or
where there are heterogeneous cointegrating relations. The new relations are
characterized as long-run average relationships over the cross section and are
parameterized in terms of the matrix regression coefficient, $\Omega_{yx}\Omega_{xx}^{-1}$, of the
cross section long-run average covariance matrix, $\Omega$. They are analogous to
population regression coefficients in conventional cross section regression of iid
variates. The limit theory can be used to construct tests of hypotheses about the
long-run average regression coefficients and to compare these coefficients in
subgroups of the cross section population. These tests are given explicitly for the
two cases of heterogeneous panel cointegration and homogeneous panel cointegration, which seem to be the important cases for empirical applications. The
local asymptotic power function for these tests is also derived.
The limit theory developed in this paper is designed for two dimensional
arrays where both time series and cross section sample sizes pass to infinity. It
allows for both sequential limits as $T \to \infty$ and $n \to \infty$ in that sequence, and joint
limits where $T, n \to \infty$ jointly. As the proofs in the Appendices demonstrate,
convergence for joint limits is more difficult to obtain. However, apart from some
stricter moment and summability conditions, the only additional requirement we
use in the development of this theory is the rate condition that $n/T \to 0$. This
condition indicates that the limit theory given herein is likely to be most useful
in cases where T is large and n is moderately large. The usefulness of this
asymptotic theory in describing finite sample behavior in panel regressions now
needs to be systematically explored in simulation experiments.
An important assumption that is common in panel data work and is used here
in deriving asymptotics is cross section independence. For many nonstationary
panel data applications, this independence condition is restrictive and it is an
important limitation of our theory. For instance, multi-country GDP series,
exchange rates, and financial assets prices all involve cross section dependence
arising from global shocks and complicated interdependencies among the variables. As is apparent from our approach, certain strong laws and central limit
results will continue to apply when the cross sectional dependence is of the weak
memory variety, but in this case the limit variance matrices will change according to the dependence. More significantly, when there are strong correlations in
a cross section (as there will be in the face of global shocks) we can expect
failures in the strong laws and central limit theory arising from the nonergodicity. However, even in this event, theorems like the ergodic theorem will still
apply but the limits will be random and measurable with respect to the invariant
algebra generated by the global shocks.
In the present case and, indeed, quite commonly in panel data theory, cross
section independence is assumed in part because of the difficulties of characterizing and modeling cross section dependence. In general, finding a natural
ordering for cross section indices in economic data is not easy, and this has been
a serious obstacle in the development of a satisfactory approach. While some
recent research has attempted to resolve the difficulty by employing a framework for spatial data based on the economic distance between individuals (e.g.,
Conley (1997)), the successful simultaneous modeling of both cross section
dependence and time series dependence remains a challenging problem and is a
major area for future research in multi-index asymptotics of the type considered
here.
Cowles Foundation for Research in Economics, Yale University, Box 208281, New
Haven, CT 06520-8281, U.S.A., and University of Auckland; [email protected]
and
Dept. of Economics, University of California, Santa Barbara, CA 93106, U.S.A.;
[email protected]
Manuscript received April, 1997; final revision received September, 1998.
APPENDICES
APPENDIX A: PRELIMINARY LEMMAS AND PROOFS
We start with some lemmas that are useful in the following arguments. The results are straightforward and proofs are omitted here, but are available in PM b.
LEMMA 9: (a) For any $p \ge 1$ and any $m \times n$ matrix $A$, there exists a constant $M > 0$ such that
$$\|A\|^p \le M\sum_{i=1}^{m}\sum_{j=1}^{n} |a_{i,j}|^p, \tag{8.1}$$
where $a_{i,j}$ is the $(i,j)$th element of $A$.
(b) For any $m \times m$ matrix $A$
$$(\mathrm{tr}(A))^2 \le m\|A\|^2. \tag{8.2}$$
LEMMA 10: Suppose that $A$ $(= \{a_{i,j}\}_{i,j})$ and $B$ $(= \{b_{i,j}\}_{i,j})$ are $(m \times m)$ matrices and $K_m$ is the commutation matrix. Then,
$$\mathrm{tr}[(A \otimes B)K_m] \le \|A\|\,\|B\|.$$
If $A$ is symmetric, then
$$\mathrm{tr}[(A \otimes A)K_m] = \|A\|^2.$$
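Lemma 10 can be checked numerically. The sketch below assumes that $\|A\|$ denotes the Frobenius norm $(\mathrm{tr}(A'A))^{1/2}$ (the norm is defined outside this section, so this is an assumption) and that $K_m$ is the matrix satisfying $K_m \mathrm{vec}(A) = \mathrm{vec}(A')$ for column-stacking vec. Under these conventions $\mathrm{tr}[(A \otimes B)K_m] = \mathrm{tr}(AB)$, from which both statements follow by the Cauchy-Schwarz inequality. The helper name `commutation_matrix` is hypothetical.

```python
import numpy as np

def commutation_matrix(m):
    """K_m with K_m vec(A) = vec(A') for m x m matrices and column-stacking vec."""
    K = np.zeros((m * m, m * m))
    for i in range(m):
        for j in range(m):
            K[i * m + j, j * m + i] = 1.0
    return K

rng = np.random.default_rng(3)
m = 3
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, m))
K = commutation_matrix(m)

lhs = np.trace(np.kron(A, B) @ K)            # tr[(A kron B) K_m]
bound = np.linalg.norm(A) * np.linalg.norm(B)  # Frobenius norms (assumed convention)
S = A + A.T                                  # a symmetric matrix
lhs_sym = np.trace(np.kron(S, S) @ K)
print(lhs, np.trace(A @ B), bound)
print(lhs_sym, np.linalg.norm(S) ** 2)
```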
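LEMMA 11: (a) Under Assumptions 1 and 2
$$E\Bigl(\sum_{t=0}^{\infty} \|C_{i,t}\|\Bigr)^4 < \infty. \tag{8.3}$$
(b) Under Assumptions 4 and 5
$$E\Bigl(\sum_{t=0}^{\infty} t\|C_{i,t}\|\Bigr)^8 < \infty. \tag{8.4}$$
(c) Under Assumptions 4 and 5
$$E\Bigl(\sum_{t=0}^{\infty} \|C_{i,t}\|\Bigr)^{16} < \infty. \tag{8.5}$$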
1. PROOF OF LEMMA 2: The panel BN decomposition
$$U_{i,t} = C_i(1)V_{i,t} + \tilde{U}_{i,t-1} - \tilde{U}_{i,t} \quad a.s. \tag{8.6}$$
follows directly from Phillips and Solo (1992) provided $Y_i = \sum_{s=0}^{\infty} s^2\|C_{i,s}\|^2 < \infty$ a.s. This condition holds if $E(Y_i) < \infty$, which holds by Lemma 1(a). Q.E.D.
2. PROOF OF LEMMA 1: See PM b. Q.E.D.
3. PROOF OF LEMMA 2: It is enough to show that
$$\frac{1}{\sqrt{T}}\tilde{U}_{a,i,0} \to_p 0 \quad \text{and} \quad \sup_r \frac{1}{\sqrt{T}}\bigl|\tilde{U}_{a,i,[Tr]}\bigr| \to_p 0 \quad \text{as } T \to \infty \text{ for all } a, i.$$
But, these follow because $\tilde{U}_{a,i,t}$ is strictly stationary in $t$ and square integrable by Lemma 1, so that the results hold by the same argument as that given in Phillips and Solo (1992, p. 978). The functional law follows directly. Q.E.D.
4. PROOF OF LEMMA 4: Substituting $M_i = C_i(1)W_i$, we have
$$E\Bigl\|\int M_iM_i'\Bigr\|^2 = E\Bigl\|\mathrm{vec}\int M_iM_i'\Bigr\|^2 = E\Bigl\|(C_i(1) \otimes C_i(1))\,\mathrm{vec}\int W_iW_i'\Bigr\|^2 \le E\|C_i(1) \otimes C_i(1)\|^2\,E\Bigl\|\mathrm{vec}\int W_iW_i'\Bigr\|^2,$$
where the last inequality holds because $\|AB\| \le \|A\|\,\|B\|$ and because $C_i(1)$ is independent of $W_i$. We know $E\|\mathrm{vec}\int W_iW_i'\|^2 < \infty$ and $E\|C_i(1) \otimes C_i(1)\|^2 = E\|C_i(1)\|^4 < \infty$ by Lemma 1. Therefore, $E\|\int M_iM_i'\|^2 < \infty$, as required. Q.E.D.
APPENDIX B: PROOFS FOR SECTION 3 (MULTIDIMENSIONAL LIMIT THEORY)
1. CONSTRUCTION OF RANDOM VECTORS $Y_i$ IN (3.3) TO EXIST ON THE SAME PROBABILITY SPACE: According to Skorohod's Theorem in $\mathbb{R}^m$, Theorem 29.6 in Billingsley (1986),8 we can construct a probability space $(\Omega_i^*, \mathcal{F}_i^*, P_i^*)$ where there exist random vectors $Y_{i,T}^*$ and $Y_i^*$ such that $Y_{i,T} \equiv Y_{i,T}^*$, $Y_i \equiv Y_i^*$, and $Y_{i,T}^* \to_{a.s.} Y_i^*$ as $T \to \infty$ for all $i$. Also, we can choose independent $Y_i^*$ because the $Y_{i,T}$ are independent across $i$ for all $T$. Now we define $\Omega^* = \prod_{i=1}^{\infty} \Omega_i^*$, the Cartesian product of the $\Omega_i^*$, and let $\pi_i$ be the natural projection of $\Omega^*$ onto $\Omega_i^*$ for each $i$. Let $\mathcal{F}^*$ be the smallest $\sigma$-field containing all the sets $\pi_i^{-1}(F)$ for all $i$ and $F \in \mathcal{F}_i^*$. Define $\mathcal{R}$ to be the collection of all finite dimensional rectangles, $\prod_{i=1}^{\infty} F_i$ where $F_i \in \mathcal{F}_i^*$ for all $i$ and $F_i = \Omega_i^*$, except for at most finitely many values of $i$. Now define $P^*(\prod_{i=1}^{\infty} F_i) = \prod_{i=1}^{\infty} P_i^*(F_i)$. Then, by Theorem 8.2.2 (p. 201) in Dudley (1989), $P^*$ on $\mathcal{R}$ extends uniquely to a probability measure on $\mathcal{F}^*$. Let $\tilde{Y}_i(\omega) = Y_i^*(\pi_i^{-1}(\omega))$ for all $\omega \in \Omega^*$. By the way of their construction, the $\tilde{Y}_i(\omega)$ are random vectors on the probability space $(\Omega^*, \mathcal{F}^*, P^*)$ and $\tilde{Y}_i \equiv Y_i^* \equiv Y_i$. Choose $Y_i$ in (3.3) to be $\tilde{Y}_i$ and we have the desired result. Q.E.D.
2. PROOF OF LEMMA 5: We prove part (b). Then part (a) holds by the same principle. Suppose that $f \in C$ is given. From $X_{n,T} \Rightarrow X$ as $n, T \to \infty$, for any given $\epsilon > 0$, we can choose $n_0$ and $T_1$ such that whenever $n \ge n_0$ and $T \ge T_1$, the following inequality holds:
$$|Ef(X_{n,T}) - Ef(X)| < \frac{\epsilon}{2}. \tag{8.7}$$
From $X_{n,T} \Rightarrow X_n$ as $T \to \infty$ $\forall n$, we can choose $T_2$ depending on $n$ and $\epsilon$ such that
$$|Ef(X_{n,T}) - Ef(X_n)| < \frac{\epsilon}{2} \quad \text{if } T \ge T_2. \tag{8.8}$$
For each $n \ge n_0$ choose $T_2(n, \epsilon)$, and choose a fixed $T_0$ greater than both $T_1$ and $T_2$. Then both (8.7) and (8.8) hold and therefore
$$|Ef(X_n) - Ef(X)| < \epsilon \quad \text{if } n \ge n_0$$
and $X_{n,T} \Rightarrow X$ sequentially as $(T, n \to \infty)_{seq}$. Q.E.D.
3. PROOF OF LEMMA 6: We show part (b). Part (a) can be established by similar arguments. Suppose that $f \in C$ is given. Assume that (3.9) holds. From (3.9) and $X_n \Rightarrow X$ as $n \to \infty$, for any given $\epsilon > 0$, we can choose $n_0$ and $T_0$ such that whenever $n \ge n_0$ and $T \ge T_0$, we have
$$\sup_{n \ge n_0,\, T \ge T_0} |Ef(X_{n,T}) - Ef(X_n)| < \frac{\epsilon}{2},$$
and
$$|Ef(X_n) - Ef(X)| < \frac{\epsilon}{2}.$$
Thus, if $n \ge n_0$ and $T \ge T_0$,
$$|Ef(X_{n,T}) - Ef(X)| \le \sup_{n \ge n_0,\, T \ge T_0} |Ef(X_{n,T}) - Ef(X_n)| + |Ef(X_n) - Ef(X)| < \epsilon.$$
Hence, $X_{n,T} \Rightarrow X$ as $(n, T \to \infty)$.
Now assume that $X_{n,T} \Rightarrow X$ and $X_n \Rightarrow X$ as $(n, T \to \infty)$. The necessity of the condition follows because
$$\limsup_{n,T} |Ef(X_{n,T}) - Ef(X_n)| \le \limsup_{n,T} |Ef(X_{n,T}) - Ef(X)| + \limsup_{n} |Ef(X_n) - Ef(X)| = 0.$$
Q.E.D.
8 For Skorohod's theorem on function spaces refer to the representation theorem in Pollard (1984) or Theorem 4 on p. 47 in Shorack and Wellner (1986).
n, T
Before starting the proof of Theorem 1 we give the following lemma.
LEMMA 12: Suppose Yi, T are independent across i. Assume that Yi, T « Yi as T ª ⬁ for all i. Then,
lim sup n ª ⬁Ž1rn.Ý nis1 E < Yi, T < - ⬁ implies lim sup n ª ⬁Ž1rn.Ý nis1 E < Yi < - ⬁.
PROOF: Note that, since < Yi, T < « < Yi < as T ª ⬁ by the continuous mapping theorem, it follows that
E < Yi < F lim inf T E < Yi, T < Žsee Theorem 5.3 in Billingsley Ž1968... Thus,
lim sup
nª⬁
1
n
n
1
n
sup lim inf Ý E < Yi , T <
Ý E < Yi < F lim
Tª⬁ n
nª⬁
is1
is1
F lim sup
n, Tª⬁
1
n
n
Ý E < Yi , T < - ⬁.
Q. E. D.
is1
4. PROOF OF THEOREM 1: Part (b) follows easily from Lemma 6 and part (a). In particular, from the assumptions of the theorem we know that $X_{n,T} = (1/n)\sum_{i=1}^{n} Y_{i,T} \Rightarrow X_n = (1/n)\sum_{i=1}^{n} Y_i$ as $T \to \infty$ for all $n$ and $X_n = (1/n)\sum_{i=1}^{n} Y_i \to_p \tilde\mu_X = \lim_n (1/n)\sum_{i=1}^{n} EY_i$. Then, since condition (3.9) holds from part (a), the desired result $X_{n,T} \to_p \tilde\mu_X$ as $(n, T \to \infty)$ follows by Lemma 6.
Now, we prove part (a). First, we establish condition (3.9) in the scalar case. It is sufficient for condition (3.9) to restrict $C$ to $C^\infty$, the class of all the bounded, continuous real functions with bounded, continuous derivatives of all orders (see Theorem 7.1 in Billingsley (1968) or Theorem 12 in Pollard (1984)). Without loss of generality, let the functions be such that $|f^{(k)}(x)| \le 1$, $\forall k$, where $f^{(k)}(x)$ denotes the $k$th derivative function of $f(x)$.
Before proceeding, we need to ensure that the probability space on which the variates are defined is large enough to permit the arguments that follow. Limits such as $X_{n,T} = (1/n)\sum_{i=1}^{n} Y_{i,T} \Rightarrow X_n = (1/n)\sum_{i=1}^{n} Y_i$ as $T \to \infty$ involve the joint distributions of the random vectors $(Y_{1,T}, \dots, Y_{n,T})'$ and $(Y_1, \dots, Y_n)'$, not any properties of the probability space on which they are defined. However, we need to ensure that we can relate these variates on the same space. This can be accomplished by passing to a new probability space, using Skorohod's Theorem in $\mathbb{R}^m$ (e.g., Theorem 29.6 in Billingsley (1986)), on which are defined new random variables $(\tilde{Y}_{1,T}, \dots, \tilde{Y}_{n,T}, \tilde{Y}_1, \dots, \tilde{Y}_n)'$ such that $\tilde{Y}_{i,T} \equiv Y_{i,T}$ and $\tilde{Y}_i \equiv Y_i$ for all $i$ and the $2n$ random variables $\tilde{Y}_{1,T}, \dots, \tilde{Y}_{n,T}, \tilde{Y}_1, \dots, \tilde{Y}_n$ are independent. Without loss of generality, we can assume that $\tilde{Y}_{i,T} = Y_{i,T}$ and $\tilde{Y}_i = Y_i$ for all $i$ and $T$.9
9 As in Appendix B(1) above, we can construct an infinite dimensional probability space where the two independent random vectors $(Y_{1,T}, \dots, Y_{n,T})'$ and $(Y_1, \dots, Y_n)'$ coexist. However, the argument given is enough for the proof that follows.
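Now we can define the quantities $\zeta_{k,n,T} = \sum_{k > i \ge 1} Y_{i,T} + \sum_{k < i \le n} Y_i$, for $1 \le k \le n$, all on the same probability space. By virtue of the definitions of $X_{n,T}$, $X_n$, and $\zeta_{k,n,T}$, we have
$$\begin{aligned}
f\Bigl(\frac{1}{n}\sum_{i=1}^{n} Y_{i,T}\Bigr) - f\Bigl(\frac{1}{n}\sum_{i=1}^{n} Y_i\Bigr) &= f\Bigl(\frac{1}{n}(\zeta_{n,n,T} + Y_{n,T})\Bigr) - f\Bigl(\frac{1}{n}(\zeta_{1,n,T} + Y_1)\Bigr) \\
&= \sum_{k=1}^{n}\Bigl\{f\Bigl(\frac{1}{n}(\zeta_{k,n,T} + Y_{k,T})\Bigr) - f\Bigl(\frac{1}{n}(\zeta_{k,n,T} + Y_k)\Bigr)\Bigr\}.
\end{aligned}$$
It follows that
$$\begin{aligned}
&\limsup_{n,T \to \infty} |Ef(X_{n,T}) - Ef(X_n)| \\
&\quad = \limsup_{n,T \to \infty}\Bigl|Ef\Bigl(\frac{1}{n}\sum_{i=1}^{n} Y_{i,T}\Bigr) - Ef\Bigl(\frac{1}{n}\sum_{i=1}^{n} Y_i\Bigr)\Bigr| \\
&\quad = \limsup_{n,T \to \infty}\Bigl|\sum_{k=1}^{n} E\Bigl\{f\Bigl(\frac{1}{n}(\zeta_{k,n,T} + Y_{k,T})\Bigr) - f\Bigl(\frac{1}{n}\zeta_{k,n,T}\Bigr) + f\Bigl(\frac{1}{n}\zeta_{k,n,T}\Bigr) - f\Bigl(\frac{1}{n}(\zeta_{k,n,T} + Y_k)\Bigr)\Bigr\}\Bigr|.
\end{aligned} \tag{8.9}$$
Let $g(h) = \sup_x |f(x + h) - f(x) - f'(x)h|$. Take $x = \zeta_{k,n,T}/n$ and $h = Y_{k,T}/n$ in the case of $f((1/n)(\zeta_{k,n,T} + Y_{k,T})) - f((1/n)\zeta_{k,n,T})$, and take $x = \zeta_{k,n,T}/n$ and $h = Y_k/n$ in the case of $f((1/n)(\zeta_{k,n,T} + Y_k)) - f((1/n)\zeta_{k,n,T})$. By the triangle inequality, it follows that (8.9) is bounded above by
$$\limsup_{n,T \to \infty}\Bigl|\sum_{i=1}^{n} E\Bigl\{f'\Bigl(\frac{\zeta_{i,n,T}}{n}\Bigr)\Bigl(\frac{Y_{i,T}}{n} - \frac{Y_i}{n}\Bigr)\Bigr\}\Bigr| + \limsup_{n,T \to \infty}\sum_{i=1}^{n} Eg\Bigl(\frac{Y_{i,T}}{n}\Bigr) + \limsup_{n,T \to \infty}\sum_{i=1}^{n} Eg\Bigl(\frac{Y_i}{n}\Bigr). \tag{8.10}$$
By the triangle inequality, the first term in (8.10) is less than
$$\begin{aligned}
\limsup_{n,T \to \infty}\sum_{i=1}^{n}\Bigl|E\Bigl\{f'\Bigl(\frac{\zeta_{i,n,T}}{n}\Bigr)\Bigl(\frac{Y_{i,T}}{n} - \frac{Y_i}{n}\Bigr)\Bigr\}\Bigr| &= \limsup_{n,T \to \infty}\sum_{i=1}^{n}\Bigl|Ef'\Bigl(\frac{\zeta_{i,n,T}}{n}\Bigr)\Bigr|\,\Bigl|E\Bigl(\frac{Y_{i,T}}{n} - \frac{Y_i}{n}\Bigr)\Bigr| \\
&\le \limsup_{n,T \to \infty}\sum_{i=1}^{n}\Bigl|E\Bigl(\frac{Y_{i,T}}{n} - \frac{Y_i}{n}\Bigr)\Bigr| \\
&= 0.
\end{aligned}$$
The first line above uses the fact that $\zeta_{i,n,T}/n$, $Y_{i,T}/n$ and $Y_i/n$ are independent, the inequality in the second line holds because $|f'| \le 1$, and the third line follows directly from condition (ii).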
For the second term in (8.10), note by the mean value theorem that $g(h) \le M_1\min\{|h|, h^2\}$ for some constant $M_1$ which depends on $f$ alone. Then, for any $\epsilon > 0$
$$\begin{aligned}
\limsup_{n,T \to \infty}\sum_{i=1}^{n} Eg\Bigl(\frac{Y_{i,T}}{n}\Bigr) &\le \limsup_{n,T \to \infty}\sum_{i=1}^{n} E\Bigl[g\Bigl(\frac{Y_{i,T}}{n}\Bigr)1\Bigl\{\Bigl|\frac{Y_{i,T}}{n}\Bigr| \le \epsilon\Bigr\}\Bigr] + \limsup_{n,T \to \infty}\sum_{i=1}^{n} E\Bigl[g\Bigl(\frac{Y_{i,T}}{n}\Bigr)1\Bigl\{\Bigl|\frac{Y_{i,T}}{n}\Bigr| > \epsilon\Bigr\}\Bigr] \\
&\le \epsilon M_1^2\limsup_{n,T \to \infty}\sum_{i=1}^{n} E\Bigl|\frac{Y_{i,T}}{n}\Bigr| + M_1\limsup_{n,T \to \infty}\sum_{i=1}^{n} E\Bigl[\Bigl|\frac{Y_{i,T}}{n}\Bigr|1\Bigl\{\Bigl|\frac{Y_{i,T}}{n}\Bigr| > \epsilon\Bigr\}\Bigr] \\
&= \epsilon M_2,
\end{aligned}$$
where the first inequality holds by applying $g(h) \le M_1h^2$ on $1\{|Y_{i,T}/n| \le \epsilon\}$ and $g(h) \le M_1|h|$ on $1\{|Y_{i,T}/n| > \epsilon\}$ and the last inequality holds by conditions (i) and (iii) with $M_2 = M_1^2\limsup_{n,T}\sum_{i=1}^{n} E|Y_{i,T}/n|$.
By Lemma 12, condition (i) implies
$$\limsup_{n,T} \frac{1}{n}\sum_{i=1}^{n} E|Y_i| < \infty,$$
and by condition (iv) we have
$$\limsup_{n,T} \frac{1}{n}\sum_{i=1}^{n} E\bigl[|Y_i|1\{|Y_i| > \epsilon n\}\bigr] < \infty.$$
Thus, applying the same arguments as those used for $\limsup_{n,T}\sum_{i=1}^{n} Eg(Y_{i,T}/n)$ to $\limsup_{n,T}\sum_{i=1}^{n} Eg(Y_i/n)$, we have $\limsup_{n,T}\sum_{i=1}^{n} Eg(Y_i/n) = 0$. It follows from (8.9) and (8.10) that condition (3.9) holds.
When the $Y_{i,T}$ are $m$-vectors, the Cramér-Wold device can be used. That is, using the above argument, we obtain $s'X_{n,T} \to_p s'\tilde\mu_X$ as $(n, T \to \infty)$ $\forall s \in \mathbb{R}^m$, and it follows that $X_{n,T} \to_p \tilde\mu_X$ as $n, T \to \infty$. Q.E.D.
5. PROOF OF COROLLARY 1: Define X n, T s Ž1rn.Ý nis1Yi, T s Ž1rn.Ý nis1Ci Q i, T and X n s
Ž1rn.Ý nis 1Yi s Ž1rn.Ý nis1Ci Q i . Assume sup i 5 Ci 5 ) 0, for if this does not hold, the result is trivial.
We know that X n, T « X as T ª ⬁ for all n by the conditions in the corollary. By assumption the
Q i, T are uniformly integrable and Q i, T « Q i , so E 5 Q i 5 - ⬁. Also, C s lim nŽ1rn.Ý nis1Ci exists, so we
have X n ªp CEŽ Q i . as n ª ⬁. Hence, if we establish conditions Ži. ᎐ Živ. of Theorem 1, then
X n, T ªp CEŽ Q i . as Ž n, T ª ⬁..
By the uniform integrability of 5 Q i, T 5 and sup i 5 Ci 5 - ⬁, we have
lim sup
n, T
1
n
n
Ý E 5 Yi , T 5 F ž sup 5 Ci 5 / sup E 5 Qi , T 5 - ⬁,
is1
i
T
verifying condition Ži., and
lim sup
n, T
1
n
n
Ý 5 EYi , T y EYi 5 F ž sup 5 Ci 5 / lim sup 5 EQi , T y EQi 5 s 0.
is1
i
T
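Condition (iii) is satisfied since
\[
\frac{1}{n} \sum_{i=1}^{n} E\left[ \|Y_{i,T}\|\, 1\{\|Y_{i,T}\| > n\varepsilon\} \right]
\le \Big( \sup_i \|C_i\| \Big) \sup_T E\left[ \|Q_{i,T}\|\, 1\!\left\{ \|Q_{i,T}\| > \frac{n\varepsilon}{\sup_i \|C_i\|} \right\} \right],
\tag{8.11}
\]
which converges to zero as $n\to\infty$, again by virtue of uniform integrability and $\sup_i \|C_i\| < \infty$.
Condition (iv) of Theorem 1 holds because
\[
\frac{1}{n} \sum_{i=1}^{n} E\left[ \|Y_i\|\, 1\{\|Y_i\| > n\varepsilon\} \right]
\le \Big( \sup_i \|C_i\| \Big) E\left[ \|Q_i\|\, 1\!\left\{ \|Q_i\| > \frac{n\varepsilon}{\sup_i \|C_i\|} \right\} \right] \to 0
\]
by $\sup_i \|C_i\| < \infty$ and dominated convergence since $E\|Q_i\| < \infty$.
Q.E.D.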
6. PROOF OF THEOREM 2: The proof follows that of Lindeberg's theorem given in Billingsley (1968, Theorem 7.2). The only change is that the additional index $T$ appears in the component variates $\xi_{i,n,T}$ and limits are taken as $(n,T\to\infty)$. The fact that $T$ passes to infinity with $n$ is incidental to the main argument. For example, we still have
\[
\frac{\Omega_{i,T}}{s_{n,T}^2} \le \varepsilon^2 + E\left[ \xi_{i,n,T}^2\, 1\{|\xi_{i,n,T}| > \varepsilon\} \right]
\]
and, as a consequence of the Lindeberg condition (3.20),
\[
\max_{i \le n} \frac{\Omega_{i,T}}{s_{n,T}^2} \to 0 \quad \text{as } (n,T\to\infty).
\]
Q.E.D.
7. PROOF OF THEOREM 3: Define
\[
\xi_{i,n,T} = \Omega_{n,T}^{-1/2} C_i Q_{i,T},
\]
where $\Omega_{n,T} = \sum_{i=1}^{n} C_i \Sigma_T C_i'$. By the Cramér-Wold device, $\sum_{i=1}^{n} \xi_{i,n,T} \Rightarrow N(0, I_m)$ as $(n,T\to\infty)$, if for all $t \in \mathbb{R}^m$ with $\|t\| = 1$
\[
t' \sum_{i=1}^{n} \xi_{i,n,T} \Rightarrow N(0,1) \quad \text{as } (n,T\to\infty).
\tag{8.12}
\]
Then, by condition (iv), $(1/\sqrt{n})\sum_{i=1}^{n} Y_{i,T} \Rightarrow N(0,\Omega)$ as $(n,T\to\infty)$.
To establish (8.12), it is sufficient to verify condition (3.20). For given $\varepsilon > 0$ and $t \in \mathbb{R}^m$ with $\|t\| = 1$, we have
\[
\begin{aligned}
& t' \sum_{i=1}^{n} E\left[ \xi_{i,n,T}\xi_{i,n,T}'\, 1\{ |t'\xi_{i,n,T}\xi_{i,n,T}'t| > \varepsilon \} \right] t \\
&\quad = t'\Omega_{n,T}^{-1/2} \sum_{i=1}^{n} E\left[ C_i Q_{i,T} Q_{i,T}' C_i'\, 1\{ |t'\Omega_{n,T}^{-1/2} C_i Q_{i,T} Q_{i,T}' C_i' \Omega_{n,T}^{-1/2} t| > \varepsilon \} \right] \Omega_{n,T}^{-1/2} t.
\end{aligned}
\tag{8.13}
\]
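Take the indicator function first. Note that
\[
\begin{aligned}
1\{ |t'\Omega_{n,T}^{-1/2} C_i Q_{i,T} Q_{i,T}' C_i' \Omega_{n,T}^{-1/2} t| > \varepsilon \}
&\le 1\Big\{ \max_{\|t\|=1} |t'\Omega_{n,T}^{-1/2} C_i Q_{i,T} Q_{i,T}' C_i' \Omega_{n,T}^{-1/2} t| > \varepsilon \Big\} \\
&= 1\{ \lambda_{\max}( \Omega_{n,T}^{-1/2} C_i Q_{i,T} Q_{i,T}' C_i' \Omega_{n,T}^{-1/2} ) > \varepsilon \} \\
&\le 1\Big\{ \lambda_{\max}( \Omega_{n,T}^{-1} ) \Big( \max_{j \le n} \|C_j\|^2 \Big) \|Q_{i,T}\|^2 > \varepsilon \Big\} \\
&= 1\left\{ \|Q_{i,T}\|^2 > \varepsilon \frac{\lambda_{\min}(\Omega_{n,T})}{\max_{j \le n} \|C_j\|^2} \right\} \\
&\le 1\left\{ \|Q_{i,T}\|^2 > \varepsilon \frac{\sigma_T^2 \lambda_{\min}\big( \sum_{j=1}^{n} C_j C_j' \big)}{\max_{j \le n} ( \|C_j\|^2 )} \right\}.
\end{aligned}
\]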
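The third relation in the chain above rests on the bound $\lambda_{\max}(\Omega^{-1/2} C Q Q' C' \Omega^{-1/2}) \le \lambda_{\max}(\Omega^{-1}) \|C\|^2 \|Q\|^2$. A minimal numerical sketch of that inequality follows. The function name `eig_bound`, the dimension, and the randomly drawn matrices are illustrative assumptions and are not objects taken from the paper.

```python
import numpy as np

def eig_bound(C, Q, Omega):
    """Return (lhs, rhs) of lambda_max(Om^{-1/2} C Q Q' C' Om^{-1/2})
    <= lambda_max(Om^{-1}) * ||C||^2 * ||Q||^2 for symmetric positive definite Omega."""
    w, U = np.linalg.eigh(Omega)                      # eigenvalues in ascending order
    om_inv_half = U @ np.diag(w ** -0.5) @ U.T        # symmetric inverse square root
    M = om_inv_half @ C @ np.outer(Q, Q) @ C.T @ om_inv_half
    lhs = float(np.linalg.eigvalsh(M).max())
    # lambda_max(Omega^{-1}) = 1 / lambda_min(Omega); ||C|| is the Frobenius norm
    rhs = float((1.0 / w.min()) * np.linalg.norm(C) ** 2 * (Q @ Q))
    return lhs, rhs

rng = np.random.default_rng(0)
m = 3                                                 # assumed dimension
A = rng.standard_normal((m, m))
Omega = A @ A.T + np.eye(m)                           # assumed positive definite matrix
lhs, rhs = eig_bound(rng.standard_normal((m, m)), rng.standard_normal(m), Omega)
print(lhs <= rhs)
```

The Frobenius norm is used for $\|C\|$, which dominates the spectral norm, so the right-hand side is an upper bound for any draw.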
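Next, expression (8.13) is bounded above by
\[
\begin{aligned}
& \max_{\|t\|=1} t'\Omega_{n,T}^{-1/2} \sum_{i=1}^{n} E\left[ C_i Q_{i,T} Q_{i,T}' C_i'\, 1\left\{ \|Q_{i,T}\|^2 > \varepsilon \frac{\sigma_T^2 \lambda_{\min}\big( \sum_{j=1}^{n} C_j C_j' \big)}{\max_{j \le n} ( \|C_j\|^2 )} \right\} \right] \Omega_{n,T}^{-1/2} t \\
&\quad \le \lambda_{\min}^{-1}( \Omega_{n,T} ) \sum_{i=1}^{n} E\left[ \|C_i Q_{i,T}\|^2\, 1\left\{ \|Q_{i,T}\|^2 > \varepsilon \frac{\sigma_T^2 \lambda_{\min}\big( \sum_{j=1}^{n} C_j C_j' \big)}{\max_{j \le n} ( \|C_j\|^2 )} \right\} \right] \\
&\quad \le \frac{\sum_{i=1}^{n} \|C_i\|^2}{\lambda_{\min}( \Omega_{n,T} )} E\left[ \|Q_{1,T}\|^2\, 1\left\{ \|Q_{1,T}\|^2 > \varepsilon \frac{\sigma_T^2 \lambda_{\min}\big( \sum_{j=1}^{n} C_j C_j' \big)}{\max_{j \le n} ( \|C_j\|^2 )} \right\} \right] \\
&\quad \le \frac{n \max_i \|C_i\|^2}{\sigma_T^2 \lambda_{\min}\big( \sum_{i=1}^{n} C_i C_i' \big)} E\left[ \|Q_{1,T}\|^2\, 1\left\{ \|Q_{1,T}\|^2 > \varepsilon \frac{\sigma_T^2 \lambda_{\min}\big( \sum_{i=1}^{n} C_i C_i' \big)}{\max_{i \le n} ( \|C_i\|^2 )} \right\} \right].
\end{aligned}
\tag{8.14}
\]
By conditions (i) and (ii),
\[
\frac{n \max_i \|C_i\|^2}{\sigma_T^2 \lambda_{\min}\big( \sum_{i=1}^{n} C_i C_i' \big)} = O(1)
\quad \text{and} \quad
\frac{\sigma_T^2 \lambda_{\min}\big( \sum_{i=1}^{n} C_i C_i' \big)}{\max_{i \le n} ( \|C_i\|^2 )} \to \infty,
\]
as $(n,T\to\infty)$. Then, since $\|Q_{i,T}\|^2$ is uniformly integrable in $T$ by condition (iii), it follows that (8.14) $\to 0$ as $(n,T\to\infty)$.
Q.E.D.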
APPENDIX C: PROOFS FOR SECTION 4 - SPURIOUS PANEL REGRESSION LIMIT THEORY
The next lemma gives the joint limit theory needed for Theorem 4.
LEMMA 13: Suppose that Assumptions 1–3 hold.
(a) As $(n,T\to\infty)$,
\[
\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \to_p \tfrac{1}{2} E\Omega_i = \tfrac{1}{2}\Omega.
\]
(b) If $(n,T\to\infty)$ and $n/T \to 0$, then
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} ( Y_{i,t} X_{i,t}' - \beta X_{i,t} X_{i,t}' ) \Rightarrow N(0, \Theta).
\]
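Part (a) of Lemma 13 can be illustrated by a small Monte Carlo sketch. The design below is an assumption made purely for illustration: scalar random walks with iid standard normal increments and zero initial values, so that $\Omega_i = \Omega = 1$ and the predicted limit is $1/2$. The function name `average_scaled_second_moment` is hypothetical.

```python
import numpy as np

def average_scaled_second_moment(n, T, seed=0):
    """Return (1/n) sum_i (1/T^2) sum_t Z_it^2 for n independent scalar random walks."""
    rng = np.random.default_rng(seed)
    # Assumed design: iid N(0,1) increments and Z_i0 = 0, hence Omega = 1.
    Z = np.cumsum(rng.standard_normal((n, T)), axis=1)
    return float(np.mean(np.sum(Z ** 2, axis=1) / T ** 2))

for n, T in [(50, 50), (500, 200), (2000, 400)]:
    # Values should settle near 0.5 as n and T grow together.
    print(n, T, average_scaled_second_moment(n, T))
```

In this design the exact finite-sample mean is $(T+1)/(2T)$, so any visible departure from $1/2$ at large $T$ is cross-section sampling noise that shrinks as $n$ grows.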
PROOF OF LEMMA 13: (a) From the BN decomposition of $U_{i,t}$ in (2.4) we have
\[
Z_{i,t} = C_i(1) P_{i,t} + \tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0} \quad \text{a.s.},
\]
where $P_{i,t} = \sum_{s=1}^{t} V_{i,s}$, which leads to
\[
\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' = \frac{1}{n} \sum_{i=1}^{n} ( Q_{i,T} + R_{i,T} ) \quad \text{a.s.},
\]
where
\[
\begin{aligned}
Q_{i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} C_i(1) P_{i,t} P_{i,t}' C_i(1)', \\
R_{i,T} &= R_{1,i,T} + R_{1,i,T}' + R_{2,i,T}, \\
R_{1,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} C_i(1) P_{i,t} ( \tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0} )',
\end{aligned}
\]
and
\[
R_{2,i,T} = \frac{1}{T^2} \sum_{t=1}^{T} ( \tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0} )( \tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0} )'.
\]
We show that as $(n,T\to\infty)$, $(1/n)\sum_{i=1}^{n} Q_{i,T} \to_p \tfrac{1}{2}\Omega$ and $(1/n)\sum_{i=1}^{n} R_{k,i,T} \to_p 0$, for $k = 1, 2$.
The $Q_{i,T}$ are iid across $i$ for all $T$. Also, as $T\to\infty$, $Q_{i,T} \Rightarrow Q_i = C_i(1) \int W_i W_i' \, C_i(1)'$ and $(1/n)\sum_{i=1}^{n} Q_i \to_{a.s.} \tfrac{1}{2}\Omega$. That is, in sequential asymptotics such as $(T, n\to\infty)_{\mathrm{seq}}$, $(1/n)\sum_{i=1}^{n} Q_{i,T} \to_p \tfrac{1}{2}\Omega$.
According to Corollary 1 (set $C_i = I_m$ so that the second condition is automatically satisfied), if we show that $\|Q_{i,T}\|$ is uniformly integrable in $T$, then it follows that
\[
\frac{1}{n} \sum_{i=1}^{n} Q_{i,T} \to_p \tfrac{1}{2}\Omega \quad \text{as } (n,T\to\infty).
\tag{8.15}
\]
By $\|AB\| \le \|A\| \|B\|$ and the triangle inequality,
\[
\|Q_{i,T}\| \le \|C_i(1)\|^2 \frac{1}{T^2} \sum_{t=1}^{T} \|P_{i,t}\|^2.
\tag{8.16}
\]
Also, as $T\to\infty$,
\[
\frac{1}{T^2} \sum_{t=1}^{T} \|P_{i,t}\|^2 \Rightarrow \int \|W_i\|^2,
\]
and we have
\[
E\left( \frac{1}{T^2} \sum_{t=1}^{T} \|P_{i,t}\|^2 \right) = \operatorname{tr}\left( \frac{1}{T^2} \sum_{t=1}^{T} E( P_{i,t} P_{i,t}' ) \right) \to E\left( \int \|W_i\|^2 \right) = \tfrac{1}{2} \operatorname{tr}( I_m ).
\]
It follows (e.g., Billingsley (1968, Theorem 5.4)) that $(1/T^2)\sum_{t=1}^{T} \|P_{i,t}\|^2$ is uniformly integrable in $T$. Since $E\|C_i(1)\|^2 < \infty$ by Lemma 1, we deduce that $\|C_i(1)\|^2 (1/T^2)\sum_{t=1}^{T} \|P_{i,t}\|^2$ is uniformly integrable in $T$. Thus, $\|Q_{i,T}\|$ is uniformly integrable in $T$, and (8.15) follows.
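The convergence of the expectation to $\tfrac{1}{2}\operatorname{tr}(I_m)$ used above can be checked by direct arithmetic. The sketch below assumes the innovations are serially uncorrelated with mean zero and covariance $I_m$, so that $E(P_{i,t}P_{i,t}') = t I_m$; the function name `expected_scaled_sum_of_squares` is hypothetical.

```python
def expected_scaled_sum_of_squares(m, T):
    """E[(1/T^2) sum_{t=1}^T ||P_t||^2] under the assumption E[P_t P_t'] = t * I_m."""
    # tr(t * I_m) = m * t, and sum_{t=1}^T t = T (T + 1) / 2.
    return m * sum(range(1, T + 1)) / T ** 2

for T in (10, 100, 10_000):
    # Equals m (T + 1) / (2 T), which decreases to m / 2 as T grows.
    print(T, expected_scaled_sum_of_squares(2, T))
```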
Next, $(1/n)\sum_{i=1}^{n} R_{1,i,T}$ and $(1/n)\sum_{i=1}^{n} R_{2,i,T}$ converge in probability to zero if $E\|R_{1,i,T}\|, E\|R_{2,i,T}\| \to 0$ as $(n,T\to\infty)$. Note that
\[
\begin{aligned}
E\|R_{1,i,T}\|
&\le \frac{1}{T^2} \sum_{t=1}^{T} E\left[ \|C_i(1) P_{i,t}\| \, \|\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\| \right] \\
&\le \frac{1}{\sqrt{T}} \frac{1}{T} \sum_{t=1}^{T} \sqrt{ E\left\| C_i(1) \frac{P_{i,t}}{\sqrt{T}} \right\|^2 } \sqrt{ E\|\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\|^2 } \\
&= \frac{1}{\sqrt{T}} O(1),
\end{aligned}
\tag{8.17}
\]
where the first inequality holds by the triangle inequality and $\|AB\| \le \|A\| \|B\|$, the second inequality holds by the Cauchy-Schwarz inequality, and the last line holds because $E\|C_i(1)(P_{i,t}/\sqrt{T})\|^2 = O(1)$, $E\|Z_{i,0}\|^2 < M_1$, and $E\|\tilde{U}_{i,t}\|^2 < M_2$ for all $t$ and for some $M_1, M_2 < \infty$ by Lemma 1(c). Thus, $E\|R_{1,i,T}\| \to 0$ as $(n,T\to\infty)$.
Similar arguments show that $E\|R_{2,i,T}\| = (1/T) O(1)$. So, all the desired results hold and part (a) is proved.
(b) Write $C_i(1) = ( C_{y_i}(1)', C_{x_i}(1)' )'$ and $\tilde{U}_{i,t} = ( \tilde{U}_{y_i,t}', \tilde{U}_{x_i,t}' )'$, conformably with the partition of $Z_{i,t}$ into $Y_{i,t}$ and $X_{i,t}$. Using the BN decomposition of $U_{i,t}$ in (2.4), we have
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} ( Y_{i,t} X_{i,t}' - \beta X_{i,t} X_{i,t}' )
= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} ( Q_{i,T} + R_{1,i,T} + R_{2,i,T} + R_{3,i,T} + R_{4,i,T} + R_{5,i,T} ),
\]
where
\[
\begin{aligned}
Q_{i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \{ C_{y_i}(1) P_{i,t} P_{i,t}' C_{x_i}(1)' - \beta C_{x_i}(1) P_{i,t} P_{i,t}' C_{x_i}(1)' \}, \\
R_{1,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \{ C_{y_i}(1) P_{i,t} ( \tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0} )' \}, \\
R_{2,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \{ ( \tilde{U}_{y_i,0} - \tilde{U}_{y_i,t} + Y_{i,0} ) P_{i,t}' C_{x_i}(1)' \}, \\
R_{3,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \{ ( \tilde{U}_{y_i,0} - \tilde{U}_{y_i,t} + Y_{i,0} )( \tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0} )' \}, \\
R_{4,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \{ \beta C_{x_i}(1) P_{i,t} ( \tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0} )' + \beta ( \tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0} ) P_{i,t}' C_{x_i}(1)' \}, \\
R_{5,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \{ \beta ( \tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0} )( \tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0} )' \}.
\end{aligned}
\]
We show that as $(n,T\to\infty)$
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{i,T} \Rightarrow N(0, \Theta)
\tag{8.18}
\]
and as $(n,T\to\infty)$ with $n/T \to 0$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{k,i,T} \to_p 0 \quad ( k = 1, \ldots, 5 ).
\tag{8.19}
\]
Note that
\[
EQ_{i,T} = \frac{1}{T^2} \sum_{t=1}^{T} t \left( E[ C_{y_i}(1) C_{x_i}(1)' ] - \beta E[ C_{x_i}(1) C_{x_i}(1)' ] \right) = 0.
\]
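Also,
\[
\begin{aligned}
& \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} E\left[ \operatorname{vec}( P_{i,t} P_{i,t}' ) \operatorname{vec}( P_{i,s} P_{i,s}' )' \right] \\
&\quad = \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} E\left[ P_{i,t} P_{i,s}' \otimes P_{i,t} P_{i,s}' \right] \\
&\quad = \frac{1}{T^2} \sum_{t=1}^{T} \sum_{s=1}^{T} \left\{ \left( \frac{t \wedge s}{T} \right)^2 \left( I_{m^2} + K_m + \operatorname{vec} I_m ( \operatorname{vec} I_m )' \right)
 + \left( \frac{t \wedge s}{T} \right) \left( \frac{t \vee s}{T} - \frac{t \wedge s}{T} \right) \operatorname{vec} I_m ( \operatorname{vec} I_m )' \right\} + O\!\left( \frac{1}{T} \right) \\
&\quad = \Xi_T + O\!\left( \frac{1}{T} \right), \quad \text{say},
\end{aligned}
\tag{8.20}
\]
and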
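\[
\begin{aligned}
& E\left( \operatorname{vec}( Q_{i,T} ) \operatorname{vec}( Q_{i,T} )' \right) \\
&\quad = E\Bigg[ \left( C_{x_i}(1) \otimes ( C_{y_i}(1) - \beta C_{x_i}(1) ) \right) \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} E\left[ \operatorname{vec}( P_{i,t} P_{i,t}' ) \operatorname{vec}( P_{i,s} P_{i,s}' )' \right]
 \left( C_{x_i}(1)' \otimes ( C_{y_i}(1) - \beta C_{x_i}(1) )' \right) \Bigg] \\
&\quad = E\left[ \left( C_{x_i}(1) \otimes ( C_{y_i}(1) - \beta C_{x_i}(1) ) \right) \left( \Xi_T + O\!\left( \frac{1}{T} \right) \right) \left( C_{x_i}(1)' \otimes ( C_{y_i}(1) - \beta C_{x_i}(1) )' \right) \right] \\
&\quad = \Theta_T, \quad \text{say}.
\end{aligned}
\tag{8.21}
\]
It is easy to see that $\Theta_T \to \Theta$ as $T\to\infty$. So $\{Q_{i,T}\}_i$ is an iid sequence with mean zero and covariance matrix $\Theta_T$.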
Next, apply Theorem 3 with $C_i = I_{m_y m_x}$ to establish that $(1/\sqrt{n})\sum_{i=1}^{n} Q_{i,T} \Rightarrow N(0,\Theta)$ as $(n,T\to\infty)$. Conditions (i), (ii), and (iv) of the theorem are obviously satisfied in view of the fact that $C_i = I_{m_y m_x}$ and $\Theta_T \to \Theta$ as $T\to\infty$. For the uniform integrability of $\|Q_{i,T}\|^2$, note by the continuous mapping theorem that as $T\to\infty$
\[
\|Q_{i,T}\|^2 \Rightarrow \|Q_i\|^2 = \left\| \int \{ C_{y_i}(1) W_i W_i' C_{x_i}(1)' - \beta C_{x_i}(1) W_i W_i' C_{x_i}(1)' \} \right\|^2.
\]
Then, $\|Q_{i,T}\|^2$ is uniformly integrable in $T$ because
\[
E\|Q_{i,T}\|^2 = \operatorname{tr}\left( E( \operatorname{vec}( Q_{i,T} ) \operatorname{vec}( Q_{i,T} )' ) \right) = \operatorname{tr}( \Theta_T )
\to \operatorname{tr}( \Theta ) = \operatorname{tr}\left( E( \operatorname{vec}( Q_i ) \operatorname{vec}( Q_i )' ) \right) = E\|Q_i\|^2.
\]
Next, to prove $(1/\sqrt{n})\sum_{i=1}^{n} R_{k,i,T} \to_p 0$ as $(n,T\to\infty)$ with $n/T \to 0$, we simply show that $\sqrt{n}\, E\|R_{k,i,T}\| \to 0$ as $(n,T\to\infty)$ with $n/T \to 0$ for $k = 1, \ldots, 5$. Note that
\[
\sqrt{n}\, E\|R_{1,i,T}\| \le \sqrt{n}\, \frac{1}{T^2} \sum_{t=1}^{T} E\left\{ \|C_{y_i}(1) P_{i,t}\| \, \|\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0}\| \right\}
= \sqrt{\frac{n}{T}}\, O(1) \to 0,
\]
where the first inequality holds by the triangle inequality and $\|AB\| \le \|A\| \|B\|$ and the last line holds in view of (8.17) above. Similarly we can show that $\sqrt{n}\, E\|R_{2,i,T}\|, \sqrt{n}\, E\|R_{4,i,T}\| = \sqrt{n/T}\, O(1)$ and $\sqrt{n}\, E\|R_{3,i,T}\|, \sqrt{n}\, E\|R_{5,i,T}\| = (\sqrt{n}/T)\, O(1)$. So we have the desired limits and part (b) follows.
Q.E.D.
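PROOF OF THEOREM 4: By Lemma 13(a), it is easy to see that as $(n,T\to\infty)$
\[
\hat{\beta}_{n,T} = \left( \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Y_{i,t} X_{i,t}' \right) \left( \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}' \right)^{-1}
\to_p \Omega_{yx} \Omega_{xx}^{-1} = \beta.
\]
Also, when $(n,T\to\infty)$ with $n/T \to 0$, from Lemma 13(a) and (b) we have
\[
\begin{aligned}
\sqrt{n} ( \hat{\beta}_{n,T} - \beta )
&= \left( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} ( Y_{i,t} X_{i,t}' - \beta X_{i,t} X_{i,t}' ) \right) \left( \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}' \right)^{-1} \\
&\Rightarrow N\left( 0, 4 ( \Omega_{xx}^{-1} \otimes I_{m_y} ) \Theta ( \Omega_{xx}^{-1} \otimes I_{m_y} ) \right),
\end{aligned}
\]
giving the required result.
Q.E.D.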
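The consistency statement $\hat{\beta}_{n,T} \to_p \Omega_{yx}\Omega_{xx}^{-1}$ can be illustrated with a small simulation. The sketch below rests on assumptions chosen only for illustration: scalar $Y$ and $X$, iid Gaussian increments with long-run covariance $\Omega = \begin{pmatrix} 1 & \omega_{yx} \\ \omega_{yx} & 1 \end{pmatrix}$, and zero initial values, so that no unit is cointegrated and the long-run average coefficient is $\beta = \omega_{yx}$. The function name `pooled_spurious_beta` is hypothetical.

```python
import numpy as np

def pooled_spurious_beta(n, T, omega_yx=0.5, seed=0):
    """Pooled least squares coefficient of Y on X across n non-cointegrated units."""
    rng = np.random.default_rng(seed)
    ux = rng.standard_normal((n, T))
    # Increments of Y are correlated with those of X (assumed Omega_yx = omega_yx),
    # but Y - beta * X is itself a random walk, so no unit is cointegrated.
    uy = omega_yx * ux + np.sqrt(1.0 - omega_yx ** 2) * rng.standard_normal((n, T))
    X = np.cumsum(ux, axis=1)
    Y = np.cumsum(uy, axis=1)
    # The common 1/(n T^2) scaling cancels between numerator and denominator.
    return float(np.sum(Y * X) / np.sum(X * X))

print(pooled_spurious_beta(n=400, T=200))   # should be close to omega_yx = 0.5
```

With a single unit (`n=1`) the same ratio is a spurious regression coefficient that does not settle down as `T` grows; pooling across many independent units is what averages out the noise.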
8.4. APPENDIX D: PROOFS FOR SECTION 5.1 - HETEROGENEOUS PANEL COINTEGRATION LIMIT THEORY
The following two lemmas give some useful results on the existence of moments of the heterogeneous coefficients $\beta_i$ in (5.2) and the random coefficients in the linear process representation (5.3). Both results are proved in PM b.
LEMMA 14: Under Assumptions 4, 5, and 7, $E\|\beta_i\|^8 < \infty$.
LEMMA 15: Let $G_i(1) = \sum_{s=0}^{\infty} G_{i,s}$, $\tilde{G}_{i,s} = \sum_{t=s+1}^{\infty} G_{i,t}$, and $\tilde{F}_{i,t} = \sum_{s=0}^{\infty} \tilde{G}_{i,s} V_{i,t-s}$. Suppose that Assumptions 4–7 hold. Then:
(a) $E\big( \sum_{s=0}^{\infty} s^2 \|G_{i,s}\|^2 \big) < \infty$.
(b) $E\|F_{i,t}\|^2 < M$ for some constant $M < \infty$.
(c) $E\|G_i(1)\|^4 < \infty$.
(d) $E\|\tilde{F}_{i,t}\|^4 < M$ for some constant $M < \infty$.
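The quantities $G(1)$, $\tilde{G}_s$, and $\tilde{F}_t$ defined in Lemma 15 are the ingredients of the BN decomposition $F_t = G(1)V_t + \tilde{F}_{t-1} - \tilde{F}_t$. A minimal numerical check of that identity is sketched below for a scalar, finite-order moving average; the coefficient values and the function name `bn_decomposition_gap` are illustrative assumptions.

```python
import numpy as np

def bn_decomposition_gap(G, V):
    """Largest absolute gap in F_t = G(1) V_t + Ftilde_{t-1} - Ftilde_t
    for the scalar process F_t = sum_{s=0}^{q} G[s] V[t-s]."""
    G = np.asarray(G, dtype=float)
    V = np.asarray(V, dtype=float)
    q = len(G) - 1
    G1 = G.sum()                                                  # G(1)
    G_tilde = np.array([G[s + 1:].sum() for s in range(q + 1)])   # sum_{k > s} G[k]
    # mode="valid" keeps only t >= q, where every lag V[t-s] is available;
    # entry j of each output corresponds to time t = q + j.
    F = np.convolve(V, G, mode="valid")
    F_tilde = np.convolve(V, G_tilde, mode="valid")
    lhs = F[1:]                                                   # t = q+1, ..., len(V)-1
    rhs = G1 * V[q + 1:] + F_tilde[:-1] - F_tilde[1:]
    return float(np.max(np.abs(lhs - rhs)))

rng = np.random.default_rng(0)
print(bn_decomposition_gap([1.0, 0.5, 0.25, 0.125], rng.standard_normal(500)))
```

The gap should be at the level of floating-point rounding, since the identity is exact for any coefficient sequence.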
2. PROOF OF LEMMA 7: As in the proof of Lemma 3, since $\{\tilde{F}_{i,t}\}_t$ is strictly stationary and $\tilde{F}_{i,t}$ is square integrable from Lemma 15(d), it follows that
\[
\sup_{0 \le r \le 1} \frac{1}{\sqrt{T}} \tilde{F}_{a,i,[Tr]} \to_p 0 \quad \text{as } T\to\infty \text{ for all } a, i,
\tag{8.22}
\]
where $\tilde{F}_{a,i,[Tr]}$ is the $a$th element of $\tilde{F}_{i,[Tr]}$. The functional law follows directly from (5.5) and Donsker's theorem applied to $(1/\sqrt{T})\sum_{t=1}^{[Tr]} V_{i,t}$.
Q.E.D.
3. PROOF OF LEMMA 8: The proof follows the same lines as Phillips (1988) and is omitted here (details are available in PM b).
Q.E.D.
4. PROOF OF THEOREM 5: The proof follows similar lines to that of Lemma 13 and Theorem 4 above but makes use of the bounds established in Lemmas 14 and 15 and the panel BN decomposition (5.4). The details are lengthy and to save space they are not repeated here. They are given in full in PM b.
Q.E.D.
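5. PROOF OF THEOREM 6: The proof proceeds by showing that as $(n,T\to\infty)$ with $n/T \to 0$,
\[
\hat{\Theta} = \frac{1}{n} \sum_{i=1}^{n} \left\{ \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} X_{i,t} X_{i,s}' \otimes \hat{E}_{i,t} \hat{E}_{i,s}' \right\} \to_p \Theta,
\]
and
\[
\hat{\Omega}_{xx}^{-1} = \left[ \frac{1}{n} \sum_{i=1}^{n} \left\{ \frac{2}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}' \right\} \right]^{-1} \to_p \Omega_{xx}^{-1}.
\]
Then, by Theorem 5 and the delta method, the proof is complete. From Lemma 13, we know that $(1/n)\sum_{i=1}^{n} \{ (2/T^2)\sum_{t=1}^{T} X_{i,t} X_{i,t}' \} \to_p \Omega_{xx}$ as $(n,T\to\infty)$. In consequence, $\hat{\Omega}_{xx}^{-1} \to_p \Omega_{xx}^{-1}$ as $(n,T\to\infty)$ since $\Omega_{xx} > 0$. Again, full details are available in PM b.
Q.E.D.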
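6. PROOF OF THEOREM 7: Under the null hypothesis, using the cross section independence and applying Theorem 5, we have
\[
\sqrt{n_b} ( \hat{\beta}_a - \hat{\beta}_b ) = \sqrt{n_a} ( n_b / n_a )^{1/2} ( \hat{\beta}_a - \beta_a ) - \sqrt{n_b} ( \hat{\beta}_b - \beta_b ) \Rightarrow N( 0, \kappa V_a + V_b ),
\]
when $(T, n_a, n_b \to\infty)$ and $n_a/T, n_b/T \to 0$, and where $V_\mu = 4 ( \Omega_{\mu,xx}^{-1} \otimes I_{m_x} ) \Theta_\mu ( \Omega_{\mu,xx}^{-1} \otimes I_{m_x} )$ for $\mu = a, b$.
As in the proof of Theorem 6, we can show that as $(T, n_a, n_b \to\infty)$ with $n_a/T, n_b/T \to 0$, we have
\[
\hat{\Theta}_\mu = \frac{1}{n_\mu} \sum_{i \in I_\mu} \left\{ \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} X_{i,t} X_{i,s}' \otimes \hat{E}_{i,t} \hat{E}_{i,s}' \right\} \to_p \Theta_\mu,
\]
and
\[
\hat{\Omega}_{\mu,xx}^{-1} = \left[ \frac{1}{n_\mu} \sum_{i \in I_\mu} \left\{ \frac{2}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}' \right\} \right]^{-1} \to_p \Omega_{\mu,xx}^{-1}.
\]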
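Consequently, $\hat{V}_\mu \to_p V_\mu$, and $\hat{V}_{a-b} = ( n_b / n_a ) \hat{V}_a + \hat{V}_b \to_p \kappa V_a + V_b$. It follows that
\[
W_{a,b} = n_b \{ \operatorname{vec}( \hat{\beta}_a - \hat{\beta}_b ) \}' \hat{V}_{a-b}^{-1} \{ \operatorname{vec}( \hat{\beta}_a - \hat{\beta}_b ) \} \Rightarrow \chi^2_{m_y m_x},
\]
giving the required result.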
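The arithmetic of the statistic $W_{a,b}$ can be sketched as follows. The function name `wald_subgroup_statistic` and the numerical inputs are illustrative assumptions; in practice the inputs would be the subgroup estimates $\hat{\beta}_\mu$ and the variance estimates $\hat{V}_\mu$ described in the proof.

```python
import numpy as np

def wald_subgroup_statistic(beta_a, beta_b, V_a, V_b, n_a, n_b):
    """W = n_b * vec(d)' [ (n_b/n_a) V_a + V_b ]^{-1} vec(d), with d = beta_a - beta_b."""
    diff = np.asarray(beta_a, dtype=float) - np.asarray(beta_b, dtype=float)
    d = diff.reshape(-1, order="F")           # vec() stacks the columns
    V_ab = (n_b / n_a) * np.asarray(V_a, dtype=float) + np.asarray(V_b, dtype=float)
    # Solve the linear system rather than forming the inverse explicitly.
    return float(n_b * d @ np.linalg.solve(V_ab, d))

# Hypothetical scalar example: m_y = m_x = 1, so W is compared with chi-square(1).
W = wald_subgroup_statistic([[1.0]], [[0.9]], [[2.0]], [[1.0]], n_a=200, n_b=100)
print(W)
```

Under the null the statistic would be compared with a $\chi^2_{m_y m_x}$ critical value; the degrees of freedom equal the length of `d`.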
APPENDIX E: PROOFS FOR SECTION 5.2 - HOMOGENEOUS PANEL COINTEGRATION LIMIT THEORY
Before we start the proof of Theorem 9, we give the following useful lemma.
LEMMA 16: Let $F_{i,t} = ( E_{i,t}', U_{x_i,t}' )' = \sum_{s=0}^{\infty} G_s V_{i,t-s}$ be the panel process defined in Model (5.11). Also, let $S_{i,t} = \sum_{s=1}^{t} F_{i,s} + S_{i,0}$, where $S_{i,0}$ are iid with $E\|S_{i,0}\|^4 < \infty$, $\Omega_F = G(1) G(1)'$, $\Lambda_F = \sum_{k=0}^{\infty} \sum_{s=0}^{\infty} G_{s+k} G_s'$, and $G(1) = \sum_{s=0}^{\infty} G_s$. Then, under the summability condition Assumption 9 and positive definiteness condition Assumption 10, as $(n,T\to\infty)$ with $n/T \to 0$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} F_{i,t} S_{i,t}' - \Lambda_F \right) \Rightarrow N\left( 0, \tfrac{1}{2} \Omega_F \otimes \Omega_F \right).
\]
PROOF OF LEMMA 16: Using the BN decomposition as in the proof of Lemma 8, we have
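\[
\begin{aligned}
& \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} F_{i,t} S_{i,t}' - \Lambda_F \right) \\
&\quad = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} G(1) ( V_{i,t} P_{i,t}' - I_m ) G(1)' \right)
 + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T-1} \tilde{F}_{i,t} F_{i,t+1}' - \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \right) \\
&\qquad - \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( G(1) V_{i,t} \tilde{F}_{i,t}' - G(1) \tilde{G}_0' ) \right)
 + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} G(1) V_{i,t} ( \tilde{F}_{i,0} + S_{i,0} )' \right) \\
&\qquad - \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T} \tilde{F}_{i,T} S_{i,T}' + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T} \tilde{F}_{i,0} S_{i,1}' \quad \text{a.s.} \\
&\quad = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} ( Q_{i,T} + R_{1,i,T} + R_{2,i,T} + R_{3,i,T} + R_{4,i,T} + R_{5,i,T} ) + O\!\left( \sqrt{\frac{n}{T}} \right) \quad \text{a.s., say}.
\end{aligned}
\]
We show that $(1/\sqrt{n})\sum_{i=1}^{n} Q_{i,T} \Rightarrow N(0, \tfrac{1}{2} \Omega_F \otimes \Omega_F)$, and $(1/\sqrt{n})\sum_{i=1}^{n} R_{k,i,T} \to_p 0$, $k = 1, \ldots, 5$, as $(n,T\to\infty)$ with $n/T \to 0$.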
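Note that
\[
\begin{aligned}
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{i,T}
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=2}^{T} G(1) V_{i,t} P_{i,t-1}' G(1)' \right)
 + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} G(1) ( V_{i,t} V_{i,t}' - I_m ) G(1)' \right) \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} ( Q_{1,i,T} + Q_{2,i,T} ), \quad \text{say}.
\end{aligned}
\]
Since $E\big( (1/T)\sum_{t=1}^{T} G(1) ( V_{i,t} V_{i,t}' - I_m ) G(1)' \big) = 0$, we have
\[
\begin{aligned}
E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{2,i,T} \right\|^2
&= E\left\| \frac{1}{T} \sum_{t=1}^{T} G(1) ( V_{i,t} V_{i,t}' - I_m ) G(1)' \right\|^2 \\
&\le \|G(1)\|^4 \operatorname{tr}\left\{ \frac{1}{T^2} \sum_{t=1}^{T} \sum_{s=1}^{T} E( V_{i,t} V_{i,s}' \otimes V_{i,t} V_{i,s}' ) - \operatorname{vec}( I_m ) ( \operatorname{vec}( I_m ) )' \right\} \\
&= O\!\left( \frac{1}{T} \right).
\end{aligned}
\]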
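Thus, $(1/\sqrt{n})\sum_{i=1}^{n} Q_{2,i,T} = o_p(1)$. Next, observe that
\[
\begin{aligned}
E\left[ \operatorname{vec}( Q_{1,i,T} ) ( \operatorname{vec}( Q_{1,i,T} ) )' \right]
&= \frac{1}{T^2} \sum_{t=2}^{T} \sum_{s=2}^{T} ( G(1) \otimes G(1) ) E( P_{i,t-1} P_{i,s-1}' \otimes V_{i,t} V_{i,s}' ) ( G(1)' \otimes G(1)' ) \\
&= \frac{1}{T} \sum_{t=2}^{T} \left( \frac{t-1}{T} \right) ( G(1) G(1)' \otimes G(1) G(1)' ) \equiv \Xi_T^{*}, \quad \text{say}, \\
&\to \tfrac{1}{2} ( G(1) G(1)' \otimes G(1) G(1)' ) = \tfrac{1}{2} ( \Omega_F \otimes \Omega_F ) \equiv \Xi^{*}, \quad \text{say}.
\end{aligned}
\]
Also, note that $\tfrac{1}{2} ( \Omega_F \otimes \Omega_F ) > 0$. These verify conditions (i), (ii), and (iv) of Theorem 3. Condition (iii) of Theorem 3 holds because
\[
\|Q_{i,T}\|^2 \Rightarrow \|Q_i\|^2 = \left\| G(1) \int dW_i \, W_i' \, G(1)' \right\|^2
\]
and
\[
E\|Q_{i,T}\|^2 = \operatorname{tr}( \Xi_T^{*} ) \to \operatorname{tr}( \Xi^{*} ) = E\|Q_i\|^2,
\]
so that the $\|Q_{i,T}\|^2$ are uniformly integrable in $T$. By Theorem 3, $(1/\sqrt{n})\sum_{i=1}^{n} Q_{1,i,T} \Rightarrow N(0, \tfrac{1}{2} \Omega_F \otimes \Omega_F)$.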
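Next, we show that $(1/\sqrt{n})\sum_{i=1}^{n} R_{1,i,T} \to_p 0$ by proving $E\|(1/\sqrt{n})\sum_{i=1}^{n} R_{1,i,T}\|^2 \to 0$ as $(n,T\to\infty)$. Note that
\[
\begin{aligned}
& E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{1,i,T} \right\|^2
 = \operatorname{tr}\left[ E\left( \operatorname{vec}( R_{1,i,T} ) ( \operatorname{vec}( R_{1,i,T} ) )' \right) \right] \quad \text{since } E( R_{1,i,T} ) = 0 \\
&\quad = \operatorname{tr}\Bigg\{ \frac{1}{T^2} \sum_{t=1}^{T-1} \sum_{s=1}^{T-1} \Bigg[ E\Bigg( \operatorname{vec}\Big( \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \tilde{G}_j V_{i,t-j} V_{i,t+1-k}' G_k' \Big) \Big( \operatorname{vec}\Big( \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} \tilde{G}_p V_{i,s-p} V_{i,s+1-q}' G_q' \Big) \Big)' \Bigg) \\
&\qquad\qquad - \operatorname{vec}\Big( \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \Big) \Big( \operatorname{vec}\Big( \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \Big) \Big)' \Bigg] \Bigg\} \\
&\quad = \operatorname{tr}\Bigg\{ \frac{1}{T} \sum_{h=-T+2}^{T-2} \frac{T-1-|h|}{T} \Bigg[ \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} ( G_k \otimes \tilde{G}_j )
 E( V_{i,t+1-k} V_{i,t+h+1-q}' \otimes V_{i,t-j} V_{i,t+h-p}' ) ( G_q' \otimes \tilde{G}_p' ) \\
&\qquad\qquad - \operatorname{vec}\Big( \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \Big) \Big( \operatorname{vec}\Big( \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \Big) \Big)' \Bigg] \Bigg\}.
\end{aligned}
\]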
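If we show
\[
\begin{aligned}
\sum_{h=0}^{\infty} \operatorname{tr}\Bigg\{ & \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} ( G_k \otimes \tilde{G}_j )
 E( V_{i,t+1-k} V_{i,t+h+1-q}' \otimes V_{i,t-j} V_{i,t+h-p}' ) ( G_q' \otimes \tilde{G}_p' ) \\
& - \operatorname{vec}\Big( \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \Big) \Big( \operatorname{vec}\Big( \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \Big) \Big)' \Bigg\} < \infty,
\end{aligned}
\]
then, by Cesàro summability, it follows that $E\|(1/\sqrt{n})\sum_{i=1}^{n} R_{1,i,T}\|^2 = O(1/T)$. Observe that the left-hand side of this condition equals
\[
\begin{aligned}
& \sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \operatorname{tr}( G_k G_{k+h}' \otimes \tilde{G}_j \tilde{G}_{j+h}' )
 + \sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k=(0 \vee (1-h))}^{\infty} \operatorname{tr}\left( ( G_k \tilde{G}_{k+h-1}' \otimes \tilde{G}_j G_{j+h+1}' ) K_m \right) \\
&\quad + ( v_4 - 3 ) \sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \operatorname{tr}\left( ( G_{j+1} \otimes \tilde{G}_j ) \Big( \sum_{l=1}^{m} e_{l,l} \otimes e_{l,l} \Big) ( G_{j+h+1}' \otimes \tilde{G}_{j+h}' ) \right) \\
&\quad = I + II + III, \quad \text{say}.
\end{aligned}
\tag{8.23}
\]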
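Since $\operatorname{tr}( A \otimes B ) = \operatorname{tr}( A ) \operatorname{tr}( B )$ and $\operatorname{tr}( A ) \le ( \operatorname{rows}( A ) )^{1/2} \|A\|$ from (8.2) in Lemma 9, we have
\[
\begin{aligned}
I &= \sum_{h=0}^{\infty} \operatorname{tr}\Big( \sum_{k=0}^{\infty} G_k G_{k+h}' \Big) \operatorname{tr}\Big( \sum_{j=0}^{\infty} \tilde{G}_j \tilde{G}_{j+h}' \Big) \\
&\le \sum_{h=0}^{\infty} \operatorname{tr}\Big( \sum_{k=0}^{\infty} G_k G_{k+h}' \Big) \sum_{h=0}^{\infty} \operatorname{tr}\Big( \sum_{j=0}^{\infty} \tilde{G}_j \tilde{G}_{j+h}' \Big) \\
&\le m \Big( \sum_{k=0}^{\infty} \|G_k\| \Big)^2 \Big( \sum_{k=0}^{\infty} \|\tilde{G}_k\| \Big)^2 < \infty
\end{aligned}
\]
by Assumption 9.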
ks0
By Lemma 10,
⬁
II F
⬁
⬁
ks1 js0
hs1
4
⬁
F
5 Gj 5
⬁
⬁
q
⬁
4
⬁
5 Gj 5
ks0 js0
⬁
hs0 ks0
hs0 js0
2
⬁
js0
/
2
⬁
j 5 Gj 5
/
⬁
Ý Ý 5 G˜j 5 5 Gjqh 5
q
js0
ž
Ý Ý 5 Gk 5 5 G˜kqh 5
/ž
ž / ž
žÝ / žÝ / žÝ /
Ý
js0
F
⬁
Ý Ý 5 Gk 5 5 G˜ky1 5 5 G˜j 5 5 Gjq1 5 q Ý Ý Ý 5 Gk 5 5 G˜kqhy1 5 5 G˜j 5 5 Gjqhq1 5
5 Gj 5
- ⬁.
js0
Similarly, we can show that for some M) 0
4
⬁
III F M
5 Gj 5
žÝ /
- ⬁.
js0
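Also, we can show by modifying the arguments used above that
\[
E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{2,i,T} \right\|^2, \quad
E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{3,i,T} \right\|^2 = O\!\left( \frac{1}{T} \right),
\]
and
\[
E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{4,i,T} \right\| = O\!\left( \sqrt{\frac{n}{T}} \right), \quad
E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{5,i,T} \right\| = O\!\left( \sqrt{\frac{n}{T}} \right),
\]
so all the desired results are proved and the lemma follows.
Q.E.D.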
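The variance factor $\tfrac{1}{2}$ in Lemma 16 can be illustrated by simulation in the simplest special case. The sketch below assumes, purely for illustration, a scalar process with $F_{i,t} = V_{i,t}$ iid standard normal and $S_{i,0} = 0$, so that $G(1) = 1$, $\Lambda_F = 1$, and $\Omega_F = 1$; the function name `centered_sample_covariances` is hypothetical.

```python
import numpy as np

def centered_sample_covariances(n, T, seed=0):
    """Per-unit values of (1/T) sum_t F_it S_it - Lambda_F in the assumed scalar iid case."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((n, T))       # assumed: F_it = V_it, so G(1) = 1
    S = np.cumsum(F, axis=1)              # partial sums S_it with S_i0 = 0
    lambda_F = 1.0                        # only G_0 = 1 is non-zero in this design
    return np.mean(F * S, axis=1) - lambda_F

x = centered_sample_covariances(n=4000, T=400)
# Cross-section variance should be near 1/2, the limit variance in Lemma 16.
print(x.mean(), x.var())
```

In this design the finite-$T$ variance of each term is $(T-1)/(2T) + 2/T$, so the printed variance sits slightly above $1/2$ and approaches it as $T$ grows.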
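PROOF OF THEOREM 9: To establish joint limit normality of the PFM estimator $\hat{\beta}_{PFM}$, it is enough to show that, as $(n,T\to\infty)$ with $n/T \to 0$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( \hat{E}_{i,t}^{+} X_{i,t}' - \hat{\Lambda}_{ex}^{+} ) \right) \Rightarrow N\left( 0, \tfrac{1}{2} ( \Omega_{xx} \otimes \Omega_{e.x} ) \right),
\]
and
\[
\left\{ \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}' \right\}^{-1} \to_p 2 \Omega_{xx}^{-1}.
\]
The proof of the latter result is similar to the proof of Lemma 13(a), so we concentrate on the former.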
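Let $\Lambda_{ex}^{+} = \Lambda_{ex} - \Omega_{ex} \Omega_{xx}^{-1} \Lambda_{xx}$ and $E_{i,t}^{+} = E_{i,t} - \Omega_{ex} \Omega_{xx}^{-1} \Delta X_{i,t}$. Then
\[
\begin{aligned}
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( \hat{E}_{i,t}^{+} X_{i,t}' - \hat{\Lambda}_{ex}^{+} ) \right)
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+} ) \right) \\
&\quad - ( \hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} - \Omega_{ex} \Omega_{xx}^{-1} ) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( \Delta X_{i,t} X_{i,t}' - \Lambda_{xx} ) \right) \\
&\quad - \sqrt{n} ( \hat{\Lambda}_{ex} - \Lambda_{ex} ) + \hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} \sqrt{n} ( \hat{\Lambda}_{xx} - \Lambda_{xx} ).
\end{aligned}
\]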
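First, $( \hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} - \Omega_{ex} \Omega_{xx}^{-1} ) (1/\sqrt{n}) \sum_{i=1}^{n} \big( (1/T) \sum_{t=1}^{T} ( \Delta X_{i,t} X_{i,t}' - \Lambda_{xx} ) \big) = o_p(1)$ because $\hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} - \Omega_{ex} \Omega_{xx}^{-1} = o_p(1)$ and $(1/\sqrt{n}) \sum_{i=1}^{n} \big( (1/T) \sum_{t=1}^{T} ( \Delta X_{i,t} X_{i,t}' - \Lambda_{xx} ) \big) = O_p(1)$ by Lemma 16 and $E\|X_{i,0}\|^4 < M$ for some constant $M$. Next, according to Theorems 9 and 10 in Hannan (1970, pp. 280–283) (or Proposition 1 in Andrews (1991)), we know that $E\|\hat{\Omega}_{F,i} - E\hat{\Omega}_{F,i}\|^2 = (K/T) O(1)$, and $\|E\hat{\Omega}_{F,i} - \Omega_F\|^2 = (1/K^{2q}) O(1)$. Thus,
\[
\begin{aligned}
E\left\| \sqrt{n} ( \hat{\Omega}_F - \Omega_F ) \right\|^2
&= E\left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} ( \hat{\Omega}_{F,i} - E\hat{\Omega}_{F,i} + E\hat{\Omega}_{F,i} - \Omega_F ) \right\|^2 \\
&= E\|\hat{\Omega}_{F,i} - E\hat{\Omega}_{F,i}\|^2 + n \|E\hat{\Omega}_{F,i} - \Omega_F\|^2 \\
&= \left( \frac{K}{T} + \frac{n}{K^{2q}} \right) O(1).
\end{aligned}
\]
Since the bandwidth parameter $K\to\infty$ with $K/T \to 0$ and $K^{2q}/T \to \epsilon > 0$ for some $q > \tfrac{1}{2}$ by Assumption 11, it follows that $E\|\sqrt{n} ( \hat{\Omega}_F - \Omega_F )\|^2 \to 0$ as $(n,T\to\infty)$ with $n/T \to 0$. The same argument can be applied to $\hat{\Lambda}_F$. In consequence, we have
\[
\sqrt{n} ( \hat{\Lambda}_{ex} - \Lambda_{ex} ), \quad \hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} \sqrt{n} ( \hat{\Lambda}_{xx} - \Lambda_{xx} ) = o_p(1).
\]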
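The role of the condition $n/T \to 0$ in the rate $K/T + n/K^{2q}$ can be seen from a short numerical sketch. The bandwidth rule $K = (\epsilon T)^{1/(2q)}$ used below is an assumption chosen so that $K^{2q}/T = \epsilon$ holds exactly, and the $O(1)$ constant is set to one; the function name `long_run_variance_mse_rate` is hypothetical.

```python
def long_run_variance_mse_rate(n, T, q=1.0, eps=1.0):
    """Evaluate K/T + n/K^{2q} with the assumed bandwidth K = (eps * T)^{1/(2q)}."""
    K = (eps * T) ** (1.0 / (2.0 * q))    # so that K^{2q} / T = eps
    return K / T + n / K ** (2.0 * q)     # second term equals n / (eps * T)

for n, T in [(100, 10**4), (100, 10**6), (1000, 10**6), (1000, 10**8)]:
    # The rate is small only when n is small relative to T.
    print(n, T, long_run_variance_mse_rate(n, T))
```

With this rule the second term is $n/(\epsilon T)$, so the bound vanishes exactly when $n/T \to 0$, in line with the argument above.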
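The remainder of the proof involves showing that
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+} ) \right) \Rightarrow N\left( 0, \tfrac{1}{2} ( \Omega_{xx} \otimes \Omega_{e.x} ) \right),
\]
and this is entirely analogous to the proof of Lemma 16. The main contribution of $(1/\sqrt{n}) \sum_{i=1}^{n} \big( (1/T) \sum_{t=1}^{T} ( E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+} ) \big)$ from the BN decomposition is
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} ( G_e(1) - \Omega_{ex} \Omega_{xx}^{-1} C_x(1) ) V_{i,t} P_{i,t-1}' C_x(1)' \right) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{i,T},
\]
and it is easy to see that
\[
E\left[ \operatorname{vec}( Q_{i,T} ) ( \operatorname{vec}( Q_{i,T} ) )' \right] = \left( \frac{1}{T} \sum_{t=1}^{T} \frac{t-1}{T} \right) ( \Omega_{xx} \otimes \Omega_{e.x} )
\to \tfrac{1}{2} ( \Omega_{xx} \otimes \Omega_{e.x} ) = E\left[ \operatorname{vec}( Q_i ) ( \operatorname{vec}( Q_i ) )' \right],
\]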
where $Q_i = ( G_e(1) - \Omega_{ex} \Omega_{xx}^{-1} C_x(1) ) \int dW_i \, W_i' \, C_x(1)'$. Thus, by Theorem 3, we have the desired result.
All the remainder terms in the BN decomposition of $(1/\sqrt{n}) \sum_{i=1}^{n} \big( (1/T) \sum_{t=1}^{T} ( E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+} ) \big)$ converge in probability to zero by Lemma 16 and the moment bound $E\|X_{i,0}\|^4 < M$, for some constant $M$.
Q.E.D.
REFERENCES
ANDREWS, D. W. K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59, 817–858.
BALTAGI, B. (1995): Econometric Analysis of Panel Data. New York: Wiley.
BILLINGSLEY, P. (1968): Convergence of Probability Measures. New York: Wiley.
--- (1986): Probability and Measure. New York: Wiley.
CHAMBERLAIN, G. (1984): "Panel Data," in Handbook of Econometrics, Vol. 2, ed. by Z. Griliches and M. Intriligator. Amsterdam: North-Holland.
CONLEY, T. (1997): "Econometric Modelling of Cross Sectional Dependence," Mimeo.
DUDLEY, R. (1989): Real Analysis and Probability. Pacific Grove: Wadsworth & Brooks/Cole Mathematics Series.
EICKER, F. (1963): "Central Limit Theorems for Families of Sequences of Random Variables," Annals of Mathematical Statistics, 34, 439–446.
GRANGER, C. W. J., AND P. NEWBOLD (1974): "Spurious Regressions in Econometrics," Journal of Econometrics, 2, 111–120.
HSIAO, C. (1986): Analysis of Panel Data. Cambridge: Cambridge University Press.
HALL, P., AND C. HEYDE (1980): Martingale Limit Theory and its Applications. New York: Academic Press.
HANNAN, E. (1970): Multiple Time Series. New York: Wiley.
IM, K., H. PESARAN, AND Y. SHIN (1996): "Testing for Unit Roots in Heterogeneous Panels," Mimeo.
LEVIN, A., AND C. LIN (1993): "Unit Root Tests in Panel Data: New Results," UC San Diego Working Paper.
MAGNUS, J., AND H. NEUDECKER (1988): Matrix Differential Calculus. New York: Wiley.
MATYAS, L., AND P. SEVESTRE (EDS.) (1992): The Econometrics of Panel Data. Boston, MA: Kluwer Academic Publishers.
MUIRHEAD, R. (1982): Aspects of Multivariate Statistical Theory. New York: Wiley.
PEDRONI, P. (1995): "Panel Cointegration; Asymptotic and Finite Sample Properties of Pooled Time Series Tests with an Application to the PPP Hypothesis," Indiana University Working Papers in Economics No. 95-013.
PESARAN, H., AND R. SMITH (1995): "Estimating Long-Run Relationships from Dynamic Heterogeneous Panels," Journal of Econometrics, 68, 79–113.
PHILLIPS, P. C. B. (1986): "Understanding Spurious Regressions in Econometrics," Journal of Econometrics, 33, 311–340.
--- (1988): "Weak Convergence of Sample Covariance Matrices to Stochastic Integrals via Martingale Approximations," Econometric Theory, 4, 528–533.
PHILLIPS, P. C. B., AND S. DURLAUF (1986): "Multiple Time Series Regression with Integrated Processes," Review of Economic Studies, 53, 473–495.
PHILLIPS, P. C. B., AND B. HANSEN (1990): "Statistical Inference in Instrumental Variables Regression with I(1) Processes," Review of Economic Studies, 57, 99–125.
PHILLIPS, P. C. B., AND C. LEE (1996): "Efficiency Gains from Quasi-Differencing under Nonstationarity," in Athens Conference on Applied Probability and Time Series: Volume II, Time Series Analysis in Memory of E. J. Hannan, ed. by P. M. Robinson and M. Rosenblatt. New York: Springer-Verlag.
PHILLIPS, P. C. B., AND H. MOON (1997a): "Linear Regression Limit Theory for Nonstationary Panel Data," University of Auckland Discussion Paper in Economics.
--- (1997b): "Linear Regression Limit Theory for Nonstationary Panel Data," Yale University, Mimeographed.
PHILLIPS, P. C. B., AND V. SOLO (1992): "Asymptotics for Linear Processes," Annals of Statistics, 20, 971–1001.
POLLARD, D. (1984): Convergence of Stochastic Processes. New York: Springer-Verlag.
QUAH, D. (1994): "Exploiting Cross-Section Variations for Unit Root Inference in Dynamic Data," Economics Letters, 44, 9–19.
ROBERTSON, D., AND J. SYMONS (1992): "Some Strange Properties of Panel Data Estimators," Journal of Applied Econometrics, 7, 175–189.
SHORACK, G., AND J. WELLNER (1986): Empirical Processes with Applications to Statistics. New York: Wiley.