Econometrica, Vol. 67, No. 5 (September, 1999), 1057-1111

LINEAR REGRESSION LIMIT THEORY FOR NONSTATIONARY PANEL DATA^1

BY PETER C. B. PHILLIPS AND HYUNGSIK R. MOON^2

This paper develops a regression limit theory for nonstationary panel data with large numbers of cross section ($n$) and time series ($T$) observations. The limit theory allows for both sequential limits, wherein $T \to \infty$ followed by $n \to \infty$, and joint limits where $T, n \to \infty$ simultaneously; and the relationship between these multidimensional limits is explored. The panel structures considered allow for no time series cointegration, heterogeneous cointegration, homogeneous cointegration, and near-homogeneous cointegration. The paper explores the existence of long-run average relations between integrated panel vectors when there is no individual time series cointegration and when there is heterogeneous cointegration. These relations are parameterized in terms of the matrix regression coefficient of the long-run average covariance matrix. In the case of homogeneous and near-homogeneous cointegrating panels, a panel fully modified regression estimator is developed and studied. The limit theory enables us to test hypotheses about the long-run average parameters both within and between subgroups of the full population.

KEYWORDS: Nonstationary panel data, long-run average relations, multidimensional limits, panel cointegration regression, panel spurious regression.

1. INTRODUCTION

THERE HAS BEEN MUCH RECENT EMPIRICAL econometric work on economic models that uses panel data for which the time series component is nonstationary. Testing growth convergence theories in macroeconomics and estimating long-run relations between international financial series such as relative prices and exchange rates, and spot and future exchange rates are a few examples.
This work has been facilitated by the construction and availability of a number of important panel data sets covering different individuals, regions, and countries over a relatively long time period, a notable example being the Penn World Table. For such cases a new nonstationary panel data limit theory which allows for large $n$ and large $T$ asymptotics is useful.

^1 An earlier version of this paper, Phillips and Moon (1997a), hereafter PM^a, was presented at the inaugural meeting of the New Zealand Econometric Society Group in Auckland, February 1997, while Phillips was visiting the University of Auckland. Some of the results were also presented in a training course on panel cointegration given by the first author at the EMBA meetings in Palm Cove, Australia, August 1996. The present paper is a shortened version of Phillips and Moon (1997b), hereafter PM^b, to which readers will be frequently referred for a full development of the algebra and limit theory given here.

^2 The authors thank a co-editor and three referees for comments on the paper, and Donald Andrews, David Pollard, and Oliver Linton for helpful discussions. Phillips thanks the NSF for research support under Grant No. SBR 94-22922, and Moon gratefully acknowledges financial support from a C.A. Anderson Prize Fellowship. The paper was typed by the authors in Scientific Word 2.5.

Much past panel data research has focused on identifying and estimating effects from stationary panels with a large cross section data dimension ($n$) but with few time series ($T$) observations. In such cases a large $n$, fixed $T$ limit theory is natural, and Chamberlain (1984), Hsiao (1986), Matyas and Sevestre (1992), and Baltagi (1995) review much of this research. The purpose of the present contribution is to investigate regressions with nonstationary panel data for which the time series component is an integrated process and where both $T$ and $n$ are large.
In such cases, panel regressions can behave very differently from time series regressions. It has long been recognized by econometricians that panel data can distinguish effects that time series or cross section data alone cannot identify, and nonstationary panels provide a further instance of this phenomenon. Suppose that we have two $I(1)$ random vectors, say $Y_{i,t}$ and $X_{i,t}$. When there is no cointegrating relation between $Y_{i,t}$ and $X_{i,t}$, and a time series regression for given $i$ is performed, then the regression coefficient is well known to have a nondegenerate limit distribution and the regression is characterized as spurious (Granger and Newbold (1974) and Phillips (1986)). Now suppose that there are panel observations of $Y_{i,t}$ and $X_{i,t}$ with large cross sectional and time series components. In this case, even if the noise in the time series regression is strong, the noise can often be characterized as independent across individuals. Hence, by pooling the cross section and time series observations, we may attenuate the strong effect of the residuals in the regression while retaining the strength of the signal ($X_{i,t}$). In such a case, we can expect a panel-pooled regression to provide a consistent estimate of some long-run regression coefficient.

The present paper is concerned with developing a limit theory that is helpful in understanding and interpreting regressions of this type. In particular, we show the existence of an interesting long-run relation between panel vectors like $Y_{i,t}$ and $X_{i,t}$ that have no individual time series cointegrating relation. The new relation is a long-run average relationship over the cross section and it is parameterized in terms of a matrix regression coefficient derived from the cross section long-run average covariance matrix.

The following sections consider four possible panel structures for $Y_{i,t}$ and $X_{i,t}$: (i) no cointegrating relation, (ii) a heterogeneous cointegrating relation, (iii)
a homogeneous cointegrating relation, and (iv) a near-homogeneous relation. Our analysis shows that in all four cases the pooled estimator is consistent and has a normal limit distribution. In the no cointegration and heterogeneous cointegration cases, we also study a limiting cross section estimator and prove that it is consistent and has a normal limit distribution, but that it is less efficient than the pooled estimator. In addition, in the case of homogeneous cointegration and near-homogeneous cointegration, we can construct a consistent estimator for the long-run regression coefficient, which we call a pooled FM (fully modified) estimator. This estimator has a faster coefficient convergence rate than the simple cross section and time series estimators.

Since the beginning of the 1990s there has been some ongoing research on nonstationary panel data that connects to our work here. Quah (1994), Levin and Lin (1993), and recently Im et al. (1996) considered unit root time series regressions with nonstationary panel data and proposed test statistics for unit roots. In addition, Pedroni (1995) studied some properties of cointegration statistics in pooled time series panels, and Robertson and Symons (1992) studied the biases that are likely to arise in practice with both stationary and nonstationary panel data. More closely related to our work is Pesaran and Smith (1995), who examined the impact of nonstationary variables on cross section regression estimates. They showed that spurious correlation between two $I(1)$ variables does not arise in the case of cross section regression with a finite number of time series observations under conditions such as exogenous regressors and iid disturbances. Our paper extends that result to a very general setting and provides a limit theory as $T \to \infty$ and $n \to \infty$ in panel regressions. The long-run relation defined in Pesaran and Smith (1995)
is an average of randomly different cointegrating coefficients and they suggested cross section regression with time averaged data for consistent estimation. By contrast, our long-run relation is the regression relation associated with the long-run average covariance matrix and it is this regression that is the natural limit of a pooled panel regression. Further, we show that both pooled panel regression and limiting cross section regression estimators are consistent for this long-run average relation.

The limit theory developed here allows for both sequential limits, wherein $T \to \infty$ followed by $n \to \infty$, and joint limits where $T, n \to \infty$ simultaneously. A detailed discussion of multi-index asymptotic theory is provided and some general theorems for laws of large numbers and central limit theory are given. Sequential limit theory is easy to derive and generally leads to quick results for a variety of model configurations. Under some strengthening of the conditions, the results obtained under sequential limits are shown to apply also when $T, n \to \infty$ simultaneously. For a limit distribution theory we need the rate condition $n/T \to 0$. The latter condition indicates that the limit theory here is most likely to be useful in practice when $n$ is moderate and $T$ is large. We can expect such data configurations in multi-country macroeconomic data, for example, when we restrict attention to groups of countries like OECD nations or developing countries. The limit theory enables us to test hypotheses about the long-run average parameters both within and between such subgroups of the full population.

The paper is organized as follows. Section 2 introduces the basic model, lays out assumptions, and gives some preliminary results including a multidimensional Beveridge-Nelson (BN) decomposition. Section 3 develops a framework for asymptotics for double indexed processes that is used in the paper for both sequential and joint limit theories.
Section 4 assumes that there is no cointegration among the $I(1)$ variables across all the individuals and gives asymptotic theories for a pooled regression estimator and a limiting cross section estimator. Section 5 assumes that there exists a cointegrating relation in the $I(1)$ variables across all the individuals and derives limit theories for the pooled estimators in three cases: heterogeneous, homogeneous, and near-homogeneous cointegration. Section 6 indicates some extensions of this theory to allow for models with individual specific effects. Section 7 concludes the paper. Five appendices are included to develop the multidimensional limit theory, provide some technical background, relevant lemmas, and proofs of results in the paper.

Notation is fairly standard. The symbol "$\Rightarrow$" signifies weak convergence, "$:=$" is definitional equivalence, "$\equiv$" signifies equivalence in distribution, "$\to_p$" is convergence in probability, and "$\to_{a.s.}$" is convergence almost surely. The inequality "$> 0$" signifies positive definiteness when applied to matrices. Stochastic processes such as Brownian motion $W(r)$ on $[0, 1]$ are usually written as $W$, integrals such as $\int_0^1 W(r)\,dr$ as $\int W$, and stochastic integrals like $\int_0^1 W(r)\,dW(r)$ as $\int W\,dW$. Also $\mathrm{vec}(A)$ denotes vectorization of the matrix $A$ by stacking columns, and $\|A\|$ is the Euclidean norm $(\mathrm{tr}(A'A))^{1/2}$.

2. ASSUMPTIONS, LARGE T ASYMPTOTICS, AND THE LONG-RUN AVERAGE COVARIANCE MATRIX

We start with a panel data model based on the vector integrated process

$$ Z_{i,t} = Z_{i,t-1} + U_{i,t} \qquad (t = 1, \ldots, T;\ i = 1, \ldots, n) \tag{2.1} $$

with common initialization at $t = 0$ satisfying

$$ Z_{i,0} \text{ is iid across } i \text{ with } E\|Z_{i,0}\|^4 < \infty. \tag{2.2} $$

We partition the $m$-vectors $Z_{i,t}$ and $U_{i,t}$ in (2.1) into $m_y$ and $m_x$ components ($m = m_y + m_x$) as $Z_{i,t} = (Y_{i,t}', X_{i,t}')'$ and $U_{i,t} = (U_{y_i,t}', U_{x_i,t}')'$. Condition (2.2)
is made for convenience and could be generalized to allow for remote past initialization at the cost of some further complications (e.g., Phillips and Lee (1996)). The error $U_{i,t}$ is assumed to be generated by the random coefficient linear process

$$ U_{i,t} = \sum_{s=0}^{\infty} C_{i,s} V_{i,t-s}, \tag{2.3} $$

where: (i) $\{C_{i,t}\}$ is a double sequence of $(m \times m)$ random matrices across $i$ and over $t$; (ii) the $m$-vectors $V_{i,t}$ are iid across $i$ and over $t$ with $E(V_{i,t}) = 0$, $E(V_{i,t} V_{i,t}') = I_m$, and, letting $V_{a,i,t}$ be the $a$th element of $V_{i,t}$, the $V_{a,i,t}$ are assumed to be independent across $a = 1, \ldots, m$ with $E(V_{a,i,t}^4) = v_4$ for all $i$ and $t$; (iii) $C_{i,s}$ and $V_{j,t}$ are independent for all $i$, $j$, $t$, and $s$.

We make two further assumptions about the random coefficients in (2.3). The first involves moment conditions and the second is a set of summability conditions on the moments of the random coefficients.

ASSUMPTION 1 (Random Coefficient Conditions): (i) $\{C_{i,s}\}_i$ is iid across $i$ for all $s$. (ii) $E\|C_{i,s}\|^4 < \infty$ for all $s$.

Thus, $C_{i,s}$ is assumed to be iid across individuals and to have finite fourth moments that may vary over time. We allow $C_{i,s}$ to be dependent over $s$. This is important, because whenever $U_{i,t}$ is generated by a finite-parameter time series model like an autoregression, the coefficients in the Wold decomposition (2.3) will be nonlinear functions of these parameters that are lag ($s$) dependent and will therefore inevitably be dependent over $s$.

Let $C_{a,i,s}$ be the $a$th element of $\mathrm{vec}(C_{i,s})$. Also let $E(C_{a,i,s}^k) = \sigma_{k,a,s}$. Then we make the following assumption:

ASSUMPTION 2 (Summability Conditions): The following hold for all $a = 1, \ldots, m^2$: (i) $\sum_{s=0}^{\infty} s^2 \sigma_{2,a,s} < \infty$. (ii) $\sum_{s=0}^{\infty} s^4 (\sigma_{4,a,s})^{1/4} < \infty$.

Suppose the $U_{i,t}$ in (2.1) are generated by a random coefficient ARMA process whose characteristic equation has roots $\{\lambda_{ij} : j = 1, \ldots, J\}$. Then the coefficients $C_{i,s}$ in the Wold decomposition (2.3)
are all linear combinations of powers of these characteristic roots. Under weak conditions on the distribution of the roots we can now verify Assumptions 1 and 2. Suppose, for instance, that the support of the distribution of the moduli of these roots is a compact set inside the stable region, so that $|\lambda_{ij}| \le M < 1$ a.s. Then all moments of $\|C_{i,s}\|$ are finite for all $s$, and series such as those in Assumption 2 are easily seen to be majorized by convergent series. For example, $\sum_{s=0}^{\infty} s^2 \sigma_{2,a,s} \le \bar{M} \sum_{s=0}^{\infty} s^2 M^{2s} < \infty$ for some constant $\bar{M}$. Similar conditions will ensure the validity of the alternative Assumptions 4 and 5 that are used later on in the paper.

The following lemma establishes the integrability of terms that appear frequently in our development.

LEMMA 1: Let $C_i(1) = \sum_{s=0}^{\infty} C_{i,s}$, $\tilde{C}_{i,s} = \sum_{t=s+1}^{\infty} C_{i,t}$, and $\tilde{U}_{i,t} = \sum_{s=0}^{\infty} \tilde{C}_{i,s} V_{i,t-s}$. Under Assumptions 1 and 2, the following hold:
(a) $E[\sum_{s=0}^{\infty} s^2 \|C_{i,s}\|^2] < \infty$.
(b) $E\|U_{i,t}\|^2 < M$ for some $M < \infty$.
(c) $E\|\tilde{U}_{i,t}\|^4 < M$ for some $M < \infty$.
(d) $E\|C_i(1)\|^4 < \infty$.

The temporal shift operator generating the iid sequence $\{V_{i,t}\}_t$ defines a measure preserving map on the product probability space induced by the independent sequences $\{V_{i,t}\}_t$ and $\{C_{i,t}\}_t$ and it generates the sequence $\{U_{i,t}\}_t$. Also the random coefficient sequence $\{C_{i,t}\}_t$ is square summable a.s. since $\sum_{t=0}^{\infty} \|C_{i,t}\|^2 < \infty$ a.s. by Lemma 1(a). Hence, the time series sequence $\{U_{i,t}\}_t$ is square integrable (Lemma 1(b)) and strictly stationary for all $i$. However, the sequence $\{U_{i,t}\}_t$ is not ergodic. This is because $\mathcal{F}_{c_i} = \sigma(C_{i,0}, \ldots, C_{i,t}, \ldots)$, the sigma field generated by the sequence $\{C_{i,t}\}_{t=0}^{\infty}$, is an invariant sigma field with respect to the temporal shift operator and generates events with probability between zero and unity.

The following lemma shows that, for each $i$, $U_{i,t}$ satisfies a time series BN decomposition (see Phillips and Solo (1992)) almost surely.

LEMMA 2 (Panel BN decomposition): Under Assumptions 1 and 2 the processes $U_{i,t} = \sum_{s=0}^{\infty} C_{i,s} V_{i,t-s}$ in (2.3) admit the following BN decomposition:

$$ U_{i,t} = C_i(1) V_{i,t} + \tilde{U}_{i,t-1} - \tilde{U}_{i,t} \quad \text{a.s.} \tag{2.4} $$

Note that $C_i(1) = \sum_{s=0}^{\infty} C_{i,s} < \infty$ a.s. in view of Lemma 1(d), and $\tilde{U}_{i,t}$ are well defined square integrable random vectors by Lemma 1(c). Following Phillips and Solo (1992), partial sums of $U_{i,t}$ can be written as

$$ \frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} U_{i,t} \overset{a.s.}{=} C_i(1) \frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} V_{i,t} + \frac{1}{\sqrt{T}} \tilde{U}_{i,0} - \frac{1}{\sqrt{T}} \tilde{U}_{i,[Tr]}, \tag{2.5} $$

where $[Tr]$ denotes the integer part of $Tr$ and it is a simple matter to establish that these partial sum processes satisfy functional laws. Indeed, we have the following large $T$ result.

LEMMA 3 (Panel Functional CLT): Under Assumptions 1 and 2,

$$ \frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} U_{i,t} \Rightarrow C_i(1) W_i(r) $$

as $T \to \infty$ for all $i$.

Let $M_i(r) = (M_{y_i}(r)', M_{x_i}(r)')' = C_i(1) W_i(r) = (C_{y_i}(1)', C_{x_i}(1)')' W_i(r)$. Here, $M_i(r)$ is a randomly scaled (or mixed) Brownian motion with conditional covariance matrix $C_i(1) C_i(1)'$, whose expectation is well defined because $\|E\,C_i(1) C_i(1)'\| < \infty$ in view of Lemma 1(d). By the continuous mapping theorem and initial condition (2.2) we have

$$ \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \Rightarrow C_i(1) \int W_i W_i' \, C_i(1)' = \int M_i M_i', $$

as $T \to \infty$ for all $i$. Then, averaging over $i = 1, \ldots, n$, we have

$$ \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \Rightarrow \frac{1}{n} \sum_{i=1}^{n} C_i(1) \int W_i W_i' \, C_i(1)' = \frac{1}{n} \sum_{i=1}^{n} \int M_i M_i', \tag{2.6} $$

as $T \to \infty$ for any fixed $n$. Integrability of the summands in (2.6) follows readily under the given summability and moment conditions.

LEMMA 4: Under Assumptions 1 and 2, $E\|\int M_i M_i'\|^2 < \infty$.

In consequence, a strong law of large numbers applies to (2.6) as $n \to \infty$ and so

$$ \frac{1}{n} \sum_{i=1}^{n} \int M_i M_i' \to_{a.s.} E\left( \int M_i M_i' \right). \tag{2.7} $$

The limit here depends not on the covariance matrix of $Z_{i,t}$, but on a parameter matrix that measures the long-run (over $t$) covariance of $Z_{i,t}$ averaged over $i$.
This parameter matrix is constructed as follows. Let $\Omega_i$ be the long-run conditional covariance matrix of $Z_{i,t} = (Y_{i,t}', X_{i,t}')'$ conditioned on $\mathcal{F}_{c_i}$, i.e.,

$$ \Omega_i = \begin{pmatrix} \Omega_{y_i y_i} & \Omega_{y_i x_i} \\ \Omega_{x_i y_i} & \Omega_{x_i x_i} \end{pmatrix} = C_i(1) C_i(1)' = \begin{pmatrix} C_{y_i}(1) C_{y_i}(1)' & C_{y_i}(1) C_{x_i}(1)' \\ C_{x_i}(1) C_{y_i}(1)' & C_{x_i}(1) C_{x_i}(1)' \end{pmatrix}, $$

where the partitions of $\Omega_i$ and $C_i(1) C_i(1)'$ are conformable. By Lemma 1(d), $\Omega_i$ is integrable and we denote

$$ \Omega = \begin{pmatrix} \Omega_{yy} & \Omega_{yx} \\ \Omega_{xy} & \Omega_{xx} \end{pmatrix} = E(\Omega_i) = E\left( C_i(1) C_i(1)' \right). $$

We call $\Omega$ the long-run average covariance matrix of $Z_{i,t}$. It is now apparent from (2.6) that $E(\int M_i M_i') = E[C_i(1) E(\int W_i W_i') C_i(1)']$. A simple calculation reveals that $E(\int W_i W_i') = \frac{1}{2} I_m$, so that (2.7) becomes

$$ \frac{1}{n} \sum_{i=1}^{n} \int M_i M_i' \to_{a.s.} \frac{1}{2} \Omega, \tag{2.8} $$

showing that the limit of the second moment matrix of the data depends on the long-run average covariance matrix $\Omega$. Taken together, (2.6) and (2.7) give us an instance of an asymptotic development in which $T \to \infty$, followed by $n \to \infty$, leading to the sequence of limits

$$ X_{n,T} := \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' \right) \Rightarrow \frac{1}{n} \sum_{i=1}^{n} \int M_i M_i' \to_{a.s.} \frac{1}{2} \Omega \tag{2.9} $$

for the double indexed process $X_{n,T}$. The next section discusses this type of sequential asymptotic theory in relation to more general joint asymptotics.

3. LIMIT THEORY FOR MULTIDIMENSIONAL PROCESSES

Throughout this paper, attention is focused on deriving the limit behavior of a double indexed process $X_{n,T}$, such as that given in (2.9). In general, the limit of $X_{n,T}$ depends on the treatment of the two indices, $n$ and $T$, and the properties that link the rows and columns of the process. Several approaches are possible.

One approach is to fix one of the indexes, say $n$, and allow the other ($T$) to pass to infinity, giving an intermediate limit. By letting $n$ pass to infinity subsequently, a sequential limit theory is obtained. We write this type of limit process in the form $(T, n \to \infty)_{seq}$, where the order of the indices is critical to the meaning.
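As a rough numerical illustration of the sequential limit in (2.9), the sketch below simulates the scalar case $m = 1$ under simplifying assumptions that are not in the paper: $C_{i,0} = c_i$ drawn uniformly from $[0.5, 1.5]$, $C_{i,s} = 0$ for $s \ge 1$, Gaussian $V_{i,t}$, and $Z_{i,0} = 0$. Under these assumptions $\Omega = E(c_i^2) = 13/12$, and because $E\,W_i(r)^2 = r$ integrates to $1/2$ over $[0, 1]$, $X_{n,T}$ should settle near $\Omega/2$ when $T$ and $n$ are both large. The function name `simulate_x_nT` and all parameter values are hypothetical choices made for this illustration.

```python
import random

# Assumed design (not from the paper): c_i ~ Uniform[0.5, 1.5], so that
# Omega = E(c_i^2) = mean^2 + variance = 1 + 1/12.
OMEGA = 1.0 + 1.0 / 12.0


def simulate_x_nT(n, T, seed=0):
    """Return X_{n,T} = (1/n) sum_i (1/T^2) sum_t Z_{i,t}^2 for scalar random walks.

    Each unit i follows Z_{i,t} = Z_{i,t-1} + c_i * V_{i,t} with Z_{i,0} = 0,
    V_{i,t} iid N(0, 1), and a unit-specific random scale c_i.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        c = rng.uniform(0.5, 1.5)   # random long-run scale C_i(1) for unit i
        z = 0.0                     # common initialization Z_{i,0} = 0
        sum_sq = 0.0
        for _ in range(T):
            z += c * rng.gauss(0.0, 1.0)
            sum_sq += z * z
        total += sum_sq / (T * T)   # time series normalization by T^2
    return total / n                # cross section average over i


if __name__ == "__main__":
    print("target Omega / 2 =", 0.5 * OMEGA)
    for n, T in [(10, 50), (100, 100), (500, 100)]:
        print("n =", n, "T =", T, "X_nT =", simulate_x_nT(n, T, seed=1))
```

In this design the mean of each summand is $c_i^2 (T+1)/(2T)$, so increasing $T$ removes a small finite-sample factor, while increasing $n$ averages out the sampling noise across independent units.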
While they often lead to tractable derivations, sequential limits can give asymptotic results that are misleading in cases where both indexes pass to infinity simultaneously.

A second approach is to pass to infinity along a specific diagonal path (in the two dimensional array) determined by a monotonically increasing functional relation of the type $T = T(n)$ while the index $n \to \infty$. (This approach really requires specificity about the functional dependence only in the limit, so it includes cases such as those where we assume that $(T/n) \to c \ne 0$, in which case we simply have $T(n) = cn$.) We write this type of limit process in the form $(T(n), n \to \infty)_{diag}$. This approach also simplifies the asymptotic theory by replacing $X_{n,T}$ with the single indexed process $X_{n,T(n)}$. The drawback of diagonal path limit theory is that the assumed expansion path $(T(n), n) \to \infty$ is highly specific and may not provide an appropriate approximation for a given $(T, n)$ situation. Moreover, the limit theory can depend on the specific functional relation $T = T(n)$ that is used in the asymptotic development. (A recent econometric example of this situation is analyzed in Phillips and Lee (1996).)

A third approach is to allow both indexes to pass to infinity simultaneously without placing specific diagonal path restrictions on the divergence. We write this type of limit process in the form $(T, n \to \infty)$. Generally speaking, such joint limit theory requires stronger conditions (linking the rows and columns of the joint array, and on the moments of the component variates) to establish than sequential convergence or diagonal path convergence. But, by the same token, the results are also stronger and may be expected to be relevant to a wider range of circumstances, provided the conditions hold.

The asymptotic development in this paper will involve both sequential limit theory and joint limit theory arguments.
The sequential limits are especially helpful in extracting quick asymptotics and they are useful because they bring into play all of the key elements in our final limit theory in a straightforward way. The joint limit theory is more difficult to derive and applies under stronger conditions. Fortunately, these conditions do not seem to exclude cases of major importance for the type of large $T$ and moderate $n$ empirical applications that we have in mind for our methods. The following subsections define the convergence concepts that we need and give some conditions that assure joint convergence.

3.1. Definitions and Some Relations between Sequential and Joint Limits

A typical double index process of the type that occurs in this paper has the linear form

$$ X_{n,T} = \frac{1}{k_n} \sum_{i=1}^{n} Y_{i,T}, \tag{3.1} $$

where $Y_{i,T}$ are independent $m$-component random vectors across $i$ defined on a probability space $(\Omega, \mathcal{F}, P)$. A typical $Y_{i,T}$ component in our case has the form of a standardized sum

$$ Y_{i,T} = \frac{1}{d_T} \sum_{t=1}^{T} f(Z_{i,t=[Tr]}), \tag{3.2} $$

where the $Z_{i,[Tr]}$ are random elements in the space^3 $D[0,1]^h$, for some integer $h$, within the space $(\Omega, \mathcal{F}, P)$, $d_T$ is a standardizing factor, and $f$ is a continuous functional from $D[0,1]^h$ to $\mathbb{R}^m$. We alert the reader that the meaning of the notation $X_{n,T}$ and $Y_{i,T}$ in (3.1) and (3.2) above is different from that of the symbols $Y_{i,t}$ and $X_{i,t}$ which represent components of $Z_{i,t}$ given in (2.1) that appear in other sections of this paper. The differences in meaning should be obvious from the context.

^3 $D[0,1]^h$ is a product metric space of $h$ independent copies of $D[0,1]$, the space of all real valued functions on the interval $[0,1]$ that are right continuous and have finite left limits. $D[0,1]^h$ is endowed with the metric $d_h(f, g) = \max_i \{ d(f_i, g_i) : i \in \{1, \ldots, h\},\ f_i, g_i \in D[0,1] \}$, where $d$ is the modified Skorohod metric (see Billingsley (1968, p. 112)) under which $D[0,1]$ is separable and complete.

DEFINITION 1: (a) A sequence of $m$-vectors $\{X_{n,T}\}$ on $(\Omega, \mathcal{F}, P)$ is said to converge in probability to $X$ sequentially, written $X_{n,T} \to_p X$ in sequential limit as $(T, n \to \infty)_{seq}$, if

$$ \lim_{n \to \infty} \lim_{T \to \infty} P\{ \|X_{n,T} - X\| > \varepsilon \} = 0 \quad \forall \varepsilon > 0. $$

(b) $X_{n,T}$ converges in distribution sequentially to the $m$-vector $X$, written $X_{n,T} \Rightarrow X$ in sequential limit as $(T, n \to \infty)_{seq}$, if

$$ \lim_{n \to \infty} \lim_{T \to \infty} | E f(X_{n,T}) - E f(X) | = 0 \quad \forall f \in C, $$

where $C$ is the class of all bounded, continuous, real functions on $\mathbb{R}^m$.

In practice, we can find the sequential limits of $X_{n,T}$ in (3.1) as follows. Using time series limit theory we find the limit behavior of $Y_{i,T}$. Suppose, for example, that as $T \to \infty$

$$ Y_{i,T} \Rightarrow Y_i \tag{3.3} $$

or

$$ Y_{i,T} \to_p Y_i \tag{3.4} $$

for all $i$. Then, by the independence of $Y_{i,T}$ across $i$ for all $T$, we have $X_{n,T} \Rightarrow X_n$ or $X_{n,T} \to_p X_n$ as $T \to \infty$ for all $n$, where $X_n = (1/k_n) \sum_{i=1}^{n} Y_i$. By enlarging the underlying probability space if necessary, we can take it in the case of (3.3) that all the $Y_i$'s are defined on the same probability space. Hence, the sum of the limit random variables $\sum_{i=1}^{n} Y_i$ is well defined on the same space.

Next, we allow $n \to \infty$ and apply a limit theory to the standardized sum

$$ X_n = \frac{1}{k_n} \sum_{i=1}^{n} Y_i. \tag{3.5} $$

Under some regularity conditions, we can now find the sequential limit, $X$, of $X_n$. For example, if $k_n = n$, we can apply a law of large numbers (LLN) to $X_n$ and if $k_n = \sqrt{n}$ we can use an appropriate central limit theorem (CLT).

The requirement that the $Y_i$'s in (3.3) are defined on the same probability space is important, especially when we apply an LLN to $X_n$ in the second stage (3.5). The reason is as follows. The weak convergence $X_{n,T} \Rightarrow X_n$ as $T \to \infty$ involves only the implication that the distribution of the $X_{n,T}$ converges to the distribution of the $X_n$, not any properties of the probability space where the $X_n$ are defined.
Indeed, if the weak convergence is mixing (e.g., see Hall and Heyde (1980)), then $X_n$ escapes from the underlying probability space when $T \to \infty$. However, to employ an LLN to the sequence of $X_n$, these variates need to be defined on the same probability space. The requirement that the $Y_i$'s in (3.3) are defined on the same probability space can be accommodated by suitably enlarging the underlying space. The construction of such a probability space is provided in Appendix B.

Next, we define the concepts of joint convergence in probability and joint weak convergence.

DEFINITION 2: (a) Suppose that the $m$-vector random sequence $X_{n,T}$ and $X$ are defined on a probability space $(\Omega, \mathcal{F}, P)$. $X_{n,T}$ is said to converge in probability jointly to $X$, written $X_{n,T} \to_p X$ as $(T, n \to \infty)$, if

$$ \lim_{T, n \to \infty} P\{ \|X_{n,T} - X\| > \varepsilon \} = 0 \quad \forall \varepsilon > 0. \tag{3.6} $$

(b) $X_{n,T}$ is said to converge in distribution jointly to an $(m \times 1)$ random vector $X$, written $X_{n,T} \Rightarrow X$ as $(T, n \to \infty)$, if

$$ \lim_{T, n \to \infty} | E f(X_{n,T}) - E f(X) | = 0 \quad \forall f \in C, \tag{3.7} $$

where $C$ is the class of all bounded, continuous real functions on $\mathbb{R}^m$.

REMARKS: (a) Evidently, joint convergence implies diagonal convergence on all monotonic diagonal paths. Moreover, a version of the converse is also true, namely that $X_{n,T} \to_p X$ (or $X_{n,T} \Rightarrow X$) as $(T, n \to \infty)$ if $X_{n,T(n)} \to_p X$ as $(T(n), n \to \infty)_{diag}$ for all $T(n) \to \infty$ monotonically as $n \to \infty$.

(b) In some of our results, we need to place a condition on the indexes in joint convergence of the form $n/T \to 0$. Joint convergence as $(T, n \to \infty)$ is then said to apply subject to this condition. The definitional limits given in (3.6) and (3.7) above are naturally subject to the same condition regarding the passage of the indexes to infinity in this case.

Sequential limits are by no means always the same as joint limits or diagonal path limits. Sometimes, different normalizations are even required to obtain nondegenerate limits. PM^b gives several examples.
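The examples in PM^b are not reproduced here. As a purely illustrative deterministic toy case (an assumption of this note, not taken from the paper), consider the array $X_{n,T} = n/(n+T)$, for which the different types of limit can be computed by hand:

```latex
\begin{align*}
\lim_{n\to\infty}\lim_{T\to\infty}\frac{n}{n+T} &= \lim_{n\to\infty} 0 = 0
  && \text{(sequential, } T \text{ first)},\\
\lim_{T\to\infty}\lim_{n\to\infty}\frac{n}{n+T} &= \lim_{T\to\infty} 1 = 1
  && \text{(sequential, } n \text{ first)},\\
\lim_{n\to\infty}\frac{n}{n+cn} &= \frac{1}{1+c}
  && \text{(diagonal path } T(n)=cn,\ c>0\text{)}.
\end{align*}
```

Since the diagonal limit varies with $c$, no joint limit exists in this toy case even though both sequential limits do, which is why the order of the indices and the expansion path matter.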
Nevertheless, under some circumstances we can establish a relationship between sequential limits and joint limits. The following two lemmas give some elementary conditions.

LEMMA 5 (Conditions for Joint Convergence to Imply Sequential Convergence): (a) Suppose there exist random vectors $X_n$ on the same probability space as $X_{n,T}$ satisfying, for all $n$, $X_{n,T} \to_p X_n$ as $T \to \infty$. If $X_{n,T} \to_p X$ as $(T, n \to \infty)$, then $X_{n,T} \to_p X$ sequentially as $(T, n \to \infty)_{seq}$.
(b) Suppose there exist random vectors $X_n$ such that, for any fixed $n$, $X_{n,T} \Rightarrow X_n$ as $T \to \infty$. If $X_{n,T} \Rightarrow X$ as $(n, T \to \infty)$, then $X_{n,T} \Rightarrow X$ sequentially as $(T, n \to \infty)_{seq}$.

LEMMA 6 (Conditions for Sequential Convergence to Imply Joint Convergence): (a) Suppose there exist random vectors $X_n$ and $X$ on the same probability space as $X_{n,T}$ satisfying, for all $n$, $X_{n,T} \to_p X_n$ as $T \to \infty$ and $X_n \to_p X$ as $n \to \infty$. Then, $X_{n,T} \to_p X$ as $(n, T \to \infty)$ if and only if

$$ \limsup_{n,T} P\{ \|X_{n,T} - X_n\| > \varepsilon \} = 0 \quad \forall \varepsilon > 0. \tag{3.8} $$

(b) Suppose there exist random vectors $X_n$ such that, for any fixed $n$, $X_{n,T} \Rightarrow X_n$ as $T \to \infty$ and $X_n \Rightarrow X$ as $n \to \infty$. Then, $X_{n,T} \Rightarrow X$ as $(n, T \to \infty)$ if and only if

$$ \limsup_{n,T} | E(f(X_{n,T})) - E(f(X_n)) | = 0 \quad \forall f \in C. \tag{3.9} $$

3.2. Joint Convergence in Probability

Consider a double indexed process $X_{n,T}$ whose typical form is an average of $(m \times 1)$ random vectors $Y_{i,T}$,

$$ X_{n,T} = \frac{1}{n} \sum_{i=1}^{n} Y_{i,T}, \tag{3.10} $$

where the $Y_{i,T}$ are independent across $i$ for all $T$. The concern is to establish conditions under which a probability limit of $X_{n,T}$ in (3.10) exists and to develop methods of finding this probability limit. Suppose the $X_{n,T}$ are integrable and let

$$ X = \lim_{n,T \to \infty} E X_{n,T} = \lim_{n,T \to \infty} \frac{1}{n} \sum_{i=1}^{n} E Y_{i,T} \tag{3.11} $$

be finite. By definition it is sufficient for $X_{n,T} \to_p X$ as $(n, T \to \infty)$ to show that

$$ \lim_{n,T \to \infty} P\left\{ \left\| \frac{1}{n} \sum_{i=1}^{n} (Y_{i,T} - E Y_{i,T}) \right\| > \varepsilon \right\} = 0 \quad \text{for all } \varepsilon > 0. \tag{3.12} $$

In some applications (3.12) can be verified by showing that

$$ \lim_{n,T \to \infty} E \left\| \frac{1}{n} \sum_{i=1}^{n} (Y_{i,T} - E Y_{i,T}) \right\| = 0 \tag{3.13} $$

using the Markov inequality. Or, if the $X_{n,T}$ are square integrable, (3.12) follows by Chebychev's inequality when

$$ \lim_{n,T \to \infty} E \left\| \frac{1}{n} \sum_{i=1}^{n} (Y_{i,T} - E Y_{i,T}) \right\|^2 = \lim_{n,T \to \infty} \frac{1}{n^2} \sum_{i=1}^{n} E \| Y_{i,T} - E Y_{i,T} \|^2 = 0, \tag{3.14} $$

where the first equality holds because the $Y_{i,T}$ are independent across $i$ for all $T$.

Sequential probability limits can also be derived. From time series limit theory we may obtain the limit behavior of $Y_{i,T}$ when $T \to \infty$. Suppose, for instance, that as $T \to \infty$

$$ Y_{i,T} \Rightarrow Y_i \quad \forall i \tag{3.15} $$

or

$$ Y_{i,T} \to_p Y_i \quad \forall i \tag{3.16} $$

so that, by the independence of $Y_{i,T}$ across $i$ for all $T$, it follows that $X_{n,T} \Rightarrow X_n$ or $X_{n,T} \to_p X_n$ as $T \to \infty$ for all $n$, where $X_n = (1/n) \sum_{i=1}^{n} Y_i$. Suppose also, in the case of (3.15), that the $Y_i$ are defined on the same probability space for all $i$ so that the sum of the limit random variables $(1/n) \sum_{i=1}^{n} Y_i$ is meaningful. Appendix B(1) provides a construction for doing this and, hereafter, we assume that the random vectors $Y_i$ in (3.15) exist on the same probability space whenever we use sequential limit arguments. By allowing $n \to \infty$ and applying a standard strong law for independent random variables to

$$ X_n = \frac{1}{n} \sum_{i=1}^{n} Y_i, \tag{3.17} $$

under some regularity conditions,^4 we may find the sequential limit $X$. Let

$$ \tilde{X} = \lim_{n} \frac{1}{n} \sum_{i=1}^{n} E Y_i. \tag{3.18} $$

Then

$$ X_n = \frac{1}{n} \sum_{i=1}^{n} Y_i \to_{a.s.} \tilde{X} = \lim_{n} \frac{1}{n} \sum_{i=1}^{n} E Y_i. $$

^4 Simple sufficient conditions are that the $Y_i$ are independent with $\sup_i E\|Y_i - E Y_i\|^2 < \infty$.

A fundamental question is whether the joint probability limit $X$ in (3.11) is equivalent to the sequential probability limit $\tilde{X}$ in (3.18). Lemma 6 provides one solution. According to Lemma 6, it is enough to verify condition (3.9) with $X_{n,T} = (1/n) \sum_{i=1}^{n} Y_{i,T}$ and $X_n = (1/n) \sum_{i=1}^{n} Y_i$ to conclude that $X_{n,T} \to_p \tilde{X}$ as $n, T \to \infty$, where $\tilde{X} = \lim_n (1/n) \sum_{i=1}^{n} E Y_i$. The following theorem gives a set of sufficient conditions under which condition (3.9)
is satisfied, so that the probability limits X in Ž3.11. and ˜ X in Ž3.18. are equivalent. THEOREM 1 ŽJoint Probability Limits.: Suppose the Ž m = 1. random ¨ ectors Yi, T are independent across i for all T and integrable. Assume that Yi, T « Yi as T ª ⬁ for all i. Let the following hold: Ži. lim sup n, T Ž1rn.Ý nis1 E 5 Yi, T 5 - ⬁; Žii. lim sup n, T Ž1rn.Ý nis1 5 EYi, T y EYi 5 s 0; Žiii. lim sup n, T Ž1rn.Ý nis1 E 5 Yi, T 515 Yi, T 5 ) n 4 s 0 ᭙ ) 0; and Živ. lim sup nŽ1rn.Ý nis1 E 5 Yi 515 Yi 5 ) n 4 s 0 ᭙ ) 0. Ža. Then condition Ž3.9. holds. Žb. If lim nª⬁Ž1rn.Ý nis1 EYi Ž[ ˜ X . exists and X n ªp ˜ X as n ª ⬁, then X n, T ªp ˜ X as Ž n, T ª ⬁.. In establishing the existence of a joint probability limit of Ž1rn.Ý nis1Yi, T , Theorem 1 requires only first moment assumptions on Yi, T and is, in this respect, less demanding than Ž3.14., which uses second moments of Yi, T . Theorem 1 is particularly useful when the first moment condition Ž3.13. is not so easy to establish. An important special case arises when the Yi, T are a scaled version of some iid random vectors Q i, T .5 Suppose that Yi, T s Ci Q i, T , where the Q i, T 4i are iid for all T and the Ci are Ž m = m. nonrandom matrices for all i. Suppose that Q i, T « Q i as T ª ⬁ for all i, so that Yi s Ci Q i . In general, the Yi, T are heterogenous across i unless the Ci are the same for all i. The source of the heterogeneity of Yi, T is the scale effect Ci , and then the heterogeneity from Ci is smoothed by letting n ª ⬁. We have the following result for this special case. COROLLARY 1: Suppose that Yi, T s Ci Q i, T , where the Q i, T are iid across i for all T, and the Ci are Ž m = m. nonrandom matrices for all i. Assume that the Q i, T are integrable for all T and Q i, T « Q i as T ª ⬁. Assume that C s lim nŽ1rn.Ý nis1Ci exists. If 5 Q i, T 5 is uniformly integrable in T for all i, and if sup i 5 Ci 5 - ⬁, then Ž1rn.Ý nis1Yi, T ªp CEŽ Q i . as Ž n, T ª ⬁.. 5 In many applications, an I Ž1. 
process $Z_{i,t}$ can be decomposed into a scaled random walk process plus an error term, that is, $Z_{i,t}=C_i(1)S_{i,t}+\tilde U_{i,0}-\tilde U_{i,t}$, where $S_{i,t}=S_{i,t-1}+U_{i,t}$ and $C_i(1)$ is the long-run moving average coefficient of $\Delta Z_{i,t}$ (see Phillips and Solo (1992)). Then, the scale factor $C_i$ is $C_i(1)$ and $Q_{i,T}$ corresponds to $f(S_{i,t}/\sqrt{T})$, where $f$ is a continuous functional on some metric space.

3.3. Joint Central Limit Theory

This section considers joint convergence in distribution of the $\sqrt{n}$-standardized double sequence $X_{n,T}$,
\[
X_{n,T}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i,T}, \tag{3.19}
\]
where the $Y_{i,T}$ are independent $(m\times 1)$ random vectors across $i$ with $EY_{i,T}=0$ and $EY_{i,T}Y_{i,T}'=\Omega_{i,T}$. One approach to the limit distribution of $X_{n,T}$ is to attempt to use a multivariate CLT directly. This approach is particularly appropriate in the case of diagonal path limits where we have $X_{n,T(n)}=(1/\sqrt{n})\sum_{i=1}^{n}Y_{i,T(n)}$ and a suitable multivariate CLT for triangular arrays can be applied. This idea was employed by Quah (1994) and Levin and Lin (1993) in their work on panel unit root tests. But, in general, when $n$ and $T$ go to infinity and no specific expansion relation between $n$ and $T$ is assumed, we cannot use traditional CLT's in this way. In what follows, therefore, we develop a joint CLT for $(n,T\to\infty)$, using a Lindeberg condition for a double indexed process.

First, take the case where the $Y_{i,T}$ in (3.19) are scalar random variables. Let $s_{n,T}^{2}=\sum_{i=1}^{n}\Omega_{i,T}$ and define $\xi_{i,n,T}=Y_{i,T}/s_{n,T}$. Then we have the following result.

THEOREM 2 (Joint Limit CLT): Suppose that for $\forall\varepsilon>0$,
\[
\lim_{n,T\to\infty}\sum_{i=1}^{n}E\,\xi_{i,n,T}^{2}\,1\{|\xi_{i,n,T}|>\varepsilon\}=0. \tag{3.20}
\]
Then, as $(n,T\to\infty)$,
\[
\sum_{i=1}^{n}\xi_{i,n,T}\Rightarrow N(0,1).
\]

There are some interesting special cases of this joint CLT. The following result, which is related to a theorem of Eicker (1963), arises when the $(m\times 1)$ random vectors $Y_{i,T}$ are scaled versions of iid random vectors $Q_{i,T}$.

THEOREM 3 (Joint Limit CLT for Scaled Variates): Suppose that $Y_{i,T}=C_iQ_{i,T}$, where the $(m\times 1)$ random vectors $Q_{i,T}$ are iid$(0,\Sigma_T)$ across $i$ for all $T$ and the $C_i$ are $(m\times m)$ nonzero and nonrandom matrices. Assume the following conditions hold:
(i) Let $\sigma_T^{2}=\lambda_{\min}(\Sigma_T)$ and $\liminf_T\sigma_T^{2}>0$;
(ii) $\max_{i\le n}\|C_i\|^{2}/\lambda_{\min}\left(\sum_{i=1}^{n}C_iC_i'\right)=O(1/n)$ as $n\to\infty$;
(iii) $\|Q_{i,T}\|^{2}$ are uniformly integrable in $T$;
(iv) $\lim_{n,T}(1/n)\sum_{i=1}^{n}C_i\Sigma_TC_i'=\Omega>0$.
Then,
\[
X_{n,T}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i,T}\Rightarrow N(0,\Omega)\quad\text{as } n,T\to\infty.
\]

Sequential weak convergence of $(1/\sqrt{n})\sum_{i=1}^{n}Y_{i,T}$ can be derived in the same way as the sequential probability limit of $(1/n)\sum_{i=1}^{n}Y_{i,T}$ considered earlier. Suppose that, for each $i$ as $T\to\infty$, the random variables $Y_{i,T}$ converge in distribution to $Y_i$, where the $Y_i$ are independent with mean zero and variance $\Omega_i$. Then, $(1/\sqrt{n})\sum_{i=1}^{n}Y_i\Rightarrow N(0,\Omega)$ if for all $\varepsilon>0$ as $n\to\infty$
\[
\sum_{i=1}^{n}E\left[\frac{Y_i^{2}}{s_n^{2}}\,1\left\{\frac{Y_i^{2}}{s_n^{2}}>\varepsilon\right\}\right]\to 0, \tag{3.21}
\]
and
\[
\frac{1}{n}s_n^{2}=\frac{1}{n}\sum_{i=1}^{n}\Omega_i\to\Omega. \tag{3.22}
\]
In many econometric applications $Y_i$ is a Gaussian random variable or a function of a Gaussian process, so the $Y_i$ usually possess higher moments. Second moment requirements then follow automatically, and the Lindeberg condition (3.21) for $(1/\sqrt{n})\sum_{i=1}^{n}Y_i$ may be verified directly using a Liapounov condition.

Additional Remarks

(a) Sequential weak convergence of $(1/\sqrt{n})\sum_{i=1}^{n}Y_{i,T}$ to $N(0,\Omega)$ under conditions (3.21) and (3.22) does not imply that $(1/\sqrt{n})\sum_{i=1}^{n}Y_{i,T}$ converges in distribution jointly to $N(0,\Omega)$ as $(n,T\to\infty)$. According to Lemma 6(ii), condition (3.9) is a necessary and sufficient condition for $(1/\sqrt{n})\sum_{i=1}^{n}Y_{i,T}$ to converge in distribution jointly to the sequential limit distribution $N(0,\Omega)$. In this case, therefore, condition (3.20) and the condition that $(1/n)\sum_{i=1}^{n}\Omega_{i,T}\to\Omega$ as $(n,T\to\infty)$ provide sufficient conditions for condition (3.9).

(b) When $Y_{i,T}$ in (3.19) 
does not have mean zero, but instead has mean zero asymptotically as $T\to\infty$ for each $i$, joint CLT's such as Theorem 2 or Theorem 3 cannot be applied. In this case, $T$ needs to increase fast enough to make the $\sqrt{n}$-standardized sum of the biases small. That is, $(1/\sqrt{n})\sum_{i=1}^{n}EY_{i,T}$ should go to zero as $(n,T\to\infty)$. Asymptotic normality of $(1/\sqrt{n})\sum_{i=1}^{n}Y_{i,T}$ will then continue to hold provided the expansion rate between $n$ and $T$ allows the bias to go to zero. The next section gives an example where this problem arises (e.g., see Theorem 4).

4. SPURIOUS PANEL REGRESSION

This section considers the case where the two component random vectors $Y_{i,t}$ and $X_{i,t}$ of $Z_{i,t}$ in (2.1) have no cointegrating relation for any $i$. This case is covered by the following assumption.

ASSUMPTION 3 (Spurious Regression): The random matrices $\Omega_i$ are positive definite almost surely.

Suppose that we perform a time series regression of $Y_{i,t}$ on $X_{i,t}$:
\[
Y_{i,t}=\hat\beta_iX_{i,t}+\hat U_{i,t}, \tag{4.1}
\]
where $\hat\beta_i=\sum_{t=1}^{T}Y_{i,t}X_{i,t}'\left(\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right)^{-1}$. As is well known (e.g., Phillips (1986)), under Assumption 3 the regression coefficient estimator $\hat\beta_i$ has the following nondegenerate limit distribution, as $T\to\infty$:
\[
\hat\beta_i\Rightarrow\left(\int M_{y_i}M_{x_i}'\right)\left(\int M_{x_i}M_{x_i}'\right)^{-1}\quad\text{for all } i. \tag{4.2}
\]
The weak convergence result (4.2) implies that regression (4.1) is spurious in the sense that the regression of $Y_{i,t}$ on $X_{i,t}$ does not identify any fixed long-run relation between $Y_{i,t}$ and $X_{i,t}$. By contrast, the main result in what follows is that, in a panel data set, such regressions are no longer spurious and do, in fact, distinguish a long-run average relation between $Y_{i,t}$ and $X_{i,t}$.

Consider the following linear least-squares regression of $Y_{i,t}$ on $X_{i,t}$ with pooled panel data:
\[
Y_{i,t}=\hat\beta_{n,T}X_{i,t}+\hat U_{i,t}, \tag{4.3}
\]
where
\[
\hat\beta_{n,T}=\left(\sum_{i=1}^{n}\sum_{t=1}^{T}Y_{i,t}X_{i,t}'\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right)^{-1}. \tag{4.4}
\]
We now proceed to develop an asymptotic theory for $\hat\beta_{n,T}$.

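The contrast between the individual regressions (4.1) and the pooled regression (4.4) can be illustrated with a small simulation. The sketch below is only an illustration under assumptions that are not in the paper: scalar $Y_{i,t}$ and $X_{i,t}$, iid Gaussian shocks, zero initial conditions, and a hypothetical random loading $a_i\sim U(0,1)$ chosen so that $\Omega_{y_ix_i}=a_i$ and $\Omega_{x_ix_i}=1$. Under that design the ratio $E\Omega_{y_ix_i}/E\Omega_{x_ix_i}$ equals $0.5$.

```python
import numpy as np


def simulate_panel(n, T, rng):
    """Panel of bivariate random walks with no cointegration for any i.

    Hypothetical design (an assumption of this sketch, not the paper's DGP):
    U_y = V1 + a_i * V2 and U_x = V2 with a_i ~ U(0, 1), so that
    Omega_yx_i = a_i and Omega_xx_i = 1.
    """
    a = rng.uniform(0.0, 1.0, size=n)
    V = rng.standard_normal((n, T, 2))
    Y = np.cumsum(V[:, :, 0] + a[:, None] * V[:, :, 1], axis=1)
    X = np.cumsum(V[:, :, 1], axis=1)
    return Y, X


def individual_ols(Y, X):
    """Scalar version of the time series estimator in (4.1), one per unit."""
    return np.sum(Y * X, axis=1) / np.sum(X * X, axis=1)


def pooled_ols(Y, X):
    """Scalar version of the pooled estimator in (4.4)."""
    return np.sum(Y * X) / np.sum(X * X)


rng = np.random.default_rng(0)
Y, X = simulate_panel(n=400, T=200, rng=rng)
beta_individual = individual_ols(Y, X)
beta_pooled = pooled_ols(Y, X)

print("std. dev. of individual estimates:", beta_individual.std())
print("pooled estimate:", beta_pooled)
```

In this design the individual estimates stay widely dispersed however large $T$ is, while the pooled estimate settles near $0.5$.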
The approach we adopt is to derive the limit under sequential convergence of the indices $(T,n)$, and then show that the limit continues to hold under joint convergence $(T,n\to\infty)$ provided certain conditions hold. In many cases, this is the simplest way to proceed. Indeed, for estimators like $\hat\beta_{n,T}$, asymptotic results are readily obtained using sequential asymptotics such as $(T,n\to\infty)_{\mathrm{seq}}$. According to (2.6) in first stage asymptotics, the pooled estimator $\hat\beta_{n,T}$ has the following limit distribution:
\[
\hat\beta_{n,T}\Rightarrow\left(\frac{1}{n}\sum_{i=1}^{n}\int M_{y_i}M_{x_i}'\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int M_{x_i}M_{x_i}'\right)^{-1}
\]
as $T\to\infty$ for any fixed $n$. From Lemma 4 we know that $\int M_{y_i}M_{x_i}'$ and $\int M_{x_i}M_{x_i}'$ have finite second moments. Also, as in (2.8) above, by direct calculation we get $E\left(\int M_iM_i'\right)=\tfrac12E(\Omega_i)=\tfrac12\Omega$. Then, applying the strong law of large numbers as in (2.7), we have
\[
\frac{1}{n}\sum_{i=1}^{n}\int M_{y_i}M_{x_i}'\xrightarrow{a.s.}\tfrac12\Omega_{yx}\quad\text{and}\quad\frac{1}{n}\sum_{i=1}^{n}\int M_{x_i}M_{x_i}'\xrightarrow{a.s.}\tfrac12\Omega_{xx},
\]
as $n\to\infty$. By Assumption 3, $\Omega_{x_ix_i}$ is positive definite a.s., and $c'\Omega_{x_ix_i}c>0$ a.s. for any $c\ne0$ in $\mathbb{R}^{m_x}$. Thus, $Ec'\Omega_{x_ix_i}c=c'\Omega_{xx}c>0$, which implies that $\Omega_{xx}$ is positive definite. Hence $\Omega_{xx}^{-1}$ exists, and so we have as $(T,n\to\infty)_{\mathrm{seq}}$:
\[
\hat\beta_{n,T}\to_p\Omega_{yx}\Omega_{xx}^{-1}.
\]
Let $\beta=\Omega_{yx}\Omega_{xx}^{-1}$. We will call the parameter $\beta$ the long-run average regression coefficient. It is the matrix regression coefficient (of $y$ on $x$) associated with the long-run average covariance matrix $\Omega$.

To find the limit distribution of $\hat\beta_{n,T}$ we rescale the centered estimator $(\hat\beta_{n,T}-\beta)$ by $\sqrt{n}$ and let $T\to\infty$ for fixed $n$. For all fixed $n$ as $T\to\infty$ we have
\[
\sqrt{n}\,(\hat\beta_{n,T}-\beta)\Rightarrow\left[\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\right)\right]\left(\frac{1}{n}\sum_{i=1}^{n}\int M_{x_i}M_{x_i}'\right)^{-1}. \tag{4.5}
\]
Note that
\[
E\left(\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\right)=E\left[E\left(\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\,\Big|\,\mathcal{F}_{c_i}\right)\right]=\tfrac12E\left(\Omega_{y_ix_i}-\Omega_{yx}\Omega_{xx}^{-1}\Omega_{x_ix_i}\right)=0,
\]
where the conditional expectation exists because $\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'$ is square integrable by Lemma 4. 
Thus, the numerator of (4.5) has mean zero. Also, we know that the numerator has finite second moments from Lemma 4, and with a straightforward calculation the variance matrix is found$^6$ to be
\[
\begin{aligned}
&E\left[\mathrm{vec}\left(\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\right)\mathrm{vec}\left(\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\right)'\right]\\
&\quad=\tfrac16E\left(\Omega_{x_ix_i}\otimes\left(\Omega_{y_iy_i}-\beta\Omega_{x_iy_i}-\Omega_{y_ix_i}\beta'+\beta\Omega_{x_ix_i}\beta'\right)\right)\\
&\qquad+\tfrac16E\left(\left(\Omega_{x_iy_i}-\Omega_{x_ix_i}\beta'\right)\otimes\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\right)K_{m_ym_x}\\
&\qquad+\tfrac14E\left(\mathrm{vec}\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\left(\mathrm{vec}\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\right)'\right)\\
&\quad=\Theta,\ \text{say},
\end{aligned}\tag{4.6}
\]
where $K_{m_ym_x}$ is the $(m_ym_x\times m_ym_x)$ commutation matrix (e.g., see Magnus and Neudecker (1988)). The sequence of random matrices $\left\{\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\right\}_i$ in the numerator of the matrix quotient (4.5) is iid$(0,\Theta)$ across $i$. From the multivariate Lindeberg-Levy theorem, we then get as $n\to\infty$
\[
\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\int M_{y_i}M_{x_i}'-\beta\int M_{x_i}M_{x_i}'\right)\Rightarrow N(0,\Theta). \tag{4.7}
\]

6 The calculations are given in Appendix C of PM b.

Combining (4.7) with the limit
\[
\frac{1}{n}\sum_{i=1}^{n}\int M_{x_i}M_{x_i}'\xrightarrow{a.s.}\tfrac12\Omega_{xx}\quad\text{as } n\to\infty,
\]
we have the following limit distribution of the pooled estimator $\hat\beta_{n,T}$ as $(T,n\to\infty)_{\mathrm{seq}}$:
\[
\sqrt{n}\,(\hat\beta_{n,T}-\beta)\Rightarrow N\left(0,\,4\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\right). \tag{4.8}
\]
Theorem 4 below shows that these results continue to hold in joint asymptotics as $(T,n\to\infty)$. For the limit distribution (4.8) to hold in this case we need the additional requirement that $n/T\to0$. No additional condition is required for consistency.

THEOREM 4: Suppose Assumptions 1, 2, and 3 hold.
(a) Then, as $(n,T\to\infty)$, we have $\hat\beta_{n,T}\to_p\beta$.
(b) If $(n,T\to\infty)$ and $n/T\to0$, then
\[
\sqrt{n}\,(\hat\beta_{n,T}-\beta)\Rightarrow N\left(0,\,4\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\right).
\]

REMARKS: (a) The restriction $(n/T)\to0$ in Theorem 4 controls the effects of bias in the panel regression. Under the assumptions on the DGP given in Section 2, the expectation of the components in the numerator of $\sqrt{n}\,(\hat\beta_{n,T}-\beta)$ 
is generally nonzero, i.e.,
\[
E\left(\frac{1}{T^{2}}\sum_{t=1}^{T}\left(Y_{i,t}X_{i,t}'-\beta X_{i,t}X_{i,t}'\right)\right)\ne0,
\]
whereas
\[
E\left(\frac{1}{T^{2}}\sum_{t=1}^{T}\left(Y_{i,t}X_{i,t}'-\beta X_{i,t}X_{i,t}'\right)\right)\to0
\]
as $T\to\infty$ for all $i$. In this case, the condition $(n/T)\to0$ prevents the bias from having a dominating asymptotic effect on the standardized quantity $\sqrt{n}\,(\hat\beta_{n,T}-\beta)$. But, when $(n/T)\nrightarrow0$, the bias can dominate and the asymptotic behavior can be very different. For example, suppose that
\[
\frac{1}{\sqrt{n}}\sum_{i=1}^{n}E\left(\frac{1}{T^{2}}\sum_{t=1}^{T}\left(Y_{i,t}X_{i,t}'-\beta X_{i,t}X_{i,t}'\right)\right)\to b
\]
along some diagonal limit $(T(n),n\to\infty)_{\mathrm{diag}}$. In this event, we can expect to have a limit distribution with an asymptotic bias $b$. Further, the required restriction on the expansion rate between $n$ and $T$ will change depending on the underlying assumptions about the DGP. For example, if the shocks $U_{i,t}\ (=\Delta Z_{i,t})$ are iid over $t$ and $Z_{i,0}=0$ for all $i$, then our results hold as $(n,T\to\infty)$ without imposing any restriction on the expansion rate between $n$ and $T$.

(b) Theorem 4 holds for any partition of $Z_{i,t}$. If the panel data form a vector unit root model such as (2.1), then we can estimate the average long-run relation between any two subvectors. In effect, therefore, there is an average long-run relationship between any two subvector components of an integrated process over a cross section population.

(c) A key factor in determining these results is that panel data provide iid cross section information that is unavailable in a simple time series context. In consequence, we may expect some version of these results to apply when the regression utilizes only a fraction of the time series data. Suppose we regress $Y_{i,t}$ on $X_{i,t}$ using the cross section observations at time period $t=[Tr]$ with $0<r\le1$. The cross section OLS estimator $\tilde\beta_{n,[Tr]}$ is then defined by
\[
\tilde\beta_{n,[Tr]}=\left(\sum_{i=1}^{n}Y_{i,t}X_{i,t}'\right)\left(\sum_{i=1}^{n}X_{i,t}X_{i,t}'\right)^{-1}. \tag{4.9}
\]
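A small Monte Carlo can compare the cross section estimator (4.9) with the pooled estimator (4.4). The sketch below is illustrative only and rests on assumptions that are not in the paper: a scalar design with iid Gaussian shocks, zero initial conditions, a hypothetical loading $a_i\sim U(0,1)$ (so the long-run average coefficient is $0.5$), and $r=1$. Both estimators should center on $0.5$. Because the cross section estimator uses a single period, its sampling variance would be expected to be the larger of the two.

```python
import numpy as np


def simulate_panel(n, T, rng):
    """Hypothetical non-cointegrated panel: Omega_yx_i = a_i, Omega_xx_i = 1."""
    a = rng.uniform(0.0, 1.0, size=n)
    V = rng.standard_normal((n, T, 2))
    Y = np.cumsum(V[:, :, 0] + a[:, None] * V[:, :, 1], axis=1)
    X = np.cumsum(V[:, :, 1], axis=1)
    return Y, X


def pooled_ols(Y, X):
    """Scalar pooled estimator (4.4): uses every time period."""
    return np.sum(Y * X) / np.sum(X * X)


def cross_section_ols(Y, X, r=1.0):
    """Scalar cross section estimator (4.9): uses period t = [Tr] only."""
    T = X.shape[1]
    t = max(int(np.floor(T * r)), 1) - 1  # zero-based index of period [Tr]
    return np.sum(Y[:, t] * X[:, t]) / np.sum(X[:, t] ** 2)


rng = np.random.default_rng(0)
n, T, reps = 100, 100, 1000
pooled = np.empty(reps)
cross = np.empty(reps)
for k in range(reps):
    Y, X = simulate_panel(n, T, rng)
    pooled[k] = pooled_ols(Y, X)
    cross[k] = cross_section_ols(Y, X, r=1.0)

# Variances of the sqrt(n)-scaled estimators across replications.
var_pooled = n * pooled.var()
var_cross = n * cross.var()
print("means:", pooled.mean(), cross.mean())
print("n * variance (pooled, cross section):", var_pooled, var_cross)
```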
Using similar arguments to those employed above, we can show that, in sequential asymptotics as $(T,n\to\infty)_{\mathrm{seq}}$, we have
\[
\tilde\beta_{n,[Tr]}\to_p\beta,
\]
and
\[
\sqrt{n}\left(\tilde\beta_{n,[Tr]}-\beta\right)\Rightarrow N\left(0,\,\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\tilde\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\right),
\]
where
\[
\begin{aligned}
\tilde\Theta&=E\left(\Omega_{x_ix_i}\otimes\left(\Omega_{y_iy_i}-\beta\Omega_{x_iy_i}-\Omega_{y_ix_i}\beta'+\beta\Omega_{x_ix_i}\beta'\right)\right)\\
&\quad+E\left(\left(\Omega_{x_iy_i}-\Omega_{x_ix_i}\beta'\right)\otimes\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\right)K_{m_ym_x}\\
&\quad+E\left(\mathrm{vec}\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\left(\mathrm{vec}\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\right)'\right).
\end{aligned}
\]

(d) Since
\[
\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\tilde\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)-4\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)>0,
\]
the cross section estimator $\tilde\beta_{n,[Tr]}$ is asymptotically less efficient than the pooled estimator $\hat\beta_{n,T}$. This is to be expected because the pooled estimator $\hat\beta_{n,T}$ uses all the time series information while the cross section estimator $\tilde\beta_{n,[Tr]}$ uses only single time period information. It is therefore interesting to note that, although time series regression may be spurious, use of all the time series data does reduce the limiting variance in a panel regression. Heuristically, this is because when we pool the data we average the limiting information, and quantities like $\int_0^1M_{y_i}$ and $\int_0^1M_{y_i}M_{x_i}'$ have less variation than $M_{y_i}(1)$ and $M_{y_i}(1)M_{x_i}(1)'$. For example, $W(1)\equiv N(0,1)$, whereas $\int_0^1W\equiv N(0,\tfrac13)$.

5. PANEL COINTEGRATION

This section considers the case where the variables in $Z_{i,t}$ are cointegrated. As discussed in Phillips (1986), there exists a cointegrating relation among the variables in $Z_{i,t}$ if the conditional long-run variance matrix $\Omega_i$ of $Z_{i,t}$ has deficient rank. We will discuss three particular types of model: (i) heterogeneous panel cointegration, where there exist different cointegrating relations among the variables in $Z_{i,t}$ across individuals; (ii) homogeneous panel cointegration, where the cointegration relation is the same for all the individuals; and (iii) 
near-homogeneous panel cointegration, where there exist slightly different cointegrating relations across the individuals.

5.1. Heterogeneous Panel Cointegration

We start by strengthening the moment conditions of the random coefficients $C_{i,t}$ in (2.3) and the summability conditions as follows. These conditions help to ensure the existence of a valid BN decomposition for the equation errors in the panel cointegration model (5.2) given below.

ASSUMPTION 4 (Random Coefficient Conditions$'$):
(i) Assumption 1(i) holds.
(ii) $EC_{a,i,t}^{16}\ (=\sigma_{16,a,t})<\infty$ for all $a=1,\ldots,m^{2}$.

ASSUMPTION 5 (Summability Condition$'$): For all $a=1,\ldots,m^{2}$:
(i) $\sum_{t=0}^{\infty}t^{2}\sigma_{2,a,t}<\infty$;
(ii) $\sum_{t=0}^{\infty}t^{4}(\sigma_{4,a,t})^{1/4}<\infty$;
(iii) $\sum_{t=0}^{\infty}t^{2}(\sigma_{8,a,t})^{1/8}<\infty$;
(iv) $\sum_{t=0}^{\infty}(\sigma_{16,a,t})^{1/16}<\infty$.

The previous section assumes that the conditional long-run covariance matrix $\Omega_i$ of the integrated vector $Z_{i,t}$ in (2.1) is positive definite. When $\Omega_i$ is singular, important differences arise in the time series case, as is well known, and a different large $T$ time theory applies for each $i$.

ASSUMPTION 6: The following conditions hold almost surely.
(i) $\Omega_i$ has rank $m_x$.
(ii) Each $(m_x\times m_x)$ leading submatrix $\Omega_{x_ix_i}$ is positive definite.

In this case the generating mechanism (2.1) has a deficient set of unit roots and the vector $Z_{i,t}$ is cointegrated almost surely. To see this, take an arbitrary element of the probability space for which (i) and (ii) of Assumption 6 hold. Then we have $\Omega_{y_iy_i}=\Omega_{y_ix_i}\Omega_{x_ix_i}^{-1}\Omega_{x_iy_i}$. Let $\alpha_i=(I_{m_y},-\beta_i)$ and $\beta_i=\Omega_{y_ix_i}\Omega_{x_ix_i}^{-1}$. The $(m_y\times m)$ random matrix $\alpha_i$ is well defined because $\Omega_{x_ix_i}$ is positive definite. Since $\Omega_i=C_i(1)C_i(1)'$, the equality $\Omega_{y_iy_i}-\Omega_{y_ix_i}\Omega_{x_ix_i}^{-1}\Omega_{x_iy_i}=0$ can be written as
\[
\alpha_iC_i(1)C_i(1)'\alpha_i'=0, \tag{5.1}
\]
so that $\alpha_i$ is in the row null space of the matrix $C_i(1)$. Define $E_{i,t}=\alpha_iZ_{i,t}=Y_{i,t}-\beta_iX_{i,t}$. 
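The algebra leading to (5.1) can be checked numerically. The sketch below is a toy illustration, not part of the paper: it draws a hypothetical rank-deficient matrix playing the role of $C_i(1)$ (with $m_y=1$, $m_x=2$, and $Y$ ordered before $X$), forms $\Omega_i=C_i(1)C_i(1)'$, and verifies that $\alpha_i=(I_{m_y},-\beta_i)$ annihilates $C_i(1)$.

```python
import numpy as np

rng = np.random.default_rng(1)
m_y, m_x = 1, 2
m = m_y + m_x

# Hypothetical stand-in for C_i(1): an (m x m) matrix of rank m_x,
# built as the product of (m x m_x) and (m_x x m) Gaussian factors.
C1 = rng.standard_normal((m, m_x)) @ rng.standard_normal((m_x, m))
Omega = C1 @ C1.T  # long-run covariance, singular with rank m_x

# Partition with Y first and X second, as in Z = (Y', X')'.
Omega_yy = Omega[:m_y, :m_y]
Omega_yx = Omega[:m_y, m_y:]
Omega_xx = Omega[m_y:, m_y:]

# beta_i = Omega_yx Omega_xx^{-1}; solve() avoids forming the inverse explicitly.
beta_i = np.linalg.solve(Omega_xx, Omega_yx.T).T
alpha_i = np.hstack([np.eye(m_y), -beta_i])

rank_Omega = np.linalg.matrix_rank(Omega, tol=1e-8)
schur = Omega_yy - beta_i @ Omega_yx.T   # Omega_yy - Omega_yx Omega_xx^{-1} Omega_xy
quad_form = alpha_i @ Omega @ alpha_i.T  # left side of (5.1)
null_check = alpha_i @ C1                # alpha_i C_i(1)

print("rank of Omega:", rank_Omega)
print("alpha_i Omega alpha_i':", quad_form)
print("alpha_i C_i(1):", null_check)
```

All three printed residuals should be zero up to floating-point error.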
Note that $\Delta E_{i,t}=\alpha_i\Delta Z_{i,t}=\alpha_iU_{i,t}=\alpha_iC_i(1)V_{i,t}-\Delta\alpha_i\tilde U_{i,t}$, where the last equality comes from the BN decomposition of $U_{i,t}$. Then, since $\alpha_iC_i(1)=0$, $\Delta E_{i,t}=-\Delta\alpha_i\tilde U_{i,t}$, that is, $E_{i,t}=-\alpha_i\tilde U_{i,t}=-\alpha_i\sum_{s=0}^{\infty}\tilde C_{i,s}V_{i,t-s}$. Lemma 15 in Appendix D shows that $E_{i,t}$ is square integrable and that the random coefficients $\{-\alpha_i\tilde C_{i,s}\}_s$ are summable. Hence, Assumption 6 implies the existence of the following panel cointegration model with probability one:
\[
Y_{i,t}\overset{a.s.}{=}\beta_iX_{i,t}+E_{i,t},\qquad X_{i,t}=X_{i,t-1}+U_{x_i,t}, \tag{5.2}
\]
where
\[
F_{i,t}=\begin{pmatrix}E_{i,t}\\U_{x_i,t}\end{pmatrix}=\sum_{s=0}^{\infty}G_{i,s}V_{i,t-s},\qquad G_{i,s}=\begin{pmatrix}-\alpha_i\tilde C_{i,s}\\ \Gamma C_{i,s}\end{pmatrix}, \tag{5.3}
\]
and $\Gamma=(0\ \vdots\ I_{m_x})_{(m_x\times m)}$.

The coefficient $\beta_i$ in model (5.2) is random. This means that $\beta_i$ differs randomly across $i$ and so the cointegrating relation between $Y_{i,t}$ and $X_{i,t}$ is heterogeneous. Also, the random coefficients $G_{i,s}$ in the linear process generating $(E_{i,t}',U_{x_i,t}')'$ in model (5.2) each involve the cointegrating matrix $\alpha_i$ whose main component is $\beta_i=\Omega_{y_ix_i}\Omega_{x_ix_i}^{-1}$, which depends on the inverse of $\Omega_{x_ix_i}$. From Assumption 6, $\Omega_{x_ix_i}^{-1}$ exists almost surely. But, additionally, we need some moment conditions on $\Omega_{y_ix_i}\Omega_{x_ix_i}^{-1}$ to ensure the existence of moments of the random coefficients $G_{i,s}$, which help in establishing the validity of a panel BN decomposition. Assumptions 1 and 2 alone do not assure the existence of moments of $\Omega_{x_ix_i}^{-1}$. Hence, to avoid heavy tails in the density of $\Omega_{x_ix_i}^{-1}$, we make the following assumption about the distribution of $\Omega_{x_ix_i}$.

ASSUMPTION 7: The random matrix $\Omega_{x_ix_i}$ has continuous density function $f$ with the following properties.
(i) $f(\Omega)=O(\mathrm{etr}(-c\,\Omega))$ for some $c>0$ when $\mathrm{tr}(\Omega)\to\infty$, where $\mathrm{etr}(-c\,\Omega)$ denotes $\exp\{\mathrm{tr}(-c\,\Omega)\}$.
(ii) $f(\Omega)=O((\det\Omega)^{\gamma})$ for some $\gamma>7$ when $\det(\Omega)\to0$.

REMARKS: (a) Condition (i) implies that the tail of the density $f$ is exponentially small as $\mathrm{tr}(\Omega)\to\infty$. 
Condition (ii) restricts the behavior of the density $f$ when $\det\Omega\to0$. Taken together (i) and (ii) ensure that $(\det\Omega)^{s}f(\Omega)$ is integrable for $s\ge-8$.

(b) An example of a density $f$ satisfying conditions (i) and (ii) is the Wishart distribution $W_{m_x}(J,I_{m_x})$ whose probability element is
\[
f(\Omega)(d\Omega)=\frac{1}{2^{m_xJ/2}\,\Gamma_{m_x}(J/2)}\,\mathrm{etr}\left(-\tfrac12\Omega\right)(\det\Omega)^{(J-m_x-1)/2}(d\Omega),
\]
with degrees of freedom parameter $J>m_x+15$ and where
\[
\Gamma_{m_x}\left(\frac{J}{2}\right)=\int_{\Omega>0}\mathrm{etr}(-\Omega)(\det\Omega)^{(J-m_x-1)/2}(d\Omega).
\]
In this case, $\Omega^{-1}$ has an inverse Wishart distribution with $(J+m_x+1)$ degrees of freedom and $(m_x\times m_x)$ parameter matrix $I_{m_x}$, $W_{m_x}^{-1}(J+m_x+1,I_{m_x})$ (e.g., see Muirhead (1982, p. 113)).

In view of Lemma 15(a) in Appendix D, $\sum_{s=1}^{\infty}s^{2}\|G_{i,s}\|^{2}<\infty$ a.s. Then, as in Lemma 2, it follows from Phillips and Solo (1992) that $F_{i,t}$ has a valid panel BN decomposition of the form
\[
F_{i,t}\overset{a.s.}{=}G_i(1)V_{i,t}+\tilde F_{i,t-1}-\tilde F_{i,t}, \tag{5.4}
\]
where $G_i(1)V_{i,t}$ and $\tilde F_{i,t}$ are well defined square integrable random vectors in view of Lemma 15. Using (5.4), the partial sum process of $F_{i,t}$ can be written as
\[
\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}F_{i,t}\overset{a.s.}{=}G_i(1)\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}V_{i,t}+\frac{1}{\sqrt{T}}\tilde F_{i,0}-\frac{1}{\sqrt{T}}\tilde F_{i,[Tr]}. \tag{5.5}
\]
With this BN decomposition in place, we can use the Phillips-Solo approach to deduce a functional law for partial sums of $F_{i,t}$. In particular, we have the following lemma:

LEMMA 7: If Assumptions 4-7 hold, then
\[
\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]}F_{i,t}\Rightarrow G_i(1)W_i(r)\quad\text{as } T\to\infty \text{ for all } i, \tag{5.6}
\]
where $W_i(r)$ is a standard vector Brownian motion independent of $\mathcal{F}_{c_i}$.

Thus, $(1/\sqrt{T})\sum_{t=1}^{[Tr]}F_{i,t}$ converges in distribution to a randomly scaled (or mixed) Brownian motion $G_i(1)W_i(r)$ as $T\to\infty$ for all $i$. Let $S_{i,t}=\sum_{s=1}^{t}F_{i,s}+S_{i,0}$, where $S_{i,0}$ are iid across $i$ with $E\|S_{i,0}\|^{4}<\infty$. The next lemma shows that $(1/T)\sum_{t=1}^{T}S_{i,t}F_{i,t}'$ converges in distribution to a matrix stochastic integral plus an $\mathcal{F}_{c_i}$-measurable random matrix.

LEMMA 8: Suppose the assumptions in Lemma 7 hold. Then,
\[
\frac{1}{T}\sum_{t=1}^{T}F_{i,t}S_{i,t}'\Rightarrow G_i(1)\int dW_i\,W_i'\,G_i(1)'+\Lambda_i\quad\text{as } T\to\infty, \tag{5.7}
\]
where $\Lambda_i=\sum_{k=0}^{\infty}E(F_{i,k}F_{i,0}'\,|\,\mathcal{F}_{c_i})=\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}G_{i,s+k}G_{i,s}'$.

Partition $G_i(1)$, $\Lambda_i$, and $G_i(1)W_i(r)$ conformably as follows:
\[
G_i(1)=\begin{pmatrix}G_{e_i}(1)\\G_{x_i}(1)\end{pmatrix}\begin{matrix}m_y\\m_x\end{matrix},\qquad
\Lambda_i=\begin{pmatrix}\Lambda_{e_ie_i}&\Lambda_{e_ix_i}\\ \Lambda_{x_ie_i}&\Lambda_{x_ix_i}\end{pmatrix},
\]
\[
G_i(1)W_i(r)=\begin{pmatrix}G_{e_i}(1)W_i(r)\\G_{x_i}(1)W_i(r)\end{pmatrix}=\begin{pmatrix}M_{e_i}(r)\\M_{x_i}(r)\end{pmatrix}.
\]
Consider the time series regression of $Y_{i,t}$ on $X_{i,t}$. Using (5.6), (5.7), and the continuous mapping theorem, we find the following large $T$ limit distribution for the OLS estimator of the (random) coefficient $\beta_i$:
\[
\begin{aligned}
T(\hat\beta_i-\beta_i)&\overset{a.s.}{=}\left(\frac{1}{T}\sum_{t=1}^{T}E_{i,t}X_{i,t}'\right)\left(\frac{1}{T^{2}}\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right)^{-1}\\
&\Rightarrow\left(G_{e_i}(1)\int dW_i\,W_i'\,G_{x_i}(1)'+\Lambda_{e_ix_i}\right)\left(G_{x_i}(1)\int W_iW_i'\,G_{x_i}(1)'\right)^{-1}\\
&=\left(\int dM_{e_i}M_{x_i}'+\Lambda_{e_ix_i}\right)\left(\int M_{x_i}M_{x_i}'\right)^{-1}\quad\text{as } T\to\infty \text{ for all } i.
\end{aligned}\tag{5.8}
\]
The bias term $\Lambda_{e_ix_i}$ arises in the usual way from the temporal correlation between $E_{i,t}$ and $U_{x_i,t}$ (c.f., Phillips and Durlauf (1986)). Thus, time series regression produces a consistent estimator of the cointegrating matrix $\beta_i$, and thereby distinguishes the randomly differing individual long-run relations between $Y_{i,t}$ and $X_{i,t}$.

When both dimensions of the panel data are utilized, a long-run average coefficient $\beta$ is also identified. This can be accomplished as in the previous section, by means of a pooled panel regression or a limiting cross section regression. The following sections concentrate on pooled panel regression and discuss limiting cross section estimators only briefly.

In the heterogeneous panel cointegration model (5.2) the pooled estimator $\hat\beta_{n,T}$ has the same form as that defined in (4.4). The limit theory for this pooled estimator is as follows.

THEOREM 5: Let the assumptions of Lemma 15 hold. Then:
(a) as $(n,T\to\infty)$, $\hat\beta_{n,T}\to_p\beta=\Omega_{yx}\Omega_{xx}^{-1}$;
(b) as $(n,T\to\infty)$ with $n/T\to0$, $\sqrt{n}\,(\hat\beta_{n,T}-\beta)$ 
$\Rightarrow N\left(0,\,4\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\right)$, where
\[
\begin{aligned}
\Theta&=\tfrac16E\left(\Omega_{x_ix_i}\otimes\left(\Omega_{y_iy_i}-\beta\Omega_{x_iy_i}-\Omega_{y_ix_i}\beta'+\beta\Omega_{x_ix_i}\beta'\right)\right)\\
&\quad+\tfrac16E\left(\left(\Omega_{x_iy_i}-\Omega_{x_ix_i}\beta'\right)\otimes\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\right)K_{m_xm_y}\\
&\quad+\tfrac14E\left(\mathrm{vec}\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)\mathrm{vec}\left(\Omega_{y_ix_i}-\beta\Omega_{x_ix_i}\right)'\right).
\end{aligned}
\]

REMARKS: (a) Define $\tilde E_{i,t}=(\beta_i-\beta)X_{i,t}+E_{i,t}$. Then the heterogeneous panel cointegration model (5.2) becomes
\[
Y_{i,t}=\beta X_{i,t}+\tilde E_{i,t}. \tag{5.9}
\]
The pooled estimator $\hat\beta_{n,T}$ is a least squares estimator of the regression coefficient in (5.9) and consistently estimates the long-run average coefficient $\beta$ between $Y_{i,t}$ and $X_{i,t}$. Note that the noise, $\tilde E_{i,t}$, in this regression involves the integrated random vector $X_{i,t}$. By the same logic as that of the spurious regression case, the long-run coefficient $\beta$ is consistently estimated by pooling the panel data because cross section pooling attenuates the strength of the noise $\tilde E_{i,t}$ relative to the signal in the regression (5.9).

(b) As seen in Theorem 5, the pooled estimator $\hat\beta_{n,T}$ is $\sqrt{n}$ consistent for the average long-run regression coefficient $\beta$ and has a normal limit distribution. Observe that the limit variance matrix for the heterogeneous pooled panel regression estimator in Theorem 5, viz., $4\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)\Theta\left(\Omega_{xx}^{-1}\otimes I_{m_y}\right)$, has precisely the same form as the limit variance matrix of the spurious regression pooled panel regression estimator in Theorem 4. This equivalence in form is especially interesting because the individual long-run covariance matrix $\Omega_i$ is singular in the heterogeneous cointegration case but nonsingular in the spurious regression case, so that these individual component matrices must be different between the two models. 
Nevertheless, and in spite of these differences, the average long-run covariance matrix $\Omega$ may well be nonsingular in the heterogeneous cointegration model, in which case there is a basis for direct comparison between the two results. Obviously, the effect of the heterogeneity in the cointegration parameter is to slow down the rate of convergence of the pooled estimator. In particular, the convergence rate is $\sqrt{n}$ and, interestingly, this rate is uninfluenced by the time series sample size in spite of the fact that the individual time series regressions are themselves $T$-consistent (see (5.8)). Thus, there is a correspondence in the limit theory between the heterogeneous cointegration model and the pooled spurious regression model after pooling the data.

(c) In general, of course, $E[\Omega_{y_ix_i}\Omega_{x_ix_i}^{-1}]\ne E[\Omega_{y_ix_i}](E[\Omega_{x_ix_i}])^{-1}$, so there is no reason why the limit of the average of the cointegrating relation $(1/n)\sum_{i=1}^{n}\beta_i$ should equal $\beta$, the average long-run regression coefficient. As we have seen, it is the latter parameter that is the limit of the pooled regression estimator in the heterogeneous cointegration model. One situation where $\lim_{n\to\infty}(1/n)\sum_{i=1}^{n}\beta_i=\beta$ does hold is when $\Omega_{x_ix_i}$ has a degenerate distribution, namely, $\Omega_{x_ix_i}=\Omega_{xx}$ almost surely. Thus, in the heterogeneous panel cointegration case, the parameter being estimated is not the average cointegrating coefficient, but the average long-run regression coefficient, just as in the spurious panel regression case. Again, the two models are much closer than they might appear.

(d) As discussed in (a), the heterogeneous panel cointegration model can be reinterpreted in the form of the panel model (5.9). As such, we may be interested in constructing statistical tests about the long-run coefficients $\beta$. For example, to test $\mathbb{H}_0:\psi(\beta)=0$, where $\psi(\cdot)$ 
is a $p$-vector of smooth functions on a subset of $\mathbb{R}^{m_y\times m_x}$ such that $\partial\psi/\partial\beta'$ has full rank $p\ (\le m_ym_x)$, we may use the Wald statistic
\[
W_\psi=n\,\psi(\hat\beta_{n,T})'\,\hat V_\psi^{-1}\,\psi(\hat\beta_{n,T}),
\]
where $\hat V_\psi=\left(\partial\psi(\hat\beta_{n,T})/\partial\beta'\right)\hat V_\beta\left(\partial\psi(\hat\beta_{n,T})'/\partial\beta\right)$, $\hat V_\beta=4\left(\hat\Omega_{xx}^{-1}\otimes I_{m_y}\right)\hat\Theta\left(\hat\Omega_{xx}^{-1}\otimes I_{m_y}\right)$,
\[
\hat\Theta=\frac{1}{n}\sum_{i=1}^{n}\left\{\frac{1}{T^{4}}\sum_{s,t=1}^{T}X_{i,t}X_{i,s}'\otimes\hat E_{i,t}\hat E_{i,s}'\right\},\qquad
\hat\Omega_{xx}^{-1}=\left\{\frac{1}{n}\sum_{i=1}^{n}\frac{2}{T^{2}}\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right\}^{-1},
\]
and $\hat E_{i,t}=Y_{i,t}-\hat\beta_{n,T}X_{i,t}$. Some simple manipulations in the case of sequential asymptotics show that this statistic leads to a standard asymptotic $\chi^{2}$ test as $(T,n\to\infty)_{\mathrm{seq}}$. This limit theory also holds very generally under joint limits as $(T,n\to\infty)$, as the next result reveals.

THEOREM 6: Under $\mathbb{H}_0:\psi(\beta)=0$ and Assumptions 4-7, $W_\psi\Rightarrow\chi_p^{2}$, as $(n,T\to\infty)$ with $n/T\to0$.

(e) We may also be interested in testing hypotheses about the coefficients in generalizations of model (5.9) of the following form:
\[
Y_{i,t}=\beta X_{i,t}+\tilde E_{i,t}\quad\text{with}\quad
\begin{cases}\beta=\beta_a&\text{for } i\in I_a,\\ \beta=\beta_b&\text{for } i\in I_b,\end{cases}\tag{5.10}
\]
where $I_a$ and $I_b$ are index sets corresponding to subgroups of the cross section population for which the long-run average covariance matrices are $\Omega_a$ and $\Omega_b$, respectively, leading to long-run average regression coefficients $\beta_a=\Omega_{a,yx}\Omega_{a,xx}^{-1}$ and $\beta_b=\Omega_{b,yx}\Omega_{b,xx}^{-1}$ that may differ between the two populations. Models like (5.10) can be readily extended to multi-category models and will be empirically relevant, for example, in cross country panel regressions where countries are partitioned into classes of similar category like developed (OECD) nations and developing and undeveloped nations. Note that in such cases the model (5.10) allows for intra-class variation (i.e., the regression coefficients $\beta_i$ for $i\in I_a$ will differ) but our primary interest lies in the inter-class difference $\beta_a-\beta_b$. A natural hypothesis is then $\mathbb{H}_0:\beta_a=\beta_b$. Let $n_a=\#(I_a)$ and $n_b=\#(I_b)$, respectively. Suppose that $n_b/n_a\to\kappa<\infty$ as $n_a,n_b\to\infty$. 
The null hypothesis can be tested by constructing pooled regression coefficients $\hat\beta_a,\hat\beta_b$ in each class and computing the Wald statistic
\[
W_{a,b}=n_b\left\{\mathrm{vec}\left(\hat\beta_a-\hat\beta_b\right)'\hat V_{a-b}^{-1}\,\mathrm{vec}\left(\hat\beta_a-\hat\beta_b\right)\right\},
\]
where $\hat V_{a-b}=(n_b/n_a)\hat V_a+\hat V_b$, $\hat V_\iota=4\left(\hat\Omega_{\iota,xx}^{-1}\otimes I_{m_y}\right)\hat\Theta_\iota\left(\hat\Omega_{\iota,xx}^{-1}\otimes I_{m_y}\right)$,
\[
\hat\Theta_\iota=\frac{1}{n_\iota}\sum_{i\in I_\iota}\left\{\frac{1}{T^{4}}\sum_{s,t=1}^{T}X_{i,t}X_{i,s}'\otimes\hat E_{i,t}^{\iota}\hat E_{i,s}^{\iota\prime}\right\},\qquad
\hat\Omega_{\iota,xx}^{-1}=\left\{\frac{1}{n_\iota}\sum_{i\in I_\iota}\frac{2}{T^{2}}\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right\}^{-1},
\]
and $\hat E_{i,t}^{\iota}=Y_{i,t}-\hat\beta_\iota X_{i,t}$ with $\iota\in\{a,b\}$. Again, this leads to an asymptotic $\chi^{2}$ test. The following result gives the limit theory under joint limits as $(n_a,n_b,T\to\infty)$ and can be obtained in a simple way from Theorems 5 and 6.

THEOREM 7: Under $\mathbb{H}_0:\beta_a=\beta_b$ and Assumptions 4-7, $W_{a,b}\Rightarrow\chi_{m_ym_x}^{2}$, as $(n_a,n_b,T\to\infty)$ with $n_a/T,\ n_b/T\to0$.

5.2. Homogeneous Panel Cointegration and Pooled FM Estimation

This section considers a homogeneous panel cointegration model, where the cointegrating relations are the same across individuals, and develops an asymptotic theory for a pooled FM estimator. We start with the following simplifying assumption.

ASSUMPTION 8: $C_{i,t}\overset{a.s.}{=}C_t$, where $C_t$ is an $(m\times m)$ nonrandom matrix for all $t$.

Then, under Assumption 6, the panel cointegration model (5.2) becomes
\[
Y_{i,t}\overset{a.s.}{=}\beta X_{i,t}+E_{i,t},\qquad X_{i,t}=X_{i,t-1}+U_{x_i,t}, \tag{5.11}
\]
where $\beta=\Omega_{yx}\Omega_{xx}^{-1}$, $\alpha=(I_{m_y},-\beta)$,
\[
G_s=\begin{pmatrix}G_{e,s}\\G_{x,s}\end{pmatrix}=\begin{pmatrix}-\alpha\tilde C_s\\ \Gamma C_s\end{pmatrix},\qquad
\begin{pmatrix}E_{i,t}\\U_{x_i,t}\end{pmatrix}=\sum_{s=0}^{\infty}G_sV_{i,t-s},\qquad\text{and}\qquad
\tilde C_s=\sum_{j=s+1}^{\infty}C_j.
\]
In this model the same long-run relation between $Y_{i,t}$ and $X_{i,t}$ applies for all $i$. Unlike previous models, the error term in model (5.11) is generated by a linear process with nonrandom coefficients $\{G_s\}$, on which we impose the following summability condition.

ASSUMPTION 9: $\sum_{s=0}^{\infty}s^{3}\|C_s\|<\infty$.

Define $\tilde G_s=\sum_{j=s+1}^{\infty}G_j$. Under Assumption 9, $G(1)=\sum_{s=0}^{\infty}G_s<\infty$ and $\sum_{s=0}^{\infty}s^{2}\|\tilde G_s\|^{2}=\sum_{s=0}^{\infty}s^{2}\left\|\sum_{j=s+1}^{\infty}G_j\right\|^{2}<\infty$. Write $F_{i,t}=(E_{i,t}',U_{x_i,t}')'$. 
Then, as $T\to\infty$, we have the functional law $(1/\sqrt{T})\sum_{t=1}^{[Tr]}F_{i,t}\Rightarrow B_i(r)\equiv BM(\Omega_F)$, where $\Omega_F=G(1)G(1)'$ (Theorem 3.4 in Phillips and Solo (1992)). The following assumption is conventional in time series cointegration analysis, although it could be relaxed with some consequential changes in the asymptotics, including changes in convergence rates in directions determined by the singularity.

ASSUMPTION 10: $\Omega_F$ is positive definite.

Partition $B_i(r)=(B_{e_i}(r)',B_{x_i}(r)')'$ conformably with $F_{i,t}$. Set $S_{i,t}=\sum_{s=1}^{t}F_{i,s}+S_{i,0}$, where $S_{i,0}$ are iid across $i$ with $E\|S_{i,0}\|^{4}<\infty$. Then, in the usual way (Phillips (1988)), as $T\to\infty$
\[
\frac{1}{T}\sum_{t=1}^{T}F_{i,t}S_{i,t}'\Rightarrow\int dB_i\,B_i'+\Lambda_F,
\]
where $\Lambda_F=\sum_{k=0}^{\infty}E(F_{i,k}F_{i,0}')=\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}G_{s+k}G_s'$. Again, conformably partition $\Omega_F$ and $\Lambda_F$ as
\[
\begin{pmatrix}\Omega_{ee}&\Omega_{ex}\\ \Omega_{xe}&\Omega_{xx}\end{pmatrix}\quad\text{and}\quad\begin{pmatrix}\Lambda_{ee}&\Lambda_{ex}\\ \Lambda_{xe}&\Lambda_{xx}\end{pmatrix},
\]
respectively.

For each $i$, model (5.11) is a time series cointegrating regression. The least squares estimator
\[
\hat\beta_i=\left(\sum_{t=1}^{T}Y_{i,t}X_{i,t}'\right)\left(\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right)^{-1}\overset{a.s.}{=}\beta+\left(\sum_{t=1}^{T}E_{i,t}X_{i,t}'\right)\left(\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right)^{-1}
\]
has the following asymptotic distribution (Phillips and Durlauf (1986)):
\[
T(\hat\beta_i-\beta)\Rightarrow\left(\int dB_{e_i}B_{x_i}'+\Lambda_{ex}\right)\left(\int B_{x_i}B_{x_i}'\right)^{-1}\quad\text{as } T\to\infty. \tag{5.12}
\]
The time series estimator $\hat\beta_i$ is therefore consistent for $\beta$, the common long-run coefficient for all $i$, although there may be a second order bias effect entering through the term $\Lambda_{ex}$ in (5.12) arising from the correlation between $E_{i,t}$ and $X_{i,s}$.

When the panel observations are pooled, as in the estimator $\hat\beta_{n,T}$ defined in (4.4), we get
\[
\hat\beta_{n,T}\overset{a.s.}{=}\beta+\left(\sum_{i=1}^{n}\sum_{t=1}^{T}E_{i,t}X_{i,t}'\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T}X_{i,t}X_{i,t}'\right)^{-1}.
\]
When $\Lambda_{ex}=0$, the limit theory of this estimator is as follows.

THEOREM 8: Suppose that Assumptions 6, 8-10 hold. If $\Lambda_{ex}=0$, then as $(n,T\to\infty)$ with $n/T\to0$,
\[
\sqrt{n}\,T(\hat\beta_{n,T}-\beta)\Rightarrow N\left(0,\,2\left(\Omega_{xx}^{-1}\otimes\Omega_{ee}\right)\right).
\]

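The $\sqrt{n}\,T$ scaling and the factor 2 in Theorem 8 can be illustrated by simulation. The sketch below is a rough check under assumptions that are not in the paper: scalar $Y_{i,t}$ and $X_{i,t}$, a hypothetical common coefficient $\beta=2$, iid standard normal $E_{i,t}$ drawn independently of the innovations of $X_{i,t}$ (so that $\Lambda_{ex}=0$, $\Omega_{ex}=0$, and $\Omega_{ee}=\Omega_{xx}=1$), and zero initial conditions. In that design the limit variance $2\,\Omega_{xx}^{-1}\Omega_{ee}$ equals 2.

```python
import numpy as np


def pooled_ols_cointegrated(n, T, beta, rng):
    """Pooled OLS as in (4.4) for one simulated homogeneous cointegrated panel.

    Hypothetical scalar design: X is a pure random walk and the equation
    error E is iid N(0, 1), independent of the innovations of X.
    """
    X = np.cumsum(rng.standard_normal((n, T)), axis=1)
    E = rng.standard_normal((n, T))
    Y = beta * X + E
    return np.sum(Y * X) / np.sum(X * X)


rng = np.random.default_rng(0)
n, T, beta, reps = 50, 100, 2.0, 500

# sqrt(n) * T * (beta_hat - beta) across Monte Carlo replications.
scaled = np.array([
    np.sqrt(n) * T * (pooled_ols_cointegrated(n, T, beta, rng) - beta)
    for _ in range(reps)
])

print("mean of scaled errors:", scaled.mean())
print("variance of scaled errors (limit value 2 in this design):", scaled.var())
```

With finite $n$ and $T$ the simulated variance will only approximate the limit value.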
Thus, if $E_{i,t}$ and $U_{x_i,s}$ are uncorrelated, so that the one-sided long-run covariance $\Lambda_{ex} = 0$, the pooled estimator $\hat\beta_{n,T}$ is $\sqrt{n}\,T$ consistent and has a limiting normal distribution in joint asymptotics as $(n, T \to \infty)$ when $n/T \to 0$.

When $\Lambda_{ex} \neq 0$, we do not attain $\sqrt{n}\,T$ consistency with the pooled least squares estimator $\hat\beta_{n,T}$, because of the persistence of bias effects. However, we may 'fully modify' the regressand $Y_{i,t}$ to eliminate the serial correlation effect $\Lambda_{ex}$. Originally, the fully modified (FM) regression method was introduced in Phillips and Hansen (1990) to correct for the presence of endogeneity (the correlation between $B_{e_i}$ and $B_{x_i}$) and serial correlation in the OLS estimator $\hat\beta_i$ of the individual cointegration regression model. Their construction calls for consistent time series estimators $\hat\Omega_F$ and $\hat\Lambda_F$ of $\Omega_F$ and $\Lambda_F$. In our case, consistent estimates may be constructed using averages (over $i = 1, \ldots, n$) of the usual consistent (as $T \to \infty$) nonparametric kernel estimates of the corresponding long-run quantities for each $i$. More specifically, let $\hat\Gamma_i(j) = (1/T)\sum_t F_{i,t+j}F_{i,t}'$, where the summation is over $1 \le t,\, t+j \le T$, and define the averaged kernel estimators

$$\hat\Omega_F = \frac{1}{n}\sum_{i=1}^{n}\hat\Omega_{F,i}, \quad \hat\Omega_{F,i} = \sum_{j=-T+1}^{T-1} w\!\left(\frac{j}{K}\right)\hat\Gamma_i(j), \qquad \hat\Lambda_F = \frac{1}{n}\sum_{i=1}^{n}\hat\Lambda_{F,i}, \quad \hat\Lambda_{F,i} = \sum_{j=0}^{T-1} w\!\left(\frac{j}{K}\right)\hat\Gamma_i(j), \tag{5.13}$$

where $w(x)$ is a lag kernel for which $w(0) = 1$, $w(x) = w(-x)$, $\int_{-\infty}^{\infty} w(x)^2\,dx < \infty$, and with Parzen's exponent $q \in (0, \infty)$ such that $k_q = \lim_{x \to 0}(1 - w(x))/|x|^q < \infty$ (e.g., see Hannan (1970) or Andrews (1991)).7 As is well known in the nonparametric literature, the choice of the bandwidth $K$ is important in the limit behavior of $\hat\Omega_F$. Under the summability condition given in Assumption 9, it is known that $\hat\Omega_{F,i} \to \Omega_F$ if $K, T \to \infty$ with $K/T \to 0$. However, later in this section (e.g., for Theorem 9) we need the stronger result that $\sqrt{n}(\hat\Omega_F - \Omega_F),\ \sqrt{n}(\hat\Lambda_F - \Lambda_F) = o_p(1)$ as $(n, T \to \infty)$ with $n/T \to 0$. The following Assumption about bandwidth choice is made so that these conditions apply.

7 In determining asymptotic properties of kernel estimates of the long-run variance we usually also impose a smoothness restriction on the spectral density at the origin. This smoothness condition can be formulated as a summability condition on the autocovariance sequence $\Gamma(h) = E(F_{i,t}F_{i,t+h}')$. The summability conditions in Assumption 9 ensure that $\sum_{h=0}^{\infty} h^2\|\Gamma(h)\| < \infty$, and provide sufficient smoothness for our results here.

ASSUMPTION 11: The lag kernel $w(\cdot)$ in (5.13) has Parzen exponent $q > 1/2$, and the bandwidth parameter $K$ tends to infinity with $K/T \to 0$ and $K^{2q}/T \to \epsilon > 0$.

Define

$$Y_{i,t}^{+} = Y_{i,t} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\Delta X_{i,t} \tag{5.14}$$

and

$$\hat\Lambda_{ex}^{+} = \hat\Lambda_{ex} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\hat\Lambda_{xx}. \tag{5.15}$$

Equation (5.14) gives the endogeneity correction and equation (5.15) gives the serial correlation correction. Using these corrections, a pooled FM (PFM) estimator can be defined as follows:

$$\hat\beta_{PFM} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T} Y_{i,t}^{+}X_{i,t}' - nT\hat\Lambda_{ex}^{+}\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1} \overset{a.s.}{=} \beta + \left(\sum_{i=1}^{n}\left(\sum_{t=1}^{T}\hat{E}_{i,t}^{+}X_{i,t}' - T\hat\Lambda_{ex}^{+}\right)\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T} X_{i,t}X_{i,t}'\right)^{-1}, \tag{5.16}$$

where $\hat{E}_{i,t}^{+} = E_{i,t} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\Delta X_{i,t}$. Rescaling $\hat\beta_{PFM} - \beta$ by $\sqrt{n}\,T$ and letting $T \to \infty$ for fixed $n$, we have

$$\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\int dB_{e_i.x_i}B_{x_i}'\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int B_{x_i}B_{x_i}'\right)^{-1},$$

where $B_{e_i.x_i}(r) \equiv BM(\Omega_{e.x})$ and $\Omega_{e.x} = \Omega_{ee} - \Omega_{ex}\Omega_{xx}^{-1}\Omega_{xe}$. Note that $B_{e_i.x_i}(r)$ and $B_{x_i}(r)$ are independent, so $E\int dB_{e_i.x_i}B_{x_i}' = 0$ and

$$E\left[\operatorname{vec}\left(\int dB_{e_i.x_i}B_{x_i}'\right)\operatorname{vec}\left(\int dB_{e_i.x_i}B_{x_i}'\right)'\right] = \tfrac{1}{2}(\Omega_{xx} \otimes \Omega_{e.x}).$$

Thus, applying the multivariate Lindeberg-Levy theorem to $(1/\sqrt{n})\sum_{i=1}^{n}\int dB_{e_i.x_i}B_{x_i}'$ and combining this with the limit of $(1/n)\sum_{i=1}^{n}\int B_{x_i}B_{x_i}'$, we find that as $n \to \infty$

$$\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\int dB_{e_i.x_i}B_{x_i}'\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int B_{x_i}B_{x_i}'\right)^{-1} \Rightarrow N\left(0,\; 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x})\right).$$

Thus, $\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow N(0, 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x}))$ in sequential limit as $(T, n \to \infty)_{seq}$. The following theorem shows that these asymptotics also hold for joint limits.

THEOREM 9: Under Assumptions 6, 8-11 we have

$$\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow N\left(0,\; 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x})\right)$$

as $(n, T \to \infty)$ with $n/T \to 0$.

REMARKS: (a) The pooled FM estimator $\hat\beta_{PFM}$ is $\sqrt{n}\,T$ consistent and has a normal limit distribution.

(b) When $\Lambda_{ex} = 0$, observe that $\hat\beta_{PFM}$ is more efficient than $\hat\beta_{n,T}$ because $\Omega_{e.x} < \Omega_{ee}$. The efficiency gain in $\hat\beta_{PFM}$ is obtained from the endogeneity correction that adjusts $Y_{i,t}$ in the fully modified estimator. This effectively reduces the long-run variance of the noise in the panel cointegrating equation.

(c) Asymptotic $\chi^2$ tests follow from Theorem 9 in the usual way. A consistent estimate of the covariance matrix, $2(\hat\Omega_{xx}^{-1} \otimes \hat\Omega_{e.x})$, can be constructed from $\hat\Omega$ in (5.13) by defining $\hat\Omega_{e.x} = \hat\Omega_{ee} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\hat\Omega_{xe}$. A Wald test of $H_0: \phi(\beta) = 0$, where $\phi(\cdot)$ is a $p$-vector of smooth functions such that $\partial\phi/\partial\beta'$ has full rank $p$, can then be formulated in the usual way as

$$W_\phi = nT^2\,\phi(\hat\beta_{PFM})'\,\hat{V}_\phi^{-1}\,\phi(\hat\beta_{PFM}), \tag{5.17}$$

where

$$\hat{V}_\phi = \left(\partial\phi(\hat\beta_{PFM})/\partial\beta'\right)\, 2\,(\hat\Omega_{xx}^{-1} \otimes \hat\Omega_{e.x})\,\left(\partial\phi(\hat\beta_{PFM})/\partial\beta'\right)'.$$

(d) As in Remark (e) following Theorem 6, it may be of interest to generalize model (5.11) to allow for subgroups of the population in which the regression coefficient is the same. In effect, we may replace model (5.11) with

$$Y_{i,t} \overset{a.s.}{=} \beta X_{i,t} + E_{i,t} \quad\text{with}\quad \begin{cases} \beta = \beta_a & \text{for } i \in I_a, \\ \beta = \beta_b & \text{for } i \in I_b, \end{cases} \qquad X_{i,t} = X_{i,t-1} + U_{x_i,t}. \tag{5.18}$$

It is then possible to test hypotheses about the vectors $\beta_a$ and $\beta_b$ in the generalized model (5.18). For example, to test $H_0: \beta_a = \beta_b$, letting $n_a = \#(I_a)$
and $n_b = \#(I_b)$, respectively, and assuming that $n_b/n_a \to$ […] $\to \infty$ as $n_a, n_b \to \infty$, we may construct pooled FM regression coefficients $\hat\beta_{a,PFM}$, $\hat\beta_{b,PFM}$ in each class and then the Wald statistic

$$W_{a-b,PFM} = n_b T^2\left\{\operatorname{vec}\left(\hat\beta_{a,PFM} - \hat\beta_{b,PFM}\right)'\,\hat{V}_{a-b,PFM}^{-1}\,\operatorname{vec}\left(\hat\beta_{a,PFM} - \hat\beta_{b,PFM}\right)\right\}.$$

Here, $\hat{V}_{a-b,PFM} = \hat{V}_{a,PFM} + \hat{V}_{b,PFM}$, $\hat{V}_{\iota,PFM} = 2\,\hat\Omega_{\iota,xx}^{-1} \otimes \hat\Omega_{\iota,e.x}$, and $\hat\Omega_{\iota,xx}$, $\hat\Omega_{\iota,e.x}$ are the respective estimates of the long-run conditional covariance matrices of the regressors and the fully modified error processes in the classes $I_\iota$ with $n_\iota = \#(I_\iota)$ where $\iota \in \{a, b\}$. As in the earlier case of heterogeneous cointegration, this leads to an asymptotic $\chi^2$ test based on the null distribution $W_{a-b,PFM} \Rightarrow \chi^2_{m_y m_x}$, which follows in a manner analogous to that of Theorem 7.

5.3. Near-Homogeneous Panels

The homogeneous panel model (5.11) discussed above is somewhat unrealistic because it assumes that each individual has exactly the same cointegrating relation. Here we study briefly a panel cointegration model with nearly homogeneous cointegrating vectors of the form

$$\beta_i = \beta + \frac{\theta_i}{\sqrt{n}\,T}, \tag{5.19}$$

where the sequence of $(m_y \times m_x)$ random matrices $\theta_i$ is iid across $i$ with mean $\theta$ and finite variance.

ASSUMPTION 12: $\theta_i$ is independent of $(E_{i,t}, U_{x_i,t})$ for all $i$ and $t$.

We again consider the pooled FM estimator $\hat\beta_{PFM}$ given in (5.16) and the limit theory follows as in Theorem 9 above.

THEOREM 10: Suppose there exists near-homogeneous panel cointegration of the form (5.19). Let Assumptions 9-12 hold. Then, as $(n, T \to \infty)$ with $n/T \to 0$

$$\sqrt{n}\,T(\hat\beta_{PFM} - \beta) \Rightarrow N\left(\theta,\; 2(\Omega_{xx}^{-1} \otimes \Omega_{e.x})\right). \tag{5.20}$$

Theorem 10 is useful in calculating the asymptotic local power of the test statistic for the null hypothesis

$$H_0: \beta_i = \beta_0 \quad \forall i. \tag{5.21}$$

According to remark (c) of the previous subsection, the Wald statistic in (5.17) for the null hypothesis in (5.21) is $W_\phi$ with $\phi(\beta) = \operatorname{vec}(\beta - \beta_0)$ and its limit distribution is $\chi^2_{m_y m_x}$.
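The pooled FM construction in (5.13)-(5.17) can be sketched numerically for a scalar panel. The following is a minimal illustration under several assumptions that are not in the paper: a Bartlett lag kernel (Parzen exponent $q = 1$, so a bandwidth $K$ proportional to $\sqrt{T}$ is in the spirit of Assumption 11), residuals from a preliminary pooled OLS fit used in place of the unobserved $E_{i,t}$, $X_{i,0} = 0$, and a simple simulated design in which $E_{i,t}$ loads on current and lagged regressor shocks. All function names are hypothetical.

```python
import numpy as np

def bartlett(x):
    """Bartlett lag kernel w(x) = max(1 - |x|, 0); an assumed choice of kernel."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def unit_long_run_cov(F, K):
    """Kernel estimates (Omega_hat_i, Lambda_hat_i) for one unit; F has shape (T, m)."""
    T = F.shape[0]
    gamma0 = F.T @ F / T
    omega, lam = gamma0.copy(), gamma0.copy()
    for j in range(1, min(K, T)):           # Bartlett weights vanish for j >= K
        w = bartlett(j / K)
        gamma_j = F[j:].T @ F[:-j] / T      # Gamma_i(j) = (1/T) sum_t F_{t+j} F_t'
        omega += w * (gamma_j + gamma_j.T)  # two-sided sum, Gamma(-j) = Gamma(j)'
        lam += w * gamma_j                  # one-sided sum over j >= 0
    return omega, lam

def pooled_fm(Y, X, K):
    """Scalar pooled FM estimator; Y and X have shape (n, T)."""
    n, T = Y.shape
    beta_ols = np.sum(Y * X) / np.sum(X * X)
    dX = np.diff(X, axis=1, prepend=0.0)    # Delta X, assuming X_{i,0} = 0
    E_hat = Y - beta_ols * X                # feasible stand-in for E_{i,t}
    omega = np.zeros((2, 2))
    lam = np.zeros((2, 2))
    for i in range(n):                      # average the unit estimates as in (5.13)
        o_i, l_i = unit_long_run_cov(np.column_stack([E_hat[i], dX[i]]), K)
        omega += o_i / n
        lam += l_i / n
    ratio = omega[0, 1] / omega[1, 1]       # Omega_ex Omega_xx^{-1}
    Y_plus = Y - ratio * dX                 # endogeneity correction (5.14)
    lam_plus = lam[0, 1] - ratio * lam[1, 1]  # serial correlation correction (5.15)
    beta_pfm = (np.sum(Y_plus * X) - n * T * lam_plus) / np.sum(X * X)  # (5.16)
    omega_e_x = omega[0, 0] - omega[0, 1] ** 2 / omega[1, 1]
    return beta_ols, beta_pfm, omega_e_x, omega[1, 1]

rng = np.random.default_rng(1)
n, T, beta = 50, 200, 2.0
U = rng.standard_normal((n, T + 1))         # regressor shocks, one extra for the lag
V = rng.standard_normal((n, T))
E = 0.5 * U[:, 1:] + 0.3 * U[:, :-1] + V    # endogenous, serially correlated errors
X = np.cumsum(U[:, 1:], axis=1)
Y = beta * X + E

K = int(np.sqrt(T))
beta_ols, beta_pfm, omega_e_x, omega_xx = pooled_fm(Y, X, K)

# Scalar version of the Wald statistic (5.17) for H0: beta = 2.0,
# with phi(beta) = beta - 2.0 and V_hat = 2 * Omega_e.x / Omega_xx.
wald = n * T**2 * (beta_pfm - beta) ** 2 / (2.0 * omega_e_x / omega_xx)
```

In this design the pooled OLS estimate carries a bias of order $1/T$ from $\Lambda_{ex} \neq 0$, whereas the corrected estimate should sit much closer to the true coefficient. The statistic `wald` would be compared with a $\chi^2_1$ critical value.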
A sequence of local alternatives to the null (5.21) can be formulated as

$$H_{LA}: \beta_i = \beta_0 + \frac{\theta_i}{\sqrt{n}\,T}, \tag{5.22}$$

where the $\theta_i$ are iid across $i$ with mean $\theta \neq 0$, have finite variance and satisfy Assumption 12. In this case, under the local alternative hypothesis (5.22) and the assumptions of Theorem 10, the Wald statistic $W_\phi$ has an asymptotic noncentral chi-square distribution as $(n, T \to \infty)$ with $n/T \to 0$, i.e., $W_\phi \Rightarrow \chi^2_{m_y m_x}(\lambda)$, where the noncentrality parameter is $\lambda = \operatorname{vec}(\theta)'V_\phi^{-1}\operatorname{vec}(\theta)/2$.

6. MODELS WITH INDIVIDUAL EFFECTS

Much of the preceding asymptotic theory can be extended in a straightforward way to panel models with individual specific effects and time trends. We illustrate what is involved in these extensions by taking the case of primary importance where the panel regression equation involves individual specific effects. To motivate the analysis, consider the following model of heterogeneous panel cointegration in place of (5.2):

$$Y_{i,t} \overset{a.s.}{=} \gamma_i + \beta_i X_{i,t} + E_{i,t}, \qquad X_{i,t} = X_{i,t-1} + U_{x_i,t}. \tag{6.1}$$

Here, the $\gamma_i$ are individual effects in the cointegrating equation. They could be fixed or random effects. We can also allow for individual effects in the equation for $X_{i,t}$ in (6.1). In that case, the $X_{i,t}$ have individual deterministic trends as well as stochastic trends and in what follows we would proceed using detrended rather than demeaned data in the pooled panel regression, with some associated change in the final formulae.

The individual effect in (6.1) can be eliminated in the usual way by removing individual specific means, i.e., $Y_{i,\cdot} = (1/T)\sum_{t=1}^{T} Y_{i,t}$ and $X_{i,\cdot} = (1/T)\sum_{t=1}^{T} X_{i,t}$. Then pooled panel regression leads to the estimator

$$\tilde\beta_{n,T} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{Y}_{i,t}\tilde{X}_{i,t}'\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{X}_{i,t}\tilde{X}_{i,t}'\right)^{-1},$$

where $\tilde{Y}_{i,t} = Y_{i,t} - Y_{i,\cdot}$ and $\tilde{X}_{i,t} = X_{i,t} - X_{i,\cdot}$. As in our earlier theory, some quick asymptotic results for $\tilde\beta_{n,T}$ can be obtained using sequential limits.
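The demeaning step and the resulting pooled estimator can be sketched as follows. This is an illustration only, assuming a scalar panel with a common slope, iid errors, and individual effects drawn from a normal distribution; the names `within_transform` and `pooled_within` are hypothetical.

```python
import numpy as np

def within_transform(A):
    """Remove individual specific means: A[i, t] - (1/T) sum_t A[i, t]."""
    return A - A.mean(axis=1, keepdims=True)

def pooled_within(Y, X):
    """Pooled regression on demeaned data, scalar case."""
    Y_tilde, X_tilde = within_transform(Y), within_transform(X)
    return np.sum(Y_tilde * X_tilde) / np.sum(X_tilde * X_tilde)

rng = np.random.default_rng(2)
n, T, beta = 50, 200, 2.0
gamma = 5.0 * rng.standard_normal((n, 1))            # individual effects (assumed random)
X = np.cumsum(rng.standard_normal((n, T)), axis=1)   # random walk regressors
Y = gamma + beta * X + rng.standard_normal((n, T))

beta_within = pooled_within(Y, X)              # uses demeaned data
beta_raw = np.sum(Y * X) / np.sum(X * X)       # ignores the individual effects
```

Comparing `beta_within` and `beta_raw` across repeated draws shows how much the unmodelled individual effects disturb the pooled estimate when the data are not demeaned.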
First consider the case where there is no cointegration and the true data generating mechanism is (2.1), even though it is model (6.1) that is estimated. Define the demeaned limiting process $\tilde{M}_i(r) = (\tilde{M}_{y_i}(r)', \tilde{M}_{x_i}(r)')' = M_i(r) - \int M_i(s)\,ds$. According to (2.6) and the continuous mapping theorem, under Assumptions 1-3, the pooled estimator $\tilde\beta_{n,T}$ has the following limit distribution as $T \to \infty$ for any fixed $n$:

$$\tilde\beta_{n,T} \Rightarrow \left(\frac{1}{n}\sum_{i=1}^{n}\int\tilde{M}_{y_i}\tilde{M}_{x_i}'\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)^{-1}.$$

A simple calculation shows that $E(\int\tilde{M}_i\tilde{M}_i') = E(\int M_iM_i') - E(\int M_i\int M_i') = \tfrac{1}{6}\Omega$. Thus, applying the strong law of large numbers to $(1/n)\sum_{i=1}^{n}\int\tilde{M}_{y_i}\tilde{M}_{x_i}'$ and $(1/n)\sum_{i=1}^{n}\int\tilde{M}_{x_i}\tilde{M}_{x_i}'$, we get $(1/n)\sum_{i=1}^{n}\int\tilde{M}_{y_i}\tilde{M}_{x_i}' \to_{a.s.} \tfrac{1}{6}\Omega_{yx}$, and $(1/n)\sum_{i=1}^{n}\int\tilde{M}_{x_i}\tilde{M}_{x_i}' \to_{a.s.} \tfrac{1}{6}\Omega_{xx}$. It follows that $\tilde\beta_{n,T} \to_p \beta = \Omega_{yx}\Omega_{xx}^{-1}$ as $(T, n \to \infty)_{seq}$.

The asymptotic normality of $\tilde\beta_{n,T}$ follows by arguments analogous to those of Section 4. Rescaling the centered estimator $(\tilde\beta_{n,T} - \beta)$ by $\sqrt{n}$ and letting $T \to \infty$ for fixed $n$, we have

$$\sqrt{n}(\tilde\beta_{n,T} - \beta) \Rightarrow \left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\int\tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)^{-1}.$$

Note that $E(\int\tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int\tilde{M}_{x_i}\tilde{M}_{x_i}') = 0$, so demeaning the data does not affect the asymptotic centering. After some lengthy calculations, we find the variance matrix

$$\begin{aligned} &E\left[\operatorname{vec}\left(\int\tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)\operatorname{vec}\left(\int\tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)'\right] \\ &\quad= \tfrac{1}{90}E\left(\Omega_{x_ix_i} \otimes (\Omega_{y_iy_i} - \beta\Omega_{x_iy_i} - \Omega_{y_ix_i}\beta' + \beta\Omega_{x_ix_i}\beta')\right) \\ &\qquad+ \tfrac{1}{90}E\left[\left((\Omega_{x_iy_i} - \Omega_{x_ix_i}\beta') \otimes (\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})\right)K_{m_ym_x}\right] \\ &\qquad+ \tfrac{1}{36}E\left[\operatorname{vec}(\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})\left(\operatorname{vec}(\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})\right)'\right] \\ &\quad= \Theta_f, \end{aligned}$$

say. Note that this covariance matrix differs in the coefficients of its components from the earlier matrix $\Theta$ given in (4.6) for the case where there is no demeaning to remove possible individual effects. As is clear from the formulae for these two cases (see (6.2) below), $\Theta_f < \Theta$, so one effect of demeaning is to reduce time series variability.

Applying the multivariate Lindeberg-Levy Theorem to $(1/\sqrt{n})\sum_{i=1}^{n}(\int\tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int\tilde{M}_{x_i}\tilde{M}_{x_i}')$ and combining this with the limit of $((1/n)\sum_{i=1}^{n}\int\tilde{M}_{x_i}\tilde{M}_{x_i}')^{-1}$ as $n \to \infty$, we have

$$\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left(\int\tilde{M}_{y_i}\tilde{M}_{x_i}' - \beta\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)\right)\left(\frac{1}{n}\sum_{i=1}^{n}\int\tilde{M}_{x_i}\tilde{M}_{x_i}'\right)^{-1} \Rightarrow N\left(0,\; 36\,(\Omega_{xx}^{-1} \otimes I_{m_y})\,\Theta_f\,(\Omega_{xx}^{-1} \otimes I_{m_y})\right).$$

Hence, as $(T, n \to \infty)_{seq}$ we have

$$\sqrt{n}(\tilde\beta_{n,T} - \beta) \Rightarrow N\left(0,\; 36\,(\Omega_{xx}^{-1} \otimes I_{m_y})\,\Theta_f\,(\Omega_{xx}^{-1} \otimes I_{m_y})\right).$$

These sequential limit results can be extended to joint limit results, just as in the proof of Theorem 4, and we merely state the final results here.

THEOREM 11: Suppose Assumptions 1, 2, and 3 hold and the data generating mechanism is (2.1). Then: (a) as $(n, T \to \infty)$, we have $\tilde\beta_{n,T} \to_p \beta$; (b) if $(n, T \to \infty)$ and $n/T \to 0$,

$$\sqrt{n}(\tilde\beta_{n,T} - \beta) \Rightarrow N\left(0,\; 36\,(\Omega_{xx}^{-1} \otimes I_{m_y})\,\Theta_f\,(\Omega_{xx}^{-1} \otimes I_{m_y})\right).$$

REMARKS: (a) Comparing the limit variance of $\tilde\beta_{n,T}$ in Theorem 11 to that of $\hat\beta_{n,T}$ in Theorem 4, we find that $\tilde\beta_{n,T}$ has the smaller asymptotic covariance. In fact, it is apparent from the formulae that

$$\begin{aligned} 4\Theta - 36\Theta_f &= \tfrac{4}{15}E\left(\Omega_{x_ix_i} \otimes (\Omega_{y_iy_i} - \beta\Omega_{x_iy_i} - \Omega_{y_ix_i}\beta' + \beta\Omega_{x_ix_i}\beta')\right) \\ &\quad+ \tfrac{4}{15}E\left[\left((\Omega_{x_iy_i} - \Omega_{x_ix_i}\beta') \otimes (\Omega_{y_ix_i} - \beta\Omega_{x_ix_i})\right)K_{m_ym_x}\right] &(6.2) \\ &= \tfrac{4}{15}E\left\{\left(C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\right)(I_{m^2} + K_m)\left(C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\right)'\right\} > 0. &(6.3) \end{aligned}$$

As remarked above, this reduction in variance occurs because demeaning the data by removing individual effects reduces time series variability. Similar effects occur when higher order time trends are removed from the data in the construction of pooled panel estimators.

(b) In the heterogeneous panel cointegration case, the data are generated by (6.1). The individual effects $\gamma_i$ can now be consistently estimated by time series regression on (6.1) leading to $\hat\gamma_i = Y_{i,\cdot} - \tilde\beta_i X_{i,\cdot}$
and $\tilde\beta_i = (\sum_{t=1}^{T}\tilde{Y}_{i,t}\tilde{X}_{i,t}')(\sum_{t=1}^{T}\tilde{X}_{i,t}\tilde{X}_{i,t}')^{-1}$. These least squares estimates and their fully modified variants have asymptotic properties that are well known (Phillips and Hansen (1990)). Following the same line of argument as in Section 5.1, the pooled panel estimator $\tilde\beta_{n,T}$ can be shown to have the same limit distribution as that given in Theorem 11 for the spurious regression case, although the long run covariance matrices $\Omega_i$ are now singular, just as in Section 5.1. Under the assumptions of Theorem 5, the asymptotic theory holds for joint limits as $(n, T \to \infty)$ and $n/T \to 0$, as well as sequential limits. Again $\tilde\beta_{n,T}$ estimates the long run average coefficient $\beta = \Omega_{yx}\Omega_{xx}^{-1}$. Wald tests like those discussed in Section 5 can now be constructed with obvious modifications to the estimated covariance matrix formulae that allow for elimination of the individual effects by demeaning.

(c) In the homogeneous panel cointegration case, the data are generated by (6.1) with $\beta_i = \beta = \Omega_{yx}\Omega_{xx}^{-1}$ a.s. We can eliminate individual effects by removing individual specific means as above, and may proceed with FM estimation as in Section 5.2. The data are now corrected according to the formula $\tilde{Y}_{i,t}^{+} = \tilde{Y}_{i,t} - \hat\Omega_{ex}\hat\Omega_{xx}^{-1}\Delta\tilde{X}_{i,t}$ rather than as in (5.14). The pooled FM estimator in this case is given by

$$\tilde\beta_{PFM} = \left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{Y}_{i,t}^{+}\tilde{X}_{i,t}' - nT\tilde\Lambda_{ex}^{+}\right)\left(\sum_{i=1}^{n}\sum_{t=1}^{T}\tilde{X}_{i,t}\tilde{X}_{i,t}'\right)^{-1}.$$

Under the same assumptions as Theorem 9, we find that $\sqrt{n}\,T(\tilde\beta_{PFM} - \beta) \Rightarrow N(0, 6(\Omega_{xx}^{-1} \otimes \Omega_{e.x}))$ as $(n, T \to \infty)$ with $n/T \to 0$. Note that in this case, the effect of eliminating individual specific means is to increase the limit variance matrix in comparison with Theorem 9. Wald tests can be constructed as described in Section 5.2 with obvious modifications for the use of demeaned data, and a noncentral limit theory follows as in Section 5.3.

7. CONCLUSION

This paper has developed a linear regression limit theory for nonstationary panel data with large numbers of cross section and time series observations. A central result is the existence of interesting long-run relations between two integrated panel vectors where there is no individual time series cointegration or where there are heterogeneous cointegrating relations. The new relations are characterized as long-run average relationships over the cross section and are parameterized in terms of the matrix regression coefficient, $\Omega_{yx}\Omega_{xx}^{-1}$, of the cross section long-run average covariance matrix, $\Omega$. They are analogous to population regression coefficients in conventional cross section regression of iid variates. The limit theory can be used to construct tests of hypotheses about the long-run average regression coefficients and to compare these coefficients in subgroups of the cross section population. These tests are given explicitly for the two cases of heterogeneous panel cointegration and homogeneous panel cointegration, which seem to be the important cases for empirical applications. The local asymptotic power function for these tests is also derived.

The limit theory developed in this paper is designed for two dimensional arrays where both time series and cross section sample sizes pass to infinity. It allows for both sequential limits as $T \to \infty$ and $n \to \infty$ in that sequence, and joint limits where $T, n \to \infty$ jointly. As the proofs in the Appendices demonstrate, convergence for joint limits is more difficult to obtain. However, apart from some stricter moment and summability conditions, the only additional requirement we use in the development of this theory is the rate condition that $n/T \to 0$. This condition indicates that the limit theory given herein is likely to be most useful in cases where $T$ is large and $n$ is moderately large.
The usefulness of this asymptotic theory in describing finite sample behavior in panel regressions now needs to be systematically explored in simulation experiments.

An important assumption that is common in panel data work and is used here in deriving asymptotics is cross section independence. For many nonstationary panel data applications, this independence condition is restrictive and it is an important limitation of our theory. For instance, multi-country GDP series, exchange rates, and financial asset prices all involve cross section dependence arising from global shocks and complicated interdependencies among the variables. As is apparent from our approach, certain strong laws and central limit results will continue to apply when the cross sectional dependence is of the weak memory variety, but in this case the limit variance matrices will change according to the dependence. More significantly, when there are strong correlations in a cross section (as there will be in the face of global shocks) we can expect failures in the strong laws and central limit theory arising from the nonergodicity. However, even in this event, theorems like the ergodic theorem will still apply but the limits will be random and measurable with respect to the invariant $\sigma$-algebra generated by the global shocks.

In the present case and, indeed, quite commonly in panel data theory, cross section independence is assumed in part because of the difficulties of characterizing and modeling cross section dependence. In general, finding a natural ordering for cross section indices in economic data is not easy, and this has been a serious obstacle in the development of a satisfactory approach. While some recent research has attempted to resolve the difficulty by employing a framework for spatial data based on the economic distance between individuals (e.g.
Conley (1997)), the successful simultaneous modeling of both cross section dependence and time series dependence remains a challenging problem and is a major area for future research in multi-index asymptotics of the type considered here.

Cowles Foundation for Research in Economics, Yale University, Box 208281, New Haven, CT 06520-8281, U.S.A., and University of Auckland;
and
Dept. of Economics, University of California, Santa Barbara, CA 93106, U.S.A.

Manuscript received April, 1997; final revision received September, 1998.

APPENDICES

APPENDIX A: PRELIMINARY LEMMAS AND PROOFS

We start with some lemmas that are useful in following arguments. The results are straightforward and proofs are omitted here, but are available in PM b.

LEMMA 9: (a) For any $p \ge 1$ and any $m \times n$ matrix $A$, there exists a constant $M > 0$ such that

$$\|A\|^p \le M\sum_{i=1}^{m}\sum_{j=1}^{n}|a_{i,j}|^p, \tag{8.1}$$

where $a_{i,j}$ is the $(i, j)$th element of $A$.

(b) For any $m \times m$ matrix $A$

$$(\operatorname{tr}(A))^2 \le m\|A\|^2. \tag{8.2}$$

LEMMA 10: Suppose that $A\ (= \{a_{i,j}\}_{i,j})$ and $B\ (= \{b_{i,j}\}_{i,j})$ are $(m \times m)$ matrices and $K_m$ is the commutation matrix. Then, $\operatorname{tr}[(A \otimes B)K_m] \le \|A\|\,\|B\|$. If $A$ is symmetric, then $\operatorname{tr}[(A \otimes A)K_m] = \|A\|^2$.

LEMMA 11: (a) Under Assumptions 1 and 2

$$E\left(\sum_{t=0}^{\infty}\|C_{i,t}\|\right)^4 < \infty. \tag{8.3}$$

(b) Under Assumptions 4 and 5

$$E\left(\sum_{t=0}^{\infty} t\,\|C_{i,t}\|\right)^8 < \infty. \tag{8.4}$$

(c) Under Assumptions 4 and 5

$$E\left(\sum_{t=0}^{\infty}\|C_{i,t}\|\right)^{16} < \infty. \tag{8.5}$$

1. PROOF OF LEMMA 2: The panel BN decomposition

$$U_{i,t} \overset{a.s.}{=} C_i(1)V_{i,t} + \tilde{U}_{i,t-1} - \tilde{U}_{i,t} \tag{8.6}$$

follows directly from Phillips and Solo (1992) provided $Y_i = \sum_{s=0}^{\infty} s^2\|C_{i,s}\|^2 < \infty$ a.s. This condition holds if $E(Y_i) < \infty$, which holds by Lemma 1(a). Q.E.D.

2. PROOF OF LEMMA 1: See PM b. Q.E.D.

3. PROOF OF LEMMA 2: It is enough to show that

$$\frac{1}{\sqrt{T}}\tilde{U}_{a,i,0} \to_p 0 \quad\text{and}\quad \sup_r \frac{1}{\sqrt{T}}\tilde{U}_{a,i,[Tr]} \to_p 0$$

as $T \to \infty$ for all $a$, $i$.
But, these follow because $\tilde{U}_{a,i,t}$ is strictly stationary in $t$ and square integrable by Lemma 1, so that the results hold by the same argument as that given in Phillips and Solo (1992, p. 978). The functional law follows directly. Q.E.D.

4. PROOF OF LEMMA 4: Substituting $M_i = C_i(1)W_i$, we have

$$E\left\|\int M_iM_i'\right\|^2 = E\left\|\operatorname{vec}\int M_iM_i'\right\|^2 = E\left\|(C_i(1) \otimes C_i(1))\operatorname{vec}\int W_iW_i'\right\|^2 \le E\left\|C_i(1) \otimes C_i(1)\right\|^2 E\left\|\operatorname{vec}\int W_iW_i'\right\|^2,$$

where the last inequality holds because $\|AB\| \le \|A\|\,\|B\|$ and because $C_i(1)$ is independent of $W_i$. We know $E\|\operatorname{vec}\int W_iW_i'\|^2 < \infty$ and $E\|C_i(1) \otimes C_i(1)\|^2 = E\|C_i(1)\|^4 < \infty$ by Lemma 1. Therefore, $E\|\int M_iM_i'\|^2 < \infty$, as required. Q.E.D.

APPENDIX B: PROOFS FOR SECTION 3 - MULTIDIMENSIONAL LIMIT THEORY

1. CONSTRUCTION OF RANDOM VECTORS $Y_i$ IN (3.3) TO EXIST ON THE SAME PROBABILITY SPACE: According to Skorohod's Theorem in $\mathbb{R}^m$, Theorem 29.6 in Billingsley (1986),8 we can construct a probability space $(\Omega_i^*, F_i^*, P_i^*)$ where there exist random vectors $Y_{i,T}^*$ and $Y_i^*$ such that $Y_{i,T} \equiv Y_{i,T}^*$, $Y_i \equiv Y_i^*$, and $Y_{i,T}^* \to_{a.s.} Y_i^*$ as $T \to \infty$ for all $i$. Also, we can choose independent $Y_i^*$ because the $Y_{i,T}$ are independent across $i$ for all $T$. Now we define $\Omega^* = \prod_{i=1}^{\infty}\Omega_i^*$, the Cartesian product of the $\Omega_i^*$, and let $\pi_i$ be the natural projection of $\Omega^*$ onto $\Omega_i^*$ for each $i$. Let $F^*$ be the smallest $\sigma$-field containing all the sets $\pi_i^{-1}(F)$ for all $i$ and $F \in F_i^*$. Define $R$ to be the collection of all finite dimensional rectangles, $\prod_{i=1}^{\infty} F_i$ where $F_i \in F_i^*$ for all $i$ and $F_i = \Omega_i^*$, except for at most finitely many values of $i$. Now define $P^*(\prod_{i=1}^{\infty} F_i) = \prod_{i=1}^{\infty} P_i^*(F_i)$. Then, by Theorem 8.2.2 (p. 201) in Dudley (1989), $P^*$ on $R$ extends uniquely to a probability measure on $F^*$. Let $\tilde{Y}_i(\omega) = Y_i^*(\pi_i(\omega))$ for all $\omega \in \Omega^*$. By the way of their construction, the $\tilde{Y}_i(\omega)$ are random vectors on the probability space $(\Omega^*, F^*, P^*)$ and $\tilde{Y}_i \equiv Y_i^* \equiv Y_i$. Choose $Y_i$ in (3.3) to be $\tilde{Y}_i$ and we have the desired result. Q.E.D.

8 For Skorohod's theorem on function spaces refer to the representation theorem in Pollard (1984) or Theorem 4 on p. 47 in Shorack and Wellner (1986).

2. PROOF OF LEMMA 5: We prove part (b). Then part (a) holds by the same principle. Suppose that $f \in C$ is given. From $X_{n,T} \Rightarrow X$ as $n, T \to \infty$, for any given $\epsilon > 0$, we can choose $n_0$ and $T_1$ such that whenever $n \ge n_0$ and $T \ge T_1$, the following inequality holds:

$$|Ef(X_{n,T}) - Ef(X)| < \frac{\epsilon}{2}. \tag{8.7}$$

From $X_{n,T} \Rightarrow X_n$ as $T \to \infty$ $\forall n$, we can choose $T_2$ depending on $n$ and $\epsilon$ such that

$$|Ef(X_{n,T}) - Ef(X_n)| < \frac{\epsilon}{2} \quad\text{if } T \ge T_2. \tag{8.8}$$

For each $n \ge n_0$ choose $T_2(n, \epsilon)$, and choose a fixed $T_0$ greater than both $T_1$ and $T_2$. Then both (8.7) and (8.8) hold and therefore $|Ef(X_n) - Ef(X)| < \epsilon$ if $n \ge n_0$ and $X_{n,T} \Rightarrow X$ sequentially as $(T, n \to \infty)_{seq}$. Q.E.D.

3. PROOF OF LEMMA 6: We show part (b). Part (a) can be established by similar arguments. Suppose that $f \in C$ is given. Assume that (3.9) holds. From (3.9) and $X_n \Rightarrow X$ as $n \to \infty$, for any given $\epsilon > 0$, we can choose $n_0$ and $T_0$ such that whenever $n \ge n_0$ and $T \ge T_0$, we have

$$\sup_{n \ge n_0,\, T \ge T_0}|Ef(X_{n,T}) - Ef(X_n)| < \frac{\epsilon}{2}, \quad\text{and}\quad |Ef(X_n) - Ef(X)| < \frac{\epsilon}{2}.$$

Thus, if $n \ge n_0$ and $T \ge T_0$,

$$|Ef(X_{n,T}) - Ef(X)| \le \sup_{n \ge n_0,\, T \ge T_0}|Ef(X_{n,T}) - Ef(X_n)| + |Ef(X_n) - Ef(X)| < \epsilon.$$

Hence, $X_{n,T} \Rightarrow X$ as $(n, T \to \infty)$.

Now assume that $X_{n,T} \Rightarrow X$ and $X_n \Rightarrow X$ as $(n, T \to \infty)$. The necessity of the condition follows because

$$\limsup_{n,T}|Ef(X_{n,T}) - Ef(X_n)| \le \limsup_{n,T}|Ef(X_{n,T}) - Ef(X)| + \limsup_{n}|Ef(X_n) - Ef(X)| = 0. \quad\text{Q.E.D.}$$

Before starting the proof of Theorem 1 we give the following lemma.

LEMMA 12: Suppose $Y_{i,T}$ are independent across $i$. Assume that $Y_{i,T} \Rightarrow Y_i$ as $T \to \infty$ for all $i$. Then, $\limsup_{n \to \infty}(1/n)\sum_{i=1}^{n} E|Y_{i,T}| < \infty$ implies $\limsup_{n \to \infty}(1/n)\sum_{i=1}^{n} E|Y_i| < \infty$.
PROOF: Note that, since < Yi, T < « < Yi < as T ª ⬁ by the continuous mapping theorem, it follows that E < Yi < F lim inf T E < Yi, T < Žsee Theorem 5.3 in Billingsley Ž1968... Thus, lim sup nª⬁ 1 n n 1 n sup lim inf Ý E < Yi , T < Ý E < Yi < F lim Tª⬁ n nª⬁ is1 is1 F lim sup n, Tª⬁ 1 n n Ý E < Yi , T < - ⬁. Q. E. D. is1 4. PROOF OF THEOREM 1: Part Žb. follows easily from Lemma 6 and part Ža.. In particular, from the assumptions of the theorem we know that X n, T s Ž1rn.Ý nis1Yi, T « X n s Ž1rn.Ý nis1Yi as T ª ⬁ for all n and X n s Ž1rn.Ý nis1Yi ªp ˜ X s lim nŽ1rn.Ýnis1 EYi . Then, since condition Ž3.9. holds from part Ža., the desired result X n, T ªp ˜ X as Ž n, T ª ⬁. follows by Lemma 6. Now, we prove part Ža.. First, we establish condition Ž3.9. in the scalar class. It is sufficient for condition Ž3.9. to restrict C to C ⬁ , the class of all the bounded, continuous real functions with bounded, continuous derivatives of all orders Žsee Theorem 7.1 in Billingsley Ž1968. or Theorem 12 in Pollard Ž1984... Without loss of generality, let the functions be such that < f Ž k . Ž x .< F 1, ᭙k, where f Ž k . Ž x . denotes the kth derivative function of f Ž x .. Before proceeding, we need to ensure that the probability space on which the variates are defined is large enough to permit the arguments that follow. Limits such as X n, T s Ž1rn.Ý nis1Yi, T « X n s Ž1rn.Ý nis 1Yi as T ª ⬁ involve the joint distributions of the random vectors Ž Y1, T , . . . , Yn, T .X and Ž Y1 , . . . , Yn .X , not any properties of the probability space on which they are defined. However, we need to ensure that we can relate these variates on the same space. This can be accomplished by passing to a new probability space, using Skorohod’s Theorem in ⺢ m Že.g., Theorem 29.6 in X Billingsley Ž1986.., in which they are defined new random variables Ž Y˜1, T , . . . , Y˜n, , Y˜1 , . . . , Y˜n . such that Y˜i, T ' Yi, T and Y˜i ' Yi for all i and the 2 n random variables Y˜1, T , . . . 
, Y˜n, T , Y˜1 , . . . , Y˜n are independent. Without loss of generality, we can assume that Y˜i, T s Yi, T and Y˜i s Yi for all i and T.9 9 As in Appendix BŽ1. above, we can construct an infinite dimensional probability space where the X X two independent random vectors Ž Y1, T , . . . , Yn, T . and Ž Y1 , . . . , Yn . coexist. However, the argument given is enough for the proof that follows. 1096 P. C. B. PHILLIPS AND H. R. MOON Now we can define the quantities k, n, T s Ý k ) i G 1Yi, T q Ý k - i F n Yi , for 1 F k F n, all on the same probability space. By virtue of the definitions of X n, T , X n , and k, n, T , we have f ž 1 n n Ý Yi , T is1 n 1 / ž yf n Ý Yi is1 / ž sf 1 n Ž n , n , T q Yn , T . y f / ž n s Ý ks1 1 ½ž f n 1 n Ž 1 , n , T q Y1 . Ž k , n , T q Yk , T . y f / ž 1 n / Ž k , n , T q Yk . /5 . It follows that Ž 8.9. lim sup Ef Ž X n , T . y Ef Ž X n . n, Tª⬁ ž s lim sup Ef n, Tª⬁ 1 n n Ý Yi , T is1 ¡ž Ý E~ n s lim sup f 1 n ¢ ž n, Tª⬁ ks1 qf / ž y Ef 1 n n Ý Yi is1 / Ž k , n , T q Yk , T . y f / ž 1 n k , n , T y f / ž 1 n 1 n k , n , T Ž k , n , T q Yk . /¦¥ /§ . X Let g Ž h. s sup x < f Ž xq h. y f Ž x . y f Ž x . h <. Take x s k, n, T rn and h s Yk, T rn in the case of f ŽŽ1rn.Ž k, n, T q Yk, T .. y f ŽŽ1rn. k, n, T ., and take x s k, n, T rn and h s Ykrn in the case of f ŽŽ1rn.Ž k, n, T q Yi .. y f ŽŽ1rn. k, n, T .. By the triangle inequality, it follows that Ž8.9. is bounded above by n Ž 8.10. lim sup ÝE n, Tª⬁ is1 X i , n , T Yi , T n n ½ ž /ž Ý ž / f n q lim sup Yi , T Eg n n, Tª⬁ is1 y Yi n /5 n q lim sup Ý Eg n , Tª⬁ is1 Yi ž / n . By the triangle inequality, the first term in Ž8.10. is less than n lim sup Ý n, Tª⬁ is1 X i , n , T Yi , T n n Yi ½ ž /ž /5 Ý ž /ž / Ý ž / E f n s lim sup Ef X i , n , T n n, Tª⬁ is1 E E n n, Tª⬁ is1 F lim sup y Yi , T n y Yi n n Yi , T n y Yi n s 0. 
The first line above uses the fact that i, n, T rn, Yi, T rn and Yirn are independent, the inequality in X the second line holds because < f < F 1, and the third line follows directly from condition Žii.. 1097 NONSTATIONARY PANEL DATA For the second term in Ž8.10., note by the mean value theorem that g Ž h. F M1 min< h <, h2 4 for some constant M1 which depends on f alone. Then, for any ) 0 n lim sup Ý Eg n, Tª⬁ is1 Yi , T ž / n n F lim sup ÝE g n, Tª⬁ is1 n q lim sup ÝE n, Tª⬁ is1 Yi , T 1 n Yi , T g n F M12 lim sup Yi , T ž /½ ž /½ ÝE 1 n Yi , T n n, Tª⬁ is1 F n Yi , T n 5 ) 5 n q M1 lim sup ÝE n , Tª⬁ is1 Yi , T n 1 ½ Yi , T n ) 5 s M2 , where the first inequality holds by applying g Ž h. F M1 h2 on 1< Yi, T rn < F 4 and g Ž h. F M1 < h < on 1< Yi, T rn < ) 4 and the last inequality holds by conditions Ži. and Žiii. with M2 s M12 lim sup n, T Ý nis1 E < Yi, T rn <. By Lemma 12, condition Ži. implies lim sup n, T 1 n n Ý E < Yi < - ⬁, is1 and by condition Živ. we have lim sup n, T 1 n n Ý E < Yi <1< Yi < ) n4 - ⬁. is1 Thus, applying the same arguments as those used for lim sup n, T Ý nis 1 Eg Ž Yi, T rn . to lim sup n, T Ý nis1 Eg Ž Yirn., we have lim sup n, T Ý nis1 Eg Ž Yirn. s 0. It follows from Ž8.9. and Ž8.10. that condition Ž3.9. holds. When the Yi, T are m-vectors, the Cramer-Wold device can be used. That is, using the above ´ X X argument, we obtain s X n, T ªp s ˜ X as Ž n, T ª ⬁. ᭙s g ⺢ m , and it follows that X n, T ªp ˜ X as n, T ª ⬁. Q. E. D. 5. PROOF OF COROLLARY 1: Define X n, T s Ž1rn.Ý nis1Yi, T s Ž1rn.Ý nis1Ci Q i, T and X n s Ž1rn.Ý nis 1Yi s Ž1rn.Ý nis1Ci Q i . Assume sup i 5 Ci 5 ) 0, for if this does not hold, the result is trivial. We know that X n, T « X as T ª ⬁ for all n by the conditions in the corollary. By assumption the Q i, T are uniformly integrable and Q i, T « Q i , so E 5 Q i 5 - ⬁. Also, C s lim nŽ1rn.Ý nis1Ci exists, so we have X n ªp CEŽ Q i . as n ª ⬁. Hence, if we establish conditions Ži. ᎐ Živ. 
of Theorem 1, then X n, T ªp CEŽ Q i . as Ž n, T ª ⬁.. By the uniform integrability of 5 Q i, T 5 and sup i 5 Ci 5 - ⬁, we have lim sup n, T 1 n n Ý E 5 Yi , T 5 F ž sup 5 Ci 5 / sup E 5 Qi , T 5 - ⬁, is1 i T verifying condition Ži., and lim sup n, T 1 n n Ý 5 EYi , T y EYi 5 F ž sup 5 Ci 5 / lim sup 5 EQi , T y EQi 5 s 0. is1 i T 1098 P. C. B. PHILLIPS AND H. R. MOON Condition Žiii. is satisfied since Ž 8.11. 1 n n n Ý E w 5 Yi , T 515 Yi , T 5 ) n 4x F ž sup 5 Ci 5 / sup E i is1 5 Q i , T 51 5 Q i , T 5 ) T ½ sup i 5 Ci 5 5 which converges to zero as n ª ⬁, again by virtue of uniform integrability and sup i 5 Ci 5 - ⬁. Condition Živ. of Theorem 1 holds because 1 n n Ý E 5 Yi 51 5 Yi 5 ) n 4 F ž sup 5 Ci 5 / E 5 Q i 51 5 Q i 5 ) i is1 ½ by sup i 5 Ci 5 - ⬁ and dominated convergence since E 5 Q i 5 - ⬁. n sup i 5 Ci 5 5 ª0 Q. E. D. 6. PROOF OF THEOREM 2: The proof follows that of Lindeberg’s theorem given in Billingsley Ž1968, Theorem 7.2.. The only change is that the additional index T appears in the component variates i, n, T and limits are taken as Ž n, T ª ⬁.. The fact that T passes to infinity with n is incidental to the main argument. For example, we still have ⍀i , T sn2 , T F 2 q E w i2, n , T 1 < i , n , T < ) 4x and, as a consequence of the Lindeberg condition Ž3.20., max iFn ⍀i , T sn2 , T ª0 as Ž n, T ª ⬁.. 7. PROOF Q. E. D. OF THEOREM 3: Define r2 i , n , T s ⍀y1 n , T Ci Qi , T , X where ⍀ n, T s Ý nis1Ci ⌺ T Ci . By the Cramer-Wold device, Ý nis1 i, n, T « N Ž0, Im . as Ž n, T ª ⬁., if ´ m ᭙ t g ⺢ with 5 t 5 s 1 Ž 8.12. t X n Ý i , n , T « N Ž0, 1. as n, T ª ⬁. is1 Then, by condition Živ. Ž1rn.Ý nis 1Yi, T « N Ž0, ⍀ . as Ž n, T ª ⬁.. To establish Ž8.12., it is sufficient to verify condition Ž3.20.. For given ) 0 and t g ⺢ m with 5 t 5 s 1, we have Ž 8.13. 
t X n Ý E w i , n , T iX, n , T 1< tX i , n , T iX, n , T t < ) 4x t is1 X s t ⍀y1r2 n, T n X X y1r2 < 4x y1r2 Ý E w Ci Qi , T QXi , T CXi 1< tX⍀y1r2 n , T C i Q i , T Q i , T C i ⍀ n , T t ) ⍀ n , T t. is1 1099 NONSTATIONARY PANEL DATA Take the indicator function first. Note that X X X y1r2 < 4 1 < t ⍀y1r2 n , T Ci Q i , T Q i , T Ci ⍀ n , T t ) F1 X y1r2 X X y1r2 < n , T Ci Qi , T Qi , T Ci ⍀ n , T t ) ½ max < t ⍀ 5 t 5s1 X 5 X r2 y1r2 . )4 s 1 ma x Ž ⍀y1 n , T Ci Q i , T Q i , T Ci ⍀ n , T . max 5 C j 5 2 5 Q i , T 5 2 ) F 1 ma x Ž ⍀y1 n, T ½ ž / jFn min Ž ⍀ n , T . ½ s 1 5 Qi , T 5 2 ) 5 5 ½ F 1 5 Qi , T 5 2 ) max 5 C j 5 2 jFn T2 min Ž Ý njs1C j CXj . max Ž 5 C j 5 2 . jFn 5 . Next, expression Ž8.13. is bounded above by Ž 8.14. max 5 t 5s1 n X t ⍀y1r2 n, T ½ X X Ci Q i , T Q i , T Ci 1 5 Q i , T 5 2 ) ÝE is1 n Ž . F y1 min ⍀ n , T Ý nis1 5 Ci 5 2 min Ž ⍀ n , T . jFn ½ ½ i T2 min Ž Ý nis1Ci CXi . max Ž 5 C j 5 2 . T2 min Ž Ý njs1C j CXj . max Ž 5 C j 5 2 . jFn ½ 5 T2 min Ž Ý njs1C j CXj . jFn E 5 Q1 , T 5 2 1 5 Q1 , T 5 2 ) n max 5 Ci 5 2 F max 5 C j 5 2 5 Ci Qi , T 5 2 1 5 Qi , T 5 2 1 ) ÝE is1 F T2 min Ž Ý njs1C j CXj . E 5 Q1 , T 5 2 1 5 Q1 , T 5 2 ) ⍀y1r2 n, T t 5 5 T2 min Ž Ý nis1Ci CXi . max Ž 5 Ci 5 2 . iFn 5 . By conditions Ži. and Žii., n max i 5 Ci 5 2 2 T min Ž Ý nis1Ci CXi . s O Ž1. and T2 min Ž Ý nis1Ci CXi . max i F n Ž 5 Ci 5 2 . ª ⬁, as Ž n, T ª ⬁.. Then, since 5 Q i, T 5 2 is uniformly integrable in T by condition Žiii., it follows that Ž8.14. ª 0 as Ž n, T ª ⬁.. Q. E. D. APPENDIX C: PROOFS FOR SECTION 4ᎏSPURIOUS PANEL REGRESSION LIMIT THEORY The next lemma gives the joint limit theory needed for Theorem 4. LEMMA 13: Suppose that Assumptions 1᎐3 hold. Ža. As Ž n, T ª ⬁., 1 n n Ý is1 1 T 2 T p 1 1 Ý Zi , t ZXi , t ª 2 E⍀ i s 2 ⍀ . ts1 1100 P. C. B. PHILLIPS AND H. R. MOON Žb. If Ž n, T ª ⬁. and nrT ª 0, then n 1 T 1 Ý T Ý Ž Yi , t XiX, t y  Xi , t XiX, t . « N Ž0, ⌰ . . 'n is1 ts1 PROOF OF LEMMA 13: Ža. 
From the BN decomposition of $U_{i,t}$ in (2.4) we have
\[
Z_{i,t} = C_i(1) P_{i,t} + \tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0} \quad \text{a.s.},
\]
where $P_{i,t} = \sum_{s=1}^{t} V_{i,s}$, which leads to
\[
\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Z_{i,t} Z_{i,t}' = \frac{1}{n} \sum_{i=1}^{n} (Q_{i,T} + R_{i,T}) \quad \text{a.s.},
\]
where
\[
\begin{aligned}
Q_{i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} C_i(1) P_{i,t} P_{i,t}' C_i(1)', \\
R_{i,T} &= R_{1,i,T} + R_{1,i,T}' + R_{2,i,T}, \\
R_{1,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} C_i(1) P_{i,t} \bigl(\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\bigr)', \\
R_{2,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl(\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\bigr) \bigl(\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\bigr)'.
\end{aligned}
\]
We show that as $(n, T \to \infty)$, $(1/n) \sum_{i=1}^{n} Q_{i,T} \to_p \frac{1}{2} \Omega$ and $(1/n) \sum_{i=1}^{n} R_{k,i,T} \to_p 0$, for $k = 1, 2$.

The $Q_{i,T}$ are iid across $i$ for all $T$. Also, as $T \to \infty$, $Q_{i,T} \Rightarrow Q_i = C_i(1) \int W_i W_i'\, C_i(1)'$ and $(1/n) \sum_{i=1}^{n} Q_i \to_{a.s.} \frac{1}{2} \Omega$. That is, in sequential asymptotics such as $(T, n \to \infty)_{\text{seq}}$, $(1/n) \sum_{i=1}^{n} Q_{i,T} \to_p \frac{1}{2} \Omega$. According to Corollary 1 (set $C_i = I_m$ so that the second condition is automatically satisfied), if we show that $\|Q_{i,T}\|$ is uniformly integrable in $T$, then it follows that
\[
\frac{1}{n} \sum_{i=1}^{n} Q_{i,T} \to_p \tfrac{1}{2} \Omega \tag{8.15}
\]
as $(n, T \to \infty)$. By $\|AB\| \le \|A\| \|B\|$ and the triangle inequality,
\[
\|Q_{i,T}\| \le \|C_i(1)\|^2\, \frac{1}{T^2} \sum_{t=1}^{T} \|P_{i,t}\|^2. \tag{8.16}
\]
Also, as $T \to \infty$,
\[
\frac{1}{T^2} \sum_{t=1}^{T} \|P_{i,t}\|^2 \Rightarrow \int \|W_i\|^2,
\]
and we have
\[
E\Bigl(\frac{1}{T^2} \sum_{t=1}^{T} \|P_{i,t}\|^2\Bigr) = \operatorname{tr}\Bigl(\frac{1}{T^2} \sum_{t=1}^{T} E(P_{i,t} P_{i,t}')\Bigr) \to E\Bigl(\int \|W_i\|^2\Bigr) = \tfrac{1}{2} \operatorname{tr}(I_m).
\]
It follows (e.g., Billingsley (1968, Theorem 5.4)) that $(1/T^2) \sum_{t=1}^{T} \|P_{i,t}\|^2$ is uniformly integrable in $T$. Since $E\|C_i(1)\|^2 < \infty$ by Lemma 1, we deduce that $\|C_i(1)\|^2 (1/T^2) \sum_{t=1}^{T} \|P_{i,t}\|^2$ is uniformly integrable in $T$. Thus, $\|Q_{i,T}\|$ is uniformly integrable in $T$, and (8.15) follows.

Next, $(1/n) \sum_{i=1}^{n} R_{1,i,T}$ and $(1/n) \sum_{i=1}^{n} R_{2,i,T}$ converge in probability to zero if $E\|R_{1,i,T}\|, E\|R_{2,i,T}\| \to 0$ as $(n, T \to \infty)$. Note that
\[
\begin{aligned}
E\|R_{1,i,T}\| &\le \frac{1}{T^2} \sum_{t=1}^{T} E\bigl[\|C_i(1) P_{i,t}\|\, \|\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\|\bigr] \\
&\le \frac{1}{\sqrt{T}}\, \frac{1}{T} \sum_{t=1}^{T} \sqrt{E\Bigl\|C_i(1) \frac{P_{i,t}}{\sqrt{T}}\Bigr\|^2\, E\|\tilde{U}_{i,0} - \tilde{U}_{i,t} + Z_{i,0}\|^2} \\
&= \frac{1}{\sqrt{T}}\, O(1),
\end{aligned} \tag{8.17}
\]
where the first inequality holds by the triangle inequality and $\|AB\| \le \|A\| \|B\|$, the second inequality holds by the Cauchy-Schwarz inequality, and the last line holds because $E\|C_i(1)(P_{i,t}/\sqrt{T})\|^2 = O(1)$, $E\|Z_{i,0}\|^2 < M_1$, and $E\|\tilde{U}_{i,t}\|^2 < M_2$ $\forall\, t$ and for some $M_1, M_2 < \infty$ by Lemma 1(c). Thus, $E\|R_{1,i,T}\| \to 0$ as $(n, T \to \infty)$. Similar arguments show that $E\|R_{2,i,T}\| = (1/T)\, O(1)$. So, all the desired results hold and part (a) is proved.

(b) Write $C_i(1) = (C_{y_i}(1)', C_{x_i}(1)')'$ and $\tilde{U}_{i,t} = (\tilde{U}_{y_i,t}', \tilde{U}_{x_i,t}')'$, conformably with the partition of $Z_{i,t}$ into $Y_{i,t}$ and $X_{i,t}$. Using the BN decomposition of $U_{i,t}$ in (2.4), we have
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} \bigl(Y_{i,t} X_{i,t}' - \beta X_{i,t} X_{i,t}'\bigr) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(Q_{i,T} + R_{1,i,T} + R_{2,i,T} + R_{3,i,T} - R_{4,i,T} - R_{5,i,T}\bigr),
\]
where
\[
\begin{aligned}
Q_{i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl\{C_{y_i}(1) P_{i,t} P_{i,t}' C_{x_i}(1)' - \beta C_{x_i}(1) P_{i,t} P_{i,t}' C_{x_i}(1)'\bigr\}, \\
R_{1,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl\{C_{y_i}(1) P_{i,t} (\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0})'\bigr\}, \\
R_{2,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl\{(\tilde{U}_{y_i,0} - \tilde{U}_{y_i,t} + Y_{i,0}) P_{i,t}' C_{x_i}(1)'\bigr\}, \\
R_{3,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl\{(\tilde{U}_{y_i,0} - \tilde{U}_{y_i,t} + Y_{i,0}) (\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0})'\bigr\}, \\
R_{4,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl\{\beta C_{x_i}(1) P_{i,t} (\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0})' + \beta (\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0}) P_{i,t}' C_{x_i}(1)'\bigr\}, \\
R_{5,i,T} &= \frac{1}{T^2} \sum_{t=1}^{T} \bigl\{\beta (\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0}) (\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0})'\bigr\}.
\end{aligned}
\]
We show that as $(n, T \to \infty)$
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{i,T} \Rightarrow N(0, \Theta) \tag{8.18}
\]
and as $(n, T \to \infty)$ with $n/T \to 0$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{k,i,T} \to_p 0 \quad (k = 1, \ldots, 5). \tag{8.19}
\]
Note that
\[
EQ_{i,T} = \frac{1}{T^2} \sum_{t=1}^{T} t\, \bigl(E[C_{y_i}(1) C_{x_i}(1)'] - \beta E[C_{x_i}(1) C_{x_i}(1)']\bigr) = 0.
\]
Also,
\[
\begin{aligned}
& \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} E\bigl[\operatorname{vec}(P_{i,t} P_{i,t}') \operatorname{vec}(P_{i,s} P_{i,s}')'\bigr] \\
&\quad = \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} E\bigl[P_{i,t} P_{i,s}' \otimes P_{i,t} P_{i,s}'\bigr] \\
&\quad = \frac{1}{T^2} \sum_{t=1}^{T} \sum_{s=1}^{T} \Bigl\{\Bigl(\frac{t \wedge s}{T}\Bigr)^2 \bigl(I_{m^2} + K_m + \operatorname{vec} I_m (\operatorname{vec} I_m)'\bigr) + \Bigl(\frac{t \wedge s}{T}\Bigr) \Bigl(\frac{t \vee s}{T} - \frac{t \wedge s}{T}\Bigr) \operatorname{vec} I_m (\operatorname{vec} I_m)'\Bigr\} + O\Bigl(\frac{1}{T}\Bigr) \\
&\quad = \Xi_T + O\Bigl(\frac{1}{T}\Bigr),
\end{aligned} \tag{8.20}
\]
say, and
\[
\begin{aligned}
& E\bigl(\operatorname{vec}(Q_{i,T}) \operatorname{vec}(Q_{i,T})'\bigr) \\
&\quad = E\Bigl[\bigl(C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\bigr)\, \frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} E\bigl[\operatorname{vec}(P_{i,t} P_{i,t}') \operatorname{vec}(P_{i,s} P_{i,s}')'\bigr]\, \bigl(C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\bigr)'\Bigr] \\
&\quad = E\Bigl[\bigl(C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\bigr) \Bigl(\Xi_T + O\Bigl(\frac{1}{T}\Bigr)\Bigr) \bigl(C_{x_i}(1) \otimes (C_{y_i}(1) - \beta C_{x_i}(1))\bigr)'\Bigr] \\
&\quad = \Theta_T,
\end{aligned} \tag{8.21}
\]
say. It is easy to see that $\Theta_T \to \Theta$ as $T \to \infty$. So $\{Q_{i,T}\}_i$ is an iid sequence with mean zero and covariance matrix $\Theta_T$.

Next, apply Theorem 3 with $C_i = I_{m_y m_x}$ to establish that $(1/\sqrt{n}) \sum_{i=1}^{n} Q_{i,T} \Rightarrow N(0, \Theta)$ as $(n, T \to \infty)$. Conditions (i), (ii), and (iv) of the theorem are obviously satisfied in view of the fact that $C_i = I_{m_y m_x}$ and $\Theta_T \to \Theta$ as $T \to \infty$. For the uniform integrability of $\|Q_{i,T}\|^2$, note by the continuous mapping theorem that as $T \to \infty$
\[
\|Q_{i,T}\|^2 \Rightarrow \|Q_i\|^2 = \Bigl\|\int \bigl\{C_{y_i}(1) W_i W_i' C_{x_i}(1)' - \beta C_{x_i}(1) W_i W_i' C_{x_i}(1)'\bigr\}\Bigr\|^2.
\]
Then, $\|Q_{i,T}\|^2$ is uniformly integrable in $T$ because
\[
E\|Q_{i,T}\|^2 = \operatorname{tr}\bigl(E(\operatorname{vec}(Q_{i,T}) \operatorname{vec}(Q_{i,T})')\bigr) = \operatorname{tr}(\Theta_T) \to \operatorname{tr}(\Theta) = \operatorname{tr}\bigl(E(\operatorname{vec}(Q_i) \operatorname{vec}(Q_i)')\bigr) = E\|Q_i\|^2.
\]
Next, to prove $(1/\sqrt{n}) \sum_{i=1}^{n} R_{k,i,T} \to_p 0$ as $(n, T \to \infty)$ with $n/T \to 0$, we simply show that $\sqrt{n}\, E\|R_{k,i,T}\| \to 0$ as $(n, T \to \infty)$ with $n/T \to 0$ for $k = 1, \ldots, 5$. Note that
\[
\sqrt{n}\, E\|R_{1,i,T}\| \le \sqrt{n}\, \frac{1}{T^2} \sum_{t=1}^{T} E\bigl\{\|C_{y_i}(1) P_{i,t}\|\, \|\tilde{U}_{x_i,0} - \tilde{U}_{x_i,t} + X_{i,0}\|\bigr\} = \sqrt{\frac{n}{T}}\, O(1) \to 0,
\]
where the first inequality holds by the triangle inequality and $\|AB\| \le \|A\| \|B\|$, and the last equality holds in view of (8.17) above. Similarly we can show that $\sqrt{n}\, E\|R_{2,i,T}\|, \sqrt{n}\, E\|R_{4,i,T}\| = \sqrt{n/T}\, O(1)$ and $\sqrt{n}\, E\|R_{3,i,T}\|, \sqrt{n}\, E\|R_{5,i,T}\| = (\sqrt{n}/T)\, O(1)$. So we have the desired limits and part (b) follows. Q.E.D.
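As a numerical illustration of Lemma 13(a), the following is a minimal simulation sketch under simplifying assumptions that are not part of the paper: $C_i(1) = I_m$ for every $i$, zero initial conditions, and iid standard normal innovations, so that $Z_{i,t} = P_{i,t}$ and $\Omega = I_m$. The function name `panel_second_moment` and the sample sizes are illustrative choices only.

```python
import numpy as np


def panel_second_moment(n, T, m, seed=0):
    """Return (1/n) sum_i (1/T^2) sum_t P_{i,t} P_{i,t}' for n independent
    m-dimensional Gaussian random walks (assumed setup: C_i(1) = I_m)."""
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((n, T, m))   # innovations V_{i,t}
    P = np.cumsum(V, axis=1)             # partial sums P_{i,t} = sum_{s<=t} V_{i,s}
    # Sum the outer products P_{i,t} P_{i,t}' over i and t, then normalize.
    return np.einsum("ita,itb->ab", P, P) / (n * T**2)


M = panel_second_moment(n=1000, T=200, m=2)
print(np.round(M, 3))
```

Under these assumptions the exact mean of the statistic is $\frac{T+1}{2T} I_m$, so the printed matrix should lie close to $\frac{1}{2} I_m$, in line with the limit $\frac{1}{2}\Omega$ in part (a).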
PROOF OF THEOREM 4: By Lemma 13(a), it is easy to see that as $(n, T \to \infty)$
\[
\hat{\beta}_{n,T} = \Bigl(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} Y_{i,t} X_{i,t}'\Bigr) \Bigl(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}'\Bigr)^{-1} \to_p \Omega_{yx} \Omega_{xx}^{-1} = \beta.
\]
Also, when $(n, T \to \infty)$ with $n/T \to 0$, from Lemma 13(a) and (b) we have
\[
\begin{aligned}
\sqrt{n}\, (\hat{\beta}_{n,T} - \beta) &= \Bigl(\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} \bigl(Y_{i,t} X_{i,t}' - \beta X_{i,t} X_{i,t}'\bigr)\Bigr) \Bigl(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}'\Bigr)^{-1} \\
&\Rightarrow N\bigl(0,\, 4 (\Omega_{xx}^{-1} \otimes I_{m_y}) \Theta (\Omega_{xx}^{-1} \otimes I_{m_y})\bigr),
\end{aligned}
\]
giving the required result. Q.E.D.

8.4. APPENDIX D: PROOFS FOR SECTION 5.1 (HETEROGENEOUS PANEL COINTEGRATION LIMIT THEORY)

The following two lemmas give some useful results on the existence of moments of the heterogeneous coefficients $\beta_i$ in (5.2) and the random coefficients in the linear process representation (5.3). Both results are proved in PM$_b$.

LEMMA 14: Under Assumptions 4, 5, and 7, $E\|\beta_i\|^8 < \infty$.

LEMMA 15: Let $G_i(1) = \sum_{s=0}^{\infty} G_{i,s}$, $\tilde{G}_{i,s} = \sum_{t=s+1}^{\infty} G_{i,t}$, and $\tilde{F}_{i,t} = \sum_{s=0}^{\infty} \tilde{G}_{i,s} V_{i,t-s}$. Suppose that Assumptions 4–7 hold. Then:
(a) $E\bigl(\sum_{s=0}^{\infty} s^2 \|G_{i,s}\|^2\bigr) < \infty$.
(b) $E\|F_{i,t}\|^2 < M$ for some constant $M < \infty$.
(c) $E\|G_i(1)\|^4 < \infty$.
(d) $E\|\tilde{F}_{i,t}\|^4 < M$ for some constant $M < \infty$.

2. PROOF OF LEMMA 7: As in the proof of Lemma 3, since $\{\tilde{F}_{i,t}\}_t$ is strictly stationary and $\tilde{F}_{i,t}$ is square integrable from Lemma 15(d), it follows that
\[
\sup_{0 \le r \le 1} \frac{1}{\sqrt{T}} \bigl|\tilde{F}_{a,i,[Tr]}\bigr| \to_p 0 \quad \text{as } T \to \infty \text{ for all } a, i, \tag{8.22}
\]
where $\tilde{F}_{a,i,[Tr]}$ is the $a$th element of $\tilde{F}_{i,[Tr]}$. The functional law follows directly from (5.5) and Donsker's theorem applied to $(1/\sqrt{T}) \sum_{t=1}^{[Tr]} V_{i,t}$. Q.E.D.

3. PROOF OF LEMMA 8: The proof follows the same lines as Phillips (1988) and is omitted here (details are available in PM$_b$). Q.E.D.

4. PROOF OF THEOREM 5: The proof follows similar lines to that of Lemma 13 and Theorem 4 above but makes use of the bounds established in Lemmas 14 and 15 and the panel BN decomposition (5.4). The details are lengthy and to save space they are not repeated here. They are given in full in PM$_b$. Q.E.D.

5. PROOF OF THEOREM 6: The proof proceeds by showing that as $(n, T \to \infty)$ with $n/T \to 0$,
\[
\hat{\Theta} = \frac{1}{n} \sum_{i=1}^{n} \Bigl\{\frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} X_{i,t} X_{i,s}' \otimes \hat{E}_{i,t} \hat{E}_{i,s}'\Bigr\} \to_p \Theta,
\]
and
\[
\hat{\Omega}_{xx}^{-1} = \Bigl\{\frac{1}{n} \sum_{i=1}^{n} \frac{2}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}'\Bigr\}^{-1} \to_p \Omega_{xx}^{-1}.
\]
Then, by Theorem 5 and the delta method, the proof is complete. From Lemma 13, we know that $(1/n) \sum_{i=1}^{n} (2/T^2) \sum_{t=1}^{T} X_{i,t} X_{i,t}' \to_p \Omega_{xx}$ as $(n, T \to \infty)$. In consequence, $\hat{\Omega}_{xx}^{-1} \to_p \Omega_{xx}^{-1}$ as $(n, T \to \infty)$ since $\Omega_{xx} > 0$. Again, full details are available in PM$_b$. Q.E.D.

6. PROOF OF THEOREM 7: Under the null hypothesis, using the cross section independence and applying Theorem 5, we have
\[
\sqrt{n_b}\, (\hat{\beta}_a - \hat{\beta}_b) = (n_b/n_a)^{1/2} \sqrt{n_a}\, (\hat{\beta}_a - \beta_a) - \sqrt{n_b}\, (\hat{\beta}_b - \beta_b) \Rightarrow N(0, V_a + V_b),
\]
when $(T, n_a, n_b \to \infty)$ and $n_a/T, n_b/T \to 0$, and where $V_j = 4 (\Omega_{j,xx}^{-1} \otimes I_{m_y}) \Theta_j (\Omega_{j,xx}^{-1} \otimes I_{m_y})$ for $j = a, b$. As in the proof of Theorem 6, we can show that as $(T, n_a, n_b \to \infty)$ with $n_a/T, n_b/T \to 0$, we have
\[
\hat{\Theta}_j = \frac{1}{n_j} \sum_{i \in I_j} \Bigl\{\frac{1}{T^4} \sum_{t=1}^{T} \sum_{s=1}^{T} X_{i,t} X_{i,s}' \otimes \hat{E}_{i,t} \hat{E}_{i,s}'\Bigr\} \to_p \Theta_j,
\]
and
\[
\hat{\Omega}_{j,xx}^{-1} = \Bigl\{\frac{1}{n_j} \sum_{i \in I_j} \frac{2}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}'\Bigr\}^{-1} \to_p \Omega_{j,xx}^{-1}.
\]
Consequently, $\hat{V}_j \to_p V_j$, and $\hat{V}_{a-b} = (n_b/n_a) \hat{V}_a + \hat{V}_b \to_p V_a + V_b$. It follows that
\[
W_{a,b} = n_b \bigl\{\operatorname{vec}(\hat{\beta}_a - \hat{\beta}_b)\bigr\}' \hat{V}_{a-b}^{-1} \bigl\{\operatorname{vec}(\hat{\beta}_a - \hat{\beta}_b)\bigr\} \Rightarrow \chi^2_{m_y m_x},
\]
giving the required result.

APPENDIX E: PROOFS FOR SECTION 5.2 (HOMOGENEOUS PANEL COINTEGRATION LIMIT THEORY)

Before we start the proof of Theorem 9, we give the following useful lemma.

LEMMA 16: Let $F_{i,t} = (E_{i,t}', U_{x_i,t}')' = \sum_{s=0}^{\infty} G_s V_{i,t-s}$ be the panel process defined in Model (5.11). Also, let $S_{i,t} = \sum_{s=1}^{t} F_{i,s} + S_{i,0}$, where $S_{i,0}$ are iid with $E\|S_{i,0}\|^4 < \infty$, $\Omega_F = G(1) G(1)'$, $\Lambda_F = \sum_{k=0}^{\infty} \sum_{s=0}^{\infty} G_{s+k} G_s'$, and $G(1) = \sum_{s=0}^{\infty} G_s$. Then, under the summability condition Assumption 9 and positive definiteness condition Assumption 10, as $(n, T \to \infty)$
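with $n/T \to 0$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} F_{i,t} S_{i,t}' - \Lambda_F\Bigr) \Rightarrow N\bigl(0, \tfrac{1}{2}\, \Omega_F \otimes \Omega_F\bigr).
\]

PROOF OF LEMMA 16: Using the BN decomposition as in the proof of Lemma 8, we have
\[
\begin{aligned}
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} F_{i,t} S_{i,t}' - \Lambda_F\Bigr)
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} G(1) \bigl(V_{i,t} P_{i,t}' - I_m\bigr) G(1)'\Bigr) \\
&\quad + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T-1} \Bigl(\tilde{F}_{i,t} F_{i,t+1}' - \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr)\Bigr) \\
&\quad - \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(G(1) V_{i,t} \tilde{F}_{i,t}' - G(1) \tilde{G}_0'\bigr)\Bigr) \\
&\quad + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} G(1) V_{i,t} \bigl(\tilde{F}_{i,0} + S_{i,0}\bigr)'\Bigr) \\
&\quad - \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T} \tilde{F}_{i,T} S_{i,T}' + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T} \tilde{F}_{i,0} S_{i,1}' \\
&\quad - \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T} \sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}' \quad \text{a.s.} \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(Q_{i,T} + R_{1,i,T} + R_{2,i,T} + R_{3,i,T} + R_{4,i,T} + R_{5,i,T}\bigr) + O\Bigl(\frac{\sqrt{n}}{T}\Bigr) \quad \text{a.s.},
\end{aligned}
\]
say. We show that $(1/\sqrt{n}) \sum_{i=1}^{n} Q_{i,T} \Rightarrow N(0, \frac{1}{2}\, \Omega_F \otimes \Omega_F)$, and $(1/\sqrt{n}) \sum_{i=1}^{n} R_{k,i,T} \to_p 0$, $k = 1, \ldots, 5$, as $(n, T \to \infty)$ with $n/T \to 0$.

Note that
\[
\begin{aligned}
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{i,T} &= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=2}^{T} G(1) V_{i,t} P_{i,t-1}' G(1)'\Bigr) + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} G(1) \bigl(V_{i,t} V_{i,t}' - I_m\bigr) G(1)'\Bigr) \\
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(Q_{1,i,T} + Q_{2,i,T}\bigr),
\end{aligned}
\]
say. Since $E\bigl((1/T) \sum_{t=1}^{T} G(1) (V_{i,t} V_{i,t}' - I_m) G(1)'\bigr) = 0$, we have
\[
\begin{aligned}
E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{2,i,T}\Bigr\|^2 &= E\Bigl\|\frac{1}{T} \sum_{t=1}^{T} G(1) \bigl(V_{i,t} V_{i,t}' - I_m\bigr) G(1)'\Bigr\|^2 \\
&\le \|G(1)\|^4 \operatorname{tr}\Bigl\{\frac{1}{T^2} \sum_{t=1}^{T} \sum_{s=1}^{T} E\bigl(V_{i,t} V_{i,s}' \otimes V_{i,t} V_{i,s}'\bigr) - \operatorname{vec}(I_m) (\operatorname{vec}(I_m))'\Bigr\} \\
&= O\Bigl(\frac{1}{T}\Bigr).
\end{aligned}
\]
Thus, $(1/\sqrt{n}) \sum_{i=1}^{n} Q_{2,i,T} = o_p(1)$. Next, observe that
\[
\begin{aligned}
E\bigl[\operatorname{vec}(Q_{1,i,T}) (\operatorname{vec}(Q_{1,i,T}))'\bigr] &= \frac{1}{T^2} \sum_{t=2}^{T} \sum_{s=2}^{T} (G(1) \otimes G(1))\, E\bigl(P_{i,t-1} P_{i,s-1}' \otimes V_{i,t} V_{i,s}'\bigr)\, (G(1)' \otimes G(1)') \\
&= \frac{1}{T} \sum_{t=2}^{T} \Bigl(\frac{t-1}{T}\Bigr) \bigl(G(1) G(1)' \otimes G(1) G(1)'\bigr) \equiv \Xi_T^* \\
&\to \tfrac{1}{2} \bigl(G(1) G(1)' \otimes G(1) G(1)'\bigr) = \tfrac{1}{2} (\Omega_F \otimes \Omega_F) \equiv \Xi^*,
\end{aligned}
\]
say. Also, note that $\frac{1}{2} (\Omega_F \otimes \Omega_F) > 0$. These verify conditions (i), (ii), and (iv) of Theorem 3. Condition (iii) of Theorem 3 holds because
\[
\|Q_{1,i,T}\|^2 \Rightarrow \|Q_i\|^2 = \Bigl\|G(1) \int dW_i\, W_i'\, G(1)'\Bigr\|^2
\]
and $E\|Q_{1,i,T}\|^2 = \operatorname{tr}(\Xi_T^*) \to \operatorname{tr}(\Xi^*)$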
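$= E\|Q_i\|^2$, so that the $\|Q_{1,i,T}\|^2$ are uniformly integrable in $T$. By Theorem 3, $(1/\sqrt{n}) \sum_{i=1}^{n} Q_{1,i,T} \Rightarrow N(0, \frac{1}{2}\, \Omega_F \otimes \Omega_F)$.

Next, we show that $(1/\sqrt{n}) \sum_{i=1}^{n} R_{1,i,T} \to_p 0$ by proving $E\|(1/\sqrt{n}) \sum_{i=1}^{n} R_{1,i,T}\|^2 \to 0$ as $(n, T \to \infty)$. Note that
\[
\begin{aligned}
& E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{1,i,T}\Bigr\|^2 = \operatorname{tr}\bigl[E\bigl(\operatorname{vec}(R_{1,i,T}) (\operatorname{vec}(R_{1,i,T}))'\bigr)\bigr] \quad \text{since } E(R_{1,i,T}) = 0 \\
&\quad = \operatorname{tr}\Bigl\{\frac{1}{T^2} \sum_{t=1}^{T-1} \sum_{s=1}^{T-1} \Bigl[E\Bigl(\operatorname{vec}\Bigl(\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \tilde{G}_j V_{i,t-j} V_{i,t+1-k}' G_k'\Bigr) \operatorname{vec}\Bigl(\sum_{p=0}^{\infty} \sum_{q=0}^{\infty} \tilde{G}_p V_{i,s-p} V_{i,s+1-q}' G_q'\Bigr)'\Bigr) \\
&\qquad\qquad - \operatorname{vec}\Bigl(\sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr) \Bigl(\operatorname{vec}\Bigl(\sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr)\Bigr)'\Bigr]\Bigr\} \\
&\quad = \operatorname{tr}\Bigl\{\frac{1}{T} \sum_{h=-T+2}^{T-2} \Bigl(\frac{T - 1 - |h|}{T}\Bigr) \Bigl[\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} (G_k \otimes \tilde{G}_j)\, E\bigl(V_{i,t+1-k} V_{i,t+h+1-q}' \otimes V_{i,t-j} V_{i,t+h-p}'\bigr)\, (G_q' \otimes \tilde{G}_p') \\
&\qquad\qquad - \operatorname{vec}\Bigl(\sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr) \Bigl(\operatorname{vec}\Bigl(\sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr)\Bigr)'\Bigr]\Bigr\}.
\end{aligned}
\]
If we show
\[
\sum_{h=0}^{\infty} \operatorname{tr}\Bigl\{\sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} (G_k \otimes \tilde{G}_j)\, E\bigl(V_{i,t+1-k} V_{i,t+h+1-q}' \otimes V_{i,t-j} V_{i,t+h-p}'\bigr)\, (G_q' \otimes \tilde{G}_p') - \operatorname{vec}\Bigl(\sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr) \Bigl(\operatorname{vec}\Bigl(\sum_{s=0}^{\infty} \tilde{G}_s G_{s+1}'\Bigr)\Bigr)'\Bigr\} < \infty,
\]
then, by Cesàro summability, it follows that $E\|(1/\sqrt{n}) \sum_{i=1}^{n} R_{1,i,T}\|^2 = O(1/T)$. Observe that the left-hand side equals
\[
\begin{aligned}
& \sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \operatorname{tr}\bigl(G_k G_{k+h}' \otimes \tilde{G}_j \tilde{G}_{j+h}'\bigr) \\
&\quad + \sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \sum_{k = 0 \vee (1-h)}^{\infty} \operatorname{tr}\bigl[\bigl(G_k \tilde{G}_{k+h-1}' \otimes \tilde{G}_j G_{j+h+1}'\bigr) K_m\bigr] \\
&\quad + (v_4 - 3) \sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \operatorname{tr}\Bigl[(G_{j+1} \otimes \tilde{G}_j) \Bigl(\sum_{l=1}^{m} e_{l,l} \otimes e_{l,l}\Bigr) (G_{j+h+1}' \otimes \tilde{G}_{j+h}')\Bigr] \\
&= I + II + III,
\end{aligned} \tag{8.23}
\]
say. Since $\operatorname{tr}(A \otimes B) = \operatorname{tr}(A) \operatorname{tr}(B)$ and $\operatorname{tr}(A) \le (\operatorname{rows}(A))^{1/2} \|A\|$ from (8.2) in Lemma 9, we have
\[
\begin{aligned}
I &= \sum_{h=0}^{\infty} \operatorname{tr}\Bigl(\sum_{k=0}^{\infty} G_k G_{k+h}'\Bigr) \operatorname{tr}\Bigl(\sum_{j=0}^{\infty} \tilde{G}_j \tilde{G}_{j+h}'\Bigr) \\
&\le \sum_{h=0}^{\infty} \Bigl|\operatorname{tr}\Bigl(\sum_{k=0}^{\infty} G_k G_{k+h}'\Bigr)\Bigr|\, \Bigl|\operatorname{tr}\Bigl(\sum_{j=0}^{\infty} \tilde{G}_j \tilde{G}_{j+h}'\Bigr)\Bigr| \\
&\le m \Bigl(\sum_{k=0}^{\infty} \|G_k\|\Bigr)^2 \Bigl(\sum_{k=0}^{\infty} \|\tilde{G}_k\|\Bigr)^2 < \infty
\end{aligned}
\]
by Assumption 9.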
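By Lemma 10,
\[
\begin{aligned}
II &\le \sum_{k=1}^{\infty} \sum_{j=0}^{\infty} \|G_k\| \|\tilde{G}_{k-1}\| \|\tilde{G}_j\| \|G_{j+1}\| + \sum_{h=1}^{\infty} \sum_{k=0}^{\infty} \sum_{j=0}^{\infty} \|G_k\| \|\tilde{G}_{k+h-1}\| \|\tilde{G}_j\| \|G_{j+h+1}\| \\
&\le \Bigl(\sum_{j=0}^{\infty} \|G_j\|\Bigr)^4 + \Bigl(\sum_{h=0}^{\infty} \sum_{k=0}^{\infty} \|G_k\| \|\tilde{G}_{k+h}\|\Bigr) \Bigl(\sum_{h=0}^{\infty} \sum_{j=0}^{\infty} \|\tilde{G}_j\| \|G_{j+h}\|\Bigr) \\
&\le \Bigl(\sum_{j=0}^{\infty} \|G_j\|\Bigr)^4 + \Bigl(\sum_{j=0}^{\infty} j \|G_j\|\Bigr)^2 \Bigl(\sum_{j=0}^{\infty} \|G_j\|\Bigr)^2 < \infty.
\end{aligned}
\]
Similarly, we can show that for some $M > 0$
\[
III \le M \Bigl(\sum_{j=0}^{\infty} \|G_j\|\Bigr)^4 < \infty.
\]
Also, we can show by modifying the arguments used above that
\[
E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{2,i,T}\Bigr\|^2,\; E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{3,i,T}\Bigr\|^2 = O\Bigl(\frac{1}{T}\Bigr),
\]
and
\[
E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{4,i,T}\Bigr\| = O\Bigl(\sqrt{\frac{n}{T}}\Bigr), \quad E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} R_{5,i,T}\Bigr\| = O\Bigl(\sqrt{\frac{n}{T}}\Bigr),
\]
so all the desired results are proved and the lemma follows. Q.E.D.

PROOF OF THEOREM 9: To establish joint limit normality of the PFM estimator $\hat{\beta}_{PFM}$, it is enough to show that, as $(n, T \to \infty)$ with $n/T \to 0$,
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(\hat{E}_{i,t}^{+} X_{i,t}' - \hat{\Lambda}_{ex}^{+}\bigr)\Bigr) \Rightarrow N\bigl(0, \tfrac{1}{2} (\Omega_{xx} \otimes \Omega_{e.x})\bigr),
\]
and
\[
\Bigl\{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} X_{i,t} X_{i,t}'\Bigr\}^{-1} \to_p 2\, \Omega_{xx}^{-1}.
\]
The proof of the latter result is similar to the proof of Lemma 13(a), so we concentrate on the former.

Let $\Lambda_{ex}^{+} = \Lambda_{ex} - \Omega_{ex} \Omega_{xx}^{-1} \Lambda_{xx}$ and $E_{i,t}^{+} = E_{i,t} - \Omega_{ex} \Omega_{xx}^{-1} \Delta X_{i,t}$. Then
\[
\begin{aligned}
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(\hat{E}_{i,t}^{+} X_{i,t}' - \hat{\Lambda}_{ex}^{+}\bigr)\Bigr)
&= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+}\bigr)\Bigr) \\
&\quad - \bigl(\hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} - \Omega_{ex} \Omega_{xx}^{-1}\bigr) \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(\Delta X_{i,t} X_{i,t}' - \Lambda_{xx}\bigr)\Bigr) \\
&\quad - \sqrt{n}\, \bigl(\hat{\Lambda}_{ex} - \Lambda_{ex}\bigr) + \hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} \sqrt{n}\, \bigl(\hat{\Lambda}_{xx} - \Lambda_{xx}\bigr).
\end{aligned}
\]
First, $(\hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} - \Omega_{ex} \Omega_{xx}^{-1}) (1/\sqrt{n}) \sum_{i=1}^{n} \bigl((1/T) \sum_{t=1}^{T} (\Delta X_{i,t} X_{i,t}' - \Lambda_{xx})\bigr) = o_p(1)$ because $\hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} - \Omega_{ex} \Omega_{xx}^{-1} = o_p(1)$ and $(1/\sqrt{n}) \sum_{i=1}^{n} \bigl((1/T) \sum_{t=1}^{T} (\Delta X_{i,t} X_{i,t}' - \Lambda_{xx})\bigr) = O_p(1)$ by Lemma 16 and $E\|X_{i,0}\|^4 < M$ for some constant $M$. Next, according to Theorems 9 and 10 in Hannan (1970, pp. 280–283) (or Proposition 1 in Andrews (1991)), we know that $E\|\hat{\Omega}_{F,i} - E\hat{\Omega}_{F,i}\|^2 = (K/T)\, O(1)$, and $\|E\hat{\Omega}_{F,i} - \Omega_F\|^2 = (1/K^{2q})\, O(1)$.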
Thus,
\[
\begin{aligned}
E\bigl\|\sqrt{n}\, (\hat{\Omega}_F - \Omega_F)\bigr\|^2 &= E\Bigl\|\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl(\hat{\Omega}_{F,i} - E\hat{\Omega}_{F,i} + E\hat{\Omega}_{F,i} - \Omega_F\bigr)\Bigr\|^2 \\
&= E\|\hat{\Omega}_{F,i} - E\hat{\Omega}_{F,i}\|^2 + n\, \|E\hat{\Omega}_{F,i} - \Omega_F\|^2 \\
&= \Bigl(\frac{K}{T} + \frac{n}{K^{2q}}\Bigr) O(1).
\end{aligned}
\]
Since the bandwidth parameter $K \to \infty$ with $K/T \to 0$ and $K^{2q}/T \to \epsilon > 0$ for some $q > \frac{1}{2}$ by Assumption 11, we have $n/K^{2q} = (n/T)(T/K^{2q}) \to 0$ when $n/T \to 0$, and it follows that $E\|\sqrt{n}\, (\hat{\Omega}_F - \Omega_F)\|^2 \to 0$ as $(n, T \to \infty)$ with $n/T \to 0$. The same argument can be applied to $\hat{\Lambda}_F$. In consequence, we have
\[
\sqrt{n}\, (\hat{\Lambda}_{ex} - \Lambda_{ex}),\; \hat{\Omega}_{ex} \hat{\Omega}_{xx}^{-1} \sqrt{n}\, (\hat{\Lambda}_{xx} - \Lambda_{xx}) = o_p(1).
\]
The remainder of the proof involves showing that
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+}\bigr)\Bigr) \Rightarrow N\bigl(0, \tfrac{1}{2} (\Omega_{xx} \otimes \Omega_{e.x})\bigr),
\]
and this is entirely analogous to the proof of Lemma 16. The main contribution of $(1/\sqrt{n}) \sum_{i=1}^{n} \bigl((1/T) \sum_{t=1}^{T} (E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+})\bigr)$ from the BN decomposition is
\[
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl(\frac{1}{T} \sum_{t=1}^{T} \bigl(G_e(1) - \Omega_{ex} \Omega_{xx}^{-1} C_x(1)\bigr) V_{i,t} P_{i,t-1}' C_x(1)'\Bigr) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Q_{i,T},
\]
and it is easy to see that
\[
E\bigl[\operatorname{vec}(Q_{i,T}) (\operatorname{vec}(Q_{i,T}))'\bigr] = \Bigl(\frac{1}{T} \sum_{t=1}^{T} \frac{t-1}{T}\Bigr) (\Omega_{xx} \otimes \Omega_{e.x}) \to \tfrac{1}{2} (\Omega_{xx} \otimes \Omega_{e.x}) = E\bigl[\operatorname{vec}(Q_i) (\operatorname{vec}(Q_i))'\bigr],
\]
where $Q_i = \bigl(G_e(1) - \Omega_{ex} \Omega_{xx}^{-1} C_x(1)\bigr) \int dW_i\, W_i'\, C_x(1)'$. Thus, by Theorem 3, we have the desired result.

All the remainder terms in the BN decomposition of $(1/\sqrt{n}) \sum_{i=1}^{n} \bigl((1/T) \sum_{t=1}^{T} (E_{i,t}^{+} X_{i,t}' - \Lambda_{ex}^{+})\bigr)$ converge in probability to zero by Lemma 16 and the moment bound $E\|X_{i,0}\|^4 < M$, for some constant $M$. Q.E.D.

REFERENCES

ANDREWS, D. W. K. (1991): "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59, 817–858.
BALTAGI, B. (1995): Econometric Analysis of Panel Data. New York: Wiley.
BILLINGSLEY, P. (1968): Convergence of Probability Measures. New York: Wiley.
BILLINGSLEY, P. (1986): Probability and Measure. New York: Wiley.
CHAMBERLAIN, G. (1984): "Panel Data," in Handbook of Econometrics, Vol. 2, ed. by Z. Griliches and M. Intriligator. Amsterdam: North-Holland.
CONLEY, T. (1997): "Econometric Modelling of Cross Sectional Dependence," Mimeo.
DUDLEY, R. (1989): Real Analysis and Probability. Pacific Grove: Wadsworth & Brooks/Cole Mathematics Series.
EICKER, F. (1963): "Central Limit Theorems for Families of Sequences of Random Variables," Annals of Mathematical Statistics, 34, 439–446.
GRANGER, C. W. J., AND P. NEWBOLD (1974): "Spurious Regressions in Econometrics," Journal of Econometrics, 2, 111–120.
HSIAO, C. (1986): Analysis of Panel Data. Cambridge: Cambridge University Press.
HALL, P., AND C. HEYDE (1980): Martingale Limit Theory and its Applications. New York: Academic Press.
HANNAN, E. (1970): Multiple Time Series. New York: Wiley.
IM, K., H. PESARAN, AND Y. SHIN (1996): "Testing for Unit Roots in Heterogeneous Panels," Mimeo.
LEVIN, A., AND C. LIN (1993): "Unit Root Tests in Panel Data: New Results," UC San Diego Working Paper.
MAGNUS, J., AND H. NEUDECKER (1988): Matrix Differential Calculus. New York: Wiley.
MATYAS, L., AND P. SEVESTRE (EDS.) (1992): The Econometrics of Panel Data. Boston, MA: Kluwer Academic Publishers.
MUIRHEAD, R. (1982): Aspects of Multivariate Statistical Theory. New York: Wiley.
PEDRONI, P. (1995): "Panel Cointegration; Asymptotic and Finite Sample Properties of Pooled Time Series Tests with an Application to the PPP Hypothesis," Indiana University Working Papers in Economics No. 95-013.
PESARAN, H., AND R. SMITH (1995): "Estimating Long-Run Relationships from Dynamic Heterogeneous Panels," Journal of Econometrics, 68, 79–113.
PHILLIPS, P. C. B. (1986): "Understanding Spurious Regressions in Econometrics," Journal of Econometrics, 33, 311–340.
PHILLIPS, P. C. B. (1988): "Weak Convergence of Sample Covariance Matrices to Stochastic Integrals via Martingale Approximations," Econometric Theory, 4, 528–533.
PHILLIPS, P. C. B., AND S. DURLAUF (1986): "Multiple Time Series Regression with Integrated Processes," Review of Economic Studies, 53, 473–495.
PHILLIPS, P. C. B., AND B. HANSEN (1990): "Statistical Inference in Instrumental Variables Regression with I(1) Processes," Review of Economic Studies, 57, 99–125.
PHILLIPS, P. C. B., AND C. LEE (1996): "Efficiency Gains from Quasi-Differencing under Nonstationarity," in Athens Conference on Applied Probability and Time Series: Volume II, Time Series Analysis in Memory of E. J. Hannan, ed. by P. M. Robinson and M. Rosenblatt. New York: Springer-Verlag.
PHILLIPS, P. C. B., AND H. MOON (1997a): "Linear Regression Limit Theory for Nonstationary Panel Data," University of Auckland Discussion Paper in Economics.
PHILLIPS, P. C. B., AND H. MOON (1997b): "Linear Regression Limit Theory for Nonstationary Panel Data," Yale University, Mimeographed.
PHILLIPS, P. C. B., AND V. SOLO (1992): "Asymptotics for Linear Processes," Annals of Statistics, 20, 971–1001.
POLLARD, D. (1984): Convergence of Stochastic Processes. New York: Springer-Verlag.
QUAH, D. (1994): "Exploiting Cross-Section Variations for Unit Root Inference in Dynamic Data," Economics Letters, 44, 9–19.
ROBERTSON, D., AND J. SYMONS (1992): "Some Strange Properties of Panel Data Estimators," Journal of Applied Econometrics, 7, 175–189.
SHORACK, G., AND J. WELLNER (1986): Empirical Processes with Applications to Statistics. New York: Wiley.