
Econometrics Journal (2005), volume 8, pp. 23–38.
Granger’s representation theorem: A closed-form
expression for I(1) processes
PETER REINHARD HANSEN
Stanford University
E-mail: [email protected]
Received: March 2004
Summary The Granger representation theorem states that a cointegrated vector
autoregressive process can be decomposed into four components: a random walk, a stationary
process, a deterministic part, and a term that depends on the initial values. In this paper, we
present a new proof of the theorem. This proof enables us to derive closed-form expressions
of all terms of the representation and allows a unified treatment of models with different
deterministic specifications. The applicability of our results is illustrated by examples. For
example, the closed-form expressions are useful for impulse response analyses and facilitate
the analysis of cointegration models with structural changes.
Key words: Cointegration, Granger representation, Impulse responses.
1. INTRODUCTION
Two theorems—one due to Engle and Granger (1987) and one due to Johansen (1991)—are both
referred to as the Granger representation theorem (GRT). The former asserts the existence of
an autoregressive error correction representation of a process, X_t, under the assumptions that ΔX_t and β′X_t have stationary and invertible VARMA representations for some matrix β. The latter provides the moving average representation of a vector autoregressive process by making assumptions (about the autoregressive parameters) that characterize the I(1) processes.¹
The Granger representation has four terms: a random walk, a stationary moving average
process, a deterministic component, and a term that depends on initial values. So the Granger
representation can be viewed as a multivariate Beveridge–Nelson decomposition,² where
permanent and transitory components (or trend and cycle) are commonly used as labels for
the stochastic terms of the representation (see Beveridge and Nelson 1981; King et al. 1991;
Morley et al. 2003).
In this paper, we extend the Granger representation theorem of Johansen (1991). We provide
a new proof of the theorem that allows us to obtain closed-form expressions for all terms
of the Granger representation. This result allows a unified treatment of models with different
This paper is based on a chapter of my Ph.D. dissertation at UCSD.
¹ The theorem of Johansen (1991) is sometimes referred to as the Johansen–Granger representation theorem.
² To the best of our knowledge, this connection has not been pointed out in the literature.
© Royal Economic Society 2005. Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA, 02148, USA.
deterministic specifications and our results facilitate the analysis of cointegration models with
structural changes. The closed-form expressions are also useful for impulse response analysis
of the transitory component. For example, the result makes explicit how the distribution of the
estimated (transitory) impulse response function is tied to the distribution of the estimators of the
autoregressive parameters. The result can also be used for parameter estimation that is subject to
constraints on the transitory impulse response function.
The random walk component and the deterministic component are important for the asymptotic analysis of cointegrated processes, which explains why the literature has devoted most effort to deriving closed-form expressions for these terms. Some details about the random walk component were derived by Johansen (1988), and its closed-form expression was derived by Johansen (1991).
The deterministic component depends on the deterministic variables of the model. Johansen (1996)
and Hansen and Johansen (1998) contain the closed-form expressions for various specifications
of the deterministic variables. The Granger representation that we derive in this paper embeds the existing results and provides additional details about the deterministic components. The closed-form expressions for the two other components of the representation—the stationary component and the one that depends on initial values—are new results. So the paper fills a gap in the literature, and since all terms are given in an explicit form, we have completed the Granger representation 'problem' for the most common specifications of autoregressive processes that are I(1). Besides the main result, we establish several identities and other intermediate results that may facilitate the analysis of different aspects of this model.
Our results are based on a new proof and the main structure of the proof is quite simple.
We consider the companion form of the autoregressive process. This process decomposes into a
random walk and a stationary vector autoregressive process of order one that both have simple
moving average representations. By inverting the initial decomposition of the autoregressive
process, we obtain the Granger representation for the process.
The rest of the paper is organized as follows. Section 2 contains the Granger representation
theorem, which is an extension of that due to Johansen (1991). We consider four different
deterministic specifications of the autoregressive model in Section 3, and derive the corresponding
closed-form expressions for the deterministic term of the Granger representation. In Section 4,
we illustrate the applicability of our result to impulse response analyses and the analysis of
cointegrated processes with structural changes. Section 5 summarizes our results. All proofs and
some intermediate results are presented in the Appendix.
We use the following notation: The orthogonal complement of an m × n matrix, A, with full column rank n, is a full rank matrix, A⊥, with dimension m × (m − n) that satisfies A′⊥A = 0. Further, Ā refers to Ā ≡ A(A′A)⁻¹, and diag(A_1, . . . , A_l) denotes the block-diagonal matrix with the matrices A_1, . . . , A_l along its diagonal.
2. THE GRANGER REPRESENTATION THEOREM
The Granger representation theorem states that a vector autoregressive process A(L)X_t = ΦD_t + ε_t, which is integrated of order one, has the representation

X_t = C Σ_{i=1}^{t} ε_i + C(L)ε_t + τ(t) + A_0,

where {C(L)ε_t} is stationary if {ε_t} is stationary, where τ(t) is a deterministic component that depends on the deterministic variables D_t, and where A_0 depends on initial values (X_0, X_{−1}, . . .) (see Johansen 1991). Johansen's result provides a closed-form expression for C (as a function of the autoregressive parameters), whereas the coefficients of the lag polynomial,
C(L), the deterministic component, τ(t), and the initial value, A_0, are given (more or less) implicitly.
The result of Johansen (1991) is sufficient for the asymptotic analysis of cointegrated processes because the terms of the representation that are given implicitly do not play a role in this analysis. However, closed-form expressions for the coefficients of C(L) are important for impulse response analysis of cointegrated processes, where the coefficients are interpreted as the transitory effects of the shocks, ε_t (see, e.g., Lütkepohl and Reimers 1992; Warne 1993; Lütkepohl and Saikkonen 1997; Phillips 1998). Similarly, in the asymptotic analysis of the cointegrated VAR with structural changes (see Hansen 2003), one needs a closed-form expression for the initial value, A_0.
We consider the p-dimensional vector autoregressive process of order k given by

X_t = Π_1 X_{t−1} + Π_2 X_{t−2} + · · · + Π_k X_{t−k} + ΦD_t + ε_t,   t = 1, . . . , T,

where the process's deterministic terms (such as a constant, a linear trend, and seasonal dummies) are contained in D_t and where ε_t, t = 1, . . . , T, is a sequence of independent identically distributed stochastic variables with mean zero.³ The initial values, X_0, X_{−1}, . . . , X_{−k+1}, are taken as given. It is well known that the process can be re-written in the error correction form

ΔX_t = ΠX_{t−1} + Σ_{i=1}^{k−1} Γ_i ΔX_{t−i} + ΦD_t + ε_t,   t = 1, . . . , T,

where Π = −I + Σ_{i=1}^{k} Π_i and Γ_i = −Σ_{j=i+1}^{k} Π_j.
The conditions that ensure that X_t is integrated of order one are stated in the following assumption.

Assumption 1 Let A(z) ≡ I − Π_1 z − Π_2 z² − · · · − Π_k z^k, where z ∈ ℂ.

(i) The roots of the characteristic polynomial (defined from det(A(z)) = 0) are either outside the unit circle or equal to one.
(ii) The matrix Π has reduced rank r < p, i.e. Π = αβ′, where α and β are p × r matrices of full column rank r.
(iii) The matrix α′⊥Γβ⊥ has full rank, where Γ = I − Σ_{i=1}^{k−1} Γ_i and where α⊥ and β⊥ are the orthogonal complements to α and β.

The first assumption, (i), ensures that the process is not explosive or seasonally cointegrated. Roots inside the unit circle cause the process to be explosive (see, e.g., Nielsen 2001), whereas roots on the boundary of the unit circle that are not equal to one are associated with seasonal cointegration (see Hylleberg et al. 1990 or Johansen and Schaumburg 1998). The second condition, (ii), ensures that there are at least p − r unit roots and induces cointegration whenever r ≥ 1. The third assumption, (iii), restricts the process from being I(2), because (iii) together with (ii) ensures that the number of unit roots is exactly p − r.⁴ In fact, (iii) can be replaced by
³ The Granger representation theorem does not rely on the distributional assumptions on ε_t since it is entirely an algebraic result. But the assumptions are important for interpretations of the representation.
⁴ For the case where X_t is integrated of order d ∈ ℕ_0, Neusser (2000) has shown an interesting relation between the order of integration, d, and the multiplicity of the unit root. After transforming the process, X_t, to the companion form X*_t = Π*X*_{t−1} + Φ*D*_t + ε*_t, the Jordan form of Π* exposes the relation. See also Bauer and Wagner (2002).
(iii′) The number of unit roots equals p − r.
Assumption 2 The deterministic term, D_t, satisfies |D_t| < a + |t|^b, for some constants a, b ∈ ℝ.
The assumption that D_t is bounded by some polynomial in t is not very restrictive and serves to ensure that certain terms, such as Σ_{i=1}^{∞} 2^{−i} D_{t−i}, are finite. Under Assumptions 1 and 2, Johansen (1991) derived the representation X_t = C Σ_{i=1}^{t} (ε_i + ΦD_i) + C(L)(ε_t + ΦD_t) + A_0, where C = β⊥(α′⊥Γβ⊥)⁻¹α′⊥. In this paper, we consider a different approach to obtaining the representation. This approach is based on the companion form of the process, and it makes it possible to derive an explicit representation with closed-form expressions for the coefficients of the lag polynomial C(L) = C_0 + C_1 L + C_2 L² + · · · and the initial value, A_0. The closed-form representation is given in the following theorem.
Theorem 1 (The Granger representation theorem) Let a process be given by the equation

ΔX_t = ΠX_{t−1} + Σ_{i=1}^{k−1} Γ_i ΔX_{t−i} + ΦD_t + ε_t,   t = 1, . . . , T,

and suppose that Assumptions 1 and 2 hold. The process has the representation

X_t = C Σ_{s=1}^{t} ε_s + C(L)ε_t + τ(t) + C(X_0 − Γ_1 X_{−1} − · · · − Γ_{k−1} X_{−k+1}),   (1)

where C = β⊥(α′⊥Γβ⊥)⁻¹α′⊥, where τ(t) = C Σ_{s=1}^{t} ΦD_s + C(L)ΦD_t, and where the coefficients of C(L) are given by the recursive formula

ΔC_i = ΠC_{i−1} + Σ_{j=1}^{k−1} Γ_j ΔC_{i−j},   i = 1, 2, . . . ,   (2)

with the conventions C_0 ≡ I − C and C_{−1} = · · · = C_{−k+1} ≡ −C.
Remark 1 The recursive formula (2) yields the Yule–Walker equations. These are usually associated with the covariance functions γ_i = cov(X_t, X_{t−i}) for stationary autoregressive processes, but it is well known that the impulse responses, ϑ_i = ∂X_{t+i}/∂ε′_t, also satisfy these equations. In the present setting with cointegrated I(1) variables, the impulse responses are given by a permanent and a transitory component, ϑ_i = C + C_i, and the equation, Δϑ_i = Πϑ_{i−1} + Σ_{j=1}^{k−1} Γ_j Δϑ_{i−j}, simplifies to (2), because C is a matrix that satisfies ΠC = 0.
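The recursion (2) and the decomposition ϑ_i = C + C_i can be checked numerically. The sketch below uses a hypothetical bivariate VAR(2) (α = (−0.5, 0)′, β = (1, −1)′, Γ_1 = 0.2·I, values chosen by us so that Assumption 1 holds; they are not from the paper), runs the recursion, and compares C + C_i with the impulse responses read off the powers of the companion matrix of the level VAR.

```python
import numpy as np

# Hypothetical I(1) example: p = 2, r = 1, k = 2.
alpha = np.array([[-0.5], [0.0]])
beta = np.array([[1.0], [-1.0]])
G1 = 0.2 * np.eye(2)                      # Gamma_1
Pi = alpha @ beta.T                       # Pi = alpha beta'
Gamma = np.eye(2) - G1                    # Gamma = I - Gamma_1

a_perp = np.array([[0.0], [1.0]])         # alpha_perp
b_perp = np.array([[1.0], [1.0]])         # beta_perp
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# Recursion (2): dC_i = Pi C_{i-1} + Gamma_1 dC_{i-1}, C_0 = I - C, dC_0 = I.
n = 25
Cs, dCs = [np.eye(2) - C], [np.eye(2)]
for i in range(1, n):
    dC = Pi @ Cs[-1] + G1 @ dCs[-1]
    Cs.append(Cs[-1] + dC)
    dCs.append(dC)

# Impulse responses from the companion form of the level VAR:
# X_t = (I + Pi + Gamma_1) X_{t-1} - Gamma_1 X_{t-2} + eps_t.
A = np.zeros((4, 4))
A[:2, :2] = np.eye(2) + Pi + G1
A[:2, 2:] = -G1
A[2:, :2] = np.eye(2)
thetas = [np.linalg.matrix_power(A, i)[:2, :2] for i in range(n)]
```

For this example the transitory coefficients C_i die out geometrically while C + C_i matches ϑ_i term by term, illustrating why ΠC = 0 makes the Yule–Walker recursion close on the transitory part alone.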
Remark 2 The exact form of the deterministic term, τ(t), depends on D_t, and we derive closed-form expressions for the cases where D_t is a constant; a restricted constant; a linear trend and a constant; or a restricted linear trend and a constant.
The proof of Theorem 1 is based on the following useful identities.

Lemma 1 With the definitions above the following identities hold:

(I − CΓ) = (I − CΓ)β̄β′,
(I − ΓC) = αᾱ′(I − ΓC),
I = (I − CΓ)β̄β′ + C(Γ − I) + Cᾱ⊥α′⊥.
The result of Lemma 1 can be appreciated by considering a vector autoregressive process of order one. In this case, the Granger representation is easily obtained by dividing the process into a stationary VAR(1) process and a random walk and then combining the two terms. This approach makes use of the identity I = α(β′α)⁻¹β′ + β⊥(α′⊥β⊥)⁻¹α′⊥. The new proof of Theorem 1 establishes the result for the general case of a VAR(k) process by employing the last identity of Lemma 1, which simplifies to I = α(β′α)⁻¹β′ + β⊥(α′⊥β⊥)⁻¹α′⊥ when Γ = I, as is the case for a VAR(1) process.
Corollary 1 The variables β′X_t and ΔX_t have the representations

β′X_t = C_β(L)(ε_t + ΦD_t)   and   ΔX_t = C_Δ(L)(ε_t + ΦD_t),

where C_{β,i} = β′C_i, C_{Δ,i} = C_i − C_{i−1}, and C_i, i = 0, 1, . . . , are the coefficients of the polynomial in Theorem 1.

The representations of Corollary 1 confirm the well-known result that β′X_t and ΔX_t have a stationary representation provided that D_t = μ_0. Corollary 1 is useful as it shows how to obtain closed-form expressions for the coefficients of the stationary polynomials, C_β(L) and C_Δ(L), from (2).
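Continuing the hypothetical numerical example (α = (−0.5, 0)′, β = (1, −1)′, Γ_1 = 0.2·I; our choice, not from the paper), the sketch below obtains C_{β,i} = β′C_i and C_{Δ,i} = C_i − C_{i−1} from recursion (2) and checks that both coefficient sequences die out geometrically, as the stationarity of β′X_t and ΔX_t requires.

```python
import numpy as np

# Hypothetical bivariate VAR(2) satisfying Assumption 1.
alpha = np.array([[-0.5], [0.0]]); beta = np.array([[1.0], [-1.0]])
G1 = 0.2 * np.eye(2); Pi = alpha @ beta.T; Gamma = np.eye(2) - G1
a_perp = np.array([[0.0], [1.0]]); b_perp = np.array([[1.0], [1.0]])
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# Coefficients C_i from (2), then the stationary polynomials of Corollary 1.
Cs, dCs = [np.eye(2) - C], [np.eye(2)]
for i in range(1, 60):
    dC = Pi @ Cs[-1] + G1 @ dCs[-1]
    Cs.append(Cs[-1] + dC)
    dCs.append(dC)

C_beta = [beta.T @ Ci for Ci in Cs]       # coefficients of C_beta(L)
C_delta = dCs                             # C_{Delta,i} = C_i - C_{i-1}
```

Note that β′C = 0, so β′ annihilates the random-walk part and only the geometrically decaying C_{β,i} remain.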
2.1. An alternative Granger representation
Next, we derive a slightly different version of the Granger representation. This representation is
often more convenient to work with because the expression for the deterministic term is simpler
in some leading cases. Expressions for the deterministic term will be derived in Section 3.
Lemma 2 The initial value can be expressed as

C(X_0 − Γ_1 X_{−1} − · · · − Γ_{k−1} X_{−k+1}) = X_0 − C(L)(ε_0 + ΦD_0),

where the lag polynomial C(L) is that of Theorem 1.

The lemma motivates an alternative expression for the Granger representation that is given by X_t = C Σ_{s=1}^{t} (ε_s + ΦD_s) + S_t + X_0 − S_0, where S_t = C(L)(ε_t + ΦD_t). This expression is that of Johansen (1991), and if the deterministic part of the model is constant (D_t = μ_0), then S_0 = C(L)(ε_0 + ΦD_0) can be viewed as a particular element of a stationary moving average process of infinite order. However, this does not imply that the 'stationary' part of X_t, given by S_t − S_0, is ergodic, even if {ε_t} is ergodic, because it is the same realized value of S_0 that is being subtracted from X_t for all t = 1, 2, . . . If we isolate the contributions from the deterministic term, we have the representation

X_t = C Σ_{s=1}^{t} ε_s + C(L)ε_t + τ̃(t) + X_0 − C(L)ε_0,   (3)

where τ̃(t) = C Σ_{s=1}^{t} ΦD_s + C(L)Φ(D_t − D_0) = τ(t) − C(L)ΦD_0. In Section 3, we derive expressions for τ(t) and τ̃(t) for particular choices of D_t, and it turns out that the expression for τ̃(t) is always simpler than that of τ(t).
We present the two versions of the Granger representation for the special case where X_t is a vector autoregressive process of order one.

Corollary 2 Let ΔX_t = αβ′X_{t−1} + ε_t and suppose that Assumption 1 holds. The Granger representations (1) and (3) are given by

X_t = C Σ_{s=1}^{t} ε_s + (I − C) Σ_{i=0}^{∞} (I + αβ′)^i ε_{t−i} + CX_0,

and

X_t = C Σ_{s=1}^{t} ε_s + (I − C) Σ_{i=0}^{∞} (I + αβ′)^i ε_{t−i} + X_0 − (I − C) Σ_{i=0}^{∞} (I + αβ′)^i ε_{0−i},

respectively, where C = β⊥(α′⊥β⊥)⁻¹α′⊥.
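For the VAR(1) case the closed form of Corollary 2 can be verified directly: with C_i = (I − C)(I + αβ′)^i, one should recover the level impulse responses (I + αβ′)^i = C + C_i. A sketch under assumed parameter values (α = (−0.5, 0)′, β = (1, −1)′; our choice, not from the paper):

```python
import numpy as np

# Hypothetical VAR(1): dX_t = alpha beta' X_{t-1} + eps_t, p = 2, r = 1.
alpha = np.array([[-0.5], [0.0]])
beta = np.array([[1.0], [-1.0]])
a_perp = np.array([[0.0], [1.0]])
b_perp = np.array([[1.0], [1.0]])
C = b_perp @ np.linalg.inv(a_perp.T @ b_perp) @ a_perp.T   # Gamma = I here

IC = np.eye(2) - C
M = np.eye(2) + alpha @ beta.T        # level AR matrix: X_t = M X_{t-1} + eps_t
closed = [IC @ np.linalg.matrix_power(M, i) for i in range(20)]
```

The key property behind the closed form is CM = C (since Cα = 0), so C + (I − C)M^i collapses to M^i.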
3. DETERMINISTIC TERMS
The deterministic part plays an important role in the asymptotic analysis of this model because the limits of test statistics depend on the deterministic term. In this section, we consider four commonly used specifications of the deterministic term, D_t, and derive the corresponding components of the Granger representation. The generic forms of the deterministic components are τ(t) = C Σ_{s=1}^{t} ΦD_s + C(L)ΦD_t, as in (1), and τ̃(t) = C Σ_{s=1}^{t} ΦD_s + C(L)Φ(D_t − D_0), as in (3). The expression for τ̃(t) is simpler, but the two have the same properties.⁵
The four deterministic specifications that we consider are⁶

• Model H_1 with an unrestricted constant: D_t = μ_0.
• Model H*_1 with a restricted constant: D_t = αρ_0.
• Model H with an unrestricted deterministic trend: D_t = μ_0 + μ_1 t.
• Model H* with a restricted deterministic trend: D_t = μ_0 + αρ_1 t.

To simplify our expressions below we define ξ ≡ (I − CΓ)β̄, η′ ≡ ᾱ′(I − ΓC), Ψ ≡ Σ_{i=1}^{k−1} Σ_{j=i}^{k−1} Γ_j = Σ_{i=1}^{k−1} iΓ_i, and Ψ̃ ≡ Σ_{i=1}^{k−1} i Σ_{j=i}^{k−1} Γ_j = Σ_{i=1}^{k−1} ((i+1)i/2) Γ_i.
3.1. Models with a deterministic constant
When the deterministic term is simply a constant, D_t = μ_0, it is immediately clear that the Granger representation of X_t contains a linear deterministic trend Cμ_0 t, unless Cμ_0 = 0. Using the expression for C(1), given by Lemma A.6, we find that

τ_{H_1}(t) = Cμ_0 t − (ξη′ + CΨC)μ_0,

where the subscript of τ refers to the form of the deterministic term. This result encompasses two findings of Hansen and Johansen (1998). The first is that E(β′X_t) = β′C(1)μ_0 = −ᾱ′(I − ΓC)μ_0, and the second is that the linear trend vanishes in model H*_1, where μ_0 = αρ_0, whereas the constant is given by

τ_{H*_1}(t) = C(1)μ_0 = −ξρ_0.

The alternative expressions for the deterministic part of the Granger representation are given by

τ̃_{H_1}(t) = Cμ_0 t   and   τ̃_{H*_1}(t) = 0.

⁵ The expressions of the present paper apply to the I(1) model. Rahbek et al. (1999) have derived some related expressions for the deterministic terms within the I(2) model.
⁶ We have adopted a notation for the deterministic specifications, H_1, H*_1, H, and H*, that is standard in the literature; see Johansen (1996).
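The constant term −(ξη′ + CΨC), i.e. C(1), can be verified numerically by summing the coefficients produced by recursion (2); here we write Ψ for Σ_{i=1}^{k−1} iΓ_i, as in the definitions above. Again a hypothetical example (α = (−0.5, 0)′, β = (1, −1)′, Γ_1 = 0.2·I; our choice, not from the paper):

```python
import numpy as np

alpha = np.array([[-0.5], [0.0]]); beta = np.array([[1.0], [-1.0]])
G1 = 0.2 * np.eye(2); Pi = alpha @ beta.T; Gamma = np.eye(2) - G1
a_perp = np.array([[0.0], [1.0]]); b_perp = np.array([[1.0], [1.0]])
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# xi, eta' and Psi = sum_i i * Gamma_i (only Gamma_1 here, so Psi = Gamma_1).
beta_bar = beta @ np.linalg.inv(beta.T @ beta)
alpha_bar = alpha @ np.linalg.inv(alpha.T @ alpha)
xi = (np.eye(2) - C @ Gamma) @ beta_bar
eta_T = alpha_bar.T @ (np.eye(2) - Gamma @ C)
Psi = G1
C1_formula = -(xi @ eta_T + C @ Psi @ C)

# C(1) as the sum of the C_i generated by recursion (2).
Cs, dCs = [np.eye(2) - C], [np.eye(2)]
for i in range(1, 200):
    dC = Pi @ Cs[-1] + G1 @ dCs[-1]
    Cs.append(Cs[-1] + dC)
    dCs.append(dC)
C1_sum = sum(Cs)
```

The second assertion below also reproduces the mean result E(β′X_t) = −ᾱ′(I − ΓC)μ_0 in the direction β′.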
3.2. Models with a linear deterministic trend
When the deterministic term contains a linear trend, D_t = μ_0 + μ_1 t, the deterministic part of the Granger representation is given by

τ(t) = ½Cμ_1 t² + C(μ_0 + ½μ_1)t + C(L)(μ_0 + μ_1 t),

(see Hansen and Johansen 1998). This can be re-written as

τ(t) = ½Cμ_1 t² + [Cμ_0 + (½C + C(1))μ_1]t + C(1)μ_0 − Σ_{i=0}^{∞} iC_i μ_1.   (4)

So unless α′⊥μ_1 = 0, the linear trend μ_1 leads to a quadratic deterministic trend in the process X_t. Using Lemma A.6 we find

τ_H(t) = ½Cμ_1 t² + [Cμ_0 + (½C + C(1))μ_1]t
  − (ξη′ + CΨC)(μ_0 + μ_1) − [ξη′ξη′ − ξη′ΨC − CΨξη′ − CΨCΨC − CΨ̃C]μ_1.

In the model with a restricted linear trend, μ_1 = αρ_1, (4) simplifies to

τ_{H*}(t) = (Cμ_0 − ξρ_1)t − (ξη′ + CΨC)μ_0 − ξρ_1 − [ξη′ξ − CΨξ]ρ_1,

which encompasses a result from Johansen (1996, equation 5.20).⁷ The alternative expressions for the deterministic part of the Granger representation are given by

τ̃_H(t) = ½Cμ_1 t² + [Cμ_0 + (½C − ξη′ − CΨC)μ_1]t,
τ̃_{H*}(t) = (Cμ_0 − ξρ_1)t.
4. APPLICATIONS
This section illustrates the applicability of the closed-form expressions that have been derived in this paper.
⁷ This can be seen by verifying that the expression for τ_1 in Johansen (1996, equation 5.20) simplifies to Cμ_0 − (I − CΓ)β̄ρ_1 = Cμ_0 − ξρ_1.
4.1. Impulse response analysis
Since the explicit Granger representation has exposed how the coefficients C_1, C_2, . . . are functions of the autoregressive parameters Π, Γ_1, . . . , Γ_{k−1}, it is simple to see how restrictions on C_0, C_1, . . . translate into (non-linear) restrictions on Π, Γ_1, . . . , Γ_{k−1}. So it is clear that the closed-form representation is useful for maximum-likelihood analysis, subject to constraints on the impulse response function. Consider, as an example, the constraints C_i(a, b) ≤ 0, i = 1, . . . , m, where C_i(a, b) is a particular element of C_i, and m is some integer. These restrictions can be expressed by g_i(α, β, Γ_1, . . . , Γ_{k−1}) ≤ 0, i = 1, . . . , m, and this leads to the constrained maximization problem: max_θ L(θ), s.t. g_i ≤ 0, i = 1, . . . , m, where L denotes the likelihood function and θ represents all the parameters of the model.
Another possible application of the closed-form expressions is to derive the asymptotic distribution of Ĉ_1, Ĉ_2, . . . from the asymptotic distribution of (Π̂, Γ̂_1, . . . , Γ̂_{k−1}) using the delta method. However, given the high degree of nonlinearity of the mapping (Π̂, Γ̂_1, . . . , Γ̂_{k−1}) → (Ĉ_1, Ĉ_2, . . .), this approach may require T to be fairly large for this to be useful in practice, and other approaches, such as bootstrap methods, might provide a better solution to this problem.⁸ Nevertheless, the closed-form expressions for C_1, C_2, . . . make it possible to obtain the exact distribution of Ĉ_1, Ĉ_2, . . . whenever the exact distribution of Π̂, Γ̂_1, . . . , Γ̂_{k−1} is available. The latter may be obtained by simulation methods in particular situations. The result may also be useful for the analysis of particular aspects of the estimated impulse response function, such as the bias of Ĉ_i, i = 0, 1, . . . , and the effect on Ĉ_i of a bias correction of, say, β̂.
4.2. Structural changes in the cointegrated VAR
Consider the process ΔX_t = α(t)β(t)′X_{t−1} + Σ_{i=1}^{k−1} Γ_i ΔX_{t−i} + ε_t, t = 1, . . . , T, where α(t) = α_1 1_{{t≤T_0}} + α_2 1_{{t>T_0}} and where β(t) = β_1 1_{{t≤T_0}} + β_2 1_{{t>T_0}}. This is a cointegrated process with a structural change in the parameters, α and β, after time T_0. We could consider the Granger representation for each of the two subsamples: t = 1, . . . , T_0 and t = T_0 + 1, . . . , T. The initial values of these representations are given by A_0 = C(X_0 − Γ_1 X_{−1} − · · · − Γ_{k−1} X_{−k+1}) and A_{T_0} = D(X_{T_0} − Γ_1 X_{T_0−1} − · · · − Γ_{k−1} X_{T_0−k+1}), respectively, where C = β_{1⊥}(α′_{1⊥}Γβ_{1⊥})⁻¹α′_{1⊥} and D = β_{2⊥}(α′_{2⊥}Γβ_{2⊥})⁻¹α′_{2⊥}. However, this representation is not directly suited for the analysis of the estimators, because the initial value of the second representation, A_{T_0}, is not independent of ε_1, . . . , ε_{T_0}. Thus it is not immediately clear that Σ_{s=1}^{t} ε_s can be viewed as a random walk while A_{T_0} is taken as fixed and constant, although this approach has been used in the literature (see Johansen et al. 2000).

Fortunately, the closed-form expressions make it possible to substitute a Granger representation for each of the variables X_{T_0}, . . . , X_{T_0−k+1}, and after some calculations (see Hansen 2003), it can be shown that

A_{T_0} = DC Σ_{s=1}^{T_0} ε_s + DC*(L)ε_{T_0} + DA_0,

where C*(L)ε_t is a stationary process. By substituting this expression for A_{T_0} into the Granger representation for the second subsample, t = T_0 + 1, . . . , T, one obtains an expression where the initial value is independent of the errors, ε_1, . . . , ε_T.
⁸ See Inoue and Kilian (2002) for the use of bootstrap methods in this context.
5. SUMMARY
We have extended the Granger representation theorem of Johansen (1991). The theorem was
derived with a new proof that facilitates the derivation of closed-form expressions of all terms
of the Granger representation. The closed-form expressions make explicit how the coefficients
of the stationary polynomial, C(L), depend on the autoregressive parameters. This allows the
econometrician to estimate autoregressive models that are subject to constraints on the impulse
response function by maximum likelihood. Our result also shows how the distribution of the estimated impulse response function is tied to the distribution of the estimated parameters. This makes it possible to derive confidence bands for the response of the transitory component of the process to a shock, for example by using the delta method. Further, the expression that we derived for the initial values facilitates the analysis of cointegrated processes with structural changes.
ACKNOWLEDGEMENT
I thank Graham Elliott, James D. Hamilton, Søren Johansen, Lutz Kilian, Hans Christian Kongsted,
Pierre Perron (editor), Anders Rahbek, Halbert White, and three anonymous referees for valuable
comments and suggestions. All errors remain my responsibility. Financial support from the Danish
Social Science Research Council and the Danish Research Academy is gratefully acknowledged.
REFERENCES
Bauer, D. and M. Wagner (2002). A canonical form for unit root processes in the state space framework.
Working Paper (available at EconPapers).
Beveridge, S. and C. R. Nelson (1981). A new approach to decompositions of time series into permanent
and transitory components with particular attention to measurement of the 'business cycle'. Journal of
Monetary Economics 7, 151–74.
Engle, R. F. and C. W. J. Granger (1987). Co-integration and error correction: representation, estimation and
testing. Econometrica 55, 251–76.
Hansen, P. R. (2003). Structural changes in the cointegrated vector autoregressive model. Journal of
Econometrics 114, 261–95.
Hansen, P. R. and S. Johansen (1998). Workbook on Cointegration. Oxford: Oxford University Press.
Hylleberg, S., R. F. Engle, C. W. J. Granger and S. Yoo (1990). Seasonal integration and cointegration.
Journal of Econometrics 44, 215–38.
Inoue, A. and L. Kilian (2002). Bootstrapping autoregressive processes with possible unit roots.
Econometrica 70, 377–91.
Johansen, S. (1988). Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control
12, 231–54.
Johansen, S. (1991). Estimation and hypothesis testing of cointegration vectors in Gaussian vector
autoregressive models. Econometrica 59, 1551–80.
Johansen, S. (1996). Likelihood Based Inference in Cointegrated Vector Autoregressive Models. 2nd edn.
Oxford: Oxford University Press.
Johansen, S., R. Mosconi and B. Nielsen (2000). Cointegration analysis in the presence of structural breaks
in the deterministic trend. Econometrics Journal 3, 216–46.
Johansen, S. and E. Schaumburg (1998). Likelihood analysis of seasonal cointegration. Journal of
Econometrics 88, 301–39.
King, R. G., C. I. Plosser, J. H. Stock and M. W. Watson (1991). Stochastic trends and economic fluctuations.
American Economic Review 81, 819–40.
Lütkepohl, H. and H.-E. Reimers (1992). Impulse response analysis of cointegrated systems. Journal of
Economic Dynamics and Control 16, 53–78.
Lütkepohl, H. and P. Saikkonen (1997). Impulse response analysis in infinite order cointegrated vector
autoregressive processes. Journal of Econometrics 81, 127–57.
Morley, J. C., C. R. Nelson and E. Zivot (2003). Why are Beveridge–Nelson and unobserved-component
decompositions of GDP so different? Review of Economics and Statistics 85, 235–43.
Neusser, K. (2000). An algebraic interpretation of cointegration. Economics Letters 67, 273–81.
Nielsen, B. (2001). The asymptotic distribution of unit root tests of unstable autoregressive processes.
Econometrica 69, 211–19.
Phillips, P. C. B. (1998). Impulse response and forecast error variance asymptotics in nonstationary VARs.
Journal of Econometrics 83, 21–56.
Rahbek, A., H. C. Kongsted and C. Jørgensen (1999). Trend-stationarity in the I(2) cointegration model.
Journal of Econometrics 90, 81–106.
Warne, A. (1993). A common trends model: Identification, estimation and inference. Seminar Paper
No. 555, IIES, Stockholm University.
APPENDIX
Lemma A.1 Let a and b be m × n matrices, m ≥ n, with full column rank n, and let a⊥ and b⊥ be their orthogonal complements, respectively. The following five statements are equivalent.

(i) The matrix (I + b′a) does not have 1 as an eigenvalue.
(ii) Let v be a vector in ℝⁿ. Then (b′a)v = 0 implies v = 0.
(iii) The matrix b′a has full rank.
(iv) The m × m matrix (b, a⊥) has full rank.
(v) The matrix b′⊥a⊥ has full rank.

Proof. The equivalence of (i), (ii), and (iii) is straightforward, and the identity

det(a, a⊥) det(b, a⊥) = det((a, a⊥)′(b, a⊥)) = det [ a′b      0
                                                     a′⊥b   a′⊥a⊥ ] = det(a′b) det(a′⊥a⊥)

proves that (iii) holds if and only if (iv) holds. Finally, the identity

det(b, b⊥) det(b, a⊥) = det((b, b⊥)′(b, a⊥)) = det [ b′b    b′a⊥
                                                     0     b′⊥a⊥ ] = det(b′b) det(b′⊥a⊥)

completes the proof.
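The determinant identities in the proof are easy to confirm numerically; the sketch below draws arbitrary full-rank a and b (illustrative values, not from the paper) and builds the orthogonal complements from an SVD.

```python
import numpy as np

def orth_complement(a):
    """Columns spanning the orthogonal complement of span(a)."""
    m, n = a.shape
    u, _, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, n:]                      # satisfies a_perp' a = 0

rng = np.random.default_rng(0)
m, n = 5, 2
a = rng.standard_normal((m, n))
b = rng.standard_normal((m, n))
a_perp, b_perp = orth_complement(a), orth_complement(b)

# det(a, a_perp) det(b, a_perp) = det(a'b) det(a_perp' a_perp), and similarly for b.
lhs1 = np.linalg.det(np.hstack([a, a_perp])) * np.linalg.det(np.hstack([b, a_perp]))
rhs1 = np.linalg.det(a.T @ b) * np.linalg.det(a_perp.T @ a_perp)
lhs2 = np.linalg.det(np.hstack([b, b_perp])) * np.linalg.det(np.hstack([b, a_perp]))
rhs2 = np.linalg.det(b.T @ b) * np.linalg.det(b_perp.T @ a_perp)
```

Since the complements here are orthonormal, det(a′⊥a⊥) = det(b′b)-type factors are well behaved and the identities hold exactly up to floating-point error.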
The vector autoregressive process of order k is transformed into the companion form by defining X*_t = (X′_t, X′_{t−1}, . . . , X′_{t−k+1})′. A related companion form was used in Phillips (1998) for a different purpose, and our approach is closely related to the state-space approach for analysing this type of process (see, e.g., Bauer and Wagner 2002). With suitable definitions (given below) we find that

ΔX*_t = Π*X*_{t−1} + Φ*D*_t + ε*_t = α*β*′X*_{t−1} + Φ*D*_t + ε*_t,
which is a vector autoregressive process of order one. The needed definitions are

Π* = [ αβ′ + Γ_1   Γ_2 − Γ_1   · · ·   Γ_{k−1} − Γ_{k−2}   −Γ_{k−1}
           I            −I      · · ·           0                0
           0             I      · · ·           0                0
           ⋮                      ⋱                               ⋮
           0             0      · · ·           I               −I  ],

α* = [ α   Γ_1   · · ·   Γ_{k−1}
       0    I    · · ·      0
       ⋮           ⋱        ⋮
       0    0    · · ·      I   ],

β* = [ β    I     0   · · ·    0
       0   −I     I   · · ·    0
       ⋮           ⋱    ⋱      ⋮
       0    0   · · ·   −I     I
       0    0   · · ·    0    −I  ],

ε*_t = (ε′_t, 0, . . . , 0)′   and   Φ*D*_t = ((ΦD_t)′, 0, . . . , 0)′,

and it is easily verified that the orthogonal complements of α* and β* are given by

α*⊥ = (α′⊥, −α′⊥Γ_1, . . . , −α′⊥Γ_{k−1})′   and   β*⊥ = (β′⊥, . . . , β′⊥)′.
Lemma A.2 Let α, β, α*, and β* be defined as above. Given Assumption 1, the eigenvalues of the matrix (I + β*′α*) are all less than one in absolute value.

Proof. By Assumption 1(iii), the identity α*′⊥β*⊥ = α′⊥(I − Γ_1 − · · · − Γ_{k−1})β⊥ = α′⊥Γβ⊥ shows that α*′⊥β*⊥ has full rank, and by Lemma A.1, we have that 1 is not an eigenvalue of (I + β*′α*). To complete the proof, let v = (v′_1, . . . , v′_k)′ ≠ 0 be an eigenvector of (I + β*′α*), i.e. (I + β*′α*)v = λv, for some λ ∈ ℂ. The upper r + p rows of (I + β*′α*)v yield

v_1 + β′(αv_1 + Γ_1 v_2 + · · · + Γ_{k−1} v_k) = λv_1,
(αv_1 + Γ_1 v_2 + · · · + Γ_{k−1} v_k) = λv_2,   (A.1)

which implies λβ′v_2 = (λ − 1)v_1, and the lower rows of (I + β*′α*)v imply v_2 = λv_3 = · · · = λ^{k−2} v_k. The case λ = 0 trivially satisfies |λ| < 1, so assume λ ≠ 0. Multiply (A.1) by (λ − 1)/λ^k and substitute z = 1/λ to obtain

[I(1 − z) − αβ′z − Γ_1(1 − z)z − · · · − Γ_{k−1}(1 − z)z^{k−1}]v_k = 0.

By Assumption 1,

det[ I(1 − z) − αβ′z − Σ_{i=1}^{k−1} Γ_i(1 − z)z^i ] = 0

implies |z| > 1 or z = 1, and the case z = 1 (i.e. λ = 1) was ruled out above, so we conclude that |λ| < 1.
The result has the implication that the sum Σ_{i=0}^{∞} (I + β*′α*)^i is convergent, with limit −(β*′α*)⁻¹, so that a process Y_t = Σ_{i=0}^{∞} (I + β*′α*)^i u_{t−i} is stationary whenever u_t is stationary, provided that Assumption 1 holds.
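For k = 2 the companion blocks specialize to Π* = [[αβ′ + Γ_1, −Γ_1], [I, −I]], α* = [[α, Γ_1], [0, I]], and β* = [[β, I], [0, −I]]. The sketch below (hypothetical values α = (−0.5, 0)′, β = (1, −1)′, Γ_1 = 0.2·I again, our choice) checks Π* = α*β*′, the stated orthogonal complements, the eigenvalue bound of Lemma A.2, and the limit of the geometric sum.

```python
import numpy as np

alpha = np.array([[-0.5], [0.0]]); beta = np.array([[1.0], [-1.0]])
G1 = 0.2 * np.eye(2); I2 = np.eye(2)
a_perp = np.array([[0.0], [1.0]]); b_perp = np.array([[1.0], [1.0]])

# k = 2 companion blocks.
Pi_star = np.block([[alpha @ beta.T + G1, -G1], [I2, -I2]])
alpha_star = np.block([[alpha, G1], [np.zeros((2, 1)), I2]])
beta_star = np.block([[beta, I2], [np.zeros((2, 1)), -I2]])

# Orthogonal complements: alpha*_perp = (alpha_perp; -G1' alpha_perp),
# beta*_perp = (beta_perp; beta_perp).
alpha_star_perp = np.vstack([a_perp, -G1.T @ a_perp])
beta_star_perp = np.vstack([b_perp, b_perp])

M = beta_star.T @ alpha_star              # beta*' alpha*
eigs = np.linalg.eigvals(np.eye(3) + M)

S = np.zeros((3, 3))                      # geometric sum of (I + beta*'alpha*)^i
P = np.eye(3)
for _ in range(500):
    S += P
    P = P @ (np.eye(3) + M)
```

With all eigenvalues of I + β*′α* strictly inside the unit circle, the sum converges to (I − (I + β*′α*))⁻¹ = −(β*′α*)⁻¹, as claimed.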
Proof of Lemma 1. Since I = β(β′β)⁻¹β′ + β⊥(β′⊥β⊥)⁻¹β′⊥ = β̄β′ + β⊥β̄′⊥, the first identity follows from

(I − CΓ) = (I − CΓ)(β̄β′ + β⊥β̄′⊥)
         = (I − CΓ)β̄β′ + β⊥β̄′⊥ − β⊥(α′⊥Γβ⊥)⁻¹α′⊥Γβ⊥β̄′⊥
         = (I − CΓ)β̄β′.

The second identity is proven similarly, and the third follows by applying the first identity and the identity C = Cᾱ⊥α′⊥.
With these results established we can prove the main result of this paper.
Proof of Theorem 1. Under Assumption 1, the pk × pk matrix (β ∗ , α ∗⊥ ) has full rank. We can therefore
obtain the Granger representation for X ∗t by deriving the moving average representation for the processes
∗
∗
∗ −1
β ∗ X ∗t and α ∗
.
⊥ X t and combine them by stacking the two on top of each other and multiply by (β , α ⊥ )
∗ ∗
∗ ∗
∗ ∗
∗ ∗
∗
First, consider the process β X t = (I + β α )β X t−1 + β (εt + t ). Since all the eigenvalues of
(I + β ∗ α ∗ ), according to Lemma A.2, are smaller than one in absolute value, the process has the stationary
representation β ∗ X ∗t = C ∗ (L)(ε∗t + ∗t ) where C i∗ = (I + β ∗ α ∗ )i β ∗ , and where by stationary we mean
that β ∗ X ∗t − E(β ∗ X ∗t ) is stationary.
t
∗
Next, consider the random walk α⊥∗ X t∗ = α⊥∗ X t−1
+ α⊥∗ (εt∗ + ∗t ) = α⊥∗ X 0∗ + i=1
α⊥∗ (εi∗ + i∗ ). A
∗
representation for X t is now obtained by


C ∗ (L)(εt∗ + ∗t )

t
X t∗ = (β ∗ , α⊥∗ )−1 

α⊥∗ (εi∗
+
i∗ )
+
α⊥∗ X 0∗

.

i=1
The matrix (β*, α*⊥)′⁻¹ is given in Lemma A.4, and its upper p rows (which define the equation for Xₜ) are given by ((I − CΓ)β̄, −CΓ₁ˢ, . . . , −CΓₖ₋₁ˢ, Cᾱ⊥), with the definition Γᵢˢ ≡ Γᵢ + · · · + Γₖ₋₁. For simplicity, we define F = ((I − CΓ)β̄, −CΓ₁ˢ, . . . , −CΓₖ₋₁ˢ) and obtain the representation

Xₜ = (F, Cᾱ⊥) ⎡ C*(L)(ε*ₜ + D*ₜ)                        ⎤
              ⎣ Σ_{i=1}^{t} α*⊥′(ε*ᵢ + D*ᵢ) + α*⊥′X*₀  ⎦
   = FC*(L)(ε*ₜ + D*ₜ) + Cᾱ⊥ Σ_{i=1}^{t} α*⊥′(ε*ᵢ + D*ᵢ) + Cᾱ⊥α*⊥′X*₀
   = C(L)(εₜ + Dₜ) + C Σ_{i=1}^{t} (εᵢ + Dᵢ) + A₀,

where the initial value is given by
A₀ = Cᾱ⊥α*⊥′X*₀ = C(X₀ − Γ₁X₋₁ − · · · − Γₖ₋₁X₋ₖ₊₁),
and the coefficients of the polynomial C(L) are given by Cᵢ = F(I + β*′α*)ⁱβ*′E₁ = FϒⁱB′E₁,₂, where ϒ ≡ (I + β*′α*), B ≡ diag(β, Iₚ, . . . , Iₚ), E₁ ≡ (Iₚ, 0, . . . , 0)′, and where E₁,₂ ≡ (Iₚ, Iₚ, 0, . . . , 0)′. Since (I + β′α)β′ = β′(I + αβ′), it follows that ϒB′ = B′Q, where

Q = ⎡ I + Π   Γ₁   · · ·   Γₖ₋₂   Γₖ₋₁ ⎤
    ⎢ Π       Γ₁   · · ·   Γₖ₋₂   Γₖ₋₁ ⎥
    ⎢ 0       I             0      0   ⎥
    ⎢ ⋮             ⋱              ⋮   ⎥
    ⎣ 0       0    · · ·    I      0   ⎦ ,
which allows us to express the coefficients by Cᵢ = FϒⁱB′E₁,₂ = FB′QⁱE₁,₂ = FB′κᵢ, where κᵢ = (κ₁,ᵢ′, . . . , κₖ,ᵢ′)′ = QⁱE₁,₂, and where the identity (I − CΓ)β̄β′ = (I − CΓ) of Lemma 1 can be used to show that FB′ = ((I − CΓ), −CΓ₁ˢ, . . . , −CΓₖ₋₁ˢ). Tedious algebra (given in the proof of Lemma A.5) leads to the relation Cᵢ = Cᵢ₋₁ + κ₂,ᵢ, i = 0, 1, 2, . . . , where C₋₁ = −C, and the structure of Q yields the equation
κ₂,ᵢ = Σ_{j=1}^{i} (Π + Γⱼ)κ₂,ᵢ₋ⱼ,   κ₂,₀ = I,   i = 1, 2, . . .   (A.2)
Via the substitution κ₂,ᵢ₋ⱼ = Cᵢ₋ⱼ − Cᵢ₋ⱼ₋₁, we find the relation Cᵢ = Cᵢ₋₁ + Σ_{j=1}^{i} (Π + Γⱼ)(Cᵢ₋ⱼ − Cᵢ₋ⱼ₋₁), i = 1, 2, . . . , with the conventions C₀ ≡ I − C, C₋ⱼ ≡ −C for j ≥ 1, and Γⱼ ≡ 0 for j ≥ k. Using that ΠC = 0, we obtain the expression of the theorem. This completes the proof.
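As a numerical sanity check on this recursion, the sketch below builds a hypothetical p = 2, r = 1, k = 2 system (parameter values are illustrative only, chosen so that the eigenvalue condition of Lemma A.2 holds), computes Cᵢ = F(I + β*′α*)ⁱβ*′E₁ from the companion form, and verifies Cᵢ = Cᵢ₋₁ + Σ_{j=1}^{i}(Π + Γⱼ)(Cᵢ₋ⱼ − Cᵢ₋ⱼ₋₁) with C₀ = I − C and C₋ⱼ = −C:

```python
import numpy as np

# Hypothetical p = 2, r = 1, k = 2 system (illustration only).
alpha = np.array([[-0.5], [0.0]])
beta  = np.array([[1.0], [0.0]])
G1    = np.array([[0.2, 0.1], [0.3, 0.4]])      # Gamma_1
I2    = np.eye(2)
Gamma = I2 - G1
Pi    = alpha @ beta.T                           # Pi = alpha beta'

a_perp, b_perp = np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]])
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# Companion pieces for k = 2: beta*'alpha*, F, beta*'E_1.
BsAs = np.block([[beta.T @ alpha, beta.T @ G1],
                 [alpha,          G1 - I2    ]])
F    = np.hstack([(I2 - C @ Gamma) @ beta, -C @ G1])   # beta-bar = beta here
BsE1 = np.vstack([beta.T, I2])

# C_i = F (I + beta*'alpha*)^i beta*'E_1, with C_{-j} = -C for j >= 1.
N = 12
Cs = {}
P = np.eye(3)
for i in range(N + 1):
    Cs[i] = F @ P @ BsE1
    P = (np.eye(3) + BsAs) @ P
C_at = lambda i: Cs[i] if i >= 0 else -C

assert np.allclose(C_at(0), I2 - C)              # convention C_0 = I - C
assert np.allclose(Pi @ C, 0)                    # Pi C = 0, used in the proof

# Recursion C_i = C_{i-1} + sum_{j=1}^{i} (Pi + Gamma_j)(C_{i-j} - C_{i-j-1}),
# with Gamma_j = 0 for j >= k = 2.
ok = True
for i in range(1, N + 1):
    rhs = C_at(i - 1)
    for j in range(1, i + 1):
        Gj = G1 if j == 1 else np.zeros((2, 2))
        rhs = rhs + (Pi + Gj) @ (C_at(i - j) - C_at(i - j - 1))
    ok = ok and np.allclose(C_at(i), rhs)
print(ok)  # True
```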
Proof of Corollary 1. Follows directly from the Granger representation theorem.
Lemma A.3 The coefficients of the stationary lag polynomial C(L) of Theorem 1 satisfy
CCᵢ = C(Γ₁Cᵢ₋₁ + · · · + Γₖ₋₁Cᵢ₋ₖ₊₁),   for all i ∈ ℕ₀.
Proof. From the Yule–Walker equations, (2), we have that C(Cᵢ − Cᵢ₋₁) = C(Γ₁(Cᵢ₋₁ − Cᵢ₋₂) + · · · + Γₖ₋₁(Cᵢ₋ₖ₊₁ − Cᵢ₋ₖ)), which implies that C(Cᵢ − Γ₁Cᵢ₋₁ − · · · − Γₖ₋₁Cᵢ₋ₖ₊₁) is constant. For i = 0, this term equals C(I − C + Γ₁C + · · · + Γₖ₋₁C) = C + C(−I + Γ₁ + · · · + Γₖ₋₁)C = C − CΓC = 0, which completes the proof.
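The lemma can likewise be illustrated numerically. Using the same kind of hypothetical p = 2, r = 1, k = 2 system as in the proof of Theorem 1 (illustrative values only), the coefficients Cᵢ = F(I + β*′α*)ⁱβ*′E₁ satisfy CCᵢ = CΓ₁Cᵢ₋₁ for all i ∈ ℕ₀, with the convention C₋₁ = −C:

```python
import numpy as np

# Hypothetical p = 2, r = 1, k = 2 system (illustration only).
alpha = np.array([[-0.5], [0.0]])
beta  = np.array([[1.0], [0.0]])
G1    = np.array([[0.2, 0.1], [0.3, 0.4]])      # Gamma_1
I2    = np.eye(2)
Gamma = I2 - G1

a_perp, b_perp = np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]])
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

BsAs = np.block([[beta.T @ alpha, beta.T @ G1],
                 [alpha,          G1 - I2    ]])
F    = np.hstack([(I2 - C @ Gamma) @ beta, -C @ G1])   # beta-bar = beta here
BsE1 = np.vstack([beta.T, I2])

# Check C C_i = C Gamma_1 C_{i-1} for i = 0, ..., N (k = 2 case of Lemma A.3).
ok, prev, P = True, -C, np.eye(3)                # prev = C_{-1} = -C
for i in range(15):
    Ci = F @ P @ BsE1
    ok = ok and np.allclose(C @ Ci, C @ G1 @ prev)
    prev, P = Ci, (np.eye(3) + BsAs) @ P
print(ok)  # True
```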
Proof of Lemma 4. The identity
C(X₀ − Γ₁X₋₁ − · · · − Γₖ₋₁X₋ₖ₊₁) = X₀ + (CΓ − I)β̄β′X₀ + C Σ_{i=1}^{k−1} Γᵢˢ ΔX₁₋ᵢ
shows that the initial value can be divided into X₀ and S₀, say, where the latter is a linear combination of the variables β′X₀, ΔX₀, . . . , and ΔX₋ₖ₊₂. Substitution of the representations of Corollary 1 yields

S₀ = (CΓ − I)β̄β′[C₀ζ₀ + C₁ζ₋₁ + · · · + Cₖ₋₂ζ₋ₖ₊₂ + · · ·]
   + CΓ₁ˢ[(C₀ − C₋₁)ζ₀ + (C₁ − C₀)ζ₋₁ + · · · + (Cₖ₋₂ − Cₖ₋₃)ζ₋ₖ₊₂ + · · ·]
   + CΓ₂ˢ[(C₀ − C₋₁)ζ₋₁ + · · · + (Cₖ₋₃ − Cₖ₋₄)ζ₋ₖ₊₂ + · · ·]
   ⋮
   + CΓₖ₋₁ˢ[(C₀ − C₋₁)ζ₋ₖ₊₂ + · · ·],
where ζₜ = εₜ + Dₜ. Adding up the terms for each ζₜ and using the identity (CΓ − I) = (CΓ − I)β̄β′ yields
S₀ = Σ_{i=0}^{∞} {[(CΓ − I) + CΓ₁ˢ]Cᵢ − C(Γ₁Cᵢ₋₁ + · · · + Γₖ₋₁Cᵢ₋ₖ₊₁)}ζ₋ᵢ
   = Σ_{i=0}^{∞} {[C − I]Cᵢ − CCᵢ}ζ₋ᵢ = −Σ_{i=0}^{∞} Cᵢζ₋ᵢ = −C(L)(ε₀ + D₀),
where we used the result of Lemma A.3.
Lemma A.4 The inverse of (β*, α*⊥)′ is given by

(β*, α*⊥)′⁻¹ = ⎡ (I − CΓ)β̄   −CΓ₁ˢ       −CΓ₂ˢ       · · ·   −CΓₖ₋₁ˢ       Cᾱ⊥ ⎤
               ⎢ (I − CΓ)β̄   −CΓ₁ˢ − I   −CΓ₂ˢ       · · ·   −CΓₖ₋₁ˢ       Cᾱ⊥ ⎥
               ⎢ (I − CΓ)β̄   −CΓ₁ˢ − I   −CΓ₂ˢ − I   · · ·   −CΓₖ₋₁ˢ       Cᾱ⊥ ⎥
               ⎢     ⋮            ⋮           ⋮        ⋱         ⋮           ⋮  ⎥
               ⎣ (I − CΓ)β̄   −CΓ₁ˢ − I   −CΓ₂ˢ − I   · · ·   −CΓₖ₋₁ˢ − I   Cᾱ⊥ ⎦ ,

where the mth block row (m = 2, . . . , k) has −I added to the entries −CΓ₁ˢ, . . . , −CΓₘ₋₁ˢ.
Proof. The lemma is proved by verifying that the product of the expression above and (β*, α*⊥)′ is the identity matrix. For example, for the upper-left block matrix we find (I − CΓ)β̄β′ − CΓ₁ˢ + Cᾱ⊥α⊥′ = (I − CΓ)β̄β′ + C(Γ − I) + Cᾱ⊥α⊥′, which equals I given Lemma 1. All other calculations follow similarly from the identities of Lemma 1.
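The block pattern above can be verified numerically in a small case. For a hypothetical p = 2, r = 1, k = 2 system (illustrative values only, with the levels companion X*ₜ = (Xₜ′, Xₜ₋₁′)′), the claimed inverse multiplied by (β*, α*⊥)′ gives the 4 × 4 identity:

```python
import numpy as np

# Hypothetical p = 2, r = 1, k = 2 system (illustration only).
alpha = np.array([[-0.5], [0.0]])
beta  = np.array([[1.0], [0.0]])
G1    = np.array([[0.2, 0.1], [0.3, 0.4]])      # Gamma_1 = Gamma_1^s
I2    = np.eye(2)
Gamma = I2 - G1

a_perp, b_perp = np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]])
abar_perp = a_perp @ np.linalg.inv(a_perp.T @ a_perp)
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# (beta*, alpha*_perp)' stacked so that beta*'X*_t = (beta'X_t, Delta X_t)
# and alpha*_perp'X*_t = alpha_perp'(X_t - Gamma_1 X_{t-1}).
M = np.vstack([np.hstack([beta.T, np.zeros((1, 2))]),
               np.hstack([I2, -I2]),
               np.hstack([a_perp.T, -a_perp.T @ G1])])

# Lemma A.4's expression for the inverse (k = 2): two block rows.
xi = (I2 - C @ Gamma) @ beta                     # (I - C Gamma) beta-bar
Ginv = np.vstack([np.hstack([xi, -C @ G1,      C @ abar_perp]),
                  np.hstack([xi, -C @ G1 - I2, C @ abar_perp])])
print(np.allclose(Ginv @ M, np.eye(4)))  # True
```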
Lemma A.5 With the definitions given in the proof of Theorem 1, it holds that
Cᵢ = Cᵢ₋₁ + κ₂,ᵢ,   i = 1, 2, . . . ,
and κ₂,ᵢ is given recursively from (A.2).
Proof. From the equation κᵢ = Qκᵢ₋₁, κ₀ = E₁,₂, and the structure of Q, it follows that κ₁,ᵢ = κ₁,ᵢ₋₁ + κ₂,ᵢ, such that κ₁,ᵢ = κ₁,₀ + Σ_{j=0}^{i−1} κ₂,ᵢ₋ⱼ = Σ_{j=0}^{i} κ₂,ᵢ₋ⱼ, where we use that κ₁,₀ = κ₂,₀; and that κₘ,ᵢ = κ₂,ᵢ₋ₘ₊₂, for m ≥ 2, where we use the convention κ₂,ᵢ ≡ 0 for i < 0. These identities show that
κ₂,ᵢ = Πκ₁,ᵢ₋₁ + Γ₁κ₂,ᵢ₋₁ + Γ₂κ₃,ᵢ₋₁ + · · · + Γₖ₋₁κₖ,ᵢ₋₁
     = Π Σ_{j=0}^{i−1} κ₂,ᵢ₋₁₋ⱼ + Γ₁κ₂,ᵢ₋₁ + Γ₂κ₂,ᵢ₋₂ + · · · + Γₖ₋₁κ₂,ᵢ₋ₖ₊₁
     = Σ_{j=1}^{i} (Π + Γⱼ)κ₂,ᵢ₋ⱼ   (recall that Γⱼ ≡ 0 for j ≥ k),
such that κ₂,ᵢ is given recursively from (A.2). Since Cκ₂,ᵢ = CΓ₁κ₂,ᵢ₋₁ + · · · + CΓₖ₋₁κₖ,ᵢ₋₁, we find
Cᵢ = FB′κᵢ = (I − CΓ)κ₁,ᵢ − CΓ₁ˢκ₂,ᵢ − · · · − CΓₖ₋₁ˢκₖ,ᵢ
   = (I − CΓ)(κ₂,ᵢ + κ₁,ᵢ₋₁) + C(Γ − I)κ₂,ᵢ − CΓ₂ˢκ₃,ᵢ − · · · − CΓₖ₋₁ˢκₖ,ᵢ
   = (I − C)κ₂,ᵢ + (I − CΓ)κ₁,ᵢ₋₁ − CΓ₂ˢκ₃,ᵢ − · · · − CΓₖ₋₁ˢκₖ,ᵢ
   = (I − C)κ₂,ᵢ + (I − CΓ)κ₁,ᵢ₋₁ − C(Γ₁ˢ − Γ₁)κ₂,ᵢ₋₁ − · · · − C(Γₖ₋₂ˢ − Γₖ₋₂)κₖ₋₁,ᵢ₋₁ − C(Γₖ₋₁ˢ − Γₖ₋₁)κₖ,ᵢ₋₁
   = FB′κᵢ₋₁ + κ₂,ᵢ
   = Cᵢ₋₁ + κ₂,ᵢ,
which completes the proof.
Lemma A.6 Let C(L) be the polynomial of Theorem 1. Then
C(1) = Σ_{i=0}^{∞} Cᵢ = −ξη′ − CΓ̃C,
Σ_{i=0}^{∞} iCᵢ = [ξη′ + CΓ̃C] + [ξη′Γξη′ − ξη′Γ̃C − CΓ̃ξη′ − CΓ̃CΓ̃C − C(Σ_{i=1}^{k−1} iΓᵢˢ)C],
where Γ̃ ≡ Σ_{i=1}^{k−1} Γᵢˢ.
Proof of Lemma A.6. The first identity is proven as follows. The object of interest is given by
C(1) = F Σ_{i=0}^{∞} (I + β*′α*)ⁱβ*′E₁ = −F(β*′α*)⁻¹β*′E₁.
The inverse of

β*′α* = ⎡ β′α   β′Γ₁     β′Γ₂   · · ·   β′Γₖ₋₂   β′Γₖ₋₁ ⎤
        ⎢ α     Γ₁ − I   Γ₂     · · ·   Γₖ₋₂     Γₖ₋₁   ⎥
        ⎢ 0     I        −I              0        0     ⎥
        ⎢ ⋮              ⋱       ⋱                ⋮     ⎥
        ⎣ 0     0        · · ·           I        −I    ⎦
is given by

(β*′α*)⁻¹ = ⎡ ᾱ′(I − ΓC)Γβ̄   ᾱ′(I − ΓC)Γ₁ˢ   · · ·   ᾱ′(I − ΓC)Γₖ₋₁ˢ ⎤
            ⎢ (I − CΓ)β̄      −CΓ₁ˢ − I        · · ·   −CΓₖ₋₁ˢ         ⎥
            ⎢     ⋮               ⋮            ⋱           ⋮           ⎥
            ⎣ (I − CΓ)β̄      −CΓ₁ˢ − I        · · ·   −CΓₖ₋₁ˢ − I     ⎦ ,

where, as in Lemma A.4, the mth block row (m = 2, . . . , k) has −I added to the entries −CΓ₁ˢ, . . . , −CΓₘ₋₁ˢ.
Since β*′E₁ = (β′, I, 0, . . . , 0)′, it holds that
(β*′α*)⁻¹β*′E₁ = (ᾱ′(I − ΓC), −C, . . . , −C)′,   (A.3)
and multiplication by −F = −((I − CΓ)β̄, −CΓ₁ˢ, . . . , −CΓₖ₋₁ˢ) yields
C(1) = −(I − CΓ)β̄ ᾱ′(I − ΓC) − C Σ_{i=1}^{k−1} Σ_{j=i}^{k−1} Γⱼ C
     = −ξη′ − CΓ̃C,
as stated.
The second identity is proven as follows. Since Σ_{i=0}^{∞} i(I + β*′α*)ⁱ = (β*′α*)⁻¹ + (β*′α*)⁻², it holds that
Σ_{i=0}^{∞} iCᵢ = F Σ_{i=0}^{∞} i(I + β*′α*)ⁱβ*′E₁
               = F(β*′α*)⁻¹β*′E₁ + F(β*′α*)⁻²β*′E₁.
The former term equals −C(1), and the latter term is derived by multiplying (β*′α*)⁻¹ by the expression for (β*′α*)⁻¹β*′E₁ that was derived above. The product is given by

(β*′α*)⁻²β*′E₁ = ⎡ η′Γξη′ − η′Γ̃C        ⎤
                 ⎢ ξη′ + CΓ̃C + C        ⎥
                 ⎢        ⋮              ⎥
                 ⎣ ξη′ + CΓ̃C + (k − 1)C ⎦ ,

where we used that ᾱ′(I − ΓC)Γβ̄ = ᾱ′(I − ΓC)Γ(I − CΓ)β̄ = η′Γξ. Multiplication by F yields
F(β*′α*)⁻²β*′E₁ = ξη′Γξη′ − ξη′Γ̃C − CΓ̃ξη′ − CΓ̃CΓ̃C − C(Σ_{i=1}^{k−1} iΓᵢˢ)C.
Combining the two terms completes the proof.
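Both identities of Lemma A.6 can be checked numerically in a small case. The sketch below builds a hypothetical p = 2, r = 1, k = 2 system (parameter values are illustrative, not from the paper), computes Σ Cᵢ and Σ iCᵢ by truncating the series Cᵢ = F(I + β*′α*)ⁱβ*′E₁, and compares them with the closed forms above; for k = 2, Γ̃ = Γ₁ˢ = Γ₁ and Σ iΓᵢˢ = Γ₁ as well.

```python
import numpy as np

# Hypothetical p = 2, r = 1, k = 2 system (illustration only).
alpha = np.array([[-0.5], [0.0]])
beta  = np.array([[1.0], [0.0]])
G1    = np.array([[0.2, 0.1], [0.3, 0.4]])      # Gamma_1 = Gamma~ for k = 2
I2    = np.eye(2)
Gamma = I2 - G1

a_perp, b_perp = np.array([[0.0], [1.0]]), np.array([[0.0], [1.0]])
abar = alpha @ np.linalg.inv(alpha.T @ alpha)
bbar = beta @ np.linalg.inv(beta.T @ beta)
C = b_perp @ np.linalg.inv(a_perp.T @ Gamma @ b_perp) @ a_perp.T

# Companion pieces: beta*'alpha*, F and beta*'E_1 for k = 2.
BsAs = np.block([[beta.T @ alpha, beta.T @ G1],
                 [alpha,          G1 - I2    ]])
F    = np.hstack([(I2 - C @ Gamma) @ bbar, -C @ G1])
BsE1 = np.vstack([beta.T, I2])
assert np.max(np.abs(np.linalg.eigvals(np.eye(3) + BsAs))) < 1

# Truncated sums of C_i and i*C_i, with C_i = F (I + beta*'alpha*)^i beta*'E_1.
S0, S1 = np.zeros((2, 2)), np.zeros((2, 2))
P = np.eye(3)
for i in range(400):
    Ci = F @ P @ BsE1
    S0 += Ci
    S1 += i * Ci
    P = (np.eye(3) + BsAs) @ P

# Closed forms with xi = (I - C Gamma) beta-bar, eta' = alpha-bar'(I - Gamma C).
xi, eta = (I2 - C @ Gamma) @ bbar, abar.T @ (I2 - Gamma @ C)
C1 = -xi @ eta - C @ G1 @ C
iC = (xi @ eta + C @ G1 @ C) + (xi @ eta @ Gamma @ xi @ eta
     - xi @ eta @ G1 @ C - C @ G1 @ xi @ eta
     - C @ G1 @ C @ G1 @ C - C @ G1 @ C)
print(np.allclose(S0, C1), np.allclose(S1, iC))  # True True
```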