Journal of Econometrics 52 (1992) 91-113. North-Holland

Prediction in dynamic models with time-dependent conditional variances*

Richard T. Baillie
Michigan State University, East Lansing, MI 48824, USA

Tim Bollerslev
Northwestern University, Evanston, IL 60208, USA

This paper considers forecasting the conditional mean and variance from a single-equation dynamic model with autocorrelated disturbances following an ARMA process, and innovations with time-dependent conditional heteroskedasticity as represented by a linear GARCH process. Expressions for the minimum MSE predictor and the conditional MSE are presented. We also derive the formula for all the theoretical moments of the prediction error distribution from a general dynamic model with GARCH(1, 1) innovations. These results are then used in the construction of ex ante prediction confidence intervals by means of the Cornish-Fisher asymptotic expansion. An empirical example relating to the uncertainty of the expected depreciation of foreign exchange rates illustrates the usefulness of the results.

1. Introduction
The ARCH class of models was originally introduced by Engle (1982) as a convenient way of modeling time-dependent conditional heteroskedasticity; see Bollerslev, Chou, and Kroner (1992) for a recent survey. Despite the extensive literature on ARCH and related models, relatively little attention has been given to the issue of forecasting in models where time-dependent conditional heteroskedasticity is present. Bollerslev (1986), Diebold (1988), and Granger, White, and Kamstra (1989) all discuss the construction of one-step-ahead prediction error intervals with time-varying variances. Engle and Kraft (1983) derive expressions for the multi-step prediction error variance in ARMA models with ARCH errors, but do not further discuss the characteristics of the prediction error distribution.
The prediction error distribution is also analyzed in Geweke (1989) within a Bayesian framework using extensive simulation methods. It turns out that the presence of ARCH effects can make a substantial difference to the conduct of inference, such as constructing ex ante forecast confidence intervals and out-of-sample structural stability tests.

*The authors are very grateful to Rob Engle, two anonymous referees, and the participants at the conference on 'Statistical Models of Volatility' at the University of California, San Diego for helpful comments, and thank the NSF for financial support under Grant SES90-22807.

0304-4076/92/$05.00 © 1992 Elsevier Science Publishers B.V. All rights reserved

This paper considers prediction from a fairly general single-equation model as represented by a nonlinear regression function with ARMA disturbances and innovations with time-dependent heteroskedasticity. After a brief section discussing notation, section 3 describes how the minimum mean square error (MSE) predictor of the future values of the conditional mean can be constructed. In the absence of ARCH in the mean effects, the actual form of the predictor is the same as in the homoskedastic case, but the presence of ARCH changes the MSE of the predictor and can make it larger or smaller than the value obtained under the assumption of conditional homoskedasticity. By expressing the Generalized ARCH (GARCH) model, as in Bollerslev (1986), in a companion form representation, section 4 derives the minimum MSE predictor of future values for the conditional variance. Some theoretical results for the corresponding MSE for the predictions of the conditional variance are given in section 5. Section 6 then derives explicit expressions for all the theoretical moments of the conditional prediction error distribution for the popular GARCH(1, 1) model. This allows the percentiles of the forecast density to be approximated by means of the Cornish-Fisher asymptotic expansion as discussed in section 7. These results are extended in section 8 to the important case in practice where the disturbances from a model have ARMA errors and GARCH(1, 1) innovations. In section 9 the practical relevance of the techniques is illustrated through a simple empirical example relating to the uncertainty of the expected depreciation in the forward foreign exchange rate market. A brief conclusion and suggestions for future work are given in section 10.

2. Notation and assumptions

In many practical contexts it is important to derive multi-step predictions of the conditional mean from dynamic econometric models. To keep the setup as general as possible, let $\{y_t\}$ refer to the univariate discrete time real-valued stochastic process to be predicted and let

$$E_{t-1}(y_t) = \mu_t \tag{1}$$

denote the conditional mean given information through time $t-1$. The innovation process, $\{\varepsilon_t\}$, for the conditional mean is then given by

$$\varepsilon_t = y_t - \mu_t, \tag{2}$$

with corresponding, possibly infinite, unconditional variance

$$\mathrm{Var}(\varepsilon_t) = E(\varepsilon_t^2) = \sigma^2. \tag{3}$$

While the unconditional variance is assumed to be time-invariant, the conditional variance of the process is allowed to depend nontrivially on the set of conditioning information, so that

$$\mathrm{Var}_{t-1}(y_t) = E_{t-1}(\varepsilon_t^2) \equiv \sigma_t^2. \tag{4}$$

It is important to note that both $\mu_t$ and $\sigma_t$ are measurable with respect to the time $t-1$ information set and assumed to be finite with probability one. We also define

$$\nu_t = \varepsilon_t^2 - \sigma_t^2. \tag{5}$$

Similarly to the innovation process for the conditional mean given in (2), $\{\nu_t\}$ is serially uncorrelated through time with mean zero, and is readily interpreted as the time $t$ innovation for the conditional variance.

The above setup allows for a wide variety of dynamic econometric models with time-varying second-order moments. In order to simplify the exposition, in the following analysis we shall concentrate on predictions from the standard ARMA($k$, $l$) class of models, i.e.,
The extension to the case of exogenous explanatory variables allowing for the possibility of co-integration, as in Engle and Granger (1987), is in principle straightforward. It is well-known that if the innovation variance, $\sigma^2$, is finite, the ARMA($k$, $l$) model in (6) is covariance-stationary and invertible if and only if all the roots of $1 - \phi_1 z - \cdots - \phi_k z^k = 0$ and $1 - \theta_1 z - \cdots - \theta_l z^l = 0$ lie outside the unit circle.

One important exclusion from this framework concerns the ARCH in mean model, originally due to Engle, Lilien, and Robins (1987). Processes with feedback from the conditional variance to the conditional mean will considerably complicate the form of the predictor and its associated MSE. Analysis of such models is consequently left for future research.

Recently several alternative parameterizations for the time-varying conditional variance, $\sigma_t^2$, have been suggested in the econometrics and time series literature. In this study, we shall focus on the popular linear GARCH($p$, $q$) class of models,

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2, \tag{7}$$

where $\omega > 0$, and $\alpha_i$ and $\beta_i$ are restricted so that the coefficients in the infinite distributed lag representation of $\sigma_t^2$ in terms of lagged values of $\varepsilon_t^2$ are all positive; see Nelson and Cao (1992). If $\alpha_1 + \cdots + \alpha_q + \beta_1 + \cdots + \beta_p < 1$, $\{\varepsilon_t\}$ is covariance-stationary and

$$\sigma^2 = \omega(1 - \alpha_1 - \cdots - \alpha_q - \beta_1 - \cdots - \beta_p)^{-1}.$$

Similarly to the ARMA($k$, $l$) model for the conditional mean, the GARCH($p$, $q$) structure could easily be extended to allow for exogenous explanatory variables entering the conditional variance. The particular parameterization in the GARCH($p$, $q$) model in (7) has $\sigma_t^2$ expressed as a function of lagged squared innovations. An alternative, although less widely used, representation that could be analyzed in a similar fashion involves $\sigma_t^2$ being expressed as a function of serially correlated lagged disturbances $u_t = y_t - f(x_t; b)$, where $f(x_t; b)$ denotes a possibly nonlinear regression function; see Bera and Lee (1988) and Bera, Lee, and Higgins (1990).

In part of the subsequent analysis we shall make use of the higher-order conditional moments for the $\{\varepsilon_t\}$ process. For simplicity, we assume that this conditional distribution is symmetric with all the existing even-ordered moments proportional to the corresponding powers of the conditional variance,

$$E_{t-1}(\varepsilon_t^{2r+1}) = 0, \qquad r = 0, 1, \ldots, K-1, \tag{8}$$

$$E_{t-1}(\varepsilon_t^{2r}) = \kappa_r \sigma_t^{2r}, \qquad r = 0, 1, \ldots, K. \tag{9}$$

Here $\kappa_r$ denotes the $r$th-order cumulant for the conditional density of $\varepsilon_t$, and by definition $\kappa_0 = \kappa_1 = 1$.

For instance, under the assumption of conditional normality often invoked when conducting inference in ARCH-type models, all the moments of the conditional distribution of $\varepsilon_t$ are finite and

$$\kappa_r = \prod_{i=1}^{r} (2i - 1), \qquad r = 1, 2, \ldots. \tag{10}$$

With conditional $t$-distributed errors as in Bollerslev (1987),

$$\kappa_r = (n-2)^r\,\Gamma(r + 1/2)\,\Gamma(n/2 - r)\,\Gamma(1/2)^{-1}\,\Gamma(n/2)^{-1}, \qquad r = 1, 2, \ldots, K, \tag{11}$$

where $\Gamma(\cdot)$ denotes the gamma function, $n > 2$ the degrees of freedom in the $t$-distribution standardized to have a unit variance, and $K = \mathrm{int}(n/2)$. Note, only the first $n$ moments of the $t$-distribution are finite.
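
As a small numerical illustration of (10) and (11), the following Python sketch evaluates the $\kappa_r$ coefficients under conditional normality and under a unit-variance $t$-distribution. The function names `kappa_normal` and `kappa_student_t` are hypothetical and chosen here for illustration only; the guard `n > 2r` is an assumption added so that the gamma function in (11) is only evaluated where the $2r$th moment is finite.

```python
import math


def kappa_normal(r: int) -> float:
    """kappa_r = prod_{i=1}^{r} (2i - 1), as in (10)."""
    out = 1.0
    for i in range(1, r + 1):
        out *= 2 * i - 1
    return out


def kappa_student_t(r: int, n: float) -> float:
    """kappa_r from (11) for a unit-variance t-distribution with n degrees of freedom."""
    if n <= 2 * r:
        # Assumption of this sketch: refuse orders whose moment is not finite.
        raise ValueError("the 2r-th moment is finite only if n > 2r")
    return ((n - 2) ** r * math.gamma(r + 0.5) * math.gamma(n / 2 - r)
            / (math.gamma(0.5) * math.gamma(n / 2)))


if __name__ == "__main__":
    print([kappa_normal(r) for r in range(4)])      # kappa_0 .. kappa_3, normal case
    print([kappa_student_t(r, 8.0) for r in (1, 2)])  # kappa_1, kappa_2 for n = 8
```

For $r = 2$ the $t$-distribution expression reduces to $3(n-2)/(n-4)$, so the conditional kurtosis exceeds the normal value of three for any finite $n > 4$.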

While the vast majority of empirical studies using ARCH models tend to rely on parametric specifications for the conditional density of $\varepsilon_t$ given information through time $t-1$ as in (10) and (11), different nonparametric methods have also been suggested in the literature, including the polynomial expansion in Gallant and Tauchen (1989) and Gallant, Hsieh, and Tauchen (1990) and the nonparametric density estimation in Engle and Gonzalez-Rivera (1991). Although explicit expressions for the higher-order conditional moments may not be directly available from these methods, the implied $\kappa_r$ coefficients can easily be evaluated using numerical techniques.¹

3. Prediction of the mean in ARMA($k$, $l$) models

Many alternative expressions are available for the minimum MSE predictor from the ARMA($k$, $l$) class of models. However, in order to provide an analogy with subsequent material it is convenient to express the ARMA($k$, $l$) model given by (1), (2), (4), and (6) in the companion form representation as
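
$$
\begin{pmatrix} y_t \\ y_{t-1} \\ \vdots \\ y_{t-k+1} \\ \varepsilon_t \\ \varepsilon_{t-1} \\ \vdots \\ \varepsilon_{t-l+1} \end{pmatrix}
=
\begin{pmatrix} \mu \\ 0 \\ \vdots \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}
+
\begin{pmatrix}
\phi_1 & \cdots & \phi_{k-1} & \phi_k & \theta_1 & \cdots & \theta_{l-1} & \theta_l \\
1 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \ddots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & \cdots & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & \cdots & 0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix}
\begin{pmatrix} y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-k} \\ \varepsilon_{t-1} \\ \varepsilon_{t-2} \\ \vdots \\ \varepsilon_{t-l} \end{pmatrix}
+
\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \varepsilon_t,
\tag{12}
$$

or more compactly,

$$Y_t = \mu e_1 + \Phi Y_{t-1} + (e_1 + e_{k+1})\varepsilon_t, \tag{13}$$

where $e_j$ refers to the compatible vector of zeros except for unity in the $j$th element, here a $(k+l)$-dimensional unit vector. Following Baillie (1987), upon repeated substitution in (13) the optimal $s$-step-ahead predictor is readily seen to be

$$E_t(y_{t+s}) = \mu_s + \sum_{i=0}^{k-1} \tau_{i,s}\, y_{t-i} + \sum_{i=0}^{l-1} \lambda_{i,s}\, \varepsilon_{t-i}, \tag{14}$$
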
¹The $\kappa_r$ coefficients from the density estimation in Engle and Gonzalez-Rivera (1991) are time-invariant by assumption, but the higher-order standardized conditional moments from the seminonparametric method in Gallant and Tauchen (1989) may be time-varying.

where

$$\mu_s = e_1'(I + \Phi + \cdots + \Phi^{s-1})e_1\,\mu, \tag{15}$$

$$\tau_{i,s} = e_1'\Phi^s e_{i+1}, \qquad i = 0, \ldots, k-1, \tag{16}$$

$$\lambda_{i,s} = e_1'\Phi^s e_{k+i+1}, \qquad i = 0, \ldots, l-1. \tag{17}$$

(17)
This is a different and more tractable expression than that given by Baillie
(1980) and Yamamoto
(1981), where the ARMA process is represented
in
terms of an infinite-order
autoregression.
Furthermore,
by direct substitution
and iterated expectations
the forecast
error for the s-step-ahead
predictor in (14) is given by
e f,S =-Y*+s
-
Et( ~r+s) = i +r-iFt+i,
(18)
i=l
with conditional
MSE
Et(&) = Var,(y,+,>= i +LiEt(uhi),
i=l
where
ICI,=e;@(e,
+ek+,),
i=O ,...,s-
1.
(20)
Note, the I,!J~‘scorrespond
directly to the coefficients
in the infinite-order
moving average representation
of the model.
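With conditionally homoskedastic errors the conditional MSE for the optimal $s$-step-ahead predictor is identical to the unconditional MSE,

$$\mathrm{Var}(y_{t+s}) = \sigma^2 \sum_{i=1}^{s} \psi_{s-i}^2,$$

where $\sigma^2$ is assumed to be finite. However, when conditional heteroskedasticity is present, the forecast error uncertainty is generally changing through time, and from (19) the conditional MSE takes the form

$$\mathrm{Var}_t(y_{t+s}) = \sigma^2 \sum_{i=1}^{s} \psi_{s-i}^2 + \sum_{i=1}^{s} \psi_{s-i}^2\,\bigl(E_t(\sigma_{t+i}^2) - \sigma^2\bigr).$$
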
Hence, the conventional first term is appended by a second term reflecting the differences between the average or the unconditional variance of the future innovations and the conditional variance given information through time $t$. If the process is covariance-stationary and invertible, $\psi_i^2$ goes to zero and $E_t(\sigma_{t+i}^2)$ to $\sigma^2$ for increasing values of $i$, and the conditional MSE converges to the unconditional variance of the process as the forecast horizon increases. However, with a time-varying variance this convergence is not necessarily monotone. As previously noted by Engle and Kraft (1983), over certain time periods the conditional MSE may exceed the unconditional variance provided $E_t(\sigma_{t+i}^2) > \sigma^2$ for some $1 \le i \le s$.

It is important to note that the presence of conditional heteroskedasticity does not change the expression for the minimum MSE predictor as given by (14), but the forecast error uncertainty associated with the optimal predictor will be time-varying as reflected in (19). Of course, in order to evaluate the expression for the conditional MSE given by the formula in (19) it is necessary to calculate the conditional expectations for the future conditional variances of the innovation process. This is the subject of the next section.
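
To make the companion-form calculation in (13), (19), and (20) concrete, the following Python sketch builds the matrix $\Phi$, computes the weights $\psi_i = e_1'\Phi^i(e_1 + e_{k+1})$, and evaluates the conditional MSE for a given sequence of variance forecasts. It is only a sketch under stated assumptions: the function names are hypothetical, the moving average coefficients are assumed to enter the model additively (as in the MA(4) specification of the empirical example), and at least one autoregressive lag is assumed (a pure MA model can be passed with `phi=[0.0]`).

```python
def companion(phi, theta):
    """Companion matrix Phi for state (y_t..y_{t-k+1}, eps_t..eps_{t-l+1})."""
    k, l = len(phi), len(theta)
    assert k >= 1, "sketch assumes at least one AR lag; use phi=[0.0] for pure MA"
    n = k + l
    Phi = [[0.0] * n for _ in range(n)]
    Phi[0] = [float(c) for c in list(phi) + list(theta)]  # first row: AR then MA terms
    for r in range(1, k):
        Phi[r][r - 1] = 1.0      # lagged y's shift down one position
    for r in range(k + 1, n):
        Phi[r][r - 1] = 1.0      # lagged innovations shift down one position
    return Phi


def psi_weights(phi, theta, s):
    """psi_0..psi_{s-1} from (20), computed by repeated multiplication with Phi."""
    k, l = len(phi), len(theta)
    Phi = companion(phi, theta)
    v = [0.0] * (k + l)
    v[0] = 1.0                   # e_1
    if l > 0:
        v[k] = 1.0               # e_{k+1}
    psi = []
    for _ in range(s):
        psi.append(v[0])         # e_1' Phi^i (e_1 + e_{k+1})
        v = [sum(a * b for a, b in zip(row, v)) for row in Phi]
    return psi


def conditional_mse(psi, var_forecasts):
    """Eq. (19): var_forecasts[i-1] holds E_t(sigma^2_{t+i}) for i = 1..s."""
    s = len(var_forecasts)
    return sum(psi[s - i] ** 2 * var_forecasts[i - 1] for i in range(1, s + 1))


if __name__ == "__main__":
    psi = psi_weights([0.5], [0.4], 3)
    print(psi)
    print(conditional_mse(psi, [2.0, 2.0, 2.0]))
```

With constant variance forecasts the result collapses to the homoskedastic expression $\sigma^2 \sum \psi_{s-i}^2$, which gives a quick check on the implementation.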
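
4. Prediction of the variance in GARCH($p$, $q$) models

Following Bollerslev (1988), the linear GARCH($p$, $q$) model in (4) and (7) can be conveniently rewritten as

$$\varepsilon_t^2 = \omega + (\alpha_1 + \beta_1)\varepsilon_{t-1}^2 + \cdots + (\alpha_m + \beta_m)\varepsilon_{t-m}^2 - \beta_1\nu_{t-1} - \cdots - \beta_p\nu_{t-p} + \nu_t, \tag{21}$$
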
where $m = \max(p, q)$, $\alpha_i = 0$ for $i > q$, and $\beta_i = 0$ for $i > p$. From the definition in (5), $\{\nu_t\}$ is a serially uncorrelated process, and (21) corresponds to an ARMA($m$, $p$) model for $\{\varepsilon_t^2\}$. Thus, analogous to the ARMA($k$, $l$) representation in (12), the GARCH($p$, $q$) model may be expressed in companion first-order form as
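
$$
\begin{pmatrix} \varepsilon_t^2 \\ \varepsilon_{t-1}^2 \\ \vdots \\ \varepsilon_{t-m+1}^2 \\ \nu_t \\ \nu_{t-1} \\ \vdots \\ \nu_{t-p+1} \end{pmatrix}
=
\begin{pmatrix} \omega \\ 0 \\ \vdots \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}
+
\begin{pmatrix}
\alpha_1 + \beta_1 & \cdots & \alpha_{m-1} + \beta_{m-1} & \alpha_m + \beta_m & -\beta_1 & \cdots & -\beta_{p-1} & -\beta_p \\
1 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \ddots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & \cdots & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & \cdots & 0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1 & 0
\end{pmatrix}
\begin{pmatrix} \varepsilon_{t-1}^2 \\ \varepsilon_{t-2}^2 \\ \vdots \\ \varepsilon_{t-m}^2 \\ \nu_{t-1} \\ \nu_{t-2} \\ \vdots \\ \nu_{t-p} \end{pmatrix}
+
\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \nu_t,
\tag{22}
$$

or more compactly,

$$V_t^2 = \omega e_1 + \Gamma V_{t-1}^2 + (e_1 + e_{m+1})\nu_t. \tag{23}$$

Upon repeated substitution,

$$V_{t+s}^2 = \sum_{i=0}^{s-1} \Gamma^i\bigl((e_1 + e_{m+1})\nu_{t+s-i} + \omega e_1\bigr) + \Gamma^s V_t^2.$$

However, $\varepsilon_{t+s}^2 = e_1'V_{t+s}^2$, and $E_t(\nu_{t+i}) = 0$ for $i > 0$, and analogously to the derivation of (14) it follows, from pre-multiplication with $e_1'$ and the use of iterated expectations, that the minimum MSE $s$-step-ahead predictor for the conditional variance from the GARCH($p$, $q$) process is given by²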
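
$$E_t(\varepsilon_{t+s}^2) = E_t(\sigma_{t+s}^2) = \omega_s + \sum_{i=0}^{p-1} \delta_{i,s}\,\sigma_{t-i}^2 + \sum_{i=0}^{m-1} \rho_{i,s}\,\varepsilon_{t-i}^2, \tag{24}$$

where

$$\omega_s = e_1'(I + \Gamma + \cdots + \Gamma^{s-1})e_1\,\omega, \tag{25}$$

$$\delta_{i,s} = -e_1'\Gamma^s e_{m+i+1}, \qquad i = 0, \ldots, p-1, \tag{26}$$

$$\rho_{i,s} = e_1'\Gamma^s(e_{i+1} + e_{m+i+1}), \qquad i = 0, \ldots, p-1, \tag{27}$$

$$\rho_{i,s} = e_1'\Gamma^s e_{i+1}, \qquad i = p, \ldots, m-1. \tag{28}$$

As an illustration consider the popular GARCH(1, 1) process, where

$$E_t(\sigma_{t+s}^2) = \omega \sum_{i=1}^{s-1} (\alpha_1 + \beta_1)^{i-1} + (\alpha_1 + \beta_1)^{s-1}\sigma_{t+1}^2.$$

If the model is covariance-stationary with $\alpha_1 + \beta_1 < 1$ and $\sigma^2 = \omega(1 - \alpha_1 - \beta_1)^{-1}$, the optimal predictor for $\sigma_{t+s}^2$ becomes

$$E_t(\sigma_{t+s}^2) = \sigma^2 + (\alpha_1 + \beta_1)^{s-1}(\sigma_{t+1}^2 - \sigma^2),$$

²An alternative expression for the optimal predictor for the conditional variance is given by the recursions $E_t(\sigma_{t+s}^2) = \omega + \alpha_1 E_t(\varepsilon_{t+s-1}^2) + \cdots + \alpha_q E_t(\varepsilon_{t+s-q}^2) + \beta_1 E_t(\sigma_{t+s-1}^2) + \cdots + \beta_p E_t(\sigma_{t+s-p}^2)$, where by definition $E_t(\varepsilon_{t+i}^2) = E_t(\sigma_{t+i}^2)$ for $i > 0$, while $E_t(\sigma_{t+i}^2) = \sigma_{t+i}^2$ and $E_t(\varepsilon_{t+i}^2) = \varepsilon_{t+i}^2$ for $i \le 0$.
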
a result previously derived by Engle and Bollerslev (1986). Once again, as the forecast horizon increases current information becomes less important, and the optimal forecast converges monotonically to the unconditional variance. However, for the Integrated GARCH(1, 1), or IGARCH(1, 1), model with $\alpha_1 + \beta_1 = 1$, current information remains important for forecasts of all horizons,

$$E_t(\sigma_{t+s}^2) = \omega(s - 1) + \sigma_{t+1}^2.$$

Shocks to the conditional variance are persistent in the sense of Bollerslev and Engle (1989), as $E_t(\sigma_{t+s}^2) - E_{t-1}(\sigma_{t+s}^2) = \sigma_{t+1}^2 - \omega - \sigma_t^2 = \alpha_1\nu_t$ is a nontrivial function of the information set at time $t$ for all forecast horizons $s > 0$.
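
The GARCH(1, 1) variance forecasts above can be checked numerically. The following Python sketch compares the one-step recursion $E_t(\sigma_{t+h+1}^2) = \omega + (\alpha_1 + \beta_1)E_t(\sigma_{t+h}^2)$ with the closed-form expressions for the covariance-stationary and the IGARCH cases. The function names and the numerical tolerance used to detect $\alpha_1 + \beta_1 = 1$ are assumptions of this sketch.

```python
def garch11_forecast(omega, alpha, beta, sigma2_next, s):
    """E_t(sigma^2_{t+s}) by recursion; sigma2_next is sigma^2_{t+1}, known at time t."""
    forecast = sigma2_next
    for _ in range(s - 1):
        forecast = omega + (alpha + beta) * forecast
    return forecast


def garch11_forecast_closed(omega, alpha, beta, sigma2_next, s):
    """Closed forms: stationary case, or omega*(s-1) + sigma^2_{t+1} for IGARCH."""
    lam = alpha + beta
    if abs(lam - 1.0) < 1e-12:          # IGARCH(1, 1): no finite unconditional variance
        return omega * (s - 1) + sigma2_next
    sigma2 = omega / (1.0 - lam)        # unconditional variance
    return sigma2 + lam ** (s - 1) * (sigma2_next - sigma2)


if __name__ == "__main__":
    for s in (1, 2, 5, 20):
        print(s, garch11_forecast(0.1, 0.2, 0.7, 2.0, s),
              garch11_forecast_closed(0.1, 0.2, 0.7, 2.0, s))
```

In the stationary example the forecasts decay geometrically from $\sigma_{t+1}^2 = 2$ toward the unconditional variance of one, whereas in the IGARCH case they grow linearly in the horizon when $\omega > 0$.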
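
The conditional MSE associated with the optimal forecasts for the mean in the general ARMA($k$, $l$)-GARCH($p$, $q$) class of models is readily obtained by combining eqs. (19) and (24),

$$\mathrm{Var}_t(y_{t+s}) = \sum_{i=1}^{s} \psi_{s-i}^2\,\omega_i + \sum_{i=1}^{s}\sum_{j=0}^{p-1} \psi_{s-i}^2\,\delta_{j,i}\,\sigma_{t-j}^2 + \sum_{i=1}^{s}\sum_{j=0}^{m-1} \psi_{s-i}^2\,\rho_{j,i}\,\varepsilon_{t-j}^2. \tag{29}$$
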
For example, for the covariance-stationary AR(1) model with GARCH(1, 1) errors the optimal predictor becomes

$$E_t(y_{t+s}) = \phi_1^s\, y_t,$$

with associated MSE

$$\mathrm{Var}_t(y_{t+s}) = \sum_{i=1}^{s} \phi_1^{2(s-i)}\bigl(\sigma^2 + (\alpha_1 + \beta_1)^{i-1}(\sigma_{t+1}^2 - \sigma^2)\bigr) = \sigma^2(1 - \phi_1^{2s})(1 - \phi_1^2)^{-1} + (\sigma_{t+1}^2 - \sigma^2)\bigl((\alpha_1 + \beta_1)^s - \phi_1^{2s}\bigr)(\alpha_1 + \beta_1 - \phi_1^2)^{-1},$$

where the last equality is only valid for $\phi_1^2 \ne \alpha_1 + \beta_1$. The inclusion of the second term in the above expression for the conditional MSE may lead to an increase or a decrease in the prediction error variance compared to the conventional MSE, but as the horizon increases the conditional MSE will converge to the unconditional variance, i.e., $\mathrm{Var}(y_t) = \omega(1 - \alpha_1 - \beta_1)^{-1}(1 - \phi_1^2)^{-1}$.
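
The two forms of the AR(1)-GARCH(1, 1) conditional MSE can be compared directly. The Python sketch below evaluates the finite sum and the closed-form expression; the function names are hypothetical, and the closed form is only evaluated under the stated restriction $\phi_1^2 \ne \alpha_1 + \beta_1$ together with $|\phi_1| < 1$ and $\alpha_1 + \beta_1 < 1$.

```python
def ar1_garch11_mse_sum(phi, omega, alpha, beta, sigma2_next, s):
    """Conditional MSE as the finite sum over horizons i = 1..s."""
    lam = alpha + beta
    sigma2 = omega / (1.0 - lam)                 # unconditional innovation variance
    return sum(phi ** (2 * (s - i)) * (sigma2 + lam ** (i - 1) * (sigma2_next - sigma2))
               for i in range(1, s + 1))


def ar1_garch11_mse_closed(phi, omega, alpha, beta, sigma2_next, s):
    """Closed form; requires phi**2 != alpha + beta and phi**2 != 1."""
    lam = alpha + beta
    sigma2 = omega / (1.0 - lam)
    p2 = phi ** 2
    homoskedastic = sigma2 * (1.0 - p2 ** s) / (1.0 - p2)          # conventional MSE
    arch_term = (sigma2_next - sigma2) * (lam ** s - p2 ** s) / (lam - p2)
    return homoskedastic + arch_term


if __name__ == "__main__":
    args = (0.5, 0.1, 0.2, 0.7, 2.0)
    for s in (1, 3, 10, 50):
        print(s, ar1_garch11_mse_sum(*args, s), ar1_garch11_mse_closed(*args, s))
```

For long horizons both expressions approach $\omega(1 - \alpha_1 - \beta_1)^{-1}(1 - \phi_1^2)^{-1}$, the unconditional variance quoted in the text.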

5. Uncertainty in predicting future conditional variances

The results in the previous section provide formulae for the calculation of the forecast error variance of the mean in general dynamic econometric models with GARCH($p$, $q$) errors. However, in many applications in financial economics the primary interest centers on the forecast for the future conditional variance itself. Such instances include option pricing as discussed by Day and Lewis (1992) and Lamoureux and Lastrapes (1990), the efficient determination of the market rate of return as in Chou (1989), and the relationship between stock market volatility and the business cycle as analyzed by Schwert (1989). In these situations it is therefore of interest to be able to characterize the uncertainty associated with the forecasts for the future conditional variances also. Some potentially useful results for this purpose are given by Lemma 1.

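Lemma 1. For the GARCH($p$, $q$) model given by (4), (5), (7), (8), and (9) the forecast error for the $s$-step-ahead predictor for the conditional variance $\sigma_{t+s}^2$ in (24) equals

$$v_{t,s} \equiv \sigma_{t+s}^2 - E_t(\sigma_{t+s}^2) = \sum_{i=1}^{s-1} \chi_i\,\nu_{t+s-i}, \tag{30}$$

and the conditional MSE is

$$E_t(v_{t,s}^2) = \mathrm{Var}_t(\sigma_{t+s}^2) = (\kappa_2 - 1)\sum_{i=1}^{s-1} \chi_i^2\, E_t(\sigma_{t+s-i}^4), \tag{31}$$

where

$$\chi_i = e_1'\Gamma^i(e_1 + e_{m+1}), \qquad i = 1, \ldots, s-1. \tag{32}$$

Proof. Since $\sigma_{t+s}^2 - E_t(\sigma_{t+s}^2) = \varepsilon_{t+s}^2 - E_t(\varepsilon_{t+s}^2) - \nu_{t+s}$, (30) and (32) follow from the companion form in (23) and slight modifications to the derivation of (18) and (19). By (5) and (9) $E_t(\nu_{t+i}\nu_{t+j}) = 0$ for $1 \le i < j \le s$, and $E_t(\nu_{t+i}^2) = (\kappa_2 - 1)E_t(\sigma_{t+i}^4)$ for $i > 0$. Hence (31) is a result of iterated expectations. Q.E.D.
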
To illustrate, consider the GARCH(1, 1) model. From (22) and (32) it follows that $\chi_i = \alpha_1(\alpha_1 + \beta_1)^{i-1}$, $i = 1, \ldots, s-1$. Hence, the forecast error associated with predictions of the future conditional variance equals

$$v_{t,s} = \alpha_1 \sum_{i=1}^{s-1} (\alpha_1 + \beta_1)^{i-1}\,\nu_{t+s-i},$$

with corresponding conditional MSE

$$E_t(v_{t,s}^2) = (\kappa_2 - 1)\,\alpha_1^2 \sum_{i=1}^{s-1} (\alpha_1 + \beta_1)^{2(i-1)}\, E_t(\sigma_{t+s-i}^4).$$

The empirical evaluation of this conditional MSE requires an expression for the fourth conditional moment. Such an expression is derived in the next section. However, given the focus in the present paper on optimal predictions for the conditional mean and the distribution of the corresponding prediction errors, we shall not pursue the topic of optimally forecasting the conditional variance any further in this paper.

6. Prediction error distributions in GARCH(1, 1) models

When conducting inference in GARCH($p$, $q$) models, distributional assumptions are generally placed on the conditional distribution of $\varepsilon_t$ given $\sigma_t^2$. This implies specific values for the cumulants $\kappa_r$ in (9) that characterize the conditional even-ordered moments. However, in the presence of ARCH the unconditional distribution of $\varepsilon_t$ has fatter tails than this conditional one-step-ahead prediction error distribution. In particular, given finite fourth moment $E_{t-1}(\varepsilon_t^4) - \kappa_2(E_{t-1}(\varepsilon_t^2))^2 = 0 < E(\varepsilon_t^4) - \kappa_2(E(\varepsilon_t^2))^2$. Similarly, the conditional distribution of $\varepsilon_{t+s}$ for $s > 1$ given information through time $t$ differs from the conditional distribution for $s = 1$. Even if the distribution of $\varepsilon_{t+s}/\sqrt{E_t(\varepsilon_{t+s}^2)}$ is time-invariant for $s = 1$ by assumption, for $s > 1$ and $\sigma_t^2$ time-varying the standardized prediction error distribution generally depends nontrivially on the information set at time $t$. As opposed to the conventional framework with conditionally homoskedastic errors where standardizing with the prediction error variance leads to a time-invariant distribution, the dependence in the higher-order moments for the GARCH($p$, $q$) model substantially complicates conventional multi-step prediction exercises. For instance, the quantile regression techniques discussed in Granger, White, and Kamstra (1989) as a method for estimating the time-varying one-step-ahead prediction error intervals do not easily extend to multi-step predictors. Similarly, the numerical methods developed by Geweke (1989) for calculating the exact predictive density would require extensive simulations of the prediction error distribution for each particular realization of $\sigma_{t+1}^2$.

The presence of heteroskedasticity also alters the structure of the post-sample structural stability tests proposed by Box and Tiao (1976) commonly used as a tool in model evaluation and diagnostic checking; see, e.g., Chong and Hendry (1986), Lahiri (1975), and Lütkepohl (1985, 1988). In particular, under the assumption of conditional normality of the one-step-ahead prediction errors, $\varepsilon_t/\sigma_t$, it follows that the test statistic

$$\sum_{i=1}^{s} \varepsilon_{t+i}^2\,\sigma_{t+i}^{-2}$$

will possess the conventional chi-squared distribution with $s$ degrees of freedom. Ignoring the temporal variation in $\sigma_{t+i}^2$ in constructing the test statistic can seriously bias inference.³

To overcome this apparent indeterminacy regarding the properties of the prediction error distribution from GARCH models, simple recursive expressions for all the existing conditional moments of the prediction error density for the widely-used GARCH(1, 1) model are provided by Theorem 1.

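Theorem 1. For the GARCH(1, 1) model defined by (4), (8), (9), and

$$\sigma_t^2 = \omega + \alpha_1\varepsilon_{t-1}^2 + \beta_1\sigma_{t-1}^2,$$

the first $2K$ conditional moments of $\varepsilon_{t+s}$, $s > 1$, are given by

$$E_t(\varepsilon_{t+s}^{2r+1}) = 0, \qquad r = 0, 1, \ldots, K-1, \tag{33}$$

$$E_t(\varepsilon_{t+s}^{2r}) = \kappa_r\, E_t(\sigma_{t+s}^{2r}), \qquad r = 0, 1, \ldots, K, \tag{34}$$
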
where
(35)
Proof. Eqs. (33) and (34) follow from (8) and (9) by the law of iterated expectations. On repeated use of the binomial formula and from the law of iterated expectations,

$$E_t(\sigma_{t+s}^{2r}) = E_t\bigl((\omega + \alpha_1\varepsilon_{t+s-1}^2 + \beta_1\sigma_{t+s-1}^2)^r\bigr) = \sum_{i=0}^{r} \binom{r}{i}\,\omega^{r-i}\, E_t\bigl((\alpha_1\varepsilon_{t+s-1}^2 + \beta_1\sigma_{t+s-1}^2)^i\bigr),$$

which reduces to (35) and (36). Q.E.D.

³For instance, for the GARCH(1, 1) model with $\omega = 0.1$, $\alpha_1 = 0.2$, $\beta_1 = 0.7$, and $\sigma_{t+1}^2$ fixed at the unconditional variance of one, the estimated rejection frequencies for $s = 2$, 5, and 10 based on the 0.05 and 0.01 fractiles in the chi-squared distribution with $s$ degrees of freedom are 0.064, 0.091, 0.114 and 0.027, 0.046, 0.067, respectively.

Given the recursive expressions for all the conditional moments in Theorem 1, several alternative asymptotic expansions are available for approximating prediction error distributions.⁴ To set out the particular expansion used below in forming prediction error intervals, it is convenient to introduce the standardized cumulants for the $s$-step-ahead prediction error,⁵

$$\gamma_{r,t,s} \equiv \kappa_{r,t,s}\,(\kappa_{1,t,s})^{-r}, \qquad r = 2, \ldots, K, \tag{37}$$

where $\kappa_{r,t,s}$ denotes the $2r$th cumulant for $e_{t,s}$ conditional on the time $t$ information set. For the one-step-ahead prediction error $\gamma_{r,t,1}$ will be time-invariant for all $r$ by the assumptions in (8) and (9), but in general $\gamma_{r,t,s}$ will be a nontrivial function of the time $t$ information set for $s > 1$. For instance, from Theorem 1 the conditional excess kurtosis for the two-step-ahead prediction error in the GARCH(1, 1) model is given by

$$\gamma_{2,t,2} = \kappa_2(\kappa_2 - 1)\,\alpha_1^2\,\sigma_{t+1}^4\,\bigl(\omega + (\alpha_1 + \beta_1)\sigma_{t+1}^2\bigr)^{-2} + (\kappa_2 - 3),$$

which under the assumption of conditional normality, i.e., $\kappa_2 = 3$, reduces to

$$\gamma_{2,t,2} = 6\alpha_1^2\,\sigma_{t+1}^4\,\bigl(\omega + (\alpha_1 + \beta_1)\sigma_{t+1}^2\bigr)^{-2}.$$

Obviously $\gamma_{2,t,2}$ is an increasing function of both $\sigma_{t+1}^2$ and $\alpha_1$. For large values of $\sigma_{t+1}^2$ the conditional kurtosis for the two-step-ahead prediction error distribution may exceed the unconditional excess kurtosis for $\varepsilon_t$; i.e., $6\alpha_1^2(1 - \beta_1^2 - 2\alpha_1\beta_1 - 3\alpha_1^2)^{-1}$, where the denominator is assumed to be positive in order to ensure a finite fourth moment. More complicated expressions for the higher-order cumulants and longer forecast horizons from the GARCH(1, 1) model are readily available from Theorem 1.

⁴The conditional moment sequence uniquely determines the conditional distribution if the Carleman condition is satisfied; i.e., $E_t(\varepsilon_{t+s}^2)^{-1/2} + E_t(\varepsilon_{t+s}^4)^{-1/4} + E_t(\varepsilon_{t+s}^6)^{-1/6} + \cdots = \infty$. See Serfling (1980) for sufficient conditions.

⁵The $\gamma_{r,t,s}$'s play an important role in many asymptotic expansions, including the Gram-Charlier, the Edgeworth, and the Cornish-Fisher approximation used below.
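
The binomial expansion used in the proof of Theorem 1 suggests a simple recursion for the conditional moments of the future variances. The Python sketch below is one way to write it, under assumptions that are ours rather than taken from the text: expanding $(\alpha_1\varepsilon^2 + \beta_1\sigma^2)^j$ once more with the binomial formula and applying (9) gives $E_t(\sigma_{t+s}^{2r}) = \sum_{j=0}^{r}\binom{r}{j}\omega^{r-j}a_j E_t(\sigma_{t+s-1}^{2j})$ with $a_j = \sum_{n=0}^{j}\binom{j}{n}\kappa_n\alpha_1^n\beta_1^{j-n}$, started from $E_t(\sigma_{t+1}^{2j}) = (\sigma_{t+1}^2)^j$. The symbol $a_j$ and all function names are hypothetical.

```python
import math


def sigma_moments(omega, alpha, beta, kappa, sigma2_next, s):
    """Return [E_t(sigma_{t+s}^{2r}) for r = 0..R], where kappa = [kappa_0, ..., kappa_R]."""
    R = len(kappa) - 1
    # a_j = E[(alpha*z^2 + beta)^j] for a standardized innovation z with E z^{2n} = kappa_n.
    a = [sum(math.comb(j, n) * kappa[n] * alpha ** n * beta ** (j - n) for n in range(j + 1))
         for j in range(R + 1)]
    m = [sigma2_next ** r for r in range(R + 1)]   # sigma^2_{t+1} is known at time t
    for _ in range(s - 1):
        m = [sum(math.comb(r, j) * omega ** (r - j) * a[j] * m[j] for j in range(r + 1))
             for r in range(R + 1)]
    return m


def excess_kurtosis(omega, alpha, beta, kappa, sigma2_next, s):
    """Conditional excess kurtosis of eps_{t+s}: kappa_2 * E sigma^4 / (E sigma^2)^2 - 3."""
    m = sigma_moments(omega, alpha, beta, kappa, sigma2_next, s)
    return kappa[2] * m[2] / m[1] ** 2 - 3.0


if __name__ == "__main__":
    normal_kappa = [1.0, 1.0, 3.0]
    two_step = excess_kurtosis(0.1, 0.2, 0.7, normal_kappa, 1.0, 2)
    unconditional = 6 * 0.2 ** 2 / (1 - 0.7 ** 2 - 2 * 0.2 * 0.7 - 3 * 0.2 ** 2)
    print(two_step, unconditional)
```

For $s = 2$ under conditional normality the recursion reproduces $6\alpha_1^2\sigma_{t+1}^4(\omega + (\alpha_1 + \beta_1)\sigma_{t+1}^2)^{-2}$, which provides a check of the sketch against the expression for the two-step-ahead excess kurtosis.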

7. Cornish-Fisher expansion

For the purpose of constructing prediction error intervals that remain valid in the presence of heteroskedasticity, the inverse of the Edgeworth expansion originally developed by Cornish and Fisher (1937) is particularly useful; see Barndorff-Nielsen and Cox (1989) for a recent discussion. Thus, let $z_{t,s}(p)$ denote the Cornish-Fisher approximation to the time-varying $p$th quantile in the conditional distribution for the $s$-step-ahead prediction error $e_{t,s}$. For symmetric deviations from conditional normality the expression for $z_{t,s}(p)$ then simplifies to

$$z_{t,s}(p) = P_{t,s}(p)\,E_t(e_{t,s}^2)^{1/2}, \tag{38}$$

where $\tfrac{1}{2} \le p < 1$, and

$$P_{t,s}(p) = \Phi^{-1}(p) + p_2(\Phi^{-1}(p))\,\gamma_{2,t,s} + p_{22}(\Phi^{-1}(p))\,\gamma_{2,t,s}^2 + p_4(\Phi^{-1}(p))\,\gamma_{3,t,s} + \cdots. \tag{39}$$

The first term in (39), $\Phi^{-1}(p)$, refers to the $p$th quantile in the standard normal distribution, the second term adjusts for conditional excess kurtosis, while the third and fourth terms are due to adjustments for up to the sixth conditional moment, and terms involving eighth or higher-order moments have been omitted. Also, three important functions are given by⁶

$$p_2(z) = (z^3 - 3z)/24,$$

$$p_4(z) = (z^5 - 10z^3 + 15z)/720,$$

$$p_{22}(z) = -(3z^5 - 24z^3 + 29z)/384.$$

Of course, the assumption of conditional normality for the $s$-step-ahead prediction errors corresponds to fixing all these functions in (39) to zero.
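
A minimal Python sketch of the quantile approximation in (38) and (39) is given below, assuming a symmetric conditional distribution. The function and argument names are hypothetical; in particular `gamma_sixth` stands for the standardized sixth-order cumulant that multiplies $p_4$, and passing `None` keeps only the kurtosis adjustment, which is the truncated version examined in the simulations that follow.

```python
from statistics import NormalDist


def p2(z):
    return (z ** 3 - 3 * z) / 24


def p4(z):
    return (z ** 5 - 10 * z ** 3 + 15 * z) / 720


def p22(z):
    return -(3 * z ** 5 - 24 * z ** 3 + 29 * z) / 384


def cornish_fisher_quantile(p, mse, gamma2, gamma_sixth=None):
    """Approximate p-th quantile of the prediction error, as in (38)-(39).

    mse is the conditional MSE E_t(e_{t,s}^2); gamma2 is the conditional excess kurtosis.
    """
    z = NormalDist().inv_cdf(p)          # standard normal quantile
    adjusted = z + p2(z) * gamma2        # fourth-moment (kurtosis) correction
    if gamma_sixth is not None:          # optional sixth-moment corrections
        adjusted += p22(z) * gamma2 ** 2 + p4(z) * gamma_sixth
    return adjusted * mse ** 0.5


if __name__ == "__main__":
    for p in (0.8, 0.95, 0.99):
        print(p, cornish_fisher_quantile(p, 1.0, 0.0), cornish_fisher_quantile(p, 1.0, 0.24))
```

Because $p_2(z)$ changes sign at $z = \sqrt{3}$, the kurtosis correction pulls quantiles with tail probability above roughly 4 percent toward the center and pushes more extreme quantiles outward.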

In order to check the accuracy of the Cornish-Fisher approximation in obtaining prediction error intervals in the present context a series of simulations were performed for various GARCH(1, 1) parameterizations with conditionally normal one-step-ahead prediction errors. For small values of $\alpha_1$ and short forecast horizons the convenient assumption of conditional normality of the multi-step predictions appeared to work reasonably well on average, with the estimated coverage probabilities being close to the true size of the intervals. However, consistent with the discussion in the previous section, for longer forecast horizons and/or larger values of $\alpha_1$ the true conditional prediction error distribution was more peaked at the center and had fatter tails than the normal distribution, and the normal approximation tended to overestimate the 0.50 and 0.20 fractiles while underestimating the 0.01 fractile. Interestingly, the crossing points for the densities for the true prediction error distribution and the normal approximation were generally close to the 5% fractiles. However, the simple Cornish-Fisher expansion in (39) based on the adjustment for up to the fourth conditional moment only, i.e., neglecting $p_4(\Phi^{-1}(p))$, $p_{22}(\Phi^{-1}(p))$, and higher-order terms, proved a better representation of the extreme fractiles. Also, for large realizations of $\sigma_{t+1}^2$ resulting in more marked deviations from conditional normality in the multi-step prediction error distributions the Cornish-Fisher expansion based on only $p_2(\Phi^{-1}(p))$ performed quite well. Full details of these simulation results are available in Baillie and Bollerslev (1990b).

8. Prediction error distributions in ARMA($k$, $l$)-GARCH(1, 1) models

Eqs. (14) and (29) provided explicit expressions for the optimal $s$-step-ahead predictor and its associated conditional MSE in the ARMA($k$, $l$)-GARCH($p$, $q$) class of models. However, in order to make use of any asymptotic expansions in approximating the prediction error distribution, expressions for the higher-order conditional moments are called for. Theorem 2 provides such a formula for the fourth moment of the prediction errors from a general dynamic econometric model with GARCH(1, 1) innovations.

⁶See Abramowitz and Stegun (1972) and Kendall and Stuart (1967) for a definition of the importance functions in terms of Hermite polynomials.

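Theorem 2. The conditional fourth moment for the prediction error for the minimum MSE predictor in the ARMA($k$, $l$)-GARCH(1, 1) model given by (1), (2), (4), and (6)-(9) equals

$$E_t(e_{t,s}^4) = \kappa_2 \sum_{i=0}^{s-1} \psi_i^4\, E_t(\sigma_{t+s-i}^4) + 6 \sum_{i=0}^{s-2}\sum_{j=i+1}^{s-1} \psi_i^2\psi_j^2\,\bigl[\xi_{i,j}\,E_t(\sigma_{t+s-j}^2) + (\alpha_1 + \beta_1)^{j-i-1}(\kappa_2\alpha_1 + \beta_1)\,E_t(\sigma_{t+s-j}^4)\bigr], \tag{40}$$

where

$$\xi_{i,j} = \omega \sum_{h=0}^{j-1-i} (\alpha_1 + \beta_1)^h, \tag{41}$$

and $E_t(\sigma_{t+i}^2)$ and $E_t(\sigma_{t+i}^4)$ for $i > 0$ are given by (35).
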
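Proof. From (8) the conditional expectation of terms involving $\varepsilon_{t+i}$ to odd powers is zero, and therefore

$$E_t(e_{t,s}^4) = E_t\Bigl[\Bigl(\sum_{i=0}^{s-1} \psi_i\,\varepsilon_{t+s-i}\Bigr)^4\Bigr] = \sum_{i=0}^{s-1} \psi_i^4\, E_t(\varepsilon_{t+s-i}^4) + 6 \sum_{i=0}^{s-2}\sum_{j=i+1}^{s-1} \psi_i^2\psi_j^2\, E_t(\varepsilon_{t+s-i}^2\,\varepsilon_{t+s-j}^2).$$

For $0 \le i < s$ it follows directly from (9) that

$$E_t(\varepsilon_{t+s-i}^4) = E_t\bigl(E_{t+s-i-1}(\varepsilon_{t+s-i}^4)\bigr) = \kappa_2\, E_t(\sigma_{t+s-i}^4).$$

Similarly, for $0 \le i < j < s$ it can be shown that

$$
\begin{aligned}
E_t(\varepsilon_{t+s-i}^2\,\varepsilon_{t+s-j}^2)
&= E_t\bigl(\varepsilon_{t+s-j}^2\,[\omega + \alpha_1\varepsilon_{t+s-i-1}^2 + \beta_1\sigma_{t+s-i-1}^2]\bigr) \\
&= E_t\bigl(\varepsilon_{t+s-j}^2\,[\omega + (\alpha_1 + \beta_1)\sigma_{t+s-i-1}^2]\bigr) \\
&\;\;\vdots \\
&= E_t\bigl(\varepsilon_{t+s-j}^2\,[\omega(1 + (\alpha_1 + \beta_1) + \cdots + (\alpha_1 + \beta_1)^{j-2-i}) + (\alpha_1 + \beta_1)^{j-1-i}\sigma_{t+s-j+1}^2]\bigr) \\
&= \omega\bigl(1 + (\alpha_1 + \beta_1) + \cdots + (\alpha_1 + \beta_1)^{j-1-i}\bigr)E_t(\sigma_{t+s-j}^2) + (\alpha_1 + \beta_1)^{j-1-i}(\kappa_2\alpha_1 + \beta_1)\,E_t(\sigma_{t+s-j}^4),
\end{aligned}
$$

which reduces to (40) and (41) upon substitution. Q.E.D.

Note that if $\alpha_1 + \beta_1 \ne 1$ the expression for $\xi_{i,j}$ in (41) simplifies to

$$\xi_{i,j} = \omega\bigl(1 - (\alpha_1 + \beta_1)^{j-i}\bigr)(1 - \alpha_1 - \beta_1)^{-1}.$$
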
As for the formula for the conditional MSE in (19), the results for the conditional fourth moments provided in Theorem 2 apply more generally to all dynamic econometric models with GARCH(1, 1) innovations and prediction errors that can be expressed as in eq. (20). Of course, in the absence of any serial dependence in the conditional mean, i.e., $\psi_1 = \psi_2 = \cdots = \psi_{s-1} = 0$, $e_{t,s} = \varepsilon_{t+s}$, and (40) is just a special case of the more general results for the conditional moments in the GARCH(1, 1) model given in Theorem 1.
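
The fourth-moment formula in (40) and (41) can be evaluated with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the first two moments of the future conditional variances are generated from the recursions $E_t(\sigma_{t+h+1}^2) = \omega + (\alpha_1 + \beta_1)E_t(\sigma_{t+h}^2)$ and $E_t(\sigma_{t+h+1}^4) = \omega^2 + 2\omega(\alpha_1 + \beta_1)E_t(\sigma_{t+h}^2) + (\beta_1^2 + 2\alpha_1\beta_1 + \kappa_2\alpha_1^2)E_t(\sigma_{t+h}^4)$, which are our own expansion of the GARCH(1, 1) equation rather than formulas quoted from the text.

```python
def variance_moments(omega, alpha, beta, kappa2, sigma2_next, s):
    """m1[h-1] = E_t(sigma^2_{t+h}) and m2[h-1] = E_t(sigma^4_{t+h}) for h = 1..s."""
    lam = alpha + beta
    a2 = beta ** 2 + 2 * alpha * beta + kappa2 * alpha ** 2
    m1, m2 = [sigma2_next], [sigma2_next ** 2]
    for _ in range(s - 1):
        m2.append(omega ** 2 + 2 * omega * lam * m1[-1] + a2 * m2[-1])
        m1.append(omega + lam * m1[-1])
    return m1, m2


def fourth_moment(psi, omega, alpha, beta, kappa2, sigma2_next):
    """E_t(e_{t,s}^4) from (40)-(41); psi = [psi_0, ..., psi_{s-1}]."""
    s = len(psi)
    lam = alpha + beta
    m1, m2 = variance_moments(omega, alpha, beta, kappa2, sigma2_next, s)
    total = kappa2 * sum(psi[i] ** 4 * m2[s - i - 1] for i in range(s))
    for i in range(s - 1):
        for j in range(i + 1, s):
            xi = omega * sum(lam ** h for h in range(j - i))          # eq. (41)
            cross = (xi * m1[s - j - 1]
                     + lam ** (j - i - 1) * (kappa2 * alpha + beta) * m2[s - j - 1])
            total += 6 * psi[i] ** 2 * psi[j] ** 2 * cross
    return total


def prediction_excess_kurtosis(psi, omega, alpha, beta, kappa2, sigma2_next):
    """Fourth moment divided by the squared conditional MSE from (19), minus three."""
    s = len(psi)
    m1, _ = variance_moments(omega, alpha, beta, kappa2, sigma2_next, s)
    mse = sum(psi[i] ** 2 * m1[s - i - 1] for i in range(s))
    return fourth_moment(psi, omega, alpha, beta, kappa2, sigma2_next) / mse ** 2 - 3.0


if __name__ == "__main__":
    for phi in (0.0, 0.5, 1.0):      # AR(1) weights psi_i = phi**i, horizon s = 2
        print(phi, prediction_excess_kurtosis([1.0, phi], 0.1, 0.2, 0.7, 3.0, 1.0))
```

With $\psi_1 = 0$ the calculation collapses to the pure GARCH(1, 1) case of Theorem 1, and the excess kurtosis rises as the autoregressive weight increases, in line with the direction of the simulation evidence reported below.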

To assess the practical importance of the results in Theorem 2 we also carried out several simulations for different AR(1)-GARCH(1, 1) formulations with conditionally normal one-step-ahead prediction error distributions. Not surprisingly, the presence of serial dependence in the disturbances generally led to an increase in the conditional excess kurtosis for the prediction errors, $\gamma_{2,t,s}$, due to the temporal dependence in $E_t(\varepsilon_{t+s-i}^2\,\varepsilon_{t+s-j}^2)$. For instance, for $\alpha_1 = 0.2$, $\beta_1 = 0.7$, and $s = 2$ the average conditional excess kurtosis increased from 0.228 for $\phi_1 = 0.0$ to 0.516 for $\phi_1 = 0.5$ and 0.642 for $\phi_1 = 1.0$. This is also borne out by the simulation results obtained for the coverage probabilities for the prediction error intervals. The conditional normal approximations for the AR(1)-GARCH(1, 1) models are too peaked at the center and too thin in the tails, and the serial dependence in the conditional mean tends to enhance these departures from normality even further when compared to the results from the simple GARCH(1, 1) models. Fortunately, the Cornish-Fisher prediction error intervals based on corrections for up to the fourth conditional moment generally provided reasonably close approximations.

9. Empirical
example
To illustrate the techniques discussed above we now consider a simple empirical example relating to the uncertainty of four different forward foreign exchange rates as predictors of the corresponding future spot rates. The data are opening bid prices from the New York Foreign Exchange Market from March 1, 1980 to February 2, 1989, and constitute a total of 462 weekly observations on the UK pound (UK), the West German Deutschemark (WG), the Swiss franc (SW), and the French franc (FR), all vis-a-vis the US dollar. The one-month-forward rates are taken on Tuesdays and the corresponding future spot rates four weeks and two days later on Thursdays;7 for a more detailed description of the data see Baillie and Bollerslev (1990a). Following Hansen and Hodrick (1980) and Baillie (1989), if the forward rate is an unbiased predictor of the future spot rate, but the sampling time interval of the data is finer than the maturity time of the forward contract, the forecast errors will be serially correlated. To take account of this fourth-order moving average error structure induced by the one-month-forward contracts and overlapping weekly data plus the volatility clustering, an MA(4)-GARCH(1, 1) model was estimated for each of the four currencies,
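$$E_{t-1}(y_t) = \mu_t = \mu + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \theta_3\varepsilon_{t-3} + \theta_4\varepsilon_{t-4} \tag{42}$$
and
$$\mathrm{Var}_{t-1}(y_t) = \sigma_t^2 = \omega + \alpha_1\varepsilon_{t-1}^2 + \beta_1\sigma_{t-1}^2, \tag{43}$$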
where $y_t = \log s_t - \log f_{t-4}$. The estimates reported in table 1 are maximum likelihood estimates obtained under the assumption of conditional normality.8
In accordance with the results in Baillie and Bollerslev (1990a), the estimates
7 This generally matches the forward rate with the spot rate in the future that would be used to cover an open position. However, this alignment could be one or two days off around the beginning of a new month; see Hodrick (1987).
8 For comparison purposes the numbers have been converted to monthly percentage rates by multiplication with 100.
Table 1
Maximum likelihood estimates.a

              UK                WG                SW                FR
$\mu$         -0.322 (0.305)    -0.765 (0.317)    -0.915 (0.344)    -0.487 (0.307)
$\theta_1$     0.906 (0.048)     0.925 (0.052)     0.930 (0.048)     0.928 (0.050)
$\theta_2$     0.796 (0.054)     0.819 (0.055)     0.826 (0.053)     0.825 (0.056)
$\theta_3$     0.768 (0.053)     0.754 (0.053)     0.784 (0.053)     0.742 (0.053)
$\theta_4$     0.310 (0.047)     0.298 (0.046)     0.325 (0.047)     0.312 (0.045)
$\omega$       0.158 (0.116)     1.118 (0.523)     0.899 (0.528)     0.960 (0.420)
$\alpha_1$     0.060 (0.029)     0.199 (0.076)     0.151 (0.062)     0.214 (0.074)
$\beta_1$      0.987 (0.050)     0.528 (0.168)     0.662 (0.143)     0.552 (0.137)
$\kappa_2$     4.087             4.367             3.643             3.778
Q(10)          6.670             9.150             6.293             6.269
Q*(10)         6.371             7.762             9.911            10.909

a Maximum likelihood estimates with asymptotic standard errors in parentheses. $\kappa_2$ gives the sample kurtosis for the standardized residuals. Q(10) and Q*(10) refer to the Ljung-Box portmanteau test for up to 10th-order serial correlation in the levels and the squares of the standardized residuals, respectively.
for the four MA coefficients in (42) are all reasonably close to the values implied by the unbiasedness hypothesis and a martingale spot price process, i.e., 0.837, 0.773, 0.686, and 0.258, respectively. In fact, for none of the four rates are these implied parameter values rejected by a formal likelihood ratio test constructed under the assumption of conditional normality. Additional diagnostic tests, including the portmanteau tests for remaining serial correlation in the levels and the squares of the standardized residuals reported in table 1, also indicate a reasonably good fit of the models for all four currencies.
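As a rough illustration of how multi-step forecast error variances follow from a model of the form (42)-(43), the sketch below combines the usual GARCH(1, 1) recursion $E_t(\sigma^2_{t+k}) = \omega + (\alpha_1 + \beta_1)E_t(\sigma^2_{t+k-1})$ with the moving-average weights of the prediction error, $\psi_0 = 1$, $\psi_i = \theta_i$ for $i \le 4$, and $\psi_i = 0$ otherwise. It is meant as a sketch of the standard variance formula $\sum_{i=0}^{s-1}\psi_i^2 E_t(\sigma^2_{t+s-i})$ and may differ in presentation from eq. (19). The function names and the starting value `sigma2_next` are hypothetical; the parameter values are the WG point estimates from table 1.

```python
def garch_variance_forecasts(omega, alpha, beta, sigma2_next, horizon):
    """Return [E_t(sigma^2_{t+1}), ..., E_t(sigma^2_{t+horizon})].

    sigma2_next is sigma^2_{t+1}, which is known at time t in a GARCH(1, 1).
    """
    forecasts = [sigma2_next]
    for _ in range(horizon - 1):
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


def prediction_error_variance(theta, omega, alpha, beta, sigma2_next, s):
    """Var_t(y_{t+s}) = sum_{i=0}^{s-1} psi_i^2 * E_t(sigma^2_{t+s-i})."""
    psi = [1.0] + list(theta)               # psi_0 = 1, psi_i = theta_i
    psi += [0.0] * max(0, s - len(psi))     # psi_i = 0 beyond the MA order
    sig2 = garch_variance_forecasts(omega, alpha, beta, sigma2_next, s)
    # sig2[k - 1] holds E_t(sigma^2_{t+k}), so t+s-i maps to index s-i-1.
    return sum(psi[i] ** 2 * sig2[s - i - 1] for i in range(s))


# WG point estimates from table 1; sigma2_next = 6.0 is a made-up value.
theta_wg = [0.925, 0.819, 0.754, 0.298]
for s in range(1, 7):
    var_s = prediction_error_variance(theta_wg, 1.118, 0.199, 0.528, 6.0, s)
    print(s, round(var_s, 3))
```

Because the MA order is four, the sum picks up no new $\psi_i$ terms beyond $s = 5$; at longer horizons the forecast variance changes only through the reversion of $E_t(\sigma^2_{t+k})$ toward the unconditional level $\omega/(1 - \alpha_1 - \beta_1)$.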
Optimal predictions for the MA(4)-GARCH(1, 1) model are readily available from (14), while the corresponding forecast errors and forecast error uncertainty are given by eqs. (18) and (19), respectively. Note that in this situation $\psi_i = \theta_i$ for $i = 1, \ldots, 4$ and $\psi_i = 0$ for $i > 4$. In order to succinctly summarize the empirical distribution of the forecast errors associated with the expected depreciation from each of the four models, the first three rows in table 2 report the average rejection frequencies across the one- through six-week forecast horizons obtained over the whole sample period using three
Table 2
One percent confidence intervals rejection frequencies.a

                      UK      WG      SW      FR
1980.3-1989.2
  Homoskedastic       1.52    1.56    1.30    1.52
  Heteroskedastic     1.41    1.96    1.70    1.93
  Cornish-Fisher      1.63    1.00    1.04    0.93
1985.2-1986.2
  Homoskedastic       5.45    3.21    3.53    4.49
  Heteroskedastic     0.32    4.49    3.21    3.85
  Cornish-Fisher      0.32    2.24    1.92    1.60

a Average rejection frequencies with 1% confidence intervals for one- through six-steps-ahead forecast horizons. Homoskedastic denotes the confidence interval constructed under the assumption of conditionally homoskedastic normal errors. Heteroskedastic gives the rejections with a conditional heteroskedastic normal confidence interval, while the Cornish-Fisher intervals adjust for up to the fourth conditional moment.
different 1% confidence intervals.9 In particular, the entries labelled ‘homoskedastic’ denote the rejections that occur with a homoskedastic normal confidence interval; i.e., $p = 0.995$, $E_t(\sigma^2_{t+i})$ fixed at the unconditional variance, and all the higher-order correction terms in (39) set equal to zero. Similarly, ‘heteroskedastic’ refers to the average rejection frequencies across the six horizons that obtain by allowing for the GARCH(1, 1) conditional heteroskedastic error structure, but omitting any of the higher-order correction terms for deviations from conditional normality in (39). Finally, the ‘Cornish-Fisher’ approximation adjusts for up to the fourth conditional moment in the prediction error distribution based on the sample kurtosis for the standardized residuals, i.e., $\kappa_2$ in table 1.
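The difference between the three intervals can be sketched in a few lines. The code below uses the textbook fourth-order Cornish-Fisher quantile adjustment in terms of skewness and excess kurtosis, which may differ in detail from eq. (39); the function names and all numerical inputs (point forecast, variances, excess kurtosis) are hypothetical and serve only to show how the three constructions relate.

```python
from statistics import NormalDist


def cornish_fisher_quantile(p, skewness=0.0, excess_kurtosis=0.0):
    """Approximate p-quantile of a standardized variable (mean 0, variance 1)."""
    z = NormalDist().inv_cdf(p)
    return (z
            + (z ** 2 - 1) * skewness / 6
            + (z ** 3 - 3 * z) * excess_kurtosis / 24
            - (2 * z ** 3 - 5 * z) * skewness ** 2 / 36)


def prediction_interval(point, variance, p=0.995,
                        skewness=0.0, excess_kurtosis=0.0):
    """Two-sided interval with tail probability 1 - p in each tail."""
    sd = variance ** 0.5
    lower = point + cornish_fisher_quantile(1 - p, skewness, excess_kurtosis) * sd
    upper = point + cornish_fisher_quantile(p, skewness, excess_kurtosis) * sd
    return lower, upper


# Hypothetical inputs for one forecast origin and one horizon.
point = 0.0
unconditional_var = 12.0      # used by the homoskedastic interval
conditional_var = 20.0        # s-step conditional variance in a volatile week
cond_excess_kurt = 0.8        # s-step conditional excess kurtosis

homoskedastic = prediction_interval(point, unconditional_var)
heteroskedastic = prediction_interval(point, conditional_var)
cornish_fisher = prediction_interval(point, conditional_var,
                                     excess_kurtosis=cond_excess_kurt)
print(homoskedastic, heteroskedastic, cornish_fisher)
```

With zero skewness and positive excess kurtosis the 0.995 quantile exceeds the normal value, so the Cornish-Fisher interval is wider than the heteroskedastic normal interval built from the same conditional variance.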
For all three methods and four currencies, the actual numbers of rejections are generally fairly close to the expected values. Interestingly, this is also true for the homoskedastic confidence intervals, since the unconditional sample distribution for the prediction errors is not markedly different from the normal in the present context. It is worth pointing out that although table 2 only reports the average number of rejections, no systematic pattern across the six forecast horizons is apparent for the four currencies. To illustrate, for the UK the actual numbers of rejections that occur with the Cornish-Fisher expansion for the six horizons are 7, 5, 5, 8, 10, and 9, respectively, compared to 8, 8, 4, 3, 2, and 2 for West Germany. Also, the results for other sized confidence intervals are very much in line with the findings reported in table 2. For instance, for the UK the average rejection frequencies for the whole
9 Six observations were excluded in the beginning and the end of the sample to allow for startup problems and predictions up to six steps ahead.
sample with a 5% confidence interval for each of the three different methods equal 4.63, 4.93, and 5.15, respectively.
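The horizon-by-horizon counts quoted above can be tied back to the averages in table 2 with a small calculation. The sketch assumes 450 usable forecast origins, i.e., the 462 weekly observations less six at each end of the sample; that count is an inference from the footnote on excluded observations rather than a figure stated in the text, and the function name is illustrative.

```python
def average_rejection_frequency(rejections, n_origins):
    """Percentage of rejections pooled over all forecast horizons."""
    return 100.0 * sum(rejections) / (len(rejections) * n_origins)


n_origins = 462 - 2 * 6    # assumed: six observations dropped at each end

uk = average_rejection_frequency([7, 5, 5, 8, 10, 9], n_origins)
wg = average_rejection_frequency([8, 8, 4, 3, 2, 2], n_origins)
print(f"{uk:.2f} {wg:.2f}")   # prints "1.63 1.00"
```

Under that assumption the counts reproduce the Cornish-Fisher entries for the UK and WG in the whole-sample panel of table 2.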
While the actual rejections for each of the three intervals are quite close over the entire sample period, the results for certain subsamples of the data are very different. To illustrate, consider the one-year period beginning February 26, 1985, corresponding to the peak of the US dollar against the deutschemark at 3.477 marks to the dollar. Over the following year, stimulated by the Plaza agreement on September 22, 1985, the dollar experienced a volatile but fairly systematic depreciation against most major currencies. From table 2 this increase in volatility resulted in far more rejections with the homoskedastic 1% confidence intervals over this one-year period than were to be expected. Whereas the heteroskedastic normal confidence intervals generally do somewhat better, it follows also that correcting for higher-order deviations from conditional normality, as in the Cornish-Fisher asymptotic expansion, may be very important under high volatility scenarios. These results for the 1% confidence intervals are also in line with the findings pertaining to other sized tests. For instance, for the UK over the 1985.2-1986.2 period the 5% intervals result in average rejection frequencies of 10.90, 3.85, and 4.17, respectively, for each of the three different methods.
10. Conclusion
This paper has considered predictions from a general dynamic time series model with ARMA disturbances and time-dependent conditional heteroskedasticity, as represented by a GARCH process. Tractable formulae for the minimum MSE predictor of both the future values of the conditional mean and conditional variance are presented. Expressions for all the exact moments of the multi-step forecast errors in the presence of GARCH(1, 1) are also derived, and it is shown how the Cornish-Fisher expansion can be used in approximating the forecast densities. As illustrated by the empirical example concerning the depreciation of exchange rates, these adjustment formulae can be especially useful in very volatile periods.
One potentially important issue not addressed relates to the effect of
parameter estimation. For the processes considered in this study, the information matrix is block-diagonal between the parameters in the conditional
mean and variance equations. This implies that the estimation uncertainty for
the conditional variance parameters is irrelevant to the asymptotic MSE for
predicting the conditional mean. However, adjustment of higher-order moments for this effect may be important when using asymptotic approximations
for the prediction density in small sample sizes. The practical importance of
this is hard to ascertain without a detailed Monte Carlo experiment, and is
left as an area for future research.
References
Abramowitz, M. and I.A. Stegun, 1972, Handbook of mathematical functions (Dover Publications, New York, NY).
Baillie, R.T., 1980, Predictions from ARMAX models, Journal of Econometrics 12, 365-374.
Baillie, R.T., 1987, Inference in dynamic models containing ‘surprise’ variables, Journal of Econometrics 35, 101-117.
Baillie, R.T., 1989, Econometric tests of rationality and market efficiency, Econometric Reviews 8, 151-186.
Baillie, R.T. and T. Bollerslev, 1990a, A multivariate generalized ARCH approach to modeling risk premia in forward foreign exchange rate markets, Journal of International Money and Finance 9, 309-324.
Baillie, R.T. and T. Bollerslev, 1990b, Prediction in dynamic models with time dependent conditional heteroskedasticity, Working paper no. 8815 (Department of Economics, Michigan State University, East Lansing, MI).
Barndorff-Nielsen, O.E. and D.R. Cox, 1989, Asymptotic techniques for use in statistics (Chapman and Hall, London).
Bera, A.K. and S. Lee, 1988, On the formulation of a general structure for conditional heteroskedasticity, Unpublished manuscript (Department of Economics, University of Illinois, Urbana-Champaign, IL).
Bera, A.K., S. Lee, and M. Higgins, 1990, Interaction between autocorrelation and conditional heteroskedasticity: A random coefficient approach, Unpublished manuscript (Department of Economics, University of Illinois, Urbana-Champaign, IL).
Bollerslev, T., 1986, Generalized autoregressive conditional heteroskedasticity, Journal of Econometrics 31, 307-327.
Bollerslev, T., 1987, A conditional heteroskedastic time series model for speculative prices and rates of return, Review of Economics and Statistics 69, 542-547.
Bollerslev, T., 1988, On the correlation structure for the generalized autoregressive conditional heteroskedastic process, Journal of Time Series Analysis 9, 121-131.
Bollerslev, T. and R.F. Engle, 1989, Common persistence in conditional variances, Unpublished manuscript (J.L. Kellogg Graduate School of Management, Northwestern University, Evanston, IL).
Bollerslev, T., R.Y. Chou, and K. Kroner, 1992, ARCH modeling in finance: A review of the theory and empirical evidence, Journal of Econometrics, this issue.
Box, G.E.P. and G.C. Tiao, 1976, Comparison of forecast and actuality, Journal of the Royal Statistical Society Series C 25, 195-200.
Chong, Y.Y. and D.F. Hendry, 1986, Econometric evaluation of linear macroeconomic models, Review of Economic Studies 53, 671-690.
Chou, R.Y., 1989, Volatility persistence and stock valuations: Some empirical evidence using GARCH, Journal of Applied Econometrics 3, 279-294.
Cornish, E.A. and R.A. Fisher, 1937, Moments and cumulants in the specification of distributions, Revue de l'Institut International de Statistique 5, 307-320.
Day, T.E. and C.M. Lewis, 1992, Stock market volatility and the information content of stock index options, Journal of Econometrics, this issue.
Diebold, F.X., 1988, Empirical modeling of exchange rate dynamics (Springer Verlag, New York, NY).
Engle, R.F., 1982, Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation, Econometrica 50, 987-1007.
Engle, R.F. and T. Bollerslev, 1986, Modelling the persistence of conditional variances, Econometric Reviews 5, 1-50.
Engle, R.F. and G. Gonzalez-Rivera, 1991, Semiparametric ARCH models, Journal of Business and Economic Statistics 9, 345-359.
Engle, R.F. and C.W.J. Granger, 1987, Cointegration and error correction: Representation, estimation and testing, Econometrica 55, 251-276.
Engle, R.F. and D.F. Kraft, 1983, Multiperiod forecast error variances of inflation estimated from ARCH models, in: Applied time series analysis of economic data (Bureau of the Census, Washington, DC).
Engle, R.F., D. Lilien, and R.P. Robins, 1987, Estimating time varying risk premia in the term structure: The ARCH-M model, Econometrica 55, 391-407.
Gallant, A.R. and G.E. Tauchen, 1989, Seminonparametric maximum likelihood estimation, Econometrica 55, 1091-1120.
Gallant, A.R., D.A. Hsieh, and G.E. Tauchen, 1990, On fitting a recalcitrant series: The pound/dollar exchange rate 1974-83, in: Nonparametric and semiparametric methods in econometrics and statistics (Cambridge University Press, Cambridge).
Geweke, J., 1989, Exact predictive densities for linear models with ARCH disturbances, Journal of Econometrics 40, 63-86.
Granger, C.W.J., H. White, and M. Kamstra, 1989, Interval forecasting: An analysis based upon ARCH-quantile estimators, Journal of Econometrics 40, 87-96.
Hansen, L.P. and R.J. Hodrick, 1980, Forward exchange rates as optimal predictors of future spot rates, Journal of Political Economy 88, 829-853.
Hodrick, R.J., 1987, The empirical evidence on the efficiency of forward and futures foreign exchange markets (Harwood Academic Publishers, Chur).
Kendall, M.G. and A. Stuart, 1969, The advanced theory of statistics, Vol. 2 (Griffin, London).
Lahiri, K., 1975, Multiperiod prediction in dynamic model, International Economic Review 16, 699-711.
Lamoureux, C.G. and W.D. Lastrapes, 1990, Forecasting stock return variance: Toward an understanding of stochastic implied volatilities, Unpublished manuscript (Department of Economics, University of Georgia, Atlanta, GA).
Lütkepohl, H., 1985, The joint asymptotic distribution of multistep prediction errors of estimated vector autoregressions, Economics Letters 17, 103-106.
Lütkepohl, H., 1988, Prediction tests for structural stability, Journal of Econometrics 39, 267-296.
Nelson, D.B. and C.Q. Cao, 1991, A note on the inequality constraints in the univariate GARCH model, Journal of Business and Economic Statistics, forthcoming.
Schwert, G.W., 1989, Business cycles, financial crises and stock volatility, Carnegie-Rochester Conference Series on Public Policy 31, 83-125.
Serfling, R.J., 1980, Approximation theorems in mathematical statistics (Wiley, New York, NY).
Yamamoto, T., 1981, Predictions of multivariate autoregressive moving average models, Biometrika 68, 485-492.