Chapter 13
Time series (2)
Introduction
• In this chapter we first consider the following questions:
– How can we tell whether a time series is stationary or not?
– If it is not, how do we go about turning it into a stationary
series?
• We also look at the Box-Jenkins approach to fitting and forecasting.
• Multivariate time series models and some special non-stationary
and non-linear time series models will also be introduced.
1 Compensating for trend and seasonality
• In this section we deal with possible sources of non-stationarity
and how to compensate for them.
• Three possible causes of non-stationarity:
1. a deterministic trend (e.g. exponential or linear growth)
2. a deterministic cycle (e.g. seasonal effect)
3. the time series is integrated
1.1 Detecting non-stationary series
• The most useful tools in identifying non-stationarity are the
simplest:
– a plot of the series against t, and
– the sample ACF
ρ̂k = γ̂k / γ̂0

where

γ̂k = (1/n) Σ_{t=k+1}^{n} (xt − µ̂)(xt−k − µ̂)   and   µ̂ = (1/n) Σ_{t=1}^{n} xt
• Plotting the series will highlight any obvious trends in the
mean and will show up any cyclic variation which could also
form evidence of non-stationarity.
• The sample ACF should, in the case of a stationary time series, ultimately converge towards zero.
If the sample ACF decreases slowly but steadily from a value
near 1, we would conclude that the data need to be differenced before fitting the model.
• See Fig.13.1 & Fig.13.2 on page 4.
• See Fig.13.3a and Fig.13.3b on pages 7-8.
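As a minimal sketch of these two diagnostic tools (assuming Python with numpy and matplotlib, and using a simulated random walk as a stand-in for real data), the snippet below plots a series against t and its sample ACF computed from the formulas above; the slowly decaying correlogram of the random walk is the signature of non-stationarity described here.

```python
import numpy as np
import matplotlib.pyplot as plt

def sample_acf(x, max_lag):
    """Sample ACF: rho_hat_k = gamma_hat_k / gamma_hat_0, using the formulas above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()
    gamma0 = np.sum((x - mu) ** 2) / n
    return np.array([np.sum((x[k:] - mu) * (x[:n - k] - mu)) / n / gamma0
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=200))   # integrated series: non-stationary
noise = rng.normal(size=200)             # stationary series for comparison

fig, ax = plt.subplots(2, 2, figsize=(10, 6))
ax[0, 0].plot(walk)
ax[0, 0].set_title("random walk: plot against t shows a trend")
ax[0, 1].stem(sample_acf(walk, 30))
ax[0, 1].set_title("SACF decays slowly from near 1: difference the data")
ax[1, 0].plot(noise)
ax[1, 0].set_title("stationary series: no trend")
ax[1, 1].stem(sample_acf(noise, 30))
ax[1, 1].set_title("SACF drops to near 0 quickly")
plt.tight_layout()
plt.show()
```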
• Methods for removing a linear trend.
– Least squares trend removal
– Differencing
• Methods for removing cycles (seasonal variation)
– Seasonal differencing
– Method of moving averages
– Method of seasonal means
1.2 Least squares trend removal
• The simplest way to remove a linear trend is by ordinary
least squares.
xt = a + bt + yt
where a and b are constants and yt is a zero-mean stationary
process. The parameters a and b can be estimated by linear
regression prior to fitting a stationary model to the residuals
yt .
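A minimal sketch of least squares trend removal, assuming numpy and a hypothetical observed series x; the residuals yt = xt − (â + b̂t) are what would then be modelled as a stationary process.

```python
import numpy as np

def remove_linear_trend(x):
    """Fit x_t = a + b*t by ordinary least squares and return (a_hat, b_hat, residuals)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1)
    b_hat, a_hat = np.polyfit(t, x, deg=1)    # polyfit returns slope first, then intercept
    y = x - (a_hat + b_hat * t)               # detrended series, modelled as stationary
    return a_hat, b_hat, y
```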
1.3 Differencing
• Differencing may well be beneficial if the sample ACF decreases slowly from a value near 1.
• It will also remove any linear trend:
xt = a + bt + yt
∇xt = b + ∇yt
• Least squares trend removal vs. differencing
1.4 Seasonal differencing
• Where seasonal variation is present in the data, one way of
removing it is to take a seasonal difference.
• Suppose that the time series {xt} records the monthly average temperature in London. A model of the form:
xt = µ + θt + yt
might be applied, where {θt} is a periodic function with period 12 and {yt} is a stationary series. Then the seasonal
difference of {xt} is defined as
∇12xt = xt − xt−12
and we see that:
∇12xt = xt − xt−12
= (µ + θt + yt) − (µ + θt−12 + yt−12)
= yt − yt−12
which is a stationary process, since θt = θt−12 by periodicity.
• We can then model the seasonal difference of {xt} as a stationary process and reconstruct the original process {xt} itself afterwards.
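A short sketch of the seasonal difference ∇12xt = xt − xt−12, assuming numpy and a hypothetical monthly series x:

```python
import numpy as np

def seasonal_difference(x, period=12):
    """Seasonal difference x_t - x_{t-period}; the first `period` observations are lost."""
    x = np.asarray(x, dtype=float)
    return x[period:] - x[:-period]
```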
1.5 Method of moving averages
• The method of moving averages makes use of a simple linear
filter to eliminate the effects of periodic variation.
• If {xt} is a time series with seasonal effects with even period
d = 2h, then we define a smoothed process {yt} by:
yt = (1/2h) [ (1/2)xt−h + xt−h+1 + · · · + xt−1 + xt + · · · + xt+h−1 + (1/2)xt+h ]
• The same can be done with odd periods d = 2h + 1:
yt = (1/(2h + 1)) [ xt−h + xt−h+1 + · · · + xt + · · · + xt+h−1 + xt+h ]
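A sketch of the moving-average filter for both even (d = 2h) and odd (d = 2h + 1) periods, assuming numpy; the half weights on the two end terms in the even case follow the formula above.

```python
import numpy as np

def moving_average_smooth(x, d):
    """Smooth out seasonal variation of period d with the weights given above."""
    x = np.asarray(x, dtype=float)
    if d % 2 == 0:
        # even period d = 2h: window of d + 1 terms with half weight on the two end points
        w = np.r_[0.5, np.ones(d - 1), 0.5] / d
    else:
        # odd period d = 2h + 1: equal weights 1/d on d consecutive terms
        w = np.ones(d) / d
    # 'valid' keeps only positions where the whole window fits inside the data
    return np.convolve(x, w, mode="valid")
```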
1.6 Method of seasonal means
• The simplest method for removing seasonal variation is to
subtract from each observation the estimated mean for that
period, obtained by simply averaging the corresponding observations in the sample.
• For example, when fitting the model
xt = µ + θt + yt
to a monthly time series {xt} extending over 10 years from
January 1990, the estimate for µ is x̄ and the estimate for
θJanuary is
θ̂January = (1/10)(x1 + x13 + x25 + · · · + x109) − µ̂
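A sketch of the method of seasonal means, assuming numpy and a hypothetical monthly array x covering a whole number of years (e.g. 120 observations from January 1990), following the estimate of θ̂January above.

```python
import numpy as np

def seasonal_means(x, period=12):
    """Estimate mu by the overall mean and theta_j by the mean over month j minus mu_hat."""
    x = np.asarray(x, dtype=float)          # assumes a whole number of years of data
    mu_hat = x.mean()
    # theta_hat[j] averages observations j, j + period, j + 2*period, ... then subtracts mu_hat
    theta_hat = np.array([x[j::period].mean() for j in range(period)]) - mu_hat
    deseasonalised = x - np.tile(theta_hat, len(x) // period)
    return mu_hat, theta_hat, deseasonalised
```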
1.7 Transformation of the data
• Diagnostic procedures such as an inspection of a plot of the
residuals may suggest that even the best-fitting standard linear time series model is failing to provide an adequate fit to
the data.
• Before attempting to use more advanced non-linear models
it is often worth transforming the data in some straightforward
way, in the hope of finding a data set on which the linear theory
will work properly.
• An example of a simple transformation would be Yt = log Xt,
which would be used to remove an exponential growth effect.
2 Identification of MA(q) and AR(p) models
• The treatment of this section assumes that the sequence of
observations {x1, x2, . . . , xn} may be presumed to come from
a stationary time series process.
2.1 Estimation of the ACF and PACF
• The autocovariance and autocorrelation functions play a central role in the analysis of time series.
The partial autocorrelation function is derived from the ACF.
• With a sequence of observations {x1, x2, . . . , xn}, estimating
the ACF of the time series process of which the data form a
realization is the first step in finding a time series model to fit
the sequence.
• The common mean of a stationary model can be estimated
using the sample mean:
µ̂ = (1/n) Σ_{t=1}^{n} xt
• The autocovariance function γk can be estimated using the
sample autocovariance function, denoted γ̂k , given by:
γ̂k = (1/n) Σ_{t=k+1}^{n} (xt − µ̂)(xt−k − µ̂)
• The autocorrelation function ρk can then be estimated by rk :

rk = γ̂k / γ̂0
• The collection {rk : k ∈ Z} is called the sample autocorrelation function (SACF). Every time series analysis involves
at least one plot of rk against k. Such a plot is called a
correlogram.
• The partial autocorrelation function φk can be estimated using rk .
• The resulting function {φ̂k : k ∈ Z+} is called the sample
partial autocorrelation function (SPACF).
The plot of φ̂k against k is called the partial correlogram.
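In practice the SACF and SPACF are usually computed and plotted with library routines; the sketch below assumes statsmodels and matplotlib are installed and uses simulated data as a stand-in for the observed series.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(1)
x = rng.normal(size=300)            # stand-in for the observed stationary series

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
plot_acf(x, lags=30, ax=axes[0])    # correlogram: r_k against k
plot_pacf(x, lags=30, ax=axes[1])   # partial correlogram: phi_hat_k against k
plt.show()
```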
2.2 Identification of white noise
• The verification of goodness of fit of any model should include
a test as to whether the residuals form a white noise process.
• The SACF and SPACF of a white noise process are themselves
random, being simple functions of the observations.
• Even if the original process were a perfectly standard white
noise, the SACF and SPACF would not be identically zero.
• An asymptotic result states that, if the original model is white
noise:
Xt = µ + et
then both rk and φ̂k are approximately normally distributed
with mean 0, variance 1/n for each k.
• Values of the SACF or SPACF falling outside the range from
−2/√n to 2/√n can be taken as suggesting that the white
noise model is inappropriate.
• Some care should be exercised: the cut-off points of ±2/√n
give approximate 95% limits, implying that about one value
in 20 will fall outside the range even when the white noise
model is correct.
This means that a single value of rk or φ̂k outside the
specified range would not be regarded as significant on its
own, but three such values might well be significant.
• A “portmanteau” test tells us that, if the white noise model
is correct, then:

n(n + 2) Σ_{k=1}^{m} rk² / (n − k) ∼ χ²_m

for each m.
• Question 13.1 on page 14.
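A sketch of the portmanteau (Ljung-Box) statistic above together with the ±2/√n check, assuming numpy and scipy and a hypothetical residual series e:

```python
import numpy as np
from scipy.stats import chi2

def portmanteau_test(e, m):
    """Ljung-Box statistic n(n+2) * sum_{k=1..m} r_k^2 / (n-k), referred to chi^2 with m df."""
    e = np.asarray(e, dtype=float)
    n = len(e)
    mu = e.mean()
    gamma0 = np.sum((e - mu) ** 2) / n
    r = np.array([np.sum((e[k:] - mu) * (e[:n - k] - mu)) / n / gamma0
                  for k in range(1, m + 1)])
    stat = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))
    p_value = chi2.sf(stat, df=m)
    n_outside = int(np.sum(np.abs(r) > 2 / np.sqrt(n)))   # the +-2/sqrt(n) check
    return stat, p_value, n_outside
```

A small p-value, or several values of rk outside ±2/√n, would indicate that the white noise model is inappropriate.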
2.3 Identification of MA(q)
• The distinguishing characteristic of MA(q) is that ρk = 0 for
all k > q.
• A test for the appropriateness of an MA(q) model is that rk
is close to 0 for all k > q.
• If the data really do come from an MA(q) model, the estimates
rk for k > q will be approximately normally distributed with
mean 0 and variance n⁻¹(1 + 2 Σ_{i=1}^{q} ρi²).
2.4 Identification of AR(p)
• The corresponding diagnostic procedure for an autoregressive
model is based on the sample partial ACF, since the PACF
of an AR(p) is distinctive, being equal to zero for k > p.
• The asymptotic variance of φ̂k is 1/n for each k > p.
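A rough sketch of these order-identification checks, assuming statsmodels and numpy; the bounds use the approximate variances just quoted (a Bartlett-type bound for the SACF, 1/n for the SPACF), and the choice of max_lag is arbitrary. An MA(q) is suggested if the SACF flags are all False beyond lag q, and an AR(p) if the SPACF flags are all False beyond lag p.

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

def significant_lags(x, max_lag=20):
    """Flag lags where r_k or phi_hat_k fall outside approximate two-standard-error bounds."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = acf(x, nlags=max_lag)        # r_0 = 1, r_1, ..., r_max_lag
    phi = pacf(x, nlags=max_lag)
    ma_flags, ar_flags = [], []
    for k in range(1, max_lag + 1):
        # Bartlett-type bound for r_k if the process were MA(k-1)
        se_r = np.sqrt((1 + 2 * np.sum(r[1:k] ** 2)) / n)
        ma_flags.append(bool(abs(r[k]) > 2 * se_r))
        # Var(phi_hat_k) ~ 1/n for k > p under an AR(p) model
        ar_flags.append(bool(abs(phi[k]) > 2 / np.sqrt(n)))
    return ma_flags, ar_flags
```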
3 Fitting a time series model using the Box-Jenkins methodology
• In this section we consider the general class of autoregressive integrated moving average models - the ARIMA(p, d, q)
models.
• We assume that historical data, comprising a time series {xt :
t = 1, 2, . . . , n} are given.
• We will also assume that deterministic trends and seasonal
effects have been removed from the data.
• No prior differencing of the process is assumed - deciding how
much to difference is part of the Box-Jenkins method.
3.1 The Box-Jenkins methodology
• The Box-Jenkins approach allows one to find an ARIMA
model which is reasonably simple and provides a sufficiently
accurate description of the behavior of the historical data.
• The main steps of the approach are:
– Tentative identification of a model from the ARIMA class
(Sec.3.2-Sec.3.3)
– Estimation of parameters in the identified model (Sec.3.4)
– Diagnostic checks (Sec.3.5)
3.2 Differencing
• An ARIMA(p, d, q) model is completely identified by the
choice of non-negative integer values for the parameters p,
d and q.
• The following principles can be used to choose the appropriate value of d.
– A time series {xt} can be modelled by a stationary ARMA
model if the sample autocorrelation function rk (ρ̂k ) decays
rapidly to zero with k.
If a slowly decaying positive sample autocorrelation function rk is observed, this should be taken to indicate that
the time series needs to be differenced to convert it into a
likely realization of a stationary random process.
– Let σ̂d² denote the sample variance of the process zt(d) =
∇d xt, i.e. the sample variance of the data after they have
been differenced d times.
It is normally the case that σ̂d² first decreases with d until
stationarity is achieved and then starts to increase.
Therefore d can be set to the value which minimizes σ̂d².
This could be d = 0 if the original time series {xt} is
already stationary.
• Example 13.3 on page 18.
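A sketch of this variance rule for choosing d, assuming numpy and a hypothetical series x; it differences the data repeatedly and reports the sample variance of each ∇d xt.

```python
import numpy as np

def choose_d(x, max_d=3):
    """Return the d minimising the sample variance of the d-times differenced data."""
    x = np.asarray(x, dtype=float)
    variances = [np.diff(x, n=d).var(ddof=1) if d > 0 else x.var(ddof=1)
                 for d in range(max_d + 1)]
    return int(np.argmin(variances)), variances
```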
3.3 Fitting an ARMA(p, q) model
• Suppose that
– appropriate value for the parameter d has been found
– time series {zd+1, zd+2, . . . , zn} is adequately stationary
– sample mean of the {zt} sequence is zero
– for the sake of simplicity, let d = 0
• In the framework of the Box-Jenkins approach we try to find
an ARMA(p, q) model which fits the data {zt}.
– If either the correlogram or the partial correlogram appears to be close to zero for all sufficiently large k, a pure MA(q)
or AR(p) model respectively is indicated.
– Otherwise we should look for an ARMA(p, q) model with
non-zero values of p and q.
– We start with a simple model like ARMA(1,1) and
work up to more complicated models if the simple ones
are inadequate.
– Akaike’s Information Criterion (AIC)
We should only consider adding an extra parameter if this results in a reduction of the residual sum
of squares by a factor of at least e^(−2/n).
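A sketch of working up from simple models and comparing fits, assuming statsmodels is installed and using simulated data as a stand-in for the mean-zero stationary series {zt}; ARIMA with d = 0 fits an ARMA(p, q) model, and the comparison here uses the AIC value reported by the library rather than the residual-sum-of-squares form quoted above.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
z = rng.normal(size=300)                   # stand-in for the differenced, mean-zero data

candidates = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]   # (p, q) pairs, simplest first
aic = {}
for p, q in candidates:
    aic[(p, q)] = ARIMA(z, order=(p, 0, q)).fit().aic   # smaller AIC is preferred
best_p, best_q = min(aic, key=aic.get)
print(aic, "chosen (p, q):", (best_p, best_q))
```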
3.4 Parameter estimation
• Once the values of p and q have been identified, the problem becomes one of estimating the parameters of the
ARMA(p, q) model.
• least squares estimation
• maximum likelihood estimation
• method of moments estimation
– calculate the theoretical ACF of the ARMA(p, q) process in terms of the unknown parameters
– calculate the sample ACF from the data
– choose the parameter values so that ρ1, . . . , ρp+q coincide with r1, . . . , rp+q
• Example 13.4 on page 21.
• The variance σ² of the white noise {et} can be estimated by:

σ̂² = (1/n) Σ_{t=p+1}^{n} êt²
   = (1/n) Σ_{t=p+1}^{n} (zt − α̂1zt−1 − · · · − α̂pzt−p − β̂1êt−1 − · · · − β̂q êt−q)²

where êt denotes the residual at time t.
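As an illustration of method of moments estimation in the pure AR(p) case (a sketch assuming numpy; the Yule-Walker equations equate the theoretical ACF to r1, . . . , rp, while MA terms would require solving non-linear equations numerically):

```python
import numpy as np

def yule_walker_ar(x, p):
    """Method-of-moments (Yule-Walker) estimates for an AR(p) fitted to data x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()
    gamma = np.array([np.sum((x[k:] - mu) * (x[:n - k] - mu)) / n for k in range(p + 1)])
    r = gamma / gamma[0]                                   # sample ACF r_0, ..., r_p
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    alpha_hat = np.linalg.solve(R, r[1:])                  # equate theoretical ACF to sample ACF
    sigma2_hat = gamma[0] * (1 - alpha_hat @ r[1:])        # innovation variance estimate
    return alpha_hat, sigma2_hat
```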
3.5 Diagnostic checking
• If the ARMA(p, q) model is a good approximation to the
underlying time series process, then the residuals {êt} will
form a good approximation to a white noise process.
• Inspection of the graph of {êt}
If any pattern is evident, whether in the average
level of the residuals or in the magnitude of the fluctuations about 0, this should be taken to mean that the
model is inadequate.
• Inspection of the sample autocorrelation functions of {êt}
– SACF & SPACF
– χ2 statistic
– Counting turning points
∗ turning point
∗ the number of turning points
∗ T ∼ N( (2/3)(N − 2), (16N − 29)/90 ) approximately
∗ 95% confidence interval:

[ (2/3)(N − 2) − 1.96 √((16N − 29)/90) , (2/3)(N − 2) + 1.96 √((16N − 29)/90) ]
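A sketch of the turning-point check, assuming numpy and a hypothetical residual series e; it counts turning points and compares the count with the approximate 95% interval above.

```python
import numpy as np

def turning_point_test(e):
    """Count turning points T and return the approximate 95% interval under white noise."""
    e = np.asarray(e, dtype=float)
    N = len(e)
    # a turning point at t: e_t strictly greater or strictly smaller than both neighbours
    up = (e[1:-1] > e[:-2]) & (e[1:-1] > e[2:])
    down = (e[1:-1] < e[:-2]) & (e[1:-1] < e[2:])
    T = int(np.sum(up | down))
    mean = 2 * (N - 2) / 3
    sd = np.sqrt((16 * N - 29) / 90)
    return T, (mean - 1.96 * sd, mean + 1.96 * sd)
```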
4 Forecasting
4.1 Box-Jenkins approach to forecasting stationary time series
• Using the Box-Jenkins approach, forecasting is relatively straightforward.
• Having fitted an ARMA model to the data {X1, . . . , Xn} we
have the equation:
Xn+k = µ + α1 (Xn+k−1 − µ) + · · · + αp (Xn+k−p − µ) + en+k + β1 en+k−1 + · · · + βq en+k−q
• The forecast value of Xn+k given all observations up until
time n, known as the k-step ahead forecast and denoted
x̂n(k), is obtained from the above equation by:
– replacing all (unknown) parameters by their estimated
values;
– replacing the random variables X1, . . . , Xn by their observed values x1, . . . , xn;
– replacing the random variables Xn+1, . . . , Xn+k−1 by their
forecast values x̂n(1), . . . , x̂n(k − 1);
– replacing the innovations e1, . . . , en by the residuals ê1, . . . , ên;
– replacing the random variables en+1, . . . , en+k by their
expectations, 0.
• The one-step ahead and two-step ahead forecasts for an AR(2)
are given by:
x̂n(1) = µ̂ + α̂1(xn − µ̂) + α̂2(xn−1 − µ̂)
x̂n(2) = µ̂ + α̂1(x̂n(1) − µ̂) + α̂2(xn − µ̂)
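A sketch of these one- and two-step ahead AR(2) forecasts, assuming hypothetical fitted values µ̂, α̂1, α̂2 and the last two observations (plain Python, no imports needed):

```python
def ar2_forecasts(x_n, x_n_minus_1, mu_hat, a1_hat, a2_hat):
    """Return (x_hat_n(1), x_hat_n(2)) for a fitted AR(2), as in the equations above."""
    f1 = mu_hat + a1_hat * (x_n - mu_hat) + a2_hat * (x_n_minus_1 - mu_hat)
    f2 = mu_hat + a1_hat * (f1 - mu_hat) + a2_hat * (x_n - mu_hat)
    return f1, f2
```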
• The two-step ahead forecast for an ARMA(2,2) is given by:
x̂n(2) = µ̂ + α̂1(x̂n(1) − µ̂) + α̂2(xn − µ̂) + β̂2ên
• The k-step ahead forecast is essentially the conditional expectation of the future value of the process given all the information currently available.
E[Xn+2 |X0 , . . . , Xn ]
= E[µ + α1 (Xn+1 − µ) + α2 (Xn − µ) + en+2 + β1 en+1 + β2 en |X0 , . . . , Xn ]
• A point estimate of Xn+k is less useful than a confidence
interval, for which an estimate of the variance is required.
• A comparison of Xn+1 with x̂n(1) shows that the difference
between them arises from numerous sources, including en+1,
differences between true values of parameters and their estimates, and differences between true values of the {et} and
the residuals {êt} which are used to estimate them.
• Calculation of the prediction variance in any given case is
complicated.
• In general it is possible to state that the variance of the k-step
ahead estimator is relatively small for small values of k and
converges, for large k, to γ0, the variance of the stationary
process {Xt}.
4.2 Forecasting ARIMA processes
• If {Xt} is an ARIMA(p, d, q) process, then Zt = ∇dXt is
ARMA(p, q).
• If {Xt} is an ARIMA(1, 1, 1) process, then Zt = ∇Xt =
Xt − Xt−1 is ARMA(1, 1). So
Xt = Xt−1 + Zt
Xn+1 = Xn + Zn+1
x̂n(1) = xn + ẑn(1)
• An ARIMA(p, d, q) process with d > 0 is not stationary and
therefore has no stationary variance.
• The prediction variance for the k-step ahead forecast increases to infinity as k increases.
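A sketch of recovering forecasts of an ARIMA(p, 1, q) process from forecasts of the differenced series, assuming numpy and a hypothetical list of ARMA forecasts ẑn(1), ẑn(2), . . . ; each x̂n(k) is xn plus the cumulative sum of the differenced-series forecasts.

```python
import numpy as np

def undifference_forecasts(x_n, z_forecasts):
    """Turn forecasts of z_t = x_t - x_{t-1} into forecasts of x_t by cumulative summation."""
    return x_n + np.cumsum(np.asarray(z_forecasts, dtype=float))

# example: last observation 10.0 with hypothetical forecasts of the differences
print(undifference_forecasts(10.0, [0.5, 0.3, 0.2]))   # -> [10.5  10.8  11. ]
```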
4.3 Exponential smoothing (TO BE OMITTED)
4.4 Linear filters (TO BE OMITTED)
5 Multivariate time series models (TO BE OMITTED)
5.1 Vector autoregressions
5.2 Cointegrated time series
6 Some special non-stationary and non-linear time series models (TO BE OMITTED)
6.1 Bilinear models
6.2 Threshold autoregressive models
6.3 Random coefficient autoregressive models
6.4 Autoregressive models with conditional heteroscedasticity