A maximum likelihood short-cut to the Chow-Lin procedure
Christian Müller∗
Zurich University of Applied Sciences
School of Management and Law
CH-8401 Winterthur, Switzerland
Tel.: +41 58 934 68 87
Fax: +41 58 935 68 87
Email: [email protected]
Abstract
Economists and econometricians very often work with data that have been temporally disaggregated prior to use. Building on Chow and Lin's (1971) popular disaggregation model, this paper proposes a novel, handy maximum likelihood estimation approach. An advantage of the new method over Chow and Lin (1971) is that the econometric model can be estimated at the aggregate level without resorting to the disaggregate level while iterating. A simulation study and an example application illustrate the findings.
JEL classification: F31, F47, C53
Keywords: temporal disaggregation, restricted ARMA
1 Introduction
Quantitative economic analysis very often has to rely on data whose observation frequency
is systematically lower than desired. For example, economic activity which is commonly
expressed as the flow of value added is generated continuously. However, as it would
require enormous resources to actually observe this process, most countries use annual
estimates of economic activity as the basis of their statistics. In contrast, many other
variables such as money stock and interest rates are available at a far higher frequency
(and can often also be observed more accurately). Nevertheless, researchers, policy makers
and the public all have a genuine interest in high frequency information on low frequency
data for efficient and timely decision making. Therefore, statistical offices all around the
world work on providing temporally disaggregated data to serve this aim. Statisticians
at the European Commission have even developed a free software tool (ECOTRIM) for
conducting disaggregation.
This paper proposes a novel maximum likelihood estimation method for the popular
disaggregation procedure due to Chow and Lin (1971), henceforth CL. An advantage of
the proposal is that estimation can be pursued using the aggregated, observable data.
Perhaps surprisingly, the basic CL approach continues to enjoy huge popularity. According to Santos Silva and Cardoso (2001, p. 269) it is still the most widely used method for disaggregating time series. One reason for its attractiveness is certainly its comparatively simple structure. Further, in many practical applications the number of data points of the aggregated time series is very small, which rules out models with a sophisticated dynamic structure as well as models that rest on very many time series, such as mixed frequency dynamic factor models.
The next section reviews the CL disaggregation approach, gives reasons why it is still very popular despite its numerous alternatives, and thus sets the framework of the analysis. The third section describes the new estimation approach and reports simulation results. Finally, an example is provided and conclusions are drawn.
2 Chow&Lin revisited
2.1 The basic model
Suppose the following data generating process for the high frequency data:
$$Y_h = X_h \beta + U_h, \qquad U_h \sim (0, \Sigma_h). \tag{1}$$
The endogenous variable, $Y_h = (y_{h,1}, y_{h,2}, \ldots, y_{h,T})'$, is assumed to depend on an exogenous variable, or a set of exogenous variables, $X_h = (x_{h,1}, x_{h,2}, \ldots, x_{h,T})'$, and an innovation process, $U_h$. The subscript $h$ indicates high frequency. The general idea is to use high frequency information on $x$ and an estimate of $\Sigma_h$ to obtain estimates for $y_{h,t}$, which is not directly observable. In order to estimate $y_{h,t}$, CL suggest a particular structure for $\Sigma_h$. Denoting $U_h = (u_{h,1}, u_{h,2}, \ldots, u_{h,T})'$, they suggest considering the stationary process
$$u_{h,t} = \rho u_{h,t-1} + \varepsilon_{h,t}, \tag{2}$$
$$|\rho| < 1, \qquad \varepsilon_{h,t} \sim \text{i.i.d.}(0, \sigma_h^2). \tag{3}$$
It is worth noting that (1) is suitable for both stationary and non-stationary variables $Y_h$ and $X_h$ as long as (3) holds. If $y_{h,t}$ and $x_{h,t}$ are both nonstationary, then under (3) they are cointegrated in the sense of Engle and Granger (1987). Because cointegration is now a well understood property of many fundamental economic relationships, (1) represents a very attractive approach to the disaggregation of low frequency data. Furthermore, if (1) is a cointegration relationship, forecasts of $y_{h,t}$ based on $x_{h,t}$ will in general outperform forecasts which are not based on cointegration relations, at least at longer horizons. In contrast, the proposals by Fernandez (1981) and Litterman (1983) suggest nonstationary processes in (2), which generally result in 'smoother' high frequency estimates at the expense of forecasting performance. The latter property is important because many disaggregation exercises serve the provision of early estimates of high frequency information on $y$, such as quarterly GDP estimates. These are usually constructed on the basis of forecasted values of $y$, and hence large forecast errors imply correspondingly large revisions later on. Yet another alternative has been discussed by Santos Silva and Cardoso (2001). They add a lag polynomial of the exogenous variable to the model and restrict its coefficients. The model thus obtained is more general than the original CL model, resulting in better data fits in terms of standard errors of residuals.
A more recent survey of the aggregation literature and methods is provided by Silvestrini and Veredas (2005), who also offer generalisations to general ARFIMA and to GARCH processes. For the purpose of the current paper it shall be assumed that the variables $x$ and $y$ are both integrated of the same order, and hence cointegrated under (2) if appropriate.
In the CL approach (1) is transformed into a low frequency model with observable data series $y_{l,t}$, $x_{l,t}$ and the error process $\varepsilon_{l,t}$. In the following, the focus will be on the temporal aggregation, which is accomplished by pre-multiplying (1) by a matrix $C_m$ of dimension $(T/m \times T)$ where
$$C_m = \begin{bmatrix}
1_{1\times m} & 0_{1\times m} & \cdots & 0_{1\times m} \\
0_{1\times m} & 1_{1\times m} & \cdots & 0_{1\times m} \\
\vdots & \vdots & \ddots & \vdots \\
0_{1\times m} & \cdots & \cdots & 1_{1\times m}
\end{bmatrix}$$
and $m$ is the number of high frequency observations that are temporally aggregated to yield the low frequency data. For example, if temporal aggregation of quarterly to annual data is considered, $m = 4$ would be chosen.
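The block structure of the aggregation matrix can be illustrated with a few lines of Python. This is only a minimal sketch assuming NumPy; the function name `aggregation_matrix` and the toy series are hypothetical and not part of the original paper.

```python
import numpy as np

def aggregation_matrix(n_low: int, m: int) -> np.ndarray:
    """Return C_m with shape (n_low, n_low * m).

    Row i contains m consecutive ones, so that C_m @ y_h sums the m
    high frequency observations belonging to low frequency period i.
    """
    return np.kron(np.eye(n_low), np.ones((1, m)))

# Toy example: 12 quarterly values aggregated to 3 annual sums (m = 4).
C = aggregation_matrix(3, 4)
y_h = np.arange(1.0, 13.0)      # 1, 2, ..., 12
y_l = C @ y_h
print(y_l)                      # [10. 26. 42.]
```

The Kronecker product of an identity matrix with a row of ones reproduces the block layout shown above, which is why it is used here instead of an explicit loop.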
After temporal aggregation (1) becomes
$$Y_l = X_l \beta + U_l \tag{4}$$
with
$$Y_l = C_m Y_h, \qquad X_l = C_m X_h, \qquad U_l = C_m U_h,$$
and thus $Y_l = (y_{l,1}, \ldots, y_{l,\tau}, \ldots)'$, $X_l = (x_{l,1}, \ldots, x_{l,\tau}, \ldots)'$, and $U_l = (u_{l,1}, \ldots, u_{l,\tau}, \ldots)'$
are temporal aggregates. It is important to notice that the aggregation only affects the
parameters of the error process while the parameter characterising the linear relationship
between dependent and independent variable is still completely described by β. The latter
is thus independent of Cm for all m.
2.2 Estimation
Chow and Lin (1971) suggest estimating $\rho$ and $\beta$ subject to the aggregation constraint (4). They propose the following feasible generalised least squares (GLS) estimate:
$$\hat\beta_{CL} = \left( X_l' \Sigma_l^{-1} X_l \right)^{-1} X_l' \Sigma_l^{-1} Y_l, \qquad \Sigma_l = E(U_l U_l'). \tag{5}$$
The key element in the estimation is the matrix $\Sigma_l$, which can be obtained by considering that
$$\Sigma_l = E(U_l U_l') = E(C_m U_h U_h' C_m') = C_m \Sigma_h C_m'.$$
Since the structure of $\Sigma_h$ is known by assumption and $C_m$ by construction, $\Sigma_l$ is also identified up to $\rho$ and $\sigma_h^2$. In general, as for example Marcellino (1999) and Wei (2006) have shown, the elements of $\Sigma_l$ are nonlinear functions of $\rho$ and $m$. Notice, however, that estimation is feasible by a sequence of linear regressions where only an initial value for $\rho$ is required. Subsequent updates of $\rho$ can be based on a regression of $\hat u_{h,t}$ on $\hat u_{h,t-1}$, where $\hat u_{h,t}$ is the estimated high frequency residual. The details of this calculation are not repeated here; the reader is referred to Chow and Lin (1971, p. 373). The resulting estimates for $Y_h$ have desirable properties such as being BLUE.
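For a fixed value of $\rho$, the GLS step in (5) can be sketched as follows. This is an illustration under stated assumptions, not Chow and Lin's original routine: it assumes NumPy, a stationary AR(1) covariance for $\Sigma_h$, and hypothetical helper names (`ar1_cov`, `gls_beta`). The iterative update of $\rho$ from high frequency residuals is deliberately left out.

```python
import numpy as np

def ar1_cov(T, rho, sigma2=1.0):
    """Covariance matrix Sigma_h of a stationary AR(1) process of length T."""
    idx = np.arange(T)
    return sigma2 / (1.0 - rho**2) * rho ** np.abs(idx[:, None] - idx[None, :])

def gls_beta(y_l, X_l, m, rho):
    """GLS estimate (5) on aggregate data for a given rho.

    sigma_h^2 cancels in the estimator, so it is set to one.
    """
    n = len(y_l)
    C = np.kron(np.eye(n), np.ones((1, m)))      # aggregation matrix C_m
    Sigma_l = C @ ar1_cov(n * m, rho) @ C.T      # Sigma_l = C_m Sigma_h C_m'
    W = np.linalg.solve(Sigma_l, X_l)            # Sigma_l^{-1} X_l
    beta = np.linalg.solve(X_l.T @ W, W.T @ y_l)
    return beta, Sigma_l

# Illustrative data: 10 annual observations built from 40 quarterly ones.
rng = np.random.default_rng(0)
X_h = np.column_stack([np.ones(40), rng.standard_normal(40)])
C = np.kron(np.eye(10), np.ones((1, 4)))
X_l = C @ X_h
y_l = X_l @ np.array([1.0, 1.5])                 # noise-free, so beta is recovered exactly
beta_hat, Sigma_l = gls_beta(y_l, X_l, m=4, rho=0.5)
print(beta_hat)
```

The noise-free toy data are chosen only so that the result is easy to check; with real data the estimate depends on the value of $\rho$ supplied.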
A clear disadvantage of Chow and Lin's (1971) GLS approach is the need for repeated estimation of $U_h$ in order to update $\rho$ until convergence is achieved. Hence, it is not possible to work with the aggregate data alone until the optimal model and parameters are chosen.
The opportunity to first evaluate $\beta$ estimates for various exogenous variables before actually choosing the most appropriate one for the disaggregation appears very desirable in applications. It would allow the most suitable variable to be selected without having to calculate the actual disaggregate data. A key contribution of this paper is to provide an alternative that permits proceeding in these two steps. The analysis then rests entirely on the aggregate level, which significantly simplifies data management.
3 An alternative to the Chow&Lin estimation approach
3.1 The basic idea
In the following, an alternative estimation procedure using only the temporally aggregated data is described. It starts with the formulation of a general model against which the temporally aggregated model appears as a restricted version.¹
¹ In principle, this allows the definition of a likelihood ratio test to check the appropriateness of the temporal aggregation. However, this will be the topic of a follow-up project.
I start by repeating that the aggregation restriction can alternatively be expressed as a restriction on the error process $U_l$. Following Wei (2006), temporal aggregation of an autoregressive process $U_h$ of order one results in a new error process which can be described as an autoregressive-moving average process. The order of the autoregressive as well as of the moving average component is one (ARMA(1,1)), and the coefficients will be labelled $\rho^*$ and $\phi^*$ respectively. Given that $\beta$ is invariant to $C_m$, the aggregation restriction is thus a restriction on the parameters of the low frequency error process, $U_l$.
The estimation will be based on the explicit representation of the aggregate process, which requires the derivation of its parameters, that is $\rho^*$, $\phi^*$ and the variance. Again, Wei (1990, 2006) has suggested a way to solve this problem by looking at the relationship between the autocovariance generating functions of the disaggregate and the aggregate processes. While Wei provides the general approach, Silvestrini and Veredas (2005) offer more details for calculating the particular formulas for the aggregation in question. A general pattern is a nonlinearity in the relation between the coefficients of the disaggregate and aggregate models.
Interestingly, most authors have focussed almost entirely on the mapping of the disaggregate model's parameters onto the parameters of its aggregate counterpart. Disaggregation, in contrast, requires going in the opposite direction, that is from the aggregate to the disaggregate model.
3.2 A general formula
For the problem at hand I provide an explicit derivation of the ARMA(1,1) coefficients which closely follows Wei's (1990, 2006) procedure. The value added of the special case I consider is the provision of a handy solution to the task for arbitrary $m$.
I outline the derivation of $\phi^*$ as a function of $m$ given the aggregation problem described above. The basic idea is to express $\rho^*$ and $\phi^*$ in terms of $\rho$ and $m$. As a notational convention I will use $\tau = mt$ and $\tau + j = mt + jm$ in the following. The starting point is the notion that any $u_{h,t}$ can be written as
$$u_{h,t} = \rho^n u_{h,t-n} + \rho^{n-1}\varepsilon_{h,t-n+1} + \rho^{n-2}\varepsilon_{h,t-n+2} + \cdots + \varepsilon_{h,t} \tag{6}$$
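which implies for temporal aggregation over $m$ periods,
$$\begin{aligned}
u_{h,t} + u_{h,t-1} + \cdots + u_{h,t-m+1}
 ={} & \rho^m u_{h,t-m} + \rho^{m-1}\varepsilon_{h,t-m+1} + \rho^{m-2}\varepsilon_{h,t-m+2} + \cdots + \varepsilon_{h,t} \\
 & + \rho^m u_{h,t-m-1} + \rho^{m-1}\varepsilon_{h,t-m} + \rho^{m-2}\varepsilon_{h,t-m+1} + \cdots + \varepsilon_{h,t-1} \\
 & \;\;\vdots \\
 & + \rho^m u_{h,t-2m+1} + \rho^{m-1}\varepsilon_{h,t-2m+2} + \rho^{m-2}\varepsilon_{h,t-2m+3} + \cdots + \varepsilon_{h,t-m+1} \\
 ={} & \rho^m \left( u_{h,t-m} + u_{h,t-m-1} + \cdots + u_{h,t-2m+1} \right) + z_{l,\tau}.
\end{aligned}$$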
The aggregated process can now more compactly be written as
$$u_{l,\tau} = \rho^m u_{l,\tau-1} + z_{l,\tau} \tag{7}$$
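where the error term $z_{l,\tau}$ has the autocorrelation function of an MA(1) process and can be written as
$$z_{l,\tau} = \phi^* \varepsilon_{l,\tau-1} + \varepsilon_{l,\tau}, \qquad \varepsilon_{l,\tau} \sim \text{i.i.d.}(0, \sigma_l^2),$$
with
$$E(z_{l,\tau} z_{l,\tau-s}) = \begin{cases}
(1 + \phi^{*2})\sigma_l^2 & \text{for } s = 0, \\
\phi^* \sigma_l^2 & \text{for } s = \pm 1, \\
0 & \text{else.}
\end{cases} \tag{8}$$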
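The first and second line can alternatively be expressed in terms of $\sigma_h^2$. For $m = 4$, for example,
$$(1 + \phi^{*2})\sigma_l^2 = \sigma_h^2 S_0 = \sigma_h^2 \left( 4 + 6\rho + 8\rho^2 + 8\rho^3 + 8\rho^4 + 6\rho^5 + 4\rho^6 \right),$$
$$\phi^* \sigma_l^2 = \sigma_h^2 S_1 = \sigma_h^2 \left( \rho + 2\rho^2 + 4\rho^3 + 2\rho^4 + \rho^5 \right),$$
giving rise to
$$(1 + \phi^{*2})\sigma_l^2 = \sigma_h^2 S_0, \qquad \phi^* \sigma_l^2 = \sigma_h^2 S_1,$$
where use has been made of (7). The solution for $\phi^*$ can consequently be given as²
$$\phi^* = \begin{cases}
\dfrac{S_0}{2S_1} \pm \sqrt{\dfrac{S_0^2}{4S_1^2} - 1} & \text{for } 0 < |\rho| < 1, \\
0 & \text{for } \rho = 0.
\end{cases} \tag{9}$$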
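Thus, the solution for $\phi^*$ follows directly from the quantities $S_0$ and $S_1$. These can be calculated for general $m$ in the following way. Consider again $z_{l,\tau}$. It is the sum of the elements of the $(m \times m)$ matrix $\Phi_\tau$:
$$\Phi_\tau = \begin{bmatrix} \varepsilon_{h,t} & \varepsilon_{h,t-1} & \ldots & \varepsilon_{h,t-m+1} \end{bmatrix}' \otimes \begin{bmatrix} (\rho L)^{m-1} & (\rho L)^{m-2} & \ldots & (\rho L)^0 \end{bmatrix} \tag{10}$$
which makes use of the lag operator, $L$, with $L^i x_t = x_{t-i}$. It is instructive to expand $\Phi_\tau$:
$$\Phi_\tau = \begin{bmatrix}
\rho^{m-1}\varepsilon_{h,t-m+1} & \rho^{m-2}\varepsilon_{h,t-m+2} & \rho^{m-3}\varepsilon_{h,t-m+3} & \cdots & \rho^0\varepsilon_{h,t} \\
\rho^{m-1}\varepsilon_{h,t-m} & \rho^{m-2}\varepsilon_{h,t-m+1} & \rho^{m-3}\varepsilon_{h,t-m+2} & \cdots & \rho^0\varepsilon_{h,t-1} \\
\rho^{m-1}\varepsilon_{h,t-m-1} & \rho^{m-2}\varepsilon_{h,t-m} & \rho^{m-3}\varepsilon_{h,t-m+1} & \cdots & \rho^0\varepsilon_{h,t-2} \\
\vdots & \vdots & \vdots & & \vdots \\
\rho^{m-1}\varepsilon_{h,t-2m+2} & \rho^{m-2}\varepsilon_{h,t-2m+3} & \rho^{m-3}\varepsilon_{h,t-2m+4} & \cdots & \rho^0\varepsilon_{h,t-m+1}
\end{bmatrix} \tag{11}$$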
This matrix has an interesting structure. In particular, the innovations with identical time subscripts are found along the diagonals. Hence, the variance of $z_{l,\tau}$ is calculated by adding the coefficients along each diagonal, squaring these sums, and then adding the squares over all diagonals. At the same time the power to which $\rho$ is raised is the same within each column. Therefore, every secondary diagonal can be regarded as a truncated version of the main diagonal with respect to the powers of $\rho$. The following auxiliary matrices and operator are useful in finding handy expressions. Let me use the operator diag which stacks the main diagonal of a square matrix into a vector. Hence,
$$\Psi \equiv 1_{m\times 1} \otimes \begin{bmatrix} \rho^{m-1} & \rho^{m-2} & \ldots & \rho^0 \end{bmatrix},$$
$$\mathrm{diag}(\Psi) = \begin{bmatrix} \rho^{m-1} & \rho^{m-2} & \rho^{m-3} & \cdots & \rho^0 \end{bmatrix}' \equiv \psi',$$
$$H \equiv \begin{bmatrix}
1 & 1 & \cdots & 1 & 1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 1 & 1 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 1 & 1 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 1 & 1 & \cdots & 1 & 1
\end{bmatrix},$$
where $\mathrm{diag}(\Psi)$ and $H$ have dimensions $(m \times 1)$ and $(m \times (2m-1))$ respectively, $\psi$ is the corresponding $(1 \times m)$ row vector, and $1_{m\times 1}$ is an $(m \times 1)$ vector of ones. Notice that each row of $H$ consists of a block of $m$ consecutive ones within a row of zeros, and that in each successive row the block of ones is shifted one column to the right. The product $\psi H$ now conveniently collects the $2m-1$ sums of the diagonal elements of $\Phi_\tau$ in a $(1 \times (2m-1))$ vector, omitting the innovation terms for the sake of simplicity. The variance of $z_{l,\tau}$ can now be obtained as
$$S_0 \equiv \psi H H' \psi', \qquad E(z_{l,\tau} z_{l,\tau}) = \sigma_h^2 S_0,$$
which makes use of the i.i.d. property of the $\varepsilon_{h,t}$.
² This procedure parallels Silvestrini and Veredas's (2005) approach in that it treats the AR and MA parts separately. It differs, though, in that the aggregate parameters are assumed known while the disaggregate parameters are unknown.
For deriving $S_1$, decompose $H = (h_1, 1_{m\times 1}, h_2)$ where $h_1$ and $h_2$ are $(m \times (m-1))$ matrices collecting the sums of the diagonal elements below and above the main diagonal respectively. Consider now $\Phi_{\tau-1} = L^m \Phi_\tau$, whose sum of elements defines $z_{l,\tau-1}$. The value of $S_1$ is linear in the covariance between $z_{l,\tau}$ and $z_{l,\tau-1}$. Therefore, we need to multiply the sums of the elements above the main diagonal of the matrix $\Phi_{\tau-1}$ with the sums of the elements below the main diagonal of the matrix $\Phi_\tau$, diagonal by diagonal. With the aid of $h_1$ and $h_2$ one can write
$$S_1 = \psi h_1 h_2' \psi', \qquad E(z_{l,\tau} z_{l,\tau-1}) = \sigma_h^2 S_1.$$
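As a quick plausibility check of the two formulas, consider the smallest nontrivial case $m = 2$. The following worked example is an illustration derived from the definitions above and is not part of the original derivation; it treats $\psi$ as the $(1 \times m)$ row vector of powers of $\rho$.

```latex
\begin{align*}
\psi &= \begin{pmatrix} \rho & 1 \end{pmatrix}, \qquad
H = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad
h_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
h_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \\
\psi H &= \begin{pmatrix} \rho & 1 + \rho & 1 \end{pmatrix}, \\
S_0 &= \psi H H' \psi' = \rho^2 + (1 + \rho)^2 + 1 = 2 + 2\rho + 2\rho^2, \\
S_1 &= (\psi h_1)(h_2' \psi') = \rho \cdot 1 = \rho .
\end{align*}
```

This agrees with a direct calculation: for $m = 2$ the aggregated error is $z_{l,\tau} = \varepsilon_{h,t} + (1+\rho)\varepsilon_{h,t-1} + \rho\,\varepsilon_{h,t-2}$, whose variance is $\sigma_h^2(1 + (1+\rho)^2 + \rho^2)$ and whose covariance with $z_{l,\tau-1}$ stems only from $\varepsilon_{h,t-2}$, giving $\sigma_h^2 \rho$.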
Looking at the result, it is noteworthy that the expression under the square root in equation (9) is always positive, which can be conjectured from its monotonicity in $\rho$ and from the limiting cases $\rho \to 0$ and $\rho \to \pm 1$ respectively. As it turns out, only one of the two solutions yields an invertible MA representation. Furthermore, the whole expression is dominated by the first term since $\left| \frac{S_0}{2S_1} \right| > \sqrt{\frac{S_0^2}{4S_1^2} - 1}$ given $\rho \neq 0$. It can also be shown that $\rho$ and $\phi^*$ will always have the same sign. Finally, $\phi^*$ can be identified as the one of the two possible solutions that yields an invertible MA representation. It is reasonable to consider this invertible MA coefficient as the true coefficient of the aggregated process.
For completeness I also give a formula for the variance of the aggregated process. The variance amounts to $\sigma_l^2 = S_0 \sigma_h^2 / (1 + \phi^{*2})$.
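The mapping from $(\rho, m)$ to $(\rho^*, \phi^*, \sigma_l^2)$ described in this section can be sketched numerically. The code below is a minimal illustration assuming NumPy; the function name `aggregate_arma_params` is hypothetical, $\psi$ is treated as a row vector, and the invertible root of (9) is selected by requiring $|\phi^*| < 1$.

```python
import numpy as np

def aggregate_arma_params(rho, m, sigma2_h=1.0):
    """Map the AR(1) parameter rho and aggregation order m to the
    parameters of the aggregated ARMA(1,1) process.

    Returns (rho_star, phi_star, sigma2_l, S0, S1).
    """
    rho = float(rho)
    psi = rho ** np.arange(m - 1, -1, -1)        # [rho^(m-1), ..., rho^0]
    H = np.zeros((m, 2 * m - 1))
    for i in range(m):
        H[i, i:i + m] = 1.0                      # block of m ones, shifted right by row
    d = psi @ H                                  # sums along the diagonals of Phi_tau
    S0 = d @ d                                   # psi H H' psi'
    h1, h2 = H[:, :m - 1], H[:, m:]              # columns left / right of the middle one
    S1 = (psi @ h1) @ (psi @ h2)                 # psi h1 h2' psi'
    if S1 == 0.0:
        phi = 0.0                                # e.g. rho = 0
    else:
        a = S0 / (2.0 * S1)
        phi = a - np.sign(a) * np.sqrt(a * a - 1.0)   # root of (9) with |phi| < 1
    sigma2_l = sigma2_h * S0 / (1.0 + phi**2)
    return rho**m, phi, sigma2_l, S0, S1

rho_star, phi_star, sigma2_l, S0, S1 = aggregate_arma_params(0.5, 4)
print(rho_star, phi_star, sigma2_l)
```

Because the two roots of (9) are reciprocals of each other, subtracting the square root term (with the sign of the first term) always picks the root inside the unit interval.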
Figure 1 illustrates the results by depicting the relation between $\rho$ and $\phi^*$. As a notational convention I will state the value of $m$ in parentheses where convenient, because $\phi^*$ and $\rho^*$ both depend on $m$.
3.3 Feasible maximum likelihood estimation
Feasible approaches to estimate the model parameters are the Kalman filter and various numerical optimisation methods. Using standard notation, $L$ is the lag operator with $L x_t = x_{t-1}$ while $\mathcal{L} = L^m$ is the modified lag operator, and hence $\mathcal{L} x_\tau = x_{\tau-1}$. We set up a model as in (4), but consider the ARMA(1,1) error process
$$(1 - \varrho \mathcal{L}) u_{l,\tau} = (1 + \phi \mathcal{L}) \varepsilon_{l,\tau} \tag{12}$$
where $\varrho = \rho^*$ and $\phi = \phi^*$ when the aggregation restriction is in place. The corresponding empirical model is denoted ARMA*(1,1). Furthermore, I introduce the parameter vectors $\theta = (\varrho, \phi)'$ and $\theta^*(m) = (\rho^*(m), \phi^*(m))'$ to obtain handy expressions for future use.
Figure 1: The parameter $\phi^*$ (vertical axis) as a function of $\rho$ (horizontal axis).
In what follows an Ox 3.40 program is employed (Doornik and Ooms, 2001), making use of its standard ARFIMA package (Ooms and Doornik, 1999; Doornik and Hendry, 2001, chap. 13). It is capable of filtering data by various definitions of ARFIMA models and of calculating the corresponding likelihood function. In combination with a simple hill climbing algorithm the likelihood can be maximised and the corresponding estimates for $\rho$ and $\phi^*$ obtained.
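The idea of maximising the likelihood on aggregate data only can be sketched in Python. This is not the Ox/ARFIMA implementation used in the paper. As assumptions of the sketch, the aggregation restriction is imposed through the exact covariance $\Sigma_l = C_m \Sigma_h C_m'$ instead of ARMA filtering, $\beta$ and $\sigma_h^2$ are concentrated out of a Gaussian likelihood, and a coarse grid search over $\rho$ stands in for the hill climbing algorithm. All function names are hypothetical.

```python
import numpy as np

def concentrated_nll(rho, y_l, X_l, m):
    """Concentrated Gaussian negative log-likelihood (up to a constant)
    of the aggregate model (4) for a given high frequency rho."""
    n = len(y_l)
    idx = np.arange(n * m)
    Sigma_h = rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho**2)
    C = np.kron(np.eye(n), np.ones((1, m)))
    V = C @ Sigma_h @ C.T                         # Sigma_l for sigma_h^2 = 1
    L = np.linalg.cholesky(V)
    Xw, yw = np.linalg.solve(L, X_l), np.linalg.solve(L, y_l)
    beta = np.linalg.lstsq(Xw, yw, rcond=None)[0] # GLS via whitened least squares
    resid = yw - Xw @ beta
    s2 = resid @ resid / n                        # ML estimate of sigma_h^2
    nll = 0.5 * (n * np.log(s2) + 2.0 * np.log(np.diag(L)).sum())
    return nll, beta, s2

def fit_restricted(y_l, X_l, m, grid=None):
    """Grid search over rho; returns (rho_hat, beta_hat, sigma2_h_hat)."""
    grid = np.linspace(-0.95, 0.95, 39) if grid is None else grid
    fits = [(concentrated_nll(r, y_l, X_l, m), r) for r in grid]
    (nll, beta, s2), rho = min(fits, key=lambda f: f[0][0])
    return rho, beta, s2

# Illustrative data: quarterly AR(1) errors, observed only as annual sums.
rng = np.random.default_rng(1)
n, m, rho0 = 100, 4, 0.8
eps = rng.standard_normal(n * m + 200)
u = np.zeros_like(eps)
for t in range(1, len(eps)):
    u[t] = rho0 * u[t - 1] + eps[t]
x = rng.standard_normal(n * m)
y = 1.0 + 1.5 * x + u[200:]                       # drop 200 burn-in periods
C = np.kron(np.eye(n), np.ones((1, m)))
y_l, X_l = C @ y, C @ np.column_stack([np.ones(n * m), x])
rho_hat, beta_hat, s2_hat = fit_restricted(y_l, X_l, m)
print(rho_hat, beta_hat, s2_hat)
```

Only `y_l` and `X_l` enter the estimation, which mirrors the point made above that no high frequency residuals are needed while iterating. A finer grid or a numerical optimiser could replace the coarse grid.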
3.4 Simulation study
In this section the results of a simulation study (see Table 1) are presented. I pursue the following strategy.
The principal simulation approach uses a moderate sample size of $T/m = 100$. The data are filtered by the ARMA*(1,1) and, for comparison, by an unrestricted ARMA(1,1) process. In order to assess the estimation procedure, the means as well as the standard deviations of the estimates for $\varrho$ and $\beta$ are reported. Note, however, that for the ARMA*(1,1) models the listed $\varrho$ correspond to the value of $\rho$ in (2).
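One replication of such an experiment could be generated as follows. The sketch assumes the data generating process stated in the header of Table 1 ($y_{h,t} = 1 + 1.5 x_{h,t} + u_{h,t}$ with AR(1) errors) and NumPy. The burn-in length, the seed handling and the function name `simulate_aggregate` are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

def simulate_aggregate(rho, m=4, n_low=100, sigma2=1.0,
                       beta=(1.0, 1.5), burn=200, seed=0):
    """Simulate one high frequency sample and its temporal aggregates.

    Returns (y_l, x_l, y_h, x_h), where the low frequency series are sums
    over m consecutive high frequency observations.
    """
    rng = np.random.default_rng(seed)
    T = n_low * m
    eps = rng.normal(0.0, np.sqrt(sigma2), T + burn)
    u = np.zeros(T + burn)
    for t in range(1, T + burn):
        u[t] = rho * u[t - 1] + eps[t]            # AR(1) error process (2)
    u = u[burn:]                                  # discard burn-in periods
    x_h = rng.normal(0.0, np.sqrt(sigma2), T)
    y_h = beta[0] + beta[1] * x_h + u
    y_l = y_h.reshape(n_low, m).sum(axis=1)       # same as C_m @ y_h
    x_l = x_h.reshape(n_low, m).sum(axis=1)
    return y_l, x_l, y_h, x_h

y_l, x_l, y_h, x_h = simulate_aggregate(rho=0.8)
print(y_l.shape, y_h.shape)                       # (100,) (400,)
```

Repeating this with different seeds and estimating the restricted and the unrestricted model on each draw would give means and standard deviations of the kind reported in Table 1.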
The parameter space for the simulation setup is depicted in Figure 2, which illustrates the choices of $\phi^*$ and $\rho^*$ for various $m$. Given a certain variance of the aggregated process, the variance of the innovation process of the disaggregated model does not vary much with $m$ for $\rho < .8$ (see bottom panel of Figure 2). It is further interesting to notice that, regardless of the actual choice of $m$, the implicit ARMA(1,1) parameters are close to each other for a considerable range of $\rho$. In fact, all possible pairs of lines have at least one point in common. This implies that there are always some $\rho$ for which the observable data could have been generated at two possible higher frequencies.
For the sake of brevity, however, the identification of $m$ is not discussed. In the simulation experiment it is assumed that the level of aggregation is always known. As Table 1 shows, the estimates are indeed very well in line with the true values. The accuracy increases with $\varrho$, which can be inferred from the corresponding standard deviations. Thus, the proposed maximum likelihood approach provides reliable coefficient estimates.
The accuracy of the point estimates of the $\beta$ parameter appears to be unaffected by the choice of $\rho$. This can be conjectured from the reported standard deviations. There is no situation where the t-ratio comes close to 2; it is always much larger. Quite reasonably, the variance depends on the underlying variance of the innovation process and the variance of $x_{l,\tau}$.
Since the standard error of the parameter estimates for $\beta$ hardly varies at all across models and is relatively small, there is a good chance of correctly determining the impact of the related series. This leads to the conclusion that, independent of whether or not the true disaggregated model can be identified, forecasting or nowcasting on the basis of the true related series is, in principle, feasible.

Table 1: Parameter estimation in a small sample simulation
$y_{h,t} = 1 + 1.5 x_{h,t} + u_{h,t}$, $T/m = 100$, $x_{h,t} \sim N(0, \sigma_h^2)$, $u_{h,t} = \rho u_{h,t-1} + \varepsilon_t$, $\varepsilon_t \sim N(0, \sigma_h^2)$, $m = 4$, ref.: equation (4)
Paired entries are listed in the order ARMA*(1,1) / ARMA(1,1).

σ_h²   ρ     ρ̂             σ̂ρ̂          φ̂     σ̂φ̂    β̂              σ̂β̂
1      .2    .20 / -.01    .18 / .80    .07    .59    1.51 / 1.51    .23 / .24
1      .5    .47 / .02     .15 / .73    .21    .42    1.52 / 1.51    .31 / .31
1      .8    .79 / .73     .05 / .25    .27    .19    1.52 / 1.51    .39 / .40
1      .95   .94 / .94     .02 / .02    .27    .12    1.51 / 1.51    .40 / .40
4      .8    .79 / .73     .05 / .25    .27    .19    1.57 / 1.56    1.58 / 1.59

ARMA*(1,1) denotes the restricted ARMA(1,1) model ($m = 4$). The column headed % signifies the rejection frequency of the hypotheses to the left in percentage points. Columns 'ρ̂' ('σ̂ρ̂'), 'φ̂' ('σ̂φ̂'), 'β̂' ('σ̂β̂') report the average (standard deviation) of the estimated model parameters. Note: Estimates for $\rho$ can be compared to the true parameter only for ARMA*(1,1). A comparison to $\varrho = \rho^*(4)$ should otherwise be made.
4 Application
The following application illustrates the approach in a realistic setting. The Swiss federation has agreed to comply with European standards in statistics on, among other things, economics, such as the system of national accounts. This creates the need to provide monthly information on industrial production, for example. To date, no official statistic is available at that frequency. It is therefore reasonable to start with an application of the procedure outlined before.
The general approach will be to use a list of reasonable related series. They will be drawn from two different categories. The first comprises data related to trade in goods. It is provided by the Swiss Custom Office. The trade activities looked at are exports and imports of all goods, investment goods, vehicles, machinery, and intermediate goods for the industry. The second data type is survey data. Here, a sample of firms is surveyed monthly on a range of issues covering the overall business situation, the number of orders, and turnover, to name but a few. These firms can furthermore be distinguished according to their size (small, medium, large).
The model estimated is (2) with additional centred seasonal dummies to account for seasonality. Table 2 reports for the ARMA*(1,1) model the coefficient estimate for $\beta$, its t-value, the sum of squared forecast errors for the last two years of the sample, and the estimates for $\rho$ (both Chow and Lin's (1971) result and $\rho^*$). The sample runs from 1990 through 2003; the years 2004 and 2005 were used for out-of-sample forecasting.
According to the two criteria, significance of the coefficient on the related series and minimum forecast error, the disaggregation based on the variable “PROD VM MITTLERE G AVER” (average of the judgment of medium sized firms of their actual change in production from the last to the current month; share of the answer category “stay equal”) seems to be the most appropriate choice.
The corresponding disaggregated series is depicted in Figure 3.

Table 2: Disaggregation of annual value added in the industry to quarterly values

Related variable            ρ̂       β̂*       t-val     MSFE
A GROSSE S AVER KUM2        .6607   2.75      1.512     9.92
EXC INVFAHRZ                .8887   -30.48    -2.717    22.13
IMC INVFAHRZNOM             .8353   -34.57    -2.149    17.12
PROD VM ALLE G AVER         .8418   300.80    2.778     17.36
IMC INVFAHRZ                .8427   -28.67    -1.966    18.12
PROD VM MITTLERE G END      .8131   179.40    1.775     19.27
PROD VM ALLE G END          .74     184.30    1.317     20.17
PROD VM MITTLERE G AVER     .8809   308.40    3.496     20.42
EXC FAHRZ                   .8775   -30.89    -2.268    24.95
IMC FAHRZNOM                .5894   4.98      .133      26.31
PROD VM GROSSE G AVER       .7657   65.24     .429      32.44
PROD IND PAUL               .5210   113.30    1.601     16.47
PROD IND PAUL*              .8692   240.80    2.931     55.08

* Model without trend. This model is used for generating the semi-official quarterly data by the Swiss ministry of economic affairs.

5 Summary and conclusions
This paper suggests an alternative estimation of the popular Chow&Lin disaggregation approach. While preserving the advantages of the CL method, especially its moderate data demands, it simplifies the analysis by allowing a one-step estimation on the basis of the aggregated data. A simulation study reveals that the estimates are reliable within a considerable parameter space.
Future research might be devoted to developing a statistical test for the aggregation restriction on the basis of the maximum likelihood approach.
References
Chow, G. C. and Lin, A.-l. (1971). Best linear unbiased interpolation, distribution, and extrapolation of time series by related series, The Review of Economics and Statistics 53(4): 372-375.
Doornik, J. A. and Hendry, D. F. (2001). Econometric Modelling Using PcGive, Vol. III, Timberlake Consultants, London.
Doornik, J. A. and Ooms, M. (2001). Introduction to Ox, Oxford University.
Engle, R. F. and Granger, C. W. J. (1987). Co-Integration and Error Correction: Representation, Estimation and Testing, Econometrica 55(2): 251-276.
Fernandez, R. B. (1981). A methodological note on the estimation of time series, The Review of Economics and Statistics 63(3): 471-478.
Litterman, R. B. (1983). A random walk, Markov model for the distribution of time series, Journal of Business & Economic Statistics 1(2): 169-173.
Marcellino, M. G. (1999). Some Consequences of Temporal Aggregation in Empirical Analysis, Journal of Business & Economic Statistics 17(1): 129-136.
Ooms, M. and Doornik, J. (1999). Inference and Forecasting for Fractional Autoregressive Integrated Moving Average Models, with an application to US and UK inflation, Econometric Institute Report EI 9947/A, Erasmus University Rotterdam, Econometric Institute.
Santos Silva, J. S. and Cardoso, F. (2001). The Chow-Lin method using dynamic models, Economic Modelling 18(2): 269-280.
Silvestrini, A. and Veredas, D. (2005). Temporal aggregation of univariate linear time series models, Core discussion paper 59, Université Libre de Bruxelles.
Wei, W. W. S. (1990). Time Series Analysis, Addison-Wesley Publishing Inc., New York.
Wei, W. W. S. (2006). Time Series Analysis, 2nd edn, Addison-Wesley Publishing Inc., New York.
[Figure 2: three-panel plot. Top panel: φ* against ρ*, with legend entries ρ = 0.8, ρ = 0.90, ρ = 0.99 and m = 2, m = 4, m = 12. Middle panel: φ against ρ for φ = φ*(2), φ*(3), φ*(4), φ*(12). Bottom panel: γ0 against ρ for m = 2, m = 4, m = 12.]
Figure 2: Top panel: the simulation experiment in the (ρ*, φ*) space; the lines connect pairs of ρ* and φ* generated with ρ fixed yet varying m. Middle panel: values of φ* for popular choices of m (semi-annual to annual, monthly to quarterly, quarterly to annual, monthly to annual aggregates). Bottom panel: autocovariance for σ_l² = 1 and various m.
[Figure 3: two-panel plot. Upper panel: "Disaggregated quarterly value added in the industry incl. forecasts, PRD_VM_MITT_G_AV as related series (ρ̂ = 0.88)", showing Chow-Lin estimates and forecasts, 1990 to 2005. Lower panel: "Forecast comparison (if applicable)", showing Chow-Lin forecasts and the original series, 2000 to 2005.]
Figure 3: Temporal disaggregation of Swiss industrial production by the Chow-Lin procedure with the new estimation approach, using survey data.