Macroeconomic Forecasting Using Diffusion Indexes

Macroeconomic
ForecastingUsing
Indexes
Diffusion
James H. STOCK
Kennedy School of Government,HarvardUniversity,and NationalBureauof Economic Research,
Cambridge, MA 02138
MarkW. WATSON
WoodrowWilson School, PrincetonUniversity,Princeton,NJ 08544, and NationalBureauof
Economic Research
This article studies forecastinga macroeconomictime series variableusing a large numberof predictors.
The predictorsare summarizedusing a small number of indexes constructedby principal component
analysis. An approximatedynamicfactor model serves as the statisticalframeworkfor the estimationof
the indexes and constructionof the forecasts. The method is used to construct6-, 12-, and 24-monthahead forecastsfor eight monthlyU.S. macroeconomictime series using 215 predictorsin simulatedreal
time from 1970 through 1998. During this sample period these new forecasts outperformedunivariate
autoregressions,small vector autoregressions,and leading indicatormodels.
KEY WORDS: Factormodel; Forecasting;Principalcomponents.
1. INTRODUCTION
Recent advances in informationtechnology make it possible to access in real time, at a reasonable cost, thousandsof
economic time series for major developed economies. This
raises the prospect of a new frontierin macroeconomicforecasting, in which a very large numberof time series are used
to forecast a few key economic quantities,such as aggregate
productionor inflation.Time series models currentlyused for
macroeconomicforecasting, however, incorporateonly a few
series: vector autoregressions,for example, typically contain
fewer than 10 variables. Although variable selection procedures can be used to choose a small subset of predictorsfrom
a large set of potentially useful variables,the performanceof
these methods ultimately rests on the few variables that are
chosen. For example, real economic activity is often used to
predictinflation(the so-called Philips curve), but is the unemployment rate, the rate of capacity utilization, or the Gross
Domestic Product gap the best measure of real activity for
this purpose?An alternativeto selecting a few predictorsis to
pool the informationin all the candidatepredictors,averaging
away idiosyncratic variation in the individual series. In this
paper, we use an approximatefactor model for this purpose.
The premise is that for forecasting purposes, the information
in the large numberof predictorscan be replacedby a handful
of estimated factors.
This idea has a long tradition in macroeconomics. For
example, the notion of a common business cycle underliesthe
classic work of Bums and Mitchell (1947) and the indexes
of leading and coincident indicators originally developed at
the National Bureau of Economic Research (NBER). This
notion was formally modeled by Sargent and Sims (1977)
in their dynamic generalization of the classic factor analysis model. Versions of their model have been used by several
researchersto study dynamic covariationamong sets of variables (Geweke 1977; Singleton 1980; Engle and Watson 1981;
Stock and Watson 1989, 1991; Quah and Sargent 1993; Forni
and Reichlin 1996, 1998). Modem dynamic general equilibrium macroeconomic models often postulate that a small set
of driving variablesis responsible for variationin macro time
series, and these variablescan be viewed as a set of common
factors. Although the previous empirical research focused on
estimatingindexes of covariation,this paperuses the estimated
factors for prediction.
The approximatedynamic factor model, which relates the
variableto be forecast,yt+,, to a set of predictorscollected in
the vector X,, is presentedin Section 2. Forecastingis carried
out in a two-step process: first the factors are estimated (by
principal components) using X,, then these estimated factors
are used to forecast yt,. Focusing on the forecasts implied
by the factors ratherthan on the factors themselves permits
sidesteppingthe difficultproblemof identification(or rotation)
inherentin factor models. One interpretationof the estimated
factors is in terms of diffusion indexes developed by NBER
business cycle analysts to measure common movement in a
set of macroeconomicvariables, and accordingly we call the
estimated factors diffusion indexes.
The performanceof the diffusion index (DI) forecasts is
examined in Sections 3 and 4. The experiment reported in
these sections simulatesreal-timeforecastingduringthe 19701998 period of eight U.S. macroeconomicvariables,four measures each of real economic activity and of price inflation.
The DI forecasts are constructedat horizons of 6, 12, and 24
months using as many as 215 predictorseries. These forecasts
are compared to several conventional benchmarks:univariate autogressions, small vector autoregressions,leading indicator models, and, for inflation,unemployment-basedPhillips
curve models. Generally speaking, the diffusion index forecasts based on a small numberof factors (in most cases, one
or two) are found to performwell, with relative performance
improvingas the horizon increases. The improvementover the
benchmarkforecastscan be dramatic,in several cases produc-
147
@ 2002 American Statistical Association
Journal of Business & Economic Statistics
April2002, Vol. 20, No. 2
148
Journalof Business & EconomicStatistics,April2002
ing simulated out-of-sample mean square forecast errorsthat 2.2 Estimationand Forecasting
are one-thirdless than those of the benchmarkmodels.
Because {F,}, ah, 1h(L), and yh(L) are unknown,forecasts
of YT+h based on (2.4) and (2.5) are constructedusing a two2. ECONOMETRIC
FRAMEWORK
step procedure.First, the sample data {Xt},=, are used to estimate
a time series of factors (the diffusion indexes),
2.1 An ApproximateDynamicFactorModel
{•F},=l1
Second, the estimators ah, 3h(L) and Yh(L) are obtained
We begin with a discussion of the statistical model that by regressing
yt+l onto a constant, Ft and yt (and lags). The
motivates the DI forecasts. Let y,t, denote the scalar series forecast of
yh is then formed as 'h +/3h(L)F + Yh(L)y,.
to be forecast and let, Xt be an N-dimensional multiple time
Stock and Watson (1998) developed theoreticalresults for
series of predictorvariables,observedfor t = 1 ..., T, where this
two-step procedureapplied to (2.3) and (2.4). The factors
y, and Xt are both taken to have mean 0. (The differenttime are estimated by principalcomponentsbecause these estimasubscriptsused for y and X emphasize the forecasting rela- tors are readily calculated even for very large N and because
tionship.) We suppose that (Xt, yt+) admit a dynamic factor principalcomponentscan be generalizedto handle data irregmodel representationwith F common dynamic factors f,,
ularities as discussed later. Under a set of moment conditions
for (E, e, F) and an asymptoticrankconditionon A, the feasi(2.1) ble forecast is
Yt+,= 8(L)ft + y(L)y, + Et+,,
asymptoticallyfirst-orderefficient in the sense
(2.2) that its mean square forecast error (MSE) approaches the
Xi, = Ai(L)f, + eit,
MSE of the optimal infeasible forecast as N, T -- oo, where
for i= 1,...
N, where et = (elt,,..., eNt)' is the N x 1 N = O(TP) for any p > 1. This result suggests that feasible
idiosyncraticdisturbanceand Ai(L) and P(L) are lag polyno- forecasts are likely to be nearly optimal when N and T are
mials in nonnegativepowers of L. It is assumed that E(Et+l I large, regardlessof the ratio of N to T. The assumptionsby
=
ft, y,, X, f,_-, yt-1 X,_-1... ) 0. Thus, if {ft, P(L), and Stock and Watson (1998) are similar to assumptionsmade in
minimum
mean square error forecast the literatureon approximatefactor models (Chamberlainand
were
the
known,
y(L)
of yT+ would be P(L)f, + y(L)y,.
Rothschild 1983; Connor and Korajczyk 1986, 1988, 1993),
We make two importantmodifications to (2.1) and (2.2). generalizedto allow for serial correlation.A related dynamic
First, the lag polynomials Ai(L), P(L), and y(L) are modeled generalizationand estimation(butnot forecasting)resultswere
as having finite ordersof at most q, so Ai(L) = EIj__ AJjL and discussed by Forni, Hallin, Lippi, and Reichlin (2000). Stock
P(L) =
oPjLj. The finite lag assumptionpermitsrewriting and Watson(1998) also showed that the principalcomponents
and
(2.1)
(2.2) as
remain consistent when there is some time variationin A and
small amounts of data contamination,as long as the number
(2.3)
of predictorsis very large, N >> T.
Yt+l = P'F, + y(L)y, + Et+I,
(2.4)
X, = AFt + et,
where Ft = (f, ..., f,_q)' is r x 1, where r < (q+ 1)F, the ith
row of A in (2.4) is (Ai0,...., 9iq), and 3 = (30 . . . P/q)'. The
main advantage of this static representationof the dynamic
factor model is that the factors can be estimated using principal components.This comes at a cost, because the assumption is inconsistentwith infinite distributedlags of the factors.
Whetherthis cost is large is ultimately an empiricalquestion,
addressed here by studying whether (2.3) and (2.4) can be
used to produce accurateforecasts.
Second, our empirical applicationfocuses on h-step-ahead
forecasts. At least two approachesto multistepforecastingare
possible. One is to develop a vector time series model for
F,, to estimate this using the estimated factors, and to roll
the (y,, F,) model forward,but this entails estimating a large
numberof parametersthat could erode forecast performance.
Another approachis to recognize that the ensuing multistep
forecasts would be linear F, and y, (and lags) and to use an
h-step-aheadprojectionto constructthe forecasts directly.We
adopt the latter approach,and the resulting multistep ahead
version of (2.3) is
Y+h= ?h + h(L)Ft+ yh(L)y, + E+h,
(2.5)
2.3 Data Irregularitiesand ComputationalIssues
In our dataset, some series contain missing observationsor
are available over a diminishedtime span. Although our data
are all monthly,furthercomplicationswould arise in applications in which mixed sampling frequencies are used, such as
monthly and quarterly.In these cases standardprincipalcomponents analysis does not apply. However, the expectationmaximization(EM) algorithmcan be used to estimatethe factors by solving a suitable minimization problem iteratively.
Details are given in Appendix A.
Although the components of X, typically will be distinct
time series, X, could contain multiple lags of one or more
series. Because the estimated factors F, could include lags of
the dynamic factors f,, estimation of F, might be enhanced
by augmenting a vector of distinct time series with its lags.
This is referredto later as stacking X, with its lags, in which
case the principalcomponents of the stacked data vector are
computed.
3.
THE DATAAND FORECASTING
EXPERIMENTAL
DESIGN
3.1 Forecasting Models and Data
where yh+his the h-step-aheadvariableto be forecast,the conThe forecasting experiment simulates real-time forecaststant term is introducedexplicitly, and the subscripts reflect
the dependenceof the projectionon the horizon.
ing for eight major monthly macroeconomicvariablesfor the
Stock and Watson: MacroeconomicForecastingUsing DiffusionIndexes
United States. The complete dataset spans 1959:1 to 1998:12.
Fourof these eight variablesare the measuresof real economic
activity used to construct the Index of Coincident Economic
Indicatorsmaintainedby the Conference Board (formerly by
the U.S. Departmentof Commerce):total industrialproduction
(ip); real personal income less transfers(gmyxpq); real manufacturingand trade sales (msmtq); and number of employees on nonagriculturalpayrolls (lpnag). (Additionaldetails are
given in Appendix B, which lists series by the mnemonics
given here in parenthesis.)The remainingfour series are price
indexes: the consumerprice index (punew); the personal consumption expenditureimplicit price deflator(gmdc); the consumer price index (CPI) less food and energy (puxx); and the
producerprice index for finished goods (pwfsa). These series
and the predictorseries were taken from the May 1999 release
of the DRI/McGraw-HillBasic Economics database(formerly
Citibase). In general these series represent the fully revised
historical series available as of May 1999, and in this regard
the forecasting results will differ from results that would be
calculated using real-time data.
For each series, several forecastingmodels are comparedat
the 6-, 12-, and 24-month forecasting horizons: DI forecasts
based on estimatedfactors,a benchmarkunivariateautoregression, and benchmarkmultivariatemodels. For both the real
and the price series, one of the benchmarkmultivariatemodels
is a trivariatevector autoregression,and a second is based on
leading economic indicators. As a furthercomparison, inflation forecasts are also computed using an unemploymentbased Phillips curve.
Our focus is on multistep-aheadprediction,and most of the
forecastingregressionsare projectionsof an h-step-aheadvariable yh+honto t-dated predictors,sometimes including lagged
transformedvalues y, of the variableof interest.The real variables are modeled as being I(1) in logarithms. Because all
four real variables are treated identically, consider industrial
production,for which
yh+h= (1200/h) Iln(IPt+h/IPt)
and y, = 1200ln(IPt/IPt,_).
(3.1)
The price indexes are modeled as being 1(2) in logarithms.
The 1(2) specificationis consistentwith standardPhillips curve
equationsand is a good descriptionof the series over much of
the sample period. However, I(1) specifications also provide
adequatedescriptionsof the data, particularlyin the early part
of the sample. Stock and Watson(1999) found little difference
in I(1) and 1(2) factormodel forecasts for these prices over the
sample period studiedhere, so for the sake of brevity we limit
our analysis to the 1(2) specification.Accordingly,for the CPI
(and similarly for the other price series),
y+h =
(1200/h) ln(CPIt+h/CPIt) - 1200 ln(CPI,/CPIt_,)
and y = 1200OAln(CPI,/CPI,_1). (3.2)
Diffusion Index Forecasts. Following (2.5), the most general DI forecasting function is
m
YTh+hlT
h
j=1
p
fhjFrj+l
+
j=1
YhjYT-j+1,
(3.3)
149
where F, is the vector of k estimatedfactors. Results for three
variantsof (3.3) are reported.The first, denoted in the tables
by DI-AR, Lag, includes lags of the factorsand lags of y, with
k and lag orders m and p estimated by Bayesian information
criterion (BIC), with 1 < k < 4, 1 < m < 3, and 0 < p < 6.
Thus the smallest candidatemodel that BIC can choose here
includes only a single contemporaneousfactor and excludes
y,. The second, denotedDI-AR, includes contemporaneousF,,
that is, m = 1, and k and p are chosen by BIC with 1 <
k < 12 and 0 < p • 6. The third, denoted DI, includes only
contemporaneousF,, so p = 0, m = 1, and k is chosen by BIC,
1 <k < 12.
The full dataset used to estimate the factors contains 215
monthly time series for the United States from 1959:1 to
1998:12. The series were selectedjudgmentallyto represent14
main categoriesof macroeconomictime series: real outputand
income; employmentand hours;real retail,manufacturing,and
trade sales; consumption;housing startsand sales; real inventories and inventory-salesratios; orders and unfilled orders;
stock prices; exchange rates; interest rates; money and credit
quantity aggregates; price indexes; average hourly earnings;
and miscellaneous. The list of series is given in Appendix B
and is similar to lists we have used elsewhere (Stock and
Watson 1996, 1999). These series were taken from a somewhat longer list, from which we eliminated series with gross
problems, such as redefinitions.However, no furtherpruning
was performed.
The theory outlined in Section 2 assumes that Xt is I(0),
so these 215 series were subjectedto three preliminarysteps:
possible transformationby takinglogarithms,possible firstdifferencing, and screeningfor outliers. The decision to take logarithmsor to first differencethe series was madejudgmentally
afterpreliminarydataanalysis, includinginspectionof the data
and unit root tests. In general, logarithms were taken for all
nonnegativeseries that were not alreadyin rates or percentage
units. Most series were first differenced.A code summarizing
these transformationsis given for each series in Appendix B.
After these transformations,all series were furtherstandardized to have sample mean zero and unit sample variance.
Finally, the transformeddata were screened automaticallyfor
outliers (generally taken to be coding errors or exceptional
events such as labor strikes), and observationsexceeding 10
times the interquartilerange from the median were replaced
by missing values.
Using this transformedand screened dataset, three sets of
empirical factors were constructed. The first was computed
using principal components from the subset of 149 variables
available for the full sample period (the balancedpanel). The
second set of factors was computed using the nonbalanced
panel of all 215 series using the methods of Appendix A. The
thirdset of factorswas computedby stackingthe 149 variables
in the balanced panel with their first lags, so the augmented
data vector has dimension 298. Empirical factors were then
estimatedby the principalcomponentsof the stacked data, as
discussed in Section 2.
AutoregressiveForecast. The autoregressiveforecast is a
univariateforecast based on (3.3), where the terms involving
F are excluded. The lag order p was selected recursively by
BIC with 0 < p < 6, where p = 0 indicates that yt and its lags
are excluded.
150
Vector Autoregressive Forecast. The first multivariate
benchmarkmodel is a vector autoregression(VAR) with p
lags each of three variables. One version of the VAR used
p = 4 lags, and anotherversion selected p recursivelyby BIC.
The fixed-lag VARs performedsomewhatbetter than the BIC
selected lag lengths (which often set p = 1), and we report
results for the fixed lag specificationsin the results to follow.
The variablesin the VAR are a measureof the monthlygrowth
in real activity,the change in monthlyinflation,and the change
in the 90-day U.S. treasurybill rate. When used to forecast the
real series, the relevantreal activity variablewas used and the
inflation measure was CPI inflation.For forecastinginflation,
the relevantprice series was used and the real activity measure
was industrialproduction.Multistep forecasts were computed
by iteratingthe VAR forward.This contraststo the autoregressive forecasts, which were computedby h-step-aheadprojection ratherthan iteration.
Multivariate Leading Indicator Forecasts. The leading
indicatorforecasts have the form
Journalof Business & EconomicStatistics,April2002
new orders in durable goods industries (mdoq), the nominal
Ml money supply (fml), the federal funds overnightinterest
rate (fyff), and the interest rate spread between 1-year U.S.
treasurybonds and the federalfunds rate (sfygtl). The remaining variable is the trade-weightedexchange rate listed in the
previous paragraph.
In all cases, the leading indicatorswere transformedso that
W, is I(0). This entailed taking logarithms of variables not
alreadyin rates and differencingall variablesexcept the interest rate spreads, housing starts, the index of vendor performance, and the help wanted index.
For each variable to be forecast, p and m in (3.4) were
determinedby recursive BIC with 1 < m < 4 and 0 < p < 6,
so 28 possible models were comparedin each time period.
Phillips Curve Forecasts. The unemployment-based
Phillips curve is considered by many to have been a reliable
method for forecasting inflation over this period (Gordon
1982; CongressionalBudget Office 1996; Fuhrer 1995; Gordon 1997; Staiger et al. 1997; Tootel 1994). The Phillips
curve inflation forecasts considered here have the form (3.4),
m
p
where Wt consists of the unemploymentrate (LHUR) and
=8h0
(3.4)
hi WT j+l +
YhjYT--j+1
m - 1 of its lags, the relativeprice of food and energy (current
SYT+hJT •
j=1
j=1
and one lagged value only), and Gordon's(1982) variablethat
where Wtis a vector of leading indicatorsthat have been fea- controls for the imposition and removal of the Nixon wage
tured in the literatureor in real-time forecasting applications and price controls. The wage and price control variable is
and Sh0 and so forth are ordinary least squares coefficient introducedfor forecasts made in 1971: 7 + h, before which it
estimates.
produces singularregressions. The lag lengths m and p were
For the real variables, Wt consists of 11 leading indicators chosen by recursiveBIC, where 1 < m < 6 and 0 < p < 6.
that we used for real-time monthly forecasting in experimental leading and recession indicators(Stock and Watson 1989). 3.2
SimulatedReal-TimeExperimentalDesign
(The list used here consists of the leading indicatorsused to
Estimationand forecastingwas conductedto simulate realproduce the XRI and the XRI-2, which are released monthly
time
and documentedat the web site http://www.nber.org.)
Five of
forecasting. This entailed fully recursiveparameterestithese leading indicators are also used in the factor estima- mation, factor estimation, model selection, and so forth. The
tion step in the diffusion index forecasts. These are average first simulated out of sample forecast was made in 1970:1.
weekly hours of productionworkersin manufacturing(lphrm), To constructthis forecast, the data were screened for outliers
the capacity utilizationrate in manufacturing(ipxmca), hous- and standardized,the parametersand factors were estimated,
ing starts (building permits) (hsbr), the index of help-wanted and the models were selected, using only data available from
advertising in newspapers (lhel), and the interest rate on 1959:1 through1970:1. (The first date for the regressionswas
10-yearU.S. treasurybonds (fygtl0). The remainingsix lead- 1960:1, and earlier observationswere used for initial condiing indicators are the interest rate spread between 3-month tions as needed.) Thus regressions(3.3) and (3.4) were run for
U.S. treasurybills and 3-month commercialpaper;the spread t = 1960:1 ...., 1970:1- h, then the values of the regressors
between 10-year and 1-year U.S. treasury bonds; the num- at t = 1970:1 were used to forecast y1970:1+h. All parameters,
ber of people working part-timein nonagriculturalindustries factors, and so forth were then reestimated,informationcribecause of slack work; real manufacturers'unfilled orders in teria were recomputed,and models were selected using data
durable goods industries;a trade-weightedindex of nominal from 1959:1 through 1970:2, and forecasts from these models
exchange rates between the United States and the U.K., West were then computed for yh970:2+h. The final simulated out of
Germany,France, Italy, and Japan;and the National Associ- sample forecast was made in 1998:12- h for yh1998:12"
ation of PurchasingManagers' index of vendor performance
4. EMPIRICALRESULTS
(the percent of companies reportingslower deliveries).
For the inflationforecasts, eight leading indicatorsare used.
4.1 Forecasting Results
These variables were chosen because of their good individual performancein previous inflationforecastingexercises. In
The results for the real variables are reportedin detail in
1 for 12-month-aheadforecasts, and summariesfor 6in
these
variables
well
at
least
one
of
the
Table
particular
performed
historical episodes considered by Staiger, Stock, and Watson and 24-month-aheadforecasts are reported in Table 2. Two
(1997) (also see Stock and Watson 1999). Seven of these vari- sets of statistics are reported.The first is the MSE of the canables are also used in the factor-estimationstep in the diffu- didateforecastingmodel, computedrelativeto the MSE of the
sion index forecasts: the total unemploymentrate (lhur), real univariateautoregressiveforecast (so the autoregressiveforemanufacturingand trade sales (msmtq), housing starts (hsbr), cast has a relative MSE of 1.00). For example, the simulated
151
Stock and Watson: Macroeconomic Forecasting Using Diffusion Indexes
Table1. SimulatedOut-of-SampleForecastingResults:Real Variables,12-MonthHorizon
Industrialproduction
Personalincome
Mfg & tradesales
Nonag. employment
Forecast
Rel. MSE
method
&
Rel. MSE
&
&
Rel. MSE
Rel. MSE
&
Benchmarkmodels
1.00
.86 (.27)
.97 (.07)
.57(.13)
.75(.68)
1.00
.97(.21)
.98(.05)
.52(.15)
.68 (.34)
1.00
.82(.25)
.98(.04)
.63(.17)
.73(.58)
1.00
.89 (.23)
1.05(.09)
.56(.14)
.22 (.41)
.57(.27)
.63(.25)
.52 (.26)
.76(.13)
.71 (.12)
.88(.17)
.77(.14)
.86(.16)
.86(.16)
.76(.13)
.61 (.12)
.61 (.12)
.48(.25)
.57(.24)
.56 (.23)
.99(.15)
.84(.18)
.94 (.20)
.91 (.13)
.99(.31)
.92 (.26)
.63(.18)
.51 (.20)
.55 (.20)
Balanced panel (N = 149)
DI-AR, Lag
.67(.25)
.67 (.25)
DI-AR
DI
.59(.25)
.70(.13)
.70(.12)
.81 (.17)
.82(.15)
.92(.14)
.92(.14)
.70(.13)
.57(.12)
.57 (.12)
.56(.23)
.61 (.23)
.57(.23)
.91(.16)
.80(.17)
.91 (.18)
.88(.14)
.88(.22)
.84(.21)
.68(.18)
.58 (.17)
.62 (.16)
.70(.12)
.81 (.18)
.93(.15)
.93(.15)
.56(.12)
.56(.12)
.61 (.22)
.66(.21)
.89(.19)
.85(.20)
1.02(.30)
.95(.24)
.49(.14)
.53(.14)
Full dataset; m = 1, p = BIC, k fixed
DI-AR,k= 1
1.06(.11)
.27(.34)
1.03(.08)
1.01 (.09)
DI-AR,k=2
.63(.25)
.76(.14)
.78(.14)
DI-AR,k= 3
DI-AR,k = 4
.56(.26)
.54(.26)
.84(.14)
.85(.14)
.77(.15)
.76(.15)
.77 (.13)
.78 (.14)
.30(.49)
.89(.15)
1.00(.16)
1.00(.16)
1.01 (.09)
.78(.14)
.77(.15)
.76(.15)
.46 (.34)
.76 (.13)
.77 (.13)
.78(.14)
AR
LI
VAR
Fulldataset (N = 215)
DI-AR, Lag
DI-AR
DI
Stacked balance panel
DI-AR
DI
.65(.25)
.62(.25)
Fulldataset; m = 1, p = O,k fixed
DI, k= 1
DI, k=2
DI, k=3
DI, k= 4
1.03(.07)
.55(.25)
.51(.25)
.49(.25)
RMSE, AR Model
.049
.027
out of sample MSE of the leading indicator (LI) forecast of
industrialproductionis 86% that of the autoregressiveforecast at the 12-monthhorizon. Autocorrelationconsistent standarderrorsfor these relativeMSEs, calculatedfollowing West
(1996), are reportedin parentheses.The second set of statistics
is the coefficient on the candidate forecast from the forecast
combining regression,
h
hh
A(+hI
Yt+h
" oyt+hlt-
ahAR
Yt+hlt-ut+hh
(4.1)
where ,+h is the candidate h-step-aheadforecast and Yh, AR
is the benchmark h-step-ahead autoregressiveforecast. Heteroscedastic autocorrelationrobust (HAC) standarderrorsfor
a are reported in parentheses. For example, a is estimated
to be .57 when the candidate forecast is the leading indicator forecast at the 12-monthhorizon, with a standarderrorof
.13, so the hypothesis that the weight on the leading indicator forecast is 0 (a = 0) is rejected at the 5% level, but so is
the hypothesis that the leading indicatorforecast receives unit
weight.
We now turn to the results for the real variables.First consider the DI forecasts with factors estimated using the full
dataset (the unbalancedpanel). These forecasts with BIC factor selection generally improve substantiallyover the benchmark univariateand multivariateforecasts. The DI-AR, Lag
model, which allows recursiveBIC selection across own lags
and lags of the factors, outperforms all three benchmark
.34(.41)
.98(.06)
.63(.46)
.77(.14)
.53(.24)
.93(.15)
.52(.23)
.51 (.23)
.98(.05)
.57(.24)
.60(.21)
.59(.22)
.045
.49(.24)
.77(.13)
.82(.15)
.99(.16)
1.01 (.16)
.84(.14)
.83(.15)
.75(.20)
.73 (.19)
.67 (.49)
.95(.17)
1.02(.19)
1.03(.20)
1.01 (.09)
.78(.13)
.84(.14)
.82(.15)
.48 (.24)
.83(.16)
.76(.19)
.75(.18)
.017
models in 10 of the 12 variable-horizoncombinations, the
exceptions being 6- and 12-month-aheadforecasts of employment. In most cases the performanceof the simpler DI forecasts, which exclude lags of F, and y, is comparableto or
even better than that of the DI-AR, Lag forecasts. This is
rather surprising, because it implies that essentially all the
predictable dynamics of these series are accounted for by
the estimated factors. In some cases, the improvementover
the benchmarkforecasts are quite substantial;for example,
for industrialproductionat the 12-monthhorizon the DI-AR,
Lag forecast has a forecast error variance 57% that of the
autoregressivemodel and two-thirdsthat of the leading indicator model. The relative improvementsare more modest at
the 6-month horizon. At the 24-month horizon, the multivariate benchmarkforecasts break down and perform worse than
the univariateforecast;however,the DI-AR, Lag, DI-AR, and
DI forecasts continue to outperformthe autoregressivebenchmark very substantially.
The performanceof comparable models is usually better
when the empirical factors from the full dataset are used, relative to those from the balancedpanel subset. Performanceis
not improvedby using empiricalfactors from augmentingthe
balanced panel with its first lag; for these real series, doing
so does comparably to, or somewhat worse than, using the
empirical factors from the unstackedbalanced panel.
Inspection of the final panels of Tables 1 and 2 reveals
a striking finding: simply using DI or DI-AR forecasts with
152
Journal of Business & Economic Statistics, April2002
Table2. SimulatedOut-of-SampleForecastingResults:Real Variables,6- and 24-MonthHorizons
Industrialproduction
Personalincome
Forecast
method
Rel. MSE
&
Rel. MSE
&
Mfg& tradesales
Rel. MSE
&
Nonag. employment
Rel. MSE
&
A. Horizon= 6 months
Benchmarkmodels
AR
LI
VAR
1.00
.70(.25)
1.01 (.05)
.68(.13)
.43 (.39)
1.00
.83(.15)
.99(.03)
.64(.11)
.63(.43)
1.00
.77(.19)
.99(.04)
.68(.14)
.64 (.45)
1.00
.75(.19)
1.06(.07)
.67(.12)
.12(.34)
.69(.25)
.77(.30)
.74(.25)
.69(.14)
.62(.16)
.68 (.17)
.77(.12)
.81 (.16)
.81 (.16)
.86(.15)
.66(.13)
.65(.13)
.63(.18)
.70(.20)
.67(.20)
.89(.17)
.76 (.17)
.79(.18)
.94(.16)
1.02(.32)
.96(.28)
.56(.18)
.49(.19)
.52(.19)
Balanced panel (N = 149)
.73 (.25)
DI-AR, Lag
DI-AR
.78(.28)
DI
.73(.24)
.68 (.16)
.62 (.16)
.69(.1(.15)
.79(.13)
.81 (.15)
.78(.13)
.66(.11)
.66(.11)
.66(.17)
.76(.19)
.68(.19)
.87(.17)
.70(.17)
.81 (.17)
.93(.17)
.97(.28)
.95(.26)
.58(.21)
.52(.19)
.53(.18)
Full dataset; m = 1, p = BIC, k fixed
DI-AR,k= 1
.97(.15)
DI-AR,k=2
.67(.22)
DI-AR, k=3
.64(.23)
DI-AR,k = 4
.64(.23)
.58 (.33)
.77(.15)
.81 (.15)
.80 (.15)
.91 (.07)
.76(.11)
.75(.12)
.74(.13)
.80(.23)
.90(.14)
.89(.14)
.87(.14)
.99(.11)
.64(.18)
.64(.18)
.63(.18)
.52 (.29)
.86(.16)
.88(.17)
.87(.15)
.94(.12)
.84(.13)
.88(.14)
.91 (.16)
.60(.19)
.71(.16)
.66(.17)
.60(.18)
Fulldataset (N = 215)
DI-AR, Lag
DI-AR
DI
RMSE, AR Model
.030
.016
.028
.008
B. Horizon= 24 months
Benchmarkmodels
AR
LI
VAR
1.00
1.09(.28)
1.01 (.10)
.45(.14)
.44(.48)
1.00
1.29(.31)
.98(.06)
.30(.20)
.63(.34)
1.00
1.08(.21)
1.03(.06)
.45(.14)
.13(.85)
1.00
1.07(.31)
1.06(.13)
.47(.15)
.35(.31)
.57 (.24)
.59(.25)
.55(.26)
.88 (.13)
.88 (.15)
.91 (.14)
.70(.20)
.76(.22)
.76(.22)
.94(.23)
.80(.26)
.80(.25)
.66(.18)
.70(.20)
.70(.20)
.95 (.18)
.89 (.19)
.89(.19)
.82(.15)
.74(.19)
.74(.19)
.88(.26)
.97 (.24)
.97 (.24)
Balanced panel (N = 149)
DI-AR, Lag
.57(.25)
DI-AR
.58(.25)
DI
.58(.25)
.87 (.14)
.87 (.14)
.87(.14)
.76(.19)
.83(.20)
.83(.20)
.86(.23)
.74(.24)
.74(.24)
.64(.20)
.67(.19)
.67 (.20)
.94(.18)
.93 (.18)
.94(.19)
.74(.17)
.76(.18)
.75(.18)
1.06(.25)
.94(.25)
.94(.24)
Full dataset; m = 1, p = BIC, k fixed
DI-AR,k= 1
1.12(.19)
DI-AR,k= 2
.76(.19)
DI-AR,k=3
.58(.24)
DI-AR,k= 4
.56(.24)
.10(.46)
.68(.11)
.89(.13)
.90(.14)
1.07(.09)
.88(.13)
.72(.19)
.70(.20)
.81(1.00)
.68(.17)
.90(.18)
.93(.23)
.97(.04)
.65(.20)
.70(.17)
.67(.18)
.90(.62)
.87 (.14)
.89(.14)
.95(.18)
1.03(.07)
.72(.16)
.79(.16)
.78(.16)
.33(.46)
.99(.17)
.95(.24)
.96(.24)
Fulldataset (N = 215)
DI-AR, Lag
DI-AR
DI
RMSE, AR Model
.075
.046
two factors capturesmost of the forecasting improvement.In
most cases, incorporatingBIC factor and lag order selection
provides little or no improvement over just using two factors, with no lags of the factors and no lagged dependent
variables.
The results for the price series are given in Tables 3 and 4.
There are three notable differencesin these results, relative to
those for the real variables. First, the DI-AR, Lag forecasts
outperformall the benchmarkforecasts less often, in only 6
of the 12 variable-horizoncombinations. Second, including
lagged inflationdramaticallyimprovesthe forecasts, and without this the DI forecasts are actually worse than the autoregressive forecasts. Third, other factor forecasts generally outperform the DI-AR, Lag forecasts. Notably, the full data set
DI-AR forecastwith k = 1 (and no lagged factors)outperforms
all the benchmarksin 11 of 12 cases and typically improves
.070
.031
on the DI-AR lag. Thus most of the forecastinggains seem to
come from using a single factor.
As with the real variables, forecasts based on the stacked
data performless well than those based on the unstackeddata.
Although the full dataset forecasts are typically better than
the balanced panel subset forecasts for the 6- and 12-month
horizons, at the 24-monthhorizonthe balancedpanel forecasts
slightly outperformthe full datasetforecasts.
Additional analysis of factor-basedforecasts of CPI and
consumptiondeflatorinflation, and additionalcomparisonsof
these forecasts to other Phillips-curve forecasts and to forecasts based on other leading indicators, were presented by
Stock and Watson (1999). Three findings from that study are
worth noting here. First, the DI-AR and DI-AR, Lag forecasts
are found to performwell relative to a large numberof additional multivariatebenchmarks.Second, the forecasts reported
Stock and Watson: Macroeconomic Forecasting Using Diffusion Indexes
153
Table3. SimulatedOut-of-SampleForecastingResults:Price Inflation,12-MonthHorizon
Consumptiondeflator
CPI
Forecast
method
Rel. MSE
&
Rel. MSE
&
CPIexc. food & energy
Rel. MSE
&
Producerprice index
Rel. MSE
&
Benchmarkmodels
1.00
.79(.15)
.82(.13)
.91 (.09)
.76(.15)
.95(.20)
.74(.20)
1.00
.95(.12)
.92(.10)
1.02(.06)
.58(.17)
.72(.23)
.45(.20)
1.00
1.00(.16)
.79(.18)
.99 (.05)
.50(.21)
.80(.22)
.56(.21)
1.00
.82(.15)
.87(.14)
1.29(.14)
.75(.19)
.96(.30)
.25(.12)
.72(.14)
.71 (.16)
1.30(.16)
.91 (.14)
.83(.13)
.34(.08)
.90(.09)
.90(.10)
1.40(.16)
.65(.13)
.62(.13)
.25(.08)
.84(.15)
.85(.15)
1.55(.31)
.76(.20)
.74(.20)
.24(.06)
.83(.13)
.82(.14)
2.40(.88)
.78(.21)
.75(.20)
.13(.07)
Balanced panel (N = 149)
DI-AR, Lag
.70(.14)
DI-AR
.69(.15)
DI
1.30(.16)
.94(.12)
.88(.13)
.32(.08)
.90 (.08)
.87(.10)
1.34(.13)
.67(.15)
.66(.12)
.26(.09)
.84(.15)
.85(.15)
1.57(.33)
.77(.21)
.73(.20)
.20(.07)
.86(.11)
.85(.14)
2.44(.87)
.77(.21)
.71 (.19)
.14(.06)
.82(.12)
.28(.08)
.87(.09)
1.51 (.18)
.65(.12)
.25(.08)
.85(.15)
1.55(.32)
.77(.21)
.23(.06)
.81 (.14)
3.06(1.89)
.75 (.20)
.11 (.06)
Full dataset; m = 1, p = BIC, k fixed
DI-AR, k= 1
1.14(.14)
.64(.15)
DI-AR,k=2
1.07(.13)
.67(.14)
.91 (.15)
DI-AR, k=3
.76(.13)
DI-AR,k = 4
.89(.15)
.74(.14)
.77(.12)
.83(.09)
.94(.07)
.91 (.09)
.96(.16)
.83(.14)
.61 (.14)
.64(.14)
.71 (.17)
.72(.17)
.86(.14)
.87(.15)
1.25(.23)
.97(.19)
.73(.20)
.72(.21)
.76(.16)
.77(.15)
.86(.11)
.82(.13)
.95(.24)
.93(.23)
.78(.21)
.79(.21)
1.56(.20)
1.58(.20)
1.60(.20)
1.56(.19)
.22(.09)
.21 (.08)
.17(.08)
.21(.08)
1.55(.31)
1.62(.39)
1.69(.43)
1.67(.40)
.23(.06)
.22(.07)
.18(.07)
.19(.07)
2.76(1.61)
2.72(1.56)
2.68(1.49)
2.55(.99)
.12(.07)
.13(.07)
.13(.07)
.16(.06)
AR
LI
Phillips Curve
VAR
Fulldataset (N = 215)
DI-AR, Lag
DI-AR
DI
Stacked balance panel
DI-AR
DI
.73(.15)
1.54(.31)
Fulldataset; m = 1, p = 0, k fixed
DI, k= 1
DI, k=2
DI, k=3
DI, k=4
.25(.07)
.26(.07)
.24(.08)
.25(.07)
1.60(.34)
1.56(.31)
1.57(.32)
1.56(.25)
RMSE, AR Model
.021
.015
here can be further improved on using a single-factor forecast, where the factor is computedfrom a set of variablesthat
measure only real economic activity. Forecasts based on this
real economic activity factor have MSEs approximately10%
less than the best forecasts reportedin Table 3. Finally, similar rankings of methods are obtained using I(1) forecasting
models, ratherthan the 1(2) models used here, that is, when
first ratherthan second differences of log prices are used for
the forecasting equation and factor estimation.
In interpretingthese results, it should be stressed that the
multivariateleading indicator models are sophisticatedforecasting tools that provide a stiff benchmarkagainst which
to judge the diffusion index forecasts. In our judgment, the
performance of the leading indicator models reported here
overstates their true potential out of sample performance,
because the lists of leading indicators used to construct the
forecasts were chosen by model selection methods based
on their forecasting performanceover the past two decades,
as discussed in Section 3. In this light, we consider the
performanceof the various diffusion index models to be particularly encouraging.
.019
.033
Nevertheless,the findingthat good forecastscan be made with
only one or two factors suggests brieflycharacterizingthe first
few factors.
Figure 1 thereforedisplays the R2 of the regressionsof the
215 individualtime series against each of the first six empirical factorsfrom the balancedpanel subset, estimatedover the
full sample period. These R2 are plotted as bar charts with
one chartfor each factor.(The series are groupedby category
and orderednumericallyusing the orderingin the Appendix.)
Broadly speaking, the first factor loads primarily on output
and employment; the second factor on interest rate spreads,
unemploymentrates, and capacity utilization rates; the third,
on interestrates;the fourth,on stock returns;the fifth, on inflation; and the sixth, on housing starts.Takentogether,these six
factors account for 39% of the variance of the 215 monthly
time series in the full dataset, as measuredby the trace-R2;
the first 12 factors together account for 53% of the variance
of these series. (The contributionsto the trace-R2by the first
six factors are, respectively, .137, .085, .048, .040, .034, and
.041, for a total of .385.)
5. DISCUSSIONAND CONCLUSIONS
4.2 EmpiricalFactors
We find two features of the empiricalresults surprisingand
Because the factors are identifiedonly up to a k x k matrix, intriguing. First, only six factors account for much of the
detailed discussion of the individual factors is unwarranted. variance of our 215 time series. One interpretationof this
154
Journal of Business & Economic Statistics, April 2002
Table4. SimulatedOut-of-SampleForecastingResults:Price Inflation,6- and 24-MonthHorizons
Forecast
method
Consumptiondeflator
CPI
Rel. MSE
&
&
Rel. MSE
CPIexc. food &energy
&
Rel. MSE
Producerprice index
Rel. MSE
&
A. Horizon= 6 months
Benchmarkmodels
AR
LI
Phillips Curve
VAR
1.00
.82(.12)
.90(.11)
1.04(.08)
.78(.16)
.80(.27)
.41(.16)
1.00
1.04(.09)
.99(.06)
1.15(.07)
.42(.16)
.54(.23)
.08(.20)
1.00
1.10(.16)
.90(.11)
1.00(.05)
.32(.27)
.68(.19)
.50(.21)
1.00
1.00(.09)
1.02(.04)
1.34(.16)
.51(.19)
.34(.37)
.19(.12)
.73(.14)
.74(.14)
1.57(.25)
1.05(.18)
1.01(.19)
.21(.08)
.91 (.08)
.89(.08)
1.68(.26)
.71(.17)
.79(.18)
.10(.08)
.83(.13)
.83(.13)
1.74(.43)
.89(.25)
.89(.25)
.13(.07)
.87(.11)
.87(.10)
2.42(.74)
.87(.26)
.87(.26)
.05(.07)
Balanced panel (N = 149)
DI-AR, Lag
.79(.13)
DI-AR
.78(.13)
DI
1.59(.26)
1.00(.22)
.94(.21)
.19(.08)
.97(.07)
.96(.08)
1.64(.21)
.59(.18)
.60(.18)
.09(.08)
.85(.13)
.85(.13)
1.73(.43)
.85(.25)
.85(.25)
.13(.07)
.91 (.09)
.91 (.09)
2.42(.70)
.78(.27)
.82(.29)
.07(.07)
Full dataset; m = 1, p = BIC, k fixed
.71 (.14)
DI-AR,k= 1
DI-AR,k=2
.72(.14)
DI-AR,k=3
.76(.13)
DI-AR,k=4
.76(.13)
1.15(.19)
1.03(.18)
.97(.18)
.96(.19)
.85(.09)
.88(.08)
.93(.08)
.93(.08)
.91(.20)
.78(.17)
.66(.17)
.65(.17)
.85(.11)
.80(.13)
.86(.12)
.88(.12)
1.13(.29)
1.00(.24)
.82(.25)
.79(.25)
.85(.12)
.86(.12)
.91(.10)
.90(.10)
.90(.26)
.86(.26)
.76(.26)
.75(.25)
Fulldataset (N = 215)
DI-AR, Lag
DI-AR
DI
RMSE, AR Model
.010
.007
Benchmarkmodels
AR
LI
Phillips Curve
VAR
.009
.017
B. Horizon= 24 months
1.00
.70(.21)
.84(.12)
.92(.08)
.76(.12)
.77(.08)
.80(.22)
1.00
.70(.20)
.81(.15)
.98(.06)
.78(.11)
.80(.09)
.57(.18)
1.00
.99(.29)
.72(.21)
1.00(.06)
.51(.25)
.93(.19)
.49(.34)
1.00
.65(.22)
.77(.19)
1.18(.12)
.84(.19)
1.00(.06)
.29(.10)
.74(.23)
.75(.25)
1.18(.22)
.74(.18)
.67(.16)
.40(.12)
.75(.16)
.71(.21)
1.21(.18)
.79(.13)
.73(.12)
.38(.12)
.92(.26)
.96(.33)
1.40(.22)
.58(.28)
.53(.27)
.30(.07)
.82(.14)
.77(.17)
2.09(.72)
.68(.12)
.68(.13)
.19(.09)
Balanced panel (N = 149)
DI-AR,Lag
.59(.22)
DI-AR
.70(.24)
DI
1.07(.20)
.95(.12)
.72(.13)
.46(.12)
.67(.18)
.70(.20)
1.08(.18)
.84(.10)
.75(.12)
.45(.12)
.84(.22)
.87(.29)
1.43(.22)
.69(.24)
.61 (.25)
.27(.07)
.76(.14)
.86(.15)
2.10(.70)
.78(.13)
.62(.11)
.19(.08)
1.04(.18)
1.07(.17)
.82(.23)
.81(.21)
.68(.17)
.72(.16)
.80(.12)
.74(.15)
.97(.15)
.92(.13)
.83(.13)
.83(.14)
.60(.25)
.64(.24)
.94(.25)
.92(.26)
1.12(.20)
.96(.17)
.56(.29)
.59(.29)
.73(.17)
.68(.19)
.81(.11)
.78(.14)
.93(.22)
.97(.20)
.80(.14)
.78(.14)
Fulldataset (N = 215)
DI-AR, Lag
DI-AR
DI
Full dataset; m = 1, p = BIC, k fixed
DI-AR,k= 1
.63(.20)
DI-AR,k=2
.61(.21)
DI-AR,k=3
.80(.17)
DI-AR,k=4
.76(.20)
RMSE, AR Model
.052
.038
result is that there are only a few importantsources of macroeconomic variability.Second, just a few factors are needed to
forecast real activity, and the most accurateforecasts of inflation use lags of inflation together with a single factor. This
suggests that a very small state vector may be necessary for
forecasting macroeconomictime series.
These results raise several issues for future empirical and
theoreticalresearch.We mention five here. First, classical diffusion indexes are computed using nonlineartransformations
of the data, but our indexes are linear functions of the data.
This raises the possibility that further forecasting gains can
be realized using a nonlinear version of the dynamic factor
model. Second, the results reportedhere rely on monthly data,
but data from other sampling frequencies (weekly, quarterly)
may improve the forecasts. A computational algorithm for
.046
.077
estimating the factors with mixed frequency data is outlined
in Appendix A. Third, we considered only U.S. data, and it
would be useful to study the relative forecasting performance
of these methods for other countries. Fourth, the estimated
factors that we used here were based on simple estimators
and it would be useful to study other estimatorsdesigned to
exploit the heteroscedasticityand serial correlationin the data
to improve efficiency. Finally, our results are based on 215
time series chosen judgementally from the large number of
available macroeconomic time series. Would there be additional improvementsif we were to use 500 series or much
loss by restrictingourselves to only 100 series? Alternatively,
the problem of systematically selecting many series from
very many series is a difficult problem that requires further
research.
Stock and Watson: MacroeconomicForecastingUsing DiffusionIndexes
155
Factor
#1
0
CD
0o
CD:lull.,
..
10
0
20
30
Out
40
50
60
70
RTS PCE
Emp
..
80
90
HSS
100
..
110
120
Ord
Inv
Factor
130
SPr
140
150
EXR
Int
140
150
EXR
Int
160
170
180
190
200
Pri
Mon
210
AHE Oth
#2
CD
C
00C)
10
C
20
30
Out
40
50
60
70
RTS PCE
Emp
80
90
HSS
100
Inv
110
120
SPr
Ord
Factor
130
160
170
180
Mon
190
200
Pri
210
AHE Oth
#3
Ll
II,,
...C.
0
.
10
... J..lI..II
....
I,.....lL
.II zL
. ...
20
30
Out
40
Ii.
50
60
70
RTS PCE
Emp
IIi...__
, L I_,,,,.
80
90
HSS
100
a
11020
0
SPr
Ord
Inv
Factor
.... ........I50
130
140
150
EXR
Int
140
150
EXR
EXR
Int
Int
140
150
....
J.1 ..I ..i
160
170
180
Mon
190
..
200
Pri
210
AHE Oth
#4
CD
L-i
oCD
I
CT
o
10
.0
20
30
Out
Out
40
50
60
70
RTS
PCE
RTS PCE
Emp
Emp
80
90
HSS
HSS
100
Inv
110
120
Ord
Ord
Inv
Factor
130
SPr
SPr
160
170
180
Mon
Mon
190
200
Pri
Pri
210
AHE Oth
AHE
Oth
#5
#4
CD
C
_5 L)
C
10
Out
20
30
40
Emp
50
60
70
RTS PCE
80
HSS
90
100
Inv
110
Ord
120
130
SPr
EXR
Int
160
170
Mon
and credit quantity aggregates (Mon); price indexes (Pri); average hourly earnings (AHE);miscellaneous (0th).
180
190
Pri
200
210
AHE
Oth
Journalof Business & EconomicStatistics,April2002
156
ACKNOWLEDGMENTS
To carry out the calculations,note that
The materialin this article originally appearedin our paper
titled "Diffusion Indexes."We thank Michael Boldin, Frank
Q(Xt, F, A, F, A)
Diebold, Gregory Chow, Andrew Harvey, Lucrezia Reichlin,
= Z{Ef•7(xt I Xt) + (AF,)2 - 2it(A'Ft)}, (A.4)
Ken Wallis, CharlesWhiteman,and several referees for helpi
t
ful discussions and comments and Lewis Chan and Alexei
Onatskifor skilled researchassistance.This researchwas supwhere Xi, = E• •(Xi, I Xt). The first term on the right side of
ported in part by National Science Foundationgrants SBR(A.4) does not depend on F or A, and so for purposesof min9409629 and SBR-9730489.
imization it can be replaced by Ei L
E 7t. This implies that
the values of F and A that minimize (A.4) can be calculated
as the minimizers of V(F, A) = Ei Z,(Xi, - A•F,)2. At the
WITHAN
APPENDIXA: EMESTIMATION
PANELAND DATAIRREGULARITIES jth step, this reduces to the usual principalcomponenteigenUNBALANCED
value calculationwhere the missing data are replacedby their
In practice, when N is large one encounters various expectation conditional on the observed data and using the
data irregularities,including occasionally missing observa- parametervalues from the previousiteration.If the full dataset
tions, unbalancedpanels, and mixed frequency (for example, contains a subset that constitutesa balanced panel, then startmonthly and quarterly)data. In this case, a modification of ing values for F in the EM iteration can be obtained using
standardprincipalcomponentestimationis necessary.To moti- estimates from the balanced
panel subset.
vate the modification,consider the least squaresestimatorsof
We now provide some additional details on the calcuA and F, from (2.4) from a balanced panel. The objective lation of
Xit, for some important special cases. Let Xi =
function is
(Xi ..., Xi)', and let Xt be the vector of observationson the
ith variable. Suppose that Xt = AiX, for some known matrix
N T
=
Ai, as can be done in the cases of missing values and tempoV(F, A)
(A.1)
E E(Xit -AiF,)2,
i=1 t=1
ral aggregation,for example. Then E(Xi IXt) = E(X I Xt) =
FAi + A'(AiA')-(Xt AiFAi), where (AiA)- is the generalwhere Ai is the ith row of A. (A.1) can be minimized by the ized inverse of
AiAi. The particularsof these calculationsare
usual eigenvalue calculationsand F, are the principalcompo- now
for
some importantspecial cases. In the first
presented
nents of X,.
four special cases discussed, this level of generalityis unnecWhen the panel is unbalanced,least squares estimators of
essary and the formula for Xit follows quite simply from the
F, can be calculatedfrom the objective function
natureof the data irregularity.
A. Missing Observations. Suppose some observationson
NT
are missing. Then, during iteration j, the elements of
Vt(F, A) ==
(A.2) Xi,
li,(Xi, --AF,)2,
the
estimated
balanced panel are constructedas Xi, = Xit if
i=1 t=l
observed
and
Xi,
Xi = AiF, otherwise. The estimate of F
where li, = 1 if Xi, is availableand 0 otherwise.Minimization is then updated by computing the eigenvectors correspondof (A.2) requiresiterativemethods.This appendixsummarizes ing to the largest r eigenvalues of N-1 Ei XiXi, where Xi =
an iterativemethodbased on the EM algorithmthathas proved (Xi,9 Xi2 ..., XiT)'. The estimate of A is updatedby the ordito be easy and effective.
nary least squaresregressionof X onto this updated estimate
To motivatethis EM algorithm,notice that V(F, A) is pro- of F.
B. Mixed Monthly and Quarterly Data-I(O) Stock Variportional to the log-likelihood under the assumptionthat Xi,
are iid N(A'F,, 1), in which case the least squaresestimators ables. A series that is observedquarterlyand is a stock variare the Gaussianmaximumlikelihood estimators.Because Vt able would be the point-in-timelevel of a variableat the end
is just a missing data version of V and because minimization of the quarter,say, the level of inventories at the end of the
of V is computationallysimple, a simple EM algorithmcan quarter.If this series is I(0), then it is handled as in case A;
be constructedto minimize Vt.
that is, it is treatedas a monthly series with missing observaThe jth iterationof the algorithmis defined as follows. Let tions in the first and second months of the quarter.
A and F denote estimates of A and F constructedfrom the
C. Mixed Monthly and Quarterly Data--l(O) Flow Vari(j- 1)st iteration,and let
ables. A quarterly flow variable is the average (or sum)
of unobserved monthly values. If this series is I(0), it can
Q(Xt, F, A, F, A) = E•,[V(F, A) I Xt],
(A.3) be treated as follows. The unobserved monthly series, Xit,
is measured only as the time aggregate Xiq, where Xq =
where Xt denotes the full set of observed data and (1/3)(Xi, t_2 + Xi,_1 + Xi,) for t = 3, 6, 9, 12 ...
and X,
of
In
this
case
estimafor
other
values
is
all
is
the
value
of
the
data
missing
t.
A)
expected
complete
E;i[V(F,
IXt]
log-likelihood V(F, A), evaluatedusing the conditional den- tion proceeds as in case A but with Xi, = AiF,+ eit, where
sity of X IXt evaluatedat F and A. The estimates of F and ei, = Xq, - Ai(F7-2 + F,_1 + F7)/3, where 7 = 3 when t =
A at iterationj solve MinF, AQ(Xt, F, A, F, A).
1,2, 3, 7 = 6, when t = 4, 5, 6, and so forth.
Z
Stock and Watson: MacroeconomicForecastingUsing DiffusionIndexes
D. Mixed Monthly and Quarterly Data--I(1) stock variables. Suppose that underlyingmonthly data are I(1) and let
X1i'denote the quarterlyfirstdifferencestock variable,assumed
to be measured in the third month of every quarter,and let
Xit denote the monthly first difference of the variable. Then
X = (Xi, t2 + Xi, t-1 + X it) for t - 3, 6, 9, 129.... and X is
missing for all other values of t. In this case estimationproceeds as in case A but with Xit = AiFt+ (1/3)eit, where it =
=
X, - A'(Fi-2 + F,-1 + F), where 7 3 when t= 1, 2, 3, r =
6, when t = 4, 5, 6, and so forth.
E. Mixed Monthly and Quarterly Data--I(1) Flow Variables. Constructionof Xit is more difficult here than in the
earliercases. Here the general regressionformulagiven above
can be implementedafter specifying Xj and Ai. Let the quarterly first differences be denoted by Xq, which is assumed to
be observed at the end of every quarter.The vector of observations is then X= (Xq3,X9 ...
.)', where denotes the
month of the last quarterlyobservation.If the underlyingquarterly data are averages of monthly series, and if the monthly
first differences are denoted by Xit, then Xi, = (1/3)(Xi,t, +
2Xi,t_ + 3Xit-2 + 2Xit-3 + Xit-4) for t = 3, 6, 9, 12,..... and
this implicitly defines the rows of Ai. Then the estimate of Xi
is given by X. = FAi +A'(AiA>)-'(XI - AiFAi).
157
APPENDIXB: DATADESCRIPTION
The time series used to constructthe diffusion index forecasts discussed in Section 5 are presentedhere. The format is
as follows: series number,series mnemonic, data span used,
transformationcode, and brief series description.The transformation codes are 1 = no transformation,2 = first difference,
4 = logarithm,5 = first difference of logarithms, 6 = second
difference of logarithms.An asterisk after the date denotes a
series that was included in the unbalancedpanel but not the
balanced panel, either because of missing data or because of
gross outliers that were treated as missing data. The series
either were taken directly from the DRI-McGraw-Hill Basic
Economics database, in which case the original mnemonics
are used, or were produced by author calculations based on
data from that database, in which case the author calculations and original DRI-McGrawseries mnemonics are summarized in the data descriptionfield. The following abbreviations appearin the data definitions:SA = seasonally adjusted,
NSA = not seasonally adjusted,SAAR = seasonally adjusted
at an annualrate, FRB = FederalReserve Board,AC = Author
calculations.
Real output and income (Out)
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
ip
ipp
ipf
ipc
ipcd
ipcn
ipe
ipi
ipm
ipmd
ipmnd
ipmfg
ipd
ipn
ipmin
iput
ipx
ipxmca
ipxdca
ipxnca
ipxmin
ipxut
pmi
pmp
gmpyq
gmyxpq
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1967:01-1998:12*
1959:01-1998:12
1967:01-1998:12*
1967:01-1998:12*
1967:01-1998:12*
1967:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12*
1959:01-1998:12
Employmentand hours (EMP)
27.
lhel
1959:01-1998:12
28.
lhelx
1959:01-1998:12
29.
lhem
1959:01-1998:12
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
1
1
1
1
1
1
1
1
5
5
industrialproduction:total index (1992 = 100, sa)
industrialproduction:products,total (1992 = 100, sa)
industrialproduction:final products(1992 = 100, sa)
industrialproduction:consumer goods (1992 = 100, sa)
industrialproduction:durableconsumer goods (1992 = 100, sa)
industrialproduction:nondurableconsumergoods (1992 = 100, sa)
industrialproduction:business equipment(1992 = 100, sa)
industrialproduction:intermediateproducts(1992 = 100, sa)
industrialproduction:materials(1992 = 100, sa)
industrialproduction:durablegoods materials(1992 = 100, sa)
industrialproduction:nondurablegoods materials(1992 = 100, sa)
industrialproduction:manufacturing(1992 = 100, sa)
industrialproduction:durablemanufacturing(1992 = 100, sa)
industrialproduction:nondurablemanufacturing(1992 = 100, sa)
industrialproduction:mining (1992 = 100, sa)
industrialproduction:utilities (1992- = 100, sa)
capacity util rate: total industry(% of capacity,sa)(frb)
capacity util rate: manufacturing,total (% of capacity,sa)(frb)
capacity util rate: durablemfg (% of capacity,sa)(frb)
capacity util rate:nondurablemfg (% of capacity,sa)(frb)
capacity util rate: mining (% of capacity,sa)(frb)
capacity util rate: utilities (% of capacity,sa)(frb)
purchasingmanagers'index (sa)
NAPM productionindex (percent)
personal income (chained) (series #52) (bil 92$, saar)
personal income less transferpayments (chained) (#51) (bil 92$, saar)
5
4
5
index of help-wantedadvertisingin newspapers(1967 = 100; sa)
employment:ratio; help-wantedads:no. unemployed clf
civilian labor force: employed, total (thous.,sa)
Journalof Business & EconomicStatistics,April2002
158
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.
41.
42.
43.
44.
45.
46.
47.
48.
49.
50.
51.
52.
53.
54.
lhnag
lhur
lhu680
lhu5
lhul4
lhul5
lhu26
lpnag
lp
lpgd
lpmi
lpcc
lpem
lped
lpen
lpsp
lptu
lpt
lpfr
lps
lpgov
lw
lphrm
lpmosa
pmemp
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1964:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
5
1
1
1
1
1
1
5
5
5
5
5
5
5
5
5
5
5
5
5
5
2
1
1
1
civilian labor force: employed, nonagric.industries(thous.,sa)
unemploymentrate: all workers, 16 years & over (%,sa)
unemploy.by duration:average (mean) durationin weeks (sa)
unemploy.by duration:persons unempl. less than 5 wks (thous.,sa)
unemploy.by duration:persons unempl. 5 to 14 wks (thous.,sa)
unemploy.by duration:persons unempl. 15 wks + (thous.,sa)
unemploy.by duration:persons unempl. 15 to 26 wks (thous.,sa)
employees on nonag. payrolls:total (thous.,sa)
employees on nonag. payrolls: total, private(thous.,sa)
employees on nonag. payrolls: goods-producing(thous.,sa)
employees on nonag. payrolls: mining (thous.,sa)
employees on nonag. payrolls: contractconstruction(thous.,sa)
employees on nonag. payrolls: manufacturing(thous.,sa)
employees on nonag. payrolls: durablegoods (thous.,sa)
employees on nonag. payrolls: nondurablegoods (thous.,sa)
employees on nonag. payrolls: service-producing(thous.,sa)
employees on nonag. payrolls:trans. & public utilities (thous.,sa)
employees on nonag. payrolls: wholesale & retail trade (thous.,sa)
employees on nonag. payrolls: finance, insur. & real estate (thous.,sa)
employees on nonag. payrolls: services (thous.,sa)
employees on nonag. payrolls: government(thous.,sa)
avg. weekly hrs. of prod. wkrs.: total private(sa)
avg. weekly hrs. of productionwkrs.: manufacturing(sa)
avg. weekly hrs. of prod. wkrs.: mfg., overtime hrs. (sa)
NAPM employmentindex (percent)
Real retail, manufacturingand trade sales (RTS)
55.
1959:01-1998:12 5 manufacturing& trade:total (mil of chained 1992 dollars)(sa)
msmtq
1959:01-1998:12 5 manufacturing& trade:manufacturing;total (mil of chained 1992 dollars)(sa)
56.
msmq
1959:01-1998:12
5 manufacturing& trade:mfg; durablegoods (mil of chained 1992 dollars)(sa)
57.
msdq
1959:01-1998:12
5 manufact.& trade:mfg; nondurablegoods (mil of chained 1992 dollars)(sa)
58.
msnq
5 merchantwholesalers:total (mil of chained 1992 dollars)(sa)
1959:01-1998:12
59.
wtq
5
merchantwholesalers:durablegoods total (mil of chained 1992 dollars)(sa)
1959:01-1998:12
60.
wtdq
5
merchantwholesalers:nondurablegoods (mil of chained 1992 dollars)(sa)
61.
1959:01-1998:12
wtnq
retail
trade:total (mil of chained 1992 dollars)(sa)
5
62.
1959:01-1998:12
rtq
retail
trade:
nondurablegoods (mil of 1992 dollars)(sa)
1959:01-1998:12
5
63.
rtnq
Consumption(PCE)
64.
1959:01-1998:12
gmcq
1959:01-1998:12
65.
gmcdq
1959:01-1998:12
66.
gmcnq
1959:01-1998:12
67.
gmcsq
68.
gmcanq 1959:01-1998:12
5
5
5
5
5
personalconsumptionexpend (chained)-total(bil 92$, saar)
personalconsumptionexpend (chained)-totaldurables(bil 92$, saar)
personalconsumptionexpend (chained)-nondurables(bil 92$, saar)
personalconsumptionexpend (chained)-services(bil 92$, saar)
personalcons expend (chained)-newcars (bil 92$, saar)
Housing starts and sales (HSS)
hsfr
1959:01-1998:12
69.
hsne
1959:01-1998:12
70.
71.
hsmw
1959:01-1998:12
hssou
72.
1959:01-1998:12
hswst
1959:01-1998:12
73.
74.
hsbr
1959:01-1998:12
75.
hsbne
1960:01-1998:12*
hsbmw 1960:01-1998:12*
76.
77.
hsbsou
1960:01-1998:12*
hsbwst
78.
1960:01-1998:12*
79.
hns
1963:01-1998:12*
hnsne
80.
1973:01-1998:12*
81.
hnsmw 1973:01-1998:12*
hnssou
82.
1973:01-1998:12*
4
4
4
4
4
4
4
4
4
4
4
4
4
4
housing starts:nonfarm(1947-58); total farm & nonfarm(1959-) (thous.,sa)
housing starts:northeast(thous.u.) s.a.
housing starts:midwest (thous.u.) s.a.
housing starts:south (thous.u.) s.a.
housing starts:west (thous.u.) s.a.
housing authorized:total new priv housing units (thous.,saar)
houses authorizedby build, permits:northeast(thous.u.) s.a.
houses authorizedby build. permits:midwest (thous.u.) s.a.
houses authorizedby build. permits:south (thous.u.) s.a.
houses authorizedby build. permits:west (thous.u.) s.a.
new 1-family houses sold during month (thous, saar)
one-family houses sold: northeast(thous.u.,s.a.)
one-family houses sold: midwest (thous.u.,s.a.)
one-family houses sold: south (thous.u.,s.a.)
Stock and Watson: MacroeconomicForecastingUsing DiffusionIndexes
83.
84.
85.
86.
87.
88.
89.
90.
hnswst
hnr
hniv
hmob
contc
conpc
conqc
condo9
1973:01-1998:12*
1963:01-1998:12*
1963:01-1998:12*
1959:01-1998:12
1964:01-1998:12*
1964:01-1998:12*
1964:01-1998:12*
1959:01-1998:10*
4
4
4
4
4
4
4
4
159
one-family houses sold: west (thous.u.,s.a.)
new 1-family houses, month's supply @ currentsales rate (ratio)
new 1-family houses for sale at end of month (thous,sa)
mobile homes: manufacturers'shipments(thous. of units, saar)
construct.put in place: total priv & public 1987$ (mil$, saar)
construct.put in place: total private 1987$ (mil$, saar)
construct.put in place: public construction87$ (mil$, saar)
construct.contracts:comm'l & indus.bldgs (mil.sq.ft.floorsp.; sa)
Real inventoriesand inventory-salesratios (Inv)
5
5
5
5
5
5
2
2
2
2
1
manufacturing& trade inventories:total (mil of chained 1992)(sa)
inventories,business, mfg (mil of chained 1992 dollars, sa)
inventories,business durables(mil of chained 1992 dollars, sa)
inventories,business, nondurables(mil of chained 1992 dollars, sa)
manufacturing& tradeinv: merchantwholesalers (mil of chained 1992 dollars)(s
manufacturing& tradeinv: retail trade (mil of chained 1992 dollars)(sa)
ratio for mfg & trade:inventory/sales(chained 1992 dollars, sa)
ratio for mfg & trade:mfg; inventory/sales(87$)(s.a.)
ratio for mfg & trade:wholesaler;inventory/sales(87$)(s.a.)
ratio for mfg & trade:retail trade;inventory/sales(87$)(s.a.)
napm inventoriesindex (percent)
1
1
5
5
5
5
5
5
5
5
5
5
5
5
5
5
napm new orders index (percent)
napm vendor deliveries index (percent)
new orders (net)-consumergoods & materials, 1992 dollars (bci)
new orders, durablegoods industries, 1992 dollars (bci)
new orders, nondefense capital goods, in 1992 dollars (bci)
mfg new orders:all manufacturingindustries,total (mil$, sa)
mfg new orders:mfg industrieswith unfilled orders (mil$, sa)
mfg new orders:durablegoods industries,total (mil$, sa)
mfg new orders:durablegoods indust with unfilled orders (mil$, sa)
mfg new orders:nondurablegoods industries,total (mil$, sa)
mfg new orders:nondurablegds ind. with unfilled orders (mil$, sa)
mfg unfilled orders:all manufacturingindustries,total (mil$, sa)
mfg unfilled orders:durablegoods industries,total (mil$, sa)
mfg unfilled orders:nondurablegoods industries,total (mil$, sa)
contracts& ordersfor plant & equipment(bil$, sa)
contracts& orders for plant & equipmentin 1992 dollars (bci)
Stockprices (SPr)
1959:01-1998:12
118. fsncom
1966:01-1998:12*
119. fsnin
1966:01-1998:12*
120. fsntr
121. fsnut
1966:01-1998:12*
122. fsnfi
1966:01-1998:12*
1959:01-1998:12
123. fspcom
1959:01-1998:12
124. fspin
1959:01-1998:12
125. fspcap
126. fsptr
1970:01-1998:12"
127. fsput
1959:01-1998:12
128. fspfi
1970:01-1998:12*
1959:01-1998:12
129. fsdxp
1959:01-1998:12
130. fspxe
131. fsnvv3
1974:01-1998:07*
5
5
5
5
5
5
5
5
5
5
5
1
1
5
NYSE common stock price index: composite (12/31/65 = 50)
NYSE common stock price index: industrial(12/31/65 = 50)
NYSE common stock price index: transportation(12/31/65 = 50)
NYSE common stock price index: utility (12/31/65 = 50)
NYSE common stock price index: finance (12/31/65 = 50)
S&P's common stock price index: composite (1941-43 = 10)
S&P's common stock price index: industrials(1941-43 = 10)
S&P's common stock price index: capital goods (1941--43 = 10)
S&P's common stock price index: transportation(1970 = 10)
S&P's common stock price index: utilities (1941-43 = 10)
S&P's common stock price index: financial (1970 = 10)
S&P's composite common stock: dividend yield (% per annum)
S&P's composite common stock: price-earningsratio (%,nsa)
NYSE mkt composition:reptd share vol by size, 5000 + shrs,%
Exchange rates (EXR)
1959:01-1998:12
132. exrus
1959:01-1998:12
133. exrger
5
5
United States effective exchange rate (merm) (index no.)
foreign exchange rate: Germany(deutsche markper U.S.$)
91.
92.
93.
94.
95.
96.
97.
98.
99.
100.
101.
ivmtq
ivmfgq
ivmfdq
ivmfnq
ivwrq
ivrrq
ivsrq
ivsrmq
ivsrwq
ivsrrq
pmnv
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
Ordersand unfilledorders (Ord)
102.
103.
104.
105.
106.
107.
108.
109.
110.
111.
112.
113.
114.
115.
116.
117.
pmno
pmdel
mocmq
mdoq
msondq
mo
mowu
mdo
mduwu
mno
mnou
mu
mdu
mnu
mpcon
mpconq
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
Journalof Business & EconomicStatistics,April2002
160
134.
135.
136.
137.
exrsw
exrjan
exruk
exrcan
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12*
1959:01-1998:12
5
5
5
5
foreign exchange rate: Switzerland(swiss franc per U.S.$)
foreign exchange rate: Japan(yen per U.S.$)
foreign exchange rate: United Kingdom (cents per pound)
foreign exchange rate: Canada(canadian$ per U.S.$)
2
2
2
2
2
2
2
2
2
1
2
1
1
1
1
1
1
1
1
1
interestrate: federal funds (effective) (% per annum,nsa)
interestrate: 90 day commercialpaper,(ac) (% per ann,nsa)
interestrate: U.S. treasurybills, sec mkt, 3-mo. (% per ann,nsa)
interestrate: U.S. treasurybills, sec mkt, 6-mo. (% per ann,nsa)
interestrate: U.S. treasuryconst maturities,1-yr. (% per ann,nsa)
interestrate: U.S. treasuryconst maturities,5-yr. (% per ann,nsa)
interestrate:U.S. treasuryconst maturities,10-yr. (% per ann,nsa)
bond yield: moody's aaa corporate(% per annum)
bond yield: moody's baa corporate(% per annum)
weighted avg foreign interest rate (%,sa)
secondarymarketyields on fha mortgages (% per annum)
spreadfycp - fyff
spreadfygm3 - fyff
spreadfygm6 - fyff
spreadfygtl - fyff
spreadfygt5 - fyff
spreadfygtl0 - fyff
spreadfyaaac - fyff
spreadfybaac - fyff
spreadfyfha - fyff
Interest rates (Int)
138.
139.
140.
141.
142.
143.
144.
145.
146.
147.
148.
149.
150.
151.
152.
153.
154.
155.
156.
157.
fyff
fycp90
fygm3
fygm6
fygtl
fygt5
fygtl0
fyaaac
fybaac
fwafit
fyfha
sfycp
sfygm3
sfygm6
sfygtl
sfygt5
sfygtl0
sfyaaac
sfybaac
sfyfha
1959:01-1998:12*
1959:01-1998:12*
1959:01-1998:12*
1959:01-1998:12*
1959:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1973:01-1994:04*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
Money and credit quantityaggregates (Mon)
1959:01-1998:12 6 money stock: ml (curr,trav.cks,dem dep, other ck'able dep) (bil$, sa)
158. fml
fm2
159.
1959:01-1998:12 6 money stock: m2 (ml + o'nite rps, euro$, g/p&b/d mmmfs&sav&smtime dep) (bil$,
1959:01-1998:12 6 money stock: m3 (m2 + Ig time dep, term rp's&instonly mmmfs) (bil$, sa)
160. fm3
161. fml
1959:01-1998:09* 6 money stock: 1 (m3 + other liquid assets) (bil$, sa)
162. fm2dq
1959:01-1998:12 5 money supply-m2 in 1992 dollars (bci)
fmfba
163.
1959:01-1998:12 6 monetarybase, adj for reserve requirementchanges (mil$, sa)
164. fmrra
1959:01-1998:12 6 depositoryinst reserves:total, adj for reserve req chgs (mil$, sa)
165. fmrnbc 1959:01-1998:12 6 depositoryinst reserves:nonborrow+ ext cr, adj res req cgs (mil$, sa)
166. fcls
1973:01-1998:12* 5 loans & sec @ all coml banks: total (bils, sa)
167. fcsgv
1973:01-1998:12* 5 loans & sec @ all coml banks:U.S. govt securities (bil$, sa)
1973:01-1998:12* 5 loans & sec @ all coml banks: real estate loans (bil$, sa)
168. fclre
1973:01-1998:12* 5 loans & sec @ all coml banks: loans to individuals(bil$, sa)
169. fclin
170. fclnbf
1973:01-1994:01* 5 loans & sec @ all coml banks: loans to nonbankfin inst (bil$, sa)
171. fclnq
1959:01-1998:12* 5 commercial & industrialloans outstandingin 1992 dollars (bci)
172. fclbmc
1959:01-1998:12* 1 wkly rp Ig com'l banks: net change com'l & indus loans (bil$, saar)
1959:01-1995:09* 1 consumerinstal. loans: delinquencyrate, 30 days & over, (%,sa)
173. cci30m
174. ccint
1975:01-1995:09* 1 net change in consumerinstal cr: total (mil$, sa)
175. ccinv
1975:01-1995:09* 1 net change in consumerinstal cr: automobile(mil$, sa)
176. ccinrv
1980:01-1995:09* 1 net change in consumerinstal cr: revolving (mil$, sa)
Price indexes (Pri)
177.
178.
179.
180.
181.
182.
183.
184.
185.
186.
pmcp
pwfsa
pwfcsa
pwimsa
pwcmsa
pwfxsa
pwl60a
pwl50a
psm99q
punew
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12*
1959:01-1998:12*
1967:01-1998:12*
1974:01-1998:12*
1974:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1
6
6
6
6
6
6
6
6
6
napm commodity prices index (percent)
producerprice index: finished goods (82 = 100, sa)
producerprice index: finished consumergoods (82 = 100, sa)
producerprice index: intermedmat. supplies & components(82 = 100, sa)
producerprice index: crude materials(82 = 100, sa)
producerprice index: finished goods, excl. foods (82 = 100, sa)
producerprice index: crude materialsless energy (82 = 100, sa)
producerprice index: crude nonfood mat less energy (82 = 100, sa)
index of sensitive materialsprices (1990 = 100) (bci-99a)
cpi-u: all items (82-84 = 100,sa)
Stock and Watson: MacroeconomicForecastingUsing DiffusionIndexes
187.
188.
189.
190.
191.
192.
193.
194.
195.
196.
197.
198.
199.
200.
201.
202.
1967:01-1998:12*
1967:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1975:01-1998:12*
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
1959:01-1998:12
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
cpi-u: food & beverages (82-84 = 100, sa)
cpi-u: housing (82-84 = 100, sa)
cpi-u: apparel& upkeep (82-84 = 100, sa)
cpi-u: transportation(82-84 = 100, sa)
cpi-u: medical care (82-84 = 100, sa)
cpi-u: commodities (82-84 = 100, sa)
cpi-u: durables(82-84 = 100, sa)
cpi-u: services (82-84 = 100, sa)
cpi-u: all items less food (82-84 = 100, sa)
cpi-u: all items less shelter (82-84 = 100, sa)
cpi-u: all items less medical care (82-84 = 100, sa)
commodities price: gold, london noon fix, avg of daily rate, $ per oz
pce, impl pr defl: pce (1987 = 100)
pce, impl pr defl: pce; durables(1987 = 100)
pce, impl pr defl: pce; nondurables(1987 = 100)
pce, impl pr defl: pce; services (1987 = 100)
Average hourly earnings (AHE)
203. leh
1964:01-1998:12*
204. lehcc
1959:01-1998:12
205. lehm
1959:01-1998:12
206. lehtu
1964:01-1998:12*
207. lehtt
1964:01-1998:12*
208. lehfr
1964:01-1998:12*
209. lehs
1964:01-1998:12*
6
6
6
6
6
6
6
avg hr earningsof prod wkrs: total privatenonagric ($, sa)
avg hr earningsof constr wkrs: construction($, sa)
avg hr earnings of prod wkrs: manufacturing($, sa)
avg hr earningsof nonsupv wkrs: trans & public util ($, sa)
avg hr earningsof prod wkrs: wholesale & retail trade (sa)
avg hr earnings of nonsupv wkrs: finance, insur, real est ($, sa)
avg hr earnings of nonsupv wkrs: services ($, sa)
pu81
puh
pu83
pu84
pu85
puc
pucd
pus
puxf
puxhs
puxm
pcgold
gmdc
gmdcd
gmdcn
gmdcs
161
Miscellaneous (Oth)
210.
211.
212.
213.
214.
215.
fste
fstm
ftmd
fstb
ftb
hhsntn
1986:01-1998:12* 5
1986:01-1998:12* 5
1986:01-1998:12* 5
1986:01-1998:12* 2
1986:01-1998:12* 2
1959:01-1998:12
1
U.S. mdse exports:total exports (f.a.s. value) (mil.$, s.a.)
U.S. mdse imports:general imports (c.i.f. value) (mil.$, s.a.)
U.S. mdse imports:general imports (customs value) (mil.$, s.a.)
U.S. mdse tradebalance: exports less imports (fas/cif) (mil.$, s.a.)
U.S. mdse tradebalance: exp. (fas) less imp. (custom) (mil.$, s.a.)
u. of mich. index of consumer expectations(bcd-83)
[Received May 2000. Revised March 2001.]
"Lets Get Real: A Dynamic Factor Analytical Approach to
((1998),
DisaggregatedBusiness Cycle,"Review of Economic Studies, 65, 453-474.
Fuhrer,J. C. (1995), "The Phillips Curve is Alive and Well,"New England
Economic Review of the Federal Reserve Bank of Boston, March/April,
REFERENCES
41-56.
Bums, A. F., and Mitchell, W. C. (1947), Measuring Business Cycles, Geweke, J. (1977), "TheDynamic FactorAnalysis of Economic Time Series,"
New York:National Bureau of Economic Research.
in Latent Variablesin Socio-EconomicModels, eds. D. J. Aigner and A. S.
Chamberlain,G., and Rothschild,M. (1983), "ArbitrageFactorStructure,and
Goldberger,Amsterdam:North-Holland.
Mean-VarianceAnalysis of Large Asset Markets,"Econometrica,51, 5.
Gordon,R. J. (1982), "PriceInertiaand Ineffectivenessin the United States,"
Office
CongressionalBudget
(1996), The Economic and Budget Outlook:FisJournal of Political Economy,90, 1087-1117.
cal Years1997-2006, Washington,DC: Author.
(1997), "TheTime-VaryingNAIRU and its Implicationsfor Economic
Connnor, G., and Korajczyk, R. A. (1986), "PerformanceMeasurement
Policy,"Journal of Economic Perspectives, 11-32.
With the ArbitragePricing Theory,"Journal of Financial Economics 15,
Quah, D., and Sargent,T. J. (1983), "A Dynamic Index Model for LargeCross
373-394.
Sections,"in Business Cycles, Indicators,and Forecasting,eds. J. H. Stock
• (1988), "Risk and Return in an EquilibriumAPT: Application of a
and M. W. Watson,Chicago: University of Chicago Press, 285-306.
New Test Methodology,"Journal of Financial Economics, 21, 255-289.
"A Test for the Number of Factors in an ApproximateFactor Sargent, T. J., and Sims, C. A. (1977), "Business Cycle Modeling without
--(1993),
Pretendingto Have Too Much A-Priori Economic Theory,"in New MethJournal
Model,"
of Finance, 48, 4.
ods in Business Cycle Research, ed. C. Sims et al., Minneapolis:Federal
Engle, R. F, and Watson, M. W. (1981), "A One-FactorMultivariateTime
Reserve Bank of Minneapolis.
Series Model of MetropolitanWage Rates,"Journal of the American Statistical Association, 76, 376, 774-781.
Singleton, K. J. (1980), "A LatentTime Series Model of the Cyclical Behavior of InterestRates,"InternationalEconomic Review, 21, 559-575.
Forni, M., Hallin, M., Lippi, M., and Reichlin, L. (2000), "The Generalized Dynamnic Factor Model: Identificationand Estimation,"The Review Staiger,D., Stock, J. H., and Watson,M. W. (1997), "TheNAIRU, Unemployment, and MonetaryPolicy,"Journal of Economic Perspectives, 11, 33-51.
of Economics and Statistics, 82, 4, 540-552.
Forni, M., and Reichlin, L. (1996), "Dynamic Common Factors in Large Stock, J. H., and Watson, M. W. (1989), "New Indexes of Coincident and
Cross-Sections,"Empirical Economics, 21, 27-42.
Leading Economic Indicators,"NBERMacroeconomicsAnnual, 351-393.
162
-
(1991), "A Probability Model of the Coincident Economic Indicators," in Leading Economic Indicators: New Approachesand Forecasting
Records, eds. K. Lahiri and G. H. Moore, New York:CambridgeUniversity Press, 63-85.
(1996), "Evidence on StructuralInstability in MacroeconomicTime
Series Relations,"Journal of Business and Economic Statistics, 14, 11-30.
"Diffusion Indexes,"working paper 6702, National Bureauof
-(1998),
Economic Research.
Journalof Business & EconomicStatistics,April2002
Stock, J. H., and Watson, M. W. (1999), "ForecastingInflation,"Journal of
MonetaryEconomics 44, 293-335.
(2000), "ForecastingUsing PrincipalComponentsFroma LargeNumber of Predictors,"manuscript.
Tootell, G. M. B. (1994), "Restructuring,the NAIRU, and the Phillips Curve,"
New England Economic Review of the Federal Reserve Bank of Boston,
Sept./Oct., 31-44.
West, K. D. (1996), "AsymptoticInferenceAbout PredictiveAbility,"Econometrica, 64, 1067-1084.