
REAL-TIME FILTERED ESTIMATES OF FINAL SUPERLATIVE
CPI BASED ON ESTIMATED MONTHLY MODELS OF INITIAL
AND FINAL SUPERLATIVE CPI
Peter A. Zadrozny
Division of Price Index Number Research
Bureau of Labor Statistics
2 Massachusetts Ave., NE, Room 4915
Washington, DC 20212
e-mail: [email protected]
January 29, 2006
ABSTRACT
The monthly superlative consumer price index (SCPI) has three publicly released
estimates: initial, interim, and final. We focus here solely on SCPI, which is revised, and
not on the better known CPI for urban consumers (CPIU) which is not. Initial SCPI is
released every month for the previous month. In even-numbered years, interim SCPI is
released in February for all months of the previous year; in odd-numbered years, interim
SCPI is not revised and is simply set to initial SCPI. Final SCPI is released every
February for all months two years before. Because interim SCPI switches annually
between revision and nonrevision, we ignore it and focus solely on initial and final SCPI.
Thus, the current or real-time estimation problem addressed here is overcoming the 14-25
month delays between releases of initial and final SCPI and in every month computing
current or filtered estimates of final SCPI for that month, based on all available current
and past observations on initial and final SCPI.
We consider two methods for this problem, which we call a regression method and a
time-series method. The regression method is attractive in its technical simplicity but
only weakly exploits current sample information. Results to date suggest that the time-series method's estimates of final SCPI are about 45 times more accurate, in terms of a
root mean-squared error (RMSE) measure of accuracy, than the regression method's
estimates. The regression method regresses current final SCPI on current and past initial
SCPI and, then, estimates final SCPI as the estimated regression line evaluated at current
and past initial SCPI. More complexly, the time-series method estimates by maximum
likelihood a vector autoregressive moving-average (VARMA) model of jointly generated
initial and final SCPI and, then, applies the missing-data Kalman filter (MDKF) to the
estimated model in order to estimate final SCPI based on all current and past
observations.
The estimation of final SCPI is complicated by the fact that, whereas initial SCPI is
released every month with a fixed one-month delay, final SCPI is released in February
two calendar years after it occurs. We handle this data complexity by indexing it
historically in order to estimate a model and currently in order to estimate final SCPI
based on an estimated model. In historical indexing, data are indexed by periods in which
they occur and, in current or real-time indexing, data are indexed by periods in which
they are released. The historical form is the more familiar one, is compact, and generally
has few or no missing values. The current form is expansive and always has missing
values, in the present case, many missing values. For example, here the historical form
has dimension N×2 but the current form has dimension N×75, where N is the number of
sample periods. When the MDKF is used to estimate final SCPI, it must be traversed only
once, so that applying it for this purpose to data in expansive current form and an
estimated model in correspondingly expansive state-space form is not computationally
burdensome.
CURRENT ESTIMATES OF FINAL SUPERLATIVE CPI BASED
ON ESTIMATED MONTHLY MODELS OF INITIAL AND FINAL
SUPERLATIVE CPI
Peter A. Zadrozny*
Bureau of Labor Statistics
Division of Price Index Number Research
2 Massachusetts Ave., NE, Room 3105
Washington, DC, USA
e-mail: [email protected]
October 30, 2006
Key words: real-time estimation of revised data
* This work represents the author's views and does
not represent any official positions of BLS.
Organization.
1. Introduction.
2. Data analysis.
3. OLS estimation of univariate regressions.
4. ML estimation of bivariate VAR models.
5. Adjusted estimates of final SCPI and their RMSEs.
6. Conclusions.
1. Introduction.
a. Objectives.
The 1st objective is to determine how accurately
initial releases of superlative CPI (SCPI) estimate
final releases of SCPI. SCPI differs from the
better known CPIU which is not revised. Initial
estimates of SCPI released every month estimate
true SCPI in the previous month. Final estimates of
true SCPI for the same previous month are not
released until February two calendar years later.
The 2nd objective is to determine which estimated
regression or vector autoregressive (VAR) model
produces the best current estimates of final SCPI
in the sense of minimizing root mean-squared errors
(RMSE) of eventually released final SCPI, using all
currently available observations of current and
past, initial and final, SCPI.
b. Method.
Initial and final SCPI data may be indexed
"historically," according to months in which the
data occur, or "currently," according to months in
which they are observed or released to the public.
Here, historically indexed data are used for
estimating models and currently indexed data are
used for making current estimates of final SCPI.
How much currently available data can be used in an
estimation depends on the method. Regression
estimates can generally be based only on current
and past initial SCPI, but, using the Kalman
filter, VAR estimates can generally be based on all
current and past data, both initial and final SCPI.
c. Practical considerations.
Regression estimation is much easier to implement.
Its technically most advanced step is ordinary
least squares (OLS), which is included in
commercial statistical software. VAR estimation
uses the missing-data Kalman filter, which is
generally not included in commercial statistical
software and was implemented here using the FORTRAN
77 program VARMA11B.FOR written by the author.
2. Data analysis.
a. Data in compact historical and expanded current
form.
We distinguish between data in compact
historical form and expanded current form and
show graphs of SCPI data in historical form.
i. SCPI data in compact historical form.
month s   i_{s,s+1}    f_{s,t}
1         i_{1,2}      f_{1,26}
2         i_{2,3}      f_{2,26}
3         i_{3,4}      f_{3,26}
...       ...          ...
10        i_{10,11}    f_{10,26}
11        i_{11,12}    f_{11,26}
12        i_{12,13}    f_{12,26}
end of year s = 1, ..., 12
13        i_{13,14}    f_{13,38}
14        i_{14,15}    f_{14,38}
15        i_{15,16}    f_{15,38}
...       ...          ...
22        i_{22,23}    f_{22,38}
23        i_{23,24}    f_{23,38}
24        i_{24,25}    f_{24,38}
end of year s = 13, ..., 24
25        i_{25,26}    f_{25,50}
26        i_{26,27}    f_{26,50}
27        i_{27,28}    f_{27,50}
...       ...          ...
i_{s,s+1} = log of initial SCPI occurring in month s
and observed in month s+1,
f_{s,t} = log of final SCPI occurring in month s and
observed in month t > s.
ii. SCPI data in expanded current form.
month t   i_{t-1,t}    f_{s,t} (first column)   ...   f_{s,t} (last column)
1         i_{0,1}      na                       ...   na
2         i_{1,2}      f_{-12,2}                ...   f_{-23,2}
3         i_{2,3}      na                       ...   na
...       ...          na                       ...   na
10        i_{9,10}     na                       ...   na
11        i_{10,11}    na                       ...   na
12        i_{11,12}    na                       ...   na
end of year s = 1, ..., 12
13        i_{12,13}    na                       ...   na
14        i_{13,14}    f_{0,14}                 ...   f_{-11,14}
15        i_{14,15}    na                       ...   na
...       ...          na                       ...   na
22        i_{21,22}    na                       ...   na
23        i_{22,23}    na                       ...   na
24        i_{23,24}    na                       ...   na
end of year s = 13, ..., 24
25        i_{24,25}    na                       ...   na
26        i_{25,26}    f_{12,26}                ...   f_{1,26}
27        i_{26,27}    na                       ...   na
...       ...          ...                      ...   ...
i_{t-1,t} = log of initial SCPI occurring in month
t-1 and observed in month t,
f_{s,t} = log of final SCPI occurring in month s and
observed in month t > s,
na = no data are available.
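The move from historical to current indexing can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's VARMA11B.FOR code: month 1 is taken to be a January, the function names `final_release_month` and `to_current_form` are hypothetical, and the column layout (25 lags of the pair (i, f), so that final SCPI at lag k sits in element 2k) follows the 50-element vector x_t described later in the slides.

```python
import numpy as np

def final_release_month(s):
    """Month t in which final SCPI for month s is released (month 1 = a January)."""
    year = (s - 1) // 12            # 0-based calendar year containing month s
    return 12 * (year + 2) + 2      # February, two calendar years later

def to_current_form(i_hist, f_hist, n_lags=25):
    """Expand N x 2 historical data into release-month rows with 2*n_lags columns.

    Row t-1 describes release month t. Column 2k-2 holds the initial value and
    column 2k-1 the final value at lag k (0-based columns); NaN means "na".
    """
    n = len(i_hist)
    cur = np.full((n + n_lags, 2 * n_lags), np.nan)
    for s in range(1, n + 1):
        # initial SCPI for month s is observed in month s+1, i.e. at lag 1
        cur[s, 0] = i_hist[s - 1]
        # final SCPI for month s is observed in a February, at lag k = t - s
        t = final_release_month(s)
        k = t - s                   # always between 14 and 25
        cur[t - 1, 2 * k - 1] = f_hist[s - 1]
    return cur

# Two years of made-up data: final values for months 1-12 all appear in month 26.
current = to_current_form(np.arange(1.0, 25.0), 10.0 * np.arange(1.0, 25.0))
print(final_release_month(1), final_release_month(13))
```

With this layout almost every entry is NaN, which is why the current form is described as expansive and why a filter that tolerates missing values is needed.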
b. Graphs of SCPI data in historical form.
All SCPI data in left-side graphs in figures 1-3 are normalized, with sample means subtracted
and the result divided by sample standard deviations.
i. Figure 1: initial SCPI in given price-relative
form, in log form, and autocorrelations and
spectra thereof, from March 1998 to December
2004.
[Figure 1. Rows: I = initial SCPI; i = log initial SCPI; di = seas. diff. log initial SCPI. Columns: series, 1998-2004; autocorrelations, lags 0-35; spectrum, frequencies 0.0-1.0 in fractions of pi.]
ii. Figure 2: final SCPI in given price-relative
form, in log form, and autocorrelations and
spectra thereof, from March 1998 to December
2004.
[Figure 2. Rows: F = final SCPI; f = log final SCPI; df = seas. diff. log final SCPI. Columns: series, 1998-2004; autocorrelations, lags 0-35; spectrum, frequencies 0.0-1.0 in fractions of pi.]
iii. Figure 3: initial minus final SCPI, in given
price-relative form, in log form, and autocorrelations and spectra thereof, from March
1998 to December 2004.
[Figure 3. Rows: I - F; i - f; di - df. Columns: series, 1998-2004; autocorrelations, lags 0-35; spectrum, frequencies 0.0-1.0 in fractions of pi.]
c. Understanding graphs of spectra in figures 1-3.
i. Table 1: Frequencies and periods of harmonic
monthly seasonal cycles.
Case   Frequency      Frequency   Frequency      Period
       (π radians)    (radians)   (cycles/mon)   (mon/cycle)
1      0              .0000       .0000          ∞
2      1/6            .5236       .0833          12
3      1/3            1.047       .1667          6
4      1/2            1.571       .2500          4
5      2/3            2.094       .3333          3
6      5/6            2.618       .4167          12/5
7      1              3.142       .5000          2

ii. Cases 2-7 are nonaliasing or identifiable,
harmonic, monthly, seasonal cycles. Dominant
spectral peaks in figures 1-3 at π/3 = 1.047
radians represent seasonal cycles with 6-month
periods.
iii. See P.A. Zadrozny, "Frequency Representation
of Seasonal Economic Cycles," typescript.
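The entries of Table 1 follow from the harmonic frequencies 2πk/12 for k = 0, ..., 6. A minimal Python sketch that regenerates them is given below; the function name `seasonal_harmonics` is hypothetical and the code is offered only as a check on the arithmetic.

```python
import math
from fractions import Fraction

def seasonal_harmonics(months_per_year=12):
    """Return (case, fraction of pi, radians, cycles/month, period) for each harmonic."""
    rows = []
    for k in range(months_per_year // 2 + 1):
        frac_pi = Fraction(2 * k, months_per_year)   # frequency as a fraction of pi
        radians = float(frac_pi) * math.pi
        cycles_per_month = k / months_per_year
        # the zero frequency has no finite period
        period = math.inf if k == 0 else Fraction(months_per_year, k)
        rows.append((k + 1, frac_pi, radians, cycles_per_month, period))
    return rows

for case, frac_pi, radians, cycles, period in seasonal_harmonics():
    print(case, frac_pi, round(radians, 4), round(cycles, 4), period)
```

Case 3, at π/3 radians, is the 6-month cycle that dominates the spectra in figures 1-3.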
d. Sample statistics of initial, final, and initial
minus final SCPI.
Using 82 observation months from March 1998 to
December 2004.
i. Notation.
I = given initial SCPI,
F = given final SCPI,
i = log(I),
f = log(F).
ii. Sample means.
\bar{I} = 1.00163,      \bar{i} = .00162,
\bar{F} = 1.00166,      \bar{f} = .00166,
\overline{I-F} = .99997,  \overline{i-f} = -.00003.
iii. Sample standard deviations.
sI = .00265,
sF = .00278,
sI-F = .00056,
si = .00265,
sf = .00277,
si-f = .00056.
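The log transformation and normalization used throughout (subtract the sample mean, divide by the sample standard deviation) can be sketched as follows. The price relatives below are made-up numbers, not SCPI data, and the use of the n-1 divisor in the standard deviation is an assumption, since the slides do not say which divisor was used.

```python
import numpy as np

def normalize(x):
    """Subtract the sample mean and divide by the sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)   # ddof=1 is an assumption

# Hypothetical monthly price relatives, for illustration only.
I = np.array([1.0012, 1.0041, 0.9987, 1.0023, 1.0005])
i = np.log(I)            # log form
i_norm = normalize(i)    # normalized form, written with a tilde in the equations

print(i_norm.mean(), i_norm.std(ddof=1))
```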
3. OLS estimates of regressions of final on initial
SCPI.
a. Unrestricted regression, denoted UR.
i. Coefficient estimates.
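(1) \tilde{f}_{t-1,\cdot} = \underset{(.0594)}{.0013} + \underset{(36.11)}{.9929}\,\tilde{i}_{t-1,t} - \underset{(2.313)}{.0678}\,\tilde{i}_{t-2,t-1}
    + \underset{(.4929)}{.0150}\,\tilde{i}_{t-3,t-2} - \underset{(1.573)}{.0472}\,\tilde{i}_{t-4,t-3} + \underset{(2.871)}{.0862}\,\tilde{i}_{t-5,t-4}
    - \underset{(.7412)}{.0217}\,\tilde{i}_{t-6,t-5} + \underset{(.9500)}{.0279}\,\tilde{i}_{t-7,t-6} - \underset{(.3888)}{.0115}\,\tilde{i}_{t-8,t-7}
    + \underset{(.1338)}{.0041}\,\tilde{i}_{t-9,t-8} - \underset{(1.871)}{.0571}\,\tilde{i}_{t-10,t-9} + \underset{(1.701)}{.0530}\,\tilde{i}_{t-11,t-10}
    - \underset{(.5797)}{.0174}\,\tilde{i}_{t-12,t-11} + \underset{(.3594)}{.0108}\,\tilde{i}_{t-13,t-12} + \varepsilon_t,
where tildes denote normalized variables and
absolute t statistics are in parentheses.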
ii. Summary statistics.
R^2 = .9758, \bar{R}^2 = .9701,
Durbin-Watson = 1.921,
s = est. std. dev. of residuals = .1847
iii. Residual diagnostics.
[Figure panels: UR residuals; residual autocorrelations; residual spectrum, in fractions of pi.]
b. 1st restricted regression, denoted RR1.
We reduced regression UR to RR1 by dropping all
regressors with coefficients with absolute t
statistics < 2 and reestimated.
i. Coefficient estimates.
(2) \tilde{f}_{t-1,\cdot} = \underset{(48.07)}{1.009}\,\tilde{i}_{t-1,t} - \underset{(2.976)}{.0667}\,\tilde{i}_{t-2,t-1}
    + \underset{(2.701)}{.0583}\,\tilde{i}_{t-5,t-4} + \varepsilon_t.
ii. Summary statistics.
R^2 = .9705, \bar{R}^2 = .9697,
Durbin-Watson = 1.987,
s = est. std. dev. of residuals = .1784.
iii. Residual diagnostics.
[Figure panels: RR1 residuals; residual autocorrelations; residual spectrum, in fractions of pi.]
c. 2nd restricted regression of final SCPI on
initial SCPI, denoted RR2.
We reduced regression RR1 to RR2 by dropping
initial SCPI lagged 1 and 4 months as
regressors and reestimated.
i. Coefficient estimates.
(3) \tilde{f}_{t-1,\cdot} = \underset{(44.40)}{.9801}\,\tilde{i}_{t-1,t} + \varepsilon_t.
ii. Summary statistics.
R^2 = .9605, \bar{R}^2 = .9605,
Durbin-Watson = 1.801,
s = est. std. dev. of residuals = .1987.
iii. Residual diagnostics.
[Figure panels: RR2 residuals; residual autocorrelations; residual spectrum, in fractions of pi.]
d. Conclusions.
UR and RR1 residuals have insignificant serial
correlations, but many UR coefficients are
insignificant. RR2 residuals have a significant
6-month seasonal cycle.
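The reduction from UR to RR1 (estimate the full regression, drop regressors with absolute t statistics below 2, reestimate) can be sketched in Python as follows. The data are synthetic and generated inside the block, so the estimates bear no relation to equations (1)-(3); `lag_matrix` and `ols` are hypothetical helper names.

```python
import numpy as np

def lag_matrix(x, n_lags):
    """Columns hold x[t], x[t-1], ..., x[t-n_lags+1] for every t with a full set of lags."""
    n = len(x)
    return np.column_stack([x[n_lags - 1 - j : n - j] for j in range(n_lags)])

def ols(y, X):
    """Return OLS coefficients and their absolute t statistics."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, np.abs(beta) / se

# Synthetic stand-ins for normalized initial and final SCPI (82 months).
rng = np.random.default_rng(0)
i_norm = rng.standard_normal(82)
lags = lag_matrix(i_norm, 13)                       # current value and 12 lags
f_norm = lags[:, 0] + 0.06 * lags[:, 4] + 0.2 * rng.standard_normal(len(lags))

# Unrestricted regression: constant plus 13 regressors, as in UR.
X_full = np.column_stack([np.ones(len(lags)), lags])
beta_full, t_full = ols(f_norm, X_full)

# Restricted regression: keep only regressors with |t| >= 2 and reestimate.
keep = t_full >= 2.0
beta_restricted, t_restricted = ols(f_norm, X_full[:, keep])
print(np.flatnonzero(keep), beta_restricted)
```

Because 13 lags are used, 12 of the 82 months are lost at the start of the sample, which is one reason the number of months entering each regression can be less than 82.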
4. ML estimates of bivariate VAR models.
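a. VAR models in terms of normalized transformed
data vector
(4) \tilde{w}_s = \begin{pmatrix} \tilde{i}_{s,\cdot} - \tilde{f}_{s,\cdot} \\ \tilde{f}_{s,\cdot} \end{pmatrix}.
We estimated models for normalized \tilde{w}_s = (\tilde{i}_{s,\cdot} - \tilde{f}_{s,\cdot}, \tilde{f}_{s,\cdot})^T,
because \tilde{y}_s = (\tilde{i}_{s,\cdot}, \tilde{f}_{s,\cdot})^T resulted in
nonconvergent ML estimates: \tilde{i}_{s,\cdot} and \tilde{f}_{s,\cdot},
but not \tilde{i}_{s,\cdot} - \tilde{f}_{s,\cdot} and \tilde{f}_{s,\cdot}, are mutually too highly
correlated.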
b. Table 2: Statistics of unrestricted VAR models.
VAR     R^2_{i-f}   R^2_f    LGLK     # est    AIC      BIC
order                                 parms
1       .1206       .1418    125.1    7        139.1    155.9
2       .1539       .2382    115.2    11       137.2    163.6
3       .1847       .2699    110.5    15       140.5    176.6
4       .2438       .2947    100.3    19       138.3    184.0
5       .2486       .3131    98.13    23       144.1    199.5
6       .2760       .3283    90.31    27       144.3    209.3
7       .2827       .3519    87.51    31       149.5    224.1
8       .3017       .4323    77.30    35       147.3    231.1
9       .3315       .4420    73.69    39       151.7    245.6
10      .3553       .4442    70.11    43       156.1    259.6
11      .3594       .4813    62.97    47       157.0    270.1
12      .3743       .4950    60.33    51       162.3    285.1
Q statistics, with MSLQ in parentheses.
VAR     Q_{i-f,i-f}      Q_{i-f,f}        Q_{f,i-f}         Q_{ff}
order
1       66.96 (.0013)    39.61 (.3119)    97.39 (1.5e-7)    101.3 (4.0e-8)
2       51.06 (.0494)    45.13 (.1415)    79.94 (3.5e-5)    114.2 (5.e-10)
3       36.66 (.4382)    42.57 (.2092)    65.09 (2.1e-3)    119.3 (7.e-11)
4       32.49 (.6363)    40.46 (.2798)    54.33 (.0256)     105.8 (8.e-9)
5       29.91 (.7526)    36.48 (.4462)    53.28 (.0318)     97.96 (1.2e-7)
6       28.11 (.8234)    41.29 (.2506)    66.04 (1.7e-3)    66.91 (.0132)
7       30.43 (.7303)    39.27 (.3256)    65.39 (2.0e-3)    55.74 (.0190)
8       33.70 (.5785)    39.46 (.3179)    57.60 (.0126)     47.71 (.0972)
9       27.22 (.8537)    39.71 (.3081)    61.74 (4.8e-3)    49.50 (.0663)
10      23.06 (.9534)    36.13 (.4627)    63.82 (2.9e-3)    51.00 (.0500)
11      22.72 (.9584)    31.18 (.6972)    55.83 (.0186)     41.41 (.2463)
12      22.15 (.9659)    28.75 (.7994)    55.61 (.0195)     38.60 (.3529)
c. Reduction to restricted VAR models.
We reduced each unrestricted VAR model to a
restricted VAR model by setting to zero the VAR
coefficients with absolute values < .05 and
reestimated. The exception was the unrestricted
VAR(1) model which needed no reduction and which is
repeated in table 3. Also, the two restricted
VAR(12) models have different zero restrictions and
different numbers of parameters.
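The AIC and BIC columns of tables 2 and 3 can be reproduced from the LGLK and parameter-count columns. The sketch below rests on two inferences from the tabulated values rather than on anything stated in the slides: that LGLK is -2 times the maximized log-likelihood, and that the sample size entering BIC is 82 months.

```python
import math

N_OBS = 82   # assumed sample size; chosen because it reproduces the BIC column

def aic(lglk, n_params):
    """AIC, assuming lglk = -2 * log-likelihood."""
    return lglk + 2 * n_params

def bic(lglk, n_params, n_obs=N_OBS):
    """BIC, assuming lglk = -2 * log-likelihood."""
    return lglk + n_params * math.log(n_obs)

# (order, LGLK, number of estimated parameters) for three tabulated models
for order, lglk, k in [(1, 125.1, 7), (9, 80.11, 16), (12, 68.83, 20)]:
    print(order, round(aic(lglk, k), 1), round(bic(lglk, k), 1))
```

Under these assumptions the restricted VAR(9) with 16 parameters has the smallest BIC and the restricted VAR(12) with 20 parameters has the smallest AIC, in line with the choice of models RV9 and RV12.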
d. Table 3: Statistics of restricted VAR models.
VAR     R^2_{i-f}   R^2_f    LGLK     # est    AIC      BIC
order                                 parms
1       .1206       .1418    125.1    7        139.1    155.9
2       .1530       .2302    116.0    8        132.0    151.3
3       .1798       .2569    112.1    10       132.1    156.2
4       .2377       .2869    101.7    12       125.7    154.5
5       .2262       .3029    101.2    12       125.2    154.1
6       .2496       .3036    96.34    13       122.3    153.6
7       .2506       .3375    92.21    15       122.2    158.3
8       .2459       .4109    85.11    16       117.1    155.6
9       .3086       .4143    80.11    16       112.1    150.6
10      .3411       .4219    74.87    19       112.9    158.6
11      .3424       .4673    67.76    21       109.8    160.3
12      .3509       .4689    68.83    20       108.8    157.0
12      .3533       .4692    68.37    22       112.4    165.3
Q statistics, with MSLQ in parentheses.
VAR     Q_{i-f,i-f}      Q_{i-f,f}        Q_{f,i-f}         Q_{ff}
order
1       66.96 (.0013)    39.61 (.3119)    97.39 (1.5e-7)    101.3 (4.0e-8)
2       52.67 (.0359)    45.41 (.1352)    87.94 (3.1e-6)    119.3 (7.e-11)
3       36.13 (.4627)    40.66 (.2727)    68.48 (8.8e-4)    125.7 (7.e-12)
4       36.43 (.4487)    40.04 (.2954)    56.48 (.0162)     106.0 (8.0e-9)
5       41.65 (.2383)    35.78 (.4789)    56.41 (.0164)     102.6 (2.5e-8)
6       39.28 (.3250)    34.69 (.5308)    62.55 (4.0e-3)    89.11 (2.1e-6)
7       38.91 (.3401)    30.48 (.7282)    65.45 (.0020)     60.63 (.0063)
8       42.51 (.2111)    36.54 (.4434)    63.33 (.0033)     51.13 (.0488)
9       29.27 (.7788)    35.07 (.5126)    63.79 (.0029)     53.22 (.0321)
10      21.84 (.9697)    34.78 (.5264)    59.16 (.0088)     54.13 (.0267)
11      21.61 (.9722)    30.61 (.7224)    52.47 (.0374)     44.69 (.1519)
12      20.85 (.9794)    28.45 (.8108)    44.87 (.1475)     45.80 (.1268)
12      21.42 (.9791)    27.08 (.8583)    46.30 (.1168)     45.40 (.1354)
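e. Restricted VAR models.
i. Restricted VAR(9) for \tilde{w}_s, which minimizes BIC,
denoted RV9.
(5) \tilde{w}_s = \begin{pmatrix} 0 & .2568 \\ 0 & .3431 \end{pmatrix}\tilde{w}_{s-1} + \begin{pmatrix} 0 & 0 \\ 0 & .3312 \end{pmatrix}\tilde{w}_{s-2}
    + \begin{pmatrix} 0 & .1958 \\ .2139 & 0 \end{pmatrix}\tilde{w}_{s-3} + \begin{pmatrix} 0 & .3235 \\ 0 & 0 \end{pmatrix}\tilde{w}_{s-4}
    + \begin{pmatrix} 0 & 0 \\ .1272 & 0 \end{pmatrix}\tilde{w}_{s-5} + \begin{pmatrix} .1601 & 0 \\ 0 & .1905 \end{pmatrix}\tilde{w}_{s-6}
    + \begin{pmatrix} 0 & 0 \\ 0 & .1995 \end{pmatrix}\tilde{w}_{s-7} + \begin{pmatrix} 0 & 0 \\ .2403 & .3233 \end{pmatrix}\tilde{w}_{s-8}
    + \begin{pmatrix} 0 & .2471 \\ 0 & 0 \end{pmatrix}\tilde{w}_{s-9} + \varepsilon_{w,s},
\Sigma_w = E[\varepsilon_{w,s}\varepsilon_{w,s}^T] = \begin{pmatrix} .6821 & -.2382 \\ -.2382 & .5893 \end{pmatrix} = R_w R_w^T,
R_w = \begin{pmatrix} .8259 & 0 \\ -.2884 & .7114 \end{pmatrix} = Cholesky factor of \Sigma_w.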
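ii. Restricted VAR(12) for \tilde{w}_s, which minimizes AIC,
denoted RV12.
(6) \tilde{w}_s = \begin{pmatrix} 0 & .2069 \\ 0 & .3439 \end{pmatrix}\tilde{w}_{s-1} + \begin{pmatrix} 0 & 0 \\ 0 & .2728 \end{pmatrix}\tilde{w}_{s-2}
    + \begin{pmatrix} 0 & .1803 \\ .1602 & 0 \end{pmatrix}\tilde{w}_{s-3} + \begin{pmatrix} 0 & .2777 \\ 0 & 0 \end{pmatrix}\tilde{w}_{s-4}
    + \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\tilde{w}_{s-5} + \begin{pmatrix} 0 & 0 \\ 0 & .1858 \end{pmatrix}\tilde{w}_{s-6} + \begin{pmatrix} 0 & 0 \\ 0 & .1874 \end{pmatrix}\tilde{w}_{s-7}
    + \begin{pmatrix} 0 & 0 \\ .2179 & .2900 \end{pmatrix}\tilde{w}_{s-8} + \begin{pmatrix} 0 & .2188 \\ 0 & 0 \end{pmatrix}\tilde{w}_{s-9}
    + \begin{pmatrix} .1897 & .1942 \\ 0 & 0 \end{pmatrix}\tilde{w}_{s-10} + \begin{pmatrix} 0 & 0 \\ .2278 & .1728 \end{pmatrix}\tilde{w}_{s-11}
    + \begin{pmatrix} .1561 & 0 \\ .1105 & 0 \end{pmatrix}\tilde{w}_{s-12} + \varepsilon_{w,s},
\Sigma_w = E[\varepsilon_{w,s}\varepsilon_{w,s}^T] = \begin{pmatrix} .6367 & -.2148 \\ -.2148 & .5363 \end{pmatrix} = R_w R_w^T,
R_w = \begin{pmatrix} .7979 & 0 \\ -.2692 & .6811 \end{pmatrix} = Cholesky factor of \Sigma_w.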
f. Transform estimated VAR models from \tilde{w}_s to \tilde{y}_s.
i. Objective.
Transform the estimated VAR models from normalized
data vector \tilde{w}_s to normalized data vector
(7) \tilde{y}_s = \begin{pmatrix} \tilde{i}_{s,\cdot} \\ \tilde{f}_{s,\cdot} \end{pmatrix},
because estimates of \tilde{f}_{s,\cdot} must be made using \tilde{i}_{\cdot,t}
and \tilde{f}_{\cdot,t} as separately dated conditioning variables.
ii. Transform VAR models from \tilde{w}_s to \tilde{y}_s.
(8) For \tilde{w}_s = \sum_{k=1}^{p} A_k \tilde{w}_{s-k} + \varepsilon_{w,s}, A_k = [a_{kij}], \Sigma_w = [\sigma_{ij}],
(9) b_{k11} = a_{k11} + a_{k21},
    b_{k12} = a_{k12} + a_{k22} - a_{k11} - a_{k21},
    b_{k21} = a_{k21},
    b_{k22} = a_{k22} - a_{k21},
    s_{11} = \sigma_{11} + 2\sigma_{12} + \sigma_{22},
    s_{12} = \sigma_{12} + \sigma_{22},
    s_{21} = s_{12},
    s_{22} = \sigma_{22},
    r_{11} = \sqrt{s_{11}},   r_{12} = 0,
    r_{21} = s_{21}/r_{11},   r_{22} = \sqrt{s_{22} - r_{21}^2},
computes
(10) \tilde{y}_s = \sum_{k=1}^{p} B_k \tilde{y}_{s-k} + \varepsilon_{y,s},
     B_k = [b_{kij}], \Sigma_y = [s_{ij}] = R_y R_y^T, R_y = [r_{ij}].
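The element formulas in (9) are the matrix products B_k = T^{-1} A_k T and Σ_y = T^{-1} Σ_w T^{-T}, where w = T y with T = [[1, -1], [0, 1]]. The Python sketch below checks this on numbers from the slides. Two inputs are assumptions: the lag-1 coefficient entries of RV9 are taken as printed in (5), and the off-diagonal element of Σ_w is taken as -.2382, the sign implied by s11 = .7950 in (11). The Cholesky step for R_y is left out.

```python
import numpy as np

T_INV = np.array([[1.0, 1.0],
                  [0.0, 1.0]])     # y = T_INV @ w, because i = (i - f) + f
T = np.linalg.inv(T_INV)           # w = T @ y, i.e. [[1, -1], [0, 1]]

def transform_var(A_list, sigma_w):
    """Map VAR coefficients and disturbance covariance from w-form to y-form."""
    B_list = [T_INV @ A @ T for A in A_list]
    sigma_y = T_INV @ sigma_w @ T_INV.T
    return B_list, sigma_y

# Lag-1 coefficient matrix of RV9 as printed in (5), and its covariance matrix
# with the off-diagonal sign implied by (9) and (11) (an inference).
A1 = np.array([[0.0, 0.2568],
               [0.0, 0.3431]])
sigma_w = np.array([[0.6821, -0.2382],
                    [-0.2382, 0.5893]])

B_list, sigma_y = transform_var([A1], sigma_w)
print(np.round(B_list[0], 4))
print(np.round(sigma_y, 4))
```

Writing the transformation as matrix products rather than element by element makes it easy to apply the same code to every lag of a model.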
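g. Implied restricted VAR model RV9 for \tilde{y}_s.
(11) \tilde{y}_s = \begin{pmatrix} 0 & .5999 \\ 0 & .3431 \end{pmatrix}\tilde{y}_{s-1} + \begin{pmatrix} 0 & .3312 \\ 0 & .3312 \end{pmatrix}\tilde{y}_{s-2}
    + \begin{pmatrix} .2139 & .0181 \\ .2139 & .2139 \end{pmatrix}\tilde{y}_{s-3} + \begin{pmatrix} 0 & .3235 \\ 0 & 0 \end{pmatrix}\tilde{y}_{s-4}
    + \begin{pmatrix} .1272 & .1272 \\ .1272 & .1272 \end{pmatrix}\tilde{y}_{s-5} + \begin{pmatrix} .1601 & .3506 \\ 0 & .1905 \end{pmatrix}\tilde{y}_{s-6}
    + \begin{pmatrix} 0 & .1995 \\ 0 & .1995 \end{pmatrix}\tilde{y}_{s-7} + \begin{pmatrix} .2403 & .0830 \\ .2403 & .0830 \end{pmatrix}\tilde{y}_{s-8}
    + \begin{pmatrix} 0 & .2471 \\ 0 & 0 \end{pmatrix}\tilde{y}_{s-9} + \varepsilon_{y,s},
\Sigma_y = \begin{pmatrix} .7950 & .3511 \\ .3511 & .5893 \end{pmatrix} = R_y R_y^T,
R_y = \begin{pmatrix} .9801 & 0 \\ .4316 & .7246 \end{pmatrix}.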
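h. Implied restricted VAR model RV12 for \tilde{y}_s.
(12) \tilde{y}_s = \begin{pmatrix} 0 & .5508 \\ 0 & .3439 \end{pmatrix}\tilde{y}_{s-1} + \begin{pmatrix} 0 & .2728 \\ 0 & .2728 \end{pmatrix}\tilde{y}_{s-2}
    + \begin{pmatrix} .1602 & .0201 \\ .1602 & .1602 \end{pmatrix}\tilde{y}_{s-3} + \begin{pmatrix} 0 & .2777 \\ 0 & 0 \end{pmatrix}\tilde{y}_{s-4}
    + \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\tilde{y}_{s-5} + \begin{pmatrix} 0 & .1858 \\ 0 & .1858 \end{pmatrix}\tilde{y}_{s-6} + \begin{pmatrix} 0 & .1874 \\ 0 & .1874 \end{pmatrix}\tilde{y}_{s-7}
    + \begin{pmatrix} .2179 & .0721 \\ .2179 & .0721 \end{pmatrix}\tilde{y}_{s-8} + \begin{pmatrix} 0 & .2188 \\ 0 & 0 \end{pmatrix}\tilde{y}_{s-9}
    + \begin{pmatrix} .1897 & .0045 \\ 0 & 0 \end{pmatrix}\tilde{y}_{s-10} + \begin{pmatrix} .2278 & .0550 \\ .2278 & .0550 \end{pmatrix}\tilde{y}_{s-11}
    + \begin{pmatrix} .0456 & .0456 \\ .1105 & .1105 \end{pmatrix}\tilde{y}_{s-12} + \varepsilon_{y,s},
\Sigma_y = \begin{pmatrix} .7434 & .3215 \\ .3215 & .5363 \end{pmatrix} = R_y R_y^T,
R_y = \begin{pmatrix} .9698 & 0 \\ .4247 & .7076 \end{pmatrix}.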
5. Adjusted estimates of final SCPI and their
RMSEs.
a. Regression-based estimates of final SCPI are
given directly by estimated regressions (2) and
(3). VAR-based estimates of final SCPI are
computed by applying the missing-data Kalman
filter (MDKF) to models (11) and (12), using the
data in current form.
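The idea of filtering with missing data can be sketched in Python for a generic first-order state equation x_t = F x_{t-1} + G ε_t in which each data row is the state vector itself with unreleased elements set to NaN. This is a textbook-style sketch under those assumptions (no measurement noise, observed rows selected month by month); it is not the author's VARMA11B.FOR program, and the small two-variable model used to exercise it is made up.

```python
import numpy as np

def mdkf(data, F, G, sigma, x0, P0):
    """Kalman filter that skips missing (NaN) observations; returns filtered states."""
    Q = G @ sigma @ G.T
    x, P = x0.astype(float), P0.astype(float)
    filtered = np.empty_like(data, dtype=float)
    for t, z in enumerate(data):
        # prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # update step, using only the elements released in month t
        obs = ~np.isnan(z)
        if obs.any():
            H = np.eye(len(x))[obs]              # selects observed state elements
            S = H @ P @ H.T                      # innovation covariance
            K = P @ H.T @ np.linalg.pinv(S)      # pinv guards against singular S
            x = x + K @ (z[obs] - H @ x)
            P = P - K @ H @ P
        filtered[t] = x
    return filtered

# Made-up two-variable example: the first variable is always observed,
# the second only in the third month.
F = np.array([[0.5, 0.0], [0.2, 0.3]])
G = np.eye(2)
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
data = np.array([[0.3, np.nan], [-0.1, np.nan], [0.4, 0.2], [0.0, np.nan]])
estimates = mdkf(data, F, G, sigma, np.zeros(2), np.eye(2))
print(estimates)
```

Observed elements are reproduced exactly, while unobserved elements are estimated from their correlation with what has been released, which is how final SCPI can be estimated before it is published.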
b. To do this, models (11) and (12) and their data
vectors must first be restated in the compatible
first-order state representation
(13) x_t = F x_{t-1} + G \varepsilon_{y,t}, or
\begin{pmatrix} \tilde{y}_t \\ \tilde{y}_{t-1} \\ \tilde{y}_{t-2} \\ \vdots \\ \tilde{y}_{t-24} \end{pmatrix}
= \begin{pmatrix} A_1 & \cdots & A_p & 0 & \cdots & 0 \\ I & 0 & \cdots & \cdots & \cdots & 0 \\ 0 & I & 0 & & & \vdots \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & \cdots & \cdots & 0 & I & 0 \end{pmatrix}
\begin{pmatrix} \tilde{y}_{t-1} \\ \tilde{y}_{t-2} \\ \tilde{y}_{t-3} \\ \vdots \\ \tilde{y}_{t-25} \end{pmatrix}
+ \begin{pmatrix} I \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \varepsilon_{y,t},
where p = 9 or 12, F = 50×50, G = 50×2, and \tilde{y}_t = 2×1 and x_t = 50×1 are
(14) \tilde{y}_t = (\tilde{i}_{t-1,t}, \tilde{f}_{t-1,t})^T,
     x_t = (\tilde{i}_{t-1,t}, \tilde{f}_{t-1,t}, ..., \tilde{i}_{t-25,t-24}, \tilde{f}_{t-25,t-24})^T.
x_t is both the data vector and the state vector.
Correspondingly, to compute estimates of final
SCPI using the MDKF, the data must be expanded
from 84×2 historical form to 84×50 current form.
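The block structure of F and G in (13) can be assembled mechanically. A minimal Python sketch follows; `companion` is a hypothetical helper name, and the placeholder coefficient matrices stand in for the lag matrices of model (11) or (12).

```python
import numpy as np

def companion(coef_list, n_blocks=25, dim=2):
    """Build F (n x n) and G (n x dim), n = n_blocks * dim, for a stacked VAR state."""
    n = n_blocks * dim
    F = np.zeros((n, n))
    for k, coef in enumerate(coef_list):
        F[:dim, k * dim:(k + 1) * dim] = coef     # first block row: lag matrices
    F[dim:, :-dim] = np.eye(n - dim)              # identity blocks shift lags down
    G = np.zeros((n, dim))
    G[:dim, :] = np.eye(dim)                      # disturbance enters the first block only
    return F, G

# Placeholder coefficients for a 9-lag model, for illustration only.
F, G = companion([0.1 * np.eye(2)] * 9)
print(F.shape, G.shape)
```

Lags beyond p carry zero coefficient blocks; they are kept in the state only so that final SCPI released up to 25 months after it occurs is still an element of x_t.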
c. Definition of RMSE.
(15) \hat{\tilde{f}}^{M}_{t-1,t} = method M estimate in month t of \tilde{f}_{t-1,t},
     e^{M}_{f,t} = \tilde{f}_{t-1,t} - \hat{\tilde{f}}^{M}_{t-1,t} = error of \hat{\tilde{f}}^{M}_{t-1,t},
     RMSE^{M}_{f} = \sqrt{\sum_{t=1}^{T} (e^{M}_{f,t})^2 / T},
where M = N, RR1, RR2, RV9, and RV12, respectively,
denote estimation of final values based on current
initial values or no estimation, on regressions RR1
or RR2, and on VAR models RV9 or RV12, and T =
number of months used to compute RMSE, ≤ 82 months,
depending on the lags of regressors underlying
\hat{\tilde{f}}^{M}_{t-1,t}.
Regression RMSEs were computed using "in sample"
data to estimate models and would be more realistic
if they included only "out of sample" data not used
to estimate a model. However, even the full sample
provided barely enough periods to estimate a model
accurately, so that "in sample" data were also used
to estimate final SCPI and to compute RMSEs. Data
scarcity is due to there being only 7 years of
observations to account for significant seasonal
variations.
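Definition (15) amounts to a few lines of code. The Python sketch below uses made-up normalized values, not SCPI data, and illustrates method N, in which the current initial value is used directly as the estimate of the final value.

```python
import numpy as np

def rmse(actual, estimate):
    """Root mean-squared error of `estimate` relative to `actual`."""
    error = np.asarray(actual, dtype=float) - np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean(error ** 2)))

# Hypothetical normalized final and initial values, for illustration only.
f_final = np.array([0.40, -1.10, 0.75, 0.05, -0.30])
i_initial = np.array([0.55, -0.95, 0.60, 0.10, -0.50])

# Method N: the estimate of final SCPI is simply current initial SCPI.
rmse_n = rmse(f_final, i_initial)
print(rmse_n)
```

An out-of-sample version would compute the same quantity over months that were not used to estimate the model, which the slides note was not feasible with only 7 years of data.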
d. Current regression-based estimates of final SCPI
using normalized data.
(16) \hat{\tilde{f}}^{N}_{t-1,t} = \tilde{i}_{t-1,t},
     \hat{\tilde{f}}^{RR1}_{t-1,t} = 1.009\,\tilde{i}_{t-1,t} - .0667\,\tilde{i}_{t-2,t-1} + .0583\,\tilde{i}_{t-5,t-4},
     \hat{\tilde{f}}^{RR2}_{t-1,t} = .9800\,\tilde{i}_{t-1,t}.
e. Comment.
We do not report RMSE of estimated F, the given
unlogged final SCPI. The reported RMSEs of
\hat{\tilde{f}}^{M}_{t-1,t} are defined in terms of absolute errors.
Similarly defined RMSEs of \hat{\tilde{F}}^{M}_{t-1,t} would be in a
different and not comparable form. However,
because RMSEs of \hat{\tilde{F}}^{M}_{t-1,t} defined in terms of
relative errors are very close to RMSEs of
\hat{\tilde{f}}^{M}_{t-1,t}, they are redundant and are not reported.
f. RMSEs of regression-based current estimates of
final SCPI.
(17) RMSE^{N}_{f} = .1988,
     RMSE^{RR1}_{f} = .1753,
     RMSE^{RR2}_{f} = .1978.
In all cases, 82 months of observations, from
March 1998 to December 2004, were used to
compute the in-sample RMSEs.
g. Table 4: RMSEs of estimates of final SCPI based
on VAR model RV12.
Element     \tilde{f}_{t-i,t-i+1}   Occurs   Unnorm.     Normal.     Theil U
of x_t      for i =                 in       RMSE        RMSE
x_{t,28}    14                      Dec      .1045e-5    .8888e-3    1.483
x_{t,30}    15                      Nov      .4093e-6    .4201e-3    1.576
x_{t,32}    16                      Oct      .3657e-7    .1962e-4    .9761
x_{t,34}    17                      Sep      .5550e-5    .4840e-2    1.484
x_{t,36}    18                      Aug      .4039e-6    .3409e-3    .1774
x_{t,38}    19                      Jul      .1748e-6    .7555e-3    .4302
x_{t,40}    20                      Jun      .2014e-5    .1184e-2    .8481
x_{t,42}    21                      May      .3180e-5    .1866e-2    .9226
x_{t,44}    22                      Apr      .7243e-5    .3415e-2    .9608
x_{t,46}    23                      Mar      .4554e-5    .2180e-2    1.289
x_{t,48}    24                      Feb      .9194e-5    .9073e-2    .5334
x_{t,50}    25                      Jan      .2718e-4    .1640e-1    .2880
Average     ---                     ---      .4426e-5    .3764e-2    .9141
h. Comments.
Theil U = RMSE being considered ÷ RMSE of a
forecast based on the last observation of the
forecasted variable. State eq. (13) implies,
e.g., that a 14-month-ahead forecast made in
month t of x_{t,28} or \tilde{f}_{t-14,t-13} is the estimate
made in month t of the \tilde{f} which occurs in month
t, using all available observations in month t.
Thus, 14-month-ahead forecasts of x_{t,28} made in
Decembers, but unobserved until two Februaries
later, are estimates of final SCPI for those
Decembers and have normalized RMSE = .8888×10^-3,
etc.
6. Conclusions.
a. About regression estimates.
RMSEs (17) indicate that, among methods N, RR1,
and RR2, RR1 regression yields the lowest RMSE
for current estimates of final SCPI. Compared
with using initial SCPI to estimate final SCPI
(method N), using RR1 regression reduces RMSE by
about 12%, but using RR2 regression reduces RMSE
only about .5%.
There's no evidence to suggest using regressions
with time-varying coefficients, because no
residuals have predictable and parametrically
estimable temporal variations such as trend or
seasonality. Also, 7 years of observations would
probably not be enough to estimate more time-varying parameters with acceptable accuracy.
b. About VAR estimates.
RMSEs in table 4 show that VAR current estimates
of final SCPI are about 45 times more accurate
than any of the regression estimates.
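The percentage reductions and the accuracy ratio quoted in these conclusions can be checked with a few lines of Python arithmetic. The regression RMSEs are those in (17); using the average normalized RMSE from table 4 as the VAR figure is an assumption about how the "about 45 times" comparison was made.

```python
# Regression RMSEs from (17) and the average normalized VAR RMSE from table 4.
rmse_regression = {"N": 0.1988, "RR1": 0.1753, "RR2": 0.1978}
rmse_var_average = 0.3764e-2   # assumed basis of the VAR comparison

# Reduction in RMSE relative to method N (initial SCPI used as the estimate).
reductions = {m: 1.0 - rmse_regression[m] / rmse_regression["N"]
              for m in ("RR1", "RR2")}

# How many times larger each regression RMSE is than the VAR average.
ratios = {m: value / rmse_var_average for m, value in rmse_regression.items()}

for method, reduction in reductions.items():
    print(method, f"{100 * reduction:.1f}% lower RMSE than method N")
for method, ratio in ratios.items():
    print(method, f"{ratio:.1f} times the average VAR RMSE")
```

On these numbers the ratios fall between roughly 46 and 53, consistent with the statement that the VAR estimates are about 45 times more accurate.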