Completing spatial dependent data: The Chow-Lin method for NUTS data
W. Polasek
IHS Wien and UAM

Contents
• Chow and Lin (1971) approach
• Model dependent missing data:
• LHS missing (left-hand side)
• RHS missing (right-hand side)
• Overall solution:
– LHS: prediction problem.
– RHS: interpolation, parameter estimation.

The Problem
• European regions: a cross section of NUTS 1 data is completely observed as an (n x 1) vector.
• We are interested in an (N x 1) NUTS 2 vector of length N > n.

Chow-Lin method for time series
• Problem: Quarterly data y are needed but only annual data z are observed. Other quarterly indicators X: (T x K) are available.
• Disaggregate model: $y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 \Omega)$.
• Define the aggregation matrix $C = I_n \otimes (1,1,1,1)$.
• Estimate $\beta$ in the aggregated model using $y_a = Cy$ and $X_a = CX$.

The Chow-Lin procedure
1. Set up the disaggregate model.
2. Aggregate with the C matrix.
3. Estimate b in the aggregated model.
4. Forecast in the disaggregate model using b and the known indicators X1,...,XK.

Completing quarterly data
• GLS estimate in the aggregated model:
$b = [X_a'(C\Omega C')^{-1}X_a]^{-1} X_a'(C\Omega C')^{-1} y_a$
• The quarterly predictions are BLUE:
$\hat{y} = Xb + \Omega C'(C\Omega C')^{-1}(y_a - X_a b)$
• The covariance matrix of the prediction errors:
$\mathrm{Var}(\hat{y} - y) = \sigma^2 \{ P\Omega + PX[X'C'(C\Omega C')^{-1}CX]^{-1}X'P' \}$
with $P = I - \Omega C'(C\Omega C')^{-1}C$.
• Note that multiplying by C gives aggregation-consistent predictions: $C\hat{y} = y_a$.

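The GLS estimate and the BLUE prediction above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `chow_lin`, the simulated indicators, and the AR(1)-type choice of Ω (correlation 0.5) are all hypothetical and not taken from the slides.

```python
import numpy as np

def chow_lin(y_a, X, C, Omega):
    """Return the GLS estimate b and the BLUE disaggregated predictions y_hat."""
    X_a = C @ X                          # aggregated indicators X_a = C X
    V_inv = np.linalg.inv(C @ Omega @ C.T)   # (C Omega C')^{-1}
    b = np.linalg.solve(X_a.T @ V_inv @ X_a, X_a.T @ V_inv @ y_a)
    # Regression part plus distributed aggregate residuals
    y_hat = X @ b + Omega @ C.T @ V_inv @ (y_a - X_a @ b)
    return b, y_hat

rng = np.random.default_rng(0)
n_years = 6
T = 4 * n_years
C = np.kron(np.eye(n_years), np.ones((1, 4)))          # (n x T): sums 4 quarters per year
X = np.column_stack([np.ones(T), rng.normal(size=T)])  # constant + one simulated indicator
t = np.arange(T)
Omega = 0.5 ** np.abs(t[:, None] - t[None, :])         # assumed AR(1)-type correlation
y = X @ np.array([1.0, 2.0]) + np.linalg.cholesky(Omega) @ rng.normal(size=T)
y_a = C @ y                                            # only the annual totals are "observed"

b, y_hat = chow_lin(y_a, X, C, Omega)
print(np.allclose(C @ y_hat, y_a))   # aggregation consistency check
```

The final check illustrates the last bullet: aggregating the quarterly predictions with C reproduces the observed annual data.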
Disaggregated model
• That is, it is possible to establish a linear model for the disaggregated data vector:
$y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I_N)$.
• X is an (N x p) completely observed regression matrix (known indicators).
• y is an incomplete (N x 1) vector.

Reduced Form (RF)
• The disaggregated SAR model:
$y = \rho W y + X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I_N)$.
• W is a neighbourhood matrix (see Anselin 1988).
• The spatial model can be written as
$Ry = X\beta + \varepsilon$ with $R = I - \rho W$.
• The reduced form is
$y = R^{-1}X\beta + u$, $u \sim N(0, \sigma^2 (R'R)^{-1})$.

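As a small numerical illustration of the reduced form, the sketch below simulates a SAR vector on a hypothetical ring of 8 regions. The ring-shaped, row-normalised W, the parameter values, and the helper name `ring_weights` are assumptions made for this example only.

```python
import numpy as np

def ring_weights(N):
    """Row-normalised neighbourhood matrix: each unit has two neighbours on a ring."""
    W = np.zeros((N, N))
    for i in range(N):
        W[i, (i - 1) % N] = 0.5
        W[i, (i + 1) % N] = 0.5
    return W

rng = np.random.default_rng(1)
N, rho, sigma = 8, 0.4, 0.1            # assumed values
W = ring_weights(N)
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta = np.array([1.0, -0.5])
eps = sigma * rng.normal(size=N)

R = np.eye(N) - rho * W                # R = I - rho W
y = np.linalg.solve(R, X @ beta + eps) # reduced form: y = R^{-1}(X beta + eps)
Omega = np.linalg.inv(R.T @ R)         # covariance of u = R^{-1} eps, up to sigma^2

# The simulated y satisfies the structural SAR equation
print(np.allclose(y, rho * W @ y + X @ beta + eps))
```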
Data generating process
• The stochastic model is
$y \sim N(X(\rho)\beta, \sigma^2 \Omega(\rho))$
• with $X(\rho) = R^{-1}X$
• and $\Omega(\rho) = (R'R)^{-1}$.

Aggregation matrix C
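• C aggregates NUTS 2 to NUTS 1 regions.
• C is an (n x N) matrix of 0's and 1's.
• C is block diagonal with row vectors of ones $1'_{k_i}$; the blocks differ in length because the NUTS 1 regions do not contain equal numbers of NUTS 2 regions:
$C = \mathrm{diag}(1'_{k_1}, 1'_{k_2}, \ldots, 1'_{k_n})$
• $k_i$: number of NUTS 2 cells that add up to the NUTS 1 cell in row i (i = 1,…,n).
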
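A minimal sketch of how such a C could be built from the block sizes k_1,…,k_n. The function name `aggregation_matrix` and the block sizes (2, 3, 1) are hypothetical and chosen only for illustration.

```python
import numpy as np

def aggregation_matrix(k):
    """Build C = diag(1'_{k_1}, ..., 1'_{k_n}) from the list of block sizes k."""
    n, N = len(k), sum(k)
    C = np.zeros((n, N))
    start = 0
    for i, k_i in enumerate(k):
        C[i, start:start + k_i] = 1.0   # row i sums the k_i sub-regions of region i
        start += k_i
    return C

C = aggregation_matrix([2, 3, 1])       # 3 aggregate regions, 6 sub-regions
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(C @ y)                            # aggregate totals per region
```

Each column of C contains exactly one 1, so every sub-region is assigned to exactly one aggregate region.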
Auxiliary regression
• Multiply the disaggregate model by the aggregation matrix C to obtain the aggregate relationship
$Cy = CR^{-1}X\beta + Cu$, $u = R^{-1}\varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$,
• where Cu are new disturbances with covariance matrix
$\mathrm{Var}(Cu) = \sigma^2 C\Omega(\rho)C'$.

Conditional GLS
• $y_a = Cy$ is just the aggregated, completely observed NUTS 1 vector and
• $X_C(\rho) = CR^{-1}X$ is, conditional on ρ, a completely known (n x p) regressor matrix.
• GLS estimate of the auxiliary model:
$b_\rho = [X_C(\rho)'(C\Omega(\rho)C')^{-1}X_C(\rho)]^{-1} X_C(\rho)'(C\Omega(\rho)C')^{-1} y_a$

ML estimate of ρ
• Use the 2-step ML method of Anselin (1988). Define the aggregated OLS estimators
• $b_o = (X_a'X_a)^{-1}X_a'y_a$ and
• $b_L = (X_a'X_a)^{-1}X_a'Wy_a$.
• Since $b_{ML} = b_o - \rho b_L$, we estimate ρ via the ML variance:
$\sigma^2_{ML} = (e_o - \rho e_L)'(e_o - \rho e_L)/n$
• with $e_o = y_o - X_a b_o$ and $e_L = y_L - X_a b_L$, where $y_o = y_a$ and $y_L = Wy_a$.

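The two residual vectors and the variance as a function of ρ can be sketched as follows. As an assumption beyond the slide, the grid search maximises the usual concentrated log-likelihood, ln|I − ρW| − (n/2) ln σ²(ρ), which adds the log-determinant term to the variance criterion. The function names, the ring-shaped aggregate-level matrix `W_a` and the simulated data are hypothetical.

```python
import numpy as np

def lag_residuals(y_a, X_a, W_a):
    """OLS residuals of y_a (e_o) and of the spatial lag W_a y_a (e_L) on X_a."""
    b_o, *_ = np.linalg.lstsq(X_a, y_a, rcond=None)
    b_L, *_ = np.linalg.lstsq(X_a, W_a @ y_a, rcond=None)
    return y_a - X_a @ b_o, W_a @ y_a - X_a @ b_L

def estimate_rho(y_a, X_a, W_a, grid):
    """Grid search for rho using the concentrated log-likelihood (assumed criterion)."""
    n = len(y_a)
    e_o, e_L = lag_residuals(y_a, X_a, W_a)
    best_rho, best_ll = None, -np.inf
    for rho in grid:
        r = e_o - rho * e_L
        sigma2 = r @ r / n                                  # sigma^2_ML(rho)
        _, logdet = np.linalg.slogdet(np.eye(n) - rho * W_a)
        ll = logdet - 0.5 * n * np.log(sigma2)
        if ll > best_ll:
            best_rho, best_ll = rho, ll
    return best_rho

rng = np.random.default_rng(2)
n = 12
W_a = np.zeros((n, n))
for i in range(n):                                          # ring neighbours, row-normalised
    W_a[i, (i - 1) % n] = W_a[i, (i + 1) % n] = 0.5
X_a = np.column_stack([np.ones(n), rng.normal(size=n)])
y_a = np.linalg.solve(np.eye(n) - 0.5 * W_a,
                      X_a @ np.array([1.0, 2.0]) + 0.2 * rng.normal(size=n))

rho_hat = estimate_rho(y_a, X_a, W_a, np.linspace(-0.95, 0.95, 39))
print(rho_hat)
```

With only 12 simulated observations the estimate is noisy; the point of the sketch is the mechanics, not the accuracy.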
ML estimates for SAR
$\hat{\beta} = (x^T x)^{-1} x^T (I - \rho W)y$
$\hat{\sigma}^2 = y^T (I - \rho W)^T [I - x(x^T x)^{-1} x^T](I - \rho W)y / n$
• The ML approach can be extended to systems.

Prediction of incompletes
• Following Chow and Lin (1971) and Goldberger (1962), the best linear unbiased predictor of y using the indicators $X^*(\rho) = R^{-1}X$ is:
$\hat{y} = X^*(\rho)b_\rho + \Omega(\rho)C'(C\Omega(\rho)C')^{-1}(y_a - CX^*(\rho)b_\rho)$
• We see that the predictor of y can be decomposed into two components.

Interpretation
• The first is the conditional expectation of y given
x in the spatial model. The other is an
improvement using the aggregated residuals
and the GLS projection matrix.
• Note that for ρ = 0 the spatial Chow-Lin method
degenerates to the simple Chow-Lin method
with spherical errors.
• Multiplying by C gives back the aggregation-consistent values of the aggregated model.

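The spatial Chow-Lin predictor for a given ρ can be sketched as below. The function name `spatial_chow_lin`, the ring-shaped W, the block sizes and the simulated data are assumptions for illustration; ρ is treated as known here rather than estimated.

```python
import numpy as np

def spatial_chow_lin(y_a, X, C, W, rho):
    """Conditional GLS estimate b_rho and spatial Chow-Lin prediction for a given rho."""
    N = W.shape[0]
    R = np.eye(N) - rho * W
    X_star = np.linalg.solve(R, X)              # X*(rho) = R^{-1} X
    Omega = np.linalg.inv(R.T @ R)              # Omega(rho) = (R'R)^{-1}
    X_C = C @ X_star                            # aggregated regressors
    V_inv = np.linalg.inv(C @ Omega @ C.T)
    b = np.linalg.solve(X_C.T @ V_inv @ X_C, X_C.T @ V_inv @ y_a)
    # First component: regression part; second: distributed aggregate residuals
    y_hat = X_star @ b + Omega @ C.T @ V_inv @ (y_a - X_C @ b)
    return b, y_hat

rng = np.random.default_rng(3)
sizes = [3, 3, 2, 4]                            # hypothetical sub-regions per aggregate region
N, n = sum(sizes), len(sizes)
C = np.zeros((n, N))
start = 0
for i, k in enumerate(sizes):
    C[i, start:start + k] = 1.0
    start += k
W = np.zeros((N, N))
for i in range(N):                              # ring neighbours, row-normalised
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = np.linalg.solve(np.eye(N) - 0.3 * W, X @ np.array([2.0, 1.0]) + 0.1 * rng.normal(size=N))
y_a = C @ y

b_rho, y_hat = spatial_chow_lin(y_a, X, C, W, rho=0.3)
print(np.allclose(C @ y_hat, y_a))              # aggregation consistency
```

Calling the function with `rho=0.0` gives Ω = I and X*(ρ) = X, i.e. the simple Chow-Lin predictor with spherical errors.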
More spatial dependence
• Consider an extended regressor matrix [X : WX]
where WX is interpreted as “potentials” of the
variables in X.
• For sectoral data that has to be disaggregated
by region, we can use the same approach.
• Suppose there are s = 1,…,S sectors; then
• $y_s$ is the GDP in sector s in the disaggregated units,
• $Cy_s$ is the GDP aggregated across all regions.
• "Structural" zeros: we know that a certain sector is not producing in a certain region.

0-restrictions
• We construct a 0-structured regressor $X_{s0}$, and then we do the prediction simply using the special sectoral indicators $X_{s0}$:
$\hat{y}_s = X_{s0}(\rho)b_{s,\rho} + \Omega(\rho)C'(C\Omega(\rho)C')^{-1}(Cy_s - CX_{s0}(\rho)b_{s,\rho})$, s = 1,…,S,
• inserting $X_{s0}(\rho) = R^{-1}X_{s0}$ into the prediction.

Example: NUTS 2 -> 3
Spanish Data
 1 Andalucía
 2 Aragón
 3 Asturias (P.de)
 4 Balears (Illes)
 5 Canarias
 6 Cantabria
 7 Castilla y León
 8 Castilla-La Mancha
 9 Cataluña
10 C. Valenciana
11 Extremadura
12 Galicia
13 Madrid (C.de)
14 Murcia (R.de)
15 Navarra (C.F.de)
16 País Vasco
17 Rioja (La)
18 Ceuta y Melilla

Bayesian SAR estimation (Heteroscedastic model)
[Model equations not recoverable from this copy of the slides.]

Variable      Coefficient   Std. dev.   p-level
cons             0.212       0.1746      0.095
log_km          -0.132       0.0847      0.062
log_pop          0.385       0.3712      0.140
log_stock        0.116       0.2994      0.326
log_exp         -0.274       0.1290      0.022
log_imp          0.305       0.1233      0.011
log_access      -0.171       0.1764      0.154
log_trucks      -0.277       0.2275      0.100
log_banks        0.815       0.2858      0.005
rho             -0.339       0.2174      0.062

SAR Diagnostics
[Figure: SAR heteroscedastic Gibbs. Panels: Actual vs. Predicted; Residuals; Mean of Vi draws; Posterior Density for rho]

Figure 1: Comparison GDP NUTS 2 (blue) vs. GDP Chow-Lin (CL, aggregated, in green)

Figure 2: Comparison Chow-Lin forecast (blue) vs. disaggregate real GDP data (green)

Figure 3: Ratio between Chow-Lin and real GDP data (CL divided by real GDP)

Figure 4: Scatter Chow-Lin vs. Real Data

Table 1: NUTS-3 Forecast comparison: Chow-Lin GDP vs. Real GDP

No.   Name                          CL    Real data   Ratio
 1    Almería                   7.8013     10.6707    0.73
 2    Cadiz                    16.4397     17.6041    0.93
 3    Córdoba                   9.5304     10.3084    0.92
 4    Granada                  12.71       11.6673    1.09
 5    Huelva                   10.0743      7.625     1.32
 6    Jaén                      8.2321      8.5422    0.96
 7    Málaga                   23.6132     21.6097    1.09
 8    Sevilla                  27.2495     27.6232    0.99
 9    Huesca                    4.209       4.2463    0.99
10    Teruel                    3.2966      2.7713    1.19
11    Zaragoza                 18.4121     18.9001    0.97
12    Asturias                 17.9748     17.9748    1.00
13    Illes Balears            20.9948     20.9948    1.00
14    Las Palmas               16.9725     18.4606    0.92
15    Santa Cruz De Tenerife   17.2117     15.7236    1.09
16    Cantabria                10.4754     10.4754    1.00
17    Avila                     2.4686      2.5753    0.96
18    Burgos                    8.1407      7.8184    1.04
19    León                      6.1681      8.0467    0.77
20    Palencia                  1.6984      3.2938    0.52
21    Salamanca                 6.6952      5.7291    1.17
22    Segovia                   2.9738      3.0022    0.99
23    Soria                     1.9487      1.8029    1.08
24    Valladolid               12.1549     10.2906    1.18
25    Zamora                    3.3202      3.0096    1.10
26    Albacete                  4.8839      5.4844    0.89
27    Ciudad Real               4.491       7.9319    0.57
28    Cuenca                    3.0616      3.1154    0.98
29    Guadalajara               6.1575      3.2948    1.87
30    Toledo                    9.9525      8.7199    1.14
31    Barcelona               126.9866    117.2997    1.08
32    Gerona                    9.8092     14.9833    0.65
33    Lérida                    6.6029      9.2576    0.71
34    Tarragona                14.6223     16.4805    0.89
35    Alicante                 26.9904     28.2353    0.96
36    Castellón de la Plana     8.6169     10.6662    0.81
37    Valencia                 46.2183     42.9242    1.08
38    Badajoz                   8.51        8.3984    1.01
39    Cáceres                   5.4319      5.5435    0.98
40    La Coruña                19.3657     18.3762    1.05
41    Lugo                      5.5453      5.1973    1.07
42    Orense                    4.7117      4.756     0.99
43    Pontevedra               13.2958     14.5892    0.91
44    Madrid                  148.688     148.688     1.00
45    Murcia                   21.2489     21.2489    1.00
46    Navarra                  14.2451     14.2451    1.00
47    Álava                     6.1811      7.9923    0.77
48    Guipúzcoa                13.9635     17.0123    0.82
49    Vizcaya                  31.5154     26.6552    1.18
50    La Rioja                  6.2183      6.2183    1.00
51    Ceuta (ES)                0.6823      1.26      0.54
52    Melilla (ES)              1.7268      1.1492    1.50
      Total                   280.1409    284.1824    0.99

Part 2
Flow Models
Completing cross-sectional flow data
• Now we look at the simplest case of flow data that can be modelled at an aggregate level (NUTS 1), where we want to estimate the flows at a disaggregate level (NUTS 2).
• Consider the 2 x 2 aggregated flow matrix from 2 regions into 2 regions (e.g. NUTS 1 level):

Dis-agg. and aggregated flows
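Agg.:
$Y = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$
Dis-agg.:
$Y = \begin{pmatrix} a_{11} & a_{12} & c_{11} & c_{12} \\ a_{21} & a_{22} & c_{21} & c_{22} \\ b_{11} & b_{12} & d_{11} & d_{12} \\ b_{21} & b_{22} & d_{21} & d_{22} \end{pmatrix} = \begin{pmatrix} A & C \\ B & D \end{pmatrix}$
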
2 x 2 flow example
• Assume that each region consists of 2 sub-regions,
• so that we would like to know the flows between 4 disaggregated origin regions and
• 4 disaggregated destination regions.

Aggregated spatial model
• The aggregated model can be written via the vectorisation of the matrix Y, i.e.
• y = vec Y = (a,b,c,d)', in the same way as before:
$y = \rho(W_1,W_2)y + X\beta + e$

Spatial lag polynomial for flows
• $\rho(W_1,W_2)y$ is a spatial lag polynomial that is applicable to flow models (see LeSage and Pace 2005):
$\rho(W_1,W_2)y = \rho_1(W_1 \otimes I_{n_2})y + \rho_2(I_{n_1} \otimes W_2)y + \rho_3(W_1 \otimes W_2)y$

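The three Kronecker terms of the lag polynomial can be assembled directly with `np.kron`. The weight matrices and ρ values below are hypothetical. As a side note resting on the mixed-product rule for Kronecker products, choosing ρ3 = −ρ1ρ2 makes the spread matrix factor into (I − ρ1W1) ⊗ (I − ρ2W2); the sketch checks this numerically.

```python
import numpy as np

def flow_lag(W1, W2, rho1, rho2, rho3):
    """rho(W1,W2) = rho1 (W1 x I_n2) + rho2 (I_n1 x W2) + rho3 (W1 x W2)."""
    n1, n2 = W1.shape[0], W2.shape[0]
    return (rho1 * np.kron(W1, np.eye(n2))
            + rho2 * np.kron(np.eye(n1), W2)
            + rho3 * np.kron(W1, W2))

# Hypothetical row-normalised neighbourhood matrices (n1 = 3, n2 = 2)
W1 = np.array([[0.0, 0.5, 0.5],
               [0.5, 0.0, 0.5],
               [0.5, 0.5, 0.0]])
W2 = np.array([[0.0, 1.0],
               [1.0, 0.0]])
rho1, rho2 = 0.3, 0.2

L = flow_lag(W1, W2, rho1, rho2, rho3=-rho1 * rho2)
R = np.eye(6) - L                                   # spread matrix, (n1*n2 x n1*n2)

# With rho3 = -rho1*rho2 the spread matrix is separable
R_sep = np.kron(np.eye(3) - rho1 * W1, np.eye(2) - rho2 * W2)
print(np.allclose(R, R_sep))
```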
4 sub-models
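• Because all 4 sub-models are again flow models, we can write 4 auxiliary disaggregated regressions:
• $y_a = \mathrm{vec}\,A = \rho_a(W_{1a},W_{2a})y_a + X_a\beta_a + \varepsilon_a$,
• $y_b = \mathrm{vec}\,B = \rho_b(W_{1b},W_{2b})y_b + X_b\beta_b + \varepsilon_b$,
• $y_c = \mathrm{vec}\,C = \rho_c(W_{1c},W_{2c})y_c + X_c\beta_c + \varepsilon_c$,
• $y_d = \mathrm{vec}\,D = \rho_d(W_{1d},W_{2d})y_d + X_d\beta_d + \varepsilon_d$.
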
The aggregated system
• 4 equations:
• $C_a y_a = C_a R_a^{-1}X_a\beta_a + C_a R_a^{-1}\varepsilon_a$,
• $C_b y_b = C_b R_b^{-1}X_b\beta_b + C_b R_b^{-1}\varepsilon_b$,
• $C_c y_c = C_c R_c^{-1}X_c\beta_c + C_c R_c^{-1}\varepsilon_c$,
• $C_d y_d = C_d R_d^{-1}X_d\beta_d + C_d R_d^{-1}\varepsilon_d$.

Pooled estimation
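• Simplifying assumption:
$\rho = \rho_a = \rho_b = \rho_c = \rho_d$ and
• $\beta = \beta_a = \beta_b = \beta_c = \beta_d$,
with block-diagonal matrices
$C = \mathrm{diag}(C_a, C_b, C_c, C_d)$,
• $D_X = \mathrm{diag}(X_a, X_b, X_c, X_d)$ and
• $D_\varepsilon = \mathrm{diag}(\varepsilon_a, \varepsilon_b, \varepsilon_c, \varepsilon_d)$.
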
System estimation
• Estimation of the auxiliary equation
$Cy = CR^{-1}D_X\beta + CR^{-1}D_\varepsilon$
• with $R = I - \rho(W_1,W_2)$; it can be estimated by GLS with the transformed regressors
• $X(\rho) = CR^{-1}D_X$ and $\Omega(\rho) = (R'R)^{-1}$
• with $R = \mathrm{diag}(R_a, \ldots, R_d)$
• and $R_j = I - \rho_j(W_1,W_2)$.

GLS estimate
• Each component of the lag polynomial is block diagonal:
• $\mathrm{diag}(W_{1a} \otimes I_2, \ldots, W_{1d} \otimes I_2)$.
• The GLS estimate $b_\rho$ is obtained as
$b_\rho = [X_C(\rho)'(C\Omega(\rho)C')^{-1}X_C(\rho)]^{-1} X_C(\rho)'(C\Omega(\rho)C')^{-1} Cy$

Minimizer of ρ
• ML estimate by minimizing the criterion of the 2-step Anselin (1988) procedure for each block separately:
– $\sigma^2_{ML,i} = (e_{o,i} - \rho e_{L,i})'(e_{o,i} - \rho e_{L,i})/n$, for i = a,b,c,d,
– with the ordinary and the W-lag residuals $e_o$ and $e_L$.

Cross-Country Spillovers
• Extend the neighbourhood matrices across country borders.
• Define these "cross-country" neighbourhood matrices as
• either a contiguity matrix, indicating neighbours by 0's and 1's,
• or a distance-based W.

Cross-country = "off block-diagonal"
• Such matrices have an "off block-diagonal" structure:
• For n = 2
$W_B = \begin{pmatrix} 0 & W_{12} \\ W_{21} & 0 \end{pmatrix}$

A “perforated” matrix
• In general this matrix is "perforated" on the block diagonal and looks like
$W_B = \begin{pmatrix} 0 & W_{12} & \cdots & W_{1n} \\ W_{21} & 0 & \cdots & W_{2n} \\ \vdots & & \ddots & W_{n-1,n} \\ W_{n1} & \cdots & W_{n,n-1} & 0 \end{pmatrix}$

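One way to obtain such a "perforated" matrix is to start from a full neighbourhood matrix over all regions and set the within-country (diagonal) blocks to zero. The function name `off_block_diagonal`, the country sizes and the example W are hypothetical.

```python
import numpy as np

def off_block_diagonal(W, sizes):
    """Zero out the diagonal blocks of W; sizes lists the number of regions per country."""
    W_B = W.copy()
    start = 0
    for k in sizes:
        W_B[start:start + k, start:start + k] = 0.0   # remove within-country links
        start += k
    return W_B

# Hypothetical example: 5 regions, country 1 has 2 regions, country 2 has 3,
# and every pair of distinct regions counts as neighbours.
W = np.ones((5, 5)) - np.eye(5)
W_B = off_block_diagonal(W, sizes=[2, 3])
print(W_B)
```

Only the cross-country blocks W12 and W21 survive, which is the structure shown for n = 2.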
Extend with WB matrix
• Add another spatial lag:
$y = \rho(W_1,W_2)y + \rho_4 W_B y + X\beta + \varepsilon$
• … the spatial effect from the cross-country spillovers.
• Now the spread matrix R looks like
$R^* = I_{n_1 n_2} - \rho_1(W_1 \otimes I_{n_2}) - \rho_2(I_{n_1} \otimes W_2) - \rho_3(W_1 \otimes W_2) - \rho_4 W_B$

Extension
• Clearly the flow system can be generalized to include the cross-country effects ("extended polynomial" $\rho(W_1,W_2)$ with 4 rho components):
$C_j y_j = C_j R^{*-1}X_j\beta_j + C_j R^{*-1}\varepsilon_j$, j = 1,…,4.

Lindley-Smith (1972) estimates
• A non-informative 3-stage hierarchical model:
• $C_j y_j = C_j R^{-1}X_j\beta_j + C_j R^{-1}\varepsilon_j$, j = 1,…,4,
• $\beta_j \sim N[\mu, \Sigma]$,
• and for the hyper-parameters (µ, Σ) we assume a non-informative prior ($\Sigma^{-1} = 0$).

1-stage posterior mean
• With a multivariate regression set-up as in Lindley-Smith, we arrive at the following estimates of the posterior mean:
• $\beta_j^{**} = H_j^{-1}[X_C(\rho)'(C\Omega(\rho)C')^{-1}X_C(\rho)\,\beta_j^{GLS} + \Sigma^{*-1}\beta^{*}]$
• with
• $H_j = X_C(\rho)'(C\Omega(\rho)C')^{-1}X_C(\rho) + \Sigma^{*-1}$

Pooled estimates
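• The overall estimate is
$\beta^{**} = (\sum_{j=1}^{4} H_j)^{-1} \sum_{j=1}^{4} H_j \beta_j^{GLS}$
• $\Sigma^*$ can be approximated by
$\Sigma^* = \frac{1}{4} \sum_{j=1}^{4} (\beta_j^{GLS} - \beta^{ave})(\beta_j^{GLS} - \beta^{ave})'$
• where $\beta^{ave}$ is the average of the individual GLS estimates $\beta_j^{GLS}$.
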
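The pooled estimate is a precision-weighted average of the four individual GLS estimates, and Σ* is their empirical covariance around the simple average. The sketch below uses made-up H_j matrices and estimates purely to show the arithmetic; the function names are hypothetical.

```python
import numpy as np

def pooled_estimate(H_list, beta_list):
    """(sum_j H_j)^{-1} sum_j H_j beta_j: precision-weighted average of the estimates."""
    H_sum = sum(H_list)
    weighted = sum(H @ b for H, b in zip(H_list, beta_list))
    return np.linalg.solve(H_sum, weighted)

def between_covariance(beta_list):
    """(1/J) sum_j (beta_j - beta_ave)(beta_j - beta_ave)'."""
    B = np.vstack(beta_list)
    dev = B - B.mean(axis=0)
    return dev.T @ dev / len(beta_list)

# Made-up individual GLS estimates for the four blocks (p = 2 coefficients)
betas = [np.array([1.0, 0.5]), np.array([1.2, 0.4]),
         np.array([0.8, 0.6]), np.array([1.0, 0.5])]
H_equal = [np.eye(2)] * 4                          # equal weights give the simple average
H_mixed = [np.eye(2), np.diag([3.0, 1.0]), np.eye(2), np.eye(2)]

beta_equal = pooled_estimate(H_equal, betas)
beta_mixed = pooled_estimate(H_mixed, betas)       # block 2 pulls the first coefficient up
Sigma_star = between_covariance(betas)
print(beta_equal, beta_mixed)
print(Sigma_star)
```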
Completing flows using partial information
• Now we consider a flow matrix which we would like to disaggregate, but the information available is asymmetrical.
• Suppose that there are 2 countries, home and foreign, where the aggregated flows in the 4 cells A, B, C and D are known.

Dis- and aggregated flow matrix
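• Notation
$Y = \begin{pmatrix} a_{11} & a_{12} & c_{11} & c_{12} \\ a_{21} & a_{22} & c_{21} & c_{22} \\ b_{11} & b_{12} & d_{11} & d_{12} \\ b_{21} & b_{22} & d_{21} & d_{22} \end{pmatrix} = \begin{pmatrix} A & C \\ B & D \end{pmatrix}$
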
The disaggregated model
• The flows in the n2 x m1 matrix $Y_{12}$ are vectorized to yield vec $Y_{12} = y_b$ and are modelled as
$y_b = \rho(W_1,W_2)y_b + X\beta_b + \varepsilon_b$

The aggregated model
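• $Cy_b = C\rho(W_1,W_2)y_b + CX\beta_b + C\varepsilon_b$
• In reduced form, the aggregated disturbances are $CR^{-1}\varepsilon_b \sim N(0, \sigma^2 C(R'R)^{-1}C')$
• with $C = 1'_{n_2} \otimes I_{m_1} = \mathrm{diag}(1'_{n_2}, \ldots, 1'_{n_2})$, where $1'_{n_2}$ is a (1 x n2) vector of ones.
• This is a single equation with regressors $X_C(\rho) = CR^{-1}X$ and $\Omega(\rho) = (R'R)^{-1}$,
• with spread matrix $R = I_{n_2 m_1} - \rho(W_1,W_2)$, where $\rho(W_1,W_2)$ is as given before.
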
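A small numerical sketch of this aggregation matrix, with hypothetical sizes n2 = 3 and m1 = 2. It also illustrates a point worth keeping in mind when coding this up: the Kronecker form 1' ⊗ I and the block-diagonal form diag(1',…,1') = I ⊗ 1' produce the same column sums, but each assumes a different stacking order of the flow matrix.

```python
import numpy as np

n2, m1 = 3, 2                                      # hypothetical dimensions
ones_row = np.ones((1, n2))

C_kron = np.kron(ones_row, np.eye(m1))             # 1'_{n2} (x) I_{m1}, shape (m1, n2*m1)
C_block = np.kron(np.eye(m1), ones_row)            # diag(1'_{n2}, ..., 1'_{n2})

Y = np.arange(n2 * m1, dtype=float).reshape(n2, m1)  # an (n2 x m1) flow matrix
col_sums = Y.sum(axis=0)                           # aggregate over the n2 sub-regions

# Kronecker form matches row-by-row stacking of Y,
# block-diagonal form matches column-by-column stacking (vec Y).
agg_kron = C_kron @ Y.flatten(order="C")
agg_block = C_block @ Y.flatten(order="F")
print(agg_kron, agg_block, col_sums)

# In both forms C C' = n2 * I_{m1}
print(np.allclose(C_kron @ C_kron.T, n2 * np.eye(m1)))
```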
Minimizer of ρ
• ML estimate by minimizing the criterion of the 2-step Anselin (1988) procedure for each block separately:
– $\sigma^2_{ML,i} = (e_{o,i} - \rho e_{L,i})'(e_{o,i} - \rho e_{L,i})/n$, for i = a,b,c,d,
– with the ordinary and the W-lag residuals $e_o$ and $e_L$.

Flow predictions
• Completing by flow predictions:
$\hat{y}_b = R^{-1}Xb_\rho + \Omega(\rho)C'(C\Omega(\rho)C')^{-1}(Cy_b - X_C(\rho)b_\rho)$
• where $b_\rho$ is the ML or GLS estimate.

Completing the flow matrix YC
• We can use the same set-up as before, only we have to use the transposed flow matrix $Y_C'$, since the marginal distribution reflects the sum over the rows.
• The disaggregated model for the flows in the n2 x m1 matrix, vec $Y_{21}' = y_c$, is
$y_c = \rho(W_1,W_2)y_c + X\beta_c + \varepsilon_c$

Aggregated model estimation
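• $Cy_c = C\rho(W_1,W_2)y_c + CX\beta_c + C\varepsilon_c$,
• with the aggregation matrix C, where $1'_{m_2}$ is a (1 x m2) vector of ones.
• The flow predictions are
$\hat{y}_c = R^{-1}Xb_\rho + \Omega(\rho)C'(C\Omega(\rho)C')^{-1}(Cy_c - X_C(\rho)b_\rho)$
• where $b_\rho$ is the numerically optimized GLS estimate.
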
Conclusions
• New method for completing correlated cross-sectional (spatial) data.
• The method can be generalized to flow data.
• The data completion method depends on the model: research on combined forecasts (model averaging?) is needed.

Minimizer of ρ
• Now the error sum of squares has to be minimized with respect to 3 rho's, i.e. in the cube $(-1,1)^3$:
$ESS(\rho) = (Cy - X(\rho)b_{GLS}(\rho))'(Cy - X(\rho)b_{GLS}(\rho))$
$= (y - X_\rho b_\rho)'C'C(y - X_\rho b_\rho)$.

• Since ρ is an unknown scalar, it is easy to minimize the error sum of squares ESS with respect to ρ in the interval (-1,1):
$ESS(\rho) = (y_a - CX_\rho\beta_\rho)'(y_a - CX_\rho\beta_\rho) = (y - X_\rho\beta_\rho)'C'C(y - X_\rho\beta_\rho)$
• with $X_\rho = R^{-1}X$, and
• the minimizer is denoted by $b_\rho$.

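The scalar minimization can be sketched as a grid search over (-1,1). In this illustration the coefficient for each candidate ρ is the conditional GLS estimate, and the data are simulated without noise so that the ESS is essentially zero at the true ρ. The function name `ess`, the ring-shaped W, the equal block sizes and all numbers are assumptions.

```python
import numpy as np

def ess(rho, y_a, X, C, W):
    """Aggregated error sum of squares for a given rho, using the conditional GLS estimate."""
    N = W.shape[0]
    R = np.eye(N) - rho * W
    X_C = C @ np.linalg.solve(R, X)                 # C X_rho with X_rho = R^{-1} X
    V_inv = np.linalg.inv(C @ np.linalg.inv(R.T @ R) @ C.T)
    b = np.linalg.solve(X_C.T @ V_inv @ X_C, X_C.T @ V_inv @ y_a)
    r = y_a - X_C @ b
    return r @ r

rng = np.random.default_rng(4)
n, k = 8, 3                                         # 8 aggregate regions with 3 sub-regions each
N = n * k
C = np.kron(np.eye(n), np.ones((1, k)))
W = np.zeros((N, N))
for i in range(N):                                  # ring neighbours, row-normalised
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.5
X = np.column_stack([np.ones(N), rng.normal(size=N)])

rho_true = 0.4
y = np.linalg.solve(np.eye(N) - rho_true * W, X @ np.array([1.0, 2.0]))  # noise-free
y_a = C @ y

grid = np.round(np.linspace(-0.9, 0.9, 19), 1)
values = np.array([ess(r, y_a, X, C, W) for r in grid])
rho_hat = grid[np.argmin(values)]
print(rho_hat)
```

For the three-parameter flow case the same idea applies on a grid over the cube, or with a numerical optimizer, at a higher computational cost.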
The conditional GLS estimate
• The conditional GLS estimate is given by (3.5) and the following ESS has to be minimized:
$ESS(\rho) = (Cy - X(\rho)b_{GLS}(\rho))'(Cy - X(\rho)b_{GLS}(\rho))$
$= n_2 (y_b - X_\rho b_\rho)'C'C(y_b - X_\rho b_\rho)$
• with $X(\rho) = R^{-1}X$ and since $CC' = n_2 I_{m_1}$.