Spatial Econometrics

Why it may be important and some
helpful hints getting started
Philip Watson
Colorado State University
Dept. of Ag. and Resource Economics
2.17.2006
Objectives

Briefly describe the issue
  Why does space matter?
Discuss how to incorporate the spatial dimension
  Different options for weighting matrices
Present the two different spatial dependence models
  Spatial error
  Spatial lag
Go over some diagnostics
  How to test for the presence of spatial dependence and how to correct for it
Introduction

Who cares?
  First Law of Geography: everything is related to everything else, but near things are more related than distant things
  OLS assumes that observations are independent of one another
  Are Larimer County and Weld County completely independent observations?
[Figure: Counties of the West]
…no, so the OLS independence assumption is violated
So what?
  Depending on the nature of the dependence, OLS will either be inefficient with incorrect standard errors, or be biased and inconsistent
Weighting I

Assumption: the structure of spatial dependence is known a priori, not estimated
  The specification of the weighting matrix “is a matter of considerable arbitrariness and a wide range of suggestions in the literature” (Anselin and Bera 1998)
“Connectivity matrix” specifies the degree of interdependence among observations
  Based on contiguity, Euclidean distance, or even a non-geographical distance measure
  Might be considered a strong assumption, but it is not as strong as assuming the dependence is zero and all observations are spatially independent



Weighting II

Typical types of weighting matrices
  Contiguity
    Rook
    Queen
    First vs second order vs higher order
  Distance
    k nearest neighbors
    Inverse distance
    Distance decay function
  Combination

    $w_{ij} = p\,b_{ij} / d^{a}_{ij}$
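The distance-based options above can be sketched in a few lines of NumPy. This is a minimal illustration, not the weights used in the talk: the function names, the `power` exponent, and the three example points are assumptions, and all points are assumed to be distinct (a zero distance would divide by zero).

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of points."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def inverse_distance_weights(coords, power=1.0):
    """w_ij = 1 / d_ij**power for i != j, and 0 on the diagonal."""
    dist = pairwise_distances(coords)
    W = np.zeros_like(dist)
    off_diag = ~np.eye(dist.shape[0], dtype=bool)
    W[off_diag] = 1.0 / dist[off_diag] ** power
    return W

def knn_weights(coords, k):
    """w_ij = 1 if j is one of the k nearest neighbors of i, else 0."""
    dist = pairwise_distances(coords)
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros_like(dist)
    rows = np.repeat(np.arange(dist.shape[0]), k)
    W[rows, nearest.ravel()] = 1.0
    return W

# Three hypothetical points on a line at x = 0, 1 and 3
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
W_inv = inverse_distance_weights(coords)
W_knn = knn_weights(coords, k=1)
print(W_inv)
print(W_knn)
```

Note that the k nearest neighbors matrix is generally not symmetric: here the third point counts the second as its neighbor, but not the other way around.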
Weighting III


Denote the weighting matrix W
Contiguity matrix
  N×N symmetric matrix where $w_{ij} = 1$ when i and j are neighbors and 0 when they are not
  Makes for a fairly sparse matrix
W matrix is usually row-standardized so all rows sum to 1
  $w^{s}_{ij} = w_{ij} / \sum_j w_{ij}$
  Makes operations with the W matrix an average of neighboring values
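A minimal sketch of the contiguity and row-standardization steps, assuming a hypothetical regular grid of regions rather than real county boundaries. The helper names `rook_contiguity` and `row_standardize` are illustrative, not from any particular package.

```python
import numpy as np

def rook_contiguity(n_rows, n_cols):
    """Binary rook contiguity for a regular grid: cells sharing an edge are neighbors."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:                # neighbor to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < n_rows:                # neighbor below
                W[i, i + n_cols] = W[i + n_cols, i] = 1.0
    return W

def row_standardize(W):
    """Divide each row by its sum so rows sum to 1; rows with no neighbors stay zero."""
    row_sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, row_sums, out=np.zeros_like(W, dtype=float), where=row_sums > 0)

W = rook_contiguity(3, 3)      # 9 regions on a 3x3 grid
Ws = row_standardize(W)
print(W.sum(axis=1))           # number of neighbors per region
print(Ws.sum(axis=1))          # every row sums to 1
```

The binary matrix is symmetric, but the row-standardized one is not: a corner cell with two neighbors gives each a weight of 1/2, while an edge cell with three neighbors gives each 1/3.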
Weighting IV

W matrix used to generate the spatial lag operator, Wy
  $(Wy)_i = \sum_j w_{ij} y_j$
  Weighted average of the y values based on neighbors
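As a small worked example of the spatial lag, assume four hypothetical regions arranged in a line, each bordering only the next one, with made-up y values.

```python
import numpy as np

# Binary contiguity for four regions in a row: 1-2, 2-3, 3-4 are neighbors
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Row-standardize so each row sums to 1
Ws = W / W.sum(axis=1, keepdims=True)

y = np.array([10.0, 20.0, 30.0, 40.0])   # illustrative values only
Wy = Ws @ y                              # spatial lag: average y of each region's neighbors
print(Wy)                                # [20. 20. 30. 30.]
```

The end regions have a single neighbor, so their lag is just that neighbor's value; the interior regions get the mean of the two adjacent values.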
Types of Models


Spatial Error
Spatial Lag
  Also often referred to as the spatial autoregressive model
Nonstandardized vocabulary
  Anselin calls both “spatial autocorrelation” with the first referred to as the spatial error model and the second referred to as the spatial lag model
  Others use different definitions, so watch out
Analogy to Time Series

This issue can be thought of as an analogy to time series autocorrelation
  Dependence moves both ways instead of just one
  Spatial error model is analogous to time-series serially correlated errors
  Spatial lag model corresponds to the time-series lagged dependent variable model
Types of Models: Spatial Error

Spatially lagged error
  Observations are interdependent through unmeasured variables that are correlated across space, or through measurement error that is correlated across space
    A nuisance that arises because we cannot model all the facets of a geographical region that may influence all nearby locations
    May also arise from boundaries that are not perfect measures
      Counties are not labor markets but we use them as proxies
  Theoretically possible to eliminate this type of spatial dependence with proper explanatory variables and correct boundaries of observations
Types of Models: Spatial Error

Space matters only in the error process, not in the substantive portion of the model
  If we were able to add the right variables, moving them out of the error and into the model, then space wouldn’t matter anymore
  Example: two counties affected by the same hurricane 3 years ago
  Example: natural amenity index based on county boundaries, but natural amenities don’t conform to the same boundaries
[Figure: Natural Amenity Index]
Types of Models: Spatial Error

Model
  Start with basic model
    $y = X\beta + e, \quad e \sim N(0, \sigma^2)$
    $y = X\beta + e + \lambda W e$
  If λ=0, this reduces to OLS; if λ≠0, OLS is unbiased and consistent, but the standard errors will be wrong and the estimates of β will be inefficient
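The claim that OLS stays unbiased but reports the wrong standard errors can be checked with a small Monte Carlo. This is only a sketch under stated assumptions: a hypothetical 10×10 grid with rook contiguity, λ = 0.8, a true slope of 2, a spatially smooth regressor held fixed across replications, and errors generated in the form written on the slide, u = e + λWe.

```python
import numpy as np

def rook_row_standardized(side):
    """Row-standardized rook contiguity for a side x side grid (illustrative helper)."""
    n = side * side
    W = np.zeros((n, n))
    for i in range(n):
        r, c = divmod(i, side)
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < side and 0 <= cc < side:
                W[i, rr * side + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
side, lam = 10, 0.8                      # assumed grid size and error dependence
beta = np.array([1.0, 2.0])              # assumed true intercept and slope
W = rook_row_standardized(side)
n = W.shape[0]

# A spatially smooth regressor, generated once and held fixed
x = np.linalg.solve(np.eye(n) - 0.8 * W, rng.standard_normal(n))
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

slopes, naive_se = [], []
for _ in range(500):
    e = rng.standard_normal(n)
    u = e + lam * (W @ e)                # spatially dependent error
    y = X @ beta + u
    b = XtX_inv @ X.T @ y                # OLS estimate
    resid = y - X @ b
    s2 = resid @ resid / (n - X.shape[1])
    slopes.append(b[1])
    naive_se.append(np.sqrt(s2 * XtX_inv[1, 1]))   # textbook OLS standard error

mean_slope = float(np.mean(slopes))
empirical_sd = float(np.std(slopes, ddof=1))
mean_naive_se = float(np.mean(naive_se))
print(f"mean slope estimate : {mean_slope:.3f} (true value 2.0)")
print(f"empirical SD        : {empirical_sd:.3f}")
print(f"mean reported SE    : {mean_naive_se:.3f}")
```

Under these assumptions the average slope should sit close to the true value while the reported standard error understates the actual sampling variability, which is the sense in which the OLS standard errors are wrong.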
Types of Models: Spatial Lag

Spatial lag model
  Dependent variable is affected by the values of the dependent variable in nearby places
    Land value in a county is a function of land value in nearby counties, not just related to common unmeasured variables
[Figure: Average Value of Ag Land and Buildings]
Types of Models: Spatial Lag

Model
  $y_i = x_i\beta + \phi w_i y + e_i$
  Can also include a $w_i x$ term (spatially lagged explanatory variables)
  OLS in this case is biased and inconsistent
Types of Models: Spatial Lag

Quick look at why spatial lag leads to inconsistent estimates
  OLS omits $\phi w_i y$, which thus becomes part of the error
    By construction, $\phi w_i y$ is related to neighboring y’s
    Therefore the neighboring y’s, which themselves depend on $y_i$, are correlated with the error term, unless φ=0
  As opposed to the time series case (where GLS is appropriate), the correlation between observations moves both ways
    Variance matrix is full, not upper triangular as in time series
    Spatial GLS is also biased and inconsistent
    For formal proof see Anselin and Bera 1998
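A short sketch of the argument in matrix form, not the formal proof cited above. It assumes X is fixed, $E[e] = 0$, $E[ee'] = \sigma^2 I$, W has zeros on its diagonal, and |φ| < 1 with row-standardized W so that the inverse and the series exist.

```latex
\begin{align*}
y &= \phi W y + X\beta + e
  \;\Longrightarrow\;
  y = (I - \phi W)^{-1}(X\beta + e) \\
E\big[(Wy)\,e'\big] &= W (I - \phi W)^{-1} E[ee'] = \sigma^2\, W (I - \phi W)^{-1} \\
E\big[(Wy)'e\big] &= \sigma^2 \operatorname{tr}\!\big[W (I - \phi W)^{-1}\big]
  = \sigma^2 \sum_{k \ge 0} \phi^{k} \operatorname{tr}\!\big(W^{k+1}\big)
  = \sigma^2\big(\phi \operatorname{tr}(W^2) + \phi^2 \operatorname{tr}(W^3) + \dots\big)
\end{align*}
```

Because tr(W) = 0, the expression vanishes when φ = 0. Otherwise the spatial lag Wy is correlated with the error, whether it is left in the error term or included as a regressor, which is why OLS breaks down.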
Squiggles

Spatial dependence must be accounted for in a maximum likelihood framework or by using a proper set of instrumental variables (Ord 1975)
Spatial lag ML function
Spatial error ML function
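One common way to carry out maximum likelihood for the spatial lag model is to concentrate β and σ² out of the likelihood and search over φ alone. The sketch below does that with a plain grid search on simulated data; it is an illustration, not the routine used by any particular package. The function names, the 20×20 grid, and the true values φ = 0.6 and β = (1, 2) are assumptions, and the grid over (-1, 1) presumes a row-standardized W.

```python
import numpy as np

def rook_row_standardized(side):
    """Row-standardized rook contiguity for a side x side grid (illustrative helper)."""
    n = side * side
    W = np.zeros((n, n))
    for i in range(n):
        r, c = divmod(i, side)
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < side and 0 <= cc < side:
                W[i, rr * side + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def fit_spatial_lag_ml(y, X, W, grid=None):
    """Grid-search ML for y = phi*W*y + X*beta + e using the concentrated log-likelihood."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 191)
    n = len(y)
    identity = np.eye(n)
    Wy = W @ y

    def ols(v):
        coef, *_ = np.linalg.lstsq(X, v, rcond=None)
        return coef, v - X @ coef

    b0, e0 = ols(y)      # regress y on X
    bL, eL = ols(Wy)     # regress Wy on X
    best_ll, best_phi = -np.inf, None
    for phi in grid:
        _, logdet = np.linalg.slogdet(identity - phi * W)   # log|I - phi W| (Jacobian term)
        r = e0 - phi * eL                                   # residuals at this phi
        sigma2 = r @ r / n
        ll = logdet - 0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
        if ll > best_ll:
            best_ll, best_phi = ll, phi
    beta_hat = b0 - best_phi * bL
    return best_phi, beta_hat, best_ll

# Simulated data under assumed parameter values
rng = np.random.default_rng(42)
W = rook_row_standardized(20)
n = W.shape[0]
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
true_phi, true_beta = 0.6, np.array([1.0, 2.0])
y = np.linalg.solve(np.eye(n) - true_phi * W, X @ true_beta + rng.standard_normal(n))

phi_hat, beta_hat, loglik = fit_spatial_lag_ml(y, X, W)
print(f"phi_hat = {phi_hat:.2f}, beta_hat = {beta_hat.round(2)}, log-likelihood = {loglik:.1f}")
```

The log-determinant term is what separates this from running OLS on each candidate φ; without it the search would not be a likelihood maximization.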
Diagnostic

Moran’s I
  Indicates general spatial misspecification
  $I = e'We / e'e$ (for row-standardized weights)
    e = vector of OLS residuals
    W = spatial weights
  Similar to the Durbin-Watson test
  Does not indicate which alternative specification to use
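A minimal sketch of the statistic itself, without its expected value or significance test. The function name and the four-region example are illustrative. The general form scales by n over the sum of all weights, which equals 1 when W is row-standardized, so it reduces to the formula above.

```python
import numpy as np

def morans_i(resid, W):
    """Moran's I for a residual vector: (n / S0) * e'We / e'e, with S0 the sum of all weights."""
    e = np.asarray(resid, dtype=float)
    n = e.size
    s0 = W.sum()
    return (n / s0) * (e @ W @ e) / (e @ e)

# Four regions in a row, row-standardized contiguity
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Ws = W / W.sum(axis=1, keepdims=True)

clustered = np.array([1.0, 1.0, -1.0, -1.0])     # similar residuals sit next to each other
alternating = np.array([1.0, -1.0, 1.0, -1.0])   # neighbors always have opposite signs

i_clustered = morans_i(clustered, Ws)
i_alternating = morans_i(alternating, Ws)
print(i_clustered)     # 0.5
print(i_alternating)   # -1.0
```

Positive values indicate that neighboring residuals tend to be alike, negative values that they tend to differ.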
Diagnostics

Lagrange multiplier tests
  Run regression of the residuals on the original variables and the lagged residuals
    Test for λ=0 and φ=0
    See Anselin et al. 1996
  LM-Lag and Robust LM-Lag
    Pertain to Spatial Lag model as alternative
    Robust: tests for lag dependence in presence of missing error
  LM-Error and Robust LM-Error
    Pertain to Spatial Error model as alternative
    Robust: tests for error dependence in presence of missing lag
  Problem: tests for spatial lag and error can be mutually contaminated by each other
    LM test for λ=0 responds to non-zero φ and vice versa
    Robust takes into account the possibility of a non-zero nuisance parameter
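The four statistics can be computed directly from the OLS residuals. The sketch below uses the closed-form expressions commonly given for these tests, each compared to a chi-squared distribution with 1 degree of freedom; treat it as an illustration under those assumptions rather than a reproduction of any package's output. The function names and the simulated spatial lag data (15×15 grid, φ = 0.6) are hypothetical.

```python
import math
import numpy as np

def rook_row_standardized(side):
    """Row-standardized rook contiguity for a side x side grid (illustrative helper)."""
    n = side * side
    W = np.zeros((n, n))
    for i in range(n):
        r, c = divmod(i, side)
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < side and 0 <= cc < side:
                W[i, rr * side + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def lm_diagnostics(y, X, W):
    """LM and robust LM tests for spatial lag and spatial error, from OLS residuals."""
    n = len(y)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    s2 = e @ e / n                               # ML estimate of the error variance
    T = np.trace(W.T @ W + W @ W)
    d_err = (e @ W @ e) / s2                     # score for the error parameter
    d_lag = (e @ W @ y) / s2                     # score for the lag parameter
    WXb = W @ (X @ b)
    g, *_ = np.linalg.lstsq(X, WXb, rcond=None)
    m_wxb = WXb - X @ g                          # part of WXb not explained by X
    D = (m_wxb @ m_wxb) / s2 + T
    stats = {
        "LM-Lag": d_lag ** 2 / D,
        "Robust LM-Lag": (d_lag - d_err) ** 2 / (D - T),
        "LM-Error": d_err ** 2 / T,
        "Robust LM-Error": (d_err - (T / D) * d_lag) ** 2 / (T * (1.0 - T / D)),
    }
    # Chi-squared(1) upper tail probability: P(Z^2 > s) = erfc(sqrt(s / 2))
    return {name: (float(s), math.erfc(math.sqrt(s / 2.0))) for name, s in stats.items()}

# Simulated spatial lag data under assumed parameter values
rng = np.random.default_rng(1)
W = rook_row_standardized(15)
n = W.shape[0]
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = np.linalg.solve(np.eye(n) - 0.6 * W, X @ np.array([1.0, 2.0]) + rng.standard_normal(n))

results = lm_diagnostics(y, X, W)
for name, (stat, prob) in results.items():
    print(f"{name:16s} {stat:12.4f} {prob:10.7f}")
```

A useful check on any implementation is that LM-Lag plus Robust LM-Error equals LM-Error plus Robust LM-Lag, which follows algebraically from these formulas.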
From: Anselin 2005
DIAGNOSTICS FOR SPATIAL DEPENDENCE - NFIASS
FOR WEIGHT MATRIX : Dissertation Weights New.GAL (row-standardized weights)
TEST                          MI/DF      VALUE        PROB
Moran's I (error)             0.096119   3.7097932    0.0002075
Lagrange Multiplier (lag)     1          13.2385278   0.0002743
Robust LM (lag)               1          3.5785788    0.0585292
Lagrange Multiplier (error)   1          9.6647450    0.0018784
Robust LM (error)             1          0.0047961    0.9447878
DIAGNOSTICS FOR SPATIAL DEPENDENCE - NFIA
FOR WEIGHT MATRIX : Dissertation Weights New.GAL (row-standardized weights)
TEST                          MI/DF      VALUE        PROB
Moran's I (error)             0.123882   4.6386775    0.0000035
Lagrange Multiplier (lag)     1          26.9069495   0.0000002
Robust LM (lag)               1          11.8469234   0.0005776
Lagrange Multiplier (error)   1          16.0540439   0.0000616
Robust LM (error)             1          0.9940178    0.3187624
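Reading these tables usually follows a simple rule: look at the two plain LM tests first, and fall back on the robust versions only when both reject. The sketch below is a simplified version of that rule, applied to the probabilities reported above. The function name, the 0.05 threshold, and the tie-breaking on the smaller robust p-value are assumptions.

```python
def choose_spatial_model(p_lm_lag, p_lm_error, p_rlm_lag, p_rlm_error, alpha=0.05):
    """Simplified decision rule: returns 'ols', 'lag' or 'error'."""
    lag_sig = p_lm_lag < alpha
    err_sig = p_lm_error < alpha
    if not lag_sig and not err_sig:
        return "ols"              # no evidence of spatial dependence
    if lag_sig and not err_sig:
        return "lag"
    if err_sig and not lag_sig:
        return "error"
    # Both plain LM tests reject: prefer the alternative with the stronger robust test
    return "lag" if p_rlm_lag <= p_rlm_error else "error"

# Probabilities taken from the two diagnostic tables above
nfiass_choice = choose_spatial_model(0.0002743, 0.0018784, 0.0585292, 0.9447878)
nfia_choice = choose_spatial_model(0.0000002, 0.0000616, 0.0005776, 0.3187624)
print(nfiass_choice, nfia_choice)   # lag lag
```

In both tables the plain LM tests reject, and the robust lag test is far stronger than the robust error test, which points to the spatial lag specification.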
REGRESSION
SUMMARY OF OUTPUT: SPATIAL LAG MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set            : west
Spatial Weight      : Dissertation Weights New.GAL
Dependent Variable  : NFIASSIMP     Number of Observations : 413
Mean dependent var  : 0.0268696     Number of Variables    : 18
S.D. dependent var  : 0.0386071     Degrees of Freedom     : 395
Lag coeff. (Rho)    : 0.626462
R-squared           : 0.431525      Log likelihood         : 874.714
Sq. Correlation     :               Akaike info criterion  : -1713.43
Sigma-square        : 0.000847318   Schwarz criterion      : -1641.01
S.E of regression   : 0.0291087
-----------------------------------------------------------------------
Variable       Coefficient       Std.Error        z-value       Probability
-----------------------------------------------------------------------
W_NFIASSIMP    0.6264623         0.04290037       14.60273      0.0000000
CONSTANT       -0.01827905       0.0116231        -1.572648     0.1158003
ORCHIMP        0.0316304         0.01921175       1.646409      0.0996796
VEGIMP         0.1204401         0.02197299       5.48128       0.0000000
CORNIMP        0.1146744         0.0512294        2.238448      0.0251917
WHTIMP         0.06172138        0.01830314       3.372175      0.0007459
FRGIMP         0.0179874         0.01102344       1.631741      0.1027340
SILGIMP        0.04123105        0.06479901       0.6362914     0.5245864
HORTIMP        -0.02029143       0.009002836      -2.253893     0.0242028
DIRECTIMP      -0.2231155        0.1135618        -1.964706     0.0494481
STOCKTIMP      -0.006905066      0.006543202      -1.055304     0.2912865
OFFARMTIMP     0.01557627        0.01854374       0.8399748     0.4009224
TEMPIMP        0.0002773511      0.0002004012     1.383979      0.1663650
URBANIMP       0.001173342       0.0007315193     1.603979      0.1087187
SIZEIMP        -6.707924e-007    7.395939e-007    -0.9069739    0.3644205
COOPIMP        -0.5220158        0.6303666        -0.8281147    0.4076054
CUSTIMP        -0.1743578        0.2256416        -0.7727203    0.4396878
DAIRYIMP       0.05479179        0.01530465       3.580075      0.0003436
-----------------------------------------------------------------------
Final Thoughts

OLS suffers from potentially severe omitted variable bias
  Tends to inflate estimates of common-stimuli effects
Spatially weighted GLS dramatically improves the estimates
  Still has simultaneity bias
  Appropriateness depends on the degree of spatial dependence
Spatial Maximum Likelihood is the best option
Spatial Autocorrelation (Lag) is probably a bigger issue than Spatial Error
Choosing a weighting structure?
  …try a few and compare the log likelihood