Case study CBS - Neuchatel SAE

CBS case study
Crime survey
Neuchatel, 7-8 July 2011
Introduction
• Crime and victimization survey
• Planned domains: police districts
• Sample size approx. 750 / district
• 2005-2008: NSM
• 2008 onwards: ISM
SAE of crime statistics
From NSM to ISM
• Local oversampling
• Data collection: sequential mixed-mode
• Different questionnaire
Discontinuities expected
Quantifying discontinuities
• Survey transition from NSM to ISM
• Small scale NSM in parallel to new ISM
(full scale: approx 18,000; small: 1/3rd)
• Discontinuities at national level
• Now: police district level discontinuities
required
• But: NSM sample too small => SAE
Example of discontinuity
[Figure: bicycle thefts, NSM and ISM, 2009]
Total: 541,000 (NSM); 897,000 (ISM)
Coeff. of variation, bicycle theft, 2009
NSM: 0.41; ISM: 0.24
SAE to increase precision of NSM
• Fay-Herriot model: linear mixed, area level
• EBLUP and HB estimators
• Bayesian estimation of model variance also in
EBLUP (avoiding zero-estimates of model
variance)
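The area-level Fay-Herriot estimator can be sketched as follows. For a given model variance, the BLUP is a weighted average of the direct estimate and the regression prediction, with weight sigma2_v / (sigma2_v + psi_i) on the direct estimate. The function name, argument names and toy data are illustrative assumptions; the EBLUP plugs in an estimate of the model variance.

```python
import numpy as np

def fay_herriot_blup(y, X, psi, sigma2_v):
    """BLUP under the area-level Fay-Herriot model for a given model variance.

    y        : direct estimates per area, shape (m,)
    X        : area-level covariates including an intercept column, shape (m, p)
    psi      : sampling variances of the direct estimates, shape (m,)
    sigma2_v : model (between-area) variance, treated as known here
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    psi = np.asarray(psi, dtype=float)
    v = sigma2_v + psi                                # marginal variance of each y_i
    XtVinv = X.T / v                                  # X' V^{-1} for diagonal V
    beta = np.linalg.solve(XtVinv @ X, XtVinv @ y)    # GLS regression coefficients
    gamma = sigma2_v / v                              # weight of the direct estimate
    theta = gamma * y + (1.0 - gamma) * (X @ beta)    # shrinkage towards the model
    return theta, beta, gamma

# Hypothetical toy data: 4 areas, one covariate, equal sampling variances.
y = np.array([1.0, 2.0, 3.0, 5.0])
X = np.column_stack([np.ones(4), np.arange(4.0)])
psi = np.ones(4)
theta, beta, gamma = fay_herriot_blup(y, X, psi, sigma2_v=1.0)
print(gamma)   # weight of the direct estimate per area
print(theta)   # small area estimates
```

Areas with a large sampling variance psi_i get a small gamma_i and lean more on the regression prediction.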
Bayesian estimation of model variance
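One way to read this: put flat priors on the regression coefficients and on the model variance, and use the posterior mean of the model variance instead of the REML maximum. Under these priors the marginal posterior is proportional to the restricted likelihood, so its mean stays positive even when the likelihood peaks at zero. A minimal sketch on an equally spaced grid; the priors, grid, function names and toy data are assumptions, not the settings used in the case study.

```python
import numpy as np

def restricted_loglik(sigma2_v, y, X, psi):
    """Restricted (REML) log-likelihood of the model variance, up to a constant."""
    v = sigma2_v + psi
    XtVinv = X.T / v
    A = XtVinv @ X
    resid = y - X @ np.linalg.solve(A, XtVinv @ y)    # GLS residuals
    _, logdet_A = np.linalg.slogdet(A)
    return -0.5 * (np.sum(np.log(v)) + logdet_A + np.sum(resid**2 / v))

def posterior_mean_model_variance(y, X, psi, grid):
    """Posterior mean of sigma2_v on an equally spaced grid, flat priors assumed."""
    grid = np.asarray(grid, dtype=float)
    ll = np.array([restricted_loglik(s, y, X, psi) for s in grid])
    w = np.exp(ll - ll.max())                         # unnormalised posterior weights
    return float(np.sum(w * grid) / np.sum(w))

# Hypothetical toy data lying exactly on a line, so the REML maximum is at zero.
X = np.column_stack([np.ones(6), np.arange(6.0)])
y = 0.8 + 1.3 * np.arange(6.0)
psi = np.ones(6)
grid = np.linspace(0.0, 20.0, 2001)
pm = posterior_mean_model_variance(y, X, psi, grid)
print(pm)   # strictly positive, unlike the REML estimate
```

The upper end of the grid truncates the posterior; with few areas the result can be sensitive to that choice.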
Covariates from registers
• Police:
Reported offences: property crimes,
violence, assaults, threats, illicit drugs,
weapons, vandalism, traffic offences
• Administration:
age, ethnicity, urbanisation, house prices,
welfare claimants
Covariates from ISM survey
• Design based GREG estimates as auxiliary
information (Ybarra & Lohr 2008)
• Consequences for small area estimates
• Model estimate weighted lower in BLUP due to
error in covariate
• Achieved through higher estimate of model
variance in EBLUP (not Y & L adjustment)
• Variance of GREGs approx. equal for all areas
• (Other idea: multivariate FH model)
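The effect on the weights can be illustrated with a small sketch. Ybarra & Lohr (2008) add the term beta' C_i beta to the model variance in the shrinkage weight, where C_i is the covariance matrix of the error in the estimated covariates of area i. If that term is roughly the same in every area, a larger model variance gives the same weights, which is the route described above. Function names and numbers are hypothetical.

```python
import numpy as np

def direct_weight_fh(sigma2_v, psi):
    """Weight of the direct estimate in the standard Fay-Herriot BLUP."""
    return sigma2_v / (sigma2_v + psi)

def direct_weight_ybarra_lohr(sigma2_v, psi, beta, C):
    """Weight of the direct estimate with the Ybarra & Lohr (2008) adjustment.

    beta : regression coefficients, shape (p,)
    C    : covariance matrix of the covariate error in this area, shape (p, p)
    """
    beta = np.asarray(beta, dtype=float)
    C = np.asarray(C, dtype=float)
    extra = float(beta @ C @ beta)        # variance passed on by the noisy covariates
    return (sigma2_v + extra) / (sigma2_v + extra + psi)

# Hypothetical numbers: intercept without error, one GREG covariate with variance 4.
beta = np.array([0.0, 0.5])
C = np.diag([0.0, 4.0])
w_plain = direct_weight_fh(1.0, 3.0)                        # ignores covariate error
w_yl = direct_weight_ybarra_lohr(1.0, 3.0, beta, C)         # adjusted weight
w_inflated = direct_weight_fh(1.0 + 1.0, 3.0)               # model variance raised by beta' C beta
print(w_plain, w_yl, w_inflated)
```

The equivalence only holds when beta' C_i beta does not vary across areas; otherwise the area-specific adjustment and the inflated model variance give different weights.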
Simulating errors in covariates
• Bicycle thefts: NSM survey ~ police-reported
• No error
• post. mean model var = 1.22
• Adding error, mean 0, sd 2, iterate 1,000 x
• post. mean model var = 1.32
• To add detail, e.g. estimated beta
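A simulation of this kind could be set up as sketched below: estimate the model variance with the clean covariate, then repeatedly add normal noise to the covariate and average the re-estimated model variance. The posterior-mean estimator with flat priors, the grid, the function names and the synthetic data are all assumptions, so the numbers will not reproduce the 1.22 and 1.32 reported above.

```python
import numpy as np

def posterior_mean_sigma2(y, X, psi, grid):
    """Posterior mean of the model variance on an equally spaced grid (flat priors)."""
    ll = np.empty(len(grid))
    for k, s2 in enumerate(grid):
        v = s2 + psi
        XtVinv = X.T / v
        A = XtVinv @ X
        resid = y - X @ np.linalg.solve(A, XtVinv @ y)
        ll[k] = -0.5 * (np.sum(np.log(v)) + np.linalg.slogdet(A)[1]
                        + np.sum(resid**2 / v))
    w = np.exp(ll - ll.max())
    return float(np.sum(w * grid) / np.sum(w))

def simulate_covariate_error(y, x, psi, grid, sd=2.0, n_iter=1000, seed=0):
    """Model variance without covariate error, and averaged over noisy covariates."""
    rng = np.random.default_rng(seed)
    m = len(y)
    ones = np.ones(m)
    base = posterior_mean_sigma2(y, np.column_stack([ones, x]), psi, grid)
    noisy = []
    for _ in range(n_iter):
        x_err = x + rng.normal(0.0, sd, size=m)       # add error with mean 0, given sd
        noisy.append(posterior_mean_sigma2(y, np.column_stack([ones, x_err]), psi, grid))
    return base, float(np.mean(noisy))

# Synthetic stand-in data (not the survey data): 25 areas, one covariate.
m = 25
x = np.linspace(0.0, 20.0, m)
y = 1.0 + x + (-1.0) ** np.arange(m)
psi = np.ones(m)
grid = np.linspace(0.0, 30.0, 301)
# n_iter is kept small here for speed; the slide uses 1,000 iterations.
base, noisy = simulate_covariate_error(y, x, psi, grid, sd=2.0, n_iter=50, seed=1)
print(base, noisy)
```

How much the model variance increases depends on the size of the regression coefficient relative to the added noise, which is why the estimated beta is worth reporting alongside.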
Dimension reduction: PCA
• Rather than using a small subset of
covariates, use small dimension of PC
subspace
• Not guaranteed to work as correlation with
survey variables not used in PCA
• Use as a separate set of potential
covariates in model selection
pc           1    2    3    4    5    6    …    12
var. expl.   .39  .55  .67  .75  .81  .87  …    .99
PC space of dim 2
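The dimension reduction step can be sketched with a plain SVD. The sketch standardises the register covariates, returns the scores on the first few principal components for use as model covariates, and reports the cumulative share of variance explained (the "var. expl." figures above appear to be cumulative). Standardising first, the function name and the random stand-in data are assumptions.

```python
import numpy as np

def pca_scores(Z, n_components):
    """Principal component scores and cumulative explained variance.

    Z : area-by-covariate matrix, shape (m, q)
    """
    Z = np.asarray(Z, dtype=float)
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)   # standardise each covariate
    U, s, Vt = np.linalg.svd(Zs, full_matrices=False)
    var = s**2 / (Zs.shape[0] - 1)                       # variance per component
    cum_explained = np.cumsum(var) / var.sum()
    scores = U[:, :n_components] * s[:n_components]      # area scores on leading PCs
    return scores, cum_explained

# Random stand-in for the register covariates: 25 areas, 6 covariates.
rng = np.random.default_rng(0)
Z = rng.normal(size=(25, 6))
scores, cum_explained = pca_scores(Z, n_components=2)
print(cum_explained)
```

The components are built without looking at the survey variables, so a leading component is not necessarily a good predictor; that is why later components such as pc21 can still be selected.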
Model selection
• Conditional AIC (Vaida & Blanchard 2005)
cAIC = -2 cond_llh + 2 eff_d
(AIC = -2 llh + 2 d)
• Cross validation (CV)
LOO: leave-one-out, predictive accuracy
Start from minimal model, and add terms,
maximizing improvement wrt cAIC or CV,
until no further improvement
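The forward search described above can be sketched for the cAIC criterion. Here cond_llh is the normal log-likelihood of the direct estimates given the fitted area means, and eff_d is the trace of the matrix that maps the direct estimates to the fitted values. This is a simplified sketch: the model variance is held fixed across candidate models and the correction for estimating it is ignored; function names and toy data are hypothetical.

```python
import numpy as np

def caic_fh(y, X, psi, sigma2_v):
    """Conditional AIC of a Fay-Herriot model with known variances."""
    v = sigma2_v + psi
    gamma = sigma2_v / v
    XtVinv = X.T / v
    P = X @ np.linalg.solve(XtVinv @ X, XtVinv)       # maps y to the GLS fit X beta
    H = np.diag(gamma) + (1.0 - gamma)[:, None] * P   # maps y to the BLUP
    resid = y - H @ y
    cond_llh = -0.5 * np.sum(np.log(2.0 * np.pi * psi) + resid**2 / psi)
    eff_d = np.trace(H)                               # effective number of parameters
    return -2.0 * cond_llh + 2.0 * eff_d

def forward_select(y, candidates, psi, sigma2_v):
    """Start from the intercept-only model and add the best term until cAIC stops improving."""
    m = len(y)
    selected = []

    def design(names):
        return np.column_stack([np.ones(m)] + [candidates[n] for n in names])

    best = caic_fh(y, design(selected), psi, sigma2_v)
    while True:
        trials = {n: caic_fh(y, design(selected + [n]), psi, sigma2_v)
                  for n in candidates if n not in selected}
        if not trials:
            break
        name = min(trials, key=trials.get)
        if trials[name] >= best:                      # no further improvement
            break
        selected.append(name)
        best = trials[name]
    return selected, best

# Hypothetical toy data: x1 drives y, x2 is an unrelated alternating pattern.
i = np.arange(20)
x1 = np.linspace(0.0, 1.0, 20)
x2 = (-1.0) ** i
y = 2.0 + 10.0 * x1 + 0.1 * np.sin(i)
psi = np.ones(20)
selected, best = forward_select(y, {"x1": x1, "x2": x2}, psi, sigma2_v=1.0)
print(selected, best)
```

Replacing caic_fh by a leave-one-out prediction error would give the CV variant of the same search.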
Model selection results
For each NSM survey variable: 2 years, 2 criteria
• CV-models are larger
• cAIC models are nested within CV models
Hence:
Use cAIC models
• Models differ between years
• Alternative: choose single model for both years
Selected models
                  2008                                                          2009
violent crimes    ISM-bicycle-theft, REG-property, REG-weapons, ISM-property    pc21, pc10, pc4
satisf. police    ISM-satisf                                                    age, ISM-satisf, urbanisation
victimization     ISM-property, REG-property                                    pc1, pc21, pc5, pc6
property crimes   ISM-victim, elderly                                           pc1, pc21, pc2, pc5, pc6
nuisance          ISM-nuisance, elderly                                         ISM-victim, REG-traffic, ISM-property
feeling unsafe    ISM-nuisance, house val, ISM-satisf                           ISM-unsafe, ISM-satisf
degradation       pc1, pc4, pc10, pc22                                          ISM-degrad
bicycle theft     ISM-bicycle                                                   ISM-bicycle, ISM-satisf
Selected models excl. ISM
                  2008                               2009
violent crimes    PC                                 PC
satisf. police    PC                                 PC
victimization     REG-property, elderly              PC
property crimes   REG-property, age                  REG-property, REG-traffic, REG-weapons
nuisance          PC                                 PC
feeling unsafe    PC                                 PC
degradation       urban, house val, REG-vandalism    PC
bicycle theft     PC                                 PC
SAE results (hybrid EBLUP),
reduction in coeff. of variation
                  incl. ISM   excl. ISM
violent crimes    -40 %       -40 %
satisf. police    -47 %       -46 %
victimization     -43 %       -41 %
property crimes   -44 %       -42 %
nuisance          -43 %       -33 %
feeling unsafe    -33 %       -25 %
degradation       -35 %       -16 %
bicycle theft     -39 %       -33 %
Bicycle theft, cv, 2009
NSM: 0.41, EBLUP: 0.23, ISM: 0.24
SAE results, weight of direct est. in BLUP
                  incl. ISM   excl. ISM
violent crimes    0.21        0.27
satisf. police    0.24        0.35
victimization     0.20        0.22
property crimes   0.19        0.24
nuisance          0.21        0.27
feeling unsafe    0.39        0.41
degradation       0.31        0.64
bicycle theft     0.35        0.32
EBLUP vs. Hierarchical Bayes
                  Diff. point est.   Diff. var est.
violent crimes    -0.1 %             -4.7 %
satisf. police    +0.0 %             -3.8 %
victimization     -0.0 %             -4.7 %
property crimes   -0.0 %             -4.5 %
nuisance          +0.0 %             -4.6 %
feeling unsafe    -0.0 %             -3.1 %
degradation       -0.0 %             -4.0 %
bicycle theft     -0.2 %             -2.6 %
HB accounts for uncertainty in estimating the model variance
Conclusions
• Considerable increase in precision with SAE
• Gain in precision depends on variable
• PCA is important for some variables
• Using ISM outcomes is important for some variables
• MSE estimates from HB are higher (preferable)
To do (maybe)
• Sort out errors in input data! And re-run
everything.
• Calibration to direct estimate of totals (is
model diagnostic)
• Study residuals
• Elaborate on errors in covariates
• Use past survey outcomes as covariates
• More detailed comparison of HB-NSM
estimates with ISM
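For the calibration item in the list above, one common choice is ratio benchmarking: scale all model-based estimates by a single factor so that their population-weighted total matches the direct estimate of the total. The factor itself then serves as a rough model diagnostic, since a value far from 1 points to bias in the model. Ratio benchmarking is swapped in here as an illustration; the function name, population sizes and estimates are hypothetical.

```python
import numpy as np

def ratio_benchmark(theta_model, theta_direct, N):
    """Scale model-based area estimates so their total matches the direct total.

    theta_model  : model-based estimates of area means, shape (m,)
    theta_direct : direct estimates of area means, shape (m,)
    N            : area population sizes, shape (m,)
    """
    theta_model = np.asarray(theta_model, dtype=float)
    theta_direct = np.asarray(theta_direct, dtype=float)
    N = np.asarray(N, dtype=float)
    total_direct = np.sum(N * theta_direct)
    total_model = np.sum(N * theta_model)
    factor = total_direct / total_model       # close to 1 if the model is unbiased overall
    return factor * theta_model, factor

# Hypothetical numbers for three areas.
N = np.array([100.0, 200.0, 300.0])
direct = np.array([0.10, 0.20, 0.30])
model = np.array([0.12, 0.18, 0.28])
benchmarked, factor = ratio_benchmark(model, direct, N)
print(factor)
print(benchmarked)
```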
Future work (post-ESSnet)
• Multivariate modelling of NSM and ISM
variables
• Consider model averaging
• Using more detailed areas, with smaller
sample sizes: beneficial?