Open Problems in Multilevel Regression and Poststratification

[Figure: panels of observed and imputed data, each with an axis running from −4 to 4. Panel titles: Observed data; Imputed data, missing completely at random; Imputed data, fitted normal model; Imputed data, various assumptions.]
Easier Said Than Done: Open Problems in Multilevel Regression and Poststratification
Andrew Gelman
Department of Statistics and Department of Political Science
Columbia University, New York
Conference in honor of Rod Little, 31 Oct 2015
From Wikipedia:
Survey response rates are going through the floor!
[Figure: response rate (0% to 100%) plotted against year (1940 to 2000).]
The poststratification identity
$$\theta = \frac{\sum_{j=1}^{J} N_j \theta_j}{\sum_{j=1}^{J} N_j}$$
The poststratified estimate
$$\hat{\theta} = \frac{\sum_{j=1}^{J} N_j \hat{\theta}_j}{\sum_{j=1}^{J} N_j}$$
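As a rough illustration of the poststratified estimate, the sketch below computes the population-weighted average of cell-level estimates. The function name `poststratify`, the three cells, and all counts and estimates are hypothetical and chosen only for the arithmetic. In MRP the cell estimates would come from a fitted multilevel model, which is not shown here.

```python
def poststratify(cell_counts, cell_estimates):
    """Return sum_j N_j * theta_hat_j / sum_j N_j over poststratification cells."""
    if len(cell_counts) != len(cell_estimates):
        raise ValueError("need one estimate per poststratification cell")
    total = sum(cell_counts)
    if total <= 0:
        raise ValueError("population cell counts must sum to a positive number")
    weighted_sum = sum(n * t for n, t in zip(cell_counts, cell_estimates))
    return weighted_sum / total


# Hypothetical inputs: three cells with made-up population counts N_j
# and made-up cell-level estimates theta_hat_j.
N = [1000, 3000, 6000]
theta_hat = [0.60, 0.50, 0.40]

estimate = poststratify(N, theta_hat)
print(round(estimate, 3))
```

The estimate is pulled toward the cells with the largest population counts N_j, regardless of how many respondents each cell contributed to the sample.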
Xbox estimates, adjusting for demographics:
Xbox estimates, adjusting for demographics and partisanship:
Why multilevel regression?
Open problems in MRP
- Deep interactions
- Non-census variables
- Survey weights
- Cluster sampling
- Estimating regression coefficients
- Building trust in results