General Online Research Conference (GOR 17)
15-17 March 2017
HTW Berlin University of Applied Sciences
Berlin, Germany
Edith de Leeuw, Utrecht University
Joop Hox, Utrecht University
Benjamin Rosche, Utrecht University
Predicting Nonresponse and Attrition in a Probability-Based
Online Panel
Contact: Edith de Leeuw [email protected]
Suggested citation: de Leeuw, Edith, Hox, Joop, Rosche, Benjamin. 2017. “Predicting Nonresponse and Attrition in a Probability-Based Online Panel.” General Online Research (GOR) Conference, Berlin.
This work is licensed under a Creative Commons Attribution 4.0 International License
(http://creativecommons.org/licenses/by/4.0/)
Predicting Nonresponse and Attrition
in a Probability-Based Online Panel
Edith de Leeuw, Joop Hox, & Benjamin Rosche
Department of Methodology & Statistics, Utrecht University
General Online Research Conference
Berlin, 15-17 March 2017
Online Panels
Online panels are one of the fastest-growing data collection tools
• Probability-based and non-probability-based
Probability-based panels are the state of the art in Europe and the USA
Nonresponse and attrition are always a threat, also to probability-based panels
However, probability-based panels have advantages:
• Response rates are known and can be calculated
• It is known who is not responding and who is attriting
• The question is: can we do something about it?
Research Question
Theories on Nonresponse
• Social Exchange Theory
• Planned Behaviour (Reasoned Action)
• Social Psychological (special case: Leverage-Saliency)
Important variables: socio-biographical indicators and attitudes
Do survey attitudes predict wave nonresponse and attrition?
Do survey attitudes predict better than standard indicators (e.g., age, education, income, urbanization)?
Available Data
Dutch LISS Panel (www.lissdata.nl)
Probability-based online panel since October 2007
Core questionnaire every year (longitudinal part)
Varying questionnaires every month (cross-sectional)
Survey Attitude Scale in the core questionnaire (2008-2013)
Nine-item scale (de Leeuw et al., 2010)
• 34 variables (socio-demographic, psychographic)
Selection based on expert ratings:
• 13 clear indicators of nonresponse, used as covariates
Dependent variable based on each individual panel member's
number of completed interviews per year and
number of invitations per year
• Period 2008-2015
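The dependent variable combines two yearly counts per panel member. A minimal sketch of that bookkeeping follows; the record layout, field names, and numbers are hypothetical and are not taken from the LISS data files.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonYear:
    """One panel member in one calendar year (hypothetical layout)."""
    member_id: str
    year: int
    completes: int  # number of completed interviews in that year
    invites: int    # number of invitations received in that year


def response_rate(row: PersonYear) -> Optional[float]:
    """Completed interviews per invitation; None if nobody was invited."""
    if row.invites == 0:
        return None
    return row.completes / row.invites


# Made-up example: 9 completed questionnaires out of 12 invitations.
example = PersonYear(member_id="m001", year=2008, completes=9, invites=12)
rate = response_rate(example)  # 0.75
```

Keeping the two counts separate, rather than storing only the ratio, is what allows the count of invitations to be used as an offset in a count model.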
Survey Attitude Scale
3 subscales: enjoy, value, burden
3 items per subscale, reliabilities OK
Nice, simple factor structure
Survey Attitude
• Is survey attitude a stable individual characteristic, or does it fluctuate over time?
• State and Trait Model (Bons et al., 2015)
• Variance: 37-62% trait, 6-33% state
• Use trait and state variables to predict nonresponse: trait = mean score over time, state = deviation from that mean
[Figure: state-trait model with a common trait, specific traits (Enjoy, Value, Burden), and a state at time t]
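The split into a trait predictor (mean score) and a state predictor (deviation over time) can be sketched as below. This is only an illustration on made-up scores: the slides do not say how missing waves were handled or over which waves the mean was taken, so the function simply uses whatever waves are supplied.

```python
from statistics import mean
from typing import Dict, Tuple


def trait_and_state(scores_by_year: Dict[int, float]) -> Tuple[float, Dict[int, float]]:
    """Split yearly subscale scores into a person mean and yearly deviations.

    trait: the person's mean score across the supplied waves
    state: each wave's score minus that mean
    """
    trait = mean(scores_by_year.values())
    state = {year: score - trait for year, score in scores_by_year.items()}
    return trait, state


# Hypothetical enjoyment scores for one panel member on a 7-point scale.
enjoyment = {2008: 5.0, 2009: 4.0, 2010: 6.0}
trait, state = trait_and_state(enjoyment)
# trait is 5.0; state is {2008: 0.0, 2009: -1.0, 2010: 1.0}
```

By construction the deviations of one person sum to zero, so the state predictor carries only within-person change and the trait predictor only between-person differences.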
Analysis Strategy
Multilevel Negative Binomial Regression (Response)
Dependent variable: count of completes per year
Offset: count of invites per year
Period 2008-2013
Step 1: only survey attitude predictors
Step 2: all covariates (nonresponse correlates) added
Period 2014-2015
Cross-validation of prediction models from steps 1 and 2
Completes and invites
Survival Analysis on Attrition
• Attrited between 2008 and 2015
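One way to write down a count model of this kind is given below. The notation is an assumption for illustration: the slides do not spell out the link function or the random-effects structure, so a log link, a random intercept per panel member, and an offset entering as the log of the number of invitations are assumed here. The symbols are y_it (completes of member i in year t), n_it (invitations), x_it (survey attitude predictors and, in step 2, the covariates), and u_i (member-level random intercept).

```latex
\begin{aligned}
y_{it} &\sim \mathrm{NegBin}(\mu_{it}, \alpha), \\
\log \mu_{it} &= \log n_{it} + \beta_0 + \beta_1\,\mathrm{year}_t
  + \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + u_i,
  \qquad u_i \sim N(0, \sigma_u^2).
\end{aligned}
```

Under these assumptions, moving the offset to the left-hand side gives a model for log(mu_it / n_it), that is, for the expected number of completes per invitation, and exp(beta) is read as a rate ratio.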
Results 1
Dependent variable: number of completed interviews per year (p.a.)

                              Model 1:           Model 2:
                              Survey Attitude    SAS + covariates
                              Exp(B)             Exp(B)
Intercept                     0.210              0.201**
Year (2008 = 0)               0.963**            0.964**
Survey attitude scale
  Enjoyment: mean             1.218**            1.201**
  Enjoyment: deviation        1.021**            1.021**
  Value: mean                 1.084**            1.070**
  Value: deviation            1.001              1.002
  Burden: mean                0.883**            0.892**
  Burden: deviation           0.992**            0.992**
Female                                           1.031*
Age                                              1.006**
Years of education                               0.994**
Migrant                                          0.924†
Dwelling: Self-owned                             1.020
Urbanization                                     0.993
Household income                                 1.000
Household size                                   0.989*
SimPC                                            0.959*
Social (generalized) trust                       1.001
Voted                                            1.060†
Dissatisfied leisure time                        0.994**
Agreeableness                                    0.974**

Predicting Response
• The 6 survey attitude predictors account for 62.8% of the total explained variance
• All ‘trait’ attitudes significant
• 2 of 3 ‘state’ attitudes significant: enjoyment and burden
• Adding covariates does not change the attitude model
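The Exp(B) values can be read as multiplicative effects on the expected number of completes per invitation. The sketch below uses the Model 1 values reported on this slide; it assumes each predictor enters linearly in its original units, which the slides do not state (predictors may have been centered or rescaled), so the numbers illustrate the arithmetic rather than reproduce the authors' predictions.

```python
import math
from typing import Dict

# Exp(B) values reported for Model 1 (survey attitude predictors only).
MODEL1_EXP_B = {
    "year": 0.963,
    "enjoyment_mean": 1.218,
    "enjoyment_deviation": 1.021,
    "value_mean": 1.084,
    "value_deviation": 1.001,
    "burden_mean": 0.883,
    "burden_deviation": 0.992,
}


def rate_ratio(changes: Dict[str, float]) -> float:
    """Multiplicative change in expected completes per invitation.

    `changes` maps a predictor name to a change in that predictor,
    assuming (hypothetically) that predictors enter in original units.
    """
    log_ratio = sum(math.log(MODEL1_EXP_B[name]) * delta
                    for name, delta in changes.items())
    return math.exp(log_ratio)


five_years_later = rate_ratio({"year": 5})        # about 0.83
one_point_more_enjoyment = rate_ratio({"enjoyment_mean": 1})  # about 1.218
```

On this reading, a yearly rate ratio of 0.963 compounds to roughly a 17% lower completion rate after five years, while a one-point higher enjoyment trait goes with a roughly 22% higher rate.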
Results 2
Correlation (R) of predicted with observed response rate

SAS trait based on scores of:     Observed 2014     Observed 2015
                                  response rate     response rate
Predicted response rate using model 1 (survey attitude scale, only trait part)
  2008                            R=0.120           R=0.108
  2008-10                         R=0.140           R=0.135
  2008-13                         R=0.238           R=0.230
Predicted response rate using model 2 (adding covariates)
  2008                            R=0.304           R=0.295
  2008-10                         R=0.295           R=0.292
  2008-13                         R=0.335           R=0.330

• Cross-validation using holdout sample
• Predicting response in 2014-2015 from the model for 2008-2013
• Better prediction with more information:
  • More trait scores and closer in time
  • Covariates
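The cross-validation step compares predicted response rates with the rates observed in the holdout years. A minimal sketch of that comparison is shown below, assuming the reported R is a Pearson correlation (the slides do not name the coefficient); the predicted and observed values are made up.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical panel members: rate predicted from the 2008-2013 model
# versus the rate actually observed in a holdout year.
predicted = [0.60, 0.75, 0.80, 0.90]
observed = [0.50, 0.70, 0.95, 0.85]
r = pearson_r(predicted, observed)
```

Because the holdout years were not used to fit the model, a correlation computed this way is not inflated by overfitting to the 2008-2013 data.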
Results 3
Dependent variable: Dropout

                              Model 1:           Model 2:
                              survey attitudes   add covariates
                              Coef.              Coef.
Intercept                     0.920*             0.637†
Year (2008 = 0)               -0.148**           -0.135***
Survey attitude scale
  Enjoyment: mean             -0.511**           -0.523***
  Enjoyment: deviation        -0.019             -0.027
  Value: mean                 -0.262**           -0.272***
  Value: deviation            -0.116*            -0.138***
  Burden: mean                0.271**            0.261***
  Burden: deviation           0.018              0.017
Female                                           -0.047
Age                                              -0.001
Years of education                               -0.020***
Migrant                                          -0.016
Dwelling: Self-owned                             -0.066
Urbanization                                     0.018
Household income                                 0.000***
Household size                                   -0.019
SimPC                                            -0.259*
Social (generalized) trust                       0.005
Voted                                            -0.298***
Dissatisfied leisure time                        0.021
Agreeableness                                    0.248*

Survival Analysis: predicting Dropout
• All ‘trait’ scores significant, only 1 ‘state’ score
• Survey attitudes predict 17.04% of the observed dropout
• Adding 13 covariates increases this to 17.22%
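Coefficients from a dropout model become easier to read once exponentiated. The sketch below does this for the Model 1 trait coefficients reported on this slide. Whether the result is a hazard ratio or an odds ratio depends on the type of survival model, which the slides do not specify, so the code only claims a multiplicative effect on the model's own scale.

```python
import math

# Model 1 coefficients for the three trait (mean) scores, as reported.
MODEL1_DROPOUT_COEF = {
    "enjoyment_mean": -0.511,
    "value_mean": -0.262,
    "burden_mean": 0.271,
}


def multiplicative_effect(coef: float, delta: float = 1.0) -> float:
    """exp(coef * delta): effect of a `delta`-unit change in the predictor."""
    return math.exp(coef * delta)


effects = {name: multiplicative_effect(coef)
           for name, coef in MODEL1_DROPOUT_COEF.items()}
# enjoyment_mean: about 0.60, value_mean: about 0.77, burden_mean: about 1.31
```

Values below 1 (enjoyment, value) go with a lower dropout risk per unit increase, and values above 1 (burden) with a higher one.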
Summary
Similar results from multilevel negative binomial regression predicting counts of responses per year and from survival analysis predicting panel dropout
Survey attitude trait (mean) score predicts well
Survey attitude state (deviation) score predicts less well
The usual demographics are important but do not diminish the importance of attitudes
Variance explained by attitudes is around 15-20%
With demographics added, this increases to 25-30%
Prediction for later years is better if more, and more recent, measures of the predictors are used
Conclusion
Traits are presumably difficult to change
States are temporary and can be influenced
Enjoyment most important, Burden second
For the Enjoyment and Burden subscales both trait and state predict response, for Value only the trait
Stress that the survey is enjoyable and easy to fill in
Make surveys enjoyable and easy (cf. Dillman)
Gamification? (cf. Cape, Keusch & Zhang, Puleston)
Short (“bonsai”) surveys? (cf. Puleston)
Acknowledgements
We thank Annette Scherpenzeel, Corrie Vis &
Miquelle Marchand (LISS-CentERdata) for their
knowledgeable assistance in procuring the LISS
data
We thank 31 international experts in survey
methodology and nonresponse, who rated
theoretical indicators of nonresponse on
importance.
We are very grateful for your labour of love!
References
• Cape, P. (2016). Gamifying questions using text alone. GOR 2016 archive.
• Dillman, D.A. (1978). Mail and Telephone Surveys: The Total Design Method. New York: Wiley (and later works, e.g., Dillman, D.A., Smyth, J.D., & Christian, L.M. (2009). Internet, Mail and Mixed-Mode Surveys: The Tailored Design Method. New York: Wiley).
• Bons, H., Hox, J., de Leeuw, E., & Schouten, B. (2015). Stability of the survey attitude scale over time: A latent state-trait analysis. Poster presented at the JOS 30th anniversary conference, SCB, Stockholm.
• De Leeuw, E. D., Hox, J. J., Lugtig, P., Scherpenzeel, C. V., Goritz, A., & Bartsch, S. (2010). Measuring and comparing survey attitude among new and repeat respondents cross-culturally. WAPOR 63rd Annual Conference, Chicago.
• Keusch, F., & Zhang, Ch. (2015). A review of issues in gamified surveys. SSCR, 1-120.
• Puleston, J. (2015). The art of asking questions. Workshop at the 2015 GOR conference, Cologne.
• Puleston, J. (2012). Gamification 101: From theory to practice, parts 1 & 2. Quirk’s Marketing Research Media.
Appendix A
Survey Attitude Scale
Nine items, based on the literature
Analysis showed that 3 constructs existed and were measured reliably:
Enjoyment, Value, & Burden (de Leeuw et al., 2010)
These are described in the following slides
References in parentheses indicate the literature where these questions were used earlier
All questions were answered on a 7-point scale, ranging from ‘Totally Disagree’ to ‘Totally Agree’. Only the endpoints of the response scales were labeled.
Survey Attitude Scale
Three Constructs: I
Survey Enjoyment
I really enjoy responding to questionnaires
through the mail or Internet
(Cialdini/Rogelberg)
I really enjoy being interviewed for a
survey (Cialdini/Rogelberg)
Surveys are interesting in themselves
(Stocké)
Survey Attitude Scale
Three Constructs: II
Survey Value
Surveys are important for society (Stocké)
A lot can be learned from information collected
through surveys (Rogelberg)
Completing surveys is a waste of time (-)
(Rogelberg/Singer)
Survey Attitude Scale
Three Constructs: III
Survey Burden
I receive far too many requests to participate in surveys (Cialdini)
Opinion polls are an invasion of privacy (Goyder)
It is exhausting to answer so many questions in a survey (Stocké)
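Scoring the nine items into the three subscales can be sketched as follows. The item keys are hypothetical labels, and averaging the three items per subscale is an assumption; the slides only state the 7-point format and mark "Completing surveys is a waste of time" as negatively worded (-), which is why that item is reverse-coded here.

```python
from statistics import mean
from typing import Dict

SCALE_MIN, SCALE_MAX = 1, 7  # 'Totally Disagree' .. 'Totally Agree'

# (hypothetical item key, reverse-coded?) per subscale
SUBSCALES = {
    "enjoyment": [("enjoy_questionnaires", False),
                  ("enjoy_interviewed", False),
                  ("surveys_interesting", False)],
    "value": [("important_for_society", False),
              ("much_can_be_learned", False),
              ("waste_of_time", True)],  # marked (-) on the slide
    "burden": [("too_many_requests", False),
               ("invasion_of_privacy", False),
               ("exhausting", False)],
}


def score_subscales(answers: Dict[str, int]) -> Dict[str, float]:
    """Mean item score per subscale after reverse-coding negative items."""
    scores = {}
    for subscale, items in SUBSCALES.items():
        values = []
        for key, reverse in items:
            raw = answers[key]
            if not SCALE_MIN <= raw <= SCALE_MAX:
                raise ValueError(f"{key}: answer {raw} is outside the scale")
            values.append(SCALE_MIN + SCALE_MAX - raw if reverse else raw)
        scores[subscale] = mean(values)
    return scores


# Made-up respondent who enjoys and values surveys and feels little burden.
answers = {
    "enjoy_questionnaires": 6, "enjoy_interviewed": 5, "surveys_interesting": 7,
    "important_for_society": 7, "much_can_be_learned": 7, "waste_of_time": 1,
    "too_many_requests": 2, "invasion_of_privacy": 1, "exhausting": 3,
}
scores = score_subscales(answers)
# enjoyment is 6, value is 7 (waste_of_time 1 becomes 7), burden is 2
```

Reverse-coding before averaging keeps a high Value score meaning a positive view of surveys, in line with the other two Value items.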
Appendix B
Indicators of Nonresponse as Covariates
• Gender
• Age
• Years of education
• Migrant or not
• Household size
• Household income
• Type of dwelling
• Urbanization
• SIMPC
• Generalized (social) trust
• Voted in at least one national election
• Opportunity costs (dissatisfaction with amount of leisure time)
• Agreeableness (Big Five)