
The Choice is Yours
Comparing Alternative Likely Voter Models within Probability and Non-Probability Samples
By Robert Benford, Randall K Thomas, Jennifer Agiesta, Emily Swanson
Likely voter models often improve election predictions, both for voter turnout and for vote
choice. Successful modeling typically combines several measures to estimate registered
voters, voter turnout, and vote outcome. A number of factors have made likely voter
modeling more difficult, including the broader use of early voting and changes in
sampling and data collection mode. In October 2013 the AP-GfK Poll moved from dual-frame
RDD telephone surveys to an online protocol using KnowledgePanel®, the
largest U.S. probability-based online panel, enabling rapid and detailed national polling.
Though KnowledgePanel can be used for national projections, one key interest is
prediction of voting outcomes by state, where KnowledgePanel can fall short. As such, GfK
and The Associated Press (AP) have examined how larger, demographically balanced
non-probability (opt-in) samples could supplement probability-based (KnowledgePanel) samples through
a calibration methodology. To study this, we selected two states with diverse populations:
one in the Midwest often favoring Democrats (Illinois) and one in the South often favoring
Republicans (Georgia). Each state had both Senatorial and Gubernatorial races on the ballot.
In each state, two parallel surveys with about 800 KnowledgePanel and 1,600 opt-in
respondents were administered immediately prior to the elections. Respondents in each
sample were randomly assigned to one of two alternative sets of likely voter items: either the
AP-GfK Poll’s standard likely voter set or an alternate set driven mainly by stated intention
to vote. We report estimates of registered voters, turnout, and election results by likely voter
model, how that model can be optimized, and comparisons of estimates separately from
KnowledgePanel and opt-in samples. While both models predicted well, the revised model
used fewer variables. In addition, calibrating opt-in samples to probability samples can
improve the utility of opt-in polling samples.
Introduction
As with the entire market and survey research industry, polling faces challenges that continue to
erode the fully probabilistic, high-response-rate methods that have historically produced quality
estimates with calculable precision. Probability-based samples of all varieties now carry
unknown levels of imperfection due to coverage error and non-response error, with response
rates often in the low teens or even single digits. Attempts to overcome these sources of
potential error come at high cost and require extensive effort, and they rarely eradicate the
errors. Online samples are one cost-effective alternative, and the cost-quality tradeoff is a
main reason survey and market researchers have experimented with their use.
Essentially, online samples come in two varieties: opt-in or probability-based. Opt-in samples
can further be thought of as community-based (mostly panels) or intercept approaches (mostly
river), with a great deal of variation among sample providers in these approaches. GfK uses both types
of samples depending on a project’s budget and fitness for use. When an opt-in sample is selected
as the best match for a survey, GfK uses routing technology provided by Fulcrum to manage a
large number of opt-in providers, under the theory that more is better and that robustness can
overcome many issues. GfK’s KnowledgePanel is a probability-based online sample with over
50,000 members and is used primarily when fitness for use mandates it. However, probability-based
samples are by nature more expensive to recruit, empanel, and maintain, leading to frames
that cost more and are of moderate size nationally. At times, combinations of these types of
sample are indicated because low-incidence populations or other constraints, such as
geography, make it infeasible to use one type of sample on its own.
GfK’s national polling for the Associated Press is conducted using KnowledgePanel. The AP’s
survey standards allow publication of online polls only if they are conducted using panels
recruited with probability-based methods.
Regardless of online sample source, surveys attempting to represent narrower geographic
areas, such as statewide surveys, can be limited by the amount of sample available for analysis. For
example, a client such as the AP might want to conduct a political survey in a specific state to
assess the horse race in a statewide election or tell the story of political issues in that state.
Subgroup analyses by party, sex, or race are often important, so sample sizes must frequently be
greater than a single online source can provide. KnowledgePanel covers every state in the
U.S., but proportionally leaves some states with less-than-desirable case counts for surveys with a
smaller geographic coverage area. Opt-in samples can be a cost-effective way to supplement
KnowledgePanel, particularly when they align well through weighting or calibration
techniques. This leads to the question of quality by sample source and how the two can work
together to produce quality survey estimates.
Methodology
To address these questions GfK, in coordination with the Associated Press, carried out two
surveys, one in Georgia and one in Illinois, using essentially the entire KnowledgePanel sample
in each state. These samples were supplemented with opt-in panel sources via the
Fulcrum router, managed by GfK. To mimic the population in each state, an interlocking quota
design matched survey respondents to the state distribution of sex by age (18-29, 30-49, 50-64, 65+)
by race (Black/AA, All Other) by educational attainment (Some college or less, College grad or
higher), a 32-cell design.¹ The same demographics were collected for KnowledgePanel
respondents in each state. In addition, opt-in respondents were asked five early adopter
questions that are already asked of and available for the KnowledgePanel sample. Three weights
were computed: adjusting only KnowledgePanel, adjusting only opt-in, and calibrating opt-in to
KnowledgePanel via demographics and the early adopter questions.

¹ State benchmarks are from the American Community Survey three-year averages, 2011-2013.
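As a rough illustration of the calibration step, the sketch below rakes an opt-in sample's weights to a set of target margins, such as one might derive from ACS benchmarks and the weighted KnowledgePanel distributions of the early adopter items. This is a minimal sketch of generic iterative proportional fitting, not GfK's production weighting system; the column names, category labels, and target shares are hypothetical.

```python
import pandas as pd

def rake(df, margins, weight_col="weight", max_iter=50, tol=1e-6):
    """Iterative proportional fitting (raking): repeatedly scale weights so
    the weighted distribution of each variable matches its target margin.
    `margins` maps a column name to {category: target share}."""
    w = df[weight_col].astype(float).copy()
    for _ in range(max_iter):
        max_adj = 0.0
        for var, targets in margins.items():
            shares = w.groupby(df[var]).sum() / w.sum()
            for cat, target in targets.items():
                if shares.get(cat, 0) > 0:
                    factor = target / shares[cat]
                    w[df[var] == cat] *= factor
                    max_adj = max(max_adj, abs(factor - 1.0))
        if max_adj < tol:  # stop once every margin is essentially matched
            break
    out = df.copy()
    out[weight_col] = w
    return out

# Hypothetical margins. The study's interlocking quota design crossed
# sex (2) x age (4) x race (2) x education (2) = 32 cells; a fuller
# implementation would also include the five early adopter items, with
# targets taken from the weighted KnowledgePanel sample.
margins = {
    "sex": {"Male": 0.48, "Female": 0.52},
    "age_group": {"18-29": 0.21, "30-49": 0.35, "50-64": 0.26, "65+": 0.18},
    "race": {"Black/AA": 0.30, "All Other": 0.70},
    "educ": {"Some college or less": 0.62, "College grad or higher": 0.38},
}
# opt_in = rake(opt_in, margins)  # opt_in: a pandas DataFrame of respondents
```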
This research also assessed two different likely voter models in each state. Cases from each
sample source were randomly assigned to either the standard likely voter model or a more direct
stated-intention-to-vote method. The stated intention model is based on registered voters and
includes those who already voted or say they will definitely vote, plus those who say they probably
will vote and say they always or nearly always vote in elections. The stated intention model is
based on three survey questions. The standard model is also based on registered voters and uses a
complex set of definitions that includes past vote frequency, past voting behavior, whether the
respondent has already voted, likelihood to vote, interest in news about the election, and knowing
where to vote. This model requires eight survey questions and defines a likely voter through four
different patterns of survey answers. It is very similar to what others in the polling sector
use.
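The stated intention rule is simple enough to express directly. The sketch below encodes it as described above; the question keys and response labels are placeholders rather than the actual AP-GfK instrument wording, and the standard model is not shown because its four qualifying answer patterns are not fully specified here.

```python
def is_likely_voter_stated_intent(resp: dict) -> bool:
    """Stated intention model: a registered voter qualifies if they have
    already voted or will definitely vote, or if they will probably vote
    and report always or nearly always voting in elections."""
    if not resp.get("registered"):
        return False                      # both models start from registered voters
    if resp.get("already_voted"):
        return True
    intent = resp.get("vote_intent")      # e.g., "definitely" / "probably" / ...
    history = resp.get("vote_frequency")  # e.g., "always" / "nearly always" / ...
    if intent == "definitely":
        return True
    return intent == "probably" and history in ("always", "nearly always")
```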
Within each sample type, sample was randomly assigned to a likely voter model. Sample sizes
for each model and sample source by state are shown in Table 1. To control for consistency between
the two models, specific to this research, the weighting described above was completed within model.
Prior to analysis, weighted data were compared to assess the outcome of the random assignment
and to ensure that important covariates of election outcomes, such as party identification, were equitable.
In Illinois, the demographically weighted outcomes were not equitable between models on party
identification. It is not GfK’s or AP’s standard practice to include party identification in
weighting, given the known variability of this variable. To make the models equitable, the initial
weighted estimate of party identification was used as an additional weighting variable within
each model.
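Continuing the earlier raking sketch, the Illinois adjustment amounts to appending party identification as one more margin and re-raking within each model. The category labels and target shares below are placeholders; as the text describes, the targets would come from the initial demographically weighted estimate.

```python
# Hypothetical re-balancing of an Illinois sample within one model: party
# identification, fixed at its initially weighted estimate, is added as an
# extra raking margin alongside the demographics.
margins["party_id"] = {"Democrat": 0.45, "Republican": 0.35, "Independent": 0.20}
# il_standard = rake(il_standard, margins)
```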
Table 1: Sample sizes by likely voter model and sample type

              KnowledgePanel         Opt-In
Model         Standard    Intent     Standard    Intent
GA            333         321        800         759
IL            494         523        875         877
Results
For analytic purposes there are two states, each with two models, each comprised of three types
of sample: KnowledgePanel only, opt-in only, and both combined through calibration.
Statistical significance is determined at the 95% confidence level using a t-test of proportions
and effective sample sizes to account for variability due to weights. It should also be noted
that while testing of estimates is against parameters, it is also meaningful to assess the absolute
differences in estimates by sample type and model. Throughout the findings there are essentially
thirty-two estimates across the two types of sample. Each is discussed and then summarized.
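The testing approach can be sketched as follows: compute Kish's approximation of the effective sample size, n_eff = (Σw)² / Σw², then test the weighted proportion against the known parameter. This is a minimal illustration consistent with the description above, not the exact production calculation; the normal (z) critical value is used here as the large-sample stand-in for the t-test.

```python
import math

def effective_n(weights):
    """Kish's approximation: n_eff = (sum of weights)^2 / (sum of squared weights)."""
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return s1 * s1 / s2

def differs_from_parameter(p_hat, p0, weights, z_crit=1.96):
    """Two-sided test of a weighted sample proportion p_hat against a known
    population parameter p0 at the 95% confidence level, using effective n."""
    se = math.sqrt(p0 * (1.0 - p0) / effective_n(weights))
    return abs(p_hat - p0) / se > z_crit
```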
Registered Voters
In most U.S. states, including both Georgia and Illinois, one must be registered in order to vote,
which makes voter registration the root of most likely voter models. Table
2 shows estimates for each sample type. KnowledgePanel estimates for registered voters across
models and states are always within statistical tolerance. Opt-in estimates are the most distant
from the actual percentages of registered voters. Interestingly, in Illinois, the calibrated estimate is
closer to the actual share of registered voters than KnowledgePanel’s.
Table 2: Registered to Vote (%)

                Actual    KnowledgePanel    Opt-in     Calibrated
GA Reg Voter    77.0      77.0              81.8**     80.2**
IL Reg Voter    83.3      81.3              85.5**     84.0

** Estimate significantly different from parameter at 95% confidence.
Turnout
The essence of any likely voter model is to predict the population that will actually cast votes on
Election Day. Turnout, operationalized here as the share of registered voters modeled as likely to vote, is
an estimate that is nearly always overstated by likely voter models. All estimates are statistically
significantly higher than actual turnout among registered voters (Table 3). This could be because
those who participate in political surveys are more likely to be interested in politics to begin
with, because of overstatement of vote intention, or because of some combination of the two.
KnowledgePanel was closest to actual turnout in three out of four cases, followed by calibrated
estimates, then opt-in. In one case, the Georgia standard model, the calibrated and opt-in samples
were closer than KnowledgePanel. Overall, the standard model overstated turnout less than the
stated intention model, owing to the additional questions it uses to winnow down the
likely voter pool. However, just because the turnout estimate is closer does not mean the right
mix of voters who turn out is predicted.
Table 3: Turnout (%)

                        Standard Model                        Stated Intent Model
              Actual    KnowledgePanel  Opt-in   Calibrated   KnowledgePanel  Opt-in   Calibrated
GA Turnout    50.0      68.1**          64.3**   64.3**       73.9**          78.1**   74.9**
IL Turnout    49.2      64.0**          69.2**   66.7**       68.9**          77.2**   74.7**

** Estimate significantly different from parameter at 95% confidence.
Election Results
With the exception of the Illinois Governor results in the standard model, KnowledgePanel was
always directionally correct in estimating the elections tested in the surveys and never
significantly different from the actual results for each candidate (Table 4). The calibrated results
performed similarly, but missed the Illinois Governor’s race in both models. Opt-in sample
missed the Illinois Governor’s race in both models and also missed the Georgia Senate race in
the standard likely voter model.
Table 4: Election Results (%)

                        Standard Model                     Stated Intent Model
              Actual    KP      Opt-in    Calibrated       KP      Opt-in    Calibrated
IL Senate
  Durbin      53.5      52.2    56.4      55.4             53.5    53.9      54.6
  Oberweis    42.7      45.6    38.5      39.8             42.6    39.2      38.8
IL Governor
  Quinn       46.3      48.7    49.7      49.4             44.1    48.7      47.6
  Rauner      50.3      48.1    44.5**    45.8**           50.2    45.9**    47.0
GA Senate
  Nunn        45.2      45.2    46.5      45.5             42.2    44.3      43.8
  Perdue      52.9      52.0    45.5**    47.2             49.7    51.0      50.2
GA Governor
  Carter      44.9      38.9    44.4      42.3             41.5    42.9      41.9
  Deal        52.8      54.4    47.8      49.8             49.8    50.9      50.1

KP = KnowledgePanel. Democratic candidate always shown first; third-party candidates not shown.
** Estimate significantly different from parameter at 95% confidence.
Table 5 shows the predicted margin of victory as the percentage for the Democratic candidate
minus the percentage for the Republican candidate; a positive number is a margin in
favor of the Democrat and a negative number a margin in favor of the Republican. This margin is often critical to
calling a race or predicting a winner based on survey estimates. Again, with the exception of the
Illinois Governor’s race in the standard likely voter model, which would have been deemed too
close to call, surveys drawn from KnowledgePanel would likely have resulted in directionally
correct race calls. The calibrated sample was wrong in both models for the Illinois Governor’s
race, and too close to call but directionally correct in the Georgia Senate race in the standard model.
Opt-in sample estimates were wrong in both models for the Illinois Governor’s race and
wrong for the Georgia Senate race in the standard model.
Table 5: Dem-Rep Margin

                        Standard Model                      Stated Intent Model
              Actual    KP        Opt-in    Calibrated      KP      Opt-in    Calibrated
IL Senate     10.8      6.6       17.9**    15.6**          10.9    14.7**    15.7**
IL Governor   -4.0      0.6       5.2**     3.6**           -6.1    2.8**     0.5**
GA Senate     -7.7      -6.8      1.0**     -1.6**          -7.5    -6.6      -6.4
GA Governor   -7.9      -15.5**   -3.4      -7.4**          -8.3    -8.0      -8.2

KP = KnowledgePanel.
** Estimate significantly different from parameter at 95% confidence.
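To make the margin arithmetic concrete, here is a one-line check against the actual Illinois Senate figures from Table 4:

```python
def dem_rep_margin(dem_pct: float, rep_pct: float) -> float:
    """Dem-Rep margin: positive favors the Democrat, negative the Republican."""
    return dem_pct - rep_pct

# Actual 2014 Illinois Senate result (Table 4): Durbin 53.5, Oberweis 42.7
assert round(dem_rep_margin(53.5, 42.7), 1) == 10.8
```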
To assess the surveys’ ability to generate a sample with demographic traits that match those of
the overall electorate, we compared the survey results by model with the National Election Pool
exit poll estimates of sex, race, Hispanic origin, age, and education level. We were also able to
look at the actual share of the electorate by gender and race based on figures released by the
Georgia Secretary of State.
Tables 6 and 7 show weighted demographics among likely voters in each model in each state,
broken down by sample source.
Table 6: Georgia Demographic Comparison (%)

                   KP Only               KP + Opt-in                        Secretary
                   Standard    Stated    Standard    Stated    Exit poll    of State
Men                44          49        47          48        48           45
Women              56          51        53          52        52           55
18-29              6           15        13          15        10           NA
30-44              35          35        35          36        27           NA
45-59              35          28        30          28        34           NA
60+                25          22        22          21        29           NA
White alone        58          66        64          68        65           64
Black alone        33          28        30          27        29           29
Hispanic origin    8           2         6           4         4            1
HS or less         39          39        34          38        18           NA
Some college       32          32        34          32        28           NA
College grad       29          29        32          31        54           NA
Table 7: Illinois Demographic Comparison (%)

                   KP Only               KP + Opt-in
                   Standard    Stated    Standard    Stated    Exit poll
Men                48          43        49          47        50
Women              52          57        51          53        50
18-29              8           13        13          14        11
30-44              27          32        32          32        23
45-59              36          31        31          31        37
60+                28          23        25          23        29
White alone        75          76        76          77        75
Black alone        19          13        16          14        16
Hispanic origin    4           6         10          9         6
HS or less         31          36        33          32        19
Some college       33          31        31          33        30
College grad       37          34        36          35        51
Likely Voter Models
Estimates of candidate vote percentage show very few significant differences by sample within
model. From the perspective of the margin of victory, there are more significant differences, but
a good deal of directional consistency; that is, in a majority of cases, the call would have been
correct. Significance aside, comparing the models across estimates by sample, the stated
intention model was closer to the actual results than the standard model 70% of the time. This
suggests that the stated intention model (fewer questions) may work well as a substitute for the
standard model (more questions).
Conclusions
It seems clear that probabilistic online samples such as KnowledgePanel are a better choice when
the budget and the number of panelists available make that choice feasible. When geographic or
other constraints limit sample availability, supplementing these samples with online opt-in
samples can work well in estimating election outcomes. However, several details are important
in doing so.
Bayesian statisticians argue that knowledge of the posterior distribution can help align samples
so that they are unbiased. This is not dissimilar to weighting, but it extends beyond
geodemographics. When opt-in samples are designed, care needs to be taken not only to mimic
the geodemographics but also to attend to other dimensions when aligning samples to these posteriors.
In this research, attitudes toward early adoption were used, and this steered the opt-in samples
toward accuracy.
What this suggests is that as research practices continue to change and evolve, standard or
typical weighting practices will need to become more creative, aggressive, and often heroic in
nature. These efforts will more likely than not come at the expense of greater variability due to
weighting, but will be deemed necessary for precise estimates of populations.
Last, when it comes to the likely voter models tested here, results are inconclusive based on
statistical significance; there is no clear statistical winner. Even though the standard
model gets closer to turnout among registered voters, the stated intention model performs equally
well when election outcomes are estimated. Thus, given this outcome, one may opt to save
questionnaire space and take the more direct stated intention approach. That, coupled with
appropriate weighting, even when opt-in samples are used alone or calibrated, can produce the
reliable estimates necessary. The choice is yours.