Mixed Modes and Measurement Error: Using Cognitive

Suspicious and Non-Suspicious Response Patterns which
Are and Are Not Problematic:
Cognitive Interviewing as Tool for
Exploring Quantitative Findings
Pamela Campanelli
Independent Survey Methods Consultant, UK
Michelle Gray, Margaret Blake and Steven Hope
National Centre for Social Research, UK
Thank you to:
● UK Economic & Social Research Council (ESRC): Grant RES-175-25-0007
● Core project team members: Gerry Nicolaas (National Centre for Social Research), Peter
Lynn, Annette Jäckle, and Alita Nanda (Institute for Social and Economic Research)
● Cognitive interviewers and respondents
About the Project (1)
● The Cognitive Interviewing was a
pre-planned follow-up
to quantitative mixed modes experiment
● The mixed modes experiment was set up to look at
the interactions between CAPI, CATI and CAWI
And 7 question format comparisons
And easy versus difficult questions
And sensitive versus non-sensitive questions
About the Project (2)
● Cognitive Interviewing Rs were a quota sample
drawn from the mixed modes experiment
(contrasting Rs who displayed satisficing
behaviour with those who did not)
● 37 cognitive interviews were carried out in Rs’
homes in 8 locations across England and
Scotland, lasting approximately 1 hour
About the Project (3)
● The cognitive interviews began with a carefully selected
subset of survey questions from the mixed modes
experiment
● These questions were administered in standard
quantitative fashion and across 3 modes (CAPI, CATI,
and CAWI)
● Then came the actual cognitive interviewing
● It used retrospective think alouds and many pre-written
probes
● The cognitive interviews were transcribed
● Analysis took place using UK qualitative charting
programme, ‘Framework’
Purpose of Paper (1)
Figure 1
                    Suspicious    Non-suspicious
Problematic         A             B
Non-problematic     C             D
Purpose of Paper (2)
● Project was not set up to seek out examples for Figure 1, but
rather interesting examples from Cells C and B were uncovered
● Paper highlights how cognitive interviewing shows alternative
interpretations of quantitative findings
● Paper focuses on:
➢ Acquiescence
➢ Non-differentiation
➢ Middle categories on end-labelled scales
➢ Similar quantitative distributions not being caused by the
same R processes
➢ Problems with simple factual questions in 3 versus 7 or 8
category format
Acquiescence
Acquiescence response set is tendency to agree
to an item regardless of its content
Detection of acquiescence:
● Agreement to opposite statements → individual level
● More agrees in agree/disagree format than in
balanced forced choice format → aggregate level
Mixed Mode Experiment
12 agree/disagree questions split into 3 themes
(with % agreement to opposite statements):
● Aspects of R’s neighbourhood → adapted
from questions on a London Housing Association
questionnaire: 3.4%
● Mental patients or former prisoners
living in the community → extended from two
questions from the 2006 UK British Social Attitudes
survey: 32.7%
● Degree of background work one would
do in making an important financial
decision → extended from three questions from the
UK Attitudes to Pensions Survey: 44.8%
Cognitive Findings (1)
The cognitive interviewing showed that of the 23
instances of agreement to opposite statements, 21
instances were sensibly justified by respondents.
Example:
● N36. Compared to other neighbourhoods, this neighbourhood
has more properties that are in a poor state of repair.
● N38. Compared to other neighbourhoods, this
neighbourhood has more properties that are well kept.
● “In this village, . . . it’s like half and half. There is a
bit [that] . . . wants doing up and there’s the other
part which doesn’t” (Female, no qualifications, very low income,
White British)
Cognitive Findings (2)
Only the remaining two cases were problematic
● Clear acquiescence (possibly due to cultural
politeness – see Javeline, 1999):
➢ “I think I don’t understand that, I just say agree” (Female, no
qualifications, low income, Pakistani with poor English)
● Possible acquiescence:
➢ R had ambivalent feelings; found it hard to choose agree
or disagree
➢ Other Rs with similar views chose the middle category,
thus choice of ‘agree’ could be a type of acquiescence
(Female, high school equivalent, low income, White British)
Non-differentiation
● Where respondents give the same rating to all or
almost all items
● Usually on a battery of questions with same answer
format
● ‘Rating’ as opposed to ‘ranking’ is known to be
vulnerable to this (Krosnick and Alwin, 1988)
➢ Because it is an easier task than ‘ranking’
➢ Set up as battery of questions
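One way to operationalise "the same rating to all or almost all items" is sketched below in Python. The function name and the `max_deviating` threshold are assumptions for illustration. The slides do not state the exact rule used in the experiment.

```python
from collections import Counter


def is_non_differentiated(ratings, max_deviating=1):
    """Flag a battery where all or almost all items received the same rating.

    `max_deviating` is an assumed threshold for "almost all": the number of
    items allowed to differ from the most common rating.
    """
    if not ratings:
        return False
    most_common_count = Counter(ratings).most_common(1)[0][1]
    return len(ratings) - most_common_count <= max_deviating


# Made-up ratings on a 4-item battery, for illustration only.
print(is_non_differentiated([3, 3, 3, 3]))  # identical ratings
print(is_non_differentiated([3, 3, 3, 5]))  # one item differs
print(is_non_differentiated([1, 3, 5, 7]))  # fully differentiated
```

With very short batteries the threshold needs care: with two items, `max_deviating=1` would flag every respondent.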
Mixed Mode Experiment
2 questions used in both rating and ranking format:
● A childhood game where children describe where they live
based on ever increasing geography (e.g., their home street,
town, county, country, UK, Europe, and the world) → taken
from British Social Attitudes, 2006
● A list of improvements to the neighbourhood → newly developed
to fit hypotheses
● Survey experiment found
➢ Higher percentage of non-differentiation with rating

% non-differentiation     Rating    Ranking
Children’s game           34.9      13.7
List of improvements      9.3       3.6

Based on CAWI results which showed the largest differences between formats
Cognitive Findings (1)
● Non-differentiation found among cognitive Rs
➢ 16 of 18 showed non-differentiation on 2, 3 or 4
questions of 4 item subset
● But this does not appear to be satisficing
➢ All of these Rs gave clear and justifiable answers!
● In cases of non-differentiation, Rs were then asked to
➢ Choose which was most important
➢ Say how easy or difficult that was to do
● The majority of respondents answered ‘difficult’
Cognitive Findings (2)
Example:
● Interviewer: “Is that an easy or a difficult choice to make?”
● Respondent: “Probably quite difficult, really.”
● Interviewer: “So what would make that a difficult choice?”
● Respondent: “Well, obviously we’d like more parking here in
the close, but then, I don’t think it’s really achievable, so then
you kind of think, “Well, that’s what I’d like in an ideal world,” but
then the next thing would be, I suppose, the schools would be
more important, would be important to me. But they’re
secondary because they don’t apply to me at the moment, but
they will in the future.”
(Female, first degree, high income, White British)
End-labelled versus Fully-labelled
(and Middle Categories) (1)
On attitude questions
● Both mixed mode experiment and Dillman & Christian (2005)
found a higher percentage of answers at the top of the scale
(more positive answers) in the end-labelled format
● Both studies found a higher percentage of middle category
answers in the end-labelled format (although less in CATI)
On behavioural frequency questions (end-labelled scales not
generally used), but
● Mixed mode project found same pattern!
● Similar distributions on attitude and behavioural questions
could imply that the same response process is responsible
Mixed Mode Experiment
End-labelled versions of questions used
● GB16: On the whole, how satisfied are you with the way
democracy and personal freedom work in Great Britain, where 1
is very satisfied and 7 is very dissatisfied? [from European Social
Survey, 2006 with addition of ‘and personal freedom’ to make the question
more difficult]
● FM68: The next question is about grocery shopping which
includes food, drinks, cleaning products, toiletries and household
goods. How often do you personally do grocery shopping, where
1 is every day and 7 is never? [newly developed to fit hypotheses]
● FM74: In the last two weeks, how many teas, coffees and other
hot beverages have you purchased outside the home, where 0 is
none and more than 25 is 6? [newly developed to fit hypotheses]
GB17 about the state of the economy is not included here because most Rs gave the
economy a bad rating so this question behaved differently than the others.
BUT. . .
Cognitive Findings (1)
● Rs answering attitude question are using the scale
● Most Rs answering behavioural frequency questions are not
using the scale
● Rs choosing middle category on attitude question are a mix of
satisficing and valid answers
● Rs choosing middle category on behavioural frequency
questions were trying to give a valid answer
● Among CATI Rs probed about difficulty
➢ About equal numbers said it would be ‘easy’ or ‘difficult’ to
find the middle category if they wanted to on an aural end-labelled scale
➢ About a third of Rs chose numbers which they thought
were the middle category, but got it wrong
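The arithmetic Rs have to do in their heads to find the middle of an end-labelled scale can be written out as a short Python sketch. The function name is hypothetical. The two example ranges are the 1 to 7 and 0 to 6 numberings used in the end-labelled questions above.

```python
def scale_midpoint(lowest, highest):
    """Return the middle category of a numbered scale, or None if there is none.

    A scale with an even number of points has no single middle category.
    """
    n_points = highest - lowest + 1
    if n_points % 2 == 0:
        return None
    return (lowest + highest) // 2


print(scale_midpoint(1, 7))  # scale numbered 1 to 7
print(scale_midpoint(0, 6))  # scale numbered 0 to 6
```

The midpoint depends on the starting number as well as the number of points, which is one plausible reason an R hearing only the end labels could pick the wrong number.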
3 versus 7 or 8 categories
Mixed Modes Experiment
● Taking the 7 or 8 category format and collapsing it into 3
categories, showed that it was not equivalent to the original 3
category version!
● True for all question comparisons: 2 satisfaction, 2 nominal factual
and 2 ordinal factual questions
● Makes sense on satisfaction questions - more Rs chose the middle
category in 3-point versus 7-point scale
● But very surprising for factual questions - particularly for 2
questions (see next slide)
● Given that respondents had been randomly assigned to the two
question formats, it could be easy to conclude that such differences
were due to randomness (a Type 1 error) rather than being an
important finding
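The collapsing step can be sketched in Python for the "years lived in area" question (FM82). The mapping follows from the category boundaries of the two versions of that question. The function names and the example codes are assumptions for illustration and are not data from the experiment.

```python
from collections import Counter

# FM82 seven-category codes mapped onto the three-category codes:
# 1-3 (less than 3 years) -> 1, 4-5 (3 to under 10 years) -> 2,
# 6-7 (10 years or longer) -> 3.
FM82_COLLAPSE = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}


def collapse(codes, mapping):
    """Recode detailed category codes into the coarser version."""
    return [mapping[code] for code in codes]


def percent_distribution(codes, categories):
    """Percentage of answers falling in each category."""
    counts = Counter(codes)
    n = len(codes)
    return {c: (100.0 * counts[c] / n if n else 0.0) for c in categories}


# Made-up answers to the seven-category version, for illustration only.
long_form_answers = [1, 2, 3, 4, 5, 6, 7, 7]
collapsed = collapse(long_form_answers, FM82_COLLAPSE)
print(collapsed)
print(percent_distribution(collapsed, categories=[1, 2, 3]))
```

The resulting percentages would then be compared with the distribution from respondents who were asked the three-category version directly.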
3 versus 7 or 8 categories: Mixed Modes Experiment

7 or 8 Category Versions

FM75. Which of these best describes your home?
Would you say a . . . (READ OUT) . . .
Detached house                                1
Semi-detached house                           2
Terraced house                                3
Bungalow                                      4
Flat in a block of flats                      5
Flat in a house                               6
Maisonette                                    7
Or other?                                     8

FM82. How long have you lived in this area?
Would you say . . . READ OUT . . .
Less than 12 months                           1
12 months or more but less than 2 years       2
2 years or more but less than 3 years         3
3 years or more but less than 5 years         4
5 years or more but less than 10 years        5
10 years or more but less than 20 years       6
20 years or longer                            7

3 Category Versions

FM75. Which of these best describes your home?
Would you say a . . . (READ OUT) . . .
House                                         1
Flat or maisonette                            2
Or other?                                     3

FM82. How long have you lived in this area?
Would you say . . . READ OUT . . .
Less than 3 years                             1
3 years or more but less than 10 years        2
10 years or longer                            3
Cognitive Findings (1)
For the Cognitive Interviewing,
● Rs were asked the 3 category version of the
questions as part of the survey questions
● Later, Rs were presented with a showcard with the more detailed
categories (without reminding them of their original
survey answer)
● Cognitive interviewers were to probe any
inconsistencies
● None of the 12 respondents were inconsistent
Cognitive Findings (2)
● But it was found that both the ‘dwelling’ and ‘years lived in area’ questions
were confusing.
➢ “What the hell difference is there between a maisonette and a flat and
a block of flats, a flat and a house?” (Male, postgraduate degree,
employed, high income, White British)
➢ Regarding a maisonette. Household member: “I’ll call it a duplex,
yeah.” Respondent: “Well, it’s what they call it in the South.” (Male,
postgraduate degree, employed, high income, White British)
➢ R answered ‘flat,’ but the interviewer observed it as a semi-detached
house. R said it had to be very large to be called a house (Female,
higher education below degree level, employed, medium income, other
ethnicity).
➢ Years lived in area
• Difficulty in remembering the number of years
• Feeling stuck between two categories
• Feeling the short version was much simpler
Conclusions (1)
● Cognitive interviews were a pre-planned follow-up study to the mixed
modes experiment
● Cognitive interviews offered useful
and surprising insights into
quantitative findings
Conclusions (2)
Figure 1 – Re-visited
A (Suspicious, Problematic)
C (Suspicious, Non-problematic)
● Agreeing to opposite statements
● Non-differentiation in rating task
● Middle categories can be difficult to choose on aural end-labelled scale
B (Non-suspicious, Problematic)
● Same quantitative distribution doesn’t mean same underlying cause
● Even basic factual questions can be misunderstood
D (Non-suspicious, Non-problematic)
Conclusions (3)
Potential limitations
● Some problems with mixed mode experiment questions
➢ Cognitive interviews also good at picking up these unexpected
problems
● Retrospective think aloud and probing leading to post hoc
rationalisations
➢ Never know for sure, but didn’t appear to be the case
● Cognitive interview Rs unusual because
➢ Had been interviewed twice previously
➢ Were quota of Rs with less than optimal behaviour and those with
opposite profile
Hope more use of cognitive interviewing as
follow-up study