Week 8 - Cognitive Aspects of Survey Methodology

SC970 Week 8
Cognitive Aspects of Survey Methodology
Outline
• Cognitive Aspects of Survey Methodology
– The response process
• Errors in different stages of the response process
– Examples
• Implications for questionnaire design
– See Groves et al ch. 7
• Discussion of coursework and further examples
Sources of survey errors
Measurement error
• Deviation between reported and “true” score
• True score not always easy to determine
– Objective constructs
• Observable
• More or less clearly defined
– Subjective constructs
• Not observable: thoughts, feelings, attitudes, opinions
• Constructed by respondent (often on the spot)
• Scales typically arbitrary representation of construct
Measuring Measurement Error
• Compare survey estimate to external records
– E.g., Hospital records vs. survey answers
• Compare two different survey estimates
– E.g., Men vs. women on the number of sexual partners
• Compare two different survey reports
– E.g., Husbands vs. wives on % of household duties done by each
• Compare results obtained by two different methods of data collection on random subsamples
– E.g., Different modes, formats of response options
• Measure reliability and validity of estimates
– E.g., Change in estimates between two measurements
– E.g., People who support Cameron’s economic policies should more often vote for the Tories
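A minimal sketch of the first approach (a record check), assuming each survey report can be matched to a record value that is treated as the “true” score. The data and the function name `record_check` are hypothetical and serve only as an illustration.

```python
from statistics import mean

def record_check(reported, records):
    """Return (bias, mean absolute error) of survey reports against records."""
    if len(reported) != len(records):
        raise ValueError("need one record value per survey report")
    # Deviation between reported and "true" (record) score for each respondent
    deviations = [r - t for r, t in zip(reported, records)]
    bias = mean(deviations)                    # signed: errors can cancel out
    mae = mean([abs(d) for d in deviations])   # unsigned: errors cannot cancel
    return bias, mae

# Hypothetical data: number of hospital stays, survey answers vs. hospital records
reported = [1, 0, 2, 1, 0, 3]
records = [1, 1, 2, 2, 0, 3]
bias, mae = record_check(reported, records)
print(round(bias, 2), round(mae, 2))
```

A negative bias would indicate under-reporting on average; the mean absolute error also picks up deviations that cancel each other out in the bias.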
Sources of measurement error
• Stems from different components of the measurement process:
– Respondent
– Interviewer
– Questionnaire
– Mode of data collection
Cognitive Aspects of Survey Methodology
• Started in the 1980s
• Psychologists and survey methodologists deliberately attempted to
create a new interdisciplinary field
– Illuminating the cognitive and communicative processes
underlying survey responding
– Drawing on psychological theories of language comprehension,
memory and judgment
– Formulated modules of the question answering process and
tested them in laboratory experiments and experimental surveys
• Potential to contribute to understanding of basic psychological
processes as well as to the improvement of survey methodology
• Major survey centres established cognitive laboratories to help with
questionnaire development
Cognitive models of measurement error
• Focus on bias part of the error
• Give reasons for why errors occur and ways to avoid them
• Problems can occur in different parts of the response process
(Tourangeau 1984)
– Understanding the meaning of the question and what information it seeks
– Recalling relevant information from long-term memory
– Combining the retrieved information and filling the gaps
– Selecting and communicating an answer
Survey satisficing
• Word “satisfice” was coined by Simon (1957): in many cognitive tasks (games, problem solving etc.) people do not persist in finding an optimum solution, and in some cases do not have access to the information to do so, and settle for a solution that is “good enough”
• Krosnick (1991): Respondents shortcut the response process
• Some indicators:
– Selecting first reasonable response option (primacy/recency effects)
– Agreeing with assertions regardless of contents (acquiescence)
– Selecting same response in battery of ratings scales (nondifferentiation)
– Endorsing status-quo instead of change
– “Don’t know”
– Randomly choosing an answer
Krosnick, J. A. (1991) Response Strategies for Coping with the Cognitive Demands of Attitude Measures in Surveys. Applied Cognitive Psychology, 5, 213-236.
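Two of these indicators can be computed directly from a battery of ratings. The sketch below is illustrative only: the data are invented, the function names are hypothetical, and the coding (1 = strongly agree to 5 = strongly disagree) is an assumption. A high agreement share only suggests acquiescence if the battery contains items worded in both directions.

```python
def nondifferentiation(ratings):
    """True if every item in the battery received the same answer."""
    return len(ratings) > 1 and len(set(ratings)) == 1

def agreement_share(ratings, agree_codes=(1, 2)):
    """Share of items answered with an 'agree' code (assumed: 1 and 2)."""
    return sum(r in agree_codes for r in ratings) / len(ratings)

# Hypothetical answers of three respondents to a five-item agree/disagree battery
battery = {
    "R1": [2, 2, 2, 2, 2],
    "R2": [1, 4, 2, 5, 3],
    "R3": [1, 2, 1, 2, 2],
}
flags = {rid: nondifferentiation(r) for rid, r in battery.items()}
shares = {rid: agreement_share(r) for rid, r in battery.items()}
print(flags)
print(shares)
```

R1 is flagged for nondifferentiation; R3 differentiates between items but still agrees with every assertion.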
Errors related to comprehension
Types of comprehension problems
• Grammatical form of the question
– Grammatical ambiguity
– Excessive complexity
– Faulty presuppositions
• Meaning of words and concepts
– Vague concepts
– Vague quantifiers
– Unfamiliar terms
– False inference
Comprehension processes
• Comprehension usually assumed to involve analysis at various
levels:
– Sensory: Segment speech stream into words, recognise printed
characters
– Lexical: Retrieval of meaning, pronunciation, identify parts of
speech, etc.,…
– Syntactic: String words into a grammatical sequence
– Pragmatic: Determine overall sentence meaning (i.e., deduce
the meaning of the question)
• Comprehension problems can occur at each of these levels
Pragmatic inferences: Establishing common
ground
• Example:
– A: What have you done today?
– B: Got up, had shower, breakfast, cleaned teeth....
• Speakers assume shared knowledge and experience
– What has been said and presupposed
– What others can be reasonably assumed to know
– Inferences about speaker’s intentions
Pragmatic inferences: Establishing common
ground
• Grice (1975) – conversational maxims
– Quantity
• “Say as much as necessary but not more”
– Quality
• “Say what you believe to be truthful”
– Relevance
• “Do not make contributions irrelevant to the present
discourse”
– Manner
• “Be clear”
• The conversational norms are routinely violated in the process of
collecting survey data
Example: Vague concepts
• “Do you think children suffer any ill effects from watching
programmes with violence in them, other than ordinary westerns?”
• Belson (1981) determined that Rs interpreted children, ill effects,
and violence in numerous ways
– E.g., “children” could be aged < 8 years, < 19-20 years, students,
one’s own children regardless of age, etc.,…
– Only 8% of respondents interpreted the question as intended.
Example: Excessive complexity
“Do you think that children suffer any ill effects
from watching TV with violence in them, other
than ordinary Westerns? By children I mean
people under 14, by ill effects I mean increased
aggression at school or at home, increased
nightmares, inability to concentrate on routine
chores, and so on. By violence I mean graphic
depictions of individuals inflicting physical injuries
on themselves or others, depictions of individuals
damaging property or possessions, abusive
behaviours or language to others, and so on.”
Example: Vague concepts
“Do you own a car?”
“Do you ever smoke cigarettes?”
Suessbrick et al (2000): interpretation of the term “cigarette”
[Figure: pie chart with segments of 23%, 54% and 23%. Interpretations shown: cigarettes that you finished; cigarettes that you partially smoked; cigarettes that you took a puff or two from.]
Example: Vague quantifiers
• Psychiatrist: “How often do you sleep together?”
Alvy (Woody Allen) “Hardly ever, maybe three times a week”
Annie Hall (Diane Keaton) “Constantly, I’d say three times a week”
• Belson (1981) found “few” (as in “over the last few years”) meant:
– “No more than two years” (12%)
– “Seven or more years” (32%)
– “Ten or more” (19%)
– Other answers (37%)
Example: Vague quantifiers
• Schaeffer (1991)
Example: False “Implicatures”
• From Schwarz et al (1991):
• “How successful would you say you have been in life?”
– Two different response scales
Example: False “Implicatures”
Scale 1 (0-10): Grey bars. 35% responded in lower half (0 to 5)
Scale 2 (-5 +5): Black bars. 13% responded in lower half (-5 to 0)
Improving comprehension: Standardisation
• Standardised interviewing
– Goal is to standardise question wording and remove the interviewer as a source of bias – “interviewer variance”
– Interviewers always read questions exactly as worded
– But, this inhibits establishment of common ground which may lead to misunderstanding
• Conversational interviewing
– Goal is to standardise meaning, even if wording varies across interviewers or respondents
– Allow interviewer to say whatever is necessary for respondent to understand question as intended
– Interviewer must initially read question as worded but is then free to paraphrase or add relevant information
– But: longer to administer
Improving comprehension: Cognitive interviews
• Pre-test technique
• Verbal protocols (Ericsson & Simon, 1993)
• Can give insight into how respondents answer questions
• Assume that reporting does not interfere with answering process
• Uncovers only conscious processes
• More in week 10.
Errors related to retrieval
• Errors in
– Occurrence of events
– Frequency of events
– Dating of events
– Characteristics of events
• Types of recall errors
– Calendar effects
– Telescoping
– Seam effects
Examples of questions about past events
• Did you call the police during the last 6 months to report something
that happened to YOU which you thought was a crime? (National
Crime Victimisation Survey)
• How long have you been unemployed?
• When did you get [married / divorced / etc,…]?
Memory systems
• Working memory
– A “mental scratch pad”. It has limited capacity
• Store telephone numbers until called
• Store words of a sentence (question) as they arrive
• Store partial results of a long term memory search, e.g.,
when counting up recalled episodes
• Long term memory
– Relatively permanent (though forgetting occurs over time)
– Types of LTM
• Episodic memory: biographical events
• Semantic memory: words, ideas, concepts
• Procedural memory: skills, e.g., riding a bike
Retrieval
• Bringing information from long term memory into working memory
• E.g., “Last Sunday, my friend Aaron and I went to a restaurant in
Colchester.”
– Retrieval cues
• Events (eating at a restaurant)
• Places (in Colchester)
• People (with your friend Aaron)
• Dates (on Sunday)
– Some cues are more successful than others:
• Brewer (1988) and Wagenaar (1986): cues about what
happened are better than who or where, which are better
than when
• What did YOU do on August 4th?
Some events are harder to retrieve
• Those that are not very distinctive or are numerous
– How many times in the past month have you used a cash machine?
– How many times in the past month have you checked your email?
• Those that happened a long time ago
– What are the names of your elementary school teachers?
• When cues do not match what is in memory
– How often do you do light or moderate activities for at least 10
minutes that cause only light sweating or a slight to moderate
increase in breathing or heart rate?
… and some are easier to retrieve
• Recent events
• Distinctive events
• Important, emotionally involving events
• Events near significant temporal boundaries
Types of recall errors: The Calendar Effect
• Recall peaks for students occur at term boundaries
– Regardless of where these events fell in town or how many happened during the
year
– Regardless of the importance of the events (Schum and Rips 1999)
Types of recall errors: Telescoping
• Tendency to report events that actually occurred before the reference period as having occurred within it
– E.g., Neter & Waksberg (1964): More expenditures reported
• For periods between interviews: When Rs not reminded of
previous reports (unbounded) than when reminded
(bounded)
• For last month: When Rs asked about each of the previous 3
months rather than about the last month.
• Elapsed time seems to shrink the way distance seems to shrink
when viewed through a telescope
• Used to account for overestimates in frequency reports
– … which may be offset by forgetting across all responses
Types of recall error: Heaping
• Current labour market activity?
– If unemployed, follow-up question:
– “How many months have you been unemployed?”
• (1-99 months)
Heaping: Duration of unemployment
Source: Figure 1 in Torelli & Trivellato (1993) Journal of Econometrics
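One simple way to quantify heaping is the share of reported durations that fall on multiples of 6 months. The sketch below uses invented durations and a hypothetical function name; it is not the method used by Torelli & Trivellato.

```python
def share_at_multiples(durations, base=6):
    """Share of reported durations (in months) that are multiples of `base`."""
    return sum(d % base == 0 for d in durations) / len(durations)

# Hypothetical answers to "How many months have you been unemployed?"
durations = [3, 6, 6, 12, 12, 12, 5, 24, 18, 7, 12, 6]
heaped = share_at_multiples(durations, base=6)
print(heaped)
```

If reports were spread evenly over months, roughly one in six would fall on a multiple of 6, so a share far above that points to rounding to half years and full years.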
Types of recall error: Seam effects
• In panel surveys, the tendency for month-to-month changes to
concentrate in adjacent months covered by different interviews.
Type of recall error: Seam effects
Monthly transitions in legal marital status
(source: BHPS waves 11-15)
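The size of a seam effect can be gauged by the share of all month-to-month transitions that fall on the seam between two interviews. This is a sketch under stated assumptions: the monthly status histories are invented (not BHPS data), each interview is assumed to cover four months, and the function names are hypothetical.

```python
def transition_months(statuses):
    """Indices i at which the status in month i differs from month i - 1."""
    return [i for i in range(1, len(statuses)) if statuses[i] != statuses[i - 1]]

def seam_share(histories, seam_index):
    """Share of all transitions that occur at the seam between two interviews."""
    transitions = [i for h in histories for i in transition_months(h)]
    if not transitions:
        return 0.0
    return sum(i == seam_index for i in transitions) / len(transitions)

# Hypothetical monthly marital status (M = married, D = divorced, S = single,
# W = widowed); months 0-3 come from interview 1, months 4-7 from interview 2
histories = [
    ["M", "M", "M", "M", "D", "D", "D", "D"],
    ["S", "S", "M", "M", "M", "M", "M", "M"],
    ["M", "M", "M", "M", "W", "W", "W", "W"],
]
share = seam_share(histories, seam_index=4)
print(round(share, 2))
```

With eight months there are seven possible transition points, so about one in seven transitions would be expected at the seam if changes were spread evenly; a much larger share suggests a seam effect.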
Aids to retrieval
• Life event calendars
– Use of personal and other landmarks to improve dating and recall
• Additional retrieval cues
– E.g., What, who, where or when
• Decomposition
– E.g., cigarettes smoked during different activities such as commuting, at work, during drinks after work, etc. (Means et al 1991)
• Longer time on task, allowing for second guessing
– Lessler, Tourangeau & Salter (1989); Burton & Blair (1991)
• Bounding:
– Ask about events 30-60 days ago, then 0-30 days ago. Discard data for the earlier period since likely to include telescoped events
• Dependent interviewing
Errors related to judgment and estimation
• Compensating for imperfect retrieval
• Estimation of frequencies
• Context effects
• Question wording effects
Compensating for imperfect retrieval
• Retrieval rarely provides veridical records of events
– Memory degrades as elapsed time increases
– Similar episodes may merge into generic memories
– Retrieval cues vary in effectiveness
• When retrieval cannot serve as basis for response, people may
estimate
– E.g., based on what they typically do
• What did you have for breakfast two days ago?
– But also based on irrelevant information, such as the question or
response options
Compensating for imperfect retrieval
• … by using elements of the question text
• Loftus (1979) showed a video of a car accident and asked:
– “How fast was the car going when it went through the yield
sign?”
• Led to reports of a yield sign in the original traffic event on a
subsequent memory test even when there was no yield sign present
in the video.
Compensating for imperfect retrieval
• … by using the range presented in response options
Estimation of frequencies
• Different strategies and related errors (Brown 1995)
• “Last week, how many times did you check your e-mail?”
– Recall and count (“11 times”)
• When events are distinctive and irregular
• More likely to forget an instance – underestimation is likely
– Rate based estimation (“I check it about 3-4 times a day, so it
must be about 25 times”)
• When events are similar to each other and occur on a regular
basis.
• Exceptions are not taken into account – overestimation
possible
– Impression (“I check it a lot. Must be 50 times”)
• Cannot be any lower than 0 but is unbounded on the high
end – overestimation is likely
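The arithmetic behind rate-based estimation, and why ignoring exceptions inflates the answer, can be sketched as follows. The typical rate of 3.5 (the midpoint of “3-4 times a day”) and the two days without e-mail are assumptions made for illustration; the function name is hypothetical.

```python
def rate_based_estimate(rate_per_day, days=7, days_skipped=0):
    """Frequency implied by a typical daily rate over the reference period.

    days_skipped models exceptions (days on which the behaviour did not
    occur), which respondents using this strategy tend to ignore.
    """
    return rate_per_day * (days - days_skipped)

# "About 3-4 times a day" over the last week, exceptions ignored
reported = rate_based_estimate(3.5)
# Hypothetical: the respondent did not check e-mail on two of the seven days
actual = rate_based_estimate(3.5, days_skipped=2)
overestimate = reported - actual
print(reported, actual, overestimate)
```

The reported value of 24.5 matches the respondent’s “about 25 times”, while the two exceptional days alone account for an overestimate of 7.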
Estimation of frequencies
• Strategy depends on regularity, similarity, and frequency of events
Context effects
• Many studies have shown that judgments are highly context-dependent
– Broader circumstances unrelated to the survey topic: e.g.
weather (Schwarz and Clore, 1983).
– General survey context: Preceding questions, the name of the
survey, or its sponsor all can have an effect.
Context effects: Previous questions
• Schuman & Presser (1981)
– “Do you think the United States should let Communist reporters
from other countries come in here and send back to their papers
the news as they see it?”
– “Do you think a Communist country like Russia should let
American newspaper reporters come in and send back to their
papers the news as they see it?”
Context effects: Previous questions
[Figure: % of respondents answering ‘yes’, US should allow Communist reporters, in 1948 and 1980, by question order (“First question first” vs. “First question second”); vertical axis from 0 to 80.]
Context effects: Previous questions
• Schwarz et al, (1991)
– Asked students at German universities a question about a vague
“educational contribution”
– Preceding question was either
• An item about college tuition in the US, or
• An item about government financial support for students in
Sweden
• Support for educational contribution was higher in the latter
condition
Errors in formatting the answer
Mapping of answers onto response options
• Answers can be affected by question format
– Open, text responses
– Open, numerical responses
– Closed with ordered response scales
– Closed with categorical response options
Problems with open numeric format
• (From ANES) “We’d like to get your feelings about some groups in
American society. When I read the name of a group, we’d like you
to rate it with what we call a feeling thermometer. (…) Using the
feeling thermometer, how would you rate the following group:
Democrats? _____”
• In the past month, how many times did you use a cash machine?
___times
– May be hard to convert vague impression into a number
– Answers are often rounded numbers (5, 10, 50)
• Indicates difficulty with conversion or unwillingness to be
precise or recall problems
Problems with unordered response scales
• Primacy and recency effects (Krosnick & Alwin 1987)
– When options presented visually, first few more likely to be
endorsed: primacy
• Under visual presentation R’s mind becomes cluttered after
first few options
– When presented orally, last few more likely to be endorsed:
recency
• Under auditory presentation, earlier options are overwritten in
working memory
Middle and “Don’t Know” options
• Effects of explicitly mentioning middle options:
– “Should divorce in this country be easier or more difficult to
obtain than it is now?”
• Easier – 28.9%
• More difficult – 44.5%
• Stay as is (volunteered) – 21.7%
• Don’t know – 4.9%
– “Should divorce in this country be easier to obtain, more difficult
to obtain, or stay as it is now?”
• Easier – 22.7%
• More difficult – 32.7%
• Stay as is – 40.2%
• Don’t know – 4.3%
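The effect of offering the middle option explicitly can be read off by differencing the two distributions. The sketch below uses only the percentages quoted in the two question versions; the variable names are illustrative.

```python
# Percentages when "stay as is" had to be volunteered by the respondent
volunteered = {"Easier": 28.9, "More difficult": 44.5,
               "Stay as is": 21.7, "Don't know": 4.9}
# Percentages when "stay as it is now" was offered in the question
offered = {"Easier": 22.7, "More difficult": 32.7,
           "Stay as is": 40.2, "Don't know": 4.3}

# Change in percentage points once the middle option is offered explicitly
shift = {k: round(offered[k] - volunteered[k], 1) for k in volunteered}
for option, points in shift.items():
    print(f"{option}: {points:+.1f}")
```

The middle option gains 18.5 percentage points, drawn mostly from “more difficult” (11.8 points) and “easier” (6.2 points), while “don’t know” barely changes.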
Sensitive questions
• Examples
– When answers are socially (un)desirable
• Did you vote in the last general election?
– When they invade one’s privacy
• What is your monthly income?
– When there is a perceived risk of disclosure to third parties
• Did you ever use cocaine?
• Consequences
– People may refuse to answer the question
– Or they can modify their answer
• Does not have to be conscious
• Done for a variety of reasons: Deference to the interviewer,
self-presentation, etc…
How to ask sensitive questions?
• Barton: “Asking the Embarrassing Question”, Public Opinion Quarterly 1958
“How to ask: Did you kill your wife?”
• The casual approach:
– “Do you happen to have murdered your wife?”
• The numbered card:
– “Would you please read off the number on this card which corresponds to what became of your wife?” (HAND CARD TO RESPONDENT)
1 Natural death
2 I killed her
3 Other (specify)
“Did you kill your wife?”
• The Everybody Approach
– “As you know, many people have been killing their wives these
days. Do you happen to have killed yours?”
• The “Other People” Approach
(a) “Do you know any people who have murdered their wives?”
(b) “How about yourself?”
• The Kinsey Technique
– Stare firmly into respondent’s eyes and ask in simple, clear cut language such as that to which the respondent is accustomed, and with an air of assuming that everyone has done everything, “Did you ever kill your wife?”
How to ask sensitive questions (really)
• Self-administration helps
– Addresses concerns about disclosure to an interviewer
• Open items seem better than closed items
– E.g. Tourangeau & Smith (1996) tested 3 response formats for
the number of sexual partners in last 5 years:
• Closed low frequency, i.e., 0, 1, 2, 3, 4, 5+
• Open
• Closed high frequency, i.e., 0, 1-4, 5-9, 10-49, 50-99, 100+
– Closed low: 2.62
– Open: 3.12
– Closed high: 5.33
• Other techniques
Further reading
• Tourangeau, Rips and Rasinski (2000). The Psychology of Survey
Response. Cambridge University Press.
Ex 6 – problems with questions? Improvement?
During the past four weeks, beginning [DATE FOUR WEEKS AGO] and ending
today, have you done any housework, including cleaning, cooking, yard work, and
household repairs, but not including any activities carried out as part of your job?
Comprehension:
• Response format not given – what is a permissible answer?
• Complex sentence
• Unpaid housework in other people’s homes?
Recall:
• Examples – exhaustive?
• Yes/no question – if in doubt “yes”
• 4 week recall – non-memorable activity – telescoping likely
• Expected frequency of events vs. recall period
• Yes/no misses information about intensity – ask about frequency?
Ex 6 – problems with questions? Improvement?
In the past week, how many times did you drink alcoholic beverages?
Encoding
• Mismatch of encoding and term “alcoholic beverage”
Comprehension
• Definition of alcoholic beverage?
• Calendar week or last 7 days?
• How many days or occasions?
Formulating response
• Social desirability bias – under-reporting?
Ex 6 – problems with questions? Improvement?
Living where you do now and meeting the expenses you consider
necessary, what would be the smallest income (before any deductions) you
and your family would need to make ends meet each month?
Encoding
• Assumes R knows HH income and expenditure
Comprehension
• Living where you do now – in town/house...?
• Definition of “necessary” – different for different respondents
• Definition of “you and your family”
Recall / Estimation
• Estimation / guessing likely
• Disposable income vs. gross income
Ex 6 – problems with questions? Improvement?
During the past 12 months, since [DATE], about how many days did illness
or injury keep you in bed more than half the day? Include days while you
were an overnight patient in a hospital.
Comprehension
• About how many days – suggests guessing is ok
• Definition of “illness”
• Definition of “half a day”?
• Hospital days – include all, or only if in bed half day?
Recall
• Long reference period – ok for healthy people, but difficult for ill?
• Serious versus non-serious illnesses
• Telescoping?
Ex 7 - why avoid agree/disagree questions?
• Why may it be wise to avoid questions in the agree-disagree format?
• This neighbourhood is not a bad place to live. Would you say you ...
Strongly agree...............................1
Agree.............................................2
Neither agree nor disagree... .......3
Disagree........................................4
Or strongly disagree?....................5
• Acquiescence bias
• Nondifferentiation in batteries of Qs using same answer scales
• Risk of double negatives
• Better to include object in response categories
Alternative
Is your neighbourhood a....
Very good place........................1
Good place...............................2
Neither good nor bad place.......3
Bad place..................................4
Or a very bad place to live?.......5
Ex 8 – likely estimation strategy? Consequences
for reported frequencies?
1. Number of times the respondent was hospitalized in the past 2
years
– Recall and count
– Omissions / telescoping
2. Number of times the respondent ate in a restaurant in the last
month
– Rate based estimation
– Over-reporting
3. Number of times respondent’s spouse/partner went on vacation
during the past summer
– Recall and count / guessing
– ?