The effect of changes in the definition of the household reference person

Jean Martin and Jeremy Barton
1. Alternative definitions to head of household
A previous paper1 reviewed the definition of Head of
Household (HOH) which is currently used on most
government surveys. The paper explained the need to
define one member of the household as a reference
person who can be used to characterise the household
as a unit. A number of problems and criticisms of the
current definition were noted, and the paper
recommended that alternative definitions should be
investigated. The criteria for a suitable definition
were outlined and a number of possibilities
suggested. Criticisms of the existing definition do not
in themselves make the case for change; any
alternative definition must be shown to improve
sufficiently on the current one to justify a change,
particularly in terms of discontinuities in time series.
The earlier paper argued that ‘householder’ – the household member in whose name the accommodation is held (owned or rented) – should be retained as a core part of the definition of the household reference person (HRP), and that in all households where there is only one householder, that person should be the reference person. This represents a change from the current definition, which selects as HOH male non-householders who are partners of female householders.
A further and more difficult issue is the criterion for
choosing between joint householders. The current
definition of the HOH gives males priority over
females and takes the eldest of same sex
householders. The previous paper recommended that
two possible alternative criteria, age and highest
income, be compared using empirical evidence to
evaluate the effect of change.
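The competing selection rules can be sketched in code. The sketch below is ours, not the paper's: the record fields (`sex`, `age`, `income`, `is_householder`) are assumed names, and the current-HOH function simplifies by choosing only among householders (the actual definition can also select a male non-householder partner).

```python
# Illustrative sketch (not from the paper) of the selection rules compared
# in the text. Each person is a dict with assumed keys:
# sex, age, income, is_householder.

def current_hoh(householders):
    """Simplified current rule: males take priority over females;
    the eldest is taken among householders of the same sex."""
    males = [p for p in householders if p["sex"] == "M"]
    pool = males if males else householders
    return max(pool, key=lambda p: p["age"])

def hrp_by_income(householders):
    """Proposed alternative: take the joint householder with highest income."""
    return max(householders, key=lambda p: p["income"])

def hrp_by_age(householders):
    """Proposed alternative: take the eldest joint householder."""
    return max(householders, key=lambda p: p["age"])

def select_hrp(household, joint_rule):
    """A sole householder is the HRP outright; otherwise apply the
    chosen rule to decide between joint householders."""
    householders = [p for p in household if p["is_householder"]]
    if len(householders) == 1:
        return householders[0]
    return joint_rule(householders)
```

For a joint householder couple where the woman is older and has the higher income, both alternative rules would select her where the current rule selects the man.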
Ideally the same definition should be used on
government surveys and the Census. However, this
may prove difficult. The 1991 Census of Population
did not identify householder status; to do so in 2001
would require an additional question. For both
surveys and the Census there are different
requirements depending on why a reference person is
needed. To select a reference person in order to sort
out relationships within a household requires a simple
criterion, which can be applied at either the start of an
interview, or by the person filling in the form or
questionnaire. Selecting a reference person to represent
the characteristics of the household may be necessary at
the start of the interview but on surveys where the
relevant information is available for all household
members (or at least for all householders) it can be left
until the analysis stage, which potentially allows more
complex criteria to be used.
Most surveys do not provide sufficient information to
allow us to compare alternative ways of selecting a
reference person. Typically, interviewers ask only those
questions necessary to operate the current definition of
HOH where required (ie in all but single-family
households), but do not record the answers to these. In
particular, many surveys do not include a question on
householder status. And those that do may not have the
information needed to choose between joint
householders: they may not have information about
income and may include age in years only (and not date
of birth).
However, both the General Household Survey (GHS)
and the Survey of English Housing (SEH) include a
question about householder members’ responsibility for
the accommodation, although the answer categories are
not identical. Neither asks for date of birth for all
household members, but the GHS has income details
for each adult whereas the SEH does not, so further
analysis of criteria for choosing between joint
householders is confined to the GHS. This is not ideal
since we have no means of applying the criteria to joint
householders who are the same age in years, whose
incomes are the same or where income data is not
available. We describe below the assumptions we have
made in such cases.
2. Responsibility for the accommodation
First we considered the extent to which giving priority
to responsibility for the accommodation is likely to
affect the selection of a reference person. Table 1
shows the distribution of GHS and SEH households
according to household composition and householder
status. The figures for the two surveys do not quite agree, at least in part because answers were not coded in the same way; thus the figures for Table 1 were derived using other information about the household composition.

SMB 381/96

Table 1   Household composition and householder status

                                          GHS     SEH
                                           %       %
Single adult household
  - male                                  10      10
  - female                                22      19
Sole householder
  - male                                  13      14
  - female who is HOH                      5       5
  - female who is not HOH                  7       4
Joint householder couple - male HOH       41      47
Joint householders non-couple
  - male HOH                               1       *
  - female HOH                             1       *
All households                           100     100
Base                                    9794   20472

* Joint non-couple households not coded as such on SEH – they are coded as sole householders
Clearly in single adult households there is no issue;
that person must be the HRP by any definition; this
accounts for between 29% and 32% of all households.
In a further 23-25% of households, although there is
more than one adult, there is only one householder;
13-14% are men and 5% are women without partners
who are defined as HOH on the current definition. But
the remaining 4-7% have female householders who are
not defined as HOH because they live with a husband
or partner. So if householder status were the first
criterion, these women would be defined as the
household reference person. It is worth noting that in
the majority of these households the couple was
cohabiting rather than married, supporting the view of
interviewers that in most cases the woman had
responsibility for the accommodation before her
current partner moved in. It is unfortunate that the
figures for this group differ significantly between the
two surveys so we are unable to obtain a precise
estimate of the proportion of households which would
be affected by a change in definition. We are carrying
out further investigations into the reasons for the
discrepancy.
41-47% of households contain a couple who are (the sole) joint householders. The remaining 2% of households (for GHS only) contain two or more joint householders made up of non-couples (eg flat sharers) or, rarely, couples plus other household members (eg two-generation families). We examine next the
effect of using income and age rather than sex to
decide between joint householders.
3. Criteria used for selecting between joint householders
Highest income
If it were decided to choose between joint
householders on the basis of income the simplest
method would be for interviewers to ask directly who
has the highest income; details of income would not be
needed. Since that had not been done on the GHS we
examined the information on gross weekly income
(from all sources) which is recorded for each adult in
the household, from which we could identify the
householder with the highest income from all sources.
Income questions tend to suffer from higher levels of
item non-response than other questions. On the GHS,
in approximately 10% of couple households and 18%
of non-couple households, the highest income
householder could not be identified because either the
information was missing or (in a small number of
cases) both householders had the same income. In
order to obtain an estimate of the effect of a change in
definition for all joint householder households we
have assumed those for whom information was
missing would have been distributed in the same
proportions as those for whom we have results and
have adjusted our estimates accordingly, separately for
couple and non-couple households.
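The adjustment described above can be illustrated with a small sketch (ours, using invented counts rather than the GHS figures): households for which the information is missing are allocated across the known outcome categories in proportion to the observed distribution.

```python
def redistribute_missing(counts, n_missing):
    """Allocate missing cases across categories in proportion to the
    observed (non-missing) distribution, as described in the text."""
    total = sum(counts.values())
    return {cat: n + n_missing * n / total for cat, n in counts.items()}

# Invented example: 800 couple households where the highest income
# householder is known, 100 where the information is missing.
observed = {"male_highest": 640, "female_highest": 160}
adjusted = redistribute_missing(observed, 100)
# 80% of the missing cases go to the male category, 20% to the female.
```

The same calculation would be run separately for couple and non-couple households, as the text notes.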
Table 2 shows that applying the criterion of highest
income to choose between joint householders would
result in a change in the HRP in 8% of households: 7%
being couples and 1% non-couples. Looking only at
joint householder households, this represents a total of
17% where a different HRP would be selected.
Although the number of non-couple households is
small, it is worth noting that using an income based
definition would affect proportionately more non-couple than couple households. As well as replacing
the priority given to males among opposite sex
householders, it would replace the current age criterion
to decide between same sex householders.
Age
In examining the effect of using age as the criterion for
choosing between joint householders, it should be
remembered that, for non-couples of the same sex, age
is currently a criterion for selection of HOH; a change
is only possible in households with both male and
female joint householders (approximately half of all
non-couple households).
In 5% of households both (or all) joint householders
were the same age so we could not apply this criterion
directly. However, we have assumed that in half such
cases the male would be the eldest and in half the
female, and have adjusted the estimates accordingly.
Applying the age criterion would result in a change in
the reference person in 10% of households, almost all
of which are couple households (Table 3). Thus age
would result in a change in rather more households
than income. Considering only joint householder
households, the proportion that would change
reference person is 23%: 22% couple and 1% non-couple households. The proportion for couples is
substantially higher than if income were the criterion,
indicating that men are more likely to have higher
incomes than to be older than their partners. The
proportion for non-couples is much lower than for
income since age is already the criterion used for same
sex joint householders.
We note that although in a similar proportion of
households a different reference person from the HOH
would be selected regardless of whether income or age
is the criterion, different individuals are being selected
in around three quarters of cases.
Table 2   Distribution of HRP defined as highest income householder

                                        All          Joint householder
                                        households   households
                                        %            %
Single householders                     57           -
Joint householder couples:
  male has highest income               35           80
  female has highest income              7           15
Joint householders non-couples:
  male has highest income                1            2
  female has highest income              1            2
Total joint householders                43          100
All households                         100
Base                                  9794         4228

In 4% of households no income information was available to apply the criterion. We have assumed the same ratio of males to females having the highest income would have been found in such households and have distributed the cases accordingly.
Table 3   Distribution of HRP defined as eldest householder

                                        All          Joint householder
                                        households   households
                                        %            %
Single householders                     57           -
Joint householder couples:
  male is eldest                        32           73
  female is eldest                      10           22
Joint householders non-couples:
  male is eldest                         1            3
  female is eldest                       0            1
Total joint householders                43          100
All households                         100
Base                                  9794         4228

In 5% of households both householders were the same age so we could not apply the criterion. We have assumed equal proportions of males and females would be the eldest in such households and have distributed the cases accordingly.
4. Overall effect on distributions of reference person characteristics
Even though a change of definition might result in a
different person being selected as the reference person
in some 15% to 17% of households, this does not
necessarily imply that characteristics based on
information about that person will also change.
Bearing in mind that in 7% of households a different
person would be selected on the basis of choosing sole
householders and in a further 8% or 10% there would
be a change due to using income or age to choose
between joint householders, we compared the
distributions for HOH and reference person defined in
terms of income and age on four characteristics: sex,
age, work status and social class (Table 4). This table
excludes households where either age or income
information was not available to apply the criteria for
choosing between joint householders.
As we might expect, using either the age or the income
criterion results in a lower proportion of men being
selected than using the HOH definition as the
reference person: 58% (income) or 56% (age)
compared with 73% for HOH. For the other
characteristics the distributions show only modest
changes. All three age distributions are very similar,
presumably because most couples are similar in age.
Rather fewer reference persons are in paid work when
defined by age than by the other definitions; an
income based definition is likely to favour those in full
time paid employment. The social class distributions
for reference persons defined on either criterion show
higher proportions in skilled non-manual (IIInm) and
fewer in skilled manual categories (IIIm) and also in
classes I and II. This reflects the fact that male and
female jobs have very different social class
distributions: more women than men are assigned to
III non-manual and IV manual, and more men than
women to III manual.
Having looked at the overall differences in
distributions we next show the same characteristics
just for households where a change of definition
would result in a change of reference person (Table 5).
Here we can see more clearly what contributes to the
overall differences shown in Table 4. In almost all
cases, a change in reference person results in a woman
rather than a man being selected as the reference
person. The use of income as the criterion results in
slightly more younger people being chosen than on the
HOH definition. With regard to work status, where
there is a change the reference person defined by
income is more likely to be in paid work than the
HOH whereas the opposite is true for the reference
person defined by age. As noted above, the social
class differences are largely attributable to differences
Table 4   Distributions of reference person characteristics

                        Head of      Highest income   Eldest
                        household    householder      householder
                        %            %                %
Sex:
  Male                  73           58               56
  Female                27           42               44
Age:
  16-24                  4            4                4
  25-34                 18           19               18
  35-44                 18           18               18
  45-54                 17           17               17
  55-64                 14           14               14
  65-74                 16           16               16
  75+                   13           13               13
Work status:
  Paid work             53           54               50
  Retired               30           30               31
  Neither               17           16               18
Social class:
  I Professional         6            5                5
  II Intermediate       18           17               16
  IIInm Skilled NM      23           28               28
  IIIm Skilled M        29           24               23
  IV Semi-skilled       15           16               17
  V Unskilled            6            6                7
  Other                  3            4                4
Base                  9794*        9352*            9297*

* Base numbers are (slightly) lower for work status and social class due to missing values
in the occupations carried out by men and women and the social classes to which they are assigned.

5. Gross change in reference person characteristics

Tables 4 and 5 showed net differences, which means that a shift from one category to another can be cancelled out by a shift in the opposite direction. Rather than show complex tables of all changes we summarise in Tables 6 and 7 the main gross changes, giving results for all households and for those where a change of definition results in a different person being selected. We have treated social class as an ordinal scale, summarising changes to ‘higher’ and ‘lower’ social classes. This means that women with junior non-manual occupations (clerks, shop assistants etc.) are assumed to be in a higher social class than men in skilled occupations – a controversial feature of the conventional social class classification.

Considering first the income criterion (Table 6), and bearing in mind that a different person would be selected in 15% of households, we see from the results for all households (first column) that, despite the changes from men to women (14%), the gross changes for the other variables are quite small: 5% for age group, 2% for work status and 11% for social class. Using the age criterion (Table 7) gives very similar results: 15% for sex, 5% for age, 2% for work status and 12% for social class.

Looking just at households where the reference person has changed (second column) indicates the extent to which changes cancelled when we considered the net distributions above. In the case of social class, the changes did not cancel out as there was movement from both higher and lower groups into III non-manual.
Table 5   Distributions of reference person characteristics when reference person has changed

                        Head of      Highest income   Head of      Eldest
                        household    householder      household    householder
                        %            %                %            %
Sex:
  Male                  98            2               100            0
  Female                 2           98                 0          100
Age:
  16-24                  5            7                 5            4
  25-34                 24           26                24           22
  35-44                 23           24                22           22
  45-54                 20           20                18           21
  55-64                 15           13                15           14
  65-74                 10            7                12           12
  75+                    4            3                 4            5
Work status:
  Paid work             61           73                68           59
  Retired               17           14                18           21
  Neither               22           14                15           20
Social class:
  I Professional         8            3                 9            2
  II Intermediate       19           13                20           11
  IIInm Skilled NM      16           49                16           48
  IIIm Skilled M        38            6                38            7
  IV Semi-skilled       12           17                12           20
  V Unskilled            3            7                 3            9
  Other                  4            5                 3            4
Base                 1363*        1361*             1444*        1439*

* Base numbers are (slightly) lower for work status and social class due to missing values
The proportions moving up and down were equal in
the case of the income criterion (38%) but for the age
criterion in 45% of households where there was a
change the new reference person was in a lower social
class than the HOH while in 32% of households there
was a change to a higher social class.
6. Conclusions
If we change the definition of the reference person to
select a sole householder in all circumstances, in
between 4% and 7% of all households we would select
a married or cohabiting woman rather than her partner.
For households with more than one householder,
selecting the highest income householder would result
in a change of reference person in a further 8% of
cases, so in total some 12% to 15% would have a
different reference person. If the eldest were selected among joint householders a different person from the HOH would be chosen in 10% of households, so in total there would be a change in between 14% and 17% of households.
Since the current definition gives priority to men over
women, it is not surprising to find that all the possible
alternatives result in selecting a higher proportion of
women as reference persons. This would affect the distributions of any variables that are strongly related to sex, such as social class.
In deciding whether age or income is the best criterion for choosing between joint householders we note that members of most couples are similar in age, so choosing the oldest seems a somewhat arbitrary procedure for selecting the person whose characteristics will define the household. It seems more plausible to suppose that the person with the highest income would be the best person to act as the main representative for the household.

Table 6   Comparison of gross differences between HOH and highest income householder

                                  All           Households where
                                  households    HRP has changed
                                  %             %
Sex:
  Same                            86             4
  Male HOH, female HRP            14            96
  Female HOH, male HRP             0             0
Age group:
  Same                            95            64
  HOH older                        4            28
  HRP older                        1             8
Work status:
  Same: both working              52            60
  Same: neither working           46            26
  HOH working, HRP not             0             1
  HRP working, HOH not             2            12
Social class:
  Same                            89            24
  HOH higher                       6            38
  HRP higher                       5            38
Base                            9352*         1361*

* Base numbers are (slightly) lower for work status and social class due to missing values.

Table 7   Comparison of gross differences between HOH and eldest householder

                                  All           Households where
                                  households    HRP has changed
                                  %             %
Sex:
  Same                            85             0
  Male HOH, female HRP            15           100
  Female HOH, male HRP             0             0
Age group:
  Same                            95            66
  HOH older                        2            13
  HRP older                        3            22
Work status:
  Same: both working              50            57
  Same: neither working           48            30
  HOH working, HRP not             2            11
  HRP working, HOH not             0             2
Social class:
  Same                            88            23
  HOH higher                       7            45
  HRP higher                       5            32
Base                            9297*         1439*

* Base numbers are (slightly) lower for work status and social class due to missing values.
Age is an easier criterion to operate in the field than
highest income. However, it would be possible to
reduce the number of households at which it would be
necessary to establish the highest income householder
by giving priority to those in full-time work on the
grounds that it is reasonable to assume that a
householder in full-time work would have a higher
income than one in part-time work or not working. It
would then only be necessary to ask which of two
householders had the highest income when both were
in full-time work or when neither was working.
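The shortcut proposed above can be sketched as a simple decision rule (our illustration; the work-status codes are assumed, and the paper leaves cases involving part-time work implicit):

```python
def need_income_question(status_a, status_b):
    """Return True only when full-time work cannot settle which joint
    householder has the higher income: both householders in full-time
    work, or neither working. Status codes ('full-time', 'part-time',
    'not working') are assumed labels, not from the paper."""
    both_full_time = status_a == "full-time" and status_b == "full-time"
    neither_working = status_a == "not working" and status_b == "not working"
    return both_full_time or neither_working
```

Where exactly one householder is in full-time work, that householder is assumed to have the higher income and no question is needed.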
The study described here has used data available from
existing surveys; we have yet to try operating a new
definition in the field. It will be necessary to test the
acceptability of any new definition to interviewers and
respondents, and to evaluate it in terms of ease of use
and accuracy of application. In particular we need to
know whether respondents will tell interviewers which
of joint householders has the highest income.
Before deciding whether to adopt a new definition for
the household reference person, customers who
commission the main government household surveys
will be consulted for their views on whether a new
definition offers sufficient benefits to justify some
disruption to time series.
Reference

1. Martin, J. (1995) Defining a Household Reference Person. Survey Methodology Bulletin, 37, 1-7.
A comparison of the census characteristics of respondents and non-respondents to the 1991 Family Expenditure Survey (FES)

Kate Foster
Summary
OPCS has carried out a study linked to the 1991
Census to investigate the effects of non-response on
the representativeness of five major continuous
surveys in Great Britain.1 This paper presents results
for the Family Expenditure Survey (FES) which
collects information on household income and
expenditure. In 1991, as through most of the 1980s,
around 70% of sampled households co-operated with
the survey.
The FES defines households as responding only if all
adult members co-operate. It was therefore not
unexpected to find that the variable most strongly
associated with FES response in 1991 was the number
of adults in the household: the achieved sample clearly
under-represented households with three or more adult
members, especially where there were no children
under the age of 16.
Households in London also had above-average non-response and were therefore under-represented in the
achieved sample, as were those whose head had no
post-school qualifications, was born outside the UK,
or was self-employed. Response rates were above
average for households with dependent children and
those whose head was single or under the age of 35.
The greater part of the survey’s non-response is due to
household refusal, so the characteristics associated
with total non-response reflect the characteristics of
refusing households. The overall non-contact rate for
the FES is only 2% or less, but rates were significantly
above average for single person households,
particularly where that person was of working age, and
for households living in flats or shared
accommodation.
Most of the types of household under-represented in
the 1991 FES achieved sample were also identified by
previous census-linked studies, although the 1991
results highlight the importance of the number of
adults in the household as a predictor of response. The
main difference from earlier studies was that there was
less strong evidence for a linear decrease in response
with increasing age of the head of household. In
general, the consistency of associations over time
suggests that it may be appropriate to use response probabilities derived from census data for non-response weighting.
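The weighting idea mentioned in the summary can be sketched as follows (our illustration, with invented response rates): each responding household receives a weight equal to the inverse of the estimated response probability for its class, so that under-represented classes are weighted up in the achieved sample.

```python
# Sketch (not from the study) of inverse-probability non-response
# weighting. The class labels and response rates below are invented
# for illustration, not the FES estimates.
response_rate = {"1-2 adults": 0.72, "3+ adults": 0.59}

def nonresponse_weight(household_class):
    """Weight = 1 / estimated response probability for the class."""
    return 1.0 / response_rate[household_class]

weights = {c: nonresponse_weight(c) for c in response_rate}
# Households in the low-response class receive the larger weight.
```

A responding household in a class with a 59% response rate would then stand in for roughly 1.7 issued households.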
1. Introduction
The FES provides data on household income and
expenditure in the UK. This article is concerned with
the sample in Great Britain (excluding Northern
Ireland), for which OPCS is responsible for fieldwork.
At the time of these checks the annual household
response rate was around 70%, a level which naturally
raises concerns among users of FES data about
possible non-response bias. If non-responding and
responding households differ in terms of
characteristics which are themselves related to
variables measured by the survey, then survey
estimates may be affected by non-response bias. This
study measured non-response bias in terms of the
census characteristics of households.
The FES response rate is relatively low in comparison
with other OPCS household surveys, for example rates
of around 82% on the General Household Survey and
77% on the National Travel Survey. The FES rate
reflects the heavy demands that the survey makes on
respondents and the way in which response is defined.
The survey seeks to interview each adult in selected
households about their income and expenditure, and
then requires each adult to keep a two-week diary of
all items of expenditure. Interviews by proxy are not
normally allowed. A small payment (£5 in 1991) is
made to each adult in responding households in
recognition of the burden involved in keeping the
expenditure diary.
When calculating response rates, households are
counted as responding only if expenditure diaries are
completed by all adults and all key items of income
information are obtained for the household. Thus the
FES has a more stringent definition of response than
many other surveys which either allow proxy
interviews or count households as responding where
some members are not interviewed. The FES used this
definition of response because its primary purpose is
to gather information on expenditure for the whole
household and data are not analysed for individuals.
2. The Census-linked study
Linkage to an external source of data, such as the
Census of Population, is a valuable method of
comparing the characteristics of survey respondents
and non-respondents because it offers a wide range of
variables collected in a standard way for both groups of
households. Similar studies of FES non-response were
carried out following the 1971 and 1981 Censuses.2,3 A
particular advantage of these studies is that they
provide information about certain characteristics of the
achieved sample which are not routinely measured on
the survey, for example ethnic group or educational
qualifications.
In 1991, as previously, the link between each sampled
household and the relevant census form was primarily
on the basis of address. For multi-occupied addresses,
where matching is more difficult, the names of
household occupants and the location of the household
within the building were also used where available.
The results are based on the FES sample for January to
June 1991. Overall, almost 97% of households in the
study sample were successfully matched and census
data obtained. Cases not matched include those in
which the address was not traced, a census form had
not been returned, or there were no usual residents at
the census address. Match rates were considered to be
sufficiently high to give confidence in the results.
3. Variation in response
All census items were available for analysis. Those
collected for individuals were used either by creating
aggregated variables, which summarise the data for
whole households, or by taking the characteristics for
one representative person designated the head of
household.4 A number of derived variables describing
household composition were also included.
Testing was in two stages. First we used the Chi-square test to identify which census variables were significantly associated (p<0.05) with non-response,
then a test of differences between proportions was
used to show which rates were significantly above or
below average (p<0.01). The analysis was repeated for
non-contacts and refusals as well as for total non-response. In addition, logistic regression analysis was
used to identify which characteristics were most
strongly associated with non-response and which had
additive effects.
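The two-stage screening described above can be sketched on invented counts (not the FES figures). Stage one tests each census variable for overall association with response; stage two flags the individual categories whose rates differ from the overall rate. The simplified z-test against the overall rate is our illustration of the "test of differences between proportions", not the study's exact procedure.

```python
import math

# Invented counts: (responding, non-responding) households per category
# of one census variable.
table = {"1 adult": (1020, 412), "2 adults": (1770, 688), "3+ adults": (556, 379)}

def chi_square_stat(rows):
    """Pearson chi-square statistic for a categories-by-2 table."""
    col = [sum(r[j] for r in rows) for j in (0, 1)]
    grand = sum(col)
    stat = 0.0
    for r in rows:
        row_total = sum(r)
        for j in (0, 1):
            expected = row_total * col[j] / grand
            stat += (r[j] - expected) ** 2 / expected
    return stat

rows = list(table.values())
# Stage 1: screen the variable in if the statistic exceeds the 5%
# critical value (5.991 for 2 degrees of freedom).
associated = chi_square_stat(rows) > 5.991

# Stage 2: flag categories whose response rate differs from the overall
# rate, using a simplified z-test at the 1% critical value (2.576).
overall = sum(r for r, _ in rows) / sum(r + n for r, n in rows)
flags = {}
for cat, (r, n) in table.items():
    rate = r / (r + n)
    se = math.sqrt(overall * (1 - overall) / (r + n))
    flags[cat] = abs(rate - overall) / se > 2.576
```

With these invented counts the three-or-more-adult category, whose rate falls well below the overall rate, is the one most clearly flagged.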
The results presented here concentrate on variation in
total response by household type because this indicates
types of household which are under- or over-represented in the achieved FES sample. Tables 1 and
2 show response rates for most of the census variables
which were significantly associated with FES
response.
3.1 Household characteristics
Response rates showed marked variation according to
the size and composition of the household. There was
a clear fall in response between households with one
or two adult members and those with three or more
(from just over 70% to around 60%). This obviously
reflects the response rules applied on the FES.
Particularly low response was seen for households
comprising three or more adults with no child (56%)
or, in an alternative classification, couples with only
older, non-dependent children (55%).
In general, the presence of children (aged 0 to 15
years) in the household had a beneficial effect on
response: rates were around 75% for all households
with children compared to 68% for those with none.
The effect was more marked for households with
younger children. Higher response for households with
children was seen regardless of the number of adults.
The higher response for households with children is
common to most surveys and there are a number of
possible reasons for it. These include the greater
likelihood of the interviewer making contact initially
and the greater compliance of adults in these
households, for example because they have a more
predictable and organised lifestyle or because they are
more likely to be persuaded of the benefits to society
of the information that they give. On the FES, both
non-contact and refusal rates were significantly lower
for households with children than among those with
none.
The variation in response by region and area type is
well-known since it can be monitored from sampling
data. This study showed that response rates were
significantly below average for households in London
(63%) and above average for those in non-metropolitan areas (71%). For this sample, response
rates were above average in Scotland (75%), which
covers both metropolitan and non-metropolitan areas,
and lower throughout the South East than in other
parts of England and Wales.
As in previous FES census-linked studies, response
was lower among the small number of households
Table 1   FES response by selected household characteristics

Household characteristics                  Response rate   Sample size
Average response rate                      69.6%           4825
Number of adults
  1                                        71%             1432
  2                                        72%*            2458
  3                                        59%*             629
  4 or more                                60%*             306
Number of children
  None                                     68%*            3544
  1                                        75%*             554
  2 or more                                76%*             727
Household composition
(categories are not mutually exclusive)
  1 person only                            71%             1277
   - aged 16-59                            74%              461
   - aged 60 or over                       69%              815
  1 adult with child(ren)                  75%              155
  2 adults, no child (0-15)                70%             1589
  2 adults with children                   78%*             869
  3 or more adults, no child               56%*             678
  3 or more adults with child(ren)         68%              257
  All households with child(ren)
   - youngest aged 0-4                     78%*             604
   - youngest aged 5-15                    73%              677
  Couple with dependent child              76%*            1113
  Couple with non-dependent child          55%*             445
Area type
  London                                   63%*             553
  Metropolitan                             69%             1007
  Non-metropolitan                         71%*            3265
Region
  Scotland                                 75%*             436
  North                                    71%             1267
  Midlands & E Anglia                      70%              979
  SW and Wales                             71%              653
  London                                   63%*             553
  Rest of SE                               67%              937
Number of cars
  None                                     68%             1620
  1                                        71%             2067
  2                                        71%              949
  3 or more                                60%*             189

* Response rate for category is significantly different from the average rate (p<0.01)
Table 2  FES response by selected characteristics of the head of household

Household characteristics                Response rate   Sample size

Average response rate                        69.6%          4825

Age
  16-24                                       77%            201
  25-34                                       77%*           777
  35-44                                       71%            858
  45-54                                       67%            822
  55-64                                       65%*           794
  65-74                                       68%            764
  75 or over                                  67%            609

Marital status
  Married                                     69%           2855
  Single                                      75%*           720
  Widowed or divorced                         69%           1250

Qualification level
  Degree or equivalent                        76%*           383
  Other higher qualifications                 84%*           303
  No post-school qualifications               68%*          4139

Country of birth
  UK                                          70%*          4490
  Other                                       60%*           335

Ethnic group
  White                                       70%*          4655
  Black or other ethnic minority              56%*           170

Economic status
  Employee                                    71%           2180
  Self-employed                               64%*           485
  Unemployed                                  68%            268
  Retired                                     69%           1319
  Looking after home                          67%            274
  Other inactive                              71%            299

Social class
  I or II                                     74%*          1228
  III non-manual                              72%            455
  III manual                                  68%           1110
  IV or V                                     68%            716

* Response rate for category is significantly different from the average rate (p<0.01)
with three or more cars. This is directly related to the
lower response seen for households with three or more
adult members, who are by far the most likely group to
have a large number of cars. It may also indicate less
co-operation among more affluent households.
3.2  Characteristics of the head of household
We found less variation in response rates by the age of
the head of household than had previous studies. Both
in 1971 and in 1981 there was a consistent decrease in
response through the age range. In 1991, response was
again highest for households with the youngest heads
(aged 16 to 34) but was then fairly stable (65% to
68%) where the head was aged 45 or over. The above-average
response for households with a younger head
also showed in higher response rates for heads who
were still single (75%).
A finding which was common to all surveys was
below-average response, due to high refusal rates, for
households whose head was less well qualified. On the
FES, response rates were substantially above average
for households whose head had a degree or equivalent
qualification (76%) or other higher qualifications
(84%). The effect may be exaggerated on the FES
because of the demands of keeping the expenditure
diary. The high respondent burden associated with the
survey may also be reflected in the low response rates
for households whose head was born outside the UK
(60%) or was classified to an ethnic minority group
(56%). The under-representation of these groups was
only seen for demanding diary-keeping surveys such
as the FES and National Food Survey.
As found previously, households with a self-employed
head had lower response (64%) and so were under-represented
in the achieved sample. This is assumed to
be related to the survey subject matter, either because
the self-employed may be particularly sensitive about
giving details of their income or may have greater
difficulties than other groups in compiling the
information. There was also a clear, though not very
strong, social class effect with the highest response
among households whose head was in a professional
or intermediate non-manual occupation.5
3.3  Characteristics most strongly associated with response
Having looked at census characteristics individually,
the next stage was to use logistic regression analysis to
test which variables were most strongly associated
with FES response.
For each category in the final model, the method also
calculates the odds of the event occurring: in this case
the odds are calculated as the ratio of the probability
of non-response to the probability of response.
Table 3 shows which variables were included in the
logistic regression model for non-response and the
order in which they were entered. Thus, the number of
adults was most strongly associated with FES non-response and then, having allowed for this effect, the
qualification level of the head of household, and so on.
The table gives the odds of non-response for each
category of household relative to odds of 1.0 for a
reference category, so we see that the odds were more
than 60% higher for households with three or more
adults than for those with one or two adults (ratio of
1.67). Interestingly, once the other variables had been
included in the model, the odds of non-response
increased with age throughout the range.
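The relationship between such a model's coefficients and the tabulated multiplying factors can be sketched as follows; the coefficient below is a hypothetical value, back-derived from the reported ratio for households with three or more adults.

```python
# In a logistic model for non-response, each "multiplying factor" is
# the odds ratio exp(beta) attached to a category's dummy variable,
# measured against odds of 1.0 for the reference category.
import math

beta = math.log(1.67)        # hypothetical coefficient for "3 or more adults"
odds_ratio = math.exp(beta)  # the multiplying factor reported for the model
print(round(odds_ratio, 2))  # 1.67 (reference category "1 or 2 adults" = 1.00)
```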
Odds ratios can be used to estimate response rates for
household types defined by combinations of census
variables which might have only small numbers of
cases in our data-set. First, the odds of non-response
for households with different combinations
of the characteristics used in the model are calculated
by multiplying the ratios for the appropriate categories
together with the baseline odds. The probability is then
given by the odds divided by the odds plus one.
Thus, for example, the odds of non-response for a
household containing 3 or more adults and no
children, in London and in the reference category for
the other variables, would be 0.94. The predicted non-response
rate for the group is 49% (response rate of
51%). If, in addition, the head of household had no
post-school qualifications, the predicted non-response
rate would increase to 58% (response rate of 42%).
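The worked example above can be reproduced in a few lines, using the baseline odds and multiplying factors reported for the model:

```python
# Predicted non-response from the fitted odds: multiply the baseline
# odds by the factor for each of the household's categories, then
# convert odds to a probability with p = odds / (odds + 1).
BASELINE_ODDS = 0.288

def predicted_non_response(factors, baseline=BASELINE_ODDS):
    odds = baseline
    for f in factors:
        odds *= f
    return odds / (odds + 1.0)

# 3 or more adults (1.67), no children (1.36), London (1.44),
# reference category for everything else:
p1 = predicted_non_response([1.67, 1.36, 1.44])
print(round(p1 * 100))  # 49 -> response rate of 51%

# The same household whose head has no post-school
# qualifications (extra factor of 1.44):
p2 = predicted_non_response([1.67, 1.36, 1.44, 1.44])
print(round(p2 * 100))  # 58 -> response rate of 42%
```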
4.  Different types of non-respondents
The FES, like most surveys, distinguishes chiefly
between non-responding households which are not
contacted during the interview period and those which
are contacted but refuse to co-operate. The non-contact
rate for households matched to 1991 Census
data was less than 2% out of a total non-response rate
of 30%, so refusals made up the majority of non-response.
On the FES, this category includes refusals
to the advance letter as well as in person to the
interviewer, and households where some, but not all,
adults would have been willing to co-operate with the
Table 3  The odds of non-response for different household types

Characteristic                          Odds

Baseline odds                           0.288

Multiplying factors

Number of adults
  1 or 2                                1.00
  3 or more                             1.67*

Qualification level of head
  Degree or equivalent                  1.00
  Other higher                          0.63*
  No post-school qualifications         1.44*

Number of children
  1 or more                             1.00
  None                                  1.36*

Area type
  Non-metropolitan                      1.00
  Metropolitan                          1.09*
  London                                1.44*

Economic status of head
  Employee                              1.00
  Self-employed                         1.31*
  Unemployed                            1.11
  Economically inactive                 0.85

Age of head
  16-34                                 1.00
  35-54                                 1.24*
  55-64                                 1.42*
  65-74                                 1.46*
  75 or over                            1.61*

* Coefficient significantly different (p<0.05) from reference category
survey. It is also convenient to include the small
number of households which initially co-operated but
then abandoned record-keeping or gave incomplete
information.
Because the refusal category is so dominant, the
characteristics associated with non-response closely
reflect those which are associated with refusal.
There was also very clear variation in non-contact
rates for different types of household, but these
associations did not show through in terms of bias
in the achieved sample.
The examples of non-contact rates for different
types of household (Table 4) suggest that both the
physical attributes of the accommodation and the
household composition are important factors.
In terms of the household's accommodation, non-contact
was more likely for households in multi-occupied
accommodation or in purpose-built flats.
This reflects the difficulties imposed by
entryphones or other barriers to access. In addition,
single adult households were hard to contact, and the
effect was more pronounced if the adult was of
working age (16-59), whereas non-contact rates were
below average for households comprising two older
adults (at least one aged 60 or over). The higher non-contact
rate for households in London was not
significant after controlling for the type of
accommodation and number of adults in the
household.

Table 4  FES non-contact rates by selected household characteristics

Household characteristics                    Non-contact rate   Sample size

Average rate                                      1.5%             4825

Number of adults
  1                                                3%*             1432
  2                                                1%*             2458
  3                                                1%               629
  4 or more                                        ø                306

Household composition
  1 person only                                    3%*             1277
    - aged 16-59                                   6%*              461
    - aged 60 or over                              2%               815
  2 adults, no child (0-15)                        1%              1589
    - both aged 16-59                              2%               783
    - one or both aged 60 or over                  ø*               807

Accommodation type
  Detached, semi-detached or terraced house        1%              3973
  Purpose-built flat                               4%*              708
  Converted or shared house/flat                   8%*              129

Area type
  London                                           3%*              553
  Metropolitan                                     2%              1007
  Non-metropolitan                                 1%              3265

* Non-contact rate for category is significantly different from the average rate (p<0.01)
ø Less than 0.5%
5.  Comparison with previous census-linked checks

Finally, we were interested in assessing whether
patterns of response on the FES had changed since the
first census-linked study in 1971. Survey response
rates have been reasonably stable over this period:
response among the 1971 study sample was the same
as that in 1991 (70%) although it was slightly higher
in 1981 (74%).
In order to make comparisons of non-response
bias over time (and also across surveys) it is helpful to
control for variation in the overall
response rate by calculating correction factors for each
household type at each point in time. These are
equivalent to simple (univariate) weighting factors and
are calculated by dividing the overall survey response
rate by the response rate for the category. Factors
greater than 1.0 indicate households which are under-represented
among survey respondents.
The number of categories which could be compared
was restricted by the use of different definitions over
time. Table 5 concentrates on variables found to be
strongly associated with response in 1991, but the
number of people in the household is used instead of
the number of adults as the latter was not available in
earlier years. The number of people is less useful for
interpretation as, apart from single person households
Table 5  Comparison of FES non-response bias: 1971, 1981 and 1991

Household characteristics                  Correction factors a
                                           1971      1981      1991

Average response rate                       70%       74%       70%

Household composition
  Couple with no children                   n/a       1.01      1.00
  Couple with dependent child(ren)          n/a       0.95*     0.92*
  Couple with non-dependent children only   n/a       1.21*     1.28*

Number of people
  1                                         1.04*     n/a       0.98
  2                                         1.03      n/a       1.00
  3                                         1.01      n/a       1.06*
  4                                         0.95*     n/a       0.99
  5 or more                                 0.93*     n/a       0.97

Number of children
  None                                      1.06*     1.04*     1.03*
  1                                         0.93*     0.99      0.93*
  2 or more                                 0.90*     0.91*     0.92*

Number of cars
  None                                      1.01      1.01      1.02
  1                                         0.97*     0.97*     0.98
  2 b                                       1.13*     1.01      0.99
  3 or more                                 -         1.23*     1.15*

Age of head of household
  16-25                                     0.79*     0.88*     0.87*
  26-35                                     0.84*     0.88*     0.93*
  36-45                                     0.95*     0.96      0.98
  46-55                                     1.03      1.06      1.06*
  56-65                                     1.10*     1.05      1.06*
  66-70                                     1.06*     0.99      1.02
  71 or over                                1.13*     1.14*     1.02

Qualification level
  Degree or equivalent                      0.99      0.93*     0.91*
  Other higher qualifications               0.91*     0.89*     0.84*
  No post-school qualifications             1.01*     1.01*     1.02*

Country of birth
  UK                                        n/a       0.98*     0.99*
  Other                                     n/a       1.20*     1.17*

Economic status
  Employee                                  0.98      0.97*     0.98*
  Self-employed                             1.17*     1.14*     1.09*

a  Average survey response rate divided by response rate for category
b  2 or more in 1971
*  Response rate significantly different from average (p<0.05)
which correspond with one-adult households, it
confuses the separate effects of numbers of adults and
children.
The results show a very consistent pattern, suggesting
that the types of household under-represented in the
FES have not changed substantially over the last
twenty years. The 1991 results show a better
representation in the sample of households with an
older head (aged 71 or over) and, perhaps related to
this, of single person households. This may reflect the
greater efforts made by interviewers to gain co-operation from older people living alone, perhaps in
response to the earlier findings, or it may indicate a
cohort effect and a greater willingness to co-operate
among people now reaching the age of 70.
There may also be some evidence that the under-representation of households with three or more adults
has increased in the past twenty years, although this
can only be surmised from the results for number of
people in the household. The 1971 correction factors
for households comprising three or more people are
systematically lower than those for 1991 indicating
that large households were better represented in the
earlier year. However, the effect might be partly due to
changes over time in the response rates of households
which are larger because they contain children.
The consistency of the pattern of associations over the
past 20 years is of interest when considering how to
re-weight data to compensate for non-response. If
patterns are broadly stable, and if appropriate variables
are available on the survey data-set, then a potentially
effective method of reducing non-response bias would
be to weight using response probabilities derived from
census-linked checks, at least as a first stage of
adjustment.
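The weighting method suggested above amounts to inverse-probability weighting; the response probabilities below are illustrative placeholders, not estimates from the census-linked checks.

```python
# Non-response weighting from estimated response probabilities: each
# responding household of a given type is weighted by 1/probability,
# so under-represented types are scaled back up in survey estimates.
# Hypothetical probabilities for two hypothetical household types:
response_prob = {
    "3+ adults, London": 0.51,
    "couple with 2 children": 0.76,
}

def nonresponse_weight(household_type):
    return 1.0 / response_prob[household_type]

print(round(nonresponse_weight("3+ adults, London"), 2))       # 1.96
print(round(nonresponse_weight("couple with 2 children"), 2))  # 1.32
```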
References

1. Surveys included were the Family Expenditure (FES), General Household (GHS), Labour Force (LFS), National Travel (NTS) and National Food (NFS). A comparison of results across surveys is to be published in the SSD New Methodology series.

2. Kemsley, W.F.F. (1975) Family Expenditure Survey: a study of differential non-response based on a comparison of the 1971 sample with the Census. Statistical News, 31, 3-8.

3. Redpath, R. (1986) Family Expenditure Survey: a second study of differential response, comparing Census characteristics of FES respondents and non-respondents. Statistical News, 72, 13-16.

4. For comparability with FES data, the head of household was defined by applying survey definitions to the census data-set.

5. Social class was only available for census respondents who had been in paid employment within the previous 10 years.
SMB 381/96
Survey of English Housing – A test of initial contact by telephone
Anne Klepacz and Aidan O’Kelly
1. Background
The Survey of English Housing (SEH) is a continuous
survey carried out for the Department of the
Environment; it has been running since April 1993. It
involves an interview of about 30 minutes with the
head of household or his/her partner at each selected
address. The normal procedure for approaching
informants in SEH (as for all face to face surveys
carried out by SSD) is for an advance letter to be sent
to all sample addresses for the month. The letter
informs the residents that their address has been
sampled, gives brief details of the particular survey,
explains that an interviewer will call at the address in
the next few weeks, and gives a contact point for
anyone wanting further information. The interviewer
then calls in person at the address and either carries
out the interview straightaway, if the respondent is in
and the time is convenient, or makes an appointment
to call back.
An alternative approach, which might reduce
interviewers’ travelling costs, is for the interviewer to
contact the informant first by telephone rather than by
a personal visit. It has generally been thought that such
an approach would be likely to increase the number of
refusals. However, research in the USA indicated that
an advance telephone call was not likely to increase
the refusal rate greatly. Interviewers working on the
1994 British Crime Survey were allowed to contact
addresses by telephone if they wanted to but no
analysis was possible of the effectiveness of the
method.
In order to investigate the feasibility of this approach
more rigorously, interviewers working in just over half
(33) of the July 1994 areas sampled for the SEH were
asked to make their first contact with the sampled
addresses by telephone. In 1994, there were 65 work
areas a month, with 36 addresses in each area.
2.  Method

Interviewers working in the 33 areas allocated to
telephone contact were sent detailed instructions
which covered going to the Town Hall or library to
look up informants' names in the electoral register and
looking up the corresponding telephone numbers.
Interviewers were also given advice on how to counter
any reluctance over the telephone, and instructed to
call in person on any informants who refused over the
telephone. Interviewers working in the remaining
areas were asked to approach addresses in the normal
way. Both sets of interviewers were asked to return a
sheet giving details of the interview outcome at each
address, and the number of personal calls made at each
address to achieve a result.

Interviewers working on the "telephone" assignments
were additionally asked for information on whether a
telephone number was obtained at each address,
whether the first contact was made by telephone and
whether an appointment for the interview was made by
telephone. They were also asked for information on
the distance from their home of the library or town hall
where they located electoral registers and telephone
directories, the time spent looking up names and
telephone numbers, and for their reaction to this
method of working.

Additional instructions were given part way through
the field period because some interviewers assumed
they could not make a personal visit to an address
(even if they were passing) if they had not yet
succeeded in making telephone contact. This was not
the intention, given that achieving greater economy
was one of the aims of the experiment.

During the experiment, several interviewers contacted
the office, concerned that a large number of
informants were refusing over the telephone. (One
interviewer had received 5 consecutive refusals.) As a
result, a note was sent from the field office advising
interviewers not to make further telephone contact at
any address if they received 3 or more refusals. Also,
two experienced interviewers were allowed to make
the first call in person rather than by telephone,
because they were very concerned that telephone
contact would not work well in their areas (large
council estates, with a lot of anti-government feeling).

3.  Results
Information was received back for 21 assignments in
the non-telephone part of the experiment and for 24
assignments in the telephone part. A further 8
interviewers gave their reaction to the experiment, but
did not complete the detailed analysis sheets for each
address. These assignments are therefore excluded
from the tables, but the interviewer reactions are
included in the comments. (Reactions of these eight
interviewers ranged from very positive to negative so
the absence of detailed information from them is
unlikely to affect the representativeness of the
sample.)
a)  Availability of telephone numbers
Telephone numbers were found for 40% of addresses
in the telephone sample, an average of 14 per area.
This average actually conceals a wide variation: on an
assignment of 36 addresses, telephone numbers were
available for as few as 8 or as many as 25. Although
the majority of interviewers reported no problems in
locating and using electoral registers and telephone
directories, there were one or two frustrations, such as
finding the library closed on the day the
interviewer wanted to start work, the registers being
used by someone else, or records held in more than one
place (this applied to some rural areas).
Just under a third (31%) of addresses were first
contacted by telephone. This is lower than the
proportion for which telephone numbers were found
(40%) because in some cases interviewers reverted to
a face to face approach when they had had 3 or more
refusals over the telephone (see above). There were
also some cases where the telephone number found
was no longer current (in some areas increasing
numbers of people were switching from BT to
Mercury).
Of the 271 addresses contacted by telephone, 205
(76%) agreed to an appointment. 42 informants
refused over the telephone and could not be persuaded
by a personal visit. The remainder were successfully
interviewed and represent a mixture of people who
wanted to see the interviewer before agreeing to an
interview and people who were initially reluctant or
refused, but changed their minds following a personal
visit from the interviewer.
b)  Response

Response in the telephone areas was 82%, both for
those addresses contacted by telephone and for the
other addresses, as against a response of 84% in the
"non-telephone" areas. This difference is not
statistically significant. However, these results have to
be treated with caution. As outlined above,
interviewers getting 3 or more refusals by telephone
reverted to a face to face approach. If interviewers had
stuck rigidly to making telephone calls, the response
for the telephone addresses would very likely have been
lower.
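The significance claim can be checked with a standard two-proportion z-test; the interview counts below are back-derived from the reported rates and set sample sizes, so they are approximate rather than taken directly from the tables.

```python
# Two-proportion z-test comparing response in the telephone areas
# (roughly 82% of 751 eligible addresses) with the non-telephone
# areas (84% of 677). |z| < 1.96 means the difference is not
# significant at the 5% level.
import math

def two_prop_z(success_a, n_a, success_b, n_b):
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

z = two_prop_z(617, 751, 569, 677)  # approximate interview counts
print(abs(z) < 1.96)  # True: not significant at the 5% level
```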
c)  Costs

The average number of visits made to addresses
contacted by telephone was 1.2. This compares with an
average of 2.0 calls to the remaining addresses in the
"telephone" areas and 1.9 visits to addresses in the
"non-telephone" areas.
Table 1  Contact by telephone

                                          No       %

Set sample                               864      100
Telephone number found                   344       40
Contacted by telephone                   271       31
Agreed to appointment over telephone     205       24

Table 2  Response

                               Set sample   Number of    % response
                                  size      interviews

Telephone assignments:
  Contacted by phone              270*         222           82
  Not contacted by phone          481          395           82
  Combined                        751          617           82
Non-telephone assignments         677          569           84

* 1 contacted by telephone proved ineligible

Table 3  Average number of personal visits

Telephone assignments:
  Contacted by phone                          1.2
  Not contacted by phone                      2.0
  Combined                                    1.8
Non-telephone assignments                     1.9

Telephone assignments:
  Average no of trips to find information     1.2
  Average distance to library/town hall       21 miles
  Average time taken to obtain information    2 hours

The average number of personal calls to achieve an
interview at addresses contacted by telephone certainly
demonstrates a saving of time and therefore of money
(although one interviewer working in a rural area
mentioned that because she was making appointments
without knowing the geographical layout of all the
addresses, she found herself criss-crossing the area
more than usual, so her planning may have been less
economical from that point of view).
However, the average number of calls for an achieved
interview in the "telephone" areas is very close to that
in the "non-telephone" areas (1.8 as against 1.9).
Significant savings would only be possible for those
areas where telephone numbers can be found for a
high proportion of addresses, especially when the
costs of the interviewer looking up the information
and making telephone calls are taken into account.
(For example, interviewers took an average of 2 hours
to obtain telephone numbers, and needed to travel an
average of 21 miles to the relevant town hall or
library. This is equivalent to the cost of 1-2
interviews.) Savings might also be greater where the
interview is longer. At the time of this experiment the
interview was quite short (25-30 mins), which meant
that interviewers were often successful in achieving an
interview at first call, without needing to make an
appointment. When the interview was lengthened in
October 1994, there was a corresponding rise in the
number of calls taken to achieve an interview, with
fewer people prepared to do the interview there and
then.
d)  Interviewer reaction
Some interviewers were very positive about contacting
informants by telephone, while others felt strongly that
it was not a good method. Their comments were
classified into four categories with almost equal
numbers falling into each category: 9 positive, 8 fairly
positive, 7 fairly negative and 9 negative.
In each category there was a good spread of
interviewer experience and ability.
Positive reactions
Some interviewers who had been dubious to begin
with were surprised how responsive some informants
were over the telephone. Some interviewers told
informants that the telephone was being used to reduce
costs: informants appreciated this.
Some interviewers felt strongly that telephone contact
cut down on fruitless visits. The approach was
especially useful for contacting busy informants who
worked odd hours, and who might not have been
contacted face to face.
Interviewers who were enthusiastic about the approach
said they personally felt confident about contacting
members of the public by telephone, and/or they had
had previous experience of doing so.
Reservations
The method may work better in certain types of areas.
Interviewers thought it would be received more
favourably by busy professionals, who often ask why
interviewers do not telephone first. Some
interviewers felt it would not work so well on council
estates, if there was a high level of anti-government
feeling, and where informants needed to be reassured
face to face. Some also felt that the method would not
work so well in an area with a transient population.
However, it could work very well in rural areas, with
widespread addresses, and where interviewers could
ask directions on how to get to the address.
Many interviewers said it was more difficult to make a
face to face visit to a household which had refused
21
over the telephone. Some interviewers said informants
were very suspicious about telephone contact. This
made it more difficult to sell the survey, and the
explanation sometimes took much longer than on the
doorstep.
In view of these reservations, several interviewers
thought the method should be optional, depending on
the area; or it could be treated as a “back-up”. For
instance, if contact could not be made after the 2nd or
3rd visit to an address, the interviewer could try
telephoning. Many interviewers were already doing
this.
Even some of the interviewers who were willing to try
the method were afraid that the approach might have a
detrimental effect upon response. It was also felt that
telephone contact did not give interviewers a “feel” for
the address. In one case, this raised concerns about
security, as an interviewer would not necessarily be
making an initial visit to an address during daylight
hours. (It has to be borne in mind that the experiment
was carried out in July, when this was not really an
issue. Interviewers might be more concerned about the
safety issue in December.)
There was a general concern that interviewers would
be mistaken for telesales agents, especially on a
housing survey: some informants associated
"housing" with double glazing sales.
In some cases, where interviewers were very negative
about the method, they also said they personally did
not feel confident about contacting members of the
public by telephone, and had no previous experience
of this sort of approach.
4.  Conclusions
The results of the study are rather inconclusive on the
effect of a telephone approach on response. Although
the difference in response rates in the two parts of the
study was not statistically significant, the decision to
advise interviewers to stop trying to make telephone
contact if they were getting too many refusals
obviously affects the weight that can be attached to
this result. The results also suggest that there are
unlikely to be any great cost savings in using
telephone contact as the standard on all assignments,
at least for short interviews. The main point to emerge
was that there was considerable variability in the
effectiveness of the telephone approach in different
areas.
One aspect of this variability was connected with the wide
variation in the availability of telephone numbers (in some
areas subscribers had been changing to Mercury, so numbers
found in BT directories proved to be out of date). The
implication is that it would be far more cost effective to
provide telephone numbers centrally. Interviewers working
in areas where few numbers are available would be very
frustrated at needing to spend several unproductive hours in
the library. This argument is particularly relevant for smaller
areas.
The other main variability was connected with type of
area. The method was not felt to be helpful on council
estates, areas with high anti-government feeling, or
with transient populations. It was felt to be helpful in
professional areas, with busy people who were hard to
contact face to face, and in rural areas. The implication
is therefore that interviewers should be given
flexibility in using the telephone, with the choice left
to their judgment of its usefulness in the particular
area.

An alternative option might be to recommend much
wider use of the "telephone back-up" method already
used by some interviewers, ie using the telephone when
there has been no contact made at the address after 2
or 3 calls.

If the method were to be introduced more widely,
more thought would need to be given to the safety
aspect (mentioned above) and to the need for
interviewer training. An interviewer who was very
enthusiastic about the method, and who had done
telephone work in a previous job, said that training
was essential if all interviewers were to adopt the
approach. In particular, she felt that interviewers
needed more guidance on what to do when they
encountered reluctance over the telephone.
SMB 381/96
Researching in prisons – some practical and methodological issues
Ann Bridgwood
During 1994, Social Survey Division (SSD) carried
out two major projects in prisons – a Survey of the
Physical health of Prisoners and a Census of Mothers
in Prison. This article reports on some of the practical
and methodological issues which arose, and one or
two of the factors affecting response.
1.  Background
The Survey of the Physical Health of Prisoners1 was
carried out on behalf of the Directorate of Health Care
of the Prison Service. The aim was to collect
information on the health and health-related behaviour
of sentenced male prisoners, in order to inform the
planning of health care. A probability sample of just
under a thousand men in 32 establishments was
interviewed for 25-35 minutes. Interviewers used
laptop computers to record the answers. The interview
was followed by a nurse session lasting 10-15 minutes,
during which each sample man had his blood pressure,
respiratory function, height and weight measured.
Nurses also collected details of medication. A team of
two interviewers and one nurse worked in each
sampled prison.
The Census of Mothers2 was designed to produce an
estimate of the proportion of female prisoners who had
children under 18 and/or were pregnant, and to collect
details of the women’s children and the arrangements
made for their care. A Census date of November 21st
1994 was chosen and all women in prison on that day
were asked to participate in a short sift interview to
establish their eligibility for a longer, follow-up
interview. Although the census covered all female
prisoners, participation was not compulsory.
Interviewers used paper questionnaires rather than
laptops3 and interviewed 1766 women in all 12
establishments housing women prisoners. Teams of
interviewers worked in each prison; smaller prisons
could be covered by two interviewers, but larger
establishments required bigger teams.
For both the Prisoners’ health Survey and the Census
of Mothers, a small amount of information was drawn
from prison records.
2. Sampling issues – Prisoners’ Health Survey
For the Prisoners’ Health Survey, the intention was to
select a sample of 39 men in each of the selected
establishments (to yield an achieved sample of
approximately 1,000). We used a two-stage sample
design. At the first stage 32 establishments were
chosen with probability proportional to the number of
sentenced male prisoners from a list of all prisons in
England and Wales holding such prisoners, stratified
by type of prison.
In each establishment, a random sample of men was
chosen immediately before the start of fieldwork from
a list of all sentenced males in the prison. The
sampling interval was different in each of the selected
prisons so that each sentenced man had an equal
probability of being chosen.
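The equal-probability logic described above can be sketched in code. The prison sizes below are hypothetical (only the December 1993 population total comes from the text), and `sampling_interval` is an illustrative helper, not part of the survey's actual processing:

```python
# Stage 1: prisons drawn with probability proportional to size (PPS).
# Stage 2: within each sampled prison, a systematic sampling interval is
# chosen so that every sentenced man has the same overall probability.

def sampling_interval(prison_size, target_per_prison=39):
    """Interval k such that P(man | prison) = target_per_prison / prison_size.

    Overall probability = P(prison selected) * P(man | prison)
                        = (n_prisons * prison_size / total) * (1 / k),
    which is constant across prisons when k = prison_size / target_per_prison.
    """
    return prison_size / target_per_prison

total = 32243                          # sentenced male population, Dec 1993
for size in (400, 900, 1500):          # hypothetical prison sizes
    k = sampling_interval(size)
    p_prison = 32 * size / total       # PPS first-stage probability
    p_overall = p_prison / k
    assert abs(p_overall - 32 * 39 / total) < 1e-12   # same for every prison
    print(size, round(k, 1), round(p_overall, 5))
```

Whatever the prison's size, the two stages cancel so that each man's overall inclusion probability is 32 × 39 / 32,243.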
2.1 Dealing with a changing population
Because of the need to gain consent from the sampled
establishments (particularly those chosen for pilot
studies), to make the necessary practical arrangements,
and to arrange security clearance for interviewers,
several months elapsed between drawing the primary
sampling units (prisons) for the Prisoners’ Health
Survey and the start of fieldwork. The sample of
prisons was drawn in February 1994, using prison
population figures for December 1993 (the most recent
available at the time) and fieldwork took place in July
1994. Between December 1993 and July 1994 the size
of the sentenced male prison population increased
from 32,243 to 34,080; this, combined with mobility
between prisons, meant that the number of sentenced
men in the sampled establishments had often
changed considerably between selecting the prisons
and the start of fieldwork.
To maintain the same probability of selection for each
sentenced man, the original sampling interval for each
prison was retained and consequently the set sample
exceeded 39 in some establishments. Interviewers
therefore needed to have sufficient cases to allow for
any increase in the size of their sample. Each prison
was allocated 50 random numbers to make the sample
selection within the prison, allowing up to 11 spare
cases in addition to the 39 we were aiming to select.
The lead interviewer was instructed to continue using
the numbers until she had worked through the list of
all sentenced males in her establishment; the actual
number of men sampled in each prison ranged from 21
to 49.
2.2 Sampling beds rather than individuals
A hazard of prison research is that inmates who are
sampled may leave prison before the interviewer has
had a chance to talk to them because they have been
released, given bail or been transferred to another
prison. This could affect response by increasing the
proportion of non-contacts and lead to bias in the
responding sample. In order to avoid this, those
respondents who have left prison need to be replaced
by other prisoners, in a way which does not bias the
sample.
For the National Prison Survey 19914, if a selected
prisoner had left before the interviewer could make
contact, he or she was replaced by the person now
occupying their bed, provided that the new occupant
was newly arrived in the prison and had not had a
chance of being selected when the initial sample was
drawn.
A similar procedure was used for the Prisoners’ Health
Survey. It was not successful in some prisons, partly
because these prisons contained induction units in
which newly-arrived inmates spent up to two weeks
before transferring to a wing. Given the short period of
fieldwork, the likelihood of a prisoner arriving in an
induction unit after the sample had been drawn and
being transferred to a bed on a main wing before
interviewing was completed was small. Men moving
from the induction unit into the ‘bed’ of a man who
had left or been transferred would have been included
in the original sampling frame and therefore already
have had a chance of selection. Men who had arrived
in the induction wing after the sample was drawn
might be replacing someone who was still in the
prison. Fortunately, the number of affected prisoners
was small and had a minimal effect on response.
2.3 Sampling individuals and replacing leavers with arrivals – Census of Mothers
For the Census of Mothers, we wanted to interview
every woman in prison on November 21st 1994.
Because of the time delay between Census night and
fieldwork, finding all of the women posed some
problems. There was a lot of movement in and out of
one of the larger prisons, which also had an induction
unit; this could potentially affect response by
increasing the number of non-contacts.
Where a woman had been transferred to another prison
before she could be interviewed, it was sometimes
possible to contact an interviewer in the woman’s new
prison and arrange for her to be seen on arrival at her
destination. For other women, a different system of
replacement was devised. In addition, we needed a
method of replacing women who had left prison
altogether so that we interviewed (or attempted to
interview) the number of women on the Census list for
November 21st.
When interviewers had accounted for all of the women
on the sampling frame (i.e. all those in prison on
Census day) still in the prison, they counted the
number of women who had left prison before they
could be interviewed – listing separately women who
had been sentenced or were on remand. The lead
interviewer then asked for lists of those women who
had entered the same prison since the registers were
compiled, with sentenced and remand prisoners again
being listed separately, and selected the requisite
number of women as replacements for those who had
left. This method proved successful; at the end of
fieldwork, only 18 women had left prison and not been
replaced by a new arrival (see Table 1).
3. Fieldwork issues
3.1 Sifting arrangements – Census of Mothers
As noted earlier, one of the aims of the Census of
Mothers was to make an estimate of the proportion of
women in prison who were either mothers of children
under 18 and/or pregnant at the time of interview. This
involved asking all women in prison on Census day to
take part in a short sift interview to establish their
parental status. It was important to devise an optimal
way of dealing with the sift interview in the prison
context; many women would not be asked to do the
half-hour follow-up interview, and we wanted to make
best use of interviewers’ time and avoid making
excessive demands on prison staff.
Two different methods of sifting were tested during
the pilot study. In one prison, interviewers carried out
the sift interview and then, if the woman was eligible,
went on to do the follow-up interview. In the other
prison, the interviewing was split; one interviewer
carried out the sift interviews and then passed eligible
women on to other interviewers.
It was decided to adopt the former procedure for the
main fieldwork for a number of reasons.
Table 1  Response to the Census of Mothers in Prison

                                           Number      %
Total on register                            1945
Ineligible                                     51
Total eligible persons                       1894    100
Co-operating with sift                       1766     93
  Mothers                                    1082     57
  Non-mothers                                 684     36
  (Mothers as a % of women interviewed: 61)
Refusal at sift                                87      5
Non-contact at sift                            11      1
Advised not to see                              9      0
Left prison, no replacement                    18      1
Incapable of interview                          3      0
Total eligible mothers                       1082    100
Fully co-operating                           1049     97
Partially co-operating                          8      1
Refusal at follow-up                           18      2
Non-contact at follow-up                        0      0
Firstly, interviewers from the prison where the ‘split’
method was used reported that a number of women
who were eligible did not attend the follow-up
interview, whereas no-one refused to continue from
the sift to the full interview with the same interviewer.
Secondly, it was felt that, having established rapport
with the interviewee during the sift interview, it would
help the interview to flow better if the same
interviewer carried on and did the follow-up interview
straight away. Thirdly, this method proved more
practical in organising the flow of work. Using the
split method, the sift interviewer sometimes built up a
backlog of women waiting to do the full interview. At
other times, when a number of consecutive
interviewees were not eligible, the sift interviewer had
no-one to pass on to the follow-up interviewers.
Asking all interviewers to do both parts of the
interview also meant that they could work separately
and were not constrained by having to work in teams,
reducing practical problems such as finding sufficient
rooms close to each other.
3.2 Monitoring fieldwork progress and response – Prisoners’ Health Survey
SSD now uses computer-assisted personal
interviewing (CAPI) for most household-based
surveys, but the Prisoners’ Health Survey was the first
institution-based survey to use CAPI. Overcoming
some of the administrative constraints of CAPI in an
uncertain and changing fieldwork situation was a
major issue for the survey and, as such, is relevant to
other institution-based surveys.
Unlike household surveys where a fixed number of
addresses is usually allocated to interviewers,
fluctuations in the size of the prison population,
combined with mobility between prisons which can
result in a change in the ratio of sentenced to
remanded prisoners, made it impossible to predict
beforehand exactly how many prisoners would be
sampled in any individual prison.
Interviewers in many prisons were dependent on
prison officers to escort respondents to and from
interview rooms because prisoners are not allowed to
move freely about the prison. In addition, because of
the demands on inmates’ time described below, it was
rarely possible to draw up a timetable and to expect to
see respondents at a set time or in a set order;
interviewers had to be prepared to interview anyone
from the sample who turned up. For this reason, it was
not feasible for the Prisoners’ Health Survey to
allocate half of the sample to each interviewer; this
could have led to a situation where an interviewer was
unable to interview because the prison officer was
unable to find one of the men on her list.
Interviewers had to account for every case by entering
an outcome code for it. We therefore needed a method
of listing the sampled cases, and of monitoring their
progress and response which was flexible enough to
deal with changes in the size of the prison population,
and enabled interviewers to interview any of the men
who were sampled, and to keep track of all the cases.
Both interviewers were given a disk with all 50 cases
on it so that either of them could interview whichever
of the men were brought to the interview rooms. This
meant that interviewers had to take extra care to make
sure that the same serial number was not used for two
men.
In the pilot study, interviewers had entered an
outcome code for unused cases or for cases dealt with
by the other interviewer, and transmitted them to the
office in the normal way. This proved time-consuming. For the main fieldwork, IT staff wrote a
program to delete unused cases from interviewers’
laptops. This obviously could not be done until all
used cases had been transmitted, checks had been
carried out against the list of sampled men to ensure
serial numbers had not been used twice, and all the
cases which we were expecting had been received.
3.3 Factors affecting response
SSD puts a lot of work into achieving high response
rates when carrying out surveys in prisons and our
efforts are usually successful; 90% of those sampled
for the National Prison Survey 1991 agreed to take
part. The Prisoners’ Health Survey achieved a
response rate of 85% for the interview and 84% for the
measurements, while the Census of Mothers achieved
a 93% response rate (Tables 1-2).
Although prison surveys may appear to have a ‘captive
audience’, prisoners are not as much at the disposal of
interviewers as is often assumed. Many work or attend
educational classes in prison, some have to attend
court hearings and a small proportion leave the prison
each day to work or study. As in any other survey,
prisoners have the right to refuse to participate in the
study. Response cannot be taken for granted, and we
took a number of steps to try and maximise it. Two
particular factors appeared to affect response: the use
of advance letters and the siting and organisation of
fieldwork.
Use of advance letters in institutional settings
SSD normally sends letters to sampled addresses in
advance of the interviewer calling, explaining the
purpose of the survey. This prepares the way for the
interviewer, who is in a better position than she would
be if calling cold, and has been shown to increase
response.5 Although a householder is able to discuss
the letter with friends, relatives or neighbours and
decide either independently or after consultation with
others, not to take part in the survey, he or she is not
usually in a position to discuss the letter with other
sample informants, or to persuade or dissuade them
from taking part. An individual receiving an advance
letter in a prison (or in any other institution), in
contrast, is likely to be in contact with at least some
others who have been sampled.
Table 2  Response to the Prisoners’ Health Survey

                                           Number      %
Total eligible persons                       1173    100
Fully co-operating*                           981     84
Partially co-operating+                        11      1
Non-respondents:
  Refusals                                    101      9
  Non-contacts                                 40      3
  Advised not to see                            4      0
  Left prison, no replacement                  36      3

* Respondents who were interviewed and agreed to all the measurements
+ Respondents who were interviewed but refused one or more of the measurements
This was particularly true for the Census of Mothers,
where the aim was to talk to every woman in prison.
In surveys of institutions, opinion leaders are in a
position to influence response in a way that does not
apply to those asked to participate in household
surveys.
For both projects, we used the same procedure for the
pilot studies, based on that developed for the National
Prison Survey. Notices, to be posted on staff and
inmate notice boards, were sent to the prisons a few
weeks before fieldwork began. Once interviewers had
selected the sample, they sent individually-addressed
letters to those sampled (for the Census of Mothers,
this was one wing in each of the pilot prisons).
Prisoners’ Health Survey
The advance letters were well received during the
Prisoners’ Health Survey pilot, with interviewers and
nurses reporting that a number of respondents had
referred to them during the interview or nurse session.
In one prison a rumour circulated that the nurse would
be taking a blood sample, which would be used to test
for HIV status and drug use. We were able to specify
in our advance letter for the main fieldwork that this
was not the case, and that respondents would only
have their blood pressure, respiratory function, height
and weight measured. We have no way of knowing,
of course, whether this enhanced response, but it did
provide an opportunity to allay possible
misconceptions and fears about the survey.
Census of Mothers
During the pilot for the Census of Mothers, a group of
influential women in one of the prisons decided after
receiving the advance letter that they would not
participate in the study and were able to persuade
other women on their wing not to take part. The
women raised concerns about information being
passed on by interviewers to prison staff and social
services. Interviewers also pointed out a practical
disadvantage of individual advance letters; hand-addressing and delivering several hundred letters in
the larger prisons would require a lot of time and
resources.
It was decided that advance letters could be counterproductive for this project and therefore they were not
used for the main stage. Notices were sent to the
prison beforehand for the officers’ and inmates’ notice
boards.
Where women were eligible for the full interview,
interviewers explained that it would take about 30
minutes and that, consequently, if prison staff did not
already know that a woman had children, they would
be able to guess by the amount of time she spent in the
interview. At this second stage, 98% of those eligible
to do so agreed to take part in the follow-up interview
(Table 1).
3.4 Siting and organisation of fieldwork in the prison
Finding suitable accommodation for interviews is
quite an issue in prison research. Most prisons are
short of rooms, and there is heavy demand for them.
Many inmates have to be unlocked from their cell and
escorted to and from interviews by a prison officer; the
interviewing rooms therefore cannot be too far from
the residential wings as this would make heavy
demands on officers’ time. Interviewing rooms must
also combine privacy with adequate arrangements for
interviewers’ safety.
Prisoners’ Health Survey
The Prisoners’ Health Survey placed particular
demands on the selected prisons as three rooms were
required for confidentiality to be maintained; one for
each interviewer and one for the nurse. Ideally, the
rooms needed to be close to each other to allow
respondents to go straight from the interview to the
nurse session. The interviewing was carried out in the
health care suite in several establishments, which
helped to give the survey a ‘medical’ status.
Elsewhere the OPCS team was located in education
rooms. The proximity of interviewers and nurses
almost certainly helped to improve response to the
nurse session; only 11 men (1% of the sample) who
had taken part in the interview did not complete the
nurse session – and two of these started it.
The disadvantage of being located in the health care or
education rooms, away from the prisoners’
accommodation, was that interviewers were reliant on
prison officers to give advance letters to sampled men,
and were not always able to talk to men who indicated
(to prison staff) that they did not wish to take part.
Where prison officers agreed to escort interviewers to
the residential wings, men who had initially refused to
participate in the survey often changed their mind and
agreed to take part when the interviewer had the
opportunity to give a full explanation of what was
involved. It was not always practical to ask prison
officers to do this, because of other demands on their
time.
Census of Mothers
There was no requirement for a team to be close to
each other for the Census of Mothers; interviewers
could split up and work separately in a way which was
not feasible for the Prisoners’ Health Survey. Nor was
there any particular advantage in being in one location
rather than another; interviewers were very flexible
and interviewed wherever prison officers were willing
to escort them. Interviews took place, for example, in
cells, workshops and recreation rooms. This gave the
interviewers more opportunity to talk personally to
women and to try to persuade some of those who
initially refused to take part.
4.
Conclusion
Several of the issues discussed in this article, such as
the difficulty of talking to people who had initially
refused to participate because of the need to be
escorted by prison officers, are specific to prison
surveys. Others arose because we were working in
institutions rather than in households and, as such,
may be relevant to other institution-based surveys.
Although institutions such as hospitals or night
shelters may have a relatively stable population size,
they, like prisons, may have a high turnover and
require a way of keeping track of cases which is
flexible enough, especially for CAPI surveys, to cope
with a changing population.
Other issues, such as the opportunity for respondents
to influence each other after seeing advance letters, the
need to track individuals between institutions and
practical problems such as hand-addressing large
numbers of envelopes, arose because the mothers’
project was a Census. One particular question – how
best to carry out the sifting – resulted from a
combination of the specific needs of the project
(establishing which women were mothers and/or
pregnant) and its institutional setting. Although sifts
are often a feature of household-based surveys, it is
unusual for more than one interviewer to approach a
household, so the question of which method to use
would never arise.
One of the main conclusions to be drawn from these
two projects, however, is the importance of pilot work.
The pilot studies highlighted all of the issues discussed
above and enabled us to test various approaches and
decide on the optimum procedures for the mainstage
fieldwork.
References
1. Bridgwood, A. and Malbon, G. (1995) The Physical Health of Prisoners 1994. London: HMSO.
2. Analysis of the data from the Census of Mothers is being carried out and will be published by the Research and Planning Unit of the Home Office.
3. Although CAPI was considered, we chose to use traditional paper collection methods due to various time and cost constraints.
4. Dodd, T. and Hunter, P. (1993) National Prison Survey 1991. London: HMSO.
5. Clarke, L. et al (1987) General Household Survey advance letter experiment. Survey Methodology Bulletin, 21, 1-8.
SMB 381/96
The presentation of weighted data in survey report tables
Dave Elliot
It is becoming increasingly common for social surveys
to use some form of weighting. This may be desirable
for three main reasons:
i) to remove the bias caused by the use of different selection probabilities or sampling fractions;
ii) to adjust for nonresponse;
iii) to gross the survey to the population.
When grossing is not used, most survey tables
typically present the proportions in the categories of
some variable of interest crossed with some
background characteristics (eg age, sex, social class)
and provide the sample sizes for each background
class. Table 1, taken from the 1993 General Household
Survey report, is a typical example.
For grossed surveys, the numbers in the table are
normally estimates of population totals but once again
it is common practice to present the sample sizes as
one indicator of reliability.
As is well known, proportions constructed from
weighted data are not affected by the scale of the
weights, so that if all the weights are multiplied by
100, say, these proportions are not changed. The same
result holds for more complex statistics and methods
of analysis – they are completely unaffected by a
change of scale. Because of this, the choice of
weighting scale is largely cosmetic.
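This scale invariance is easy to verify directly; a toy sketch with illustrative data (not drawn from any of the surveys discussed):

```python
# A weighted proportion is unchanged when every weight is multiplied by
# the same constant: only relative weights matter.
values  = [1, 0, 1, 1, 0, 1]            # 1 = in the category of interest
weights = [2.0, 1.0, 3.0, 1.0, 2.0, 1.0]

def weighted_proportion(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

p1 = weighted_proportion(values, weights)
p2 = weighted_proportion(values, [w * 100 for w in weights])
assert abs(p1 - p2) < 1e-12             # identical: the scale is cosmetic
print(round(p1, 3))                     # 0.7
```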
The use of weights will however affect sampling
errors as well as survey estimates and most types of
analysis. A useful summary statistic of the
approximate effect of weighting on sampling errors in
a single-stage sample has been developed by Kish.1
This statistic came to be known as the ‘Effective
Sample Size’ but this term is now used more broadly
so I refer to the Kish statistic in this paper as the Kish
Effective Sample Size, or KESS for short. KESS can
be interpreted as the size of a simple random sample
that would have produced the same sampling error as a
weighted but unclustered and unstratified sample. For
some examples of the use of this statistic as an aid in
designing samples, see Elliot.2
It is sometimes more useful to look at the ratio of
KESS to the actual sample since this ratio, which is
known as the weighting efficiency, can be validly
interpreted as the proportional effect of weighting on
the sample size, even in clustered and stratified
samples.
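Kish’s statistic is KESS = (Σw)² / Σw². The article does not spell the formula out, but it is standard; a short sketch with hypothetical weights:

```python
# Kish Effective Sample Size: KESS = (sum of weights)^2 / (sum of squared
# weights). Weighting efficiency = KESS / actual sample size.
def kess(weights):
    s = sum(weights)
    return s * s / sum(w * w for w in weights)

# Equal weights lose nothing: KESS equals the actual sample size,
# whatever the (cosmetic) scale of the weights.
assert kess([1.5] * 200) == 200

# Unequal weights shrink the effective sample (hypothetical weights,
# not taken from any survey discussed here).
weights = [1] * 100 + [3] * 100
efficiency = kess(weights) / len(weights)
print(round(kess(weights)), round(efficiency, 2))   # 160 0.8
```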
The most common choice of scale for survey weights
is one which ensures that the sum of the weights
across the sample is equal to the actual responding
sample size. Indeed, I’ve even seen other choices of
scale described as incorrect by one statistical
consultant.
Table 1  Whether or not self-employed have employees by sex

Self-employed men and women aged 16 and over        Great Britain: 1993

Whether employees                      Men    Women    Total
                                        %       %        %
Self-employed without employees         67      65       67
Self-employed with employees            32      34       33
No answer to whether employees           0       1        0

Base = 100%                            990     356     1346
When means and proportions are being presented, the
main impact of the choice of scale will be on the marginal
sample sizes which are used in report tables. Given the
widespread agreement on scaling weights to the
sample size, it is therefore surprising that there is such
a divergence of views on which numbers to present in
published tables but, in OPCS at least, this is currently
the case.
There are three main alternatives for what to present.
a) The actual (unweighted) sample size. With simple tabulation programs, tables must be run twice – weighted and unweighted – to provide these numbers. Some software (eg the SPSS Tables procedure) can produce them both in one go. This convention was used on a recent Private Renters Survey3 where both percentage tables and grossed estimates were presented.

b) Weighted counts scaled to the actual responding sample size. This is the natural choice if weights are scaled in this way on the data file. This convention was used for example on the recent Psychiatric Morbidity Survey4 and is used on the OPCS Omnibus Survey.

c) Weighted counts scaled to Kish’s Effective Sample Size for the sample as a whole. This convention has been used once on the 1987 OPCS survey of drinking in England and Wales.5

A variant on (c) which has not been used in any SSD report but which might be considered is:

d) The value of KESS for each tabulated subgroup separately.

Several other alternative weighted bases have been used in the past including weighted counts, scaled so that the maximum weight is 1, and some more ad hoc alternatives.

In the case of an unweighted sample, the main purposes of reporting the class sample sizes are perhaps:

i) to give some general impression of the reliability of the proportions;

ii) to enable more sophisticated readers to perform their own approximate significance tests; and

iii) to enable readers to calculate the sample numbers in cells and other statistics from the tables in case they are needed for further analysis.

For weighted samples, different choices of base may fulfil these aims to a greater or lesser extent.

i) Judging general reliability
In weighted samples, all three of the presentation
alternatives (a)-(c) answer this first requirement in
most cases. Although the KESSs in different classes
will normally be less than the actual sample sizes, they
are often of the same order of magnitude because most
designs avoid the extreme weights that cause large
reductions in effective sample size. In cases where this
is not so, weighting to the overall KESS (option c)
might be preferred.
The other problematic case arises when the weights
are either defined by or are closely correlated with the
variables used in the table, so that the proportions in
the table are in effect not weighted. In this case either
of the weighted sample sizes (b) or (c) may differ
substantially from the actual sample size in the
subgroups so the use of the unweighted sample sizes
(a) is clearly preferable. However this situation could
be treated exceptionally even when weighted bases are
the norm.
For example on the 1987 survey of drinking in
England and Wales, referred to earlier, people aged
16-44 were over-sampled compared with older adults.
When results were presented for the two age groups
separately or in separate columns of a table,
unweighted bases were reported, whereas when they
were combined, a weighted base was used. In this case
the use of KESS as the weighted base for the total
sample led to an apparent inconsistency between the
age group and overall totals, whereas weighting to the
overall sample size (b) would have avoided this.
ii) Approximate significance test
This second requirement is more problematic. Aspects
of the sample design other than the weighting, in
particular any clustering and stratification, may make
the simple significance tests that can be performed on
tables very approximate. Since most survey design
factors are greater than one, significance tests that
assume simple random sampling will result in some
false positive findings. Most good survey reports now
estimate valid sampling errors and design factors for
selected variables and also provide advice on adjusting
the approximate tests. Nonetheless, unless sampling
errors are produced for every estimate in the survey
report, there will be some survey estimates where an
approximate method will be needed.
The classic significance test of whether the
proportions in some category of interest are the same
in two groups uses a pooled variance estimate. To use
this test one must know the
relative sizes of the two groups in the population
which can be deduced from the weighted but not the
unweighted counts.
However this test is rarely used in practice, most
people preferring to construct a confidence interval for
the difference between the proportions in two groups.
This can be done without pooling the variances and
the interval can then be used to provide a marginally
less efficient test that the two proportions are equal.
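The unpooled interval is the standard normal-approximation construction; the article does not prescribe any particular code, but a minimal sketch, using illustrative figures from Table 3 with the subgroup KESS bases as the effective sample sizes, is:

```python
import math

# Approximate 95% interval for the difference between two proportions,
# without pooling the variances (standard normal approximation).
def diff_ci(p1, n1, p2, n2, z=1.96):
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - z * se, (p1 - p2) + z * se

# Working full-time: married/cohabiting (43%) vs single (44%), using the
# subgroup KESS values from row (d) of Table 3 as the bases.
lo, hi = diff_ci(0.43, 1083, 0.44, 293)
assert lo < 0 < hi     # interval covers zero: no significant difference
print(round(lo, 3), round(hi, 3))
```

Using a smaller (effective) base widens the interval, which is exactly the conservatism the choice of base is meant to convey.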
If all other features of the design have a negligible
effect on the sampling errors, then weighting to KESS
(c) should provide the best approximate confidence
intervals, with weighting to the overall sample size (b)
usually second best. When the report recommends the
use of an approximate design factor, this will generally
incorporate the effect of weighting, so then the use of
the unweighted sample size (a) will provide better
approximate confidence intervals, with option (b)
again second best.
To illustrate the effect of these different options,
consider some data from the October 1995 Omnibus
survey. The design selects one adult at random from
an equal probability sample of households. Weights
equal to the number of adults in each household are
therefore required to remove any selection bias when
analysing the adult sample.
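A minimal sketch of why these weights remove the bias, using a hypothetical three-household population: in expectation, the weighted estimator recovers the true adult-population proportion, while the unweighted one over-represents adults from small households.

```python
# One adult is chosen at random per household, so an adult's selection
# probability is 1/(adults in the household); the design weight is
# therefore the household size. Hypothetical data (True = works full-time).
households = [[True], [True, False], [False, False, True]]

adults = [a for h in households for a in h]
true_prop = sum(adults) / len(adults)           # population proportion

# Exact expectations of the weighted numerator and denominator over the
# random choice of one adult per household (selection prob * weight * y):
num = sum((1 / len(h)) * len(h) * a for h in households for a in h)
den = sum((1 / len(h)) * len(h) for h in households for a in h)
assert num / den == true_prop                   # weighting removes the bias

# The unweighted estimator is pulled toward adults in small households:
unweighted = sum((1 / len(h)) * a for h in households for a in h) / len(households)
print(true_prop, round(unweighted, 3))
```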
The distribution of numbers of adults in the household
in the responding sample by marital status was as
shown in Table 2.
Table 3 shows employment status by marital status for
all adults from this weighted data as it might appear in
a report table. Below the table are the bases that would
be produced by options (a)-(c) together with the Kish
Effective Sample Size (d) for each marital status
subgroup.
weighted options. Since the weighting efficiency for
individual data is 85% on this survey (i.e. the
proportional effect of weighting is to reduce the
sample size by 15%), one would have predicted that
weighting to the overall KESS (c) would provide the
best approximation to KESS for most subgroups.
However, in this case, option (c) would clearly overcorrect for the effect of weighting in the Widowed and
the Divorced/Separated subgroups and weighting to
the sample size (b) would be a better choice for these
two subgroups, although (c) is the best choice in the
other two subgroups.
It might be argued in favour of weighting to the
overall KESS (c) that it is better to over-correct than
under-correct for the effect of weighting on the
principle that it is usually less desirable to draw false
positive conclusions than false negative ones.
However there must be limits to this line of argument.
Also it seems rather spurious to adjust sample size for
just one aspect of the sample design and ignore the
impact of stratification and clustering on the true
effective sample sizes. In view of this, it may be best
to aim always to provide some general guidance to
readers on the pitfalls of performing approximate tests
on report tables.
iii)
The derivation of sample counts and other
statistics
The requirement of enabling the reader to extract
actual sample counts for the cells in a table is not met
by any of the three options. (The only general way to
achieve this aim would be to produce unweighted as
well as weighted versions of each table.) Multiplying
the weighted proportions by the weighted bases ((b) or
(c)) will produce estimates of the sample numbers that
would have been achieved by an unweighted sample
of the size shown. Using the actual counts (a) in this
way produces numbers that have no obvious
interpretation in most cases. Although neither of these
is ideal, the weighted options, which on average
recover the correct population distribution, are
certainly more useful than the unweighted option
which does not.
Another thing readers sometimes want to do is to
produce percentages based on the rows rather than the
columns of a table. These can be reconstructed from
the weighted but not the unweighted bases.
SMB 381/96
Dave Elliot
The presentation of weighted data in survey tables

Table 2
Distribution of adults in household by marital status

Number of   Total   Married/     Single   Widowed   Divorced/
adults              Cohabiting                      Separated

1             624        10         206       234        174
2            1076       942          76        26         32
3             233       156          64         5          8
4              73        26          45         2          0
5              12         4           8         0          0
6               2         1           1         0          0
7               1         0           1         0          0
Table 3
Employment status by marital status

Employment status       Total   Married/     Single   Widowed   Divorced/
                                Cohabiting                      Separated
                            %        %           %         %          %
Working full-time          40       43          44         6         40
Working part-time          16       18          13         5         13
Unemployed                  4        3           9         1          7
Economically inactive      40       36          34        88         41

Bases
(a) Unweighted           2021     1139         401       267        214
(b) Weighted to
    sample size          2021     1309         411       162        138
(c) Weighted to
    overall KESS         1717     1113         350       138        117
(d) Subgroup KESS        1717     1083         293       230        174
Thus in the Omnibus example above, the proportion of
adults in full-time employment who are married can be
estimated validly as:
.43*1309 / (.43*1309 + .44*411 + .06*162 +
.40*138)
= 70%, using the weighted bases (b)
but not as:
.43*1139 / (.43*1139 + .44*401 + .06*267 +
.40*214)
= 64%, using the actual sample sizes (a).
A third way that readers may want to manipulate the
data in a report table is to combine the subgroups into
larger groups. Once again, this can only be done if
weighted bases are presented. Thus for example an
estimate of the proportion of single or married adults
who are unemployed can be constructed validly as:
(.03*1113 + .09*350) / (1113 + 350)
= 4.4%, using the weighted bases (c)
but not as:
(.03*1139 + .09*401) / (1139 + 401)
= 4.6%, using the unweighted counts (a).
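These reconstructions can be checked mechanically. A short sketch reproducing the arithmetic above from the published proportions and bases (the function name and dictionary layout are mine, not from the article):

```python
# Proportions working full-time by marital status (from Table 3), with
# bases (a) unweighted and (b) weighted to the responding sample size.
p_ft = {"married": 0.43, "single": 0.44, "widowed": 0.06, "divsep": 0.40}
base_b = {"married": 1309, "single": 411, "widowed": 162, "divsep": 138}
base_a = {"married": 1139, "single": 401, "widowed": 267, "divsep": 214}

def share_married(p, base):
    """Proportion of full-time workers who are married, for a given set of bases."""
    total = sum(p[g] * base[g] for g in p)
    return p["married"] * base["married"] / total

print(round(share_married(p_ft, base_b) * 100))  # 70  (valid: weighted bases)
print(round(share_married(p_ft, base_a) * 100))  # 64  (invalid: unweighted counts)

# Combining subgroups: unemployment among single or married adults,
# using the bases (c) weighted to the overall KESS.
p_unemp = {"married": 0.03, "single": 0.09}
base_c = {"married": 1113, "single": 350}
rate = sum(p_unemp[g] * base_c[g] for g in p_unemp) / sum(base_c.values())
print(round(rate * 100, 1))  # 4.4
```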
A more general consideration in choosing between
these options is whether or not they produce consistent
totals. Both unweighted and weighted totals will be
consistent (the subgroup totals will sum to the
combined total) so long as a single set of weights is
used in all tables. When some tables or columns of
tables are weighted and some are not, the results may
be inconsistent but such discrepancies are easily
explained and unlikely to cause much confusion. The
other case where the totals will not be consistent is
where effective sample sizes are calculated separately
for different subgroups (option (d)). Although this
approach gives the best approximation to the effect of
weighting on precision, short of estimating true
sampling errors, the resulting inconsistency adds to the
other disadvantages of this approach (the greater
effort and risk of error in table preparation), so it
cannot be recommended.
Finally, some consideration should be given to how the
choice of base is justified in the report. The use of
unweighted bases (a) or bases weighted to the sample
size (b) is relatively intuitive, but weighting to KESS
(c) will require some explanation, probably including
the fact that it is only an approximation to the effect
of weighting on precision and that for some variables
it may be a poor one.

Discussion
The use of weighted counts that recover the overall
responding sample size as bases in tables is certainly
the simplest and most popular all-round choice and the
arguments above provide few grounds for deviating
from it in most cases. Nonetheless there are some
situations where other possibilities should be
considered.
When readers want to perform their own approximate
significance tests and the survey report recommends
the use of an appropriate design factor, use of the
actual sample sizes may be a better option, although
this will limit the kinds of further analysis of the tables
that can be done. Also the differences between the
unweighted counts and the weighted counts, scaled to
the overall sample size (b), are likely to be slight for
most subgroups unless the weighting is closely related
to the classification used to define the subgroups (as in
the Omnibus example) and the range of weights is
large.
When the main component of the design factors is due
to the weighting and no advice is provided about
appropriate design factors, the best choice is the Kish
Effective Sample Size for each subgroup. The most
accurate way to provide these is to compute them
separately for each column in each report table, but
this will create inconsistencies as well as requiring a
lot of work and opportunities for error. So the simpler
method of scaling the weights to the overall KESS is
likely to be a better practical alternative.
However these two possibilities will only rarely occur
and even when they do they may not dominate the
argument so I recommend that in future, for surveys
that are weighted but not grossed, OPCS should
standardise on one option unless there is a strong
reason to do otherwise. Although there is little to
choose between the two weighted options (b) and (c), I
propose we adopt option (b) and present weighted
bases, scaled to the responding sample size: it is the
simplest method, it is relatively intuitive, it is currently
the most widely-used option both in OPCS and
elsewhere, and it involves the smallest amount of work
and least risk of error in table preparation.
References

1. Kish, L. (1965) Survey Sampling. Wiley.

2. Elliot, D. (1990) The use of the effective sample size as an aid in designing weighted samples. Survey Methodology Bulletin, 26.

3. Dodd, T. (1990) Private Renting in 1988. HMSO.

4. Meltzer, H. et al. (1995) The prevalence of psychiatric morbidity among adults living in private households. HMSO.

5. Goddard, E. and Ikin, C. (1988) Drinking in England and Wales in 1987. HMSO.
Harmonised questions for government social surveys
Tony Manners
This article comprises the introduction to the recently
published booklet Harmonised Questions for
Government Social Surveys, Government Statistical
Service, HMSO, 1995. It explains the background,
purpose and methods of a project which is being
carried out by Social Survey Division on behalf of the
Government Statistical Service (GSS). The booklet
publishes the first results of the project: as
harmonisation is a process involving continual
updating of details and response to methodological
investigations, further work is planned.
The need for harmonisation of concepts and
definitions
The United Kingdom has a wide range of government
surveys of persons and households which provide
sources of social and economic statistics. The
decennial Census of Population is the largest and best
known but in addition government departments
commission continuous household surveys on a range
of topics. These include economic activity (Labour
Force Survey), income (Family Resources Survey,
Family Expenditure Survey), expenditure (Family
Expenditure Survey), food purchase and consumption
(National Food Survey), health (Health Survey for
England), housing (Survey of English Housing) and
transport (National Travel Survey), as well as the
multi-purpose General Household Survey which links
many of these topics and others such as education.
There are also several large-scale surveys which are
repeated regularly, such as the British Crime Surveys,
the Dental Surveys and the House Conditions Surveys.
The government also commissions single surveys from
time to time on subjects of national importance such
as the prevalence of disability and psychiatric
morbidity.
These surveys have been designed at different times,
to meet different needs, and have been commissioned
by a range of departments. Consequently, the surveys
have been developed to a significant degree in
isolation from each other. This has resulted in a lack of
cohesion, with differences arising in concepts and
definitions, in design, and in fieldwork practices. This
lack of cohesion is a source of frustration for many
users.
Investigations have been carried out by the Social
Survey Division (SSD) of the Office of Population
Censuses and Surveys (OPCS) into the scope for
harmonising a range of variables across the following
government surveys:
Family Expenditure Survey (FES)
Family Resources Survey (FRS)
General Household Survey (GHS)
Health Survey for England (HSE)
Labour Force Survey (LFS)
National Food Survey (NFS)
National Travel Survey (NTS)
Survey of English Housing (SEH)
Account has also been taken of concepts and
definitions under development for the 2001 Census of
Population. The SSD work has taken account of a
separate study carried out for the Central Statistical
Office (CSO) and the Department of Social Security
(DSS) into the feasibility of closer harmonisation of
the financial surveys which they sponsor, respectively
the FES and FRS.
The booklet Harmonised Questions for Government
Social Surveys sets out the questions which it has been
agreed with the sponsoring Departments should be
harmonised wherever possible among the key group of
government surveys of persons and households listed
above. Most of the continuous surveys update their
content at the beginning of a financial year. It is
intended that as many as possible of the questions will
be harmonised from April 1996, with all or most of the
remainder following by the time of fieldwork in April
1997.
Principles of harmonisation
Harmonisation concerns concepts which are inputs
(i.e. interview questions and answer categories) or
outputs (i.e. analysis variables derived from the inputs)
or both (e.g. the question on sex). Initially,
harmonisation has been addressed by looking at
inputs. This allows scope for users to obtain different
derivations from a common set of questions. Work has
begun to investigate the issues faced in harmonising
government statistical outputs.
Surveys’ interest in particular concepts varies
from substantive investigation to use of the data
mainly to classify persons or households to assist
analysis of research topics (e.g. the numerous
questions which comprise gross household income in
the FES and FRS as compared with the single question
on this topic in the SEH). Some surveys will require
more detail on topics than can be obtained from the
harmonised questions alone. It will normally be the
case that such surveys already ask for that detail. The
harmonised questions have been designed so that the
surveys which ask for more detail can derive them
from the further detail without adding to the length of
the interview.
The idea that a survey for which it is appropriate to
derive an equivalent variable to a harmonised question
should additionally ask the harmonised question
directly has been considered and rejected. To do so
would duplicate effort and seem inappropriate to
respondents in the context of the interview.
Harmonisation involves some compromises, since
surveys’ prime concerns vary so widely. For example,
surveys vary in the extent to which they allow
information to be given by one respondent on behalf
of another who is absent at the time of interview. It
would be unrealistic to expect Departments which
currently accept proxy data as adequate to their needs
on certain of their surveys to find the resources to
harmonise on data given in person.
The harmonised questions are intended to fit flexibly
into the designs of different surveys. There is no
intention that they should form a unified sequence
within a questionnaire. Questions and groups of
questions are intended to be placed in existing
questionnaires in the most appropriate places. This
will often mean substituting a harmonised question for
an existing one on the same topic.
Avoiding an increase in respondent burden has been a
major consideration in designing harmonised
questions. Some of the harmonised questions have
more detailed sets of answer categories than some of
the current surveys use for these topics. However,
classifying respondents’ answers to a more detailed set
does not necessarily increase the time needed to
answer a question. More detailed categories have been
included only where they will not add to interview
length. There is no intention to probe for detail which
is not volunteered, unless a question specifically
demands such probing. The aim has been to save time
by providing clear categories for the rarer answers,
where these are of interest for analysis. However,
account has also been taken of the need to make it
easy for interviewers and respondents to find the
36
major answer categories and not to lose them in a host
of details. Finding the right balance in such
compromises has an important bearing on survey
quality.
The harmonised questions have been built on the
current surveys’ experience, in particular that of the
surveys which are sponsored by Departments with the
lead interest in a topic. For example, harmonised
questions on economic activity have been based on
those developed for the CSO’s Labour Force Survey
to obtain, in particular, the International Labour
Office’s measure of unemployment.
Adopting this pragmatic approach does not preclude
continuing research on improved questions. Indeed, it
is essential to be able to take advantage of
opportunities such as the programme of census
question testing. Moreover, details of questions will
change as a result of new legislation or administrative
arrangements. Harmonisation is not a once and for all
process. The intention is that it should be subject to
continuous review, with periodic updates of the
booklet on harmonised questions.
The next steps in the process of continuous
harmonisation will be to examine the feasibility of
extension to outputs, as mentioned above, and to make
proposals for the interviewing procedures and
instructions, and for the consistency and plausibility
checks associated with the questions.
Practical assumptions
When specific harmonised questions are proposed,
practical issues of how they should be asked are
inevitably raised. The questions are framed within
specific methods (e.g. interview surveys; and at a more
detailed level, that certain questions should be asked
within special grids). All the surveys considered are
conducted by face-to-face interview, except that
second and subsequent interviews in the LFS (which
is a panel survey) may be carried out by telephone.
Most of the surveys are carried out with computer
assisted interviewing (CAI), which allows for quality
checks during the interview (for example on
completeness of information in complex grids) and
ensures that routing is correctly followed in every
case. Nevertheless, all the harmonised questions can
be asked in the proposed forms, using paper and pencil
methods, by adequately trained interviewers.
SMB 381/96
Tony Manners
Harmonised questions for government social surveys
While it might be desirable that harmonised questions
should be asked in exactly the same ways on different
surveys, it is recognised that this may not be
achievable. The emphasis in the harmonisation project
is on harmonising question wording, answer categories
and the subsamples to whom the questions are
addressed. Matters such as question sequence (for the
factual type of question involved), the use of proxy
respondents, and (at a rather more minor
methodological level) use of specific kinds of grids
were regarded as secondary in the sense that
achievement of the main type of harmonisation would
be worthwhile even if the second were not practicable.
The scope of harmonisation
Harmonisation which extends to all or nearly all
household surveys can be thought of as covering a
primary set of questions and definitions. Questions
and definitions which apply only for a selected group
of surveys, can be thought of as belonging to a
secondary set. There may be a number of secondary
sets (e.g. one might involve a set of questions on the
FES, FRS and GHS, and another a different set of
questions on the FES, GHS and SEH).
The primary set of questions and definitions
Common definitions of person and household
response units are vital steps towards harmonisation.
For government surveys, there is already a standard
definition of adults as persons aged 16 years or more.
Definition of the household response unit has differed
between surveys. Most use the household definition
which was adopted in the 1981 and 1991 Censuses of
Population, which focuses on shared living
accommodation. The FES and NFS, however, have
continued to use the previous census definition, which
reflects the response unit relevant to their
investigation, ie. the domestic consumption unit. For
the future, the intention is to seek to harmonise on the
current census definition for all surveys and, where
necessary, identify units within households, such as
consumption units and benefit units.
The topics covered by harmonised concepts and
questions in the primary set are:
- definition of the household response unit
- household composition
- sex
- age
- marital status (i.e. legal marital status)
- living arrangements (sometimes known as de facto marital status)
- householder status
- household reference person
- relationship to household reference person (optionally: to all members)
- ethnic origin
- tenure
- economic activity
- industry, occupation, employment status and socio-economic classifications
- full-time and part-time work
- income classification
The secondary sets of questions and definitions
The secondary sets of questions and definitions
have been based on the shared interests in
particular topics of different groups of surveys.
The topics covered in the secondary sets, and
the surveys to which they apply, are:
- social security benefits            FES, FRS, GHS
- consumer durables                   FES, FRS, GHS
- income from main job as employee    FES, FRS, GHS
- income from self-employment         FES, FRS, GHS
- accommodation type                  FES, FRS, GHS, SEH, NTS, HSE
- housing costs and benefits          FES, FRS, GHS
- vehicles                            FES, FRS, GHS, NTS, HSE
- length of residence                 FES, FRS, GHS, SEH, NTS, LFS
- selected job details                FES, FRS, GHS, LFS
- health                              FRS, GHS, HSE
Report on the Quality Issues in Social Surveys (QUISS) Seminar
London, 5th December 1995
“Quality – The Interviewer Effect”
Jeremy Barton
The third in a series of biannual half-day seminars
organised by Social Survey Division (SSD) was
chaired by the Head of the Division, Bob Barnes. Over
70 delegates braved the blizzard conditions to enjoy
three interesting and thought provoking talks on the
effect of interviewers on social surveys.
Introduction
Amanda White, Methodology and Sampling
Unit, SSD, OPCS
Amanda White gave a brief introduction which
emphasised the importance of the role that
interviewers play in the survey process. She
highlighted some of the responsibilities of the
interviewer and showed that these were also likely
sources of measurement errors. Three particular areas
were identified: uninformed errors, such as
misconceptions, bad habits and lack of knowledge;
interviewing and probing style, for example,
emphasising different words in the question; and the
demographic and socio-economic characteristics of the
interviewer, especially age, sex and race. All these can
be thought of as ‘interviewer effects’.
Once the source of measurement error has been
recognised, the likely effect must be considered.
Systematic measurement errors by interviewers in a
consistent direction lead to biases in survey estimates.
However even if interviewer errors are random in
nature, such that they cancel out across interviewers,
they can lead to an increase in the total variance. In
addition, the effect of individual interviewers
consistently deviating from the average can result in
correlated response variance. This can be quite
substantial in practice, but is difficult to measure
without specifically designed studies being carried out.
In order to evaluate the effect of efforts to deal with
interviewer effects, and thus reduce response variance,
it is important to have the means to measure
interviewer variance. This is best done by carrying out
experiments, such as re-interviewing (or replication),
preferably using an interpenetrating design. Other
techniques exist for measuring compliance with
training guidelines such as behaviour coding.
However these are generally both time consuming and
costly. There are a number of ways of attempting to
deal with interviewer effects at the source (i.e. in the
field): by standardising interviewer tasks and training
methods; by closely supervising and monitoring
interviewers, and providing feedback when necessary;
and also by reducing the workloads for each
interviewer (within cost restraints).
Finally the argument for standardisation versus
tailored interviewing techniques (standardising the
concepts behind the questions rather than the questions
themselves) was briefly touched on, and left for
further contemplation.
Monitoring and improving fieldwork quality
on computer assisted surveys
Chris Goodger, SSD, OPCS
Chris Goodger gave a talk on how computer assisted
personal interviewing (CAPI) can help in monitoring
and improving the quality of fieldwork. He examined
four main areas of interviewers’ work within Social
Survey Division – sampling, gaining response,
interviewing and post-interviewing tasks.
Tasks that were previously done manually (using
paper and pencil techniques) can now be done more
accurately using CAPI. For instance, whilst paper
selection forms are still used to choose individuals
from within households, CAPI programs can be used
to check the correct person has been chosen. On
surveys with complex placing patterns, computer
technology can be used to ensure changes to the
pattern still form a balanced design over the fieldwork
period. These procedures were demonstrated using
CAPI programs for the Omnibus Survey and the
National Travel Survey.
Calling at addresses at appropriate times is very
important for getting good response rates. Level of
response also shows how effective interviewers are at
gaining co-operation from the households. Details of
interviewers’ calls at each address, including time of
call, are entered onto laptop computers and returned to
the office via a modem, so the information can be
quickly analysed and fed back to the interviewers.
There are many ways CAPI can help the interviewer
during an interview. Checks that would have been
carried out during the office edit for paper and pencil
interviews (PAPI) can now be done during the
interview itself, so any inconsistencies can be resolved
with respondents there and then. It is easier to
standardise screen layout, so interviewers will know
where to find key information and thus be less likely
to overlook it. SSD is also investigating ways of
making interview instructions – currently only on
paper – accessible on the laptop.
A link between supervisors’ and interviewers’ laptops
allows a supervisor to see on their own screen
exactly what the interviewer is keying in, whilst being
able to observe the interaction between interviewer
and respondent. Consequently, they can assess the
control of the interview, the quality of probing, the
adherence to standard interviewing methods, and the
accuracy of recoding and note-making more easily
than before.
On record-keeping surveys, where further calls are
made to the household after the interview, an extra
level of checks can be built into the questionnaire for
the interviewer to activate at home. These can then be
used for the more complex checks that would otherwise
interfere with the flow of the interview.
Computer-assisted coding can be used by interviewers
as part of their work at home or, in some cases, in the
interview itself. SSD is planning to introduce
occupation coding during the interview, so that further
information can be probed for until an appropriate and
unique code is found.
Quality control at your fingertips
John Hulbert and Sue Rolfe, British Market
Research Bureau (BMRB)
Sue Rolfe and John Hulbert spoke about the recent
developments at BMRB in extending the use of CAPI
facilities from controlling the interviewer process to
monitoring and improving the quality of interviewing.
The speakers described the historical context in which
the opportunity to develop these systems came about,
and went on to report in detail how CAPI-specific data
is analysed and used as a quality control for fieldwork.
Until 1993 all BMRB surveys were conducted by
PAPI, using standard tools for ensuring survey quality
such as questionnaire editing, back-checking and
interviewer accompaniment. There was no automated
system for monitoring interviewer performance over
time.
In 1993 CAPI was introduced gradually to most
surveys, so that by April 1994 90% of face-to-face
fieldwork was conducted this way. An e-mail
communication and booking-in system was introduced
at the same time to allow overnight transmission of the
day’s work. This led to an immediate increase in the
efficiency of field management and a faster turnaround of results, and with this a means to introduce
procedures to measure interviewer performance.
A system was developed which manipulated the
information captured by CAPI into a manageable form
with the minimum of human intervention. Daily
transmission and analysis of data allows for fast
feedback to the interviewer. Specific data captured by
the computer includes claimed work versus data
received (to measure falsification); quota achievement;
length of interview; and completion of key
questionnaire components.
From the raw data, several indices of performance can
be calculated which allow for identification of poor or
good interviewers. Basic measures such as the number
of codes selected, the actual time taken for the
interview, and the expected length of interview, given
the route taken through the questionnaire, are used to
calculate more complex measures, such as speed of
coding and efficiency of probing. All measures are
normalised in order to make comparisons across all
surveys, and interviewers’ measurements are analysed
over time to allow for day-to-day fluctuations in
performances.
A measure, velocity, was constructed as the ratio of
expected time taken through the route to actual time
taken. This measures the speed of interview regardless
of the route taken. Density measures the efficiency of
probing, and is calculated as the ratio of the number of
codes selected for multi-code questions to the score for
the route. There is likely to be a trade-off between
velocity and density, but interviewers with particularly
high density and low velocity, or low density and high
velocity may need re-training to improve the speed of
interviewing and depth of probing respectively.
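A sketch of how such screening might be implemented; the function names, thresholds and figures below are illustrative assumptions, not BMRB's actual system or normalisation:

```python
# Illustrative velocity/density screening of interviewers.
# All thresholds and inputs are hypothetical values for the example.

def velocity(expected_minutes, actual_minutes):
    """Ratio of expected to actual interview time for the route taken."""
    return expected_minutes / actual_minutes

def density(codes_selected, route_score):
    """Ratio of multi-codes selected to the score for the route taken."""
    return codes_selected / route_score

def needs_retraining(v, d, lo=0.7, hi=1.3):
    """Flag the trade-off cases: fast but shallow probing, or deep but slow."""
    return (v > hi and d < lo) or (v < lo and d > hi)

print(needs_retraining(velocity(30, 20), density(5, 10)))  # True: fast, shallow
print(needs_retraining(velocity(30, 31), density(9, 10)))  # False: within range
```

In practice each measure would be normalised across surveys and averaged over time, as the talk describes, before any comparison is made.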
An average score for velocity and density combined
can be used as a general indicator of interviewer
performance. Low velocity and density would indicate
poorer interviewers and high velocity and density
would signify interviewers of a higher than average
quality. Interviewers identified as under-performers
can be back-checked or re-trained; good performers
are rewarded with promotion, and
identification of these allows for the selection of a ‘hit
squad’ of high quality interviewers to be used on
particularly difficult jobs.
Work is ongoing to increase the scope of the system to
include CATI (computer assisted telephone
interviewing), and it was felt that the benefits already
experienced by field, researchers and clients can only
increase once the full potential is realised.
The assessment of interviewer effects
Dr Dick Wiggins, City University
Dick Wiggins talked about the effect of interviewers
on the survey process as a whole and the need for a
global evaluation of the survey. He first looked at the
context within which interviewer errors are to be
found, and the implication for researchers. He then
focused on particular ways in which the interviewer
effect could be measured, in particular drawing on an
example from a study which he conducted and
analysed.
Kish’s1 schema of total survey error shows how errors
in the field make up a great deal of all variable errors.
In CAPI surveys, since there is little or no processing
error, variable errors in the field account for nearly all
non-sampling errors. No matter how hard we try to
reduce or control sources of error, we still need to
assess the impact that the various sources contribute to
our survey estimates. However, there are a number of
conflicts which researchers must contend with,
specifically, the collection and analysis of policy
relevant data versus specific methodological studies
and the reduction of response errors on the current
survey versus the improvement in quality of future
surveys. Dick argued that what is needed is a gradual
or pragmatic approach of ‘many investigations of
modest scope’ as proposed by Kish.
In principle it is possible to operationalise the optimal
design efficiency by using an index of information per
unit cost as proposed by Deming.2 The quality of
information for an item (or a set of key items) is best
measured by the reciprocal of the product of variance
and cost per unit of information.
quality of information = 1 / (variance*cost).
The variance term can include various aspects of
variable error, and could also be replaced by mean
square error to include important survey biases such as
non-response.
The interviewer is the vehicle by which the objectives
of the researcher are translated into results. This was
illustrated by looking at a motivational model of the
interview as a social interaction, where the
interviewer’s task is to motivate the respondent to
participate. But there is still little research available on
what makes a good interviewer, and what does exist
tends to be contradictory.
Dick argued that knowing the source of interviewer
effect would not necessarily eradicate it – we need to
measure it. With minimal modification to the survey
design we could use a degree of interpenetration, so
that the area effect can be controlled for. The inflation
factor of the sample variance due to interviewers can
be shown to equal,
1 + rho * (average workload size – 1)
where rho is the intra-class correlation coefficient (and
is typically between 0 and 0.11). O’Muircheartaigh3
showed that doubling the number of interviewers had
the same effect on variance as doubling the sample
size, assuming rho stays constant, though in practice
rho is likely to increase since the expertise of the field
force is diluted.
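The inflation factor can be illustrated numerically; the value of rho and the workloads below are assumed for the example, with rho held constant as in the argument above:

```python
def interviewer_deff(rho, avg_workload):
    """Inflation of sample variance due to interviewer effects:
    1 + rho * (average workload size - 1)."""
    return 1 + rho * (avg_workload - 1)

# 2000 interviews: 20 interviewers (workload 100) vs 40 interviewers (workload 50).
rho = 0.02  # illustrative intra-class correlation
print(round(interviewer_deff(rho, 100), 2))  # 2.98
print(round(interviewer_deff(rho, 50), 2))   # 1.98
```

Even a small rho inflates variance substantially at large workloads, which is why reducing workload per interviewer is one of the field-level remedies mentioned earlier.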
Historically interviewer effects were measured by
ANOVA methods. However, multi-level modelling
(MLM) provides a means of measuring and explaining
sources of variation simultaneously, e.g. by modelling
interviewer characteristics directly the variation
between the interviewers can be explained. It is also
possible to examine the differential impact of
interviewers by allowing relationships to vary across
workload clusters (known as complex level 2
variation). A study carried out by O’Muircheartaigh
and Wiggins4 on a physical dysfunction measurement
called The Functional Limitations Profile (FLP)
identified ‘average number of calls’ and ‘attitude
towards the disabled’ as positively related to the FLP
score.
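The classical ANOVA route to rho mentioned above can be sketched as follows, for the simple case of equal-sized interviewer workloads. The data are invented; a multi-level model would go much further, but the variance decomposition is the same in spirit.

```python
# One-way ANOVA estimate of the intra-class correlation (rho) for k
# interviewers with equal workloads of m responses each.
# The scores below are hypothetical and chosen only to illustrate clustering.

def anova_icc(workloads):
    """Estimate rho from between- and within-interviewer mean squares."""
    k = len(workloads)          # number of interviewers
    m = len(workloads[0])       # responses per interviewer (equal workloads)
    grand = sum(sum(w) for w in workloads) / (k * m)
    means = [sum(w) / m for w in workloads]
    msb = m * sum((mu - grand) ** 2 for mu in means) / (k - 1)
    msw = sum((x - mu) ** 2
              for w, mu in zip(workloads, means)
              for x in w) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)

# Three interviewers, four respondents each; the second interviewer's
# respondents score systematically higher, so rho comes out positive.
workloads = [[5, 6, 5, 6], [7, 8, 8, 7], [5, 5, 6, 6]]
rho_hat = anova_icc(workloads)
```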
Finally, non-response can be used as a measure of
performance by modelling response rate against
indirectly controllable variables (e.g. motivation and
experience) and directly controllable variables (e.g.
selection and training, terms of employment, call-back
strategy).5 This could be incorporated into the MLM
framework.
SMB 381/96
Jeremy Barton
Quality Issues in Social Surveys (QUISS) Seminar
References

1. Kish, L. (1962) Studies of interviewer variance for attitudinal variables. JASA, 57, 92-115.

2. Deming, W.E. (1953) On a probability mechanism to attain an economic balance between resultant error of response and the bias of nonresponse. JASA.

3. O’Muircheartaigh, C.A. (1977) Response Error. In: O’Muircheartaigh, C.A. and Payne, C. Model fitting: the analysis of survey data (Chapter 7). Wiley.

4. O’Muircheartaigh, C.A. and Wiggins, R.D. (1981) The impact of interviewer variability in an epidemiological survey. Psychological Medicine, 11, 817-824.

5. Thomsen, I. and Siring, E. (1983) On the causes and effects of non-response: Norwegian experiences. In: Incomplete data in surveys (Chapter 2), Vol 3. Academic Press.
Report on the 6th International Workshop on Household Survey Non-Response. Helsinki, 25-27 October 1995
Kate Foster
This year’s workshop was hosted by Statistics Finland
and drew delegates from ten European countries,
Canada and the United States. The Workshops are
intended to bring together survey practitioners who are
actively engaged in work on non-response so that they
can share results, discuss ideas together, and make
progress towards a co-ordinated research agenda.
Some members of the group have been involved
throughout the past five years while others attend
workshops only while they are working on projects
related to non-response. This makes for a varied
programme with some papers describing the
experiences in countries which are not regularly
represented and others giving the next instalment for
continuing projects.
This year I was the only representative from Social
Survey Division (SSD) and I presented a paper on
using the results of our census match project to reweight the Family Expenditure Survey. John King of
the Central Statistical Office and Pamela Campanelli
of SCPR also attended from the UK.
Format
The Workshop began and ended with keynote
speakers, Carl-Erik Sarndal from the University of
Montreal, and on secondment to Statistics Finland,
started proceedings with a paper looking at the use of
imputation to compensate for unit non-response. This
gave examples of the methods available to Statistics
Canada in their generalised edit and imputation
system, and considered the effects on variance of
using imputation rather than weighting.
At the end of the Workshop Bob Groves, of the
University of Michigan Survey Research Center, gave
a presentation on problems of methodological
innovation in government statistical agencies. This
identified various barriers to innovation, including
how budgets are set, the existence of distinct research
and production cultures in statistical agencies and the
lack of formal academic training that is directly
relevant to the work of official statisticians.
Nonetheless, he suggested that some recent
developments might offer a stimulus to innovation: the
quality movement, new data collection technologies
and being more directly exposed to external forces, for
example through the consumer movement or increased
competition.
The rest of the Workshop was split into eight main
sessions following broad themes, with two or three
papers presented in each. There were also two
opportunities to join small group discussions on a
variety of topics and, on the final day, a plenary
session to discuss research ideas and the agenda for
the next workshop. This report concentrates on papers
and areas of work which might be of most interest to
other members of SSD.
Interviewer effects
Pamela Campanelli described a number of projects
concerning interviewer effects which are currently
underway at SCPR. These include a study which aims
to isolate the influence of interviewers on survey
response from other factors such as characteristics of
the area and of the sampled individual or address.
They plan to use data from the British Household
Panel Survey, which has an interpenetrated design
with at least two interviewers per quota, and will use
multi-level modelling for the analysis. Other projects
will compare call strategies and patterns of contact for
interviewers in different organisations and will
investigate the persuasion strategies which
interviewers use. The latter study will include a
methodological element to compare different methods
of collecting information about the initial interaction
between the interviewer and sampled person.
A split-sample study from Finland compared attitudes
towards the role of interviewer among professional
interviewers and public health nurses working on the
same survey. Response rates for professional
interviewers were substantially higher than for the
nurses (88% compared with 74%) and the two groups
showed very different scores on a selection of Likert
scale questions on attitudes to voluntariness and
persuasion. A general attitude index was derived by
principal components analysis of the set of attitude
measurements. Professional interviewers were
characterised by scoring low on voluntariness and high
on persuasion and the reverse pattern was seen for
nurses.
Edith De Leeuw of the University of Amsterdam
presented the results of multi-level analysis to identify
the effect of the interviewer on gaining co-operation in
a survey of the elderly. For this survey she had a
number of items of information on sampled
individuals and some measures for interviewers,
including information on their attitudes and an
evaluation by supervisors. A number of dichotomous
dependent variables relating to the response process
and final outcome were defined, for example whether
the sampled person co-operated immediately, whether
they entered into discussion, and whether they co-operated eventually. The analysis indicated that
interviewer characteristics had little effect on the
probability of the various outcomes investigated.
The characteristics of non-respondents

Over the years a number of papers have presented information on the characteristics of non-respondents in different countries. In some cases, as in our own recent study, the data are derived from linkage to the population census, but other countries have access to more varied data on population registers.

At this year’s workshop, researchers from Slovenia presented analysis of non-respondents on their Labour Force and Family Budget Surveys by linkage to various registers. Many of the findings were similar to results for our census-linked study: lower response for people living in multi-unit accommodation, older people, single-person households and those in urban areas. They had also linked the survey samples with income tax and unemployment registers so were able to test directly for bias in key estimates on those surveys; in our studies we can only derive an indirect measure of bias by looking at household characteristics. Results showed that higher income households had significantly lower response rates on the Slovenian Family Budget Survey.

Researchers from Statistics Finland had learnt more about non-respondents to their Survey of Living Conditions by linkage to register sources giving information on economic problems. This showed that non-response among men in particular was strongly associated with evidence of economic problems such as unemployment, low incomes and over-indebtedness.

Mick Couper from the Survey Research Center at Michigan presented some more general findings based on the US National Election Study. The aim was to try to identify any indicators that might be available for all sampled units and which were associated both with the propensity to respond and with major survey variables. Weighting schemes using these indicators to define weighting classes should be particularly effective in reducing bias. The analysis centred on four indicators derived from field administrative records or a record of the topics mentioned in the initial interaction between the interviewer and sampled person. The most successful of the indicators tested was whether the respondent had said at any stage that they were “not interested” in the survey topic: this was a good predictor of eventual response and also associated with substantive responses to the survey. If the finding stands for other surveys then indicators of reluctance because of survey topic might be a powerful variable to use in weighting schemes.

Weighting for non-response
This year there were few papers concerned with
adjustment to compensate for non-response. I
presented a paper which used our census-linked
dataset to compare the effects of different methods of
weighting the FES. This showed that, for this survey,
using the response propensities derived from the
census-matched data was more effective in reducing
bias in the characteristics of the achieved sample than
were more conventional weighting methods such as a
simple variant of post-stratification, adjusting for
stratum response rates or using data on the number of
calls made to contact the household. Census-based
weighting also had most effect on survey measures of
household expenditure. We intend to develop this
work further in order to compare the effects of more
sophisticated population or sample-based weighting
and to allow for different methods being used in
combination.
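The general mechanics of weighting-class adjustment can be sketched briefly. The classes and counts below are hypothetical; on the FES the classes were formed from census-matched response propensities rather than a single characteristic.

```python
# Non-response weighting classes (invented data): each respondent is
# weighted by the inverse of the response rate in their class, so classes
# that respond poorly are weighted up to restore their share of the sample.

from collections import defaultdict

def class_weights(sample):
    """sample: list of (weighting_class, responded) pairs, responded in {0, 1}."""
    issued = defaultdict(int)
    responded = defaultdict(int)
    for cls, resp in sample:
        issued[cls] += 1
        responded[cls] += resp
    return {cls: issued[cls] / responded[cls]
            for cls in issued if responded[cls]}

# Hypothetical issued sample: owners respond 3 in 4, renters 2 in 4.
sample = [("owner", 1), ("owner", 1), ("owner", 1), ("owner", 0),
          ("renter", 1), ("renter", 0), ("renter", 0), ("renter", 1)]
weights = class_weights(sample)
# owners get weight 4/3; renters, who respond less well, get weight 2.0
```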
Bob Groves reported on his latest work on an
approach to post-survey adjustment which is
motivated by theoretical perspectives on survey
participation. The method involves using data from a
variety of sources: characteristics of the sampled
person, stratum characteristics from the sampling
frame and field information about the interview
process and the initial interaction between the
respondent and interviewer. Using these varied data
sources, he modelled the probability of contact and
then the likelihood of participation given contact on a
survey of the elderly. Weighting was therefore in two
separate stages. As with our FES experiment, he
found that survey estimates showed little change as a
result of weighting but that the changes were in the
expected direction.
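The two-stage structure of this adjustment is simple to state: the overall response propensity is the product of an estimated contact probability and a cooperation probability given contact, and the weight is its reciprocal. The probabilities below are placeholders for model predictions, not figures from the study.

```python
# Two-stage post-survey adjustment (illustrative values only): weight each
# respondent by 1 / (P(contact) * P(cooperate | contact)), with each
# probability estimated from its own model.

def two_stage_weight(p_contact, p_cooperate_given_contact):
    return 1.0 / (p_contact * p_cooperate_given_contact)

# e.g. a sampled person estimated as easy to contact but reluctant to take part:
w = two_stage_weight(p_contact=0.9, p_cooperate_given_contact=0.5)
```

Separating the two stages lets each be modelled with the data best suited to it: frame and call-record data for contact, interaction data for cooperation.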
Other papers
Researchers from Statistics Netherlands had run a
split-sample experiment to test a different advance
letter on their National Travel Survey. This letter
included an informed consent paragraph which
allowed the Ministry of Transport to have access to
less restricted (although still anonymous) microdata
from the Survey. Analysis showed that there were no
significant differences in response times or response
rates between the two sections of the sample.
We also heard about various experiments on monetary
incentives which had been carried out over a number
of years by the US National Center for Health
Statistics. Experiments on their Health and Nutrition
Examination Survey clearly showed that increasing the
remuneration for the health examination resulted in
improved response but also that appointments were
easier to make and were less likely to be broken,
which reduced other costs. On a survey of HIV risk
behaviour, the incentive to first-time non-respondents
was doubled in a follow-up study and there was higher
reporting of risk behaviour among these refusal
conversions. This demonstrated how non-response
bias could be reduced by allowing higher payments to
those most reluctant to participate.
Statistics Sweden have embarked on a project to
produce a handbook on current best methods for non-response reduction. This has been proposed because of
pressures on response rates over recent years and the
aim is to disseminate knowledge within the agency
about normal organisational practice and the results of
recent research in relevant areas.
Finally, there was an update on the first year of
computer assisted interviewing on the Canadian
Labour Force Survey. Although response had
decreased at the time that CAI was introduced, only a
small element was attributable to technical problems
and most of the fall was due to other changes in
procedures and in sample design. Statistics Canada are
planning to carry out more analysis of data collected
by their case management systems, looking at number
and timing of calls and duration of interviews. They
will investigate whether the findings could inform
interviewer training or add to the field indicators
which are monitored continuously.
NEW METHODOLOGY SERIES

NM1  The Census as an aid in estimating the characteristics of non-response in the GHS. R Barnes and F Birch.
NM2  FES. A study of differential response based on a comparison of the 1971 sample with the Census. W Kemsley, Stats. News, No 31, November 1975.
NM3  NFS. A study of differential response based on a comparison of the 1971 sample with the Census. W Kemsley, Stats. News, No 35, November 1976.
NM4  Cluster analysis. D Elliot. January 1980.
NM5  Response to postal sift of addresses. A Milne. January 1980.
NM6  The feasibility of conducting a national wealth survey in Great Britain. I Knight. 1980.
NM7  Age of buildings. A further check on the reliability of answers given on the GHS. F Birch. 1980.
NM8  Survey of rent rebates and allowances. A methodological note on the use of a follow-up sample. F Birch. 1980.
NM9  Rating lists: Practical information for use in sample surveys. E Breeze.
NM10 Variable Quotas – an analysis of the variability. R Butcher.
NM11 Measuring how long things last – some applications of a simple life table technique to survey data. M Bone.
NM12 The Family Expenditure and Food Survey Feasibility Study 1979-1981. R Barnes, R Redpath and E Breeze.
NM13 A Sampling Errors Manual. R Butcher and D Elliot. 1986.
NM14 An assessment of the efficiency of the coding of occupation and industry by interviewers. P Dodd. May 1985.
NM15 The feasibility of a national survey of drug use. E Goddard. March 1987.
NM16 Sampling Errors on the International Passenger Survey. D Griffiths and D Elliot. February 1988.
NM17 Weighting for non-response – a survey researchers guide. D Elliot. 1991.
NM18 The Use of Synthetic Estimation techniques to produce small area estimates. Chris Skinner. January 1993.
NM19 The design and analysis of a sample for a panel survey of employers. D Elliot. 1993.
NM20 Convenience sampling on the International Passenger Survey. P Heady, C Lound and T Dodd. 1993.
NM21 Sampling frames of communal establishments. S Bruce. 1993.
NM22 Feasibility study for the national diet and nutrition survey of children aged 1½ to 4½ years. A White, P Davies. 1994.

Prices:
NM13 £6.00 UK - £7.00 overseas
NM17 £5.00 UK and overseas
NM1 to NM12 and NM14 to NM16 £1.50 UK - £2.00 overseas
NM18 to NM22 £2.50 UK - £3.00 overseas

Orders to: New Methodology Series, Room 304, OPCS, St Catherines House, 10 Kingsway, London WC2B 6JP