
The electoral register as a sampling frame
Kate Foster
1. Introduction
The Postcode Address File (PAF) and the electoral
register (ER) are the most complete and accessible
national frames of residential addresses in Great
Britain and both are extensively used for drawing
random samples for general population surveys.
Although an ideal sampling frame would cover the
whole population of interest it is well known that
both of these frames are, in practice, incomplete.
Among survey practitioners there is considerable
interest both in monitoring changes in the coverage
of the PAF and the ER and also in defining the
characteristics of those who are omitted, since this
may be a source of bias in survey results.
2. The use of the electoral register as a sampling frame
The electoral register is compiled as a list of all
people eligible to vote in the United Kingdom; this
includes citizens of the Commonwealth and the Irish
Republic as well as of Great Britain and Northern
Ireland who are aged 18 or over or who will become
18 during the life of the register. The register is
compiled on October 10th each year, and is in force
from February 16th of the following year for a period
of one year. Because of the time required to bring
together in one place the different parts of the new
register, it is normally available for sampling
purposes roughly from April of the year in which it
comes into force through to March of the following
year (i.e. from April 1991 to March 1992 for the 1991 register).
The coverage of both sampling frames is currently
being assessed by Social Survey Division (SSD)
using the sample of households selected for the
Electoral Register Check, which was carried out by
SSD in conjunction with the Census Validation
Survey. This paper reports on the coverage of the
1991 electoral register as a sampling frame of the
private household population and updates a similar
analysis presented as part of the report on electoral
registration in 1981.1 The coverage of the PAF will
be dealt with in a later paper.
The register is known to be an incomplete list even
of electors, and it will obviously not list adults who
are not eligible to vote. The 1991 Electoral Register
Check2 showed that, for Great Britain as a whole,
7.1% of eligible people who were recorded in the
Census were not included on the register, but that
the non-registration rate varied by the individual’s
age, length of residence at that address, ethnic origin
and region of residence.
Although there is general interest in monitoring
changes over time in the coverage of sampling
frames, there was particular concern that the
electoral register’s coverage might have suffered
over recent years because some individuals wanted
to avoid registration for the community charge and
hence did not register as electors. The check on the
1991 register reported in this paper showed that
94.7% of households were in addresses listed on the
register and that 95.4% of the usually resident adult
population were in listed addresses. These results
indicate a slight deterioration in the coverage of the
frame since 1981 when 96.4% of households and
96.5% of adults were in listed addresses.
Although the register is primarily a list of adults, it is
preferable to use it as a sampling frame of addresses
because the coverage of addresses is known to be
more complete than the coverage of named electors.
This is primarily because an address is listed so long
as at least one elector is registered there, but
coverage of addresses may also be improved
because of the practice in some areas of including
the same information as on the previous year’s
register if no form has been returned by the
household.
The register's coverage was lower among single adult households, those in privately rented accommodation, those in London, especially Inner London, and, elsewhere, in non-metropolitan areas of Great Britain. Coverage among adults was lower for individuals who had moved in the previous 12 months, those in the 20-29 age range, and among non-White ethnic groups.

An assessment of the deterioration of the register as a frame of households over time gave similar results to the 1981 study, suggesting that coverage might decrease by around 1% over the year in which the register was available for use.
3. The method of assessing coverage
The usual method of assessing the coverage of a
sampling frame is to identify a representative
sample of the target population drawn from an
independent source and to check whether the
sample members are covered by the frame. In 1991,
as in 1981, a suitable sample for an assessment of
the electoral register as a sampling frame was
provided by the sample for the Electoral Register
Check (ERC). This survey was carried out alongside
the Quality Check element of the Census Validation
Survey (CVS) which used a sample of private
households drawn from census records.
The sample design for the CVS was a multi-stage probability sample to select 6,000 households in Great Britain that had returned a census form. The sampled households contained about 11,300 usually resident adults aged 17 or over on 15 February 1991. Visitors staying at the sample of addresses on census night (21 April 1991) were excluded from the analysis as they were also listed on the census form for their usual residence. Since the CVS over-sampled areas where enumeration was expected to be difficult, the achieved samples of households and of adults were weighted in the analysis to reflect their different probabilities of selection. The tables in this paper give weighted bases only.

On the Electoral Register Check, interviewers transcribed census information onto the questionnaire before checking the entry for the household on the electoral register. This information was therefore available for all cases regardless of their response to the ERC interview. Further items of information relating to informants' eligibility for inclusion on the register were collected in the interview, including the previous address of adults who had moved since the qualifying date in October.

The results in this paper are based on households which returned a census form, although the sample of adults also includes any people in that sample of households who were identified by the CVS as not having been enumerated on the Census. The undercount for the Census is estimated to be 2% of the resident population (around one million people), about one fifth of whom were identified by the CVS as being missed from enumerated households. Thus about 1.6% of the resident population were missed both in the Census and in this element of the CVS. Insofar as there was deliberate evasion of both the Census and the CVS, it is likely that the ERC sample will tend to underestimate the level of non-registration of individuals on the electoral register. It is, however, probable that this under-enumeration has much less effect on the register's coverage of addresses than of named individuals.
4. The coverage of households and of adults
Although the electoral register is mainly used as a
sampling frame of addresses, surveys are generally
concerned with households or adults. The adequacy
of the electoral register as a sampling frame is
therefore assessed by the proportion of households
and of adults that are included in listed addresses.
Both coverage rates are shown in Table 1.
The study showed that 94.7% of all private
households that were occupied on census night were
at addresses listed on the electoral register. This
coverage rate for households compares with a figure
of 96.4% for the 1981 register, so there is evidence
of a slight deterioration in the register as a sampling
frame for households.
The coverage rate for adults is based on those who
were defined as being usually resident at the ERC
sample of addresses. Some 95.4% of this sample of
adults were in addresses listed on the register. Most
of the sample of adults (88.0%) were themselves
listed on the register at their April address and a
further 7.4% lived in addresses that were listed even
though they themselves were not. The difference
between the coverage rates for adults and
households reflects the variation in rates of coverage
by household size, as shown in Table 2.
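The headline rates can be reproduced directly from the weighted counts in Table 1 below; the following lines simply restate the arithmetic.

```python
# Weighted counts from Table 1.
adults_listed_selves = 8720    # on the register at their April address
adults_listed_address = 736    # not listed themselves, address listed
adults_total = 9907
households_listed = 4862
households_total = 5133

print(f"Adult coverage: "
      f"{100 * (adults_listed_selves + adults_listed_address) / adults_total:.1f}%")
print(f"Household coverage: {100 * households_listed / households_total:.1f}%")
# -> 95.4% and 94.7%, the rates quoted above.
```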
Table 1 The registration of adults and their addresses in April 1991

                                           Households           Adults
                                           Percentage  Number   Percentage  Number
Person on the register at April address    na          na       88.0        8720
Person not on the register but April
  address is                               na          na       7.4         736
Total in addresses on the register         94.7        4862     95.4        9456
Address not on the register                5.3         271      4.6         451
Total (weighted bases)                     100         5133     100         9907*

* The sample base for adults is all aged 17 years or over on 15.2.91 who were usually resident on census night in the sampled households.
Table 2 Households and adults on the register by the number of usually resident adults in the household

Number of adults usually     Households                    Adults
resident in the household    Proportion in    Base=100%    Proportion in    Base=100%
                             addresses on                  addresses on
                             the register                  the register
One                          92.2%            1552*        93.2%            1527
Two                          95.5%            2698         95.5%            5396
Three                        97.4%            615          97.4%            1844
Four or more                 95.7%            268          95.3%            1140
All households/adults        94.7%            5133         95.4%            9907*

* Includes a small number of households with no usually resident adults under census definitions.
5. Variation in the coverage of households and adults

We now look at variation in the register's coverage by selected characteristics of households and of individuals. The conclusions reached are broadly similar to those reported in the 1981 assessment of the electoral register but some new analyses are also presented.

Characteristics of the household

The register's coverage of households was lowest (92.2%) among those comprising only one adult. Coverage increased with household size up to 97.4% for households with three usually resident adults but was slightly lower (95.7%) for households with four or more adults (Table 2). An improvement in coverage with increasing household size is to be expected since households comprising more adults will generally contain more electors and hence there is a greater chance that one elector is listed. The fact that this improvement in coverage did not extend to the largest households, comprising four or more adults, suggests that such households may differ from smaller ones in other significant respects.
Table 3 Households and adults on the register by region and country

                             Households                    Adults
Region/country               Proportion in    Base=100%    Proportion in    Base=100%
                             addresses on                  addresses on
                             the register                  the register
North                        95.1%            292          96.2%            537
Yorkshire & Humberside       97.3%            483          97.3%            939
North West                   96.9%            563          97.2%            1045
East Midlands                93.5%            379          95.2%            744
West Midlands                96.5%            450          97.1%            889
East Anglia                  96.3%            191          96.3%            387
South East (exc London)      93.9%            965          94.5%            1886
South West                   93.6%            457          94.8%            909
London                       91.2%            622          91.8%            1171
  Inner London               87.3%            234          87.9%            426
  Outer London               93.7%            388          94.0%            745

Regions exc London
  Metropolitan               97.1%            1012         97.6%            1930
  Non-Metropolitan           94.4%            3047         95.2%            5939

England                      94.7%            4402         95.3%            8505
Wales                        93.0%            280          94.9%            534
England & Wales              94.6%            4682         95.3%            9039
Scotland                     96.3%            451          97.1%            868

Great Britain                94.7%            5133         95.4%            9907
Table 4 Households and adults in addresses on the register by housing tenure

                             Households                    Adults
Housing tenure               Proportion in    Base=100%    Proportion in    Base=100%
                             addresses on                  addresses on
                             the register                  the register
Owned outright               97.9%            1214         98.5%            2234
Buying with a mortgage       95.7%            2168         96.3%            4710
Local authority rented       96.7%            1051         97.1%            1793
Other rented                 82.5%            670          82.8%            1144
All households/adults        94.7%            5133*        95.4%            9907*

* Includes a few cases where housing tenure was not known.
The register's coverage of households and adults varied according to region of residence, as shown in Table 3, which also gives summaries by country. At the national level the electoral register gave slightly better coverage of households, and also of adults, in Scotland than in England and Wales; 96.3% of households in Scotland were in addresses listed on the register compared with 94.6% in England and Wales. Within England, London had the lowest coverage rate but this was markedly worse for households in Inner London (87.3%) than in Outer London (93.7%). Outside London, coverage tended to be better in metropolitan areas: 97.1% of households in metropolitan areas (excluding London) were in listed addresses compared with 94.4% of households in non-metropolitan areas.
Table 4 shows coverage by housing tenure group. The strongest pattern to emerge is the lower rate of coverage for households in the privately rented sector, which includes those renting accommodation with their job or business as well as those living in furnished or unfurnished rented accommodation. Around 83% of households and of adults in this tenure group were in addresses listed on the register compared with more than 95% for each of the other major tenure groups: those who owned their accommodation outright, those buying on a mortgage and those living in local authority rented accommodation.
Table 5 Adults whose April address was on the register by how recently they had moved to the address

Whether had moved in 12 months     Adults on the     Adults whose April    Base = 100%
before census                      register at       address was on
                                   April address     the register
Had moved in previous 12 months    31.7%             76.1%                 993
Of whom:
  Had moved in 6 months since
  qualifying date                  5.6%              69.8%                 415
  Had moved in 6 months before
  qualifying date                  81.9%             93.0%                 356
Had not moved in previous
12 months                          94.3%             97.6%                 8914
All adults                         88.0%             95.4%                 9907
Table 6 Adults whose April address was on the register by age

Age            Adults on the     Adults whose April    Base = 100%
               register at       address was on
               April address     the register
17             71.1%             95.3%                 154
18-19          81.6%             94.5%                 346
20-24          67.7%             89.2%                 964
25-29          75.9%             91.7%                 1007
30-49          89.5%             95.2%                 3503
50 and over    96.1%             98.3%                 3930
All adults     88.0%             95.4%                 9907*

* Includes a few individuals whose age was not known.
Characteristics of individuals
We now turn to variation in the register’s coverage of
adults by selected characteristics of the individuals
involved. The tables also show, for reference, the
proportion of adults in the different categories who
were themselves listed on the register.
Table 5 looks at the likelihood of adults being listed on the register, or of living in an address that was listed, by whether and when they had moved in the previous year. As would be expected, only a very small proportion (5.6%) of those adults who had moved in the 6 months since the qualifying date for the register, that is between October 10th and April 21st, were themselves listed on the register. Those who had moved in the six months before the qualifying date were also less likely than non-movers to be listed; 81.9% were on the register compared with 94.3% of non-movers.

With respect to the use of the register as a sampling frame, about three quarters (76.1%) of those who had moved in the previous year, and 69.7% of those who had moved in the previous six months, were in addresses that were listed on the register compared with 97.6% of non-movers. There are a variety of reasons why the April addresses of movers might not have been listed: the addresses may not have existed or may have been unoccupied at the qualifying date or they may have been occupied by people who were either ineligible for inclusion on the register or were eligible but not listed.
As found in previous checks, adults under the age of 30 were not only less likely than older people to be listed themselves on the register but were also less likely to live at addresses that were listed. Table 6 shows that the proportion of individuals who were themselves listed was lowest for the 20-24 age group (67.7%), rising to 75.9% among the 25-29 age group, but most of those who were themselves not listed lived in addresses that were listed: 89.2% and 91.7% respectively of adults in these age groups lived in addresses that were listed on the register. Although only a relatively small proportion (71.1%) of 17 year olds were themselves listed on the register, presumably because it was the first year in which they were eligible for inclusion, they were no less likely than all adults to live at addresses that were listed.

Table 7 explores whether the lower coverage of younger age groups is related to their greater mobility. The second column of the table gives the proportion of adults in each age group who had moved in the previous 12 months and shows clearly that adults in the 20-24 age group (28%) and those aged 25-29 (23%) were most likely to have moved; 10% of all adults had moved in that period. As we have already seen, the coverage rate for those who had moved in the 12 months before the census was much lower than for non-movers (76.1% compared with 97.6%) and Table 7 shows the coverage rates by age for these two groups. There was little variation with age in the coverage of movers while coverage rates for non-movers were only slightly lower for adults in their twenties. Thus most of the under-representation of adults in their twenties can be explained by the group's higher mobility although non-movers in these age groups were also less likely than non-movers in other age groups to be in listed addresses.

Table 7 Adults whose address was on the register by age and whether they had moved in the previous year

                              Adults whose April address
               Proportion     was on the register              Base = 100%
Age            of movers      Not moved   Moved   All          Not moved   Moved   All
17             10%            98.4        (10)    95.3         139         (15)    154
18-19          16%            98.6        72.7    94.5         291         55      346
20-24          28%            94.3        76.1    89.2         694         270     964
25-29          23%            95.7        77.7    91.7         781         227     1007
30-49          9%             97.0        75.3    95.2         3205        298     3503
50 and over    3%             98.9        77.7    98.3         3803        127     3930
All adults     10%            97.6        76.1    95.4         8914*       993*    9907*

* Totals include a few individuals whose age was not known.

Finally, Table 8 looks at variation in the electoral register's coverage of adults according to their ethnic group. Eligibility for inclusion on the register as an elector is defined by an individual's citizenship rather than their ethnic group but it is, of course, likely that there is an association between these two attributes. The census question gave a choice of nine ethnic groups which, because of the relatively small number of ethnic minority households in the survey sample, have been grouped into four categories. These are White, Black (including Black-Caribbean and Black-African), Indian (including Pakistani and Bangladeshi), and Other (including Chinese). Those classified as White by ethnic group were the most likely group to be living at addresses listed on the register but there was little difference in the coverage of those classified as Black, Indian, or Other. Some 95.9% of White adults were in addresses that were listed compared with 84%-88% of the other groups.

Table 8 Adults whose April address was on the register by ethnic group

Ethnic group    Adults on the     Adults whose April      Base = 100%
                register at       address was on
                April address     the register
White           88.8%             95.9%                   9380
Black           69.6%             88.3%                   170
Indian          79.9%             87.1%                   274
Other           64.8%             84.2%                   82
All adults      88.0%             95.4%                   9907

6. Movement and deterioration of the frame
The register will deteriorate as a frame of household
addresses to the extent that new addresses become
occupied over time; such addresses may either be new
buildings or existing addresses which were unoccupied
at the time that the list was drawn up. The coverage rate
may also be affected if some addresses that were
occupied at the time that the list was drawn up become
unoccupied over time, and hence ineligible for
inclusion in the sampling frame. This will result in an
increase in the proportion of “deadwood” among listed
addresses but such changes will only cause a
deterioration in the coverage of the frame if listed
addresses are more likely than unlisted addresses to be
affected.
The main effect over time on the register’s coverage of
occupied addresses is, therefore, the extent to which
newly constructed and previously vacant addresses are
(re-)occupied. It was not possible to pursue analyses of
this type using the ERC data-set since that survey did
not collect information on the history of the addresses
occupied by recent movers. Some relevant information
is available from the Department of the Environment (DOE), which collects data on house construction,
although not on the proportion of dwelling units which
are temporarily vacated or re-occupied over a given
period. The DOE data show that house construction
resulted in a 0.9% increase in the dwelling stock over
the year in which the 1991 register was in use as a
sampling frame (April 1991 to March 1992).
Deterioration in the coverage of individuals
The ERC data does, however, enable us to make a rough estimate of the deterioration over time in the frame's coverage of adults since it includes information on the previous address and date of move of adults who had moved into their census address during the previous year. In general it would be expected that the movement of individuals during the lifetime of the register would have a much greater effect on the accuracy of the register (i.e. whether adults are listed at their latest address) than on its coverage (i.e. whether they are living at an address which is listed on the register at all).

The deterioration in the register's coverage of adults over time was estimated with reference to those adults who had moved between the qualifying date for the register (in October 1990) and the date of the census (in April 1991). The frame's coverage of adults will deteriorate to the extent that these adults moved from listed addresses to unlisted addresses, but this will be offset to the extent that adults moved from unlisted to listed addresses. An estimate of the net percentage change in coverage can be obtained by expressing these two groups as a percentage of all adults in the sample.
There were some problems in carrying out this analysis on the 1991 ERC database due to the large number of
cases in which either the date of moving or the October
address was missing. Analysis on those cases for which
complete information was available gave similar results
to those obtained in the 1981 study, which suggested a
deterioration of about 0.4% in the frame’s coverage of
adults over a six month period. If movement is assumed
to have continued at the same rate over the period in
which the register was available for use, then this
would imply a deterioration in the coverage of adults of
about 1% over the 12 month period from April 1991.
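The net-change estimate described above amounts to a few lines of arithmetic. The mover counts below are invented for illustration; the published figure rests on the subset of ERC cases with complete moving dates and October addresses.

```python
# Hypothetical six-month mover flows in a sample of 10,000 adults.
listed_to_unlisted = 55    # moved from a listed to an unlisted address
unlisted_to_listed = 15    # moved from an unlisted to a listed address
all_adults = 10_000

net_change = 100 * (listed_to_unlisted - unlisted_to_listed) / all_adults
print(f"Net six-month deterioration: {net_change:.1f}%")           # 0.4%
# Assuming movement continues at the same rate over the register's
# period of use, the implied twelve-month figure is roughly double.
print(f"Implied twelve-month deterioration: {2 * net_change:.1f}%")
```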
References

1. Todd J and Butcher B. Electoral registration in 1981. OPCS (1982).
2. Smith S. Electoral registration in 1991. OPCS (1993).
The use of substitution in sampling
Dave Elliot
1. Background
On a number of occasions recently, the issue has arisen
of whether and how to use substitution to make up for a
shortfall in sample numbers, from whatever cause.
Many people in SSD have a knee-jerk reaction at the
mere suggestion, believing it to be inextricably
associated with quota sampling and massaging
response rates and therefore not the sort of method that
a good survey organisation should ever contemplate
using. In this paper I take a different view – that
substitution for non-respondents when used with proper
controls may sometimes be a useful addition to the
survey sampler’s toolkit. However, on other occasions,
especially its superficially more innocuous use in
replacing ineligible units, the method may sometimes
result in significant biases.
By substitution I mean the replacement of some specific unit in the set sample which fails to yield a usable response with another unit from the population. I shall illustrate this issue with four recent examples before moving on to generalities.
2. Four examples

2.1 Survey of the homeless
The first comes (indirectly) from the planned survey of
Psychiatric Morbidity, as part of which OPCS plan to
include the homeless. This mobile group is particularly
problematic to sample for a number of obvious reasons
and the planned design draws heavily on the lessons
learnt in a pioneering survey of single homeless people undertaken by SCPR.1 In discussing the details of the
methods used in sampling in short-stay hostels, Lynn
describes how the establishments were sampled and
then a random sample of beds was used in the selected
hostels. Substitution was used at both stages – hostels
that declined to co-operate were substituted (twice in
some cases) and sampled beds that were unoccupied on
the date of the survey were substituted. Occupants of
sampled beds that refused the interview or who could
not be located were however not substituted. Likewise
respondents who were not eligible for the survey were
not substituted.
In justification of this procedure, Lynn writes: “These strict probability sampling methods were deemed necessary in order to ensure the accuracy of the survey results. Allowing for refusals or non-contacts would have biased the sample towards more co-operative and more available respondents.”

In all cases the substitute was randomly selected using methods similar to those used in selecting the initial sample.

Another part of the sample consisted of users of day centres. In this case people entering the centres were selected using a constant interval; ineligibles and non-respondents were noted but disregarded and the sampling was continued until the set sample size was achieved. Despite the statement that “No substitutes were selected to replace refusals or people who were screened out”, the procedure described can be interpreted as substitution by another name.

2.2 Sampling institutions

A second example concerns some advice I gave on sampling for a planned survey of children. The sample design has three stages: children within schools within a stratified sample of local authority areas. The plan is to select just one secondary and a number of primary schools in the selected areas and then seek the co-operation of the schools in selecting and interviewing children. If a secondary school declines to take part in the survey, I advised selecting a substitute but the project officer was not happy with this advice, believing that the substitution method is fundamentally flawed.

2.3 Sequential sampling within institutions
My third example concerns a design we suggested in
response to an invitation to tender for a survey of fees
paid to private residential homes. The specification
suggested a design in which 5 eligible residents were
selected from each of a sample of institutions. Since
eligibility could not always be easily determined prior
to selection, the suggested method was to sample
residents one at a time, determine their eligibility and
continue sampling until the target number was
achieved. This is an example of “sequential sampling”
and mirrors closely the method of sampling visitors to
day centres described in the first example.
It was particularly problematic in this case because the
primary aim of the survey was to produce grossed
estimates of total expenditure and there was no reliable
independent measure of the size of the eligible
population. Consequently it was essential to know and
control the selection probabilities in order to gross up
the survey means. With the method suggested, these probabilities could not be determined nor even estimated without bias, which in turn would bias the grossed estimates. It may be useful to run through the argument for this assertion in a simple case before moving on to a more general discussion of the effects of substitution in different situations.
Suppose we need a sample of 5 eligible residents from a home with 10 residents, exactly 5 of whom are eligible. We are aiming to produce an estimate of the probability of sampling any eligible resident. The true value of this is 1, as we are taking a sample of 5 eligible residents from a population of only 5. So using the sequential sampling method suggested, each eligible resident must eventually be selected.

The sample could be achieved in a number of ways, the two most extreme of which are that the first 5 residents selected are all eligible, or that the first 5 selected are all ineligible. In the first case we would estimate the selection probability as only 1/2, as we would assume that the non-selected residents are similar to those selected and are also eligible. In the second case we should estimate the selection probability as 1, as we will have actually selected all 10 residents and will know that only 5 are eligible, and that all of these are bound to be chosen in the sample. Obviously in no case should we estimate a probability greater than 1. Thus averaging over all the possible sequences would produce a mean value less than one and thus the estimated probability is biased downwards in this case.

The effect of this underestimation of the selection probabilities is that population totals will inevitably be overestimated (as they are obtained by dividing the sample totals by the erroneous probabilities). The extent of the bias depends on several factors.

i. The average number of eligible residents per institution – as this increases, the bias reduces. In this case we knew that the average number of residential places per home was just 17 but with a wide variation around this figure.

ii. The ineligibility rate. With zero ineligibility there is no bias. As the ineligibility rate increases, so does the bias. A previous feasibility study had suggested between 5% and 15% ineligibility overall (although this estimate was based on a purposive sample and is therefore not reliable) but ineligibility rates in different homes will inevitably vary greatly around the overall value.

iii. The eligible sample size per home. This bias problem is particularly serious for small sample sizes.

iv. Any positive correlation between survey variables (average fees) and the number of eligible residents will tend to increase the bias. A negative correlation will reduce it.

Table 1 below shows the % bias in estimates of the total eligible population for varying population sizes and ineligibility rates for a fixed eligible sample size of 5 residents. In the absence of any correlation (see (iv) above) the same bias will occur in all survey estimates.

Table 1 Percentage bias in eligible population estimates by population size and ineligibility rate

Population in home    Ineligibility rate (%)    % Bias
10                    10                        1.9
20                    10                        1.8
50                    10                        1.7
10                    20                        3.8
50                    20                        3.6
10                    30                        6.0
50                    30                        5.6
10                    40                        8.3
50                    40                        7.7
10                    50                        10.8
50                    50                        10.0
50                    60                        12.5
50                    80                        18.3
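The order of magnitude of these figures can be checked by simulation. The sketch below assumes the estimator implicit in the worked example (estimated eligible population = home size × target ÷ number drawn); the published table may rest on an exact calculation rather than simulation.

```python
import random

def bias_in_eligible_total(pop_size, inelig_rate, target=5, reps=200_000):
    """Monte Carlo % bias in the estimated eligible population when
    residents are drawn one at a time, without replacement, until
    `target` eligible residents have been found."""
    n_inelig = round(pop_size * inelig_rate)
    n_elig = pop_size - n_inelig
    residents = [True] * n_elig + [False] * n_inelig   # True = eligible
    total_est = 0.0
    for _ in range(reps):
        random.shuffle(residents)
        found = 0
        for drawn, eligible in enumerate(residents, start=1):
            found += eligible
            if found == target:
                break
        # Estimated eligibility rate = target / drawn, hence the
        # estimated eligible population = pop_size * target / drawn.
        total_est += pop_size * target / drawn
    return 100.0 * (total_est / reps - n_elig) / n_elig

for pop, rate in [(10, 0.1), (20, 0.1), (10, 0.5)]:
    print(pop, rate, round(bias_in_eligible_total(pop, rate), 1))
# Expect values close to Table 1: about 1.9, 1.8 and 10.8.
```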
2.4 PAF used address procedure
Substitution is currently used within the PAF sampling
system developed in OPCS. PAF addresses that have
been selected for one OPCS survey are normally tagged
and excluded from reselection in any other OPCS
survey for a fixed time period (currently three years for
most surveys). The way this is implemented is that such
addresses are left on the file and so are liable to be
reselected on another survey. When this occurs they are
immediately substituted by a neighbouring address. The
rationale is that as the first sample which marked them
as a used address was random, the substitutes that
happen to have such addresses as neighbours can also
be regarded as a random sample of the population. In
fact insofar as the ordering of addresses within the PAF
places addresses with similar characteristics close
together, then systematic sampling will act as a kind of
implicit stratification and we could expect some
efficiency gains as a consequence. This stratification
effect will be preserved by substituting a neighbouring
address rather than simply boosting the initial sample
size to compensate for these special “non-respondents”.
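As an illustration of the procedure just described, here is a minimal sketch of systematic selection from an ordered file with nearest-neighbour substitution for tagged (recently used) addresses. All names and data are invented; the real PAF system is more elaborate.

```python
def nearest_untagged(i, tagged, n):
    """Nearest index to i that is not tagged, searching i, i+1, i-1, ..."""
    for d in range(n):
        for j in (i + d, i - d):
            if 0 <= j < n and j not in tagged:
                return j
    return None  # only if every address is tagged

def systematic_sample(addresses, tagged, interval, start):
    """Systematic sample in file order; tagged selections are replaced
    by a neighbouring address, which keeps the substitute within the
    same implicit stratum created by the ordering of the file."""
    n = len(addresses)
    return [addresses[nearest_untagged(i, tagged, n)]
            for i in range(start, n, interval)]

paf = [f"address {k}" for k in range(100)]   # stand-in for the PAF
used = {10, 11, 40}                          # tagged by earlier surveys
print(systematic_sample(paf, used, interval=20, start=10))
# -> ['address 9', 'address 30', 'address 50', 'address 70', 'address 90']
```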
3. Substituting for non-respondents
Substitution is being used or considered for use in two
quite different contexts in these four examples:
replacing initial non-respondents (both refusals and
non-contacts) and replacing selected units which are
later discovered to have been ineligible for the survey.
In both cases the main aim is identical – to recover the
number of cases that have been lost from the sample
and hence boost precision. However the effect is
different in the two cases.
Under a simple model for non-response, all members of the population will either respond or fail to respond to a survey and different samples will pick up these two groups in different proportions by chance. If the mean for respondents differs from that for non-respondents, the normal survey estimate, excluding the non-respondents, will be biased. If we substitute the initial non-respondents with a further random sample from the population, the mean of the combined sample of the two groups of respondents will be biased to exactly the same extent as the mean of the first group of respondents, but the sample size will undoubtedly be larger and so the estimate will be more precise. Clearly we could extend the substitution procedure by continuing to select and approach people until we achieve a set target of interviews. So long as the substitutes are randomly selected, the procedure clearly does not affect the bias in either direction.

Moving now to a slightly more realistic model of the non-response mechanism, suppose that the tendency to respond to the survey is different amongst different groups of people in the population and that once again the means of respondents and non-respondents differ within the groups. Then on average any random sample will select people from these groups in the proportion that they occur in the population and survey estimates will again be biased. If the initial sample is selected with equal probability, then the remaining population will have exactly the same means as the full population and so an additional sample taken to substitute for the initial non-respondents will not affect the bias of sample estimates.

If no substitutions are made and post-stratification by these groups is used to reduce the non-response bias, the effect is artificially to boost the size of the groups with the lowest response rates by giving them larger weights. This will often reduce but not eliminate the bias. An alternative which can be used if the groups can be identified on the sampling frame would be to boost the size of these groups directly by substituting the non-respondents. In this case the effect on the bias is identical to that of post-stratification.

A problem with the approach occurs when the units are not being selected with equal probability – the most likely situation occurs when one is selecting aggregates such as institutions, where these are often selected with probability proportional to size. In this case the residual population of institutions, having selected a sample, will have a different mean from the total population. The only way to deal satisfactorily with this situation in general is to take a larger sample than is needed initially and hold part of it in reserve to be used as substitutes. In most cases, this should be the preferred method of implementing substitution even if the units are being selected with equal probabilities.
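A small simulation of the simple model described above illustrates both effects: substituting non-respondents with further random draws leaves the bias unchanged but restores the set sample size and hence improves precision. All numbers here are hypothetical.

```python
import random
import statistics

random.seed(1)

# Hypothetical population in which people with larger values are less
# likely to respond, so respondents' means fall below the true mean.
population = [random.gauss(50.0, 10.0) for _ in range(100_000)]
true_mean = statistics.fmean(population)

def responds(value):
    return random.random() < (0.9 if value < 50.0 else 0.5)

def survey_mean(set_sample=200, substitute=False):
    achieved, approached = [], 0
    # Without substitution, approach exactly `set_sample` people; with
    # substitution, keep approaching until `set_sample` interviews.
    while (len(achieved) < set_sample) if substitute else (approached < set_sample):
        value = random.choice(population)
        approached += 1
        if responds(value):
            achieved.append(value)
    return statistics.fmean(achieved)

for substitute in (False, True):
    means = [survey_mean(substitute=substitute) for _ in range(2_000)]
    print(f"substitute={substitute}: "
          f"bias {statistics.fmean(means) - true_mean:+.2f}, "
          f"sd {statistics.stdev(means):.2f}")
# Both arms show roughly the same negative bias; the substituted arm
# has the smaller sd because every replicate achieves 200 interviews.
```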
The design in example 2.1 seems inconsistent, as
substitution was allowed for non-cooperating hostels
but not for non-responding individuals and the basis of
Lynn’s argument against substitution of individuals is
unclear. However the apparent inconsistency might be
due to concerns about the effect on interviewer
motivation and response rates if substitution of
individuals had been allowed. This is discussed further
in section 5, below.
4. Substituting for ineligibles
In the example in 2.3 above substitution for ineligibles would have made the estimation of selection probabilities, and hence of any survey estimates, particularly problematic. As the discussion above makes clear, the bias is likely to be most serious when ineligibility rates are high and target sample sizes are small. However it does not disappear entirely in other cases, whereas the most straightforward alternative to substitution, boosting the initial sample size in line with overall expected ineligibility, is unbiased. The bias arises because of the necessity of estimating the different selection probabilities. Consequently substitution for ineligibles will only be unproblematic when the units involved are being selected with equal probabilities at that particular stage in the sampling or when the true ineligibility rate for the sampling unit is known. Although this may be true at the final stage of some multi-stage samples, the widespread (and highly desirable) use of pps sampling means that such examples will be rare and that consequently
substitution for ineligibles cannot be recommended in
anything other than simple random samples.
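For comparison, the unbiased alternative referred to above (boosting the set sample in line with expected ineligibility) is a one-line calculation; the figures here are illustrative only.

```python
import math

target_eligible = 300            # e.g. 5 residents in each of 60 homes
expected_ineligibility = 0.10    # feasibility study suggested 5-15%

issued = math.ceil(target_eligible / (1 - expected_ineligibility))
print(issued)  # 334: issue 334 cases to expect about 300 eligible ones
```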
5. Other considerations
Bias and precision should never, of course, be the sole
criteria in determining sampling procedures. We should
also consider their impact on OPCS staff and
particularly on interviewers. Substitution or something
very like it is widely used in quota and other non-random sampling methods and we must beware of
giving interviewers (or anyone else involved with the
survey) the impression that any informant is as good as
the one we initially selected.
There are two separate risks involved here. First, that interviewers will try less hard to secure a high response rate if they know that a substitute will always be provided; the discussion in Section 3 assumes that the methods we use to produce a high response rate will continue to be used on both the initial set sample members and the substitutes. If the result of permitting substitution is a reduction in response rates then we may be more prey to non-response biases and the argument used above on the absence of any change in non-response bias will fall. Secondly, there is a risk that if we permit substitution in some cases, interviewers may make their own non-random substitutions in other cases to boost their apparent response rates.

There is also a third, rather less tangible, risk that by introducing a method which interviewers may associate with lower quality research, they might start to feel less confident in our own commitment to quality methods, which could in turn affect their motivation to maintain high standards.

6. Conclusion

Substitution of non-respondents with randomly selected alternatives, while not in any way reducing non-response bias, in principle does not increase it either. Its use would increase the sample size more efficiently than boosting the set sample since it would fix the final sample size. However the argument of the last section on the potential psychological impact on interviewers of permitting some substitution when none has been allowed before I believe sways the argument against its widespread introduction in SSD.
However in those situations where its use does not impinge on interviewers, for example in replacing non-cooperating institutions, it appears to offer some advantages. This is especially true when the substitute can be selected from the same population group as the initial non-respondent, when its effect is akin to post-stratification, i.e. it may reduce non-response bias.
Substitution of ineligible units, although not affecting
interviewers in any obvious way, may introduce biases
in certain cases and should in general be avoided.
Reference
1. Lynn, P. Survey of single homeless people, Technical Report. SCPR.
Characteristics of non-responding households on the Family
Expenditure Survey1
Sarah Cheesbrough
Response rates on the FES are normally in the range of
68-72%. These figures are lower than on most SSD
surveys due to the demand placed on respondents to
complete a long questionnaire on income and
expenditure and then keep a detailed diary of
expenditure for the two weeks following the interview.
Non-response is a problem in all sample surveys of the
general population; on the FES the rather high rate of
non-response means that there is a danger of under-representing important groups of spenders. Researchers
need to seek ways of improving response and, at the
same time, to develop methods of compensating for
non-response after the event. It is the latter approach
which is the subject of this paper.
Methods of weighting the data to compensate for non-response are being investigated. Every 10 years the comparison of FES data with Census records provides the most accurate analysis of the characteristics of non-respondents. The variables compared could then provide a source to re-weight for non-response. This and other methods of re-weighting survey data are described and discussed in a recent monograph in SSD's New Methodology Series.2 One drawback of using Census data is the time lag of up to 10 years between the Census measures and the survey data that are being re-weighted. This project was set up to consider a method of collecting information on non-responding households alongside the main survey. The emphasis of this exercise was to evaluate the feasibility of interviewers gathering information direct from non-responders on the doorstep.
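As a minimal sketch of the kind of re-weighting under investigation, one standard approach (cell weighting) inflates each responding household by the inverse of its class response rate, using classes formed from variables known for respondents and non-respondents alike. The classes and counts below are hypothetical, not FES figures.

```python
# Set sample and respondents by weighting class; hypothetical counts.
set_sample = {"HOH under 60": 500, "HOH 60 or over": 300}
responding = {"HOH under 60": 380, "HOH 60 or over": 195}

# Each responding household is weighted by the inverse of its class
# response rate, restoring the class mix of the set sample.
weights = {c: set_sample[c] / responding[c] for c in set_sample}
for c, w in weights.items():
    print(f"{c}: weight {w:.3f}")
# HOH under 60: weight 1.316; HOH 60 or over: weight 1.538
```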
Other studies of non-respondents have looked at the use of a 'basic question' that is asked of non-responding households in order to identify a key characteristic of these households that is relevant to the subject of the survey.3 For the FES a pool of basic questions was considered in terms of their importance as re-weighting variables and how practical it would be to collect this information without affecting the main survey response rate or antagonising members of the public.

1. Method

The study of non-responding households was carried out for one fieldwork month (January 1993) in conjunction with the normal FES quota. A very short Response Characteristics Questionnaire (RCQ) was completed for every household, whatever the outcome code. On return to the office the RCQs were keyed into a Blaise CADI program. This process allowed the comparison of responding and non-responding households to be completed separately from the normal keying and editing timetable of the FES.

The RCQ covered basic information about household members such as age and sex, relationships within the household and working status. Additionally there were questions on the type and tenure of the accommodation and on car ownership. A second section required the interviewer to record his/her observations of the ethnic group and main language spoken by the household. Finally, there were three questions at which the interviewers' impressions of the household were entered, covering any ill health in the household that might affect response, wealth of the household and any other relevant information.

Interviewers were briefed on various methods of introducing the RCQ; a flexible approach in gaining the information was the key to not damaging normal FES response rates.

2. Results

2.1 Reception by informants
Completed RCQs were available for a total of 772
eligible households. Some forms were not returned in
time for analysis but the outcome codes for the RCQ
exercise were in exact proportion to the figures for the
total of 819 households that were eligible in January.
The January 1993 FES response rate of 73.3% was
below that of the previous January (74.6%), when an
increased incentive payment
for co-operating
households had just been introduced; but it was above
that of any of the previous six calendar months and
slightly above the monthly average for the whole of
1992 (72.8%). It seems safe to conclude, then, that the
exercise did not damage the FES response rate.
There was also no evidence, from interviewers’
comments or complaints to OPCS from members of the
public, that people were antagonised by the small
amount of extra probing for information from non-responders that the exercise required.
Table 1 Main source of information for non-responding households

Main source of answers     Refusal before/     Refusal at     Non-        Total number
                           during interview    diary stage    contacts    of households
Member of household        105                 14             0           119
Neighbour etc              7                   0              4           11
Interviewer observation    58                  0              8           66
Total                      170                 14             12          196

2.2 Introducing the Questionnaire
Once it was clear that at least one member of the household had refused to co-operate with the survey there were two distinct methods used by interviewers to collect the information. This varied both according to the interviewer and the type of household. The first type of introduction explained briefly the exercise to the household member…
“People refuse for all types of reasons and we are
interested in seeing whether we are losing similar
groups of the population, so if you could just spare me
a few moments I’d be very grateful if I could just ask..”
Alternatively it was often more appropriate to use
indirect methods to obtain the answers to the RCQ
questions. An interviewer reported…
“I never asked the questions as questions but tried to
ask them as part of general conversation. Someone who
is telling you about how disgraceful this big brother
attitude is is hardly going to turn round and tell you
how many bedrooms they have.”
Interviewers found it easier to gain co-operation where either only one member of the household had refused to participate or where it was only the income section of the survey that the household objected to.

Although interviewers were given the options of using the questionnaire or a small prompt card on the doorstep, the majority found it easier to memorise the questions so that there were no physical interruptions to the primary task of converting the household to a response.
2.3 Source of information

Interviewers were asked to report on the methods they had used to collect the information on non-respondents. In Table 1 the methods used are shown against the type of non-response at the household. In the few cases of non-contact interviewers still sometimes managed to obtain some information from neighbours. Encouragingly, 62% of households who refused to participate in any part of the survey did give basic information for the RCQ.

2.4 The quality of information

It was a particular concern of the project to evaluate whether the information obtained about non-responding households was of a high enough quality to compare with main FES data. The refusing cases were examined and results are reported according to the method by which the information was obtained.

Questions directed to a member of the household

In general, if the co-operation of a member of the household was gained, the information was very accurate. As shown in Table 2, basic demographic information was readily given, whilst information about the accommodation and vehicles was harder to obtain.

Table 2 Proportions of refusing households where no information available

Question                                                  % of households
Sex of HOH                                                1
Age of HOH (no exact age or band)                         8
Marital status of HOH                                     1
Working status of HOH4                                    1
Sex of other household members                            8
Age of other household members (no exact age or band)     2
Marital status of other household members                 5
Working status of other household members                 …
Number of bedrooms in accommodation                       15
Household tenure                                          16
Car or van available                                      16
Age of vehicle                                            19
Interviewers’ observations
Two questions on the RCQ required the interviewer to
observe the ethnic group and the main language spoken
by members of the household. This did not present any
problems for interviewers but obviously the results are
the opinion of the interviewer rather than the
respondent.
Interviewers’ impressions
The interviewers were asked to give their impressions
of the health and wealth of the household. Although the
questions were nearly always answered many
interviewers commented on how their experience had
shown how misleading initial impressions could be.
3. Comparison of responding and non-responding households

In the following analysis only non-responding households who refused directly to the interviewer, either before or during any interviewing or later at the diary stage, are included.

3.1 Information about individuals

Previous studies comparing responding and non-responding households have matched addresses selected for the FES sample to Census records.4 The sample size for the response characteristics exercise would be too small for any comparable significance tests of differences between the responding and non-responding groups. However, using the variables found to be significant in the 1981 Census comparison as a basis, some distributions for particular questions were compared. With the emphasis on household non-response, analysis concentrated on information about the head of household (HOH).

Age of HOH

The 1981 Census comparison found that response declined with increasing age of HOH: young adults in general might be hard to contact or reluctant to co-operate but response was high where a young adult was actually HOH.

For this study, in households where the age of the HOH was established, it was apparent that a larger proportion of non-respondents fell into older age brackets. Overall the mean age of HOH for non-responding households was 54 years old (n=127) compared to 50 years old (n=566) for responding households.

Figure 1 shows the distribution of the age of the HOH in the responding and non-responding households. The graph shows that age groups more likely to have dependent children form a greater proportion of responding households whilst the non-responding group contains a larger proportion of households with an older HOH.
Figure 1 Distribution of age of HOH for responding and non-responding households
The 1981 Census comparison found a positive association between households with dependent children and survey response. The results from the RCQ confirmed this finding. Whilst 33% (n=566) of responding households contained one or two adults with at least one child under 16 years old, this was the case for only 19% (n=185) of non-responding households.

Employment status

A larger proportion of heads of household in the non-responding group were economically inactive. Interviewers were very successful in determining the employment status of those who were working. Many reported that the nature of the non-respondent's job is often mentioned in any explanation for refusal. If a person was not working it was more difficult to clarify whether they were economically active or not.

Not surprisingly, the non-responding group contained a larger proportion of self-employed people. In Table 3 results are shown beside those for the FES in 1991.5

Table 3 Economic activity of HOH

                          Non-responding   Responding   1991 FES
                          households       households
                          %                %            %
Economically active       60               63           62
of which: Employed        45               48           48
  Self-employed           11               7            9
  Unemployed              5                7            5
Economically inactive     40               37           38
Total                     100              100          100
Base = 100%               169              566          7056

The lower level of economic activity for non-responding households is consistent with a higher proportion of HOHs that were of retirement age. The lower proportion of unemployed HOHs in the non-responding group could also be a result of the higher average age of the group. However, there were an additional 8 non-responding households where it was not clear whether the HOH was unemployed or economically inactive.
3.2 Information about the household

Accommodation type
Interviewers were able to observe the type of accommodation for all refusing households and then ask some non-respondents how many bedrooms there were within the household. With this small sample there were no clear differences between the groups.
Tenure

Interviewers were successful in ascertaining tenure at 84% of refusing households. A larger proportion of responding than non-responding households in January were owner occupiers.

Table 4 Tenure of responding and non-responding households

                                            Non-responding   Responding
                                            %                %
Owned, including with a mortgage            61               67
Rented from a local authority, new town,
  housing association etc                   27               23
Rented from private landlord                12               9
Total                                       100              100
Base                                        155              563
Vehicles

The 1981 Census comparison found that response was lowest amongst multi-car households, possibly reflecting non-response among those with high income. Information was available for nearly 85% of refusing households. However interviewers felt this was the most unreliable question unless it could be asked directly of household members. Table 5 shows the RCQ figures beside FES 1991 results.

Table 5 Number of vehicles available to household

                            Non-responding   Responding   1991 FES responding
                                                          households
                            %                %            %
No car or van               36               30           32
One car or van              40               50           45
Two or more cars or vans    23               19           23
Total                       100              100          100
Base                        156              566          7056
Non-responding households do appear more likely to
have two or more cars available and this group also
tend to be in a higher income bracket. Most notably,
80% of HOHs from this group are over 40, 28% of
HOHs are self-employed and 87% of households are
owner-occupiers. At the other end of the scale the
higher proportion of non-responding households
without a car reflects the larger proportion of elderly
people in this group.
Recording the age of the car is not normal practice for the FES. For this study interviewers were asked to collect this additional detail. 45% of responding households had cars which were less than 5 years old compared to 41% of non-responding households. However there was more variation in age of car for the non-responding group, which seemed to reflect the proportions of types of household in the group; whilst pensioner households tended to have older cars, the higher income refusing households tended to have very recently registered cars.

Ethnic group and main language of household

Interviewers were required to record their impressions of the ethnic group and main language of the household. Non-responding households consisted of a slightly higher proportion of ethnic minority households (5% compared to 4% responding, n=181 and 565 respectively). Two of the non-responding households had no members who spoke English compared to only two of all the responding households. Information on this group of non-responders was more limited than average and inconclusive.

Health and wealth of household

As mentioned earlier, interviewers often found ill health discussed when reasons were given for refusing the survey. Interviewers noted that at 25% (n=172) of non-responding households there was some or much ill health compared to 17% (n=564) of responding households. However, this information was very closely related to the fact that non-respondents included many elderly households.

A scale of wealth was given to rank the household approximately from 1 as the very poorest to 6 as the very richest household. Although interviewers were instructed to record their initial impressions of all households before any interviewing, RCQs were, in general, completed after the main interview and could well be coloured by more detailed knowledge that would not be possible with non-respondents. This information has therefore not been used for comparisons. In future tests, the use of income bands on a show card for use on the doorstep could be considered.
4. Conclusions

4.1 Feasibility of the fieldwork
Despite some initial reservations, the fieldwork was
very successful. The interviewers’ reports made it clear
that any exercise of this nature should use a very small
number of key questions that are easy to memorise.
This allows the interviewer to adapt to the situation on
the doorstep and prevents distraction from the main
task of persuading the household to co-operate with the
survey.
4.2 Distinguishing variables

Questions about the household
Where the interviewer obtained the information from a household member the quality of the information was very high. Interviewers often commented that obtaining 'household grid' details forms a natural part of their doorstep introduction. The information on age, household composition and occupation in particular distinguished between responders and non-responders. Questions on accommodation and car ownership were also successfully completed and provided useful information on non-responders' characteristics.

Basic questions would be useful in targeting:
(a) elderly households put off the full FES by its length;
(b) higher-income households, often with HOH in late middle age, who have reservations about the financial nature of the survey or object to the invasion of privacy;
(c) self-employed persons in all age groups who object to questions that probe for details of financial arrangements.

People who refuse for more general reasons, such as dislike of all surveys or the government, may also give some valuable information.

Interviewers' observations

Recording the ethnic group of the household presented no problems and there appeared to be some slight difference in response rate that could interact with other variables.
Interviewers’ impressions
Long term ill health in the household was relatively
easy to record but was highly correlated with the age of
the head of household. Overall impression of wealth
presented the greatest difficulty to interviewers and did
not clearly distinguish between responders and nonresponders.
Implementation using computer assisted interviewing
The practicalities of collecting non-response
information must be considered in the context of the
transferring of the FES to computer assisted
interviewing (CAI) in 1994.
Future work using records from the 1991 Census
should provide the accurate information on variables
which distinguish between responding and nonresponding households.
4.3
Individual level
age
sex
marital status
relationship to HOH
employment status
personal outcome on FES
Currently, on CAI trials, the interviewers record
household and personal outcome records in a section of
the interview program known as the administration
block. This is separate from the main interview and
always completed when the interviewer is at home. If a
household has refused the interviewer is required to
give any reasons for refusal at both the household and
individual level. This provides a greater level of detail
than is currently recorded on the calls and outcome
records for the paper survey. A non-response exercise
carried out using CAI could include some basic
questions, built into the administration block, that
would appear if a refusal code is used.
Proposals for future work
The following recommendations are made:-
Partial interviews
The FES should consider accepting information from
partial interviews with:(a) HOH and partner from the household, when nondependant children or other household members refuse
to con-operate;
(b) elderly people who consent to the interview but fail to
complete the diary.
Basic questions
Notes

1. This article is based on a paper sent to the Central Statistical Office in March 1993. It forms part of ongoing investigations into the use of re-weighting techniques to compensate for unit non-response on the Family Expenditure Survey (FES).
2. Kersten, HMP. & Bethlehem, JG. (1984). Exploring and reducing non-response bias by asking the basic question. Statistical Journal of the UN ECE, 2, pp.369-380.
3. Redpath, R. (1986). Family Expenditure Survey: a second study of differential response, comparing Census characteristics of FES respondents and non-respondents. Statistical News, Vol. 72, pp.13-16.
4. Elliot, D. (1991). Weighting expenditure and income estimates from the UK Family Expenditure Survey to compensate for non-response. Survey Methodology Bulletin, No. 28, pp.45-54.
5. Central Statistical Office (1992). Family Spending, a report on the 1991 Family Expenditure Survey. London: HMSO.
The use of standardisation in survey analysis
Kate Foster
1. Introduction

Survey analysts are often interested in comparing the rate for some event or characteristic across different subgroups of a population or for the same population over time. Comparison of the overall rates or proportions is not a problem if the populations are similar with respect to factors associated with the measure concerned, such as age, sex or marital status. When this is not the case, a direct comparison of overall rates may be misleading.

One commonly used solution is to present three-way (or more) tables to control for other confounding variables which are associated with the measure of interest and also with the main independent variable. An example would be to take account of age when looking at the relationship between cigarette smoking and social class by tabulating prevalence of cigarette smoking by social class for a number of different age groups. The resulting tables may, however, be difficult to interpret and suffer from small cell sizes. In addition, they do not provide a single summary measure which is suitable for comparison between groups.

A more statistically sophisticated solution is to model the data which, for categorical survey data, would normally involve the use of log-linear modelling. However, this approach may not be the most appropriate where the requirement is to produce simple summary measures across a large number of analyses that are suitable for tabulation and can be readily interpreted.

An alternative approach to the problem which provides output in recognisable tabular format is standardisation. The technique allows direct comparison between rates or proportions measured for populations which differ in a characteristic which is known to affect the rate being measured by, in effect, holding the confounding variable constant. In the example mentioned above, it would provide a measure of cigarette smoking prevalence for each social class group having adjusted for the effects of age.

This paper gives some background to the use of standardisation in Social Survey Division (SSD) and presents the results of recent work on the estimation of standard errors for age-standardised ratios.

2. Methods of standardisation

Standardisation has most commonly been used within SSD in relation to health indicators, which often show a strong relationship with age. The technique provides a way of comparing health indicators between different subgroups of the sample after making allowance for differences in age structure between the groups and provides a single numerical summary of the age-specific rates for each subgroup of interest. There are two commonly-used ways of deriving a summary of age-specific rates, known as direct and indirect standardisation. These methods are illustrated below with some comments about their limitations and advantages. Examples and commentary can also be found in Marsh (1988)1 and Fleiss (1981)2.

Standardisation, by whichever method, is not a substitute for comparison of the age-specific rates for the subgroups of interest. Even when the technique is used it is advisable to look at the relevant three-way table and, in particular, at whether the relationship between the health measure and the characteristic of interest varies with age. If there are interactions in the data, for example where the percentages of people with the characteristic of interest in two subgroups are lower for some age bands but higher for others, then standardisation will tend to mask these differences. In these circumstances the results of the standardisation may be misleading and should be treated with caution.

2.1 Direct standardisation

Direct standardisation is widely used in medical statistics and the output is normally a rate (proportion) for each subgroup. The method applies the observed age-specific rates for each subgroup to a standard population distribution, often that of the total sample, and the standardised rate is obtained by summing these values across all strata (age groups) in the subgroup. This is given by the equation

(1)   $\text{Standardised rate} = \sum_i r_{ij} w_i$

where $r_{ij}$ is the observed rate (proportion) for the cell defined by the ith stratum and jth subgroup, and $w_i$ is the number of cases in the ith stratum (age group) as a proportion of the total sample.
The use of direct standardisation is illustrated by the example in Table 1. The example uses data from the 1991/92 General Household Survey on the proportion of men and women in each of three marital status groups with a long-standing illness or disability, which is an indicator of chronic sickness. For simplicity the example uses only three age bands, but the technique can readily be applied to more strata. The age-standardised proportion of chronically sick in each marital status category is the proportion which would result if a standard population (given here by the total sample of men or women) were to experience the age-specific rates observed by that subgroup.

Table 1 Example of direct standardisation.
Reported longstanding illness by marital status, age and sex

                    Percentage reporting longstanding     Proportion in   Contribution to expected rate
                    illness (rij) by marital status       stratum (wi)    (rij x wi) by marital status
Age group           Married,    Single   Widowed,         {Standard       Married,    Single   Widowed,
{Strata (i)}        Cohabiting           Divorced,        population}     Cohabiting           Divorced,
                                         Separated                                             Separated
Men
16-44               23.5        21.0     32.1             0.52            (12.2)      (10.9)   (16.7)
45-64               41.7        47.3     43.0             0.30            (12.5)      (14.2)   (12.9)
65 or over          60.1        57.1     66.3             0.18            (10.8)      (10.3)   (11.9)
All men             37.0        25.2     51.6             1.00            35.5        35.4     41.5
Women
16-44               21.8        24.8     30.0             0.50            (10.9)      (12.4)   (15.0)
45-64               38.8        43.0     50.7             0.28            (10.9)      (12.0)   (14.2)
65 or over          55.2        64.0     61.1             0.22            (12.1)      (14.1)   (13.4)
All women           32.4        30.0     52.3             1.00            33.9        38.5     42.6

Figures in brackets are the products rij x wi; the 'All men' and 'All women' entries in these columns are the expected (directly standardised) rates, the sum of the products over the age strata.
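As a quick check on the arithmetic, the men's expected rates in Table 1 can be reproduced in a few lines. The sketch below is illustrative only (Python, with variable names of my choosing rather than anything from the paper); it simply applies equation (1) to the observed rates and the standard population weights.

# Direct standardisation: apply each subgroup's age-specific rates to a
# common standard population (here, the age distribution of all men).
w = [0.52, 0.30, 0.18]                      # proportion of all men in each age band
rates = {                                    # observed % with longstanding illness (r_ij)
    "married/cohabiting": [23.5, 41.7, 60.1],
    "single":             [21.0, 47.3, 57.1],
    "widowed/div/sep":    [32.1, 43.0, 66.3],
}
for group, r in rates.items():
    standardised = sum(rij * wi for rij, wi in zip(r, w))
    print(f"{group}: {standardised:.1f}")   # 35.5, 35.4 and 41.5, as in Table 1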
From the observed percentages in Table 1 we see that men and women who were widowed, divorced or separated were much more likely than those in other groups to have reported a long-standing illness. Also, among men only, the observed rate was lower for those who were single than for the married or cohabiting group. These results may be misleading since there is a strong association between marital status and age as well as between the incidence of chronic sickness and age: single people are on average younger than others, while those who are widowed, divorced or separated are on average older than the married or cohabiting group.

The directly standardised rates for the marital status groups are shown in the right hand part of Table 1. As would be expected, once age has been taken into account there was less variation between the subgroups in the percentage chronically sick. The directly standardised rates were, however, still higher for men and women who were widowed, divorced or separated, indicating that this group had higher rates of chronic sickness even after allowing for their age distribution. There is also some evidence that married or cohabiting women reported lower rates of chronic sickness than would be expected on the basis of their age distribution.

These results can be seen to be consistent with the observed age-specific rates for the subgroups. For most of the age-sex strata, observed rates of long-standing illness were higher among informants who were widowed, divorced or separated, and married women in each age band had lower observed rates of chronic sickness than single women.

Although direct standardisation is initially attractive because the resulting statistic is a rate (proportion), the method has more rigorous data demands than does indirect standardisation. The major requirement is that the age-specific rate for the measure under investigation must be known for each population subgroup being considered. In most survey contexts this level of detail is available in the data, but the sample size for each cell in the cross-tabulation may be too small to give reliable measures and hence the resulting standardised rates may be unstable.

The other requirement in order to calculate a directly standardised rate is that the age structure of the standard population is known. In cross-sectional surveys the age distribution for the total sample (of men or women) is used. When using the method to compare rates over time it is necessary to decide on the composition of a standard population which is then applied to calculate the standardised rates for each repeat of the survey.
2.2 Indirect standardisation

Indirect standardisation is often thought to be more appropriate than direct standardisation for survey data and is the method generally used within SSD in the analysis of cross-sectional surveys. It has less rigorous data requirements than direct standardisation since it uses the age-specific rates for the population as a whole, rather than for each subgroup, so avoiding the use of rates based on small sample sizes. Thus, in the survey context, ratios calculated by indirect standardisation are generally preferred to directly standardised rates because they have greater numerical stability and are less sensitive to changes in age-specific rates within subgroups. The calculation also uses the age distribution and the overall rate (number of occurrences) for the subgroup.

The expected number of occurrences for the subgroup is given by:

(2)   $\text{Expected count} = \sum_i r_i n_{ij}$

where $r_i$ is the observed rate (proportion) for the age band and $n_{ij}$ is the cell sample size.
Table 2 Example of indirect standardisation.
Reported longstanding illness by marital status, age and sex

                        Sample size (nij)                   Proportion       Expected count (Σ ri nij)
                        by marital status                   reporting        by marital status
Age group               Married,    Single   Widowed,       longstanding     Married,    Single   Widowed,
{Strata (i)}            Cohabiting           Divorced,      illness (ri),    Cohabiting           Divorced,
                                             Separated      total sample                          Separated
Men
16-44                   2615        1635     168            0.229            599         374      38
45-64                   2123        169      221            0.422            896         71       93
65 or over              1087        98       350            0.613            666         60       215
All men
  Observed percentage   37.0        25.2     51.6           0.356
  Observed count        2154        479      381
  Expected count                                                             2161        505      346
  Standardised ratio    100         95       110
  (Observed/Expected)
Women
16-44                   3100        1344     410            0.233            722         313      96
45-64                   2124        114      513            0.412            875         47       211
65 or over              852         164      1122           0.590            503         97       662
All women
  Observed percentage   32.4        30.0     52.3           0.362
  Observed count        1971        487      1069
  Expected count                                                             2100        457      969
  Standardised ratio    94          106      110
  (Observed/Expected)
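The same kind of arithmetic check applies to Table 2. A minimal sketch (Python again, with illustrative names) applies equation (2) and the observed/expected ratio defined in the text to the men's figures:

# Indirect standardisation: expected counts use the total-sample
# age-specific rates applied to each subgroup's age distribution.
r = [0.229, 0.422, 0.613]                   # total-sample rates by age band
groups = {                                   # (sample sizes by age band, observed count)
    "married/cohabiting": ([2615, 2123, 1087], 2154),
    "single":             ([1635, 169, 98], 479),
    "widowed/div/sep":    ([168, 221, 350], 381),
}
for name, (n, observed) in groups.items():
    expected = sum(ri * nij for ri, nij in zip(r, n))
    sr = 100 * observed / expected
    print(f"{name}: expected {expected:.1f}, SR {sr:.0f}")
# Standardised ratios of 100, 95 and 110, matching Table 2; the printed
# expected counts (2161.3, 505.8, 346.3) differ trivially from the table,
# which rounds each age-band contribution before summing.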
The number of observed occurrences is then compared with the expected number for the subgroup to give a standardised ratio (SR):

(3)   $SR = \dfrac{\text{Observed count}}{\text{Expected count}} \times 100$

The resulting standardised ratios are highly dependent on the age structure of the specific population for which they are constructed and are sensitive to variation in the subgroup sample distribution. It is also possible, although less usual, to express the results of indirect standardisation as rates by dividing the expected count for the subgroup by its sample size.

Table 2 illustrates the use of indirect standardisation on the same data as shown in Table 1. The resulting standardised ratios compare the observed prevalence of chronic sickness in each marital status group with the rate that would be expected if the age-specific rates in the total sample were applied to the age distribution observed for that subgroup.

The results of the standardisation are interpreted by looking at the deviation of the standardised ratios from 100, which is the implied ratio for a standard population (ie. for the total sample shown in a specific table). A ratio of more than 100 implies that the subgroup is more likely to display the characteristic than would be expected on the basis of its age distribution if its members were similar to the sample as a whole. Conversely, a ratio of less than 100 implies that the group is less likely to display the characteristic than would be expected from its age distribution.

From Table 2 we see that men and women who were widowed, divorced or separated were more likely than expected from their age distribution to have reported chronic sickness (ratios of 110 for both sexes). Ratios of less than 100 were recorded for single men (95) and married or cohabiting women (94), although these ratios were both relatively close to 100.

Since standardised ratios are calculated from survey data they are subject to sampling error, and more precise assessment of their deviation from 100 would involve use of the standard error of the ratio in a conventional test of statistical significance. The next section goes on to look at estimates of the standard errors associated with various standardised ratios.

3. Assessing the results of indirect standardisation

Since standardised ratios are complex statistics the calculation of the associated standard error is not straightforward. It has therefore been usual to apply various 'rule-of-thumb' methods to assess whether observed ratios are likely to differ significantly from 100 and hence to decide which ratios should be commented on.

Two methods of estimating standard errors for standardised ratios have recently been investigated on a number of examples.3 The first method discussed below makes the simplifying assumption of a simple random sample design, whereas the second provides an estimate of true standard errors taking account of complex sample design. The standard errors resulting from either method of estimation can be used to test whether an individual standardised ratio is significantly different from 100, but not to test for significant differences between two ratios.

3.1 Standard errors assuming a simple random sample design

Initially we look at estimated standard errors for standardised ratios under the simplifying assumption of the survey having a simple random sample (srs) design. The formula for this calculation4 was derived using the Taylor series approximation; further details of the method of derivation are given in Wolter (1985).5 The calculations were carried out by means of a spreadsheet which required only that the total sample sizes and the observed number of cases with the attribute were entered for each cell defined by the strata and the subgroups.

The first part of Table 3 shows the estimated standard errors for the example of indirect standardisation used above (Table 2). The age-standardised ratios of 110 for widowed, divorced and separated men and women were both statistically significant at the 5% level; standard errors for these groups were 3.50 on a subgroup sample size of 739 men and 1.84 on a subgroup sample size of 2045 women. On the basis of these standard errors, the ratio of 94 for married or cohabiting women was also found to be significantly different from 100.

Since many surveys involve much smaller total sample sizes or smaller subgroup samples, the second part of Table 3 shows the effect on the interpretation of the ratios of reducing the sample size by a factor of 5 whilst keeping the age-specific rates and sample distributions for the subgroups the same as in the original example. With this smaller sample size the standardised ratios for married or cohabiting women and for previously married women were still significantly different from 100, but the ratio for widowed, divorced or separated men was not significantly higher than 100. As with standard errors for survey statistics, the standard error of a standardised ratio is inversely related to the square root of the sample size, so reducing the sample size by a factor of 5 results in an increase of about 2.24 in the magnitude of the standard error.
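Holding the age-specific rates and subgroup distributions fixed, the scaling just described and the 5% significance test applied throughout this paper can be written out explicitly:

$$ se_{n/5} = \sqrt{5}\; se_n \approx 2.24\; se_n, \qquad |SR - 100| > 1.96\; se \;\Rightarrow\; p < 0.05 . $$

For example, the full-sample standard error of 0.86 for married or cohabiting men becomes $0.86 \times \sqrt{5} \approx 1.92$ on the one fifth sample, and the ratio of 110 for widowed, divorced or separated men, significant on the full sample ($10 > 1.96 \times 3.50$), ceases to be significant on the one fifth sample ($10 < 1.96 \times 7.86$); both results appear in Table 3.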
Table 3 Estimated standard errors for standardised ratios: reported longstanding illness by marital status and sex with varying sample size

                        Men's marital status                        Women's marital status
                        Married,    Single   Widowed,   All men     Married,    Single   Widowed,   All women
                        Cohabiting           Divorced,              Cohabiting           Divorced,
                                             Separated                                   Separated
Observed percentage     37.0        25.2     51.6       35.6        32.4        30.0     52.3       36.2

(a) Full sample
Sample size             5825        1902     739        8466        6076        1622     2045       9743
Standardised ratio      100         95       110*                   94*         106      110*
Standard error          0.86        3.06     3.50                   1.02        3.39     1.84

(b) One fifth sample
Sample size             1165        380      148        1693        1215        325      409        1949
Standardised ratio      100         94       110                    94*         107      110*
Standard error          1.92        6.83     7.86                   2.27        7.56     4.11

* Significantly different from 100 (p<0.05)
Similar calculations were made for a number of standardised ratios taken from a recent survey carried out by SSD, with the aim of reaching some general conclusions on the size of standard errors. The examples are shown in Table 4 and are all based on a total sample size of around 1500 men or women, which is perhaps a typical sample size for a small survey. Subgroup sizes in the examples are between 35 and 1119, with most in the range 200 to 600 cases. For each example the table shows the observed proportion, the subgroup sample size, the standardised ratio and the estimated standard error. An asterisk beside a ratio indicates that, based on the estimated standard error, the ratio was significantly different from 100 (p<0.05). The following patterns are revealed by the examples.

i. The size of standard errors is strongly influenced by the overall proportion of the sample with the attribute: standard errors increase substantially as the proportion decreases. Thus, the standard errors associated with the ratios in example (c), where the overall prevalence is 78%, are lower than those in examples (a) or (f), where the overall percentages are 24% and 11% respectively.

ii. From the result at (i) it is clear that the size of standard errors can be reduced simply by running an analysis for the inverse of a low percentage. For example, if 20% of the sample has a particular attribute then lower standard errors will result if the standardisation is run for the percentage of informants who do not have the attribute, ie. 80% overall. This use of the inverse percentage does not affect the eventual interpretation of the results since the ratios are more likely to diverge from 100 when the overall percentage is low. An example of the effect of inverting an analysis is given in examples (c) and (d). With an overall percentage of 78% the standardised ratios are between 86 and 110, standard errors range from 1.5 to 3.8 and the ratios for groups 1, 3 and 4 are significantly different from 100. Repeating the analysis for the inverse percentage (22%) gives individual ratios in a much wider range, from 60 to 147, and larger standard errors, from 5.5 to 14.0, but the ratios for the same subgroups are significantly different from 100.

iii. Within a single table standard errors may vary widely for different groups, and this was investigated further by looking at the contribution of different terms in the calculation of the standard error to the total standard error. This revealed that standard errors are larger for subgroups with larger values of the standardised ratio, smaller values of the percentage or smaller sub-sample sizes. Since a large ratio is usually
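Point (ii) can be illustrated with the General Household Survey data from Table 2. The short sketch below (Python, illustrative only) re-runs the men's standardisation for the absence of a longstanding illness, an overall percentage of about 64% rather than 36%:

# Inverting the analysis: standardise on NOT having the attribute.
r = [0.229, 0.422, 0.613]
groups = {
    "married/cohabiting": ([2615, 2123, 1087], 2154),
    "single":             ([1635, 169, 98], 479),
    "widowed/div/sep":    ([168, 221, 350], 381),
}
for name, (n, observed) in groups.items():
    total = sum(n)
    expected = sum(ri * nij for ri, nij in zip(r, n))
    sr = 100 * observed / expected                          # attribute present
    sr_inv = 100 * (total - observed) / (total - expected)  # attribute absent
    print(f"{name}: SR {sr:.0f}, inverse SR {sr_inv:.0f}")
# SRs of 100, 95 and 110 become 100, 102 and 91: the inverse ratios sit
# closer to 100, as expected when the overall percentage is high.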
Table 4 Estimated standard errors for a variety of examples of standardised ratios

                          Sample subgroup                            Total
                          1         2         3         4           sample
(a) Percentage            20%       27%       27%       25%         24%
    Sample size           586       139       456       274
    Standardised ratio    90        116       105       103
    Standard error        5.5       13.2      5.9       8.9

(b) Percentage            19%       36%       56%       69%         25%
    Sample size           1119      191       78        35
    Standardised ratio    87*       119       138*      153*
    Standard error        2.9       10.3      13.2      19.4

(c) Percentage            88%       81%       72%       67%         78%
    Sample size           585       139       456       274
    Standardised ratio    110*      103       94*       86*
    Standard error        1.5       3.8       2.1       3.1

(d) Percentage            12%       19%       28%       33%         22%
    Sample size           585       139       456       274
    Standardised ratio    60*       90        121*      147*
    Standard error        5.5       14.0      6.8       11.0

(e) Percentage            35%       34%       45%       46%         40%
    Sample size           586       139       456       275
    Standardised ratio    91*       87        107*      113*
    Standard error        3.8       9.2       4.4       6.4

(f) Percentage            11%       12%       9%        11%         11%
    Sample size           586       139       453       275
    Standardised ratio    107       113       85        104
    Standard error        9.3       24.1      10.5      15.8

(g) Percentage            4%        13%       12%       13%         12%
    Sample size           177       339       347       877
    Standardised ratio    34*       92        147*      106
    Standard error        11.5      11.0      17.7      6.1

(h) Percentage            26%       21%       10%                   18%
    Sample size           422       489       521
    Standardised ratio    111       98        85
    Standard error        6.8       6.4       9.1

(i) Percentage            23%       15%       3%                    11%
    Sample size           333       569       770
    Standardised ratio    168*      96        49*
    Standard error        13.2      6.8       7.9

* Significantly different from 100 (p<0.05)
associated with a large column percentage in comparison with those of other subgroups, the first two situations will not exist for the same subgroup. Example (h), where the subgroups are of roughly equal size and the ratios are relatively close to 100, suggests that a small percentage may have a marked effect on standard errors. The largest standard error is recorded for group 3, for which the percentage value is 10% compared with at least 21% for the other subgroups.

iv. Subgroup size has a particularly strong effect on the comparative size of the standard error for different subgroups in the same analysis. As would be expected, standard errors increase as subgroup size decreases, although the relationship cannot be simply defined because of the influence of other variables. Most of the examples in Table 4 indicate the larger standard errors associated with smaller subgroup sizes, although this effect is sometimes difficult to separate from the similar influence of a higher ratio (examples (b) and (i)) or, more rarely, a lower percentage (as perhaps seen in example (e)). In example (f), where the percentages are reasonably similar across the subgroups and the ratios are not in a very broad range, the size of the standard errors increases as subgroup size decreases; the largest standard error (24.1) is recorded for the smallest subgroup (n=139) and the smallest standard error (9.3) for the largest subgroup (n=586). Example (g) is unusual in that the largest standard error is not seen for the smallest subgroup but for the group with the largest standardised ratio.

Although some general patterns in the size of standard errors can be identified, it is difficult to quantify the relationships and use them to predict the likely size of standard errors in individual analyses because of the complex interaction of the various factors involved. The examples perhaps illustrate the danger of using 'rule-of-thumb' methods to assess the deviation of ratios from 100. With the subgroup sizes used in Table 4 a possible rule might have been to concentrate on deviations from 100 of +/-15. This approach would have correctly identified nine ratios that were significantly different from 100 on the basis of their estimated standard errors, but would also have identified four that were probably not significant and would have failed to identify six statistically significant ratios.

The examples shown indicate that there are difficulties in judging whether individual standardised ratios differ significantly from 100 without estimating standard errors. However, standardised ratios can still offer useful insights into the relationships under investigation when used in conjunction with detailed tables, especially where there are strong associations in the data. It is also apparent that carrying out the calculations for some examples of standardised ratios on a specific survey may give a general indication of the size of standard errors for the size of subgroups being considered; this may then be used as a guide in other analyses.

3.2 Making allowance for the complex survey design

The estimated standard errors discussed so far and shown in Table 4 are calculated on the simplified assumption of a simple random sample design. Since most surveys carried out by SSD use a multi-stage probability sample design, involving both stratification and clustering, standard errors calculated assuming a simple random sample will tend to underestimate the true values.

A method of estimating true standard errors for standardised ratios, allowing for the complex sample design, has been developed using Epsilon, SSD's in-house package for calculating sampling errors. The method first requires that a set of relatively complex variables are derived from the raw data; these variables were derived in SPSS for the following examples.

True standard errors were calculated for a small number of examples from a survey in which the primary sampling units were postcode sectors and with an achieved sample of about 20 households per cluster. Table 5 compares the resulting 'true' standard errors with the estimated standard errors for the same ratios assuming a simple random sample design (srs). The comparison of results from the two methods of estimation will, of course, vary according to the nature of the variables of interest and the specific details of the sample design.

The results in Table 5 show that the two estimates of standard error were, in most cases, very close. As expected, the standard error assuming a simple random sample was usually slightly smaller than the estimate of the true standard error, although there were examples where the reverse was true. In four fifths of the cases shown, the ratio of the estimated true standard error to the srs standard error was between 0.9 and 1.1, and in only one case (5% of the total) was the ratio as great as 1.2. With differences of this order of magnitude it is unlikely that use of the srs standard error rather than the 'true' standard error would affect the interpretation of results.
Table 5 Standard errors for standardised ratios: comparison of estimates assuming a simple random sample and estimates allowing for complex design

                          Sample subgroup
                          1         2         3         4
(a) Percentage            40%       31%       58%
    Sample size           1028      334       130
    Standardised ratio    98        98        113
    Srs standard error    1.9       6.7       7.5
    'True' standard error 2.0       6.7       6.9

(b) Percentage            38%       29%       53%
    Sample size           1074      286       389
    Standardised ratio    98        93        106
    Srs standard error    2.2       7.4       4.0
    'True' standard error 2.5       7.6       4.1

(c) Percentage            88%       81%       72%       71%
    Sample size           585       139       456       274
    Standardised ratio    109       102       93        90
    Srs standard error    1.6       3.9       2.1       3.2
    'True' standard error 1.6       3.8       2.1       3.9

(d) Percentage            83%       78%       68%       66%
    Sample size           598       256       442       382
    Standardised ratio    108       107       92        92
    Srs standard error    1.7       3.1       2.4       2.8
    'True' standard error 1.6       3.3       2.6       2.9

(e) Percentage            12%       20%       14%       27%
    Sample size           199       240       488       552
    Standardised ratio    57        94        110       111
    Srs standard error    10.1      10.4      9.4       5.3
    'True' standard error 9.2       11.7      9.4       5.4

(f) Percentage            4%        13%       12%       13%
    Sample size           177       339       347       877
    Standardised ratio    33        100       130       107
    Srs standard error    11.2      12.3      16.3      6.2
    'True' standard error 12.5      11.2      15.3      5.9
In general terms the effect of a complex sample design on the accuracy of survey estimates may be measured by the design factor (deft), which is the ratio of the estimated true standard error to the standard error assuming a simple random sample. If summary values of deft can be identified for specific topic areas within a survey, then the appropriate deft value may be applied as a multiplier to the srs standard error to give an approximate estimate of the true standard error. This approach might also be used for standard errors of standardised ratios and would reduce still further the chance that a ratio might be identified as significantly different from 100 when this difference could be explained by sampling error.

The method of calculating true standard errors is relatively complex and time-consuming, even with access to a customised package to calculate standard errors, and researchers would not usually be in a position to adopt this approach for a large number of analyses using standardisation. A more realistic approach is to carry out the simpler calculations for standard errors assuming a simple random sample and then perhaps to adjust by a suitable value of deft to allow for the complex sample design.
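As an illustrative sketch only (Python; the function names are mine, and nothing here comes from Epsilon or any SSD software), the deft adjustment and the 5% test could be written as:

# Adjust an srs-based standard error by a design factor (deft), defined
# in the text as the ratio of the estimated true standard error to the
# srs standard error, then apply the two-sided 5% test against 100.
def adjusted_se(srs_se: float, deft: float) -> float:
    """Approximate the true standard error of a standardised ratio."""
    return deft * srs_se

def differs_from_100(ratio: float, se: float, z: float = 1.96) -> bool:
    """Two-sided 5% test of whether a standardised ratio differs from 100."""
    return abs(ratio - 100.0) > z * se

# Example (c), group 4, from Table 5: srs se 3.2, 'true' se 3.9, deft 1.22.
se = adjusted_se(3.2, 3.9 / 3.2)
print(round(se, 1), differs_from_100(90, se))  # 3.9 True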
4. Conclusions

Indirect standardisation is a useful technique in survey analysis to summarise relationships after taking account of the confounding effects of a third variable. In SSD it has been used particularly to control for the effects of age when investigating associations with various health measures. The method should not, however, be seen as an alternative to consideration of the full detail of the appropriate three-way tables.

Full interpretation of the resulting standardised ratios depends on having an estimate of the standard errors associated with the ratios. Calculation of true standard errors, which take account of the complex sample design, is a relatively complicated procedure and is not likely to be feasible on most surveys. Calculation of standard errors under the assumption of an srs design can be effected by means of a simple spreadsheet and might be considered a more realistic option. Examples presented above suggest that the results of this simpler calculation are, in general, a useful guide to the true standard errors although tending to be slightly smaller.

Even where standard errors are not calculated, standardised ratios can provide a useful summary of relationships when used in conjunction with detailed tables. If standardisation is being used for a number of analyses on the same survey then it may be possible to judge which ratios are sufficiently different from 100 to be worthy of comment, but estimating standard errors for a small number of examples may help to provide more specific guidance. The examples presented in this paper show that standard errors tended to be larger where the overall percentage of cases with the attribute was small. For the categories included in a specific analysis, standard errors tended to be larger where the subgroup sample size was relatively small, and where the standardised ratio was large or the observed percentage was small in comparison with other groups.

References and notes

1. Marsh, C. (1988). Exploring data. Polity.
2. Fleiss, J L. (1981). Statistical methods for rates and proportions. Wiley.
3. Acknowledgements to Dave Elliot in SSD's Methods and Sampling Branch, who carried out the theoretical work on the estimation of standard errors.
4. The variance of a standardised ratio (Rj) was derived using a Taylor series approximation, in terms of p, the observed proportion; w, the number of cases as a proportion of the total sample size N; i, a stratum (age band); and j, a subgroup.
5. Wolter, K M. (1985). Introduction to variance estimation. Springer-Verlag.
A clinical experiment in a field setting: the design of the Dinamap
calibration study
Keith Bolling, Joy Dobbs and Charles Lound
1. Introduction

One of the primary aims of the Health Survey for England is to measure changes in blood pressure levels in the population over time, and to produce comparisons of blood pressure between sub-groups. The Dinamap 8100 automatic blood pressure measuring device was selected for use in the survey because it is robust, portable, easy to use, and has been shown to produce valid results. However, a small clinical study (O'Brien et al. 1992) suggested that, compared with the more traditional mercury sphygmomanometer, the Dinamap tends to overestimate systolic blood pressure in the lower BP ranges and underestimate systolic blood pressure in the higher BP range. Diastolic blood pressure tends to be underestimated at all BP levels. In order to calibrate the blood pressure results produced on the main Health Survey it was decided that a larger study should be carried out under field conditions to find the relationship between the Dinamap 8100 and a mercury sphygmomanometer.

This paper examines the issues involved in setting up and carrying out the Dinamap study over a four month period. It concentrates on two design aspects of the study - the sample and the fieldwork.

2. Measuring blood pressure

Various devices are now available for measuring blood pressure, all of which give slightly different readings. Until recently not much has been known about how different devices compare. Before selecting the Dinamap 8100 for use on the Health Survey, careful consideration was given to the merits of various measuring devices in an epidemiological setting.

The traditional device for measuring blood pressure is the mercury sphygmomanometer, commonly used in doctors' surgeries and hospitals. It is based on the auscultatory principle whereby the observer deflates the cuff at about 2mmHg per second while listening for Korotkoff sounds by placing a stethoscope over the brachial artery. At the first regular appearance of sounds (called Korotkoff phase I) the systolic pressure is measured by reading off the column of mercury. Cuff deflation continues until the sounds finally disappear (called Korotkoff phase V), at which point the diastolic pressure is read off the column of mercury. The main disadvantages of the mercury sphygmomanometer in epidemiological studies such as the Health Survey are observer variance, movement artefacts, and background noise.

The Dinamap 8100 is one of a number of modern, automatic devices now available on the market. This device is based on the oscillometric principle and measures vibrations in the artery. Cuff deflation is controlled automatically and readings are recorded and displayed on a digital screen. Unlike monitors based on the auscultatory principle, the Dinamap 8100 allows, to a large degree, the standardisation of the procedure and the minimisation of observer variance. Previous studies in both the U.S. and the U.K. (Ornstein, 1988; Whincup, 1992) have concluded that the Dinamap models do give blood pressure readings which can be compared across devices and time.

3. Sample design

The twin constraints of nurse availability and time/costs meant that the study was restricted to eight areas of the country. The areas selected were spread between the Regional Health Authorities and covered different types of areas (e.g. inner city, rural, suburban, etc.). One area was specifically selected because it had a high proportion of retired people. Given the nature of the study and the desire to minimise travelling time the areas were more clustered than on most surveys, with approximately 1 in 20 addresses in each study area being sampled.

Since the main Health Survey is particularly interested in monitoring the proportion of people in the higher blood pressure ranges, the study had to ensure sufficient numbers of respondents with high blood pressure. Analysis of the 1991 Health Survey data indicates that only one in twenty adults have systolic blood pressures over 180mmHg and that most of these are aged over 55. With a target of 1000 adults it was clear that a simple random sample would not produce enough respondents with higher blood pressure. Consequently it was decided to use age as the best surrogate for higher blood pressure and to oversample those aged 55 and over.

3.1 Method of oversampling

Some very simple rules were devised to ensure oversampling of those aged 55+. Due to the large number of addresses that the interviewers had to visit in a short period of time it was important that these rules could be applied easily and quickly in the field.
The rules were as follows:
- if anyone in the household was aged 55 or over then all the adults in the household would be included in the sample (irrespective of age);
- if nobody in the household was aged 55 or over then the interviewer would apply a simple selection procedure to decide whether that household should be included in the sample or not, so that only 1 in r households which contained nobody aged 55+ would be sampled.

3.2 Sampling Fraction

Figures from the 1991 GHS provided an estimate of the number of households in Great Britain and their composition (see Table 1).

Table 1 Households in 1991 GHS by number of adults aged 16-54 and number of adults aged 55+

Number of adults       Number of adults in household aged 55+
in household
aged 16-54             0        1        2        3        All 1-3   Row total   %
0                      0        1425     793      9        2227      2227        22.4
1                      1630     422      65       1        488       2118        21.3
2                      3943     137      13       0        150       4093        41.1
3                      1010     33       0        0        33        1043        10.5
4+                     464      10       0        0        10        474         4.7
Column total           7047     2027     871      10       2908      9955        100.0
%                      70.8     20.4     8.7      0.1

From the table it can be seen that 71% of households in GB contain only adults in the age range 16-54. Such households have an average of 2.06 adults. The remaining 29% of households contain at least one adult in the 55+ age range and have an average of 1.31 adults in the older age group and 0.32 adults in the younger age group.

Using these figures as a base, we would expect households containing only the younger age group to yield a sample of:

(0.71 x n x 2.06/r) younger adults

where n is the sample size and r is the sampling fraction. Similarly, we would expect households containing at least one adult aged 55 or over to yield a sample of:

(0.29 x n x 1.31) older adults and
(0.29 x n x 0.32) younger adults.

Combining these would produce a sample as follows:

adults aged 16-54 = n[(0.71 x 2.06/r) + (0.29 x 0.32)]
adults aged 55+   = n(0.29 x 1.31)

As the sampling fraction (r) varies, so the sample size and the proportion of adults in the lower age group vary. Table 2 shows what would be expected given different values of r.
Table 2 Sample size per household sampled and percentage of adults aged 55+

r    No. of adults per household sampled    % of adults aged 55+
1    1.934                                  20
2    1.205                                  32
3    0.962                                  40
4    0.840                                  45
5    0.767                                  50
6    0.719                                  53
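A short sketch (Python, with illustrative names) reproduces the figures in Table 2 above from the expressions just given; the totals differ slightly in the third decimal place, presumably because the published household averages are themselves rounded.

# Expected yield per household sampled, from the 1991 GHS figures quoted
# in the text: 71% of households contain only adults aged 16-54 (2.06 on
# average); 29% contain at least one adult aged 55+ (1.31 older and 0.32
# younger adults on average). r is the sampling fraction for the former.
def yield_per_household(r: float):
    younger = 0.71 * 2.06 / r + 0.29 * 0.32   # adults aged 16-54
    older = 0.29 * 1.31                        # adults aged 55+
    total = younger + older
    return total, 100 * older / total

for r in range(1, 7):
    total, pct_55 = yield_per_household(r)
    print(f"r={r}: {total:.3f} adults per household, {pct_55:.0f}% aged 55+")
# e.g. r=3 gives about 0.960 adults per household, 40% of them aged 55+.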
These calculations suggested that a sampling fraction of 2 or 3 was most appropriate. Although a sampling fraction of 3 yields a higher proportion of respondents in the 55+ age group, it also yields a considerably smaller number of adults, on average, per household, thus requiring a larger number of initial contacts to obtain the overall target of 1000 adults. Therefore, in selecting the sampling fraction the importance of having a large number of respondents over 55 had to be balanced against the workload.
Table 3 Initial sample size required to achieve a sample of 1000 adults

Initial HHld   Eligible HHld   No. of resp.        No. of persons   Persons    Persons
sample         sample          HHlds           r   sampled          aged 55+   aged 16-54
1420           1250            840             2   1012             321        543
1490           1310            880             2   1060             336        572
1560           1370            920             2   1109             352        593
1625           1430            960             2   1157             367        620
1690           1490            1000            2   1205             382        646
1690           1490            1000            3   962              421        528
1760           1550            1040            3   1000             398        549
1840           1610            1080            3   1039             414        570
1900           1670            1120            3   1077             429        591
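The arithmetic behind Table 3 can be sketched in the same way, using the assumptions described in section 3.3 below (a 12% ineligibility rate and a 67% response rate) together with the adults-per-household figures from Table 2. The sketch is illustrative only and rounds less aggressively than the published table.

# Expected achieved sample for a given initial sample of addresses.
ADULTS_PER_HH = {2: 1.205, 3: 0.962}   # from Table 2

def expected_yield(initial: int, r: int):
    eligible = initial * 0.88          # 12% of addresses assumed ineligible
    responding = eligible * 0.67       # 67% household response rate
    persons = responding * ADULTS_PER_HH[r]
    return round(eligible), round(responding), round(persons)

print(expected_yield(1840, 3))  # (1619, 1085, 1044): close to Table 3's 1610, 1080, 1039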
3.3 Initial Sample Size

Having examined the impact of different sampling fractions, a few other assumptions had to be made to determine what the initial sample size should be. First, an ineligibility rate of 12% was assumed, which is in line with most other surveys. Second, a response rate of 67% was assumed, which was slightly lower than the 71% response for measuring blood pressure on the main Health Survey. There were several reasons for this cautious estimate of response rate: unlike the Health Survey, no interview was carried out prior to the measurements; it was unclear how respondents would react to six blood pressure measurements being taken, particularly as no pilot work had been carried out to assess people's reactions; and the higher proportion of older people in the sample.

Taking all the assumptions on household composition and response rates, Table 3 illustrates the number of people we would expect to achieve given different initial sample sizes and different sampling fractions. Using these figures it was decided that an initial sample size of 1840 households was most appropriate, using a sampling fraction of 1 in 3. Given a response rate of 67% this would be expected to yield 1040 adults, of which 40% should be in the 55+ age group.

4. Field Design

The previous small study (O'Brien et al. 1992) was carried out in a controlled laboratory setting with a clinical, rather than an epidemiological, approach. In adopting a field study the challenge was to come up with a design which replicated clinical conditions as far as possible in the field. Thus, from the start the emphasis was on procedures which controlled and regulated the work done by the nurses to ensure a high degree of accuracy and consistency.

4.1 Role of Interviewers

During the study, interviewers were involved at the beginning of each field period. Following an advance letter, interviewers had to make initial contact with households, to select households for inclusion in the study according to the criteria outlined above, and to make appointments for nurses. Although consideration was given to using just Health Survey nurses in the study, the advantages of using interviewers were clear. First, interviewers are better trained in approaching members of the public and persuading them to participate in surveys. Second, it was felt that potential respondents were less likely to co-operate if approached initially by two people carrying equipment rather than by just one person. Third, nurses have no training or experience in collecting household information in a standardised way or in applying standard rules according to prespecified selection criteria. Finally, and probably most importantly, given the short time available for fieldwork, using interviewers to screen out ineligibles and refusals and make appointments would reduce the amount of 'non-productive' time nurses would have to spend in the field.

Given interviewer availability and the nature and size of the task, three interviewers were involved in each of the study areas. This required careful co-ordination between interviewers to ensure that different interviewers were not 'double booking' nurses. In each area interviewers were each allocated time slots by the Field Officer in which only they could make appointments. Additionally, one interviewer was designated to 'co-ordinate' their work and also to liaise with the nurses.
4.2 Role of nurses

In each area the role of the nurse team was to call at the participating households, take blood pressure measurements according to the measurement protocols and collect some basic demographic information. Two nurses were used, with one taking only Dinamap measures and the other taking only mercury sphygmomanometer measures in any particular household. In all cases the aim was to ensure that each nurse did not know the measurements taken by their partner, to minimise the risk of one set of readings being 'contaminated' by the other set.

A number of studies have shown considerable observer variation in blood pressure measurement using a mercury sphygmomanometer. The British Hypertension Society (BHS) protocol for evaluating blood pressure devices (O'Brien et al. 1990) places considerable emphasis on observer training prior to undertaking any validity test, particularly with respect to the mercury sphygmomanometer. If an observer has been trained in the past then re-training just prior to any test is considered essential.

For this reason an important element of the study was to hold a training course for the 16 nurses involved. Half a day was used to cover various aspects of the protocol, the schedules, and collecting industry and occupation information, while a whole day was devoted to retraining nurses in the use of mercury sphygmomanometers. This training was carried out by an external consultant with experience in clinical hypertension, and was held the week before the start of fieldwork.

4.3 Measurement Protocols

Three separate protocols were devised. The first related to the use of the Dinamap 8100 and was similar to the protocol used on the main Health Survey. The second related to the use of the mercury sphygmomanometer and was based mainly on guidelines set out in the BHS protocol. The third protocol was a general one relating to the way in which the whole set of measurements should be done.

In designing the general protocol, rules had to be established which ensured accurate and consistent measurements by all the nurses and were simple to apply in the field. Again many of the points recommended in the BHS protocol were adopted. The main design factors were as follows:

i. It was decided that for each respondent three pairs of readings would be taken (six in total). Although all six readings were to be recorded, the first pair would be discarded in analysis to minimise the risk of obtaining falsely high readings. This is consistent with procedures on the main Health Survey and follows the recommendation of a BHS working group (Petrie et al. 1986) which concluded that during the first blood pressure reading the body exhibits a 'defence reaction' which causes a temporary increase in blood pressure.

ii. The ideal recommended by the BHS is simultaneous same-arm measurement by the test device (Dinamap) and the standard (mercury sphygmomanometer). However, because the Dinamap 8100 has a rapid-deflation cuff, simultaneous measurement is not possible. Instead a system of sequential measurements was adopted, with all the measurements being done on the same arm. This meant that six separate measurements were taken alternating the devices, e.g. mercury, Dinamap, mercury, Dinamap, mercury, Dinamap.

iii. The ordering of measurements was randomised to ensure that any ordering effect was eliminated. A simple rule relating to the address serial number on the measurement schedule was devised to ensure systematic randomisation. If the address serial number of a particular household was even, the nurses would always start with a Dinamap reading and alternate accordingly, giving the sequence: 1 Dinamap, 2 mercury, 3 Dinamap, 4 mercury, 5 Dinamap, 6 mercury. Alternatively, if the address serial number of a particular household was odd, the nurses would always start with a mercury reading, giving the sequence: 1 mercury, 2 Dinamap, 3 mercury, 4 Dinamap, 5 mercury, 6 Dinamap.

iv. The BHS protocol recommends that separate observers should measure blood pressure in approximately half the subjects to prevent any observer bias. For this reason it was decided to alternate which nurse carried out which measurements, so that over the period of the whole study each nurse carried out half the Dinamap measurements and half the mercury sphygmomanometer measurements. To achieve this, nurses simply alternated which measuring device they used from household to household.

v. When measuring blood pressure a range of cuff sizes can be used depending upon arm circumference. Selecting the correct cuff size is very important since the use of an inappropriate cuff size in relation to mid-arm circumference can produce inaccurate readings. Moreover, a change in cuff size creates a discontinuity in readings. To minimise this problem nurses were only given two different cuffs (compared with three on the main Health Survey) and very precise rules were laid down for which cuff should be used. Prior to blood pressure measurements being taken, one nurse measured the respondent's mid-arm circumference. If the mid-arm circumference was 31cm or less then the adult cuff was to be used. If the mid-arm circumference was greater than 31cm then the large adult cuff was to be used.

vi. Preparing the setting before taking measurements was an important consideration. Although it is clearly impossible to impose absolute conditions in a field setting, the 'ideal' which the nurses should try to follow was laid out. Factors to be considered included making sure the mercury sphygmomanometer was always placed on a flat surface at eye level, trying to reduce background noise as much as possible (e.g. switching off the TV, asking people not to talk), and covering up the Dinamap's digital display so that it could not be seen by people in the room, especially the nurse taking the mercury measurements.
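The alternation rule in point iii above is simple enough to express in a few lines of code. The sketch below (Python, illustrative only; it is not part of the study's documentation) returns the device sequence for a given address serial number:

# Systematic randomisation of device order: even serial numbers start
# with a Dinamap reading, odd serial numbers with a mercury reading,
# and the two devices then alternate over the six measurements.
def reading_sequence(serial_number: int, n_readings: int = 6):
    first, second = ("Dinamap", "Mercury") if serial_number % 2 == 0 else ("Mercury", "Dinamap")
    return [first if i % 2 == 0 else second for i in range(n_readings)]

print(reading_sequence(102))  # ['Dinamap', 'Mercury', 'Dinamap', 'Mercury', 'Dinamap', 'Mercury']
print(reading_sequence(77))   # ['Mercury', 'Dinamap', 'Mercury', 'Dinamap', 'Mercury', 'Dinamap']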
5. Conclusions

The design and fieldwork stages of the Dinamap study have now been successfully completed. With design work in December and January, briefing and training towards the end of January, and fieldwork during February and March, the total time between submission of the initial proposal and completion of the fieldwork was less than 4 months. Response rates have proved to be very encouraging and the target of 1000 adults has easily been met, with 52% of the respondents being in the 55 and over age group. Analysis of the data is now being carried out and initial results are due towards the end of May.

References

O'Brien, E. et al. (1990). The British Hypertension Society protocol for the evaluation of automated and semi-automated blood pressure measuring devices with special reference to ambulatory systems. Journal of Hypertension, Vol. 8, pp.607-19.
O'Brien, E. et al. (1992). Accuracy of the Dinamap Portable Monitor, Model 8100 determined by the British Hypertension Society Protocol. Unpublished.
Ornstein, S. et al. (1988). Evaluation of the Dinamap blood pressure monitor in an ambulatory primary care setting. The Journal of Family Practice, Vol. 26, pp.517-21.
Petrie, J. C. et al. (1986). Recommendations on blood pressure measurement. British Medical Journal, Vol. 293, pp.611-15.
Whincup, P. H. et al. (1992). The Dinamap 1846SX automated blood pressure recorder: comparison with the Hawksley random zero sphygmomanometer under field conditions. Journal of Epidemiology and Community Health, Vol. 46, pp.164-69.
A brief look at response to postal sifts, surveys and keeping in touch
exercises from the viewpoint of the sampling implementation unit
Tracie Goodfellow
In recent years the Sampling Implementation Unit (SIU) has been involved in the administration and management of a wide variety of postal sifts, surveys and keeping in touch exercises, referred to hereafter as "postals". This paper does not aim to cover all of the postal work that has been undertaken. Instead six "postals" have been selected; these demonstrate all of the aforementioned types, as well as showing response to "postals" in general, plus response rates over time and at various times of year. Also included are some points about "postals" that have been identified during the surveys and may be of use to note for the future.
1. Explanation of Postal Sift, Survey, and Keeping in Touch Exercises

1.1 Postal Sift

A postal sift involves sending out sift forms to a sample of addresses selected from the Postcode Address File (PAF). These forms are fairly simple and normally ask for the household composition, including age and sex, plus whether the address contains one or more households. Once these forms are returned the information is keyed and a sample can be drawn from those cases showing the required characteristics. Those selected are then followed up with a full survey. Examples in this paper are the Toddlers Dietary Postal Sift and the Day Care Postal Sift.

1.2 Postal Survey

A postal survey would normally use address data from non PAF sources, although a PAF based postal survey could also be undertaken. For this type of survey the full questionnaire, rather than a sift form, is sent to the address. The outcome only is recorded by the SIU; all other processing is undertaken by Data Prep and Primary Analysis Branch (PAB). Examples in this paper are the Infant Feeding Survey, the Children's Dental Health Survey and the National Foundation for Educational Research (NFER) Survey.

1.3 Keeping in Touch Exercises

Keeping in touch exercises involve contacting those who were interviewed in a main survey to ensure that they can be contacted in a future follow up survey. Normally these people are contacted on a yearly basis. The form is once again simple and asks if they have changed name or are likely to be moving and, if so, where to. The example in this paper is the Retirement Follow Up Survey.

2. General Mailing Procedures

In the majority of postal work the addresses are kept on a database. At present most postal work is conducted using the Postal Administration System (PAS), which resides on the VAX. The addresses reach the database either by a file transfer from the PAF, if a PAF based sample, or by the addresses being keyed by Data Prep, if a non PAF based sample. All addresses keyed by Data Prep require serial numbering by the SIU first.

The PAS allows for the production of address and serial number labels, address lists, reminder labels and reports, and the production of a final sample if a sift is being used, plus the update of data where necessary.

The majority of postal work has two reminders sent. These reminders are despatched to all who have not replied, at intervals of two weeks. "Postals" are always despatched and returned by first class mail since a fast turn round is essential. Despatch normally takes place on a Thursday since the public seem to respond better if they receive the sift/questionnaire near to the weekend.

Some postal sifts and surveys have all non-response followed up by interviewers calling. The SIU code the sift forms and questionnaires to show the outcome using a different set of codes to allow comparison of postal and interviewer response.

3. Response

3.1 Toddlers Dietary Sift

Type: Postal Sift
Objective: To identify households with children aged 1.5 to 4.5 years of age by means of a one page sift document, asking the age and sex of all those in the household.
Source of sample: PAF based - clustered
Table 1  Toddlers Dietary Survey Postal Sift
         Replies by wave and stage

WAVE    Resp. to Orig M/O   Resp. to 1st Rem   Resp. to 2nd Rem   Resp. to Int Call
        No.       %         No.      %         No.      %         No.      %
1       2,660     38%       1,127    16%       1,419    20%       1,794    26%
2       2,415     35%       1,520    22%       1,502    21%       1,563    22%
3       2,922     42%       1,604    23%       884      12%       1,590    23%
4       2,632     38%       1,422    20%       1,352    19%       1,594    23%
Tot.    10,629    38%       5,673    20%       5,157    18%       6,541    23%

a)  The base used for the calculations for each wave was the set sample of 7,000.
b)  All postal response figures include: completed sift forms, Post Office (PO) returns, refusals.
c)  The response to the interviewer call includes all non contacts, hence each wave totals 100%.
d)  M/O = mail out.
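As note a) to Table 1 says, every percentage is taken on the set sample of 7,000 addresses per wave (28,000 overall), so the four stages in each row account for the whole sample and sum to 100%. A minimal sketch of that calculation using the wave 1 counts from the table (the variable names are illustrative only):

# Wave 1 counts from Table 1; the base is the set sample for the wave.
BASE = 7_000
wave1 = {
    "original mail out": 2_660,
    "first reminder": 1_127,
    "second reminder": 1_419,
    "interviewer call": 1_794,  # includes all non contacts (note c)
}

for stage, count in wave1.items():
    print(f"{stage}: {count / BASE:.0%}")  # 38%, 16%, 20%, 26%, as in Table 1

# The four stages partition the set sample, hence each row totals 100%.
assert sum(wave1.values()) == BASE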
Table 2  Toddlers Dietary Survey Postal Sift
         Positive response rates and non response rates

WAVE    Post. Resp.     Int. Resp.     Non Resp.      Total
        No.      %      No.     %      No.     %      No.      %
1       4,518    65%    1,150   16%    1,332   19%    7,000    100%
2       4,940    71%    1,007   14%    1,053   15%    7,000    100%
3       4,934    71%    984     14%    1,082   15%    7,000    100%
4       4,890    70%    900     13%    1,210   17%    7,000    100%
Tot.    19,282   69%    4,041   14%    4,677   17%    28,000   100%

a)  Response = completed sift form.
b)  Non response = PO returns, refusals, no reply to interviewer/postal, non contact by interviewer.

3.2
Day Care

Type:              Postal Sift
Objective:         To identify households containing children under the age of eight, using a one page sift document asking age and sex of members of the household.
Source of sample:  PAF based sample – clustered
Size:              Initially 27,200 addresses, boosted by a further 9,600
Timing:            July – September 1990
Reminders:         Two postal reminders, no interviewer follow up to non response

Points to note:

a)  Sift questionnaires were mailed out with all reminders.

b)  Response by post was higher than for the Toddlers' sift: 78% were returned by post for this sift compared with 69% on the Toddlers' survey. Response on the Toddlers' sift was boosted by interviewer follow up.
Table 3  Day Care Postal Sift
         Replies by wave and stage

WAVE    Resp. to Orig M/O   Resp. to 1st Rem   Resp. to 2nd Rem   Total Resp.
        No.       %         No.      %         No.      %         No.       %
1       13,080    48%       5,753    21%       2,510    9%        21,343    78%
2       4,712     49%       1,897    20%       897      9%        7,506     78%
Tot.    17,792    48%       7,650    21%       3,407    9%        28,849    78%

Figures quoted include: completed returns, PO returns, refusals.
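Because the Day Care sift had no interviewer follow up, its final column is simply the cumulative postal return, so each row of Table 3 should satisfy mail out + first reminder + second reminder = total. A small self-check of the table's figures (a throwaway illustrative script, not SIU code):

# Rows of Table 3: (orig mail out, 1st reminder, 2nd reminder, total response).
table3 = {
    "wave 1": (13_080, 5_753, 2_510, 21_343),
    "wave 2": (4_712, 1_897, 897, 7_506),
    "total": (17_792, 7_650, 3_407, 28_849),
}

for row, (mail_out, rem1, rem2, total) in table3.items():
    assert mail_out + rem1 + rem2 == total, f"{row} does not add up"

# Overall postal response on the 36,800 addresses despatched:
print(f"{28_849 / (27_200 + 9_600):.0%}")  # -> 78%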
3.3
nt Feeding, 1990 Great Britain sample
Type:
Postal Survey
Objective:
To collect
information
about experiences of infant
feeding from mothers of
babies at three stages from
the age of six weeks to nine
months, using a sixteen page
questionnaire.
Source of sample:
Draft Birth Registrations
clustered by registration
sub-districts or groups of
sub-districts
Size:
At stage one 9,064 births
At stage two 7,950 babies
At stage three 7,139 babies
Timing:
Stage 1 October–
November 1990
Stage 2 January–February
1991
Stage 3 June – July 1991
Points to note:
a)
Sift questionnaires were mailed out with all
reminders.
b)
Response by post was higher than for the
Toddlers’ sift; 78% were returned by post for
this sift compared with 69% on the Toddler’s
Survey. Response on the Toddlers’ sift was
boosted by interviewer follow up.
Table 4  Infant Feeding Postal Survey, 1990 Great Britain sample
         Positive response rates and non response rates

STAGE   Post. Resp.     Int. Resp.     Non Resp.      Total
        No.      %      No.     %      No.     %      No.      %
1       7,150    79%    800     9%     1,114   12%    9,064    100%
2       6,336    80%    803     10%    811     10%    7,950    100%
3       5,577    78%    -       -      1,562   22%    7,139    100%
Tot.    19,063   80%    1,603   7%     3,487   14%    24,153   100%

a)  Response = completed questionnaire.
b)  Non response = no baby, PO returns, refusals, no reply to interviewer/postal, non contact by interviewer.
c)  Response rates are calculated on the set sample at each stage.

Source: OPCS, 1990 Infant Feeding Survey report
3.4
Children's Dental Health

Type:              Postal Survey
Objective:         To collect background information about children who had been dentally examined in school, via a ten page questionnaire.
Source of sample:  Pupils on school registers in selected schools
Size:              The set sample was 5,811; however, 145 dropped out of the sample because the school was told that the child had refused the dental examination or because the child had left school prior to the "postal". Hence all figures are based on 5,666.
Timing:            February – March 1993
Reminders:         Two, plus interviewer follow up to non response

Table 5  Children's Dental Health Postal Survey
         Replies by stage

Resp. to Orig M/O   Resp. to 1st Rem   Resp. to 2nd Rem   Resp. to Int Call
No.       %         No.      %         No.      %         No.      %
2,254     40%       1,328    23%       988      17%       1,096    19%

a)  The base used for the calculations was the set sample of 5,666.
b)  All postal response figures include: completed questionnaires, PO returns, refusals.
c)  The response to the interviewer call includes all non contacts, hence the stages total 100%.
Table 6  Children's Dental Health Postal Survey
         Positive response rates and non response rates

Post. Resp.     Int. Resp.     Non Resp.      Total
No.      %      No.     %      No.     %      No.      %
4,516    80%    682     12%    468     8%     5,666    100%

a)  Response = completed questionnaire.
b)  Non response = PO returns, refusals, no reply to interviewer/postal, non contact by interviewer.
3.5
National Foundation for Educational Research

Type:              Postal Survey – Personality Test
Objective:         To collect completed personality tests from a sample of adults. It was estimated that the test would take about one hour to complete.
Source of sample:  Omnibus Survey: individuals aged 16 – 64, selected by the interviewer, who had responded to the Social Survey Division Omnibus Survey. The test was left with respondents, who were asked to return it by post in a pre paid envelope.
Size:              1,506
Timing:            January – March 1993
                   Interviewer placement 18 – 28 January 1993
                   1st reminder despatched 18 February 1993
                   2nd reminder despatched 4 March 1993
Reminders:         Two, but due to interviewer placement a longer time than normal was allowed between the placement and the first reminder.

Points to note:

a)  A £5 incentive payment was offered to all who returned a completed questionnaire.

b)  First contact was by an interviewer, who asked the Omnibus respondent if they would be willing to complete a further questionnaire. This meant that the element of ineligibility was ruled out and that those asked had already responded to the Omnibus Survey (80% of the set sample responded to the Omnibus).

c)  Further NFER surveys using the Omnibus as first contact will be carried out shortly without a £5 incentive payment, although only to respondents in non-manual occupations. It will be interesting to see how response to these compares with the very high response rate achieved with the incentive payment.

Table 7  NFER Postal Survey
         Positive response by stage and non response

Resp. to left Qu.   Resp. to 1st Rem   Resp. to 2nd Rem   Total Resp.   Total Non Resp.
No.       %         No.      %         No.      %         No.     %     No.     %
1,226     81%       61       4%        88       6%        1,375   91%   131     9%

a)  Response = completed questionnaire.
b)  All non response were no replies.
3.6
Retirement Survey

Type:              Keeping in touch exercise
Objective:         To keep in touch with a sample of those aged 55 to 69 years of age who were interviewed on a survey of retirement plans. Each person was sent a one page questionnaire to complete.
Source of sample:  Originally postal sift, then full interview
Size:              Original follow up size 3,543
Timing:            Year 1  May – June 1991
                   Year 2  May – June 1992
                   Year 3  May – June 1993
Reminders:         One

Points to note:

a)  The 1993 survey is not yet complete, hence response rates are only to date.

b)  The number of movers found by this exercise was minimal, at 14.

c)  Between years some of the sample were lost due to deaths, which were notified to the SIU largely by the National Health Service Central Register. The number of deaths on this survey was high due to the age of the population covered.
Table 8  Retirement Keeping in Touch Exercise
         Positive response by year and stage

YEAR    Resp. to Orig M/O   Resp. to 1st Rem   Total Resp.    Total Non Resp.
        No.       %         No.      %         No.      %     No.      %
1991    2,087     59%       766      22%       2,853    81%   690      19%
1992    2,245     68%       444      13%       2,689    81%   621      19%
1993    1,941     63%       449      15%       2,390    78%   693      22%

a)  Response = completed form.
b)  Non response = PO returns, refusals, no reply to postal, death.
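Table 8's bases shrink from year to year as deaths and other losses are removed from the sample (point c) above), so adding each year's total response and total non response recovers the effective base for that year. A quick illustrative check of the published percentages against those derived bases:

# Table 8 rows: (total response, total non response); the yearly base is
# their sum, since deaths notified between years shrink the sample.
table8 = {1991: (2_853, 690), 1992: (2_689, 621), 1993: (2_390, 693)}

for year, (resp, non_resp) in table8.items():
    base = resp + non_resp
    print(f"{year}: base {base:,}, response {resp / base:.0%}")
# -> 1991: base 3,543, response 81%
# -> 1992: base 3,310, response 81%
# -> 1993: base 3,083, response 78%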
4.
Summary

4.1
Postal Sifts

The Toddlers' sift yielded a response rate of 69% on average over all four waves. This was boosted to 83% by the use of an interviewer follow up. There was an indication that response to the reminders was lowered if a further sift form was not included.

The Day Care sift, which had no interviewer follow up, achieved a response rate of 78%; however, this figure includes PO returns and refusals.

4.2
Postal Surveys

The two postal surveys reported in this paper with two reminders and a questionnaire included at all stages, but with no incentive payment, yielded an average actual response rate of 80%.

The NFER survey yielded a 91% response rate, which was considerably higher. This was due to a combination of the fact that first contact was made by the interviewer and the fact that a £5 incentive payment was offered on completion of the questionnaire.

4.3
Keeping in Touch Exercise

The total response with one reminder only is 81%. However, this is from an elderly population, who tend to respond more quickly and readily. A similar exercise with a younger population may vary considerably.

4.4
Future "Postals"

I hope to include a small paper in the next bulletin following up some of the issues raised in this paper and exploring the use of "postals" in constructing sampling frames.
Telephone ownership north of the Caledonian Canal
Andrea Cameron
1.
Introduction

The use of a completely unclustered sample by the enhanced Labour Force Survey (LFS) presents problems for the Northern Scotland area, north of the Caledonian Canal (NOCC). The geographical spread of the population means face-to-face interviewing is impractical. Rather than revert to a clustered sample, telephone interviewing was chosen as a way of maintaining relatively low costs together with the benefits of an unclustered sample.
The central sampling frame for telephone surveys, the published directories, is incomplete because of the increasing incidence of ex-directory numbers. According to BT (1989)1, 25% of telephone users were ex-directory in 1989, and this figure tends to be even higher for urban areas. Figures from the 1989 General Household Survey (GHS) indicate that 89% of the population are on the telephone. Interviewing by telephone, sampling from the published directories, therefore effectively excludes 33% of the sample population. In order to attempt to counteract this bias for the LFS, a random digit dialling experiment was carried out in the NOCC area. This experiment achieved low final response rates (72%) and low productivity, and therefore could not be considered a serious alternative to directory sampling for the LFS.

The problems encountered through this experiment triggered the need for more information about telephone ownership in the NOCC area in order to justify the use of an unclustered sample in this area. A postal survey of 1,000 addresses, randomly selected from the Post Office's Postcode Address File (PAF), was chosen as the quickest and most economical way of getting at this information. As well as establishing information on the level and type of telephone ownership, this survey would also reveal more detailed information on the numbers of households listed, or expecting to be listed, in the published directories and on ex-directory households.

2.
Response Rates

Postal surveys have generally tended to attain low response rates in comparison to other survey methodologies. In order to maximise response rates a number of factors were adopted from Scott (1961)2:

The questionnaire was kept anonymous.

A short questionnaire was printed on the reverse side of the covering letter.

All questionnaires were accompanied by a reply paid envelope.

Two follow up letters were sent, the second accompanied by a further questionnaire and reply paid envelope.

1,000 letters were sent out to addresses taken from the PAF. Replies received from 77 of these indicated that the addresses were ineligible. It is, however, known that about 12% of PAF addresses are ineligible, and we have therefore taken 880 to be the effective sampling base. From this we received 761 usable replies, a response rate of 86%; 13% were non contacts, and 4 replies were refusals to take part. A sketch of this calculation is given below.
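The effective sampling base arithmetic above is worth making explicit. A minimal sketch, assuming (as stated in the text) that about 12% of PAF addresses are ineligible; the function name is illustrative only:

def response_rate(issued: int, usable_replies: int, ineligible_rate: float) -> float:
    # Remove addresses expected to be ineligible (e.g. empty or demolished
    # PAF addresses) from the denominator before calculating the rate.
    effective_base = issued * (1 - ineligible_rate)
    return usable_replies / effective_base

# 1,000 letters issued, about 12% of PAF addresses ineligible -> base of 880;
# 761 usable replies were received.
print(f"{response_rate(1_000, 761, 0.12):.0%}")  # -> 86%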
3.
Level of Telephone Ownership

85% of responding households in the NOCC area were on the telephone at the time of this survey, compared to the 89% reported by the GHS in 1989. This figure adds support to earlier arguments (Collins, 1987)3 that levels of telephone ownership tend to be lower for rural than for urban areas. Interestingly, however, we found no difference between levels of telephone ownership amongst respondents whether they claimed to live in a City or Town, Village or Rural area.

Of responding households who were on the telephone, 84% stated that they were listed in the current published directory at the time of our survey. Only 8% claimed that they were ex-directory households, compared to BT's national average of 25% in 1989. This supports Collins' argument that the incidence of ex-directory numbers will be higher in urban areas. Fifteen per cent of households were not listed in the published directories at the time of our survey, because an additional 7% of households, while not ex-directory, were due to appear in a directory not yet published. (One per cent did not know whether they were listed in the directory.) This figure indicates that the level of bias introduced by ex-directory numbers when sampling from published telephone directories is only part of the story: we should also be taking into account those households waiting to be listed in the directories.
4.
Conclusions

The results of this survey indicate that telephone interviewing using the published directories as a sampling frame in the NOCC area will exclude approximately 28% (which includes movers) of the population from the effective sampling base. This figure, although lower than the UK average of 33% calculated from BT and GHS sources (see above), which may not include movers, still incorporates a serious amount of bias into any telephone survey. Until further advances are made in methods of random digit dialling, and more up to date information is made readily available on the level and type of telephone ownership, other methods, such as weighting, may have to be adopted to counteract this bias.

At present, however, the telephone directory sample is being used in the LFS for the NOCC area, because it is judged that the benefit of an unclustered sample for this area outweighs the losses from bias.
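The 33% and 28% exclusion figures both follow from combining non-ownership of a telephone with the unlisted share of telephone-owning households. A short sketch of that arithmetic, on the assumption (implied by the text) that the unlisted percentages apply within telephone households:

def excluded_from_directories(phone_ownership: float, unlisted_share: float) -> float:
    # Households without a telephone, plus telephone households whose
    # number is not in the published directory.
    return (1 - phone_ownership) + phone_ownership * unlisted_share

# Great Britain: 89% on the telephone (GHS 1989), 25% ex-directory (BT 1989).
print(f"{excluded_from_directories(0.89, 0.25):.0%}")  # -> 33%

# NOCC: 85% on the telephone, 15% of telephone households not listed
# (8% ex-directory plus 7% awaiting a directory not yet published, i.e. movers).
print(f"{excluded_from_directories(0.85, 0.15):.0%}")  # -> 28%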
References
1.  British Telecom (1989). The code decoder.

2.  Scott, Christopher (1961). Research on Mail Surveys (paper read before the Royal Statistical Society, February 1961). Published by Social Survey Division in the Methodological Series, M100, London.

3.  Collins, M and Sykes, W (1987). The problems of non-coverage and unlisted telephone numbers in telephone surveys in Britain. Journal of the Royal Statistical Society A, Vol. 150, No. 3, pp 241-253.
Recent OPCS Publications
Survey Researcher’s Guides
The following volumes are produced by the Social Survey Division of OPCS, and are useful for
anyone planning social survey research.
Weighting for non-response – a survey researcher’s guide
By Dave Elliott
This is a survey researcher’s guide to procedures aimed at correcting the effects of non-response
during the analysis and presentation of survey results. Some of the subjects covered are:
•  Methods available to eliminate or reduce any bias consequent on non-response.

•  A summary of the literature describing the characteristics of survey non-respondents, with particular emphasis on OPCS Social Survey Division's own checks on its continuous surveys.

•  Procedures used in compensating for non-response.
Price £5.00 net
ISBN 0 904952 70 3
This publication is available from Information Branch, OPCS, St Catherine’s House, 10 Kingsway, London
WC2B 6JP, telephone 071 242 0262 ext 2243 or 2208.
A Handbook for Interviewers
By Liz McCrossan
This manual of social survey practice and procedures for structured interviewing is intended for use
by OPCS Social Survey Division interviewers, but is available to the public for purchase from
HMSO. Topics covered include:
* Sampling
* Approaching the Public
* Questionnaires
* Asking the questions
* Recording the answers
* Classification definitions
* Analysis of a survey
Price £6.75 net
ISBN 0 11691344 4
Available from HMSO bookshops or accredited agents, or HMSO Publications Centre, telephone 071 873 9090
NEW METHODOLOGY SERIES
NM1   The Census as an aid in estimating the characteristics of non-response in the GHS.
      R Barnes and F Birch.

NM2   FES: A study of differential response based on a comparison of the 1971 sample with the Census.
      W Kemsley, Stats. News, No 31, November 1975.

NM3   NFS: A study of differential response based on a comparison of the 1971 sample with the Census.
      W Kemsley, Stats. News, No 35, November 1976.

NM4   Cluster analysis.
      D Elliot. January 1980.

NM5   Response to postal sift of addresses.
      A Milne. January 1980.

NM6   The feasibility of conducting a national wealth survey in Great Britain.
      I Knight. 1980.

NM7   Age of buildings: A further check on the reliability of answers given on the GHS.
      F Birch. 1980.

NM8   Survey of rent rebates and allowances: A methodological note on the use of a follow-up sample.
      F Birch. 1980.

NM9   Rating lists: Practical information for use in sample surveys.
      E Breeze.

NM10  Variable quotas – an analysis of the variability.
      R Butcher.

NM11  Measuring how long things last – some applications of simple life table techniques to survey data.
      M Bone.

NM12  The Family Expenditure and Food Survey Feasibility Study 1979-1981.
      R Barnes, R Redpath and E Breeze.

NM13  A Sampling Errors Manual.
      R Butcher and D Elliot. 1986.

NM14  An assessment of the efficiency of the coding of occupation and industry by interviewers.
      P Dodd. May 1985.

NM15  The feasibility of a national survey of drug use.
      E Goddard. March 1987.

NM16  Sampling errors on the International Passenger Survey.
      D Griffiths and D Elliot. February 1988.

NM17  Weighting for non-response – a survey researcher's guide.
      D Elliot. 1991.

NM18  The use of synthetic estimation techniques to produce small area estimates.
      Chris Skinner. January 1993.

NM19  The design and analysis of a sample for a panel survey of employers.
      D Elliot. 1993.

NM20  Convenience sampling on the International Passenger Survey.
      P Heady, C Lound and T Dodd. 1993.

Prices:     NM13: £6.00 UK - £7.00 overseas
            NM17: £5.00 UK and overseas
            NM1 to NM12 and NM14 to NM16: £1.50 UK - £2.00 overseas
            NM18 to NM20: £2.50 UK - £3.00 overseas

Orders to:  New Methodology Series, Room 304, OPCS, St Catherine's House, 10 Kingsway, London WC2B 6JP