Bias Caused by Migration in Case-Control Studies

American Journal of Epidemiology
Copyright © 1996 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved
Vol. 143, No. 8
Printed in USA
Bias Caused by Migration in Case-Control Studies of Prenatal Risk Factors
for Childhood and Adult Diseases
Michael E. Jones and Anthony J. Swerdlow
Case-control studies of prenatal risk factors for disease in later life often ascertain cases from within a
defined area, trace the birth records of those cases born within the area, and select controls from birth records
within the same area. Bias can occur in these studies if the disease risk factors are related to migration from
the area. The effects of this bias were examined in a study in Oxfordshire, England. Cases (n = 218) of diabetes
in children and young adults born during 1965-1986 were identified from hospital discharges during 1965-1986; controls (n = 753) were selected from livebirths during 1965-1986. By 1987, 219 controls (29.1%) had
migrated from Oxfordshire or died. Low maternal parity and high social class were strongly related to
migration, more than the other perinatal factors studied. Migration, therefore, could lead to apparent associations of diabetes risk with parity or social class. For a general instance, the authors show how much bias is
caused by different degrees of migration and of association between migration and a perinatal risk factor.
Examples are given of how migration can produce apparent trends in risk as well as increased or decreased
individual relative risks. If more than 25% of controls migrate, bias may be appreciable. Am J Epidemiol 1996;
143:823-31.
bias (epidemiology); case-control studies; prenatal exposure delayed effects
There is increasing evidence that prenatal exposures
may be involved in the development of a wide range of
diseases in humans. For example, recent studies suggest that the in utero environment may be important in
the etiology of the sudden infant death syndrome (1),
childhood cancers (2, 3), testicular cancer (4), ovarian
germ cell cancer (5), breast cancer (6), Crohn's disease
(7), insulin-dependent diabetes (8), and schizophrenia
(9). The intrauterine environment may also influence
blood pressure (10) and lung function (11) in children
and susceptibility to hypertension (12), cardiovascular
disease (13), and non-insulin-dependent diabetes in
adults (14). It is, however, difficult to study prenatal
risk factors for diseases that may occur many years
after birth. Data on exposure variables may be highly
misclassified, with potential recall bias, if cases or
their mothers are interviewed about exposures many
years after the birth. In addition, many cases may not
know their own exposure history, and sometimes the
mothers cannot be traced or have died.
Information from obstetric case notes, if available, can
be used to avoid recall bias and reduce misclassification
errors. Retrospective cohort studies are possible; however, these are rare and limited to common diseases (14)
because of logistical difficulties in following up cohorts
large enough to generate a sufficient number of affected
subjects. An alternative, more practical, and more often
used method is the case-control approach. The usual
design involves the following: 1) selecting all cases of the
disease under study, usually from a defined geographical
area; 2) finding the obstetric notes or delivery records for
those cases who were born within the area; and 3) comparing these with all births or a sample of births occurring at the same time in the same area. This design has
been used, for instance, to investigate prenatal and early
life risk factors for pyloric stenosis (15), the sudden
infant death syndrome (1), spastic cerebral palsy (16),
cryptorchidism (17), testicular cancer (18), insulin-dependent diabetes (8), inflammatory bowel disease (19),
schizophrenia (20, 21), childhood cancers (2, 3, 22-24),
neuroblastoma (25, 26), and leukemia (27).
There is, however, a potential bias in such studies.
The cases, by the way in which they are identified, are
individuals who were born in the study area and were
still living there at diagnosis. In comparison, the controls, although also born in the study area, may at any
time subsequently have died or migrated from the
area. If mortality or migration rates were related to the
prenatal risk factors under investigation, this could
lead to a biased estimate of the relative risk (28). (The
type of case-control study we have described can be
considered to be nested within birth cohorts so that the
usual parameter of interest is the rate ratio, although in
this paper we use the generic term relative risk.)
Received for publication April 17, 1995, and in final form January
31, 1996.
Abbreviations: FHSA, Family Health Services Authority; LRS[n],
likelihood ratio statistic on n degrees of freedom; NHS, National
Health Service; NHSCR, National Health Service Central Register;
ORLS, Oxford Record Linkage Study.
From the Epidemiological Monitoring Unit, Department of Epidemiology and Population Sciences, London School of Hygiene and
Tropical Medicine, University of London, United Kingdom.
To investigate this bias, we used data from a case-control study of insulin-dependent diabetes to determine the relation between prenatal variables and subsequent migration. We also calculated, for a range of
general situations, the amount by which the relative
risk is biased by migration and death of controls.
MATERIALS AND METHODS
Since 1963, the Oxford Record Linkage Study
(ORLS) has assembled information to link birth,
death, and hospital records of individuals living in a
defined area in and around Oxfordshire, England. The
original ORLS area was Oxford city and county, with
a population of about 340,000 persons. From 1966, the
coverage was extended to Oxfordshire and West Berkshire, a population of approximately 800,000 persons
with approximately 14,000 births each year. Details of
the pregnancy, labor and delivery, and subsequent
morbidity of mother and infant were abstracted from
hospital case notes by trained clerks. For domiciliary
deliveries (home births), the midwife who delivered
the infant sent her notes to the ORLS where clerks
abstracted relevant information. The present study is
limited to controls born to parents resident in Oxfordshire because we were able to obtain information
about subsequent migration only for subjects born in
Oxfordshire.
We obtained data from ORLS files on all cases of
diabetes mellitus discharged from an ORLS hospital
during 1965-1986 who were born to parents resident
in the ORLS area during the same period. For diabetic
cases born during 1970-1986, as many as eight controls for each case were randomly selected from all
livebirths in Oxfordshire. Controls were individually
matched to cases on sex, year, and hospital of delivery,
or if domiciliary, to another domiciliary delivery. Controls born in the hospital had to have been discharged
alive from that hospital. For cases born before 1970, it
was not possible to match on hospital of delivery
because of the way the data had been stored. Instead,
as many as two controls for each case were chosen,
matched on sex, year, and place of delivery (which
was either at home or in hospital). The maternity and
delivery information for cases and controls was extracted from ORLS computer and microfilm files.
Data were not recorded by the ORLS from all hospitals or for the whole study period for certain variables.
The analyses, therefore, are based on smaller numbers
of subjects where data were collected.
To assess whether controls had migrated from
Oxfordshire, we searched the register of the Oxfordshire Family Health Services Authority (FHSA),
which keeps a list of all people who are enrolled
with National Health Service (NHS) doctors in Oxfordshire or who have been enrolled at any time
since 1987. Virtually all individuals of the ages
covered by this study and resident in Oxfordshire
will have been registered with the NHS, which we
verified as described below.
The search of the Oxfordshire FHSA was made
using the unique NHS personal identification number,
the first and last names, the sex, and the date of birth.
Anyone not found by this method was assumed to
have died or migrated from the Oxfordshire FHSA
area before 1987. For five controls, we were uncertain
about the link to an FHSA record: Two were classified
as not linked and three, as linked. To validate negative
linkage, records for 10 males and 10 females who did
not link with a record on the FHSA were checked
against the NHS Central Register (NHSCR). The
NHSCR includes all people who are or have been
registered with the NHS in England and Wales. It
indicates in which FHSA subjects are currently enrolled or their reason for removal. Eighteen of the 20
people not linked to records on the Oxfordshire FHSA
were confirmed as currently registered outside Oxfordshire, one had been removed from the Oxfordshire
FHSA in 1981, and one was matched to a person still
registered in Oxfordshire. This person was one of the
five uncertain matches and was reclassified as a positive match. To determine whether positive linkages
with the Oxfordshire FHSA represented individuals
truly present in the area, we verified the registration
with the Oxfordshire FHSA of diabetic cases with a
recent hospital admission in Oxfordshire. During
1983-1987, there were 122 diabetics among our case
group whose general hospital admission records indicated their area of residence to be Oxfordshire. Of
these, only two were not linked to records on the
Oxfordshire FHSA, and both were known to have died
during 1986 (i.e., they were correctly not registered on
the Oxfordshire FHSA from 1987 onward).
In the following analyses, we use the information
collected on the control group to estimate the strength
of the association between migration and potential
prenatal and early life risk factors for disease in later
life. Odds ratios for migration were calculated by
unconditional logistic regression using the computer
package Stata (29). We also derived the expected size
of the bias due to migration and death, for a general
instance, for different proportions of the population
lost to follow-up. Mathematical details are contained
in the Appendix.
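The likelihood ratio statistics (LRS) reported in the tables come from logistic regression models adjusted for other variables. As a rough illustration only, the sketch below computes the unadjusted analogue for a single categorical factor, for which the likelihood ratio test of a logistic model against the intercept-only model coincides with the G statistic of the contingency table. The function names are hypothetical, the counts are the sex-by-migration counts reported in table 1, and the p value uses the chi-square distribution on 1 degree of freedom. Because no adjustment is made, the result need not equal the published value exactly.

```python
import math


def likelihood_ratio_statistic(table):
    """G statistic (likelihood ratio chi-square) for an r x 2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # empty cells contribute nothing
                g += observed * math.log(observed / expected)
    return 2.0 * g


def p_value_1df(statistic):
    """Upper-tail probability of a chi-square variable on 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Rows: male, female; columns: migrated yes, migrated no (counts from table 1).
sex_table = [[113, 278], [106, 256]]
lrs = likelihood_ratio_statistic(sex_table)
print(f"crude LRS[1] = {lrs:.3f}, p = {p_value_1df(lrs):.3f}")
```

The crude statistic is very small, in line with the adjusted result for sex in table 1 (no evidence of a difference in migration between males and females).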
RESULTS
A total of 218 cases and 753 controls were born to
parents resident in Oxfordshire during 1965-1986.
Eight of the controls had died before 1987. Of the
controls, 534 (70.9 percent) were linked with records
on the Oxfordshire FHSA register. In total, therefore,
219 of 753 (29.1 percent) controls had migrated from
Oxfordshire or had died during 1965-1987, an average
rate of 2.53 per 100 person-years. Hereafter, references to "migration" denote people who were not
linked to a record on the Oxfordshire FHSA, including
people who have died. The annual migration rate was
similar for males (2.53 per 100 person-years) and
females (2.52 per 100 person-years).
There was no difference in the odds of migration
between males and females (table 1). Persons born in
the earlier time periods were more likely to have
migrated than those born in the later periods. There
was a strong relation between migration and social
class, with a statistically significant trend (p < 0.001)
across social classes I-V. (Up to and including 1972,
social class was defined as father's social class; beginning in 1973, it was defined as bread-winner's
social class.) The group most likely to have migrated
was born into the social class group classified as
"other occupations" (students, armed forces, and, after
1972, housewives). Those classified in the social class
"unoccupied and not known," which included the unemployed, were also more likely to have migrated than
the baseline group.
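For readers who wish to check the social class results against the raw counts, the following minimal sketch computes a crude (unadjusted) odds ratio with a Woolf-type 95 percent confidence interval from the counts reported in table 1. It is not the adjusted logistic regression used in the study, so small differences from the published estimates are expected; the function name is hypothetical.

```python
import math


def crude_odds_ratio(exposed_yes, exposed_no, base_yes, base_no, z=1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval for a 2 x 2 table."""
    odds_ratio = (exposed_yes / exposed_no) / (base_yes / base_no)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / exposed_yes + 1 / exposed_no + 1 / base_yes + 1 / base_no)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Social class I versus social class III (baseline): migrated yes/no counts from table 1.
estimate, lower, upper = crude_odds_ratio(29, 37, 58, 218)
print(f"class I vs. III: OR = {estimate:.2f} ({lower:.2f}-{upper:.2f})")
```

For social class I versus III, the crude estimate is close to the adjusted odds ratio of 2.93 (95 percent confidence interval 1.65-5.20) shown in table 1, which suggests that adjustment for sex, year, and place of birth changes this comparison only slightly.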
TABLE 1. Odds ratios for migration from Oxfordshire, England, for sex, year of birth, place of birth, and
social class, 1965-1986

                                  Migrated
                                  Yes         No          Odds     95% confidence   p
Risk factor                       (n = 219)   (n = 534)   ratio*   interval         value

Sex*
  Male                            113         278         1.00     Baseline
  Female                          106         256         0.98     0.71-1.36        0.900
  LRS[1]† = 0.02, p = 0.900
Year of birth*
  1965-1969                       57          104         1.00     Baseline
  1970-1974                       91          220         0.73     0.48-1.12        0.154
  1975-1979                       47          128         0.67     0.41-1.10        0.114
  1980-1986                       24          82          0.57     0.32-1.03        0.062
  LRS[3] = 4.19, p = 0.242; trend LRS[1] = 3.72, p = 0.054
Place of birth*
  Oxford district hospitals       140         384         1.00     Baseline
  Banbury district hospitals      72          132         1.37     0.96-1.97        0.086
  Domiciliary                     7           18          0.77     0.30-1.99        0.586
  LRS[2] = 3.66, p = 0.161
Social class of parents (derived from occupation)*
  I (professional etc.)           29          37          2.93     1.65-5.20        <0.001
  II (managerial and technical)   30          62          2.04     1.20-3.49        0.009
  III (skilled)                   58          218         1.00     Baseline
  IV (partly skilled)             13          73          0.66     0.34-1.29        0.226
  V (unskilled)                   4           35          0.44     0.15-1.29        0.134
  Other occupations               42          23          7.35     4.05-13.34       <0.001
  Unoccupied and not known        43          86          2.24     1.36-3.70        0.002
  LRS[4] = 27.71, p < 0.001; trend LRS[1] = 27.04, p < 0.001 (groups I-V only)
  LRS[6] = 73.07, p < 0.001 (all groups included)

* Adjusted for the other variables in table except social class.
† LRS[n], likelihood ratio statistic on n degrees of freedom.

TABLE 2. Odds ratios for migration from Oxfordshire, England, for birth weight, gestation, maternal
parity, and age, 1965-1986

                                  Migrated
                                  Yes         No          Odds     95% confidence   p
Risk factor                       (n = 219)   (n = 534)   ratio*   interval         value

Birth weight (kg)
  <2.5                            12          19          1.65     0.76-3.59        0.210
  2.5-2.9                         35          78          1.15     0.71-1.86        0.569
  3.0-3.4                         78          197         1.00     Baseline
  3.5-3.9                         63          167         0.92     0.62-1.36        0.674
  ≥4.0                            27          66          0.99     0.59-1.69        0.985
  Missing                         4           7
  LRS[4]† = 2.41, p = 0.661; trend LRS[1] = 1.39, p = 0.239
Gestation (completed weeks from date of last menstrual period)
  <37                             12          29          0.98     0.48-2.01        0.963
  37-39                           55          173         0.73     0.50-1.07        0.109
  40-42                           111         247         1.00     Baseline
  ≥43                             6           25          0.55     0.21-1.38        0.201
  Missing                         35          60
  LRS[3] = 3.90, p = 0.272; trend LRS[1] = 0.18, p = 0.673
Maternal parity (before this birth)
  0                               102         195         1.00     Baseline
  1                               73          192         0.73     0.51-1.05        0.090
  2                               25          91          0.54     0.33-0.90        0.019
  3-9                             17          54          0.59     0.32-1.08        0.085
  Missing                         2           2
  LRS[3] = 7.87, p = 0.049; trend LRS[1] = 6.93, p = 0.008
Maternal age (years)
  <20                             29          54          1.23     0.72-2.10        0.447
  20-24                           75          169         1.00     0.68-1.47        0.995
  25-29                           75          180         1.00     Baseline
  30-34                           27          98          0.64     0.39-1.06        0.086
  ≥35                             12          32          0.85     0.41-1.76        0.669
  Missing                         1           1
  LRS[4] = 5.03, p = 0.284; trend LRS[1] = 3.11, p = 0.078

* Adjusted for sex, year, and place of birth.
† LRS[n], likelihood ratio statistic on n degrees of freedom.

In table 2, the relation between obstetric variables
and migration is shown. There was little change in the
odds ratio across birth weight groups, except that those
with the lowest birth weights were more likely to have
migrated than those in the baseline group (odds ratio
= 1.65, 95 percent confidence interval 0.76-3.59).
This result was only slightly different when subjects
who had died were removed from the analysis (odds
ratio = 1.57, 95 percent confidence interval 0.71-3.49).
Risk of migration was reduced for children with
the longest gestation; however, this observation was
based on small numbers and did not reach statistical
significance. Maternal parity (before this birth) was
strongly related to migration, with risk decreasing with
parity up to two but not decreasing thereafter. There
was weak evidence that children born to younger
mothers were more likely to migrate than those born to
older mothers, but this disappeared after adjusting for
maternal parity.
Results for selected perinatal factors are presented
in table 3. Raised maternal systolic and diastolic
blood pressure was associated with a reduced risk of
migration, the former significantly so, as was maternal AB blood group; however, this observation
was based on small numbers. Preeclampsia, excessive vomiting during pregnancy, radiographic examination, maternal body mass index, and smoking
during pregnancy were not strongly related to migration. Neither were rhesus blood group, rhesus
incompatibility between mother and fetus, mother
receiving a blood transfusion, albuminuria, duration
of labor, episiotomy, Apgar score, oxygen or resuscitation
at birth, and size of baby's head (not shown
in the table). When the eight subjects who had died
were removed from the analyses, the results remained
essentially the same.

TABLE 3. Odds ratios for migration from Oxfordshire, England, for selected maternal variables,
1965-1986

                                  Migrated
                                  Yes         No          Odds     95% confidence   p
Risk factor                       (n = 219)   (n = 534)   ratio*   interval         value

Maximum systolic blood pressure (mm Hg)†
  <120                            5           14          0.90     0.30-2.69        0.853
  120-139                         57          148         1.00     Baseline
  ≥140                            11          65          0.42     0.20-0.87        0.019
  Missing                         35          36
  LRS[2]‡ = 6.16, p = 0.046; trend LRS[1] = 4.18, p = 0.041
Maximum diastolic blood pressure (mm Hg)†
  <80                             22          59          1.08     0.58-2.03        0.807
  80-89                           37          106         1.00     Baseline
  ≥90                             14          62          0.63     0.31-1.27        0.194
  Missing                         35          36
  LRS[2] = 2.30, p = 0.316; trend LRS[1] = 1.81, p = 0.178
Maternal blood group†
  O                               92          207         1.00     Baseline
  A                               75          204         0.81     0.56-1.17        0.260
  B                               23          40          1.33     0.75-2.37        0.329
  AB                              3           26          0.27     0.08-0.92        0.036
  Missing                         11          22
  LRS[3] = 8.67, p = 0.034
Preeclampsia
  No mention                      184         449         1.00     Baseline
  Any mention                     35          85          1.03     0.67-1.60        0.886
  LRS[1] = 0.02, p = 0.886
Vomiting during pregnancy
  No mention                      204         497         1.00     Baseline
  Any mention                     15          37          1.33     0.68-2.61        0.404
  LRS[1] = 0.68, p = 0.410
Radiographic examination during pregnancy†
  No                              196         478         1.00     Baseline
  Yes                             7           19          1.05     0.42-2.60        0.919
  Missing                         0           1
  LRS[1] = 0.01, p = 0.919
Body mass index (kg/m²)†
  <22                             6           40          0.46     0.17-1.24        0.124
  22-25.9                         21          62          1.00     Baseline
  ≥26                             11          44          0.68     0.29-1.59        0.376
  Missing                         7           8
  LRS[2] = 2.73, p = 0.256; trend LRS[1] = 0.30, p = 0.584
Current smoker during pregnancy†
  No                              34          111         1.00     Baseline
  Yes                             9           36          0.83     0.36-1.91        0.659
  Missing                         8           12
  LRS[1] = 0.20, p = 0.656

* Adjusted for sex, period, and place of birth.
† Excludes subjects for whom data were not recorded by the Oxford Record Linkage Study.
‡ LRS[n], likelihood ratio statistic on n degrees of freedom.
To give generally applicable information about the
impact of migration bias on the relative risk for disease
in studies of prenatal or early life risk factors, we
considered a hypothetical study in which controls are
selected at birth from a single birth cohort and cases
are selected as they are diagnosed in later life from the
same birth cohort. By assuming a constant migration
rate in the exposed group and a different but constant
migration rate in the unexposed group, it was possible
to determine how much bias was introduced into the
estimate of the disease-exposure relative risk by differential migration. Mathematical details are given in
the Appendix. The amount of bias in the apparent
relative risk in a study that used controls selected at
birth, compared with the true relative risk, for different
values of cumulative migration and different degrees
to which migration is associated with the study variable is shown in table 4. In table 4, "odds ratio for
migration in relation to exposure" is a measure of the
difference in migration rate between the exposed and
baseline groups (for examples of actual values, see
tables 1-3). "Percentage of population exposed to risk
factor at start of study period" refers to the percentage
in the exposure group under consideration relative to
the baseline group, ignoring the numbers of people in
the other exposure groups. For example, in table 1,
approximately 19 percent of subjects were in social
class I relative to social class III (the baseline). "Percentage of population that migrated from the study by
the end of the study period" is the percentage of
controls lost from the study area by the end of the
study period relative to the number that were selected
at the start of the study period.
Table 4 was constructed using a migration rate of
2.5 per 100 person-years (i.e., 2.5 percent per year), a
disease rate of 10 per 100,000 person-years, and a true
disease-exposure relative risk of 1.00. In sensitivity
analyses, we varied one parameter while holding the
others at the values given above and found that for
migration rates between 0.5 per 100 and 10 per 100
person-years, or disease rates between 0.1 per 100,000
and 100 per 100,000 person-years, or true relative
risks between 0.25 and 4.00, the results in the table
held to plus or minus one percentage point. For certain
extreme combinations of these values or for values
outside the given ranges, the bias may differ from that
given in the table by more than plus or minus one
percentage point.
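The calculation behind table 4 can be approximated with a short sketch. This is not the derivation given in the Appendix; it is a simplified version under assumptions stated here: the disease is rare enough that cases do not deplete the cohort, migration rates are constant within the exposed and unexposed groups, the true relative risk is constant over time, and the "odds ratio for migration" is taken to be the ratio of the odds of having migrated by the end of the study period. Under these assumptions the bias does not depend on the true relative risk or the disease rate. All function names are hypothetical.

```python
import math


def mean_fraction_resident(q):
    """Average fraction of a group still in the area over the study period,
    given that a proportion q has left by its end at a constant rate."""
    if q <= 0.0:
        return 1.0
    return q / -math.log(1.0 - q)


def cumulative_migration(fraction_exposed, overall_migrated, migration_odds_ratio):
    """Proportions migrated (exposed, unexposed) consistent with the overall
    proportion migrated and the exposure-migration odds ratio (bisection)."""
    low, high = 0.0, 1.0
    q_exposed = q_unexposed = 0.0
    for _ in range(200):
        q_unexposed = (low + high) / 2.0
        odds = migration_odds_ratio * q_unexposed / (1.0 - q_unexposed)
        q_exposed = odds / (1.0 + odds)
        total = fraction_exposed * q_exposed + (1.0 - fraction_exposed) * q_unexposed
        if total < overall_migrated:
            low = q_unexposed
        else:
            high = q_unexposed
    return q_exposed, q_unexposed


def percentage_bias(fraction_exposed, overall_migrated, migration_odds_ratio):
    """Apparent relative risk as a percentage of the true relative risk when
    controls are selected at birth and migration is disregarded."""
    q_exposed, q_unexposed = cumulative_migration(
        fraction_exposed, overall_migrated, migration_odds_ratio
    )
    return 100.0 * mean_fraction_resident(q_exposed) / mean_fraction_resident(q_unexposed)


# 50 percent exposed relative to baseline, 25 percent of controls migrated.
for odds_ratio in (3.00, 2.00, 1.00, 0.75, 0.50):
    print(odds_ratio, round(percentage_bias(0.50, 0.25, odds_ratio), 1))
```

For the settings checked (for example, 50 percent exposed, 25 percent migrated, and migration odds ratios of 3.00 and 0.50), this approximation agrees with the corresponding entries of table 4 to within about one percentage point.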
As an example of the use of table 4, consider if there
were no true association between a disease and, for
example, social class (i.e., if all true relative risks were
1.00). If, during a study that used controls selected at
birth, 1) 25 percent of controls migrate from the study
area, 2) there were equal numbers of controls in each
social class at the start of the study period (i.e., 50
percent exposed relative to baseline), and 3) the exposure-migration odds ratios were 3.00, 2.00, 1.00, 0.75,
and 0.50 for social classes I-V, respectively, then as a
result of migration, the observed relative risks for
disease would be 0.88, 0.92, 1.00, 1.03, and 1.08.
Although the bias in each stratum is small, taken
together the results might easily be mistaken for a
trend.

TABLE 4. Percentage bias (relative risk with migration disregarded/true relative risk × 100) in a case-control
study in which controls are selected at birth but subsequently migrate from the study area*

Odds ratio for        Percentage of population exposed to risk factor at start of study period
migration in                  10                      50                      90
relation to           Percentage of population that migrated from the study by the end of the study period
exposure              10      25      50      10      25      50      10      25      50

0.125                 105     114     135     108     124     157     121     148     176
0.25                  104     111     126     106     117     136     111     126     144
0.50                  103     107     115     103     108     117     104     110     119
0.75                  101     103     106     101     103     107     102     104     107
1.00                  100     100     100     100     100     100     100     100     100
1.50                  98      95      91      98      95      91      98      96      92
2.00                  96      91      84      97      92      86      97      93      87
3.00                  93      84      76      95      88      79      96      91      82
4.00                  90      79      70      94      86      74      96      90      79
8.00                  83      67      57      92      81      64      95      88      74

* Table derived using: migration rate, ID(0,.), = 2.5 per 100 person-years; disease rate, ID(1,.), = 10 per
100,000 person-years; and true disease relative risk, RR, = 1.00 (see Appendix for definition of ID(0,.), ID(1,.),
and RR). The table holds to ±1 percentage point for true relative risks in the range 0.25-4.00, or migration rates
in the range 0.5-10.0 per 100 person-years, or disease rates in the range 0.1-100 per 100,000 person-years. At
extreme combinations of these values, the percentage bias may differ from that given in the table by more than ±1
percentage point.
As another example, consider a risk factor in which
10 percent of the controls are exposed relative to the
baseline, and the migration-exposure odds ratio for the
factor is 0.50 (this is similar to the actual figure for
high parity vs. nulliparity in Oxfordshire). If the true
disease-exposure relative risk were 3.00 and a large
proportion, such as 50 percent, of the controls selected
at birth had migrated by the end of the study period,
then the biased relative risk would be 3.45. If 25
percent of the controls had migrated, the relative risk
would be 3.21; and if 10 percent of the controls had
migrated, the relative risk would be 3.09.
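The arithmetic in this example is simply the true relative risk multiplied by the percentage bias read from table 4. A minimal sketch, with a hypothetical function name and the table 4 entries for 10 percent exposed and a migration-exposure odds ratio of 0.50 typed in by hand:

```python
def apparent_relative_risk(true_relative_risk, percentage_bias):
    """Relative risk expected when migration of controls is disregarded."""
    return true_relative_risk * percentage_bias / 100.0


# Percentage of controls migrated -> percentage bias (table 4, 10 percent
# exposed, migration-exposure odds ratio 0.50).
bias_by_migration = {50: 115, 25: 107, 10: 103}

for migrated, bias in bias_by_migration.items():
    biased = apparent_relative_risk(3.00, bias)
    print(f"{migrated} percent migrated: apparent relative risk {biased:.2f}")
```

The same function applies to the social class example, where a true relative risk of 1.00 and biases of 88, 92, 100, 103, and 108 percent give the apparent relative risks quoted in the text.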
It should be noted that the bias caused by migration
is not always toward the null value of no association.
If a risk factor is positively associated with disease but
negatively associated with migration, the relative risk
will be biased upward and away from 1.0. Conversely,
if a risk factor is negatively associated with disease but
positively associated with migration, the relative risk
will be biased downward and away from 1.0.
DISCUSSION
We have shown that markers for prenatal exposures, like parental social class and maternal parity,
may be strongly related to subsequent migration.
This in turn may lead to bias in the estimate of the
disease-exposure relative risk for these variables.
Other variables such as birth weight, gestation, and
maternal age were not strongly associated with migration; and this would lead to little or no bias in the
relative risk. The relations between the exposures
and migration in this study, however, were specific
to Oxfordshire. Studies should be conducted elsewhere to determine whether these associations are
similar in other Western populations and under different social and economic conditions. Nonetheless,
it appears reasonable that high social class and low
parity, at least, should be strongly associated with
potential for migration in Western countries.
We assessed migration indirectly, by linkage of ORLS
records to entries on the Oxfordshire FHSA register, but
the accuracy of this linkage was high: In a sample of
122 diabetics whom we knew to be recently resident in
Oxfordshire, all 120 who were alive after 1986 were
recorded on the Oxfordshire FHSA in 1987 or later,
i.e., there was no misclassification. In a sample of 20
people who did not link with an Oxfordshire record,
19 were confirmed by the NHSCR as not registered in
Oxfordshire. The twentieth person was unusual because she was one of five uncertain linkages. Because
there were only four other uncertain linkages of 753
control subjects, changing their classification would
have made virtually no difference to the overall results.
The results presented in tables 1-3 relate to migration from an English county and may show smaller
effects than for a US county or state where migration
rates may well be greater. However, in studies with
national coverage, the cumulative effect of migration
is likely to be smaller than for county- or state-based
studies. For example, in 1981 among 0-14 year olds,
the migration rate from Oxfordshire to the rest of
Britain was 3.0 percent (30, 31), whereas the migration rate from the country was only 0.3 percent (32).
Furthermore, the pattern of migration in relation to
prenatal risk factors may be different for international
migration than for intercounty or interstate migration,
so that our results may not apply to studies at the
national level.
In case-control studies using controls selected at
birth, our calculations showed that if appreciably less
than 25 percent of the population migrate, bias will
exceed only 10 percent if the migration-exposure odds
ratio is large, for example, greater than 4.00 or less
than 0.25. Only one of our observed odds ratios was
this extreme. This indicates that, with few exceptions,
when the proportion of controls that migrate is small,
bias caused by migration is unimportant as long as the
associations we detected are found to be of approximately the same strength in future studies.
If 25 percent or more of the population leave the
study area, however, migration may cause appreciable
bias even with more modest migration-exposure odds
ratios. In Oxfordshire, 29.1 percent of the controls had
migrated by 1987, the end of the study period, when
their average age was only 13 years. For a disease for
which the average age of onset is later in life, a longer
interval from birth to disease will be required to ensure
that a sufficient number of cases have occurred. The
cumulative effect of migration may then result in more
than 50 percent of the control group eventually migrating by the end of the study period, even if migration rates are not as high as those in Oxfordshire. Bias
caused by migration, therefore, is likely to be important for case-control studies of diseases in adolescence
(e.g., insulin-dependent diabetes) or young adults
(e.g., testicular cancer) especially in populations that
are highly mobile (e.g., the United States, where in
1990-1991, 2.9 percent of Americans had moved to a
different state (33) compared with Britain, where 2.1
percent had moved to a different county (34)). Furthermore, the potential effects of bias caused by migration will be even greater when investigating the
role of prenatal factors in the etiology of diseases of
middle age and beyond, such as non-insulin-dependent
diabetes and heart disease (14).
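To see how quickly losses accumulate with longer follow-up, the sketch below assumes a constant migration rate, as in the general calculations above, and converts between a rate per 100 person-years and the cumulative proportion lost. It is an illustration only: the function names are hypothetical, and real cohorts have varying lengths of follow-up and rates that change with age.

```python
import math


def cumulative_migrated(rate_per_100_py, years):
    """Proportion of a birth cohort lost after `years` at a constant rate."""
    return 1.0 - math.exp(-rate_per_100_py / 100.0 * years)


def years_until(proportion, rate_per_100_py):
    """Years of follow-up needed before `proportion` of the cohort is lost."""
    return -math.log(1.0 - proportion) / (rate_per_100_py / 100.0)


rate = 2.5  # per 100 person-years, the rate used to construct table 4
print(f"lost after 13 years: {cumulative_migrated(rate, 13):.1%}")
print(f"years to 25 percent lost: {years_until(0.25, rate):.1f}")
print(f"years to 50 percent lost: {years_until(0.50, rate):.1f}")
```

At 2.5 per 100 person-years, a quarter of the cohort is lost after roughly 11 to 12 years and half after roughly 28 years, so studies of diseases with onset in adulthood can readily exceed the 25 percent threshold discussed here.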
In conclusion, a case-control study ideally needs to
select controls who were known to be present in the
study area at the time the case was diagnosed. This
information is not usually available for studies of
prenatal risk factors when the controls are selected
from birth records. If the migration status of each
control can be ascertained at the end of the study
period, the potential effect of migration bias for any
given exposure can be determined in the same way as
in this paper; however, it is often not feasible to collect
such information. Our results suggest, however, that if
the investigator knows that appreciably less than 25
percent of the controls are likely to have migrated or
died by the end of the study period, then migration
bias will be unimportant, unless the association between exposure and migration is large. If more than 25
percent of the controls have migrated or died, as is
likely for state- or county-based studies of adolescent
or adult diseases, then the bias caused by migration
may be appreciable for certain risk factors. Such studies should consider the impact of migration bias on
their results and conclusions.
ACKNOWLEDGMENTS
We are very grateful to the ORLS, especially to its
Director, Dr. Michael Goldacre, and to Myfanwy Griffith,
for the data used in this study. We are also grateful to the
Oxfordshire FHSA for allowing us access to their records.
Michael E. Jones is supported by a Medical Research
Council Training Award. The Epidemiological Monitoring
Unit is supported by the Medical Research Council.
REFERENCES
1. Buck GM, Cookfair DL, Michalek AM, et al. Intrauterine
growth retardation and risk of sudden infant death syndrome
(SIDS). Am J Epidemiol 1989;129:874-84.
2. Daling JR, Starzyk P, Olshan AF, et al. Birth weight and the
incidence of childhood cancer. J Natl Cancer Inst 1984;72:
1039-41.
3. Harvey EB, Boice JD Jr, Honeyman M, et al. Prenatal x-ray
exposure and childhood cancer in twins. N Engl J Med 1985;
312:541-5.
4. Depue RH, Pike MC, Henderson BE. Estrogen exposure during gestation and risk of testicular cancer. J Natl Cancer Inst
1983;71:1151-5.
5. Walker AH, Ross RK, Haile RWC, et al. Hormonal factors
and risk of ovarian germ cell cancer in young women. Br J
Cancer 1988;57:418-22.
6. Trichopoulos D. Is breast cancer initiated in utero? Epidemiology 1990;1:95-6.
7. Ekbom A, Wakefield AJ, Zack M, et al. Perinatal measles
infection and subsequent Crohn's disease. Lancet 1994;344:
508-10.
8. Dahlquist G, Källén B. Maternal-child blood group incompatibility and other perinatal events increase the risk for early-onset type 1 (insulin-dependent) diabetes mellitus. Diabetologia 1992;35:671-5.
9. Adams W, Kendell RE, Hare EH, et al. Epidemiological
evidence that maternal influenza contributes to the aetiology
of schizophrenia: an analysis of Scottish, English, and Danish
data. Br J Psychiatry 1993;163:522-34.
10. Launer LJ, Hofman A, Grobbee DE. Relation between birth
weight and blood pressure: longitudinal study of infants and
children. BMJ 1993;307:1451-4.
11. Rona RJ, Gulliford MC, Chinn S. Effects of prematurity and
intrauterine growth on respiratory health and lung function in
childhood. BMJ 1993;306:817-20.
12. Law CM, de Swiet M, Osmond C, et al. Initiation of hypertension in utero and its amplification throughout life. BMJ
1993;306:24-7.
13. Barker DJP, Gluckman PD, Godfrey KM, et al. Fetal nutrition
and cardiovascular disease in adult life. Lancet 1993;341:
938-41.
14. Barker DJP, ed. Fetal and infant origins of adult disease.
London: BMJ, 1992.
15. Adelstein P, Fedrick J. Pyloric stenosis in the Oxford Record
Linkage Study Area. J Med Genet 1976;13:439-48.
16. Blair E, Stanley F. Intrauterine growth and spastic cerebral
palsy. I. Association with birth weight for gestational age. Am
J Obstet Gynecol 1990;162:229-37.
17. Swerdlow AJ, Wood KH, Smith PG. A case-control study of
the aetiology of cryptorchidism. J Epidemiol Community
Health 1983;37:238-44.
18. Malone KE, Daling JR. Birth weight and the risk of testicular
cancer. (Letter). J Natl Cancer Inst 1986;77:829-30.
19. Ekbom A, Adami H-O, Helmick CG, et al. Perinatal risk
factors for inflammatory bowel disease: a case-control study.
Am J Epidemiol 1990;132:1111-19.
20. O'Callaghan E, Gibson T, Colohan HA, et al. Risk of schizophrenia in adults born after obstetric complications and their
association with early onset of illness: a controlled study. BMJ
1992;305:1256-9.
21. McNeil TF, Cantor-Graae E, Nordstrom LG, et al. Head
circumference in "preschizophrenic" and control neonates. Br
J Psychiatry 1993;162:517-23.
22. MacMahon B, Newill VA. Birth characteristics of children
dying of malignant neoplasms. J Natl Cancer Inst 1962;28:
231-44.
23. Golding J, Greenwood R, Birmingham K, et al. Childhood
cancer, intramuscular vitamin K, and pethidine given during
labour. BMJ 1992;305:341-6.
24. Emerson JC, Malone KE, Daling JR, et al. Childhood brain
tumor risk in relation to birth characteristics. J Clin Epidemiol
1991;44:1159-66.
25. Johnson CC, Spitz MR. Neuroblastoma: case-control analysis
of birth characteristics. J Natl Cancer Inst 1985;74:789-92.
26. Johnson CC, Spitz MR. Prematurity and risk of childhood
cancer. (Letter). J Natl Cancer Inst 1986;76:359.
27. Shaw G, Lavey R, Jackson R, et al. Association of childhood
leukemia with maternal age, birth order, and paternal
occupation: a case-control study. Am J Epidemiol 1984;119:
788-95.
28. Flanders WD, Louv WC. The exposure odds ratio in nested
case-control studies with competing risks. Am J Epidemiol
1986;124:684-92.
29. Stata Corporation. Stata Reference Manual: Release 3.1. 6th
ed. College Station, TX: 1993.
30. Office of Population Censuses and Surveys. Census 1981,
county report, Oxfordshire, part 1. London: HMSO, 1982:6.
31. Office of Population Censuses and Surveys. Census 1981, regional migration, south east, part 1. London: HMSO, 1983:394.
32. Office of Population Censuses and Surveys. Population trends,
67. London: HMSO, 1992:41, 54.
33. US Bureau of the Census, Population Division. Geographical
mobility: March 1992 to March 1993. Washington, DC: GPO,
1995. (Current Population report P20-481).
34. Office of Population Censuses and Surveys. 1991 census,
migration, Great Britain. Part 1 (100 percent tables), vol. 2.
London: HMSO, 1994.
APPENDIX
$ID(i,j)$ is the incidence rate for cause $i$ in exposure group $j$, assumed constant over age and period ($i = 0$ for
migration, $i = 1$ for disease under study; $j = 0$ for unexposed group, $j = 1$ for exposed group).
$f$ is the proportion of the study population exposed at the start of the study period relative to the baseline group.
$L$ is the proportion of the study population that migrates by the end of the study period.
$RR_{\text{true}}$, the true disease-exposure relative risk, is the ratio of the disease rate in the exposed group to the disease
rate in the unexposed group:
$$RR_{\text{true}} = \frac{ID(1,1)}{ID(1,0)}$$
The total rate at which subjects are lost from the "at risk" set in the unexposed ($j = 0$) or exposed ($j = 1$) group
is given by the sum of the rates for migration and disease:
$$ID(\cdot,j) = ID(0,j) + ID(1,j)$$
The migration (i = 0) or disease (i = 1) rate in the study population at the start of the study is given by the
average rate in the unexposed and exposed groups:
$$ID(i,\cdot) = (1 - f) \times ID(i,0) + f \times ID(i,1)$$
The proportion of the study population, L, that has migrated from the study area in time T is given by 1 minus
the proportion of people still left in the study area:
$$L = 1 - \left[ f \times \exp(-ID(0,1) \times T) + (1 - f) \times \exp(-ID(0,0) \times T) \right]$$
The odds ratio for migration in relation to exposure, $OR_{\text{migration}}$, is given by
$$OR_{\text{migration}} = \frac{\left(1 - \exp(-ID(0,1) \times T)\right) / \exp(-ID(0,1) \times T)}{\left(1 - \exp(-ID(0,0) \times T)\right) / \exp(-ID(0,0) \times T)}$$
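The biased disease-exposure relative risk, $RR_{\text{biased}}$, is given by (see (28), equation 1)
$$RR_{\text{biased}} = \frac{\dfrac{ID(1,1)}{ID(\cdot,1)} \times \left(1 - \exp(-ID(\cdot,1) \times T)\right) \Big/ \left[1 - \dfrac{ID(1,1)}{ID(\cdot,1)} \times \left(1 - \exp(-ID(\cdot,1) \times T)\right)\right]}{\dfrac{ID(1,0)}{ID(\cdot,0)} \times \left(1 - \exp(-ID(\cdot,0) \times T)\right) \Big/ \left[1 - \dfrac{ID(1,0)}{ID(\cdot,0)} \times \left(1 - \exp(-ID(\cdot,0) \times T)\right)\right]}$$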
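The expression for the biased relative risk can be evaluated directly once the four rates and the study duration are known. The Python sketch below is an illustration only, not the authors' program: the function names `cumulative_incidence` and `rr_biased` and their argument names are hypothetical, rates are assumed to be expressed per person-year, and the input values are the rounded rates quoted in the worked example at the end of this appendix.

```python
import math

def cumulative_incidence(disease_rate, migration_rate, T):
    """Probability of being ascertained as a case within T years,
    with migration acting as a competing risk (rates per person-year)."""
    total_rate = migration_rate + disease_rate  # ID(.,j) = ID(0,j) + ID(1,j)
    return disease_rate / total_rate * (1.0 - math.exp(-total_rate * T))

def rr_biased(id10, id11, id00, id01, T):
    """Odds of becoming a case, exposed versus unexposed, where everyone
    who is not a case (migrants included) counts as a potential control."""
    ci_exposed = cumulative_incidence(id11, id01, T)
    ci_unexposed = cumulative_incidence(id10, id00, T)
    odds_exposed = ci_exposed / (1.0 - ci_exposed)
    odds_unexposed = ci_unexposed / (1.0 - ci_unexposed)
    return odds_exposed / odds_unexposed

# Rounded rates from the worked example at the end of this appendix.
example = rr_biased(id10=5.26e-5, id11=10.53e-5, id00=0.0048, id01=0.0272, T=28.6)
print(round(example, 2))  # 1.49
```

Because the inputs are rounded, the result differs slightly in the third decimal place from the 1.487 reported in the worked example. Setting both migration rates to zero returns a value close to the true relative risk, which is a quick check that the bias in this model comes from differential migration.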
A computer spreadsheet program may be used to solve the equations above subject to the constraint that all
of the equations are in agreement. For example, if $f = 0.90$, $L = 0.50$, $OR_{\text{migration}} = 8.00$, $ID(1,\cdot) = 10$ per 100,000
person-years, $ID(0,\cdot) = 2.5$ per 100 person-years, and $RR_{\text{true}} = 2.00$, then $T = 28.6$ years, $ID(0,0) = 0.48$ per
100 person-years, $ID(1,0) = 5.26$ per 100,000 person-years, $ID(0,1) = 2.72$ per 100 person-years, and $ID(1,1) = 10.53$
per 100,000 person-years. This satisfies the equations above and gives $RR_{\text{biased}} \approx 1.487$. The bias,
relative to $RR_{\text{true}}$, is then 1.487/2.00, or 74 percent.
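As an alternative to a spreadsheet, the same constraints can be solved with a short numerical routine. The Python sketch below is one possible approach under stated assumptions, not the method used by the authors: it obtains the disease rates in closed form and then uses nested bisection to find the unexposed migration rate and the study duration. The names `bisect` and `solve_rates`, the search bounds, and the `t_max` limit are all hypothetical choices, and rates are expressed per person-year.

```python
import math

def bisect(func, lo, hi, iterations=100):
    """Bisection root finder; func(lo) and func(hi) must have opposite signs."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

def solve_rates(f, L, or_migration, id1_avg, id0_avg, rr_true, t_max=1000.0):
    """Return (T, ID(0,0), ID(0,1), ID(1,0), ID(1,1)), rates per person-year."""
    # Disease rates: ID(1,.) = (1-f)*ID(1,0) + f*ID(1,1), with ID(1,1) = RR_true*ID(1,0).
    id10 = id1_avg / ((1.0 - f) + f * rr_true)
    id11 = rr_true * id10

    def exposed_migration_rate(id00):
        # ID(0,1) implied by ID(0,.) = (1-f)*ID(0,0) + f*ID(0,1).
        return (id0_avg - (1.0 - f) * id00) / f

    def follow_up_time(id00):
        # T at which the proportion migrated equals L, for a trial ID(0,0).
        id01 = exposed_migration_rate(id00)
        def gap(T):
            remaining = f * math.exp(-id01 * T) + (1.0 - f) * math.exp(-id00 * T)
            return (1.0 - remaining) - L
        return bisect(gap, 0.0, t_max)

    def or_gap(id00):
        # Implied migration odds ratio minus the target value.
        id01 = exposed_migration_rate(id00)
        T = follow_up_time(id00)
        odds = lambda rate: math.expm1(rate * T)  # (1 - e^{-rT}) / e^{-rT} = e^{rT} - 1
        return odds(id01) / odds(id00) - or_migration

    # Search ID(0,0) between nearly zero and just under the average migration rate.
    id00 = bisect(or_gap, id0_avg * 1e-6, id0_avg * (1.0 - 1e-9))
    return follow_up_time(id00), id00, exposed_migration_rate(id00), id10, id11

T, id00, id01, id10, id11 = solve_rates(
    f=0.90, L=0.50, or_migration=8.00,
    id1_avg=10 / 100_000, id0_avg=2.5 / 100, rr_true=2.00)
print(f"T = {T:.1f} years")
print(f"ID(0,0) = {100 * id00:.2f}, ID(0,1) = {100 * id01:.2f} per 100 person-years")
print(f"ID(1,0) = {1e5 * id10:.2f}, ID(1,1) = {1e5 * id11:.2f} per 100,000 person-years")
```

With the inputs of the worked example, this should reproduce the study duration and the four rates quoted above to the precision shown. The search assumes that the exposed group migrates faster than the unexposed group (a migration odds ratio above 1); other configurations would need different bounds.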