American Journal of Epidemiology
Copyright © 1999 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved
Vol. 149, No. 11
Printed in USA.
Estimating the Prevalence of Multiple Sclerosis in the United Kingdom by
Using Capture-Recapture Methodology
Raeburn B. Forbes and Robert J. Swingler
The geographic distribution of multiple sclerosis is nonrandom, as the disease is more prevalent in temperate
than in tropical regions. Surveys conducted between 1970 and 1996 suggest that multiple sclerosis is more
prevalent in the northern part of the United Kingdom than in the southern part. This north-south gradient ("the
latitudinal gradient") might be a methodological artifact, because high prevalence figures from serial surveys of
the northern part of the United Kingdom might have been the result of better ascertainment. By using capture-recapture methods, the authors found that case ascertainment was similar in the northern and southern parts
of the United Kingdom. When prevalence figures for multiple sclerosis in the southern United Kingdom were
increased to account for the number of unobserved cases, the difference persisted: The prevalence of multiple
sclerosis in the northern part of the United Kingdom appeared to be at least 180 cases per 100,000 persons,
whereas the maximum prevalence in the southern part of the United Kingdom was less than 160 cases per
100,000 persons. The distribution of multiple sclerosis in the United Kingdom is not uniform and is consistent
with the hypothesis that populations with a high prevalence of multiple sclerosis may be genetically predisposed
to the disease. Am J Epidemiol 1999;149:1016-24.
logistic models; medical geography; multiple sclerosis; prevalence
The distribution of the prevalence of multiple sclerosis is nonrandom, as the disease is more prevalent in
temperate than in tropical regions. This apparent
increase in prevalence with an increase in latitude was
once termed "the latitudinal gradient" and led to
hypotheses that the disease is due to the delayed
effects of infection in genetically susceptible persons
(1). However, variations in the world distribution of
multiple sclerosis must be treated with caution because
the disease has been surveyed in different places at different times using different diagnostic criteria; therefore, some observed gradients may be spurious.
Kurtzke tried to address this problem by grading
surveys according to the methodology used (2). In the
absence of an independent measure of accuracy, however, it is still difficult to compare the best case-finding studies.
For example, in the United Kingdom, a cursory
examination of reported prevalence figures suggests
a clear difference in prevalence between the northern
part of the United Kingdom (Scotland and Northern
Ireland) and the southern part of the United
Kingdom (England and Wales) (table 1). However,
some authors have argued that the north-south divide
in prevalence can be explained by methodological
differences (3-7).
First, diagnostic criteria have been refined over the
years, and all published surveys from Scotland use
the older, broader criteria of Allison and Millar (8);
more recent studies from England and Wales have
used the narrower criteria of Poser et al. (9). The difference is important, because the first criteria include
patients with "early" multiple sclerosis who do not
meet the newer criteria for definite or probable disease that were formulated by Poser et al. A second
problem is that the high prevalence figures for northeast Scotland and the Northern Isles (Orkney and
Shetland) are based on serial rather than first surveys.
In fact, the figures from the first surveys of these
northern areas show little or no difference from the
figures from the first surveys in the southern part of
the United Kingdom (table 1). Furthermore, the high
estimates from Orkney and Shetland are based on surveys that included as few as 40 cases in a population
of about 20,000, so associated sampling error was
high.
To address more formally the issue of ascertainment
bias, we estimated the prevalence of multiple sclerosis
in a previously unsurveyed area of Scotland by using
capture-recapture methodology (10). An important
assumption of this method is that the sources from
which people are identified are independent; that is,
the probability of being observed from one source is
not influenced by being observed from another source.
Health care information systems nearly always violate
this assumption (11), and any estimate of missing
cases is biased downward. In a two-source model, this
bias cannot be quantified, so the estimate of the number of missing cases usually represents a minimum. If
the data allow a three-source model, then the effect of
dependency can be modeled (10).
Received for publication January 29, 1998, and accepted for publication September 25, 1998.
From the Department of Neurology, Ninewells Hospital and
Medical School, Dundee, United Kingdom.

TABLE 1. Crude prevalence* of multiple sclerosis, determined by using Allison and Millar and Poser et al. criteria, United
Kingdom, 1970-1996

                                                   Allison and Millar (8) criteria     Poser et al. (9) criteria
Survey and year (reference)        No. of people   Cases  Prevalence   95% CI        Cases  Prevalence   95% CI
Scotland
  Shetland, 1954 (28)                     18,715      25         134   90-198
  Shetland, 1962 (30)                     17,537      29         165   115-238
  Shetland, 1970 (31)                     17,327      31         179   126-254
  Shetland, 1974 (31)                     18,445      34         184   132-258
  Shetland, 1984 (32)                     23,454      45         192   143-257
  Orkney, 1954 (28)                       20,746      23         111   74-167
  Orkney, 1962 (30)                       18,531      33         178   127-250
  Orkney, 1970 (31)                       17,077      40         234   172-319
  Orkney, 1974 (31)                       17,462      54         309   237-404
  Orkney, 1983 (33)                       19,182      46         240   180-320
  Grampian, 1970 (34)                    440,176     557         127   116-137
  Grampian, 1973 (35, 36)                440,176     634         144   133-156
  Grampian, 1980 (37)                    471,000     839         178   166-191
  Outer Hebrides, 1954 (29)               35,807      11          31   17-55
  Outer Hebrides, 1979 (38)               30,844      30          97   68-139
  Lothian and Borders, 1995 (21)         864,300   1,613         187   178-196†
  Tayside, 1996                          396,500     880         222   208-237†        727         183   170-197†
Wales
  Southeast Wales, 1985 (15)             376,718     441         117   107-129†        380         101   91-112
  Southeast Wales, 1988 (26)             376,718     453         120   110-132         379         101   91-111
England
  Sutton, 1985 (16)                      170,000     195         115   100-132†
  Suffolk, 1988 (17)                      31,379      62         198   154-253†
  Southampton, 1987 (18)                 417,000     411          99   89-109†         395          95   86-105
  Rochdale, 1989 (39)                    207,600     254         122   108-138†        232         112   98-127
  Sussex, 1990 (3)                       596,594     810         136   127-145         665         111   103-120
  South Cambridgeshire, 1990 (4)         288,410     374         130   117-144†        322         112   100-125
  South Cambridgeshire, 1993 (7)         290,700     441         152   138-167         380         131   118-145
  Jersey, 1991 (40)                       84,082      95         113   92-138           84         100   81-124
  Guernsey, 1991 (40)                     61,164      53          87   66-113           45          74   55-99
  North Cambridgeshire, 1993 (19)        378,959     449         118   108-130†        401         106   96-117
Northern Ireland
  Northern Ireland, 1951 (9)           1,370,709     700          51   47-55
  Northern Ireland, 1961 (41)          1,425,000   1,146          80   76-85
  Northern Ireland, 1988 (42)             86,500     119         138   115-165
  Northern Ireland, 1996 (20)            151,000     288         191   170-214†        254         168   149-190
Comparative study (25)
  England, 1991                        3,617,890   3,677         102   98-105
  Scotland, 1992                         527,736     736         139   130-150

* Per 100,000 people.
† Prevalence estimates suitable for adjustment by using capture-recapture methods.
Cases, no. of cases observed; Prevalence, crude prevalence; 95% CI, 95% confidence interval.

In this study, we used a two-source capture-recapture
model to determine the likely numbers of cases of multiple sclerosis that were missed in prevalence surveys
of the United Kingdom. Numerators from these surveys
were adjusted for underascertainment, and prevalence
estimates were altered accordingly. A survey from
Tayside, Scotland, estimated the prevalence of multiple
sclerosis on September 1, 1996. By using data from that
survey, we explored the effect of source dependency by
using a three-source capture-recapture model. If
improved ascertainment is the reason for a north-south
difference in prevalence, then hypotheses based on a
latitudinal gradient require revision.
MATERIALS AND METHODS
Identification of suitable prevalence surveys
Surveys reporting the prevalence of multiple sclerosis in the United Kingdom were identified from a
MEDLINE search (until December 1997) and handsearching of studies published until June 1998. We
selected studies that reported sources of cases in the
form of intersecting lists, as this information is a prerequisite for adjusting for underascertainment when a
two-source capture-recapture model is used.
In the Tayside survey of the prevalence of multiple
sclerosis, four sources were used to identify cases. The
first two were Dundee (Scotland) Royal Infirmary
Neurology Department records from 1977 to 1996 and
visual evoked response requests from 1986 to 1996
(these responses are nearly always requested when multiple sclerosis is suspected and thus would be a reasonable source of potential cases). We confirmed that the
neurology department records and the visual evoked
responses were dependent on each other, so these two
sources were merged (source 1). Scottish morbidity
recording of hospital admissions coded as due to multiple sclerosis (SMR1 data from the Information and
Statistics Division of the Scottish Office Home and
Health Department) became source 2, and responses of
general medical practitioners in Tayside to a request for
the identities of people with multiple sclerosis registered at their practices became source 3. The structure of the
data used in the three-source capture-recapture model is
summarized in appendix 1.
The case records of each potential case were scrutinized to determine whether they met the diagnostic criteria of Poser et al. (9). All capture-recapture analyses
using Tayside data were limited to those fulfilling
these criteria.
For each survey of the prevalence of multiple sclerosis in the United Kingdom, we calculated maximum
likelihood estimates of the number of unobserved cases
by using a two-source capture-recapture model (refer to
appendix 2 for the formulas and notation). As only general practice records and neurology department records
were common to all surveys, comparisons were
restricted to these sources. Additional two-source estimates of unobserved cases could have been generated
for the published studies (e.g., by using hospital discharge data or Multiple Sclerosis Society returns);
however, because there was no means of direct comparison, these estimates were omitted. The maximum
likelihood estimate of unobserved cases was added to
the quoted numerator to produce an adjusted numerator. The adjusted prevalence (per 100,000 persons) was
calculated as the adjusted numerator divided by the
population at risk when each survey was conducted.
Ninety-five percent confidence intervals were calculated by using the method suggested by Hilden (12);
that is, the variance from the number of observed cases
(from a Poisson distribution) should be added to the
variance of the unobserved cases to produce an overall
standard error. Regal and Hook have advocated using
a goodness-of-fit method to calculate confidence intervals (13). We used and compared both methods, because the
lower limits of variance-based confidence intervals
are more likely than those of goodness-of-fit confidence intervals
to fall below the number of observed cases (13).
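The adjustment described above can be sketched in a few lines. This is a minimal illustration and not the formulas of appendix 2, which are not reproduced in this section. It assumes the standard two-source estimate of unobserved cases (cases in the first source only, multiplied by cases in the second source only, divided by cases in both) and the usual large-sample variance of that estimate, and it follows the description of Hilden's approach by adding a Poisson variance for the observed count. The function name and the counts used in the call are hypothetical.

```python
import math


def two_source_adjustment(both, only_first, only_second, population, z=1.96):
    """Adjust a prevalence survey for underascertainment with two sources.

    both        -- cases found in both sources
    only_first  -- cases found in the first source only
    only_second -- cases found in the second source only
    population  -- population at risk when the survey was conducted
    """
    if both <= 0:
        raise ValueError("the two sources must share at least one case")

    observed = both + only_first + only_second
    # Assumed estimator: maximum likelihood estimate of the cases missed by
    # both sources when the sources are independent.
    unobserved = only_first * only_second / both
    adjusted = observed + unobserved

    # Assumed large-sample variance of the capture-recapture estimate.
    n_first = both + only_first
    n_second = both + only_second
    var_unobserved = n_first * n_second * only_first * only_second / both ** 3
    # Variance-based interval as described in the text: Poisson variance of
    # the observed count plus the variance of the unobserved estimate.
    se = math.sqrt(observed + var_unobserved)

    per_100k = 100_000 / population
    return {
        "observed": observed,
        "unobserved": unobserved,
        "adjusted_numerator": adjusted,
        "coverage_pct": 100 * observed / adjusted,
        "adjusted_prevalence": adjusted * per_100k,
        "ci_95": ((adjusted - z * se) * per_100k, (adjusted + z * se) * per_100k),
    }


# Hypothetical counts, for illustration only.
result = two_source_adjustment(both=150, only_first=30, only_second=15,
                               population=170_000)
```

With these made-up counts, 195 observed cases imply 3 unobserved cases, an adjusted numerator of 198, and a coverage of about 98 percent.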
Three-source capture-recapture model
If dependency between sources had produced severe
downward bias in the two-source estimates of missed
cases of multiple sclerosis in the Tayside survey, then
we would have concluded that a two-source capture-recapture model was unable to determine whether
underascertainment produces an artifactual latitudinal
gradient of prevalence in the United Kingdom. If there
was little difference in the two-source and three-source
estimates of missed cases, then we would have been
confident that the issue of underascertainment had
been addressed adequately.
In a three-source model, there are three possible
pairwise dependencies between sources and therefore
eight possible combinations of interaction terms, each
with its own maximum likelihood estimate (10). We
used GLIM software (release 4.0; Royal Statistical
Society, London, England) to derive these estimates
from log-linear models that incorporated interaction
terms to allow for source dependencies. The best-fitting model is the one chosen on the basis of a goodness-of-fit (G2) statistic (10). Frischer et al. chose the simplest model with adequate fit (14). Sources 2 and 3
would seem to be dependent, because when we used a
two-source model restricted to those two sources, the
estimated population was 651 as opposed to the
observed population of 727. Thus, the two models
expected to best fit the data were the one allowing for
all two-way interactions and the one allowing for an
interaction between sources 2 and 3 only. If we had
used the approach of Frischer et al., then we would
have chosen the latter (i.e., simpler) model if the goodness-of-fit statistics were comparable.
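For two of the eight log-linear models, the estimate of the unobserved cell has a closed form, which makes the logic easy to inspect without fitting software. The sketch below is not the GLIM analysis used in this study. It assumes the standard closed-form results for (a) the model with all two-way interactions and (b) the model in which only sources 2 and 3 interact. The function names and cell counts are hypothetical, since the data summarized in appendix 1 are not reproduced here.

```python
def unobserved_all_two_way(n):
    """Missing cell under the model with all two-way interactions.

    n maps (in_source_1, in_source_2, in_source_3) -> observed count for the
    seven cells in which a case appears in at least one source.
    """
    numerator = n[1, 1, 1] * n[1, 0, 0] * n[0, 1, 0] * n[0, 0, 1]
    denominator = n[1, 1, 0] * n[1, 0, 1] * n[0, 1, 1]
    return numerator / denominator


def unobserved_2_3_interaction(n):
    """Missing cell when only sources 2 and 3 are allowed to interact.

    Source 1 is then independent of the pair (2, 3), so sources 2 and 3 can
    be pooled and the problem reduces to a two-source estimate.
    """
    only_source_1 = n[1, 0, 0]
    only_pooled = n[0, 1, 0] + n[0, 0, 1] + n[0, 1, 1]
    in_both = n[1, 1, 0] + n[1, 0, 1] + n[1, 1, 1]
    return only_source_1 * only_pooled / in_both


# Hypothetical cell counts, for illustration only.
cells = {
    (1, 1, 1): 200, (1, 1, 0): 100, (1, 0, 1): 150, (0, 1, 1): 60,
    (1, 0, 0): 120, (0, 1, 0): 40, (0, 0, 1): 50,
}
observed = sum(cells.values())
for label, estimate in [
    ("all two-way interactions", unobserved_all_two_way(cells)),
    ("2-3 interaction only", unobserved_2_3_interaction(cells)),
]:
    coverage = 100 * observed / (observed + estimate)
    print(f"{label}: {estimate:.1f} unobserved, coverage {coverage:.1f}%")
```

The remaining models, including the independence model, have no such shortcut and need iterative fitting, which is one reason to use log-linear modeling software for the full comparison.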
RESULTS
Including our data from Tayside, there were 10 surveys suitable for capture-recapture analysis (3, 4,
15-21) (table 1). Maximum likelihood estimates of
unobserved cases and the adjusted prevalence from
each survey were tabulated (table 2). It would appear
that on the whole, the surveys are very complete, with
evidence that the majority had more than 90 percent
coverage. With the exception of the Suffolk survey
(17), the highest prevalence rate in England was 151
cases per 100,000 persons (95 percent confidence
interval 136-167 (variance based), 95 percent confidence interval 128-197 (goodness-of-fit based)),
which was found in North Cambridgeshire (19). It is
notable that none of the adjusted prevalences exceeded
the unadjusted prevalence in Tayside, Lothian and
Borders, or Northern Ireland (table 1). Even the North
Cambridgeshire figure (Allison and Millar (8) cases)
barely exceeded the adjusted prevalence (Poser et al.
(9) cases) in Tayside (table 2). This finding is important,
because it is inconsistent with the hypothesis
that ascertainment differences produce a spurious latitudinal gradient of the prevalence of multiple sclerosis.
Three-source maximum likelihood estimates enable
dependence between sources to be assessed. The best-fitting model (table 3) allowed for all two-way interactions (maximum likelihood estimate, 130; adjusted
prevalence, 217 cases per 100,000 persons; G2 = 0),
and the model that permitted an interaction between
sources 2 and 3 produced a maximum likelihood estimate of 51 cases (adjusted prevalence, 197 cases per
100,000 persons; G2 = 9.818). The latter maximum
likelihood estimate is indistinguishable from the two-source capture-recapture estimates for Tayside shown
in table 2.
The maximum likelihood estimate of unobserved
cases was 130 based on the best-fitting three-source
model, which raises the possibility that two-source
estimates are subject to downward bias (two-source
maximum likelihood estimate from Tayside neurology
department records, 52). However, as the degree of
coverage that results from using the two-source models is similar in northern and southern surveys (table
2), the effect of adjusting for dependencies would
eliminate a north-south difference only if southern surveys were much more incomplete than the Tayside survey. The lowest prevalence of multiple sclerosis in
Scotland (Tayside) is about 183 cases per 100,000 persons (Poser et al. (9) cases, table 1), and the highest
prevalence in the southern part of the United Kingdom
is between 150 and 160 cases per 100,000 persons.

TABLE 2. Adjusted prevalence* and 95% confidence intervals for multiple sclerosis, determined by using prevalence surveys,
United Kingdom, 1985-1996

Survey and year                          No. of         Unobserved  Adjusted Coverage   Adjusted   95% CI            95% CI
(criteria†) (reference)                  people  Source‡     cases numerator      (%) prevalence   (variance based)  (goodness-of-fit, G2)
Sutton, 1985 (AM) (16)                  169,600  GP             20       215     90.6        127   108-146           122-152
                                                 NDR            41       236     82.5        139   118-160           126-179
Southeast Wales, 1985 (AM) (15)         376,718  GP              1       442     99.7        117   106-128           117-118
                                                 NDR             2       443     99.6        118   107-129           118-119
Southampton, 1987 (AM) (18)             417,000  GP             47       458     89.7        110   99-120            103-126
                                                 NDR            16       427     96.2        102   92-112            101-110
Suffolk, 1988 (AM) (17)                  31,379  GP              0        62    100.0        198   148-247           198-198
                                                 NDR            19        81     76.4        258   166-350           245-453
South Cambridgeshire, 1990 (AM) (4)     288,410  GP             31       405     92.2        140   126-155           134-158
                                                 NDR            43       417     89.7        145   130-159           135-167
North Cambridgeshire, 1993 (AM) (19)    379,000  GP             90       539     83.3        142   128-156           126-175
                                                 NDR           125       574     78.2        151   136-167           128-197
Northern Ireland, 1996 (AM) (20)        151,000  GP             34       322     89.4        213   188-238           200-248
                                                 NDR            18       306     94.1        203   179-226           197-224
Southeast Scotland, 1995 (AM) (21)      834,000  GP            325     1,938     83.2        232   220-244           200-279
Tayside, 1996 (AM)                      395,600  GP             41       921     95.5        233   219-247           226-247
                                                 NDR            25       905     97.2        229   215-242           225-238
  Poser et al. (9)                               GP             39       767     94.9        194   180-208           188-208
                                                 NDR            52       780     93.3        197   183-212           189-216

* Per 100,000 people.
† AM, Allison and Millar (8) criteria: early, probable, and possible cases; Poser et al. (9) criteria: definite and probable cases only.
‡ GP, general practice records; NDR, neurology department records.

TABLE 3. Maximum likelihood estimates for three-source capture-recapture models* from Tayside,
Scotland, 1996

Model no.  Model†                        Maximum likelihood             Goodness-of-fit     Adjusted   95% confidence
                                                   estimate  Coverage    (G2) statistic  prevalence‡   interval
1          Independent sources                           35      95.4            43.982          193   179-207
2          1-2 interaction                                7      99.0            43.904          186   173-199
3          1-3 interaction                               39      92.6            43.272          194   180-208
4          2-3 interaction                               51      93.4             9.818          197   183-211
5          1-2, 1-3 interactions                         39      94.9            43.270          194   180-208
6          1-2, 2-3 interactions                         75      92.1             7.002          203   189-217
7          1-3, 2-3 interactions                         62      90.7             6.425          199   186-214
8          1-2, 1-3, 2-3 interactions                   130      84.8                 0          217   203-232

* Tayside multiple sclerosis survey, 1996 (total number of cases identified = 727, source unrecorded in 4):
Poser et al. (9) criteria.
† 1-2 indicates interaction between source 1 and source 2, etc.
‡ Per 100,000 people.

The confidence intervals seemed plausible, in that
they were not too narrow. The lower limits of the variance-based intervals
were below the observed prevalence at least once in all
studies except for the North Cambridgeshire survey,
which was the least complete.
DISCUSSION
To our knowledge, ours is the first attempt to formally test the hypothesis that the latitudinal gradient
of the prevalence of multiple sclerosis in the United
Kingdom is an artifact of differences in ascertainment. Capture-recapture methods have been used previously to determine the completeness of surveys of
disease prevalence. For example, a two-source
capture-recapture analysis applied to a Huntington's
disease prevalence study found that the reported figures could have underestimated prevalence by as
much as 25 percent (22); Hook and Regal subsequently recommended that future surveys of disease
prevalence or incidence should always include a
capture-recapture analysis or the data from which
such analyses could be performed.
However, there are problems associated with using
capture-recapture methodology in health service or
epidemiologic research. The most important is
undoubtedly the effect of dependence between
sources, because it produces downward bias in any
maximum likelihood estimate (10, 11). In our study it
could be argued that dependence did not influence our
conclusions, given that completeness, as determined
by using a two-source model, was similar in the northern and southern parts of the United Kingdom. Thus,
when dependence is modeled, it merely shifts all
prevalence estimates further upward without necessarily causing the estimates to converge. However, if we
had used the approach of Frischer et al. (14) and had
chosen the simplest three-source model with reasonable fit (i.e., model 4, table 3), we could have determined that using a three-source model would not alter
the conclusion derived from using the two-source
models alone, as the numbers of unobserved cases in
both instances were indistinguishable. Therefore,
source dependence appears not to have invalidated our
two-source approach.
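The direction of the bias can be checked with expected cell counts rather than real data. The sketch below is an illustration under assumed values: the population size and capture probabilities are hypothetical, and the function name is illustrative. When being listed in one source raises the chance of being listed in the other, the two-source estimate of missed cases falls short of the true number.

```python
def expected_two_source(population, p_first, p_second_if_first,
                        p_second_if_not_first):
    """Expected cells and two-source estimate for a known population.

    p_second_if_first and p_second_if_not_first are the probabilities of
    appearing in the second source for cases that are, and are not, in the
    first source. Equal values mean the sources are independent.
    """
    both = population * p_first * p_second_if_first
    only_first = population * p_first * (1 - p_second_if_first)
    only_second = population * (1 - p_first) * p_second_if_not_first
    truly_missed = population - (both + only_first + only_second)
    # Two-source estimate computed as if the sources were independent.
    estimated_missed = only_first * only_second / both
    return truly_missed, estimated_missed


independent = expected_two_source(1000, 0.6, 0.6, 0.6)
dependent = expected_two_source(1000, 0.6, 0.7, 0.5)
```

With independent sources the estimate recovers the 160 missed cases; with the positively dependent sources it returns about 86 against a true 200, so the adjusted numerator is a minimum.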
Another problem is the difficulty in calculating confidence intervals. Indeed, violations of model assumptions (such as source dependencies) led Hook and Regal
(10) to suggest that the confidence intervals for maximum likelihood estimates may be misleading. They
have proposed that authors present a plausible range
within which they think the true value lies. On the basis
of our observations, the plausible range of prevalence in
the southern part of the United Kingdom does not
exceed the range for Scotland and Northern Ireland.
Hilden (12) contended that any capture-recapture
adjusted estimate should include confidence intervals,
as there is random error in both the number of observed
cases and the estimate of unobserved cases. Although
Hilden's approach (summing the variance of the maximum likelihood estimate and the variance of the
observed number to give a total variance) may be
regarded as naive, it at least addresses the issue of random variation in the adjusted prevalence figures. Even
when we tried to consider random variation in confidence intervals, we still were left with an unexplained
difference between prevalence estimates (table 2).
As described previously (13), the lower limits of the
variance-based confidence intervals were below the
unadjusted prevalence, presumably because of the small
numbers of unobserved cases in certain studies and the
additional allowance made for error in the numbers of
observed cases. The goodness-of-fit-based confidence
intervals seemed to perform more reliably, in that they
were below the unadjusted prevalence less often and
were skewed toward the upper limit. Regal and Hook
found that variance-based confidence intervals performed as well as goodness-of-fit confidence intervals
when the sample size was large and the probability of
ascertainment was high (13). Therefore, it seems likely
that differences in ascertainment do not explain the difference in the prevalence of multiple sclerosis between
the northern (Scotland/Northern Ireland) and southern
(England/Wales) parts of the United Kingdom.
Some have argued that capture-recapture methods
are of real value in only a limited range of circumstances, for example, when either data are difficult to
obtain (a "difficult-to-catch" population) or all that is
required is a rough estimate of the numbers in a certain
population (23). A study of cancer registration completeness concluded that capture-recapture analysis
could identify possible underascertainment from a
given source only after the registration process was
complete (24) and therefore there was no opportunity
to intervene to improve ascertainment. Despite these
difficulties, the capture-recapture method has offered
an opportunity to explore the problems of underascertainment in surveys of multiple sclerosis.
Other causes of bias in prevalence surveys should be
considered before the conclusion is drawn that the
north-south difference is real. Use of magnetic resonance imaging may increase the prevalence of multiple
sclerosis in more recent surveys because it probably
enables earlier diagnosis, and it may encourage false-positive
diagnoses. Variations in the numbers of early
or suspected cases included in total prevalence estimates also may be misleading, as those researchers
who apply criteria more loosely than others will find
higher overall prevalence estimates. If we had tried to
compare cases that could be diagnosed on clinical
grounds only (i.e., Poser et al. (9) clinically definite
multiple sclerosis or Allison and Millar (8) probable
multiple sclerosis), then we might have eliminated
some of the error due to observer error and use of magnetic resonance imaging. In fact, the magnitude of the
difference in prevalence between the northern and
southern parts of the United Kingdom is reduced when
this is done (e.g., Poser et al. clinically definite multiple sclerosis in Sussex compared to Tayside) and is
eliminated in the comparisons of Allison and Millar
probable multiple sclerosis (table 4).
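The difference between two crude prevalences, with an approximate 95 percent confidence interval, can be computed directly from the counts in table 4. The text does not state how the published intervals were derived; the sketch below assumes a normal approximation with a Poisson variance for each numerator, and the function name is illustrative.

```python
import math


def prevalence_difference(cases_a, pop_a, cases_b, pop_b, z=1.96):
    """Difference (a minus b) in prevalence per 100,000, with an approximate CI."""
    scale = 100_000
    rate_a = scale * cases_a / pop_a
    rate_b = scale * cases_b / pop_b
    # Poisson assumption: the variance of a count equals the count itself.
    se = scale * math.sqrt(cases_a / pop_a ** 2 + cases_b / pop_b ** 2)
    diff = rate_a - rate_b
    return diff, diff - z * se, diff + z * se


# Clinically definite cases (Poser et al. criteria): Tayside, 1996, versus
# Southeast Wales, 1985, using the counts and populations given in table 4.
diff, lower, upper = prevalence_difference(451, 395_600, 298, 376_718)
print(round(diff), round(lower), round(upper))
```

Under this assumption the call prints 35 21 49, which agrees with the Southeast Wales row of table 4.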
The data in table 4 suggest that the prevalence of clinically definite multiple sclerosis in Scotland (Tayside) is 139 percent of the
aggregate prevalence in England and Wales (114 vs. 82 cases per 100,000 persons). This figure is consistent with the value from the only comparative study in the United Kingdom, which used computerized general practice databases (table 1) (25);
Swingler et al. compared the prevalence of people with
a diagnostic code of multiple sclerosis in England and
Wales with that of people in Scotland and found that
the prevalence in Scotland was 137 percent (95 percent confidence interval
126-148) of that in England and Wales.

TABLE 4. Differences in the prevalence of clinically definite and probable cases of multiple sclerosis,
based on Tayside and other surveys, United Kingdom, 1985-1996

Survey and year (reference)          Population  No. of cases  Prevalence*  Difference   95% CI
Poser et al. (9) criteria: clinically definite multiple sclerosis
  Southeast Wales, 1985 (15)            376,718           298           79          35   21 to 49
  Southampton, 1987 (18)                417,000           318           76          38   24 to 51
  South Cambridgeshire, 1990 (4)        288,410           256           89          25   10 to 40
  South Cambridgeshire, 1993 (7)        290,700           281           97          17   2 to 33
  Sussex, 1991 (3)                      596,594           528           89          26   13 to 38
  Channel Islands, 1991 (40)            145,706           120           82          32   14 to 50
  North Cambridgeshire, 1993 (19)       379,000           286           75          39   25 to 52
  Northern Ireland, 1996 (20)           151,000           185          123          -9   -29 to 12
  Tayside, 1996                         395,600           451          114           0
  (All England and Wales)             2,203,428         1,806           82
Allison and Millar (8) criteria: probable multiple sclerosis
  Aberdeen, 1970 (34)                   440,176           310          310          -5   -19 to 7
  Aberdeen, 1973 (36)                   440,176           324          324           2   -10 to 14
  Sutton, 1985 (16)                     169,600           147          147          11   -6 to 27
  Channel Islands, 1991 (40)            145,706           120          120           7   -10 to 24
  Northern Ireland, 1996 (20)           151,000           186          186          47   27 to 68
  Tayside, 1996                         395,600           300          300           0

* Per 100,000 people.
No. of cases, no. of people with multiple sclerosis; Difference, difference in prevalence; 95% CI, 95% confidence interval.

Only those figures from the small study in Suffolk
exceed those for the adjusted prevalence in Tayside of
217 cases per 100,000 persons (tables 1-3). This study
was initiated because of anecdotal reports of a locally
high prevalence, and the study population is small so
that sampling error is high. Nonetheless, the Suffolk
survey suggests that there may be pockets of high
prevalence in the southern part of the United Kingdom.
Our capture-recapture analysis casts light on hitherto
unexplained observations. While most subsequent surveys of multiple sclerosis in an area produce an
increase in prevalence, the repeat survey from southeast Wales (table 1) did not (26). From table 2 it is evident that the first survey (15) was likely to have missed
only four cases at most. As the first survey had such
high coverage, the subsequent survey was unlikely to
increase the prevalence.
A repeat survey in South Cambridgeshire (7) found
58 cases that were missed during the first survey (4).
This figure compares with the 31 or 43 cases estimated,
from using the two-source model, to have been missed
during the initial survey (table 2). The fact that the
actual number of cases missed is larger than predicted
suggests source dependency (as expected). Nonetheless,
it would appear to validate our approach, since the
actual number detected (374/426) and the observed
coverage (88 percent) on resurvey were of the same
order as predicted in the capture-recapture estimation.
The two surveys from adjacent areas of
Cambridgeshire produced different results, with prevalences of 130 cases per 100,000 persons in South
Cambridgeshire and 118 cases per 100,000 persons in
North Cambridgeshire (difference in prevalence, 11; 95
percent confidence interval -5 to 28). These prevalences
converged when we used capture-recapture analysis to adjust
for unobserved cases (table 2).
Our study may have some interesting implications
for those concerned with geographic or ethnic variations in the prevalence of multiple sclerosis.
McDonnell et al. (20) and Rothwell and Charlton (21)
have argued that multiple sclerosis is more common in
Northern Ireland and Scotland than in the southern part
of the United Kingdom (29) because of genetic factors.
While there does seem to be an excess of Poser et al.
(9) clinically definite multiple sclerosis in Northern
Ireland compared with Tayside (table 4), the overall
adjusted prevalences (table 2) are similar (213 and 203
cases per 100,000 persons vs. 233 and 229 cases per
100,000 persons) and considerably higher than any of
the southern United Kingdom estimates. McDonnell et
al. have suggested that there is a "step" in prevalence
between England/Wales and Northern Ireland (20).
The populations of Northern Ireland and Scotland are
linked historically (and therefore genetically), which
may be evidence of a genetic predisposition that places
the Scots and Northern Irish at increased risk compared
with the Welsh and English (27). Interestingly,
Sutherland commented in 1956 that differences in the
observed prevalence between Orkney and Shetland
and the Outer Hebrides may reflect "a constitutional
vulnerability" in those of Nordic as opposed to Celtic
extraction (28).
There are still those who think that the purported latitudinal gradient in the United Kingdom could be
explained by methodological differences and does not
represent true differences in disease prevalence (7).
Although other sources of bias have not been
addressed in this paper, we conclude that differences in
ascertainment cannot explain the difference in the
prevalence of multiple sclerosis between the northern
and southern parts of the United Kingdom. Rather than
consider a latitudinal gradient for multiple sclerosis in
the United Kingdom, and by implication a transmissible environmental causative agent, it seems more
accurate to consider that geographically contained
populations may share a higher than average risk. It is
known that HLA-DR2 alleles that predispose to multiple sclerosis are more common in the general population in Scotland than in the southern part of the United
Kingdom (29), but the precise nature of susceptibility
genes and the role of exogenous agents remain unclear.
ACKNOWLEDGMENTS
The authors thank the executry of Mrs. N. K. Learmont's
estate for funding this project; general medical practitioners
in Tayside, Scotland, who contributed to this study; S.
Ogston, Department of Epidemiology, University of
Dundee, Dundee, Scotland, for statistical advice; and S.
Wilson for coding ascertainment sources.
REFERENCES
1. Raine CS, McFarland HF, Tourtellotte WW. Multiple sclerosis.
Clinical and pathogenetic basis. 1st ed. London, England:
Chapman and Hall, 1997.
2. Kurtzke JF. A reassessment of the distribution of multiple sclerosis. Part one. Acta Neurol Scand 1975;51:110-36.
3. Rice-Oxley M, Williams ES, Rees JE. A prevalence survey of
multiple sclerosis in Sussex. J Neurol Neurosurg Psychiatry
1995;58:27-30.
4. Mumford CJ, Fraser MB, Wood NW, et al. Multiple sclerosis
in the Cambridge health district of East Anglia. J Neurol
Neurosurg Psychiatry 1992;55:877-82.
5. Rice-Oxley M, Williams ES, McKeran RO. Matters arising:
multiple sclerosis in the north Cambridgeshire districts of East
Anglia. (Letter). J Neurol Neurosurg Psychiatry 1996;61:121.
6. Robertson N, Compston A. Surveying multiple sclerosis in the
United Kingdom. J Neurol Neurosurg Psychiatry 1995;58:2-6.
7. Robertson N, Deans J, Fraser M, et al. Multiple sclerosis in
south Cambridgeshire: incidence and prevalence based on a district register. J Epidemiol Community Health 1996;50:274-9.
8. Allison RS, Millar JHD. Prevalence and familial incidence of
disseminated sclerosis. Ulster Med J 1954;23(suppl 2):5-27.
9. Poser CM, Paty DW, Scheinberg L, et al. New diagnostic criteria for multiple sclerosis: guidelines for research protocols.
Ann Neurol 1983;13:227-31.
10. Hook EB, Regal RR. Capture-recapture methods in epidemiology: methods and limitations. Epidemiol Rev 1995;17:243-64.
11. Wittes JT, Colton T, Sidel VW. Capture-recapture methods for
assessing the completeness of case ascertainment when using
multiple information sources. J Chronic Dis 1974;27:25-36.
12. Hilden J. Ascertainment corrected rates: applications of capture-recapture methods. Int J Epidemiol 1994;23:865-6.
13. Regal RR, Hook EB. Goodness-of-fit based confidence intervals for estimates of the size of a closed population. Stat Med
1984;3:287-91.
14. Frischer M, Bloor M, Finlay A, et al. A new method of estimating prevalence of injecting drug use in an urban population: results from a Scottish city. Int J Epidemiol 1991;20:997-1000.
15. Swingler RJ, Compston DAS. The prevalence of multiple sclerosis in south east Wales. J Neurol Neurosurg Psychiatry
1988;51:1520-4.
16. Williams ES, McKeran RO. Prevalence of multiple sclerosis in
a south London borough. BMJ 1986;293:237-9.
17. Lockyer MJ. Prevalence of multiple sclerosis in five rural
Suffolk practices. BMJ 1991;303:347-8.
18. Roberts MHW, Martin JP, McLellan DL, et al. The prevalence
of multiple sclerosis in the Southampton and South West
Hampshire Health Authority. J Neurol Neurosurg Psychiatry
1991;54:55-9.
19. Robertson N, Deans J, Fraser M, et al. Multiple sclerosis in the
north Cambridgeshire districts of East Anglia. J Neurol
Neurosurg Psychiatry 1995;59:71-9.
20. McDonnell GV, Hawkins SA. An epidemiologic study of multiple sclerosis in Northern Ireland. Neurology 1998;50:423-8.
21. Rothwell PM, Charlton D. The incidence and prevalence of
multiple sclerosis in south east Scotland: evidence of a genetic predisposition. J Neurol Neurosurg Psychiatry 1998;64:730-5.
22. Hook EB, Regal RR. The value of capture-recapture methods
even for apparent exhaustive surveys. Am J Epidemiol
1992;135:1060-7.
23. Laporte RE. Assessing the human condition: capture-recapture
techniques. BMJ 1994;308:5-6.
24. Schouten L, Straatman H, Kiemeney LALM, et al. The
capture-recapture method for estimation of cancer registry
completeness: a useful tool? Int J Epidemiol 1994;23:1111-16.
25. Swingler RJ, Rothwell P, Taylor MW, et al. The prevalence of
multiple sclerosis in 604 general practices in the United
Kingdom. (Abstract). Ann Neurol 1994;36:303.
26. Hennesey A, Swingler RJ, Compston DAS. The incidence and
mortality of multiple sclerosis in south east Wales. J Neurol
Neurosurg Psychiatry 1989;52:1085-9.
27. Forbes RB, Swingler RJ. An epidemiologic study of multiple
sclerosis in Northern Ireland. (Letter). Neurology 1999;52:215-16.
28. Sutherland JM. Observations on the prevalence of multiple
sclerosis in Northern Scotland. Brain 1956;79:635-54.
29. Swingler RJ, Compston DAS. The distribution of multiple
sclerosis in the United Kingdom. J Neurol Neurosurg
Psychiatry 1986;49:1115-24.
30. Fog M, Hyllested K. Prevalence of disseminated sclerosis in
the Faroes, the Orkneys and Shetland. Acta Neurol Scand
1966;42:9-11.
31. Poskanzer DC, Prenney LB, Sheridan JL, et al. Multiple sclerosis in the Orkney and Shetland Islands. I: Epidemiology,
clinical factors, and methodology. J Epidemiol Community
Health 1980;34:229-39.
32. Cook SD, MacDonald J, Tapp W, et al. Multiple sclerosis in
the Shetland Islands: an update. Acta Neurol Scand
1988;77:148-51.
33. Cook SD, Cromarty JI, Tapp W, et al. Declining incidence of
multiple sclerosis in the Orkney Islands. Neurology
1985;35:545-51.
34. Shepherd DI, Downie AW. Prevalence of multiple sclerosis in
North-east Scotland. BMJ 1978;2:314-16.
35. Shepherd DI, Downie AW. A further prevalence study of multiple sclerosis in north-east Scotland. J Neurol Neurosurg
Psychiatry 1980;43:310-15.
36. Shepherd DI. Multiple sclerosis in North-east Scotland.
Doctoral thesis. University of Aberdeen, Aberdeen, Scotland,
1976.
37. Phadke JG, Downie AW. Epidemiology of multiple sclerosis in
the north-east (Grampian region) of Scotland—an update. J
Epidemiol Community Health 1987;41:5-13.
38. Dean G, Goodall J, Downie A. The prevalence of multiple
sclerosis in the Outer Hebrides compared with north-east
Scotland and the Orkney and Shetland Islands. J Epidemiol
Community Health 1981;35:110-13.
39. Shepherd DI, Summers A. Prevalence of multiple sclerosis in
Rochdale. J Neurol Neurosurg Psychiatry 1996;61:415-17.
40. Sharpe G, Price SE, Last A, et al. Multiple sclerosis in island
populations: prevalence in the Bailiwicks of Guernsey and
Jersey. J Neurol Neurosurg Psychiatry 1995;58:22-6.
41. Millar JHD. Multiple sclerosis in Northern Ireland. In:
Clifford-Rose F, ed. Clinical epidemiology. Tunbridge Wells,
England: Pitman Medical, 1980:222-7.
42. Hawkins SA, Kee F. Updated epidemiological studies of multiple sclerosis from Northern Ireland. (Abstract). J Neurol
1988(suppl);235:S86.
APPENDIX 1
Three-source Data from the Tayside Survey
                 NDR(VER) Yes           NDR(VER) No
SMR1           GP Yes    GP No       GP Yes    GP No
Yes              204       54          49        27
No               169      149          71         0
where NDR(VER) = neurology department records
and visual evoked responses combined,
GP = responses from a survey of local general medical practitioners,
SMR1 = Scottish morbidity recording discharge
data coding for multiple sclerosis/demyelinating
disease,
total no. of cases identified = 723,
total from NDR(VER) = 576,
total from GP = 493, and
total from SMR1 = 334.
(Please note that 4 cases whose precise source was
uncertain were excluded from the 723 cases.)
Refer to Hook and Regal (10) for the formulas used
to calculate three-source maximum likelihood estimates of unobserved cases.
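As a consistency check on the table above, the seven observed cells can be tabulated and summed to recover the stated totals. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, and it does not compute the three-source maximum likelihood estimate, for which Hook and Regal (10) should be consulted.

```python
# Observed cells from the Tayside survey, keyed by (NDR(VER), GP, SMR1).
# True means the case was found in that source. The cell
# (False, False, False) is the unobserved cell and is left out on purpose.
# The layout of this dictionary is an assumption made for illustration.
cells = {
    (True, True, True): 204,
    (True, False, True): 54,
    (False, True, True): 49,
    (False, False, True): 27,
    (True, True, False): 169,
    (True, False, False): 149,
    (False, True, False): 71,
}

# Total number of identified cases and the total for each source.
total_cases = sum(cells.values())
ndr_total = sum(n for (ndr, gp, smr1), n in cells.items() if ndr)
gp_total = sum(n for (ndr, gp, smr1), n in cells.items() if gp)
smr1_total = sum(n for (ndr, gp, smr1), n in cells.items() if smr1)

print(total_cases, ndr_total, gp_total, smr1_total)
```

The four sums agree with the totals listed in the appendix (723, 576, 493, and 334).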
APPENDIX 2
Two-source Capture-Recapture Model (10)
              Other lists combined
List X          Yes        No
Yes              a          b
No               c          d
where counted cases = n = a + b + c,
no. of missing cases = d = bc/a = maximum likelihood estimate (MLE),
adjusted no. of cases = nMLE = a + b + c + d = a +
b + c + MLE,
coverage (%) = n/nMLE × 100%,
adjusted prevalence = nMLE/population × 100,000,
variance of MLE = ((a + b)(a + c)bc)/a^3, and
variance of nMLE = variance of MLE + variance of
observed cases assuming observed cases are from a
Poisson distribution (see Hilden (12)).
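
The quantities defined above can be computed directly from the three observed cells. The following is a minimal sketch, not the authors' own code: the function name, the returned keys, and the example counts are hypothetical. It makes two assumptions: the denominator of the variance of the MLE is a cubed, as in the usual two-source estimator, and the Poisson variance of the observed cases equals n.

```python
def two_source_estimate(a, b, c, population):
    """Two-source capture-recapture summary (illustrative sketch).

    a: cases on List X and on the other lists combined
    b: cases on List X only
    c: cases on the other lists only
    population: size of the population at risk
    """
    if a <= 0:
        raise ValueError("a must be positive; bc/a is undefined otherwise")

    n = a + b + c                        # counted cases
    mle = b * c / a                      # estimated no. of missing cases (d)
    n_mle = n + mle                      # adjusted no. of cases
    coverage = n / n_mle * 100           # coverage (%)
    prevalence = n_mle / population * 100_000  # per 100,000 persons

    # Assumption: denominator is a cubed, as in the usual two-source formula.
    var_mle = (a + b) * (a + c) * b * c / a**3
    # Assumption: observed cases are Poisson, so their variance equals n.
    var_n_mle = var_mle + n

    return {
        "n": n,
        "mle": mle,
        "n_mle": n_mle,
        "coverage": coverage,
        "prevalence": prevalence,
        "var_mle": var_mle,
        "var_n_mle": var_n_mle,
    }


# Hypothetical counts, chosen only to give round numbers.
result = two_source_estimate(a=60, b=20, c=30, population=100_000)
print(result)
```

With these hypothetical counts, the estimate is 10 missing cases and an adjusted total of 120 cases, so about 92 percent of cases were counted.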