MCBRIDE ET AL.: JOURNAL OF AOAC INTERNATIONAL VOL. 86, NO. 5, 2003
SPECIAL GUEST EDITOR SECTION
Uncertainty in Most Probable Number Calculations for
Microbiological Assays
GRAHAM B. MCBRIDE
National Institute of Water and Atmospheric Research, PO Box 11-115, Hamilton, New Zealand
JUDITH L. MCWHIRTER and MATTHEW H. DALGETY
University of Waikato, Department of Statistics, Hamilton, New Zealand
Microbiological assays commonly use incubations
of multiple tubes in a dilution series, and microorganism concentration is read as a most probable
number (MPN) in standard tables for the observed
pattern of positive tubes. Published MPN tables
differ, sometimes substantially, because of use of
approximate MPN calculation procedures, different rounding conventions in the results, and different methods of calculating confidence or credible intervals. We conclude that the first 2 issues
can now be resolved by using recently developed
exact MPN calculation methods and by reporting
rounding conventions in standard tables. The
third issue is not amenable to complete resolution, especially if credible interval (as opposed to
confidence interval) limits are desired—as we
think they most often are. In that case, Bayesian
statistics are called for and the analyst must provide a distribution of concentration that was presumed to be true before the assay was performed.
This is mathematically combined with the assay
data, resulting in a posterior concentration distribution. These distributions may then be used to
quantify the uncertainty in the MPN estimate, and
the best approach is to use the highest posterior
density regions of these distributions. If based on
diffuse prior information (positing that, prior to an
assay being performed, all positive concentrations are equally likely), then established procedures might be used to calculate the limits and
publish them in standard tables. In the event that
this prior assumption is held to be not satisfactory, we show results for an empirical Bayes procedure, with a Poisson prior distribution, giving
credible interval widths much narrower than in
the other cases examined.
Guest edited as a special report on “Uncertainty of Measurement in
Chemical and Microbiological Testing” by John L. Love.
Corresponding author’s e-mail: [email protected].
Multiple tube incubation techniques have been in use
for many decades and continue today with an increasing variety of setups (i.e., number of dilution
series and numbers of tubes in each series). The essence of
the technique concerns the pattern of positive results obtained after incubation of the tubes. These are routinely
translated into a concentration of microbes via a table of
most probable number (MPN) values, where MPN is defined as the mode of the distribution of all possible concentrations that could have given rise to that pattern. The formal development of these procedures was first set down
during the First World War (1–3), with continuing developments thereafter (4, 5). Those tables may now also contain
confidence limits (typically for 95% intervals and sometimes also for 99% intervals) as measures of uncertainty in
the MPN result (6, 7).
It may come as a surprise that differences remain among
published standard tables, despite the longevity of the development of MPN theory. For example, consider a pattern
of 5-5-2 positive tubes in a setup with 5 tubes in each of
3 series of decimal dilutions. Two standard works have the
equivalent MPN (per 100 mL water) as 500 (6) and as
540 (7). In these 2 cases the 95% confidence limits are
stated as the intervals 200–2000 and 220–2000, respectively. Other examples are cited below. There are 3 causes
for these differences: Most of the methods used to date for
calculation of MPNs are approximate, and it is seldom clear
exactly what approximations form the basis of particular tables; different and unstated rounding conventions appear to
have been used (e.g., rounding 540 to 500, as noted above);
and quite different approaches have been used to develop
the confidence intervals and limits (and some are not actually confidence limits at all).
We suggest that the first 2 causes can now simply be addressed and resolved by using exact MPN methods (8, 9) and
by authorities adopting the convention of stating the rounding
criteria adopted in preparing their tables. Readers would then
at least be aware that differences in MPN among tables
may be attributable to rounding conventions. For the third issue, we survey the types of intervals being used and their implications for the width of the interval reported. It is noted that
the intervals commonly reported are based on 2 quite different
statistical paradigms: classical and Bayesian, and a conscious
choice has to be made between them.
Methods
Calculation of Exact MPN Values
Procedures for exact calculation of MPN values were first described in the 1980s (8, 9), using results of occupancy theory documented by de Moivre in
1718 (10). Essentially this theory allows the calculation of a
probability that a particular tube among a set of replicates will
contain at least one bacterium (as shown by a positive response after incubation) if a number of bacteria are distributed
at random among those tubes. By properly accounting for all
the possible combinations over a wide range of possible numbers of bacteria (n), we can obtain the probability of a particular pattern occurring for each possible value of n. From the resulting bar graph of these occurrence probabilities versus n we
can then read both the MPN (i.e., the value of n for the highest
bar, divided by the total volume in the test’s setup) and its occurrence probability. This is a relatively straightforward task
in computer programming (11). An example of such exact results is given in Table 1, along with approximate MPN values
reported by various authors, including the simple and popular
method reported by Thomas (12). For reasons of compactness
the table refers to series with only 3 replicates per decimal dilution (setups with 5 replicates are rather more common).
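As a minimal sketch of the occupancy calculation described above (not the software of ref. 11), the following Python code evaluates the occurrence probability of a pattern for each candidate number of bacteria n by inclusion-exclusion, and takes the n with the highest probability. It assumes that each bacterium falls independently into a tube with probability proportional to that tube's volume. The function names `pattern_probability`, `exact_mpn`, and `thomas_mpn` and the search limit `n_max` are illustrative choices, not taken from the cited works. The Thomas approximation is written in the form in which it is commonly quoted (positive tubes divided by the square root of the product of the volume in negative tubes and the total volume).

```python
from itertools import product
from math import comb, sqrt


def pattern_probability(n, positives, tubes, volumes):
    """Probability of the observed pattern when n bacteria are scattered at
    random over all tubes (assumption: proportionally to tube volume)."""
    total = sum(k * v for k, v in zip(tubes, volumes))
    # Number of ways to choose which tubes in each dilution are positive.
    n_sets = 1
    for k, x in zip(tubes, positives):
        n_sets *= comb(k, x)
    # Inclusion-exclusion: every chosen tube occupied, all other tubes empty.
    prob = 0.0
    for js in product(*(range(x + 1) for x in positives)):
        sign = (-1) ** (sum(positives) - sum(js))
        ways = 1
        for x, j in zip(positives, js):
            ways *= comb(x, j)
        share = sum(j * v for j, v in zip(js, volumes)) / total
        prob += sign * ways * share ** n
    return n_sets * prob


def exact_mpn(positives, tubes, volumes, n_max=2000):
    """Return (n, MPN per 100 mL, occurrence probability) for the most
    probable n in 1..n_max. Not meaningful for all-negative or all-positive
    patterns, which have no interior maximum."""
    total = sum(k * v for k, v in zip(tubes, volumes))
    best_n = max(range(1, n_max + 1),
                 key=lambda n: pattern_probability(n, positives, tubes, volumes))
    p = pattern_probability(best_n, positives, tubes, volumes)
    return best_n, 100.0 * best_n / total, p


def thomas_mpn(positives, tubes, volumes):
    """Thomas approximation, MPN per 100 mL (volumes in mL)."""
    n_pos = sum(positives)
    ml_negative = sum((k - x) * v for x, k, v in zip(positives, tubes, volumes))
    ml_total = sum(k * v for k, v in zip(tubes, volumes))
    return 100.0 * n_pos / sqrt(ml_negative * ml_total)


setup = dict(tubes=(3, 3, 3), volumes=(10.0, 1.0, 0.1))
n, mpn, p = exact_mpn((2, 0, 0), **setup)
print(n, round(mpn, 2), round(p, 3))             # 2 6.01 0.541
print(round(thomas_mpn((2, 0, 0), **setup), 2))  # 9.5
```

For the 2-0-0 pattern these values agree with the exact, occurrence probability, and Thomas entries of Table 1.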
Approximations for more complex setups, using tables for
subseries, also give rise to inaccuracies. For example, consider
a setup consisting of 5 × 100, 5 × 10, 5 × 1, and 5 × 0.1 mL. If a
5-5-1-0 set of positive tubes is obtained, a common convention (13) is to use a 3-series table to read the value for a 5-1-0
pattern (in a 5 × 10, 5 × 1, 5 × 0.1 mL series), or a 5-5-1 pattern
(in a 5 × 100, 5 × 10, 5 × 1 mL series), and to take the value as
the correct result. In fact this procedure is an approximation to
the correct result because it ignores the volume of sample discarded. For example, the exact MPN for the 5-5-1-0 pattern is
32.76 per 100 mL (i.e., 182 bacteria in the total volume of
555.5 mL), whereas the 5-5-1 pattern gives an MPN of
34.59 per 100 mL and the 5-1-0 pattern gives an MPN of
30.63 per 100 mL.
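The arithmetic behind this comparison is a simple conversion from a most probable count to a concentration. The short sketch below (with a hypothetical helper name, `mpn_per_100ml`) reproduces the 5-5-1-0 figure from the count and total volume quoted above, and lists the total volumes that the two 3-series readings implicitly assume.

```python
def mpn_per_100ml(n_bacteria, total_volume_ml):
    """Convert a most probable count in the whole setup to MPN per 100 mL."""
    return 100.0 * n_bacteria / total_volume_ml


volumes_ml = (100.0, 10.0, 1.0, 0.1)
full = sum(5 * v for v in volumes_ml)          # all four series: 555.5 mL
upper3 = sum(5 * v for v in volumes_ml[:3])    # 5-5-1 reading: 555.0 mL
lower3 = sum(5 * v for v in volumes_ml[1:])    # 5-1-0 reading: 55.5 mL

print(full, upper3, lower3)
print(round(mpn_per_100ml(182, full), 2))      # 32.76
```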
Calculation of Confidence and Credible Intervals
Although early theoretical developments focused on calculation of the MPN value (1–3), attention has since focused on
quantifying some measure of uncertainty about that value.
Routinely, one uses confidence intervals, in which probabilities of various MPNs are computed for a range of assumed
true concentrations; unlikely MPNs then fall into the tail of the
distribution of those probabilities and so define the confidence
limits. This is the relative frequency approach—the probabilities refer to a proportion of outcomes under a particular hypothesis (the assumed concentrations). An example is Woodward’s much-quoted paper (13) describing the use of an
MPN-ordering procedure (as explained in ref. 16), results for
which have been used in standard works (19, 20). Different
sets of intervals have been calculated by using pattern-ordering
in “Sterne-type intervals” (16) and other lognormal approximations (4, 5, 17). It has been held that Woodward’s is
the most accurate of these methods, especially when accompanied by necessary corrections (16). More recently the U.S.
Food and Drug Administration (FDA) has endorsed a narrower set of confidence intervals—those published in 1983 by
de Man (15) as discussed below.
However, a substantive and seldom-addressed issue with
respect to the interpretation of confidence intervals remains.
That is, because these intervals are based on concepts of relative frequency, the 95% (or 99%) probability they invoke refers to the proportion of occasions on which the interval would contain
the true value, were repeated assays to be performed. But
commonly only one assay (or a limited number) is performed,
and the analyst wishes to claim 95% probability that the sample assayed had a microbial concentration in the numerical
range given by the confidence limits for the observed pattern
of positive results. In that case the frequency interpretation has
no meaning (21) and one is actually making a Bayesian statement, that is, a probability statement about the concentration
given the pattern of positives obtained in the test. Furthermore, Bayesian probability calculations can proceed only if
one invokes (even unwittingly) a “prior probability distribution,” that is, the analyst’s view as to the distribution of concentration prior to performing the test. This is used, along with
the data via a likelihood function, to calculate the posterior
probability distribution, mimicking the learning process of
updating previous views in the light of new information.
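The updating step can be sketched numerically. The code below is an illustration only: it assumes the usual single-hit model in which a tube of volume v is positive with probability 1 − exp(−cv) at concentration c, takes a flat prior by default, and approximates the posterior on a uniform grid. The function names, the grid size, and the upper limit `c_max` are hypothetical choices made for this example.

```python
from math import comb, exp


def likelihood(c, positives, tubes, volumes):
    """Probability of the observed pattern at concentration c (per mL),
    assuming each tube is positive with probability 1 - exp(-c * v)."""
    value = 1.0
    for x, k, v in zip(positives, tubes, volumes):
        p_pos = 1.0 - exp(-c * v)
        value *= comb(k, x) * p_pos ** x * (1.0 - p_pos) ** (k - x)
    return value


def posterior_on_grid(positives, tubes, volumes, c_max, steps=20000,
                      prior=lambda c: 1.0):
    """Posterior density (prior times likelihood, normalized) evaluated at
    the midpoints of a uniform grid over (0, c_max)."""
    h = c_max / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    unnorm = [prior(c) * likelihood(c, positives, tubes, volumes) for c in grid]
    norm = sum(unnorm) * h
    return grid, [u / norm for u in unnorm], h


def central_interval(grid, density, h, mass=0.95):
    """Equal-tailed interval: (1 - mass) / 2 of the posterior in each tail."""
    tail = (1.0 - mass) / 2.0
    cum, lower = 0.0, None
    for c, d in zip(grid, density):
        cum += d * h
        if lower is None and cum >= tail:
            lower = c
        if cum >= 1.0 - tail:
            return lower, c
    return lower, grid[-1]


grid, dens, h = posterior_on_grid((1, 0, 0), (3, 3, 3), (10.0, 1.0, 0.1), c_max=2.0)
lo, hi = central_interval(grid, dens, h)
print(round(100 * lo, 1), round(100 * hi, 1))  # limits per 100 mL
```

Passing a different `prior` function changes only the weighting of the likelihood, which is the sense in which the prior must be stated before probability limits can be claimed for a single result.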
The specification of prior probability is a matter of some
contention, because it appears to introduce subjectivity. Yet, if
probability limits are desired for a particular result, one must
invoke a prior distribution. In that regard we note the views
of 2 experienced environmental professionals: “It is interesting that most researchers are taught statistics from a classical
perspective, yet confidence intervals are often interpreted in a
Bayesian sense. When the Bayesian interpretation is adopted,
the analyst should realize that this implies a subjective interpretation for probability, and this should be specified in the
analysis … the prior probability distribution must be stipulated if the Bayesian interpretation for confidence intervals is
adopted…” (22). Such intervals are more properly called
“credible intervals” (23).
In fact, early MPN theory was developed in a Bayesian
framework (1–3), in which a diffuse uniform prior was
adopted, which stated that any positive concentration value is
equally likely (the manner in which this analysis leads to credible intervals was not pursued because of its computational
complexity). More recently, the development of intervals put
forward by de Man (14) is equivalent to a Bayesian approach
using a diffuse prior (16, 24), as the author later acknowledged (15). Technically these are “likelihood intervals” (25),
but in the MPN context they are equivalent to a credible interval with a diffuse prior. Interestingly, these results appear in
many recent standard tables such as those found in standard
methods for food and for water (6, 7). Users of these tables are
in fact using Bayesian intervals correctly. Whether that use is
appropriate depends entirely on how reasonable the adopted
prior distribution is. At least in some cases one can argue that
it is not. For example, the diffuse prior posits that the analyst’s view prior to the data being collected was that all concentrations are equally likely. This prior implies that a water
body is more likely to be grossly contaminated than it is to be
healthy (with a much larger range of concentrations implying contamination), even when historical sampling has routinely demonstrated a healthy state. One can adopt other
more informative priors or adopt the Empirical Bayes approach (26), in which the data are used to guide the choice
and parameter(s) of the prior distribution. One such approach
is to adopt a Poisson prior, based on the notion that microbes
are distributed following a Poisson random process in the
sampled environment (27), and using the calculated MPN as
the mean of that distribution. A detailed description can be
found in Dalgety (28).

Table 1. MPN values reported from the literature, and accompanying occurrence probabilities for a 3 × 10, 3 × 1, and
3 × 0.1 mL setup

             MPN (per 100 mL) reported by reference number
Pattern^a    (2)    (3)^b    (12)  (13)  (14)  (15)     (16)  (17)  (18)  Exact (11)^c    P^d
0-1-0          3     3.05    3.05   3.0     3   3.0     3.01     3     3          3.00  0.090
1-0-0          4     3.57    3.59   3.6     4   3.6     3.57     4     4          3.00  0.901
1-0-1          7     7.23    7.20   7.2     7   7.2     7.23     7     7          6.01  0.016
1-1-0          7     7.36    7.34   7.3     7   7.4     7.36     7     7          6.01  0.162
2-0-0          9     9.18    9.50   9.1     9   9.2     9.18     9     9          6.01  0.541
1-2-0         12    11.38   11.26    11    11    11    11.38    11    11          9.01  0.015
2-0-1         14    14.33   14.31    14    14    14    14.33    14    14         12.01  0.018
2-1-0         15    14.69   14.82    15    15    15    14.68    15    15         12.01  0.184
2-2-0         20    21.06   20.62    21    21    21    21.07    21    21         18.02  0.033
3-0-0         25    23.12   28.62    23    23    23    23.12    23    23         21.02  0.398
3-0-1         40    38.50   38.75    39    40    38    38.50    39    39         36.04  0.034
3-1-0         45    42.73   45.71    43    40    43    42.73    43    43         39.04  0.400
3-1-1         75    74.89   58.42    75    70    75    74.89    75    75         72.07  0.069
3-2-0         95    93.28   75.99    93    90    93    93.28    93    93         90.09  0.339
3-1-2        115   115.21   71.75   120     —   120   115.22   115   120        114.15  0.007
3-2-1        150   149.36   94.92   150   150   150   149.36   149   150        147.15  0.129
3-2-2        200   214.66  115.66   210   210   210   214.66   215   210        213.21  0.025
3-3-0        250   239.79  189.83   240   200   240   239.79   240   240        237.24  0.370
3-3-1        450   462.18  271.24   460   500   460   462.18   462   460        459.46  0.430
3-3-2       1100  1098.95  438.40  1100  1100  1100  1098.95  1099  1100       1096.10  0.446

^a Only some patterns are shown in the table. Those omitted generally have lower occurrence probabilities, and so can be considered as
improbable MPNs.
^b Calculated by the authors using software with Newton-Raphson root-finding to solve the MPN equation of Greenwood and Yule (3; p. 54).
^c Rounding to 2 decimal places was adopted to facilitate comparison with other results.
^d P is the occurrence probability for the given pattern (assuming the MPN is the true concentration), using the exact theory (11).
Table 2 gives selected confidence and credible intervals for
the same pattern of positives shown in Table 1, including this
Empirical Bayes interval. It should be noted that Dalgety’s approach is a naïve Empirical Bayes method and so produces results
that are overconfident (26), i.e., his intervals are too
short. This is because such methods use the data twice (in the
prior distribution and in the data likelihood function). This
naïvety can be addressed by explicitly incorporating posterior
uncertainty about the Poisson parameter (26); this is a fruitful
research area.
We also note that some have proposed the use of the Most
Probable Range to quantify uncertainty, being the range of
values with occurrence probabilities at least 95% of that for
the MPN (9), though its arbitrariness and difficulty of interpretation have been noted (29). This term has also been used
to refer to equi-tailed credible intervals (28), for which perhaps a better term is MCR (Most Credible Range).
Results and Discussion
Table 1 shows that the exact and approximate methods are
in reasonable agreement, except that the Thomas approximation (12) tends to return values that are considerably too low,
especially when many tubes are positive. The exact values
tend also to be a little lower than the remaining approximations. Note that the Thomas approximate MPN for the 3-2-0
pattern is higher than that for the 3-1-2 pattern, in contrast to
all the other methods shown.

Table 2. Published confidence and credible intervals (per 100 mL) for a 3 × 10, 3 × 1, and 3 × 0.1 mL setup

          95% Confidence intervals              95% Credible intervals
Pattern           (13)        (15)      (14)^a      (18)^a      (18)^b      (28)^a
0-1-0         0.085–13      0.1–10     <1.0–17      0.7–17      0.1–15         3–6
1-0-0         0.085–20      0.2–17     <1.0–21      0.9–21      0.1–18         3–6
1-0-1          0.87–21      1.2–17        2–27      2.2–27      1.1–24        6–12
1-1-0          0.88–23      1.3–20        2–28      2.2–28      1.1–24        6–12
2-0-0           1.0–36      1.5–35        2–38      2.9–38      1.3–33        6–12
1-2-0           2.7–36        4–35        4–35      4.1–35      2.6–31        9–15
2-0-1           2.7–37        4–35        5–48      5.2–48      3.1–43        9–21
2-1-0           2.8–44        4–38        5–50      5.3–50      3.2–44        9–21
2-2-0           3.5–47        5–40        8–62      8.5–63      5.8–56       12–30
3-0-0          3.5–120        5–94     <10–130     8.7–130     3.8–108        9–33
3-0-1          6.9–130       9–104      10–180      15–180     8.1–150       18–57
3-1-0          7.1–210       9–181      10–210      17–210     8.4–180       18–60
3-1-1           14–230      17–199      20–280      28–280      16–250      42–102
3-2-0           15–380      18–360      30–380      33–390      18–340      57–123
3-1-2           30–380      30–360      40–350      44–350      29–320      78–150
3-2-1           30–440      30–380      50–500      56–510      34–450     105–189
3-2-2           35–470      30–400      80–640      86–640      59–580     162–264
3-3-0          36–1300      40–990   <100–1400     91–1400     40–1170     183–291
3-3-1          71–2400     90–1980    100–2400    180–2400     88–2070     384–534
3-3-2         150–4800    200–4000    300–4800    380–4800    190–4130    981–1212

^a Central credible interval (with an area of 0.025 in each tail of the posterior distribution).
^b Noncentral HPDR (shortest interval with total tail area of 0.05 in the posterior distribution).
Table 2 shows that the two 95% confidence interval results
displayed are reasonably similar in their widths and limits.
However, because of their particular method of construction,
the intervals of de Man (15) are always shorter than those of
Woodward (13). If confidence intervals are to be used it therefore seems appropriate to endorse the intervals presented by
de Man (15), especially as they have been incorporated into
the FDA (2001) Bacteriological Analytical Manual
(http://vm.cfsan.fda.gov/~ebam/bam-a2.html). We note that a
previous endorsement of Woodward’s confidence intervals (16) was made before de Man’s paper was published.
The first 95% credible interval shown in Table 2, that of de
Man (14), is widely used. As expected it is very similar to the
second interval shown, developed by using a diffuse
prior (18). These intervals are designed to have an area of
0.025 in each tail of their posterior distributions. The third
credible interval, the noncentral case (18), has been obtained
by using a diffuse prior but requiring only that the total tail
area is 0.05. These are guaranteed to be the shortest intervals
satisfying this criterion; for that reason they may be described
as delimiting the highest posterior density region (HPDR).
Such regions have the added attraction that the probability
density at any point inside the interval is greater than at any
outside point (23). Because the posterior MPN distribution is
skewed to the right, both of the HPDR limits are always to the
left of their central credible interval counterparts. As the results show, they are indeed always shorter than the first 2 credible intervals shown in the table. Our implementation of the
Greenwood and Yule theory (3) shows, as expected, that it
gives almost identical answers to those in Beliaeff and
Mary (18) for these 2 intervals. In the HPDR case it also
agrees with results calculated from equivalent procedures reported by Roussanov et al. (30).
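The construction of an HPDR can be sketched as follows. This is an illustrative grid approximation, not the procedure of refs. 18 or 30: it assumes a unimodal posterior, a diffuse prior, and the single-hit model in which a tube of volume v is positive with probability 1 − exp(−cv). The function name `hpd_interval` and the grid settings are hypothetical. The example density is the diffuse-prior posterior for the 1-0-0 pattern in a 3 × 10, 3 × 1, 3 × 0.1 mL setup, written up to a constant factor.

```python
from math import exp


def hpd_interval(grid, density, h, mass=0.95):
    """Shortest interval holding `mass` of a unimodal gridded density:
    add grid cells in order of decreasing density until the mass is reached."""
    order = sorted(range(len(grid)), key=lambda i: density[i], reverse=True)
    cum, chosen = 0.0, []
    for i in order:
        chosen.append(i)
        cum += density[i] * h
        if cum >= mass:
            break
    return grid[min(chosen)] - h / 2, grid[max(chosen)] + h / 2


# Grid of concentrations from 0 to 2 organisms per mL (cell midpoints).
h = 1e-4
grid = [(i + 0.5) * h for i in range(20000)]

# 1-0-0 pattern: one 10 mL tube positive, the other 23.3 mL of sample negative.
unnorm = [(1.0 - exp(-10.0 * c)) * exp(-23.3 * c) for c in grid]
norm = sum(unnorm) * h
dens = [u / norm for u in unnorm]

lo, hi = hpd_interval(grid, dens, h)
print(round(100 * lo, 2), round(100 * hi, 1))  # limits per 100 mL
```

Because the cells are admitted strictly by density, every point inside the resulting interval has higher density than any point outside it, and for a right-skewed posterior both limits fall to the left of the equal-tailed ones.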
The last column shows the Poisson Empirical Bayes credible interval (28), which is much shorter than the others, reflecting the strong influence of the Poisson prior distribution
on the results. We note that others (26) have adopted similar
assumptions. For this interval the central and HPDR intervals
are very similar, because the posterior distribution with a Poisson prior tends to be quite symmetrical.
Finally, there is a question about what MPN estimate
should be reported with credible intervals. It could be argued
that when a Bayesian credible interval approach is used, the
MPN should be read as the median or mean of the posterior
distribution. Our view is that it is better to use the exact value
obtained from occupancy theory because it is exact and because it is the least dependent on assumptions.
Conclusions
The practice of developing standard tables from various approximate procedures should now be abandoned because the
result can be calculated exactly. The promulgation of a computer code implementing these approximate procedures (31,
32) therefore seems inappropriate. Rounding conventions used
in standard tables should always be stated. We do not recommend a particular convention here, other than to note that
rounding a figure of 540 MPN/100 mL to 500 MPN/100 mL (as
noted earlier) seems excessive; the exact integer value in that
case is actually 541 MPN/100 mL (11).
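To illustrate how much a rounding convention can matter, the sketch below rounds the exact value to a stated number of significant figures. The helper `round_sig` is a hypothetical example of one possible convention, not one recommended here or used in any of the cited tables.

```python
from math import floor, log10


def round_sig(value, sig):
    """Round a positive value to `sig` significant figures.
    Note: Python's round() sends exact ties to the nearest even digit."""
    if value <= 0:
        raise ValueError("value must be positive")
    digits = sig - 1 - floor(log10(value))
    return round(value, digits)


print(round_sig(541, 2))  # 540
print(round_sig(541, 1))  # 500
```

Under this convention, 2 significant figures give 540 and 1 significant figure gives 500, so the two published values for the 5-5-2 pattern would be consistent with different rounding rules applied to much the same underlying result.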
In contrast, there is no exact way to calculate measures of
uncertainty about the MPN value. Most importantly, one must
decide between the use of classical confidence intervals and
Bayesian credible intervals. If the former is appropriate (i.e.,
the analyst wants to make probability statements about performance in the long run), then the intervals presented by de
Man (15) should be used. If the Bayesian approach is to be
taken (i.e., the analyst wants to make statements about the current result), we recommend the use of noncentral intervals
(i.e., HPDR), as given for 2 common setups by Beliaeff and
Mary (18). We favor use of the HPDR, rather than equi-tailed
intervals, because they are the narrowest of all possible credible intervals, and all probability densities inside the HPDR are
greater than at any outside point. Software has been developed
that gives essentially the same answers for those setups and
can be applied to any other (11). But note that this approach
assumes as its necessary prior distribution that all concentrations are equally likely, implying that before obtaining new
data the analyst held that a water body was more likely to be
grossly contaminated than it was to be healthy, even when historical sampling had routinely demonstrated a healthy state. If
this precautionary approach is not an appropriate assumption
(and we can see many cases in which it will not be), then a
Poisson Empirical Bayes procedure may be used. In so doing
the calculated uncertainty interval is much narrower. Further
research is desirable on optimal forms of such intervals.
Acknowledgments
Partial funding was obtained from the New Zealand Foundation for Research, Science and Technology (contracts
C01819 and C01X0215), and the Ministry of Health. Our microbiologist colleagues Chris Francis and Desmond Till and
Andrew Ball reviewed the manuscript; Andrew Ball also
tested the software.
References
(1) McCrady, M.H. (1915) J. Infect. Dis. 17, 183–212
(2) McCrady, M.H. (1918) Public Health J. Can. 9, 201–220
(3) Greenwood, M., Jr, & Yule, G.U. (1917) J. Hyg. 16, 36–54
(4) Eisenhart, C., & Wilson, P.W. (1943) Bacteriol. Rev. 7, 57–137
(5) Cochran, W.G. (1950) Biometrics 6, 105–116
(6) APHA (1998) Standard Methods for the Examination of Water and Waste Water, 20th Ed., American Public Health Association, Washington, DC
(7) APHA (2001) Compendium of Methods for the Microbiological Examination of Foods, 4th Ed., American Public Health Association, Washington, DC
(8) Tillett, H.E., & Coleman, R. (1985) J. Appl. Bacteriol. 59, 381–388
(9) Tillett, H.E. (1987) Epidemiol. Infect. 99, 471–476
(10) David, F.N., & Barton, D.E. (1962) Combinatorial Choice, Griffin, London, UK
(11) McBride, G.B. (2003) Preparing Exact MPN Tables Using Occupancy Theory and Accompanying Measures of Uncertainty, NIWA Technical Report 121, Hamilton, New Zealand
(12) Thomas, H.A. (1942) J. Am. Water Works Assoc. 34, 572–576
(13) Woodward, R.L. (1957) J. Am. Wat. Works Assoc. 49, 1060–1068
(14) de Man, J.C. (1977) Eur. J. Appl. Microbiol. 4, 307–316
(15) de Man, J.C. (1983) Eur. J. Appl. Microbiol. 17, 301–305
(16) Loyer, M.W., & Hamilton, M.A. (1984) Biometrics 40, 907–916
(17) Best, D.J. (1990) Int. J. Food Microbiol. 11, 159–166
(18) Beliaeff, B., & Mary, J.Y. (1993) Water Res. 27, 799–805
(19) APHA (1975) Standard Methods for the Examination of Water and Wastewater, 14th Ed., American Public Health Association, Washington, DC
(20) WHO (1984) Guidelines for Drinking-Water Quality, Vol. 1, Annex 2, World Health Organization, Geneva, Switzerland
(21) Casella, G., & Berger, R.L. (1990) Statistical Inference, Wadsworth & Brooks/Cole, Pacific Grove, CA
(22) Reckhow, K.H., & Chapra, S.C. (1983) Engineering Approaches for Lake Management, Volume 1: Data Analysis and Empirical Modeling, Butterworth, Boston, MA
(23) Lee, P.M. (1997) Bayesian Statistics: An Introduction, 2nd Ed., Arnold, London, UK
(24) Aspinall, L.J., & Kilsby, D.C. (1979) J. Appl. Bacteriol. 46, 325–329
(25) Royall, R. (1997) Statistical Evidence: A Likelihood Paradigm, Chapman & Hall, CRC Press, Boca Raton, FL
(26) Carlin, B.P., & Louis, T.A. (2000) Bayes and Empirical Bayes Methods for Data Analysis, Chapman & Hall, CRC Press, Boca Raton, FL
(27) Broman, K., Speed, T., & Tigges, M. (1998) Stat. Sci. 13, 4–8
(28) Dalgety, M.H. (1999) Establishing a Most Probable Range Using Existing Most Probable Number Estimation Techniques, BSc(Hons) Project Report 0655.420, Department of Statistics, University of Waikato, Hamilton, New Zealand
(29) Beliaeff, B. (1995) Water Res. 29, 1215
(30) Roussanov, B., Hawkins, D.M., & Tatini, S.R. (1996) Food Microbiol. 13, 341–363
(31) Hurley, M.A., & Roscoe, M.E. (1983) J. Appl. Bacteriol. 55, 159–164
(32) Parnow, R.J. (1972) Food Technol. 26, 56–62