Vol 25, No. 4
Printed In Great Britain
International Journal of Epidemiology
© International Epidemiological Association 1996
'Unqualified Success' and
'Unmitigated Failure':
Number-Needed-to-Treat-Related
Concepts for Assessing Treatment
Efficacy in the Presence of
Treatment-Induced Adverse Events
MICHAEL SCHULZER AND G B JOHN MANCINI
Schulzer M (Departments of Medicine and Statistics, University of British Columbia, Vancouver, British Columbia, Canada)
and Mancini G B J. 'Unqualified Success' and 'Unmitigated Failure': Number-Needed-to-Treat-related concepts for
assessing treatment efficacy in the presence of treatment-induced adverse events. International Journal of Epidemiology
1996; 25: 704-712.
Background. Common indices for the quantal assessment of treatment efficacy are reviewed. The absolute risk reduction
is a practical index for public health considerations. Its reciprocal has been termed the 'Number Needed to Treat' (NNT),
representing the health effort that must on average be expended to accomplish one tangible treatment target. We extend
the NNT to evaluate outcome combinations of treatment benefits versus treatment harms.
Methods. We describe the mathematical context of the NNT, and extend it to evaluate outcome combinations (treatment
success/failure with/without treatment-induced adverse effects) in a treated population. These extensions are carried out
assuming either independence or positive association between treatment benefit and treatment harm. A method is
provided for calculating the standard errors of these extended NNT values. Applications to cost-effectiveness analysis
are discussed.
Results. We calculate NNT in three recent therapeutic studies. The results of a trial of the prevention of strokes with
warfarin in patients with non-valvular atrial fibrillation are analysed to evaluate treatment success (stroke prevention)
against treatment-induced bleeds. An NNT-related cost-benefit analysis is also carried out. We also analyse the results
of a study of two modalities of chemotherapeutic treatment in small-cell lung cancer, and of two modalities of surgical
intervention in the treatment of cholelithiasis.
Conclusions. The NNT are useful in direct evaluation of outcome-specific treatment benefits versus treatment-induced
harms. They may also be used in cost-effectiveness analyses and are helpful in guiding public health programmes
towards the identification of optimal treatment strategies.
Keywords: treatment efficacy, adverse events, NNT
The concept of 'Number Needed to Treat' (NNT) was
introduced by Laupacis, Sackett and Roberts in 1988,1
and was further elaborated by Sackett et al. in 1991.2
They reviewed several methods of assessing the efficacy of a treatment rated on a quantal (all-or-nothing)
response scale: the relative risk reduction, the odds
ratio and the absolute risk reduction. The NNT was
defined as the reciprocal of the absolute risk reduction,
and was shown to be particularly useful in emphasizing
the effort that must be expended in order to accomplish
a tangible treatment target. Further extension of the
NNT was undertaken by Riegelman and Schroth in
1993.3 They suggested adjustments to the NNT to incorporate the concepts of utility and timing of benefits
and harms.
The NNT has played a quantitative role in the assessment of cost-effectiveness for various clinical interventions. Recently, for example, it has been used in several
Swedish studies for evaluating the potential effects of
primary stroke prevention with anticoagulants in atrial
fibrillation, indicating that treatment with anticoagulants is cost-effective provided that the risk of serious
haemorrhagic complications due to anticoagulants is
kept low.4-6
Departments of Medicine and Statistics, University of British
Columbia, Vancouver, British Columbia, Canada.
Reprint requests to: Michael Schulzer, Department of Medicine, Vancouver Hospital and Health Sciences Centre, Laurel Pavilion, 910 West
10th Avenue, Vancouver, British Columbia, Canada, V5Z 4E3.
ASSESSING TREATMENT EFFICACY
In this work we provide a mathematical characterization of the NNT, and we describe ways in which it can
be usefully extended to assess treatment efficacy adjusted
for treatment-induced adverse effects. This extension is
carried out while allowing for various possible combinations of outcomes in the treated population: benefit
without harm, benefit with harm, harm without benefit,
neither benefit nor harm. These outcome-specific NNT
values remain consistent with the initial mathematical
interpretation of the NNT as the expected value of a
geometric random variable. They are derived under two
distinct scenarios: independence versus positive association between the occurrence of treatment success
and of treatment-induced adverse effects. We also
apply these extended NNT values to the context of cost-effectiveness analysis, allowing for different costs or
utilities of treatment benefits versus treatment harms.
METHODS
The Mathematical Context of the NNT
When an event occurs in a proportion p of the population, and a random sample from this population is
taken, a variable can be defined to measure the 'waiting
time', i.e. the total number of individuals who must be
inspected until an individual is found in whom the
event is first observed to have occurred. This waiting
time variable is known as a 'geometric' variable and is
said to have a 'geometric' distribution.7 It can readily
be shown that the average ('expected') value of this
variable is \(1/p\), the reciprocal of the probability \(p\) of the
event's occurrence in the population: on average one
must inspect \(1/p\) individuals in order to come across
one in whom the event has taken place. On repeated
sampling from the population, the waiting time can be
shown to vary with a standard deviation (s.d.) of
\[ \mathrm{s.d.} = \frac{1}{p}\sqrt{1 - p}. \]
Thus the variation is large when the event occurs
infrequently, i.e. when p is small, and the s.d. decreases
as the frequency of the event grows larger. This
geometric variable occurs, for example, in the classical
context of family planning;7 following the first birth,
the smallest number of additional children needed to
produce a family with at least one child of each sex
follows a geometric distribution. Assuming equal probabilities that a girl or a boy will be born, and provided
that the sexes of successive offspring are independent,
it is clear that after the first birth in a family, there is a
constant probability of 1/2 that any subsequent birth
will result in a child of the opposite sex to that of the
first-born. Thus, the waiting time (i.e. the number of
additional children) until a child of the opposite sex
is born is a geometric variable with an average value
of 1/(1/2) = 2. On average, then, two
additional children are required to obtain one child of each sex, leading to an average total family
size of 3 (1 + 2).
Since the s.d. of the family size variable above is
\[ \mathrm{s.d.} = \frac{1}{p}\sqrt{1 - p} = \sqrt{2} = 1.4, \]
it also follows that, if one were to record total family
sizes required to achieve one child of each sex in,
say, 10 families, the average size would lie between
\(3 - (1.96 \times 1.4/\sqrt{10})\) and \(3 + (1.96 \times 1.4/\sqrt{10})\), i.e.
approximately between two and four children, 95% of
the time.
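The waiting-time calculation above can be checked numerically. The following is a minimal sketch, not part of the original paper; the function name `geometric_waiting_time` is introduced here purely for illustration.

```python
import math

def geometric_waiting_time(p):
    """Return (mean, s.d.) of the number of trials until the first event."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    mean = 1 / p
    sd = math.sqrt(1 - p) / p
    return mean, sd

# Family example: after the first birth, each later birth has
# probability 1/2 of being a child of the opposite sex.
extra_mean, extra_sd = geometric_waiting_time(0.5)
family_mean = 1 + extra_mean                    # first child plus expected wait

# Approximate 95% range for the average family size over 10 families.
half_width = 1.96 * extra_sd / math.sqrt(10)
lower, upper = family_mean - half_width, family_mean + half_width

print(f"mean family size {family_mean:.1f}, s.d. {extra_sd:.2f}")
print(f"95% range for the mean of 10 families: {lower:.2f} to {upper:.2f}")
```

The range uses the normal approximation stated in the text, which is only rough for a sample of 10 skewed observations.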
One measure of the efficacy of a preventive treatment is the corresponding absolute risk reduction:1
if condition A (e.g. ischaemic stroke) occurs in the
untreated population with a probability \(p_1\) and in the
treated population with a (lower) probability \(p_2\), then
the absolute risk reduction induced by the treatment is
defined as \(p_1 - p_2 = \Delta p\); \(\Delta p\) represents the probability
that the treatment results in a 'success', i.e. in the
prevention of condition A. Applying the concepts of the
geometric variable outlined above, it will be necessary
to treat on average \(1/\Delta p\) individuals to achieve one
prevention of condition A. Thus, \(1/\Delta p\) is the 'Number
Needed to Treat', or NNT, characterizing this treatment:
it is defined as the average number of observations
needed (average waiting time) in order to encounter one
occurrence of treatment success.
Let us now suppose that an adverse event B (e.g.
cerebral haemorrhage) also occurs in the untreated
population with probability \(q_1\). Assume further that B
may in addition occur as an adverse effect induced by
the treatment under investigation, so that in the treated
population, B occurs with a total probability \(q_2\), with
\(q_2 > q_1\). For the moment we assume that A and B occur
independently of each other in both the untreated and
the treated populations.
To fix ideas, suppose \(p_1 = 25\%\), \(p_2 = 5\%\), \(q_1 = 10\%\)
and \(q_2 = 50\%\) (Figures 1a, 1b).
Thus, the probability of treatment success in this
population is \(p_1 - p_2 = \Delta p = 20\%\), with a corresponding
NNT of \(1/\Delta p = 5\) (s.d. = ± 4.5): five individuals must on
average be treated to achieve one treatment success, i.e.
one prevention of condition A.
However, adverse effects induced by the treatment
occur at a rate of \(q_2 - q_1 = \Delta q = 40\%\). The corresponding NNT (1/0.4 = 2.5 ± 1.9) gives the average
number of individuals who must be treated to produce
one treatment-induced adverse effect (equivalently, an
average of five inspections will yield two occurrences
of treatment-induced adverse effects).
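As an illustration, the two NNT values of this worked example and their standard deviations can be reproduced in a few lines. This is a sketch under the geometric-variable interpretation given above; the helper name `nnt` is an assumption introduced here.

```python
import math

def nnt(risk_difference):
    """NNT (mean waiting time) and its s.d. for an absolute risk difference."""
    mean = 1 / risk_difference
    sd = math.sqrt(1 - risk_difference) / risk_difference
    return mean, sd

p1, p2 = 0.25, 0.05   # condition A: untreated, treated
q1, q2 = 0.10, 0.50   # adverse event B: untreated, treated

nnt_success, sd_success = nnt(p1 - p2)   # treatment success
nnt_harm, sd_harm = nnt(q2 - q1)         # treatment-induced adverse effect

print(f"success: NNT {nnt_success:.1f} (s.d. {sd_success:.1f})")
print(f"harm:    NNT {nnt_harm:.1f} (s.d. {sd_harm:.1f})")
```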
FIGURE 1a An untreated population, suffering (independently) from condition A (e.g. stroke) (lined; \(p_1 = 25\%\)) and from adverse event B (e.g. cerebral haemorrhage) (stippled; \(q_1 = 10\%\))
Rectangle I: individuals who do not suffer from A or from B: \((1 - p_1) \cdot (1 - q_1) = 67.5\%\) of total population; II: individuals suffering from B only: \((1 - p_1) \cdot q_1 = 7.5\%\); III: individuals suffering from A only: \(p_1 \cdot (1 - q_1) = 22.5\%\); IV: individuals suffering from both A and B: \(p_1 \cdot q_1 = 2.5\%\).
Probabilities are calculated under the assumption of independence. Areas of rectangles are drawn to scale.

FIGURE 1b A treated population, suffering (independently) from condition A (lined; \(p_2 = 5\%\)) and from adverse event B (stippled; \(q_2 = 50\%\)). Broken lines in the Figure indicate corresponding boundaries in the untreated population of Figure 1a
Rectangle I: individuals who never develop condition A or adverse event B, whether treated or not: \((1 - p_1) \cdot (1 - q_2) = 37.5\%\) of total population; II: individuals who never develop A, whether treated or not, and in whom treatment has induced B: \((1 - p_1) \cdot (q_2 - q_1) = 30\%\); III: individuals who never develop A, whether treated or not, and who develop B unrelated to treatment: \((1 - p_1) \cdot q_1 = 7.5\%\); IV: individuals in whom A has been prevented by treatment, and who never develop B, whether treated or not: \((p_1 - p_2) \cdot (1 - q_2) = 10\%\); V: individuals in whom A has been prevented by treatment, and in whom treatment has induced B: \((p_1 - p_2) \cdot (q_2 - q_1) = 8\%\); VI: individuals in whom A has been prevented by treatment, and who develop B unrelated to treatment: \((p_1 - p_2) \cdot q_1 = 2\%\); VII: individuals in whom treatment has failed to prevent A, and who never develop B, whether treated or not: \(p_2 \cdot (1 - q_2) = 2.5\%\); VIII: individuals in whom treatment has failed to prevent A, and in whom treatment has induced B: \(p_2 \cdot (q_2 - q_1) = 2\%\); IX: individuals in whom treatment has failed to prevent A, and who develop B unrelated to treatment: \(p_2 \cdot q_1 = 0.5\%\).
Probabilities are calculated under the assumption of independence. Areas of rectangles are drawn to scale. The event 'treatment success' is represented by rectangles (IV + V + VI), with total probability 20%; 'treatment failure' occurs in (VII + VIII + IX), with total probability 5%; 'unqualified success' corresponds to (IV + VI), with total probability 12%; 'unmitigated failure' is represented by (VIII), with probability 2%.

FIGURE 1c A treated population, characterized by a positive association between the successful prevention of condition A and the induction of adverse event B
The notation follows Figure 1b, with \(p_2 = 5\%\) and \(q_2 = 50\%\) as before, but the probability of treatment-induced adverse event B in individuals in whom condition A has been successfully prevented is now 80% (versus 40% in Figure 1b). Events (I) through (IX) have the same interpretation as in Figure 1b. However, the probabilities now become: I: 45%; II: 22.5%; III: 7.5%; IV: 2%; V: 16%; VI: 2%; VII: 3%; VIII: 1.5%; IX: 0.5%.
The probability of treatment success is 20%, treatment failure 5%, unqualified success 4%, unmitigated failure 1.5%. Areas of rectangles are drawn to scale.

Various combinations of mutually exclusive events
of importance can be identified in the treated population. Thus, in Figures 1b and 1c, rectangles I, II and III
represent individuals in the treated population who
never would develop condition A, whether treated or
not. The three rectangles subdivide these individuals
into those who also never develop adverse event B,
whether treated or not (I), those in whom the treatment
induces B (II), and those who develop adverse event B
unrelated to the treatment (III). Similarly, rectangles
IV, V and VI represent individuals in whom the
treatment has successfully prevented condition A, again
subdivided into 'never B' (IV), 'treatment-induced B'
(V), and 'B unrelated to treatment' (VI). Finally, rectangles VII, VIII and IX represent individuals in whom
treatment has failed to prevent condition A. These are
again subdivided according to event B as above. The
probabilities of each rectangle are readily calculated
under the assumption that events A and B occur
independently in the treated population (Figure 1b).
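The nine rectangle probabilities of the treated population can be tabulated directly under independence. The sketch below is illustrative only; the function name `treated_rectangles` and the ordering of the Roman numerals (rows by status of A, columns by status of B) are assumptions chosen to match the figure legend.

```python
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]

def treated_rectangles(p1, p2, q1, q2):
    """Probabilities of the nine combined outcomes, assuming independence."""
    # Status of A: never develops A, A prevented by treatment, A not prevented.
    a_probs = [1 - p1, p1 - p2, p2]
    # Status of B: never develops B, B induced by treatment, B unrelated to treatment.
    b_probs = [1 - q2, q2 - q1, q1]
    cells = [pa * pb for pa in a_probs for pb in b_probs]
    return dict(zip(ROMAN, cells))

rect = treated_rectangles(p1=0.25, p2=0.05, q1=0.10, q2=0.50)
for name, prob in rect.items():
    print(f"{name:>4}: {100 * prob:.1f}%")

success = rect["IV"] + rect["V"] + rect["VI"]        # treatment success
failure = rect["VII"] + rect["VIII"] + rect["IX"]    # treatment failure
```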
Of special interest are individuals in whom condition
A has been successfully prevented, while adverse event
B has not been induced by the treatment (rectangles IV
and VI). We may refer to these individuals as having
experienced 'unqualified success'. Under the assumption of independence (Figure 1b), the total probability
of unqualified success is given by
\[ \Delta p\,(1 - \Delta q) = 12\%, \]
yielding an NNT of
1/0.12 = 8.3 (± 7.8).
On average, 8.3 individuals must be treated to
produce one 'unqualified success', i.e. one individual
in whom therapeutic success is achieved (condition A
is prevented) while, simultaneously, the individual is
spared from the treatment-induced adverse effect B;
(equivalently, an average of 25 individuals must be
treated to produce three unqualified successes). Thus,
the stricter requirement that treatment success should
not be accompanied by a treatment-induced adverse
effect increases the NNT by a factor of \(1/(1 - \Delta q) = 1.67\),
(i.e. by 67% or 3.3 individuals), representing a considerable increase in treatment effort.
At the other end of the spectrum are the unfortunate
individuals in whom the treatment fails to provide
protection from condition A but nevertheless produces
the adverse effect B (rectangle VIII in Figure 1b). The
probability of such 'unmitigated failure' under independence is \(p_2 \cdot \Delta q = 2\%\), yielding an NNT of 1/0.02 =
50: 50 individuals must on average be observed to come
across one unmitigated failure.
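The outcome-specific NNT values under independence follow from two products. A minimal sketch using the illustrative rates of this section is given below; the variable names are introduced here for convenience.

```python
import math

def nnt_with_sd(prob):
    """Mean and s.d. of the geometric waiting time for an outcome of probability prob."""
    return 1 / prob, math.sqrt(1 - prob) / prob

p1, p2, q1, q2 = 0.25, 0.05, 0.10, 0.50
dp = p1 - p2          # probability of treatment success
dq = q2 - q1          # probability of a treatment-induced adverse effect

p_unqualified = dp * (1 - dq)   # success without induced harm
p_unmitigated = p2 * dq         # failure with induced harm

nnt_unq, sd_unq = nnt_with_sd(p_unqualified)
nnt_unm, sd_unm = nnt_with_sd(p_unmitigated)
inflation = 1 / (1 - dq)        # factor applied to the NNT of treatment success

print(f"unqualified success: NNT {nnt_unq:.1f} (s.d. {sd_unq:.1f})")
print(f"unmitigated failure: NNT {nnt_unm:.1f} (s.d. {sd_unm:.1f})")
print(f"inflation factor: {inflation:.2f}")
```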
The above calculations were based on the assumption
that condition A and adverse effect B occurred
independently within both the untreated and the treated
populations. It is however plausible that condition A
and adverse event B are positively (or in some cases,
negatively) associated within the treated population.
The successful prevention of condition A and the induction by the treatment of the adverse event B may be
correlated, notably when the 'therapeutic window' (i.e.
the difference between the minimum effective dose and
the maximum tolerated dose8) of the treatment is quite
narrow. Figure 1c presents an example in which the
events A and B are positively associated in the treated
population.
In Figure 1c, \(p_1\), \(p_2\), \(q_1\) and \(q_2\) are the same as in
Figure 1b: \(p_1 = 25\%\), \(p_2 = 5\%\), \(q_1 = 10\%\), \(q_2 = 50\%\).
Thus, the probabilities and NNT of the marginal events
of treatment success and of treatment-induced adverse
effects remain unchanged. However, we have introduced a positive correlation between A and B in this
treated population by making the assumption that of
the 20% of individuals who are successfully treated
(i.e. in whom A is prevented), a full 80% will experience treatment-induced adverse effect B, while in the
remaining 80% of the treated population, only 30% will
experience treatment-induced effect B. In this situation,
the probability of an unqualified success is only 4%,
with a corresponding NNT of 25 (± 24.5). The strict
requirement of unqualified success requires an increase in NNT of 20 (25 - 5) over the more modest
requirement of treatment success alone. Finally, the
probability of unmitigated failure is, in this configuration, 1.5% (NNT = 66.7 ± 66.2).
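The positively associated scenario can be sketched by replacing the common rate of induced harm with conditional rates. The conditional probabilities below (80% among treatment successes, 30% among everyone else) are those stated in the text; the variable names are assumptions made for this illustration.

```python
import math

dp = 0.20                    # probability of treatment success
p2 = 0.05                    # probability of treatment failure
harm_given_success = 0.80    # induced B among those in whom A was prevented
harm_given_other = 0.30      # induced B in the remaining 80% of the population

# The marginal rate of induced harm is unchanged from the independent case.
marginal_harm = dp * harm_given_success + (1 - dp) * harm_given_other

p_unqualified = dp * (1 - harm_given_success)
p_unmitigated = p2 * harm_given_other

def nnt_with_sd(prob):
    return 1 / prob, math.sqrt(1 - prob) / prob

nnt_unq, sd_unq = nnt_with_sd(p_unqualified)
nnt_unm, sd_unm = nnt_with_sd(p_unmitigated)
print(f"marginal induced harm: {marginal_harm:.2f}")
print(f"unqualified success: NNT {nnt_unq:.1f} (s.d. {sd_unq:.1f})")
print(f"unmitigated failure: NNT {nnt_unm:.1f} (s.d. {sd_unm:.1f})")
```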
NNT and Cost-Effectiveness
The NNT can be extended to cost-effectiveness
analyses and to cost-utility analyses. Costs are
measured in monetary terms, and effectiveness in
physical terms (e.g. number of strokes avoided, or
number of life years gained). Effectiveness may also be
measured in units of utility (e.g. number of life years
gained adjusted for quality of life). In cost-benefit
analyses, both costs and benefits are measured in
economic terms.6,9 Riegelman and Schroth3 extend the
NNT to take account of the timing of benefits and of
harms, using life expectancy, and time-discounting the
value of years of life gained or lost in the future. Their
modified NNT measures the number needed to produce
one additional year of life at present value ('current
year of life'). They also calculate the NNT in the
context of utility, to measure the number needed to
produce one additional year of quality-adjusted time-discounted life expectancy.
In Application 1 below we calculate the NNT in
the context of primary stroke prevention in atrial
fibrillation, and discuss simple methods for applying
outcome-specific NNT in a cost-benefit analysis.
Applications 2 and 3 illustrate the direct uses of NNT as
a clinical measure in comparing two chemotherapeutic
modalities in the treatment of small-cell lung cancer,
and two surgical modalities in the treatment of
cholelithiasis.
Applications
1. We analyse the results of a recent randomized trial
(BAATAF10) assessing the efficacy of warfarin therapy
in the prevention of strokes in patients with non-valvular atrial fibrillation. In particular, we derive
adjusted NNT values, balancing the prevention of embolic end-points against the induction, through therapy,
of haemorrhagic end-points (fatal intracerebral haemorrhages, major and minor bleeds), and apply the results
to a cost-benefit analysis.
Table 1 shows the reported incidence in the
BAATAF trial10 for various aggregated embolic and
haemorrhagic end-points. We consider in our example
the prevention of 'total strokes', adjusted for the adverse
effect of 'total bleeds'. The trial had an average duration of follow-up of 2.2 years. In all, 208 patients were
randomized to control; of these patients, 13 developed
strokes and 21 experienced bleeds. Of 212 patients randomized to warfarin, 2 suffered strokes and a total of 40
had bleeds.
TABLE 1 Reported incidence of embolic and haemorrhagic end-points (BAATAF10) (mean follow-up duration: 2.2 years)

                                                 Warfarin     Control
                                                 (n = 212)    (n = 208)
Embolic end-points:
  Moderate/minor stroke                              1            6
  Fatal/major stroke                                 1            7
  All strokes                                        2           13
Haemorrhagic end-points:
  Fatal intracerebral haemorrhage/major bleed        8            7
  Minor bleeds                                      32           14
  All bleeds                                        40           21
These values yield the following annualized rates:
\(p_1\) = rate of total strokes in controls (Figure 1a) = 2.84%
\(q_1\) = rate of total bleeds in controls = 4.59%
\(p_2\) = rate of total strokes under warfarin therapy (Figure 1b) = 0.43%
\(q_2\) = rate of total bleeds under warfarin therapy = 8.58%.
Thus, the probability of treatment success (prevention
of total strokes) is given by
\[ \Delta p = p_1 - p_2 = 2.41\%, \]
with a corresponding NNT value of:
\[ \mathrm{NNT} = 1/\Delta p = 41. \]
Thus, on average, 41 patients have to be treated for
one year to prevent one stroke of any type. A method
for estimating the standard error associated with this
NNT is given in the Appendix.
We now consider the combined event of treatment
success without treatment-induced bleeds. To calculate
the NNT corresponding to this unqualified success, we
assume, conservatively, that the prevention of strokes
occurs independently of the induction of bleeds (Figure
1b). Using the formula discussed previously we have
\[ \mathrm{NNT} = \frac{1}{\Delta p\,(1 - \Delta q)} = \frac{\text{NNT (treatment success)}}{1 - \Delta q}. \]
Here \(\Delta q = q_2 - q_1 = 3.99\%\), so that \(1/(1 - \Delta q) = 1.04\)
is the factor multiplying the NNT of treatment success
to obtain the NNT of unqualified success. Thus, the
NNT has to be increased by some 4% to guarantee
treatment success without any bleeds. Specifically, for
unqualified success, NNT = 43. Based on the findings
of the BAATAF trial, the NNT for treatment success
must therefore be increased by two patients (from 41 to
43) to come across one patient in whom warfarin has
prevented a stroke without inducing any bleed in the
same individual. The expected burden on health costs to
achieve an unqualified success is thus the delivery of
care to two additional patients.
We can also readily calculate the NNT corresponding
to unmitigated failure: the failure to prevent a stroke while
inducing a bleed. In fact, NNT = \(1/(p_2 \cdot \Delta q)\) = 5849,
indicating that unmitigated failures, as defined here, occur
rarely: for 5849 individuals treated per year, one may expect
to encounter on average one such extreme failure.
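The BAATAF figures quoted above can be reproduced from the raw counts. The sketch below assumes that the annualized rates were obtained by dividing each crude proportion by the mean follow-up of 2.2 years; this assumption is not stated in the text but it recovers the reported rates. The helper name `annual_rate` is hypothetical.

```python
FOLLOW_UP_YEARS = 2.2   # mean follow-up reported for the trial

def annual_rate(events, n, years=FOLLOW_UP_YEARS):
    """Crude proportion divided by mean follow-up (an assumed annualization)."""
    return events / n / years

p1 = annual_rate(13, 208)   # strokes, control
q1 = annual_rate(21, 208)   # bleeds, control
p2 = annual_rate(2, 212)    # strokes, warfarin
q2 = annual_rate(40, 212)   # bleeds, warfarin

dp, dq = p1 - p2, q2 - q1
nnt_success = 1 / dp
nnt_unqualified = 1 / (dp * (1 - dq))   # independence assumed
nnt_unmitigated = 1 / (p2 * dq)

print(f"p1 {100*p1:.2f}%  q1 {100*q1:.2f}%  p2 {100*p2:.2f}%  q2 {100*q2:.2f}%")
print(f"NNT success {nnt_success:.1f}, unqualified {nnt_unqualified:.1f}, "
      f"unmitigated {nnt_unmitigated:.0f}")
```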
Suppose we wish to evaluate the strategy of introducing preventive anticoagulant treatment to this population by means of a cost-benefit analysis. We may, for
example, use the following cost estimates, derived from
Swedish data.4,6 The cost per year of preventive treatment with warfarin is assessed at US $567 per person;
the total cost of a stroke (direct plus indirect costs
discounted for future value) is estimated at US $79 000
per patient from first stroke to death; median survival
time from onset of stroke (all strokes diagnosed) is
approximately 3 years; thus, the median cost of a stroke
is approximately US $26 333 per patient per year.
Using the annualized NNT estimates derived above
for the BAATAF trial, we can assess the net cost of
preventing one stroke, and compare it to the net cost of
producing one unqualified success. Since 41 individuals
need to be treated to prevent one stroke, the total cost of
treatment (US $23 247 per year) clearly offsets the
cost of a stroke, yielding a net gain of 26 333 - 23 247 =
US $3086 per year per stroke prevented. Forty-three individuals need to be treated to produce one
unqualified success, i.e. one prevention of a stroke without the induction of a bleed. Let the cost of a bleed be
represented by the arbitrary amount of US $X per
year. Then, the total cost of treating 43 individuals
(US $24 381 per year) clearly offsets the combined
yearly cost of a stroke and a bleed, viz. US $(26 333 + X),
with a net gain of US $(1952 + X) per year. More
specifically, the two additional individuals needed to
treat to produce an unqualified success, rather than
merely the prevention of one stroke, cost US $1134 per
year, and should offset the cost of US $ X per year of
the prevention of one bleed. Thus, a net gain occurs
through treatment whenever the annual cost of a bleed
exceeds the breakeven value of US $1134. This method
of applying cost-benefit analysis to NNT values is used,
for example in Gustafsson et al.5
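The arithmetic of this simple comparison can be laid out as follows. This is a sketch using the rounded NNT values and cost estimates quoted in the text; the function name `net_gain` is introduced here for illustration.

```python
TREATMENT_COST = 567    # US $ per patient per year
STROKE_COST = 26_333    # US $ per patient per year (79 000 spread over about 3 years)

def net_gain(nnt, harm_cost_avoided):
    """Yearly cost of the harm avoided minus the cost of treating nnt patients."""
    return harm_cost_avoided - nnt * TREATMENT_COST

gain_stroke_prevented = net_gain(41, STROKE_COST)     # one stroke prevented
gain_before_bleed_cost = net_gain(43, STROKE_COST)    # bleed cost X is added to this
extra_treatment_cost = (43 - 41) * TREATMENT_COST     # two additional patients

print(gain_stroke_prevented, gain_before_bleed_cost, extra_treatment_cost)
```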
TABLE 2 Cost-benefit analysis for preventive anticoagulant treatment in atrial fibrillation (BAATAF data10, Swedish cost estimates4,6)a: treatment versus harms (stroke and/or bleed)

Costs under treatment per year (US $)
Combined outcome               Cost per 10 000 treated
(Figure 1b)          NNT       (treatment + harms)
I                      1       5 036 660
II                    26         219 429 + 387X
III                   22         252 882 + 446X
IV                    45         125 024
V                   1040           5 500 + 10X
VI                   903           6 237 + 11X
VII                  255          22 113 + 1 026 987
VIII                5849           1 021 + 47 399 + 2X
IX                  5082           1 134 + 52 666 + 2X
Total costs                    5 670 000 + 1 127 052 + 858X
                               = 6 797 052 + 858X

Costs under no treatment per year (US $)
Combined outcome               Cost per 10 000 untreated
(Figure 1a)          NNT       (harms)
I                      1       -
II                    22       446X
III                   37       7 136 243
IV                   767         342 329 + 13X
Total costs                    7 478 572 + 459X

a Cost of treatment: US $567 per patient per year, cost of stroke: US $26 333 per patient per year, breakeven cost of bleed: US $X per patient
per year.

A more systematic NNT-related approach to cost-benefit analysis for this problem is shown in Table 2.
Using the Swedish cost estimates given above,4,6 we
calculate the total costs (treatment plus harms) for each
of the nine combined events described in Figure 1b. We
assume for simplicity that the costs of a stroke and of a
bleed are additive when they occur together, (more
specific estimates may of course be substituted), and we
again represent the breakeven cost of a bleed as US $ X
per year. The overall cost of treatment is then compared
to the overall cost of no treatment. Total treatment cost
in Table 2 is estimated at US $6 797 052 + 858X per
year for treating 10 000 individuals. The same individuals left untreated are estimated to cost US $7 478 572
+ 459X per year, due to the uncontrolled strokes and the
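spontaneous bleeds that will occur. Since treatment adds 399 bleeds per 10 000 individuals (858 versus 459) while lowering the remaining costs by US $681 520, the treatment will
provide a net gain over no treatment as long
as the yearly cost of a bleed X remains below US $1708.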
Similar calculations may be carried out using other
measures of cost-effectiveness or cost-utility analyses,
such as the number of strokes prevented, or quality-adjusted life years (QALY).
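The breakeven cost of a bleed is the value of X at which the two linear cost totals coincide. A minimal sketch, taking the totals per 10 000 individuals exactly as quoted in the text (the function name `breakeven_bleed_cost` is an assumption):

```python
def breakeven_bleed_cost(fixed_treated, bleeds_treated,
                         fixed_untreated, bleeds_untreated):
    """Bleed cost X at which the two yearly cost totals are equal."""
    return (fixed_untreated - fixed_treated) / (bleeds_treated - bleeds_untreated)

# Totals per 10 000 individuals per year: fixed part (US $) and number of bleeds.
treated_fixed, treated_bleeds = 6_797_052, 858
untreated_fixed, untreated_bleeds = 7_478_572, 459

x_star = breakeven_bleed_cost(treated_fixed, treated_bleeds,
                              untreated_fixed, untreated_bleeds)
print(f"breakeven cost of a bleed: US ${x_star:.0f} per year")
```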
2. We apply our NNT and adjusted NNT calculations to
a randomized trial comparing weekly versus 3-week
chemotherapy in small-cell lung cancer (SCLC).11 The
aim of the study was to determine if weekly scheduling
produced higher response rates than conventional
chemotherapy treatment. In all, 438 patients with SCLC
were randomized: 221 received weekly chemotherapy
consisting of 12 alternating cycles of ifosfamide/
doxorubicin and cisplatin/etoposide (PE); 217 were
assigned to the reference arm of the study, receiving a
3-week regimen of six alternating cycles of cyclophosphamide/doxorubicin/vincristine and PE. Complete
response was defined as complete radiological clearing
of the chest x-ray and disappearance of all symptoms,
signs, and biochemical abnormalities, as well as normalization of investigations that had indicated metastatic
disease. Partial response was a 50% or greater reduction
in the size of the tumour on the chest x-ray, with
improvement or stability in other disease sites. Both
complete and partial responses had to be maintained for
a minimum of 3 weeks.
Various toxicity rates were reported under both
treatment regimens. Of particular importance were:
haematological toxicity (which led to at least one
chemotherapy cycle reduced or delayed), nausea and
vomiting, infection, and mucositis.
TABLE 3 Incidence rates of total response to treatment and of toxicity in weekly versus 3-week chemotherapy in SCLC.11 NNT values are
shown for treatment success, unqualified success, and unmitigated failure

                                                                   NNT
                                      Weekly (%)    3-week (%)     Treatment    Unqualified    Unmitigated
                                      (n = 221)     (n = 217)      success      success        failure
Response to treatment                   82.4          81.1           80.2          -              -
Toxicity:
  Haematological (≥1 cycle reduced)     67.4          13.4            -          174.1           10.5
  Haematological (≥1 cycle delayed)     72.9          20.7            -          167.1           10.9
  Nausea and vomiting                   87.0          77.1            -           88.8           57.0
  Infection                             43.5          36.0            -           86.5           75.2
  Mucositis                             48.6          28.5            -          100.1           28.2

We calculate the NNT for the total response (combined complete and partial responses) of weekly versus
conventional (3-week) therapy. We also calculate the
adjusted NNT values when the response is balanced
against different toxicities (unqualified success), as
well as the NNT values corresponding to individuals
who failed to respond to weekly therapy but in whom
toxicity was induced by this therapy (unmitigated
failure) (Table 3). Thus, for example, an average of
80.2 patients need to be treated to observe one patient
who responds to the weekly treatment, and who would
not have responded to the conventional therapy.
However, 174.1 patients need to be treated on average,
to observe one patient (unqualified success) who benefits from the weekly treatment over the conventional
therapy, and in whom the weekly treatment does not
induce haematological toxicity severe enough to cause
a reduction in therapy of at least one cycle. At the same
time, after treating on average only 10.5 patients, one is
likely to observe one patient in whom the weekly
treatment fails to produce a response, while producing
in this patient a severe haematological toxicity (unmitigated failure). Cost-effectiveness analyses may be
carried out to compare these two therapeutic modalities,
following the methods outlined in Application 1.
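The adjusted NNT values of this application can be approximated from the published percentages. The sketch below assumes independence between response and toxicity, takes the NNT for treatment success (80.2) as reported, and uses rates rounded to one decimal, so the results agree with the tabulated values only to within rounding. The function name `adjusted_nnt` is introduced for illustration.

```python
NNT_SUCCESS = 80.2              # NNT for treatment success, as reported
P_FAILURE_WEEKLY = 1 - 0.824    # no response under weekly therapy

# Toxicity rates (weekly, 3-week) expressed as proportions.
toxicity = {
    "haematological, cycle reduced": (0.674, 0.134),
    "haematological, cycle delayed": (0.729, 0.207),
    "nausea and vomiting": (0.870, 0.771),
    "infection": (0.435, 0.360),
    "mucositis": (0.486, 0.285),
}

def adjusted_nnt(weekly, three_week):
    """NNT for unqualified success and for unmitigated failure."""
    dq = weekly - three_week                      # toxicity induced by weekly therapy
    unqualified = NNT_SUCCESS / (1 - dq)
    unmitigated = 1 / (P_FAILURE_WEEKLY * dq)
    return unqualified, unmitigated

results = {name: adjusted_nnt(*rates) for name, rates in toxicity.items()}
for name, (unq, unm) in results.items():
    print(f"{name}: unqualified {unq:.1f}, unmitigated {unm:.1f}")
```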
3. A recent study compared the prevalence rate of persistent right upper quadrant pain after open cholecystectomy to that following the laparoscopic operation.12
Open cholecystectomy (OC) had been regarded as the
'gold standard' in the treatment of cholelithiasis13 until
recent years, when laparoscopic cholecystectomy (LC)
became widely accepted as the new 'gold standard'.14
In this study, 155 patients had received OC and responded to self-assessment questionnaires after a mean
follow-up time of 32 (s.d. 23) months; 205 patients had
undergone LC and responded to the questionnaires after
a mean follow-up time of 15 (s.d. 7) months. Persistent
right upper quadrant pain was reported by 15 OC
patients (\(p_1 = 9.7\%\)) and by 7 LC patients (\(p_2 = 3.4\%\)).
A variety of post-operative complications were also
reported: wound infection, incisional hernia, bile leak,
haematoma, urinary tract infection, chest infection,
myocardial infarction, urinary retention, abdominal
pain, retained common bile duct stones, and pancreatitis. The total incidence of complications in the
OC group was 18 (\(q_1 = 11.6\%\)), and in the LC group 15
(\(q_2 = 7.3\%\)).
A direct overall comparison between the two groups
yields an NNT value for treatment success (i.e. no
persistent pain) for LC over OC of \(1/\Delta p = 16.0\), so that,
on average, 16 patients must be treated with LC, to
produce one patient in whom post-operative pain has
been prevented, and who would have suffered such
persistent pain under OC.
In this case, LC has both a higher success rate in
preventing pain and a lower incidence rate of complications. Calculations of the NNT for unqualified success
can be carried out for both procedures in comparison to
a hypothetical untreated control group, with 100% right
upper quadrant pain and 0% complication rate. Thus, a
patient undergoing OC has a probability of
\((1 - p_1) \cdot (1 - q_1) = 79.8\%\) of not suffering from persistent pain or
from complications, assuming independence between
these events. This yields an NNT value for unqualified
success under OC of 1/0.798 = 1.25. Similarly, the
probability that a patient receiving LC will experience
unqualified success is \((1 - p_2) \cdot (1 - q_2) = 89.5\%\), with
a corresponding NNT of 1/0.895 = 1.12. Hence, the
NNT for unqualified success with LC is lower by 0.13
than that with OC: on average, 13 fewer patients need
to be treated with LC than with OC to produce 100
unqualified successes.
The NNT values for unmitigated failure (persistent pain and
complications) also favour the LC procedure. Under
OC the NNT is given by
\[ 1/(p_1 \cdot q_1) = 89 \]
and under LC, by
\[ 1/(p_2 \cdot q_2) = 400, \]
showing a marked NNT advantage for LC of 400 - 89
= 311: on average, treatment of 311 additional patients
with LC than with OC must take place before one
would anticipate encountering one unmitigated failure.
The two treatment procedures may be further compared
using cost-effectiveness analysis, as in Application 1.
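The cholecystectomy figures can be recomputed from the reported counts. This is an illustrative sketch under the independence assumption used in the text; the variable names are assumptions.

```python
# Reported counts: persistent pain and complications in each group.
p1, q1 = 15 / 155, 18 / 155   # open cholecystectomy (OC)
p2, q2 = 7 / 205, 15 / 205    # laparoscopic cholecystectomy (LC)

nnt_success = 1 / (p1 - p2)                    # pain prevented by LC over OC

# Unqualified success: neither persistent pain nor complications.
nnt_unq_oc = 1 / ((1 - p1) * (1 - q1))
nnt_unq_lc = 1 / ((1 - p2) * (1 - q2))

# Unmitigated failure: persistent pain together with complications.
nnt_unm_oc = 1 / (p1 * q1)
nnt_unm_lc = 1 / (p2 * q2)

print(f"treatment success (LC over OC): {nnt_success:.1f}")
print(f"unqualified success: OC {nnt_unq_oc:.2f}, LC {nnt_unq_lc:.2f}")
print(f"unmitigated failure: OC {nnt_unm_oc:.0f}, LC {nnt_unm_lc:.0f}")
```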
DISCUSSION
A number of indices have been proposed for measuring
the efficacy of treatments evaluated on a quanta!
(all-or-nothing) scale. Among the best known are the
odds ratio, the relative risk, the absolute risk difference
and the relative risk difference.ll15 Of these, the odds
ratio is the most widely used. It is closely related to the
relative risk (when the incidence rates are low), and
exhibits sufficient regularity to permit its epidemiological use in prediction, interpolation and extrapolation. It
can be employed in retrospective studies as well and
corresponds to a natural mathematical (logistic) model.
Nonetheless, the odds ratio has been seriously criticized
for losing essential information on the actual level of
the incidence rates being compared.15-17 The absolute
risk difference (Berkson's index16) is a direct and simple measure of the difference between the incidence
rates, and provides the most practical magnitude of
treatment efficacy for public health applications.16,17
However, both the absolute and the relative risk
difference (Sheps' index18,19) lack regularity, and
demonstrate a degree of erratic behaviour.
The NNT is the reciprocal of Berkson's index. It
represents the expected value of a geometric variable
with success probability given by the absolute risk difference. It therefore shares the advantages of Berkson's
index (direct, practical use in public health applications), but also suffers the drawback of some lack of
regularity, manifested by relatively large standard
errors. This disadvantage can be overcome by deriving
NNT values from larger studies, or from meta-analytical syntheses of multiple studies.
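The interpretation of the NNT as the expected value of a geometric variable can be checked numerically. The sketch below is illustrative only: the function name and the truncation point `max_trials` are assumptions, and the success probability is the absolute risk difference implied by the NNT of 16 quoted for pain prevention (Δp = 1/16). It sums the series k · (1 - Δp)^(k-1) · Δp and compares the result with 1/Δp.

```python
def expected_geometric_trials(success_probability, max_trials=10_000):
    """Approximate the mean of a geometric 'waiting time' by a truncated series.

    The k-th term is k * P(first success occurs on trial k).
    """
    total = 0.0
    prob_no_success_yet = 1.0  # (1 - pi)^(k - 1), updated each iteration
    for k in range(1, max_trials + 1):
        total += k * prob_no_success_yet * success_probability
        prob_no_success_yet *= 1.0 - success_probability
    return total


# Absolute risk difference implied by the NNT of 16 quoted in the text.
delta_p = 1.0 / 16.0

series_mean = expected_geometric_trials(delta_p)
closed_form_nnt = 1.0 / delta_p

print(f"Series mean = {series_mean:.6f}, 1/delta_p = {closed_form_nnt:.6f}")
```

The truncated series agrees with the reciprocal to within floating-point error, since the omitted tail is negligible for this value of Δp.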
We have extended the range of applications of the
NNT by developing methods for calculating NNT
values and corresponding standard errors for combined
events, in which treatment success is balanced against
treatment-induced adverse effects. In particular, we
have calculated an adjusted NNT that measures the
average number of subjects who must be treated to
achieve one occurrence of an unqualified success (treatment success unaccompanied by treatment-induced
adverse effects), the average number needed to achieve
one occurrence of an unmitigated failure (treatment
failure accompanied by treatment-induced adverse
effects), as well as other outcome combinations. Our
calculations were carried out under two scenarios: the
conservative assumption that treatment success and
treatment-induced adverse effects occurred independently in the treated population, and, in the case of
treatments with a narrow 'therapeutic window', under
the assumption that a positive association is likely to
exist between treatment success and adverse effects,
leading to a larger value of the NNT for unqualified
success. Evidence for such an association may be
obtained from the study data by cross-classifying
subjects according to treatment success versus adverse
effects. In the presence of significant association, the
NNT method described in Figure 1c may be employed
to obtain more accurate estimates.
We have illustrated our methods with three examples, dealing with three different therapeutic contexts:
the prevention of strokes with warfarin in patients with
non-valvular atrial fibrillation, two modalities of chemotherapeutic treatment in small-cell lung cancer, and two
modalities of surgical intervention in the treatment
of cholelithiasis. These examples emphasized the
importance of calculating NNT values not only for
treatment success, but also for unqualified success, unmitigated failure, and for other outcome combinations.
These NNT values provide a useful, direct measure
of the clinical impact on patients of treatment versus
adverse effects. We have also shown by means of an
example how these extended NNT values for combined events
may be applied to cost-benefit (or, more generally, to
cost-effectiveness) analyses, where treatment costs may
be offset against treatment-derived benefits and harms.
In these extensions, the NNT retained its mathematical
interpretation as the expected value of a geometric 'waiting time' variable. These applications demonstrate the
value of the NNT as a useful tool in public health considerations and in decision-making problems concerning the identification of optimal treatment strategies.
REFERENCES
1. Laupacis A, Sackett D L, Roberts R S. An assessment of clinically useful measures of the consequences of treatment. N Engl J Med 1988; 318: 1728-33.
2. Sackett D L, Haynes R B, Guyatt G H, Tugwell P. Clinical Epidemiology: A Basic Science for Clinical Medicine, 2nd Edn. Boston: Little, Brown and Co., 1991, pp. 205-09.
3. Riegelman R, Schroth W S. Adjusting the Number Needed to Treat: incorporating adjustments for the utility and timing of benefits and harms. Med Decis Making 1993; 13: 247-52.
4. Persson U, Silverberg R, Lindgren B et al. Direct costs of stroke for a Swedish population. Int J Technol Assess Health Care 1990; 6: 125-37.
5. Gustafsson C, Asplund K, Britton M, Norrving B, Olsson B, Marke L-A. Cost effectiveness of primary stroke prevention in atrial fibrillation: Swedish national perspective. Br Med J 1992; 305: 1457-60.
6. Asplund K, Marke L-A, Terent A, Gustafsson C, Wester P O. Costs and gains in stroke prevention: European perspective. Cerebrovasc Dis 1993; 3(Suppl. 1): 34-42.
7. Feller W. An Introduction to Probability Theory and its Applications, 2nd Edn, Vol I. New York: J Wiley and Sons, 1965, pp. 155-57, 210, 252-53, 130.
8. Ekholm B P, Fox T L, Bolognese J A. Dose-response: relating doses and plasma levels to efficacy and adverse experiences. In: Berry D A (ed.). Statistical Methodology in the Pharmaceutical Sciences. New York: Marcel Dekker Inc., 1990, pp. 117-38.
9. Naglie I G, Detsky A S. Treatment of chronic nonvalvular atrial fibrillation in the elderly: a decision analysis. Med Decis Making 1992; 12: 239-49.
10. The Boston Area Anticoagulation Trial for Atrial Fibrillation Investigators. The effect of low dose warfarin on the risk of stroke in patients with non-rheumatic atrial fibrillation. N Engl J Med 1990; 323: 1505-11.
11. Souhami R L, Rudd R, de Elvira M-C R et al. Randomized trial comparing weekly versus 3-week chemotherapy in small-cell lung cancer: a Cancer Research Campaign trial. J Clin Oncol 1994; 12: 1806-13.
12. Stiff G, Rhodes M, Kelly A, Telford K, Armstrong C P, Rees B I. Long-term pain: less common after laparoscopic than open cholecystectomy. Br J Surg 1994; 81: 1368-70.
13. McSherry C K. Cholecystectomy: the gold standard. Am J Surg 1989; 158: 174-78.
14. Soper N J, Stockmann P T, Dunnegan D L, Ashley S W. Laparoscopic cholecystectomy. The new 'gold standard'? Arch Surg 1992; 127: 917-23.
15. Fleiss J L. Statistical Methods for Rates and Proportions, 2nd Edn. New York: J. Wiley and Sons, 1981, pp. 90-93, 13-15.
16. Berkson J. Smoking and lung cancer: Some observations on two recent reports. J Am Stat Assoc 1958; 53: 28-38.
17. Feinstein A R. Clinical biostatistics XX. The epidemiologic trohoc, the ablative risk ratio, and retrospective research. Clin Pharmacol Ther 1973; 14: 291-307.
18. Sheps M C. Shall we count the living or the dead? N Engl J Med 1958; 259: 1210-14.
19. Sheps M C. Marriage and mortality. Am J Public Health 1961; 51: 547-55.
20. Rao C R. Linear Statistical Inference and its Applications, 2nd Edn. New York: J. Wiley and Sons, 1973, Section 6a.2.
(Revised version received October 1995)
APPENDIX
1. To estimate the standard error (s.e.) of the NNT
associated with treatment success, viz.
$$\mathrm{NNT} = 1/\Delta p = 1/(p_1 - p_2),$$
one must take into account the s.e. associated with the
estimates p1 and p2. These are given by,15
$$\mathrm{s.e.}(p_1) = \sqrt{p_1(1 - p_1)/n_1}, \qquad \mathrm{s.e.}(p_2) = \sqrt{p_2(1 - p_2)/n_2},$$
where n1 and n2 are, respectively, the sample sizes from
the untreated and from the treated populations.
Then, by applying a Taylor series expansion (the
'delta method'20), one readily obtains, for sufficiently
large samples,
$$\mathrm{s.e.}(\mathrm{NNT}) \approx \frac{\sqrt{[\mathrm{s.e.}(p_1)]^2 + [\mathrm{s.e.}(p_2)]^2}}{(\Delta p)^2}.$$
For example, for the BAATAF trial data (Application
1), the total (non-annualized) incidence rates were
p1 = 6.25%, p2 = 0.94%, n1 = 208, n2 = 212, s.e.(p1) =
1.68%, s.e.(p2) = 0.66%, and hence, for the non-annualized NNT of treatment success (NNT = 18.8), the
standard error is s.e.(NNT) = 6.40. The annualized
NNT, calculated in our text (NNT = 41), is obtained by
dividing the total incidence rates by 2.2 (the average
duration of follow-up). Thus, the s.e. of the annualized
NNT is
s.e.(NNT) = 6.40 • 2.2 = 14.1.
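The calculation in item 1 can be retraced with a short sketch using the BAATAF figures quoted above. The function names are illustrative and not part of the original paper; the delta-method expression is the one given in item 1, and the annualized s.e. is obtained, as in the text, by multiplying by the average follow-up of 2.2 years.

```python
import math


def se_proportion(p, n):
    """Binomial standard error of an estimated proportion p from n subjects."""
    return math.sqrt(p * (1.0 - p) / n)


def nnt_with_se(p1, n1, p2, n2):
    """NNT = 1/(p1 - p2) and its delta-method standard error."""
    delta_p = p1 - p2
    se_delta_p = math.sqrt(se_proportion(p1, n1) ** 2 + se_proportion(p2, n2) ** 2)
    return 1.0 / delta_p, se_delta_p / delta_p ** 2


# Non-annualized BAATAF rates and sample sizes quoted in the text.
p1, n1 = 0.0625, 208  # untreated group
p2, n2 = 0.0094, 212  # treated group

nnt, se_nnt = nnt_with_se(p1, n1, p2, n2)

# Annualization as described in the text: scale by the mean follow-up.
follow_up_years = 2.2
se_nnt_annualized = se_nnt * follow_up_years

print(f"NNT = {nnt:.1f}, s.e. = {se_nnt:.2f}, "
      f"annualized s.e. = {se_nnt_annualized:.1f}")
```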
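2. To estimate the s.e. of the NNT associated with
unqualified treatment success, viz.
$$\mathrm{NNT} = 1/[\Delta p\,(1 - \Delta q)] = 1/\{(p_1 - p_2)(1 - [q_2 - q_1])\},$$
an analogous Taylor series expansion yields
$$\mathrm{s.e.}(\mathrm{NNT}) \approx \frac{1}{\Delta p\,(1 - \Delta q)} \sqrt{\frac{[\mathrm{s.e.}(p_1)]^2 + [\mathrm{s.e.}(p_2)]^2}{(\Delta p)^2} + \frac{[\mathrm{s.e.}(q_1)]^2 + [\mathrm{s.e.}(q_2)]^2}{(1 - \Delta q)^2}}.$$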
For the BAATAF example, the reported, non-annualized rates are q1 = 10.10%, q2 = 18.87%, s.e.(q1)
= 2.09%, s.e.(q2) = 2.69%, and, for the non-annualized
NNT corresponding to unqualified success, (NNT =
20.6), the s.e. is s.e.(NNT) = 7.04. The corresponding
s.e. for the annualized NNT (NNT = 43) is
s.e.(NNT) =14.7.
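As a check on item 2, the sketch below evaluates the NNT for unqualified success and its delta-method standard error from the non-annualized BAATAF rates quoted above. Function names are illustrative assumptions. With unrounded intermediate values the s.e. comes out at roughly 7.06, close to the 7.04 reported in the text; the small gap appears to be attributable to rounding of the component standard errors.

```python
import math


def se_proportion(p, n):
    """Binomial standard error of an estimated proportion p from n subjects."""
    return math.sqrt(p * (1.0 - p) / n)


def nnt_unqualified_success(p1, q1, n1, p2, q2, n2):
    """NNT = 1/[dp * (1 - dq)] with its delta-method standard error.

    dp = p1 - p2 is the absolute risk reduction; dq = q2 - q1 is the excess
    rate of treatment-induced adverse events.
    """
    delta_p = p1 - p2
    delta_q = q2 - q1
    nnt = 1.0 / (delta_p * (1.0 - delta_q))
    var_delta_p = se_proportion(p1, n1) ** 2 + se_proportion(p2, n2) ** 2
    var_delta_q = se_proportion(q1, n1) ** 2 + se_proportion(q2, n2) ** 2
    relative_se = math.sqrt(var_delta_p / delta_p ** 2
                            + var_delta_q / (1.0 - delta_q) ** 2)
    return nnt, nnt * relative_se


# Non-annualized BAATAF rates and sample sizes quoted in the text.
nnt_us, se_nnt_us = nnt_unqualified_success(
    p1=0.0625, q1=0.1010, n1=208,  # untreated group
    p2=0.0094, q2=0.1887, n2=212,  # treated group
)

print(f"NNT (unqualified success) = {nnt_us:.1f}, s.e. = {se_nnt_us:.2f}")
```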
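3. Finally, to estimate the s.e. of the NNT for unmitigated failure, viz.
$$\mathrm{NNT} = 1/(p_2 \cdot \Delta q),$$
another application of the Taylor series expansion
yields
$$\mathrm{s.e.}(\mathrm{NNT}) \approx \frac{1}{p_2 \cdot \Delta q} \sqrt{\frac{[\mathrm{s.e.}(p_2)]^2}{p_2^2} + \frac{[\mathrm{s.e.}(q_1)]^2 + [\mathrm{s.e.}(q_2)]^2}{(\Delta q)^2}}.$$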
In the BAATAF example, the s.e. of the non-annualized NNT for unmitigated failure (NNT = 1213)
is given by s.e.(NNT) = 973.3. The corresponding s.e.
for the annualized NNT estimate (NNT = 5849) is
s.e.(NNT) = 4710.8.
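The non-annualized figures in item 3 can be approximately reproduced with the sketch below, which applies the delta-method expression for unmitigated failure to the BAATAF rates quoted in this Appendix. Function names are illustrative assumptions. Using unrounded component standard errors gives an s.e. of roughly 976, slightly above the reported 973.3, which is what one obtains when the rounded values s.e.(p2) = 0.66%, s.e.(q1) = 2.09% and s.e.(q2) = 2.69% are used instead.

```python
import math


def se_proportion(p, n):
    """Binomial standard error of an estimated proportion p from n subjects."""
    return math.sqrt(p * (1.0 - p) / n)


def nnt_unmitigated_failure(q1, n1, p2, q2, n2):
    """NNT = 1/(p2 * dq) with its delta-method standard error.

    p2 is the treatment-failure rate in the treated group; dq = q2 - q1 is
    the excess rate of treatment-induced adverse events.
    """
    delta_q = q2 - q1
    nnt = 1.0 / (p2 * delta_q)
    var_p2 = se_proportion(p2, n2) ** 2
    var_delta_q = se_proportion(q1, n1) ** 2 + se_proportion(q2, n2) ** 2
    relative_se = math.sqrt(var_p2 / p2 ** 2 + var_delta_q / delta_q ** 2)
    return nnt, nnt * relative_se


# Non-annualized BAATAF rates and sample sizes quoted in the text.
nnt_uf, se_nnt_uf = nnt_unmitigated_failure(
    q1=0.1010, n1=208,             # untreated group
    p2=0.0094, q2=0.1887, n2=212,  # treated group
)

print(f"NNT (unmitigated failure) = {nnt_uf:.0f}, s.e. = {se_nnt_uf:.0f}")
```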