HOW CONSERVATIVE ARE POPULAR SAMPLE SIZE FORMULAS?
by
Lawrence L. Kupper and Kerry B. Hafner*
Department of Biostatistics
School of Public Health
7400 Rosenau Hall
University of North Carolina at Chapel Hill
Chapel Hill, NC 27599-7400
Institute of Statistics Mimeo Series No. 1839
November 1987
* Lawrence L. Kupper is Professor, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7400. Kerry B. Hafner is Ph.D. Student, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7400. This research was partially supported by N.I.E.H.S. training grant #2 T32 ES07018.
Abstract
One concern in the early stages of study planning and design is the minimum sample
size needed to provide statistically credible results.
This minimum sample size is usually
determined via the use of simple formulas, or equivalently, from tables.
However, the more
popular formulas involve large-sample approximations and hence may be too conservative.
This article provides empirical evidence indicating that this conservatism is drastic for certain
sample size formulas based on confidence interval width. Common sample size formulas that
consider statistical power are also discussed; these are shown to perform quite well, even for
small sample size situations.
1. INTRODUCTION
For a variety of experimental and observational studies, it is of interest to estimate the mean $\mu$ of a $N(\mu, \sigma^2)$ population from a random sample. In this situation, one may wish to specify the maximum $100(1-\alpha)\%$ confidence interval width so that $\mu$ is estimated to within a tolerance of $\delta$ units. The minimum sample size $n_m$ needed to achieve this precision is frequently recommended (e.g., see Armitage 1971, p. 185; Koopmans 1987, p. 239; Ott 1977, p. 240) to be the smallest positive integer satisfying the inequality

$$\sigma\,(n_m)^{-1/2}\, z_{1-\alpha/2} \le \delta \quad\text{or}\quad n_m \ge (\sigma/\delta)^2\, z^2_{1-\alpha/2}, \tag{1}$$

where $\mathrm{pr}(Z > z_{1-\alpha/2}) = \alpha/2$ when $Z \sim N(0, 1)$.
Alternatively, suppose it is of interest to conduct a one-tailed size $\alpha$ test of $H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$ for a random sample selected from a $N(\mu, \sigma^2)$ population. A popular inequality (Ott 1977, p. 241; Rosner 1986, p. 209) used to calculate the minimum sample size $n_m$ necessary to achieve a power of at least $(1-\beta)$ when $\mu = \mu_1\ (> \mu_0)$ is

$$n_m \ge (z_{1-\alpha} + z_{1-\beta})^2 / \theta^2, \tag{2}$$

where $\theta = (\mu_1 - \mu_0)/\sigma$.
More generally, consider taking random samples of the same size from $N(\mu_0, \sigma^2)$ and $N(\mu_1, \sigma^2)$ populations to make inferences about $(\mu_1 - \mu_0)$. Then, the inequality analogous to (1) is (Armitage 1971, p. 185; Ott 1977, p. 241)

$$\sigma\,(2/n_m)^{1/2}\, z_{1-\alpha/2} \le \delta \quad\text{or}\quad n_m \ge 2(\sigma/\delta)^2\, z^2_{1-\alpha/2}. \tag{3}$$

Similarly, the two-population analogue of expression (2) for testing $H_0: \mu_1 = \mu_0$ versus $H_1: \mu_1 > \mu_0$ is (e.g., see Fleiss 1986; Kleinbaum, Kupper, and Muller 1987, p. 31; Meinert 1986, p. 84; Pocock 1983, p. 128)

$$n_m \ge 2\,(z_{1-\alpha} + z_{1-\beta})^2 / \theta^2. \tag{4}$$
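As a concrete illustration of the four large-sample formulas, the following minimal sketch evaluates them using only the Python standard library. It assumes the usual normal-approximation forms: $n \ge (\sigma/\delta)^2 z^2_{1-\alpha/2}$ and $n \ge (z_{1-\alpha}+z_{1-\beta})^2/\theta^2$ for one sample, with both right-hand sides doubled for two samples. The function names are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist


def z(p):
    """100p-th percentile of the standard normal distribution."""
    return NormalDist().inv_cdf(p)


def n_ci_one_sample(sigma, delta, alpha):
    """Formula (1): smallest n with sigma * n**-0.5 * z_{1-alpha/2} <= delta."""
    return math.ceil((sigma / delta) ** 2 * z(1 - alpha / 2) ** 2)


def n_power_one_sample(theta, alpha, beta):
    """Formula (2): one-tailed size-alpha test, power >= 1 - beta at shift theta."""
    return math.ceil((z(1 - alpha) + z(1 - beta)) ** 2 / theta ** 2)


def n_ci_two_sample(sigma, delta, alpha):
    """Formula (3): per-group n for a difference of two means."""
    return math.ceil(2 * (sigma / delta) ** 2 * z(1 - alpha / 2) ** 2)


def n_power_two_sample(theta, alpha, beta):
    """Formula (4): per-group n for the two-sample one-tailed test."""
    return math.ceil(2 * (z(1 - alpha) + z(1 - beta)) ** 2 / theta ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: sigma/delta = 2, theta = 0.5, alpha = 0.05, beta = 0.20
    print(n_ci_one_sample(1.0, 0.5, 0.05), n_power_one_sample(0.5, 0.05, 0.20))
    print(n_ci_two_sample(1.0, 0.5, 0.05), n_power_two_sample(0.5, 0.05, 0.20))
```

Note that none of these functions involves the sample variance or a t percentile; that omission is exactly the large-sample approximation whose consequences are examined in the sections that follow.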
During the planning phases of various types of research studies, expressions (1)-(4) are
used by both statisticians and non-statisticians to provide guidelines for the numbers of
experimental units to be sampled.
For example, consider a randomized clinical trial designed
to measure the efficacy of a new antihypertensive drug. Formulas (3) and (4), and analogous
ones for proportions, are often used to help decide on the number of subjects to be allocated to
the treatment and control groups (Freiman, Chalmers, Smith, and Kuebler 1978; McHugh and
Le 1984).
Most users of expressions (1)-(4) probably appreciate that these inequalities involve large-sample approximations. Thus, their use in small-sample situations may lead to an underestimation of the sample sizes required to achieve specified inference-making goals. To our knowledge, no published literature seems to address the magnitude or the potential seriousness of this conservatism. The purpose of this paper is to quantify this sample size underestimation phenomenon and to assess numerically whether it should be a cause for concern. It is shown that inequalities (2) and (4) perform amazingly well even for very small sample sizes, while inequalities (1) and (3) behave so poorly in all instances that their future use should be strongly discouraged. Implications for other situations will also be discussed briefly.
2. ONE-SAMPLE METHODOLOGY
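Let $Y_1, Y_2, \ldots, Y_n$ constitute a random sample of size $n$ from a $N(\mu, \sigma^2)$ population, and let

$$\bar{Y} = n^{-1}\sum_{i=1}^{n} Y_i \quad\text{and}\quad S^2 = (n-1)^{-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2.$$

For a one-tailed size $\alpha$ t-test of $H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$, the power to reject $H_0$ in favor of $H_1$ when $\mu = \mu_1 > \mu_0$ is

$$\mathrm{pr}\{\, n^{1/2}(\bar{Y} - \mu_0)/S > t_{n-1,1-\alpha} \mid \mu = \mu_1 \,\},$$

where $t_{n-1,1-\alpha}$ is the $100(1-\alpha)$-th percentile of the central $t_{n-1}$ distribution; equivalently, we have

$$\mathrm{pr}\{\, T'_{n-1}(\sqrt{n}\,\theta) > t_{n-1,1-\alpha} \,\}, \tag{5}$$

where $T'_{n-1}(\sqrt{n}\,\theta)$ has a non-central t distribution with $(n-1)$ degrees of freedom and noncentrality parameter $\sqrt{n}\,\theta = \sqrt{n}\,(\mu_1 - \mu_0)/\sigma$.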
For specified values of $\alpha$, $\beta$, and $\theta$, expression (5) can be used (e.g., see Guenther 1973) to find the minimum sample size $n_m^*$ needed to achieve a power of at least $(1-\beta)$. It is interesting to note that the actual power attained with the sample size $n_m$ computed using (2) is generally quite close in value to the desired power $(1-\beta)$. A noticeable power loss (5% or more) only occurs for small values of $n_m$ (roughly, values less than 20). As expected, such a loss increases with decreasing $\alpha$. A general rule is to increase any sample size $n_m$ obtained via inequality (2) by two or three to achieve approximately the desired power $(1-\beta)$.
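The power comparison described above can be checked by simulation. The sketch below is a Monte Carlo stand-in for the exact non-central t calculation, not the method used in the paper. It assumes $\mu_0 = 0$ and $\sigma = 1$ (no loss of generality), takes the t critical values from standard tables, and uses a hypothetical function name and illustrative inputs.

```python
import math
import random


def simulated_power(n, theta, t_crit, reps=10000, seed=1):
    """Monte Carlo estimate of pr{T > t_crit} for the one-sample t statistic
    when the standardized shift (mu1 - mu0)/sigma equals theta."""
    rng = random.Random(seed)
    root_n = math.sqrt(n)
    rejections = 0
    for _ in range(reps):
        # mu0 = 0 and sigma = 1, so the true mean under H1 is theta
        y = [rng.gauss(theta, 1.0) for _ in range(n)]
        mean = sum(y) / n
        s = math.sqrt(sum((v - mean) ** 2 for v in y) / (n - 1))
        if root_n * mean / s > t_crit:
            rejections += 1
    return rejections / reps


# The large-sample one-sample power formula with alpha = 0.05, beta = 0.20,
# theta = 0.5 gives n_m = 25. 1.7109 and 1.7033 are the tabulated 95th
# percentiles of t with 24 and 27 degrees of freedom, respectively.
power_nm = simulated_power(25, 0.5, 1.7109)
power_plus3 = simulated_power(28, 0.5, 1.7033)

if __name__ == "__main__":
    print(power_nm, power_plus3)
```

With these inputs the estimate at $n_m = 25$ should fall slightly short of the nominal 0.80, and adding three observations should recover it, in line with the rule of thumb stated above.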
In contrast to inequality (2), the use of expression (1) always leads to a serious underestimation of the required sample size. This surprising conservatism can be illustrated by appealing to an overlooked, but nevertheless important, result due to Guenther (1965). Under the same assumptions which led to (5), the appropriate $100(1-\alpha)\%$ confidence interval for $\mu$ is

$$\bar{Y} \pm t_{n-1,1-\alpha/2}\, S\, n^{-1/2}.$$

Following Guenther, we define $n_m^*$ to be the smallest sample size such that

$$\mathrm{pr}\{\, S\,(n_m^*)^{-1/2}\, t_{n_m^*-1,1-\alpha/2} \le \delta \,\} \ge (1-\gamma). \tag{6}$$
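In contrast to expression (1), expression (6) accounts for the stochastic nature of the random variable $S^2$ via the tolerance probability $(1-\gamma)$. Expression (6) is easily shown to be equivalent to the probability statement

$$\mathrm{pr}\{\, \chi^2_{n_m^*-1} \le n_m^*(n_m^*-1)\,\delta^2 / (\sigma^2 F_{1,\,n_m^*-1,\,1-\alpha}) \,\} \ge (1-\gamma),$$

from which it follows that $n_m^*$ is the smallest positive integer satisfying the inequality

$$n_m^*(n_m^*-1) \ge (\sigma/\delta)^2\, \chi^2_{n_m^*-1,\,1-\gamma}\, F_{1,\,n_m^*-1,\,1-\alpha}.$$

Since $(\sigma/\delta)^2 = n_m/\chi^2_{1,1-\alpha}$ from (1), the expression relating $n_m$ and $n_m^*$ is

$$n_m^*(n_m^*-1)/n_m \ge \chi^2_{n_m^*-1,\,1-\gamma}\, F_{1,\,n_m^*-1,\,1-\alpha} \,/\, \chi^2_{1,1-\alpha}. \tag{7}$$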
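The step from (6) to (7) is stated without proof. A short sketch of it follows, using only two standard facts: $(n-1)S^2/\sigma^2$ has a central chi-square distribution with $(n-1)$ degrees of freedom, and $t^2_{n-1,1-\alpha/2} = F_{1,n-1,1-\alpha}$. Here $n$ is written for $n_m^*$ to lighten the notation.

```latex
\begin{align*}
\mathrm{pr}\{ S\, n^{-1/2}\, t_{n-1,1-\alpha/2} \le \delta \}
  &= \mathrm{pr}\{ S^2 \le n\delta^2 / F_{1,n-1,1-\alpha} \} \\
  &= \mathrm{pr}\{ \chi^2_{n-1} \le n(n-1)\delta^2 / (\sigma^2 F_{1,n-1,1-\alpha}) \} \;\ge\; 1-\gamma \\
\iff\quad n(n-1)\delta^2 / (\sigma^2 F_{1,n-1,1-\alpha}) &\ge \chi^2_{n-1,1-\gamma} \\
\iff\quad n(n-1) &\ge (\sigma/\delta)^2\, \chi^2_{n-1,1-\gamma}\, F_{1,n-1,1-\alpha}.
\end{align*}
```

Substituting $(\sigma/\delta)^2 = n_m/\chi^2_{1,1-\alpha}$, which treats the large-sample formula as an equality and uses $z^2_{1-\alpha/2} = \chi^2_{1,1-\alpha}$, then removes $\sigma$ and $\delta$ and leaves a relation between the two sample sizes alone.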
Appendix A provides the corresponding values of $n_m^*$ which satisfy inequality (7) for various combinations of values for $\alpha$, $(1-\gamma)$, and $n_m$. The entries in Appendix A clearly illustrate the inappropriateness of inequality (1) for sample size determination in this context. As an example, if $n_m = 40$ based on the use of (1), the exact sample size $n_m^*$ needed to insure reasonably precise [say, $(1-\gamma) = 0.90$] estimation of $\mu$ with a 95% confidence interval ($\alpha = 0.05$) is $n_m^* = 53$. It is quite disturbing that the actual tolerance probability $(1-\gamma')$ based on using a sample size of 40 is only 0.42, which is less than half of the desired value! Such large discrepancies should convince users that their $n_m$ values so determined from the popular formula (1) should be corrected via Appendix A.
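For readers who want values outside the tabulated grid, the tolerance-probability calculation can be sketched numerically. The code below is an illustrative standard-library implementation, not the authors' program. It assumes the equivalent chi-square form of the tolerance statement, $\mathrm{pr}\{\chi^2_{\nu} \le \nu\,(n/n_m)\,(z_{1-\alpha/2}/t_{\nu,1-\alpha/2})^2\}$, with $\nu = n-1$ for one sample and $\nu = 2(n-1)$ for two samples of size $n$. All function names and the `groups` parameter are hypothetical.

```python
import math
from statistics import NormalDist


def chi2_cdf(x, df):
    """pr(chi-square_df <= x), via the power series for the lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return total * math.exp(a * math.log(h) - h - math.lgamma(a))


def t_quantile(p, df, steps=400):
    """Quantile (p > 0.5) of the central t distribution. With t = sqrt(df)*tan(phi)
    the CDF is 0.5 + k * integral_0^phi cos(u)**(df-1) du; that integral is
    evaluated by Simpson's rule and the quantile is found by bisection on phi."""
    k = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)

    def cdf(phi):
        w = phi / steps
        s = 1.0 + math.cos(phi) ** (df - 1)
        for i in range(1, steps):
            s += (4 if i % 2 else 2) * math.cos(i * w) ** (df - 1)
        return 0.5 + k * s * w / 3

    lo, hi = 0.0, math.pi / 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return math.sqrt(df) * math.tan((lo + hi) / 2)


def tolerance_prob(n, n_m, alpha, groups=1):
    """pr{half-width <= delta} with n units per group, where delta is the
    tolerance for which the z-based formula returns exactly n_m."""
    df = groups * (n - 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    t = t_quantile(1 - alpha / 2, df)
    return chi2_cdf(df * (n / n_m) * (z / t) ** 2, df)


def exact_n(n_m, alpha, tol, groups=1):
    """Smallest n (per group) whose tolerance probability is at least tol."""
    n = 2
    while tolerance_prob(n, n_m, alpha, groups) < tol:
        n += 1
    return n


if __name__ == "__main__":
    print(exact_n(40, 0.05, 0.90), round(tolerance_prob(40, 40, 0.05), 2))
```

The call in the `__main__` block corresponds to the worked example in the text ($n_m = 40$, $\alpha = 0.05$, $(1-\gamma) = 0.90$), for which the paper reports $n_m^* = 53$ and an attained tolerance probability of 0.42.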
3. TWO-SAMPLE METHODOLOGY
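For the two-sample situation, expressions analogous to (5) and (7) can be similarly developed. For $i = 0$ and 1, let $Y_{i1}, Y_{i2}, \ldots, Y_{in}$ constitute a random sample of size $n$ from a $N(\mu_i, \sigma^2)$ population, with sample mean $\bar{Y}_i$ and sample variance $S_i^2$, and let $S_p^2 = (S_0^2 + S_1^2)/2$ denote the pooled sample variance. For a one-tailed size $\alpha$ test of $H_0: \mu_1 = \mu_0$ versus $H_1: \mu_1 > \mu_0$, the power to reject $H_0$ in favor of $H_1$ when $(\mu_1 - \mu_0)$ has a specified positive value is

$$\mathrm{pr}\{\, T'_{2(n-1)}[(n/2)^{1/2}\theta] > t_{2(n-1),\,1-\alpha} \mid (\mu_1 - \mu_0) = \sigma\theta > 0 \,\}, \tag{8}$$

where

$$T'_{2(n-1)}[(n/2)^{1/2}\theta] = (\bar{Y}_1 - \bar{Y}_0) / [S_p\,(2/n)^{1/2}]$$

has a non-central t-distribution with $2(n-1)$ degrees of freedom and noncentrality parameter $(n/2)^{1/2}\theta$.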
A comparison between $n_m$ values based on (4) and corresponding $n_m^*$ values based on (8) shows that the degree of agreement here is at least as good as that seen in the one-sample case. Moreover, exact equality often holds for sample sizes as small as 10. The excellence of the approximation (4) in the small-sample situation has also been noticed by Fleiss (1986, p. 369).
Under the stated assumptions, the appropriate $100(1-\alpha)\%$ confidence interval for $(\mu_1 - \mu_0)$ is

$$(\bar{Y}_1 - \bar{Y}_0) \pm t_{2(n-1),\,1-\alpha/2}\, S_p\,(2/n)^{1/2},$$

and hence the two-sample analogue of (6) is

$$\mathrm{pr}\{\, S_p\,(2/n_m^*)^{1/2}\, t_{2(n_m^*-1),\,1-\alpha/2} \le \delta \,\} \ge (1-\gamma).$$

Using arguments identical to those leading to (7), we find the inequality relating $n_m^*$ in the above expression to $n_m$ in expression (3) to be

$$2\,n_m^*(n_m^*-1)/n_m \ge \chi^2_{2(n_m^*-1),\,1-\gamma}\, F_{1,\,2(n_m^*-1),\,1-\alpha} \,/\, \chi^2_{1,1-\alpha}. \tag{9}$$
The body of Appendix B contains $n_m^*$ values calculated via inequality (9) for specified combinations of values for $\alpha$, $(1-\gamma)$, and $n_m$; the entries clearly document the inappropriateness of inequality (3) for sample size determination in this situation. We strongly recommend that users of expression (3) take note of its extreme conservatism, and that they use Appendix B to correct any sample size estimates based on the use of inequality (3).
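The two-sample tolerance probabilities can also be checked by direct simulation. The sketch below is a Monte Carlo illustration under assumed settings, not the exact calculation behind Appendix B. It sets $\sigma = 1$, defines $\delta$ as the tolerance for which the z-based two-sample formula returns exactly $n_m$, and takes the t percentiles from standard tables. The function name is hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulated_tolerance(n, n_m, alpha, t_crit, reps=20000, seed=1):
    """Fraction of simulated two-sample studies (n per group) whose
    confidence-interval half-width t_crit * S_p * sqrt(2/n) is at most delta."""
    rng = random.Random(seed)
    # delta solves sigma * sqrt(2/n_m) * z_{1-alpha/2} = delta with sigma = 1
    delta = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(2 / n_m)
    hits = 0
    for _ in range(reps):
        ss = 0.0
        for _group in range(2):
            y = [rng.gauss(0.0, 1.0) for _ in range(n)]
            m = sum(y) / n
            ss += sum((v - m) ** 2 for v in y)
        s_pooled = math.sqrt(ss / (2 * (n - 1)))
        if t_crit * s_pooled * math.sqrt(2 / n) <= delta:
            hits += 1
    return hits / reps


# alpha = 0.10 and n_m = 5: 1.8595 is the tabulated 95th percentile of t with 8 df
at_nm = simulated_tolerance(5, 5, 0.10, 1.8595)
# n = 7 per group: 1.7823 is the tabulated 95th percentile of t with 12 df
at_7 = simulated_tolerance(7, 5, 0.10, 1.7823)

if __name__ == "__main__":
    print(at_nm, at_7)
```

For this case Appendix B lists an attained tolerance probability of 0.38 at $n_m = 5$ and a corrected size of 7 for $(1-\gamma) = 0.70$; the simulated fractions should agree with those entries up to Monte Carlo error.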
4. DISCUSSION
As stated earlier, the specific goal of this paper was to determine situations, if any, where the popular sample size formulas (1)-(4) could be misleading. We have found that inequalities (2) and (4), which are typically used to calculate the smallest sample size needed to achieve a specified minimum power, were quite reliable in all instances considered. In contrast, expressions (1) and (3), which are commonly employed to estimate the minimum sample size required to obtain a $100(1-\alpha)\%$ confidence interval with a specified maximum width, were seen to be uniformly inappropriate.
The fact that inequalities (1) and (3) perform so poorly is disturbing, especially since
the use of such confidence interval-based sample size estimation formulas for the design of both
randomized clinical trials and observational epidemiologic studies is becoming quite common.
The reason for the increased popularity of formulas like these, relative to ones like (2) and (4),
is that the goal of such research efforts is more often to estimate as accurately as possible the magnitude of the effect of interest, rather than to decide whether or not a finding is statistically significant (see Rothman 1986). Our results further suggest that the use of popular sample size formulas for estimating other parameters (e.g., differences in proportions, odds ratios, etc.) to within specified tolerances may also be providing sample size estimates which are much too low.
When using confidence interval-based sample size estimation formulas for study design,
what steps can be taken to correct for their anticipated conservatism? Appendices A and B can, of course, be used to adjust sample size estimates obtained via inequalities (1) and (3). For well-known confidence interval-based sample size formulas where the parameter of interest is a proportion $\pi$ or a difference in proportions $(\pi_1 - \pi_0)$, we recommend that, when economically feasible, researchers use that maximum sample size computed assuming that the population proportions are equal to 1/2. Not only will this simple approach avoid the type of conservatism considered in this paper, it will also help to provide additional subjects so that subsequent more complicated data analyses may have reasonably good statistical properties. This is an important consideration since users of sample size estimation formulas like (1)-(4), and analogous ones involving proportions, often seem to ignore the fact that the sample sizes so computed are appropriate only for very simple statistical analyses. It is invariably the case that much more complicated statistical methods (e.g., regression procedures) are employed at the data analysis stage; and, the sample sizes required to insure that such multivariable procedures have adequate precision and/or power will generally be considerably larger than those based on formulas like (1)-(4).
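The recommendation to plan with proportions equal to 1/2 can be illustrated with a small calculation. The sketch assumes the familiar large-sample (Wald) formula for a single proportion, $n \ge z^2_{1-\alpha/2}\,\pi(1-\pi)/\delta^2$, which is not written out in this paper; the function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_proportion(pi, delta, alpha):
    """Smallest n with z_{1-alpha/2} * sqrt(pi * (1 - pi) / n) <= delta
    (assumed large-sample Wald form for a single proportion)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z * z * pi * (1 - pi) / delta ** 2)


# Planning value pi = 0.2 versus the worst case pi = 0.5,
# for a tolerance of 0.05 and a 95% confidence interval.
guess = n_for_proportion(0.2, 0.05, 0.05)
worst_case = n_for_proportion(0.5, 0.05, 0.05)

if __name__ == "__main__":
    print(guess, worst_case)
```

Because $\pi(1-\pi)$ is largest at $\pi = 1/2$, the worst-case value is never smaller than the value from any other planning proportion, which is the source of the extra subjects mentioned above.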
Based on the discussion above, we wish to stress to users of standard sample size
estimation formulas that, for all of the reasons cited above, the sample sizes so obtained will
generally be inadequate for the desired analysis goals.
Even so, researchers will continue to employ formulas like (1)-(4) because of their simplicity and popularity. We hope that this paper will help to make them aware of some of the problems associated with the use of these formulas.
APPENDIX A.
One-sample tolerance probability comparisons between $n_m^*$ and $n_m$; $(1-\gamma)$ is the tolerance probability using $n_m^*$, and $(1-\gamma')$ is the tolerance probability using $n_m$.

$\alpha = 0.10$

                        $n_m^*$ for $(1-\gamma)$ =
$n_m$   $(1-\gamma')$   0.70   0.80   0.90   0.95   0.99
    5       0.33           8      9     11     12     13
   10       0.39          14     15     17     19     21
   15       0.41          20     21     23     25     28
   20       0.42          25     27     29     32     35
   25       0.43          30     33     35     38     42
   30       0.44          36     38     41     44     49
   35       0.44          41     44     47     50     55
   40       0.45          46     49     53     56     61
   45       0.45          52     55     58     62     68
   50       0.45          57     60     64     67     74
   55       0.45          62     65     70     73     80
   60       0.46          67     71     75     79     86
   65       0.46          73     76     81     85     92
   70       0.46          78     81     86     90     98
   75       0.46          83     87     92     96    104
   80       0.46          88     92     97    102    110
   85       0.46          93     97    103    107    116
   90       0.46          99    103    108    113    122
   95       0.46         104    108    114    119    127
  100       0.47         109    113    119    124    133
APPENDIX A (continued)

$\alpha = 0.05$

                        $n_m^*$ for $(1-\gamma)$ =
$n_m$   $(1-\gamma')$   0.70   0.80   0.90   0.95   0.99
    5       0.26           9     10     11     12     14
   10       0.34          15     16     18     19     22
   15       0.37          20     22     24     26     29
   20       0.39          26     27     30     32     36
   25       0.40          31     33     36     38     43
   30       0.41          36     39     42     44     49
   35       0.42          42     44     48     50     55
   40       0.42          47     50     53     56     62
   45       0.43          52     55     59     62     68
   50       0.43          57     60     65     68     74
   55       0.43          63     66     70     74     80
   60       0.44          68     71     76     80     86
   65       0.44          73     77     81     85     92
   70       0.44          78     82     87     91     98
   75       0.44          84     87     92     97    104
   80       0.44          89     93     98    102    110
   85       0.45          94     98    103    108    116
   90       0.45          99    103    109    114    122
   95       0.45         104    109    114    119    128
  100       0.45         110    114    120    125    134
APPENDIX A (continued)

$\alpha = 0.01$

                        $n_m^*$ for $(1-\gamma)$ =
$n_m$   $(1-\gamma')$   0.70   0.80   0.90   0.95   0.99
    5       0.13          10     11     12     13     15
   10       0.23          16     17     19     20     23
   15       0.27          21     23     25     27     30
   20       0.30          27     29     31     33     37
   25       0.32          32     34     37     39     44
   30       0.34          38     40     43     46     50
   35       0.35          43     45     49     52     57
   40       0.36          48     51     55     58     63
   45       0.37          53     56     60     63     69
   50       0.38          59     62     66     69     75
   55       0.38          64     67     72     75     82
   60       0.39          69     73     77     81     88
   65       0.39          74     78     83     87     94
   70       0.39          80     83     88     92    100
   75       0.40          85     89     94     98    106
   80       0.40          90     94     99    104    112
   85       0.40          95     99    105    109    117
   90       0.41         101    105    110    115    123
   95       0.41         106    110    116    120    129
  100       0.41         111    115    121    126    135
APPENDIX B.
Two-sample tolerance probability comparisons between $n_m^*$ and $n_m$; $(1-\gamma)$ is the tolerance probability using $n_m^*$, and $(1-\gamma')$ is the tolerance probability using $n_m$.

$\alpha = 0.10$

                        $n_m^*$ for $(1-\gamma)$ =
$n_m$   $(1-\gamma')$   0.70   0.80   0.90   0.95   0.99
    5       0.38           7      8      9     10     11
   10       0.42          13     14     15     16     18
   15       0.44          18     19     21     22     25
   20       0.45          23     25     27     28     31
   25       0.45          29     30     32     34     37
   30       0.46          34     36     38     40     43
   35       0.46          39     41     44     46     49
   40       0.46          44     46     49     51     55
   45       0.46          50     52     55     57     61
   50       0.47          55     57     60     62     67
   55       0.47          60     62     65     68     73
   60       0.47          65     68     71     74     78
   65       0.47          70     73     76     79     84
   70       0.47          75     78     82     85     90
   75       0.47          81     83     87     90     96
   80       0.47          86     89     92     95    101
   85       0.47          91     94     98    101    107
   90       0.48          96     99    103    106    112
   95       0.48         101    104    108    112    118
  100       0.48         106    109    114    117    124
APPENDIX B (continued)

$\alpha = 0.01$

                        $n_m^*$ for $(1-\gamma)$ =
$n_m$   $(1-\gamma')$   0.70   0.80   0.90   0.95   0.99
    5       0.21           8      9     10     11     12
   10       0.30          14     15     16     17     19
   15       0.34          19     20     22     23     25
   20       0.36          24     26     28     29     32
   25       0.37          30     31     33     35     38
   30       0.38          35     37     39     41     44
   35       0.39          40     42     44     46     50
   40       0.40          45     47     50     52     56
   45       0.41          51     53     55     58     62
   50       0.41          56     58     61     63     68
   55       0.42          61     63     66     69     74
   60       0.42          66     68     72     74     79
   65       0.42          71     74     77     80     85
   70       0.42          76     79     83     85     91
   75       0.43          82     84     88     91     96
   80       0.43          87     89     93     96    102
   85       0.43          92     95     99    102    108
   90       0.43          97    100    104    107    113
   95       0.44         102    105    109    113    119
  100       0.44         107    110    115    118    125
REFERENCES
Armitage, P. (1971), Statistical Methods in Medical Research, New York: John Wiley.
Fleiss, J. L. (1986), The Design and Analysis of Clinical Experiments, New York: John Wiley.
Freiman, J. A., Chalmers, T. C., Smith, H., and Kuebler, R. R. (1978), "The Importance of Beta, the Type II Error and Sample Size in the Design and Interpretation of the Randomized Control Trial," The New England Journal of Medicine, 299, 690-694.
Guenther, W. C. (1965), Concepts of Statistical Inference, New York: McGraw-Hill.
Guenther, W. C. (1973), "Determination of Sample Size for Tests Concerning Means and Variances of Normal Distributions," Statistica Neerlandica, 27, 103-113.
Kleinbaum, D. G., Kupper, L. L., and Muller, K. E. (1987), Applied Regression Analysis and
Other Multivariable Methods (Second Edition), Boston: PWS-Kent.
Koopmans, L. H. (1987), Introduction to Contemporary Statistical Methods (Second Edition),
Boston: Duxbury.
McHugh, R. B., and Le, C. T. (1984), "Confidence Estimation and the Size of a Clinical Trial," Controlled Clinical Trials, 5, 157-163.
Meinert, C. L. (1986), Clinical Trials: Design, Conduct, and Analysis, New York: Oxford.
Ott, L. (1977), An Introduction to Statistical Methods and Data Analysis, Boston: Duxbury.
Pocock, S. J. (1983), Clinical Trials: A Practical Approach, New York: John Wiley.
Rosner, B. (1986), Fundamentals of Biostatistics (Second Edition), Boston: Duxbury.
Rothman, K. J. (1986), Modern Epidemiology, Boston: Little-Brown.