ON THE DISTRIBUTION OF THE NUMBER OF
SUCCESSES IN INDEPENDENT TRIALS

by

Wassily Hoeffding
University of North Carolina

This research was supported by the United States Air Force, through
the Office of Scientific Research of the Air Research and Development
Command.

Institute of Statistics
Mimeograph Series No. 128
April, 1955
ON THE DISTRIBUTION OF THE NUMBER OF
SUCCESSES IN INDEPENDENT TRIALS (1)

by

Wassily Hoeffding
University of North Carolina
1. Introduction and Summary.

Let S be the number of successes in n independent trials, and let p_j denote the probability of success in the j-th trial, j = 1, 2, ..., n (Poisson trials). We consider the problem of finding the maximum and the minimum of E g(S), the expected value of a given real-valued function of S, when ES = np is fixed.
It is well known that the maximum of the variance of S is attained when p_1 = p_2 = ... = p_n = p. This can be interpreted as showing that the variability in the number of successes is highest when the successes are equally probable (Bernoulli trials). This interpretation is further supported by the following two theorems proved in this paper.
If b and c are two integers, 0 ≤ b ≤ np ≤ c ≤ n, the probability P(b ≤ S ≤ c) attains its minimum if and only if p_1 = p_2 = ... = p_n = p (Theorem 5, a corollary of Theorem 4, which gives the maximum and the minimum of P(S ≤ c)). If g is a strictly convex function, E g(S) attains its maximum if and only if p_1 = p_2 = ... = p_n = p (Theorem 3).
These results are obtained with the help of two theorems concerning the extrema of the expected value of an arbitrary function g(S) under the condition ES = np. Theorem 1 gives necessary conditions for the maximum and the minimum of E g(S). Theorem 2 gives a partial characterization of the set of points at which an extremum is attained. Corollary 2.1 states that the maximum and the minimum are attained when p_1, p_2, ..., p_n take on at most three different values, only one of which is distinct from 0 and 1.
Applications of Theorems 3 and 5 to problems of estimation and testing are pointed out in section 5.

(1) This research was supported in part by the United States Air Force, through the Office of Scientific Research of the Air Research and Development Command.

2. The extrema of the expected value of an arbitrary function of S.
The expected value of a function g(S) is

(1)    f(p) = E g(S) = \sum_{k=0}^{n} g(k) A_{nk}(p),

where p = (p_1, p_2, ..., p_n), and A_{nk}(p), the probability of S = k, is given by

       A_{nk}(p) = \sum_{i_1 + ... + i_n = k; i_j = 0, 1} \prod_{j=1}^{n} p_j^{i_j} (1 - p_j)^{1 - i_j},    k = 0, 1, ..., n.

The function f(p) is symmetric in the components of p and linear in each component. We observe in passing that conversely any function of p with these two properties can be represented in the form (1).
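In modern terminology A_{nk}(p) is the Poisson-binomial distribution. Conditioning on the outcome of the last trial gives the recursion A_{nk} = (1 − p_n) A_{n-1,k} + p_n A_{n-1,k-1}, which is a convenient way to evaluate (1) numerically. A small sketch (Python; the function names are ours, not the paper's):

```python
def pmf(ps):
    """A_{nk}(p): distribution of the number of successes S in
    independent trials with success probabilities ps, built by
    conditioning on each successive trial."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)      # trial fails
            new[k + 1] += w * p          # trial succeeds
        dist = new
    return dist

def f(g, ps):
    """f(p) = E g(S), formula (1)."""
    return sum(g(k) * w for k, w in enumerate(pmf(ps)))

ps = [0.2, 0.5, 0.9]
print(pmf(ps))                 # A_{3k}, k = 0, 1, 2, 3
print(f(lambda k: k, ps))      # ES = p_1 + p_2 + p_3
```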
The problem to be considered is to find the maximum and the minimum of f(p) in the section D of the hyperplane

       p_1 + p_2 + ... + p_n = np    (0 < p < 1)

which is contained in the closed hypercube 0 ≤ p_j ≤ 1, j = 1, 2, ..., n.
We shall denote by p^{i_1, i_2, ..., i_m} the point in the (n − m)-dimensional space which is obtained from p by omitting the coordinates p_{i_1}, p_{i_2}, ..., p_{i_m}. Since f(p) is symmetric and linear, we can write

(2)    f(p) = f_{n-1,0}(p^j) + p_j f_{n-1,1}(p^j),    j = 1, 2, ..., n,

where the functions f_{n-1,0} and f_{n-1,1} are independent of the index j and symmetric and linear in the components of p^j.
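Relation (2) says that f is affine in each coordinate: f_{n-1,0}(p^j) is the value of f with p_j set to 0, and f_{n-1,1}(p^j) is the difference between the values at p_j = 1 and p_j = 0. A numerical check (Python; names are ours):

```python
from itertools import product

def f(g, ps):
    """f(p) = E g(S), computed directly from the defining sum (1)."""
    tot = 0.0
    for bits in product((0, 1), repeat=len(ps)):
        w = 1.0
        for p, b in zip(ps, bits):
            w *= p if b else 1.0 - p
        tot += w * g(sum(bits))
    return tot

g = lambda k: k ** 2
ps, j = [0.3, 0.6, 0.8], 1
f0 = f(g, ps[:j] + [0.0] + ps[j+1:])      # f_{n-1,0}(p^j)
f1 = f(g, ps[:j] + [1.0] + ps[j+1:])      # f_{n-1,0} + f_{n-1,1}
print(f(g, ps), f0 + ps[j] * (f1 - f0))   # the two sides of (2)
```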
In general we define the functions f_{n-k,i} by

(3)    f_{n-k,i}(p^{1,2,...,k}) = f_{n-k-1,i}(p^{1,2,...,k+1}) + p_{k+1} f_{n-k-1,i+1}(p^{1,2,...,k+1}),
           i = 0, 1, ..., k;    k = 0, 1, ..., n−1.
Applying (3) repeatedly, we obtain

(4)    f(p) = \sum_{i=0}^{m} e_{mi}(p_1, p_2, ..., p_m) f_{n-m,i}(p^{1,2,...,m}),    m = 1, 2, ..., n,

where e_{m0}, ..., e_{mm} are the symmetric sums

(5)    e_{mi}(p_1, p_2, ..., p_m) = \sum_{1 ≤ j_1 < j_2 < ... < j_i ≤ m} p_{j_1} p_{j_2} ... p_{j_i},    i = 0, 1, ..., m    (e_{m0} = 1).
If we write (0^u 1^v) for the point whose first u coordinates are 0 and the remaining v coordinates are 1, and let (p_1, p_2, ..., p_m) = (0^{m-h} 1^h), h = 0, 1, ..., m, we obtain from (4) a system of linear equations for f_{n-m,i}(p^{1,2,...,m}) whose solution is

(6)    f_{n-m,i}(p^{1,2,...,m}) = \sum_{h=0}^{i} (−1)^{i−h} \binom{i}{h} f(0^{m−h} 1^h, p_{m+1}, ..., p_n),    i = 0, 1, ..., m.

Theorem 1. Let a = (a_1, a_2, ..., a_n) be a point in D at which f(p) attains its maximum. Then for every two distinct indices i, j we have

(7)    f_{n-2,2}(a^{ij}) ≤ 0    if a_i ≠ a_j,

(8)    f_{n-2,2}(a^{ij}) = 0    if a_i ≠ a_j, 0 < a_i < 1, 0 < a_j < 1,

(9)    f_{n-2,2}(a^{ij}) ≥ 0    if 0 < a_i = a_j < 1.

The inequalities (7) and (9) are strict if the maximum is not attained at the points in D which differ from a only in that a_i and a_j are replaced by a_i + x and a_j − x with |x| positive and arbitrarily small.
Proof. Let b_x denote the point which is obtained from a if a_i and a_j are replaced by a_i + x and a_j − x. The point b_x is in D for all x in the interval I defined by 0 ≤ a_i + x ≤ 1, 0 ≤ a_j − x ≤ 1. By (4) we have

       f(b_x) = f_{n-2,0}(a^{ij}) + (a_i + a_j) f_{n-2,1}(a^{ij}) + (a_i + x)(a_j − x) f_{n-2,2}(a^{ij}).

Hence

(10)    f(b_x) − f(a) = x (a_j − a_i − x) f_{n-2,2}(a^{ij}).

Since f(a) is a maximum, the right side of (10) must be negative or zero for all x in I. We may assume that a_i ≤ a_j. If a_i < a_j, and x is positive and sufficiently small, then x is in I. Hence (7) must hold. If 0 < a_i < 1 and 0 < a_j < 1, then the point x = 0 is in the interior of I. Hence (8) and (9) must hold. If the maximum is not attained at b_x when x is in I and is different from and sufficiently close to zero, the inequalities (7) and (9) must be strict. The proof is complete.
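The key identity (10) is easy to verify numerically, computing f_{n-2,2}(a^{ij}) from (6) with m = 2 as a second difference of f. A sketch (Python; names are ours):

```python
from itertools import product

def f(g, ps):
    """f(p) = E g(S), from the defining sum (1)."""
    tot = 0.0
    for bits in product((0, 1), repeat=len(ps)):
        w = 1.0
        for p, b in zip(ps, bits):
            w *= p if b else 1.0 - p
        tot += w * g(sum(bits))
    return tot

g = lambda k: (k - 1.5) ** 2            # an arbitrary g
a = [0.3, 0.7, 0.2, 0.6]                # the point a; perturb the first two coordinates
rest = a[2:]                            # a^{ij}
# f_{n-2,2}(a^{ij}) from (6) with m = 2: f(0,0,.) - 2 f(0,1,.) + f(1,1,.)
f22 = f(g, [0, 0] + rest) - 2 * f(g, [0, 1] + rest) + f(g, [1, 1] + rest)
x = 0.05
b_x = [a[0] + x, a[1] - x] + rest       # move x from a_j to a_i
lhs = f(g, b_x) - f(g, a)
rhs = x * (a[1] - a[0] - x) * f22       # right side of (10)
print(lhs, rhs)
```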
The following explicit expressions for f_{n-2,2}(a^{ij}) will be useful in the applications of Theorem 1. It is easily seen from probability considerations that

       A_{nk}(0^{2−h} 1^h, p_3, ..., p_n) = A_{n-2,k-h}(p_3, ..., p_n),    h = 0, 1, 2.

Hence, from (6) and (1),

(11)    f_{n-2,2}(a^{ij}) = \sum_{k=0}^{n} g(k) { A_{n-2,k-2}(a^{ij}) − 2 A_{n-2,k-1}(a^{ij}) + A_{n-2,k}(a^{ij}) }.

Alternatively this can be written in the forms

(12)    f_{n-2,2}(a^{ij}) = \sum_{k=0}^{n-1} [ g(k+1) − g(k) ] [ A_{n-2,k-1}(a^{ij}) − A_{n-2,k}(a^{ij}) ]

and

(13)    f_{n-2,2}(a^{ij}) = \sum_{k=0}^{n-2} [ g(k+2) − 2 g(k+1) + g(k) ] A_{n-2,k}(a^{ij}).
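Identity (13) can be checked numerically: the left side is computed as the second difference f(0,0,·) − 2 f(0,1,·) + f(1,1,·) given by (6), and the right side from the probabilities A_{n-2,k}. A sketch (Python; names are ours):

```python
def pmf(ps):
    """A_{mk}: distribution of the number of successes among trials ps."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)
            new[k + 1] += w * p
        dist = new
    return dist

def f(g, ps):
    """f(p) = E g(S), formula (1)."""
    return sum(g(k) * w for k, w in enumerate(pmf(ps)))

g = lambda k: k ** 3
rest = [0.4, 0.8, 0.1]                  # a^{ij} for a point with n = 5
# Left side: f_{n-2,2}(a^{ij}) as the second difference given by (6).
lhs = f(g, [0, 0] + rest) - 2 * f(g, [0, 1] + rest) + f(g, [1, 1] + rest)
# Right side of (13).
A = pmf(rest)                           # A_{n-2,k}(a^{ij}), k = 0, ..., n-2
rhs = sum((g(k + 2) - 2 * g(k + 1) + g(k)) * A[k] for k in range(len(A)))
print(lhs, rhs)
```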
In general the maximum or the minimum of f(p) can be attained at more than one point in D. Thus if np ≤ n − 1, the function p_1 p_2 ... p_n attains its minimum 0 at every point in D with at least one zero coordinate, and there are infinitely many points with this property. The following theorem gives some information about the set of points at which an extremum is attained.
Theorem 2. Let a be a point in D at which f(p) attains its maximum or its minimum. Suppose that a has at least two unequal coordinates which are distinct from 0 and 1. Then

(i) f(p) attains its maximum (or minimum) at any point in D which has the same number of zero coordinates and the same number of unit coordinates as a;

(ii) if a has exactly r zero coordinates and s unit coordinates, the maximum (or minimum) of f(p) is equal to

(14)    f(a) = (1 − np + s) g(s) + (np − s) g(s+1),

and we have

(15)    g(s+k) = k g(s+1) − (k−1) g(s),    k = 2, ..., n−r−s.
Proof. Let m = n − r − s be the number of coordinates of a = (a_1, a_2, ..., a_n) which are distinct from 0 and 1. We may take a_1, a_2, ..., a_m to be these coordinates, and we may assume that a_1 ≠ a_2. We first show that

(16)    f_{n-k,i}(a^{1,2,...,k}) = 0,    i = 2, ..., k,

for k = 2, ..., m. Equations (16) will be proved by induction on k. That (16) is true for k = 2 follows from Theorem 1, (8). Assume that (16) is true for a fixed k, 2 ≤ k < m. Let

(17)    b_k = (b_1, ..., b_k, a_{k+1}, ..., a_n),

where

(18)    b_1 + b_2 + ... + b_k = a_1 + a_2 + ... + a_k,    0 ≤ b_i ≤ 1,    i = 1, 2, ..., k.

The point b_k is in D. By (4) and the induction hypothesis,

(19)    f(b_k) = f_{n-k,0}(a_{k+1}, ..., a_n) + (a_1 + ... + a_k) f_{n-k,1}(a_{k+1}, ..., a_n) = f(a).

Thus the maximum is attained at every point b_k which satisfies (17) and (18). In particular, (18) can be satisfied with b_1 ≠ b_2, b_1 ≠ a_{k+1}, b_2 ≠ a_{k+1}, 0 < b_i < 1, i = 1, 2, ..., k. Under these assumptions we can apply the induction hypothesis (16) with a replaced by the point b_k, whose first k+1 coordinates can be suitably rearranged. Hence

       f_{n-k,i}(b_1, a_{k+2}, ..., a_n) = 0,    f_{n-k,i}(b_2, a_{k+2}, ..., a_n) = 0,    i = 2, ..., k.

Applying (3) to the left sides of these equations, we obtain

       f_{n-k-1,i}(a^{1,...,k+1}) + b_1 f_{n-k-1,i+1}(a^{1,...,k+1}) = 0,
       f_{n-k-1,i}(a^{1,...,k+1}) + b_2 f_{n-k-1,i+1}(a^{1,...,k+1}) = 0,    i = 2, ..., k.

Since b_1 ≠ b_2, we find that (16) is satisfied with k replaced by k + 1. Thus (16) holds for k = 2, ..., m.

By (16) with k = m, equations (19) hold with k = m for every b_m which satisfies (17) and (18). Since f is symmetric, this implies part (i) of the theorem.

To prove part (ii) we observe that

       a_1 + a_2 + ... + a_m = np − s,

and we can put (a_{m+1}, ..., a_n) = (0^r 1^s). Hence, by (19) and (16) with k = m,

(20)    f(a) = f_{n-m,0}(0^r 1^s) + (np − s) f_{n-m,1}(0^r 1^s)

and

(21)    f_{n-m,i}(0^r 1^s) = 0,    i = 2, ..., m.

Applying (6) and then (1), we obtain

       f_{n-m,i}(0^r 1^s) = \sum_{h=0}^{i} (−1)^{i−h} \binom{i}{h} g(s + h),    i = 0, 1, ..., m.

Hence (14) follows from (20), and (15) is obtained by solving the equations (21) for g(s+2), ..., g(s+m). The proof is complete.
The following immediate corollary of Theorem 2(i) is often convenient for finding an extremum.

Corollary 2.1. The maximum and the minimum of f(p) in D are attained at points whose coordinates take on at most three different values, only one of which is distinct from 0 and 1.

Thus to find an extremum, it is sufficient to determine the numbers r and s of the zero and unit coordinates of an extremal point whose remaining coordinates are all equal. We shall see that r and s can sometimes be determined with the help of Theorem 1. If an extremum is attained at only one point (perhaps except for permutations of the coordinates), part (ii) of Theorem 2 will prove useful to establish the uniqueness.
3. The maximum of the expected value of a convex function of S.

Theorem 3. If ES = np and

(22)    g(k+2) − 2 g(k+1) + g(k) > 0,    k = 0, 1, ..., n−2,

then

(23)    E g(S) ≤ \sum_{k=0}^{n} g(k) \binom{n}{k} p^k (1 − p)^{n−k},

where the sign of equality holds if and only if p_1 = p_2 = ... = p_n = p.

Thus in particular every absolute moment of S, E(|S − b|^c), about an arbitrary point b, which is of order c > 1, attains its maximum if and only if all the p_j are equal.
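Theorem 3 lends itself to a numerical check: for a strictly convex g, no point with the prescribed mean should beat the Bernoulli case. A sketch (Python; the choice g(k) = (k − np)^2, so that E g(S) is the variance of S, and all names are ours):

```python
import random

def pmf(ps):
    """Distribution of the number of successes S for trial probabilities ps."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)
            new[k + 1] += w * p
        dist = new
    return dist

def Eg(g, ps):
    return sum(g(k) * w for k, w in enumerate(pmf(ps)))

n, p = 5, 0.4
g = lambda k: (k - n * p) ** 2          # strictly convex; E g(S) = Var S here
bernoulli = Eg(g, [p] * n)              # = np(1 - p)
random.seed(1)
for _ in range(1000):
    q = [random.uniform(0, 1) for _ in range(n)]
    c = n * p / sum(q)
    if c * max(q) > 1:                  # rescaled point would leave the cube
        continue
    q = [c * x for x in q]              # now sum(q) = np, so ES = np
    assert Eg(g, q) <= bernoulli + 1e-9
print("maximum of E g(S):", bernoulli)
```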
Proof of Theorem 3. Let a = (a_1, a_2, ..., a_n) be a point in D at which f(p) = E g(S) attains its maximum. Suppose that a_i ≠ a_j for some i, j. By Theorem 1, f_{n-2,2}(a^{ij}) ≤ 0. By (13) and (22) this implies A_{n-2,k}(a^{ij}) = 0, k = 0, 1, ..., n−2. But this is impossible since the sum of the probabilities A_{n-2,0}, A_{n-2,1}, ..., A_{n-2,n-2} is 1. Hence the maximum is attained if and only if all the a_j are equal, that is, a_1 = a_2 = ... = a_n = p. This implies (23) and completes the proof.

Observe that in the proof of Theorem 3 no use was made of Theorem 2. Only inequality (7) of Theorem 1 was needed.
4. The extrema of certain probabilities.

In this section we consider the determination of the maxima and the minima of the probabilities P(S ≤ c) and P(b ≤ S ≤ c) when ES = np.

Theorem 4. If ES = np, and c is an integer, then

(24)    0 ≤ P(S ≤ c) ≤ Q(c, p) < 1    if 0 ≤ c ≤ np − 1,

(25)    0 < 1 − Q(n−c−1, 1−p) ≤ P(S ≤ c) ≤ Q(c, p) < 1    if np − 1 < c < np,

(26)    1 − Q(n−c−1, 1−p) ≤ P(S ≤ c) ≤ 1    if np ≤ c ≤ n,

where

(27)    Q(c, p) = \max_{0 ≤ s ≤ c} \sum_{k=s}^{c} \binom{n−s}{k−s} a^{k−s} (1 − a)^{n−k},

(28)    a = (np − s)/(n − s).

The maximizing value of s satisfies the inequality

(29)    (c + 1 − np)(n − s) ≤ n − np

unless c = n − 1, in which case s = n − 1. All bounds are attained. The upper bound for 0 ≤ c ≤ np − 1 and the lower bound for np ≤ c < n are attained only if p_1 = p_2 = ... = p_n = p.

Theorem 5. If ES = np, and b and c are two integers such that 0 ≤ b ≤ np ≤ c ≤ n, then

(30)    \sum_{k=b}^{c} \binom{n}{k} p^k (1 − p)^{n−k} ≤ P(b ≤ S ≤ c) ≤ 1.

Both bounds are attained. The lower bound is attained only if p_1 = p_2 = ... = p_n = p unless b = 0 and c = n.
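The bound Q(c, p) of (27)-(28) is the probability that s unit trials plus a binomial number of successes in n − s trials with parameter a = (np − s)/(n − s) do not exceed c, maximized over s; by (24)-(25) no admissible p should exceed it when c < np. A numerical sketch (Python; names are ours):

```python
import random
from math import comb

def pmf(ps):
    """Distribution of the number of successes S for trial probabilities ps."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)
            new[k + 1] += w * p
        dist = new
    return dist

def Q(n, c, p):
    """Q(c, p) of (27)-(28)."""
    best = 0.0
    for s in range(c + 1):
        a = (n * p - s) / (n - s)                        # (28)
        tail = sum(comb(n - s, k - s) * a ** (k - s) * (1 - a) ** (n - k)
                   for k in range(s, c + 1))             # (27)
        best = max(best, tail)
    return best

n, p, c = 6, 0.5, 2                                      # c < np = 3
bound = Q(n, c, p)
random.seed(2)
for _ in range(1000):
    q = [random.uniform(0, 1) for _ in range(n)]
    r = n * p / sum(q)
    if r * max(q) > 1:
        continue
    q = [r * x for x in q]                               # ES = np
    assert sum(pmf(q)[:c + 1]) <= bound + 1e-9
print("Q(c, p) =", bound)                                # here 22/64
```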
Proof of Theorem 4. We first consider the maximum of f(p) = P(S ≤ c | p) in D. By Corollary 2.1 the maximum is attained at a point a = (0^r a^{n−r−s} 1^s) (using a notation similar to that employed in section 2), where r ≥ 0, s ≥ 0, n − r − s ≥ 0, and

       (n − r − s) a = np − s.

If c ≥ np, let s be the greatest integer contained in np, and r = n − s − 1. Then a = np − s and P(S ≤ c | a) = 1. Hence the (obvious) upper bound in (26) is attained.

Now let 0 ≤ c < np. If s > c, P(S ≤ c | a) = 0. But P(S ≤ c | p, p, ..., p) > 0 for all c ≥ 0. Hence we must have s ≤ c. Since a ≤ 1, we have n − r ≥ np. If n − r = np, then a = (0^r 1^{n−r}) and n − r > c, hence P(S ≤ c | a) = 0. Thus we must have n − r > np. Hence we have the inequalities

       0 ≤ s ≤ c < np < n − r ≤ n,

and this implies 0 < a < 1.
We have P(S ≤ c) = E g(S), where g(k) = 1 or 0 according as k ≤ c or k > c. Hence by (12),

       f_{n-2,2}(a^{ij}) = A_{n-2,c}(a^{ij}) − A_{n-2,c-1}(a^{ij}).

Suppose that a^{ij} has u zero coordinates, v unit coordinates, and all remaining n − u − v − 2 coordinates equal to a. Then

       A_{n-2,k}(a^{ij}) = \binom{n−u−v−2}{k−v} a^{k−v} (1 − a)^{n−k−u−2}

and

       f_{n-2,2}(a^{ij}) = \frac{(n−u−v−2)!}{(c−v)! (n−c−u−1)!} a^{c−v−1} (1 − a)^{n−c−u−2} [ (n − u − v − 1) a − c + v ].

Since 0 < a < 1, we see that if v ≤ c < n − u − 1, f_{n-2,2}(a^{ij}) has the same sign as (n − u − v − 1) a − c + v.
Suppose that r > 0. By Theorem 1 with a_i = 0, a_j = a we must have f_{n-2,2}(0^{r−1} a^{n−r−s−1} 1^s) ≤ 0. Hence (n − r − s) a − c + s = np − c ≤ 0. But this contradicts the assumption c < np. Thus r = 0, a = (a^{n−s} 1^s), (n − s) a = np − s, 0 ≤ s ≤ c.

Suppose that s > 0. By Theorem 1 with a_i = a, a_j = 1 we must have f_{n-2,2}(a^{n−s−1} 1^{s−1}) ≤ 0, that is,

(31)    (n − s) a − c + s − 1 = np − c − 1 ≤ 0.

Hence if c < np − 1, we must have r = s = 0, a = p. Thus the second inequality (24) holds for 0 ≤ c < np − 1, and the bound is attained. (We postpone the proof for c = np − 1.)

Now suppose that n − s > 1. By Theorem 1 with a_i = a_j = a we must have f_{n-2,2}(a^{n−s−2} 1^s) ≥ 0, that is, ((n − s) − 1) a − c + s ≥ 0. This is equivalent to

(32)    (c + 1 − np)(n − s) ≤ n − np.

If c = n − 1, this contradicts the assumption n − s > 1, and we must have s = n − 1. If c ≠ n − 1, we have c < n − 1 and n − s > 1, so that (32) must be satisfied. Hence if c < np, the maximum of P(S ≤ c) is Q(c, p) as defined in (27) and (28), and the maximizing value of s satisfies (32) and is equal to n − 1 if c = n − 1. (We postpone the proof of strict inequality in (32) for c ≠ n − 1.) Since a > 0 and c − s < n − s, we have Q(c, p) < 1.
We next show that if 0 ≤ c < np, the maximum can be attained only at a point whose coordinates which are distinct from 0 and 1 are all equal. Suppose the maximum is attained at a point a which has at least two unequal coordinates which are distinct from 0 and 1. Let s be the number of unit coordinates in a. By Theorem 2, equation (14), we must have f(a) = 1 if s < c, f(a) = 1 − np + s if s = c, and f(a) = 0 if s > c. Since for 0 ≤ c < np the maximum is positive and less than 1, we must have s = c. By (15) with s = c, k = 2 we must then have g(c + 2) = −1, which is not true. Hence the coordinates of a which are not 0 or 1 must be all equal. By Theorem 1 this implies that the inequalities in (31) and (32) are strict. All statements of Theorem 4 concerning the upper bounds are now easily seen to be true.
The statements concerning the lower bounds follow from the equation

       P(S ≤ c | p) = 1 − P(S ≤ n − c − 1 | q),

where q = (1 − p_1, 1 − p_2, ..., 1 − p_n). The proof is complete.

Proof of Theorem 5. Since P(b ≤ S ≤ c) = P(S ≤ c) − P(S ≤ b − 1), the lower bound in (30) and the condition for its attainment follow from Theorem 4. The upper bound 1 is attained at (0^{n−c} a^{c−b} 1^b), where (c − b) a = np − b.

5. Statistical applications.

The lower bound for P(b ≤ S ≤ c) which is given in Theorem 5 shows that the usual (one-sided and two-sided) tests for the constant probability of "success" in n independent (Bernoulli) trials can be used as tests for the average probability p of success when the probability of success varies from trial to trial. That is to say, the significance level of these tests (which is understood as the upper bound for the probability of an error of the first kind) remains unchanged. Moreover, we can obtain lower bounds for the power of these tests when the alternative is not too close to the hypothesis which is being tested. (Very roughly, the significance level has to be less than 1/2 and the power greater than 1/2.) We can also obtain a confidence interval for p with a prescribed (sufficiently high) confidence coefficient, and an upper bound for the probability that the confidence interval covers a wrong value of p when the latter is not too close to the true value. Details are left to the reader.
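The claim about the significance level can be illustrated numerically: by (30), the probability that S falls in the acceptance region b ≤ S ≤ c of a binomial test is never smaller under varying p_j (with average p) than under constant p, so the probability of rejection never exceeds the nominal level. A sketch (Python; names and the particular acceptance region are ours):

```python
import random
from math import comb

def pmf(ps):
    """Distribution of the number of successes S for trial probabilities ps."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)
            new[k + 1] += w * p
        dist = new
    return dist

n, p = 10, 0.5
b, c = 2, 8                      # acceptance region of a two-sided test, b <= np <= c
binom = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(b, c + 1))
random.seed(3)
for _ in range(1000):
    q = [random.uniform(0, 1) for _ in range(n)]
    r = n * p / sum(q)
    if r * max(q) > 1:
        continue
    q = [r * x for x in q]       # average success probability is p
    assert sum(pmf(q)[b:c + 1]) >= binom - 1e-9
print("significance level is at most", 1 - binom)
```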
Theorem 3 can be applied in certain point estimation problems. Suppose we want to estimate a function θ(p), and the loss due to saying θ(p) = t is W(p, t). If the estimator t(S) is a function of S only, and W(p, t(S)) is a strictly convex function of S for every p, then Theorem 3 implies that the risk, E W(p, t(S)), is maximized when all the p_j are equal. It follows in particular that if t(S) is a minimax estimator under the assumption that the p_j are all equal, it retains this property when the assumption is not satisfied (with no restriction on the class of estimators).
One may doubt whether these problems are statistically meaningful since the average probability of success depends on the sample size. The main interest of these results to the practicing statistician seems to be in cases where he assumes that the probability of success is constant, but there is the possibility that this assumption is violated.