Hoeffding, Wassily; (1954)The efficiency of tests." (Air Research and Dev. Command)

THE EFFICIENCY OF TESTS
by
Wassily Hoeffding
Universiiu of North Carolina
Institute of Statistics
Mimeograph Series No. 99
Limited Distribution
April 1954
IThis research was supported by the United States Air Force,
through the Office of Scientific Research of the Air Research and
Development Command.
UNCLASSIFIED
Security Information
Bibliographical Control Sheet
1.
2.
O.A.:
Institute of Statistics, North Carolina State College of the
University of North Carolina
M.A.:
Office of Scientific Research of the Air Research and Development Command
O.A. :
CIT Report No. 6
OSR Technical Note -
}1.A. :
3.
THE EFFICIENCY OF TESTS
4.
Hoeffding,
5.
A'pril, 19.54
6.
21
7.
None
8.
AF 18(600)-4.58
9.
ROO No. R-3.54-20-8
( UNCI/I SSIFIED )
~1 assily
10.
UNCLASSIFIED
11.
None
12.
The efficiency of a family of tests is defined. rtethods for
evaluating the efficiency are discussed. The asymptotic efficiency
is obtained for certain families of tests under assumptions which
imply that the sample size is large.
THE EFFICIENCY OF TESTSI
By Hassily Hoeffding, University of North Carolina
Summary.
The efficiency of a family of tests is defined.
for evaluating the efficiency are discussed.
Methods
The asymptotic efficiency is
obtained for certain families of tests under assumptions which imply that
the sample size is large.
Introduction.
Let w(l) and w(2) be two tests each of which has
n
n
the same fixed significance level for testing a hypothesis 8 R 8 , and which
1
are defined for every sample size n. The relative efficiency of test w(2)
1.
n
(or rather, of the sequence
has been
{w~2)}) with respect to test w(l)
n
defined as the ratio n /n , where n is the least sample size required by a
l 2
2
test of the sequence {w~
in order to achieve the same power for a given
2)}
alternative
sample of
e = 82
size~.
as is achieved by the test in
{
w~l)} which uses a
This is essentially the definition used by Pitman (see
Noether ~1_7, p. 241).
In section 2 we extend this definition by replacing sequences of
tests by arbitrary families of (nonsequential) tests and the parameters
8 , 8 by two arbitrary classes of distributions. The tests are regarded
1 2
as general two-decision rules.
If N(T) is the least sample size used by a
=
test in family T whose probabilities of the two kinds of error (corresponding
=
to the two classes of distributions) do not exceed two given numbers, the
1. This research WaS supported by the United States Air Force, through
the office of Scientific Research of the Air Research and Development Command.
2
ratio N(!1)/N(~2) is defined as the relative efficiency of family ~2 with
respect to familY!l'
In section 3 it is pointed out that the determination of N(T) is
=
closely related to finding a test which maxi~izes the minimum power.
In section
4 the problem of asymptotic efficiency is considered. In
studies of the asymptotic efficiency of tests it is customary to consider a
simple hypothesis 6 = 81 (say) and a simple alternative, 6 = 8 , and to assume
2
that as n ---> 00, 61 remains fixed and 62 approaches 61 in a certain way,
2
for instance by setting 6 = 6 +kn- l / • Neither the restriction to simple
2
l
hypotheses nor the assumption that 6 2 depends on n seemsto be entirely adequate from the point of view of most applications.
In this paper a different approach is used.
In section
that one or the other decision is undesirable according as 8
4 we assume
8 or 6 ~ 6 ,
1
2
where 6 and 9 2 > 8 are two fixed numbers. Since 8 and 6 will usually
1
1
1
2
be so chosen that neither decision is strongly preferred when Q ~ 6 < 6 ,
2
l
small values of 6 - 9 are of particular interest. We consider families
2
1
of tests which depend on stAtistics whose distributions are determined by
the parameter
family
~
e.
~
Let N(!) denote the least sample size used by a test in
for which the probability of a wrong decision does not exceed a
l
1 and does not exceed a 2 if 6 ~ 6 2 , Thus N(!) is a function of
8 and 8 , Under suitable assumptions we derive asymptotic expressions for
2
1
N(!) when 5 = 8 2 - 61 tends to zero, while al and a 2 remain fixed.
if 6 ~ 6
2.
The efficiency of a family of tests.
Let X be a random variable
or a random vector with cdf (cumulative distribution function) F, which is
-
assumed to belong to a class C of cdfs.
It is desired to decide, on the
3
...
basis of n independent observations x = (xl'
, Xn ) on X, between two
-n
alternative courses of action, Al and A2 • Let d. denote the decision in
~
favor of Ai (i
and
~2
of
~
g such
1, 2).
Suppose there are given two disjoint subclasses £1
that decision d i is preferred if F is in gi (i
= 1, 2).
(In most applications one will choose the classes £1 and £2 in such a way
that neither decision is strongly preferred if F is neither in £1 nor in ~2')
Consider a decision rule, briefly referred to as a test, of the
Let E denote the space of points
following type.
n
X •
-n
(Thus i f x. is a
J
vector with k components, E may be taken as the nk-dimensional Euclidean
n
Let wI be a (Borel-) subset of E , and denote its complement by
space. )
n
n
w ' (All sots of points ~n considered in this paper are assumed to be Borel
2n
sets, and all functions of x are understood to be Borel-measurable. This
-n
will usually not be explicitly stated.) A test determined by the pair of
= (wn l' wn 2) consists in taking n observations
on X and taking de.
sets wn
cision d. if the observed point x
-n is
~
referred to as the test w.
n
= 1,
in w. , i
~n
2.
This test will be
A test w with w. C. E will be called a test
n
n
III
based on n observations.
Let aI' a 2 be two positive numbers.
We shall say that the test wn
solves the problem (£1' g2' aI' a 2 ) if
(2.1)
P(!n e Win
I F)
> 1 - a i for all F in gi' i == 1, 2,
where X = (Xl' •.• , X ) is a random vector with values in E , and peR IF)
~
n
.
n
denotes the probability of relation R when Xl' •.• , X are independent with
n
conunon cdf F.
4
-
Let T be a family of tests.
-
We shall mainly be concerned with
families 1 which, for every positive integer n, contain at least one test
based on n observations.
For eXAmple, ! may be the family of all tests
wn with wIn = S
X : t n (x
( -n
-n ) < C } , - 00 <: C <! 00, n = 1, 2, •.• , where ~l t n )1
is a given sequence of functions. (Here ~n: R} denotes the set of all
i
points ~n such that relation R is satisfied.)
Let N(~) = N(J, £1' £2' a l ,a 2 )
be the least integer n such that the inequalities (2.1) arc satisfied for
some test in
1.
Thus
N(~)
is the least number of observations with which prob-
lem (£1' £2' aI' a 2 ) can be solved when we restrict ourselves to tests belonging to family!. N(T) will be termed the efficiency index, or simply
=
the index, of family ~ for problem (£1' £2' aI' a 2 ). Evidently! may contain more than one test based on N(~) observations.
If
11
dices ~(~i)
and
~2
are two families of tests and at least one of the in-
= N(!i' £1' g2' aI' a 2 ), i
will be called tho efficiency of family
= 1, 2, is finite, the ratio
~l
relative to family J2 for prob-
lem (£1' £2' aI' a 2 ). The relative efficiency can thus equal any nonnegative
rational number or fnfinity.
This definition of relative efficiency of two tests is an extension
of Pitman's definition of the same term; see Noether L-l_7, p. 241.
Let =T* be the family of all tests wn = (wI'
n w2n ), for all n = 1, 2,
If N(i~)
= 00, problem (~1' ~2'
ap a 2 ) has no solution (within the
5
and T is any subfamily of T-::-, then eff(TjT~~)
=
=
=
=
may be called the (absolute) efficiency of family ~ for problem (£1,E2,a ,a )
l 2
(provided we confine ourselves to tests in T~~). Clearly
=
family
T·Y~).
If
N(T~~) < 00,
eff
(T/T~!-)
< 1,
== -
-
the sign of equality holding if and only if T contains a test which solves
problem (~1'~2,al,a2) and is based on the least possible number of observations with which the problem can be solved by any test in
~*.
All that has been said can immediately be extended to families of
randomized tests.
A randomized test is determined by a function
taking n observations x and performing a random experiment whose two
-n
possible outcomes, e l and e , have probabilities
2
respectively.
~l
(x ) and
n -n
If event e. occurs, decision d. is taken.
~
by the pair of ftlnctions
~
'n = (
$ln'
~2
(x ),
n-n
A test determined
'2n) will be referred to as the
is defined as the least n such that the relations
for all F in C.,
E(
are satisfied for some $
value of
~.
~n
=~
n
in T.
=
(X ) when Xl' _.• ,X
-n
n
Here E(
~.
~n
I F)
i=1,2,
denotes the expected
are independent with the.. corrmon cdf. F.
6
If randomized tests arc admitted which do not use any observations, we could
have N(!) = O.
This trivial case can be excluded by assuming that a + a 2 < 1.
l
One could extend the notion of efficiency to families of tests such
that the choice of the number of observations also depends on a random
experiment, the probabilities of whose outcomes mayor may not depend on
the observations.
tests.)
(In the former case we are dealing with sequential
In this paper we confine ourselves to families of nonsequential
tests basJd on a nonrandom number of observntions.
So far we have assumed, to simplify the exposition, that Xl ,X , ..• ,
2
is a sequence of independent, identically distributed random variables.
Supoose, more generally, that for every n the random vector -n
X =(Xl""'X)
n
has a cdf G which belongs to a class C. For every n there are given
n
=Il
twodlsjoint subclasses, C and C2 ' of C.
ln
n
=n
=
solves problem (I £In}
~
where peR
I Gn )
, {£2n
I Gn )
p(X & W.
ID
TJe say that a test
=
W
1 ,al ,(2 ) if
> 1 - a. for all G in C. ,
1
n
=ID
n
=(wl ,w 2 )
n· n
i=1,2,
denotes the probability of relation R when X has the cdf
Gn •
The definitions of N(J) = N(~,
are
obvious
extensionS
{gln! '
f £2nl
~
' aI' (12) and eff(~1/J;2)
of the corresponding definitions in the special
case.
3.
The detormination of N(T).
randomized or not.
.
...
p-
Let T be a given family of tests,
=
Since a randomized test
$n such that
$
(x )
n -n
==
a
or 1
7
for all x is equivalent to a nonrpndomizod test
-n
tests in T by
=
$.
n
wn ' we may denote the
Lot T be the family of all tests in T which are based
=
~
on n observations. D:rote by ~ the family of all tests
(3.1)
E(
$1n
I
Gn ) -> 1 - ~.L
$
n in !n for which
for all G in C
•
ln
n
=
Let
M(
41 ):2
n
M
n
sup
E(
41
inf
e T*
riC
Ge C
n =2n
=
$
n
1n
I
$
G ),
n
)
n
=n
The following theor:Jm is self-evident.
Theorem 3 .1.
N(~, {£In
t ' f£2n J ,aI' a2 ) is the least integ3r
n for which M < a •
n-
2
Adapting a familiar terminology, we may say that a test which
satisfies
(3.1) has level a 1 with respect to £In' and we may call
1-M(
the minimmn power of test
t$
n
)
exists a test
$
n
in To!} such that M(
=n
t$
n with respect to ~2n'
$
n
)
If there
= M , the test is said to
n
maximize the minimum power with respect to C2 • Tests which maximize the
=n
minimum powor can sometimes be
obtained by a method due to Weld and
explicitly apDlied to this problem by Lehm6lUl
£2_7,
Theorem 8.3.Fol' 8 proof of
/
8
a special case of the theorem and for illustrations of its use see Lehmann
4.
ASymptotic efficiency.
We shall now consider the asymptotic
-
behavior of the efficiency index N(T) for certain families of tests in cases
£In
whero the "distance" between the classes
and
~2n
Let Q(G )
n
be a real-valued function defined for all G in C and for every n=1,2, •••
n
(or for every sufficiently large n).
is small.
=n
We assume that the set w of values
Q(G ) when Gee, n=1,2, . . , is an interval, finite or infinite. For
n
n =n
example, C may bo the class of all cdfs of the form G (x )=F(x )F(x ) ..• F(x ),
l
=n
n -n
n
2
where F belongs to some class
Q;
in this case Q(Gn)=Ql(F) is a function of
F only, such as the mean or Ql(F) = F(O).
As another example, let C be
=1l
the class of cdfs Gn(~n)=Fl(xl)... Fl(xm)F2(xm+l)" .F2(x n ), where Fl and F2
are normal cdfs with common variance a
2
and respective means ~l and ~2'
and m = men) is a given function of n, and let Q(Gn )
= (~l-
~2)/a.
Let Ql and Q2 be two numbers in w , 91 < 92 , and let £In and £2n
be the classes of all G in C such that 9(G ) < 9 and 9(G ) > 9 , respectively.
n
=n
n- 1
n- 2
Let
1t n <.!n)}
be a given sequence of functions, defined for all
that the distribution of t (X ) depends on Gn only
n=1,2, •.. , and SUpDOSO
.
n~
through 9( G ) • We shall write p(t e S
n
n
I Q)
for pet (X )e
n -n
s I Gn )
when
9( G ) = 9.
n
Let T be the family of all tests w with
=
(4.1)
n
-00
<
C
<
00
n=1,2, ...
9
We shall consider the asymptotic behavior of N(T)=N(T, 5 C 2
Sc 2
.
= 1 =In ~ , l =2n ~ ,
aI' a ) when 5 = 9 - 9 is small. Apart from the assumptions made so far
2 1
2
we shall make four special assumptions A, B, C, D.
Assumption A.
For every n and every c, pet
-
and nonincreasing function of 9 for 9 e
will be satisfied by test
I 9)
is a continuous
00 •
The nonincreasing character of pet <
ninequalities
p(x
-ne w.].n
n-< c
I Gn ) > 1 - a.
].
0
I g)
implies that the
for all Gn in =J.n
C. ) i=l, 2,
(4.1) if and only if
(4.2)
Let on be the smallest number which satisfies the inequality
This number exists since P(t ~ c
n
continuous on tho right.
I 91 )
is a nondecreasing function of c,
As an application of Theorem 3.1 we obtain
Lenuna
(4.3 )
4.1. N(T)
..,. is the least integer n for which
pet < c
n -
n
I
92 ) < a 2 .
-
10
Assumption B.
a
1
+a
2 < 1.
From Assumption B and the continuity assumption in A we obtain
Proof.
M
l is fixed and 6 = Q2- Q1 ~ 0, then N(I) ~ 00,
Given an integer nt, we can choose 6 > 0 so small that
Lemma 4.2.
if 6 < 6 , for n
1
Q
1
~
nt •
Assumption C.
Then N(T) > nt,
=
There exist a positive number r, a sequence of evory-
where increasing functions f (t ) of t , n=1,2, •.. , and a family of cdfs
n n n
Hd(X), defined for everl d ~ 0, such that for every real x and everl d ~ 0
(4.4)
It follows from Assumption A that Hd(x) is a nonincreasing function
of d.
l'>1e have to assume more.
Assumption D.
except whero Hd(x) = 0
o<
Hd(x) is continuous in x and increasing at every x
~
Hd(X) = 1.
Furthermore, for everl x such that
HO(x) < 1, Hd(x) is a continuous, over~Thorc decreasing function of d,
and
lim
d~ 00
Hd(x) =
o.
11
The first part of Assumption D implies that thoro exists an inverse
function H -1 (u), uniquely defined for 0 < u < 1, so that Hd-1 (u ) = x if
d
and only if Hd(x) = u. Assumption D further implies that if we set
a
(4.5)
the equation in D,
= Ho-l(l_a1 ),
~(a) = a
2 has a unique nonncgative root, which will be
denoted by D(al ,a2 ). By Assumption B, D(a l ,a 2 ) > O. Thus D(a l ,a 2 ) is the
unique positive root of
(4.6)
Thoorem 4.1.
~!
be tho family of tests (4.1).
A, B, C, D arc satisfied, then for 9 fixed and 6
1
~
If Assumptions
0 we have
as~ptotically
Since t nl = f n (tn ) is an increasing function of t n , the
inequality t < c is equivalent to t l < e l , where c' = f (e). To simplify
n- n
n- n
n
n n
the notation we substitute t nl , cnl for t n , cn and omit the primes. Thus
Proof.
equation
(4.8)
(4.4) is replaced by
lim
n~
p( t
00
n
~ x
I 91
r
+ d n- )
,
= Hd(x).
12
Let & b3 a given small positivG number,
&
where D = D(a ,a ).
l 2
Lot a be defined by
< D,
(4.5).
The quantity
(4.10)
is positive by Assumption D.
Choose &1 > 0 so small that
(4.11)
if
Ix-al
~ &1
It follows from Assumption D that Hd-1(u) is continuous in u for
o < u < 1.
Choose
~
> 0 so small that
(4.12)
~
<y ,
(4.13 )
Let
(4.14)
~ (d)
n
= pet n-<
cn
I Ql
+
r
d n- ) - Hd(c
).
' n
13
Sinco tho cdf Hd(x) is continuous in x, the convergence in (4.8)
is uniform in x.
Honco we can choose n so largo that
l
(4.15)
if n > n .
l
In addition we may assume that
(n~l)
(4.16)
Finally, by Lemma
r
< 1 + e
°
4.2, we can choose 1
> 0 so small that, with N~N(~),
(4.17)
Recalling the definition of c
n
and making use of (4.14) and (4.15),
we have for n > n
l
I
1 - a l -< p( t n -< cn9O
(c ) + 'r"I..,
1 ) < Hn
(4.1B)
14
The theorem will be proved if we show that for 0 sufficiently small
(4.19)
By (4.16) and (4.17) this will
ho~
if 0 < 0
1 and
(4.20)
~~
:
.
shall establish the theorcm by showing that the assumption that (4.20)
is false leads to a contradiction when 0 < 0 ,
1
By Lcmma 4.1 wo have
(4.21)
First assume that 0 < 0 and 0 ~ (D-e) N- r , so that the first
1
inequality (4.20) is violated, Then, applying successively Assumption A,
(4.17), (4.15), (4.18), (4.12), (4.11), (4.10) and (4.6),
But this contradiots (4.21).
Hence the first inequality (4.20) is true
for 5 < 51.
In a slllular way we can show that i f 5 < 6 and (D+e)(N-l)-r < 5,
1
then P(t _1 ~ c _ I 9 + 6) ~ a , which contradicts (4.21). This completes
2
N
1
Nl
tho proof.
Let
1tIn ~
and
~ t 2n }
be two sequences of statistics, and denote
by !l and !2 tho corresponding families of tests of the form (4.1) •
that the distributions of t n and t n depend
only on 9 = 9(Gn ) and
Supooso
1
2
that Assumptions A, B, C, D are satisfied in either case. If r and
i
D (a ,a ), i=1,2, denote the values of rand D(a ,a ) for the two families,
l 2
i l 2
an application of Theorom 4.1 gives immediately
Theorem 4.2.
Let!l and !2 be two families of tests of the form
(4.1) which both satisfy Assumptions A, B, C, D.
Then as 5
~
0,
(4.22)
Thus if r
tends to zero.
(4.23 )
< r , the efficiency of family !l relative to family !2
2
1
If r l = r 2 = r,
lim
5~O
16
In many applications the statistic t
asymptotically normally distributed,
1
4> (x) ::
(2n)
"2
or a function of t
n
n
will be
Let
x
f
e-y
2
/2
dy
-00
Assumption C , There exist two functions ~(g) and a(g) > 0,
l
defined for ~ 6 ro , and a sequence of increasing functions gn(t ) of
n
tn' n::l,2, ... , such that for every real x and every d
(4.24)
°
l 2
P{ n /
lim
n --+
~
00
Assumption D • The function ~(Q) has a continuous and positive
l
derivative at Q :: 9 , The function a(Q) is continuous at 9 = 9 ,
1
1
By Assumption Dl we can write
where
~(x) ,
17
e
n
~Oandel~O
n
as n --;.
00.
Hence
!J.1(9 )(1+e )
n
1
d 0"(9 )(1 +e~)
1
has the same
Thus if
WG
l~niting
let
distribution as
18
Assumption D.is also satisfied.
Let A(U) be defined by
(4.25)
U = 1 - ~ (A(U»
Then A(l-u)
= j (-A(U».
= -A(U),
and from (4.6) we obtain
Hence we can state
Thoorem 4.3.
~
I
bo tho family of tests (4.1).
If
Assumptions
A, B, 01' Dl are setisficd and 91 is fixed, we have asymptotically as
{) --+ 0,
(4.26)
The functions
~(e)
and cree) depend on the family!.
Let
19
J(T)
=:
From Thoorem
4.3
Theorem
we obtain
4.4.
~!l ~ !2 be two families of tests of the form
(4.1) which both satisfy assumptions A, B, C , D •
1 1
Then
(4.27)
Thus in this case the asymptotic relative efficiency is independent
of a
1
and a •
2
Theorem
4.4
is cssontially due to Pitman (seo Noether L-l_7, p.
241)
who obtained an analogous result under someWhat different assumptions.
Pitman's result has been extended by Noether
L-5_7.
So far we have considered tests with WIn of the particular form
t
<
n -
c.
Since, by Assumptions C and D, the probability of t
zero as n tends to infinity, Theorems
4.1
to
4.4
n
=c
tends to
apply as well to families
of tests with wln of the form t
< c. They also apply to families of tests
n
which prescribe to take decision d or d2 according as t < c or t > c,
n
l
n
and a random decision when t = c.
n
One is frequently interested in the performance of a sequence of
tests of the form hore considered, with the constant c
for every n.
= c'n
prescribed
From the proof of Theorem 4.1, in particular from inequality
(4.l~), we obtain the conditions which the sequence {c~
1has to satisfy
20
in order that the theorems be apnlicable to sequences of tests of this
type.
We can summarize these results in the following theorem.
Theorem
4.5.
Theorems
4.1
~
4.3
remain true if
~
denotes one
of the following families.
(a)
A family of tests
{,
( , -eo < c < 00, n=1,2, ... , where
n,c 5
'n , c prescribes to take decision dl if t (x ) < c and decision d2 if
n-n
-
t (x ) > c.
n -n
(b)
A
sequence of tests
f 4l ncln 1 ' n=1,2, • . ,
defined as in
(a), where
(for Theorem 4.1)
or
Theorems
4.2
types (a) or (b).
~
4.4
remain true for any two families of tests of
21
REFERENCES
L-l_7
G. E. Noether, Asymptotic properties of the Wa1d4folfowitz
test of randomness. Annals of Math. Stat., Vol. 21 (1950),
pp. 231-246.
L-2_7
E. L. Lehmann, Some p~nciples of the theory of testine
hypotheses.
Annals of Math. Stat., Vol. 21 (1950), pp. 1-26.
E. L. Lehmann and C. Stein, Most powerful tests of composite
hypotheses. I. Normal distributions. Annals of Math.
L-4_7
-Stat., Vol. 19 (1948), pp. 495-516.
E. L. Lehmann and C. Stein,
parametric hypotheses.
(1949), Pp. 28-45.
On the theory of sane nonAnnals of Math. Stat., Vol. 20
G. E. Noether, Asymptotic relative efficiency of tests of
hypotheses. Unpublished.