Rosenblatt, J.R.; (1955)On a class of non-parametric tests." (Air Research and Dev. Command)

•
ON A CLASS OF NON.PARAMETRIC TESTS
by
Joan Raup Rosenblatt
University of North Carolina
,.
'-I"
\'/
,~
I[
This research was supported by the
United States Air Force, through the
Office of Scientific Research of the
Air Research and Development Command •
..
.
•
Institute of Statistics
Mimeograph Series No. 138
August, 1955
ii
ACKNOWLEDGEMENTS
I am indebted to Professor Wassily Hoeffding, who suggested
the problem on which an attempt is made here.
His gUidance and
assistance at every stage of the work have been illuminating and
instructive.
Grateful acknowledgement is made of encouragement
and confidence received from Professor Harold Hotelling.
I thank the Department of Statistics and the United States
Air Force for financial assistance.
Finally, I have incurred a heavy personal debt to the generous
friendship of Miss Martha Jordan.
Joan R. Rosenblatt
lOl _
•
iii
TABLE OF CONTENTS
Pa.ge
ACKNOWLEDGEMENTS
ii
TAB!..E OF CONTENTS
iii
Chapter
I.
DESCRIPl'ION OF A CLASS OF PROBLEMS
1
1.
Chara.cterization of e. C1a,ss of Problems
1
2.
A 01a,as of Decision Rules for the Problem
(D,
3.
II.
•
III.
(1)0'
~)
3
"Optimum" Decision Rules for
(D,
(1)0'
~)
7
SOME CONVERGENCE THEOREMS
12
1.
Introduction
12
2.
Uniform Convergence:
3.
Properties of Two-Sample U-Statiatic
17
4.
Uniform Convergence:
20
5.
Bounds for the Deviation from Normality
Single-Sample Ca,se
Two-Sample Case
12
22
ASYMPTOTIC EFFICIENCY
26
1.
Introduction
26
2.
Definition of Efficiency
26
3.
As,mptotic l5lcpression for N ( ~ )
c
32
4.
A Class of Statistics for which Theorem 3.1
Applies
,8
5.
Asymptotic Comparison of Two Test Families
44
6.
Examples
49
7•
Further As,mptotic Expressions for Nc (
8.
Examples
J)
51
56
•
iv
Page
Chapter
IV.
TWO-SAMPLE TESTS
1.
Introduction
2.
Tests of
eS
9
1
Against 9
~
9 ;
2
Small
Samples
60
:3 •
Rank Order Tests
65
4.
Tests of F
5.
Asymptotic Properties
77
6.
Large-Sample Properties
78
= G;
7 • Large-Sample Comparison of
8.
Discontinuous
BIBLIOGRAPHY
69
Sma.ll Samples
U
D1st~ibutions
and
1J
and Ties
95
100
104
•
CHA.P.rER I
DESCRIPl'ION OF A CLASS OF PROBLEMS
1. Characterization of a Class of Problems.
Let F1' ... , Fk be cumulative distribution functions (edt's),
and let n
= (nl ,
••• ,
~)
be a vector of positive integers.
Let
.
= Gn represent the joint distribution of (n l + ••• + nk )
k
l
independent random variables of which n are distributed according
i
to F (i = 1, ... , k). Let!f) be a class of k-tuples (Fl, ... ,Fk ),
i
G
n , ••• , n
£J n
and let
= {G : (F , ... , F )
n
k
l
E
f)}.
The distributions F may
i
be multivariate.
Let -i,n
x
i
= (!l
= (Xil ' ... ,
Xi n ), i
= 1,
i
••• , k, and z = z
n
nl, ••• ,nk
n ' ••• , !k, n. ).
, 1
'
l{
Suppose that for all (Fl' ••• , Fk)
E
iJ there is defined a real-
valued functional
(1.1)
where r
g(F , ••• ,F )
k
l
~
1 is an integer.
=
J~(Z
.
r,r, ••• ,r
)
r, ••• ,r (zr, ••• ,r ),
dG
The kernel ¢ is understood to be a measu-
rable function and integration is extended over kr-dimensional Euclidean space (in the case of univariate Fi).
stood in the sense of Lebesgue-Stieltjes.
The integral is underSuppose the functional
e(F l' ... , Fk) takes values in an interval w which may be finite or
infinite.
Many of the problems of statistical decision theory may be stated
in these terms. We consider non-sequential two-decision problems
which may be characterized as follows:
2
We are given a sample
zn consisting of n observations from each
i
of k populations, and we assume that Z is distributed according to
n
Gn
E
et} n • We wish to make one or the other of two decisions dO' dl'
and we are given subsets (1)0' (1)1 of the interval (1) (corresponding to
subsets £) 0'
(j
= 0,1).
rf)1
of £; ), such that we prefer d J i f Q(Fl' • • •,Fk )EroJ
We will call this the problem
(0,
(1)0' (1)1).
The object
of theoretical investigations in this context is to find or determine
the properties of procedures for selecting one at dO' dl on the basis
of the sample zn' that is, for solVing the problem (~, (1)0' (1)1).
This representation of a class of problems is not newj references
to some of the contributions to the theory of certain problems of
•
this type are included in Section 2 of this chapter •
It is assumed that the kernel
rj)
of the functional (1.1) is
symmetrical in each of its k sets of r arguments.
essential restriction, except as noted below.
If
This is not an
rj)
is not symmetrical,
it is always possible to construct a symmetrical kernel rj)'.
where for i
= 1,
(J i )
••• , k, !i,r
Let
is a permutation of (xil, ••• ,Xir )
and J i indexes the set of permutations of (1, 2, ... , r)j 1:' denotes
k-fold summation over the ra.nges of the indexes Jl, ••• ,Jk •
We have assumed that the kernel ¢ of (1.1) has the same number r
of arguments in each of the k sets.
restriction.
Suppose
ThiS, again, is not an essentia.l
3
•
= max
(r , ... , r ). For i = 1, ... , k, ¢ may always be
k
l
regarded as a function of (x , ... , X ) which de1Jende on these
il
ir
and let r
arguments only through the values of (xil ' ••• , x ir ). When regarded
this way, of course, ¢ is not symmetrical.
We will give particular attention to certain decision problems
for which the functional (1.1) is equal to the probability of an
event; that is, for which
¢ is
the characteristic function of a set
in kr-dimensional Euclidean space (univariate Fi) and takes values
zero or one only.
In this case, the assumption we have made about
the symmetry of the function
¢ is
not trivial, since in general the
symmetrical function ¢t would not be a characteristic function.
For
the examples with which we will be concerned, however, ¢ satisfies. all
our assumptions.
2. A Class of Decision Rules for the Problem (f), (1)0' (1)1) •
The purpose of the present investigation is to develop and illustrate the application of methods for deriving properties of certain
classes of decision rules which have been proposed for solving the
problem (£) , (1)0' (1)1).
Chapters II and III contain some general :
theorems about these classes of decision
rules,~th
s1Jecializations
and illustrations relevant to a particular class of two-sample problems.
In Chapter IV, the properties of' these classes of decision
rules are examined in detail in the context of the same two-sample
problems.
4
A
~1xed-sample-size
mined by; and will be
decision rule based on the sample Zn is deter-
ident~ied
with, a real-valued measurable f'unc-
(z ), 0 < $ (z ) < 1, which gives the probability of making
nn
- n n decision dO' say, when zn has been observed. Except in Sections 3
tion
$
and 4 of' Chapter IV,we will consider decision rules of' the form
(1.2)
$
=
(z )
n n
1
it
t (z )
n n
<
A
an
if'
tn(Zn)
A
0
if'
t n (z n )
=
>
n
n
,
)..
n
where the statistic t (z ) is a real-valued measurable function of
n
n
the observations, "'n is a constant, and an is a constant (0 ~ an $ 1).
•
Consider the statistic U (z ) defined by
n n
• ••
where for i
= 1,
(ji)
••• , k, x.
-J.,r
is a vector of r distinct elements
k-fold summation over the ranges of jl' ••• , jk.
Statistics of' the form (1.3) have been considered in general by
various authors, including Balmos
L-16J.
L- 8J,
Hoeftding
L-IO_7,
Lehmann
They are unbiased estimators of minimum variance tor the func-
tional (1.1); they have asymptotic normal distribution as (n l , ••• ,
tend to infinity, under quite general conditions. A variety of
~)
5
for k
= 2.
The general results obtained in Chapter III are especially
applicable to a sub-class of the class of statistics of the form (1.;),
namely, those for which the function
¢ takes
values zero or one only.
Various statistics belonging to this sub-class have been proposed for
non-parametric decision problems, and their properties studied in some
detail.
Among them are (1) the sign test for the median, see
L-11-7; (2)
see L-15-7;
Kendall's difference-sign rank correlation coefficient ~,
(3) the Wilcoxon-Mann-Whitney two-sample statistic, see
L-18_7, L-16J, L-25J; (4) other two-sample
discussed by Lehmann L-16_7, including in particular (5)
for example
•
L-6J, L-IOJ,
statistics
a statistic
for testing F·= G against F ~ G which has been further investigated
by Sundrum
L-26_7.
When it is assumed that
2
contains only continu-
ous distributions, these statistics are all based on functionale
which are equal to the probability of an event. Corresponding functionals are: (1) for testing whether the median 1s at x
= 0,
P(X < 0);
(2) for a bivariate random variable (X, Y), a linear function of
pL- (Xl
- X2 )(Yl - Y2)
> O.J;
for independent random variables X, Y,
(3) p(X < Y), and (5) P(X , X2 < Y , Y2 or Yl , Y2
l
l
< Xl' X2 ).
The following are additional examples of these statistics, not in
the SUb-class illustrated above, but of a closely related type.
(6)
A certain set of k-sample statistics which appears in the construction
of a rank-order one-way analysis of variance test, see Andrews
L-1-7,
is associated with the set of functionals (for equal sample sizes)
L
i:fj
P(X i < X ), j
J
= 1,
••• , k, where Xl' ••• , Xk are independent
6
random variables.
(7) Spearman's rank. correlation coefficient p (the
prOduct-moment correlation of the ranks) is a consistent estimator for
a functional of the form (1.1), which is a linear function of
> X2,
Y > Y ), where (X,Y) is a bivariate random variable. The
l
3
symmetrical kernel of this functional does not take the values zero
P(X I
and one only.
The rank correlation coefficient is a linear function
(With coefficients depending on n) of the "sample functional", and has
the same limiting distribution as the statistic of form (1.3); see
In Chapter II, we use Hoeffding's methods
<L-IO_7,
Section 5) to
obtain the variance and certain other properties of Un (Z n ) in the case
k = 2, and point out that the same methods may be applied for any k
(although the notation would become increasingly complicated).
Condi-
tions are given under which Un (Z n ) converges uniformly to its limiting normal distribution, for k ~ 2, with the same possibility of
generalization.
The principal result of Chapter III prOVides a method for obtaining
an asymptotic expression for an index of efficiency for families of
decision rules of the form (1.2) based on statistics (1.3), relative
to certain decision problems of the form
(f),
).
l
We may mention one further property of U (z ) which recommends its
n
'
O
ill
ill
n
consideration in connection with decision problems of the form
(~,
'
O
ill
l ). A statistic of the form (1.3) is equivalent or closely
ill
related to the "sample functional" e (Zn; 81' ••• , Sk)' where
•
7
s.~ = 8 l (x; -xi ,n. )
1.
= (no. of observations x .. < x,
lJ -
= L, ¢
(xl'
J ll
, ••• , xl'
J lr
j
; ••• ; ~j
= 1, ••• ,n)/n1 ;·
kl
, ... , ~. ) ,
J kr
and L denotes rk-fold summation over the ranges (1
indexes ji£ (i=l, ••• , k; £ = 1, ••• , r).
< Ji ,",<- n.)
1-
-
of the
e(Zn; 8 l , ••• ,8k ) reduces
to Un (z n ) if r = 1, and is a consistent, though not in general unbiased,
L-IO_7 points
estimator for e(F , ••• , F k ), if r ~ 2. Hoeffding
l
that the asymptotic distribution of Q(Zn i 8 1 , ••• , 8k )
out
is the same
as that of Un (Z n ).
3.
"Optimum" Decision Rules for
(cf>,
(l)0' (l)l)'
An important object of a general investigation of decision rules
based on U (z ) is to determine whether such decision rules have optin
n
mum or "good" properties, for use in connection With problems
(£), (l)0' (1)1)' Even
if
£) is taken to be a relatively small class of
distributions, very little is known about the relative "goodness" of
alternative decision procedures for non-parametric problems such as
those cited in the preceding section. We will summarize some of the
known "good" properties of statistics of the form (1.3), and then consider the question of "optimum" properties.
First, however, we introduce another class of statistics which is
important to this discussion.
Let s = L-min(nl, ••• ,nk)/r-l, where
8
L-uJ denotes
the greatest integer ~ u.
s-l
(1.4)
s W (z )
n n
= r. ¢
j=O
Let
(x(j) ••• , x..~(j)
-It,r
-l.r'
'
l ,
x(j) -- (x i(rj+l)' ••• , x
) i - 1, ..., k. Thus."W
W ~'ere -i,r
i(rj+r)'
n (Z n )
U'
is a sum of s independent and identically distributed random variables.
Statistics of the form (1.3) have been called U-statistics, and
we will call statistics of the form (1.4) W-statistics.
Consistency and unbiasedness are considered to be desirable properties of tests (decision rUles).
Lehmann
1:16_7 has
demonstrated
that two-sample U-statistics, for functionals (1.1) having
¢ equal
to
zero or one, may be used to construct consistent sequences of tests.
It is frequently possible to construct uniformly consistent sequences
of tests using U-statistics.
For the present purpose, it is sufficient
to say that a uniformly consistent sequence of tests (cf. L- 2_7) is a
{~n(zn)}
sequence
lim
n->
00
lim
Ii
->
sup
(F1' ••• ,Fk )€(1)O
~l,· .. ,Fk
4l
sup
(F1' ••• ,Fk )€(1)l
~l,· .. ,Fk
4l
n
(z )
n
=0
n (Zn) = 1,
is written to represent "n , ... ,~ -> 00". The
1
following statement is proved by an easy application of the Tchebycheff
where ten
->
00
such that
00"
inequality (see for· example Lemma 4.3 in Chapter IV):
= {Q:
Q~
o} ,
Q
(J.)l =
{e: e.2=
If (J.)O
Ql} , GO < el , and the variance of Un
tends to zero as min (nl' ••• '~) tends to infinity uniformly for
9
(F l , ••• , Fk )
tn
lim
n
= Un
is uniformly conSistent, i f
sup
->
€iJ , then a sequence of tests of the form
~
ex>
Q
O
< lim
(1.2) with
A ,
inf
n->oo
n
< Ql
In Lehmann's paper
L-16_7
it is further shown that tests based on
U-statistics ma.y not be unbiased.
If the kernel ~ of the functional
(1.1) takes only the values zero and one, however, it is alwa.ys possi-
ble to construct unbiased tests by using W-statistics which under this
condition have binomial distribution with parameters (s, Q).
We turn now to the question of optimum properties.
In this inves-
tigation, we consider the performance of tests under assumptions which
imply that
2
may be a very large class of distribution.
II and III, certain restrictions are imposed
on:f)
In Chapters
in order to insure
uniform convergence of the distribution of Un (Z n ) to its limiting
distribution. In Chapter IV, for detailed consideration of the properties of two-sample U- and W-statistics for the functional P(X < y),
it is assumed that
~ consists of pairs of continuous distributions
(but see Section 8 of that chapter).
Under such broad conditions, the class of admissible tests for a
given problem
(iJ,
(1)0' (1)1) will be very large; one cannot expect, for
example, to obtain uniformly most powerful tests. A criterion of
optimality proposed in this context by Hoeffding
L-ll_7,
applied successfully in one case, is the following.
•
which he there
A test
$
n
is said
to maximize the minimum power with respect to wl among tests of fixed
level of significance Q for testing H : Q(F , •• , F ) € (1)0 i f
O
k
l
10
for all ~~(zn) satisfying
where i) 0' £)1 are the subsets of
!iJ corresponding to
0, 001 of 00. (Recall that we defined
making decision dO.)
00
For testing HO:
P(X < 0)
~
~n(zn)
the subsets
to be the probability of
p(X < 0)
QO against Rl :
~
Ql' 80 < el ,
the statistic for the sign test is the proportion of the observations
Xl' ••• , xn less than zero.
This is Hoeffding's case, mentioned above;
in L-ll_7 it is shown that a test of the form (1.3) based on this statistic maximizes the minimum power.
A further successful application of this criterion of optimality
is included in Chapter IV, where it is found that a two-sample test
based on the W-statistic associated with P(X
mum power with respect to
P(X < y) ~ eO' where (.01
< Y)
maximizes the mini-
1 among tests of leval a for testing
00
Q ~ Ql \ ' 81 > eO. This result is
suggested by the observation that the problem stated is equivalent
= {e:
to testing the hypothesis that P(z < 0) ~ QO' where Z ;:: X - Y.
The
U-statistic for P(X < Y) does not in general provide tests which
maximize the minimum power for testing e
~.
QO.
Nor has it been
pes~i­
ble to find a test having this property for testing the usual twosample null hypothesis that the distributions·.ct' X and Y are identical.
li
The small-sample exact distribution of the U-statistic for P(X < Y),
that is, the Wilcoxon statistic, has been examined in some detail under
the assumption that X and Y have continuous distributions.
Sections
2 to 4 of Chapter IV summarize the few results which it has been possible to obtain.
Except for certain significance levels and certain
very small sample sizes, it appears unlikely that it will be possible
to find optimum tests according to the criterion stated, when the class
[j) is
a large class.
This is true even i f we try only to find tests
having optimum properties among the class of rank order tests.
This
investigation, in summary, makes negative rather than positive contributions to the attempt to determine "good" properties of tests based
on U-statistics.
Results may be obtained if the class
The work of I. R. Savage
L-2lJ
~
is severely restricted.
is an example; Savage finds most
powerful rank order tests tor testing the hypothesis that two samples
come from the same distribution, with
£)
= {(F, G):
Savage t s methods may be applied for any class
£J
G =: Fa, a
> 0}
for which the
probabilities of possible rankings in a sample may be calculated in
manageable form; the methods will be successfully applied if these
probabilities imply an ordering of the possible rankings.
•
CHAPTER II
SOME CONVERGENCE THEOREMS.
1.
Introduction.
In this chapter, we consider a class of statistics with asymp-
totic normal distribution, namely the class of U-statistics, whose
properties have been investigated by Roettding
£)oJ.
Theorem 2.1
gives conditions under which the distribution of a U-statistic converges uniformly to the normal distribution as sample size tends to
infinity.
In Section 3, a two-sample U-statistic is defined and cer-
tain properties ot its variance are derived, by methods analogous to
those of Section 5 of ~lo_7. Theorem 2.2 (Section 4) is the twosample analog of Theorem 2.1.
The two-sample U-statistic is an obvious generalization ot the
single-sample statistic.
cases by various authors
Its variance has been found tor certain
£.
20, ' 26
_7.
Lehmann [l6J has shown that
its distribution is asymptotically normal.
It is evident that the
methods employed in Sections 3 and 4 of this chapter could be extended
to obtain the analogous results for k-sample U-statistics.
Theorems 2.1 and 2.2 are required for the proofs of Theorems 3.3
and }.4 ot Chapter III.
Theorem 2.3 (Section 5 of this chapter) shows
bow Theorem 2.1 or 2.2 may be used to obtain bounds for the distributions, and is used in Chapter IV.
2.
Unitorm Convergence:
Single-Sample Case.
Given a edt F(x), let the functional
13
be defined for that class i) of cdt's for which the integral exists
and equals the repeated integral. We will assume that the kernel
¢(x1 ' ... , xr ) 1s symmetric in its r arguments. F may be a univariate or multivariate distribution.
Let X be a random variable having cdt F(x), F€
x
= (xl'
n
~
r.
B.
Let
••• , xn)be a sample of n independent observations on X,
The U-statistic associated with the functional Q(F) is de-
fined by
U (x)
n -
=( rn)
-1
I. ¢(xJ.' , ••• , xl.' ),
1
r
where summation is taken over all sets of r distinct integers,
1
~
1
1
< ••• < i r
~
n.
It is obvious that
EFUn(X) = Q(F) •
Hoeffding
LIo_7 makes
the following definitions:
c
= 1,
••• , r,
11°= 0,
and
... , XC ) •
(2.4)
J
and derives
2
o (un )
Let
(2.6)
= VarFUn(X)
•
14
Hoeffding shows that Y has l~iting (standard) normal distribution,
n
provided t > 0 and t < 00. By a slight modification of his proof
l
r
[LO _7, we will show that this convergence is uniform in any class of
distributions F having
t l bounded away from zero and t r ,
~3
bounded,
where
Consider the random variable
':i:
1 (X);
- 2
E ~ (X) = O,E 1: 1 (X) = t 1 •
Let
. (2.8)
If 0 <
.1l u)
t l < 00, then clearly the distribution of Wn converges to
u
= (21t)-1/2
J
~t2
exp { -
J
dt as n tends to infinity. Fur-
-00
thermore, letting Hn (u) = PF(Wn< u), we may apply a theorem of
A. C. Berry ~5 _7 to obtain:
Lemma 2.1.
If
t l > 0,
~3
<
00,
then
where C is an absolute constant.
Lemma 2.2.
Choose €>O.
If
t1 > 0, t r < 00,
r
~
2, then
where B is a constant depending only on r, given in (2.14).
Proof of Lemma 2.2. From (2.6) and (2.8) and the TchebYCheff-Bienaym~
inequality, we obtain
(2.10)
p(
IYn - wi>
n
€)
-<
~2 (1
€
- EY W ).
nn·
15
We have by an easy calculation
t 1/2
EY W = r(..!)
n n
/o(U).
n
n
We will show that
(2.11)
0< o(U )
n
Then
and we substitute (2.12) in (2.10) to obtain (2.9).
establish (2.11).
It remains to
The first inequality of (2.11) is given in Theorem
5.2 of 1J0·~7. From Theorem 5.1 of the same paper, we obtain
(c
= 1,
2, ••• , r)
and substitute in (2.5) to obtain
(2.13)
where
B
= sup
b
n>r
n
(r)
The second inequality of (2.11) follows from (2.13). We assume
r
=
>2
-
because Y : W for
1.
n
n
It remains to establish
(2.14)
B
=
4
if
r
=2
12.6
if
r
=3
if
r
>4
r(r-l)
2
•
16
We have
,
b (2)
n
=
2n/(n-l)
bn (3)
=
6n(2n-5)/(n-l)(n-2),
and the first two parts of (2.14) are obtained by maximizing bn (r)
with respect to integral values of n ~ r, r = 2, 3.
To establish the last Part of (2.14), we first show that r(r _ 1)2
is an upper bound for b (r) when n
n
= nr -< r(r - 1)
we have b (r)
n
2
j
>
r > 4.
-
for n
b (r) < r(r - 1)2 by induction on r.
n
as n
->
-
- 1, r
> 4,
- -
Finally, b (r)
n
Let Xl' X2 , •••
we show
---> r(r • 1)2
be distributed according to F(x).
Y~ and w~ be functions of Xl' ... , Xn and
such that as n
->
PF(W~
and
PF ( IW~
for any e
-
00.
Theorem 2.0.
Let
> 2r
For r < n < 2(r - 1),
> O.
Ca
class of cdr1s F
co
(u)
uniformly for
Fee
- Y~ I > e) ->
0 uniformly for
Fee.
:::
u)
-> 2
•
Then as n
PF(Y~ :s
u)
->
00
-> ~
(u)
uniformly for Fee.
17
Proof:
Choose e
PF(Y'n< u)
> o.
Now
= PF(Y'n<-u.'
IY'n - W'n I
> e)
+ PF(Y'n< u, IYln • w'l
< e).
n-
But
PF(Wt < u.e) - P ( Iyl.w' I
F
nn n
> e)
< PF(Y' < u, ly'.w'l<e)<P.F(W'< ,*e),
nn n- n-
and
o -<pF(ytn<u.
-'
IY' .wII
n
n
> e)
<PF(IY'
n
-w'l
>
n
e:).
Thus
< u - e) - PF ( IY'n • WIn I
PF(W'n(2.15)
We have, therefore, for any e
> e)
< u)
-< PF(Y'n-
>0
I ( u - e) - 5 n (F) ~ PF(Y~ ~ u)
where 5 (F), 5' (F)
n
n
->
0 as n
->
00
:$ ..F(u + e) + 5~(F)
uniformly for Fe
G
The theorem follows by uniform continuity of I(u).
As a corollary to Theorem 2.0, we have
Theorem 2.1.
as n
->
tl 2
a
Proof;
>
00,
Let Y be given by (2.6), r
n
> 2.
PF(Y
n
:$ u) -> -F(u)
uniformly in a class J) of cdf's F(x) such that
0, ~3 ~ M, ~r
:$
M
<
00.
Let W be given by (2.6).
n
By Lemmas 2.1 and 2.2, the condi-
tions of this theorem are sufficient for the uniform convergence
hypotheses of Theorem 2.0.
3. Properties of Two.Sample U-Statistic.
Consider the functional
(2.16)
Q(F ,G)
=
J... S ~(xl'
18
... , Xri
Yl'.'"
yr)dF(x1 ) ...dG(yr )
defined for that class of pairs (F,G) of cdt t s such that the integral
exists. We suppose that
~
arguments.
... ,
Let! = (Xl'
is symmetric in each of its two sets of r
not smaller than r, and
where summation is extended over all sets of integers
1 -< i
l
< ••• < i r -< nand
1 -< J
l
< ••• <
j
< m.
r-
Given (m + n) independent random variables Xl' ••• , Xn ,
Yl , ••• , Ym, distributed according to F(x) and G(y), respectively, we
develop the properties of U analogous to those which are known for
mn
the single-sample statistic Un •
We define
Y
(Xl' ••• ,x ; Yl ,·· .,Yr )
r
= 11(xl' ... ,xr ;
(c, d
and
and obtain
= 0,
Yl' .. •.. yr) - Q,
1, ""
r)
19
The following lemmas are two-sample analogs of certain of the results
of Section 5 of [i.o
7.
First, it will be convenient to define
for c,d
= 1, 2, ••• ,
r.
The
~cd
are non-negative, since we have
We have then, from (2.20), an alternative expression for the variance:
,2(Umn
l=C)
+(D)
r
Lemma 2.3.
-1
-1
C)
C~l(r)
c
-1
51d~lC)CJC)CJ~Cd
(n_r·)t
co
r-c
+( m)
r
-1
~ (r)( m-r)t Od
d=l d
r-d
Let
Then
t
c
cd
=2::
d (·C.,(d)
I.
i=O j=O
I
i I
j
8,
iJ
and
8 c d>
- 0,
o$
c,d $ r.
The method of proof is analogous to that given for Lemma 5.1 of
[)OJ.
20
Lemma 2.4.
t Od
d
If 1
< d < h < r,
-< c -< g -< r, 1 _....-
t Oh
t co
::::"11' c::::
t gO
g'
Tlcd
then
TlDh
Cd $~ •
The proof of Lemma 2.4 follows from application of Lemma 2.3.
Now, applying Lemma 2.4 to (2.22), we obtain:
2
a2(u ) < r
+ r..
+ r
mn - iiiii 'lrr Ii ~rO iii
t
~Or'
(2.24)
As a further consequence of Lemma 2.4, we have the following
two-sample analog of the relation given in (2.13):
Lemma 2.6. If n < m, then
where B depends only on r;
if r
-> 2.
If n
4.
> m,
B
=1
if r
=1
and is given by (2.14)
then n is replaced by m in the right-hand side of (2.25).
Uniform Convergence:
Two-Sample Case.
Consider the functional (2.16) and the U-statistic (2.17) assoLet Z = (Xl, ••• ,X ; Yl, ••• ,Y ) be a vector of inden
m
pendent random variables Xi' Yj distributed according to F(x) and G(y),
ciated with it.
respectively.
Proceeding as in Section 2 of this
the random variables
~lO(X) and
.\Ff OleYl.
chapter~
consider
21
Let
~3 = E 1'flO(X) 13
,
= E IY- Ol (Y) 1:5
·
(2.26)
v3
Let
= (uInn
ymn
- Q)/a(uInn ).
We will show that the distribution of Ymn converges to ..J:.
ni(u) when
m"n
->
00
(m/(m + n)
->
P" 0 < p < 1)" uniformly in the class of
pairs of distributions (F"G) having
~3"
v ,
3
t rr bounded and at least
one of t lO " t Ol bounded away from zero.
Let
EM
mn
2
EWmn
= 0,
Let Hmn (u) = PFG(Wmn
Bergstr'om £4 _7, we have
Lemma 2.7.
If t lO + t ol
> 0,
sup
-00
<
u
<
~
00
u).
~3
= 1.
Then, applying a result of
< 00"
v, <
00"
then
IHmn(u) - I(u) I ~ C'~
where C' is an absolute constant (C' < 4.8) and
22
By a method analogous to that followed in the proof of Lemma
2.2, and applying Lemma 2.6, we obtain
Lemma 2.8.
Let
p(
where B
=1
> O.
E
If
IY
_ wi>
if r
=1
mn
mn
t lO + t Ol > 0, t rr < 00, and n S m, then
E)
< 11-
B
- n e2 ~
metrr - rt 10 - rt61 )
,
mt lO + nt Ol
and B is given by (2.14) if r
~
2.
The following theorem is a corollary of an obvious two-sample
analog of Theorem 2.0.
Theorem 2.2.
m,n
->
that
00,
Let Ymn be given by (2.27). PFG{Ymn S u) -> ..F{u) as
uniformly in a class!i) of distributions (F ,G) such
t lO + t Ol
m/(m +
Proof:
n)
~
a
-> p,
>0
0
and 1J.
3
< M, "3 < M, trr < 1'4 < 00,
provided
< p < 1.
Without loss of generality, we may assume n
S m and 21 S p < l.
Let Wmn be given by (2.28). By Lemmas 2.7 and 2.8, the uniform convergence conditions of Theorem 2.0 are satisfied for (F,G)e
.2J .
This theorem could be proved using Berryts bound in Lemma 2.7.
However, Lemma 2.7 will be used to obtain bounds on the deviation
from normality of P(Y < u), and the Bergstrom bound is generally
mn better.
5. Bounds for the Deviation from Normality.
Examination of the proofs of Theorems 2.1 and 2.2 and of the
lemmas on which they are based enables us to obtain bounds on the
difference between the distribution of a U-statistic and the normal
approximation to it.
In this section, we illustrate the method of
obtaining bounds for a two-sample statistic in the case where the kernel
¢ of the functional Q(F,G) takes the values zero and one only.
23
Let
Theorem 2.,.
!emn(u;
It '/J
= 0,1
only and t10 +
t Ol > 0,
then tor n ~ m,
F,G)I 1s bounded by
n
-00
where
P(U,E)
lui
it
<4.8,
1~.6
f
B =
4
r(r_l)2
Proot:
e
lui>
E ,
and
1
(2.,1)
~
=
\.... exp
and C'
j~ ,
< u < 00,
. (1
(2.,O)
-1
=1
it
r
it
r
it
r
it
r ~
=2
=3
4.
The conditions ot Lemmas 2.7 and 2.8 are satisfied.
two-sample analog of
(2.15) and Lemma 2.7, we have tor any
T(u_e) _ C'l'
..£.
JUl.
E
>0
- b
< P{Ymn -< u) -<: ...J:.
T(u+e) + C'T' + b ,
mn mn
mn
where C t,.,' are given in Lemma 2.7 and
mn
bmn
From a
= P( IYmn - wi>
e)
mn
24
has the upper bound given by Lemma 2.8.
Now, for e
~
I2iII(u
>0
e)• .F(u) 15
Thus, from (2.32), for any
Ie
mn
(u; F,G) I
(~.29)
The bound
the bound for b
su:!? exp { It-ulSe
E
€
~
= ep(u,e).
>0
< e(23f)-1/2p(u,e)
-
+
Cl.,,'111+
is obtained by sUbstituting
Inn
2
t }
1~
b
mn
•
from Lemma 2.7 and
from Lemma 2.8, in (2.33).
Remarks.
(1)
To use a similar bound in the one-sample case, we would
need a numerical bound for the constant C of Lemma 2.1.
(c
$
1. 88) given by Berry
P. L. Heu
(2)
£14 J,
L 5 _7 has
been disputed.
or a footnote by K. L. Chung in £7
The bound (2.29) can be improved;
relatively simplified.
The value
See, for example,
_7, p.
201.
the form given here is
In particular, the bound given in Lemma 2.8
can be improved, since the inequality of Lemma 2.6 can be made more
precise. For the case r
(3)
= 1,
see Chapter IV, Section 6.
Lemma 2.7 could be replaced by an alternative lemma based
on Berry's theorem.
In a
particu~ar
case, it would be advisable to
investigate both bounds to see which is better.
(4) For the case of the Wilcoxon statistic. Stoker
L22,23J
has made a very extensive investigation of the bound for deviation
from normality.
here.
His method is basically the same as that employed
Be mentions that the method can be applied to the more general
case considered in this section.
25
Stoker
L-22_7
gives a bound similar to (2.29), for the Wilcoxon
statistic only, using Berry's theorem.
He also gives an improved
bound valid for the tails of the distribution.
(5)
An upper bound (C
< 3.2) has been given by Bergstr8m L-3_7
for the constant of Berry's theorem.
CHAPTER III
ASYMPTOTIC EFFICIENCY.
1.
Introduction.
The efficiency of a test procedure for a particular decision
problem may be defined.
In this chapter, an asymptotic expression for
an index of efficiency is derived under conditions sufficiently general
to include certain classes of tests based on U-statistics.
In part1-
cUlar, a method is given for calculating the asymptotic efficiency of
a decision rule based on a U-statistic relative to a decision rule
based on a related statistic which has binomial distribution.
The concept of efficiency used here is that developed by
Pitman and generalized by Hoeffding and the author in ["13
also the references cited there.
Jj
see
The definition given in Section 2 is
extended slightly to cover k-sample decision procedures.
The asymptotic results are developed in Sections 3, 4, 5 and 7
and examples are given in Sections 6 and 8.
2.
Definition of Efficiency.
where the n i are integers (i = l, ••• ,k);
be the set of k-vectors whose components are positive in-
Let n
and let
n
= (nl, ••• ,nk )
tegers.
Let Xi
(i
= 1, ••• ,k).
= (Xil' ••• 'Xini)
Let Zn
be distributed according to Gni(Xi ),
= (Xl' ••• ,~)
be dj.stributed according to
k
F (z ) .= 'IIl G
n
n
-
n
(xi) ,
i
nE
1~
.
Let the decision problem be given by the classes ( J):m
nE
12,
'1
5'
of distribution functions Fn' and disjoint subclasses J) 10'
27
:J)2n of
oDn
Fn is in
cBin
such that we prefer alternative Ai when the distribution
(i
= 1,
2).
Generally, £)10 and
0820
will not exhaust
eB n , but are chosen so that we are indifferent between the two alternatives if Fn is in neither
Let
tions.
$
n
0810 nor
oE)2n.
be a decision rule based on a sample z
Let 41
be defined for n in some subset of
n
from k
n
12
popula.~
e.g." all
j
vectors n whose components are not less than some integer
or all
rj
vectors n whose components are equal (equal sample sizes).
Let ~ be a family of decision rules 41 , and let
n
subset of ~ for which the family
12( J")
be the
tr is defined.
Let P(A ,41 IF ) represent the probability that a decision rule
i n n
$n from
0, based on zn"
will select alternative Ai (i = 1, 2)
when zn is distributed according to Fn.
(cf.
[i3 J)
solved by
0
that the problem (
Given 0:1
> 0"
t £J 1o} , {£) 20 ~
if we can find a test in the family
when
F £
n
0:
2
> 0,
, 0: ,
1
0"' such
£)in
0:
2 ) is
that
(i
= 1,
There are several possible ways of defining an index N{
of efficiency for the family ~ relative to this problem.
we say
2).
'J )
We require
an index representing in an appropriate sense the IIleast" among the
set of vectors n for which the inequalities (3.1) are satisfied by
some test 41
n
in
Let17Z (
f-.t
cJ •
J ) C 17( j' ) be
that set of vectors n for which
(3.1) holds.
Let
empty" let
-ftt
C
= (cl'u.,c ) where c
k
i
~ 0, I.e i = 1. If
m( J
) is
not
28
If we suppose that cost of sampling is proportional to sample size and
that ci/c
is the ratio of unit sampling costs for the i-th population
J
relative to the J-th population" then N (
may be interpreted as an
e
J)
1ndex of efficiency.
Observe that N
e
To determine Ne <
Let
a)"
<:;)
is not in general an integer.
we proceed as in Section 3 ot
::;: be the sub.family of
J
[i3 J :
containing tests 4l n based on zn and
having
Let
M($ )
n
Then
M
n
171 ( J)
.
= FEr;[)
sup
n
P(A l ,,$
n
2n
IFn )
is that set of vectors n for which M
< an
n -
< 02 or M($n) = CX2 for some $nE
'J n*.
and either
t;
Nc(:J) is then determined
from (3.2).
We might further define
N(
3')
= max
CE
where
C
e
stands for the set of all vectors c (With non-negative compo-
nents adding to one).
The following are a few immediate properties of these indexes.
m(ty)
(I)
If
n1
~ nO (1
is determined by an integer nO and the inequalities
= l, ••• "k),
then for every
CE
c:
(II)
N(
J )<
-
n€
max(n , ... ,~).
1
emmin( ':J)
Let us suppose in the rest of this section that
contains, with a point n, every n' such that n'
~
VJl! ( j)
n, that is,
n ~ n i (i = l, ••• ,k). Then, i f we let mn = max(nl' ••• '~) we have
(m , ... ,m )e l1e(
whenever ne
(J).
n
n
1
'1)
1ll
Let s be the least integer such that the point (s, ••• , s) E: '1lI (y> •
Then it is clear that
s
=
min
m<y >
nE:
mn
and (II) may be stated:
N(
Let
1/1. (J ) denote
J> S s.
We will say (for brevity) that
of
1n ( 1 )
(III)
If
nt E
9ll ( t;f)
tJll ( Ij)
k
'7/1 ( 'j
N( t'j)
in
'?[, ( y).
is convex when the convex hull
in E contains no point of ~ (
m ( ::r ) is convex,
We say
tJJl ( !J)
the complement of
>s
tf ).
- 1-
) is symmetrical if nE
% ( ~)
implies
for any n' obtained by a permutation of the coordinates
of n.
(IV)
If'
m ( PJ)
is symmetrical,
N(
J) = NCo( ' ] )
where
Co = (l/k, ••• ,l/k).
These properties of' the index N(
J)
suggest that under fairly
reasonable conditions, a "most economicalll procedure is approximated
30
through taking samples of equal size from each of the k populations,
when we have no information about the cost of sampling.
We may make some remarks as to the conditions under which Properties (I) - (IV) hold.
(I)
The condition of Property (I) is satisfied for families of
tests based on the W-statistics introduced in Chapter I, Section 2.
(2)
It would not be reasonable to suppose that there might be
for any of the k samples a greatest sample size such that (3.1) holds.
We assume
conversely
that once a set of sample sizes has been found
such that (3.1) holds, all vectors corresponding to sets of larger
sample sizes will also be in
m( J ).
Any family:5 containing only
sequences {Q> n ~ of tests whose power functions are decreasing functions of the sample sizes will satisfy this condition.
(3)
~(~) is symmetrical, and the condition for Property (IV)
is satisfied, if the tests
<l>
n
in
J and the classes i)n of distributions
A
have appropriate symmetry properties.
For example, a family
ttl of
tests based on the Wilcoxon two-sample statistic is defined in Chapter
IV, Section 6, and has ~
A
(1l) symmetrical under conditions stated in
Lemma 4.5 of that section.
(4)
Property (II) always holds; and under the condition dis-
cussed in (2), we have Property (II) in the form N(~) ~.s, where s
is the smallest integer for which a vector representing equal sample
sizes s is contained in
IJll (J).
If ~ (J) is also convex, so that
Property (III) holds, then taking samples of equal size s is very
close to a "most economical" procedure when we can make no assumptions
about the cost of making observations.
31
(5)
ing.
An example of a family ~ having
6fIl( tf) convex is the follow-
Consider the usual test for testing the difference between the
means of two normally distributed random variables, Xl and X2 , having
variance one and means
Let
J
jJ.
and IJ. + 0 (-oo
< IJ. < 00,
I:) ~
0), respectively.
<I>
1. defined as follows.
1 n l ,n2 )
be the family of tests [
Let
nln,.., ) 1/2
t n1n2
= ( nl+~2
(x2 - Xl)
where n l , n 2 are the two sample sizes and Xi is the mean of n obseri
vations on the random variable X. (i = 1, 2).
J.
I:) ~
against H':
=1
<0
81' we use the test of level 0;
{:
where ~ ("'-0;)
For testing H: 8
if
t
if
t
>}..
nl n2 :,,:,
n n
l 2
?'
< "'-Q
- a, ] is the standard normal distribution function. .
The power function of this test is
Now
1n(:J)
is the set of pairs of integers (n ,n ) such that
l 2
~(5I' n I , n 2 ) ~ 1 - 13.
Equivalently,
m
(J) is given by
•
If n l ,n2 are allowed to vary continuously, we obtain a set ~ whose
32
boundary is one branch of a hyperbola.
be contained in
11 ,
but every pair of integers (n ,n )
l 2
If a second family
~I
J
,
of tests is given, then we define s1mi.I
larly the indexes of efficiency N (:J
c
JI
)
'J I
and N(
) for the family
The relative efficiency of :; I with respect to
•
1f/() ) must
in M is in 1J1. (J).
The convex hull of
J
is given by
for a fixed vector c.
If' we desire a measure of relative efficiency independent of
the cost of sampling, we may use
eff(O)(
'1 11 J)
J
Suppose a family of tests
/
~
= N(
)/N( "1 ).
is Partitioned according to a
parameter p (which may be a vector) having values in some 'set
P. .We may compute N (
c
r;p ) for each sub-family
'J p
of
'::J
and then
we have
N
C
(~
) ... :lnf
p€p
3. Asymptotic Expression for Nc (
N
c
(
J p ).
J ).
Let Q(F ) be a real-valued function defined for all F €
n
n
and for every n€
t?J1.. •
in an interval
(finite or infinite).
in
0),
"1
<
0)
Assume that the function Q(F ) takes on values
n
Let "1 and "2 be two numbers
O2 , and let
B ln =
{Fn : O{Fn)
c8 2n = {Fn
g(Fn )
~
"1 }
~ "2
}
Q and 02 are assumed to be independent of n.
l
c8n
•
33
:J
Let
be that family of tests based on the sequence of
{t ~
n
statistics
,with the probability 01' selecting Al when zn is
observed given by
if
t
(3.4)
< A
n-
otherwise,
-00
< A < 00;
n(:s ), and $2n' the probability of selecting A2 ,
ne
given by 1 - $10.
We will consider the as;ymptotic behavior of N (
c
&
= 02 - 01
->
J ) as
0, with 01' 0;1' 02 fixed.
Assumption A.l:
The functions t (z ) are independent of 8.
n
n
x IF
n
Assumption h.2:
lim
X
->
P( t
infJ)
FnE
00
10
<
n -
}>1
..
°1 •
This ensures that A is finite, when A is defined as
n
n
the
smallest number which satisfies
FnE
.
F
j)
OV
In
m(J ) given by the set of vectors n for which
We have, then,
Nc (
petn< An IFn ) -> 1 .. 01.
inti:\
n
E
°
petn< An IFn ) -< 2 -
supL\
CC-/
2n
is then determined from (3.2).
Assumption A_3:
Let
~ n (0')
01 + 02
== {F
n
< 1.
: Q{F n ) = Q' I.
J
for Q'eoo. ,
11 ( J )
.4: For every n€
and every Oe 00, there
C\* (Q) of
exists a distribution function H* (x,Q) and a subset ~
Assumption
h
n
cf) n (0),
such that
n
pet
< ~l
n-
F )
n
= H*n (x,Q)
when
*
where for every x Hn(x,Q)
1s continuous on the right in Qat Q = Ql.
Assumptions A.l - A.4 are essentially the same as assumptions
A through D of [i3
-
7 and are
needed first to prove that N (:::5)->00
c
as 8 --;. O.
Lemma 3.1:
nE:
CE
77. ( J)
If Assumptions A.l - A.4 are satisfied and for every
C , Nc ( :J' ) -> 00
Proof:
0')
there exists n' E VJZ. (
->
as 8
such that n'
> n,
then for any
O.
The assumptions give us
sup
FnE
CI
oCI
pet
su~
thus n 1
;.
If1Z (:J ) and
n
IFn ) -> H*n (An ,Q2)'
17 (:J ) we
'Il. ( J' ) with n < n
Fn €
A
2n
Now, for any n' E:
for every nE
<
n-
2n
N
c
l
•
can choose 8
Hence, if 8
1
so small that if
< 8 1 and n ~ n',
p( t n -< An IF>
"2 j
n
(:J ) > min(ni, ... ,nk).
This completes the
proof.
Assumption
that nE
'Il (j )
A~5:
There exist positive integers a , ••• ,8 such
k
l
if and only i t n = simi i = 1, ... ,k, m = 1,2, ••• •
i
(The following is unchanged if we require m ~ r
> 1.)
Thus, we may replace the subscript n by the subscript m.
35
k
~ cin.
Now, for a > 0, d > 0, let I
=
m,c
i=l
lJ m,c (a,d) = {Fm :
Assumption A.6:
lim
Assumption
=
Q(F ) > Q + I-a
m -
l
m,c
sup
lL\
€ C1V
(
~
0,
a,d )
C
CE
i
i=l
d
J~
There exist numbers a > 0, dO
function Bc (d) defined for d
m -> 00 F
~
k
m ~ cis ' and
~ 0,
,such that if d
pet < A IF)
m- m
m
and a
~
dO'
=Hc(d).
m
m,c
A. 7:
Hc Cd) is a continuous, everywhere decreasing
function and
(i)
Hc(d) = 0,
lim
d->oo
(11)
Bc(dO) >02
•
By Assumption A.7, the equation in D, He(D)
= 02'
has a unique
root Dc = Dc (01,02) > dO·
Theorem 3.1:
Let.J be given by (3.4).
A.7 are satisfied, then for Q and
l
CE
e-
If Assumptions A.I -
fixed and {) - > 0, we have
asymptotically
•
Proof:
Since D > dO > 0, we may choose E > 0 so that
c
E
< Dc - dO.
-
Let
., = min
{ Bc(De ) - ac(De + e),
By Assumption A.7, '1 > O.
Let
Bc(De - e) - Be(De ) } •
36
n (d)
'In
=
F €
n
sup
tEJ
<
pet
(a d)
kiF ) • H (d).
m -"m
m,c'
m
c
By (3.5) and Assumption A.6, we may choose m so large that
l
(3.7)
We may assume that
if m > ml •
(3.8)
Since N
c
(J ) ->
as 5
00
->
0, we may choose 51 so that
By Assumption A.5, there exists an integer m such that
n€
172
lmo'c
(j )
= ~=l
only if m ~ mO.
m( :J)
Hence n€
O
implies N (
c
c i simO·
The theorem will be proved if we can show that for 5
D
C
By (3.8),
- E
a
<
5N <
c-
(3.9) this will hold
(D
(3.10)
c
c
< 8 1 and
if 5
- e)N- a <8 < (D
c
-
-
c
+ e){N
c
_ l)-a.
First, we have, by Assumption A. 5
if
sup
FE.£)
m
Now, suppose 8
2m
pet < A IF)
m- m m
if
< 8 1 and
8
<
(D
c
_ e)N- a
c
< 51
+ E)( 1 + E).
(D
We establish (3.10)by contradiction.
(3.12)
0' ) =
•
This implies
Thus (3.12) implies, if m = mO'
F €.
m
sup
08
P( t
2m
>
-
<).,
IF)
a,D
-E
m-
m
m
sup
F
(i"\
E rI:..J
m
p( t
(
m,c
C
)
<).,
m-
m
IF)
m
Hence (3.12) implies
(}.13)
F €
m
sup
rf)
2m
pet <)., IF)
m- m m
which contradicts (3.11).
> 0:2 ,
if m = mO '
The second inequality of ('.10) 1s estab-
lished in similar fashion.
Remarks.
Theorem 3.1 holds also for the family ot tests based on
(1)
{tn~
with
4l
(2)
ln
The class
=
1
it
t
arbitrary
it
t
o
it
t
n
n
n
<).,
=).,
>)., •
of) m,c (a,d), defined on the basis of the func-
tion f.-a , might be replaced by
m,c
£J m,c (K,d)
=
{F
m
: Q(F )
> Ql
m -
+ dK(/
m,c
)Jl,
38
where K(l
} is a strictly decreasing function tending to zero as
m,c
m -> 00. We would have in this case
where K- l is the inverse function.
Assumption A.4 is stated in a form convenient for veritica-
(3)
tion in the applications with which we are concerned at present. A.4
may be replaced (see ~13_7, p.56 ) by any assumption sufficient to
prove Lemma 3.1.
Theorem 3.1 is proved essentially by a method used by
(4)
Hoettding in L12_7 to prove a less general theorem.
slightly more general than Theorem
•
4.1
of
Theorem 3.1 is
L 13 _7,
but Assump-
tion A.6 does not immediately lend itself to verification •
(5) In the foregoing, we have suppressed a subscript
s
= (sl, ••• ,sk)
tamily
J
which would indicate the partition
given by (3.4) with
(:J s ) for
Thus Theorem 3.1 gives us
such a partition, and Nc (
N
C
(J ) = min
SES
where S is the set ot all vectors
relatively prime integers.
ot a
1L (.J ) equal to the set ot vectors
with positive integer components.
Nc
{J s }
J ) is
found trom
N (~ ) ,
C
S
e containine at least one pair of
In general, of course, D (al'<l2) will dec
pend on s.
4. A Class of Statistics tor which Theorem 3.1 Applies.
Let
1 = l(Fn ) = (gl' ••• '~h)
Sj
with £(F )e
- n
n
be a vector ot real-valued functions
= Sj(Fn ),
when F
n
E
i) n ,
j
nc::
= l, ••• ,h,
12 .
39
n(::r )
We will continue to assume A.5" so that
may be
indexed equivalently by the vector n and the integer m.
Assumption A.8:
There exist e(F ) and a family of continuous
-
-
cdt's Hn (x" Q"j) defined for every Q E
that for every real x
n
IDJ
1
when Q(Fm) = Q and HFm) = e; and Em(xJFm)
formly for FmE f) m.
Assumption A.9:
defined for d
~
0"
CE
There exist a
c:
> OJ
E
.n " nf:: 17 (J )"
->
0 as m ->
00
such
uni-
dO ~ 0" and a function Hc(d)
J such that when d
~
dO"
EVidently" AssumptioIBA.8 and A.9 are sufficient for Assumption
A.6.
We will be Particularly concerned with cases where t n has normal
limiting distribution. In the following, the function He(d) and the
number Dc
= Dc (ol"a2 )
are evaluated for a class of statistics includ-
ing some U-statistics.
There exist l(Fn )" functions m(O) and
> 0 defined for 0 E (J) and eE Il "nE
(J)" and every-
Assumption A'.8:
s
n
(O,,~)
-
-
17.
where increasing functions gn (tn ) such that for every real x
when O(Fm) = Q and 1(Fm)
formly for FmE £) m.
=1;
and Em(x"Fm)
->
0 as m ->
00,
uni-
40
If Assumption AI.8 is satisfied, then clearly Assumption A.8
holds with
Lemma 3.2:
If Assumption A'.8 is satisfied and 01
<~, then
there exists m such that
l
Proof:
1
From AI .8 and the definition of ,. , we have
n
<
.
- al -
Fm€
p( t
f
1ny\
TJ{)
1m
<
m-
A IF)
<
- tl.
m - ~
m
-m
T
(A )-m(Q
m
1
S (n A)
>]
m '-poi
-
A
+
where
~m€ f) m(Ol>
€
(A ;F )
m Ill.
has ..i<~m) =
m
1.
Clearly, if a l
< 1/2,
we may find
m so large that
l
1 -
a1
-
I e:m(Am,F"m) I > 1/2
and hence the argument of
Now, let
when Q(F )
n
.I must
be positive for m > m •
l
.11.(0) be the subset of
such that ~(F)e:
-
n
n
(0)
= o.
Assumption At .9: There exist a
Oe: co, c€
Il
> 0,
sc{Q»
0 such that for every
C ,
t.:m,c
Assumption A" .9:
s
(O,s)
-
m
= s c (Q).
The function m(O) is non-decreasing for
Oe:a>, and has a continuous and positive derivative at Q
= 01 •
The
41
function sc(Q) is bounded for Q E ID and is continuous at Q = 01.
1
Furthermore, there exists ~ > 0 such that L-s (Q)-7m(Q)_7
c
is a non-increasing function of Q for Q E ID,if m(Ql) < u < m(91 ) +
L-u -
We remark that examination of the proofs below shows that the
continuity requirements of Assumption A".9 could be stated in weaker
forms.
We remark also that Assumption A.2 follows from Assumptions
Lemma 3.3:
If Assumptions
a l < 1/2, then given
E
>0
A'
.8,
At
.9, A".9 are sat-istied and
there exists mO such that
if m > mO.
-Proof:
By the uniform convergence assumption of AI.8,
..
(3.14)
E
t
m
and we may suppose that
IE',m I
By
At
•
m~
l
.9, A" .9. and Lemma 3.2, (3.14) becomes
g (A ) - m(O)
m m
s c (9)
(3.16)
if
if' m> m •
> m1 •
Sc(Q) $ a, Q
By A" .9, there exists a number a <
€
ID,
00
so that we may write, when m > ml ,
]
+
iom
t
such that
~.
(3.17)
Now, by (3.1), (3.15), Lemma 3.2, and the Tchebycheff-Cantelli inequality, we obtain
- 1 - 1/2 C¥
!lm(Am' - m(el )
:s t:.ac 1_ 1/2 "l 1
ll/2
,
,
if m > m • The lemma follows immediately.
1
Let
A(U) be defined by 1 - U = 1(A(U».
Lemma 3 .4:
C¥l
If P.ssumptions AI .8, A'.9, A".9 are satisfied, and
< 1/2, then
Proof:
By Lemmas 3.2 and 3.3 and Assumption A" .9,
Sm(~m) - m(e)
Sc(9)
is a non-increasing function of e if m is sufficiently large.
Thus,
from (3.16), we obtain
if m is sufficiently large.
Lemma 3.5:
C¥l
<
The lemma follows by continuity of
A(U).
If Assumptions AI .8, AI .9, A".9 are satisfied, and
1/2, then Assumptions A.8, A.9 are sat:isfied with dO any
number greater than sc(el)~(al)/m'(e), and
Proof:
We have noted above that AI.8 implies A.8, with
Now, i f
C, .18)
d> 0,
then by AI .9, A" .9,
if Q ~ 91 + (a d.
m,c
Furthermore, by A" .9, m(Ql+(a
d)
m,c
< m(9 l )+
T)
for m sufficiently large, so that by Lemma 3.2, the argument of!
in (3.19) is a non-increasing function of Q when (3.18) holds and
m is sufficiently large.
Now, it follows from A".9 and Lemma 3.4 that
tends to the limit dm'(9 ) - Sc(9 »I.(a ) Be m ->00.
l
l
l
Hence (3.18)
will hold for m sufficiently large if we require d ~ do and choose
dO
> s(Ql)Hal ) 1m' (Ql) •
We can thus assert that for m sufficiently large, and d
s~
F ~;V
(a,d)
m
m,c
Ii (;. ,Q,.l)
In m
=1
-
-
t
m,c
tt. (A. )-m(Q +fa d)]
m
1 mac
ill
s (9 +(a d)
c 1 m,c
ThiS, with A".9, completes the proof of Lemma 3.5.
We obtain (asslmling At .8, AI .9, A" .9)
~
dO'
44
Furthermore, Assumption A.7 is satisfied provided
(3.20)
Now, both (3.20) and the condition for Lenma 3.5 will hold if we set
dO so that
which we may do if and only if a 2
Assumption AI .3: ell
< 1/2,
< 1/2.
0;2
< 1/2
•
Lemma 3.5 and the discussion above complete the proof of
Theorem 3.2:
Let
J
be given by (3.4).
If Assumptions A.l,
A' .3, A.4, A.5, A' .8, A' .9, A".9 are satisfied, then for
to
ce L
fixed and 8
->
Q
l
and
0, we have
We observe that inspection of the proofs shows that Assumption
A' .9 may be replaced by
Assumption A* .9:
11m
m ->
00
uniformly for 9
There exists a
sup
fa s (Q,j)
jeD (Q) m,c m
€
>0
and sc(9)
>0
such that
= Sc(9)
ID.
5. Asymptotic Comparison of Two Test Families.
In this section, we specialize the foregoing to certain classes
of tests based on U-stat1stics, in the one- and two-sample cases.
Let ¢(X1 ' ••• ,xr ) be a (measurable) function of r variables.
Let JL) be a class of distributions F(x) for which there is defined
the functional
and for which the conditions of Theorem 2.1 are satisfied.
£)
for n> r.
-
n
= {F
n
: F (z )
n n
=
if1=1 F(X
·),
1
F€
&)
~
)
Let
,
Let 9(F ) = 9(F).
n
Consider the statistics
(3 .21)
Un
(3.2?)
VI
=
( : ) -1
,-nQ-l [n/r]-l
=
I
E
_r
i=O
n
~(xr i+l' .•.,xr i+)
r
,
where [n/r] denotes the largest integer conta ined in n/r.
~
:J l'
Let
2 be families of tests of the form (3.4) based on Un' Wn re-
spectively, with
7l (j ) equal to the
set of integers
~
r.
Consider
the decision problem
( {cD In \ , {£) 2n 1, c¥l' C¥2)
where 8 = e2 - 9 1 is the "distance" between the classes
In order to compare the efficiencies of
i)
J 1 and J
In'
0
2 for
this problem asymptotically as 8 - > 0, we apply the theorems of
Chapters II and III to obtain N(
J 1)' N( J 2)'
We may note imme-
diately that Assumption A.l is satisfied for each of the two sequences of statistics.
By Theorem 2.1, Assumption A'.8 is satisfied for {Un} wibh
l(Pn)
m(e)
= (~l'"
=e
.,tr )
2n'
sn (e ,f..)
:=
I5:n (Un)
where
ti ,
...
,t r
cr(Un )
= Un
and cr(U ) are defined in Section
n
3 of Chapter II.
If ¢(x ' •.• ,x ) takes the values zero or one only, then hW ha.s
r
n
l
binomial distribution with parameters ([n/rJ, e).
case, as a consequence of Theorem 4.3 of [13
We have in this
J,
as 8-:> O.
¢ is
If, on the other hand,
further restrict the
fied .
!i)
in order to have Assuroption A'
€
£)
I¢(x l ,
••• ,x )
r
,3 ~ M <
00
and establish the required uniform convergence by a proof
£5 _7.
given by parzen £19_7,
using Berryts theorem
those
.8 satis-
We might assume
E
F
for F
C1DC'S
a more general function, we must
be satisfied for
Or we might make assumptions such as
Chapter II.
{wn 1 with
m(e)
G!. (w )
--n n
e
:=
:=
W •
n
It 1s easy to show from (2.13) that
Assumption A ' .8 will then
47
lim
n->
00
so that we have Assumption A*.9 satisfied for the familie.s
~
:;; l'
2 with a = 1/2 and, respectively,
2
2
sl(9) = r
(3.23)
sup
F
E
F
€
£J
(e)
2
(3.24)
s2(e) = r
s'b
(9)
t l (F)
t r (F)
We see that these functions s(9) are bounded;
it remains to assume
the other parts of Assumption A".9.
Assumption A** .9:
sc(e) is continuous at e = e l and there
exists T}>O such that (u - 9)/sc(9) is a non-increasing function of
9 for 9
E
(l)
if e
l
< u < 91
+ T}.
We have established that under appropriate conditions
and
as 8 -> O.
We may summarize the above discussion in
Theorem 3 .3:
Let
f)
be a class of distributions for which
the functional e(F) is defined and for which t 1 ~ b
E 1~(Xl' •.• ,Xr ) 13
5- M < 00.
Let
J
l'
> 0 and
.J 2 be the families
of tests
of the form (3.4) based on the sequences of statistics (3.21), (3.22),
respectively.
Then if
0/7.
(:J ) is
the set of· integers
sumptions A' .3, A. 4, A** .9 are satisfied for both
we have as 8
-> 0
~
~
r and As-
1 and
.j' 2'
48
<
for the problem (
Corollary:
1
{ J) 2n ~ , a , a2 ).
l
Let 9(F) be a functional whose kernel
[
0 In·~'
the values zero or one only, and let
for which
~ ~
b
> O.
£)
~
takes on
be a class of distributions
Then if Assumption A'.3 is satisfied and if
Assumptions /j,.4 and A** .9 hold for 8f'-( l' we have as 8
->
0
For the two-sample case, using Theorem 2.2 and notation
developed in Section 3 of Chapter II, we obtain an analogous theorem.
Let
(3.23 )
U
mn
= ( ~)
-1 ( :)
-1
1 ~ 11 <.
1~
where t
~
.. <ir
.11 <. ..<jr
~
n
~
n
= min(m,n) •
It is found that, under the conditione stated below,
In this case, we see that the asymptotic relative efficiency of the
two families is independent of
Theore! 3.4:
Let
£)
(C l 'C2 ).
be a class of pairs of distributions for
t lO ~
which the functional e(F,G) is defined and for which
t Ol ~
b > 0, and E I¢Or(X l '" .,Xr ) ,3
~
00.
M
<
Let
J
.J 2 be
l'
~
M
< 00,
E l¢ro(Yl"
b > 0,
.. ,Yr ) /3
the families of tests of the form
(3.4) based on the sequences of statistics (3.23), (3.24), respectiveThen if Assumptions AI .3, A.4, A .5, A** .9 are satisfied for both
ly.
:r
:J 2'
1 and
we have for every c €
e
,as 8 - > 0
t
lO(F,G)
() J
eff c (
r min(sl,s2)
sup
\F,G) € 00(9 )
1
sup
(F ,G) € ~ (9 )
1
ty.
2/ 01)
rV
~
for the problem (
t
_
sl
(F,G)
rr
1,
{J) lJnn ~, {cf) 2mn ~
0'1.'
,
The analogous corollary may also be stated.
2 ),
0:
In the statement
'of Theorem 3.4, we may suppress Assumption A.5 and consider the partition
iJ
i,s
~
components? r, i
of
:3 i ,
where
= 1,
2.
Ji
12 (;; i)
contains vectors n with
tJ i,s'
is repla.ced by
6 . Examples.
The following are i11ustra.tions of the application of the
corollaries of Theorems 3.4,3.3, respectively.
Example I:
e(F ,G)
= p(X < Y);
continuous distributions, for which
0:
2
,[) consists of all pairs of
t 10 + t 01 ~
b> 0;
a
1
< 1/2. By Theorem 3.4, for every s,
as8-> 0,
<
1/2,
50
~ in
for the problem (
are as given in Section 3 of this chapter (i
We must verify:
(i)
= 1,
A.4 is satisfied for
2).
.:J 1.
This is
seen upon examination of the distribution of U when (F,G) is given
mn
by (4.11), (4.10) in Chapter IV.
eel -
(ii)
g)
=
SUP(S2 t lO + B1tOl)
max(sl's2)
In addition, we may compute
whence for every c €
e
= F(x,y), 9(F) = p[(X1 - X2 )(Y1 - Y2 ) > O];~
consists of continuous bivariate distributions with t ~ b > 0;
1
ExampleII:
F
a 1 < 1/2, a 2 < 1/2.
where
SUbject to verification of the conjecture that
8
2
(9) is the correct
maximum for ~1 (F) •
We must verify A.4.
Let Ge(x,y) be given by uniform distribu-
tion of X on (0,1) and
1 y=
{
gl/2 + X
1 - X
2
if X ~ e1 /
if
X>
gl/2.
51
e ,)
9(G
=
e'.
e)
Note that tl(G
= s2(9) if 9> 1/2.
The distribution
of Un is given by
if k
= 2, .•• ,n,
and
J 1 is based in this example is
The U-sta.tistic on which the family
a linear function of Kendall's ooefficient of difference-sign correlation
1i5_7.
The conjectured s2(9) is, as we have seen, not greater than
the true value.
It is attained for a distribution analogous to the
discrete distribution for which Sundrum has Shown~2~7 that
tl
attains its maximum in a cla.ss of discrete distributions.
7. Further Asymptotic Expressions for Nc (J
).
The results given in Sections 3 to 5 are not always applicable.
We
derive here an a,ymptotic expression for Nc {
J ) which
is appli-
cable to certain teste of hypotheses of invariance.
Let 9(F ) be a real-valued function defined for all Fn €
n
and for every n e CJZ .
Assume that the function e(Fn) takes on
values in an interval
(finite or infinite) .
Let
J
00
£) n
be a family of tests of the form (:3.4), defined for
nel'l(0).
Let
pose
£) n (e)
be the subset of
that for some 9 1
€
00,
G On
f) n
having e{F )
is a subset of
n
=Q
and
BUp-
£) n{9 l ) such that
5.2
Let 9
> 91
2
be numbers (independent of n) in
(:5,26)
S)=
10
Lon
,f) 2n = {
F : 9(F
n
n
ill
~
and let
92 }
,
We will consider the asymptotic beha.vior of Nc
8
= 92
- 9
1
-> 0
with ~, (l2 fixed, and
(J ) as
l fiXed.
Q
Let "'n be the smallest number which satisfies
K(~n,n) ~ 1 -
(3,27)
Then
fII.
( j ) is the set of vectors n for which
Fn
N (
c
J)
al ,
s~
P(tn ~ "'n IFn ) ~ 0;2'
60 2n
E
is determined from (3,2),
AssumEtion B ,1:
R* (x, e) and
n
For every n
08 *n (e) C i) n (9),
pet
E
n and every e
E ill,
there exists
such that
< xlF ) = R*(x,e)
n-n
n
when F
n
E
rf) *(9),
n
*
where for every x,Rn(x,e)
is continuous on the right in e at 9
Assumption B,2:
R* (A ,9 )
n n 1
= 91 ,
> a2 ,
If Assumptions A.l, B,l, B.2 are satisfied, and if for every
n
E
12 (0 ) there exists n' ?Z ([J ) with n' > n,
->
E
00
as 8
->
0' )
then Nc <
O.
The proof of Theorem 3,1 ma.y be quoted (With appropriate reinterpretation of notation where necessary) to prove
Theorem 3,5:
Let;] be given by (3,4),
B ,2, A,1, A.5 - A, 7 are satisfied, then as 8
->
If Assumptions B.1,
0 we have
5;
We have, in the notation of Section 4, ASsUInptione A.8, A.9 sufficient
for A.6.
We proceed now to coneider a class of statistics having
asymptotic normal distribution.
Assumptione A.8, A. 9.
We obtain sufficient conditions for
Among these are , as before, A'.8, A'. 9, A". 9 •
We have also
Assumption B.3:
There exist numbers b> 0, G > 0, a cdf KO(x)
c
Which is continuous and everywhere increasing (except possibly when
it is equal to zero or one), and everywhere increasing functions
fn(t n ), n
11 (J ),
€
= IIQ {
K(x,m)
where
E
m(x)
~
Assumption A' .8.
such that for every real x
~~c
0 as m -:>
[rm(x) - m(91 )]}
00
+E
m(x) ,
and m(9) is the function given in
Since we have I\ssumption 11..5, the index n is re-
placed by m.
Let K(U) be defined by 1 - u
= KO(K(U».
Then by (;.27) and
Assumption B .3,
where
-
€
.m
-3;>
0 as m -3;>
(3.28)
lim
> 00
mAssumptio~ B .4:
{
~ (x)}
00;
and
lb [f (r.. ) - m(9 )]
l
m,c - m m
~(x)
rV fm(x) as m
= Gc K(al ).
->
00,
are the functione given in Assumption A' .8.
Now if for d
> 0,
for every x,
&l. (").. )
< m(91
-m m -
+
(8 d),
. m,c
then
(3.;0)
for 9
?
91 + fa d
m,c
•
By (3.27) and Assumptions A" .9, B .4, m(9 ) < ~(A.m)
1
for m sufficiently large, provided (3.29) holds and
for m sufficiently large and
by Assumption
(a )
l
It
It
(a )
l
<
m(9 )+
1
> O.
1)
Hence
> 0,
A".9, if (3.29) holds.
Now, by tlssumption :8.4,
.
and by (3.28),
lim
m ->
00
It follows that
t: o-C~(A.m
'
tm,c ,---m(A.m)
&l.
- m(91 )] =
{~c ·(,\'
if a
a
a>b
00
- m{9 +
l
r:.m,ca d)] tends
<b
=b
to the limit
if a = b
(; .31)
a
<b
•
when a
=b
,.
when a
<b
•
Hence (3.29) will hold for m sufficiently large if we choose
dO>
dO
(<xl )!m'{9 l )
O'c lt
>0
55
We have, if a ~ band
This completes the derivation of Hc (d).
K(O:l)
>
0,
dO given by C~ .32).
=
[[00
if a
= b,
if a
<
b,
We obtain
(91 )'(<>2) + ao .("'l)]fm ',(91 )
=b
if a
I
Dc (0:1' 0:2 )
~ Sc(91)~(0:2)/m'(9l)
if a< b
Assumption A.6 will be satisfied provided we choose dO so that
+ Sc(9l)~(0:2)
ml (9 l )dO
< 0cK(O:l)
m' (9l )d o
< Sc(91)~(0:2)
which we may do if
0:
2
=b
when a
<
b,
< 1/2.
Assumption B. 5:
Theorem 3.6:
when a
K( cx )
l
>
0 and a2 < 1/2.
Let :; be given by (3.4).
B .5, A.l, A.5, A'.9, An .9 are satisfied, with a
If Assumptions B.1~
b, then as 8
->
°
where Dc (CX 'CX2 } is given by (3.33).
1
It should be noted that this theorem will not apply, for example,
to the case of Lehmann's two-sample U-statistic corresponding to the
functional fJ.2
=
S
(F - G)2dF , since the uniform convergence reqUire-
ment of At .8 is not satisfied.
In general, this theorem may be
56
applicable in cases where the set
9
1
t
an interior point of the interval
On is contained in
fj)
(e ) for
l
W.
8 . Examples.
The examples given in Section 6 concern tests which may also
be used for testing, respectively, the hypotheses F
F (X}F2 (y).
1
= G and
F(x,y)
=
Both statistics have asymptotic norma.1 distribution under
such null hypotheses.
Example I:
F
= G implies
Q
= 1/2;
hence, for 9
1
situation is the same as in Section 6, for the family
on the binomially-distributed statistic.
= 1/2,
the
j 2 ,s based
For
:J 1 ,s ,we obtain
, '\,
~}.
and
Thus for every c
~
C,
)
for the problem (
{
Example II:
obtain
(l
= 1/9,
and
c on1
F(x,y)
'
{eB 2n~
= F1 (x)F2 (y)
implies 9
= 1/2.
For
J 1 we
57
Thus,
CHAPT~R
IV
TWO-SAMPLE TESTS •
1.
Introduction.
In this chapter, we consider the properties of various two-
sample tests, for several types of two-smnple decision problems.
We will be concerned chiefly with a class of pairs (F,G) of
distributions for which the functional
f5
(4.1)9(F,G) - P(X < Y) -
¢(x,y)dF(x)dG(y)
is defined, where
r
(
(4.2)
¢(x,y)
=
if
1
x<y
)
to
if
types of null hypothesis are of particular interest.
The
most usual null hypothesis for the two-sample problem is F - G.
The
~fO
Wilcoxon-r1al~-\ihitney
hypothesis.
statistic was proposed for tests of this
lihen considering that statistic, however, one is led
naturally to a partition of alternatives according to the value of
e(F,G).
The next step is to consider a null hypothesis defined
according to values of e.
We will see, for instance, that the
Wilcoxon statistic can be shown to have a certain kind of weak
optimum property for testing H : 9
l
9 < 9 2, for large samples.
9
~
1
against H2 : 9
~
9 ,
2
1
In this chapter, unless it is
othe~lise
stated, we will
suppose that we are given the(n + m) mutually independent random
variables
Xl, ••• ,Xn and Yl, ••. ,Ym, distributed according to F(x),
.
G(y), respectively.
(F,G) is supposed to be a.pair of distributions
59
from a class 1) of pairs of distributions for which 0 < ~(F,G) < 1.
Let ci)(Q) be the subset of
£J
in which
~(F,G) = Q.
The decision problem is to select one or the other of two
hypotheses, each of mich is given by a subset of the class 3)) on
the basis of information given by two observed samples.
in general H: (F,G)e
£J 0 an~
are disjoint subsets of
HI: (F,G)&
,f) l' whGre
Thus, we have
9 a and 081
Jj.
Lot zmn represent the vector (xl'·.·,Xn ; Y1' ..• 'Ym). Given
a>O and a statistic t (z ), we will mean by a IItest II, a function
ron mn
~(z
mn I t
mn) 'ltJhich gives the probability of se1Gcting HI (rejecting H)
when
~ron
has been obsGrved, such that
(4.3 )
{~
=
<l>
where cmn and amn (0 < a
- mn
(4.4)
if
t
if
t mn
if
t
ron
>c
mn
= cmn
< c
ron
ron
< 1) are chosen so that
sup
E
cI>(Zmn
(F, G) E. ';j) 0 FG
I \m)
=
Q
Thus we consider a test which is randomized only to the extent necessary in ardor that the level of significance be Gxact1y a.
city, alim and c
mn
For simp1i-
will be written without their subscripts when it is
not desired to emphasize their dependence on m,n.
A family
~
of tests will consist of tests of the form (4.3)
1'
based on the sequence {tmn
m,n = 1, 2, ••• •
1"Jo will in particular bo interested. in the families of tests
ti
and
°Luf based
on the statistics (4.5) and (4.6), respectively.
Suppose n< m and let
60
n
mnU
=
E
mn
. 1
].=
(4.5)
m
E ¢(x. ,y.)
. 1
J=
1
1
n
nWmn ~ i:l ¢(Xi,Yi ),
(4.6)
U
where ¢(x,y) is given by (4.2).
lies of two-sided tests based on
vuJ will denote the fami1/2 J, , "'Jmn - 1/2 I , ro-
and
I
I U,l1Ul
-
I
spectivoly.
2.
Tests of S
~
against Q ~ Q2;
Q
l
Lot us consider HI: 9
first thElt the family
~
9
1
Small Samples.
~
and H : 9
2
9 , Q < 9 • We show
2
l
2
tr provides a test Hhich maximizes the minimum
power with respect to H , among test of size ro for testing HI.
2
For tasting HI against H , we hElve a test ,( z
~ Wmn ) of tho
2_
~
mn
form (4.3) from the family
~
B(n,S).
Thus, for (F,G)e
binomial distribution
J) (S),
n
(4.7) E cl>(ZmnIvJmn) =
FG
W . nWmn hns
n.
.
n
c
( )QJ(1_9)n- J + a ( )9 (1_9)n-c= R(9; c,a),
Z
j=c+l j
c
which is an increasing function of
Q~
and
have n· and c determined by
110
(4.8)
Lot
property.
1-.1.
be a measure on the roal line
1
having tho following
Thore exist disjoint intervalS 11 , 12 , I
3
(i)
(4.9)
R
~(ii)
x.eI.
.1
0
Theorem.4.1:
(j
J
= 1,2,3)
< 1-.1.(1.)< 00,
J
Le,t
?f Rl
such that
nnplios· Xl < x 2< x
3
j = 1, 2,
3.
cf) be a class. of pairs (F,G) of distribu1-.1.
tions having generalized density with respect to some fixed measure IJ.
satisfying (4.9).
Then a test from tho family
~
;f size a for test-
61
ing Hl' maximizes the minimum power with respect to H , provided £)
2
1..1.-
contains the pairs (K(x,O),H(y»,O < Q < 1,defined by (4.10),(4.11).
Proof:
(Fl'Gl )&
By Theorem 2 of
iJ ~(~
[h _7, it will bo
and (F2 ,G 2 )e
£J ~ (9J
suffi~ient to exhibit
such that
~(ZmnI1I\Tmn) is
a
most powerful test of size a for testing the simple hypothesis
(F,G) :: (Fl,G ) against the simple alternative H2 : (F,G)=(F ,G ).
l
2 2
Let H(y) be a distribution having the probability density
(4.10)
hey)
l/IJ.( 12)
=
{ O
if ye1 2
otherwise.
Lot K(x,9) be a distribution with density
9/1J.( II)
(4-11)
k(x,Q)
::
(1 - 9) /tJ.( 1 ) if x e
3
0
Lot Gl(y)
= G2(y)
:=
if x e I
H(y) and F.(x)
].
1)
otherwise.
= r::(x,9.),
i::
].
1,2.
The desired
result follows from a familiar application of the Neyman-Pearson
lemma.
We might broaden the statement of the theorem slightly, for the
test from
U
actually maximizes the minimum power uniformly with ro-
spect to a p3rtition according to values of Q of the subset of
with 9 :=: 92 "
It is also possible, for Gxample, to prove Theorem
class
£J
4.1
for a
restricted to contain only distributions F, G which are
continuous and everywhere increasing (unless equal to zero or one).
The
distr~buticns
with (Kit,Hit)e
K, H are replaced by sequences {Kit
~
(Qi)' (i
3 ' {Hit ~
= 1,2; t = 1,2, •.• ), such that the power
of a most powc;rful test of level a for testing (Klt,H1t ) against
(K2t ,H 2t ) tends as t ---> 00, to tho power of t(zmn'Wmn ) for testing
62
(F,G)&
!iJ
f) (Q2).
(Ql) at level a, against (F,G)e
A sDnilar result may be obtained for two-sided tests from the
family
tf' for
testing H:t:
Q
The proof follows as above.
:II
2: I Q -
1/2 against H
1/2 J
A m.p. test is obtained for,
.
..
~
5 > 0..
t~sting
.
the
simple hypothesis that the joint distribution of (Xi' ... ,Xn ) is
n
)( K(x., 1/2) against the simple alternative that it is
. ::
:1.
1
:1
n
n
1/2 j( K(x., 1/2 - 0) + 1/2 j( K(x., 1/2 + 0).
. 1
:1.
• 1
~
:1=
:1=
That a test from the family
tL does
not in general maximize the
minimum power may be seen by examination of the proof of Theorem 4.2 •.
Lerl1i1la h.1:
tributions.
Let
cf)
c
be the class of pairs o£ continuous dis-
Then e(F,G) = JFdG and
(4.12)
where s = min(m,n).
Proof: 11e have three steps in the proof of (4.12).
(i)
S
(1 - G)mdFn::
(1 - G)m :: (1 - G)n.
(ii)
J
J
s
(1 - G) sdF •
If m > n, then
If m < n, integrate by parts to obtain
s
S
(l_G)sdF :: e •
The integral is' equal to the proba-
bility of the event A: max (X1 , ..• ,X ) <min(YI, ••. ,Y )' while QS i~
s
s
the probability of the event B: Xl < II, ••• ,X < Y ;and peA) < PCB).
s
s
-
6)
(iii)
upper bound QS is attained.
let F(x) "" K(x,9), G(Y)
> m,
0 < 9 < 1, such that the
Consider the distributions K(x,9), H(y)
defined in (4.10), (4.11).
It n
~ o (9),
There exist (F,G)e
Let
~
.-
be Lebesgue measure.
If n < m,
= H(y). Then J(l~G)mdFn = peA) = PCB) = GS •
= K(y,l-G).
let F{x)·= H(x), G(y)
Again, peA)
= PCB).
To prove (4.13), we see that steps (i) and (ii) may be repeated
J
with F and G interchanged and Q replaced. by 1 - Q =
GdF. That
the upper bound 9 s + (1_9)s may be attained for every Q is verified.
with the same distributions used in step (iii).
Theorem 4.2: If'
(F ,G)£
08
0
a tost from the family
,
1.1.
is in-
admissible for t~sting HI against H2 whenever a l < Q~, where
s
= min(m,n)
> 1.
Proof: "ve show that a uniformly bettor tost is available, namely a
test from tho family
V.
\
For m = n
= 1, of course,
U
mIl
and Wmn
are identical.
If a < Q~, then tho following tests are of size a with respect
to HI:
Q .:: 9 ,
1
,(z Iw )
mn ron
=
eoiSa
{
otherwise,
S
,( ~ ron furon ) =
by Lemma
4.1.
If' (F, G) &
Soi
{
i)c (9),
ifWmn "" 1
if U
a
mn
=1
otharwise,
64
Thus, by Lemma 4.1,
E-FG
+(z.mn Iwmn ) >?
E- • (z
mn Iumn )
G
and it is easily seen that strict inequality holds for some (F,G).
Let
ti'
be the family of (two-sided) tests of the form (4.,)
IUmn
based on the statistic
H~:
e = 1/2
- 1/2
I,
for testing
against
The second part of Lemma 4.1 may be used to show that a test from
is inadmissible for testing Hi against H whenever a
2
That the tests based on U
mn
'1
< 2-(8-1), s > 1.
exhibit curious properties in
small samples is strikingly illustrated by the following example.
Suppose (F ,G)£
:It,
0
S 2:1t :S a
-41
since
Now
:It
=
cBc •
9
Let
m =
n = 2, a
S Q~.
It is shown that for any
-2
' the following test is of size a:
1
J
l
,I
~
a e-12
if
U22 = 1
it
if
U
=
22
0
otherwise.
3/4
max
(F ,G)€
eE)c (e)
by Lemma 4.1.
(1 ) The test -*
t
corresponding to n ;:: 1 / 2 a Q-2 is uniformly
l
better that • n tor smaller n •
-
.....
•
(2)
from the family
has the same power function as the test. (Z22IW22)
We
For a
(3)
<
Hence -(
is not necessarily inadmissible.
gi, a test
"in is not of the form (4.3). Thus,
by permitting greater randomization, we may obtain a test based on U
22
t(Z22IW22). This weakens somewhat the effect
which is no worse than
of Theorem 4.2.
2
For a ;:: Ql'
(4)
• n is a test of the form (4.3), and we see
that it is possible that a. test ot the form
(4.}) may not be uniquely
determined by the fixing of the level of significance.
4.2 cannot hold for m = n
3.
Rank Order
Let u
J
= 2,
a
= 912 •
T~sts.
= uJ(m,n),
= 1,
j
••• , M, denote an arrangement of n
(m:n~ possible
XIs and m yts; let;; mn denote the class of M ;::
rangements.
12
=
l
(xxy), (xyx) I (yxx) }
A rank order test (r. o. test) assigns to each u
Jt
j
~
ar.
For example I
y
(0 ~
Also, Theorem
J
•
a number n j
1) which is the probability of rejecting tne null hypothesis
(usua.lly Eo: F ;:: G) if the (ordered) xts a.nd yt! ot
have arrangement uJ{zmn rV U j).
the~
sample zmn
We will use the nota.tion·
66
u
where ~
= (~l'
••• , ~M)' for r.o. tests.
for testing H : F
O
= 1,
... , M)
For a test of size a,
=G,
M
E ~
.1=1 .1
(4.14)
(.1
• a(m + n)1 / m1 nt
We will find it useful also to define a symmetrical rank order
~,
,
for testing H against two-sided alternatives.
Let u denote
J
O
the arrangement obtained from u by reversing the order; e.g.,
j
The set
Y
~
~k
=nj
~ J may be divided into subsets, tV mn
0 in which U '. = u., and
dmn
d
J
J
in which uj rf u j ' For a symmetrical rank order (s.r.o.) test,
if ~ = uj.
We will write
for s.r.o. tests.
Thus, the arrangements u j
a a.r.o. test.
(4.15)
U
€
We have for all u j
mn (zmn )
=1
Y
~
E
11 '
c!mn
( z' ) if z
mn mn
mn
- U
are considered in pairs in
rV
u"
J
Rank order tests and a.r.o. tests Lave
based on the statistics Umn andlumn - 1/2
I,
assign the n. according to the value of' U (z
J
~s
special eases
tests
respectively, which
mnmn
) tor z
mn
f'"",)
u •
j
Since more than one arrangement may be associated with a given value
of U , a test based on U
mn assigns the n j to groups of arrangements.
By (4.15), a test based on IU - 1/2 I assigns the same probability
mn
mn
t
to u j and u j •
The power function of a r.o. test is given by
M
= r.
(4.16)
j=l
n. Pr(Z
J
mn
f'.)
U
j
).
We say that a test p(z j n') maximizes the minimum power among r.o.
mn
tests of size a for testing H against Ht : (F,G) € ~l if for all n
O
such that (4.14) holds
(4.17)
And, of course, p(z j n l
mn
satisfying (4.14)
)
is u.m.p. among r.o. tests if for all n
For s.r.o. tests, replace p by Ps in (4.16) - (4.18).
Consider
now the use of r.o. tests for testing Hl : Q ~ 01 against H2 : Q ~ Q2.
In the example given at the end of Section 2, it is found that
a test based on U
mn
pen in general.
maximizes the minimum power.
This does not hap-
-
It is, however, always possible to construct a rank
order test which maximizes the minimum power among tests of size a for
testing H : Q ~ e against H2 : ~ ~ Q2. This is done by constructing
l
l
a test whose power function is always equal to the power function of
the test based on W , as follows:
mn
68
(1)
A
(4.3) based on Wmn may be written
test of the form
if
= 0,1, ••• ,
t
°<_
where
P
s Wmn (zron )
s
=t
= min(m,n),
t _< 1. We construct the symmetrized test 4>* (z ron Iwmn )
given by
m~n~
where
LI
m:n:
or
$*
(z
mn
Iwmn ) =
L' 4>(z'
mn
Iwmn )
denotes two-fold summation over the permutations
z'mn
of the
$*
XIS
(z
= (x.1.
, .•• ,x. ;
1.
1
and y's of z
mn
Iw )
mn mn =
s
L
t=O
Pt
n
Now
Lno.
z' with s Vi
mn
mn
= tJ
,
~* (z
mn
so that
Iw ) =
mn
~*
(2)
s
E P
t=O
Z
t
mn
r'W
u(m,i')
-
7
is clearly a r.o. test.
Evidently, the power function of
the power function of
$*
is the same as
$.
In the same way, we can show that the symmetrized form of
~ (z
mn
I I wmn -
1/2
I)
is an s.r.o. test.
We may remark that the test ~* based on U22 which maximizes
the minimum power in the example at the end of Section 2 happens
to be identical with $*(z22
4. Tests of F
= G;
I W22 )
Small Samples.
Various results concerning the properties of small-sample rank
order tests of the hypothesis H : F
O
= G against
alternatives have been obtained by 1. R. Savage
certain classeS of
£21_7.
Savage's
approach is to examine the ordering induced on the set of possible
arrangements of (ordered)
XIS
and y's by the distributions (F, G) of
the alternative hypothesis (all arrangements being equally likely
under H ).
O
He considers alternatives which are all special cases of
the slippage alternatives, H : F(x)
s
> G(x)
for all x.
70
Savage shows that it is impossibleifor most sample
size~
to con-
struct uniformly most powerful rank order tests for testing HO against
H • His remarkBapply af'ortiori to the problem of constructing u.m.p.
s
rank order tests against H : 9(F,G) :$ Ql or H : Q{F,G) ~Q2
2
l
The general conclusion reached by Savage is that the class of
admissible tests for testing F
= G against
H is very large. The same
s
remark may be made about the classes of admissible tests for testing
F
= G against
H , H2 , or H~: IQ(F,G) - 1/21 ~
l
e > o.
We may, however, suggest that tests based on W cannot be expected in
mn
g~neral to have even the weak optimum properties that were shown for
.. '.:,
the problem of testing Q:$ 9
1
If m = n
Theorem 4.3.
against Q ~ Q (Q
2 l
= 2,
a < 1/2, (F,G)
sided test of the form (4.3) based on
2.
IVJ
< Q2).
E
£)c ,
the two-
- 1/21 is inadmissible for
22
testing H against H
o
Proof:
We show that a test based on
IU22
- 1/21 is uniformly better.
Let
'(Z'221 IW 22 -l/2!'>
={:
i f IW 22 - 1/21
= 1/2
otherwise.
This test is of size a and
Ee .cIl(Z22I
Iw 22
- 1/21 )
= 2a L- Q2
+ (1 _ Q)2_7•
Let
e
4l l
(Z22I
1°22 . -
1/21)
=
{:
if
/U22
- 1/21
otherwise,
= 1/2
71
if
IU22
- 1/21
= 1/2
1 if
IU22
- 1/21
= 1/4
1
}l -
o
These tests are of size 0 for 0 <0
where
b
2
=
j
(F _ G)2 dG.
tl-?
With (~.19),
we
otherwise.
S 1/3,1/3 <0 < 1/2,respect1vely.
Now,
(Q _ 1/2)2.
establish for'O
< a < 1/3
(with an obvious
.\
abbreviation of notat}on);
and for 1/3
S ex <
1/2
? 2(1
- ga)(e - 1/2)2
> 0,
if Q
> O.
A corollary to Theorem 4.3 is that Cl(Z22 I 1w22 - 1/21) cannot
maximize the minimum ~ower for testing H against H
O
2.
72
We may ask whether, in the absence of "better" optimum properties,
it is possible to find tests which maximize the minimum power among rank
order tests of size a for testing He: F
=G.
Table 4.1, at the end of this section, summarizes the results
which have been obtained.
It is clear that unless the alternative hy-
pothesis is restricted, the class of admissible tests is too large to
permit satisfactory results from this approach.
We consider one statistic not defined above, namely, the twosample statistic proposed by Lehmann
L-16_7
for testing He against
H : F ,. G. Let
3
alternatives
K(x l , x2 ' Yl' Y2)
!
=
\
1
if max (X ,x2 ) < min (Yl' Y2)
l
or max (Yl'Y2) < min (XII x )
2
0
otherwise
where summation is extended over sets of integers (i, j, k, {) With
1
-< i
-
< J < n,
1
-< k < l ..< m.
E L (Z )
ron ron
where
,l =
1/3 + ~2
=
S
(F - G)2 dG.
The remainder of this section is devoted to the derivation of
Table 4.1.
Conditi.O::S tor Table 4.1:
Let ~ be the class of pairs of conti-
nuous distributions (F, G).
Let the one-sided alternatives
iJ 2
j
= {(F,G): 9(F ,G)
> 1/2
be partitioned according to values of
73
e(F,G).
> o}
SJ;
Let the two-sided alternatives
= {(F ,G): IQ(F,G) - 1/21
be partitioned according to values 01' d
the alternatives
2
to values 01' A.
iJ 3 =
2
{(F,G): . b.
>
o}
= le(F,G)
Let
- 1/21.
be partitioned according
The null hypothesis is HO: F
= G.
Tests denoted by $(Zmn1tmn) are of the fo~ (4.3); there is no
ambiguity 01' definition.
Tests denoted by i(zmn1tmn) are tests based
on t
but employing more extensive randomization.
mn
the ones which appear in the table:
if
U
12
The following are
=1
= 1/2
=
=0
Derivation 01' Table 4.1:
Theorem 4.3 gives an example of a proof'
of inadmissibility.
The set of' r.o. tests (or of s.r.o. tests in the two-sided case)
is reduced by use of necessary conditions tor admissibility. Let c;l
be a class 01' alternatives.
(i)
If P(Zmn
r-...J
uj )
~
P(Zmn
r0
p(Zmn;n) will be admissible only if nk
L21_7, page 13 ).
(i1)
It P(Z
mn
C l'
tV
u.) + P(Z
J
mn
where u ' '1t
J
missible only i t nk > 0 ~plies n j
for all (F ,G)
€
E
'1t)
for all (F ,G) e
>0
implies n
rv u
.
t
j
)
J
> p(Zmn N
-
= 1.
e
then
p
(Savage
Uk) + p(Z
ron
y~, then ps{zmn; n) will be
= 1.
t
r0u. )
It
ad-
74
(111)
(F"G)
ELI'
If P(Z Nu ) + p(Z ,-...".) u t.) > 2P(Z. ru u.)for all
j
mn
mn
J mnlt
€:!J ~,
where u j
admissible only if nk
>0
~
€ :;
~"
then ps (zmn i n) will be
implies n j = 1.
When there is only one r.o. (s.r.o.) test which is not inad.
missible, then that test is most powerful among r.o. tests.
It will be seen that the results are "complete" for m = 1, .
n
= 2.
For m ~ 2, n
~
2, there are only certain significance levels
and certain subsets of the alternatives for which results may be obtained without restrictions on the class of distributions considered.
For larger sample sizes, it does not seem to be possible to obtain
the power functions of the r.o. (s.r.o.) tests in manageable form.
76
Simple
sizes
m=2
n
Level of
significance
Alternatives
for which
properties
hold
(uniformly)
o<o~!
iJ 3(602 )
1 < 6,2 <
=2
b-
o<a~~
m =2
0<a~3
n = 3
3**
o<cx~7
2
£)3{6. )
2
0<60 <
Tests which
maximize
minimum.
power among
r.o. (s.r.o.)
tests
4>(z22 iL22)
1:
3
Inadmissible
tests
4>(z221Iw22~1>
*
1
4>{z22 1 IU22 - '21>*
4>(z221IW22 -
1:3
3)' (d2 )
4>{z23 IL23)
11 < d2 < 1**
'Ii
52 -
4>(z23
2
*
Ilu
23 -
'21 1)*
~;(d2)
4>{Z23!lw23
0< d2 < 1j:
1
0<0$3
2
J)3(60 )
1
< 602 <
'53**
o< cx :s'7
*
4>{Z23 IL23)
1:
3
4> (z23
Ilu
11>
*
23-
2'11>*
0 3 (602 )
2
0<60 <
~ I)
4>{z231Iw23
~
This test is most powerful among r.o. (a.r.o.) tests.
** This condition is sufficient but undoubtedly not necessary.
i
I)
77
5. Asymptotic Properties.
The results of Chapter III were applied in Example I of section
6, to show that the families
'it.
and tf have asymptotic relative effi-
against Q ~ Q2 when (F, G) is in the
l
class of all pairs of continuous distributions (or certain subsets
ciency one for testing
Q~ Q
of that class).
~ 92 > 1/2, eff ('lJ / U) is asymptotically less than one, as shown in Example I, Section 8 of Chapter
For testing F
= G against
Q
III.
We have remarked that the asymptotic results of Chapter III will
not apply to the problem of testing F
Lehmann's statistic.
= G against
A2
> 0,
using
However, it might be reasonable in some experi-
mental situations to suppose that the class
at)
of pairs of continu-
ous distributions (F,G) does not contain pairs (F,F).
t lO + ~Ol ~ a >
The assumption
2
0 (required for uniform convergence) implies that A
is bounded away from zero. We may, therefore, consider the problem
2
2
of testing a S A S 8 1 against A ~ 8 2 > 8 1 • For this problem,
Theorem 3.4 is applicable; the asymptotic upper bound for the variance must be found and shown to satisfy Assumption A** .9.
In Section 1 below, we consider the behavior of tests based on the
statistics U and W relative to the problem of testing Q < 9
mn
mn
- 1
against Q ~ 92 , when the pair of distributions (F, G) is assumed to
be in a certain class
£L)o which
is restricted in such a way that
tests based on W do not maximize the minimum power. It is pointed
mn
out that under this restriction, the asymptotic relative efficiency
of
11 with respect to W is greater than one.
78
6. Large-Sample Properties.
In this section, we derive various properties of the families
and
tl
~ for testing 6 ~ 01 against 9 ~ 92 , in large samples.
In the first place, in order to use tests from the familY~ for
testing 9 ::; 9 , we must have an approximation for ;"mn' the smallest
1
number'such that
0<0; < 1/2,
(4.20)
where
91
is
Suppose
having 0
the subset of
J) contains
f)
having Q(F,G) ::; Ql'
all pairs of continuous distributions
(F~G)
< 9 < 1. It is required at various points in the following
discussion that for every (m,n) and every 9,
£Jmn (9)
contain a pair
of distributions such that 02(Umn ) attains its maximum eel - e)/min(m,n);
or at least that Jt)mn(9) contain a sequence of pairs of distributions
for which ~(Umn ) tends to eel - e) I min (m,n). The assumption that
~ contains all pairs of continuous distributions implies this but
is unnecessarily strong; in certain lemmas, weaker assumptions are
explicitly stated.
Now, the number
~mn
defined by (4.20) will always exist.
In fact,
since mnU mn takes on integer values only, mn;"mn will always be an
integer. We obtain from (4.20)
(4.21)
sup
G\
( F,G ) E~1
P.FG(Umn>;..mn + lImn) -< ex
<
sup~
()
F,G E 1
P.FG(Umn>;..mn ).
79
Consider the normal approximation to the distribution of U • For
mn
u such that mnu is an integer,
e:'
ron
(u;F,G)
where, in the notation of Theorem 2.3,
E':'~(U;F,G) : Eron (u . ~f:::j 9 ; F'G)
(4.23)
€'
mn
->
t 10 +
0 as m, n
~01
->
00,
and is uniformly bounded for (F,G) having
bounded away from zero.
Now the bound (2.29) for €mn(u;F,G)
is a function of u having its maximum at u
the bound (2.29) evaluated at u
IE'mn
Lemma 4.2:
(F,G).
If p
Let e(F ,G)
>0
Proof:
Let
= O.
(U; F, G)I
iJ be a
;
= O.
Let €mn (OjF,G) denote
Then
< €mn (0;
-
F, G).
class of pairs of continuous distributions
= J FdG, t 10
=
5(1 - G - 9)2 dF, t 01
= S(F-idG.
and 9(1 - 9) ~ 1/12 + P, then t 10 + ~Ol ~ p.
For (F,G) €£) such that e(F ,G) =
e, tlO+~Ol
=
J(F_G)2dF
+ 28(1 - e) - 1/3 ~ e(l- e) - 1/12.
Lemma 4.3:
for n
<m
If
Amn
> 91 and 0 < 9(F,G) < p < min (91 , 1/2), then
80
Proof:
P ( Umn
It G
<
Ql' then
ron - eJ
)
[umnmn- er > Aa(u
'mn
> Amn ) = P a(u
~ e(l - e) /
2
(Q1 - Q) n
I
by the Tchebycheff'-Btenayme inequality.
By Lemma 4.3, 0 <
< ex imply
(4.24)
(if n
p
(e , 1/2),
:s m)
sup
(F,G)€J9
1
> e1
91 < Amn , and P(1-P)/(Ql-p )2n
1
> 1/12
> 1/12, and 0 < P < min
Now, i f u
1
PFG (Umn > "'mn) =
sup
sup
P (U > A ) .'
P S e S 9 (F,G)€~(e) FG mn . mn
Furthermore, 9 (1 - e )
1
1
p(l - p
< min
~
implies the existence of p such that
(e , 1/2).
l
e(F, G), we have for n .:s m
(4.25)
Thus, by Lemmas 4.2 and 4.3, we obtain from (4.21), (4.22), (4.24),
(4.25)
(4.26)
+
where
(4.27)
I~ I <
ron -
sup
<e<9
p -
-
1
...,
€
mn
81
and 1/2 - 1/ Jb
9 (1 - 91 )
1
< p < min
> 1/12,
(G , 1/2), provided n ~ m, A - lImn
mn
l
< o.
and pel - P)/(9 - p)2n
1
(4.28)
in such a way that m/(mt-n)
->
p (1/2
->
91 ,
We will write
sup
(F,G)e:2)(e)
By Lemma 4.2 and Theorem 2.2, B(P,Ql' m, n)
>
I'€'mn (O;F,G)I.
O,as m, n
->
00
S P < 1).
But the last statement holds no matter how m, n tend to infinity.
This follows from the promised improvement of Lemmas 2.6 and 2.8,
which is included in the proof of lemma 4.4.
only for the statistic U
mn
immediate.
This lemma is proved
defined by (4.5); generalizations are
Let
sup
-co
Lemma 4.4:
< u < co
If
t lO
Jl.JUmn - Q) /
-~
+ tOl
> 0,
a
(Umn )
~ u) -1 (u)"\J
then ,
(4.29)
and
with
E
mn
->
t lO +
Proof':
0 as min(m,n)
~Ol
->
00,
uniformly for (F, G) in a class
bounded away f'rom zero.
Only the third term of' the bound in (4.29) dif'fersf'ram the
expression (2.29).
2
In place of Lemma 2.6, we may write, from (2.22),
t lO
t Ol
a (umn ) - - n - - m<
tll -
t lO
mn
- ~Ol
•
82
We then obtain" 1n place of Lemma 2.8"
(If r
> 1"
the corresponding improvements in these lemmas lead to
more complicated expressions.)
It remains to show that the first term of the bound in (4.29) is
O(max(m-1/2" n-1/2) ) • But 1-\3
S t lO'
v
3
S t Ol •
This completes the
proof.
Now, for any (F 0' GO)€[)l'
•
0:
~
PF oGo(Umn
> ~mn).
In particular
let F = K(X'~l)' GO = H(y), where H,K are defined by (4.10), (4.11),
O
with 1-\ taken to be Lebesgue measure. If (F,G) = (FO,G O)' nUmn has
~l)
binomial distribution with parameters (n,
Uspensky
0:
L-27 -7"
p. 129)
< n~mn )
-> 1 - PF G (nUmn-
o0
and we have (see
=1
- PF G (nUmn< L- ni-ron-7)
0 0
where L-u_7 stands for the greatest integer less than or equal to u,
whence
(4.30)
0:
>
1
-
- l!.
~ (I mn ) - D(l mn , n, ~l)
where
and
2
(1..z2)e- Z . /2 +
1 • 28
(4 .32 ) D(z,n,Q ) =
6 J23tnQ{l.Q) •
83
with
+
provided nS(l - Q)
~
25.
sup
-00
->
0 as n
->
~
JnQ(l-el }
Let
b(S,n) =
b(Q,n)
exp {-
<
Z
< 00
D (z,n,S)j
00.
We are now in a position to obtain upper and lower bounds for;"
mn
Let h(U) be defined by 1 - u
=1 (~(u».
Let Amn be defined by (4.20), let B(p, 91 , m,n) be
given by (4.28) and b(Sl,n) by (4.34 ). Suppose 9 (1-9 ) > 1/12 and
1
1
n91 (1-9 ) ~ 25. Let p be chosen so that
Theorem 4.4:
1
1/2 - 1/~ < p < min(Ql' 1/2).
Then
(4.36)
Q (1-9 » 1/2
A
mn ~ 91 + ( 1 n 1 . ·}..(a + b( 91 , n» - 1/2n i f n
and
where N = N(a'Sl'p) is an integer such that
(i)
~
< b(91 ,n} < 1/2 - ex if n > N,
>N
•
84
(11)
(11i)
(iv)
Proof:
B(p,eVm,n)
n
< 0:
> p(l-~)AQ1-P)20:
2
m 2: n
it'
> N,
> N,
if n
2
n > 3(1.02) /X (0: + b(el,n»
it'
n > N.
We verify first the eXistence ot an integer N satisfying (i)-
(iv): (i) and (iii) need no comment; (ii) is a consequence ot Lemmas
4.3 and 4.4, given (4.35); (iv) will be possible since by (i) the
right-hand side of the inequality is positive, and decreases with n.
From (4.30), (4.34), and (i),
Now, from (4.31),
Jnel (1-e )
£
1
<
mn -
ni.
mn
+
1/2 - n91,
so that (i) implies
from which we obtain (4.36).
Now, trom (4.38) and (i), X
mn
- l/mn
Thus, from (4.26) and (4.27), 91 (1-91 )
> 91
> 1/12
if N
< n -< m and
with (4.35), (i),
(4.39), and (iii) implies
1[ Vii
(X
mn
J9
- l/mn - 9l
(1-91 )
1
>J
<1
-
0: + B(p,91 ,m,n)
85
when m ~ n
> N.
With (ii), we obtain (4.37).
Finally, we show that (iv) is sufficient for (4.39). We require
n > N. We have assumed 91 (1 - 91 ) > 1/12; n9l (1-91 ) ~ 25 implies
n > 100; hence this inequality surely holds i f (i) holds and
if
This completes the proof.
For large m and n, we may on the basis of Theorem 4.4 use the
approximation A
mn
(4.40)
\-l
mn
tf"'.J
\-lmn' where
== Q +
1
;"(0:) •
Remarks.
(1)
Trial computations show that we must have N close to one
million in order to satisfy (1i) for 0:
= .05,
even in the favorable
case that 81 = 1/2 , m = n.
(2) For the values of a which are usually of interest, condition
(iv) will not be a significant restriction.
(3)
B(p,Ql,m,n) will not be very sensitive to changes in p as p
->
8 • Thus, (i) and (U), particularly the latter, are in prac1
tice the conditions which must be satisfied before the theorem can
be applied.
(4)
4.4.
By symmetry, we may of course interchange m and n in Theorem
86
The next problem is to obtain some information as to the sample
sizes m and n which would be required to solve the problem
i
( {£\,mn J ' £\,mn} , a,f3) with a test based on the- statistic Umn •
."I.
Let
U
be a family of tests of the form
/\
(4.41)
{l
Iu ) =
mn mn
4>(z
0
if
U
if
umn -< 'J..mn ,
mn
>'J..
mn
where A is determined by
mn
(4.42)
A
and 11.(1).)
contains all pairs of positive integers. We have defined
1\
rJt(U)
A
as that subset of
fZ( U)
in which
(4.43)
Lemma 4.5:
G(y»
8.
is in
Suppose the pair of(continuous)distributions (F(x),
cf)
if and only if' the pair (1 - G(-x), 1 - F(-y»
~
Then, for the family of tests
J\
if and only if (n,m)€
Proof;
is in
A
U,
A = A , and (m,n)€?Jl('U)
mn
nm
m( U).
In the notation developed in Section 3 of this chapter, let
u(m,n) represent an arrangement of n
XIS
and my's.
Let u'(m,n) be
the arrangement obtained from u(m,n) by reversing the order, and let
u(m,n) be the arrangement of m XiS and ny's obtained from u(m,n) by
replacing the
XiS
by y's and the y's by
zmn ~ u(m,n) and Zlmn AJ ul(m,n), then
also, if
rv u(m,n), U (z ) = 1 run
mn mn
z
XiS.
We have seen that if
Umn (zmn )
U
nm nm
(z ).
=1
- Umn (z'mn ). Now,
Thus, U (z )
mn mn
87
= Unm (Zlnm ),
where Z' rvul(m,n), which is obtained from u(m,n) by renm
versing the order.
Let Fl(x)
=1
- G(-x), G'(y)
=1
- F(-y).
If F,G are the d1stri-
butions of X, Y, then F', Gt are the distributions of -Y, -X.
8(F,G)
= p(X < Y) = e(F' ,G').
PFG(Zmn
Furthermore,
= PF'G'
u(m,n»
N
Thus
(Znm
No
ii' (m,n) ).
It· follows that, for any constant c,
sup
C1
(F ,G)ecV
l
PFG(Umn
> c) =
and
This completes the proof.
By Lemma 4.5, we may restrict our attention to tests based on samples of sizes m and n, with n
~
m.
that (i)-(iv) of Theorem 4.4 hold.
(4.44)
~*(z
run
Suppose also that n is so large
Consider the test
U
>.,..*mn
U
<.,..*mn
mn
I umn ,p) _-lOl
mn -
where p, Q satisfy the conditions of Theorem 4.4, and
l
.,. mn
*
=
e
1
+
(
e (l-Q
By this choice of)"* we have
mn'
1. 1
n
»)
1/2
• A(e:
88
(4.46)
Let
U*
= {(m,n): m
denote the family of tests of the form (4.44)with
~ n > N}
, where N is given by Theorem 4.4. We will ob-
tain some information about
Lemma 4.6:
if n
If u
?[(ou*)
< 92
?rl(1i*)
for
a < ~ < 1/2.
and max (92 , 1/2)
< pi < 9(F,G) < 1,
then
<m
The proof is analogous to the proof of Lemma 4.3.
Now
92 > A:n
Q
2 -
if and only if'
9 (1-9) 1/2
1)
•
9>(1
1
n
By Lemmas 4.2 and 4.6, if (4.47) holds, 8 (1-9 )
2
2
> p'(l-p') > 1/12,
> P'(1-PI)/(pt-92)213, m(~*) is given by
2
the set of integers (m,n) such that
p'
> max
(9 , 1/2), and n
(4.48)
inf(,'\
(F,G )€ou( 9)
~Gep*(Z
mn
lU
mn
We will derive upper and lower bounds for the quantity
and hence sets
,p)
~
>1
-
- 13.
in (4.48),
tfJlt 0(9)..*) c:.1iteu*) c: Wtl (~*).
Recall the distributions (Fa' G ) for which nU has binomial disa
mn
tribution. If we replace 9 by 9 , we obtain
1
2
Q
""mIl
:5
P G (U
F a a mn
> 'A.*
mn
)
:5
l-P G (nU
F a a mn
:5 L- n'A.* 7> <
Inn-
l-~(l' )-D(l t ,n,9 )
2
mn
mn
89
where
I'
mn
=
*
~nAmn-7+ 1/2 • ne2
Jn~2(l • 9 2 )
and D(z,n,e) is defined by (4.32). Let
b'(e,n) =
inf
-00
D(z,n,e)
< z < 00
Then
Now, from (4.22),we have
(4.52) P(U >,..*) = p(U > l..rmn~* 7 + lImn)
mn
mn
mn - moL
lnn-'
=
~
[e -L-
A:n J
mn
o(U
mn
If u
< 9,
n
I
mn ]
)
+ e'
mn
m, then
~
in!
(F ,G )erV( e)
and (4.47) implies
1.(eo(u.
u )
mn )
92 > L-mn A:n-J /
=! ( In
mo.
(i- U
"e(1-9)
»)
;
Bence, given the conditions
under which (4.48) holds, we have from (4.52), (4.53)
Q.
(0)
> QP
--mIl -
where from (4.23)
1:n.n
= ';l
£.
[JD
(92- A:n)]
'e (1 e )
v 2 - 2
.
..
B'(p' ,9 ,m,n)
2
90
We now state and complete the proof of
Theorem 4.5:
(J
= 1,
and p
> 1/12 + TI, TI > 0, n e.J (1-9.)
> 25
Jso that p(l-p) > 1/12, p'(l-p') > 1/12,
Let 9j (1-9.)
J
2), and choose P, p'
-
< min(91 , 1/2), p' > max (92 , 1/2). Let b(91,n), B(p,e1,m,n),
b'(92,n), B'(p',92,m,n) be given by (4.34), (4.28), (4.50), (4.55),
respectively.
Let N be the integer given by Theorem 4.4 and let
N' be an integer such that N'
(if)
(ii')
Nand
-(1/2 - ~) < b'(92, n) < ~
if n
> N' ,
B'(P',92, m, n) < ~
n> p'(l-p') / (pi - 92)2~
(iii')
(iv')
2:
n
ifn>N',
> 3(1.02)2 j ~2(~ _ b')
if n
> N' ,
Let MO' M1 be integers defined by
+ l/m Vn)
2
if m ~ n ?;: MO'
1n 0(Vj)*) C 1i!(11*) c ~l(~*)"
(j = 0,,1).
Proof':
where 1JI.j (1l*)
From (4.51)" (4.54) we have
the sets tJ1I. J('U*)
= {(m"n):
m~ n
~ Mj ]
~) ? ~ ~~) , and we obtain
~) > 1
from the relations
..
~
(j = 0" 1).
Thus" from (~.51)" (4.49)
1 (-i~) 2: 1
... f3 + b f
,
-i'mn -> h(f3 ... b').,
and the set in which
~) ~ 1 - f3 will be contained in
the set where
The inequality (4.58) defines MI.
From (4.54)" it is sufficient for
~) ~
~ f~e:;~ ~ :~)J ~ ~
1 -
1 - f3 that
+ B',
The inequality (4.59) defines MO.
We may use ~n to obtain 7!I.('U*) since all the conditions of (4.48)
have been assumed"
and (4.47) follows from (v).
It remains to verify that all of the conditions of Theorem 4.5 can
be satisfied.
92
The quantities
Ib(91,n)l, Ib'(Q2,n)I, B(p,Ql,m,n), B'(p',92 ,m,n)
can be uniformly bounded for 9 j (1 - e j ) ~ 1/12 + 1"1, j = 1, 2. Hence
(i), (it), (ii), (ii t ), (iv), and (iv r ) can be satisfied with an
integer Nil independent of 9 , 92 ,
1
Since Qj(1-9 j ) is bounded away from 1/12 (j
= 1, 2), (iii) and
(iii') can be satisfied with N" independent of e , '11 ,
2
l
Now MO 2: M~ by construction, and we can choose (92 - ell so small
that M > Nt provided Ml is greater than the quantity on the right-hand
1
side of (v). This will be so if
(4.60)
when m 2: n
> N',
and (iv r ) is sufficient for (4.60).
This completes
the proof.
Theorem 4.5 may be described in the following summary:
First, a
set of conditions on sample sizes is necessitated by the crudeness of
our bounds on the deviation from the normal approximation (i.e., all
except condition (v».
Second, we choose (e2 - ell so small that condition (v) is sufficient for the others. Then, for the familyll*
with 12 (V,. *) defined by (v), we have bounds for
m(11*).
(4.61)
It follows from Theorem 4.5 that (under the conditions stipulated
there) for every c
(4.62)
93
In other 'Words" an experimenter using a test from the family
f.L *
would
need to take no more than M observations in order to solve the problem
O
({0 l ,mn1,,{tV 2 ,mn
1, a,,~).
By the symmetry shown in Lemma 4.5" (4.62) holds for the family
a,
** defined as-~*
~I
-VL
with nand m interchanged" and hence for the
union of the two families.
On the basis of Theorem 4.5, one might use the approximation
== -N(
al*
Nch{
) N
IH*
l4 )"
where
This approximation" while more symmetric, would in practice by very
little different from the asymptotic
In order to make certain comparisons" we
of calculations for the family
~
perform a similar set
Only the definitions and results
are given here.
Suppose n
(4.64)
P
S m.
Let nf,. be the integer defined by
n
<a <
e1 (Wmn > An) -
mn > An -
Pn (W
~l
If
-a < b I (9 , n) < 1 - a,
1
then
l/n).
.
(4.66)
where b'(e,n) is defined by (4.50).
Let
11* be
the family of tests given by
$ * (z
Iw ) =
Inn Inn
with
n
1l (1J*)
{l
0
if
W
>)..*
u
if
W
< 'f..*n
mn
,
mn -
containing pairs of positive integers (m,n) such that
< m and (4.6,) holds. Now em.(1S*-) is the subset of
ttl. (tJ*)
in
which
(4.68)
Pe (WInn
2
> A*)
>1 n-
But, since the distribution of W
Inn
~.
depends only on the smaller of the
two sample sizes, (4.68) will also define the integer Nc (~*)
We obtain M1
:s Nc(~*) ~ M; ,
where
M~,
n>
if n ~ M~ ,
provided (4.65) holds and
Mb are
= N(~*).
defined as follows:
95
provided (4.65) holds and
(4.72)
where b(e,n) is defined by (4.34).
7. Large-Sample Comparison of
For testing
Q~
el against
U and U
Q
2:
92 , we have seen that tests from
?fare preferable in very small samples; also that the asymptotic
U and OW is one.
relative efficiency of the families
nevertheless, one would expect to find that tests from
Intuitively,
zt , which
appear to utilize more of the "information" in the samples, would
be better in some sense than tests
from~.
tends to confirm this expectation.
The following discussion
It is valid only under restric,
tions which imply that the sample sizes are very large indeed (one
million or more, say).
Presumably, however, a more precise approxi-
matiOD to the distribution
U would permit us to demonstrate the
mn
same propositions for more reasonable sample sizes.
o~
we say" ]'1 is more efficient that
J 2"
i f the index of effi-
ciency of the former is smaller than that of the latter, with respect
to a specified problem.
tions (F,G), we write
([at) l,mn i'{£) 2,mn~
For a given class
(D,
, 0,
9 ,
1
iJ of pairs of distribu-
"2' 0,f3) to repl'lesent the problem
f3) , since ~
contains distributions
mn
Fmn of the form
n
F
(z) =
mn -
m
"IT TI
i=l
j=l
F(x.) G(y.), (F,G)eJ),
J.
and Vl,mn' cf)2,mn are the subsets of
e(F,G)
2:
92 , respective1y.
J
0mn with 9(F ,G) ~ 91 and
(I)
if
~
tt/*
is more efficient than qA.* for the problem (i),9 ,e2 ,a,f3)
1
contains all pairs of continuous distributions, provided
(92 - 91 ) is chosen ao small that the sample sizes must be large
enough to satisfy the conditions of Theorem 4.5, as well as certain
conditions implied below.
Consider M , M~, the lower and upper bounds, respectively, for
l
N(~L*), N('2/"). (We drop the subscript c in Nc(J) since the bounds
are the same for every c.) We show that if m and n are sufficiently
large, Ml >M~. From (4.57), (4.71), it is sufficient to show that
.
Now
A(U) is a decreasing function of u and B, B' ~ O(n
b,b'
= 0(n-l / 2 ),
-1/3
)while
so that the quantities in square brackets are positive
and in fact have positive lower bounds of order n- l / 3 • Hence (4.73,)
will hold if m,n are sufficiently large (n
~
m).
Proposition (I) may be interpreted to mean that, if we consider
test families
1£ *, U*
(in order to be certain that the level of sig-
nificance is at most 0:) we may solve the problem
(i),
9 , 9 , a, f3}
1 2
"more economically" (i.e., with smaller samples) by using a test from
tf*.
This result is to be expected, since we know that a test based on
Wmn maximizes the minimum power for testing Q ~ 9 against Q ~ 9 ,
1
2
while tests based on Umn do not in general have this property.
97
At this point, therefore, it is of interest to compare the behavior
of tests from the two families with respect to slightly modified problems, far which tests from 1Jrdo not maximize the minimum power.
In the remainder of this section, we consider only families of
tests based on samples of equal size. With this restriction, we have
the variance of U depending on only one parameter besides 9, namely,
mn
2
n a (Unn ) :: 29(1-9) + 6
where 6 2
= 6 2 (F,G) =
2
2
.. 1/3 - lIn L-e(1-e)+A ..
1/3J,
j(F _ G)2 dF.
nn ) attains its maximum e(1-9), for (F,G)€~(Q), when
2
6 (F,G) = 1/3 - e(1-9). But the pairs of distributions for which this
2
Now n a (u
maximum is attained are rather unusual: in a typical case, the random
variable distributed according to F must take on values in two intervals separated by a third interval:il which must be found the entire
range of the random variable distributed according to G.
Suppose we impose the condition that, for all (F,G)€~(Q),
First, under this condition, the proof of Theorem 4.1 fails, as do
also the proofs of the modifications and extensions of that theorem
mentioned in Section 2 of this chapter.
we retain in
£) (e)
Second, under this condition,
pairs of distributions of several common types,
among which are the following:
(1)
Normal.
Let
N(~,a)
~ and variance a2 • Let F
only on ~ (cf.
denote the normal distribution with mean
= N{O,a), G = N(~a,a).
Sundrum L- 25J), and (4.75) holds.
2
Then 9,6 depend
(2)
Rectangular.
the interval (a,b).
Let R(a,b) denote the uniform distribution on
(4.75) holds if we let F = R(O,l), G = R(5,1+5),
o > o.
3. Exponential. Let E(a,b) denote the distribution whose cdf
is
E(x;a,b)
=
{o f ( >J
. l-exp -~-a
Then (4.75) holds if we let F
F
= E(a,l),
G
x<a
= E(O,b),
•
x>a
G
=E(ab,b);
or i f we let
= E(ab,b).
(4) "Lehmann t s alternatives," see L-17.1. (4.75) holds if (F,G)
= (F,Fa ),
a
> o.
Finally, under the restriction (4.75), we have
max
(F ,G)€[)(e)
no2( U )
nn
=
/\
Let
V2 denote
the subset of
distributions (F,G) with 0
n + lL2
'32
n
Q( 1-9 ) •
£J containing all
pairs of continuous
< Q(F,G) < 1 and ~2(F,G) ~ 1/3 - 46(1-9)/ 3
=Q~
/\
9 • In the problem (.f) 2' 61' 6 , 0:, 13 )" then, the
2
2
null hy'pothesis is unchanged while the "unusual" distributions are
if Q(F ,G)
eliminated fram the set of alternatives.
~ative
This is a relatively conser-
modification of the problem.
1\
As in Section 6" we find an integer MO which is an upper bound
1'\
for N(1t*) relative to the problem (£)2' el" 82 ,
1\
conditions of Theorem 4.5, M is defined by
O
0:,
13). Under the
+
99
1\
(II) 1[* is more efficient thantJ* for the problem
(~2,91'92'0:,~),
provided (e - el' is so small that n is sufficiently large to satisfy
2
the conditions of Theorem 4.5 as well as certain conditions
~plied
below.
The index of efficiency of 1J* is unchanged for this problem, since
A
it depends only on e , 9 ,
l
2
lower bound M~ for N(ir*).
o:,~.
We show that M is smaller than the
O
By (4.76), (4.69), it is sufficient to show
that
(4.77J v'91 (1-91 )
[1-(a + b' (91 ,
+ ,f9 (1-9 )
2
2
n» -
(n+~/2 ) 1/2 1-(a-B(p, 91 , n,nnJ
[I-(~_b' (92 ,n» - (n+~/2)1/2(~)1/21-(~_B'
(p' ,Q2,n,n)!J
> 2n- 3j2
But the second term of (4.77) has a positive lower bound of order one
while the first has a negative lower bound of order n- l / 3 •
Now, i f we further restrict
2
6 (F ,G)
~to
a subset £)0 in which
~ 1/3 - 4/3 eel-e) for every 8, we may obtain a new (and
smaller) upper bound for "'ron and hence a new family of tests
having level of significance 0: for the problem (ol)o,el'Q2'
Proposition (II) will hold a fortiori with
$)
U°
o:,~).
1.1 *, £)2 replaced by Uo'
O·
Furthermore, it is evident that the asymptotic relative efficiency
of
1i with respect
(~
0' 61 , 6 2 , 0:,
to
~).
Y is greater than one for the problem
100
8. Discontinuous Distributions and Ties.
In this chapter we have assumed (except in Theorem 4.1) that the
two samples are drawn from continuous distributions F and G.
This
assumption may be relaxed in certain ways.
The small-sample exact distributions of U and Vi , considered
mn
mn
in Sections 2m 4, were calculated for continuous distributions.
Similar calculations might be made in the more general case, but
this has not been done.
Some of the large-sample and asymptotic results may, however, be
modified without too much difficulty.
Let
~r be the class of all pairs of distributions (F,G) such
that F and G have no common point of discontinuity.
be changed in the foregoing discussion if the class
Nothing needs to
iJ
of continuous distributions is replaced by the union of
Stoker
L-22_7
of all pairs
f) and~
I.
has pointed this out in connection with the bounds for
deViation from normality of the distribution of U • That there is
mn
no difficulty with uniform convergence will be clear in the discussion
of a more general case below.
We note that for (F,G)€
functional e(F,G)
SJ + 8
= p(X < Y)
r,
we retain the properties of the
which recommend it as a measure of the
difference between the two distributions, namely:
and F(x) = G(x),
-00
< x < 00,
cause p(X
= y)
-00
< x < 00, then Q(F,G)
then Q(F,G)
==
==
If (F,G) € k) +
SJ
1/2; if F(x) ~ G(x),
> 1/2. These properties are retained be-
O.
The situation is changed, however, if we admit a wider class of
pairs of discontinuous distributions.
Theorem 4.1 will still hold,
I
101
but Q(F,G)
= P(X < Y)
is no longer a reasonable parameter.
= O.
extreme case, we may even have Q(F,F)
section, we let
~
In an
In the remainder of this
represent an arbitrary class of pairs of distri-
butions (F,G) having 0 < e(F;G) < 1.
Since, in practice, it is frequently unrealistic to suppose that
p(X
= Y) = 0,
and that there will be no tied observations, the Wilcoxon-
Mann-Whitney is usually (cf.
mnLJ
mn
L-9_7>
defined by
=
where
rj;' (x,y)
=
1
x<y
1/2
x = y
o
x<y
The corresponding functional is
9' (F ,G)
= p(X < Y)
The functionals
¢',
t~o'
+ 1/2 P(X
t Ol '
= Y) = 1/2
J
[F(:k>O) + F(x-017dG (x).
~il may also be defined with
as 1s clear from Section 3, Chapter II.
¢ replaced
by
Nothing is affected in
Chapters II and III, except that the examples in Sections 6 and 8 of
Chapter III are valid only under the condition that
zero or one only.
¢ takes
values
The argument in Sections 5,6, and 7 of this chapter,
however, fails in several respects.
We will discuss first those
points which depend on the fact that ¢(x,y) takes values zero or one
only; and second, those points which depend on continuity of the two
distributions.
102
Where we formerly had t ll
~~
= gel-g),
we now have
= e (1 _ a') - 1/4 p(X
t
EV:j,dently, however, we have (provided
cf. Theorem 4.1)
=
Y) •
iJ (ef) is
It is easy to verify that we still have (if n
~
sufficiently large,
m)
Thus, except for the question of uniform convergence, the treatment
of the distribution of U is unchanged, and analogs of Theorems 4.4
mn
and 4.5 might be proved. The conditions for uniform convergence are
considered below.
(Certain trivial modifications are required at
points where we made use of the fact that mnUmn takes integer values
only when the distributions are continuous.)
The statistic W , when defined with ¢1(X,y) instead of ¢(x,y),
mn
no longer has binomial distribution. While its distribution can be
computed exactly for small samples, it depends on p(X
as on gt.
But W
mn
= Y)
as well
The distribution is not convenient to handle in general.
is still a sum of independent, identically distributed random
variables having asymptotic normal distribution.
may be established under the condition that
Uniform convergence
1
t ll is bounded away from
zero, and the error incurred by using the normal approximation is (by
Berry's theorem, for example )
of order n-1/2 • An approximation to
the index of efficiency for the family
~relative to
the problem
103
(£),Q~, 9~,
ex,
~)
may then be obtained by the methods used in Section
6 of this chapter.
The continuity of P and G is used. in eB'tablishing the relation
~10 +
where 2/).2
=
t Ol
f(F - G)2 d(F + G).
t'
y'
~10 + ~Ol
where 2/).,2
2
= D. - 1/3 + 2e( 1 - 9),
=
S
'iF
= F(x +
(:F -
= A ,2
In the present case, we have
- 1 / 3 + 29(1-9) + p,
Gl d(F + G),
O} + F(x - 0),
2G = G(x
+ O} + G(x - 0),
and p is a certain function of the probabilities of tLes, which may
be positive or negative (but is non-negative when F,G have no common
points of discontinuity).
Conditions sufficient to bound ~io +
tb1
away from zero are sufficient for uniform convergence to normality of
I
Umn • A condition on e analogous to the condition eel-e) > 1/12 used
in Section 6 will not suffice, unless F,G have no common points of
discontinuity.
The details of the comparison of
against S'
~
2
1A
and
9 have not been carried out.
If for
testing
e' ~
9i
It is eVident, however,
that eff(1{/~ is asymptotically one under conditions sufficient for
the required uniform convergences.
While we may expect the first
proposition of Section 7 to be retained, although perhaps with added
conditions, the second proposition would have to be reconSidered
extensively.
BIBLIOGRAPHY
"Asymptotic behavior of some rank tests for
analysis of variance," Annals of Mathematical Statistics, Vol. XXV (1954), pp 724-736.
"On uniformly consistent tests," Annals of
Mathematical Statistics, Vol. XXII (1951),
pp. 289-293.
Harald BergstrBm, "On the central limit theorem," Skandinavisk
Aktuarietidskrift, Vol. XXVII (1944), pp. 139-
153.
"On the central limit theorem in the case of
not equally distributed random variables,1I
Skandinavisk Aktuarietidskrift, Vol. XXXII
(1949), pp. 37-62.
L~5.J
A. C. Berry,
L-6_7 w.
"The accuracy of the Gaussian approximation to
the sum of independent variates," Transactions
of the American Mathematical Society, Vol. XLIX
(1941), pp. 122-136.
J. Dixon andA. M.1Vlood, "The statistical sign test,"
Journal of the American Statistical Association,
Vol. XLI (1946), pp. 557-566.
L-7_7
B. V. Gnedenko and A. N. Kolmogorov (translated by K. L. Chung),
Limit Distributions for Sums of Independent
Random Variables, Cambridge" Mass., AddisonWesley Publishing Co., Inc., 1954.
L-8_7
P. R.Halenos,
"The theory of unbiased estimation," Annals of
Mathematical Statistics, Vol. XVII (1946), pp.
34-43.
"Nt>te on Wilcoxon's two-sample test when ties
are present," Annals of Mathematical Statistics,
Vol. XXIII (1952), pp. 133-135.
L-10_7 Wassily Hoeffding, "A class of statistics with asymptotically
normal distribution," Annals of Mathematical
Statistics, Vol. XIX (1948), pp. 293-325.
L-ll_7 Wassily Hoeffding, "'Optimum' non-parametric tests," Proceedings
of the Second Berkeley Symposium on Mathematical
Statistics and Probability, 1951, pp. 83-92.
105
L-12_7
Wassily Hoeffding, "The efficiency of tests," Institute of
Statistics Mimeograph Series No. 99, University of North Carolina, 1954.
L-13_7 Wassily Hoeffding and J. R. Rosenblatt, "The efficiency of
tests," Annals of Mathematical Statistics,
Vol. XXVI (1955" pp. 52-68.
"The apprOXimate distributions of the mean and
variance of a sample of independent variables,"
Annals of Mathematical Statistics, Vol. XIV
(1945), pp. 1-29.
Rank Correlation Methods.
Griffin, 1948.
L-16j
E. L. Lehmann,
London, Charles
"Consistency and unbiasedness of certain nonpa:cametric tests," Annals of Mathematical
Statistics, Vol. XX'fITI95il-;-w.165-179.
"The power of rank tests,1I Annals of Mathematical Statistics, Vol XXIV (1953), pp. 23-43.
L-18_7
H. B. Mann and D. R. to/hitney, "On a test of whether one of two
random variables is stochastically larger than
the other," Annals of Mathematical Statistics,
Vol. XVIII (194'7), pp:-,O-':bO.
L-19_7
Emanuel Parzen, "On uniform convergence of families of sequences
of random variables," Un5.versity of California
Publications in Statistics, Vol. II (1954), pp.
23-54.
E. J. G. Pitman,
1. R. Savage,
Unpublished lecture notes, 1948.
Contributions to the Theory of Rank Order
Stair~ics, Nat10nal Bureau of Standards Report
3262,
u.
S. Dept. of Commerce, 1954.
"An upper bound for the deviation between the
distribution of Wilcoxon's test statistic for
the two-sample problem and its limiting normal
distribution for finite samples, I and II,"
Proceedings of the Koninkl~ke Nederlandse
A~ademie van Wetenschappen, Series A, Vol. LVII
(195 4 ), pp. 599- 614.
Oor 'n Klas van Toetsingsgroothede vir die
Probleem van Twee Steekproewe, The Hague,
Uitgeverij Excelsior, 1955.
,,
l
106
"Moments of the rank correlation coefficient
T in the genera.l case," Biometrika., Vol. XL
(1953), pp. 409-420.
"The power of Wilcoxon's two-sample test,"
Journal of the Royal Statistical Society, Vol.
XV-2 (1953), pp. 246-252.
L-26_7
R. M. Sundrum,
"On Lehmann's two-sample test," Annals of
Mathematical Statistics, Vol.XXV (195 4), pp.
139-145.
L-27J
J. V. Uspensky, Introduction to Mathematical Probability, New
York, McGraw-Hill, 1937.