A NON-PARAMETRIC TEST OF WHETHER TtVO
SIMPLE REGRESSION LINES ARE PARALLEL
by
Richard F. Potthoff
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No.
/./'-/
'--_.__ -
445
..
University of North Carolina
at Chapel 'Hill
~
','
".,-
Department of
-r~'
},~~,~f:~;;-,~·
Statistics
~~1!,.:,
"'j,i"it "
:'~it~~:~~' .~.
t
'J'-'
~~#~".,.
l.J···;It';'
~i'.
II,£J /
;vtJ
IVII 4 4S--
/9(p~
"-;'':i
'"
:'."'-,.,
-. 1
,
.
~:'
,',., •
"
.
.j
A NON-PARAMETRIC TEST OF WHETHER Tt'1O
SIMPLE REGRESSION LINES ARE PARALLEL
.'.'.,
by
.>
Richard F. Potthoff
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo
Series No.
~_/
-'
445
September 1965
This paper provides a means for testing the hypothesis that
tvlO simple regression lines are parallel, when the two sets of
error terms have tHO arbitrary unknown continuous distributions.
The non-parametric test which is developed here is analogous to
the two-sample Wilcoxon test. The results of t his paper represent a generalization of the results of Mimeo Series No. 331,
which established tl~ validity of the same test for the special
case where the two sets of error terms are both normally distributed but not necessarily with the same variance.
At the end of this paper, a couple of additional problems
in non-parametric regression analysis are briefly referred to.
This research was supported in part by Educational Testing Service,
in part by National Institutes of Health Research Grant GM 12868-01,
and in part by the Air Force Office of Scientific Research.
DEPARTMENT OF STATISTICS
UNIVERS ITY OF NORTH CAROLINA
Chapel Hill, N. C.
·1
I
•
A NON-PARAMETRIC TEST OF WHETHER Tt'TO
SIMPIE REGRESSION LINES ARE PARALIEL
1
by
Richard F. Potthoff
University of North Carolina at Chapel Hill
1.
Summary and initial remarks.
I f one desires to test whether tv10
sin\Ple regression lines have the same slope, then one can, of course, use
the standard test derived from least-squares theory if one is willing to
asswne that the distributions of the two sets of error terms are both normal
and tllat they have the same variance.
I
I
"
.i
are both normal but have different variances, t he tests of (1] and [8] are
available.
In some cases, though, the experimenter may need a test which is
valid for two arbitrary distributions of the two sets of error terms.
present paper presents such a. test.
...
j
1
The
The test bears a close resemblance to
the Wilcoxon two-sample test [10,6], and may be considered as an extension
j
i
If the two distributions in question
of the idea developed in
[7] concerning the use of the Wilcoxon statistic
with respect to a generalized Behrens-Fisher problem.
I
'I
j
2.
Introduction and statement of results.
We suppose that we have M
!
\
i
1j
(i
•
= 1,2,
... ,
M),
I
I
..
Zj
!
=az
+ f3Z Wj + f j
(j
= 1,2,
••• , N) •
I
•
I
j
e
I
I
.,
j
I
This research was supported in part by Educational Testing Service,
in part by National Institutes of Health Research Grant GI-1 12868-01, and in
part by the Air Force Office of Scientific ResearCh.
The
a's and
~'s
are unknown regression parameters.
The Xi's and Wj's are
(known) fixed constants; to avoid complications, we wiD. assume that no two
Xi'S are equal and that no two Wj's are equal.
1'1' 1'2' ••• , f N are assumed to be mutua1J.y independent.
e l , e 2 , ••• , eM'
We assume that each
and that each f
The error terms
e
comes:f'ran a distribution with continuous c.d.f. Fy(e),
i
comes from a distribution with cont~ous c.d.f. FZ(f),
j
Where ordinarily Fy(e) and FZ(f) will both be unknown.
Without loss of generality, we may assume that
M
S lJ.
We wish to test the null hypothesis that the two regression lines are
parallel.
Specifically, we want to test
J
l3y
against alternatives
f
j
~Z •
Our test of H (2.3) will be based on the statistic
o
(2.4)
~
w= (
)-1(
~
)-1
I I
u(ViIjJ )
j
,
i<I j<J
where the function
Vi1jJ
The summation in
1
S
j
< J
.s N,
u(V)
=
(2.4)
equals
° if
V
S
° and 1 if V > 0,
ZJ-Zj _ YI-Yi
WJ-Wj
Xr-Xi
.)
1S
i < I <
M and
~ )( ~ ) terms altogether.
In Section 3 we show that, regardless of what Fy(e) and FZ(f) are,
E(w) -_12
if H is true,
o
and
(2.6b)
E(w)
I=!
if Ho is not true [E(w)
2
.{
•
is over all (i,I,j,J) such that
and thus embraces (
and where
>
-l if
~Z
> t=y,E(W)
<
!
if ~Z < ~] •
J
In Section 4 we prove that
if H is true.
o
Section 5 appeals to a theorem of Hoeffding dealing with U-statistics for
independently but non-identical.ly distributed random variables [3] to show
1
that, under certain mild restrictions,
[var{W)]-2{W-t) is asymptotically
This result, together with (2.6a) and (2.7),
N{O,l) if Ho (2.3) is true.
tells us that, if we base a test of H (2.3) on the critical region
.
J- t I
w - 2'11 >
2M+5
[ l8M(M-l)
where
z*a
0
zp
,
is defined by
2
,
then (disregarding inaccuracies due to the normal approximation) such a
test (2.8) will be of size
I. e:r:ror).
Our
Ct
("mere "size" means ma.x:iJm.un probability of
test (2.8) will of
~,f.pe
course be a conservative test, since
var(w) generally will be smaller than the value on the right-hand side of
(2.7).
Apparently the test is not unbiased.
Section 6 uses a slightly
refined form of (2.6b) to establish the consistency of the test (2.8)
against all alternatives
t3y /: t3Z•
We may note the follmdng points which are relevant for applications:
(i)
,
the extension to one-tailed tests can of course be made in the usual manner.
'.
(ii)
OJ
j
Although the discussion above is in terms of a two-tailed test,
(2.9)
We can obtain confidence bounds on
£:.
= t3Z
- ~
•
..
.. ~" ..
~
j
which are associated with the test (2.8).
The technique for getting the
bounds is similar to the one often used with the ordinary Wilcoxon statistic:
we find that value (or those two values) of D. which, when subtracted from every
V
in (2.4), will cause the resulting new value(s) of
iIjJ
hold of significance.
'tv
The reason why this works is that D.
to be on the thresh-
(2.9) is the
median of each V
(2.5), as we shall see in Section 3 •.
iIjJ
(iii)
In practical situations, there may be pair (s) of Xi' s or of Wj •s
whose two members are equal, contrary to one of our assumptions.
If such ties
do occur, perhaps the best thing to do ordinarily would be simply to count a
tally of
t
in the sum (2.4) for each
u(V
iIjJ
) whose argument (V
iIjJ
) would
J
be undefined by virtue of the two X' s being alike andjor the two W' s being
J
alike.
J
1
For a one-sample problem involving one simple regression line, Theil
[9]
j
introduced and briefly examined a non-parametric test which resembles our
test above for the two-sample problem.
Section
7 below will provide a short
discussion of the one-sample problem.
J
Finally, Section 8 poses a two-sample regression problem which is a
J
)
companion problem to the one covered by the present paper.
The expectation of w.· In this section we prove (2.6).
(2.1-2.2),
Using
(2.5),
J
j
and (2.9), we obtain
,
j
where
I
)
.J
It is simple to show that
same applies to
yiI.
ZjJ(3.2) is. symmetrically distributed about
o.
The
J
J
Now if any two independent variables are each
4
J
..L
•.,.W".'ll\II!UIIIliliMli. . .' . '...Ull_ _..JrillA...........:. . . . . . . .- - . .....
symmetrica.l.1y distributed about 0, then their sum or difference is aJ.so
symmetrica.l.1y distributed about O.
buted about O.
Hence (ZjJ + Y ) is symmetrically distrii1
Also P (z .J + Yi1 = 0)
.J..
.
vo1ved have continuous c.d.f. 's.
is 0 since aJ.1. the random variables in-
. .
Thus, suppose now that H (2.3) is true (i.e.,
o
that 6. = 0); we write
For the case
(3.4)
I3Z > ~ (6. > 0),
we can write
= P (ViIjJ > 0) = P (Vi1jJ
E [u(Vi1jJ)]
> 6.) + P (6.
2: ViljJ
> 0)
•
Using (3.1), we obtain
1
='2
•
We also have
•
1,
I
I
(3.6)
P(6.
~
2: ViIj ?
0) = P(-6. < ZjJ+YiI.:s 0)
1
= P(-~ < ZjJ .:s 0)
~
,,
i
=I
>P(-~ < ZjJ.:s o,-~ < YiI .:s 0)
X P(-~6. < Yi1
CFZ(f+5ZjJ)-FZ(f)]dFZ(f)X
I
:s 0)
[Fy(e+OyiI)-Fy(e)]dFy(e),
j
•
where
Nmv there exists a number f 0 such that, for any 5
Z
€z = €Z(5Z)
such that
€z
>0
and Fz(fo + 5Z)
is continuous, there exists a number foo (f
I
Fz(foo) = FZ(fo) + €z.
o
> 0, there exists an
= Fz(fo)
+ 2€z.
< f 00 < f 0 + 5Z)
Since Fz(f)
such that
Hence we obtain
(
j
e~
I
I
..
5
'~I.I'
"IIX_,.itll.~;A"
1.1.I.1..\\1'1'1:121..-11;;112.41
. 1==1,
11.lt.MIJI••NIIWliE.;;.UIIIU.4.I:• •21• •SlItZ_:• •
(3.8)
?Irfoo CFZ(fo+5Zl-FZ(foolJdFZ(f)
= [<z(5z lJ2
•
o
An inequality exactly a.naJ.ogous to (3.8) can be written for the second integral
on the bottom line of (3.6).
Applying (3.8) and this second inequality to
(3.6), we end up with an inequality of the form
p( 6 ~ ViIjJ
(3.9)
By combining (3.4),
(3.5),
> 0)
~ (€z(8
ZjJ )]2(ey(8.yiI)]2
I
)
and (3.9), we can write
(3.10)
if
Hhen
I3Z <
~(6
< 0), we proceed in a
I3z > ~ •
J
'1
manner like that of the preceding
j
paragraph, with slight IOOdi:f'icatiens; we write finally
j
I
(3.ll)
Oz OJ and
J
II
.
OyiI again being defined as in (3.7).
,
The results (2.6a) and (2.6b) now follow immediately if
and
'We
apply
(3.3)
(3.10-3.11) respectively to the expectation of (2.4).
4.
The variance of 'u.
Sections 4 and 5 are concerned with proving
)
I
J
'1
J
certain properties of wunder H •
o
,In this section we prove (2.7).
First we
will establish,that, if the tyro c.d.f. 's are Fy(e) and Fz(f) = F~(Kf), where
F~ is any c.d.f., then (under H ) we have
o
(4.1)
lim var(w )
K -+co
=
2M+5
18M{M-l)
I
j
j
I
•
J
J
j
)
6
_,I_III, 1l1,I(I" _ _
j
~IIIJII'U.lJllllI!~.~
Let us start by defining
(4.2)
=PCViIjJ>O,
note that
V
iIjJ
Vi'I'j'J'
is given by <:3.~-3.2) with 6. = O.
>O}--k
By using (2.4), we can
vn-ite
) -2( N )-2
var(vT) = ( M
2
2
(4.3)
L L [I
j < J
+ 2
I
+ 2
I
CiIjJII'j'J'
i<I <I'
where
r
is the sum of the terms
I
CiIjJi'Ij'J'
i<P<I
i<I<I'
+ 2
+
i < I
j' < J'
CiIjJiI'j'J'
CiIjJiIj'J'
l
+ r ,
-'
CiIjJi'I'j'J' for which (i,I,i',I') are all
different but (j, J, j , , J') are not all different.
In SectiQns 4 and 5 we 1-Till
assume (with no loss of generality) that the Xi's and the 1'1j 's are arranged
l
•
in ascending order, so that i < I imp~ies Xi < XI and j < J implies I"j < HJ •
NOi'T
as K
-+10 ,
FZ{f) = F~(Kf)
approaches a distribution ~ction which has all
of its probability mass concentrat~d at a single point (O).
He can thus show
that, upon taking the limit in (4.3) and using (4.2, 3.1-3.2), we obtC'.in
which proves (4.~).
7
The remainder of this section will be devoted to proving that
=s
wr(w)
no matter '\-That Fy
2M+5
l8M(M-l)
and F
z
are.
Clearly, (4.5) taken together with (4.1)
'\-lill be sufficient to establish (2.7).
Let ~ = (Rl , R2 , ••• '. ~~) denote a permutation of M of the numbers
1,2, ... , N.
There exLst N~/(N-M)! such permutations a.ltogether.
From (2.4)
it follows that
(4.6)
W
=
(N-M)!
N:
\
(M )-1 \
(V
)
2
L u iIR R
i <I
i I
L
R
~
"There the left su.mma.tion (over
pernnltations~.
!V
'
indicates S'WlmBtion over all N:/(N-M):
Now if t , t , ••• , t v denote any randoT:l wriab1es Whose
l 2
means are all equal, then, regaxdless of what the joint distribution of the
I
1
j
j
t's is, we can write
\
Hitl1
v = NHN-M):, we apply this standard inequality (4.7) to (4.6); we
obtain
(4.8)
)
(
<
va.r w) _
(N-M)l ( 2M )-2 \
If!
L
vax
fi
For each of the
1
=s
i l < i 2 < i;
(4.9)
J
")
:s M
J..
and
.I
w11ere the summation
(t , t , .t:?; L , L ,
l 2
l 2
r.,)
L
U
(V
)]
•
iIRiR
1
J
i<I
(~)(~) sets (i1 ,i2 ,i;;j1,j2,j;) "ffiich satisfy
=s
S(il ,L,io:z:.;jl,j2,jo:z:.) =
~
[ \
.I
~
j1 < j2 < j;
')
Y8
ci
.
:s N,
j
j
let us define
.
~ J. -'2 ~ ~ J. t l
i
..
.t:? JLlJI.;
,
J
I
)
in (4.9) is over the 18 sets of values of
"J
J
which are obtained by taking a double summation in which
8
J
J
I
I
I
•
,
(L1'~'r,) runs over all
6 permutations of (1,2,3) and (.t , .t , .t ) runs over
1 2 3
the 3 permutations (1,2,3), (2,1,3), and (3,1,2). Then we can ,v.rite
(4.10)
N~:
2
Our
N-M !
final task in this section w:U1 be to prove that
S(i1,i2,~;j1,j2,j3) ~
(4.11)
i
for any values of the indices (i1 < i 2 <~, jl < j2 < j3). Once (4.11) is
established, (4.5) follows easi~ by combining (4.8), (4.10), and (4.11).
.,
Let
I
(4.12)
,
denote the conditional probability that Vi1jJ and ViI'jJ' are both> 0 given
that
(4.13a)
and
(4.13b)
fj,fJ,f J ,
= fOl,f02,f03'
Given (4.13), the 3: x 31
= 36
all have the same probability (
'1
distributions) the three e's are
but not necessarily in that order.
possible values of the sextet (ei,eI,eI,;f.,fJ,fJ')
,J
= 1/36),
inasmuch as (in the unconditional
identical~ distributed,
the three f's are
.,
(
identically distributed, and the six variables are mutually independent.
•
Let
I
I
us define
,
I
I
9
Ii
i 4i
iii iI; II;
Qi
·ili i" Ai.
*'
, II;
'QI\t.,
e
(4.14)
where
L.
has the same meaning as in (4.9).
We will prove below that
18
(4.15)
8*(i l ,i2 ,i3 ;jl,j2,j3 leOl,e02,e03;fOl,f02,f03)
for any allowable values of the i's, j's, eO '8, and fo's.
S
t
Noting (4.2), '.;e
see that 8(4.9) is the expectation of S* (4.14); hence, (4.11) will follow
immediately from (4.15) once the latter is proved.
[Incidentally, we note in
view of (4.12) that, in (4.14-4.15) and the ensuing comments, we are neglecting
the point set in which the three eo's are not all different and/or the three
fO'S are not all different.
However, this point set is of measure zero (thanks
to the continuity of Fy and F )' and so we may legitimately neglect it.]
Z
All that remains, then, is to prove (4.15).
If we use (4.14) and recall
tl1at the 36 possible values of the sextet of ers and f's are equally probable,
)
j
j
J
J
I
we can write
J
J
J
I
(I1'~' 1; ;J1 , J 2 , J;)
.. J
'\'Thich are obtained by taking a double summation in ",hich (II' 12 , ~) runs over
J
were the summation
~
is over the
;6
sets of vaJ.ues of
10
J
J
J
I
rI
i
e
1
I
all 6 permutations of (1,2,3) and (J ,J ,J ) likewise runs over all 6 pernn.lta-
1 2 3
The right-hand side of (4.16) thus consists of a sunmation
tions of (1,2,3).
of 36 X 18 = 648 terms; eacll term is the product of two u-functions, and T:lUst
18
T,
therefore be equal to either 0 or 1.
these 648 terms can be equal to 1.
We will prove that no mare than 180 of
That will mean, :in vie,,'; of (4.16), tbat
36s* + 162 oS 180, from whicll (4.15) will follow at once.
Thus it now remains to show that no more than 180 of the 648 terms
can be l's.
Tables 1 and 2 will be of some assistance in identifying and
classifying the 648 terms.
(4.17)
\'Ie
define
P = (X. -Xi )/(X. -X. ),
J.
1
2
J.
3
J.
p
1
= (Wj
a = (f
•
Q=l~
q=l~
B=l~
-W
2
-f
j1
02 01
)/(W
j
-W. ) ,
3
Jl
)/ (f -f ) ,
03 01
b=l~,
Note that P, p, A, a, Q, q, B, and b will all lie between 0 and 1, under our
"
assumptions.
9 entries.
Now Table 1 is divided into 36 squares, each of which contains
Each entry is associated with a particula.r u-function of the form
u(g-kG); e. g., in the square vThich lies in the second row and second column
of squares, the entry in the upper right-hand corner refers to
(4.18)
u( [l/p] - k[-B/Q])
,
while the center ~ntry (Which has been assigned the identif'ying number
refers to
u(a-kA).
€X)
Now (4.18) will always be equal to 1 (regardless of
ttl it
::
it ;
p
u-function whose value will depend on what the quantitites (4.17) are, the
corresponding entry in the table ccnsists of an identifying number [e. g. ,
u(a";'l>A) can be equal to either 0 or 1, and it has the identifying number ~ •
. The entries in Table 1 include 162 identifying numbers altogether:
s*; ... ; ~
as the pair
§1*.
*" *.*;
~
For any of these 81 pairs of identifying numbers (suCh
~ and ~*, e. g. ), the arguments of the two u-t'unctions are equal
except for sign [thUS, e.g., note that ~* refers to u(-a+kA)].
Hence, in
any such pair of u-functions, one u-function will be equal to 0 and the other
'rl.1l be equal to 1 (unless both u-functions have a zero argument, a case which
nOll and hereafter we will properly ignore since it involves a set of measure
zero).
J
Consider the product of two u-functions that correspond respectively to
e
t'·l0 entries which are in the same 3 X 3 square of Table 1, but which are in
different rows and different columns of that square.
(One example of such a
product would be
j
I)
u(a-kA) u(-[b!q] + k[B!Q]) ,
'mich "\'1e designate rore briefly by the notation ~. !Q.*.
notation such as ~. [+]
Ue will also use
to denote the product of two u-functions the second
of vThich is always equal to 1.)
From each of the 36 squares of Table 1, one
J
can obtain 18 possible products of two u-functions such that their two respective
entries are in different rows and different columns of the square.
ine; 36 X 18
e
= 648
,products are all accounted for and grouped in Table 2, which
also provides an outline for proving that no more than 180 of the 648 products
)
can be equa..l to 1.
J
NOvT it is not hard to verify that this set of 648 products is identical
with the 648 terms on the right side of (4.16).
1.2
.
The result-
.
",i.<,~~)" \ ... p ,':', ',' ' .
; .; ,::
;~"'J"/:;-'" /, ":" ''-'i>~'"
,J;
'L" - '~;", .. ~ ::'~..,,:,:: ""-')'" :' "z··~:
Thus the proof of (2.7) is
J
J
j
j
I
f.-:')'v::, ~S),';.,;':, C)i;)'>.;' ; <, '.;{'~~;:'~ ~+~.;~: h~,.~):.,:~ ::·~f~.•~;~i~}';;"""F~,:~!~(: .tff~l'';?'
_."". ,. . .. __.. _.. . .
~
e
e~.
~.
..., -', •••• _.,.,- ....... ....- .c ...., •.-, ..
__ "'''''''J''-'''''''''''''''_''''.''''''''''~' \ '~ ••_~"\>""'''_H''_,'.'''."".~
·~ ~ ........ ' • • '<1. -:::.....-.~>-..40, ..,....' '¥~: ~':Jo~"" •• ~,.,,,,,,,,,:;,~,,.,,,,,,,,'l'
e
.• :"~:~",,-",,.A...;,;.~,.,.~..o!l~''''''''....u'.
J~."\_~A.. :f'.;~"".""~'~"" (,,~~~
e
TABIE 1
Identification of' the functions u{g-kG) which are irwo1ved in the proof of (4.15)
t:
I
K
Alp
1
alp
15
,.,....
....8
1
b/q
77
........
9
l/p
B/Q
IIp
A
-B/Q
-Alp
B
l/Q
B/P
-A
17
........
18
........
¥.2
10
........
31
,.,....
13
,.,....
....3
....6
....7
28
........
12
........
[+]
[+]
[+]
20
........
30
........
11
........
[+]
[+]
[+]
12
21
........
16
........
76
........
14
,.,....
N
[+ ]
[+]
[+]
[+ ]
[+]
[+]
[+]
[+]
[+]
-B
[+]
[+]
[+]
48
,.,....
54
........
51
........
66
........
72
.......
[+]
[+]
[+]
[+]
[+]
[+]
[+]
[+]
.?2
12
-b/q
[0]
[0]
[0]
[0]
67
........
[0]
[+]
[+]
.8
a
47
........
65
........
10*
........
-alp
[0]
[0]
[0]
[0]
[0]
b
39
,.,....
45
........
38
,.,....
42
.......
40
,.,....
l/q
57
.......
63
,.,..
56
,.,..
60
,.,..
b/p
74
,.,..
[0]
78
,.,....
-a
79
,.,..
[0]
-1/q
[0]
-l/p
N
,
4
-l/Q -lIp
A/Q
-B/p
1
[+]
[+]
[+ ]
N
.2
....2
-1 -A/Q
[+]
[+]
[+]
[+]
[+ ],
[+ ];
[+] [+]
[+] [+]
[+ ]
[+ l
!
70
,.,....
2}
68
,.,....
71
,.,....
[+]
[+]
~
[0]
[0]
[0]
12*
,.,....
14*
,.,....
11*
......
12*
[0]
4*.
...,
'1*
....2*
16*
......
15*
........
[0]
[0]
[0]
18*
,.,....
ll*
1-2*
[0]
~*
8*
1*'
[+]
[+]
43
,.,....
41
.......
44
,.,....
58
,.,....
[+]
[+]
61
,.,....
~
62
.......
[+]
[+]
20*
,.,....
[+ ]
[+ ]
[+]
[+]
[+ ]
[+]
46
.......
64
,.,....
[+ ]
[+]
[+]
[+] [+]1
34
.......
32
.......
[+]
[+]
~
~
~
[+]
[+]
[+ l.
[+]
II
[+ ]
[+] [+]
[0]
[0]
[0]
65*
,.,....
66*
,.,....
[0]
[0]
[0]
§l*
68*
,.,....
§2*
[0]
[0]
[0]
[0]
56*
,.,....
57*
,.,....
[0]
[0]
[0]
~*
~
60*
,.,....
[oj
ll*
62*
,.,....
'E..* ';0.*
[0]
12*
61*
,.,....
[0]
[0]
[0]
[0]
[0]
47*
,.,....
48*
,.,....
[0]
[0]
[0]
!t2*
~*
~*
[0]
ll* ..2.* 2i*,
-b
[0]
[0]
[0]
[0]
[0]
12*
[0]
[0]
[0]
40*
,.,....
41*
,.,..
!U*
[0]
81
,.,....
75
.......
80
,.,....
24
.......
22
.......
[+ ]
25
,.,..
~
-[+]
~*
a/q
38*
,.,....
[+]
ll*
42*
,.,....
-b/p
[0]
[0]
[0]
[0]
[0]
78*
,.,....
T1!
:z!t.* ll*
-1
[0]
[0]
[0]
[0]
[0]
76*
.......
~*
-a/q
[0]
[0]
[0]
[0]
[0]
80*
.......
11*
81*
,.,..
~*Ui
~
~
.. _ ..~
~_.""-,.~,.,...~
;"""!'::':".--~..,..
-.-
i1
44*
26
.......
[+]
[+ ]
[+]
[+]
~
[0]
-.
[0]
[0]
~*
~*
a*
~*
[0]
[0]
[0]
[0]
~
~*
ll*
[0]
[0]
[0]
[0]
28*
,.,....
22*
,.,....
i2*
6*
,..
~*
24*
,.,....
~*
[0]
26*
,.,..
.-........ -....--..-
[:]1
~*~*
!t2*j
[+] [+l
~*.
14
TABLE 2
Accounting and grouping of the 648 products of two u-f'unctions, presented
an outline for use in proving that no more than 180 of these products can
be eClual to 1.
..
we.s
Group
Number 'Which
axe e en: al. to
Products which are included in the group
D
I
50 products in which both u-functions are (+]
II
274 products in which one or both u-functions are (0]
III
64 pairs of products in which each pair is of the form 9. (+], Blf. (+],
were 9=;1" ;I" ;I"
~ /1" /1" /1" ~ ~ ~ ~ it. it. it. it.
~
tt.
, .1.Z, .1.Z, J.a, J.a, ,l.,S, ,l.S,
tt
!.tf:tt,i.~~*)l,:i2,.u..u.~~~~
_, ~, E, _, ~ ~ ~ . l Q, .7l, ]l, ~ Ii
,
2Q, 2Q"
§i.~
r..r
£. [+]; 7;!;*.,5.*, ,5.. [+];
50
274
0
64
64
,t
i'Z.tr:,
z;t*.§'*, §,. [+]; 11*.2*,
0
,
i§.~ ~*. (+]; !tl.12" ,7.2*. [+];
It.5,*. (+]; .5£.~, ~*. [+];
~*.~*, 28. (+]; f.*.~*, ~. [+]; 2*.~*,.J:.. (+]; ~~;a*, ~. [+];
~.§, §*.1+]; 1.£.~ _8*. (+]; 1l.~ 1*· (+];
.1, 1*. [+];
IT
1
> 16
~
-> 24
~8
16
~*.~*, ,5.. (+]
-
48*.g,z, £.i*.~*, ~.§2*; 66.!tl, !t:Z*.~ ~*.~*, ~•.€2*;
~•.:?£, :§*.*. ~*.~*, ~.).2*; ~.2L 2Z*.~ :li*.!*' ~·ir;
12.1L Kl.*.
*•.u*, ~.k6*; lJa·
*.,Zi, Ii*· *,
. *;
.D'.~ ~*·E ~*·21*, ~.~*;
m*.7l-; 11*·_*, _.!T
*2.tt
~~! ~*.@t*, ~.it.*, it·~; ~*.5ft
VI(a)
J.J.)
*. ~*, 22,. ~ii., J:a.~;
(iii)~. [+ , ~. (+ ; ~*. +],
(iv) ~.~ ~*·tt
(v) ~.*2, ;l.2*.~
> 16 <8
if,2:-17)t (~ 'f)
*.
.5Jt,*. [+]
ir**.~*' ~.~ ~.~ ~t.lQ*
_*.2*, 2.
*.~ ~*.~
(i) 72*.~*, ~W-, ~~; U*.2!t.
(ii) ~*.:u.*, :u,.;u.*, ;u..2a; 5Z*.~
(iii)~ ( +], ~ [ +]; lU,*. [+], ~*. [+]
(iv) ~~ U*.i ~*.~*, a~ ~*.22., 22.*.~7(.
(v) ~.~ ;!&.*.~ ~;~.• 2t.*, ~.~ ~*.~ li*.~if'
KrI(b)
VII(a) (i)
(ii)
(iii)
(iv)
(v)
(vi)
(vii)
TL~ J.Q.*.~ ~.U*,
~.~ ~*.61
7.i*.i*'
u,,;Jj,*, Jj..7.1 Z*. [+]
61*.€2!, ~l.Q.*, ~~ ~*. [+]
~~*~~ ~*.~
T)..*.~*, ~6IT*,
*.~*, -~ [+]
~ ~*.~ ~*. *,
t
> 16 <8
17) (~ 7)
(~
~*.~*,
'?ilJ U*,
~~
l2*. *, .. ~*.Q,*,
~.~*, ~.~ £*.~ ~.k ~*.'?l*,
> 29 < 13
28) ~~ 14)
(~
[+]
J,.Q,~
'?l.g,*
-
~
t four
I
]
J
J
a.€O U*.~~.~*, ~~ U*.~ U*.~*
ZL~ ~*.~ ~*.Qa*f.2..Qa.W, s.~ a*. [+]
iVII(b) (i)
~ii) '& ~ W. ~ ~*.~4
JiJ.*, ~ 7., 7.*. [+]
iii) 7t;.*.~*1 ~·m, ~~ tt:.~ ~*.~*, ~ [+]
(iv) t5.*.~*, ~ 1-lf', ;:l, ~ ~*.~ ll*· *,
[+]
(v)
~1& lJ:Q.*.~.~.~*, ~~W.'& 7.Q,*.~
(vi) 2l*.~*, ~.7.!)!, 'l2..~ ~*. _*, J&.~*I ~~
(vii) €2:.L*, L·~ 2.*.~ ~*.~ ~*·ll*, a.&.*
Totals
1
,2: 29 <13
(,2: 28) ~~ 14)
~.468
Sl&l
s refer to two different case!3.
,
I
.,
conp1eted vn1en we show that no n~e than 180 of the
648 products accounted
for in Table 2 can be equal to 1.
Table 2 requires a.mplification.
but
Groups I and
n
present no difficulties,
present some explanations for the remaining groups:
lTe
This group consists of 64 pairs of products.
Group III.
Exactly one
product in each pair will be equal to 1.
4
Group rl.
In each of tIle
16 pairs of products in tllis group, at least
one product in the pair must be O.
Group V.
This group consists of eight quartets of products.
No rore
L6
tllan one product in each qua.l tet can be a 1.
1
We prove this statement for the
first quartet; the proof for tIle other seven quartets is shular.
Suppose
that there are actually two products in the first quartet l'1i1ich are l' s.
These
two products would have to be either ~.~ and ~*.a*, or else ~*.~ and
~.~*.
that a
If ~.!t£ and ~*.Zt.* are both lIs, this would imply (among other tlungs)
>
k B/Q and -a/q
>
-l~ B.
In turn, these tlVO inequ.ali ties would impl~'
tl1at q Q, > 1, vlhich is impossible.
Zt,. ~*
)
parcntlleses, viz.,
that
110
and
We \'Till prove that no rore than 0 of the 24 products in
Gro1..p VI(a) can be l' s.
~)
~*.~
are both l' s, we are led to a similar contradiction.
Group VI(a).
3
If ",/e start by supposirlg tInt
(For Group VI(a), Table 2 llas noue figures in
2: 17 and S 7, Ifmch will be expalined later.) Note first
more than one member of each of t he followins pairs of u-functions
t.
can be equal to 1:
&* and ~ II and ~~-;
and ~ 1!t. and ~*; ~ and §l*,
~ and ~*, ~ and ~*, ~ and
;
~-l(. and ~ II and ~*; *2* and ~ ~ and ~*; ~ and ~*, ~ and ~*, ~ and
\
I
'.
!t.Q.*, ~ and ~*, ~* and ~
and ~~~, ~* and :zl,
1 cr:<-.
NN
1!t. and
§l*
and
1!t" a*
and
kL
~* and ~; ~* and ~ ~
~*, ~* and ~ ~ and ~*,
!2t.*
and ~
II and
To beGin '\nth, we may conclude that, in each of the five sub-groups (i-v), no
more than two of the products can be 1 'so
We now consider three :fossible
situations:
(A)
Suppose that none of the four products in sub-group (iii) are l's.
Then tIle total number of 1 's in all five sub-groups clearly could. not exceed 8.
(B)
Suppose there are two l's in sub-group (iii).
be either tIle first two products or th e last two products.
They would have to
In either event,
sub-c;roups (i) and (ii) could then furnish no more than one 1 apiece.
Hence
the total could not exceed 8.
(c)
1.
Suppose that exactly one of the products in SUb-group (iii) is a
For illustration, let it be ~ [+] (whiChever of the four products we
start ",itll, a similar line of argument is used).
We
nOvT
suppose also that
sub-croups (i) and (ii) eacll have t'\-JO 1 's, since other'\nse the total number
to be
2i.*l*,
~*.~ ~*.~*, and
!i.a.
Finally, the various conditions on
sub-groups (i-iii) imply that, in sub-group (iv), there is exactly one product
(viz., ~.~) which is a 1.
GroUl' VI(b).
Thus, as before, the total nlmber of l' s is
.5 8.
Group VI(b) is exactly analogous to Group VI(a).
Groups VII(a) and VII(b).
For Group VII(a), let us observe first
tha.t no roore than one member of each of the following pairs of u-functions
ca.n be equal to 1:
and ~,
U
~ ~* and
£
Tl
12* and
and
1*; 7£ and
~*, ~ and ~*;
Observe also that, in each of the following
t~ios
Zi*
and ~ ~* and ~,;
of u-i'unctions, it is not
possible for all three members to be equal to 1, or for all three members to
be equal to 0:
16
'1
.J
J
FrO:-.l these two observations
lIC
can conclude respectively that, in each of
sub-croups (i-iv) and in each of sub-groups (v-vii), no r.lOre than tID of the
six products can be 1 's.
t1e can similarly verify that there can be no more tllan
tlTO lt s in each of sub-groups (i-vii) of Group VII(b), since Group VII(b) is
analogous to Gl~OUp VII(a).
~lUS
14.
the number of lIs in each of Groups VII(a) and VII(b) cannot exceed
We "dll distinguisJ.l tllO cases:
Case 1.
The number of lIs in each of Groups VII(a.) and VII(b) is
Case 2.
The number of 1 's in Group VII(a) and/or Group Vn(b) is
exactly 14.
In the last four groups in Table 2, the second set of figures (in the
,vee
parentlleses) refers
to Case 2, the first set to Case 1; He will prove that,
if Ca.se 2 IlOlds, then Groups VI(a) and VI(b) can contain no l10re than seven
1 t S apiece, and this "lill allOI" us to conclude that in all groups the ntmber
.1
of l' s i-lill al'\'laYs be < 180•
Define f (a) to be t:le set of four u-functions (lQ., ~ ~*,
feb) to be the set (~ ~ ~*, 21*).
consider all 2 4
= 16
0 , ..
~
,
and
We l·lill work I·d tIl Group VII (a) and will
possible values of f (a); since the development for f (b)
we vlill not consider feb) and Group VII(b) explicitly.
is analogous,
(A)
U*),
It is not possible for f(a) to consist of all four l's or all four
because the associated set of four inequalities vTOuld be internally
inconsistent.
e
Suppose that ;l;Q and ~ are equal to 1 Hhile
(B)
to 0, or vice versa.
17
'~CiII.,1
~~'
"',
IV..' I
. " ,
'
•
and ~* are equal
Then sub-group (vii) lnll contain just one 1, and tIle
..'.
J
l€*
i.a
_Z&A&2
total number of 1 1 s for all sub-groups will be
(C)
Suppose;tQ, and ~* are equal to 1 While
0, or vice versa.
(D)
::s 13.
~* are equal to
Again, there 1nll be just one 1 in sub-croup (vii).
Suppose!Q. and ~* are equal to 1 'while
0, or vice versa.
1i and
*l and ?1.*
are equal to
Then it is easily shown that sub-groups (v-vii) can contain
no r:lOre than five lIs am:>ne; them.
Hence the totaJ. for Group VII(a) will be
(E) Suppose that three u-f'unctions in r (a) are 1 1'1i1ile the fourth is
0.
In this situation it my be possible for 14 of the products in Group
VII(a) to be lIs, in which event 1'1e are thrown into Case 2.
let ~* be the single u-function in rea) which is equal to
For illustration,
° (the development
llOuld be similar if we picked one of the other three me~ers of r(a».
Since ~* and ~* are both 0,
is to contain two l' s,
22*
-
a* will
also have to be 0.
"Till have to be equal to 1.
(v) and (vi) are each to contain two l's, then ~ and
If SUb-group (vii)
Next, if sub-groups
9:l*
respectively will
.
lave to be equaJ. to 1.
must be
limr note that ~ ImlSt be
° because l:.Q,* is 0.
Hence
11. and z£
° because 1i* is 0,
must be equal to 1 in order for
sub-groups (i) and (ii) respectively to contain two lIs.
sub-group (iii).
possible for
1i*,
Since
~
11.
'?l*,
is equal to 1, ~* must be 0.
Consider finally
J
It is not
and ~ all to be equaJ. to 1; the four associated
ineq:ualities are not consistent.
1.
and ~*
The u-functions
11* and
~ cannot both be
Hence, in or~er for sub-group (iii) to contain t"'0 l's, ~ and ~ must be
equal to 1.
[IncidentaJ.ly, sub-group (iv) will automatically contain tvlO l's.]
We have just deduced the only way for Group VII(a) to contain 14 products H11ich are 1 1 s, if 1'le start by assuming that ~* is the sole member of
r (a) Hllich is equal to 0.
TllUS we have encountered Case 2.
We nnlst
nOvl
show
j
'.
,
•
~
I,
that Groups VI(a) and VI(b) can contain no nx>re tl1a:n seven 1 's apiece.
To do
tlli:l, it ,·Till suffice to prove that sub-groups (i-iii) of Groups VI(a) and VI(b)
can contain no more than one 1 apiece.
The proof is immediate once we note
tl1B.t no more than one :member of each of the followine pairs of u-functions
can be equal to 1:
~ and ~ ~ and
a*, '& and
~ and
(F)
~.r,.
~ and ~*, ~ and
~*; ~ and ~*,
'& and
a*;
~ and ~*, ~ and ~*;
!tl*; ~ and ~*, ~ and
U*
Suppose finally tl1at three u-:f'u.nctions in r (aJ are 0 "mile the
fourth is equal to 1.
let
lQ.*,
Thi~ ~1aY
also throw us into Case 2.
be the u-:f'u.nctiOl1 ulricll is equal to 1.
For illustration,
It can then be shown, via
teclmiques somew'1'18.t similar to those used in (E) above, tl1B.t the only vray
for 14 of the products in Group VII (a) to be l' s is for tIle se 14 products to
be as fo11ovTS:
gz·a*,
1.*.[+];
~*.~; ~*·a*, ~*.~*; ~.~*, ~.1Q*; ~*.~ or ~·U*,
~*.~ or ~*.~, t:;··[+]; ~*.~*, 2*.~; ~*.~* or ~.[+], ~.~
For tlus occurence of Case 2, vre proceed, as in (E), to show that SUb-group
1
-~
(iii) of Groups VI(a) and VI(b) has exactly one 1 (viz., ~*. [+]); then, rMU~-
1.* ,
inc use of the fact that the u-functions
~*, and ~* are all equal to
1, Ire verify that sub-groups (i-ii) of Groups VI(a) and VI(b) can contribute
no mre tl1aJl one 1 apiece.
Since Group VII(b) is analogous to Group VII(a), vTe have accounted
£x
1\11
possibilities for Case 2 and have thus establislled the correctness
of tile figures in parentheses in Table 2.
Hence the proof of (2.7) is complete.
\
I
1:
5.
Asymptotic nor;·.1B.lity of w.
In this section Ire
SI10W
that, under
(
i
t,
mild restrictions, w(2. 4) is asymptotically normal under lIo (2 • .3 ).
Our
proof
i
•
,nll use a theorem of Hoeffding [.3, Theorem 8.1] concerning the asymptotic
\0
19
j
~
'1
_~.
"m,,"!,~''''ijI''!'I'''''
,..OJ
we.
.is.
*I.
normality of U-statistics for random variables independently but not necessarily
identically distributed.
A U-statistic must be of the form
U = ( : ) -1
"There the summation
the function ~
(0: =
I
I
t
~ (~, ~'
[3, equation (5.1)],
... ,
:sam)
,
t
is over all (:) sets satisfying 1
~ '\ < ~
< ••• < am
:s n,
is symmetric in its m (vector) argwnents, and the !a's
1,2, ... , n) are mutually independent (but not necessarily identically
I
distributed) r-dimensional random variables of the form
_
(1)
(2)
(r) "
3Sa: - (xa ' xa ' ... , xa ). (In this section we are tITing to maintain the
\
I
I
same notation which Hoeffding used, except that we are using 3S where he used
x.)
J
First we find a U-statistic which is related to w(2.4).
= M + N,
us make the identifications "n
x"'(l
m = 4, r
=
In (5.1) , let
J
for 1 < 0: < M
-
for M+l
Q.,( ,
~
a
~
J
M+N
2
of (~, ~,
nI
:J~l
J
-~J
"\oThere the summation in (5.3) is over all 4:
!
,
and
~!
il[ii,
1
= 3,
0:.;, o:q.),
= 24
permutations (h, H, h', H')
and where
J
,
'j
I
J
«
= 0 otherwise
•
·',i
]
20
J
• _ - _. ._ - - - _. . . . . . . .jjjii-~~
,
•
.'"
I 'I ,
'
'.
~I'
>
i.:,
•
TIle
x~2) 's
and
X2() 's of
(5.2) are fixed constants, but for our present
purposes we r,egard them as random variables for which all the probability
Now note that
~(5.3)
satisfies the specification of being symmetric
in its arguments.
When H (2.3) is true, the statistic U determined by
o
(5.1-5.4) will be related to w(2.4) by the equation
5 n,
U = l~
!.f,N
(w -
,
t)
wllere
(5.6)
•
We will apply Hoeffding's Theorem 8.1 (3] to U(5.5) to prove that
•
_.1.
(var(U)] 2(U-E(U)] is asymptotically N(O,l).
At this point we introduce the
two mild assumptions Which '\'re will use:
AssUI!tQtion 5A.
We assume that, as n -. 10
,
N/M approaches some constant
c (e ~ 1, since N ~ M).
Ass~tion
5B.
Let us define
•
]'
n (0
n~
1) and a number Ko (K0 > 0)
suell that, for all (M,N) and for all integers v in the interval 1 5 \I 5 M/2,
We assume that there exists a fraction
<
the following property holds: in the set of (M- v)(~) triples (I,
which
v
< I < Mand 1 5
j
-i
\
,;
~
\
j
l
_ j
.., .
( ... ~
I
....111. if'l& Ii"
~_"''\JO":
,
J) for
< J 5 N, the number of triples satisfying the
condition
,.
j,
~;-,":"',:-
,!
~j.:_''''''''."
(5.8)
is
K
::: n(M -
V)(~)
V1jJ
< KO
•
Observing (5.6), we see tha.t Assumption 5A implies that
,
lim
n
"'10
l'1h!ch is strictly
> 0.
We conclude from (5. 5) that, once we prove that
[va.r(U) r~ [U-E(U)] is asymptotica.lly N(O,l), it will follow immediately that
1
[va.r (w)
r2
t)
(w -
is likewise asymptotically N(0, 1) •
To establish the a.symptotic normality of U( 5.5) by using Hoeffding I s
II
I
Theorem 8.1 [3], we need only show that the three conditions [3, formulas
(8.2-8.4)] are satisfied. We can see at once that [3,(8.2)] and [3, (8.3)]
hold, simply by virtue of the fact that ~ (5.3) is bounded in absolute value.
The condition
I
'.
[3, (8.4)] is
n
IE
lim
n
,
v=l
I
,
"'00
J
,I
/1
"There
W
( v)
l
is defined by [3,' equation (8.1)].
Since
E(~) =
°
and
I~ I .:s t [see (5.3-5.4)] , it ldll follow that rwl ( v) I .:s t. Hence
l'flS( v) I <Wf( v) , and
E l'f~( v) I <
E 'f~( v). Therefore the fraction
on tIle left s~de of (5.10) is < [
E y12( v) ] -t , "ihich allows us to
L
I
L
conclude that (5.10) will be proved if we can show that
n
(5.11)
I
v=l
E ( 'I{(
v) (~») "'10
as
n"'l0
•
TllUS
all that remains is to verify (5.ll).
NOli
using (5.2-5.4) above
alone; lr;ith (3, equations (0.1), (5.15), and (5.14)], we can write
-
~1( v)(x
....v) =
(H+U-1 -1
,- )
(v
(Sl v
+v
S2 )
= 1,2,
••• , M) ,
,,,here "re define (for v = 1,2, ••• , M)
v-1
Slv
L
I
= SlV(e) =
i=l
,
'i:ji e )
l=Sj<J9l
M
(5.14 )
S2v = S2v(e v ) =
I
J
I=vH
(i = 1,2, ... J v-1) J
and
(I
Note that E(Sl) = E(S2) =. O.
Let
cfv
and
= v+1, v+2 J
... J
M) •
O"~v denote the respective
variances of Sl v and S2 v' a...'1d let Pv be the correlation coefficient between
Slv and S2v.
Then
From (5.15) we see that
1~7
jJ(e v)
J.V
(5.13)
I -< ~.
Hence it follows from
•
23
He will show shortly that t11ere exists a constant
1'(0 <
l'
.:s 1)
such
i
...
that
I
}
(j.2 > (l-I-v)2 (N)2 (.1:)2 -r2
2v 2
2
for all integers v in the interval 1
.:s
v
.:s M/2.
I
i
Nm'T define
(
v to be the
M
la.rcest integer v such that (M- v) 't' - (v-l) > 0; thus
(M-l)'t' <
't'+l and
v1<1.:s M/2.
vM
< M't' + 1
't'+l
,
Using (5.13-5.20), we can write
(5.23.)
From (5.12), (5.17), and (5.21) we obtain
e
[(M't'+l)-v( 't'+1)]2
(5.22)
But
•
v
M
(5.23)
L
v=l
,
the inequality in the second line of (5.23) being a consequence of (5.20).
~
Finally, (5.11) carl now be established at once if i'~ combine (5.22) and (5.23),
.
let n -. 00, and apply AssUllIption 5A.
,
T11US
once
vie
the proof that
if
show that there is a
J
j
is asymptotica.l.J.y normal ''Jill be complete
't'(> 0) for which (5.19) holds.
24
1
1
I
~
I
A
Z,.ot:jj(tJlWffijWt±t ..
Proof of (5.19).
~v=
M
From (5.14) and (5.16) we obtain
1-1
I I
I=v+1 I'=v+1
i'lhere
ive now proceed along lines 1'1'hiC11 will be somewhat simi1.a.r to those of part of
Section
4.
Let
e lO < e 20 < e30' flO < f 20 , and fio < f 20
(5.26)
'
e.rAonsider the conditional probability of the event inside the curly brackets
in the last line of (5.25), Biven that
~5.27c)
Ie
fj t1fj,
= fio' f 20
but not necessarily in that order.
denote this conditional probability by
,
5.23)
~o,
p*
= P (KvIjJ(fJ-f j) -eI+e v
> 0, KvI' j IJ' (fj,-fj ,) -e± ,+e v > 0
let us denote
25
~iMtMt i,i
141' Fa IN#••O$ tl.: :.. ".
'22:;: p;
;$;,,\14 \11«1
i
"
-.
'II
I'
•
Given (5.27), the 31 X 21x 21 = 24 possible values of the septet
(eI,ei",e v; fj,fJ;fj"fjt) all have the same probability, viz., 1/24.
24 P*
tIt f
=
,/,:::0 l t =0
L=O
Hence
,
>Fl
,mere
tIle 0 symbol denoting the Kronecker delta.
e
Beca.use of (5.26) and because
K
and KvI ' j fJf are both p.oSitive under the restrictions on their subscripts,
V1jJ
all four of the quantities (5.29) will be
> O. Thus we conclude at once that
=1
v0001=v0011= 1; v0101+v0002 = 1, v0111+v0012
(As in Section
ar~nt.)
4, we
He
ShOYl
i
ienore the possibility of any u-function having a zero
J
Now (5.32) accounts for 16 of the 24 v.UfL'AfS, and shOls that
The remaining 8 v's are
exactly 4 of these 16 are equal to 1.
(5.33 )
•
J
J
}
•
v0003'V0613'V1001'VI002'V10ll'V1012'VllOl'Vll1l
I
tha.t at least 2 of the 8 v's of (5.33) will ali'Jays be equal to 1:
(A)
> 1. Then v0003 . = v0013 = 1.
Suppose D > 1 and D'
)
(Remember
)
that Po and C1c are < 1.)
26
.
,
\.,7\0
i
•
<
••
,
.....
"
""",
•••
'.
... , . . . . . .
-.:..:.~~~:~J...[... ,,~;..
'":.
: : ....
J~'!.
~,
• .;-;
...
,"".-: .... v . '
J-
~.~'
• • : _....-., ..
'~.~
,:..~~;
,7JrmtrW'ttfi'ligp:·W.Wib·.
> 1 and Po < D' < 1" then vOO13 = vlOll = 1.
(e) If' D' > 1 and Po < D < 1, then VOOO, = v1OOl. = 1.
(B)
If' D
(D)
If' D
(E)
If'
(F)
If D
> 1 and
< Po" then v1011 = vl012 =
D' > 1 and D < Po" then v1001 = v1002 =
Dr
~.
1-
< 1 and D' < 1, then vlOO1 = v1011 = 1.
In situations (A-E) but not necessari~ in situation (F)" the renaining 6
members of the set (5.,3) ~lill all be equaJ. to O.
From the previous paragraph we conclude that 24p* is always > 6.
(5.25) is the expectation of (p* ..
Now consider the case Where
e(~) (Where 0
<
~
we can thus write
K vIjJ and K"I' j 1J1are both
Since
< Ko • First, let
< 1) be defined as the smallest number which satisfies the
wation Fy(e(~»
= ~.
We then define I.l = min(e{.4)-e('.2)" e(.8)-e(.6»;
since Fy is continuous, I.l ,ust be
(eI,ei:"e,,; fj,fJ;fj"fj,)
(5.35a)
t)"
-
> O. Now let r denote the set of points
such that
eI,ei:"e" lie in the intervals (_eo, e(.2»' (e(.4),e(.6»'
(e(. 8)' eo) but not necessarily i~ that order;
IfJ-f'jl < WKo ; and If'j,-fj,1 <
WKo
20 )
or, in other words, such that the ordered values (elO,e20,e30;f'lO,f20;f'io ,f
satisfy
Let
r
denote the complement of
r.
We already know fi'om the preceding para?aph
that
27
'="",. if n,;¥:
i
i
:::$: : i';'
li,lli
; i ,iSh MAliA Ai i i
iIlll'
I
!
i
!
i
_
P* >
4
.!.
for all points in
j
r
J
From (5.28) and (5.35) it is easily seen that
fr
Letting
stand for the septuple integral which extends over the set
and 'vhich has an integrand of 1,
fr
vle
where
can write
= [3: (.2)3][2P (0 oS l'J- f j <
= .192{f CFZ(f
If
'We
r (5.35)
~ K~1}]2
+~K~l) - Fz(f)] dF z (f)}2
•
apply (3.8) to (5.38), we obtain
€z
Since (5.25) is the expectation of (p* -
is of course> O.
t) ,
we can use (5.36-5.37) and (5.39) to establish that
f'
Finally, we combine (5.24), (5.34),and (5.40), and apply Assumption
5B to obtain
,
,
(5. 11-1)
~v ~ (M_v)2(~)2 IF (.016)[€z(~~1)]4
Thus (5'.19) is satisfied ,'lith
Remark.
l'
•
= (.064)~ n[€z(~K~1)]2
It may be of interest to note that our development here in
Section 5 suggests a comparatively simple way of proving that the Wilcoxon
statistic [10,6] is asymptotically normal, in either the null or the non-null
case.
The proof is effected via Hoeffding I s
cations analogous to (5.2-5.4) above.
Theorem 8.1 [3], using identifi-
I
I
J
6.
Consistency of the test.
In this section lIe show that,
aga.Ll~t
all alternatives /). /: 0, the pOifer of the test (2.8) approaches 1 as M -+ 00
•
In order to prove tIllS consistency property, vle will impose a mild assumption:
Assumption 6A.
and a n'Unlber
k
(k
o
o
properties I'lOld:
~-le
assume that there exists a fraction
> 0) such that, for
(i) in the set of
(~)
no (0 < n0 < 1)
all (M,U), the following t'WO
pairs (i, I) for iThich 1
~
i
<I ~
1
~
j
lot,
the nur:mer of pairs satisfyinG the condition
Ix-""1 - x.J. I > l~ 0
(6.la.)
is
2:
lIo
(~1) ; and, (ii) in the set of (~) pairs (j,J) for lThich
the number of pairs
the condition
IwJ - \'[j I > Ie0
(6.1b)
is
satisf:~
< J S N,
> II0(N)
2·
In proving the consistency of our test (2.8), we will consider explicit~
onlJ," the case /).
> 0; however, the proof for /). < 0 is analogous. First we
obtain a couple of preli1l1ina.l7 results.
of Section
.3.
When the conditions
[€Z(OZjJ)]2[€Y(&YiI)]2 in
(.3.9)
[€z(i /). ko)]2[€y(~ /). k )]2
o
and
One of these is related to the material
(6.1)
are satisfied, the term
(.3.10)
and the inequalities will still hold.
vTill follovT from Assumption 6A tl1at, if /).
E(vr) ~ -~ +
(6.2)
can be replaced by
> 0,
then
for all
d
Hence it
(M,N)
The result (6.2) provides a
vlI1ere
sligllt refinement to (2.6b).
e
He will need an upper bound on var(w) for the non-null case.
First
lie
use (2.1~) to write
I
29
)
.-;
: ; ... ·:~.:-t~,~ .....
_l-<............. ~,;.;
rc'j;.l}:'
;.'...-.r."""';.f--------.... ---~-:-"', ... - .. -~ ... ,,·~...,~~:"'
~1
'
'
• • •
~.~. ~
......
"~.,
~.,_.
••
•
......
r~ ~
r
....
/_#~,.~'~
NOlT (~)(~) (M~2)(N~2)
of the (~)2(~)2 covariance terms on the right side of
(G.3) 'Hill be 0, and the renaining covariance terms wU1 all be
S 1.
Hence
"re ha.ve
va.r(w) oS
(6.4)
(~)-2(~)-2[(~)2(~)2 _ (~)(~)(M~2)(N22)]
=1
for
1-1
_ (M) -1{1.I-2) (N) -1{N-2) < 1 _ (1 _ L)4 < ..§...
2
2 2
2 M-l
1>1-1
> 2.
Let
PeR) denote the probability that the test (2.8) will reject H •
o
We uill show, for 6> 0, that
P(R}
peR} ~ 1
~ pew
lE:(~-l) ]-t Iw - tl
.i. _J..
- "2 > M 2 z.p}
~ P (w
- E (w) > M-i
= P(
[
M ~oo.
as
>
Z!a - d}
z~
Now
}
,
the last line of (6.5) being a consequence of (6.2).
Applying Tchebycheff's
inequality to (6.5), we obtain
(6.6)
peR)
>1 _
-
for
var(w)
(d-M- to)2
M sufficiently large that d-M-i
)
)
z~> 0.
Our desired result now follows
2
at once from (6. 6) if we utilize (6. 4) and let
M ~ 00
•
J
7.
The one-sample problem.
Suppose we have just a single sample of
size 1-1 imich conforms to the IJX>del (2.1) and to the accompanying assumptions,
=
I
J
)
and suppose we wish to test
H':
8.._
o ry
)
~
0
30
,mere
t3
is a specified number, against alternatives
It one-samp~e simple regression problem,
is applicable.
~
f
t3
0
•
For this
a test statistic proposed by Theil [9, p.390]
This statistic, to which w(2. 4) bears a certain similarity, is
essentially
•
As Theil has noted, (2w 1
l'
-
1) is closely related to the Well-known statistic
'Which has been considered by M. G. Kendall (see, e.g., [4], p. 82, equation
(1)) and by others.
Althougl1 there evidently is not an exact correspondence
betw'een (21,, 1 -1) and
l'
for t11e respective non-null cases, it is easily verified
tl1B.t the null distribution of (2w l -l) is identical with the null distributicn
of
Therefore it follO't'lS at once that E(w l
l' •
I)
var ( w
2M+5
)
=t
i f HI is true; that
o
if HI is true
= l8M(M-l)
o
[incidentally, note how (7.3) compares with (2.7)]; and that
normal under HI.
o
Wi
is asymptotically
Hence a test with critical region
2M+5
]-~
[ 18M(M-l)
Iw -2'11 > zp
I
constitutes a non-parametric test of
HI
·0
(7.1)
with level of significance
al'lmys exactly equal to ex (disregarding inaccuracies due to the normal approximation).
The consistency of the test
(7.4)
is easily established
via
teclUliques similar to those used in Sections 3 and 6 above.
In addition to the test
(7.4),
the nodel (2.1) is also available.
another way of testing H~
(7.1)
under
Recently HB:jek [2] introduced an extensive
class of distribution-free tests vrruch are applicable to the one-sample simple
recression problem; the Fisher-Yates-Terry-Hoeffding cl-test is (as he points
31
out) one IOOmber of his class.
'Where 13
0
(Explicitly, ~jek just considered the case
of (7.1) is 0, but this really involves no loss of generality since
the formulation (2.1, 7.1) can effectively be put into ~jek' s form if
tract
'We
sub-
13oXi :from both sides of (2.1) and subtract 130 from both sides of (7.1).]
In order to pick a specific test from ~jek' s class of tea-cs, the user must
select a denSity function which in turn determines a function ~(u) (2, p.1125,
equation (1.6)] upon whfch tIle formula for the test statistic is based.
; t
test
~dll
The
have optimal power in a certain sense if the selected density is the
(presumably unknown) density of the ei's.
Some users may object to the pro-
cedure of arbitrarily selecting a density each time they utilize H{jek's class
of tests.
I
J
However, such users could simply decide in advance that they would
aJ.vre:ys utilize a certain density, such as the normal or the logistic density
e
(densities 3 and 2 respectively in ~jek's Table 1 (2, p. 1127]).
In fact,
it might even be possible, through further theoretical investigation, to discover a specific $(u) (i.e., discover a specific choice of the density) Which
will in some sense possess a minimax property with respect to some criterion
related to the power of the test.
Although we will not attempt here to undertake a detailed comparison
of Theil's test (7.4) versus Ha:jek's tests, we may briefly indicate some of
the matters [in addition to the problem of choosing
4l(u)
for H'jek's tests]
which should perhaps be further examined in such a comparison:
(i)
Power is obviously an important considera.tion.
The power of the
tests will depend (among other things) on the Xi's, on Fy(e), on
and on the choice of
som~rt1at difficult.
,(u)
for ~jek's tests.
(l3y - 130 ),
Thus po~ver comparisons may be
For the case 'Where $(u) is based on the logistic density,
J
)
I
J
)
hm-rever, the power comparison seems to be relatively clear-cut and dependent
32
J.
:1 . .
e
ma.inl.y on the
Xi's (evidently this particular power comparison can never favor
the test
(7.4);
COL~nonly
(i.e., for certain
usually favors the H(jek test to a Sli~lt degree; and may less
t~~es
of sets of Xi's) favor the Hajek test to a
more substantial degree or, contrarily, find the two tests essentially equal
in po,.,er].
(ii)
The relative rapidity with which the null distributions of the
competing test statistics approach normality could sometimes be an important
matter to the test users.
For the particular case where $(u) is based on tl~
logistic density and the X. 's consist of the integers from 1 to M, Htjek's
.J.
test statistic is equivalent to Spearman's rank correlation coefficient p •
NO"T the distribution of Kenda.l1' s
't'
evidently converges to normality a bit
faster than the distribution of Spearman's p (5, p. 58]; tIns situation may be
indicative of a small advantage for the test
~
(iii)
(7.4).
Ease of computation is a third factor Which test users may be
expected to take into account.
If only a test of H~ (7.1) is desired and no
associated confidence bounds are required, then it appears that the test (7.4)
vrould Generally be less simple to compute than one of I~jek's tests.
the other hand, a confidence interval for
I3y
If, on
is needed, then basing it on a
~jel: test ,,,ould seem to necessitate' some sort of trial-and-error computations
wI1icll could be rather lenGtI1Y, 'l'Thereas a confidence interval associated
the test
(7.4)
~dth
can be computed ina straightforward manner •
.
A second tilo-sample problem. An obvious companion problem to the
two-sar.~le
problem treated in tIns paper would be to develop· a non-parametric
test of whether tw para.l1e.l simple regression lines are the same.
In other
words, suppose we have the same mdel as indicated by (2.1-2.2) and the
I
33
~,"i
,.. .: .
A4!P. Ui::i,i
i'iI
ii'"
Ii
Ai
II
accompanying discus sion, except that
~
and I3 are each replaced by 13 (i. e. ,
Z
the two lines are known to be parall~l), and suppose lore idsh to test the
hypothesis
CXy = CXz against aJ.ternatives CXy 1= CXz •
It looks as though one interesting possibility for this situation
"rould be to develop a test based on a statistic of the form (2.4), but with
ViIjJ in (2.4) replaced by
(8.1)
riIjJ = (XI-Wj)/(JS: + \iJ - Xi - Wj ) [for the formulas (8.1) we ass~
wllere
that the X's and W's are in increasing order, so that Xi < ~
and
W
j
< HJ1.
Evidently such a test would not be valid under as generaJ. conditions on Fy
and F
Z as the test of the present paper: the random variable (8.1) is equal
to
(8.2)
and thus has the desired median (~-<ly)
if
Fy and FZ are both synmetric (about the origin)
(8.3a)
and/or
,
(8.3b)
but it rray have a different median if (Fy,FZ) do not satisfy (8.3).
If the test just proposed were to be fully developed aJ.ong the lines of
the present paper, the outstanding rratters to be resolved llOUld include the
I
j
follmTing:
(i)
finding a satisfactory upper bound for the variance of the test
statistic under the null hypothesis, given the condition (8.3);
J
J
34
--------.------
(ii)
showing that the null distribution of the test statistic is
~as~totiCally normal under appropriate restrictions,
(iii)
given (8.3); and
establishing the consistency of the test under appropriate assurnp-
tions, given (8.3).
Of course,
(8.1,8.2) would ha.ve the proper median even under conditions
semel-Tha.t broader than (8.3), but such broadening of the conditions might
complicate the proofs and at the same time be of limited practical import.
9. Acknowledgments.
The author wishes to thank Dr. Frederic M. wrd
of Educational Testing Service for suggesting the problem of developing a
test of the hypot1::l.esis that two regression lines are paraJ.1el when the two
sets of error terms are not necessarily normal with the same variance.
The author is also indebted to Dr. Y. S. Sathe for helpful discussions
and conments, and for pointing out the Theil [9] reference; to Professor
Wasslly Hoeffding for suggestions which were useful in obtaining certain parts
of the proofs; to Professor J. L. Hodges, Jr. for calJ.ing attention to the
relevance of the ~jek [2] reference; and finaJly to a referee of an earlier
manuscript, who made the vita.lJ.y important suggestion of' using essentially the
relations (4.6, 4.8) as a step in proving the upper bound of va.r{w).
[:
f
r
.:.,
_
•
•
•
'.
~
• •
..... •
III ~
..:
'.
'"
~~'7:..
_, .• ~...,
II ..
IIII1
Iijlll
II
.,
REFERENCES
HAJEK, JAROSIAV (1962). Inequalities for the generalized Student 's
distribution and tlleir applications. Selected Translations in
Mathematical Statistics and ProbabUity ~ 63-74.
[2]
HAJEK, JAROSLAV (1962). Asymptoticaily mst p01·rerfu1. rank-order tests.
Ann. Math. Statist. ~ 1124-ll47.
[3]
HOEFFDING, vm5SILY (1948) •. A class of statistics 'lilith- asymptotically
normal distribution. Ann. Math. Statist. ~ 293-325.
[4]
KENDALL, M. G. (1938).
~ 81-93.
[5]
KENDALL, MAURICE G. (1955).
A ne~1 measure of rank correlation.
Bionetrika
Rank Correlation Methods (2nd ed.).
Charles
Griffin, IDndon.
[6]
l-f.IOOi, H. B., and WHITNEY, D. R. (1947).
On a test of 'tmether one of t'lilO
random variables is stochastically larger than the other. Ann. Math.
Statist. 18 50-60 •
-
[7] POTTHOFF, RICHARD F. (1963).
Use of the Wilcoxon statistic for a
generalized Behrens-Fisher problem. Ann. Math. Statist. ~ 15961599.
a
[8]
ParTHOFF, RICHARD F. (1965).
[9]
THEIL, H. (1950). A rank-invariant method of linear and polynomial .
regression analysis, I. Neder1. Akad. Wetensch. Froc. ~ 386-392.
•
[10]
Some Scheffe-type tests for some BehrensFisher-type regression problems. J. Amer. Statist. Assoc. 60 (to
appear).
-
HILCaxON, FRANK (1945). Individual cOllq)arisons by ranking methods.
Biometrics Bull. 1,... 80-83.
I
L
I'
!
!
I
I
37
!
/
I
..
,illll."
© Copyright 2026 Paperzz