Sathe, Yashwant S.; (1962)Studies of some problems in non-parametric inference."

INSTITUTE OF S'TAT!:ST1CS
BOX 5457
STATE COL~.EGE SrAT10N
RALEIGH, NORTH (:AROUN/I
-_
.....
UNIVERSITY OF NORTH CAROLINA
Department of Statistics
Chapel Hill, N. C.
STUDIES OF SOME PROBLEMS IN NONPARAMETRIC INFERENCE
by
y. S. Sathe
May 1962
Contract No. AF(6;8)-213
The as~totic behavior of same nonpa~etric test criteria
used in analysis of variance (one-'WaY and two-'WaY classification) under the Pitman type of alternative is considered.
Step-down procedure is suggested for bivariate location parameter problem. Wald t s test is used for testing hypotheses
in the categorical set-up.
This research 'Was supported by the Mathematics Division of the
Air Force Office of Scientific Research.
'"
Institute of Statistics
Mimeo Series No. 325
1i
•
•
ACKNOWLEDGMENTS
I wish to express my sincere
tharu~s
to Professor S N Roy,
who as my research adviser suggested the problems in this thesis
and gave constant encouragement and guidance throughout the worki
and to Professor Harold Hotelling and Professor Wassily Hoeffding,
who read the manuscript and offered many helpful suggestions •
I
would also lilte to express my grateful appreciation to Dr. P. K.
Bhattacharya and Dr. Richard Potthoff for many encouraging discussions during the course of this work.
I owe my sincere thanks to the Department of Statistics,
University of North Carolina and the Air Force Office of Scientific
Research for financial assistance.
I am also grateful to Mrs. Doris Ge.rdner for the careful work
of typing the manuscript; and to Miss Martha Jordan for aid and
encouragement, and for pointing out many technical errors in the
manuscript.
With all this generous help, I must admit that I am solely
responsible for any shortcomings of this work •
•
T.A.BLE OF CONTENTS
•
CHAPTER
PAGE
ACKNOWLEDGMENTS
I
ii
INTRODUCTION
v
ON TEE ASl.'HPTOTIC BEHAVIOR OF SOME NONPABAMETRIC
TEST CRITERIA USED IN ANALYSIS OF VARIANCE (ONEWAY CLASSIFICATION)
- - - - - -
1
1.1 Introduction
1
1.2 Same Definitions and Known Results -
-
3
1.3 Bhapkarts Test Statistic -
-
5
1. 4 Asymptotic Distribution of
-
-
-
~ under H:N
5
1.5 Asymptotic Relative Efficiency -
16
1.6 Massey's Test
- - - - -
19
-
-
19
EN - -
20
1.7 Notation and Definitions -
-
-
-
1.8 Asymptotic Distribution of ~. under
-
1.9 Asymptotic Relative Efficiency of Massey's
Test -
-
1.10 Examples -
II
•
- -
-
-
29
- - - - - - -
32
-
-
-
-
-
-
1.11 Asymptotic Power -
-
-
-
-
35
1.12 Remarks -
-
-
-
-
36
- -
-
ON TEE ASYMPTOTIC BEHAVIOR OF SOME NONPABAMETRIC
TESTS USED IN ANALYSIS OF VARIANCE (TWO-WAY
CLASSIFICATION). - - - - - - - - - -
38
2.1 Introduction
-
38
2.2 Asymptotic Distribution of ~. under
40
2·3 Asymptotic Distribution of
1\~. under 1\
53
-
-
-
-
-
-
-
-
iv
CHAPTER
PAGE
2.4 Asymptotic Relative Efficiency of Mood's
Test ~th respect to Friedman's Test and
the F-Test
- - - - -
•
2.5 Asymptotic Distribution of Mood's Test
Criterion generalized to Balanced Incomplete Block Designs. - - - .. .. - -
III
59
..
63
2.6 Asymptotic Relative Efficiency of Mood's
Test for Balanced Incomplete Block Designs compared with Durbin's Test and the
F-Test.. .. - - - - - - - - - - -
65
TESTS FOR ~OTHESES UNDER ~RE EXP~IAL REGRESSION MODEL AND FOR HYPOTHESES ON BIVARIATE
LOCATION PARAMETERS - .. .. ..
68
3.1 Introduction
68
3.2 A Nonparametric Test
fo~ testing the hypothesis ~=/3 0 'When the conditional median of
y given x" is assumed to be EX:Pr~x7
under the model - - - - - - - - - - - -
69
3.3 Test for the Equality of Location parameters
for Bivariate Distributions"
IV
-
72
SOME PROBLEMS IN THE CATEGORICAL SET-UP
4.1 Introduction
4.2 Wald's Test and its Use for testing some
hypotheses in the categorical set-up ..
4·3 Hypotheses of Symmetry
-
..
79
87
-
4.4 The Non-centrality parameters for the
hypotheses of Symmetry
BIBLIOGRAPHY-
•
..
-
..
-
-
-
-
-
-
91
93
INTRODUCTION
This thesis is concerned with the asymptotic behavior of some
nonparametr1c tests used in the analysis of variance (one-way and
two-wa~ classifications), when the alternative hypothesis is of the
Pitman type;
with a generalization of Mood's test to test the
hypothesis of the equality of k bivariate distributions which are
identical except for the location parameters;
and with the use of
Wald's test to test certain hypotheses in the categorical set-up.
In the analysis of variance with one-way classification,
suppose we have
valued) N
= L;_
_l
•
j
= 1,2, ... ,
n.),(real
~
n. independent observations and further, suppose for
~
each fixed i, Fi(x) is the cumulative distribution function of X
ij
Then we are interested in testing the hypothesis
Various nonparametric tests based on either the ranks of the
observations or the quantiles of the combined sample have been pro1
posed by Wallis and Kruskal
Terpstra
["29_7 , Mood ["19_7, Massey ["17_7,
["25_7 and Bhapkar ["4_7.
Very little work has been done on the power of these tests.
Andrews
["2_7 has investigated the
~symptotic power of the Wallis-
Kruskal test and Mood's test when the alternative hypothesis is of
•
1
The numbers in square brackets refer to the bibliography listed
at'the end.
vi
the Pitman type i.e. the alternative hypothesis H is
N
(i
= 1,
2, ... , c),
where not all Oils are equal.
In Chapter I of this thesis, we consider the asymptotic dis·
tribution of Bhapkar's test criterion and Massey's test criterion
when the alternative hyPOthesis liN holds.
These two tests are
compared with other nonparametric tests mentioned above and with the
usual F-test.
For some specified choices of F(x) and c, their
asymptotic relative efficiencies are tabulated.
Similarly, in a complete two-way design with t treatments and
b blocks, we assume that the distribution of the random variable Xij
(i
= 1,2,
"', tj
= 1,2,
j
v is the general effect, a
the j-th block effect.
i
.•• , b), is F(x + v + a i + ~j)' where
is the i-th treatment effect and
~j
is
In this situation, the hypothesis of the
equality of treatment effects is
a
H:
=a
012
Mood
119_7
=
=C¥
t
and Friedman 111_7 have proposed nonparametric
tests for testing this hypothesis H and their tests have been
o
generalized to the case of balanced incomplete block designs by
Bhapkar
£4_7
and Durbin 19_7 respectively.
Chapter II of the present work, deals with the asymptotic distribution (as b --->
•
00
)
of Mood's test criterion, its generalization'
and Friedman's test criterion, when the alternative hypothesis is
vii
,
11,:
where not all Oils are equal.
As in Chapter I, asymptotic comparisons of these tests with
each other and with the usual F-test are made and for specified
choices of F(x) and t, the asymptotic relative efficiencies are calculated.
In the next chapter an exponential form 1. e. EXp( t3x) is
assumed for the median of the conditional distribution of y given x,
and for each x, the conditional distribution of y is assumed to be
symmetric about this median.
Under this model, a nonparametric test
is proposed for testing the hypothesis that
~.
o
~
has a specified value
Also the step-down procedure is used for testing the equality of
k bivariate distributions, where the model is one in which they differ
only in their locations.
In the last chapter, Wald' s test ["28_7 is used for testing
the hypotheses of two- by-two independence in a 2 x 2 x 2 table and
the equality of two marginal distributions in an r x r table.
Pro-
blems of symmetry in, two-way and three-way tables, when each
classification has the same number of categories are considered.
The corresponding chi-square test statistic and the non-centrality
parameters are given •
•
CHAPTER I
ON THE ASYMPTOTIC BEHAVIOR OF S (ME NONPARAMETRIC TEST
CRITERIA USED IN ANALYSIS OF VARIANCE (ONE-WAY CLASSIFICATION)
1.1 Introduction.
In the usual analysis of variance when the
aS6~ption
of normality
is not realistic, one usually resorts to nonparametric procedures where
no functional form of the distribution is assU11led.
Various nonparame-
tric tests have been proposed in analysis of variance with one-way
classification and c
Let
fxij ) (i
samples
= 1,
(c
~
2).
2, ••• , c; j
= 1,
c
2, ••• , n i ) be N = Z ni
i=l
independent observations (real valued) and suppose
tive distribution function
F (x).
i
Xij has a cumulaThen the null hypothesis to be
tested is given by
(l.l.l)
for all real values of x.
Wallis and Kruskal
["'22,7 have considered a test based on the ranks
of observations in N-fold over-all sample.
They have shown that the
asymptotic distribution of their test statistic, under
square distribution with
CoOl degrees of freedom.
generalized Wilcoxon's two-sample test
by considering the Wilcoxon
•
samples.
13E7 to
Ho 1 is a chiTerpstra ["22,7 has
the case of c
samples
test statistic for all possible pairs of
His test statistic, also, has a limit chi-square distribution
with
(~) degrees of freedom when Ho is true.
Another generalization
of Wilcoxon's test is proposed by Bhapkar ~~7, who considers
of observations, one from each sample.
c-plets
Under H, he shows that his
o
test criterion has a limit chi-square distribution
~th
c-l degrees of
freedom.
Instead of considering the ranks of the observations, Mood ~127
considers the number of observations in each sample which are less than
or equal to the median of the combined sample.
This idea is extended
by Massey ~117 who considers more than one quantile, say h ~ 2, and
bases his test statistic on the number of observations from each sample which lie in each of the
h+l
intervals formed by the h
Their tests reduce to the conventional
2 x c and
(h+l) x c
x2
quantiles.
for testing independence in
contingency tables respectively, when the null
hypothesis holds.
So far very little work has been done on the power of these tests.
Andrews ~5.7 bas investigated the asymptotic power, in Pitman's sense,
of the Kruskal-Wallis test and of Mood's test when the alternative bypothesis
HN'
is specified by
(1.1.2)
for all real values of
F(x)
x
and where not all
Q.' s
J.
are equal and
is a continuous cumulative distribution function.
He also, bas
made a comparison of these tests with the usual F-test of the anaJ.ysis
•
of variance •
3
In this chapter, we investigate the asymptotic power, in Pitman's
sense, of Bhapkar's test and Massey's test, under
(1.1.2).
An
HN
defined in
asymptotic comparison of these tests with other nonpara-
metric tests and the usual F-test is made.
It should be noted that
some work on the asymptotic efficiency of Massey's test (or one form
of it) has been considered in a paper by Chakravarti, Leone and Alanen
to be published in the Bulletin of the International statistical Institute.
But as far as we know they bavedealt with the two-sample case.
1.2 Same Definitions and Known Results.
Definition 1.2.1:
Let
~ijJ
(i
= 1,
2, ••• ,
Cj
j = 1, 2, ••• , n~
for a fixed i, be independent and identically distributed random variables with cumulative distribution function
m < n.
c - c
Fi (x) •
Let
D1. ~
nl ,
Further, let
(1.2.1)
be a function symmetric in each set of its arguments.
n
(1.2.2)
UN
-1
...
= (IrJ.1)
E ~(x~""'Xla ; .••••• ; ...•. ;X
•... ;Xc~ )
C81
-"-l
~
mc
x
where the summation runs over all subscripts
•
Then the statistic
1
<
cxl
< cx2
1
<
~l
< ~2
81
< 8
.• • ....
1
<
2
...
cx,
~,
< n
<cx
l
~
< n
<~
2
m2
.....
<8
< n
mc
c
••• , 8
such that
4
is called a generalized U-statistic.
Theorem 1.2.1 (Bhapkar ["3:7 and Sukhatme ["2'!-7):
Let Pl' P2' ••• , Pc be c fixed positive numbers and let
= NPi
n1
c
where E Pi = 1.
i=l
Suppose we consider the vectors
and
Ur
-N
where u(i)
N
=
(U(l)
N'
is a generalized U-statistic corresponding to (p(i).
Let
Then under the assumption of the existence of second moments of (pr s ,
the limit distribution of
i.6
a g-variate normal distribution with zero means and variance coE =(aij ) (i,j = 1, 2, ••• , g) ,where
c
~i)~j)
g(i,j)
O"ij = E
k=l
Pk
0,0, ••• ,1,0, ••• ,0
variance matrix,
(1.2.4)
1 at the k-th place.
and
(1.2.5)
S(i,j)
0,0, ••• ,1,0, ••• ,0
1 at the k-th place
(1.2.6)
•
(])lu) =
\il(U)(Xl1 , JS.2' •• , Y'1ta{X2l,X22p,·,X2m2;"·;
5
and
<P~~~
X'
excepting
I
S
~iu)
is obtained from
~l.'
random variables.
by replacing all the
XI s
by
the primes denoting e. new set of independent
This is a generalization of Hoeffding's theorem
on U-statistic ~1~7 extended by Lehmann ~127 to the case of two
~les
to the case of
c
samples and random vectors of U-statistics.
1.3 Bhapkar's Test Statistic.
Under the assumptions of the preceding section, let
=0
otherwise.
(i = 1, 2, ••• c).
Then
n
c
E ~
a =1
c
(i)
(Xl
01.
'XBa2,... ,Xca
c
).
Let
UN = ~
i=l
p
i
U(i)
N
,
and
(1.3.4)
Then, under the null hypothesis
tribution of
~ as N ->
co
H ' defined in (loLl), the limit diso
is a chi-square distribution with c-l
degrees of freedom.
1.4 Asymptotic Distribution of
Theorem 1.4.1
•
(i)
for each
XB2 Under HN:
Suppose
positive integer N,
HN
is given by
6
..
H:N:
for all real
x and not all
Qi's
are equal and F(x)
is an abso-
lutely continuous cumulative distribution function with f(x)
as
derivative with respect to Lebesgue measure,
(ii)
f(x)
is differentiable and :rex)
and f' (x)
are bounded
almost everywhere.
~ is a non-central chi-square
Then, the limit distribution of
distribution with
2
~,
c-l degrees of freedom and non-centrality parameter
given by
XB2
(1.4.1)
2
2 2
= c (2c-l) I aQ ,
where
J
00
(1.4.2)
I
=
["1 - F(yL'f-
2
r(y) dy
,
and
(1.4.3)
Proof:
The proof of theorem 1.4.1 will be a direct application of
theorem 1.2.1.
.!IN
and E
In order to apply theorem 1.2.1, we need to calculate
l
when H:N is true. We shall neglect terms of O(N- ) and
1
O(N2)
in calculating .!IN and
~~i)= E(U~i»
(1.4.4)
=
~E
respectivelY.
E(~(i»
= Pre (X
< Xjg/
iai
=
Joo .Tr
-00
(1 - Fj(X»
j
= 1, 2, ••• , c and
fi(X) dx
jJ/i
1
•
=
JOO
-00
j ~ i)
iT(l - F(X + N- 2 Q.»f(x +
J~i
J
1
;'2 Qi) dx
-rr
c
(1.4.4)(cont. )
j~i
7
1
..-
(1 - F(y + N 2(Q 4
~
..
Q.») fey)
~
dy •
Using Taylor's series expansion, we get
1
1
F(y + N-'2 (Q
j
..
Qi» = F(y) +
i'2 (Qj
-
Qi) fey) + O(N- 1 ) .
Therefore,
1
= .fI
-F
~
_;"2
(yjf-1
x
j~i
(Qj- Qi ) fI - F (y)}C-2
f(y) + O(N- l ) •
Substituting (1.4.5) in (1.4.4) and integrating, we have
1
(1.4.6)
(i \
~N J
-'2 c
N E (Q j
1
= - -
j~i
c
= ~-
N-'2 (Q _ cQi)I + O(N-I) ,
c
= E
i=l
Next,
and
•
Q
•
i
1
Qi)I + O(N- ) •
1
where
Q
-
8
(i) (i)
E( wI ~2
,r (i)
) = EL ~
) (1)(
,
(X11,X21,X11,XCl ~
.
co
1 IT
=1 IT
=
-co j~i
(xl72
["1 - Fj
Using
j~1
ax
1
co
-co
1 (x)
f
2
["1 - F(Y + N- (Q. J
Q1>l72
fey) dy .
(1.4.5), we get
nr1 - F(Y
j~1-
1
-"2
+ N (Q~
- 01»
72
J-
1
= (1
- F()2C-2-2
y)
+ O(N ) .
Hence
E(~fi) ~~~i
)
J
1
co
=
(1 - F(y»2C-2 f (y)dy + O(N-"2)
-co
1
-"2
= 2c-l + O(N )
1
(1.4.6)
Also, fram
(1.4.8)
Therefore,
=
s(1,1)
0,0, ••• 1,0, ••. 0
1 at thei-th place
Further,
~(1,1)
s
0,0, ..• ,1,0, ••• ,0
1 at the k-th place
k~1
•
=
E(~(1) ~(i»
1
2,k
-
7
X12,X22,···,X11,Xcal
E2(~(1»
1
'
9
and
1
=:
1
c-1-"2
c-1 L1-(1-F(y»
_7 + O(N ).
Therefore,
(1.4.10)
1
2 .
= c(2c-1)
•
+
0(N-2)
10
Using (1.4.8) and (1.4.10), we get
1
(1.4.11)
=
2
S(i"i)
c(2c-1)
0"0, ••• ,1,0" ••• ,0
1 at the k-th place
_
c
12
+
0(N-2)
k " i
1
=
2
c (2c-1)
This is true for all k" i,
therefore by (1.2.4)
(1.4.12)
Next,
S(i,j)
0,0, ••• "1,0,, ••• ,,0
1 at the i-th place
and
E(~ii)~~:f) = E(~(i)(Xl1"""Xil""XC1)
)
x 4l (j)( X12 ,X22 , .•. Xj2, ... X.
i1 " . 'Xc2 )
= pr(xi1<JS.1'" .Xi1< Xc1 iXj2< ~2'"
.Xj2<Xi1 ,·· .Xj2<Xc2 )
=J £! ' k"J."
TT £1-Fk(Z)}fj(z)~7TTfl-F,,(Y17
fl:1' .
x.
fXJ
-co
=j
fXJ
j
-fXJ
£l-(l-F(y) )C-17
.1 -
c -
-co
n
("i
x fi(Y)dY
-~
£l-F,,(y)7f i (Y)dy + O(N
x-
Using (1.4.9)
J
fXJ
=
•
-
C:1 L1-(1-F(Y»
1
= c(2c-l)
C"~7(1_F(Y))C-1f (y)dy + O(N~)
1
"co
-"2
+ O(N
1
).
)
II
Hence,
1
1
= (C(2C-1)
~(i,j)
0,°/",/ 1 /°/"'1°
1
."2
- ~) + O(N )
c
1 at the i·th place
Similar1YI
=
g(i,j)
0,0,.",1,0",,"°
1 at the j-th place
g(1,j)
-(c-1)
2
c (2c-1)
+
=
0,0 1 ",,1 / 0, ••• ,0
1 at the k-th place
k ~ 1, j
Now
E(~ii) ~~:~) = E(~(1)(X11""/Xi1""'Xk1",,XC1)
~(j) (X12,· u,Xj2 ,'·· '" Y'k1""
,Xc2
»'
TT fl. Ff/(u17fi (u)d~7
l~i,k
x
£.1 T.T £1-Fm(v17f j (v)iiy} x fk(Y)dor
-00 m?'i,k
•
. 12
Hence
1
(1.4.JJ.,. )
s0,0,
(i,j)
= [" C(~C-1) ••. ,1, ••• 0
-~
1
~_7 + O(N
c
)
1 at the k-th place
k # i, j
Thus, from (1.2.4) we have
(1.4.1:5)
Therefore, by theorem 1.2.1, the asymptotic distribution of
is a multivariate normal distribution with zero means and covariance
IDa trix
!:
= {O'ij },
O'ii =
where
1
c 2( 2c-1
r
)-
c
!:
k#i
1
-
7,
Pte-
and
=
1.
Hence the joint distribution of UN 1 s
is singular and therefore the
asymptotic distribution of UN's must also be singular.
•
be seen by noting that the matrix !:, is singular, since
This can also
13
c
.E
j=l
=
"ij
= 1"
(i
0
2" ••• " c) •
We shall cO&1sider the first c-l UN's and denote by
and
,
..!lo"N
( «1»
= E UN
«2»
"E UN
. «C-l»)
" . • •• E UN
•
The corresponding covariance matrix obtained from .E" be deleting the
last row and column of .E
I _ =
c l
c-lxc-l
is denoted by
Eo'
Further" let
Diagonal matrix with unity as diagonal elements and off-diagonal elements zero"
and
=
~~
(Q - cQl" Q - cQ2"'"
Q - CQc_l ) •
lxc-l
Then" from
(1.4.6)
'
..!l o,N
1
= ~c -0
J' - (N- 2
I) Q' I
-0
c-l
"
and the asymptatio d.i&tJrlibu'tion of.
is
N«I)I c_l ~o " .Eo>
Now in order to obtain the asymptotic distribution of
following lemma •
•
~, we need the
14
Lemma 1.4.1
Suppose the vector x
has a multivariate normal distribution
rXl
with mean vector
and nonsingular covariance matrix.- E , then
.
.
0
the quadratic form Xl E- l x has a non-central chi-square distriI..L
ril
-
bution 'With r
trix I:
o
0-
degrees of freedom (where
) and non- centrality parameter
2
A
=
-1
u
-I..L' E o,l;;
A proof of this lemma is given by
-1
We calculate Eo'
is the rank of the ma-
r
2
A
given by
•
Roy
£2"27.
Let
=
p'
-0
Do =
c-lXc-l
Diagonal matrix.
and
a=
c
E
k=l
-1
'0.
-l{
Then
2
c (2C_l) Eo =
c~0 + a -0
J -0
J' - c
q J' - c -0
J -0
q' •
-0 -0
Let
where ct, 13, 'Y and 8 are some constants to be determined.
Now
15
I = ZoZ·l
= I + Lr-a(~P .a) + ~(a-apc+ c) - ~c-7 -J 0 -J'0
0
e-lxc-l
e-lxe-l
e
+ [" e(a-~(l-pc)
L7 3 0
~~
+L
r- cr -e
e5(1-p
)- ~ 7 q p'
e - -0-0
7
+ Lr-r(~
•
P -a) + 5(a+e-ape ) + ~2 - -J 0 -p'
0
e
e
Equating the coefficients of -0
J J', q Jot, q p' and J p' to
-0 -0 -0 -0
-0 -0
zero and solving for
a
a,~,
rand 5, we get
222
= pc (l-pc )/c,
~ = p /e , r = p /e and
e
c
5 = -l/c
2
•
Therefore,
Z-l
= (2c-l)
o
fD-0 l
+ p e (l-pc )J
J' +
- 0;;..0
PC (p-0-0
J' + J p')
-0=-0
- -0=-0p p' 7
.
Hence
N(U N -0,
~c -0
J )'
Z-l(U N 0
-
O,
~J
)
e -0
= N(2c-l)
~ Pi(UN(i)- UN)2
i=l
2
= Y-:a'
and
_ ,-1
Il Zo _Il
=
12
"'"
Al
-0
~-l
&oJ
-0
A
"'"
-0
= ~ .
The proof of the theorem is now completed by using lemma 1.4.1 with a
theorem of Mann and vTald ["l§}.
The above result is also obtained independently by Bhapkar ["~7.
16
1.5 Asymptotic Relative Efficiency.
The concept of asymptotic relative efficiency of one consistent
test with respect to another, when both test statistics have a limit
normal distribution under the null as well as alternative hypothesis,
is due to Pitman £25.7.
£'Z};,7.
An
account of his method is given by Noether
Briefly, the idea of asymptotic relative efficiency is to se-
lect a sequence of alternatives which depend on the sample sizes in
such a manner that the powers of the two tests for this sequence of
alternatives tend to a common lUn1t less than one.
then made on a sample size basis.
The comparison is
This method has been extended by
Han.nan £15.7 to consistent tests, 'Whose test statistics have limit
distributions of same analytical form which is not necessarily normal.
In particular, in the case of chi-square distribution with the same
number of degrees of freedom for both test statistics, the formula:
for asymptotic relative efficiency reduces to the ratio of their
non-centrality parameters.
In this section, we shall compare Bhapkar's test with the
Kruskal-Wallia test, Mood's test and the usual F-test for some specified cumulative distribution functions.
The non-centrality parameters
for the last three tests are given in Andrews £5.7.
By using Hannan's result, and denoting the asymptotic relative
efficiencies of Bhapkar's test 'With respect to the Kruskal-Wallis
test, Mood's test and the F-test by eB,JC' eB,M and eB,F respectively, we have
(X)
~,K =
C
2
2
(2C_l) I / 12 £
1-00 fE(X)dx_72
,
17
where a
is the median of the distribution F(X)"
eB"F
where
~~ =
2
=
C
(2c-l) I
2 2
~F '
co
J
and
eo
2
x f(X)dx -
£J
~72
x f(x)
-co
-co
•
Example 1.5.1 Normal Distribution N(O"l)
In order to calculate the integral I defined in (1.4.2) for the
normal distribution" we make use of the for.mulae
Gupta
£ §] and Hoj lJ..1I:.7.
0
given by Bose and
Table 1.5.1 gives the values of a.r.e.
of Bhapkar's test.
Table 1.5.1
c
2
4
3
5
6
8
7
10
eB"K
1.00 0.94 0.86 0.80 0.74 0.69
0.65
0.58
eB"M
1.50 1.41 1.29 1.20 loll 1.03
0.97
0.87
eB"F
0.95 0.90 0.82 0.76 0.71 0.66
0.62
0.55
Example 1.5.2 Exponential Distribution
F(x) = 1 .. EXP£-:;7
=0
c
eB,K
2
3
1.00 1.67
(x ~ 0) ,
(x < 0) •
Table 1.5.2
6
4
5
2.33 3.00 3.67
7
4.33
8
5.00
9
5.67
10 11
6.33 7.00
eB,M
3
5
7
9
11
13
15
17
19
21
~,F
3
5
7
9
11
13
15
17
19
21
18
Example 1.5.3 Double Exponential Distribution:
= (1/2)Exp(x)
= 1 - (1/2) Exp(-x)
F(X)
(x
~
0) ,
(x > 0) •
Table 1.5·3.
2
c
ij.
3
6
5
B
7
9
10
11
eB,K 1.00 0.94
0.79
0.66 0.55
0.47
0.40
0.35
0.31
0.28
eB,M 0·75 0·70
0·59
0.49 0.41
0.35
0.30
0.26
0.23
0.21
eB,F 1.50 1.40
1.18
0.98 0.82
0.70
0.60
0.52
0.46
0.42
10
11
Example 1.5.4 Rectangular Distribution
(x < 0) ,
F(;,,) = 0
=x
=1
(0 ~ x ~ 1) ,
(x > 1) •
Table 1.5.4
2
c
3
4
5
6
7
8
9
eB,K 1.00 0.94 1.04 1.17 1.32 1.47 1.63 1.79 1.95
2.12
2.81 3.11 3·52 3.96 4.42 4.90 5.38 5.86
6.35
1.00 0.94 1.04 1.17 1.32 1.47 1.63 1.79 1.95
2.12
~,M 3·00
~,F
Remarks
For non-normal alternatives it would be of interest to campare these
tests with a test which is optimal or "good" when these particular distributions are assumed.
For exponential alternatives the F-test is surely very inefficient.
1,9
1.6 Massey's Test
Let
f'Xij }
for 1 • 1, 2, .•. , c J j = 1, 2, .•. , n
of independent (real valued) random variables.
i
be a set
The probability disc
is denoted by F (x). Let N = E n
i
J
i=l i
Zl' Z2"'" Zh be the P1- th , (P1+P2 )-th, .•••
tribution function of
and further, let
Y~.
(P + P + ..• + Ph)-th - quanti1es of the combined sample, where
1
2
h
Pj > 0
j = 1, 2, .•• , hand
For convenience, let Zo
(1.6.1)
=
Mi,t
= -~
(Number of
(i
h
.r:
" j < 1 and f'urther
j=l
and Zh+1
= 1,
+
j=l J
and further, let
=~
j'B such that
PfJ. 1"" 1- '.r: p .•
Zt_1< Xij ~ Zt)
::! , ••• , c; t = 1, 2, ..., h:.r1 ) .
Then
c
(1.6.2)
.r: M
i=l
=
i,t
PtN
(t
= 1,
2, ••. , h+1),
and
Define
2
(1.6.4)
c
XMa.
2
h+1
= ..r:
E (Mi t - Pt ni ) /Pt n i
J.=1 t=l
'
Then, under the null hypothesis, H defined in (1.1.1) the
o
asymptotic distribution of
i = 1, 2, •.• , c
of freedom.
~ as n i --> ~ such that ni/N--> pi>O
is a chi-square distribution with (c-1)h .,' c,eg1rees
We shall consider the distribution of
~. under ~.
1.7 Notation and Definitions.
Corresponding to the random vector Z where Z'
= (Zl'
define
(1.7.1)
Gi,t(~)
Gt(~)
=
=
Fi(Zt) - Fi (Zt_1)
F(Zt) - F(Zt_1) ,
i
= 1,
2, ••• , c
t
= 1,
2,
... , h+1,
Z2""'Zh)
e
20
where
Fi(Zo)
= 0,
Fi (Zh+l) = 1
F(Zo)
= 0,
F(Zh+l)
(i = 1, 2, .•. , c).
=1
Let
(1.7.2)
5i ,t
= 1
if Zt belongs to the i-th sample,
=
otherwise
0
i
t
and
8i ,h+l
= 1,
2,
= 1, 2"
...
... ,, c
h
'
for all i •
=0
Further, let
~'
= (51 ,1,51 ,2'
.•• , 51 ,h; 52 ,1"'"
52 ,h;"'; 5c ,l' 8c ,2,···ac)h)
1.8 Asymptotic Distribution of ~. Under
H:N'
Theorem 1.8.1
Suppose
(i)
F(X)
is an absolutely continuous distribution function having a
continuous, bounded derivative, f{x) with respect to Lebesgue measure,
(ii) fl (x)
exists and is bounded
(iii) for each positive integer N,
H:N: Fi(X) = F(x + N- l / 2 Qi)
for all real x and not all Qi's are equal, and
c
(iv) for each i, as
n.]. -->~, ni/N -> Pi > 0 where .E Pi = 1 •
Then the limit distribution of ~
J.=1
is a non-central chi-square dis-
bution with (C-l)h degrees of freedom and non-centrality parameter
~. given by
(1.8.1)
h+l
E
~. = (j=l
2
21
where
c
Q
=
aj
= f(A j )
f(~o)
Z Pi Qi
2
tribution function
Proof:
- f(~j_1)
= 1,
(j
2, ... , h+1),
= f(~+l) = 0
and A , A , ••• ,
1
'
i=l
~
are the respective
h
quanti1es of the dis-
F(X) •
The joint probability density function of [Mi,t] and
{Ztl
is given by
c
tTn.~
=
i~i
Z
8
c
c
J.
h+l
1f IT mi
i=l t=l
'
t~
h+1
5i t
lTTTm'
i=l t=l i,t
Zl' z2"'"
c
h+1
zh )
JIll
rG (z) 7 i,t
- 1,t - -
- 8 i ,t
m
x
where the summation Z
th~
'l-.' s
h
1=1
t=l
h
c
8
1,t
possible values of the
(1=1, 2, ... ,c; t=1,2, ... ,h+l),
Zt
where
is over all
c
IT TT f f 1 (Zt L7
~a t1sfy
22
e
c
E ni Gi t (2:)
i=l
'
(1.8.4)
= Pt N
(t=1,2, ••. ,h+l) .
Obviously,
h+l
E
Ui,t
t=l
(1.8.5)
( i:::l, 2, n
= 0
• ,
c) ,
and
c
1/2
E n
Ui,t
i
i=l
(1.8.6)
=0
(t=1,2, ••• ,h+l) •
Then we obtain from (1.8.2), the probability density function of
as
and
=
-h/2
N
c
1T
i=l
h/2
ni
E
0
c
TI
i=l
x
c
h
IT
TT
L f . ('A.t
i=l t=l
J.
Consider a particular term, say
o
-g
of the random vector
(1.8.8)
o
i,g
;::
o
h
E
t=l
Then
c
E
i=l
oi,g
=
h
+ N-
1/2
w~'>7
01 t
'
v-
s , in E corresponding to a value
g
and let
For large values of ni's, using Stirling's approximation to factorials, we get
(1.8.9) log Sg
= -(ch/2)
c
c
1
loge 23t) - 1: ni + t (ni + '2) log ni
1=1
1=1
1/2
(niGi t(~) + ni
Ui,t)
i=l t=l
'
c
+ 1:
c
h+l
1:
h+l
1/2
8 t- 109(niGi t(~) + ni
i=l t=l 1" g ,
+ 1:
c
+ 1:
ui t)
1:
h
1:
i=l t=l
ait- log f i ( A.t
"g
:;
Gi,t(~)
1
-1/ 2
+ N
'W
t
)
•
+ N-l/2(Wtfi,K(A.t)-Wt_lfi(A.t_l»
+ O(N- l )
J
and
()
(-1/2)
f i ( A.t + N-1/2)
'W
t = f i A.t + 0 N
-
Substituting these values in (1.8.9), and after some simplification,
'We get
c
c h+l
= -(ch/2)log(2n )-(h/2) E log n. - E Z
g
i=l
~
i=lt=l
log s
1/2
1
(
(niGi , t(~)+n.~~,
u. t+ -2)log G.~, t ~)
-
c h+l
1/2
1
- ~ Z (n.Gi t(~) + n i
u. t + -2)
i=lt=l ~ ,
~,
x
).
Therefore
(1.8.11)
where
Pi~Wtfi(~t)-Wt_lfi(~t_~72
Gi,t(~)
and
c
g
= (2",
/
r ch/ 2 i=l
nc n..i (h/2-5i, g ) i=l
TIc 1;1),
I I r G. (~) 7 -1 2
t=l - ~,t - c
h
IT IT ~f.(~)7
i",l t=l
~ t-!-
x
5 i t.
"g
substituting the value of s,
given above, in (1.8.7) and after
g
same simplification, we obtain
wl w2"" ,wh )
=
C~EX:P[" -1/2 0/_7
1 2
+ O(N- /
t7
25
where
C = (2rr
tt
r Ch/ 2 1=1
Tr n~/2
fr rG (7-.)7 .1/2
J. 1l:i t=l - 1, t c
+r e
e
h
11 TIff (7-. )7 1,t;g 11 p 1,g •
1=1 i
g 1=1 t=l 1 ~
l:
Thus, the asymptotic distribution of {Ui,t}
variate normal distribution.
and fWt
J
is a multi·
But from (1.8.5) and (1.8.6) we observe
that not all Ui,t'S are independent.
Therefore, we shall consider
the joint distribution of h(c.1) Ui,t'S say U11 , ••• ,U1 ,h;
U21 , U22 ,···,U2hi ... Uc_1 ,1,Uc _1 ,2'" Uc _1 ,h and W1 , W2 ,···, Who
Then from (1.8.12) we have
(1.8.13). f(u11,u12,···u1,h;u21,u22""u2,hi ••• uc.1,l,uc_l,2, •••uc_1,hj
w1 ' .• • ,wh )
where
<r>
=
c·1
c-1
h
h
l:
l:
a i t'i' t' u4 • t ui',t' +
.,
i=l i'=l t=l t'=l " ,
h
h
l:
l: bt t' wt wt '
t=l t'=l '
l:
l:
c-1
- 2
h
h
,
l:
l:
l:
d t.t' uJ.'.t wt '
,
i=l t=l t'=l i "
and
(1.8.14) ai,t;i' ,t' =
p~l ~Pi/GC,t(~)
+
PC/G1,t(~)
if 1
+
Pi/GC,h+1(~)
= i', t = t'
,
26
= :p~1.LPi/GC,h+l (~)
,., (PiPi ,)
+ PC/Gi ,h+l (!O17
1/2/P G ,h+l (~)
c c
,., 0
otherwise.
Further,
=
pi/2 £fi(')...t)/Gi,t(!::). -
f C(')...t)/Gc ,t(?;;L7 if t=t'
t=1,2, ... ,h,
=pi/2 £fC(At_l)/GC,t(!::)-fi(')...t_l)/Gi,t(?;;)_7if t'=t-l,
=
pi/2£fi (~)/Gi1h(?;;) -
fC(~)/Gc,h(?;;L7if
H:N'
Now tUlder
Fi(X) ,., F(x + N- 1/ 2 Qi)
1
and
fi(X) ,., f(x + N- 1/ 2 Qi)
for all real x.
Theref'ore,
(
-1/2)
F(')...t + N-1/2)
Q
Q
i - F i\t_l + N
i
=
1 2
1
F(At ) - F(')...t_l) + N- / at 0i + O(N- )
where the a t 's are defined in (1.8.1) •
t' =h.
27
Using the above relation in (1.8.4), we get
(1.8.18) N pt
-1/2
c
= N Gt(~) + N
at E n. Qi + 0(1) •
i=l ~
Hence as ni -> 00 ,
and
(1.8.20)
Gi,t(~)
1
1 2
= Pt + N- / at(Qi - Q) + 0(N- ) •
substituting the values of Gi,t(~) given by (1.8.20) in (1.8.14)
and (1.8.16), we get
(1.8.21) ai,t;il,t l =
=
p~l(Pi+PC)~l/Pt + 1/~+1_7,if i=i l , t=t'
p~1(PiPi,)1/2~1/Pt ~ 1/~+1_7
if
= (Pi + pc)/PcP;n+l '
= (PiP i ,)
1/2
i~i I ,
t=t I
if i=i l , t;6t I ,
/PC~+l
and
for i = 1, 2, ••• , c~l
t, t' = 1, 2, ••• , h _
Therefore {Ui,t}
fWt JCt
= 1, 2, ••• ,
(i = 1, 2, .•• , c-l; t
= 1,
2, ••• , h) and
h) are asymptotically independent and the limt
distribution of {Ui " t} (i = 1, 2, ••• , c-l; t = 1, 2, •• " h) is
a multivariate normal distribution with zero means and covariance
rna tri~ E, where
28
~-1 =
«a i,t ;i',t'»
and a i 1 t ,•.J. , 1 t' are defined in (1.8.21) •
Substituting the values of Gi,t(~) given by (1.8.20) in
(1.8.,), we get
Therefore, as n ->
i
00
Let
) -1/2
.... nc--1/2(
1 Mc-,
11- PInc- 1 ••• nc- 1
and
Then the asymptotic distribution of the random vector M is multivariate normal with mean vector
Q
and covariance matrix
~,
where
Therefore, by Lemma 1.4.1, the asymptotic distribution of
is a non-central chi-square with (c-1)h degrees of freedom and
non-centrality parameter ~.
given by
29
,
2
A..-=.
1"\'
- "Ma -_ _
.., ",--
rI
""
£.J
h+l 2
c
2
) ( '"
. (rl
" a j /p.
p.
"'1 - ?i
""»
j=l
J 1=1 J.
=
('"
£.J
•
This completes the proof of theorem 1.8.1.
Remark:
In case h
= 1,
2
= P2
"1
AMa. = 4Lf(A1 L7
2
= 1/2 ,
c
2
(1~l Pi(Qi - '0) )
where Al is the median of the distribution F(x). The above expression for ~. agrees with the one given in Andrews L~7 for
Mood's test.
1.9
Asymptotic Relative Efficiency of Massey's Test.
In this section, we shall compare asymptotically Massey's
test 'With other nonparametric tests and the F-test.
First, we
observe that for a specified choice of F(X), the non-centrality
tribution with
on~
c
2
upon E Pi (Q1 - '0). Secondly, for h ~ 2,
i=l
Massey's test statistic has asymptotic non-central chi-square disparameter depends
(c-l)h 'degrees of freedom, while the other test
statistics under consideration have asymptotic non-central chisquare distribution with (c-l) degrees of freedom.
simple formula
Thus Hannan's
mentioned in section 1.5 is not applicable here.
Instead we shall use the following consideration.
For each positive integer n, denote
(nli)(n),
i
= 1,
2 where each
positive integers.
n(i)(n)
j
n~i)(n), ••• , n~i)(n»
is a sequence of increasing
30
Let
(i)
c
(
N = L n i)(n)
n
j=l j
and
A'
~n
= ( N-1/2( n ) Ql'
N-1/2( n ) Q2' •.• N-1/2(»
n Qc
where for some pair (j,k) Q
j
->
Q
k and suppose
n~i)(n)/N~~~
Lim
n
~
c
= Pj >
00
Further, let
¢(l) and
n
'
¢(2)
° where J=l
.L Pj = 1 •
be sequences of Massey's
n
test and some other competitive test under consideration based on
N(l)' and N(2)'
-n
respectively and having the same fixed signifi-
-n
cance level a
for testing the null hypothesis defined in (1.1.1).
l
We shall consider alternatives of type Hn : Fj(X) = F(X+N- / 2(n) Qj)'
j = 1, 2, ... , c, where F(x) is a specified distribution satiso
fying the conditions of theorem 1. 8.1.
Denote by p(i) (¢(i), F, QI) the power of the i-th test. Now
n
-n
if there exist two sequences N(i)' 1 = 1, 2 which satisfy (1.9.1)
-n
and are such that
(1)
for each p~sitive integer n, ¢(l) and ¢(2) have same signifin
cance level a
(ii)
->
,
pel) (¢~l), F, ~~l)') = Ltm
Lim
n
o
n
00
=
where
n
° < f3 o < 1;
and
->
f3 o ,
p(2)(¢~2), F, Q~2»)
00
f'or all
Then the asymptotic relative ef'ficiency of test ¢(l)
pect to test ¢(2)
j.
with res-
is defined as
We know that 'When H
holds, Massey's test statistic and the
n
test statistic corresponding to any other competitive test under
consideration have limit non-central chi-square distribution with
(c-l)h and (c-l) degrees of' f'reedom.
Theref'ore to satisf'y the
requirement (1i) we must have
(1.9.;) A2
~
0o'~o'
h( 1)
= c
c-
~
1,F j=l
p
j
(g(l).
j
g(1»2 ,
and
,2
""0 ,(3
o
2
where A
.
~
°o"~ 0'
0
,(c-l)
h(C-l)
~
= c2 ,F
2
and A
(Q(2) _ g(2»2
p..
j=l J
~
°o'~ 0'
J
(-1) are the tabulated values
c
of non-centrality parameters given by Fix
£19.7
and cl,F' c2 ,F
are constants depending on F(x) •
Substituting the values of
Q~2)
given by
we get
(1.9.5)
Therefore, f'rom
(1.9.;) and (1.9.4) we have
=
•
Lim.
n
->
IX)
(1.9.2) in (1.9.4),
32
1.10 Examples.
Example 1.10.1
F(x) =
(i) h = 3,
f
Normal Distribution:
x
1 2
-00
= 1/4
Pj
4
2
E a
j
j=l
= 4,
(ii) h
(21rr /
p. =
J
(j
/p .
F;}cpL- 1 /2 t 2_7dt
= 1,2, 3, 4) ,
=
0.8629.
J
1/5, (j = 1,
5 2
E a j /Pj
2, ;,
4, 5) ,
= 0.8969 .
j=l
(iii) h
= 7,
= 1/8,
Pj
8
2
j:1
(iv) h
= 9,
P
j
10
aj
= 1/10,
2
j=l
= 1,
2, ••• , 8),
= 0.9146.
/Pj
E a j /Pj
(j
=
(j
=1,2,
0.9590 .
.•. , 10) ,
(-00
~ x ~ 00)
33
Table
1.10.1
Asymptotic Relative Efficiency of Massey's Test
for Normal Alternatives
ao
No. of
samples
= 0.05
No. of
quantiles
w.r.t.
A. R. E.
Mood's
Test
Kruskal
Test
F-test
'lw..M
'lw..K
'lw..F
c
h
2
3
1.00
0.67
0.64
3
3
0.98
0.66
0.63
4
3
0·97
0.65
0.62
5
3
0.96
0.64
0.61
2
4
0.96
0.64
0.61
3
4
0.93
0.62
0·59
4
4
0·91
0.61
0.58
5
4
0·90
0.60
0.57
2
3
4
5
7
7
7
7
0.87
0.83
0.81
0.79
0.58
0.56
0.54
0.53
0.55
0·53
0·51
0·50
2
9
3
4
5
9
9
0.80
0.76
0·73
0·71
0.53
0.51
0.49
0.47
9
'"
0·51
0.48
0.47
0.45
Example 1.10.2
F(x)
Exponential Distribution
=
1 - E"A'P£-!;7
(x ~ 0)
=
0
(J~
< 0)
.
/')...2
,
For any h,
= (Pl-l-l) ')...2
ao,~,(c-l)
0
=
iMa.F(aO'~O)
a,~
0
0
,h(c-l)
,
2
2
a.
= ( P-1 - 1 )')...a,~,
(
) I 3')...a,~,h
(
) •
Ma.,Kl
c-l
c-l
o 0
0
0
In this case, the asymptotic relative efficiency of Massey's test
depends only on
Pl' hand c and for given hand c, it can be
made arbitrarily large by choosing
Example 1.10.3
P
l
sufficiently small.
Rectangular Distribution
F(x)
=
0
=
"'"
.~
== 1
(x < 0)
,
(o:s x:s 1)
,
(x > 1) .
For any h,
a.,M(ao'~o)
=
Ma
l
1
2
2
(P-l + '-n+l
0.()
) '). .a,~, ()/4')...
c-l
a
,~ ,h c-l ,
o 0
0
0
= e"....
1rJ&:M,
F (a0 ,~ 0 ) •
In this case, also, the asymptotic relative efficiency for given
hand c, can be made arbitrarily J,arge by proper choices of
and f1J.+l'
Pl
35
E.:xample 1.10.4
Double Exponential Distribution
F(x)
= (1/2)
Exp (x)
(-00
denote the
l~
x
5 0),
(x > 0) •
= (1/2)£2-EXP( -x)]
For given h, let k
~
integer less than or equal
to h such that
k
Z P
j=l j
:5 1/2
,
and let
k
Z
j=l
Pj
= ~ :5 1/2
Then
1Ma,M(aO'~o) =
~(1_~)2 - ~+l(l
-
~17
,
x
2 '1t+l
1.11 Asymptotic Power.
The power of a test is defined as the probability of rejecting
the null hypothesis when the alternative hypothesis is true.
We have
already seen that when the alternative hypothesis of type
EN: Fi(X) = F(x + N- l / 2 Qi) is true, both Bhapkarts test criterion
and Massey's test criterion have limit non-central chi-square distributions. Moreover, for a specified choice of F(X), the non-
•
centrality parameters depend on Ql' Q2' ••• J
Q
c
only through
c
E Pi (Qi - Q)2. Therefore, for a given choice of F{x) and
i=l
given sample sizes nlJ n , ••• , n and a given alternative
2
c
Fi{X) = F(x + 8i ) where 8i 's are fixed n~
bers, we compute
,
and
Corresponding to the calculated value of non-centrality parameter
and number of degrees of freedom, the
power is obtained by re-
ferring to the tables of non-central chi-square distribution
given by Fix
-e
112.7
1.12 Remarks.
We have made an asymptotic comparison of Massey's test with
other non-par.ametric tests when the alternative distributions differ only in locations.
It would be interesting to make a compari-
son when the alternative distributions differ in more than one
parameter.
However, the difficulty in this caseJ is not only in
finding the asymptotic distributions of the test criteria, but also
except in the case of location and scale alternatives, in specifying
the alternative distributions in convenient forms so that they
differ in more than one parameter.
In the case of location and scale alternatives, if we assume
(1.l2.l)
•
37
and some further conditions on F(x), then it can be shown that
the asymptotic distribution of ~.
is a non-central chi-square
distribution with (c-l)h degrees of freedom and non-centrality
parameter ~ given by
h+l
~. ="t=l
~
where
f(~t)
at
=
bt
= ~tf(~"t)
f(~o)
f(~t_l)
-
= f(~+l)
-
~t-l f(~t·~l) ,
=0 ,
c
~ts
c
~ :Pi E i '
'5 = ~ Pi 8 i
i=l
i=l
are defined in section 1.8 ••
'£
and the
J
=
In particular, for Mood's "test
~.
where
A.
=
4f
2
c
(A.l ).L ~ «E i -'£) + A,1(81 -'5»2_7,
i=l
is the median of the distribution F(x).
l
Unfortunately" in this situation, the asymptotic relative
1
i S and 8i 's. It would be interesting
to compare these two tests for different choices of values of
efficiency depends on
6
Ei'S and 8 's and also to study the behavior of other nonparai
metric tests based on ranks when the alternative defined in (1.12.1)
is true •
.
•
CHAPTER II
ON THE ASYMPTOTIC BEHAVIOR OF SOME NON-PARAMETRIC TESTS
USED IN ANALYSIS OF VARIANCE (TWO-WAY CLASSIFICATION)
, 2.1
Introduction
In this chapter we shall consider the asymptotic distribution
of some non-parametric tests used in the analysis of variance setup with
t
treatments and b
blocks.
Let or Xij } (i .= 1" 2} ... " tj j = 1" 2, ... , b) be
bt indepen-
dent (real valued) random variables.
Let
lative distribution function of
and suppose for each
where
X
ij
Fij (x) denote the cumu-
f3 j
(Xi denotes the i-th treatment effect,
block effect and
function.
F(X)
(i, j)
denotes the j-th
is a continuous cumulative distribution
Then the null hypothesis to be tested is
Mood and Brown
["29,.7
have proposed a non-parametric test for
testing the hypothesis
Let
Xj
H in a complete two-way classification.
o
(j =' 1, 2, .•• , b) be the median of observations in the
j-th block and in two-way table" let the observation
placed by
mi
•
+1
Xj
if it exceeds
be the number of
+l's
in the
tic used by Mood and Brown is
or by
0
xij
if it does not.
i-th row.
be reLet
Then the test statis-
t
(1.2.2) ~. = ft(t-l)/d(t-dt7 i:l (tni - ~d )2,
where
d=t/2 if t is even and d = (t-l)/2 if t is odd. Unless
b
is small, Mood and Brow have shown that the limit distribution
of
~ is a chi-square distribution with t-l degrees of freedom
when H is true.
o
Another test criterion suggested by Friedman
of the re.t1ks of the observations.
xij
-e
r ij
makes use
denote the rank of
when the observations in the j-th block are arranged in as-
cending order.
on
Let
fll7
r ij IS.
Then calculate the usual
F-test statistic based
An equivalent test statistic is
where
b
rio = !:
j=l
rij/b
Under the null hypothesis
H, the limit distribution of
o
a chi-square with t-l degrees of freedom.
v2
is
~r.
These test statistics have been generalized to the case of
incomplete block designs by Bhapkar ~~7 and Durbin ~27.
In this chapter, we shall consider the asymptotic distribution
as
b -->
~,
of the Mood-Brow test statistic, of its generaliza-
tion to incomplete block designs and of Friedman's test statistic
when the alternative hypothesis is
•
for some pair (i, i
I ) •
The asymptotic distribution of Durbin I s test
statistics under
~,
has already been considered by Van Elteren
39
40
end Noether
.L217·
Finally an asymptotic comparison, in Pitman's sense, of the
Mood Brown test is made with Friedman's test and the F-test and also the Mood-Brown test in case of balanced incomplete block design
is compared with Durbin's test.
Asymptotic Distribution of ~. Under
2.2
Hb'
Theorem 2.2.1:
Let
lxij1(i = 1, 2,
.. " t;
j =
1, 2, ... , b) be bt
inde-
pendent random variables and let Fij (x) denote the cumulative distribution function of
= F(X
ij .
Further, suppose
X
i + ~j) where F(x) is a continuous
cumulative distribution function with bounded first and second deri(i)
Fij(X)
vatives
f(x)
and ff(x) respectively, and
l 2
a i = b- / Qi'
Hb:
(ii)
+ v + a
'Where Qi·~ Qi'
for sane pair (i, i') .
Let
M'
= (ml
-
bd
t '
m2 -
bd
t ' ,.. ,
bd
mt - t)'
where mi's are defined in section
Then under the hypothesis
(a)
Hb'
~.l.
as b ->
co
the asymptotic distribution of the random vector
b.- l / 2 M
is a multivariate singular aomal distribution with mean vector
H and covariance matrix E
•
= (~i,il)'
where
41
.
Il
= -t<~:i)(Qi-Q)
i
co
J LI"'F(;~17d ~d(x)f2(x)dx
...
-co
if t
co
is odd,
= .. t(~:i)(gi-Q>f Ll_F(x17d-IFd-I(X)f2(X)dx
(2.2.lb)
-co
if t
is even ,
and
(j.
J.,
d(t_d)/t 2
if =
if i = i f ,
= -d(t-d)/t2 (t-l)
t
and where
=
~
i=l
if t
= t/2
Qi' d
if i
if t
~ i' ,
is even and d
= (t-l)/2
is odd,
and
(b) the asymptotic distribution of ~ is a non-central chi-square distribution 'With t-l degrees of freedom and non-centrality parameter
~, given by
2
<::i)
L
J
co
Ll_F(xl7d... 1Fd(X)f2(x)~72
-co
c
Z (Q .. Q)2
i=l i
(~:i)
2
if t
is odd,
co
Lj Ll_F(x17d-1Fd-l(X)f2(x)~72
-co
if t
is even.
In order to prove this theorem we need the following multivariate
form of the central limit theorem due to Bernstein.
Theorem 2.2.2
•
(Bernstein 13_7)
Given a sequence (]Cl,nk' X2 ,nk, .•• ,Xt ,nk) (n = 1, 2, .•• j
k = 1, 2, .•• v(n)j lim. v = co) of sets of random vectors in Rt ,
n
->
co
42
independent for each fixed n, with E(Xi,nk)
=0
(i
= 1,
2, ••• ,t),
let
Assume that
Lim.
v
v
-1
Z
n -> co
Then as
n -->
co,
~(nk) (r,s)
p,q
=~
p,q
(r,s)
,
k=l l,nk
~
X nk' ••• ,
2
k=l'
~ Xt nk )
k=l'
has a limiting normal distribution with mean
covar1ances
.
~
(p+q=2) •
the random vector
(~X
v -1/2
.e
k=l
(0, 0, ••• , 0) and
p,q (r,s).
Proof of Theorem 2.2.1:
-
Let Xj
denote the median of observations in the j-th block
and define new random variables fUi,j} as follows:
(2.2.4)
=0
and let
Ur
-j
Then
•
otherwise
Let
(2.2.6)
::
1'I1.j
,
E(Ui"j)
and
b
(2.2.7)
1'Ii
::
E(mi )
For a fixed t, let T(t)
::
E
j=l
1'Ii,j
denote the set of integers 1, 2, .•• ,t.
We shall consider first the case when t
is odd.
(t:: 2d+l)
Case 1:
(2.2.8)
•
For a fixed u
f
i" wbere
i, u
6
upi1(i ,i " ••• i ) :: pr'(Xi1,j
d
l 2
(2.2.• 9)
.e
T(t), let
~ Xu,j'
Xi2 ,j
~ XU,j'
....
Xid' j -< Xu,J.j Xi , j > XU,J.
•••• Xi
where
~
d+l'
j
> ~ j"" Xi
,
ed-l"
j>~ j)'
#i
or u and (i ,i , ••• ,i ) is a specified combination
l 2
d
of d integers from the subset T(t)·(i,u). Then
where the inner summation is over all combinations of d integers
from the subset T(t) -{i,u} and the outer s'l.Ull17lation is over all
treatments other than the i-th treatment .
•
44
substituting the values of Fi,j(X)
from (i) and (ii) (2.2.11),
we get
xdx.
Put
Then using Taylor's series eJqJansion, we get
F(X+V+~j+b-l/2Qi) = F(Y+b- 1/ 2(Qi- Qu»
(2.2.13)
• F(Y)+b-1!2(Qi-Qu)f(Y)+O(b-1),
and similar expressions for other F's in (2.2.12).
Therefore,
•
Hence" substituting the above va.lue in the integrand of (2.2.12)
and integrating out the first term, we obtain
2d-1
_ b- l / 2(Q + E
Q -dQ)x
i k=d+1 ~ u
co
J L1-F(Yl"f-~d(Y)f2(y)dy
·e
-co
d
+
co
.
b-1/2(k:1Q~-d gU)~coL1-F(Y17d
Fa..-1(Y)d2(y)dy
+ O(b- 1 ) •
Now out of
lar integer
{~=~)
d
( 2d-1)
)
combinations of type (
i 1 "i2 , •••
"id , a particu-
s~i or u will occur (::~) times
times in (ia+l, ... ,i2d_l)'
(2.2.16)
(11i1~~ ... ~id) and
Therefore
E
p(j)
= (2d-1).E,!d+i ,Id+I
d
(i ,1 , ••• 1 )u 1;(il ,12 ,···1a )
/2d+2
1 2
d
-
•
in
2
b- 1 / (Q1- d Q ) x
u
46
co
J
1 d
2
["1-F(Ylr- F (Y)f (y)dy
-00
co
_b- 1 / 2d QuI L1-F(Y17~d-1(Y)f2(y)dy_7
-co
J
00
_b-1/2(~:~)
( E Q)
SFi,us
2
1 d
L1_F(Y17d- F (Y)f (y)dy
-00
00
+
b-1/2(~:~)(
E
Q
SFi,u s
) IL1-F(Y17~d-1(Y)f2(y)dy
-00
+ O(b- 1 ) .
Now
t
E
Q=tO-Q-Q.
i
u
s~i,u s
·e
substituting from (2.2.17) in (2.2.16), we get.
co
= d/t(t-1)
_(2d~1)b-l/2.L(Qi-dQu>j L1-F(y17d-1
-00
x Fd (y)f2 (y)dy
co
- d QuI
.L1-F(Y17~d-1(Y)f2(Y)d:[,7
-00
J
00
_b- 1 / 2(:':;)(t Q
+
b-1/2(~:~)(t Q +
•
1
O(b- ) •
~. Qi
- Q > .L1-F(Y17d-~d(Y)f2(y)dy
u -00
Qi -
Qu)~~1-F(Y17~d-1(Y)f2(y)dy
47
Hence
"i.1
1
00
d 1
[1-F(Y17'7 - (y)t'2(Y)iJ¥
d
2
x F (y)f (y)ay)
·e
8mplifying the right hand side of the above expression by
using
we obtain
J
00
l 2
11i.1 = d/t - t b- / (Qi_Q)
2
d l d
LI_F(Y17 - F (Y)f (y)dy
-00
•
+
1
O(b- )
J
J
and thus
co
J
l 2
1')i = bd/t - tb / (Qi - "Q)
(2.2.21)
f1_F(y17d-1Fd(y)f2(y)dy+O(1).
-co
Next, let
2
= E(Uij ) - '\j
=
d(t-d)/t 2 + O(b- l / 2 ) ,
and
As before, for a fixed u ~ i or
~
j)
p . i') (i
u~,
j
i
i )
l' 2"'" d
= Pro (Xi ' j
i',
let
< XU',
j , .• ,Xi
< X j .X. j > XUJ. ,
d,j- u, ~
-
> ~,j) ,
where (ilJ i , .•. i ) is a specified combination of
d
2
the subset
T (t)-{i, i' , u)
.
d
integers from
Then
Following the same procedure as above and after considerable
simplification, we get
•
49
and hence
Further, let
Then
Case II
As
t
= 2d
in case I,
and
·e
Let
UPi(~()1l' i 2""
#
i d)
Xi j < X j""'Xi j < X jjX. j
= prr
-,
- u
d - U
1.,
Xi
where u
~
i,
d+l
j
> X j"" Xi
u,
2d-2
> XU, j
j > X j-7
u,
~ ~
i or u and (i l ,i , .•• i d ) is a combination of d
2
integers fram the set T(t) - '{i,u}. Further, let
i PHli ,i , ... i ) =prfXi,j
l 2
d
Xi
•
~~,j'''' 'Xidj ~ Xijj
d+l
j > Xij ... , Xi
j> xi j-7
2d-l
where (i l ,i , ••• i ) is a combination of d integers from the subd
2
set T( t) - f i J.
50
Following the same procedure as in case I, we obtain
(2.2.25)
E(U
ij )
= d/t - tb-l/~(~:i)(Qi
J
co
- Q)
(1_F(y)d-1Fd-l(y)
-00
f2(Y)dY + O(b- l ) •
Similarly,
~i~f(i,i) = d(t_d)/t 2
l 2
+ 0(b- / ) ,
~i:f(i,il) = -d(t-d)/t2(t-l)
l 2
+ O(b- / ) ,
and
The proof of part (a) of theorem 2.2.1 is completed by verifying
that the conditions of theorem 2.2.2 hold.
We shall take
Xij
v(n)
= Uij
= nand
- 'lij
Then from (2.2.23), (2.2.24), (2.2.26), and (2.2.27), we have
2
-det-d) /t (t-l)
•
=
d(t_d)/t 2
if
if
i ~ it,
i ~ i' ,
51
and
Lim
b->
Hence as
00
b --> 00, the random vector
b - l !2(m-~ _ ~l' ~ - ~2"'"
mt -
~t
)
has a limiting normal distribution with mean vector (0, 0, ••. , 0) .
and covariance matrix Z
~i
.I
"J.
= «~i,il»
where
= d(t_d)!t 2
if i
= -d(t-d)!t2 (t-l)
if i ~ i' .
= i',
t
Moreover Z ~i i' = 0 for i = 1, 2, .•. , t. Therefore"
i'=l
'
is a singular matrix. Thus, as b -> Q) the random vector
~
has a limiting multivariate singular normal distribution with mean
vector
H
and covariance matrix Z
are defined in
(b)
(2.2.1)
and
(2.2.2)
= «~i,i'»
where ~i
and ~i"i'
respectively.
We note first that
t
~
. 1
m.
=
J.
J.=
bd
and therefore mi's are linearly dependent.
m Is
i
Consider the first t-l
and denote by
M
'
-0
=
bd
(ml - t
Then by part (a)
1
b - !2
Mo
'
bd
bd
m2 - t ' ... mt _l • t ) .
52
has a limit normal distribution with mean vector
ance matrix ~ o
lJ.
= «aO"J., it»
I
o ,1
(1
= \.1 i
~
-0
and covari-
where
= 1,
2,
... , t-l)
,
Let
Then
if i
::0
.e
= l'
,
if i ~ l' .
t{t-l)!d(t-d)
Therefore by lemma 1.1.4, the asymptotic distribution of
~
t
"'M = f t (t-l)!bd(t-d'-7
= 1;b
M'
-0
~ (m i=l l
bd 2
t
)
~-l M
0-0
is a non-central chi-square distribution with t-l degrees of freedom and non-centrality parameter
2
~
given by
if t
•
is odd,
5;
(:=i)2
Ilf
l-F{y17d - Voley)
..co
f
2
2
{Y)d:.l.7 x
if t is even .
2.; Asymptotic Distribution of ~. Under ~ •
Theorem 2.;.1:
Let fXij} (i = 1, 2, .•• , tj j
= 1,
2, ••• , b) be bt
inde-
pendent random variables and let
be the cumulative distribution function of Xij
effect of the i-th treatment and
~j
where a
is the effect of
i
is the
j-th block.
Suppose
(i)
F{X) is a continuous cumulative distribution function with
bounded first and second derivatives
(ii) for each b,
Hb:
ai
= b -1/2
f(x) and f'(x) respectively,
Qi' where
~
i r Qit
Q
for same pair Qi,ii) •
Let I'ij
denote the rank of the observation xij when the t
observations in the j-th block are arranged in ascending order of
magnitude and further, let
r1. :: b
-1
b
r ij
j=l
The the asymptotic distribution as b ->
•
(2.;.1)
~.
l2b
= t(t+l)
Z
t
Z
. 1
J.=
(r
1.
co,
...:t?:!:!) 2
2
of
54
is a non-central chi-square with t-l degrees of freedom and noncentrality parameter
~Fr2 •
2
~Fr.
given by
= (12t/{t+l})
.rioo
J.ty}
2
f (X}dx72
-
~
(Q _ Q)2 ,
i=l i
where
Proof; Define new random variables
u(j)
ii'
=1
if Xij > Xi'j
otherwise.
=0
Since we have assumed that Fij(X)
,
is a continuous cumulative
distribution function, therefore
for each j.
Hence
Let
and
Now under
Eb '
E(U~V}
= Pro (Xij
> Xi'
i
i
~
i'
00
="[00 F(X + v + t3 j + b-l/2Qil )f(X+v+t3j+b-l/2Qi) dx
55
Using Taylor's series expansion, we have
Therefore,
(2.3.6)
~
=~
E(Uli?)
~
F(y)f(y)dy + b-l/:(Qi,-Qi)
~f2(y)dy+O(b-l)
1/2 + b- l / 2(Q1,-Q1>j f2(y)d;y + O(b- 1 ) .
=
'-CO
Thus
'I'l1j
= E(r1j )
t
= E
1t~1
.-
~
2
1
1
Ll/2+b- / 2(Qi,-Qi)f f (y)dy+O(b- t7+ 1
u..~.
= (t+1)/2 + tb- 1 / 2(Q -
Qi>j~ f2(y)dy
+ O(b- 1 ) •
-~
Let
Now
2.
E(r ij )
= EL,r
t
= E
1 ' ~i
t
E
(j)
Ui1 ,
1'~1
t
(j)
+ 2E~ E U
7 +1
1'~i 1i'-
{(j»2
E U1i '
+ Z
E(U(j»
ii'
«j)
(j»
£
E Uii I U11 "
1 '~1 1" ~1, i '
2
= E(Ult ~ )
=
and
.12
1/2 +
( (j»
+ 2 E E U11 , +1
1 ' ~1
~
b-l/2(Qi,-Qi)~
-~
f2(y)dy + O(b- l ),
1
J
ClO
=
F(y+b-1/2(Qi r-Q~)F{y+b-l/2(Qi,,-Q1»f(y)dy
-ClO
2
1
= [ F (y)f(y)dy + O(b- / 2 )
-(J()
=
1/;
+ O{b-
1/ 2 )
Therefore,
E(r~j)
= (t-l)/2
=
+ (t-l)(t-2)/;+(t-l) + 1 + O(b- 1 / 2 )
1 2
(t-1)(2t+l)/6 + O(b- / )
J
and hence
(2.;.8)
~i:I(1,1) = (t+l)(2t+l)/6
- (t+l)2/4 + O(b-1/2)
= (t2.1)/12 + O(b- 1/ 2 )
.e
•
•
57
JF(X+V+Qklb-l/2+13j)Ll.F(X+V+13j+b"1/2Q~7
00
I:
-00
x f(X+v+13j+b-l/2Qi')dx
J
00
=
l 2
F(Y) Ll".F(y17f(y)dy + O(b- / )
-00
l
1/6 + O(b- / 2 ) •
=
Similarly,
if k ~ 1, i' ,
.e
<=
J
fl-F(X+v+13j+b-l/2
Q117fl-F(X+V+13j+b~1/2Q1._7
-00
x
f(X+~ j+b-1/2 ~)
dx
00
=
J
l 2
(1_F(y»2f (y)dy + O(b- / )
-00
=
1/3 + O(b- l / 2 ) ,
and
00
= LJL1-F(X+V+13j+b-l/2Q117f(X+V+13j+b-l/2Qk)~7
..00
00
X fjfl-F(X+V+(3j+b-l/2Q1.17f(X+v+f3j+b-l/2Qkl)
-00
x ~7
co
= ~~~1-F(Y17f{y)dy_72
l 2
+ 0(b- / )
-co
=
Therefore,
(j)(
')
t-2 t-2
t .. 2
(t-2)(t-,;)
(t_l)2
( 2·'.9 ) 1J.1,1
i,i
= T
+ T + -,- +
4
.. 4
+
= -(t+l)/12
l
+ 0(b- / 2 )
° (b- l / 2 )
i
~
i' .
Similarly, it may be shown that
p;
= max(Elrlj" ~ljl';,
- 7,
~ max (E'
/4
Ir2j-~2j 1'), ... ,Elrtj
E{
4 _"1/4
(rlj"~~j)' E'
4
~tjl')
-
,/4
(r2j-~2j) , .•• E
Therefore,
Lim
b
I:
b-,/2
j=l
b ->co
,;
j
Hence by theorem 2.2.2, as
= 0.
b ->
co
,
the asymptotic distribution
of the random vector
is a multivariate normal distribution with mean vector (0,0, ... ,0)
.,», where
and covariance matrix I: = ({ 0"1 ,J.
O"1,i'
= (t2-1)/12
if 1 ... i'
:: -(t+l)/12
if i ~ l' ,
,
4
(rtj-ntj »
59
and
J
00
(2.3.11)
l 2
T)ij = (t+l)/2 + b- / t(Q - Qi)
2
f (x)clx .
-00
Now
t
~ r
,
= t(t+l)j2
ij
i=l
and
t
~
CT,
i'=l ~,
Therefore, the
it
=0
as~totic
(i-l,2, .. , ,t)
•
distribution is singular, considering
the first t-l of r ij ' s and applying lemma 1.4.1, it may be show
that as
b ->
00
,
the asymptotic distribution of
v2
is a non~r.
central chi-square with t-l degrees of freedom and non-centrality
2
parameter A
Fr. given
i\~r.
by
£J f2(x)~72
00
= (12tjt+l)
~
i=l
-00
2.4
As~totic
(Qi - Q)2 .
Relative Efficiency of Mood's Test With Respect
To Friedman's Test and The F-Test.
In this section, we compare Mood's test with Friedman's test
and the F-test.
As already noted in section 1.4, the asymptotic
relative efficiency of one test with respect to another is the
ratio of their non-centrality parameters.
(2.4.1a)
iM.,Fr.
Thus
rJ rl_F(X)7d-1Fd(x)i~(x)clx
2 2
2
= t t -1) (t-2)
l2d t-d d-l -
00
_
-00
L
J
!
,2 /
~f
_
ClQ
-~
f2(x)~72
if t=2d+l,
00
(2.4.1b)
~r:--::+ (~:i)2£ Ll_F(xl7q.-1Fd-l(X)f2(x)~72 j
-00
00
•
£J f2(x)~72
-00
if t=2d,
60
and compared with F-test,
(2.4.2b)
if t=2d ,
where
co
(J'2 = Jcox2f(x)dx -
co
L
L
x f(X)c1x_7
2
Example 2.4.1:_ Normal Distribution
x
F(x) =
(21t,-1/2 EJePL -1/2 t2_7dt (-co <
X
< co)
-co
Table 2.4.1
.e
t
2
3
4
5
6
7
8
9
10
~.Fr.
LOO
0.75 0.77 0.72 0·73 0·70 0·71 0.70
0·70
e
0·77 0.54 0.59 0·57 0.59 0.60 0.60 0.60
0.61
M.F.
t
•
J
11
12
13
~.Fr.
0.69 0.69 0.69
\l.F
0.61 0.61 0.61 .
61
Exponential Distribution.
Example 2.4.2:
F(X) = 1 - EXPL -~7
(x
=0
(2.4.,a)
~.Fr.
(2.4,'b)
(2.4.%)
~.F
(2.4.4b)
~
0) ,
(x < 0) •
= 1/3
t is odd,
= (t+1)/3(t-1)
t is even ,
= t/(t+1)
= t/(t-1)
t is odd,
t is even.
Table 2.4.2
t
.e
2
4
3
5
6
8
7
9
10
e
1.00 0·33 0.56 0·33 0.47 0.33 0.4; 0.;;
0.41
~.F.
2.00 0.75 1.;3 0.83 1.20 0.87 1.14 0·90
1.11
t
11
~.Fr.
0.;; 0·;9 0.;;
~.F.
0·92 1.09 0.9; •
M.Fr .
12
Example 2.4.;:
1;
Rectangular Distribution.
F(X)
(2.4.5a)
(2.4.6b)
(x < 0) ,
=x
=1
(0
a.
M.F
= (t+1)/;(t-1)
= t 3/ ~+1)(t_1)2
= t/3{t-1)
:5 x :5 1)
,
(x > 1) •
1M. Fr. = t;/;(t_1)2
(2.4.5b)
(2.4.00)
=0
if t is odd,
if t is even,
if t is odd,
if t is even .
e
62
Table 2.4.3
t
2
6
5
7
LOO
~.F
0.67 0·56 0.44 0.43 0.40 0.40
t
11
~.Fr.
0.40 0·39 0.39
~.F
0.37 0.36 0.36
12
9
o.,e
0.38
10
.41
·37
13
Double Exponential Distribution.
F(x)
= 1/2 ExpL!:7
=
.
(2.4.7a)
8
0·75 0.56 0·52 0.47 0.45 0.43 0.42
~.Fr.
Example 2.4.4:
-
4
3
~.Fr.
1 - 1/2
(x < 0) }
EXPL -!:7
(x ? 0) •
222
-1) (t-2) L2-t(~(d}1/2) _ ~(d}1»72
3d t-d
d-1
-
= 4t ~
for all t)
( 2. 4
. Tb) e N.F
=
where
4'
2
2
td(t-d}
(t-1) (t-2)
d-1 L 2-t( ~ ( d/1/)
2 -~ ( d/ 1 »7
_ for all t)
~(m/n)
..
1m
rn / /m+n .
Table 2.4.4
t
•
2
3
4
5
6
7
8
9
0·75 0.87 0.81 0.88 0.86 0.91 0.89
~.Fr.
LOO
~.F.
1.00 0.84 1.04 1.02 1.13 1.13 1.21 1.20
t
10
~.Fr .
0·93 0·92 0.94 0.94
~.F.
1.26 1.26 1.31 1.31
11
12
13
2.5 Asymptotic Distribution of Moodls Test Criterion Generalized
TO! Belatlced i:C~ete
BJi.ook Designs.
Bhapkar ~~7 has generalized Moodls test for two-way classification with one or equal number of observations in each cell to
the case of incomplete block designs.
incomplete block design with parameters
In particular, for a balanced
t, b, r, k and A, the test
statistics is
2
XM;B.I.B.
(2.5.1)
where
d
= k/2
= !~(k-V
"k-d At
it' k
is as defined before.
2
t
E
i=l
is even and
Unless b
the asymptotic distribution of
x2
rd )
(mi - k
(k-1)/2
,
is odd, and mi
is small" Bhapkar has shown that
.
M. ;B.I.B.
if k
is .' chi-square with t-l
degrees of freedom when the null hypothesis of equality of treatment effects holds.
In this section" we shall consider the asymptotic distribution
of v2
~. iB.I.:B.
as
r' -->
co
,under H
r
where Hr
is defined in
(2.5.3).
Theorem 2.5.1.
Consider a balanced incomplete block design with parameters
Let Xij (real valued) denote the chance
variable when the i-th treatment occurs in j-th block and let
t, b, r, k and ')....
Fij(X} denote the cumulative distribution fiUlction of Xij • Suppose
•
where
F(x) is a continuous cumulative distribution function with
bounded first and second derivatives
f(x) and fl(X) respectively,
ai
~j
is the i-th treatment effect and
Consider the alternatives
( 2 .5·3)
ai
H :
r
r
=
li
r
is the j-th block effect.
given by
-1/2 Qi' where
for some pair
(i,i').
Then as
is
r
-->~,
the asymptotic distribution of
2
~;B.I.B.
non central chi-square with t-l degrees of freedom and non2
A
centrality parameter
• given by
2
(2.5.4a) AM.B.I.B. =
(2.5.4b)
where
=
tQ
=
t
I:
. 1
J.=
Proof:
Q
•
i
Suppose the i-th treatment occurs in
....,
jii),
j~i), ...• j~i)
Further, let Xj denote the median of the observationa
in the j-th block. Defining new random variables Uij M in
blocks.
(2.2.4), we have
m
i
=
r
I:
k=l
U. j ( i )
J.,
and
•
1).
J.
= E{m.)
J.
=
k
,
r
L:
k=l
E(U .(1) ) •
1 'J
k
Then it may be sho'Wtl as before, that when H
r
(2.5.5a)
Tl i
= rd/ k
_ r
1/2
J
holds,
(t)(k-1) (k-2) (Q. _ Q) x
(t-l)
d-l
~
00
2
l
d l d
Ll-F(x17 - F (X)f (x)dx + O(r- )
-00
if k
= rd/k
(2.5.5b)
r
-
= 2d+l
,
1/2 t ( k-1) (k-2) (
-)
(t-l)
d-l Qi - Q x
00
~ Ll_F(xl7d-1Fd-l(X)f2(x)dx + O(r- l )
if k = 2d ,
and
.e
= ~i,i = rd(k-d)/k 2 + o(r- 1/ 2 ),
(2.5.6)
var mi
(2.5.7)
cov (mi , mil)
= ai,i l = -d A(k-d)/k2(k-l)
The asymptotic normality of mils as
using Bernstein's theorem.
l 2
+ o(r- / ).
r -->00 can be proved by
Hence using lemma 1.4.1, the asymp-
totic distribution of ~.B.I.B. is. non-central chi-square with
2
t-l degrees of freedom and non-centrality parameter AM.B.I.B.
given by (2.5.4a) and (2.5.4b).
We observe that when k = t,
(2.5.!Ja) and (2.5.4b) reduce to the formulae given by (2.2.3,,)
and (2.2.3b) respectively.
2.6
Asymptotic Relative Efficiency Of Mood's Test For Balanced
Incomplete Block Designs Compared With Durbin's Test and
•
The F-Test .
Durbin
£27
has genera.lized Friedman's rank test for ran-
66
domized blocks to the case of balanced incomplete block designs.
His test statistics
2
(2•.
6 1 ) xn.,;B.I.B.
~,; B.I.B.' is as follows:
=Lr 12/ tA (k+l)
t r
r(k+l) 7 2
i:l L r i 2
_'
where
ri
= sum
of the ranks of observations on t?$e i-th treatment.
It is shown by Van Elteren and Noether ["217, that as
r ->
CIO ,
the asymptotic distribution of
~.jB.I.B. ia a non-
central chi-square with t-l degrees of freedom and non-centrality
parameter,
2
A.D.,; B. I. B• given by
CIO
.e
Also, for the F-statistic, the corresponding non-centrality parameter given in Anderson and Bancroft
where
J'
J
CIO
~2
=
["];7 is
CIO
x2f(x)dx - ["
-CIO
x f(x)~72 •
-CIO
Thus, 'by using Bannan's result ["lg7, we have
(2.6.4a)
•
iM.D.;(B.I.B)
=
if k
= 2d+
1 ,
(2,6.4b)
if' k = 2d
1
and
:5
(2.6,5a)
~.F.(B,I.B.) = ~(~~d)} (~:i)
2
.rJLi-F(x17d-1Fd(X~
OP
-00
t?(X)~72
if
:5
,
(2.6.5b)
2
= ~(~~d) (~:i) L
k
= 2d
+1
1
1.LI-F(x17d-~d-l(X)
00
-00
f2(x)~72
if k = 2d .
We note that both eM.D.(B.I.B) and eM.F.(B.I.B) depend only on
F(x) and k
i.e, the block size.
CHAPTER III

TESTS FOR HYPOTHESES UNDER THE EXPONENTIAL REGRESSION MODEL AND FOR HYPOTHESES ON BIVARIATE LOCATION PARAMETERS

3.1  Introduction

Suppose we have a random vector $z = (x, y)$ whose cumulative distribution function $F(x, y)$ is continuous.  Let $z_1, z_2, \ldots, z_n$ be a random sample of $n$ paired observations, each having the same cumulative distribution function $F(x, y)$.  Furthermore, we assume that the variate $y$, given $x$, is continuous and has median of the form $m(x)$, which is an unknown function of $x$.  Mood and Brown [20] have considered $m(x) = \alpha + \beta x$ and suggested methods for estimating $\alpha$ and $\beta$ and for testing hypotheses about the values of $\alpha$ and $\beta$.  Later, Bhapkar [4] considered some additional regression problems along the lines of Mood and Brown.  An entirely different approach, estimating $\alpha$ and $\beta$ by confidence intervals, has been presented in Theil [26].  He assumes only a symmetric distribution about the origin for the deviations from the regressions.  A generalization to other forms of $m(x)$ has also been indicated in Theil [26].

In Section 2 we assume in the model that the conditional distribution of $y$, given $x$, is symmetric about the median $m(x)$, which is of the form $\operatorname{Exp}(\beta x)$, and we then test for $\beta = \beta_0$ by using Theil's procedure.  Another test, based on the quantiles of the conditional distribution of $y$ given $x$, has been proposed recently by Bhattacharya [5] for testing, under a perfectly general model, the hypothesis that the median is a completely specified function of $x$.

In the last section, a step-down procedure is suggested for testing the equality of $k$ bivariate distributions which are identical in form except for the location parameters.  It should be remembered that, while considering this same hypothesis, Bhapkar [4] assumed in his model that the conditional distributions of the $y$'s given the $x$'s are identical except for a location parameter which is a linear function of $x$.  In our case, however, we have relaxed that assumption.  This is possible because we have reduced the problem to testing the equality of the marginal distributions of the $x$'s and then testing the equality of the conditional distributions of the $y$'s given that the corresponding $x$ is less than the sample median of the $x$'s.  A further extension to two-way classification is under consideration.
3.2  A Nonparametric Test for Testing the Hypothesis $\beta = \beta_0$ When the Conditional Median of $y$ Given $x$ Is Assumed to Be $\operatorname{Exp}(\beta x)$ Under the Model.

Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be an independent random sample from a population with continuous cumulative distribution function $F(x, y)$ and continuous marginal distribution functions $F_1(x)$ and $F_2(y)$.  Suppose that, for given $x$, the distribution of $y$ is symmetric about $\operatorname{Exp}(\beta x)$, where $\beta$ is an unknown parameter.  We want to test the hypothesis

$H_0:\ \beta = \beta_0$ ,

where $\beta_0$ is a given constant.

Suppose $x_{[1]} \le x_{[2]} \le \cdots \le x_{[n]}$ denote the $x$'s, $x_1, x_2, \ldots, x_n$, arranged in an ascending order of magnitude.  Because of the continuity of $F_1(x)$, the probability of the event $[\,x_i = x_j,\ i \neq j\,]$ is zero, so we may consider the $x_{[i]}$'s to be in a strictly ascending order.  Let $n_1 = (n-1)/2$ if $n$ is odd, and equal to $[n/2]$, i.e., the largest integer contained in $n/2$, if $n$ is even.  We shall assume the $x_{[i]}$'s to be fixed and define new random variables $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ as follows:

$\epsilon_i = \log y_i - \beta x_i$ ,   $i = 1, 2, \ldots, n$ ,   so that   $y_i = \operatorname{Exp}\left[\beta x_i + \epsilon_i\right]$ .
Then the distribution of $\epsilon_i$ is symmetric about the origin.  Let $y_{[i]}$ and $\epsilon_{[i]}$ denote the $y$ and the $\epsilon$ corresponding to $x_{[i]}$.  Consider

$y_{[i]} = \operatorname{Exp}\left[\beta x_{[i]} + \epsilon_{[i]}\right]$    and    $y_{[n_1+i]} = \operatorname{Exp}\left[\beta x_{[n_1+i]} + \epsilon_{[n_1+i]}\right]$ .

Then

$\frac{y_{[n_1+i]}}{y_{[i]}} = \operatorname{Exp}\left[\beta\left(x_{[n_1+i]} - x_{[i]}\right) + \epsilon_{[n_1+i]} - \epsilon_{[i]}\right]$ .

Since $\epsilon_{[n_1+i]}$ and $\epsilon_{[i]}$ have distributions which are symmetric about the origin, $\epsilon_{[n_1+i]} - \epsilon_{[i]}$ has a symmetric distribution about the origin.  Thus

$\Pr\left\{ y_{[n_1+i]} \le y_{[i]}\operatorname{Exp}\left[\beta\left(x_{[n_1+i]} - x_{[i]}\right)\right]\right\} = \Pr\left\{ y_{[n_1+i]} > y_{[i]}\operatorname{Exp}\left[\beta\left(x_{[n_1+i]} - x_{[i]}\right)\right]\right\} = 1/2$ ,

and therefore

(3.2.3)    $\Pr\left[\frac{\log y_{[n_1+i]} - \log y_{[i]}}{x_{[n_1+i]} - x_{[i]}} \le \beta\right] = \Pr\left[\frac{\log y_{[n_1+i]} - \log y_{[i]}}{x_{[n_1+i]} - x_{[i]}} > \beta\right] = 1/2$ .

If any one of the $y_i$'s is negative, we replace it by its absolute value.
Define new random variables $Z_{[i]}$ by

$Z_{[i]} = \frac{\log y_{[n_1+i]} - \log y_{[i]}}{x_{[n_1+i]} - x_{[i]}}$ ,   $i = 1, 2, \ldots, n_1$ .

Then, under the null hypothesis, $Z_{[1]}, Z_{[2]}, \ldots, Z_{[n_1]}$ have median $\beta_0$, and hence we can use the sign test.  Instead of considering the $y$'s corresponding to $x_{[i]}$ and $x_{[n_1+i]}$, we could as well have considered the $y$'s corresponding to $x_{[i]}$ and $x_{[j]}$ $(i < j)$.  Then the test would have been based on $n(n-1)/2$ observations on the $Z$'s and, for large $n$, this would involve more computations.
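To make the procedure concrete, the following sketch is mine, not the thesis's: each pairwise slope $Z_{[i]}$ defined above is compared with $\beta_0$, and the number of $Z$'s exceeding $\beta_0$ is referred to the Binomial($n_1$, 1/2) distribution.

```python
# A minimal sketch (not from the thesis) of the sign test for H0: beta = beta0,
# based on the pairwise slopes Z_[i] defined above.  Negative y's are replaced
# by their absolute values, as in the text; slopes exactly equal to beta0 are
# discarded, as is usual for the sign test.
import math
from scipy.stats import binom

def sign_test_for_slope(x, y, beta0):
    pairs = sorted(zip(x, y))                     # order the observations by x
    n = len(pairs)
    n1 = (n - 1) // 2 if n % 2 == 1 else n // 2
    z = []
    for i in range(n1):
        (x_lo, y_lo), (x_hi, y_hi) = pairs[i], pairs[n1 + i]
        z.append((math.log(abs(y_hi)) - math.log(abs(y_lo))) / (x_hi - x_lo))
    above = sum(zi > beta0 for zi in z)
    m = sum(zi != beta0 for zi in z)              # number of usable slopes
    # two-sided sign-test p-value from Binomial(m, 1/2)
    p_value = 2.0 * min(binom.cdf(above, m, 0.5), binom.sf(above - 1, m, 0.5))
    return above, m, min(p_value, 1.0)

# Hypothetical data (x_i, y_i) with conditional median Exp(beta * x):
x = [0.1, 0.4, 0.5, 0.9, 1.3, 1.8, 2.2]
y = [1.2, 1.9, 1.4, 3.1, 4.4, 7.5, 11.0]
print(sign_test_for_slope(x, y, beta0=1.0))
```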
K-Samples:  Let $(X_{ij}, Y_{ij})$, $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$, be $N = \sum_{i=1}^{k} n_i$ independent observations, and suppose that, for each $i$, the conditional distribution of $y$ given $x$ is symmetric about $\operatorname{Exp}(\beta_i x)$.  We want to test the hypothesis

$H_k:\ \beta_1 = \beta_2 = \cdots = \beta_k = \beta_0$ .

Then, following the above procedure, we introduce the random variables $Z_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n_{1,i})$, where $n_{1,i} = (n_i - 1)/2$ if $n_i$ is odd, or $n_i/2$ if $n_i$ is even.  Then, under $H_k$, each $Z_{ij}$ has median $\beta_0$, and hence we can use the sign test.
3.3 Test for the Equality of Location Parameters for Bivariate
Distributions
Let $(X_{ij}, Y_{ij})$, $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$, be $N = \sum_{i=1}^{k} n_i$ independent observations.  For each $i$, let $F_i(x, y)$ denote the cumulative distribution function of $(X_{ij}, Y_{ij})$.  Suppose

(3.3.1)    $F_i(x, y) = F(x - \xi_i,\ y - \eta_i)$ ,

where $F(x, y)$ is a continuous cumulative distribution function with density function $f(x, y)$ and marginal densities $f_1(x)$ and $f_2(y)$.  The null hypothesis to be tested is given by

$H_0:\ \xi_1 = \xi_2 = \cdots = \xi_k\ ,\quad \eta_1 = \eta_2 = \cdots = \eta_k$ .

We resolve the hypothesis $H_0$ into two hypotheses $H_{01}$ and $H_{02}$ given by

(3.3.2)    $H_{01}:\ \xi_1 = \xi_2 = \cdots = \xi_k$ ,

(3.3.3)    $H_{02}:\ \eta_1 = \eta_2 = \cdots = \eta_k$ ,

and we first test $H_{01}$; if $H_{01}$ is accepted, we then test $H_{02}$.
(a)  Test for $H_{01}$:

Suppose $N = 4r + 1$, so that the sample medians determined later are unique.  We shall denote by $x_{(1)} \le x_{(2)} \le \cdots \le x_{(N)}$ the $N$ observations $x_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n_i)$ arranged in ascending order of magnitude.  Let $a_i$ denote the number of observations from the i-th sample which are less than or equal to $x_{(2r+1)}$, i.e., the sample median of the $x_{ij}$'s.  Further, let

(3.3.4)    $p_i\left(x_{(2r+1)}\right) = \int_{-\infty}^{x_{(2r+1)}} f_1(x - \xi_i)\,dx$ .

Then, under the hypothesis $H_{01}$,

(3.3.5)    $p_1\left(x_{(2r+1)}\right) = p_2\left(x_{(2r+1)}\right) = \cdots = p_k\left(x_{(2r+1)}\right) = p_0\left(x_{(2r+1)}\right)$ , say,

and the joint probability function of $a_1, a_2, \ldots, a_k$ and $x_{(2r+1)}$ is

(3.3.6)    $f\left(a_1, a_2, \ldots, a_k;\ p_0(x_{(2r+1)})\right) = \prod_{i=1}^{k}\binom{n_i}{a_i}\ p_0^{\left(\sum a_i\right)-1}\left(1 - p_0\right)^{N - \sum a_i}\ \sum_{i=1}^{k} a_i\, f_1\left(x_{(2r+1)} - \xi_i\right)$ ,

where the i-th term in the summation signifies that $x_{(2r+1)}$ belongs to the i-th sample.  Since $\sum_{i=1}^{k} a_i = 2r+1$, we therefore have

(3.3.7)    $f\left(a_1, a_2, \ldots, a_k;\ p_0(x_{(2r+1)})\right) = (2r+1)\ p_0^{2r}\left(x_{(2r+1)}\right)\left[1 - p_0\left(x_{(2r+1)}\right)\right]^{2r}\ \prod_{i=1}^{k}\binom{n_i}{a_i}\ f_1\left(x_{(2r+1)} - \xi\right)$ .

Thus $\{a_i\}$ and $x_{(2r+1)}$ are independently distributed under the hypothesis $H_{01}$ and, moreover, the marginal distribution of the $a_i$ is a hypergeometric distribution.  Hence, for large $N$, we can use Mood's test, which is given by the test statistic

(3.3.8)    $\chi^2_{1,M} = \frac{(4r+1)^2}{(2r+1)\,2r}\ \sum_{i=1}^{k}\frac{1}{n_i}\left(a_i - \frac{(2r+1)\,n_i}{4r+1}\right)^2$ .
(b)  Test for $H_{02}$ assuming $H_{01}$ holds:

If $H_{01}$ is not rejected, then we proceed to test $H_{02}$.  Consider the conditional density of $Y_{ij}$, given $x_{ij} \le x_{(2r+1)}$, and denote it by

(3.3.9)    $\psi_i\left(y \mid x_{(2r+1)}\right) = \int_{-\infty}^{x_{(2r+1)}} f(x - \xi_i,\ y - \eta_i)\,dx \Big/ \int_{-\infty}^{\infty}\int_{-\infty}^{x_{(2r+1)}} f(x - \xi_i,\ y - \eta_i)\,dx\,dy$ .

Then, if $H_{02}$ is true when $H_{01}$ is not rejected, we obtain

(3.3.10)    $\psi_1\left(y \mid x_{(2r+1)}\right) = \psi_2\left(y \mid x_{(2r+1)}\right) = \cdots = \psi_k\left(y \mid x_{(2r+1)}\right) = \psi_0\left(y \mid x_{(2r+1)}\right)$ , say.

Let us denote by $y^*$ the $y$'s for which $x \le x_{(2r+1)}$, and let $y^*_{ij}$ $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, a_i)$ denote the $y_{ij}$'s from the i-th sample for which the corresponding $x_{ij} \le x_{(2r+1)}$.  Further, denote by $y^*_{(1)} \le y^*_{(2)} \le \cdots \le y^*_{(2r+1)}$ the $y^*_{ij}$'s $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, a_i)$ arranged in an ascending order of magnitude, and let $b_i$ denote the number of observations from the i-th sample such that $y^*_{ij} \le y^*_{(r+1)}$, that is, not greater than the median of the observations $y^*_{ij}$.  We shall write

(3.3.11)    $\bar{p}_i = \int_{-\infty}^{y^*_{(r+1)}} \psi_i\left(y \mid x_{(2r+1)}\right)dy$ ,

and, if (3.3.10) holds, we have

(3.3.12)    $\bar{p}_1 = \bar{p}_2 = \cdots = \bar{p}_k = \bar{p}_0$ , say,

and the joint probability density function of $b_1, b_2, \ldots, b_k$ given $\{a_i\}$ and $x_{(2r+1)}$ is given by

(3.3.13)    $f\left(b_1, b_2, \ldots, b_k;\ \bar{p}_0 \mid a_1, a_2, \ldots, a_k;\ x_{(2r+1)}\right) = \prod_{i=1}^{k}\binom{a_i}{b_i}\ \bar{p}_0^{\left(\sum b_i\right)-1}\left(1 - \bar{p}_0\right)^{(2r+1) - \sum b_i}\ \sum_{i=1}^{k} b_i\,\psi_i\left(y^*_{(r+1)} \mid x_{(2r+1)}\right)$ .

But $\sum_{i=1}^{k} b_i = r+1$; therefore

(3.3.14)    $f\left(b_1, b_2, \ldots, b_k;\ \bar{p}_0 \mid a_1, a_2, \ldots, a_k;\ x_{(2r+1)}\right) = (r+1)\ \bar{p}_0^{\,r}\left(1 - \bar{p}_0\right)^{r}\ \prod_{i=1}^{k}\binom{a_i}{b_i}\ \psi_0\left(y^*_{(r+1)} \mid x_{(2r+1)}\right)$ .

Thus $\{b_i\}$ and $\bar{p}_0$, given $\{a_i\}$ and $x_{(2r+1)}$, are independently distributed, and the $\{b_i\}$ have a hypergeometric distribution.  Hence we can again use Mood's test, given by the test statistic

(3.3.15)    $\chi^2_{2,M} = \frac{(2r+1)^2}{r(r+1)}\ \sum_{i=1}^{k}\frac{1}{a_i}\left(b_i - \frac{(r+1)\,a_i}{2r+1}\right)^2$ .
For large values of $N$, the asymptotic distribution of $\chi^2_{1,M}$ can be approximated by a chi-square distribution with $k-1$ degrees of freedom, and the distribution of $\chi^2_{2,M}$ can also be approximated by a chi-square distribution with $k-1$ degrees of freedom even if $2r$ is only of the order of twenty, provided all the $a_i$'s are at least five (Mood [19]).  Thus, in this sense, $\chi^2_{1,M}$ and $\chi^2_{2,M}$ are asymptotically independent.  Therefore, if the two tests are performed at the significance levels $\alpha_1$ and $\alpha_2$ respectively, then the over-all significance level for $H_0$ is equal to $1 - (1-\alpha_1)(1-\alpha_2)$.
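The two stages are easy to program.  The sketch below is mine, not the thesis's: stage 1 computes $\chi^2_{1,M}$ from the $x$'s and, if that test does not reject, stage 2 computes $\chi^2_{2,M}$ from the $y$'s whose $x$ does not exceed the overall $x$-median; both statistics are referred to the chi-square distribution with $k-1$ degrees of freedom.  The statistic is coded in the usual Brown-Mood form (a 2 x k contingency chi-square about the pooled median), which is the form taken for (3.3.8) and (3.3.15) above.

```python
# A minimal sketch (not from the thesis) of the step-down procedure of Section 3.3:
# stage 1 applies Mood's k-sample median test to the x's; if it does not reject,
# stage 2 applies the same test to the y's whose x lies at or below the x-median.
import numpy as np
from scipy.stats import chi2

def mood_median_chi2(samples):
    """Brown-Mood median test statistic for a list of 1-d samples."""
    pooled = np.sort(np.concatenate(samples))
    median = pooled[(len(pooled) - 1) // 2]      # the (2r+1)-th order statistic when N = 4r+1
    total = len(pooled)
    below = sum((s <= median).sum() for s in samples)
    stat = 0.0
    for s in samples:
        n_i = len(s)
        a_i = (s <= median).sum()
        stat += (a_i - n_i * below / total) ** 2 / n_i
    stat *= total ** 2 / (below * (total - below))
    return stat, median

def step_down_test(xs, ys, alpha1=0.05, alpha2=0.05):
    k = len(xs)
    chi1, x_med = mood_median_chi2([np.asarray(x) for x in xs])
    if chi1 > chi2.ppf(1 - alpha1, k - 1):
        return "reject H01", chi1, None
    y_star = [np.asarray(y)[np.asarray(x) <= x_med] for x, y in zip(xs, ys)]
    chi2_stat, _ = mood_median_chi2(y_star)
    if chi2_stat > chi2.ppf(1 - alpha2, k - 1):
        return "accept H01, reject H02", chi1, chi2_stat
    return "accept H0", chi1, chi2_stat

# Hypothetical use with k = 3 bivariate samples and N = 25 = 4r + 1:
rng = np.random.default_rng(0)
xs = [rng.normal(size=9), rng.normal(size=9), rng.normal(size=7)]
ys = [x + rng.normal(size=x.size) for x in xs]
print(step_down_test(xs, ys))
```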
Finally, we observe two things about the above procedure.  Firstly, we could have used the $y$'s first and tested for $H_{02}$, and then tested for $H_{01}$.  In many problems there might be some natural way to effect the resolution; for example, the more important hypothesis is tested first, since the procedure stops if the first hypothesis is rejected.  Secondly, instead of considering the $y$'s such that the corresponding $x$'s $\le x_{(2r+1)}$, we might have considered the $y$'s such that the corresponding $x$'s $> x_{(2r+1)}$.  Then we would have obtained another test statistic $\bar{\chi}^2_{2,M}$ and, since $\chi^2_{2,M}$ and $\bar{\chi}^2_{2,M}$ are asymptotically independent, $\chi^2_{2,M} + \bar{\chi}^2_{2,M}$ will also be asymptotically distributed as chi-square with $(2k-2)$ degrees of freedom.  Thus, at the second stage, we have three test statistics: $\chi^2_{2,M}$, $\bar{\chi}^2_{2,M}$ and $\chi^2_{2,M} + \bar{\chi}^2_{2,M}$.
CHAPTER IV

SOME PROBLEMS IN THE CATEGORICAL SET-UP

4.1  Introduction.

In this chapter we shall be concerned with some problems which arise when the data are presented in the form of observed frequencies.  We shall call a variable a factor or a response according as the corresponding marginal frequencies are held fixed or not.  The appropriate probability model will be a single multinomial or a product multinomial according as all the variables are responses or only some of them are responses and the others factors.  It is assumed that the cell probabilities are functions of $k$ unknown parameters $\theta_1, \theta_2, \ldots, \theta_k$, which can be represented as a point $\theta$ in k-dimensional Euclidean space.  The null hypothesis usually specifies that $\theta$ lies in a subset which can be represented by $r$ functional relationships $h_i(\theta) = 0$, $i = 1, 2, \ldots, r \le k$.  Various hypotheses, both in the spirit of dependence and of analysis of variance, have been posed by Roy [23], Mitra [18] and Bhapkar [4].  The most common tests used in this field are Wilks' likelihood ratio test, Karl Pearson's chi-square test, Neyman's modified chi-square test and his other test obtained by using the linearization technique.  The corresponding test statistics are all asymptotically equivalent and have an asymptotic chi-square distribution when the null hypothesis is true.  The major difficulty common to all the above methods lies in the fact that it is extremely difficult to solve the minimizing or maximizing equations and to obtain the estimates of the parameters explicitly when they are subject to constraints.  One way of avoiding this difficulty is to use Wald's method [28], wherein the test statistic involves only the estimates of the parameters without any restrictions.  The calculation of the corresponding test statistic involves only the inversion of a matrix, which can be done by the use of an electronic computer.  In Section 2 we use this approach to test two-by-two independence in a 2 x 2 x 2 table and also to test the equality of two marginal distributions in an r x r table.

In the last section we consider some hypotheses which can be posed in the categorical set-up where each response has the same number of categories.  The hypotheses of symmetry and independence are considered, and the non-centrality parameters of the corresponding test statistics under alternatives of the Pitman type are obtained.
4.2  Wald's Test and Its Use for Testing Some Hypotheses in the Categorical Set-Up.

We first state Wald's test.  Let $f(x_1, x_2, \ldots, x_m;\ \theta_1, \theta_2, \ldots, \theta_k)$ be the joint probability density function of the variates $x_1, x_2, \ldots, x_m$, involving $k$ unknown parameters $\theta_1, \theta_2, \ldots, \theta_k$ which can be represented by a point $\theta$ of a subset $\Omega$ of k-dimensional Euclidean space.  $\Omega$ may or may not be the entire k-dimensional space.  Let $\omega$ be the subset of $\Omega$ defined by the equations

(4.2.1)    $h_p(\theta) = 0$ ,   $p = 1, 2, \ldots, r$   $(r \le k)$ .

Denote by $H_\omega$ the hypothesis that the true parameter point $\theta$ is in $\omega$.  Let $\hat{\theta}_n$ denote the point with coordinates $\hat{\theta}_{1,n}, \hat{\theta}_{2,n}, \ldots, \hat{\theta}_{k,n}$, where $\hat{\theta}_{i,n}$ is the unrestricted maximum likelihood estimate of $\theta_i$ based on $n$ independent observations on $x_1, x_2, \ldots, x_m$.  The expected value of $-\partial^2 \log f(x_1, x_2, \ldots, x_m;\ \theta)/\partial\theta_i\,\partial\theta_j$ is denoted by $c_{ij}(\theta)$, and $C(\theta)$ denotes the matrix $((c_{ij}(\theta)))$.  Let $((\sigma^{\lambda\mu}(\theta))) = C^{-1}(\theta)$, and denote by $\sigma^{*}_{p,q}(\theta)$

(4.2.2)    $\sigma^{*}_{p,q}(\theta) = \sum_{\lambda=1}^{k}\sum_{\mu=1}^{k}\left(\partial h_p(\theta)/\partial\theta_\lambda\right)\left(\partial h_q(\theta)/\partial\theta_\mu\right)\sigma^{\lambda\mu}(\theta)$ ,

and

(4.2.3)    $B(\theta) = ((b_{p,q})) = \left(\left(\sigma^{*}_{p,q}(\theta)\right)\right)^{-1}$ .
Then Wald [28] has shown that, when the number of observations $n$ is large, the statistic

(4.2.4)    $\chi^2_w = n\sum_{p=1}^{r}\sum_{q=1}^{r} h_p(\hat{\theta}_n)\,h_q(\hat{\theta}_n)\,b_{p,q}(\hat{\theta}_n)$

has a chi-square distribution with $r$ degrees of freedom when the hypothesis $H_\omega$ is true.

The matrix $B$ and the test statistic $\chi^2_w$ can be written in the following form.  Let

(4.2.5)    $h'(\theta) = \left(h_1(\theta), h_2(\theta), \ldots, h_r(\theta)\right)$   $(1 \times r)$ ,

and let $H(\theta)$ denote the $k \times r$ matrix

$H(\theta) = \left(\left(\partial h_j(\theta)/\partial\theta_i\right)\right)$   $(i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, r)$ .

Then

(4.2.6)    $B^{-1}(\theta) = H'(\theta)\,C^{-1}(\theta)\,H(\theta)$ ,

and

(4.2.7)    $\chi^2_w = n\,h'(\hat{\theta}_n)\,B(\hat{\theta}_n)\,h(\hat{\theta}_n)$ .
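In matrix form, (4.2.6)-(4.2.7) translate directly into a few lines of linear algebra.  The sketch below is mine, not the thesis's; it computes $\chi^2_w$ from user-supplied functions $h(\hat{\theta}_n)$ and $H(\hat{\theta}_n)$ and the matrix $C^{-1}(\hat{\theta}_n)$, solving a linear system instead of inverting $B^{-1}$ explicitly.  The two-proportion illustration at the end is hypothetical.

```python
# A minimal sketch (not from the thesis) of the Wald statistic (4.2.7):
#   chi2_w = n * h' B h,   with   B^{-1} = H' C^{-1} H   as in (4.2.6).
import numpy as np

def wald_statistic(n, h_vec, H_mat, C_inv):
    """n: sample size; h_vec: r-vector h(theta_hat);
    H_mat: k x r matrix of derivatives dh_j/dtheta_i at theta_hat;
    C_inv: k x k matrix C^{-1}(theta_hat)."""
    h_vec = np.asarray(h_vec, dtype=float)
    B_inv = H_mat.T @ C_inv @ H_mat              # r x r matrix B^{-1}
    return float(n * h_vec @ np.linalg.solve(B_inv, h_vec))

# Hypothetical illustration: n independent draws, each consisting of one
# Bernoulli(theta1) and one Bernoulli(theta2) observation; the hypothesis is
# theta1 = theta2, written as h(theta) = theta1 - theta2 (k = 2, r = 1).
n = 200
theta_hat = np.array([0.52, 0.44])
h_vec = np.array([theta_hat[0] - theta_hat[1]])
H_mat = np.array([[1.0], [-1.0]])
C_inv = np.diag(theta_hat * (1 - theta_hat))     # inverse per-observation information
print(wald_statistic(n, h_vec, H_mat, C_inv))
```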
Example 4.2.1  Two-by-two independence in a 2 x 2 x 2 table.

Consider three responses "i", "j" and "k", each with two categories.  Suppose we have a sample of $n$ observations, and let $n_{ijk}$ denote the number of observations in the $(i, j, k)$ cell.  The corresponding probability model is

(4.2.8)    $\phi = \frac{n!}{\prod_{i,j,k} n_{ijk}!}\ \prod_{i,j,k} p_{ijk}^{\,n_{ijk}}$ ,   where   $\sum_{i,j,k} p_{ijk} = 1$ ,  $p_{ijk} > 0$ ,

all sums and products running over $i, j, k = 1, 2$.  Let

$p_{ij0} = \sum_{k=1}^{2} p_{ijk}$ ,   $p_{i0k} = \sum_{j=1}^{2} p_{ijk}$ ,   $p_{0jk} = \sum_{i=1}^{2} p_{ijk}$ ,

$p_{i00} = \sum_{j=1}^{2}\sum_{k=1}^{2} p_{ijk}$ ,   $p_{0j0} = \sum_{i=1}^{2}\sum_{k=1}^{2} p_{ijk}$ ,   $p_{00k} = \sum_{i=1}^{2}\sum_{j=1}^{2} p_{ijk}$ .

Similarly, let

$n_{ij0} = \sum_{k=1}^{2} n_{ijk}$ ,   $n_{i0k} = \sum_{j=1}^{2} n_{ijk}$ ,   $n_{0jk} = \sum_{i=1}^{2} n_{ijk}$ ,

$n_{i00} = \sum_{j=1}^{2}\sum_{k=1}^{2} n_{ijk}$ ,   $n_{0j0} = \sum_{i=1}^{2}\sum_{k=1}^{2} n_{ijk}$ ,   $n_{00k} = \sum_{i=1}^{2}\sum_{j=1}^{2} n_{ijk}$ .
The null hypothesis $H_0$ is given by

(4.2.9)    $H_0:\ p_{ij0} = p_{i00}\,p_{0j0}\ ,\quad p_{i0k} = p_{i00}\,p_{00k}\ ,\quad p_{0jk} = p_{0j0}\,p_{00k}$   $(i, j, k = 1, 2)$ .

Alternatively, we can write $H_0$ as

(4.2.10)    $H_0':\ h_1(p) = h_2(p) = h_3(p) = 0$ ,

where

$h_1(p) = p_{110}\,p_{220} - p_{120}\,p_{210}$ ,

$h_2(p) = p_{101}\,p_{202} - p_{102}\,p_{201}$ ,

$h_3(p) = p_{011}\,p_{022} - p_{012}\,p_{021}$ .

It is easy to see that $H_0 \Longleftrightarrow H_0'$, so we shall consider the hypothesis $H_0'$.  Let

$p' = \left(p_{111}, p_{112}, \ldots, p_{221}\right)$ ,

where $p_{111}, p_{112}, \ldots, p_{221}$ are seven independent parameters such that $0 < p_{ijk} < 1$.  Then the maximum likelihood estimates of the $p_{ijk}$'s are given by

(4.2.11)    $\hat{p}_{ijk} = n_{ijk}/n$ ,

and hence $\hat{p}_{ij0} = n_{ij0}/n$, $\hat{p}_{i0k} = n_{i0k}/n$, etc.
(4.2.12)    $C^{-1}(\hat{p})$ is the $7 \times 7$ matrix with diagonal elements $\hat{p}_{ijk}(1-\hat{p}_{ijk})$ and off-diagonal elements $-\hat{p}_{ijk}\,\hat{p}_{i'j'k'}$, i.e.,

$C^{-1}(\hat{p}) = \begin{pmatrix} \hat{p}_{111}(1-\hat{p}_{111}) & -\hat{p}_{111}\hat{p}_{112} & \cdots & -\hat{p}_{111}\hat{p}_{221} \\ -\hat{p}_{111}\hat{p}_{112} & \hat{p}_{112}(1-\hat{p}_{112}) & \cdots & -\hat{p}_{112}\hat{p}_{221} \\ \vdots & \vdots & & \vdots \\ -\hat{p}_{111}\hat{p}_{221} & -\hat{p}_{112}\hat{p}_{221} & \cdots & \hat{p}_{221}(1-\hat{p}_{221}) \end{pmatrix}$ .

The $7 \times 3$ matrix $H(\hat{p})$ of the derivatives $\partial h_m(p)/\partial p_{ijk}$, evaluated at $\hat{p}$, with rows corresponding to $p_{111}, p_{112}, p_{121}, p_{122}, p_{211}, p_{212}, p_{221}$ and columns to $h_1, h_2, h_3$, is

(4.2.13)    $H(\hat{p}) = \begin{pmatrix}
\hat{p}_{220}-\hat{p}_{110} & \hat{p}_{202}-\hat{p}_{101} & \hat{p}_{022}-\hat{p}_{011} \\
\hat{p}_{220}-\hat{p}_{110} & -\hat{p}_{101}-\hat{p}_{201} & -\hat{p}_{011}-\hat{p}_{021} \\
-\hat{p}_{110}-\hat{p}_{210} & \hat{p}_{202}-\hat{p}_{101} & -\hat{p}_{011}-\hat{p}_{012} \\
-\hat{p}_{110}-\hat{p}_{210} & -\hat{p}_{101}-\hat{p}_{201} & 0 \\
-\hat{p}_{110}-\hat{p}_{120} & -\hat{p}_{101}-\hat{p}_{102} & \hat{p}_{022}-\hat{p}_{011} \\
-\hat{p}_{110}-\hat{p}_{120} & 0 & -\hat{p}_{011}-\hat{p}_{021} \\
0 & -\hat{p}_{101}-\hat{p}_{102} & -\hat{p}_{011}-\hat{p}_{012}
\end{pmatrix}$ .

Let

(4.2.14)    $A(\hat{p}) = \begin{pmatrix} \hat{p}_{110} + \hat{p}_{220} \\ \hat{p}_{101} + \hat{p}_{202} \\ \hat{p}_{011} + \hat{p}_{022} \end{pmatrix}$   $(3 \times 1)$ ,

and
let $D(\hat{p})$ denote the $3 \times 3$ matrix

(4.2.15)    $D(\hat{p}) = \left(\left(d_{uv}(\hat{p})\right)\right)$ ,

whose elements $d_{uv}(\hat{p})$ are sums of signed triple products of the estimated cell proportions $\hat{p}_{ijk}$ and of the marginal proportions formed from them.
Then

(4.2.16)    $B^{-1}(\hat{p}) = H'(\hat{p})\,C^{-1}(\hat{p})\,H(\hat{p}) = 4\,h(\hat{p})\,h'(\hat{p}) - A(\hat{p})\,h'(\hat{p}) + D(\hat{p})$ ,

and

(4.2.17)    $\chi^2_w = n\,h'(\hat{p})\,B(\hat{p})\,h(\hat{p})$

has an asymptotic chi-square distribution with three degrees of freedom.
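Because the closed-form matrices above are tedious to transcribe, a practical way to reproduce this example is to build $H(\hat{p})$ by numerical differentiation of the three functions $h_1, h_2, h_3$ and then apply (4.2.6)-(4.2.7) directly.  The sketch below is mine, not the thesis's; it does exactly that for a 2 x 2 x 2 table of counts, with the cell $p_{222}$ treated as the dependent parameter, as in the text.

```python
# A minimal sketch (not from the thesis) of Example 4.2.1: Wald's test of
# pairwise ("two-by-two") independence in a 2 x 2 x 2 table.  The Jacobian
# H(p_hat) is obtained by numerical differentiation rather than from the
# closed-form matrix (4.2.13); the cell p_222 is the dependent parameter.
import numpy as np
from scipy.stats import chi2

def _full_table(free):                    # rebuild the 2x2x2 probability array
    p = np.empty((2, 2, 2))
    cells = [(i, j, k) for i in range(2) for j in range(2) for k in range(2)][:-1]
    for cell, value in zip(cells, free):
        p[cell] = value
    p[1, 1, 1] = 1.0 - free.sum()
    return p

def _h(free):
    p = _full_table(free)
    pij = p.sum(axis=2); pik = p.sum(axis=1); pjk = p.sum(axis=0)
    return np.array([pij[0, 0] * pij[1, 1] - pij[0, 1] * pij[1, 0],
                     pik[0, 0] * pik[1, 1] - pik[0, 1] * pik[1, 0],
                     pjk[0, 0] * pjk[1, 1] - pjk[0, 1] * pjk[1, 0]])

def wald_222(counts, eps=1e-6):
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    free = counts.ravel()[:-1] / n                        # the 7 free MLEs
    h = _h(free)
    H = np.empty((7, 3))                                  # numerical Jacobian, k x r
    for a in range(7):
        step = np.zeros(7); step[a] = eps
        H[a] = (_h(free + step) - _h(free - step)) / (2 * eps)
    C_inv = np.diag(free) - np.outer(free, free)          # multinomial covariance per observation
    B_inv = H.T @ C_inv @ H
    stat = float(n * h @ np.linalg.solve(B_inv, h))
    return stat, chi2.sf(stat, 3)

counts = [[[30, 22], [18, 25]], [[20, 27], [19, 39]]]     # hypothetical n_ijk
print(wald_222(counts))
```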
Example 4.2.2  Equality of two marginal distributions in an r x r table.

Let "i" and "j" denote two responses, each with $r$ categories.  Suppose we have a sample of $n$ observations and that $n_{ij}$ observations lie in the $(i, j)$-th cell.  The corresponding probability model is

(4.2.18)    $\phi = \frac{n!}{\prod_{i,j} n_{ij}!}\ \prod_{i,j} p_{ij}^{\,n_{ij}}$ ,   $\sum_{i=1}^{r}\sum_{j=1}^{r} p_{ij} = 1$ ,  $p_{ij} > 0$ .

Let

$p_{i0} = \sum_{j=1}^{r} p_{ij}$ ,   $p_{0j} = \sum_{i=1}^{r} p_{ij}$ ,   $n_{i0} = \sum_{j=1}^{r} n_{ij}$ ,   $n_{0j} = \sum_{i=1}^{r} n_{ij}$ .

The null hypothesis to be tested is given by

$H_0:\ p_{i0} = p_{0i}$   $(i = 1, 2, \ldots, r)$ .

This can be written equivalently as

(4.2.19)    $H_1:\ h_i(p) = 0$   $(i = 1, 2, \ldots, r-1)$ ,

where

(4.2.20)    $h_i(p) = p_{i0} - p_{0i}$   $(i = 1, 2, \ldots, r-1)$ .

Consider

$p' = \left(p_{11}, p_{12}, \ldots, p_{r,r-1}\right)$

as a vector of $r^2 - 1$ independent parameters.  Then the maximum likelihood estimates of the $p_{ij}$ are

$\hat{p}_{ij} = n_{ij}/n$ ,   $\hat{p}_{i0} = n_{i0}/n$ ,

and
(4.2.21)    $C^{-1}(\hat{p})$ is the $(r^2-1) \times (r^2-1)$ matrix with diagonal elements $\hat{p}_{ij}(1-\hat{p}_{ij})$ and off-diagonal elements $-\hat{p}_{ij}\,\hat{p}_{i'j'}$, i.e.,

$C^{-1}(\hat{p}) = \begin{pmatrix} \hat{p}_{11}(1-\hat{p}_{11}) & -\hat{p}_{11}\hat{p}_{12} & \cdots & -\hat{p}_{11}\hat{p}_{r,r-1} \\ -\hat{p}_{11}\hat{p}_{12} & \hat{p}_{12}(1-\hat{p}_{12}) & \cdots & -\hat{p}_{12}\hat{p}_{r,r-1} \\ \vdots & \vdots & & \vdots \\ -\hat{p}_{11}\hat{p}_{r,r-1} & -\hat{p}_{12}\hat{p}_{r,r-1} & \cdots & \hat{p}_{r,r-1}(1-\hat{p}_{r,r-1}) \end{pmatrix}$ .
86
Let us denote the element in the i-th row and j-th column
A
of HI (,R)
where
by
j = r(k-l) +
f
snd for k=1,2, .•• , (r-l), f takes the
values
1, 2, ... , r and for k=r, f takes the values
(r-l).
Then
hij = 1
if k=i and
1,2" .•. ,
f = 1,2, ••• ,(r-l)
=-1
if f=i and k
=0
otherwise,
= 1,2, ••• ,(i-l),(i+l), .•• ,r,
(4.2.22)
and
h ij = -1
=
if k=T and f
= i,
otherwise.
0
A
Further, let D(,R) denote the
r-lxr-l matrix whose element in the
i-th row and j-th column is
( 4.2.23)
A
dij
= 5ij (PiO
5ij
=1
1\
A
A
+ POj ) - Pij - P ji
where
<=
0
if i = j,
otherwise.
Then
(4.2.24)
B-\~) =
=
A -1 A
A
H'(,R) C (,R) H(,R)
A
A
1\
D(g) - heR) hI (,R)
,
'
87
and
has an asymptotic chi-square distribution with r-l degrees of
freedom.
In case $r = 3$, let

$u_{ij}(\hat{p}) = \hat{p}_{ij} + \hat{p}_{ji}$   $(i \neq j)$ ,   $h_3(\hat{p}) = -\left(h_1(\hat{p}) + h_2(\hat{p})\right)$ ,

$K(\hat{p}) = h_3^2(\hat{p})\,u_{12}(\hat{p}) + h_2^2(\hat{p})\,u_{13}(\hat{p}) + h_1^2(\hat{p})\,u_{23}(\hat{p})$ ,

$L(\hat{p}) = u_{12}(\hat{p})\,u_{13}(\hat{p}) + u_{12}(\hat{p})\,u_{23}(\hat{p}) + u_{13}(\hat{p})\,u_{23}(\hat{p})$ .

Then

(4.2.26)    $\chi^2_w = \frac{n\,K(\hat{p})}{L(\hat{p}) - K(\hat{p})}$ .
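The quantities in (4.2.20)-(4.2.24) are all simple functions of the observed cell proportions, so the statistic can be computed directly from the table.  The sketch below is mine, not the thesis's; it forms $h(\hat{p})$ and $D(\hat{p})$ as defined above, takes $B^{-1}(\hat{p}) = D(\hat{p}) - h(\hat{p})h'(\hat{p})$, and returns $\chi^2_w$ with its chi-square p-value on $r-1$ degrees of freedom.  For $r = 3$ the value agrees with the closed form (4.2.26).

```python
# A minimal sketch (not from the thesis) of Example 4.2.2: Wald's test for the
# equality of the two marginal distributions of an r x r table, built from
# h_i = p_i0 - p_0i and d_ij = delta_ij(p_i0 + p_0j) - p_ij - p_ji as in
# (4.2.20), (4.2.23) and (4.2.24).
import numpy as np
from scipy.stats import chi2

def wald_marginal_homogeneity(table):
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p = table / n
    row, col = p.sum(axis=1), p.sum(axis=0)
    r = table.shape[0]
    h = (row - col)[: r - 1]                               # h_i(p_hat), i = 1,...,r-1
    D = np.zeros((r - 1, r - 1))
    for i in range(r - 1):
        for j in range(r - 1):
            D[i, j] = (row[i] + col[j]) * (i == j) - p[i, j] - p[j, i]
    B_inv = D - np.outer(h, h)                             # (4.2.24)
    stat = float(n * h @ np.linalg.solve(B_inv, h))
    return stat, chi2.sf(stat, r - 1)

# Hypothetical 3 x 3 table of paired categorical responses:
table = [[20, 10, 5],
         [ 6, 30, 9],
         [ 4,  8, 28]]
print(wald_marginal_homogeneity(table))
```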
4.3  Hypotheses of Symmetry

4.3.1  Hypothesis of symmetry in a two-way table:

Consider a two-way table with two responses "i" and "j", each having $r$ categories.  The corresponding probability model is given by

(4.3.1)    $\phi = \frac{n!}{\prod_{i,j} n_{ij}!}\ \prod_{i,j} p_{ij}^{\,n_{ij}}$ ,   $\sum_{i=1}^{r}\sum_{j=1}^{r} p_{ij} = 1$ ,  $p_{ij} > 0$ .

We shall consider the hypothesis of symmetry, which is

(4.3.2)    $H_2:\ p_{ij} = p_{ji}$   $(i, j = 1, 2, \ldots, r)$ .

The corresponding test statistic can easily be obtained by minimizing

$\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{r}\left(n_{ij} - n\,p_{ij}\right)^2 / (n\,p_{ij})$

with respect to the $p_{ij}$'s, subject to the conditions $\sum_{i}\sum_{j} p_{ij} = 1$ and $p_{ij} = p_{ji}$.  The minimizing estimates of the $p_{ij}$'s are

$\hat{p}_{ij} = \hat{p}_{ji} = (n_{ij} + n_{ji})/2n$ ,

and

(4.3.3)    $\chi^2_{(2)} = \sum_{i<j}\left(n_{ij} - n_{ji}\right)^2 / \left(n_{ij} + n_{ji}\right)$

has an asymptotic chi-square distribution with $r(r-1)/2$ degrees of freedom when $H_2$ is true.  The same result has been obtained previously by Bowker [7].  In case $r = 2$, the hypothesis of symmetry is equivalent to the hypothesis of the equality of the two marginal distributions; for $r > 2$, the hypothesis of symmetry does imply the hypothesis of the equality of the two marginal distributions, but the latter does not imply the former.  Hence, for $r > 2$, we cannot replace the $H_1$ considered in Section 4.2 by $H_2$.
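The statistic (4.3.3) requires only the off-diagonal cells of the table.  A short sketch follows; it is mine, not the thesis's, and skips symmetric pairs whose combined count is zero, adjusting the degrees of freedom accordingly.

```python
# A minimal sketch (not from the thesis) of the symmetry statistic (4.3.3)
# (Bowker's test): the sum over i < j of (n_ij - n_ji)^2 / (n_ij + n_ji),
# referred to chi-square with r(r-1)/2 degrees of freedom.
import numpy as np
from scipy.stats import chi2

def symmetry_test(table):
    table = np.asarray(table, dtype=float)
    r = table.shape[0]
    stat, df = 0.0, 0
    for i in range(r):
        for j in range(i + 1, r):
            denom = table[i, j] + table[j, i]
            if denom > 0:                      # skip empty symmetric pairs
                stat += (table[i, j] - table[j, i]) ** 2 / denom
                df += 1
    return stat, df, chi2.sf(stat, df)

print(symmetry_test([[20, 10, 5], [6, 30, 9], [4, 8, 28]]))
```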
4.3.2  Hypothesis of symmetry and independence:

In a two-way table, consider the hypothesis of independence

(4.3.4)    $H_3:\ p_{ij} = p_{i0}\,p_{0j}$ .

Suppose

(4.3.5)    $H_4:\ p_{ij} = p_{ji} = p_{i0}\,p_{0j}$   $(i, j = 1, 2, \ldots, r)$ .

Then, for testing $H_4$, i.e., symmetry and independence, the corresponding test statistic is given by

(4.3.6)    $\chi^2_{(4)} = 4n\sum_{i=1}^{r}\sum_{j=1}^{r}\left[n_{ij} - \frac{(n_{i0}+n_{0i})(n_{j0}+n_{0j})}{4n}\right]^2 \Big/ \left[(n_{i0}+n_{0i})(n_{j0}+n_{0j})\right]$

with $r(r-1)$ degrees of freedom.

4.3.3  Hypothesis of independence under symmetry:

We might consider $H_3$ under the model $\Omega \cap H_2$.  The corresponding test statistic is given by

(4.3.7)    $\chi^2_{(3\mid 2)} = 4n\sum_{i=1}^{r}\sum_{j=1}^{r}\left[n_{ij} - \frac{(n_{i0}+n_{0i})(n_{j0}+n_{0j})}{4n}\right]^2 \Big/ \left[(n_{i0}+n_{0i})(n_{j0}+n_{0j})\right] \;-\; \sum_{i<j}\frac{\left(n_{ij} - n_{ji}\right)^2}{n_{ij}+n_{ji}}$

with $r(r-1)/2$ degrees of freedom.
4.3.4  Hypothesis of symmetry in a three-way table:

Consider three responses "i", "j" and "k", each with $r$ categories.  The corresponding probability model is given by

(4.3.8)    $\phi = \frac{n!}{\prod_{i,j,k} n_{ijk}!}\ \prod_{i,j,k} p_{ijk}^{\,n_{ijk}}$ .

We shall consider hypotheses of two-by-two symmetry, which are given by

(4.3.9)    $H_5:\ p_{ijk} = p_{jik}$   $(i \neq j;\ k = 1, 2, \ldots, r)$ ,

(4.3.10)    $H_6:\ p_{ijk} = p_{kji}$   $(i \neq k;\ j = 1, 2, \ldots, r)$ ,

(4.3.11)    $H_7:\ p_{ijk} = p_{ikj}$   $(j \neq k;\ i = 1, 2, \ldots, r)$ .

The corresponding chi-square test statistics are given by

$\chi^2_{(5)} = \sum_{k=1}^{r}\sum_{i<j}\frac{\left(n_{ijk} - n_{jik}\right)^2}{n_{ijk} + n_{jik}}$ ,   $\chi^2_{(6)} = \sum_{j=1}^{r}\sum_{i<k}\frac{\left(n_{ijk} - n_{kji}\right)^2}{n_{ijk} + n_{kji}}$ ,   $\chi^2_{(7)} = \sum_{i=1}^{r}\sum_{j<k}\frac{\left(n_{ijk} - n_{ikj}\right)^2}{n_{ijk} + n_{ikj}}$ ,

each with $r^2(r-1)/2$ degrees of freedom.

We might as well consider the hypothesis of complete symmetry, which is given by

$H_8:\ p_{ijk} = p_{jik} = p_{kji} = p_{ikj}$   $(i \neq j \neq k)$ .

The corresponding chi-square test statistic is

$\chi^2_{(8)} = \cdots \;+\; 3n\sum_{i\neq j}\left[\frac{n_{iji}}{n} - \frac{n_{iij}+n_{iji}+n_{jii}}{3n}\right]^2 \Big/ \frac{n_{iij}+n_{iji}+n_{jii}}{n} \;+\; 3n\sum_{i\neq k}\left[\frac{n_{iik}}{n} - \frac{n_{iik}+n_{iki}+n_{kii}}{3n}\right]^2 \Big/ \frac{n_{iik}+n_{iki}+n_{kii}}{n} \;+\; 3n\sum_{j\neq k}\left[\frac{n_{jkk}}{n} - \frac{n_{jkk}+n_{kjk}+n_{kkj}}{3n}\right]^2 \Big/ \frac{n_{jkk}+n_{kjk}+n_{kkj}}{n}$

with $2r(r-1)$ degrees of freedom.

It can easily be seen that $H_8$ implies each of $H_5$, $H_6$ and $H_7$, but no one of $H_5$, $H_6$ and $H_7$ by itself implies $H_8$.
4.4  The Non-centrality Parameters for the Hypotheses of Symmetry.

The non-centrality parameters for the test statistics used in testing the hypotheses of symmetry can easily be obtained by using the theorems given by Diamond, Mitra and Roy [8].  We list below, in each case, the hypothesis, the test statistic and the corresponding non-centrality parameter.

(1)  Two-way table, "i" and "j" responses:

$H_2:\ p_{ij} = p_{ji}$ ,   $i, j = 1, 2, \ldots, r$ ;

$H_{2,n}:\ p_{ij} = p^0_{ji} + n^{-1/2}\,\delta_{ij}$ ,   where   $p^0_{ij} = p^0_{ji}$ ;

test statistic $\chi^2_{(2)}$ given by (4.3.3);

(4.4.1)    $\lambda^2_{(2)} = \frac{1}{2}\sum_{i<j}\frac{\left(\delta_{ij} - \delta_{ji}\right)^2}{p^0_{ij}}$ .

(2)  Three-way table, "i", "j" and "k" responses:

$H_5:\ p_{ijk} = p_{jik}$ ,   $i \neq j$ ;

$H_{5,n}:\ p_{ijk} = p^0_{jik} + n^{-1/2}\,\delta_{ijk}$ ,   where   $p^0_{ijk} = p^0_{jik}$ ;

test statistic

$\chi^2_{(5)} = \sum_{k=1}^{r}\sum_{i<j}\frac{\left(n_{ijk} - n_{jik}\right)^2}{n_{ijk} + n_{jik}}$ ;

(4.4.2)    $\lambda^2_{(5)} = \frac{1}{2}\sum_{k=1}^{r}\sum_{i<j}\frac{\left(\delta_{ijk} - \delta_{jik}\right)^2}{p^0_{ijk}}$ .

$H_6:\ p_{ijk} = p_{kji}$   $(i \neq k)$ ;

$H_{6,n}:\ p_{ijk} = p^0_{kji} + n^{-1/2}\,\delta_{ijk}$ ,   where   $p^0_{ijk} = p^0_{kji}$ ;

(4.4.3)    $\lambda^2_{(6)} = \frac{1}{2}\sum_{j=1}^{r}\sum_{i<k}\frac{\left(\delta_{ijk} - \delta_{kji}\right)^2}{p^0_{ijk}}$ .

$H_7:\ p_{ijk} = p_{ikj}$   $(j \neq k)$ ;

$H_{7,n}:\ p_{ijk} = p^0_{ikj} + n^{-1/2}\,\delta_{ijk}$ ,   where   $p^0_{ijk} = p^0_{ikj}$ ;

(4.4.4)    $\lambda^2_{(7)} = \frac{1}{2}\sum_{i=1}^{r}\sum_{j<k}\frac{\left(\delta_{ijk} - \delta_{ikj}\right)^2}{p^0_{ijk}}$ .
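Given a non-centrality parameter such as (4.4.1), the asymptotic power of the corresponding test follows from the non-central chi-square distribution.  The sketch below is mine, not the thesis's; it computes $\lambda^2_{(2)}$ for a specified symmetric null table $p^0$ and perturbation matrix $\delta$, and the resulting approximate power of the symmetry test at level $\alpha$.  Under the Pitman drift $p_{ij} = p^0_{ij} + n^{-1/2}\delta_{ij}$ the non-centrality does not involve $n$.

```python
# A minimal sketch (not from the thesis): asymptotic power of the symmetry test,
# using the non-centrality parameter (4.4.1) and the non-central chi-square
# distribution (scipy.stats.ncx2).
import numpy as np
from scipy.stats import chi2, ncx2

def symmetry_power(p0, delta, alpha=0.05):
    p0, delta = np.asarray(p0, float), np.asarray(delta, float)
    r = p0.shape[0]
    lam = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            lam += 0.5 * (delta[i, j] - delta[j, i]) ** 2 / p0[i, j]
    df = r * (r - 1) // 2
    critical = chi2.ppf(1 - alpha, df)
    return lam, ncx2.sf(critical, df, lam)

# Hypothetical symmetric null table p0 and a Pitman drift delta (entries sum to zero):
p0 = np.array([[0.20, 0.10, 0.05],
               [0.10, 0.25, 0.08],
               [0.05, 0.08, 0.09]])
delta = np.array([[0.0,  0.6, -0.3],
                  [-0.6, 0.0,  0.2],
                  [0.3, -0.2,  0.0]])
print(symmetry_power(p0, delta))
```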
BIBLIOGRAPHY

[1]  Anderson, R. L. and Bancroft, T. A., Statistical Theory in Research.  New York, McGraw-Hill Book Co., 1952.

[2]  Andrews, F. C., "Asymptotic behavior of some rank tests for analysis of variance," Annals of Mathematical Statistics, Vol. 25 (1954), pp. 724-736.

[3]  Bernstein, S., "Sur l'extension du théorème limite du calcul des probabilités aux sommes de quantités dépendantes," Mathematische Annalen, Vol. 97 (1927), pp. 1-59.

[4]  Bhapkar, V. P., "Contributions to the statistical analysis of experiments," North Carolina Institute of Statistics Mimeograph Series, No. 229, July, 1959.

[4a]  Bhapkar, V. P., "A nonparametric test for the problem of several samples," Annals of Mathematical Statistics, Vol. 32 (1961), pp. 1108-1117.

[5]  Bhattacharya, P. K., "Study of nature of dependence in a bivariate population with few assumptions," unpublished paper.

[6]  Bose, R. C. and Gupta, S. S., "Moments of order statistics from a normal population," Biometrika, Vol. 46 (1959), pp. 433-439.

[7]  Bowker, A. H., "A test for symmetry in contingency tables," Journal of the American Statistical Association, Vol. 43 (1948), pp. 572-574.

[8]  Diamond, E. L., Mitra, S. K. and Roy, S. N., "Asymptotic power and asymptotic independence in the statistical analysis of categorical data," Bulletin de l'Institut International de Statistique, Tome 37 - 3ème Livraison, 1960, pp. 309-329.

[9]  Durbin, J., "Incomplete blocks in ranking experiments," British Journal of Psychology, Vol. 4 (1951), pp. 85-90.

[10]  Fix, E., "Tables of the non-central chi-square," University of California Publications in Statistics, Vol. 1 (1949), pp. 15-19.

[11]  Friedman, M., "The use of ranks to avoid the assumption of normality implicit in the analysis of variance," Journal of the American Statistical Association, Vol. 32 (1937), pp. 675-701.

[12]  Hannan, E. J., "The asymptotic powers of certain tests based on multiple correlations," Journal of the Royal Statistical Society, Series B, Vol. 18 (1956), pp. 227-233.

[13]  Hoeffding, W., "A class of statistics with asymptotically normal distributions," Annals of Mathematical Statistics, Vol. 19 (1948), pp. 293-325.

[14]  Hojo, T., "Distribution of the median, quartiles and interquartile distance in samples from a normal population," Biometrika, Vol. 23 (1931), pp. 315-360.

[15]  Lehmann, E. L., "Consistency and unbiasedness of certain nonparametric tests," Annals of Mathematical Statistics, Vol. 22 (1951), pp. 165-179.

[16]  Mann, H. B. and Wald, A., "On stochastic limit and order relationships," Annals of Mathematical Statistics, Vol. 14 (1943), pp. 217-226.

[17]  Massey, F. J., "A note on a two-sample test," Annals of Mathematical Statistics, Vol. 22 (1951), pp. 304-306.

[18]  Mitra, S. K., "Contributions to the statistical analysis of categorical data," North Carolina Institute of Statistics Mimeograph Series, No. 142, December, 1952.

[19]  Mood, A. M., Introduction to the Theory of Statistics, New York, McGraw-Hill Book Co., 1950.

[20]  Mood, A. M. and Brown, G. W., "On median tests for linear hypotheses," Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1950, pp. 159-166.

[21]  Noether, G. E., "On a theorem of Pitman," Annals of Mathematical Statistics, Vol. 26 (1955), pp. 64-68.

[22]  Pitman, E. J. G., "Nonparametric statistical inference," unpublished lecture notes given at Columbia University, New York, 1948.

[23]  Roy, S. N., Some Aspects of Multivariate Analysis, John Wiley and Sons, 1957.

[24]  Sukhatme, B. V., "Testing the hypothesis that two populations differ only in location," Annals of Mathematical Statistics, Vol. 29 (1958), pp. 60-78.

[25]  Terpstra, T. J., "A nonparametric test for the problem of k-samples," Proceedings, Koninklijke Nederlandse Akademie van Wetenschappen, Vol. 57 (1954), pp. 505-512.

[26]  Theil, H., "A rank-invariant method of linear and polynomial regression analysis," Indagationes Mathematicae, Vol. 12 (1950), pp. 85-91, 173-177, 467-482.

[27]  Van Elteren, Ph. and Noether, G. E., "The asymptotic efficiency of the chi-square-r test for a balanced incomplete block design," Biometrika, Vol. 46 (1959), pp. 475-477.

[28]  Wald, A., "Tests of statistical hypotheses concerning several parameters when the number of observations is large," Transactions of the American Mathematical Society, Vol. 54 (1943), pp. 426-482.

[29]  Wallis, W. A. and Kruskal, W. H., "Use of ranks in one-criterion variance analysis," Journal of the American Statistical Association, Vol. 47 (1952), pp. 583-621.

[30]  Wilcoxon, F., "Individual comparisons by ranking methods," Biometrics Bulletin, Vol. 1 (1945), pp. 80-83.