V. P. Bhapkar (1959). Contributions to the statistical analysis of experiments with one or more responses.
CONTRIBUTIONS TO THE STATISTICAL ANALYSIS OF
EXPERIMENTS WITH ONE OR MORE RESPONSES
(NOT NECESSARILY NORMAL)
by
Vasant P. Bhapkar
University of North Carolina
This research was supported partly by the
Office of Naval Research under Contract No.
Nonr-855(06) for research in probability
and statistics at Chapel Hill and partly by
the United States Air Force through the Air
Force Office of Scientific Research of the
Air Research and Development Command, under
Contract No. AF 49(638)-213. Reproduction
in whole or in part is permitted for any
purpose of the United States Government.
Institute of Statistics
Mimeograph Series No. ~29
July 1959
ACKNOWLEDGMENTS
I am deeply indebted to Professor S. N. Roy for suggesting the problem to me and for his inspiring guidance and constant encouragement. I would also like to express my grateful appreciation to Professor Wassily Hoeffding and Professor W. J. Hall for going through the manuscript and for their suggestions.

I owe my sincere thanks to the U. S. Educational Foundation in India and the Institute of International Education for the Smith-Mundt and Fulbright awards. I am also grateful to the Office of Naval Research for financial assistance.

Finally, thankful appreciation is due to Mrs. Ouida Taylor for the careful work of typing the manuscript, and to the secretarial staff of the Department of Statistics at Chapel Hill and, in particular, to Miss Marianne Byrd for the help received from her at various stages of the work.
TABLE OF CONTENTS

                                                                 Page
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . .  ii
Introduction  . . . . . . . . . . . . . . . . . . . . . . . . . .   v
Notation  . . . . . . . . . . . . . . . . . . . . . . . . . . . .  xi

                       PART I.  CATEGORICAL SET-UP

CHAPTER

  I.   SOME ANALOGUES OF THE CUSTOMARY HYPOTHESES IN "NORMAL"
       ANOVA, MANOVA, AND IN STUDIES OF "NORMAL" ASSOCIATION  . .   1
       1. Introduction  . . . . . . . . . . . . . . . . . . . . .   1
       2. A three-way table (ijk) in which "i" is a response
          and "j" and "k" are factors . . . . . . . . . . . . . .   2
       3. A three-way table (ijk) in which "i", "j" and "k"
          are responses . . . . . . . . . . . . . . . . . . . . .   8
       4. A four-way (ijkt) table in which "i", "j", "k" and
          "t" are all responses . . . . . . . . . . . . . . . . .  14
       5. A five-way (ijktm) table in which "i" and "j" are
          responses and "k", "t" and "m" are factors  . . . . . .  15

  II.  ON SOME BASIC THEOREMS OF NEYMAN ON X₁² AND
       "LINEARIZATION"  . . . . . . . . . . . . . . . . . . . . .  18
       1. Introduction  . . . . . . . . . . . . . . . . . . . . .  18
       2. Theorem 2.2 . . . . . . . . . . . . . . . . . . . . . .  20
       3. Linearization . . . . . . . . . . . . . . . . . . . . .  28
       4. Theorem 2.4 . . . . . . . . . . . . . . . . . . . . . .  30

  III. SOME SPECIAL PROBLEMS POSED IN CHAPTER I . . . . . . . . .  39
       1. Introduction  . . . . . . . . . . . . . . . . . . . . .  39
       2. On the test of linear hypotheses on the responses by
          the use of the X₁² statistic and the X₁² minimization
          method of estimation  . . . . . . . . . . . . . . . . .  40
       3. On the test of nonlinear hypotheses on p's in ANOVA
          and MANOVA situations . . . . . . . . . . . . . . . . .  61
       4. On the hypotheses about association for a single
          multinomial . . . . . . . . . . . . . . . . . . . . . .  64

                      PART II.  NONPARAMETRIC SET-UP

  IV.  SOME UNIVARIATE PROBLEMS . . . . . . . . . . . . . . . . .  68
       1. Introduction  . . . . . . . . . . . . . . . . . . . . .  68
       2. Extension of Mood's test for two-way classification
          to cover "incomplete designs" . . . . . . . . . . . . .  71
       3. Generalization of Hoeffding's theorem on
          U-statistics to c samples . . . . . . . . . . . . . . .  81
       4. An application to a certain nonparametric test for
          c samples . . . . . . . . . . . . . . . . . . . . . . .  89

  V.   SOME REGRESSION AND BIVARIATE PROBLEMS . . . . . . . . . .  98
       1. Introduction  . . . . . . . . . . . . . . . . . . . . .  98
       2. Some regression problems  . . . . . . . . . . . . . . .  99
       3. Some bivariate problems . . . . . . . . . . . . . . . . 114

BIBLIOGRAPHY  . . . . . . . . . . . . . . . . . . . . . . . . . . 120
INTRODUCTION
In the general analysis of variance situation, we have observations on a single character of each individual or experimental unit for various factor-combinations. This may be called a multi-factor uni-response situation. Such data are usually analyzed under the assumption of a normal distribution for the response.
More generally, we may have observations on several characters of each individual or experimental unit for various factor-combinations. This may be called a multi-factor multi-response situation. Such data also are usually analyzed under the assumption of joint normality of the several responses for each individual.
Similarly, when a number of observations on several variables are available, the associations between the variables are generally studied under the assumption of normality.
In Part I of this thesis, we shall be concerned with
experimental data given in the form of frequencies in cells
determined by a finitely multi-way cross-classification, with
predefined categories, finite in number, along each way of
classification.
We shall pose hypotheses which might be considered to be generalizations, appropriate to this set up, of the usual hypotheses (i) in classical "normal" univariate "fixed effects" analysis of variance or ANOVA, (ii) in "normal" multivariate "fixed effects" analysis of variance or MANOVA, and (iii) in analysis of various kinds of "normal" independence, and shall offer large sample tests for such hypotheses.
The large sample tests suggested for all these cases in Part I are based on the frequency X²-test of Karl Pearson [25].
Analysis of categorical data, thus going back to Karl
Pearson, has been developed at subsequent stages, among others,
by Fisher [10], Barnard [2], E. S. Pearson [24], Cramer [7]
and Neyman [22].
However, Part I of this thesis is along the line, historically going back to Barnard and E. S. Pearson for the simple 2 x 2 table, but developed extensively (and for a long time in ignorance of Barnard and Pearson's prior work) for more general cases by Mitra [19], Roy and Mitra [33], Ogawa [23] and Diamond [8]. So far as the mathematical methods are concerned, Part I of this thesis uses and extends the ones introduced by Cramer [7] and further developed by Mitra [19], Ogawa [23] and Diamond [8].
The general probability model for Part I is that of a
product of several multinomial distributions.
According as
the marginal frequencies along any way or dimension are held
fixed or left free, that dimension or way will be said to be
associated with a "factor" or a "response."
The model is then

\[
\varphi \;=\; \prod_{j} \frac{n_{oj}!}{\prod_{i} n_{ij}!}\,\prod_{i} p_{ij}^{\,n_{ij}},
\qquad \text{such that } \sum_{i} p_{ij} = 1 \ \text{and}\ \sum_{i} n_{ij} = n_{oj}\ \text{is fixed,}
\]

i = 1,2,...,r and j = 1,2,...,s.
Thus here "i" refers to categories of the response while "j"
refers to categories of the factor. n_{oj} denotes the preassigned sample size for the j-th factor-category, out of which n_{ij} happen to lie in the i-th response-category. It should be further noticed that i may be a multiple subscript, say, i_1 i_2 ... i_k with i_1 = 1,2,...,r_1; i_2 = 1,2,...,r_2; ...; i_k = 1,2,...,r_k, so that, all combinations being supposed to be allowed, r = r_1 r_2 ... r_k. Likewise, j also might be a multiple subscript, say, j_1 j_2 ... j_t with j_1 = 1,2,...,s_1; ...; j_t = 1,2,...,s_t, but with this important distinction that all combinations may not be allowed. This will be called a k-response (or k-variate) and t-factor problem, i_1, i_2, ..., i_k denoting categories of responses and j_1, j_2, ..., j_t denoting categories of factors.
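The product-multinomial sampling scheme just described may be sketched in a short simulation; the sample sizes n_oj and the cell probabilities below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: r = 3 response categories, s = 4 factor
# categories.  Each factor category j has a preassigned sample size n_oj,
# and the counts (n_1j, ..., n_rj) form an independent multinomial draw
# with cell probabilities (p_1j, ..., p_rj); each column of p sums to one.
n_oj = np.array([50, 80, 60, 70])
p = np.array([[0.2, 0.5, 0.3, 0.4],
              [0.3, 0.2, 0.4, 0.4],
              [0.5, 0.3, 0.3, 0.2]])

counts = np.column_stack(
    [rng.multinomial(n_oj[j], p[:, j]) for j in range(p.shape[1])]
)

# The column totals n_oj are fixed by design ("j" is a factor), while
# the row totals n_io are random ("i" is a response).
assert (counts.sum(axis=0) == n_oj).all()
```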
According as a set of (real) values is or is not associated with the categories along any way of classification (factor or response), that way of classification will be said to be structured or unstructured.
If the categories are
class-intervals for a continuous variate (or factor), then
the distances from an arbitrary origin of, say, the midpoints
of such intervals form a natural set of associated scores.
Likewise, if the variate (or factor) is discrete, then these
values will be natural scores.
If the response (or factor)
is not even discrete but is categorical with an implied ranking (like good, fair, bad) then the scores may be assigned to
these categories accordingly.
It may happen that a system of
scores is assigned on some other considerations, even for
categories without any implied ranking to start with.
So far as the problems of interest are concerned, the main distinction between the unstructured and the structured case seems to be the following. It is possible, and also useful, in the structured case to define certain over-all aspects of the distribution. Then, while it is still possible to study the same problems as in the unstructured case, such problems are hardly of the same interest for the structured case, and the problems involving the newly defined over-all aspects are the ones that become more meaningful.
Part II considers, broadly speaking, the same kinds of problems as Part I, but under the usual probability models of nonparametric inference.
An important distinction between the Part I approach (which might be characterized as one kind of nonparametric set up and hence might be called the categorical nonparametric set up) and the nonparametric approach may be pointed out. In the nonparametric set up, while studying some aspect of different populations, it is generally assumed that the population distributions are identical, apart from the aspect that is being studied. In the case of one-way classification, for example, it is assumed that the factor under consideration leaves the entire distribution unaffected except for location.
In some situations such an assumption may be unrealistic; in other words, the main point of interest might be just how the different levels of a particular factor are affecting some special feature of the whole response and not the whole response itself.
This type of problem, in particular, and some other types, in general, can be tackled more easily (at least for large samples) under the approach of Part I than under the nonparametric approach. Naturally, since the set up is more general, with fewer assumptions, we have to be content, at least at this stage, with approximate criteria of an asymptotic nature. On the other hand, in the nonparametric set up, with some broad restrictive assumptions, it is often possible to have exact criteria for small samples and asymptotic approximations for large samples.
Thus the nonparametric approach may be recommended where we have reasons to believe that the restrictive model is not so unrealistic, while the categorical approach may be recommended for other situations, with the proviso that, as of now, no small sample tests are available. This means that, at the moment, there is nothing we can do in those situations where the categorical approach seems to be more reasonable and where, at the same time, we need exact tests.
In this thesis, the first three chapters (i.e., Part I)
deal with the categorical approach, while the last two chapters (i.e., Part II) deal with the nonparametric approach.
In Chapter I, the hypotheses are posed which are considered to be analogues appropriate to this categorical set up
of the usual hypotheses in classical normal "fixed effects"
ANOVA and MANOVA, and in analysis of various kinds of normal
independence.
A more detailed version of this material has
already appeared elsewhere (see Roy and Bhapkar [31]), but
is included here for the sake of completeness.
In Chapter II, two theorems on minimum X₁² are proved. These have already been proved by Neyman [22]. The first one is proved here along Cramer's lines, while the second gives the justification for Neyman's linearization procedure.
In Chapter III, some special problems out of those posed in Chapter I are investigated. The univariate two-factor problems are studied in some detail and other problems are considered briefly. It has been shown that for "linear hypotheses," the minimum X₁² is the same as the one obtained by the "general least squares" approach on some asymptotically normal variables.
In Chapter IV, Mood's test [20] for the two-way classification has been extended to cover incomplete block situations. An extension of Hoeffding's theorem [14] on U-statistics is stated and proved, and a new test-criterion for the problem of c samples is offered.
In Chapter V, some regression problems and some bivariate problems in the nonparametric set up are studied. Most of the test-criteria developed are asymptotic in nature. The methods employed for the regression problems are extensions of those used by Mood and Brown [20].
NOTATION
As far as possible the following notation will be used,
all departures being clearly indicated at the proper places.
Matrices will be denoted by capital letters; small letters underscored will denote column-vectors, and row-vectors if they are primed. The transpose of a matrix A is denoted by A′. A matrix M with p rows and q columns will sometimes be written as M_{p×q} to denote its structure.

d.f. denotes degrees of freedom.

N(μ, σ) denotes a normal variable with mean μ and standard deviation σ.

N(μ, Σ) denotes a random variable having the multivariate normal distribution with mean-vector μ and variance-covariance matrix Σ.

→(p) denotes convergence in probability.

When there are multiple subscripts, as in p_{ijk}, a zero in the place of a subscript indicates the result of summation over that subscript.

A star in the place of a subscript will indicate that the quantity in question is independent of that subscript.

If "i" denotes the categories of a response, we shall, in short, say that "i" is a response.

J_r will denote a matrix [1]_{r×r}; e will denote a column-vector of unities.

≐ will denote 'approximately' (sometimes in probability).

O will denote 'of the order' (sometimes in probability).

E will denote 'expectation.'

In Chapter IV, sometimes, capital letters denote 'variables' while small letters denote 'fixed quantities.'

In Part I, q with some subscripts will denote a quantity which is not necessarily a probability.
CHAPTER I
SOME ANALOGUES OF THE CUSTOMARY HYPOTHESES
IN "NORMAL" ANOVA, MANOVA, AND IN
STUDIES OF "NORMAL" ASSOCIATION
1.1 Introduction
Roy and Mitra [33] state:
"This is an attempt at a
somewhat systematic exposition (i) which is based on a clear
distinction between a 'variate' (response) and a 'way of
classification' (factor), that stems from differing experimental situations and sampling schemes, (ii) which sets up
different probability models for the different situations,
and (iii) which poses different types of hypotheses according
as it is a 'multivariate analysis' situation or an 'analysis
of variance' situation or something of a mixed type."
Accordingly, we shall pose hypotheses that might be considered to be generalizations, appropriate to this categorical set up, of the usual hypotheses in the classical 'normal' set up. The problems of interest will naturally depend on the nature of 'responses' and 'factors.'
If a set
of values is associated with the categories along any way of
classification (factor or response), that way of classification will be said to be structured.
We shall consider three
different types of problems, namely where (i) all responses
are unstructured and so also are all factors, (ii) some responses are structured and factors are unstructured,
(iii) responses are structured and so also are some factors.
To make everything concrete we shall consider here only some
three-way, four-way and five-way tables.
This will serve as
an illustration and will indicate what happens as we increase
the dimensionality of the table.
1.2 A three-way table (ijk) in which "i" is a response and "j" and "k" are factors
Let i denote the categories of the response and j, k denote the categories corresponding to the two factors. Then the probability model is given by the product-multinomial distribution

(1.2.1)
\[
\varphi \;=\; \prod_{j,k} \frac{n_{ojk}!}{\prod_{i} n_{ijk}!}\,\prod_{i} p_{ijk}^{\,n_{ijk}},
\]

where Σ_i p_ijk = p_ojk = 1 and Σ_i n_ijk = n_ojk (fixed). Suppose i = 1,2,...,r; j = 1,2,...,s and k = 1,...,t, but with the provision that all combinations (jk) may not be allowed. In other words, given j, k takes a set of values which is a subset of (1,2,...,t), depending upon j. We shall refer to these as either a complete design or an incomplete design, as the case may be.
1.2.1 The case where "i", "j" and "k" are all unstructured

The hypothesis of no interaction between "j" and "k" (i.e., between the two factors) means, essentially, that for a given i, there is a lesser number of unknown parameters than would be given by all allowable (jk) combinations. Two specializations are in the same spirit as in ordinary analysis of variance.
(1.2.2)  H_01: p_ijk = q_ij* q_i*k ,

and

(1.2.3)  H_01^(1): p_ijk = q_ij* + q_i*k .

As explained in the section on notation, we shall be using q with some subscripts as a general symbol for quantities which are not necessarily probabilities.

The physical interpretations are as follows. For (1.2.2), p_ijk / p_ij'k is independent of "k". This means that, for any two categories of the first factor, the proportions in the i-th category of the response are in the same ratio for any category of the second factor. Similarly, p_ijk / p_ijk' is independent of "j", with a similar interpretation. Hence we might call (1.2.2) the hypothesis of no interaction (between the two factors) in the multiplicative sense. For (1.2.3), p_ijk − p_ij'k is independent of "k" and p_ijk − p_ijk' is independent of "j", with similar interpretations. Hence we might call (1.2.3) the hypothesis of no interaction (between the two factors) in the additive sense.
Now, if the design is complete, then summing both sides of (1.2.2) over "j" and "k", separately and then jointly, it is easy to check that (1.2.2) can be rewritten (letting c stand for complete) in the equivalent form

(1.2.4)  H_01^(c): p_ijk = p_iok p_ijo / p_ioo .
It must be remembered that none of p_iok, p_ijo or p_ioo is a probability, each being based on summation over subscripts belonging to different multinomial distributions. Thus (1.2.4) is only formally similar to (1.3.2) (to be discussed in the next section), but (1.2.4) would be identical with the condition that there is no partial association between "j" and "k", for given "i", if "j" and "k" were variates and not factors as in the present case. Thus the "no interaction" hypothesis in the form (1.2.2) (for a complete design) is related to the hypothesis of "no partial association" in the case where "j" and "k" are variates.
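The equivalence of (1.2.2) and (1.2.4) for a complete design is easily verified numerically; the q-arrays below are arbitrary positive numbers, and the normalization over "i" is ignored since it does not affect the algebra of the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
r, s, t = 3, 4, 5

# Hypothetical q-arrays (not probabilities, as the text emphasizes).
q_ij = rng.uniform(0.5, 2.0, size=(r, s))   # q_{ij*}
q_ik = rng.uniform(0.5, 2.0, size=(r, t))   # q_{i*k}

# Multiplicative no-interaction form (1.2.2): p_ijk = q_{ij*} q_{i*k}.
p = q_ij[:, :, None] * q_ik[:, None, :]

# Zero subscript = summation over that subscript, per the notation section.
p_ijo = p.sum(axis=2)          # p_{ijo}
p_iok = p.sum(axis=1)          # p_{iok}
p_ioo = p.sum(axis=(1, 2))     # p_{ioo}

# Complete-design equivalent form (1.2.4): p_ijk = p_{iok} p_{ijo} / p_{ioo}.
rhs = p_iok[:, None, :] * p_ijo[:, :, None] / p_ioo[:, None, None]
assert np.allclose(p, rhs)
```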
Going back to H_01, we can test the narrower hypothesis that q_ij* is independent of "j", or, in other words, that p_ijk is independent of "j", which, without any loss of generality, can be written in the form

(1.2.5)  H_02: p_ijk = q_i*k .
We shall eventually get the same hypothesis if we proceed similarly from H_01^(1). Now, if we assume for concreteness that "j" stands for treatments and "k" for blocks, (i) H_01 and (ii) H_01^(1) state respectively the hypothesis or model of no interaction (i) in the multiplicative sense and (ii) in the additive sense. H_02 states the hypothesis of no treatment effect. It is open to us (depending upon past knowledge)

(a) to start from (1.2.1) as the model and test as a hypothesis either H_01 or H_01^(1), or directly even H_02, or

(b) to start from a model which is (1.2.1) together with either H_01 or H_01^(1), and then to test H_02 as a hypothesis.

1.2.2 The case where "i" is structured
In this case, the natural analogues of H_01 and H_01^(1) seem to be

(1.2.6)  H_03: Σ_i a_i p_ijk = q_**k q_*j* ,

and

(1.2.7)  H_03^(1): Σ_i a_i p_ijk = q_**k^(1) + q_*j*^(1) ,

where the a_i's are the scores associated with the categories of the response. H_03 and H_03^(1) are then seen to be hypotheses of no interaction in the multiplicative and the additive sense respectively, appropriate to the case of a structured variate where we might be primarily interested in the average response. (1.2.7) seems to be more natural, being in the spirit of the usual hypothesis of no interaction in the analysis of variance.

Remembering that q_**k^(1) and q_*j*^(1) are completely unknown, we can rewrite (1.2.7) in the equivalent form

(1.2.8)  H_03^(1): Σ_i a_i p_ijk = Σ_i a_i q_i*k^(1) + Σ_i a_i q_ij*^(1) ,

or

Σ_i a_i [p_ijk − q_i*k^(1) − q_ij*^(1)] = 0 .
It is easy to see that H_01^(1) ⟹ H_03^(1) but not conversely, and that (1.2.8) for all sets of a_i's ⟹ H_01^(1). On the other hand, neither H_01 ⟹ H_03 nor does the converse hold (even for all sets of a_i's). Suppose we ask what happens if (1.2.6) is to hold for all sets of a_i's? Rewriting the right side of (1.2.6) as, say, q_*j* Σ_i a_i q_i*k, we can rewrite (1.2.6) as

(1.2.9)  H_03: Σ_i a_i [p_ijk − q_i*k q_*j*] = 0 .
This means that if we set up a hypothesis

(1.2.10)  H_04: p_ijk = q_i*k q_*j* ,

then H_04 ⟹ H_03 for all a_i's, but not conversely. However, H_03 for all sets of a_i's ⟹ H_04, with a counterpart formed by interchanging "j" and "k" on the right side of (1.2.10). Thus it turns out that H_03 does not have a natural tie-up with H_01 in the sense in which H_03^(1) has a natural tie-up with H_01^(1). The tie-up of H_03 (in this sense) is with H_04 or its counterpart.

However, if we sum both sides of (1.2.10) over i, we should have 1 = q_*j* Σ_i q_i*k, which really means that both are pure constants. Thus H_04 is essentially H_02, and (1.2.6) for all sets of a_i's is the same as H_02 and its counterpart.
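The implication H_01^(1) ⟹ H_03^(1) can also be checked numerically: if p_ijk has the additive form (1.2.3), the score-weighted sums Σ_i a_i p_ijk split into a function of "j" plus a function of "k", so all second differences over (j, k) vanish. The q-arrays and scores below are arbitrary illustrative numbers, with the normalization over "i" again ignored:

```python
import numpy as np

rng = np.random.default_rng(2)
r, s, t = 3, 4, 5

q_ij = rng.uniform(size=(r, s))             # q_{ij*}
q_ik = rng.uniform(size=(r, t))             # q_{i*k}
a = rng.uniform(size=r)                     # scores a_i for the response

# Additive no-interaction form (1.2.3): p_ijk = q_{ij*} + q_{i*k}.
p = q_ij[:, :, None] + q_ik[:, None, :]

# Weighted sum over the response: s_jk = sum_i a_i p_ijk.
s_jk = np.einsum('i,ijk->jk', a, p)

# H_03^(1): s_jk decomposes as (function of j) + (function of k),
# i.e. all second differences s_jk - s_j'k - s_jk' + s_j'k' vanish.
d = s_jk[1:, 1:] - s_jk[:-1, 1:] - s_jk[1:, :-1] + s_jk[:-1, :-1]
assert np.allclose(d, 0.0)
```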
Going back to H_02, we write its analogue in the form

(1.2.11)  H_05: Σ_i a_i p_ijk = q_**k .

Again, as before, H_02 ⟹ H_05 but not conversely, while H_05 for all sets of a_i's ⟹ H_02. The other remarks made after (1.2.5) would also carry over to this case, covering (1.2.6), (1.2.7) and (1.2.11).
7
Going back to (1.2.6), another hypothesis which has
( 1)
( 1)
the same relation to H0 1 as Ho a has to H01 is
ai
(1.2.12)
Hoe: 1T P. 'k = q*j* q**k
•
~J
i
a.
a.~
, we can rewrite
Rewriting the right side as TI qij*~
qi*k
i
(1.2.12) as
(1.2.13)
Now H01
ai's)
~
>H06
but not conversely, and also Hoe (for all
Ho 1 •
We notice, (1.2.12) implies that the weighted
geometric mean of Pijk (over "i") has the 'no interaction
property.'
How meaningful this interpretation would be is
not very clear and thus (1.2.12) is offered quite tentatively.
1.2.3 The case where "i" and "j" are structured

In this case (assuming a given set of weights b_j's to go with "j") the natural analogues of H_03, H_03^(1) and H_05 seem to be

(1.2.14)  H_07: Σ_i a_i p_ijk = q_**k × an assumed function of the b_j's
                             = q_**k (λ + μ b_j), say,
          or                 = λ_k + μ_k b_j (which is more general),

where λ_k and μ_k are unknown functions of k, and λ and μ are unknown constants,

(1.2.15)  H_07^(1): Σ_i a_i p_ijk = q_**k^(1) + μ b_j ,

and

(1.2.16)  H_08: Σ_i a_i p_ijk = q_**k .
8
The remarks made after (1.2.5) would also carryover here.
It may be noted that "j" may be a structured factor in a
two-dimensional (jk) design or a concomitant variable in a
one-way (k) classification.
parallel to
(1.2.17)
Another meaningful hypothesis
HOB is
Hog:
Lai
Pijk
= h + ~bj
•
i
1.2.4 The case where "i", "j" and "k" are all structured

In this case, assuming furthermore a given set of weights c_k's to go with "k", it is possible to set up the same H_07, H_07^(1), H_08 and H_09, and also similar ones in which the roles of "j" and "k" are interchanged. However, the more interesting hypothesis would seem to be

(1.2.18)  H_010: Σ_i a_i p_ijk = an assumed function of b_j and c_k
                              = λ + μ b_j + ν c_k , say.

(i) Starting from (1.2.1) as a model, we can test this hypothesis, or directly one in which μ = 0 and/or ν = 0; or (ii) starting from (1.2.1) together with (1.2.18) as a model, we can test the hypothesis that μ = 0 and/or ν = 0.
1.3 A three-way table (ijk) in which "i", "j" and "k" are responses

The probability model is given by

(1.3.1)
\[
\varphi \;=\; \frac{n!}{\prod_{i,j,k} n_{ijk}!}\,\prod_{i,j,k} p_{ijk}^{\,n_{ijk}},
\]

where Σ_{i,j,k} p_ijk = 1 and Σ_{i,j,k} n_ijk = n (fixed).
1.3.1 The case where "i", "j" and "k" are all unstructured

Consider

(1.3.2)  H_011: p_ijk = p_iok p_ojk / p_ook ,

which can be described as the hypothesis of no partial association between "i" and "j", given "k". There are two equivalent ways to get at (1.3.2). One is to notice that the conditional joint distribution of "i" and "j", given "k", is p_ijk / p_ook, the conditional marginal distribution of "i", given "k", is p_iok / p_ook, and that of "j", given "k", is p_ojk / p_ook, which leads to (1.3.2). Another way is to start with the condition

(1.3.3)  p_ijk / p_ojk is independent of "j" (= q_i*k , say),

which means that the conditional distribution of "i", given "j" and "k", is independent of "j", and then rewrite (1.3.3) in the form p_ijk = p_ojk q_i*k. Summing over "j" we have q_i*k = p_iok / p_ook, so that (1.3.3) is equivalent to (1.3.2). This hypothesis and the appropriate X² test have already been discussed in [33].
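The equivalence of (1.3.2) and (1.3.3) may be illustrated numerically by constructing a table in which "i" and "j" are conditionally independent given "k"; the conditional distributions below are randomly generated, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
r, s, t = 3, 4, 2

# Build a single-multinomial table in which "i" and "j" are conditionally
# independent given "k": p_ijk = p_ook * P(i|k) * P(j|k).
p_k = rng.dirichlet(np.ones(t))                      # p_{ook}
p_i_given_k = rng.dirichlet(np.ones(r), size=t).T    # columns P(i|k), r x t
p_j_given_k = rng.dirichlet(np.ones(s), size=t).T    # columns P(j|k), s x t
p = p_i_given_k[:, None, :] * p_j_given_k[None, :, :] * p_k[None, None, :]
assert np.isclose(p.sum(), 1.0)

p_iok = p.sum(axis=1)            # p_{iok}
p_ojk = p.sum(axis=0)            # p_{ojk}
p_ook = p.sum(axis=(0, 1))       # p_{ook}

# No partial association between "i" and "j", given "k" (1.3.2):
rhs = p_iok[:, None, :] * p_ojk[None, :, :] / p_ook[None, None, :]
assert np.allclose(p, rhs)
```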
We next proceed to the hypothesis that "j" and "k" are independent and so also are "i" and "k", i.e.,

(1.3.4)  H_012: p_ojk = p_ojo p_ook and p_iok = p_ioo p_ook .

We note that H_011 ∩ H_012 ⟹

(1.3.5)  H_013: p_ijk = p_ioo p_ojo p_ook ,

which is the hypothesis of over-all independence of "i", "j" and "k". It should also be noticed that

(1.3.6)  H_014: p_ijk = p_ijo p_ook ,

which is the hypothesis that there is no multiple association between "ij" and "k". It can be seen that H_014 ⟹ H_012, but not conversely (unlike the multivariate normal case). One could also get to (1.3.6) by starting either from the condition that

(1.3.7)  p_ijk / p_ijo is independent of "ij" ,

or from the one that

(1.3.8)  p_ijk / p_ook is independent of "k" .

Another hypothesis which might be of interest is that of two by two independence, or in symbols,

(1.3.9)  H_015: p_ijo = p_ioo p_ojo , p_iok = p_ioo p_ook and p_ojk = p_ojo p_ook .

It is well known that H_013 ⟹ H_015, but not conversely in general. The hypotheses H_013 and H_014 have already been discussed in [33].
1.3.2 The case where "i" is structured

We may consider hypotheses analogous to (1.2.6) and (1.2.7), namely

(1.3.10)  H_016: Σ_i a_i p_ijk / p_ojk = q_*j* q_**k ,

or

(1.3.11)  H_016^(1): Σ_i a_i p_ijk / p_ojk = q_*j*^(1) + q_**k^(1) .
Next let us consider a hypothesis analogous to H_011, using the version of H_011 given by (1.3.3):

(1.3.12)  H_017: Σ_i a_i p_ijk / p_ojk is independent of "j" .

It is easily seen that (1.3.12) could be written in the equivalent form

(1.3.13)  H_017: Σ_i a_i [p_ijk − p_iok p_ojk / p_ook] = 0 .

Thus H_011 ⟹ H_017 but not conversely. However, (1.3.13) for all a_i's ⟹ H_011.
Also analogous to H_012 we now have

(1.3.14)  H_018: Σ_i a_i p_ijo / p_ojo is independent of "j" ,
                 Σ_i a_i p_iok / p_ook is independent of "k" ,

which is equivalent to

(1.3.15)  H_018: Σ_i a_i (p_ijo − p_ioo p_ojo) = 0 ,
                 Σ_i a_i (p_iok − p_ioo p_ook) = 0 .

The same remarks are applicable as after (1.3.13).
Likewise, analogous to H_014 we have

(1.3.16)  H_019: Σ_i a_i p_ijk / p_ojk is independent of "jk" ,

which is equivalent to

(1.3.17)  H_019: Σ_i a_i (p_ijk − p_ioo p_ojk) = 0 .

The same remarks are applicable again as after (1.3.13).
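The equivalence of (1.3.12) and (1.3.13) can likewise be checked numerically. The construction below perturbs the conditional distributions with "j" while holding the score-weighted mean fixed; all the particular numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
r, s, t = 3, 4, 2
a = np.array([1.0, 2.0, 4.0])                  # scores a_i

# A direction d with sum_i d_i = 0 and sum_i a_i d_i = 0: adding eps*d to
# a conditional distribution changes it without moving its score mean.
d = np.array([2.0, -3.0, 1.0])
assert np.isclose(d.sum(), 0.0) and np.isclose(a @ d, 0.0)

base = np.array([[0.30, 0.25],
                 [0.40, 0.45],
                 [0.30, 0.30]])                # P(i|k), columns sum to 1
eps = rng.uniform(-0.02, 0.02, size=(s, t))    # varies with j and k

cond = base[:, None, :] + eps[None, :, :] * d[:, None, None]  # P(i|j,k)
assert (cond > 0).all()

p_jk = rng.dirichlet(np.ones(s * t)).reshape(s, t)   # p_{ojk}
p = cond * p_jk[None, :, :]                          # p_{ijk}

# (1.3.12): the weighted conditional mean depends on "k" only, not on "j".
m = np.einsum('i,ijk->jk', a, p) / p.sum(axis=0)
assert np.allclose(m, m[0])

# (1.3.13): equivalently, sum_i a_i (p_ijk - p_iok p_ojk / p_ook) = 0.
p_iok = p.sum(axis=1)
p_ook = p.sum(axis=(0, 1))
resid = p - p_iok[:, None, :] * p_jk[None, :, :] / p_ook[None, None, :]
assert np.allclose(np.einsum('i,ijk->jk', a, resid), 0.0)
```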
1.3.3 The case where "i" and "j" are structured

If, for concreteness, we fix our attention on "i", then the hypotheses of independence of "i" with respect to "j" and "k" of the types (1.3.12), (1.3.14) and (1.3.16) (as well as the tests) remain as before. However, we have other interesting possibilities. Assuming a set of weights b_j's to go with "j", we can write, for example, a hypothesis related to H_017 in the form

(1.3.18)  H_020: Σ_i a_i p_ijk / p_ojk = an assumed function of b_j × an assumed function of "k"
                                      = (λ + μ b_j) q_**k (say),
          or                          = λ_k + μ_k b_j (which is more general).

We can test this in the spirit of testing for linearity of regression; or, assuming this as a model, we can test the hypothesis that μ = 0 or μ_k = 0.

A hypothesis naturally related to H_018 would seem to be

(1.3.19)  H_021: Σ_i a_i p_ijo / p_ojo = an assumed function of b_j = λ + μ b_j (say),
                 Σ_i a_i p_iok / p_ook is independent of "k" .
1.3.4 The case where "i", "j" and "k" are all structured

Here we have further interesting possibilities. Assuming a set of weights c_k's to go with "k", we have, for example, a hypothesis related to H_019 in the form

(1.3.20)  H_022: Σ_i a_i p_ijk / p_ojk = an assumed function of b_j and c_k
                                      = λ + μ b_j + ν c_k (say).

We can test this hypothesis in the spirit of testing for linearity of regression, treating λ, μ and ν as unknown constants. Or, assuming (1.3.20) as a model, we can also test the hypothesis that μ = 0 or that ν = 0. Likewise, the one related to (1.3.14) seems to be

(1.3.21)  H_023: Σ_i a_i p_ijo / p_ojo = an assumed function of b_j = λ_1 + μ b_j (say),
                 Σ_i a_i p_iok / p_ook = an assumed function of c_k = λ_2 + ν c_k (say).

This also may be tested in the spirit of testing for linearity of regression, or, assuming this as a model, we can test for, say, μ = 0 and/or ν = 0.

It will be seen that in this study of association in this categorical set up we have been working in the spirit of "regression" rather than in the spirit of "correlation." We have not been trying to use a single measure for any of the various types of association. Such a single measure seems to have a limited use in this categorical set up. However, such single measures (which come out as the noncentrality parameters in the asymptotic power functions of the respective tests for independence) have already been discussed in [8].
1.4 A four-way (ijkt) table in which "i", "j", "k" and "t" are all responses

(1.4.1)
\[
\varphi \;=\; \frac{n!}{\prod_{i,j,k,t} n_{ijkt}!}\,\prod_{i,j,k,t} p_{ijkt}^{\,n_{ijkt}},
\]

where Σ_{i,j,k,t} p_ijkt = 1 and Σ_{i,j,k,t} n_ijkt = n (fixed).

There is much in common between this four-variate case and the three-variate case discussed in 1.3. However, the four-variate case presents certain new features and we shall state some for purposes of illustration.

1.4.1 The case where all are unstructured

(1.4.2)  H_024: p_ijko = p_ioko p_ojko / p_ooko and p_ijot = p_ioot p_ojot / p_ooot .

1.4.2 The case where "i", "j" and "k" are structured

(1.4.3)  H_025: Σ_i a_i p_ijko / p_ojko = an assumed function of b_j and c_k
                                       = λ + μ b_j + ν c_k (say),
                Σ_i a_i p_ijot / p_ojot = q_***t × an assumed function of b_j
                                       = q_***t (λ_1 + μ_1 b_j) (say).
1.5 A five-way (ijktm) table in which "i" and "j" are responses and "k", "t" and "m" are factors

The probability model is given by

(1.5.1)
\[
\varphi \;=\; \prod_{k,t,m} \frac{n_{ooktm}!}{\prod_{i,j} n_{ijktm}!}\,\prod_{i,j} p_{ijktm}^{\,n_{ijktm}},
\]

where Σ_{i,j} p_ijktm = 1 and Σ_{i,j} n_ijktm = n_ooktm (fixed).

There is much in common between this case and the corresponding three-way case discussed in 1.2. We shall discuss some new features.

1.5.1 The case where all are unstructured

We may consider the hypothesis

(1.5.2)  H_026: p_ijktm = q_ij*tm q_ijk*m q_ijkt* ,

which may be interpreted as the hypothesis of no three-factor interaction (in the multiplicative sense), or a similar hypothesis H_026^(1) in the additive set up. Similarly, we may consider

(1.5.3)  H_027: p_ijktm = q_ijk** q_ij*t* q_ij**m ,

which may be interpreted as the hypothesis of no two-factor interaction, and a similar hypothesis H_027^(1); and finally

(1.5.4)  H_028: the right side is independent of one or more of the factors "k", "t" and "m" ,

which may be interpreted as the hypothesis of no corresponding main-effects, and a similar hypothesis H_028^(1). One may start from (1.5.1) and test H_026 or H_026^(1) as a hypothesis, or from H_026 or H_026^(1) as a model and test H_027 or H_027^(1) as a hypothesis, or from H_027 or H_027^(1) as a model and test H_028 or H_028^(1) as a hypothesis. There are various intermediate cases.
1.5.2 The case where "i" and "j" are structured

The analogues of the hypotheses in the previous case seem to be as follows:

(1.5.5)  H_029: Σ_i a_i p_ioktm = q_***tm^(1) q_**k*m^(1) q_**kt*^(1) ,
                Σ_j b_j p_ojktm = q_***tm^(2) q_**k*m^(2) q_**kt*^(2) ,

or H_029^(1) with the additive set up,

(1.5.6)  H_030: Σ_i a_i p_ioktm = q_**k**^(1) + q_***t*^(1) + q_****m^(1) ,
                Σ_j b_j p_ojktm = q_**k**^(2) + q_***t*^(2) + q_****m^(2) ,

or H_030^(1) with the multiplicative set up, and finally

(1.5.7)  H_031: the right side is independent of one or more of "k", "t" and "m" ,

or H_031^(1) with the additive set up. Finally we consider
1.5.3 The case where "i", "j" and "k" are structured

The hypotheses of interest might be, for example, as follows:

(1.5.8)  H_032: Σ_i a_i p_ioktm = λ_tm^(1) + μ_tm^(1) c_k (say),
                Σ_j b_j p_ojktm = λ_tm^(2) + μ_tm^(2) c_k (say),

or

(1.5.9)  H_033: Σ_i a_i p_ioktm = λ_tm^(1) + μ^(1) c_k (say),
                Σ_j b_j p_ojktm = λ_tm^(2) + μ^(2) c_k (say),

(1.5.10)  H_034: Σ_i a_i p_ioktm = λ_t*^(1) + λ_*m^(1) + μ^(1) c_k ,
                 Σ_j b_j p_ojktm = λ_t*^(2) + λ_*m^(2) + μ^(2) c_k ,

and finally

(1.5.11)  H_035: the right side is independent of one or more of "t" and "m", with or without the μ's being zero.
CHAPTER II
ON SOME BASIC THEOREMS OF NEYMAN ON X₁²
AND "LINEARIZATION"
2.1
Introduction
Let

(2.1.1)   ω = Π_j [ n_.j! / Π_i n_ij! ] Π_i p_ij^{n_ij}

denote a product multinomial distribution, so that

    Σ_i n_ij = n_.j   is fixed   and   Σ_i p_ij = 1 .
If a hypothesis Ho is given in the form of certain constraints
on the p_ij's, then the large sample test of Ho, under (2.1.1)
for the model, is in terms of a statistic given by

(2.1.2)   Σ_{i,j} (n_ij − n_.j p̂_ij)² / (n_.j p̂_ij) ,

in which the p̂_ij's are estimates of the p_ij's which maximize
ω subject to Σ_i p_ij = 1 and to the further constraints on the
p_ij's that define the hypothesis. This statistic, in the limit
as n → ∞ subject to the n_.j/n's being held fixed, is distributed
as a χ² with degrees of freedom equal to the number of independent
constraints on the p_ij's that define the hypothesis.

It may be observed that instead of the maximum likelihood
estimates of the p_ij's, one might as well consider any set of
estimates belonging to the broader class of BAN estimates, as
shown by Neyman [22]. Likewise, instead of the statistic (2.1.2)
one might as well consider the slightly different one known as
χ₁². If we introduce some additional constraints on the p_ij's
and thus define a new hypothesis H*_o ⊂ Ho, then the test of
H*_o, under (2.1.1) for the model, is in terms of a statistic
given by

(2.1.3)   Σ_{i,j} (n_ij − n_.j p̂*_ij)² / (n_.j p̂*_ij) ,

which, in the limit as n → ∞ subject to the n_.j/n's being held
fixed, has the χ² distribution with degrees of freedom equal to
the number of independent constraints on the p_ij's that
define H*_o.
However, if we want to test H*_o under Ho for the model, then
the test will be given in terms of the statistic

(2.1.4)   Σ_{i,j} (n_ij − n_.j p̂*_ij)² / (n_.j p̂*_ij)
              − Σ_{i,j} (n_ij − n_.j p̂_ij)² / (n_.j p̂_ij) ,

which, in the limit as n → ∞ subject to the n_.j/n's being held
fixed, has the χ² distribution with degrees of freedom equal to
the number of additional independent constraints on the p_ij's
that define H*_o under Ho for the model. As before, any set of
BAN estimates may be used, and similarly, the slightly different
χ₁² statistics may be used.
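The relation between (2.1.2) and its χ₁² variant can be made concrete: the two statistics share the same numerator, but (2.1.2) divides by the expected counts n_.j p̂_ij while χ₁² divides by the observed counts n_ij. A minimal numerical sketch with hypothetical counts follows, for s = 2 samples and the hypothesis of homogeneity (the maximum likelihood estimates under that hypothesis are the pooled proportions):

```python
# Sketch: Pearson chi-square (2.1.2) vs Neyman's modified chi-square
# for a product multinomial, testing homogeneity of s = 2 samples.
# Under Ho: p_i1 = p_i2, the ML estimate is the pooled proportion.
# The counts n[i][j] are hypothetical.

n = [[30, 40],   # category i = 1
     [50, 40],   # category i = 2
     [20, 20]]   # category i = 3

r, s = 3, 2
col = [sum(n[i][j] for i in range(r)) for j in range(s)]   # n_.j
tot = sum(col)

# Pooled estimates p-hat_ij = (row total)/n under homogeneity
p_hat = [[sum(n[i]) / tot for _ in range(s)] for i in range(r)]

# Pearson (2.1.2): sum (n_ij - n_.j p_ij)^2 / (n_.j p_ij)
pearson = sum((n[i][j] - col[j] * p_hat[i][j]) ** 2 / (col[j] * p_hat[i][j])
              for i in range(r) for j in range(s))

# Neyman chi_1^2: same numerator, observed count in the denominator
neyman = sum((n[i][j] - col[j] * p_hat[i][j]) ** 2 / n[i][j]
             for i in range(r) for j in range(s))

print(round(pearson, 4), round(neyman, 4))
```

Both statistics are asymptotically equivalent under the hypothesis; in moderate samples they differ slightly, as the printed values show.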
Neyman shows that the relevant equations (namely, maximum
likelihood, or minimum χ², or minimum χ₁²) have a system of
solutions which are BAN estimates of the parameters. On the
other hand, Cramér [7] shows, in the simplest case, that the
maximum likelihood equations (which are the same as the modified
minimum χ² equations) have a unique system of consistent
solutions, and that the χ² statistic based on this solution has
an asymptotic χ² distribution. Mitra [19] and Ogawa [23] extend
this to more general cases. We shall prove, along Cramér's
lines, the corresponding theorem, in the simplest case, for the
minimum χ₁² estimates and the χ₁² statistic. It could then be
extended to more general cases. Mitra [19] and Diamond [8] have
defined and obtained the asymptotic power of the χ² tests. The
same could be done for the χ₁² tests.
2.2  Theorem 2.2

Suppose that we are given r functions p_1(α), ..., p_r(α) of
s < r variables α' = (α_1, ..., α_s) such that, for all points
of a nondegenerate interval A in the s-dimensional space of α,
the functions p_i(α) satisfy the following conditions:

(a)  Σ_{i=1}^r p_i(α) = 1 ;

(b)  p_i(α) ≥ c² > 0 for all i ;

(c)  every p_i(α) has continuous derivatives ∂p_i/∂α_j and
     ∂²p_i/∂α_j ∂α_k ;

(d)  the matrix D = (∂p_i/∂α_j), i = 1, ..., r ; j = 1, ..., s,
     is of rank s.
Let the possible results of a certain random experiment E be
divided into r mutually exclusive groups, and suppose that the
probability of obtaining a result belonging to the i-th group is
p_i^0 = p_i(α^0), where α^0' = (α_1^0, ..., α_s^0) is an inner
point of A. Let v_i denote the number of results belonging to
the i-th group in a sequence of n repetitions of E, so that
Σ_{i=1}^r v_i = n. It is assumed that none of the v_i's is equal
to zero. Then the equations

(2.2.1)   Σ_{i=1}^r [(v_i − n p_i)/v_i] (∂p_i/∂α_j) = 0 ,
              j = 1, 2, ..., s ,

of the minimum χ₁² method have exactly one solution α̂ which
converges in probability to α^0, and

(2.2.2)   χ₁² = Σ_{i=1}^r [v_i − n p_i(α̂)]² / v_i

is, in the limit as n → ∞, distributed as a χ² with r − s − 1 d.f.
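A minimal numerical sketch of this minimum χ₁² method may help: the standard one-parameter trinomial p(α) = (α², 2α(1−α), (1−α)²) is used as a stand-in example with hypothetical counts, and the minimizing value of α (which solves (2.2.1)) is found here by a simple ternary search on the χ₁² objective rather than by solving the equations directly:

```python
# Sketch of the minimum chi_1^2 method of Theorem 2.2 for a
# one-parameter trinomial (Hardy-Weinberg proportions, a standard
# stand-in example; the counts below are hypothetical):
#   p_1(a) = a^2,  p_2(a) = 2a(1-a),  p_3(a) = (1-a)^2,  s = 1, r = 3.
# chi_1^2(a) = sum (v_i - n p_i(a))^2 / v_i is minimized over a; the
# minimized statistic is asymptotically chi-square with
# r - s - 1 = 1 d.f.

v = [36, 48, 16]          # observed counts, none zero
n = sum(v)

def p(a):
    return [a * a, 2 * a * (1 - a), (1 - a) * (1 - a)]

def chi1sq(a):
    return sum((vi - n * pi) ** 2 / vi for vi, pi in zip(v, p(a)))

# Ternary search for the minimizing a-hat on (0, 1)
lo, hi = 0.01, 0.99
for _ in range(200):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if chi1sq(m1) < chi1sq(m2):
        hi = m2
    else:
        lo = m1
a_hat = (lo + hi) / 2
print(a_hat, chi1sq(a_hat))
```

With these particular counts the fit is exact (the sample gene frequency is 0.6), so the minimized χ₁² is zero; perturbing the counts gives a positive statistic to be referred to the χ² table with 1 d.f.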
Proof:  Let

    p_ij = (∂p_i/∂α_j)(α)   and   p_ij0 = (∂p_i/∂α_j)_{α=α^0} .

The equations (2.2.1) can then be written as

    Σ_i [(v_i − n p_i^0)/v_i](p_ij − p_ij0)
      + Σ_i [(v_i − n p_i^0)/v_i] p_ij0
      − n Σ_i [(p_i − p_i^0)/v_i](p_ij − p_ij0)
      − n Σ_i [(p_i − p_i^0)/v_i] p_ij0 = 0 ,
          j = 1, 2, ..., s .

Therefore,
(2.2.3)   Σ_k (α_k − α_k^0) Σ_i (1/p_i^0) p_ij0 p_ik0
              = Σ_i [(v_i − n p_i^0)/(n p_i^0)] p_ij0 + w_j(α) ,

where

    w_j(α) = Σ_i [(v_i − n p_i^0)/v_i](p_ij − p_ij0)
               − n Σ_i [(p_i − p_i^0)/v_i](p_ij − p_ij0)
               − Σ_i (p_ij0/p_i^0) { [(n p_i^0 − v_i)/v_i](p_i − p_i^0)
                     + (p_i − p_i^0) − Σ_k (α_k − α_k^0) p_ik0 }
               − Σ_i [(v_i − n p_i^0)²/(n v_i p_i^0)] p_ij0 ,
          j = 1, 2, ..., s .
Let

    x_i = (v_i − n p_i^0)/√(n p_i^0) ,   i = 1, 2, ..., r .

Then by condition (d)

(2.2.5)   B = ( p_ij0 / √p_i^0 )

is of rank s. Let x' = (x_1, ..., x_r) and
w'(α) = (w_1(α), ..., w_s(α)). Then the equations (2.2.3) can
be written as

(2.2.6)   (B'B)(α − α^0) = n^{-1/2} B'x + w(α) ,

so that

(2.2.7)   α − α^0 = n^{-1/2} (B'B)^{-1} B'x + (B'B)^{-1} w(α) .
Then, following Cramér, we have with a probability greater
than 1 − 1/λ²

(2.2.8)   |v_i − n p_i^0| < λ √n   for all i = 1, 2, ..., r .

Until further notice, we shall assume that the v_i satisfy
(2.2.8). We shall let λ denote a function of n such that λ → ∞
and λ²/√n → 0 as n → ∞ [e.g. λ = n^q, 0 < q < 1/4]. All results
obtained will then be true with a probability > 1 − 1/λ², which
→ 1 as n → ∞. By condition (b),

(2.2.9)   |x_i| < λ/c ,   i = 1, 2, ..., r .
Now, for two inner points α_1 and α_2 of A, we have

    w_j(α_1) − w_j(α_2)
      = Σ_i [(v_i − n p_i^0)/v_i](p_ij1 − p_ij2)
          − n Σ_i [(p_i1 − p_i^0)/v_i](p_ij1 − p_ij2)
          − n Σ_i [(p_i1 − p_i2)/v_i](p_ij2 − p_ij0)
          − Σ_i (p_ij0/p_i^0) { (n p_i^0/v_i)(p_i1 − p_i2)
                − Σ_k (α_k1 − α_k2) p_ik0 } ,
          j = 1, 2, ..., s .
Now, from (2.2.5), (b) and (2.2.9),

(2.2.10)   v_i = n p_i^0 + x_i √(n p_i^0) > n p_i^0 (1 − λ/(c√n)) ,
               i = 1, 2, ..., r ,

so that n p_i^0 / v_i < (1 − λ/(c√n))^{-1}. In view of conditions
(b) and (c),

    |p_ij1 − p_ij2| ≤ k_1ij |α_1 − α_2| ,

where |α_1 − α_2| is the distance in the s-space and k_1ij is a
constant. Let k_1 = max_{i,j} (k_1ij). Then

    Σ_i |p_ij1 − p_ij2| ≤ r k_1 |α_1 − α_2|   for all j ,

and similarly |p_i1 − p_i2| ≤ k_2 |α_1 − α_2| for all i,
Σ_i |p_ij2 − p_ij0| ≤ k_3 |α_2 − α_0| for all j, and
Σ_i (1/p_i^0)|p_ij0| ≤ k_5. Also |α_k1 − α_k2| ≤ |α_1 − α_2|
for all k. Applying these bounds term by term to the expression
for w_j(α_1) − w_j(α_2), we have, for sufficiently large n,

(2.2.11)   |w_j(α_1) − w_j(α_2)|
               ≤ k_8 { λ/√n + |α_1 − α_0| + |α_2 − α_0| } |α_1 − α_2| ,
               j = 1, 2, ..., s ,

where the k's are constants independent of n.
We now define a sequence of vectors α_ν by

(2.2.12)   α_ν = α_0 + n^{-1/2}(B'B)^{-1}B'x + (B'B)^{-1}w(α_{ν−1})
               for ν = 1, 2, ... ,

and we propose to show that {α_ν} has a definite limit, which is
then evidently a solution of (2.2.7). It will be seen from the
definition of w(α) that

(2.2.13)   w_j(α_0) = − Σ_i [(v_i − n p_i^0)²/(n v_i p_i^0)] p_ij0 ,
               j = 1, 2, ..., s .

The matrices (B'B)^{-1}B' and (B'B)^{-1} are independent of n.
Denoting by g an upper bound of the absolute values of their
elements, we have from (2.2.8), (2.2.10) and (c)

    |w_j(α_0)| < (k λ²/n)(1 − λ/(c√n))^{-1}   for all j ,

where k is some constant independent of n. Hence, from (2.2.9)
and (2.2.13), we have

(2.2.14)   |α_1 − α_0| ≤ k'(λ/√n)[ 1 + (λ/(c²√n))(1 − λ/(c√n))^{-1} ] ,

where k' is
a suitable constant. Similarly, from (2.2.11) and (2.2.13), we
have

(2.2.15)   |α_{ν+1} − α_ν|
               ≤ (k'/c²)(1 − λ/(c√n))^{-1}
                 { λ/√n + |α_ν − α_0| + |α_{ν−1} − α_0| } |α_ν − α_{ν−1}| ,

where k' is independent of n, and |α_1 − α_0| ≤ k* λ/√n. Since
λ/(c√n) → 0 as n → ∞, for sufficiently large values of n,

(2.2.16)   |α_{ν+1} − α_ν|
               ≤ 2k' { λ/√n + |α_ν − α_0| + |α_{ν−1} − α_0| } |α_ν − α_{ν−1}| .
Then, following Cramér, for large n,

(2.2.17)   |α_{ν+1} − α_ν| ≤ (1/2)|α_ν − α_{ν−1}| ≤ (1/2^ν) k* λ/√n .

The infinite series Σ_ν (α_{ν+1} − α_ν) therefore converges
absolutely for sufficiently large n, and if we define α̂ by

(2.2.18)   α̂ = lim_{ν→∞} α_ν ,

then α̂ satisfies (2.2.7) and hence (2.2.1). It follows from
(2.2.17) that α̂ → α_0 as n → ∞. Proving uniqueness along
Cramér's lines, we thus have a unique solution of (2.2.1), which
converges in probability to α_0.
Still assuming (2.2.8), we have, from (2.2.13), that every
component of (B'B)^{-1}w(α_0) is, in absolute value, < 2g k λ²/n
for large n, and therefore, from (2.2.17), every component of
(B'B)^{-1}w(α̂) is, in absolute value, < M'λ²/n, where M' is a
constant, so that (2.2.7) may be written as

(2.2.19)   α̂_j − α_j^0 = n^{-1/2}[(B'B)^{-1}B'x]_j + M' θ_j λ²/n ,
               j = 1, 2, ..., s ,

where θ' = (θ_1, ..., θ_s) is such that |θ_j| ≤ 1. Consider now

    y_i = [v_i − n p_i(α̂)]/√v_i ,   i = 1, 2, ..., r ,

so that

(2.2.20)   χ₁² = Σ_{i=1}^r y_i² .
Then, from (2.2.19), α̂_j − α_j^0 = O(λ/√n), so that

    √n [p_i(α̂) − p_i^0] = √n Σ_j p_ij0 (α̂_j − α_j^0) + O(λ²/√n) .

Hence

    y_i = { x_i √(n p_i^0) − √n Σ_j p_ij0 (α̂_j − α_j^0)
              + O(λ²/√n) } / √v_i ,   i = 1, 2, ..., r .

Thus,

(2.2.21)   y = x − √n B(α̂ − α_0) + M_0 θ λ²/√n
             = [ I − B(B'B)^{-1}B' ] x + M θ λ²/√n ,

where the k's and M's are independent of n and the θ's stand for
vectors such that |θ_i| ≤ 1, i = 1, 2, ..., r. From this point
on, the rest of the proof goes through along Cramér's lines.
2.3  Linearization

It will be noticed that Cramér's theorem and its generalizations,
as well as the analogous theorems on χ₁², are of the nature of
pure existence theorems. They prove the existence of a particular
system of solutions of the minimizing equations, for which the
theorem stated is true. But neither of them says how to isolate
this particular system of solutions. When the equations concerned
have just one real solution, there is no problem. However, when
there is more than one such system, a theorem due to Wald [36]
and Ogawa [23] says that the solution system that maximizes the
likelihood in the arithmetical sense is the consistent one.
In many situations, the hypothesis can be equivalently expressed
in terms of constraints on the p's, say, for example,

(2.3.1)   F_t(p) = 0 ,   t = 1, 2, ..., μ .

In the particular case when these constraints are linear in the
p's, the method of minimum χ₁² reduces the problem to the
solution of a system of linear equations and hence is more
convenient. When these constraints are not linear, Neyman [22]
has suggested the following "linearization" technique.
Taylor's formula is applied to obtain the expansion of F_t(p)
about the point p = q, where the q's are the sample proportions.
Thus,

(2.3.2)   F_t(p) = F*_t(q, p) + Σ_i Σ_j c_tij (p_i − q_i)(p_j − q_j) ,

where

(2.3.3)   F*_t(q, p) = F_t(q) + Σ_i b_ti (p_i − q_i) .

Here b_ti represents the partial derivative of F_t(p) with
respect to p_i taken at p = q. Thus b_ti does not depend upon
the p's, so that F*_t(q, p) is a linear function of the p's. On
the other hand, the coefficients c_tij are functions of both
the p's and the q's.
Neyman considers two models for minimizing the "generalized
distance" between p and q: (i) under the first model, the
minimization is effected with respect to such variation of the
p's as is consistent with (2.3.1), and (ii) under the second
model, the minimization is effected with respect to such
variation of the p's as is consistent with

(2.3.4)   F*_t(q, p) = 0 ,   t = 1, 2, ..., μ .

He then proves that if the generalized distance is such that its
minimization under the restrictions (2.3.1) leads to a BAN
estimate of p_0, then the minimization of the same distance
under the conditions (2.3.4) also leads to a BAN estimate of
p_0. In particular, the minimization of χ₁² under the conditions
(2.3.4) leads to a BAN estimate of p_0, and thus the problem is
reduced to the solution of a system of linear equations.
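A sketch of the linearization idea for a single multinomial and one hypothetical nonlinear constraint F(p) = p₁p₃ − p₂² may make this concrete: F is replaced by its tangent plane at q, and the minimization of χ₁² subject to the two resulting linear constraints (the linearized F and the sum constraint) has the closed form used below. Counts, the constraint, and all symbols are illustrative assumptions, not taken from the text:

```python
# Sketch of Neyman's linearization for one multinomial with a single
# hypothetical nonlinear constraint F(p) = p1*p3 - p2^2 = 0.
# F is replaced by its tangent at the sample proportions q (2.3.2),
# and chi_1^2 = n * sum (p_i - q_i)^2 / q_i is minimized subject to
# the two linear constraints  grad_F(q).(p - q) = -F(q)  and
# sum(p_i - q_i) = 0.  With Q = diag(q) the minimizer is
#   p = q - Q A' (A Q A')^{-1} d ,   d = (F(q), 0)' .

v = [30, 50, 20]                       # hypothetical counts
n = sum(v)
q = [vi / n for vi in v]

F = q[0] * q[2] - q[1] ** 2            # F(q)
g = [q[2], -2 * q[1], q[0]]            # gradient of F at q
ones = [1.0, 1.0, 1.0]

def qdot(a, b):                        # a' Q b with Q = diag(q)
    return sum(ai * qi * bi for ai, qi, bi in zip(a, q, b))

# A Q A' is 2x2 with A's rows equal to g and ones
m11, m12, m22 = qdot(g, g), qdot(g, ones), qdot(ones, ones)
det = m11 * m22 - m12 * m12
# Solve (A Q A') lam = d with d = (F, 0)'
lam1 = m22 * F / det
lam2 = -m12 * F / det

p = [qi - qi * (g[i] * lam1 + lam2) for i, qi in enumerate(q)]

chi1sq = n * sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))
print([round(pi, 4) for pi in p], round(chi1sq, 4))
```

The fitted p still sums to one and satisfies the linearized constraint exactly; the resulting χ₁² is the linear-equations approximation to the minimum under the original nonlinear constraint.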
Some steps in Neyman's proof are not very clear. He says, for
example, "In fact, the numbers p̂_i satisfy restrictions (2.3.1)
and (2.3.4) . . . ." We shall give an independent direct proof
for the simple case; it could easily be generalized to the
product multinomial situation.
2.4  Theorem 2.4

Suppose that we are given p' = (p_1, p_2, ..., p_r) such that,
for all points of a nondegenerate interval A in the r-dimensional
space of p, the p_i's satisfy the following conditions:

(2.4.1)
    (a)  Σ_{i=1}^r p_i = 1 ;
    (b)  in addition, F_t(p) = 0, t = 1, 2, ..., μ, where the
         F's are independent functions of p ;
    (c)  p_i ≥ c² > 0 for all i ;
    (d)  every F_t(p) has continuous derivatives ∂F_t/∂p_i and
         ∂²F_t/∂p_i ∂p_j .
Let the possible results of a certain random experiment E be
divided into r mutually exclusive groups, and suppose that the
probability of obtaining a result belonging to the i-th group is
p_i^0, where p_0' = (p_1^0, ..., p_r^0) is an inner point of A
such that F_t(p_0) = 0. Let v_i denote the number of results
belonging to the i-th group in a sequence of n repetitions of E,
so that Σ_{i=1}^r v_i = n. Then (i) the equations minimizing χ₁²
with respect to such variation of the p's as is consistent with
(a), (b*), (c) and (d), where

(2.4.2)   (b*)   F*_t(q) ≡ F_t(q) + Σ_i b_ti (p_i − q_i) = 0
              and   q_i = v_i / n ,

have a unique solution p̂ ; (ii) p̂ is a consistent estimate of
p_0 ; and (iii)

    Σ_{i=1}^r (v_i − n p̂_i)² / v_i

is, in the limit as n → ∞, distributed as a χ² with μ degrees of
freedom.

Remark:  What this theorem does, in effect, is to offer a test
for the original hypothesis F_t(p) = 0 (t = 1, 2, ..., μ) in
terms of a χ₁² in which the estimates for the p's are obtained
not from the minimization of χ₁² under the original constraints
but under the constraints (2.4.2).
Proof:  Under the constraints (2.4.1) on the p's we have
s = r − 1 − μ independent p's, say p_1, p_2, ..., p_s, so that,
eliminating,

    p_i = p_i(p_1, ..., p_s) ,   i = s+1, ..., r .

Then, from (2.2.19), we have a minimum χ₁² estimate, subject to
(2.4.1), given by

(2.4.4)   p̃*_{s×1} = p_0*_{s×1} + n^{-1/2} (B'B)^{-1}_{s×s} B'_{s×r} x_{r×1}
              + k θ_1 λ²/n ,

where p_0* denotes the vector of the s independent components of
p_0 and

(2.4.5)   B = ( (1/√p_i^0)(∂p_i/∂p_j)_0 ) ,
              i = 1, 2, ..., r ;  j = 1, 2, ..., s .
Under the constraints (2.4.2), we have to minimize

    f = Σ_i (v_i − n p_i)²/v_i
          − 2n Σ_{t=1}^{μ+1} λ_t [ F_t(q) + Σ_i b_ti (p_i − q_i) ] ,

assuming that the (μ+1)-th equation is Σ_{i=1}^r p_i = 1. The
minimizing equations are

(2.4.6)   p̂_i = q_i + q_i Σ_t λ̂_t b_ti ,   i = 1, 2, ..., r ,

and

(2.4.7)   F_t(q) + Σ_i b_ti (p̂_i − q_i) = 0 ,
              t = 1, 2, ..., (μ+1) = μ' , say.

Let

(2.4.8)   A_{μ'×r} = (b_ti) ,   Q_{r×r} = diagonal(q_1, ..., q_r) ,

and let f_{μ'×1} denote the vector of values F_t(q). Then (2.4.6)
and (2.4.7) can be written as

    p̂ − q = Q A' λ̂   and   f + A Q A' λ̂ = 0 .

Since the restrictions (2.4.1) are independent, A is of rank μ',
so that A Q A' is nonsingular. Hence

(2.4.9)   p̂_{r×1} = q_{r×1} − Q A' (A Q A')^{-1} f .

Thus p̂ is unique.
Now, in the notation of Theorem 2.2, with probability > 1 − 1/λ²,
|x_i| < λ/c, i = 1, 2, ..., r. Also,
(2.4.10)   q_i = p_i^0 + x_i √(p_i^0/n) ,   i = 1, 2, ..., r ,
               and hence   b_ti = b_ti0 + O(1/√n) ,

where b_ti0 denotes the value of b_ti at p = p_0. Similarly,

    F_t(q) = F_t(p_0) + Σ_i b_ti0 (q_i − p_i^0)
               + (1/2) Σ_i Σ_j (∂²F_t/∂p_i ∂p_j)(q_i − p_i^0)(q_j − p_j^0)
           = Σ_i b_ti0 x_i √(p_i^0/n) + O(1/n) .

Now,

    (A Q A')_{tt'} = Σ_k b_tk q_k b_t'k
                   = Σ_k b_tk0 p_k^0 b_t'k0 + O(1/√n) ,

so that

    A Q A' = A_0 D A_0' + E ,
        where   A_0 = (b_ti0) ,   D = diagonal(p_1^0, ..., p_r^0) ,

and E = (e_tt') with e_tt' = O(1/√n). Since the elements of E
are O(1/√n),

(2.4.11)   (A Q A')^{-1} = (A_0 D A_0')^{-1} + O(1/√n) .

Similarly,

    (Q A')_{it} = q_i b_ti = p_i^0 b_ti0 + O(1/√n) ,
        so that   Q A' = D A_0' + O(1/√n) ,

and hence

    Q A' (A Q A')^{-1} = D A_0' (A_0 D A_0')^{-1} + O(1/√n) .

Therefore, from (2.4.9),

(2.4.12)   p̂_{r×1} = p_0 + n^{-1/2} [ D^{1/2} − D A_0' (A_0 D A_0')^{-1} A_0 D^{1/2} ] x
               + O(λ²/n) ,

which shows that p̂ → p_0 in probability.
If we consider, as before, only the independent estimates, say
p̂_1, ..., p̂_s, then we have, from (2.4.12),

(2.4.13)   p̂*_{s×1} = p_0*_{s×1}
               + n^{-1/2} [ D_1^{1/2} ⋮ 0 ][ I − D^{1/2} A_0' (A_0 D A_0')^{-1} A_0 D^{1/2} ] x
               + O(λ²/n) ,

where

    D = [ D_1   0 ]
        [ 0    D_2 ] ,   D_1 of order s ,  D_2 of order r − s , say.
We shall show that

(2.4.14)   (B'B)^{-1} B'
               = [ D_1^{1/2} ⋮ 0 ][ I − D^{1/2} A_0' (A_0 D A_0')^{-1} A_0 D^{1/2} ] .

(For convenience, we shall drop the suffix 0.) Partition
A = (A_1 ⋮ A_2), from (2.4.8), where A_2, of order μ'×(r − s)
with r − s = μ + 1 = μ', is nonsingular in view of the
independence of the F_t, t = 1, 2, ..., μ'. Then the matrix of
derivatives appearing in (2.4.5) is

    G = ( (∂p_i/∂p_j)_0 ) = −A_2^{-1} A_1 ,
        i = s+1, ..., r ;  j = 1, 2, ..., s ,

so that

(2.4.15)   B = D^{-1/2} [ I over G ]
               and   B'B = D_1^{-1} + A_1' A_2'^{-1} D_2^{-1} A_2^{-1} A_1 .

Write R = [ D_1^{1/2} ⋮ 0 ] and
P = I − D^{1/2} A' (A D A')^{-1} A D^{1/2}, so that the right
side of (2.4.14) is R P. Since A [ I over G ] = A_1 + A_2 G = 0,
we have A D^{1/2} B = 0; the columns of B thus span the null
space of A D^{1/2}, which is of dimension r − μ' = s. But P is
precisely the matrix projecting orthogonally on this null space,
so that

    B (B'B)^{-1} B' = P .

Multiplying on the left by R and noting that
R B = [ I ⋮ 0 ][ I over G ] = I, we get

    (B'B)^{-1} B' = R P ,

which is (2.4.14).
Thus we have

(2.4.20)   p̂*_{s×1} = p_0*_{s×1} + n^{-1/2} (B'B)^{-1} B' x + O(λ²/n) .

Now, from Theorem 2.2,

    χ₁² = Σ_{i=1}^r (v_i − n p̃_i)² / v_i

is, in the limit as n → ∞, distributed as a χ² with
r − s − 1 = μ d.f. Let

(2.4.21)   χ₁*² = Σ_{i=1}^r (v_i − n p̂_i)² / v_i .

Since the first two terms in (2.4.4) and (2.4.20) are identical,
using the same argument as for the minimum χ₁² under the
original constraints, we have the limiting distribution with μ
d.f. for χ₁*² as well.
CHAPTER III
SOME SPECIAL PROBLEMS POSED IN CHAPTER I

3.1  Introduction

Reiersol [27] considers binomial experiments and makes use of
results of Neyman [22] to determine χ² tests for the hypotheses
appropriate to factorial experiments. Mitra [19] not only
generalizes Reiersol's theorems to multinomial experiments, but
also avoids his restriction that the parameter sets in the
different linear forms occurring in the hypothesis be
non-overlapping. We shall prove theorems, analogous to Mitra's
theorem, to cover the cases that cannot be treated by his
theorem. In particular, when the hypothesis Ho specifies linear
functions of the p's as known linear functions of some unknown
parameters, the minimum χ₁² to test Ho is exactly the same as
the minimum sum of squares of residuals obtained by the general
least squares technique on the linear functions of the q's to
estimate the unknown parameters. We shall then treat the various
problems posed in Chapter I as direct applications of either
these theorems or Cramér's general theorems as extended by
Mitra [19] and Ogawa [23].
3.2  On the test of linear hypotheses on the responses by the
     use of the χ₁² statistic and the χ₁² minimization method
     of estimation

Let us consider a product-multinomial distribution in the usual
notation, so that "i" refers to categories of the response and
"j" refers to s different multinomial distributions. Also
Σ_{i=1}^r p_ij = 1 and Σ_{i=1}^r n_ij = n_.j (fixed),
j = 1, 2, ..., s. Let p_ij > 0 and also n_ij > 0 for all (i, j).
We shall consider the hypothesis Ho defined by m linearly
independent constraints on the p_ij's [independent of
Σ_{i=1}^r p_ij = 1], say,

(3.2.1)   Ho :   F_t(p) = Σ_i Σ_j f_tij p_ij + h_t = 0 ,
              t = 1, 2, ..., m ,

where the f_tij and h_t are known constants such that
(i) Rank (f_tij)_{m×(rs)} = m < (r − 1)s, and (ii) the above
equations, together with Σ_{i=1}^r p_ij = 1 (j = 1, 2, ..., s),
have at least one set of solutions {p_ij} for which p_ij > 0
for all (i, j).
b tj
Then,
X~
=Li f tij
q ..
(302.2)
ett'j
and
gtt'
t= 1 ,2, ••• ,m ,
1)s ,
(ii) the above equa-
r
L Pij
i=1
least one set of solutions
Let
= 0,
are known constants such that
tions, together with
all (ij).
ftijPij + ht
=L
=
1
(j = 1,2, ••• , s)
(p .. J for which
~J
r
p .. > 0
~J
.
(q .. _p .. )2
n . L - ~J
~J
OJ
i
qij
j
have at
,
=Li (ft' . - bt·)(f
t ,·· - bt,·)q··
J
=L n-:-oJ ett'j , t,t'=1,2, ••• ,m
1
j
~J
J
n ..
where q." .=~J
~J
nOJ
.
~J
~J
for
~J
•
•
41
We notice that
btj
is in the nature of a "sample
mean" of 11Ft" for J'-th sample, while
e tt'j
is in the
nature of a sample covariance of ItFtil and "F t ," for the j-th
sample. Since Ft's are linearly independent, it follows
that
G = (gtt')
is positive-definite.
mxm
We shall prove

Theorem 3.2.1:   χ₁*² = Min χ₁² subject to Ho = c' G^{-1} c .

Proof:  To minimize χ₁² subject to the constraints we introduce
Lagrangian multipliers and write

    Σ_j n_.j Σ_i (p_ij − q_ij)²/q_ij − Σ_j λ_j ( Σ_i p_ij − 1 )
        − 2 Σ_t μ_t F_t(p) .

Differentiating with respect to p_ij and equating this to zero,
we get the minimizing equations

    2 n_.j (p_ij − q_ij)/q_ij − λ_j − 2 Σ_t μ_t f_tij = 0 ,
        i = 1, 2, ..., r ;  j = 1, 2, ..., s .

Multiplying by q_ij and summing over i, we get
λ_j + 2 Σ_t μ_t b_tj = 0. Eliminating the λ's we get

    p̂_ij = q_ij + (q_ij/n_.j) Σ_t μ_t ( f_tij − b_tj ) ,

where the μ_t's are to be determined from (3.2.1). Hence,
substituting in (3.2.1),

    c_t + Σ_{t'} g_tt' μ_t' = 0 ,   t = 1, 2, ..., m .

These may be written as

(3.2.3)   G μ + c = 0 ,   where   μ' = (μ_1, ..., μ_m) ,
              c' = (c_1, ..., c_m) ,   c_t = Σ_j b_tj + h_t .

Hence μ = −G^{-1} c. Then

    Minimum χ₁² = Σ_j n_.j Σ_i q_ij { (1/n_.j) Σ_t μ_t (f_tij − b_tj) }²
                = Σ_j (1/n_.j) Σ_t Σ_{t'} μ_t μ_t' e_tt'j
                = μ' G μ = c' G^{-1} c ,

so that

(3.2.4)   χ₁*² = c' G^{-1} c .

Then, by Neyman's theorem, if Ho is true, χ₁*² is distributed
in the limit as a χ² with m d.f.
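A numerical sketch of Theorem 3.2.1 in the simplest case m = 1 may be useful: testing whether a mean score Σ_i a_i p_ij is the same in two multinomial samples, so that χ₁*² = c²/g. The scores and counts below are hypothetical:

```python
# Sketch of Theorem 3.2.1 with m = 1: test whether the mean score
# sum_i a_i p_ij is the same in two samples, i.e. the single linear
# constraint F(p) = sum_i a_i p_i1 - sum_i a_i p_i2 = 0.
# Then chi*_1^2 = c^2 / g with c = F(q) and g from (3.2.2).
# Scores and counts are hypothetical.

a = [0, 1, 2]                             # scores a_i
counts = [[30, 20], [50, 40], [20, 40]]   # n_ij, columns are samples
r, s = 3, 2
ncol = [sum(counts[i][j] for i in range(r)) for j in range(s)]
q = [[counts[i][j] / ncol[j] for j in range(s)] for i in range(r)]

f = [[a[i], -a[i]] for i in range(r)]     # f_tij for the single t

# b_tj = sum_i f_tij q_ij  (a "sample mean" of F for the j-th sample)
b = [sum(f[i][j] * q[i][j] for i in range(r)) for j in range(s)]

# e_tt'j (3.2.2) and g_tt'; both are scalars here
e = [sum((f[i][j] - b[j]) ** 2 * q[i][j] for i in range(r))
     for j in range(s)]
g = sum(e[j] / ncol[j] for j in range(s))

c = sum(b)                                # F(q), with h_t = 0
chi_star = c * c / g
print(round(chi_star, 4))
```

The value is the familiar two-sample statistic: the squared difference of mean scores divided by the sum of the estimated variances of the two sample means.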
The form of (3.2.4) suggests that χ₁*² may be the same as the
statistic we would obtain if we test the hypothesis (3.2.1) by
considering the b_t's, the unbiased estimates of
Σ_i Σ_j f_tij p_ij, and using asymptotic normality. We have

    b_t = Σ_i Σ_j f_tij q_ij ,
        so that   E(b_t) = Σ_i Σ_j f_tij p_ij = −h_t
        if Ho is true,

and

    cov(b_t, b_t') = Σ_j (1/n_.j) [ Σ_i f_tij f_t'ij p_ij
                         − ( Σ_i f_tij p_ij )( Σ_i f_t'ij p_ij ) ]
                   = φ_tt' ,   say.

Hence, in the limit, when Ho is true, c → N(0, φ), so that
c' φ^{-1} c is asymptotically distributed as a χ² with m d.f.
If we replace the p_ij's in φ by the q_ij's we get G. Hence G
may be considered as an estimate of φ. Thus we have proved

Theorem 3.2.2:  The minimum χ₁² method to test a linear
hypothesis of the type (3.2.1) is exactly equivalent to the
"large sample test" based on the asymptotic normality of the
unbiased estimates of the F_t(p), whose variance-covariance
matrix is estimated by the "sample variance-covariance matrix."
Invariance

We then expect the χ₁*² statistic to be invariant under the
choice of the linearly independent constraints (on the p's)
defining the hypothesis (3.2.1). This can be easily proved.
Suppose we start from

    Σ_i Σ_j f*_tij p_ij + h*_t = 0 ,
        where   f*_tij = Σ_u k_tu f_uij   and   h*_t = Σ_u k_tu h_u ,

and K = (k_tu) is nonsingular. Then, from (3.2.4),

    b*_tj = Σ_u k_tu b_uj ,   so that   c* = K c .

Also

    e*_tt'j = Σ_u Σ_u' k_tu k_t'u' e_uu'j ,
        so that   G* = K G K' .

Then

    c*' G*^{-1} c* = c' K' (K G K')^{-1} K c = c' G^{-1} c ,

so that the statistic is unchanged.
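The invariance can also be checked numerically; the small sketch below uses hypothetical values for c, G and a nonsingular K:

```python
# Numerical check of the invariance just proved: replacing the
# constraints by a nonsingular linear combination K leaves
# chi*_1^2 = c' G^{-1} c unchanged, since c* = Kc and G* = K G K'.
# The 2x2 inputs below are hypothetical.

c = [0.3, -0.1]
G = [[0.020, 0.004],
     [0.004, 0.010]]
K = [[2.0, 1.0],
     [1.0, 1.0]]          # nonsingular

def inv2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def quad(v, M):            # v' M^{-1} v
    Mi = inv2(M)
    return sum(v[i] * Mi[i][j] * v[j] for i in range(2) for j in range(2))

c_star = [sum(K[i][j] * c[j] for j in range(2)) for i in range(2)]
KG = [[sum(K[i][k] * G[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
G_star = [[sum(KG[i][k] * K[j][k] for k in range(2)) for j in range(2)]
          for i in range(2)]

print(round(quad(c, G), 6), round(quad(c_star, G_star), 6))
```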
3.2.1  Structured response

Sometimes a linear hypothesis is defined by linear functions of
unknown parameters. Theoretically, of course, this can be
reduced to the case, already considered, where the hypothesis
is defined by linear constraints on the p's. But in many cases
this equivalent expression in terms of linearly independent
constraints on the p's may be tedious. We shall prove a theorem,
which might be considered as another version of theorems 3.2.1
and 3.2.2, which reduces the problem to that of least squares.
Theorem 3.2.3:  With the same product-multinomial distribution
for the model that was used in theorem 3.2.1, let the linear
hypothesis Ho be defined by

    Ho :   Σ_i a_i p_ij = Σ_k d_jk α_k ,   j = 1, 2, ..., s ,

where the d's are known constants and the α's are unknown
parameters. Then the minimum χ₁² to test Ho is the same as the
minimum sum of squares of residuals obtained by the general
least squares technique on a_j = Σ_i a_i q_ij, with the
variances estimated by "sample variances."
Proof:  Let D = (d_jk)_{s×t} and Rank D = u ≤ s. Now

    Ho ⟹ Σ_i a_i p_ij = Σ_k d_jk α_k .

Hence, if Σ_j ℓ_j d_jk = 0 (k = 1, 2, ..., t), then, whatever
may be the α's,

    Σ_j ℓ_j Σ_i a_i p_ij = Σ_k α_k Σ_j ℓ_j d_jk = 0 .

Such a linear function Σ_j ℓ_j Σ_i a_i p_ij may be called a
"hypothesis constraint." Since Rank D = u, the number of
linearly independent s-vectors (ℓ_v1, ℓ_v2, ..., ℓ_vs)
satisfying Σ_j ℓ_vj d_jk = 0 (k = 1, 2, ..., t) is s − u.
46
Let
(-t .)
be a matrix whose rowVJ (s _ u)xs
vectors satisfy the above conditions. Then Rank L = (s - u)
LD
and
=
L
o.
=
Now, Ho
==9
LL
. . -tVJ. a.
J
1
s - u).
On
t~
3-
Pij
= 0 ,
(v=1,2, ••• ,
other hand, ~ ~ -tvj a i p. = 0 , (v= 1 ,2, ...,s-u)
J
1
L
~ . -tVJ. m.J =
J
(v = 1 ,2, ••• , s - u) , where
0
p.. •
1.J
J
Thus
(m, ,m2, ••• ,m s )
is orthogonal to row-vectors of
But the row-vectors of
L.
L form a basis of the vector-space
orthogonal to column-vectors of
••• , ms )
belongs to the vector-space generated by the column-vectors
of
D.
D.
Hence
(m"
Thus, there are 6's, not all zero, such that
j=1,2, ••• ,s
Therefore,
Lai
(j=1,2, ••• ,s) •
Pij
•
Thus
i
LL-t·
. . VJ
1. J
Ho~
LD
a.1. p.. = 0
1J
(v=1,2, ••• ,s-u)
= 0 and L is of rank (s - u)
Then, by theorem 3.2.1, the minimum χ₁² is given by (3.2.4).
Here

    f_vij = ℓ_vj a_i ,
        so that   b_vj = Σ_i f_vij q_ij = ℓ_vj Σ_i a_i q_ij = ℓ_vj a_j ,
        where   a_j = Σ_i a_i q_ij .

Also h_v = 0, and hence

    c_v = b_v = Σ_j ℓ_vj a_j .

Also,

    e_vv'j = Σ_i (f_vij − b_vj)(f_v'ij − b_v'j) q_ij
           = ℓ_vj ℓ_v'j [ Σ_i a_i² q_ij − a_j² ] ,

so that, with

    λ_j = [ Σ_i a_i² q_ij − a_j² ] / n_.j ,   j = 1, 2, ..., s ,

we have

    G = L Λ L' ,   where   Λ = diagonal(λ_1, ..., λ_s) .

Hence, from (3.2.4),

(3.2.6)   χ₁*² = a' L' ( L Λ L' )^{-1} L a ,
              where   a' = (a_1, ..., a_s) .
Note that the a_j's are independent and

    var(a_j) = (1/n_.j) [ Σ_i a_i² p_ij − ( Σ_i a_i p_ij )² ] ,

so that

    "sample var"(a_j) = (1/n_.j) [ Σ_i a_i² q_ij − ( Σ_i a_i q_ij )² ]
                      = λ_j .

If we use the least squares technique on the a_j's (using the
λ_j's for the variances), then the sum of squares to be
minimized with respect to the parameters is
    S² = Σ_j (1/λ_j) ( a_j − Σ_k d_jk α_k )²
       = ( y − Δ α )' ( y − Δ α ) ,

where

    y = Λ^{-1/2} a ,   i.e.   y_j = a_j λ_j^{-1/2} ,
        and   Δ = (δ_jk) = Λ^{-1/2} D ,   δ_jk = d_jk λ_j^{-1/2} .

Then it is well known that (for example, [29])

(3.2.7)   Min S² = y'y − y' Δ_1 ( Δ_1' Δ_1 )^{-1} Δ_1' y ,

where Δ_1_{s×u} is a basis of Δ. We note that Δ = Λ^{-1/2} D and
Δ_1 = Λ^{-1/2} D_1, where D_1_{s×u} is a basis of D, so that
Rank Δ = Rank D = u.
It remains to show that (3.2.6) = (3.2.7).
Now L D = 0, so that M = L Λ^{1/2} satisfies
M Δ_1 = M Λ^{-1/2} D_1 = 0. Since (3.2.6) is invariant under the
choice of L, we may choose L so that M M' = L Λ L' = I, that is,
so that the rows of M are orthonormal. Also
Δ_1' Δ_1 = D_1' Λ^{-1} D_1 is positive-definite, so that there
is a nonsingular X such that

    X' D_1' Λ^{-1} D_1 X = I ,

that is, the columns of Λ^{-1/2} D_1 X are orthonormal. The rows
of M and the columns of Λ^{-1/2} D_1 X together form an
orthonormal basis of the s-space, so that

    M'M + Λ^{-1/2} D_1 X X' D_1' Λ^{-1/2} = I .

Hence, noting that X X' = ( D_1' Λ^{-1} D_1 )^{-1} = ( Δ_1' Δ_1 )^{-1},

    (3.2.6) = a' L' ( L Λ L' )^{-1} L a = y' M' M y
            = y'y − y' Δ_1 ( Δ_1' Δ_1 )^{-1} Δ_1' y = (3.2.7) ,

which completes the proof of the theorem.
It may be noted that the minimum χ₁*² is, in the limit,
distributed as a χ² with s − u d.f., where u = Rank D.

3.2.2  Application of 3.2.1 to linear hypotheses on structured
       uniresponse

In what follows, "i" will denote the structured variate. We
shall consider some simple cases.
(i)  One-dimensional design ("j" → factor).

(3.2.8)   Ho :   Σ_i a_i p_ij is independent of j ,
              j = 1, 2, ..., s .

(3.2.9)   χ₁² = Σ_{j=1}^s n_.j a_j² / λ̃_j
               − [ Σ_{j=1}^s n_.j a_j / λ̃_j ]² / Σ_{j=1}^s n_.j / λ̃_j ,
              d.f. = s − 1 ,

where

    a_j = Σ_i a_i q_ij   and   λ̃_j = Σ_i ( a_i − a_j )² q_ij .
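A numerical sketch of this case, with hypothetical scores and counts, also illustrates the least-squares equivalence of Theorem 3.2.3: the contrast form c' G^{-1} c and the weighted least-squares residual sum of squares of the a_j agree.

```python
# Illustration of (3.2.9) / Theorem 3.2.3 for the one-dimensional
# design: the minimum chi_1^2 for "sum_i a_i p_ij independent of j"
# equals the weighted least-squares residual sum of squares of the
# sample means a_j with weights 1/lambda_j.  Data are hypothetical.

a = [0, 1, 2]                                        # scores a_i
counts = [[30, 50, 20], [20, 40, 40], [10, 50, 40]]  # rows = samples j
s = len(counts)

aj, lam = [], []                       # a_j and lambda_j
for row in counts:
    nj = sum(row)
    q = [cnt / nj for cnt in row]
    m = sum(ai * qi for ai, qi in zip(a, q))
    aj.append(m)
    lam.append(sum((ai - m) ** 2 * qi for ai, qi in zip(a, q)) / nj)

# (1) chi*_1^2 = c' G^{-1} c with contrasts L = [[1,-1,0],[0,1,-1]]
L = [[1, -1, 0], [0, 1, -1]]
c = [sum(L[v][j] * aj[j] for j in range(s)) for v in range(2)]
G = [[sum(L[v][j] * lam[j] * L[u][j] for j in range(s))
      for u in range(2)] for v in range(2)]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
Gi = [[G[1][1] / det, -G[0][1] / det], [-G[1][0] / det, G[0][0] / det]]
chi_contrast = sum(c[v] * Gi[v][u] * c[u]
                   for v in range(2) for u in range(2))

# (2) weighted least squares: common mean mu with weights 1/lambda_j
w = [1 / l for l in lam]
mu = sum(wj * m for wj, m in zip(w, aj)) / sum(w)
chi_ls = sum(wj * (m - mu) ** 2 for wj, m in zip(w, aj))

print(round(chi_contrast, 6), round(chi_ls, 6))
```

The two computations agree to rounding error, as Theorem 3.2.3 asserts; the statistic is referred to the χ² table with s − 1 = 2 d.f.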
(ii)  Two-dimensional design ("j" → "Treatments", "k" → "Blocks").

(a)  Hypothesis of no treatment effects on the basic model
[H05, (1.2.11)]:

(3.2.10)   Ho :   Σ_i a_i p_ijk = q_**k ,
               j = 1, 2, ..., s ;  k = 1, 2, ..., t

[the design may be incomplete].
(3.2.11)   χ₁² = Σ_j Σ_k a_jk² h_jk − Σ_{k=1}^t [ Σ_j h_jk a_jk ]² / h_.k ,
               d.f. = M − t ,

where

    a_jk = Σ_i a_i q_ijk ,   h_jk = n_.jk / λ̃_jk ,   h_.k = Σ_j h_jk ,

the summation is over allowable (jk) combinations, and M is the
number of (jk) combinations. When the design is complete,
M = st, so that d.f. = (s − 1)t.
(b)  Hypothesis of no interaction (in the additive set-up)
[H05^(1), (1.2.7)]:

(3.2.12)   Ho :   Σ_i a_i p_ijk = q_*j* + q_**k ,

(3.2.13)   χ₁² = Σ_j Σ_k a_jk² h_jk − Σ_{k=1}^t B_k² / h_.k
                 − Σ_{j=1}^s Q_j t̂_j ,
               d.f. = M − (s + t − 1) ,

where the t̂'s satisfy

(3.2.14)   Q_j = Σ_{j'=1}^s c_jj' t̂_j' ,   j = 1, 2, ..., s .

Here

    Q_j = T_j − Σ_k (h_jk / h_.k) B_k ,   T_j = Σ_k a_jk h_jk ,
    B_k = Σ_j a_jk h_jk ,

    c_jj' = − Σ_k h_jk h_j'k / h_.k   (j ≠ j') ,
    c_jj = h_j. − Σ_k h_jk² / h_.k ,   h_j. = Σ_k h_jk ,

M is, as before, the number of (jk) combinations, and the
summations are over allowable combinations only.

It may be noted that (3.2.13) and (3.2.14) are similar to the
"error sum of squares" and "normal equations," respectively, in
the analysis of variance, T_j and B_k playing the roles of
"treatment total" and "block total," respectively. The
fundamental difference, however, is that the c_jj''s here depend
not only on the design but also on the observed proportions. In
normal ANOVA, the designs can be chosen suitably so that the
normal equations have neat closed solutions. This approach fails
here for the corresponding equations (3.2.14). For example, even
for a complete design (which may be called a "randomized block
design"), there is no essential simplification in the equations
(3.2.14). [The degrees of freedom for χ₁² in that case are
(s − 1)(t − 1).]
(c)  Hypothesis of no treatment effects on the no-interaction
model:

(3.2.16)   χ₁² = (3.2.11) − (3.2.13) = Σ_{j=1}^s Q_j t̂_j ,
               d.f. = s − 1 ,

where the Q's and t̂'s are defined as before.
(d)  [H09, (1.2.17)]:

    Ho :   Σ_i a_i p_ijk = λ + μ b_j ,   with the b_j known,

    χ₁² = Σ_j Σ_k a_jk² h_jk − λ̂ G − μ̂ Y ,   d.f. = M − 2 ,

where λ̂ and μ̂ satisfy

    G = λ̂ h + μ̂ m   and   Y = λ̂ m + μ̂ Σ_j b_j² h_j. ,

with

    h = Σ_j Σ_k h_jk ,   m = Σ_j b_j h_j. ,
    G = Σ_j Σ_k a_jk h_jk ,   Y = Σ_j Σ_k b_j a_jk h_jk

(other quantities as before).

(e)  [H08, (1.2.15)]:

    Ho :   Σ_i a_i p_ijk = λ_k + μ b_j ,

(3.2.18)   χ₁² = Σ_j Σ_k a_jk² h_jk − Σ_{k=1}^t B_k² / h_.k
                 − μ̂ [ Y − Σ_{k=1}^t (m_k / h_.k) B_k ] ,
               d.f. = M − t − 1 ,

where m_k = Σ_j b_j h_jk and μ̂ satisfies

    Y − Σ_k (m_k / h_.k) B_k = μ̂ [ Σ_j Σ_k b_j² h_jk − Σ_k m_k² / h_.k ]

(other quantities defined as before).
(f)  [H0, (1.2.14)]:

    Ho :   Σ_i a_i p_ijk = λ_k + μ_k b_j ,

    χ₁² = Σ_j Σ_k a_jk² h_jk − Σ_{k=1}^t B_k² / h_.k
            − Σ_{k=1}^t [ Y_k − (ℓ_k / h_.k) B_k ]²
                  / [ Σ_j b_j² h_jk − ℓ_k² / h_.k ] ,
        d.f. = M − 2t ,

where ℓ_k = Σ_j b_j h_jk and Y_k = Σ_j b_j a_jk h_jk (other
quantities defined as before).

(iii)  Two-dimensional design ("j" → one factor, "k" → another
factor) [H010, (1.2.18)]:

    Ho :   Σ_i a_i p_ijk = λ + μ b_j + ν c_k ,

    χ₁² = Σ_j Σ_k a_jk² h_jk − λ̂ G − μ̂ Y − ν̂ Θ ,
        d.f. = M − 3 ,

where λ̂, μ̂ and ν̂ satisfy the equations

    G = λ̂ h + μ̂ m + ν̂ w ,
    Y = λ̂ m + μ̂ Σ_j b_j² h_j. + ν̂ X ,
    Θ = λ̂ w + μ̂ X + ν̂ y ,

with

    Θ = Σ_k c_k B_k ,   w = Σ_k c_k h_.k ,
    X = Σ_j Σ_k b_j c_k h_jk ,   y = Σ_k c_k² h_.k

(other quantities defined as before).
3.2.3  Application of 3.2.1 to linear hypotheses on structured
       multiresponse

Let us now suppose that i = (i_1, i_2, ..., i_p), that is, there
are p responses indicated by i_1, i_2, ..., i_p, such that
i_1 = 1, 2, ..., r_1 ; ... ; i_p = 1, 2, ..., r_p. If these
responses are structured, the linear hypotheses, in general,
will be of the type

(3.2.21)   Ho :   Σ_{i_k, j} f^(k)_{t, i_k, j} p_{*...*i_k*...*j} + h^(k)_t = 0 ,
               k = 1, 2, ..., p ;  t = 1, 2, ..., m ,

where the linear functions are linearly independent and
p_{*...*i_k*...*j} denotes the marginal probability for the k-th
response in the j-th sample; hence (3.2.21) is a particular case
of (3.2.1). Let

    b^(k)_tj = Σ_{i_k=1}^{r_k} f^(k)_{t, i_k, j} q_{*...*i_k*...*j} .

Similarly, let

    g^(kk')_tt' = Σ_j (1/n_.j) e^(kk')_tt'j ,

where

    e^(kk')_tt'j = Σ_{i_k} Σ_{i_k'} { f^(k)_{t, i_k, j} − b^(k)_tj }
                       { f^(k')_{t', i_k', j} − b^(k')_t'j } q_{..j} ,

the q's being the appropriate sample joint proportions. Thus

    G_{pm×pm} = ( G^(kk') ) ,   k, k' = 1, 2, ..., p ,

and then

    χ₁*² = c' G^{-1} c ,   d.f. = pm .

We shall not go into the details of the various special cases,
for example, H029^(1) (1.5.6), H030^(1) (1.5.7), H031^(1),
H032 (1.5.8), H033 (1.5.9), H034 (1.5.10) and H035 (1.5.11). By
considering "hypothesis constraints," one can reduce the above
hypotheses (expressed in terms of parameters) to equivalent
forms like (3.2.21).
3.2.4  Unstructured response

In theorem 3.2.1, we considered the test criterion appropriate
to a linear hypothesis. Its equivalence to a certain least
squares technique, for the linear hypotheses in structured
cases, was shown in theorem 3.2.3. We shall establish a similar
equivalence for the linear hypotheses in unstructured cases in
theorem 3.2.4.
Theorem 3.2.4

Under the product-multinomial set-up, as in 3.2, let the linear hypothesis be defined by

$$H_0:\ p_{ij} = \sum_{k=1}^{t} d_{jk}\,\delta_{ik}\,,\qquad i = 1,2,\ldots,r;\ j = 1,2,\ldots,s\,,$$

where the $d$'s are known constants and the $\delta$'s are unknown parameters. Then the minimum $\chi^2_0$ to test $H_0$ is the same as the minimum "generalized sum of squares" of residuals obtained by the "generalized least squares technique" on the $q_{ij}$, with the covariance matrix estimated by the sample covariance matrix.
Proof: Let $\underset{s\times t}{D} = (d_{jk})$ and Rank $D = u \le s$. Then, as in theorem 3.2.3, we can show that

$$H_0 \iff \sum_{j=1}^{s} \ell_{vj}\, p_{ij} = 0\,,\qquad v = 1,2,\ldots,s-u;\ i = 1,2,\ldots,r\,,$$

where $LD = 0$ and $L$ is of rank $(s-u)$. Hence we can apply theorem 3.2.1. Here

$$f_{(vi)i'j} = b_{ii'}\,\ell_{vj}\,,\qquad \text{where } b_{ii'} = 1 \text{ if } i = i' \text{ and } = 0 \text{ otherwise}\,,$$

so that

$$b_{(vi)j} = \sum_{i'} f_{(vi)i'j}\, q_{i'j} = \ell_{vj}\, q_{ij}\,.$$
Also, $h_{(vi)} = 0$, so that

$$c_{(vi)} = \sum_{j} b_{(vi)j} = \sum_{j} \ell_{vj}\, q_{ij}\,.$$

Thus,

$$\underset{r(s-u)\times rs}{L^{*}} = \begin{bmatrix}\ell_{11} I_r & \cdots & \ell_{1s} I_r\\ \vdots & & \vdots\\ \ell_{s-u,1} I_r & \cdots & \ell_{s-u,s} I_r\end{bmatrix} = L \otimes I_r\,,$$

and

$$\underset{1\times rs}{q'} = (q_{11}, q_{21}, \ldots, q_{r1}, \ldots, q_{1j}, \ldots, q_{rj}, \ldots, q_{rs})\,.$$
Similarly,

$$e_{(vi)(ui')j} = \sum_{i''}\big\{f_{(vi)i''j} - b_{(vi)j}\big\}\big\{f_{(ui')i''j} - b_{(ui')j}\big\}\, q_{i''j} = \sum_{i''} (b_{ii''} - q_{ij})\,\ell_{vj}\,(b_{i'i''} - q_{i'j})\,\ell_{uj}\, q_{i''j}$$
$$= \ell_{vj}\,\ell_{uj}\,\big[b_{ii'}\, q_{ij} - q_{ij}\, q_{i'j}\big] = \ell_{vj}\,\ell_{uj}\, y^{(j)}_{ii'}\,,\ \text{say}.$$

Thus,

$$\underset{r(s-u)\times r(s-u)}{G} = L^{*}\, Y\, L^{*\prime}\,.$$
Here

$$\underset{rs\times rs}{Y} = \mathrm{diag}\big(Y_1/n_{01},\ Y_2/n_{02},\ \ldots,\ Y_s/n_{0s}\big)\,,\qquad \text{where } \underset{r\times r}{Y_j} = \big(y^{(j)}_{ii'}\big)\,.$$

Then, from (3.2.4),

$$(3.2.23)\qquad \chi^2_0 = q'\,L^{*\prime}\,[L^{*}\, Y\, L^{*\prime}]^{-1}\,L^{*}\, q\,.$$
[This has already been shown by Mitra [19] in a slightly
different form.]
On the other hand, if we consider the asymptotically normal variables $q_{ij}$, then

$$\mathrm{cov}(q_{ij}, q_{i'j'}) = 0 \quad\text{when } j \neq j'\,,\qquad \mathrm{cov}(\underline{q}_j) = \mathrm{cov}(q_{1j},\ldots,q_{rj}) = \frac{1}{n_{0j}}\, Y_j\,.$$
Let

$$S^2 = \sum_{j} n_{0j}\,(\underline{q}_j - d_{j1}\underline{\delta}_1 - \cdots - d_{jt}\underline{\delta}_t)'\, Y_j^{-1}\, (\underline{q}_j - d_{j1}\underline{\delta}_1 - \cdots - d_{jt}\underline{\delta}_t)\,,$$

where $\underline{\delta}_k' = (\delta_{1k}, \delta_{2k}, \ldots, \delta_{rk})$. $S^2$ may be called the "generalized sum of squares of residuals," and $S^2$ is to be minimized with respect to the $\delta$'s. Since $Y_j$ is positive-definite, there is a (nonsingular) $\underset{r\times r}{P_j}$ such that $n_{0j}\, Y_j^{-1} = P_j'\, P_j$. Let $\underline{q}_j^{*} = P_j\, \underline{q}_j$, $j = 1,2,\ldots,s$, so that

$$S^2 = \sum_{j} (\underline{q}_j^{*} - d_{j1}P_j\underline{\delta}_1 - \cdots - d_{jt}P_j\underline{\delta}_t)'\,(\underline{q}_j^{*} - d_{j1}P_j\underline{\delta}_1 - \cdots - d_{jt}P_j\underline{\delta}_t) = (\underline{q}^{*} - C\underline{\delta})'\,(\underline{q}^{*} - C\underline{\delta})\,,$$

where $\underline{q}^{*\prime} = (\underline{q}_1^{*\prime}, \ldots, \underline{q}_s^{*\prime})$ ($1\times rs$), $\underline{\delta}' = (\underline{\delta}_1', \ldots, \underline{\delta}_t')$, and

$$\underset{rs\times rt}{C} = \mathrm{diag}(P_1, P_2, \ldots, P_s)\,\big(\underset{s\times t}{D} \otimes I_r\big)\,.$$
Then, it is well known that

$$(3.2.24)\qquad \min S^2 = \underline{q}^{*\prime}\,\big[I_{rs} - C_1(C_1'C_1)^{-1}C_1'\big]\,\underline{q}^{*}\,,$$

where $\underset{rs\times ru}{C_1}$ is a basis of $C$. We notice that if $D_1$ is the corresponding basis of $D$, then

$$C_1 = \mathrm{diag}(P_1, P_2, \ldots, P_s)\,\big(D_1 \otimes I_r\big)\,.$$

Also, $LD_1 = 0 \iff L^{*}D_1^{*} = 0$, where $D_1^{*} = D_1 \otimes I_r$. Hence, from (3.2.24),

$$(3.2.25)\qquad \min S^2 = \underline{q}'\,Y^{-1}\underline{q} - \underline{q}'\,Y^{-1}D_1^{*}\big[D_1^{*\prime}\,Y^{-1}D_1^{*}\big]^{-1}D_1^{*\prime}\,Y^{-1}\underline{q}\,.$$

(3.2.23) and (3.2.25) can then be seen to be identical by exactly the same argument as the one used for (3.2.6) and (3.2.7).
3.2.5 Application of 3.2.4 to linear hypotheses on unstructured uniresponse

Two-dimensional design ("j" corresponding to "Treatments," "k" to "Blocks")
(i) Hypothesis of no treatment effects, on the basic model [$H_{02}$, (1.2.5)].

This has already been considered by Roy and Mitra [33]. In this case, the maximum likelihood equations give a unique solution, and using this in a $\chi^2$ statistic we have

$$(3.2.26)\qquad \chi^2 = \sum_{i,j,k} \frac{\big(n_{ijk} - n_{0jk}\, n_{i0k}/n_{00k}\big)^2}{n_{0jk}\, n_{i0k}/n_{00k}}\,,\qquad \text{d.f.} = (r-1)(M-t) = t(r-1)(s-1)\,,$$

in the usual notation, for a complete design.
(ii) Hypothesis of no interaction (additive set-up), $H_0^{(1)}:\ p_{ijk} = q_{i*k} + q_{ij*}$ [$H_{01}^{(1)}$, (1.2.3)].

We shall not consider the details of the computation from (3.2.23); d.f. $= (r-1)(M-s-t+1) = (r-1)(s-1)(t-1)$ for a complete design.
(iii) Hypothesis of no treatment effects, on the model of no interaction (additive set-up).

$$\chi^2 = (3.2.26) - \chi^2_0 \text{ for (ii) above}\,,\qquad \text{d.f.} = (r-1)(s-1)\,.$$
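As a computational aside (my addition, not part of the original text), the statistic (3.2.26) is directly computable from the three-way table of counts. The sketch below assumes a complete design with the counts stored as a nested r × s × t list; the function name is illustrative.

```python
def chi2_no_treatment_effects(n):
    """Chi-square statistic (3.2.26) for H0: no treatment effects.

    n[i][j][k] is the count n_ijk of an r x s x t table; a zero
    subscript in the text's dot notation denotes summation over that
    index, so the expected count under H0 is n_0jk * n_i0k / n_00k.
    """
    r, s, t = len(n), len(n[0]), len(n[0][0])
    n_0jk = [[sum(n[i][j][k] for i in range(r)) for k in range(t)] for j in range(s)]
    n_i0k = [[sum(n[i][j][k] for j in range(s)) for k in range(t)] for i in range(r)]
    n_00k = [sum(n_0jk[j][k] for j in range(s)) for k in range(t)]
    chi2 = 0.0
    for i in range(r):
        for j in range(s):
            for k in range(t):
                e = n_0jk[j][k] * n_i0k[i][k] / n_00k[k]
                chi2 += (n[i][j][k] - e) ** 2 / e
    return chi2, t * (r - 1) * (s - 1)

# a table whose layers are proportional shows no treatment effect
a = [1.0, 2.0, 3.0]
b = [[2.0, 1.0], [1.0, 4.0]]
table = [[[ai * b[j][k] for k in range(2)] for j in range(2)] for ai in a]
chi2, df = chi2_no_treatment_effects(table)
```

With proportional layers the expected counts reproduce the observed ones exactly, so the statistic vanishes, as it should under $H_0$.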
3.2.6 Linear hypotheses on unstructured multiresponse

As in 3.2.3, let us now suppose that $\mathbf{i} = (i_1, i_2, \ldots, i_p)$, that is, there are $p$ responses indicated by $i_1, i_2, \ldots, i_p$, such that $i_1 = 1,2,\ldots,r_1$; $\ldots$; $i_p = 1,2,\ldots,r_p$. If these are unstructured, the linear hypotheses, in general, will be of a type that, it may be seen, is not a particular case of (3.2.22) or of its equivalent form, but which is a particular case of (3.2.1) (by suitable definition of the $f_{tij}$'s). Hence the $\chi^2_0$ statistic may be worked out. We shall not go into these details. This will cover, for example, the cases $H^{(1)}_{026}$, $H^{(1)}_{027}$ and $H^{(1)}_{028}$.
3.3 On the test of nonlinear hypotheses on p's in ANOVA and MANOVA situations

As already mentioned in Chapter II, we can adopt Neyman's technique of linearization, so that the problem is reduced to one of the previous cases. On the other hand, it may happen in some cases that the maximum likelihood equations are fairly simple, so that the $\chi^2$ statistic (using maximum likelihood estimates) may be used.
3.3.1 Minimum $\chi^2_0$ by "linearization"

Suppose the hypothesis is defined by

$$F_t(\underline{p}) = 0\,,\qquad t = 1,2,\ldots,m\,,$$

so that "linearization" gives

$$F_t(\underline{p}, \underline{q}) = F_t(\underline{q}) + \sum_{i}\sum_{j} \Big[\frac{\partial F_t}{\partial p_{ij}}\Big]_{\underline{q}}\,(p_{ij} - q_{ij}) = 0\,,\qquad t = 1,2,\ldots,m\,.$$

The minimum $\chi^2_0$ is then obtained as before, with d.f. $= m$.

We shall not go into the details of special cases, except for the case discussed in the next subsection. The cases that may be worked out in this way are, for example, $H_{01}$ (1.2.2), $H_{05}$ (1.2.6), $H_{08}$ (1.2.12), $H_{026}$ (1.5.2), $H_{027}$ (1.5.3), $H_{028}$ (1.5.4), $H_{029}$ (1.5.5), $H_{030}$ and $H_{031}$.
3.3.2 The hypothesis of no interaction (multiplicative set-up) in the two-dimensional design [$H_{01}$, (1.2.2)]

This may be tested by the "linearization" technique, as mentioned in the last paragraph. On the other hand, in the case of a complete design, the maximum likelihood equations appear to be fairly simple and may admit an iterative solution. It can be shown that

$$(3.3.1)\qquad H_0 \iff p_{ijk}\, p_{ist} = p_{isk}\, p_{ijt}\,,\qquad i = 1,2,\ldots,r;\ j = 1,2,\ldots,s-1;\ k = 1,2,\ldots,t-1\,.$$
Then the maximum likelihood equations, subject to (3.3.1) and $\sum_i p_{ijk} = 1$, can be obtained by differentiating

$$f = \sum_{i,j,k} n_{ijk} \log p_{ijk} - \sum_{j=1}^{s}\sum_{k=1}^{t} \mu_{jk}\Big[\sum_{i} p_{ijk} - 1\Big] - \sum_{i=1}^{r}\sum_{j=1}^{s-1}\sum_{k=1}^{t-1} \lambda_{ijk}\big[\log p_{ijk} + \log p_{ist} - \log p_{isk} - \log p_{ijt}\big]$$

with respect to the $p$'s, where the $\mu$'s and $\lambda$'s are Lagrange multipliers. The final equations are

$$(3.3.2)\qquad (n_{ijk} - \lambda_{ijk})(n_{ist} - \lambda_{i00}) = (n_{isk} + \lambda_{i0k})(n_{ijt} + \lambda_{ij0})\,,$$
$$(n_{0jk} - \lambda_{0jk})(n_{0st} - \lambda_{000}) = (n_{0sk} + \lambda_{00k})(n_{0jt} + \lambda_{0j0})\,,$$

for $i = 1,2,\ldots,r$; $j = 1,2,\ldots,s-1$; $k = 1,2,\ldots,t-1$, where

$$\lambda_{i0k} = \sum_{j=1}^{s-1} \lambda_{ijk}\,,\qquad \lambda_{ij0} = \sum_{k=1}^{t-1} \lambda_{ijk}\,,\qquad \lambda_{0jk} = \sum_{i=1}^{r} \lambda_{ijk}\,,$$

and so on.
In particular, when $r = s = t = 2$, we have just two (linear) equations, and these can be solved explicitly. In this special case, Bartlett [3] posed another hypothesis of no interaction, but the solution of the maximum likelihood equations comes out as a root of a certain cubic equation; Mitra [19] shows that it is the numerically smallest real root that gives a consistent solution. The equations in the present case thus seem to be simpler. Roy and Kastenbaum [32] extended Bartlett's hypothesis to more general cases where "i", "j" and "k" are variables, and they get equations similar to (3.3.2).
3.4 On the hypotheses about association for a single multinomial

Most of the hypotheses in this case are nonlinear; hence we can always apply Neyman's linearization technique. On the other hand, in some cases the maximum likelihood equations are very simple, so that the $\chi^2$ statistic may be used. This is so, for example, in the cases $H_{011}$ (1.3.2), $H_{013}$ (1.3.5) and $H_{014}$ (1.3.6), which have been considered by Roy and Mitra [33].
There is another possibility, namely, the conditional probability approach, which we shall illustrate by considering the case of three variables $(i,j,k)$. Here

$$\phi = \frac{n!}{\prod_{i,j,k} n_{ijk}!}\ \prod_{i,j,k} p_{ijk}^{\,n_{ijk}}\qquad \Big[\sum_{i,j,k} p_{ijk} = 1\Big]\,,$$

which can be written as

$$\phi = \prod_{j,k}\frac{n_{0jk}!}{\prod_i n_{ijk}!}\ \prod_i \Big(\frac{p_{ijk}}{p_{0jk}}\Big)^{n_{ijk}}\ \times\ \frac{n!}{\prod_{j,k} n_{0jk}!}\ \prod_{j,k} p_{0jk}^{\,n_{0jk}} = \phi_1 \times \phi_2\,,\ \text{say}.$$

Let $p^{*}_{ijk} = p_{ijk}/p_{0jk}$, so that $\sum_i p^{*}_{ijk} = 1$. Then $\phi_1$ denotes the conditional probability density of the $n_{ijk}$'s, given the $n_{0jk}$'s, while $\phi_2$ denotes the probability density of the $n_{0jk}$'s. Also, the numbers of independent parameters in $\phi$, $\phi_1$ and $\phi_2$ are $rst - 1$, $st(r-1)$ and $st - 1$, respectively.
We may consider the $p^{*}_{ijk}$'s and $p_{0jk}$'s instead of the $p_{ijk}$'s. Then it is logical to require that the hypotheses which are expressed in terms of the $p^{*}_{ijk}$'s only should be tested by criteria based on $\phi_1$ only. Hence the test on the $p^{*}$'s only will be the same as that on the $p$'s if "j" and "k" were factors.

This approach seems to be similar to the "step-down procedure" in normal multivariate analysis [e.g., [28], [30]]. That procedure reduces problems of association to problems of analysis of variance; exactly the same thing seems to be happening here by the conditional probability approach. We shall illustrate by considering three simple examples.
(i) Hypothesis of independence in a two-dimensional table, $H_0:\ p_{ij} = p_{i0}\, p_{0j}$.

Then $p^{*}_{ij} = p_{ij}/p_{0j} = p_{i0}$ is independent of $j$. Thus $H_0$, by the conditional probability approach, is equivalent to

$$H_0^{*}:\ p^{*}_{ij} = q_{i*}\,,$$

the hypothesis of homogeneity when "j" is a factor. It is well known that

$$\chi^2 = \sum_{i}\sum_{j} \frac{(n_{ij} - n_{i0}\, n_{0j}/n)^2}{n_{i0}\, n_{0j}/n}\,,\qquad \text{d.f.} = (r-1)(s-1)\,,$$

may be used to test both $H_0$ and $H_0^{*}$.
(ii) Hypothesis of "no partial association" in a three-dimensional table,

$$H_0:\ p_{ijk} = \frac{p_{i0k}\, p_{0jk}}{p_{00k}}\,.$$

Then $p^{*}_{ijk} = p_{ijk}/p_{0jk} = p_{i0k}/p_{00k}$ is independent of $j$. Thus $H_0$, by the conditional probability approach, is equivalent to

$$H_0^{*}:\ p^{*}_{ijk} = q_{i*k}\,,$$

the hypothesis of "no treatment effects" when "j" and "k" are factors. It has already been noticed [19] that

$$\chi^2 = \sum_{i,j,k} \frac{\big(n_{ijk} - n_{0jk}\, n_{i0k}/n_{00k}\big)^2}{n_{0jk}\, n_{i0k}/n_{00k}}\,,\qquad \text{d.f.} = t(r-1)(s-1)\,,$$

may be used to test both $H_0$ and $H_0^{*}$.
(iii) Hypothesis of "multiple independence" in a three-dimensional table.

As noted earlier, $p^{*}_{ijk} = p_{ijk}/p_{0jk}$ is independent of $j$ and $k$. Thus $H_0$, by the conditional probability approach, is equivalent to

$$H_0^{*}:\ p^{*}_{ijk} = q_{i**}$$

when "j" and "k" are factors. It may be seen that the corresponding $\chi^2$ statistic, with d.f. $= (r-1)(st-1)$, may be used to test both $H_0$ and $H_0^{*}$.
Applications

(i) $H_0$: $\sum_i a_i\, p_{ijk}/p_{0jk}$ is independent of $j$ [$H_{017}$, (1.3.12)]. It is equivalent to

$$H_0^{*}:\ \sum_i a_i\, p^{*}_{ijk} = q_{**k}$$

when "j" and "k" are factors. $H_0^{*}$ has already been considered in 3.2, and the appropriate $\chi^2_0$ is given by (3.2.11). Hence, the same statistic may be used for $H_0$.
(ii) $H_0$: $\sum_i a_i\, p_{ijk}/p_{0jk}$ is independent of $(jk)$ [(1.3.16)]. It is equivalent to

$$H_0^{*}:\ \sum_i a_i\, p^{*}_{ijk} \text{ is independent of } (jk)$$

when "j" and "k" are factors. The criterion for $H_0^{*}$ can be easily derived, and the same may be used for $H_0$. Similarly, (1.3.11), $H_{020}$ (1.3.18), $H_{021}$ (1.3.19) and $H_{022}$ (1.3.20) will follow from corresponding cases when "j" and "k" are factors.
PART II
NONPARAMETRIC SET-UP
CHAPTER IV
SOME UNIVARIATE PROBLEMS
4.1 Introduction
Much of the usual analysis of variance rests on the assumptions of normality and homoscedasticity. When these assumptions are not realistic, two different kinds of approaches have been made so far. One is the transformation of variates, and the other is a nonparametric development of the whole problem.
Transformation of variates with a view to "normalizing" and "stabilizing the variance" has been in vogue for a long time and has, by and large, served a useful purpose. But, at the same time, one feels that there are some dangers in the indiscriminate use of such procedures. We should try to pose physically meaningful models and then state hypotheses or, in general, decision problems in terms of the original variates themselves. If we make a transformation even before we have posed any model and hypothesis, then proceed with the usual analysis of variance and reach some conclusions on that basis, such conclusions may not have much physical meaning in terms of the original variates and, in any case, should be interpreted in terms of the original variates.
Under the nonparametric approach, various methods have been suggested so far to avoid the assumption of normality implicit in the analysis of variance. Friedman [13] made use of ranks in the problem of $m$ rankings. His $\chi^2_r$ statistic, to test the hypothesis $H_0$ that the rankings by $m$ "observers" of $n$ "objects" are independent, essentially offers a rank test for the two-way classification with one observation per cell. For large $m$, when $H_0$ is true, $\chi^2_r$ is distributed asymptotically as a $\chi^2$ with $n-1$ d.f. Durbin [9] has given a generalization for the balanced incomplete block design. Benard and Van Elteren [4] have generalized it still further.
Fisher and Yates [12] proposed that each observation be replaced not by its rank but by its normalized rank, defined as the average value of an observation having the corresponding rank in samples of the same size from $N(0,1)$. They proposed that ordinary one-way analysis of variance be applied to these normalized ranks. The argument seems to be that the $\chi^2$ approximation might be approached more rapidly with normalized ranks, or some other set of numbers which resemble the normal form more closely than do ranks.
Another technique that has been suggested to get around the assumption of normality is the use of tests based on permutations. Permutation tests, which seem to have been first proposed by Fisher [11], accept or reject the null hypothesis according to the probability of a test statistic among all relevant permutations of the observed numbers. Applications of the permutation method to important cases may be found in articles by Pitman [26] and by Welch [37].
Kruskal and Wallis [16] have proposed an analogue (of the one-way F-test) based on ranks, called the H test, to decide whether $c$ samples come from the same population (assuming that the populations are approximately of the same form, in the sense that if they differ it is by a shift or translation). If the samples come from identical populations and the sample sizes are not too small, H is distributed as a $\chi^2$ with $c-1$ d.f. When $c = 2$, the H test is essentially the same as Wilcoxon's test [39].
Mood and Brown [20] generalize to $c$ samples the test proposed for two samples by Westenberg [38], utilizing the number of observations above the median of the combined sample. Massey [18] extends Mood's technique to use other order statistics, besides the median, from combined samples. Mood and Brown also consider the two-way classification with one observation per cell, or the same number of observations per cell.
Mosteller [21] proposed a multi-decision procedure for accepting either $H_0$ (that the $c$ samples come from the same population) or one of $c$ alternatives that the i-th population is translated to the right. His criterion is in terms of the number of observations in the sample containing the largest observation that exceed all observations in the other samples.
Terpstra [34] gives a statistic (for the problem of $c$ samples) for testing against trend. His statistic is based on Wilcoxon's statistic for all pairs of samples and is asymptotically normal. He also [35] gives a test based on a statistic Q, again based on Wilcoxon's statistics for all pairs of samples, and shows that Q is asymptotically distributed as a $\chi^2$.
In this chapter, we shall first extend Mood's test for the two-way classification to cover incomplete-block situations. Then, a new test for the problem of $c$ samples is offered. For this purpose, Hoeffding's theorem on U-statistics [14], extended by Lehmann [17] to the case of two samples, has been extended in a straightforward manner to the case of $c$ samples. This is then applied to derive a new test criterion for $c$ samples. The statistic derived may be considered as an extension of Wilcoxon's statistic.
4.2 Extension of Mood's test for two-way classification to cover "incomplete designs"

Mood and Brown [20] have considered a test for the equality of "row" effects in the usual set-up with one observation per cell, with $r$ rows and $c$ columns. The $x_{ij}$ have distributions with medians $\nu_{ij} = \alpha_i + \beta_j + \nu$, where the median of the numbers $\alpha_i$ is zero, as is the median of the $\beta_j$. The distributions of the $x_{ij}$'s are assumed to be continuous and identical, except for location.
Under the null hypothesis that the row effects $\alpha_i$ are equal (i.e., zero), all the observations in a given column have the same distribution. Let $\tilde{x}_j$ be the median of the observations in the j-th column, and in the two-way table let the observation $x_{ij}$ be replaced by a plus sign if it exceeds $\tilde{x}_j$, or by a minus sign if it does not. Let $m_i$ be the number of plus signs in the i-th row. The test criterion used is

$$(4.2.1)\qquad X^2 = \frac{r(r-1)}{c\,a\,(r-a)}\ \sum_{i=1}^{r}\Big(m_i - \frac{ca}{r}\Big)^2\,,$$

where $a = r/2$ if $r$ is even, or $(r-1)/2$ if $r$ is odd. Unless $c$ is small, the $\chi^2$ approximation with $(r-1)$ d.f. is used. For practical purposes the large-sample distribution is satisfactory if $c > 10$, or even if $c = 5$ provided $rc \ge 20$. For smaller $c$, exact probabilities could be computed. We shall consider the generalization to incomplete blocks.
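The criterion (4.2.1) can be sketched in a few lines of code (my addition; the function name is illustrative, and ties with the column medians are assumed absent).

```python
def mood_brown_chi2(x):
    """Mood-Brown median criterion (4.2.1) for row effects in an
    r x c layout x[i][j] with one observation per cell.

    Within each column, observations exceeding the column median get
    a plus sign; m_i is the number of plus signs in row i.  With
    distinct values, each column contributes a = floor(r/2) plus
    signs, so E(m_i) = c*a/r under H0.
    """
    r, c = len(x), len(x[0])
    a = r // 2  # r/2 for even r, (r - 1)/2 for odd r
    m = [0] * r
    for j in range(c):
        col = sorted(x[i][j] for i in range(r))
        med = (col[(r - 1) // 2] + col[r // 2]) / 2.0
        for i in range(r):
            if x[i][j] > med:
                m[i] += 1
    mean = c * a / r
    chi2 = r * (r - 1) / (c * a * (r - a)) * sum((mi - mean) ** 2 for mi in m)
    return chi2, r - 1

# two rows that "win" equally often give a zero statistic
chi2, df = mood_brown_chi2([[1, 5, 1, 5], [2, 3, 2, 3]])
```

Here each row exceeds the column median in exactly half the columns, so every $m_i$ equals its null expectation and the statistic is zero.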
4.2.1

Let us write $n_{ij} = 1$ if the $(ij)$ combination is allowed, or zero otherwise. Let the number of observations in the i-th row be $c_i$ $(i = 1,2,\ldots,r)$ and in the j-th column be $r_j$ $(j = 1,2,\ldots,c)$. Let $a_j = r_j/2$ if $r_j$ is even, or $(r_j - 1)/2$ if $r_j$ is odd. Then there are $a_j$ plus signs in the j-th column. Let the $m_i$'s be defined as before. Then we expect (under $H_0$) $m_i$ to be approximately equal to $c_i/2$.
Following Mood, let us derive the generating function to exhibit the distribution of the $m_i$'s. Suppose $t_i$ is associated with a plus sign in the i-th row $(i = 1,2,\ldots,r)$. Let $\phi_{a_j}(t_1,\ldots,t_{r_j})$ consist of the sum of all terms that can be formed by multiplying $t$'s together $a_j$ at a time. Each term of $\phi$ describes a possible arrangement of signs in a given column. Furthermore, each arrangement of signs is equally likely; hence the probability of a particular arrangement is $1/\binom{r_j}{a_j}$.

Suppose the j-th column contains observations in the $j_1, j_2, \ldots, j_{r_j}$-th rows. Then consider the function

$$(4.2.2)\qquad \Phi = \prod_{j=1}^{c} \frac{\phi_{a_j}(t_{j_1}, \ldots, t_{j_{r_j}})}{\binom{r_j}{a_j}}\,.$$

There is a one-to-one correspondence between ways of getting terms $t_1^{m_1} t_2^{m_2} \cdots t_r^{m_r}$ in the numerator of $\Phi$ and arrangements of signs in the $r \times c$ table which give rise to $m_i$ plus signs in the i-th row $(i = 1,2,\ldots,r)$. Hence

$$(4.2.4)\qquad \Phi = \sum_{m_1}\cdots\sum_{m_r} g(m_1,\ldots,m_r)\ t_1^{m_1} t_2^{m_2} \cdots t_r^{m_r}\,,$$

where $g$ is the density function for the $m_i$'s. Note that $\phi_{a_j}(1,1,\ldots,1) = \binom{r_j}{a_j}$; $\Phi$ is thus a factorial-moment generating function for the $m_i$'s.
Then $\mathcal{E}(m_i) = \big[\partial\Phi/\partial t_i\big]$ with all $t_i$'s $= 1$. We note that

$$\frac{\partial \phi_{a_j}}{\partial t_i}(t_{j_1},\ldots,t_{j_{r_j}}) = 0\ \text{ if } n_{ij} = 0\,,\qquad = \phi_{a_j-1}(t_{j_1},\ldots,t_{j_{r_j}})\ \text{ if } n_{ij} = 1\,,$$

where one of the $t$'s from the previous bracket is missing. Hence,

$$(4.2.5)\qquad \mathcal{E}(m_i) = \sum_{j=1}^{c} n_{ij}\,\frac{a_j}{r_j}\,.$$

Similarly,

$$\sigma_{ii} = \mathrm{var}(m_i) = \Big[\frac{\partial^2\Phi}{\partial t_i^2}\Big]_{t=1} + \mathcal{E}(m_i) - \big[\mathcal{E}(m_i)\big]^2\,,$$

and, evaluating the second derivative from (4.2.4), we obtain

$$(4.2.6)\qquad \sigma_{ii} = \sum_{j=1}^{c} n_{ij}\,\frac{a_j(r_j - a_j)}{r_j^2}\,.$$

Similarly,

$$\sigma_{ij} = \mathrm{cov}(m_i, m_j) = \Big[\frac{\partial^2\Phi}{\partial t_i\,\partial t_j}\Big]_{t=1} - \mathcal{E}(m_i)\,\mathcal{E}(m_j)\,,$$

which gives

$$(4.2.7)\qquad \sigma_{ij} = -\sum_{k=1}^{c} n_{ik}\, n_{jk}\,\frac{a_k(r_k - a_k)}{r_k^2(r_k - 1)}\,,\qquad i \neq j\,.$$
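The closed form (4.2.5) can be checked against an exact enumeration of the equally likely sign placements within each column. The sketch below (my addition; names are illustrative) does this for a small incidence matrix.

```python
from itertools import combinations

def mean_plus_signs(n):
    """Exact E(m_i) under H0 by enumerating, for each column of the
    incidence matrix n (n[i][j] = 1 if cell (i,j) is observed), all
    equally likely placements of the a_j plus signs among the r_j
    observed rows."""
    r, c = len(n), len(n[0])
    exp = [0.0] * r
    for j in range(c):
        rows = [i for i in range(r) if n[i][j]]
        aj = len(rows) // 2  # a_j = r_j/2 (even) or (r_j - 1)/2 (odd)
        combos = list(combinations(rows, aj))
        for combo in combos:
            for i in combo:
                exp[i] += 1.0 / len(combos)
    return exp

# incidence of the design with blocks {1,2}, {1,3}, {2,3}
n = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
exact = mean_plus_signs(n)
# closed form (4.2.5): E(m_i) = sum_j n_ij a_j / r_j
rcol = [sum(n[i][j] for i in range(3)) for j in range(3)]
closed = [sum(n[i][j] * (rcol[j] // 2) / rcol[j] for j in range(3)) for i in range(3)]
```

For this symmetric design each treatment appears in two blocks of size two, so both computations give $\mathcal{E}(m_i) = 1$ for every row.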
4.2.2 Asymptotic normality

We have

$$\Phi(\underline{t}) = \prod_{j=1}^{c} \frac{\phi_{a_j}(t_{j_1}, \ldots, t_{j_{r_j}})}{\binom{r_j}{a_j}}\,.$$

Replacing $t_i$ in $\Phi$ by $e^{s_i/\sqrt{c_i}}$, we have

$$\Phi' = \sum_{m_1}\cdots\sum_{m_r} g(m_1,\ldots,m_r)\ e^{\sum_i m_i s_i/\sqrt{c_i}} = \text{moment generating function of the } m_i/\sqrt{c_i}\,\text{'s}\,.$$

Let us consider $\log \Phi'$ for large $c_i$'s. We have

$$(4.2.8)\qquad \log \Phi' = \sum_{j=1}^{c}\Big[\log \phi_{a_j}\big(e^{s_{j_1}/\sqrt{c_{j_1}}}, \ldots, e^{s_{j_{r_j}}/\sqrt{c_{j_{r_j}}}}\big) - \log\binom{r_j}{a_j}\Big]\,,$$

where, in expanding each $\phi_{a_j}$, the summation runs over the $\binom{r_j}{a_j}$ combinations of type $k_1, k_2, \ldots, k_{a_j}$ out of $(1,2,\ldots,r_j)$. Expanding each factor in powers of the $c_i^{-1/2}$ and simplifying, we obtain

$$\log \Phi' = \sum_{i=1}^{r} \mathcal{E}(m_i)\,\frac{s_i}{\sqrt{c_i}} + \frac{1}{2}\sum_{i=1}^{r} \sigma_{ii}\,\frac{s_i^2}{c_i} + \frac{1}{2}\sum_{i\neq j} \sigma_{ij}\,\frac{s_i s_j}{\sqrt{c_i c_j}} + O(c^{-3/2})\,.$$

Thus, for large $c_i$'s, the distribution of the $m_i/\sqrt{c_i}$ is approximated by the multivariate normal distribution.
Since $\sum_{i=1}^{r} m_i = \sum_{j=1}^{c} a_j$, it follows that the $m_i$'s are linearly dependent; hence the above distribution is singular. Considering only $m_1, m_2, \ldots, m_{r-1}$, which have an asymptotic nonsingular normal distribution, we shall have a chi-square criterion with $r-1$ d.f., given by

$$(4.2.10)\qquad X^2 = \sum_{i=1}^{r-1}\sum_{j=1}^{r-1} \big[m_i - \mathcal{E}(m_i)\big]\big[m_j - \mathcal{E}(m_j)\big]\, a^{ij}\,,$$

where $(a^{ij}) = \Sigma_{(rr)}^{-1}$, $\Sigma_{(rr)}$ being the submatrix of $(\sigma_{ij})$ obtained by deleting the r-th row and column.
4.2.3 Special case

Suppose $c_1 = c_2 = \cdots = c_r = c_0$, say, and $r_1 = r_2 = \cdots = r_c = r_0$, say. Then $a_1 = a_2 = \cdots = a_c = a_0$, say, where $a_0 = r_0/2$ if $r_0$ is even, or $(r_0-1)/2$ otherwise. Also $rc_0 = cr_0$. Then, from (4.2.5),

$$\mathcal{E}(m_i) = \frac{a_0 c_0}{r_0}\,,\qquad i = 1,2,\ldots,r\,;$$

from (4.2.6),

$$\sigma_{ii} = \frac{c_0\, a_0 (r_0 - a_0)}{r_0^2}\,,\qquad i = 1,2,\ldots,r\,;$$

and, from (4.2.7),

$$(4.2.11)\qquad \sigma_{ij} = -\,\frac{a_0(r_0 - a_0)}{r_0^2(r_0 - 1)}\,\lambda_{ij}\,,\qquad j \neq i\,,\qquad \text{where } \lambda_{ij} = \sum_{k} n_{ik}\, n_{jk}\,.$$
(i) Balanced Incomplete Block Designs:

Let $\lambda_{ij} = \lambda$ for all $(ij)$, $i \neq j$. Then we have $c_0(r_0 - 1) = \lambda(r - 1)$. Also, $\sigma_{ii} = a$, say, and $\sigma_{ij} = \beta$, say, so that

$$\Sigma_{(rr)} = (a - \beta)\, I_{r-1} + \beta\, J_{r-1}\,.$$

Then

$$\Sigma_{(rr)}^{-1} = \frac{1}{a - \beta}\, I_{r-1} - \frac{\beta}{(a - \beta)\big[a + (r-2)\beta\big]}\, J_{r-1} = \gamma\, I_{r-1} + \delta\, J_{r-1}\,,\ \text{say}.$$

Let

$$\underset{1\times(r-1)}{z'} = \Big[\Big(m_i - \frac{a_0 c_0}{r_0}\Big)\,,\ i = 1,2,\ldots,r-1\Big]\,.$$

Then, from (4.2.10),

$$X^2 = z'\,\Sigma_{(rr)}^{-1}\, z = \gamma \sum_{i=1}^{r-1} z_i^2 + \delta\Big(\sum_{i=1}^{r-1} z_i\Big)^2\,.$$

Now $\sum_{i=1}^{r-1} z_i = -\big(m_r - c_0 a_0/r_0\big)$, and it can also be seen that $\gamma = \delta$, so that

$$(4.2.12)\qquad X^2 = \frac{r_0^2(r_0 - 1)}{\lambda\, r\, a_0(r_0 - a_0)}\ \sum_{i=1}^{r}\Big(m_i - \frac{c_0 a_0}{r_0}\Big)^2\,.$$

If we put $\lambda = c$, and hence $r_0 = r$, $c_0 = c$ and $a_0 = a$, we get back to (4.2.1).
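A hedged sketch (my addition) of the incomplete-block criterion (4.2.12): the constant used here is the reconstruction that reduces to (4.2.1) in the complete-design case, and the function name is illustrative.

```python
def incomplete_block_median_chi2(m, r, lam, r0, c0):
    """Median-test criterion (4.2.12) for a balanced incomplete block
    design: r treatments (rows), block size r0, c0 blocks containing
    each treatment, every pair of treatments together in lam blocks.

    m[i] is the number of plus signs for treatment i; a0 plus signs
    are assigned per block, and E(m_i) = c0*a0/r0 under H0.
    """
    a0 = r0 // 2
    const = r0 * r0 * (r0 - 1) / (lam * r * a0 * (r0 - a0))
    mean = c0 * a0 / r0
    return const * sum((mi - mean) ** 2 for mi in m)

# complete-design check: lam = c, r0 = r, c0 = c reduces to (4.2.1)
m = [5, 3, 4]
x = incomplete_block_median_chi2(m, 3, 4, 3, 4)
mean = 4 * 1 / 3
direct = 3 * 2 / (4 * 1 * 2) * sum((mi - mean) ** 2 for mi in m)
```

The reduction check mirrors the remark in the text that putting $\lambda = c$, $r_0 = r$, $c_0 = c$ recovers the complete-design statistic.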
(ii) Partially Balanced Incomplete Block Designs:

Let us consider rows as treatments, so that $\lambda_{ij} = \lambda_t$ if $i$ and $j$ are t-th associates. Then

$$\Sigma = a\, I_r + \sum_{t=1}^{m} \beta_t\, B_t\,,$$

where $m$ is the number of associate classes, $a$ is defined as before, the B's are association matrices [6], and

$$\beta_t = -\,\frac{a_0(r_0 - a_0)}{r_0^2(r_0 - 1)}\,\lambda_t\,.$$

Let $\beta_0 = a$ and $B_0 = I_r$, so that

$$\Sigma = \sum_{t=0}^{m} \beta_t\, B_t\,,$$

in the notation of [5]. Using the results derived in [5] and simplifying, we have

$$X^2 = \sum_{i=1}^{r}\sum_{j=1}^{r} c_{ij}\Big(m_i - \frac{c_0 a_0}{r_0}\Big)\Big(m_j - \frac{c_0 a_0}{r_0}\Big)\,,$$

where $C = (c_{ij})$ is such that the solution of the "normal equations" for the treatment effects in the analysis of variance for a PBIBD is given by $\hat{t} = Cg$, $g$ being defined in the usual notation.
4.3 Generalization of Hoeffding's theorem on U-statistics to c samples

Let $X_1, X_2, \ldots, X_{n_1}$ be independent and identically distributed (real- or vector-valued) random variables with distribution function $F$. Similarly, let $Y_1, Y_2, \ldots, Y_{n_2}$ have distribution function $G$, and so on, and let $Z_1, Z_2, \ldots, Z_{n_c}$ have distribution function $H$. Consider

$$(4.3.1)\qquad U = \frac{1}{\prod_{i=1}^{c} n_i(n_i-1)\cdots(n_i-m_i+1)}\ {\sum}' \Phi(X_{\alpha_1},\ldots,X_{\alpha_{m_1}};\ Y_{\beta_1},\ldots,Y_{\beta_{m_2}};\ \ldots;\ Z_{\gamma_1},\ldots,Z_{\gamma_{m_c}})\,,$$

where $\sum'$ denotes the sum over all permutations $(\alpha_1,\ldots,\alpha_{m_1})$ with $1 \le \alpha_i \le n_1$ and all different, and similarly for the $\beta$'s, $\ldots$, and $\gamma$'s. Let

$$(4.3.2)\qquad \Phi^{*}(x_1,\ldots,x_{m_1};\ y_1,\ldots,y_{m_2};\ \ldots;\ z_1,\ldots,z_{m_c}) = \frac{1}{\prod_{i=1}^{c} m_i!}\ {\sum}'' \Phi(x_{\alpha_1},\ldots,x_{\alpha_{m_1}};\ \ldots)\,,$$

where $\sum''$ denotes the sum over all permutations $(\alpha_1,\ldots,\alpha_{m_1})$ of the integers $(1,2,\ldots,m_1)$, and so on for the $\beta$'s, $\ldots$, $\gamma$'s. Then $\Phi^{*}$ is symmetric in the $x$'s, symmetric in the $y$'s, and so on, and

$$(4.3.3)\qquad U = \frac{1}{\prod_{i=1}^{c} \binom{n_i}{m_i}}\ {\sum}''' \Phi^{*}(X_{\alpha_1},\ldots,X_{\alpha_{m_1}};\ \ldots;\ Z_{\gamma_1},\ldots,Z_{\gamma_{m_c}})\,,$$

where $\sum'''$ denotes the sum over all combinations $(\alpha_1,\ldots,\alpha_{m_1})$ of $m_1$ integers chosen from $(1,2,\ldots,n_1)$, and so on for the $\beta$'s, $\ldots$, $\gamma$'s. We shall now assume that

$$(4.3.4)\qquad E(\Phi^{*}) = \eta$$

and

$$(4.3.5)\qquad E(\Phi^{*2}) < \infty\,,$$

so that the required second moments exist, from Schwarz's inequality. Then

$$(4.3.6)\qquad E(U) = \eta\,.$$
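A brute-force evaluation of the combination form (4.3.3) is straightforward for small samples. The sketch below (my addition; names are illustrative) averages a kernel over all ways of choosing the required observations from each sample.

```python
from itertools import combinations

def u_statistic(samples, m, phi):
    """c-sample U-statistic in the combination form (4.3.3): the
    average of the kernel phi over all ways of choosing m[i]
    observations from the i-th sample.  phi receives one tuple per
    sample and is assumed symmetric within each tuple, as Phi* is."""
    total, count = 0.0, 0

    def rec(i, chosen):
        nonlocal total, count
        if i == len(samples):
            total += phi(*chosen)
            count += 1
            return
        for combo in combinations(samples[i], m[i]):
            rec(i + 1, chosen + [combo])

    rec(0, [])
    return total / count

# two-sample example with the Wilcoxon kernel, m1 = m2 = 1
x = [1.0, 3.0, 5.0]
y = [2.0, 4.0]
u = u_statistic([x, y], [1, 1], lambda xs, ys: 1.0 if xs[0] < ys[0] else 0.0)
```

With this kernel, $U$ is the proportion of pairs with $x < y$, i.e. the Mann-Whitney statistic divided by $n_1 n_2$; here three of the six pairs satisfy $x < y$.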
Also,

$$(4.3.7)\qquad \mathrm{var}(U) = \frac{1}{\prod_{i=1}^{c}\binom{n_i}{m_i}^{2}}\; E\Big[{\sum}''' \Psi^{*}(X_{\alpha_1},\ldots,X_{\alpha_{m_1}};\ \ldots;\ Z_{\gamma_1},\ldots,Z_{\gamma_{m_c}})\Big]^{2}\,,$$

where $\Psi^{*} = \Phi^{*} - \eta$. Let

$$(4.3.8)\qquad \Psi^{*}_{d_1,d_2,\ldots,d_c}(x_1,\ldots,x_{d_1};\ \ldots;\ z_1,\ldots,z_{d_c}) = E\big[\Psi^{*}(x_1,\ldots,x_{d_1},X_{d_1+1},\ldots,X_{m_1};\ \ldots;\ z_1,\ldots,z_{d_c},Z_{d_c+1},\ldots,Z_{m_c})\big]$$

and

$$\zeta_{d_1,d_2,\ldots,d_c} = \mathrm{var}\,\Psi^{*}_{d_1,\ldots,d_c}(X_1,\ldots,X_{d_1};\ \ldots;\ Z_1,\ldots,Z_{d_c}) = E\big[\Psi^{*2}_{d_1,\ldots,d_c}\big]\,,$$

which exists in view of Schwarz's inequality and (4.3.5); $\zeta_{0,0,\ldots,0} = 0$. Then, from (4.3.7),

$$\prod_{i=1}^{c}\binom{n_i}{m_i}^{2}\ \mathrm{var}(U) = \sum_{d_1=0}^{m_1}\cdots\sum_{d_c=0}^{m_c} N_{d_1,d_2,\ldots,d_c}\; \zeta_{d_1,d_2,\ldots,d_c}\,,$$

where $N_{d_1,\ldots,d_c}$ is the number of pairs of combinations that can be formed by choosing $m_1$ distinct integers $\alpha$'s and $m_1$ distinct $\alpha'$'s out of $(1,2,\ldots,n_1)$ such that exactly $d_1$ are common between the $\alpha$'s and the $\alpha'$'s, and similarly for the $\beta$'s, $\ldots$, and $\gamma$'s. Then we can see that

$$N_{d_1,d_2,\ldots,d_c} = \prod_{i=1}^{c} \binom{n_i}{m_i}\binom{m_i}{d_i}\binom{n_i - m_i}{m_i - d_i}\,.$$

Thus,

$$(4.3.10)\qquad \mathrm{var}(U) = \frac{1}{\prod_{i=1}^{c}\binom{n_i}{m_i}}\ \sum_{d_1=0}^{m_1}\cdots\sum_{d_c=0}^{m_c}\ \prod_{i=1}^{c}\binom{m_i}{d_i}\binom{n_i - m_i}{m_i - d_i}\; \zeta_{d_1,\ldots,d_c}\,.$$
4.3.1 Asymptotic behavior of var(U)

Let us study $\mathrm{var}(U)$ as the $n_i \to \infty$, $i = 1,2,\ldots,c$. Then

$$\frac{\binom{m_i}{d_i}\binom{n_i - m_i}{m_i - d_i}}{\binom{n_i}{m_i}} = \binom{m_i}{d_i}\,\frac{m_i!}{(m_i - d_i)!}\; n_i^{-d_i} + O\big(n_i^{-d_i-1}\big)\,.$$

Therefore,

$$\mathrm{var}(U) = \sum_{d_1=0}^{m_1}\cdots\sum_{d_c=0}^{m_c}\ \prod_{i=1}^{c}\Big[\binom{m_i}{d_i}\,\frac{m_i!}{(m_i - d_i)!}\; n_i^{-d_i} + \cdots\Big]\ \zeta_{d_1,\ldots,d_c}\,.$$

Hence,

$$(4.3.11)\qquad \mathrm{var}(U) = \sum_{i=1}^{c} \frac{m_i^2}{n_i}\ \zeta_{0,\ldots,0,1,0,\ldots,0} + O(N^{-2})\,,$$

where the 1 stands in the i-th place and $N = \sum_{i=1}^{c} n_i$.
4.3.2 Asymptotic normality

We shall now show that $\sqrt{N}\,(U - \eta) \to N(0, \sigma^2)$, where

$$(4.3.12)\qquad \sigma^2 = \sum_{i=1}^{c} \frac{m_i^2}{p_i}\ \zeta_{0,\ldots,0,1,0,\ldots,0}\qquad (1 \text{ in the i-th place})\,,\qquad \frac{n_i}{N} \to p_i > 0\,.$$

Proof: Let $V_N = \sqrt{N}\,(U_{n_1,\ldots,n_c} - \eta)$ and

$$W_N = \sqrt{N}\Big[\frac{m_1}{n_1}\sum_{\alpha=1}^{n_1} \Psi^{*}_{1,0,\ldots,0}(X_\alpha) + \cdots + \frac{m_c}{n_c}\sum_{\gamma=1}^{n_c} \Psi^{*}_{0,\ldots,0,1}(Z_\gamma)\Big] = W_{1,n_1} + \cdots + W_{c,n_c}\,,\ \text{say}.$$

Then $W_{1,n_1}$ is a sum of $n_1$ independent and identically distributed random variables with zero mean and variance $m_1^2\,\zeta_{1,0,\ldots,0}/(n_1 p_1)$, and, hence, by the central limit theorem, $W_{1,n_1}$ has a normal limiting distribution with mean zero and variance $(m_1^2/p_1)\,\zeta_{1,0,\ldots,0}$. Similarly for the other $W_{i,n_i}$. Moreover, $W_{1,n_1}, W_{2,n_2}, \ldots, W_{c,n_c}$ are independent; hence $W_N$ has a normal limiting distribution with mean zero and variance $\sigma^2$, given by (4.3.12).

We shall now show that $V_N - W_N$ converges to zero in probability. It is sufficient to prove that $E(V_N - W_N)^2 \to 0$. Now

$$E(V_N^2) = N\Big[\sum_{i=1}^{c} \frac{m_i^2}{n_i}\,\zeta_{0,\ldots,0,1,0,\ldots,0} + O(N^{-2})\Big] \to \sigma^2\,,\qquad E(W_N^2) \to \sigma^2\,,$$

and

$$(4.3.14)\qquad E(V_N W_N)\,,$$

in which the expectation of a typical product term is zero unless $\alpha$ is one of the integers $\alpha_1, \ldots, \alpha_{m_1}$ (and similarly for the $\beta$'s, $\ldots$, and $\gamma$'s), in which case the expectation is $\zeta_{1,0,\ldots,0}$. The number of terms in the summation such that one of the integers $\alpha_1, \ldots, \alpha_{m_1}$ is $\alpha$, over all possible combinations of $\alpha$'s, is $M_1 = n_1\binom{n_1-1}{m_1-1}$. Hence, from (4.3.14),

$$E(V_N W_N) = \sum_{i=1}^{c} \frac{m_i^2}{p_i}\,\zeta_{0,\ldots,0,1,0,\ldots,0} + O(N^{-1}) \to \sigma^2\,.$$

Therefore $E(V_N - W_N)^2 \to 0$ as $N \to \infty$, and hence $V_N - W_N \to 0$ in probability. Since $V_N = W_N + (V_N - W_N)$ and $W_N$ is asymptotically $N(0, \sigma^2)$, $V_N$ is also asymptotically $N(0, \sigma^2)$.
4.3.3 Extension to a vector U-statistic

Suppose we consider $\Phi' = (\Phi^{(1)}, \Phi^{(2)}, \ldots, \Phi^{(g)})$ and $U' = (U^{(1)}, U^{(2)}, \ldots, U^{(g)})$, where $U^{(i)}$ is a U-statistic corresponding to $\Phi^{(i)}$. Then, under the assumption of existence of the second moments of the $\Phi$'s, we shall have extensions of the previous results. Let $E\,\Phi^{(i)} = \eta^{(i)}$ and $\eta' = (\eta^{(1)}, \ldots, \eta^{(g)})$. Also let

$$\zeta^{(i,j)}_{d_1,d_2,\ldots,d_c} = E\Big[\Psi^{*(i)}\big(X_1,\ldots,X_{d_1}, X_{d_1+1},\ldots,X_{m_1^{(i)}};\ \ldots;\ Z_1,\ldots,Z_{d_c}, Z_{d_c+1},\ldots,Z_{m_c^{(i)}}\big)$$
$$\times\ \Psi^{*(j)}\big(X_1,\ldots,X_{d_1}, X_{m_1^{(i)}+1},\ldots,X_{m_1^{(i)}+m_1^{(j)}-d_1};\ \ldots;\ Z_1,\ldots,Z_{d_c}, Z_{m_c^{(i)}+1},\ldots,Z_{m_c^{(i)}+m_c^{(j)}-d_c}\big)\Big]\,,$$

with $\zeta^{(i,j)}_{0,0,\ldots,0} = 0$. Then we shall have, in a manner as before,

$$\prod_{k=1}^{c}\binom{n_k}{m_k^{(i)}}\binom{n_k}{m_k^{(j)}}\ \mathrm{cov}(U^{(i)}, U^{(j)}) = \sum_{d_1}\cdots\sum_{d_c} N^{(ij)}_{d_1,\ldots,d_c}\ \zeta^{(i,j)}_{d_1,\ldots,d_c}\,,$$

where $N^{(ij)}_{d_1,d_2,\ldots,d_c}$ is the number of pairs of combinations that can be formed by choosing $m_1^{(i)}$ distinct integers $\alpha$'s and similarly $m_1^{(j)}$ distinct integers $\alpha'$'s out of $(1,2,\ldots,n_1)$ such that exactly $d_1$ are common between the $\alpha$'s and the $\alpha'$'s, and similarly for the $\beta$'s, $\ldots$, and $\gamma$'s. Then

$$N^{(ij)}_{d_1,d_2,\ldots,d_c} = \prod_{k=1}^{c} \binom{n_k}{m_k^{(i)}}\binom{m_k^{(i)}}{d_k}\binom{n_k - m_k^{(i)}}{m_k^{(j)} - d_k}\,.$$

Therefore,

$$(4.3.15)\qquad \mathrm{cov}(U^{(i)}, U^{(j)}) = \frac{1}{\prod_{k=1}^{c}\binom{n_k}{m_k^{(j)}}}\ \sum_{d_1=0}^{m_1^{(ij)}}\cdots\sum_{d_c=0}^{m_c^{(ij)}}\ \prod_{k=1}^{c}\binom{m_k^{(i)}}{d_k}\binom{n_k - m_k^{(i)}}{m_k^{(j)} - d_k}\ \zeta^{(i,j)}_{d_1,d_2,\ldots,d_c}\,,$$

where $m_k^{(ij)}$ denotes the smaller of $m_k^{(i)}$ and $m_k^{(j)}$.
Asymptotic behavior. Let us study the behavior of $\mathrm{cov}(U^{(i)}, U^{(j)})$ for large $n_k$'s. As before, from (4.3.15) we have

$$(4.3.16)\qquad \mathrm{cov}(U^{(i)}, U^{(j)}) = \sum_{k=1}^{c} \frac{m_k^{(i)} m_k^{(j)}}{n_k}\ \zeta^{(i,j)}_{0,\ldots,0,1,0,\ldots,0} + O(N^{-2})\qquad (1 \text{ in the k-th place})\,.$$
Asymptotic normality. We shall now show that $\sqrt{N}\,(\underline{U} - \underline{\eta}) \to N(\underline{0}, \Sigma)$, where $\Sigma = (\sigma_{ij})$,

$$(4.3.17)\qquad \sigma_{ij} = \sum_{k=1}^{c} \frac{m_k^{(i)} m_k^{(j)}}{p_k}\ \zeta^{(i,j)}_{0,\ldots,0,1,0,\ldots,0}\,,\qquad \frac{n_k}{N} \to p_k > 0\,.$$

Proof: Let $V_N^{(i)} = \sqrt{N}\,\big(U^{(i)}_{n_1,\ldots,n_c} - \eta^{(i)}\big)$ and

$$W_N^{(i)} = \sqrt{N}\Big[\frac{m_1^{(i)}}{n_1}\sum_{\alpha=1}^{n_1} \Psi^{*(i)}_{1,0,\ldots,0}(X_\alpha) + \cdots + \frac{m_c^{(i)}}{n_c}\sum_{\gamma=1}^{n_c} \Psi^{*(i)}_{0,0,\ldots,0,1}(Z_\gamma)\Big] = \sum_{k=1}^{c} W^{(i)}_{k,n_k}\,.$$

Then, as before,

$$\mathrm{var}\big(W_N^{(i)}\big) = \sum_{k=1}^{c} \frac{m_k^{(i)2}}{p_k}\ \zeta^{(i,i)}_{0,\ldots,0,1,0,\ldots,0} = \sigma_{ii}$$

and, similarly,

$$\mathrm{cov}\big(W_N^{(i)}, W_N^{(j)}\big) = \sum_{k=1}^{c} \frac{m_k^{(i)} m_k^{(j)}}{p_k}\ \zeta^{(i,j)}_{0,\ldots,0,1,0,\ldots,0} = \sigma_{ij}\,.$$

By the central limit theorem we have asymptotic normality of the $W_{k,n_k}$ and, since these are independent for $k = 1,2,\ldots,c$, we have asymptotic normality of $\underline{W}_N$, with mean vector zero and covariance matrix $\Sigma$. Now

$$\underline{V}_N = \underline{W}_N + (\underline{V}_N - \underline{W}_N)$$

and, as before, $E\big(V_N^{(i)} - W_N^{(i)}\big)^2 \to 0$ $(i = 1,2,\ldots,g)$, so that $V_N^{(i)} - W_N^{(i)} \to 0$ in probability, that is, $\underline{V}_N - \underline{W}_N \to 0$ in probability. Hence the assertion follows.
4.4 An application to a certain nonparametric test for c samples

Let $X_1, \ldots, X_{n_1}$ be independent (real-valued) observations from a population with distribution function $F$. Similarly, let $Y_1, \ldots, Y_{n_2}$ be independent (real-valued) observations from $G$, $\ldots$, and $Z_1, \ldots, Z_{n_c}$ be from $H$. We shall assume that the distributions are continuous. We shall consider a certain nonparametric test for the hypothesis

$$(4.4.1)\qquad H_0:\ F = G = \cdots = H\,.$$

If we assume that the populations are approximately of the same form, in the sense that if they differ it is by a shift or translation, then we may say that we are testing for the equality of location parameters. Let

$$\phi^{(1)}(x_\alpha, y_\beta, \ldots, z_\gamma) = \begin{cases}1 & \text{if } x_\alpha < y_\beta,\ \ldots,\ x_\alpha < z_\gamma\,,\\ 0 & \text{otherwise}\,,\end{cases}$$

and let $v^{(1)}$ = the number of c-plets that can be formed by choosing some $x_\alpha, y_\beta, \ldots, z_\gamma$ such that $x_\alpha$ is the smallest. Here $m_1^{(1)} = m_2^{(1)} = \cdots = m_c^{(1)} = 1$, so that $\phi^{(1)} = \phi^{*(1)}$ and $\psi^{(1)} = \psi^{*(1)}$. Also,

$$U^{(1)} = \frac{v^{(1)}}{n_1 n_2 \cdots n_c}\,,$$

and

$$(4.4.3)\qquad \eta^{(1)} = E\big[\phi^{(1)}(X, Y, \ldots, Z)\big] = \Pr[X < Y,\ \ldots,\ X < Z]\,.$$

If $H_0$ is true, all orderings of $X, Y, \ldots, Z$ are equally probable, and hence

$$\eta^{(1)} = \frac{(c-1)!}{c!} = \frac{1}{c}\,.$$
Also,

$$\zeta^{(1,1)}_{1,0,\ldots,0} = E\big[\Psi^{(1)}_{1,0,\ldots,0}(X)\big]^2\,,$$

where

$$\Psi^{(1)}_{1,0,\ldots,0}(x) = E\big[\phi^{(1)}(x, Y, \ldots, Z) - \tfrac{1}{c}\big] = [1 - F(x)]^{c-1} - \frac{1}{c}\,,$$

so that

$$\zeta^{(1,1)}_{1,0,\ldots,0} = E\,[1 - F(X)]^{2c-2} - \frac{1}{c^2} = \int_{-\infty}^{\infty} [1 - F(x)]^{2c-2}\, dF(x) - \frac{1}{c^2} = \int_0^1 (1-t)^{2c-2}\, dt - \frac{1}{c^2} = \frac{(c-1)^2}{c^2(2c-1)}\,.$$
Similarly,

$$\Psi^{(1)}_{0,1,0,\ldots,0}(y) = E\big[\Psi^{(1)}(X, y, \ldots, Z)\big] = \int_{-\infty}^{y} [1 - F(x)]^{c-2}\, dF(x) - \frac{1}{c} = \frac{1 - [1 - F(y)]^{c-1}}{c - 1} - \frac{1}{c}\,,$$

so that

$$\zeta^{(1,1)}_{0,1,0,\ldots,0} = E\big\{\Psi^{(1)2}_{0,1,0,\ldots,0}(Y)\big\} = \frac{E\big[\big(1 - [1 - F(Y)]^{c-1}\big)^2\big]}{(c-1)^2} - \frac{1}{c^2} = \frac{1}{c^2(2c-1)}\,.$$

Similarly,

$$\zeta^{(1,1)}_{0,\ldots,0,1,0,\ldots,0} = \frac{1}{c^2(2c-1)}\,.$$

Hence, from (4.3.17), we have

$$(4.4.5)\qquad \sigma_{11} = \frac{1}{c^2(2c-1)}\Big[\frac{(c-1)^2}{p_1} + \sum_{k\neq 1} \frac{1}{p_k}\Big]\,,\qquad \text{where } p_i = \lim \frac{n_i}{N}\,.$$
1
--
92
~
(2)
(Xa'YA' ••• 'Z)
y
p
=
{1 if Y~ < x a ' ••• ' Y~ < Zy
otherwise
°
• • •
and
(k=1,2, ••• ,c)
= number
of c-plets that can be formed by choosing
one observation from each sample such that the
observation from the k-th sample is the least.
Then
m~k)
so that
= m~k)
u
1
= ~c---IT
n.
. 1 J.
~
u(k)
= m(k)
=1
c
= •••
If
•
n,n2·· .n c
is true,
Ho
= _~1_ _
,,(k) =
1c , so
J.=
that
1
" = -c -J
(4.4.6)
(j..
J.J.
•
Then, similar to (4.4.5), we shall have
= -S- -1 - c (2c _ 1)
r
1
C
-
Pi
)2
+"
i.J
..L~
k~i Pk
•
Now

    \zeta^{(1,2)}_{1,0,\ldots,0} = E[\psi^{(1)}_{1,0,\ldots,0}(X)\,\psi^{(2)}_{1,0,\ldots,0}(X)]
      = E[\varphi^{(1)}(X, Y_1, \ldots, Z_1)\,\varphi^{(2)}(X, Y_2, \ldots, Z_2)] - \frac{1}{c^2}
      = E\Big\{[1 - F(X)]^{c-1} \int_0^{F(X)} (1-t)^{c-2}\,dt\Big\} - \frac{1}{c^2}
      = \frac{-(c-1)}{c^2(2c-1)} .

In general,

    \zeta^{(i,j)}_{0,\ldots,0,1,0,\ldots,0} = \frac{-(c-1)}{c^2(2c-1)}

when the 1 is at the i-th or j-th place.  Similarly,

    \zeta^{(2,c)}_{1,0,\ldots,0} = E[\varphi^{(2)}(X, Y_1, \ldots, Z_1)\,\varphi^{(c)}(X, Y_2, \ldots, Z_2)] - \frac{1}{c^2}
      = E\Big[\int_{-\infty}^{X} [1 - F(y_1)]^{c-2}\,dF(y_1) \int_{-\infty}^{X} [1 - F(z_2)]^{c-2}\,dF(z_2)\Big] - \frac{1}{c^2}
      = E\Big[\int_0^{F(X)} (1-t)^{c-2}\,dt\Big]^2 - \frac{1}{c^2}
      = \frac{1}{c^2(2c-1)} .

Hence, from (4.3.17), we have

(4.4.7)    \sigma_{ij} = \sum_{k=1}^{c} \frac{1}{\rho_k}\,\zeta^{(i,j)}_{0,\ldots,0,1,0,\ldots,0}
         = \frac{1}{c^2(2c-1)}\Big[-(c-1)\Big(\frac{1}{\rho_i} + \frac{1}{\rho_j}\Big) + \sum_{k \ne i,j}\frac{1}{\rho_k}\Big] ,    i \ne j .
Thus, from (4.4.6) and (4.4.7), we have

(4.4.8)    c^2(2c-1)\Sigma = \Big(\sum_{k=1}^{c} \frac{1}{\rho_k}\Big) J + c^2 D - c\,q 1' - c\,1 q' ,

where q' = (1/\rho_1, 1/\rho_2, \ldots, 1/\rho_c), D = diagonal(1/\rho_1, \ldots, 1/\rho_c), 1 is the c-vector of unities and J = 11'.

Now

    v^{(1)} + v^{(2)} + \cdots + v^{(c)} = number of possible c-plets = n_1 n_2 \cdots n_c .

Hence U^{(1)}, \ldots, U^{(c)} are subject to one linear relation

    \sum_{k=1}^{c} U^{(k)} = 1 .

Hence the distribution of the U's is singular, and hence the asymptotic distribution should also be singular.  Thus \Sigma should be singular; in fact, we expect \sigma_{ii} + \sum_{j \ne i} \sigma_{ij} to be zero for each i, that is, \Sigma 1 = 0.  Now

    c^2(2c-1)\Sigma 1 = \Big(\sum_{k=1}^{c} \frac{1}{\rho_k}\Big) J1 + c^2 D1 - c\,q 1'1 - c\,1 q'1 .

But J1 = c\,1, D1 = q, 1'1 = c and q'1 = \sum_{k=1}^{c} 1/\rho_k, so that c^2(2c-1)\Sigma 1 = 0.  Hence \Sigma is singular.
Let us consider only U^{(1)}, U^{(2)}, \ldots, U^{(c-1)} and their asymptotic normal distribution.  If \Sigma_0 denotes the covariance matrix of the asymptotic normal distribution of U^{(1)}, \ldots, U^{(c-1)}, then from (4.4.8) we have

(4.4.9)    c^2(2c-1)\Sigma_0 = \Big(\sum_{k=1}^{c} \frac{1}{\rho_k}\Big) J_0 + c^2 D_0 - c\,q_0 1_0' - c\,1_0 q_0' ,

where J_0 is the (c-1) \times (c-1) matrix of unities, D_0 = diagonal(1/\rho_1, \ldots, 1/\rho_{c-1}), 1_0 is the (c-1)-vector of unities and q_0' = (1/\rho_1, \ldots, 1/\rho_{c-1}).

4.4.1  Special case

Let us first consider the special case when n_1 = n_2 = \cdots = n_c.  Then \rho_i = n_i/N = 1/c, so that D_0 = cI, q_0 = c\,1_0 and hence

    (2c-1)\Sigma_0 = cI - J_0 .

Therefore,

    \Sigma_0^{-1} = \frac{2c-1}{c}\,[I + J_0] .

If we denote b' = \sqrt{N}\,(U' - (1/c)1') = (b_1, b_2, \ldots, b_c) and b_0' = (b_1, \ldots, b_{c-1}), then, from the asymptotic normality of b_0, we have b_0'\Sigma_0^{-1}b_0 distributed, in the limit, as a \chi^2 with c - 1 d.f.  Hence, here,

(4.4.10)   \chi^2 = b_0'\Sigma_0^{-1}b_0 = \frac{N(2c-1)}{c}\,b_0'[I + J_0]b_0
         = \frac{N(2c-1)}{c}\Big[\sum_{i=1}^{c-1}\Big(U^{(i)} - \frac{1}{c}\Big)^2 + \Big\{\sum_{i=1}^{c-1}\Big(U^{(i)} - \frac{1}{c}\Big)\Big\}^2\Big]
         = \frac{N(2c-1)}{c}\sum_{i=1}^{c}\Big(U^{(i)} - \frac{1}{c}\Big)^2 .

4.4.2  General case

Let us now suppose that not all the n's are equal.  Then q_0 and 1_0 are linearly independent.  From (4.4.9) we have

    c^2(2c-1)\Sigma_0 = a J_0 + c^2 D_0 - c\,q_0 1_0' - c\,1_0 q_0' = c^2 D_0 - EF ,  say,

where a = \sum_{k=1}^{c} 1/\rho_k and E, F are, respectively, (c-1) \times 2 and 2 \times (c-1) matrices with EF = c\,q_0 1_0' + c\,1_0 q_0' - a J_0.  Then

    \frac{1}{c^2(2c-1)}\,\Sigma_0^{-1} = \frac{1}{c^2}\,D_0^{-1} - D_0^{-1} E A F D_0^{-1} ,

where the 2 \times 2 matrix A is obtained from the usual identity for the inverse of a matrix diminished by a matrix of rank two.  Simplifying and using the relation \sum_{i=1}^{c} \rho_i = 1, we finally have, by similar considerations, the limiting \chi^2 distribution of

    N(2c-1)\Big[\sum_{i=1}^{c} \rho_i\Big(U^{(i)} - \frac{1}{c}\Big)^2 - \Big\{\sum_{i=1}^{c} \rho_i\Big(U^{(i)} - \frac{1}{c}\Big)\Big\}^2\Big]

with c - 1 d.f.  When \rho_1 = \rho_2 = \cdots = \rho_c = 1/c, this
reduces to the earlier expression.
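As a concrete illustration (our own code, not part of the thesis), the statistic of the general case can be computed by direct enumeration of all c-plets.  The function name is ours, and the brute-force O(\prod n_i) loop is meant only as a sketch for small samples.

```python
# Illustrative brute-force computation of the c-sample statistic: U^(k) is
# the proportion of c-plets, one observation per sample, whose least member
# comes from the k-th sample.
from itertools import product

def bhapkar_v_statistic(samples):
    """N(2c-1)[sum rho_i (U_i - 1/c)^2 - {sum rho_i (U_i - 1/c)}^2]."""
    c = len(samples)
    sizes = [len(s) for s in samples]
    N = sum(sizes)
    rho = [n_i / N for n_i in sizes]
    total = 1
    for n_i in sizes:
        total *= n_i                      # number of c-plets
    counts = [0] * c
    for plet in product(*samples):        # O(prod n_i): a sketch, not fast
        counts[min(range(c), key=lambda i: plet[i])] += 1
    U = [v / total for v in counts]
    dev = [u - 1.0 / c for u in U]
    s1 = sum(r * d * d for r, d in zip(rho, dev))
    s2 = sum(r * d for r, d in zip(rho, dev))
    return N * (2 * c - 1) * (s1 - s2 * s2)
```

Under H_0 the value is referred, for large samples, to the \chi^2 distribution with c - 1 d.f.; the samples are assumed to contain no ties, in keeping with the continuity assumption.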
Remarks.  It will be interesting to investigate the asymptotic power of the test against some specific alternatives.  The general alternative in mind behind the test is

    F_i(x) = F(x - a_i) ,  where the a_i are not all equal

(F \equiv F_1, G \equiv F_2, \ldots, H \equiv F_c).  In this respect, it is similar to Kruskal's test or Mood's test for c samples.  In a way, it is similar to Mosteller's test, but his test is against alternatives where one population is shifted to the right, and correspondingly his test-statistic is also with reference to one particular sample.  On the other hand, our statistic is symmetric with respect to all the samples and hence covers much more general alternatives.

Kruskal [16] says, "Unfortunately, for the H test as for many nonparametric tests the power is difficult to investigate and little is yet known about it."  Recently, Andrews [1] investigated the power of Kruskal's test and Mood's test for c samples and concluded that the asymptotic efficiency of Kruskal's test relative to Mood's test for c samples is \ge 1, depending on the alternatives.  It will be interesting to carry out similar studies on this test with respect to these two tests.  It is expected that the same type of conclusion will be reached in view of the very nature of such nonparametric problems.
CHAPTER V
SOME REGRESSION AND BIVARIATE PROBLEMS
5.1
Introduction
Mood and Brown [20] have considered some simple regression problems.  On the basis of a sample of n observations (x_i, y_i), where x is in the nature of a concomitant variable and y, given x, is a continuous variate whose median is of the form \alpha + \beta x, where \alpha and \beta are unknown parameters, they consider the problem of estimating \alpha and \beta and testing hypotheses about them.  They also discuss briefly the general linear regression under this nonparametric set-up.

In this chapter we shall extend their methods to discuss some additional regression problems.  Next we shall consider some bivariate analysis of variance problems.  We
shall use the "step-down procedure" [28,30] to reduce the
problem to univariate cases with the other variate as a
concomitant variate.
The regression methods developed
earlier will be used here in these bivariate problems. The
method seems to be perfectly general and could be extended
to three or more variates--that is, to the general multivariate situation.
Most of the tests are offered on
heuristic considerations.
They are expected to be "good"
for large samples.
5.2  Some regression problems
We shall first state a lemma [20] which will be useful for later applications.

Lemma 5.2.1.  Let

    g(m_1, m_2, \ldots, m_k) = \prod_{i=1}^{k} \binom{n_i}{m_i} \Big/ \binom{n}{m} ,

where n = \sum_{i=1}^{k} n_i and m = \sum_{i=1}^{k} m_i, denote the density function for the m_i's.  Then

(5.2.1)    \chi^2 = \frac{n(n-1)}{m(n-m)} \sum_{i=1}^{k} \frac{1}{n_i}\Big(m_i - \frac{n_i m}{n}\Big)^2

has an asymptotic \chi^2 distribution with k - 1 d.f. for large n.

Mood [20] says, "The expression (5.2.1) has a distribution very accurately approximated by the chi-square distribution with k - 1 d.f. even if n is only of the order of twenty provided all the n_i
are at least five."
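The \chi^2 expression of Lemma 5.2.1 is easy to compute directly; the sketch below is our own helper (illustrative name), taking the counts m_i and the class sizes n_i.

```python
# Direct transcription of the chi-square expression in Lemma 5.2.1: m_i of
# the n_i observations of class i fall in a distinguished category, with
# n = sum n_i and m = sum m_i; the statistic has k - 1 d.f.
def mood_chi_square(m_counts, n_sizes):
    """n(n-1)/(m(n-m)) * sum (1/n_i)(m_i - n_i m/n)^2."""
    n = sum(n_sizes)
    m = sum(m_counts)
    factor = n * (n - 1) / (m * (n - m))
    return factor * sum((m_i - n_i * m / n) ** 2 / n_i
                        for m_i, n_i in zip(m_counts, n_sizes))
```

All the later test-statistics of this chapter are instances of this expression with particular counts substituted.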
5.2.1  One sample

Let (x_1, y_1), \ldots, (x_n, y_n) denote a sample of n observations.  We shall assume that

(a) the distribution of y for any x is continuous and identical apart from a shift or translation, and

(b) the regression is linear, that is, the location parameter (usually the median) of y, given x, is \alpha + \beta x, where \alpha and \beta are unknown parameters.

To estimate \alpha and \beta, Mood and Brown [20] suggest that the estimates \hat\alpha and \hat\beta should be determined by

(5.2.3)    Median of (y_i - \hat\alpha - \hat\beta x_i) = 0  for  x_i \le \tilde x ,

(5.2.4)    Median of (y_i - \hat\alpha - \hat\beta x_i) = 0  for  x_i > \tilde x ,

where \tilde x is the median of the x_i's.  If it happens that several x values fall at \tilde x, then the sign \le in (5.2.3) and the sign > in (5.2.4) may be replaced by < and \ge if such a replacement would more nearly divide the points into groups of equal size.  They also give an iteration procedure to determine \hat\alpha and \hat\beta.

We shall find it convenient to speak of x_i \le \tilde x as group one and of x_i > \tilde x as group two.  Then (5.2.3) and (5.2.4) may be equivalently written as

(5.2.5)    \hat\alpha = \text{Median}\,(y_i - \hat\beta x_i)

and

(5.2.6)    \text{Median}_I\,(y_i - \hat\beta x_i) = \text{Median}_{II}\,(y_i - \hat\beta x_i) ,

where I and II stand for groups one and two, respectively.
Test for \beta = \beta_0.  If \beta = \beta_0, \alpha is estimated by \hat\alpha = \text{Median}\,(y_i - \beta_0 x_i).  Mood considers the numbers of points, say m_1 and m_2, above the line y = \hat\alpha + \beta_0 x in each group.  Let us, for convenience, assume that n is even.  Then the probability density of m_1 and m_2 is given by

(5.2.7)    p(m_1, m_2) = \binom{n/2}{m_1}\binom{n/2}{m_2} \Big/ \binom{n}{n/2} ,

so that, by Lemma 5.2.1, Mood obtains

(5.2.8)    \chi^2 = \frac{16}{n}\Big(m_1 - \frac{n}{4}\Big)^2 ,    d.f. = 1 ,

as the test-statistic.  It may be seen that the supposition that n be even may be relaxed.
We may arrive at (5.2.8) on some heuristic considerations.  Assuming n is even, as before, we have n/2 points in each group and we note that m_1 + m_2 = n/2.  If the hypothesis is true, we expect m_1 to be approximately n/4.  Now m_1 is equal to the number of positive y_i - \hat\alpha - \beta_0 x_i's from the first group and, similarly, for m_2.  Now, the y_i - \alpha - \beta_0 x_i's have identical distribution and, also, \hat\alpha - \alpha \to_P 0 (in probability) as n \to \infty, so that, on heuristic considerations,

    p(m_1, m_2) \approx \binom{n/2}{m_1}\binom{n/2}{m_2} \Big/ \binom{n}{n/2}

for large n and, by Lemma 5.2.1, we again have the asymptotic \chi^2 statistic
given by (5.2.8).
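The whole test for \beta = \beta_0 can be sketched in a few lines, assuming the median-based estimate of \alpha described above.  The code is our own (names illustrative); ties with the fitted line or with the median of the x's are not treated carefully here.

```python
# Sketch of the test for beta = beta0: estimate a by the median of
# y_i - beta0*x_i, split the points at the median of the x's, and count the
# first-group points above the line y = a + beta0*x.
def median(values):
    s = sorted(values)
    k = len(s)
    return s[k // 2] if k % 2 else 0.5 * (s[k // 2 - 1] + s[k // 2])

def beta_test_statistic(x, y, beta0):
    """Chi-square (5.2.8): (16/n) * (m1 - n/4)^2, 1 d.f."""
    n = len(x)
    a_hat = median([yi - beta0 * xi for xi, yi in zip(x, y)])
    x_med = median(x)
    m1 = sum(1 for xi, yi in zip(x, y)
             if xi <= x_med and yi > a_hat + beta0 * xi)
    return 16.0 / n * (m1 - n / 4.0) ** 2
```

Large values of the statistic, referred to \chi^2 with 1 d.f., discredit the hypothesized slope \beta_0.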
If we are willing to assume, in addition, that

(c) the mean and variance of y exist for any x,

then, taking the mean as the location parameter given by \alpha + \beta x, \alpha and \beta can be immediately estimated by the usual least squares estimators.  If \beta = \beta_0, then also \tilde\alpha = \bar y - \beta_0 \bar x, where \bar y is the mean of the y's and, similarly, \bar x of the x's.  In the above case, \tilde\alpha - \alpha \to_P 0.  In this case, if b denotes the number of points above the regression line, we have by a similar heuristic argument

(5.2.9)    p(m_1, m_2) \approx \binom{n/2}{m_1}\binom{n/2}{m_2} \Big/ \binom{n}{b}

for large n, where m_1 and m_2 are defined as before.  Hence, by Lemma 5.2.1, we have an alternate test-statistic

(5.2.10)   \chi^2 \approx \frac{4n}{b(n-b)}\Big(m_1 - \frac{b}{2}\Big)^2 ,    d.f. = 1 .
Consistency of \hat\alpha and \hat\beta determined by (5.2.5) and (5.2.6).  Let z_i = y_i - \alpha - \beta x_i.  Then the z_i's have identical distribution with median zero.  Now (5.2.6) may be written as

(5.2.11)   \text{Median}_I\,[z_i + (\beta - \hat\beta)x_i] = \text{Median}_{II}\,[z_i + (\beta - \hat\beta)x_i] .

Now as n \to \infty, |\text{Median}_I(z_i) - \text{Median}_{II}(z_i)| \to_P 0, so that intuitively it seems that \hat\beta \approx \beta will satisfy (5.2.11), that is, |\hat\beta - \beta| \to_P 0.  It has not been possible yet to give a general formal proof.  We shall give a formal proof for a special case, when the x's are constants at our disposal, so that they can be chosen suitably.
Proof for the special case.  Let \tilde x_n, \theta_{1n} and \theta_{2n} denote the median of the x's, \text{Median}_I(z_i) and \text{Median}_{II}(z_i), respectively, when the sample size is 2n.  Let us suppose that for n > n_0, (i) the x's are chosen alternately from group I (x \le \tilde x_{n_0}) and from group II (x > \tilde x_{n_0}), so that \tilde x_n = \tilde x_{n_0}, and (ii) all the x's in group II are greater than or equal to \tilde x_{n_0} + b, where b is a fixed positive number, however small.  [For example, b may be in the nature of a unit of measurement.]

Since \theta_{1n} \to_P 0 and \theta_{2n} \to_P 0, given \eta, \varepsilon > 0, there is n_1 such that

(5.2.12)   |\theta_{1n}| < \varepsilon  and  |\theta_{2n}| < \varepsilon

with probability greater than 1 - \eta for n \ge n_1.  Let \beta - \hat\beta_n = \delta_n.

Case (1):  Suppose \delta_n > 0.  Then

    \text{Median}_I\,[z_i + \delta_n x_i] \le \theta_{1n} + \delta_n \tilde x_{n_0} ,
    \text{Median}_{II}\,[z_i + \delta_n x_i] \ge \theta_{2n} + \delta_n (\tilde x_{n_0} + b) .

Then (5.2.11) \Rightarrow \theta_{2n} + \delta_n(\tilde x_{n_0} + b) \le \theta_{1n} + \delta_n \tilde x_{n_0}, so that

    \delta_n b \le \theta_{1n} - \theta_{2n} < 2\varepsilon

from (5.2.12).  Hence |\delta_n| = \delta_n < 2\varepsilon/b = \varepsilon', say.

Case (2):  Suppose \delta_n \le 0.  Then

    \text{Median}_I\,[z_i + \delta_n x_i] \ge \theta_{1n} - |\delta_n| \tilde x_{n_0} ,
    \text{Median}_{II}\,[z_i + \delta_n x_i] \le \theta_{2n} - |\delta_n| (\tilde x_{n_0} + b) .

Again, (5.2.11) \Rightarrow \theta_{1n} - |\delta_n| \tilde x_{n_0} \le \theta_{2n} - |\delta_n|(\tilde x_{n_0} + b), so that

    |\delta_n| b \le \theta_{2n} - \theta_{1n} < 2\varepsilon

from (5.2.12).  Hence, again, |\delta_n| < 2\varepsilon/b = \varepsilon'.

Thus, given \eta and \varepsilon' > 0, there is n* = max(n_0, n_1) such that |\delta_n| < \varepsilon' with probability > 1 - \eta for n > n*.

It may be seen that, for large n, (5.2.8) is approximately the corresponding expression obtained by Lemma 5.2.1 if (5.2.7) were replaced by the analogous density with groups of sizes [(n+1)/2] and [n/2], where n, the sample size, is odd.  Hence, in the previous argument, for n \ge n_0, we may take group I as x \le \tilde x_{n_0}, as before, and group II as x > \tilde x_{n_0}, with sample size 2n + 1, and the previous argument goes through.  Thus \delta_n \to_P 0, that is, \hat\beta_n \to_P \beta, and the proof is complete for the special case mentioned above.
Remark:  It may be seen that this proof hinges on the existence of b.  We may note that if we decide x \le x_0 as group I and x > x_0 as group II (even though x_0 is not the median of the x's), then the test statistic (5.2.8) can be modified suitably and the above proof does not require condition (i).
Consistency of \hat\alpha.  Now

    \hat\alpha = \text{Median}\,(y_i - \hat\beta x_i) = \alpha + \text{Median}\,[z_i + (\beta - \hat\beta)x_i] .

Let us assume that the x's are bounded (at least bounded with probability \ge 1 - \eta_1).  Suppose |x_i| < M for all i.  Since \hat\beta \to_P \beta, given \varepsilon, \eta > 0, there is n* such that |\hat\beta - \beta| < \varepsilon/M for all n \ge n*, with probability > 1 - \eta.  Then

    \text{Median}\,(z_i) - \varepsilon \le \text{Median}\,[z_i + (\beta - \hat\beta)x_i] \le \text{Median}\,(z_i) + \varepsilon

for n > n*, with probability > 1 - \eta.  Also \text{Median}\,(z_i) \to_P 0, so that \hat\alpha \to_P \alpha.

Henceforth we shall assume that condition (ii) is obeyed, so that \hat\beta and \hat\alpha
are consistent.
5.2.2  c samples

Let us suppose that we have n_i independent observations (x_{ij}, y_{ij}), j = 1, 2, \ldots, n_i, from the i-th population, i = 1, 2, \ldots, c.  We shall assume (a) as before and (b) that the regression is linear, that is, the location parameter (usually the median) of y_{ij}, given x_{ij}, is \alpha_i + \beta_i x_{ij}.

(i)  To test \beta_i = \beta_{i0}, i = 1, 2, \ldots, c.  We shall have c independent \chi^2 statistics with 1 d.f. each, giving a \chi^2 statistic with c d.f.  No new problem is presented here.

(ii)  To test \beta_1 = \beta_2 = \cdots = \beta_c.  On this hypothesis, the y_{ij}'s have medians \alpha_i + \beta x_{ij}.  We may estimate the \alpha_i's and \beta by

    \hat\alpha_i = \text{median}_{j=1,2,\ldots,n_i}\,(y_{ij} - \hat\beta x_{ij})

and

    \text{median}_I\,(y_{ij} - \hat\alpha_i - \hat\beta x_{ij}) = \text{median}_{II}\,(y_{ij} - \hat\alpha_i - \hat\beta x_{ij}) .

For convenience, we shall take group I as x \le \tilde x (the median of all the x's) and group II as x > \tilde x, though the test-statistics can be modified to suit other cases.
Let \sum_{i=1}^{c} n_i = N.  For simplicity, let us take the n_i to be even.  Let m_i be the number of points from the i-th sample belonging to the second group and t_i be the number of points out of these m_i that lie above y = \hat\alpha_i + \hat\beta x.  Then \sum_1^c m_i = N/2 and \sum_1^c t_i = N/4.  If the hypothesis is true, we expect t_i to be approximately m_i/2.

Let t_i' be the number of observations from the second group of the i-th sample such that z_{ij} = y_{ij} - \alpha_i - \beta x_{ij} is > 0.  Since \hat\alpha_i - \alpha_i \to_P 0 and \hat\beta - \beta \to_P 0, (t_i - t_i')/n_i \to_P 0 as the n_i's \to \infty.  Therefore, heuristically, the t_i's have the same distribution for large n_i's as the t_i''s subject to \sum_1^c t_i' = N/4.  Since the z_{ij}'s have identical distribution,

    p(t_1, \ldots, t_c) \approx \prod_{i=1}^{c} \binom{m_i}{t_i} \Big/ \binom{N/2}{N/4}

for large n_i's.  Hence, by Lemma 5.2.1, we have

    \chi^2 \approx 4 \sum_{i=1}^{c} \frac{1}{m_i}\Big(t_i - \frac{m_i}{2}\Big)^2 ,    d.f. = c - 1 .

If some m_i = 0, the corresponding term will be absent and the d.f. will be reduced by one.  We could have considered group I instead of group II.  It may be seen now that the condition that the n_i
be even may be relaxed.
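Given the counts t_i and m_i defined above, the statistic is immediate.  The sketch below is ours; it also drops any sample with m_i = 0, as the text prescribes.

```python
# Sketch of the statistic for testing equal slopes across c samples: t_i of
# the m_i second-group points of sample i lie above its fitted line; samples
# with m_i = 0 simply drop out, reducing the degrees of freedom by one each.
def slope_equality_chi_square(t_counts, m_counts):
    """4 * sum (1/m_i)(t_i - m_i/2)^2, at most c - 1 d.f."""
    return 4.0 * sum((t - m / 2.0) ** 2 / m
                     for t, m in zip(t_counts, m_counts) if m > 0)
```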
If we are willing to assume, in addition, (c) as before, then we may take the least squares estimates

    \tilde\alpha_i = \bar y_i - \tilde\beta \bar x_i ,    \tilde\beta = \sum_i \sum_j (y_{ij} - \bar y_i)x_{ij} \Big/ \sum_i \sum_j (x_{ij} - \bar x_i)^2 ,

so that \tilde\alpha_i \to_P \alpha_i and \tilde\beta \to_P \beta.  If t_i denotes the number of points from the i-th sample above the corresponding regression line and \sum_1^c t_i = t, then by a similar heuristic argument,

    p(t_1, \ldots, t_c) \approx \prod_{i=1}^{c} \binom{n_i}{t_i} \Big/ \binom{N}{t}

for large n_i's, so that by Lemma 5.2.1

    \chi^2 \approx \frac{N^2}{t(N-t)} \sum_{i=1}^{c} \frac{1}{n_i}\Big(t_i - \frac{n_i t}{N}\Big)^2 ,    d.f. = c - 1 .

(iii)  To test \alpha_1 = \alpha_2 = \cdots = \alpha_c, given \beta_1 = \beta_2 = \cdots = \beta_c.  On this hypothesis, the y_{ij}'s have medians \alpha + \beta x_{ij}, and \alpha and \beta may be estimated by

    \hat\alpha = \text{med}\,(y_{ij} - \hat\beta x_{ij})

and

    \text{med}_I\,(y_{ij} - \hat\beta x_{ij}) = \text{med}_{II}\,(y_{ij} - \hat\beta x_{ij}) ,

where, for convenience, we take groups I and II as x \le \tilde x (the median of all the x's) and x > \tilde x, respectively.  Let N be even and t_i be the number of points in the i-th sample above the regression line y = \hat\alpha + \hat\beta x.  If the hypothesis is true, we expect t_i to be approximately n_i/2.  We note that \sum_1^c t_i = N/2.

Let t_i' denote the number of positive terms in z_{ij} = y_{ij} - \alpha - \beta x_{ij} (j = 1, 2, \ldots, n_i).  Since \hat\alpha \to_P \alpha and \hat\beta \to_P \beta, (t_i - t_i')/n_i \to_P 0 as N \to \infty.  Hence, by similar heuristic arguments, the distribution of the t_i's for large N is approximately the same as that of the t_i''s subject to \sum_1^c t_i' = N/2.  Hence,

(5.2.15)   p(t_1, \ldots, t_c) \approx \prod_{i=1}^{c} \binom{n_i}{t_i} \Big/ \binom{N}{N/2}

for large N, so that by Lemma 5.2.1

(5.2.16)   \chi^2 = 4 \sum_{i=1}^{c} \frac{1}{n_i}\Big(t_i - \frac{n_i}{2}\Big)^2 ,    d.f. = c - 1 .

If we are willing to assume, in addition, (c), that is, the existence of the mean and variance, then we can have least-squares estimates \tilde\alpha and \tilde\beta such that \tilde\alpha \to_P \alpha and \tilde\beta \to_P \beta.  If we denote \sum_1^c t_i by d, then by the same heuristic argument,

    p(t_1, \ldots, t_c) \approx \prod_{i=1}^{c} \binom{n_i}{t_i} \Big/ \binom{N}{d}

for large N, so that

    \chi^2 = \frac{N^2}{d(N-d)} \sum_{i=1}^{c} \frac{1}{n_i}\Big(t_i - \frac{n_i d}{N}\Big)^2 ,    d.f. = c - 1 .
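A minimal sketch (ours) of (5.2.16), taking the counts t_i of points above the common fitted line and the sample sizes n_i:

```python
# Sketch of (5.2.16): with a common fitted median line, t_i of the n_i points
# of sample i lie above it, and sum t_i = N/2 for the median-based estimates.
def intercept_equality_chi_square(t_counts, n_sizes):
    """4 * sum (1/n_i)(t_i - n_i/2)^2, c - 1 d.f."""
    return 4.0 * sum((t - n / 2.0) ** 2 / n
                     for t, n in zip(t_counts, n_sizes))
```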
We shall indicate here briefly a formal proof for (5.2.15), which was first derived on heuristic considerations.  Let U_{ij} = Y_{ij} - \hat\beta X_{ij}.  Then

    t_i = number of positive Y_{ij} - \hat\alpha - \hat\beta X_{ij}  (j = 1, 2, \ldots, n_i)
        = number of U_{ij}'s > \hat\alpha = \text{median}_{i,j}\,(U_{ij}) .

Also \sum_1^c t_i = N/2.  Let z_a be the a-th (a = N/2) in magnitude.  Then the joint density function of t_1, \ldots, t_c and z_a, under the hypothesis, is

(5.2.17)   \sum F_{11}(z_a) \cdots F_{1,n_1-t_1}(z_a)\,[1 - F_{1,n_1-t_1+1}(z_a)] \cdots ,

where F_{ij}(z_a) = \Pr[U_{ij} \le z_a], the i-th term indicates that z_a is from the i-th sample, and \sum denotes the sum over all possible combinations.

Since \hat\beta \to_P \beta, given \varepsilon, \eta > 0, there is N_0 such that, for N \ge N_0, |\hat\beta - \beta| < \varepsilon with probability > 1 - \eta.  Then, for N \ge N_0, with probability > 1 - \eta, we have

    \Pr[Y_{ij} - \beta X_{ij} \le z_a - \varepsilon x_{ij}] \le F_{ij}(z_a) \le \Pr[Y_{ij} - \beta X_{ij} \le z_a + \varepsilon x_{ij}] ,

that is,

    F(z_a - \varepsilon x_{ij}) \le F_{ij}(z_a) \le F(z_a + \varepsilon x_{ij}) ,

where F denotes the distribution function of all the Y_{ij} - \beta X_{ij}'s.  In view of the continuity of F,

    F_{ij}(z_a) = F(z_a) + \delta_{ij} ,

where the \delta's are arbitrarily small and tend to zero as N \to \infty.  Then, apart from terms of order \delta, each term of (5.2.17) equals F^{(N/2)-1}(z_a)\,[1 - F(z_a)]^{N/2}\,dF(z_a).  On integrating out z_a, we have the joint density of t_1, t_2, \ldots, t_c:

    \sum_{i=1}^{c} \Big\{\frac{n_i!}{t_i!\,(n_i - t_i - 1)!} \prod_{j \ne i} \binom{n_j}{t_j}\Big\} \int_0^1 x^{(N/2)-1}(1-x)^{N/2}\,dx + O(\delta)
      = \prod_{i=1}^{c} \binom{n_i}{t_i} \Big/ \binom{N}{N/2} + O(\delta) ,

which is the same as (5.2.15).
(iv)  To test \beta = 0, when \beta_1 = \beta_2 = \cdots = \beta_c = \beta, say.  On this hypothesis, the y_{ij}'s have medians \alpha_i's.  We may take

    \hat\alpha_i = \text{median}_{j=1,2,\ldots,n_i}\,(y_{ij}) .

For simplicity, let the n_i be even.  Then n_i/2 points from the i-th sample are above the corresponding line.  Also, N/2 points are to the right of \tilde x, the median of all the x's.  Let t_i be the number of points from the i-th sample to the right of \tilde x and above the corresponding line, and let t = \sum_1^c t_i.  We expect, then, t to be approximately N/4.  Let the m_i's and m be defined similarly for x \le \tilde x.  Then, by the same heuristic argument, for which a formal proof could be given as in (iii), we have

    p(t, m) \approx \binom{N/2}{t}\binom{N/2}{m} \Big/ \binom{N}{N/2}

for large N and, hence, by Lemma 5.2.1, we have

(5.2.18)   \chi^2 \approx \frac{16}{N}\Big(t - \frac{N}{4}\Big)^2 ,    d.f. = 1 .

The condition that the n_i be even, then, may be relaxed.

5.2.3  Testing linearity of regression
As in the normal analysis, it is necessary that we have a number of observations for each x_i.  Let the observations be (x_i, y_{ij}), j = 1, 2, \ldots, n_i, i = 1, 2, \ldots, k.  We shall assume that the distribution of y, given x, is continuous and the same apart from location, say h(x), which may depend on x.  We want to test the hypothesis that the "regression" is linear, that is,

    h(x) = \alpha + \beta x .

Let \sum_1^k n_i = N and these N observations be divided into two groups, say x \le x_{k_1} forming the first group and x > x_{k_1} forming the second group, as evenly as possible.  Let us suppose that the observations corresponding to x_i (i = 1, 2, \ldots, k_1) belong to the first group and the rest to the second.  Let the groups contain a and N - a observations, respectively.  We may then estimate \alpha and \beta by

    \hat\alpha = \text{med}\,(y_{ij} - \hat\beta x_i)  and  \text{med}_I\,(y_{ij} - \hat\beta x_i) = \text{med}_{II}\,(y_{ij} - \hat\beta x_i) .

Consider the n_i observations corresponding to x_i.  If the regression is linear, we expect these n_i to be split evenly by the regression line y = \hat\alpha + \hat\beta x.  Let t_i, out of these n_i, be above the line.  We expect t_i to be approximately n_i/2.  Then

    \sum_{i=1}^{k_1} t_i = \frac{a}{2}  and  \sum_{i=k_1+1}^{k} t_i = \frac{N-a}{2} ,

assuming for convenience that a and N - a are even.  Let z_{ij} = y_{ij} - \alpha - \beta x_i.  Then, on the null hypothesis, the z_{ij}'s have identical distribution.  Let t_i' be the number of positive terms in z_{ij} (j = 1, 2, \ldots, n_i).  Since \hat\alpha \to_P \alpha and \hat\beta \to_P \beta, (t_i - t_i')/n_i \to_P 0.  Hence, on heuristic considerations as before, the distribution of the t_i's is the same (asymptotically) as that of the t_i''s subject to

    \sum_{i=1}^{k_1} t_i' = \frac{a}{2}  and  \sum_{i=k_1+1}^{k} t_i' = \frac{N-a}{2} .

Thus, by Lemma 5.2.1,

    \chi_1^2 \approx 4 \sum_{i=1}^{k_1} \frac{1}{n_i}\Big(t_i - \frac{n_i}{2}\Big)^2 ,    d.f. = k_1 - 1 ,

    \chi_2^2 \approx 4 \sum_{i=k_1+1}^{k} \frac{1}{n_i}\Big(t_i - \frac{n_i}{2}\Big)^2 ,    d.f. = k - k_1 - 1 ,

so that

    \chi^2 = \chi_1^2 + \chi_2^2 ,    d.f. = k - 2 .
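The pooling of the two group chi-squares can be sketched as follows (our own code; t_i of the n_i observations at x_i lie above the fitted line, and the first k_1 abscissas form group one):

```python
# Sketch of the linearity test of 5.2.3: returns the pooled chi-square
# chi_1^2 + chi_2^2 together with its degrees of freedom k - 2.
def linearity_chi_square(t_counts, n_sizes, k1):
    chi_1 = 4.0 * sum((t - n / 2.0) ** 2 / n
                      for t, n in zip(t_counts[:k1], n_sizes[:k1]))
    chi_2 = 4.0 * sum((t - n / 2.0) ** 2 / n
                      for t, n in zip(t_counts[k1:], n_sizes[k1:]))
    return chi_1 + chi_2, len(n_sizes) - 2
```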
5.3  Some bivariate problems

5.3.1  One-way classification

Let there be n_i independent observations (x_{ij}, y_{ij}), j = 1, 2, \ldots, n_i, from the i-th population, i = 1, 2, \ldots, k, and let \sum_1^k n_i = N.  Suppose F_i(x, y) denotes the distribution function of (X, Y) for the i-th population.  We shall assume that

(i)  the F's are continuous,

(ii)  the distributions are identical except for location, and

(iii)  the median of the conditional distribution of Y, given X, is a linear function of X.

Let f_i(x, y), f_i(x) and f_i(y|x) denote the densities of (X, Y), X and Y|X, respectively.  We note that (i) \Rightarrow the conditional probability, given X, is also a probability measure.  Also, (ii) \Rightarrow

(5.3.1)    F_i(x, y) = F(x - \xi_i, y - \eta_i) .

We want to test whether the populations are identical:

(5.3.2)    H_0 :  \xi_1 = \xi_2 = \cdots = \xi_k  and  \eta_1 = \eta_2 = \cdots = \eta_k .

Now (5.3.1) \Rightarrow f_i(x, y) = f(x - \xi_i, y - \eta_i), so that

    f_i(x) = f_1(x - \xi_i) ,  say,

and (iii) \Rightarrow

    f(x, y) = f_1(x)\,f_2(y - \alpha - \beta x) ,  say,

so that

    f(x - \xi_i, y - \eta_i) = f_1(x - \xi_i)\,f_2[y - \eta_i - \alpha - \beta(x - \xi_i)]
                             = f_1(x - \xi_i)\,f_2(y - \alpha_i - \beta x) ,  say.

Thus, we see that

    H_0 \iff \xi_1 = \xi_2 = \cdots = \xi_k  and  \alpha_1 = \alpha_2 = \cdots = \alpha_k .

It may be noted that we have relaxed just the normality of the distribution, but retained the other features of the classical set-up.

We shall use a step-down procedure to test H_0.  A step-down procedure for H_0 with a level \gamma will be a test for

    H_{0x} :  \xi_1 = \xi_2 = \cdots = \xi_k

with a level \gamma_1 and, if it is not rejected, a further test for

    H_{0,y|x} :  \alpha_1 = \alpha_2 = \cdots = \alpha_k

with a level \gamma_2, where \gamma_1 and \gamma_2 are chosen suitably so that

    (1 - \gamma) = (1 - \gamma_1)(1 - \gamma_2) .

The test for H_{0,y|x} will be derived from the conditional distribution of Y, given x, so that the x's then can
be regarded as fixed.
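The allocation of levels in the step-down procedure follows (1 - \gamma) = (1 - \gamma_1)(1 - \gamma_2); a small helper (our own, illustrative) solves this for the second-stage level:

```python
# Given the overall level gamma and the first-stage level gamma1 of the
# step-down procedure, return the second-stage level gamma2 satisfying
# (1 - gamma) = (1 - gamma1)(1 - gamma2).
def second_stage_level(gamma, gamma1):
    return 1.0 - (1.0 - gamma) / (1.0 - gamma1)
```

For example, an overall level of 0.0975 with a first-stage level of 0.05 leaves 0.05 for the second stage.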
For H_{0x}, we consider only the x's.  Let us consider the test given by Mood [20].  [We could have used either Kruskal's test or the test derived in Chapter IV.]  Let m_i denote the number of observations in the i-th sample greater than the median of all the x's.  Mood shows that the density function, if H_{0x} is true, is

(5.3.5)    p(m_1, \ldots, m_k) = \prod_{i=1}^{k} \binom{n_i}{m_i} \Big/ \binom{N}{a} ,

where a = N/2 if N is even or (N-1)/2 if N is odd.  The test-statistic proposed by him for large N is

(5.3.6)    \chi^2 = \frac{N(N-1)}{a(N-a)} \sum_{i=1}^{k} \frac{1}{n_i}\Big(m_i - \frac{n_i a}{N}\Big)^2 ,    d.f. = k - 1 .

For small n's, the probability is computed from the exact distribution (5.3.5).

The test for H_{0,y|x} is seen to be precisely the same as that considered in 5.2.2.  Hence we may take (5.2.16) (in its modified form) as a test-statistic, if the condition (ii) mentioned on page 102 holds good.  As already stated, it may be possible to prove that \hat\alpha \to_P \alpha without using condition (ii), in which case (5.2.16) may be used for large samples in general.
5.3.2  Two-way classification

For simplicity, we shall consider only the case of one observation per cell, when the design is complete.  Let "i" denote "treatments" and "j" denote "blocks," i = 1, 2, \ldots, t, j = 1, 2, \ldots, b, and N = bt.  Suppose F_{ij}(x, y) denotes the distribution function of (X, Y) for the (ij)-th cell.  We shall assume that

(i)  F_{ij}(x, y) is continuous,

(ii)  the distributions are identical except for location, that is,

    F_{ij}(x, y) = F(x - a_{ij}, y - \mu_{ij}) ,

(iii)  the model is additive, that is,

    a_{ij} = \xi_i + \eta_j  and  \mu_{ij} = \gamma_i + \delta_j ,  and

(iv)  the "regression" of Y on X is linear.

As before, we notice that we have relaxed just the normality of the distribution while retaining the other features of the classical set-up.

Let f_{ij}(x, y), f_{ij}(x) and f_{ij}(y|x) denote the densities of (X, Y), X and Y|X, respectively.  Then (ii) \Rightarrow

    f_{ij}(x, y) = f(x - a_{ij}, y - \mu_{ij})  and  f_{ij}(x) = f_1(x - a_{ij}) ,  say,

and (iv) \Rightarrow

    f(x, y) = f_1(x)\,f_2(y - \alpha - \beta x) ,

so that

(5.3.7)    f_{ij}(x, y) = f_1(x - a_{ij})\,f_2[y - \mu_{ij} - \alpha - \beta(x - a_{ij})]
         = f_1(x - \xi_i - \eta_j)\,f_2[y - \alpha - \gamma_i + \beta\xi_i - \delta_j + \beta\eta_j - \beta x] .

We will be interested in the usual hypothesis

    H_0 :  \xi_1 = \xi_2 = \cdots = \xi_t  and  \gamma_1 = \gamma_2 = \cdots = \gamma_t .
We shall consider a step-down procedure to test H_0.  Considering the x's separately, we can test, at a level a_1,

    H_{0x} :  \xi_1 = \xi_2 = \cdots = \xi_t

by the criterion, given by Mood [20],

    \chi^2 = \frac{t(t-1)}{ba(t-a)} \sum_{i=1}^{t} \Big(m_i - \frac{ab}{t}\Big)^2 ,    d.f. = t - 1 ,

where a = t/2 if t is even or (t-1)/2 otherwise, and m_i = the number of x_{ij}'s (j = 1, 2, \ldots, b) greater than \tilde x_j, the median of the j-th column.  Then, considering the conditional distribution of the y_{ij}'s, given the x_{ij}'s, we have to test

(5.3.9)    H_{0,y|x} :  the y_{ij}'s have medians \lambda_j + \beta x_{ij}

at a level a_2, so that

    (1 - a) = (1 - a_1)(1 - a_2) .

We may estimate the \lambda_j's and \beta by

    \hat\lambda_j = \text{median}_{i=1,2,\ldots,t}\,(y_{ij} - \hat\beta x_{ij})

and

    \text{med}_I\,(y_{ij} - \hat\lambda_j - \hat\beta x_{ij}) = \text{med}_{II}\,(y_{ij} - \hat\lambda_j - \hat\beta x_{ij}) ,

where the groups are with respect to the x's as usual.  We note that a, defined as above, of the t terms y_{ij} - \hat\lambda_j - \hat\beta x_{ij} for each j are positive, and hence in all ab out of the bt are positive.  Let t_i denote the number of positive terms out of the b terms y_{ij} - \hat\lambda_j - \hat\beta x_{ij}, for given i.  Then we expect t_i to be approximately ab/t if (5.3.9) is true.  Also, \sum_{i=1}^{t} t_i = ab.

Let t_i' denote the number of positive terms out of the y_{ij} - \lambda_j - \beta x_{ij}'s, for given i.  On heuristic considerations, for large samples \hat\lambda_j \to_P \lambda_j and \hat\beta \to_P \beta, so that the distribution of the t_i's is asymptotically the same as that of the t_i''s subject to \sum t_i' = ab.  Hence,

    p(t_1, \ldots, t_t) \approx \prod_{i=1}^{t} \binom{b}{t_i} \Big/ \binom{N}{ab}

for large N (N = bt), so that by Lemma 5.2.1,

(5.3.10)   \chi^2 \approx \frac{N(N-1)}{ab(N-ab)} \sum_{i=1}^{t} \frac{1}{b}\Big(t_i - \frac{ab}{t}\Big)^2 ,    d.f. = t - 1 .

The same remark as that at the end of 5.3.1 will hold good here.  Also, it may seem that we require t large (since we require \hat\lambda_j \to_P \lambda_j in the above argument), but if we give a formal proof, similar to that given in 5.2.2 (iii), we shall note that \hat\beta \to_P \beta is sufficient to reduce the proof to the one given by Mood.  This does not require large t but only large b.
Hence (5.3.10) gives a test-criterion for large bt.
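The criterion for H_{0x} quoted from Mood above can be sketched as follows.  This is our own code; the normalization follows the criterion as reconstructed here, and ties with a column median are not treated.

```python
# Sketch of the H0x test for the two-way layout: m_i counts the treatment-i
# values above their block (column) medians.
def median(values):
    s = sorted(values)
    k = len(s)
    return s[k // 2] if k % 2 else 0.5 * (s[k // 2 - 1] + s[k // 2])

def two_way_x_chi_square(x):
    """x[i][j] for treatment i and block j; chi-square with t - 1 d.f."""
    t, b = len(x), len(x[0])
    a = t // 2                            # t/2 if t is even, (t-1)/2 if odd
    col_med = [median([x[i][j] for i in range(t)]) for j in range(b)]
    m = [sum(1 for j in range(b) if x[i][j] > col_med[j]) for i in range(t)]
    factor = t * (t - 1) / (b * a * (t - a))
    return factor * sum((m_i - a * b / t) ** 2 for m_i in m)
```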
BIBLIOGRAPHY
[1] Andrews, F. C., "Asymptotic Behavior of Some Rank Tests for Analysis of Variance," Annals of Mathematical Statistics, XXV (1954), 724-736.
[2] Barnard, G. A., "Significance Tests for 2 x 2 Tables," Biometrika, XXXIV (1947), 123-138.
[3] Bartlett, M. S., "Contingency Table Interactions," Supplement to the Journal of the Royal Statistical Society, II (1935), 248-252.
[4] Benard, A., and Ph. van Elteren, "A Generalization of
the Method of m Rankings," Proceedings,
Koninklijke Nederlandse Akademie van Wetenschappen, LVI (1953), 358-369.
[5] Bhapkar, V. P., "A Note on Confidence Bounds connected with ANOVA and MANOVA for Balanced and Partially Balanced Incomplete Block Designs," North Carolina Institute of Statistics Mimeograph Series.
[6] Bose, R. C., and D. M. Mesner, "On Linear Associative Algebras corresponding to Association Schemes of Partially Balanced Designs," Annals of Mathematical Statistics, XXX (1959), 21-38.
[7] Cramer, H., Mathematical Methods of Statistics, Princeton, Princeton University Press, 1946.
[8] Diamond, E. L., "Asymptotic Power and Independence of
Certain Classes of Tests on Categorical Data,"
North Carolina Institute of Statistics Mimeograph Series, No. 196, April 1958.
[9] Durbin, J., "Incomplete Blocks in Ranking Experiments,"
British Journal of Psychology, IV (1951), 85-90.
[10] Fisher, R. A., "On the Interpretation of Chi-square
from Contingency Tables and the Calculation of
p," Journal of the Royal Statistical Society,
LXXXV (1922), 87-94.
[11] ———, The Design of Experiments, Edinburgh, Oliver and Boyd Ltd., 1935.
[12] ———.
[13] Friedman, M., "The Use of Ranks to avoid the Assumption of Normality implicit in the Analysis of Variance," Journal of the American Statistical Association, XXXII (1937), 675-701.
[14] Hoeffding, W., "A Class of Statistics with Asymptotically Normal Distributions," Annals of Mathematical Statistics, XIX (1948), 293-325.
[15] Kruskal, W. H., "A Nonparametric Test for the Several
Sample Problem," Annals of Mathematical Statistics,
XXIII (1952), 525-540.
[16] ———, and W. A. Wallis, "Use of Ranks in One-Criterion Variance Analysis," Journal of the American Statistical Association, XLVII (1952), 583-621.
[17] Lehmann, E. L., "Consistency and Unbiasedness of Certain Non-parametric Tests," Annals of Mathematical Statistics, XXII (1951), 165-179.
[18] Massey, F. J., "A Note on a Two-sample Test," Annals of Mathematical Statistics, XXII (1951), 304-306.
[19] Mitra, S. K., "Contributions to the Statistical Analysis
of Categorical Data," North Carolina Institute
of Statistics Mimeograph Series, No. 142,
December 1955.
[20] Mood, A. M., Introduction to the Theory of Statistics,
New York, McGraw-Hill Book Co., 1950.
[21] Mosteller, F., "A k-sample Slippage Test for an Extreme
Population," Annals of Mathematical Statistics,
XIX (1948), 58-65.
[22] Neyman, J., "Contributions to the Theory of the χ² Test," Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, University of California Press (1949), 239-273.
[23] Ogawa, J., "On the Mathematical Principles underlying
the Theory of the X2 Test," North Carolina
Institute of Statistics Mimeograph Series,
No. 162, January 1957.
[24] Pearson, E. S., "The Choice of Statistical Tests
illustrated on the Interpretation of Data
Classed in a 2 x 2 Table," Biometrika, XXXIV
(1947), 139-167.
[25] Pearson, K., "On the Criterion that a Given System
of Deviations from the Probable in the Case of
a Correlated System of Variables is such that
it can be reasonably supposed to have arisen
from Random Sampling," Philosophical Magazine,
Series 5, L (1900), 157-172.
[26] Pitman, E.J.G., "Significance Tests which may be
applied to Samples from any Populations. III.
The Analysis of Variance Test," Biometrika,
XXIX (1937), 322-335.
[27] Reiersol, O., "Tests of Linear Hypotheses concerning Binomial Experiments," Skandinavisk Aktuarietidskrift, XXXVII (1954), 38-59.
[28] Roy, J., "Step-down Procedure in Multivariate Analysis,"
Annals of Mathematical Statistics, XXIX (1958),
1177-1187.
[29] Roy, S. N., Some Aspects of Multivariate Analysis,
New York, John Wiley and Sons, 1957.
[30] --------and Bargmann, R. E., "Tests of Multiple
Independence and the Associated Confidence
Bounds," Annals of Mathematical Statistics,
XXIX (1958), 491-503.
[31] Roy,
[32] Roy, S. N., and M. A. Kastenbaum, "On the Hypothesis
of No Interaction in a Multiway Contingency
Table," Annals of Mathematical Statistics,
XXVII (1956), 749-751.
[33] Roy, S. N., and S. K. Mitra, "An Introduction to Some
Nonparametric Generalizations of Analysis of
Variance and Multivariate Analysis," Biometrika,
XLIV (1956), 361-376.
[34] Terpstra, T. J., "The Exact Probability Distribution
of the T Statistic for Testing against Trend
and its Normal Approximation," Proceedings,
Koninklijke Nederlandse Akademie van Wetenschappen, LVI (1953), 433-437.
[35] ———, "A Nonparametric Test for the Problem of k Samples," Proceedings, Koninklijke Nederlandse Akademie van Wetenschappen, LVI (1954), 505-512.
[36] Wald, A., "Note on the Consistency of the Maximum Likelihood Estimate," Annals of Mathematical Statistics, XX (1949), 595-601.
[37] Welch, B. L., "On the z-test in Randomized Blocks and
Latin Squares," Biometrika, XXIX (1937), 21-52.
[38] Westenberg, J., "Significance Test for Median and Interquartile Range in Samples from Continuous Populations of any Form," Proceedings, Koninklijke Nederlandse Akademie van Wetenschappen, LI (1948), 252-261.
[39] Wilcoxon, F., "Individual Comparisons by Ranking Methods," Biometrics Bulletin (now Biometrics), I (1945), 80-83.