McCanless, Imogene (1983). Analysis of Covariance by Matching for the K-Sample Problem.

ANALYSIS OF COVARIANCE BY MATCHING FOR THE K-SAMPLE PROBLEM
by
Imogene McCanless
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1423
January 1983
ANALYSIS OF COVARIANCE BY MATCHING
FOR THE K-SAMPLE PROBLEM
by
Imogene McCanless
A Dissertation submitted to the faculty of the University of North
Carolina at Chapel Hill in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in the Department of
Biostatistics in the School of Public Health.
Chapel Hill
1982
Approved by:
Dana Quade
ABSTRACT
IMOGENE MCCANLESS. Analysis of Covariance by Matching for the
K-Sample Problem (under the direction of DANA QUADE).
Two methods of analysis of covariance by matching are presented for comparing K populations with respect to a continuous response variable while controlling for a discrete (possibly multivariate) covariable.

The first method presented, called the AMP sign test for K samples, is an extension of Schoenfelder's All Matched Pairs (AMP) sign test and the Kruskal-Wallis test, basing the test on a combination of sets of two-sample statistics. It is found to have Pitman asymptotic relative efficiency (ARE) of unity with respect to the Kruskal-Wallis test when the covariable is independent of the response variable.

The second method advanced, called the All Matched K-tuple (AMK) sign test, extends Schoenfelder's AMP sign test and Bhapkar's V-test, and analyzes matched K-tuples. It is also found to have ARE of unity with respect to Bhapkar's V-test when the covariable is independent of the response variable.

The two new methods are also compared with each other with respect to ARE. The results suggest that if the conditional distribution of the response variable given the covariable is bounded below, then the AMK test tends to perform better than the AMP test; otherwise the AMP test is preferred. Efficiency comparisons are reported for certain conditional distributions.
ACKNOWLEDGEMENTS
First, I would like to express my deep gratitude to my adviser, Dr. Dana Quade, for suggesting this topic and guiding its development. His continual patience with me, gentle persuasion, and abundance of insights were invaluable to this work.
Next, I would like to thank the other members of my doctoral
committee - Drs. P.K. Sen, Gary G. Koch, Ronald W. Helms, Gerardo
Heiss, and John R. Schoenfelder.
Their moral support and helpful
suggestions while reviewing this manuscript are greatly appreciated.
A special thank you goes to Dr. Schoenfelder for his continued
interest in the extensions of his work, and the attention he gave to
this manuscript.
Thanks, too, go to my family and friends for their continued
support and encouragement and for making my years in Chapel Hill
most enjoyable.
Finally, many thanks, and a big sigh of relief, to Ernestine
Bland for her very timely and proficient typing of this manuscript.
TABLE OF CONTENTS

I. INTRODUCTION
   A. Introduction
   B. Review of the Literature
      1. Analysis of Covariance
      2. Matching
   C. Nonparametric Methods Ignoring Covariables
   D. The AMP Sign Test

II. THE EXTENSION BASED ON THE KRUSKAL-WALLIS TEST
   A. The Basic Framework and Notation
   B. Parameter Definitions
   C. Number of Analysis Units
   D. The AMP Sign Statistic for K-Samples
   E. The Limiting Distribution of D* under Translation-Type Alternatives
   F. The AMP Sign Statistic for K-Samples, A Special Case: n1 = n2 = ... = nK
   G. Chapter Summary and Concluding Remarks

III. THE EXTENSION BASED ON BHAPKAR'S TEST
   A. The Basic Framework and Notation
   B. Parameter Definitions
   C. Number of Analysis Units
   D. A Class of Test Statistics
   E. The AMK Sign Test Statistic
   F. The Limiting Distribution of G* Under Translation-Type Alternatives
   G. The AMK Sign Statistic, A Special Case
   H. Chapter Summary and Concluding Remarks

IV. EFFICIENCY CONSIDERATIONS AND EXAMPLES
   A. Efficiency Considerations Comparing the AMP Sign Statistic with the Kruskal-Wallis Test for the Special Case n1 = n2 = ... = nK
   B. Efficiency Considerations Comparing the AMK Sign Statistic with Bhapkar's V-Test for the Special Case n1 = n2 = ... = nK
   C. Comments on "Handpicked" Examples, and an Example with "Real" Data
   D. Efficiency Comparisons Between AMK Sign Statistic and AMP Sign Statistic for K-Samples for the Special Case n1 = n2 = ... = nK

V. COMMENTS AND SUGGESTIONS FOR FUTURE RESEARCH
   A. Comments on the AMP Sign Test for K-Samples
   B. Comments on the AMK Sign Test
   C. General Comments and Suggestions for Future Research

REFERENCES
CHAPTER I
INTRODUCTION
A. Introduction
The issue of comparing two or more populations with respect to some response variable is of concern in many disciplines. The literature is rich with presentations and discussions of sampling techniques and methods of analyzing the data after selection for the multisample problem. When the response variable is affected only by variables under the direct control of the investigator, the analysis proceeds in a straightforward manner. If, however, there are variables that are not of primary interest but nevertheless affect the response variable, the analysis is more complicated. This dissertation proposes two new techniques for making comparisons among groups in the presence of extraneous variables.
As a simple example of the problem being addressed, consider the case where an investigator is interested in assessing the effectiveness of a certain treatment designed to lower blood pressure levels in humans. He selects two samples and subjects the individuals in one sample to the active treatment while giving those in the other sample a placebo (i.e., there are a treatment group and a control group). He wishes to compare the treatment group to the control group with respect to blood pressure measurements in order to determine whether the treatment has the effect of lowering the blood pressure. If the data that are collected provide information only on the blood pressure level of the subjects after this experiment is completed, the usual one-way analysis of variance (ANOVA) techniques are the standard analysis procedures that are used to analyze these data. If the hypothesis of no difference between groups is rejected, the natural conclusion drawn is that the treatment effect is significant. However, if the treatment group had lower blood pressure levels than the control group before the experiment began, a detected difference could be due to the disparity of blood pressure levels prior to the experiment rather than due to the treatment. It is well known that the best predictor of future blood pressure is current blood pressure. Hence, the analysis would be "cleaner" if it would in some way account for initial blood pressure levels.
In general, when an investigator wishes to compare two or more population groups with respect to a response variable, there may be extraneous variables that affect the response variable in some, perhaps predictable, way. These variables should be considered, either at the design stage or at the analysis stage of the experiment, in order to minimize bias. If the relationship between the extraneous variables and the response variable is known, this information can be used to improve the precision of estimates, in addition to reducing bias.
Notation (and general framework)

The problem may be described in the following way: Define Y_{ij} to be the response measurement of the jth subject from the ith sample (i=1,...,K; j=1,...,n_i), where there are K population groups under consideration, and n_i subjects are selected from the ith population. Let X be a variable (possibly multivariate) known or suspected to influence Y. Then X_{ij} is the measurement of this extraneous variable on the jth subject in the ith sample. Let X be called a covariable. If X has the same marginal distribution in each of the populations, then X is said to be concomitant. The issue at hand is the following: It is desired to make comparisons among the K population groups with respect to Y based on the analysis of the selected samples while incorporating the information on the covariable X.
Reasons for Considering X

Returning to the blood pressure treatment example, suppose that the only data collected are the final blood pressure measurements (Y_{ij}). The standard one-way ANOVA would produce an estimate of the between-treatment group difference. However, if the initial blood pressure levels are different, this initial difference could cause the estimate to be biased. If the treatment group has lower initial blood pressure levels than the control group, then an observed post-treatment difference could be due to the pre-treatment difference, rather than the treatment; hence, an erroneous conclusion could be drawn. Or, if the treatment group has higher blood pressure levels than the control group, then a clinically significant treatment effect may not be detected due to the initial disparity. It follows that ignoring the initial blood pressure measurements (or age, or body mass, or other well known predictors) could severely bias the estimated difference between the two groups.
In addition to introducing bias into estimates of population differences, the precision of the estimates will be affected if important covariables are ignored. Considering the above experiment where no covariable measurements are collected, ANOVA techniques may be used to analyze the data. In this analysis, the total variability of the data is partitioned into components. The interesting components are the variability attributed to treatment effects and the "within group" variability (i.e., variability due to error). Inevitably, if there are varying initial blood pressure levels (and differing age groups, body size groups, etc.) this will increase the total variability, which will increase the error variance. If covariables are accounted for (either in the design of the experiment or at the analysis stage) the error variance will be reduced, resulting in greater precision in estimating the treatment effects.

In general, if an experiment has a response variable that is influenced by some extraneous variable (covariable), ignoring the covariable introduces bias and inflates the estimate for error variance. Clearly, a "complete" analysis of the experiment will incorporate the covariable X. The method of including X must reduce the bias and increase the precision of the comparisons.
B. Review of the Literature
Methods of Accounting for X
Basically, two methods exist in the literature for incorporating a covariable into data analysis: structural or adjustment methods, and balancing or matching methods.
The classical (parametric) analysis of covariance (ANOCOVA) is the most familiar adjustment method. This technique combines the features of linear regression analysis and ANOVA. The usual parametric ANOVA procedures for testing differences in means (for K population groups) are based on partitioning the total sum of squares into components; if the mean square for the means (treatment) is significantly large relative to the mean squared error, the hypothesis of equal means is rejected. The ANOCOVA procedure for testing differences in means is also based on partitioning the total sum of squares into components. But in this latter case, the test is for a difference in means of residuals, where the residuals are the differences between the actual observations and a quantity predicted by regression on X. Hence, the values of the response variable are adjusted for values of concomitant variables.

In general, adjustment methods of analysis postulate (or find) some relationship between X and Y, and adjust the Y_{ij} accordingly, yielding "residuals" on which usual ANOVA procedures are performed. ANOCOVA and other adjustment methods will be described in greater detail later.
Matching is a method of incorporating covariables in the analysis by equalizing the distribution of X in each treatment group. There are various methods of matching. For the two-sample case, pair-matching is the most common. A matched pair is a pair of observations, one from each sample, which agree on X. If X is discrete, the observations are said to agree on X if they have the same value of X. If X is continuous, observations agree on X if their X values are "nearly" the same. "Nearly the same" could mean that the X values are within some tolerance, or within the same subgroup of the continuum of X, or are "closest neighbors," etc. These and other matching schemes will be discussed in more detail later. In general, matching methods ensure group comparability with respect to the covariable. These methods are directed primarily at reducing (or eliminating) the bias, but may also have the additional effect of increasing the precision of the estimates.
Randomization is a method of design that eliminates bias due to X. In a completely randomized design, the researcher randomly assigns the experimental units to treatment groups. This has the consequence of assigning the X effect to error. This is the technique that Wold (1956) describes as the "controlled experiment," and is frequently used in practice. The use of randomization protects against bias arising from unmeasured or unanalyzed extraneous variables. It is not, however, always feasible. In fact, it is sometimes unethical to use randomization, especially in studies on human population groups. In particular, as Cochran (1969) discussed, ethical considerations often prohibit randomization in observational studies. In these studies, the objective is to investigate possible cause-effect relationships. The analysis includes comparisons of groups subjected to different treatments which are pre-assigned in a non-random manner. In these and similar studies, effects of extraneous variables are typically considered via adjustment procedures or matching.
Analysis of Covariance
In this section, adjustment methods of including X in the analysis are presented. The classical parametric ANOCOVA is the most popular method for incorporating a covariable in analysis, and hence is presented first.

As Fisher (1934) has said, the usual parametric ANOCOVA "combines the advantages and reconciles the requirements of the two very widely applicable procedures known as regression and analysis of variance." Consider the data (X_{ij}, Y_{ij}), i=1,...,K and j=1,...,n_i. As before, Y_{ij} are response measurements and X_{ij} are measurements on a covariable for which adjustment is made. If X is univariate and has the same mean in each sample, the one-way classification model considered may be expressed as:
$$Y_{ij} = \mu_i + \beta(X_{ij} - \bar{X}_{..}) + \epsilon_{ij}$$

where

Y_{ij} = the response measurement on the jth observation in the ith sample,

X_{ij} = the measurement of the covariable on the jth observation in the ith sample,

X̄_{..} = the overall mean of the covariable,

μ_i = the ith population mean,

ε_{ij} = the error.
The ε_{ij} are assumed to be independent, identically distributed N(0, σ²). Under this model, the assumptions are:

1) Y has a linear regression on X.

2) The regression of Y on X has the same slope in all populations.

3) The covariable X is not affected by the "treatments."

In addition to these assumptions, either randomization or concomitance is required, or some assumption which permits extrapolation of the model beyond the available data.
If X is multivariate, extensions are clear. For example, if the data require adjustment for two covariables, the two-way classification model may be expressed

$$Y_{ij} = \mu_i + \beta_1(X_{ij1} - \bar{X}_{..1}) + \beta_2(X_{ij2} - \bar{X}_{..2}) + \epsilon_{ij}$$

where

X_{ijm} = the measurement of covariable m (m=1,2) on the jth observation in sample i,

X̄_{..m} = the corresponding sample mean for covariable m (m=1,2),

and Y_{ij}, μ_i, and ε_{ij} are defined as before.
Assuming the one-way classification model, the normal equations imply the following (least squares) estimators for β and μ_i:

$$\hat{\beta} = \frac{\sum_{i=1}^{K}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_{..})(Y_{ij}-\bar{Y}_{..}) - \sum_{i=1}^{K} n_i(\bar{X}_{i.}-\bar{X}_{..})(\bar{Y}_{i.}-\bar{Y}_{..})}{\sum_{i=1}^{K}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_{..})^2 - \sum_{i=1}^{K} n_i(\bar{X}_{i.}-\bar{X}_{..})^2}$$

$$\hat{\mu}_i = \bar{Y}_{i.} - \hat{\beta}(\bar{X}_{i.} - \bar{X}_{..})$$

where

$$\bar{Y}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} Y_{ij}, \qquad \bar{Y}_{..} = \frac{1}{n}\sum_{i=1}^{K}\sum_{j=1}^{n_i} Y_{ij}, \qquad n = \sum_{i=1}^{K} n_i,$$

$$\bar{X}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \qquad \bar{X}_{..} = \frac{1}{n}\sum_{i=1}^{K}\sum_{j=1}^{n_i} X_{ij}.$$

Under this model the adjusted mean scores are defined to be the predicted values obtained from the model by allowing X to assume its mean value for each of the K samples. That is, β̂(X̄_{i.} − X̄_{..}) is "attached" to Ȳ_{i.}, the usual ANOVA estimator for the ith population mean. To test the hypothesis of no difference among the K populations (or equivalently the hypothesis of no treatment effects) the usual ANOVA is performed on these regression-adjusted estimates; that is, on the residuals resulting from the regression of Y on X. This analysis proposes to eliminate the effect due to X by adjusting the response measurements for it. In ANOVA, the treatment mean Ȳ_{i.} is used to estimate the mean response with the ith treatment. With ANOCOVA, an adjustment for the covariable (based on the regression) is made to make the Ȳ_{i.}'s comparable with respect to the concomitant variable. In the simplest case, the adjustment takes the form

$$\bar{Y}_{i.}(\mathrm{adj}) = \bar{Y}_{i.} - \hat{\beta}(\bar{X}_{i.} - \bar{X}_{..})$$

(i.e., X has the same mean in all samples).
These adjusted treatment means are simply estimates of the intercepts of the treatment regression lines. Comparisons among the populations may be made, adjusted for X, by making comparisons among the Ȳ_{i.}(adj).
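As a concrete illustration of this adjustment, the following minimal sketch (Python, with illustrative names; not part of the original text) computes β̂ as the pooled within-group slope, which is algebraically the same as the ratio displayed above, and returns the adjusted treatment means.

```python
import numpy as np

def ancova_adjusted_means(x, y, group):
    """Adjusted means  Ybar_i(adj) = Ybar_i - bhat*(Xbar_i - Xbar..),
    with bhat the pooled within-group least-squares slope."""
    x, y, group = np.asarray(x, float), np.asarray(y, float), np.asarray(group)
    labels = np.unique(group)
    # Pooled within-group cross-products and sums of squares; subtracting the
    # between-group pieces from the totals, as in the displayed estimator,
    # yields exactly these within-group sums.
    sxy = sum(np.sum((x[group == g] - x[group == g].mean())
                     * (y[group == g] - y[group == g].mean())) for g in labels)
    sxx = sum(np.sum((x[group == g] - x[group == g].mean()) ** 2) for g in labels)
    bhat = sxy / sxx
    xbar = x.mean()                       # overall covariable mean
    return {g: y[group == g].mean() - bhat * (x[group == g].mean() - xbar)
            for g in labels}
```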
A more general ANOCOVA model may take the form

$$Y_{ij} = \mu + \beta_1 X_{ij} + \beta_2 D_{ij} + \epsilon_{ij}$$

where μ is the overall mean, D_{ij} is a dummy variable distinguishing the samples, and the Y_{ij}, X_{ij}, and ε_{ij} are as previously defined. The assumptions under this model are the same as before; however, the covariable is used more efficiently than under the previous model. Testing the significance of β₂ in this model is equivalent to testing that there is no difference among the K populations with respect to Y, having taken account of the relationship between X and Y.
For this problem, the least squares estimator of the regression coefficient is derived using all the data. Some researchers suggest that only the controls should be used to derive a proper estimate for β₁ (Belson, 1956). Cochran (1969) provides conditions under which this is appropriate.
The relationship between X and Y may, of course, be something other than linear. However, as long as the form of the relationship is known (or postulated and tested), then the responses may be "adjusted" for X, and the ANOVA performed on the residuals.

In addition to the standard parametric ANOCOVA procedures presented, many other adjustment techniques are possible, but few seem to have been developed in the literature. Following are some nonparametric adjustment methods which have been advanced.
Nonparametric ANOCOVA

Quade (1967) proposed a rank ANOCOVA. In his presentation he invokes a general principle which is applicable to the usual (parametric) technique as well:

If the hypothesis is true, then the populations are all identical and the samples can be pooled. So use the pooled sample to determine a relationship through which Y can be predicted from X. Then compare each observed response Yij with the value which would be predicted for it from the corresponding Xij, and assign it a score Zij, positive if Yij is greater than predicted and negative if smaller. Finally, compare the populations by performing an ordinary one-way analysis of variance of the scores.

The procedure proposed by Quade is applicable when a random sample of observations (Y_{ij}, X_{ij}) has been selected from each of the K populations, where Y_{ij} is the univariate response and X_{ij} is the (possibly multivariate) concomitant variable measurement (i=1,...,K and j=1,...,n_i). The procedure is designed to test the hypothesis that the conditional distribution of Y|X is the same for each population, against the alternative that at least one population is stochastically larger for all fixed values of X. The procedure is applied as follows:
1. Let R_{ij} + (N+1)/2 be the rank (from smallest to largest) of Y_{ij} among all N = Σn_i observed values of Y, so that R_{ij} is the rank corrected for the mean. (Use average ranks in case of ties.)

2. Let C_{ij}^{(u)} be the corrected rank of X_{ij}^{(u)} among the observed values of X^{(u)} (for u = 1,...,p).

3. Regress R on C^{(1)},...,C^{(p)} using ordinary multiple linear regression, to form an equation whereby ranks of Y can be predicted from the ranks of X.

4. Compute the predicted ranks R̂_{ij} from the regression equation.

5. Define scores Z_{ij} to be the residuals from the regression of ranks: Z_{ij} = R_{ij} − R̂_{ij}.

6. Test the hypothesis of identical conditional distributions of Y|X by comparing the variance ratio (VR) with the critical value for an F with (K−1, N−K) degrees of freedom (N = Σn_i), where

$$VR = \frac{(N-K)\sum_{i=1}^{K}\left[\sum_{j=1}^{n_i} Z_{ij}\right]^2 / n_i}{(K-1)\left\{\sum_{i=1}^{K}\sum_{j=1}^{n_i} Z_{ij}^2 - \sum_{i=1}^{K}\left[\sum_{j=1}^{n_i} Z_{ij}\right]^2 / n_i\right\}}$$
This proposed test relaxes the usual assumptions of normality, linearity of regression, and homoscedasticity.
It does,
however, require that X be concomitant, where the standard
procedure does not have this requirement.
Hence, the standard
procedure may be applicable in some situations where this
procedure is not.
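The six steps translate directly into a short program. A minimal sketch follows, assuming average ranks for ties as in step 1 (function and variable names are illustrative):

```python
import numpy as np
from scipy.stats import rankdata, f as f_dist

def quade_rank_ancova(y, x, group):
    """Quade's rank ANOCOVA: regress corrected ranks of Y on corrected ranks
    of the covariables, then one-way ANOVA on the residual scores Z."""
    y, group = np.asarray(y, float), np.asarray(group)
    x = np.asarray(x, float).reshape(len(y), -1)        # (N, p) covariables
    N = len(y)
    R = rankdata(y) - (N + 1) / 2.0                     # step 1: corrected ranks of Y
    C = np.column_stack([rankdata(x[:, u]) - (N + 1) / 2.0
                         for u in range(x.shape[1])])   # step 2
    bhat, *_ = np.linalg.lstsq(C, R, rcond=None)        # step 3 (no intercept needed:
    Z = R - C @ bhat                                    #   ranks are centered); steps 4-5
    labels = np.unique(group)
    K = len(labels)
    between = sum(Z[group == g].sum() ** 2 / np.sum(group == g) for g in labels)
    VR = (N - K) * between / ((K - 1) * (np.sum(Z ** 2) - between))
    return VR, f_dist.sf(VR, K - 1, N - K)              # step 6: compare to F(K-1, N-K)
```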
Puri and Sen (1969) advanced a generalization of this procedure in their proposal of ANOCOVA based on general rank scores. The situation for which this test is applicable is the same as for the above test. The procedure is outlined as follows:
1. Define the (p+1)×1 vector Z = (Y, X₁,...,X_p)′, where p is the number of concomitant variates of interest. Then let Z_j^{(i)} represent the (p+1)×1 vector of measurements on the jth observation from the ith sample.

2. Construct the (p+1) × n matrix

$$G_n = \left[ Z_1^{(1)} \cdots Z_{n_1}^{(1)} \;\; Z_1^{(2)} \cdots Z_{n_2}^{(2)} \;\; \cdots \;\; Z_1^{(K)} \cdots Z_{n_K}^{(K)} \right]$$

3. Rank each row of G_n in increasing order of magnitude, yielding a (p+1) × n matrix R_n of ranks, with rows indexed by r = 0,1,...,p (r = 0 for the response, r = 1,...,p for the covariables) and entries R_{rα}^{(i)} for the αth observation of sample i. Note that each row of R_n is a permutation of the numbers 1,...,n.

4. Let {E_{n,α}^{(r)}, α = 1,...,n} be a set of real (known) constants (rank scores) for r = 0,1,...,p. Then replace each rank in the rth row of R_n by its score, yielding the (p+1) × n matrix E_n.

5. Define:

$$T_{n,r}^{(i)} = \frac{1}{n_i}\sum_{\alpha=1}^{n_i} E^{(r)}_{n,\,R_{r\alpha}^{(i)}} \qquad \text{for } r = 0,\ldots,p,\; i = 1,\ldots,K,$$

where r = 0 corresponds to the response measurement and r = 1,...,p to the respective variates of the covariable X. There are (p+1)K such random variables.

6. Define also:

$$\bar{E}_n^{(r)} = \frac{1}{n}\sum_{\alpha=1}^{n} E_{n,\alpha}^{(r)}, \qquad \bar{E}_n = \left(\bar{E}_n^{(0)}, \bar{E}_n^{(1)}, \ldots, \bar{E}_n^{(p)}\right).$$

7. Compute $T_{n,r}^{(i)} - \bar{E}_n^{(r)}$ for r = 0,...,p and i = 1,...,K.

8. The test statistic L_n is a function of the residuals of the regression lines of T_{n,0}^{(i)} on T_{n,1}^{(i)}, T_{n,2}^{(i)}, ..., T_{n,p}^{(i)} (i = 1,...,K) and, as such, is shown to have a limiting chi-squared distribution with K−1 degrees of freedom. Under the null hypothesis, the distribution is central chi-squared; under the alternative hypothesis, its distribution is non-central chi-squared.
According to Puri and Sen, this procedure can be used with the "usual" restrictions placed on the scores. When the scores for Puri and Sen's test are the ranks themselves, the technique is asymptotically the same as Quade's proposed method discussed earlier. Hence, Quade's test is asymptotically a special case of Puri and Sen's. Differences in power will exist, however, as each technique uses a different theoretical distribution for its asymptotic approximation.
Another nonparametric procedure is proposed by McSweeney and Porter (1971). This method involves applying parametric ANOCOVA procedures to the mean-deviated ranks, in the same way that Friedman's chi-squared test is parametric ANOVA applied to ranks. Monte Carlo work performed by McSweeney and Porter generated an empirical distribution for the variance ratio thus obtained, which appeared to be closely approximated by the F-distribution.
Concluding Remarks on ANOCOVA

Presented in the preceding discussion are some of the nonparametric ANOCOVA procedures that are advanced in the literature, including a brief discussion of the standard parametric ANOCOVA. These procedures are used primarily in experimental situations (where randomized layouts are frequently used) to increase the precision of estimates of population differences, or equivalently, treatment effects. (Note that these nonparametric procedures assume concomitance, which is guaranteed by randomization, while the standard ANOCOVA procedures require either concomitance or that some assumption be made which allows extrapolation of the model beyond the available data. Nevertheless, in experimental situations, problems of bias should not be of critical concern.) In a randomized experiment, the amount of precision increase depends on the correlation between X and Y. Cochran (1957) shows that when this correlation is smaller than 0.3 in absolute value, the increase in precision of ANOCOVA over ANOVA is negligible. In observational studies, ANOCOVA is used primarily to reduce bias caused by non-random assignment of X among the treatment groups, but also serves to increase precision.
ANOCOVA is a powerful statistical technique; however, it does necessitate requirements, as discussed previously. Elashoff (1969) compiled results of various researchers who have studied the robustness of ANOCOVA to violations of its assumptions. In her paper, each assumption is considered separately with respect to the effect on the analysis when the assumption is not met. She notes that little investigation has been made of situations where more than one assumption is unsatisfied.

Elashoff concludes that the assumptions of normality, linearity, and homoscedasticity are mostly for statistical convenience, citing studies as follows:
1. Atiqullah (1964) investigated robustness against non-normality for ANOCOVA, concluding that the test is "... appreciably affected by non-normality even in balanced classifications. The degree of sensitivity to non-normality is determined by the distribution of the concomitant variables." He finds that the problem is not serious when the concomitant variable is normal. Also, in many cases transformations of the data may be useful in producing more nearly normal distributions.

2. Atiqullah (1964) investigated the case where the true regression of Y on X is quadratic. He concluded that results are seriously biased unless the coefficient of the quadratic term is quite small. As Elashoff points out, however, transformations of the data may help to produce the required linearity (see Kruskal, 1968). If the linearity assumption is not satisfied, adjustments should be made based on a more accurate model of the relationship between X and Y.
3. Potthoff (1965) addresses the issues of unequal variances for a special case of two comparison groups, and concludes that the effect of unequal variances is minimized when n₁ = n₂ and the variance of X is the same in each group. If the variance of the errors depends on X in a uniform way across treatment groups, a transformation of the data may produce the necessary homoscedasticity.
Although Elashoff states that normality, linearity, and homoscedasticity are for statistical simplicity, she explains that other assumptions of standard ANOCOVA are crucial to its use:

1. Many researchers identify randomization as a requirement for standard ANOCOVA (Winer, 1962; Evans and Anastasio, 1968; Lord, 1963). Major issues cited include concern over not removing all the bias and concern over extrapolation. Concomitance is, however, sufficient for ANOCOVA. If neither randomization nor concomitance is satisfied, it is crucial that some other assumption be satisfied that allows for extrapolation.

2. A basic postulate underlying the use of ANOCOVA is that the covariable is unaffected by the treatment. If this is unsatisfied, the adjustment may remove part of the treatment effect or produce a spurious treatment effect.

3. The assumption that the regression of Y on X has the same slope in all populations (i.e., there is no treatment-slope interaction) is also crucial to ANOCOVA use. If the assumption is not satisfied, the treatment which is best on the average may not be the one best at all levels of X.

It is, as discussed above, agreed by many that the random assignment of subjects to treatment groups is desirable for ANOCOVA use. ANOCOVA is, however, a widely used method of analysis in studies where randomization is impossible or unethical. In such cases, the techniques are advantageous, but the results must be interpreted with prudence. Elashoff suggests that a more correct approach might be to perform the analysis on data for which the values of the covariable are fixed, rather than using adjustment techniques. The method of including a covariable in analysis without using adjustment methods is known as matching.
Matching Schemes

In the literature the term "matching" most frequently refers to "pair-matching." In a pair-matching scheme each member of the treatment group is paired with one and only one member of the control group, subject to the restriction that the members "agree" on X. Notationally, a pair of observations, one from each sample, (X_{1r}, Y_{1r}) and (X_{2s}, Y_{2s}), are matched in this sense if X_{1r} and X_{2s} agree. If X is categorical, the pair of observations is matched if X_{1r} = X_{2s}, i.e., if the covariable assumes exactly the same value in each observation. If X is continuous, there are several methods of defining a matched pair. Three schemes are presented:

1. The continuum of X is partitioned into well-defined subintervals. Then (X_{1r}, Y_{1r}) and (X_{2s}, Y_{2s}) are matched if X_{1r} and X_{2s} fall into the same subinterval. This method actually converts X from a continuous variable to a discrete one, and is then equivalent to the above categorical case.

2. A tolerance c > 0 is specified. Then (X_{1r}, Y_{1r}) and (X_{2s}, Y_{2s}) are matched if |X_{1r} − X_{2s}| ≤ c. This scheme is called caliper matching.

3. The observations in the treatment group are ordered in any well-defined manner. Then (X_{11}, Y_{11}) is paired with the control (X_{2s}, Y_{2s}) which minimizes |X_{11} − X_{2s}|. Next, (X_{12}, Y_{12}) is paired with the next available control (X_{2t}, Y_{2t}) which minimizes the absolute difference. The scheme is continued until all pair matches are defined. This scheme is called "nearest neighbor" matching. (A minimal sketch of schemes 2 and 3 is given after this list.)
There are other schemes presented in the literature. These schemes for pair-matching with respect to a continuous covariable are highly flexible, with refined definitions of matching left to the investigator (e.g., lengths of subintervals, tolerance definition, ordering specifications, etc.). The ramifications of various decisions are discussed in the literature.

Billewicz (1965) investigated methods of selecting subintervals for continuous X when using the first method discussed above. Cochran and Rubin (1973) investigated bias reduction relative to various ordering schemes for the third method of pair matching on continuous X discussed above. Other consequences of matching definition decisions are discussed in Nathan (1963), Billewicz (1963, 1964), Raynor (1977), Yinger et al. (1967), and many others. For a more detailed discussion of the literature on methods of matching, including advantages, disadvantages, criticisms, and appropriateness of matching, see Schoenfelder (1981). It must be pointed out, however, that matched samples are not random samples, and cannot be treated as such in the analysis stage.
As mentioned earlier, taking account of a covariable can occur at either the design stage or the analysis stage. To compare "random" analysis and matched sampling may be misleading. Schoenfelder addressed this issue, explaining that a fairer comparison is to contrast random sampling and its resulting analysis with matched sampling and its resulting analysis. These comparisons could provide a solid theoretical basis for the widespread use of matching.

Schoenfelder proposed that a form of "matched analysis" which could be performed on random samples could be devised. This technique could then reasonably be compared with the usual "random analysis" on random samples.
"Matched" Analysis
Frequently an investigator is confronted with two random samples (of sizes n₁ and n₂) from population groups (or treatment groups) and requested to analyze the data to make comparisons between the groups. Matching in the usual sense is not a viable option, since the data are already collected. In order to employ the usual technique of matching on X, he must combine the two samples into one matched sample of n ≤ min(n₁, n₂) independent matched pairs (IMP). Consequently, the n₁ + n₂ − 2n unmatched observations are then discarded. The obvious loss of information using this technique makes use of matching in this way very unattractive. Quade (1967) suggested an analysis based on all matched pairs (AMP) as opposed to using only IMP. To clarify the notion, consider n = 3 IMPs that have the following X values:

X₁₁ = X₂₁ = 3,  X₁₂ = X₂₂ = 4,  X₁₃ = X₂₃ = 4.

Using IMPs only, Y₁₁ is compared with Y₂₁, since their associated X values match; Y₁₂ is compared with Y₂₂ and Y₁₃ is compared with Y₂₃. However, the associated X values of Y₁₂ and Y₂₃ match, as do those for Y₁₃ and Y₂₂. The usual IMP analysis ignores this, but the AMP analysis incorporates these "cross matches."
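The cross matches are easy to enumerate mechanically; the few illustrative lines below list every AMP analysis unit for the X values above.

```python
# X values for the n = 3 independent matched pairs in the example
x1 = {"Y11": 3, "Y12": 4, "Y13": 4}       # sample 1
x2 = {"Y21": 3, "Y22": 4, "Y23": 4}       # sample 2

# AMP: every cross-sample pair agreeing on X is an analysis unit
amp = [(a, b) for a, va in x1.items() for b, vb in x2.items() if va == vb]
print(amp)   # the 3 IMPs plus the 2 cross matches (Y12, Y23) and (Y13, Y22)
```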
Schoenfelder (1981) examines the efficiency of performing an AMP analysis when matched sampling has been performed, relative to the usual IMP analysis that is standard for matched samples. In the cases studied, he finds that if the comparison is based on the optimally weighted AMP sign statistic, that statistic, in terms of asymptotic relative efficiency, is always half again more efficient than the sign test based on IMPs. He also examines the relative efficiency of basing comparisons on the proposed matched ANOCOVA (AMP analysis) using random samples, relative to the usual ANOCOVA test based on random samples. He finds this asymptotic relative efficiency to be 0.955. This result reveals that, when comparing two populations in the presence of a discrete covariable, if there are any doubts about the parametric ANOCOVA assumptions, then Schoenfelder's matched ANOCOVA based on the optimal AMP sign statistic should be used. In doing this, only a slight loss in efficiency results if the assumptions are indeed satisfied.
Concluding Remarks on Matching

The appropriateness, advantages, and disadvantages of matching are the subjects of discussion by many researchers. Matching is a desirable method of incorporating a covariable X in the analysis due, primarily, to its simplicity and intuitive appeal, and to the relaxed condition that it is not necessary to know precisely how X and Y are related. In the ANOCOVA, the relationship between X and Y must be postulated, or found, but this is not necessary for matching. However, when using matched sampling there is the added expenditure of resources, and increased risk of lost information due to mis-matches, attrition, inability to find matches, etc. If it is known that X is related to Y such that X should be considered in the comparisons to be made, but the precise relationship between X and Y is not known, matched sampling provides the obvious alternative to using, and possibly misusing, ANOCOVA. If random samples have been selected, using AMP analysis results in a negligible loss in efficiency, but has the advantages of relaxed assumptions over ANOCOVA. The AMP analysis procedure for the two-sample problem where X is discrete is developed by Schoenfelder (1981). The generalization of this test (say to K samples) can serve as a powerful statistical tool with many applications.
C. Nonparametric Methods Ignoring Covariables
In this section, the test procedures forming the basis of the proposed tests are reviewed. The Mann-Whitney test is a nonparametric test for comparing two populations with respect to a response variable, aimed at detecting shift differences. Schoenfelder's AMP sign test is based on the Mann-Whitney test, but incorporates a covariable in the analysis. The Kruskal-Wallis test is an extension of the Mann-Whitney test to a general K-sample problem, which derives the comparison by combining all possible two-sample statistics. Bhapkar's V-test extends the Mann-Whitney test to a general K-sample problem by analyzing K-tuples. The test procedures that are proposed in this dissertation are extensions of Schoenfelder's AMP sign test to the K-sample problem by generalizing the Kruskal-Wallis test (Chapter II) and Bhapkar's V-test (Chapter III).
The Mann-Whitney Statistic
The usual nonparametric test for comparing two samples of sizes n₁ and n₂ is based on the Mann-Whitney (Wilcoxon) statistic, which is given by

$$U_{n_1 n_2} = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(Y_{2j} > Y_{1i}), \qquad I(a > b) = \begin{cases} 1 & a > b \\ 0 & \text{else.} \end{cases}$$

The null hypothesis of interest is that the populations do not differ in distribution; the alternative is that population 2 is stochastically larger (Lehmann, 1975).

Theorem 1: Let U_{n₁n₂} be the Mann-Whitney statistic as defined above. Then

a. $E(U) = n_1 n_2 \tau_{12}$

b. $\mathrm{Var}(U) = n_1 n_2 \tau_{12} + n_1 n_2 (n_1 - 1)\tau_{112} + n_1 n_2 (n_2 - 1)\tau_{122} - n_1 n_2 (n_1 + n_2 - 1)\tau_{12}^2$

where

τ₁₂ = the probability that of two observations, one from each sample, the one from the first is smaller;

τ₁₁₂ = the probability that of three observations, two from the first sample and one from the second, the one from the second is largest;

τ₁₂₂ = the probability that of three observations, two from the second and one from the first sample, the one from the first is smallest.

Note that under the null hypothesis, τ₁₂ = 1/2 and τ₁₁₂ = τ₁₂₂ = 1/3; under the expressed alternative, τ₁₂ > 1/2. Hence, the null hypothesis is rejected if the test statistic is too large. Mann and Whitney (1947) showed that under the null hypothesis the statistic is asymptotically normal with mean n₁n₂/2 and variance n₁n₂(n₁+n₂+1)/12.
The Mann-Whitney test assumes:

1. Both samples are random samples from their respective populations.

2. There is mutual independence between the two samples.

3. The response measurements are continuous.

This test is used when two population groups are compared in the absence of, or ignoring, covariables.
The Kruskal-Wallis Test

Perhaps the most widely used nonparametric technique for addressing the K-sample location problem is the Kruskal-Wallis test. The appropriate framework is that of the usual one-way layout, where the hypothesis of interest asserts that the K populations are identical, and this is tested against the alternative that location differences exist. The strict validity of the Kruskal-Wallis test does not depend on the location model, but the power of the test is primarily aimed at detecting location differences. The only assumptions required for the test are: 1) the data are K independent random samples from the K populations, of sizes nᵢ, i = 1,2,...,K, and 2) the response variable is univariate and continuous. The continuity assumption is imposed for statistical convenience, so that ties may be ignored.
The Test Statistic

Let the data be expressed as Y_{ij}, i = 1,2,...,K; j = 1,2,...,n_i, with n = Σᵢnᵢ. Let R_{ij} be the rank of Y_{ij} among the n observations, let R_{i.} = Σ_{j=1}^{n_i} R_{ij} be the rank sum for sample i, and let R̄_{i.} = R_{i.}/n_i be the average of the ranks associated with the ith sample observations. It can easily be demonstrated that E[R̄_{i.} | H₀] = (n+1)/2. The Kruskal-Wallis test statistic is expressible as

$$H = \frac{12}{n(n+1)} \sum_{i=1}^{K} n_i \left(\bar{R}_{i.} - \frac{n+1}{2}\right)^2$$

Note that under H₀, each $(\bar{R}_{i.} - \frac{n+1}{2})^2$ tends to be small. When shift differences exist, the R̄_{i.}'s tend to be more disparate, and thus (at least some of) the $n_i(\bar{R}_{i.} - \frac{n+1}{2})^2$ values tend to be larger, yielding large values of H.

Kruskal (1952) shows that H is asymptotically distributed as a chi-squared random variable with K−1 degrees of freedom. Some exact tables exist for small K and nᵢ (Kruskal and Wallis, 1952; Kraft and Van Eeden, 1968; Iman, Quade, and Alexander, 1975). The adequacy of the chi-squared approximation for small samples is investigated by Gabriel and Lachenbruch (1969). The asymptotic relative efficiency (ARE) with respect to the normal theory one-way layout is 3/π, or approximately 0.955 (see Andrews, 1954). Kruskal (1952) and Kruskal and Wallis (1952) discuss the consistency: the test based on large values of H is consistent if and only if there is at least one population for which the limiting probability that a random observation from this population is greater than an independent randomly selected member from the other sample observations is not one-half (i.e., τ_{it} ≠ 1/2 for some i ≠ t). When K=2, the Kruskal-Wallis test is easily seen to be equivalent to the Wilcoxon-Mann-Whitney test. Let

$$U'_{it} = \sum_{j=1}^{n_i}\sum_{k=1}^{n_t} \mathrm{sgn}(Y_{tk} - Y_{ij})$$

be the "centered" Mann-Whitney statistic (with zero expectation under the null hypothesis) for comparing samples i and t. Then the Kruskal-Wallis statistic may be expressed as

$$H = \frac{3}{n(n+1)} \sum_{i=1}^{K} \frac{1}{n_i}\left[\sum_{t} U'_{it}\right]^2$$

From this form of the statistic it is clear that the Kruskal-Wallis test makes comparisons among the K populations by combining all possible two-sample (Mann-Whitney) statistics and weighting these according to sample size.
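The equivalence of the two forms of H can be checked numerically; the sketch below (illustrative code, assuming no ties) computes both and returns the same value.

```python
import numpy as np

def kruskal_wallis_both_forms(samples):
    """H from mean ranks, and H rebuilt from the centered two-sample
    statistics U'_it; the two agree when there are no ties."""
    sizes = [len(s) for s in samples]
    y = np.concatenate([np.asarray(s, float) for s in samples])
    n = len(y)
    ranks = np.argsort(np.argsort(y)) + 1.0          # ranks 1..n (no ties)
    idx = np.repeat(np.arange(len(samples)), sizes)  # sample labels
    H1 = 12.0 / (n * (n + 1)) * sum(
        sizes[i] * (ranks[idx == i].mean() - (n + 1) / 2.0) ** 2
        for i in range(len(samples)))
    def Uprime(i, t):                                # sum of sgn(Y_tk - Y_ij)
        a = np.asarray(samples[i], float)
        b = np.asarray(samples[t], float)
        return np.sign(b[None, :] - a[:, None]).sum()
    H2 = 3.0 / (n * (n + 1)) * sum(
        sum(Uprime(i, t) for t in range(len(samples))) ** 2 / sizes[i]
        for i in range(len(samples)))
    return H1, H2
```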
Bhapkar's Test
Bhapkar (1961) proposed a nonparametric test, called the V-test, for the problem of several samples. The test is appropriate under these general conditions:

1. The data Y_{ij}, j = 1,2,...,n_i, are independent (real-valued) observations from the ith population with c.d.f. F_i, i = 1,2,...,K.

2. The K samples are independent.

3. The F_i are assumed to be continuous.

4. The general homogeneity hypothesis H₀: F_i = F for all i is tested against location alternatives. In particular, if the investigator assumes that the populations differ only by translation, if at all, then the test is for equality of location parameters.

Variations of the V-test appear in the literature. Bhapkar (1964) presented a W statistic; Deshpande (1965a) proposed an L statistic; Deshpande (1970) presented a general class of test statistics for the K-sample problem, of which the V-test, W-test, and L-test are members. A brief description of each of the named test procedures follows.
For each of these tests, the following procedure is required:

1. Construct K-tuples of observations by selecting one observation from each sample. Obviously the total number of K-tuples that can be thus formed is $\prod_{i=1}^{K} n_i$.

2. Rank the K observations within each K-tuple.

3. Define v_{ij} to be the number of K-tuples in which the observation from sample i has rank j (and is hence larger than exactly j−1 observations and smaller than exactly K−j observations in that K-tuple). Note that the continuity assumption for the F_i permits ties to be ignored.

4. Define $u_{ij} = v_{ij}/\prod_i n_i$. Observe that u_{ij} will assume values in the closed interval [0,1]. Note that if the observations belonging to sample i are small in comparison with those belonging to the other samples, then u_{ij} will tend to assume large values for small values of j and small values for relatively larger values of j. Note also that if the samples come from populations which do not differ in location, but population i has a greater dispersion than the other populations, then it is expected that u_{ij} will be relatively large for extreme values of j (near either 1 or K) and relatively small for values of j near the median rank.
The following numerical example makes the algebraic descriptions more lucid. Let

Sample 1: 1, 2, 4
Sample 2: 3, 5
Sample 3: 6, 7

Then 12 K-tuples (triples) can be formed:

(1,3,6)  (1,3,7)  (1,5,6)  (1,5,7)
(2,3,6)  (2,3,7)  (2,5,6)  (2,5,7)
(4,3,6)  (4,3,7)  (4,5,6)  (4,5,7)

v11 = 10    v12 = 2     v13 = 0        u11 = 5/6    u12 = 1/6    u13 = 0
v21 = 2     v22 = 10    v23 = 0        u21 = 1/6    u22 = 5/6    u23 = 0
v31 = 0     v32 = 0     v33 = 12       u31 = 0      u32 = 0      u33 = 1
The V-Test
In accordance with Bhapkar (1961), define

$$V = n(2K-1)\left[\sum_{i=1}^{K} p_i u_{i1}^2 - \left(\sum_{i=1}^{K} p_i u_{i1}\right)^2\right]$$

where p_i = n_i/n. Under the null hypothesis, E[u_{i1} | H₀] = K⁻¹. Hence, V is a measure of deviation from the null hypothesis since, under H₀, V tends to have small expected value, and under translation alternatives, V tends to be larger. Bhapkar shows that the asymptotic distribution of V is chi-squared with K−1 degrees of freedom, and with a noncentrality parameter that is zero under the null hypothesis and positive under translation alternatives.

Example:

$$V = 35\left[\frac{250}{1764}\right] = 4.960$$
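The worked example can be reproduced by brute force: the illustrative sketch below enumerates the 12 triples, tabulates the u_{ij}, and evaluates the quadratic form above, recovering V = 4.960 for these data.

```python
from itertools import product
import numpy as np

samples = [[1, 2, 4], [3, 5], [6, 7]]           # the worked example above
K = len(samples)
n = sum(len(s) for s in samples)                # n = 7
p = np.array([len(s) / n for s in samples])     # p_i = n_i / n

triples = list(product(*samples))               # all 12 K-tuples
u = np.zeros((K, K))                            # u[i, j-1]: proportion of
for t in triples:                               #   K-tuples where sample i
    order = sorted(t)                           #   has rank j
    for i, value in enumerate(t):
        u[i, order.index(value)] += 1
u /= len(triples)

u1 = u[:, 0]                                    # u_{i1}: sample i's obs is least
V = n * (2 * K - 1) * (np.sum(p * u1**2) - np.sum(p * u1)**2)
print(round(V, 3))                              # 4.96, matching the text
```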
The L-Test
In accordance with Deshpande (1965a), define ℓ_i = −u_{i1} + u_{iK}, for i = 1,2,...,K. Note that if the observations from sample i are smaller than all other observations, ℓ_i = −1, and if they are larger than all other observations, ℓ_i = +1. Then define the test statistic

$$L = \frac{n(2K-1)(K-1)^2 \binom{2K-2}{K-1}}{2K^2\left\{\binom{2K-2}{K-1} - 1\right\}} \left[\sum_{i=1}^{K} p_i \ell_i^2 - \left(\sum_{i=1}^{K} p_i \ell_i\right)^2\right]$$

The L statistic is sensitive to shift alternatives, but does not appear to be sensitive to differences in scale parameters. Deshpande (1965a) shows that the asymptotic null distribution of L is chi-squared with K−1 degrees of freedom. The null hypothesis is rejected for sufficiently large values of L.

Example: here ℓ₁ = −5/6, ℓ₂ = −1/6, ℓ₃ = 1, so that

$$L = \frac{28}{3}\left[\frac{1018}{1764}\right] = 5.386, \qquad P = \Pr\{\chi^2_{(2)} > 5.386\} = 0.0677.$$
The W-Test
Define $w_i = \sum_{j=1}^{K} (j-1)u_{ij}$, i = 1,2,...,K. Note that if sample i contains the smallest observations, w_i = 0; if sample i contains the largest observations, then w_i = K−1. The test statistic proposed, W, is a quadratic form of the same type as L above, based on $\sum_i p_i w_i^2 - (\sum_i p_i w_i)^2$. It may be noted that if K=3, the L-test and the W-test are identical. Bhapkar shows that in the special case where n_i = N, say, the W-statistic is equivalent to the Kruskal-Wallis H. The null asymptotic distribution of W is chi-squared with K−1 degrees of freedom, and the null hypothesis is rejected for large values of W.

Example: w₁ = 1/6, w₂ = 5/6, w₃ = 2, and

$$W = \frac{28}{3}\left[\left(\tfrac{3}{7}\right)\left(\tfrac{1}{6}\right)^2 + \left(\tfrac{2}{7}\right)\left(\tfrac{5}{6}\right)^2 + \left(\tfrac{2}{7}\right)(2)^2 - \left(\left(\tfrac{3}{7}\right)\left(\tfrac{1}{6}\right) + \left(\tfrac{2}{7}\right)\left(\tfrac{5}{6}\right) + \left(\tfrac{2}{7}\right)(2)\right)^2\right] = \frac{28}{3}\left[\frac{1018}{1764}\right] = 5.386,$$

as expected (see the L-test).
REMARKS: Bhapkar (1964) discusses the asymptotic efficiencies of these tests. The asymptotic distribution of the W-test is the same as that of the Kruskal-Wallis H-test, and, under the same sequence of alternatives, ARE(W,H) = 1; hence, ARE(W,F) = 3/π (where F is the usual analysis of variance (ANOVA) F), if the underlying distributions are normal. The V-test (i.e., the test based on the number of K-tuples for which the observations from the ith sample are least) is much more efficient for populations bounded below. The L-test (based on numbers of K-tuples with respect to both the smallest and largest observations) is much more efficient for distributions bounded above and below. The W-test appears to be more efficient for unbounded distributions.
D. The AMP Sign Test
Quade (1967) suggested an all matched pairs statistic, to use all matched pairs rather than just independent ones. Schoenfelder (1981) developed the AMP sign statistic

$$T_{n_1 n_2} = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(Y_{2j} > Y_{1i})\, I(X_{1i} = X_{2j})$$

where (X_{1i}, Y_{1i}), i=1,...,n₁, is a random sample from population 1, and (X_{2j}, Y_{2j}), j=1,...,n₂, is a random sample from population 2. The statistic T_{n₁n₂} is the number of matched pairs where the observation from the second sample is larger on Y. The statistic is a sum of Mann-Whitney statistics and, as such, is a member of a general class of weighted AMP sign statistics, each member of which is representable as

$$Q_{n_1 n_2} = \sum_{c=1}^{C} w_c U_c$$

where w_c ≥ 0 are appropriately defined weights and U_c is the Mann-Whitney statistic defined for the subcollection of observations matched on X with X = x_c. Schoenfelder derives optimal weights and proves that under the null hypothesis of no difference Q*_{n₁n₂} is asymptotically normal, where

$$Q^{*}_{n_1 n_2} = \sum_{c=1}^{C} p_c^{-1} U_c$$

and p_c = P(X = x_c), c=1,2,...,C (Σp_c = 1, p_c > 0 for all c), is the probability function of X. Under homogeneity, Schoenfelder (p. 98) shows:

$$E(Q^{*}_{n_1 n_2}) = \tfrac{1}{2} n_1 n_2$$

$$\mathrm{Var}(Q^{*}_{n_1 n_2}) = \tfrac{1}{2} n_1 n_2 C + \tfrac{1}{3} n_1 n_2 (n_1+n_2-2) - \tfrac{1}{4} n_1 n_2 (n_1+n_2-1)$$

The optimal weights were derived assuming that the p_c's were known. If they are unknown, analysis would proceed using estimates of the optimal weights; hence, the statistic with estimated optimal weights is

$$\hat{Q}^{*}_{n_1 n_2} = \sum_{c=1}^{C} \left(\frac{n_c}{n}\right)^{-1} U_c.$$

Since p̂_c = n_c/n is a consistent estimator for p_c, it follows (from Slutsky's theorem (1925)) that Q̂*_{n₁n₂} and Q*_{n₁n₂} have the same limiting power (see Schoenfelder, 1981, p. 127).
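A minimal sketch of the estimated-weight statistic Q̂*_{n₁n₂} for two samples with a discrete covariable (illustrative code; the optimality theory is Schoenfelder's, the implementation below is not):

```python
import numpy as np

def amp_sign_statistic(x1, y1, x2, y2):
    """Q*-hat = sum_c (n_c/n)^(-1) U_c, where U_c counts pairs matched at
    X = x_c with Y2 > Y1 and n_c is the number of observations at x_c."""
    x1, y1 = np.asarray(x1), np.asarray(y1, float)
    x2, y2 = np.asarray(x2), np.asarray(y2, float)
    n = len(x1) + len(x2)
    q = 0.0
    for xc in np.unique(np.concatenate([x1, x2])):
        a, b = y1[x1 == xc], y2[x2 == xc]
        nc = len(a) + len(b)
        if len(a) and len(b):
            Uc = np.sum(b[None, :] > a[:, None])    # Mann-Whitney count at x_c
            q += (n / nc) * Uc
    return q
```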
In this dissertation two extensions of Schoenfelder's AMP sign test are presented for comparing K populations with respect to a continuous univariate response. The covariable is assumed to be discrete and is possibly multivariate.

The first extension is presented in Chapter II, and is called the AMP sign test for K samples. It is based on the Kruskal-Wallis test, basing the test on a combination of sets of two-sample statistics. The null asymptotic distribution is found to be chi-squared with K−1 degrees of freedom; the asymptotic distribution under translation-type alternatives is shown to be noncentral chi-squared with K−1 degrees of freedom, and the noncentrality parameter is derived. The AMP sign test for K samples is shown to have Pitman ARE of unity with respect to the Kruskal-Wallis test, when the covariable is independent of the response variable.

The second extension is presented in Chapter III, and is called the all matched K-tuple (AMK) sign test. This test is based on Bhapkar's V-test, and analyzes matched K-tuples. The asymptotic distribution of the test statistic is found to be central chi-squared (K−1 degrees of freedom) under the null hypothesis, and noncentral chi-squared under translation-type alternatives. (The noncentrality parameter is derived.) In the case where the covariable is independent of the response variable, the AMK sign test is shown to have ARE of unity relative to Bhapkar's V-test.

In Chapter IV, the ARE results given above are derived. Also, the two new methods are compared with respect to each other via ARE results for certain conditional distributions. The results suggest that if the conditional distribution of the response variable given the covariable is bounded below, then the AMK methods tend to be superior to AMP methods; otherwise, AMP methods are preferred.

Finally, Chapter V summarizes the salient features of this research and discusses suggestions for future research.
CHAPTER II
THE EXTENSION BASED ON THE KRUSKAL-WALLIS TEST
In this chapter, a method of analysis of covariance by matching is presented that is an extension of Schoenfelder's All Matched Pairs (AMP) sign test to the general K ≥ 2 sample problem, based on the Kruskal-Wallis test. As such, the K-sample comparison is based on the combination of two-sample statistics.
A. The Basic Framework and Notation

The Data:

The data are assumed to be representable as vectors (X_{ij_i}, Y_{ij_i}), where i=1,2,...,K (indexing the populations) and j_i = 1,2,...,n_i (indexing the observations within populations); n_i is the number of observations from population i, and n = Σᵢnᵢ. The covariable X (possibly multivariate), which is incorporated into the analysis, is assumed to be a discrete random variable with finite support, so that it takes on possible values x_c, for c=1,2,...,C (C is the number of possible values of X), with probabilities ξ_c = Pr[X_{ij_i} = x_c], ξ_c > 0 for all c, and Σ_c ξ_c = 1. The response variable Y is univariate and continuous. The continuity assumption is made for statistical convenience so that the treatment of ties need not be considered.
The Hypotheses:

The general hypothesis of homogeneity for K populations is

$$H_0: F_i(y) = F(y) \qquad \text{for all } y, \; i=1,2,\ldots,K,$$

where F_i(y) denotes the cumulative distribution function for the ith population. The K-sample location model asserts that the populations are identical except possibly for a shift. This chapter presents a test based on the Kruskal-Wallis test which does not require the location model for its validity; it is, however, primarily sensitive to shift differences.

The issue of current interest is that of comparing K populations for the purpose of detecting shift differences in the presence of a covariable. Hence, the null hypothesis incorporating X may be expressed

$$H_0: F_i(y \mid x_c) = F(y \mid x_c) \qquad \text{for all } y, \; c=1,\ldots,C, \; i=1,\ldots,K.$$

Note that nothing is assumed about the form of the distribution of X asserted to be common to the K populations, simply that the distributions are identical. The particular alternative considered is that the conditional distributions differ by shifts. It is recognized that there may be other distribution differences, but the power of the test presented in this chapter is aimed at detecting shift differences. Hence, its appropriateness for any given testing situation must be considered by the investigator.
The Assumptions:

As has already been stated, the general framework assumes that the data are from K independent random samples; hence the observations (which are vectors) are independent within samples and among samples. The following three assumptions are made throughout this presentation:

A.1 The covariable X is concomitant and has finite support.

A.2 The response variable Y is continuous and univariate.

A.3 Conditional on the X's, the Y's are independent.

Note that Assumption A.2 is not required for the strict validity of the presented analysis technique, but is rather for statistical convenience so that ties on Y may be ignored.
B. Parameter Definitions

The parameters describing probabilities of certain rankings among observations must be defined, conditionally and unconditionally.

Definition 2.1: Let τ_{ii'} denote the [unconditional] probability that of two randomly selected observations, one from population i and one from population i', the one from population i' is larger on Y.

Definition 2.2: Let τ_{ii'i''} [τ_{ii'i''i'''}] denote the probability that of three [four] randomly selected observations, one each from populations i, i', i'' [, i'''] (not necessarily distinct), their relative ranks are i, i', i'' [, i'''].

Definition 2.3: To each τ there corresponds a θ (for example: θ_{ii'}(x_c) corresponds to τ_{ii'}) which is the conditional probability of the event given the observations are matched on X with X = x_c.

Definition 2.4: To each τ there corresponds a γ (for example: γ_{ii'} corresponds to τ_{ii'}) which is the probability of the event given that all X's are matched; these are called matched probabilities.

Definition 2.5: Let β_m denote the probability that m > 1 randomly selected observations match on X. For the special case m = 2, the subscript will be dropped, i.e., β = β₂.

Proposition 2.1: Let the θ's and γ's be the conditional probabilities and the matched probabilities defined above. Then

i) $\gamma_{ii'} = \sum_c \xi_c^2\, \theta_{ii'}(x_c)/\beta_2$

ii) $\gamma_{ii'i''} = \sum_c \xi_c^3\, \theta_{ii'i''}(x_c)/\beta_3$

Proof: This is a straightforward application of the result of Schoenfelder (1981, p. 39).
C. Number of Analysis Units

Since the extension of Schoenfelder's AMP analysis based on the Kruskal-Wallis test is an analysis performed on all matched pairs (recall, all pairs (X_{ij}, Y_{ij}) and (X_{i'j'}, Y_{i'j'}) such that i ≠ i' and X_{ij} and X_{i'j'} match), the number of units of analysis (the number of matched pairs) is a random variable, say A:

$$A = \sum_{i<i'} \sum_{j_i=1}^{n_i} \sum_{j_{i'}=1}^{n_{i'}} M(X_{ij_i}, X_{i'j_{i'}})$$

where

$$M(X_{ij_i}, X_{i'j_{i'}}) = \begin{cases} 1 & \text{if } X_{ij_i} \text{ and } X_{i'j_{i'}} \text{ match} \\ 0 & \text{else.} \end{cases}$$

If all pairs match, then clearly the number of analysis units is $\sum_{i<i'} n_i n_{i'} = (n^2 - \sum_i n_i^2)/2$. In general, this is not expected. It is desirable to determine E(A) and Var(A).
Theorem 2.1: Let A_{ii'} be the number of matched pairs from samples i and i', respectively, with sample sizes n_i and n_{i'}, satisfying Assumption A.1. Then,

i) $E(A_{ii'}) = n_i n_{i'} \beta$

ii) $\mathrm{Var}(A_{ii'}) = n_i n_{i'}\left[\beta + (n_i + n_{i'} - 2)\beta_3 - (n_i + n_{i'} - 1)\beta^2\right]$

where β and β₃ are as defined above.

Proof: See Schoenfelder (1981).

It is worth noting that A_{ii'}/(n_i n_{i'}) is a two-sample U-statistic for estimating β.
Theorem 2.2: Let A be the number of matched pairs from K independent random samples of sizes n_i, i=1,2,...,K, satisfying Assumption A.1. Then,

i) $E(A) = \beta\,(n^2 - \sum_i n_i^2)/2$

ii) $\mathrm{Var}(A) = \sum_{i<i'} n_i n_{i'}\left[\beta + (n_i+n_{i'}-2)\beta_3 - (n_i+n_{i'}-1)\beta^2\right] + 6(\beta_3 - \beta^2)\sum_{i<i'<i''} n_i n_{i'} n_{i''}$

Proof: It is obvious that $A = \sum_{i<i'} A_{ii'}$.

i) $E[A] = \sum_{i<i'} E(A_{ii'}) = \sum_{i<i'} n_i n_{i'}\,\beta = \beta\,(n^2 - \sum_i n_i^2)/2$.

ii) $\mathrm{Var}(A) = \sum_{i<i'} \mathrm{Var}(A_{ii'}) + 2\sum \mathrm{Cov}$ terms, the covariances being taken over distinct pairs of sample-index pairs. For two pairs sharing the common sample i,

$$\mathrm{Cov}(A_{ii'}, A_{ii''}) = E\left[\sum_{j=1}^{n_i}\sum_{j'=1}^{n_{i'}} M(X_{ij}, X_{i'j'}) \sum_{k=1}^{n_i}\sum_{k''=1}^{n_{i''}} M(X_{ik}, X_{i''k''})\right] - n_i^2 n_{i'} n_{i''}\,\beta^2.$$

Splitting the expectation according to whether the two matched pairs involve distinct observations from sample i (j ≠ k, giving β²) or share one (j = k, so that all three observations must match, giving β₃),

$$\mathrm{Cov}(A_{ii'}, A_{ii''}) = \left\{n_i(n_i-1)n_{i'}n_{i''}\,\beta^2 + n_i n_{i'} n_{i''}\,\beta_3\right\} - n_i^2 n_{i'} n_{i''}\,\beta^2 = n_i n_{i'} n_{i''}(\beta_3 - \beta^2).$$

Similarly, $\mathrm{Cov}(A_{ii'}, A_{i'i''}) = n_i n_{i'} n_{i''}[(n_{i'}-1)\beta^2 + \beta_3] - n_i n_{i'}^2 n_{i''}\beta^2 = n_i n_{i'} n_{i''}(\beta_3 - \beta^2)$, and $\mathrm{Cov}(A_{ii'}, A_{i''i'''}) = 0$, clearly, when i, i', i'', i''' are distinct. Each triple i < i' < i'' contributes three such covariance pairs, so

$$\mathrm{Var}(A) = \sum_{i<i'} n_i n_{i'}\left[\beta + (n_i+n_{i'}-2)\beta_3 - (n_i+n_{i'}-1)\beta^2\right] + 6(\beta_3-\beta^2)\sum_{i<i'<i''} n_i n_{i'} n_{i''}.$$

The sums may be reduced using the identities

$$\sum_{i<i'<i''} n_i n_{i'} n_{i''} = \tfrac{1}{6}\left(n^3 - 3n\sum_i n_i^2 + 2\sum_i n_i^3\right), \qquad \sum_{i<i'} n_i n_{i'}(n_i + n_{i'}) = n\sum_i n_i^2 - \sum_i n_i^3.$$
The AMP Sign Statistic for K Samples
Defi ne,
n •
i
n
i
UCii' = I
I sgn(Y i
j!.=l j.=l
,
,
IJ'~,,
- YiJ·.)
,
I{~iJ"=
~i'J·~.=
~c}
,
,
as the centered Mann-Whitney statistic for comparing samples i and i
using only those observations for which the covariable assumes the
value
~c..
Then Schoenfelder's unweighted
A~'P
sign statistic is
I
43
the general weighted AMP statistic is
where the Wc are general weights,
Q*
n
= c~
Pc
-1 U
C12
is the optimally weighted AMP sign statistic, and
Q*
n
= '" (
~
mc,-1 U
l-nJ
C12
is the optimally weighted AMP sign statistic with estimated weights,
.e
where mc is the number of observations available for which X
=V
.
~
~c
While it would be desirable to-derive optimal weights for the
AMP sign statistic for the K-sample problem, complications arise in
trying to invert a complicated covariance matrix.
Consequently, in
an attempt to simplify the presentation, a particular weighted
A~1P
sign test for K-samples will be developed and its properties (in terms
of ARE) di scussed 1ater.
At this point, another assumption is imposed on the framework.
In addition to Assumptions A.l - A.3, assumption A.4 is now in force.
Assumption A.4:
The K populations are parallel in pairwise probabili-
ties.
Definition 2.6:
If for every pair of populations, given one observa-
tion from each member of the pair, with the two
~'s
matched at
~c'
the expected ordering of the V's does not depend on c, then the populations are said to be parallel in
pairwi~e
probabilities.
44
If the populations are parallel in pairwise probabil-
Corollary 2.1:
ities, then for all values V
of X,
~c
~
e.·I{V)
11
~c
Lemma 2.1:
= Y"I
11
•
Let Vi' , ji=1,2, ... ,n i for a fixed i, be independent
~
Ji
(real or vector) random variables, identically distributed with c.d.f.
F.,
i=1,2, ... ,K.
1
Further, let n = Ln. and
i
(
ni
m.1
!2B
( r)
J1
1
-1
*2. h (r)( ~ 1a
-
l
' ... , ~ 1a
ml
( r);
~2Bl"'"
(r);'" ; ~k61""'~k6 (r)! ' r = 1,2, ... g,
mk )
m2
where each her) is a function symmetric in each set of its arguments
*
and 2. denotes the sum over all combinations (al, ... ,am, (r)) of ml (r)
integers chosen from (1,2, ... ,n,), and so on for the SiS and the 6 s.
1
Assume that E[h(r)] = 0(r) and E[h{r)]2
i)
ii)
E[U (r)]
n
Then
00.
= 0(r)
-1 m (rs)
r~
Cov[U (r),U (s)J =
[n i
n
n
i=l m.{S)
-
1
rs
m/
2.
dk=O
<
)
•
J] I
d =0
l
k lm.{r)] In.-m.(r)]
1
1
1
i=l
d.
m.(s)_d.
TI
1
1
1
s
(r,s)
di ,d 2, .. ·,d k
'
where m.(rs) = min(m.{r)
1
s
l'
(r,s)
dl ' d2, ... , dk
= E[h ( r) (~ll ' .. ., ~ 1d ' ~ 1d +', ~ 1m,c r) ;
l
l
!k"''''~kdk' ~kdk+l'''''~kmk(r) }
... ,
e.
45
X h(s) (~ 11 ' ... , ~ 1d ' ~ 1m ( r) +1' ... , ~ 1,m1 (r) +m (s) -d ; ... ;
l
l
l
l
Ykl""'Y
~Ykm k(r)+l""'Y~ k ,m (r)+ m (s)_d)J
~
~ kd k ~
k - 0(r)0(s),
k
k
r,s = 1,2, ... ,g.
Note that if r=s, this produces var[Un(r)J~ and
1
n~[~n-~J is, in the limit as n
iii)
+
00
in such a way that n~=nai'
the ai's being fixed numbers such that Lai=l, normally distributed with zero mean vector and asymptotic covariance matrix.
L
= (ors)
given by
k m. (r)m. (s)
·e
= I
. 1
1=
1
a.1
1
r;;
0,0, ... ,0,1,0, ... ,0
(1 at the ith place)
(r,s),
r~s
= 1,2, ... ,g,
where
~n = (Un(l) , ... ,un(g)) and
o = (~(1)~ .•. ,0(g)).
Proof:
This lemma is given in the above version in Bhapkar (1961).
The generalized U-statistics results are a straightforward extension
of Hoeffding's theorem (1948), and the details are not supplied here.
1=1
Define the parameters
k
O.
1
= I (y .. , - y.,.)
i'=l
11
i Ifi
and note that under Ho
11
for i=1,2, ... ,K
46
I
<5
:::
(01'02 ... ·• oK)
2·
=
To estimate 0i one may use the U-statistic
, (TIn.)-l
,.' W.,
T.
=
in which the numerator is
( .)
n2
nk
~ ... ~
h' [(X ..• Y.. ),(X 2 · .Y 2 . ) •...•
j =1 j =1
j =1
~'J1 'J 1
~ J2
J2
12k
n1
W,.
and the
= ~
~ernel
is
~
(~2j 2•Y2j 2) • ... • ( ~ kj k•Ykj k) J
. h( i ) [ ( 1j 1•Yi j 1) •
=
1
i'=l
s gn (Y . I •
' Ji ,
-
y.. )
'J;
I
c
€
c
-1 I {X.
~,
I'
Ji'
=
e·
x.. = V }
~'Ji
~c
so that
E[h(i)(o)] = (
=~
~
C
i
I
I ,;
v}l
{x..
sgn(Y· I • -Y .. ) >' € -1 I
= X· I • =
1
' J i I , J iCc
.-, J i ~, J i I -c_
-1
EC
E
2
C
I
, =1
• I
[e ,,~
.. ,(v c ) -
, ,.(V_c
8'1
)1
i I;! i
Under pairwise parallelism.
E[h (i ) ( o)J = I
k
E
I
(y..
c c i'';1''
I -
y. I
"
.)
= 6i .
i I;!i
Let
and
k=
((ors))' the asymptotic covariance matrix of
In
(~n - ~).
An
47
~
expression for
is given in the following theorem.
Let T
be the vector of K-sample U-statistics defined
~n
Theorem 2.3:
above, and 0~ the vector of corresponding expected values (under H0 :
Then the asymptotic covariance matrix of
r = ((ors))' given by
0=0).
-
a .- 1 = -31
1 (
)2 a -1
.°rr = -3 k-l
r
°rs = 13
0
1
[I
3 i =1
k
(k-l)[6 -l+ a -l J + 1 I a.- 1
r
s
3i ";1 1
(under H ) is
[Ii k1a. - 1 + (k 2 - 2k) ar- 1-_,'
=
1
In T_n
=1
a .- 1 - k (a -1 +a - 1J
1
r
s
ift"
irs
where ni = na i .
Proof:
From Lemma 2.1 the asymptotic covariance matrix of In!n is
given by ((ors))' where
k
°rs = i ~1 a i
Hence the
1;
(IS
-1
,
sO,O , ... ,1 , ... ,0
(1 in position i)
r,s
= 1,2, ... ,K
are derived.
(1)
1,0,0, ... ,0
I,
= E{
.
I s gn (V . Ji'
i'=2
I'
1
Ik
ill =2
-
Vl' ) I E c- 1 I
Jl c
sgn(V'II
' 1 - V1· ) \'L
1 Ji '
J1 c
{X.,
.
-1 Ji
-1 I { X· II ·,
E'I
I
1
-1
l
Jill
= X . = V} •
~lJl
_c
= Xl' = V I }}
- J 1 ~C
48
k
k
=E{ I
I
sgn(Y,"J'
;1=2 ;11=2
• I {X.
II . ,
~, J; II
=I
-2
E
E
C C
3
C
=
X,.
~ J,
I
;"
=
- Y,j) sgn(Y,'II
I}
V
I {X.,.
~C~, J;,
I {Pr(Y"
,"":2
- ,'11-2
-
'1
J ;11
=
J,
y,.)
-
,
Ic IE
- EC~'·
c' c
X,.
= YC}
~ J,-
}
_
_
_
IX,. -X.,.
-X· II ·, -V)
~, J., -, J. II ~C
·, - J,
•
J<Y·
, ' I J.
I <Y., II J'II
"
"
+ Pr(Y,. <Y' II ' <Y.,. IX,. =X· I . =X· II ·, =V)
", , J'II , J., - J, ~, J., ~, J'II- C
I
"
"
+ Pr(Y'
I ' 1I
, J;,
+ Pr(Y· II ·,
, J; II
,
•
-Pr{Y', 1 J.
I
< Y' II ' I <
, J;,
y,.J, IX,.
~ J,
= X..
< Y. I •
<
, J; I
y,.J, IX,.
- J,
= X., .
~, J;
= X' I ' II = V)
-, J;I
~C
~'J;I
I
= X. II .
= V )
~, J; II
~C
.
= X' II . I = V )
< Y, . < Y '11 ., IX, . = X.,
~, J.,
~C
_1 J'II
, J; II ~ J,
J,
,
,
= X' II .
= V
- Pr{Y '11 . < Y, . < y. ,. IX, . = X.
I'
_C )}
-, J.,
, J;II
-, J."
'J;,-J,
J,
,
I
=I
k
E
C
C
•
I
k
I
i '=2 i"=2
4 2
(6 - 6)
,
2
(' )
-',0, ... ,0 =")(k-l)
(:
Similarly,
r;
(1),
2
0, .. .,0,',0, ... ,0 =") (k-l) ,
( 1 ; n ith p, ace)
r = ',2, ... ,K
,
I
e·
49
(1)
1:
0
· · ,....
'0
( )
= E[h(l)(y .. ,Y 2 · , ... ,Y k · )oh' (Y'J"'Y 2 ' ,Y 3J· , ,···.
J2
Jk
,
J2
3
,J,
0
Y
kjk
=
E[i'='I
sgn(Yi,j, - Y'j') 1: Eel I{Xilljl = X'jl
ill' e I
ill
- ,
Ik
Ik
=
sgn(Y."
i'=2 ill=2
II
eel
-e
E I{X. I . = x, . = V}
-e
C~, Ji '
- J,
i 11=2
o
= 'E[
y,.J, ) Ie
sgn(Y. I' , Ji '
I
)] - {E[h(')(o)]}2
1
Ji,
- Y"
J,
= ~c,}
]
J,
) sgn(Y 'II" - Y, , ,) x
1 Jill
=Ve}I{Xl'IIJ"=X'J'I=~,el}]
i II
,-
EEl
e e I {X"I
- J' i I =X'J'
~,
E[ i ' I=2
0
~
~
I
sgn(Yi'J' - Y'J") sgn(Y iIlJ., - Y'i') x
i ll =2
i"
ill
",
not(i l =i ll =2)
Ie ~t,c
E
e
-,
EI-, I
c
{X'I'
-1
J'I
1
--
X .--v'r I{X ' 1 \ "-x
'-'1
-C)
-1 J'II ~'J,
1
-'J,
+ sgn(Y 2j - Y,j ) sgn(Y 2j, - Y,jl) x
2'2'
Ieel
I
= I
e
-2
£
e
+
Ee Eel I {X-J
= V}
2 " = X",
2 ' = X,.
_c I {X-J2
-J, = Yet}]
.2 -J,
.
k
k
4
£
e
Ie
E -2
c
I
I
0
i ll =2 i 1=2
not(i l =i ll =2)
E:
e
3
0
0
I
[pr(Y 2 , < Y, , < Y, 'I X"
= X, . I = X2 · =V )
J2
J,
J, - J,
- J,
- J2 _e
Ix,.
+ Pr(Y 2 · < Y,'I < Y"
= X",= X ' = V )
J,
J, - J, - J, - 2J2 ~e
J2
Ix"
+ Pr( Y'J' < Y,'I< Y2 ,
= X,.,= X · = V ,)
,
J,
J 2 - J, - J, - 2J 2 -e
50
+ Pr( Yl' I < Y1 , < V2 ' Xl' = Xl ., = X2 · = V )
- J 2 _c
J1
J1
J2 - J 1 - J1
S
-
Pr(Y 1 , <Y 2 · <V 1 .,IX 1 . =X.,=X. =V)
. J1
J2
J 1 - J 1 _1 J 1 -2J 2 -c
-
Pr (V1 . I < Y2 ' < Vl' IX2 . = X2' = Xl' I = V )]
J1
J2
J1 - J2 - J 2
- J1
-c
(1) _ 1
0,1,0, ... ,0
-"3
Similarly,
( 1)
So ,0 , ... ,0 ,1 ,0 , •.. ,0
=
~,
r = 1,2, ... , K
(1 at any place except r)
= E[
I
i •=2
(v ...J i -lY1Jl·)
1
1-
C
E
c-
1
I{X",
_1 J
i
v}.
I
= Xl' =
J 1 -c
51
=
E[
~
; , =2
(V.,.
- V1 .}(V 1 · - V2 ., ) x
1 J; I
J1
J1
J2
Icc'C
1. E -1
k
E
C:
1
I{X 1 · = X.,. = V } I{X 1 · = X2 . , = Vc}
-J1 -1J;, -c
-J1 -J2-
k
II
;'=2 ;"=3
(V. I' - V1 · ) (V,"., - V2 · , )
1 J;,
J1
1 J;II
J2
+
Ic )c' E c-1
(1,2)
s1,0, •.. ,0 =
E :
C
X
1 I{X · = X., .
= V}
- 1J 1
- 1 J;,
_c
•
k
I
E
C
c
-2
+ Pr(V.,.
1 J;
3 I {pr(V ·,<v · <v.,. Ix 1 · =x 2 .,=x.,. =V)
2J 2 1J 1 1 J.,
c .1, =2
1 - J 1 - J 2 _1 J.1 1 -C
E
< V · < V ·, IX . = X ., = X.,.
1J 1
1
2J 2 - 1J 1
- 2J 2 - 1 J;,
=V)
_c
-
Pr(V,. < V.,. < V ·, X,. = X2 .,= X. , . = V)
2J 2 - J, - J 2 _1 J;, _c
J,
1 J;,
-
Pr(V,.,< Y2" < V.,.
X,. = X2 ·,= X.,. = V )
J
J2
lJ;I-J1 -J
2 -1J;,-C
-
Pr(V.,.
-
Pr(V 2 ·,< V". < V,.!X,. = X2 ., = X·1 I · = V)}
J2
1 J;,
J , - J 1 - J 2 - J; I -c
J;,
1
+ 1.
c
<V 2 ·.<V,·IX,. =X 2 " =X.,. =V)
J2
J, - J,
- J 2 - 1 J; I -c
k
E
c
-2
E
c
4
I
;1=2 i";3
I (il -t)
; 1=2
k
)
·
0
52
(1,2)
sl,O, ... ,O
Also
,
t
~O
1
= -"3
(k-l)
° =- l
(1,2)
, 1, 0 , ..• ,
(k-l)
3
Similarly,
( r, s)
s0,0, ... ,0,1,0, ... ,0 = - 13
(k-l)
f or r,s
=1
,2
, ...K
, ,
r f s
(1 in place r or place s)
(1 ,2)
s0,0,1,0, ... ,0 =
E[h(l)(y 1 · 'Y 2 ' 'Y 3 ' ""'Y k ' )
Jl
J2
J3
Jk
•
h(2l(Ylji'Y2j2'Y3j3'''''Ykjk)] - E[h(1)(·l+(2)(.)]
r~
= E
s gn (Y , I '
-
' Ji 1
i 1;;2
Yl' ) I E -1 I {X, I '
J1 C c
~, J i
I
IE: 1
sgn(Y' II '1 - Y2 ,,) l
;"=1
' Ji"
J2 c C
I{X"II
'I
J
~;II
I
= Xl'
=V }
~ J 1 ~C
= X2J, , =
~
2
;"f2
k
k
E I
I
sgn(Y, I' -y ·) sgn(Y '''' -Y 2 · I) •
[ i "= 1 i 1=2
' J; I 1J 1
' J;"
J2
=
i"f2
. I I
C
•
E
-1
cl c
I{X'
I'
=X , = V } •
1
~'J;,~Jl
-c
{x 'II .I = X2 , I = V I}]
~'Ji"
~J2
~C
•
vCI}lJ
~
e-
53
-1
E.
C
k
I
;"=1
+
I
;'=2
•
sgn(Y .•. - Y1 ·) sgn(Y;"J· ! •• - Y2J· ·)· I ~c-'
1 J;I
Jl
2
l
c
-1
-1
EC EC '
;" 'f2
not(;'=3=;")
•
I{X .•
_1
o
J;I
(1 ,2)
=
1;;0 ,0 , 1,0 , • . . ,0
-e
= Xl' = V}. I{X.,,". = X2 ·.= V,}]
- Jl -c
_1 J;"
- J 2 _c
Ic
E
c
-2
E
C
Ix
3 • [pr(Y " <Y ",<Y ·
· = X ",= X . =V )
1J 1 2J 2 3J3 - 1J1 ~ 2J2 ~ 3J3 ~C
+ Pr(Y 2 ··< Y1 · < Y3 · Xl" = X2 ",= X3 " = V )
J2
J1
J3 ~ J1 ~ J2 ~ J 3 ~C
+ Pr(Y 3 · < Y1 · < y2 ·.lx 1 · = X2 · , = X3 · = V )
J3
J1
J2 ~ J1 ~ J2 ~ J3 ~C
+ Pr(Y 3 " < Y2 ·,< Y1 · X1 "1= X?.= X3 . = V )
J2
J1 - J
-~J2
~ J3 -c
J3
- Pr(Y 1 · < Y3 · <
J1
J3
y2J2
"lx- 1J1
· = X2 .· , = X3 " = V )
~ J2 - J3 -c
- Pr(Y 2 ··< Y3 · < Y1 · Xl' = X2 ",= X3 · = V
J2
J3
J 1 ~ J 1 ~ J 2 ~ J 3 ~C
-2E 4
+ \'L E
C
C
C
k
•
I
1:.
;"=1 ; 1=2
;"12
not(; 1=; "=3)
=
c;;
)
c.-
E
(1. -
c ~6
.
£)6
(1,2)_1
0,0,1,0, ••. ,0
-
~
0
)J
•
54
Simi 1arly,
S
(r,s)
0 •... ,0,1,0, ... ,0
_1
-"3'
= 1,2, ... ,K,
r,s
r
r s.
(1 in any place except rand s)
k
°rr
=
( r)
a.1 -1
"L
i == 1
So ,
°,...,1, °,...,°
=a- 1 S
(r)
°rr r
,1 , ... ,
(1 at rth place)
°,°,...
o
rr
= a r -1 (!(K-l)2)
3
°
1
3"
+
+
r
k
\'
L
= 1,2, ... ,K
a. -1
i 1=1
1
i t=r
1.=
ai
°,...,°,1 ,0, . . . ,°
(1 at ith place)
I
I. k1
So ,
-1
it=r
°rs
=a
( r, s)
-1
+ a -1 S
S0,0, ... ,0,1,0, ... ,0
t'
0,0, ... ,1,0, ... ,0
5
(1 at 5th place)
(1 at rth place)
(r,s)
k
+ .
I
sO,O, ... ,0,1,0, ... ,0
1 1=1
(1 at ith place)
i!r,s
= (a -1 +
r
( r, 5)
a
5
-1)(_ ! (K-l)) + !
k
I
3 i 1=1
3
a.1 -1
it=r,s
= 13
[I
i=l
a .-
Corollary 2.2:
above.
1
1
- K(a -
r
1
1
+ a - )]
s
Let!n be the vector of K-sampleU-statistics defined
Then the asymptotic distribution of
In T-n
has, .under the null
hypothesis, a limiting normal distribution with zero mean vector and
e-
55
covariance matrix
Proof:
~,
given in the previous theorem.
The result follows directly from Lemma 2.1.
_
k
Let T =
I
,=. 1
_
T./K.
Conditional on T, the T,.'s are subject to a
'
single linear constraint, and their distribution is singular, so
their asymptotic distribution is singular,
and~
is singular.
In
fact, Rank(~I=K-l).
Note that,
K
a.- l +(K 2_2K)a- l]
orr + 1. °rs =l[I
. r
3 i=l '
s=l
sIr
+
·e
JI [~ [JI a;-I - K(a r-I + as-I}]]
srr
K
a.- l + K2a -1_ 2Ka -1+ (K-l) I a.- l
,
r
r
i =r '
K
- K
=
I
s=l
sIr
o.
Hence, clearly
~J
= O.
,
Consequently, any K-l of the T.'s may be selected as the nonsingular set.
and let
~o
Therefore, let
be the corresponding covariance matrix.
56
Lemma 2.2:
~
If the vector X has a normal distribution with mean vector
and nonsingular variance-covariance matrix
then the quadratic
~~
form XA-1X' has a X~(A) distribution~ where r is the rank of A and
A=lJ1\
Proof:
-1 '
ll·
See Rao (1952).
Corollary 2.3:
1~2,
Let ~~
=
in (Tl,T 2, ... ,T K_l ) and
~o = ((u rs ))' r,s =
... ,K-l, r1s be the corresponding covariance matrix for vll
~o'
Then
has~
under the null hypothesis, asymptotically a chi-squared distri-
bution with K-l degrees of freedom.
Proof:
The result follows from Corollary 2.2 and Lemma 2.2.
Theorem 2.4:
Let
-
!o = vll (Tl-T, T2-T, ... , TK_l-T).
1=1
Then, conditional
on T,
I
-1
!o ~o !o
has, under Ho ' asymptotically a chi-squared distribution with K-l
degrees of freedom.
Proof:
E[E[!o!TJJ
-
= E[~ - T1J = O.
-
The covariance matrix is unaltered by adding - T to each component of
.Q,
~o
•
-
,
-1
It has been shown that conditional on T, !o Eo !o is asymptotically chi-squared with K-l degrees of freedom under Ho '
It can be
57
easily shown that, since the distribution is chi-squared (i.e., depends on the degrees of freedom and nothing more), it does not depend
on the conditioning factor, and is, hence, the same as the unconditional distribution.
1 - -
Corollary 2.4:
defined above.
-
Let !o = In (Tl-T, T2-T, ... ,TK_l-T), and
Then
= -0
t
0*
l
~o
be as
I- l t
-0
-0
has asymptotically a null distribution which is
x2 (K-l).
Proof:
·e
Conditional on T, 0* is asymptotically X2 (K-l), by the previous theorem.
Hence,
the characteristic function of D*
$0
. *
E[e
1tD
If]
+
_ (K-l)
(1-2it)
2
_ (K- 1)
+
(1-2it)
and
2
_ (K-l))
+
(1-2it)
2
and the result follows.
It has just been demonstrated that 0*
= -0
t'
I- l t
-0
-0
has asymptoti-
cally a null distribution which is chi-squared with K-l degrees of
l t , the AMP sign statistic for Kfreedom. To express 0* = -0
t' I_0 -0
samples, explicitly, it is necessary to invert I.
-0
sidered in a later section.
This will be con-
First, the limiting distribution for 0*
under certain types of alternatives is considered.
58
E.
The Limiting Distribution of D* Under Translation-Type Alternatives
In this section, the asymptotic distribution of D* is investi-
gated, assuming a sequence of translation-type alternative hypotheses,
Hn , for n = 1,2, ... ,.
The hypothesis Hn specifies that
F.{ylx)
= F{y-n-~e·lx),
,
1
~
, ,
where 8.;e.
as n -+
I
~
for some ifi
I,
i=1,2, ... ,K,
The limiting distribution is then found
00.
n.
--'
= a, exists and O<a.<l,
For each index n, assume lim
n--;= n ,
,
and the truth ofH n. If F possesses a continuous derivation f, and
there exists a function g such that
Theorem 2.5:
2.5.1
and
r)g{YI~) f{y'~D dYI ~
<
00
,
2.5.2
_00
then, for n -+
00,
the statistic D* has a limiting noncentral
x2 distri-
bution with K-l degrees of freedom, and noncentrality parameter
Although AD* appears to depend on ~, by Assumption A.4, it is
invariant to the choice of any particular value of~, and hence does
(Note:
not depend on X.)
e·
59
Proof:
Let Y~;)
=
E[T;(o)IHnJ
=
E{T.[(V .. ,X .. ),(V 2 · ,X 2 · ), ... ,(V k · ,X k · )]/H}
1
1J1 -lJ1
J 2 - J2
Jk - Jk
n
E{ ; I=1 sgn(V.,.
- v.. ) ~
1 J; 1 1J 1 c
=
Y ( i)
n
= I
E -
1
c c
E
I
2
c ;'=1
; '1;
E
-1
c
I'
[pr{v 1. J;. > v..1J; IX•_1 ,.J;. = x..
=V }
_lJ; -c
I'
- pr{v 1J.
.. > V.,1 J'I
. Ix..
-lJ. = X.
_1
.
r{x.- 1 J;, = X..
= V} }
- 1J; -c
1
1
= V}
J.,
-C
1
I'
1
1
-
.e
=
I
;
=
(2
;'=1
1/1.
1
=2
1 J;I
< V..
IX. ,.
1J; _1 J;I
= X.. ) - 1)
-lJ;
'~;
2
rIo
;1=1
-;
=
Pr( V. I '
0
~);
Pr (V • 1 • <
1 J;
v..1J; IX•-1 J;' = x··)1
- (K- 1)
_lJ;
I'
'~;
,
say
Ix
·
v..
lJ;
I
{pr(Yl . < V..
Jl
Pr (V · <
KJ
K
1J; - 1Jl
= X..
) +
-lJ;
X · = X.. ) }
- KJ K -lJ;
Pr(V 2 · < V..
J2
-
IX ·
1J; - 2J2
( K- 1) .
= X.. ) + ... +
_lJ;
60
Ix..
By Assumption A.4, Pr(Y .. < Y. 1 •
= X· • ) is the same for all
1Ji
1 J i I _1Ji _1 I Ji I
X, hence, does not depend on X. Choose X = x, arbitrary but fixed.
-
-
rt
_00
•
... +
-co
e.
Under H ,
n
F.(ylx)
1
-
=
F(y - e./mlx)
1-
Thus,
~i
=2 {
I: F(Yi-
+ .,. +
Change of variables:
e,/1n
1:
I~) dF(YI~)
F(Yi- ek/ln
Let y = Yi dy = dy.
1
+
1:
F(Yi- e211n
I~) dF(YI~)} e;lm
(K-l) .
I~) dF(yl~)
61
Wi =
2.{ [
+
F(y + 8/1i1
- 8/1i1
foo F(y + 8i /1in - 82/1in
I~) dF(yr~)
I~) dF(yl~)
+ ... +
_00
Write:
w;(t) = 2 • {
+
I:
~
F(y+
F(y +
(8i-81)·tl~) dF(yl~)
(8i-82)·tl~) dF(yl~)
+ ... +
_00
+[ F(y +
(8 i -8 k)
tl~) dF(YI~)} - (K-l).
Expand ~~t) in a Maclaurin series in t.
0oo(since this corresponds to
f F(yl~)
dF(yl~)
and Fi(YI~)
The constant term is clearly
~~O) = foo F(yl~) dF(yl~) + ... +
-~
1
= F(yl~) for all ;, Pr[Yi<Yil~) = 2)·
Fo r the 1i near term:
_00
+
_00
_00
(The above derivative exists, and "differentiation under the integral
sign is permitted by Assumptions 2.5.1 and 2.5.2.)
ll
62
-(X)
-(X)
and
lJLw~t)1
2 dt ,
t=O
=
i
~ 1(e.,
- e,.,) fooF'(YI~) dF(yl~).
- -
I';
_(X)
Let
A =
rF'(YI~) dF(YI~)
_(X)
and
k
0,' =
I
e.
(e, - e,")'
i '=1
'
Then
But
Wi
(JJ = Wi
thus. under Hn
By a similar argument
L
= I_ +
_m
O(n-~)
-
where ~m is the covariance matrix of In
is a matrix whose elements are
1
O(n-~).
(! - Yn)
as n
+
00. and ~(n-~)
It follows from Lemma 2.2 that.
63
under hypothesis Hnt
has zero mean vector and covariance matrix
is asymptotically normally distributed.
~
(asymptoticallY)t and
ConsequentlYt
;:.1)
In (T -
-
-
has a limiting normal distribution with mean vector
In [1-
° A]
rn~
.e
2 M .
Note that ~ 0i = O.
1
_
I
And let ~o = (01,02t ... tOk_l).
Then from Lemn~
2.2, In (!n - r .!) unjer hypothesis Hn is asymptotically distributed
as a chi-squar=d with K-l degrees of freedom t and noncentrality
parameter
1 is explicitly derived.
~o
O* may be simplified if 2:-
A
F.
The AMP Sign Statistic for K-samples
A Special Case:
nl =n 2=.•. =n k
In this section t the AMP sign statistic for K-samples is presented
for the special case
statistic for
~en
K-sampL~s
the sample sizes are equal.
is given by
= Eo
I
0*
Eo-1 Eo .
The AMP sign
64
Note that nl =n 2=.... =n K only if a l =a 2=... =a K=
will be useful:
t.
The following lemma
Let the mxm matrix C be defined by
Lemma 2.3:
C b (q- t) I + tJ .
~
~
Then C has an inverse if and only if qft and qf-(m-l)t.
If C- l exists,
then it is given by
-1
1
C = q_t
Proof:
1.
I- l
[! - q
t
+ ( m-l) t ~ J.
See Graybill (1967, pp. 173,174).
e.
is derived.
~o
Recall,
°
rr
\,-1
fa = ((ors))' where
= 13 [
1
I a. -
i =1
1
1
-1
I
ai
1 1=1
°rs ="3 [ .
2
+ (K - 2K) ar- 1J '
·k
I
-
K(a r
-1
+ as
-1 ] , r, s =1, 2, . . . , K- 1; rf s
)
So, in this case, ai - l = K for all i.
J
orr = 31 [k
.1 K + (K 2-2K)K
1 =1
=13
[K 2 + K3 _ 2K 2J
1 2
0rr=3K (K-l)
and
r= 1 ,2, ... , K-l
, r=1,2, .. .,K-l
65
°rs = 13 [i=lI K -
Drs
1
= -
K( K+K)]
1 2
K
r= 1,2, ... ,K- 1
3"
2
Hence, ko = 3" K [K!K_1 - ~K-1J
It follows from Lemma 2.3 that
-1
(K-1) + (K-2)(-1)
\,-1
L
~o
.e
J
3 [
=:3
K ~K-1 + ~K-1J
2. Statistic D* is expressed explicitly.
.
Expand1ng
1-1
~o ~o
!o' where
yie1 ds
t
l
1:- 1 t
~o ~o ~o
=
=
n
33
K
{k.f(T.-Tf+
1
rk.i
(T ._T)]2}
L=l
1=1
1
k-1
._T)2 +. (-T K+T)2 }
3n
I (T
3 {. 1 1
1=
K
i
1
l
~K-1_
66
3.
The limiting distribution of 0* under the null hypothesis.
The following result has been established.
Corollary 2.5:
Let D* be the AMP sign statistic for K-samples with
nl =n 2=... =n k . Then,
Under the hypothesis of [conditional] homogeneity, 0* is asymptotically
chi-squared with K-l degrees of freedom.
4.
The limiting distribution of 0* under translation-type alternatives.
As in Section E, assume a sequence of translation-type alternative
hypotheses, Hn , for n=1,2, ... ,
where
e.
Hn : F.(ylx)
= F(y-8.jln
Ix) ,
1
~
1Corollary 2.6:
derivative F ' l
I
i=l ,2, ... ,K
Assume the truth of Hn " If F possesses a continuous
and there exists a function g such that
[F'(Y+hlx) - F'(yl~)]jh
I
< g(yl~)
and
_00
then for n +
00,
the statistic 0* has a limiting noncentral chi-squared
distribution with K-l degrees of freedom, and noncentrality parameter
AD*
= 12 •
t i~1(ei-e)2 {JooF'(YI~)
dF(YI~)}2
_00
Proof:
The result follows directly from Theorem 4.2, simplifying the noncentrality
67
parameter.
, since L:0i=O
3
K
= j(3
i~l
2
°i
K
e= L
Let
i =1
K
o~
i =1 1
I
Then
.e
e/K.
K.
i=l
Hence
Ao*
=
K
K
1 i'=1
i=l
)2 = K2 I (8._8)2
= I (Ke. - L 8.•
1
4'A'
1
{K§ .K2 ;11(Bi -S)2}
= 40A2
{lK i;;;l~ (e._8)2}
AO* = 12 •
t ·J/B;-S)2 {
1
r
F'(YI~)
dF(Yld
_00
5. The AMP sign statistic with estimated weights.
Assuming that the ECIS are known, the AMP analysis for K-samples
is based on
n1
nk { k
T. = (II n.) - 1 .. ! ... L I sgn (Y . •. -Y.. HE- 1 I{X... =X .. =V }}]
- 1 J i' - 1 J i -c
j =1 i'=l
1 Ji l 1Ji e e
1
i 1 [J. =1
1
k
Let we
= E~1.
Then wc's are weights.
If the EelS are unknown, then we's
are estimated by ~e = ~~1 where ~e = meln, and
68
,
k
k
n.
~l
~k
I
mc = I m. = I
!
. 1 1C i =1 j .=1
1=
1
Note
T1.= IWc[(II.n,.)-'{.
....
(
sgn(Y.,. _Y .. ) I{X.,. = X.. =V }}}.
C
1
J =1
J =1 i'~l
1 Ji' 1J i
~l Ji' ~1Ji ~c
1
k
I
The quantity in brackets is a K-sample U-statistic, say Tc
And
C
T.1 = 'i w T .
c=l C C,1
Defi ne
C
,
T.1 = I wcTc,i
c=l
I
A
We may invoke Slutsky's theorem (1925) to show that the statistic Ti
has the same limiting power as Ti . Consider
C
J
= I £-l.T' . In
o c=l C C,1
By the central limit theorem for U-statistics, J o is asymptotically
normal with mean zero and finite variance, 0 2 • The test based on Ti
rejects if J o exceeds oZ*, where z* is the appropriate normal critical
value. If £c is substituted for £c in J o '
and the test based on Ti rejects if J o exceeds oZ*.
J
J
o
-
o = In (T i
C
=
I rn
c=l
~
T)
(~c -~Jc
,
• Tc, i
Now consider
e.
69
.
e
=
I
C
I In
c=l
=
As
C
I
T .
(E C - ~ c ). C,l
EC EC
A
. dnc
Cnc
c=l
, say.
00, C
nc is asymptotically normal with mean zero and variance
Ec (l-E c )' But as n ~ 00, the sequence of random variables dnc converges
to ~i under the fixed alternative or to under the contiguous a1ternan~
tive ~.(n) =
tC+ a/In.
t
C
I C d /w. has the
c=l nc nc '1
tribution as I Cnc by Slutsky's theorem (1925).
c=l
~
identically zero. Hence the tests based on Ti and
1
.e
Hence
same limiting disNote that
Ie
is
c nc
Ti have the same
limiting power .
6.
The relationship between the AMP sign test and the Kruska1-Wa11is
test.
K'
For the special case where n1=n 2=... =n K =
if all pairs are
matched (i.e., if C=l), it is easy to demonstrate that the AMP sign
test is equivalent to the Kruska1-Wa11is test.
Recall from Chapter I that a useful expression for the Kruska1Wallis statistic is
_
1
H - n(n+1)
n '
t
where
1
n
I
U•t =
k
I
sg n(Y •• - Yt' ).
1Ji
Jt
1
Since
H
=
2
3K
2
i
j =1 j.=l
t
1
.2 n.-1 nt Uit ]
1 =1
k
>-
\
[L Uit ]
n (n+1) i=l t
2
.
70
If C=l, then £:c=l, and
and
Ti
= (TIn i ) -1 Wi
U.
U..
'I
T.=~ 11
1
i 1 ni n;
=I
I
;
I
11
(n/k)
I
K2
2=-2
IU 11.,\
nil
Clearly IT;=O (for this case).
Hence, the AMP sign statistic is
e.
3K
=-3
n
n~l D*
G.
= H,
k
• I
1
I 1 [IU",J
., 11
=
2
1
and they are asymptotically equal.
Chapter Summary and Concluding Remarks
In this chapter, a new method of matched analysis of covariance
is advanced for the K-sample problem, called the All Matched Pairs
Sign Test for K-samples.
The test statistic is
D* = t
I
L:- l t ,
-0 -0
-0
71
where
!o = In (T i
-
T),
= w.jrr
n.
, i 1
T,.
I
h(i)[e] =
sgn(Y.,. - Y.. ) l.S-l I{X .. = x · =... =x . } ,
k
2
il=l
lJi l lJk CC
,-lJi -J2
-Jk
.e
and
~o
is the covariance matrix corresponding to !o' and is given by
(( °rs))
1
orr = 3"
[~. L
, =1
2
l
ai-1 + (K - 2K) a-r ] '
r= 1 , 2, • •• , K- 1
where ni = na .
i
It is demonstrated that 0* is asymptotically distributed as a
chi-squared random variable under the null hypothesis and translationtype alternatives.
The noncentrality parameter under a sequence of
alternatives
H F.(Ylx)
= F(y-n-~ a.lx),
, _
n 1 -
i=1,2, ... ,K
is given by
AO*
~ 4 ~~ ~~l ~o [fF'(YI~) dF(YI~)J2
-co
72
where
for
In the special case where the sample sizes are equal, D* is
simpl ified to
and the noncentrality parameter is
AD* = 12. 1.
Ki
The
I
'=1
(8,,-8)2
{rF'(YI~) dF(YI~)}2
_00
form of the test statistic for general sample sizes is not ex-
pressed explicitly due to difficulties involved in inverting a rather
complicated covariance matrix.
However, if numerical methods are
permitted, given data, the test statistic and noncentrality parameter
can be derived.
A1though the demonstration and di scuss i on are postponed until
Chapter IV, it may be noted here that the ARE of the AMP sign test
for K-samples and the Kruskal-Wallis test is unity, when Y is independent of X.
Examples and efficiency results are reported in Chapter IV.
e.
CHAPTER I II
THE EXTENSION BASED ON BHAPKAR'S TEST
In this chapter a method of Analysis of Covariance (ANOCOVA) by
matching is presented that is an extension of Schoenfelder's All
Matched Pairs (AMP) (two-sample) sign test and Bhapkar's V-test (nonparametric analysis of variance). The test to be proposed, called
.e
the All Matched
K-tuple~
(AMK)
si~n
test, requires the analysis of
matchedK-tuples.
A matched k-tuple is a vector of responses (Y lj ,
1
Y2j ' ... 'Y kj ) formed by selecting one observation from each sample
2
k
such that the corresponding covariate vectors are matched.
Definition 3.1:
A.
The Basic Framework and Notation
The basic framework and notation for this chapter are the same
as for Chapter I I.
The reader is referred back· to Chapter I I,
Section A, for Assumptions A.l - A.3.
B.
Parameter Definitions
The parameters describing probabilities of certain rankings among
observations must be defined conditionally and unconditionally.
74
Definition 3.2:
Let ¢i denote the [unconditional] probability that
of K randomly selected observations, one from each sample, the observation from population i is least on Y.
(NoteL:¢i = 1.)
Let n.(V
) denote the conditional probability that,
1 ~c
of K randomly selected observations, one from each sample, that are
Defini tion 3.3:
matched so that X
=V
for each observation, the observation from
~
~c
po~ulation
i is least on Y.
Defini tion 3.4:
Let . 1jJ.1 denote the [matched] probabi 1i ty, that of K
randomly selected observations, one from each population, that are
matched on X, the observation from population i is least on Y.
Definition 3.5:
Let Sm denote the probability that m > 1 randomly
selected observations match on X, as in Chapter II.
In this chapter,
for the special case when m = K, the sUbscript K will be dropped:
Proposition 3.1:
Let X be a discrete random variable with possible
c=l ,2, ... ,C with s c = Pr{Xc=V
values V
~
- ~c }, s c=0 for all
c' .
Then for m > 1 independent observations on X, S = L:s m
m c c
Proof:
Obvious by inspection.
Proposition 3.2:
Let n.(V
) and
1 ~c
1jJ.
1
be as defined above.
Then
Proof:
Let (~ljl'Yljl)~~2j2'Y2j2)""'(~kjk'Ykjk) be sample observa-
tions.
Then from Proposition 1,
e.
75
P(match of the K observations) = 8
= P(X l · = X2 · = ... = Xk , = V ) /P(X l ' = X2 , = ... = Xk .
- J l - J2
- J k -c
- J l - J2
- Jk
= P(X l , = V ).P(X 2 · = V ) . . . . . P(X kJ, = V )/P(X l · =X 2 , =" ,=X . )
- J l -C
- J2 -c
- k -c
- Jl - J2
- kJ k
= (tc)k/ S
.
Hence, it follows that
1J!i = pr{v i < Vii for i ' =1,2, .... k, i'fikljl=
~2j2="'= ~kjk}
= I P~V,< V'I for i'rilxl, =.,,= Xk · = V }
c '\ 1
1
- J1
- J k -c
.e
•
C.
P~~ljl='''= ~kjk= ~Ckljl =... = ~kjJ
Number of Analysis Units
The extension of Schoenfelder's AMP sign test presented in this
chapter is an analysis of AMK, i.e.,
all (Y" , V2 ' , ... , Vk ') where V"
= (X,. ,V,.), such
- J, - J2
- Jk
-lJi
-lJi lJi
that X1J' , X2 " , ... , X " match. Consequently, the number of units
- 1 - J2
- kJ k
of analysis (matched K-tuples) is a random variable, say
n
2
I
n
l
I
j2=1 j,=l
where
M(X 1J, 'X 2J' "",X k- 1J'
,X ,)
- 1- 2
(k-l) - kJ k
(4.2)
76
. {l
M(X l · , ..• , J(k' ) =
~ Jl
~Jk
if Xl' , ... ,X k · match
- Jl
- Jk
o else
If all K-tuples of observations match, then obviously the number of
k
analysis units is IT n..
i =1
In general, however, this is not expected.
1
Hence, it is desirable to derive E(A) and Var(A).
Let A be the number of matched K-tuplesfrom K indepen-
Theorem 3.1:
dent random samples of sizes ni , n = Ln i , which satisfy Assumptions
A~l - A.3.
Then
i)
E(A)
= S
. 1 1
1=
e.
(~n.).[
I
i=ll
i'=l
Va r(A) =
ii)
k
II n.
k
+
I
i '<i II
i~l(ni-l) {S2K-2 - s'} +
i fi I
ifi"
+
.I
1 =1
(n i -1) {(3K+l - (32}
Proof:
nk
nl
E(A) = E [ I .. I M(X l · , ... ,xkJ·l
~ Jl
~ kJ
J.k =1 j 1=1
= nk· ... ·n l E(M(·, ...• ·))
= (3
k
II n .
;=1 i
]
1J
+ [6 - (32 J
77
( k }-l
Let B = ,n ni
A. Then note that B is a K-sample U-statistic with
1=1
kernel M(-, .. .,-) for estimating the parameter 13, the probability
that K randomly selected observations match on X.
it is required to demonstrate that E(M(_, ... ,_))2
To invoke Lemma 1,
<
00.
However,
since Mis bounded between 0 and 1, it is obvious that E[M(_, ... ,_)]2
<
00.
The conditions of Lemma 1 are met, and it is now invoked:
.e
Hence, derive the
~l , 0 ,
°, ... , 0
=
~d
d
d .
0' 2'"'' k
Ej-M(X
, )-M(X~ l J, ,x- 2J2
",·",X k ,,)]
_ ~ l Jl· ,X~ 2J,2, .. "X~ kJk
~ Jk
l
- {E[M( - , ... , -
)J}
2
= E[1 {Xl J' = X2 ' = X3 , =... =X kJ, } ~ 1 ~ J2 ~ J3
- k
= P~2K-l randomly selected observations match on ~}-
=
13 2K - l - 13
2
•
Similarly, z;;0,0, .. "0,1,0",,,0 = S2K-l - 13 2 for all i=1,2, ... ,K.
(1 at ith position)
62
78
, •.•. ,X kJ· )ot~(X1' ,X 2 · ,X3J",···,XkJ·')]
, , ,... ,0= ErLM(X-1J· 1 ,X- 2J· 2 ,X,:;,
-~.J 3
- k
- J1 - J2 - 3
- k
(110
- [E(M(o , ... ,o))J 2
Similarly, l;
0, ... ,0,1,0, ... ,0,1,0, ... ,0
= S2K-2 - S2 for all iri',
i,i ' =1,2, ... ,K.
(l·s in positions i and ii,
ifi I)
S2
[111
101
' " , ... "
" ... , 1=SK1+
(0 in ith position)
foralli=1,2, ... ,K
S1,1, ... ,1 = S - S2
Hence,
S
d1 ,d 2 ,· .. ,d k
= S
k - S2 ,
2k - L: d.
i =1 '
(Note that when d1 =d 2=···=d k=0,
for at least one dirO.
l;d ,d , ..• ,d
k
1 2
= 0.)
Then, from Lemma 2.1
Var(B) =
=
[i~lnr[d)O"·dJO i~l (~~::j 'dl'd A]
2 •..
(~i=l'n.j-1[ ki=l'
rr 1(n._n
k
+ irr=(n.-1)
l'
So , a, ... , 100
, ,
l;
0,0, ... ,0,1
~
i=l
ilk-l
(n.-1)l;
k
1
0,0, ....0,1,0
k
it2
+ IT (n.-l) l;
'
+•.. + i=l
IT (n,.-l) s 1
a
0, ,Op,,,
ilk-2
i=2
+
k-2
+ IT (n.-l)
i=l 1
1,0, ... ,0
s0,0, ... ,0,1,1
e.
79
k
+
IT
i=l
i;k-l
i;k-2
(n -1) (,0 0
1 1, 0 +... + (nk-l) (,1 ,1, · · · , 1s 0 + ...
i
" ... "
+("1-1) '0.1.... ,1 +'1,1,....1]
=
[.~1=1 ni ]-l f.1'=1I 1=1.~
-
i;i
(n i -1) [S2K-l-
(32]
+
I
i'<i"
l
Hence,
~ n.] { I
Var(A) = [
.e
1
i=1
i'=l
k
+ ... +
I (n.-1)
. 1 1
1=
D. A Class of
Te~t
Statistics
u~
Let
[SK+1 - S2J +~ - S2J}
1
=
(nn.) -1 V~
. 1
1
1
~ I{V 1Ji
.. <V.,.}
1 Ji'
where
i'=l
i ';;
=
~i ~ {~r
I {V..
s=l
1Ji
ji=l r;i
<
V
r
s} }.
Then U~ is a K-sample U-statistic for estimating ~i' with kernel
k
h{Vljl,V2j2""'Ykjk) =
i'~l
i ';i
I.{V 1J..
<
i
V.,.}
1 Ji'
80
and
E[h(·)] = E[V .. < V·I· , for all i =1,2, ... ,K, i'ri] = <Pl·'
1Ji 1 Ji '
l
Bhapkar's V-test statistic is
where a.=n./n,
as defined before.
1 1
A class of test statistics is advanced that is motivated by
Bhapkar's V-test, but that
v(1c)
2
j
~i=1
1
IT
rli
incorporates~.
[~r I{V 1J..
1S~1
t
< Y } I
rs
i
Let
{x~rs = X..
= V }l .
~lJi ~cJ~
Then V~c) represents the number of K-tuples for which the observation
1
from sample i is least, calculated only for those observations whose
covariables assume the value V.
~c
V. =
1
Let
c
I b
v~ c)
c=l c 1
U.=V./(ITn.)
1
1 i 1
where the be's are arbitrary weights, presumably functions of the ECIS.
Then consider a class of test statistics G whose members are representable as
G=
where
k~ a. (U. - -U) 2- [kLa. (U. - -U)J2
i=l
1 1
f=l
-
1
1
1
U=-KIU
..
.
1
1
Members of the class will be considered relative to how the properties
e.
•
81
of the test (i.e., efficiency) depend on the weights (the bc's).
The
AMK sign statistic is a member of G, mu1tipl ied by an appropriate constant so that its asymptotic distribution is chi-squared.
G*
E.
= (constant) .[.Ik ai(U i - -U) 2, =1
{k.t ai(U - -U) }2 .
i
J
,'=1
The AMK Sign Test Statistic
At this point, another assumption is imposed on the framework.
In addition to Assumptions A.l-A.3, Assumption A.4 is now made.
Assumption A.4:
probability.
Definition 3.6:
The K populations are parallel in Bhapkar
The K populations are said to be parallel in Bhapkar
,
probability if the conditional probability n.(V-c ) is independent of
the value of the covariable; i.e., ni(!c) is independent of c.
It follows that, with Assumption A.4 imposed, the conditional
probabilities are identical to the matched probabilities.
Let y., = L bc skc n,'(V),
-c
c
of (conditional) homogeneity,
Definition 3.7:
- K
1 '\L bcSc'
k
yi =
c
,.
Then under the hypothesis
K
= 1
, 2
, •••
,
•
With the above defined parameter, the null hypothesis of interest
may be expressed as
Note that Vi may be expressed as
82
nl
Vi =
n2
nk
jl=l j2=1
jk=l
I
~
I· ... I
c
b r
g( i ) (Y l' ,Y 2' , ... , Ykj)
Jl
{~ljl = ~2j2=
... =
J2
k
x
~kjk= ~c}
where
(i) (Y · , •.•• ,Y · ) =
9
l
k
Jl
( i.e"
Jk
l
if y, J' < Y.
1,
1
{
a
1
I
fo r all i 1=1 ,2, . , . , K,
'
J'
1
1
' I .j.'
1 r1
else
g( i ) (Y 1J' , ... , Yk' ) = I{Y ., = mi n( Yl' ,.,., Yk' )}}
1
Jk
'Ji
Jl
Jk
So,
_
V" -
n,
j
n2
I
nk (i)
I .... I
=1 j =1
j =1
12k
wh ere h( i) [
0
]
=
h
[(X'J' 'Y'J' ),(X 2J, ,Y 2 · ), .. ,,(X k , ,Y kJ,
~ 1
1
~ 2
J2
~ Jk
k
9( i ) (Y l' ".., Yk' ) I b
J1
Jk c c
r{x 1J1.
~
=." =
Xk' = Vc} ,
~ Jk ~
Then Ui is a K-samp1e U-statistic of degree (1,1"",1) for estimating
'( . ,
1
I
~n = (U 1 ,U 2, .. .,U k )
Let
yl
and
(Under Ho'
Y
= 1
~
lXK
And let ~
= ((ors))
= ('(1 'Y2""
'Yk)'
1 1. bE))
k
°K(
• C c
c
be the asymptotic covariance matrix of
In
An expression for the asymptotic null covariance matrix of
In
(~n-r)
is required, and is given in the following theorem.
(~n~r).
)l
-l
83
Let
Theorem 3.2:
~n
be the vector of K-sample U-statistics and y
the vector of expected values, which is !o~
bcc~) under the null
c
Then, the asymptotic covariance matrix of ;n(~n-Y) is
hypothesis.
f = ((ors))
°
(I
given by:
2 2K 1
1 [ -1 2 k -lJ
(1
k] 2 k -1
= I b E - ° 2K-l a r + -K .I a,'
- -K I bCE C ' I a,'
rr c c c
, =1
c , '=1
, i,r
k'
2 k
\' b2 E2K-l ° K(2K-l)
1 [ 2. \'L a -1 +(K-2 ) a-1] - (1K \'L bcE kl . \'L a -1
c
r
C
c),'=1 i .
,1=1 i
c
c
= L
-_ Ic
2
1 [-1 -1
2 2K-1
k -lJ (1
k] 2 k_ 1
bc EC
°K(2K 1) a +a s +2 La,. - -K l. b E lI a.
r
i'=l
c c c i =l '
i,r,s
.e
(Orr>
° 'tJr, 0rs < 0,
r's.)
Proof:
From Lemma 2.1, the asymptotic covariance matrix of In
(~n-Y)
is given
by ((ors)) where
k
°rs
.
Hence,
=il~l
ail l';o,O, ... ,l,O, .. ~:os)
(1 at ith position)
th e 1';0,0, ... ,1, .•. (r,s)
' d
,0
are d
er,ve.
, r,s = 1,2, ... ,K.
84
(1) _
(l)
(1)
sl 0 0
0 - E[h
(Y 1J· ,Y 2 · , ••• ,Y kJ, )·h
(Y 1J· 'Y 2J"""'Yk")]
, , , ..• ,
1
J2
k
1
2
Jk
= E[9(l)(Yljl'Y2j2""'Ykjk)
g(2)(Yljl'Y2j2""'Ykjk)
= I b2 E2K - 1 •
e
c
C
~
be I
{~1jl= ~2j2="'= ~kjk= ~e}·
~,bc,I {~ljl= ~2j2="'= ~kjk= ~c}J - y~
l • I bCEkc} 2
_1__ (-K
2K-l.
e
Similarly,
cO,O, ... ,O,l,O, •..
~~) = ~ b~ £~K-l
• 2J-l -
[k ~ be £~J2
(1 at position r)
(1) _
(l)
1';;010
0 - E[h
(Y l · ,Y 2 , ,Y 3 · , ••. ,Y k · ) •
, , , ••• ,
Jl
J2
J3
Jk
Ib
I{X l ' =X 2 , =X 3 , =... =X k , = V } •
= E[g(1)(y , 'Y ' 'Y3' , ... ,Y · )
kJk c c
1Jl 2J2
J3
~ Jl ~ J2 ~ J3
~ Jk
~c
- [y ]
1
2
85
Simil ar1y,
l;
2
(1
(r) = l. b2 c 2K - 1
• K(2K-l) - K
0,0, ' , , ,1 , ' , , ,0
c c c
>.
C
k}2
bc EC
(1 at any position
except r)
= E[9(l)(y l , ,Y 2 , "",Y k ,) l. bc I {Xl' =X 2J, =",=X k , =V c}·
J2
Jk c
Jl
- Jl - 2
- Jk • g(2)(y 1 , ,Y 2 ,."",Y k ,,) l. b ,I {Xl' =X 2 ,.=",=X k ,.= v.}] - Y1Y2
Jl
J2
J k c' c
- Jl - J 2
- J k -c
Similarly,
l; 0,0, ..
°
,,1, , ... ,0
at position
or s)
(1
l;0
(r,s)
2
= \'L
C
bc2 E 2K - 1 • K(2K-l)
1
- (1K \'L b· C E k)
C
C
C
r
(1 ,2) _
(l)
•
°
10
°
E[h
(Y 1 , 'Y 2J' 'Y 3J' 'Y 4 '· ""'Y kJ' )
, , , "",
Jl
2
3 J4
k
86
Simil arly,
( r,s)
l,;o , a, ••• , a, 1 , a, ••• , a
= \'L
C
(1 \'
b2C Ec2K - 1 K(2K-l)
2
_ K I. bc Ek)
2
C
C
(1 at any position
except r or s)
k
So
i1
.
-1
(r)
a · <:0,0, ... ,0,1,0,: .. ,0
1 i
(1 at rth place)
orr
=
°rr
= a-1 • (;
, r
k
r
(r)
a,... ,1 , ... , a
(1 at rth place)
+
it
ail
+
.
I a: 1
i=l
iFr
= 1,2, ... ,K
'
<:'
(r)
0, .. .,1, ... ,0
(1 at ith place)
Ub~ £~K-l. K(2~-1) - ~~ be £~n
iFr
°rr =. Ic
b
I J-
2 2K l
1
1
1 2
a:
E - • 2K _l [a-r +-K
c C
i =l '
iFr
and
°rs
I
(lK I b sk}2
a-,.l
c c c
i =1
k
=i La:'
=l '
I
010 (r,so) , r,s = ',2, ... ,K, rt-s
, . . ., , , ,...,
(1 at ith place)
l;0
87
+
~L
i=l
i,r,s
0
rs
=
I
b2 2K-1
-1
ai
U
b2 £2 K-l
l
C
2
1
K(2K-l)
c c £c
(1 \' b k) 2 ]
- ,I L c £c
c
K(2K~1)
C
+
2 .I1 .a: 1]
1=
1
i;l:r ,s
-
It is easy to see that conditional on U, the Ui 's are subject to
a single linear constraint, and their distribution is singular, so
their asymptotic distribution is also singular, and I is singular.
-e
In
fact, Rank(I)=K-l.
A judicious restriction is now imposed on the bc's. Henceforth,
weights are arbitrary subject to the condition that
\' b22K- 1
L c £c
c
= (\'L
c
b £k,2
c cl .
J
With this restriction on the be's, the following result may be ob, served:
=
>. b2 2K-1
~ c£c
1
K(2K-l)
{I
s=l
st-r
[2
Ia:
i=l
1
[~-1
= c\'L be2 £c2K-l {K(2K-1)
2 L a.
i=l 1
1
l_
-1 -lJ
a r - as
+ (K-2)a-
r
1
k -l}
1 I a.
- -2
K i=l 1
+ (K-l)·2·
,
1.k a: 1 - (K-l)a- 1
i ''';1
1
r
88
k
I
s=l
s1r
a
I
-lJ
s
a: 1}
i =1
'
k
I
s=l
s1r
1
{ K{2K-l)
b2 E2K - l
c c c
=
I
=
o.
[
k
2K i!la~
a
-1]
s
-l}
_1 ~L. a,.
K . 1
,=
k
I a: lJ
1
,=. 1
'
{lK i=l'
I a:' _1K i";lI a-,.l}
Hence, subject to
~ b~ E~K-l
~
=
(~
I =0
bc
E~)
,
.
lxK
,
Consequently, any K-l of the U.'s may be selected for a nonsingular
vector.
and let Lo be the corresponding covariance matrix.
Then Io is of
full rank, and its inverse exists.
Corollary 3.1:
Let ~o be as defined above, and let ~o be the corres-
ponding asymptotic covariance matrix. Then
G* :: d ' I- l d
~o
~o
~o
has asymptotically a null distribution which is X(K-l)'
89
Proof:
Using arguments parallel to those employed in Chapter II to derive the
asymptotic null distribution of D*, it is easy to show that G* is also
2
X(K-l)'
The details are omitted.
It is established that G* = ~~ ~~l ~o has a null asymptotic distribution that is X(K-l)'
To demonstrate that this quadratic form is
indeed the AMK test statistic requires that
~o
be inverted.
In general,
this is a very difficult task, hence, one special case will be considered:
the case of equal sample sizes.
However, prior to consider-
ing the special case, the limiting distribution of G* under certain
alternative hypotheses is considered.
F.
The Limiting Distribution of G* Under Translation-Type Alternatives
The purpose of this section is to investigate the distribution of
G* assuming a sequence of translation-type alternative hypotheses, Hn ,
for n=1,2, ... , .... The hypothesis Hn specifies that
F.(ylx) = F(y-n-~ e.lx),
, -
where ei F6 i , for some i'Fi.
as n + 00.
,-
i=1,2, ... ,K,
The limiting distribution is, then, found
n.
For each index n assume ~ ~ = a i exists and O<ai<l ,
and the truth of Hn. If F possesses a continuous derivative f, and
Theorem 3.3:
there exists a function 9 such that
3.3.1
90
and
foog(YI~) f(YI~) dYI~<
3.3.2
00,
_00
n~,
then for
the statistic G* has a limiting noncentra1 X2 distri-
bution with K-1 degrees of freedom, and noncentra1ity parameter
AG*
= (~
U[l-F(Y[~)
be c~J 2 ~~~~ \
jK-2 f2(y
I~)
dy
[J
_00
Although AG* appears to depend on
(Note:
~,
by Assumption A.4 it is
invariant to the choice of any particular value of X, and hence does
not depe: Id on X.)
Proof:
e.
Let y(i)
n
= E[U.(.)IH ]
n
1
= E[U,[(Y ,
.
,
'Xl' ),(Y 2 , ,X 2 , ), ... ,(Y k · ,X k · )]IH ]
J2 - J2
Jk - Jk
n
1J1 - J1
= E[g(i)(y 1 , 'Y 2 ' ,···,y kJ·) Ib I{X 1 ' =X 2J' =... =X kJ· =v c}]
k c c - J1 - 2
- kJ 1 J2
where (recall)
1 if Y.. <Y. I ' , i 1=1,2, ... ,K,
( )
=
lJi' J i , i'fi
9 , (Y · , .. , Y · )
kJk
1J1
{
.
a else
= Ib (k • p.Jy ,. <Y.
ccc
'~'Ji 'Ji
I
•
,
l
i'=1,2, ... ,K, iltilx1' =X 2 , ="'=X k ' }
-J1- J 2
-Jk
(i)
k
Yn
= Ib ( .~.
c c C
1
Then,
t./J'
,
= PrIY" < Y1 "
\ ' Ji
y., < Y2 •
Jl ' Ji
all
J2
,.,., Y..
ex~ept
1J i
i
< Yk , Ix = x]
Jk, -
-
91
By Assumption A.4, Wi is the same for all X, hence arbitrarily choose
~=~,
arbitrary but fixed.
Wi=
~ foo foo... ~fl(Yl I~) f2(Y21~) ... fk(Ykl~)·fi(Yil~)dY11~dY2!~
_ooy i
Yi
Yi
... dYk I~
=~
[1-Fl(Yil~)][1-F2(Yil~)] ... [l-Fk(Yil~)] fi(Yil~) dYil~
_00
Under Hn,
Thus,
_00
Change of variables:
Let Y = Yi-8;1rn
ely = dy.
1
J[1-F(Y+8;1rn - 81/rnl~)J
OO
Wi
=
[l-F(Y+6 i /rn -
_00
Write
_00
• [l-F(Y+(6i-6k)tl~)J f(YI~) dYI~ .
Expand Wi(t) in a Maclaurin series in t.
82/rnl~)]
...
92
The constant term is clearly ~ (since this corresponds to ~i(O)
=
r[l-F(YI~)JK-l f(YI~)dy,
~i:~)'
i.e., Fi(Ylx) = F(yl~) for all i, and
_00
Find the linear term:
(This derivative exists, and differentiation under the integral sign
is permitted by Assumptions 3.3.1 and 3.3.2.)
= [-f(Y+(8.-e
1
1
1
1
2)tlx)]
ddt ~.(t)
l )]o[1-F(y+(e.-e
l )tlx)o(e.-e
~
l
~
2
2
+
[1-F(y+(e.-8 )tlx)]o[-f(y+(e.-e )tlx).(e.-e )]o[1-F(y+(e.-e3)tl~)]o
+
[1-F(y+(ei-el)tl~)]·[1-F(y+(ei-e2)tl~)]o
1
~
1
~
1
1
...• [-f(y+(ei-ek)tl~)·(ei-ek)]
• f(yl~) + [l-F(y+(ei-el)tl~)]· ... ·[l-F(y+(ei-ek)tl~)Jo:tf(yl~)
And
d~ ~i (t)
Let
and
= - it
t=O
(ai-a i ,)
r
_00
..•
{[t(yl ~d· [l-F(Y!~) ]K-2}
d
YI
~
•
93
Then
1
= K0iAt+ O(t).
~i(t)
~. (_1)
But
1
In
=
~.
1
thus, under Hn
E[U.] =
1
k 1
l: bc S [ c
cK
0.
- -'
In
1
A + 0(-)]
n
By a similar argument
+ O(n-~),
I =I
~n
~
~
where fn is the. covariance matrix of In (~-Yn) as n -+
is a matrix whose elements are O(n-~).
00,
and ~(n-~!)
It follows from Lemma 2.1
that, under hypothesis Hn ,
In
(~-Yn)'
for large n,
has zero mean vector and covariance matrix
I
(asymptotically), and
is asymptotically normally distributed. Consequently
In (U-U)
has a limiting normal distribution with mean vector
- In (1. b sk
c
1) + rn (~c bcskc • (1Kin))
- ~ All
c c K
The mean vector is, hence, -
(~ bc£~} A~,
Note that ?Oi=O. And let ~~
1
.
where
= In (l:b £k • [_ Oil)
c c c
;nJ
~I
= (01'02" .. ,0K)'
= (01,02, ... ,oK_l)' Then from
Lemma 2.2, In (U-U), under hypothesis H , is asymptotically distributed
-
2
n
as a X with K-l degrees of freedom, and noncentrality parameter
94
AG*
k} 2
= (~ bc EC
2
-1
Io
,
A "~O
~O·
~,~~l~o may be simplified if f~l is explicitly derived.
G.
1=:=1
The AMK Sign Statistic: A Special Case: n1=n 2=···=n K
In this section. the AMK sign statistic is presented for the
K-samp1e problem when the sample sizes are equal.
genera 1.
G=
_2Ika. (u .- U)
i=l
1
1
[kI
i;l
Recall that. in
_]2
a. (U . -u)
1
1
u.1 = V.1Irrn
.•
. 1
for
1
n
l
g(i)(y l , •...• Yk · ) IbcI {Xl' =... =X kJ, =V c}
Jl
Jk c
- Jl
- k1
(.)
where
g
1
(Y 1j ..... \ j )
1
k
=
{
if Y.. <Y.
1J.
I'
1 J·t
1
0
•
i l =1.2 .....K. il=i
1
else
and the n.1 's are determined so that
ni = na i
Note that n1=n 2=.•. =n K only if a1=a 2=... =a K = ~ .
,\-1
1.
'0
Reca 11 ,
is derived.
f~ 1 = (( °rs ) ). whe re
1 [( K-2 )a-1r
Orr = \'L bc2 EC2K-1 "K(2K-1)
c
~
-11
+ 2 .~ a i
1-1
-
r=1.2 ..... K-1,
1
k}2
- (KI bc E C
C
k
1
I
ai
1 1=1
.
95
r,s = 1,2, ... ,K-l; rl-s.
Since, in this case, ail = K for all i,
.e
Orr =
~ b~ £~K-l·K{2~_l)
[(K-2)K + 2Jl K] -
=
1b~ £~K-l·K{2~_1)
[K(3K- 2
=
I b2 /K-l
c c
Orr:: 9
,
J/
l] - (~ be £~J2
[3K-2] - (I b kJ
2K-1J
c c Sc
r
(t 1be £~(
2
= 1,2, ... ,K-1.
I
r
- I b2 2K-ll
[2
K - (K+K)] - (lK ~ b sk}2.
K
°rs - c c Sc ·K(2K-l)
i=l
c c c i '=1
(1
= 'i.' b 2 c2K - l •
1
[2K 2 2K]
'i.' b ckJ 2. K2
~ c ~c
K(2K-l)
- K 2 c ~c
0rs :: h ,
r,s = 1,2, .. .,K-l; rfs.
Hence Lo = 9 !K-l + h(~K_l - !K-l)' which is a standard form that can
"!
be easily inverted. It follows from Lemma 2.3 that
[I
?o-l =
b2S2K - l .-Lj-l I _
-{: c c
2K-l
~
rI
b2s2K-l.2K-2 - (1: b skJ2]
if c c 2K-l C C c
J
2
21
b2£2K-l.2K -3K+2 - (K-l)l'i.' b sk]
J
cc
2K-l
L
c c
U
'i.'
L
C
-
96
(Since 9 + (K-2)h
=
r~ b2c 2K - 1 (3K-2+(K-2)(2K-2)}
~ c c
~
2K-l
Subject to the restriction that
1b~ £~K-l = l~
be
[
£~12
1 -
~
2K-2
2K-1 - 1
And
2K 2 - 3K+2
2K-1
- (K-l)
~.O-l
.~
So
Statistic G* = ~o
~-1
~~=~
-
=
[1~ + Jl~
-
I
--'0 90
1
1
J
2K 2 - 3K+2 ~J
2K-1
(K-1)
2K-2 - 2K+1
2K 2-3K+2 - (2K 2-3K+1)
= [\'l bc2 E2K-1.~j-1
c
2K-1
c
I
2.
=
,
as expressed explicitly.
Note, then, that G* may be expressed as
-1
e.
97
- + (-u +U)2
- }
=
I
b2 £2K-l 1-1 .n (2K-l ) {K-1
I
(U._U)2
(
~oko ~o
C c c)
K
i=l 1
K
1\,-1
G*
=
(Ic bc2 £2K-l]-1
n(2K-l). I (u._U)2
c
K
i=l
1
3.
The limiting distribution of G* under the null hypothesis.
The following result has been established.
Corollary 3.2:
G* =
.e
Let G* be the AMK sign statistic with nl =n 2=... =n K.
(~
c
I
b2 £2K-l}-1. n(2K-l)
(u._U)2,
c c
K
i=l 1
where the bc's are arbitrary weights subject to the restriction that
.I b2 E 2K- 1 = (\'l b E
kJ 2, and the £c' Ui' U
- are as previously defined.
c c C
c c c
Then, G* is asymptotically distributed as a chi-squared random variable
with K-l degrees of freedom, under the hypothesis of [conditional]
homogeneity.
4. The limiting distribution of G* under translation-type
alternatives when nl =n 2=... =n K.
As in Section F, assume a sequence of translation-type alternative hypotheses, Hn, for n=1,2, ... , that approach the null hypothesis
as n -+ 00, .
For each index n, assume the truth of Hn if F possesses
a continuous derivative f, and there exists a function 9 such that
Corollary:
I[f(y+hl~) - f(yl~)]1 < g(ylx)
98
JOOg(YI~) f(yl~) dYI~
and
<
00
,
-00
then for n +
the statistic G* has a limiting noncentral X2 distri-
00,
bution with K-l degrees of freedom and noncentrality parameter
Proof:
The result follows directly from Theorem 3.3 of Section F, with simplifying the noncentrality parameter.
=
=
U ~2K-l.~]2
1. b c
e.
c
2K-l
1 {K-l
\ 0i2 + { - OK }
.L
1=1
rI bc2 E:2K-l'~J-l
I
c
2K-l
i =1
~
0:
1
k
e = . L1e./K.
1
Let
1=
Then
k
k
2
I 0. = . L1(Ke.1
. 1 1
1
=
1
=
Hence
-(2K-1)K -
k
i
>.
2
(e.-e) A2
I"; 1
1
2
}
e.
k
,since
\t.
i=l
0i=O
99
5.
Determining explicit weights for G* when nl=···=n K .
It is desirable for weights of most statistics to have certain
optimum properties.
A component of this research has been to inves-
tigate the properties of the AMK sign statistic, particularly as
efficiency is related to the weights.
A restriction was imposed on
the weights at the end of Section E,
\ b2
l
C
C
E
2K - 1
C
= ( \L
C
b
C
c-~) 2.
~
For the equal sample size problem, the noncentrality parameter is
expressed explicitly, and does not depend on the weights.
.e
Hence, the
efficacy of the test is invariant to the choice of weights subject to
2
2 2K l
k
>- b E - =
b E . A convenient choice is
C c c
c c c)
1
(2
bc -= E-K+l
C
•
Then
\ b"2 2K-l
L c EC
=
C
Hence, for the equal sample size case,
G* = n(2K-l).
K
I (U._U)2,
. 1
1=
1
where
U.
1
= V.1 lIT.i
n.
_ \\
~
1
V. - u· .. i.. 9
1 "
(i)
\-K+l {
_
_
_}
(Y l · , .•• ,Y k · ) L E I Xl' - •.• -X k · -V c
Jl
Jk c c
- Jl
- Jk -
g(i)(.) = I{ith observation is least}.
Efficiency considerations will be discussed in Chapter IV.
However,
100
it is noted here that, when X is independent of Y, (i.e.,
Fi(Y))~
Fi(YI~)
the ARE of G* relative to Bhapkar's V-test is unity.
=
Hence,
asymptotically, G* performs as well as Bhapkar's V-test when Bhapkar's
test is appropriate.
6.
The AMK sign statistic for equal sample sizes with estimated
wei ghts.
If the €c 's are unknown, then the AMK analysis is based on
where
U. = (IJ n,.)-l
,
,
II .. ·I
g(i)(y l , , ... ,Y k · ) I w
Jl
Jk c c
r{x lJ·1=.'.=X kJ' k=v c}
~
~
~
where, for each c, the selected weights Wc ( = €c-K+l ) are estimated by
A
Wc
=
A K+l
k
k
n. {
}
€where ~ = m In, where mc = 2 m. = 2
2'
I ~iJ',,=~c .
i~l'c
i=l j.=l
c
c
c
,
An argument parallel to that of Chapter II, Section 5, can be used to
show that the distribution of G has the same limiting power as G*. In
In (-K+l
· case, C = In
t h,s
€
- A-K+l)
€
and IC nc converges to zero in law.
nc
c
c
c
H.
Chapter Summary and Concluding Remarks
This chapter has presented a method of matched ANOCOVA for the
K-sampl e prabl em, call ed the All Matched K-tupl es Sign Test.
test statistic is
G*
where
' \-1
= d~o
~o
~o
The
e.
101
u.1
= V•III n"
1i
1
n1
n2
n
k (i)
V.= > I· .. · I 9 (Y1"Y2"""Yk')x
1 j;l j =1
j =1
J1 J2
Jk
12k
(1.) (Y
9
.e
~o
and
a
rr
_
1j ""'Y kj ) 1
k
{1
Y..
if 1J'
.
1
else,
a
<
Y,1J'1
I'
,
for all i I;i
1
is the corresponding covariance matrix, given by ((ors))'
k
k
1 + ~ I a:lJ - (1) 2 • I a 1
[a2K- 1 r
K i;1 1 K
i =1
= _1_.
-1'
r=l ,2, ... ,K-l
i;r
k
1 1) [a -r1 + as- 1 + 2 \.I. a:1 1]
a rs = K(2Ki =1
i I;r,s
r,s=1,2, ... ,K-1;
rl's.
G* is asymptoti cally distributed as a chi-squared random variab1 e under
the null hypothesis and translation-type alternatives.
The noncen-
tra1ity parameter under a sequence of alternatives
is given by
~* = ~~l~1~o [[[l-F(YI~} ]k-2 f2(YI~) dy I~
_00
where
r'
102
k
for
o. = K8. - I 8.
1
1
il=l
I •
1
In the special case where the sample sizes are all equal, then
G* is simplified to
and the noncentrality parameter is
AG* = K (2K-l)
it
(8; _e)2 [
r
K2
[l-F(y 1:<) J - f2(y
_00
I~)
dy I
~
r·
In the general case (unequal sample sizes) a simplified expression is
not derived due to difficulties in inverting the covariance matrix.
However,given data, clearly the matrix can be inverted (perhaps
using numerical methods), and the statistic calculated. The noncentrality parameter can also be found explicitly.
Although the demonstration is postponed until Chapter IV, it may
be noted here that the ARE of the AMK sign test relative to Bhapkar1s
V-test is unity for the equal sample size case, whenever the covariable
is independent of the response (i.e., when Bhapkar's V-test is appropriate).
In fact, it is easily demonstrated that in the trivial case
where all K-tuples are matched (C=l =>
€
c
=1), the AMK test is identi-
cally Bhapkar1s V-test.
Examples and efficiency results are reported in Chapter IV.
e.
CHAPTER IV
EFFICIENCY CONSIDERATIONS AND EXAMPLES
A.
Efficiency Considerations Comparing the AMP Sign Statistic with
the Kruskal Wallis Test for the Special Case nl =·· .=n K.
1. ARE of the AMP Si9n Statistic to the Kruskal-Wallis Test:
.e
It was demonstrated in Chapter II, Section F.4, that under trans1ati on-type alternati ves, D* has a 1imiti ng noncentral chi -squared
distribution. with K-l degrees of freedom and noncentrality parameter
Andrews (1954) has shown that the Kruskal-Wallis H-test is (under translation-type alternatives) distributed as a noncentral chi-squared
random variable with K-l degrees of freedom and noncentrality parameter
(assuming equal sample sizes)
AH = 12 {
J"
F'(y) dF(Yl}2
k;~l
(6;-8)2
_00
A well-known result [Andrews (1954), Hannan (1956)] is that the
asymptotic efficiency of one statistic to another (when their limiting distributions are chi-squared with the same number of degrees of
freedom) is equal to the ratio of their noncentrality parameters.
104
Hence, it follows that the ARE (D*,H) is as follows:
ARE: 0* , H)
AD */ AH
=
1
k
12· K• i~l(ei-e)
2 {
(00
}2
LooF'(YI~) dF(yl~)
=
12 { I"F'(Y) dF(y)}2·*;,11 (6;-8)2
_00
[F' (yl~) dF(y I~)
2
1•
=
fooF (y) dF (y )
-00
I
It follows that, if the covariable is independent of the response,
= 1.
then the ARE(O*,H)
For large n, then, no loss in efficiency
results by controlling for an irrelevant covariable.
However, in
finite samples some loss in efficiency may be expected.
2.
Example 1:
The following illustrative example is presented.
Sample I
Sample II
Sampl e I II
X=1
1
3
2
4
5
6
7
X=2
11
12
13
14
15
16
17
18
X =3
21
22
24
23
25
26
27
28
29
e.
105
"'-1
E:
1
1
"'-1
(
2
"'-1
E:
~
24
= T
= 824
= 924
w1 = 24(64)
7
+ 24(120) + 24(128)
= 920.7619
+ 24( 0 ) + 24 (16)
W2 = 24(16)
16
7
8
= 97.5238
8
9
+ 24(_120) + 24(_144) = 957.3333
W3 = 24(_80)
9
.
8
9
.e
T1
=
1.7984
T2
=
0.1905
T3
=
-1.8698
T
-
=
0.0397
0*
=
3~~4) [3.0929 + 0.0227 + 3.6462J
=
2.67 [6.7618J
0*
= 18.0310
The Kruska1-Wa11is Test
I
-13
8
9
10
16
17
19
R1=83
II
-24
5
11
12
18
20
21
R2--93
III
6
7
13
14
15
22
23
24
R3=124
106
H
12
83 2
93 2
124 2
+
= (24)(25)
+
-8J
[8
8
-
3(25)
1
= 50
[3864.25J - 75
H =
t
2.285
Example 1 illustrates that for this case (n i =8, K=3) where the
data indicate a location difference, the AMP sign test for K-samples
detected the difference (rather dramatically) while the KruskalWallis test did not.
W = 24(
7
l
W2 =
0) + 24(-8 ) + ~4(-16)
8
¥<-16) + 2: (16 )
+ ~4( 16)
W3 = 24( 16) + 284( - 8) + £!(
9
7
= -66.6667
=
35.8095
a) =
30 .8571
107
Tl
..
= -0.1302
T2 = 0.0699
T3
=
0.0603
T
=
0.0000
0*
~ 3~~4) [0.0169
+ 0.00489 + 0.0036J
= 2.667 [0.02547J
0*
..
.e
B.
=
0.0679
Efficiency Considerations Comparing the AMK Sign Statistic with
Bhapkar's V-test for the Special Case nl=n 2=···=n K·
1. ARE of AMK Sign Test to Bhapkar's
V-test~
It was demonstrated in Chapter III, Section G.4, that under
translation-type alternatives, G* has a limiting noncentral chisquared distribution with K-l degrees of freedom and noncentrality
parameter
AG* • K (2K-l).;.t (8;-8)2
[r
[l-F(Y
I~) lk-2 f2(YI~) dy
It.
_00
Bhapkar (1961) shows that the V-test, under translation-type alternatives, is asymptotically distributed as a chi-squared with K-l degrees
of freedom, and noncentrality parameter (with n,= ... =n K)
108
Hence the ARE(G*,V) is as follows:
= AG*/AV
ARE(G*,V)
K (2K-1)
=
J/
K (2K-l)
~
il=l
ai-e) 2[
CD-F(Y l~ll-2
(e._e)2[
1
f
_00
~)dYI
f2(y I
t
rl - F(y)]k-2 f2(y) d J'"12
Y
2
-00
•
It follows that, if the covariable is independent of the response,
then ARE(G*,V) = 1.
So if the AMK sign test is performed when
Bhapkar's test is appropriate (i.e., when a covariable need not be
controlled for, but yet is incorporated into the analysis) asymptotica11 y, nothi ng is lost.
It shoul d be noted, however, that thi sis
an asymptotic result, and that in finite samples, some loss of efficiency may be expected if an irrelevant covariable is incorporated
into the analysis.
In general, if a covariable should be incorporated
into the analysis, but is not, Bhapkar's test may not detect a true
population difference, or may depict a false difference.
2.
Example 1:
Refer to the data of Example 1 of Section A.
AMK Sign Test:
E:
-
1 -
7
24
nl =n 2=n 3=8; n = 24
W
l
= (;4)-2 = (2~~
2
e.
109
A-
E:
2
A-
E:
3
8
24
W
= (~)-2
=9
4
=
9
24
W
= (~f2 = 64
2
3
24
2
9
V1
= (2~~
V2
=
(24) 2.
2 + 0 + ~4'3 ~ 44.843537
49
V
3
=
0
(~ n;)
=
512
U
1
·=
0.879314
U2
·
=
0.087585
U3
=
·
0
,
he
=
.10 + 9.18 +
~24 = 450.21768
W; = 0.966899, U = 0.3222996
n(2~-1).1
G*
=
G*
= 18.7693
(U;_U)2 = ( 24 1(5) [0.3102649+0.0550909
,=1
+ 0.10387J
Compare with X(2)'
Bhapkar's V-test:
= 234
U1
= 0.4570313
V2 = 191
U2
= 0.3730469
V3 = 87
U3
= 0.1699219
V
1
110
V
=
k
n{2K-1) I (U.- 1)2
K
i=l 1 K
= (24)i 5) [0.0153012
=
v =
+
0.0015772 + 0.0267033J
40[0.0435817J
1. 7433
Compare with Xl2)"
Example 1 is an extreme case where a translation difference
exists, and the covariab1e is correlated with the response.
Bhapkar's
V-test failed to detect any difference, while the AMK test shows a
dramatic result.
In general, the ARE{G*,V) depends on the relationship between
the covariab1e and the response.
3.
Example 2:
Refer to Example 2 of Section A.
V1 = 70.530612 + 36 + 56.8 = 163.4195
V2 = 23.510204 + 63 + 78.2 = 164.73242
V3 = 47.020408 + 63 + 56.8 = 166.90929
U1
= 0.3191787
U2 = 0.321743
U3
= 0.3259947
LUi = 0.9669164,
-
U = 0.3223054
111
G*
= 40[0.0000097
=
+ 0.0000003 + 0.0000136J
0.0009
Compare with X(2) •
C.
Corrunents on "Handpicked" Examples and Example with Real Data
Examples 1 and 2 illustrate how the AMP and AMK sign tests are
performed, and also that, (for ni =8 at least), if the data are extreme indicating a translation difference (when matching on X), the
statistics tend to be large; if the data are extreme indicating no
translation difference, the statistics tend to be close to zero.
An Example with Real Data:
The following unpublished data are a subsample of a large stuQy
with three treatment groups.
The covariable is univariate continuous;
however, the measuring process converts it into a discrete measure.
The response is univariate continuous.
The subsample is selected
TABLE 1
The Data
Treatment 1
X
y
4
3
5
5
6
10
10
25
25
0.20
0.31
0.79
0.93
0.17
2.11
1.00
1.60
1.43
Treatment 2
X
y
3
4
5
6
5
10
0.54
0.33
1.43
2.38
1. 76
3.16
1.88
2.14
2.14
10
20
20
Treatment 3
X
Y
3
3
4
5
6
10
10
10
20
0.62
2.43
4.00
0.06
0.31
. 1.43
1.67
3.16
2.31
112
TABLE 2
Data Displayed by X
,.
Rx 1
Rx 2
Rx 3
X = 3,4
0.20
D.. 31
0.33
0.54
0.62
2.43
4.00
X = 5,6
0.17
0.79
0.93
1.43
1. 76
2.38
0.06
0.31
X = 10
1.00
2.11
1.88
3.164
1.43
1.67
3.162
1.43
1.60
2.14
2.14
2.31
X
~
20
AMP Sign Test for K-samples:

With categories c = 1 (X = 3,4), c = 2 (X = 5,6), c = 3 (X = 10), and c = 4 (X >= 20), the category sizes are M_1 = 7, M_2 = 8, M_3 = 7, M_4 = 5, so

$$\hat W_1 = \frac{27}{7}, \qquad \hat W_2 = \frac{27}{8}, \qquad \hat W_3 = \frac{27}{7}, \qquad \hat W_4 = \frac{27}{5}$$

$$W_1 = \frac{27}{7}(90) + \frac{27}{8}(45) + \frac{27}{7}(36) + \frac{27}{5}(54) = 929.475$$

$$W_2 = \frac{27}{7}(18) + \frac{27}{8}(-135) + \frac{27}{7}(-54) + \frac{27}{5}(-18) = -691.6821$$

$$W_3 = \frac{27}{7}(-108) + \frac{27}{8}(90) + \frac{27}{7}(18) + \frac{27}{5}(-36) = -237.7929$$

With $T_i = W_i/n^2 = W_i/729$:

$$T_1 = 1.2750, \qquad T_2 = -0.9488, \qquad T_3 = -0.3262, \qquad \bar T = 0.0000$$

$$D^* = 3\left[1.6256 + 0.9002 + 0.1064\right] \approx 7.897$$
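The computation above can be reproduced from Table 2 directly. In the sketch below, the scale factors (the n_i multiplier on the raw sign sums, the n^2 divisor, and the leading constant 3) are inferred from the worked numbers in this example rather than restated from the general formulas of Chapter II.

```python
# A sketch reproducing the AMP computation above directly from Table 2.
# The scale factors are inferred from the worked numbers in this example.
import numpy as np

data = {  # category -> list of response lists, one per treatment
    "X=3,4":  [[0.20, 0.31],       [0.33, 0.54],       [0.62, 2.43, 4.00]],
    "X=5,6":  [[0.17, 0.79, 0.93], [1.43, 1.76, 2.38], [0.06, 0.31]],
    "X=10":   [[1.00, 2.11],       [1.88, 3.164],      [1.43, 1.67, 3.162]],
    "X>=20":  [[1.43, 1.60],       [2.14, 2.14],       [2.31]],
}
K, n_i, n = 3, 9, 27

W = np.zeros(K)
for cat in data.values():
    M_c = sum(len(s) for s in cat)           # matched observations in category
    for i in range(K):
        raw = sum(np.sign(z - y)              # other minus own: + if y is small
                  for y in cat[i]
                  for j in range(K) if j != i
                  for z in cat[j])
        W[i] += (n / M_c) * (n_i * raw)       # category weight 27/M_c, scale n_i

T = W / n**2
D_star = 3 * ((T - T.mean()) ** 2).sum()
print(np.round(W, 4), np.round(D_star, 3))    # approx (929.475, -691.682, -237.793)
```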
AMK Sign Test:  n_1 = n_2 = n_3 = 9, n = 27.

$$\hat\varepsilon_1 = \frac{7}{27}, \qquad \hat\varepsilon_2 = \frac{8}{27}, \qquad \hat\varepsilon_3 = \frac{7}{27}, \qquad \hat\varepsilon_4 = \frac{5}{27}$$

$$\hat W_1 = \left(\frac{7}{27}\right)^{-2} = \frac{27^2}{49}, \quad \hat W_2 = \left(\frac{8}{27}\right)^{-2} = \frac{27^2}{64}, \quad \hat W_3 = \left(\frac{7}{27}\right)^{-2} = \frac{27^2}{49}, \quad \hat W_4 = \left(\frac{5}{27}\right)^{-2} = \frac{27^2}{25}$$

$$V_1 = \frac{27^2}{49}(12) + \frac{27^2}{64}(3) + \frac{27^2}{49}(7) + \frac{27^2}{25}(4) = 433.48534$$

$$V_2 = 0 + 0 + \frac{27^2}{49}(1) + 0 = 14.877551$$

$$V_3 = 0 + \frac{27^2}{64}(15) + \frac{27^2}{49}(4) + 0 = 230.3696$$

$$\prod_i n_i = 729$$

$$U_1 \approx 0.5946, \quad U_2 \approx 0.0204, \quad U_3 \approx 0.3160; \qquad \sum_i U_i = 0.9310, \quad \bar U = 0.3103$$

$$G^* = \frac{(27)(5)}{3}\left[0.0808 + 0.0841 + 0.0000\right] \approx 7.421$$
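The AMK counts can likewise be reproduced by brute-force enumeration of the matched K-tuples (one observation per treatment within each covariable category); the epsilon_c^{-2} weights are as in the worked example above.

```python
# A sketch reproducing the AMK computation above by enumerating matched
# K-tuples and crediting the sample that contributes the smallest member.
import numpy as np
from itertools import product

data = [  # one entry per category: list of response lists, one per treatment
    [[0.20, 0.31],       [0.33, 0.54],       [0.62, 2.43, 4.00]],
    [[0.17, 0.79, 0.93], [1.43, 1.76, 2.38], [0.06, 0.31]],
    [[1.00, 2.11],       [1.88, 3.164],      [1.43, 1.67, 3.162]],
    [[1.43, 1.60],       [2.14, 2.14],       [2.31]],
]
K, n = 3, 27
n_i = np.array([9, 9, 9])

V = np.zeros(K)
for cat in data:
    eps_c = sum(len(s) for s in cat) / n          # estimated category probability
    for tup in product(*cat):                     # all matched K-tuples
        V[int(np.argmin(tup))] += eps_c ** (-2)   # credit the smallest member

U = V / n_i.prod()
G_star = n * (2*K - 1) / K * ((U - U.mean()) ** 2).sum()
print(np.round(V, 4))        # approx (433.485, 14.878, 230.370)
print(np.round(G_star, 3))   # approx 7.42, compare with chi-square(2)
```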
D. Efficiency Comparisons Between the AMK Sign Statistic and the AMP Sign Statistic for K-Samples for the Special Case n_1 = n_2 = ... = n_K

Given the two new methods of matched analysis of covariance advanced in this dissertation, it is desirable to compare them to each other with respect to efficiency. It is stressed that the efficiency results presented in this chapter are asymptotic results, and in finite samples of reasonable size the relative efficiencies may differ. However, Pitman ARE is used in this section to address the question: Which test is better?
The noncentrality parameters are

$$\lambda_{D^*} = \frac{12}{K}\sum_{i=1}^{K}(\theta_i - \bar\theta)^2\left[\int_{-\infty}^{\infty} F'(y\mid\underline{x})\,dF(y\mid\underline{x})\right]^2$$

and

$$\lambda_{G^*} = K(2K-1)\sum_{i=1}^{K}(\theta_i - \bar\theta)^2\left[\int_{-\infty}^{\infty}\left[1-F(y\mid\underline{x})\right]^{K-2} F'(y\mid\underline{x})\,dF(y\mid\underline{x})\right]^2.$$

Hence,

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{K^2(2K-1)}\;\frac{\left[\int_{-\infty}^{\infty} F'(y\mid\underline{x})\,dF(y\mid\underline{x})\right]^2}{\left[\int_{-\infty}^{\infty}\left[1-F(y\mid\underline{x})\right]^{K-2} F'(y\mid\underline{x})\,dF(y\mid\underline{x})\right]^2}.$$

Note that $[1-F(y\mid\underline{x})]^{K-2} \le 1$, hence $\mathrm{ARE}(D^*,G^*) \ge 12/[K^2(2K-1)]$.
Consider special cases:

a. K = 2:

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{(4)(3)}\;\frac{\left[\int_{-\infty}^{\infty} F'(y\mid\underline{x})\,dF(y\mid\underline{x})\right]^2}{\left[\int_{-\infty}^{\infty} F'(y\mid\underline{x})\,dF(y\mid\underline{x})\right]^2} = 1,$$

regardless of $F(y\mid\underline{x})$. This has to be the case, since both the AMP test for K-samples and the AMK test reduce to Schoenfelder's AMP sign test when K = 2.

b. $F(y\mid\underline{x})$ ~ Uniform:

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{K^2(2K-1)}\;\frac{1}{\left[\int_0^1 (1-x)^{K-2}\,dx\right]^2} = \frac{12(K-1)^2}{K^2(2K-1)}.$$
Clearly, this is the ARE(D*,G*) if $F(y\mid\underline{x})$ is uniform (with any scale parameter).
c. $F(y\mid\underline{x})$ ~ Exponential:

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{K^2(2K-1)}\;\frac{\left[\int_0^{\infty} e^{-2y}\,dy\right]^2}{\left[\int_0^{\infty} e^{-Ky}\,dy\right]^2} = \frac{12}{K^2(2K-1)}\cdot\frac{(1/2)^2}{(1/K)^2} = \frac{3}{2K-1}.$$
d. $F(y\mid\underline{x})$ ~ Pareto(1,1):

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{K^2(2K-1)}\;\frac{\left[\int_1^{\infty} y^{-4}\,dy\right]^2}{\left[\int_1^{\infty} y^{-(K+2)}\,dy\right]^2} = \frac{12}{K^2(2K-1)}\cdot\frac{(1/3)^2}{(1/(K+1))^2} = \frac{4(K+1)^2}{3K^2(2K-1)}.$$
e. $F(y\mid\underline{x})$ ~ Beta(r,1):

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{K^2(2K-1)}\;\frac{\left[r^2/(2r-1)\right]^2}{\left[\int_0^1 (1-t^r)^{K-2}\,r^2 t^{2r-2}\,dt\right]^2} = \frac{12\,r^2}{K^2(2K-1)(2r-1)^2\left[B\!\left(K-1,\,2-\tfrac{1}{r}\right)\right]^2},$$

where B(.,.) indicates the beta function. For large r,

$$\mathrm{ARE}(D^*,G^*) \to \frac{12\,r^2}{K^2(2K-1)(2r-1)^2\left[B(K-1,2)\right]^2} = \frac{12\,r^2 (K-1)^2}{(2K-1)(2r-1)^2} \to \frac{3(K-1)^2}{2K-1}.$$

(Note that case (b) is a special case of the above, with r = 1.)
f. $F(y\mid\underline{x})$ ~ Double Exponential:

Here $f(t) = \tfrac{1}{2}e^{-|t-\theta|}$, so

$$\int_{-\infty}^{\infty} f^2(t)\,dt = \frac{1}{4}\int_{-\infty}^{\infty} e^{-2|t-\theta|}\,dt = \frac{1}{4},$$

and, splitting the other integral at $\theta$,

$$\int_{-\infty}^{\infty}\left[1-F(t)\right]^{K-2} f^2(t)\,dt = \int_{-\infty}^{\theta}\left[1-\tfrac{1}{2}e^{t-\theta}\right]^{K-2}\tfrac{1}{4}e^{2(t-\theta)}\,dt + \int_{\theta}^{\infty}\left[\tfrac{1}{2}e^{-(t-\theta)}\right]^{K-2}\tfrac{1}{4}e^{-2(t-\theta)}\,dt$$

$$= \frac{1}{K(K-1)}\left[1 - 2^{1-K}\right] + \frac{1}{2^K K} = \frac{2^{K-1}-1}{2^{K-1}K(K-1)}.$$

Hence

$$\mathrm{ARE}(D^*,G^*) = \frac{12}{K^2(2K-1)}\cdot\frac{(1/4)^2}{\left[\dfrac{2^{K-1}-1}{2^{K-1}K(K-1)}\right]^2} = \frac{3(K-1)^2\,2^{2K-4}}{(2K-1)(2^{K-1}-1)^2}.$$
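The closed forms in cases (c) and (f) can be checked numerically; the following sketch integrates $[1-F]^{K-2}f^2$ directly and compares the result with the displayed expressions.

```python
# A quick numerical check of the closed-form ARE(D*, G*) expressions above.
import numpy as np
from scipy.integrate import quad

def are(f, F, lo, hi, K):
    num, _ = quad(lambda y: f(y) ** 2, lo, hi)
    den, _ = quad(lambda y: (1 - F(y)) ** (K - 2) * f(y) ** 2, lo, hi)
    return 12.0 / (K**2 * (2*K - 1)) * (num / den) ** 2

for K in (3, 4, 10):
    # exponential: closed form 3/(2K-1)
    a = are(lambda y: np.exp(-y), lambda y: 1 - np.exp(-y), 0, np.inf, K)
    # double exponential: closed form 3(K-1)^2 2^(2K-4) / [(2K-1)(2^(K-1)-1)^2]
    b = are(lambda y: 0.5 * np.exp(-abs(y)),
            lambda y: 0.5 * np.exp(y) if y < 0 else 1 - 0.5 * np.exp(-y),
            -np.inf, np.inf, K)
    print(K, round(a, 4), round(3 / (2*K - 1), 4),
             round(b, 4), round(3 * (K - 1)**2 * 2**(2*K - 4) /
                                ((2*K - 1) * (2**(K - 1) - 1)**2), 4))
```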
For the normal distribution, the AREs can be computed from Table 1 given by Hojo (1931). The results are as follows:

K             3     4     5     6     7     8     10    12
ARE(D*,G*)   1.06  1.16  1.25  1.35  1.45  1.54  1.72  1.89

Then, if $F(y\mid\underline{x})$ is normal, the preferable test is D* for K >= 3, since ARE(D*,G*) >= 1 for K >= 3. A summary table follows.
TABLE 3
Summary of Efficiency Results: ARE(D*,G*)

                                  K
F(y|x)           2      3      4      5      6      10
Uniform          1.00   1.07   0.96   0.85   0.76   0.51
Exponential      1.00   0.60   0.43   0.33   0.27   0.16
Double Exp.      1.00   1.07   1.26   1.52   1.82   3.20
Beta(2,1)        1.00   1.67   2.05   2.30
Normal           1.00   1.06   1.16   1.25   1.35   1.72
Pareto(1,1)      1.00   0.47   0.30   0.21   0.16   0.08
Lower Bound      1.00   0.27   0.11   0.05   0.03   0.01
Based on the efficiency results reported for these particular conditional distributions, it appears that for distributions that are bounded below, the AMK test has better asymptotic efficiency; for those not bounded below, the AMP test tends to perform better. Which test is preferable with respect to efficiency really depends on the conditional distribution. Bhapkar (1961) notes that for the normal distribution, the ARE of the V-test relative to the Kruskal-Wallis test tends to zero as the number of populations tends to infinity. It follows that for the normal distribution, the ARE of the AMK test relative to the AMP test for K-samples also tends to zero as the number of populations tends to infinity, since the ARE(D*,G*) is the same as the ARE of the Kruskal-Wallis test relative to Bhapkar's V-test.
CHAPTER V
COMMENTS AND SUGGESTIONS FOR FUTURE RESEARCH

This dissertation advances two new analysis techniques for the K-sample homogeneity problem in the presence of a discrete covariable. The method presented in Chapter II combines the features of Schoenfelder's AMP sign test and the Kruskal-Wallis test. Chapter III presents a method extending Schoenfelder's AMP test and Bhapkar's V-test. In each of these chapters, the test statistic is presented for the general K-sample framework as a quadratic form. The asymptotic distribution of the quadratic form is, in each case, derived under the null and translation-type alternative hypotheses, and an expression for the noncentrality parameter is provided. For the special case where the sample sizes are equal, the quadratic form is expanded and simplified, yielding a well-defined explicit test statistic. Also for the special case, the noncentrality parameter is simplified. ARE results are provided in Chapter IV.

A. Comments on the AMP Sign Test for K-Samples
Quade (1982) suggests a method called rank analysis of covariance by matching. Assuming the framework of this dissertation, the test may be defined as follows. Let

$$S_{ij} = \sum_{i'=1}^{K}\sum_{j'=1}^{n_{i'}} \mathrm{sgn}(Y_{ij} - Y_{i'j'})\,I\{X_{i'j'} = X_{ij}\}\Big/ M_{ij},$$

where $M_{ij}$ is the number of observations matched with observation (i,j). (If $X_{ij} = \underline{v}_c$, then $M_{ij} = M_c$.) An analysis of variance can be performed on the $S_{ij}$'s. Quade shows that this test is a generalization of the Kruskal-Wallis test. Clearly, the weights are Schoenfelder-type weights and, in fact, it can be shown that analysis of variance on these scores is equivalent to the AMP sign test for K-samples. The results in this dissertation support the intuition presented in Quade's paper (for discrete X with zero tolerance), and provide efficiency results for a special case.
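The following sketch computes Quade-type matched sign scores on the Table 2 data and performs a one-way analysis of variance on them; it is an illustration of the idea rather than Quade's (1982) exact procedure, and the use of scipy's standard F-test is an illustrative choice.

```python
# A sketch of Quade-type scores S_ij on the Table 2 data, followed by a
# one-way ANOVA on the scores.
import numpy as np
from scipy.stats import f_oneway

data = [  # one entry per covariable category: responses per treatment
    [[0.20, 0.31],       [0.33, 0.54],       [0.62, 2.43, 4.00]],
    [[0.17, 0.79, 0.93], [1.43, 1.76, 2.38], [0.06, 0.31]],
    [[1.00, 2.11],       [1.88, 3.164],      [1.43, 1.67, 3.162]],
    [[1.43, 1.60],       [2.14, 2.14],       [2.31]],
]
K = 3
scores = [[] for _ in range(K)]   # S_ij scores collected by treatment

for cat in data:
    M_c = sum(len(s) for s in cat)
    for i in range(K):
        for y in cat[i]:
            # sum over ALL matched observations, own sample included
            # (the self-comparison contributes sgn(0) = 0)
            s = sum(np.sign(y - z) for j in range(K) for z in cat[j]) / M_c
            scores[i].append(s)

print(f_oneway(*scores))          # F-test on the matched sign scores
```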
Quade's rank ANOCOVA by matching is defined for a more general framework. The only restriction on X is that it be concomitant. A matching function is imposed on X which defines two observations to be matched if their covariables assume values that are within a given tolerance. This generalization can be pursued for the AMP sign test. With the intuition and framework provided by Quade, it might be possible to use the methods of Chapter II to verify the asymptotic distribution and derive efficiency results for such a test.
Puri (1964) proposed a class of test statistics which generalizes the Kruskal-Wallis test, among others. The test statistics are of the form

$$L = \sum_{i=1}^{K} n_i\left[\frac{T_{n,i} - \mu_{n,i}}{A_n}\right]^2,$$

where $\mu_{n,i}$ and $A_n$ are normalizing constants and

$$T_{n,i} = n_i^{-1}\sum_{j=1}^{n} E_{n,j}\,Z^{(i)}_{n,j},$$

where

$$Z^{(i)}_{n,j} = \begin{cases} 1 & \text{if the } j\text{th smallest observation is from sample } i, \\ 0 & \text{otherwise.} \end{cases}$$

It appears that a generalization of the AMP sign test for K-samples based on Puri's general class of statistics would be possible. As with the AMP sign test, the weights (which are functions of the $n_i$'s only) are fixed across categories. Hence, consider the statistic L* formed by combining the within-category versions of L with Schoenfelder-type weights. It should be possible to show that L* is asymptotically chi-squared with K-1 degrees of freedom under the hypothesis of homogeneity; however, this has not been investigated.
B. Comments on the AMK Sign Test

Deshpande (1970) presented a unified treatment of Bhapkar's V-test, the L-test, the W-test, and other variations of ANOVA problems based on the analysis of K-tuples. As in Chapter I, let $V_{ij}$ be the number of K-tuples in which the observation from sample i has rank j, and let

$$L_i = \sum_{j=1}^{K} f_j U_{ij},$$

where the f's are real constants such that they are not all equal,
$$U_{ij} = V_{ij}\Big/\prod_i n_i,$$

and

$$F = \sum_{j=1}^{K}\sum_{l=1}^{K} f_j f_l\,\frac{\binom{K-1}{j-1}\binom{K-1}{l-1}}{(2K-1)\binom{2K-2}{j+l-2}}.$$

Then define

$$L = \frac{n(K-1)^2}{FK^2}\left[\sum_{i=1}^{K} a_i L_i^2 - \left(\sum_{i=1}^{K} a_i L_i\right)^2\right].$$
It is easy to see that if $f_1 = 1$ and $f_j = 0$ for $j \ne 1$, the result is Bhapkar's V statistic. If $f_1 = -1$, $f_K = 1$, and $f_j = 0$ for $j \ne 1, K$, the result is the L-test. If $f_j = j - 1$, the result is the W-test. The AMK sign test can obviously be extended to a more general form in this way; a computational sketch follows at the end of this section.
Let $V^{(c)}_{ij}$ be the number of K-tuples for which the observation from sample i has rank j, calculated only for those observations whose covariable assumes the value $\underline{v}_c$. Then let

$$V_{ij} = \sum_{c} \hat W_c\,V^{(c)}_{ij} \qquad (\varepsilon_c = P(\underline{X} = \underline{v}_c) > 0),$$

where the $\hat W_c$ are the Schoenfelder-type weights used earlier,

$$U_{ij} = V_{ij}\Big/\prod_i n_i, \qquad L^*_i = \sum_{j=1}^{K} f_j U_{ij},$$

with the $f_j$ and F as defined above. Then the statistic analogous to L, computed from the $L^*_i$, defines a class of test statistics for ANOCOVA by matching. It should be possible (by using the methods of Chapter III and those of Deshpande (1970)) to show that members of this class have null distributions which are chi-squared (K-1), and noncentral chi-squared distributions under translation-type alternative hypotheses.
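As an illustration of how such a class could be computed, the sketch below builds weighted rank-j counts from the Table 2 data by brute-force enumeration of matched K-tuples. The $\hat\varepsilon_c^{-2}$ weights mirror the K = 3 examples above; the particular f vectors shown and the omission of the final quadratic form are simplifications for illustration only.

```python
# A sketch of the Deshpande-type generalization: rank-j counts V_ij^(c)
# within covariable categories, combined with Schoenfelder-type weights.
import numpy as np
from itertools import product
from scipy.stats import rankdata

data = [  # Table 2 again: responses per treatment, by category
    [[0.20, 0.31],       [0.33, 0.54],       [0.62, 2.43, 4.00]],
    [[0.17, 0.79, 0.93], [1.43, 1.76, 2.38], [0.06, 0.31]],
    [[1.00, 2.11],       [1.88, 3.164],      [1.43, 1.67, 3.162]],
    [[1.43, 1.60],       [2.14, 2.14],       [2.31]],
]
K, n = 3, 27

def weighted_rank_counts(data, f):
    """Return L_i^* = sum_j f_j U_ij from weighted K-tuple rank counts."""
    V = np.zeros((K, K))                             # V[i, j-1]: sample i holds rank j
    for cat in data:
        w = (sum(len(s) for s in cat) / n) ** (-2)   # epsilon_c^{-2}, as in the examples
        for tup in product(*cat):
            for i, r in enumerate(rankdata(tup).astype(int)):
                V[i, r - 1] += w
    U = V / (9 ** K)                                 # divide by the product of the n_i
    return U @ np.asarray(f, dtype=float)

print(weighted_rank_counts(data, [1, 0, 0]))   # Bhapkar-type: smallest-rank counts
print(weighted_rank_counts(data, [-1, 0, 1]))  # L-test-type contrast of extremes
```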
C. General Comments and Suggestions for Future Research

There are many obvious directions future research can take. A few of these are now listed.

1. Continuous covariable:

Both of the methods presented assume a discrete covariable. The extension to continuous X would certainly prove to be a useful analysis tool. The next step from what is developed in this research would be to a discrete X defining a matching function, where two observations are matched if their covariables assume values within a specified tolerance. This would isolate the issue of coping with "overlapping" categories. The next step in the process of generalizing would be to let X be continuous and define the matching function with fixed, positive tolerance.
2. Discrete response:

The framework assumed in this research specifies that the response variable is continuous. The assumption was made so that ties on Y might be ignored. An adjustment to the significance level for approximating a discrete distribution with a continuous one is made frequently. The impact of this can be assessed by assuming tied values at certain levels of the covariable, breaking the tie(s) in all possible ways, and comparing the results (a small sketch of this appears below). Another possibility might be to derive an analogue of the "midrank" method commonly used for the Kruskal-Wallis test and other rank procedures.
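A minimal sketch of the tie-breaking assessment just described, using the cross-sample tie at 3.16 from the real-data example. The Kruskal-Wallis statistic is used here as a stand-in for the matched statistics, purely to keep the illustration self-contained.

```python
# Break each set of tied responses in every possible order, recompute a
# statistic, and compare the results across tie-breaks.
from itertools import permutations
from scipy.stats import kruskal

# Hypothetical small samples with a cross-sample tie at 3.16:
samples = [[1.00, 2.11], [1.88, 3.16], [1.43, 1.67, 3.16]]
tied = [(1, 1), (2, 2)]                 # (sample, index) positions of the tie
delta = 1e-3                            # infinitesimal separation

results = []
for order in permutations(range(len(tied))):
    broken = [list(s) for s in samples]
    for rank, (i, j) in zip(order, tied):
        broken[i][j] += rank * delta    # break the tie in this order
    results.append(kruskal(*broken).statistic)

print(min(results), max(results))       # range of the statistic over tie-breaks
```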
Future results in many other respects are desirable. The mathematical inversion of the test statistics could be pursued; methods accommodating a multivariate response, ordered alternatives, multiple comparisons procedures, and methods of point and interval estimation of effects (assuming shift alternatives) are other suggestions for future research.
REFERENCES

Althauser, RP and Rubin, DB (1971), "Measurement Error and Regression to the Mean in Matched Samples," Social Forces 50: 41-46.

Andrews, FC (1954), "Asymptotic Behavior of Some Rank Tests for Analysis of Variance," Annals of Mathematical Statistics 25: 724-735.

Atiquillah, M (1964), "The Robustness of the Covariance Analysis of a One-Way Classification," Biometrika 51: 365-372.

Belsen, WA (1956), "A Technique for Studying the Effects of a Television Broadcast," Applied Statistics 5: 195-202.

Bhapkar, VP (1964), "A Non-Parametric Test for the Several Sample Location Problem," University of N.C., Institute of Statistics Mimeo Series No. 411.

Bhapkar, VP (1961), "A Non-Parametric Test for the Problem of Several Samples," Annals of Mathematical Statistics 32: 1108-1117.

Billewicz, WZ (1963), "The Efficiency of Matched Samples: An Empirical Investigation," Biometrics 21: 623-644.

Billewicz, WZ (1964), "Matched Samples in Medical Investigations," British Journal of Preventive and Social Medicine 18: 167-173.

Bross, ID (1969), "How Case-to-Case Matching Can Improve Design Efficiencies," American Journal of Epidemiology 89: 359-362.

Campbell, DT and Stanley, JC (1966), Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally.

Campbell, DT and Clayton, KN (1961), "Avoiding Regression Effects in Panel Studies of Communication Impact," Studies in Public Communication 3: 99-118.

Cochran, WG (1969), "The Use of Covariance in Observational Studies," Applied Statistics 18: 270-275.

Cochran, WG (1957), "Analysis of Covariance: Its Nature and Uses," Biometrics (September 1957): 261-281.

Cochran, WG (1953), "Matching in Analytic Studies," Journal of the American Public Health Association 43: 684-691.

Cochran, WG (1950), "The Comparison of Percentages in Matched Samples," Biometrika 37: 256-266.

Cochran, WG and Rubin, DR (1973), "Controlling Bias in Observational Studies: A Review," Sankhya-A 35(4): 417-446.

Deshpande, JV (1970), "A Class of Multisample Distribution-Free Tests," Annals of Mathematical Statistics 4: 227-236.

Deshpande, JV (1965), "A Non-Parametric Test Based on U-Statistics for the Problem of Several Samples," Journal of the Indian Statistical Association 3: 20-29.

Deshpande, JV (1965a), "Some Non-Parametric Tests of Statistical Hypotheses," Ph.D. Dissertation, University of Poona.

Elashoff, J (1969), "Analysis of Covariance: A Delicate Instrument," American Educational Research Journal 6(3): 383-401.

Evans, SH and Anastasio, EJ (1968), "Misuse of Analysis of Covariance When Treatment Effect and Covariate are Confounded," Psychological Bulletin 69: 225-234.

Fisher, L and Patil, K (1974), "Matching and Unrelatedness," American Journal of Public Health 100: 347-349.

Fisher, RA (1934), "Probability, Likelihood, and Quantity of Information in the Logic of Uncertain Inference," Proceedings of the Royal Society, A 146: 1-8.

Gabriel and Lachenbruch (1969), "Non-Parametric ANOVA in Small Samples: A Monte Carlo Study of the Adequacy of the Asymptotic Distribution," Biometrics 25: 593-596.

Hannan, EJ (1956), "The Asymptotic Powers of Certain Tests Based on Multiple Correlation," Journal of the Royal Statistical Society, B 18: 227-233.

Hannan, EJ (1949), "The Asymptotic Powers of Certain Tests Based on Multiple Correlations," Journal of the Royal Statistical Society, B 11: 211-219.

Hoeffding, W (1948), "A Class of Statistics with Asymptotic Normal Distributions," Annals of Mathematical Statistics 19: 293-329.

Hojo, T (1931), "Distribution of the Median, Quartiles, and Interquartile Distance in Samples from a Normal Population," Biometrika 23: 315-360.

Kraft, CH and Van Eeden, C (1968), A Non-Parametric Introduction to Statistics. New York: MacMillan.

Kruskal, JB (1968), "Statistical Analysis II: Transformations of Data," International Encyclopedia of Social Sciences. New York: MacMillan and Free Press, 182-192.

Kruskal, WH (1952), "A Non-Parametric Test for the Several Sample Problem," Annals of Mathematical Statistics 23: 525-540.

Kruskal, WH and Wallis, WA (1952), "Use of Ranks in One Criterion Analysis of Variance," Journal of the American Statistical Association 47: 583-621.

Kupper, LL; Karon, JM; Kleinbaum, DG; Lewis, DK; and Morgenstern, M (1979), "Matching in Epidemiologic Studies: Validity and Efficiency Considerations," Institute of Statistics Mimeo Series No. 1239, UNC.

Lehmann, EL (1975), Non-Parametrics: Statistical Methods Based on Ranks. Holden-Day.

Lehmann, EL (1951), "Consistency and Unbiasedness of Certain Non-Parametric Tests," Annals of Mathematical Statistics 22: 165-179.

Lindquist, EF (1931), "The Significance of a Difference Between Matched Groups," Journal of Educational Psychology 23(3): 197-204.

Maccoby, EM (1956), "Pitfalls in the Analysis of Panel Data: A Research Note on Some Technical Aspects of Voting," American Journal of Sociology 61(4): 359-362.

Mann, HB and Whitney, DR (1947), "On a Test of Whether One of Two Random Variables is Stochastically Larger than the Other," Annals of Mathematical Statistics 18: 50-60.

Mathan, KK (1963), "Matching in Comparative Studies in Public Health," Indian Journal of Public Health 7: 161-169.

McKinlay, SM (1974), "The Expected Number of Matches and Its Variance for Matched Pairs Design," Applied Statistics 23: 372-383.

McNemar, Q (1940), "Sampling in Psychological Research," Psychological Bulletin 37: 331-365.

McNemar, Q (1947), "Note on the Sampling Error of the Difference Between Correlated Proportions or Percentages," Psychometrika 12(2): 153-157.

McSweeney, M and Porter, AC (1971), "Small Sample Properties of Non-Parametric Index of Response and Rank Analysis of Covariance," Occasional Paper No. 16, Office for Research Consultation, School for Advanced Studies, College of Education, Michigan State University.

Miettinen, OS (1968), "The Matched Pairs Design in the Case of All-or-None Responses," Biometrics 24(2): 339-353.

Miettinen, OS (1969), "Individual Matching with Multiple Controls in the Case of All-or-None Responses," Biometrics 25(2): 339-354.

Miettinen, OS (1970), "Matching and Design Efficiency in Retrospective Studies," American Journal of Epidemiology 91: 111-118.

Pike, MC and Morrow, RA (1970), "Statistical Analysis of Patient-Control Studies in Epidemiology: Factor Under Investigation an All-or-None Variable," British Journal of Preventive and Social Medicine 24: 42-44.

Potthoff, RF (1965), "Some Scheffé-Type Tests for Some Behrens-Fisher Type Regression Problems," Journal of the American Statistical Association 60: 1163-1169.

Puri, ML and Sen, PK (1969), "Analysis of Covariance Based on General Rank Scores," Annals of Mathematical Statistics 40: 610-618.

Puri, ML (1964), "Asymptotic Efficiency of a Class of C-Sample Tests," Annals of Mathematical Statistics 35: 102-121.

Quade, D (1982), "Nonparametric Analysis of Covariance by Matching," Biometrics 38: 597-611.

Quade, D (1967a), "Comparisons with All Matched Pairs" (Abstract), Biometrics 23: 606.

Quade, D (1967b), "Rank Analysis of Covariance," Journal of the American Statistical Association 62: 1187-1200.

Quade, D (1966), "On Analysis of Covariance for the K-Sample Problem," Annals of Mathematical Statistics 37(6): 1747-1758.

Randles, RH and Wolfe, DA (1979), Introduction to the Theory of Nonparametric Statistics. Wiley.

Raynor, WJ and Kupper, LL (1977), "Matching on a Continuous Variable in Case-Control Studies," Institute of Statistics Mimeo Series No. 1127, UNC.

Rubin, DR (1973a), "Matching to Remove Bias in Observational Studies," Biometrics 29: 159-183.

Rubin, DR (1973b), "The Use of Matched Sampling and Regression Adjustment to Remove Bias in Observational Studies," Biometrics 29: 185-203.

Rulon, PJ (1941), "Problems of Regression," Harvard Educational Review 11(2): 213-223.

Schoenfelder, JR (1981), Analysis of Covariance by Matching. Ph.D. Dissertation, UNC.

Sen, PK (1960), "On Some Convergence Properties of U-Statistics," Calcutta Statistical Association Bulletin 10: 1-18.

Slutsky, E (1925), "Über Stochastische Asymptoten und Grenzwerte," Metron 5, No. 3, p. 3.

Thorndike, RL (1942), "Regression Fallacies in the Matched Groups Experiment," Psychometrika 7(2): 85-102.

Walter, SD (1979), "Matched Case-Control Studies with a Variable Number of Controls Per Case," Presented at the ENAR Biometrics Meetings, New Orleans, April 1979.

Winer, BJ (1962), Statistical Principles in Experimental Design. New York: McGraw-Hill.

Wold, H (1956), "Causal Inference from Observational Data," Journal of the Royal Statistical Society (A) 119(1): 28-60.

Yinger, JM; Ikeda, K; and Laycock, F (1967), "Treating Matching as a Variable in Sociological Experiments," American Sociological Review 32: 801-812.