A TREATMENT VS CONTROL ALIGNED RANK PROCEDURE
IN A TWO-WAY LAYOUT
by
Howard Blair Christensen
Institute of Statistics
Mimeograph Series No. 102'
Raleigh - 1975
ABSTRACT
CHRISTENSEN, HOWARD BLAIR. A Treatment vs. Control Aligned Rank Procedure in a Two-way Layout. (Under the direction of Bibhuti B. Bhattacharyya).
The linear statistical model, $Y_{ijk} = \mu + \alpha_i + \tau_j + \epsilon_{ijk}$, for a randomized complete block experimental design without interaction was considered. It was assumed that the design consisted of n blocks, p treatments, and m observations per cell, with treatment 1 serving as a control.
A distribution-free test of hypothesis concerning the treatment effects $\tau_1, \ldots, \tau_p$ was sought, as well as a procedure for estimating contrasts among the $\tau$'s. In order to improve the efficiency of such procedures by allowing interblock comparisons, the median of the control treatment in each block was subtracted from each observation in the block to remove the nuisance parameter $\alpha_i$. The resulting N = mnp median-aligned observations were then ranked in ascending order of size.
A Wilcoxon-type score $E_{Ni}$ was assigned to each of the ordered observations such that $E_{Ni} = i/N$, where i denotes the i-th smallest observation. A statistic $T_{Nj}$ was then defined to be the average of the scores associated with the j-th treatment.
It was shown that as $m \to \infty$, the random vector $[\sqrt{m}(T_{N1} - \mu_N(\tau_1)), \ldots, \sqrt{m}(T_{Np} - \mu_N(\tau_p))]$ possesses an asymptotically multivariate normal distribution under rather mild assumptions about the distribution of the aligned observations. From this it was shown that a properly standardized test statistic was obtainable that was asymptotically non-central chi-square with p-1 degrees of freedom under a collapsing alternative and an assumption of independent and identically distributed error terms. This test statistic provided a distribution-free test of the hypothesis of no treatment effects.
Using the statistic $T_{Nj}$, the Hodges-Lehmann estimator of the difference in location was obtained, which provided estimates of the pairwise differences in the $\tau$'s. It was shown that these estimators were asymptotically normal with a variance dependent on the quantity $G = \int f^2(x)\,dx$. Using Lehmann's (1963c) approach, an estimator of G was obtained. As a result, a large sample procedure was developed to provide a confidence interval estimate for any particular linear contrast among the $\tau$'s. In addition, procedures were suggested for obtaining both Scheffe- and Tukey-type simultaneous confidence intervals on various contrasts and linear combinations.
An example problem was worked out to illustrate the procedures developed.
BIOGRAPHY
Howard B. Christensen was born December 9, 1939, in Payson, Utah, receiving both his elementary and secondary educations there. Upon graduating from Payson Sr. High School in 1958, he enrolled in the mechanical engineering program at Brigham Young University. He took two years off from his schooling to serve as a missionary for the Church of Jesus Christ of Latter-Day Saints in Florida.

After returning to Brigham Young he changed his major to statistics, graduating with a Bachelor of Science degree in 1964. He enrolled at N. C. State University the same year, receiving a Master of Experimental Statistics degree in 1966. In the fall of 1967 he received an appointment as an Assistant Professor of Statistics at Brigham Young University, where he has been employed since.
In
the fall of 1974, he accepted a one year sabbatical leave to work
in the Statistical Research Division of the U. S. Bureau of the
Census.
A most significant event in his life was his marriage in 1963
to the former Bonnie Gayle Heelis of Santaquin, Utah.
Since that
time they have had seven sons--Derek, Quinn, Devin, Brandon, Jordan,
Doran and Trent.
ACKNOWLEDGMENTS
The author wishes to express sincere appreciation to all persons who have provided help, encouragement and assistance towards
the completion of this study.
Special thanks go to Professor B. B.
Bhattacharyya, Chairman of the Advisory Committee, whose invaluable
help and suggestions through the course of the study made possible
its completion.
Appreciation is also expressed to others on the
committee who gave constructive criticism and suggestions under
short-term pressure, specifically Professors T. M. Gerig, R. J.
Monroe, and R. Alvarez.
Professors R. G. D. Steel and C. P.
Quesenberry also provided help and suggestions over the duration
of the study.
Gratitude is expressed to Professor H. G. Hilton, Chairman
of the Statistics Department at Brigham Young University, who provided important encouragement to see the study through.
Finally, the author expresses his sincerest thanks to his wife,
Bonnie, who never doubted, and his children who suffered the absence
of their father on numerous occasions without complaint.
TABLE OF CONTENTS

                                                                Page
1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . .   1
2. MEDIAN ALIGNMENT--THE GENERAL CASE . . . . . . . . . . . . .   5
   2.1 Using a Known Median . . . . . . . . . . . . . . . . . .   5
   2.2 Using a Sample Median  . . . . . . . . . . . . . . . . .   9
3. ASSUMING IDENTICALLY DISTRIBUTED ERROR TERMS . . . . . . . .  28
   3.1 Introduction . . . . . . . . . . . . . . . . . . . . . .  28
   3.2 The Variance-Covariance Terms  . . . . . . . . . . . . .  28
   3.3 The Collapsing Alternative . . . . . . . . . . . . . . .  35
   3.4 An Asymptotically Distribution-free Test Statistic . . .  43
4. ESTIMATION PROCEDURES  . . . . . . . . . . . . . . . . . . .  45
   4.1 Introduction . . . . . . . . . . . . . . . . . . . . . .  45
   4.2 The Hodges-Lehmann Estimator . . . . . . . . . . . . . .  46
   4.3 The Application to Linear Models . . . . . . . . . . . .  47
   4.4 Estimates Using Aligned Observations . . . . . . . . . .  54
5. SUMMARY AND CONCLUSIONS  . . . . . . . . . . . . . . . . . .  63
   5.1 An Example . . . . . . . . . . . . . . . . . . . . . . .  63
   5.2 Summary and Recommendations  . . . . . . . . . . . . . .  68
LIST OF REFERENCES  . . . . . . . . . . . . . . . . . . . . . .  70
APPENDIX  . . . . . . . . . . . . . . . . . . . . . . . . . . .  73
1. INTRODUCTION

Let us assume that a set of observations satisfies the linear model for a two-way layout, or randomized complete block design, as follows:

$$Y_{ijk} = \mu + \alpha_i + \tau_j + \epsilon_{ijk}. \qquad (1.1)$$

Assume also that the $\alpha_i$'s are the block effects (i=1,2,...,n) and that the $\tau_j$'s are the treatment effects (j=1,2,...,p), while the $Y_{ijk}$'s are random variables from some distribution $F_j^i(x - \mu - \alpha_i)$.
There has been much investigation concerning tests of hypotheses about the treatment effects, $\tau_j$, and the estimation of various linear contrasts involving the $\tau$'s. The assumption generally made is that $F_j^i$ is the normal distribution function and that the $\epsilon_{ijk}$'s are independent and identically distributed; i.e., $F_j^i = F$ for all i and j. Such procedures generally culminate in an F-test, possibly combined with some multiple comparison procedure for examining contrasts of interest.

Interest in these same linear models has increased with the purpose of applying distribution-free techniques where the only assumption made is that $F_j^i$ or F is continuous. (An additional assumption of symmetry is made in some instances.)
Many of these distribution-free test statistics fall under the general category of linear rank statistics, the Wilcoxon and Normal Scores tests being two familiar ones. These rank tests have a distinct advantage over their parametric counterparts due to their insensitivity to gross errors. In addition, they do not suffer serious loss of efficiency in most cases. In fact, for some underlying distributions they have greater efficiency than their comparable normal distribution theory counterparts, as pointed out by Chernoff and Savage (1958).
Most of the tests in the framework of the two-way layout utilize intrablock comparisons; e.g., Steel's treatment vs. control test (1959), Nemenyi's tests (Miller, 1966), and Hollander's treatment vs. control multiple comparison procedure (1966), among others.

However, Hodges and Lehmann (1962) suggested that such tests suffer a loss of efficiency by failing to take advantage of information contained in interblock comparisons, a conjecture verified by Mehra and Sarangi (1967). As a result, Hodges and Lehmann (1962), Sen (1968a, 1969) and Puri and Sen (1971) developed tests for so-called aligned-rank order statistics. Their procedure suggests the block mean or median, or any other symmetric function of the block observations, be subtracted from each observation in the block. This "aligns" the observations by removing the nuisance parameters represented by the block effects, $\alpha_i$. After this alignment is performed, the whole set of observations is then ranked without concern for the blocking.
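The alignment-and-rank step just described can be sketched in a few lines. The array layout and function name below are illustrative, not from the original; the control is taken as treatment index 0.

```python
import numpy as np

def align_by_control_median(y):
    """Median-align a two-way layout.

    y has shape (n, p, m): n blocks, p treatments, m observations per
    cell, with treatment index 0 taken as the control.  The median of
    the control cell in each block is subtracted from every observation
    in that block (removing the block effect), and the N = n*p*m aligned
    values are then ranked jointly, ignoring the blocking.
    """
    med = np.median(y[:, 0, :], axis=1)        # control median per block
    x = y - med[:, None, None]                 # aligned observations
    ranks = x.ravel().argsort().argsort() + 1  # joint ranks 1..N (no ties)
    return x, ranks.reshape(y.shape)

rng = np.random.default_rng(0)
y = 10 * rng.normal(size=(4, 1, 1)) + rng.normal(size=(4, 3, 5))
x, r = align_by_control_median(y)
```

Because m is odd here, the control-cell median in each block is an actual sample value, so after alignment each block's control cell has median exactly zero.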
The basic difficulty with this procedure is the dependence created among the observations in each block when transformed in this manner, which complicates the distribution theory of the rank tests involved. One way to solve this problem is to make certain assumptions of symmetry and permutational invariance. However, Bhattacharyya (1973), using the median for alignment, was able to develop the distribution theory without making the above assumptions of symmetry or permutational invariance. This was achieved with the aid of Bahadur's (1966) expression for quantiles.
A particular criticism of many distribution-free techniques was
their inability to estimate location and scale parameters associated
with the underlying distributions.
However, Hodges and Lehmann
(1963) made a significant contribution in this area by developing an
estimator of location in both the one and two-sample cases.
Their
estimators are based on the corresponding linear rank statistics
used for testing hypotheses, and thus preserve to the estimators
many of the efficiency properties possessed by the rank statistics on
which they are based.
A drawback with the estimators developed is
that they are not asymptotically distribution-free, their asymptotic
variance depending on the underlying distribution function.
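As a concrete illustration of the estimators referred to here: in the one-sample case the Hodges-Lehmann estimator is the median of the Walsh averages, and in the two-sample case it is the median of all pairwise differences. A minimal sketch (function names are ours):

```python
import numpy as np
from itertools import combinations_with_replacement

def hl_one_sample(x):
    """Hodges-Lehmann location estimate: median of the Walsh
    averages (x_i + x_j)/2 over all pairs i <= j."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(x, 2)]
    return float(np.median(walsh))

def hl_shift(x, y):
    """Hodges-Lehmann estimate of the shift of y relative to x:
    median of all pairwise differences y_j - x_i."""
    return float(np.median(np.subtract.outer(np.asarray(y), np.asarray(x))))
```

Both estimators invert the corresponding Wilcoxon statistics, which is why they inherit the rank tests' efficiency properties while their asymptotic variance still involves the unknown density.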
The Hodges-Lehmann approach was extended to linear models by
Lehmann (1963a,1963b,1963c) when using a Wilcoxon score function.
He was able to devise an analysis-of-variance-type notation, and
formulated expressions for tests of hypotheses and multiple comparisons procedures.
The asymptotic properties of the estimators were
derived and it was shown that the asymptotic efficiency of these
estimators relative to the standard least squares estimators was the
same as the Pitman-efficiency of the Wilcoxon test relative to the
t-test.
Much of this work was generalized to a broader class of scores
by Bhuchongkul and Puri (1965) and by Sen (1966).
This approach was
then applied to the two-way layout by Puri and Sen (1967).
Sen (1968b) also expanded his work on aligned rank order statistics to allow for multiple comparison procedures using Tukey's T-method
of multiple comparisons for estimating contrasts in a randomized block
experiment.
This was done under the conditions assumed in his earlier
paper (1968a) in which he developed a general linear rank test for
aligned observations.
One of the objectives in the following pages will be to develop
an aligned rank procedure in the two-way layout, assuming a treatment-vs.-control situation with m observations per cell.
The median
of the control treatment in each block will be used for alignment.
The distribution of the Chernoff-Savage type test statistic will be investigated with particular emphasis toward the Wilcoxon test statistic. This objective will be pursued in a general setting to establish notation. Then the additive model assuming identically and independently distributed error components will be developed. In addition, the simplifications that result from assuming Pitman-type shift alternatives will be investigated. These simplifications will be applied in the last section, where a Hodges-Lehmann type estimator will be derived for the special case of the Wilcoxon score function. The distribution properties of these estimators will follow from the properties of the test statistics on which they are based. These properties will be stated explicitly.
2. MEDIAN ALIGNMENT--THE GENERAL CASE
2.1 Using a Known Median

Let there be a two-way layout consisting of n blocks, p treatments and m observations per cell, where the random variables, $Y_{ijk}$, have the cumulative distribution function $F_j^i(x)$. Also assume that $Y_{i1k}$ represents a random variable for the control treatment in block i. In the above notation, i=1,2,...,n; j=1,2,...,p; k=1,2,...,m; and $\nu_{i1}$ represents the population median for the control treatment in the i-th block, therefore representing the common median in the i-th block under the null hypothesis of no treatment difference.

To align the N = mnp observations, the control treatment median in each block is subtracted from each observation in that block. These aligned observations will be denoted by $X_{ijk}$, where

$$X_{ijk} = Y_{ijk} - \nu_{i1}.$$
The aligned observations are then ranked and assigned a score, $E_{N\alpha}$. The following statistic is then formulated:

$$T_{Nj} = (mn)^{-1} \sum_{i=1}^{n} \sum_{\alpha=1}^{N} E_{N\alpha}\, Z_{N\alpha}(i,j) \qquad (2.1.1)$$

where

$$Z_{N\alpha}(i,j) = \begin{cases} 1 & \text{if the $\alpha$-th smallest observation came from the $(i,j)$-th cell} \\ 0 & \text{otherwise.} \end{cases} \qquad (2.1.2)$$

Thus $T_{Nj}$ represents the mean score associated with the j-th treatment.
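For the Wilcoxon scores used later ($E_{N\alpha} = \alpha/N$), $T_{Nj}$ is simply the average of rank/N over the mn observations of treatment j. A sketch (the array shape is an assumption of the illustration):

```python
import numpy as np

def t_nj(x):
    """T_Nj of (2.1.1)-(2.1.2) with Wilcoxon scores E_{N,alpha} = alpha/N.

    x: aligned observations with shape (n, p, m).
    Returns the length-p vector (T_N1, ..., T_Np).
    """
    n, p, m = x.shape
    N = n * p * m
    scores = np.empty(N)
    scores[x.ravel().argsort()] = np.arange(1, N + 1) / N  # score of each obs
    return scores.reshape(n, p, m).mean(axis=(0, 2))       # mean over i and k

rng = np.random.default_rng(1)
T = t_nj(rng.normal(size=(3, 4, 5)))
```

Since every score is used exactly once, the average of the $T_{Nj}$ over treatments is the overall mean score $(N+1)/(2N)$, a handy sanity check.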
Also let

$$F_{mq}^g(x) = m^{-1}\big(\text{number of } X_{gqk} \le x,\ k=1,2,\ldots,m\big),$$

$$H_N(x) = (np)^{-1} \sum_{g=1}^{n} \sum_{q=1}^{p} F_{mq}^g(x), \qquad (2.1.3)$$

$$H(x) = (np)^{-1} \sum_{g=1}^{n} \sum_{q=1}^{p} F_q^g(x).$$

Then we can re-express $T_{Nj}$ in the following way:

$$T_{Nj} = n^{-1} \sum_{i=1}^{n} \int_{-\infty}^{\infty} J_N(H_N(x))\,dF_{mj}^i(x). \qquad (2.1.4)$$
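Expression (2.1.4) can be checked numerically: with Wilcoxon scores, $J_N(H_N(X_{ijk}))$ evaluates to rank$(X_{ijk})/N$, so the integral form agrees with the rank-average form of (2.1.1). A sketch under that assumption:

```python
import numpy as np

def t_nj_ranks(x):
    # rank-average form of (2.1.1): mean of rank/N per cell, then over blocks
    n, p, m = x.shape
    N = n * p * m
    r = x.ravel().argsort().argsort().reshape(n, p, m) + 1
    return (r / N).mean(axis=(0, 2))

def t_nj_ecdf(x):
    # ECDF form of (2.1.4): n^{-1} sum_i  integral J_N(H_N) dF^i_mj,
    # with J_N(u) = u and H_N(t) = #{X <= t} / N
    n, p, m = x.shape
    N = n * p * m
    flat = np.sort(x.ravel())
    H = np.searchsorted(flat, x, side="right") / N  # H_N at each observation
    return H.mean(axis=(0, 2))

x = np.random.default_rng(2).normal(size=(3, 4, 5))
```

With continuous data (no ties), `searchsorted(..., side="right")` returns exactly the joint rank, so the two forms coincide.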
The integral in the above expression can be expressed as

$$\int J_N(H_N(x))\,dF_{mj}^i(x) = \int_{I_N}\big[J_N(H_N(x)) - J(H_N(x))\big]\,dF_{mj}^i(x) + \int_{I_N} J(H_N(x))\,dF_{mj}^i(x) + \int_{H_N=1} J_N(H_N(x))\,dF_{mj}^i(x)$$

where $I_N$ denotes the region $0 \le H_N(x) < 1$. Following the approach of the paper by Chernoff and Savage (1958) and Puri (1964), this integral can be further re-expressed as

$$\int J_N(H_N(x))\,dF_{mj}^i(x) = A_{Nj}^i + B_{1Nj}^i + B_{2Nj}^i + \sum_{\ell=1}^{4} C_{\ell Nj}^i \qquad (2.1.5)$$
where

$$A_{Nj}^i = \int J(H(x))\,dF_j^i(x),$$

$$B_{1Nj}^i = \int J(H(x))\,d\big(F_{mj}^i(x) - F_j^i(x)\big),$$

$$B_{2Nj}^i = \int \big(H_N(x) - H(x)\big) J'(H(x))\,dF_j^i(x),$$

$$C_{1Nj}^i = \tfrac{1}{2}\int \big(H_N(x) - H(x)\big)^2 J''\big(\theta H_N + (1-\theta)H\big)\,dF_{mj}^i(x),$$

$$C_{2Nj}^i = \int_{I_N}\big(J_N(H_N(x)) - J(H_N(x))\big)\,dF_{mj}^i(x),$$

$$C_{3Nj}^i = \int_{H_N=1} J_N(H_N(x))\,dF_{mj}^i(x),$$

$$C_{4Nj}^i = -\int_{H_N=1}\Big(J(H(x)) + \big(H_N(x) - H(x)\big)J'(H(x))\Big)\,dF_{mj}^i(x).$$

Under the set of regularity conditions given by Chernoff and Savage (1958), the $A_{Nj}^i$ term has been shown to be finite. The C-terms have also been shown to be of $o_p(m^{-\frac{1}{2}})$ by Puri (1964). (Several of the C-terms become especially easy to deal with when using the Wilcoxon score function, J(u)=u.)
Following Puri (1964), let us write J(H(x)) as

$$J(H(x)) = \int J'(H(y))\,dH(y) = \int J'(H(y))\,d\Big(\sum_{g=1}^{n}\sum_{q=1}^{p}(np)^{-1}F_q^g(y)\Big) = \sum_{g=1}^{n}\sum_{q=1}^{p}(np)^{-1}B_q^g(x) \qquad (2.1.6)$$

where

$$B_q^g(x) = \int_{x_0}^{x} J'(H(y))\,dF_q^g(y)$$

with $x_0$ arbitrary, say, such that $H(x_0) = \tfrac{1}{2}$.

Now integrating $B_{2Nj}^i$ by parts, letting $U = J'(H)$ and

$$H_N - H = \sum_g\sum_q (np)^{-1}\big(F_{mq}^g(x) - F_q^g(x)\big),$$

we get
This gives

$$B_{2Nj}^i = \int \sum_g\sum_q (np)^{-1} B_q^g(x)\,d\big(F_{mj}^i(x) - F_j^i(x)\big) - \int B_j^i(x) \sum_g\sum_q (np)^{-1}\,d\big(F_{mq}^g(x) - F_q^g(x)\big)$$

$$= \int \sum_{\substack{g,q\\ g\ne i\,\&\,q\ne j}} (np)^{-1} B_q^g(x)\,d\big(F_{mj}^i(x) - F_j^i(x)\big) - \int B_j^i(x) \sum_{\substack{g,q\\ g\ne i\,\&\,q\ne j}} (np)^{-1}\,d\big(F_{mq}^g(x) - F_q^g(x)\big), \qquad (2.1.7)$$

since the (i,j)-th terms of the two unrestricted sums cancel one another.
(Note: In the following, for notational convenience, the expression $\sum\sum_{g\ne i\,\&\,q\ne j}$ should be understood to mean the double sum $\sum_{g}\sum_{q}$ excluding the term (i,j); if clear from the context it will be written simply as $\sum\sum_{g*q}$. Also the upper limits of the sums on g, q and k will be understood to be n, p and m respectively.)
$$B_{1Nj}^i + B_{2Nj}^i = \sum\sum_{g*q}(np)^{-1}\Big\{m^{-1}\sum_k\big(B_q^g(X_{ijk}) - E_j^i B_q^g(X_{ij})\big) - m^{-1}\sum_k\big(B_j^i(X_{gqk}) - E_q^g B_j^i(X_{gq})\big)\Big\}. \qquad (2.1.8)$$

In this expression, with $X_{ijk} = Y_{ijk} - \nu_{i1}$, if $\nu_{i1}$ represents the true median, then we have for a given i and j the sums of independent random variables, each being a mean of identically and independently distributed random variables.
Puri (1964) has shown that under proper regularity conditions on the score functions, $B_{1Nj}^i + B_{2Nj}^i$ satisfies the conditions for the validity of the Central Limit Theorem and so is asymptotically normally distributed. Puri (1964) has also shown that the $B_{1Nj}^i + B_{2Nj}^i$ are multivariate normal for all i and j.

Since $T_{Nj}$ is the mean of n multivariate normal random variables, each $T_{Nj}$ is also asymptotically normally distributed. It also follows from the work of Puri (1964) that $[\sqrt{m}(T_{N1} - \mu_{N1}), \ldots, \sqrt{m}(T_{Np} - \mu_{Np})]$ is multivariate normally distributed with a variance-covariance matrix similar to the one that is found in Puri's (1964) paper.

2.2 Using a Sample Median
Suppose in section 2.1 that $\nu_{i1}$ is not known and the sample median $\hat{\nu}_{i1}$ is used for alignment purposes instead. Using *'s to denote expressions involving sample medians, we would get

$$T_{Nj}^* = n^{-1}\sum_{i=1}^{n}\int J_N(H_N^*(x))\,dF_{mj}^{*i}(x) \qquad (2.2.1)$$

where $F_{mj}^{*i}(x)$ is the empirical distribution function of $X_{ijk} = Y_{ijk} - \hat{\nu}_{i1}$ and $H_N^*(x) = (np)^{-1}\sum_g\sum_q F_{mq}^{*g}(x)$. This leads to the following theorem.
Theorem 2.2.1: Let the random variable $X_{ijk}$ have the distribution $F_j^i(x)$. Also let the following three conditions hold:

(i) The densities, $f_j^i(x) = F_j^{i\prime}(x)$, are bounded and continuous for all i and j.

(ii) The quantity $m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})$ has a bounded variance for all i.

(iii) $F_j^{i\prime\prime}(x)$ is bounded in the neighborhood of $\nu_{i1}$.

Then for J(u)=u, the Wilcoxon score function, the vector

$$\big[m^{\frac{1}{2}}(T_{N1}^* - \mu_N(\tau_1)), \ldots, m^{\frac{1}{2}}(T_{Np}^* - \mu_N(\tau_p))\big]$$

has asymptotically a multivariate normal distribution with mean zero and variance-covariance terms defined in expressions (2.2.9) and (2.2.36) as found in the following proof.
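Before the proof, the content of the theorem can be probed by simulation: under the null hypothesis of no treatment effects, the centered and scaled statistics based on sample-median alignment should look approximately normal with mean zero. A rough Monte Carlo sketch (all settings are illustrative, not from the original):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, m, reps = 5, 3, 25, 400
stats = []
for _ in range(reps):
    # null model: block effects only, iid N(0,1) errors, no treatment effect
    y = 5 * rng.normal(size=(n, 1, 1)) + rng.normal(size=(n, p, m))
    x = y - np.median(y[:, 0, :], axis=1)[:, None, None]  # sample-median alignment
    N = n * p * m
    r = x.ravel().argsort().argsort().reshape(n, p, m) + 1
    T = (r / N).mean(axis=(0, 2))                          # Wilcoxon T*_Nj
    stats.append(np.sqrt(m) * (T[1] - T.mean()))           # centered treatment 2
stats = np.asarray(stats)
```

Histogramming `stats` (or a normal Q-Q plot) gives an informal check of the asymptotic normality claimed by the theorem.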
Proof: Following the same approach used when $\nu_{i1}$ was known, we can write

$$\int J_N(H_N^*(x))\,dF_{mj}^{*i}(x) = A_{Nj}^{*i} + B_{1Nj}^{*i} + B_{2Nj}^{*i} + \sum_{\ell=1}^{4} C_{\ell Nj}^{*i}$$

where $A_{Nj}^{*i} = A_{Nj}^i$. (The C*-terms are shown to be of order $o_p(N^{-\frac{1}{2}})$ in the appendix.)
Now we take the difference

$$m^{\frac{1}{2}}\big(B_{1Nj}^{*i} + B_{2Nj}^{*i}\big) - m^{\frac{1}{2}}\big(B_{1Nj}^{i} + B_{2Nj}^{i}\big)$$

which upon simplification equals

$$= I - II + III + IV.$$

Then we expand the expression inside the brackets of term I by a Taylor's series. This yields

$$I = (m^{\frac{1}{2}}/mnp)\sum\sum_{g*q}\sum_k (\hat{\nu}_{i1} - \nu_{i1})\,B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})$$

where $\tilde{\nu}_{i1}$ lies between $\hat{\nu}_{i1}$ and $\nu_{i1}$.
Further manipulation yields

$$I = (m^{\frac{1}{2}}/mnp)(\hat{\nu}_{i1} - \nu_{i1})\sum\sum_{g*q}\sum_k B_q^{g\prime}(Y_{ijk} - \nu_{i1}) + (m^{\frac{1}{2}}/mnp)(\hat{\nu}_{i1} - \nu_{i1})\sum\sum_{g*q}\sum_k\big(B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1}) - B_q^{g\prime}(Y_{ijk} - \nu_{i1})\big)$$

$$= m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,I_a + m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,I_b.$$

Now recalling from equation (2.1.6) that

$$B_q^g(Y_{ijk} - \nu_{i1}) = \int_{x_0}^{Y_{ijk} - \nu_{i1}} J'(H(z))\,dF_q^g(z),$$

let J(u)=u so that J'(u)=1. As a result

$$B_q^g(Y_{ijk} - \nu_{i1}) = F_q^g(Y_{ijk} - \nu_{i1}) + c.$$
It can be assumed without loss of generality that $\nu_{i1} = 0$ for all i. Then

$$B_q^g(Y_{ijk} - \nu_{i1}) = F_q^g(Y_{ijk}) + c,$$

so that $B_q^{g\prime}(Y_{ijk}) = f_q^g(Y_{ijk})$. With $\xrightarrow{a.s.}$ denoting convergence almost surely, we get

$$I_a = \sum\sum_{g*q}\sum_k B_q^{g\prime}(Y_{ijk} - \nu_{i1})/mnp = (np)^{-1}\sum\sum_{g*q} m^{-1}\sum_k f_q^g(Y_{ijk}) \qquad (2.2.4)$$

$$\xrightarrow{a.s.} (np)^{-1}\sum\sum_{g*q} \int_{-\infty}^{\infty} f_q^g(x)\,f_j^i(x)\,dx = (np)^{-1}\sum\sum_{g*q} G_{qj}^{gi}.$$

(Note that $G_{qj}^{gi} = G_{jq}^{ig}$.) This is finite since the densities are assumed to be bounded.
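For example, when both densities are standard normal, $G_{qj}^{gi} = \int \phi^2(x)\,dx = 1/(2\sqrt{\pi}) \approx 0.282$, which a direct numerical integration confirms:

```python
import numpy as np

# G = integral of f_q(x) * f_j(x) dx for two N(0,1) densities;
# the exact value is 1 / (2 * sqrt(pi))
xs = np.linspace(-10.0, 10.0, 200001)
phi = np.exp(-xs**2 / 2) / np.sqrt(2 * np.pi)
G = float(np.sum(phi * phi) * (xs[1] - xs[0]))  # Riemann sum on a fine grid
```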
Now write $I_b$ as

$$I_b = (np)^{-1}\sum\sum_{g*q} m^{-1}\sum_k\big(B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1}) - B_q^{g\prime}(Y_{ijk} - \nu_{i1})\big) = (np)^{-1}\sum\sum_{g*q} I_c.$$

We show that $I_c \xrightarrow{p} 0$ by Chebychev's inequality, written as

$$P(|I_c| \ge \epsilon) \le E(I_c^2)/\epsilon^2.$$

Therefore, we need to examine $E(I_c^2)$. This can be written as
$$E(I_c^2) = m^{-2}E\Big(\sum_k\big[B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1}) - B_q^{g\prime}(Y_{ijk} - \nu_{i1})\big]\Big)^2.$$

Now by our assumptions both $B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})$ and $B_q^{g\prime}(Y_{ijk} - \nu_{i1})$ are bounded, say by M. Therefore the first term on the right side of the equality for evaluating $E(I_c^2)$ is $\le 4M^2/m$, which approaches zero as m approaches infinity. The second term consists of the cross-product expectations

$$m^{-2}\sum_{k\ne h}\Big(E\big[B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})B_q^{g\prime}(Y_{ijh} - \tilde{\nu}_{i1})\big] - E\big[B_q^{g\prime}(Y_{ijk})B_q^{g\prime}(Y_{ijh})\big]\Big).$$

Since the $B_q^{g\prime}$ terms are continuous and $\tilde{\nu}_{i1} \xrightarrow{a.s.} 0\ (=\nu_{i1})$, the term $B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})$ converges in distribution to $B_q^{g\prime}(Y_{ijk})$. Moreover, $B_q^{g\prime}$ is bounded, so by the moment convergence theorem for bounded random variables the above expression goes to zero as m approaches infinity, and as a result $I_c$ converges in probability to zero. Hence $I_b$ converges in probability to zero.
In consequence of these results we can write

$$I = m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})(I_a + I_b) \simeq m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})(np)^{-1}\sum\sum_{g*q} G_{qj}^{gi}. \qquad (2.2.5)$$

Similarly, it follows that the term III involving the medians $\hat{\nu}_{g1}$ satisfies

$$III \simeq -m^{\frac{1}{2}}(np)^{-1}\sum\sum_{g*q}(\hat{\nu}_{g1} - \nu_{g1})\,G_{qj}^{gi}. \qquad (2.2.6)$$
Now to show that expressions II and IV approach zero, we consider expression II which, by a Taylor's expansion, can be written as

$$II = (m^{\frac{1}{2}}/mnp)\sum\sum_{g*q}\sum_k E_j^i\big((\hat{\nu}_{i1} - \nu_{i1})B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\big) = (np)^{-1}\sum\sum_{g*q} E_j^i\Big(m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\Big).$$

By squaring and taking the expectation of the term following the double summation, we get, by the Cauchy-Schwarz inequality,

$$\Big[E_j^i\Big(m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\Big)\Big]^2 \le E_j^i\big(m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\big)^2 \cdot E_j^i\Big(m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\Big)^2.$$

By assumption, $m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})$ has a bounded variance for all i, so that $E_j^i\big(m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\big)^2 < \infty$. Also, $B_q^{g\prime}$ is assumed to be bounded so that $E_j^i\big(m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\big)^2 < \infty$. Since $m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})$ converges in distribution to the same distribution as that of $m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,G_{qj}^{gi}$, and since its second moment is uniformly bounded, then

$$\lim_{m\to\infty} E_j^i\Big[m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\Big] = E_j^i\Big[\lim_{m\to\infty} m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1})\,m^{-1}\sum_k B_q^{g\prime}(Y_{ijk} - \tilde{\nu}_{i1})\Big] = 0$$

by the moment convergence theorem.
By the same argument it can be shown that $IV \to 0$. Then, upon collecting terms, we get

$$m^{\frac{1}{2}}\big(B_{1Nj}^{*i} + B_{2Nj}^{*i}\big) - m^{\frac{1}{2}}\big(B_{1Nj}^{i} + B_{2Nj}^{i}\big) = (m^{\frac{1}{2}}/np)\sum\sum_{g*q}\big((\hat{\nu}_{i1} - \nu_{i1}) - (\hat{\nu}_{g1} - \nu_{g1})\big)G_{qj}^{gi} + o_p(1).$$
If we assume $F_j^{i\prime\prime}(x)$ is bounded in the neighborhood of $\nu_{i1}$, then using Bahadur's (1966) representation for quantiles, we get

$$m^{\frac{1}{2}}(\hat{\nu}_{i1} - \nu_{i1}) = m^{-\frac{1}{2}}\sum_k\big(I(Y_{i1k}, \nu_{i1}) - F_1^i(\nu_{i1})\big)\big/f_1^i(\nu_{i1}) + m^{\frac{1}{2}}R_N,$$

where $R_N$ is $o_p(m^{-\frac{1}{2}})$ and $I(Y,\nu) = 1$ if $Y \le \nu$ and 0 otherwise. Substituting, we obtain

$$m^{\frac{1}{2}}\big(B_{1Nj}^{*i} + B_{2Nj}^{*i}\big) = (m^{\frac{1}{2}}/np)\sum\sum_{g*q}\Big\{m^{-1}\sum_k\big(B_q^g(Y_{ijk} - \nu_{i1}) - E_j^i B_q^g(Y_{ij} - \nu_{i1})\big) - m^{-1}\sum_k\big(B_j^i(Y_{gqk} - \nu_{g1}) - E_q^g B_j^i(Y_{gq} - \nu_{g1})\big)\Big\}$$
$$+\ (np)^{-1}\sum\sum_{g*q} G_{qj}^{gi}\Big\{m^{-\frac{1}{2}}\sum_k\big(I(Y_{i1k}, \nu_{i1}) - F_1^i(\nu_{i1})\big)\big/f_1^i(\nu_{i1}) - m^{-\frac{1}{2}}\sum_k\big(I(Y_{g1k}, \nu_{g1}) - F_1^g(\nu_{g1})\big)\big/f_1^g(\nu_{g1})\Big\} + o_p(1). \qquad (2.2.8)$$
In expression (2.2.8) we can assume $\nu_{i1}$ and $\nu_{g1}$ are equal to zero without loss of generality. When $j \ne 1$ we can write, using expression (2.1.8),
$$B_{1Nj}^i + B_{2Nj}^i = m^{-1}\sum_k\big[J(H(Y_{ijk})) - E_j^i J(H(Y_{ijk}))\big] - (mnp)^{-1}\sum\sum_{g*q}\sum_k\big[B_j^i(Y_{gqk}) - E_q^g B_j^i(Y_{gqk})\big] = B'(Y_{ij}) - \sum\sum_{g*q} B_j^{i\prime}(Y_{gq}),$$

where $B'(Y_{ij})$ and $B_j^{i\prime}(Y_{gq})$ denote the respective centered cell means. Now suppose we consider $B_{1Nj}^{*i} + B_{2Nj}^{*i}$ from expression (2.2.8). As before,

$$B_{1Nj}^{*i} + B_{2Nj}^{*i} = B'(Y_{ij}) - \sum\sum_{g*q} B_j^{i\prime}(Y_{gq}) + (np)^{-1}\sum\sum_{g*q} G_{qj}^{gi}\big[(\hat{\nu}_{i1} - \nu_{i1}) - (\hat{\nu}_{g1} - \nu_{g1})\big] + o_p(m^{-\frac{1}{2}}).$$

The last expression involving the medians in the above sum can be written

$$(np)^{-1}\sum\sum_{g*q} G_{qj}^{gi}\big[(\hat{\nu}_{i1} - \nu_{i1}) - (\hat{\nu}_{g1} - \nu_{g1})\big] = (\hat{\nu}_{i1} - \nu_{i1})G_{ij}^{*}/np - \sum_{g\ne i}(\hat{\nu}_{g1} - \nu_{g1})G_{gj}^{*}/np,$$

where $G_{ij}^{*} = \sum\sum_{g*q} G_{qj}^{gi}$ and $G_{gj}^{*} = \sum_q G_{qj}^{gi}$. Combining terms with common Y-subscripts, we get, after absorbing the value np,

$$B_{1Nj}^{*i} + B_{2Nj}^{*i} = B^{*}(Y_{ij}) + \sum_{\substack{g=1\\ g\ne i}}^{n}\Big\{B_j^{*i}(Y_{g1}) - (\hat{\nu}_{g1} - \nu_{g1})G_{gj}^{*}\Big\} + \sum_{\substack{g=1\\ g\ne i}}^{n}\sum_{\substack{q=2\\ q\ne j}}^{p} B_q^{*g}(Y_{gq}).$$
This expression is a sum of np independent terms, each of which is asymptotically normal since each is the mean of i.i.d. random variables with finite 1-st and 2-nd moments. Therefore, their sum is asymptotically normal and the individual components would be multivariate normal. (The terms involving the expression $\hat{\nu}_{g1} - \nu_{g1}$ can be shown to be asymptotically normal using Bahadur's (1966) representation for quantiles.)

When j=1, a similar type of expression can be formulated involving np independent, asymptotically normal terms. Therefore, for all i and j, it can be concluded that $B_{1Nj}^{*i} + B_{2Nj}^{*i}$ is asymptotically normal with its individual components being multivariate normal.

We then argue that $T_{Nj}^*$ is the mean of multivariate, asymptotically normal terms and is therefore asymptotically normal itself for all j. Continuing the argument, the vector $[m^{\frac{1}{2}}(T_{N1}^* - \mu_N(\tau_1)), \ldots, m^{\frac{1}{2}}(T_{Np}^* - \mu_N(\tau_p))]$ is made up of sums of terms that are multivariate normal and therefore is multivariate normal itself.
The elements that make up the variance-covariance matrix will be denoted by $V(T_{Nj}^*)$ and $Cov(T_{Nj}^*, T_{Nh}^*)$. To find the variance terms we write

$$mV(T_{Nj}^*) = mV\Big(n^{-1}\sum_i\big(B_{1Nj}^{*i} + B_{2Nj}^{*i}\big)\Big) = (m/n^2)\Big\{\sum_i V\big(B_{1Nj}^{*i} + B_{2Nj}^{*i}\big) + 2\sum_{i<h} Cov\big(B_{1Nj}^{*i} + B_{2Nj}^{*i};\ B_{1Nj}^{*h} + B_{2Nj}^{*h}\big)\Big\}. \qquad (2.2.9)$$
We will evaluate the variance and covariance terms found in expression (2.2.9). First, the components in equation (2.2.8) will be denoted as a, b, c and d, in the order in which they appear there (2.2.10). Next we write

$$V\big(m^{\frac{1}{2}}(B_{1Nj}^{*i} + B_{2Nj}^{*i})\big) = V(a-b) + V(c) + V(d) + 2Cov(a,c) - 2Cov(b,c) - 2Cov(a,d) + 2Cov(b,d) - 2Cov(c,d), \qquad (2.2.11)$$

where some terms drop out due to the independence between them.
The integral in the
second te<.cm on the right hand
side of
the
equality sign, upon integrating by parts, is equal to
with gj:i when q""j"
The
expectat:ion of t.his term :is zero, therefore its va:':"! anc:e
18
is found to be
"
Ei(~/np)\(F g(x)-Fg(x»J'(H(x»dF~(x)12
..
,'.;
mq
q
J
_.
Consistent with the result Puri (1964)obtained, this expectation is
equal to
(2.2.13)
=2(np)-2 ffF(x:y;gq)dF(x; ij )dF(y; ij)
\',
where in the simpler notation,
F(x:y,g~
g
g
F (x)(l-F (y»J'(H(x»J ' (H(y»
and the region of integration -~<x<y<oo
q
q
is denoted by R
while R
denotes
yx
xy
will be used to denote
-~<y<x<~.
Next we want to find the variance of the first term in expression (2.2.11) which, upon integrating by parts, can be written as

$$-(m^{\frac{1}{2}}/np)\sum\sum_{g*q}\int\big(F_{mj}^i(x) - F_j^i(x)\big)J'(H(x))\,dF_q^g(x).$$

This term also has expectation of zero. Therefore, the term is squared and its expected value taken. Applying Puri's (1964) results again and accounting for the double summation, we get

$$2(np)^{-2}\sum\sum_{g*q}\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;gq)$$
$$+\ (np)^{-2}\sum_{\substack{g,q,t\\ g*q,\,g*t,\,q\ne t}}\Big\{\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;gt) + \iint_{R_{yx}} F(y:x;ij)\,dF(x;gq)\,dF(y;gt)\Big\}$$
$$+\ (np)^{-2}\sum_{\substack{g,s,q\\ g*q,\,s*q,\,g\ne s}}\Big\{\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;sq) + \iint_{R_{yx}} F(y:x;ij)\,dF(x;gq)\,dF(y;sq)\Big\}$$
$$+\ (np)^{-2}\sum_{\substack{g,s,q,t\\ g*q,\,s*t,\,g\ne s,\,q\ne t}}\Big\{\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;st) + \iint_{R_{yx}} F(y:x;ij)\,dF(x;gq)\,dF(y;st)\Big\}$$
$$= Q + R + S + T \qquad (2.2.14)$$

where Q, R, S and T stand for the four terms as written above. If we let P equal the result of expression (2.2.13) summed over the range g*q, then we get

$$V(a-b) = P + Q + R + S + T. \qquad (2.2.15)$$
The next term to consider is V(c), written as

$$V(c) = V\Big(m^{\frac{1}{2}}\sum\sum_{g*q}(mnp)^{-1}G_{qj}^{gi}\sum_k\big(I(Y_{i1k},0) - \tfrac{1}{2}\big)\big/f_1^i(0)\Big) = \Big(\sum\sum_{g*q} G_{qj}^{gi}\Big)^2\big(2npf_1^i(0)\big)^{-2}. \qquad (2.2.16)$$

Also,

$$V(d) = (2np)^{-2}\sum_{g\ne i}\Big(\sum_q G_{qj}^{gi}\big/f_1^g(0)\Big)^2. \qquad (2.2.17)$$

Next,

$$Cov(a,c) = Cov\Big(m^{\frac{1}{2}}\sum\sum_{g*q}(mnp)^{-1}\sum_k\big(B_q^g(Y_{ijk}) - E_j^i B_q^g(Y_{ij})\big);\ m^{\frac{1}{2}}\sum\sum_{g*q}(mnp)^{-1}G_{qj}^{gi}\sum_k\big(I(Y_{i1k},0) - \tfrac{1}{2}\big)\big/f_1^i(0)\Big)$$
$$= \begin{cases} 0 & \text{if } j \ne 1 \\ \sum\sum_{g*q} G_{q1}^{gi}\,Cov\big(B_q^g(Y_{i1k}); I(Y_{i1k})\big)\big/(np)^2 f_1^i(0) & \text{if } j = 1. \end{cases} \qquad (2.2.18)$$

In expression (2.2.18), the covariance term for the Wilcoxon score function is

$$Cov\big(B_q^g(Y_{i1k}); I(Y_{i1k})\big) = Cov\big(F_q^g(Y_{i1k}); I(Y_{i1k})\big).$$

This result is obtained by recognizing that $B_q^g = F_q^g$ when J(u)=u. Also, I(Y,0) is a binomial random variable with parameter ½.
Next,

$$Cov(a,d) = Cov\Big(a;\ m^{\frac{1}{2}}\sum\sum_{g*q}(mnp)^{-1}G_{qj}^{gi}\sum_k\big(I(Y_{g1k},0) - \tfrac{1}{2}\big)\big/f_1^g(0)\Big) \qquad (2.2.19)$$

$$= \begin{cases} 0 & \text{if } j \ne 1 \\ \Big(\sum_{q\ne 1} G_{q1}^{ii}\Big)\sum\sum_{g*q} Cov\big(B_q^g(Y_{i1k}); I(Y_{i1k})\big)\big/(np)^2 f_1^i(0) & \text{if } j = 1. \end{cases} \qquad (2.2.20)$$

Similarly,

$$Cov(b,c) = \begin{cases} 0 & \text{if } j = 1 \\ G_{ij}^{*}\,Cov\big(B_j^i(Y_{i1k}); I(Y_{i1k})\big)\big/(np)^2 f_1^i(0) & \text{if } j \ne 1, \end{cases} \qquad (2.2.21)$$

and

$$Cov(c,d) = \Big(\sum\sum_{g*q} G_{qj}^{gi}\Big)\Big(\sum_{q\ne j} G_{qj}^{ii}\Big)\big/\big(2npf_1^i(0)\big)^2. \qquad (2.2.22)$$
The next step is to find

$$Cov\big(m^{\frac{1}{2}}(B_{1Nj}^{*i} + B_{2Nj}^{*i});\ m^{\frac{1}{2}}(B_{1Nj}^{*h} + B_{2Nj}^{*h})\big), \qquad i \ne h,$$

where we let $m^{\frac{1}{2}}(B_{1Nj}^{*h} + B_{2Nj}^{*h}) = a' - b' + c' - d'$. Here, a', b', c' and d' are of the same form as a, b, c and d of equation (2.2.10) except the i's are replaced by the h's in the superscripting. Therefore, we can write

$$Cov(a-b+c-d;\ a'-b'+c'-d') = Cov(a-b,a'-b') + Cov(a,c') - Cov(a,d') - Cov(b,c') + Cov(b,d') + Cov(c,a') - Cov(c,b') + Cov(c,c') - Cov(c,d') - Cov(d,a') + Cov(d,b') - Cov(d,c') + Cov(d,d'). \qquad (2.2.23)$$

Term by term, we get

$$Cov(a-b,a'-b') = m\,Cov\big(B_{1Nj}^i + B_{2Nj}^i;\ B_{1Nj}^h + B_{2Nj}^h\big).$$

(In the following, the notation will be shortened for convenience by dropping the N.)
Expressing the covariance as expected values, we get

$$Cov(a-b,a'-b') = mE\big(B_{1j}^i B_{2j}^h\big) + mE\big(B_{2j}^i B_{1j}^h\big) + mE\big(B_{2j}^i B_{2j}^h\big).$$

Integrating $B_{1j}$ by parts and taking the expectation of the resulting product, the cross-product terms drop out, leaving the result

$$mE\big(B_{1j}^i B_{2j}^h\big) = -(np)^{-2}\Big[\sum\sum_{g*q}\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;hj) + \sum\sum_{g*q}\iint_{R_{yx}} F(y:x;ij)\,dF(x;gq)\,dF(y;hj)\Big]. \qquad (2.2.24)$$

Similarly,

$$mE\big(B_{2j}^i B_{1j}^h\big) = -(np)^{-2}\Big[\sum\sum_{g*q}\iint_{R_{xy}} F(x:y;hj)\,dF(x;gq)\,dF(y;ij) + \sum\sum_{g*q}\iint_{R_{yx}} F(y:x;hj)\,dF(x;gq)\,dF(y;ij)\Big]. \qquad (2.2.25)$$

And finally,

$$mE\big(B_{2j}^i B_{2j}^h\big) = (np)^{-2}\Big[\sum\sum_{g*q}\iint_{R_{xy}} F(x:y;gq)\,dF(x;ij)\,dF(y;hj) + \sum\sum_{g*q}\iint_{R_{yx}} F(y:x;gq)\,dF(x;ij)\,dF(y;hj)\Big]. \qquad (2.2.26)$$

Adding these three expressions together yields Cov(a-b,a'-b'). Also, since $h \ne i$, $Cov(a,c') = Cov(a',c) = 0$.
The next terms are

$$Cov(a,d') = \begin{cases} 0 & \text{if } j \ne 1 \\ \big(\textstyle\sum_q G_{q1}^{ih}\big)\sum\sum_{g*q} Cov\big(B_{gq}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } j = 1 \end{cases} \qquad (2.2.27)$$

$$Cov(a',d) = \begin{cases} 0 & \text{if } j \ne 1 \\ \big(\textstyle\sum_q G_{q1}^{hi}\big)\sum\sum_{g*q} Cov\big(B_{gq}; I_{h1}\big)\big/(np)^2 f_1^h(0) & \text{if } j = 1 \end{cases} \qquad (2.2.28)$$

$$Cov(b,c') = G_{hj}^{*}\,Cov\big(B_{ij}; I_{h1}\big)\big/(np)^2 f_1^h(0) \qquad (2.2.29)$$

$$Cov(b',c) = G_{ij}^{*}\,Cov\big(B_{hj}; I_{i1}\big)\big/(np)^2 f_1^i(0) \qquad (2.2.30)$$

$$Cov(b,d') = \sum_{g\ne h}\big(\textstyle\sum_q G_{qj}^{gh}\big)Cov\big(B_{ij}; I_{g1}\big)\big/(np)^2 f_1^g(0) \qquad (2.2.31)$$

$$Cov(b',d) = \sum_{g\ne i}\big(\textstyle\sum_q G_{qj}^{gi}\big)Cov\big(B_{hj}; I_{g1}\big)\big/(np)^2 f_1^g(0) \qquad (2.2.32)$$

$$Cov(c,d') = G_{ij}^{*}\big(\textstyle\sum_q G_{qj}^{ih}\big)\big/\big(2npf_1^i(0)\big)^2 \qquad (2.2.33)$$

$$Cov(c',d) = G_{hj}^{*}\big(\textstyle\sum_q G_{qj}^{hi}\big)\big/\big(2npf_1^h(0)\big)^2 \qquad (2.2.34)$$

$$Cov(d,d') = (2np)^{-2}\sum_{\substack{g\ne i\\ g\ne h}}\big(\textstyle\sum_q G_{qj}^{gi}\big)\big(\textstyle\sum_q G_{qj}^{gh}\big)\big/\big(f_1^g(0)\big)^2 \qquad (2.2.35)$$

where $Cov(B_{gq}; I_{g1})$ abbreviates $Cov\big(B_q^g(Y_{g1k}); I(Y_{g1k})\big)$ as before. Adding the appropriate terms together from equations (2.2.15) through (2.2.18), plus (2.2.19) through (2.2.22), and from (2.2.24) through (2.2.35), and putting them together as indicated by equations (2.2.11) and (2.2.23), we ultimately get the variance term defined by equation (2.2.9).
Now we need to determine the form of $m\,Cov(T_{Nj}^*, T_{Nh}^*)$. We can write

$$m\,Cov(T_{Nj}^*, T_{Nh}^*) = m\,Cov\Big(n^{-1}\sum_i\big(B_{1Nj}^{*i} + B_{2Nj}^{*i}\big);\ n^{-1}\sum_i\big(B_{1Nh}^{*i} + B_{2Nh}^{*i}\big)\Big). \qquad (2.2.36)$$

And

$$Cov\big(m^{\frac{1}{2}}(B_{1Nj}^{*i} + B_{2Nj}^{*i});\ m^{\frac{1}{2}}(B_{1Nh}^{*i} + B_{2Nh}^{*i})\big) = Cov(a-b+c-d,\ a''-b''+c''-d'')$$

where a, b, c and d are previously defined as the components of equation (2.2.10), and a'', b'', c'' and d'' are defined in the same manner, except for replacing each j with an h in the subscripting. Remember that $j \ne h$ in all that follows.
The term Cov(a-b+c-d, a''-b''+c''-d'') has already been expanded in expression (2.2.23). The evaluation of the individual terms in that expression is as follows:

$$Cov(a-b,a''-b'') = -(np)^{-2}\sum\sum_{g*q}\Big[\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;ih) + \iint_{R_{yx}} F(y:x;ij)\,dF(x;gq)\,dF(y;ih)$$
$$+ \iint_{R_{xy}} F(x:y;ih)\,dF(x;gq)\,dF(y;ij) + \iint_{R_{yx}} F(y:x;ih)\,dF(x;gq)\,dF(y;ij)$$
$$- \iint_{R_{xy}} F(x:y;gq)\,dF(x;ij)\,dF(y;ih) - \iint_{R_{yx}} F(y:x;gq)\,dF(x;ij)\,dF(y;ih)\Big] \qquad (2.2.37)$$

$$Cov(a,c'') = \begin{cases} 0 & \text{if } j \ne 1 \\ G_{ih}^{*}\sum\sum_{g*q} Cov\big(B_{gq}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } j = 1 \end{cases} \qquad (2.2.38)$$

$$Cov(a'',c) = \begin{cases} 0 & \text{if } h \ne 1 \\ G_{ij}^{*}\sum\sum_{g*q} Cov\big(B_{gq}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } h = 1 \end{cases} \qquad (2.2.39)$$

$$Cov(a,d'') = \begin{cases} 0 & \text{if } j \ne 1 \\ \big(\textstyle\sum_{q\ne h} G_{q1}^{ii}\big)\sum\sum_{g*q} Cov\big(B_{gq}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } j = 1 \end{cases} \qquad (2.2.40)$$

$$Cov(a'',d) = \begin{cases} 0 & \text{if } h \ne 1 \\ \big(\textstyle\sum_{q\ne j} G_{q1}^{ii}\big)\sum\sum_{g*q} Cov\big(B_{gq}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } h = 1 \end{cases} \qquad (2.2.41)$$

$$Cov(b,c'') = \begin{cases} 0 & \text{if } j = 1 \\ G_{ih}^{*}\,Cov\big(B_{ij}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } j \ne 1 \end{cases} \qquad (2.2.42)$$

$$Cov(b'',c) = \begin{cases} 0 & \text{if } h = 1 \\ G_{ij}^{*}\,Cov\big(B_{ih}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } h \ne 1 \end{cases} \qquad (2.2.43)$$

$$Cov(b,d'') = \sum_{g\ne i}\big(\textstyle\sum_q G_{qh}^{gi}\big)Cov\big(B_{ij}; I_{g1}\big)\big/(np)^2 f_1^g(0) \qquad (2.2.44)$$

$$Cov(b'',d) = \sum_{g\ne i}\big(\textstyle\sum_q G_{qj}^{gi}\big)Cov\big(B_{ih}; I_{g1}\big)\big/(np)^2 f_1^g(0) \qquad (2.2.45)$$

$$Cov(c,c'') = G_{ij}^{*}G_{ih}^{*}\big/\big(2npf_1^i(0)\big)^2 \qquad (2.2.46)$$

$$Cov(c,d'') = G_{ij}^{*}\big(\textstyle\sum_{q\ne h} G_{qh}^{ii}\big)\big/\big(2npf_1^i(0)\big)^2 \qquad (2.2.47)$$

$$Cov(c'',d) = G_{ih}^{*}\big(\textstyle\sum_{q\ne j} G_{qj}^{ii}\big)\big/\big(2npf_1^i(0)\big)^2 \qquad (2.2.48)$$

$$Cov(d,d'') = (2np)^{-2}\Big\{\sum_{g\ne i}\big(\textstyle\sum_q G_{qj}^{gi}\big)\big(\textstyle\sum_q G_{qh}^{gi}\big)\big/\big(f_1^g(0)\big)^2 + \big(\textstyle\sum_{q\ne j} G_{qj}^{ii}\big)\big(\textstyle\sum_{q\ne h} G_{qh}^{ii}\big)\big/\big(f_1^i(0)\big)^2\Big\} \qquad (2.2.49)$$

Next we consider the term
1
Cov(Bt~+B~~;Bl~+B~~)=cov(a-b+c-d,
a"'-b"'+c"'-d''') where a, b, c and d are identified in equation
(2.2.10).
The primed a, b, c and d are defined in the same manner,
except for replacing an
superscripting.
i
by an r and a j by an h in the sub and
Therefore we can write
$$Cov(a-b,a'''-b''') = -(np)^{-2}\sum\sum_{g*q}\Big[\iint_{R_{xy}} F(x:y;ij)\,dF(x;gq)\,dF(y;rh) + \iint_{R_{yx}} F(y:x;ij)\,dF(x;gq)\,dF(y;rh)$$
$$+ \iint_{R_{xy}} F(x:y;rh)\,dF(x;gq)\,dF(y;ij) + \iint_{R_{yx}} F(y:x;rh)\,dF(x;gq)\,dF(y;ij)$$
$$- \iint_{R_{xy}} F(x:y;gq)\,dF(x;ij)\,dF(y;rh) - \iint_{R_{yx}} F(y:x;gq)\,dF(x;ij)\,dF(y;rh)\Big] \qquad (2.2.50)$$
$$Cov(a,c''') = Cov(a''',c) = 0 \qquad (2.2.51)$$

$$Cov(a,d''') = \begin{cases} 0 & \text{if } j \ne 1 \\ \big(\textstyle\sum_q G_{q1}^{ir}\big)\sum\sum_{g*q} Cov\big(B_{gq}; I_{i1}\big)\big/(np)^2 f_1^i(0) & \text{if } j = 1 \end{cases} \qquad (2.2.52)$$

$$Cov(a''',d) = \begin{cases} 0 & \text{if } h \ne 1 \\ \big(\textstyle\sum_q G_{q1}^{ri}\big)\sum\sum_{g*q} Cov\big(B_{gq}; I_{r1}\big)\big/(np)^2 f_1^r(0) & \text{if } h = 1 \end{cases} \qquad (2.2.53)$$

$$Cov(b,c''') = G_{rh}^{*}\,Cov\big(B_{ij}; I_{r1}\big)\big/(np)^2 f_1^r(0) \qquad (2.2.54)$$

$$Cov(b''',c) = G_{ij}^{*}\,Cov\big(B_{rh}; I_{i1}\big)\big/(np)^2 f_1^i(0) \qquad (2.2.55)$$

$$Cov(b,d''') = \sum_{g\ne r}\big(\textstyle\sum_q G_{qh}^{gr}\big)Cov\big(B_{ij}; I_{g1}\big)\big/(np)^2 f_1^g(0) \qquad (2.2.56)$$

$$Cov(b''',d) = \sum_{g\ne i}\big(\textstyle\sum_q G_{qj}^{gi}\big)Cov\big(B_{rh}; I_{g1}\big)\big/(np)^2 f_1^g(0) \qquad (2.2.57)$$

$$Cov(c,c''') = 0 \qquad (2.2.58)$$

$$Cov(c,d''') = G_{ij}^{*}\big(\textstyle\sum_q G_{qh}^{ir}\big)\big/\big(2npf_1^i(0)\big)^2 \qquad (2.2.59)$$

$$Cov(c''',d) = G_{rh}^{*}\big(\textstyle\sum_q G_{qj}^{ri}\big)\big/\big(2npf_1^r(0)\big)^2 \qquad (2.2.60)$$
21
(2.2.61)
+
"'-... .
.
i i
<.~ ~ G .)(
ql=j q]
Equations (2.2.37) through (2.2.49) and equations (2.2.50) through (2.2.61) are put together in the manner indicated by equation (2.2.23). They are then inserted into equation (2.2.36). This yields the covariance term sought.
3. ASSUMING IDENTICALLY DISTRIBUTED ERROR TERMS
3.1 Introduction
In chapter 2 we assumed that, for the linear model under consideration, $X_{ijk}$ had the distribution $F^{i}_{j}(x)$, where the i and j indicate a dependency on the block and treatment effects. This could be due to different variances within each cell that are unaffected by the median alignment. In this section, the more customary assumption is made that the error term, $\epsilon_{ijk}$, is distributed according to some continuous distribution function F(x). Then
\[
X_{ijk} = Y_{ijk} - \hat{\xi}_{i1},
\]
where $\hat{\xi}_{i1}$ is the sample median of the control treatment in block i, will be free of the block effect and will have the distribution $F_j(x)$, which indicates a dependency only on the treatment effect, $\tau_j$. We assume without loss of generality that $\tau_1$, the control-treatment effect, is zero.

Since these assumptions represent a special case of the results obtained in chapter 2, Theorem 2.2.1 will hold without additional proof. However, there will be a significant simplification in the expressions for the variance and covariance terms obtained there.
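As a concrete illustration of the median alignment used throughout, the sketch below subtracts the sample median of the control treatment in each block from every observation in that block (the array shapes and simulated data are assumptions for illustration, not the author's code):

```python
import numpy as np

def median_align(y):
    """Median-align a two-way layout stored as y[block, treatment, rep],
    with treatment index 0 as the control: subtract the sample median of
    the control observations in each block from that whole block."""
    control_medians = np.median(y[:, 0, :], axis=1)     # one median per block
    return y - control_medians[:, None, None]           # X_ijk = Y_ijk - median_i

# simulated layout: n=4 blocks, p=3 treatments, m=5 replicates
rng = np.random.default_rng(0)
n, p, m = 4, 3, 5
tau = np.array([0.0, 0.5, 1.0])                         # treatment effects, tau_1 = 0
beta = rng.normal(size=n)                               # nuisance block effects
y = beta[:, None, None] + tau[None, :, None] + rng.normal(size=(n, p, m))
x = median_align(y)
# the control observations in each block now have median exactly zero
print(np.allclose(np.median(x[:, 0, :], axis=1), 0.0))  # → True
```

Because m is odd here, the control median in each block is an actual observation, so the aligned control medians are exactly zero.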
3.2 The Variance-Covariance Terms

Before the variance and covariance terms of chapter 2 are simplified, it will be useful to note the following preliminary simplifications:
First, the constants no longer depend on the block, $G^{ig}_{qj} = G_{qj}$, and the components of chapter 2 reduce to

(3.2.1)
\[ a = m^{1/2}\sum_{g\neq q}\sum_{k}(mnp)^{-1}\big(B_q(Y_{ijk}) - E(B_q(Y_{ijk}))\big) \]

(3.2.2)
\[ b = m^{1/2}\sum_{g\neq q}\sum_{k}(mnp)^{-1}\big(B_j(Y_{gqk}) - E(B_j(Y_{gqk}))\big) \]

(3.2.3)
\[ c = m^{1/2}\sum_{g\neq q}\sum_{k}(mnp)^{-1}\,G_{qj}\big(I(Y_{i1k}) - \tfrac12\big)\big/f_1(0) \]

(3.2.4)
\[ d = m^{1/2}\sum_{g\neq q}\sum_{k}(mnp)^{-1}\,G_{qj}\big(I(Y_{g1k}) - \tfrac12\big)\big/f_1(0) \]
Therefore, according to equation (2.2.11),
\[
V\big(m^{1/2}(B^{i}_{1Nj}+B^{i}_{2Nj})\big) = V(a-b)+V(c)+V(d)-2\operatorname{Cov}(b,c)+2\operatorname{Cov}(a,c)-2\operatorname{Cov}(a,d)+2\operatorname{Cov}(b,d)-2\operatorname{Cov}(c,d).
\]
Now the term V(a-b), in light of the simplification, becomes by equations (2.2.13) through (2.2.15)

(3.2.5)
\[
\begin{aligned}
V(a-b) = {}& 2(np)^{-2}\sum_{g\neq q}\Big[\iint_{R_{xy}}F(x{:}y;q)\,dF(x;j)\,dF(y;j)
+ \iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;q)\Big]\\
&+ (np)^{-2}\sum_{\substack{g,q,t\\ g\neq q,\,g\neq t,\,q\neq t}}\Big[\iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;t)
+ \iint_{R_{yx}}F(y{:}x;j)\,dF(x;q)\,dF(y;t)\Big]\\
&+ 2\sum_{\substack{g,s,q\\ g\neq q,\,s\neq q,\,g\neq s}}\Big[\iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;q)
+ \iint_{R_{yx}}F(y{:}x;j)\,dF(x;q)\,dF(y;q)\Big]\\
&+ \sum_{\substack{g,s,q,t\\ g\neq q,\,s\neq t,\,g\neq s,\,q\neq t}}\Big[\iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;t)
+ \iint_{R_{yx}}F(y{:}x;j)\,dF(x;q)\,dF(y;t)\Big],
\end{aligned}
\]
where $F(x{:}y;j) = F_j(x)\big(1-F_j(y)\big)J'(H(x))J'(H(y))$ and $dF(x;q) = dF_q(x)$.
Now by applying equations (2.2.16) through (2.2.22), we get

(3.2.6)
\[ V(c) = (nG_{\cdot j} - G_{jj})^2\big/\big(2npf_1(0)\big)^2, \qquad \text{where } G_{\cdot j} = \sum_{q} G_{qj}, \]

(3.2.7)
\[ V(d) = \Big((n-1)G^2_{\cdot j} + \Big(\sum_{q\neq j}G_{qj}\Big)^2\Big)\Big/\big(2npf_1(0)\big)^2. \]

The terms (3.2.8) and (3.2.9) give Cov(a,c) and Cov(b,d), each of which takes a different form according as j=1 or j≠1. Further,

(3.2.10)
\[ \operatorname{Cov}(b,c) = \begin{cases}0 & \text{if } j=1\\ (nG_{\cdot j}-G_{jj})\operatorname{Cov}(B_j;I_1)\big/(np)^2 f_1(0) & \text{if } j\neq 1\end{cases} \]

(3.2.11)
\[ \operatorname{Cov}(c,d) = (nG_{\cdot j}-G_{jj})\Big(\sum_{q\neq j}G_{qj}\Big)\Big/\big(2npf_1(0)\big)^2 \]

(3.2.12)
\[ \operatorname{Cov}(a,d) = \begin{cases}0 & \text{if } j\neq 1\\ \Big(\sum_{q\neq j}G_{qj}\Big)\Big(n\sum_{q}\operatorname{Cov}(B_q;I_1)-\operatorname{Cov}(B_j;I_1)\Big)\Big/(np)^2 f_1(0) & \text{if } j=1\end{cases} \]
(In the above expressions, which involve the covariance of $B_j(Y_{g1})$ and $I(Y_{g1})$, the subscript g does not enter into consideration, since we can write explicitly
\[
\operatorname{Cov}\big(B_j(Y_{g1});I(Y_{g1})\big) = \int_{-\infty}^{0}F_j\,dF_1 - \tfrac12\int F_j\,dF_1 .)
\]
Now in order to evaluate terms such as $m\operatorname{Cov}(B^{i}_{1Nj}+B^{i}_{2Nj};\,B^{h}_{1Nj}+B^{h}_{2Nj})$, we consider the terms a-b+c-d and a'-b'+c'-d', which have previously been defined in equations (3.2.1) through (3.2.4), with the primed values being of the same form as the unprimed except for replacing the i with an h in the superscript. Now, as before,
\[
\begin{aligned}
\operatorname{Cov}(a-b+c-d,\,a'-b'+c'-d') = {}&\operatorname{Cov}(a-b,a'-b') + \operatorname{Cov}(a,c') - \operatorname{Cov}(a,d') - \operatorname{Cov}(b,c')\\
&+ \operatorname{Cov}(b,d') + \operatorname{Cov}(c,a') - \operatorname{Cov}(c,b') + \operatorname{Cov}(c,c')\\
&- \operatorname{Cov}(d,a') + \operatorname{Cov}(d,b') - \operatorname{Cov}(d,c') + \operatorname{Cov}(d,d').
\end{aligned}
\]
From this we get, using equations (2.2.24) through (2.2.35),

(3.2.13)
\[
\begin{aligned}
\operatorname{Cov}(a-b,\,a'-b') = \big(2n/(np)^2\big)\sum_{q}\Big[
&\iint_{R_{xy}}F(x{:}y;q)\,dF(x;j)\,dF(y;j)\\
&- \iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;j)
- \iint_{R_{yx}}F(y{:}x;j)\,dF(x;q)\,dF(y;j)\Big].
\end{aligned}
\]
The terms (3.2.14) through (3.2.17) give the mixed covariances Cov(a,d'), Cov(b,c') and Cov(b,d'), with different forms according as j=1 or j≠1, while

(3.2.18)
\[ \operatorname{Cov}(c,d') = (nG_{\cdot j}-G_{jj})\,G_{\cdot j}\big/\big(2npf_1(0)\big)^2 \]

(3.2.19)
\[ \operatorname{Cov}(d,d') = \Big((n-2)G^2_{\cdot j} + 2G_{\cdot j}\sum_{q\neq j}G_{qj}\Big)\Big/\big(2npf_1(0)\big)^2 . \]
From the foregoing we note that
\[
\operatorname{Cov}(a,c')=\operatorname{Cov}(a',c)=0,\quad
\operatorname{Cov}(a,d')=\operatorname{Cov}(a',d),\quad
\operatorname{Cov}(b,c')=\operatorname{Cov}(b',c),\quad
\operatorname{Cov}(b,d')=\operatorname{Cov}(b',d),\quad
\operatorname{Cov}(c,d')=\operatorname{Cov}(c',d).
\]
Now, to evaluate covariance terms such as $\operatorname{Cov}(T^{*}_{Nj};T^{*}_{Nh})$, we need the covariance of terms such as $\operatorname{Cov}(a-b+c-d,\,a''-b''+c''-d'')$, where a'', b'', c'' and d'' are defined in the same manner as a, b, c and d from equations (3.2.1) through (3.2.4), except for putting an h wherever a j is found.
Using expression (2.2.37), we can write

(3.2.20)
\[
\begin{aligned}
\operatorname{Cov}(a-b,\,a''-b'') = (np)^{-2}\sum_{q}\Big[
&\iint_{R_{xy}}F(x{:}y;q)\,dF(x;j)\,dF(y;h)
+\iint_{R_{yx}}F(y{:}x;q)\,dF(x;j)\,dF(y;h)\\
&-\iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;h)
-\iint_{R_{yx}}F(y{:}x;j)\,dF(x;q)\,dF(y;h)\\
&-\iint_{R_{xy}}F(x{:}y;h)\,dF(x;q)\,dF(y;j)
-\iint_{R_{yx}}F(y{:}x;h)\,dF(x;q)\,dF(y;j)\Big]
\end{aligned}
\]
The terms (3.2.21) through (3.2.25) give Cov(a,c''), Cov(a'',c), Cov(a,d''), Cov(a'',d) and Cov(b,c''), whose forms again depend on whether j=1 or h=1, and

(3.2.26)
\[ \operatorname{Cov}(c,c'') = (nG_{\cdot j}-G_{jj})(nG_{\cdot h}-G_{hh})\big/\big(2npf_1(0)\big)^2 \]

(3.2.27)
\[ \operatorname{Cov}(c,d'') = (nG_{\cdot j}-G_{jj})\Big(\sum_{q\neq h}G_{qh}\Big)\Big/\big(2npf_1(0)\big)^2 \]

(3.2.28)
\[ \operatorname{Cov}(c'',d) = (nG_{\cdot h}-G_{hh})\Big(\sum_{q\neq j}G_{qj}\Big)\Big/\big(2npf_1(0)\big)^2 \]

(3.2.29)
\[ \operatorname{Cov}(d,d'') = \Big((n-1)G_{\cdot j}G_{\cdot h} + \Big(\sum_{q\neq j}G_{qj}\Big)\Big(\sum_{q\neq h}G_{qh}\Big)\Big)\Big/\big(2npf_1(0)\big)^2 . \]
With these results we can write

(3.2.30)
\[
\begin{aligned}
\operatorname{Cov}(a-b+c-d,\,a''-b''+c''-d'') = {}&\operatorname{Cov}(a-b,a''-b'')+\operatorname{Cov}(a,c'')+\operatorname{Cov}(a'',c)\\
&-\operatorname{Cov}(a,d'')-\operatorname{Cov}(a'',d)-\operatorname{Cov}(b,c'')-\operatorname{Cov}(b'',c)+\operatorname{Cov}(b,d'')\\
&+\operatorname{Cov}(b'',d)+\operatorname{Cov}(c,c'')-\operatorname{Cov}(c,d'')-\operatorname{Cov}(c'',d)+\operatorname{Cov}(d,d'').
\end{aligned}
\]
Next, we turn attention to the expression $\operatorname{Cov}(a-b+c-d,\,a'''-b'''+c'''-d''')$, where a''', b''', c''' and d''' are like a, b, c and d except i is replaced by r and j is replaced by h in the sub- and superscripting, with the understanding that i≠r and j≠h. As before, we can determine the various components to be

(3.2.31)
\[
\begin{aligned}
\operatorname{Cov}(a-b,\,a'''-b''') = (np)^{-2}\sum_{q}\Big[
&\iint_{R_{xy}}F(x{:}y;q)\,dF(x;j)\,dF(y;h)
+\iint_{R_{yx}}F(y{:}x;q)\,dF(x;j)\,dF(y;h)\\
&-\iint_{R_{xy}}F(x{:}y;j)\,dF(x;q)\,dF(y;h)
-\iint_{R_{yx}}F(y{:}x;j)\,dF(x;q)\,dF(y;h)\\
&-\iint_{R_{xy}}F(x{:}y;h)\,dF(x;q)\,dF(y;j)
-\iint_{R_{yx}}F(y{:}x;h)\,dF(x;q)\,dF(y;j)\Big]
\end{aligned}
\]
Also,

(3.2.32)
\[ \operatorname{Cov}(a,c''')=\operatorname{Cov}(a''',c)=\operatorname{Cov}(c,c''')=0 . \]

The terms (3.2.33) and (3.2.34) give Cov(a,d''') and Cov(a''',d); for example,
\[
\operatorname{Cov}(a''',d)=\begin{cases}0 & \text{if } h=1\\
G_{\cdot j}\Big(n\sum_{q}\operatorname{Cov}(B_q;I_1)-\operatorname{Cov}(B_j;I_1)\Big)\Big/(np)^2f_1(0) & \text{if } h\neq 1.\end{cases}
\]

(3.2.35)
\[ \operatorname{Cov}(b,c''') = (nG_{\cdot h}-G_{hh})\operatorname{Cov}(B_j;I_1)\big/(np)^2f_1(0) \]

(3.2.36)
\[ \operatorname{Cov}(b''',c) = (nG_{\cdot j}-G_{jj})\operatorname{Cov}(B_h;I_1)\big/(np)^2f_1(0) \]

(3.2.37)
\[ \operatorname{Cov}(b,d''') = \begin{cases}(nG_{\cdot h}-G_{hh})\operatorname{Cov}(B_j;I_1)\big/(np)^2f_1(0) & \text{if } j\neq 1\\ \big((n-2)G_{\cdot h}+\sum_{q\neq h}G_{qh}\big)\operatorname{Cov}(B_1;I_1)\big/(np)^2f_1(0) & \text{if } j=1\end{cases} \]

(3.2.38)
\[ \operatorname{Cov}(b''',d) = \begin{cases}(nG_{\cdot j}-G_{jj})\operatorname{Cov}(B_h;I_1)\big/(np)^2f_1(0) & \text{if } h\neq 1\\ \big((n-2)G_{\cdot j}+\sum_{q\neq j}G_{qj}\big)\operatorname{Cov}(B_1;I_1)\big/(np)^2f_1(0) & \text{if } h=1\end{cases} \]

(3.2.39)
\[ \operatorname{Cov}(c,d''') = (nG_{\cdot j}-G_{jj})\,G_{\cdot h}\big/\big(2npf_1(0)\big)^2 \]

(3.2.40)
\[ \operatorname{Cov}(c''',d) = (nG_{\cdot h}-G_{hh})\,G_{\cdot j}\big/\big(2npf_1(0)\big)^2 \]

(3.2.41)
\[ \operatorname{Cov}(d,d''') = \Big((n-2)G_{\cdot j}G_{\cdot h}+G_{\cdot j}\sum_{q\neq h}G_{qh}+G_{\cdot h}\sum_{q\neq j}G_{qj}\Big)\Big/\big(2npf_1(0)\big)^2 . \]
With these results, we then have

(3.2.42)
\[
\begin{aligned}
\operatorname{Cov}(a-b+c-d,\,a'''-b'''+c'''-d''') = {}&\operatorname{Cov}(a-b,a'''-b''')-\operatorname{Cov}(a,d''')-\operatorname{Cov}(a''',d)\\
&-\operatorname{Cov}(b,c''')-\operatorname{Cov}(b''',c)+\operatorname{Cov}(b,d''')+\operatorname{Cov}(b''',d)\\
&-\operatorname{Cov}(c,d''')-\operatorname{Cov}(c''',d)+\operatorname{Cov}(d,d''').
\end{aligned}
\]
3.3 The Collapsing Alternative

Suppose we assume a Pitman-type shift alternative relative to the parameter values $\tau_j$ by expressing the distribution function of the random variable $X_{ijk}$ in the following way:
\[
F_j(x) = F\big(x + m^{-1/2}\tau_j\big).
\]
Under this condition,
\[
dF_j(x)/dx = f\big(x+m^{-1/2}\tau_j\big), \quad\text{or}\quad f_j(x) = f\big(x+m^{-1/2}\tau_j\big).
\]
Therefore,
\[
\lim_{m\to\infty}F\big(x+m^{-1/2}\tau_j\big) = F(x) \quad\text{and}\quad \lim_{m\to\infty}f\big(x+m^{-1/2}\tau_j\big) = f(x),
\]
with f(x) a bounded function.
Our purpose in this section is to find expressions for the variance and covariance terms under the above shift alternatives, but first we need to evaluate several expressions that will facilitate simplifications as we go along. First, we evaluate $G_{qj}$. By definition,
\[
G_{qj} = \int f_q(x)f_j(x)\,dx = \int f\big(x+m^{-1/2}\tau_q\big)f\big(x+m^{-1/2}\tau_j\big)\,dx .
\]
In the limit we get, by the dominated convergence theorem,
\[
\lim_{m\to\infty}G_{qj} = \int f^2(x)\,dx = G .
\]
Another frequently encountered expression is of the form $\iint_{R_{xy}}F_q(x)(1-F_q(y))f_j(x)f_j(y)\,dx\,dy$, where for the Wilcoxon score function $J'(x)J'(y)=1$. Therefore, we can write
\[
\begin{aligned}
\lim_{m\to\infty}\iint_{x<y} F_q(x)\big(1-F_q(y)\big)f_j(x)f_j(y)\,dx\,dy
&= \iint_{x<y}F(x)\big(1-F(y)\big)f(x)f(y)\,dx\,dy\\
&= \int\big(1-F(y)\big)f(y)\Big[\int_0^{F(y)}u\,du\Big]dy
= \int_0^1 \frac{v^2}{2}(1-v)\,dv = \frac{1}{24}.
\end{aligned}
\]
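The value 1/24 can be checked numerically; the short sketch below integrates the reduced polynomial exactly and confirms the region integral by Monte Carlo (an illustrative check, not part of the original derivation):

```python
from fractions import Fraction
import random

# Exact: the integral of (v^2/2)(1 - v) over [0, 1] is (1/2)(1/3 - 1/4) = 1/24
exact = Fraction(1, 2) * (Fraction(1, 3) - Fraction(1, 4))
print(exact)  # → 1/24

# Monte Carlo over the region x < y with F taken as the standard uniform,
# so the integrand F(x)(1 - F(y))f(x)f(y) reduces to x(1 - y) on x < y
random.seed(1)
trials = 200_000
total = 0.0
for _ in range(trials):
    x, y = random.random(), random.random()
    if x < y:
        total += x * (1 - y)
print(abs(total / trials - 1 / 24) < 5e-3)  # → True
```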
37
Also, we can solve for the expression
In the limit this yields the expression
Now, from previous results we have interest in the terms
\[
\operatorname{Cov}\Big(m^{1/2}\big(B^{i}_{1Nj}+B^{i}_{2Nj}\big);\,m^{1/2}\big(B^{h}_{1Nj}+B^{h}_{2Nj}\big)\Big)
\quad\text{and}\quad
V\big(m^{1/2}(B^{i}_{1Nj}+B^{i}_{2Nj})\big),
\]
where
\[
V\big(m^{1/2}(B^{i}_{1Nj}+B^{i}_{2Nj})\big) = V(a-b)+V(c)+V(d)+2\operatorname{Cov}(a,c)-2\operatorname{Cov}(a,d)-2\operatorname{Cov}(b,c)+2\operatorname{Cov}(b,d)-2\operatorname{Cov}(c,d).
\]
Using equations (3.2.5) through (3.2.12), we get after some simplification
\[
\begin{aligned}
V(a-b) &= (np-1)/12np\\
V(c) &= (np-1)^2G^2\big/\big(2npf(0)\big)^2\\
V(d) &= (np^2-2p+1)G^2\big/\big(2npf(0)\big)^2\\
\operatorname{Cov}(a,c) &= \begin{cases}0 & \text{if } j\neq 1\\ -(np-1)^2G\big/8(np)^2f(0) & \text{if } j=1\end{cases}\\
\operatorname{Cov}(c,d) &= (p-1)(np-1)G^2\big/\big(2npf(0)\big)^2\\
\operatorname{Cov}(b,c) &= \begin{cases}0 & \text{if } j=1\\ -(np-1)G\big/8(np)^2f(0) & \text{if } j\neq 1\end{cases}\\
\operatorname{Cov}(b,d) &= \begin{cases}-(np-1)G\big/8(np)^2f(0) & \text{if } j\neq 1\\ -p(n-1)G\big/8(np)^2f(0) & \text{if } j=1\end{cases}\\
\operatorname{Cov}(a,d) &= \begin{cases}0 & \text{if } j\neq 1\\ -(p-1)(np-1)G\big/8(np)^2f(0) & \text{if } j=1\end{cases}
\end{aligned}
\]
As a result, we get when j=1

(3.3.5)
\[
\begin{aligned}
V\big(m^{1/2}(B^{i}_{1N1}+B^{i}_{2N1})\big) = {}&(np-1)/12np + (np^2-2p+1)G^2\big/\big(2npf(0)\big)^2 + (np-1)^2G^2\big/\big(2npf(0)\big)^2\\
&- 2(p-1)(np-1)G^2\big/\big(2npf(0)\big)^2 - 2(np-1)^2G\big/8(np)^2f(0)\\
&+ 2(p-1)(np-1)G\big/8(np)^2f(0) - 2p(n-1)G\big/8(np)^2f(0)\\
= {}&(np-1)/12np + np^2(n-1)G^2\big/\big(2npf(0)\big)^2 - np^2(n-1)G\big/(2np)^2f(0).
\end{aligned}
\]
If j≠1, then we get

(3.3.6)
\[
\begin{aligned}
V\big(m^{1/2}(B^{i}_{1Nj}+B^{i}_{2Nj})\big) = {}&(np-1)/12np + (np^2-2p+1)G^2\big/\big(2npf(0)\big)^2 + (np-1)^2G^2\big/\big(2npf(0)\big)^2\\
&- 2(p-1)(np-1)G^2\big/\big(2npf(0)\big)^2 + 2(np-1)G\big/8(np)^2f(0) - 2(np-1)G\big/8(np)^2f(0)\\
= {}&(np-1)/12np + np^2(n-1)G^2\big/\big(2npf(0)\big)^2 .
\end{aligned}
\]
Next, we have by using equations (3.2.12) through (3.2.19) that the components of the term $m\operatorname{Cov}(B^{i}_{1Nj}+B^{i}_{2Nj};\,B^{h}_{1Nj}+B^{h}_{2Nj})$ are

(3.3.7)
\[
\begin{aligned}
\operatorname{Cov}(c,c') &= 0\\
\operatorname{Cov}(d,d') &= p(np-2)G^2\big/\big(2npf(0)\big)^2\\
\operatorname{Cov}(a,c') &= 0\\
\operatorname{Cov}(a,d') &= \begin{cases}0 & \text{if } j\neq 1\\ -p(np-1)G\big/8(np)^2f(0) & \text{if } j=1\end{cases}\\
\operatorname{Cov}(b,c') &= -(np-1)G\big/8(np)^2f(0)\\
\operatorname{Cov}(b,d') &= \begin{cases}-(np-p-1)G\big/8(np)^2f(0) & \text{if } j=1\\ -(np-1)G\big/8(np)^2f(0) & \text{if } j\neq 1\end{cases}\\
\operatorname{Cov}(c,d') &= p(np-1)G^2\big/\big(2npf(0)\big)^2
\end{aligned}
\]
Therefore, if j=1 we get

(3.3.8)
\[
\begin{aligned}
m\operatorname{Cov}\big(B^{i}_{1N1}+B^{i}_{2N1};\,B^{h}_{1N1}+B^{h}_{2N1}\big) = {}&-1/12np + p(np-2)G^2\big/\big(2npf(0)\big)^2 - 2p(np-1)G^2\big/\big(2npf(0)\big)^2\\
&+ 2p(np-1)G\big/8(np)^2f(0) + 2(np-1)G\big/8(np)^2f(0) - 2(np-p-1)G\big/8(np)^2f(0)\\
= {}&-1/12np - np^2G^2\big/\big(2npf(0)\big)^2 + np^2G\big/(2np)^2f(0).
\end{aligned}
\]
For j≠1 we get

(3.3.9)
\[
\begin{aligned}
m\operatorname{Cov}\big(B^{i}_{1Nj}+B^{i}_{2Nj};\,B^{h}_{1Nj}+B^{h}_{2Nj}\big) = {}&-1/12np + p(np-2)G^2\big/\big(2npf(0)\big)^2 - 2p(np-1)G^2\big/\big(2npf(0)\big)^2\\
&+ 2(np-1)G\big/8(np)^2f(0) - 2(np-1)G\big/8(np)^2f(0)\\
= {}&-1/12np - np^2G^2\big/\big(2npf(0)\big)^2 .
\end{aligned}
\]
Now from the above we can see that, under the collapsing alternative conditions, all variance and covariance terms are constants relative to the notation using i, j and h. Therefore, for the variance of $m^{1/2}\sum_i \big(B^{i}_{1Nj}+B^{i}_{2Nj}\big)/n$, we get
\[
mV\Big(\sum_i B^{i}_{\cdot Nj}/n\Big) = (1/n)^2\big(n\,\mathrm{Var} + n(n-1)\,\mathrm{Cov}\big) = (1/n)\big(\mathrm{Var} + (n-1)\,\mathrm{Cov}\big).
\]
For j=1, this gives

(3.3.10)
\[
\begin{aligned}
(1/n)\Big[&(np-1)/12np + np^2(n-1)G^2\big/\big(2npf(0)\big)^2 - np^2(n-1)G\big/(2np)^2f(0)\\
&- (n-1)/12np - np^2(n-1)G^2\big/\big(2npf(0)\big)^2 + np^2(n-1)G\big/(2np)^2f(0)\Big] = (p-1)/12np .
\end{aligned}
\]
For j≠1,

(3.3.11)
\[
(1/n)\Big[(np-1)/12np + np^2(n-1)G^2\big/\big(2npf(0)\big)^2 - (n-1)/12np - np^2(n-1)G^2\big/\big(2npf(0)\big)^2\Big] = (p-1)/12np .
\]
The next quantity we want is the covariance term. From previous results, we have written the term as
\[
(1/n)^2\sum_{i}\sum_{r}\operatorname{Cov}\big(B^{i}_{1Nj}+B^{i}_{2Nj};\,B^{r}_{1Nh}+B^{r}_{2Nh}\big).
\]
Using equation (3.2.30) and equations (3.2.20) through (3.2.29), we get

(3.3.12)
\[
\begin{aligned}
\operatorname{Cov}(a-b,a''-b'') &= -1/12np\\
\operatorname{Cov}(c,c'') &= (np-1)^2G^2\big/\big(2npf(0)\big)^2\\
\operatorname{Cov}(d,d'') &= (np^2-2p+1)G^2\big/\big(2npf(0)\big)^2\\
\operatorname{Cov}(b,c'') &= \begin{cases}0 & \text{if } j=1\\ -(np-1)G\big/8(np)^2f(0) & \text{if } j\neq 1\end{cases}\\
\operatorname{Cov}(b,d'') &= \begin{cases}-p(n-1)G\big/8(np)^2f(0) & \text{if } j=1\\ -(np-1)G\big/8(np)^2f(0) & \text{if } j\neq 1\end{cases}\\
\operatorname{Cov}(c,d'') &= (p-1)(np-1)G^2\big/\big(2npf(0)\big)^2
\end{aligned}
\]
Therefore, if j=1 or h=1 but not both, then

(3.3.13)
\[
\begin{aligned}
m\operatorname{Cov}\big(B^{i}_{1Nj}+B^{i}_{2Nj};\,B^{i}_{1Nh}+B^{i}_{2Nh}\big)
= {}&-1/12np + (np-1)^2G^2\big/\big(2npf(0)\big)^2 + (np^2-2p+1)G^2\big/\big(2npf(0)\big)^2\\
&- 2(p-1)(np-1)G^2\big/\big(2npf(0)\big)^2 - (np-1)^2G\big/8(np)^2f(0)\\
&+ (p-1)(np-1)G\big/8(np)^2f(0) + (np-1)G\big/8(np)^2f(0)\\
&- (np-1)G\big/8(np)^2f(0) - p(n-1)G\big/8(np)^2f(0)\\
= {}&-1/12np + np^2(n-1)G^2\big/\big(2npf(0)\big)^2 - np^2(n-1)G\big/8(np)^2f(0) .
\end{aligned}
\]
If both h and j are unequal to 1, then we get

(3.3.14)
\[
\begin{aligned}
m\operatorname{Cov}\big(B^{i}_{1Nj}+B^{i}_{2Nj};\,B^{i}_{1Nh}+B^{i}_{2Nh}\big)
= {}&-1/12np + (np-1)^2G^2\big/\big(2npf(0)\big)^2 + (np^2-2p+1)G^2\big/\big(2npf(0)\big)^2\\
&- 2(p-1)(np-1)G^2\big/\big(2npf(0)\big)^2 + 2(np-1)G\big/8(np)^2f(0) - 2(np-1)G\big/8(np)^2f(0)\\
= {}&-1/12np + np^2(n-1)G^2\big/\big(2npf(0)\big)^2 .
\end{aligned}
\]
Next, from equation (3.2.42), using equations (3.2.31) through (3.2.41), we obtain

(3.3.15)
\[
\begin{aligned}
\operatorname{Cov}(a-b,a'''-b''') &= -1/12np\\
\operatorname{Cov}(c,c''') &= 0\\
\operatorname{Cov}(d,d''') &= p(np-2)G^2\big/\big(2npf(0)\big)^2\\
\operatorname{Cov}(a,c''') &= 0\\
\operatorname{Cov}(a,d''') &= \begin{cases}0 & \text{if } j\neq 1\\ -p(np-1)G\big/8(np)^2f(0) & \text{if } j=1\end{cases}\\
\operatorname{Cov}(b,c''') &= -(np-1)G\big/8(np)^2f(0)\\
\operatorname{Cov}(b,d''') &= \begin{cases}-(np-p-1)G\big/8(np)^2f(0) & \text{if } j=1\\ -(np-1)G\big/8(np)^2f(0) & \text{if } j\neq 1\end{cases}\\
\operatorname{Cov}(c,d''') &= p(np-1)G^2\big/\big(2npf(0)\big)^2
\end{aligned}
\]
So, if either j or h equals 1, we get after simplification

(3.3.16)
\[
m\operatorname{Cov}\big(B^{i}_{1N1}+B^{i}_{2N1};\,B^{r}_{1Nh}+B^{r}_{2Nh}\big) = -1/12np - np^2G^2\big/\big(2npf(0)\big)^2 + np^2G\big/8(np)^2f(0) .
\]
If both j and h are unequal to 1, the covariance term in (3.3.16) becomes equal to

(3.3.17)
\[
-1/12np - np^2G^2\big/\big(2npf(0)\big)^2 .
\]
If these terms are combined as indicated in the preceding work, it turns out that for any j and h, j≠h, we get

(3.3.18)
\[
m\operatorname{Cov}\Big(\sum_i B^{i}_{\cdot Nj}/n;\,\sum_i B^{i}_{\cdot Nh}/n\Big) = -1/12np .
\]
We can summarize the preceding results in the following theorem:

Theorem 3.3.1: If $F_j(x) = F(x+m^{-1/2}\tau_j)$ for each integer m and the Wilcoxon score function is used, then the random vector
\[
\big[m^{1/2}(T^{*}_{N1}-\mu_{N1}),\,\dots,\,m^{1/2}(T^{*}_{Np}-\mu_{Np})\big]
\]
has a limiting normal distribution with zero means and covariance matrix $\Sigma$ whose components are
\[
\sigma_{ii} = (p-1)/12np \quad\text{and}\quad \sigma_{ij} = -1/12np .
\]

Proof: The above results are obtained directly by using Theorem 2.2.1 and equations (3.3.11) and (3.3.18).

The covariance matrix of the vector of $m^{1/2}T^{*}_{Nj}$'s has the form

(3.3.19)
\[
\Sigma = \frac{1}{12n}
\begin{bmatrix}
(p-1)/p & -1/p & \cdots & -1/p\\
-1/p & (p-1)/p & \cdots & -1/p\\
\vdots & & \ddots & \vdots\\
-1/p & -1/p & \cdots & (p-1)/p
\end{bmatrix}
= \frac{1}{12n}\,M,
\]
where M is idempotent and has rank p-1.
3.4 An Asymptotically Distribution-free Test Statistic

In this section we propose to consider the test statistic L* defined as

(3.4.1)
\[
L^{*} = 12mn\sum_{q}\big(T^{*}_{Nq} - \mu_N(\tau_q)\big)^2
\]
and its asymptotic properties under the assumption of a collapsing alternative.

Theorem 3.4.1: Let the assumptions of Theorem 3.3.1 be satisfied. Then L* defined in expression (3.4.1) follows asymptotically a chi-square distribution with p-1 degrees of freedom.

Proof: The proof follows directly from Theorem 3.3.1 and the fact that the matrix M of equation (3.3.19) is idempotent with rank p-1.
In testing the hypothesis that all treatments are equal, i.e., $H_0: \tau_j = 0$ for all j, the appropriate test statistic is

(3.4.2)
\[
L^{*}_{0} = 12mn\sum_{q}\big(T^{*}_{Nq} - \mu_N(0)\big)^2,
\]
where $\mu_N(0) = \int J(F(x))\,dF(x) = \tfrac12$ for $J(u)=u$. Therefore, we can state the following theorem:

Theorem 3.4.2: If J(u)=u, then $L^{*}_{0}$ of equation (3.4.2) is distributed asymptotically as a non-central chi-square random variable having p-1 degrees of freedom and non-centrality parameter

(3.4.3)
\[
\Delta = 12nG^{2}\sum_{q}(\tau_q - \bar{\tau})^2, \qquad\text{where } \bar{\tau} = \sum_{q}\tau_q/p \text{ and } G = \int f^{2}(x)\,dx .
\]

Proof: It follows directly from Theorem 3.4.1 that $L^{*}_{0}$ is non-central chi-square. The non-centrality parameter is obtained as follows, using the results of Puri (1964):
\[
\begin{aligned}
\lim_{m\to\infty} m^{1/2}\big(\mu_N(\tau_j)-\mu_N(0)\big)
&= \lim_{m\to\infty} m^{1/2}\int\big(J(H(x))-J(F_j(x))\big)\,dF_j(x)\\
&= \lim_{m\to\infty} m^{1/2}\int\big(H(x)-F_j(x)\big)\,dF_j(x)\\
&= p^{-1}\lim_{m\to\infty} m^{1/2}\int\sum_{q\neq j}\big(F(x+m^{-1/2}\tau_q)-F(x+m^{-1/2}\tau_j)\big)\,dF_j(x)\\
&= p^{-1}\sum_{q\neq j}(\tau_q-\tau_j)\lim_{m\to\infty}\int f\big(x+m^{-1/2}\tau^{*}_{qj}\big)\,dF_j(x),
\end{aligned}
\]
where $\tau^{*}_{qj}$ lies between $\tau_q$ and $\tau_j$,
\[
= p^{-1}\sum_{q\neq j}(\tau_q-\tau_j)\Big(\int f^{2}(x)\,dx\Big)
= p^{-1}\,p\,(\bar{\tau}-\tau_j)\,G = (\bar{\tau}-\tau_j)\,G .
\]
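The statistic $L^{*}_{0}$ of (3.4.2) is simple to compute from data. The sketch below ranks the median-aligned observations, assigns Wilcoxon scores $E_{Ni}=i/N$, and forms $12mn\sum_q (T^{*}_{Nq}-\tfrac12)^2$ (the data layout and simulation are illustrative assumptions, not the author's code):

```python
import numpy as np

def l_star_0(y):
    """Aligned-rank test statistic L*_0 of (3.4.2) for y[block, treatment, rep],
    with treatment 0 as the control, using the Wilcoxon score E_Ni = i/N."""
    n, p, m = y.shape
    N = n * p * m
    x = y - np.median(y[:, 0, :], axis=1)[:, None, None]   # median alignment
    ranks = x.ravel().argsort().argsort() + 1              # overall ranks 1..N
    scores = (ranks / N).reshape(n, p, m)                  # E_Ni = i/N
    t = scores.mean(axis=(0, 2))                           # T*_Nj per treatment
    return 12 * m * n * np.sum((t - 0.5) ** 2)

rng = np.random.default_rng(2)
y = rng.normal(size=(6, 3, 4)) + rng.normal(size=(6, 1, 1))  # no treatment effects
stat = l_star_0(y)
print(stat >= 0.0)  # → True
```

Under the null hypothesis, the computed value would be referred to the chi-square distribution with p-1 degrees of freedom, per Theorem 3.4.1.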
4. ESTIMATION PROCEDURES
4.1 Introduction

Suppose we want to estimate various contrasts among the treatment effects, $\tau_j$, in the model

(4.1.1)
\[
Y_{ijk} = \mu + \beta_i + \tau_j + \epsilon_{ijk},
\]
where i=1,2,...,n; j=1,2,...,p; and k=1,2,...,m.
Since any such contrast in the $\tau$'s can be written as the sum of differences of the treatment effects, we could denote the contrast of interest and its estimate as

(4.1.2)
\[
\theta = \sum_{j}^{p} c_j\tau_j = \sum_{i}^{p}\sum_{j}^{p} d_{ij}(\tau_i-\tau_j),
\qquad
\hat{\theta} = \sum_{j}^{p} c_j\hat{\tau}_j = \sum_{i}^{p}\sum_{j}^{p} d_{ij}(\hat{\tau}_i-\hat{\tau}_j),
\]
where $\sum_{j}^{p} c_j = 0$.
As mentioned earlier, Hodges and Lehmann (1963) developed an estimator of the difference in location of two populations satisfying certain regularity conditions. Lehmann (1963a, 1963b, 1963c), using Wilcoxon's score function, extended these results to linear models of which equation (4.1.1) is a specific example. Bhuchongkul and Puri (1965) and Sen (1966) later extended Lehmann's results to a broader class of score functions, with Puri and Sen (1967) establishing the results for a model in a two-way layout having a single observation per cell.

Sen (1968a) proposed an aligned rank procedure in order to gain greater efficiency by using interblock comparisons, an approach earlier suggested by Hodges and Lehmann (1962). Sen (1968b) also proposed a multiple-comparison procedure in the aligned rank case based on Tukey's T-method.

In section 2 of this chapter, the basic results of Hodges and Lehmann (1963) will be summarized. In section 3, the extensions to linear models will be considered. Section 4 will examine the aligned rank procedures, specifically considering the use of the control-treatment median for alignment, applied under the conditions and assumptions made in chapters 2 and 3. The asymptotic results will be seen to coincide with Lehmann's work.
4.2 The Hodges-Lehmann Estimator

Suppose $X_1,\dots,X_m,\;Y_1,\dots,Y_n$ are continuous independent random variables with the respective distributions F(x) and F(y-Δ); that is, $X_1,\dots,X_m$ and $Y_1-\Delta,\dots,Y_n-\Delta$ are independent and identically distributed random variables. Under these conditions, Hodges and Lehmann (1963) developed an estimator for Δ based on a test statistic h(x,y) that satisfies certain regularity conditions. Suppose h(x,y) is a linear rank test statistic for testing $H_0:\Delta=0$ satisfying equation (3.1) of Hodges and Lehmann (1963). Thus, we have
(4.2.1)
\[
h_{xy}(\mathbf{X},\mathbf{Y}) = \frac{1}{n}\sum_{i=1}^{n} E\big(V^{(S_{(i)})}\big),
\]
where $\mathbf{X}$ and $\mathbf{Y}$ denote the vectors of the observations; $S_{(1)}<\dots<S_{(n)}$ denote the ranks of the $Y_i$ in a combined ranking of the observations in $\mathbf{X}$ and $\mathbf{Y}$; and $V^{(1)},\dots,V^{(m+n)}$ denote an ordered sample of size m+n from a distribution $\Lambda$. Then the estimator of Δ is defined to be

(4.2.2)
\[
\hat{\Delta} = (\Delta^{*} + \Delta^{**})/2, \qquad\text{where}\qquad
\Delta^{*} = \sup\{\Delta: h_{xy}(\mathbf{X},\mathbf{Y}-\Delta) > \mu\},
\]

(4.2.3)
\[
\Delta^{**} = \inf\{\Delta: h_{xy}(\mathbf{X},\mathbf{Y}-\Delta) < \mu\},
\]
and μ is the point of symmetry of the function $h_{xy}$ when Δ=0.

If $h_{xy}$ is a Wilcoxon test statistic for the above hypothesis, then

(4.2.4)
\[
\hat{\Delta} = \operatorname{med}\big(Y_{\beta} - X_{\alpha}\big),
\]
where $\operatorname{med}(Y_{\beta}-X_{\alpha})$ denotes the median of the mn differences obtained from the Y and X sample observations.
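For the Wilcoxon case (4.2.4), the estimate is just the median of all mn pairwise differences; a minimal sketch (the sample values are made up for illustration):

```python
import numpy as np

def hodges_lehmann(x, y):
    """Hodges-Lehmann estimate of the shift: med(Y_b - X_a) over all mn pairs."""
    return float(np.median(np.subtract.outer(y, x)))

x = np.array([1.1, 0.3, 2.0, 0.7])   # sample from F(x)
y = np.array([2.4, 1.9, 3.1])        # sample from F(y - delta)
delta_hat = hodges_lehmann(x, y)
print(round(delta_hat, 2))  # → 1.45
```

With mn = 12 differences, the estimate is the average of the 6th and 7th ordered differences.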
The Hodges-Lehmann estimator is unbiased for Δ if either F(u) is symmetric or if m=n. It is also true that, with N=m+n, the random variable $N^{1/2}(\hat{\Delta}-\Delta)$ has a limiting normal distribution with zero mean and variance

(4.2.5)
\[
V(\hat{\Delta}) = A^{2}\big/\big(B^{2}\lambda(1-\lambda)\big);\qquad
A^{2} = \int_{0}^{1}J^{2}(u)\,du - \Big(\int_{0}^{1}J(u)\,du\Big)^{2};\qquad
B = \int (d/dx)J(F(x))\,dF(x),
\]
where $\lambda = \lim_{m\to\infty} m/N$, with $0<\lambda<1$.

When using the Wilcoxon score function J(u)=u, the variance expression becomes

(4.2.6)
\[
V(\hat{\Delta}) = 1\Big/\Big(12\lambda(1-\lambda)\Big(\int f^{2}(x)\,dx\Big)^{2}\Big).
\]
4.3 The Application to Linear Models

Suppose we start with the model considered by Lehmann (1963a) and Bhuchongkul and Puri (1965) for data in a one-way classification layout; that is,

(4.3.1)
\[
Y_{i\alpha} = \xi_i + \epsilon_{i\alpha},
\]
where $\alpha = 1,\dots,m_i$; $i = 1,\dots,c$; the ξ's are the treatment effects; and the ε's are independent and distributed according to some continuous distribution function F(x). Using equation (4.2.2), we denote the estimate of the contrast $\xi_i - \xi_j$ by

(4.3.2)
\[
W_{ij} = (\Delta^{*} + \Delta^{**})/2 .
\]
It has been shown that $W_{ij}$ has the following asymptotic properties:

Theorem 4.3.1 (Lehmann-Bhuchongkul-Puri): Under the assumption that F(x) is continuous:

(i) The joint distribution of $(V_1,\dots,V_{c-1})$ is asymptotically normal, where

(4.3.3)
\[
V_i = N^{1/2}\big(W_{ic} - (\xi_i - \xi_c)\big),
\]
and the variance-covariance terms are

(4.3.4)
\[
\operatorname{Var}(V_i) = (1/\lambda_i + 1/\lambda_c)\big(A^{2}/B^{2}\big),\qquad
\operatorname{Cov}(V_i,V_j) = A^{2}\big/(\lambda_c B^{2}),
\]
where $m_i = \lambda_i N$ and A and B are defined in equation (4.2.5). Here the density function, f(x), of F(x) is assumed to satisfy the regularity conditions of Lemma 3(a) of Hodges and Lehmann (1961).

(ii) For any i and j, $N^{1/2}W_{ij} \sim N^{1/2}(W_{ic}-W_{jc})$, where $\sim$ indicates that the difference of the two sides tends to zero in probability.
Proof: The proof of (i) rests on the following lemma:

Lemma 4.3.1: Suppose the variables $Y_{i\alpha}$ have the distribution specified in connection with expression (4.3.1), and a sequence of means $(\xi_{N1},\dots,\xi_{Nc})$ satisfies the condition $(\xi_{Ni}-\xi_{Nc}) \le a_i/N^{1/2}$. Let $h_{ij}(\mathbf{Y}_i,\mathbf{Y}_j)$ be defined as in equation (4.2.1) and let $\mu_{ij} = E(h_{ij})$. Then the random variables $(U_1,\dots,U_{c-1})$ given by

(4.3.5)
\[
U_i = N^{1/2}\big(h_{ic}/m_c - \mu_{ic}\big),\qquad i = 1,\dots,c-1,
\]
have a joint asymptotic normal distribution as N approaches infinity, with a zero mean vector and variance-covariance terms

(4.3.6)
\[
\operatorname{Cov}(U_i,U_j) = A^{2}\lambda_i\lambda_j\big/\lambda_c(\lambda_i+\lambda_c)(\lambda_j+\lambda_c).
\]

Proof: The proof of Lemma 4.3.1 is given in the cited papers by Lehmann and Bhuchongkul-Puri.
To prove (i) of Theorem 4.3.1, equation (9.1) of Hodges and Lehmann (1963) is used to give
\[
\begin{aligned}
\lim P\big\{N^{1/2}\big[W_{ic}-(\xi_i-\xi_c)\big] \le a_i \text{ for all } i\big\}
&= \lim P_N\big\{N^{1/2}\big(h_{ic}/m_c - \alpha\big) \le 0 \text{ for all } i\big\}\\
&= \lim P_N\big\{N^{1/2}\big(h_{ic}/m_c - \mu_{ic}\big) \le a_i B\lambda_i/(\lambda_i+\lambda_c) \text{ for all } i\big\},
\end{aligned}
\]
where $\alpha = \int J(F(x))\,dF(x)$, and $P_N$ indicates that the probability is computed for a sequence of means satisfying $(\xi_{Ni}-\xi_{Nc}) = a_i/N^{1/2}$. This result is coupled with the conclusions of Lemma 4.3.1 to yield the proof of Theorem 4.3.1(i). Part (ii) of the theorem follows using U-statistics, but was proved by Lehmann (1963a) using a theorem by LeCam.
The next observation is that the estimators $W_{ij}$ are incompatible, in the sense that if we write
\[
(\xi_i-\xi_j) = (\xi_i-\xi_k) + (\xi_k-\xi_j),
\]
we would get a different estimate of $(\xi_i-\xi_j)$ using $W_{ik}+W_{kj}$ than by simply using $W_{ij}$. To overcome this difficulty, Lehmann (1963a) suggested using adjusted estimates of the form

(4.3.7)
\[
Z_{ij} = W_{i\cdot} - W_{j\cdot},
\]
where

(4.3.8)
\[
W_{i\cdot} = \sum_{j}^{c} W_{ij}\big/c .
\]
(The term $W_{ii}$ is defined to be equal to zero.) Lehmann (1963a) and Bhuchongkul and Puri (1965) showed that, as a direct consequence of Theorem 4.3.1(ii), the difference $N^{1/2}(Z_{ij}-W_{ij})$ tends to zero in probability for all i and j. Therefore these adjusted estimates have the same large and small sample properties as the W's. This means that the random variables
\[
N^{1/2}\big(Z_{ic} - (\xi_i-\xi_c)\big),\qquad i = 1,\dots,c-1,
\]
have a limiting normal distribution with zero mean and covariance matrix given by equation (4.3.4).
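The adjustment (4.3.7)-(4.3.8) can be seen in a few lines: pairwise Hodges-Lehmann estimates are averaged into row means, whose differences are additive by construction (the sample data and helper names are illustrative assumptions, not from the source):

```python
import numpy as np

def pairwise_hl(samples):
    """W[i, j]: Hodges-Lehmann estimate of xi_i - xi_j (W[i, i] = 0)."""
    c = len(samples)
    w = np.zeros((c, c))
    for i in range(c):
        for j in range(c):
            if i != j:
                w[i, j] = np.median(np.subtract.outer(samples[i], samples[j]))
    return w

def compatible(w):
    """Z[i, j] = W_i. - W_j. with row means W_i. = sum_j W_ij / c."""
    row = w.mean(axis=1)
    return row[:, None] - row[None, :]

rng = np.random.default_rng(3)
samples = [rng.normal(loc=xi, size=20) for xi in (0.0, 1.0, 2.5)]
z = compatible(pairwise_hl(samples))
# compatibility: the adjusted estimates are exactly additive
print(np.allclose(z[0, 1] + z[1, 2], z[0, 2]))  # → True
```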
The estimates $Z_{ij}$ do possess a peculiarity that may be considered a drawback; that is, the estimator of $(\xi_i-\xi_j)$ depends not only on the observations from the i-th and j-th samples, but on the other samples as well. An additional drawback of these Hodges-Lehmann estimators is that they are not asymptotically distribution-free. From equations (4.2.5) and (4.3.4), it can be seen that their asymptotic variance depends on the distribution function F(x) through the definition of B. Lehmann (1963c) and Sen (1966) provided a consistent estimator of B, making possible tests of hypotheses and confidence interval construction. This estimator is stated in the following theorem.
Theorem 4.3.2 (Lehmann-Sen): Let $X_1,\dots,X_m$ and $Y_1,\dots,Y_n$ be independent observations from F(x) and F(y-Δ) respectively, where F is assumed to be continuous but otherwise unknown. A consistent estimator of B as defined in equation (4.2.5) is

(4.3.9)
\[
\hat{B}_N = 2K_{\alpha/2}\,A\Big/\Big(N^{1/2}\big(\lambda(1-\lambda)\big)^{1/2}\big(\hat{\Delta}_{U,N}-\hat{\Delta}_{L,N}\big)\Big),
\]
where $K_{\alpha/2}$ is the upper 100α/2 percent point of a standard normal distribution, and $\hat{\Delta}_{L,N}$ and $\hat{\Delta}_{U,N}$ are the endpoints of the confidence interval for Δ obtained from a two-sided, symmetric, α-level test of the hypothesis $H_0:\Delta=0$. For the Wilcoxon score function, the endpoints of the estimator of Theorem 4.3.2 are

(4.3.10)
\[
\hat{\Delta}_{L,N} = D^{(C_{\alpha}+1)},\qquad \hat{\Delta}_{U,N} = D^{(mn-C_{\alpha})},
\]
where $D^{(1)} \le \dots \le D^{(mn)}$ are the ordered values of the mn differences of the Y and X observations. The value of $C_{\alpha}$ can be obtained from the tables of the null distribution of the Wilcoxon test statistic, or for large m and n it can be approximated by

(4.3.11)
\[
C_{\alpha} \approx mn/2 - K_{\alpha/2}\big(mn(m+n)/12\big)^{1/2}.
\]
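The approximation (4.3.11) and the interval endpoints $D^{(C_{\alpha}+1)}$, $D^{(mn-C_{\alpha})}$ can be sketched as follows (here α = 0.05 so $K_{\alpha/2}\approx 1.96$; the sample data are illustrative assumptions):

```python
import math
import numpy as np

def wilcoxon_ci(x, y, k=1.96):
    """Distribution-free CI for the shift: (D^(C+1), D^(mn-C)), with
    C approximated by (4.3.11): C = mn/2 - K * sqrt(mn(m+n)/12)."""
    m, n = len(x), len(y)
    c = int(m * n / 2 - k * math.sqrt(m * n * (m + n) / 12))
    d = np.sort(np.subtract.outer(y, x).ravel())   # ordered differences
    return d[c], d[m * n - c - 1]                  # 0-based: D^(C+1), D^(mn-C)

x = np.arange(10) * 0.3          # sample from F
y = np.arange(12) * 0.3 + 1.0    # same shape, shifted by delta = 1.0
lo, hi = wilcoxon_ci(x, y)
print(lo <= 1.0 <= hi)  # → True
```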
With this estimate of B, it becomes possible to construct typical analysis-of-variance tests as well as appropriate confidence intervals. In fact, Theorem 1 of Lehmann (1963b) provides a parallel notation between the regular analysis-of-variance results and the results of this section. This parallelism in notation makes it easy to write down the estimates, tests, etc., by using the usual analysis-of-variance procedures. This leads to the following theorem, proved by Lehmann (1963b).
Theorem 4.3.3: If $Y_{i\alpha}$ satisfies the assumptions of Theorem 4.3.1, where a Wilcoxon score function is used, then it follows that the term
\[
12\Big(\int f^{2}(x)\,dx\Big)^{2}\sum_{i} m_i\Big(\sum_{j} a_{ij}Z_{ij}\Big)^{2}
\]
is chi-square in the limit with some r degrees of freedom under a hypothesis H about the linear relationship of $\xi_1,\dots,\xi_c$. The $a_{ij}$ are such that $\sum a_{ij}(\xi_i-\xi_j) = 0$ under H. If $\hat{B}$ is a consistent estimate of the term $\int f^{2}(x)\,dx$ (see Theorem 4.3.2), then

(4.3.12)
\[
W = 12\hat{B}^{2}\sum_{i} m_i\Big(\sum_{j} a_{ij}Z_{ij}\Big)^{2}
\]
has a limiting chi-square distribution with r degrees of freedom.
As another approach, it is possible in some cases to obtain a quadratic form, say Q', which is asymptotically independent of Q. The desire is to choose Q' so that, upon multiplication by $12(\int f^{2}(x)\,dx)^{2}$, the result is asymptotically distributed as chi-square with, say, r' degrees of freedom regardless of whether the hypothesis is true or not. In that case, the statistic $W' = (Q/r)\big/(Q'/r')$ has a limiting F distribution with r and r' degrees of freedom under H. Lehmann (1963b, p. 1499) indicates such a procedure.
To obtain large sample confidence intervals for contrasts as defined in equation (4.1.2), we can use the expression

(4.3.13)
\[
\sum_{i}\sum_{j} d_{ij}Z_{ij} \;\pm\; C\Big(\sum_{i} c_i^{2}/\lambda_i\Big)^{1/2}\Big/\big(12N\hat{B}^{2}\big)^{1/2},
\]
where C is the appropriate tabled value for constructing a confidence interval for either a single contrast or all possible contrasts. Also, if a quadratic form Q' is available, then we would get

(4.3.14)
\[
\sum_{i}\sum_{j} d_{ij}Z_{ij} \;\pm\; C\Big(\sum_{i} c_i^{2}/\lambda_i\Big)^{1/2}\big(Q'/r'N\big)^{1/2}
\]
as an alternative to (4.3.13). Here C would ordinarily be the critical value for a two-sided t-test at level α, but it can appropriately be adjusted by Scheffé's S-method to give confidence intervals for all possible contrasts; see Scheffé (1953) and Miller (1966).
The preceding results can be applied directly to the model of equation (4.1.1) for the Wilcoxon case. Given that the error term in
\[
Y_{ijk} = \mu + \beta_i + \tau_j + \epsilon_{ijk}
\]
has the properties previously discussed, we define $V_{ijkl}$, the estimator of $(\beta_i-\beta_k)+(\tau_j-\tau_l)$, as

(4.3.15)
\[
V_{ijkl} = \operatorname{med}\big(Y_{ij\alpha} - Y_{kl\beta}\big).
\]
Estimates of the main effects would be formed from the $V_{ijkl}$, where the dot notation is defined as in equation (4.3.8). (Also, remember that $V_{ijij} = 0$.)
Lehmann (1963b) shows that for this model the terms
\[
Q = nm\sum_{j}\big(\bar{V}_{\cdot j\cdot\cdot} - \bar{V}_{\cdot\cdot\cdot\cdot}\big)^{2}
\quad\text{and}\quad
Q' = m\sum_{i}\sum_{j}\big(V_{ij\cdot\cdot} - \bar{V}_{i\cdot\cdot\cdot} - \bar{V}_{\cdot j\cdot\cdot} + \bar{V}_{\cdot\cdot\cdot\cdot}\big)^{2}
\]
are such that r = p-1 and r' = (p-1)(n-1). This provides a distribution-free test of hypotheses for the treatment effects.

Large sample confidence intervals can be obtained for contrasts by using either equation (4.3.13) or (4.3.14). To use equation (4.3.13), an estimate of $\int f^{2}(x)\,dx$ is necessary, which can be obtained by applying Theorem 4.3.2 using all possible pairings of cells.
4.4 Estimates Using Aligned Observations

As previously stated, Sen (1968a) examined what he called aligned rank tests in the context of a two-way layout for a model such as that given in equation (4.1.1). His alignment procedure utilized the block mean, which was subtracted from each observation in the block. (He first developed the results for the case of one observation per cell, with a later extension to the case of multiple observations per cell.) After alignment, the observations were ranked and assigned a score $J_N(i/N)$ subject to the Chernoff-Savage conditions. (Thus the approach of Hodges and Lehmann (1962) was generalized to a broader class of score functions by the above-cited papers of Sen's.)

Sen constructed a test statistic for $H_0:\tau_1=\dots=\tau_p$ using an argument of permutational invariance of ranks within blocks, given the n blocks of ordered observations. He showed the test statistic to be asymptotically chi-square under $H_0$ and non-central chi-square otherwise. He also obtained efficiency results comparing his test statistic to the classical test procedure. An all-pairs simultaneous test procedure was also obtained, and in a following paper, Sen (1968b) developed the theory for applying Tukey's T-method for getting confidence intervals for all possible contrasts associated with the treatment effects.
We now consider the model of equation (4.1.1) where treatment 1 is a control treatment. We propose to align the observations within each block by subtracting the median of the control treatment, $\nu_i$, from each observation in block i. If the true median is known, this removes the block effect, and the model of equation (4.1.1) reduces to

(4.4.1)
\[
Y_{ijk} = T_j + \epsilon_{ijk},
\]
where $T_j = \tau_j - \tau_1$, $T_1 = 0$, and the $\epsilon_{ijk}$ are independent and identically distributed random variables. For such a case, the procedures discussed in section 4.3 and Lehmann (1963a, 1963b, 1963c) apply directly.

But if $\nu_i$ is unknown and the sample median $\hat{\nu}_i$ is used for alignment instead, then different modifications are required. The model can still be written as in equation (4.4.1) above, but the $\epsilon_{ijk}$ are no longer independent, though they are identically distributed.
To est.ablish a notation and context for discussion, suppose w€
let Y. denote the vector (Y.1,e.e.Y. ).
-J
J
. Jm
Next~ conside1~ the Itn<'::c
'1'- L~
56
rank test statistic for comparing Y, and Y. , written as
-J
-1
N
T~ .. =h .. (y. ,Y.)='2,E ZN Imn
Nr r
1.)
1J '-1 -J
r
C4.4.2)
with E
being the Wilcoxon score function and
Nr
if t.he r-th smallest observation is a y
jk
otherwise.
Now it follows that the non-compatible Hodges-Lehmann estimator of
\[
\tau_i - \tau_j = (\tau_i - \tau_1) - (\tau_j - \tau_1) = T_i - T_j
\]
would be $W_{ij}$. The compatible estimator will be
\[
Z_{ij} = W_{i\cdot} - W_{j\cdot},
\]
where $W_{i\cdot} = \sum_{j}^{p} W_{ij}/p$. By equation (4.1.2), the estimator of an arbitrary contrast becomes

(4.4.3)
\[
\hat{\theta} = \sum_{j}^{p} c_j\hat{\tau}_j = \sum_{i}^{p}\sum_{j}^{p} d_{ij}Z_{ij}.
\]
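The whole estimation chain of this section -- align by the control median, form pairwise Hodges-Lehmann estimates $W_{ij}$, adjust to compatible $Z_{ij}$, and evaluate a contrast -- can be sketched end to end (the data, shapes, and function names are illustrative assumptions, not the author's code):

```python
import numpy as np

def aligned_contrast(y, c):
    """y[block, treatment, rep] with treatment 0 = control; c: contrast, sum 0.
    Returns the contrast estimate and the compatible matrix Z."""
    n, p, m = y.shape
    x = y - np.median(y[:, 0, :], axis=1)[:, None, None]    # median alignment
    pooled = [x[:, j, :].ravel() for j in range(p)]         # nm obs per treatment
    w = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            if i != j:                                      # W_ij: pairwise HL estimate
                w[i, j] = np.median(np.subtract.outer(pooled[i], pooled[j]))
    row = w.mean(axis=1)                                    # row means W_i.
    z = row[:, None] - row[None, :]                         # compatible Z_ij
    t_hat = row - row[0]                                    # estimates of T_j (T_1 = 0)
    return float(np.dot(c, t_hat)), z

rng = np.random.default_rng(4)
n, p, m = 8, 3, 6
tau = np.array([0.0, 1.0, 2.0])
y = rng.normal(size=n)[:, None, None] * 3 + tau[None, :, None] \
    + rng.normal(size=(n, p, m))
theta_hat, z = aligned_contrast(y, c=np.array([-1.0, 0.0, 1.0]))
print(round(theta_hat, 1))   # estimates the contrast tau_3 - tau_1 = 2
```

Because the contrast has coefficients summing to zero, using the adjusted $Z_{ij}$ (equivalently the centered row means) leaves the estimate unchanged by the compatibility adjustment.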
The asymptotic properties of $W_{ij}$ and $Z_{ij}$ follow as in section 4.3, with m tending to infinity such that $m/N = \lambda = 1/np$. The results are summarized in the following theorem:
Theorem 4.4.1: Assuming the same conditions on F(x) as in Theorem 4.3.1:

(i) The joint distribution of $(V_1,\dots,V_{p-1})$, where

(4.4.4)
\[
V_j = m^{1/2}\big(W_{jp} - (\tau_j - \tau_p)\big),
\]
is asymptotically normal with zero mean and variance-covariance terms

(4.4.5)
\[
\operatorname{Var}(V_j) = 1\big/6nG^{2},
\]

(4.4.6)
\[
\operatorname{Cov}(V_h,V_j) = 1\big/12nG^{2},\qquad j\neq h;\; j,h = 1,\dots,p-1.
\]
Here the density f(x) of F(x) is assumed to satisfy the regularity conditions of Lemma 3(a) of Hodges and Lehmann (1961).

(ii) For any j and h, $m^{1/2}W_{jh} \sim m^{1/2}(W_{jp}-W_{hp})$, where $\sim$ indicates the difference in the two sides tends to zero in probability.

(iii) The difference $m^{1/2}(Z_{jh}-W_{jh})$ converges to zero in probability for all j and h.
Proof: The proof of this theorem follows by the same arguments used in proving Theorem 4.3.1, with adjustment made for the dependence caused by the aligning of the observations. We assume $\tau_j - \tau_p = a_j/N^{1/2}$, so that Lemma 4.3.1 will apply by defining $U_j$ to be $N^{1/2}(T^{*}_{Njp} - \mu_{jp})$. By the results of chapter 3, $U_j$ is asymptotically normal with variance terms being $\operatorname{Var}(U_j) = p/24$. To show this, we recognize $T^{*}_{Njp}$ to be the test statistic previously considered in chapters 2 and 3 for median-aligned rank statistics, assuming identically distributed error terms and a collapsing alternative. In the pairwise comparison considered here, p=2, so that the asymptotic variance of $m^{1/2}(T^{*}_{Njp} - \mu_{jp})$ is equal to 1/24n.
Now to obtain $m\operatorname{Cov}(T^{*}_{Njp};T^{*}_{Nhp})$, we use the same approach used in chapters 2 and 3, recognizing that $T^{*}_{Njp}$ and $T^{*}_{Nhp}$ involve only those observations under treatments j, h and p, with the observations under treatment p being the only ones common to both $T^{*}_{Njp}$ and $T^{*}_{Nhp}$. We need to obtain
\[
m\operatorname{Cov}\big(B^{i}_{1Njp}+B^{i}_{2Njp};\,B^{i}_{1Nhp}+B^{i}_{2Nhp}\big) = \operatorname{Cov}(a-b+c-d,\,a''-b''+c''-d'')
\]
and
\[
m\operatorname{Cov}\big(B^{i}_{1Njp}+B^{i}_{2Njp};\,B^{r}_{1Nhp}+B^{r}_{2Nhp}\big),
\]
where, with some notational simplificat:ion over cbapter 2) we define
1-
a""m2(2n)~
1 n
,.
2:
~. B1,g(y, ,)
g q,=,j ,p q
1.J
g>'<q
lIn
,
b ""m'2(2"
,n )~' ~-"'"
2. ~ B'" .l(y'
"
g q=J,P J
g>'<q
-1
1·
c~m2(2n) . U(Y
gq
n ) L~:
).
gi
G
g q=J ,P
qj
g"'q
1,.
.
·-1
'2 ("
d=m
•
.Ln)
~ 1."",·
L
g
,
c"
.
11' Y 1)
g
The definitions of ali, b n , e" and ePi are s.unl1ar t:c thuse of a, b,
c and d except each j
q
is
re~,laced
wi t1: an 11
can takE.: only the vallJeG c·f fl dnd
are the same with eaeb i
results ..
repl."l.cl~d
po
The
dwi
UtE
v:llue of
Ll.it='le~v~imed Vd'dco
by all T? ind eacb ] :r
(~IJ'Ldced
It can be shown that Cov(b, b'') = Cov(c, c'') = 1/48n. The remaining covariance terms simplify to those found in expressions (3.2.12) and (3.3.15), where the value of p in those expressions is replaced by 2. The end result, after simplification and routine algebra, is that, asymptotically,

m·Cov(T*_Njp, T*_Nhp) = 1/48n,

which is so for any j ≠ h. Multiplying by N/m = np gives (4.4.6).
Using these results along with equation (9.1) of Hodges and Lehmann (1963), we get, for N = mnp, in the multivariate case

lim P_N{ m^{1/2}[W_jp - (τ_j - τ_p)] ≤ a_j for all j } = lim P_N{ N^{1/2}(T*_Njp - μ_Njp) ≤ a_j G/2 for all j }.

Therefore, the conclusion of (i) of Theorem 4.4.1 is obtained in the same way that (i) of Theorem 4.3.1 was obtained. Also, parts (ii) and (iii) would follow, relying on Lehmann's (1963a) arguments.
Now in order to utilize the above conclusions for testing hypotheses or in constructing confidence intervals on the contrasts, a consistent estimate of G is necessary. In order to get such an estimator, we can consider the model for the observations in the original, unaligned two-way table. We can write this model as

Y_ijk = S_ij + ε_ijk,

where S_ij = μ + β_i + τ_j. In this model, the assumptions of Theorem 4.3.2 are satisfied relative to any two cells (i,j) and (k,l) of the two-way table. Therefore, we can get an estimate of G by applying equation (4.3.10) to the 2m observations in the two cells considered. Letting Ĝ_ij,kl denote this estimate of G, we can similarly obtain estimates of G for all (np choose 2) pairings of the np cells in the table. The proposed estimator of G would then be

(4.4.7)   Ĝ = (np choose 2)^{-1} Σ Ĝ_ij,kl,

where the sum is over all possible pairings of the np cells in the table.
Utilizing this estimator of G, we can get consistent estimates for Var(V_j) and Cov(V_j, V_h) as given in equation (4.4.5). Then, in order to construct tests of hypotheses about the τ's, section 4.3 can be utilized. In addition, using the approach of Puri and Sen (1967), the contrast θ of equation (4.1.2) can be estimated by

θ̂ = Σ_i Σ_j d_ij Z_ij = Σ_i c_i W_i. ,

where d_ij = c_i/p for j = 1, ..., p; i = 1, ..., p. It can be shown that m^{1/2}(θ̂ - θ) is asymptotically normal with mean zero and variance (1 - ρ)V Σ_i c_i², where V = 1/(6nG²) and ρ = 1/2. Therefore, it follows that

(4.4.8)   m^{1/2}(θ̂ - θ) / [(1 - ρ)(Σ_i c_i²)/(6nĜ²)]^{1/2}

is asymptotically normally distributed and thus provides a means of testing hypotheses or constructing a confidence interval on the contrast θ.
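As a computational sketch of the estimation just described, the following Python function (the function name and packaging are mine, not the thesis's) forms θ̂ = Σ_i c_i W_i. and the large-sample standard error implied by (4.4.8), with ρ = 1/2 and a supplied consistent estimate Ĝ:

```python
import math

def contrast_estimate(W_dot, c, m, n, G_hat, rho=0.5):
    """theta-hat = sum_i c_i * W_i. together with its large-sample
    standard error [(1 - rho) * sum_i c_i^2 / (6 m n G_hat^2)]^(1/2)."""
    if abs(sum(c)) > 1e-12:
        raise ValueError("c must satisfy sum(c) == 0 to be a contrast")
    theta_hat = sum(ci * wi for ci, wi in zip(c, W_dot))
    se = math.sqrt((1 - rho) * sum(ci * ci for ci in c)
                   / (6 * m * n * G_hat ** 2))
    return theta_hat, se
```

For instance, with the W_i. values and Ĝ of the worked example in chapter 5, contrast_estimate((-13.3333, 4.6667, 8.6667), (2, -1, -1), 9, 2, 0.03111) gives θ̂ = -40 with a standard error of about 5.36, so an approximate 95% interval is θ̂ ± 1.96·se.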
In addition, we could construct large-sample multiple comparison procedures for all possible contrasts analogous to Scheffé's S-method. This procedure, suggested by Lehmann (1963b), uses expression (4.3.10) to give the approximate 100(1 - α) per cent confidence intervals for any contrast θ as

(4.4.9)   Σ_i c_i W_i. ± [χ²_{(p-1),α} (Σ_i c_i²)/(12mnĜ²)]^{1/2},

where χ²_{(p-1),α} is the upper 100α per cent point of the chi-square distribution with (p-1) degrees of freedom.
To facilitate pairwise comparisons, we can take advantage of the fact that the vector (V_1, ..., V_{p-1}) of Theorem 4.4.1 has the covariance matrix

(4.4.10)

                |  1    1/2  . . .  1/2 |
   (6nG²)^{-1}  | 1/2    1   . . .  1/2 |
                |  .     .           .  |
                | 1/2   1/2  . . .   1  |

As a result, the vector (Z_1p, ..., Z_{(p-1)p}) would have the same form of matrix, with (6nG²)^{-1} replaced by (6mnG²)^{-1}, as its approximate large-sample covariance matrix.

Miller (1966) shows that Tukey's T-method can be applied easily when the covariance matrix is of this special form, when an independent estimate of 6mnG² is available or when we can assume it is known, in large-sample cases.
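To make the structure of (4.4.10) concrete, here is a small Python sketch that builds this equicorrelated covariance matrix for given p, n and a supplied value of G (the function name is mine; in practice G would be the estimate from (4.4.7)):

```python
def v_covariance(p, n, G):
    """Covariance matrix of (V_1, ..., V_{p-1}):
    diagonal entries 1/(6 n G^2), off-diagonal entries 1/(12 n G^2)."""
    var = 1.0 / (6.0 * n * G ** 2)
    cov = 1.0 / (12.0 * n * G ** 2)
    return [[var if i == j else cov for j in range(p - 1)]
            for i in range(p - 1)]
```

Every off-diagonal entry is exactly half the diagonal entry, which is the equal-variance, equal-correlation (ρ = 1/2) pattern under which Miller shows the T-method applies.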
Pairwise comparisons among the Z_jp's would provide estimates only of the differences τ_j - τ_h, j ≠ p and h ≠ p. However, each Z_jp itself estimates the difference τ_j - τ_p. Therefore, using the individual Z_jp's as well as all possible pairwise differences gives all possible pairwise comparisons among the total of p τ's. Such an approach, however, takes us out of the space of linear contrasts into the linear space of all linear combinations. If we
restrict ourselves to the above described linear combination of
individual and pairwise comparisons, we can construct the approximate, large-sample, simultaneous confidence intervals with a 100(1 - α)% confidence coefficient:

(4.4.11)   Σ_i l_i Z_ip ± q'_{(p-1),α} [1/(12mnĜ²)]^{1/2},

where q'_{(p-1),α} is the upper 100α percentage point of the studentized augmented range distribution with parameters (p-1) and ∞. This procedure is discussed in Miller (1966, p. 42).
5. SUMMARY AND CONCLUSIONS

5.1 An Example

We will illustrate the median-aligned rank testing and estimation procedures considered in the preceding chapters by an example. The hypothetical data appearing in Table 5.1.1 represent measurements obtained from an appropriately conducted experiment where it is assumed that treatment 1 is a control, and that the data satisfy models (4.1.1) and (4.4.1) with n = 2, p = 3, and m = 9 experimental units per cell.
Table 5.1.1--Data

                    Treatments
Blocks          1        2        3
  1            48       74       74
               43       79       72
               49       57       61
               56       67       72
               39       56       73
               51       87       66
               61       69       67
               44       49       79
               51       66       75
  2            45       63       69
               53       64       75
               53       75       76
               55       75       80
               57       77       81
               59       81       83
               65       82       85
               70       82       85
               72       84       85
Means       53.94    71.50    75.44
The medians for the control treatment in the respective blocks are M̂_11 = 49 and M̂_21 = 57. Subtracting these terms from each of the other observations in their respective blocks yields Table 5.1.2 below.
Table 5.1.2--Median Aligned Observations

        Treatments
     1       2       3
   -12       0      12
   -10       6      12
    -6       7      17
    -5       7      18
    -4       8      18
    -4      17      19
    -2      18      23
    -1      18      23
     0      18      23
     0      20      24
     2      20      24
     2      24      25
     2      25      26
     7      25      26
     8      25      28
    12      27      28
    13      30      28
    15      38      30
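The alignment step can be reproduced with a few lines of Python; this is a sketch of the procedure just described (the variable names are mine), using the data of Table 5.1.1:

```python
import statistics

# Raw data of Table 5.1.1: blocks -> treatments -> m = 9 observations.
data = {
    1: {1: [48, 43, 49, 56, 39, 51, 61, 44, 51],
        2: [74, 79, 57, 67, 56, 87, 69, 49, 66],
        3: [74, 72, 61, 72, 73, 66, 67, 79, 75]},
    2: {1: [45, 53, 53, 55, 57, 59, 65, 70, 72],
        2: [63, 64, 75, 75, 77, 81, 82, 82, 84],
        3: [69, 75, 76, 80, 81, 83, 85, 85, 85]},
}

# Median of the control (treatment 1) in each block.
control_median = {b: statistics.median(cells[1]) for b, cells in data.items()}

# Subtract the block's control median from every observation in that block.
aligned = {j: sorted(y - control_median[b]
                     for b, cells in data.items()
                     for y in cells[j])
           for j in (1, 2, 3)}
```

Here control_median comes out as {1: 49, 2: 57}, and the three sorted lists in aligned reproduce the columns of Table 5.1.2.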
Suppose we want to test the hypothesis H_0: τ_1 = τ_2 = τ_3 = 0. To do this, we would rank all 54 observations together and get the sum of the ranks for each treatment. This is the same procedure considered by Kruskal (1952) and Kruskal and Wallis (1952) for comparing p treatments in a one-criterion analysis of variance of ranks. See also Siegel (1956). In the case of ties, the average of the ranks was used. The sums of the ranks are T_1 = 202.5, T_2 = 589 and T_3 = 693.5. Recognizing that T*_Nj = (mn)^{-1}T_j/N, we get

T*_N1 = .20833,   T*_N2 = .60597,   and   T*_N3 = .71348.

Using equation (3.4.2), we have that under H_0

L* = 12mn Σ_q (T*_Nq - 1/2)² = 30.57,

which exceeds the upper .0005 value in the chi-square distribution for 2 df. Therefore, we reject the hypothesis of equal treatment effects.
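Starting again from the Table 5.1.2 columns, the rank sums and the statistic L* can be checked with the short Python sketch below (midranks for ties; the names are mine). The value it produces, about 30.6, agrees with the 30.57 reported here up to rounding:

```python
aligned = {
    1: [-12, -10, -6, -5, -4, -4, -2, -1, 0, 0, 2, 2, 2, 7, 8, 12, 13, 15],
    2: [0, 6, 7, 7, 8, 17, 18, 18, 18, 20, 20, 24, 25, 25, 25, 27, 30, 38],
    3: [12, 12, 17, 18, 18, 19, 23, 23, 23, 24, 24, 25, 26, 26, 28, 28, 28, 30],
}
m, n, p = 9, 2, 3
N = m * n * p  # 54 observations in all

# Midranks: tied observations share the average of the ranks they occupy.
pooled = sorted(v for col in aligned.values() for v in col)
midrank = {v: pooled.index(v) + (pooled.count(v) + 1) / 2 for v in pooled}

rank_sum = {j: sum(midrank[v] for v in col) for j, col in aligned.items()}
# rank_sum: {1: 202.5, 2: 589.0, 3: 693.5}

T_star = {j: rank_sum[j] / (m * n * N) for j in aligned}
L_star = 12 * m * n * sum((T_star[j] - 0.5) ** 2 for j in T_star)
```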
Now to consider the problem of estimation relative to the terms τ_1, τ_2, and τ_3, the approach of section 4.4 is used. After some computation involving the values in Table 5.1.2 and applying equation (4.2.4), where W_ij = med(Y_i - Y_j), the median of the pairwise differences between aligned observations under treatments i and j, we get

W_12 = -W_21 = -18,   W_13 = -W_31 = -22,   W_23 = -W_32 = -4.

Therefore, with W_i. = Σ_j W_ij/p,

W_1. = (0 - 18 - 22)/3 = -13.3333
W_2. = (18 + 0 - 4)/3 = 4.6667
W_3. = (22 + 4 + 0)/3 = 8.6667.

For comparison purposes, the least squares estimates using the means are -17.5556, -21.5000, and -3.9444 respectively.
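These W_ij are Hodges-Lehmann-type estimates, medians of all pairwise differences between the aligned observations of two treatments. A self-contained Python sketch (names mine), once more from the Table 5.1.2 columns:

```python
import statistics

aligned = {
    1: [-12, -10, -6, -5, -4, -4, -2, -1, 0, 0, 2, 2, 2, 7, 8, 12, 13, 15],
    2: [0, 6, 7, 7, 8, 17, 18, 18, 18, 20, 20, 24, 25, 25, 25, 27, 30, 38],
    3: [12, 12, 17, 18, 18, 19, 23, 23, 23, 24, 24, 25, 26, 26, 28, 28, 28, 30],
}

def w(i, j):
    """Median of all pairwise differences between treatments i and j."""
    if i == j:
        return 0.0
    return statistics.median(x - y for x in aligned[i] for y in aligned[j])

# W_i. = average of W_ij over j = 1, ..., p (here p = 3).
W_dot = [sum(w(i, j) for j in (1, 2, 3)) / 3 for i in (1, 2, 3)]
```

w(1, 2), w(1, 3) and w(2, 3) come out as -18, -22 and -4, matching the values quoted above.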
To estimate G = ∫ f²(x)dx, we consider the original observations of Table 5.1.1 and the approach of Theorem 4.3.2 and equations (4.3.10) and (4.4.7). Suppose we let α = .05. Then, from the Mann-Whitney-Wilcoxon two-sample tables, we obtain C_.05 = 17, indicating we need D^(17) and D^(65). In addition, we have that K_.025 = 1.96. Suppose we look at cell 11 vs. cell 12. In this case it is found that D^(65) - D^(17) = 24, and as a result Ĝ_11,12 = .02224.
Similarly, considering the other possibilities, we get

Ĝ_11,13 = .03812   Ĝ_11,21 = .03139   Ĝ_11,22 = .03335
Ĝ_11,23 = .03812   Ĝ_12,13 = .02320   Ĝ_12,21 = .02135
Ĝ_12,22 = .02224   Ĝ_12,23 = .02320   Ĝ_13,21 = .02965
Ĝ_13,22 = .03558   Ĝ_13,23 = .04851   Ĝ_21,22 = .02809
Ĝ_21,23 = .03139   Ĝ_22,23 = .04105.

Averaging over all 15 estimates as indicated in (4.4.7), we get Ĝ = .03111.
This estimate of G is used to give a consistent estimate of the components of the covariance matrix associated with the W_ij and the Z_ij. By expression (4.4.5), they turn out to be

Var(V_j) = 85.7727   and   Cov(V_i, V_j) = 42.8864.
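Averaging the fifteen pairwise estimates and substituting into (4.4.5) can be sketched as follows; the small differences from the 85.7727 and 42.8864 quoted above come from rounding in the intermediate value of Ĝ:

```python
G_pairs = [
    0.02224, 0.03812, 0.03139, 0.03335, 0.03812,
    0.02320, 0.02135, 0.02224, 0.02320, 0.02965,
    0.03558, 0.04851, 0.02809, 0.03139, 0.04105,
]
G_hat = sum(G_pairs) / len(G_pairs)  # about .0312

n = 2
var_v = 1 / (6 * n * G_hat ** 2)   # estimate of Var(V_j)
cov_v = 1 / (12 * n * G_hat ** 2)  # estimate of Cov(V_i, V_j)
```

As the theory requires, the covariance estimate is exactly half the variance estimate, so the estimated correlation is 1/2.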
It is easily seen that ρ = 1/2. Suppose we are interested in estimating the contrast θ = 2τ_1 - τ_2 - τ_3. The estimate of this contrast would be θ̂ = 2W_1. - W_2. - W_3. = -40. Using expression (4.4.8), we can compute a 95% confidence interval as follows:

θ̂ ± 1.96[(1 - ρ)(Σ_i c_i²)/(6mnĜ²)]^{1/2}

or

-40 ± 1.96[28.5909]^{1/2},

which gives the interval (-50.48, -29.52).
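This interval can be checked numerically; the sketch below (names mine) reproduces it to within the rounding of Ĝ:

```python
import math

W_dot = (-40 / 3, 14 / 3, 26 / 3)   # W_1., W_2., W_3. from above
c = (2, -1, -1)                     # contrast 2*tau_1 - tau_2 - tau_3
m, n, G_hat, rho = 9, 2, 0.03111, 0.5

theta_hat = sum(ci * wi for ci, wi in zip(c, W_dot))  # equals -40 up to float rounding
var = (1 - rho) * sum(ci * ci for ci in c) / (6 * m * n * G_hat ** 2)
half_width = 1.96 * math.sqrt(var)
interval = (theta_hat - half_width, theta_hat + half_width)
```

The computed variance is about 28.70 and the interval about (-50.50, -29.50), against the 28.5909 and (-50.48, -29.52) of the text, which used an unrounded Ĝ.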
To consider all possible contrasts using equation (4.4.9), we get for α = .05

Σ_i c_i W_i. ± [χ²_{(p-1),α}/(12mnĜ²)]^{1/2}[Σ_i c_i²]^{1/2} = Σ_i c_i W_i. ± (6)^{1/2}(4.7652)^{1/2}(Σ_i c_i²)^{1/2}.

For pairwise comparisons, with Σ_i c_i² = 2, we get the simultaneous confidence intervals:

-18 ± 7.56   or   (-25.56, -10.44)
-22 ± 7.56   or   (-29.56, -14.44)
 -4 ± 7.56   or   (-11.56,   3.56).
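A sketch of the computation behind these Scheffé-type intervals, with the chi-square quantile 5.991 entered as a table constant (the standard library has no chi-square quantile function; names are mine). Using the rounded Ĝ = .03111, the half-width comes out near 7.57 rather than the text's 7.56:

```python
import math

Z = {(1, 2): -18, (1, 3): -22, (2, 3): -4}  # pairwise estimates
m, n, G_hat = 9, 2, 0.03111
chi2_2_05 = 5.991          # upper .05 point of chi-square with 2 df
sum_c_sq = 2               # for a pairwise contrast c = (1, -1)

half_width = math.sqrt(chi2_2_05 * sum_c_sq / (12 * m * n * G_hat ** 2))
intervals = {pair: (z - half_width, z + half_width) for pair, z in Z.items()}
```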
For Tukey's pairwise, linear combination result of expression (4.4.11), we need a value from the augmented studentized range distribution, which unfortunately is not tabled. However, Miller (1966) pointed out that Tukey indicates that, for p - 1 ≥ 2 and α ≤ .05, regular studentized range values serve as reasonable approximations to the augmented range values. Proceeding in this manner for our borderline case, we get the following set of simultaneous confidence intervals for pairwise comparisons when α = .05:

Σ_i l_i Z_ip ± 3.32(2.18),

which yields

-18 ± 7.24   or   (-25.24, -10.76)
-22 ± 7.24   or   (-29.24, -14.76)
 -4 ± 7.24   or   (-11.24,   3.24).
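The same pattern applies here, with the studentized range value 3.32 (read from a table, as in the text) in place of the chi-square constant; the scale factor [1/(12mnĜ²)]^{1/2} is my reading of the 2.18 used above, and with the rounded Ĝ it evaluates to about 2.19:

```python
import math

Z = {(1, 2): -18, (1, 3): -22, (2, 3): -4}
m, n, G_hat = 9, 2, 0.03111
q = 3.32                                          # studentized range approximation
scale = math.sqrt(1 / (12 * m * n * G_hat ** 2))  # about 2.19
half_width = q * scale                            # about 7.26, vs 7.24 in the text
tukey = {pair: (z - half_width, z + half_width) for pair, z in Z.items()}
```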
The above results lead us to conclude, under our model assumptions, that τ_1 ≠ τ_2 and τ_1 ≠ τ_3, but that τ_2 and τ_3 could be equal.
5.4 Summary and Recommendations

In the preceding chapters, we developed an aligned rank test procedure for a two-way layout with p treatments and n blocks, using the median of a control treatment to align the observations. The purpose of the alignment was to remove the block effects so that interblock comparisons of the treatment effects could be made, with the hope of producing a more efficient test statistic.

We used a Wilcoxon score function on the aligned observations so that a Kruskal-Wallis, p-treatment test procedure resulted. It was shown that the resulting test statistic, denoted by T*_Nj, was asymptotically normal under rather mild assumptions about the underlying distribution function of the aligned observations. This asymptotic normality was shown to hold under reasonable alternative hypotheses as well as under the null. As a result, a distribution-free test statistic which was asymptotically chi-square with p-1 degrees of freedom was obtained for testing the equality of the treatment effects under the assumed model.

Finally, we obtained estimates of treatment differences using the statistic T*_Nj in the manner proposed by Hodges and Lehmann (1963). The estimators of pairwise differences in the treatment effects were shown to be asymptotically normal, but unfortunately not distribution-free because of the presence of the term G = ∫ f²(x)dx in the variance and covariance expressions.
However, a consistent estimator of G
was obtained; and as a result, large sample tests and confidence
interval procedures were developed for linear contrasts of interest
among the treatment effects.
In the development of these methods, it was assumed that a treatment vs. control experiment was under consideration. As a result, it was natural to use the control treatment median in the alignment process. However, in a more general situation, the same results would be obtained if an arbitrary treatment were identified to serve as the "control". (Keep in mind that a basic assumption was that the model was completely balanced and that m, the number of observations per cell, was large.)

Areas for further investigation include the consideration of other types of designs, such as the partially balanced incomplete block design, and the examination of unbalanced models; in such instances, simultaneous procedures such as those proposed by Hochberg (1974, 1975) may be useful. Different score functions could be considered, or development using a more general score function could be pursued. Investigations into possible specialized single-observation-per-cell models could be done. In addition, the small sample properties of the median-aligned approach could be investigated, and a Monte Carlo investigation of efficiency could be looked into. These problems represent just a beginning of possible jump-off points for further research and study in this area.
6. LIST OF REFERENCES

Bahadur, R.R. 1966. A note on quantiles in large samples. Ann. Math. Stat. 37:577-580.

Bhattacharyya, H. 1973. On some nonparametric estimates of scale and large sample distribution of sample median adjusted rank order statistics. Unpublished Ph.D. thesis, Univ. of North Carolina at Chapel Hill, Chapel Hill, N.C.

Bhuchongkul, S. and Puri, M.L. 1965. On the estimation of contrasts in linear models. Ann. Math. Stat. 36:198-202.

Chernoff, H. and Savage, I.R. 1958. Asymptotic normality and efficiency of certain nonparametric test statistics. Ann. Math. Stat. 29:972-994.

Hochberg, Y. 1974. Some generalizations of the T-method in simultaneous inference. Jour. Mult. Analy. 4:224-234.

Hochberg, Y. 1975. An extension of the T-method to general unbalanced models of fixed effects. To appear in Jour. Roy. Stat. Soc. (B).

Hodges, J.L., Jr. and Lehmann, E.L. 1956. The efficiency of some nonparametric competitors of the t-test. Ann. Math. Stat. 27:324-335.

Hodges, J.L., Jr. and Lehmann, E.L. 1961. Comparison of the normal scores and Wilcoxon tests. Proc. 4th Berkeley Symp. Math. Stat. Prob. 1:307-318.

Hodges, J.L., Jr. and Lehmann, E.L. 1962. Rank methods for combination of independent experiments in the analysis of variance. Ann. Math. Stat. 33:482-497.

Hodges, J.L., Jr. and Lehmann, E.L. 1963. Estimates of location based on rank tests. Ann. Math. Stat. 34:598-611.

Hollander, M. 1966. An asymptotically distribution-free multiple comparison procedure--treatments vs. control. Ann. Math. Stat. 37:735-738.

Karlin, S. and Truax, D. 1960. Slippage problems. Ann. Math. Stat. 31:296-324.

Kruskal, W.H. 1952. A nonparametric test for the several sample problem. Ann. Math. Stat. 23:525-540.

Kruskal, W.H. and Wallis, W.A. 1952. Use of ranks in one-criterion variance analysis. Jour. Amer. Stat. Assn. 47:583-621.

Lehmann, E.L. 1963a. Robust estimation in analysis of variance. Ann. Math. Stat. 34:957-966.

Lehmann, E.L. 1963b. Asymptotically nonparametric inference: An alternative approach to linear models. Ann. Math. Stat. 34:1494-1506.

Lehmann, E.L. 1963c. Nonparametric confidence intervals for a shift parameter. Ann. Math. Stat. 34:1507-1512.

Mehra, K.L. and Sarangi, J. 1967. Asymptotic efficiency of certain rank tests for comparative experiments. Ann. Math. Stat. 38:90-107.

Miller, R.G., Jr. 1966. Simultaneous statistical inference. McGraw-Hill Book Co., New York.

Paulson, E. 1952. On the comparison of several experimental categories with a control. Ann. Math. Stat. 23:239-246.

Puri, M.L. 1964. Asymptotic efficiency of a class of c-sample tests. Ann. Math. Stat. 35:102-121.

Puri, M.L. and Sen, P.K. 1967. On some optimum nonparametric procedures in two-way layouts. Jour. Amer. Stat. Assn. 62:1214-1229.

Puri, M.L. and Sen, P.K. 1971. Nonparametric methods in multivariate analysis. John Wiley & Sons, Inc., New York.

Raghavachari, M. 1965. The two-sample scale problem when locations are unknown. Ann. Math. Stat. 36:1236-1242.

Scheffé, H. 1953. A method for judging all contrasts in the analysis of variance. Biometrika 40:87-104.

Sen, P.K. 1966. On a distribution-free method of estimating asymptotic efficiency of a class of nonparametric tests. Ann. Math. Stat. 37:1759-1770.

Sen, P.K. 1967. On some nonparametric generalizations of Wilks' tests for H_M, H_VC and H_MVC, I. Ann. Inst. Stat. Math. 19:451-471.

Sen, P.K. 1968a. On a class of aligned rank order tests in two-way layouts. Ann. Math. Stat. 39:1115-1124.

Sen, P.K. 1968b. On nonparametric T-method of multiple comparisons for randomized blocks. Ann. Inst. Stat. Math. 19:329-333.

Siegel, S. 1956. Nonparametric statistics for the behavioral sciences. McGraw-Hill Book Co., New York.

Steel, R.G.D. 1959. Treatments versus control multiple comparison sign test. Jour. Amer. Stat. Assn. 54:767-775.

Wilcoxon, F. 1964. Some rapid approximate statistical procedures. American Cyanamid Company, Stamford, Conn.
THE C*-TERMS FOR THE GENERAL PROBLEM

The individual C*-terms discussed in Section 2.1 are, for q = 1, ..., p,

C*_1Nq = (1/2) ∫_{0<H_N<1} (H_N(x) - H(x))² J''(θH_N + (1 - θ)H) dF*_mq(x)

C*_2Nq = ∫_{0<H_N<1} (J_N(H_N) - J(H_N)) dF*_mq(x)

C*_3Nq = ∫ J_N(H_N) dF*_mq(x)

C*_4Nq = ∫ (-J(H) - (H_N - H)J'(H)) dF*_mq(x),

where the last two integrals extend over the complement of the region 0 < H_N < 1. The asterisk signifies that aligned observations are used, that is, x in the expressions represents y - M̂; H_N and H stand for H_N(x) and H(x), respectively; and J(u) = u, so that J'(u) = 1 and J''(u) = 0.
First, consider the term C*^i_{g,q,Nj} when g = i and q = j. After some manipulation the expression becomes

C*^i_{i,j,Nj} = (np)^{-1} [ ∫_{0<H_N<1} J'(H) d(F^i_mj(x) - F^i_j(x)) ]².

This identity comes from integrating the expression over the regions R and R̄, where R denotes the set of points of increase of F^i_mj(x). Upon integrating the resulting first term by parts, the bracketed integral is seen to be O_p(N^{-1/2}) uniformly, so that this term is O_p(N^{-1}) = o_p(N^{-1/2}).
Now consider the term C*^i_{g,q,Nj} where either g ≠ i or q ≠ j, or both. We write

C*^i_{g,q,Nj} = (np)^{-1} ∫_{0<H_N<1} (F^g_mq(x) - F^g_q(x)) J'(H) d(F^i_mj(x) - F^i_j(x))

= (np)^{-1} { ∫ (F^g_mq(y*_g) - F^g_q(y*_g)) J'(H(y*_g)) d(F^i_mj(y*_i) - F^i_j(y*_i))

+ ∫ (F^g_mq(y*_g) - F^g_q(y*_g)) J'(H(y*_g)) d(F^i_j(y*_i) - F^i_j(y))

+ ∫ (F^g_q(y*_g) - F^g_q(y)) J'(H(y*_g)) d(F^i_mj(y*_i) - F^i_j(y*_i))

+ ∫ (F^g_q(y*_g) - F^g_q(y)) J'(H(y*_g)) d(F^i_j(y*_i) - F^i_j(y)) }

= (np)^{-1} (C_gq1 + C_gq2 + C_gq3 + C_gq4).
In the above expressions, using the method indicated by Raghavachari, each M̂_g1 is replaced by a nonrandom t_g/N^{1/2}. (See Raghavachari (1965).) Further, we define y*_g = y - t_gN^{-1/2} for all g, and I_N denotes the region of integration 0 < H_N(x) < 1 previously defined. Also, to ease the notational burden further, we let

G*(y) = F^g_q(y),        G**(y) = F^g_q(y*_g),
F*(y) = F^i_j(y),        F**(y) = F^i_j(y*_i),
G**_m(y) = the empirical distribution corresponding to G**,
F**_m(y) = the empirical distribution corresponding to F**.
The objective is to show that C*^i_{g,q,Nj} is o_p(N^{-1/2}) uniformly in t_i and t_g for |t_i| ≤ a_i and |t_g| ≤ a_g, where a_i and a_g are constants. Now,
E(C²_gq1) = E[ ∫∫ (G**_m(x) - G**(x))(G**_m(y) - G**(y)) J'(H_x) J'(H_y) d(F**_m(x) - F**(x)) d(F**_m(y) - F**(y)) ]

= 2m^{-1} ∫∫_{R_xy} G**(x)(1 - G**(y)) d(F**_m(x) - F**(x)) d(F**_m(y) - F**(y)) + m^{-2} ∫ G**(x)(1 - G**(x)) dF**_m(x)

= C_gq1a + C_gq1b,

where it should be remembered that J'(u) = 1. In addition, R_xy denotes the region of integration -∞ < x < y < ∞,
where M_xy = G**(x)(1 - G**(y)). Now,

E[ 2m^{-1} ∫∫ M_xy dF**_m(x) dF**_m(y) ] = 2m^{-3} Σ Σ_{x<y} E(M_xy) = 2m(m - 1)m^{-3} ∫∫_{R_xy} M_xy dF**(x) dF**(y).

Therefore,

E(C_gq1a) = -2m^{-2} ∫∫_{R_xy} M_xy dF**(x) dF**(y) = O(N^{-2}),

since M_xy is bounded. In addition, E(C_gq1b) = O(N^{-2}), since the integrand is bounded and integrating with respect to F**_m is equivalent to taking the average of the integrand. Therefore, we have E(C²_gq1) = O(N^{-2}), so that, by the Chebyshev inequality, C_gq1 = o_p(N^{-1/2}).
For the term C_gq2 we have

C_gq2 = ∫ (G**_m(y) - G**(y)) d(F**(y) - F*(y)),

and expanding E(C²_gq2) produces four double integrals of the form

∫∫ F^g_q(x - t_gN^{-1/2}) (1 - F^g_q(y - t_gN^{-1/2})) I(x, y) f(x) f(y) dx dy,

where I(a,b) = 1 if a ≤ b and 0 otherwise, so that

E(C²_gq2) = 2m^{-1} (a_N - b_N - c_N + d_N).

Now, since F^g_q(x) is bounded and f is a probability density function, we can use the dominated convergence theorem on all four terms and obtain

lim a_N = lim b_N = lim c_N = lim d_N   and   lim N·E(C²_gq2) = 0.

Therefore, by the Chebyshev inequality, C_gq2 = o_p(N^{-1/2}).
Similarly,

C_gq3 = t_gN^{-1/2} ∫ f^g_q(y - θt_gN^{-1/2}) d(F^i_mj(y - t_iN^{-1/2}) - F^i_j(y - t_iN^{-1/2})),   0 < θ < 1.

Following the methods for C_gq1, we have C_gq3 = o_p(N^{-1/2}).
Lastly,

C_gq4 = ∫ (F^g_q(y - t_gN^{-1/2}) - F^g_q(y)) d(F^i_j(y - t_iN^{-1/2}) - F^i_j(y))

= -t_gN^{-1/2} ∫ f^g_q(y - θt_gN^{-1/2}) (f^i_j(y - t_iN^{-1/2}) - f^i_j(y)) dy

= -t_gN^{-1/2} ∫ f^g_q(u - (θt_g - t_i)N^{-1/2}) f^i_j(u) du + t_gN^{-1/2} ∫ f^g_q(y - θt_gN^{-1/2}) f^i_j(y) dy.

So, by the dominated convergence theorem, lim N^{1/2}C_gq4 = 0, so that C_gq4 = o_p(N^{-1/2}).

Summarizing, we have

C*^i_{g,q,Nj} = (np)^{-1} (C_gq1 + C_gq2 + C_gq3 + C_gq4) = o_p(N^{-1/2})
uniformly in t_i and t_g. These results take care of the C*^i_{g,q,Nj} terms. Further, C*_1Nj = 0, since J''(u) = 0. And C*_2Nj, C*_3Nj and C*_4Nj are all of order o(N^{-1/2}), since in each one the integrand is bounded and integration is with respect to F^i_mj(x). This completes the argument.
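As a purely illustrative numerical check on the order bound just derived (not part of the original argument), one can evaluate a term of the C_gq4 type for standard normal distribution functions and watch N^{1/2}·C_gq4 shrink as N grows. The grid integration and the choices t_g = 1, t_i = 2 below are mine:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal distribution function via erf."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def c_gq4(N, tg=1.0, ti=2.0, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoid approximation of
    integral (F(y - tg/sqrt(N)) - F(y)) d(F(y - ti/sqrt(N)) - F(y))
    for standard normal F; the integrator contributes
    (f(y - ti/sqrt(N)) - f(y)) dy."""
    h = (hi - lo) / steps
    a, b = tg / math.sqrt(N), ti / math.sqrt(N)
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * h
        val = (Phi(y - a) - Phi(y)) * (phi(y - b) - phi(y))
        total += val * (0.5 if k in (0, steps) else 1.0)
    return total * h

ratios = [math.sqrt(N) * abs(c_gq4(N)) for N in (10**2, 10**4, 10**6)]
# ratios shrink as N grows, consistent with C_gq4 = o(N^(-1/2))
```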