ROBUST, ISOTONIC, AND PRELIMINARY TEST ESTIMATORS IN
SOME LINEAR MODELS UNDER RESTRAINTS
by
Azza Rafik Karmous
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1816T
December 1986
ROBUST, ISOTONIC, AND PRELIMINARY TEST ESTIMATORS IN
SOME LINEAR MODELS UNDER RESTRAINTS
by
Azza Rafik Karmous
A Dissertation submitted to the faculty of The
University of North Carolina at Chapel Hill in partial
fulfillment of the requirements for the degree of Doctor
of Philosophy in the Department of Biostatistics.
Chapel Hill
1986
Approved by:

Reader
I dedicate this work to my parents:
Ekram Rushdy
&
Rafik Karmous
ACKNOWLEDGEMENTS
I have had the honour of writing this dissertation under the direction of Dr.
Pranab Kumar Sen. It has been a very special privilege during this research to work
with him. His time, encouragement, counsel and guidance in the completion of this
dissertation was invaluable, and I wish to express sincere gratitude, appreciation and
admiration to him. I am also very appreciative to Dr. Dana Quade for his priceless
comments especially in the actual writing of the dissertation.
I would like also to thank my other committee members; Drs. Larry Kupper,
David Kleinbaum, and Shrikant Bangdiwala, for the time and support they gave to
this project.
Finally, I wish to thank my typist, Ms. Ann Davis for many hours of the
difficult task of typing this work.
TABLE OF CONTENTS

Chapter I    Introduction and Literature Review                             1
    1.1      Introduction                                                   1
    1.2      Isotonic Regression Analysis                                   3
    1.3      Algorithms for Calculations of g*                              7
    1.4      Tests Against Ordered Alternatives                             9
    1.4.1    Tests Against Ordered Alternatives for the One-Way Layout      9
    1.4.2    Tests Against Ordered Alternatives for the Two-Way Layout     17
    1.5      The M-Procedure                                               23
    1.6      Estimation After Preliminary Tests                            25
    1.7      Proposed Research                                             28

Chapter II   UI M-Estimators of Location Parameters After a Preliminary
             Test in the Univariate Multi-sample Problem                   30
    2.1      Introduction                                                  30
    2.2      Preliminary Notions                                           31
    2.3      Asymptotic Properties of the M-Estimator                      34
    2.4      The UI M-Statistic                                            41
    2.5      The Asymptotic Distribution of M_N^+                          58
    2.6      The Isotonic M-Estimator                                      61
    2.7      The Preliminary Test Estimator (PTE)                          68
    2.8      Another Approach                                              74

Chapter III  Preliminary Test Estimator in the Multivariate One Sample
             Problem                                                       75
    3.1      Introduction                                                  75
    3.2      Preliminary Notions                                           76
    3.3      Asymptotic Distributions of the M-Estimator                   78
    3.4      The UI M-Statistic                                            80
    3.5      The Asymptotic Distribution of L_N^+                          88
    3.6      The Isotonic M-Estimator                                      90
    3.7      The Preliminary Test Estimator                                96

Chapter IV   Preliminary Test Estimator in the General Multivariate
             Linear Model                                                 100
    4.1      Introduction                                                 100
    4.2      Basic Assumptions                                            101
    4.3      Asymptotic Properties of the M-Estimator                     106
    4.4      The UI Statistics                                            109
    4.5      The Isotonic M-Estimator                                     113
    4.6      The Preliminary Test Estimator                               121

Chapter V    Some Numerical Comparisons of The Procedures                 124
    5.1      Introduction                                                 124
    5.2      Basic Estimates                                              125
    5.3      Example 1                                                    126
    5.4      Example 2                                                    132
    5.5      Example 3                                                    138
    5.6      Performance of the Test                                      147
    5.7      Asymptotic Properties of the …                               153
    5.8      Recommendations for Future Research                          160

References                                                                162
ABSTRACT
AZZA R. KARMOUS
ROBUST, ISOTONIC, AND PRELIMINARY TEST ESTIMATORS IN
SOME LINEAR MODELS UNDER RESTRAINTS
(under the direction of PRANAB KUMAR SEN).

Based on S.N. Roy's union-intersection principle, a robust estimate of group means in linear models was developed following a preliminary test against a restricted alternative. The test is suitable for situations where the normality assumption may not be satisfied. It is not greatly affected by the presence of extreme values in the data. It is sensitive to departure from the null hypothesis in the direction of the alternative. The test is generalized to the multivariate linear models. The asymptotic null and nonnull distributions of the estimates as well as the test are derived in the univariate K-sample problem, the multivariate one sample problem, and in the general linear multivariate model. The estimate developed has a smaller mean square error than the other estimates obtained assuming that the alternative is true. Comparisons are especially made with the isotonic regression estimator. Several characteristics of the testing and estimation procedures are explored through a simulation study.
CHAPTER I
INTRODUCTION AND LITERATURE REVIEW
1.1. Introduction
Consider the GMLM

Y = Xθ + E

where Y (N×p) is a matrix of random variables, X (N×(q+1)) is a matrix of known constants assumed to be of full rank (q+1) < N, θ ((q+1)×p) is a matrix of unknown parameters, and E (N×p) is a matrix of random errors, the rows of which are assumed to be independent and identically distributed random vectors. The matrices θ and X are defined as

θ = (θ_0, θ_1, ..., θ_q)' ,  X = (1, X_1, ..., X_q),

where each θ_i' is 1×p, i = 0,...,q. We are interested in estimating θ. There are situations, however, where some information regarding some of the θ_i's, i = 0,...,q, may be available or some hypotheses regarding them may be suspected to hold. It may be known for example that θ_i = 0, i = 1,...,q; in this case Y_1,...,Y_N are identically distributed with location θ_0, and many estimates of θ_0 are available in the literature. On the other hand, θ_0 may be known to be equal to zero and again estimates of the θ_i's, i = 1,...,q, are many and easily available.
When definite information about the θ's may not be available but only suspicions concerning them, often a preliminary test of significance concerning them is made. If the hypothesis of interest is accepted we estimate the parameters in a certain way; otherwise the estimate may be altered. Such an estimate after a preliminary test is usually not unbiased, though it may have a smaller mean square error than an estimate obtained directly assuming the hypothesis to be true.
Two special problems are of interest in this work. The first is that of assuming that θ_0 is equal to zero and suspecting that the θ_i's, i = 1,...,q, are ordered in a certain way. In this case we are interested in estimating the θ_i's, i = 1,...,q, after a preliminary test concerning their ordering. The second is when we suspect a specific order among the θ_i's, i = 1,...,q, but are mainly interested in estimating θ_0.
The need for a test against ordered alternatives arises in psychological work where qualitative characteristics can be ranked but not easily measured. Jonckheere (1954a) mentions, as an example, a hypothetical experiment to test the effect of stress on a task of manual dexterity. The data would be obtained from groups of subjects working under high, medium, low, and minimal stress, the null hypothesis being that stress has no effect on performance and the alternative that increasing stress produces an increasing effect. Similar examples occur in the analysis of social survey data where people are often grouped according to "social class" or "living conditions". An application of such a test in the study of time series was also given by Jonckheere (1954a). The births of siblings might be of three types: normal, abnormal in respect of some characteristic, and stillbirths. The effect of birth rank on the type of birth could be tested when, for instance, the alternative to randomness is that the earlier births tend to be normal, the later births abnormal, with finally the appearance of stillbirths. As another example, consider a clinical trial in which different groups of patients receive varying doses of a drug. The alternative hypothesis may be that the effect of the drug increases with the dose. In a dosage-mortality experiment, groups of animals may be given different doses of a drug and it may be known that the greater the dose the greater the expected proportion that will die.
When there is some implicit ordering among the group means, the usual one-way analysis of variance against the global alternative (that is, not all treatment effects are equal), though remaining valid, is not the most efficient approach to the problem, because the F-ratio is independent of the order in which the group means occur, and it may not be sensitive to an ordered alternative. More powerful tests which take this ordering into account are needed. Taking this information into account may increase the efficiency of the analysis. Isotonic regression analysis provides a solution of this problem in the normal theory setup.
In this chapter we review some of the work done on estimating ordered parameters, both in the univariate and multivariate case, and on estimation after a preliminary test. Also we review major work on robust estimates in these situations.
1.2. Isotonic Regression Analysis
Suppose the lifetime of a certain article has a distribution function F(x); then by
definition F(x) must be a nondecreasing function. The theory of isotonic regression
makes use of the fact that F(x) is isotonic to improve the estimate of F(x).
Barlow et al. (1972) use the adjective "isotonic" as a synonym for "order-preserving"; order restrictions on parameters can be regarded as requiring that the parameter, as a function of an index, be isotonic with respect to some order on the index set. They define the isotonic regression as follows. Let X be the finite set {x_1, ..., x_k} with the simple order x_1 < x_2 < ... < x_k. A real valued function f on X is isotonic if x, y ∈ X and x ≤ y imply f(x) ≤ f(y). Let g be a function on X and w a given positive function on X. An isotonic function g* on X is an isotonic regression of g with weights w with respect to the simple ordering x_1 < x_2 < ... < x_k if it minimizes, in the class of isotonic functions f on X, the sum

Σ_{x∈X} [g(x) − f(x)]² w(x).

The function g* is uniquely determined and is characterized by Σ_{x∈X} [g(x) − g*(x)] f(x) w(x) ≤ 0 for all isotonic functions f on X, with equality when f = g*. The isotonic regression g* of g satisfies Σ_{x∈X} g(x) w(x) = Σ_{x∈X} g*(x) w(x).
An example of replacing the basic estimate by its isotonic regression is given by Barlow et al. (1972). They showed that the isotonic regression of the basic estimate coincides with the maximum likelihood estimate under the order restrictions when sampling from distributions belonging to the exponential family.
Ayer et al. (1955) were among the first to consider this problem. They obtained the sample isotonic regression as the maximum likelihood estimate of simply ordered probabilities in the stimulus-response (bioassay) situation, viewed as the problem of estimating the distribution of the minimum stimulus that will lead to the response. For i = 1,...,n, let N_i independent trials be made on an event with probability p_i, where the p_i satisfy the inequalities p_1 > p_2 > ... > p_n. Estimates p̂_1, ..., p̂_n of p_1,...,p_n, which are calculated in an intuitive way by grouping the observations, were proposed and shown to be the MLE's of the desired probabilities. Application of the results of their paper to the maximum likelihood estimation of ordered parameters was made by Brunk (1955). Brunk considered the case of k populations where the distribution of the i-th population is completely specified by the knowledge of a single distribution parameter θ_i, i = 1,...,k, where the θ_i's satisfy a certain monotonicity condition. The estimates were MLE's of the parameters θ_1, ..., θ_k subject to the monotonicity requirement where the distributions of the populations belong to the exponential family.
Brunk, Ewing and Utz (1957) and Brunk (1955, 1958) recognized that the
MLE's of ordered means of distributions belonging to an arbitrary one-parameter
exponential family are given by the isotonic regression of the sample means with
weights proportional to sample sizes.
The more general problem of maximizing a strictly unimodal function was studied by van Eeden (1956, 1957a,b). She dealt with a range of estimation problems for which isotonic regression provides a solution. Her methods can be used in cases where estimates are required which not only satisfy certain order restrictions but also lie in a specified closed subset of the real line. They are also applicable when no restriction of the latter type is imposed. The MLE's of a finite set of parameters of probability distributions were given under these restrictions.
Brunk (1958) was concerned with the situation in which samples are taken from k populations, each known to belong to a given one-parameter "exponential family". Brunk developed maximum likelihood estimation of the k parameters subject to certain restrictions, by finding the minimizing point on a given intersection of the boundaries of the restricting sets. In particular, when all populations belong to the same exponential family and when the restrictions on the parameters are order restrictions, he observed that the MLE's of the means are independent of the particular exponential family. In a later article, Brunk (1965) showed that when the distributions belong to a common exponential family then the isotonic regression of the sample means, using sample sizes as weights, furnishes the MLE of a regression function μ(·) known to be isotonic on X.
Robertson and Waltman (1968) removed the van Eeden restriction to "strictly"
unimodal functions. They developed a procedure for finding the restricted estimates
and proved that they are MLE's.
Finally, although the class of normal distributions is not a one-parameter exponential family, van Eeden (1957a) and Bartholomew (1959a) showed that isotonic regression furnishes MLE's of ordered means in sampling from normal distributions with distinct variances provided that the ratios of the variances are known.
Consistency properties for isotonic regression have been given by Ayer et al. (1955). A generalization appears in Brunk (1955), and both are subsumed in Brunk (1958), a corrected version of which appears in Brunk (1970).
A characteristic feature of the isotonic estimators is that they are step functions.
In many problems smoothness is just as desirable as isotonicity, so that while the step
function may be isotone, its lack of continuity prevents it from being widely accepted
as a satisfactory estimator.
Wegman (1980) reviewed briefly the literature on nonparametric regression. He considered the geometric structure of the isotonic regression. He showed that, subject to suitable constraints, the isotonic inference approach is an optimal solution to the nonparametric regression problem. He explains that if the function f of X is not isotone, it is inappropriate to use the isotonic restriction. In these situations, when the assumption of isotonicity is not appropriate, splines may be used as regression estimators where smoothness requirements are appropriate.
More recently a good deal of work concerning the use of splines in statistical estimation problems has appeared in the literature. While a spline fit satisfies the required smoothness properties, it may not be isotone as desired. Wright and Wegman (1980) present a combined approach. They examined a regression problem and showed that the isotonic-type spline provides a strongly consistent solution.
We will not discuss spline theory in this work; we just mention splines as one of the solutions to the nonparametric regression problem.
1.3. Algorithms for Calculation of g*

The problem of calculating g* given g, the weights w, and a certain order on X is a type of quadratic programming problem. Next we will summarize some algorithms for calculating g*.
The Pool-Adjacent-Violators Algorithm was discussed by Ayer et al. (1955). The isotonic regression g* partitions X into sets on which g* is constant, i.e. into level sets for g*, called solution blocks. On each of these solution blocks the value of g* is the weighted average of all values of g over the block, using weights w. The algorithm starts with the finest possible partition into blocks, namely the individual points of X, and joins the blocks together step by step until the final partition is reached.
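The pooling step just described can be sketched in a few lines of Python. This is an illustrative implementation for the simple order, not code from this dissertation; the function and variable names are invented here:

```python
def pava(g, w):
    """Pool-Adjacent-Violators for the simple order x1 < x2 < ... < xk.

    Starts from the finest partition (one point per block) and merges
    adjacent blocks whose weighted averages violate monotonicity, until
    no violation remains; each point then receives its block average.
    """
    blocks = []  # each block stores [weighted sum, total weight, n points]
    for gi, wi in zip(g, w):
        blocks.append([gi * wi, wi, 1])
        # merge while the last two block averages are out of order
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, wt, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += wt
            blocks[-1][2] += c
    out = []
    for s, wt, c in blocks:
        out.extend([s / wt] * c)  # expand block averages back to points
    return out
```

Note that the output preserves the weighted sum of g, as the characterization of g* in section 1.2 requires: for g = (1, 3, 2, 4) with unit weights, the result is (1, 2.5, 2.5, 4).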
The Up-and-Down Blocks Algorithm developed by Kruskal (1964), is a version of
the Pool-Adjacent-Violators algorithm which can be applied in the case of simple
order.
Before going any further let us first explain some terminology:

An element x in the partially ordered set X is an immediate predecessor of an element y if x ≤ y but there is no z in X distinct from x and y such that x ≤ z ≤ y.

The root is a single element which has no predecessor.

A subset L of X is a lower set with respect to the partial order "≤" if y ∈ L, x ∈ X, x ≤ y imply x ∈ L.

If a is a real number and F is a subset of X then the excess of F over a is

V_a(F) = Σ_{x∈F} (g(x) − a)·w(x).
The Minimum Violator Algorithm applies when each element except the root has exactly one immediate predecessor. This algorithm was developed by Thompson (1962).
The Minimum Lower Sets Algorithm is given in Brunk, Ewing and Utz (1957) and in Brunk (1955). It depends on selecting the lower set of minimum weighted average of g over a subset of X. If more than one set has the same lowest average, their union is taken; this is the set on which g* assumes its smallest value.
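For the simple order, every lower set of the remaining points is a prefix, so the algorithm reduces to repeatedly taking the prefix with the smallest weighted average (the longest such prefix on ties, corresponding to the union above). A hypothetical Python sketch under that assumption:

```python
def min_lower_sets(g, w):
    """Minimum Lower Sets algorithm for the simple order x1 < ... < xk.

    Repeatedly find the prefix of the remaining points with the smallest
    weighted average of g (the longest prefix on ties); g* equals that
    average on the whole prefix, which becomes the next solution block.
    """
    out, i, k = [], 0, len(g)
    while i < k:
        best_avg, best_j = None, i
        s = wt = 0.0
        for j in range(i, k):
            s += g[j] * w[j]
            wt += w[j]
            avg = s / wt
            if best_avg is None or avg <= best_avg:  # <= keeps longest tie
                best_avg, best_j = avg, j
        out.extend([best_avg] * (best_j - i + 1))
        i = best_j + 1
    return out
```

On any input it reproduces the Pool-Adjacent-Violators solution, e.g. (1, 3, 2, 4) with unit weights again gives (1, 2.5, 2.5, 4).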
The Minimax Order Algorithm: the object of the algorithm is to produce a simple order in which each solution block has the property that the excesses of any two adjacent independent sub-blocks are in nondecreasing order. The algorithm interchanges independent adjacent sub-blocks whose excesses are in the wrong order. When a stage is reached for which the new simple order has the same solution partition as the previous one, the solution for this simple order is the solution of the original order. This algorithm is discussed in detail by Barlow et al. (1972).
1.4. Tests Against Ordered Alternatives
The testing problem is very similar to the estimation problem. Often it is necessary to test a hypothesis against a restricted alternative, i.e.,

H_0: μ ∈ Ω_0 against H_1: μ ∈ Ω_1,

where Ω_0 and Ω_1 are subsets of the parameter space Ω, such that Ω_0 ∪ Ω_1 is a proper subset of Ω. When a test is specifically constructed for the restricted problem it is usually more powerful than the test derived for the unrestricted problem

H_0: μ ∈ Ω_0 against H_1: μ ∉ Ω_0.

A variety of parametric, as well as nonparametric, tests for ordered alternatives is available in the literature. Basically the problem is testing the null hypothesis that k populations have equal means against the alternative that the means satisfy order restrictions. That is, the problem is to test the null hypothesis

H_0: μ_1 = μ_2 = ... = μ_k,     (4.1)

where μ_i is the i-th treatment mean, against the ordered alternative

H_1: μ_1 ≤ μ_2 ≤ ... ≤ μ_k,     (4.2)

where at least one inequality is strict and the subscripts 1 to k are arranged according to the prespecified alternative.
1.4.1. Tests Against Ordered Alternatives For The One-Way Layout

There are many proposed tests, parametric as well as nonparametric, to solve this problem. An early attempt, a generalization of the Wilcoxon-Mann-Whitney two-sample test with ordered alternatives in mind, was given by Whitney (1951) for the three-sample case. This was followed by Terpstra (1952) and independently by Jonckheere (1954a). Jonckheere proposed the test statistic J = Σ_{u<v} U_uv, which is the sum of the k(k−1)/2 Mann-Whitney counts U_uv, u < v, for comparing treatments u and v. If J is too large, the hypothesis of equal treatment effects is rejected in favor of the ordered alternatives. Tables for evaluating the significance of J are given by Odeh (1971) for small k. For large samples J is approximately normally distributed. Hollander and Wolfe (1973) showed that the approximation is best for large values of min(n_1, n_2, ..., n_k), that is, for min(n_1, n_2, ..., n_k) tending to infinity, where n_i is the sample size for the i-th sample. The exact test is powerful when the treatment means are monotonically increasing.
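A sketch of J together with its large-sample normal approximation under H_0; the null mean and variance formulas are the standard ones (as in, e.g., Hollander and Wolfe (1973)), the code assumes no ties, and the names are invented for illustration:

```python
import math

def jonckheere(samples):
    """Jonckheere's J: sum of the k(k-1)/2 Mann-Whitney counts U_uv, u < v,
    plus the standardized value under H0 (no ties assumed)."""
    k = len(samples)
    j_stat = 0
    for u in range(k):
        for v in range(u + 1, k):
            # U_uv counts pairs (x, y), x from sample u, y from sample v, x < y
            j_stat += sum(x < y for x in samples[u] for y in samples[v])
    n = [len(s) for s in samples]
    N = sum(n)
    mean = (N * N - sum(ni * ni for ni in n)) / 4.0
    var = (N * N * (2 * N + 3) - sum(ni * ni * (2 * ni + 3) for ni in n)) / 72.0
    z = (j_stat - mean) / math.sqrt(var)
    return j_stat, z
```

Large positive z favors the ordered alternative; for three perfectly separated increasing samples every pairwise count attains its maximum.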
A parametric approach to the problem of ordered alternatives when the rank order of the means is known was considered by Bartholomew (1959a,b, 1961a). He derived the likelihood ratio test under the assumption of normality to test for equality against ordered means. He also developed a generalized form of the χ²-test, denoted by χ̄², to utilize the information about the order between some or all of the means. The test is derived in section 3.2 of Barlow et al. (1972). Let x̄_i be the i-th sample mean. The population variances, σ_i², associated with each sample are assumed to be known. The maximum likelihood estimates of μ_i under H_0 are μ̂_i = μ̂ for all i, where

μ̂ = Σ_{i=1}^{k} w_i x̄_i / Σ_{i=1}^{k} w_i  and  w_i = n_i σ_i^{-2}.

Under H_1 in (4.2) the maximum likelihood estimates are μ̂_i*, i = 1,2,...,k, where μ̂_i* is the isotonic regression of the x̄_i. Under H_0 in (4.1) and the assumption that all populations have the same variance σ², the likelihood ratio test rejects H_0 for large values of

χ̄²_k = Σ_{i=1}^{k} w_i (μ̂_i* − μ̂)².

A method for calculating the μ̂_i*, which involves amalgamating some of the original k groups into subsets, is given in Barlow et al. (1972). Under H_0,

P{χ̄²_k > c} = Σ_{l=2}^{k} P(l,k)·P(χ²_{l−1} > c), c > 0, and P(χ̄²_k = 0) = P(1,k),

where P(l,k) is the probability that the isotonic regression function μ̂_i* takes exactly l distinct values, and χ²_{l−1} is a chi-squared random variable with l−1 degrees of freedom. Values of P(l,k) are tabulated in Barlow et al. (1972).
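The statistic can be computed directly from the sample means by isotonizing them with weights w_i = n_i/σ² and measuring their weighted dispersion about the pooled mean. A hypothetical Python sketch (the inlined pooling loop is one way to obtain the μ̂_i*; names are illustrative):

```python
def chibar_sq(means, n, sigma2):
    """Bartholomew's chi-bar-squared for H0: mu_1 = ... = mu_k against the
    simple order, with known common variance sigma2:
        chi2_k = sum_i w_i (mu_i* - mu_hat)^2,  w_i = n_i / sigma2."""
    w = [ni / sigma2 for ni in n]
    mu_hat = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    # isotonic regression of the means via pool-adjacent-violators
    blocks = []
    for m, wi in zip(means, w):
        blocks.append([m * wi, wi, 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, wt, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += wt
            blocks[-1][2] += c
    mu_star = []
    for s, wt, c in blocks:
        mu_star.extend([s / wt] * c)
    return sum(wi * (ms - mu_hat) ** 2 for wi, ms in zip(w, mu_star))
```

When the sample means are already ordered no pooling occurs and the statistic reduces to the usual weighted between-group sum of squares; when they are all equal it is zero.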
Chacko (1963) proposed a rank analogue of the normal theory ordered alternatives test developed by Bartholomew (1959, 1961a), for the case of equal sample sizes. The procedure is as follows. Let X_ij (i = 1,2,...,k; j = 1,2,...,n) be independent observations on the k samples, where X_ij is the j-th observation from the i-th sample. The observations are ranked from smallest to largest, where r_ij is the rank of X_ij among all N observations, N = nk. The μ̄_t*'s are now calculated from the ranks. The form of the test statistic is

χ̄²_rank = (12 / (N(N+1))) Σ_{t=1}^{l} N_t [μ̄_t* − (N+1)/2]²,

where N = Σ_{t=1}^{l} N_t. The estimates μ̄_t* for the nonparametric case are calculated from ranks obtained in an overall 1 to N ranking, that is, the ranks over all the observations, not separate ranks within each sample, while in the parametric case they depend on some pooling of adjacent within-sample means. Under H_0 the limiting distribution of χ̄²_rank is the same as the distribution of Bartholomew's χ̄² (see Barlow et al. (1972)). The χ̄² test is appropriate when the alternative states that the treatment means are monotonically increasing. Chacko (1963) explains that when treatment means do not increase linearly, the χ̄²_rank test will be more powerful than Jonckheere's test.
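A sketch of Chacko's statistic: rank all N observations together, isotonize the mean ranks with the sample sizes as weights, and form the weighted dispersion about (N+1)/2. The code assumes no ties; summing over samples is equivalent to summing over the l solution blocks, since all samples in a block share one value. Names are illustrative:

```python
def chacko(samples):
    """Chacko's rank analogue of Bartholomew's test, using an overall
    1-to-N ranking (no ties assumed)."""
    n = [len(s) for s in samples]
    N = sum(n)
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    rank_sums = [0.0] * len(samples)
    for r, (_, i) in enumerate(pooled, start=1):  # overall ranks 1..N
        rank_sums[i] += r
    mean_ranks = [rs / ni for rs, ni in zip(rank_sums, n)]
    # pool-adjacent-violators on the mean ranks with weights n_i
    blocks = []
    for m, wi in zip(mean_ranks, n):
        blocks.append([m * wi, wi, 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, wt, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += wt
            blocks[-1][2] += c
    iso = []
    for s, wt, c in blocks:
        iso.extend([s / wt] * c)
    return (12.0 / (N * (N + 1))) * sum(
        wi * (m - (N + 1) / 2.0) ** 2 for wi, m in zip(n, iso))
```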
Hogg (1965) considered a parametric procedure similar to that of Jonckheere (1954a). The test statistic is Σ_{i<j} (X̄_j − X̄_i) for testing μ_i = μ_j against μ_i < μ_j. Some of the differences X̄_j − X̄_i, i < j, should have more weight than others depending on which inequality μ_i < μ_j we want to detect.
Puri (1965) generalized Jonckheere's test to a class of tests including a normal scores analogue of Jonckheere's test. Puri described a family of k-sample rank tests. Let X_i be the vector (X_{i1}, ..., X_{in_i}), i = 1,2,...,k, and consider all k(k−1)/2 samples in pairs. Let Z_ν(i,j) = 1 if the ν-th smallest observation from the combined i-th and j-th samples is an X_i observation and otherwise Z_ν(i,j) = 0. Let Z'_ν(i,j) = −1 if the ν-th smallest observation from the combined i-th and j-th samples is an X_j observation and otherwise Z'_ν(i,j) = 0. Denote h_ij = T_ij^(i) + T_ij^(j), where

T_ij^(i) = n_ij^{-1} Σ_{ν=1}^{n_ij} E_ν(i,j) Z_ν(i,j)  and  T_ij^(j) = n_ij^{-1} Σ_{ν=1}^{n_ij} E_ν(i,j) Z'_ν(i,j),

n_ij = n_i + n_j, and the E_ν(i,j) are constants satisfying certain restrictions. Then the test statistic is

V = Σ_{i=1}^{k-1} Σ_{j=i+1}^{k} n_i n_j h_ij.

When E_ν(i,j) = ν/(n_i + n_j) the test coincides with Jonckheere's test. Asymptotic normality and the asymptotic distribution of the family of statistics are discussed in Puri (1965).
13
Shorack (1967) has derived a fundamental theorem which represents a generalization of Bartholomew's results (19S9a, 1961b). Shorack extended Chacko's test to
the case of unequal sample sizes.
A further generalization was given by Tryon and Hettmansperger (1973). They followed the approach suggested by Hogg (1965) to generalize Puri's family of statistics by including weighting coefficients to form arbitrary linear combinations

T_N = Σ_{i=1}^{k-1} Σ_{j=i+1}^{k} a_ij T_ij,  a_ij > 0,

where the T_ij are Puri's statistics. Consider the subclass of statistics (T_12, T_23, ..., T_{k-1,k}), and assume equal sample sizes. The authors prove that for each statistic T_N there corresponds an equivalent statistic T_N', where

T_N' = Σ_{v=1}^{k-1} a_v T_{v,v+1}  and  a_v = Σ_{i=1}^{v} Σ_{j=v+1}^{k} a_ij,  v = 1,...,k−1.

The statistics are equivalent in the sense that, when standardized under H_0, T_N − T_N' → 0, and the Pitman efficiency of T_N' with respect to T_N is one. Also T_N' is a simpler statistic, requiring the computation of k−1 rather than (k choose 2) two-sample statistics.
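The reduction to adjacent pairs is easy to illustrate. In the sketch below, scaled Mann-Whitney counts stand in for Puri's general scored statistics T_{v,v+1} (a simplifying substitution, not the general construction); the function name and the dictionary of pairwise weights are hypothetical:

```python
def adjacent_statistic(samples, a):
    """Collapse the pairwise weights a[(i, j)], i < j, into the k-1
    adjacent weights a_v = sum_{i <= v < j} a[(i, j)], then form
        T_N' = sum_v a_v T_{v, v+1},
    with Mann-Whitney proportions as the two-sample statistics."""
    k = len(samples)
    a_adj = [sum(a[(i, j)] for i in range(v + 1) for j in range(v + 1, k))
             for v in range(k - 1)]
    t = []
    for v in range(k - 1):
        u = sum(x < y for x in samples[v] for y in samples[v + 1])
        t.append(u / (len(samples[v]) * len(samples[v + 1])))  # scaled to [0, 1]
    return sum(av * tv for av, tv in zip(a_adj, t))
```

With k = 3 and all pairwise weights equal to one, the adjacent weights are a_1 = a_2 = 2, so only the two adjacent two-sample statistics are ever computed.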
Lastly, Wallenstein (1980) proposed a statistic based on sums of Smirnov-type statistics, and provided evidence that the test based on this statistic has high power for a wide range of alternatives suggested by this problem. The statistic is based on summing over k−1 adjacent pairs of two-sample one-sided Smirnov-type statistics.
An approach to the multivariate version of this problem was considered by Kudo (1963). Kudo derived the likelihood ratio statistic for testing that the mean vector of a k-variate normal distribution is null against an orthant restriction, given that the covariance matrix is completely known. Let X_1,...,X_N be independent, identically distributed N_k(θ, Σ) random variables with N > k and Σ known. Kudo has shown that the likelihood ratio test rejects H_0: θ = 0 in favor of H_1: θ > 0 for large values of

χ² = N [X̄' Σ^{-1} X̄ − (X̄ − θ̂)' Σ^{-1} (X̄ − θ̂)],     (4.3)

where θ̂ is the maximum likelihood estimator of θ in the region Ω = {θ: θ ≥ 0}.

Geometrically θ̂ can be determined as follows. Let J be any subset of K = {1,...,k}, and J' its complement. If for some J, φ ⊂ J ⊂ K, we let L(J) be the set of points of the nonnegative orthant with positive coordinates on J and zero coordinates on J', then the union of all sets L(J) is a disjoint and exhaustive partitioning of the nonnegative orthant R_+^k = {X ∈ R^k: X ≥ 0}. Project X̄, with respect to the inner product Σ^{-1}, onto each of the 2^k linear subspaces generated by the sets L(J), and set θ̂ equal to the projection which falls in R_+^k and maximizes χ² in (4.3). Because θ̂ is a projection of X̄ with respect to Σ^{-1}, χ² in (4.3) simplifies to

χ² = N θ̂' Σ^{-1} θ̂.     (4.4)

The distribution of χ² involves 2^k terms, which complicates the determination of a critical value.

Kudo outlined an intuitive algorithm for calculating χ² which avoids the examination of all 2^k projections of X̄. For each set J, φ ⊂ J ⊂ K, let X̄(J), X̄(J') and

Σ = [ Σ(JJ)  Σ(JJ') ; Σ(J'J)  Σ(J'J') ]

denote the corresponding partitions of X̄ and Σ. Then define for each set J,

X̄(J:J') = X̄(J) − Σ(JJ') Σ(J'J')^{-1} X̄(J'),
Σ(JJ:J') = Σ(JJ) − Σ(JJ') Σ(J'J')^{-1} Σ(J'J).

Kudo was able to show that χ² can be expressed as a sum over the sets J of quadratic forms in the X̄(J:J'), each multiplied by an indicator function, where I(A) represents the indicator function for the set A.
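The 2^k-projection construction can be carried out numerically by examining every candidate support set J: the candidate supported on J is exactly X̄(J:J') defined above (zero elsewhere), and among the candidates falling in the nonnegative orthant, θ̂ is the one nearest to X̄ in the Σ^{-1} metric. A hypothetical NumPy sketch, not Kudo's shortcut algorithm:

```python
import numpy as np
from itertools import combinations

def kudo_chibar(xbar, sigma, n):
    """Brute-force projection of xbar onto the nonnegative orthant in the
    Sigma^{-1} metric over all 2^k supports J, then chi2 = n * theta' Sigma^{-1} theta
    (the simplified form (4.4), valid because theta_hat is a cone projection)."""
    k = len(xbar)
    siginv = np.linalg.inv(sigma)
    best, theta_hat = np.inf, np.zeros(k)
    for r in range(k + 1):
        for J in combinations(range(k), r):
            theta = np.zeros(k)
            if r == k:
                theta = xbar.copy()
            elif r > 0:
                Ji = list(J)
                Jc = [j for j in range(k) if j not in J]
                # X(J:J') = X(J) - Sigma(JJ') Sigma(J'J')^{-1} X(J')
                theta[Ji] = xbar[Ji] - sigma[np.ix_(Ji, Jc)] @ np.linalg.solve(
                    sigma[np.ix_(Jc, Jc)], xbar[Jc])
            if np.all(theta >= 0):          # candidate falls in R_+^k
                d = xbar - theta
                dist = d @ siginv @ d
                if dist < best:
                    best, theta_hat = dist, theta
    return n * theta_hat @ siginv @ theta_hat
```

With Σ = I the projection simply zeroes the negative coordinates of X̄, so for X̄ = (−1, 1) the statistic equals N·1.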
Under H_0: θ = 0, the distribution of χ² in (4.4) reduces to

P(χ² ≤ C) = Σ_{j=0}^{k} w_j P(χ_j² ≤ C),

where, for each j, 0 ≤ j ≤ k, χ_j² represents a chi-squared random variable with j degrees of freedom and the w_j, j = 0,...,k, are non-negative weights that sum to one.
The results of Kudo may be considered as a generalization of those of Bartholomew. Under H_1 the differences μ_2 − μ_1, ..., μ_k − μ_{k−1} should be positive; they can be estimated by the successive differences Y_1 = X̄_2 − X̄_1, ..., Y_{k−1} = X̄_k − X̄_{k−1}. The variance-covariance matrix of (Y_1, ..., Y_{k−1}) can be calculated when the common variance of the k samples is known; then the procedure of Kudo can be applied.
Nüesch (1966) attempted to develop the likelihood ratio test for the orthant alternative problem. Shorack (1967) pointed out that Nüesch's results are not quite correct except when the dispersion matrix Λ = σ²Λ_0, where σ² is unknown and Λ_0 is completely specified.
Perlman (1969) applied the likelihood ratio principle to the general problem of testing {H_0: μ ∈ P_1} against {H_1: μ ∈ P_2} for the multinormal mean vector μ, when the dispersion matrix Σ is completely unknown, and where P_1 and P_2 are closed, "positively homogeneous" sets with P_1 ⊂ P_2. A set P ⊂ R^p is said to be positively homogeneous if X ∈ P implies that cX ∈ P for all c > 0. Perlman showed that the power of the likelihood ratio test approaches one as the distance from the hypothesis H_0 becomes large. However, the distribution function of the test statistic usually depends on the unknown Σ.
For the case of a bivariate sample, Shirahata (1978) developed a parametric approach to the problem considered by Kudo (1963). He constructed a test statistic for the orthant alternative from the likelihood ratio conditional probability principle (LRCP). Shirahata supplied diagrams on how to calculate the statistic for the bivariate normal, one-sided mean problem, with known covariance structure. Monte Carlo results were given showing that this approach is more powerful than Kudo's (1963) when the population correlation is negative. However, the LRCP statistic performs poorly when the population correlation is positive, and the likelihood ratio statistic dominates in this case.
Chinchilli and Sen (1981a) extended the union-intersection principle of Roy (1957) in order to derive asymptotically distribution-free tests for a class of restricted alternative problems in the general linear multivariate model. They introduced a class of rank statistics which arise from the general linear multivariate model. Their paper dealt with the general existence theory and the structure of the union-intersection rank tests for restricted alternatives. The asymptotic multinormality of this class of rank statistics under the null hypothesis and under contiguous alternatives is discussed in their paper.

An extension of their work to the orthant restriction problem appeared in Chinchilli and Sen (1981b). They developed a class of linear rank statistics for testing against the orthant alternative for the general linear multivariate model. The class of statistics is an extension of the union-intersection principle. The orthant-restricted rank test is shown to be asymptotically more powerful than the unrestricted rank test when certain regularity conditions hold, and when the true parameter value is within this orthant.

Following Chinchilli and Sen (1981a,b), Boyd and Sen (1983) proposed a distribution-free rank test for ordered alternatives in the simple linear model setup. They combined the locally most powerful rank test approach with the union-intersection principle to propose a genuinely distribution-free test in the univariate setup. The asymptotic distribution of the test statistic under the null hypothesis when the sample sizes are finite, supplemented by some numerical and simulation results, is given in Boyd and Sen (1983).
1.4.2. Tests Against Ordered Alternatives For The Two-Way Layout
Again, one of the early papers was written by Jonckheere (1954b). Consider a
complete blocks design with k treatments and n blocks. Observations are ranked
within blocks, and the test statistic is the sum of the Kendall coefficients of correlation between the predicted ordering of the treatments and the rankings of each of the
18
n blocks. Jonckheere derives the sampling distribution of the test statistic and shows
that its large sample distribution is normal.
Jonckheere also considered the case of more than one observation per cell, where
the number of observations per cell is the same within each treatment group. The
treatments are ordered and assigned appropriate tied rankings based on the number
of observations per cell, e.g., if the r-th treatment group contains $l_r$ observations per
cell, then the r-th treatment will receive predicted rank
$$\frac{1}{2}(l_r + 1) + \sum_{i=1}^{r-1} l_i.$$
Observations are ranked within each block and the test statistic is the sum of Kendall
coefficients of correlation between the observed ranking within each block and the
predicted ranking.
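To make the block-wise construction concrete, here is a small sketch (our own illustration, not from the dissertation; the data are hypothetical and ties are assumed absent) that sums Kendall's correlation between the predicted treatment order and each block's observations:

```python
def kendall_tau(x, y):
    """Kendall rank correlation between two equal-length sequences (no ties)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def jonckheere_block_statistic(blocks):
    """Sum over blocks of Kendall's tau between the hypothesized
    treatment ordering 1 < 2 < ... < k and the block's observations."""
    predicted = list(range(1, len(blocks[0]) + 1))
    return sum(kendall_tau(predicted, block) for block in blocks)

# Three blocks, four treatments, responses roughly increasing with treatment:
data = [[1.2, 2.5, 2.1, 3.8],
        [0.7, 1.9, 2.8, 3.1],
        [1.0, 1.4, 2.2, 2.9]]
print(jonckheere_block_statistic(data))  # close to the maximum n = 3
```

Values of the statistic near the number of blocks indicate strong agreement with the predicted ordering.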
Page (1963) proposed a test statistic for the completely randomized blocks
design based on the Spearman rank correlation between the hypothesized ordering
and the treatment sum of ranks, where the ranks are determined by ranking within
blocks. He discussed the usefulness of this type of procedure and demonstrated the
efficiency loss which occurs when the experimenter fails to take into account the
directional hypothesis. Let
$$L = \sum_{j=1}^{k} P_j \Big(\sum_{i=1}^{n} R_{ij}\Big),$$
where $P_j$ is the predicted rank of the j-th treatment, $j = 1,\ldots,k$, and $\sum_{i=1}^{n} R_{ij}$ equals the sum of ranks observed in the j-th
treatment. Large values of L reflect the ordered alternative. Exact tables are
included in Page's paper. For large samples L is asymptotically normal.
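The statistic L can be computed directly; the sketch below is our own minimal implementation (function names are ours), assuming no ties within a block:

```python
def within_block_ranks(block):
    """Ranks 1..k of the observations in one block (assumes no ties)."""
    order = sorted(range(len(block)), key=lambda j: block[j])
    ranks = [0] * len(block)
    for r, j in enumerate(order, start=1):
        ranks[j] = r
    return ranks

def page_L(blocks, predicted=None):
    """Page's L = sum_j P_j * (rank sum of treatment j over blocks),
    with predicted ranks P_j defaulting to 1..k."""
    k = len(blocks[0])
    if predicted is None:
        predicted = list(range(1, k + 1))
    rank_sums = [0] * k
    for block in blocks:
        for j, r in enumerate(within_block_ranks(block)):
            rank_sums[j] += r
    return sum(p * s for p, s in zip(predicted, rank_sums))

print(page_L([[1, 2, 3], [1, 3, 2]]))  # 27
```

With n = 2 blocks and k = 3 treatments, perfect agreement with the predicted order gives the maximum L = 28; the example above falls one short because the second block reverses treatments 2 and 3.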
Hollander (1967) introduced a test statistic Y based on a sum of Wilcoxon
signed-rank statistics. Let
Xij
be a random variable denoting the i-th block and the
j-th treatment combination in a randomized complete blocks design with n blocks and
Let $Y_{uv}^{(i)} = |X_{iu} - X_{iv}|$, and let $R_{uv}^{(i)}$ be the rank of $Y_{uv}^{(i)}$ in the ranking of
$Y_{uv}^{(1)}, \ldots, Y_{uv}^{(n)}$. Also let $T_{uv} = \sum_{i=1}^{n} R_{uv}^{(i)} W_{uv}^{(i)}$, where $W_{uv}^{(i)} = 1$ if $X_{iu} < X_{iv}$ and 0 otherwise. Then $Y = \sum_{u<v} T_{uv}$.
Hollander discusses the asymptotic normality of Y
and notes that Y is neither distribution-free nor asymptotically distribution-free.
Shorack (1967) extends the Bartholomew (1959a, 1961b) and Chacko (1963) procedures. For the nonparametric case, each observation $X_{ij}$ is replaced by its rank in
the i-th row; then the amalgamation (grouping) process is applied to the average
ranks $H_j$, $H_j = \sum_i r_{ij}/I$, where I equals the number of levels of the first factor, to
obtain m distinct quantities $\bar H_{[1]}, \ldots, \bar H_{[m]}$. The test statistic is
$$\bar\chi_r^2 = [12 I / (J(J+1))] \sum_{i=1}^{m} t_i\, [\bar H_{[i]} - (J+1)/2]^2,$$
where $t_i$ is the number of averages amalgamated into $\bar H_{[i]}$, and its asymptotic distribution is discussed by Shorack (1967). In the parametric
case, the likelihood ratio tests are derived by minimizing the sum of the amalgamated
squares for the desired effect, subject to the constraint specified by the ordered alternative.
Puri and Sen (1968) generalize the results of Hollander to a Chernoff-Savage
class of tests. Let $X_{iuv}^* = X_{iu} - X_{iv}$, $u < v = 1,\ldots,k$, $i = 1,\ldots,n$. Consider the random
variables $T_{uv}^* = n^{-1}\sum_\alpha E_{n\alpha} Z_{uv\alpha}$, where $Z_{uv\alpha} = 1$ or 0 as the $\alpha$-th smallest observation
among the $|X_{iuv}^*|$ comes from a positive or a negative $X^*$, and $E_{n\alpha}$ is the expected value
of the $\alpha$-th order statistic of a sample of size n from a distribution
$W^*(x) = W(x) - W(-x)$, $x > 0$, and 0 if $x < 0$. Assuming that W(x) satisfies certain
conditions, the test statistic $V_n$ is asymptotically normal. If $W^*(x)$ is uniform over
(-1,1), the statistic $V_n$ reduces to Hollander's (1967) Y-statistic.
A competitor of Page's test has been proposed by Pirie and Hollander (1972).
They introduced a normal scores test for ordered alternatives among k treatments in
the randomized block design. Let $R_{ij}$ denote the within-blocks rank of $X_{ij}$, and let
$D_j$ be the expected value of the j-th order statistic of a random sample of size k from
a standard normal distribution (normal scores). The test statistic is
$$W_n = \sum_{i=1}^{n}\sum_{j=1}^{k} j\, D_{R_{ij}}.$$
For k = 2 or 3, the test is equivalent to Page's test. For $k \ge 4$
the normal scores test showed better efficiency properties than Page's test for normal,
uniform, and exponential populations. Pirie and Hollander include tables for the null
distribution of $W_n$; furthermore, they showed that for large samples $W_n$ has a normal
distribution with $E_0(W_n) = 0$ and $\mathrm{Var}_0(W_n) = [nk(k+1)/12]\sum_{j=1}^{k} D_j^2$.
Pirie (1974) considered two classes of rank tests for ordered alternatives in the
randomized block design with k treatments and n blocks. Pirie refers to these classes
as tests based on among-blocks ranking (A-tests), and tests based on within-blocks
ranking (W-tests). Based on the asymptotic relative efficiency results for A-tests with
respect to W-tests, their distribution, and the ease of computation, Pirie concludes
that W-tests would more often be preferred to A-tests.
Hettmansperger (1975) proposed a nonparametric test for ordered alternatives in
the randomized block design, with more than one observation per cell. This test is an
extension of Page's test statistic to the general case of unequal numbers of observations per cell. The statistic is
$$T = \sum_{j=1}^{k} j \sum_{i=1}^{n} (R_{ij\cdot}/n_{ij}) = \sum_{i=1}^{n}\sum_{j=1}^{k} j\,(R_{ij\cdot}/n_{ij}),$$
where $R_{ij\cdot}$ is the rank cell sum and $n_{ij}$ is the number of observations in the (i,j)-th
cell. Large values of T lead to the rejection of the null hypothesis. T is shown to be
asymptotically standard normal under the null hypothesis. In the case where $n_{ij} = 1$
for all i and j, T reduces to Page's (1963) statistic, and in the case $n_{ij} > 1$ with a single block it
becomes the Tryon and Hettmansperger (1973) statistic for ordered alternatives in the
one-way design.
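The rank-cell-sum statistic above can be sketched as follows; this is our own minimal implementation (names are ours), assuming no ties within a block:

```python
def hettmansperger_T(blocks):
    """blocks[i][j] is the list of observations in cell (i, j).
    Ranks are assigned within each block over all its observations;
    R_ij. is the rank sum of cell (i, j), n_ij its size."""
    total = 0.0
    for block in blocks:
        flat = sorted(x for cell in block for x in cell)
        rank = {x: r for r, x in enumerate(flat, start=1)}  # no ties assumed
        for j, cell in enumerate(block, start=1):
            cell_rank_sum = sum(rank[x] for x in cell)
            total += j * cell_rank_sum / len(cell)
    return total

print(hettmansperger_T([[[1], [2], [3]]]))  # 14.0, the Page-type sum when all n_ij = 1
```

With one observation per cell the cell rank sum is the within-block rank itself, so T coincides with the Page-type sum described in the text.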
De (1975) extended the union-intersection principle to formulate a test using
aligned ranks. Define the aligned observations as $Y_{ij} = X_{ij} - \bar X_i$, where
$\bar X_i = \sum_{j=1}^{k} X_{ij}/k$, $i = 1,\ldots,n$; $j = 1,\ldots,k$. Let $R_{ij}$ be the rank of
$Y_{ij}$, $i = 1,\ldots,n$; $j = 1,\ldots,k$, when the $nk\,(=N)$ $Y_{ij}$'s are arranged in order of magnitude.
For every N a sequence of rank scores $E_N = (E_{N1}, \ldots, E_{NN})$ is defined. Consider a
class of statistics $T_N = (T_{N1}, \ldots, T_{Nk})$, where $T_{Nj} = n^{-1}\sum_{i=1}^{n} E_{N R_{ij}}$, $j = 1,\ldots,k$. De
defines a reference class of test statistics $S_{b,N}$, based on the linear combinations
$\sum_{j=1}^{k} b_j T_{Nj}$, where $\sum_{j=1}^{k} b_j = 0$ and $P_n$ represents the set of intrablock ordered ranks. Under the
null hypothesis, $S_{b,N}$ is asymptotically N(0,1).
A test against ordering of treatment effects is derived using Roy's union-intersection principle, where the test statistic is $Q = \sup_{b\in B} S_b$ and B is a set of values
of b such that $\theta \subset \bigcup_{b\in B} \theta_b$, where $\theta = \{\theta : \theta_1 \le \cdots \le \theta_k$, with at least one inequality
being strict$\}$ and $\theta_i$ is the i-th treatment effect. The limiting null distribution satisfies
$$\lim P_{H_0}(Q > C) = \sum_{l=1}^{k-1} P(\chi_l^2 > C)\, P(l,k),$$
where the weights $P(l,k)$ are tabulated in
Barlow et al. (1972) and Chacko (1963).
Skillings and Wolfe (1977) presented a class of tests based on weighted sums of
block statistics. They formed a statistic from Puri's (1965) family of test statistics for
ordered alternatives in a one-way layout on each block, and then used a weighted
sum of these n block statistics for the overall test statistic. Let $T_i$ be Puri's statistic
on the i-th block. Then for the overall block design the class includes test statistics of
the form $T = \sum_{i=1}^{n} b_i T_i$, where the $b_i$'s are nonnegative weighting constants. The criterion used in selecting the $b_i$'s is based on maximizing the asymptotic relative
efficiency of T. Different scoring schemes may be used in different blocks to form the
weighted sum in an optimal way which will lead to tests with better efficiency. This
might be done when the sample sizes or the distributional forms are thought to be
different from block to block.
Skillings (1978) considers the application of Jonckheere's test statistic in block
designs which have unequal scale parameters for the blocks. Skillings proposes the
statistic $JK = \sum_{i=1}^{n} b_i\, JK_i$, where the $b_i$'s are nonnegative weighting constants. Note
that Jonckheere's statistic is $JK^* = \sum_{i=1}^{n} JK_i$. Assuming that the distributional forms
are the same in all blocks, Skillings' main concern is the selection of the $b_i$'s in the
unequal scale situation. The $b_i$'s are selected to maximize the asymptotic relative
efficiency of JK with respect to $JK^*$ under translation alternatives. To adaptively
select the weighting constants $b_i$, Skillings suggests estimating the $\sigma_i$'s and using the
estimates to determine the $b_i$'s. Asymptotic normality of the statistic JK is shown.
Salama and Quade (1981) proposed the test statistic W to test against ordered
alternatives. Their test statistic is the weighted average rank correlation
$$W = \Big(\sum_{i=1}^{n} b_{q_i} C_i\Big) \Big/ \sum_{i=1}^{n} b_i,$$
where $C_i$ is the correlation between the predicted ranking
and the ranking observed within the i-th block, $q_i$ is the rank of the i-th block with
respect to credibility (variability within the block), and the b's are weights such that
$0 < b_1 \le \cdots \le b_n$. W is asymptotically normal when $\max(b_i)^2 / \sum b_i^2 \to 0$ as $n \to \infty$
and $0 < V(C) < \infty$, where V(C) is the variance of the correlation statistic C.
An extension of the Boyd and Sen (1983) approach to complete block designs
was made by Boyd and Sen (1984). They proposed union-intersection rank tests for
ordered alternatives in a complete block design.
1.5. The M-Procedure
Most of the nonparametric procedures we have reviewed so far, either in estimation or in testing, are based on some way of ranking the original observations.
Another robust procedure that can be used for both testing and estimation is
the M-procedure. The concept of M-estimators was introduced by Huber (1964). He
called an M-estimator any estimate defined by minimizing $\sum_{i=1}^{n} \rho(X_i, T)$ with respect to T, which leads
to the implicit equation
$$\sum_{i=1}^{n} \Psi(X_i, T_n) = 0, \qquad (5.1)$$
where $\rho$ is an arbitrary function and $\Psi(X,\theta) = \frac{\partial}{\partial\theta}\,\rho(X,\theta)$. The functional form of (5.1) is obtained by defining T(F) to be the solution of
$\int \Psi(x, T(F))\, dF(x) = 0$. We call T(·) the M-functional corresponding to $\Psi$, and $T_n$
is the M-estimator corresponding to $\Psi$. This class of estimators contains in particular
the least squares estimators, corresponding to $\Psi(x) = x$, and the maximum likelihood
estimators, corresponding to $\Psi(x) = -f'(x)/f(x)$.
We are particularly interested in the location parameter case, where $\Psi(x,t)$ has the
form $\Psi(x-t)$. Since equation (5.1) may have more than one solution, we should
allow for the possibility of multiple solutions. In particular, if $\Psi(\cdot)$ is defined over the
real line and is nondecreasing and nonconstant with $\Psi(t) = -\Psi(-t)$ for all t, we may
define the M-estimator corresponding to $\Psi(\cdot)$ as $\hat\theta = \frac{1}{2}(\theta^* + \theta^{**})$, where
$$\theta^* = \theta^*(X_1, X_2, \ldots, X_n) = \sup\Big\{\theta : \sum_{i=1}^{n} \Psi(X_i - \theta) > 0\Big\},$$
$$\theta^{**} = \theta^{**}(X_1, X_2, \ldots, X_n) = \inf\Big\{\theta : \sum_{i=1}^{n} \Psi(X_i - \theta) < 0\Big\}.$$
Huber (1964) studied the asymptotic properties of M-estimators based on $\Psi$, and
he proved the following theorem for the case of monotone $\Psi$.
Theorem: Let $\Psi$ be a monotone increasing function, but not necessarily continuous,
that takes both positive and negative values. Then the M-estimator T of location,
defined by $\int \Psi(X - T(F))\, dF(X) = 0$, is weakly continuous at $F_0$ if $\Psi$ is bounded and
$T(F_0)$ is unique. Since weak continuity of T at $F_0$ implies consistency,
$T(F_n) \to T(F)$, this theorem gives a simple sufficient condition for consistency.
The situation is more complicated if $\Psi$ is not monotone, since in this case $\sum_i \Psi(X_i - T_n)$
may have many distinct zeros. Huber suggests that we take the solution nearest to the
sample median.
Huber also showed that if $\Psi(x - \theta)$ is differentiable, then $n^{1/2}(\hat\theta_n - \theta)$
is asymptotically normal with mean zero and variance equal to
$$\int_{-\infty}^{\infty} \Psi^2(t-\theta)\, dF(t) \Big/ \Big[\int \Psi'(t-\theta)\, dF(t)\Big]^2.$$
Furthermore, the M-estimator can achieve the
Cramer-Rao lower bound if and only if $\Psi(X - \theta_n)$ is of the form $a f'(X)/f(X)$ for
some constant a, which leads to the MLE. That is, the MLE is the most efficient M-estimator.
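For the location case, the defining equation $\sum_i \Psi(X_i - t) = 0$ can be solved by bisection, since the sum is nonincreasing in t for monotone $\Psi$. The sketch below is our own illustration (the tuning constant c = 1.345 is a conventional choice for Huber's function, assumed here for concreteness):

```python
def huber_psi(u, c=1.345):
    """Huber's psi: linear near zero, clipped at +/- c."""
    return max(-c, min(c, u))

def m_estimate_location(xs, psi=huber_psi, tol=1e-8):
    """Solve sum psi(x_i - t) = 0 by bisection; valid because the sum
    is nonincreasing in t when psi is monotone nondecreasing."""
    lo, hi = min(xs), max(xs)

    def s(t):
        return sum(psi(x - t) for x in xs)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if s(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Robustness illustration: one gross outlier barely moves the estimate.
print(round(m_estimate_location([1.0, 2.0, 3.0, 4.0, 100.0]), 4))  # 3.0
```

With $\psi(u) = u$ the same routine returns the sample mean, consistent with the remark that least squares is the special case $\Psi(x) = x$.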
1.6. Estimation After Preliminary Tests.
The combination of estimation and testing in one problem, known as "estimation
after preliminary tests," has interested many investigators during the last few years,
on either parametric or nonparametric grounds.
The problem of estimation on the basis of preliminary tests of significance may
be characterized as follows: a statistic T is evaluated from the data at hand; if T is
not significant at some pre-assigned level of significance, then one procedure is used to
estimate the parameters in question, while if T is significant, another procedure is
used for obtaining the estimates. Usually such an estimator after a preliminary test
is not strictly unbiased, though it generally has a smaller mean squared error than the
basic one. The effects of preliminary tests of significance (viz., bias and mean squared
error) upon estimation have been studied in various special cases by different investigators.
Consider the case of two random samples from normal populations with variances
$\sigma_1^2$ and $\sigma_2^2$ respectively. Bancroft (1944) investigated this problem via performing two
tests: a test of the homogeneity of variances, and a test of a regression coefficient. In
the first test Bancroft proposed an estimate $e''$ of $\sigma_1^2$, knowing a priori from the data
that either $\sigma_2^2 = \sigma_1^2$ or $\sigma_2^2 < \sigma_1^2$. His procedure is to test $s_1^2/s_2^2$ by the F-test,
where $s_1^2, s_2^2$ are two independent estimators of $\sigma_1^2$ and $\sigma_2^2$ respectively, such that
$n_1 s_1^2/\sigma_1^2$ and $n_2 s_2^2/\sigma_2^2$ are distributed independently according to $\chi^2$ with $n_1, n_2$
degrees of freedom. If F is nonsignificant at some assigned significance level, Bancroft
used $[(n_1 s_1^2 + n_2 s_2^2)/(n_1 + n_2)]$ as the estimate of $\sigma_1^2$; otherwise he used $s_1^2$ as the
estimate of $\sigma_1^2$. A detailed discussion of the bias in $e''$, and an expression for the
bias and the variance of $e''$ expressed as a fraction of $\sigma_1^2$, are given in Bancroft (1944).
He evaluated the reduction in variance as a result of using the preliminary test, and
concluded that in making the choice of an appropriate estimate of $\sigma_1^2$ one has to consider:
(1) Use $s_1^2$ always. This procedure leads to no bias, but is likely to have a large
sampling error.
(2) Always pool, i.e., use $[(n_1 s_1^2 + n_2 s_2^2)/(n_1 + n_2)]$. When $\sigma_1^2 \ne \sigma_2^2$ this is biased,
but will have less sampling error than (1) since it will be based on $(n_1 + n_2)$
degrees of freedom.
(3) Use the test of significance of $s_1^2/s_2^2$ as a criterion in making the decision as
to whether to pool the two mean squares or not. In this case the preliminary
test of significance criterion will utilize the extra $n_2$ degrees of freedom whenever
possible and also avoid the bias in method (2).
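Bancroft's sometimes-pool rule (3) can be sketched in a few lines; the function name is ours, and the caller is assumed to supply the tabulated F critical value for the chosen level (we do not reproduce F tables here):

```python
def pretest_variance_estimate(s1_sq, s2_sq, n1, n2, f_critical):
    """Sometimes-pool estimate of sigma_1^2 in the spirit of Bancroft (1944):
    pool the two mean squares when the preliminary F = s1^2 / s2^2 is not
    significant, otherwise use s1^2 alone."""
    F = s1_sq / s2_sq
    if F <= f_critical:          # preliminary test not significant: pool
        return (n1 * s1_sq + n2 * s2_sq) / (n1 + n2)
    return s1_sq                 # significant: do not pool

print(pretest_variance_estimate(4.0, 2.0, 10, 10, 3.0))  # 3.0 (pooled)
print(pretest_variance_estimate(8.0, 2.0, 10, 10, 3.0))  # 8.0 (not pooled)
```

The two branches make the bias/variance trade-off explicit: pooling buys the extra $n_2$ degrees of freedom at the cost of bias when $\sigma_1^2 \ne \sigma_2^2$.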
The second test is concerned with the problem of fitting an appropriate model.
He considered the choice between the regression equations $Y = b_1X_1 + b_2X_2$ and
$Y = b_1X_1$, after having fitted $Y = b_1X_1 + b_2X_2$, the population regression equation
being $Y = \beta_1X_1 + \beta_2X_2$. In order to decide whether to retain $X_2$, he tests $s_2^2/s_3^2$ by
the F-test, where $s_2^2$ is the reduction in sum of squares due to $X_2$ after fitting $X_1$,
and $s_3^2$ is the residual mean square. If F is not significant at some assigned
significance level he omits the term containing $X_2$ and uses $b_1$ as the estimate of $\beta_1$;
if F is significant he retains the term containing $X_2$ and uses $b_1$ as the estimate of $\beta_1$.
Bancroft gives an expression for the bias in $b_1$ in his paper.
Mosteller (1948) investigated the problem of pooling two means in estimating the
mean $\mu_1$, using a preliminary normal test for $\mu_1 = \mu_2$ assuming $\sigma^2$ known. He showed
that if the difference between the true means can be thought of as normally distributed from sample to sample, pooling with unequal weights is preferable.
Han and Bancroft (1968) considered the case of unequal variances. They use the
t-test as a preliminary test. If the means do differ significantly, then they use $\bar X_1$ as
the estimator of $\mu_1$; otherwise the estimator is $(n_1\bar X_1 + n_2\bar X_2)/(n_1 + n_2)$, where $\bar X_1$ and
$\bar X_2$ are the sample means. The bias and the mean square error of this
estimator, and the regions of the parameter space for which this estimator has smaller
mean square error than the usual estimator $\bar X_1$, were studied in detail.
A generalization of Han and Bancroft's work is due to Ahsanullah and Saleh
(1972). They investigated the effect of a preliminary test of significance on the classical least squares estimators. They proposed an estimate of the intercept in a linear
regression model with one dependent variable. Consider the model
$$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \quad i = 1,\ldots,n,$$
where the $X_i$'s are known quantities and the $\epsilon_i$'s are independent $N(0,\sigma^2)$. They
used Student's t-test with n-2 degrees of freedom to test for $\beta_1 = 0$, and then defined
the estimate $b_0^*$ of $\beta_0$ as
$$b_0^* = \begin{cases} \bar Y & \text{if the hypothesis } \beta_1 = 0 \text{ is accepted,} \\ \bar Y - b_1\bar X & \text{if the hypothesis } \beta_1 = 0 \text{ is rejected.} \end{cases}$$
A discussion of the bias of the estimator $b_0^*$, its mean square error, and its relative
efficiency with respect to the usual estimator $\bar Y - b\bar X$ is given in their paper.
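A numerical sketch of this pretest intercept estimator follows; it is our own hypothetical implementation (not Ahsanullah and Saleh's code), and the caller supplies the two-sided t critical value with n-2 degrees of freedom:

```python
import math

def pretest_intercept(xs, ys, t_critical):
    """Pretest estimator of beta_0: use Ybar if the slope t-test accepts
    beta_1 = 0, else the least-squares intercept Ybar - b1 * Xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    resid_ss = sum((y - ybar - b1 * (x - xbar)) ** 2 for x, y in zip(xs, ys))
    se_b1 = math.sqrt(resid_ss / (n - 2) / sxx)   # assumes an imperfect fit
    if abs(b1 / se_b1) <= t_critical:             # accept beta_1 = 0
        return ybar
    return ybar - b1 * xbar

xs = [1.0, 2.0, 3.0, 4.0]
print(round(pretest_intercept(xs, [1.1, 1.9, 3.2, 3.8], 2.92), 4))   # 0.15: slope significant
print(round(pretest_intercept(xs, [2.0, 2.1, 1.9, 2.05], 2.92), 4))  # 2.0125: slope accepted, Ybar used
```

The value 2.92 used above is illustrative of a small-sample two-sided t quantile; any tabulated critical value can be passed in.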
A nonparametric version of Ahsanullah and Saleh's test was introduced by Saleh
and Sen (1978). They studied the effects of preliminary tests on $\beta_1$ on the estimation
of $\beta_0$. Their test statistic is based on a score function of the ranks of the dependent
variables. They discussed the asymptotic distribution theory of the estimator and its
asymptotic relative efficiency.
An extension of these results to the multivariate case was made by Sen and
Saleh (1979). For a simple multivariate regression model, they proposed nonparametric rank order estimators of the intercept vector $\beta_0$, following a preliminary
test on the slope vector $\beta_1$, and discussed the asymptotic distribution of these estimators, their asymptotic bias and dispersion matrices, and their asymptotic relative
efficiencies.
1.7. Proposed Research
The purpose of this research is to develop a new statistical theory in the area of
estimation after preliminary testing. We extended the UI principle due to Roy (1957)
to develop the UI M-statistic. The proposed statistic is based on a linear combination
of the aligned sample statistics.
Chapter 2 is concerned with constructing a test against ordered alternatives in
the univariate multi-sample problem. The union-intersection M-statistic is derived in
Section 2.4, and its asymptotic null and nonnull distributions are developed in Section 2.5. The preliminary test estimator is defined and its asymptotic null and nonnull distributions are derived in Section 2.7.
Chapter 3 deals with estimation after preliminary testing against orthant alternatives in the multivariate one-sample problem. While the union-intersection M-statistic is derived in Section 3.4, its asymptotic null distribution is developed in
Section 3.5. In Section 3.7, the preliminary test estimator is defined, then both its
asymptotic null and nonnull distributions are derived.
Chapter 4, the last theoretical chapter, is an extension of the results of Chapter 2
to the general multivariate linear model. In Section 4.4, Roy's union-intersection
principle is applied to derive the union-intersection (UI) M-statistic for the orthant
alternative problem. Its asymptotic null distribution is developed in the same section.
The preliminary test estimator and both its asymptotic null and nonnull distributions
are derived in Section 4.6.
Finally, Chapter 5 introduces the results of some numerical studies concerning
the performance of the UI M-test with respect to some well known tests, and the
comparison between the performance of the preliminary test estimator (PTE) and the
isotonic regression estimator.
CHAPTER II
UI M-ESTIMATORS OF LOCATION PARAMETERS
AFTER A PRELIMINARY TEST IN THE UNIVARIATE
MULTI-SAMPLE PROBLEM
2.1. Introduction
In this chapter we are mainly concerned with the M-estimation of the location parameters
$\Delta_1, \ldots, \Delta_K$ in the univariate K-sample problem as a result of a preliminary test.
The test is an extension of the work by De (1975), Chinchilli and Sen (1981a,b),
and Boyd and Sen (1983) to the M-procedure. The UI M-statistic we propose is based
on a linear combination of the aligned sample statistics.
In Section 2.2, we define the problem, summarize the assumptions required for
the subsequent sections, and define the classical M-estimator.
In Section 2.3, we study the asymptotic properties of the M-estimator and derive
its asymptotic distribution under both the null hypothesis and a local alternative.
We derive the UI M-statistic in Section 2.4, by the application of Roy's (1953)
UI principle.
The asymptotic null distribution of the UI M-statistic is developed in Section
2.5.
In Section 2.6 we derive the isotonic M-estimator and obtain its asymptotic distribution under the null hypothesis and under a local alternative.
Finally, in Section 2.7, we define the preliminary test estimator and focus on its
asymptotic null and nonnull distributions.
2.2. Preliminary Notions.
Our goal in this section is to introduce the problem, define the M-estimator, and
state the necessary assumptions to be used in subsequent sections.
Let $X_1, \ldots, X_K$, with $X_i = (X_{i1}, \ldots, X_{in_i})'$, $i = 1,\ldots,K$,
be K independent random vectors such that $X_{ij}$, $j = 1,\ldots,n_i$, are independent and
identically distributed random variables with continuous distribution function $F_i(x)$
on the real line $R = (-\infty, \infty)$, with
$$F_i(x) = F(x - \Delta_i), \quad i = 1,\ldots,K, \qquad (2.1)$$
where $\Delta_i$ is the location parameter.
It is desired to test the null hypothesis $H_0: F_1(x) = \cdots = F_K(x)$ against the restricted alternative. Under the shift model (2.1), this is equivalent to testing the null hypothesis
$$H_0: \Delta_1 = \cdots = \Delta_K \qquad (2.2)$$
against the ordered alternative
$$H_1: \Delta_1 \le \cdots \le \Delta_K,$$
where at least one inequality is strict. If $H_0$ in (2.2) is rejected, the isotonic M-estimates
$\hat\Delta_{n1}, \ldots, \hat\Delta_{nK}$ will be used to estimate $\Delta_1, \ldots, \Delta_K$; otherwise, the
classical (pooled) M-estimate $\hat\Delta_N$ will be used as an estimate of $\Delta$ (the overall mean).
Define a score function $\Psi(x): R \to R$. Assume that
$$\Psi = \Psi_1 + \Psi_2, \qquad (2.3)$$
where both $\Psi_1$ and $\Psi_2$ are nondecreasing and skew-symmetric functions, so that
$\Psi_i(-x) = -\Psi_i(x)$ for every x and i = 1,2; $\Psi_1$ is absolutely continuous on any
bounded interval in R, and $\Psi_2$ is a step function having finitely many jumps. Let
$\Psi_2(x) = \alpha_j$ for $x \in (a_j, a_{j+1})$, $j = 0,\ldots,p$ (for some $p \ge 0$), where
$$a_1 < a_2 < \cdots < a_p \qquad (2.4)$$
are the jump points and the $\alpha_j$ are real and finite numbers, not all equal. For $p \ge 1$, conventionally,
let $a_0 = -\infty$ and $a_{p+1} = \infty$.
Further, assume that
$$0 < \sigma_\Psi^2 = \int_R \Psi^2(x)\, dF(x) < \infty.$$
For every real t and $n_i \ge 1$, define
$$M_{ni}(t) = \sum_{j=1}^{n_i} \Psi(X_{ij} - t), \quad -\infty < t < \infty, \quad i = 1,\ldots,K. \qquad (2.5)$$
$M_{ni}(t)$ is nonincreasing in $t \in R$. Denote the vector of M-estimators $\hat\Delta_n$ of $\Delta$ by
$\hat\Delta_n = (\hat\Delta_{n1}, \ldots, \hat\Delta_{nK})'$, where
$$\hat\Delta_{ni} = \frac{1}{2}(\hat\Delta_{ni}^* + \hat\Delta_{ni}^{**}), \quad i = 1,\ldots,K;$$
$$\hat\Delta_{ni}^* = \sup\Big\{t : \sum_{j=1}^{n_i} \Psi(X_{ij} - t) > 0\Big\}, \qquad \hat\Delta_{ni}^{**} = \inf\Big\{t : \sum_{j=1}^{n_i} \Psi(X_{ij} - t) < 0\Big\}. \qquad (2.6)$$
That is, $\hat\Delta_{ni}$ is the solution of
$$M_{ni}(\hat\Delta_{ni}) = \sum_{j=1}^{n_i} \Psi(X_{ij} - \hat\Delta_{ni}) = 0, \quad i = 1,\ldots,K. \qquad (2.7)$$
Assume that the distribution function F is symmetric about zero, so that
$F(x) + F(-x) = 1$ for all $x \in R$.
Since $\Psi$ is assumed to be skew-symmetric, this implies that
$$\bar\Psi = \int_R \Psi(x)\, dF(x) = 0. \qquad (2.8)$$
Thus by (2.5), (2.7), and (2.8) we obtain
$$E[M_{ni}(\Delta_i)] = 0, \qquad (2.9)$$
and by the central limit theorem, it follows that
$$n_i^{-1/2}\, M_{ni}(\Delta_i) \xrightarrow{d} N(0, \sigma_\Psi^2). \qquad (2.10)$$
By (2.6), and the assumptions on F and $\Psi$ made above, it follows that $\hat\Delta_{ni}$ has a distribution symmetric about $\Delta_i$.
To study the asymptotic properties of $\hat\Delta_{ni}$, assume further that the distribution
function F possesses an absolutely continuous probability density function f with
finite Fisher information I(f), i.e.,
$$I(f) = \int_R \Big[\frac{f'(x)}{f(x)}\Big]^2 dF(x) < \infty, \qquad (2.11)$$
and let
$$\gamma = \int_R \Psi_1'(x)\, dF(x) + \sum_{j=1}^{p} \alpha_j\,[f(a_j) - f(a_{j-1})]. \qquad (2.12)$$
By (2.3), (2.4), and (2.11), for $f(a_j) > 0$, $j = 1,\ldots,p$, we have $\gamma > 0$.
Furthermore, assume that
$$\int_R (\Psi_1'(x))^2\, dF(x) < \infty.$$
2.3. Asymptotic Properties of the M-estimator
In this section we first establish the contiguity of a sequence of local alternatives
$\{K_N\}$ to the null hypothesis. We then derive the asymptotic distribution of the M-estimator defined by (2.7) under both the null hypothesis and a local alternative, and
demonstrate an important property of the M-estimator: the asymptotic linearity of
$M_{ni}(\cdot)$ in the neighborhood of the true value of the parameter $\Delta_i$.
Since the preliminary test estimator is of interest when the $\Delta_i$'s are suspected to
be ordered, we define the sequence of local alternatives $\{K_N\}$ as
$$K_N: \Delta_i = \Delta + \frac{\theta_i}{\sqrt N}, \quad i = 1,\ldots,K, \quad \theta = (\theta_1, \ldots, \theta_K)'.$$
Next, we shall demonstrate that the sequence of alternatives $K_N$ is contiguous
to $H_0$. The concept of contiguity is due to LeCam (1960). A thorough discussion of
contiguity is contained in Chapter VI of Hajek and Sidak (1967). Basically, if $\{P_n\}$
and $\{Q_n\}$ are two sequences of absolutely continuous probability measures on a measure space $(\mathcal{X}_n, \mathcal{A}_n, \mu_n)$ such that for any sequence of events $A_n$,
$P_n(A_n) \to 0$ implies $Q_n(A_n) \to 0$, then the sequence of measures $\{Q_n\}$ is
said to be contiguous to $\{P_n\}$.
Since $N^{-1/2}\,\theta'\theta \to 0$ as $N \to \infty$, under assumption (2.11) the uniformly
asymptotically negligible condition VI.1.3.4 (cf. Hajek and Sidak (1967)) holds for the
sequence of hypotheses
$$K_N: \Delta_i = \Delta + \theta_i/\sqrt N, \quad i = 1,\ldots,K. \qquad (3.1)$$
If we take $\sigma^2 = \theta'\, I(f)\, \theta$, condition VI.1.2.5 in Hajek and Sidak (1967) holds. Thus,
$K_N$ is a contiguous alternative to $H_0$.
Next we will consider a result due to Jureckova (1977) that relates to the asymptotic linearity of $M_{ni}(\cdot)$ in the neighborhood of the true value of the parameter $\Delta_i$.
We want to show that for $0 < T < \infty$,
$$\sup_{|t| \le T} \frac{1}{\sqrt N}\Big|\, M_{ni}\Big(\Delta_i + \frac{t}{\sqrt N}\Big) - M_{ni}(\Delta_i) + \frac{n_i\, t\, \gamma}{\sqrt N}\,\Big| \xrightarrow{p} 0.$$
Without loss of generality we can take $\Delta_i = 0$. Since $M_{ni}(t/\sqrt N)$ is nonincreasing in
$t \in R$, the supremum can be bounded from above by the maximum over a finite set
of points; i.e., for $t_r \le t \le t_{r+1}$ we have
$$M_{ni}\Big(\frac{t_{r+1}}{\sqrt N}\Big) \le M_{ni}\Big(\frac{t}{\sqrt N}\Big) \le M_{ni}\Big(\frac{t_r}{\sqrt N}\Big).$$
Now for every $\epsilon > 0$ and $T < \infty$ there exists a positive integer $m_\epsilon < \infty$ such that for large $m_\epsilon$,
$\gamma T/m_\epsilon < \epsilon/2$; hence, with $t_r = -T + rT/m_\epsilon$, $r = 0, 1, \ldots, 2m_\epsilon$,
$$\sup_{|t|\le T} \frac{1}{\sqrt N}\Big| M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t\gamma}{\sqrt N}\Big| \le \max_{0\le r\le 2m_\epsilon} \frac{1}{\sqrt N}\Big| M_{ni}\Big(\frac{t_r}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t_r\gamma}{\sqrt N}\Big| + \epsilon/2.$$
Now,
$$P\Big\{\sup_{|t|\le T} \frac{1}{\sqrt N}\Big| M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t\gamma}{\sqrt N}\Big| > \epsilon\Big\} \le P\Big\{\max_{0\le r\le 2m_\epsilon} \frac{1}{\sqrt N}\Big| M_{ni}\Big(\frac{t_r}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t_r\gamma}{\sqrt N}\Big| > \epsilon/2\Big\}$$
$$\le \sum_{r=0}^{2m_\epsilon} P\Big\{\frac{1}{\sqrt N}\Big| M_{ni}\Big(\frac{t_r}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t_r\gamma}{\sqrt N}\Big| > \epsilon/2\Big\}.$$
So it suffices to show that for large N each of the (finitely many) summands is arbitrarily small. By Chebyshev's theorem, it remains to show that as $N \to \infty$,
$$\frac{1}{N}\, E\Big[ M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t\gamma}{\sqrt N}\Big]^2 \to 0. \qquad (3.2)$$
Note that, from (2.8), $\int \Psi\, dF = 0$, so
$$E\, M_{ni}\Big(\frac{t}{\sqrt N}\Big) = n_i \int_R \Psi\Big(x - \frac{t}{\sqrt N}\Big)\, dF(x) = n_i \int_R \Psi(x)\, d\Big\{ F\Big(x + \frac{t}{\sqrt N}\Big) - F(x)\Big\} = -\frac{n_i t\gamma}{\sqrt N} + o\Big(\frac{1}{\sqrt N}\Big).$$
The l.h.s. of (3.2) is equal to
$$\frac{1}{N}\,\mathrm{var}\Big( M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0)\Big) + \frac{1}{N}\Big\{ E\Big( M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0)\Big) + \frac{n_i t\gamma}{\sqrt N}\Big\}^2. \qquad (3.3)$$
By the expansion of the expectation above, the second term in (3.3) tends to zero, and
$$\frac{1}{N}\,\mathrm{var}\Big( M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0)\Big) \le \frac{n_i}{N} \int_R \Big[\Psi\Big(x - \frac{t}{\sqrt N}\Big) - \Psi(x)\Big]^2 dF(x).$$
The r.h.s. is the average of a finite sum, so to show (3.2) it is enough to show that
$$\int_R \Big[\Psi\Big(x - \frac{t}{\sqrt N}\Big) - \Psi(x)\Big]^2 dF(x) \to 0 \quad \text{as } N \to \infty.$$
Now, by the absolute continuity of $\Psi_1$,
$$\int_R \Big[\Psi_1\Big(x - \frac{t}{\sqrt N}\Big) - \Psi_1(x)\Big]^2 dF(x) \to 0, \qquad (3.4)$$
and, since $\Psi_2$ is a step function with finitely many jumps and F is continuous,
$$\int_R \Big[\Psi_2\Big(x - \frac{t}{\sqrt N}\Big) - \Psi_2(x)\Big]^2 dF(x) \to 0. \qquad (3.5)$$
From (3.4) and (3.5) above, it follows that
$$\int_R \Big[\Psi\Big(x - \frac{t}{\sqrt N}\Big) - \Psi(x)\Big]^2 dF(x) \to 0 \quad \text{as } N \to \infty,$$
and we may conclude that
$$\sup_{|t|\le T} \frac{1}{\sqrt N}\Big|\, M_{ni}\Big(\frac{t}{\sqrt N}\Big) - M_{ni}(0) + \frac{n_i t\gamma}{\sqrt N}\,\Big| \xrightarrow{p} 0. \qquad (3.6)$$
Next, we want to establish the asymptotic normality of the M-estimator under the
null hypothesis in (2.2). For every $t \in R$ and $i = 1,\ldots,K$,
$$P\{\sqrt N(\hat\Delta_{ni} - \Delta_i) \le t\} = P\{ M_{ni}(\Delta_i + t/\sqrt N) \le 0\} + o(1);$$
i.e., the distribution of the M-estimator can be studied through that of $M_{ni}$.
By the linearity result in (3.6) and from (2.10), it follows that
$$N^{-1/2}\, M_{ni}(\Delta_i + t/\sqrt N) = N^{-1/2}\, M_{ni}(\Delta_i) - \lambda_i\, t\,\gamma + o_p(1).$$
Thus, we conclude that for every fixed $t \in R$, and under $H_0$, $\sqrt N(\hat\Delta_{ni} - \Delta_i)$ is
asymptotically $N(0, \sigma_\Psi^2/(\lambda_i\gamma^2))$,
where $\lambda_i = \lim_{N\to\infty} n_i/N$ and $N = \sum_{i=1}^{K} n_i$.
Since the $\hat\Delta_{ni}$, $i = 1,\ldots,K$, are independent, it follows that the asymptotic null distribution
of the M-estimator is
$$\sqrt N(\hat\Delta_n - \Delta \mathbf{1}) \xrightarrow{d} N\Big(0,\; \frac{\sigma_\Psi^2}{\gamma^2}\,\Lambda^{-1}\Big), \qquad (3.7)$$
where $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_K)$.
From the normality just shown, it follows that
$$\sqrt N\,|\hat\Delta_{ni} - \Delta_i| = O_p(1); \qquad (3.8)$$
that is, the normalized deviations are bounded in probability.
Thus, from (3.6) and (3.8) above, it follows that
$$N^{-1/2}\,\big|\, M_{ni}(\hat\Delta_{ni}) - M_{ni}(\Delta_i) + n_i(\hat\Delta_{ni} - \Delta_i)\,\gamma\,\big| \xrightarrow{p} 0. \qquad (3.9)$$
This linearity result (3.9) will be very useful in the subsequent sections.
Now that we have established the contiguity of $K_N$ to $H_0$, and we know the
asymptotic distribution of $\hat\Delta_{ni}$ under $H_0$, it is possible to determine its asymptotic
distribution under $K_N$.
Under $K_N$, (2.1) can be written as
$$F_i(x) = F\Big(x - \Delta - \frac{\theta_i}{\sqrt N}\Big).$$
Because of the contiguity established before, the linearity result remains true
under $K_N$.
Thus, the asymptotic distribution of the M-estimator under $K_N$ is given by
$$\sqrt N(\hat\Delta_n - \Delta \mathbf{1}) \xrightarrow{d} N\Big(\theta,\; \frac{\sigma_\Psi^2}{\gamma^2}\,\Lambda^{-1}\Big), \qquad (3.10)$$
where $\Lambda$ is defined in (3.7).
2.4. The UI M-Statistic
In this section we will generalize the Barlow et al. (1972) approach to general
M-estimators and derive a test of $H_0$ against $H_1$ in (2.2) that is based on the M-estimators, by the application of the UI principle. Let $\omega$ and $\omega^*$ be the subspaces
defined by the null and alternative hypotheses, respectively, i.e.,
$$\omega = \{\Delta : \Delta_1 = \cdots = \Delta_K = \Delta \in (-\infty, \infty)\},$$
$$\omega^* = \{\Delta : \Delta_1 \le \cdots \le \Delta_K\}.$$
Under $\omega$, let
$$\hat\Delta_N = \sum_{i=1}^{K} \lambda_i\, \hat\Delta_{ni} \qquad (4.1)$$
be the classical M-estimate of $\Delta$. The (asymptotic) likelihood function of $\hat\Delta_{n1}, \ldots, \hat\Delta_{nK}$ is given
by
$$L_N(\Delta) = \mathrm{const}\cdot\exp\Big[-\frac{\gamma^2}{2\sigma_\Psi^2}\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \Delta_i)^2\Big].$$
Under $\omega$, as $\Delta_i = \Delta$ for all i, $\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \Delta)^2$ can be written as
$$\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \hat\Delta_N)^2 + 2(\hat\Delta_N - \Delta)\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \hat\Delta_N) + N(\hat\Delta_N - \Delta)^2.$$
From (4.1) the second term vanishes, and the third term is minimized at $\Delta = \hat\Delta_N$.
Therefore, the maximum of $L_N(\Delta)$ with respect to $\omega$ is given by
$$\sup_\omega L_N(\Delta) = \mathrm{const}\cdot\exp\Big[-\frac{\gamma^2}{2\sigma_\Psi^2}\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \hat\Delta_N)^2\Big].$$
Under $\omega^*$, let
$$\Delta_i = \Delta + \delta a_i, \quad \delta \ge 0, \quad a_1 \le \cdots \le a_K. \qquad (4.2)$$
Since $\omega^*$ is translation-invariant, we may set $\Delta + \delta a_i = (\Delta + \delta\bar a) + \delta(a_i - \bar a)$, where
$\bar a = \sum_{i=1}^{K} \lambda_i a_i$, and without any loss of generality set $\bar a = 0$.
The likelihood function of $\hat\Delta_{n1}, \ldots, \hat\Delta_{nK}$ under $\omega^*$ is given by
$$L_N(\Delta) = \mathrm{const}\cdot\exp\Big[-\frac{\gamma^2}{2\sigma_\Psi^2}\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \Delta - \delta a_i)^2\Big].$$
Now for fixed a's subject to the constraint $\sum_{i=1}^{K} n_i a_i^2 = 1$, we want to find $\hat\delta_N$ and $\hat\Delta_N^*$
that maximize $L_N(\Delta)$ under $\omega^*$. By differentiation with respect to $\delta$ and $\Delta$, respectively, it follows that
$$\hat\delta_N(a) = \begin{cases} \displaystyle\sum_{i=1}^{K} n_i a_i \hat\Delta_{ni} & \text{if } \displaystyle\sum_{i=1}^{K} n_i a_i \hat\Delta_{ni} > 0, \\ 0 & \text{if } \displaystyle\sum_{i=1}^{K} n_i a_i \hat\Delta_{ni} \le 0, \end{cases}$$
and
$$\hat\Delta_N^* = \frac{1}{N}\Big[\sum_{i=1}^{K} n_i \hat\Delta_{ni} - \hat\delta_N \sum_{i=1}^{K} n_i a_i\Big] = \frac{1}{N}\sum_{i=1}^{K} n_i \hat\Delta_{ni} = \sum_{i=1}^{K} \lambda_i \hat\Delta_{ni} = \hat\Delta_N,$$
since $\sum_{i=1}^{K} n_i a_i = N\bar a = 0$. Therefore,
$$\hat\Delta_i^* = \hat\Delta_N + \hat\delta_N a_i, \quad 1 \le i \le K,$$
and
$$\sup_{\omega^*} L_N(\Delta) = \mathrm{const}\cdot\exp\Big[-\frac{\gamma^2}{2\sigma_\Psi^2}\Big\{\sum_{i=1}^{K} n_i(\hat\Delta_{ni} - \hat\Delta_N)^2 - \frac{\big[\sum_{i=1}^{K} n_i a_i(\hat\Delta_{ni} - \hat\Delta_N)\big]^2}{\sum_{i=1}^{K} n_i a_i^2}\, I(\hat\delta_N(a) > 0)\Big\}\Big].$$
Thus the likelihood ratio is
$$\frac{\sup_\omega L_N(\Delta)}{\sup_{\omega^*} L_N(\Delta)} = \begin{cases} \exp\Big[-\dfrac{\gamma^2}{2\sigma_\Psi^2}\, L_N(a)\Big] & \text{if } \hat\delta_N(a) > 0, \\ 1 & \text{if } \hat\delta_N(a) = 0, \end{cases}$$
where, therefore,
$$L_N(a) = \frac{\big[\sum_{i=1}^{K} n_i a_i(\hat\Delta_{ni} - \hat\Delta_N)\big]^2}{\sum_{i=1}^{K} n_i a_i^2}\; I(\hat\delta_N(a) > 0). \qquad (4.3)$$
The likelihood ratio test rejects $H_0$ in favor of $H_1: \Delta_i = \Delta + \delta a_i$,
$a_1 \le \cdots \le a_K$, for large values of $L_N(a)$.
Now
$$M_{ni}(\hat\Delta_N) = \sum_{j=1}^{n_i} \Psi(X_{ij} - \hat\Delta_N) = \sum_{j=1}^{n_i} \Psi(X_{ij} - \hat\Delta_N) - \sum_{j=1}^{n_i} \Psi(X_{ij} - \hat\Delta_{ni}),$$
as
$$\sum_{j=1}^{n_i} \Psi(X_{ij} - \hat\Delta_{ni}) = 0.$$
Expanding the first sum on the r.h.s. around $\Delta$ and the second around $\Delta_i$, we
obtain
$$M_{ni}(\hat\Delta_N) \sim \sum_{j=1}^{n_i} \Psi(X_{ij} - \Delta) - n_i(\hat\Delta_N - \Delta)\gamma + o|\hat\Delta_N - \Delta| - \Big\{\sum_{j=1}^{n_i} \Psi(X_{ij} - \Delta_i) - n_i(\hat\Delta_{ni} - \Delta_i)\gamma + o|\hat\Delta_{ni} - \Delta_i|\Big\}$$
$$= M_{ni}(\Delta) - M_{ni}(\Delta_i) - n_i\gamma\,[\hat\Delta_N - \Delta - \hat\Delta_{ni} + \Delta_i] + o_p(\sqrt{n_i}).$$
Under $H_0$ ($\Delta_i = \Delta$), it follows that
$$M_{ni}(\hat\Delta_N) = n_i\gamma\,(\hat\Delta_{ni} - \hat\Delta_N) + o_p(\sqrt{n_i}).$$
Thus, $L_N(a)$ can be written as
$$L_N(a) = \frac{\big\{\sum_{i=1}^{K} a_i\, \gamma^{-1} M_{ni}(\hat\Delta_N)\big\}^2}{\sum_{i=1}^{K} n_i a_i^2}\; I(\hat\delta_N(a) > 0) + o_p(1) = \frac{\sigma_\Psi^2}{\gamma^2}\left\{\frac{N^{-1/2}\sum_{i=1}^{K} a_i M_{ni}(\hat\Delta_N)}{\sigma_\Psi\big(\sum_{i=1}^{K} \lambda_i a_i^2\big)^{1/2}}\right\}^2 I(\hat\delta_N(a) > 0) + o_p(1).$$
Under $H_0$,
$$\frac{1}{\sqrt N}\, M_{ni}(\hat\Delta_N) = \frac{1}{\sqrt N} \sum_{j=1}^{n_i} \Psi(X_{ij} - \hat\Delta_N). \qquad (4.4)$$
The solution $\tilde\Delta_N$ of $\sum_{i=1}^{K} M_{ni}(t) = 0$ and $\hat\Delta_N = \sum_{i=1}^{K} \lambda_i \hat\Delta_{ni}$ are asymptotically the same
up to the order $N^{-1/2}$, and therefore
$$\sum_{a=1}^{K} M_{na}(\hat\Delta_N) = \sum_{a=1}^{K} \sum_{j=1}^{n_a} \Psi(X_{aj} - \hat\Delta_N) = o_p(\sqrt N);$$
i.e.,
$$N^{-1/2}\Big\{\sum_{a=1}^{K} M_{na}(\hat\Delta_N) - \sum_{a=1}^{K} M_{na}(\Delta) + N(\hat\Delta_N - \Delta)\gamma\Big\} \xrightarrow{p} 0. \qquad (4.5)$$
By definition the first term vanishes (up to $o_p(\sqrt N)$) and (4.4) can be written as
$$N^{-1/2}\, M_{ni}(\hat\Delta_N) = N^{-1/2}\, M_{ni}(\Delta) - N^{-1/2}\lambda_i \sum_{a=1}^{K} M_{na}(\Delta) + o_p(1), \quad i = 1,\ldots,K. \qquad (4.6)$$
Thus,
$$N^{-1/2}\big(M_{n1}(\hat\Delta_N), \ldots, M_{nK}(\hat\Delta_N)\big)' \xrightarrow{d} N(0, \Sigma),$$
where $\Sigma = \sigma_\Psi^2\{\Lambda - \lambda\lambda'\}$ is of rank K-1, $\lambda = (\lambda_1, \ldots, \lambda_K)'$,
and $\Lambda$ was defined in (3.7). Thus, the asymptotic variance of $N^{-1/2}\sum_{i=1}^{K} a_i M_{ni}(\hat\Delta_N)$ is
given by
$$\Big[\sum_{i=1}^{K} a_i^2 \lambda_i(1 - \lambda_i) + 2\sum_{i<i'} a_i a_{i'}(-\lambda_i \lambda_{i'})\Big]\sigma_\Psi^2 = \Big[\sum_{i=1}^{K} a_i^2\lambda_i - \Big(\sum_{i=1}^{K} a_i\lambda_i\Big)^2\Big]\sigma_\Psi^2 = \sum_{i=1}^{K} a_i^2 \lambda_i\, \sigma_\Psi^2,$$
since $\sum_{i=1}^{K} a_i \lambda_i = \bar a = 0$. From the above, it follows that under $H_0$,
$$\frac{N^{-1/2}\sum_{i=1}^{K} a_i M_{ni}(\hat\Delta_N)}{\sigma_\Psi\big(\sum_{i=1}^{K} \lambda_i a_i^2\big)^{1/2}}$$
is asymptotically N(0,1).
Define the set
$$A = \Big\{a : a_i \le a_{i+1},\ i = 1,\ldots,K-1;\ \sum_{i=1}^{K} \lambda_i a_i^2 = 1\Big\}. \qquad (4.7)$$
To get a test against the ordered alternatives in (2.2) according to the union-intersection principle, we take the union of the critical regions corresponding to (4.3)
over the set A, which means using the supremum of $L_N(a)$ given in (4.3) as our test
statistic. That is, the union-intersection (UI) statistic against the ordered alternative is
given by
$$L_N^{UI} = \sup_{a\in A} L_N(a), \qquad (4.8)$$
where A is defined in (4.7).
We reject $H_0$ in favor of $H_1: \Delta_i = \Delta + \delta a_i$, $a_1 \le \cdots \le a_K$, if $L_N^{UI}$ is
sufficiently large.
We attempt to solve this problem by reducing it to the orthant alternative problem
and using the UI principle to develop a test that is simpler and asymptotically more
powerful than the unrestricted test.
To construct the UI M-test against orthant alternatives, consider the transformation $\beta = D\Delta$, where $D$ is a $(K-1) \times K$ matrix of the form

    D = ( -1  1  0  \cdots  0  0 )
        (  0 -1  1  \cdots  0  0 )
        (        \vdots          )
        (  0  0  0  \cdots -1  1 ).

Under the new setting the null hypothesis and the alternative in (2.2) can be written as

    H_0: \beta = 0  against  H_1: \beta \in \Omega,\ \beta \ne 0,

where $\Omega = R^{+(K-1)} = [0,\infty)^{K-1}$. Assume that the parameter space $\Omega$ is positively homogeneous, i.e., if $\beta \in \Omega$, then $M\beta \in \Omega$ for all $M > 0$.
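As a numerical aside (ours, not part of the original derivation), the transformation $\beta = D\Delta$ can be sketched in code; the function name `diff_matrix` is our own:

```python
import numpy as np

def diff_matrix(K):
    """(K-1) x K matrix D with rows (..., -1, 1, ...): beta = D @ Delta
    collects the successive differences Delta_{i+1} - Delta_i."""
    D = np.zeros((K - 1, K))
    for i in range(K - 1):
        D[i, i] = -1.0
        D[i, i + 1] = 1.0
    return D

# An ordered Delta is carried into the nonnegative orthant, and conversely.
Delta = np.array([0.5, 1.0, 2.5])
beta = diff_matrix(3) @ Delta      # both components nonnegative
```

So the ordered alternative in (2.2) becomes the orthant restriction $\beta \ge 0$ after the change of variables.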
Let $a_i = a_{i-1} + b_i$, $b_i > 0$ for all $i = 2, \ldots, K$. Thus $\sum_{i=1}^{K} a_i M_{ni}(\bar\Delta_N)$ can be written as

    \sum_{i=1}^{K} a_i M_{ni}(\bar\Delta_N) = a_1 \sum_{i=1}^{K} M_{ni}(\bar\Delta_N) + \sum_{i=2}^{K} b_i \Big( \sum_{j=i}^{K} M_{nj}(\bar\Delta_N) \Big).

By definition the first term on the r.h.s. will vanish. Thus, define

    \bar M_i = \sum_{j=i}^{K} M_{nj}(\bar\Delta_N),  i = 2, \ldots, K.

Note that

    N^{-1/2} \bar M_i = N^{-1/2} \sum_{j=i}^{K} M_{nj}(\bar\Delta_N);

by the relation in (4.5), it follows that the asymptotic variance of $N^{-1/2}\bar M_j$ and the asymptotic covariance of $(N^{-1/2}\bar M_j, N^{-1/2}\bar M_{j'})$ are the elements of a $(K-1) \times (K-1)$ matrix $A^*$, say. Then $N^{-1/2} \sum_{j=2}^{K} b_j \bar M_j$ has asymptotic variance equal to

    b' A^* b = \sigma_\psi^2 \sum_{j=1}^{K} \lambda_j a_j^2.

Thus, maximizing $L_N(a)$ over the set $A$ is equivalent, up to $o_p(1)$, to maximizing $N^{-1/2} b' \bar M$ over the nonnegative orthant $B$.
Thus, if we define $\bar M = (\bar M_2, \ldots, \bar M_K)'$, the maximization problem in (4.8) reduces to maximizing $N^{-1/2}(b' \bar M)\, I(b' \bar M > 0)$ over the nonnegative orthant $b \ge 0$, with $b' A^* b = 1$. For each $b \in R^{K-1}$, $b \ge 0$, define

    M_N(b) = N^{-1/2} (b' \bar M) / (b' A^* b)^{1/2}.

From the above, we see that under $H_0$, $M_N(b) \sim N(0,1)$. Due to results on the extrema of quadratic forms (see Rao 1973), we know that $b = (A^*)^{-1}\rho$ maximizes

    (b' \rho)^2 / (b' A^* b).

Fix $b \in R^{K-1}$ and construct the parameter subspace

    \Omega(b) = \{\beta \in R^{K-1} : \rho = A^* b\},   (4.9)

and define the set $B$ as

    B = \{b \in R^{K-1} : A^* b \in \Omega,\ b' A^* b = 1\}.   (4.10)

Due to the positive homogeneity of $\Omega$ and lines (4.9) and (4.10), we can say that $\Omega \subseteq \bigcup_{b \in B} \Omega(b)$; therefore we formulate our problem as:

    reject \{H_0: \beta = 0\} in favor of \{H_1: \beta \in \Omega\}
    if (M_N^U)^{1/2} = \sup_{b \in B} M_N(b) is sufficiently large.

To maximize $N^{-1/2}(b' \bar M)\, I(b' \bar M > 0)$ over the set $B$ (the nonnegative orthant) is equivalent to finding $-\inf_B(-N^{-1/2}\, b' \bar M)$ over $B$. In nonlinear
programming terminology the problem is:

    Determine the vector b^* \in B such that h(b^*) = \inf\{h(b) : b \in B\},

where (1) $h(b)$ is a scalar-valued function of $b$, and (2) $B = \{b : h_1(b) \le 0,\ h_2(b) = 0\}$, where $h_1$ and $h_2$ are functions of $b$.

This is the minimization problem with constraints; $h$ is the objective function, $h_1$ the inequality constraint function, and $h_2$ the equality constraint function. The Kuhn-Tucker-Lagrange (KTL) point is any point $(b^*, t_1^*, t_2^*)$ which satisfies the system of conditions

    (i) t_1 \ge 0,  (ii) h_1(b) \le 0,  (iii) h_2(b) = 0,
    (iv) t_1' h_1(b) = 0,  and  (v) \partial L(b, t_1, t_2)/\partial b = 0,   (4.11)

where $L$ is the Lagrangian function defined as

    L(b, t_1, t_2) = h(b) + t_1' h_1(b) + t_2 h_2(b).

The Kuhn-Tucker conditions are sufficient for the constrained minimization problem at hand. However, they require that the equality constraint function be linear, and this is violated by the quadratic function $b' A^* b = 1$. This problem has been solved by Chinchilli and Sen (1981b), who applied Roy's union-intersection principle by partitioning the parameter space into an infinite number of parameter subspaces, and constructed a univariate test statistic for each subspace. They made use of
the Kuhn-Tucker theorem, and showed that the UI statistic against the orthant alternative is $M_N^U = b^{*'} \bar M$, where $(b^*, t_1^*, t_2^*)$ is the KTL point defined in (4.11). Following their notation, let $J$ represent any of the $2^{K-1}$ subsets of $A = \{1, \ldots, K-1\}$, and $J'$ its complement. For each $J$ partition (and rearrange) $\bar M$ and $A^*$ as

    \bar M = ( \bar M_{(J)} ; \bar M_{(J')} )  and
    A^* = ( A^*_{(JJ)}   A^*_{(JJ')} ; A^*_{(J'J)}   A^*_{(J'J')} ).   (4.12)

Let $K(J)$ be the number of elements in the set $J$, so that the dimension of $A^*_{(JJ)}$ is $K(J) \times K(J)$, the dimension of $A^*_{(JJ')}$ is $K(J) \times K(J')$, etc. Note that the sample space of $\bar M$ is partitioned into $2^{K-1}$ subsample spaces, and the UI statistic $M_N^U$ can be any of $2^{K-1}$ possible quadratic forms, dependent on the subsample space in which $\bar M$ lies.
For each $\phi \subseteq J \subseteq A$, define the conditional values

    \bar M_{(J:J')} = \bar M_{(J)} - A^*_{(JJ')}(A^*_{(J'J')})^{-1} \bar M_{(J')},
    A^*_{(JJ:J')} = A^*_{(JJ)} - A^*_{(JJ')}(A^*_{(J'J')})^{-1} A^*_{(J'J)}.

For $J = \phi$ define $\bar M_{(J:J')} = 0$, and for $J = A$ define $\bar M_{(J:J')} = \bar M$ and $A^*_{(JJ:J')} = A^*$.

Define the functions

    h(b) = -b' \bar M,   h_1(b) = -A^* b,   h_2(b) = b' A^* b - 1,   (4.13)
and define the Lagrangian function in (4.11) as

    L(b, t_1, t_2) = -b' \bar M - t_1' A^* b + t_2 (b' A^* b - 1).

Thus, the point $(b^*, t_1^*, t_2^*)$ is the KTL point if it satisfies the system:

    (i) t_1 \ge 0,  (ii) A^* b \ge 0,  (iii) b' A^* b = 1,
    (iv) t_1' A^* b = 0,  (v) -\bar M - A^* t_1 + 2 t_2 A^* b = 0.   (4.14)

Line (v) of (4.14) implies that

    2 t_2 A^* b = \bar M + A^* t_1,  i.e.,
    2 t_2 b = (A^*)^{-1} \bar M + t_1,  i.e.,
    b = (2 t_2)^{-1} [(A^*)^{-1} \bar M + t_1],   (4.15)

and line (iii) of (4.14) implies that

    4 t_2^2 = (\bar M' + t_1' A^*)((A^*)^{-1} \bar M + t_1)
            = \bar M'(A^*)^{-1}\bar M + \bar M' t_1 + t_1' \bar M + t_1' A^* t_1
            = \bar M'(A^*)^{-1}\bar M + 2 t_1' \bar M + t_1' A^* t_1,

but line (iv) in (4.14) together with line (4.15) imply that
    (2 t_2)^{-1}\, t_1' A^* [(A^*)^{-1} \bar M + t_1] = 0,

i.e., $t_1'(\bar M + A^* t_1) = 0$, which in turn implies that

    t_1' \bar M = -t_1' A^* t_1.   (4.16)

From (4.15) and (4.16), we take

    t_2 = \tfrac{1}{2} [\bar M'(A^*)^{-1}\bar M + 2 t_1' \bar M + t_1' A^* t_1]^{1/2}
        = \tfrac{1}{2} [\bar M'(A^*)^{-1}\bar M - t_1' A^* t_1]^{1/2}.

From (4.14) (i), (ii) it follows that

    t_1 \ge 0  and  (2 t_2)^{-1}(\bar M + A^* t_1) \ge 0;

i.e., the vectors $t_1$ and $(2 t_2)^{-1}(\bar M + A^* t_1)$ are both nonnegative, but from line (iv) in (4.14), their inner product is zero, i.e.,

    t_1' (\bar M + A^* t_1) = 0.

Since $t_2 > 0$, the only way this is possible is if for some $J$, $\phi \subseteq J \subseteq A$,

    (\bar M + A^* t_1)_{(J)} > 0  and  t_{1(J)} = 0,

i.e.,

    \bar M_{(J)} + A^*_{(JJ')} t_{1(J')} > 0,  and
    \bar M_{(J')} + A^*_{(J'J')} t_{1(J')} = 0,

which imply that

    t_{1(J')} = -(A^*_{(J'J')})^{-1} \bar M_{(J')}.

This leads to

    \bar M_{(J)} - A^*_{(JJ')}(A^*_{(J'J')})^{-1} \bar M_{(J')} > 0.   (4.17)
Since $t_1 \ge 0$, it follows that

    (A^*_{(J'J')})^{-1} \bar M_{(J')} \le 0.   (4.18)

(4.17) and (4.18), together with the definition of the conditional value $\bar M_{(J:J')}$, in turn imply that

    \bar M_{(J:J')} > 0.   (4.19)

As suggested by (4.18) and (4.19), for each $J$, $\phi \subseteq J \subseteq A$, form the set

    L(J) = \{\bar M \in R^{K-1} : \bar M_{(J:J')} > 0,\ (A^*_{(J'J')})^{-1} \bar M_{(J')} < 0\}.   (4.20)

Kudo (1963) has shown that the collection of all $2^{K-1}$ sets $L(J)$ in (4.20) is a disjoint and exhaustive partitioning of $R^{K-1}$. Therefore, (4.18) and (4.19) hold for some $J$, $\phi \subseteq J \subseteq A$; this allows us to take $(b^*, t_1^*, t_2^*)$ as a KTL point, where
    t_2^* = \tfrac{1}{2} [\bar M'(A^*)^{-1}\bar M - \bar M_{(J')}' (A^*_{(J'J')})^{-1} \bar M_{(J')}]^{1/2},

and

    t^*_{1(J)} = 0,  t^*_{1(J')} = -(A^*_{(J'J')})^{-1} \bar M_{(J')},
    b^* = (2 t_2^*)^{-1} [(A^*)^{-1} \bar M + t_1^*].

Now,

    2 t_2^*\, b^{*'} \bar M = \bar M'(A^*)^{-1}\bar M + t_1^{*'} \bar M   (4.21)
        = \bar M'(A^*)^{-1}\bar M + [0',\ -\bar M_{(J')}'(A^*_{(J'J')})^{-1}] ( \bar M_{(J)} ; \bar M_{(J')} )
        = \bar M'(A^*)^{-1}\bar M - \bar M_{(J')}'(A^*_{(J'J')})^{-1} \bar M_{(J')}.

That is,

    b^{*'} \bar M = (2 t_2^*)^{-1} [\bar M'(A^*)^{-1}\bar M - \bar M_{(J')}'(A^*_{(J'J')})^{-1} \bar M_{(J')}].

From line (4.21), it follows that

    b^{*'} \bar M = [\bar M'(A^*)^{-1}\bar M - \bar M_{(J')}'(A^*_{(J'J')})^{-1} \bar M_{(J')}]^{1/2}.   (4.22)
Note that the partitioned matrix $A^*$ has inverse

    (A^*)^{-1} = ( (A^*_{(JJ:J')})^{-1}   -(A^*_{(JJ:J')})^{-1} A^*_{(JJ')}(A^*_{(J'J')})^{-1} ;
                   -(A^*_{(J'J')})^{-1} A^*_{(J'J)}(A^*_{(JJ:J')})^{-1}
                   (A^*_{(J'J')})^{-1} + (A^*_{(J'J')})^{-1} A^*_{(J'J)}(A^*_{(JJ:J')})^{-1} A^*_{(JJ')}(A^*_{(J'J')})^{-1} ),

and

    \bar M'(A^*)^{-1}\bar M = \bar M_{(J')}'(A^*_{(J'J')})^{-1}\bar M_{(J')} + \bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')}.

Line (4.22) can be written as

    b^{*'} \bar M = [\bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')}]^{1/2}.

Thus, the UI statistic for the orthant alternative problem is

    M_N^U = \frac{1}{N} \sum_{\phi \subseteq J \subseteq A} \bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')}\, I(\bar M_{(J:J')} > 0)\, I((A^*_{(J'J')})^{-1}\bar M_{(J')} < 0),

where $I(A)$ represents the indicator function for the set $A$. Note that both indicator functions are nonzero (i.e., equal to one) for only one of the $\bar M_{(J:J')}$, so the above expression tells us which $\bar M_{(J:J')}$ to use, and accordingly we calculate the corresponding quadratic form.
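The subset scan implied by the last display can be sketched in code. A minimal illustration (function name ours), which takes $\bar M$ and $A^*$ as given and omits the $1/N$ scaling:

```python
import numpy as np
from itertools import combinations

def ui_orthant_stat(Mbar, Astar):
    """Scan all nonempty subsets J of {0,...,p-1}; for the (unique) J with
    Mbar_(J:J') > 0 and (A*_(J'J'))^{-1} Mbar_(J') <= 0, return the
    quadratic form Mbar_(J:J')' (A*_(JJ:J'))^{-1} Mbar_(J:J')."""
    p = len(Mbar)
    idx = range(p)
    for r in range(p, 0, -1):
        for J in map(list, combinations(idx, r)):
            Jc = [i for i in idx if i not in J]
            if not Jc:
                if (Mbar > 0).all():
                    return float(Mbar @ np.linalg.solve(Astar, Mbar))
                continue
            Ainv = np.linalg.inv(Astar[np.ix_(Jc, Jc)])
            cond = Mbar[J] - Astar[np.ix_(J, Jc)] @ Ainv @ Mbar[Jc]
            Acond = (Astar[np.ix_(J, J)]
                     - Astar[np.ix_(J, Jc)] @ Ainv @ Astar[np.ix_(Jc, J)])
            if (cond > 0).all() and (Ainv @ Mbar[Jc] <= 0).all():
                return float(cond @ np.linalg.solve(Acond, cond))
    return 0.0  # J = empty set: no coordinate evidence against H0
```

Because the sets $L(J)$ in (4.20) partition $R^{K-1}$, at most one subset fires, so the scan order does not matter.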
2.5. The Asymptotic Distribution of $M_N^U$

Following the Chinchilli and Sen (1981b) approach, we can show that $M_N^U$ has the asymptotic distribution

    P(M_N^U < C) = \sum_{\phi \subseteq J \subseteq A} P\{\bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')} < C,\ \bar M_{(J:J')} > 0\} \cdot P((A^*_{(J'J')})^{-1}\bar M_{(J')} < 0),   (5.1)
which under $H_0: \beta = 0$ implies that

    \lim_{N \to \infty} P(M_N^U < C) = \sum_{r=0}^{K-1} W_r P(\chi_r^2 < C),   (5.2)

where $\chi_r^2$ represents a chi-squared random variable with $r$ degrees of freedom. The nonnegative weights $W_0, \ldots, W_{K-1}$ sum to one, and for $0 \le r \le K-1$,

    W_r = \sum_{\{J : K(J) = r\}} P(\bar M_{(J:J')} > 0) \cdot P((A^*_{(J'J')})^{-1}\bar M_{(J')} < 0),

where $\bar M$ is a null-mean, $(K-1)$-variate normal vector with covariance matrix $A^*$.
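The chi-bar-squared mixture in (5.2) is easy to evaluate once the weights $W_r$ are known. A hedged sketch (ours; closed-form $\chi_r^2$ tails for the small degrees of freedom needed when $K = 3$):

```python
import math

def chi2_sf(c, r):
    """Survival function of a chi-squared variable with r = 0, 1, 2, or 3
    degrees of freedom (chi^2_0 is a point mass at zero)."""
    if r == 0:
        return 1.0 if c <= 0 else 0.0
    if r == 1:
        return math.erfc(math.sqrt(c / 2.0))
    if r == 2:
        return math.exp(-c / 2.0)
    if r == 3:
        return math.erfc(math.sqrt(c / 2.0)) + math.sqrt(2.0 * c / math.pi) * math.exp(-c / 2.0)
    raise ValueError("only r <= 3 is coded in this sketch")

def chi_bar_sq_sf(c, weights):
    """P(mixture sum_r W_r chi^2_r exceeds c), as in (5.2)."""
    return sum(w * chi2_sf(c, r) for r, w in enumerate(weights))
```

Setting this tail probability equal to $\alpha$ and solving for $c$ gives percentage points of the kind tabulated below, once the $W_r$ are computed from the orthant probabilities.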
Outline of the proof

The proof consists of two parts: first, to establish the independence of the events $I(\bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')} < C)$ and $I(\bar M_{(J:J')} > 0)$ under $H_0$; and second, to show that for each $J$, $\phi \subseteq J \subseteq A$, the pair of orthant events $I(\bar M_{(J:J')} > 0)$ and $I((A^*_{(J'J')})^{-1}\bar M_{(J')} < 0)$ are independent regardless of $H_0$ (i.e., whether or not the expected value of $\bar M$ is null).

The first part can be shown directly by appealing to Kudo (1963), Lemma 3.2. Kudo points out that if $X$ is an $a$-variate, null-mean normal random vector with covariance matrix $\Sigma$, then

    P(X' \Sigma^{-1} X < x,\ B X > 0) = P(X' \Sigma^{-1} X < x) \cdot P(B X > 0),   (5.3)

where $B$ is a full-rank matrix of appropriate dimensions. If we take $B$ to be the identity matrix, then the two events

    I(\bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')} < C)  and  I(\bar M_{(J:J')} > 0)   (5.4)

are independent under $H_0$.
To show the independence in the second part, let $g(\bar M; \delta, \Sigma)$ represent the $(K-1)$-variate normal density function for $\bar M$, with mean $\delta$ and dispersion matrix $\Sigma$, and partition $\bar M$ and $\Sigma$ according to the sets $J$ and $J'$, as in (4.12). It follows that $g(\bar M; \delta, \Sigma)$ factors into

    g(\bar M; \delta, \Sigma) = g(\Sigma_{(J'J')}^{-1}\bar M_{(J')};\ \Sigma_{(J'J')}^{-1}\delta_{(J')},\ \Sigma_{(J'J')}^{-1}) \cdot g(\bar M_{(J:J')};\ \delta_{(J:J')},\ \Sigma_{(JJ:J')}).   (5.5)

In our case, (5.5) implies that

    g(\bar M; 0, A^*) = g((A^*_{(J'J')})^{-1}\bar M_{(J')};\ 0,\ (A^*_{(J'J')})^{-1}) \cdot g(\bar M_{(J:J')};\ 0,\ A^*_{(JJ:J')}).

From (5.5) above we can see that

    P(\bar M_{(J:J')} > 0,\ (A^*_{(J'J')})^{-1}\bar M_{(J')} < 0)
      = P(\bar M_{(J:J')} > 0) \cdot P((A^*_{(J'J')})^{-1}\bar M_{(J')} < 0).

That is, the two events $I(\bar M_{(J:J')} > 0)$ and $I((A^*_{(J'J')})^{-1}\bar M_{(J')} < 0)$ are independent, regardless of whether or not the expected value of $\bar M$ is null. This leads to the result in (5.2). Unfortunately, the independence exhibited in (5.4) for the normal random vector $\bar M$ does not hold if the mean vector is nonnull. Therefore the asymptotic nonnull distribution of $M_N^U$ remains in the awkward form given in (5.1).

The percentage points for the UI M-test statistic have been computed and tabulated in Table 2.1 for the case $K = 3$ and for values of $\rho$ starting at $\rho = -1$ and by increments of .1 up to and including $\rho = +1$.
Table 2.1
Percentage Points for the UI M-statistic $M_N^U$ by the Value of $\rho$

    rho     .90      .95      .99
    -1.0    1.642    2.704    5.411
     -.9    2.079    3.244    6.120
     -.8    2.251    3.445    6.377
     -.7    2.379    3.593    6.556
     -.6    2.487    3.715    6.700
     -.5    2.579    3.820    6.823
     -.4    2.664    3.914    6.930
     -.3    2.742    4.001    7.030
     -.2    2.816    4.081    7.120
     -.1    2.885    4.157    7.200
      0     2.952    4.231    7.289
      .1    3.018    4.301    7.367
      .2    3.082    4.371    7.444
      .3    3.145    4.439    7.520
      .4    3.209    4.507    7.597
      .5    3.275    4.577    7.672
      .6    3.342    4.649    7.750
      .7    3.415    4.726    7.833
      .8    3.490    4.811    7.924
      .9    3.594    4.914    8.035
     1.0    3.808    5.137    8.275
To check these results, one thousand samples of a standard normal random variate were generated, for the case of $\rho = .5$ and for each of the sample sizes $n = 5, 10, 15, 20, 25$ and $K = 3$. The test statistic was computed for each sample, and the .90, .95, and .99 percentage points were obtained for each sample size. The results are tabulated in Table 2.2.
Table 2.2
Simulated Percentage Points of the UI M-statistic by Sample Size and for $\rho = .5$

    n      .90      .95      .99
    5      3.875    5.675    10.808
    10     3.674    5.093     8.471
    15     3.126    4.596     9.485
    20     3.391    5.157     8.808
    25     3.582    4.968     7.580
We can see from Tables 2.1 and 2.2 that the simulated percentage points do not differ greatly from the approximate values. Also, we notice that the simulated values did not differ appreciably by sample size, indicating that a sample size of 20, or maybe 15, would be large enough to justify the use of the approximate distribution.
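The way simulated percentage points such as those in Table 2.2 are extracted can be sketched (ours; we draw from a known $\chi_1^2$ reference instead of the M-statistic so the sketch is self-contained):

```python
import numpy as np

rng = np.random.default_rng(1816)

def empirical_points(draws, probs=(0.90, 0.95, 0.99)):
    """Empirical percentage points of simulated test-statistic values."""
    return {p: float(np.quantile(draws, p)) for p in probs}

# With chi^2_1 draws the .95 point should sit near the theoretical 3.841.
draws = rng.chisquare(df=1, size=200_000)
pts = empirical_points(draws)
```

In the dissertation's check, `draws` would instead hold one thousand values of the UI M-statistic computed from standard normal samples.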
2.6. The Isotonic M-Estimator.

Our aim in this section is first to derive the isotonic M-estimator $\hat\Delta_n^*$ of $\Delta$ and then to derive its asymptotic distribution under both the null hypothesis and a local alternative. $\hat\Delta_n^*$ is the maximum likelihood estimate of $\Delta$ in the region $H_1: \Delta_1 \le \Delta_2 \le \cdots \le \Delta_K$, i.e., it is the vector in $H_1$ that maximizes the likelihood function

    L = \tilde K \cdot \exp\{-\tfrac{1}{2}(\hat\Delta_n - \Delta)' \nu^{-1} (\hat\Delta_n - \Delta)\},   (6.1)
where

    \nu = (\sigma_\psi^2/\gamma^2)\, \mathrm{Diag}(1/n_1, \ldots, 1/n_K)

is the covariance matrix of $(\hat\Delta_n - \Delta)$. Now, write $\Delta_i = \Delta_{i-1} + \beta_i$, $\beta_i \ge 0$ for $i = 1, \ldots, K$, and $\Delta_0 = 0$. In matrix form we have $\Delta = D\beta$, where $\beta = (\beta_1, \ldots, \beta_K)'$ and

    D (K \times K) = ( 1 0 0 \cdots 0 )
                     ( 1 1 0 \cdots 0 )
                     ( 1 1 1 \cdots 0 )
                     (    \vdots      )
                     ( 1 1 1 \cdots 1 ).   (6.2)

According to this formulation the maximization problem in (6.1) becomes that of maximizing

    L = \tilde K \cdot \exp\{-\tfrac{1}{2}(\hat\Delta_n - D\beta)' \nu^{-1} (\hat\Delta_n - D\beta)\}

subject to $\beta \ge 0$. This is equivalent to minimizing

    -2 \hat\Delta_n' \nu^{-1} D \beta + \beta' D' \nu^{-1} D \beta

subject to $\beta \ge 0$.
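Because $\nu$ is diagonal with entries proportional to $1/n_i$, this constrained minimization is a weighted least-squares isotonic regression of $\hat\Delta_n$ with weights $n_i$, computable by the pool-adjacent-violators algorithm reviewed in Chapter I. A minimal sketch (ours):

```python
def pava(y, w):
    """Pool-adjacent-violators: weighted least-squares fit of y subject
    to a nondecreasing order, with positive weights w."""
    vals, wts, counts = [], [], []
    for yi, wi in zip(y, w):
        vals.append(float(yi)); wts.append(float(wi)); counts.append(1)
        # pool backwards while the monotonicity constraint is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            pooled = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / (wts[-2] + wts[-1])
            wts[-2] += wts[-1]; counts[-2] += counts[-1]; vals[-2] = pooled
            vals.pop(); wts.pop(); counts.pop()
    fit = []
    for v, c in zip(vals, counts):
        fit.extend([v] * c)
    return fit
```

For example, `pava([3, 1, 2], [1, 1, 1])` pools all three values to their common mean, giving `[2.0, 2.0, 2.0]`.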
To solve this problem we follow the same approach as in Section 4. Let $J$ represent any of the $2^K$ subsets of $A = \{1, \ldots, K\}$, and $J'$ its complement. Adopt the partition in (4.12), and define the conditional values as in (4.13). Define the Lagrangian function in (4.11) to be

    L(\beta, t) = -2 \hat\Delta_n' \nu^{-1} D \beta + \beta' D' \nu^{-1} D \beta - t' \beta.

From (4.11) the point $(\beta^*, t^*)$ is a KTL point if it satisfies the system

    (i) t \ge 0,  (ii) \beta \ge 0,  (iii) t' \beta = 0,  and
    (iv) \partial L(\beta, t)/\partial \beta = -2 D' \nu^{-1} \hat\Delta_n + 2 D' \nu^{-1} D \beta - t = 0.   (6.3)

From (iv) in (6.3) it follows that

    t = 2 D' \nu^{-1} D \beta - 2 D' \nu^{-1} \hat\Delta_n.   (6.4)

From (i) and (ii) in (6.3), it follows that both vectors $t$ and $\beta$ are nonnegative, yet from (iii) in (6.3) their inner product is zero; this suggests that there exists a $J$, $\phi \subseteq J \subseteq A$, such that

    \beta_{(J)} > 0,\ t_{(J)} = 0  and  \beta_{(J')} = 0,\ t_{(J')} \ge 0.   (6.5)
Let $a = D^{-1} \nu (D^{-1})'$; note that $a$ is the covariance matrix of $D^{-1}\hat\Delta_n$, where

    D^{-1} = (  1  0  0 \cdots 0 )
             ( -1  1  0 \cdots 0 )
             (  0 -1  1 \cdots 0 )
             (      \vdots       )
             (  0  0  0 \cdots 1 ),

so that $D^{-1}\hat\Delta_n = (\hat\Delta_1, \hat\Delta_2 - \hat\Delta_1, \ldots, \hat\Delta_K - \hat\Delta_{K-1})' = \hat\beta$, say. Now partition

    \hat\beta = ( \hat\beta_{(J)} ; \hat\beta_{(J')} ),  say.

Then (6.5), combined with (6.4), can be written as

    t_{(J)} = 0  and  (\tfrac{1}{2}\, a\, t + \hat\beta)_{(J)} > 0,

which implies that

    \tfrac{1}{2} ( a_{(JJ)}  a_{(JJ')} ; a_{(J'J)}  a_{(J'J')} ) ( 0 ; t_{(J')} ) + ( \hat\beta_{(J)} ; \hat\beta_{(J')} ) \ge 0.

That is,

    (i) \tfrac{1}{2}\, a_{(JJ')}\, t_{(J')} + \hat\beta_{(J)} > 0,  and
    (ii) \tfrac{1}{2}\, a_{(J'J')}\, t_{(J')} + \hat\beta_{(J')} = 0.   (6.6)
From (ii) in (6.6), we obtain

    t_{(J')} = -2\, a_{(J'J')}^{-1}\, \hat\beta_{(J')},

but since $t \ge 0$, therefore

    a_{(J'J')}^{-1}\, \hat\beta_{(J')} \le 0.   (6.7)

From (i) in (6.6), and from (6.7), we obtain

    \hat\beta_{(J)} - a_{(JJ')}\, a_{(J'J')}^{-1}\, \hat\beta_{(J')} > 0,

i.e.,

    \hat\beta_{(J:J')} > 0.   (6.8)

Kudo (1963) has shown that the collection of all $2^K$ sets

    L(J) = \{\hat\beta \in R^K : \hat\beta_{(J:J')} > 0,\ a_{(J'J')}^{-1}\, \hat\beta_{(J')} \le 0\},

where $\phi \subseteq J \subseteq A$, is a disjoint and exhaustive partitioning of $R^K$. Therefore, (6.7) and (6.8) hold for some $\phi \subseteq J \subseteq A$. This allows us to take $(\beta^*, t^*)$ as a KTL point, where

    \beta^* = ( \hat\beta_{(J:J')} ; 0 )  and  t^* = ( 0 ; -2\, a_{(J'J')}^{-1}\, \hat\beta_{(J')} ).

Thus the isotonic M-estimator $\hat\Delta_n^*$ of $\Delta$ is

    \hat\Delta_n^* = D \beta^*,   (6.9)
where $D$ is defined in (6.2). Denote the partition of $D^{-1}$ by $D^*$, where

    D^{-1} = ( D^*_{(JJ)}  D^*_{(JJ')} ; D^*_{(J'J)}  D^*_{(J'J')} ).

Then the matrix $a$ can be written in terms of the blocks of $D^*$ and $\nu$, and $\hat\beta = D^{-1}\hat\Delta_n$ is partitioned accordingly. Therefore $\hat\Delta_n^*$ can be written as

    \hat\Delta_n^* = D \beta^* = ( D_{(JJ)}\, \hat\beta_{(J:J')} ; D_{(J'J)}\, \hat\beta_{(J:J')} ) = C^{(1)}_{(J)}\, \hat\Delta_n,  say,   (6.10)

where, with $C = a_{(JJ')}\, a_{(J'J')}^{-1}$, the matrix $C^{(1)}_{(J)}$ is

    C^{(1)}_{(J)} = ( D_{(JJ)}(D^*_{(JJ)} - C D^*_{(J'J)})    D_{(JJ)}(D^*_{(JJ')} - C D^*_{(J'J')})  ;
                      D_{(J'J)}(D^*_{(JJ)} - C D^*_{(J'J)})   D_{(J'J)}(D^*_{(JJ')} - C D^*_{(J'J')}) ).   (6.11)

Similarly, $a_{(J'J')}^{-1}\, \hat\beta_{(J')}$ can be written as $C^{(2)}_{(J)}\, \hat\Delta_n$, where

    C^{(2)}_{(J)} = a_{(J'J')}^{-1} ( D^*_{(J'J)}  D^*_{(J'J')} ).   (6.12)
From (6.10)-(6.12), it follows that

    P(\hat\Delta_n^* \le X) = \sum_{\phi \subseteq J \subseteq A} P\{C^{(1)}_{(J)}\, \hat\Delta_n \le X,\ \hat\beta_{(J:J')} > 0,\ C^{(2)}_{(J)}\, \hat\Delta_n \le 0\}.

That is,

    \lim_{N\to\infty} P(\sqrt{N}(\hat\Delta_n^* - \Delta) \le X)
      = \sum_{\phi \subseteq J \subseteq A} P\{\sqrt{N}\, C^{(1)}_{(J)}(\hat\Delta_n - \Delta) \le X,\ \hat\beta_{(J:J')} > 0,\ C^{(2)}_{(J)}\, \hat\Delta_n \le 0\}.

Define the region $C_{(J)}$ by

    C_{(J)} = \{t \in R^K : (D^{-1}t)_{(J:J')} > 0,\ a_{(J'J')}^{-1}(D^{-1}t)_{(J')} \le 0\}.   (6.13)

Under $H_0$ in (2.2), and from (3.7), it follows that the asymptotic null distribution of the isotonic M-estimator $\hat\Delta_n^*$ is given by

    \lim_{N\to\infty} P_0(\sqrt{N}(\hat\Delta_n^* - \Delta) \le X)
      = \sum_{\phi \subseteq J \subseteq A} \int \cdots \int_{\{t \in C_{(J)}:\ C^{(1)}_{(J)} t \le X\}} dG_K(t; 0, \Sigma_0),   (6.14)

where $G_K(X; \mu, \Sigma)$ is the K-variate normal distribution function with mean $\mu$ and dispersion matrix $\Sigma$, $\Sigma_0$ is the asymptotic dispersion of $\sqrt{N}(\hat\Delta_n - \Delta)$, and $C_{(J)}$ is defined in (6.13). Under the local alternative $K_N$ in (3.1), and from (3.10), it follows that the asymptotic nonnull distribution of $\hat\Delta_n^*$ is given by the same expression with a nonnull mean vector, where $C_{(J)}$ and $G_K(X; \mu, \Sigma)$ are defined as in (6.14).
2.7. The Preliminary Test Estimator (PTE).

Our goal in this section is to define the PTE, focus on its properties, and derive its asymptotic distribution under the null hypothesis and a local alternative. We define the proposed PTE by

    \hat\Delta_n^{PT} = \bar\Delta_N \mathbf{1}  if  M_N^U < M_{N,\alpha};
    \hat\Delta_n^{PT} = \hat\Delta_n^*           if  M_N^U \ge M_{N,\alpha},   (7.1)

where $M_{N,\alpha}$ represents the upper $100\alpha\%$ point of the distribution function of $M_N^U$.
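The switching rule in (7.1) is simple to express in code. A hedged sketch (function name ours; the acceptance branch returns the pooled common value, following our reading of (7.1)):

```python
import numpy as np

def preliminary_test_estimate(pooled, isotonic, stat, critical_value):
    """PTE: the pooled (null) estimate when the preliminary UI test
    accepts H0, the isotonic estimate when it rejects."""
    isotonic = np.asarray(isotonic, dtype=float)
    if stat < critical_value:
        return np.full_like(isotonic, pooled)
    return isotonic
```

The estimator therefore inherits the randomness of the preliminary test, which is exactly what makes its exact distribution awkward below.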
Under $H_0$ in (2.2), note that

    P(\sqrt{N}(\hat\Delta_n^* - \Delta\mathbf{1}) \le X,\ M_N^U > M_{N,\alpha})
      = \sum_{\phi \subseteq J \subseteq A} P\{\sqrt{N}(\hat\Delta_n^* - \Delta\mathbf{1}) \le X,\
        \bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')} > N M_{N,\alpha},\
        \bar M_{(J:J')} > 0,\ (A^*_{(J'J')})^{-1}\bar M_{(J')} < 0\}.   (7.2)
Next, we will write (7.2) in terms of the vector $M_n(\Delta)$. First, $\bar M_{(J:J')}$ may be written as

    \bar M_{(J:J')} = m_{(J)}\, e\, M_n(\Delta),

where $e$ is the $(K-1) \times K$ matrix that, by the relation in (4.5), carries $M_n(\Delta)$ into $\bar M$; its row corresponding to $\bar M_i$ ($i = 2, \ldots, K$) has $j$-th element $I(j \ge i) - \sum_{l=i}^{K}\lambda_l$, i.e.,

    e = ( -\sum_{j=2}^{K}\lambda_j   1-\sum_{j=2}^{K}\lambda_j   \cdots   1-\sum_{j=2}^{K}\lambda_j )
        ( -\sum_{j=3}^{K}\lambda_j   -\sum_{j=3}^{K}\lambda_j    \cdots   1-\sum_{j=3}^{K}\lambda_j )
        (                              \vdots                                                      )
        ( -\lambda_K                 -\lambda_K                  \cdots   1-\lambda_K              ),   (7.3)

$c_{(J)}$ and $c_{(J')}$ are the matrices that pick out the coordinates in $J$ and $J'$ (so that $c_{(J)}\bar M = \bar M_{(J)}$ and $c_{(J')}\bar M = \bar M_{(J')}$), and $m_{(J)}$ is the matrix for which $m_{(J)}(e\, M_n(\Delta)) = \bar M_{(J:J')}$. Thus, it follows that:

(1) $\bar M_{(J:J')}'(A^*_{(JJ:J')})^{-1}\bar M_{(J:J')}$ can be written as $M_n'(\Delta)\, Q_{(J)}\, M_n(\Delta)$, where $Q_{(J)} = (m_{(J)} e)'(A^*_{(JJ:J')})^{-1}(m_{(J)} e)$ is of rank $K(J)$.

(2) $\bar M_{(J:J')} > 0$ can be written as $m_{(J)}\, e\, M_n(\Delta) > 0$.

(3) $(A^*_{(J'J')})^{-1}\bar M_{(J')} < 0$ can be written as $(A^*_{(J'J')})^{-1}\, c_{(J')}\, e\, M_n(\Delta) < 0$.

(4) From the linearity result in (3.9), it follows that

    \sqrt{N}(\hat\Delta_n^* - \Delta\mathbf{1}) = \frac{1}{\sqrt{N}\,\gamma}\, C^{(1)}_{(J)}\, A^{-1}\, M_n(\Delta) + o_p(1),

where $A$ was defined in (3.7).

From (1) to (4) above, (7.2) can be written as

    \sum_{\phi \subseteq J \subseteq A} P\Big\{\frac{1}{\sqrt{N}\,\gamma}\, C^{(1)}_{(J)}\, A^{-1}\, M_n(\Delta) \le X,\
    M_n'(\Delta)\, Q_{(J)}\, M_n(\Delta) > N M_{N,\alpha},\ m_{(J)}\, e\, M_n(\Delta) > 0,\
    (A^*_{(J'J')})^{-1}\, c_{(J')}\, e\, M_n(\Delta) < 0\Big\}.

Note that, from (2.10) and the independence among the samples, it follows that under $H_0$,

    N^{-1/2}\, M_n(\Delta) \sim N_K(0, \sigma_\psi^2 A),

where $A$ was defined in (3.7). Denote by $B_{(J)}$ the region

    B_{(J)} = \{t : \gamma^{-1}\, C^{(1)}_{(J)}\, A^{-1}\, t \le X,\ t'\, Q_{(J)}\, t > M_{N,\alpha},\
    m_{(J)}\, e\, t > 0,\ (A^*_{(J'J')})^{-1}\, c_{(J')}\, e\, t < 0\}.   (7.4)
From (4.5) and under $H_0$, it follows that

    \sqrt{N}(\bar\Delta_N - \Delta) \sim N\Big(0, \frac{\sigma_\psi^2}{\gamma^2}\Big).

Thus, from (7.1), the asymptotic null distribution of the PTE can be written as

    \lim_{N\to\infty} P_0(\sqrt{N}(\hat\Delta_n^{PT} - \Delta\mathbf{1}) \le X)
      = P_0\{\sqrt{N}(\bar\Delta_N - \Delta)\mathbf{1} \le X,\ M_N^U < M_{N,\alpha}\}
      + \sum_{\phi \subseteq J \subseteq A} \int \cdots \int_{B_{(J)}} dG_K(t; 0, \sigma_\psi^2 A),

where $B_{(J)}$ and $G_K(X; \mu, \Sigma)$ were defined in (7.4) and (6.14), respectively.
Under $\{K_N\}$,

    M_{ni}(\Delta_i) = \sum_{j=1}^{n_i} \Psi\Big(X_{ij} - \Delta_i + \frac{\theta_i}{\sqrt{N}}\Big)
                     = \sum_{j=1}^{n_i} \Psi(X_{ij} - \Delta_i) + n_i\,\gamma\,\frac{\theta_i}{\sqrt{N}} + o_p\Big(\frac{n_i}{\sqrt{N}}\Big).

That is, under $\{K_N\}$, $N^{-1/2} M_n(\Delta)$ behaves asymptotically like $N^{-1/2}(M_n(\Delta) + \eta)$ under $H_0$, where $\eta = (\eta_1, \ldots, \eta_K)'$ with $\eta_i = \lambda_i \gamma \theta_i$. In other words, $N^{-1/2} M_n(\Delta) \sim N_K(\eta, \sigma_\psi^2 A)$ under $\{K_N\}$.
Note that,

    (1)  m_{(J)}\, e\, M_n(\Delta) > 0  \Rightarrow  m_{(J)}\, e\,(M_n(\Delta) + \eta) > m_{(J)}\, e\, \eta,

    (2)  (A^*_{(J'J')})^{-1}\, c_{(J')}\, e\, M_n(\Delta) < 0  \Rightarrow
         (A^*_{(J'J')})^{-1}\, c_{(J')}\, e\,(M_n(\Delta) + \eta) < (A^*_{(J'J')})^{-1}\, c_{(J')}\, e\, \eta,

    (3)  M_n'(\Delta)\, Q_{(J)}\, M_n(\Delta) > N M_{N,\alpha}  \Rightarrow
         (M_n(\Delta) + \eta)'\, Q_{(J)}\, (M_n(\Delta) + \eta) > N M_{N,\alpha}.

Let $C_{(J)}$ be a subset of the K-dimensional space such that

    C_{(J)} = \{t : t'\, Q_{(J)}\, t \le M_{N,\alpha},\ m_{(J)}\, e\, t > 0,\
    (A^*_{(J'J')})^{-1}\, c_{(J')}\, e\, t < 0\},   (7.5)

and let $\bar C_{(J)}$ be its complement; then the asymptotic distribution of $M_N^U$ under $\{K_N\}$ can be written as a sum over $\phi \subseteq J \subseteq A$ of integrals over these regions, and
    \sum_{\phi \subseteq J \subseteq A} \int \cdots \int_{\bar C_{(J)}} dG_K(t; \mu, \sigma_\psi^2 A),

where $\bar C_{(J)}$ represents the intersection of the region under the hyperplane $\gamma^{-1}\, C^{(1)}_{(J)}\, A^{-1}\, t \le X - \mu$ and the outside of the ellipsoid $C_{(J)}$, i.e.,

    \bar C_{(J)} = \{t : \gamma^{-1}\, C^{(1)}_{(J)}\, A^{-1}\, t \le X - \mu,\ t \notin C_{(J)}\}.   (7.6)
Under $K_N$ in (3.1),

    \sqrt{N}(\bar\Delta_N - \Delta) = \frac{1}{\sqrt{N}\,\gamma} \sum_{i=1}^{K} M_{ni}\Big(\Delta_i - \frac{\theta_i}{\sqrt{N}}\Big) + o_p(1).

Thus the asymptotic nonnull distribution of the PTE is

    \lim_{N\to\infty} P(\sqrt{N}(\hat\Delta_n^{PT} - \Delta\mathbf{1}) \le X)
      = \sum_{\phi \subseteq J \subseteq A} \int \cdots \int_{C_{(J)}} dG_K(t; \mu, \sigma_\psi^2 A)
      + \sum_{\phi \subseteq J \subseteq A} \int \cdots \int_{\bar C_{(J)}} dG_K(t; \mu, \sigma_\psi^2 A).

Unfortunately, the regions $C_{(J)}$ and $\bar C_{(J)}$ remain in the awkward form given in (7.5) and (7.6). Even for the very simple case $K = 3$, it is very difficult to express them in a simple closed form.
As is usually the case with PTEs, $\hat\Delta_n^{PT}$ is generally not an unbiased estimator of $\Delta$, and the same goes for the isotonic M-estimator $\hat\Delta_n^*$. We had hoped to show that the bias of the PTE is less than that of the isotonic M-estimator, and that the PTE $\hat\Delta_n^{PT}$ is asymptotically more efficient than the isotonic M-estimator. But, as we mentioned before, it is very difficult to obtain a simple closed form for the asymptotic bias and dispersion matrix of either estimator. Therefore, we will try to overcome this problem in a subsequent chapter by simulation. This explains why very little has been devoted to this problem in the literature.
2.8. Another Approach.

The UI M-statistic $M_N^U$ can also be developed using the estimator directly instead of the aligned M-statistic. Although the new procedure is more complicated, as it involves the estimation of $\gamma$, it is more powerful, especially as we move away from the null hypothesis in the direction of the alternative hypothesis. This will be illustrated further by a numerical example in Chapter 5.
CHAPTER III
PRELIMINARY TEST ESTIMATOR IN THE MULTIVARIATE
ONE SAMPLE PROBLEM

3.1. Introduction

In this chapter we extend the UI M-test that was developed in Chapter II to test against an orthant alternative in the multivariate one-sample problem. Our main interest is in estimating the location vector $\theta$ after a preliminary test of

    H_0: \{\theta = 0\}  against  \{H_1: \theta > 0\}.   (1.1)
In Section 3.2 we formulate the problem, define the M-estimate of $\theta$, and state the main assumptions that will be used in the subsequent sections.

In Section 3.3 we derive the asymptotic null and nonnull distributions of the M-estimator and demonstrate the asymptotic linearity of $M(\theta)$ in the neighborhood of the true value of the parameter $\theta$.

We derive the union-intersection M-statistic for testing the hypotheses in (1.1) in Section 3.4 by application of Roy's (1957) union-intersection principle. In Section 3.5 we derive its asymptotic distribution under the null hypothesis in (1.1).

In Section 3.6 we develop the isotonic M-estimator $\tilde\theta$ of $\theta$ by maximizing the likelihood function of the M-estimator $\hat\theta$ of $\theta$ in the region $H_1: \theta > 0$. In the same section we derive the asymptotic null and nonnull distributions of $\tilde\theta$. Finally, in Section 3.7 we define the preliminary test estimator (PTE) and derive its asymptotic distribution both under the null hypothesis and under a sequence of local alternatives.
3.2 Preliminary Notions

First, we will introduce the problem and the M-estimate, then summarize the basic assumptions required for the subsequent sections.

Let $X_1, \ldots, X_N$; $X_i = (X_{1i}, \ldots, X_{Ki})'$, be independent and identically distributed random vectors, distributed according to an absolutely continuous K-variate cumulative distribution function $F_\theta$, defined on the Euclidean space $R^K$ and diagonally symmetric about $\theta$. Thus,

    F_\theta(X) = F(X - \theta),  X \in R^K,   (2.1)

and $\theta = (\theta_1, \ldots, \theta_K)'$ is the vector of location parameters.

Our primary interest is in finding isotonic M-estimates of $\theta$ following a preliminary test of $\{H_0: \theta = 0\}$ against the restricted class of alternatives

    H_1: \theta \in \Omega,\ \theta \ne 0,   (2.2)

where $\Omega = R^{+K} = [0, \infty)^K$. If we reject $H_0$, the isotonic M-estimator $\tilde\theta_N$ will be used to estimate $\theta$; otherwise a null vector will be used as the estimate of $\theta$.
Define the score functions $\Psi_1, \ldots, \Psi_K$ with $\Psi_j = \{\Psi_j(x);\ x \in R\}$, such that

    \Psi_j(x) = \Psi_{j1}(x) + \Psi_{j2}(x),  x \in R,\ j = 1, \ldots, K,   (2.3)

where the $\Psi_{jl}$, $l = 1, 2$, are both nondecreasing and skew-symmetric; $\Psi_{j1}$ is absolutely continuous on any bounded interval in $R$, and $\Psi_{j2}$ is a step function having finitely many jumps. Let $E_l = (a_l, a_{l+1})$, $l = 0, \ldots, p$, for some $p \ge 0$, $-\infty = a_0 < a_1 < \cdots < a_p < a_{p+1} = \infty$, such that

    \Psi_{j2}(x) = \alpha_l  for  x \in E_l,   (2.4)

where the $\alpha_l$ are real and finite numbers, not all equal. Conventionally, let $\Psi_j(x) = \tfrac{1}{2}[\Psi_j(x+0) + \Psi_j(x-0)]$ at the jump points.

Let

    0 < \sigma_j^2 = \int_R \Psi_j^2(x)\, dF_{[j]}(x) < \infty,  1 \le j \le K,

where $F_{[j]}$ is the j-th marginal of $F$. And let

    \sigma_{jl} = \int\int_{R^2} \Psi_j(x)\, \Psi_l(y)\, dF_{[j,l]}(x, y),

where $F_{[j,l]}$ is the (j,l)-th bivariate marginal of $F$. Define

    \Sigma = ((\sigma_{jl})),  j = 1, \ldots, K,\ l = 1, \ldots, K.   (2.5)
For every real $t$ and $N \ge 1$, define

    M_j(t) = \frac{1}{N} \sum_{i=1}^{N} \Psi_j(X_{ji} - t),  j = 1, \ldots, K.   (2.6)

Note that for every $j$, $M_j(t)$ is monotone in $t$, $t \in R$, so we may define the vector of M-estimators $\hat\theta$ of $\theta$ by $\hat\theta = (\hat\theta_1, \ldots, \hat\theta_K)'$, where

    \hat\theta_j = \tfrac{1}{2}\{\sup(t : M_j(t) > 0) + \inf(t : M_j(t) < 0)\};   (2.7)

i.e., the M-estimator $\hat\theta_j$ of $\theta_j$ is the solution of

    M_j(t) = 0.   (2.8)
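Since each $M_j(t)$ is monotone nonincreasing in $t$, the root in (2.8) can be found by bisection. A minimal sketch (ours), using a Huber score as one concrete $\Psi_j$ with an absolutely continuous part and no jumps:

```python
def huber_psi(x, k=1.345):
    """Huber score: linear near zero, clipped at +/- k."""
    return max(-k, min(k, x))

def m_estimate(xs, k=1.345, tol=1e-10):
    """Location M-estimate: root of M(t) = (1/N) sum_i psi(x_i - t),
    found by bisection (M is nonincreasing in t)."""
    def M(t):
        return sum(huber_psi(x - t, k) for x in xs) / len(xs)
    lo, hi = min(xs) - 1.0, max(xs) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if M(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For symmetric samples the estimate agrees with the sample center; for contaminated samples the clipping bounds each observation's influence.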
By the symmetry of the $F_{[j]}$ and the skew-symmetry of the $\Psi_j$, we have

    \bar\Psi_j = \int_R \Psi_j(x)\, dF_{[j]}(x) = 0  for  j = 1, \ldots, K.   (2.9)

Thus, by (2.5), (2.8), and (2.9) we obtain

    E[\sqrt{N}\, M(\theta)] = 0  and  E(N\, M(\theta) M'(\theta)) = \Sigma,   (2.10)

where $\Sigma$ is defined in (2.5). By the multivariate central limit theorem, it follows that

    \sqrt{N}\, M(\theta) \sim N(0, \Sigma).

Note that by (2.7), and the assumptions on $F_{[j]}$ and $\Psi_j$ made above, $\hat\theta_j$ has a distribution symmetric about $\theta_j$.

Furthermore, assume that the marginal probability density functions $f_{[j]}$, $j = 1, \ldots, K$, are all absolutely continuous with finite Fisher information, and that

    0 < \gamma_j = -\int_R \Psi_j(x)\, f_{[j]}'(x)\, dx < \infty.   (2.11)

By (2.3), (2.4), and (2.11), for $f_{[j]}(a_l) > 0$, $l = 1, \ldots, p$, we may write $\gamma_j$ as

    \gamma_j = \int_R \Psi_{j1}'(x)\, dF_{[j]}(x) + \sum_{l=1}^{p} \alpha_l [f_{[j]}(a_l) - f_{[j]}(a_{l-1})] > 0.

Finally, assume that

    \int_R (\Psi_{j1}'(x))^2\, dF_{[j]}(x) < \infty,  j = 1, \ldots, K.
3.3. Asymptotic Distribution of the M-Estimator

In this section we want to ascertain the asymptotic normality of $\hat\theta$ under both the null hypothesis and a local alternative.
Consider a sequence $\{K_N\}$ of local alternatives, where under $K_N$,

    \theta = \frac{\lambda}{\sqrt{N}}.   (3.1)

Since $\lambda'\lambda/N \to 0$ as $N \to \infty$, if we take $\sigma^2 = \lambda' I(f) \lambda$, it follows that conditions VII (1.3.4) and (1.3.5) in Hajek and Sidak (1967) hold for the sequence of hypotheses

    H_0: \theta = 0  against  K_N: \theta = \frac{\lambda}{\sqrt{N}}.

Thus, $K_N$ is a contiguous alternative to $H_0$.

Next, we will extend the linearity result we considered in Chapter 2 to the multivariate case. Note that the multivariate result ($K > 1$) will follow by applying the univariate result ($K = 1$) to each coordinate separately. That is, we will show that for every $j = 1, \ldots, K$,

    \sup_{|t| \le T} \sqrt{N}\, \Big| M_j\Big(\theta_j + \frac{t}{\sqrt{N}}\Big) - M_j(\theta_j) + \frac{t}{\sqrt{N}}\, \gamma_j \Big| \to_P 0.

Now recall that, in the notation of Chapter 2, $\sqrt{N} M_j(\theta_j)$, $\sqrt{N} M_j(\theta_j + t/\sqrt{N})$, and $\gamma_j$ in this chapter correspond to $N^{-1/2} M_{ni}(\Delta_i)$, $N^{-1/2} M_{ni}(\Delta_i + t/\sqrt{N})$, and $\gamma$ there. With this in mind, and by following the same approach as in Section 2.3, the linearity result follows. In matrix notation, this can be written as

    \sup_{\sqrt{N} |\tilde\theta - \theta| \le T} \sqrt{N}\, | M(\tilde\theta) - M(\theta) + \Gamma(\tilde\theta - \theta) | \to_P 0,   (3.2)

where $\Gamma = \mathrm{Diag}(\gamma_1, \ldots, \gamma_K)$.
In the remainder of this section we will establish the asymptotic normality of the classical M-estimator, first under the null hypothesis, then under the local alternative $K_N$ of (3.1). By (2.10) and (3.2), for large $N$, the asymptotic distribution of $\hat\theta$ under the null hypothesis is given by

    \sqrt{N}(\hat\theta - \theta) \sim N(0, \Gamma^{-1} \Sigma\, \Gamma^{-1}),   (3.3)

where $\Sigma$ and $\Gamma$ were defined in (2.5) and (3.2). Under the contiguity established before, the linearity result in (3.2) remains true under $K_N$ in (3.1). Note that under $K_N$, (2.1) can be written as $F_\theta(X) = F(X - \lambda/\sqrt{N})$, and thus under $K_N$, as $N \to \infty$, the asymptotic distribution of $\hat\theta$ is given by

    \sqrt{N}\, \hat\theta \sim N(\lambda, \Gamma^{-1} \Sigma\, \Gamma^{-1}).
3.4. The Union-Intersection M-Statistics

Assume that the parameter space $\Omega$ is positively homogeneous, and for each $\theta \in \Omega$ with $\theta \ne 0$ let $\theta = \delta a$, where $a$ is a K-vector of nonnegative elements and $\delta$ is a nonnegative scalar. For a fixed $a \in R^K$ construct the parameter subspace

    \Omega(a) = \{\theta \in R^K : \theta = \delta a,\ \delta > 0\}.   (4.1)

For a fixed $a$ the null hypothesis and the alternative in (1.1) can be written as $H_0: \delta = 0$ against $H_1: \delta > 0$, and the likelihood function of $\hat\theta_1, \ldots, \hat\theta_K$ can be written as

    L = \tilde K \cdot \exp\Big\{-\frac{N}{2} [(\hat\theta - \delta a)'\, \Gamma \Sigma^{-1} \Gamma\, (\hat\theta - \delta a)]\Big\}.

Under $H_0$ the MLE of $\delta$ is equal to zero, and since $\Gamma$ and $\Sigma$ are known,

    \max_{H_0} L = \tilde K \exp\Big\{-\frac{N}{2} (\hat\theta'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta)\Big\}.

Under the alternative, $L$ can be written as above with $\delta > 0$. Differentiate with respect to $\delta$ subject to the constraint $a'\, \Gamma \Sigma^{-1} \Gamma\, a = 1$. It follows that the MLE $\hat\delta$ of $\delta$ is given by

    \hat\delta(a) = a'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta  if  a'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta > 0;
                  = 0  otherwise.   (4.2)

Therefore,

    \sup_{H_1} L = \tilde K \cdot \exp\Big[-\frac{N}{2}\{\hat\theta'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta - (a'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta)^2\}\Big]  if  \hat\delta > 0;
                 = \tilde K \cdot \exp\Big[-\frac{N}{2}(\hat\theta'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta)\Big]  if  \hat\delta = 0.

Thus,

    \lambda = \frac{\sup_{H_0} L}{\sup_{H_1} L}
            = \exp\Big\{-\frac{N}{2}(a'\, \Gamma \Sigma^{-1} \Gamma\, \hat\theta)^2\Big\}  if  \hat\delta > 0;
            = 1  if  \hat\delta = 0.

That is,

    L_N(a) = -2 \log \lambda = (\sqrt{N}\, \hat\delta(a))^2\, I(\hat\delta(a) > 0).

The likelihood ratio test rejects $H_0$ in favor of $H_1$ for large values of $L_N(a)$. For ease of notation, define

    \nu = \Gamma \Sigma^{-1} \Gamma  and  U = \nu\, \hat\theta.   (4.3)

Thus,

    L_N(a) = N (a' U)^2\, I(a' U > 0).

Note that $E(\sqrt{N}\, a' U \mid H_0) = 0$ and $E(\sqrt{N}\, a' U \mid H_0)^2 = a' \nu a$, i.e.,

    \frac{\sqrt{N}\, a' U}{(a' \nu a)^{1/2}} \sim N(0, 1).
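The statistic $L_N(a)$ in the last display can be evaluated directly. A minimal sketch (ours), normalizing $a$ so that $a' \nu a = 1$:

```python
import numpy as np

def L_N(a, theta_hat, Gamma, Sigma, N):
    """L_N(a) = N (a'U)^2 I(a'U > 0), with U = Gamma Sigma^{-1} Gamma theta_hat."""
    nu = Gamma @ np.linalg.inv(Sigma) @ Gamma
    a = np.asarray(a, dtype=float)
    a = a / np.sqrt(a @ nu @ a)        # enforce a' nu a = 1
    s = a @ (nu @ theta_hat)
    return N * s * s if s > 0 else 0.0
```

The UI statistic of the next display is then the supremum of this quantity over the constraint set $A$.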
Define the set

    A = \{a \in R^{+K} : \nu a \in \Omega,\ a' \nu a = 1\}.   (4.4)

Due to the positive homogeneity of $\Omega$, and lines (4.1) and (4.4), we conclude that $\Omega \subseteq \bigcup_{a \in A} \Omega(a)$, and we form our problem as: reject $\{H_0: \theta = 0\}$ in favor of $\{H_1: \theta \in \Omega\}$ if

    L_N^U = \sup_{a \in A} L_N(a)

is sufficiently large. Now, $\sup_{a \in A}(\sqrt{N}\, a' U)\, I(a' U > 0)$ is equivalent to $-\inf_{a \in A}(-\sqrt{N}\, a' U)$ over the set $A$ (the nonnegative orthant).
Following the same approach and notation as in Chapter 2, let $J$ represent any of the $2^K$ subsets of $A = \{1, \ldots, K\}$, and $J'$ its complement. For each $J$ partition and rearrange $U$ and $\nu$ as

    U = ( U_{(J)} ; U_{(J')} )  and  \nu = ( \nu_{(JJ)}  \nu_{(JJ')} ; \nu_{(J'J)}  \nu_{(J'J')} ).   (4.5)

For each $\phi \subseteq J \subseteq A$, define the conditional values

    U_{(J:J')} = U_{(J)} - \nu_{(JJ')}\, \nu_{(J'J')}^{-1}\, U_{(J')}  and
    \nu_{(JJ:J')} = \nu_{(JJ)} - \nu_{(JJ')}\, \nu_{(J'J')}^{-1}\, \nu_{(J'J)}.   (4.6)

For $J = \phi$ define $U_{(J:J')} = 0$, and for $J = A$ define $U_{(J:J')} = U$ and $\nu_{(JJ:J')} = \nu$.

Define the functions

    h(a) = -a' U,   h_1(a) = -\nu a,   h_2(a) = a' \nu a - 1,

and the Lagrangian in (2.4.11) as

    L(a, t_1, t_2) = -a' U - t_1' \nu a + t_2 (a' \nu a - 1).

From (2.4.11) the point $(a^*, t_1^*, t_2^*)$ is a KTL point if it satisfies the system

    (i) t_1 \ge 0,  (ii) \nu a \ge 0,  (iii) a' \nu a = 1,
    (iv) t_1' \nu a = 0,  (v) -U - \nu t_1 + 2 t_2 \nu a = 0.   (4.7)
Line (v) of (4.7) implies that

    2 t_2 a = \nu^{-1} U + t_1,  that is,
    a = \frac{1}{2 t_2} [\nu^{-1} U + t_1],   (4.8)

and line (iii) of (4.7) implies that

    4 t_2^2 = (U' + t_1' \nu)(\nu^{-1} U + t_1)
            = U' \nu^{-1} U + U' t_1 + t_1' U + t_1' \nu t_1
            = U' \nu^{-1} U + 2 t_1' U + t_1' \nu t_1,   (4.9)

but line (iv) in (4.7) together with (4.8) imply that

    \frac{1}{2 t_2}\, t_1' \nu (\nu^{-1} U + t_1) = 0,  i.e.,
    t_1'(U + \nu t_1) = 0,

which in turn implies that

    t_1' U = -t_1' \nu t_1.   (4.10)

From (4.9) and (4.10), we take

    t_2 = \frac{1}{2} [U' \nu^{-1} U + 2 t_1' U + t_1' \nu t_1]^{1/2}
        = \frac{1}{2} [U' \nu^{-1} U - t_1' \nu t_1]^{1/2}.

Note that from (4.7) (i), (ii),
    t_1 \ge 0  and  \frac{1}{2 t_2}(U + \nu t_1) \ge 0;

that is, the vectors $t_1$ and $\frac{1}{2 t_2}(U + \nu t_1)$ are both nonnegative, but from line (iv) in (4.7), their inner product is zero, i.e., $(2 t_2)^{-1} t_1'(U + \nu t_1) = 0$. Since $t_2 > 0$, the only way this is possible is if for some $J$, $\phi \subseteq J \subseteq A$, $(U + \nu t_1)_{(J)} > 0$ and $t_{1(J)} = 0$. This leads to

    U_{(J)} + \nu_{(JJ')}\, t_{1(J')} > 0  and  U_{(J')} + \nu_{(J'J')}\, t_{1(J')} = 0,   (4.11)

which imply that

    t_{1(J')} = -\nu_{(J'J')}^{-1}\, U_{(J')},   (4.12)

and from (i) in (4.7), it follows that $t_{1(J')} \ge 0$, but (4.11) and (4.12) imply that

    U_{(J)} - \nu_{(JJ')}\, \nu_{(J'J')}^{-1}\, U_{(J')} > 0,

which in turn implies that

    U_{(J:J')} > 0.

The use of Kudo's result, as in Chapter 2, allows us to take the KTL point $(a^*, t_1^*, t_2^*)$ with

    t_{1(J)}^* = 0,  t_{1(J')}^* = -\nu_{(J'J')}^{-1}\, U_{(J')},  a^* = \frac{1}{2 t_2^*} [\nu^{-1} U + t_1^*].

Now,
    2 t_2^*\, a^{*'} U = U' \nu^{-1} U + t_1^{*'} U
                       = U' \nu^{-1} U - U_{(J')}'\, \nu_{(J'J')}^{-1}\, U_{(J')}.

Therefore,

    a^{*'} U = \frac{1}{2 t_2^*} [U' \nu^{-1} U - U_{(J')}'\, \nu_{(J'J')}^{-1}\, U_{(J')}]
             = [U' \nu^{-1} U - U_{(J')}'\, \nu_{(J'J')}^{-1}\, U_{(J')}]^{1/2}.   (4.13)

But

    U' \nu^{-1} U = U_{(J')}'\, \nu_{(J'J')}^{-1}\, U_{(J')} + U_{(J:J')}'\, \nu_{(JJ:J')}^{-1}\, U_{(J:J')},

so line (4.13) can be written as

    a^{*'} U = [U_{(J:J')}'\, \nu_{(JJ:J')}^{-1}\, U_{(J:J')}]^{1/2}.

Thus, the UI statistic for the orthant alternative problem is

    L_N^U = N \sum_{\phi \subseteq J \subseteq A} U_{(J:J')}'\, \nu_{(JJ:J')}^{-1}\, U_{(J:J')}\, I(U_{(J:J')} > 0)\, I(\nu_{(J'J')}^{-1}\, U_{(J')} < 0),

where $I(A)$ represents the indicator function for the set $A$. From (4.3) and the partition in (4.5), it follows that
    \nu = ( \Gamma_{(JJ)}  0 ; 0  \Gamma_{(J'J')} )
          ( \Sigma^{(JJ)}  \Sigma^{(JJ')} ; \Sigma^{(J'J)}  \Sigma^{(J'J')} )
          ( \Gamma_{(JJ)}  0 ; 0  \Gamma_{(J'J')} ),   (4.14)

where $\Sigma^{(\cdot)}$ denotes the corresponding block of $\Sigma^{-1}$, so that

    \nu_{(JJ)} = \Gamma_{(JJ)}\, \Sigma^{(JJ)}\, \Gamma_{(JJ)},
    \nu_{(JJ')} = \Gamma_{(JJ)}\, \Sigma^{(JJ')}\, \Gamma_{(J'J')},
    \nu_{(J'J)} = \Gamma_{(J'J')}\, \Sigma^{(J'J)}\, \Gamma_{(JJ)},
    \nu_{(J'J')} = \Gamma_{(J'J')}\, \Sigma^{(J'J')}\, \Gamma_{(J'J')}.   (4.15)

From (4.6) and (4.15), together with the standard identities for the blocks of the inverse of a partitioned matrix, it follows that

    \nu_{(JJ:J')} = \Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, \Gamma_{(JJ)}.   (4.16)

From (4.3), we obtain

    U_{(J)} = \nu_{(JJ)}\, \hat\theta_{(J)} + \nu_{(JJ')}\, \hat\theta_{(J')},  and
    U_{(J')} = \nu_{(J'J)}\, \hat\theta_{(J)} + \nu_{(J'J')}\, \hat\theta_{(J')}.

Therefore, from (4.6), it follows that

    U_{(J:J')} = U_{(J)} - \nu_{(JJ')}\, \nu_{(J'J')}^{-1}\, U_{(J')}
               = (\nu_{(JJ)} - \nu_{(JJ')}\, \nu_{(J'J')}^{-1}\, \nu_{(J'J)})\, \hat\theta_{(J)}
               = \nu_{(JJ:J')}\, \hat\theta_{(J)}.
From (4.16), it follows that the quadratic form $U_{(J:J')}'\, \nu_{(JJ:J')}^{-1}\, U_{(J:J')}$ can be written as

    \hat\theta_{(J)}'\, \Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, \Gamma_{(JJ)}\, \hat\theta_{(J)}.

Also,

    \nu_{(J'J')}^{-1}\, U_{(J')} = \nu_{(J'J')}^{-1}\, \nu_{(J'J)}\, \hat\theta_{(J)} + \hat\theta_{(J')}
        = \hat\theta_{(J')} - \Gamma_{(J'J')}^{-1}\, \Sigma_{(J'J)}\, \Sigma_{(JJ)}^{-1}\, \Gamma_{(JJ)}\, \hat\theta_{(J)}
        = \hat\theta_{(J':J)},  say.

Thus, the UI statistic in terms of $\hat\theta$ can be written as

    L_N^U = N \sum_{\phi \subseteq J \subseteq A} \hat\theta_{(J)}'\, \Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, \Gamma_{(JJ)}\, \hat\theta_{(J)}\,
            I(\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, \Gamma_{(JJ)}\, \hat\theta_{(J)} > 0)\, I(\hat\theta_{(J':J)} < 0).

From the linearity result in (3.2) and under the null hypothesis in (1.1), the UI statistic $L_N^U$ can be written as

    L_N^U = N \sum_{\phi \subseteq J \subseteq A} \{M_{(J)}'(0)\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0)\}\,
            I(\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) > 0)\, I(\Gamma_{(J'J')}^{-1}\, M_{(J':J)}(0) < 0).

3.5. The Asymptotic Distribution of $L_N^U$
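The expression for $L_N^U$ in terms of $\hat\theta$ can be computed by scanning the $2^K$ subsets, much as in Chapter 2. A minimal sketch (ours):

```python
import numpy as np
from itertools import combinations

def ui_stat_theta(theta_hat, Gamma, Sigma, N):
    """L_N^U = N theta_(J)' G S_JJ^{-1} G theta_(J) for the subset J where
    both orthant conditions of the UI statistic hold; 0 if none does."""
    K = len(theta_hat)
    idx = range(K)
    for r in range(1, K + 1):
        for J in map(list, combinations(idx, r)):
            Jc = [i for i in idx if i not in J]
            S_JJ_inv = np.linalg.inv(Sigma[np.ix_(J, J)])
            G_J = Gamma[np.ix_(J, J)]
            v = G_J @ S_JJ_inv @ G_J @ theta_hat[J]
            if not (v > 0).all():
                continue
            if Jc:
                adj = theta_hat[Jc] - (np.linalg.inv(Gamma[np.ix_(Jc, Jc)])
                                       @ Sigma[np.ix_(Jc, J)] @ S_JJ_inv
                                       @ G_J @ theta_hat[J])
                if not (adj <= 0).all():
                    continue
            return N * float(theta_hat[J] @ G_J @ S_JJ_inv @ G_J @ theta_hat[J])
    return 0.0
```

As before, the Kudo partition guarantees that at most one subset satisfies both indicator conditions.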
Following the Chinchilli and Sen (1981b) approach, we can show that $L_N^U$ has the asymptotic distribution

    P(L_N^U < C) = \sum_{\phi \subseteq J \subseteq A} P\{M_{(J)}'(0)\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) < C,\
    \Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) > 0\} \cdot P(\Gamma_{(J'J')}^{-1}\, M_{(J':J)}(0) < 0),   (5.1)

which under $H_0: \theta = 0$ implies that

    \lim_{N\to\infty} P(L_N^U < C) = \sum_{r=0}^{K} W_r P(\chi_r^2 < C),   (5.2)

where $\chi_r^2$ represents a chi-squared random variable with $r$ degrees of freedom. The nonnegative weights $W_0, \ldots, W_K$ sum to one, and for $0 \le r \le K$,

    W_r = \sum_{\{J : K(J) = r\}} P(\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) > 0) \cdot P(\Gamma_{(J'J')}^{-1}\, M_{(J':J)}(0) < 0),

where $M(0)$ is a K-variate normal vector with null mean and covariance matrix $\Sigma$.
Outline of the proof.

Similar to Chapter 2, the proof involves two parts. (1) To show the independence between the two events

    I\{M_{(J)}'(0)\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) < C\}  and  I\{\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) > 0\}.   (5.3)

This can be shown by appealing to Kudo's (1963) Lemma 3.2 and taking $B$ in (2.5.3) equal to $\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}$. (2) To show that for each $J$, $\phi \subseteq J \subseteq A$, the pair of orthant events $I\{\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, M_{(J)}(0) > 0\}$ and $I\{\Gamma_{(J'J')}^{-1}\, M_{(J':J)}(0) < 0\}$ are independent regardless of $H_0$. This can be shown as follows:

If $g(M; \delta, \Sigma)$ represents the K-variate normal density function for $M$, then we note that it factors into

    g(M; \delta, \Sigma) = g(\Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, M_{(J)};\ \Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, \delta_{(J)},\ \Gamma_{(JJ)}\, \Sigma_{(JJ)}^{-1}\, \Gamma_{(JJ)})
      \times g(\Gamma_{(J'J')}^{-1}\, M_{(J':J)};\ \Gamma_{(J'J')}^{-1}\, \delta_{(J':J)},\ \Gamma_{(J'J')}^{-1}\, \Sigma_{(J'J':J)}\, \Gamma_{(J'J')}^{-1}),

which implies in our case ($\delta = 0$) that the same factorization holds with null means.   (5.4)
90
From (5.4) we see that
P (r(JJ)E(j})M
=
(J
)(0)
P (r(JJ)E(J~)M
(J
0 ,r(j~
>
)(0)
>
J' )
M
O)P (r(i,
(J' :J
J'
)M
)(0)
<
(J' :J
0)
)(0)
<
0),
independent regardless of whether or not the expected value of M is null. This leads
to the result in (5.2). Unfortunately, the independence exhibited in (5.3) for the normal random variable M does not hold if the mean vector is nonnull. Therefore, the
asymptotic nonnull distribution of L~ remains in the awkward form given in (5.1).
Note that the statistic L~ is structurally analogous to Kudo's (1963) statistic,
the only difference between the two being that Kudo's is an exact test based on observations from a normal distribution with known covariance matrix.
3.6. The Isotonic M-Estimator
In this section we derive the isotonic M-estimator and its asymptotic distribution under both the null hypothesis and a local alternative.

We denote by θ̄_N the isotonic M-estimator; θ̄_N is the maximum likelihood estimator of θ in the region H_1: θ ≥ 0, i.e., θ̄_N is the vector in H_1 which maximizes the following likelihood function:

L = K* exp{-(N/2)(θ - θ̂)'ΓΣ^{-1}Γ(θ - θ̂)}.   (6.1)

Maximizing L in (6.1) subject to θ ≥ 0 means finding θ ≥ 0 which minimizes the exponent, or equivalently
To solve this minimization problem, we follow the same approach as in Section 4. Let J represent any of the 2^K subsets of A = {1,...,K}, and J' its complement. Adopt the partition in (4.4), and define the conditional values as in (4.5). Now define the functions in (2.4.11) to be

h(θ) = -2θ'ΓΣ^{-1}Γθ̂ + θ'ΓΣ^{-1}Γθ,
h_1(θ) = -θ,

and the Lagrangian in (2.4.10) as

From (2.4.11) the point (θ̄, t*) is a KTL point if it satisfies the system

(i) t ≥ 0,  (ii) θ ≥ 0,  (iii) t'θ = 0,  and (iv)   (6.2)

Line (iv) of (6.2) implies that

i.e.,

(6.3)

Line (iii) in (6.2) implies that

From (i) and (ii) in (6.2) we see that both t and (θ̂ + ½Γ^{-1}ΣΓ^{-1}t) are nonnegative, yet their inner product is zero ((iii) in (6.2)). The only way this is possible is if for some φ⊆J⊆A,

t(J) = 0  and  (θ̂ + ½Γ^{-1}ΣΓ^{-1}t)(J) > 0.   (6.4)
Now,

Γ^{-1}ΣΓ^{-1} = [ Γ(JJ)^{-1}Σ(JJ)Γ(JJ)^{-1}     Γ(JJ)^{-1}Σ(JJ')Γ(J'J')^{-1}
                  Γ(J'J')^{-1}Σ(J'J)Γ(JJ)^{-1}   Γ(J'J')^{-1}Σ(J'J')Γ(J'J')^{-1} ].

Line (6.4) leads to

(i)  and  (ii)   (6.5)

Line (ii) in (6.5) implies

t(J') = -2Γ(J'J')Σ(J'J')^{-1}Γ(J'J')θ̂(J'),   (6.6)

and from (i) in (6.2) it follows that t(J') ≥ 0, while line (i) in (6.5) implies that

i.e.,

θ̂(J:J') > 0.   (6.7)
Kudo (1963) has shown that the collection of all 2^K sets

φ⊆J⊆A, is a disjoint and exhaustive partitioning of R_+^K. Therefore, (6.6) and (6.7) hold for some φ⊆J⊆A. This allows us to take (θ̄_N, t*) as a KTL point, where

and the isotonic M-estimate θ̄_N as

θ̄_N = [ θ̂(J:J') ; 0 ].
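The case analysis in (6.4)-(6.7) amounts to scanning the 2^K subsets J for the one whose KTL conditions hold. Below is a brute-force sketch under the assumption that the matrix W = Γ^{-1}ΣΓ^{-1} is available numerically; names and the pure-Python solver are illustrative only (the dissertation's computations were done in FORTRAN):

```python
import itertools

def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def isotonic_estimate(theta_hat, W):
    """Minimize (theta - theta_hat)' W^{-1} (theta - theta_hat) over theta >= 0
    by scanning the 2^K subsets J of coordinates kept positive, as in Section 6."""
    K = len(theta_hat)
    for J in itertools.chain.from_iterable(
            itertools.combinations(range(K), k) for k in range(K, -1, -1)):
        Jc = [i for i in range(K) if i not in J]
        theta = [0.0] * K
        ok = True
        u = []
        if Jc:
            # u = W(J'J')^{-1} theta_hat(J'); require u <= 0, cf. (6.6)
            WJc = [[W[i][j] for j in Jc] for i in Jc]
            u = solve(WJc, [theta_hat[i] for i in Jc])
            ok = all(x <= 1e-12 for x in u)
        if ok and J:
            for i in J:
                # theta_hat(J:J') = theta_hat(J) - W(JJ') W(J'J')^{-1} theta_hat(J')
                adj = sum(W[i][j] * u[s] for s, j in enumerate(Jc))
                theta[i] = theta_hat[i] - adj
                ok = ok and theta[i] > -1e-12  # cf. (6.7)
        if ok:
            return theta
    return [0.0] * K
```

With W the identity the result is the componentwise positive part of θ̂, as expected; with correlated W the selected subset can differ.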
Under H_0 and from the linearity condition in (3.2), it follows that

and

Thus, for any nonnegative vector X,

P_0(θ̄_N ≤ X) = Σ_{φ⊆J⊆A} P{ Γ(JJ)^{-1}M(J:J')(0) ≤ X(J), 0 ≤ X(J') }.   (6.8)
Using the same argument as in Section 5, it follows that

g(M(θ); 0, Σ) = g(Γ(JJ)^{-1}M(J:J')(θ); 0, Γ(JJ)^{-1}Σ(JJ:J')Γ(JJ)^{-1})
              · g(Γ(J'J')Σ(J'J')^{-1}M(J')(θ); 0, Γ(J'J')Σ(J'J')^{-1}Γ(J'J')).

That is,

(6.9)

are independent. Thus (6.8) can be written as

Under H_0,

√N M(J:J')(0) ~ N(J)(0, Σ(JJ:J')),  and  √N M(J')(0) ~ N(J')(0, Σ(J'J')).

Thus, the asymptotic null distribution of the isotonic M-estimator is given by

Σ_{φ⊆J⊆A} ∫ ··· ∫_{Γ(J'J')Σ(J'J')^{-1}t_2 ≤ 0} dG(J')(t_2; 0, Σ(J'J'))

where G(J)(X; μ, Σ) is the J-variate normal distribution function with mean μ and dispersion matrix Σ.
Under K_N in (3.1),

The linearity result in (3.2) and the independence in (6.9) remain true under K_N, therefore

lim_{N→∞} P_{K_N}(√N(θ̄_N - λ/√N) < X) =
Now, M(J:J')(θ) can be expressed as

M(J:J')(θ) = m(J)M(θ),  where

M(θ) = (M_1(θ),...,M_K(θ))',  and  m(J) = C(J) - Σ(JJ')Σ(J'J')^{-1}C(J'),

and C(J), C(J') are matrices such that C(J)M(θ) = M(J)(θ) and C(J')M(θ) = M(J')(θ).
Therefore,

lim_{N→∞} P_{K_N}(√N(θ̄_N - λ/√N) < X) = Σ_{φ⊆J⊆A} P_{K_N}{√N Γ(JJ)^{-1}m(J)M(λ/√N) < X(J), ...}.

Note that

Thus,

(6.10)

where Σ, λ, and Γ were defined in (2.5), (3.1), and (3.3), respectively. Therefore, the asymptotic nonnull distribution of θ̄_N is given by

Σ_{φ⊆J⊆A} ∫ ··· ∫_{Γ(J'J')Σ(J'J')^{-1}C(J')t ≤ 0} dG_K(t; Γλ, Σ)

where G_K(X; μ, Σ) was defined in (6.9).
3.7. The Preliminary Test Estimator (PTE)

We define the proposed PTE of θ by

where I(A) stands for the indicator function of the set A. Under H_0,

lim_{N→∞} P(√N θ̂_N^{PT} ≤ X) = Σ_{φ⊆J⊆A} P{√N(θ̄_N - θ) ≤ X, N M'(J)(0)Σ(JJ)^{-1}M(J)(0) > L_{N,α}, Γ(JJ)(Σ(JJ))^{-1}M(J)(0) > 0, ...}.   (7.1)

Now, M(J':J)(0) may be expressed as

M(J':J)(0) = m(J')M(0),  where

M(0) = (M_1(0),...,M_K(0))',  and  m(J') = C(J') - Σ(J'J)Σ(JJ)^{-1}C(J),
and C(J), C(J') are matrices such that

C(J)M(0) = M(J)(0),  and  C(J')M(0) = M(J')(0).   (7.2)

Thus, (1) M'(J)(0)Σ(JJ)^{-1}M(J)(0) can be written as

where Q(J) is of rank K(J); and (2) Γ(JJ)(Σ(JJ))^{-1}M(J)(0) may be written as Γ(JJ)Σ(JJ)^{-1}C(J)M(0). Similarly, and under H_0,

can be written as

where

and the C's are defined in (7.2). Thus, under H_0, (7.1) can be written as

lim_{N→∞} P_0(√N(θ̂_N^{PT} - θ) ≤ X) = Σ_{φ⊆J⊆A} P{..., N M'(0)Q(J)M(0) > L_{N,α}, Γ(JJ)Σ(JJ)^{-1}C(J)M(0) > 0, Γ(J'J')m(J')M(0) < 0}.

Denote by C^{(1)}_{(J)} the region

C^{(1)}_{(J)} = {t: m(J)t < Γ(JJ)X(J), t'Q(J)t > L_{N,α}, Γ(JJ)Σ(JJ)^{-1}C(J)t > 0, Γ(J'J')m(J')t < 0}.

Thus the asymptotic null distribution of the PTE is

where Σ is defined in (2.5) and G_K(t; μ, Σ) is defined in (6.9).
Under K_N in (3.1),

lim_{N→∞} P_{K_N}(√N(θ̂_N^{PT} - λ/√N) ≤ X) = Σ_{φ⊆J⊆A} P{√N(θ̄_N - λ/√N) ≤ X, N M'(λ/√N)Q(J)M(λ/√N) > L_{N,α}, Γ(JJ)Σ(JJ)^{-1}C(J)M(λ/√N) > 0, Γ(J'J')m(J')M(λ/√N) < 0}.

From (6.10), it follows that

√N m(J)M(λ/√N) < Γ(JJ)X(J)  ⟹  √N m(J)(M(0) + Γλ/√N) < Γ(JJ)X(J)
                           ⟹  √N m(J)M(0) < Γ(JJ)X(J) - m(J)Γλ,

i.e.,

and

N M'(λ/√N)Q(J)M(λ/√N) = (√N M(0) + Γλ)'Q(J)(√N M(0) + Γλ).

Similarly,

√N Γ(JJ)Σ(JJ)^{-1}C(J)M(λ/√N) > 0

can be written as

√N Γ(JJ)Σ(JJ)^{-1}C(J)(M(0) + Γλ/√N) > 0  ⟹  √N Γ(JJ)Σ(JJ)^{-1}C(J)M(0) > -Γ(JJ)Σ(JJ)^{-1}C(J)Γλ,

and

Γ(J'J')m(J')M(λ/√N) < 0  ⟹  √N Γ(J'J')m(J')M(0) < -Γ(J'J')m(J')Γλ.

Denote by C^{(2)}_{(J)} the region

C^{(2)}_{(J)} = {t: m(J)t < Γ(JJ)X(J) - m(J)Γλ, (t + Γλ)'Q(J)(t + Γλ) > L_{N,α},
                Γ(JJ)Σ(JJ)^{-1}C(J)t > -Γ(JJ)Σ(JJ)^{-1}C(J)Γλ, Γ(J'J')m(J')t < -Γ(J'J')m(J')Γλ}.

Thus, the asymptotic nonnull distribution of the PTE is given by

Σ_{φ⊆J⊆A} ∫ ··· ∫_{C^{(2)}_{(J)}} dG_K(t; 0, Σ)

where Σ and G_K(X; μ, Σ) were defined in (2.5) and (6.9), respectively.
CHAPTER IV
PRELIMINARY TEST ESTIMATOR IN THE GENERAL
MULTIVARIATE LINEAR MODEL (GMLM)
4.1. Introduction
This chapter is a generalization of Chapters 2 and 3 to the general linear model containing the multivariate multisample problem as a special case.

In this chapter we aim to estimate the vector of location parameters α after a preliminary test against the restricted alternative, i.e.,

(H_0: β ∈ Θ_0) vs (H_1: β ∈ Θ_1)   (4.1)

where Θ_0 and Θ_1 are subsets of the parameter space Θ such that Θ_0 ∪ Θ_1 is a proper subset of Θ, and β is the regression slope. In Section 4.2 we state the problem, the model, and the basic assumptions.

In Section 4.3 we discuss some of the asymptotic properties of the M-estimator, and derive its asymptotic null and nonnull distributions.

In Section 4.4, we apply Roy's (1957) UI principle to derive the UI statistic for orthant alternatives, and develop its asymptotic null distribution.

The isotonic M-estimator is derived in Section 4.5 with its asymptotic null and nonnull distributions.

Finally, in the last section, we define the preliminary test estimator α̂_N^{PT} and derive both its asymptotic null and nonnull distributions.
4.2. Basic Assumptions

Consider the GMLM

Y = X⁰β⁰ + E   (2.1)

where Y (N×p) is a matrix of random variables, X⁰ (N×(q+1)) is a matrix of known constants assumed to be of full rank (q+1) < N, β⁰ ((q+1)×p) is a matrix of unknown parameters, and E (N×p) is a matrix of random errors, the rows of which are assumed to be independent and identically distributed random vectors with p-variate distribution function F.

The matrices β⁰ and X⁰ are defined as

β⁰ = (α, β_1,...,β_q)',

where each of α', β_l' is 1×p, l = 1,...,q. We are interested in the estimation of the location vector α (1×p), after a preliminary test against the restricted alternative in (1.1). In testing against restricted alternatives, we consider the following problems.

(1) Test for equality among the rows of β against ordered restrictions among them, i.e.,

(2.2)

where at least one inequality is strict.

(2) Test for equality among the β's against the orthant alternative, i.e.,
{H_0: β_1 = ... = β_q = 0} vs {H_1: β_l ≥ 0, l = 1,...,q}.   (2.3)

In either test, if we fail to reject {H_0: β = 0}, the classical M-estimator α̂_N⁰ is used to estimate the location vector α. Otherwise, the isotonic M-estimators β̄_1,...,β̄_q are used to estimate β_1,...,β_q, respectively, and the M-estimator ᾱ_N is used to estimate α after accounting for the restrictions on the β's.

A special case of this model is the usual shift model with X_l⁰ = 0 for l = 1,...,q. In this case α relates to the vector of location parameters and Model 2.1 above becomes Model 3.2.1.
If we take p = 1 and let N = n_1 + ... + n_q, where the n_l, l = 1,...,q, are all positive integers, denote by x_i' the i-th row of X⁰, and take

x_1 = ... = x_{n_1} = (1,0,...,0),
...
x_{n_1+...+n_{q-1}+1} = ... = x_N = (0,0,...,1),

then Model 2.1 above represents the q-sample location model, i.e., it becomes Model 2.2.1 with θ_i replaced by (α + β_l) and K replaced by q.
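Indicator rows of this kind can be generated mechanically. A small illustrative sketch (the function name is hypothetical, and the exact handling of the intercept column follows the text's convention, which this sketch omits):

```python
def qsample_design(ns):
    """Slope rows of the design matrix for the q-sample location model:
    every observation in sample l carries the indicator row e_l;
    ns = (n_1,...,n_q) are the sample sizes."""
    q = len(ns)
    rows = []
    for l, n in enumerate(ns):
        rows.extend([[1 if c == l else 0 for c in range(q)] for _ in range(n)])
    return rows
```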
We attempt to solve the ordered alternative problem (2.2) by transforming it to the orthant alternative problem in (2.3). This will be done through the following reparametrization of Model 2.1. Let

β⁰ = C θ⁰,   (2.4)

where C ((q+1)×(q+1)) is defined by

C = [ 1 0 0 ... 0
      0 1 0 ... 0
      0 1 1 ... 0
      . . .
      0 1 1 ... 1 ]   (2.5)

and θ⁰ ((q+1)×p) is defined by

θ⁰ = (α, θ_1,...,θ_q)',

where each θ_l' is a 1×p vector, and for all l = 1,...,q

β_l = θ_1 + ... + θ_l.

Thus Model 2.1 can be written as

Y = X θ⁰ + E   (2.7)

where X = X⁰C is an (N×(q+1)) matrix.

In the rest of this chapter, we will work with Model 2.7 instead of Model 2.1, and we proceed to develop the UI statistic for the orthant alternative problem, which is invariant under the transformation in (2.4). Under the new setting the null hypothesis and the alternative in (2.3) can be written as

{H_0: θ_1 = ... = θ_q = 0} vs {H_1: θ_l ≥ 0, l = 1,...,q}.   (2.8)
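The effect of the transformation in (2.4)-(2.5) can be checked numerically: nonnegative θ_l map into ordered β_l. A small sketch for the p = 1 case (illustrative code, not from the dissertation):

```python
def make_C(q):
    """The (q+1) x (q+1) matrix of (2.5): row 0 keeps alpha; row l
    (l = 1,...,q) has ones in columns 1..l, so beta_l = theta_1+...+theta_l."""
    C = [[0] * (q + 1) for _ in range(q + 1)]
    C[0][0] = 1
    for r in range(1, q + 1):
        for c in range(1, r + 1):
            C[r][c] = 1
    return C

def apply_C(C, theta0):
    """beta0 = C theta0 (p = 1, so theta0 is a plain vector)."""
    return [sum(C[r][c] * theta0[c] for c in range(len(theta0)))
            for r in range(len(C))]
```

For example, θ⁰ = (5, 1, 2, 0)' with q = 3 yields β⁰ = (5, 1, 3, 3)', an ordered slope sequence, which is the point of the reparametrization.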
Note that θ_j (q×1), j = 1,...,p, denotes the j-th column of θ, and θ_l', l = 1,...,q, denotes the l-th row of θ. Y_i (p×1) and x_i (q×1), i = 1,...,N, respectively denote the transpose of the i-th row of Y and of X.
Below, we introduce the basic assumptions that will be used throughout this chapter.

A1. The distribution function F is absolutely continuous with density function f.

A2. F has a finite positive definite Fisher information matrix I(f) = ((I_{jj'}(f))), where

I_{jj'}(f) = ∫ [∂ log f(y)/∂y_j][∂ log f(y)/∂y_{j'}] f(y) dy,  j, j' = 1,...,p.

A3. The j-th marginal distribution function F_{[j]} of F is symmetric about the origin for every j = 1,...,p.

B1. The score function Ψ = {Ψ(t) = (Ψ_1(t),...,Ψ_p(t)), t ∈ R = (-∞,∞)} is defined such that

Ψ_j(t) = Ψ_{j1}(t) + Ψ_{j2}(t), ∀ t ∈ R, j = 1,...,p,

where the Ψ_{jl}, l = 1,2, are both nondecreasing and skew-symmetric for all j = 1,...,p. Ψ_{j1} is absolutely continuous on any bounded interval in R, and Ψ_{j2} is a step function having finitely many jumps, j = 1,...,p. For some K ≥ 0, there exist open intervals

E_l = (a_l, a_{l+1}), l = 0,...,K, with -∞ = a_0 < a_1 < ... < a_K < a_{K+1} = ∞,

such that Ψ_{j2}(t) = α_l for all t ∈ E_l, 0 ≤ l ≤ K, where the α_l are real and finite numbers, not all equal. Conventionally, we let

Ψ_{j2}(a_l) = ½(α_{l-1} + α_l), for l = 1,...,K.
B2. Σ = ((σ_{jj'})) is a p×p matrix, where

σ_{jj'} = E Ψ_j(Y_{ij} - x_i'θ_j) Ψ_{j'}(Y_{ij'} - x_i'θ_{j'}),  for j, j' = 1,...,p.

B3. Σ* = ((σ*_{jj'})) is a p×p matrix, where

σ*_{jj'} = E Ψ_j(Y_{ij} - α_j) Ψ_{j'}(Y_{ij'} - α_{j'}),  for j, j' = 1,...,p.

B4. For j = 1,...,p, let

and define

Γ = Diag(γ_1,...,γ_p),   (2.9)

a p×p matrix.

C1. The elements of X satisfy

(1) lim_{N→∞} (1/N) X'X = D⁰ ((q+1)×(q+1)), where D⁰ = [D_0, D_1,...,D_q];

(2) max_{1≤i≤N} x_i'(X'X)^{-1}x_i → 0 as N → ∞.
For j = 1,...,p, N ≥ 1, we define

M_{Nj}(t) = M_{Nj}(t, 0) = Σ_{i=1}^{N} Ψ_j(Y_{ij} - t).   (2.10)

M_{Nj}(t) is monotone in t, t ∈ R, so we may define the vector α̂_N⁰, the classical M-estimator of α, by

α̂_{Nj}⁰ = ½{sup(t: M_{Nj}(t) > 0) + inf(t: M_{Nj}(t) < 0)},

where M_{Nj}(t) is defined in (2.10). Thus, under {H_0: θ = 0}, the M-estimator α̂_{Nj}⁰ of α_j is the solution of

M_{Nj}(α) = Σ_{i=1}^{N} Ψ_j(Y_{ij} - α) = 0,  j = 1,...,p.   (2.11)
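Since M_{Nj} is nonincreasing, the sup/inf rule above can be implemented by bisection. A one-sample sketch with Huber's score (illustrative code, not the dissertation's routine):

```python
def huber_psi(t, k=1.5):
    """Huber score function, clipped at +/- k."""
    return max(-k, min(k, t))

def m_estimate(xs, k=1.5, tol=1e-8):
    """One-sample M-estimate: the root of M_N(t) = sum_i psi(x_i - t),
    located by bisection (M_N is nonincreasing in t), cf. (2.10)-(2.11)."""
    M = lambda t: sum(huber_psi(x - t, k) for x in xs)
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if M(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Note how the clipping makes the estimate resistant to a gross outlier: replacing the observation 4 by 100 in the sample (0, 1, 2, 3, 4) leaves the estimate essentially unchanged.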
Similarly, for j = 1,...,p, l = 0,...,q, define

M_{Nlj}(θ_j⁰) = Σ_{i=1}^{N} x_{il} Ψ_j(Y_{ij} - x_i'θ_j⁰)   (2.12)

where θ_j⁰ = (α_j, θ_{1j},...,θ_{qj})'. Let θ̂_j⁰ be the M-estimator of θ_j⁰, i.e., θ̂_j⁰ is the solution of

M_{Nlj}(θ_j⁰) = 0   (2.13)

for j = 1,...,p, l = 0,...,q.
4.3. Asymptotic Properties of the M-Estimators

In this section we derive the asymptotic distribution of the classical M-estimator α̂_N⁰ of α under the null hypothesis {H_0: θ = 0}, using Model (2.7). We then derive the asymptotic distribution of the M-estimator θ̂⁰ of θ⁰, and use it to get the asymptotic distribution of the M-estimator θ̂ of the slope θ, under both the null hypothesis and a local alternative.

From assumptions A3 and B1 it follows that

∫_R Ψ_j(y) dF_{[j]}(y) = 0 for j = 1,...,p.   (3.1)

From assumption B3, lines (2.10) and (3.1), and under {H_0: θ = 0}, we obtain

and

E_0(N^{-1} M_N(α)M_N'(α)) = Σ*,

where M_N(α) = (M_{N1}(α),...,M_{Np}(α))'. From the linearity result in (3.3.2), we have

sup_{√N|α̂_N - α| ≤ K} N^{-1/2} |M_N(α̂_N) - M_N(α) + N Γ(α̂_N - α)| →_P 0,

i.e., the distribution of √N Γ(α̂_N⁰ - α) is asymptotically equivalent to the distribution of N^{-1/2} M_N(α). Thus the asymptotic null distribution of α̂_N⁰ is

√N(α̂_N⁰ - α) → N_p(0, Γ^{-1}Σ*Γ^{-1}).   (3.2)
Define the (q+1)×p matrix M_N(θ⁰) such that

M_N(θ⁰) = (M_{N0},...,M_{Nq})' = ((M_{Nlj}(θ_l))), l = 0,...,q, j = 1,...,p,

where M_{Nlj}(θ_l) is defined in (2.12).

Our proposed UI-test procedure is based on M_N(θ⁰). For convenience, we shall roll out M_N(θ⁰) and θ⁰ into (q+1)p vectors, i.e., instead of writing them as (q+1)×p matrices, we take

M_N(θ⁰) = (M_{N0}',...,M_{Nq}')' = (M_{N01},...,M_{N0p},...,M_{Nq1},...,M_{Nqp})',

and similarly for θ⁰. From (2.12) and (3.1), it follows that E(N^{-1/2} M_N(θ⁰)) = 0, and from assumptions B2, C1 and line (2.12), it follows that

E(N^{-1} M_N(θ⁰)M_N'(θ⁰)) = D⁰ ⊗ Σ,

where ⊗ represents the Kronecker product of the two matrices D⁰ and Σ, which is a (q+1)p by (q+1)p matrix. So by the multivariate version of the central limit theorem, it follows that, when H_0 holds,

N^{-1/2} M_N(θ⁰) → N_{(q+1)p}(0, D⁰ ⊗ Σ).   (3.3)
Under assumptions A1, A2, B1, and C1, Singer and Sen (1985) extended two important results due to Jurečková (1977) to the multivariate case. The first relates to the linearity of M_N(θ⁰) in the neighborhood of the true parameter θ⁰. The second result relates to the boundedness in probability of √N|θ̂⁰ - θ⁰|. From Theorem 3.1 in Singer and Sen (1985), and under the above assumptions, it follows that, for all K > 0,

(3.4)

where Γ is defined in (2.9).

From the linearity result (3.4), and since √N|θ̂⁰ - θ⁰| is bounded in probability, it follows that √N(D⁰ ⊗ Γ)(θ̂⁰ - θ⁰) has the same asymptotic distribution as N^{-1/2} M_N(θ⁰), when H_0 holds. Thus,

(3.5)

For ease of notation, let

(i) A⁰ = D⁰⁻¹ ⊗ Γ^{-1}ΣΓ^{-1}  and  (ii) A = D^{-1} ⊗ Γ^{-1}ΣΓ^{-1},   (3.6)

where A is the submatrix of the covariance matrix A⁰ that represents the covariance matrix of θ̂, the M-estimator of the slope θ. Thus the asymptotic distribution of θ̂ is given by

√N(θ̂ - θ) → N_{qp}(0, A),   (3.7)

and the asymptotic null distribution of θ̂ is given by

√N θ̂ → N_{qp}(0, A).   (3.8)

Now under {H_0: θ = 0}, let us denote the p(q+1)-vector θ⁰ by θ⁰⁰, where

(3.9)
and define a p(q+1)-vector λ to be

λ = (0,...,0, λ_{11},...,λ_{1p},...,λ_{q1},...,λ_{qp})'   (3.10)

where λ_{lj} ≥ 0, for l = 1,...,q, j = 1,...,p. Consider the sequence of local alternatives

K_N: θ⁰ = θ⁰⁰ + N^{-1/2} λ,   (3.11)

where θ⁰⁰ and λ were defined in (3.9) and (3.10).

Given assumptions A2, B1, and C1, it follows that the sequence of local alternatives in (3.11) is contiguous to {H_0: θ = 0}; a proof is supplied in Chapter VI of Hájek and Šidák (1967). Thus, the asymptotic nonnull distribution of θ̂⁰ is given by

(3.12)
4.4. The UI Statistic

In this section we apply Roy's (1957) UI principle to derive the UI M-test for the orthant alternative problem in (2.8). Then we derive its null distribution. Define

Θ_0 = {θ; θ_l = 0, l = 1,...,q},
Θ_1 = {θ ∈ Θ ⊂ R_+^{pq}}, where Θ = R_+^{pq} = [0,∞)^{pq}.   (4.1)

The likelihood function of θ̂_1,...,θ̂_q can be written as

L = K* · exp{-(N/2)(θ - θ̂)'A^{-1}(θ - θ̂)}.   (4.2)

Under the null hypothesis in (2.8) the MLE of θ is equal to zero, and since A is known,

max_{Θ_0} L = K* · exp{-(N/2) θ̂'A^{-1}θ̂}.
Under Θ_1, let θ_l = δ a_l, a_l ≥ 0 for l = 1,...,q, δ ≥ 0. Define a = (a_{11},...,a_{1p},...,a_{q1},...,a_{qp})' to be a qp-vector. For a fixed a ∈ R_+^{pq}, and subject to the constraint a'A^{-1}a = 1, we will find the δ̂ that maximizes (4.2) under Θ_1. From (3.4.2) it follows that

otherwise δ̂(a) = 0. Therefore

sup_{Θ_1} L = K* exp{-(N/2)(θ̂'A^{-1}θ̂ - (a'A^{-1}θ̂)²)}   if δ̂(a) > 0,
sup_{Θ_1} L = K* exp{-(N/2) θ̂'A^{-1}θ̂}                    if δ̂(a) = 0,

and similarly for the likelihood ratio. Thus,

(4.3)

The likelihood ratio test rejects H_0: θ_l = 0, l = 1,...,q, in favor of H_1: θ_l = δa_l, l = 1,...,q, for large values of L_N(a). Under {H_0: θ_l = 0, l = 1,...,q}

Thus,
Assume that Θ_1 in (4.1) is positively homogeneous and construct the parameter subspace

Θ(a) = {θ ∈ R_+^{pq}: θ = δa, δ > 0},   (4.4)

and define the set A such that

A = {a ∈ R_+^{pq}: A^{-1}a ∈ Θ, a'A^{-1}a = 1}.   (4.5)

Due to the positive homogeneity of Θ_1, and lines (4.4) and (4.5), it follows that

Θ_1 ⊂ ∪_{a∈A} Θ(a).

According to Roy's (1957) principle, the UI statistic for the orthant alternative problem in (2.8) is

(4.6)

We reject H_0 in (2.8) in favor of the orthant alternative for large values of L_N⁺. Thus, our problem now is to find

equivalent to finding

Following the same approach and notation as in Chapter 3, let J represent any of the 2^{pq} subsets of A = {1,...,pq}, and let J' be its complement. Define

(4.7)

then

where
From Chapter 3 it follows that the UI statistic for the orthant alternative problem is given, for each φ⊆J⊆A, by a quadratic form in U(J:J') multiplied by the indicators

I(U(J:J') > 0) I((A(J'J'))^{-1}U(J') ≤ 0).

From (4.7), the partition of U, and the inverse of A^{-1}, it follows that, with A(J'J':J) = A(J'J') - A(J'J)A(JJ)^{-1}A(JJ'),

U(J:J') = U(J) + A(JJ)^{-1}A(JJ')U(J')
 = A(JJ)^{-1}{I + A(JJ')A(J'J':J)^{-1}A(J'J)A(JJ)^{-1} - A(JJ')A(J'J':J)^{-1}A(J'J)A(JJ)^{-1}}θ̂(J)
   + A(JJ)^{-1}A(JJ'){-A(J'J':J)^{-1} + A(J'J':J)^{-1}}θ̂(J'),

i.e.,

U(J:J') = A(JJ)^{-1}θ̂(J).   (4.8)

Similarly,

A(JJ:J')^{-1} = A(JJ)^{-1} + A(JJ)^{-1}A(JJ')A(J'J':J)^{-1}A(J'J)A(JJ)^{-1},   (4.9)

and

(4.10)

Thus, from (4.8)-(4.10), the UI-statistic can be written as

(4.11)

Following the same approach as in Chapter 3, it follows that the asymptotic null distribution of the UI statistic L_N⁺ for the orthant alternative problem is given by

lim_{N→∞} P(L_N⁺ ≤ C) = Σ_{r=0}^{pq} W_r P(χ²_r ≤ C),   (4.12)

where χ²_r represents a chi-squared random variable with r degrees of freedom. The nonnegative weights W_r, r = 0,...,pq, sum to one and for 0 ≤ r ≤ pq,

W_r = Σ_{J: K(J)=r} P(A(JJ)^{-1}θ̂(J) > 0) P(θ̂(J':J) ≤ 0),

where K(J) = r.
4.5. The Isotonic M-Estimators

In this section we derive the isotonic M-estimators and their asymptotic null and nonnull distributions. We denote by ᾱ_N, θ̄_1,...,θ̄_q the isotonic M-estimators of α, θ_1,...,θ_q, respectively. That is, ᾱ_N, θ̄_1,...,θ̄_q are the MLE of α, θ_1,...,θ_q,
respectively, in the region {H_1: θ_l ≥ 0, l = 1,...,q}. From (3.5) and (3.6), the likelihood function of θ̂⁰ is given by

Next, we partition θ⁰ and θ̂⁰ into their components α, α̂ and θ, θ̂, respectively, and partition (A⁰)^{-1} correspondingly. That is,

L = K* exp{-(N/2) [(α-α̂)', (θ-θ̂)'] [ A⁰_{11}^{-1} + A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1}A⁰_{21}A⁰_{11}^{-1}   -A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1}
                                       -A⁰_{22:1}^{-1}A⁰_{21}A⁰_{11}^{-1}                                A⁰_{22:1}^{-1} ] [ (α-α̂) ; (θ-θ̂) ]},

where

a = A⁰_{11}^{-1} + A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1}A⁰_{21}A⁰_{11}^{-1}.   (5.1)

To maximize the likelihood function above is equivalent to minimizing the exponent

r(α, θ) = (α-α̂)'a(α-α̂) - 2(α-α̂)'A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1}(θ-θ̂) + (θ-θ̂)'A⁰_{22:1}^{-1}(θ-θ̂)

in the region θ_l ≥ 0, l = 1,...,q.

To solve this problem, we follow the same approach we used before, and define the Lagrangian function in (2.4.11) as

L(α, θ, t) = r(α, θ) - t'θ.

From (2.4.11) the point (ᾱ, θ̄, t*) is a KTL point if it satisfies the system

(i) t ≥ 0,  (ii) θ ≥ 0,  (iii) t'θ = 0,  (iv) ∂L(α,θ,t)/∂α = 0,  (v) ∂L(α,θ,t)/∂θ = 0.   (5.2)
From (iv) in line (5.2), we obtain

a α = a α̂ + A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1}(θ - θ̂),

i.e.,

α = α̂ + a^{-1}A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1}(θ - θ̂).

From (5.1), we have

a^{-1}A⁰_{11}^{-1}A⁰_{12}A⁰_{22:1}^{-1} = A⁰_{12}A⁰_{22}^{-1}.

Therefore,

α = α̂ + A⁰_{12}A⁰_{22}^{-1}(θ - θ̂).   (5.3)

From (v) in line (5.2), it follows that

2A⁰_{22:1}^{-1}A⁰_{21}A⁰_{11}^{-1}(α̂ - α) + 2A⁰_{22:1}^{-1}(θ - θ̂) - t = 0.

That is,

A⁰_{22:1}^{-1}(θ - θ̂) = ½ t + A⁰_{22:1}^{-1}A⁰_{21}A⁰_{11}^{-1}(α - α̂),

and from (5.3), we obtain

A⁰_{22:1}^{-1}[I - A⁰_{21}A⁰_{11}^{-1}A⁰_{12}A⁰_{22}^{-1}](θ - θ̂) = ½ t,

i.e.,

A⁰_{22:1}^{-1}A⁰_{22:1}A⁰_{22}^{-1}(θ - θ̂) = ½ t,
i.e.,

θ = ½A⁰_{22}A⁰_{22:1}^{-1}A⁰_{22:1}t + θ̂ = θ̂ + ½A⁰_{22}t.   (5.4)

Note that A⁰_{22} and A, which was defined in (3.6), represent the same thing. Equation (iii) in (5.2) implies that

t'(θ̂ + ½At) = 0,

but (i) and (ii) in the same line indicate that both t and θ are nonnegative. This is possible only if for some J ⊆ A, A = {1,...,pq}, we have

t(J) = 0  and  (θ̂ + ½At)(J) > 0.   (5.5)

Line (5.5) implies that

θ̂(J') + ½A(J'J')t(J') = 0,

i.e.,

t(J') = -2A(J'J')^{-1}θ̂(J').

But from (i) in (5.2), it follows that t(J') ≥ 0. Then (5.5) can be written as

which implies

A(J'J')^{-1}θ̂(J') ≤ 0,   (5.6)

i.e.,

θ̂(J:J') = θ̂(J) - A(JJ')A(J'J')^{-1}θ̂(J') > 0.   (5.7)
From (5.6), (5.7) and (5.3) we can take the KTL point as

t* = [ 0 ; -2A(J'J')^{-1}θ̂(J') ],   (5.8)

θ̄ = [ θ̂(J:J') ; 0 ],   (5.9)

and

ᾱ_N = α̂ + A⁰_{12}A⁰_{22}^{-1}(θ̄ - θ̂),   (5.10)

i.e., the vector of isotonic M-estimators θ̄⁰ can be written as

(5.11)
Next we derive the asymptotic null and nonnull distributions of ᾱ_N. Note that the vector θ̂⁰ can be written as

θ̂⁰ = (θ̂_1⁰', θ̂_2⁰')' = (θ̂_1⁰', θ̂(J)', θ̂(J')')' = (α̂', θ̂(J)', θ̂(J')')',

and similarly θ̄⁰ can be written as

We make use of the above notation to write (5.10) in terms of θ̂⁰. Since we know the asymptotic distribution of θ̂⁰ under both the null hypothesis and the alternative, the asymptotic distribution of ᾱ_N can be obtained. Now (5.10) can be written as
From (5.9), it follows that

using the new notations

and

(5.12)

Thus,

ᾱ_N = α̂ - (A⁰_{12}A⁰_{22}^{-1})(J)θ̂(J) - (A⁰_{12}A⁰_{22}^{-1})(J')θ̂(J') + (A⁰_{12}A⁰_{22}^{-1})(J)[θ̂(J) - A(JJ')A(J'J')^{-1}θ̂(J')]

 = α̂ - [(A⁰_{12}A⁰_{22}^{-1})(J') + (A⁰_{12}A⁰_{22}^{-1})(J)A(JJ')A(J'J')^{-1}]θ̂(J')

 = [I, 0, -[(A⁰_{12}A⁰_{22}^{-1})(J') + (A⁰_{12}A⁰_{22}^{-1})(J)A(JJ')A(J'J')^{-1}]]θ̂⁰

 = C^{(1)}_{(J)}θ̂⁰,  say.

From (5.8), (5.9), it follows that

P(ᾱ_N ≤ X) = Σ_{φ⊆J⊆A} P(C^{(1)}_{(J)}θ̂⁰ ≤ X, A(J'J')^{-1}θ̂(J') ≤ 0, θ̂(J:J') > 0).

From (5.11), we obtain

θ̂(J:J') = [0, I, -A(JJ')A(J'J')^{-1}]θ̂⁰ = C^{(2)}_{(J)}θ̂⁰,   (5.13)

and similarly

A(J'J')^{-1}θ̂(J') = [0, 0, A(J'J')^{-1}]θ̂⁰ = C^{(3)}_{(J)}θ̂⁰.   (5.14)
Thus,

P(ᾱ_N ≤ X) = Σ_{φ⊆J⊆A} P(C^{(1)}_{(J)}θ̂⁰ ≤ X, C^{(3)}_{(J)}θ̂⁰ ≤ 0, C^{(2)}_{(J)}θ̂⁰ > 0).   (5.15)

Under H_0: θ = 0, it follows that

P_0(√N(ᾱ_N - α) ≤ X) = P(√N(α̂_N⁰ - α) ≤ X),

which we already developed in (3.2). Define the set R^{(1)}_{(J)} to be

From (3.12), it follows that the asymptotic nonnull distribution of ᾱ_N is

lim_{N→∞} P_{K_N}(√N(ᾱ_N - α) ≤ X) =   (5.16)

In the remainder of this section we will derive the asymptotic distribution of θ̄⁰ under both the null hypothesis and a local alternative. Following the same approach we used in deriving the asymptotic distribution of ᾱ_N, we write (5.11) in terms of θ̂⁰. From (5.11), we obtain

From (5.12), it follows
i.e.,

θ̄⁰ = C^{(4)}_{(J)}θ̂⁰,  say.

For any vector X, we have

P(θ̄⁰ ≤ X) = Σ_{φ⊆J⊆A} P{C^{(4)}_{(J)}θ̂⁰ ≤ X, θ̂(J:J') > 0, A(J'J')^{-1}θ̂(J') ≤ 0}.

From (5.13) and (5.14), it follows that

P(θ̄⁰ ≤ X) = Σ_{φ⊆J⊆A} P{C^{(4)}_{(J)}θ̂⁰ ≤ X, C^{(2)}_{(J)}θ̂⁰ > 0, C^{(3)}_{(J)}θ̂⁰ ≤ 0}.

Under {H_0: θ = 0} the asymptotic distribution of θ̄⁰ is given by

lim_{N→∞} P_0(√N(θ̄⁰ - θ⁰) ≤ X) = P_0(√N(ᾱ - α) ≤ X) = P_0(√N(α̂_N⁰ - α) ≤ X) = G_p(X; 0, Γ^{-1}Σ*Γ^{-1}).

Define the set R^{(2)}_{(J)} to be

R^{(2)}_{(J)} = {t: C^{(4)}_{(J)}t ≤ X, C^{(2)}_{(J)}t > 0, C^{(3)}_{(J)}t ≤ 0}.

From (3.12), it follows that the asymptotic nonnull distribution of θ̄⁰ is
4.6. The Preliminary Test Estimator

In this section, we define the preliminary test estimator (PTE) α̂_N^{PT}, and derive its asymptotic null and nonnull distributions. We define the proposed PTE as

α̂_N^{PT} = α̂_N⁰  if L_N⁺ ≤ L_{N,α},
α̂_N^{PT} = ᾱ_N   if L_N⁺ > L_{N,α},

where L_{N,α} represents the upper 100α% point of the distribution function of L_N⁺. Thus,

P(√N(α̂_N^{PT} - α) ≤ X) = P(√N(α̂_N⁰ - α) ≤ X, L_N⁺ ≤ L_{N,α}) + P(√N(ᾱ_N - α) ≤ X, L_N⁺ > L_{N,α}).   (6.1)
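The switch rule defining α̂_N^{PT} is a simple branch on the UI statistic; a sketch (names are illustrative):

```python
def preliminary_test_estimate(alpha_classical, alpha_isotonic, LN_plus, LN_crit):
    """PTE switch rule of Section 4.6: keep the classical M-estimate when the
    UI statistic does not exceed its level-alpha critical value; switch to the
    restricted (isotonic) estimate when it does."""
    return alpha_classical if LN_plus <= LN_crit else alpha_isotonic
```

The two branches are exactly the two terms of (6.1), which is why the asymptotic distribution of the PTE is a sum of a "test accepts" and a "test rejects" contribution.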
Under H_0: θ = 0, from (3.2) and (4.12) it follows that

lim_{N→∞} P_0(√N(α̂_N⁰ - α) ≤ X, L_N⁺ ≤ L_{N,α}) = G_p(X; 0, Γ^{-1}Σ*Γ^{-1}) Σ_{r=0}^{pq} W_r P(χ²_r ≤ L_{N,α}),   (6.2)

and

P_0(√N(ᾱ_N - α) ≤ X, L_N⁺ > L_{N,α}) = Σ_{φ⊆J⊆A} P(√N(ᾱ_N - α) ≤ X, N θ̂'(J)A(JJ)^{-1}θ̂(J) > L_{N,α}, A(JJ)^{-1}θ̂(J) > 0, θ̂(J':J) ≤ 0).

Next, we write the previous form in terms of θ̂.
Let

(1) θ̂(J':J) = m(J')θ̂, where C(J), C(J') are matrices such that θ̂(J) = C(J)θ̂ and θ̂(J') = C(J')θ̂;

(2) θ̂'(J)A(JJ)^{-1}θ̂(J) = θ̂'B(J)θ̂;

(3) A(JJ)^{-1}θ̂(J) = A(JJ)^{-1}C(J)θ̂,

where

B(J) = C(J)'A(JJ)^{-1}C(J).

Thus

P_0(√N(ᾱ_N - α) ≤ X, L_N⁺ > L_{N,α}) = Σ_{φ⊆J⊆A} P{√N(ᾱ_N - α) ≤ X, N θ̂'B(J)θ̂ > L_{N,α}, ...}.

Denote by R^{(3)}_{(J)} the region

From (3.2), we obtain

= G_p(X; 0, Γ^{-1}Σ*Γ^{-1}) Σ_{φ⊆J⊆A} ∫ ··· ∫_{R^{(3)}_{(J)}} dG_{pq}(t; 0, A).   (6.3)
From (6.2) and (6.3) above, the asymptotic null distribution of α̂_N^{PT} is

= G_p(X; 0, Γ^{-1}Σ*Γ^{-1}) [ Σ_{r=0}^{pq} W_r P(χ²_r ≤ L_{N,α}) + Σ_{φ⊆J⊆A} ∫ ··· ∫_{R^{(3)}_{(J)}} dG_{pq}(t; 0, A) ].   (6.4)

Let R̄^{(3)}_{(J)} be the complement of R^{(3)}_{(J)}. Under K_N,
P_{K_N}(√N(α̂_N⁰ - α) ≤ X, L_N⁺ ≤ L_{N,α})
  = P_{K_N}(√N(α̂_N⁰ - α) ≤ X, N θ̂'(J)A(JJ)^{-1}θ̂(J) ≤ L_{N,α}, A(JJ)^{-1}θ̂(J) > 0, θ̂(J':J) ≤ 0).

From (1)-(3) above, we get

P_{K_N}(√N(α̂_N⁰ - α) ≤ X, L_N⁺ ≤ L_{N,α})
  = P_{K_N}{√N(α̂_N⁰ - α) ≤ X, N θ̂'B(J)θ̂ ≤ L_{N,α}, A(JJ)^{-1}C(J)θ̂ > 0, m(J')θ̂ ≤ 0},

i.e.,

= G_p(X; 0, Γ^{-1}Σ*Γ^{-1}) Σ_{φ⊆J⊆A} ∫ ··· ∫_{R̄^{(3)}_{(J)}} dG_{pq}(t; λ, A),

and

P_{K_N}(√N(ᾱ_N - α) ≤ X, L_N⁺ > L_{N,α})
  = Σ_{φ⊆J⊆A} P{√N(ᾱ_N - α) ≤ X, N θ̂'B(J)θ̂ > L_{N,α}, A(JJ)^{-1}C(J)θ̂ > 0, m(J')θ̂ ≤ 0}.

From (5.6),

= Σ_{φ⊆J⊆A} [ ∫ ··· ∫_{R^{(1)}_{(J)}} dG_{p(q+1)}(t; λ, A⁰) · ∫ ··· ∫_{R^{(3)}_{(J)}} dG_{pq}(t; λ, A) ].   (6.5)

From (6.4) and (6.5), it follows that the asymptotic nonnull distribution of the PTE α̂_N^{PT} is given by
CHAPTER V
SOME NUMERICAL COMPARISONS OF THE PROCEDURES
5.1. Introduction

In this chapter we consider numerically the behavior of the UI test in both the univariate multisample case and the multivariate linear model.

The notion of adaptive M-estimator and Huber's function, with some basic estimates, are introduced in Section 5.2.

In Section 5.3, we consider the behavior of the UI-test in the 3-sample problem, and compare it with some well known tests such as the isotonic regression test (see Barlow et al. (1972)), Jonckheere's test, and Shorack's test (1967).

The one-way layout with 3 treatments and 10 observations per treatment is considered in Section 5.4, and the same comparison is made as in Section 5.3.

In Section 5.5, we proceed to the multivariate case by considering the Anesthesia data set involving two variates X_1 and X_2, 3 treatments, and 10 observations per treatment. The behavior of the proposed multivariate UI M-test is compared with the bivariate analogue of Jonckheere's test.

A study of the asymptotic performance of the test, and comparisons between the UI M-test and the isotonic regression test, are given in Section 5.6 based on simulated data. Three cases were considered: the equal and null mean case, the equally spaced means case, and the nonequally spaced means case. The study was based on sample sizes ranging from 10 to 40.

In Section 5.7 a study of the asymptotic properties of the PTE, as well as the isotonic regression estimator, is carried out through simulations. The average bias due to the use of each estimator is computed, as well as the asymptotic relative efficiency of the PTE with respect to the isotonic regression estimator for each of several chosen alternatives.
5.2. Basic Estimates

In this chapter we introduce the following statistics, which will be needed in the actual analysis.

(1) We estimate σ_Ψ in (2.2.5) by

where θ̂_i is the M-estimator of θ_i.

(2) For α = .05, we estimate γ in (2.2.12) by γ̂, where

γ̂ = (γ̂_1 + ... + γ̂_K)/K,   (2.2)

γ̂_i = 2ξ_{α/2}σ̂_Ψ / [√n_i (θ̂_{iU} - θ̂_{iL})],  i = 1,...,K,   (2.3)

and θ̂_{iU}, θ̂_{iL} are the upper and lower bounds of θ̂_i, i = 1,...,K, i.e., θ̂_{iU}, θ̂_{iL} are the solutions of

n_i^{-1/2} Σ_{j=1}^{n_i} Ψ(X_{ij} - θ_i) = ±ξ_{α/2}σ̂_Ψ,

respectively, where Φ(ξ_α) = 1 - α (0 < α < 1) and Φ is the standard normal distribution function.
(3) The function Ψ in (2.2.7) is Huber's score function, which is defined by

Ψ(x) = x        if |x| ≤ K,
Ψ(x) = K sgn x  if |x| > K,

for some specified value of K > 0. In this chapter we choose K = 1.5, a value commonly used in the literature.

This particular choice of Ψ corresponds to a ρ(·) function given by

ρ_K(t) = ½t²         if |t| ≤ K,
ρ_K(t) = K|t| - ½K²  if |t| > K,   K > 0,

which has the advantage that the distance measure is like a square function in the middle but like the absolute value function in the extremes. So the corresponding M-estimator will tend to treat central observations much as X̄ does, but will deemphasize the extreme observations in the fashion of the median of the observations.
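A direct transcription of Ψ and ρ_K with K = 1.5 (illustrative code; note that ρ_K is continuous at |t| = K, where the two branches meet):

```python
def psi(x, k=1.5):
    """Huber score: linear in the middle, clipped at +/- k."""
    return max(-k, min(k, x))

def rho(t, k=1.5):
    """Huber loss: squared in the middle, absolute-value-like in the tails."""
    return 0.5 * t * t if abs(t) <= k else k * abs(t) - 0.5 * k * k
```

The clipping of psi is exactly what makes the resulting M-estimator interpolate between the mean (all residuals small) and the median (residuals dominated by their signs).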
Note that the M-estimator is not scale-equivariant, i.e., M_n(dX_n) is generally not equal to dM_n(X_n) for all d > 0, unlike the R- and L-estimators. To overcome that, we will use the adaptive M-estimator θ_i* of θ_i, instead of the classical M-estimator θ̂_i, i = 1,...,K. The adaptive M-estimator is based on the concept of regression quantiles due to Koenker and Bassett (1978). Jurečková and Sen (1984) defined the adaptive M-estimator θ_i* of θ_i to be the solution of

Σ_{j=1}^{n_i} Ψ̂_n(X_{ij} - t) = 0,  i = 1,...,K,   (2.5)

where Ψ̂_n is defined as

and α ∈ (0, .5).
Throughout this chapter, the algorithms of Chapters 2 and 4 were programmed using several FORTRAN programs. The starting value for the iterative computation of the M-estimates was the median. The convergence criterion consisted of stopping the iterations when the absolute value of the difference between the estimates from two consecutive steps was < .0005.
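The iteration scheme just described can be sketched as follows. The update step shown (a fixed-point step based on the mean score) is an assumption for illustration, not necessarily the original FORTRAN update; the median start and the .0005 stopping rule are as stated in the text:

```python
import statistics

def psi(x, k=1.5):
    """Huber score with the K = 1.5 used in this chapter."""
    return max(-k, min(k, x))

def iterate_m_estimate(xs, tol=0.0005):
    """Iterative M-estimate: start at the sample median and stop when two
    consecutive estimates differ by less than tol in absolute value."""
    theta = statistics.median(xs)
    while True:
        new = theta + sum(psi(x - theta) for x in xs) / len(xs)
        if abs(new - theta) < tol:
            return new
        theta = new
```

Because psi is nondecreasing and 1-Lipschitz, the map theta -> theta + mean(psi(x - theta)) is monotone and the iteration settles quickly for reasonable data.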
5.3. Example 1. The following data (Lehmann (1975)) are from a study to determine whether a certain diagnostic test can be interpreted successfully without much psychological training. 72 judges were presented with the results of the test on 200 carefully matched patients, half of whom were psychiatrically hospitalized and half medically hospitalized without apparent psychiatric disturbance. Of the judges, 21 were staff members and 23 trainees at Veterans Administration hospitals; the remaining 28 were undergraduate psychology majors who were given only brief instructions on the interpretation of the test. The following were the accuracies of the judges in terms of percent correctly identified.

If training and experience have an effect, the staff members could be expected to be the most accurate, the trainees next, and the undergraduates last.
Undergraduates (n = 28): 74.72, 74.37, 72.75, 72.60, 71.17, 71.98, 72.27, 69.54, 69.04, 69.68, 68.15, 64.44, 65.28, 58.42, 74.52, 74.09, 73.83, 72.91, 71.34, 71.27, 71.04, 70.11, 69.23, 70.16, 68.92, 66.44, 66.03, 60.04

Trainees (n = 23): 70.39, 73.44, 71.03, 74.42, 70.41, 68.48, 74.39, 71.44, 70.90, 74.19, 69.66, 71.92, 62.33, 74.11, 69.20, 66.42, 70.95, 69.15, 70.18, 63.91, 74.82, 71.36, 73.73

Staff (n = 21): 74.64, 75.15, 75.38, 73.92, 74.44, 68.63, 70.08, 70.86, 72.56, 73.25, 73.85, 70.64, 76.39, 69.55, 75.37, 69.90, 74.16, 71.75, 75.78, 76.55, 76.21

Note that these are not the exact data given by Lehmann. We broke the ties in the actual data by replacing the decimal numbers by others using a table of random numbers.
Under this notion, we will compute the significance probability of the above data using the UI M-test, the isotonic regression test, Jonckheere's test, and Shorack's test.

The model of interest is

F_i(x) = F(x - θ_i),  i = 1, 2, 3,

where θ_i is the effect of the i-th group. It is desired to test H_0: θ_1 = θ_2 = θ_3 = 0 against the ordered alternative

H_1: θ_1 ≤ θ_2 ≤ θ_3, with at least one strict inequality.
The UI-statistic: from (2.1) and (2.2) we obtain

σ̂_Ψ = 8.5857  and  γ̂ = .9429.

Thus, from (2.2.7) and (2.3.7), the M-estimate θ̂* and its covariance matrix are:

θ̂* = (70.1542, 70.8914, 73.2886)',

(σ̂_Ψ²/(Nγ̂²)) A^{-1} = .13 Diag(2.5714, 3.1304, 3.4286).

From (2.4.1) and (3.1), the overall M-estimator θ̂_n = 71.3040, and the vector M_n(θ̂_n) is given by

M_n(θ̂_n) = (-28.875, -8.663, 41.677)'.

From (2.4.12) and since

M̄_i = Σ_{k=i}^{K} M_{nk}(θ̂_n),  i ≥ 2,   (3.1)

it follows that

M̄ = (33.014, 41.677)',

and

Â* = 8.5857 [ .2377  .1134
              .1134  .2066 ].

Now for each J ⊆ {1,2} calculate M̄(J:J') and Â*(JJ:J'), remembering that for J = φ, M̄(J:J') = 0, and for J = {1,2}, M̄(J:J') = M̄.
The following table contains the necessary components for the calculations, where

    M̄(J:J') = M̄(J) − A*(JJ') A*(J')⁻¹ M̄(J'),
    A*(J:J') = A*(JJ) − A*(JJ') A*(J')⁻¹ A*(J'J):

    J      J'     M̄(J)     M̄(J')    A*(JJ)   A*(JJ')  A*(J')⁻¹  M̄(J:J')  A*(J:J')
    {1}    {2}    33.014   41.677   2.0408   .9736    .5638     10.13     1.51
    {2}    {1}    41.677   33.014   1.7738   .9736    .4900     25.92     1.31

For J = φ, M̄(J:J') = 0; for J = {1,2}, J' = φ and

    M̄(J:J') = [ 33.014 ]       A*(J:J') = 8.5857 [ .2377  .1134 ]
               [ 41.677 ],                         [ .1134  .2066 ].
Thus the UI M-test statistic is

    M̂_N = (1/N) Σ_{φ⊆J⊆K} M̄(J:J')' (A*(J:J'))⁻¹ M̄(J:J') 1{M̄(J:J') > 0} 1{A*(J')⁻¹ M̄(J') ≤ 0}

        = (1/72) [33.014  41.677] ( 8.5857 [ .2377  .1134 ] )⁻¹ [ 33.014 ]
                                  (        [ .1134  .2066 ] )   [ 41.677 ]

    M̂_N = 14.5471.
To calculate the significance of M̂_N use expression (2.5.2):

    lim_{N→∞} P(M̂_N ≥ c) = Σ_{r=0}^{2} W_r P(χ²_r ≥ c),

where

    W_r = Σ_{J: |J| = r} P(M̄(J:J') > 0) P((A*(J'))⁻¹ M̄(J') ≤ 0).
The W_r may be calculated as follows. For J = {1,2}, W({1,2}) is the probability that
both variates from a bivariate normal distribution are nonnegative. The expression for
calculating this probability is

    ∫₀^∞ ∫₀^∞ f(x, y, ρ) dx dy = 1/4 + (1/2π) sin⁻¹ ρ,

where ρ is the correlation between the two variates (see Gupta (1963)). This correlation
can be derived from the matrix A* as

    ρ = .1134 / √((.2377)(.2066)) = .5119.

Therefore,

    W({1,2}) = 1/4 + (1/2π) sin⁻¹(.5119) = 1/4 + (1/2π)(.5374) = .3355,

the two single-coordinate weights sum to W({1}) + W({2}) = 1/2, and

    W(φ) = 1 − .5 − .3355 = .1645.
Thus the p-value is

    .3355(.00069) + .5(.00014) = .0003.
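As a numerical check, the quadratic form and the mixture p-value above can be reproduced with a short script (a sketch using only the standard library; the constants are taken from the text, and the tail formulas P(χ²₂ ≥ c) = e^(−c/2) and P(χ²₁ ≥ c) = erfc(√(c/2)) are the standard closed forms):

```python
import math

# Components of the UI M-test of Example 1 (from the text above)
m = (33.014, 41.677)                    # M-bar
A = [[8.5857 * .2377, 8.5857 * .1134],  # A*
     [8.5857 * .1134, 8.5857 * .2066]]
N = 72

# Quadratic form m' (A*)^{-1} m / N via the explicit 2x2 inverse
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
quad = (A[1][1] * m[0]**2 - 2 * A[0][1] * m[0] * m[1] + A[0][0] * m[1]**2) / det
MN = quad / N                                        # about 14.547

# Mixing weights: W({1,2}) = 1/4 + arcsin(rho)/(2*pi); singletons sum to 1/2
rho = .1134 / math.sqrt(.2377 * .2066)               # about .5119
W2 = .25 + math.asin(rho) / (2 * math.pi)            # about .3355
W1 = .5

# p-value = W2 * P(chi2_2 >= MN) + W1 * P(chi2_1 >= MN)
p = W2 * math.exp(-MN / 2) + W1 * math.erfc(math.sqrt(MN / 2))
print(round(MN, 3), round(p, 4))
```

The computed statistic and p-value agree with the values 14.5471 and .0003 reported in the text.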
Next, we applied the isotonic regression test, Shorack's test, and Jonckheere's test
to the same data. To allow for comparison, we list below the results from all tests.
    Test                   Statistic       P-value
    Isotonic Regression    χ̄²₃ = 17.61    .0005
    Shorack                H  = 11.49     .0008
    Jonckheere             J  = 11.63     .0008
    UI M-test              M̂_N = 14.55    .0003
We notice from the figures above that the p-values from all tests are close and
highly significant; that is, the four tests reach the same conclusion. Based on the
significance of each test, we reject the null hypothesis and conclude that training
and experience have a significant effect on the judges' accuracy.
5.4. Example 2.

Consider the following real data set from a trial conducted to compare the
efficiency of 3 treatments used for reversal of anesthesia. The treatments may be
identified as 1, 2, and 3. The two variates are defined as:

    X₁ = time elapsed from administration of treatment to completion of reversal,
    X₂ = time at which the treatment was administered.
    Observation   Treatment    X₁      X₂
    1             1            4.8     17.0
    2             1            13.2    6.0
    3             1            5.8     30.0
    4             1            4.6     40.0
    5             1            6.0     30.5
    6             1            2.9     48.0
    7             1            5.2     50.0
    8             1            5.6     20.0
    9             1            3.9     40.0
    10            1            5.6     24.4
    11            2            6.0     30.0
    12            2            9.6     25.0
    13            2            15.5    25.0
    14            2            8.7     56.0
    15            2            7.9     51.0
    16            2            5.2     61.0
    17            2            6.6     26.0
    18            2            2.7     45.0
    19            2            5.4     45.0
    20            2            8.2     21.0
    21            3            16.4    15.0
    22            3            6.7     6.0
    23            3            8.6     20.0
    24            3            7.9     15.0
    25            3            6.0     32.0
    26            3            19.4    0.0
    27            3            19.0    15.7
    28            3            2.8     59.0
    29            3            6.6     26.0
    30            3            10.4    21.0
Source: Boyd and Sen (1986). The data in this example are a subset of the data set
given in Boyd and Sen (1986).

In this example we consider only variate X₁; that is, we consider a one-way layout
with 3 treatments and 10 observations per treatment. X₂ will be used later in Section
5.5. Again the model of interest is

    F_i(x) = F(x − θ_i),   i = 1, 2, 3,

where θ_i is the effect of the i-th treatment. We want to test for equal treatment effects;
i.e., H₀: θ₁ = θ₂ = θ₃, against the ordered alternative H₁: θ₁ ≤ θ₂ ≤ θ₃, with at
least one strict inequality.

In this example there is a tendency toward nonhomogeneity of treatment effect. If we
use the regression quantiles method discussed by Jureckova and Sen (1984), we will be
eliminating the effect of the variation among treatments in calculating k_n. The choice
of k_n is very crucial for the robustness of the test; i.e., the larger k_n, the less robust
the test will be. Thus, we will look at each individual sample range and use roughly the
median of the semiranges ((max − min)/2) as k_n. This gives k_n = 6.5, while the other
method leads to k_n = 8.1.
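The role of the truncation constant can be illustrated with a small sketch of a Huber-type M-estimate of location (an illustration only, assuming the Huber score function ψ(u) = max(−k, min(k, u)); the sample below is the treatment-1 column of X₁, and the root is found by bisection, since Σ ψ(x_i − θ) is nonincreasing in θ):

```python
# Huber-type M-estimate of location: solve sum psi(x_i - theta) = 0
# with psi(u) = max(-k, min(k, u)); k plays the role of k_n above.
def huber_m_estimate(x, k, tol=1e-8):
    def score(theta):
        return sum(max(-k, min(k, xi - theta)) for xi in x)
    lo, hi = min(x), max(x)          # the root lies within the sample range
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid) > 0:           # score is nonincreasing in theta
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Treatment 1 observations of X1 from the table above
x1_trt1 = [4.8, 13.2, 5.8, 4.6, 6.0, 2.9, 5.2, 5.6, 3.9, 5.6]
theta1 = huber_m_estimate(x1_trt1, k=6.5)
print(round(theta1, 2))
```

With k = 6.5 the single large observation 13.2 is truncated, and the estimate lands near 5.66, the first component of θ̂* reported below.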
From (2.1) and (2.2), it follows that σ̂² = 12.931 and γ̂ = .8412. The vector of
M-estimators θ̂* and its asymptotic variance are given by

    θ̂* = (5.66  7.42  9.87)'

and

    (σ²/Nγ²) A⁻¹ = .609 Diag(3, 3, 3).

The combined M-estimate θ̄_N and the vector M_N(θ̄_N) are given by θ̄_N = 7.42 and

    M_N(θ̄_N) = (−16.6  .02  16.56)',

respectively. From (2.4.12) and since

    M̄_i = Σ_{j=i}^{K} M_{Nj}(θ̄_N),

it follows that

    M̄ = (16.58  16.56)',

and its asymptotic variance is

    A* = [ 2.87  1.44 ]
         [ 1.44  2.87 ].
For each J ⊆ {1,2}, we calculate M̄(J:J') and A*(J:J'). The following table contains
the necessary components for the calculations.
    J      J'     M̄(J)     M̄(J')    A*(J')⁻¹   M̄(J:J')   A*(J:J')
    {1}    {2}    16.58    16.56    .35        8.28       2.16
    {2}    {1}    16.56    16.58    .35        8.27       2.16

with A*(JJ) = 2.87 and A*(JJ') = 1.44 in each case. For J = φ, M̄(J:J') = 0; for
J = {1,2},

    M̄(J:J') = [ 16.58 ]       A*(J:J') = [ 2.87  1.44 ]
               [ 16.56 ],                  [ 1.44  2.87 ].
Thus, the UI M-test statistic is

    M̂_N = (1/N) Σ_{φ⊆J⊆K} M̄(J:J')' (A*(J:J'))⁻¹ M̄(J:J') 1{M̄(J:J') > 0} 1{A*(J')⁻¹ M̄(J') ≤ 0}

        = (1/30) [16.58  16.56] [ 2.87  1.44 ]⁻¹ [ 16.58 ]
                                [ 1.44  2.87 ]   [ 16.56 ]

    M̂_N = 4.25.
The significance level is given by

    lim_{N→∞} P(M̂_N ≥ c) = Σ_{r=0}^{2} W_r P(χ²_r ≥ c),

where

    W_r = Σ_{J: |J| = r} P(M̄(J:J') > 0) P((A*(J'))⁻¹ M̄(J') ≤ 0).

Following the same approach as in Example 1 yields a p-value of .049.
The UI M-test in this situation did not do very well. I suspect the reason is that the
linearity result in (2.3.6) holds under both the null hypothesis and a local alternative,
but when we move away from the null hypothesis in the direction of the alternative, it
may not be accurate. So perhaps if we use the estimator directly instead of the aligned
M-statistic, and take the variation among treatments into consideration, this will
improve the performance of the test. Therefore, we next compute a statistic based on
the difference between the individual sample estimator θ̂_i and the combined estimator
θ̄_N. Following the same approach as in Chapter 2, we maximize the likelihood ratio
subject to

    θ_i ≤ θ_{i+1},   i = 1, ..., K − 1.
In the new setting, M̄ and A* in (2.4.12) are replaced by Ū and Ā, respectively, where

    U = (U₂, ..., U_K)',   Ū_i = Σ_{j=i}^{K} U_j = Σ_{j=i}^{K} n_j (θ̂_j − θ̄_N),

and

    Ā = (σ²/γ²) ((λ̄_{ii'})),

where, with λ_j = n_j/N,

    λ̄_{ii'} = Σ_{j=i}^{K} λ_j − ( Σ_{j=i}^{K} λ_j )²,                              i = i' = 1, ..., K,

    λ̄_{ii'} = Σ_{j=max(i,i')}^{K} λ_j − ( Σ_{j=i}^{K} λ_j )( Σ_{j=i'}^{K} λ_j ),   i ≠ i' = 1, ..., K.
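The entries of Ā can be checked numerically for the balanced case here (a sketch of the formula as reconstructed; n_j = 10 and N = 30, so λ_j = 1/3, and σ̂² = 12.931, γ̂ = .8412 from above):

```python
K = 3
n = [10, 10, 10]
N = sum(n)
lam = [nj / N for nj in n]             # lambda_j = n_j / N

def tail(i):                           # sum_{j=i}^{K} lambda_j  (1-based i)
    return sum(lam[i - 1:])

def lam_bar(i, ip):
    # lambda-bar_{ii'} = tail(max(i,i')) - tail(i) * tail(i')
    return tail(max(i, ip)) - tail(i) * tail(ip)

sigma2, gamma = 12.931, .8412
A = [[(sigma2 / gamma**2) * lam_bar(i, ip) for ip in (2, 3)] for i in (2, 3)]
print([[round(v, 2) for v in row] for row in A])
```

The inner factors reproduce ((λ̄)) = [[.22224, .11112], [.11112, .22224]], and scaling by σ̂²/γ̂² gives the matrix Ā = [[4.06, 2.03], [2.03, 4.06]] used below.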
The UI statistic is given by

    M̂_N = (1/N) Σ_{φ⊆J⊆K} Ū(J:J')' (Ā(J:J'))⁻¹ Ū(J:J') 1{Ū(J:J') > 0} 1{Ā(J')⁻¹ Ū(J') ≤ 0}.
Now

    U = (−17.6  0  24.5)',   Ū = (24.5  24.5)',

and

    Ā = (12.931/(.8412)²) [ .22224  .11112 ]
                          [ .11112  .22224 ].
After partitioning Ū and the corresponding Ā, it follows that

    J      Ū(J:J')          Ā(J:J')
    φ      0                —
    {1}    12.25            3.05
    {2}    12.25            3.05
    {1,2}  [ 24.5 ]         [ 4.06  2.03 ]
           [ 24.5 ]         [ 2.03  4.06 ]
and the UI-statistic M̂_N is

    M̂_N = (1/30) [24.5  24.5] [ 4.06  2.03 ]⁻¹ [ 24.5 ]
                               [ 2.03  4.06 ]   [ 24.5 ]  =  6.57.
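The partition and the resulting statistic can be verified directly (a sketch; only the J = {1,2} term contributes to the sum here, since for the singleton subsets the complementary component Ū(J') is positive and the second indicator fails):

```python
# Partitioned components and the UI statistic for Example 2 (orthant form).
U_bar = (24.5, 24.5)
A = [[4.06, 2.03], [2.03, 4.06]]
N = 30

# Conditional component for J = {1} given J' = {2} (and symmetrically):
U_cond = U_bar[0] - A[0][1] / A[1][1] * U_bar[1]        # 24.5 - .5 * 24.5
A_cond = A[0][0] - A[0][1] ** 2 / A[1][1]               # 4.06 - 2.03^2 / 4.06

# Full-set quadratic form (the only term whose indicators are satisfied here)
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
quad = (A[1][1] * U_bar[0]**2 - 2 * A[0][1] * U_bar[0] * U_bar[1]
        + A[0][0] * U_bar[1]**2) / det
MN = quad / N
print(round(U_cond, 2), round(A_cond, 2), round(MN, 2))
```

The conditional values 12.25 and 3.05 match the table above, and the quadratic form reproduces M̂_N = 6.57.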
The corresponding p-value is .018.

Again, to allow for comparison, we applied Jonckheere's test, the isotonic regression
test, and Shorack's test to the same data set. The results from these tests, accompanied
by the result from the UI-test, are listed below.
    Test                   Statistic       P-value
    Isotonic Regression    χ̄²₃ = 6.07     .02
    Shorack                H  = 7.29      .01
    Jonckheere             J  = 2.19      .004
    UI M-test              M̂_N = 6.57     .018
This example is a little different: the p-values are in general larger than those for
Example 1, but they are still significant, and again the four tests reach the same
conclusion. So, based on each test, we reject the null hypothesis and conclude that
the treatments have significant effects.
5.5. Example 3.

We consider here the anesthesia data set of Section 5.4, involving the two variates
X₁ and X₂, 3 treatments, and 10 observations per treatment. This is a direct
application of Chapter 4.

Again the model of interest is

    F_l(y) = F(y − x β_l⁰),   l = 0, 1, 2,

where β_l, l = 1, 2, is the effect of the l-th treatment. It is desired to test
H₀: β₀ = β₁ = β₂ against the ordered alternative H₁: β₀ ≤ β₁ ≤ β₂, with at least one
strict inequality. Note that, in the case of 3 treatments, we take the first treatment
as the reference to avoid having a singular variance matrix.

Define

    θ₀⁰ = β₀,   θ₁⁰ = β₁ − β₀,   θ₂⁰ = β₂ − β₁.
Now the model may be expressed as

    y_{ij} = θ₀ⱼ + x_{i1} θ₁ⱼ + x_{i2} θ₂ⱼ + e_{ij},

where

    x⁰ = (1, ..., 1)',
    x_{1,1} = ... = x_{10,1} = 0;   x_{11,1} = ... = x_{20,1} = 1;   x_{21,1} = ... = x_{30,1} = 1;
    x_{1,2} = ... = x_{10,2} = 0;   x_{11,2} = ... = x_{20,2} = 0;   x_{21,2} = ... = x_{30,2} = 1.

Now the null and alternative hypotheses can be written as

    H₀: θ_i⁰ = 0   vs   H₁: θ_i⁰ ≥ 0,   i = 1, 2.
For j = 1, 2, equations (4.2.12) and (4.2.13) yield the following 3 equations:

    M_{N0j}(θ) = Σ_{i=1}^{10} ψ(y_{ij} − θ₀ⱼ) + Σ_{i=11}^{20} ψ(y_{ij} − θ₀ⱼ − θ₁ⱼ)
                 + Σ_{i=21}^{30} ψ(y_{ij} − θ₀ⱼ − θ₁ⱼ − θ₂ⱼ) = 0,

    M_{N1j}(θ) = Σ_{i=11}^{20} ψ(y_{ij} − θ₀ⱼ − θ₁ⱼ) + Σ_{i=21}^{30} ψ(y_{ij} − θ₀ⱼ − θ₁ⱼ − θ₂ⱼ) = 0,

    M_{N2j}(θ) = Σ_{i=21}^{30} ψ(y_{ij} − θ₀ⱼ − θ₁ⱼ − θ₂ⱼ) = 0.

By solving these equations simultaneously for j = 1, 2, it follows that

    θ̂⁰ = [ θ̂₀₁  θ̂₀₂ ]   [ 5.66    31.10 ]
          [ θ̂₁₁  θ̂₁₂ ] = [ 1.76     7.12 ]                                  (5.1)
          [ θ̂₂₁  θ̂₂₂ ]   [ 2.45   −19.25 ].
In order to compute the asymptotic variance of θ̂⁰ defined in (4.3.6), we need first
to estimate Σ and Γ, and to find D⁰.

(A) We estimate the covariance matrix Σ by Σ̂.

(B) We estimate γ₁, γ₂ by the diagonal elements of Γ̂, respectively, where each γ̂ⱼ,
    j = 1, 2, is defined in (4.2.9).

(C) From assumption C1 of Section 4.2, we compute the matrix D⁰ (3×3) as
    D⁰ = (1/N) X'X, where

        X' = [ 1...1  1...1  1...1 ]
             [ 0...0  1...1  1...1 ]
             [ 0...0  0...0  1...1 ].

From (A)–(C) above, it follows that

    Σ̂ = [  11.638   −33.195 ]        Γ̂ = [ .8403    0   ]
         [ −33.195   153.292 ],            [   0    .8231 ],

and

    D⁰ = [   1      .6667   .3333 ]
         [ .6667    .6667   .3333 ]
         [ .3333    .3333   .3333 ].

The asymptotic variance of θ̂⁰, Λ⁰ defined in (4.3.6), is estimated by

    Λ̂⁰ = D⁰⁻¹ ⊗ Γ̂⁻¹ Σ̂ Γ̂⁻¹.
That is,

    Λ̂⁰ = [   49.45  −143.98   −49.45   143.98      0        0    ]
          [ −143.98   678.80   143.98  −678.80      0        0    ]
          [  −49.45   143.98    98.89  −287.96   −49.44   143.98  ]
          [  143.98  −678.80  −287.96  1357.57   143.98  −678.76  ]          (5.2)
          [     0        0    −49.44   143.98    98.89   −287.96  ]
          [     0        0    143.98  −678.76  −287.96   1357.55  ].

From (5.1) and (5.2) above, it follows that the estimate of the slope θ̂ and its
asymptotic variance Λ̂ are given by

    θ̂ = [ 1.76    7.12 ]
         [ 2.45  −19.25 ]

and

    Λ̂ = [   98.89  −287.96   −49.44   143.98 ]
         [ −287.96  1357.57   143.98  −678.76 ]
         [  −49.44   143.98    98.89  −287.96 ]
         [  143.98  −678.76  −287.96  1357.55 ].
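The matrix Λ̂⁰ can be reproduced from Σ̂, Γ̂, and D⁰ (a sketch using plain lists; the 3×3 inverse is computed by Gauss–Jordan elimination, and ⊗ is the Kronecker product):

```python
def inverse(M):
    # Gauss-Jordan inverse of a small square matrix (lists of lists)
    n = len(M)
    A = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))   # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(n):
            if r != c:
                A[r] = [v - A[r][c] * w for v, w in zip(A[r], A[c])]
    return [row[n:] for row in A]

def kron(P, Q):
    # Kronecker product of two matrices
    return [[P[i][j] * Q[k][l] for j in range(len(P[0])) for l in range(len(Q[0]))]
            for i in range(len(P)) for k in range(len(Q))]

Sigma = [[11.638, -33.195], [-33.195, 153.292]]
Gamma_inv = [[1 / .8403, 0], [0, 1 / .8231]]
D0 = [[1, 2/3, 1/3], [2/3, 2/3, 1/3], [1/3, 1/3, 1/3]]

# Gamma^{-1} Sigma Gamma^{-1} (Gamma is diagonal, so scale rows and columns)
V = [[Gamma_inv[i][i] * Sigma[i][j] * Gamma_inv[j][j] for j in range(2)] for i in range(2)]
Lam0 = kron(inverse(D0), V)        # 6x6 estimate of Lambda^0
print(round(Lam0[0][0], 2), round(Lam0[3][3], 2))
```

The leading entries 49.45, −143.98, and 1357.57 of (5.2) come out directly, since D⁰⁻¹ = 3[[1,−1,0],[−1,2,−1],[0,−1,2]] here.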
As in (4.4.7) we define

    U = Λ̂⁻¹ vec(θ̂)   and   V = Λ̂⁻¹;

then the UI M-statistic is given by

    L̂_N⁺ = N Σ_{φ⊆J⊆{1,2,3,4}} U(J:J')' (V(J:J'))⁻¹ U(J:J') 1{U(J:J') > 0} 1{V(J')⁻¹ U(J') ≤ 0}.

Now for each J ⊆ {1,2,3,4} calculate U(J:J') and V(J:J'). The following table
contains the necessary components for the calculations.
    [Table: components U(J:J'), V(J')⁻¹U(J'), and V(J:J') for each pair (J, J') with
    J ⊆ {1,2,3,4} and J' its complement — φ, {1}, {2}, {3}, {4}, {1,2}, {1,3}, {1,4},
    {2,3}, {2,4}, {3,4}, {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, {1,2,3,4}. The individual
    entries are not fully recoverable here; the subset satisfying both indicator
    conditions is J = {1,2,3}, with U(J:J') = (.109  .024  .045)'.]
Thus, the UI M-statistic is given by

    L̂_N⁺ = 30 (.109  .024  .045) (V({1,2,3}:{4}))⁻¹ (.109  .024  .045)'

    L̂_N⁺ = 14.091.
The significance of L̂_N⁺ can be calculated from expression (4.4.12), where

    W_r = Σ_{J} P(U(J:J') > 0) P(V(J')⁻¹ U(J') ≤ 0).

For J ⊆ {1,2,3,4}, the W_r can be calculated using the expressions given by
Bartholomew (1959):

    P(4,4) = (1/4π) (2π − cos⁻¹ρ₁₂ − cos⁻¹ρ₁₃ − cos⁻¹ρ₂₃),

    P(3,4) = 3/4 − (1/4π) (cos⁻¹ρ₁₂·₃ + cos⁻¹ρ₂₃·₁ + cos⁻¹ρ₁₃·₂),

    P(2,4) = 1/2 − P(4,4),                                                   (5.3)

    P(1,4) = 1 − P(2,4) − P(3,4) − P(4,4),

where ρ_{ij} is the correlation between the two variates i and j, and

    ρ_{ij·K} = (ρ_{ij} − ρ_{iK} ρ_{jK}) / √((1 − ρ_{iK}²)(1 − ρ_{jK}²)).

The following table contains the necessary components for the calculation of
P(l,4), l = 1, ..., 4.
                      i=1, j=2    i=1, j=3    i=2, j=3
    ρ_{ij}             −.79        −.499        .39
    cos⁻¹ρ_{ij}         2.48        2.09        1.17
    ρ_{ij·K}           −.74        −.34         0
    cos⁻¹ρ_{ij·K}       2.40        1.91        1.57

From (5.3), it follows that

    P(4,4) = .0435,   P(3,4) = .2815,   P(2,4) = .4565,   P(1,4) = .2186.

Thus the p-value is calculated as
    p-value = 1 − [.2186(.9998) + .4565(.9991) + .2815(.9972) + .0435(.993)]
            = .002.
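These level probabilities and the resulting p-value can be checked numerically (a sketch starting from the correlations in the table above; the χ² upper-tail probabilities for 1–4 degrees of freedom use their closed forms):

```python
import math

# Correlations among the three contrasts (from the table above)
rho = {(1, 2): -.79, (1, 3): -.499, (2, 3): .39}

def partial(i, j, k):
    # rho_{ij.k} = (rho_ij - rho_ik rho_jk) / sqrt((1-rho_ik^2)(1-rho_jk^2))
    r = lambda a, b: rho[(min(a, b), max(a, b))]
    return (r(i, j) - r(i, k) * r(j, k)) / math.sqrt((1 - r(i, k)**2) * (1 - r(j, k)**2))

P44 = (2 * math.pi - sum(math.acos(v) for v in rho.values())) / (4 * math.pi)
P34 = .75 - (math.acos(partial(1, 2, 3)) + math.acos(partial(2, 3, 1))
             + math.acos(partial(1, 3, 2))) / (4 * math.pi)
P24 = .5 - P44
P14 = 1 - P24 - P34 - P44

# chi-square upper-tail probabilities for df = 1..4 in closed form
c = 14.091
sf = {1: math.erfc(math.sqrt(c / 2)),
      2: math.exp(-c / 2),
      3: math.erfc(math.sqrt(c / 2)) + math.sqrt(2 * c / math.pi) * math.exp(-c / 2),
      4: math.exp(-c / 2) * (1 + c / 2)}
p_value = P14 * sf[1] + P24 * sf[2] + P34 * sf[3] + P44 * sf[4]
print(round(P34, 3), round(p_value, 3))
```

Up to rounding of the tabulated correlations, this reproduces P(3,4) ≈ .2815, P(4,4) ≈ .0435, and a p-value of about .002.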
Next, for the sake of comparison, we compute the bivariate analogue of
Jonckheere's test. Following the same approach as in Example 2, we calculate
Jonckheere's statistic for each variate X₁ and X₂, and call them J₁ and J₂,
respectively:

    J₁ = 219,   E(J₁) = 150,   Var(J₁) = 691.6667.
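The value J₁ = 219 can be reproduced from the X₁ column of the data table in Section 5.4 (a sketch; J counts, over all ordered pairs of treatments i < j, the pairs of observations with the lower-treatment value strictly smaller, and the null moments use the standard formulas E(J) = (N² − Σnᵢ²)/4 and Var(J) = (N²(2N+3) − Σnᵢ²(2nᵢ+3))/72):

```python
# Jonckheere's statistic for the X1 variate of the anesthesia data
groups = [
    [4.8, 13.2, 5.8, 4.6, 6.0, 2.9, 5.2, 5.6, 3.9, 5.6],    # treatment 1
    [6.0, 9.6, 15.5, 8.7, 7.9, 5.2, 6.6, 2.7, 5.4, 8.2],    # treatment 2
    [16.4, 6.7, 8.6, 7.9, 6.0, 19.4, 19.0, 2.8, 6.6, 10.4]  # treatment 3
]

# Count concordant pairs across all lower-higher treatment pairs
J = sum(sum(1 for a in groups[i] for b in groups[j] if a < b)
        for i in range(3) for j in range(i + 1, 3))

n = [len(g) for g in groups]
N = sum(n)
EJ = (N**2 - sum(v**2 for v in n)) / 4
VarJ = (N**2 * (2 * N + 3) - sum(v**2 * (2 * v + 3) for v in n)) / 72
print(J, EJ, round(VarJ, 4))
```

This yields J₁ = 219, E(J₁) = 150, and Var(J₁) = 691.6667, as stated.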
Denote by Z_i = J_i − E(J_i), i = 1, 2. Thus, under the null hypothesis, Z₁ and Z₂
are each approximately normal with null mean and variance 691.6667, and the vector
Z = (Z₁  Z₂)' has a bivariate normal distribution with null mean and covariance matrix

    Σ = [ σ_{z1}²        ρ σ_{z1} σ_{z2} ]
        [ ρ σ_{z1} σ_{z2}    σ_{z2}²      ],

where ρ is the Spearman correlation between variates X₁ and X₂.

The problem now is to test that the mean vector δ of Z is null against an orthant
alternative, i.e.,

    H₀: δ = 0   vs   H₁: δ ≥ 0;

that is, we are looking for the vector δ̄ that maximizes the likelihood function in the
region H₁: δ ≥ 0. Barlow et al. (1972) indicate that the test statistic for this problem
is the chi-bar-square statistic χ̄²,
where δ̄ is the MLE of δ in the region δ ≥ 0. From Chapter 3, Section 6, it follows
that

    δ̄ = [ Z(J:J') ]
         [    0    ].

Thus

    χ̄² = (1 / σ̂_z²(1 − ρ²)) Z(J:J')' Z(J:J').

Note that Z = (69  −40)' and its asymptotic covariance matrix is

    Σ̂ = 691.6667 [   1     −.624 ]
                  [ −.624     1   ].

For each J ⊆ {1,2}, we calculate the following.
    J      Z(J:J')          Σ̂(J')⁻¹ Z(J')
    φ      0                —
    {1}    44.04            −.0578
    {2}    3.056            .0998
    {1,2}  [ 69 ]           —        with Σ̂(J:J') = 691.6667 [  1    −.624 ]
           [ −40 ]                                            [ −.624   1   ]
Thus

    χ̄² = (44.04)² / (691.6667 (1 − (−.624)²)) = 4.5922.
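A short check of this orthant statistic (a sketch; only the J = {1} term contributes, since Z₂ < 0 rules out the full vector, while for J = {2} the complementary component Σ̂({1})⁻¹Z({1}) = .0998 > 0 fails the second indicator):

```python
# Orthant test for the centered bivariate Jonckheere statistic
Z = (69.0, -40.0)
var = 691.6667           # common null variance of Z1 and Z2
rho = -.624              # Spearman correlation between X1 and X2

# Conditional component for J = {1} given J' = {2}:
# Z({1}:{2}) = Z1 - (rho * var / var) * Z2 = Z1 - rho * Z2
Z_cond = Z[0] - rho * Z[1]                      # 69 - (-.624)(-40) = 44.04
chibar2 = Z_cond**2 / (var * (1 - rho**2))
print(round(Z_cond, 2), round(chibar2, 4))
```

This reproduces Z({1}:{2}) = 44.04 and χ̄² = 4.5922.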
The significance of χ̄² is given by

    P(χ̄² ≥ c) = Σ_{j=1}^{2} Q(j, ρ) P(χ²_j ≥ c),

where Q(j, ρ) is the probability that δ̄ has exactly j nonzero elements, and χ²_j
denotes a random variable having a chi-squared distribution with j degrees of freedom.
The p-value is calculated as

    .357 (1 − .9679) + .143 (1 − .8994) = .026.
The p-value (.002) from the UI M-test is substantially smaller than the other one
(.026), which may indicate its superiority in this situation. At the conventional 5%
significance level, there appears to be a significant difference in treatment effects in
favor of the ordered alternative.
5.6. Performance of the Test.

In this section we investigate the asymptotic performance of the UI M-test
developed in Chapter 2, for samples of sizes 10, 15, 20, 25, 30, 35, and 40, in the case
of K = 3 populations, using a type I error of α = .05. Several cases were considered
for these assessments, namely:
(i)   the null case, where all means are equal to zero;

(ii)  equally spaced ordered means:

      μ₁ = .05   μ₂ = .10    μ₃ = .15
      μ₁ = .10   μ₂ = .20    μ₃ = .30
      μ₁ = .20   μ₂ = .40    μ₃ = .60
      μ₁ = .50   μ₂ = 1.00   μ₃ = 1.50

(iii) ordered but not equally spaced means:

      μ₁ = .05   μ₂ = .10    μ₃ = .20
      μ₁ = .10   μ₂ = .30    μ₃ = .60
      μ₁ = .20   μ₂ = .40    μ₃ = 1.00
      μ₁ = .50   μ₂ = 1.00   μ₃ = 2.00
For each sample size, 1000 independent samples were simulated for each population,
and the UI M-statistic and the isotonic regression statistic were calculated for each
sample. The UI M-statistic was compared to the .95 value of Table 2.1, and the null
hypothesis that the location parameters of the three populations are equal was rejected
if the computed value was larger than the tabulated value. As for the isotonic
regression test, the null hypothesis was rejected if the value of the statistic χ̄² was
greater than the .95 value in Table A3 of Barlow et al. (1972), which gives the critical
values of the χ̄² statistic. The proportion of rejections was recorded for each test.
Table 5.1 gives the proportions of rejections for different values of n when the
treatment means are all equal to zero. Tables 5.2a–5.2d give the results when the
treatment means are equally spaced. The case of unequally spaced means is presented
in Tables 5.3a–5.3d.
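The simulation for the isotonic regression arm can be sketched as follows (an illustration only: standard normal errors with known σ = 1, isotonic means computed by pool-adjacent-violators, and an approximate .95 critical value of 3.82 for the χ̄² statistic with K = 3 and equal weights — that constant is a rough assumption here, not the Table A3 value used in the text):

```python
import random

def pava(y):
    # Pool-adjacent-violators: isotonic (nondecreasing) fit, equal weights
    blocks = []
    for v in y:
        blocks.append([v, 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return [s / c for s, c in blocks for _ in range(c)]

def chibar2(means, n):
    # chi-bar-square statistic with sigma = 1: sum n (mu*_i - mu_bar)^2
    grand = sum(means) / len(means)
    return sum(n * (m - grand) ** 2 for m in pava(means))

def reject_rate(mu, n=30, reps=500, crit=3.82):
    hits = 0
    for _ in range(reps):
        means = [sum(random.gauss(m, 1) for _ in range(n)) / n for m in mu]
        if chibar2(means, n) > crit:
            hits += 1
    return hits / reps

random.seed(1)
rate_null = reject_rate([0.0, 0.0, 0.0])
rate_alt = reject_rate([0.5, 1.0, 1.5])
print(rate_null, rate_alt)
```

Under the null configuration the rejection rate hovers near .05, and under the strongly ordered configuration it is close to 1, mirroring Tables 5.1 and 5.2d.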
Table 5.1
Proportion of Rejections
Case of Equal Means: μ₁ = 0.0, μ₂ = 0.0, μ₃ = 0.0
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .051         .053
    15               .051         .056
    20               .055         .052
    25               .060         .068
    30               .054         .063
    35               .057         .060
    40               .047         .055
Table 5.2a
Proportion of Rejections
Case of Equally Spaced Means: μ₁ = 0.05, μ₂ = 0.10, μ₃ = 0.15
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .068         .073
    15               .081         .093
    20               .087         .093
    25               .101         .115
    30               .099         .113
    35               .109         .117
    40               .100         .116
Table 5.2b
Proportion of Rejections
Case of Equally Spaced Means: μ₁ = 0.1, μ₂ = 0.2, μ₃ = 0.3
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .094         .106
    15               .123         .143
    20               .143         .163
    25               .160         .186
    30               .165         .187
    35               .179         .201
    40               .184         .212
Table 5.2c
Proportion of Rejections
Case of Equally Spaced Means: μ₁ = 0.2, μ₂ = 0.4, μ₃ = 0.6
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .183         .215
    15               .238         .280
    20               .288         .334
    25               .326         .386
    30               .381         .443
    35               .410         .474
    40               .481         .540
Table 5.2d
Proportion of Rejections
Case of Equally Spaced Means: μ₁ = 0.5, μ₂ = 1.0, μ₃ = 1.5
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .599         .682
    15               .752         .825
    20               .875         .919
    25               .925         .955
    30               .963         .980
    35               .979         .990
    40               .989         .998
Table 5.3a
Proportion of Rejections
Case of Non-equally Spaced Means: μ₁ = 0.05, μ₂ = 0.10, μ₃ = 0.20
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .084         .086
    15               .114         .100
    20               .131         .101
    25               .141         .133
    30               .151         .132
    35               .152         .141
    40               .160         .142
Table 5.3b
Proportion of Rejections
Case of Non-equally Spaced Means: μ₁ = 0.1, μ₂ = 0.3, μ₃ = 0.6
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .246         .281
    15               .311         .365
    20               .392         .441
    25               .450         .520
    30               .521         .615
    35               .515         .655
    40               .639         .703
Table 5.3c
Proportion of Rejections
Case of Non-equally Spaced Means: μ₁ = 0.2, μ₂ = 0.4, μ₃ = 1.0
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .471         .537
    15               .614         .687
    20               .750         .816
    25               .835         .882
    30               .892         .927
    35               .931         .953
    40               .951         .972
Table 5.3d
Proportion of Rejections
Case of Non-equally Spaced Means: μ₁ = 0.5, μ₂ = 1.0, μ₃ = 2.0
Using K = 3, α = .05

    Sample Size n    UI M-Test    Isotonic Regression Test
    10               .923         .952
    15               .983         .992
    20               .996         .998
    25               1.000        1.000
    30               1.000        1.000
    35               1.000        1.000
    40               1.000        1.000
The rejection proportions shown in Table 5.1 should all be around .05. For the
union-intersection test, 6 out of 7 are greater than .05, and for the isotonic regression
test they range from .052 up to .068 and are all greater than .05. This suggests that
both tests, but especially the isotonic regression test, may be a bit liberal.

Looking at Tables 5.2a–5.3d, we see that the values listed for the isotonic regression
test are greater than those for the union-intersection test, but their ratio is not generally
greater under H₁ than under H₀; for the larger sample sizes, in fact, the ratio is
smaller. This shows that the two tests appear very similar in power for the alternatives
considered, with perhaps the isotonic regression test better for the smaller sample sizes
and the union-intersection test better for the larger sample sizes. It is important to
notice that we are drawing the samples from a normal distribution and the isotonic
regression test is designed for that. If we were drawing from a non-normal distribution,
it is expected that the union-intersection test would be more robust. Also, with K = 3
it is expected that the isotonic regression test performs better, since in the
union-intersection test we are working with only two coordinates.
5.7. Asymptotic Properties of the PTE.

In this section we demonstrate the asymptotic properties of the PTE through a
study of its average bias in a number of simulated samples and compare it with the
average bias obtained from using the isotonic regression estimator; we also study the
asymptotic relative efficiency of the PTE with respect to the isotonic regression
estimator in the generated samples.
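The isotonic regression estimator used throughout this comparison restricts the estimated means to be nondecreasing; for K ordered groups with equal sample sizes it can be computed by the pool-adjacent-violators algorithm (a minimal sketch with equal weights):

```python
def isotonic_means(means):
    # Pool-adjacent-violators: repeatedly pool any decreasing adjacent pair
    # of blocks into their combined mean until the fit is nondecreasing.
    blocks = []                       # each block holds [total, count]
    for m in means:
        blocks.append([float(m), 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return [s / c for s, c in blocks for _ in range(c)]

print(isotonic_means([3, 1, 2]))      # -> [2.0, 2.0, 2.0]
print(isotonic_means([1, 3, 2]))      # -> [1.0, 2.5, 2.5]
```

Under the null configuration (all means equal), this pooling is what pushes the fitted extremes apart, producing the characteristic negative bias in μ̂₁ and positive bias in μ̂₃ seen in Table 5.4.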
As before, several cases were considered in these numerical assessments:

(i)   the null case, where all means are equal;
(ii)  the equally spaced ordered means;
(iii) the ordered but not equally spaced means.

We considered the case of K = 3 populations with different sample sizes n = 10, 15,
20, 25, 30, 35, and 40. For each sample size, 1000 independent samples were simulated,
and the PTE as well as the isotonic regression estimator of the means were calculated
for each of the above three cases. The average bias was computed for each estimator
and for each sample size, as well as the asymptotic relative efficiency e' of the PTE
with respect to the isotonic regression estimator.

To define e', let

    b_j = μ̂_j − μ,   j = 1, ..., NS,

where NS is the number of samples, and μ̂ and μ are the vectors of estimated and
original means, respectively. Define

    S = Σ_{j=1}^{NS} b_j b_j' / NS;

then the asymptotic relative efficiency e' of the PTE with respect to the isotonic
regression estimator is given by

    e' = ( |S_I| / |S_P| )^{1/K},

the K-th root of the ratio of the generalized variances of the two estimators.

Table 5.4 gives the average bias due to the PTE and the isotonic regression
estimators, and e', when the treatment means are all equal to zero. Tables 5.5a–5.5d
give the same results when the means are equally spaced; the values selected for
Tables 5.5a and 5.5b are nearly (2√N)⁻¹(1,2,3) and (√N)⁻¹(1,2,3).
The case of unequally spaced means is presented in tables 5.6a-5.6d.
Table 5.4
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Equal Means: μ₁ = 0.0, μ₂ = 0.0, μ₃ = 0.0

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .0061   .0136   .0329      −.1099    .0058   .1318     4.3311
    15    .0069   .0141   .0251      −.0931    .0074   .1071     4.9139
    20    .0070   .0130   .0250      −.0795    .0068   .0955     4.2977
    25    .0039   .0094   .0220      −.0748    .0021   .0848     4.7852
    30    .0057   .0095   .0195      −.0685    .0043   .0776     5.4195
    35    .0063   .0103   .0209      −.0601    .0053   .0727     5.0517
    40    .0063   .0101   .0180      −.0566    .0062   .0692     4.7341
Table 5.5a
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Equally Spaced Means: μ₁ = 0.05, μ₂ = 0.10, μ₃ = 0.15

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .0509   .0127   −.0073     −.0884    .0055   .1105     2.9753
    15    .0485   .0149   −.0137     −.0718    .0071   .0862     2.7923
    20    .0481   .0119   −.0129     −.0588    .0068   .0747     2.4020
    25    .0442   .0088   −.0140     −.0540    .0017   .0643     2.3726
    30    .0455   .0086   −.0168     −.0490    .0041   .0583     2.2889
    35    .0460   .0092   −.0142     −.0409    .0052   .0535     2.0953
    40    .0461   .0088   −.0175     −.0375    .0059   .0503     2.0015
Table 5.5b
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Equally Spaced Means: μ₁ = 0.1, μ₂ = 0.2, μ₃ = 0.3

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .0894   .0111   −.0428     −.0704    .0052   .0928     2.0132
    15    .0822   .0138   −.0449     −.0542    .0068   .0689     1.7756
    20    .0799   .0119   −.0421     −.0425    .0064   .0588     1.5676
    25    .0749   .0066   −.0420     −.0382    .0016   .0486     1.4826
    30    .0750   .0064   −.0439     −.0338    .0039   .0433     1.4027
    35    .0740   .0078   −.0425     −.0270    .0052   .0396     1.3054
    40    .0717   .0077   −.0438     −.0237    .0054   .0371     1.2109
Table 5.5c
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Equally Spaced Means: μ₁ = 0.2, μ₂ = 0.4, μ₃ = 0.6

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .1439   .0081   −.0928     −.0424    .0045   .0655     1.2677
    15    .1249   .0096   −.0880     −.0282    .0062   .0435     1.1038
    20    .1124   .0097   −.0794     −.0202    .0057   .0372     1.0008
    25    .1001   .0030   −.0744     −.0172    .0007   .0285     .9374
    30    .0872   .0060   −.0658     −.0160    .0036   .0258     .8815
    35    .0835   .0076   −.0624     −.0095    .0042   .0232     .8528
    40    .0681   .0076   −.0490     −.0079    .0046   .0220     .8192
Table 5.5d
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Equally Spaced Means: μ₁ = 0.5, μ₂ = 1.0, μ₃ = 1.5

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .1315   .0084   −.1045     −.0019    .0015   .0280     .7553
    15    .0756   .0072   −.0572      .0031    .0041   .0143     .7620
    20    .0387   .0052   −.0194      .0045    .0043   .0139     .7978
    25    .0211   .0002   −.0074      .0013   −.0012   .0119     .8278
    30    .0087   .0039    .0033     −.0001    .0023   .0111     .8743
    35    .0091   .0037    .0071      .0037    .0030   .0112     .9082
    40    .0063   .0048    .0095      .0028    .0043   .0116     .9199
Table 5.6a
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Non-equally Spaced Means: μ₁ = 0.05, μ₂ = 0.10, μ₃ = 0.20

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .0636   .0287   −.0305     −.0844    .0158   .0963     2.5935
    15    .0615   .0308   −.0376     −.0678    .0170   .0723     2.3186
    20    .0614   .0278   −.0376     −.0554    .0165   .0616     2.1086
    25    .0558   .0235   −.0345     −.0506    .0113   .0513     2.0008
    30    .0580   .0230   −.0376     −.0456    .0132   .0459     1.9718
    35    .0579   .0237   −.0365     −.0381    .0142   .0418     1.8193
    40    .0565   .0236   −.0380     −.0347    .0144   .0391     1.6909
Table 5.6b
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Non-equally Spaced Means: μ₁ = 0.1, μ₂ = 0.3, μ₃ = 0.6

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .1672   .0325   −.1174     −.0395    .0186   .0485     1.2090
    15    .1446   .0333   −.1081     −.0261    .0182   .0295     1.0649
    20    .1265   .0293   −.0906     −.0188    .0160   .0256     .9652
    25    .1103   .0208   −.0791     −.0163    .0093   .0191     .9274
    30    .0905   .0223   −.0641     −.0153    .0113   .0174     .7065
    35    .0836   .0193   −.0554     −.0091    .0112   .0158     .8579
    40    .0671   .0186   −.0419     −.0076    .0107   .0156     .8390
Table 5.6c
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Non-equally Spaced Means: μ₁ = 0.2, μ₂ = 0.4, μ₃ = 1.0

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .1379   .0766   −.1431     −.0366    .0404   .0238     .9315
    15    .0933   .0597   −.0978     −.0241    .0340   .0117     .8735
    20    .0546   .0446   −.0538     −.0177    .0283   .0121     .8454
    25    .0314   .0302   −.0311     −.0159    .0174   .0106     .8580
    30    .0163   .0259   −.0137     −.0149    .0179   .0104     .8732
    35    .0118   .0216   −.0044     −.0089    .0161   .0107     .8947
    40    .0066   .0194   −.0008     −.0075    .0149   .0113     .8940
Table 5.6d
Average Bias of the PTE and the Isotonic Regression
Estimators and their Asymptotic Relative Efficiency e'
Case of Non-equally Spaced Means: μ₁ = 0.5, μ₂ = 1.0, μ₃ = 2.0

          PTE                          Isotonic Regression
    n     μ₁      μ₂      μ₃          μ₁        μ₂      μ₃        e'
    10    .0349   .0187   −.0182     −.0018    .0128   .0166     .8411
    15    .0126   .0120    .0010      .0031    .0099   .0085     .9319
    20    .0077   .0086    .0082      .0045    .0073   .0110     .9615
    25    .0026   .0019    .0094      .0013    .0011   .0096     .9727
    30    .0006   .0050    .0103     −.0001    .0035   .0100     .9629
    35    .0043   .0045    .0111      .0037    .0036   .0105     .9747
    40    .0038   .0054    .0113      .0028    .0046   .0113     .9596
From Table 5.4 we see that the average bias generated by the PTE is in general
smaller than that generated by using the isotonic regression estimator, and this seems
to hold for all sample sizes.

Tables 5.5a–5.6d show that the average bias due to the use of the isotonic
regression estimators is slightly smaller than that due to the use of the PTE. This was
expected, because in the isotonic regression case we estimate ordered means assuming
that they are ordered, while in the PTE case the estimate depends on the result of the
test, and this increases the bias.

In most tables the values of e' indicate that the PTE is more efficient than the
isotonic regression estimator, which is a very important property. So although the
average bias due to the use of the isotonic regression estimator in the cases of ordered
alternatives is less than that from the PTE, the variability among the isotonic
regression estimates is much larger than that among the preliminary test estimators.

We notice that as the differences between the ordered means increase, the
asymptotic relative efficiency e' decreases slightly, giving the advantage to the isotonic
regression estimator, since it has smaller bias.
5.8. Recommendations for Future Research.

The members of the class of restricted alternatives are too numerous to consider
individually. The class represents various degrees of prior knowledge, from near-complete
ignorance about the means, i.e., not all μ_i are equal, on the one hand, to a complete
ordering of the means, i.e., μ₁ ≤ ... ≤ μ_K, on the other. In this research we considered
only the two most important members: the simple ordered and orthant alternatives. An
area of future research may include two other important members of the restricted
alternative class: the simple loop alternative and the simple tree alternative.

(i)  The simple loop order in the case K = 4 can be written as μ₁ ≤ [μ₂, μ₃] ≤ μ₄,
     which may be defined by

         μ₁ ≤ μ₂ ≤ μ₄   and   μ₁ ≤ μ₃ ≤ μ₄,

     with no ordering imposed between μ₂ and μ₃. Such alternatives may arise when
     there is an a priori ordering of the populations but there are ties in the ordering.
     The probability of this partial ordering can be expressed as the union of the two
     disjoint events μ₁ < μ₂ < μ₃ < μ₄ and μ₁ < μ₃ < μ₂ < μ₄.

(ii) The simple tree alternative is an important class of alternatives specified by a
     partial ordering of the type

         μ₁ ≤ μ_i,   i = 2, ..., K.

     The probability P(l, K), l = 1, ..., K, of such alternatives consists of all the
     partitions of K into l level sets (B₁, B₂, ..., B_l), where B₁ is composed of 1 and
     K − l members of the set {2, 3, ..., K}, and B₂, ..., B_l are, in some order, the
     remaining members of that set.
REFERENCES
1.
Ahsanullah, M. and Saleh, A.K. Md. E (1972). Estimation of Intercept in a
Linear Regression Model with one Variable After a Preliminary Test of
Significance. Rev. Inst. Internat. Statist., 40, 139-145.
2.
Ayer, M., H.D. Brunk, G.M. Ewing, W.T. Reid and E. Silverman (1955). An
Empirical Distribution Function for Sampling With Incomplete Information.
Ann. Math. Statist., e6, 641-647.
3.
Bancroft, T.A. (1944). On Biases in Estimation due to use of Preliminary Test
of Significance. Ann. Math. Statist., 15, 190-204.
4.
Barlow, R.E., D.J. Bartholomew, J.M., Bremner, and H.D. Brunk (1972). Statistical Inference Under Order Restrictions. John Wiley and Sons, New York,
New York.
5.
Bartholomew, D.J. (1959a). A Test of Homogeneity for Ordered Alternatives.
Biometrika, 46, 36-48.
6
Bartholomew, D.J. (1959b). A Test of Homogeneity for Ordered Alternatives, II.
Biometrika, 46, 328-335.
7.
Bartholomew, D.J. (1961a). A Test of Homogeneity of Means Under Restricted
Alternatives. Journal of the Royal Statistical Society, Series B, e8, 239-281.
8.
Bartholomew, D.J. (1961b). Ordered Tests in the Analysis of Variance. Biometrika, 48, 325-332.
9
Boyd, M.N. and P.K. Sen (1983). Union-Intersection Rank Tests for Ordered
Alternatives in Some Simple Linear Models. Comm. Statist. Theor. Ateth., 12
(15), 1737-1758.
10.
Boyd, M.N. and P.K. Sen (1984). Union-Intersection Rank Tests for Ordered
Alternatives in a Complete Block Design. Comm. Statist. Theor. Math., 18 (8),
285-303.
11.
Boyd, M.N. and P.K. Sen (1986). Union-Intersection Rank Tests for Ordered
Alternatives in ANOCOVA. JASA, 81, No. 394, 526-532.
<
163
12.
Brunk, H.D. (1955). Maximum Likelihood Estimates of Monotone Parameters.
Ann. Math. Statist., 26, 607-616.
13.
Brunk, H.D., G.M. Ewing and W.R. Utz (1957). Minimizing Integrals in Certain
Class of Monotone Functions. Pacif. J. Math. 7, 833-847.
14.
Brunk, H.D. (1958). On the Estimation of Parameters Restricted by Inequalities.
Ann. Math. Statist. 29, 437-454.
15.
Brunk, H.D. (1965). Conditional Expectation given a
Ann. Math. Statist. 96, 1339-1350.
16.
Brunk, H.D. (1970). Estimation of Isotonic Regression (with discussion by
Ronald Pyke). In M.L. Puri (Ed.), Nonparametric Techniques in Statistical
Inference, Cambridge University Press, 177-197.
17.
Chacko, V.J. (1963). Testing Homogeneity Against Ordered Alternatives. Ann.
Math. Statist., 94, 945-956.
18.
Chinchilli, V.M. and P.K. Sen (1981a). Multivariate Linear Rank Statistics and
the Union-Intersection Principle for Hypothesis Testing Under Restricted Alternatives. Sankya, Series B, 49, 135-151.
19.
Chinchilli, V.M. and P.K. Sen (1981b). Multivariate Linear Rank Statistics and
the Union-Intersection Principle for the Orthant Restriction Problem. Sankya,
Series B, 49, 152-171.
20.
De, N. (1975). Rank Tests for Randomized Blocks Against Ordered Alternatives. Calcutta Statistical Association Bulletin, 25, 1-27.
21.
Eeden, C. van (1956). Maximum Likelihood Estimation of Ordered Probabiities.
Proc. K. Ned. Akad. Wet. (A), 59/ Indag. math., lB, 444-455.
22.
Eeden, C. van (1957a). Maximum Likelihood Estimation of Partially or Completely Ordered Parameters. 1. Proc. K. Ned. Akad. Wet. (A), 60/ Indag. math ..
19, 128-136.
23.
Eeden, C. van (1957b). Maximum Likelihood Estimation of Partially or Completely Ordered Parameters. II. Proc. K. Ned. Akad. Wet. (A), 60 / Indag. Math.,
19, 201-211.
24.
Gupta, S.S. (1963). Probability Integrals of Multivariate Normal and Multivariate t. Ann. Math. Statist., 34, 792-838.
25.
Hajek, J. and Z. Sidak (1967). Theory of Rank Tests. Academic Press, New
York.
26.
Han, Chien-Pai and T.A. Bancroft (1968). On Pooling Means When Variance is Unknown. JASA, 63, 1333-1342.
27.
Hettmansperger, T.P. (1975). Nonparametric Inference for Ordered Alternatives
in a Randomized Block Design. Psychometrika, 40, 53-62.
28.
Hogg, R.V. (1965). On Models and Hypotheses with Restricted Alternatives.
JASA, 60, 1153-1162.
29.
Hollander, M. (1967). Rank Tests for Randomized Blocks when the Alternatives
Have a Priori Ordering. Ann. Math. Statist., 38, 867-877.
30.
Hollander, M. and D.A. Wolfe (1973). Nonparametric Statistical Methods. John
Wiley and Sons, New York, New York.
31.
Huber, P.J. (1964). Robust Estimation of a Location Parameter. Ann. Math. Statist., 35, 73-101.
32.
Huber, P.J. (1972). Robust Statistics: A Review. Ann. Math. Statist., 43,
1041-1067.
33.
Jonckheere, A.R. (1954a). A Distribution-Free k-Sample Test Against Ordered
Alternatives. Biometrika, 41, 133-145.
34.
Jonckheere, A.R. (1954b). A Test of Significance for the Relation Between m
Rankings and k Ranked Categories. The British Journal of Statistical Psychology, 7, 93-100.
35.
Jureckova, J. (1977). Asymptotic Relations of M-estimates and R-estimates in
the Linear Regression Model. Ann. Statist., 5, 464-472.
36.
Jureckova, J., and P.K. Sen (1984). On Adaptive Scale-Equivariant MEstimators in Linear Models. Stat£st£cs and Dec£s£ons, Supplemental Issue, No.
I, 31-46.
43,
37.
Koenker, R. and G. Bassett (1978). Regression Quantiles. Econometrica, 46,
33-50.
38.
Kudo, A. (1963). A Multivariate Analogue of the One-Sided Test. Biometrika,
50, 403-418.
39.
Kruskal, J.B. (1964). Nonmetric Multidimensional Scaling: A Numerical
Method. Psychometrika, 29, 115-129.
40.
Lehmann, E.L. and H.J.M. D'Abrera (1975). Nonparametrics: Statistical Methods
Based on Ranks. Holden-Day. San Francisco.
41.
Mosteller, F. (1948). On Pooling Data. JASA, 43, 231-242.
42.
Nuesch, P.E. (1966). On the Problem of Testing Location in Multivariate Populations for Restricted Alternatives. Ann. Math. Statist., 37, 113-119.
43.
Odeh, R.E. (1971). On Jonckheere's k-Sample Test Against Ordered Alternatives. Technometrics, 13, 912-918.
44.
Page, E.B. (1963). Ordered Hypotheses for Multiple Treatments: A Significance
Test for Linear Ranks. JASA, 58, 216-230.
45.
Perlman, M.D. (1969). One-Sided Testing Problems in Multivariate Analysis.
Ann. Math. Statist., 40, 549-567.
46.
Pirie, W.R. (1974). Comparing Rank Tests for Ordered Alternatives in Randomized Blocks. The Annals of Statistics, 2, 374-382.
47.
Pirie, W.R. and M. Hollander (1972). A Distribution-Free Normal Scores Test
for Ordered Alternatives in the Randomized Block Design. JASA, 67, 855-857.
48.
Puri, M.L. (1965). Some Distribution-Free k-Sample Rank Tests of Homogeneity
Against Ordered Alternatives. Communications on Pure and Applied Mathematics, 18, 51-63.
49.
Puri, M.L. and P.K. Sen (1968). On Chernoff-Savage Tests for Ordered Alternatives in Randomized Blocks. Ann. Math. Statist., 39, 967-972.
50.
Rao, C.R. (1973). Linear Statistical Inference and its Applications. John Wiley
and Sons, New York, New York .
51.
Robertson, T., and P. Waltman (1968). On Estimating Monotone Parameters.
Ann. Math. Statist., 39, 1030-1039.
52.
Roy, S.N. (1957). Some Aspects of Multivariate Analysis. John Wiley and Sons,
New York.
53.
Salama, I.A. and D. Quade (1981). Using Weighted Rankings to Test Against
Ordered Alternatives in Complete Blocks. Comm. Statist., Al0, 385-399.
54.
Saleh, A.K. and P.K. Sen (1978). Nonparametric Estimation of Location Parameter after a Preliminary Test on Regression. Ann. Statist., 6, 154-168.
55.
Sen, P.K. and A.K. Md. Ehsanes Saleh (1979). Nonparametric Estimation of
Parameter After a Preliminary Test on Regression in the Multivariate Case.
Journal of Multivariate Analysis, 9, No. 2, 322-331.
56.
Shirahata, S. (1978). An Approach to a One-sided Test in the Bivariate Normal
Distribution. Biometrika, 65, 61-67.
57.
Shorack, G.R. (1967). Testing Against Ordered Alternatives in Model I Analysis
of Variance: Normal Theory and Nonparametric. Ann. Math. Statist., 38,
1740-1752.
58.
Singer, J. and P.K. Sen (1985). M-Method in Multivariate Linear Models. J. of
Multivariate Analysis, 17, 168-184.
59.
Skillings, J.H. (1978). Adaptively Combining Independent Jonckheere Statistics
in a Randomized Block Design with Unequal Scales. Comm. Statist., A7,
1027-1039.
60.
Skillings, J.H. and D.A. Wolfe (1977). Testing for Ordered Alternatives by
Combining Independent Distribution-Free Block Statistics. Comm. Statist. A6,
1453-1463.
61.
Terpstra, T.J. (1952). The Asymptotic Normality and Consistency of Kendall's
Test Against Trend When Ties are Present in One Ranking. Proc. Sect. Sci. K.
Ned. Akad. Wet. (A), 55 / Indag. Math., 14, 327-333.
62.
Thompson, W.A., Jr. (1962). The Problem of Negative Estimates of Variance
Components. Ann. Math. Statist., 33, 273-289.
63.
Tryon, P.V. and T.P. Hettmansperger (1973). A Class of Nonparametric Tests
for Homogeneity Against Ordered Alternatives. The Annals of Statistics, 1,
1061-1070.
64.
Wallenstein, S. (1980). Distributions of Some One-Sided k-Sample Smirnov-Type Statistics. JASA, 75, 441-446.
65.
Wegman, E.J. (1980). Two Approaches to Nonparametric Regression: Splines
and Isotonic Inference. In K. Matusita (Ed.), Recent Developments in Statistical Inference and Data Analysis. North-Holland, Amsterdam.
66.
Whitney, D.R. (1951). A Bivariate Extension of the U Statistic. Ann. Math.
Statist., 22, 274-282.
67.
Wright, I.W. and E.J. Wegman (1980). Isotonic, Convex and Related Splines. Ann.
Statist., 8, No. 5, 1023-1035.