Hoberman, David (1986). "A Preliminary Test Estimator (PTE) for P(Y>X) Conditional on a Rank Test of Proportional Hazards in the Uncensored Two-Sample Problem."

A PRELIMINARY TEST ESTIMATOR (PTE) FOR P(Y>X)
CONDITIONAL ON A RANK TEST OF PROPORTIONAL
HAZARDS IN THE UNCENSORED TWO-SAMPLE
PROBLEM

David Hoberman

Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1805T
September 1986

A PRELIMINARY TEST ESTIMATOR (PTE) FOR P(Y>X) CONDITIONAL
ON A RANK TEST OF PROPORTIONAL HAZARDS IN THE UNCENSORED
TWO-SAMPLE PROBLEM
by
David Hoberman
A Dissertation submitted to the faculty of the University of
North Carolina at Chapel Hill in partial fulfillment of the
requirements for the degree of Doctor of Philosophy in the
Department of Biostatistics in the School of Public Health.
Chapel Hill
1986
ABSTRACT
DAVID HOBERMAN.
A Preliminary Test Estimator (PTE) for P(Y>X) Conditional on a Rank Test of Proportional Hazards in the Uncensored Two-Sample Problem

The probability that a random variable in one group is greater than that in another group is a parameter of natural interest in survival analysis. Two choices for estimators are the U-statistic (Wilcoxon statistic), which is unbiased regardless of the relationship between the two distribution functions, and the maximum likelihood estimator derived from the probability of the joint rank vector. The latter has a simple form when proportional hazards holds and is a function of the unknown proportionality constant. Since the rank maximum likelihood estimator has less variance than the U-statistic under proportional hazards, we propose an adaptive rank test to detect departures from proportional hazards so that a choice of the two estimators can be made using the data itself.

The proposed test is a linear rank statistic whose null distribution is asymptotically normal. Various score functions are proposed and their consistency for selected alternatives to proportional hazards is examined. In addition we derive non-centrality parameters under selected local alternatives.

Finally, the properties of the preliminary test estimator are computed using numerical integration. As in previous literature, we use bias and mean square error as criteria for the performance of the PTE under local alternatives. We find that the PTE's performance depends mostly upon the correlations of the test statistic with the two estimators. The non-centrality parameter seems to have less influence.

ACKNOWLEDGEMENTS

I would first like to thank my dissertation advisor, P.K. Sen, for our long discussions, his guidance and extraordinary patience. Also, I would like to thank the other members of my committee: Lawrence Kupper, C.E. Davis, Kant Bangdiwala and Carl Shy.

On the less technical side, I want to express my appreciation to Ms. Jean Coble for her effective moral support. Also, I would like to thank Ms. Betty Owens for typing the manuscript under uncommon duress.

This research was supported by a grant from the National Institute for Environmental Health Sciences.

TABLE OF CONTENTS

Acknowledgements

CHAPTER I    THE PROBLEM
    Introduction
    Literature Review
    A U-Statistic as Estimator of R = P(X < Y)
    R Under PH
    Tests for PH
    Preliminary Test Inference

CHAPTER II   THE PROPERTIES OF TWO ESTIMATORS FOR R = P(X < Y)
    Introduction
    Rank Likelihood Under PH
    Rank MLE Under $H_0$
    U-Statistic as Estimator of R = P(X < Y)
    The Estimators Under Non-Proportional Hazards
    Comparison of Rank MLE with U-Statistics

CHAPTER III  THE TEST STATISTIC AND PTE UNDER PROPORTIONAL HAZARDS
    Null and Alternative Hypotheses
    The Inappropriateness of the Neyman $C(\alpha)$ Test
    The Chernoff-Savage Theorem
    Adjustment in $T_N$ for Estimation of k
    Asymptotic Distribution of $\sqrt{N}(T_N(\hat{k}_{NR}) - \mu)$
    Consistency
    Scores for the Linear Rank Test
    Pseudo-Efficient Score for a Two-Parameter Sequence of Local Alternatives
    Computation of the Scores
    Preliminary Test Estimator Under $H_0$
    Asymptotic Bias of PTE Under $H_0$
    Asymptotic Variance of PTE Under $H_0$

CHAPTER IV   THE PTE UNDER LOCAL ALTERNATIVES
    Introduction
    The Rank MLE Under $H_{NA}$
    Non-Centrality Parameter for $\sqrt{N}(\hat{R}_{NR} - R)$
    Non-Centrality Parameter for $\sqrt{N}\,T_N(\hat{k}_{NR})$ Under $H_{NA}$
    Non-Centrality Parameter for $\sqrt{N}(\hat{R}_W - R)$
    Asymptotic Bias for PTE Under $H_{NA}$
    Variance of PTE Under $H_{NA}$

CHAPTER V    NUMERICAL RESULTS
    Introduction
    The PTE Under Local Alternative
    Non-Centrality Parameter
    PTE Bias and MSE Ratio

CHAPTER VI   TOPICS FOR RESEARCH

BIBLIOGRAPHY
TABLES
APPENDIX A
APPENDIX B
APPENDIX C

CHAPTER I
An important problem in biostatistics is the comparison of two populations when the outcome of interest is time to event. Survival analysis, for example, often attempts to compare the effects of two different "treatments" by comparing the sample survivorship distributions of two independent groups. Usually, the investigator formulates the null hypothesis that the two distributions are identical versus the alternative hypothesis that one of the groups' "survival curves" is "above" the other, indicating better survival experience in the former group. Since this view of the problem is equivalent to comparing the empirical distribution functions of the two groups, one can alternatively pose the same null hypothesis and restate the alternative hypothesis to say that the time to event random variable in one population is stochastically larger than that in the other.
Finally, by forming the convolution of one distribution function with the survival distribution of the other, one can derive the parameter: the probability that the time to event in one population is larger than that in the other; i.e.,
\[ P(X < Y) = \int_{-\infty}^{\infty} G \, dF = \int_{-\infty}^{\infty} (1 - F) \, dG \]
when X has c.d.f. G and Y has c.d.f. F. Therefore, the estimation of P(X < Y) can be an important statistical problem. However, as in many problems, the imposition of some mathematical relationship between the two populations will suggest the use of one estimator over another. The structure in this paper will be proportional hazards (PH), so that $[1-G(y)]^k = 1-F(y)$, $k > 0$. One can easily show that P(X < Y) = 1/(k+1), and so the problem reduces to that of estimating k, the unknown constant of proportionality under PH. When PH is not the case, it is clear that another estimator may be more appropriate, one that is optimal in some sense, but that does not depend upon any relation between F and G.
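The identity P(X < Y) = 1/(k+1) under PH can be checked numerically. The sketch below is only an illustration: it assumes X is standard exponential (so that $1-F(x) = e^{-kx}$ under PH), and the function name, integration limit and step count are hypothetical choices, not part of the original text.

```python
import math

def prob_x_less_y_under_ph(k, upper=60.0, steps=200_000):
    """Approximate P(X < Y) = integral of (1 - F) dG by the midpoint rule.

    Assumption for illustration: X ~ exp(1), so G(x) = 1 - exp(-x), and PH
    gives 1 - F(x) = [1 - G(x)]**k = exp(-k * x).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        survival_y = math.exp(-k * x)   # 1 - F(x)
        density_x = math.exp(-x)        # dG(x)/dx
        total += survival_y * density_x * h
    return total

# Compare the numerical integral with the closed form 1 / (k + 1).
for k in (0.5, 1.0, 3.0):
    print(k, prob_x_less_y_under_ph(k), 1.0 / (k + 1.0))
```

Because P(X < Y) is invariant under monotone transformations of the time scale, the exponential choice for G loses no generality for this particular check.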
These considerations lead to the notion of Preliminary Test Inference (PTI). In general, PTI is a procedure which uses a (preliminary) test of some hypothesis; and conditionally upon the result of that test, the investigator chooses among available procedures to answer the primary question of interest. In the present context, we may have several estimators for P(X < Y) and must choose among them conditionally on a test for PH. Since the procedure is inferential, a loss is incurred depending upon the Type I and Type II error of the test.
In order to make these ideas concrete, we introduce the notation of parameter spaces:
\[ \Omega = \{F, G : F \text{ and } G \text{ arbitrary}\} \]
\[ \Omega_{PH} = \{F, G : [1-G]^k = 1-F, \; k > 0\} \]
\[ \Omega^* = \{F^*, G^* : G^* \sim \exp(1), \; F^* \text{ arbitrary}\} \]
\[ \Omega^*_{PH} = \{F^*, G^* : G^* \sim \exp(1), \; F^* \sim \exp(k), \; k > 0\} \]
It will be shown in Chapter II that if F and G are known and PH holds, then the transformed variables are independent exponentials, so that $(F^*, G^*) \in \Omega^*_{PH}$.
We would like to take advantage of the transformation in order to reduce the problem from one which discriminates between PH and non-PH to one which discriminates between jointly independent exponentials and, alternatively, at least one distribution not being exponential. Therefore, we want the estimation and testing procedures to be invariant under the monotone transformation. Invariance is the optimal condition for estimation and testing in this problem because (1) $P(X^* < Y^*) = P(X < Y)$ and (2) F and G are never known. That is, if F or G were known, we could actually apply the known transformation to the data of the other group and reduce the original two-sample problem to the one-sample problem of testing for exponentiality of the transformed data (see Chapter II). Invariance makes knowledge of the actual transformation unnecessary. Summarizing, we can say that the estimators of P(X < Y) will be invariant under monotone transformations and the testing procedure will also be invariant under a monotone transformation, resulting in the induced map
\[ [H_0^* : F^*, G^* \in \Omega^*_{PH}]. \]
The invariant test of choice will be a rank test for three reasons: (1) the rank vector of the joint sample of X's and Y's is invariant under all monotone transformations, (2) the rank vector is the maximal invariant statistic. That is, the rank vector is that statistic which extracts the maximum information about the distribution of X's and Y's, and yet is independent of the actual order statistics and the parameter(s) of the distribution. Consequently, if $H_0$ is accepted, the estimator of choice would be a rank MLE computed under $H_0$, and (3) we want a test which makes no assumption about distribution functions of the two populations.
If $H_0$ is rejected we use the modified form of the two-sample Mann-Whitney U-statistic
\[ U = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \psi(Y_i - X_j), \]
where
\[ \psi(Y_i - X_j) = \begin{cases} 1 & \text{if } Y_i > X_j \\ 0 & \text{otherwise} \end{cases} \]
is the kernel of P(X < Y). Detailed justification for these two estimators is deferred to Chapters II and III.
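The modified Mann-Whitney statistic is simply the proportion of (Y, X) pairs in which the Y observation exceeds the X observation. A minimal sketch follows; the function name `mann_whitney_r` is a hypothetical label, and continuous data (no ties) are assumed.

```python
def mann_whitney_r(x, y):
    """Proportion of (Y_i, X_j) pairs with Y_i > X_j.

    Estimates P(X < Y) without any assumption on the two distributions.
    Ties are counted as 0, which is harmless for continuous data.
    """
    count = sum(1 for y_i in y for x_j in x if y_i > x_j)
    return count / (len(x) * len(y))

# Two of the six pairs have Y_i > X_j: (2.5, 1.0) and (2.5, 2.0).
print(mann_whitney_r([1.0, 2.0, 3.0], [2.5, 0.5]))
```

The double loop costs on the order of mn comparisons; for large samples the same value can be obtained from the ranks of the combined sample.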
Finally, since there is evidently no single UMP invariant test for PH, we may partition the space of alternative hypotheses into local and global alternatives. Thus the preliminary test may be optimal (locally MP) for local alternatives while consistent for a broad class of global alternatives.
LITERATURE REVIEW
A. R = P(X < Y) when $X \sim \exp(\lambda)$, $Y \sim \exp(\mu)$
Generally speaking, properties of various estimators of R have been extensively studied only in the past ten to fifteen years. Since the estimation of R under PH is closely related to the problem of estimating R when both distributions are exponential, it is worth reviewing the most relevant literature.
Tong (1974), with a correction by Johnson (1975), derived a closed expression for the UMVUE of R when neither $\lambda$ nor $\mu$ is known. Since it is very cumbersome to use in practice, one is led to consider the MLE, $\bar{y}/(\bar{x} + \bar{y})$, as a suitable alternative. Kelly et al. (1976) assumed $\mu$ known and computed, using numerical methods, the bias and mean square efficiency of the MLE relative to the UMVUE for different sample sizes and different values of R. They found that the UMVUE is superior to the MLE for $R \approx 1$, but for R < .5, "the advantage is seen to be with the MLE even though it is biased." Even for $R \approx 1$, "for large n, the MLE might be preferred since the computational difficulty probably outweighs the slightly greater efficiency." Sathe and Shah (1981) derived bounds for the mean and mean square error of the MLE and the UMVUE under the two conditions, $\mu$ known and $\mu$ unknown. "In the case when $\mu$ is unknown, bounds for the first two moments of the MLE are obtained and also a lower bound based on the Bhattacharyya bound is obtained for the variance of the UMVUE." The authors then computed the bias of the MLE and the upper bound for the ratio of the M.S.E. of the MLE to the variance of the UMVUE for equal sample sizes n, ranging from 5 to 100, and for $\lambda/\mu$ ranging from 0.01 to 0.9. For n = 5, the largest exact absolute bias was $1.9 \times 10^{-2}$; for n = 10, the greatest absolute bound was $0.9 \times 10^{-2}$; for n = 25, the greatest absolute bound was $0.4 \times 10^{-2}$; and for n = 50, the greatest absolute bound was $0.2 \times 10^{-2}$. In the case of the ratio of the MSE of the MLE to the variance of the UMVUE, the ratio is less than or equal to one in every case except $\lambda/\mu$ = .01, 1.0 and n = 5, 10. We conclude that based upon the work of Kelly et al. (1976) and Sathe and Shah (1981), the computational difficulties and complicated expressions associated with the UMVUE militate against its use in the light of the very good performance of the MLE over a relatively broad range of ratios of scale parameters and realistic sample sizes.
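For reference, the exponential MLE discussed above is a one-line computation. The sketch assumes that $\lambda$ and $\mu$ are hazard rates, so that $P(X < Y) = \lambda/(\lambda + \mu)$ and the plug-in estimates $1/\bar{x}$ and $1/\bar{y}$ give $\bar{y}/(\bar{x} + \bar{y})$; the function name is hypothetical.

```python
def exponential_mle_r(x, y):
    """MLE of R = P(X < Y) for independent exponential samples.

    Assumes rate parametrisation: R = lambda / (lambda + mu), with
    lambda_hat = 1 / xbar and mu_hat = 1 / ybar, so R_hat = ybar / (xbar + ybar).
    """
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    return ybar / (xbar + ybar)

# xbar = 2 and ybar = 6, so the estimate is 6 / 8.
print(exponential_mle_r([1.0, 3.0], [4.0, 8.0]))
```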
B. A U-Statistic as Estimator of R = P(X < Y)
Since there is no UMVUE of R for all specified F and G, we are led to consider the modified Mann-Whitney U-statistic cited in the introduction when we reject the null hypothesis of PH.
\[ \hat{R} = \frac{1}{mn} \sum_{j=1}^{n} \sum_{i=1}^{m} \psi(Y_i - X_j) \]
is unbiased (being a U-statistic) and, when F = G, has variance (m+n+1)/12mn. Lehmann (1950) showed that $\hat{R}$ is the UMVUE for arbitrary continuous F and G and later that $\hat{R} - R$, suitably normalized, is asymptotically normal for any F, G under the assumption that m = cn, $n \to \infty$, and R is not too near zero or one (Lehmann (1951)). Thus, if either of the conditions is not satisfied, other estimators, if available, should be used. Van Dantzig (1951) derived a sharp upper bound for $\mathrm{Var}(\hat{R})$: $R(1-R)/\min(m,n)$. Besides demonstrating the consistency of $\hat{R}$ for P(X < Y) through Chebyshev's inequality, this upper bound is useful for constructing large-sample confidence intervals for P(X < Y) (Govindarajulu (1968)). In the same paper, Govindarajulu further demonstrated the asymptotic normality for $\hat{R}$ when normalized by r = min(m,n) for any continuous F and G without the assumption that m = cn, $n \to \infty$, and derived a distribution free, unbiased and consistent estimator of $\mathrm{Var}(\hat{R})$.
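One simple way to see how Van Dantzig's bound yields a distribution-free interval is through Chebyshev's inequality. Since $R(1-R) \le 1/4$, the bound gives $\Pr(|\hat{R} - R| \ge \epsilon) \le 1/(4 \min(m,n)\, \epsilon^2)$. The sketch below is an illustration under that reasoning only; it is not claimed to be the construction used in any of the cited papers, and the function name is hypothetical.

```python
import math

def chebyshev_interval(r_hat, m, n, alpha=0.05):
    """Conservative interval for R from Chebyshev's inequality.

    Uses Var(R_hat) <= R(1 - R) / min(m, n) <= 1 / (4 * min(m, n)),
    so P(|R_hat - R| >= eps) <= 1 / (4 * min(m, n) * eps**2).
    Setting the right-hand side equal to alpha gives eps.
    """
    r = min(m, n)
    eps = 1.0 / math.sqrt(4.0 * r * alpha)
    return max(0.0, r_hat - eps), min(1.0, r_hat + eps)

# With min(m, n) = 100 and alpha = 0.05 the half-width is 1 / sqrt(20).
print(chebyshev_interval(0.6, m=100, n=150))
```

The resulting interval is wide, which is consistent with the remark that intervals of this kind are conservative.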
The first attempt to construct confidence intervals for R = P(X < Y) which did not rely on large sample normality was that of Birnbaum (1956). For F and G unknown, using the convolution of the two empirical distribution functions (which equals $\hat{R}$) and a set inclusion argument, he obtained a lower bound for $\Pr(\hat{R} - R < \epsilon)$. Noting that the inequality
\[ \hat{R} - R \le \sup_s \{F_m(s) - F(s)\} + \sup_s \{G(s) - G_n(s)\} = D_m^+ + D_n^+ \]
is crude, he acknowledged the large sample sizes required to achieve small confidence intervals with high confidence coefficients. Thus for small sample sizes, the confidence intervals will be large.
Birnbaum and McCarty (1958) updated the procedure by obtaining the convolution of $D_m^+$ and $D_n^+$. They note that this procedure is still conservative. Govindarajulu (1968) used Van Dantzig's upper bound and the asymptotic normality of $\hat{R}$ to improve upon the confidence intervals of Birnbaum. Ury (1972) used Van Dantzig's upper bound and Chebyshev's inequality to improve upon Birnbaum's intervals in the case where m = n.
C. R under PH
We have noted in the introduction that, if PH holds, then P(X < Y) = 1/(1+k) where k is the (unknown) constant of proportionality. There is little literature on the estimation of R under PH. [illegible] (1982) has derived a "rather sharp upper bound for the variance of the Mann-Whitney estimator $\hat{R}$." However, we would anticipate a better estimator which used the actual probability of the joint rank vector. [illegible] Zimmer and Williams (1979) have derived a non-parametric estimator of k under PH. They first show that all the information on k is contained in ranks of one of the samples in the joint rank vector and also that no unbiased estimator of k exists. They then provide a closed form expression for $\hat{k}$ and a simple expression for its bias. However, the estimator does not use all the information in the joint rank vector if the sample sizes are unequal and the production of $\hat{k}$ requires roughly $2^n$ calculations, where n is one of the sample sizes. Thus they resort to a randomization procedure which produces an estimate which converges strongly to k as the number of randomizations is increased. For further details on the estimation of an acceleration parameter, see McRae (1971) and Stech et al. (1974).
D. Tests for PH
For completeness, we recall that if F, say, were known, the problem of testing for proportional hazards would reduce to testing the exponentiality of the other sample's transformed empirical distribution. This is the one-sample problem and has an extensive background in reliability theory. Epstein (1960) published twelve tests for exponentiality, several of which were tailored to detect specific departures from exponentiality. Doksum and Yandell (1984) have written a review paper and bibliography of the one-sample case. They concentrate on tests based upon spacings $D_i$, where
\[ D_i = (n+1-i)(T_{(i)} - T_{(i-1)}) \]
and $T_{(1)} < T_{(2)} < \dots < T_{(n)}$ are the order statistics from a sample of size n. Schoenfeld (1980) constructed omnibus goodness-of-fit tests using a sub-division of the time axis and then calculating observed vs expected number of events in each interval. Anderson (1982) proposed a test to see whether proportional hazards holds when the (k+1)st covariate is added to a model which already included k covariates in the exponential part of the Cox hazard function. By approximating the underlying hazard rate $\lambda(t)$ with a step function on pre-specified intervals of the time axis, Anderson constructs a statistic which is chi-square under the proportional hazards hypothesis. Nagelkerke et al. (1984) proposed a test based on the autocovariance of the successive contributions to the derivative of the log likelihood.
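The normalized spacings $D_i = (n+1-i)(T_{(i)} - T_{(i-1)})$ used in the one-sample tests are straightforward to compute. The sketch below assumes the convention $T_{(0)} = 0$ (a common one for lifetimes, but an assumption here), and the function name is hypothetical.

```python
def normalized_spacings(times):
    """Return D_i = (n + 1 - i) * (T_(i) - T_(i-1)) for i = 1..n.

    Assumes T_(0) = 0, i.e. observation starts at time zero.
    """
    ordered = sorted(times)
    n = len(ordered)
    previous = 0.0
    spacings = []
    for i, t in enumerate(ordered, start=1):
        spacings.append((n + 1 - i) * (t - previous))
        previous = t
    return spacings

# Order statistics 1, 2, 3 give spacings 3*(1-0), 2*(2-1), 1*(3-2).
print(normalized_spacings([1.0, 3.0, 2.0]))
```

Under exponentiality the $D_i$ are independent and identically distributed exponential variables, and their sum equals the total of the observed times, which is why tests of exponentiality are often built on them.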
E. Preliminary Test Inference
Often in statistical analysis, the investigator uses tests of significance in order to infer salient features of the statistical structure of the data. Conditional upon these tests, he or she proceeds with either a testing or estimation procedure, or both. This use of statistical inference before proceeding to the answer to the primary question of interest is called Preliminary Test Inference and has been rigorously studied only since the 1940's. The first published formal investigation was that of Bancroft (1944) in which he studied two specific problems:
(I) Testing for the homogeneity of two independent estimates of $\sigma_1^2$ and $\sigma_2^2$, and conditional on that test, estimating $\sigma_1^2$, where $\sigma_1^2 \le \sigma_2^2$.
(II) Given a regression model with $E(Y) = b_1 X_1 + b_2 X_2$, use a test of significance for $H_0: b_2 = 0$ in order to choose between the two parameter model and $E(Y) = b_1 X_1$.
In both cases the bias of the preliminary test estimator is computed under three parameters: the sample sizes, a measure of the underlying departure from the preliminary test null hypothesis and different critical values for rejection of the null hypothesis. The distinguishing feature of these investigations was that exact results were obtainable by analytic methods. The strategy was the following: find sufficient statistics for the test statistics and the quantity to be estimated. If need be, transform the problem so that some functions of the test statistic and the quantity to be estimated are independent. Form the joint density function and then use a one-to-one transformation in order to derive the exact joint distribution of the test statistic and quantity to be estimated. In both problems studied by Bancroft, closed form solutions to the bias were derived in terms of the incomplete beta function.
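The logic of a preliminary test estimator can be illustrated with the variance-pooling problem (I). The sketch below is written in the spirit of that problem and is not claimed to reproduce Bancroft's exact formulation: the function name, the use of the ratio $s_2^2/s_1^2$ as the test statistic, and the caller-supplied `critical_value` (for example an upper F quantile) are all assumptions made for illustration.

```python
def pooled_variance_pte(s1_sq, n1, s2_sq, n2, critical_value):
    """Preliminary test estimator of sigma_1^2 (illustrative sketch).

    s1_sq, s2_sq: independent variance estimates with n1, n2 degrees of freedom.
    critical_value: hypothetical cutoff for the ratio s2_sq / s1_sq,
    supplied by the caller (e.g. an upper quantile of an F distribution).
    """
    ratio = s2_sq / s1_sq
    if ratio >= critical_value:
        # Homogeneity rejected: use the first estimate alone.
        return s1_sq
    # Homogeneity accepted: pool the two estimates, weighting by degrees of freedom.
    return (n1 * s1_sq + n2 * s2_sq) / (n1 + n2)

print(pooled_variance_pte(2.0, 10, 3.0, 10, critical_value=2.5))  # pools
print(pooled_variance_pte(2.0, 10, 3.0, 10, critical_value=1.2))  # does not pool
```

The estimator is a mixture of two estimators selected by the data, which is why its bias and mean square error depend on the sample sizes, the true departure from the null hypothesis and the critical value.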
Subsequent studies have included Mosteller (1948) in which he treated the problem of pooling means from two independent populations on the basis of unequal weights. The regression problem has been studied extensively by Ashar (1970), Larson and Bancroft (1963a, 1963b), Chipman and Rao (1964), Bancroft (1964). Recently, pooling of data in contexts other than regression has been studied by Huntsberger (1955). Bancroft and Han (1980) have provided an overall view and bibliography of preliminary testing in ANOVA models. Of major concern in any preliminary test estimator scheme is the control of Type I error. The authors point out that "such investigations should include respective recommendations regarding the significance level of the preliminary test based on an acceptable criterion as regards the final inference." In their own investigation of a random effect ANOVA model, they found that conducting tests of homogeneity of variance at the $\alpha = .05$ level is unacceptable since it leads to pooling when not warranted and thus raises Type I error for the test of homogeneity of group means ("Testitesting"). Thus they suggest increasing the size of the preliminary test of $H_0$. The authors also point out that use of mean square error as a criterion for the performance of estimators is not always feasible. Instead, they focus on a method for maximizing the relative efficiency of the preliminary test estimator relative to a competitor. Saleh and Sen provide an introduction to non-parametric preliminary test inference for a vector of parameters and simple univariate and multivariate regression models. Summarizing, it can be stated that the use of preliminary test estimation has been largely restricted to the practical data analysis situation of pooling data, either in the service of estimating a population parameter, or model building in the context of the general linear model. In addition, the effect of a preliminary test of a general linear hypothesis in order to decide on the validity of restricted estimators has been studied recently by Brooks (1976), Bock, Yancey and Judge (1973), and Hill, Judge and Fomby (1978). The typical measures of loss under a specific estimator are bias and mean square error.
In Chapter II of this study, we concentrate upon the properties of the two estimators of P(Y>X): the rank MLE and the U-statistic. The rank MLE is shown to be the unique solution to the MLE equation and the asymptotic variance of $\sqrt{N}(\hat{k}_{NR} - k)$ is derived. Similarly the asymptotic variance of $\sqrt{N}(\hat{R}_W - R)$, the relevant statistic based upon the Wilcoxon U-statistic, is derived. Finally it is shown that when proportional hazards holds then the rank MLE has less variance than the U-statistic uniformly in k, at least for 1 < k < 3.
Chapter III presents the theoretical framework for the test statistic for proportional hazards under the null hypothesis. The key to the derivation of the probability distribution of the test statistic under the null hypothesis is the Chernoff-Savage theorem. The asymptotic normality of the test statistic is shown using the Chernoff-Savage theorem for the chosen scores and by adapting a Chernoff-Savage approach to formulating an expansion of the rank MLE $\sqrt{N}(\hat{k}_{NR} - k)$. This expansion is then used to compute required covariances which are needed to derive the asymptotic variance of the final test statistic under $H_0$. The resulting statistic is an "adaptive" statistic in the sense that $\hat{k}_{NR}$ replaces k in the chosen score. It is argued that trying to construct a statistic along the lines of a Neyman $C(\alpha)$ statistic is not really relevant and is inordinately complicated. In other words, the notion of an asymptotically locally most powerful test is not appropriate to this problem.
Chapter III also addresses the question of consistency of the statistic. Consistency properties obviously depend upon the estimate of k through the specified alternative. Essentially two classes of scores are proposed: 1) a class loosely based upon a locally most powerful score statistic and 2) the Savage score and Wilcoxon score. It is easier to demonstrate consistency for the latter scores than the former.
Finally, the properties of the preliminary test estimator for R = P(Y>X) are derived. Among other things, this work requires the expansion of the U-statistic $\sqrt{N}(\hat{R}_W - R)$ using the generalized U-statistic theorem and then expressions for its covariance with the test statistic.
Chapter IV presents theoretical results deriving the properties of the PTE under local alternatives to proportional hazards. The main task for this work is the derivation of the non-centrality parameter of the test statistic under the prescribed local alternative. The complicated form of the statistic due to the estimation of a nuisance parameter argues for the use of LeCam's Third Lemma rather than the classical (Taylor series) approach used for the computation of a Pitman efficiency.
Chapter V presents a discussion of the performance of the PTE using numerical analysis for the previously derived expressions. Results are presented for the cases under which $H_0$ is true and when local alternatives exist in the direction of a linearly increasing hazard ratio and a hazard ratio which increases more slowly in the Makeham direction. All tests are one-sided with Type I error levels of .05, .10 and .20. Performance criteria are PTE bias and mean squared error.
Chapter VI then presents topics for further research.
CHAPTER II
INTRODUCTION
Let us imagine two independent groups which are homogeneous with respect to all variables related to survival except for the "treatment" of interest. In classical problems the question of interest is usually whether there is a "treatment difference" between the two groups, i.e., whether they have differential survival experience. It is reasonable to suggest at least two other questions relating to the groups' relative survivorship: a) given that we already know that one group will be different, we might be interested in some measure of the difference. b) Regardless of our prior knowledge, which estimators of a relevant parameter should one use? In both cases the parameter of interest in this study is R = P(Y > X). In Chapter I, two choices were mentioned: the Wilcoxon estimator, which is based solely on the rank vector itself, and the estimator derived from the probability of the rank vector under the proportional hazards assumption. The following lemma gives a connection between the hazard ratio and the information contained in the rank vector.
Lemma: Under any monotone transformation of the random variables of two groups, the hazard ratio profile is the same over the real line. That is, even though the time scale is not invariant, the relative hazard ratio profile is invariant.
Proof: Let $H(X_j) = X_j^*$, where H is continuous, monotone and at least once differentiable. Similarly, let $H(Y_i) = Y_i^*$. Then
\[ F(x) = \Pr(Y \le x) = \Pr(H(Y) \le H(x)) = \Pr(Y^* \le H(x)) = F^*(H(x)), \]
thus
\[ f(x) = f^*(H(x)) \, H'(x). \]
Similarly
\[ G(x) = G^*(H(x)), \qquad g(x) = g^*(H(x)) \, H'(x). \]
Thus,
\[ \frac{f(x)/[1-F(x)]}{g(x)/[1-G(x)]} = \frac{f^*(H(x)) H'(x) / [1-F^*(H(x))]}{g^*(H(x)) H'(x) / [1-G^*(H(x))]} = \frac{f^*(x^*)/[1-F^*(x^*)]}{g^*(x^*)/[1-G^*(x^*)]}, \qquad (1) \]
where $x^* = H(x)$.
QED.
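The lemma can be illustrated numerically. The sketch below assumes, purely for illustration, exponential variables with hazard ratio k = 2.5 and the monotone transformation $H(t) = t^2$ on $t > 0$; the helper `hazard` and the lambda names are hypothetical. The hazard ratio computed at $x$ on the original scale and at $H(x)$ on the transformed scale should agree.

```python
import math

def hazard(survival, t, h=1e-6):
    """Numerical hazard: -d/dt log S(t), by a central difference."""
    return -(math.log(survival(t + h)) - math.log(survival(t - h))) / (2.0 * h)

k = 2.5
# Original scale: X ~ exp(1), Y ~ exp(k), so the hazard ratio is k everywhere.
surv_x = lambda t: math.exp(-t)
surv_y = lambda t: math.exp(-k * t)
# Transformed scale: H(t) = t**2 on t > 0, so P(H(X) > s) = P(X > sqrt(s)).
surv_x_star = lambda s: math.exp(-math.sqrt(s))
surv_y_star = lambda s: math.exp(-k * math.sqrt(s))

for t in (0.3, 1.0, 2.0):
    original = hazard(surv_y, t) / hazard(surv_x, t)
    transformed = hazard(surv_y_star, t ** 2) / hazard(surv_x_star, t ** 2)
    print(t, original, transformed)
```

Each individual hazard changes under the transformation (the transformed variables are Weibull rather than exponential), but the ratio does not.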
So far we have shown that the rank vector provides information which leads to a choice of two rank estimators of P(Y>X) and a relationship which implies that if the hazard ratio is constant over time in the original random variable space X, Y, then the hazard ratio is constant in the transformed space of $X^*$, $Y^*$. What if the foregoing condition were true? Then if a rank estimator were derived under the constant hazard ratio assumption, it would likely be "better" in some way than an estimator derived under more general conditions. On the other hand, if proportional hazards were not true, the latter estimator might be better.
Since the state of nature is unknown, we are forced to make a statistical test to see whether the proportional hazards assumption is a reasonable one. If we accept the null hypothesis, then we choose the estimator based upon the probability of the ranks, the RMLE $\hat{R}_{NR}$. If we reject, then we choose the Wilcoxon estimator $\hat{R}_W$. We do this anticipating that, on the average, it is better to use a statistical test to choose one or the other, rather than using one exclusively and thus risking losing information that can increase MSE either through increase in variance of an unbiased estimator or an increase in bias of a biased estimator.
In this Chapter we make use of the lemma above to derive the probability of the rank vector under proportional hazards. Then we study the asymptotic properties of $\sqrt{N}(\hat{R}_{NR} - R)$ and $\sqrt{N}(\hat{R}_W - R)$ and determine how much information is gained by using $\hat{R}_{NR}$ if it is known that proportional hazards is true. We then discuss what can go wrong if $\hat{R}_{NR}$ is used when proportionality does not hold.
RANK LIKELIHOOD UNDER PROPORTIONAL HAZARDS
Referring to equation (1) we see that if $g^*(x^*)/[1-G^*(x^*)]$ were equal to one, then possible hazard ratios in the original space can be "indexed" by various $F^*$'s in the transformed spaces. It follows immediately from the lemma that if the hazard ratio in the original space is k, then if $X_j \to X_j^* \sim \exp(1)$, $Y_i \to Y_i^* \sim \exp(k)$. In fact, the transformation $X_j^* = -\log[1-G(X_j)]$ gives
\[ \Pr(X_j^* < x^*) = \Pr[-\log(1-U_j) < x^*] = 1 - e^{-x^*}, \]
since $G(X_j) = U_j \sim U(0,1)$, so that
\[ G^*(x) = 1 - e^{-x}, \qquad F^*(y) = 1 - e^{-ky}. \]
Now let $X_j \to X_j^*$ and $Y_i \to Y_i^*$. Further let
\[ Z_{Ni} = \begin{cases} 1 & \text{if the } i\text{th order statistic is from the Y-sample} \\ 0 & \text{otherwise.} \end{cases} \]
Also let $Z^*_{Ni} = Z_{Ni} + \dots + Z_{NN}$, $1 \le i \le N$, where $N = n_1 + n_2$. Thus $Z^*_{Ni}$ is the number of Y-sample observations equal to or greater than the ith order statistic of the combined sample of N observations. If $R_N$ is the rank vector and $Z_N$ is the z-vector just defined, then they contain exactly the same information and thus we can write:
\[ P(R_N) = P(Z_N) = \frac{n_1! \, n_2! \, k^{n_1}}{\prod_{i=1}^{N} \{(N-i+1) + (k-1) Z^*_{Ni}\}}. \]
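The rank probability under PH has the form $n_1!\,n_2!\,k^{n_1} / \prod_{i=1}^{N}\{(N-i+1) + (k-1)Z^*_{Ni}\}$, where $Z^*_{Ni}$ counts the Y-sample observations at or above the ith order statistic. As a sanity check, the sketch below evaluates this expression for every arrangement of a small combined sample and confirms that the probabilities sum to one. The function names are hypothetical, and the vector `z` is assumed to be ordered from smallest to largest observation with 1 marking a Y-sample member.

```python
from itertools import combinations
from math import factorial

def rank_probability(z, k):
    """P(Z = z) under PH for a 0/1 vector z (1 = Y-sample), ordered by rank."""
    n_total = len(z)
    n1 = sum(z)
    n2 = n_total - n1
    prob = factorial(n1) * factorial(n2) * k ** n1
    y_at_risk = n1  # Z*_{N,i}: number of Y's at or above the i-th order statistic
    for i, z_i in enumerate(z, start=1):
        prob /= (n_total - i + 1) + (k - 1.0) * y_at_risk
        y_at_risk -= z_i
    return prob

def all_arrangements(n1, n2):
    """Yield every 0/1 vector of length n1 + n2 containing exactly n1 ones."""
    n_total = n1 + n2
    for y_positions in combinations(range(n_total), n1):
        yield tuple(1 if i in y_positions else 0 for i in range(n_total))

total = sum(rank_probability(z, 2.0) for z in all_arrangements(2, 3))
print(total)
```

With one observation per sample, `rank_probability((0, 1), k)` reduces to 1/(k+1), which is P(X < Y) under PH.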
lWUt MLE UNDE1l B
log P <.!l)
(1)
=
N
D
t log
...,1
1 log k -
a log peR)
...
D
1
:=
k
ak
.(2) k {.a.. (log P (R»
{ak
""
0
-
I
1-1
(N-R,+l) + (k-l) zaR. t,)
zNl*.
(N-1+l. ~t~k~I).z»!~::~
•
=
Thus the zero of the BItS of. (,'~ +s ~ly!. ~."'r.J..o:P~ th~ ,BItS of (1).
lCb/(a+b)-k/«a/b)+k) is incr...~~g
.a
k ak log P(~)
cOllYerges to
<
o.
Then It
and so
i,.r;.•k;l(.~l
Since
~< N .- R. i i 'ttR.). then
~
i.a decreasing i~~:;'-'.. ~.1':,
~
) 0
and when
.a 10gP(!)
"at
kNRis unique.
k
_)10 •
";;;""')0.
then the BItS of (2)
the BItS of (2) cOllYerges to
crosses the line
y=
0
~
- N
at .ost once
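The rank likelihood and its score equation can be sketched numerically. This is a minimal illustration, not the author's code: it assumes the z-vector form of the likelihood with the constant $n_1!\,n_2!$, and the function names (`z_star`, `log_rank_likelihood`, `score_times_k`, `rank_mle`) and the bisection bounds are hypothetical choices made here.

```python
import math

def z_star(z):
    """Reverse cumulative sums: z_star[i] = z[i] + ... + z[N-1] (0-based)."""
    out, running = [0] * len(z), 0
    for i in range(len(z) - 1, -1, -1):
        running += z[i]
        out[i] = running
    return out

def log_rank_likelihood(z, k):
    """log P(R) under proportional hazards; z[i] = 1 if the (i+1)th order
    statistic of the combined sample comes from the Y-sample, else 0."""
    n, n1 = len(z), sum(z)
    zs = z_star(z)
    const = math.lgamma(n1 + 1) + math.lgamma(n - n1 + 1)  # log(n1! n2!)
    # 0-based index i corresponds to N - i + 1 = n - i in the text's notation
    denom = sum(math.log((n - i) + (k - 1.0) * zs[i]) for i in range(n))
    return const + n1 * math.log(k) - denom

def score_times_k(z, k):
    """k * d/dk log P(R), i.e. the right-hand side of equation (2)."""
    n, zs = len(z), z_star(z)
    return sum(z) - sum(k * zs[i] / ((n - i) + (k - 1.0) * zs[i]) for i in range(n))

def rank_mle(z, lo=1e-8, hi=1e8, iters=200):
    """Bisection on a log scale; relies on (2) being decreasing in k.
    Assumes the two samples overlap so that a root exists in (lo, hi)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if score_times_k(z, mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

For example, `rank_mle([1, 0, 1, 0])` solves $2 - 2k/(k+1) - k/(k+2) = 0$, whose positive root is $(1+\sqrt{17})/2$.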
In the sequel, we will be concerned with the distribution of
$\hat{k}_{NR} - k^*$, where $k^*$ is a constant. The consistency of $\hat{k}_{NR}$
for $k^*$ when $k^* = k$ (i.e., $H_0$ is true) and the asymptotic normality of
$\sqrt{N}(\hat{k}_{NR}-k)$ have been demonstrated by Tsiatis (1981) among others.
Both properties will be derived in a more useful way for this investigation in Chapter III.
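Letting $k_0$ be the null hypothesis value of $k$, and expanding the likelihood
equation in a Taylor series about $k_0$ with the second derivative
$\partial^2 \log P(\mathbf{R},k')/\partial k^2$ evaluated at $k'$, where $k'$
is between $k_0$ and $\hat{k}_{NR}$, we can calculate the asymptotic variance
of $\sqrt{N}(\hat{k}_{NR}-k)$ as follows. Let $k = e^{\theta}$. Then

$$\log P(\mathbf{R},\theta) = \log(n_1!\,n_2!) + n_1\theta - \sum_{i=1}^{N}\log\left\{(N-i+1)+(e^{\theta}-1)z^*_{Ni}\right\},$$

$$\frac{\partial \log P(\mathbf{R},\theta)}{\partial\theta} = n_1 - \sum_{i=1}^{N}\frac{e^{\theta}z^*_{Ni}}{(N-i+1)+(e^{\theta}-1)z^*_{Ni}},$$

$$\frac{\partial^2 \log P(\mathbf{R},\theta)}{\partial\theta^2} = -\sum_{i=1}^{N}\frac{e^{\theta}z^*_{Ni}}{(N-i+1)+(e^{\theta}-1)z^*_{Ni}}\left[1-\frac{e^{\theta}z^*_{Ni}}{(N-i+1)+(e^{\theta}-1)z^*_{Ni}}\right].$$

LEMMA:

$$\operatorname*{P\,lim}_{N\to\infty}\; -\frac{1}{N}\frac{\partial^2 \log P(\mathbf{R},\theta)}{\partial\theta^2} = I_\theta(\theta) = \int_0^\infty \frac{k\lambda\bar{F}(x)}{\bar{H}(x)+(k-1)\lambda\bar{F}(x)}\left[1-\frac{k\lambda\bar{F}(x)}{\bar{H}(x)+(k-1)\lambda\bar{F}(x)}\right]dH(x),$$

where $H(x) = (1-\lambda)G(x)+\lambda F(x)$, $\bar{H} = 1-H$, $\bar{F} = 1-F$,
$k = e^{\theta}$ and $\lambda = \lim n_1/N$.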
PROOF:
+
aN 1-.
N- f
N 1-1
-
llO
o
k
(z)
I
'6) +
B
,
e-1
1
1
+ e B_1
k
1 .. \(x)
rL: GtIl
1 (x)
k
BNb:).
•
+
)JD (x)
1
/0
k
o !NCx)'
X,
D
+
S
k
, -'N(x)
::;,.
e -1
B]
+e
1 _.
[
Xl'
(x)
-1
]
.
S
+
e -1
.
;;.c1[~(x) - H(x)]
,"...
D1
r-
B
(xl. + e -1
1
= e ~ See Proof.
-.
k
!N(x)
)J
J
B
+ e -1
(x)
D
dH(x).
'.
'1
•
NotiDg that
Bx(x)
=
"N.l (z)
then by the
~up
N-MD
I
~(z)
G1tvenlc.o
~(z).
H(z) \.; P-<zt::.
H(z)
,,(z)
-F
n1
C;:~J1.te+1i. Lemma,
[H(z)/~(z) - 1]
a.8.)
0'
j,
...;)
,
-
~(z) =(1-A) G (z) + ~FD1 Eat) •• ).-lim ~/N.
n2
where
.>
•
(z)
.
() < l < 1
and
F
.up
n1
~
(x) [F
n1
(x)/F(x) - 1] a ••• )
~(x)
N+CD
It;{
(x)
& •• ~
Tn1 (x)
o.
Thus as
QED.
\l x.
The result follows.
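Corollary: If $\bar{F} = \bar{G}^{\,k}$, then writing $a = \bar{G}$,

$$I_\theta(k) = \int_0^1 \frac{k\lambda(1-\lambda)\,a^{1-k}\left(\lambda k a^{k-1}\,da + (1-\lambda)\,da\right)}{\left[(1-\lambda)a^{1-k}+k\lambda\right]^2} = k\lambda(1-\lambda)\int_0^1 \frac{dx}{\lambda k+(1-\lambda)x^{1-k}}.$$

Thus $\sqrt{N}(\hat{\theta}_{NR}-\theta)$ converges in distribution to
$N(0,\,1/I_\theta(\theta))$. Since $k = e^{\theta}$ is monotone increasing in
$\theta$, then $\sqrt{N}(\hat{k}_{NR}-k)$ converges in distribution to $N(0,\sigma_k^2)$, where

$$\sigma_k^2 = \frac{k^2}{I_\theta(k)} = \frac{k}{\lambda(1-\lambda)}\left[\int_0^1 \frac{dx}{\lambda k+(1-\lambda)x^{1-k}}\right]^{-1},$$

and where $\lambda = 1/2$,

$$\sigma_k^2 = 2k\left[\int_0^1 \frac{dx}{k+x^{1-k}}\right]^{-1}.$$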
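Since

$$R = P(Y>X) = \int_0^\infty [1-F(t)]\,dG(t) = \int_0^\infty e^{-(k+1)t}\,dt = \frac{1}{k+1},$$

then $\hat{R}_{NR} = 1/(\hat{k}_{NR}+1)$ and

$$\hat{R}_{NR}-R = \frac{1}{\hat{k}_{NR}+1}-\frac{1}{k+1} = -\frac{\hat{k}_{NR}-k}{(k+1)^2} + \frac{(\hat{k}_{NR}-k)^2}{(k'+1)^3} = -\frac{\hat{k}_{NR}-k}{(k+1)^2} + O_p(1/N),$$

where $k'$ is between $k$ and $\hat{k}_{NR}$. Thus

$$\sqrt{N}(\hat{R}_{NR}-R) = -\frac{1}{(k+1)^2}\sqrt{N}(\hat{k}_{NR}-k) + o_p(1).$$

Finally, then, $\sqrt{N}(\hat{R}_{NR}-R)$ converges in distribution to $N(0,\sigma_R^2)$, where, for $\lambda = 1/2$,

$$\sigma_R^2 = \frac{2k}{(k+1)^4}\left[\int_0^1 \frac{dx}{k+x^{1-k}}\right]^{-1}.$$

U-STATISTIC AS ESTIMATOR OF P(Y > X)

The U-statistic

$$\hat{R}_W = \frac{1}{n_1 n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} I(Y_i - X_j)$$

is clearly an unbiased estimator for $R$, where $I(T) = 1$ if $T > 0$ and $0$
otherwise. Letting $\phi(Y_i,X_j) = I(Y_i - X_j)$, and noting that
$E\phi^2(Y_i,X_j) < \infty$, then it follows from the two-sample U-statistics
theorem that $\sqrt{N}(\hat{R}_W-R)$ converges in distribution to $N(0,\sigma_W^2)$, where

$$\sigma_W^2 = \frac{\xi_{1,0}}{1-\lambda} + \frac{\xi_{0,1}}{\lambda}, \qquad 0 < \lambda < 1,$$

and

$$\xi_{1,0} = E[I(Y_1-X_1)\,I(Y_2-X_1)] - R^2 = \Pr(X_1<Y_1,\,X_1<Y_2) - R^2 = \int_0^\infty [1-F(x)]^2\,dG(x) - \left[\int_0^\infty [1-F(x)]\,dG(x)\right]^2,$$

$$\xi_{0,1} = E[I(Y_1-X_1)\,I(Y_1-X_2)] - R^2 = \Pr(X_1<Y_1,\,X_2<Y_1) - R^2 = \int_0^\infty G^2(y)\,dF(y) - \left[\int_0^\infty [1-F(x)]\,dG(x)\right]^2.$$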
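The U-statistic and its asymptotic variance under proportional hazards can be sketched as follows. This is only an illustration under stated assumptions: the closed forms for the two covariance components are the exponential-model values ($X \sim \exp(1)$, $Y \sim \exp(k)$), $\lambda$ is taken to be the limiting fraction of $Y$-observations, and all function names are hypothetical.

```python
def u_statistic(y, x):
    """Wilcoxon-type estimate of P(Y > X): fraction of (Y_i, X_j) pairs with Y_i > X_j."""
    wins = sum(1 for yi in y for xj in x if yi > xj)
    return wins / (len(y) * len(x))

def xi_shared_x(k):
    """Pr(X1 < Y1, X1 < Y2) - R^2 when X ~ exp(1), Y ~ exp(k), R = 1/(k+1)."""
    return 1.0 / (2.0 * k + 1.0) - 1.0 / (k + 1.0) ** 2

def xi_shared_y(k):
    """Pr(X1 < Y1, X2 < Y1) - R^2 under the same exponential model."""
    return 2.0 / ((k + 1.0) * (k + 2.0)) - 1.0 / (k + 1.0) ** 2

def sigma_w2(k, lam):
    """Asymptotic variance of sqrt(N) * (U - R); lam = limiting fraction of Y's (assumed)."""
    return xi_shared_x(k) / (1.0 - lam) + xi_shared_y(k) / lam
```

With equal sample sizes and $k = 1$ the variance reduces to $1/(12\lambda(1-\lambda)) = 1/3$, the classical Wilcoxon value.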
Under $H_0$, $1-F(y) = [1-G(y)]^k$. Thus $y = F^{-1}\{1-[1-G(y)]^k\}$. Also,
$1-G(x) = [1-F(x)]^{1/k}$. Letting $s = F(x)$, then
$GF^{-1}(s) = G(x) = 1-(1-s)^{1/k}$. Then

$$\sigma_W^2 = \lim N\,\mathrm{Var}[\hat{R}_W-R] = \frac{1}{1-\lambda}\left[\frac{1}{2k+1}-\frac{1}{(k+1)^2}\right] + \frac{1}{\lambda}\left[\frac{2}{(k+1)(k+2)}-\frac{1}{(k+1)^2}\right],$$

and when the sample sizes are equal, i.e. $\lambda = 1/2$,

$$\sigma_W^2 = \frac{2k}{(k+1)^2}\left[\frac{k^2+4k+1}{(2k+1)(k+2)}\right].$$

Notice that $\lambda(1-\lambda)\sigma_W^2$ is not a symmetric function of
$\lambda$, i.e., is still a function of $\lambda$. In many classical problems
$\lambda(1-\lambda)\sigma_W^2$ is independent of $\lambda$, indicating in some
cases that $\lambda = 1/2$ produces the least variance. However, that is not
the case here except in the case where $k = 1$, i.e., the classical case of
interchangeability. When $k = 1$, $\sigma_W^2 = 1/[12\lambda(1-\lambda)]$, so
$\lambda = 1/2$ produces the least variance. For any other value of $k$, it is
not optimal to choose equal sample sizes. However, since $k$ is not known in
advance, it may be difficult to justify different sample sizes. The same
comment applies to $\sigma_R^2$.
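The remark about sample-size allocation can be checked with a short calculation. The sketch below minimizes $\xi_a/(1-\lambda)+\xi_b/\lambda$ over $\lambda$, which gives $\lambda = \sqrt{\xi_b}/(\sqrt{\xi_a}+\sqrt{\xi_b})$. It assumes the exponential-model covariance components and the convention that $\lambda$ is the fraction of $Y$-observations; the function name is hypothetical.

```python
import math

def optimal_lambda(k):
    """Fraction of Y-observations minimizing the U-statistic's asymptotic variance,
    assuming X ~ exp(1), Y ~ exp(k)."""
    xi_a = 1.0 / (2.0 * k + 1.0) - 1.0 / (k + 1.0) ** 2          # shared-X component
    xi_b = 2.0 / ((k + 1.0) * (k + 2.0)) - 1.0 / (k + 1.0) ** 2  # shared-Y component
    # d/d(lam) [xi_a/(1-lam) + xi_b/lam] = 0  =>  lam/(1-lam) = sqrt(xi_b/xi_a)
    return math.sqrt(xi_b) / (math.sqrt(xi_a) + math.sqrt(xi_b))
```

Under these assumptions the optimum equals one half only at $k = 1$, in line with the text.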
THE ESTIMATORS UNDER NON-PROPORTIONAL HAZARDS

When $H_0$ is not true, the U-statistic has the advantage of being
unbiased for all alternatives. Thus $\sqrt{N}(\hat{R}_W-R)$ is asymptotically
$N(0,\cdot)$ as $N \to \infty$, where the variance depends upon the specific
alternative. However, when $H_0$ is not true, $\hat{k}_{NR} \to k^*$ as
$N \to \infty$, where $k^*$ is the solution to the MLE equation:

$$1 = \int_0^\infty \frac{k^*\bar{F}(x)}{(1-\lambda)\bar{G}(x)+k^*\lambda\bar{F}(x)}\,dH(x).$$

Thus, when $H_0$ is not true, $\sqrt{N}(\hat{R}_{NR}-R)$ does not even have an
asymptotically normal distribution. This fact identifies the importance
of finding a test of $H_0$ which will be consistent against a broad class
of alternatives.

In summary, the lower variance of $\sqrt{N}(\hat{R}_{NR}-R)$ under $H_0$ and the
unbiasedness of $\sqrt{N}(\hat{R}_W-R)$ under $H_a$ motivate the use of a
preliminary test estimator of $R$.
For a fixed alternative, the consistency of the test will determine
that we will always use $\hat{R}_W$, and so our estimator will be unbiased
with a variance which has been estimated by others. The burden of this
paper will be to investigate the behavior of the preliminary test estimator
when $H_0$ is true (Chapter III) and under a sequence of local alternatives
(Chapter IV), since the properties of the test and the resulting
expectation and mean squared error of the PTE are not meaningful if the
power is arbitrarily close to one as $N \to \infty$.
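The selection rule behind a preliminary test estimator can be written in a few lines. This is a schematic sketch only: the two-sided rejection rule and the argument names are assumptions made here for illustration, and the test statistic itself is developed in Chapter III.

```python
def preliminary_test_estimator(r_rank_mle, r_u_statistic, test_statistic, critical_value):
    """Return the rank MLE of P(Y > X) when the preliminary test does not reject
    proportional hazards, and the U-statistic otherwise.
    The two-sided rule |T| > c is an illustrative assumption."""
    if abs(test_statistic) > critical_value:
        return r_u_statistic   # proportional hazards rejected: use the unbiased estimator
    return r_rank_mle          # not rejected: use the lower-variance estimator
```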
COMPARISON OF RANK MLE WITH U-STATISTIC
For a preliminary test estimator to be a reasonable procedure,
we should expect $\sigma_R^2 \le \sigma_W^2$, since we are deriving the
estimator with asymptotic variance $\sigma_R^2$ under the assumption of
proportional hazards. In fact $\sigma_W^2/\sigma_R^2$ lies between 1.32 and
1.33 for all values of $k$ between 1.0 and 3.0.
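The variance ratio can be evaluated numerically. The sketch below assumes $\lambda = 1/2$ and the closed forms $\sigma_R^2 = \frac{2k}{(k+1)^4}\left[\int_0^1 dx/(k+x^{1-k})\right]^{-1}$ and $\sigma_W^2 = \frac{2k}{(k+1)^2}\cdot\frac{k^2+4k+1}{(2k+1)(k+2)}$; the Simpson integrator and all function names are illustrative choices, not part of the original.

```python
def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def info_integral(k):
    """Integral of 1/(k + x**(1-k)) over [0, 1]."""
    if k >= 1.0:
        # Equivalent form x**(k-1) / (k*x**(k-1) + 1); avoids 0**negative at x = 0
        return simpson(lambda x: x ** (k - 1.0) / (k * x ** (k - 1.0) + 1.0), 0.0, 1.0)
    return simpson(lambda x: 1.0 / (k + x ** (1.0 - k)), 0.0, 1.0)

def sigma_r2(k):
    """Asymptotic variance of the rank-MLE-based estimator (lambda = 1/2 assumed)."""
    return 2.0 * k / (k + 1.0) ** 4 / info_integral(k)

def sigma_w2(k):
    """Asymptotic variance of the U-statistic (lambda = 1/2 assumed)."""
    return 2.0 * k / (k + 1.0) ** 2 * (k * k + 4.0 * k + 1.0) / ((2.0 * k + 1.0) * (k + 2.0))

def variance_ratio(k):
    return sigma_w2(k) / sigma_r2(k)
```

At $k = 1$ the ratio is exactly $4/3$, since $\sigma_W^2 = 1/3$ and $\sigma_R^2 = 1/4$; for other $k$ in $[1, 3]$ the ratio stays close to that value.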
We have thus produced two possible estimators of P(Y>X). One is
unbiased and is always asymptotically normally distributed no matter what
the distributions of the two populations. The other is the rank MLE computed
under the assumption that the two populations are related by proportional
hazards. It is consistent for P(Y>X) when proportional hazards holds, but
in general will not be if proportional hazards does not hold. In addition
we have confirmed the usefulness of such a preliminary test procedure by the
lower variance of the rank MLE when proportional hazards holds.
CHAPTER III
In this chapter we begin by stating the null and alternative
hypotheses in rigorous ways which allow the state of nature to be
parameterized by families of probability distributions. Then we develop
the asymptotic null distribution of a linear rank test for departures from
proportional hazards. A wide class of fixed alternatives for which the
statistic is consistent is derived; and finally we investigate the
performance of the preliminary test estimator when $H_0$ is true.
Null and Alternative Hypotheses
$H_0$ is stated as follows: the ratio of the hazard rates of two
independent groups is equal to an unknown constant $k$ for all $x > 0$.
There are two classes of alternatives for this study:
A fixed alternative to $H_0$ is a hazard ratio profile such that
one group's hazard is always greater than the other's and the ratio is
"essentially" monotone increasing. "Essentially" means that at the
beginning of the study, the hazard ratio is less than some constant $k$, at
the end of the study the ratio is greater than or equal to $k$, and that after
the beginning of the study, the hazard ratio equals $k$ only once.
Referring back to the equation in Chapter II indicating the invariance of
the hazard ratio under a monotone transformation, we choose one of the
transformed random variables to be exp(1) and the other to be a
distribution with "essentially" increasing hazard rate. In order to impose
a compatibility condition on the now "alternative distribution" to exp(k),
we require that the alternative $F^*$ have finite first and second moments,
where $E(X) = 1/k$. See illustration.

[Illustration: hazard ratio $r(x)$ plotted against $x$, with reference levels 1 and $k$.]
Thus the problem of detecting non-constant hazard has been transformed to a
problem of detecting departure from a mixture of exp(1) and exp(k)
distributions in the direction of one group having a distribution with
"essentially" increasing hazard.

$H_0$: $r(x) = k \quad \forall x$.

$H_a$: $r(x) = k + \psi(x)$, where $r(0) < k$, $\psi(x_0) = 0$ ($x_0$ exists and is unique), and
$\int_0^\infty e^{-\int_0^x [k+\psi(t)]\,dt}\,dx = 1/k$.

$H_{aN}$: This is a root-$N$ sequence of local alternatives to proportional
hazards. Just as for fixed alternatives, $\int_0^\infty$
F (x)dx =
Ie-
-n
~
11k; however' F~ (z)5=if{Z1~N-L)
= k + A/IN.
As N -) '" F(x.
K
en
=8,:-'to 'tl-;l.}fca ~.:c
0
-'-0
N
where
.6 ~ - ) 1 - e -
--
--b.-'
•
However, there is a mathematical technicality which must be addressed
concerning this sequence of alternatives. Recall the transformation from
the original space:

$$r(x) = \frac{f(x)/[1-F(x)]}{g(x)/[1-G(x)]} = \frac{f^*(x^*)}{1-F^*(x^*)}, \quad \text{where } \frac{g^*(x^*)}{1-G^*(x^*)} = 1.$$

Note that if $f(0) = 0$, then the LHS $= 0$ at the origin, and thus $f^*(0) = 0$.
But under a sequence of alternatives we would like
$f^*(x^*)/[1-F^*(x^*)] = r(x)$ to be equal to $k$, uniformly in $x$.
Thus the transformation induces a singularity at the origin.
To avoid this problem, we modify the statement
of the local alternative slightly so that for any sequence $\{x_N\}$ where
$\lim x_N > 0$, then $\lim r(x_N) \to k$. Finally, we note that since we index the
$H_{aN}$ sequence by a two-dimensional parameter space generating a family of
distributions, and that there is a restriction $\int_0^\infty \bar{F}(x)\,dx = 1/k$, then
the sequence is along a curve embedded in the unrestricted parameter space.
For the remainder of this thesis, we drop the (*) notation for convenience,
so that all derivations are done in the transformed space.
For power under a sequence of local alternatives, the presence of a nuisance
parameter in any candidate for the statistic suggests basing the test on a
principle developed by Neyman (1959). In his seminal paper, Neyman showed that
if interest centered on testing against the null value of a parameter when a
nuisance parameter is present, it is possible to construct an
asymptotically locally most powerful test by computing the efficient score
statistic based on the parameter of interest, and at the same time,
substituting a "weakly" consistent or "strongly" consistent estimate of the
nuisance parameter into the resulting score statistic. $\hat{\theta}_n$ is "weakly"
consistent if $\sqrt{N}(\hat{\theta}_n - \theta) = O_p(1)$ and $\hat{\theta}_n$ is
"strongly" consistent if $\sqrt{N}(\hat{\theta}_n - \theta) = o_p(1)$.
In the former case, other terms must be added to the
basic score statistic to provide the optimal properties. It is this
general framework which suggests that a locally optimal rank test can be
derived, for all we would have to do is take the derived adaptive test
statistics and produce rank scores by taking expectations given the rank
vector. That is, we would treat the problem as "parametric", derive the
appropriate "parametric" statistic and take expected values given the
rank vector. Unfortunately, this procedure will not work in the present
context. Recall that in the case of a simple rank score when the null
hypothesis is i.i.d. random variables, the optimality of the efficient
score requires the score function to equal the derivative of the log
likelihood. Only then will the Cauchy-Schwarz inequality become an
equality and yield a maximum for the non-centrality parameter. We may then
conclude that the rank statistic is asymptotically efficient. However, if
the random variables are not i.i.d. under $H_0$, we shall see in the next
section that the statistic cannot be expressed as a simple sum of
"efficient scores" which are equal to the derivative of the log-likelihood
under a specified sequence of local alternatives. Consequently, there is no
equivalence of the Cauchy-Schwarz inequality in the rank statistic case to
the parametric case as there is when there are i.i.d. random variables
under $H_0$.
Moreover, even if it were possible to construct an "optimal test"
using the Neyman theory, there would be at least two drawbacks: 1) the
expected value of a function of an order statistic depends upon $k$ in a
very complicated way and 2) computation of the variance of the statistic
would be inordinately difficult due to both the extra term and the
estimation of $k$ by $\hat{k}$.
In addition, it can be argued that efficiency of the statistic is
not really relevant here. After all, the bona fide null hypothesis is
proportional hazards in the untransformed space, not two independent
exponential populations in the transformed space. Thus there is no test in
the untransformed space with which to compare this test. Even if we make
the comparison in the transformed space, we run into problems; for there is
no such thing as a two-sample parametric test for detecting departure from
two exponentials based upon the joint likelihood, where one of the
populations is exp(1) under both null and alternative hypotheses. Lastly,
there is no indication a priori that optimality (in some sense)
of the test statistic is related to desirable performance of the PTE, the
main focus of this study.
THE CHERNOFF-SAVAGE THEOREM (1958)

Let $Y_1 \le \dots \le Y_{n_1}$ be the ordered observations of a random sample from a
population with continuous cumulative distribution function $F(x)$.
Let $X_1 \le \dots \le X_{n_2}$ be the ordered observations of a random sample from a
population with continuous cumulative distribution function $G(x)$.
Let $N = n_1 + n_2$ and $\lambda_N = n_1/N$, $0 < \lambda_N < 1$. Define

$$H_N(x) = (1-\lambda_N)\,G_{n_2}(x) + \lambda_N F_{n_1}(x),$$

where $F_{n_1}$ and $G_{n_2}$ are the empirical distribution functions of the two samples. Define

$$T_N = \int J_N[H_N(x)]\,dF_{n_1}(x) = \frac{1}{n_1}\sum_{i=1}^{N} E_{Ni}\,z_{Ni},$$

where the $E_{Ni}$ are scores and $z_{Ni} = 1$ if the $i$th ordered observation of
the combined sample is from the Y-sample, and $0$ otherwise.
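A linear rank statistic of this form is simple to compute once scores are chosen. The sketch below uses the approximate scores $E_{Ni} = J(i/(N+1))$ discussed later in this chapter; the function name and the example score function are illustrative assumptions.

```python
def linear_rank_statistic(labels, J):
    """T_N = (1/n1) * sum_i E_Ni * z_Ni with scores E_Ni = J(i / (N + 1)).
    labels[i-1] = 1 if the i-th ordered observation of the combined sample
    is from the Y-sample, else 0."""
    n = len(labels)
    n1 = sum(labels)
    total = sum(J(i / (n + 1)) for i, z in enumerate(labels, start=1) if z == 1)
    return total / n1
```

With $J(u) = u$ this is a rescaled Wilcoxon rank sum: for labels `[0, 1, 1]` it returns $(2/4 + 3/4)/2 = 0.625$.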
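Theorem: If
(1) $J(H) = \lim_{N\to\infty} J_N(H)$ exists for $0 < H < 1$ and is not constant,
(2) $\int_{I_N} [J_N(H_N) - J(H_N)]\,dF_{n_1}(x) = o_p(N^{-1/2})$, where $I_N = \{x : 0 < H_N(x) < 1\}$,
(3) $J_N(1) = o(N^{1/2})$,
(4) $|J^{(i)}(H)| = |d^iJ/dH^i| \le K[H(1-H)]^{-i-1/2+\delta}$ for $i = 0, 1, 2$ and for some $\delta > 0$,
then $T_N$ is asymptotically normally distributed, providing $\sigma_N \ne 0$.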
The utility of the theorem is in providing the asymptotic normality of
a linear rank statistic based upon a mixture of distributions. As
developed in Chapter I, under $H_0$, the combined sample consists of $n_1$ random
variables from a population with cdf $F = 1-(1-G)^k$ and $n_2$ random variables
with cdf $G$. Thus, in this investigation $F(x) = F(x,k)$ where $k$ is
unknown. Consequently $J[H(x)] = J[H(x,k)]$.
The condition on the second derivative in condition (4) has since been
shown to be unnecessary. Stronger results need only $i = 0, 1$. The older
approach has been taken here for two reasons: 1) the 2nd derivatives are
simple and 2) this formulation is useful for the Taylor expansion of
$T_N(\hat{k}_{NR})$.

Adjustment in $T_N$ for Estimating $k$

Since $k$ is estimated from the data, we expand $T_N(\hat{k}_{NR})$ in
a Taylor series about the true value $k$.
A
.
"-;(kfill-k)- 0p (1) under B ' we can write
o
-1/2
T.,lk) + t B + 0p (N
kN
=k
In general. Olemoff and Savage show that
+""J(H(z)dF(z) + (1-]J
TN =1
.....
where
thus
+m
p
•
'..
,/J(R(z)} dF(.x}
-cD"
+ t/1iI.. Thus TlI(k+ t/!N ) •
) for a suitable conatant B.
cove ..
•
Since
::I:.~ -.
as
-::
N -)""
:.:. - .:.
We can abo erpress the
Consequently

$$B = \lim_{N\to\infty}\frac{\int_{-\infty}^{+\infty} J[H(x,k+t/\sqrt{N})]\,dF(x) - \int_{-\infty}^{+\infty} J[H(x,k)]\,dF(x)}{t/\sqrt{N}}.$$

If $\left|\frac{\partial}{\partial k}J[H(x,k)]\right| < M(x)$ where
$\int_{-\infty}^{+\infty} M(x)\,dF(x) < \infty$, then

$$B = \int_{-\infty}^{+\infty}\frac{\partial J[H(x,k)]}{\partial k}\,dF(x).$$

In fact, let $k^*$ be a running variable in the expression for
$E\,J[H(Y_i,k^*)] = 0$ where $Y_i \sim \exp(k^*)$; then under regularity
conditions (see appendix),

$$\frac{d}{dk^*}\int_0^\infty J[H(y,k^*)]\,k^*e^{-k^*y}\,dy = \int_0^\infty \frac{\partial}{\partial k^*}\left\{J[H(y,k^*)]\,k^*e^{-k^*y}\right\}dy = 0.$$

Carrying out the differentiation and using $E\,J[H(Y_i,k^*)] = 0$,

$$\int_0^\infty \frac{\partial J[H(y,k^*)]}{\partial k^*}\,k^*e^{-k^*y}\,dy = \int_0^\infty y\,J[H(y,k^*)]\,k^*e^{-k^*y}\,dy.$$

Setting $k^* = k$, we find that

$$B = \int_0^\infty y\,J[H(y,k)]\,ke^{-ky}\,dy = \mathrm{Cov}\left[Y_i,\,J[H(Y_i,k)]\right].$$

Theorem: Let $t = \sqrt{N}(\hat{k}_{NR}-k)$, which is $O_p(1)$, range over a compact
interval of the real line. Then, for $C > 0$,

$$\Pr\left(\max_{|t|<C}\left|\sqrt{N}\left[T_N(k+t/\sqrt{N}) - T_N(k)\right] - tB\right| > \varepsilon\right) \to 0$$

as $N \to \infty$. This statement is important because it says that when we
substitute $\hat{k}$ for $k$ in $T_N(k)$, then the adjustment is good uniformly on a
compact (bounded and closed) interval, not just pointwise. The theorem is
proved by adapting a method due to Jureckova (1969) which she used in a
different context.
First, partition the interval $[-C, C]$ into $r$ divisions, such that
$-C = \Delta_0 < \Delta_1 < \dots < \Delta_r = C$.
From the above, we know that we can chose r large enough
H large enough,
so that for
Pr litH (TNt. -TNo ) -~ i B\> 1/4,'£ 'cl1Vr where
T\ • £ > D. i
= D..... r.
Si:IIIultlUleOUllly. H is large enough
tlEl-c,c],
til -1' <tI. <tl
o
Assume
B) 0
The case
•
1N'.T)l1I -
so that
TNt.
i
0
is non-decreasing in
INrNo -
lIB
~ O.
In TNl\- INrI*:> - lIBl ...<
then
oj
o
•
'B6. . .
1
'Jeer!: .0"':.::
~lrN'l'Ni . ->"N.T No - Btli.'
o
0
IIl8Z
/1Nr Nli.-INTN-!ltI I
Illj < c
•
tI.
B < 0 . can be handled analogously.
= ..'11'1'.....
- mNo .L'UUo
Thus
that
Say N > No.
(til - tl i _ ) B < 1/2 E.
l
How choose any
80
<
0
2 max 11'N'l'''K
- .IlfT' N0 - BA1 l ... o./2)e:
,...
o~i~r
1
Btl.1 · - Btl.1~1
~-'. 0
0
Thus
Pr (1II8X
161 -< c
lINT
NIl.
_. T!
~o
Pr{21118X I INTNt.
o~i~r
>~}
•
<
- rNTN6 - B6 i I + (1/<) E> d
i
0
= Pr{maz lIN T·
Nt. .
.o~i~ r
61
- B
-liT.
No
i
- B6i I > E/4} <
n
by construction QED.
By the Chernoff-Savage theorem, $\sqrt{N}[T_N(k)-\mu(k)]$ converges in
distribution to a normal r.v. with mean zero and finite variance, where

$$\mu(k) = \int_{-\infty}^{+\infty} J[H(x,k)]\,dF(x,k).$$

We must now investigate the effect of using the adaptive statistic
$T_N(\hat{k}_{NR})$. Since $\mu(\hat{k}_{NR}) \to \mu(k)$ in probability (since
$\hat{k}_{NR}$ is consistent for $k$), then we can write

$$\sqrt{N}[T_N(\hat{k}_{NR})-\mu] = \sqrt{N}[T_N(k)-\mu] + B\sqrt{N}(\hat{k}_{NR}-k) + o_p(1)$$

with probability approaching 1 as $N \to \infty$. Since the RHS is the sum of
two asymptotically normally distributed random variables,
$\sqrt{N}[T_N(\hat{k}_{NR})-\mu]$ is asymptotically normally distributed with
mean zero and some variance. We can now write that

$$\mathrm{avar}[T_N(\hat{k}_{NR})-\mu] = \lim N\,\mathrm{Var}[T_N(k)-\mu] + 2B\lim N\,\mathrm{Cov}[T_N(k)-\mu,\ \hat{k}_{NR}-k] + B^2\lim N\,\mathrm{Var}(\hat{k}_{NR}-k).$$

The first term is given by the Chernoff-Savage theorem. The third term is
given by $\lim_{N\to\infty} N\,\mathrm{Var}(\hat{k}_{NR}-k)$, derived in Chapter II.
In order to derive the second term, it is necessary to express $\hat{k}_{NR}-k$
in a form using a linear combination of random variables, just as in the
Chernoff-Savage theorem. We do this by noting, as in Chapter II, that the
Taylor series expansion for $\hat{k}_{NR}-k$ is

$$\hat{k}_{NR}-k = \frac{\partial \log P(\mathbf{R},k)/\partial k}{-\,\partial^2 \log P(\mathbf{R},k')/\partial k^2},$$
•
where
a
log P(~,k) = )Jk - /'"
ak
1
o
Then we expand
about
•
K(F, G)
F(z)
= [G(z)]k.
•
2.
~(x)
H(F
d[~(x)
= aMldF
Let 1011 (F)
then
=
nl
n1
- H(x) + H(x)]
(F.Il) • ~(G) = aMjaG
[~-H)
.G ) dH = H(F.G) d
n2
N
n2
(F.G)
+ M(F.G)dH
•
~::;
~5::) :-~~,
+ M(F.G)d (~-H) +
1:
1=1
~,~ r 1"i] i'~
'\
~1
( . -J
t
•
Since
1-1
= (l-G)k
• then
M(F.G)
= Hr(F) = Mb(G)
(1-A)
G1-k+kA
Integrating the last expression by parts we find that
._
.... )
.j : -,
"'!-
_ I
~
.-.
.._ - . "
r .• _,
,.
+
:..:
_.. ," ...
'
i.
-
aMlaGn2
Evaluating
~(G)
at (I.G). noting that
I-
= G-k
then
=
[1- 4 (1-G) 1-k + k] 2
4
Hl(l)
S:1mUarly.
_ 1-4 (l_l)l/k - 2
=
_"""'4
_
[~ (1-1') 11k - 1 + k] 2
A
Noting that
dMc;(G)
-=
U-k) ~ (G)4G
and
dH!(I') = (11k - 1) ~(F)dl =(.k-~k ~(I')dl'.
•• have
A (k/k-1)/:rl n1(z)-r(x)] ~[F(x)] + (1-)J(k/k-1)
1~ln1(x)-I(X)]~(F(X)]dG(X)
o
+" 11-k/~Gn2(x)-G(x)]
o
G
M
[G(x)]~ ~ .~~,:",}J/O.-k)
\
-.
.
lGn2 (x)-G(x)]
dMG[G(x)]
0
r.
-
"1;ln1(y) - l(y)]dMG[G(xH--
!~t:!~"'{Gnn2(x) -
G(x)]dMG[G(x)]
Collecting terms we have on the RHS
"Ift-l)
'k(1';'),)
1~ln1 (x)-l(x)] ~[F(x)] = .
k-l
+
"/~-k)
I
00
,
[ln1 (x)-l(x)] ~ [1(x)]dG(x)
0
[G (x)-G(x)] M ' [G(x)] dF(x) + (1- ~k 1
n2
G
(l-k)
0
00
[G
n2
(x)-G(x)] dMG[G(x)]
Integrating by parts and collecting terms, we have finally that
n
+ A.-l) U/n
n
!
2
! [D (V.) - ED (X.)]+ l1n
2
2 j-1
2
-"'.1
l'
1
! (M-[F(Y.)] - mL[F(Y.)]}
1i-1 -I'
1
-I'
k(1-A)~~I}{I/D2 j~~HG[G(~)1 - ~[G(xt)l) - I/Dll~~[Dl(Yl)
+
Y
D (Y ) = , __1
1 i
o
where
D (%i)
2
=
JX 1
Xo
dG(y)
1\t [G(x)]
dF(x)
CIO
Jo
MF[F(x)] dH(x)
1
1
= 10 !\; (G)dH = A/ O
=10Mb[G(x)]
~-1
and
G
!
dH~x)
[l-A) + k A! ~-1
(1-A) + AkG~-l
•.
= A!k
.
F
~.
_
dG
_, i)";'
.. ,"
-",
. "-k-l .since - dB
[101-A) + AG
]dG
=
5:
.
We w111 show t~a~. '!,fllNi
= G-l-k •
- EDI(Y I ) 1
5
!
.
+ i-I ~i
M,' [F(x)]
CIO
Now
1
_
-
-1/2
~oT(N
).
(See Appendix A).
-- i-'I
I
I
Thus $\hat{k}_{NR}-k$ can be expressed as a linear combination of a mixture of
independent random variables with zero mean and finite variance. The
consistency of $\hat{k}_{NR}$ for $k$ is immediate. In addition, the asymptotic
normality of $\sqrt{N}(\hat{k}_{NR}-k)$ follows from the central limit theorem since each
D-function and M-function has finite second moment. This follows since
the asymptotic variance of $\sqrt{N}(\hat{k}_{NR}-k)$ was shown earlier to be a bounded
function of $k$.
Finally, asymptotic joint normality of $\sqrt{N}(\hat{k}_{NR}-k)$ and
$\sqrt{N}(T_N(k)-\mu)$ follows by expressing the pair as a vector, each of whose
components is the sum of independent random variables with finite second
moment. The result follows from the multivariate central limit theorem.
THEOREM:
lEwD - ) =-
~
.. ~
A
where $X \sim \exp(1)$, $Y \sim \exp(k)$, $k \to 1$. In other
words, the rank maximum likelihood estimator converges in probability to
the efficient estimator of $k$ as the two groups approach homogeneity.
PROOF:
Using a Taylor series expansion, we may write
2
= k-k(1-~) + k(l/k- ; ) +
-'l"
~
N
.~
YN
Using the binomial theorem, we can write $D_1(Y_i)$, $D_2(X_i)$ as infinite
series and integrate term by term. We do not have to be concerned with
constant terms since they will cancel against the expectations of the
stochastic terms. Also, when we expand our expression about $k = 1$, we
need only account for those terms with $k-1$ as a coefficient, since that
term will be divided by $k-1$ and the rest of the stochastic terms will be
of small order $(k-1)$.
Letting
= l-G(x.).
U.1
then substitution yields
1
U
= (t-l/k) I
U
i
o
1-2k
du,
1
k
(p u - +k)2
p
U
P u-1
where
1-A
p --A-
du
(p u1- k+k)2
cowerS" wben u 1-k. -<. -"p '.
Since eacb
~erm
i8 bounded. we can reverse tbe order of integration
and sWIIIII&tion:
_
k(k-1)
p
v--anding
-l'Ui
lUi
U
o
k(k-1) [(k)
p
p
j(l-k)
k(k-1)
p
k -2
[(-)
P
:J
log u
i
~
·~CD·-
1:
-
.
.
j-1
2
C)
j (l-k)
(k-l)u i
j
-2-j
log u i (k)
P
1-k
+
k-l
+ o(k-l) + constant
_
k(k-l)
P
00
[ (k)-2 log u + 1:
i
P
j-l
C 2 ) (k) Log u i +
j
P
0
(k-l) + constant]
Then the contribution of D(Zj)
to the overall statistic
is
lip [(lip) -2 + (1+1/p) -2 - (lip) -2 ] log u.
1
about. k=l. we find that its contribution is
Thus the total contribution of the z-sUlple is -
,..
~
(1- ~:Ei _ But since
~k
is a function of -D2 (zi). then the contribution of the z-sUlple to
~NRk
is
When u
~(l-A) i
N/IR(l)
=! when
~l-k
( kip_
l-k . k
) - • then
p
-1
=
k(k-l) ui u·
J
. .:i
=
u o u l-k + k)1o<
. ,,~,
p
k(k-l)
=
ct
--p
ltc(tS-t)-::::: ~
--
2 (k-l)-l
u
,.p"~--
.~
'.,',
-- .... ,~.
clu
-1 du
k(k-l) u
=
lID
L uj (k-l)-l (.~2)( k )-2-j du
1 i
p
u
j
o j-l
p
and the result follows frOll the previous case.
Y-a.ple:
u
1-k < -k
1:
=!::!
D ('Ii)
1
kp
= !::!
P
1
~ C2)(~)-2-j
0
j-l j
[(k)
kp
p
-2
u j (1-k)+l-2k) du
p
Log u. +
1
lID
-2
L (j) (k)
-2-J·
Log u. +
j-l;
0
(k-l)+ constant)
1
Comparing this to the comparable:" upr~8i~-l Jor the contribution of the xsample. we find that the Y-sampl~ contribute~~; A(l-A)Y i
-
thus -'I
to
f.I
k.
NR
Similarly. when u
,..
lim ~k
k+l
= i-.y
l-k
)
C
to
D1 (Y )
i
• tl1e result is the same.
in probability.
.; :.
and
Thus
a
= N Cov
TR
-
wbere
"
"
[TN CkNR}-J,1 • kNR-kJ=
A D (x.) - ED (X.) - kCl-A) KGCGCX.)] - mL[GCx.)] ,
.
1
-~
J
2 J
2 J
=~isber
l/I a Ck)
information at
k
fro. tbe rank likelihood.
Lettinl . A = 1/2 for this iDvestigation then
an·
Each covariance term can be written in compact form using expected
values of stochastic integrals. It is necessary to evaluate expressions of
the form

$$E\int_0^\infty\!\!\int_0^\infty [G_1(s)-G(s)]\,[G_1(t)-G(t)]\,Z_1(s)\,Z_2(t)\,dF(s)\,dF(t),$$

where $G_1(\cdot)$ is the empirical distribution function for a sample of one
from a population with cumulative distribution function $G$. Then the above
expectation can be taken inside the integral when Fubini's theorem holds
(see Appendix).
Now

$$E[G_1(s)G_1(t)] = E[I(X \le s)\,I(X \le t)] = \Pr[X \le \min(s,t)] = \min[G(s),\,G(t)].$$

Consequently, the required expectation equals

$$\iint_{-\infty<s<t<\infty} G(s)[1-G(t)]\,Z_1(s)Z_2(t)\,dF(s)\,dF(t) + \iint_{-\infty<t<s<\infty} G(t)[1-G(s)]\,Z_1(s)Z_2(t)\,dF(s)\,dF(t).$$

By symmetry of $t$ and $s$ we can write the second term as

$$\iint_{-\infty<s<t<\infty} G(s)[1-G(t)]\,Z_1(t)Z_2(s)\,dF(t)\,dF(s).$$

Finally we have

$$E\int_0^\infty\!\!\int_0^\infty [G_1(s)-G(s)][G_1(t)-G(t)]\,Z_1(s)Z_2(t)\,dF(s)\,dF(t) = \iint_{-\infty<s<t<\infty} G(s)[1-G(t)]\{Z_1(s)Z_2(t)+Z_1(t)Z_2(s)\}\,dF(s)\,dF(t).$$

Similarly:

$$E\int_0^\infty\!\!\int_0^\infty [F_1(s)-F(s)][F_1(t)-F(t)]\,Z_1(s)Z_2(t)\,dG(s)\,dG(t) = \iint_{-\infty<s<t<\infty} F(s)[1-F(t)]\{Z_1(s)Z_2(t)+Z_1(t)Z_2(s)\}\,dG(s)\,dG(t).$$
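The key step above is the identity $E\{[G_1(s)-G(s)][G_1(t)-G(t)]\} = G(\min(s,t))\,[1-G(\max(s,t))]$ for the empirical distribution function of a single observation. The sketch below checks it by exact enumeration for a discrete uniform distribution; the choice of distribution and the function name are illustrative assumptions.

```python
from fractions import Fraction

def centered_indicator_covariance(support, s, t):
    """Exact E[(I(X<=s) - G(s)) * (I(X<=t) - G(t))] for X uniform on `support`,
    returned together with the closed form G(min(s,t)) * (1 - G(max(s,t)))."""
    m = len(support)

    def G(u):
        return Fraction(sum(1 for v in support if v <= u), m)

    expectation = sum(
        (int(v <= s) - G(s)) * (int(v <= t) - G(t)) for v in support
    ) / m
    closed_form = G(min(s, t)) * (1 - G(max(s, t)))
    return expectation, closed_form
```

For $X$ uniform on $\{1,\dots,6\}$, $s = 2$ and $t = 4$, both quantities equal $\tfrac13\cdot\tfrac13 = \tfrac19$.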
Y
Now
:8 (Yi)
= fyi
J' [H(s)]dG(s)
1
o
Integrating by parts we have that
Sillilerly
=10
-GO
:8 (Xj)-"f:B (Xj)
2
2
[G (.)-G(.)] JI [H(.)] 41'(.)
1
,,."" ;
II G(s) [1-G(t)] (J' [H(.)]KG' [G(;r)] + J' [B(t)]Kc;' [G(s)]}4F(s)4F(.t)
I.
-~<s<t<GO
~
.- ....
:~~
a~
..
II F(s) [l-F(t)] J' [B(S)]"r'[F(t)]4G(s)4F(t)
~<s<t<GO
+ II F(.) [1-F(t)] J' [B(t»]"r"" [F(s)]4G(t)4F(s)
~< s <: t·< GO
But since
and so
k
II
l-F(t)
= [l-G(t)]k.
dF(t)
Cov {MY[F(Y )]. B (Y )]
i
1 i
= k[l-G(t)]k-1 dG (t)
=
r(.) [1-r( t)] {JI [H(.)]MY' [F( t)] [1-G( t)]k-1
_00< s < t < 00
+ JI [H(t)]Mp' [F(s)] [1-G(s)]k-1}dG(.)dG(t)
k
II GC.)[1-GCO] {J'[HC.)]MG;' [G(t)] [1_GC.)]k-l
-aD< 8<'t'< CD
II FC.) [1"'F(t)] {JI [HCS)]~ [F(t)] + JI [H('t)]MY' [F(s)]}dG(s)dG(t)
-aD< s< t < 00
,.... ,• •>
,.,
~(.
,
,"'
All terms for calculation of $\mathrm{avar}[T_N(\hat{k}_{NR})-\mu]$ have now been derived.
THEOREM: Let $Y_i \sim \exp(k)$; then if $Y_i$ is uncorrelated with the score
function, $B = 0$. Consequently
$\mathrm{avar}\,T_N(\hat{k}_{NR}) = \lim N\,\mathrm{Var}(T_N(k)) \quad \forall k > 0$.
PROOF: The result follows immediately from the result derived earlier.
CONSISTENCY
In order to demonstrate the consistency of a test, we must show that
the centering constant of the statistic under a fixed alternative is
different from zero. In classical nonparametric theory there are useful
theorems which provide sufficient conditions for consistency. However, in
this problem, the centering constant will depend upon an estimate of the
nuisance parameter $k$. Therefore, we do not expect to find simple
sufficient conditions which make the proposed rank statistic consistent for
all fixed alternatives to proportional hazards. Moreover, that goal is not
really required for the purposes at hand. The ultimate goal of the procedure
is to provide an estimate of the parameter P(Y>X), which is not
informative if the survival curves cross. The goal then becomes to identify
a subclass of alternatives for which
$\int_0^\infty J(H(x,k^*))\,dF(x) \ne 0$, where $k^*$
is the limit in probability of the estimate $\hat{k}_{NR}$ of $k$
and $\int_0^\infty \bar{F}(x)\,dx = 1/k$.
Recall that the transformed null hypothesis is that the two
populations consist of exponential(1) and exponential(k) random
variables. Under a fixed alternative, one population is exp(1) and the
other is arbitrary with $\int_0^\infty \bar{F}(x)\,dx = 1/k$. The following lemma
shows that for a broad class of alternatives, $k^* > k$.
LEMMA: If $r(x) = f(x)/\bar{F}(x) > 1 \ \forall x$, $r(0) < k$ and $r(x) = k$
only once, then $k^* > k$ when $\int_0^\infty \bar{F}(x)\,dx = 1/k$.
PROOF: Let $G(x) = 1-e^{-x}$. The rank maximum likelihood equation is
$\partial \log P(\mathbf{R})/\partial k = 0$.
We have already shown in this chapter that asymptotically, this equation
becomes

$$\frac{1}{k^*}\left[1-\int_0^\infty \frac{k^*\bar{F}(x)}{(1-\lambda)\bar{G}(x)+k^*\lambda\bar{F}(x)}\,dH(x)\right] = 0. \qquad (1)$$

Let $h(x) = (1-\lambda)e^{-x}+\lambda f(x)$ and
$h^*(x) = (1-\lambda)e^{-x}+\lambda f^*(x)$, where $f^*(x) = k\bar{F}(x)$ so that
$\int_0^\infty f^*(x)\,dx = 1$. Since the LHS of (1) is the limiting form of
$\partial \log P(\mathbf{R})/\partial k$ evaluated at $k = k^*$ under the assumption of
proportional hazards and is non-increasing in $k^*$, then
$k^* > k \iff$ LHS $> 0$ at $k$. Rewriting (1), we have that

$$k^* > k \iff \frac{1}{k}\left[1-\int_0^\infty \frac{k\bar{F}(x)h(x)}{h^*(x)}\,dx\right] > 0 \iff \int_0^\infty \frac{f^*(x)h(x)}{h^*(x)}\,dx < 1.$$

Now

$$\int_0^\infty \frac{f^*(x)h(x)}{h^*(x)}\,dx = \int_0^\infty f^*(x)\,dx + \int_0^\infty [h(x)-h^*(x)]\frac{f^*(x)}{h^*(x)}\,dx = 1 + \int_0^\infty a(x)\,b(x)\,dx,$$

where

$$a(x) = h(x)-h^*(x) = \lambda\,(r(x)-k)\,\bar{F}(x), \quad \text{where } r(x) = f(x)/\bar{F}(x),$$

$$b(x) = f^*(x)/h^*(x) = \frac{k}{(1-\lambda)\,e^{-x}/\bar{F}(x)+\lambda k}.$$

Note that by the restriction on $F$, $r(0) < k$ and so $a(0) < 0$. As $x$
increases, $a(x)$ increases to a maximum greater than zero and then
decreases to zero in order that $\int_0^\infty a(x)\,dx = 0$. See Figure.

[Figure: $a(x)$ and $b(x)$ plotted against $x$, with $a(0) < 0$ and $b(0)$ marked.]

Let $S(x) = e^{-x}/\bar{F}(x)$. Then $S(x)$ is increasing in $x$
$\iff (e^{-x}/\bar{F}(x))\,[r(x)-1] > 0 \iff b(x)$ is decreasing in $x$.
Since $a(x) \le 0$ for $x \le x_0$ and $a(x) \ge 0$ for $x \ge x_0$, where
$r(x_0) = k$, thus

$$\int_0^\infty a(x)\,b(x)\,dx < b(x_0)\int_0^\infty a(x)\,dx = 0.$$

Thus $\int_0^\infty \frac{f^*(x)h(x)}{h^*(x)}\,dx < 1$. QED.
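The conclusion $k^* > k$ can be illustrated numerically for one alternative. The sketch below is an assumption-laden example, not part of the original: it uses a Weibull alternative $\bar{F}(x) = \exp[-(Kx)^{1+\theta}]$ with $K$ chosen so that the mean is $1/k$, takes $k = 3$, $\theta = 1$, $\lambda = 1/2$, drops the positive factor $1/k^*$ from the limiting equation, and truncates the integral at a hypothetical upper limit where $\bar{F}$ is negligible.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def asymptotic_score(k_prime, k=3.0, theta=1.0, lam=0.5, upper=5.0):
    """1 - integral of k' Fbar h / ((1-lam) e^{-x} + k' lam Fbar) dx, i.e. the
    bracketed term of the limiting likelihood equation at trial value k'."""
    K = k * math.gamma(1.0 + 1.0 / (1.0 + theta))  # gives the Weibull mean 1/k

    def integrand(x):
        sf = math.exp(-((K * x) ** (1.0 + theta)))                 # survival function Fbar
        f = (1.0 + theta) * K ** (1.0 + theta) * x ** theta * sf    # density of F
        h = (1.0 - lam) * math.exp(-x) + lam * f                    # density of the mixture H
        return k_prime * sf * h / ((1.0 - lam) * math.exp(-x) + k_prime * lam * sf)

    return 1.0 - simpson(integrand, 0.0, upper)

def limiting_mle(k=3.0, theta=1.0, lam=0.5):
    """Bisection for k*; assumes the root lies in [k, 100k], which holds for this example."""
    lo, hi = k, 100.0 * k
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if asymptotic_score(mid, k, theta, lam) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the score is positive at $k' = k$ and decreasing in $k'$, the root returned by `limiting_mle` lies above $k$ for this alternative.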
It is clear that the most identifiable class of distribution functions
which satisfy the sufficiency conditions are IFR distributions with
$r(x) > 1$. It is still possible to say something about the additional class
where $r(x) < 1$ for some $x$ and $r(x) = k$ only once. In this case, $b(x)$ is
not monotone since there is a solution to the equation $r(x) = 1$. Let $x_0$ be
as before and define $x_1$ by $b(x_1) = b(0)$. Then for that subclass of
distributions $F$ such that $x_1 < x_0$, then as before $k^* > k$.
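EXAMPLE: Weibull alternative.

$$\bar{F}(x) = \exp[-(Kx)^{1+\theta}], \quad \theta > 0,\ K > 0, \qquad r(x) = (1+\theta)K^{1+\theta}x^{\theta},$$

with $\int_0^\infty \bar{F}(x)\,dx = 1/k$. Then

$$b(x_1) = b(0) \iff e^{-x_1} = \bar{F}(x_1) \iff x_1^{\theta} = K^{-(1+\theta)},$$

$$r(x_0) = k \iff x_0^{\theta} = k\,[(1+\theta)K^{1+\theta}]^{-1}.$$

Thus $x_1 < x_0 \iff 1+\theta < k$. Thus for those Weibull distributions such that
$k > 1+\theta$, $k^* > k$.

THEOREM: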
and 'WLOG
x
an
Ea (X.k*)
=
k*,
If
TN(~)
then
Ia(X.k)
•
uponential (k) cdf.
uniformly in
=0
where the
*
aa(x.k) •
ax
1S
is consistent for all IFR
k * ) k. Proof:
distributions such that
/
o
Let
decreasing in
e
x &
1+9. k* ) k.
(
upectations is w. r.t.
II TN
Then
/00
0
a (x.k*)dF(x)
*
*
- e-k.'X)
=/ 0 (F(x)
a (x.k*) d [F (x) - e -k x]
00
=
-
aa (x.k* )d:x
ax
wi.
Since
F(x)
is IFR
and E (x) = 11k"
_
reliability theory.
f:()
I: X
(e
-klt
F(x»
-klt
i
e
f or some Xl > k1.
then by a well-known theorem in
,
Uk
fory xo(
e· c.-
... :. :.. ••• ~. .-...
....
~
~'. ""l','
_.-.~
........
-
.
~
and
k* ) k. then
Since
~
e
is monotone decreasing in x.
aa
1-1 'TN )
(x .k* )
o
-k* x
for
Then if aa(x,k *)
ax
x ...< x 0 • say.
then
*
[ F(x)-e-kx] dx
f 00
0
ax
• aa
ax
)0
(=)
* [l/k - l/k * ]
(xo.k)
*
k' ) k
QED.
Finally, the foregoing results on sufficient conditions for
consistency show that the statistics based on the pseudo-efficient scores and
log x are consistent when the one group's hazard is always above or below the
other's and the hazard ratio is increasing, since each $J(H(x,k))$
has non-increasing first derivative. Also the logrank statistic
is consistent since it is a monotone function of $k$ when $k > 1$.
Consistency follows for any distribution for which $k^* > k$ or $k^* < k$,
since $\mu(k^*) = \mu(k)$ only when $k^* = k$.
Scores for the Linear Rank Test
In the beginning of this chapter, four conditions were given for the
asymptotic normality of the Chernoff-Savage representation theorem.
Hajek and Sidak (1967) show that if $a_N(i) = J\left(\frac{i}{N+1}\right)$,
then the first three conditions are satisfied.
Chernoff and Savage showed that they are also satisfied when
$a_N(i) = E\,J(U_N^{(i)})$, where $U_N^{(i)}$ is the ith order statistic in a
sample of size N from the uniform distribution.
We show now that the former method
is preferable when one attempts to construct "optimal" scores in this
problem.
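The two ways of generating scores can be compared numerically. The sketch below is only an illustration for the log-rank score function $J(u) = -\log(1-u)$; the function names are hypothetical and do not come from the dissertation. It relies on the standard fact that $-\log(1-U^{(i)})$ is the ith order statistic of a unit exponential sample, whose expectation is $\sum_{j=1}^{i} 1/(N-j+1)$.

```python
import math

def approx_logrank_scores(n):
    """Scores a_N(i) = J(i/(N+1)) with J(u) = -log(1-u)."""
    return [-math.log(1.0 - i / (n + 1)) for i in range(1, n + 1)]

def exact_logrank_scores(n):
    """Scores a_N(i) = E[J(U_(i))] for J(u) = -log(1-u).

    This is the expected ith order statistic of n unit exponentials:
    sum_{j=1}^{i} 1/(n-j+1).
    """
    scores, total = [], 0.0
    for i in range(1, n + 1):
        total += 1.0 / (n - i + 1)
        scores.append(total)
    return scores

if __name__ == "__main__":
    for a, e in zip(approx_logrank_scores(5), exact_logrank_scores(5)):
        print(f"approx={a:.4f}  exact={e:.4f}")
```

By Jensen's inequality the expected-value scores are never smaller than the approximate ones for this convex $J$; the approximate form is attractive because it needs no order-statistic expectation.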
In Chapter I, we showed that there exists a monotone transformation of
the original data which maps the random variables from one group to
exponential random variables with scale parameter 1 and the random
variables from the other group to exponential random variables with scale
parameter k when $H_0$ is true.
Although this particular transformation is arbitrary, it is convenient
to think of an alternative to proportional hazards as being a departure from
a mixture of (unobserved) exponential random variables. If $X_i \to X_i^*$,
where $G(x^*) = 1 - e^{-x^*}$,
then for some alternative $Y_i \to Y_i^*$, where $F(y^*) = F(y^*, K, \theta)$ and $K$,
$\theta$ are scale and shape parameters respectively. We would like these same
scores to result in a statistic which is consistent against a broad class
of fixed alternatives to the exponential with constant hazard k.
The locally most powerful rank test for a general alternative
has been presented by Hajek and Sidak (1967):
Theorem:
Let the family of densities $d(x,\theta)$, $\theta \in J$, satisfy the following
three conditions.
Then the test with the critical region
\[ \sum_{i=1}^{N} c_i\, a_N(R_i, d) \ge K \]
is the locally most powerful rank test for $H_0$ against $\{q_\Delta, \Delta > 0\}$
at the respective level, where
\[ a_N(i, d) = E\left[\frac{\dot{d}(X^{(i)}, 0)}{d(X^{(i)}, 0)}\right] \]
with $X^{(i)}$ denoting the i-th order statistic from a sample of size N
from the distribution with density $d(x,0)$.
(i) $d(x,\theta)$ is absolutely continuous in $\theta$ for almost every $x$.
(ii) the limit
$\dot{d}(x,0) = \lim_{\theta \to 0} \frac{1}{\theta}\,[d(x,\theta) - d(x,0)]$
exists for almost every $x$.
(iii) $\lim_{\theta \to 0} \int |\dot{d}(x,\theta)|\,dx = \int |\dot{d}(x,0)|\,dx < \infty$.
The production of the locally efficient score relies on the expansion:
N
11
r
R. -1
,
c 1 1 ... / [ d (x 1,0)
R-r
......
d(x R.~O)
The score follows directly.
However, in the present problem, such a procedure leads to the
following result.
,
Ql1 (~ • ::)
• 1
+
d 2 (y t'O)
g(k)l1
QO(! • :)
...
-
d 2 (y t'O)
There is no longer a simple score which is the expected value
of a simple random variable.
We conclude that under these circumstances,
this classical approach is not feasible and some substitute must be
found.
For the purposes of this investigation, we use score functions of
three kinds:
1) The log-rank function
$J(H(x,k)) = -\log[1 - H(x,k)]$
2) "Pseudo-efficient" functions.
These are the score functions
based upon the log-likelihood under a local alternative, ignoring
the function $g(k)$.
Note that the function will depend on $k$ in
general, but usually in a simple manner.
3) $\log(z)$, where $z$ is a running variable for a random variable with
the exp(k) distribution.
(See Appendix B for the justification of these
scores for the Chernoff-Savage theorem and computation of B, the adjustment
coefficient.)
As for the computation of the score itself, we recall from the
beginning of this section that the score can be regarded as the expected
value of a function of an ith order statistic; however, a more convenient
formulation is $a_N(i) = J\left(\frac{i}{N+1}\right)$.
The latter formulation is preferable in our case since, under the null
hypothesis, we are not dealing with order statistics derived from an
i.i.d. sample.
We will see shortly that this method leads to scores which
must be derived numerically and which depend upon our estimate of $k$
under $H_0$.
Pseudo-Efficient Score for a Two-Parameter Sequence of Local Alternatives.
In Chapter I, we saw that a sequence of local alternatives to
proportional hazards is really a two dimensional sequence when the
alternative cdf is parameterized by scale and shape parameters.
Let $\Omega$ be the two-dimensional space of these parameters.
Let one group be the X-sample, where $X_j$, $j = 1, \ldots, n_2$
are exponential (1) and the Y-sample $Y_i$, $i = 1, \ldots, n_1$
are from a continuous density $f(y, K, \theta)$
and $f(x, k, \theta_0) = k e^{-kx}$.
The likelihood
•
Since the $X_j$'s are exponential (1) under both null and alternative
hypotheses, then the score can be thought of as being a function of the
ranks of unobserved exponential (k) random variables amongst the
total of N observations.
Dropping the (*) for notational convenience, we
find that
n
log LR
= 1:1
i-I
[log F( YiK.e) - log F( Yi,k,8 0 0]
n
• 1:+[(X-k.e - e]
o
i-I
•
V
where
E (k.e )
o
+ (l":'e) ( o.e)
°
for some
o <
.E < 1.
Thus
f~ F(x)dx.
o
<a>
i
K
1
0
0
1:'\..
u-e -ou du • 1
k
Then
a(9) •
(~.9)
= (K(t).
9(t»
~t(t). [kg'(t),1]
We first show that
o
:
u0e-Gudu
= (K(t).t) =
a'(O)
~(t).
~'(O). [kg'(O), 1]
so
= lim
t-+-o
1)
1
g(0). e0 f
= au
v
then
0
K. g(0)k where
Let
Lettina
e
S(t)-s(O)
t
exists.
a(O) = lim aCt) exists.
t-+-o
2
f~ x t e-x
f~ xt(l-X+ ~1
dx •
t t
t+1 + x
f OX
-x
••• )dx -
t+2
) dx
21-···-
thus
tt+1 _
t+1
tt+2
t+2
aCt) =
et
t+1
t
= et
+
t t+3
--. '"d_
21 (t+3)
~
1: (_1)i-1
i-1
~
I
i-1
<_1)i-1
- ....
"
(i-1) 1(t+i)
tt+i
(t+i)(i-1)1
-
t i-1
(t+1)(i-1)1
e
+1
t
1
[t+1 + Polynomial in t]
as
t +
0
-e
00
1 + I: (_1)1-1
g(t)-g(O) .. lim e t It+1
t
t
t-+o
1-2
(t+1) (1-1) !
t
00
t
1__
t
__
+
I: (_1)1-1
]
11m e
t 1-1
t+1
1-2
t-+o
(t+1)(i-1)I
t
thus
lim
t..-o
00
+ I: (_1)1-1
lim e t 1- L
t+1
1-2
t-too
. lim
t+o
1
1
2
-) -1 -
1
t+2 + polynomial in
-
I - ~+1
]
t 1-2
(t+1)(1-1) !
3
t ]
as T -) o.
=- 2
For the Makeham distribution we find that
\[ \nabla \log f(K,\theta)\Big|_{K=k,\ \theta=0} = \left[\frac{1}{k} - x,\ 2(1-e^{-kx}) - kx\right] \]
thus the pseudo-efficient score is
\[ -\frac{1}{2}\,k\left(\frac{1}{k} - x\right) + 2(1-e^{-kx}) - kx = 2(1-e^{-kx}) - \frac{kx}{2} - \frac{1}{2}. \]
LINEAR HAZARD:
Letting
~
..
x+ 1£
a
1
-> dx - 7tr c1x
1
7F
thus
10
F(x)clx -
2
.!L
00_
e 2a
~2
e
2"
2
K
.. e
2a
1
K
• 18 12'IT [1- ~ ( nr)
]
de
-1]
where
~
(t)
Let l.l -
-
is the
K
cdf
of a standard normal random variable.
• then we must have
-k1
-
~
e 2
or
lin U-~(ll)]
K
i.
Iii
kp
Ie -
• ~
2 [l-~(p) ]
e
Jt
1i1T
It i. well-known that
e 2 Il-t(p)] • R(l.l)
11'
'ii
- "3 + 0
i. Mill'. ratio where R(p) =
1.1
where
R(P)
1
(4) •
1.1
Thus
In an arbitrarily smally neighborhood of e
a
~
k - 1 - - 2 ->
K
3K
~ (T)
K
2
3
2
(t) - [K(t), a(t)] -
= [l.e'(t)].
I K-kl <
1
£ £> 0
K - 9- (8)
K
--k
~
a'(K) - 2K -
a...
a-
= o.
[t~~(t}]
Since 9'(T):>0
then 9('K)
in'
'K •
for any small neighborhood
and so the inverse function
exists in a neighborboodO<s. <
Perparameterizing. we can say that
a...
where
(t) -
a...
*(u)
K' (u)1 -
u-o
V Log f(K,9)
I-
K-k
6-0
- lK(U),9(u)] - lK(U),U]
lUi (K)]-ll
-
_! .
K-k
1
x x2
lk - x, k - '2 ]
k
£' ...
Thus the pseudo-efficient score for a local sequence of linear hazards is
\[ -\frac{x^2}{2} + \frac{2x}{k} - \frac{1}{k^2}. \]
A simple calculation shows that
\[ \int_0^\infty x\left(-\frac{x^2}{2} + \frac{2x}{k} - \frac{1}{k^2}\right) k e^{-kx}\,dx = 0 \]
so that for the statistic using this score,
$B = 0$
and the adjustment is
not needed.
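The vanishing integral can be checked with exact arithmetic. The sketch below assumes that the linear-hazard score reads $s(x) = -x^2/2 + 2x/k - 1/k^2$ and that the integral is taken against the exponential density $k e^{-kx}$ (rate $k$); both readings are reconstructions from the scanned text, and the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def exp_moment(m, k):
    """E[X^m] for X with density k*exp(-k*x): m!/k^m."""
    return Fraction(factorial(m)) / Fraction(k) ** m

def score_moment(power, k):
    """E[X^power * s(X)], s(x) = -x^2/2 + 2x/k - 1/k^2, X ~ exponential(rate k)."""
    k = Fraction(k)
    return (-Fraction(1, 2) * exp_moment(power + 2, k)
            + (2 / k) * exp_moment(power + 1, k)
            - (1 / k ** 2) * exp_moment(power, k))

if __name__ == "__main__":
    for k in (Fraction(11, 10), Fraction(3, 2), Fraction(19, 10)):
        print(k, score_moment(0, k), score_moment(1, k))
```

Under these assumptions both the mean of the score and its first cross-moment with $x$ are exactly zero for every $k$, which is consistent with the statement that no adjustment is needed.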
It must be stressed that a "pseudo" efficient score is exactly that in
several respects:
1) We ignore a coefficient which is a complicated function of $k$.
2) Even for the truly "efficient" score, the concept of Pitman
efficiency does not apply to this problem.
Pitman efficiency is
meaningful when there is a parametric test with which to compare the
competing test.
Since the original null hypothesis is that of proportional
hazards, the null hypothesis is actually an equivalence relation between
distribution functions where
$F \sim G \iff 1-F = (1-G)^k,\ k > 0$
and $k$ can be different for different pairs
$(F, G)$ in the equivalence relation.
3) Moreover, even for the truly efficient score, the comparison of
the rank test to a parametric test to detect the departure of one of the
populations from an exponential distribution would not make sense for the
simple reason that there can be no parametric test which uses the joint
likelihood of the sample.
That is, the parametric version reduces to two
independent one-sample problems.
COMPUTATION OF THE SCORES
Typically, rank scores depend upon the rank only.
That is, the score
function is a simple function of $u = H(z)$: $J(u) = -\log(1-u)$, the "log-rank"
score, or $J(u) = u$ for the Wilcoxon score.
In this investigation,
however, derivation of the pseudo-efficient scores reveals that scores depend
upon the nuisance parameter $k$ and $Y_i^*$, an unobserved random variable
with the exponential (k) distribution. In order to make sense of the
scores, then, we let $v = 1 - e^{-kz}$ so that under $H_0$
\[ u = (1-\lambda)\left[1 - (1-v)^{1/k}\right] + \lambda v \]
or
\[ 1 - u = (1-\lambda)(1-v)^{1/k} + \lambda(1-v). \]
Set
\[ 1 - \frac{i}{N+1} = (1-\lambda)(1-v_i)^{1/k} + \lambda(1-v_i) \]
for the ith ordered observation and solve for $v_i$;
then $z_i = -\frac{1}{k}\log(1-v_i)$.
Then, the rank MLE for $k$ is substituted for $k$.
Note that we have suppressed the dependence of $v_i$ on $k$.
See graph below.
[Figure: graph of the score, with axis labels $u$ and $J(H)$.]
Obviously, numerical methods are required to compute the scores so
that a score such as $-\log\left(\frac{N-i+1}{N+1}\right)$
is preferred if its performance is
comparable to those requiring numerical computation.
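The numerical step described above can be sketched with a simple bisection. This is a minimal illustration under stated assumptions, not the author's program: the mixture is read as $u = (1-\lambda)[1-(1-v)^{1/k}] + \lambda v$ with $v = 1 - e^{-kz}$, $\lambda$ is treated as a known mixing proportion, and the value passed as `k` stands in for whatever estimate of $k$ is plugged in. All function names are hypothetical.

```python
import math

def mixture_cdf(v, k, lam):
    """u as a function of v under the assumed null mixture; increasing in v."""
    return (1.0 - lam) * (1.0 - (1.0 - v) ** (1.0 / k)) + lam * v

def solve_v(u, k, lam, tol=1e-12):
    """Solve mixture_cdf(v) = u for v in (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, k, lam) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def transformed_points(n, k, lam):
    """z_i = -(1/k) log(1 - v_i), with v_i solving the mixture at u_i = i/(n+1)."""
    points = []
    for i in range(1, n + 1):
        v = solve_v(i / (n + 1), k, lam)
        points.append(-math.log(1.0 - v) / k)
    return points

if __name__ == "__main__":
    print(transformed_points(5, 1.5, 0.5))
```

A score function such as the pseudo-efficient one would then be evaluated at each $z_i$. When $k = 1$ the mixture collapses to $u = v$ and the points reduce to $-\log(1 - i/(N+1))$, the simple log-rank form mentioned in the text.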
instead of : is . allowed
so
that
~ - iN
N+1
N
•L
N+l
since
+
p as
~:: ~
N+oo.
•
As N+oo,
i
Settina u =N+1
~N +
P 0 <P <
1,
on
PRELIMINARY TEST ESTIMATOR UNDER $H_0$
We are now interested in the bias and mean square error of the
PTE when $H_0$ is true.
....
(TN(~) -~) are jointly normal by the multivariate central limit
theorem.
Thus let
IN
.IN
....
(~R-R),
Pl
....
• corr
(TN(~
P2 •
C~
Let
Bo
ell
(-
aR
, e2
)
corr (e l2 , e 2 )
aw
he the critical value of the one-sided test in which we reject
if Pr(e 2
~ Ca*o}
< a.
Then the expected value of the
PTE is
'efmed as
-where
£22: (3 l2 ,3t - N
[0
o
,•••• ,.ot• . ;; ..
-•. _ , . . •..•..•• -.,.-.
V
We can immediately write down the asymptotic covariances
,..
aRT •
N cov [TN(~)'
....
~-RJ.
----------
....
N· cov
ITNR (k) ~R- Rl +
....
....
B N cov [k~!F-:-k, ~utR]
~
where R._-R· -"NR
I
A
<'k+11 2
N
(\rR-k) using the standard Taylor's expansion
""
~
covI TN(~R) '~W-R] •
We can split the contributions to the covariance into two parts:
A
N cov [TN(kl, ~-R)]
a.
b~ . BN cov [. "
~ik, "~-R].
a)
We first express
integrals.
"
t1T (·Rmv..,;,R)
as a
sum of independent stochastic
Using the expansion for the generalized U-statistic for
"
Rw-
..l:n n
l 2
n 1 n:2
1:
1: 4>:(.~ , Yi)
i-1 j-l
where
if Yi > ~
otherwise
we have
+
where
<l>10(X) -
E
',.
<I>
-~-F(x)
[(yi,xj")lxj-x]
'
:~
,
.
Ol(y) - E !(Yi'Xj>.IYty]- .G(y)
-
we recall that
-lex
..
,
~
-y
F(x)· l-e
, G(y) • 1-e
A
N
n2
Write n(R._-R}- 1:
4> 10(Xj ) - R
-"MOl
n 2 j-l
•
~'I
1
1
n1
(I-A) [~
I:
n 1 i-I
= constant +
T~.(k)
1
(i1,
0p
where the last term is
n2
I:
n 2 j-1
B(Yi)-E B(Y )
i
- -
B(;X-e)-
B(X )
..I
j
]
where
B (r ) • 1 i
. B (X ) - 2 j
Now for
4l
r
00
o
I F1 (y)-F(y)]
00
1 I G1 (x)-G(x)] J' (H(x»dF(x)
0
j
C (X ) • l-F(X ) • l-l: dF(x), so that
2 j
j
10 (X ) we write
j
Cll)
1 I Gl
9
C (X ) -E. C (~) •
2
2 j
parts,
J~(H(y})dG(y)
(x) -G(x)] dF(x)
Also let
Cl (Y ) • G(li) •
i
Y
i
1
using integration by
dG(y)
o
00
Thus
- 1
C (Yi)-·E C (Y ) ••
l
l i
IN
f';
0
n2
(R..__
-R) • thus y~N
l'l
-"'NW
1:
j-1
n2
[
F (y)-F(y)] dG(y)
l
C2 (X)
"j -
E, C2'
(X)
'j
r;;
°1
+ - ' I:
°1 i=l
[C (Y )-EC (Y.)]
1 i
1 l.
+ ):1 cov{Cl(Y i ), B1 (Y i )]
,
I-A
+ ):
- - cov I C OC ),B (Xj)J
2
2 j
Cov
[C (X j) ,B (Xj)] - 2
2
CoV
[C (Y ) ,B (Y )] l i
l i
cov
C (Y ),B (Y ) .
l i ! I i
1:10 00 ~i~ (G(s) ,G(t) )-G(s)G(t)] J' H(s) (dF(s»)dF(t)
00 00
~
.
1 [(min F(s) ,F(t»-F(s)F(t) ] J'
0
r H(s)]
dG(s)dG(t)
Finally,
"
N Cov [T (k) ,!1n.7-R] -
N
11 G(s) (I-G(t)' {i'I H(s)] + J' [H(t)] } dF(s)dF(t)
-oo<s<t<oo
+ 11 F(s) (l-F(t)
-oo<s<t<00
r
{J' [B(s)]
+
J'
I
H(t)] } dG(s)dG(t).
bl
~~-k
Using the expression for
D- functions and
in terms of
M-functions,
we can write
'"
1
k-l
~
2
I-A
[k· ~
"'R
-
f'
I~-k, ~M-R]
N CQV
.
COV CDI (Yil, Cl (Yill
(~(F(Yi» ,Cl (Y i
+ cov
l2 A cov (D2OCj)'C2(Xj} - kEov
(MG G(X )
j
»
,C2 (Xj
»
thus
C1DlCl • eov 0'1 (lt1,Cl (Yill •
C1
FC1
C1
fr"F F(lt) ~Cl (Y i~·
• eov
D2C2
• eov
J, C2 (xj
D2 (xj
)
C2 (x )
j
1:/:
(min(F(s»,F(t)-
F(S}F(t~ ~'[F(S~ dG(s)dG(t)
1:/:~in(F(8» ,F(t)-F(S)F(t~
• - 1:/:&in(C(S) ,G(t»
• -
-G(s)G(t~ M' G ~(S)J
PTE UNDER H
o
- .:
(
After accounting for the two correlations, we can now rewrite
E(PTE) as:
E(Z2) • E(Z12) • 0
I
i
under Ho ' then
~~~(ZllIZ2)
- E
dF(s)dF(t)
1:/:~in(G(s),G(t»-G(s)G(t~ MGt~(s~ dG(s)dF(t)
ASYMPTOTIC BIAS OF
.
M' F fr(s~ dF(s)dG(t)
-,~
E(PTE).
(Z12I;2~
f2(Z2)dZ 2
Since the centering constants of both $Z_{11}$ and $Z_{12}$ are zero
under $H_0$, the PTE will be positively or negatively biased if
$\sigma_W \rho_2 > \sigma_R \rho_1$ or $\sigma_W \rho_2 < \sigma_R \rho_1$
respectively.
But this is true if and only if
$\sigma_{WT} > \sigma_{RT}$ or $\sigma_{WT} < \sigma_{RT}$, respectively.
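ASYMPTOTIC VARIANCE OF PTE UNDER $H_0$
Now, under $H_0$,
\[ Z_{11} \mid Z_2 \sim N\left[\sigma_R \rho_1 Z_2,\ \sigma_R^2(1-\rho_1^2)\right], \qquad
   Z_{12} \mid Z_2 \sim N\left[\sigma_W \rho_2 Z_2,\ \sigma_W^2(1-\rho_2^2)\right] \]
thus
\[ Var(PTE) = \int_{-\infty}^{C_\alpha} \left\{ \sigma_R^2(\rho_1^2 Z_2^2 + 1 - \rho_1^2)
   - \sigma_W^2(\rho_2^2 Z_2^2 + 1 - \rho_2^2) \right\} f_2(Z_2)\,dZ_2 + \sigma_W^2 \]
\[ = (\sigma_R^2\rho_1^2 - \sigma_W^2\rho_2^2) \int_{-\infty}^{C_\alpha} Z_2^2 f_2(Z_2)\,dZ_2
   + \left[\sigma_R^2(1-\rho_1^2) - \sigma_W^2(1-\rho_2^2)\right]\Phi(C_\alpha) + \sigma_W^2 \]
\[ = C_\alpha(\sigma_W^2\rho_2^2 - \sigma_R^2\rho_1^2)\,\phi(C_\alpha)
   + (\sigma_R^2 - \sigma_W^2)\,\Phi(C_\alpha) + \sigma_W^2 \]
\[ = C_\alpha(\sigma_W^2\rho_2^2 - \sigma_R^2\rho_1^2)\,\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{C_\alpha^2}{2}\right)
   + (\sigma_R^2 - \sigma_W^2)\,\Phi(C_\alpha) + \sigma_W^2. \]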
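As a numerical cross-check of the closed form, the sketch below compares it with direct quadrature of the two conditional second moments. It assumes the setup as reconstructed from the scan: $Z_2$ is standard normal under $H_0$, the rank estimator is used when $Z_2 \le C_\alpha$ and the Wilcoxon estimator otherwise, and the closed form is $C_\alpha(\sigma_W^2\rho_2^2 - \sigma_R^2\rho_1^2)\phi(C_\alpha) + (\sigma_R^2 - \sigma_W^2)\Phi(C_\alpha) + \sigma_W^2$. Function names and the parameter values are illustrative.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def pte_second_moment_h0(sigma_r, rho1, sigma_w, rho2, c_alpha):
    """Closed form for the PTE second moment under H0 (assumed reconstruction)."""
    return (c_alpha * (sigma_w**2 * rho2**2 - sigma_r**2 * rho1**2) * STD.pdf(c_alpha)
            + (sigma_r**2 - sigma_w**2) * STD.cdf(c_alpha)
            + sigma_w**2)

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def pte_second_moment_h0_numeric(sigma_r, rho1, sigma_w, rho2, c_alpha):
    """Integrate E(Z11^2 | z) below c_alpha and E(Z12^2 | z) above it."""
    def accept(z):
        return sigma_r**2 * (rho1**2 * z * z + 1.0 - rho1**2) * STD.pdf(z)
    def reject(z):
        return sigma_w**2 * (rho2**2 * z * z + 1.0 - rho2**2) * STD.pdf(z)
    return midpoint(accept, -12.0, c_alpha) + midpoint(reject, c_alpha, 12.0)

if __name__ == "__main__":
    args = (1.0, 0.3, 1.2, 0.5, 1.282)
    print(pte_second_moment_h0(*args), pte_second_moment_h0_numeric(*args))
```

For a very large critical value the expression tends to $\sigma_R^2$, matching the intuition that the rank estimator is then used almost surely.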
In Chapter III we have seen that the problem demands a restriction
on the alternative hypothesis for the problem to be well defined.
The one chosen
in this study, that the (fictitious) random variable associated with the
ratio of the two hazard functions have the same expectation under the null
and alternative hypotheses, is somewhat arbitrary, but has the heuristic
advantage of signifying some sort of correspondence of the null and
alternative through some sort of measure of central tendency.
One would not, for
instance, anticipate radically different results if the medians were set equal
to each other rather than the means.
We have also seen that reasonable, sufficient conditions on the profile
of the alternative and the nature of the scores can be found so that
consistency is assured under the restriction pointed out in the previous
paragraph.
Further, the preferred test (locally most powerful) is not
appropriate for several reasons and so we must make our choice of scores for
the linear rank statistic in a somewhat arbitrary way.
Certainly simplicity
is a virtue due to the potentially complicated means of obtaining the scores
numerically.
Lastly, the necessary quantities needed for finding the asymptotic joint
distribution of the test statistic and the two estimators are made possible
by an expansion of the MLE equation of k.
Simple formulas for the PTE bias
and variance under the null hypothesis are available due to the asymptotic
joint normality of the required statistics.
CHAPTER IV
INTRODUCTION
In previous chapters we have examined the performance of the PTE
when $H_0$ is true and for a fixed alternative. Since the outcome in the
latter case is trivial in the sense that one would use the Wilcoxon
estimator with probability approaching one, we are led to alternatives
between $H_0$ and a fixed alternative, close to $H_0$.
The sequences of local
alternatives will be equivalent to one group being exp(1) under both $H_0$
and $H_{NA}$ and the other group having a distribution which converges in law
to exp(k).
As pointed out in Chapter III this sequence can be parameterized
by a 2-dimensional parameter space $(K_N, \theta_N)$ restricted along a curve in
that space.
In this chapter, we derive the properties of the two estimators and
the test statistic under local alternatives.
Then we assess the performance
of the PTE under these circumstances.
In Chapter II, it was easy to compute the rank likelihood of the
data under the assumption of proportional hazards.
Following the strategy
adopted here to represent fixed alternatives to proportional hazards, we
would attempt to evaluate an N-dimensional integral where $F(x)$ would not
be exp(k).
Clearly, this is an intractable task.
Fortunately, this is
not necessary when examining behavior of $\sqrt{N}(\hat{R}_{NR} - R)$ under local
alternatives.
There are two ways to go in this situation:
1) We recognize that the log likelihood of the ranks derived under $H_0$
will still converge to some number different from zero for a fixed alternative.
Thus $e^{-kx}$ is replaced by $1 - F(x, K, \theta)$.
One can then make a Taylor
expansion of the constant term around the point $(k, \theta_0)$ to compute the
centering constant for $\sqrt{N}(\hat{k}_{NR} - k)$.
Then, arguing that the remainder
terms are still $o_p(1)$ by contiguity of $f(x, K_N, \theta_N)$ to $k e^{-kx}$, the
asymptotic normality of $\sqrt{N}(\hat{k}_{NR} - k)$
is shown under a local alternative.
This strategy is somewhat clumsy and is not particularly appealing
because the method does not fit into a more general theory for deriving
non-centrality parameters. In fact, one can see that this method is
extremely cumbersome when it comes to calculating the non-centrality
parameters for $\sqrt{N}(T_N(\hat{k}_{NR}) - \mu_N)$. There is a much more unified context
for this problem by a lemma due to LeCam (1960):
LEMMA: Let $S_N$ be a statistic.
If the pair $(S_N, \log L_N)$ is under $P_N$ asymptotically normal
$(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \sigma_{12})$ with $\mu_2 = -\frac{1}{2}\sigma_2^2$, then $S_N$ is under $Q_N$
asymptotically normal $(\mu_1 + \sigma_{12}, \sigma_1^2)$,
where $P_N$, $Q_N$ are probability
measures induced by densities $p_N$ and $q_N$, and $L_N = q_N / p_N$.
Note that this lemma obviates the need for Taylor expansion of
a constant term that is itself part of a Taylor expansion.
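The content of the lemma can be illustrated on a toy problem that is not the one studied here: a normal location shift. In this assumed example, $P_N$ is $N$ i.i.d. $N(0,1)$ observations, $Q_N$ shifts the mean to $h/\sqrt{N}$, and $S_N = \sqrt{N}\,\bar{X}$, so that $\log L_N = h S_N - h^2/2$ exactly. The lemma then predicts that $S_N$ has mean $0 + \mathrm{cov}(S_N, \log L_N) = h$ under $Q_N$. The function names and simulation sizes are arbitrary choices.

```python
import math
import random

def simulate(h, n, reps, seed=12345):
    """Draw S_N and log L_N under P_N, and S_N under Q_N, for a normal shift."""
    rng = random.Random(seed)
    root_n = math.sqrt(n)
    s_p, loglik_p, s_q = [], [], []
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s = root_n * sum(xs) / n
        s_p.append(s)
        loglik_p.append(h * s - 0.5 * h * h)  # exact log likelihood ratio
        ys = [rng.gauss(h / root_n, 1.0) for _ in range(n)]
        s_q.append(root_n * sum(ys) / n)
    return s_p, loglik_p, s_q

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

if __name__ == "__main__":
    s_p, ll_p, s_q = simulate(h=1.0, n=25, reps=4000)
    print("mean log L under P:", mean(ll_p))
    print("cov(S, log L) under P:", cov(s_p, ll_p))
    print("mean of S under Q:", mean(s_q))
```

The simulated mean of $\log L_N$ under $P_N$ should be near $-h^2/2$, and the mean of $S_N$ under $Q_N$ should be near the covariance computed under $P_N$, which is the shift the lemma delivers without any further expansion.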
NON-CENTRALITY PARAMETER FOR $\sqrt{N}(\hat{R}_{NR} - R)$ UNDER $H_{NA}$
In Chapter III we expressed the log likelihood for a local alternative
in terms of the directional derivative (stochastic) plus a term which
converges to a constant in probability.
It is clear from that expression
that $\log L \sim N(-\frac{1}{2}\sigma_2^2,\ \sigma_2^2)$ since the constant is simply the Fisher
Information evaluated at a point along a curve, and the directional
derivative is a sum of i.i.d. random variables.
We can express the stochastic part of $\sqrt{N}(\hat{k}_{NR} - k)$ which has non-zero
covariance with $\log L$ as
+
(k-l)IR(k)
Thus the non-centrality parameter
&(l-A~
2 IS
(k+l) (k-l) I (k)
R
Since
P
(l)
"
of Iif(~-R) •
ell
cov (L ,D ) + Y cov (L , M )
Fi
i li
i
"
1
(R
-R)· NR
(k+l) 2
"
(k-_- k) ,
-1m
r
I; L(y) d 1 (y) - F(y)] • - I; ~1 (y) Dli ... ED1i • I; D1 (y)d ~l (y) -F(Y~ • - I; I!l
L i - '1:L i •
0
~
F(y)] L (y)dy
(y) - F(y>] \-' fr(YJ!G(Y)
M,i - EM,i·
Thus
11
.-
I:
M,
[F(Y~
d
~1 (y)
-
F(Y~
{k(l-~)
IS
+
+
• -
I:
[F1
II
(Y)-F(Y~ ~ fr(y~ dF(y)
F(s)
-co<s<t<co
L' (t)M'
[F(O~
c'
NON-CENTRALITY PARAMETER FOR
L' (t)M'
"
nTN(~R- lJ)
L
(01
h'
1"
[F(o~ (oj
A If - F{s) e-F(t]
-co<s<t:<;:co ,
..
~-F(tj IL , (s)M' fr(t~ G' (t)
(s)M'
F'
UNDER
dodt
r(t~ F' (t)
dodt)
HA.
aT
Expressing the relevant stochastic part of
"
Jli(TN(~R - lJ)
aT
+
vN(l- A>. B (Y i >.
1
+
B
N
-----0T (k-l) I (k)
R
we f1nd
C2 •
...L- [(l~~
cov
°T
[k(l:-~
where
cov (J.i,B
1i
)"-
cov (Li,D U ) + A cov
II F(s) [i-F(t'l
_-<S<t<CD
:J
+
lh'
~i,MFJ
(s)J' (H(t);' (t)
L'(t)J'(H(a~ G'(a} dadt
NON-CENTRALITY PARAMETER FOR $\sqrt{N}(\hat{R}_W - R)$
The relevant stochastic part of $\sqrt{N}(\hat{R}_W - R)$
is
where
Cl (Y i } - ECl (Y i ) thus
Cl2 -
~ cov
- I: f l (y) ..: F(y3 dG(y)
Li, Cli
-
~ II
F(s) [!-F(tll [1'(s) G'(t) + L'(t)G'(si] dsat
_-<S<t<CD
SUMMARY
As in Chapter III, we note that the vector of the three relevant
random variables consists of components which are the sum of independent
random variables with finite second moment.
Thus, by the multivariate
central limit theorem, the vector is asymptotically jointly normal
and we may use LeCam's third Lemma.
By contiguity, the variances of each
random variable remain
the same (asymptotically) as they were under $H_0$.
Thus the only differences in the behavior of these three asymptotic
statistics from $H_0$ to $H_{NA}$ are the centering constants (shifts).
Consequently the covariances of each estimator with the test statistic
are asymptotically the same under $H_{NA}$ as they were under $H_0$.
We are now
able to express the bias and mean square error of the PTE under $H_{NA}$
in terms similar to those used under $H_0$.
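ASYMPTOTIC BIAS OF PTE UNDER $H_{NA}$
Let $Z_{11}$, $Z_{12}$, $Z_2$ and the joint density $f_{12}$ of $(Z_{12}, Z_2)$
be defined as under $H_0$. Under $H_0$,
\[ E(PTE) = \int_{-\infty}^{C_\alpha} E(Z_{11} \mid Z_2)\, f_2(Z_2)\,dZ_2
   + \int_{C_\alpha}^{\infty} E(Z_{12} \mid Z_2)\, f_2(Z_2)\,dZ_2. \]
However, under $H_{NA}$,
\[ Z_{11} \mid Z_2 \sim N\left[C_{11} + \sigma_R\rho_1(Z_2 - C_2),\ \sigma_R^2(1-\rho_1^2)\right], \]
\[ Z_{12} \mid Z_2 \sim N\left[C_{12} + \sigma_W\rho_2(Z_2 - C_2),\ \sigma_W^2(1-\rho_2^2)\right]. \]
Again,
\[ E(PTE) = \int_{-\infty}^{C_\alpha} \left[E(Z_{11} \mid Z_2) - E(Z_{12} \mid Z_2)\right] f_2(Z_2)\,dZ_2 + C_{12} \]
\[ = \int_{-\infty}^{C_\alpha} \left[(C_{11} - C_{12}) + (\sigma_R\rho_1 - \sigma_W\rho_2)(Z_2 - C_2)\right] f_2(Z_2)\,dZ_2 + C_{12} \]
\[ = (C_{11} - C_{12})\,\Phi(C_\alpha - C_2)
   + (\sigma_W\rho_2 - \sigma_R\rho_1)\,\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{(C_\alpha - C_2)^2}{2}\right] + C_{12} \]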
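where the last term,
\[ C_{12} = \lim_{N \to \infty} \sqrt{N}\left\{ \int_0^\infty \left[1 - F(x, K_N, \theta_N)\right] dG(x) - \frac{1}{1+k} \right\},
   \qquad (K_N, \theta_N) \in \gamma(t), \]
is the limit of $\sqrt{N}\left[P_N(Y > X) - \frac{1}{1+k}\right]$,
the shift in the parameter
for a sequence of local alternatives to proportional hazards.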
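The closed form for the expected value can be checked against direct integration of the two conditional means. The sketch assumes the reconstruction $E(PTE) = (C_{11} - C_{12})\Phi(C_\alpha - C_2) + (\sigma_W\rho_2 - \sigma_R\rho_1)\phi(C_\alpha - C_2) + C_{12}$ with $Z_2 \sim N(C_2, 1)$; the function names and all parameter values are made up for illustration and are not taken from the tables.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def pte_mean_local(c11, c12, c2, sigma_r, rho1, sigma_w, rho2, c_alpha):
    """Closed form for E(PTE) under a local alternative (assumed reconstruction)."""
    z = c_alpha - c2
    return ((c11 - c12) * STD.cdf(z)
            + (sigma_w * rho2 - sigma_r * rho1) * STD.pdf(z)
            + c12)

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def pte_mean_local_numeric(c11, c12, c2, sigma_r, rho1, sigma_w, rho2, c_alpha):
    """Integrate E(Z11 | z) below c_alpha and E(Z12 | z) above it, z ~ N(c2, 1)."""
    def accept(t):
        return (c11 + sigma_r * rho1 * (t - c2)) * STD.pdf(t - c2)
    def reject(t):
        return (c12 + sigma_w * rho2 * (t - c2)) * STD.pdf(t - c2)
    return midpoint(accept, c2 - 12.0, c_alpha) + midpoint(reject, c_alpha, c2 + 12.0)

if __name__ == "__main__":
    args = (-0.4, -0.1, -0.5, 0.6, -0.1, 0.7, 0.4, 1.282)
    print(pte_mean_local(*args), pte_mean_local_numeric(*args))
```

Setting $C_{11} = C_{12} = C_2 = 0$ recovers the null-hypothesis expression $(\sigma_W\rho_2 - \sigma_R\rho_1)\phi(C_\alpha)$, whose sign depends only on the comparison of $\sigma_W\rho_2$ with $\sigma_R\rho_1$.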
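VARIANCE OF PTE UNDER $H_{NA}$
We first calculate Var(PTE).
Let
$Z_{11}^* = Z_{11} - C_{11}$, $Z_{12}^* = Z_{12} - C_{12}$, so $E(Z_{11}^*) = E(Z_{12}^*) = 0$.
Now
\[ Z_{11}^* \mid Z_2 \sim N\left[\sigma_R\rho_1(Z_2 - C_2),\ \sigma_R^2(1-\rho_1^2)\right], \qquad
   Z_{12}^* \mid Z_2 \sim N\left[\sigma_W\rho_2(Z_2 - C_2),\ \sigma_W^2(1-\rho_2^2)\right]. \]
Then
\[ Var(PTE) = \int_{-\infty}^{C_\alpha} E(Z_{11}^{*2} \mid Z_2)\, f_2(Z_2)\,dZ_2
   + \int_{C_\alpha}^{\infty} E(Z_{12}^{*2} \mid Z_2)\, f_2(Z_2)\,dZ_2. \]
Since $E(Z_{12}^{*2}) = \sigma_W^2$,
\[ Var(PTE) = \int_{-\infty}^{C_\alpha} \left[E(Z_{11}^{*2} \mid Z_2) - E(Z_{12}^{*2} \mid Z_2)\right] f_2(Z_2)\,dZ_2 + \sigma_W^2 \]
thus
\[ Var(PTE) = \int_{-\infty}^{C_\alpha} \left\{ \sigma_R^2\left[\rho_1^2(Z_2 - C_2)^2 + 1 - \rho_1^2\right]
   - \sigma_W^2\left[\rho_2^2(Z_2 - C_2)^2 + 1 - \rho_2^2\right] \right\} f_2(Z_2)\,dZ_2 + \sigma_W^2 \]
\[ = (C_\alpha - C_2)(\sigma_W^2\rho_2^2 - \sigma_R^2\rho_1^2)\,\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{(C_\alpha - C_2)^2}{2}\right]
   + (\sigma_R^2 - \sigma_W^2)\,\Phi(C_\alpha - C_2) + \sigma_W^2. \]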
In this chapter we have seen that by slight modifications of the PTE
bias and variance formulas developed in the last chapters we can produce
formulas for those quantities under specified local alternatives.
Previous
chapters have developed expressions for the stochastic behavior of the test
statistic and the two estimators.
Using LeCam's third Lemma, it is then
possible to calculate the required non-centrality parameters for the three
relevant statistics.
CHAPTER V
Numerical integration of relevant expressions was performed by
IMSL programs DCADRE and DBLIN.
The parameters which measure the performance of the PTE
are defined as follows:
BIAS:
The bias of the PTE for the parameter $\sqrt{N}\left(R_{NA} - \frac{1}{1+k}\right)$.
BIASR:
The comparable bias of the rank estimator.
BIASW:
The comparable bias of the Wilcoxon estimator.
RTMSR:
The ratio of the mean square error of the PTE to $\sigma_R^2$.
RTMSW:
The ratio of the mean square error of the PTE to $\sigma_W^2$.
For the test statistic itself, we calculate the standardized non-centrality
parameter $C_2$.
Table 1 gives the combination of scores and local alternatives which
were used in this study:
TABLE 1
SCORE                               LOCAL ALTERNATIVE
Log x                               Linear Hazard
                                    Makeham
Pseudo-efficient linear hazard      Linear Hazard
Pseudo-efficient Makeham            Makeham
Savage Score (Log-Rank)             Linear Hazard
                                    Makeham
Wilcoxon Score                      Linear Hazard
                                    Makeham
The values of $C_\alpha$ used were
$C_\alpha$ = .84, 1.282, 1.645 corresponding to one-sided
tests of level .20, .10, .05 respectively.
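The three critical values are the upper standard normal quantiles for the stated levels, which can be reproduced with the Python standard library. The helper name below is illustrative only.

```python
from statistics import NormalDist

def one_sided_critical_value(alpha):
    """Upper-tail standard normal critical value for a one-sided level-alpha test."""
    return NormalDist().inv_cdf(1.0 - alpha)

if __name__ == "__main__":
    for alpha in (0.20, 0.10, 0.05):
        print(alpha, round(one_sided_critical_value(alpha), 3))
```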
THE PTE UNDER LOCAL ALTERNATIVES
It is clear from Tables 2 through 9 that the biases and ratios of
mean square error to variance vary little over the range of
k = 1.1 to k = 1.9.
There is more variability in the non-centrality parameters.
Not surprisingly, the absolute value of the non-centrality parameter
decreases as $k$ increases, since, in that case, the relationship of
the groups moves further away from the state of homogeneity in which
$\hat{k}_{NR}$ is asymptotically efficient for $k$.
In addition, for each $k$, there are various patterns in the computed
parameters as $\alpha$ decreases.
For the Makeham local alternative, the
absolute bias increases with decreasing $\alpha$ for all four scores.
However, RTMSR and RTMSW increase with decreasing $\alpha$ for the log x and
pseudo-efficient scores; whereas they decrease for the Savage and Wilcoxon
scores.
The similarity between the results for the Savage and Wilcoxon scores
follows from the fact that $-\log(1-v) \approx v$ for small $v$.
In both cases,
however, the RTMSR ratios approach a number very close to 1.
For, as $\alpha$
decreases to zero in the limit, the variance of the PTE will approach
$\sigma_R^2$ since we will accept $H_0$ with probability approaching one.
Thus, RTMSR $\to 1 + C_{11}^2/\sigma_R^2 \approx 1$
since $C_{11}^2$ is small compared to $\sigma_R^2$.
In the
case of the parameter RTMSW, RTMSW increases with decreasing $\alpha$ when the
log x and pseudo-efficient scores are used, and decreases with decreasing
$\alpha$ when the Savage and Wilcoxon scores are used.
In this case,
RTMSW $\to (\sigma_R^2 + C_{11}^2)/\sigma_W^2 < 1$ as $\alpha \to 0$ since
$\sigma_R^2/\sigma_W^2$ is considerably less
than 1 for the given range of $k$.
The behavior of the PTE under a Linear Hazard local alternative is
somewhat different.
Except for one instance for the log x score, the
absolute bias increases with decreasing $\alpha$ for the Log x, Savage and
Wilcoxon scores.
The case of the pseudo-efficient score stands out
because it is the only case among all those in this investigation for
which the PTE bias becomes zero for some $\alpha = \alpha(k)$
for $1.1 \le k \le 1.9$, where $.05 \le \alpha \le .20$.
(See Table 7).
Finally, we note that in virtually all cases, the absolute value
of the PTE bias is less than that of the rank estimator under each local
alternative.
NON-CENTRALITY PARAMETERS
The non-centrality parameter $C_2$
results from two additive components: the
covariance of $\sqrt{N}\,T_N(k)$ and $\sqrt{N}(\hat{k}_{NR} - k)$ with $\log L$.
The effect of estimating
the nuisance parameter $k$ can be considerable.
One might expect that the
pseudo-efficient score for the Makeham local alternative would
produce a positive $C_2$.
However, the covariance of $\sqrt{N}(\hat{k}_{NR} - k)$ and $\log L$
is negative; so negative in fact that the positive contribution of
$\sqrt{N}\,T_N(k)$
is swamped by the negative contribution of $\sqrt{N}(\hat{k}_{NR} - k)$. This
extreme case shows that the effect of substituting an estimate of a nuisance
parameter can be very unpredictable since $C_2$ is so dependent upon the
log likelihood under the local alternative.
For instance, Tables 4 and 5 show that for the Makeham alternative,
the Savage and Wilcoxon scores have considerably more local power than
the other two scores.
However, the picture is exactly reversed for the
Linear Hazard alternative.
(See Tables 8 and 9).
We can understand this
phenomenon better by looking at what happens when
$k \to 1$ under a Makeham alternative.
In that case, $\sqrt{N}\,T_N(k)$ is directly proportional to
$\sqrt{N}\,\bar{Y}$ and so is $\sqrt{N}(\hat{k}_{NR} - k)$.
At the same time, $\log L$ is an increasing function
of $Y_i$. Thus, both components of the statistic contribute "large"
positive amounts to the overall covariance. However, in the Linear Hazard
case, $\log L$ is a parabolic function of $Y_i$
and so positive covariances
are cancelled by negative covariances.
Nevertheless, comparison of
Table 4 with Table 8 shows a remarkably similar pattern of PTE behavior
under both local alternatives in the face of quite different non-centrality
parameters.
Next we look at the source of bias and MSE ratio variation among
different scores for each of the two local alternatives.
PTE BIAS AND MSE RATIOS
In general, we would expect the MSE of the PTE to fall between
$\sigma_R^2$ and $\sigma_W^2$.
However, there are several occasions in which the MSE of
the PTE is less than the variance of the rank estimator.
The comparison
of the pseudo-efficient and Savage score for the Makeham alternative is
instructive (Tables 3 and 4).
Notice that the biases are virtually
identical so that the differences in the RTMSR arise from the fact that
the variance of the PTE
is greater when the Savage score is used.
The
explanation is evident when the equation for the variance of the PTE is
examined.
We wish to find the circumstances under which $\sigma_{PTE}^2 < \sigma_R^2$.
Letting $f = \sigma_W^2/\sigma_R^2 > 1\ \forall k$, and $Z = C_\alpha - C_2$,
then we can say that $\sigma_{PTE}^2 < \sigma_R^2$
if and only if:
\[ (f\rho_2^2 - \rho_1^2)\, Z\, \phi(Z) < (1-f)\,[1 - \Phi(Z)] \]
or
\[ \frac{\rho_1^2 - f\rho_2^2}{f-1} > \frac{1 - \Phi(Z)}{Z\,\phi(Z)}. \]
If the LHS $< 0$, then $\sigma_{PTE}^2 > \sigma_R^2$ when $Z > 0$.
Thus when $\rho_1^2 - \rho_2^2 < 0$, then $\sigma_{PTE}^2 > \sigma_R^2$.
This is in fact
the case for the Savage score.
However, if $\rho_1^2 \gg \rho_2^2$,
then it is possible for $\sigma_{PTE}^2 < \sigma_R^2$.
We can
then rewrite the inequality as
\[ \frac{f-1}{\rho_1^2 - f\rho_2^2} < \frac{Z\,\phi(Z)}{1 - \Phi(Z)}. \]
The RHS is increasing in $Z$
since the hazard function for the standard
normal distribution is increasing in $Z$ without bound. Thus, if $Z \gg 0$
and $\rho_1^2 \gg \rho_2^2$, then $\sigma_{PTE}^2 < \sigma_R^2$. This is what happens with the
pseudo-efficient score.
Since $C_2 < 0$, then $Z > C_\alpha$ and LHS $<$ RHS.
The foregoing is a good illustration of the fact that local optimality
of the score function is not necessarily related to the performance of
a preliminary test statistic. In fact, one of the two crucial quantities,
$\rho_1^2 - \rho_2^2$,
has absolutely nothing to do with the particular local alternative.
Finally, we note that the foregoing is equivalent to the statement that a
sufficient condition for RTMSR $> 1$ is that $\sigma_{RT} < \sigma_{WT}$.
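The equivalence between the variance comparison and the inequality in $Z$ can be checked numerically. The sketch assumes the variance expression $\sigma_{PTE}^2 = Z(\sigma_W^2\rho_2^2 - \sigma_R^2\rho_1^2)\phi(Z) + (\sigma_R^2 - \sigma_W^2)\Phi(Z) + \sigma_W^2$ with $Z = C_\alpha - C_2$, as reconstructed from the scan; the function names and the parameter values are hypothetical and chosen only to show one case on each side of the inequality.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def pte_variance(sigma_r2, sigma_w2, rho1, rho2, z):
    """Assumed PTE variance expression with z = C_alpha - C_2."""
    return (z * (sigma_w2 * rho2**2 - sigma_r2 * rho1**2) * STD.pdf(z)
            + (sigma_r2 - sigma_w2) * STD.cdf(z)
            + sigma_w2)

def beats_rank_estimator(sigma_r2, sigma_w2, rho1, rho2, z):
    """Check (f*rho2^2 - rho1^2) * z * phi(z) < (1 - f) * [1 - Phi(z)]."""
    f = sigma_w2 / sigma_r2
    return (f * rho2**2 - rho1**2) * z * STD.pdf(z) < (1.0 - f) * (1.0 - STD.cdf(z))

if __name__ == "__main__":
    # rho1^2 < rho2^2: the variance exceeds sigma_R^2 (the Savage-type pattern).
    print(pte_variance(1.0, 1.3, 0.2, 0.6, 1.0))
    # rho1^2 >> rho2^2 and large z: the variance falls below sigma_R^2.
    print(pte_variance(1.0, 1.3, 0.95, 0.1, 2.5))
```

In both illustrative cases the direct comparison with $\sigma_R^2$ agrees with the inequality, as the algebra requires.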
PTE BIAS
Since the bias of the rank MLE
estimator is always negative under
the two specified local alternatives, and the bias of the Wilcoxon estimator
is always zero, we should expect the bias of the PTE to be less negative
than the former but still negative.
On occasion, however, the PTE bias
is positive on the domain $1.1 < k < 1.9$ for some
$\alpha(k)$, $.05 < \alpha < .20$.
This is clearly the case for the pseudo-efficient linear hazard score.
(See Table 7).
As $\alpha \to 0$, the PTE bias must approach that of the rank MLE
and so since the bias is clearly a continuous function of $\alpha$, we know
that the bias is zero for some $\alpha$ in the range used in this study.
The general (but obviously not sufficient) condition for this situation
to occur is that $\rho_1 < 0$ and $\rho_2 > 0$.
In order to see this, we note that the PTE
bias can be written:
\[ (C_{11} - C_{12})\,\Phi(C_\alpha - C_2) + (\sigma_W\rho_2 - \sigma_R\rho_1)\,\phi(C_\alpha - C_2) \]
since the Wilcoxon statistic is unbiased for P(Y>X).
Since $C_{11} < C_{12}$
in all cases, the second term must be "as positive as possible" to make
the total positive.
This is what occurs in the case just cited.
It does
not occur for any other case.
The role of the non-centrality parameter is also important.
For the PTE
bias to be positive, we must have:
\[ (\sigma_W\rho_2 - \sigma_R\rho_1)\,\phi(C_\alpha - C_2) > (C_{12} - C_{11})\,\Phi(C_\alpha - C_2) \]
or
\[ E(PTE) > C_{12} > 0. \]
In this example $|\rho_1| \ll \rho_2$.
Referring to Table 7, we see that as $k$ increases,
$C_2$ decreases, and the PTE bias increases.
Thus the zero
PTE bias will occur with a test of size larger than .84.
The explanation
is that since $|\rho_1| \ll \rho_2$, the first term does not contribute much of
a negative component, whereas the decrease in $C_2$
allows the second term
to contribute a sizable positive component.
In summary, it is clear that the performance of the PTE both in
terms of bias and MSE
is a complicated function of the covariances of
the test statistic and the estimators of P(Y > X).
When viewed
globally, the size of the test and the non-centrality
parameter have
secondary influence.
When proportional hazards is true, Tables 10-14 show that the
bias of the PTE is always positive and is very small when either the
Savage or Wilcoxon scores are used.
However, the variance of the
PTE tends to be larger for those scores than for the others.
This chapter has demonstrated that the non-centrality parameter is
very much dependent upon the local alternative since it depends heavily
upon the covariance of the log likelihood and $\sqrt{N}(\hat{k}_{NR} - k)$. However, the
behavior of the PTE appears similar despite the alternative.
This is a
somewhat encouraging development since we would like to use a statistic
which will have predictable performance over a range of circumstances.
Another way of putting this is that local power is a property of a test
statistic and may be maximized under given circumstances.
However, this
study has suggested that considerations for optimal performance of a PTE in
asymptotic situations where a nuisance
parameter comes into play may
render local power irrelevant or a factor of diminished importance.
We end by stating that this
problem has often veered into territory
off the beaten path of traditional rank test theory.
Once the null
hypothesis does not require homogeneity (iid r.v.'s), consistency and local
optimality results no longer apply.
Therefore this problem does not appear
to fit into a nice self-contained mathematical-statistical structure out of
which "nice" results fall.
CHAPTER VI
....
The fact that the set of possible joint rank vectors does not have
equiprobable elements under $H_0$, except when $k = 1$, produces considerable
practical difficulties when the issue of censorship is addressed. For
instance, under Type II censoring, it is not difficult to show that for
the linear rank statistic to be a martingale (and thus permit the use of the
martingale central limit theorem), the constant score assigned to the
censored observations must be a function of the probability that the
(r+1)st observation comes from a specified group if the r-th observation
is the last one used. Although this probability can be expressed as
a function of k in this problem, the expression is useless because
it is so complex. If censoring is to be studied, some approximation
methods for usable censored scores may be the way to go.
In addition, the choice of the pair of estimators is not obvious
when censoring is present.
With random censoring, does one simply use
the censored version of the Cox estimator?
What modification of the
U-statistic is required?
Another topic for further research is the exploration of optimal
scores for specified local alternatives. This is an extremely difficult
problem due to the estimation of k and the fact that classical
optimality results require the null hypothesis to be homogeneity of two
groups. Even if one casts the problem in terms of a "restricted" one,
i.e., makes it a constrained maximization problem, the difficulties
are great because this problem is not one amenable to Lagrange
multipliers. It is a problem in infinite dimensional control theory.
BIBLIOGRAPHY
Abel, (1982). A note on the Mann-Whitney statistic for
Lehmann alternative. BIOMETRICAL JOURNAL 24, 565-570.
Anderson, (1982). Testing goodness-of-fit of Cox's regression
and life model. BIOMETRICS 38, 67-77.
Ashar, VG. (1970). On the use of preliminary tests in
regression. Unpublished Thesis, North Carolina State
University, Raleigh, NC.
Bancroft, TA. (1944). On biases in estimation due to the
use of preliminary tests of significance. ANN MATH
STATIST 15, 192-204.
Bancroft, TA. (1964). Analysis and inference for incompletely
specified models involving the use of preliminary test(s)
of significance. BIOMETRICS 20, 427-442.
Bancroft, TA and Han, CP. (1980). Inference based on conditionally
specified ANOVA models incorporating preliminary tests.
HANDBOOK OF STATISTICS, Vol 1, 407-441.
Basu, AP. (1981). The estimation of P(X < Y) for distributions
useful in life testing. NAVAL RESEARCH LOGISTICS QUARTERLY
28, 383-392.
Birnbaum, ZW. (1956). On a use of the Mann-Whitney statistic.
PROCEEDINGS OF THE THIRD BERKELEY SYMPOSIUM ON MATHEMATICAL
STATISTICS AND PROBABILITY, Vol 1, Univ. of Calif. Press,
13-17.
Birnbaum, ZW and McCarty, RC. (1958). A distribution-free upper
confidence bound for Pr(Y < X), based on independent
samples of X and Y. ANN MATH STATIST 29, 558-562.
Bock, ME, Yancey, TA and Judge, GG. (1973). The statistical
consequences of preliminary test estimators in regression.
J AMER STATIST ASSOC 68, 107-116.
Brook, RJ. (1976). On the use of a regret function to set
significance points in prior tests of estimation. J AMER
STATIST ASSOC 71, 126-131.
Chernoff, H and Savage, RI. (1958). Asymptotic normality and
efficiency of certain non-parametric test statistics. ANNALS
OF MATH. STAT. 29, 972-994.
Chipman, JS and Rao, MM. (1964). The testing of linear
restrictions in regression analysis. ECONOMETRICA 32,
198-209.
Doksum, KA and Yandell, BS. (1984). Tests for exponentiality.
HANDBOOK OF STATISTICS, Vol. 4, North-Holland Press,
579-611.
Epstein, B. (1960). Tests for the validity of the assumption
that the underlying distribution of life is exponential.
TECHNOMETRICS 2, 83-101.
Govindarajulu, Z. (1968). Distribution-free confidence bounds
for Pr(X < Y). ANN INSTITUTE STATIST MATH 20, 229-238.
Hajek, J and Sidak, Z. (1967). Theory of rank tests.
Academia, Prague.
Hill, RC, Judge, GG and Fomby, TB. (1978). On testing the
adequacy of a regression model. TECHNOMETRICS 20, 491-494.
Hoel, DG. (1972). A representation of mortality data by competing
risks. BIOMETRICS 28, 475-488.
Huntsberger, DV. (1955). A generalization of a preliminary
procedure for pooling data. ANNALS OF MATHEMATICAL
STATISTICS 26(4), 734-743.
Johnson, NL. (1975). Letter to the Editor. TECHNOMETRICS
17, 393.
Jureckova, J. (1969). Asymptotic linearity of a rank statistic
in regression parameter. ANNALS OF MATH. STAT. 40, 1889-1900.
Kelley, GD, Kelley, JA and Schucany, WR. (1976). Efficient
estimation of P(Y < X) in the exponential case.
TECHNOMETRICS 18, 359-360.
Larson, HJ and Bancroft, TA. (1963a). Sequential model
building for prediction in regression analysis, I. ANN
MATH STAT 34, 462-479.
Larson, HJ and Bancroft, TA. (1963b). Biases in prediction
by regression for certain incompletely specified models.
BIOMETRIKA 50, 391-402.
LeCam, L. (1960). Locally asymptotically normal families of
distributions. Univ. of Calif. Publ. in STAT. 3, 37-98.
Lehmann, EL. (1950). Notes on the theory of estimation.
Mimeographed lectures. University of California.
Lehmann, EL. (1951). Consistency and unbiasedness of certain
nonparametric tests. ANNALS OF MATH STAT 22, 165-179.
McRae, KB. (1971). Inference procedures for pairs of
distributions with proportional failure rate functions.
Ph.D. Dissertation, Oregon State University.
Md., AK, Saleh, E and Sen, PK. (1984). Nonparametric preliminary
test inference. HANDBOOK OF STATISTICS, Vol 4,
North-Holland Press, 275-297.
Mosteller, F. (1948). On pooling data. JASA 43, 231-242.
Nagelkerke, NJD, Oosting, J and Hart, AAM. (1984). A simple
test for goodness-of-fit of Cox's proportional hazards
model. BIOMETRICS 40, 483-486.
Neyman, J. (1959). Optimal asymptotic tests of composite
hypotheses. In Grenander, U. (Editor), Probability and
Statistics, The Harald Cramer Volume. Stockholm, 1959.
Pathak, PK, Zimmer, WJ and Williams, RE. (1979). Nonparametric
estimation of an acceleration parameter.
COMMUN STATIST-THEOR METH A8(4), 367-383.
Sathe, YS and Shah, SP. (1981). On estimating P(X < Y) for
the exponential distribution. COMMUN STATIST-THEOR METH
A10(1), 39-47.
Schoenfeld, D. (1980). Chi-squared goodness-of-fit tests for
proportional hazards regression model. BIOMETRIKA 67,
145-153.
Steck, GP, Zimmer, WJ and Williams, RE. (1974). Estimation
of parameters in acceleration models. PROC. 1974 ANNUAL
RELIABILITY AND MAINTAINABILITY SYMPOSIUM, Los Angeles,
California, 428-431.
Taulbee, JD. (1979). A general model for hazard rate with
covariables. BIOMETRICS 35, 439-450.
Tong, H. (1974). A note on the estimation of Pr(Y < X) in
the exponential case. TECHNOMETRICS 16, 625; errata,
17, 395.
Tsiatis, AA. (1981). A large sample study of Cox's regression
model. ANNALS OF STATISTICS 9(1), 93-108.
Ury, HK. (1972). On distribution-free confidence bounds for
Pr(Y < X). TECHNOMETRICS 14, 577-580.
Van Dantzig, D. (1951). On the consistency and the power of
Wilcoxon's two-sample test. KONINKLIJKE NEDERLANDSE
AKADEMIE VAN WETENSCHAPPEN PROCEEDINGS, Series A, 54, 1-8.
RESULTS
Tables 2 through 9 display results for the combinations of
scores and local alternatives for k = 1.1, 1.3, 1.5, 1.7, 1.9 and
critical values C_α which produce one-sided tests of size
α = .05, .10, .20.
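The critical values quoted in the table headings (.84, 1.282, 1.645) are upper-tail standard normal quantiles. The following is a minimal sketch, assuming the preliminary test is referred to the standard normal distribution; the function name `one_sided_critical_value` is introduced here only for illustration. It shows which test size goes with which critical value.

```python
from statistics import NormalDist


def one_sided_critical_value(alpha: float) -> float:
    """Upper-tail standard normal quantile z with P(Z > z) = alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


# Sizes listed from largest to smallest, so the critical values come out
# in the same order as the table headings (.84, 1.282, 1.645).
for alpha in (0.20, 0.10, 0.05):
    print(f"alpha = {alpha:.2f}  ->  C_alpha = {one_sided_critical_value(alpha):.3f}")
```

Read this way, the first entry of each triple in the tables corresponds to the test of size .20 and the last to the test of size .05.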
TABLE 2 (LOGX-MAKEHAM)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                RTMSW               C2
1.1   -.006, -.023, -.032   -.042   0       .952, .953, .969     .714, .715, .727    -.310
1.3   -.006, -.004, -.031   -.041   0       .952, .953, .969     .716, .727, .729    -.304
1.5   -.006, -.022, -.031   -.040   0       .952, .953, .968     .719, .719, .731    -.302
1.7   -.006, -.021, -.030   -.039   0       .952, .952, .968     .720, .74, .733     -.301
1.9   -.006, -.021, -.029   -.037   0       .952, .953, .968     .722, .722, .734    -.301
TABLE 3 (PSEUDO-EFFICIENT MAKEHAM-MAKEHAM)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                RTMSW               C2
1.1   -.023, -.032, -.037   -.042   0       .970, .972, .984     .728, .730, .738    -.430
1.3   -.021, -.031, -.036   -.041   0       .967, .970, .982     .727, .730, .739    -.413
1.5   -.019, -.030, -.035   -.040   0       .965, .968, .980     .728, .731, .740    -.399
1.7   -.018, -.028, -.033   -.039   0       .964, .966, .979     .730, .732, .741    -.387
1.9   -.016, -.027, -.032   -.037   0       .963, .963, .978     .731, .732, .742    -.377
TABLE 4 (SAVAGE-MAKEHAM)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                  RTMSW               C2
1.1   -.026, -.033, -.036   -.042   0       1.124, 1.076, 1.047    .844, .808, .786    +.494
1.3   -.025, -.031, -.035   -.041   0       1.123, 1.077, 1.049    .845, .811, .789    +.483
1.5   -.023, -.029, -.034   -.040   0       1.122, 1.079, 1.051    .847, .815, .794    +.474
1.7   -.021, -.027, -.032   -.039   0       1.122, 1.081, 1.054    .849, .818, .797    +.464
1.9   -.019, -.026, -.030   -.037   0       1.122, 1.082, 1.056    .851, .821, .801    +.455
TABLE 5 (WILCOXON-MAKEHAM)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                  RTMSW               C2
1.1    .003, -.009, -.020   -.042   0       1.164, 1.148, 1.122    .873, .862, .842    .515
1.3    .002, -.011, -.021   -.041   0       1.160, 1.143, 1.116    .873, .860, .840    .503
1.5    .000, -.012, -.021   -.040   0       1.156, 1.137, 1.111    .873, .858, .839    .491
1.7   -.001, -.012, -.021   -.039   0       1.152, 1.132, 1.106    .872, .857, .837    .479
1.9   -.002, -.013, -.021   -.037   0       1.149, 1.128, 1.101    .872, .855, .835    .468
TABLE 6 (LOGX-LINEAR HAZARD)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                RTMSW               C2
1.1   -.032, -.060, -.079   -.107   0       1.01, .969, .973     .757, .727, .730    .182
1.3   -.014, -.039, -.055   -.080   0       1.00, .959, .961     .752, .721, .723    .151
1.5   -.003, -.024, -.039   -.060   0       .994, .954, .955     .751, .720, .721    .123
1.7   +.005, -.014, -.027   -.017   0       .990, .951, .953     .749, .720, .721    .100
1.9   +.010, -.007, -.019   -.019   0       .987, .950, .952     .749, .720, .722    .082
TABLE 7 (PSEUDO-EFFICIENT LINEAR HAZARD-LINEAR HAZARD)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                  RTMSW               C2
1.1   +.021, -.017, -.048   -.107   0       1.140, 1.106, 1.085    .856, .830, .814    .430
1.3   +.030, -.004, -.031   -.080   0       1.115, 1.071, 1.049    .839, .806, .789    .323
1.5   +.037, +.005, -.019   -.060   0       1.096, 1.050, 1.029    .828, .793, .777    .247
1.7   +.042, +.012, -.010   -.017   0       1.083, 1.035, 1.016    .819, .784, .769    .195
1.9   +.045, +.017, -.004   -.019   0       1.722, 1.025, 1.001    .813, .777, .764    .156
TABLE 8 (SAVAGE-LINEAR HAZARD)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                  RTMSW               C2
1.1   -.086, -.097, -.102   -.107   0       1.10, 1.070, 1.058     .822, .803, .794    -.018
1.3   -.063, -.071, -.076   -.080   0       1.083, 1.056, 1.041    .815, .794, .784    -.029
1.5   -.046, -.053, -.056   -.060   0       1.079, 1.050, 1.034    .815, .792, .780    -.029
1.7   -.034, -.040, -.043   -.017   0       1.078, 1.048, 1.031    .816, .793, .780    -.028
1.9   -.025, -.030, -.033   -.019   0       1.079, 1.048, 1.030    .818, .795, .781    -.027
TABLE 9 (WILCOXON-LINEAR HAZARD)
C_α = .84, 1.282, 1.645

k     BIAS                  BIASR   BIASW   RTMSR                  RTMSW               C2
1.1   -.059, -.079, -.092   -.107   0       1.164, 1.141, 1.115    .874, .856, .836    .086
1.3   -.040, -.057, -.067   -.080   0       1.153, 1.124, 1.095    .867, .846, .824    .045
1.5   -.026, -.040, -.049   -.060   0       1.146, 1.114, 1.083    .865, .841, .818    .025
1.7   -.016, -.028, -.036   -.017   0       1.142, 1.109, 1.077    .864, .839, .815    .014
1.9   -.008, -.020, -.027   -.019   0       1.140, 1.106, 1.074    .864, .839, .814    .007
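One internal consistency check on Tables 2 through 9 can be sketched in a few lines. It rests on an assumption not stated in this section: that RTMSR and RTMSW are ratios of the same PTE mean squared error to the mean squared errors of the rank estimator and of the Wilcoxon (U-statistic) estimator, respectively. If so, the quotient RTMSW/RTMSR should not depend on the critical value. The rows below are the k = 1.1 entries of Tables 5 and 9; the helper name `implied_ratios` is hypothetical.

```python
# (RTMSR triple, RTMSW triple) for k = 1.1, one entry per critical value
# C_alpha = .84, 1.282, 1.645, as tabulated in Tables 5 and 9.
rows = {
    "Table 5 (Wilcoxon-Makeham), k = 1.1": (
        (1.164, 1.148, 1.122),
        (0.873, 0.862, 0.842),
    ),
    "Table 9 (Wilcoxon-Linear Hazard), k = 1.1": (
        (1.164, 1.141, 1.115),
        (0.874, 0.856, 0.836),
    ),
}


def implied_ratios(rtmsr, rtmsw):
    """RTMSW / RTMSR for each critical value, rounded to 3 decimals."""
    return [round(w / r, 3) for r, w in zip(rtmsr, rtmsw)]


for label, (rtmsr, rtmsw) in rows.items():
    print(label, implied_ratios(rtmsr, rtmsw))
```

Under the stated assumption, a quotient that is stable across the three critical values is what one would expect, and a table entry that breaks the pattern is a candidate transcription error.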
TABLE 10 (LogX)

k     BIAS               RTMSR               RTMSW
1.1   .042, .026, .015   .985, .951, .953    .739, .714, .715
1.3   .041, .026, .015   .984, .951, .953    .740, .715, .717
1.5   .039, .025, .014   .983, .950, .953    .742, .717, .719
1.7   .038, .024, .014   .982, .950, .953    .743, .719, .721
1.9   .036, .022, .013   .982, .950, .953    .744, .720, .723
TABLE 11 (Pseudo-Efficient Makeham)

k     BIAS               RTMSR               RTMSW
1.1   .023, .015, .009   1.00, .968, .967    .750, .727, .726
1.3   .025, .015, .009   1.00, .965, .965    .750, .726, .726
1.5   .025, .016, .009   .994, .963, .964    .751, .727, .727
1.7   .025, .016, .009   .993, .962, .963    .751, .728, .729
1.9   .025, .015, .009   .992, .962, .963    .752, .729, .730
TABLE 12 (Savage)

k     BIAS               RTMSR                  RTMSW
1.1   .000, .000, .000   1.067, 1.034, 1.017    .801, .776, .763
1.3   .001, .001, .000   1.070, 1.036, 1.019    .805, .780, .767
1.5   .002, .001, .001   1.072, 1.040, 1.022    .810, .785, .771
1.7   .003, .002, .001   1.075, 1.043, 1.024    .814, .789, .775
1.9   .004, .003, .001   1.078, 1.046, 1.026    .818, .793, .778
TABLE 13 (Wilcoxon)

k     BIAS               RTMSR                  RTMSW
1.1   .022, .014, .008   1.147, 1.109, 1.073    .861, .832, .805
1.3   .022, .014, .008   1.145, 1.108, 1.073    .862, .833, .807
1.5   .021, .013, .008   1.143, 1.106, 1.072    .863, .835, .809
1.7   .021, .013, .008   1.142, 1.105, 1.071    .864, .836, .811
1.9   .020, .013, .007   1.141, 1.104, 1.070    .865, .837, .812
TABLE 14 (Pseudo-Efficient Linear Hazard)

k     BIAS               RTMSR                  RTMSW
1.1   .070, .044, .026   1.126, 1.078, 1.048    .845, .809, .786
1.3   .069, .043, .025   1.101, 1.055, 1.030    .828, .794, .775
1.5   .067, .042, .025   1.085, 1.041, 1.019    .819, .786, .770
1.7   .066, .041, .024   1.074, 1.030, 1.012    .813, .780, .766
1.9   .064, .040, .024   1.065, 1.022, 1.006    .808, .775, .763
APPENDIX A
, p.
P (l-G
n2
If the last observation is from
observat ion is from
M· k.
1
~
the ineerva1
0
<~.
k (J.-F
n1
i
A
)
1£ the last
Thus
1
CD
I o M(Fn ,Gn
On
+
then /~, 0,
G,
1
then
F,
)
1-A
< 1,
)d~-~ 0
2
-~
_-
(N 2) •
p
all except one of the remainder
terms require only the check on condition (4) of the Chernoff-Savage
theorem.
Let
H· (l-A) u
+ VA.
0 < u, v. < 1
and
constant.
(i)
1
(l-u) + k
p 1-v
M(u,v)·
aMI·
(ii) lau
Now
raMI_
av
< 1
k
P I [pC 1-u) + k i .
1-v
P ( 1-v)
1-v
[P(1-u)+(1-v)]2
1-v
~
C(l-H).
Thus
let
C be a generic
(iii)
C
(I-v) 2
a2M
(lu2 •
C(l-v)
(I-H) 3
<
-<
C(l-H)-2 _< C [H(l-H)
r2
Ip(l-u)+k]
I-v
2
a M
lav 21
Now
C [1-u + (l-u)2 ]
I-v
I-v
<
-
I-H
I-v
l-u : > I-u
• A + P I-v
I=V
-<
Thus
C [H (I-H)
C
r2
"
The only term which needs a further result is the mixed partial
term, $R_{4N}$.
Note that
C(l-u)
(I-v) 3
C
-<
(I-v) 2
+
<
[p Cl - u )+k]2
I-v
C
(I-B)
Now
R4N
2
C
+
C
c
-
(I-H) 3
<k
[B(l-H)
r 2•
•
By a result subsequent to the original Chernoff-Savage paper
IN
sup
O<~ <1
Thus
IR4N I
-
•
I'ii(l-u)
fi(l-F)
I
<
<
lun-u I
O<~<l
C
N
'nl
•
op (1).
!G(l-G)
2
a M
1ii2
aF aG
n
l n2
-5/2 +
I
O<·~<
H(l-H) [B(l-B) ]
1
(v a)d~
<5
d~
~ C
N
I
s
[
H(I-H)]
- 3/2 + 15
dH
Ne:
<
-
dH
1 H3/2 - 15
11
C
N
CN
then
R •
4N
(N- 1/2)
0
P
.<
-
k
N
1/2 - 0
•N
APPENDIX B
CONDITIONS FOR CALCULATION OF B
1. for
E(logx + log k*+ c) • log k* - log k where
log x, note that
C - Euler's constant.
B -
11m
Thus
108
k~
k*-los k _
d 108 k*
dk*
k* - k
-lex
For J[H(x,k] - 2 (l-e
aJI _ -kx x
lak 2xe +"2
") + ~2
< 2x
2
I
.
k*-k
1
- k
3
2
+"2x which is integrable.
1
For
x
J [H (x, k)] • - -
II.
In order to simplify notation we let
+ 2x - - 2
2
"it"" k
,
B·
o.
$J(H(x,k)) = J(u)$. Under $H_0$ the transformed random variables are exp(1)
and exp(k), i.e.,
$$H(x) = (1-\lambda)(1-e^{-x}) + \lambda(1-e^{-kx}),$$
$$|H'(x)| = (1-\lambda)e^{-x} + k\lambda e^{-kx},$$
$$|H''(x)| = (1-\lambda)e^{-x} + k^{2}\lambda e^{-kx}.$$
Note that if $k \ge 1$, then
$$1-e^{-x} \le H(x) \le 1-e^{-kx} \;\Longrightarrow\; -\tfrac{1}{k}\log(1-u) \le x \le -\log(1-u).$$
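The bounds just stated can be checked numerically. The sketch below assumes the mixture form $H(x) = (1-\lambda)(1-e^{-x}) + \lambda(1-e^{-kx})$ with $u = H(x)$ and $k \ge 1$; the function names and the grid of test values are illustrative choices, not part of the original derivation.

```python
import math


def mixture_cdf(x: float, k: float, lam: float) -> float:
    """H(x) = (1 - lam) * (1 - exp(-x)) + lam * (1 - exp(-k * x))."""
    return (1.0 - lam) * (1.0 - math.exp(-x)) + lam * (1.0 - math.exp(-k * x))


def bounds_hold(x: float, k: float, lam: float) -> bool:
    """Check 1 - e^{-x} <= H(x) <= 1 - e^{-kx} and the implied bounds on x."""
    u = mixture_cdf(x, k, lam)
    cdf_ok = (1.0 - math.exp(-x) <= u + 1e-12
              and u <= 1.0 - math.exp(-k * x) + 1e-12)
    lower = -math.log(1.0 - u) / k   # -(1/k) log(1 - u)
    upper = -math.log(1.0 - u)       # -log(1 - u)
    # Small tolerances absorb floating-point rounding at the k = 1 boundary.
    return cdf_ok and lower - 1e-9 <= x <= upper + 1e-9


checks = [bounds_hold(x, k, lam)
          for x in (0.01, 0.5, 2.0, 10.0)
          for k in (1.0, 1.5, 1.9)
          for lam in (0.3, 0.5)]
print(all(checks))
```

When k = 1 the two bounds coincide and x equals $-\log(1-u)$, which is why a small numerical tolerance is used.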
Log x
x~oo
Let
<..
_< - log (l-u) => log x
X
T-l
u·
T large.
--
T '
Then
[u(l-u) - 1/2 + 0 • (T-l)
T
log [- log (l-u)] •
log
- 1/2
X
_< log log T
+ 0
and
1/2-0
-
2
T
, T large for
some 0 > O.
Clearly, log log T_<log T,
_
log T. < T - 1/2-0.
Let
so it is sufficient to show that
T. e v.
Then we
v large, or log v 5v (1/2-0).
logvv
DlUS t
log v
But
v
s h ow that
~Oa8
~ 0 < ~ - 0 where O. < 0.
v~oo
< 1/2
x~O
1
J' (u) •
,
H' (0) > O.
J'(u) •
This
xH (x)
Near
u • 0, - log(l-u) • U
u
2
u
£. -e
x'-
c
log(l-u)
3
+ If + 3T +
Thus £. <
x-
x ...
H'(x)
Thus
~
00
l-H(x) for
1
<..L < 1
.>
H' (x) - l-u - u(l-u)
1
xU' (x)
IJ"(u)
r+{)
IJ" (u)
•
H"(x)
-
.~---:-
x[H' (x)]3
k > 1.
< ~c=u(l-u)
V
_<
e v(1/2-0) ,
1st Term:
H'(o) > O.
Thus the first term reduces to the previous
case.
2
nd Term:
1
c
• -....£-. <
x
•
2 -
<
.!L
u
log 2 (l-u)
[u(l-u)]
x+
B"(x)
1st. Term.•
1.
+
1
CD
c
so
t
x[H' (x)]
-+
00:
x
~
log T where
2
2
c
<
[u(l-u)]
2
x - 1
2
2
1
T·
x[H'(x)]
k
2[1_~-kx] +
x
c
<
x[H'(x)]3
< _1_
1-u
H' (x)
2
H"(x)
Then
B' (x)
Now
c
<
2
1-u
but
log T
~
1/2 - tS
T
as
show earlier.
IJ"(v) I
x
+
00
k
- x
2k -kx + k
e
J' (H(x) •
<
2
H t (x)
•
. (k-1)
e
e
<
C
x
~
k
(I-A) -x + kA - x
e
00
+1
-e_
IJ"(v) I
-kx+ k]
H"(x)
'2
12ke
J"(H(x) •
-kx
Ce
+
[H' (x)]3
H" (x)
H' (x)
[H' (x)] 2
Thus
1
<
H' (x)
1-u
<.-
Jt 2
q
by letting
x
-+
co:
C(x 2+x).
T· e
when
2
since
'
k > 1.
(l-u)
2x
k
2
/ J(U) /
[u(l-u) ]
1
<
-
<
[H' (x)2
1
<
1
C
I J"(H(x)/
Now
2
x
1
k2
-
5
(10gT)2
as before.
But
as before.
<
IJ'(H(x)1
Cx
•
Now
•
x
H' (x)
H' (x)
<
•
u(l-u)
[u (1-U)]- 3/2+0
J" (u)
I J" (H(x) I
<
x
x
-+
co:
+
C [x
[H' (x)]2
<
[U(l-ill-
•
[u(l-u)]- 5/2+0
[(l-ii) ]2
[H' (x)]2
- log(l-U)
J(u) • 10gT
J'(u)·
J " (l1') •
so
l~. ~
<
(i-u)2 -
_1_
(10gT)2
J(u)
~
[U(1_u)-1/2+0
1
-(l-u)
1
[u(l-u) ]
2
~
l 2 O
T / -
APPENDIX C
We applied the linear rank statistic using Wilcoxon scores, $T_N(\hat{k})$, to
a data set produced by Hoel (1972). Two groups of male mice were given
300 rads of radiation and followed for cancer incidence. One group was
maintained in a germ-free environment. Three causes of death were recorded
and we compared the two groups with respect to "other causes." (See table
on next page). Each group had one tie, so the tied observations were
deleted, so that $n_1 = 38$, $n_2 = 37$.
Figure 1 shows the survivorship curves for the two groups. Group 2 is
the germ-free group. Figure 2 is a log(-log survival) plot for a visual
assessment of proportional hazards. If proportional hazards holds, then the
two lines should be parallel. Such a relationship appears to hold generally.
The results are as follows. The rank maximum likelihood estimate of k
is 6.08. The centering constant is thus calculated to be .3026, and
$N \,\mathrm{Var}(T_N(\hat{k}) - \mu)$ is .1116. Using the scores from Group 1 (since
k > 1), we find that $T_N(\hat{k}) = .3263$. Thus the asymptotically standard
normal statistic under $H_0$ is .751, clearly non-significant. In other words,
the data is consistent with the notion that these two samples come from
populations whose survival experience is related by proportional hazards
with a proportionality constant of approximately six.
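The reason parallel lines on the log(-log survival) scale indicate proportional hazards can be illustrated with a short sketch. It assumes the proportional hazards relation is written as one survival function equal to the other raised to the power k; which group plays which role is not fixed here, and the function names are hypothetical. The value 6.08 is the rank maximum likelihood estimate reported above, and the survival probabilities are arbitrary illustration points, not the mouse data.

```python
import math


def cloglog(survival: float) -> float:
    """log(-log S) for a survival probability 0 < S < 1."""
    return math.log(-math.log(survival))


def vertical_gap(s1: float, k: float) -> float:
    """Gap between the two curves when S2 = S1 ** k (proportional hazards)."""
    return cloglog(s1 ** k) - cloglog(s1)


k = 6.08  # rank maximum likelihood estimate reported in this appendix
gaps = [vertical_gap(s1, k) for s1 in (0.95, 0.75, 0.50, 0.25)]

# Every gap equals log(k), whatever the baseline survival probability.
print([round(g, 4) for g in gaps], round(math.log(k), 4))
```

Since log(-log S1^k) = log k + log(-log S1), the two curves differ by the constant log k at every time point, so a roughly constant vertical separation in Figure 2 is what proportional hazards predicts.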
Data set (continued)

Other causes (39%)
40, 42, 51, 62, 163, 179, 206, 222, 228,
252, 249, 282, 324, 333, 341, 366, 385, 407,
420, 431, 441, 461, 462, 482, 517, 517, 524,
564, 567, 586, 619, 620, 621, 622, 647, 651,
686, 761, 763

Germ-free group

Thymic lymphoma (22%)
158, 192, 193, 194, 195, 202, 212, 215, 229,
230, 237, 240, 244, 247, 259, 300, 301, 321,
337, 415, 434, 444, 485, 496, 529, 537, 624,
707, 800

Reticulum cell sarcoma (18%)
430, 590, 606, 638, 655, 679, 691, 693, 696,
747, 752, 760, 778, 821, 986

Other causes (46%)
136, 246, 255, 376, 421, 565, 616, 617, 652,
655, 658, 660, 662, 675, 681, 734, 736, 737,
757, 769, 777, 800, 807, 825, 855, 857, 864,
868, 870, 870, 873, 882, 895, 910, 934, 942,
1015, 1019

Source: Hoel (1972). For discussion see Sections 1.1.1 and 7.
[Figure 1: survivorship curves for the two groups.]
[Figure 2: log(-log survival) plot for the two groups.]