Sen, Pranab Kumar; (1983).Subhypotheses Testing Against Restricted Alternatives for the Cox Regression Model."

..
•
.e
SUBHYPOTHESES TESHNG AGAINST RESTRICTED ALTERNATIVES
FOR THE COX REGRESSlQN MODEL
by
Pranab Kumar Sen
•
Department of Biostatistics
University of North Carolina at Chapel Hill
Ins titute of Stati stics Mimeo Seri es No. 1425
January 1983
Subhypotheses Testing Against Restricted Alternatives for the Cox
Regression Model
•
BY PRANAB KUMAR SEN
Department of
Biostatistics~
•
University of North
North Carolina~ U.S.A •
Carolina~
Chapel
Hill~
SUMMARY
For the Cox regression model involving both design and concomitant
variates, testing of subhypotheses against restricted alternatives based
on the partial likelihood scores is considered. Roy's union-intersection
principle plays a vital role in this context. Properties of the proposed
tests are also studied.
•
Some key words : chi-bar distribution;
Kuhn~Tucker-Lagrange formula;
partial
likelihood; several sample problem; survival function; union-intersection
principle.
1. INTRODUCTION
We consider here the general regression model,due to Cox(1972) , involving
both design variables (controlled and nonstochastic) and COncomitant variables
(stochastic). It is assumed that the
i~
subject (having survival time Y ) has
i
the hazard rate (given the design variate c. and the covariate z.)
1
T
h. (t; c . , z .) = h (t) •exp ( S c.
111
0
1
where h (t), the hazard rate for c.
+
1
T
y z. ), i =I, •• ., N,
1
(L 1)
= 0, z. = 0 , is an unknown, arbitrary
0 1 1
00
non-negative function (for which J
o
h (t)dt
0
=
+00
),
S is the vector of un-
known parameters relating to the design effects and y is the unknown parameter vector relating to the effects of the concomitant variates. Generally,
we are interested in the null hypothesis : H :
O
S = 0 against S to, where
we treat y as a nuisance parameter. In many problems of practical interest,
though the null hypothesis remains the same, the alternative hypotheses may
-2-
be restricted in nature. For example, if we consider the k (=r+l) sample
problem involving a control and r(? 1) treatment groups, then the c
T
may be
i
either of the vectors (0,0"00,0), (1,0'000,0), (0,1'000,0)'000,(0'000,0,1),
B.J
so that if we write ST = (Sl'.oo,S ), then
r
stands for the jth treatment
effect (over the control), for j=l,ooo,r. In such a case, against the null
a ),
hypothesis of no treatment effect (i. e., 13 =
.
•
we may be interested in the
alternatives that none of the treatment is inferior to the control, ioeo,
>
H : B ~ a , j=l,ooo,r, with at least one strict inequalityo (1 2)
j
This may be termed an orthant alternative, as in (102), 6 belollgs to the
0
positive orthant
0
A second hypothesis of interest may be the ordering of
the treatment effects when H may not hold, ioe o,
O
*
H : Bl ~o.o~ B , with at least one strict inequalityo
r
This may be termed an ordered alternative
0
(103)
We are basically interested
in testing for H : S = a against an orthant or ordered alternative o A third
O
case of interest is the following
H*> :
O~
Bl
< oo.~
Sr,with at least one strict inequality,
(104)
and this may be termed an ordered orthant alternative.
As is usual the case, the survival times Y. may not be observable; the
1
observable random variables are the X. = mine Y., T.) =Y.AT. and
11111
o.1
=
I(Y. < T.) , where the censoringvariablesT. are either fixed or independent
1 -
1
1
of Y., given c"z'Q Further, the covariates z. are assumed to be given at
111
1
the start, so that c., z. are observableo Therefore, based on the set
1
1
(X.,o.,c.,z.), i=l,o •• ,N, our problem is to test for HO:S =
1
1
1
H* or H*>
1
0
a
.
>
agalnst H
For this testing problem, the Cox (1972) omnibus test based on
the partial likelihood is a valid one, but is generally not very efficient o
The situation is very much comparable with the classical parametric case,
where the global likelihood ratio test may not be very efficient for restricted
•
-3-
alternatives , for which there are other versions having better power properties
[ see, for example, Barlow et. al. (1972, ch 3
0
&4)J.
In the multivariate non-
parametric case, Chinchilli and Sen (1981 a,b) have used the Roy (.1953) union-
•
•
intersection (UI-) principle for generating rank order tests for restricted
alternatives. In the current problem, as in Cox (1972), we shall make use of
the partial likelihood function
and incorporate the UI-principle to generate
alternative tests for the restricted alternative problems in (1. 2), (1. 3) or
(1.4). Along with the preliminay notions, the partial likelihood, UI-principle
and the proposed test procedures are considered in Section 2 Section 3 is
0
devoted to the study of the distribution theory of the tests under the ntlll
hypothesis. General comments on the non-null case are made in the concluding
•
section.
2. THE PROPOSED TESTS
Note that the partial lilelihood function is given by
=
P
LN(13,y)
NIT {
N '
~'-l
i=l
J-
Thus, under H : 13 = 0,
O
Tc
yT zi )
T
I(X.>X.) exp( 13 c.
exp( 13
J-
N
N
1
t
= N
,,'
yT z .
=0
J
'
N
1
1
N
J=
(MPLE)
AO
• (2 2)
J-
1
J=
J
0
TAO
. N
zTYN
N
z'YN
(.ElI(X.>X.)c.e J )/(.E,'lI(X.>X.)e J
)}.
'~l6.{ c.
1=
0
of the system of equations:
Consider then the partial likelihood scores
A
k2
P
UN = N- (a/a13)10g LN(13,y)I13=o,y=yO
_~
(2 1)
+
y~, the maximum partial likelihood estimator
z. - ('~l I(X.>X.)z.e
l 6.{
1
1
J=
'J- 1 J
.E
+,
1
of Y is obtained as the solution
1=
i
J-
1
(2.3)
Also, let
•
I
-V = -N -1(a2/ ~f,)Oa
Q
T)log LP ( 13 ,y) 13=o,y=y~
A
ll
N
-
-T
V12 = V21 = -N
-1
I
2
T
P
(0 /aBay ) log LN(13,y) 13=o,y=y~
-V = -N -1 (a 2/ayayT) log LP (13,y) IB=O,y=y~
22
N
(2.4 )
(2 5)
0
(2 6)
0
-4-
VIl • 2 = VII
- -1-
V12 V22 V21 •
(2.7)
TAO
Note that if we let w.. = I(X.>X.)exp(z.y ), for i,j=l, ••• ,N, then VII =
1.)
)- 1.
) N
-1 N
{N
T N
N
N
T
N
2
N L1= 10'1. E.)= 1 w..
c.c./E.
.c.) (E.)= lW'1.))
.c.)/(X.)= lW")
1))
] )= lW"1) -(E.)= IW'1)J
1J } , and
similar expressions hold for VI2 (with the cr being replaced by zr ) and
.
T
T
V22 ( with the c. and c. being replaced by z. and z. , respectively).
]
J
]
J
•
The omnibus partial likelihood ratio test statistic ( for testing HO:S =
o
against S f 0 ), due to Cox(1972), is given by
A
=
A
T UN( Vll • 2) UN'
(2.8)
where A- stands for a generalized inverse of A. To motivate the proposed tests,
we may note that by definition
J:~
= sup{
(aT~N)/(aTVll.2a)~:
a fO}. Or, in
other words, it is the global maximum.of the normalized linear combinations
>
A
of the elements of UN • For testing H vs H
O
in (1.2), we make use of the
union-intersection principle. Note that B> 0 <=>
T
as> 0 , for every a > O.
An asymptotically efficient test for H against the specific alternative H: :
O
T A T
1
as> 0 is based on the test statistic aTuN/(a vll.2a)~ , rejecting the null
hypothesis for large positive values of the statistic. Also, for each a, it
follows from the results of Cox(1972), Tsiatis(1981), Sen(198l ) and Anderson
TA
T~
and Gill (1982), among others, that under HO ' a UN/(a Vll • 2a)
has closely
the normal distribution with 0 mean and unit variance? so that the normal
percentile point may be used to demarcate the critical region. Hence, according
to the Roy (1953) UI-principle, it seems appropriate to choose the test
>
statistic for testing H vs H in (2.1) as
O
~ (1)
~
}
(2.9)
ll • 2a) : a > 0
A
..
Th us,. we nee d to maX1m1.Se
Vll.2a = constant. If
aTUN sub'Ject to a > 0 an d aTTA
Twe let h(a) = -a UN ' hI (a) = -a and h 2 (a) = (a v lL2 a - 1), then the problem
,JvN
= sup
{T
A
a UN/(a
TV
reduces to minimize h(a) subject to the constraints: hI (a)
For this
non~linear
2
0 and h (a) = O.
2
programming problem, the Kuhn-Tucker-Lagrange (KTL-) point
formula theorem applies, and, we arrive at the following solution.
•
-5-
A
Note that UN has
r(~
1) components.Let a be any subset of P ={l, ••• ,r}
P~ a
~
AT
(following possible rearrangements ) UN and
=
and a be the complementary subset (
( ~11.2(aa)
V11. 2 -
and
•
P ). For each a , we partition
VlL 2
~lL2(aa)
(aa)
as
(2.10)
)
lL2(aa)
a , let
Also, for each
- -1
A
= UN(a) - VII. 2 (aa) VI 1. 2 (aa) UN (a)
-*
V
lL 2 (a) =
VIL 2 (aa)
- 'V11. 2 (aa)
Vl~: 2 (aa) VlL 2 (Cia)
(2.12)
Then, the UI-test statistic in (2.9) is given by
(2.13)
T
in (1.4). Let us write 8 =
*>
Consider next the test for H : 8 = 0 vs. H
O
T
=
(8 1 ' ••• , 8r)' A
T
Also, we write c.
1
.QQ,
(A 1' ••• , Ar) and A. =
J
=
S. - 8. 1 ' j=l, ••• ,r, where 80 = O.
J
J-
(c'l'
••• 'c.l r ) , i=l, ••• ,N and
. 1
r
N, where, we take d .. = E . c.
1)
S=J
1S
T
d.1 = (d.l,
••• ,d.l r ), i=l,
. 1
, for j=l' •• Q,r and i=l,. • .,N. With
these changes in notations, we may rewrite (1.1) as
T
T
(2.14)
h o (t).exp(. A d.1 + Y z.1 ), i=l, •• .,N.
In terms of this reparameterized model in (2.14), the null hypothesis H : 8 = 0
O
reduces to that of H : A = 0 , while the ordered orthant alternative in
O
>
(1.4) reduces to the ordinary orthant alternative H : A > O. As such,
if in (2.1) through (2.7) as well as in (2.10) through (2.12), we replace
the c. by the d. , then the resulting statistic in (2.13) would be the
1
•
1
test statistic for testing H against the ordered orthant alternative •
O
Finally, we consider the case of the ordinary ordered alternative hypothesis in (1.3), where, under the alternative,
81 need not be
non-n~gative.
With the reparameterization in (2.14), our null hypothesis is H : A = 0 ,
O
while, the alternative is
A. > 0, for j=2, ••• ,r, while
J -
As such, in this case, in (2.9), we allow for a
T
Al is unrestricted.
= (al'.'.' a r ) , a l to be
-6-
unrestrained, while the a.,j=2,ooo,r are all non-negative o For the single element
J
°
. A*
~
set a = 1 , a = {2,ooo,r}= P, say, we defme UN(l) and VlL2 (1) (scaler) as in
2
A*
2 -*
AT
(2.11) and (2 012), and let QN(l) = (UN(l)) /V l 1. 2 (1). Further, we let UN =
A AoT
AoT
A
A
_
(UNl'UN ), UN = (UN2 ,00o,UNr ) and the minor of V1l02 comprising the last r-1
~
rows and columns is denoted by ~102 0 Then, for every aE pO, we define U~~a)
A
O*
J
:-:0*
UNCii') , VlL2 (a) etco, as in (2 11)-(2.12) (but for the r-l dimensional case).
0
•
Note that we are to work here with the reparameterized model in (2.14), so that
for all these statistics, we use the vectors d. instead of c .• Then'S:N(2). the
1.
1.
proposed test statistic, for testing H against H* ,is given by
O
-1
. 2
AO*T ~*-l
A O*
~
A O*
:-:0
0
}
(2 15)
f1(;..~~po {(QN(1)+UN(a)VlL2(a)UN(a)) 2 I(UN(a) > 0)I(V1L2(aa)UN(a)~O)
0
In passing, we may remark that the UI-test statistic in (2 13) or (2.15)
0
may be readily extended to test for H : 8 =
O
a
.
**
T
T
T
aga1.nst H :{S =(8(1),8(2)):
8(2) ~ 0, 6(1) f O}
or having an ordered relationship among the elements of
SCI)
unrestrained o The only difference would be to replace in
8(2) • leaving
(2 015)
.Q,
Q~(l)
by a quadratic form in U:(.Q,) with discriminant
V~l~~(.Q,)
(where
is the set of indices of 8(1)) and,restricted to the subset of indices of
S(2)' the other statistics are to be computed as in (2.11), (2 12) and (2 015).
0
Further, if instead of testing H : 8
O
=
0 , we like to test for H : 8(1)
O
against an orthant or ordered alternative relating to
=
a
B(l) alone, then the
testing problem may be handled similarly by the following reparameterization o
(T
T)
cT
(aT
T) an d zi*T = (T
T)
·
Wr1.te
cT
t.J(2)' Y
c i (2),zi'
i = ci(l)' c i (2) , i=l,.oo,N,.., =
for i=l,ooo,N. Then, the model in (1 1) may be written as
0
•
T
T *
ho(t)exp( 8(1)c i (1) + t;, zi ) , i=l,ooo,N ,
(2 016)
where the z.* involve partly nonstochastic and partly conditionally fixed
1.
stochastic elements o Treating t;, as a nuisance parameter vector, we may then
work with the statistic in (2 013) or (2.15) with obvious modifications on the
sets P,a
and the allied component vectors and matrices ,as defined in (2 11)0
-7-
3. DISTRIBUTION THEORY UNDER THE NULL HYPOTHESIS
T
Note that if W
,.
•
=
dispersion matrix L:
(W~1)'W~2)) has a multinormal distribution with a
= (( L: •• )).
1J
'-1 2 and if, as in (2 11)-(2 12), we define
0
l,J-,
0
-1
-1
- L: 12 L: 22 W(2) ,L: 11 • 2 = L: 1l -L: 12 L: 22 21 ' then the pair of events
and I(~;~W(2) ~ 0) are independent, irrespective of whether the
.
.
2
*T
-1
*
mean vector of W1S null or not. Also, 1f we let QCl)= W(1)l:lL2W(1) , then,
it follows from Lemma 3.2 of Kudo (1963) that when EW = 0, the three events
*
-1
I CQ(l) .::. x) , I (W (1) .::. 0) and I (L 22 W(2) .:. 0) are mutually independent, for
every
x .::.
o.
Thus, when EW = 0, for every x .::. 0,
*
p{ Q(I)I(W Cl )
,
.e
=
-1
2:. 0)I(l:22 W(2) ,::. 0) .:. x
*
}
-1
P{Q(l) < x }.p{ W{l)~ 0 J.p{ L 22 W(2) < oJ
= p{ Xk
< x
*
J.p{ W(l)
<0
l.p{
-1
L 22 W(2) < 0 }
(301)
Xk has the central chi
distribution with k degrees of freedom (DF)o For k = 0, X is equal to 0
o
-1
* > oJ and p{
with probability 1
Further, both p{ W(l)
L 22 W(2) < O} are
where k is the dimension of the vector WeI) and
0
normal orthant probabilities, for two independent normal distributions with
null mean vectors and dispersion matrices
L
Now, as in (2.10)-(2.12), we define the sets
* and
statistics W(a)
*
~aa.a'
-1
l102 and L: 22 ' respectivelyo
a in P and the corresponding
for every a in Po Also, we define a statistic
*
as in (2.13) but involving these W(a)
and W(a) 0 Then, we may note that of
r
the 2 possible terms in (2.13) only one term will have both the indicator
functions different from 0 0 As such, using (3 01) and arguing as in Kudo
(1963), we may conclude that based on the vector Wand the corresponding l:,
the UI-test statistic in (2.13) will have the distribution C when EW = 0)
-1
}
~aa WG~)
a -< 0 ,
(3.2)
r
where k is the cardinality of the-set a, a e: Po Regrouping the 2 terms in
a
(3.2) in terms of the cardinality, we obtain that (7 02) is equivalently
-8-
(3. 3 )
wk are nonnegative quantities with k~=O wk = 1. In the statistical
literature, (3.3) is referred to as the chi-bar distribution. The weights w
k
where the
,
can be computed from the normal orthant probabilities; some tables for the
•
same are given in Gupta(1963), Barlow etc al (1972), and other places.
We shall make use of (3.1)-(3.3) in our study of the large sample distribut ion theory of the VI-test statistics under the null hypothesis. It follows
from the general theory of partial likelihood functions relating to the Cox
(1972) model [ see for example, Tsiatis(198l), Sen(198l), Slud(1982) and
Anderson and Gill(1982), among others] that under H : S = 0 and some general
O
regularity conditions ( on the c. and z. ), as stated in these papers,
1
1
is asymptotically normal with null mean vector
and dispersion matrix vll • 2 '
•
(3.4)
e.
and
Vll • 2 converges in probability to
where
V
ll • 2 '
• is a positive definite and finite matrix of rank r. Given these
ll 2
V
basic convergence results, we may justify the steps in (3.1) through (3.3).
P , we may use (2.10)-(2.12) along with (3.1), (3.4) and (3 5),
A* T _* -1
A*
~
A*
and claim that under H ' for every x ~ 0, P{[(UN(a)Vll.2(a)UN(a)) I(UN(a» 0).
O
- -1
< x}.P(a), where k is the
I (V110 2 (aa) UN (Ci) .2. 0)] < x} converges to p{Xkaa
cardinality of the set a and pea) is the product of two normal orthant probaFor each a
£
0
*
-1
bilities with respective dispersion matrices Vll • 2 (a) and V 1l • 2 (aa) • Also,
of the 2r terms in (2.13), only for one term the two indicator functions are
simultaneously different from O. Hence, we conclude that under the null
hypothesis H : $
O
=
0, the UI-test statistic J:~l) in (2.13) has asymptotically
the chi-bar distribution in (3 3), where the weights
0
wk are to be computed
from the multinormal distribution with dispersion matrix
because of (3.5),
Vl1 • 2
may be used instead of v
v
• • In practice,
ll 2
• to determine these weights.
l1 2
•
-9-
Let us next consider the null distribution of .r.,~2) in (2 015). Note that
,,*
"0
by virtue of (3.4), UNCI) and UN ' defined before (2 015) are uncorrelated
and asymptotically independent. Further, Q~CI) has asymptotically (under HO)
the chi square distribution with 1 OF, and in (2.15), the rest of the arguments
•
"
all depend on UN0
~
and the corresponding V
Thus, in this case, we need
ll02 0
T
to partition W into three components i.e o , WT = (WI' WT(1)
W(2)) , where WI
is independent of W(l) or W(2)' and then, (3 01) holds even when we replace
the Q(l) by bwi + Q(l) , b > O. Thus, we may claim that under HO ' for every
o
2
"o*T ~* -1 "0*
2
"0*
a E P ,the three events I(QN(l)+ UN(a)VII.2(aYUN(a) < x ), I(UN(a) .:::. 0)
and I(~I~~(aa)U~(a) ~ 0) are asymptotically independent, where, under He'
2
"o*T - 0* -1 "0*
QN(I) + UN(a) VlL2 (a)UN(a) has asymptotically chi square distribution with
k + 1 OF o Hence, proceeding on parallel lines, we conclude that under HO:S = 0,
a
the large sample distribution of J:~2) is given by
r
Lk=l
*
wk p{
where the weights
Xk
< x}
,x ~ 0,
(3 6)
0
wk* are nonnegative and L~=l Wk* = 1; these may be computed
by using the matrix ~I.2[of order (r-l)x(r-I)J instead of the unknown V~I.2 0
We conclude this section with the following two remarks. First, both the
distributions in (3.3) and (3.6) are dominated (in the upper tail) by the
chi distribution with r OF. Hence, the percentile points of (3 03) or (3 6)
0
are dominated by those of the chi distribution with r OF o This is not surprising
as the restricted maximum is always smaller than or equal to the unrestricted
one, so the test statisti~[~l) ~~~2) are always smaller than or equal to
•
(J:~)~. On the other hand, under the specific alternative considered, ~~l)
('(2)
«0 k
(or dlN ) will be close to (~N) 2 , so that with smaller critical values, they
would have better power properties. Secondly,
ba~ed
on some simUlation studies
on normally distributed vectors, it has been observed that the chi-bar approximation in (3.3) or (3.6) is satisfactory only for moderately large or large
sample sizes ; the situation is somewhat better with the chi 'square approximation
-10-
for the null distribution of the unrestricted test statistic ~~ •
4. SOME GENERAL REMARKS
Under local a)ternatives, ~ in (2.8) has asymptotically non-central chi
square distribution with r DF and some appropriate noncentrality paramter o The
situation is more complicated with the restricted alternative tests. The
contiguitx of probability measures under such local alternatives to those under
•
the null hypothesis may be established as in Sen(198l, Sec.4). However, even
A
for such contiguous alternatives, the asymptotic distribution of UN ' though
normal, would have non-null mean vectors. For EW f 0, the independence in
(3 1) does not hold, and, as a result, we may not have a noncentral chi-bar
0
distribution for either of the statistics in (2.13) or (2 15), that is, we may
0
not have an average of severlal noncentral chi distributions for the asymptotic
distribution of t~l) or £~2). This situation is quite comparable to the
parametric case, where the chi-bar approximation works out only for the null
e.
hypothesis case. In the parametric case, the power superiority of the restricted
alternative tests to that of the global test has mostly been proved in various
specific cases by particular constructions or by numerical studies. Chinchilli
and Sen(198Ib) have done some similar studies for the multivariate nonparametric
case in some specific simpler models. The picture is expected to be the same
in this case of partial likelihhod statistics. However, this requires an
extensive amount of simulation work, and is intended to be covered in a
follow up study.
~
-!< /\
For non-local alternatives, N
2
UN converges to some non-null
, so that under the alternative hypothesis of the restrictive type, we would
expect a much better performance of the restricted tests. More numerical studies
are needed to cover the case of small or moderate sample sizes and non-local
alternatives
0
This work was supported by the National Heart, Lung and Blood Institute,
Contract NIH-NHLBI-7l-2243-L from the National Institutes of Health.
•
REF ERE N C E S
ANDERSON, P.K.
&GILL,
R.D. (1982). Cox's regression model for counting processes:
A large sample study. Ann. Statist.
lO~
BARLOW,R.E., BARTHOLOMEW, D.J., BREMNER, J.M.
•
1100-1120.
&BRUNK,
H.D.(1972).Statistical
Inference Under Order Restrictions. New York: Wiley •
&SEN,
CHINCHILLI, V.M.
P.K. (1981a). Multivariate linear rank statistics and
the union-intersection principle for hypothesis testing under restricted
alternatives.
Sankhya~
&SEN,
CHINCHILLI, V.M.
Ser.B42, 135-151.
P.K. (1981b). Multivariate linear rank statistics and the
union-intersection principle for the orthant restriction problem. Snnkhya,
Sere B
42~
152-171.
COX, D.R. (1972). Regression models and life-tables (with discussion). J. R.
Statist. Soc. Sere B 34, 187-202
GUPTA, S.S. (1963). Bibliography on the multivariate normal integrals and
related topics. Ann. Math. Statist. 34, 829-838.
KUDO, A. (1963). A multivariate analogue of the one-sided test. Biometrika
50, 403-418.
ROY, S.N. (1953). On a heuristic method of test construction and its use in
multivariate analysis. Ann. Math. Statist. 24, 220-238 •
•
SEN, P.K. (1981). The Cox regression model, invariance principles for some
induced quantile processes and some repeated significance tests. Ann.
Statist. 9, 109-121.
SLUD, E.V. (1982). Consistency and efficiency of inferences with the partial
likelihood. Biometrika
69~
547-552
TSIATIS, A. (1981). A large sample study of Cox's regression model. Ann.
Statist. 9, 109-121.