S1 PROOF LEMMA 1

1
Statistica Sinica (2013): Supplement
A LIKELIHOOD RATIO TEST FOR MONOTONE BASELINE
HAZARD FUNCTIONS IN THE COX MODEL
Gabriela F. Nane
Delft University of Technology
Supplementary material
S1
PROOF LEMMA 1
Proof. The proof of (i) has been provided by Lemma 1 in Lopuhaä and Nane (2013).
The NPMLE λ̂n px; βq is obtained by maximizing the (pseudo) loglikelihood function in (3.1) over all 0 ď λ0 pTp1q q ď . . . ď λ0 pTpnq q. As argued in Lopuhaä and
Nane (2013), the estimator has to be a nondecreasing step function, that is zero
for x ă Tp1q , constant on the interval rTpiq , Tpi`1q q, for i “ 1, . . . , n ´ 1, and can
be chosen arbitrarily large for x ě Tpnq . Then, for fixed β P
Rp, the (pseudo)
loglikelihood function in (3.1) reduces to
n´1
ÿ
∆piq log λ0 pTpiq q ´
i“1
n
ÿ
i“2
n´1
ÿ
“
β 1 Zpiq
e
i´1
ÿ
“
‰
Tpj`1q ´ Tpjq λ0 pTpjq q
j“1
#
n
“
‰ ÿ
1
∆piq log λ0 pTpiq q ´ λ0 pTpiq q Tpi`1q ´ Tpiq
eβ Zplq
i“1
(S1.1)
+
.
l“i`1
Let λi “ λ0 pTpiq q, for i “ 1, . . . , n ´ 1, and λ “ pλ1 , . . . , λn´1 q. Then, finding the
NPMLE reduces to maximizing
#
+
n´1
n
ÿ
“
‰ ÿ
β 1 Zplq
φpλq “
∆piq log λi ´ λi Tpi`1q ´ Tpiq
e
,
i“1
(S1.2)
l“i`1
over the set 0 ď λ1 ď . . . ď λn´1 . The NPMLE corresponds thus to a vector
λ̂ “ pλ̂1 , . . . , λ̂n´1 q that maximizes φ over 0 ď λ1 ď . . . ď λn´1 . To prove (ii), we
first derive the Fenchel conditions of the estimator. Thus, we will show that the
2
GABRIELA NANE
estimator λ̂n px; βq maximizes the (pseudo) loglikelihood function in (3.1) over
the class of nondecreasing baseline hazard functions if and only if
,
$
n
.
ÿ
ÿ & ∆pjq “
‰
1
´ Tpj`1q ´ Tpjq
eβ Zplq ď 0,
% λ̂j
jěi
(S1.3)
l“j`1
for i “ 1, 2, . . . , n ´ 1, and
$
n´1
ÿ & ∆pjq
% λ̂j
j“1
“
´ Tpj`1q ´ Tpjq
n
‰ ÿ
1
eβ Zplq
l“j`1
,
.
λ̂j “ 0.
(S1.4)
-
The NPMLE λ̂n px; βq is thus uniquely determined by these Fenchel conditions.
The rest of the proof focuses on deriving the Fenchel conditions (S1.3) and (S1.4)
and on establishing (3.6).
First, note that the function φ in (S1.2) is concave and that the vector of
partial derivatives ∇φpλq “ p∇1 φpλq, . . . , ∇n´1 φpλqq is given by
˜
¸
n
‰ÿ
‰ β1Z
∆p1q “
∆pn´1q “
β 1 Zplq
´ Tp2q ´ Tp1q
e
,...,
´ Tpnq ´ Tpn´1q e pnq .
∇φpλq “
λ1
λ
n´1
l“2
Define now the functions gi pλq “ λi´1 ´ λi , for i “ 1, . . . , n ´ 1 and λ0 “ 0, and
the vector gpλq “ pg1 pλq, . . . , gn´1 pλqq. Moreover, define the matrix of partial
derivatives by
ˆ
G“
Bgi pλq
Bλj
˙
,
for i “ 1, . . . , n ´ 1; j “ 1, . . . , n ´ 1.
(S1.5)
Let φ̃pλq “ ´φpλq. Then, maximizing (S1.2) over all 0 ď λ1 ď . . . ď λn´1 is
equivalent with minimizing φ̃pλq under the restriction that all components of the
vector gpλq are negative. An adaptation of the Karush-Kuhn-Tucker theorem
(e.g., see Theorem 8.1 in Groeneboom (1998)) states that λ̂ minimizes φ̃ over all
vectors λ such that gi pλq ď 0, for all i “ 1, . . . , n ´ 1, if and only if the following
conditions hold
∇φ̃pλ̂q ` GT α “ 0,
(S1.6)
gpλ̂q ` w “ 0,
(S1.7)
xα, wy “ 0,
(S1.8)
LIKELIHOOD RATIO TESTS IN THE COX MODEL
3
for α “ pα1 , . . . , αn´1 q, with αi ě 0, i “ 1, . . . , n ´ 1, and w “ pw1 , . . . , wn´1 q,
with wi ě 0, for i “ 1, . . . , n ´ 1. The first condition (S1.6), yields that
$
,
n
.
ÿ & ∆pjq “
ÿ
‰ ÿ
1
∇j φpλ̂q “ ´
αi “ ´
´ Tpj`1q ´ Tpjq
eβ Zplq .
% λ̂j
jěi
jěi
(S1.9)
l“j`1
Since αi ě 0, for all i “ 1, . . . , n ´ 1, condition (S1.3) is immediate. From (S1.7),
w “ ´gpλ̂q “ pλ̂1 ´ λ̂0 , λ̂2 ´ λ̂1 , . . . , λ̂n´1 ´ λ̂n´2 q, with λ̂0 “ 0. Note that the
condition wi ě 0 implies that λ̂i´1 ď λ̂i , for all i “ 1, . . . , n ´ 1, which is trivially
satisfied. Finally, by (S1.8),
n´1
ÿ
ÿ
i“1
jěi
pλ̂i ´ λ̂i´1 q
∇j φpλ̂q “ 0,
which re-writes exactly to (S1.4).
To derive the expression in (3.6), we prove first that (S1.3) and (S1.4) imply
that
$
n´1
ÿ&
j“1
∆pjq
“
´ Tpj`1q ´ Tpjq
% λ̂j
n
‰ ÿ
1
eβ Zplq
l“j`1
řn´1
Condition (S1.3) gives that
j“1
,
.
“ 0.
(S1.10)
-
∇j φpλ̂q ď 0. In addition, as the maximizer λ̂
is nondecreasing,
λ̂1
n´1
ÿ
∇j φpλ̂q “ ´ ∇2 φpλ̂qλ̂2 ´ ∇3 φpλ̂qλ̂3 ´ . . . ´ ∇n´1 φpλ̂qλ̂n´1
j“1
` ∇2 φpλ̂qλ̂1 ` ∇3 φpλ̂qλ̂1 ` . . . ` ∇n´1 φpλ̂qλ̂1
n´1
ÿ
ÿ
i“2
jěi
pλ̂i´1 ´ λ̂i q
“
∇j φpλ̂q ě 0.
This shows (S1.10). Now let B1 , . . . , Bk be blocks of indices on which λ̂ is constant
such that B1 Y . . . Y Bk “ t1, . . . , n ´ 1u and let vnj pβq be the value of λ̂ on the
block Bj , with j “ 1, . . . , k. If k “ 1, then the expression´ of vn1 is ¯immediate
ř
from (S1.10). Moreover, observe that, by (S1.8), n´1
i“1 αi λ̂i ´ λ̂i´1 “ 0, and
since αi ě 0 and λ̂i ě λ̂i´1 , for any i “ 1, . . . , n ´ 1, it will follow that αi “ 0,
whenever λ̂i´1 ă λ̂i . Hence, for k ě 2, there exist k ´ 1 α’s that are zero.
4
GABRIELA NANE
Then (3.6) follows by (S1.9) and (S1.10). For example, for k ě 3, choose any two
consecutive αi that are zero. From (S1.9), we get that by subtracting these αi ’s,
#
ÿ
∇i φpλ̂q “
iPBj
ÿ
iPBj
n
“
‰ ÿ
∆piq
1
´ Tpi`1q ´ Tpiq
eβ Zplq
vnj pβq
l“i`1
+
“ 0.
As vnj pβq is constant on Bj , this yields (3.6).
S2
PROOF LEMMA 2
Proof. We will derive the Karush-Kuhn-Tucker (KKT) conditions, that uniquely
determine the constrained NPMLE, and which implicitly provide the characterization in (ii). To prove the lemma, we will show that the estimator proposed
in (i) satisfies these conditions.
The constrained NPMLE estimator is obtained by maximizing the objective
function (3.1) over 0 ď λ0 pTp1q q ď . . . ď λ0 pTpmq q ď θ0 ď λ0 pTpm`1q q ď . . . ď
λ0 pTpn´1q q. In line with the reasoning for the unconstrained estimator, it can be
argued that the constrained estimator has to be a nondecreasing step function
that is zero for x ă Tp1q , constant on rTpiq , Tpi`1q q, for i “ 1, . . . , n ´ 1, is equal
to θ0 on the interval rx0 , Tpm`1q q, and can be chosen arbitrarily large for x ě Tpnq .
Therefore, for a fixed β P
R, the (pseudo) loglikelihood function in (3.1) reduces
to
m´1
ÿ
#
“
∆piq log λ0 pTpiq q ´ λ0 pTpiq q Tpi`1q ´ Tpiq
i“1
n
‰ ÿ
+
e
β 1 Zplq
l“i`1
` ∆pmq log λ0 pTpmq q
n
“
‰
“
‰( ÿ
1
´ λ0 pTpmq q x0 ´ Tpmq ´ θ0 Tpm`1q ´ x0
eβ Zplq
(S2.1)
l“m`1
n´1
ÿ
`
i“m`1
#
n
“
‰ ÿ
1
∆piq log λ0 pTpiq q ´ λ0 pTpiq q Tpi`1q ´ Tpiq
eβ Zplq
+
.
l“i`1
By letting λi “ λ0 pTpiq q, for i “ 1, . . . , n ´ 1, and λ “ pλ1 , . . . , λn´1 q, we then
5
LIKELIHOOD RATIO TESTS IN THE COX MODEL
want to maximize
φ0 pλq “
m´1
ÿ
#
“
∆piq log λi ´ λi Tpi`1q ´ Tpiq
i“1
n
‰ ÿ
+
e
β 1 Zplq
l“i`1
n
ÿ
β 1 Zplq
“
‰
` ∆pmq log λm ´ λm x0 ´ Tpmq
e
(S2.2)
l“m`1
#
n´1
ÿ
“
∆piq log λi ´ λi Tpi`1q ´ Tpiq
`
i“m`1
n
‰ ÿ
+
e
β 1 Zplq
,
l“i`1
over the set 0 ď λ1 ď . . . ď λm ď θ0 ď λm`1 ď . . . ď λn´1 . Let the vector
λ̂c “ pλ̂c1 , . . . , λ̂cn´1 q denote the constrained NPMLE under the null hypothesis
H0 : λ0 px0 q “ θ0 . We will show next that λ̂c maximizes the objective function
in (S2.2) over the class of nondecreasing baseline hazard functions, under the null
hypothesis, if and only if the following conditions are satisfied
,
$
n
.
ÿ & ∆pjq “
‰ ÿ
1
´ Tpj`1q ´ Tpjq
eβ Zplq ě 0,
for i “ 1, . . . , m ´ 1, (S2.3)
c
%
λ̂
j
jďi
l“j`1
$
m´1
ÿ &
j“1
∆pjq
% λ̂cj
“
´ Tpj`1q ´ Tpjq
n
‰ ÿ
e
jěi
%
pjq
λ̂cj
,
.
plq
-
l“j`1
`
$
ÿ &∆
β1Z
∆pmq
λ̂cm
n
“
‰ ÿ
1
´ Tpj`1q ´ Tpjq
eβ Zplq
n
“
‰ ÿ
1
´ x0 ´ Tpmq
eβ Zplq ě 0,
(S2.4)
l“m`1
,
.
ď 0, for i “ m ` 1, . . . , n ´ 1, (S2.5)
-
l“j`1
and
n´1
ÿ!
∆pjq
λ̂cj
j“1
j‰m
“
´ Tpj`1q ´ Tpjq
n
‰ ÿ
1
eβ Zplq
)´
¯
λ̂cj ´ θ0
l“j`1
(S2.6)
#
`
∆pmq
λ̂cm
“
‰
´ x0 ´ Tpmq
n
ÿ
+
1
eβ Zplq
´
¯
λ̂cm ´ θ0 “ 0.
l“m`1
The NPMLE λ̂c is thus uniquely determined by these conditions. To prove (i),
we will show that λ̂0n defined in (3.7) verifies the Karush-Kuhn-Tucker (KKT)
6
GABRIELA NANE
conditions (S2.3)-(S2.6). Therefore, λ̂0n is the unique maximizer of φ0 pλq in (S2.2),
over the set 0 ď λ1 ď . . . ď λm ď θ0 ď λm`1 ď . . . ď λn´1 . As it will be seen
further, despite bothersome calculations, the distinct form of the likelihood grants
a unified framework for deriving the KKT conditions, that uses all the follow-up
times, unlike the reasoning in Banerjee and Wellner (2001), where the (pseudo)
loglikelihood is split and arguments are carried both to the left and to the right
of x0 .
Similar to the unconstrained case, observe that the function φ0 is concave
and that the vector of partial derivatives is ∇φ0 pλq “ p∇1 φ0 pλq, . . . , ∇n´1 φ0 pλqq,
with
n
‰ ÿ
∆piq “
1
∇i φ pλq “
´ Tpi`1q ´ Tpiq
eβ Zplq ,
λi
l“i`1
0
for i “ 1, . . . , m ´ 1, m ` 1, . . . , n ´ 1, and
∇m φ0 pλq “
n
‰ ÿ
∆pmq “
1
´ x0 ´ Tpmq
eβ Zplq .
λm
l“m`1
Moreover, define the vector gpλq “ pg1 pλq, . . . , gn´1 pλqq, with
$
’
’λi ´ λi`1 i “ 1, . . . , m ´ 1,
’
’
’
’
&λm ´ θ0
i “ m,
gi pλq “
’
’
θ0 ´ λm`1 i “ m ` 1,
’
’
’
’
%λ
´λ
i “ m ` 2, . . . , n ´ 1,
i´1
i
and consider the matrix of partial derivatives defined in (S1.5). Computations as
in (S1.9) can be derived to show that condition (S1.6) yields (S2.3)-(S2.5), upon
noting that
$ř
&
αi “
jďi ∇j φ
0 pλ̂c q
i “ 1, . . . , m,
ř
%´
0 c
jěi ∇j φ pλ̂ q i “ m ` 1, . . . , n ´ 1.
(S2.7)
Condition (S1.7) gives that w “ pλ̂c2 ´ λ̂c1 , . . . , θ0 ´ λ̂cm , λ̂cm`1 ´θ0 , . . . , λ̂cn´1 ´ λ̂cn´2 q,
which together with (S1.8) and (S2.7), yields (S2.6). Moreover, (S1.8) gives that
m´1
ÿ
i“1
´
¯
´
¯
´
¯ n´1
´
¯
ÿ
αi λ̂ci`1 ´ λ̂ci `αm θ0 ´ λ̂cm `αm`1 λ̂cm`1 ´ θ0 `
αi λ̂ci ´ λ̂ci´1 “ 0.
m`2
LIKELIHOOD RATIO TESTS IN THE COX MODEL
7
Obviously, αi “ 0 if λ̂ci ă λ̂ci`1 , for i “ 1, . . . , m ´ 1, m ` 1, . . . , n ´ 1, and (3.8)
can be derived as in the proof of Lemma 1. For the block Bp0 containing m, we
get that
#
ÿ
iPBp0 ztmu
n
“
‰ ÿ
∆piq
1
´
T
´
T
eβ Zplq
pi`1q
piq
0 pβq
vnp
l“i`1
`
+
n
“
‰ ÿ
∆pmq
1
´
x
´
T
eβ Zplq “ 0,
0
pmq
0
vnp pβq
l“m`1
which gives exactly (3.9). Therefore showing that the estimator λ̂0n defined
in (3.7) satisfies the KKT conditions (S2.3)-(S2.6) also proves (ii).
L
Recall that λ̂0n is minpλ̂L
i , θ0 q, for i “ 1, . . . , m, and that λ̂i is the un-
constrained estimator when considering only the follow-up times Tp1q , . . . , Tpmq .
R
Moreover, λ̂0n is maxpλ̂R
i , θ0 q, for i “ m ` 1, . . . , n ´ 1, where λ̂i is the uncon-
strained estimator when considering only the follow-up times Tpmq , . . . , Tpn´1q .
Note that (S1.10) together with (S1.3) imply that
,
$
n
.
ÿ & ∆pjq “
‰ ÿ
1
´ Tpj`1q ´ Tpjq
eβ Zplq ě 0,
% λ̂j
jďi
for i “ 1, . . . , n ´ 1. (S2.8)
l“j`1
The condition holds for i “ 1, . . . , m ´ 1, and, moreover,
#
+
n
ÿ
“
‰ ÿ
∆pjq
1
´ Tpj`1q ´ Tpjq
eβ Zplq
L, θ q
minp
λ̂
0
j
jďi
l“j`1
$
,
n
.
ÿ & ∆pjq “
‰ ÿ
1
ě
´ Tpj`1q ´ Tpjq
eβ Zplq ě 0,
% λ̂L
j
jďi
l“j`1
for i “ 1, . . . , m ´ 1. Therefore, minpλ̂L
i , θq, for i “ 1, . . . , m ´ 1 satisfies (S2.3).
Furthermore, (S2.8) holds for i “ m, which implies that
$
,
m´1
n
.
ÿ &
ÿ
“
‰
∆pjq
1
´ Tpj`1q ´ Tpjq
eβ Zplq
% minpλ̂L
j , θ0 q
j“1
l“j`1
$
,
n
&
.
“
‰ ÿ
∆pmq
1
`
´ x0 ´ Tpmq
eβ Zplq
% minpλ̂L
m , θ0 q
l“j`1
,
$
n
m &
.
ÿ
“
‰ ÿ
∆pjq
1
´ Tpj`1q ´ Tpjq
eβ Zplq ě 0,
ě
% minpλ̂L
j , θ0 q
j“1
l“j`1
8
GABRIELA NANE
hence λ̂0n satisfies (S2.4) as well. It is straightforward that maxpλ̂R
i , θ0 q, for
i “ m ` 1, . . . , n ´ 1 satisfies (S2.5), since, by definition, λ̂R
i satisfies (S1.3), for
i “ m ` 1, . . . , n ´ 1, and
$
ÿ&
∆pjq
n
‰ ÿ
,
.
1
´ Tpj`1q ´ Tpjq
eβ Zplq
R
%
maxpλ̂j , θ0 q
jěi
l“j`1
$
,
n
.
ÿ & ∆pjq “
‰ ÿ
1
ď
´ Tpj`1q ´ Tpjq
eβ Zplq ď 0.
% λ̂R
j
jěi
“
l“j`1
Finally, to check if λ̂0n verifies the condition (S2.6), we will argue on the blocks
R
of indices on which λ̂n , and hence λ̂L
i and λ̂i are constant. By (3.6), for each
block Bj , with j “ 1, . . . , k, on which the unconstrained estimator has the constant value vnj pβq,
#
ÿ
iPBj
n
“
‰ ÿ
∆piq
1
´ Tpi`1q ´ Tpiq
eβ Zplq
vnj pβq
l“i`1
+
vnj pβq “ 0,
and
#
ÿ
iPBj
n
“
‰ ÿ
∆piq
1
´ Tpi`1q ´ Tpiq
eβ Zplq
vnj pβq
l“i`1
+
“ 0.
Then, on each block Bj that does not contain m, we can write
#
ÿ
∆piq
iPBj
λ̂i
“
´ Tpi`1q ´ Tpiq
+
n
‰ ÿ
e
β 1 Zplq
λ̂i
l“i`1
#
“ θ0
ÿ
∆piq
iPBj
λ̂i
“
´ Tpi`1q ´ Tpiq
n
‰ ÿ
(S2.9)
+
β 1 Zplq
e
,
l“i`1
R
L
and this holds for λ̂L
i , as well as for λ̂i . It is straightforward that minpλ̂i , θ0 q, for
i “ 1, . . . , m, and maxpλ̂R
i , θ0 q, for i “ m ` 1, . . . , n ´ 1 satisfy this relationship.
For the block Bp that contains m, we have
9
LIKELIHOOD RATIO TESTS IN THE COX MODEL
#
ÿ
∆piq
iPBp ztmu
λ̂L
i
!∆
`
n
“
‰ ÿ
1
´ Tpi`1q ´ Tpiq
eβ Zplq
+
λ̂L
i
l“i`1
n
n
)
“
‰ ÿ
“
‰ ÿ
1
1
´ Tpm`1q ´ x0
eβ Zplq ´ x0 ´ Tpmq
eβ Zplq λ̂L
m
pmq
λ̂L
m
l“m`1
#
ÿ
∆piq
iPBp ztmu
λ̂L
i
“θ0
#
` θ0
∆pmq
“
‰
´ Tpi`1q ´ Tpiq
1
eβ Zplq
l“i`1
“
´ Tpm`1q ´ x0
λ̂m
l“m`1
+
n
ÿ
n
ÿ
‰
β 1 Zplq
e
“
´ x0 ´ Tpmq
n
‰ ÿ
l“m`1
+
e
β 1 Zplq
.
l“m`1
Constraining λ̂L
m to be θ0 on the interval rx0 , Tpm`1q q yields
#
ÿ
∆piq
iPBp ztmu
λ̂L
i
“
´ Tpi`1q ´ Tpiq
+
e
β 1 Zplq
λ̂L
i
l“i`1
!∆
`
pmq
λ̂L
m
n
)
“
‰ ÿ
1
´ x0 ´ Tpmq
eβ Zplq λ̂L
m
l“m`1
#
“ θ0
n
‰ ÿ
ÿ
∆piq
iPBp ztmu
λ̂L
i
#
` θ0
∆pmq
λ̂L
m
“
´ Tpi`1q ´ Tpiq
n
‰ ÿ
+
e
(S2.10)
β 1 Zplq
l“i`1
“
´ x0 ´ Tpmq
n
‰ ÿ
+
e
β 1 Zplq
.
l“m`1
Once more, for i P Bp , minpλ̂L
i , θ0 q satisfies this relationship. Summing over all
blocks in (S2.9) and (S2.10) completes the proof.
S3
PROOF LEMMA 5
Proof. Note that the processes Xn and Yn are monotone. By making use of
Corollary 2 in Huang and Zhang (1994) and the remark above the corollary, it
suffices to prove that the finite dimensional marginals of the process pXn , Yn q
0 q, in order
converge to the finite dimensional marginals of the process pga,b , ga,b
to prove the lemma.
For x ě Tp1q , let
xn pxq “ Wn pβ̂n , xq ´ Wn pβ̂n , Tp1q q,
W
10
GABRIELA NANE
where Wn is defined in (3.3), and where β̂n is the maximum partial likelihood
estimator. For fixed x0 and x P r´k, ks, with 0 ă k ă 8, define the process
Z
n2{3 !
Vn px0 ` n´1{3 xq ´ Vn px0 q
n pxq “
Φpβ0 , x0 q
”
ı)
xn px0 ` n´1{3 xq ´ W
xn px0 q ,
´ λ0 px0 q W
where Vn is defined in (3.4). For a and b defined in (4.3),
R
(S3.1)
Zn converges weakly
to Xa,b , as processes in Bloc p q, by Lemma 8 in Lopuhaä and Nane (2013).
Define now
Sn pxq “
)
n1{3 ! x
xn px0 q .
Wn px0 ` n´1{3 xq ´ W
Φpβ0 , x0 q
(S3.2)
From the proof of Lemma 9 in Lopuhaä and Nane (2013), Sn pxq converges almost
surely to the deterministic function x, uniformly on every compact set.
Following the approach in Groeneboom (1985), Lopuhaä and Nane (2013)
obtained the asymptotic distribution of the unconstrained maximum likelihood
estimator λ̂n by considering the inverse process
!
)
xn pxq ,
Un pzq “ argmin Vn pxq ´ z W
(S3.3)
xPrTp1q ,Tpnq s
for z ą 0, where the argmin function represents the supremum of times at which
the minimum is attained. Since the argmin is invariant under addition of and
multiplication with positive constants, it follows that
”
ı
n1{3 Un pθ0 ` n´1{3 zq ´ x0 “ argmin t n pxq ´ Sn pxqzu ,
xPIn px0 q
Z
where In px0 q “ r´n1{3 px0 ´ Tp1q q, n1{3 pTpnq ´ x0 qs. For z ą 0, the switching
relationship λ̂n pxq ď z holds if and only if Un pzq ě x, with probability one. This
translates, in the context of this lemma, to
ı
”
ı
”
n1{3 λ̂n px0 ` n´1{3 xq ´ θ0 ď z ô n1{3 Un pθ0 ` n´1{3 zq ´ x0 ě x,
for 0 ă x0 ă τH and θ0 ą 0, with probability one. The switching rela“
‰
tionship is thus Xn pxq ď z ô n1{3 Un pθ0 ` n´1{3 zq ´ x0 ě x. Hence finding the limiting distribution of Xn pxq resumes to finding the limiting distri“
‰
bution of n1{3 Un pθ0 ` n´1{3 zq ´ x0 . By applying Theorem 2.7 in Kim and
Pollard (1990), it follows that, for every z ą 0,
11
LIKELIHOOD RATIO TESTS IN THE COX MODEL
ı
”
d
Ñ U pzq,
n1{3 Un pθ0 ` n´1{3 zq ´ x0 Ý
as inferred in the proof of Theorem 2 in Lopuhaä and Nane (2013), where U pzq “
sup tt P
R : Xa,bptq ´ zt is minimalu. It will result that, for every x P r´k, ks,
”
ı
¯
´
P pXn pxq ď zq “ P n1{3 λ̂n px0 ` n´1{3 xq ´ θ0 ď z
ı
¯
´
”
“ P n1{3 Un pθ0 ` n´1{3 zq ´ x0 ě x
Ñ P pU pzq ě xq .
Using the switching relationship on the limiting process, it can be deduced that
d
U pzq ě x ô ga,b pxq ď z, with probability one, and thus Xn pxq Ý
Ñ ga,b pxq.
In order to prove the same type of result for Yn pxq, consider first the following
process
´
¯
Yrn pxq “ n1{3 λ̃n px0 ` n´1{3 xq ´ θ0 ,
(S3.4)
where, for x0 P p0, τH q, such that Tpmq ă x0 ă Tpm`1q ,
$
’
0
x ă Tp1q ,
’
’
’
’
’ L
’
’
λ̂i Tpiq ď x ă Tpi`1q , for i “ 1, . . . , m ´ 1
’
’
’
’
&λ̂L T
pmq ď x ă x0 ,
m
λ̃n pxq “
’
’
0
x0 ď x ă Tpm`1q ,
’
’
’
’
’
’
λ̂R
Tpiq ď x ă Tpi`1q , for i “ m ` 1, . . . , n ´ 1
’
i
’
’
’
%8 x ě T ,
pnq
R
with λ̂L
i and λ̂i defined in Lemma 2. For this, we have considered up to x0
an unconstrained estimator which is constructed based on the sample points
Tp1q , . . . , Tpm`1q . Moreover, to the right of x0 , we have considered an unconstrained estimator based on the points Tpm`1q , . . . , Tpnq . It is not difficult to see
that
$
´
¯
rn pxq, 0
’
min
Y
’
’
&
Yn pxq “ 0
’
´
¯
’
’
%max Yr pxq, 0
n
For z ą 0, define the inverse processes
x ă 0,
x “ 0,
x ą 0.
(S3.5)
12
GABRIELA NANE
UnL pzq “
!
)
xn pxq ,
Vn pxq ´ z W
argmin
xPrTp1q ,Tpm`1q s
UnR pzq “
!
)
xn pxq
Vn pxq ´ z W
argmin
xPrTpm`1q ,Tpnq s
Take x ă x0 . The switching relationship for λ̃n is given by λ̃n pxq ď z if and only
if UnL pzq ě x, with probability one, which gives that
ı
”
”
ı
n1{3 λ̃n px0 ` n´1{3 xq ´ θ0 ď z ô n1{3 UnL pθ0 ` n´1{3 zq ´ x0 ě x,
with probability one. Moreover,
”
ı
n1{3 UnL pθ0 ` n´1{3 zq ´ x0 “ argmin t
xPInL px0 q
Znpxq ´ Snpxqzu ,
where InL px0 q “ r´n1{3 px0 ´ Tp1q q, n1{3 pTpm`1q ´ x0 qs. Denote by
Zn pz, xq “
Znpxq ´ Snpxqz.
As for the unconstrained estimator, we aim to apply Theorem 2.7 in Kim and
Pollard (1990). As Theorem 2.7 in Kim and Pollard (1990) applies to the argmax
of processes on the whole real line, we extend the above process in the following
manner
$
’
’Zn pz, ´n1{3 px0 ´ Tp1q qq
x ă ´n1{3 px0 ´ Tp1q q,
’
&
Zn´ pz, xq “ Zn pz, xq
´n1{3 px0 ´ Tp1q q ď x ď n1{3 pTpm`1q ´ x0 q,
’
’
’
%Z pz, n1{3 pT
1{3 pT
n
pm`1q ´ x0 qq ` 1 x ą n
pm`1q ´ x0 q.
R
Then, Zn´ pz, xq P Bloc p q and
”
ı
(
(
n1{3 UnL pθ0 ` n´1{3 zq ´ x0 “ argmin Zn´ pz, xq “ argmax ´Zn´ pz, xq .
R
R
xP
xP
Since λ0 px0 q “ θ0 ą 0 and λ0 is continuously differentiable in a neighborhood
of x0 , it follows by a Taylor expansion and by Lemma 2.5 in Devroye (1981)
that n1{3 pTpm`1q ´ x0 q “ Op pn´1 log nq. Therefore, by virtue of Lemma 8 and
Lemma 9 in Lopuhaä and Nane (2013), the process x ÞÑ ´Zn´ pz, xq converges
weakly to Z ´ pxq P
CmaxpRq, for any fixed z, where
Zn´ pz, xq “
$
&´X
a,b pxq
%1
` zx
x ď 0,
x ą 0,
LIKELIHOOD RATIO TESTS IN THE COX MODEL
13
for a and b defined in (4.3). Hence, the first condition of Theorem 2.7 in Kim and
Pollard (1990) is verified. The second condition follows directly from Lemma 11
in Lopuhaä and Nane (2013), while the third condition is trivially fulfilled. Thus,
for any z fixed,
ı
”
d
Ñ U ´ pzq,
n1{3 UnL pθ0 ` n´1{3 zq ´ x0 Ý
where U ´ pzq “ sup tt ď 0 : Xa,b ptq ´ zt is minimalu. Concluding, for x ă 0,
”
ı
¯
´
¯
´
P Yrn pxq ď z “ P n1{3 λ̃n px0 ` n´1{3 xq ´ θ0 ď z
”
ı
¯
´
“ P n1{3 UnL pθ0 ` n´1{3 zq ´ x0 ě x
`
˘
Ñ P U ´ pzq ě x .
The switching relationship for the limiting process gives that U ´ pzq ě x ô
DL pXa,b qpxq ď z, with probability one, where DL pXa,b qpxq has been defined as
the left-hand slope of the GCM of Xa,b , at a point x ă 0. Hence, for x ă 0,
d
Yrn pxq Ý
Ñ DL pXa,b qpxq.
d
Completely analogous, Yrn pxq Ý
Ñ DR pXa,b qpxq, for x ą 0. By continuous
mapping theorem and by (S4.2), it can be concluded that for fixed x P r´k, ks,
d
0
Yn pxq Ý
Ñ ga,b
pxq,
0 has been defined in (3.12).
where ga,b
Our next objective is to apply Theorem 6.1 in Huang and Wellner (1995).
The first condition of Theorem 6.1 is trivially fulfilled. The second condition
follows by Lemma 11 in Lopuhaä and Nane (2013), while the third condition
follows by the definition of the inverse processes. Hence, for fixed x,
`
˘
0
P pXn pxq ď z, Yn pxq ď zq Ñ P ga,b pxq ď z, ga,b
pxq ď z ,
for a and b defined in (4.3). The arguments for one dimensional marginal convergence can be extended to the finite dimensional convergence, as in the proof
of Theorem 3.6.2 in Banerjee (2000), by making use of Lemma 3.6.10 in Banerjee (2000). Hence, we can conclude that the finite dimensional marginals of
the process pXn , Yn q converge to the finite dimensional marginals of the process
0 q. This completes the proof.
pga,b , ga,b
14
S4
GABRIELA NANE
Nonincreasing baseline hazard
The characterization of the unconstrained and the constrained NPMLE es-
timators of a nonincreasing baseline hazard function follows analogously to the
characterization of the nondecreasing estimators. The unconstrained NPMLE
λ̂n px; βq is obtained by maximizing the (pseudo) likelihood function in (3.1) over
all λ0 pTp1q q ě . . . ě λpTpnq q ě 0. Lopuhaä and Nane (2013) showed that the
likelihood is maximized by a nonincreasing step function that is constant on
pTpi´1q , Tpiq s, for i “ 1, . . . , n and where Tp0q “ 0. The (pseudo) loglikelihood
in (3.1) becomes then
#
+
n
n
ÿ
“
‰ÿ
1
∆piq log λ0 pTpiq q ´ λ0 pTpiq q Tpiq ´ Tpi´1q
eβ Zplq .
i“1
(S4.1)
l“i
The lemmas below provide the characterization of the unconstrained estimator
λ̂n px; βq and the constrained estimator λ̂0n px; βq. Their proofs follow by arguments similar to those in the proofs of Lemma 1 and Lemma 2, as well as the
necessary and sufficient conditions that uniquely characterize these estimators..
LEMMA 1. Let Tp1q ă . . . ă Tpnq be the ordered follow-up times and consider
a fixed β P
Rp.
(i) Let Wn be defined in (3.3) and let
ż
V̄n pxq “ δtu ď xu dPn pu, δ, zq.
(S4.2)
Then, the NPMLE λ̂n px; βq of a nonincreasing baseline hazard function λ0
is given by
$
&λ̂i
λ̂n px; βq “
%0
Tpi´1q ă x ď Tpiq , for i “ 1, . . . , n,
x ą Tpnq ,
for i “ 1, . . . , n, with Tp0q “ 0 and where λ̂i is the left derivative of the least
concave majorant (LCM) at the point Pi of the cumulative sum diagram
(CSD) consisting of the points
´
¯
Pj “ Wn pβ, Tpjq q, V̄n pTpjq q ,
for j “ 1, . . . , n and P0 “ p0, 0q.
(S4.3)
15
LIKELIHOOD RATIO TESTS IN THE COX MODEL
(ii) Let B1 , . . . , Bk be blocks of indices such that λ̂n px; βq is constant on each
block and B1 Y . . . Y Bk “ t1, . . . , nu. Denote by vnj pβq, the value of the
estimator on block Bj . Then
ř
vnj pβq “ ř
iPBj
iPBj
“
∆piq
‰ řn
Tpiq ´ Tpi´1q
β 1 Zplq
l“i e
.
In fact, for x ě Tpnq , λ̂n px; βq can take any value smaller than λ̂n , the
left derivative of the LCM at the point Pn of the CSD. As before, we propose
λ̂n pxq “ λ̂n px; β̂n q as the estimator of λ0 and v̂nj “ vnj pβ̂n q, where β̂n denotes
the maximum partial likelihood estimator of β0 . Fenchel conditions as in (S1.3)
and (S1.4) can be derived analogously.
The NPMLE estimator λ̂0n maximizes the (pseudo) loglikelihood function
in (S4.1) over the set λ0 pTp1q q ě . . . ě λ0 pTpmq q ě θ0 ě λ0 pTpm`1q q ě . . . ě
λ0 pTpnq q ě 0. It can be argued that the constrained estimator has to be a
nonincreasing step function that is constant on pTpi´1q , Tpiq s, for i “ 1, . . . , n,
is θ0 on the interval pTpmq , x0 s, and is zero for x ě Tpnq . Hence, the (pseudo)
loglikelihood function becomes
#
+
m
n
ÿ
“
‰ÿ
1
∆piq log λ0 pTpiq q ´ λ0 pTpiq q Tpiq ´ Tpi´1q
eβ Zplq
i“1
l“i
` ∆pm`1q log λ0 pTpm`1q q
n
“
‰
“
‰( ÿ
1
´ θ0 x0 ´ Tpmq ´ λ0 pTpm`1q q Tpm`1q ´ x0
eβ Zplq
n
ÿ
`
#
“
l“m`1
n
ÿ
∆piq log λ0 pTpiq q ´ λ0 pTpiq q Tpiq ´ Tpi´1q
i“m`2
‰
+
e
β 1 Zplq
.
l“i
The characterization of the constrained NPMLE λ̂0n is provided with the next
lemma.
LEMMA 2. Let x0 P p0, τH q fixed, such that Tpmq ă x0 ă Tpm`1q , for a given
1 ď m ď n ´ 1. Consider a fixed β P
Rp.
L
(i) For i “ 1, . . . , m, let λ̂L
i to be the left derivative of the LCM at the point Pi
of the CSD consisting of the points PjL “ Pj , for j “ 1, . . . , m, with Pj
defined in (S4.3), and P0L “ p0, 0q. Moreover, for i “ m ` 1, . . . , n, let λ̂R
i
16
GABRIELA NANE
be the left derivative of the LCM at the point PiR of the CSD consisting of
the points PjR “ Pj , for j “ m, . . . , n, with Pj defined in (S4.3). Then, the
NPMLE λ̂0n px; βq of a nonincreasing baseline hazard function λ0 , under the
null hypothesis H0 : λ0 “ θ0 , is given by
λ̂0n px; βq “
$
0
’
’
’λ̂i
’
’
’
&θ0
’
’
λ̂0m`1
’
’
’
’
%0
Tpi´1q ă x ď Tpiq , for i P t1, . . . nuztm ` 1u,
Tpmq ă x ď x0 ,
(S4.4)
x0 ă x ď Tpm`1q ,
x ą Tpnq ,
0
where Tp0q “ 0 and where λ̂0i “ maxpλ̂L
i , θ0 q, for i “ 1, . . . , m, and λ̂i “
minpλ̂R
i , θ0 q, for i “ m ` 1, . . . , n.
(ii) For k ě 1, let B10 , . . . , Bk0 be blocks of indices such that λ̂0n px; βq is constant
on each block and B10 Y . . . Y Bk0 “ t1, . . . , nu. There is one block, say Br0 ,
on which λ̂0n px; βq is θ0 , and one block, say Bp0 , that contains m ` 1. On all
0 pβq the value of λ̂0 px; βq on block B 0 . Then,
other blocks Bj0 , denote by vnj
n
j
ř
0
vnj
pβq
iPBj0
“ř
“
iPBj0
∆piq
‰ řn
Tpjq ´ Tpj´1q
l“j
1
eβ Zplq
.
On the block Bp0 , that contains m ` 1,
0
vnp
pβq
ř
“ř
iPBp0 ztm`1u
“
Tpiq ´ Tpi´1q
iPBp0 ∆piq
β 1 Zplq
`
l“i`1 e
‰ řn
rTpm`1q ´ x0 s
řn
β 1 Zplq
l“m`1 e
.
We propose λ̂0n pxq “ λ̂0n px; β̂n q as the constrained estimator of a nonincreas0 “ v 0 pβ̂ q on blocks of indices
ing baseline hazard function λ0 , as well as v̂nj
nj n
where the estimator is constant. The Fenchel conditions corresponding to (S2.3)(S2.6) can be derived in the same manner as for the constrained estimator in the
nondecreasing case.
Let slolcmpf, Iq be the left-hand slope of the LCM of the restriction of the
real-valued function f to the interval I. Denote by slolcmpf q “ slolcmpf,
Rq. For
17
LIKELIHOOD RATIO TESTS IN THE COX MODEL
a, b ą 0, let X̄a,b ptq “ a
Wptq ´ bt2, where W is a standard two-sided Brownian
motion originating from zero. Denote by La,b the LCM of X̄a,b and let
la,b ptq “ slolcmpX̄a,b qptq,
(S4.5)
be the left-hand slope of La,b , at point t. Additionally, set
slolcm0 pf q “ max pslolcmpf, p´8, 0sq, 0q 1p´8,0s `min pslolcmpf, p0, 8qq, 0q 1p0,8q .
For t ď 0, construct the LCM of X̄a,b , that will be denoted by LL
a,b and take its
left-hand slope at point t, denoted by DL pX̄a,b qptq. When the slopes fall behind
zero, replace them by zero. In the same manner, for t ą 0, denote the LCM
of X̄a,b by LR
a,b and its slope at point t by DR pX̄a,b qptq. Replace the slopes by
0 , which is
zero when they exceed zero. This slope process will be denoted by la,b
thus given by
0
la,b
ptq “
$
`
˘
’
’
max DL pX̄a,b qptq, 0
’
&
0
’
’
’
%min `D pX̄ qptq, 0˘
R
a,b
t ă 0,
(S4.6)
t “ 0,
t ą 0.
0 ptq “ slolcm0 pX̄ qptq.
Observe that la,b
a,b
By making use of results in Lopuhaä and Nane (2013), a completely similar
result holds in the nonincreasing setting.
LEMMA 3. Assume (A1) and (A2) and let x0 P p0, τH q. Suppose that λ0 is nonincreasing on r0, 8q and continuously differentiable in a neighborhood of x0 , with
λ0 px0 q ‰ 0 and λ10 px0 q ă 0. Moreover, assume that the functions x Ñ Φpβ0 , xq
and H uc pxq, defined in (4.1) and above (4.1), are continuously differentiable in
a neighborhood of x0 .
Then, for a and b defined in (4.3), pXn , Yn q converge jointly to
´
¯
0
la,b , la,b
in
0 have been defined in (S4.5) and (S4.6).
L ˆ L, where the processes la,b and la,b
The asymptotic distribution of the likelihood ratio statistic in the nonincreasing baseline hazard setting can be derived completely analogous.
THEOREM 1. Suppose (A1) and (A2) hold and let x0 P p0, τH q. Assume
that λ0 is nonincresing on r0, 8q and continuously differentiable in a neighborhood of x0 , with λ0 px0 q ‰ 0 and λ10 px0 q ă 0. Moreover, assume that H uc pxq and
18
GABRIELA NANE
x Ñ Φpβ0 , xq, defined in (4.1) and above (4.1), are continuously differentiable in
a neighborhood of x0 . Let 2 log ξn pθ0 q be the likelihood ratio statistic for testing
H0 : λ0 px0 q “ θ0 , as defined in (3.2). Then,
d
2 log ξn pθ0 q Ý
Ñ
D.
Proof. Following the same reasoning as in the proof of Theorem 1 and by Lemma 8,
it can be deduced that
1
2 log ξn pθ0 q Ý
Ñ 2
a
d
ż ”
`0
˘2 ı
(
pla,b pxqq2 ´ la,b
pxq
x P D̄a,b dx,
0 differ. By continuous mapping theorem,
where D̄a,b is the set on which la,b and la,b
it suffices to show that, for t fixed, la,b pX̄a,b qptq has the same distribution as
0 pX̄ qptq has the same distribution as g 0 pX qptq. It is
ga,b pXa,b qptq and la,b
a,b
a,b
a,b
noteworthy that
slolcmpX̄a,b qptq “ ´slogcmp´X̄a,b qptq.
Thus, by Brownian motion properties and continuous mapping theorem,
`
˘
P pla,b ptq ď zq “ P ´slogmcp´a ptq ` t2 q ď z
`
˘
“ P ´slogmcpa ptq ` t2 q ď z “ P p´ga,b ptq ď zq .
W
W
d
Concluding, la,b pX̄a,b qptq “ ´ ga,b pXa,b qptq, and a similar reasoning can be applied
d
0 pX̄ qptq “ ´ g 0 pX qptq. The proof is then immediate, by
to show that la,b
a,b
a,b
a,b
continuous mapping theorem.
References
Banerjee, M. (2000). Likelihood Ratio Inference in Regular and Nonregular
Problems. PhD Dissertation University of Washington.
Banerjee, M. and Wellner, J. A. (2001). Likelihood ratio tests for monotone
functions. Ann. Statist. 29, 1699-1731.
Devroye, L. (1981). Laws of the iterated logarithm for order statistics of uniform
spacings Ann. Prob. 9, 860-867.
Groeneboom, P. (1985). Estimating a monotone density. In Proceedings of the
Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer 2, 539-555.
LIKELIHOOD RATIO TESTS IN THE COX MODEL
19
Groeneboom, P. (1998). Special Topics Course 593C: Nonparametric Estimation for Inverse Problems: Algorithms and Asymptotics. Technical Report
344, Department of Statistics, University of Washington.
Huang, J. and Wellner, J. A. (1995). Estimation of a monotone density or
monotone hazard under random censoring. Scand. J. Statist. 22, 3-33.
Huang, Y. and Zhang, C.-H. (1994). Estimating a monotone density from censored observations. Ann. Statist. 22, 1256-1274.
Kim, J. and Pollard, D. (1990). Cube root asymptotics. Ann. Statist. 18,
191-219.
Lopuhaä, H. P. and Nane, G. F. (2013). Shape constrained nonparametric
estimators of the baseline distribution in Cox proportional hazards model.
Scand. J. Statist. 40(3), 619-646.