Zhou, Haibo and Pepe, Margaret S. (1993). "Asymptotic Theory of the Estimated Partial Likelihood."

ASYMPTOTIC THEORY OF THE ESTIMATED PARTIAL LIKELIHOOD
Haibo Zhou and Margaret S. Pepe
Department of Biostatistics, University of
North Carolina at Chapel Hill, NC.
Institute of Statistics Mimeo Series No. 2114
April 1993
Asymptotic Theory of The Estimated Partial Likelihood
Haibo Zhou and Margaret S. Pepe*
ABSTRACT
A semi-parametric method, the Estimated Partial Likelihood (EPL) method, is proposed for the estimation of the relative risk parameters when auxiliary covariates are present. The EPL method makes full use of the available covariates and auxiliary measurements without imposing a parametric structure on the missingness process. The EPL function reduces to the partial likelihood function of Cox (1972) when the auxiliary variable is a perfect measurement of the true covariates. Since the EPL takes a non-parametric approach with respect to the underlying missingness process, it avoids some strong assumptions, e.g. the rare disease assumption (Prentice 1982; Pepe, Self and Prentice 1991). The asymptotic distribution theory is established for the proposed estimator for the case in which the surrogate or mismeasured covariates are categorical. The proposed EPL estimator is found to be consistent and asymptotically normally distributed. A consistent variance estimator is provided. An extension of the EPL is presented for the case in which the validation sample is a stratified random sample.
KEY WORDS: Missing data; Survival analysis; Relative risk parameter estimation; Stochastic integration; Surrogate covariate data.
* Haibo Zhou is a Postdoctoral Fellow, Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, N.C. 27599-7400. Margaret S. Pepe is an Associate Member, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1124 Columbia Street, Seattle, Washington 98104. This research was supported in part by National Science Foundation grant No. DMS 89-02211. Part of this work was included in Haibo Zhou's 1992 doctoral dissertation, Department of Statistics, University of Washington. The authors thank C. Ed Davis for helpful discussion about the SOLVD data and insightful comments.
1. INTRODUCTION
Auxiliary covariate data are common in statistical data analysis. For example, in a substudy of the Studies of Left Ventricular Dysfunction (SOLVD) (SOLVD Investigators, 1991), the patient's left ventricular ejection fraction (EF) is an important risk factor in predicting the patient's congestive heart failure. However, due to the complexity and expense of the measuring procedure, the EF, which is a standardized radionuclide measurement, is only available for a subset of 179 patients in the study. For the total of 1111 patients in the substudy, a measurement of EF that is not standardized across clinical centers is available for everyone.
In the above example, if we denote the standard EF as X and the nonstandardized EF as Z, then the data we are dealing with are of the form

$$ \{S_i, \delta_i, X_i, Z_i\}, \quad i \in V; \qquad \{S_i, \delta_i, Z_i\}, \quad i \in \bar V, \tag{1} $$

where S denotes the observed event time, which is the minimum of the potential failure time T and the censoring time U. δ is the indicator of failure. V is the validation sample, where subjects have both X and Z available. V̄ is the nonvalidation sample, where subjects only have Z available. β, the regression coefficient of X, is the primary interest. Let n, n_v and n_v̄ denote the sample sizes of the whole, validation and non-validation samples, respectively.
The most naive approach to the data in (1) is to discard the non-validation sample and analyze only the validation sample. This could induce a substantial reduction in efficiency. A more comprehensive approach is to parameterize the underlying distribution and use a fully parametric approach. However, realistic models are hard to construct; furthermore, this approach may not be robust to model misspecification. The proposed EPL approach is aimed at avoiding stringent model assumptions while still utilizing the information contained in the non-validation sample.
Denote the relative risk functions for the subjects in the validation and non-validation samples respectively as follows:

$$ r_i(t) = r(\beta, X_i(t)) \quad \text{if } i \in V; \qquad \bar r_i(t) = \bar r(\beta, Z_i(t)) \quad \text{if } i \in \bar V. $$

For the observed data we will denote the relative risk function for an individual i as

$$ r_i^*(t) = r_i(t)\, I_{[i \in V]} + \bar r_i(t)\, I_{[i \in \bar V]}. \tag{2} $$
Inference about the regression parameter β can be drawn from the partial likelihood (Prentice & Self, 1983)

$$ PL(\beta) = \prod_{i=1}^{n} \left[ \frac{r_i^*(T_i)}{\sum_{j \in \mathcal{R}(T_i)} r_j^*(T_i)} \right]^{\delta_i}. \tag{3} $$
However, as pointed out by Prentice (1982), the induced relative risk function r̄_i(t) can be expressed as

$$ \bar r_i(t) = \bar r(\beta, Z_i(t)) = E\big( r(\beta, X_i(t)) \mid T_i \ge t,\ Z_i(t) \big), $$

where the conditioning event [T_i ≥ t, Z_i(t)] usually implies that r̄(t) depends on the baseline hazard and the underlying distribution of X and Z. Hence the partial likelihood (3) is only useful in some special cases (e.g. under the 'rare disease' assumption; Prentice 1982, Pepe, Self and Prentice 1991).
We propose to estimate r̄(t) empirically on the basis of the validation sample observations. The estimated relative risk function for an individual i, given his observed data, is

$$ \hat r_i^*(t) = r_i(t)\, I_{[i \in V]} + \hat{\bar r}_i(t)\, I_{[i \in \bar V]}, \tag{4} $$

where r̄̂_i(t) is the estimated relative risk function for an individual in the non-validation sample with only the auxiliary covariate process Z(t) available. Specifically, the empirical estimate of the induced relative risk function, denoted by r̄̂(t), is

$$ \hat{\bar r}(t) = \hat E\big[ r(\beta, X(t)) \mid S \ge t,\ Z(t) \big] = \frac{\sum_{i \in V} I_{[S_i \ge t,\, Z_i(t) = Z(t)]}\, r(\beta, X_i(t))}{\sum_{i \in V} I_{[S_i \ge t,\, Z_i(t) = Z(t)]}}, \tag{5} $$

where Z is assumed discrete with Prob(Z = z_k) = p_{z_k}, k = 1, ..., q, and Σ_{k=1}^q p_{z_k} = 1. The EPL estimator is the maximizer of the resulting Estimated Partial Likelihood,

$$ EPL(\beta) = \prod_{i=1}^{n} \left[ \frac{\hat r_i^*(T_i)}{\sum_{j \in \mathcal{R}(T_i)} \hat r_j^*(T_i)} \right]^{\delta_i}. \tag{6} $$
The proposed estimator has the following properties:
1. No parametric assumptions about the underlying distributions are made; hence the EPL estimator is robust with respect to f(X|T, Z);
2. The baseline hazard function λ_0(t) is left completely arbitrary;
3. No rare disease assumption (Prentice 1982, Pepe, Self and Prentice 1991) is needed in the EPL method;
4. The EPL method is computationally straightforward;
5. The EPL method fully utilizes the information about β contained in the non-validation set;
6. The validation set need not be a simple random sample of the entire sample. Because of the nature of the estimation of r̄̂ in the EPL method, the validation set can be a random sample stratified on Z. Moreover, the validation set can be time dependent, i.e. the missingness of the covariate may depend on Z and time t. The last property is very useful in large cohort studies because subjects may enter the validation set during the study.
The rest of this article is organized as follows. We first formulate the Cox model (Cox, 1972) in a more general multivariate counting process framework in Section 2 and outline the proofs of the asymptotic properties of the proposed β̂_EPL. The conditions needed in the proofs can be found in Section 2, where some further definitions and notation are given as well. Rigorous proofs of the two main results in this paper are detailed in Sections 3 and 4, where consistency of β̂_EPL is shown and the asymptotic distribution of β̂_EPL is determined. A consistent estimator of the asymptotic variance-covariance matrix is also presented in Section 4. While all of the above results are obtained assuming that the validation set is a simple random sample, a version of the main results with a random validation set stratified on Z can be found in Section 5.
2. THE COUNTING PROCESS FORMULATION OF THE MODEL
In this section we first formulate the Cox model in the framework of multivariate counting processes in Section 2.1. For simplicity the time interval is assumed to be finite; without loss of generality, we work on the time interval [0,1]. The conditions needed in developing the asymptotic theory and some definitions are presented in Section 2.2. Section 2.3 outlines the idea of the proofs of the asymptotic theory.
The background theory for this paper consists of the theories of multivariate counting processes, stochastic integrals and local martingales. We shall use the basic results from those theories without further comment. A good survey of these theories can be found in Fleming & Harrington (1991). The use of the local concept will allow us to avoid making superfluous integrability conditions. Apart from the background theories, our basic tools are the Inverse Function Theorem (Rudin, 1964), the inequality of Lenglart and the martingale central limit theorem of Rebolledo. The latter two theorems are not presented in this work; we usually refer to the versions given in Andersen & Gill (1982), which are slightly extended with respect to the originals.
2.1. Formulation of The Model
Since we are interested in asymptotic properties, we shall in fact consider a sequence of models, indexed by n = 1, 2, .... We generalize from the censored observation of lifetimes of n subjects to the observation (in the nth model) of an n-component counting process N^{(n)} = (N_1^{(n)}, ..., N_n^{(n)}), where N_j^{(n)} counts the observed events in the life of the jth subject, j = 1, 2, ..., n, over the time interval [0,1]. In this work, only the case of nonrecurrent events will be studied. Specifically, the multivariate counting processes are generated as follows.
Let S_j^{(n)} = min(T_j^{(n)}, U_j^{(n)}) be the observation for the jth subject, where T_j^{(n)} is the potential failure time, U_j^{(n)} is the potential censoring time and S_j^{(n)} is the observed event time of the jth subject in the study. The counting process N_j^{(n)} is defined by

$$ N_j^{(n)}(t) = I_{[S_j^{(n)} \le t,\ \delta_j^{(n)} = 1]}, \tag{7} $$

where δ_j^{(n)} = I_{[T_j^{(n)} ≤ U_j^{(n)}]} indicates whether the observation is a failure time. So the sample paths of N_1^{(n)}, ..., N_n^{(n)} are step functions that satisfy:
1. N_j^{(n)}(0) = 0, j = 1, ..., n;
2. jump size +1 only;
3. no two component processes jump at the same time;
4. at most one jump for each subject in the study.
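The construction above amounts to a few lines of code. This is a toy illustration with hypothetical exponential failure times and uniform censoring times:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
T = rng.exponential(1.0, size=n)     # potential failure times T_j
U = rng.uniform(0.5, 1.0, size=n)    # potential censoring times U_j
S = np.minimum(T, U)                 # observed event times S_j = min(T_j, U_j)
delta = (T <= U).astype(int)         # failure indicators delta_j = I[T_j <= U_j]

def N(j, t):
    """Counting process N_j(t) = I[S_j <= t, delta_j = 1]."""
    return int(S[j] <= t and delta[j] == 1)

# sample paths are step functions: N_j(0) = 0, jumps of size +1 only,
# and at most one jump per subject (nonrecurrent events)
```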
In our model, properties of stochastic processes, such as being a local martingale or a predictable process, are relative to a right-continuous nondecreasing family {F_t^{(n)} : t ∈ [0,1]} of sub-σ-algebras on the nth probability space (Ω^{(n)}, F^{(n)}, P^{(n)}); F_t^{(n)} is the filtration and can be thought of as the history of everything that happens up to time t (in the nth model).
Our basic assumption is that, for each n, N^{(n)} has a random intensity process λ^{(n)} = (λ_1^{(n)}, ..., λ_n^{(n)}) such that

$$ \lambda_i^{(n)}(t) = Y_i^{(n)}(t)\, r_i^{*(n)}(t)\, \lambda_0(t), \tag{8} $$

where r_i^{*(n)}(t) is as defined in (2), Y_i^{(n)}(t) is the at-risk process defined by Y_i^{(n)}(t) = I_{[S_i^{(n)} ≥ t]}, and λ_0(t) is an unknown but fixed underlying baseline hazard function.
In our study the covariate processes X^{(n)}(t) and Z^{(n)}(t) are assumed to be predictable and locally bounded. This assumption holds when X^{(n)}(t) and Z^{(n)}(t) are left continuous with right-hand limits and adapted.
By stating that N^{(n)} has intensity process λ^{(n)}, we mean that the processes M_i^{(n)} defined by

$$ M_i^{(n)}(t) = N_i^{(n)}(t) - \int_0^t \lambda_i^{(n)}(s)\, ds \tag{9} $$

are local martingales on the time interval [0,1]. Since the intensity process λ^{(n)} is locally bounded as well, the local martingales (9) are in fact local square integrable martingales, and

$$ \langle M_i^{(n)}, M_j^{(n)} \rangle(t) = 0, \quad i \ne j, \tag{10} $$

i.e. M_i^{(n)} and M_j^{(n)} are orthogonal when i ≠ j.
A nice property of local martingales is that if H_i is locally bounded and F_t-predictable, then Σ_{i=1}^n ∫ H_i^{(n)} dM_i^{(n)} is a local square integrable martingale with an easy-to-calculate covariance process Σ_{i=1}^n ∫ (H_i^{(n)})^2 d⟨M_i^{(n)}⟩. This is one of the key ingredients in establishing the asymptotic properties of β̂_EPL.
In the following we drop the superscript (n) everywhere. Only β and λ_0 are the same in all models (i.e. for each n). Convergence in probability (→^P) and convergence in distribution (→^D) are always relative to the probability measures P^{(n)} parameterized by β and λ_0.
Let C(β, t) be the logarithm of the partial likelihood (3) evaluated at time t, so that according to (8) the log partial likelihood for the observed data is

$$ C(\beta, t) = \sum_{i=1}^{n} \int_0^t \log\{ r_i^*(s) \}\, dN_i(s) - \int_0^t \log\Big\{ \sum_{i=1}^{n} Y_i(s)\, r_i^*(s) \Big\}\, dN(s), \tag{11} $$

where N = Σ_{i=1}^n N_i. As we discussed in Section 1, the induced relative risk function r̄(β, t) is not known for non-validation set members unless the underlying conditional distribution function f_{X|S≥t,Z}(x) is specified. In Section 1, we proposed to estimate r̄(β, t) empirically for non-validation set members and to draw inference about β from the estimated partial likelihood function (6). The proposed estimator β̂_EPL is then the solution to the estimating equation Û(β, 1) = 0, where Ĉ(β, t) is obtained by substituting r̂*(β, t) for r*(β, t) in (11), and the vector of derivatives Û(β, t) of Ĉ(β, t) with respect to β has the form

$$ \hat U(\beta, t) = \sum_{i=1}^{n} \int_0^t \frac{\hat r_i^{*(1)}(s)}{\hat r_i^{*}(s)}\, dN_i(s) - \int_0^t \frac{\sum_{i=1}^{n} Y_i(s)\, \hat r_i^{*(1)}(s)}{\sum_{i=1}^{n} Y_i(s)\, \hat r_i^{*}(s)}\, dN(s) = \sum_{i=1}^{n} \int_0^t \Delta(\hat r_i^*)(s)\, dN_i(s), \tag{12} $$

where r̂_i^{*(1)}(s) denotes the derivative of r̂_i^{*}(s) with respect to β and

$$ \Delta(\hat r_i^*)(s) = \frac{\hat r_i^{*(1)}(s)}{\hat r_i^{*}(s)} - \frac{\sum_{j=1}^{n} Y_j(s)\, \hat r_j^{*(1)}(s)}{\sum_{j=1}^{n} Y_j(s)\, \hat r_j^{*}(s)}. $$
From (8) and (9), we can rewrite Û(β, t) as

$$ \hat U(\beta, t) = \sum_{i=1}^{n} \int_0^t \Delta(\hat r_i^*)(s)\, dM_i(s) + \sum_{i=1}^{n} \int_0^t \Delta(\hat r_i^*)(s)\, r_i^*(s)\, Y_i(s)\, \lambda_0(s)\, ds. \tag{13} $$

2.2. Further Definitions and Conditions
Some important notation that will be used in the proofs is defined in this section; the notation defined in Section 2.1 is unchanged. Unless otherwise stated, all limits are taken as n → ∞, which implies n_v → ∞ and n_v̄ → ∞ as well, where n_v and n_v̄ are the numbers of subjects in the validation and non-validation sets, respectively. For a matrix A or vector a, we define the norms ||A|| = sup_{i,j} |a_{ij}| and ||a|| = sup_i |a_i|. For a vector a, define |a| = (Σ a_i^2)^{1/2} = (a'a)^{1/2}. We also write the matrix aa' as a^{⊗2} and the matrix (aa')(aa')' as a^{⊗4}. For the relative risk function r* (as well as for r̂*, r, r̄ and r̄̂), we let r^{*(j)} denote the jth derivative of r* with respect to β, j = 0, 1, 2, where j = 0 represents the function itself.
Some further definitions are:
$$\begin{aligned}
\hat S^{(0)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\,\hat r_i^{*}(\beta,t) \\
\hat S^{(1)}(\beta,t) &= \frac{\partial}{\partial\beta}\hat S^{(0)}(\beta,t) = \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\,\hat r_i^{*(1)}(\beta,t) \\
\hat S^{(2)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\,\hat r_i^{*(2)}(\beta,t) \\
\hat S^{(3)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\,\frac{\hat r_i^{*(2)}(\beta,t)}{\hat r_i^{*}(\beta,t)}\, r_i^{*}(\beta_0,t) \\
\hat S^{(4)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\Big(\frac{\hat r_i^{*(1)}(\beta,t)}{\hat r_i^{*}(\beta,t)}\Big)^{\otimes 2} r_i^{*}(\beta_0,t) \\
\hat S^{(5)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\Big(\frac{\hat r_i^{*(2)}(\beta,t)}{\hat r_i^{*}(\beta,t)}\Big)^{\otimes 2} r_i^{*}(\beta_0,t) \\
\hat S^{(6)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\Big(\frac{\hat r_i^{*(1)}(\beta,t)}{\hat r_i^{*}(\beta,t)}\Big)^{\otimes 4} r_i^{*}(\beta_0,t) \\
\hat S^{(7)}(\beta,t) &= \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\,\frac{\hat r_i^{*(1)}(\beta,t)}{\hat r_i^{*}(\beta,t)}\, r_i^{*}(\beta_0,t)
\end{aligned}$$

We define S^{(j)}(β, t) as the corresponding functions with r*(β, t) substituted for r̂*(β, t) in the above Ŝ^{(j)}(β, t), j = 0, 1, 2, ..., 7. Also, we define

$$\begin{aligned}
s^{(0)}(\beta,t) &= E\big(Y(t)\, r^{*}(\beta,t)\big) \\
s^{(1)}(\beta,t) &= E\big(Y(t)\, r^{*(1)}(\beta,t)\big) \\
s^{(2)}(\beta,t) &= E\big(Y(t)\, r^{*(2)}(\beta,t)\big) \\
s^{(3)}(\beta,t) &= E\Big(Y(t)\,\frac{r^{*(2)}(\beta,t)}{r^{*}(\beta,t)}\, r^{*}(\beta_0,t)\Big) \\
s^{(4)}(\beta,t) &= E\Big(Y(t)\Big(\frac{r^{*(1)}(\beta,t)}{r^{*}(\beta,t)}\Big)^{\otimes 2} r^{*}(\beta_0,t)\Big) \\
s^{(5)}(\beta,t) &= E\Big(Y(t)\Big(\frac{r^{*(2)}(\beta,t)}{r^{*}(\beta,t)}\Big)^{\otimes 2} r^{*}(\beta_0,t)\Big) \\
s^{(6)}(\beta,t) &= E\Big(Y(t)\Big(\frac{r^{*(1)}(\beta,t)}{r^{*}(\beta,t)}\Big)^{\otimes 4} r^{*}(\beta_0,t)\Big) \\
s^{(7)}(\beta,t) &= E\Big(Y(t)\,\frac{r^{*(1)}(\beta,t)}{r^{*}(\beta,t)}\, r^{*}(\beta_0,t)\Big)
\end{aligned}$$
and

$$ \Sigma = \int_0^1 \Big[ \frac{s^{(4)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} - \Big( \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)^{\otimes 2} \Big]\, s^{(0)}(\beta_0,w)\, \lambda_0(w)\, dw \tag{14} $$

$$ \Sigma_1 = \int_0^1 \Big[ E\Big( Y_i(w)\, \frac{\big(\bar r_i^{(1)}(\beta_0,w)\big)^{\otimes 2}}{\bar r_i(\beta_0,w)} \Big) - \frac{\big(s^{(1)}(\beta_0,w)\big)^{\otimes 2}}{s^{(0)}(\beta_0,w)} \Big]\, \lambda_0(w)\, dw \tag{15} $$

$$ \Sigma_2 = E\Big\{ \int_0^1 \Big( \frac{r_j^{(1)}(\beta_0,w)}{r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_j(w) + Q_j \Big\}^{\otimes 2} \tag{16} $$

where

$$ \hat Q_j = \frac{1}{n_{\bar v}} \sum_{i=1}^{n} \int_0^1 \Delta\big(\bar r_i(\beta_0,w)\big)\, \frac{Y_i(w)\, Y_j(w)\, I_{[Z_i = Z_j]}}{p_{Z_j}\, H_{Z_j}(w)}\, \big[ r_j(\beta_0,w) - \bar r_i(\beta_0,w) \big]\, \lambda_0(w)\, dw $$

$$ Q_j = \int_0^1 \Big( \frac{\bar r_j^{(1)}(\beta_0,w)}{\bar r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, Y_j(w)\, \big( r_j(\beta_0,w) - \bar r_j(\beta_0,w) \big)\, \lambda_0(w)\, dw. $$

We also define p = lim_{n→∞} n_v/n and

$$ H_z(t) = \mathrm{Prob}(Y(t) = 1 \mid Z = z). $$
The following list of conditions will be assumed to hold throughout this paper. There are a number of redundancies in them, and not all conditions are needed for every result, but in this way we hope to avoid too many technical distractions in the theorems and their proofs. Further discussion is deferred to the Discussion section.
(A) ∫_0^1 λ_0(t) dt < ∞.

(B) P(Y(1) = 1 | Z = z_k) > 0, k = 1, 2, ..., q.

(C) There exists an open subset B, containing β_0, of the Euclidean p-space E^p. r^{(2)}, with elements ∂^2 r(β, t)/∂β_i ∂β_j, exists and is continuous on B for each t ∈ [0,1], uniformly in t, and r(β, t) is bounded away from 0 on B × [0,1]. Σ(β_0) is positive definite.

(D)

$$ E\Big\{ \sup_{B \times [0,1]} \big| Y(t)\, r^{*(j)}(\beta,t) \big| \Big\} < \infty, \quad j = 0, 1, 2 $$

$$ E\Big\{ \sup_{B \times [0,1]} \Big| Y(t) \Big( \frac{r^{*(1)}(\beta,t)}{r^{*}(\beta,t)} \Big)^{\otimes 2j} r^{*}(\beta_0,t) \Big| \Big\} < \infty, \quad j = 1, 2 $$

$$ E\Big\{ \sup_{B \times [0,1]} \Big| Y(t) \Big( \frac{r^{*(2)}(\beta,t)}{r^{*}(\beta,t)} \Big)^{\otimes j} r^{*}(\beta_0,t) \Big| \Big\} < \infty, \quad j = 1, 2 $$

(E) The processes Z_z^{(K)}(t), K = 0, 1, are uniformly bounded in probability, where

$$ Z_z^{(K)}(t) \equiv \sqrt{n_v} \Big\{ \frac{1}{n_v} \sum_{i=1}^{n_v} I_{[Y_i(t)=1,\, Z_i=z]}\, r_i^{(K)}(\beta,t) - E\big( I_{[Y(t)=1,\, Z=z]}\, r^{(K)}(\beta,t) \big) \Big\}, \quad K = 0, 1. $$
Note that the partial derivative conditions on r^{(j)} are satisfied by Ŝ^{(j)}, S^{(j)}, s^{(j)} and the r's as well under regularity conditions. Even though we list condition (E) as one of our assumptions, it is actually fulfilled in quite general settings.
2.3. Outline of The Proofs of Asymptotic Theory
Since we define β̂_EPL as the solution of Û(β, 1) = 0, a Taylor expansion of Û(β, 1) about β_0 evaluated at β̂_EPL gives

$$ n^{1/2}(\hat\beta_{EPL} - \beta_0) = \Big[ -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta^*, 1) \Big]^{-1} n^{-1/2}\, \hat U(\beta_0, 1), \tag{17} $$

where β* is between β̂_EPL and β_0. Therefore, to prove the asymptotic normality of n^{1/2}(β̂_EPL − β_0) it is sufficient to show

$$ n^{-1/2}\, \hat U(\beta_0, 1) \xrightarrow{D} N\big( 0,\ (1-p)\Sigma_1 + p\Sigma_2 \big) \quad \text{as } n \to \infty $$

$$ -\frac{1}{n} \frac{\partial}{\partial\beta^*} \hat U(\beta^*, 1) \xrightarrow{P} \Sigma \quad \text{as } n \to \infty, $$

where Σ, Σ_1, Σ_2 and p are as defined in Section 2.2. The difficulty in proving the weak convergence of n^{-1/2}Û(β_0, 1) is that the second component in (13) does not disappear as in Andersen & Gill (1982) and Prentice & Self (1983). Therefore we cannot write Û(β, t) as a simple summation of local martingales. Martingale theory, which is useful for inference with the partial likelihood, cannot be invoked here with the EPL. In fact the existence of the second component makes it impossible to write (13) as a summation of uncorrelated terms. In Section 4 we give a proof showing that n^{-1/2}Û(β_0, 1) still converges to a normal limit.
Existence of a unique consistent solution to the EPL equations is obtained as a consequence of the Inverse Function Theorem (Rudin, 1964). Roughly speaking, the Inverse Function Theorem states that a continuously differentiable mapping f is invertible in a neighbourhood of any point x at which the linear transformation f^{(1)}(x) is invertible.
β̂_EPL is the value at 0 of the inverse function (n^{-1}Û)^{-1} : E^p → B, where B is an open subset of Euclidean p-space E^p and β_0 is contained in B. In Section 3, (n^{-1}Û)^{-1} is shown to be well defined in an open neighborhood about zero with probability going to one, and (n^{-1}Û)^{-1}(0) is shown to be a consistent estimator of β_0.
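Numerically, the root of the estimating equation is usually located by Newton-Raphson iteration, which mirrors the Taylor expansion (17). The sketch below uses a hypothetical scalar score function as a stand-in for Û(·, 1); it is not the EPL score itself:

```python
def newton_root(score, dscore, beta0, tol=1e-10, max_iter=50):
    """Solve score(beta) = 0 by Newton-Raphson:
    beta <- beta - score(beta)/dscore(beta), the sample analogue of
    inverting a first-order Taylor expansion about the root."""
    beta = beta0
    for _ in range(max_iter):
        step = score(beta) / dscore(beta)
        beta -= step
        if abs(step) < tol:
            return beta
    raise RuntimeError("Newton-Raphson did not converge")

# toy score with root at beta = 2: U(beta) = 4 - beta^2, U'(beta) = -2*beta
root = newton_root(lambda b: 4 - b**2, lambda b: -2.0 * b, beta0=1.0)
```

Starting from beta0 = 1, the iterates move to 2.5, 2.05, and so on, converging quadratically to the root at 2.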
3. CONSISTENCY OF β̂_EPL
This section is divided into two subsections. In Section 3.1, the preliminary results are given. The consistency theorem is stated and proved in Section 3.2. The principal tool in proving consistency of β̂_EPL is the Inverse Function Theorem. There are several forms of the theorem; one form is given in Theorem 2 (Rudin, 1964) for ease of reference. Theorem 1 is an important theorem given by Andersen and Gill (1982). Theorem 5 is the main theorem, stating the existence and uniqueness of a consistent estimator of β. Theorems 3 and 4 and Lemmas 1-4 are the preliminaries for Theorem 5. Note that the stability results in Theorems 3 and 4 and the asymptotic regularity results in Lemma 1 will also be useful in the asymptotic distribution proof. An analogous proof of consistency for a standard likelihood function can be found in Foutz (1977).
3.1. Preliminary Results
The following result, which is due to Andersen & Gill, will be used below.

Theorem 1 Let X, X_1, X_2, ... be iid random elements of D_E[0,1] (endowed with the Skorohod topology), where elements of D_E[0,1] are right continuous functions on [0,1] with left-hand limits taking values in a separable Banach space E (rather than the usual R). Suppose that

$$ E\|X\| = E \sup_{t \in [0,1]} \|X(t)\| < \infty. $$

For each n, let t_1^{(n)} ≤ ... ≤ t_n^{(n)} be fixed time instants in [0,1]. Let Y_i^{(n)} = I_{[t_i^{(n)}, 1]} and suppose that there exists a distribution function y such that, on [0,1],

$$ \Big\| \frac{1}{n} \sum_{i=1}^{n} Y_i^{(n)} - y \Big\| \to 0 \quad \text{as } n \to \infty. $$

Then

$$ \sup_{t \in [0,1]} \Big\| \frac{1}{n} \sum_{i=1}^{n} Y_i^{(n)}(t)\, X_i(t) - y(t)\, E X(t) \Big\| \to 0 \quad \text{a.s.} $$

The proof of Theorem 1 can be found in Andersen and Gill (1982, Theorem III.1).
Theorem 2 (The Inverse Function Theorem) Suppose f is a mapping from an open set O in E^r into E^r, the partial derivatives of f exist and are continuous on O, and the matrix of derivatives f'(θ*) has inverse f'(θ*)^{-1} for some θ* ∈ O. Write

$$ \lambda = \big[ 4\, \| f'(\theta^*)^{-1} \| \big]^{-1}. $$

Use the continuity of the elements of f'(θ*) to fix a neighborhood U_δ of θ* of sufficiently small radius δ > 0 to insure ||f'(θ) − f'(θ*)|| < 2λ whenever θ ∈ U_δ. Then (a) for every θ_1, θ_2 in U_δ,

$$ | f(\theta_1) - f(\theta_2) | \ge 2\lambda\, | \theta_1 - \theta_2 |, $$

and (b) the image set f(U_δ) contains the open neighborhood with radius λδ about f(θ*).
Conclusion (a) insures that f is one-to-one on U_δ and that f^{-1} is well defined on the image set f(U_δ). The theorem is proved in this form in Rudin (1964, p. 193).
Theorem 3

$$ \sup_{B \times [0,1]} \| \hat{\bar r}(\beta, t) - \bar r(\beta, t) \| \to 0 \quad \text{a.s.} $$

$$ \sup_{B \times [0,1]} \| \hat r^*(\beta, t) - r^*(\beta, t) \| \to 0 \quad \text{a.s.} $$

PROOF:

$$ \hat{\bar r}(\beta, t) = \frac{ \frac{1}{n_v} \sum_{i \in V} I_{[Y_i(t)=1,\, Z_i=z]}\, r_i(\beta, t) }{ \frac{1}{n_v} \sum_{i \in V} I_{[Y_i(t)=1,\, Z_i=z]} } \equiv \frac{A}{B}. $$

Since Y, X and Z are left continuous processes with right-hand limits, if we reverse the time axis we can consider I_{[Y_i(t)=1, Z_i=z]} r_i(β, t) and I_{[Y_i(t)=1, Z_i=z]} as random elements of D[0,1], where the elements of D[0,1] take values in the Banach space of continuous functions on B. This is a separable Banach space if we endow this space of functions with the supremum norm. Then by Theorem 1, and noticing that

$$ E\big\{ \sup_{[0,1]} I_{[Y(t)=1,\, Z=z]} \big\} < \infty, $$

we have

$$ \sup_{B \times [0,1]} \| B - b \| \to 0 \quad \text{a.s.}, \tag{18} $$

where b = E(I_{[Y(t)=1, Z=z]}) = P(Y(t) = 1 | Z = z) P(Z = z) = p_z H_z(t). Note also that condition (D) implies

$$ E\big\{ \sup_{B \times [0,1]} I_{[Y(t)=1,\, Z=z]}\, r(\beta, t) \big\} < \infty. $$

Hence

$$ \sup_{B \times [0,1]} \| A - a \| \to 0 \quad \text{a.s.}, \tag{19} $$

where a = E(I_{[Y(t)=1, Z=z]} r(β, t)) = r̄(β, t)\,p_z H_z(t). By Condition (B), we have b > 0. Also noticing that a/b = r̄(β, t), (18) and (19) give

$$ \sup_{B \times [0,1]} \| \hat{\bar r}(\beta, t) - \bar r(\beta, t) \| \to 0. \tag{20} $$

Since r̂*_i(β, t) = r_i(β, t) I_{[i∈V]} + r̄̂_i(β, t) I_{[i∈V̄]}, (20) implies

$$ \sup_{B \times [0,1]} \| \hat r^*(\beta, t) - r^*(\beta, t) \| \to 0. $$

The same argument works for r̂^{*(1)}(β, t) and r̂^{*(2)}(β, t). □
Theorem 4

$$ \sup_{B \times [0,1]} \| \hat S^{(j)} - S^{(j)} \| \xrightarrow{P} 0, \qquad \sup_{B \times [0,1]} \| S^{(j)} - s^{(j)} \| \xrightarrow{P} 0, \qquad \sup_{B \times [0,1]} \| \hat S^{(j)} - s^{(j)} \| \xrightarrow{P} 0, $$

where j = 0, 1, ..., 7.

Lemma 1 s^{(j)}(·, t), j = 0, 1, 2, ..., 7, are continuous functions of β ∈ B, uniformly in t ∈ [0,1]; s^{(j)}, j = 0, 1, 2, ..., 7, are bounded on B × [0,1]; s^{(0)}(β, t) is bounded away from zero on B × [0,1].

Lemma 2 −n^{-1} (∂/∂β) Û(β, 1) →^P Σ(β), uniformly in β ∈ B.

Lemma 3

$$ \frac{1}{n}\, \hat U(\beta_0, 1) \xrightarrow{P} 0 \quad \text{as } n \to \infty. $$

The proofs of Theorem 4 and Lemmas 1, 2 and 3 are given in Appendix A.
3.2. Consistency of β̂_EPL
Theorem 5 (Consistency Theorem) There exists a sequence {β̂_n} such that Û_n(β̂_n) = 0 with probability going to one as n → ∞ and

$$ \hat\beta_n \xrightarrow{P} \beta_0. $$

If {β̃_n} also satisfies the above conditions, then β̃_n = β̂_n with probability going to one as n → ∞.
PROOF: By Condition (C), −Σ(β_0) is negative definite; we may define λ = [4 ||Σ^{-1}(β_0)||]^{-1}. Choose δ sufficiently small that ||Σ(β_0) − Σ(β)|| < λ whenever |β − β_0| < δ and that the uniform convergence of n^{-1}(∂/∂β)Û(β) to −Σ(β) proved in Lemma 2 holds for |β − β_0| < δ. As in Theorem 2, call the neighborhood of β_0 with radius δ U_δ.
Also, since n^{-1}(∂/∂β)Û(β_0) converges to −Σ(β_0) in probability, n^{-1}(∂/∂β)Û(β_0) is negative definite with probability going to one. Let λ_n = [4 ||(n^{-1}(∂/∂β)Û(β_0))^{-1}||]^{-1} whenever n^{-1}(∂/∂β)Û(β_0) is negative definite. Then λ_n →^P λ. Hence we have

$$ \Big\| \frac{1}{n}\frac{\partial}{\partial\beta}\hat U(\beta) - \frac{1}{n}\frac{\partial}{\partial\beta}\hat U(\beta_0) \Big\| \le \Big\| \frac{1}{n}\frac{\partial}{\partial\beta}\hat U(\beta) - (-\Sigma(\beta)) \Big\| + \| \Sigma(\beta) - \Sigma(\beta_0) \| + \Big\| (-\Sigma(\beta_0)) - \frac{1}{n}\frac{\partial}{\partial\beta}\hat U(\beta_0) \Big\| < \lambda < 2\lambda_n $$

with probability going to one as n → ∞ if |β − β_0| < δ. By the Inverse Function Theorem stated in Theorem 2, n^{-1}Û is a one-to-one function from U_δ onto n^{-1}Û(U_δ), and the image set n^{-1}Û(U_δ) contains the open neighborhood of radius λ_n δ about n^{-1}Û(β_0) with probability going to one.
By Lemma 3 we can see that 0 ∈ n^{-1}Û(U_δ) with probability going to 1; indeed |n^{-1}Û(β_0) − 0| < λ_n δ with probability going to one as n → ∞.
Consider the inverse function (n^{-1}Û)^{-1} : n^{-1}Û(U_δ) → U_δ. It is well defined when n^{-1}Û is one-to-one, i.e. with probability going to one. Since 0 ∈ n^{-1}Û(U_δ) with probability going to one, we may conclude: (A) the root (n^{-1}Û)^{-1}(0) of the estimating equation exists in U_δ with probability going to one; (B) since δ may be taken arbitrarily small, (n^{-1}Û)^{-1}(0) converges in probability to β_0; (C) by the one-to-oneness of n^{-1}Û on U_δ, any other sequence {β̃} of roots of Û(β) = 0 necessarily lies outside U_δ with probability going to 1, which implies that such a sequence does not converge to β_0. Therefore β̂_EPL = (n^{-1}Û)^{-1}(0) is a unique consistent estimator of β_0. □
4. ASYMPTOTIC NORMALITY OF β̂_EPL
There are three subsections in this section. Section 4.1 presents the preliminary results, Section 4.2 proves the asymptotic normality of β̂_EPL, and a consistent asymptotic covariance matrix estimator is given in Section 4.3.
The asymptotic normality of β̂_EPL is obtained through three main theorems. In Theorem 6, n^{-1/2}Û(β_0, 1) is shown to be equivalent in probability to a sum of two independent summations. Theorem 7 then develops the asymptotic distribution of n^{-1/2}Û(β_0, 1). Recall that the second task in showing the asymptotic normality of β̂_EPL is to show

$$ -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta, 1) \Big|_{\beta=\beta^*} \xrightarrow{P} \Sigma(\beta_0). $$

This is done in Theorem 8. The main difficulty lies in proving Theorem 6, for which we establish five preliminary lemmas. Lemmas 4-8 are given in the preliminary section, and the proofs of the lemmas can be found in Appendix B.
4.1. Preliminary Results

Lemma 4

Lemma 5

$$ n^{-1/2} \sum_{i=1}^{n} \int_0^1 \Delta\big(\hat{\bar r}_i(\beta_0,w)\big)\, Y_i(w)\, r_i^*(\beta_0,w)\, \lambda_0(w)\, dw \;\stackrel{P}{=}\; -\, n^{-1/2} \sum_{i=1}^{n} \int_0^1 \Delta\big(\bar r_i(\beta_0,w)\big)\, Y_i(w)\, \big( \hat{\bar r}_i(\beta_0,w) - \bar r_i(\beta_0,w) \big)\, \lambda_0(w)\, dw $$

Lemma 6

$$ -\, n^{-1/2} \sum_{i=1}^{n} \int_0^1 \Delta\big(\bar r_i(\beta_0,w)\big)\, Y_i(w)\, \big( \hat{\bar r}_i(\beta_0,w) - \bar r_i(\beta_0,w) \big)\, \lambda_0(w)\, dw \;\stackrel{P}{=}\; -\, n^{-1/2} \sum_{i=1}^{n} \int_0^1 \frac{ \Delta\big(\bar r_i(\beta_0,w)\big)\, Y_i(w) }{ p_{Z_i}\, H_{Z_i}(w) } \times \frac{1}{n_v} \sum_{j=1}^{n_v} I_{[Z_i = Z_j]}\, Y_j(w)\, \big( r_j(\beta_0,w) - \bar r_i(\beta_0,w) \big)\, \lambda_0(w)\, dw $$

Lemma 7

Lemma 8

$$ -\, n^{-1/2} \sum_{j=1}^{n_v} \big( \hat Q_j - Q_j \big) \xrightarrow{P} 0, \tag{21} $$

where Q̂_j and Q_j are as defined in Section 2.2.
4.2. Asymptotic Normality of β̂_EPL
Now we are ready to prove the asymptotic normality of β̂_EPL.

Theorem 6

$$ n^{-1/2}\, \hat U(\beta_0, 1) \;\stackrel{P}{=}\; n^{-1/2} \sum_{i \in \bar V} \int_0^1 \Big( \frac{\bar r_i^{(1)}(\beta_0,w)}{\bar r_i(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_i(w) + n^{-1/2} \sum_{j \in V} \Big\{ \int_0^1 \Big( \frac{r_j^{(1)}(\beta_0,w)}{r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_j(w) + Q_j \Big\}, $$

where Q_j is defined as in Section 2.2.

PROOF OF THEOREM 6: By (13),

$$ n^{-1/2}\, \hat U(\beta_0, 1) = n^{-1/2} \sum_{i=1}^{n} \int_0^1 \Delta\big(\hat r_i^*(\beta_0,s)\big)\, dM_i(s) + n^{-1/2} \sum_{i=1}^{n} \int_0^1 \Delta\big(\hat r_i^*(\beta_0,s)\big)\, r_i^*(\beta_0,s)\, Y_i(s)\, \lambda_0(s)\, ds. \tag{22} $$

Note that

$$ W(t) = n^{-1/2} \sum_{i \in \bar V} \int_0^t \Big( \frac{\hat{\bar r}_i^{(1)}(\beta_0,w)}{\hat{\bar r}_i(\beta_0,w)} - \frac{\bar r_i^{(1)}(\beta_0,w)}{\bar r_i(\beta_0,w)} \Big)\, dM_i(w) $$

is a local martingale with covariance at t = 1 satisfying

$$ \langle W \rangle(1) \xrightarrow{P} 0 $$

by Theorem 3, the fact that (1/n) Σ_{i∈V̄} I_{[Z_i = z_k]} Y_i(w) converges to (1−p) p_{z_k} H_{z_k}(w) uniformly in w, and Condition (A). Hence W(·) →^P 0 by the Lenglart inequality. This result, together with Theorem 4, shows that the first term of (22) equals

$$ n^{-1/2} \sum_{i=1}^{n} \int_0^1 \Big( \frac{r_i^{*(1)}(\beta_0,w)}{r_i^{*}(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_i(w) $$

in probability. The second term in (22) is handled by Lemma 6. Furthermore, by Lemma 8, the Theorem holds. □
Theorem 7

$$ n^{-1/2}\, \hat U(\beta_0, 1) \xrightarrow{D} N\big( 0,\ (1-p)\Sigma_1 + p\Sigma_2 \big) \quad \text{as } n \to \infty, $$

where the variance-covariance matrices Σ_1 and Σ_2 are as defined in (15) and (16).

PROOF: From Theorem 6, we have

$$ n^{-1/2}\, \hat U(\beta_0, 1) \;\stackrel{P}{=}\; n^{-1/2} \sum_{i \in \bar V} \int_0^1 \Big( \frac{\bar r_i^{(1)}(\beta_0,w)}{\bar r_i(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_i(w) + n^{-1/2} \sum_{j \in V} \Big\{ \int_0^1 \Big( \frac{r_j^{(1)}(\beta_0,w)}{r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_j(w) + Q_j \Big\}. \tag{23} $$

The first term and the second term on the right-hand side are independent. By the martingale central limit theorem, the first term in (23) converges weakly to a continuous normal process. The covariance process of this normal process evaluated at t = 1 is (1−p)Σ_1, i.e.

$$ \langle W \rangle(t) = \int_0^t \frac{1}{n} \sum_{i \in \bar V} \Big( \frac{\bar r_i^{(1)}(\beta_0,w)}{\bar r_i(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)^{\otimes 2} Y_i(w)\, \bar r_i(\beta_0,w)\, \lambda_0(w)\, dw, $$

$$ \langle W \rangle(1) \xrightarrow{P} (1-p) \int_0^1 \Big[ E\Big( Y_i(w)\, \frac{\big(\bar r_i^{(1)}(\beta_0,w)\big)^{\otimes 2}}{\bar r_i(\beta_0,w)} \Big) - \frac{\big(s^{(1)}(\beta_0,w)\big)^{\otimes 2}}{s^{(0)}(\beta_0,w)} \Big]\, \lambda_0(w)\, dw = (1-p)\Sigma_1. $$

The second term in n^{-1/2}Û(β_0, 1) is a summation of iid terms from subjects in the validation sample. By the central limit theorem, it converges to a normal distribution with mean

$$ E\Big\{ \int_0^1 \Big( \frac{r_j^{(1)}(\beta_0,w)}{r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_j(w) + Q_j \Big\} \tag{24} $$

and covariance

$$ E\Big\{ \int_0^1 \Big( \frac{r_j^{(1)}(\beta_0,w)}{r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, dM_j(w) + Q_j \Big\}^{\otimes 2} = \Sigma_2. \tag{25} $$

The first term in the mean expression (24) is a local martingale, and the expected value of a local martingale is zero. The second term is also zero, since

$$ E\{ Q_j \} = E \int_0^1 \Big( \frac{\bar r_j^{(1)}(\beta_0,w)}{\bar r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, Y_j(w)\, \big( r_j(\beta_0,w) - \bar r_j(\beta_0,w) \big)\, \lambda_0(w)\, dw $$

$$ = \int_0^1 E\Big\{ \Big( \frac{\bar r_j^{(1)}(\beta_0,w)}{\bar r_j(\beta_0,w)} - \frac{s^{(1)}(\beta_0,w)}{s^{(0)}(\beta_0,w)} \Big)\, E\big[ Y_j(w)\, r_j(\beta_0,w) - Y_j(w)\, \bar r_j(\beta_0,w) \,\big|\, Y_j(w) = 1,\, Z_j \big] \Big\}\, \lambda_0(w)\, dw = 0. $$

Since the two terms in n^{-1/2}Û(β_0, 1) come from the validation and non-validation sets respectively, the limiting distribution of n^{-1/2}Û(β_0, 1) is normal with mean zero and covariance matrix (1−p)Σ_1 + pΣ_2, where Σ_1 and Σ_2 are as defined in (15) and (16). □
The second step in proving the asymptotic normality of n^{1/2}(β̂_EPL − β_0) is to show that −n^{-1}(∂/∂β)Û(β, ·)|_{β=β*} converges to some positive definite quantity. The theorem below gives the limit of −n^{-1}(∂/∂β)Û(β, ·)|_{β=β*}.

Theorem 8

$$ -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta, 1) \Big|_{\beta=\beta^*} \xrightarrow{P} \Sigma(\beta_0) \quad \text{as } n \to \infty, $$

where β* lies between β̂_EPL and β_0, and Σ(β_0) is positive definite.

PROOF: Recall from Lemma 2 that −n^{-1}(∂/∂β)Û(β, 1) →^P Σ(β) for any β ∈ B and that Σ(β_0) is positive definite. From Lemma 1, Σ(β) is continuous in β. Now

$$ \Big| -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta, 1) \Big|_{\beta=\beta^*} - \Sigma(\beta_0) \Big| \le \Big| -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta, 1) \Big|_{\beta=\beta^*} - \Sigma(\beta^*) \Big| + \big| \Sigma(\beta^*) - \Sigma(\beta_0) \big|. $$

The first term on the right-hand side of the inequality converges to zero in probability as n → ∞ by Lemma 2. The second term converges to zero as well, by the continuity of Σ, the fact that β* is between β̂_EPL and β_0, and the consistency of β̂_EPL for β_0 (Theorem 5). Therefore

$$ -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta, 1) \Big|_{\beta=\beta^*} \xrightarrow{P} \Sigma(\beta_0) \quad \text{as } n \to \infty, $$

where Σ(β_0) is positive definite. □
Theorem 9 (Asymptotic Normality Theorem)

$$ \sqrt{n}\, (\hat\beta_{EPL} - \beta_0) \xrightarrow{D} N_p\big( 0,\ \Sigma_{EPL}(\beta_0) \big) \quad \text{as } n \to \infty, $$

where Σ_EPL(β_0) = Σ^{-1}(β_0) ((1−p)Σ_1 + pΣ_2) Σ^{-1}(β_0).

PROOF:

$$ n^{1/2}(\hat\beta_{EPL} - \beta_0) = \Big[ -\frac{1}{n} \frac{\partial}{\partial\beta} \hat U(\beta^*, 1) \Big]^{-1} n^{-1/2}\, \hat U(\beta_0, 1); $$

by Theorems 7 and 8, the result is straightforward. □
4.3. Asymptotic Variance Estimator
As in the ordinary partial likelihood setting, the variance estimator needs an estimator for the baseline hazard function (or the cumulative hazard Λ_0(t) = ∫_0^t λ_0(w) dw). A continuous estimator for the underlying cumulative hazard, obtained by linear interpolation between failure times of

$$ \hat\Lambda_0(t) = \sum_{T_i \le t} \frac{\delta_i}{\sum_{j \in \mathcal{R}(T_i)} e^{\hat\beta x_j}}, $$

was suggested by Breslow (1972, 1974) for the ordinary partial likelihood with relative risk r = e^{βx}, where β̂ is the partial likelihood estimate. In this work, we propose to estimate dΛ_0(t) by

$$ d\hat\Lambda_0(t) = \frac{ \sum_{i=1}^{n} dN_i(t) }{ \sum_{i=1}^{n} Y_i(t)\, \hat r_i^*(\hat\beta_{EPL}, t) } = \frac{1}{\hat S^{(0)}(\hat\beta_{EPL}, t)} \cdot \frac{1}{n} \sum_{i=1}^{n} dN_i(t). \tag{26} $$
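The increments (26) are of Breslow type: at each observed failure time, the jump in the estimated cumulative hazard is the number of failures divided by the risk-set total of estimated relative risks. A minimal sketch, with hypothetical data and fixed stand-in values for r̂*_i(β̂_EPL, ·):

```python
import numpy as np

def breslow_increments(S, delta, risk_scores):
    """Jumps dLambda0_hat(t) = sum_i dN_i(t) / sum_i Y_i(t) r_i at the
    distinct observed failure times, as in (26); risk_scores holds the
    (here time-constant) estimated relative risks r-hat*_i."""
    fail_times = np.unique(S[delta == 1])
    jumps = np.array([
        np.sum((S == t) & (delta == 1)) / np.sum((S >= t) * risk_scores)
        for t in fail_times
    ])
    return fail_times, jumps

S = np.array([1.0, 2.0, 2.0, 3.0])       # observed times
delta = np.array([1, 1, 0, 1])           # failure indicators
r_hat = np.array([1.0, 2.0, 1.0, 1.0])   # hypothetical estimated relative risks
times, dLam = breslow_increments(S, delta, r_hat)
```

Summing the jumps up to t gives the estimated cumulative hazard; linear interpolation between failure times yields a continuous version.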
Estimation of Σ_EPL then proceeds by empirically estimating each component in Σ_1, Σ_2 and Σ. That is, the expectations in Σ_1, Σ_2 and Σ are estimated by averages over the observed data, r* is estimated by r̂*, and the s^{(j)} are estimated by the Ŝ^{(j)}. We define

$$ \hat\Sigma(\beta) = \int_0^1 \Big[ \frac{\hat S^{(4)}(\beta,w)}{\hat S^{(0)}(\beta,w)} - \Big( \frac{\hat S^{(1)}(\beta,w)}{\hat S^{(0)}(\beta,w)} \Big)^{\otimes 2} \Big]\, \hat S^{(0)}(\beta,w)\, d\hat\Lambda_0(w), \tag{27} $$

with Σ̂_1(β) and Σ̂_2(β) defined analogously. Then the estimator of Σ_EPL(β_0) is given by

$$ \hat\Sigma_{EPL}(\hat\beta_{EPL}) = \hat\Sigma^{-1}(\hat\beta_{EPL}) \big( (1-\hat p)\, \hat\Sigma_1(\hat\beta_{EPL}) + \hat p\, \hat\Sigma_2(\hat\beta_{EPL}) \big)\, \hat\Sigma^{-1}(\hat\beta_{EPL}), \tag{28} $$

where p̂ = n_v/n. The following theorem shows that Σ̂_EPL(β̂_EPL) is a consistent variance estimator of Σ_EPL(β_0).
Theorem 10

$$ \hat\Sigma_{EPL}(\hat\beta_{EPL}) \xrightarrow{P} \Sigma_{EPL}(\beta_0) \quad \text{as } n \to \infty. $$

The proof of the theorem is provided in Appendix C.
5. ASYMPTOTIC THEORY WITH STRATIFIED VALIDATION SET
Suppose that the validation set is not a simple random sample of the entire study sample, as we have assumed thus far in this paper, but rather a random sample stratified on the auxiliary Z. Let d_i, i = 1, ..., q, with 0 ≤ d_i ≤ 1, denote the prespecified probability of entering the validation set for a subject with auxiliary value z_i; i.e. 100 × d_i percent of the subjects who have auxiliary value z_i will be randomly chosen to be in the validation set, and the other 100 × (1 − d_i) percent of the subjects will be in the non-validation set. Hence the validation sample size n_v in this case is

$$ n_v = \sum_{i=1}^{q} d_i \times p_i \times n, $$

where n is the entire sample size and p_i = Prob(Z = z_i).
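The stratified design is straightforward to simulate: each subject with auxiliary value z_i enters the validation set independently with probability d_i, so the realized validation size concentrates around n_v = Σ_i d_i p_i n. A small sketch with hypothetical strata probabilities, which also computes the empirical estimates d̂_i and p̂_i used later in the variance estimator (29):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
d = np.array([0.2, 0.8])   # prespecified validation probabilities d_i for z = 0, 1
p = np.array([0.7, 0.3])   # P(Z = z_i)

Z = rng.choice([0, 1], size=n, p=p)
in_validation = rng.random(n) < d[Z]          # sampling stratified on Z

expected_nv = n * np.sum(d * p)               # n_v = sum_i d_i * p_i * n
p_hat = np.array([np.mean(Z == z) for z in (0, 1)])               # empirical p_i
d_hat = np.array([in_validation[Z == z].mean() for z in (0, 1)])  # empirical d_i
```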
Let β̃_EPL denote the EPL estimator obtained from the above stratified setup. Since the subjects in both the validation set and the non-validation set are still independent and identically distributed, the proofs of the asymptotic theory in Sections 3 and 4 remain valid here, except that the asymptotic variance of β̃_EPL differs from that of β̂_EPL. The large-sample theory for β̃_EPL is given by the following theorems.

Theorem 11 β̃_EPL is a consistent estimator of β.

PROOF: See the proof of Theorem 5. □

Theorem 12

$$ \sqrt{n}\, (\tilde\beta_{EPL} - \beta) \xrightarrow{D} N_p(0, \tilde\Sigma_{EPL}) \quad \text{as } n \to \infty, $$

where

$$ \tilde\Sigma_{EPL} = \Sigma^{-1} \Big( \Big( 1 - \sum_{i=1}^{q} d_i\, p_i \Big) \Sigma_1 + \sum_{i=1}^{q} d_i\, p_i\, \Sigma_2 \Big) (\Sigma^{-1})^T, $$

and Σ, Σ_1 and Q_j are the same as defined in Section 2.2.

PROOF: The proof of this theorem is analogous to the proof of Theorem 9. □

The proposed estimator of the asymptotic variance Σ̃_EPL(β) is given by
$$ \tilde\Sigma_{EPL}(\tilde\beta_{EPL}) = \tilde\Sigma(\tilde\beta_{EPL})^{-1} \Big( \Big( 1 - \sum_{i=1}^{q} \hat d_i \times \hat p_i \Big) \tilde\Sigma_1(\tilde\beta_{EPL}) + \Big( \sum_{i=1}^{q} \hat d_i \times \hat p_i \Big) \tilde\Sigma_2(\tilde\beta_{EPL}) \Big) \big( \tilde\Sigma(\tilde\beta_{EPL})^{-1} \big)^T, \tag{29} $$

where d̂_i and p̂_i are the empirical estimates from the sample; i.e. p̂_i is the proportion of subjects in the entire sample that have the value z_i, and d̂_i is the number of subjects in the validation set with z_i divided by the number of subjects in the entire sample who have z_i. The following theorem shows that Σ̃_EPL(β̃_EPL) is a consistent variance estimator of Σ̃_EPL(β).
Theorem 13
PROOF:
•
The proof of this theorem is analogous to the proof of Theorem 10.
0
6. DISCUSSION
The finite interval condition (A) is easy to justify in practice. The infinite interval case cannot be derived from the finite interval case by a simple mapping, because $\int_0^\infty \lambda_0(t)\,dt = \infty$ in general. Some extra conditions have to be imposed to ensure that the contribution to the test statistics from data on $(\tau, \infty)$ can be made arbitrarily small, uniformly in $n$, by taking $\tau$ large enough. Andersen and Gill (1982) outlined the proof of such an extension when the covariate is bounded, in the case of the ordinary partial likelihood with complete covariate data (Theorem 4.2 in Andersen and Gill, 1982). A reference for the tightness condition on $[\tau, \infty)$ can be found in Fleming and Harrington (1991).
Recall that the empirical estimation of the induced relative risk function $r$ requires, for a subject with auxiliary $Z$ at time $t$ in the non-validation set, that there exist subjects in the validation set still at risk and sharing the same auxiliary $Z$. Hence condition (B) is a natural assumption ensuring that the EPL can be carried out in large samples.
The regularity condition (C) on boundedness and continuity allows the interchange of orders of
various limiting operations.
Condition (D) ensures the asymptotic stability of $S^{(j)}$, $j = 0, 1, \ldots, 7$. Note that the uniformity of convergence in $\beta$ is not needed in the asymptotic normality proofs; however, the consistency proof does need it. Prentice and Self (1983) considered a general form of the relative risk function $r$, rather than $e^{\beta X}$ as in Andersen and Gill. The asymptotic stability assumption made in this work is no stronger than that in Prentice and Self (1983). This makes sense because, after all, the relative risk $r^*$ in this work is of a general form.
Finally, Condition (E), the uniform boundedness of the processes $z_k(t)$, is needed to ensure the asymptotic normality. By modern empirical process theory this assumption is satisfied in some general setups; for example, if $r(\beta, x) = e^{\beta x}$ and $x$ is time independent, then it can be shown that Condition (E) holds. A survey of these results can be found in Wellner [10].
Appendix A. Proof of Preliminary Results for Consistency

PROOF OF THEOREM 4: The proof of this theorem is only demonstrated for $j = 0$; for the other cases of $j$, the proofs are analogous.
$$ \hat S^{(0)} - S^{(0)} = \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\hat r_i(\beta, t) - \frac{1}{n}\sum_{i=1}^{n} Y_i(t) r_i(\beta, t) = \frac{1}{n}\sum_{i=1}^{n} Y_i(t)\left(\hat r_i(\beta, t) - r_i(\beta, t)\right) $$
$$ = \frac{1}{n}\sum_{i \notin V} Y_i(t)\left(\hat r_i(\beta, t) - r_i(\beta, t)\right) = \sum_{k=1}^{q}\left(\hat r_{z_k}(\beta, t) - r_{z_k}(\beta, t)\right)\frac{1}{n}\sum_{i \notin V} Y_i(t)\, I_{[Z_i = z_k]}. $$
By Theorem 3,
$$ \left\| \hat r_{z_k}(\beta, t) - r_{z_k}(\beta, t) \right\|_{B \times [0,1]} \xrightarrow{p} 0 \quad \text{for any } k = 1, \ldots, q. $$
Since each weight $\frac{1}{n}\sum_{i \notin V} Y_i(t) I_{[Z_i = z_k]}$ is bounded by 1 and $q$ is finite, it follows that
$$ \sup_{B \times [0,1]} \left\| \hat S^{(0)} - S^{(0)} \right\| \xrightarrow{p} 0. $$
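The grouping step in this derivation — collapsing a sum over subjects into a sum over the $q$ auxiliary values, since $\hat r_i$ and $r_i$ depend on $i$ only through $Z_i$ — can be checked numerically. All values below are arbitrary illustrations.

```python
# Grouping identity used in the proof of Theorem 4: when r_hat_i and r_i
# depend on subject i only through the auxiliary value Z_i, a subject-level
# sum collapses into a sum over the q distinct values of Z.
z = [0, 1, 1, 2, 0, 2, 1]                 # auxiliary value Z_i per subject
y = [1, 1, 0, 1, 1, 1, 0]                 # at-risk indicators Y_i(t)
r_hat = {0: 1.3, 1: 0.8, 2: 2.1}          # stand-ins for r_hat_{z_k}(beta, t)
r_true = {0: 1.0, 1: 1.0, 2: 2.0}         # stand-ins for r_{z_k}(beta, t)
n = len(z)

# Subject-level form: (1/n) * sum_i Y_i (r_hat_i - r_i).
lhs = sum(y[i] * (r_hat[z[i]] - r_true[z[i]]) for i in range(n)) / n
# Grouped form: sum_k (r_hat_k - r_k) * (1/n) * sum_i Y_i 1[Z_i = z_k].
rhs = sum((r_hat[k] - r_true[k]) * sum(y[i] for i in range(n) if z[i] == k) / n
          for k in r_hat)
print(lhs, rhs)  # identical up to floating-point rounding
```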
The same argument works for $\hat S^{(j)}$, $j = 1, 2, \ldots, 6$.

The second convergence in the Lemma can be shown using the same type of argument as in Theorem 3. As in Theorem 3, we can treat $Y_i(t) r_i(\beta, t)$ as random elements of $D[0,1]$ (after reversing the time axis). By Condition (D) and Theorem 1, we can show that
$$ \sup_{B \times [0,1]} \left\| S^{(j)} - s^{(j)} \right\| \xrightarrow{p} 0 \quad \text{for } j = 0, 1, 2, \ldots, 7. $$
The third expression of the theorem is an immediate consequence of the first two results.
PROOF OF LEMMA 1: Recall that
$$ s^{(0)} = E\left(Y(t)\, r^*(X(t), \beta)\right). $$
Since $r^*(X(t), \beta)$ is continuous in $\beta$ on $B$, for any $\beta \to \beta^*$ in $B$ we have
$$ \sup_t \left|Y(t) r^*(X(t), \beta)\right| \to \sup_t \left|Y(t) r^*(X(t), \beta^*)\right| \quad \text{as } \beta \to \beta^*. $$
From Condition (D),
$$ E\left(\sup_{[0,1] \times B} Y(t) r^*(X(t), \beta)\right) < \infty. $$
Hence by the Dominated Convergence Theorem (Chung, 1977), $s^{(0)}(\beta) \to s^{(0)}(\beta^*)$ as $\beta \to \beta^*$, so $s^{(0)}$ is a continuous function of $\beta \in B$, uniformly in $t \in [0,1]$. The same argument applies to $s^{(j)}$, $j = 1, 2, \ldots, 7$.

Also by Condition (D), $s^{(j)}$, $j = 0, 1, 2, \ldots, 7$, are bounded on $B \times [0,1]$.

Since $r(\beta, t)$ is bounded away from zero, we have
$$ r_z(\beta, t) = E\left(r(\beta, t) \mid Y(t) = 1, Z = z\right) = E\left(Y(t) r(\beta, t) \mid Z = z\right) \big/ P\left(Y(t) = 1 \mid Z = z\right). $$
By Conditions (B) and (C), $P(Y(t) = 1 \mid Z = z) > 0$ and $r(\beta, t)$ is bounded away from zero, so $E(Y(t) r(\beta, t) \mid Z = z)$ is bounded away from zero. Hence $s^{(0)} = E\left(E(Y(t) r(\beta, t) \mid Z = z)\right)$ is bounded away from zero. □
PROOF OF LEMMA 2: Define
$$ A(\beta, 1) = \int_0^1 \frac{1}{n}\sum_{i=1}^{n}\left[\frac{\hat r_i^{(2)}(\beta, t)}{\hat r_i(\beta, t)} - \left(\frac{\hat r_i^{(1)}(\beta, t)}{\hat r_i(\beta, t)}\right)^{\otimes 2} - \frac{\hat S^{(2)}(\beta, t)}{\hat S^{(0)}(\beta, t)} + \left(\frac{\hat S^{(1)}(\beta, t)}{\hat S^{(0)}(\beta, t)}\right)^{\otimes 2}\right] \lambda_i(t)\, dt. $$
Then using (9) one can write
$$ \frac{1}{n}\frac{\partial}{\partial \beta} U(\beta, 1) - A(\beta, 1) = \frac{1}{n}\sum_{i=1}^{n}\int_0^1 \left[\frac{\hat r_i^{(2)}(\beta, t)}{\hat r_i(\beta, t)} - \left(\frac{\hat r_i^{(1)}(\beta, t)}{\hat r_i(\beta, t)}\right)^{\otimes 2} - \frac{\hat S^{(2)}(\beta, t)}{\hat S^{(0)}(\beta, t)} + \left(\frac{\hat S^{(1)}(\beta, t)}{\hat S^{(0)}(\beta, t)}\right)^{\otimes 2}\right] dM_i(t). $$
Since (C) ensures that $\hat r_i^{(j)}(\beta, t)$ and $\hat S^{(j)}(\beta, t)$ are predictable and locally bounded for $\beta \in B$, $\frac{1}{n}\frac{\partial}{\partial \beta} U(\beta, 1) - A(\beta, 1)$ is a local square integrable martingale with variance process $B(\beta, 1)$ given by
$$ B(\beta, 1) = \int_0^1 \frac{1}{n^2}\sum_{i=1}^{n}\left[\frac{\hat r_i^{(2)}(\beta, t)}{\hat r_i(\beta, t)} - \left(\frac{\hat r_i^{(1)}(\beta, t)}{\hat r_i(\beta, t)}\right)^{\otimes 2} - \frac{\hat S^{(2)}(\beta, t)}{\hat S^{(0)}(\beta, t)} + \left(\frac{\hat S^{(1)}(\beta, t)}{\hat S^{(0)}(\beta, t)}\right)^{\otimes 2}\right]^2 \lambda_i(t)\, dt. $$
We would like to show that $B(\beta, 1) \xrightarrow{p} 0$, so that $\frac{1}{n}\frac{\partial}{\partial \beta} U(\beta, 1)$ and $A(\beta, 1)$ would be shown to converge in probability to the same limit. Expanding the squared term in the above expression gives
$$ B(\beta, 1) = \int_0^1 \frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{\hat r_i^{(2)}(\beta, t)}{\hat r_i(\beta, t)}\right)^{\otimes 2} r_i^*(\beta_0, t)\lambda_0(t)\, dt + \int_0^1 \frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{\hat r_i^{(1)}(\beta, t)}{\hat r_i(\beta, t)}\right)^{\otimes 4} r_i^*(\beta_0, t)\lambda_0(t)\, dt $$
$$ + \int_0^1 \frac{1}{n}\left(\frac{\hat S^{(2)}(\beta, t)}{\hat S^{(0)}(\beta, t)}\right)^{\otimes 2} S^{(0)}(\beta_0, t)\lambda_0(t)\, dt + \int_0^1 \frac{1}{n}\left(\frac{\hat S^{(1)}(\beta, t)}{\hat S^{(0)}(\beta, t)}\right)^{\otimes 4} S^{(0)}(\beta_0, t)\lambda_0(t)\, dt $$
$$ + \text{the interaction terms of the four components above.} $$
The final two integrals in this expression converge in probability to zero uniformly in $\beta$, in view of Lemmas 1 and 3 on $S^{(j)}$, along with the finite interval Condition (A). The interaction terms will converge to zero if the first two terms do, upon applying the Schwarz inequality and the convergence just noted for the last two terms. The first two terms above are $\frac{1}{n}\hat S^{(5)}$ and $\frac{1}{n}\hat S^{(6)}$ respectively; by Theorem 4, Lemma 1 and Condition (A), the first two terms converge to zero uniformly in $B$. It now follows that $\| B(\beta, 1) \|_B \xrightarrow{p} 0$, so that by an inequality of Lenglart (Appendix I of Andersen and Gill, 1982), $\frac{1}{n}\frac{\partial}{\partial \beta} U(\beta, 1)$ converges in probability to the same limit as does $A(\beta, 1)$, uniformly in $\beta \in B$.

From Lemmas 1 and 2 and Condition (A), $A(\beta, 1)$ converges in probability to $\Sigma(\beta)$, uniformly in $\beta \in B$. Since $s^{(3)}(\beta_0, t) = s^{(2)}(\beta_0, t)$, $\Sigma(\beta)$ at $\beta = \beta_0$ equals
$$ \int_0^1 \left[\frac{s^{(4)}(\beta_0, t)}{s^{(0)}(\beta_0, t)} - \left(\frac{s^{(1)}(\beta_0, t)}{s^{(0)}(\beta_0, t)}\right)^{\otimes 2}\right] s^{(0)}(\beta_0, t)\lambda_0(t)\, dt = \Sigma(\beta_0), $$
which is positive definite by Condition (C). □
PROOF OF LEMMA 3: Define
$$ A(\beta_0, 1) = \frac{1}{n}\sum_{i=1}^{n}\int_0^1 \left(\frac{\hat r_i^{(1)}(\beta_0, t)}{\hat r_i(\beta_0, t)} - \frac{\hat S^{(1)}(\beta_0, t)}{\hat S^{(0)}(\beta_0, t)}\right) Y_i(t)\, r_i^*(\beta_0, t)\lambda_0(t)\, dt; $$
then by a similar argument as in Lemma 2, $\frac{1}{n} U(\beta_0, 1) - A(\beta_0, 1)$ is a local square integrable martingale with covariance process $B(\beta_0, 1)$, each term of which converges to zero by Theorem 4, Lemma 1 and Condition (A). Hence $\frac{1}{n} U(\beta_0, 1)$ converges to the same limit as $A(\beta_0, 1)$. Observing that $A(\beta_0, 1)$ converges in probability to zero, we have
$$ \frac{1}{n} U(\beta_0, 1) \xrightarrow{p} 0. \qquad \Box $$
Appendix B. Proof of Preliminary Results for Normality

PROOF OF LEMMA 4: The first expression
$$ = n^{-1/2}\sum_{i \notin V}\int_0^1 \left(\hat r_i^{(K)}(\beta_0, w) - r_i^{(K)}(\beta_0, w)\right)^2 Y_i(w)\, r_i(\beta_0, w)\lambda_0(w)\, dw $$
$$ = \sqrt{n}\int_0^1 \sum_{k=1}^{q}\left(\hat r_{z_k}^{(K)}(\beta_0, w) - r_{z_k}^{(K)}(\beta_0, w)\right)^2 r_{z_k}(\beta_0, w)\;\frac{1}{n}\sum_{i \notin V} I_{[Z_i = z_k]} Y_i(w)\,\lambda_0(w)\, dw, $$
where $q$ is the number of index values of $Z$. The first term in the above expression is $O_p(1)$ by Condition (E); the second term converges to zero uniformly in $w$ by Theorem 3; Condition (D) implies that $r_{z_k}(\beta_0, w)$ is bounded uniformly in $w$; and the last term is bounded uniformly in $w$. Finally, with Condition (A), the first expression converges to zero in probability.
The second expression in the Lemma
$$ = \sqrt{n}\int_0^1 \left(\hat S^{(K)}(\beta_0, w) - S^{(K)}(\beta_0, w)\right)\left(\hat S^{(K)}(\beta_0, w) - s^{(K)}(\beta_0, w)\right) S^{(0)}(\beta_0, w)\lambda_0(w)\, dw $$
$$ = \sqrt{n}\int_0^1 \left[\frac{1}{n}\sum_{i \notin V} Y_i(w)\left(\hat r_i^{(K)}(\beta_0, w) - r_i^{(K)}(\beta_0, w)\right)\right]\left(\hat S^{(K)}(\beta_0, w) - s^{(K)}(\beta_0, w)\right) S^{(0)}(\beta_0, w)\lambda_0(w)\, dw $$
$$ = \sqrt{n}\int_0^1 \left[\sum_{l=1}^{q}\left(\hat r_{z_l}^{(K)}(\beta_0, w) - r_{z_l}^{(K)}(\beta_0, w)\right)\frac{1}{n}\sum_{i \notin V} I_{[Z_i = z_l]} Y_i(w)\right]\left(\hat S^{(K)}(\beta_0, w) - s^{(K)}(\beta_0, w)\right) S^{(0)}(\beta_0, w)\lambda_0(w)\, dw $$
$$ \xrightarrow{p} 0 $$
by Theorem 4, Lemma 1, Condition (A) and a similar argument as was used above for the first expression. □
PROOF OF LEMMA 5: By Taylor expansion,
$$ f(x, y) = f(x_0, y_0) + \frac{\partial f(x, y)}{\partial x}\Big|_{(x_0, y_0)}(x - x_0) + \frac{\partial f(x, y)}{\partial y}\Big|_{(x_0, y_0)}(y - y_0) + O\left((x - x_0)^2 + (y - y_0)^2\right) $$
if $\frac{\partial}{\partial x} f$, $\frac{\partial}{\partial y} f$ and $\frac{\partial^2}{\partial x \partial y} f$ are finite. We have
$$ \frac{\hat r^{*(1)}}{\hat r^*} = \frac{r^{*(1)}}{r^*} + \frac{\hat r^{*(1)} - r^{*(1)}}{r^*} - \frac{r^{*(1)}}{r^{*2}}\left(\hat r^* - r^*\right) + O\left[\left(\hat r^* - r^*\right)^2 + \left(\hat r^{*(1)} - r^{*(1)}\right)^2\right] $$
and
$$ \frac{\hat S^{(1)}}{\hat S^{(0)}} = \frac{s^{(1)}}{s^{(0)}} + \frac{\hat S^{(1)} - s^{(1)}}{s^{(0)}} - \frac{s^{(1)}}{s^{(0)2}}\left(\hat S^{(0)} - s^{(0)}\right) + O\left[\left(\hat S^{(0)} - s^{(0)}\right)^2 + \left(\hat S^{(1)} - s^{(1)}\right)^2\right]. $$
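The first-order ratio expansion used here can be sanity-checked numerically: after subtracting the linear terms, the remainder should shrink quadratically in the perturbation. The numbers below are arbitrary.

```python
# Numerical sanity check of the first-order expansion of a ratio u/v around
# (u0, v0):  u/v is approximately u0/v0 + (u - u0)/v0 - (u0/v0**2)*(v - v0),
# with a remainder of order (u - u0)**2 + (v - v0)**2.
u0, v0 = 2.0, 5.0

def remainder(eps):
    u, v = u0 + eps, v0 + eps
    exact = u / v
    linear = u0 / v0 + (u - u0) / v0 - (u0 / v0 ** 2) * (v - v0)
    return abs(exact - linear)

# Halving the perturbation should shrink the remainder by roughly a factor
# of 4, confirming the quadratic order of the remainder term.
r1, r2 = remainder(1e-2), remainder(5e-3)
print(r1 / r2)
```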
With the above expansion, the left side of (21) can be expressed as
$$ -n^{-1/2}\sum_{i \notin V}\int_0^1 \left(\frac{r_i^{(1)}(\beta_0, w)}{r_i(\beta_0, w)} - \frac{S^{(1)}(\beta_0, w)}{S^{(0)}(\beta_0, w)}\right) Y_i(w)\left(\hat r_i(\beta_0, w) - r_i(\beta_0, w)\right)\lambda_0(w)\, dw + \text{the remainder}. $$
From Lemma 4, the remainder goes to zero in probability. Therefore the result holds. □
PROOF OF LEMMA 6: Notice that $\hat r_i(\beta_0, w) - r_i(\beta_0, w) = 0$ for $i \in V$; the left side of (21) is
$$ \text{left} = -n^{-1/2}\sum_{i \notin V}\int_0^1 \left(\frac{r_i^{(1)}(\beta_0, w)}{r_i(\beta_0, w)} - \frac{S^{(1)}(\beta_0, w)}{S^{(0)}(\beta_0, w)}\right) Y_i(w)\left(r_i(\beta_0, w) - \hat r_i(\beta_0, w)\right)\lambda_0(w)\, dw, $$
while the right hand side of (21) is
$$ \text{right} = -n^{-1/2}\sum_{i \notin V}\int_0^1 \left(\frac{r_i^{(1)}(\beta_0, w)}{r_i(\beta_0, w)} - \frac{S^{(1)}(\beta_0, w)}{S^{(0)}(\beta_0, w)}\right) Y_i(w)\,\frac{\frac{1}{n}\sum_{j=1}^{n} I_{[Z_j = Z_i]} Y_j(w)}{p_{Z_i}\, P(Y(w) = 1 \mid Z = Z_i)}\left(r_i(\beta_0, w) - \hat r_i(\beta_0, w)\right)\lambda_0(w)\, dw. $$
Therefore the difference between the two sides converges to zero in probability, by noting (18) and Condition (E). □
PROOF OF LEMMA 7: Since $M(w)$ is a local martingale and $\frac{\hat S^{(1)}}{\hat S^{(0)}} - \frac{s^{(1)}}{s^{(0)}}$ is predictable,
$$ W = n^{-1/2}\int_0^1 \left(\frac{\hat S^{(1)}(\beta_0, w)}{\hat S^{(0)}(\beta_0, w)} - \frac{s^{(1)}(\beta_0, w)}{s^{(0)}(\beta_0, w)}\right) dM(w) $$
is a martingale with covariance process $\langle W \rangle(1)$:
$$ \langle W \rangle(1) = \int_0^1 \frac{1}{n}\left[\frac{\hat S^{(1)}(\beta_0, w)}{\hat S^{(0)}(\beta_0, w)} - \frac{s^{(1)}(\beta_0, w)}{s^{(0)}(\beta_0, w)}\right]^2 d\langle M, M \rangle(w) = \int_0^1 \left(\frac{\hat S^{(1)}(\beta_0, w)}{\hat S^{(0)}(\beta_0, w)} - \frac{s^{(1)}(\beta_0, w)}{s^{(0)}(\beta_0, w)}\right)^2 S^{(0)}(\beta_0, w)\lambda_0(w)\, dw. $$
By Theorem 4 and Condition (A),
$$ \langle W \rangle(1) \xrightarrow{p} 0. $$
Therefore the Lemma holds by the Lenglart inequality. The proof of the second expression in the Lemma is analogous. □
PROOF OF LEMMA 8: With
$$ z_v = \sqrt{n}\left(\frac{1}{n}\sum_{j \notin V} I_{[Y_j(w) = 1,\, Z_j = z_k]}\left(\hat r_j(\beta_0, w) - r_{z_k}(\beta_0, w)\right)\right), $$
Condition (E) gives $\sup_{0 \le w \le 1} |z_v| = O_p(1)$. Hence it is clear that
$$ -n^{-1/2}\sum_{j \notin V}\int_0^1 I_{[Y_j(w) = 1,\, Z_j = z_k]}\left(\hat r_j(\beta_0, w) - r_{z_k}(\beta_0, w)\right)\lambda_0(w)\, dw + o_p(1) $$
$$ = -n^{-1/2}\sum_{j \notin V}\int_0^1 Y_j(w)\left(\hat r_j(\beta_0, w) - r_j(\beta_0, w)\right)\left(\frac{r_j^{(1)}(\beta_0, w)}{r_j(\beta_0, w)} - \frac{s^{(1)}(\beta_0, w)}{s^{(0)}(\beta_0, w)}\right)\lambda_0(w)\, dw + o_p(1). $$
Therefore
$$ n^{-1/2}\sum_{j \notin V}\left(\hat Q_j - Q_j\right) \xrightarrow{p} 0, $$
and the result holds. □
APPENDIX C: PROOF OF THE VARIANCE ESTIMATOR

PROOF: First, we will show that $|\hat\Sigma_1(\beta) - \Sigma_1(\beta)| \xrightarrow{p} 0$ for any $\beta$. Note that $\hat\Sigma_1(\beta) - \Sigma_1(\beta)$ can be written as the sum of two integral terms (30). By Theorem 3 and (18), we have
$$ \sup_{t \in [0,1]}\left|\hat r_{z_k}(\beta, t) - r_{z_k}(\beta, t)\right| \xrightarrow{p} 0 \quad \text{for } k = 1, \ldots, q. \qquad (31) $$
With Theorem 7 and the above result, the first integration in (30) converges to 0 in probability. By (26), the second integration in (30) can be expressed as
$$ \int_0^1 \sum_{k=1}^{q}\left(\frac{\hat r_{z_k}^{(1)}(\beta, t)}{\hat r_{z_k}(\beta, t)}\right)^{\otimes 2}\frac{1}{n}\sum_{j=1}^{n} Y_j(t) I_{[Z_j = z_k]}\left(1 - \frac{\hat S^{(0)}(\beta, t)}{S^{(0)}(\beta, t)}\right) d\Lambda_0(t) $$
$$ - \int_0^1 \sum_{k=1}^{q}\left(\frac{\hat r_{z_k}^{(1)}(\beta, t)}{\hat r_{z_k}(\beta, t)}\right)^{\otimes 2}\frac{1}{n}\sum_{j=1}^{n} Y_j(t) I_{[Z_j = z_k]}\,\frac{1}{n \hat S^{(0)}(\beta, t)}\sum_{i=1}^{n} dM_i(t). \qquad (32) $$
By Theorem 7 and (31), the first term in (32) converges to zero for any $\beta$. The second term is a local martingale; as in the proof of Lemma 2.2, it can be shown to converge to zero, by showing that its covariance process,
$$ \frac{1}{n^2}\int_0^1 \left(\sum_{k=1}^{q}\left(\frac{\hat r_{z_k}^{(1)}(\beta, w)}{\hat r_{z_k}(\beta, w)}\right)^{\otimes 2}\frac{1}{n}\sum_{j=1}^{n} Y_j(w) I_{[Z_j = z_k]}\right)^2\left(\frac{1}{\hat S^{(0)}(\beta, w)}\right)^2 S^{(0)}(\beta, w)\lambda_0(w)\, dw, $$
converges in probability to zero. By noticing that $\hat S^{(0)}(\beta, t) \xrightarrow{p} s^{(0)}(\beta, t)$ uniformly in $t$, that $s^{(0)}(\beta, t)$ is bounded away from zero, and that
$$ \frac{1}{n}\sum_{j=1}^{n} Y_j(w)\,\frac{\left(\hat r_{z_k}^{(1)}(\beta, w)\right)^{\otimes 2}}{\hat r_{z_k}(\beta, w)} \xrightarrow{p} E\left(Y(t)\,\frac{\left(r^{(1)}(\beta, t)\right)^{\otimes 2}}{r(\beta, t)}\right) = s^{(4)}(\beta, t), $$
where $s^{(4)}(\beta, t)$ by Lemma 2.1 is uniformly bounded on $[0,1] \times B$, we have by the Lenglart inequality that the second term in (32) converges to zero in probability for any $\beta$. This implies
$$ \left|\hat\Sigma_1(\beta) - \Sigma_1(\beta)\right| \xrightarrow{p} 0. $$
Similarly we can show that
$$ \left|\hat\Sigma_2(\beta) - \Sigma_2(\beta)\right| \xrightarrow{p} 0, \qquad \left|\hat\Sigma(\beta) - \Sigma(\beta)\right| \xrightarrow{p} 0. $$
With the convergence of $\hat\Sigma_1(\beta)$, $\hat\Sigma_2(\beta)$, and $\hat\Sigma(\beta)$, we have
$$ \left|\hat\Sigma_{EPL}(\beta) - \Sigma_{EPL}(\beta)\right| \xrightarrow{p} 0. \qquad (33) $$
Hence
$$ \left|\hat\Sigma_{EPL}(\hat\beta_{EPL}) - \Sigma_{EPL}(\beta_0)\right| \le \left|\hat\Sigma_{EPL}(\hat\beta_{EPL}) - \Sigma_{EPL}(\hat\beta_{EPL})\right| + \left|\Sigma_{EPL}(\hat\beta_{EPL}) - \Sigma_{EPL}(\beta_0)\right| \to 0 \quad \text{as } n \to \infty, $$
by the fact that (33) holds, that $\Sigma_{EPL}$ is continuous in $\beta$, and that $\hat\beta_{EPL} \xrightarrow{p} \beta_0$ as $n \to \infty$. □
References
[1] Andersen, P.K. and Gill, R.D. (1982), "Cox's Regression Model for Counting Processes: A
Large Sample Study", The Annals of Statistics, 10, 1100-1120.
[2] Carroll, R. J. and Wand, M. P. (1991), "Semiparametric Estimation in Logistic Measurement Error Models", J. R. Statist. Soc. B, 53, 573-585.
[3] Fleming, T.R. and Harrington, D.P. (1991), Counting Processes and Survival Analysis, Wiley,
New York.
[4] Foutz, R. V. (1977), "On the Unique Consistent Solution to the Likelihood Equations", Journal
of the American Statistical Association, 72, 147-148.
[5] GAUSS System, Version 2.1 (1991), Kent, WA: Aptech Systems Inc.
[6] Kalbfleisch, J. D. (1974), "Some Efficiency Calculations for Survival Distributions",
Biometrika, 61, 31-38.
[7] Prentice, R. L. (1982), "Covariate Measurement Errors and Parameter Estimation in a Failure Time Regression Model", Biometrika, 69, 331-342.
[8] Rudin, W. (1964), Principles of Mathematical Analysis, New York: McGraw-Hill.
[9] The SOLVD Investigators (1991), "Effect of Enalapril on Survival in Patients with Reduced Left Ventricular Ejection Fractions and Congestive Heart Failure", New England Journal of Medicine, 325, 293-302.
[10] Wellner, J., "Empirical Processes in Action: A Review", manuscript.
[11] Zhou, H. (1992), "Auxiliary Covariate Data in Failure Time Regression Analysis", Ph.D. Dissertation, University of Washington.