Sinha, A. N. and Sen, P. K. (1981). "Staggering Entry, Random Withdrawal and Progressive Censoring Schemes: Some Nonparametric Procedures."

STAGGERING ENTRY, RANDOM WITHDRAWAL AND PROGRESSIVE
CENSORING SCHEMES: SOME NONPARAMETRIC PROCEDURES
by
Agam Nath Sinha
and
Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1370
December 1981
STAGGERING ENTRY, RANDOM WITHDRAWAL AND PROGRESSIVE CENSORING
SCHEMES: SOME NONPARAMETRIC PROCEDURES*
By AGAM NATH SINHA
American Cyanamid Co., Princeton, N.J.
and
PRANAB KUMAR SEN
University of North Carolina, Chapel Hill, N.C.
SUMMARY.
In a life testing experiment (or a clinical trial), for staggering
entry of units (including some batch arrival models), some nonparametric testing
procedures (based on a general class of rank statistics and weighted empirical
processes) under progressive censoring schemes are considered, for some multiple
regression model.
Adjustments needed to incorporate dropouts are also discussed.
The relevance of some weak convergence results provides the desired large sample
theory of the proposed procedure, and some simulation studies of the critical
values of the allied test statistics are made.
AMS Subject Classification: 62G99, 62J99, 62L99
Key Words & Phrases: Batch arrivals, Bessel sheets, dropouts, Kiefer-Bessel
processes, multisample problem, progressively censoring scheme, rank test,
staggering entry, weighted empirical process.
*Work partially supported by the National Heart, Lung and Blood Institute,
Contract NIH-NHLBI-71-2243-L from the National Institutes of Health.
1. INTRODUCTION
For life testing experiments, the logrank test (Mantel, 1966; Cox,
1972; Peto and Peto, 1972) and modified versions of the Wilcoxon test (Gehan,
1965a,b; Halperin and Ware, 1974) provide fixed-sample comparison of two
survival distributions in the presence of arbitrary right censoring. Recently,
Fleming et al. (1980) have developed one- and two-sample Kolmogorov-Smirnov
type tests for arbitrary right-censored data. Sequential versions of some of
these tests are due to Gehan (1965a), Curnow (1972) and Armitage (1975, Ch. 7),
among others.
Under progressively censoring schemes (PCS), time-sequential
nonparametric tests based on a general class of linear rank statistics (LRS)
were developed, for the simple regression model, by Chatterjee and Sen (1973)
and extended further to grouped data and multiple regression models by
Majumdar and Sen (1977, 1978a); the choice of asymptotically optimal score
function for PCS rank tests was studied by Sen (1976b). These tests are
designed for possible early termination of experimentation, where time and
cost are crucial factors. Davis (1978) and Koziol and Petkau (1978) have
used the theory of Chatterjee and Sen (1973) to illustrate some useful applications
of PCS tests. DeLong (1980) has studied asymptotic power properties and
expected stopping times for PCS rank tests. Sinha and Sen (1979a,b) have
considered another class of PCS tests based on some weighted empirical processes,
for the simple as well as multiple regression models, and made a comparative
study of their performance characteristics relative to the rank procedures,
with respect to both asymptotic power and stopping times.
One common feature of all these procedures is that the subjects all enter
into the scheme at a common point of time, i.e., we have a non-staggering
(or single point) entry plan. For staggering entry plans, a PCS gets
complicated with varying numbers of subjects and their differential exposure
times. Majumdar and Sen (1978b) using LRS and Sinha and Sen (1982) using
weighted empirical processes have considered some PCS tests for the simple
regression model in life testing experiments allowing staggering entries as
well as withdrawals (dropouts) of the subjects. The object of the present
investigation is to extend their results to the multiple regression model,
which includes the several sample problem as a special case.
Along with the basic setup, the proposed PCS tests are outlined in Section 2.
The related asymptotic theory is then presented in Section 3. Section 4 deals
with some specific test procedures. Section 5 is devoted to the incorporation
of (random) withdrawals in a staggering entry plan. The $q$ ($\ge 2$)-sample
problem, in a possible batch-arrival model, is briefly presented in Section 6.
The concluding section is devoted to some numerical studies of the critical
values of the proposed test statistics obtained through simulation.
2. THE PROPOSED PCS TEST PROCEDURES
Consider a life testing experimentation involving $n$ subjects (units)
which may not all arrive at a common point of time. Let $E_i$ be the entry
time-point of the $i$th subject (into the scheme), and suppose that it remains in the
scheme until the failure (or the response) occurs at time $T_i$ ($= E_i + X_i^0$)
or it drops out of the scheme at time $W_i$ ($= E_i + Y_i$), whichever comes first,
for $i=1,\ldots,n$. Then $X_i^0$ and $Y_i$ are the actual failure and withdrawal
times, respectively, and the observable random variables are
\[ X_i = \min(X_i^0, Y_i), \qquad \delta_i = \begin{cases} 1, & X_i = X_i^0, \\ 0, & X_i = Y_i, \end{cases} \qquad \text{for } i=1,\ldots,n. \tag{2.1} \]
In a life testing situation, one would keep track of the experimentation and
record the entries, failures or withdrawals over time as they occur. It may
be noted that in a staggering entry plan, where the $E_i$ are not the same,
the ordered values of the $T_i$, as recorded, do not necessarily reflect the
ordering of the $X_i$ (or $X_i^0$, even if there is no withdrawal).
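The observation scheme in (2.1), and the remark that calendar times need not order the subjects the way their durations do, can be illustrated with a minimal sketch. The toy numbers and the helper names `observe` and `order` are hypothetical and serve only as an illustration.

```python
def observe(x0, y):
    """Observable pair (X_i, delta_i) of (2.1) from a failure time x0 and a withdrawal time y."""
    x = min(x0, y)
    delta = 1 if x0 <= y else 0  # 1: failure observed, 0: withdrawal came first
    return x, delta


def order(values):
    """Indices that sort `values` in increasing order."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical toy data for three subjects (not taken from the paper).
entries = [0.0, 2.0, 5.0]        # E_i
failures = [4.0, 1.0, 0.5]       # X_i^0
withdrawals = [10.0, 0.4, 3.0]   # Y_i

observed = [observe(x0, y) for x0, y in zip(failures, withdrawals)]
durations = [x for x, _ in observed]                           # X_i
calendar_times = [e + x for e, x in zip(entries, durations)]   # E_i + X_i

# With staggered entries the two orderings generally differ.
print(order(calendar_times), order(durations))
```

In this toy case the subject with the longest duration is not the last one to leave the scheme in calendar time, which is exactly the complication a staggering entry plan introduces.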
Following the general approach of Majumdar and Sen (1978b) and Sinha
and Sen (1982) and extending their model to a general multiple regression
one, we assume that $X_1^0,\ldots,X_n^0$ are independent random variables (r.v.) with
continuous distribution functions (d.f.) $F_1,\ldots,F_n$, respectively, all defined
on the real line $R$, where
\[ F_i(x) = F(x - \beta_0 - \boldsymbol{\beta}'\mathbf{c}_i), \qquad x \in R, \quad i=1,\ldots,n, \tag{2.2} \]
where $\beta_0$, $\boldsymbol{\beta}' = (\beta_1,\ldots,\beta_p)$ (for some $p \ge 1$) are unknown parameters,
$\mathbf{c}_i = (c_{i1},\ldots,c_{ip})'$, $i=1,\ldots,n$ (not all equal) are vectors of specified
regression constants and $F$ is an unknown (but continuous) d.f. For simplicity
of presentation, we consider first the case of staggering entry without
dropouts, i.e., where $\delta_i = 1$, for every $i=1,\ldots,n$. Later
on, in Section 5, modifications for incorporating withdrawals will be
considered. Now, based on $\mathbf{X}_n = (X_1,\ldots,X_n)$ in (2.1), for the model (2.2),
we desire to test in a life testing setup (under progressive censoring) for
\[ H_0: \boldsymbol{\beta} = \mathbf{0} \quad \text{against} \quad H_1: \boldsymbol{\beta} \ne \mathbf{0}. \tag{2.3} \]
Without any loss of generality, we may assume that
(2.4)
(as otherwise, we have a non-staggering entry plan where the results of
Majumdar and Sen (1978a) and Sinha and Sen (1981b) would apply). Let us
review the picture at a time point $T$ ($> E_1$). Only those units for which
$E_i < T$ are in the scheme at that time point. For a subject entering at
time point $E_i$ ($< T$), the monitoring up to the time $T$ covers a period of
length $T - E_i$, and during that time period $[E_i, T]$, a failure occurs only
when $T_i \le T$. For the $n$ subjects, we define the T-exposure times by
$T - E_i$ or 0, according as $E_i$ is $\le T$ or not, for every $i=1,\ldots,n$.
Essentially these T-exposure times for $T \ge E_1$, viewed progressively over time,
along with the failure events provide the basic data for the PCS testing
procedures.
Let $n_T$ ($= \sum_{i=1}^n I(E_i \le T)$, with $I(A)$ as the indicator function for
the set $A$) be the number of subjects entering the scheme before time $T$,
so that $n_T$ is nondecreasing in $T$ ($\ge E_1$). In general, the entry-points $E_1,\ldots,E_n$ are random,
so that $n_T$ is also a random (step-) function obtaining its maximum value $n$ at time $E_n$.
Because of this stochastic nature of $n_T$, formulation of testing procedures
under PCS for a staggering entry plan encounters some additional difficulties.
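As a small illustration of the two quantities just introduced, the following sketch computes the entry count $n_T$ and the T-exposure times for a given monitoring time $T$. The function names and the sample entry times are assumptions made only for this example.

```python
def n_entered(entries, T):
    """n_T: number of subjects whose entry point E_i is not later than T."""
    return sum(1 for e in entries if e <= T)


def exposure_times(entries, T):
    """T-exposure times: T - E_i if E_i <= T, and 0 otherwise."""
    return [T - e if e <= T else 0.0 for e in entries]


# Hypothetical entry points E_1 < E_2 < E_3.
entries = [0.0, 2.0, 5.0]

for T in (1.0, 3.0, 6.0):
    print(T, n_entered(entries, T), exposure_times(entries, T))
```

Evaluating these on an increasing grid of $T$ shows $n_T$ as a step function of $T$ that reaches $n$ once the last subject has entered.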
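To fix notations, we define for every $k$ ($\ge 1$),
\[ \bar{\mathbf{c}}_k = k^{-1} \sum_{i=1}^k \mathbf{c}_i, \qquad \mathbf{C}_k = \sum_{i=1}^k (\mathbf{c}_i - \bar{\mathbf{c}}_k)(\mathbf{c}_i - \bar{\mathbf{c}}_k)', \tag{2.5} \]
and, conventionally, let $\bar{\mathbf{c}}_0 = \mathbf{0}$, $\mathbf{C}_0 = \mathbf{0}$. Note that for every $k \ge 1$, $\mathbf{C}_k$
is symmetric and positive semi-definite (p.s.d.) and $\mathbf{C}_{k+1} - \mathbf{C}_k$ is also
p.s.d. Thus, without any loss of generality, we may assume that there exists
an integer $n_0$ ($\ge q+1$), such that
\[ \mathbf{C}_k \text{ is positive-definite (p.d.)}, \quad \forall\, k \ge n_0. \tag{2.6} \]
Also, for every $k$ ($\ge 1$), we define
\[ \mathbf{M}_k(x) = \sum_{i=1}^k (\mathbf{c}_i - \bar{\mathbf{c}}_k) I(X_i \le x), \qquad x \in R, \tag{2.7} \]
\[ S_k(x) = k^{-1} \sum_{i=1}^k I(X_i \le x), \qquad x \in R, \tag{2.8} \]
\[ R_{ki} = \sum_{j=1}^k I(X_j \le X_i), \qquad i=1,\ldots,k, \tag{2.9} \]
where the $X_i$ are defined by (2.1). Further, for every $k$ ($\ge 1$), we consider
a set $\{a_k(1),\ldots,a_k(k)\}$ of scores (to be defined more precisely in Section 3)
and define the linear rank statistic (LRS) $\mathbf{L}_k$ by
\[ \mathbf{L}_k = \sum_{i=1}^k (\mathbf{c}_i - \bar{\mathbf{c}}_k)\, a_k(R_{ki}). \tag{2.10} \]
By the assumed continuity of $F_1,\ldots,F_n$, ties among the $X_i$ may be neglected
with probability 1, so that $\mathbf{R}_k = (R_{k1},\ldots,R_{kk})$ is some permutation of
$(1,\ldots,k)$. We may define the anti-ranks $Q_{ki}$ by
\[ R_{kQ_{ki}} = Q_{kR_{ki}} = i, \qquad \text{for } i=1,\ldots,k, \tag{2.11} \]
and rewrite $\mathbf{L}_k$ as
\[ \mathbf{L}_k = \sum_{i=1}^k (\mathbf{c}_{Q_{ki}} - \bar{\mathbf{c}}_k)\, a_k(i). \tag{2.12} \]
In the event that only the smallest $q$ observations among $X_1,\ldots,X_k$ are
observed and the remaining $k-q$ are censored, one may modify $\mathbf{L}_k$, as in
Chatterjee and Sen (1973) and Majumdar and Sen (1978a), as
\[ \mathbf{L}_{k,q} = \sum_{i=1}^q (\mathbf{c}_{Q_{ki}} - \bar{\mathbf{c}}_k)\,[a_k(i) - a_k^*(q)], \qquad 1 \le q \le k-1, \tag{2.13} \]
while $\mathbf{L}_{k,0} = \mathbf{0}$ and $\mathbf{L}_{k,k} = \mathbf{L}_{k,k-1} = \mathbf{L}_k$, where
\[ a_k^*(q) = \begin{cases} (k-q)^{-1} \sum_{j=q+1}^k a_k(j), & 0 \le q \le k-1, \\ 0, & q = k. \end{cases} \tag{2.14} \]
Finally, we denote by
\[ A_{k,q}^2 = \begin{cases} 0, & q = 0, \\ (k-1)^{-1} \left\{ \sum_{i=1}^q a_k^2(i) + (k-q)[a_k^*(q)]^2 - k \bar{a}_k^2 \right\}, & 1 \le q \le k-1, \end{cases} \tag{2.15} \]
where
\[ \bar{a}_k = k^{-1} \sum_{i=1}^k a_k(i), \qquad k \ge 1. \tag{2.16} \]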
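The censored rank statistic and its variance function can be sketched numerically. The code below assumes the Chatterjee and Sen (1973) form $\mathbf{L}_{k,q} = \sum_{i \le q} (\mathbf{c}_{Q_{ki}} - \bar{\mathbf{c}}_k)[a_k(i) - a_k^*(q)]$ together with (2.14)-(2.16); the function names, the Wilcoxon-type scores and the toy data are illustrative assumptions, not part of the paper.

```python
def wilcoxon_scores(k):
    """Illustrative scores a_k(i) = i / (k + 1), i = 1..k."""
    return [i / (k + 1) for i in range(1, k + 1)]


def a_star(scores, q):
    """a_k^*(q) of (2.14): mean of the scores attached to the k - q censored ranks."""
    k = len(scores)
    if q == k:
        return 0.0
    return sum(scores[q:]) / (k - q)


def censored_lrs(c, x, scores, q):
    """Assumed form of L_{k,q}: only the q smallest observations are used."""
    k, p = len(x), len(c[0])
    cbar = [sum(ci[j] for ci in c) / k for j in range(p)]
    anti_ranks = sorted(range(k), key=lambda i: x[i])  # 0-based Q_{k1}, ..., Q_{kk}
    shift = a_star(scores, q)
    out = [0.0] * p
    for i in range(q):
        weight = scores[i] - shift
        for j in range(p):
            out[j] += (c[anti_ranks[i]][j] - cbar[j]) * weight
    return out


def a_squared(scores, q):
    """A^2_{k,q} of (2.15), evaluated with the same expression up to q = k."""
    k = len(scores)
    if q == 0:
        return 0.0
    abar = sum(scores) / k
    head = sum(a * a for a in scores[:q])
    return (head + (k - q) * a_star(scores, q) ** 2 - k * abar ** 2) / (k - 1)


# Hypothetical data: k = 4 subjects, scalar regression constants (p = 1).
c = [[0.0], [1.0], [0.0], [1.0]]
x = [3.1, 0.2, 1.5, 2.7]
scores = wilcoxon_scores(4)
print([censored_lrs(c, x, scores, q) for q in range(5)])
```

Under the assumed form, the last observation adds no information once the other $k-1$ are known, so the values at $q = k-1$ and $q = k$ coincide, in line with the convention stated after (2.13).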
Let us now examine the T-exposure times and related failure events as
we progressively move with $T$ ($\ge E_1$); these will be employed in the
formulation of the history processes based on the $\mathbf{M}_k(x)$ and $\mathbf{L}_{k,q}$, and
the proposed PCS testing procedures will be based on these history processes.
At time point $T$ ($\ge E_1$), there are already $n_T$ ($\le n$) entries. Now for
every $t \in [E_1, T]$, there are $n_t$ units which have entered the scheme at
time points not later than $t$, so that these have T-exposure times at least
equal to $T - t$. Thus, for every $t \in [E_1, T]$, we are in a position to
compute $\mathbf{M}_{n_t}(x)$ and $S_{n_t}(x)$, for every $x$: $0 \le x \le T-t$, and then allow
$t$ to vary between $E_1$ and $T$. Thus, at time point $T$, we obtain the process
\[ \{ \mathbf{M}_{n_t}(x),\ S_{n_t}(x):\ 0 \le x \le T-t,\ E_1 \le t \le T \}. \tag{2.17} \]
Similarly, we may note that by (2.8), $q_{t,T} = n_t S_{n_t}(T-t)$ is the number
of failures (among the $n_t$ entries prior to time-point $t$) of magnitude
$\le T-t$. Then, by reference to (2.13)-(2.14), we may note that the remaining
$n_t - q_{t,T}$ responses of magnitude $> T-t$ may be regarded as censored at time
$(T-t)+$ and we have the LRS $\mathbf{L}_{n_t, q_{t,T}}$. Thus, at the time-point $T$, we
obtain the process
\[ \{ \mathbf{L}_{n_t, q_{t,T}}:\ E_1 \le t \le T \}. \tag{2.18} \]
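The count $q_{t,T} = n_t S_{n_t}(T-t)$ can be computed directly from the entry times and the durations. The sketch below covers the no-dropout case treated in this section; the function name and the toy data are hypothetical.

```python
def entries_and_failures(entries, durations, t, T):
    """Return (n_t, q_{t,T}) for the no-dropout case.

    n_t     : subjects with entry point E_i <= t
    q_{t,T} : those among them whose duration X_i does not exceed T - t
    """
    n_t = 0
    q = 0
    for e, x in zip(entries, durations):
        if e <= t:
            n_t += 1
            if x <= T - t:
                q += 1
    return n_t, q


# Hypothetical data: entry points E_i and durations X_i.
entries = [0.0, 2.0, 5.0]
durations = [4.0, 0.4, 0.5]

print(entries_and_failures(entries, durations, t=2.0, T=3.0))
```

The remaining $n_t - q_{t,T}$ subjects are the ones treated as censored at $(T-t)+$ when the censored rank statistic is formed.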
We intend to incorporate (2.17) and (2.18) in the formulation of suitable
test statistics, which we would employ in the setup of repeated significance
testing (RST) as we allow $T$ to vary between $E_1$ and $T_c$, where $T_c$ ($> E_n$)
is some preassigned censoring time for the experimentation, set in advance
in accordance with other side conditions of the experimentation. When the
$\mathbf{c}_i$ are scalar constants, such tests have been formulated in Majumdar and
Sen (1978b) and Sinha and Sen (1982). These will be generalized here to the
vector case.
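Using the motivation in Sinha and Sen (1979b), we may let
\[ m_{n_t}^0(x) = \mathbf{M}_{n_t}'(x)\, \mathbf{C}_{n_t}^{-1}\, \mathbf{M}_{n_t}(x), \tag{2.19} \]
\[ m_{n_t}^*(x) = \{ \mathbf{M}_{n_t}'(x)\, \mathbf{C}_{n_t}^{-1}\, \mathbf{M}_{n_t}(x) \} / \{ S_{n_t}(x)[1 - S_{n_t}(x)] \}, \tag{2.20} \]
so that at time $T$, we may set the test statistics
\[ K_n^0(T) = \sup\{ m_{n_t}^0(x):\ x \le T-t,\ t < T \}, \tag{2.21} \]
\[ K_n^*(T) = \sup\{ m_{n_t}^*(x):\ x \le T-t,\ t < T \}. \tag{2.22} \]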
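For scalar regression constants ($p = 1$) the quadratic forms in (2.19)-(2.20) reduce to ratios of numbers, which makes them easy to check by hand. The sketch below assumes the weighted empirical process has the form $M_k(x) = \sum_{i \le k} (c_i - \bar{c}_k) I(X_i \le x)$; the function name and the data are illustrative only.

```python
def empirical_statistics(c, x, point):
    """Return (m0, m_star) at `point` for scalar c_i, following (2.19)-(2.20).

    m_star is None when S_k(point) is 0 or 1, where the weighted form is undefined.
    """
    k = len(x)
    cbar = sum(c) / k
    C = sum((ci - cbar) ** 2 for ci in c)                       # C_k for p = 1
    M = sum(ci - cbar for ci, xi in zip(c, x) if xi <= point)   # assumed M_k(point)
    S = sum(1 for xi in x if xi <= point) / k                   # S_k(point)
    m0 = M * M / C
    m_star = m0 / (S * (1.0 - S)) if 0.0 < S < 1.0 else None
    return m0, m_star


# Hypothetical two-group layout: c_i = 0 for group 1, c_i = 1 for group 2.
c = [0.0, 0.0, 1.0, 1.0]
x = [1.0, 2.0, 3.0, 4.0]

print(empirical_statistics(c, x, 2.5))
```

The `None` branch mirrors the operational difficulty with the weighted statistic at the extremes of the empirical d.f., which is what motivates restricting its domain.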
Similarly, as in Majumdar and Sen (1978a), we may set
(2.23)
(2.24)
so that at time-point $T$, we may set the test statistics
(2.25)
\[ L_n^*(T) = \sup\{ \ell_{n_t}^*(x):\ x \le T-t,\ t < T \}. \tag{2.26} \]
Statistics with the superscript $0$ and $*$ are termed the unweighted and
weighted ones, respectively. For the weighted statistics, we may remark
that in (2.20), whenever $S_{n_t}(x) = 0$, or in (2.24), $q_{t,T} = 0$, there may
be an operational difficulty in these definitions. As a result, the range
of $x$ (and $t$) in (2.22) and (2.26) should be such that $S_{n_t}$ is strictly
$> 0$ (and $< 1$) and $q_{t,T} \ge 1$. This also demands that $n_t$ be strictly positive
in these domains. Keeping this in mind (and to be able to adapt some
simple asymptotic theory), we modify (2.22) and (2.26), by choosing an
$\varepsilon$ ($> 0$), usually quite small, and letting
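\[ K_n^*(T) = \sup\{ m_{n_t}^*(x):\ n^{-1} n_t \ge \varepsilon,\ \varepsilon \le S_{n_t}(x) \le 1-\varepsilon,\ x \le T-t,\ t < T \}, \tag{2.27} \]
\[ L_n^*(T) = \sup\{ \ell_{n_t}^*(x):\ n^{-1} n_t \ge \varepsilon,\ S_{n_t}(x) \ge \varepsilon,\ x \le T-t,\ t < T \}. \tag{2.28} \]
Finally, we define
\[ K_n^0 = \sup\{ K_n^0(T):\ T \le T_c \}, \qquad K_n^* = \sup\{ K_n^*(T):\ T \le T_c \}, \tag{2.29} \]
\[ L_n^0 = \sup\{ L_n^0(T):\ T \le T_c \}, \qquad L_n^* = \sup\{ L_n^*(T):\ T \le T_c \}. \tag{2.30} \]
[In (2.27)-(2.28), the supremum over a null set (if so) is taken as 0.]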
Suppose now that there exist real numbers $k_{n\alpha}^0$, $k_{n\alpha,\varepsilon}^*$, $\lambda_{n\alpha}^0$, $\lambda_{n\alpha,\varepsilon}^*$,
such that under $H_0$: $\boldsymbol{\beta} = \mathbf{0}$,
\[ P\{ K_n^0 > k_{n\alpha}^0 \mid H_0 \} \le \alpha \le P\{ K_n^0 \ge k_{n\alpha}^0 \mid H_0 \} \tag{2.31} \]
and similarly for the other three statistics in (2.29)-(2.30), where
$\alpha$ ($0 < \alpha < 1$) is the desired level of significance of the test. Then,
operationally, the proposed PCS test may be described as follows. Monitor
experimentation from the first entry point $E_1$. At each timepoint $T$ ($> E_1$), compute
$K_n^0(T)$ [or $K_n^*(T)$ or $L_n^0(T)$ or $L_n^*(T)$]. If $K_n^0(T) < k_{n\alpha}^0$ [or
$K_n^*(T) < k_{n\alpha,\varepsilon}^*$ or $L_n^0(T) < \lambda_{n\alpha}^0$ or $L_n^*(T) < \lambda_{n\alpha,\varepsilon}^*$], continue experimentation, while,
if for the first time at $T = T^*$ ($\le T_c$), the opposite
inequality holds, stop experimentation at time $T^*$
along with the rejection of $H_0$: $\boldsymbol{\beta} = \mathbf{0}$. If no such $T^*$ ($\le T_c$)
exists, then experimentation is curtailed at the preplanned timepoint
$T_c$, along with the acceptance of $H_0$. $T^*$ may be defined as the
stopping time for the proposed PCS testing procedure. Note that by definition
\[ T^* \le T_c, \quad \text{with probability 1}, \tag{2.32} \]
and it may be smaller than $T_c$ with a positive probability.
Further, by (2.29)-(2.31), the proposed procedure has the prescribed level of significance
$\alpha$ ($0 < \alpha < 1$).
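The operational description of the test amounts to a first-crossing rule on the monitored statistic. A minimal sketch, assuming the statistic has already been evaluated on an increasing grid of monitoring times ending at $T_c$ (the function name, the grid and the critical value are hypothetical):

```python
def pcs_test(stat_path, critical_value):
    """First-crossing stopping rule.

    stat_path      : list of (T, statistic at T) pairs, increasing in T, last T equal to T_c
    critical_value : the chosen critical value for the monitored statistic
    Returns (stopping_time, reject_H0).
    """
    last_T = None
    for T, value in stat_path:
        last_T = T
        if value >= critical_value:   # opposite inequality holds: stop and reject
            return T, True
    return last_T, False              # no crossing: curtail at T_c and accept


# Hypothetical monitored values of a statistic such as K_n^0(T).
path = [(1.0, 0.3), (2.0, 1.1), (3.0, 2.6), (4.0, 2.9)]

print(pcs_test(path, critical_value=2.5))   # crossing before T_c
print(pcs_test(path, critical_value=5.0))   # no crossing, run to T_c
```

Because the monitored statistics are suprema over a growing domain, they are nondecreasing in $T$, so the first crossing is also the only point at which the decision can change.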
Since for different $T$, the statistics $K_n^0(T)$ [or $K_n^*(T)$
or $L_n^0(T)$ or $L_n^*(T)$] are not independent, nor are $K_n^0(T)$ etc. processes of
independent or homogeneous increments, determination of the exact critical
values in (2.31) may pose a challenging problem. However, certain invariance
principles for LRS and weighted empirical processes can be incorporated to
provide suitable large sample approximations (or bounds) for these critical
values. This will be discussed in detail in Section 3. A discussion on the
relative merits and demerits of the weighted and unweighted statistics will
be made in Section 4.
3. ASYMPTOTIC THEORY
The exact distribution theory of the proposed test statistics [even
under $H_0$: $\boldsymbol{\beta} = \mathbf{0}$] depends on the $\mathbf{c}_i$, the entry-points $E_1,\ldots,E_n$,
and $T_c$, through $F(T_c - E_i - \beta_0)$, and becomes quite involved as the sample
size $n$ increases. In this section, we intend to provide good approximations
for large sample sizes, valid under quite general regularity conditions.
In this context, we assume that the following holds:
(I) There exists a p.d. matrix $\mathbf{C}_0$, such that
\[ m^{-1} \mathbf{C}_m \to \mathbf{C}_0 \quad \text{as } m \text{ increases}, \tag{3.1} \]
and
(II) if $\mathrm{ch}_1(\mathbf{A})$ stands for the largest characteristic root of $\mathbf{A}$, then
(3.2)
For rank based procedures, we assume further that for every $k$ ($\ge 1$),
the set $\{a_k(1),\ldots,a_k(k)\}$ of scores is generated by a score-function
$\phi = \{\phi(u): 0 < u < 1\}$ in the following way
(3.3)
\[ \lim_{k \to \infty} \int_0^1 \{ \phi_k(u) - \phi(u) \}^2\, du = 0, \tag{3.4} \]
where $\phi(u) = \phi^{(1)}(u) - \phi^{(2)}(u)$, $0 < u < 1$, and the $\phi^{(j)}$ are non-decreasing
and square integrable inside $[0, 1]$. A typical case for (3.3) is
$a_k(i) = \phi(i/(k+1))$, $1 \le i \le k$.
To introduce the asymptotic theory, first, we consider some basic
processes. Let $W_j = \{W_j(s,t): 0 \le s < \infty,\ 0 \le t < \infty\}$, $j=1,\ldots,p$ ($\ge 1$), be
independent copies of a standard Brownian sheet. Also, consider
the related Kiefer processes
\[ W_j^0 = \{ W_j^0(s,t) = W_j(s,t) - tW_j(s,1):\ 0 \le s < \infty,\ 0 \le t \le 1 \}, \qquad 1 \le j \le p. \]
Define then the (squared) Bessel sheets by letting
\[ B_p = \Big\{ B_p(s,t) = \sum_{j=1}^p W_j^2(s,t):\ 0 \le s < \infty,\ 0 \le t < \infty \Big\} \tag{3.5} \]
and the (squared) Kiefer-Bessel sheets by letting
\[ B_p^0 = \Big\{ B_p^0(s,t) = \sum_{j=1}^p [W_j^0(s,t)]^2:\ 0 \le s < \infty,\ 0 \le t \le 1 \Big\}. \tag{3.6} \]
Finally, we introduce the processes
\[ B_p^* = \{ B_p^*(s,t) = t^{-1} s^{-1} B_p(s,t):\ 0 < s < \infty,\ 0 < t < \infty \}, \tag{3.7} \]
\[ B_p^{0*} = \{ B_p^{0*}(s,t) = s^{-1} t^{-1} (1-t)^{-1} B_p^0(s,t):\ 0 < s < \infty,\ 0 < t < 1 \}. \tag{3.8} \]
We shall see later on that the boundary crossing probabilities for these
processes provide the desired asymptotic theory for the PCS tests.
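Note that by the time and scale transformation on the Brownian sheets and (3.5), we
have for every $0 < \eta_1 < \eta_2 < \infty$ and $0 < \varepsilon_1 < \varepsilon_2 < \infty$, on letting
$\eta = \eta_2/\eta_1$ ($> 1$) and $\varepsilon = \varepsilon_2/\varepsilon_1$ ($> 1$),
\[ \sup_{\eta_1 \le s \le \eta_2,\ \varepsilon_1 \le t \le \varepsilon_2} B_p^*(s,t) \ \overset{\mathcal{D}}{=} \ \sup_{1 \le s \le \eta,\ 1 \le t \le \varepsilon} B_p^*(s,t). \tag{3.9} \]
Further, note that on using the identity that
\[ W_j(s,t) \overset{\mathcal{D}}{=} (t+1)\, W_j^0\big(s,\ t/(t+1)\big), \qquad 0 \le s < \infty,\ 0 \le t < \infty, \]
we obtain by a few routine steps that for every $0 < \eta_1 < \eta_2 < \infty$ and
$0 < \varepsilon_1 < \varepsilon_2 < 1$, on letting
$\varepsilon_j^0 = \varepsilon_j/(1-\varepsilon_j)$, $j=1,2$; $\varepsilon^0 = \varepsilon_2^0/\varepsilon_1^0$ ($> 1$), $\eta = \eta_2/\eta_1$ ($> 1$),
\[ \sup_{\eta_1 \le s \le \eta_2,\ \varepsilon_1 \le t \le \varepsilon_2} B_p^{0*}(s,t) \ \overset{\mathcal{D}}{=} \ \sup_{1 \le s \le \eta,\ 1 \le t \le \varepsilon^0} B_p^*(s,t). \tag{3.10} \]
Thus, for the standardized processes $B_p^*$ and $B_p^{0*}$, with suitable adjustments
on the range spaces, the boundary crossing probabilities may be obtained from
one another.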
Let us now define the $\mathbf{L}_{k,q}$ as in (2.13) and let
\[ B_{(n)} = \{ B_{(n)}(s,t):\ \mathbf{0} \le (s,t) \le \mathbf{1} \}, \qquad B_{(n)}(s,t) = A_n^{-2}\, \mathbf{L}_{[ns],q(s,t)}'\, \mathbf{C}_n^{-1}\, \mathbf{L}_{[ns],q(s,t)}, \tag{3.11} \]
where $q(s,t)$ is defined by
\[ q(s,t) = \max\{ q:\ A_{[ns],q}^2 \le t A_{[ns]}^2 \}, \qquad 0 \le t \le 1,\ 0 \le s \le 1, \tag{3.12} \]
and $A_m^2$ and $A_{m,q}^2$ are defined by (2.15). Then, in the same manner as
Majumdar and Sen (1978a) extended the invariance principle (for the
non-staggering entry case) of Chatterjee and Sen (1973) to the vector $\mathbf{c}_i$ case,
we may extend the theory in Sen (1976a) to the vector case and obtain that
under $H_0$: $\boldsymbol{\beta} = \mathbf{0}$ in (2.2)-(2.3) and the assumed regularity conditions,
as $n \to \infty$,
\[ B_{(n)} \ \overset{\mathcal{D}}{\to} \ B_p = \{ B_p(s,t):\ \mathbf{0} \le (s,t) \le \mathbf{1} \}, \tag{3.13} \]
where $B_p(s,t)$ is defined by (3.5). For intended brevity, the details
are omitted. Let us also denote by $\mathbf{M}_k^0(t) = \mathbf{M}_k(x)$ when $F(x - \beta_0) = t$
(refer to (2.2) and (2.7)) and define
\[ B_{(n)}^0 = \{ B_{(n)}^0(s,t) = (\mathbf{M}_{[ns]}^0(t))'\, \mathbf{C}_n^{-1}\, (\mathbf{M}_{[ns]}^0(t)):\ \mathbf{0} \le (s,t) \le \mathbf{1} \}. \tag{3.14} \]
Then, as a direct (vector-) extension of the results of Sinha and Sen
(1982), we obtain (on omitting the details) that under $H_0$: $\boldsymbol{\beta} = \mathbf{0}$ and
(3.1)-(3.2), as $n \to \infty$,
\[ B_{(n)}^0 \ \overset{\mathcal{D}}{\to} \ B_p^0 = \{ B_p^0(s,t):\ \mathbf{0} \le (s,t) \le \mathbf{1} \}, \tag{3.15} \]
where the $B_p^0(s,t)$ are defined by (3.6).
Now, in a staggering entry plan with distinct entry points
$E_1 < \cdots < E_n$ ($< T_c$), in (2.21), (2.23) and (2.29)-(2.30), as $t$ is allowed
to vary between $E_1$ and $E_n$, $n_t$ assumes all possible values between
1 and $n$. On the other hand, the domain of $F(x - \beta_0)$ in (2.21), (2.23) and
(2.29)-(2.30) depends on the entry-points $E_1,\ldots,E_n$, censoring point $T_c$
and the d.f. $F$. The maximum value of $F(x - \beta_0)$,
which is a point on the unit line $[0,1]$, is of course $F(T_c - E_1 - \beta_0)$
and the minimum value is $F(T_c - E_n - \beta_0)$.
Thus, in the staggering entry plan under consideration,
the processes $B_{(n)}$ and $B_{(n)}^0$ in (3.11) and (3.14) are defined only on a
lower subspace of $I^2 = [0,1]^2$ and the structure of this subspace depends on
$F(T_c - E_i - \beta_0)$, $1 \le i \le n$. Let us denote these subspaces for the two
cases by $I_{(n)}$ and $I_{(n)}^0$, respectively, and note that, in general, both of
these are stochastic in nature. In some cases with some simple entry-pattern
(e.g., uniform spacing of the $E_i$), $I_{(n)}$ and $I_{(n)}^0$ may be specified or
may even be bounded by non-random subsets ($I^*$ and $I_0^*$) of $I^2$. In any
case, $I^* \subset I^2$ and $I_0^* \subset I^2$, so that some dominated results can be obtained
by using (3.13) and (3.15) for the entire domain $I^2$. Hence, we are
confronted with the problem of finding the distribution of the statistics
\[ \sup\{ B_p(s,t):\ (s,t) \in I^* \} \quad \text{and} \quad \sup\{ B_p^0(s,t):\ (s,t) \in I_0^* \} \tag{3.16} \]
for suitable $I^*$ and $I_0^*$, including the possibility for $I^* = I_0^* = I^2$. If
$I^*$ is a lower rectangle of $I^2$, then, of course, the scale-time transformation
may be used to express the first statistic in (3.16) as a constant multiple
of $\sup\{ B_p(s,t):\ (s,t) \in I^2 \}$ and a similar transformation on the time
parameter $s$ in $B_p^0(s,t)$ is possible. We shall discuss more about these
in Section 7.
For a batch arrival model, the $n_t$ may not take on all values
between 1 and $n$; it increases to its asymptotic value in finitely many
jumps only. Typically, there may be only $b$ ($\ge 1$) entry points, say
$E_1^* < \cdots < E_b^*$, where $n_j$ subjects enter the scheme at timepoint $E_j^*$,
so that the possible values of $n_t$ are $\sum_{s=1}^r n_s$, for $r=1,\ldots,b$.
Thus, in (2.19)-(2.30), the $n_t$ can only assume $b$ distinct values, so that
in (3.11) and (3.14), on the first time parameter scale, we will have only
$b$ points. As a result, in (3.16), the subspace $I^*$ or $I_0^*$ will consist
of $b$ contours on $I^2$. Nevertheless, the suprema over $I^2$ will dominate
the ones over $I^*$ or $I_0^*$ and when $b$ is not very small, the difference
will be negligible. Thus, the use of (3.16) with appropriate
$I^2$ will result in a slightly higher critical value and the test will thereby
be somewhat conservative. One advantage of using these critical values will
be to make them valid irrespective of any batch-frequency patterns. In a
similar manner, for the test statistics $K_n^*$ and $L_n^*$ in (2.27)-(2.30), to
study the asymptotic theory, we are confronted with the problem of finding the
distributions of
\[ \sup\{ B_p^*(s,t):\ (s,t) \in I^* \} \quad \text{and} \quad \sup\{ B_p^{0*}(s,t):\ (s,t) \in I_0^* \}, \tag{3.17} \]
where $I^*$ and $I_0^* \subset [\varepsilon,1] \times [\varepsilon, 1-\varepsilon]$, $\varepsilon > 0$. In this context, we
can use, with advantage, (3.9) and (3.10), and hence, the problem reduces
to that of finding out the distribution of
\[ \sup\{ B_p^*(s,t):\ 1 \le s \le a,\ 1 \le t \le b \}, \quad \text{for suitable } (a,b) > \mathbf{1}. \tag{3.18} \]
For some numerical studies (by simulation) of (3.18), we may refer to
Section 7.
For the particular case of $p=1$ (i.e., scalar $c_i$), weak invariance
principles for $B_{(n)}$ and $B_{(n)}^0$, under suitable sequences of local
(contiguous) alternatives have been studied by Sen (1976a), Majumdar
and Sen (1978b) and Sinha and Sen (1982). Their results extend to the case
of vector $\mathbf{c}_i$ (as under review), and under such local alternatives [viz.,
$\{H_n\}$, where under $H_n$: (2.2) holds with $\boldsymbol{\beta} = n^{-1/2} \boldsymbol{\lambda}$ for some $\boldsymbol{\lambda} \in R^p$], the
asymptotic power function of the proposed tests can be expressed in terms
of the boundary crossing probabilities of some drifted Bessel sheets or
Kiefer-Bessel processes. These drift functions are not, in general, linear
in $t$ (or $s$) and, even if they are so, there are no precise analytical
expressions for these probabilities. As such, these results do not lead to
any useful tool for studying the asymptotic efficiency of these competing
tests. Empirical study of the power properties (by Monte Carlo simulation
techniques) seems to be a more practical alternative. For some numerical
studies of this nature, we may refer to Sinha (1979).
4. SOME SPECIFIC TESTS
In Section 2, we have proposed both the unweighted and weighted test
statistics based on the PCS weighted empirical processes and LRS. From the
point of view of repeated significance testing (RST), the weighted
statistics in (2.27), (2.28) [and (2.29)-(2.30)] are naturally appealing.
If we consider the collection of the statistics in (2.20) [or (2.24)], for
all permissible $x$ and $t$, then the union-intersection principle leads
us to the test statistics in (2.29)-(2.30) for the weighted case. Pointwise
(i.e., given $x,t$), under $H_0$ in (2.3), the statistics in (2.20) [or (2.24)]
have asymptotically chi-square distributions [see Majumdar and Sen (1978a)
and Sinha and Sen (1979b)] and they may even be judged locally optimal
against appropriate alternatives. Thus, the union-intersection principle
applied to such desirable test statistics leads to the weighted ones in
(2.27)-(2.28) and (2.29)-(2.30). However, in practice, for these weighted
test statistics, one needs to choose some appropriate $\varepsilon > 0$, and, in general,
the critical values in (2.31) etc. may depend on this choice of $\varepsilon$. On the
other hand, for the unweighted test statistics in (2.19), (2.21), (2.23),
(2.25) and (2.29)-(2.30), one does not need to restrict the domain of $(x,t)$
by suitable choice of $\varepsilon$ and the critical values in (2.31) etc. are also
independent of this $\varepsilon$. From the computational point of view, the unweighted
ones appear to be a lot simpler too. For tests based on the weighted empirical
processes, Sinha (1979) has made some numerical studies of the relative powers
of the weighted and unweighted statistics (mostly, for exponential distributions),
by Monte Carlo studies, and observed that there is not much difference in
their performances; the unweighted one may even perform somewhat better than
the weighted one in some situations. A very similar picture holds for the
rank procedures too. As such, we generally tend to recommend the use of the
simpler unweighted procedures based on $K_n^0$ and $L_n^0$, respectively.
For rank procedures, we have considered a general class of LRS,
satisfying (3.3)-(3.4). Among various possibilities, two particular scores
are especially recommended for survival analysis. These are (i) the Wilcoxon
scores, where $a_k(i) = i/(k+1)$, $1 \le i \le k$, and (ii) the logrank scores where
$a_k(i) = 1 - \sum_{j=1}^i (k-j+1)^{-1}$, $1 \le i \le k$. The first set is suitable on the
grounds of computational simplicity and overall robustness, while the
second set, on the ground of local optimality when the underlying d.f.'s
are all exponential.
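The two recommended score sets are straightforward to generate. The sketch below follows the formulas as printed, $a_k(i) = i/(k+1)$ and $a_k(i) = 1 - \sum_{j \le i} (k-j+1)^{-1}$; the function names are illustrative.

```python
def wilcoxon_scores(k):
    """Wilcoxon scores a_k(i) = i / (k + 1), i = 1..k."""
    return [i / (k + 1) for i in range(1, k + 1)]


def logrank_scores(k):
    """Logrank scores a_k(i) = 1 - sum_{j=1}^{i} 1 / (k - j + 1), i = 1..k."""
    scores = []
    partial = 0.0
    for j in range(1, k + 1):
        partial += 1.0 / (k - j + 1)
        scores.append(1.0 - partial)
    return scores


print(wilcoxon_scores(3))
print(logrank_scores(3))
```

With this sign convention the logrank scores decrease in $i$ and sum to zero for every $k$, so their mean $\bar{a}_k$ in (2.16) vanishes.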
For non-staggering entry plans, the asymptotic
optimality of PCS rank tests has been studied by Sen (1976b). Though the
picture essentially remains the same for a staggering entry plan, analytical
comparisons may be very difficult to make.
For all the procedures proposed in Section 2, the asymptotic theory
works out well when the increments of $n_t$ ($t \le E_n$) are nowhere very
large, i.e., the batch frequencies (relative to $n$) are all small. When they
are not so, we are faced with a somewhat different situation (e.g., the
batch arrival model), which will be discussed in Section 6.
5. INCORPORATION OF RANDOM WITHDRAWALS
The modifications needed to incorporate dropouts or random withdrawals
of subjects (during $[E_1, T_c]$) are similar to those described in Section 5
of Majumdar and Sen (1978b) and Section 5 of Sinha and Sen (1982) and carry
over to the vector case treated here. Referred to the model (2.1)-(2.2),
we assume that for each $i$, $X_i^0$ and $Y_i$ are independent r.v.'s with
d.f.'s $F_i$ and $G$, respectively, where $G$ (does not depend on $i$) is
some unknown (but continuous) d.f. Then, the $X_i$ in (2.1) are independent
r.v.'s with d.f.'s $F_i^*$, given by
\[ 1 - F_i^*(x) = [1 - G(x)][1 - F_i(x)], \qquad i \ge 1, \tag{5.1} \]
and hence, under (2.1)-(2.3) and (5.1), $H_0$ entails $F_1^* = \cdots = F_n^*$. As such, the
test procedures described in Section 2 remain valid in this case too.
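The product form in (5.1) is simply the survival function of the minimum of two independent variables. The sketch below checks it by simulation for one particular choice, exponential failure and withdrawal times; the rates, the function names and the seed are assumptions made for the illustration.

```python
import math
import random


def observed_survival(x, rate_f, rate_g):
    """[1 - G(x)][1 - F(x)] for exponential F and G with the given rates, as in (5.1)."""
    return math.exp(-rate_g * x) * math.exp(-rate_f * x)


def simulated_survival(x, rate_f, rate_g, reps, seed=0):
    """Monte Carlo estimate of P{min(X^0, Y) > x} for independent exponential X^0 and Y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x0 = rng.expovariate(rate_f)   # failure time X^0
        y = rng.expovariate(rate_g)    # withdrawal time Y
        if min(x0, y) > x:
            hits += 1
    return hits / reps


exact = observed_survival(0.5, rate_f=1.0, rate_g=0.5)
approx = simulated_survival(0.5, rate_f=1.0, rate_g=0.5, reps=20000)
print(exact, approx)
```

The factor $1 - G(x)$ multiplies every $1 - F_i(x)$ alike, which is why withdrawals leave the null hypothesis structure intact while shrinking the differences between the observable distributions.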
However, we may note that by (5.1),
\[ F_i^*(x) - F_j^*(x) = [1 - G(x)][F_i(x) - F_j(x)], \]
so that the actual distance between $F_i^*$ and $F_j^*$ is smaller than that of
$F_i$ and $F_j$, for every $i, j = 1,\ldots,n$, so that the test based on the $X_i$
(instead of $X_i^0$) becomes less powerful and the loss of power depends on the
damping factor $1 - G(x)$, $0 \le x \le T_c - E_1$. In the above development we have
assumed that $G$ is continuous. If $G$ is not necessarily continuous, (5.1)
remains true, but ties among the $Y_i$ (and hence, $X_i$) may occur with
a positive probability. In such a case, the definition of the ranks in
(2.9) needs some adjustments (i.e., mid-ranks for tied observations), and,
generally, this will make the test more conservative. So far, we have
tacitly assumed that $X_i^0$ and $Y_i$ are independent and $G$ does not depend
on $i$ ($=1,\ldots,n$). In the negation of either of these, (5.1) may not hold,
and hence, some other adjustments may be necessary to apply the proposed
tests. As in Cox (1972), use of some partial likelihood or empirical
processes is possible, but will be generally very complicated.
6. MULTISAMPLE CASE AND BATCH ARRIVAL MODELS
Consider now the $q$ ($=p+1$) sample problem which can be characterized
by the model (2.2) with a simple structure on the $\mathbf{c}_i$. Suppose that the
individual sample sizes are $n_1,\ldots,n_q$, respectively, so that
$n = n_1 + \cdots + n_q$. We write $\lambda_{ni} = n^{-1} n_i$, $1 \le i \le q$, and assume that
(6.1)
Then, if at the $i$th entry point $E_i$ a unit from the $k$th population enters
the scheme, we let
\[ \mathbf{c}_i = \begin{cases} \mathbf{0}, & \text{if } k=1, \\ (\delta_{k2},\ldots,\delta_{kq})', & \text{if } 2 \le k \le q, \end{cases} \tag{6.2} \]
where the $\delta_{rs}$ are the usual Kronecker delta. Therefore,
(6.3)
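The multisample coding can be written out explicitly. The sketch below assumes the indicator coding in which the first sample is the baseline, $\mathbf{c}_i = \mathbf{0}$ for $k = 1$ and $\mathbf{c}_i = (\delta_{k2},\ldots,\delta_{kq})'$ otherwise; only the $k = 1$ case and the Kronecker deltas are explicit in the text, so the remaining details and the function name are assumptions.

```python
def design_vector(k, q):
    """Regression vector (length p = q - 1) for a unit from sample k out of q samples.

    Sample 1 is the baseline and receives the zero vector; sample k >= 2 receives
    a single 1 in position k - 1, i.e. the Kronecker deltas (delta_{k2}, ..., delta_{kq}).
    """
    return [1 if k == r else 0 for r in range(2, q + 1)]


# Hypothetical entry sequence: the sample label of each successive entrant, q = 3.
labels = [1, 3, 2, 2, 1]
vectors = [design_vector(k, 3) for k in labels]
print(vectors)
```

With this coding, the hypothesis that the regression vector is null says that all $q$ samples share the same distribution.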
If we let $n_t = n_{t1} + \cdots + n_{tq}$, where $n_{tk}$ is the number of subjects in the
$k$th sample entering the scheme at times on or before $t$ ($E_1 \le t < E_n < T_c$),
and let
\[ 0 < \theta \le 1 \ \Rightarrow \ \boldsymbol{\lambda}_{n_t} \to \boldsymbol{\lambda}, \quad \text{as } n \to \infty. \tag{6.4} \]
Technically this means a proportional entry plan for the $q$ samples, so
that PCS does not hamper the design of the experimentation relative to the
$q$ samples. In this case, in (2.8), we have
(6.5)
where the $S_{n_t j}$ are the empirical d.f. for the $j$th sample
observations. Also, in (2.7), we have
(6.6)
Thus, the weighted empirical processes can be expressed as linear combinations
of the individual sample empirical distribution processes. Similarly, the
$\mathbf{L}_{n_t, q_{t,T}}$ in (2.13) and (2.18) are linear combinations of the censored
multi-(sub-)sample rank statistics, where the $R_{n_t i}$ will stand for the
ranks of the observations among the $n_t$ entries prior to time $t$. The
computation of the test statistics poses no problem.
One of the advantages of the proposed test procedures is their adaptability
when the units enter into the scheme in batches and/or the monitoring is
made systematically at regular time intervals instead of continuously over time.
To illustrate this, we consider the multi-sample model in batch arrivals and
distinct time-interval inspection plans. A detailed account of such schemes
relating to ordered categorical data is given by Majumdar and Sen (1980).
Suppose that the subjects enter into the scheme in $b$ ($\ge 1$) batches at
time points $E_1^* < \cdots < E_b^*$. Also, suppose that the process is statistically
monitored at each of these $E_j^*$ and thereafter (if needed) at points
$E_{b+1}^* < \cdots < E_c^* = T_c$, where $T_c$ is a preassigned censoring time point. Consider
then the (ordered) time intervals $J_k = [E_k^*, E_{k+1}^*)$, $1 \le k \le c$, with
$E_{c+1}^* = \infty$. We assume that the lengths of these intervals are all the same,
so that $E_{j+1}^* = E_1^* + ja$, $a > 0$, for $j=1,\ldots,c-1$, though $J_c$ is of infinite
length. Let $n_{jk}$ be the number of subjects from the $j$th sample which enter
the scheme at time point $E_k^*$, for $1 \le j \le q$, $1 \le k \le b$. For notational
simplicity, we note that $n_{jk} = 0$, $\forall\, k > b$, and define
(6.7)
Suppose that out of these $n_{jk}$ subjects, $N_{jkr}$ failures occur in the time
interval $J_{k+r-1}$, for $1 \le j \le q$, $1 \le k \le c$, $r \ge 1$, and define
(6.8)
(6.9)
Note that at time point $E_k^*$, $k \ge 1$, there are $n_k^*$ units already in the
scheme. Thus, according to (2.7)-(2.8), for every $r \ge 1$ and $1 \le k \le c$,
(6.10)
(6.11)
(6.12)
Thus, by (2.20), (6.10), (6.11) and (6.12), we obtain that
(6.13)
for $k=1,2,\ldots,c-1$, $r=1,2,\ldots$. Clearly, at time point $E_i^*$, one has the
bunch of $\{ m_{n_k}^{**}(ar),\ k < i,\ r \ge 1,\ k+r = i \}$. Thus, if statistical monitoring
is made only at the points $E_1^*,\ldots,E_c^*$ ($= T_c$), there are in all $c^* = \binom{c}{2}$
points in the set $I_0^*$ in (3.16), so that unless $c^*$ is fairly large,
replacing this discrete set by a subrectangle of $I^2$ may result in some
conservative property of the test.
For rank based procedures, in this batch-arrival model, we will have
tied observations, as it is only known that $E_i^* + X_i$ belongs to some
interval $J_l$, for $i=1,\ldots,c$ and $l \ge 1$. Adjustments for ties for
non-staggering entry plans (relating to grouped data) were discussed in
detail by Majumdar and Sen (1977) (for scalar $c_i$) and Majumdar (1977)
(for vector $\mathbf{c}_i$). As in (6.7) through (6.13), one can obtain the
expressions for the $\mathbf{L}_{n_t, q_{t,T}}$ in (2.19) for $T = E_2^*,\ldots,E_c^*$ and
$t = E_1^*,\ldots,E_{c-1}^*$; we refer to Majumdar (1977) for these details. For intended
brevity we do not reproduce these notations. Here also, we will have
$c^* = \binom{c}{2}$ points in the set $I^*$ in (3.16) and the same conservative
character of the test prevails.
In a nonstaggering entry plan, based on some simulation studies, it
has been observed by Sinha (1979) that the two PCS testing procedures based
on the $m_{n_k}^{**}(ar)$ and $\mathbf{L}_{n_k,q}^*$ behave very similarly with respect to their
significance levels as well as empirical powers. While this picture is
expected to remain the same in a staggering entry plan, a verification needs
an extensive amount of simulation work not only for various underlying
distributions, but also for various entry patterns and withdrawal schemes.
7.
SIMULATION STUDIES OF THE CRITICAL VALUES OF THE PCS TEST STATISTICS
In practical applications of the PCS testing procedures, we need to
know about the critical values of $K_n^0$, $K_n^*$, $L_n^0$ and $L_n^*$. As has
been discussed in Section 3, boundary crossing probabilities for the Bessel
sheets and the Kiefer-Bessel processes provide suitable large sample
approximations to these critical values. On the other hand, for such
Bessel sheets or Kiefer-Bessel processes, generally, no analytical
expressions for these boundary crossing probabilities are available,
though some bounds are available in the literature. However, Monte Carlo
techniques can readily be employed for this purpose and this is pursued in
this section. The basic idea is to incorporate the weak invariance principles
for multi-dimensional arrays of r.v.'s to generate these processes and
employ them to provide suitable simulated results for the desired percentile
points.
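Let $\{U_{kij},\ 1 \le i \le n_1,\ 1 \le j \le n_2\}$, $k = 1, \ldots, p$, be $p$
independent sets of r.v.'s, where the $U_{kij}$ are independently distributed
according to a standard normal distribution. Let

$$S_{kij} = \sum_{r=1}^{i} \sum_{s=1}^{j} U_{krs}, \qquad 1 \le i \le n_1,\ 1 \le j \le n_2,\ k = 1, \ldots, p,$$

and $S_{k00} = S_{ki0} = S_{k0i} = 0$, $\forall\, i \ge 1$. Let then
$\mathbf{W}_n = \{\mathbf{W}_n(s,t):\ \mathbf{0} \le (s,t) \le \mathbf{1}\}$ be defined by

$$\mathbf{W}_n(s,t) = \{W_{n1}(s,t), \ldots, W_{np}(s,t)\} = (n_1 n_2)^{-1/2} \{S_{1[n_1 s][n_2 t]}, \ldots, S_{p[n_1 s][n_2 t]}\}, \qquad \mathbf{0} \le (s,t) \le \mathbf{1}. \tag{7.1}$$

Also, let $\mathbf{W}_n^0 = \{\mathbf{W}_n^0(s,t):\ \mathbf{0} \le (s,t) \le \mathbf{1}\}$ be defined by

$$\mathbf{W}_n^0(s,t) = \mathbf{W}_n(s,t) - t\,\mathbf{W}_n(s,1), \qquad \mathbf{0} \le (s,t) \le \mathbf{1}. \tag{7.2}$$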
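The construction in (7.1) and (7.2) can be sketched numerically as follows. This is only an illustration, not the authors' program: it assumes NumPy, evaluates the processes on the grid $(i/n_1, j/n_2)$, and the function name `partial_sum_processes` is hypothetical.

```python
import numpy as np


def partial_sum_processes(n1, n2, p, rng):
    """Return W_n and W_n^0 on the grid (i/n1, j/n2), i = 0..n1, j = 0..n2."""
    u = rng.standard_normal((p, n1, n2))            # the p arrays of N(0,1) variables
    s = np.zeros((p, n1 + 1, n2 + 1))               # zero row/column: S_k00 = S_ki0 = S_k0j = 0
    s[:, 1:, 1:] = u.cumsum(axis=1).cumsum(axis=2)  # S_kij = double partial sums
    w = s / np.sqrt(n1 * n2)                        # normalisation of (7.1)
    t = np.arange(n2 + 1) / n2                      # grid values of t
    w0 = w - t[None, None, :] * w[:, :, -1:]        # (7.2): W_n(s,t) - t * W_n(s,1)
    return w, w0


rng = np.random.default_rng(0)
w, w0 = partial_sum_processes(n1=50, n2=100, p=2, rng=rng)
print(w.shape, w0.shape)            # (2, 51, 101) (2, 51, 101)
print(np.abs(w0[:, :, -1]).max())   # 0.0, since W_n^0 is tied down at t = 1
```

The second printed value confirms that subtracting $t\,\mathbf{W}_n(s,1)$ ties the process down at $t = 1$, as well as at $t = 0$.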
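Then, by virtue of the classical weak invariance principles,

$$\mathbf{W}_n \xrightarrow{\ \mathcal{D}\ } \mathbf{W} \quad \text{and} \quad \mathbf{W}_n^0 \xrightarrow{\ \mathcal{D}\ } \mathbf{W}^0, \quad \text{as } n_1, n_2 \to \infty, \tag{7.3}$$

where $\mathbf{W} = \{W_1, \ldots, W_p\}$ and $\mathbf{W}^0 = (W_1^0, \ldots, W_p^0)$
are defined after (3.4).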
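As a result, by (3.5), (3.6) and (7.1)-(7.3), we obtain that for large
$n_1$ and $n_2$,

$$\max_{1 \le i \le n_1,\ 1 \le j \le n_2} \{\cdots\} \ \overset{\mathcal{D}}{\approx}\ \sup_{0 \le s \le 1,\ 0 \le t \le 1} \{B_p(s,t)\}, \tag{7.4}$$

$$\max_{1 \le i \le n_1,\ 1 \le j \le n_2} \{\cdots\} \ \overset{\mathcal{D}}{\approx}\ \sup_{0 \le s \le 1,\ 0 \le t \le 1} \{B_p^0(s,t)\}. \tag{7.5}$$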
We denote the upper $100\alpha\%$ points of the distributions of the statistics
on the right hand sides of (7.4) and (7.5) by $\xi_{\alpha,p}$ and
$\xi^0_{\alpha,p}$, respectively, and we obtain estimates for these by
simulating 500 copies of the left hand sides of (7.4) and (7.5), where we take
$n_1 = 50$ and $n_2 = 100$, and thus, for each of these 500 sets, we need to
generate $n_1 n_2 p = 5000p$ standard normal variables. For $p = 1$, these
values are already reported in Majumdar and Sen (1978b) and Sinha and Sen
(1982), while for some $p \ge 2$, these simulation results are given in
Table 7.1.
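A Monte Carlo scheme of this kind can be sketched as below. It is a hedged illustration rather than the program used for Table 7.1. The functionals defining the statistics come from (3.5) and (3.6), which are not reproduced in this section, so the sketch assumes, purely for illustration, the squared Euclidean norm of the $p$ components. The names `simulate_sup_statistics` and `upper_points` are hypothetical, and NumPy is assumed.

```python
import numpy as np


def simulate_sup_statistics(n1=50, n2=100, p=2, reps=500, seed=0):
    """Simulate `reps` copies of the grid maxima for W_n and for W_n^0."""
    rng = np.random.default_rng(seed)
    sup_w = np.empty(reps)
    sup_w0 = np.empty(reps)
    t = np.arange(1, n2 + 1) / n2                    # t = j / n2, j = 1..n2
    for b in range(reps):
        u = rng.standard_normal((p, n1, n2))         # n1 * n2 * p normal variables
        w = u.cumsum(axis=1).cumsum(axis=2) / np.sqrt(n1 * n2)
        w0 = w - t * w[:, :, -1:]                    # tied-down version, as in (7.2)
        # ASSUMPTION: squared Euclidean norm over the p components; replace
        # this by the functional of (3.5)-(3.6) when it is available.
        sup_w[b] = (w ** 2).sum(axis=0).max()
        sup_w0[b] = (w0 ** 2).sum(axis=0).max()
    return sup_w, sup_w0


def upper_points(values, alphas=(0.01, 0.05, 0.10)):
    """Empirical upper 100*alpha% points of the simulated maxima."""
    return {a: float(np.quantile(values, 1 - a)) for a in alphas}


sup_w, sup_w0 = simulate_sup_statistics(p=2)
print(upper_points(sup_w))
print(upper_points(sup_w0))
```

With only 500 replicates the extreme percentile points, especially at $\alpha = 0.01$, carry noticeable sampling error, so values obtained from such a sketch should not be expected to reproduce the tabulated entries exactly.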
TABLE 7.1
Simulated values of $\xi_{\alpha,p}$ and $\xi^0_{\alpha,p}$

                  p=2                p=3                p=4                p=5
 $\alpha$    $\xi$   $\xi^0$    $\xi$   $\xi^0$    $\xi$   $\xi^0$    $\xi$   $\xi^0$
 0.01      10.712    3.209    13.018    4.061    15.064    4.334    16.109    4.881
 0.05       7.788    2.578     9.470    3.077    11.155    3.438    12.394    3.956
 0.10       6.180    2.036     7.930    2.676     9.770    2.935    11.166    3.448
In view of (3.9) and (3.10), we need to consider only the distribution
of the statistic on the left hand side of (3.10), and, for this purpose, we
note that by virtue of (7.3), for every $0 < \eta < 1$ and
$0 < \varepsilon < \tfrac{1}{2}$,

$$\max_{[n_1 \eta] \le i \le n_1,\ [n_2 \varepsilon] \le j \le n_2 - [n_2 \varepsilon]} \{\cdots\} \ \overset{\mathcal{D}}{\approx}\ \cdots, \tag{7.6}$$

so that the empirical distribution of the left hand side, obtained from the
same set of 100 replicates (as in Table 7.1), provides the simulated values
for the percentile points of the distributions of the right hand side
statistics. These are denoted by $\xi^{0*}_{\alpha,p}(\varepsilon,\eta)$
and presented in Table 7.2.
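The truncated index set over which the maximum in (7.6) is taken can be made concrete as follows. This is a small sketch covering only the index ranges, since the statistic itself is defined in (3.10). The helper name `subrectangle_ranges` is hypothetical, and the small tolerance added before taking the integer part is an implementation choice to guard against floating-point rounding.

```python
import math


def subrectangle_ranges(n1, n2, eta, eps):
    """Index ranges [n1*eta] <= i <= n1 and [n2*eps] <= j <= n2 - [n2*eps]."""
    tol = 1e-9  # guards against products that fall just below an integer
    i_lo = math.floor(n1 * eta + tol)
    j_lo = math.floor(n2 * eps + tol)
    return range(i_lo, n1 + 1), range(j_lo, n2 - j_lo + 1)


# n1 = 50, n2 = 100 and eta = 0.02 as in the simulations; eps = 0.05 is one
# of the tabulated values.
i_range, j_range = subrectangle_ranges(50, 100, eta=0.02, eps=0.05)
print(i_range[0], i_range[-1], j_range[0], j_range[-1])  # 1 50 5 95
```

Increasing $\varepsilon$ shrinks the range of $j$ symmetrically from both ends, which is in line with the tabulated percentile points decreasing as $\varepsilon$ grows.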
TABLE 7.2
Simulated values of $\xi^{0*}_{\alpha,p}(\varepsilon,\eta)$ for $\eta = 0.02$

                              p=2                                  p=3
 $\varepsilon$   $\alpha$=.01  $\alpha$=.05  $\alpha$=.10   $\alpha$=.01  $\alpha$=.05  $\alpha$=.10
 .01               22.046        18.176        16.145         25.043        20.876        19.089
 .05               22.044        17.781        15.684         24.734        20.745        18.659
 .10               21.571        17.239        15.142         23.505        20.209        18.229
 .15               20.583        16.103        14.570         23.378        19.122        17.474
 .20               19.933        15.940        14.433         23.056        18.605        16.824
 .25               19.306        15.458        14.158         22.412        18.109        16.046

                              p=4                                  p=5
 $\varepsilon$   $\alpha$=.01  $\alpha$=.05  $\alpha$=.10   $\alpha$=.01  $\alpha$=.05  $\alpha$=.10
 .01               24.803        22.214        20.229         27.126        23.992        22.948
 .05               24.683        21.060        19.643         26.633        23.750        21.845
 .10               24.610        20.422        19.482         26.609        23.428        21.255
 .15               24.331        20.265        19.326         26.227        22.881        20.961
 .20               23.151        19.795        18.792         26.227        22.065        20.292
 .25               22.835        19.584        18.263         25.038        21.717        19.930
REFERENCES
ARMITAGE, P. (1975). Sequential Medical Trials (2nd ed.). Blackwell, Oxford.
CHATTERJEE, S.K. and SEN, P.K. (1973). Nonparametric testing under progressive
censoring. Calcutta Statist. Asso. Bull. 22, 13-50.
COX, D.R. (1972). Regression models and life tables. J. Roy. Statist. Soc.
Ser. B 34, 187-220.
CURNOW, R. (1972). Contribution to discussion on the paper by R. Peto and
J. Peto. J. Roy. Statist. Soc. Ser. A 135, 199-200.
DAVIS, C.E. (1978). A two-sample Wilcoxon test for progressively censored
data. Commn. Statist. Theor. Meth. A7, 389-398.
DELONG, D. (1980). Some asymptotic properties of a progressively censored
nonparametric test for multiple regression. J. Multivariate Anal. 10,
363-370.
FLEMING, T.R., O'FALLON, J.R. and O'BRIEN, P.C. (1980). Modified
Kolmogorov-Smirnov test procedures with applications to arbitrarily right
censored data. Biometrics 36, 607-625.
GEHAN, E.A. (1965a). A generalized Wilcoxon test for comparing arbitrarily
singly-censored samples. Biometrika 52, 203-223.
GEHAN, E.A. (1965b). A generalized two sample Wilcoxon test for doubly
censored data. Biometrika 52, 650-653.
HALPERIN, M. and WARE, J. (1974). Early decision in a censored Wilcoxon
two-sample test for accumulating survival data. J. Amer. Statist. Asso.
69, 414-422.
KOZIOL, J.A. and PETKAU, A.J. (1978). Sequential testing of the equality
of two survival distributions using the modified Savage statistic.
Biometrika 65, 615-623.
MAJUMDAR, H. (1977). Rank order tests for multiple regression for grouped
data under progressive censoring. Calcutta Statist. Asso. Bull. 26,
1-16.
MAJUMDAR, H. and SEN, P.K. (1977). Rank order tests for grouped data under
progressive censoring. Commn. Statist. Theor. Meth. A6, 507-524.
MAJUMDAR, H. and SEN, P.K. (1978a). Rank order tests for multiple regression
under progressive censoring. J. Multivariate Anal. 8, 73-95.
MAJUMDAR, H. and SEN, P.K. (1978b). Nonparametric testing for simple
regression under progressive censoring with staggering entry and random
withdrawal. Commn. Statist. Theor. Meth. A7, 349-371.
MAJUMDAR, H. and SEN, P.K. (1980). Chi square tests for general categorical
models under progressive censoring with batch arrivals. Sankhya, Ser. B
42, 42-57.
MANTEL, N. (1966). Evaluation of survival data and two new rank order
statistics arising in its consideration. Cancer Chemotherapy Rep. 50,
163-170.
PETO, R. and PETO, J. (1972). Asymptotically efficient rank invariant test
procedures (with discussions). J. Roy. Statist. Soc. Ser. A 135, 185-206.
SEN, P.K. (1978a). A two-dimensional functional permutational central limit
theorem for linear rank statistics. Ann. Probability ~, 13-26.
SEN, P.K. (1978b). Asymptotically optimal rank order tests for progressive
censoring. Calcutta Statist. Asso. Bull. £2, 65-78.
SINHA, A.N. (1979). Progressive censoring tests based on weighted empirical
distributions (doctoral dissertation). Inst. Statist. Univ. North
Carolina Mimeo Rep. No. 1217.
SINHA, A.N. and SEN, P.K. (1979a). Progressively censored tests for
clinical experiments and life testing problems based on weighted empirical
distributions. Commn. Statist. Theor. Meth. A8, 817-97.
SINHA, A.N. and SEN, P.K. (1979b). Progressively censored tests for multiple
regression based on weighted empirical distributions. Calcutta Statist.
Asso. Bull. 28, 57-82.
SINHA, A.N. and SEN, P.K. (1982). Tests based on empirical distributions
for progressive censoring schemes with staggering entry and random
withdrawal. Sankhya, Ser. B 44, in press.