* Work supported in part by U.S. National Heart and Lung Institute
contract NIH-NHLI-71-2243 from National Institutes of Health.
CHI-SQUARE TESTS FOR GENERAL MODELS UNDER PROGRESSIVE
CENSORING WITH BATCH ARRIVALS
by
Hiranmay Majumdar and Pranab Kumar Sen
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1058
February 1976
CHI-SQUARE TESTS FOR GENERAL CATEGORICAL MODELS UNDER
PROGRESSIVE CENSORING WITH BATCH ARRIVALS*
by
Hiranmay Majumdar and Pranab Kumar Sen
University of North Carolina
ABSTRACT
For some batch-arrival models relating to categorical data under time-sequential studies, suitably progressively censored tests based on chi-square statistics are proposed and studied. The necessary (asymptotic) distribution theory is considered for the null as well as local alternative hypotheses situations. To facilitate comparisons of the different proposed tests, a numerical illustration is presented at the end.
AMS 1970 Classification No: 62E20, 62F05, 62G10, 62L99.
Key Words and Phrases: Batch-arrival models, categorical data, chi-square tests, progressive censoring scheme, stopping time, time-sequential procedures.
1. INTRODUCTION
Chi-square statistics are widely used for testing of hypotheses for categorical data. In many situations relating to clinical trials and life-testing problems, the response categories are (ordered and) sequential in time, so that a complete collection of data may involve a considerable period of time (and hence, cost). For this reason, progressive censoring schemes (PCS) are often advocated with a view to terminating the experimentation at the earliest possible stage depending on the accumulated evidence at the successive stages. Further, in many such problems, not all the subjects enter into the study at a common point of time and, naturally, a batch-arrival model (BAM) seems to be more appropriate. Under a general model on the probability structure and assuming the number of batches to be fixed in advance, the current investigation provides suitable progressively censored tests based on chi-square statistics. Among other generalities, this includes as a special case a time-sequential test essentially related to Mantel (1966) for $2 \times (k+1)$, $k \ge 2$, contingency tables.
The basic problem is formulated in Section 2. The proposed tests are presented in Section 3. Sections 4 and 5 deal with the distribution theory of the test statistics under suitable null and (local) alternative hypotheses. In this context, it is shown that one of the proposed tests has the nice property that it has the same level of significance and power as the overall test but can lead to rejection at an early stage, resulting in lesser expected cost and time of experimentation. Section 6 is devoted to various extensions to multi-response models and comparisons with some other tests. The final section gives a numerical illustration.
2. FORMULATION OF THE PROBLEM
In the context of clinical trials or life testing, consider a typical truncated experimentation plan for a pre-determined amount of time $T$ and suppose that the responses (failures) are recorded in $k$ (ordered) time intervals $I_1,\dots,I_k$ and the surviving (beyond time $T$) subjects constitute the cell $I^*_{k+1}$. Thus, we have a set of $k+1$ ordered categories: $I_c = \{t: T_{c-1} < t \le T_c\}$, $1 \le c \le k$, and $I^*_{k+1} = \{t: t > T_k\}$, where $0 = T_0 < T_1 < \dots < T_k = T < \infty$. Conventionally, for every $c < k$, we let
(2.1)  $I^*_{c+1} = I_{c+1} \cup I_{c+2} \cup \dots \cup I_k \cup I^*_{k+1}$.
We conceive of a BAM where all the subjects may not enter into the study at the same time. Instead, there are $\ell\,(\ge 1)$ batches; the $j$-th batch enters into the study at time-point $T_{h_j}$ for some $0 \le h_j \le k-1$, so that the observable categories for this batch are $I_1,\dots,I_{k-h_j}$ and $I^*_{k-h_j+1}$, for $1 \le j \le \ell$; also, $0 = h_1 < \dots < h_\ell \le k-1$. Note that in this design, it is not necessary to fix in advance the individual batch sample sizes; rather the $j$-th batch sample size may be decided at the entry time $T_{h_j}$, for $j = 1,\dots,\ell$. Further, keeping in mind the usual comparative experiments involving one or more factors (or treatments etc.), we conceive of the $r\,(\ge 1)$ sample situation, where, for each batch, the $r$ parallel samples relate to the same categories. For a better understanding of the model, we refer to the numerical illustration in Section 7.
For the $i$-th sample in the $j$-th batch, the number of subjects entering into the study is denoted by $n_{ij}$, $j = 1,\dots,\ell$, $1 \le i \le r$. Let $n_{ij,t}$ be the number of observations (among the $n_{ij}$ subjects) belonging to $I_t$, $t = 1,\dots,k-h_j$, and $n^*_{ij,k-h_j+1}$ be the number of censored cases, $1 \le j \le \ell$, $1 \le i \le r$; on letting $k_j = k - h_j$, $1 \le j \le \ell$, we denote the corresponding cell probabilities by $\pi_{ij,t}$, $1 \le t \le k_j$, and $\pi^*_{ij,k_j+1}$, $1 \le j \le \ell$, $1 \le i \le r$. Further,
(2.2)  $\pi_{ij} = \sum_{t=1}^{k_j} \pi_{ij,t} + \pi^*_{ij,k_j+1} = 1$, $\forall\ 1 \le j \le \ell$, $1 \le i \le r$;
(2.3)  $n_{ij} = \sum_{t=1}^{k_j} n_{ij,t} + n^*_{ij,k_j+1}$, $\forall\ 1 \le j \le \ell$, $1 \le i \le r$.
The joint distribution of the observed cell frequencies is given by the product-multinomial:
(2.4)  $\varphi = \prod_{i=1}^{r} \prod_{j=1}^{\ell} \left[ \dfrac{n_{ij}!}{\prod_{t=1}^{k_j} n_{ij,t}!\,(n^*_{ij,k_j+1})!} \prod_{t=1}^{k_j} \pi_{ij,t}^{\,n_{ij,t}} \,(\pi^*_{ij,k_j+1})^{n^*_{ij,k_j+1}} \right]$.
For this model, one usually expresses
(2.5)  $\pi_{ij,t} = \pi_{ij,t}(\theta)$; $\theta' = (\theta_1,\dots,\theta_s)$, $\forall\ 1 \le i \le r$, $1 \le j \le \ell$, $1 \le t \le k_j$,
and proceeds to test a null hypothesis
(2.6)  $H_0: F_m(\theta) = 0$, $m = 1,\dots,q\ (\le s)$,
where the $F_m$ are suitable (linearly independent) functions of $\theta$. Let the test statistic be
(2.7)  $\hat X^2 = \sum_{i=1}^{r} \sum_{j=1}^{\ell} \left\{ \sum_{t=1}^{k_j} \dfrac{(n_{ij,t} - n_{ij}\hat\pi_{ij,t})^2}{n_{ij}\hat\pi_{ij,t}} + \dfrac{(n^*_{ij,k_j+1} - n_{ij}\hat\pi^*_{ij,k_j+1})^2}{n_{ij}\hat\pi^*_{ij,k_j+1}} \right\}$,
where $\hat\pi_{ij,t} = \pi_{ij,t}(\hat\theta)$, $\hat\pi^*_{ij,k_j+1} = \pi^*_{ij,k_j+1}(\hat\theta)$, and $\hat\theta$ is obtained by minimizing Pearson's
(2.8)  $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{\ell} \left\{ \sum_{t=1}^{k_j} \dfrac{(n_{ij,t} - n_{ij}\pi_{ij,t}(\theta))^2}{n_{ij}\pi_{ij,t}(\theta)} + \dfrac{(n^*_{ij,k_j+1} - n_{ij}\pi^*_{ij,k_j+1}(\theta))^2}{n_{ij}\pi^*_{ij,k_j+1}(\theta)} \right\}$
with respect to $\theta$, subject to the constraints in (2.6).
Actually, for large sample sizes, any BAN estimator of $\theta$ can be used, see Neyman (1949). For large sample sizes, under $H_0$ in (2.5), under suitable regularity conditions given by Cramér (1946, pp. 426-27) and subsequently modified by Birch (1964),
(2.9)  $\hat X^2 \xrightarrow{D} \chi^2_{r(k_1+\dots+k_\ell)-s+q}$,
and this provides a large sample test of $H_0$, where we reject $H_0$ for large values of $\hat X^2$.
The test-statistic $\hat X^2$ is generally used when we wait until the end of the experiment as envisaged in the design. However, as mentioned earlier, we like to adopt a PCS where we monitor the experiment from the beginning with a view to stopping experimentation at an early stage (i.e., at some time-point $T_c$, $c \le k$) if the evidence accumulated up to that stage is sufficient to reject $H_0$. Basically, without affecting the risk of making incorrect decisions, such a PCS procedure can be constructed and will result in a reduction in the time of experimentation, and hence, cost too. This is elaborated in the next section.
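As a small numerical aid, the Pearson sum in (2.8) can be sketched in Python for fixed cell probabilities. This is only an illustration under stated assumptions: the function names and the data are hypothetical, the probabilities are taken as given rather than estimated, and no minimization over $\theta$ is attempted.

```python
def pearson_chi_square(counts, probs):
    """Pearson chi-square for one multinomial sample.

    counts: observed frequencies; the last entry is the censored cell n*.
    probs:  hypothesised cell probabilities in the same order (summing to 1).
    """
    n = sum(counts)
    return sum((obs - n * p) ** 2 / (n * p) for obs, p in zip(counts, probs))


def product_multinomial_chi_square(samples):
    """Add the per-sample statistics over all (sample, batch) pairs, as in (2.8)."""
    return sum(pearson_chi_square(counts, probs) for counts, probs in samples)


# Hypothetical data: one batch, two samples, k = 2 intervals plus a censored cell.
samples = [
    ([30, 20, 50], [0.25, 0.25, 0.50]),
    ([20, 30, 50], [0.25, 0.25, 0.50]),
]
statistic = product_multinomial_chi_square(samples)
```

Each sample contributes 25/25 + 25/25 + 0 = 2 here, so the total is 4.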
3. CHI-SQUARE TYPE TESTS FOR GENERAL MODEL UNDER PCS
Let us examine the structure of the categorical data at the completion of time $T_c$. Let $J_c = \{j: h_j \le c\}$. Then, for $j \in J_c$, writing $c_j = c - h_j$, the $j$-th batch $(n_{ij})$ relates to the categories $I_1,\dots,I_{c_j}, I^*_{c_j+1}$ with respective frequencies $n_{ij,1},\dots,n_{ij,c_j}, n^*_{ij,c_j+1}$, $1 \le i \le r$, so that the joint probability function is
(3.1)  $\varphi_c = \prod_{i=1}^{r} \prod_{j \in J_c} \left[ \dfrac{n_{ij}!}{\prod_{t=1}^{c_j} n_{ij,t}!\,(n^*_{ij,c_j+1})!} \prod_{t=1}^{c_j} \pi_{ij,t}^{\,n_{ij,t}} \,(\pi^*_{ij,c_j+1})^{n^*_{ij,c_j+1}} \right]$.
Consider then the related minimum chi-square statistic
(3.2)  $\hat X_c^2 = \sum_{i=1}^{r} \sum_{j \in J_c} \left\{ \sum_{t=1}^{c_j} \dfrac{(n_{ij,t} - n_{ij}\hat\pi_{ij,t})^2}{n_{ij}\hat\pi_{ij,t}} + \dfrac{(n^*_{ij,c_j+1} - n_{ij}\hat\pi^*_{ij,c_j+1})^2}{n_{ij}\hat\pi^*_{ij,c_j+1}} \right\}$,
where $\hat\pi_{ij,t}$ and $\hat\pi^*_{ij,c_j+1}$ are the estimated probabilities obtained by substituting $\theta = \hat\theta_c$ in (2.5) and $\hat\theta_c$ is the minimum chi-square estimator of $\theta$ based on (3.1). Here also, use of any BAN estimator (based on (3.1)) is permissible for large $n_{ij}$. Note that $\hat X^2 = \hat X_k^2$ and the earliest stopping time-point $T_b$ for constructing $\hat X_c^2$ involves $b+1$ cells $(I_1,\dots,I_b, I^*_{b+1})$ for which $\sum_{j \in J_b} (b - h_j) > s - q$. Further note that $\varphi_c$ involves only a subset of the probabilities $\{\pi_{ij,t}\}$, and hence, the null hypothesis $H_0$ in (2.5) may involve only a subset of restraints on the $\pi_{ij,t}$ at the $c$-th stage, where $q(c)$ is $\nearrow$ in $c$. This is particularly true for contingency tables, where as $c$ increases, we deal with an increasing number of marginal probabilities associated with the categories. Thus we conceive of (2.4) and set
(3.3)  $H_0 = \bigcap_{c=b}^{k} H_{0,c}$, $H_{0,c}$ is monotonic in $c$,
(3.4)  $H_{0,c}: F_m^{(c)}(\theta) = 0$, $\forall\ 1 \le m \le q(c)$, $q(c)$ is $\nearrow$ in $c$.
Naturally, if $H_{0,c}$ is not true for some $c\,(\le k)$, then $H_0$ is also not true.
3.1. Type A PCS test for $H_0$. Consider the set of time-points $\{T_c: b \le c \le k\}$. At $T_c$, compute $\hat X_c^2$. Continue experimentation so long as $\hat X_c^2 \le w_\alpha$, $c \le k$. If, for the first time, for $c = C$, $\hat X_C^2 > w_\alpha$, stop experimentation at time $T_C$ along with rejection of $H_0$. If no such $C\,(\le k)$ exists, accept $H_0$ at time $T_k$. Here we need to choose $w_\alpha$ in such a way that the overall level of significance of the test is $\alpha$: $0 < \alpha < 1$. Note that in this setup, both $C$ and $T_C$ are stochastic variables. Further, the tests proposed here depend only on the set of frequencies of the uncensored cells. However, parametric considerations may enter (excepting in some cases like contingency table situations) through the dependence of $\pi_{ij,t}$ on $\theta$.
3.2. Type B PCS test for $H_0$. Let us define
(3.5)  $D_c = \max\{\hat X_c^2 - \hat X_{c-1}^2,\ 0\}$, $b < c \le k$; $D_c = \hat X_b^2$ if $c = b$.
Then continue experimentation so long as $D_c \le \Delta_c$, $c \ge b$. If for the first time, for $c = C$, $D_C > \Delta_C$, stop experimentation at time $T_C$ along with the rejection of $H_0$. If no such $C\,(\le k)$ exists, accept $H_0$. Here $\{\Delta_b,\dots,\Delta_k\}$ are positive constants such that
(3.6)  $P\{D_c \le \Delta_c,\ \forall\ b \le c \le k \mid H_0\} = 1 - \alpha$, $0 < \alpha < 1$,
where $\alpha$ = level of significance.
For both the tests, as we shall see in the next section, the design must be compatible in the sense that (i) $q(c)$ should be $\nearrow$ in $c$ and (ii) $\sum_{j \in J_c} (c - h_j) - s + q(c)$ should be $\nearrow$ in $c$: $b \le c \le k$.
3.3. Two simultaneous tests. We have considered earlier the situation when the number of batches as well as the time-points $T_{h_j}$, $j = 1,\dots,\ell$, at which the batches enter into the study are predetermined, in order that we know in advance the degrees of freedom of the terminal statistic $\hat X^2$ relating to the overall test computed over all the batches and all the samples. Corresponding to Type A and Type B tests, we may devise two simultaneous test procedures across the independent batches which will be termed as Type C and Type D tests respectively. As we shall later show, these tests will be particularly useful when the number of batches is large and the experiment is continued over a long period of time. In these latter tests, the terminal statistics are based on the respective batches only. Consequently, we may relax, in this case, the condition of batch-arrivals at pre-determined time-points; rather they may be conveniently decided as the experimentation progresses. We now proceed to describe these two tests.
3.3.1. Type C PCS test for $H_0$. As in Type A test but separately for the relevant batches, compute $\{{}_j\hat X_c^2\}_{j=1}^{\ell_c}$ at time-point $T_c$, where $\ell_c$ is the number of batches involved at the $c$-th stage of censoring. If the earliest time-point for constructing this statistic is $T_{b_0}$ for the first batch, then these are $T_{b_0+h_2}, T_{b_0+h_3}, \dots, T_{b_0+h_\ell}$ for the subsequent batches respectively. The dependence of $\ell_c$ on $c$ comes due to the fact that
(3.7)  $\ell_c = \sum_{j=1}^{\ell} I(b_0 + h_j \le c)$,
and $I(x)$ stands for the indicator function for the event. At the time-point $T_{b_0}$, $\ell_c$ consists of only a single element. This $T_{b_0}$ may be different from $T_b$, defined in sections 3.1 and 3.2. Continue experimentation so long as $\{{}_j\hat X_c^2 \le {}_jw_{\alpha''}\}$, $j = 1,\dots,\ell_c$. Stop experimentation along with the rejection of $H_0$ at time $T_C$ if, for the first time for $c = C$ $(b_0 \le C \le k)$, ${}_j\hat X_C^2 > {}_jw_{\alpha''}$ for some $j = 1,\dots,\ell_C$. If no such $C$ exists, accept $H_0$ at $T_k = T$. Here we choose ${}_jw_{\alpha''}$ in such a way that the overall level of significance for every $j$ is $\alpha''$: $0 < \alpha'' < 1$ and the overall error rate of the experiment is $\alpha$: $0 < \alpha < 1$, so that $1 - \alpha = (1 - \alpha'')^{\ell}$.
3.3.2. Type D PCS test for $H_0$. At $T_c$, for each $j = 1,\dots,\ell_c$, compute
(3.8)  ${}_jD_c = \max\{{}_j\hat X_c^2 - {}_j\hat X_{c-1}^2,\ 0\}$, $b_0 + h_j < c \le k$; ${}_jD_c = {}_j\hat X_c^2$ if $c = b_0 + h_j$; $j = 1,\dots,\ell_c$.
Then continue experimentation so long as ${}_jD_c \le {}_j\Delta_c$, $\forall\ j$; $c \ge b_0 + h_j$. If for the first time for $c = C$, ${}_jD_C > {}_j\Delta_C$ for some $j$, stop experimentation at time-point $T_C$ along with the rejection of $H_0$. If no such $C\,(\le k)$ exists, accept $H_0$ at $T_k = T$. Here $\{{}_j\Delta_{b_0+h_j},\dots,{}_j\Delta_k\}$ form a double sequence of positive constants such that
(3.9)  $P\{{}_jD_c \le {}_j\Delta_c,\ \forall\ b_0 + h_j \le c \le k \mid {}_jH_0\} = 1 - \alpha''$,
where $\alpha''$ is the level of significance for an individual batch and the overall error rate of the experiment, $\alpha$, satisfies $1 - \alpha = (1 - \alpha'')^{\ell}$.
3.4. Relevance to the Mantel procedure.
In further extension of the methods of Mantel-Haenszel (1959) and Mantel (1963), Mantel (1966) considered an overall comparison of two sets of life tables in their entirety. The method is equivalent to decomposing a $2 \times (k+1)$ contingency table into $k$ correlated $2 \times 2$ contingency tables and combining the result of each into a summary chi-square with one degree of freedom. By bringing a conditionality argument to bear, he showed how this statistic is the square of the sum of $k$ orthogonal random variables. He also proposed a time-sequential procedure where the summary chi-square was computed after each failure and the test statistic was the maximum of these chi-squares over the course of the study. Mantel, however, did not mention that it is possible to develop another time-sequential test on the basis of the distribution of the maximum of the set of $k$ conditionally orthogonal chi-squares, each with one degree of freedom, from the $k$ $2 \times 2$ contingency tables, along the lines of section 3.2. Structurally, Mantel's procedure is akin to the Type B PCS test.
4. ASYMPTOTIC DISTRIBUTION THEORY UNDER $H_0$
First, through the following two lemmas, we show that
(4.1)  $\hat X_c^2$ is non-decreasing in $c$: $b \le c \le k$.
Lemma 4.1. Let $\{m_{ij,t};\ 1 \le t \le k_j,\ m^*_{ij,k_j+1},\ 1 \le j \le \ell,\ 1 \le i \le r\}$ be a set of positive numbers such that $\sum_{t=1}^{k_j} m_{ij,t} + m^*_{ij,k_j+1} = n_{ij}$, $\forall\ 1 \le i \le r$, $1 \le j \le \ell$, and let $m^*_{ij,c_j+1}$, $c$: $b \le c \le k$, be defined as before (that is, by pooling the cells beyond $c_j$, as in (2.1)). Let then
(4.2)  $X_c^2(m) = \sum_{i=1}^{r} \sum_{j \in J_c} \left\{ \sum_{t=1}^{c_j} \dfrac{(n_{ij,t} - m_{ij,t})^2}{m_{ij,t}} + \dfrac{(n^*_{ij,c_j+1} - m^*_{ij,c_j+1})^2}{m^*_{ij,c_j+1}} \right\}$, $b \le c \le k$.
Then, $X_c^2(m)$ is non-decreasing in $c$: $b \le c \le k$.
Proof: By definition,
(4.3)  $X_{c+1}^2(m) = \sum_{i=1}^{r} \sum_{j \in J_{c+1}} \left\{ \sum_{t=1}^{c_j+1} \dfrac{(n_{ij,t} - m_{ij,t})^2}{m_{ij,t}} + \dfrac{(n^*_{ij,c_j+2} - m^*_{ij,c_j+2})^2}{m^*_{ij,c_j+2}} \right\}$
$\ge \sum_{i=1}^{r} \sum_{j \in J_c} \left\{ \sum_{t=1}^{c_j} \dfrac{(n_{ij,t} - m_{ij,t})^2}{m_{ij,t}} + \dfrac{(n_{ij,c_j+1} - m_{ij,c_j+1})^2}{m_{ij,c_j+1}} + \dfrac{(n^*_{ij,c_j+2} - m^*_{ij,c_j+2})^2}{m^*_{ij,c_j+2}} \right\}$.
We now write $a_{ij} = n_{ij,c_j+1} - m_{ij,c_j+1}$, $d_{ij} = n^*_{ij,c_j+2} - m^*_{ij,c_j+2}$, $e_{ij} = m_{ij,c_j+1}$ and $f_{ij} = m^*_{ij,c_j+2}$, so that $n^*_{ij,c_j+1} - m^*_{ij,c_j+1} = a_{ij} + d_{ij}$ and $m^*_{ij,c_j+1} = e_{ij} + f_{ij}$. Then
(4.4)  $X_{c+1}^2(m) - X_c^2(m) \ge \sum_{i=1}^{r} \sum_{j \in J_c} \left\{ \dfrac{a_{ij}^2}{e_{ij}} + \dfrac{d_{ij}^2}{f_{ij}} - \dfrac{(a_{ij} + d_{ij})^2}{e_{ij} + f_{ij}} \right\}$.
Now,
(4.5)  $\dfrac{a_{ij}^2}{e_{ij}} + \dfrac{d_{ij}^2}{f_{ij}} - \dfrac{(a_{ij} + d_{ij})^2}{e_{ij} + f_{ij}} = \dfrac{a_{ij}^2 f_{ij}^2 + d_{ij}^2 e_{ij}^2 - 2 a_{ij} d_{ij} e_{ij} f_{ij}}{e_{ij} f_{ij} (e_{ij} + f_{ij})} = \dfrac{(a_{ij} f_{ij} - d_{ij} e_{ij})^2}{e_{ij} f_{ij} (e_{ij} + f_{ij})} \ge 0$, $\forall$ real $a_{ij}$ and $d_{ij}$,
since $e_{ij}$ and $f_{ij}$ are positive.
As after (3.2), let $\hat\theta_c$ be the minimum chi-square estimator of $\theta$ based on (3.1), so that by (3.2),
(4.6)  $\hat X_c^2 = X_c^2(\hat m_c)$; $\hat m_c = (\hat m_{ij,t} = n_{ij}\,\pi_{ij,t}(\hat\theta_c),\ \forall\ i, j, t)$.
Lemma 4.2. $X_c^2(m) \ge X_c^2(\hat m_c)$, $\forall\ m \ne \hat m_c$, $b \le c \le k$.
The proof directly follows from the fact that $\hat m_c = m(\hat\theta_c)$, $\hat\theta_c$ being the minimum chi-square estimator of $\theta$ based on $\varphi_c$, leads to the minimum value of $X_c^2(m)$ (over all $\theta$).
From the preceding two lemmas, we immediately conclude that
(4.7)  $\hat X_c^2 \le \hat X_{c+1}^2$, $b \le c < k$.
By the results of Neyman (1949), (4.7) holds (up to the order of $o_p(1)$) for any sequence of BAN estimators.
4.1. Asymptotic null distribution of Type A test statistic. Let
(4.8)  $f_c = r \left( \sum_{j \in J_c} c_j \right) - s + q(c)$, $b \le c \le k$,
so that by the compatibility condition of the design $f_c$ is $\nearrow$ in $c$. Further we let
(4.9)  $d_c = f_c - f_{c-1}$ for $b < c \le k$ and $d_b = f_b$.
Finally, let $\chi^2_{p,\gamma}$ be the upper $100\gamma\%$ point of the chi-square distribution with $p$ degrees of freedom (d.f.). It follows then from the basic results of Cramér (1946, pp. 424-34) that under the regularity conditions stated in Birch (1964), for every $c$: $b \le c \le k$,
(4.10)  $\hat X_c^2 \xrightarrow{D} \chi^2_{f_c}$ when $H_{0,c}$ holds.
However, the $\hat X_c^2$ are obviously (even asymptotically) not independent. But by (4.1) and (4.10),
(4.11)  $P\{\hat X_c^2 > w_\alpha$ for some $c$: $b \le c \le k \mid H_0\} = P\{\hat X_k^2 > w_\alpha \mid H_0\} \to \alpha$ if $w_\alpha = \chi^2_{f_k,\alpha}$.
Note that $w_\alpha = \chi^2_{f_k,\alpha}$ is the critical value of the overall test (based on completion of the experiment up to the time $T_k = T$). Hence the Type A test consists in choosing the same critical value as the overall test but allows the possibility of an early rejection without any increased risk of making an incorrect decision.
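The bookkeeping in (4.8) and (4.9) is easy to get wrong by hand, so a small sketch may help. It assumes the user supplies the entry indices $h_j$, the number of parameters $s$ and the restraint count $q(c)$ as a function; the names `degrees_of_freedom` and `increments` and the example design are hypothetical.

```python
def degrees_of_freedom(c, r, h, s, q):
    """f_c = r * sum over batches with h_j <= c of (c - h_j), minus s, plus q(c)."""
    cells = sum(c - h_j for h_j in h if h_j <= c)
    return r * cells - s + q(c)


def increments(b, k, r, h, s, q):
    """Return the dictionaries f_c and d_c for b <= c <= k, following (4.8)-(4.9)."""
    f = {c: degrees_of_freedom(c, r, h, s, q) for c in range(b, k + 1)}
    d = {b: f[b]}
    for c in range(b + 1, k + 1):
        d[c] = f[c] - f[c - 1]
    return f, d


# Hypothetical design: r = 2 samples, batches entering at h = (0, 2),
# k = 4 stages, s = 3 parameters, one restraint at every stage, b = 2.
f, d = increments(b=2, k=4, r=2, h=(0, 2), s=3, q=lambda c: 1)
```

By construction the increments add up to the terminal degrees of freedom $f_k$, which is the decomposition the Type B test relies on.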
4.2. Asymptotic null distribution of the Type B test statistic. By (4.1), (4.10) and the classical Cochran theorem [viz. Searle (1971, p. 64)], under the same (Cramér-Birch) regularity conditions, under $H_{0,c}$,
(4.12)  $D_c \xrightarrow{D} \chi^2_{d_c}$ for $b \le c \le k$.
Again, $\hat X_k^2 = D_b + \dots + D_k$, $D_c \ge 0$, $\forall\ c$, and $f_k = d_b + \dots + d_k$, $d_c > 0$, $\forall\ c$, so that $\{D_c: b \le c \le k\}$ forms an asymptotically independent set of chi-square statistics, see Searle (1971, pp. 60-61). Hence, asymptotically,
(4.13)  $P\{D_c \le \Delta_c,\ \forall\ b \le c \le k \mid H_0\} \to \prod_{c=b}^{k} P\{D_c \le \Delta_c \mid H_{0,c}\} \to \prod_{c=b}^{k} P\{\chi^2_{d_c} \le \Delta_c\}$,
and thus, if we let
(4.14)  $\Delta_c = \chi^2_{d_c,\alpha'}$, $b \le c \le k$; $1 - \alpha = (1 - \alpha')^{k-b+1}$,
we obtain from (4.13) and (4.14) that
(4.15)  $P\{D_c \le \chi^2_{d_c,\alpha'},\ \forall\ b \le c \le k \mid H_0\} \to 1 - \alpha$.
As a result, the Type B test consists in comparing the successive differences $\{D_c: b \le c \le k\}$ with the corresponding critical values $\{\chi^2_{d_c,\alpha'}: b \le c \le k\}$ and rejecting $H_0$ at the earliest possible value of $c$, if there is any at all.
4.3. Asymptotic null distribution of Type C test statistic. Let at the $c$-th censoring stage, ${}_jq(c)$ be the number of restrictions on the model for the $j$-th batch. Because of the compatibility condition ${}_jq(c)$ is $\nearrow$ in $c$. We also define
(4.16)  ${}_jf_c = r(c - b_0 - h_j) - s + {}_jq(c)$, $b_0 + h_j \le c \le k$.
Due to the compatibility condition of the design ${}_jf_c$ also is $\nearrow$ in $c$. Further we let
(4.17)  ${}_jd_c = {}_jf_c - {}_jf_{c-1}$, $b_0 + h_j < c \le k$; ${}_jd_c = {}_jf_c$, $c = b_0 + h_j$; $j = 1,\dots,\ell$.
Leaving the details, we proceed as in section 4.1 and obtain
(4.18)  $P\{{}_j\hat X_c^2 > {}_jw_{\alpha''}$ for some $c$: $b_0 + h_j \le c \le k \mid {}_jH_0\} = P\{{}_j\hat X_k^2 > {}_jw_{\alpha''} \mid {}_jH_0\} \to \alpha''$ if ${}_jw_{\alpha''} = \chi^2_{{}_jf_k,\alpha''}$,
where we conceive of $H_0$ as
(4.19)  $H_0 = \bigcap_{j=1}^{\ell} \bigcap_{c=b_0+h_j}^{k} {}_jH_{0,c}$, ${}_jH_{0,c}$ is monotonic,
(4.20)  ${}_jH_0 = \bigcap_{c=b_0+h_j}^{k} {}_jH_{0,c}$,
(4.21)  ${}_jH_{0,c}: {}_jF_m^{(c)}(\theta) = 0$, $\forall\ 1 \le m \le {}_jq(c)$.
Consequently,
(4.22)  $P\{{}_j\hat X_c^2 \le {}_jw_{\alpha''},\ \forall\ c$: $b_0 + h_j \le c \le k \mid {}_jH_0\} = P\{{}_j\hat X_k^2 \le {}_jw_{\alpha''} \mid {}_jH_0\} \to 1 - \alpha''$
and
(4.23)  $P\{{}_j\hat X_c^2 \le {}_jw_{\alpha''},\ \forall\ c$: $b_0 + h_j \le c \le k$ and $\forall\ j = 1,\dots,\ell \mid H_0\} \to (1 - \alpha'')^{\ell}$,
because the batches are independent. Therefore, the overall level of significance is
(4.24)  $P\{{}_j\hat X_c^2 > {}_jw_{\alpha''}$ for some $c$ and some $j = 1,\dots,\ell \mid H_0\} \to 1 - (1 - \alpha'')^{\ell} = \alpha$.
As such, the Type C test consists in using the same critical value at all stages of censoring for the same batch, but different critical values for the different batches.
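The per-stage level $\alpha'$ in (4.14) and the per-batch level $\alpha''$ in (4.24) both come from inverting a relation of the form $1 - \alpha = (1 - a)^N$. A minimal sketch follows; the function name and the example values of $k$, $b$ and $\ell$ are assumptions chosen only for illustration.

```python
def per_stage_level(alpha, stages):
    """Solve 1 - alpha = (1 - a) ** stages for a."""
    return 1.0 - (1.0 - alpha) ** (1.0 / stages)


# Hypothetical design: overall level 0.05, stages c = 1, ..., 24 and two batches.
alpha = 0.05
k, b, ell = 24, 1, 2
alpha_prime = per_stage_level(alpha, k - b + 1)   # level used in (4.14)
alpha_double_prime = per_stage_level(alpha, ell)  # level used in (4.24)
```

Both derived levels are smaller than $\alpha$, which is the price paid for monitoring several asymptotically independent statistics.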
4.4. Asymptotic null distribution of Type D test statistic. By similar arguments as in section 4.2, under ${}_jH_{0,c}$,
(4.25)  ${}_jD_c \xrightarrow{D} \chi^2_{{}_jd_c}$.
Also,
(4.26)  ${}_j\hat X_k^2 = \sum_{c=b_0+h_j}^{k} {}_jD_c$, ${}_jD_c \ge 0$, $\forall\ c$, and ${}_jf_k = \sum_{c=b_0+h_j}^{k} {}_jd_c$, ${}_jd_c > 0$, $\forall\ c$,
so that $\{{}_jD_c: b_0 + h_j \le c \le k\}$ forms an asymptotically independent set of chi-square statistics. Asymptotically, then
(4.27)  $P\{{}_jD_c \le {}_j\Delta_c,\ \forall\ b_0 + h_j \le c \le k \mid {}_jH_0\} \to \prod_{c=b_0+h_j}^{k} P\{{}_jD_c \le {}_j\Delta_c \mid {}_jH_{0,c}\} \to \prod_{c=b_0+h_j}^{k} P\{\chi^2_{{}_jd_c} \le {}_j\Delta_c\}$.
If we let ${}_j\Delta_c = \chi^2_{{}_jd_c,\,{}_j\alpha'''}$, $b_0 + h_j \le c \le k$, and $(1 - \alpha'') = (1 - {}_j\alpha''')^{(k-b_0-h_j+1)}$, we obtain
(4.28)  $P\{{}_jD_c \le \chi^2_{{}_jd_c,\,{}_j\alpha'''},\ \forall\ b_0 + h_j \le c \le k \mid {}_jH_0\} \to 1 - \alpha''$.
Further if $1 - \alpha = (1 - \alpha'')^{\ell}$,
(4.29)  $P\{{}_jD_c \le \chi^2_{{}_jd_c,\,{}_j\alpha'''},\ \forall\ b_0 + h_j \le c \le k,\ \forall\ j \mid H_0\} \to (1 - \alpha'')^{\ell} = 1 - \alpha$.
Consequently, the Type D test amounts to comparing successive differences $\{{}_jD_c: b_0 + h_j \le c \le k;\ j = 1,\dots,\ell\}$ with the respective critical values $\{\chi^2_{{}_jd_c,\,{}_j\alpha'''}: b_0 + h_j \le c \le k;\ j = 1,\dots,\ell\}$ for an early rejection.
4.5. Restrictions on the probability structure with BAM. With BAM, we test the null hypothesis of homogeneity of the $r$ samples, the batch arrivals being assumed to have a uniform pattern for every sample. It is not unrealistic to further assume that the batches within the same sample have the same underlying distribution. As such, we may impose additional restrictions on the probability structure of the model as
(4.30)  $\pi_{ij,t} = \pi_{ij',t}$, $j < j'$, $t = 1,\dots,k_{j'}$, $\forall\ i$,
apart from the restraints on $\pi_{ij,t}$ under $H_0$.
4.6. Some special cases. With a single batch in the one-sample situation the sequence of $D_c$ statistics will uniformly have one degree of freedom, assuming a general model. When the model is unspecified, i.e., when parametric dependence is not assumed in the model, we encounter the case of a contingency table (with a single batch). With $r$ samples, $D_c$ will uniformly have $r-1$ degrees of freedom, for every $c$, in a contingency table situation. In the two-sample problem of Mantel (1966), $D_c$ will always have one degree of freedom.
5. ASYMPTOTIC DISTRIBUTION THEORY OF THE PROPOSED TEST
STATISTICS UNDER LOCAL ALTERNATIVES AND POWER CONSIDERATIONS.
5.1. Consistency of the test-statistics.
'if lsmsq(c)
least one
m.
We have under
and under the fixed alternatives
Let
IT ..
1J , t
o
n..
1J , t
under
H
O,c
H
:
O,C
F(c)(8)
m
~
H
: F (c) (8) ~ 0
l,e
m ~
,
and
n..
1J , t
1
1J ,t
IT ..
= 0
for at
under
,
HI
' i:= 1, ... ,r; j :=
,c
1, ...
,i; t:= 1, ... ,c ..
Then under the Cramer regu-
)
larity conditions modified by Birch (1964)
is true and
IT .. tee )
1),
~c
gets large, under
HI
ing on
is
and
c
,n
,c
n
-+
00
when
-1"'2 p
X
c
-+
is true.
H
l,c
E,
c
E (>0)
c
where
is the combined sample size.
the probability
0(1) ,
one as
n
e IT~.1), t
",2
X
0f
excee d ing
c
H
Therefore, as
n
O,C
is a constant depend2
As
X2
r(k
when
l
X
r(k l +·· .+ki)-s+q,a
ten d s to
+· .. ki)-s+q,a
By similar reasoning, the powers of the other three tests viz. Type B, Type C, and Type D can be shown to go to one as $n \to \infty$. Consequently, in order to compare the asymptotic performance of the four statistics, we consider only local alternatives.
5.2. Asymptotic distribution theory of $\hat X_c^2$. Consider $H_{0,c}$ and the local alternative $H^{(n)}_{1,c}$,
(5.1)  $H_{0,c}: F_m^{(c)}(\theta) = 0$, $\forall\ 1 \le m \le q(c)$; $H^{(n)}_{1,c}: F_m^{(c)}(\theta) = n^{-1/2} e_m$, $\forall\ 1 \le m \le q(c)$,
where $e_m \ne 0$ for at least one $m$; $n = \sum_{i=1}^{r} \sum_{j=1}^{\ell} \left\{ \sum_{t=1}^{c_j} n_{ij,t} + n^*_{ij,c_j+1} \right\}$ and
(5.2)  $\lim_{n \to \infty} n_{ij}/n = Q_{ij}$, $0 < Q_{ij} < 1$.
For convenience we reparaphrase $H_{0,c}$ and $H^{(n)}_{1,c}$ as
(5.3)  $H_{0,c}: \pi^0_{ij,t} = \pi_{ij,t}(\theta^0)$; $\theta^{0\prime} = (\theta^0_1,\dots,\theta^0_s)$, $F_m^{(c)}(\theta^0) = 0$, $\forall\ 1 \le m \le q(c)$, or $\pi^0_{ij,t} = \pi_{ij,t}(\theta^0_c)$, $\theta^{0\prime}_c = (\theta^0_1,\dots,\theta^0_{s-q(c),c})$;
$H^{(n)}_{1,c}: \pi^{(n)}_{ij,t} = \pi^0_{ij,t} + n^{-1/2}\delta_{ij,t}$, $F_m^{(c)}(\theta^0) = 0$, $1 \le m \le q(c)$, $\forall\ i = 1,\dots,r$; $j = 1,\dots,\ell$; $t = 1,\dots,c_j$.
Note that for $t = c_j + 1$
(5.4)  $\pi^{*(n)}_{ij,c_j+1} = \pi^{*0}_{ij,c_j+1} + n^{-1/2}\delta^*_{ij,c_j+1}$, since $\sum_{t=1}^{c_j} \delta_{ij,t} + \delta^*_{ij,c_j+1} = 0$, $\forall\ i, j$.
We are assuming that at least one $\delta_{ij,t}$ is non-zero. Then by theorem 3.1 of Diamond (1963) under the regularity conditions (a), (b), (c), (d) of Mitra (1958) and (e) and (f) of Diamond (1963), under $H^{(n)}_{1,c}$,
(5.5)  $\hat X_c^2 \xrightarrow{D} \chi^2_{f_c,\,G_c}$, $c = b,\dots,k$,
where
(5.6)  $G_c = \delta_c'\,[I - B(B'B)^{-1}B']\,\delta_c$; $\delta_c\ (R_c \times 1) = \{(Q_{ij}/\pi^0_{ij,t})^{1/2}\,\delta_{ij,t}\}$.
In the above expressions for $\delta_c$, $\pi^*_{ij,c_j+1} = \pi_{ij,c_j+1}$ and $\delta^*_{ij,c_j+1} = \delta_{ij,c_j+1}$. These have not been distinguished in order to avoid clumsiness.
Let $A^*$ be the event that $\hat X_c^2 > w_\alpha$ for some $c$; $b \le c \le k$. Then the power $P(A^*)$ is given by
(5.7)  $P(A^*) = P\{\hat X_c^2 > w_\alpha$ for some $c$: $b \le c \le k \mid H_1^{(n)}\} = 1 - \beta$,
where $\beta$ is the error of type II and $H_1^{(n)} = \bigcap_{c=b}^{k} H^{(n)}_{1,c}$. As such, by this testing procedure both the type I and type II errors remain the same as those of the overall test. Now, $A^*$ is the union of the following disjoint events:
(5.8)  $A_b = \{\hat X_b^2 > w_\alpha\}$; $A_c = \{\hat X_{c-1}^2 \le w_\alpha,\ \hat X_c^2 > w_\alpha\}$, $b < c \le k$.
Let $T_e$ be the time-point at which the experiment is terminated. The event $T_e = T_c$ is the event $A_c$. Therefore, the expected stopping time with this test is
(5.9)  $\tau_A = E(T_e) = \sum_{c=b}^{k} T_c\,P(A_c) + T\,(1 - P(A^*))$.
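Formula (5.9) is a plain weighted sum once the stopping probabilities are known. The sketch below assumes those probabilities are supplied (for example from simulation); the function name and the numbers are hypothetical and are not results from the paper.

```python
def expected_stopping_time(times, stop_probs):
    """Expected stopping time as in (5.9).

    times:      time-points T_b, ..., T_k (the last one is T).
    stop_probs: P(A_b), ..., P(A_k), the disjoint stopping probabilities.
    The remaining mass 1 - P(A*) is placed at the final time-point T.
    """
    p_star = sum(stop_probs)
    early = sum(t * p for t, p in zip(times, stop_probs))
    return early + times[-1] * (1.0 - p_star)


# Hypothetical four-stage design observed at 6, 12, 18 and 24 months.
times = [6.0, 12.0, 18.0, 24.0]
stop_probs = [0.1, 0.2, 0.3, 0.2]
tau = expected_stopping_time(times, stop_probs)
```

Here the early-stopping part contributes 13.2 months and the no-rejection mass of 0.2 contributes another 4.8, giving 18 months in total.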
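The probabilities of the constituent (disjoint) events of $A^*$ are given by:
(5.10)  $P(A_b) \to \sum_{u=0}^{\infty} \dfrac{e^{-\lambda_b}\lambda_b^{u}}{u!} \int_{w_\alpha}^{\infty} g_{2u+d_b}(x)\,dx$,
where $\lambda_b = \tfrac{1}{2} G_b$ and $g_{2u+d_b}(x)$ is the density of the central chi-square distribution with $2u + d_b$ d.f. For $c > b$,
(5.11)  $P(A_c) = P\{\hat X_{c-1}^2 \le w_\alpha,\ \hat X_c^2 > w_\alpha\}$.
By Searle (1971, pp. 60-61), $\hat X_{c-1}^2$, which is asymptotically distributed as $\chi^2_{f_{c-1},\,G_{c-1}}$, is asymptotically independent of $D_c$, which is asymptotically distributed as $\chi^2_{d_c}(G_c - G_{c-1}) = \chi^2_{d_c,\,G_c^*}$. If now $\hat X_{c-1}^2 = x \le w_\alpha$, then $D_c > w_\alpha - x \Leftrightarrow \hat X_c^2 > w_\alpha$. As such,
(5.12)  $P(A_c) \to \sum_{u=0}^{\infty} \sum_{v=0}^{\infty} \int_{0}^{w_\alpha} \dfrac{e^{-\lambda_{c-1}}\lambda_{c-1}^{u}}{u!}\, g_{2u+f_{c-1}}(x) \left[ \int_{w_\alpha - x}^{\infty} \dfrac{e^{-\lambda_c^*}(\lambda_c^*)^{v}}{v!}\, g_{2v+d_c}(z)\,dz \right] dx$,
where $\lambda_c = \tfrac{1}{2} G_c$ and $\lambda_c^* = \lambda_c - \lambda_{c-1}$.
5.3. Asymptotic distribution theory of $D_c$.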
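It has already been proved in section 5.2 that under local alternatives
(5.13)  $D_c \xrightarrow{D} \chi^2_{d_c,\,G_c^*}$, $c = b,\dots,k$.
Letting $\gamma_c = \chi^2_{d_c,\alpha'}$, we have
(5.14)  $1 - \text{power} = P\{D_c \le \gamma_c,\ \forall\ b \le c \le k \mid H_1^{(n)}\} \to \prod_{c=b}^{k} P\{D_c \le \gamma_c \mid H^{(n)}_{1,c}\} \to \prod_{p=0}^{k-b} \left[ \sum_{u_p=0}^{\infty} \int_{0}^{\gamma_{p+b}} \dfrac{e^{-\lambda^*_{p+b}}(\lambda^*_{p+b})^{u_p}}{u_p!}\, g_{2u_p+d_{p+b}}(x)\,dx \right]$.
Let $B_c$ be the event that the experiment terminates at the censoring stage $c$. Then
(5.15)  $B_c = \{D_b \le \gamma_b,\ \dots,\ D_{c-1} \le \gamma_{c-1},\ D_c > \gamma_c\}$, $b \le c \le k$.
These are the mutually exclusive subevents of $B^*$, which is the event that $D_c > \gamma_c$ for some $c$. We compute the respective probabilities as
(5.16)  $P(B_b) \to \sum_{u=0}^{\infty} \int_{\gamma_b}^{\infty} \dfrac{e^{-\lambda_b^*}(\lambda_b^*)^{u}}{u!}\, g_{2u+d_b}(x)\,dx$;
$P(B_c) \to \prod_{p=0}^{c-1-b} \left[ \sum_{u_p=0}^{\infty} \int_{0}^{\gamma_{p+b}} \dfrac{e^{-\lambda^*_{p+b}}(\lambda^*_{p+b})^{u_p}}{u_p!}\, g_{2u_p+d_{p+b}}(x)\,dx \right] \sum_{u=0}^{\infty} \int_{\gamma_c}^{\infty} \dfrac{e^{-\lambda_c^*}(\lambda_c^*)^{u}}{u!}\, g_{2u+d_c}(x)\,dx$, $c > b$.
From the above we obtain
$\tau_B = \sum_{c=b}^{k} T_c\,P(B_c) + T\,(1 - P(B^*))$.
5.4. Asymptotic performance properties of Type C test.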
Define similarly non-centrality parameters as in section 5.2 but for individual batches and index them by $j$. As for example, we denote by ${}_jG_c$ the non-centrality parameter associated with the asymptotic chi-square distribution of the test statistic ${}_j\hat X_c^2$ for the $j$-th batch at the time-point $T_c$, and ${}_j\lambda_c = \tfrac{1}{2}\,{}_jG_c$. Note that both ${}_jH_{0,c}$ and ${}_jH^{(n)}_{1,c}$ are monotonic and, as in (4.19) and (4.20), we conceive of ${}_jH_1^{(n)}$ and $H_1^{(n)}$ under local alternatives as
(5.17)  ${}_jH_1^{(n)} = \bigcap_{c=b_0+h_j}^{k} {}_jH^{(n)}_{1,c}$; $H_1^{(n)} = \bigcap_{j=1}^{\ell} {}_jH_1^{(n)}$.
The power of the test is expressed as
(5.18)  $1 - \text{power} = P\{{}_j\hat X_c^2 \le {}_jw_{\alpha''},\ \forall\ c$: $b_0 + h_j \le c \le k$ and $\forall\ j = 1,\dots,\ell \mid H_1^{(n)}\} = \prod_{j=1}^{\ell} P\{{}_j\hat X_k^2 \le {}_jw_{\alpha''} \mid {}_jH_1^{(n)}\} \to \prod_{j=1}^{\ell} \left[ \sum_{u=0}^{\infty} \int_{0}^{{}_jw_{\alpha''}} \dfrac{e^{-{}_j\lambda_k}({}_j\lambda_k)^{u}}{u!}\, g_{2u+{}_jf_k}(x)\,dx \right]$.
The event $E^*$ that ${}_j\hat X_c^2 > {}_jw_{\alpha''}$ for some $c$: $b_0 + h_j \le c \le k$ and $j = 1,\dots,\ell$ consists of the following mutually exclusive events:
(5.19)  $E_c = \{{}_j\hat X_{c-1}^2 \le {}_jw_{\alpha''}$ for all $1 \le j \le \ell_{c-1}$ and ${}_j\hat X_c^2 > {}_jw_{\alpha''}$ for some $j = 1,\dots,\ell_c\}$, $b_0 + h_j < c \le k$.
Note that there are only $\ell_c$ $j$'s which satisfy $b_0 + h_j \le c \le k$. The corresponding probabilities are given by
- A
1 b o ( \ )u
co Joo
e
Il1.b
(5 . 20) P (E ) = ~
0
(,2
d (x)dx
b
u+ l b
o
u-o lWa "
u!
o
(~.(. co
P(E c ) = TIJ'=l
u.L=o
J
J.w"t-jAC-l(.A
~
a
U~! c-
l)u j
]
(,2u.+.f
(x)dx
J ] c-l
\*
-.II.
I
v.=O J
J
(
,\*)
v.
J
Jo,wa"-x e J C j c
v. !
J
CZv .+. d
J J c
(Z)dZ~]
-l
The expected stopping time $\tau_C = E(T_e)$ is then
(5.21)  $\tau_C = \sum_{c=b_0}^{k} T_c\,P(E_c) + T\,(1 - P(E^*))$.
5.5. Asymptotic performance properties of Type D test.
Let
(5.22)  ${}_j\gamma_c = \chi^2_{{}_jd_c,\,{}_j\alpha'''}$.
Then
(5.23)  $1 - \text{power} = P\{{}_jD_c \le {}_j\gamma_c,\ \forall\ j = 1,\dots,\ell$ and $\forall\ c$: $b_0 + h_j \le c \le k \mid H_1^{(n)}\} \to \prod_{j=1}^{\ell} \prod_{c=b_0+h_j}^{k} P\{{}_jD_c \le {}_j\gamma_c \mid {}_jH^{(n)}_{1,c}\} \to \prod_{j=1}^{\ell} \prod_{c=b_0+h_j}^{k} \left[ \sum_{{}_ju_c=0}^{\infty} \int_{0}^{{}_j\gamma_c} \dfrac{e^{-{}_j\lambda_c^*}({}_j\lambda_c^*)^{{}_ju_c}}{{}_ju_c!}\, g_{2\,{}_ju_c+{}_jd_c}(x)\,dx \right]$.
The event that the experiment terminates at some time-point $T_c$ consists of the mutually exclusive events
(5.24)  $F_c = \{{}_jD_{c-1} \le {}_j\gamma_{c-1}$ for all $1 \le j \le \ell_{c-1}$, ${}_jD_c > {}_j\gamma_c$ for some $j = 1,\dots,\ell_c\}$.
The expressions for the associated probabilities are given by
00
(5.25)
P (F
I
) +
bO
S2
u=O
d (x)dx
u+ l b
o
- A*
i v+b +h.
l
P(F )
c
c-l
c-l-b -h.
0
]:I 16
J
r
u
e
00
.
0
L=o
u
J ( A*
j v+b +h
O j
) v
u !
v
v
S2u +.d
(X)dX)
v J v+bO+h j
-.A*
u.
e J C(oA*) J
____~J__c
S2u.+.d (x)dx
u. !
J J c
J
The expected stopping time $\tau_D$ can now be computed as before.
6. CONCLUDING REMARKS
For grouped data, Ghosh (1973) has extended the results of Sen (1967) in deriving a class of conditionally distribution-free rank order tests having some asymptotically optimal properties. It is possible to extend these test procedures to our batch arrival model under PCS. However, this will need certain weak convergence results to multi-dimensional Brownian motion processes as well as a (crude) inequality involving the actual and an upper bound for a scalar constant appearing in the test statistic. The details will be provided in a separate communication.
The proposed test procedure can also be extended to the multi-dimensional categorical data situation, where the main response is time-sequential and at each time-point of censoring, responses with regard to the other categories are complete. Following Bahadur (1961), if it is possible to derive an approximate lower dimensional representation of the joint probability function, we can start censoring at an earlier stage for purposes of tests under PCS.
7.
7.1.
NUMERICAL ILLUSTRATION
Descripti?n of the model and underlying distribution.
For purposes
of numerical illustration, we have considered a BAM with two samples and
two batches.
For the sake of clarification, the two samples may be con-
ceived of as control and treatment groups in a contraceptive effectiveness
study with the subjects entering into the experiment in batches.
The
response categories are conception times recorded in one-month intervals
starting from the time-points of the individuals' entry into the trial.
From time and cost considerations, the study may not be continued beyond
a time-point
T.
Allowing for seasonal variation in reproductive beha-
vior, the batches within the same sample may be assumed to differ in
respect of conception time distribution.
Keeping this in view, two
samples of 700 each were generated from exponential distributions with
mean survival times
1..
2
:=
18.0
(i)
months and
'\:=
1..
(iii) Al
2
:=
:=
16.0
16.0
months,
months,
(ii) Al
1..
2
:=
:=
16.0
months,
20.0 months.
The size of the first batch within each sample was 500; the remaining 200
individuals belonged to the second batch. It was assumed that the
experiment was observed for a period of 24 months since inception. With the
further assumption that the second batch entered into the study one year
after the first, the failures in the two batches were recorded in 24
month-intervals (categories) for the first batch and in 12 month-intervals
for the second batch. The last categories of the two batches (25th and 13th
respectively) included the censored individuals.
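The data layout described above can be illustrated with a minimal sketch. It is not the program used for the study: the function names, the seed, and the choice to give both batches of a sample the same exponential mean are assumptions made only for illustration.

```python
import random

def simulate_batch(n, mean_months, follow_up_months, rng):
    """Count failures per one-month interval; the last cell holds censored subjects."""
    counts = [0] * (follow_up_months + 1)
    for _ in range(n):
        # expovariate takes the rate, i.e. 1 / mean survival time
        t = rng.expovariate(1.0 / mean_months)
        if t >= follow_up_months:
            counts[follow_up_months] += 1   # still event-free at end of follow-up
        else:
            counts[int(t)] += 1             # interval [k, k+1) goes to index k
    return counts

def simulate_sample(mean_months, rng):
    """One sample of 700: batch 1 (500, 24 months), batch 2 (200, 12 months)."""
    first = simulate_batch(500, mean_months, 24, rng)
    second = simulate_batch(200, mean_months, 12, rng)
    return first, second

rng = random.Random(1976)               # hypothetical seed, for reproducibility
control = simulate_sample(16.0, rng)    # lambda_1 = 16.0 months
treatment = simulate_sample(20.0, rng)  # lambda_2 = 20.0 months (case iii)

print(len(control[0]), len(control[1]))  # 25 and 13 categories
```

Each batch is returned as a vector of cell counts, so the first batch has 25 categories and the second has 13, the last cell in each being the censored one.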
7.2. Computation of the statistics. In order to derive the test statistics
A, B, C, and D, we started with an unspecified model, allowing for inter-batch
difference in probability structure. Under the null hypothesis of no
difference between the survival time distribution of the two samples,
$\hat{\chi}^2_c$ and ${}_j\hat{\chi}^2_c$ were computed by plugging in the
maximum likelihood estimators $\hat{\pi}_{ij,t}$, derived separately from
the two batches, in (3.2). It is important to note that in this unspecified
model situation, both $\{\hat{\chi}^2_c\}$ and $\{{}_j\hat{\chi}^2_c\}$ form
strictly non-decreasing sequences in $c$ by virtue of Lemma 4.1.
The asymptotic powers and mean stopping times for the four tests were
derived empirically by repeating the experiment 500 times.
The results
are summarized in the next section.
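As a rough illustration of how such empirical quantities can be tabulated, the sketch below takes, for each repetition, the stage at which a test first rejects. The function names and the numbers in the example are hypothetical. Counting a non-rejecting repetition at the full 24 months is an assumption, although it agrees with the near-24 mean stopping times reported under the null case.

```python
def first_rejection_stage(stat_path, critical_values):
    """Return the first stage (1-based) at which the statistic exceeds its
    critical value, or None if the test never rejects."""
    for stage, (stat, crit) in enumerate(zip(stat_path, critical_values), start=1):
        if stat > crit:
            return stage
    return None

def summarize(stopping_stages, max_stage):
    """Empirical power and mean stopping time over repeated experiments.

    Assumption: a repetition that never rejects runs to max_stage."""
    n = len(stopping_stages)
    power = sum(s is not None for s in stopping_stages) / n
    mean_stop = sum(max_stage if s is None else s for s in stopping_stages) / n
    return power, mean_stop

# Hypothetical example: four repetitions, two of which reject.
print(summarize([None, 10, None, 24], max_stage=24))  # (0.5, 20.5)
```

In the study itself the list would hold 500 entries per test and significance level, one per repetition.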
7.3. Empirical powers and mean stopping times. By the formulae of section 4,
we can at once derive the degrees of freedom associated with the various
tests. The degrees of freedom of the overall test are 36. We similarly
obtain ${}_1f_{24} = 24$, ${}_2f_{12} = 12$; $d_c = 1$, $c = 1, \ldots, 12$;
$d_c = 2$, $c = 13, \ldots, 24$; and ${}_jd_c = 1$, $\forall c$ and
$\forall j$. Next, the test statistics were compared against the respective
percentile points of the (central) chi-square distributions, in order to
compute the empirical powers and empirical mean stopping times associated
with the four tests. The results from the 500 repetitions are shown in the
table below.
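The degrees-of-freedom bookkeeping can be checked with a few lines of arithmetic. The sketch reads $d_c$ as the contribution of the $c$-th month and assumes that $d_c = 2$ applies from the 13th month on, when both batches are under observation. That reading is an interpretation adopted here, chosen because it makes the per-month and per-batch totals agree with the overall 36.

```python
# Per-month degrees of freedom d_c (assumed reading: 1 while only the first
# batch is under observation, 2 once the second batch has entered).
d = {c: (1 if c <= 12 else 2) for c in range(1, 25)}

f_batch1 = 24   # first batch: 24 month-intervals
f_batch2 = 12   # second batch: 12 month-intervals

total_from_months = sum(d.values())
total_from_batches = f_batch1 + f_batch2

print(total_from_months, total_from_batches)  # 36 36
```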
TABLE 1

                                 Tests and Significance Levels
                            A              B              C              D
CHARACTERISTICS*       α=.10  α=.05   α=.10  α=.05   α=.10  α=.05   α=.10  α=.05

Case 1
  Power                 .09    .05     .09    .03     .10    .05     .08    .03
  Mean stopping time   23.87  23.97   22.90  23.56   23.79  23.91   23.27  23.73
Case 2
  Power                 .18    .10     .12    .08     .17    .09     .12    .07
  Mean stopping time   23.66  23.84   22.36  22.89   23.63  23.82   22.65  23.21
Case 3
  Power                 .48    .35     .28    .18     .41    .28     .24    .14
  Mean stopping time   22.56  23.09   20.14  21.47   22.38  23.01   21.01  22.27

*Case 1: $\lambda_1 = 16.0$, $\lambda_2 = 16.0$
 Case 2: $\lambda_1 = 16.0$, $\lambda_2 = 18.0$
 Case 3: $\lambda_1 = 16.0$, $\lambda_2 = 20.0$
From the above the following salient points emerge:
(i) All the tests are consistent.
(ii) Type A is the most powerful test; next comes Type C; D is the least
powerful of all.
(iii) If we assume that each extra time interval costs s units more, then
the expected cost for each test (apart from overhead cost) is
s x expected stopping time. From this consideration, Type B is the most
economical test; D, C, A come in this order so far as this index of
efficiency is concerned.
(iv) The efficacy of the PCS test procedures becomes more and more
pronounced as the difference between the two samples gets larger
and larger.
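The cost comparison in point (iii) can be reproduced from the tabulated mean stopping times. The sketch below uses the Case 3 values at the .05 level, with the columns read in the order A, B, C, D; the unit cost s = 1.0 is a hypothetical value, and it does not affect the ordering.

```python
# Mean stopping times (months), Case 3, significance level .05, from Table 1.
mean_stopping_time = {"A": 23.09, "B": 21.47, "C": 23.01, "D": 22.27}

s = 1.0  # hypothetical cost per extra one-month interval

# Expected cost apart from overhead: s x expected stopping time.
expected_cost = {test: s * t for test, t in mean_stopping_time.items()}

# Tests ordered from most to least economical.
ranking = sorted(expected_cost, key=expected_cost.get)
print(ranking)  # ['B', 'D', 'C', 'A']
```

The same ordering B, D, C, A holds for every case and level in the table.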
7.4. Additional comments. Note that we could have derived an overall
chi-square statistic with 48 degrees of freedom for the Type A test, by
imposing the restriction on the model that the two batches within the same
sample have the same probability structure. Type C and Type D tests are
more flexible in the sense that we need not know beforehand the time-points
of entry of different batches, as we must in Type A and Type B tests. On
the other hand, Type C and Type D tests do not utilize this inter-block
structure. Among all the tests, only the Type A test preserves the power of
the overall test. In all the other tests power is affected, in fact
reduced, since we increase the critical level of the overall test.
ACKNOWLEDGEMENT
Thanks are due to Mr. Agam N. Sinha for his invaluable programming
assistance relating to the numerical studies.
REFERENCES
[1] MANTEL, N. (1966). Evaluation of survival data and two new rank order
statistics arising in its consideration. Cancer Chemotherapy Reports 50,
163-170.
[2] NEYMAN, J. (1949). Contribution to the theory of the $\chi^2$ test.
Proceedings of the Berkeley Symposium on Mathematical Statistics and
Probability. University of California Press, Berkeley, 239-273.
[3] CRAMER, H. (1946). Mathematical Methods of Statistics. Princeton
University Press, Princeton.
[4] BIRCH, M.W. (1964). A new proof of the Fisher-Pearson theorem. Ann.
Math. Statist. 35, 817-824.
[5] MANTEL, N. and HAENSZEL, W. (1959). Statistical aspects of the analysis
of data from retrospective studies of disease. J. Nat. Cancer Inst. 22,
719-748.
[6] MANTEL, N. (1963). Chi-square tests with one degree of freedom:
extensions of the Mantel-Haenszel procedure. J. Am. Statist. Assoc. 58,
690-700.
[7] SEARLE, S.R. (1971). Linear Models. Wiley, New York.
[8] DIAMOND, E.L. (1963). The limiting power of categorical data chi-square
tests analogous to normal analysis of variance. Ann. Math. Statist. 34,
1432-1441.
[9] MITRA, S.K. (1958). On the limiting power function of the frequency
chi-square test. Ann. Math. Statist. 29, 1221-1233.
[10] GHOSH, M. (1973). On a class of asymptotically optimal nonparametric
tests for grouped data I. Ann. Inst. Statist. Math. 25, 91-108.
[11] SEN, P.K. (1967). Asymptotically most powerful rank order tests for
grouped data. Ann. Math. Statist. 38, 1229-1239.
[12] BAHADUR, R.R. (1961). A representation of the joint distribution of
responses to n dichotomous items. Studies in Item Analysis and Prediction,
Ed. by Herbert Solomon. Stanford University Press, Stanford, California,
158-168.