GENERALIZED RANK TESTS FOR PROGRESSIVE CENSORING PROCEDURES
by
Hiranmay Majumdar
Department of Biostatistics
University of North Carolina at Chapel Hil]
Institute of Statistics Mimeo Series No. 1081
July 1976
MAJUMDAR, HIRANMAY.
Procedures.
Generalized Rank Tests for Progressive Censoring
(Under the direction of PRANAB KUMAR SEN.)
In clinical trials and life testing problems the responses are
time-sequential in nature and a
compl~t~
experimentation may involve
a prohibitive amount of time and cost •.• Fo't "this reason, progressive·
censoring schemes (PCS) are often advocated with a view to terminating
the experimentation at the earliest possible stage if the accumulated
statistical evidence warrants to do so.
In this dissertation we pro-
pose some progressively censored tests for grouped and ungrouped data
and study their asymptotic properties.
For batch-arrival models (BAM) relating to categorical data
under time-&equential studies, we develop four PCS chi-square type
statistics.
The necessary (asymptotic) distribution theory is derived
for the null as well as local alternative hypotheses situations, in
each case.
For grouped and ungrouped data from"time-sequential experiments,
we propose a general class of progressively censored rank order tests
for the multiple regression model and
through the formation of suitable
s~~~Yn~ir
sampl~
asymptotic properties
processes.
}or grouped data
we develop, in addition, PCS rank order tests for simple regression
model.
We show that, under suitable regularity conditions, the dis-
tributions of the proposed statistics converge weakly to those of some
functionals of the standard Brownian motion process under the null
hypothesis and to those of the drifted Brownian motion process under
contiguous alternatives.
As the available statistical tools do not
provide us with the necessary analytical expressions for the exact null
distribution of these processes, we derive their empirical distributions
through simulation.
Two numerical examples using simulated data from exponential
populations are given, illustrating the PCS chi-square type tests and
PCS rank order tests when the responses are recorded on an ordered
categorical set, ina two-sample problem.
We empirically show that,
for the negative exponential survival time distributions one often
encounters in life testing problems, the rank order tests are much
superior to the chi-square type tests from both power and mean stopping
time considerations.
Finally extensions of the proposed tests in three
directions are suggested as topics for further research.
ACKNOWLEDGMENTS
It is a pleasure to express my sincere appreciation to my
adviser, Dr. P. K. Sen, for his patience, continued guidance, and
advice during the course of this research.
I am also indebted to the
other committee members, Drs. R. R. Kuebler, J. E. Grizzle, Dana E. A.
Quade, and H. B. Wells for their helpful comments and suggestions.
I especially thank Dr. J. E. Grizzle for his encouragement and keen
interest in my work.
I take this occasion to express my debt of gratitude to late
Professor Mindel C. Sheps who was primarily responsible for my return
to Chapel Hlll
af~er
a lapse of four years.
The financial support for this research and graduate study was
through a graduate research assistantship from the Department of
Biostatistics.
I wish to place on records my appreciation for the
opportunities this support has provided.
I am thankful to the Government of India for granting me a study
leave for my academic pursuits.
I am grateful to Mr. Agam Nath Sinha for his
inva~uable
pro-
gramming assistance in the numerical investigation.
Finally, I wish to thank Mrs. Gay Hinnant for typing the manuscript with great skill, patience, and speed.
ii
CONTENTS
ACKNOWLEDGMENTS . . . • • • • . .
I.
INTRODUCTION AND LITERATURE REVIEW
1.1
1.2
1.3
1.4
1.5
II.
Description of Various Types of Censoring
and Tr\1.ncation
Literature Review.
Testing Procedures under Progressive
Censoring Schemes.
Analysis with Concomitant Variables.
Outline of Research Proposal .•
.
·····
·
·····..·
....····
..·
····
..
····
··
····
·
····
1
3
10
15
17
CHI-SQUARE TESTS FOR GENERAL MODELS UNDER PROGRESSIVE
CENSORING WITH BATCH ARRIVALS
2.1
2.2
2.3
2.4
2.5
2.6
III.
ii
Introduction • • . . • . . • . • . • • . • • •
Formulation of the Problem . . • • . •
Chi-Square Type Tests for the General
Model under PCS • . • . • • . • • • • • • • •
Asymptotic Distribution Theory under HO'
Asymptotic Distribution Theory of the
Proposed Test Statistics under Local
Alternatives and Power Considerations • . • • .
Concluding Remarks • . . • • . • . • • .
21
22
26
31
41
49
RANK ORDER TESTS FOR GROUPED DATA UNDER PROGRESSIVE
CENSORING: SIMPLE AND MULTIPLE REGRESSIONS
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
3.9
Introduction • . • • • • • •
Models and Assumptions . .
Formul~tion of the Problem .
Tests under pes
.
Some Preliminary Results .
• • • . • • • •
Asymptotic Distribution Theory of the Test
Statistics under HO: Simple Regression.
Asymptotic Distribution Theory of the Test
Statistics under Contiguous Alternatives:
Simple Regression. • • • • . . • • . . . .
Asymptotic Distribution Theory of the Test
Statistics uuder H : Multiple R.egression. .
O Theory of the Test
Asymptotic Distribut~on
Statistics under Contiguous Alternatives:
Multiple Regression. • . • . • • . . . . • • • . •
iii
50
51
53
56
64
71
82
85
88
iv
IV.
NONPARAMETRIC TESTING UNDER PROGRESSIVE CENSORING
MULTIPLE REGRESSION
4.1
4.2
4.3
4.4
4.5
V.
92
92
96
100
106
SIMULATED PERCENTILE POINTS OF THE NULL DISTRIBUTIONS
OF SOME FUNCTIONALS OF STANDARD BROWNIAN MOTION
PROCESS. NUMERICAL ILLUSTRATIONS OF TESTS
5.1
5.2
5.3
5.4
5.5
VI.
Introduction • . . . . • . . . . • • • • •
Model. Assumptions and Basic Formulation
Progressive Censoring Tests.
Asymptotic Distribution Theory of Hr under HO• .
Asymptotic Distribution Theory of H
under Contiguous Alternatives . . :
·····..········.
Introduction
Generation of Brownian Motion Process and
Derivation of the Distributions of some
Functionals of this Process through
Simulation
Numerical Illustration for Chi-Square Tests
under PCS.
Numerical Illustration for Two-Sample Rank
Order Tests with Grouped Data under PCS.
Comparison between the PCS Chi-Square Tests
and PCS Rank Order Tests with Grouped Data
·····
·····
....
········
········
109
110
119
122
125
FOR FURTHER RESEARCH AND SCOPE
FOR APPLICATIONS
SUGG~~TION~
6.1
6.2
6.3
6.4
Introduction.
••...•.• • . • . . • .
Extension of Rank Order PCS Tests to BAM • • •
Some Multivariate Extensions of the PCS Tests.
Scope for Applications • . . . . . . . • . . . .
126
126
128
131
APPENDIX. ON DERIVATION OF EMPIRICAL DISTRIBUTIONS OF SOME
FUNCTIONALS OF THE STANDARD WIENER PROCESS
A.l
Program for Derivation of Empirical Distributions
of Sup (t-~(t» and Sup (t-~IW(t) I) ....
E<t<l
E<t<l
BIBLIOGRAPHY ••
133
137
CHAPTER I
INTRODUCTION AND LITERATURE REVIEW
1.1. Description of Various Types of
Censoring and Truncation
In experiments involving life testing and clinical trials,
observations are made sequentially in time; consequently, they are
automatically collected in order of size.
From cost and time con-
siderations, it is impractical to continue such
exp~riments
responses for the entire group of individuals are obtained.
until
In such
a situation, the experimenter wants either to terminate the experiment
when a specified number of individuals respond or to continue it over
a specified duration.
The two are known as censoring and truncation
procedures respectively.
In the former case, time of termination of
the experiment is random whereas, in the latter, the number of censored
subjects is random.
Since number of respondents O~ termination date of
the experiment is decided beforehand, we term the two procedures fixedplan censoring and truncation respectively.
Life testing is defined
here to include broad aspects of life tables and contraceptive failures
one encounters in demographic studies.
When the individuals enter the experiment at the same pQint of time,
we have what is known as single-point entry as against mu1tiple-point
entry when the subjects enter the study serially between the comm.encement and end of the experiment consistent with the design which is' laid
_
2
down beforehand.
Gehan (1965b) also discusses how, in actual experi-
mental situations, doubly-truncated observations are encountered in
contrast to those censored/truncated on the right only.
In what
follows, discussions will be mostly restricted to censoring/truncation
procedures with single-point entry.
In problems we are concerned with,
fixed~plan
truncation or
censoring has obvious disadvantages from cost and efficiency considerations.
An early truncation may result in increasing risk of incorrect
decision whereas unnecessary continuation of the experiment till the
end may lead to substantial cost and
the gain in efficiency.
los~
of time incommensurate with
Since the experiment is observed continuously
over time, we can exploit this fact for devising a procedure which is
pseudo-sequential in nature.
accumulateu
info~'mation
This procedure enables us to review the
as the experiment progresses and terminate it
as soon as a statistically valid decision is reached.
In other words,
the trial is terminated only if its continuation cannot result in an
outcome suggesting an inference different from the one currently indicated by the accumulated evidence of the experiment.
In contrast to
classical sequential testing designs, the progressive censoring procedures put a restriction on the maximum number of observations to be
made in the experiment.
Armitage (1957) terms such procedures
restricted sequential procedures.
the two procedures are:
Other points of difference between
(a) Whereas the former deals with independent
random variables, the latter, by design, concerns itself with ordered
observations, and (b) the former is sequential in observations as
against the latter being sequential in time.
3
1.2. Literature Review
1.2.1. Nonparametric Methods and Fixed-Plan
Truncation/Censoring Schemes
Halperin (1960) has considered extension of the Wilcoxon-MannWhitney test for samples truncated at the same fixed point.
The null
hypothesis tested is that the two samples are from the same population.
Tables of lower than 5 per cent and 1 per cent points of the test
statistic are given for n < 8 and various degrees of censoring.
The
asymptotic distribution of the test statistic under the null hypothesis
is shown to be normal.
The asymptotic theory is adequate outside the
range of the table providing no more than 75
observations of the two samples are censored.
p~r
cent of the total
The test statistic is
constructed as indicated below.
There are two samples of size n on F and m on G, where F and
G are (continuous) cumulatives. The observations from Fare
denoted by X and those from G by Y. Both samples are truncated at the same fixed point, say T. The number of X's
censored is denoted by r n and that of y's by r m, and we let
r = r m + r • The null hypothesis to be tested is taken to be
F(x)
G(x), - 00 < n < T, against the alternative F(x) > G(x)
for all x. The test-statistic Uc is counted as the number of
times a Y precedes an X among the n + m - r n - rmuncensored
observations plus rn(m-rm), the total number of times uncensored
y's precede a censored X.
=
The consistency of the test-statistic has been established as follows:
In the uncensored case, the consistency of the U test follows
by simple direct application of the Tchebycheff inequality to
show that, as m,n + 00, the probability that U falls in the
critical region of the test when the alternative hypothesis is
true approaches unity as required. Now turning to the behavior
of Uc when F(x)j G(x), - 00 < x ~ T, and, in particular, assuming
that :~~~ > ~~~~, - 00 < x < T, and F(T) > G(T), we consider that
the joint sampling distribution of the m observations on G and n
observations on F is no longer that of (m+n) independent random
variables. This lack of independence yields an additional positive
contribution to the variance of Uc which can be shown to be of a
magnitude not affecting the consistency argument.
4
Gehan (1965a) has proposed a distribution-free two-sample test
which is an extension of the Wilcoxon test to samples with arbitrary
truncation on the right.
observations.
where F
l
The test is conditional on the pattern of
The null hypothesis is:
H :
O
Fl(t) = FZ(t), t .::. T, against either
HI:
Fl(t) < FZ(t), t'::' T, or
HZ:
F (t) < FZ(t) or Fl(t) > FZ(t), t .::. T,
l
and F
Z
(1.Z.l)
are cumulative distributions (discrete or continuous)
of the observations and T is their argument's upper limit.
The test is
shown to be asymptotically normal and consistent against both onesided and two-sided alternatives.
The asymptotic efficiency of the
test relative to the efficient parametric test when the distributions
are exponential ...s at least 0.75 and increases with the degree of
truncation.
When H is true, the test is not seriously affected by
O
real differences in the percentage truncated in the two groups.
Some
comparisons are made for five cases of varying degrees of truncation
and tying between probabilities from the exact test and those from the
proposed test, and these suggest that the test is appropriate under
certain conditions when the sample size is five in each group.
Gehan (1965b) has also considered the same statistic (with
appropriate modification) when the data are doubly-truncated with
multiple-point entry.
The test is again based on permutation principle.
The asymptotic normality and consistency of the test follow directly
from the results derived earlier by Gehan (1965a).
A generalization of the Kruskal-Wallis test, which extends Gehan's
(1965a) generalization of Wilcoxon's test, has been proposed by Breslow
5
(1970) for testing the equality of k continuous
distributionfunction~
when observations are arbitrarily truncated on the right.
The distribu-
tion of the truncation variables is allowed to differ for different
populations.
An alternative statistic is proposed for use when the
truncating distributions may be equal.
chi-square
These statistics have asymptotic
distributions under the respective null hypotheses, whether
the truncation variables are regarded as random or as fixed numbers.
Basu (1967a).has derived the large-sample properties of a
genera1~
ized Wi1coxon-Mann-Whitney statistic when in the ordered sequence of N
observations of the combined sample the first r observations only are
used to construct the statistic, i.e., there is a right-censored sample
of size at most r.
There are two independent samples of sizesm and n,
respectively, from two populations with continuous c.d.f.'s F(x) and
To test the equality of the two populations the statistic V~,n
G(y).
has been proposed, based on the first r ordered observations, where
vu,n is given by
r
= V(N) =
r
r
I
i=l
(um.l. - mn.),
l.
(1.2.2)
and m. and n. are the numbers of x and y failures, respectively, among
l.
l.
the first i ordered observations of the combined sample, so that
m. + n.
l.
l.
= i.
statistic.
This is shown to be related to the Wi1coxon-Mann-Whitney
The asymptotic normality of the test statistic in the null
and non-null cases (under local alternatives) has been established.
The test also has been shown to be consistent.
The performance of the
test has been compared with other asymptotically most powerful rank
tests for censored sampling.
6
In order to test homogeneity of k samples for right-censored data,
with only r out of N ordered observations from the combined sample being
considered, Basu (1967b) has proposed two statistics--one a generalized
version. of Kruskal's H-statistic for unordered alternatives, and the
other a generalization of the Jonckheere statistic with ordered alternatives.
Asymptotic chi-square distribution of the former and normality
of the latter, in both null and non-null cases, have been established.
It is evident from the results of Gehan (l965a, 1965b), Breslow
(1970), and Basu (1967a and 1967b) that tests based on similar statistics under fixed-plan truncation and censoring are asymptotically
equivalent.
Mantel and Haenszel (1959) in a review of the statistical problems
of retrospective studies of disease have provided a method for analyzing
multiple 2x2 contingency tables.
Consider a random sample of diseased
and disease-free individuals classified according to various control
factors in which the distribution of the factor under study within the
i-th classification may be represented as follows:
With factor
Free of factor
Total
With disease
A.
B.
1
N
li
Free of disease
C.
D
i
N
2i
Total
M
li
M
T
i
1
1
2i
Under the null hypothesis of no association and assuming fixed marginal
totals, E(A.)
= (Nl.Ml.)/T.1 and V(A.)
111
1
= (Nl.N2.Ml.M2.)/[T:(Ti-l)].
The
1
11· 1
1
corrected chi-square with one degree of freedom is
(IA. - E(A.)I - ~)2/V(A.),
1 1 1
(1.2.1)
7
which reduces to
.
( IA.D..
- B.C.
1 1
1 1
I-
~.) 2 (T.-l)/(N
1
1
1
.N .M .M .).
l 1 21 ·1 1 21
(1.2.4)
A summary chi-square with one degree of freedom over all the subclassifications is given by
(1.2.5)
Mantel and Haenszel (1959) also discuss two methods for extensions to
more than two factor levels.
In the first method one computes an
uncorrected chi-square with one degree of freedom for the difference
of the first level from all the remaining levels combined.
Discarding
the data from the first level, a second chi-square is computed for the
difference between the second test-factor level and the remaining
levels combined.
This is done successively up to and including the
last two remaining levels.
The approximate summary chi-square is
then the sum of the separate chi-squares with the number of degrees
of freedom being one less than the number of test levels.
In the
second method, exactly (conditionally) orthogonal standardized deviates
would be obtained if, in the summary analysis, as each successive
total deviation from expectation is evaluated, it is adjusted for its
multiple regression on the preceding deviations, and then standardized
by the adjusted variance.
The methods just described have been
exploited subsequently by Mantel (1963, 1966) for analyzing data from
many experimental situations.
Sugiura and Otake (1974),·in an unpublished report, have from a
general approach, derived a chi-square statistic with c7l degrees of
freedom for tests involving a series of 2x c contingency tables, and
8
have shown its invariance property and relation to the logistic model.
Their work can thus be regarded as an extension of the Mantel-Haenszel
procedure.
Mantel (1963) has extended the procedure of Mantel and Haenszel
(1959) in two
direct~ons.
In the first it has been recognized that the
procedure is not limited to the problem of retrospective studies and
its suitability has been shown for other kinds of problem t aS t for
example t comparisons of age-adjusted death rates t life table analysis t
comparisons of two sets of dose-response data t and miscellaneous
appropriate laboratory experiments.
In the second extension t the case
has been considered in which the number of levels for the study factor
is arbitrarYt but orderable t rather than limited to two.
By taking
ordering into account t single degree of freedom summary chi-squares
can be
detc~mined
for the multiple-level problem.
A variety of standard
nonparametric procedures t in asymptotic form t can be derived t and
extended beyond the present range of application t by manipulating the
score assigned to a particular study factor level.
Among the non-
parametric procedures t Mantel mentions generalization of Pitman's
permutation test t Fisher's randomization test for matched pairs t
Wilcoxon's rank-sum test and sign-test t corresponding to various
scoring procedures.
Mantel also discusses the case oj disease status
at several levels t which is equivalent to extension of the procedure to
the multiple-sample problem.
In a further extension of the Mantel-Haenszel techniquetMantel
(1966) considers an overall comparison of two sets of life tables in
their entirety.
The method is equivalent to decomposing a 2X (k+l)
9
contingency table into k correlated 2X2 contingency tables and combining
the result of each 2x2 table into a summary chi-square with one degree of
freedom.
Bringing a conditionality argument to bear, he shows how the
summary chi-square is the square of the sum of k conditionally uncorrelated random variables.
By considering the case in which the life
table intervals are arbitrarily short, it is easily seen to be a rank
order procedure.
Further, the procedure is defined for general or
individual right censorship, for left truncation, and for tied ranks.
The possibility is indicated of evaluating the statistical power of
the procedures for alternatives when there is a constant force of
mortality ratio between the two survival time distributions.
Sen (1967) has extended the findings of Hajek (1962) to grouped
data to derive asymptotically most powerful rank order tests for
simple regression against one-sided alternatives.
The underlying dis-
tributions are assumed to be continuous and the form known but the
random variables observable in a finite or denumerable set of intervals.
The tests are based on some score generating function
with the true density.
~(u)
associated
However, if one works with general scores,
the asymptotic power of the test will depend on the assumed density.
The location problems with respect to truncated one-sample and twosamples follow as special cases of Sen's results.
If truncation occurs
at the left, say at x o ' we consider the first class-interval as
!l
=
(~,xo]'
the remaining class-intervals having Lebesgue measure
sufficiently small.
We similarly deal with right truncation and
multiple-point truncation.
Consequently, Sen's findings generalize
those of Halperin (1960) and Gehan (1965a, 1965b).
I
When the data are
10
censored by sample percentiles, the truncation points are random.
However, under some regularity conditions, the problem of finding
asymptotically most powerful rank order tests under fixed-plan censoring is asymptotically equivalent to that of finding asymptotically
most powerful rank order tests under fixed-plan truncation.
In this
sense, Sen's results cover those of Basu (1967a) and Gastwirth (1965)
as special cases.
Ghosh (1973) has extended the results of Sen (1967) to the
multiple regression situation and has developed a class of nonparametric tests which are asymptotically optimal in Wald's (1943) sense.
Ghosh's findings include those of Basu (1967b) and Breslow (1970) as
special cases.
1.3. Testir6 Pro'edures Under Progressive
Censoring Schemes
1.3.1. Parametric Methods
Epstein (1954) has considered the case when n items are placed on
test and it is decided in advance that the experiment is to be terminated at minimum (X
,To) where X
is a random variable equal to
ro,n
ro,n
the time at which the ro-th failure occurs and To is the truncation
time.
Both rand T
o
0
are assigned before the experiment starts.
If
the experiment is terminated at X
, then the action in terms of
ro,n
hypothesis testing is the rejection of some specified null hypothesis.
If the experiment is terminated at time T , then the action in terms
o
of hypothesis testing is the acceptance of some specified null hypo thesis.
The underlying life distribution is specified as negative
exponential.
Since a life test is observed continuously, it is not
11
worthwhile to wait until the r -th failure occurs (or until time T ).
o
0
Epstein has proposed the following progressive censoring procedure for
hypothesis testing based on X
ro,n
Suppose that at some moment t, there are exactly k failures
k < r o - l,and that the observed total life is V(t),
where -
o<
k
V(t) =
L
i=l
x.
J.,n
+ (n-k)t,
(1.3.1)
and xi ,n is the i-th smallest observation in a sample of
-x/8
size n. The p.d.L of X is 8-1 e,x>O,8>O.
Suppose
HO:
8 = 80 vs. HI:
8 = 8 (8 < 8 ) ,
0
1 1
(1.3.2)
It is noted that V(t) is~in t. The testing procedure is:
(a) Continue experimentation so long as V(t) < roC and
o ~k ~ r o - 1; (b) stop experimentation with acceptance of
Ho as soon as V(t) > roC and 0 ~ k ~ r o - 1; (c) stop
experimentation at Xro,n with rejection of HO if V(t) < roC
for alIt < x r n, where
0'
C = 80 X12 -a (2r 0 )/2r 0 ,
(1.3.3)
and xI-a.(2r ) is the lOO(l-a)th percentile point of the chisquare dist~ibution.
A similar progressive censoring procedure has been developed on the
basis of To.
1.3.2. Nonparametric Methods
Without preassigning the number of failures or maximum duration
of time beyond which the expe:riment is not to be continued, Alling
(1963) has proposed that, as each observation is made, the least upper
and greatest lower bounds for any subsequent value of the statistic
may be calculated.
If the critical value does not lie between these
two bounds then the outcome of test is determined and experimentation
terminated.
Use of this procedure in the case where the Wilcoxon two-
sample test is appli.ed to an ordered sequence of observations
12
considerably reduces the average sample size.
The test proceeds as
follows:
Consider a random sample Xl, .•. ,XM from F(x) and a random
sample Yl, ... ,YN from G(y), F and G being continuous distribution functions. We wish to test at the a level of
significance
H :
O
vs. HI:
F(x) 2. G(x) for all positive x
F(x)
~
G(x) for all positive x and F :f G.
(1.3.4)
Let the statistic U be the number of times a Y precedes an X
in the ordered sample and let b a be the critical value of the
Wilcoxon test, that is, if U > b a accept HO, but if U ~ b a
accept HI. Consider now the situation where m X's and nY's
have been observed and compute Um' where Um is the Wi1coxonMann-Whitney statistic based on m X's and nY's. Obviously,
Um is ~ in m. It should also be noted that
Um + (M-m)n < U < Um + (M-m)N. The progressive censoring test
procedure is-: (a) I f Um +, (M-m) n > b a , accept Ho; (b) i f
Um + (M-m)N < b a , accept HI; (c) if Um + (M-m)n2. ba ~ Um
+ (M-m)N, continue experimentation.
The propose::l.
tes~
and the usual Wilcoxon two-sample test are identical
in power and consistency.
Halperin and Ware (1974) have extended the procedure to the
restricted experimental situation where the experiment is terminated
after a preassigned percentage say, lOOA%, of events has occurred in
either of the two samples.
The asymptotic distributions of. the test
statistic (of Wilcoxon type) and the proportion of events after which
early stopping occurs have been derived.
compared with Alling's (1963) results.
also has been considered.
Let Xl' •••
Some Monte Carlo work is
The case of staggered entry
'~
be the ordered survival times
from the distribution function F(x) and Y ' ••• 'Y the ordered observa1
N
tions from G(y), F and G being continuous.
significance level a
We wish to test at the
13
H :
O
F(t)
=
G(t) for all t
vs. HI:
F(t)
~
G(t) for all t and
F(t) > G(t) for some interval on t.
(1.3.5)
Define
I i f X. < Y.
o
=
ij
1.
1. -
arid for any m and n (m
~
M, n
~
J
(1.3.6)
N) define the Wilcoxon statistic
n
Wm,n
J
{ 0 if X. > Y.
m
= j~l
i~l
°ij
(1.3.7)
The stopping rule and one-sided asymptotic test proposed are:
1. The experiment (when early decision does not occur)
will terminate after the smallest of X or Y
AM
AN
has been observed.
2. Let n(m) be the number of failures occurring in the
Y(X) sample prior to XAM(Y ). Compute the test
AN
statistic
(1.3.8)
W
m,AN
and
2
WAN,AM - (A MN/2)
(1.3.9)
IA 3 MN(M+N) (4-3A)/12
3. Reject H if the statistic (1.3.9) exceeds za' the upper a
O
critical value of the standard normal distribution.
Mantel (1966) has also proposed a rank order procedure for the .
experimental situation cited earlier in which a chi-square statistic
is computed after each death (failure) in a comparative survival time
study.
This statistic depends only on the ordering in time at which
~
14
deaths and losses to follow-up occur.
The proposed statistic is the
maximum chi-square computed over the course of the study.
Such a
statistic can be used in a progressive censoring scheme and significance
or nonsignificance ascertained before the end of the study.
Chatterjee and Sen (1973) have proposed a general class of linear
rank order tests in one- and two-sample problems under progressive
censoring procedures, as against specific statistics considered by
Alling (1963) and Halperin and Ware (1974).
Chatterjee and Sen have
established the asymptotic equivalence of the rank order tests for
censoring and truncation problems.
Along with a basic martingale
property and Brownian motion approximation for a related rank order
process, asymptotic distribution theory of the proposed statistic has
been developed.
They have also considered the asymptotic performance
of the proposed tests in terms of Bahadur efficiency and stochastic
smallness of stopping variables.
For general rank order tests, under
the progressive censoring procedure, the sequence of statistics constructed at the successive censoring points cannot be regarded as
satisfying the monotonicity property as in the particular situations
considered by Epstein (1954), Alling (1963) and Halperin and Ware
(1974).
The Chatterjee and Sen test procedure under progressive
censoring proceeds as follows:
Let X' == (X l' ••• 'X ) be the vector of random observations
~n
n··
nn
from continuous distributions and R' == (R 1, ... ,R ) be the
~n
n
nn
vector of associated ranks. Let also S . designate the
n1
position of the individual in the sequence (1,2, ••• ,n)
whose rank is i. Then ~n
S' == (S n l' .•. 'Snn ) is called the
{a (i)}~ 1 and {c .}ni- are suitably
n
1==
n1 - l
chosen scores and regression constants, respectively, such
that
vector of anti-ranks.
15
n
L
i=l
n
L
i=l
a (i) = 0,
n
a 2 (i) = (n-l) ,
n
n
L
c
i=l
n
L
i=l
ni
= 0,
c 2 . = 1.
nJ.
(1.3.10)
Under a fixed-plan censoring, one observes up to the r-th
order statistic as envisaged in the design and constructs
the test statistic
r
T
n,r
=
L
i=l
C
sni {an (i)
*
a (r)},
n
(1.3.11)
*
-1 n
where a (r) = (n-r)
L a (j). Let T k be the statistic
n
j=r+l n
n,
similarly constructed as in (1.3.11) but based on the first
k order statistics. For purposes of a progressive censoring
test, the proposed statistics are
M+
= max T
n,r
O<k<r n,k'
M
= max ITn,k l .
n,r
O<k<r
(1.3.12)
As the experiment progresses one compares T k (or IT k l )
n,
n,
against the percentile point of the distribution of
M"'n,r
- (or Mn,r ) under the null hypothesis, and rejects the
null hypothesis if it exceeds some ciritcal value, continuing
with the experiment if it does not.
1.4. Analysis with Concomitant Variables
Cox (1972) has considered survival analysis in the presence of
covariables.
It has been assumed that on each individual are available
values of one or more explanatory variables.
The hazard function is
taken to be a function of the explanatory variables and unknown
16
regression coefficients multiplied by an arbitrary and unknown function
of time.
A conditional likelihood is obtained, leading to inferences
about the unknown coefficients.
For the two-sample problems, the
procedures proposed by Cox are closely related to those of Mantel and
Haenszel (1959) and Mantel (1963,1966) due to the fact that by considering a series of 2x2 tables, at each level of the concomitant
variables, adjustment for covariables is automatically made.
Applications of Cox's technique in modified form or otherwise
have been made by various workers, of which the work of Breslow (1974)
may be mentioned.
Breslow has compared survival curves by application
to the clincial trial of maintenance therapy for childhood leukemia.
He has considered three models of which two are parametric and the
other Cox's.
Age and blood count at diagnosis are both shown to be
important for prognosis; adjustment for these variables has marked
effect on treatment comparisons.
Cox's model does not seem to be
general rank order tests.
suitabl~
for application in
First, it is a quasi-parametric model in
the sense that the survival functions are assumed to be in a specified
parametric relationship to each other.
Secondly, as pointed out by
Chatterjee and Sen (1973), the rank order statistics, being not linear
in the original observations, do not lend themselves
analysis through the usual approach.
~o
covariance
A reasonable approach appears to
be blocking by concomitant variables so that the effect of the covariate
is kept to a minimum in each group.
17
18
distribution function with various weight functions, has been considered
by Anderson and Darling (1952) in the uncensored case.
analogue has been developed by Kiefer (1959).
Its k-sample
For excellent reviews on
weak convergence of empirical processes based on sample distribution
functions, reference may be made to Darling (1957) and Durbin (1973).
In the censored two-sample problem, we derive Cramer-Von Mises statistics based on sample distribution functions as follows:
Let Xl, ••. ,X be a random sample from a continuous distribution
m
function F. The order statistics are denoted by
X(l) < X(2) < ••• < X(m) • Let also Yl ,··· Yn be n independent
and identically distributed random variables from a continuous
c.d.f G. They are ordered as Y(l) < Y(2) < .•• < Y(n). Finally,
let Z(l) < ••• < ZeN) denote the ordered combined sample, where
=
N m+n. The empirical distribution functions of the individual
and combined samples are given respectively by
0, x < X(l)
F (x)
ilm, XCi) .::. x < X(i+l)
m
1, X(m) .::. x,
O,x<Y(l)
G (x) =
n
jln, Y(j) .::. x < Y(j+l)
1, Yen) .::. x,
~(x)
= AFm(X) + (I-A) Gn(x) ,
(1.5.1)
where A = mIN. The Cramer-Von Mises statistic based on the
first k order statistics is defined by
~,k
k
=
~i~1[Fm(Z(i)-Gn(Z(i»J2[~(Z(i»(1-~(Z(i»)]-Y,
(1.5.2)
-y
where 0 ~ y ~1 and ~(Z(i»(l-~(Z(i»)J
isa weight
function used to strengthen the statistic in order to increase
the power of the test. The integral representation of (1.5.2)
is
19
.
:n
~l(k/N)
J.
(Fm(X)-Gn(X))2[HN(x)(1-~(x))]-Y d~(x),
(1.5.3)
-h:)
~(x)
where H;l(k/n) = infimum{x:
=
kin}.
The statistic can
also be expressed in terms of the ranks of one sample in the
combined sample. In order to do this we denote the rank of
X(i) in the combined sample by Ri • Then X(i) = Z(R ) and i of
i
the observations of the first sample are less than or equal to
X(i)· Then at Z(Ri)
R.-i
i
1.
=---m
n
i(m+n)-mR.
l.
(1.5.4)
=----~
mn
Since we are considering up to the k order statistics it may be
convenient to express (1.5.4) in terms of anti-ranks S. (such
1.
=R
that SR'
Si
1.
= i), i.e.,
Fm(Z(R.»
- Gn(Z(R.»
1.
(1.5.5)
=
1.
The above is an adaptation from Durbin (1973, p. 41) with
necessary modifications. By a well known result, if the
empirical process {y (t)} ~ {y(t)} in the metric space D[O,l],
n
then {g(y (t)}
n
lJ.
{g(y(t»}, where g is a measurable function in
D which is almost everywhere continuous in metric d [see Durbin
(1973)]. It can be proved that under H
O
2
(F (x) - G (x» 2
n W (t)
0
mn
m
n
~
(1.5.6)
N {~(x)(l-~(x»}Y
{t(l-t)}Y
where t
= F(x)
£ [0,1] and W (t) is a Brownian bridge process.
o
Then, underH '
O
2
AN,k
where k
=
!l
JP W2(t) dt
_0
o
_
{t(l-t)}Y ,
(1.5.7)
[np] + 1 and 0 < P < 1.
We shall, however, concern ourselves with Cramer-Von Mises type
statistics based on linear rank order statistics T k defined
n,
after (1. 3.11) 1. e. ,
(1.5.8)
20
(n)
where W (t
n k
o = t o(n)
)
= Tn, k[Var(Tn,r IH o)] -k2
(n)
(n)
< ••• < t
= 1, and W (0) = O. We propose
l
r
n
D
to show that under some regularity conditions W2 (t) 7W 2 (t) on
< t
n
D[O,l], where W(t) is a Brownian motion process and hence
(1.5.8) converges weakly to a functional of the Brownian
motion process.
As the statistics considered in Chapters III and IV will be
shown to have the distributions of functionals of Brownian motion
process in large samples under appropriate regularity conditions, we
derive empirical distributions of these processes by simulation and
graduate, where possible, by suitable Pearsonian curves, in Chapter V.
We also present two numerical illustrations giving, through simulation
studies, empirical powers and mean stopping times for various drifts
for the tests developed in Chapter II and the test derived for the
simple regrp8sion problem in Chapter III.
We outline proposals for further research in the concluding
chapter of the dissertation.
CHAPTER II
CHI-SQUARE TESTS FOR GENERAL MODELS UNDER PROGRESSIVE
CENSORING WITH BATCH ARRIVALS
2.1. Introduction
Chi-square statistics are widely used for testing of hypotheses
for categorical data.
and
1if~
In many situations relating to clinical trials
testing problems, the response categories are (ordered and)
sequential in time, so that a complete collection of data may involve
a considerable period of time (and hence, cost).
For this reason, pro-
gressive censoring schemes (PCS) are often advocated with a view to
terminating the experimentation at the earliest possible stage depending
on the accumulated evidence at the successive stages.
Further, in many
such problems, not all the subjects enter into the study at a common
point in time and, naturally, a batch-arrival model (BAM) seems to be
more appropriate.
Under a general model on the probability structure
and assuming the number of batches to be fixed in advance, the current
investigation provides suitable progressively censored tests based on
chi-square statistics.
Among other generalities, this includes as a
special case a time-sequential test essentially related to Mantel (1966)
for 2 x (k+l),
k~l,
contingency tables.
The basic problem is formulated in Section 2.2.
are presented in Section 2.3.
The proposed tests
Sections 2.4 and 2.5 deal with the distri-
bution theory of the test statistics under suitable null and (local)
22
alternative hypotheses.
In this context, it is shown that one of the
proposed tests has the attractive property that it has the same level of
significance and power as the overall test but can lead to rejection at
an early stage, resulting in lower expected cost and shorter time of
experimentation.
Section 2.6 is devoted to various extensions to multi-
response models and comparisons with some other tests.
2.2. Formulation of the Problem
In the context of clinical trials or life testing, consider a typical truncated experimentation plan for a pre-determined amount of time
T and suppose that the response (failures) are recorded in k (ordered)
time intervals II, •.. ,I
and the surviving (beyond time T) subjects
k
constitute the cell I *+ •
k l
I
c
=
Thus, we have a set of k+l ordered categories:
it:
< T
k
=T
I*
c+l
= I C+ I
<
00.
Conventionally, for every c < k, we let
u I + 2 u ••• u I
c
k
U
I k*+ l •
(2.2.1)
We conceive of a BAM where all the subjects may not enter into the study
at the same time.
2(~1)
Instead, there are
entering the study at time-point T
h
batches, with the j-th batch
for some 0 < h
-
j
_< k-l, so that the
j
observable categories for this batch are Il,···,I k _h . and I *
k - h .+ l ,
J
for 1
~
j < 2; also, 0
= hI
< .•• < h
2
~
k-l.
J
Note that in this design,
it is not necessary to fix in advance the individual batch sample sizes;
rather, the j-th batch sample size may be decided at the entry time Th '
j
for j
= 1,
•.. , 2.
Further, keeping in mind the usual comparative
experiments involving one or more factor (or treatment, etc.), we
23
conceive of the r(>l) sample situation, where, for each batch, the r
parallel samples relate to the same categories.
For a better under-
standing of the model,we refer to the numerical illustration in
Chapter
v.
For the i-th sample in the j-th batch, the number of subjects
entering into the study is denoted by n
ij
, j = l, ...
nij,t be the number of observations (among the n
ij
,~,
1
~i ~
r.
Let
subjects) belonging
*
t = l, ••. ,k-h. and niJ',k-h.+l
be the number of censored cases,
J
1 ~ j ~
~,
J
1 < i ~ r; on letting k. = k-h. , we denote the corresponding
J
J
cell probabilities by TIij,t' 1 < t -< k. and TI *.. k+l' 1 < j
1J, .
J
J
1 < i <r.
~ ~,
Further,
k.
J
L TI1J,
.. t
t= 1
k.
J
L nij,t
t=l
*
+ TIij,k.+l
= 1, V 1
~ j
~~, 1
< i < r.
(2.2.2)
J
*
+ nij,k.+l' V 1 ~ j ~~, 1 < i < r.
J
The schematic representation of the experiment is given below:
(2.2.3)
24
. .
0
z
eu
.-I
.c:
CJ
CIl
I::Q
~
1
.j.J
1
ctl
2
·.
1 Pll,l [111,2
o
0
2 n ,1 [112,2
12
o
0
..
· ··
0
0
.l/,
2
Categories
(
0
Z
1
k-h .l/,
11"
11.
n.
n.
ll,k-h.l/, 11,k-h.l/,+1
1n1.l/,,1 PU,2
o •
···
·.
o
0
o •
k-h
k-h +1
2
2
11.
11,k-h
11.
11,k-h +1
2
2
*
11.
12,k-h
11.
12,k-h +1
2
2
··
·
·
· ··
*
-°l.l/"k-h.l/, ~.1.l/"k-h.l/,+1
0
0
_
12,k-h.l/, 12,k h.l/,+l
·· ·· ··
0
k-h.l/,+l
~
0
0
--
o •
•
•
0
0
··
· ··
·· ·· ·· · ··
·
· ·
· · I ·
.l/,
*
1n 2.l/,,1 n 2.l/, , 2 · .n,2.l/"k-h.l/, P-2.l/"k-h.l/,+1 · . --
·
0
0
0
· ··
··· ··
0
0
0
r
· ·· ··
·· ·· ·
0
0
0
0
0
0
0
0
1 In ,1 nrl,2
r1
o •
n
2 In ,1 n ,2
r2
r2
o
n
·
··
0
0
0
.l/,
•
0
o
0
k
*
1111 ,k [111,k+l
0
·
--
--
··
···
--
·
--
0
•
k+l
0
/3
0
~
11
11
n
12
··
0
nU
*
Inn, 1 11 21 ,2 In.21,k-h.l/, P,21,k-h.l/,+1 ·.n,21,k-h 2n.21,k-h 2+1 n21 ,k n21 ,k+l n21
*
h
2 1n
22 , 1 11 22 , 2 In 22,k-h.l/, P-22 , k-h .l/,+1 ·. 22,k-h 2P22,k-h 2+1
0
·.
···
Inr.l/"l nr .l/,,2
0
0
0
0
•
0
0
0
--
·
·
· ··
··
··
· ··
··
0
0
p .
rl,k-h.l/, rl,k-h.l/,+1
22,k-h.l/, Pr2, k-h.l/,+l
0
0
o •
0
0
0
0
0
0
·
·
~l
rl,k-h
n
rl,k-h +1
2
2
0
•
0
·
·
0
0
·
0
·
0
o •
n
0
0
0
·
0
--
··
··
·
--
--
n2.l/,
·
··
··
···
··
·
··
·
*
-0
0
·
0
··
·
0
nrl,kPrl,k+l °rl
0
o
0
·
0
°22
··
0
0
•
--
--
·.In.r2,k-h 2 *r2,k-b 2+1 ·.
· ··
·
-n
0..*
r.l/"k-h.l/, r.l/"k-h.l/,+l ·.
0
0
0
•
0
--
--
n
r2
··· ···
-- °r.l/,
(2.2.4)
25
The joint distribution of the observed cell frequencies is given
by the product-multinomial:
r
<t>
=
n
k j . n .. t]
n .. !
Q,
IT
*
TIn. ~J ~ (n ..
~J
[ t=l
i=l j=l kj
*
TIn .. t!(n .. k +l)!
t=l ~J ~
~J, j
For this
mddel~
~J,
~J ~
*
k +1)
j
nij,k.+l
J
(2.2.5)
one usually expresses
(2.2.6)
and proceeds to test a null hypothesis
Fm(~)
H :
O
where the F
m
= 0, m = l, .•• ~q(~s),
(2.2.7)
are suitable (linearly independent) functions of 8.
Let
the test statistic be
A
k j (n •. t-n .. n ... t)2
r
Q,
I
L I
~J
A
~J
t (8),
,~
minimizing Pearson's
r
~J,
, (2.2.8)
n..
A
n..
~J
n..
~J ,t ~J, t
i=l j=l t=l
where nij~t =
,
*
A*
n' j
A
A
.. k +1(8) and 8 is obtained by
k +1 =n ~J,.
~,.
~
J
J
Q,
L I
i=l j=l
(2.2.9)
26
with respect to
e,
subject to the constraints in (2.2.7).
large sample sizes, any BAN estimator of
e
Actually, for
can be used [see Neyman (1949)].
For large sample sizes, under H in (2.2.7), under suitable regularity
O
conditions given by Cramer (1946, pp. 426-27) and subsequently modified
by Birch (1964),
(2.2.10)
and this provides a large-sample test of H ' where we reject H for
O
O
"'-
large values of X2 •
"'-
The test statistic X2 is generally used when we wait until the end
of the experiment as envisaged in the design.
However, as mentioned
earlier, we like to adopt a PCS where we monitor the experiment from the
beginning ?ith a view to stopping experimentation at an early stage
(i. e., at St)TI,e time-point T , c < k) if the evidence ac'cumulated up to
c
-
that stage is sufficient to reject H '
O
Basically, without affecting the
risk of making incorrect decisions, such a PCS procedure can be constructed and will result in a reduction in the time of experimentation,
and hence, cost too.
This is elaborated in the next section.
, 2.3. Chi-Square Type Tests for the
General Medel Under PCS
Let us examine the structure of the categorical jata at the comPletion of time T.
c
c. =
J
c~h.,
J
the
j~th
Let J
c
= {j:
h. < c}.
J -
Then, for j E J , writing
c
batch (n .. ) relates to the categories 1 , •.• ,1 ,
1
1J
cj
respective frequencies n .. l,···,n *
I*
i J. ,c j +1' for 1 < i < r,
c .+l ' with
1J,
J
so that the joint probability function is
27
cj> c =
n
n
TT
...:J.I IIn.,1J, t (II*
n i .!
*
J ---- ,
I_ _....=.L.
i=l jEJ
c
*
cj
t=l ij,t
)
n ij ,cj+l
(2.3.1)
ij,cj+l
n . ! (n. .
+1) !
i J,t
1J ,c.
t=l
J
Consider then the related minimum chi-square statistic
r
A
Xc2
=
c.
2 2
i=l jEJ
c
P
t=l
, (2.3.2.)
~*
where IIij,t and IIij,c.+l are the estimated probabilities obtained by
-
J
substituting ~8 in (2.2.6) and 8 is the minimum chi-square estimator
.
- -c
_c
of
e based
on (2.3.1).
Here also, use of any BAN estimator (based on
A
(2.3.1»
is permissible for large n. "
.
1J
A
Note that cj>=cj>k' X2=x~, and
A
the earliest stopping time-point T for constructing X~ involves b+l
b
*
2
cells (Il, ••. ,Ib,I + l ) for which
b
jEJ
cj>
(b-h.) > s-q.
J
Further note that
b
involves only a subset of the probabilities {IIij,t}' and hence the
c
null hypothesis H in (2.2.7) may involve only a subset q(c) of
O
restraints on the II
ij
, t at the c-th stage, where q (c) is 7' in c.
This
is particularly true for contingency tables, where, as c increases, we
deal with an increasing number of marginal probabilities associated
with the categories.
Thus for the model in (2.2.5) we set
~ H
° -= b<c<kO,c'
H
H
O,C
F(C) (8) = 0, VI 1
m -
H is monotonic:
Oc
~
m ~ q(.c); q(c) / ' in c.
(2.3.3)
(2.3.4)
Naturally,' i f H
is not true for somec«k)., then R.. J.a.also· not true.
O , c · O
28
2.3 .1. Type A PCS Test for H •
O
{T :
c
Consider the set of time-points
..... 2
b':::'c.:::.k}.
At T , compute X •
c
..... 2
c
Continue experimentation so long
If, for the first time, for c
as Xc .:::. Wa' c':::'k.
experimentation at time T along with rejection of H .
C
O
C(.:::.k) exists, accept H at time T .
O
k
If no such
Here we need to choose wa in
such a way that the overall level of significance of the test is
a:
0 < a < 1.
variables.
Note that in this setup, both C and T
e
are stochastic
Further, the test proposed here depends only on the set of
frequencies of the uncensored cells.
However, parametric considerations
may enter (excepting in some cases like contingency table situations)
through the dependence of IT..
on 8.
2.3.2. Type B PCS Test for H .
O
Let us define
1.J , t
D
c
2
max{x c2 - X
. . . c-l'
o} , b < c _< k,
..... 2
= Xc
if c
= b.
Then continue experimentation so long as D <
c -
= C, DC >
first time, for c
with the rejection of H •
O
(2.3.5)
~C'
~
c
, c > b.
-
If for the
stop experimentation at time T along
C
If no such C(.:::.k) exists, accept H .
O
Here
also, {~b' •.• '~k} are positive constants such that
(2.3.6)
I-a,
where a = level of significance.
For both the foregoing tests, as we shall see in the next section,
the design must be compatible in the sense that (i) q(c) should be ~in
c and (it)
I
J
(c-h.) - s + q(c) should be,/' in c:
c
J
b <
C
< k.
29
2.3.3. Two Simultaneous Tests.
We have considered earlier the situation
when the number of batches as well as the time-points T ., 0 < h
h
j
j
~
k-l,
J
= I, ••. ,$/"
at which the batches enter into the study, are predetermined
in order that we know in advance the degrees of freedom of the terminal
statistic
X relating
2
and all the samples.
to the overall test computed over all the batches
Corresponding to the Type A and Type B tests, we
may devise two simultaneous test procedures across the independent
batches; these procedures will be termed Type C and Type D tests,
respectively.
These tests will be particularly useful when the number
of batches is large and the experiment is continued over a long period
of time.
In these latter tests, the terminal statistics are based on
the respective batches only.
Consequently, we may relax, in this case,
the condition of batch arrivals at pre-determined time-points; rather
they may be conveniently decided as the experimentation progresses.
We now proceed to describe these two tests.
2.3.3.1. Type C PCS Test for R •
O
As in Type A test but separately for
$/,
A
C
.
2
the relevant batches, compute {.X L .1' at time-point T , where $/,
J c J=
c
c
is the number of batches involved at the c-th stage of censoring.
the earliest time-point for constructing this statistic is T
b
first batch, then the time-points are 'J;'bO+~1 T +h ' ,t"
bO 3
the subsequent batches respectively.
Th.e· dependence of $/,
for the
Tbo+~ for
c
from tne fact·that
$/,
$/, = I I (b o + h. ~ c), hI = 0 < h 2 < ••• < h$/, ~ k-b O'
c
j=l
J
where I(x) stands for the indicator function for the event.
A
o
If
on c comes
(2.3.7)
At the timeA
point T ' the sequence {.X 2 } consists of only the single element lX b2 •
bo
J c
0
30
This lbO' may be different from 'b' defined in Sections 2.3.1 and 2.3.2.
•
A
Continue experimentation so long as {.X 2 < .W",,,} , V J = 1, ••.
JC-J""
experimentation
along with the rejection
of H0 if, for some j,
.
.
If no such c(b o
~
c
~
k) exists, accept H at time T
O
k
=
T.
Stop
,JI,.
C
J.X c2
> j w".
a
Here we
choose d' in such a way that the overall level of significance for
every j
= 1, ... ,JI,
experiment is a:
is a":
o<
0 < a" < 1, and the overall error rate of the
a < 1, so that I-a =
2.3.3.2. Type D PCS Test for H '
O
A
(l-~')
JI,
•
At Tc , for each j = 1, .•. ,JI, c , .compute .
A
max{.x 2 -.X 2 l'O}, b O + h. < c < k,
J c
J cJ-
.D
J c
A
2
jXc ,
C
= bO + h
r
0
= hI
< h
2 ... hJl, ~ k-b O'
< .6. ,V j, c > b
Jc-Jc
- O
Then continue experimentation so long as .D
If for the first time for c = C, jD
C
> j6.
c,
o
at T
k
= T. Here
+ h J..
for some j, stop experimenta-
tion at time-point T along with the rejection of H '
C
O
C(~k) exists, acceptH
(2.3.8)
If no such
{.6. +h , ••. ,.6. } form a
J b0 j
J k
double sequence of positive constants such that
P{.D
< .6. , Vb
J~-Jc
where
~,
O
+
h. < c <kl.H }
JJ O
(2.3.9)
= I-a",
.
is the level of significance for an individual batch and
the overall error rate of the experiment, satisfies l-P.
CI.,
=.(I-a") JI,;
jH is as defined in (2.4.20).
O
2.3.4. Relevance to the Mantel Procedure.
In further extension of the
methods of Mantel-Haenszel (1959) and Mantel (1963), Mantel (1966) considered an overall comparison of two sets of life tables in their entirety.
31
The method is equivalent to decomposing a 2 x (k+l) contingency table
into k correlated 2 x 2 contingency tables and combining the result
of each into a summary chi-square with one degree of freedom.
By bring-
ing a conditionality argument to bear, he showed how this statistic is
the square of the sum of k orthogonal random variables.
He also pro-
posed a time-sequential procedure where the summary chi-square is
computed after each failure and the test statistic is the maximum of
these chi-squares over the course of the study.
Mantel, however, did
not mention that it is possible to develop another time-sequential test
on the basis of the distribution of the maximum of the set of k conditionally orthogonal chi-squares, each with one degree of freedom,
from the k 2 x 2 contingency tables, along the lines of Section 2.3.2.
Structurally, Mantel's procedure is akin to the Type B
pes
test.
2.4.-Asymptotic Distribution Theory Under H
O
First, through the following two Lemmas, we show that
"2
Xc
is non-decreasing in c:
b < c < k.
(2:4.1)
Lemma 2.4.1. Let {m .. t; 1 < t < k., m~. k +1' 1 ~ j ~ 1, 1 < i < r} be a
J
1.J,
k
.
1J,.
J
*
set of positive numbers such that r~=lmij,t + mij,kj+l = n , V 1 < i ~ r,
ij
1 < j ~ 1, and let m~.
1J ,c
-
c (n . t.;..m .. t) 2
j i J,
L
1J,
r
L L
i=l jEJ
+1 be defined as before.
c
t=l
Let
j
m••
t
1.J,
+
*
2
(n.*
1.J. ,c . +l-m.1J. ,c +1)
J"'--
*
........._
_
j
, b < c < k.
mij ,cj+l
(2.4.2)
Then,
x2 (m)
c -
is non-decreasing in c:
b < c < k.
32
Proof:
By
definition~
X~+1(m) =
r
L L
L L
i=1 J'e:J c
r
=
L ------ +
i=1 je:J c + 1 t=1
r
>
c.+1(n..
-m..) 2
J
1J,t 1J,t
L L
i=1 J'e:J c
*
mij,t
mij ,c.+2
J
f
c.+1(n .. t-mi' t)2
1J,
J,
t=1
*
*
(n.1J,
. c. +2-m1J
. . ,c. +2) 2
+
....J
---,JoL--__ 1
~.,...--_
*
m.. t
mij ,c.+2
J
1) ,
c.
!
*
(n..
+2-mi'
+2 )2
1J,C.J,C . .
J-:-*_ _----'JO<--_I
(n..
+1-m'j
+1)2
1J 'C
1. 'C
j
j
- - - - - - - + ---"'----_....
_
"---
t=1
mij ,c.+1
J
*
+
*
·
(n i J,c.
+2-mi'J,C. +2) 2
....J-:--_ _~J'___
*
mij ,c.+2
J
(2.4.3)
*
*
m. .
m• .
We now write n..
+1 = a 1· ·, n..
+2 = d .. , m..
+1
1J,C.·+1- 1J,C.
1J,C. +2- 1J,C.
1J
1J ,C.
J
J
J
J
J
J
= e iJ' and m. •
+2 = f ... Then
1J, c
1J
*
j
(n.1J. ,c. +1-mij.
J
r
L L
i=1 J'e:J C
J
m..
1J ,C. +1
J
*
+
,C. +1) 2
*
(n.1J. ,C. +2-mi'J
Jol..
,C. +2)
2
..oj/-_
*
mij ,c.+2
J
*
*
(n.1J. ,C. ·+1-m1J
. . ,C. +1) 2
J
J
*
m..
1J ,c. +1
J
33
r
. .d 2..
(a .. + d )2 ]
ij
1.J
.-!J.. + .-!J.. (e
..
+
f
.. )
j£J
f
e
i=l
ij
ij
1.J
1.J
c
I
I
=
[ a.2
.
(2.4.4)
Now,
a
2
•
i
~
+
e..
d~j
1.
_
f..
1.J
1.J
=
+ d .. )2
1.J
(e .. + f .. )
(a ..
1.J
1.J
1.J
a ~ . (e .. • f. .+f ~ .) + d ~ . (e ..• f .+e ~ .) - e .. f. . (a ..+d .. ) 2
1.J 1.J 1.J 1.J
1.J 1.J i J 1.J
1.J 1.J 1.J 1.J
ei·fi· (e ..+f i .)
J J 1.J
J
( 2 f2··+i·ei·d 2 2 2a d e · fi.)
i
= _ai·
_J_1....
J _-'-'J_.....J~_1.
]~1.~]'---"'''''''J.....;:;;;
....J_
e. . f . . (e. . +f. .)
1.J 1.J 1.] 1.J
(a i ·f i ·-d ·e .. )2
J
J i J 1.J
> 0
" 1
d d
"+f)
, V rea a1..j an
i]·'
f
e ij ij ( e ij ij"-
=
since e
ij
and f
are positive.
ij
As in after (2.3.2) let
e
~c
(2.4.5)
o
be the minimum chi-square estimator of
e
based on (2.3.1), so that by (2.3.2),
A
(m .. t = n i ·1T··· t(8 ), V i,j ,t).
1.J ,
J 1.], ~ C
A
Lemma 2.4.2.
2
2
xc (m) ->xc (m
~
~c
(2.4.6)
A
),V m
" m
,b -< c -< k.
~
~c
The proof directly follows from the fact that, since _c
e is the'
A
minimum chi-square estimator of 8 based on
minimum (over all m) value of
-
~
, m = m(8 ) leads to the
c -c
--c
xc2 (m).
~
From the preceding two Lemmas, we immediately conclude that
A
A
A
A
A
A
x~ = X~(~c) ~ X~(~c+1) < X~+1(~c+1) = X~+l' V c.
(2.4.7)
34
By the results of Neyman (1949), (2.4.7) holds (up to the order of
o (1»
p
for any sequence of BAN estimators.
2.4.1. ASymptotic Null Distribution of Type A Test Statistic.
f c = r[
2 c.]
- s + q(c), b < c
je:Jc J
Let
~ k,
so that by the compatibility condition of the design f
(2.4.8)
c
is ~in c.
Further we let
(2.4.9)
2
Finally, let Xp,y
be the upper 100y% point of the chi-square distribution
with p degrees of freedom (d.f.).
It follows then from the basic results
of Cramer (1946, pp. 424-34) that under the regularity conditions stated
in Birch (1964), for every c:
b
~
c
~
k,
(2.4.10)
when H·
holds.
O,c
A
However, the X2 , b <
c
independent.
C
< k, are obviously (even asymptotically), not
But, by (2.4.1) and (2.4.10),
A
p{X c > wa for some c:
b ~ c ~ kIHO,c}
(2.4.11)
Note that wN = X2
\,N
f
k
,a
is the critical value of the overall test (based on
completion of experiment up to the time T
k
~
T).
Hence the Type A test
consists in choosing the same critical value of the overall test but
allows the possibility of an early rejection without any increased risk
of making incorrect decision.
35
2.4.2. ASymptotic Null Distribution of the Type BTest Statistic.
By
(2.4.1), (2.4.10), and the classical Cochran theorem [viz. Searle (1971,
p. 64)], under the same (Cramer-Birch) regularity conditions, under H
O
D
D ~ X2
c
d
for b < c < k.
(2.4.12)
c
A
Again, X~ = Db + ... + Dk , Dc ~ 0, V c and f k = db + ... + d k , d c > 0,
b < c < k} forms an asymptotically independent set
V c, so that {D:
c
of chi-square statistics [see Searle (1971, pp. 60-61)].
Hence,
asymptotically,
k
P{D
k
<!::., V b ~ c ~ kIH } ~np{Dc ~ !::.c/Ho c} ~ TTp{x~ ~ !::.c}'
c - c
O
c=b
'c=b
c
(2.4.13)
and thus, i f we let
A
u
_
c
-
Xd·2 c ,,,,,N"
b
~
c ~ k; I-a
= ( I-a ,)k-b+l ,
(2.4.14)
we obtain from (2.4.13) and (2.4.14) that
(2.4.15)
As a result, the Type B test consists in comparing the successive
differences {D:
c
{x~
a':
c'
b ~ c ~ k} with the corresponding critical values
b ~ c ~ k} and rejecting the HO at the earliest possible value
of c, if there is any such value at all.
2.4.3. Asymptotic Null Distribution of Type C Test Statistic.
Let, at
the c-th censoring stage, .q(c) be the number of restrictions on the
J
model for the j-th batch.
is ./ in c.
Because of the compatibility condition .q(c)
J
We also define
(2.4.16)
36
Due to the compatibility condition of the design . f
J c
also is /' in c.
Further we let
.d
=.f
J c
J c
- .f
J c-
l' bO + h. < C < k, j = 1, ... ,t,·
J(2.4.17)
Leaving the details, we proceed as in Section 2.4.1 and obtain
A
P{jX~ > jWa " for some c:
bO + h j
.s. c .s.
A
A
2
= pLX
J k
kljHa,c}
>
.W
,,1.Ha} + a" i f
JaJ
J,WN"
u.
2
X f
",'"
(2.4.18)
j k'U.
where we conceive of
.H
J a ,c
is monotonic,
J.Ha ,c '
.F(c) (8) = 0, V 1
.H
J a ,c
J m
-
.s. m .s.
(2.4.19)
(2.4.20)
jq(c).
(2.4.21)
Consequently,
A
pLx 2 <
V c:
.W '"
JC-Ja
A
=p{
.x 2
J k
bO
+ h.J -~ c -< kl.H
}
J a
<.W",,, I.H
-Ju.
J
a} + I-a",
(2.4.22)
and
A
p{.X 2 < jW",,,,V c:
J c -
+
u.
b
o + h.J -<
C
< k and V j
-
= 1, ••• , t IHa .}
(I_a")!, because the batches are independent.
Therefore, the overall level of significance a
i~
\,
(2.4.23)
37
A
P{jX~ > jWa " for some c:
and some j
1, ...
bO + h j < C < k
,.R.I.H
}+
J O,c
1 - (l_a")5/,
= a.
(2.4.24)
Thus the Type C test consists in using the same critical value at all
stages of censoring for the same batch, but different critical values
for the different batches.
2.4.4. Asymptotic Null Distribution of Type D Test Statistic.
By similar
arguments as in Section 2.4.2 under H '
O
D
.D
+
J c
X2.d' b0 + h j <
C
< k.
(2.4.25)
J c
Also,
k
I
.D, .D > 0, V c and
J c -
c=bO+h j J c
k
I
c=b
. 0+h j
so that {jD :
c
b
O
+ h
j
jd, J.d c > 0, V c,
c
< C < k} forms an asymptotically independent set
of chi-square statistics.
Asymptotically, then,
< .. b. ,V b + h . <
O
JC-Jc
J -
P{.D
n
k
+
c=bO+h.
.
(2.4.26)
C
<
-
kl jO
H }
pLD < .b. I.H
}
J C - J c J o,c
J
(2.4.27)
= .X2 d
a"
j c' j
we obtain
b +
, '0
h j. _< C ._< k·, (l-NII)
u.
38
P{.D
J
2. X2 d a""" b O + h J. <
c
j c'j
=
Further i f I-a
P{ .D
J c
C
2.
(2.4.28)
kIJ.H } + I-a".
O
(l_a,,)i
2
2.
X do." , ,,, b 0 + h . < c
J
j c'j
+
(l_a,,)i
2.
I
k, " j = 1, ... , i H }
O
= I-a.
(2.4.29)
Consequently, the Type D test amounts to comparing successive differences
{.D:
b
{x2 d
a""
J c
j
O
+ h. 2. c 2. k; j = l, ••. ,i} with the respective critical values
J
b O + h . < C < k; j = 1, .•. ,i} for an early rejection.
J
c' j
2.4.5. R,-estrlctions on the Probability Structure with BAM.
With BAM, we
test the null hypothesis of homogeneity of the r samples, the batch
arrivals being assumed to have uniform pattern for every sample.
not unrealiwtic
L~
It is
further assume that the batches within the same sample
have the same underlying distribution.
As such, we may impose additional
restrictions on the probability structure of the model as
IT..
1.J , t
= IT1.J,
.. , t'
j < j', t
=
l, ... ,k."
J
"i,
(2.4.30)
apart from the restraints on IT .. t under H '
O
1.J ,
2.4.6. Some Special Cases.
With a single batch in the one-sample
situation the sequence of D statistics will uniformly have one degree
c
of freedom, assuming the general model.
When the model is unspecified,
i.e., when parametric dependence is not assumed in the model, we
encounter the case of the contingency table (with single batch).
With
r samples D will uniformly have r-l degrees of freedom, for every c, in
c
a contingency table situation.
In the two-sample problem of Mantel
(1966), D will always have one degree of freedom.
c
39
2.4.6.1. Likelihood Ratio Test.
Let us consider the simple 2 x (k+1)
contingency table situation, i.e., a two-sample, single-batch with
unspecified model situation.
We would now like to prove that the like-
lihood ratio test statistic U based On the first (c+l) categories is
c
monotonically non-decreasing in c.
Since there is only one batch we drop the subscript 't' and display the experiment as in the table below:
Sample
Categories
2
1
Sl
nIl
n
S2
n
n
Total
21
n· l
c+l
c
n
12
n
22
n
n· 2
n
lc
n
2c
n
·c
l,c+l
2,c+l
·c+l
c+2
Total
*
n*
2,c+2
n*
·c+2
n
n
1,c+2
n
l
2
n.
*
*
In consistence with the notations used earlier, ni,c+l
= n
l,c+l + n.1,C+2'
i
= 1,2.
U
c
and U +1 are, then
c
n
c n
.j
"IT
(..3)
. 1 n
U
2
c
*
t· c+1 )
*
·c+l
n
= J=
c
n
n ..
TI TI<--!J.)
i=l j=l n i
n·.
n
(
i=l
1J 2
(2.4.31)
*
n
n*
1 i,c+l
i ,c+)
ni
*
c n.. .J n. +l ·c+l n.* +2 n ·c+2
{ c ,
( c )
TI(-2)
n
n .
= ·j=l
n
n
n
(2.4.32)
n*
n*i,c+
( i,c+2)
n.
J
1
Letting now, xl
= nl,c+l'
*
x 2 = n*
l ,c+2' Yl = nZ,c+l' YZ = n 2 ,c+2'
40
- (x+y)~n(x+y) + x~nx + y~ny - xI~nxI
- (y-y
I
)~n(y-y
I
)
(2.4.33)
Treating now the cell frequencies of the first c cells as fixed, and
consequently x and y as fixed, we take Rc as a function of (xI'YI)' Le.,
(2.4.34)
and consider the nature of this function for variations in xl and YI'
We note that
(2.4.35)
dRc >
YI > xl
and --- = 0 according as -= -dX <
Y < Xz
dRc >
--=0
dY <
,
Z
I
Y < xl
= -YZ > X z
I
according as --
I
Let now
(2.4.36)
Then by chain rule
dR
c
as =
dR
dX
c
I
(2.4.37)
41
dR
dR
c
c
Since, when --- is positive --dX
l
dYl
dR
dR
is negat i ve an dv~ce
'
versa,
aR
aXl
c
c
is positive when --is positive and negative when ---
dX
dRc
~e
a
l
Yl
xl
vanishes only when - = Y2
x2
i.e., when e
Consequently, R has a unique extremium at e
the maximum since, for example, at xl = 0, R is
c
identically vanishes, which implies U
c
= Uc+l '
is negative.
= 1.
But this cannot be
1.
c
c
ae
00.
But at
e=
1, R
c
Since for all other
values of e, U + > Pc' U is monotonically non-decreasing in c.
c l
c
This result may not extend to the BAM situation.
However, by
Neyman (1949), we know, for large n,
A
2tnU
c
= Xc2
+
0
p
(2.4.38) .
(I).
Thus, in large samples, we are assured that {2tnU } is a monotonically
c
non-decreasing sequence in c, stochastically.
2.5. Asymptotic Distribution Theory of the Proposed
Test Statistics under Local Alternatives and
Power Considerations .
2.5.1. Consistency of the Test Statistics.
We have under H
F(C) (e)
O,c
m F{C) (e) T4. 0,·
· d a 1 ternat i ves H
=,
c , and un der t h e f ~xe
O V 1 ~ m ~. q ()
l,c
m I
f or at 1 east one m. Let IT ij,t = IToij,t under HO,C an d IT i j , t=iITj , t un
under HI
,c
' i
= 1, ••• , r; j = 1, ••• , t; t = 1, ••. ,c.J • Then ,under the
Cramer regularity conditions modified by Birch (1964), ITij , t(~c) .Il.
IT~j, t
API
when H
is true and IT .. tee ) + IT .. t when HI
is true. Therefore, as
O, c
~J ,
-c
~J ,
,c
_lA p
n gets large, under Hl,c' n X~ + Ec ' where Ec(>O) is a constant depending on c and n is the combined sample size.
A
Asx2
. ..
. . is
r (k + ••• +k ) -s+q,ex.
t
l
0(1), th~ probability ofX 2 exceedingx 2 .
tends to one as
c
r{kl+ ••• kt)-s+q,~
42
n
+
00.
By similar reasoning, the powers of the other three tests, viz.
Type B, Type C, and Type D, can be shown to go to one as n
+
00.
Con-
sequently, in order to compare the asymptotic performance of the four
statistics, we consider only local alternatives.
A
2.5.2. Asymptotic Distribution Theory of Xc2 •
Let under H
and local
O,C
alternative H(n)
l,c
H
F(C)(8)
H(n).
l,c'
F(c)(8)
O,C
m
m
~
~
m ~ q(c),
= n -~em' V 1
lim n ij = Q
n
ij'
n~
~
r
N
1
°<
°
°
} and
j +l
(2.5.2)
Q•. < l.
1J
and H(n) as
reparaphrase H
l,c
O,c
For convenience
(2.5.1)
m ~ q(c),
J
*
n {.c.
L
L
Ln.
+no j
i=l j=l t=l iJ,t
,c
° for at least one m; n
where em f
~
0, V 1
°
°
°
°
11.. t = 11.. t (8 ), 80' = (8 l , ... ,8 s ) and
1J,
1J , ~
H
O,C
or
H
O,C
°
11 0 ,c =II .. t(8),
1J, ~c
ij, t
(n)
H(n).
11 1J,
.. t
e0'
~c
(8 1 c , ... ,8 s-q ()
c ,c )
=
°
11 ij ,t + n -~0ij,t and
. 1, c·
0, V 1
j
Note that for t
=
I, ... ,5/"
t
~
m
~
q(c), i
I, ... ,c .
j
l, ...
,r,
(2.5.3)
43
*(n)
*0
-~
IIij,c.+l = II.~J. ,c. +1 + no.~J. ,c. + l'
J
J
*~J ,c. +1
0..
J
c.
J
c.
J
= -
J
2
t=l
0 .. t' since
~J,
2
t=l
We are assuming that at least one
0 .. t +
~J,
c.. , t
~J
* +1
c..
~J,Cj
is non-zero.
= 0,
(2.5.4)
Vj.
Then by Theorem
3.1 of Diamond (1963) under the regularity conditions (a), (b), (c),
(d) of Mitra (1958) and (e) and (f) of Diamond (1963), under H(n)
l,c
D
A
x~ -+ X~
(2.5.5)
G' c = b, ••• , k,
c' c
where
A
Let A* be the event that Xc2 > Wa for some c:
b <
C
< k.
Then the
.*
power P(A ) is given by
A
*
P(A ) = p{X 2 > W
c
= P{~~
where
a
for some c:
> WaIHin)} -+ I-a,
6 is the error of type
(2.5.7)
k
n
II and Hi ) = ()H(n).
c=b1,c
Consequently, by
this· testing procedure both the type I and type II errors remain the
same as those of the overall test.
disjoint events:
Now, A* is the union of the following
44
A
~
- {x~ >Wa }
A
A
2
~+l - {x~ 2 wa' Xb+ l > wa}
A
A
-
c
A
A
{xb 2 wa'
A
2
2
, 2
Xb+l -< W
< W
a ,···,X c 1- a Xc > Wa}
A
A
- {x~-l 2 wa' x2c > wa}
A
~
A
- {x~-l
2 wa' X~
(2.5.8)
> wa}.
Let T be the time-point at which the experiment is terminated.
e
event T
e
= T is the event A
c
c
.
The
Therefore, the expected stopping time
with this test is
k
'[
A
= E(T
e)
=
».
T P(A ) + T(l-P(A*
c
c
I
c=b
(2.5.9)
The probabilities of the constituent (disjoint) events of A are given
by:
A
P(~) + p{X~ > Wa }
00
[
I
u=o
e -Ab (Ab) u
ul
(2.5.10)
l;;2U+d~x)dX,
a
where
A
b
=
~
G and l;;
b
2u+d
(x) is the density
b
of the central chi-square
distr~bution
with 2u + db d.f.
For c > b,
A..
A
P(Ac ) = P{X~_l
2 wa'
~ > wa}·
(2.5.11)
A
By Searle (1971, pp. 60-61), X~-l which is asymptotically distributed as
A
2
Xf
G
c-l' c-l
A
2
is asymptotically independent of D = Xc2 - Xc-l
which is
c
45
asymptotically distributed as X~
c'
A
(G -G
)
c c-l
= Xd2
*
G'
If now
c' c
A
2
2
Xc-l
-< X -< Wa , then Dc > wa - x => Xc > w.
a
Thus,
P(A )
c
(2.5.12)
w -x
a
where
Ac
=~
Gc and A*
c=·Ac - Ac- l'
2.5.3. ASymptotic Distribution Theory of D.
c
It has already been proved
in Section 2.5.2 that under local alternatives H(n)
1
Dc
Letting X2
d ,a
c
~ X~
G*' c = b, •.• ,k.
c' C
(2.5.13)
,= Yc , we have
I-power = P{D
< Y , V b ~ c ~ kIHl(n)}
c -
c
-A *
e
u
P+b(A*+b) p
,P
up •
]
Z;;2 +d·.
(X)dX.'
u P P+b
(2.5.14)
Let B be the event that the experiment terminates at the censoring stage
c
c.
Then
46
(2.5.15)
These are the mutually exclusive subevents of B* which is the event that
Dc > y c for some c.
We compute respective probabilities as
00
00
P(B )
b
I
-+
u=o
PCB ) -+
c
f
Yb
-A
e
rtl I
p=O
u =0
P
oJ,
b(A:)u
u!
s2u+d (x)dx
b
(P+b
u
-A *
P+b(A* ) p
e
p+b
0
-A
00
00
I
u=o
f
1;2u +d
up !
e
Yc
p
p+b
(Xld X]
oJ,
c(A*)U
c
s2u+d (x)dx.
c
ul
(2.5.16)
k
*
From the above we obtain E(T ) = lB = I T PCB ) + T(l-P(B
C
c=b c
c
».
2.5.4. ASymptotic Performance Properties of the Type C Test.
Define
non-centrality parameters as in Section 2.5.2 but for individual
batches,and index them by j.
For example,we denote by .G
J c
the non-
centrality parameter associated with the asymptotic chi-square distribuA
tion of the test statistic
and .A
J c
and
of
j
.X 2 at the time-point T for the j-th batch
J c
c
~.G under local alternatives .Hl(n).
J c
J
,c
Note that both .H
J O ,C
H(n) are monotonic, and, as in (2.4.19) and (2.4.20), we conceive
j l,c
H(n) and Hen) as
1
1
j
H1en) -
(2.5.17)
47
The power of the test is expressed as
I-power
<
C
< k and V j
- A
jWall _e_J_'_k_(>oL.'A...=k:;...)_u
f
J
o
.
e!
~2 + f (x)dx
U
]
(2.5.18)
j k
The event E*
j
l, •••
,~
consists of the following mutually exclusive events:
A
Ec
A
- {jX~_l .::. jWa for all I
S-
j
S- ~c-l' and jX~ > jWa " for some
(2.5.19)
~c
Note that there are only
j's which satisfy b O + h <
j
C
< k.
The
corresponding probabilities are given by
00
P(E
) +
bO
I
u=o
J:
lW/V'"
'-"
e
- A
I b O( A
)u
lb O
'
u.,
~2u+ db (x)dx'
1 0
-.A 1
e J c-
(.A
J c-l
P(E )
c
u.!
J
)
u.
J
~2u.+.f c-l.. (X)dX]
J J
A*
* V.
-j c(.A ) J
e·
] c
v.!
J
~2"
.+ .d
J J c
,(Z)~J1
.
(2.5.20)
e
48
Expected stopping time
T
C
is then
k
T
(2.5.21)
L
C
c=b
o
2.5.5. Asymptotic Performance Properties of the Type n Test.
X~d '.0'.'" =.y
J c J
J c
Let
(2.5.22)
•
Then
I-power = P{.D
< .y , V j = l, ••. ,i and V c:
J c - J c
pLn
J
The event
t~at
b
O + hJ. ~ c ~ kIHl(n)l
< .y I.Hl(n)}
c - J c J ,c
the experiment terminates at some time-point T
c
consists of the mutually exclusive events
Fc
- {.n
1 < .y 1 for all 1 ~ j ~ ic_l'
J c- - J c-
.n
< .y , for some j =l, ... ,i},
c
J c - J c
(2.5.24)
The expressions for the associated probabilities are 6iven by
-1\*
e
O(lA*)u
b
O
u!
S2u+ db (x)dx
1 0
49
P(F ) +
c
1-
[
t [
T1
('
I
00
j=1 uj=o
The expected stopping
f'
.y C
*
]]
0 .
--~- 1';2u.+.d (x)dx
u. !
J J c
tim~
Tn
J
J
*
e -.A
J c( A ) u.J
j c
•
(2.5.25)
J
can now be computed as before.
2.6. Concluding Remarks
For grouped data, Ghosh (1973) has extended the results of Sen
(1967) in deriving a class of conditionally distribution-free rank order
tests having some asymptotically optimal properties.
Extension to PCS
will be studied in Chapter III.
The proposed test procedure can also be extended to the multidimensional categorical data situation, where the main response is
time-sequential and .at each time-point of censoring, responses with
regard to the other categories are complete.
Following Bahadur (1961),
if it is possible to derive an approximate lower dimensional
representa~
tion of the joint probability function, we can start censoring at an
earlier stage for purposes of tests under PCS.
Sugiura and Otake's (1973) test for regrouping a 2 x (k+l) contingency table is similar in construction to our Type B test.
can be used for application in the context of PCS as well.
Hence, it
CHAPTER III
RANK ORDER TESTS FOR GROUPED DATA UNDER PROGRESSIVE CENSORING:
SIMPLE AND MULTIPLE REGRESSIONS
3.1. Introduction
Sen (1967) obtained asymptotically most powerful tests for
simple regression against one-sided alternatives for grouped data with
the underlying distributions being continuous and the form being known
but the random variables observable in a finite or denumerable set of
intervals.
If the underlying distribution is continuous but its form
is not kno\TI anc we work with general scores, the test is distributionfree (conditionally) but not asymptotically most powerful, its effidency depending on the assumed score function.
The current investigation
generalizes the results of Sen (1967) and develops suitable progressively
censored tests based on rank order statistics.
Ghosh (1973) extended the results of Sen (1967) to the multiple
regression situation and developed a class of nonparametric tests which
are asymptotically optimal in Wald's (1943) sense, assuming that the
form of the true density function is known.
If the density is not
known, the test is not optimal in the sense mentioned above, and its
efficiency depends on the assumed density.
By generalizing the results
of Ghosh (1973), suitable tests are derived here under progressive
censoring procedures, making use of general scores.
51
Section 3.2 deals with the models and assumptions.
are discussed in Section 3.3.
The problems
Section 3.4 presents the proposed tests
while Section 3.5 gives some preliminary results.
The asymptotic
distribution theory of the test statistics under the null and contiguous
alternatives is derived in Sections 3.6 and 3.7 for simple regression
and in Sections 3.8 and 3.9 for multiple regression.
Some related
numerical computations will be displayed in Chapter V.
3.2.
~odels
and Assumptions
3.2.1. Simple Regression
We consider as in Sen (1967), a sequence of random vectors
X~ =
(Xvl' .•.
'XvNV)
of Nv independent random variables, where XVi has
a continuous distribution function FVi(x), i
= 1, ... ,Nv ,
1 < V <
00,
N > 1, given by
V-
FVi(X)
= F(cr-1 [x-a-ScVi ]); i = 1, ... ,Nv , 1
~
v <
00,
(3.2.1)
where a,S cr(>O) are real parameters and (cVl, •.. ,cVNv) are known
constants.
parameters.
S is the parameter of interest and a and cr are nuisance
We also make the following assumptions:
(3.2.2)
(3.2.3)
MaxI < i < N. c~/C~ = oJ1).
-
f(x)
= dF(x)
dx
-
v
and f'(x)
= df(x)
dx
exist almost everywhere and
00
f [f'(x)/f(x)]2 f(x)dx
_00
=
A2 (F) <
00.
(3.2.4)
52
Since we are concerned with general scores we work with assumed
density g(x) rather than with true density f(x); therefore, we assume,
in addition,
g(x)
= dG(x)
and g'(x)
dx
= Eg(x)
dx
exist almost everywhere and
00
I [g'(x)/g(x)]2
g(x)dx
= A2 (G)
<
(3.2.5)
00.
-00
3.2.2. Multiple Regression
We replace (3.2.1) - (3.2.3) as follows:
FVi(x)
F(cr -1 [x-a-S'cv* .]), i
= 1, ... ,N , 1
- - ~
V
< V <
-
00,
Nv
~
1,
(3.2.6)
where S' = (Sl' ••• 'S ) is the vector of regression coefficients which
-
p
are unknown and of interest; a and cr(>O) are real nuisance parameters;
c*
-Vi
,
= (c *
,c * ., ••• ,c*.),
l vi 2V~
pV~
stants.
i
= 1, ..• ,Nv , are vectors of known con-
Letting
*'
~Vl
*'
~V2
R
-V
N xp
V
c
*'
-V,NV
(3.2.7)
we assume
53
N
V
*
L cmVi
= 0, m = l, •••• ,p,
i=l
~
v-> 1,
N
V
*2
L cmVi
= 1, m = l, •.. ,p, N > 1,
v-
i=l
max
max
1<i<N
l,::.m.8>
--v
* I
Icmvi
= 0(1).
(3.2.8)
Further,
(3.2.9)
where A and A are positive definite i.e., of full rank p.
-v
We note
that A = 1, V l,::.m.8>.
mm
3.3. Formulation of the Problem
In a time-sequential experiment, consider k ordered intervals
I1, ••• ,I
k
in which the responses are recorded and the (k+l)-st interval
denoted by I *+ , which includes the censored cases.
k l
Sen (1967) con-
sidered the case where the sequence {I.} is countable, but for our
J
purpose we treat {I } to be a finite set, since practical considerations
j
may not allow the experimenter to continue the experiment beyond a
certain point of time.
Let·-oo
= al
< a
2
< ••• < a
k
< a + < .•• <
k l
ordered points on the real line [_00,00].
00
be a finite set of
Then the {I.} are defined as
J
(3.3.1)
I k*+ l - (ak+l,OO] .
(3.3.2)
54
Conventionally, for every £ < k
* = 1£+1 u 1£+2 ... u 1k+l ·
1£+1
As in Sen (1967), the random vector X
~V
related to the observable random vector
*
~V
(3.3.3)
=(X l""'X
V
V, N
)' is
V
*
*
= (XVl"",XV,N )' as
V
=
\'k+l
1 I.Z .. , i
J=
J ~J
L.
=
1, ... ,N ,
V
(3.3.4)
where Zij is the indicator variable such that
Z •. =
~J
I, i f X .
v~
{
I.,
E:
J
(3.3.5)
0, otherwise, for all i = 1, ... ,Nv,j = l, ••. ,k+l.
We further define
N
\,V
L.
Zij = Nvj' j
:::i
1, ... ,k+l,
(3.3.6)
i=l
so that N . is the observed frequency of the j-th interval I. and
k+l VJ
J
N" =
v
LN .•
j=lVJ
F
v,l
= 0;
F
V,j+l
=
(3.3.7)
3.3.1. Simple Regression
Letting
....0
6 .
vJ
*3.3.8)
=
0, otherwise,
where ~o(u)
= -f'(F-l(u»[f(F-l(u»]-l,
55
0 < u < 1, we have, following
Sen (1967), the test statistic
N
V
=
L
(3.3.9)
i=l
which provides asymptotically most powerful rank order tests for the
hypothesis:
H :
O
B=
0 against the set of alternatives
If we work with assumed density function g(x)
B>
0 (or
= G'(x)
B<
0).
(3.3.10)
we similarly
define
(3.3.11)
li .
vJ
0, otherwise,
where
~(u)
= -g'(G -1 (u»
decreasing in u.
[g(G
- 1 · -1
(u»]
,0 < u < 1, and </>(u) is non-
Then we test the hypothesis given by (3.3.10) with
the test statistic
SV, k
NV
= I.1.= 1
k+l
iI·l
..
J= liVJ,Z 1.J
A
cV
(3.3.12)
Henceforth, we shall use general scores based on G for purposes of test
construction, on the assumption that the true distribution Fis not
known.
56
3.3.2. Multiple Regression
Let
l, ... ,p,
U
mV,k
U'
" k)'·
-V,k = (U l v, k', ••• , Upv,
(3.3.13)
Ghosh (1973) proposed the test statistic
=
for testingH :
O
2
{(k)
})
J
= ~against
the alternatives H :
v
is defined in (3.5.21).
B (F , I.
V
~
(U'
A-lU
) B- 2 (F· {l(k)})
" k -v
" -v,
"k
-v,
V, J. ,
A
that when we replace
~Vj
A
by
O
~Vj
(3.3.14)
S ~ o where
Ghosh (1973) also has shown
in (3.3.13) (i.e., the true density is
known), the test provided by (3.3.14) is optimal in Wald's sense
asymptotica~ly,
,nder certain regularity conditions.
The test statistics Sand M k are used if the experimenter
v,k
v,
intends to wait until the end of the experiment for a decision.
How-
ever,' for an early decision, we would like to adopt suitable PCS
procedures which are developed in the next section.
3.4. Tests Under PCS
Under the progressive censoring scheme, the experimenter would
like to know, as the experiment progresses, whether tJ ,e data collected
up to date provide sufficient evidence to reject H or continue with
O
the experiment.
statistics {S
V,c
Thus one deals, in this case, with a sequence of
} or {S
} where S
or S"
is constructed similarly
.
-V,c
v,c
-v,c
as in (3.3.12) or (3.3.13) but based on the class-intervals ll, ... ,I '
c
* (which includes the censored individuals only).
and lC+l
For the sake
57
of clarification, we detail the following situation at the time-point
T = a c+l :
Categories
ClassIntervals
1
c+l
c
2
I r ='(_00
., a 2 ]
... 1 c =(a c ,a c+1]
N
Observed
Frequencies:
V
r z.
Vc i=1
... N
=
*
Ic+l(ac+l'oo]
F
=0 F
=
v,l . v,2
-1
Nv
F
=
v,3
F
v,c+l
=
=
r
V
Z
*
V,c+l 1=1 i,c+1
1C
Empirical
Distribution
Function
N
*
N
F*
=1
v,c+2
c
-1 ~
N- l \
NV1 Nv ,2 Nv,Q,
,Q,=l
i~l
V
Score
NVi
A*
II
II
VC
v,c+l
•
(3.4.l)
There are two representations for Z.* +1'
1,C
We shall use either of them
as convenient.
• •. v Z. ,
1C
*
Zi,c+l
=
where Zil v Zi2 v ..• v Zic
= max(Zil""
• •• u Zi,k+l' i
,Zic); or
= 1, ••• ,Nv •
(3.4.2)
A*
II
is defined as
v,c+l
A*
II
=
v,c+l
1 [1t J-l
[HU)dU
v,c+l
du
(3.4.3)
v,c+ '
We now proceed to develop the test procedures for simple and mUltiple
regression situations.
58
3.4.l.PCS Tests for Simple Regression
Based on the first c+l intervals, the statistic S
V,c
is given by
N
V
\' c . f \'~ lz .. i
* l~A* -lJ·
L v 1 LJ -- 1J~Vj + Z.1,C+
i=l
V,c+
S
V, c =
(3.4.4)
Note that the unconditional distribution of Sv,c' even under H ' is
O
dependent on the unknown distribution.
However, under the same permu-
tational argument as in Sen (1967), its conditional distribution under
H does not depend on F.
O
Leaving out the details, if P
v be the
permutational probability measure defined in Sen (1967), we have
-1
j
*
-1
E(Z . . IP )
1J v -
N
v,c+
E(Zi'Z.,
.i 1 J
-
= {NVjNV '
,11.' )
;V
= 0, j
= l, ••• ,c,
IN, j
V
=f: j '
i
= 1, ••• ,Nv ,
= c+l, i = 1, ... ,Nv •
(3.4.5)
v'
(3.4.6)
= 1, ..• , c+1,
(3.4.7)
1, ... ,c+1, i
1, ... ,N
-1
E(ZijZi'j,IPv)
=
N~1(Nv-1)
NVj(NVj ' - 0jj')
i =f: i'
= 1, ... , Nv ,
j, j'
where 0 .. , is the usual Kronecker delta.
JJ
Note that in (3.4.6) and
*
(3.4.7) we have not differentiated Zi,c+1
from Zi,c+l'
Let
S
v,c
where t(V)
c
-1
= Var(Sv,c IPV)~[Var(S v, kiP)]
V
(3.4.8)
indexes the sequence of
statistics { ~v(t.(v) ) }k. l'
J
J=
Under progressive censoring procedure, we suggest the following test
statistics:
59
max
l<c<k
for testing H :
O
S=0
against one-sided alternatives
V
and
~.
free.
S=0
S>
0 (or
S<
= max Ii; (t(V))I.
L
for testing H :
O
0.4.9)
l<c<k
V
(3.4.10)
c
against alternatives
0).
S~
O.
+
Under P • both Lv
v
being functions of tied ranks. are conditionally distributionTherefore. for any pre-assigned level of significance a(O<a<l).
there exist numbers L+..
v.a
aV
= P{LV
and L
v.a
> L·
v,a Ip}
V
such that
< a < P{L
-
> L
V -
v.a Ip}.
V
(3.4.11)
Note that derivation of exact conditional distribution of L~ and Lv on
permutation principle requires knowledge of the cell frequencies
= l •.•.• k+l.
NVj ' j
But these frequencies are not available until the
end of the experiment.
as N
v
+
00
However. it will be shown in Section 3.6 that
+ and a approach a and L+ . and L
v
and k is large. both a
v
v.a
v.a
approach known points of regular functionals of Gaussian process on
D[O.l].
+ and Lv
On the other hand. if k is not large. tests based on Lv
will be conservative. though valid, even in large samples.
The pro-
gressive censoring test procedure consists in comparing, as the
experiment progresses. i; (t(V)) or Ii; (t(V))1 against the 100(1-a)th
V
c
Vc
percentile points of the asymptotic distribution of L~ and Lv
respectively.
If for some c:
1
~ c ~ k.
i; (t(V)) ·or Ii; (t(V))1 exceed
V cV c
these critical values, we reject the null hypothesis at thec-th
censoring stage; otherwise. we continue with the experimentation.
If
60
no such c exists we decide in favor of the null hypothesis.
3.4.2. PCS Tests for Multiple Regression
Following Ghosh (1973), we may construct the sequence of test
statistics {M
} based on the first (c+l) class-intervals, where
v,c
U'
(U
~v,c
N
u.
lUV,C
=
I
:v
*
c.
i=l
l V,c
, ••• , U
pV,c
),
[c~lVJ'" .z ..
J
"'*
*
l, ... ,p,
1.J + f:, V,c+lZ.1.,C+1' m =
f:,
mV;L j
where B2(F",{IJ(.c)}).
v
1.S as d e fi ne d·1.n (3521)
••
.
(3.4.12)
It may be observed that
the monotonicity property of {X 2 } developed under PCS and productc
multinomial mode' in Chapter II or the martingale property of {S
v,c
}
under the simple regression model as will be proved in the next
section of this chpater, is not apparent in the case of {M
}.
V,c
There-
fore, these cannot be utilized as such as a basis for construction of
suitable PCS procedures.
However, by (3.2.9),
~V
is positive definite
(symmetric) and hence there exists a non-singular matrix
~V
such that
pxp
(3.4.13)
where D'
~V
= E'
~V
R'
~V
The diagonalization of
~V
*
~Vi
is achieved through the transformation
=
*,
(d *
lVi ' ... ,d pVi )
=
(c,,~E,,)'
*
~v1.~v
=
*
E~c".
~v~v1.
(3.4.14)
61
In that case
S'(E,)-1 E'
- -v
-v
*
':Vi
= T'
d*
-v -Vi'
(3.4.15)
where
~V = E-lS
= (T l V , ... ,T pV )' .
-V -
(3.4.16)
._
We note that H :
O
~V
=
~
S=
> H :
O
0 and vice versa.
By virtue of
(3.2.8) it is trivial to show that
~V
L d* . = 0
i=1 - v 1.
0.4.16a)
Further,
= -mV-Vi
e' c *
*1.
= (emV·1,···,e"
mvp )C- V·.'
(3.4.17)
and
max
1<i<N
--V
max Id* .1< max
l<m<
mV1. - 1<i<N
-~
r
max \c*.1 (max
Ie ".1) = 0(1).
l<m<n mV1. 1_<m<n j=1 mvJ
--V:..J"
(3.4.18)
:..J"
*.
At the time-point T = a + ' we have c+1 class-intervals !l, .•• ,I c+ l '
c l
Let as in Ghosh (1973),
= NL d *V' LC
. L
V
5
mV,c
i=l m 1. '=1
m
A
f::,
,z .. +
vJ 1.J
= 1, ..• ,p,
c
= l, ... ,k,
' ••• , 5 V· )', c = 1, ••• , k.
5
= (5
-V,c
p ,c
1V ,c
(3.4.19)
(3.4.20)
.-
62
Let also
... ,
~. (t(V»),
pV c
'
c=i, ••. ,k,
(3.4.21)
where
1
= S
.
rvar(s" IP
mv,c!_
mv,c
V
)J-"2,
m
l, ... ,p, c
= I, •.• ,k,
(3.4.22)
and t(V)
c
= Var(S. mV, c IPV ) [Var(SmV , kiP)]
V
vector-statistics
-1
indexes the sequence of
{~(t(V»}. We note that t(V) is independent of m,
-v
c
c
for reasons which will be evident from results in the next section.
For tests under progressive censoring, we propose the statistic
(3.4.23)
Under P , M is conditionally distribution-free.
v
v
As in the simple
regression problem, derivation of exact conditional test of M demands
v
complete knowledge of the frequencies of all the k+l class-intervals and
hence is not feasible under the set-up envisaged here.
be shown in Section 3.8 that as N
v
-7-
co
and k is large, M converges in
distribution to that of a regular functional of
D[O,I].
However, it will
v
Gauss~an
process on
On the other hand, ifk is not large, test basedonM will be
v
conservative, even in large samples, as in the simple regression case.
We develop progressive censoring test procedure similarly as in
the simple regression problem.
63
It may be mentioned that the non-singular transformation on
c
*
.
. 's satisfying (3.4.13) is not unique.
mV1
Consequently the test
statistic M is dependent on the particular non-singular matrix E
~v
v
chosen for the purpose.
Alternatively, we may develop another
statisticM* which remains invariant under the type of transformations
v
which satisfy (3.4.13).
U
mV,c
:::::. S
*
mV,c
Let
Nv
=
* .
L cmV1
i=l
m = l, ...
S
*
~v,c
(S1
v,c , ... ~S pv,c )',
,p,
c = l, ... ,k,
c = l, ... ,k,
(3.4.24)
(3.4.25)
1
fvar(s*v IPv~-~' m = 1, ..• ,p, c = 1, ... ,k,
m ,cl
m,c
"j
S*v
(3.4.26)
(3.4.27)
We now define
(3.4.28)
Noting that Var(S *
IP ) = Var(Sv
IP ) for every m and every c (as
mV,c V
m ,c v
will be clear from the derivation of results in the next section) and
by virtue of (3.4.13) and (3.4.14)
(3.4.29)
64
which shows invariance property of Mv* under any non-singular transfbrmation on c * .'s satisfying (3.4.13).
mVl.
We, next, derive some preliminary results in the following section.
3.5. Some Preliminary Results
We introduce, first, the following notations as in Section 4 of
Sen (1967):
ex] cr -1 ), J.
F .. = F( [a. J
J
Fc +2
= F*c+2
J
c
=1
- Fc +I ' c
= I, ... ,c,
=
I.~
1
[
6. c +1
c+l
o
c
= I,oo.,k,
(3.5.1)
I, ... ,k.
0, otherwise, j
0*
1, ... , c+1, c = 1, •.• , k,
1, c = 1, ... ,k,
F., j
P*+1
=
= I, •.. ,c,
*-1
*
] + ' P + >
(u)duP
c 1
c I
c
= I, ... ,k,
°,
.
=
0, otherwise, c
=
I, ..• ,k.
0, otherwise, j = I, ... ,c, c = I, .•• ,k,
(3.5.2)
65
0, otherwise,
-
C
~.
j~ 1
C
~0.2p.
J
J
= l, ••• ,k.
+
(~o
*
C+I
)2
(3.5.3)
*
P C+1 '
C =
l, •.• ,k.
(3.5.4)
C
=
~ . 62. p.
L
j=l
D
J
6*2*
+ Dc+lP c+ l ' c=l, ••• ,k.
J
(3.5.5)
(3.5.6)
P({I (c)})
j
= C(F ,G, {I~c)})A
-l(F, {r ~c)} )B-l(G, {Ij(C)}) ,
J
. J
(3.5.7)
c = 1, ... ,k.
Let the a-field generated by the indicator (random) variables
{z 1J
.. , 1 -<
j < c, 1 < i < N ) be denoted by
V
p(c)
V
By this definition
empty set.
= p(C)(z'
V
ij ,
p~O)
1 < j < c, 1 < i < N).
_.
- V
(3.5.8)
constitutes the null a-field generated by an
We also define a sub a-field Q(c) of p(c) as
Q(C) =Q(c)(Z
V
v·· ij'
V
V
1 < j ~ c, 1 < i < N ; and N
-
-
-
V
+l, ••• ,NV"k).
v ,C
.
,
(3.5.9)
In the following two sections we derive some results relating to {S
v,c
}
66
3.5.1. Simple Regression
a,
Theorem 3.5.1.
Under H
FCc) ~ 1 -< c -< k} is a martingale for
{S
V,c' V
V
N (>1).
V-
Proof:
(3.5.10)
where E represents expectation over (N
V,C
+l,···,N
v, k).
It will suffice
if we can prove that
E(S
/Q(c) H )
v,k 'v ' a
S
c=l, ..• ,k.
v,c'
(3.5.11)
Now
Nv
C
E(S
/Q(c) H )
v,k V
'
a
=
N
A
I I c .Z .. ~ .
j=l i=l V1 1J VJ
k+1
V
+
A
()
I C"i j=c+l
I {E(Z1JV
.. ~ j/~'C
i=l
v
v
,Ha)}.
(3.5.12)
But conditionally on
,Ha)
E(ZiJ'/'Q,~C)
v
Q~c) '~Vj
is fixed, V j
c+1, •.. ,k+1 and
= N*-l
N (1 Z y
Y'Z ,)
v,c+1 Vc - i1 .•• ic
*-1
*
= Nv, c+1 NVc Z1,
. C +1' i = 1, .... , NV , j = c+1, ... , k+ L
(3.5.13)
Substituting (3.5.13) on the right-hand side of (3.5.12), we obtain
N
V
C
A
I c . I ~ .Zoo
i=l V1 j=l VJ 1J
N,
V
=
N
* +INV
N*
v,c+1
C A V C 'Zi
V1 · , c
L c . L ~ .z .. + i=l
L
i=l v1 j =1 VJ 1J
k+1
N~
I . VJ
j=c+1
Nv
•
VJ
67
N
v
C
N
A
L c i L b. .Z ..
i=1 v j=l VJ 1J
= S, c
v,c
+
Lv c
*
Z.
A*
b.
i=l vi 1,c+l v,c+l
1, ... , k,
k+l
\
N N-l~
remembering that
L
Vj v Vj =
j=c+l
(3.5.14)
1
J
<P(u)du,
F
V,c+l
and
~*
= Jl
v,c+1
<p(u)du [
Fv ,c+1
II ]-1
FV,::I
From (3.5.14), it follows that
Under H ' {S
Q(c) 1 < c _< k}.is a martingale for
V,c' v '
O
Corollary 3.5.2.
Let the a-field generated by (NVl, .•. ,N
vk
) be designated by
Then
(3.5.15)
We now proceed to derive the permutational expectation and
S
v,c
vari~nce
of
•
3.5.1.1. Permutational Expectation of Sv, c
(3.5.16)
A
Since b."j and N . remain invariant under any permutation of the
v
coordinates
VJ
*
of~,
we have
·68
NV
"i
i=l
k+1 A N .
as
~
I
j =1
. NVJ
vJ
V
c .
V1.
k+1
I
~
~
A
'=1
Vj
N ~
.~
0,
NV
(3.5.17)
= o.
From the above result, it also follows that
E(S
IH ) = E{E(S kIB(k), H )}
v,c O
V,
V
O
O.
(3.5.18)
3.5.1.2. Permutational Variance of S
'V,c
Var(S
V,
c
I:')
.)
t
+1A2
I
'=1
+
c+1
I
'.J."
]TJ
~
A
~v·V(Z. ,IB
J
(k)
~
.
~ ., Cov (z .. ,z .. , I B
,H ) .
o
VJ VJ·
1.J
1.J
A
V
+ . N"i
CV·C ·'
i:fi'=l 1. V 1.
+
e+1
I
1.J
(k)
v .. ,R·O)
A
A
[
c+1
L
j=l
.
~.~ .,Cov(Z .•
j:fj'=l VJ VJ
(k)
~V2 .Cov(z .. ,z., .IBv
J
1]
1. J
(k)
,Z.,.,IBv
1.J
1. J
,HO)
]
,H ) ·
O
(3.5.19)
69
In the aboveZ.1. , C+1 and
A*
~V ,·+1
c
have not been distinguished from. Zi*l
, c+
.~
and IJ.V,c +1 respectively, in order to maintain neatness of notation.
By (3.4.5) ... (3.4.7), the right-hand side of (3.5.19) can be written as
+
v
L
N
iofi '=1
.
c.c., [
V1. V1.
c+1
A
L 1J.2.{N
j=l VJ
1
1
.(N .-1)N- (N -1)-
vJ
. VJ
v
v
On substituting these into the right-handsi.de of (3.5.19),
we finally
get
Vp (SV,c)
v
= V(Sv,c IP)
v
(3.5.20)
where
_4t
70
1.
~ 6A2 .N.N-1 + 6A*2 1N*N1
j=l vJ vJ V
v,c+ v,c+ V
(3.5.21)
L
We note that, because of the martingale property of {S
c
= 1, ... ,k},
B2(FV,{I~C)})
is !'in c, c
= 1, ... ,k.
Q(c)
v,c'·v
'
H'
0'
From (2.5) of
Sen (1967), it also follows that
(3.5.22)
c = l , ... ,k.
3.5.1.3. Permutational Covariance Between Sand S.
,c<c'
'V,c ...
v,c , - (k)
Cov(SV ,c ,Sv ,c ,!P)
= Cov(Sv ,c ,Sv , c ,/BV
V
,HO)
= E(S
S
!B(k) H )
V,C V,c' V ' 0
= E{E(S
S
!Q(c) H )}
V,C v,c' v ' 0
But by virtue of Corollary 3.5.2, E(S
Cov(S v,C ,Sv ,c ,!P)
V
(c)
v,C ,!Qv
= E(S2 !B(k)
V,C
V
,H )
O
S
v,c
Therefore,
H)
' 0
= V(S v,c !PV ),c
<
~'.
(3.5.23)
3.5.2. Multiple Regression
We state, without proof, the following results which are trivial
extensions of those relating to simple regression derived in Section
3.5.1.
Theorem 3.5.3.
N (>1).
V-
Under H ' {S
O
F(c)
~V,e' V
'
1 < c < k} is a martingale. for V
-
-
71
Corollary 3.5.4.
{c)
Under R ' { S
,Q
O ",V,c V
1
.s. c
< k} is a martingale for V
NV(>l).
Ep (~V.,c)
0, V c = 1, .•• , k,
V
E{~v,c/Ro) = 0, V
= NV (NV_1)-1 B2 {FV' {I{c)})
V
j
,.
m ,c /Pv)
V{S V
(3.5.24)
c = 1, .•. ,k.
m=l, •.. ,p,
c
c,c'
= l, ••• ,k.
= l, ••• ,k,c
(3.5.25)
< ct.
(3.5".26)
Ip
by (3.4~13).
(3.5.2S) and ;(3.5.26) follow also from the fact that ~~.=
3.6. Asymptotic Distribution Theory of the Test 'Statistics
Under R: Simple Regression
o
We first state, without proof, the following two Lemmas which
follow as special cases of Theorem 2.1 and Lemma 3.1, respectively,
of Sen (1967).
Lemma 3.6.1.
With a finite set of k+1 class-intervals, the likelihood
Nv
k+l
ratio statistic TV*k = I c i L ~~Z .. , provides asymptotically most
,
i=l V
powerful test for R :
O
(3
=0
j=l
against appropriate sequence of one-sided
contiguous"alternatives and under
o such
that (3
V
J 1.J
(3~2.l)
- (3.2.4) and for any finite
= OC-V l
(3.6.1.)
Lemma 3.6.2.
finite (3,
Under (3.2.l) - (3.2.3) and (3.2.5) and for any real and
B2{Fv,{I~c)})
converges in probability to B2{G,{Ijc)})
uniformly in {I~c)}, c = l, .•. ,k.
J
e
72
We consider the k component random vector~ ~v = (SV~1~SV~2~ .•. ~SV~k)'.
For k
finite~
we have the following:
Theorem 3.6.3. Under R and (3.2.1) - (3.2.3) and (3.2.5)
O
asymptotically Nk(~~k)~ where
I is
~V
is
given by
r
. B2(G~{I~1)}) B2(G~{I~1)})B2(G~{I~1)})
J
c- 2I
J
J
B2(G~{I~1)}) B2(G~{I~2)})
J
V-
B2(G~{I~2)})
J.
J
B
(3.6.2)
Proof:
~'
Consider a non-null vector
elements.
=
(~l~
...
~~k)
It will suffice to prove that SV (~)
=
-
of real (finite)
~'S"
_
-v
is asymptotically
We shall establish the proof by means of projection
technique.
As in Sen
(19671~
let us define
N
T
V~c
I
i=l
[Ic
CV '
bJ,Z ij +
~ j=l
*
*
bc+lzi~c+ll ~ c = l~ ••• ~k ~
(3.6.3)
Then by Theorem 2.1 of Sen
VeT
v,c
(1967)~
IRa) = B2(G~{I~c)}), c = l, ... ,k.
J
(3.8.3)
By imposing non-degeneracy condition on G(x) as
Sup. [G. 1 - G.l < 1 with probability 1,
J J+
J
we also have
B2(G~{Ijc)})
E{(S"
v~c
VeT
- T"
IRav)c
V,c
Further, B2 (G,{Ijc)}) is uniformly
> 0 for V c.
bounded by B2 (G) which is finite.
(3.6.4)
From Lemma 3.2 of Sen (1967),
) IRa}
+
0
asV +
00
•
(3.6.5)
73
Consequently. Sand T
.
. v.c
are asymptotically equivalent in mean for
v.c
~
~
Since ~ contains a finite number
every c = l •.••• k [See Rajek (1961)].
of elements. it follows that
= ~~'T V·
TV (~)
~
is equivalent in mean to
Sv(~)'
(3.6.6)
Consequently. we are required only to
prove asymptotic normality of T (~). After rearranging the terms and
v ~
k+l
*
.
noting that Z
=
L Z..• we obtain
i.u+l
j=u+l 1J
Nv
k
k+l
= L c
w.Z ..
L
~ T
u=l u v.u
i=l vi j=l J 1J
l.
N
k+1
V
L
=
i=l
c .
V1
r
. 1
J=
Y' j
1
N
V
=
L
i=l
j-1
k+1
where Y. =
1
(3.6.7)
cViY i •
~
L Y... Y..
1J
u=l 1J
L
u=l
~
/),*
u. u+1
k
+ (
L ~ )/), ..
u=j u
j
J
We note that the random variables {Y • i = 1 •••• • N } are LLd. under
v
i
(3.6.8)
V(Yi/RO) = E(Y~/RO)
k
= E [(
Lw. z.. )
j=l
J 1J
2
IRa]
k+1
k+1
·22
~
= E [L w. Z.. IRa] + E [L w.' \if. ,Z . ; Z. j , IRa]
j=l J 1J
Jtjt J J 1J 1
e
74
e-
=
v
By appealing to the special central limit theorem of Hajek and Sidak
(1967, p. 153), we get that, under H '
O
T (i)
V -
.
~ N(O,i'I i).
-
(3.6.10)
_ -
This concludes the proof of the Theorem.
Corollary 3.6.4.
(T * k'S (i»
V,
V
Under the same conditions as in Theorem 3.6.3,
have asymptotically a bivariate normal distribution.
~
* and S (i), say
We note that any linear combination of T"k
Proof:
v,
V
~
aT * k + bS (i), a and b finite, is equivalent in mean to a linear
V,
V
~
function of N i.i.d. random variables under H .
v
O
By proceeding exactly
along the lines of Theorem 3.6.3, we can prove the asymptotic bivariate
normality of (T * k'S (i»
V,
V
~
under R •
.
O
The explicit expression for the
asymptotic joint distribution is not necessary, since we already know
the corresponding asymptotic marginal distributions.
For the sake of brevity, we shall, henceforth, denote
by B2
V,c
and
B2(G,{I~c)}) by B2 for c = l, ... ,k.
c
J
B2(Fv,{I~C)})
We consider the
empirical process
(3.6.11)
where k(t) = max{c:
2
v,c -<
.B
t B2
v, k } .
Note that by this definition {Wv(t)} is right continuous with left-hand
limit and hence belongs to DIO,l].
{t~V)}~
J
J=l
We also observe that at the points
where according to the definition given after (3.4.8)
75
<t(V) = 1
B2 . B.-2 k' j = 1, ... ,k,0 = t(V)<t(V)<t(V)<
o 1
2
.•. k
'
v,J v,
(3.6.12)
we have
o by
(v)
E (WV(t
j
PV
v, v
= 0, j = l, ... ,k,
V
(W (t ~V») =
v J
(v)
(v)
),W (t., » =
Pv
Covp (W (t
»
v
j
convention,
J
(v)
tj
A
j,j' = l, ••. ,k.
(v)
tj "
(3.6.13)
we get, by Lemma 3.6.2 and Slutsky's Theorem [Cramer (1946, p. 255)],
t.(v)
. ~.
t ., J. -_ 1 , ••• , k as
J
J
V -+
00
(3.6.14)
•
Therefore, the conditional variance-covariance matrix, given P , of the
. V
empirical process {W (t)} is asymptotically, in probability, the same
V
as the variance-covariance matrix of {Wet)} at t ,t , ••. ,t where
l 2
k
{Wet)} is the standard Brownian motion process.
From Theorem 3.6.3, the fact that B2 k
v,
j
.ij.B~
j
and the Slutsky
Theorem [Cramer (1946" p. 255)], we obtain on writing
W
= (WV(Sl(V»,
-V,q
0<
s~V) < s~V) < ••• < ~~V) < 1,
-1 -1
-1
where W(s(V»= S
C · Nv (Nv-l) Bv,k and k j is defined by
v j
v,k. v
s(V)
j
B- 2 :
v,k. V,k
= B2
. 1
J
(3.6.15)
76
Corollary 3.6.5.
(v)
(WV(sl
" (v)
), WV(s2
Under H and (3.2.1) - (3.2.3) and (3.2.5),
O
(v)
), "', Wv(Sq
(W(sl)' W(s2)' .•. , W(Sq»
»
converges in law to
where
-2
2
s. = B
B .
J
k. k
(3.6.16)
J
"
In order to prove that {W (t)}
v
D
~
{Wet)} on DfO,lJ, we have to prove
Let there exist v~V) 1- 0 < v~v) < 1
the tightness property of {Wv(t)}.
J
J
for V j and
·(3.6.17)
Since V(W (t)/P ) is
V
V
~in
t and because of the martingale property
proved in Theorem 3.5.1,
E
(W IV(V), - W ("t(V»)2 < E (W (t(V»
I
V j
Pv V j+1
Pvv' j
- W.(t (V»")2.
v j
max Ep (W (t.+ ) - W (t »2 ~ 0 as N ~
V
v
1
V j
O~J~k-l V
J
Therefore, if
00,
(3.6.18)
the tight-
ness property of {Wv(t)} will follow from Lemma 4 of Brown (1971).
For
the stated condition to be satisfied we have to prove the following
Theorem:
Theorem 3.6.6.
Let {k } be a sequence of positive integers such that
r
lim k
r~
r
=
00
,
~ {l~~~r+l p~r)}
where
pi r ) is
points.
(3.6.19)
(3.6.20)
• 0,
the probability of the j-th cell when there are k
r
censoring
Also define B2(~) as the value of B2 . when there are k
v,J
V,J
r
censoring points.
77
Then, under H '
O
(3.6.21)
Proof:
We shall superscribe r on ~ . andN . as well in order to
V]
V]
indicate their association with k •
r
and
(3.6.22)
Since.
*(r) A*(r)
(r)
A(r)
*(r)A*(r)
NV,j+IL'lV,j+1 = NV,j+IL'lV,j+1 + NV,j+2L'lV,j+2 '
we have
78
Then the right-hand side of (3.6.22) becomes
(~(r)
t(r)
2
V,j+l - V,j+2)
*(r)
NCr)
ep2(U)du + VNj+l
NV,j+2
<
*(r)
V
NV,j+l
p
+ 0 as N +
V
00,
1
J
ep2 (U)du
N(r) N- 1
v,j+l V
uniformly in j,
since
( r) -1 P
max
N
N
+0 and
l<'<k+l
Vj
V
_J- r
ep2(U) is integrable on u £ (0,1) •
o
+
3.6.1. ASymptotic Tests Based on Lv and Lv
-1
Let for finite k, N N
V1 v
k
there exist a k
Ot
-1
N
V
> 0 > 0 and for arbitrarily large k, let
O
\'
LN. > 0 > 0, in probability.
j=lVJ
This insures
a b(V) such that
B2 B- 2 > £ >
V,j v,k
We associate c=lwith this b(V).
Theorem 3.6.7.
a}.
(3.6.23)
We then prove the following Theorem:
Under (3.2.1) - (3.2.3) and (3.2.5) and under H
o
+
p{LV ~ x} ~p{
lim
N~
t
79
-k
~(t) ~ x}
t1~t~1
v
lim
P{L
N~
v ~ xl ~ p{ sup t-~Iw(t) I
<
x}.
(3.6.24)
t1~t~1
V
Proof:
sup
We know from Corollary 3.6.5
. lim P{L+ <
N~
V -
xlp } =
lim P { S
N
V
V
~
C-1 B-1
V,c V
< x, V l~c~k IP }
v
V,c-
V
= lim p{S
N~
V,c
C-~-l (C- 1B- 1 C B
V
v,k
V
V,c V
v,k
) ~ x, V l<c~kIPv}
V
=
p{
Wet )
c
It
c
~
x, c
-k
= 1, ... ,k} > P{ sup
t ~.(t) < xl.
(3.6.25)
tl~t~l
Consequently,
The proof for the second statement follows similarly.
o
Hence, in large samples, the percentile points of the distributions of
the functionals of the Wiener process W(t), viz.,
. sup t-~IW(t) I, will provide conservative tests.
t <t<l
1
sup
tl~t<l
t -~(t) and
However, as k becomes
large, the. distribution
of {WV (t)} approaches that of Brownian
motion
.
. .
process {W(t)}, forv t e: [tl,l].
In that case, the tests become less
and less conservative.
Remark 1:
The reasons for commencing the testingprocedlJre at the
reduced time-point t = e: > 0 are two-fold.
First, the law of the
iterated logarithm for the Brownian motion process states that
80
Wet)
_ _~:.L-.._
P{Um sup
= 1} = 1,
/z tJl,nJl,n (1:.)
t+O
t
P{lim inf _--:.W~(..:;t.!-)- t+O
(3.6.26)
IztJl,nJl,n(i)
see Tucker (1967, p. 265).
_W(_t_)
= -l} = 1,
But
= _---'W"'->(~t)<___
It
(3.6.27)
IztJl,nJl,n(l)
t
The first factor on the right-hand side of (3.6.27) goes to +1 a.s.
while the second factor goes to infinity, as t
not behave properly as t
+
O.
+
O.
_k
He~ce
t ~(t) does
Secondly, the weak convergence of Wv(t) to
Wet) in the sup.,..norm does not necessarily imply that for t going to 0
-k
-k
t ~v(t) and t ~(t) have the same asymptotic distribution.
asymptotic
12J1,nJl,n(~)
ppro~-imation
may not hold. ast + O.
is a slowly varying function of E.
of E, !zJ/,nQ.ri,ci)
Hence the
On the other hand,
In fact, for four values
are as shown below:
E
.001
.005
.01
.05
1.9660
1.8261
1. 7477
1.4813
This clearly shows that a suitably chosen cut-off pOiILt E (Le.,
confining our attention to the interval [E,l]) eliminates both the
problems mentioned above.
However, because of the inequality sign in
(3.6.25), we should choose E as close to t
we can take E
= tl ,
provided t
1
l
as possible.
is strictly positive.
Technically,
81
Remark 2:
One would also like guidance as to the adequate number of
responses for starting the PCS testing procedure.
Suppose the policy
is to end the experiment, in any case, when 10% of the individuals in
the sample have responded.
If one deals with ungrouped random variables
and Wilcoxon's standardized score, viz., ¢(u) = Il2(U-~), is applied,
then the asymptotic variance of C-1S
1-
[t
V
v, k is given by
$'(w)dw - (h1)-1{ (
.1
=1
$(W)dW)']
.1
- "(1-.1)3
=1
- (.9)3
=
.271,
(3.6.28)
seeChattergee and Sen (1973).
If one wishes to start the testing procedure after a proportionr of
the 10% of the individuals have responded, r can be obtained from the
equation
1 e; =
(l-r) 3
(3.6.29)
.271
One will be on the safe side by computing r from (3.6.29), since, with
grouped data, the asymptotic variance of C-1S
V
v, k will be less than .271.
rhe following table shows that, in this particular case, e; and 0 are
almost of the same order.
-e;
.001
.005
.01
.05
-r
0 = lOr
.00009
.00045
.0009
.0045
.0009
.0045
.009
.045
(3.6.30)
82
3.7. Asymptotic Distribution Theory of the Test-Statistics
Under Contiguous Alternatives: Simple Regression
v : S I O.
We consider alternative hypothesis H
contiguous to H if H
v
O
+
H as V
O
+
H will be
v
Contiguity of H to H can be
v
o
00.
achieved in two ways viz., (i) by keeping c . fixed (independent of v)
Vl.
and making S arbitrarily small as v
making c
o as
V +
+
00
or (ii) by keeping Sfixed and
-1
Vi
dependent on v in such a way that each C
00.
v
c
Vi
tends to
We have adopted here the second approach and it can be
verified that the Noether condition on c .'s ensures contiguity of H
v
Vl.
to H as developed in Chapter VI of H~jek and Sid~k (1967).
O
We thus
conceive of the sequence {H } where H : Sv =OCv""1 I 0, for any finite
v
v
We state and prove the following Theorem.
and fixed O.
Under {H } and (3.2.1) - (3.2.5), S' = (S . 1'·· ..,S . k)
v
-v
v,
v,
Theorem 3. 7 .1.
is asymptot.i.call; Nk(~'~)' where
].l
c
o C(F,G, {I.(c)} ), c = 1, ... ,k,
= C (-)
v a
J
(3.7.1)
and ~ and C(F,G,{I~c)}) are given by (3.6.2) and (3.5.6) respectively.
J
Proof:
We apply Corollary 3.6.4, Lemma 4.2 of H~jek and Sid~k (1967)
and the results of Section 4 of Sen (1967), thus obtaining the desired
proof.
Corollary 3.7.2.
Under the same conditions as in Theorem 3.7.1,
(V)
(W (sl .), .•. , W (s
v
v
(v)
q
»,
(v)
where s1
, •.. , s
(3.6.15), is N (M ,D ) asymptotically.
q -q -q
(k )
M = «P({I.
-q
J
Q, })(aO)
(v)
q
are as defined in
M and D are given by
-q
-q
(k )
A(F,{I.
.
J
Q,
}»,
Q,
= 1, .•. ,q,
83
= «hk », 1 = l, ... ,q,
1
~q
=«8 1
(k )
where p({I.
1
J
A
81
,»,
1,1'
= 1, ... ,q,
(3.7.2)
(k )
1
}) and A(F,{I. .}) are as defined in (3.5.7) and (3.5.4)
J
respectively.
Corollary 3.7.3.
Under the conditions stated in Theorem 3.7.1,
D
{Wv(t)} + {wet) + h(t)} for k satisfying conditions of Theorem 3.6.6
where
h(t) . = hk (t)
. ,
o
k(t)
o
= max{c:
< tk
B2 } .
Bc2 -
(3.7.3)
3.7.1. ASy!ptotic Power~ of ~and Lv
+
The asymptotic power of Lv
can be given by
+
where wais thelOO(l-a)th percentile of the distribution of
-~sup t-w(t).
t1<t<1
The asymptotic power of
Lv
can similarly be derived as
(3.7.5)
84
where w is the lOO{l-a)th percentile of the distribution of
a
sup
tl~t~l
It-~{t) I.
The inequalities in (3.7.4) and (3.7.5) can be
approximately replaced by equalities if k is large.
3.7.2. Expected Stopping Time (Asymptotic)
Let A* be the event that "'v
t' (t{V)) > w+ for some c
c
a
= 1 ' ... , k .
Then A* is the union of the following disjoint events:
+
> w },
Al - {~v (t{V))
1
a
+
A - {~ (t{V)) < w . ~ (t{V)) > w+ },
2
v 1
a' v 2
a
......
.....
~
Let
~+l
-
{~ (t{V)) < w+ , j = 1, ... , k-l; ~ (t (V)) > w+ }.
- a
V j
v k
a
be the time-point at which the experiment is terminated.
event a C+ l
= a c+ l
is the event A .
c
(3.7.6)
The
Asymptotically, the expected
stopping time is given by
+
k
TA
= E{a C+l ) = JlaC+lP{Ac ) + a k+l y ,
(3.7.7)
where y+ is defined by (3.7.4).
*
We similarly define Band
Bl, .•. ,B
k
for the two-sided test Lv
and obtain (asymptotic) expected stopping time:
k
TB =
C~laC+IP{Bc) + ak+ly,
(3.7.8)
where Y is defined by (3.7.5).
It is difficult to derive explicit expressions for the asymptotic powers
and expected stopping times for the two tests considered here since it
requires knowledge of the distributions of functionals of drifted
85
Brownian motion process.
Useful approximation can be empirically
obtained through simulation for various drifts.
3.8. Asymptotic Distribution Theory of the Test Statistics
Under H : Multiple Regression
O
From Ghosh (1973), we have that B2 (F ,{I~c)}) converges in
v
probability to
J
B2(G,{I~c)}) for every c, under (3.2.5), (3.2.6),
J
(3.2.8), and (3.2.9) and under H and the sequence of alternatives
O
{H:
v
S:f O}.
We consider sP = (S
-v
l'SV 2' .• ·'Sv k)' the k p-variate
-v, . - ,
- ,
random vectors associated with k censoring points.
For k finite,
we have the following:
Theorem 3.8.1.
Under H and (3.2.5), (3.2.6), (3.2.8), (3.2.9)
O
and (3.4.14), sP is asymptotically N k(O,B9I ) where 9 denotes direct
-v
p - - -p
product and B is given by (3.6.2).
Let us roll out 2~ in a single vector viz.,
Proof:
uP
-v
= (SlV, l'···'S pv, 1'·· o,Slv, k,···,8 pv, k)'·
Consider a non-null
vectorg' = (gl 1,g2 1,···,g 1,···,g k) with real finite elements.
-,
,.
p, .
p,
.
P
It will suffice to prove that SP(g) = g'u is asymptotically
V - -v
We shall use the same technique as we adopted in
proving Theorem 3.6.3.
Let us define
i ~Vi r~
v
T
mV,c
T
_v,c
i=l
A.Z . + t,*+l Zi . +1]' c = 1, ••• ,k,
U=l J i J
c,c
= (T..,c
·
, ••• ,T v ) ' ,
1V
p ,c
v-VP =
c = 1, ••• ,k,
(T 1V
1,···,T
V k)'·
., 1,···,T
.
pV,
.,
1V k,···,T
.
p,
(3.8.1)
86
Si nce
E{(S
mV,c
- T
)2/ HO }
IH )
mV,c
VeT
mV,c
a
0 as NV
+
+
00,
m = 1 , •.• ,p, c
= l, ... ,k,
SP(g)
and TP(g)
= .__
~'Y.vP are asymptotically equivalent in mean.
V V _
But
after rearranging the terms, ~~(~) can be expressed as
N
P
* Y
r
L
d
i=l m=l mvi i
V
N
v
* i'
L
hViY
i=l
(3.8.2)
where
*
hVi
P
=
L d *V·'
Yi .
1
J
m=l m
2, ... ,k,
wk+1 =
[ P
]
gm u ~:+1 + . L gm k
m=l u-=l '
m=l '
P
k-l
L I
~k+1·
Noting that E(Yi/H ) = 0, i = 1, •.. ,Nv , V(Yi/HO) =
O
*
max Ih.
l<i<N
V1
--V
I < l<m<
max
~
_!2P
(3.8.4)
~'~~,
and
max Id * .\p = 0(1), we get the desired proof as
l<i<N
mV1
--V
in the case of Theorem 3.6.3.
Let the p-variate sample process ~V = {~v(t), 0 < t < 1} be
defined as
W
mv
(t)
(3.8.5)
where k(t) is as defined in (3.6.11) and
(3.8.6)
87
Let also
~ = {~(t),
0
~
t
~
one-dimensional time parameter) such that EW(t)
=
min ( t , t ')1.
-p
~
l} be a p-variate Gaussian process (with
=0
and E(W(t)W(t'»
Then we have the fo llowing :
Corollary 3.8.2.
Under H and (3.2.5), (3.2.6), (3.2.8), (3.2.9),
O
D
(3.2.14), (3.6.19) and (3.6.20), ~v ~~, where ~v = {~v(t), 0 ~ t ~ l}.
Proof:
dimensional distribution of W to W
The convergence of finite
-v
follows from Theorem 3.8.1 and the fact that B2 gB 2
v,c
c
and the Slutsky Theorem [see Cramer (1946, p. 255)].
for every c
Regarding the
tightness property of W , we have from Theorem 3.6.6,
-v
8,£:n' > 0, for V m.
(3.8.7)
But
P
II~v(t) - ~v(t')112 2. p L [Wmv(t) - Wmv (t')]2 •
m=l
(3.8.8)
Therefore,
(3.8.9)
By puttingp2£,2
= £2 and pn' = n, we get the desired proof.
o
3.8.1. Asymptotic Tests Based on M", and M",*
We define 8,
£,
and b (v) as in our preparation for Theorem 3.6.7
and now prove Theorem 3.8.3:
88
Theorem 3.8.3.
Under the same conditions as in Theorem 3.8.1,
(3.8.10)
lim P{M* < xl > p{
N ~
V-
V
Proof:
sup (1w'(t)W(t»
t <t<l t1- -
< xl .
-
(3.8.11)
In order to prove the assertion in (3.8.10), we note that each
of the p co-ordinates of the p-variate empirical process W is
-v
asymptotically identically and independently distributed.
Therefore,
the distribution of the maximum of the statistics over the p coordinates
is simply the product of the individual distributions.
We plug in
the corresponding result for each coordinate from Theorem 3.6.7 and
establish the proof.
Hence, in large samples, the percentile points of the distribution
of the functional of the Wiener process W(t), viz.,
will provide conservative tests. We compare ~
100(I-a')th percentile of the d·istTibutionof
sup
t -~IW(t)
I,
tl~t<l
mV
(t(V» against the
c .
-1"
sup t 2IW{t)
I,
for
tl~t~l
p
the a.-th level test, where a.'
=1
- Ii-a. •
We prove the second assertion by appealing to Corollary 3.8.2 and
Billingsley (1968, Theorem 5.1) and by straightforward application of
well known Cramer's Theorem [see Rao (1973, p. 162)].
3.9. Asy!uptotic Distribution Theory of the Test Statistics Under
Contiguous Alternatives: Multiple Regression
Under alternative hypothesis we consider H :
v
~ +~.
By bringing
in the conditions (3.2.8) on c * . 's, we conceive of the sequence {H }
v
mVl.
of alternatives contiguous to H in the sense described in Chapter VI
O
v
of Hajek and Sidak (1967) t We state and prove the:l;ollor.dng Theorem;
89
Theorem 3.9.1.
Under (3.2.4) - (3.2.6), (3.2.8), (3.2.9) and (3.2.14)
and under {H,,},
v
sP is asymptotically Npk(8,B91
-v
__ -p ) where k is finite and
~mc =
.. (c)
vm
-aC(F,G,{I
}),
j
T
m = l, ••• ,p, c = l, ••• ,k,
(3.9.1)
and t~ = (T 1V , •.. ,T pv ) is defined in (3.4.15).
Proof:
Follows from Theorem (3.8.1), Lemma 2.3 of Ghosh (1973),
y
Lemma 4.2 of Hajek and Sidak (1967) and Section 4 of Sen (1967).
Note that under contiguous alternatives as well as under H the p
O
rows of
sP
-v
are asymptotically independent, though not identically
distributed.
Corollary 3.9.2.
~,,(te)
·-v
h
-c
Under the same conditions as in Theorem 3.9.1,
is asymptotically N (h ,t I ) where
p -c
=
c... p
(hI , •.• ,h )t,
cpc
T
h
= p({I~c)})( vm(J)A(F,{I~c)}), m= 1, ... ,p, c = l, ... ,k.
me
J.
.
J
Corollary 3.9.3.
(3.9.2)
Under the same conditions as in Theorem 3.9.1 and
D
(3.6.19) and (3.6.20) {W (t)} -+- {W(t)
-v
~(t)
= ~k
o
+ h(t)} where
(t)'
k (t) = max{c:
o
B2 < t B2 }
C -
k'
*
3.9.1. Asymptotic Power of M and M
v
v
The asymptotic power of M is given by
v
(3.9.3)
90
l-Y' = limP{M > x /H }
N-+oo
v
ex. v
v
=
p{t-~IW(t
) +
c
c
h I > x for some m = l, ..• ,p and
mc
ex.
some c = l, .•• ,k}
< p{ max
sup
l~m2.P tl~t~l
t-~IW(t) +
h
(t)
mk
1
> xex.}'
0.9.4)
0
where x
is the 100(1-ex.)th percentile of the distribution of
ex.
1
max
sup t-~Iw (t) I, where Wl(t), .•• ,W (t) are p independent
l~m2.P tl~t~l
m
p
standard Brownian motion processes. We similarly develop asymptotic
power of M* as
v
l-y * =
= p{L
t
c
p
I (W (t )
m=l m c
+h
mc
)2 > x
* for
ex.
some c}
p
< p{
sup
tl<t~l
1.
t
I
m=l
(w (t)
0.9.5)
m
where x * is the 100 <J-ex.) thpercentile of the distribution of
ex.
p
-1 \
2
.
t
L W (t) and Wl(t), ••• ,w (t) are p independent copies of
m=l m
p
standard Brownian motion process.
3.9.2. Asymptotic Expected Stopping Time
Leaving out the details, we associate the events Dl, ••• ,D
El, ••• ,E
k
k
and
that the experiment is terminated at the censoring times
a , •.• ,a + for the tests M and M* respectively.
v
v
k l
2
D - {t,; (t(v»
mV 1
l
> x , for some m = l, ... ,p},
ex.
........ ..... ......
Then,
91
D - { ':>~ ·(t(.V»
k
mV J
<
-
f
1
d'
1
k
a or every m = , ••. , p an J = , ••• ,. ;
~ (t(V» > x", for some m = 1, .•• ,pJ,
X
mV
k
(3.9.6)
u.
E - {~'(t(V»~ (t(V»
l
-v 1 -v 1
> x *},
{~'(t~V»~ (t~V»
~ - -v J
-v J
< x * j = 1, ••. ,k-l; ~'(t(V»~ (t(V»
a'
-v k -v k
a
....
-
> x *}..
a
(3.9.7)
The expected stopping times (asymptotic) for the two tests are
respectively
k
(3.9.8)
TD = C!laC+lP(D c ) + Y'a k+ l ,
*
k
I
(3.9.9)
T =
a +lP(E ) + Y a k+l ,
E
c=l c
c
*
where Y' and yare
defined by (3.9.4) and (3.9.5) respectively.
The
explicit expressions for the asymptotic powers and mean stopping times
for the two tests based on M and M* seem to be intractable.
v
v
obtain useful approximations through simulation.
One can
CHAPTER IV
NONPARAMETRICTESTING UNDER PROGRESSIVE
CENSORING:
MULTIPLE REGRESSION
4.1 Introduction
For two-sample location and scale problems and simple regression
situation, Chatterjee and Sen (1973) developed a general class of
rank (linear) order tests for progressive censoring.
In this Chapter,
the results of Chatterjee and Sen are generalized to the multiple
regression problem which includes the multi-sample location problem as
a special case.
Section 4.2 discusses the model and assumptions as well as the
basic formulation of the problem.
of PCS test procedures.
Section 4.3 deals with the development
In the last two sections are derived the
asymptotic distribution theories of the test statistic under the null
hypothesis and contiguous alternatives.
Chapter V gives percentile
points of the large-sample distribution of the test statistic under R '
O
4.2. Model, Assumptions and Basic Formulation
4.2.1. Model and Assumptions
We consider a double sequence of independent random variables
X' = (X .1""'X ), where X . has a continuous distribution function
-n
n
nn
n1
F . (x), i = 1, ... ,n, n
n1
~
1, given by
(4.2.1)
93
where S' = (Sl' ••• 'S ) is the vector of regression coefficients with
-
p
real (finite) elements, a is a nuisance parameter and c' =
-ni
(c ., •••
. ,c .) is a vector of known constants.
1
n~
pn~
We assume that for every m:
n
I
1
~
m~ p
mni = 0, n ~ 1,
C
1=1
.Q,
,n
L
max
1
lim n-.Q,
n
n~
l~il <••• ~i.Q,.::n .a=l
= 0 -> lim
...;.0;;;._'
n
L
n~
-'-
= O.
(4.2.2)
c2
1=1 mni
The assumption (4.2.2) is known as Condition Q of Hajek (1961).
We
assume also
n
L
i=l
c2
mni
= 1,
m
= 1, ... ,p,
n
A
-n
=« I
i=l
c
.»
.c,
m n~
mn~
+ A
-
= «A
rom
,»
as n+~,
(4.2.3)
where A
_n and A are p.d.(Note that Amm = 1, V m =1, ••• ,p)
f(x) = dF(x) and f'(x) exist,
dx
00
1(f) =
J[f'(x)!f(x)]2dF(x)<
00,
(4.2.4)
-00
where I(f) is Fisher information.
We shall deal with scores
{a (i), 1 <i < n} and assume that there is a square integrable function
n
-¢(u),
°
< u < 1, such that
¢(u) = ~l(u) - ¢2(1-u),
1
lim
'f
n~
0
[¢n(u) - ¢(u)]2 du = 0,
(4.2.5)
94
= a n (l+[unJ),
where both epl( u) and epZ(u) are non-decreasing in u, ep (u)
n
1
J ep(u)
1
J ep2 (u)du
= 0, and
o
= L
o
We also let
= -
f'(F-l(u»
f (F-l(u»
(4.2.6)
.
In the special case of the multi-sample problem, let
~,
k = l, ••• ,p+l
be the sample sizes of p+l samples such that
(4.2.7)
We may choose
-1
km (l-n
-1
(nm-np+1»' if the i-th observa-
tion belongs to the m-th sample,
c
.
mnl.
k- 1 (_1_n- l (n -n
», if thei-th obserm
m . p+l
=
vation belongs to the (p+l)st sample,
-1
-km (n
where km = {nm + n p+l - n
-1
-1
2 ~
(nm-np+ ) } .
l
= O(n
-1
(nm-np +1»' otherwise,
Note that
) by virtue of
(4.~.7).
Therefore, in this special case, Condition Q is satisfied.
The
Condition Q also holds for the general regression case if for some
,Q,
.
,Q,
,Q,
> 2, L~llc
l.=
mnl. I
1
=
O(N
-"2
), V m
=
l, ... ,p.
(4.2.8)
95
4.2.2. Basic Formulation
In the simple regression problem, for testing H : 8 = 0 against
O
alternatives
8
~
0 (or S > 0 or S < 0), after all the nobservations
are available, a general class of linear rank statistics is given by
n
T
n
= L
i=l
(4.2.9)
c .a (R i)'
n1 n n
where R' = (R 1, ... ,R ) is the vector of ranks and {a (l), ••• ,a (n)}
-n
n
nn
n
n
is a set of suitably chosen scores such that
n
L
a
i=l
n
(i) = 0,
n
L
i=l
=
a 2 (i)
n
(4.2.10)
(n-l).
Let nowS . desrgnate the position of the individual in the sequence
n1
(1,2, ••• ,n) whose rank is i.
vector of anti-ranks.
Then -n
S' = (S n i""'S nn ) is called the
Clearly, R . = SR
S
n1
ni
= i.
Therefore, the
alternative representation of T is
n
n
T
n
= L Cs a (i).
i=l
ni n
(4.2.11)
It can he easily proved that E(T /H ) = 0 and V(T /H ) = 1.
n
n .O
O
Under a fixed-plan censoring, one observes up to the r-th order
statistic as envisaged in the design.
Leaving out the details, we
have, as in Chatterjee and Sen (1973), the test statistic T
defined
n,r
by
T
*
where a (r)
n
n,r
=
r
L Cs
.=
i=l
ni
. -1
n
(n-i)
{a (i) - a *(r)}, 1 ~ r ~ n-l,
n
L a (j).
. j=r+l n
n
Note that T
= T •
n,n-l
n
(4.2.12)
e.
96
We let T
n,r
= Vn,r
=0
for r
= O.
Then E(T
n,r
/H )
O
= O.
Letting V(T
n,r
)
o
/H
for r > 0 and n > l~ we have~ by Gastwirth- (1965).~
T
V
-1<
2
n,r n,r
L
-+ N(O,l) ,
(4.2.13)
under HO and under standard conditions on ci's and ai's and for
r/n -+ p:
0 < P < 1.
Let T k be the statistic similarly constructed as in (4.2.12) but
n,
based on the first k order statistics, where k
= l, ••• ,r.
For purposes
of a progressive censoring test, Chatterjee and Sen (1973) considered
the distributions of the statistics
=
max T k'
O<k<r n,
M
= max IT k l .
n,r
O<k<r n,
(4.2.14)
As the exper~ment progresses one compares T k(orlT k l ) against the
n,
n,
percentile point of the distribution of M+
(or M ) under H and
n,r
n,r
O
rejects the null hypothesis if it exceeds some critical value,
continuing with the experiment if it does not.
We extend these results
to the multiple regression situation, in the next section by developing
Cramer-Von Mises type statistic.
4.3. Progressive Censoring Tests
In life testing or clinical trials, as the experiment is continuouslymonitored, at the k-th censoring stage the order statistics·
Znl, •.• ,znk are available, where ~~
=
(Znl,Zn2, .•. ,Znn) is the vector
of order statistics corresponding to X'
_n = (Xn l' .•• 'Xnn ).
Based on the
knowledge of k order statistics, we construct p rank order statistics
97
k
T
mn,k
=
l:
a (i)
l:
c S
[a (i)
i=l m ni
c S
i=k+l m ni
k
=
l:
+(
c
i=l mS ni n
*
n
a * (k)], k
n
n
)a (k)
n
= l, ... ,r,
m = l, .•. ,p.
(4.3.1)
We also denote
(4.3.Z)
T
' k. = (T l n, k,···,Tpn, k)'
-n,
It is obvious that
E(!~,k/HO)
= 0 for V k = l, ... ,r.
As we shall see
in the next section, V(Tmn, k/HO) = Vn, k' i.e., the variance is
independent of m and V k is /" in k.
n,
t(n)
k
= V V-I, k = O, •.. ,r;
n,k n,r
to(n)
Letting
=0
< t (n) < ••• <
1
(4.3.3)
We define the p-variate sample process {W*(t), 0 < t < I} on D[O,l] as
-n
*
W (t)
-n
.
= (WI·*n (t),
. ••• ,Wpn ( t » ,
(4.3.4)
I
where
-~
-~ I
= (T · n, kVn,r
., ..• , Tpn, kV. n,r
).,
1
W*(0)
-n
= O.
(4.3.5)
We let
c
c
nxp
lnl
1nZ
c
c
Znl
ZnZ
c
c
c
'
-nl
c'
-02
pnl
pnZ
............ ........... -.
(4.3.6)
=
.
• • e_ • • • • • • • • • • • • • • • • • • • e. •
·c
c
lnn2nn
c
pnn
c
'
-nn
ten)
p
I\
L
=
m=l
Ik
99
2
wmn
(t)dt
0
k-1
=
W'(t(n»
Im=l J/,=OL (t(n) _ ten»~
J/,
-n J/,
J/,+1
W (t(n».
-n J/,
(4.3.12)
Note that H is /lin k and it can also be written as
k
k-1
I L
m=l J/,=O
(n)
ten»~ W'*<t(n»E E'W*(t(n»
(tJ/,+l - J/,
_n
J/,
-n-n-n J/,
=
P k-1
(n)
()
* () 1 * ()
L L (tn+1-tn)W'(tn)A-W(tu).
m=l J/,=o
N
J/,
-n
J/,
-n -n J/,
(4.3.13)
For purposes of progressive censoring procedure, we derive the
asymptotic distribution of H under the null hypothesis H and
r
O
compare H against u where u is the 100(1-a)th percentile of the
k
a
a
distri~ution
of H in large samples.
r
If
~
> u for some k, we stop
a
experimentation and take decision against the null hypothesis; otherwise we continue with the experimentation.
If no such k exists for
which ~ > u ' we decide in favor of the null hypothesis.
a
It is evident from (4.3.13) that
~
is invariant under any non-
singular transformation of W* (t) given by (4.3.9) which satisfies
-n
(4.3.8).
As in the multiple regression problem with grouped data in
Chapter III we could have also proposed the test statistic
(4.3.:1-4)
It may beobserved.that·the linear transformation which satisfies
(4.3.8) is not unique.
Consequently, H does not have the invariance
property in the sense H does.
r
r
98
Since L is of full column rank p and A = L'L is of rank p, there
-n
_n
-n-n
exists a non-singular matrix E such that
-n
E'A E
-n-n-n
= -n-n
D'D = I ,
-p
(4.3.7)
where
D'n
= E'L'
n n'
E
n
=
,e , •.
- l n - 2n
(e
~,e-
-pn
)
=
(4.3.8)
e'
"'In
e'
-2n
e'
-pn
Let the p-variate sample process {W (t),
-n
D[O,l] as
O.s..
t
.s..
I} be obtained on
W (t) = (WI (t), ••• ,W(t»'
-n
n
pn
*(t).
= (W-n* '(t)E-n )' = E'W
-n-n
(4.3.9)
We define the Cramer-Von Mises type statistic
p
Hr
= I
m=l
1
f
o
W2 (t)dt,
mn
(4.3.10)
which in our case reduces to
4.3.11)
Under the progressive censoring plan, however, taking advantage of the
fact that the integrand in (4.3.10) is non-negative, we construct H ,
k
at the k-th failure or response, given by
100
4.4. Asymptotic Distribution Theory of H under H
r
O
We state the following Lemma which is a trivial extension of
Lemma 4.1 of Chatterjee and Sen (1973).
Lemma 4.4.1.
Let the a-field generated by the vector (Snl,Sn2""'Snk)
be B(k), 1 < k < n, where B(k) is? in k and B(O) is the trivial a-field.
n
-
-
n
(k)
n
Then, under H ' {T k,B
-n,
n
O
, k = l, •.. ,n} is a martingale.
Theorem 4.4.2.
C~v(~n(t»
Under HO'
*
Cov(W* (t»
- -n
Proof:
is given by
(4.4.1)
= t A , t £ [0,1].
-n
Let
b n, k(i)
= {a~ (i),
i
~
k
a (k), i > k
n
b
-n,k
Then, T' k
-n,
=
=
(4.4.2)
(b n, k(l),b n, k(2), ..• ,b n, k(n»'.
h' kLS ' where
-n, -~n
~S
-n
is obtained from
~n
by a permutation
of the rows of L
-n [replacing (1, ... ,n) by (S n1'''' ,S nn )]
n
V(T
mn,
k/HO)
= V(
I
b
i=l n,
= E[. .
I1.bn, k(i)cm.
S +
2
~=
= n
k(i)c S IH o)
m ni
2
n~
I
'4'
~TJ
b n, k(i)b n, k(j)cm.
S cmS . IH o]
n~
nJ
-1[ I
b 2 .(i) + (n-l) -1 L b k(i)bk(j) I... c hc h ,]
i=l n,k
i:fj n, . ' n,
h:fh' mn mn
I
l
= n- [. b 2 (i) - (n_l)-l L bk(i)b kO)J·.
i=l n,k
i:fj n,
n,
n
(n_l)-l
Lb2
i=l n,
k(i).
101
n
2
= . L1bn,
k(i)Cov(cms • ,cm's • /HO)
Cov(Tmn, k,T,
m n, k/HO)
~=
+
n~
n~
L bn, k(i)b n, k(j)Cov(cms
.J..
~TJ
=
-l[ L.b
n .
.
n~
n
2
i=1 n,
k·
,cm's
L.cmn~.c,
i}]
mn
i=l
i:!'j n,
l[ .L
n
/HO)
(i){ n
+ n-l(n_l)-l[ I bk(i)b k(j){.
= n-
nj
Ic
.hc'nh,}l
h:!'h' ron m
n,
J
n
r
2
b kef)
C i C 'i
1=1 n,
i=l mn m n
J
nL b 2 k(i) nL c. i C , ·i.
f=l n,
i=l mn m n
+ (n-l) - 1
= (n-I)
-1 n
.
n
L b 2 k(i){ i=l
L cron i Cm, n I}.
i=l n,
*(t)
. = T (t)V-~ , where ret)
Remembering that Wmn
mn,r
n,r
= max{k: Vn, k -< tV},
the result follows.
n,r
Corollary 4.4.3.
From Lemma 4.4.1, it follows that
Cov(W*(t(n», W*(t~n»/H) = (t ~n) "t~n»
-
-n
i
~n
~
0
J.
minimum of tin), t~n).
J
A
-n'
where ti(n)J\tJ~n) is the
Under Ho ' C2V(~n(tin»IHo) and
Cov(W (t(n), W (t(n»/H ) are t(n)I and (t(n)"t(n»I
Corollary 4.4.4.
- .....n
i
-n
j
0
..
i-p
i
j
.... p
t'espectively.
Let (kl,
. .•• ,k)
q be a subset of (l, •.• ,r) such that
lim n
-1
n~
-1
lim n
n~
kj =
o.J
> 0, j
r = 0 > O.
= l, ... ,q,
(4.4.3)
102
Then by Lemma 2.2.7 Hajek (1961), Lemma 4.2 and Theorem 4.2 of
Chatterjee and Sen (1973), under assumptions (4.2.5), we have:
Lemma 4.4.5.
lim(n-l)
n
-1
2
i=l
n-+OO
b
k (i) -+
n, j
O.
J
Yj
- [
~2(u)du
1
)
f
HU)dUf
f
+
<5.
J
1im(n-1)
n-+oo
-1 n
I
i=l
b
n,r
(i)-+
<5
Y
=
I
</>2(u)du + (1-0)-
If II</>(U)dU]12 •
(4.4.4)
<5
o
"
Let ~n* = {~n(t),
0 ~ t ~ I} be the p-variate stochastic process
considered earlier, and let W* = {W*(t), 0 < t < I} be a p-variate
~
Gaussian process (with one-dimensional time parameter) such that
EW* (t) = 0 and E(W*(t)W* (t'»
(4.2.3).
Then we have the following:
Theorem 4.4.6.
*
W
TJ
~
= min (t,t')A, where A is defined in
Under the H and (4.2.2), (4.2.3) and (4.2.5),
O
*
W •
-n
Proof:
That
*
E(~n/HO)
= 0 is obvious from (4.3.5).
We: shall first
show the asymptotic joint normality of any finite dimensional distri-
*
bution (f.d.d.) of the process Wand
then its tightness property.
~n
*
W
~n,q
Let
< s(n) ~ 1,
q
(4.4.5)
103
where
_1
w* (s ~n» = T
~n
.
(n)
and k. is defined by s.
J
J
V ~
(4.4.6)
-n,k. n,r
J
e'
J
-1
= V k V
•
n, . n,r
J
Therefore, asymptotic joint normality of (T
~n,
asymptotic joint normality of W
~n,q
k , •.• ,T
1
~n,
k) will imply
.q
.
Let us rollout the elements of (T
~n,
k , ••• ,T ·k) in a single vector
1
q
~n,
(4.4.7)
T
1L' ••• , T l
- n,q = (T. l n, k"'"
1
pn'-L
n, k"'" Tpn, k ).
pqXl
q
q
U'
We now take a linear combination of the elements of U
~n,q
,viz.:
(4.4.8)
where
g' = (gll,···,g
,
p, 1· ,···,gl ,q , •.• ,g p,q ).
~q
After rearranging the terms, we get from (4.3.1)
T (g)
n -
=
n
I
p
I
i=l m=l
c
mS ni
(4.4.9)
d (i),
n
where
2. 2.
[.1. j=lr a
". r gm ,].. J + m=lI 8m ' /(n~k1)
r (V>],
[m=lr. J=2
t
v=k
h·
f gm a. (i) + m=l'
2: gm 1 [(n-k1 )
I. an (V)]
m=l ,q..n
"
v=k
+ rg 2 [(n-k 2)
I an (V)]··"
m=l m
., .
v::;k
m=l
g
.] (il, 1
m,] n
i
kl ,
"
an (i)
d. (i)
n
=
P
-1.....
•.••
-1 .
1
+1
1
"n
k1 <
i2.
k 2,
+1
-1
+1
2
+ ... +
I
g1l1 ';"l[(n-k -1)
m=1 .,q
q
-II
v=k
an(V)],k -1 <i 2. kq ,
. q
+1 ..
q-1
e·
104
l.
I
I
gm 1 [.(n-k ) -1
an
1
m=l' v=k +1
(V)]
+ ... +
I
gm
m=l
1
k
q
,q
[(n-k) -1 y.
an
q
v=k +1
(V)],
q
< i < n.
(4.4.10)
p
Denoting
L
m--1
c
mS.
nl.
= f
S.
nl.
linear rank statistic.
,we observe that T (g) reduces to a simple
-n
Further, in every interval considered in
(4.4.8), d (i) is a linear function of a (i)'s.
n
By Lemma 2.2 and
n
Theorem 3.2 of Hajak (1961), the a (i)'s satisfy the Noether Condition,
n
and thus so do the d (i)'s.
n
It remains to show that the f
n
Condition Q of Hajek (1961).
L
But
f
i=l
S
ni
's satisfy
is identically zero by virtue
S
ni
of (4.2.2) and
]I,
P
n
]l,n
2
L [ L cmS
m=l
ni
a=l
]
2. p
a
p
L L c~s ni
a=l m=l
(4.4.11)
a
We again appeal to (4.2.2) and observe that
]I,
lim
n-+oo
n
L
max
12.il~ ••• ~i]l,
~n
= 0(1), when n
a=l
-1
]I,
n
+
0 as n-+oo.
n
We also have, by Theorem 4.4.2, Corollary 4.4.3 and (4.4.8)
(4.4.12)
where 9 denotes direct product and
n
Lb
i=l n ,k
2
(i)
1
n
(n-1)·B
_n,q =
Lb 2 k (i)
i=l n, 1
......
n
n
n
i~1b~,k1 (i)
L
i=l
b
2
n,k
(i)
1
n
b2
(i)
n,k
i=1
2
L
L
i=l
b
n
L
i=l
b
2
k (i)
n, 1
2
k (i)
n, 2
....
n
L
i=l
2
.b n,k (i)
2
n
2
L
b k (i)
1=1 n, q
(4.4.13)
105
We note that, by virtue of Lemma 4.4.5, as n
B
-q
=
-+ 00,
Yl
Yl
Yl
Yl
YZ
YZ
Yl
YZ
Yq
a-n,q
-+.B
-q
where
(4.4.14)
We now appeal to the Wald-Wolfowitz-Noether-Hoeffding-Hajek permutational central limit theorem and conclude that T (g) is asymptotically
n -
N (0 , g 'Btil Ag ).
l
-q-q
--q
We also have
=
s.,
(4.4.15)
J
which establishes the first part of the proof regarding f.d.d.
DW
*,
For p=l, Chatterjee and Sen (1973) have proved that W* ~
n
which insures that, for V m = 1, ..• ,p,
p{
sup
O~t,t'::,l
[Iw*
mn
(t) -W* (t')I:lt-t'l < 0] > e'}<n', o,e',n' > O.
(4.4.16)
II.*
W (t)
But
.
mn
~
*
.
W. (t') 11 2 <
-
. -
p
p
* ( t)
L1 [Wmn
m=
*
2
- W (t ')] .
mn·
From (4.4.15) it therefore follows that
p{
sup
[IIW*(t) - W*(t')IF:lt-t'l ~o] > p2 e2'} < pn'.
O<t,t'<l
-n
-n
(4.4.17)
By putting p2 e 2' = e: 2 and pn' = n, we prove the tightness property of
W* .
-n
Corollary 4.4.6.
w
-n
Under the same conditions as in Theorem 4.4.5,
= {W (t), 0 < t < I} converges weakly to a p-variateGaussian
-n
process
~
--
such that
E(~(t»= ~
and
E(~(t)~(t'»
= min(t,t')!p'
106
Theorem 4.4.7.
Under the same conditions as in Theorem 4.4.5, H is
r
asymptotically distributed as the sum of p independent and identical
regular functionals of the Brownian motion process W(t) viz.,
1
Io
Proof:
(4.4.18)
Follows from Corollary 4.4.6, straightforward extension of
Cramer's Theorem [see Rao (1973, p. 162)] and Billingsley (1968,
Theorem 5.1).
We therefore compare H against u ,the 100(1-a)th percentile
k
a
1
of the p-fo1d convolution of the distribution of
W2 (t)dt with
I
o
itself •
4.5. Asymptotic Distribution Theory of H
r
Under Contiguous Alternatives
Under the alternative hypotheses, we consider a sequence
{K :
n
*-1
*
S} where we conceive of c . satisfying (4.2.2), as
-n mnl.
S =C
-n
c
.
c
mnl.
C*2
mn
=
*
*-1
.C
,
mnl. mn
n
r
1, ... ,p, n
~
1,
i=l
*-1 = diag(C *-1 , ..• ,C *-1 ).
1n
mn
C
~n
(4.5.1)
We have the following Theorem:
Theorem 4.5.1.
Under
{~}
and the conditions (4.2.1) through (4.2.6),
the p-variate sample process W* converges weakly to the p-variate
..
-n
drifted Brownian motion process
~
*
such that
EW*(t) = Sp(v-l(t»,
-
-
E(W*(t)W* (t'»
~
-
=
min(t,t')A ,
(4.5.2)
107
where
=
¢l s (u)
{¢l(U),
0
<lu ~ s,
(l~S)-lI ¢l(u)du,
s<u<l.
(4.5.3)
s
Proof:
Let P and Q be the joint df of
n
n
respectively.
~n
under H and l<p.
O
From the results of Chapter VI of Hajek and Sidak (1967),
and Theorem 2 of Sen (1976), by virtue of the conditions (4.2.11),
and (4.2.4), under {Kn }, {Qn } is contiguous to {p}.
.(4.2.21,
..
n
We now
v
appeal to Lemma VI.4.2 of Hajek and Sidak (1967) and prove the Theorem.
4.5.1. Asymptotic Power of the PCS Test
The asymptotic power of the test based on H is given by
r
P{R
> u. ex. for some k = 1, ••• ,r/Kn }
--k
+ p{
~[Il
m=l .
o
(w (t) + j l (t» 2
m
m
dtJ
= P{Hr >· uex. /K}
n
> u }.
ex.
(4.5.5)
108
4.5.2. ASymptotic Mean Stopping Time
Under the
on r values
Z
n
pes,
the stopping time Z is a random variable which takes
1' •.. ,z
nr
with probabilities Pk's associated with the
events [Z = znk] under P
n
or {Q }, where
n
Pk = P{H.1-ex
< u , i = l, •.. ,k-l, R
> UN}'
--k
""
p= P{H. < u , i
r
1 - ex
+ P{H.1 -< uex ,
l, ... ,r-l, H > u }
r
ex
i
= l, ••. ,r}.
(4.5.6)
We can compute the mean stopping time as
(4.5.7)
It is difficult to give an explicit expression for the asymptotic power
and mean stopping time since they require knowledge of the distribution
of the functional of the drifted Brownian motion process W(t) +
~
m
(t).
Useful approximation can be empirically obtained through simulation
for various drifts.
CHAPTER V
SIMULATED PERCENTILES OF THE NULL DISTRIBUTIONS OF SOME
FUNCTIONALS OF STANDARD BROWNIAN MOTION PROCESS.
NUMERICAL ILLUSTRATIONS OF TESTS
5.1. Introduction
The distributions of the progressively censored rank order tests
considered in Chapters III and IV have been shown to converge weakly
to those of various functiona1s of the standard Wiener process under
the null hypothesis and of the drifted Wiener
proc~ss
alternatives, under certain regularity conditions.
under contiguous
The null distribu-
tions of these processes are not available in the statistical 1iterature.
Theoretical derivation of these .distributions appears to be
complicated and intractable.
In Section 5.2 we derive the null
distribution of these processes empirically through simulation and
smooth them by suitable Pearsonian curves whenever possible.
In
Sections 5.3 and 5.4 we illustrate applications of the tests developed
in Chapter II and the two-sample test derived in Chapter III respectively, to samples generated from negative exponential populations.
In the concluding section we make a few observations regarding asymptotic performances of the two sets of tests from power and mean
stopping time considerations.
A copy of the program written in FORTRAN
for the <ierivation of empirical distributions of
.
sup
E<t<l
stip
E<t<l
(t-~(t)} and
.
..
(t-~IW(t)l) through simulation is provided in-tne Appendix where
{Wet):
.
.
. '
t'E [O,l]} is a standard Brownian motion process.
110
5.2. Generation of Brownian Motion Process and Derivation
of the Distributions of some Functiona1s of this
Process through Simulation
Consider n independent observations Xl, ••. ,X from the standard
n
normal distribution.
Let
k
I
X., 1 < k < n,
i=l
=0
So
1.
-
n
n
where n(t)
= max{k:
(5.2.1)
by convention.
We define the stochastic process W
W (t)
-
= n -~
= {wn (t),
t E [O,l]} by letting
(5.2.2)
S (t),
n
k < tn}.
We note that the sample process {W (t)} is right continuous with the
n
left-hand limit and hence belongs to the metric space D[O,l] with the
properties
E W (t)
n
2
E W(t)
n
E W (t)' W (t')
n
n
=
0,
= n -1 Int] ,
= n- 1 [n(tAt')],
t, t'
E
(5.2.3)
[0,1],
where [nt] and [n(tl\t')] denote integral parts of nt and n(tAt')
respectively.
The maximum jump of the process is
max
l<k<n
I~I
--=
O(ffrm
a.s.
In
+ 0
a.s. as n+
00
•
(5.2.4)
Consequently, as n gets large, the process Whas a cQntinuous sample
n
path a.s. and has the structure of the standard Brownian motion process
W = {W(t) , t
E
IO,1]}.
111
In our simulation studies, the standard normal variables have been
.generated by using the IBM scientific subroutine GAUSS.
The sample size
n has been taken to be 1000 and the empirical distributions of the
various processes have been obtained through 1000 independent repetitions.
5.2.1. Simulated Distributions of
Sup (t
£<t<l
-\-(t»
and
Sup (t-!.e/W(t)
£<t<l
I
As we have stated in Section 3.6, the processes do not behave
properly in the neighborhood of t=O.
We resolve the problem by
working with the truncated range [£,1] for t, where £ is strictly
positive.
We also note that, by virtue of 5.2.4,
D
W + W,
n
max W(k/n)
n
£n<k<n
max
£n~k<n
D
+
IWn(k/n>I D
-~---o7
IiiJU
_1.;
sup (t 1W(t»,
£<t<l
sup (t-~IW(t) I>.
£<t<l
(5.2.5)
We exploited (5.2.5) to derive the simulated distributi(lns of the two
processes for seven values of £,viz., .001, .005, .01, .05, .10, .15,
.20.
The various steps involved in the derivations are:
.(~)
Generate a random sample of 1000 from N(O,l) and compute
k
(b) Sk
= L Xi'
i=l
(c)
En<k~n (SO
.
(k/n)-~ Wn(k/n) = k-~
£
(d)
=0
by convention);
Sk' £n<k<n,·
= .001,
.005, .01, .05, .10, .15, .20;
~x (k/n)-~IWn~k/n) I,
Max (k/n)"';~ Wn (kin) and
£n<k<n
.
£n<k<n
£ = .001, .005, .01, .05, .10,.15, .20;
lIZ
(e) Repeat (a) - (d) 1000 times and from the observed distributions
of the statistics at (d) compute (i) 900-th, 9Z5-th, 950-th and
990-th order statistics and (ii) mean,
variance'~3' ~4'
and SZ.
From the Sl - Sz diagram of Pearson and Hartley (196Z), Pearson's
Type I (0) curve was chosen to graduate the observed distributions.
From Elderton and Johnson (1969) we briefly describe properties of this
curve:
The equation of the curve is
x
y = y (1
o
+ -)
a
ml
(1 -~)
mZ
a
,-a
Z
are finite, ml/a
l
l
< x < a '
Z
(5.2.6)
= m2/a Z' the mode is at the origin
l
and Yo is a constant such that the area under the curve for x = -a
l
to x = a is unity. Letting a + a = band z = (a + x)/b, we
Z
l
2
l
reduce the curve to a Beta distribution of the first kind with
par~et~rs m + 1 and m + 1.
The constants of the curve are
2
i
estimated through the method of moments. The values to be
where aland a
calculated in
Z
or~er
are
(5.Z.7)
The m's are given by
,>{r-2±!'(r+2lJ
B(r+2)2:~6(r+l)}'
(5. Z. 8)
1
~3
where m is the positive root if
2
is positive.
(5.Z.9)
Mode
= Mean
-
~
•
~3
~Z
r+2
r-2 •
(5.2.10)
113
As the Type I distribution is the extended Beta distribution of the
first kind we fit this distribution by the use of incomplete BetaIntegrals.
Pearson's (1968) Tables of the Incomplete Beta-Function are
available for this purpose.
But use of these tables requires tri-
variate interpolation and involves heavy computations.
We, however,
note that if U has the central F-distribution with VI and V degrees
2
of freedom,where VI and V may be fractional, then Z given by
2
VI
Z = -
V
V-I
U(1
2
+ -!.
V
(5.2.11)
U)
2
has the Beta distribution of the first kind with parameters V /2 and
l
V
2
/Z.
The CFTDIS program of the Biostatistics Time Sharing Library
gives the percentile points of the F-distribution for various p and
vice versa for fractional degrees of freedom.
2(m +l»
2
and V .=. max (2(m +l), 2(m +l»
l
2
2
versa,if
~3
when
We take VI
~3
= min(2(ml +l),
is positive, and vice
is negative and letting ua. be the lOO(l-a.)th percentile
point of the F-distribution, we compute w , the graduated values of·
a.
the lOO(l-a.)th percentile point of the processes considered here, as
follows:
V
u", (1
Uo
w
a.
+ -!.
u )
V
a.
2
-1
= xa. + mode.
(5.2.12)
From the closeness between the percentile points of the observed distributions obtained through simulation and the corresponding graduated
values we may say that the fit was quite satisfactory.
In Table 5.2.1
114
we display the graduated values of the two distributions for seven
values of E and some selected values of a.
100(1-a)th percentiles of
We shall designate the
_1::
sup (t \let»~ and sup (t -~IW(t) /) by
E<t<l
E<t<l
+
Table 5.2.1. Simulated Values of Wa(E) and Wa(E) for Selected
Values of a and E
a
E
= .0lD
a
= .025
a
= .050
a
= .100
+
W (E)
W (E)
+
W (E)
W (E)
+
W (E)
Wa(E)
w+(E)
WaCE)
.001
3.36
3.53
3.06
3.29
2.80
3.07
2.50
2.82
.005
3.32
3.48
3.02
3.25
2.76
3.04
2.46
2.79
.01
3.28
3.46
2.98
3.23
2.72
3.01
2.42
2.76
.05
3.19
3.39
2.87
3.13
2.60
2.89
2.29
2.62
.10
3.e')
3.33
2.76
3.04
2.50
2.79
2.19
2.52
.15
3.03
3.26
2.72
2.98
2.45
2.74
2.13
2.47
.20
3.00
3.23
2.69
2.95
2.41
2.71
2.09
2.43
a
a
a
a
= l •••.• p~are
a
Sup (t -~IWi (t) IJw!i-e;re
E<t<l
p Independent Standard Wiener Processes
5.2.2. Simulated Distribution of
w.(t),
i
].
a
Max
l.s.i~
By the product law of probability
:(>{ Max
sup (t- ~ Iw. (t)
l<i2P E<t<l
].
I)
1::
< xl = [p{ sup (t-2IWi(t)
E<t<l
I)
< x}]
P
(5.2.13)
Therefore, the 100(1-a)th percentile point W(P)(E) of the distribution
a
considered here coincides with W ,(E) defined in Section 5.2.1 where
a'
=1
p
- II-a.
a
Through the first four moments of the corresponding
observed distributions we worked out the graduated percentile points
for selected values of E,a, and p.
These are presented in Table 5.2.2.
115
Table 5.2.2. Simulated Values of W(P)(E) for Selected
a
Values of p,a, and E
) ... 4
p = 3
P = 2
E
a=.OlO a=.050 a=.lOO a=.OlO a=.050 a=.lOO a=.OlO a=.050 a=.lOO
.001
3.69
3.28
3.06
3.78
3.40
3.19
3.84
3.47
3.28
.005
3.64
3.25
3.03
3.71
3.35
3.15
3.77
3.43
3.24
.01
3.61
3.22
3.00
3.69
3.35
3.13
3.74
3.40
3.21
.05
3.57
3.12
2.88
3.66
3.25
3.02
3.73
3.33
3.11
.10
3.53
3.03
2.79
3.64
3.17
2.93
3.71
3.25
3.03
.15
3.45
2.98
2.73
3.55
3.10
2.87
3.63
3.19
2.97
.20
3.41
2.95
2.70
3.51
3.07
2.84
3.58
3.16
2.94
1 ~
.
Sup (- 1. Wi2 (t» when Wi(t), i= 1, ••• ,p,
E<t<l t i =l
.
are p Independent Standard Wiener Processes
5.2.3. Simulated D:l,stribution of
Let [{X };=I' i = l, ••. ,p) be p random samples each of size n
ij
drawn independently from N(O,l). Then we can show that
Max {(k) -1 ·~W~ (~)}
En<k<n n
i=l~n n
~
sup (1:. rW 2 (t» ,
E<t<l t i=l i
(5.2.13)
where
..
Wi·.·. (~)
n n
.k
1
=1ii·
L. IX •• , k ... En, .•• ,n, i = 1, ••• ,po
n J=
'1·
(5.2.14)
~J
Uti1iJ11ng (5.2.B) we simulated the distributions of sup
.
E<t<l
<t
f. W~(t»
i=l
for seven values of e: and for three values of p (viz., 2,.3, 4)
through the following steps:
. (a) Generate 4 random samples of 1000 each from N(O,l) and
compute
116
k
(b) Slk
k
=)
J=l
Xlj' S2k
= j=l
L X2 ",
J
k
S3k
= j=l
L
k
X3J", and S4k = l X4J",
j=l .
En
~
~
n, E = .001, .005, .01, .05, .10, .15, .20;
En
~k ~
n, E = .001, .005 r .01, .05, .10, .15, .20;
k
E = .001, .005, .01, .05, .10, .15, .20;
(e) B(2) (E) =
Max ~3),
En<k<n
(f) Repeat (a) - (e) 1000 times and from the observed distributions
ofB(P)(E) compute (i) 900-th, 925-th,950-th, and 990-th
order statistics and (ii) mean, variance,
From the
al
~3' ~4'
Sl' and S2·
- S2 diagram of Pearson and Hartley (1962), we observed
that Pearson's Type I distribution would be adequate in all the cases.
The fit turned out to be satisfactory in each case.
The graduated
percentile points b(P)(E) were computed in the same manner as detailed
a.
in Section 5.2.1 and are exhibited in Table 5.2.3.
117
Table 5.2.3. Simulated Values of b(P) (E:) for Selected Values
a
of p,a, and E:
o = 2
E:
o = 3
J
= 4
a=.OlO cx=.050 a=.100 a=.010 a=.050 a=.100 a=.010 a=.050 a=.100
.001 .
15.56
12.45
10.88
18.67
15.15
13.39
20.61
17.09
15.31
.005
15.44
12.32
10.71
18.56
15.00
13.19
20.51
16.87
15.03
.01
15.40
12.19
10.54
18.40
14.81
12.99
20.31
16.67
14.82
.05
14.70
11.51
9.88
17.67
14.10
12.27
19.74
15.97
14.06
.10
14.47
11.18
9.50
17 .47
13.71
11.81
19.39
15.51
13.57
.15
14.03
10.84
9.20
16.83
13.27
11.43
18.75
15.05
13.16
.20
13.71
10.50
8.86
16.55
12.94
11.09
18.46
14.73
12.80
n- 1
p
n-1
L L
i = 1 k =0
w~ (k)
~n
n
II
D p
+
L . Wi2 (t)dt,
. 1
0
~=
where Win (kIn) is as defined in (5.2.14).
p
the ellll'irical distribution of .. ~ .
~-1
I
(5.Z.15)
1
2
W
we
Using (5.2.15)
obl:<ltl1eq
(t)dt through siDlu1at1ou for
O.
.
seven vf.l1u.es of e: and 3 values ofp as in Section 5.2.3. The various
steps involved in the derivation are:
(a) Gellerate 4 random samples of 1000 each fromN(O,l) and cOlllPu,te
k
.I Xi"J
.J=1
(b) Slk=
k
k
S2k = ) X2 ·, S3k = I X3k , s4k
J=l
J ."
j=I"
..
k = 1, ••• ,n-1, S'k = 0 for i = 1, ••• .
,4 .
and
k =0;
..
~
.
.
k
,
=lx
j=1 4k
e'
118
-2 n-l
I
( c) C. = n
(d) D(2)
S~k' i = 1,..., 4 ;
k=O
1.
=C +
1
C D(3) = D(2) + C D(4) = D(3) + C •
2'
3'
4'
(e) Repeat (a) -(d) 1000 times to obtain observed distributions
of D(P) and compute the order statistics, the first four
moments, and 13
The observed 13
1
and 13
2
1
and 13 ,as in Section 5.2.1 and Section 5.2.3.
2
of these distributions did not fall within the
range of the 13 1 - 13 2 diagram given in Pearson and Hartley (1962).
However, Johnson et al. (1963) have compiled percentile points of
Pearson curves in general in standard measure for extended values of
13
1
and 13 •
2
We made use of these tables to work out the graduated
percentile points of the empirical distributions for p = 3 and 4 by
bivariate interpolation.
Pearson's lype 1\,J).
These two distributions turned out to be
As the observed 13
and 13 of the simulated dis2
1
tribution for p = 2 were outside the range of the tables compiled by
Johnson et al., we did not attempt graduation in this case.
In
Table 5.2.4 we present graduated percentile
points d(P)
for p = 3,4;
...
ex
the values relating to p = 2 are as they were obtained through simulation.
Table 5.2.4. Simulated Values of d(P) for Selected Values of p and ex
ex
ex
p = 2*
P = 3
.010
5.54
6.15
6.78
.050
3.36
4.23
4.97
.100
2.53
3.31
4.02
*Ungraduated.
p
= 4
119
5.3. Numerical Illustration for Chi-Square Tests under PCS
5.3.1. Generation of Data
For purposes of numerical
illustration~
with two samples and two batches.
we have considered a BAM
For the sake of clarification, the
two samples may be conceived of as control and treatment groups in a
contraceptive effectiveness study with the subjects entering into the
experiment in batches.
The response categories are conception times
recorded in one-month intervals starting from the time-points of the
individuals' entry into the trial.
From time and cost considerations,
the study may not be continued beyond a time-point T.
Allowing for
seasonal variation in reproductive behavior, the batches within the
same sample may be assumed to differ in respect of conception time
distribution.
Keeping this in view, two samples of 700 each were
generated from exponential distributions with mean survival times
(i) Al
= A2 = 16.0
and (iii)
~l
months, (ii) Al
= 16.0
months, A
2
=
= 16.0
months, A
2
20.0 months.
= 18.0
months
The size of the first
batch within each sample was 500; the remaining 200 individuals belonged
to the second batch.
It was assumed that the experiment was observed
fora period of 24 months after inception.
With the further assumption
that the second batch entered into the study one year after the first,
the failures in the two batches were recorded in 24 month-intervals
(categories) for the first batch and in 12 month-intervals for the
second batch.
The last categories of the two batches (25th and 13th
respectively) included the censored individuals.
120
5.3.2. Computation of the Statistics
In order to derive the test statistics A, B, C, and D, we started
with unspecified model, allowing for inter-batch difference in probability structure.
Under the null hypothesis of no difference between
A
A
the survival time distribution of the two samples, X~ and jX~ were
A
computed by plugging in the maximum likelihood estimators TIij,t
derived separately from the two batches, in (2.3.2).
It is important
A
A
to note that in this unspecified model situation, both {X~} and {jX~}
form strictly non-decreasing sequence in c by virtue of Lemma 2.4.1.
The asymptotic powers and mean stopping times for the four tests were
derived empirically by repeating the experiment 500 times.
The results
are summarized in the next section.
5.3.3. EmpL:-ical l?owers and Mean Stopping Times
By the Lormulae of Section
2.4~
we can at once derive the degrees
of freedom associated with the various tests.
,.
The degrees of freedom of
A
24, f
the overall test,. X2 are 36. We similarly obtain lf
24 =
'2 12
24
d = 1, c = 1, ••• ,12; d = 2, c = 12, .•• ,24; .d = 1, 'VI c and V j.
c
c
J c
= 12;
Next, the test statistics were compared against respective percentile
points of the chi-square distributions (central), in order to compute
empirical powers and empirical mean stopping times associated with the
four tests.
The results from the 500 repetitions are demonstrated in
the table below.
121
table 5.3.1. Empirical Powers and Mean Stopping Times for the Four
Tests and Two Values of ~
A
Characteristics *
C
B
D
~=.10
~=.05
~=.10
~=.05
~=.10
~=.O5
~=.10
~=.O5
.09
.05
.09
.03
.10
.05
.08
.03
23.87
23.97
22.90
23.56
23.79
23.91
23.27
23.73
.18
.10
.12
.08
.17
.09
.12
.07
23.66
23.84
22.36
22.89
23.63
23.82
22.65
23.21
.48
.35
.28
.18
.41
.28
.24
.14
22.56
23.09
20.14
21.47
22.38
23.01
21.01
22.27
Case (a)
Power
Mean Stopping
Time
Case (b)
Power
Mean Stopping
Time
Case (c)
Power
Mean Stopping
Time
*Case
(a) :
Al = 16.0, A2 = 16.0.
Case (b) :
Al = 16.0, A2 = 18.0.
Case (c) :
Al = 16.0, A2 = 20.0.
From the above the following salient points emerge:
(i) All the tests are consistent •
. (ii) Type A is the most powerful test; next comes Type C; D is the
least powerful ·of all.
(iii) If we assume that each time interval costs sunits of cost,
then the expected cost for each test (apart from overhead
cost) is s x expected stopping time.
From this considera-
tion, Type B is the most economical test; D;, C, A come
122
in that order so far as this index of efficiency is concerned.
However, these differences seem rather small.
(iv) The efficacy of the PCS test procedures becomes more and more
pronounced as the difference between the two samples gets
larger and larger.
5.3.4. Additional Comments
Note that we could have derived an overall chi-square statistic
with 48 degrees of freedom for the Type A test, by imposing the
restriction on the model that the two batches within the same sample
have the same probability structure.
Type C and Type D tests are more
flexible in the sense that we need not know beforehand the time-points
of entry of different batches as in Type A and Type B tests.
On the
other hand, Type C and Type D tests do not utilize this inter-batch
structure.
Among all the tests, Type A test only preserves the power
of the overall test.
In all the other tests power is affected, in fact
reduced, since we increase the critical level of the overall test.
5.4. Numerical Illustration for Two-Sample Rank Order Tests with
Grouped Data Under PCS
5.4.1. Generation of Data
We generated 3 sets of two samples from exponential populations
exactly in the same manner as described in Section 5.3.1.
The only
exception is that the second batch was dropped from the analysis, i.e.,
we worked with ,two single-batch samples each of size 500.
As before,
there were 24 class intervals corresponding to the 24 months of conception,with the 25th category consisting of the censored subjects.
123
5.4.2. Computation of the Statistics
The two-sample location problem we considered is a particular case
of the model (3.2.1) with cVJ..'s taking on . the values -1 for the first
sample and 1 for the second sample.
The regression coefficient
S
stands for the difference between the location parameters of the two
samples.
We worked with the Wilcoxon scores' ~(u)
for purposes of test construction.
=
2u - 1, 0 ~ u ~ 1
In that case S"
v,c
and veS
v,c !f)
V
defined in (3.4.4) and (3.5.20) take the forms:
+ b.
vc
(n
lc
-n
2c
)
n*
+ Fv ,c(n*
l ,c+l- 2,c+l)'
+ ~2 N
VC Vc
*v,c
+ Fv,c N"+ll,
c
= 1, ••• ,24,
(5.4.1)
where b.
Vj
A*
= FV,j_l + FV,j ..,.1, b.V,c+l = Fv,c' n ij is the observed
frequency of the j-th category from the i-th sample, n *
i,c+l
= n
*l,c+l
+ n*
•
2,c+l
25
=
I
n
j=c+1
ij ,
*
A
NVj ' FV,j' ~j' Nv,c+l'
A*
and,b.v,c+l are defined by (3.3.6), (3.3.7), (3.3.11), (3.4.1), and
(3.4.3) ,respectively.
We have considered three experimental situations,
viz., (a) Al = 16.0, A2 = 16.0, (b) Al = 16.0, A = 18.0, and
2
(c) Al
=0
16.0, A
2
= 20.0.
Case (a) represents the null situation
whereas the other two were chosen for the computation of empirical
powers of the tests.
case,
From the 2x25 contingency tables relating to each
F;v(t~V) defined by (3.4.8) was computed for every c
=0
1, ••• ,24.
-1/16
For the null case, the probability of the first cell is 1 - e '
= .06
124
and for the other two cases it is not less than .05.
1 -
(1 - .05)
samples,
tiV )
3
~
3 x .05
= .15.
But
Since we were concerned with large
defined after (3.4.8) should be at least as large as .15
(in probability).
Therefore, the appropriate value of s defined in
Section 3.6 was taken to be .15.
We carried out PCS tests (both one-
sided and two-sided) against the critical values w+(.15) and w (.15)
a
a
for two values of a (.05 and .10) recorded from Table 5.2.1. We
computed the empirical powers and mean stopping times of the tests
through 500 repetitions of each experimental situation.
The results
are summarized in Table 5.4.1.
Table 5.4.1. Empirical Powers and Mean Stopping Times for PCS Rank
Order Tests with Grouped Data
One-Sided
Characteristics *
Two-Sided
a=.10
a=.05
a=.10
a=.05
.05
.03
.07
.04
23.05
23.58
22.63
23.32
.37
.28
.27
.19
17.58
19.48
19.60
20.91
.79
.71
.70
.62
10.25
12.41
12.53
14.44
Case (a)
Power
Mean Stopping Time
Case (b)
Power
Mean Stopping Time
Case (c)
Power
Mean Stopping Time
*Case
(a) :
Al
= 16.0, A2 = 16.0.
Case (b) :
Al
= 16.0,
Case (c) :
Al
= 16.0, A2 = 20.0.
A
2
= 18.0.
125
The empirical powers of the tests for case (a) suggest that the tests
are conservative,as we expected from the theoretical derivations of
Section 3.6.
However, the tests appear to be quite powerful when the
null hypothesis is not true.
From mean stopping time considerations,
considerable saving in time is expected as the difference between the
locations of the two samples is widened.
In case (c) the experimenter
may be able to come to a decision before the experiment even completes
half of the contemplated period of two years.
5.5. Comparison Between the PCS Chi-Square Tests and PCS Rank
Order Tests with Grouped Data
With regard to the negative exponential survival time distributions
which one often encounters in life testing problems, we observe from
Section 5.3 and Section 5.4 that, from both power and mean stopping
time considerations, the rank order tests (two-sided) are much superior
to the chi-square tests.
However, even for a moderately large number
of categories, the former tests are quite conservative.
We may
recommend chi-square tests for their simplicity of application
particularly when comparatively small number of categories are available.
Otherwise, for the type of distributions we are considering
here, the rank order tests should be preferred.
There may be situations
when for the first few cells the observed frequencies are small.
If we COmmence the PCS test from the first cell,
e: will be small.
On
the other hand, lumping of the first few cells with small frequencies
will result in reduction of the number of class intervals.
In either
case, the powers of the rank order tests will be affected.
Consequently,
it may not be possible to make clear-cut recommendations covering all
types of situations.
~
CHAPTER VI
SUGGESTIONS FOR FURTHER RESEARCH AND
SCOPE FOR APPLICATIONS
6.1. Introduction
In this chapter we present a few suggestions for future research
by proposing generalizations of the PCS rank order tests to the batcharrival model (BAM).
We also indicate possible extensions of the PCS
test procedures developed in Chapters II, III, and IV to the multivariate situation.
In the concluding section, we discuss scope for
applications of the PCS tests to life problems.
6.2. Extension of Rank Order PCS Tests to BAM
In developing the PCS rank order tests in Chapters III and IV, we
have not considered the situation of staggering entry as we have done
in Chapter II.
We shall briefly discuss rank order PCS tests for BAM
with ungrouped data for the simple regression model by drawing upon the
basic ideas and concepts from Sen (1976).
Let n subjects enter into the experiment simultaneously.
Then at
the k-th failure, we compute the censored linear rank statistic
Tn, k' k
= 1 , ... , n-l ,
defined by (4.2.12) i f we work with an unrestricted
design in the sense that we do not set out to plan beforehand as to the
maximum duration of experimentation.
sequence {T k:
n,
k = 1, ••• ,n-l}.
We base the PCS tests on the
127
Suppose now that the subjects enter the experiment at different
points of time.
Then the cumulative sample size obtaining at any time-
point t is a monotonically non-decreasing function of t.
In order to
develop pes test procedures for this staggering entry situation, we
shall require to work with the double-array {T
k:
q, q
q = l, ••• ,n} of censored linear rank statistics.
1 ~ k
q
~ q-l;
Note that q takes on
the values l, ... ,n if the time-point of entry is distinct for every
individual; on the other hand, if the individuals enter the experiment
in
~
batches at the time-points
assumes the values n
, ••• ,n
t
1
size at the time-point t
£
t~
tl,···,t~
(t
(= n) where n
[t ,t + ), i
i i 1
t
l
< t z < ••• <
t~),
q
is the cumulative sample
i
= 1, ••• ,~-1.
A pictorial
representation of the batch-arrival model is given in Figure 6.Z.1.
4It
Figure 6.2.1. Staggering Entry with the Subjects Arriving in Batches
(Total Sample Size)
n
- - - - - - - - - -
n
~~---
t
'
i
-
",
t
h
I
,
I
I
I
I
i
I
I
I
't
h
~;
'&(
Time of Observation (t)
Suppose censoring takes place at the time-point t
sample size obtaining at this time-point is n
t
j
and the cumulative
where h
h
~
j and
,
128
h
= 1"",~,
The cohort which has been admitted into the experiment at
t h are observed for a time-period t
j
- t
h
,
For every failure from the
commencement of the experiment up to the time-point t., we trace back
J
Following Sen (1976),
the respective cohort and record the failure time.
we construct a two-dimensional time-parameter empirical process on
n2 [O,11 based on the double-array {Tq k:
1 < k
, q
< q-1; q
q -
= nt
, •••
1
,nt.
Sen (1976) has proved that, under appropriate regularity conditions,
this process converges in distribution to a standard Brownian sheet
on the unit square 1 2 •
If we work with a restricted design, we shall
concern ourselves with processes over part of the unit square.
One may similarly construct double-array of censored rank order
statistics with BAM and extend the results of Sen (1976) to the grouped
data and the multiple regression situations.
6.3. Some Multivariate Extensions of the PCS Tests
6.3.1. Early Commencement with PCS Chi-Square
Multivariate Tests
As indicated in Section 2.6, the PCS chi-square tests are applicable
even when c is a vector in an rXc classification,
stretch out elements of cin a row.
We need only to
But at each time-point of censoring
we should have complete inforination on the categories other than'the
main censoring (time-sequential) variable.
For example, in a study on
the use of different contraceptives, one may like to know various
secondary effects resulting from their use.
The possibility of an
early commencement of the test procedures with the general model
situation, has also been mentioned with lower dimensional representation of discrete multivariate distribution (by a lower dimensional
129
representation we mean a probability model with smaller number of
parameters).
We briefly discuss Bahadur's (1961) representation of the
joint distribution of responses to n dichotomous items in the next
paragraph.
Let X denote the set of all points x = (xl' .•• ,x ) with each
n
xi
=0
or 1.
Let p(x) be a given probability distribution on X such
that p(x) ~ 0 for each x and
<:X..
1.
I p(x) = I,
xe:X
For each i = l, ••• ,n, let
=P{ x.=l} or equivalently
1.
<:X..
1.
= Ep (xi)' 0 <
<:X..
1.
< 1, i = l, •.• ,n,
where E denotes expected value whan p obtains.
p
(6.3.1)
Next, setting
(6.3.2)
define
r ..
1.J
= EP (z.z.),
1. J
i < j,
Let P[l](xl, •.• ,x ) denote the joint probability distribution of the
n
x. 's when (a) the x. 's are independently distributed, and (b) they
1.
1.
have the same marginal distributions as under the given distribution
p.
In other words,
(6.3.5)
p(x) = P[l](x)ef(x),
(6.3.6)
130
where f(x) =1 +
L r ~J.. ziz.J
+ ••• +r
i<j
. nZlz2 ••• zn.
12 ...
the following classification of distributions:
This suggests
p is of first order
n
if all 2 - n - 1 correlations are zero ,i.e., if the x.' s are
~
independent; p is of m-th order if all correlations of order exceeding
m are zero.
We shall usually concern ourselves with multinomial variables
which will obviously require extension of the Bahadur representation.
6.3.2. PCS Tests with Multivariate Survival
Time Distributions
Though our main interest may lie in the study of survival time
distribution of one particular variable, the cohort under investigation
may simultaneously be subjected to more than one source of depletion.
In clinical trials, for example, cancer patients under observation
may die of heart failure.
In a longitudinal demographic study, in
addition to contraceptive failures, there may be losses to follow-up
and withdrawals.
As we may visualize the various sources vying with
each other, the methods of analyzing such data are termed as competing
risk analysis.
On generalization of exponential distribution to
multivariate failure, we may refer to Cox (1959), Freund (1961), and
Marshall and Olkin (1967).
On some characterizations of multivariate
survival time distributions, see Basu (1971), Puri an(l Rubin (1974)
and Johnson and Kotz (1975).
The paper by Cornfield (1957} gives a good
elementary account of the competing risk problem.
Chiang (1961)
considers the problem of competing risk from stochastic process
approach.
(1971).
We also refer to David (1970) and Moeschberger and David
Gail (1975) presents a good review and attempts to synthesize
various results of competing risk analysis.
Elandt-Johnson (1975)
131
has extended some results of Sethuraman (1965) and David (1970).
formulate our problem in the next paragraph.
We
We adopt the basic
notations and concepts from Gail (1975) and Elandt-Johnson (1975).
Suppose a population is subject to k causes of depletion
cl' ••• ,c ' and suppose each individual is characterized by a
k
sponding random vector variable
!'
=
corre~
(Tl, ••• ,T ) which correspond to
k
the times of his failure from cl, ••• ,c
k
respectively.
We introduce
the Joint failure time distribution
k
(1
F( t , ... , t ) "= p{
(T. < t.)},
l
k
~=l
~
~
(6.3.7)
and the corresponding joint survival function
k
S(tl,···,t k ) = p{(\ (T i > ti)}'
(6.3.8)
~=l
Note that S(O,O, ••• ,O)
=1
and 8(00,00, ••• ,00)
= 0.
and the cause of depletion c. are observable.
~
Only Y
= min(Tl, ••• ,Tk )
In other words, for the
j-th individual we observe (Y.,o.) where 0. is a k-component vector
J -J
-J
with 1 as the i-th element if the failure is from c., the rest of the
~
elements being zeros.
In order to develop progressive censoring tests,
we shall concern ourselves with the distribution of (yj,o.).
-J
6.4. Scope for Applications
In the clinical trials of the Lipids project, with which the
Department of Biostatistics, University of North Carolina, is associated,patients having
hig~
cholesterol level in their blood are
randomly allocated to the control and the treatment groups.
These
two groups are to be kept under observation for several years.
The
patients of the control group are put on diet and placebo and those
132
of the treatment group are put on diet and medication for keeping the
level of cholesterol down.
The purpose is to ascertain whether
medication in addition to diet,raises survival times of the patients.
We may profitably use appropriate progressively censored tests to this
time-sequential experiment.
If the failures are recorded, for example,
in one-month intervals, we may either use PCS chi-square tests or PCS
rank order tests for grouped data.
If the number of failures is few
in each class interval we may apply the PCS linear rank tests for
ungrouped data as an approximation.
The suitability of the PCS tests
in this context, cannot be overstated.
This is particularly important,
since, if cumulative evidence from continuous monitoring shows significantly higher survival times for the treatment group, we should
stop experimentation at the earliest and recommend use of the treatment
without delay.
Suppose a country is urgently trying to implement population
control program.
The government may need specific recommendation as
to the most effective method of family planning through comparison of
a number of alternative methods in a longitudinal demographic survey.
The importance of the PCS tests is obvious from consideration of time.
In a cancer therapy study, the patients are randomly divided into
groups, each put on a different therapy.
We are called upon to compare
the distributions of remission times of the various groups and recommend
use of the best therapy from the findings of the experiment.
Ethical
and humanitarian considerations demand that we reach a decision at the .
earliest.
Again we need not emphasize the importance of the PCS tests
in such circumstances.
APPENDIX
ON DERIVATION OF EMPIRICAL DISTRIBUTIONS OF SOME FUNCTIONALS
OF THE STANDARD WIENER PROCESS
A.1. Program for Derivation of Empirical Distributions
_1·
of
Sup (t ~(t»
E<t<l
. _1
and
Sup (t ~IW(t)l)
E<t<l
00010 C GENERATION OF FUNCTIONALS OF WIENER PROCESS
00020
DIMENSION K(7) ,X(lOOO) ,S(1000) ,W(1000) ,WI(1000) , XSTAT (7 ,1000),
00030
1YSTAT(7,1000),XMEAN(7),YMEAN(7),XMU2(7),XMU3(7),XMU4(7),
00040
2YMU2(7),YMU3(7),YMU4(7),XBETA1(7),XBETA2(7),YBETA1(7),YBETA2(7)
00050
READ(1,555) IX,N1,N2
00060 555 FORMAT(I9,2I4)
00070
N22=N2-1
00080
TN=N2
00090
K(l)=l
00100
K(2)=5
00110
K(3)=10
00120
K(4)=50
00130
K(5)=100
00140
K(6)=150
K(7)=200
00160
NSTAT=4
00170
DO 50 I=1,N1
DO 10IK=1,N2
00180
CALL GAUSS(IX,1.0,0.0,Y)
00190
00200 10 X(IK)=Y
00210
DO 11 JP=l,NSTAT
00220
DO 9 IS=1,N1
S(IS)=O.O
00230 9
00240
IK=K(JP)
MIN=IK
00250
MINN=MIN+1
00260
00270
DO 12 II=I,IK
TKN=FLOAT(IK)/TN
00280
00290 12 S(IK)=S(IK)+X(II)
W(IK)=S(IK)/SQRT(TN)
00300
W(IK)=W(IK)/SQRT(TKN)
00310
00320
SGN=-l.O
IF(W(IK» 17,18,18
00330 .
·00340 17 WI (IK)=SGN*W(IK)
GO TO 14
00350
00360 18 WI (IK)=W(IK)
134
,e
00370 14
00380
00390
00400
00410
00420
00430
00440
00450 19
00460
00470 20
00480
00490 15
00500
00510
00520
0053032
00540 31
00550
00560 34
0057033
00580 30
. 00590
00600
00610 11
00620 50
00630
00640
00650
00660
00670
00680
00690
00700
00710 63
00720
00730 65
00740
00750 66
00760
00770 60
00780 59
00790
00800
00810
00820
00830
00840
00850
00860
IK=IK+l
IF(IK.GT.N2) GO TO 15
TKN=FLOAT(IK)/TN
S(IK)=S(IK-l)+X(IK)
W(IK)=S(IK)/SQRT(TN)
W(IK)=W(IK)/SQRT(TKN)
SGN=-l.O
IF(W('IK» 19',20,20
Wl{IK)=SGN*W(IK)
GO TO 14
Wl(IK)=W(IK)
GO TO 14
WMAX=W(MIN)
WlMAX=Wl (MIN)
DO 30 L=MINN,N2
IF(W(L)-WMAX) 31,31,32
WMAx=W(L)
WMAX=WMAX
IF(Wl(L)-WlMAX) 33,33,34
WlMAX=Wl(L)
WlMAX=WlMAX
CONTINUE
XSTAT(JP,I)=WMAX
YSTAT(JP,I)=WlMAX
CONTINUE
CONTINUE
DO 59LM=I,NSTAT
DO 60 I=I,N22
DO 60 J=I,N2
IF(LEQ.J) GO TO 60
XM=XSTAT(IM,I)
YM=YSTAT(IM,I)
IF(XSTAT(IM,I).GT.XSTAT(IM,J» GO TO 63
GO TO 65
XSTAT (1M, I) =XSTAT (IM,J)
XSTAT(IM,J)=XM
IF (YSTAT(IM,I) .GT.YSTAT(IM,J» GO TO 66
GO TO 60
YSTAT(IM,I)=YSTAT(IM,J)
YSTAT (IM,J) =YM
CONTINUE
CONTINUE
DO 77 M-l,NSTAT
PRINT 150,M
PRINT 200,XSTAT(M,I),XSTAT(M,5),XSTAT(M,10),XST.l\T(M,25),
iXSTAT(M,50) , XSTAT (M, 100) ,XSTAT(M,200) , XSTAT (M, 300) ,
2XSTAT(M,400),XSTAT(M,500) ,XSTAT(M,600) , XSTAT(M, 700),
3XSTAT(M,800),XSTAT(M,900),XSTAT(M,950},XSTAT(M;975),
4XSTAT(M,990),XSTAT(M,995),XSTAT(M,999),XSTAT(M,1000)
PRINT 300,YSTAT(M,I),YSTAT(M,5),YSTAT(M,10),YSTAT(M,25),
135
00870
00880
00890
00900
00910
00920
00930
00940
00950
00960
00970
00980
00990
01000
01010
01020
01030
01040
01050
01060
01070
01080
01090
. 01100
01110
01120
01130
01140
01150
01160
01170
00180
01190
01200
01210
01220
01230
01240
01250
01260
01270
01280
01290
01300
01310
01320
01330
01340
01350
01360
01365
01370
lYSTAT(M,50),
2YSTAT(M,100),YSTAT(M,200),YSTAT(M,300),YSTAT(M,400),
3YSTAT(M,500),YSTAT(M,600),YSTAT(m,700),YSTAT(M,800),
4YSTAT(M,900),YSTAT(M,950),YSTAT(M,975),YSTAT(M,990),
5YSTAT(M,995),YSTAT(M,999),YSTAT(M,1000)
150 FORMAT (/10X, 'LEVEL OF K/N=' ,12)
200 FORMAT(//10X,'FIRST SET :'/20X,10F8.4/20X,10F8.4)
300 FORMAT (//10X, 'SECOND SET :'/21X,10F8.4/21X,10F8.4)
77 CONTINUE
.
DO 82 I=l,NSTAT
XMEAN(I)=O.
YMEAN(I)=O.
DO 80 J=1,N1
YMEAN(I)=YMEAN(I)+YSTAT(I,J)
C
XMEAN(I)=XMEAN(I)+XSTAT(I,J)
80 CONTINUE
XMEAN(I)=XMEAN(I)/N1
YMEAN(I)=YMEAN(I)/N1
82 CONTINUE
DO 90 I=l,NSTAT
XMU2(I)=0.
XMU3(I)=0.
XMU4(I)=0 •
YMU2(I)=0.
YMU3(I)=0.
YMU4(I)=0.
DO 95 J=1,N1
XMU2(I)=XMU2(I)+(XSTAT(I,J)-XMEAN(I»**2
YMU2(I)=YMU2(I)+(YSTAT(I,J)-YMEAN(I»**2
XMU3(I)=XMU3(I)+(XSTAT(I,J)-XMEAN(I»**3
YMU3(I)=YMU3(I)+(YSTAT(I,J)-YMEAN(I»**3
XMU4(I)=XMU4(I)+(XSTAT(I,J)-XMEAN(I»**4
YMU4(I)=YMU4(I)+(YSTAT(I,J)-YMEAN(I»**4
95 CONTINUE
XMU2(I)=XMU2(I)/N1
XMU3(I)=XMU3(I)/N1
XMU4(I) =XMU4 (I)/N1
YMU2(I)=YMU2(I)/N1
YMU3(I)=YMU3(I)/N1
YMU4(I)=YMU4(T)/N1
XBETA1(I)=(XMU3(I)**2)/(XMU2(I)**3)
YBETA1(I)=(YMU3(I) **2) /(YMU2 (I) **3)
XBETA2(I)=XMU4(I)/(XMU2(I)**2)
~BETA2(I)=YMU4(I)/(YMU2(I)**2)
90
CONTINUE
DO 100 I=l,NSTAT
PRINT 150,1
PRINT 400,XMEAN(I),XMU2(I),XMU3(I),XMU4(I),
1XBETA1(I),XBETA2(I)
C
PRINT 410,YMEAN(I),YMU2(I),YMU3(I),YMU4(I),
136
01380
c':lYBETA1(I),YBETA2(I)
01390 100 CONTINUE
01400 400 FORMAT (/lOX, 'FIRST SET :' ,5X, 'MEAN=',F10.6,2X,
01410
l' jVARIANCE=' ,F10.6/26X, 'MOMENT3=' ,FlO.6,2X, 'MOMENT4=',
01420
2F10.6,2X,/26X,'BETA1=' ,F10.6,'; BETA2=' ,FlO.6)
01430 410 FORMAT (/IOX, 'SECOND SET :' ,5X, 'MEAN= ,. ,FlO.6' ,'2X,
01440
l' ;VARIANCE=' ,F10.6/26X, 'MOMENT3=' ,F10.6,2X,'MOMENT4=',
01450
2FIO.6,2X/2oX,'BETA1=',F10.6,';BETA2=',F10.6)
01460
STOP
01470
END
BIBLIOGRAPHY
Alling, D. W. (1963). Early decision in the Wilcoxon two.-sample
test. Journal of the American Statistical Association, 58:
713-720.
Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of
certain goodness of fit criteria based on stochastic processes.
Annals of Mathematical Statistics, 23: 193-212.
Armitage, P. (1957).
44: 9-26.
Restricted sequential procedures.
Biometrika,
Bahadur, R. R. (1961). A representation of the joint distribution of
responses to n dichotomous items. In Soloman, H. (Editor).
Studies in Item Analysis and Prediction, 158-168 •. Stanford:
Stanford University Press.
Basu, A. P. (1967a). On the large sample properties of a generalized
Wilcoxon-Mann-Whitney statistic. Annals of Mathematical Statistics,
38: 905-15.
Basu, A. P. (1967b). On k-sample rank tests for censored data.
Annals of Mathematical Statistics, 38: 1520-1535.
Basu, A. P. (1971). Bivariate failure rate.
Statistical Association, 66: 103-104.
Journal of the American
Billingsley, P. (1968). Convergence of Probability Measures.
York: John Wi.ley and Sons.
New
Birch, M. W. (1964). A new proof of the Fisher-Pearson theorem.
Annals of Mathematical Statistics, 35: 817-824.
Breslow, N. (1970). A generalized Kruskal-Wallis test for comparing
k samples subject to unequal patterns of censorship. Biometrika,
57:579-594.
Breslow, N. (1974).
Biometrics, 30:
Covariance Analysis of censored survival data.
89;...99.
Brown, B. M. (1971). Martingale central limit theorems.
Mathematical Statistics, 42: 59-66 •.
Annals of
138
Chatterjee, S. K. and Sen, P. K. (1973). Nonparametrictesting
under progressive censoring. Calcutta Statistical Association
Bulletin, 22: 13-50.
Chiang, C. L. (1961). On the probability of death from specific
causes in the presence of competing risks. In Le Cam, L. M.
et ale (Editors). Proceedings of the Fourth Berkley Symposium
on Mathematical Statistics and Probability, 169-180. Berkley:
University of California Press.
Cornfield, J. (1957). The estimation of the probability of developing
a disease in the presence of competing risks. AmericanJournal
of Public Health,!L: 601~607.
Cox, D. R. (1959). The analysis of exponentially distributed lifetimes with two types of failure. Journal of the Royal Statistical
Society - Series B, 21: 411-421.
Cox, D. R. (1972). Regression models and life tables. Journal of
the Royal Statistical Society - Series B, 34: 187-220.
Cramer, H. (1946). Mathematical Methods of Statistics.
Princeton University Press.
Princeton:
Darling, D. A. (1957). The Kolmogorov-Smirnov, Cramer-Von Mises
tests. Annals of Mathematical Statistics, 28: 823-828.
David, H. A. (1970). On Chiang's proportionality assumption in the
theory of competing risks. Biometrics, 26: 336-339.
Diamond, E. L. (1963). The limiting power of categorical data chisquare tests analogous to normal analysis of variance. Annals of
Mathematical Statistics, 34: 1432-1441.
Durbin, J. (1973). Distribution Theory for Tests Based on the Sample
Distribution Function. (Monograph). Philadelphia: Society for
Industrial and Applied Mathematics.
Elandt-Johnson, R. C. (1975). Conditional failure time distributions
under competing risk theory with dependent failure times and
proportional hazard rate. Institute of Statistics Mime 0 Series
No. 1015. Chapel Hill: University of NorthCar~lina.
Elderton, W. P. and Johnson, N. L. (1969). Systems of Frequency
Curves. Cambridge: University of Cambridge Press.
Epstein, B. (1954). Truncated life tests in the exponential case.
Annals of Mathematical Statistics, 25: 555-564.
Freund,R. J. (1961). A bivariate extension of the exponential
distribution. Journal of the American Statistical Association,
56: 971-977 •
139
Gail, M. (1975). A review and critique of some models used in competing
risk analysis. Biometrics, 31: 209-222.
Gastwirth, J. (1965). Asymptotically most powerful rank order tests
for the two-sample problem with censored data. Annals of Mathematical Statistics, 36: 1243-1247.
Gehan, E. A.
(1965a). A generalized Wilcoxon test for comparing
singly-censored samples. Biometrika, 52: 203-223.
arbitrari~y
Gehan, E. A. (1965b). A generalized two-sample Wilcoxon test for
doubly-censored data. Biometrika, 52: 650-653.
Ghosh, M. (1973). On a class of asymptotically optimal nonparametric
tests for grouped data I. Annals of the Institute of Statistical
Mathematics, 25: 91-108.
Hajek, J. (1961). Some extensions of the Wald-Wolfowitz-Noether
theorem. Annals of Mathematical Statistics, 32: 506-523.
Hajek, J. (1962). Asymptotically most powerful rank order tests.
Annals of Mathematical Statistics, 33: 1124-1147.
..
y
..
Hajek,
J. and Sidak,
Z.
Academic Press.
)
(1967.
Theory of Rank Tests.
New York:
Halperin, M. (1960). Extension of the Wilcoxon-Mann-Whitney test to
samples censored at the same fixed point. Journal of the .
American Statistical Association, 55: 125-138.
Halperin, M. and Ware, J. (1974). Early decision in a censored twosample test for accumulating survival data. Journal of the
American Statistical Association, 69: 414-422.
Johnson, N. L. and Kotz, S. (1975). A vector multivariate hazard
rate. Journal of Multivariate Analysis, i: 53-66.
Johnson, N. L., Nixon, E., and Amos, D. E. (1963). Table of percentage
points of Pearson curves, for given IiSl and a2' expressed in
standard measure. Biometrika, 50: 459-498.
Kiefer, J. (1959). K-sample analogues of the Kolmogorov-Smirnov and
Cramer-Von Mises tests. Annals of Mathematical Statistics, 30:
420-447.
Mantel, N. (1963). Chi-square tests with one degree of freedom:
extensions of the Mantel-Haenszel procedure •. Journal of the
American Statistical Association, 58: 690-700.
140
Mantel, N. (1966). Evaluation of survival data and two new rank order
statistics arising in its consideration. Cancer Chemotherapy
Reports, 50: 163-170.
Mantel, N. and Haenszel, W. (1959). Statistical aspects of the
analysis of data from retrospective studies of disease. Journal
of National Cancer Institute, 22: 719-748.
Marshall, A. W. and aIkin, I. (1967). A multivariate exponential
distribution. Journal of the American Statistical Association,
62: 30-44.
Mitra, S. K. (1958).
chi-square test.
1233.
On the limiting power function of the frequency
Annals of Mathematical Statistics, 29: 1227-
Moeschberger, M. L. and David, H. A. (1971). Life tests under competing causes of failure and the theory of competing risks.
Biometrics, 1l: 909-933.
Neyman, J. (1949). Contribution to the theory of the ·X 2 test.
Proceedings of the First Berkley Symposium on Mathematical Statisticsa.nd Probability, 239-273. Berkley: University of California
Press.
Pearson, E. S. a~d Hartley, H. O. (1962). (Editors). Biometrika.
Tables for Statisticians -VoL 1. Cambridge: University of
Cambridge Press.
Pearson, K. (1968). (Editor). Tables of the Incomplete Beta-Function.
(Second Edition). Cambridge: University of Cambridge Press.
Puri, P. and Rubin, H. (1974). On a characterization of the family
of distributions with constant multivariate failure rates.
Annals of Probability, l: 738-740.
Rao, C. R. (1973). Linear Statistical Inference and Its Applications.
(Second edition). New York: John Wiley and Sons.
Searle, S. R.
(1971).
Linear Models.
New York:
John Wiley and Sons.
Sen, P. K. (1967). Asymptotically most powerful rank order tests for
grouped data. Annals of Mathematical Statistics, 38: 1229-1239.
Sen, P. K. (1976). A two-dimensional functional permutational
central limit theorem for linear rank statistics. Annals of
Probability, i: 13-26.
Sethuraman, J. (1965). On characterization of the three limiting
types of extreme. Sankhya - Series A, 12: 357~364.
141
Sugiura,N. and Otake,:M. (1973). Approximate distribution of the
maximum of c-l x2-statistics (2 X 2) derived from 2xc contingency
table. Communications in Statistics,.!: 9-16.
Sugiura, N. and Otake, M. (1974). An extension of Mantel-Haenszel
procedure to k 2x c contingency tables and the relation to the
logit model. (Unpublished) •
. Tucker, H. G. (1967).
Academic Press.
A Graduate Course in Probability.
New York:
Wa1d, A. (1943). Tests of statistical hypotheses concerning several
parameters when the number of observations is large. Transactions
of the American Mathematical Society, 54: 426-483.
e·
© Copyright 2026 Paperzz