Bangdiwala, S. I. (1980). "Sequential and Time-Sequential Procedures for Generalized Models."

SEQUENTIAL AND TIME-SEQUENTIAL PROCEDURES
FOR GENERALIZED MODELS
by
Shrikant I. Bangdiwala
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 1322
December 1980
A Dissertation submitted to the faculty of
The University of North Carolina at Chapel Hill
in partial fulfillment of the requirements for
the degree of Doctor of Philosophy in the Department of Biostatistics
Chapel Hill
1980
SHRIKANT I. BANGDIWALA. Sequential and Time-sequential Procedures for
Generalized Models (Under the direction of PRANAB KUMAR SEN.)
In statistical inference it is often desired to test a specified function of unknown parameters from an underlying distribution. Sequential procedures utilize information from the already collected observations and allow for a possible early termination of experimentation with a concurrent savings in time and cost. They also enable the investigator to prescribe bounds for both the Type I and Type II errors and therefore are preferred to fixed sample testing procedures when their implementation is possible from a practical standpoint. In the present work we develop suitable sequential testing procedures for functions of unknown parameters through extensions of results by Bartlett (1946) and Cox (1963), Sen and Ghosh (1971, 1974), Ghosh and Sen (1972, 1977) and Chatterjee and Sen (1973).
In the case when the form of the underlying distribution is known, the use of maximum likelihood techniques allows development of a suitable sequential test for a broad class of specified functions of interest for i.i.d. as well as time-dependent observations. The test procedure is based on a generalization of the Sequential Likelihood Ratio Test (SLRT) proposed by Bartlett (1946) and Cox (1963). Theoretical justification for the generalization of the SLRT suggested by Cox (1963) is provided. The theoretical Operating Characteristic (OC) and Average Sample Number (ASN) functions are derived for local alternatives by approximating the distribution of the test statistic with linear combinations of standard Brownian Motion processes for both the i.i.d. and time-dependent observation cases. Simulation studies were utilized to investigate the goodness of the asymptotic results in finite samples. It is shown that the two-sample comparison testing problem is a special case of our general sequential testing framework.
When the form of the underlying distribution is unknown, robust sequential tests based on suitable rank order statistics were developed for testing a function of multiple regression parameters for i.i.d. observations. The OC and ASN functions of the sequential test are examined under local alternatives. The problem of rank order estimation of unknown multiple regression parameters for time-dependent observations in a Progressively Censored Scheme (PCS) is discussed. The possible implementation of the proposed sequential testing procedure in compliance testing for air pollution monitoring is studied as an example. Some of the numerous applications of the sequential testing procedures are presented.
ACKNOWLEDGMENTS
It has been a very special privilege during this research to work
closely with my adviser, Professor Pranab Kumar Sen.
Professor Sen's
insight, encouragement, counsel and guidance in the completion of this
dissertation were invaluable and are gratefully acknowledged.
I also
wish to thank the other members of the committee, Professors R.L. Harris,
L.L. Kupper, D. Quade, M.J. Symons, and J.D. Taulbee, for their helpful
suggestions and ideas which went into this dissertation.
In addition, I wish to thank my parents Pushpa and Ishver Bangdiwala
as well as my brother Dweepkumar for their encouragement and understanding throughout my entire academic pursuit.
A warm thanks to the staff and students in the Department of Biostatistics who made me feel at home, especially to Rebecca A. Teeter for
her encouragement and support.
Finally, I would like to acknowledge the skillful typing of the
manuscript by Jean Harrison and the financial support provided by the
Department of Biostatistics and the National Institute of Environmental
Health Sciences grant No. 5-232-ES07018-03 that enabled me to pursue
this research.
SIB
TABLE OF CONTENTS

ACKNOWLEDGMENTS

Chapter

I.   INTRODUCTION AND REVIEW OF THE LITERATURE
     1.1  Introduction
     1.2  Parametric Sequential Procedures
     1.3  Nonparametric Sequential Procedures
     1.4  Time-sequential Problem
     1.5  Outline of Research Proposal

II.  A SEQUENTIAL LIKELIHOOD RATIO TEST FOR GENERAL FUNCTIONS
     2.1  Introduction
     2.2  Preliminary Notions and Proposed Test
          2.2.1  Preliminary Notions
          2.2.2  Development of Test Statistics
          2.2.3  Test Procedure
     2.3  Termination Probability of Test Procedure
     2.4  Operating Characteristic Function
     2.5  Average Sample Number Function
     2.6  Example of an ARE Calculation
          2.6.1  Background
          2.6.2  Wald's Binomial Sequential Test (WBST)
          2.6.3  Generalized Sequential Likelihood Ratio Test (GSLRT)
          2.6.4  Operating Characteristic Functions for Local Alternatives
          2.6.5  Asymptotic Relative Efficiency (ARE)
     2.7  Two-sample Problem-Comparison of Distributions
          2.7.1  Introduction
          2.7.2  Preliminary Notions and the Proposed Test
          2.7.3  Examples of Applications
     2.8  Recapitulation of Results

III. A PROGRESSIVELY CENSORED SEQUENTIAL LIKELIHOOD RATIO TEST FOR GENERAL FUNCTIONALS
     3.1  Introduction
     3.2  Preliminary Notions and the Proposed Test
          3.2.1  Notation and Assumed Regularity Conditions
          3.2.2  Preliminary Results
          3.2.3  Development of the Test Statistic
          3.2.4  Test Procedure
     3.3  Termination Probability of Test Procedure
     3.4  Operating Characteristic Function
     3.5  Average Sample Number Function

IV.  SIMULATIONS AND PRACTICAL EXAMPLE
     4.1  Introduction
     4.2  Simulation Study Based on Chapter 2
     4.3  Simulation Study Based on Chapter 3
     4.4  Compliance Testing in Air Quality Monitoring
          4.4.1  Introduction and Background
          4.4.2  Description of Data
          4.4.3  Statistical Framework and Test Procedure
          4.4.4  Results and Considerations for Implementation of the Technique

V.   NONPARAMETRIC SEQUENTIAL TESTING PROCEDURES
     5.1  Introduction
     5.2  SROT for a Function of Regression Coefficients in Multiple Linear Regression
     5.3  SROT for a Function of Regression Coefficients of Multiple Regression under Progressive Censoring

VI.  SUGGESTIONS FOR FURTHER RESEARCH
     6.1  Theoretical Extensions
     6.2  Further Applications

BIBLIOGRAPHY

CHAPTER 1
INTRODUCTION AND REVIEW OF THE LITERATURE
1.1 Introduction
A general problem that occurs in statistical inference is estimating the parameters believed to determine the distribution of a variable of interest. When there is more than one parameter being estimated, it is often desired to draw inferences about a function of these parameters rather than the parameters themselves. Examples of such situations abound in practice. One is often interested in the survival distribution function in life testing problems and clinical trials. In the case of several covariables affecting survival, one is still interested in the survival distribution as a function of the covariables, not in the parameters themselves that determine how the covariables enter into the model. In "accelerated life testing" problems, the experiment is conducted at high dose levels and the results are extrapolated to low dose levels through a specific model. In most cases the model needs to be determined, while the values of the parameters that constitute it are of no special interest. When looking at possible harmful side effects of a new drug being introduced into the market, one expects several variables to affect toxicity. As an example, one can consider dosage and time of exposure as possible variables. One is not interested in the parameters that tell how dosage and time of exposure affect toxicity, but rather in the toxicity curve, the downward sloping curve that is a function of both dosage and time of exposure and best expresses the interrelationship between the two variables.
In air pollution control one usually assumes a Gaussian plume model where the emissions from a stack are said to disperse as a normal distribution with mean concentration $\mu$ and variance $\sigma^2$. Other unknown parameters in the model such as existing background concentration of pollutants and meteorological conditions can be represented by a vector $\boldsymbol{\beta}$ of covariables. One would then be interested in estimating the $100(1-\alpha)$th percentile of the pollutant concentration as a function of $\mu$, $\sigma^2$ and $\boldsymbol{\beta}$, i.e. the function $f = \mu + \boldsymbol{\beta}'\mathbf{c} + \tau_\alpha \sigma$, where $\tau_\alpha$ is the $(1-\alpha)$th percentile point of the standard normal distribution and $\mathbf{c}$ is a vector of known constants. In monitoring air contaminants in a workplace, the concentration of chemical agents is usually assumed to have a lognormal frequency distribution and similar considerations as above may be of interest.
In statistical inference about the unknown parts of a model, the
data used can be of one of two kinds:
fixed sample or sequential.
In
fixed sample analysis, the final sample size n and the sampling rules do
not depend on the data as it becomes available.
In sequential analysis,
the size and composition of the final data are not fixed in advance but
depend, in some specified way, on the data already observed in the course
of the experiment.
The advantages of fixed sample procedures are their relative simplicity and the fact that one does not always have the freedom to gather data in a sequential manner. The main disadvantage is that one usually does not have sufficient knowledge of the underlying distribution to prescribe bounds on the probabilities of incorrect decisions. The probabilities of incorrectly accepting $H_1$ and $H_0$, called Type I and Type II errors respectively, cannot both be prescribed in the presence of nuisance parameters in a fixed sample situation. However, in sequential sampling, desired values for both error probabilities can be prescribed, with the price that the sample size $n$ becomes a random variable. For example, in testing for the mean, lack of knowledge about the underlying variance $\sigma^2$ is a common situation. Using the sample estimate of the variance $s_n^2$ to estimate $\sigma^2$ in fixed sample procedures may not be sufficiently accurate in certain situations; for example, when $\sigma^2$ is large and the sample size $n$ is not. In such situations, the actual values of Type I and Type II errors cannot be correctly determined.
Fixed sample
procedures fail to make use of the accumulated information in the course
of the experiment.
On the other hand, by using sequential procedures,
one updates the sampling procedure and parameter estimates with information collected as the experiment proceeds and allows for possible early
termination of the experiment, with a concurrent savings in time and
cost.
Predetermined large sample sizes in fixed sample cases incur high
cost without necessarily increasing the sensitivity of the experiment,
while small sample sizes may lead to inaccurate decisions.
Sequential procedures are of two major kinds.
In the so-called
"classical sequential problem", the gathering of the data is not time
dependent and the possible savings over fixed sample schemes arise from
a reduced number of observations sampled.
In the specific area of clinical trials and life testing problems, the observations are gathered sequentially over time. Two fixed-sample sampling schemes are used: truncation, where the experiment is conducted for a specified length of time and the number of observations is a random variable; and censoring, where the experiment is conducted until a specified number of failures are observed and the time of termination is the random variable. "From a practical standpoint even the truncation and censored plans themselves have several drawbacks. For example, single point truncation and censoring schemes are often inadequate for most practical problems where ethical reasons may demand high levels of efficiency of statistical procedures based on them. Second, the truncation scheme may still necessitate long experimentation to obviate the risk of erroneous decisions and thus has to be weighted against the increased cost and sacrifice of experimental units which may not contribute significantly to the sensitivity of the experiment. Third, the time of termination of experimentation in the censoring scheme, being random, may be at variance with other restrictions on time and cost" [p. 14, Gardiner and Sen (1978)]. For these reasons, progressively censored schemes (PCS) are preferred for clinical trials and life testing problems. The experiment is monitored "continuously" from the beginning, so that if, at any early stage, the accumulated statistical evidence warrants a clear-cut decision, experimentation is terminated and the decision adopted along with a savings in time and cost of experimentation as well as in lives of experimental units.
In
clinical trials, one may not get all the patients that will eventually
be in the experiment at the same time.
They may arrive in batches or one
at a time, thus providing a natural appeal for a sequential setup.
This
situation is called staggered entry.
We see that the use of sequential procedures is logically desirable because one uses available information from the collected data to
update the sampling scheme and thus allows for a possible early termination of experimentation.
The present work addresses the use of sequential procedures in solving the inference problem originally mentioned.
Parametric sequential procedures assume a specified form for the underlying distribution of the observations. They are usually non-robust if the assumed distribution is incorrect, and thus nonparametric sequential procedures, which avoid any distributional assumptions, are preferred.
However, given a distributional assumption, one is able to
test a broad class of functions of interest satisfying mild regularity
conditions.
While the tests are robust in a nonparametric setting, only
a restricted set of functions can be tested and thus the scope of the
test is narrowed.
We are interested in both parametric and nonparametric
sequential procedures.
Currently used sequential procedures are reviewed,
starting with parametric sequential procedures in Section 1.2. In Section
1.3 we review nonparametric sequential procedures and in Section 1.4 we
discuss the life testing and time sequential problem in general.
In
Section 1.5 we present our proposal for the investigation in this
dissertation.
1.2 Parametric Sequential Procedures
Let $\mathbf{e}_n = (x_1, x_2, \ldots, x_n)$ denote a random sample of size $n \ge 1$ from the family of distributions $F_i(x, \theta)$, $i = 1, 2, \ldots, n$, $\theta \in \Theta$, where $\Theta$ is the precisely known parameter space, $F_i(x, \theta)$ is partly unknown, and $\mathbf{e}_n$ is one element of the sample space $E_n$ = {all possible samples of size $n$ from $F_i(x, \theta)$, $i = 1, 2, \ldots, n$}. The parameter $\theta$ may be a vector valued parameter. If the functional form of $F_i(x, \theta)$ is known, $\forall\, 1 \le i \le n$, then a statement of the form $H: \theta \in \omega$ denotes a parametric hypothesis about the model, where $\omega$ is a non-empty proper subset of the parameter space. The usual concern is the so-called two-decision problem, discrimination between the null hypothesis $H_0: \theta \in \omega$ and the alternative hypothesis $H_1: \theta \in \Theta - \omega$.
A general framework for statistical tests of $H_0$ vs $H_1$ undergoes the following decisions at the nth stage of sampling:
(i) Accept $H_0$ if $\mathbf{e}_n$ is in the acceptance region $R_n^0$
(ii) Reject $H_0$ if $\mathbf{e}_n$ is in the rejection region $R_n^1$
(iii) Continue sampling by observing $x_{n+1}$ if $\mathbf{e}_n$ is in the continuation region $R_n$
The regions $R_n^0$, $R_n^1$, and $R_n$ are mutually exclusive and such that their union equals the sample space $E_n$. The specification of the regions is based upon practical considerations about the consequences of making wrong decisions. Different ways of specifying the regions $R_n^0$, $R_n^1$, and $R_n$ distinguish the statistical tests. For a fixed sample test, one lets (i) and (ii) be possible only at stage $n = N$, where $N$ is the fixed sample size. In the sequential case the stopping variable $n^* = \min\{n \ge 1: \mathbf{e}_n \notin R_n\}$ is a random variable.
There are several properties that are usually utilized in describing the behavior of a sequential test. A sequential test terminates with probability one if $\lim_{n \to \infty} P(\mathbf{e}_n \in R_n \mid \theta) = 0$, $\forall\, \theta \in \Theta$; in such a case the test is called a closed test. The Operating Characteristic (OC) function of the test $s$ is defined as $Q_s(\theta) = P(\text{accept } H_0 \mid \theta)$. If we denote the power function by $P_s(\theta) = P(\text{reject } H_0 \mid \theta)$, we note that for a closed test, $P_s(\theta) = 1 - Q_s(\theta)$, $\forall\, \theta \in \Theta$. The probability of Type I error, denoted $\alpha_1(\theta)$, equals $1 - Q_s(\theta)$ if $\theta \in \omega$; and the probability of Type II error, denoted by $\alpha_2(\theta)$, equals $Q_s(\theta)$ for $\theta \in \Theta - \omega$. Intuitively, a desirable feature of a test is that $Q_s(\theta)$ be large for $\theta \in \omega$ and small for $\theta \in \Theta - \omega$. The Average Sample Number (ASN) function of the test $s$ is defined to be the expected value of the stopping variable $n^*$ given that $\theta$ is the true value of the parameter. If the test $s$ is not closed, its ASN function does not exist; however, the contrary is not necessarily true. The ASN function, denoted by $E_s(n^*; \theta)$, is usually used as a measure of the efficiency of the test (where "efficiency" is defined below).
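The OC and ASN functions defined above can be approximated by simulation when no closed form is available. The following is a minimal sketch, not a procedure from this dissertation: it assumes a toy closed test on Bernoulli observations that adds +1 for a success and -1 for a failure and stops at fixed bounds, and the names `run_test` and `estimate_oc_asn` are hypothetical.

```python
import random


def run_test(p, lower, upper, rng, max_n=10_000):
    """Toy sequential test (an assumption for illustration only).

    Adds +1 per success and -1 per failure, and stops when the running
    sum reaches `lower` (accept H0) or `upper` (reject H0).
    Returns (accepted_H0, stopping_time).
    """
    s = 0
    n = 0
    while lower < s < upper and n < max_n:
        n += 1
        s += 1 if rng.random() < p else -1
    return s <= lower, n


def estimate_oc_asn(p, lower=-5, upper=5, reps=2000, seed=0):
    """Monte Carlo estimates of Q(p) = P(accept H0 | p) and E(n*; p)."""
    rng = random.Random(seed)
    accepts = 0
    total_n = 0
    for _ in range(reps):
        accepted, n = run_test(p, lower, upper, rng)
        accepts += accepted
        total_n += n
    return accepts / reps, total_n / reps


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7):
        oc, asn = estimate_oc_asn(p)
        print(f"p={p}: estimated OC={oc:.3f}, estimated ASN={asn:.1f}")
```

Evaluating the two estimates over a grid of parameter values traces out the OC and ASN curves of the toy test.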
The "principle of fixed sample tests" chooses a test by setting beforehand an upper bound $\alpha_1$ $(0 \le \alpha_1 \le 1)$ to the probability of Type I error for all tests of a given sample size $n$, and then selects the test which minimizes the probability of Type II error $\forall\, \theta \in \Theta - \omega$ with respect to the possible choices of the critical region. In the sequential setup, one lets $\alpha_1, \alpha_2 \in (0, 1)$ be the preassigned probabilities of Type I and Type II errors respectively. The class $C$ of valid tests of $H_0: \theta \in \omega$ vs $H_1: \theta \in \Theta - \omega$ is defined to comprise tests that satisfy the following three conditions:
(i) Terminates with probability one
(ii) $Q_s(\theta) = 1 - \alpha_1(\theta) \ge 1 - \alpha_1$, $\forall\, \theta \in \omega$
(iii) $Q_s(\theta) = \alpha_2(\theta) \le \alpha_2$, $\forall\, \theta \in \Theta - \omega$
A test $s_1$ can be defined to be "more efficient" than a second test $s_2$ at $\theta$ if both are valid and $E_{s_1}(n^*; \theta) < E_{s_2}(n^*; \theta)$ for some $\theta \in \Theta$. The statistical test $s^* \in C$ is called the uniformly most efficient (UME) test of $H_0$ vs $H_1$ if $\inf_{s \in C} E_s(n^*; \theta) = E_{s^*}(n^*; \theta)$, $\forall\, \theta \in \Theta$. In selecting an appropriate sequential test, one may utilize some of the above mentioned properties of a sequential test.
If $\theta$ is the single unknown parameter of interest, in order to test the simple versus simple hypothesis $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$, $\theta_0 \ne \theta_1$, with a test that is closed for $\theta = \theta_0, \theta_1$ and such that $Q(\theta_0) \ge 1 - \alpha_1$ and $Q(\theta_1) \le \alpha_2$, Wald (1947) proposed the Sequential Probability Ratio Test (SPRT). The SPRT $S(b, a)$ is defined by the following framework:
Observe $x_i$, $i = 1, 2, \ldots$ in succession and at the nth stage of sampling, choose one of the following:
i. accept $H_0$ if $Z_n \le b$
ii. reject $H_0$ if $Z_n \ge a$
iii. continue sampling by observing $x_{n+1}$ if $b < Z_n < a$
where the stopping bounds $(b, a)$, $-\infty < b < a < \infty$, are real numbers and the test statistic is
$$Z_n = \begin{cases} \ln[L_n(\theta_1) / L_n(\theta_0)] & \text{for } n \ge 1 \\ 0 & \text{if both likelihoods are zero} \end{cases} \tag{1.1}$$
and where $L_n(\theta) = P(\mathbf{e}_n)$ denotes the joint likelihood of $(x_1, x_2, \ldots, x_n)$ if $\theta$ is the true value of the parameter.
A SPRT is completely characterized by its stopping bounds. Wald (1947) showed that an upper bound for $a$ is $\ln[(1 - \alpha_2)/\alpha_1]$ and that a lower bound for $b$ is $\ln[\alpha_2/(1 - \alpha_1)]$. The UME-SPRT, $s(b^*, a^*)$, is chosen such that $P(n^* \ge n \mid \theta)$ is minimized $\forall\, n, \theta$ subject to $\alpha_1(\theta_0) \le \alpha_1$ and $\alpha_2(\theta_1) \le \alpha_2$. The test $s(b^*, a^*)$ is UME among valid SPRT's, and under certain conditions among all sequential and fixed sample tests.
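The framework (1.1) together with Wald's bounds can be sketched in a few lines. The example below is illustrative only: it assumes independent normal observations with known standard deviation, uses the bounds $\ln[\alpha_2/(1-\alpha_1)]$ and $\ln[(1-\alpha_2)/\alpha_1]$ directly as the stopping bounds $(b, a)$, and the function names are hypothetical.

```python
import math
import random


def wald_bounds(alpha1, alpha2):
    """Wald's bounds (b, a) on the log-likelihood-ratio scale."""
    a = math.log((1 - alpha2) / alpha1)
    b = math.log(alpha2 / (1 - alpha1))
    return b, a


def sprt_normal_mean(xs, theta0, theta1, sigma, alpha1, alpha2):
    """SPRT of H0: theta = theta0 vs H1: theta = theta1 for N(theta, sigma^2) data.

    The normal model with known sigma is an assumption made for this sketch.
    Returns (decision, n); the decision is "continue" if the data run out
    before either bound is crossed.
    """
    b, a = wald_bounds(alpha1, alpha2)
    z = 0.0
    n = 0
    for x in xs:
        n += 1
        # Increment of ln[L_n(theta1) / L_n(theta0)] contributed by x.
        z += (theta1 - theta0) * (x - 0.5 * (theta0 + theta1)) / sigma**2
        if z <= b:
            return "accept H0", n
        if z >= a:
            return "reject H0", n
    return "continue", n


if __name__ == "__main__":
    rng = random.Random(0)
    data = (rng.gauss(0.5, 1.0) for _ in range(10_000))
    print(sprt_normal_mean(data, 0.0, 0.5, 1.0, alpha1=0.05, alpha2=0.10))
```

Because the data are consumed one observation at a time, the stopping variable $n^*$ is simply the value of `n` at which the function returns.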
A logical extension of the SPRT is to let the bounds $(b, a)$ vary with $n$, the so-called generalized SPRT. Weiss (1953) and Kiefer and Weiss (1957) have examined some properties of the generalized SPRT in the simple versus simple hypothesis case. Samuel (1970) proposed a "randomized" SPRT in which decisions at the stopping bounds are taken one way or the other according to fixed probabilities. The effect on $\alpha_1$ and $\alpha_2$ in truncated sequential tests (see Section 1.3 for definition of truncation) was examined by Wald (1947).
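When dealing with composite hypotheses such as when one has multiparameter families, Wald (1945, 1947) proposed extending the framework of the SPRT of (1.1) by the use of weighted likelihoods as follows:
A test of $H_0: \theta \in \omega$ versus $H_1: \theta \in \Theta - \omega$ such that $\alpha_1(\theta) \le \alpha_1$, $\forall\, \theta \in \omega$, and $\alpha_2(\theta) \le \alpha_2$, $\forall\, \theta \in \Theta - \omega$, where $\alpha_1$ and $\alpha_2$ are preassigned risks, is given by:
i. Select weight functions $W_a$ and $W_r$ such that
$$\int_{\omega} W_a(\theta)\, d\theta = 1 \quad \text{and} \quad \int_{\Theta - \omega} W_r(\theta)\, d\theta = 1$$
ii. Select a test that follows the framework of (1.1) and is such that
$$\int_{\omega} \alpha_1(\theta)\, W_a(\theta)\, d\theta = \alpha_1 \quad \text{and} \quad \int_{\Theta - \omega} \alpha_2(\theta)\, W_r(\theta)\, d\theta = \alpha_2$$
iii. At the nth stage of sampling, the test statistic for $n \ge 1$ is
$$Z_n = \ln\left[\frac{\int_{\Theta - \omega} L_n(\theta)\, W_r(\theta)\, d\theta}{\int_{\omega} L_n(\theta)\, W_a(\theta)\, d\theta}\right] \tag{1.2}$$
iv. Perform the test following (1.1) with the above changes.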
There is no universally accepted method to determine the weight functions in (1.2) and an optimal set of weight functions is not necessarily unique. The usual unavailability of OC and ASN functions makes (1.2) hard to compare to other tests.
An alternative test based on invariance, discussed by Ghosh (1970) but due to Cox (1952) and further justified in Hall, et al. (1965), deals with multiparameter families, say $\boldsymbol{\theta} = (\gamma, \boldsymbol{\phi})$, where $\gamma$ and $\boldsymbol{\phi}$ are functionally independent. If a sufficient statistic exists for $\gamma$, then to test $H_0: \gamma = \gamma_0$ vs $H_1: \gamma = \gamma_1$, $\gamma_0 \ne \gamma_1$, with $\boldsymbol{\phi}$ as a nuisance parameter, the ratio of likelihoods can be shown (c.f. Fraser (1956)) to be independent of $\boldsymbol{\phi}$ and the problem is reduced to the simpler SPRT case of (1.1).
If interested in making decisions about some function $\lambda = \lambda(\boldsymbol{\theta})$ and a sufficient statistic does not exist, one may be able to generate a sequence of functions $\{T_n\}$ whose distributions depend only on $\lambda$, and then one can develop a procedure to make the relevant decision about $\lambda$ using only the observed value of $T_n$. If the function $T_n$ is chosen such that it is invariant under changes in the values of the nuisance parameter, the ensuing procedure is called an "invariant" SPRT (c.f. Cox (1952)). The invariant SPRT usually does not apply to many interesting decision problems about a real parameter in the presence of one or more nuisance parameters (c.f. Ghosh (1970)).
To deal with nuisance parameters in the multiparameter setting of $\boldsymbol{\theta} = (\gamma, \boldsymbol{\phi})$, Bartlett (1946) and Cox (1963) proposed the Sequential Likelihood Ratio Test (SLRT). Under suitable regularity conditions on the underlying distribution $F(x, \boldsymbol{\theta})$, at the nth stage of sampling the computed test statistic is
$$Z'_n = \ln\{L_n(\mathbf{e}_n; \gamma_1, \hat{\boldsymbol{\phi}}_n) / L_n(\mathbf{e}_n; \gamma_0, \hat{\boldsymbol{\phi}}_n)\}$$
where $\hat{\boldsymbol{\phi}}_n$ is the maximum likelihood estimate (MLE) of $\boldsymbol{\phi}$ at stage $n$. The procedure of the SLRT follows the framework of (1.1) but requires an initial sample size $n_0$ and the condition that the differences $|\gamma_1 - \gamma|$ and $|\gamma_0 - \gamma|$ are so small that the risks $\alpha_1$ and $\alpha_2$ are asymptotically as claimed. Breslow (1969) provided some mathematical justification for the SLRT procedure proposed by Cox (1963). It remains to be shown that such a procedure can be extended to the problem of testing any function of the parameters of the underlying distribution.
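The idea of plugging a stage-wise MLE of the nuisance parameter into the likelihood ratio can be illustrated with a simplified sketch. The choices below are assumptions made for illustration and are not taken from Bartlett (1946) or Cox (1963): the observations are taken to be normal, the parameter of interest is the mean, the nuisance parameter is the variance (estimated by its unrestricted MLE at each stage), Wald's bounds from the SPRT are reused, and the name `slrt_normal_mean` is hypothetical.

```python
import math


def slrt_normal_mean(xs, gamma0, gamma1, alpha1, alpha2, n0=10):
    """Plug-in likelihood ratio test for a normal mean with unknown variance.

    At each stage n >= n0 the variance is replaced by its MLE and the
    log-likelihood ratio of gamma1 against gamma0 is compared with
    Wald-type bounds. Returns (decision, n).
    """
    a = math.log((1 - alpha2) / alpha1)
    b = math.log(alpha2 / (1 - alpha1))
    total = 0.0
    total_sq = 0.0
    n = 0
    for x in xs:
        n += 1
        total += x
        total_sq += x * x
        if n < n0:
            continue  # initial sample: no decision allowed yet
        mean = total / n
        var_hat = total_sq / n - mean * mean  # MLE of the variance at stage n
        if var_hat <= 0.0:
            continue  # degenerate estimate, keep sampling
        # ln{L_n(gamma1, var_hat) / L_n(gamma0, var_hat)} for normal data
        z = n * (gamma1 - gamma0) * (mean - 0.5 * (gamma0 + gamma1)) / var_hat
        if z <= b:
            return "accept H0", n
        if z >= a:
            return "reject H0", n
    return "continue", n


if __name__ == "__main__":
    sample = [0.9, 1.1] * 50
    print(slrt_normal_mean(sample, gamma0=0.0, gamma1=1.0, alpha1=0.05, alpha2=0.10))
```

The initial sample size `n0` plays the role of $n_0$ in the text: no decision is taken until the nuisance estimate has had a chance to stabilize.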
Both the weighted SPRT and the SLRT demand some knowledge of the underlying distribution function, though the SLRT can handle nuisance parameters better than the weighted SPRT. If sufficient knowledge of the distribution function does not exist, as is usually the case, these procedures are not robust and there is a need to develop suitable procedures to cover a broader class of underlying distributions. Nonparametric sequential procedures are reviewed in the next section.
1.3 Nonparametric Sequential Procedures
In the case when sufficient knowledge of the underlying distribution function is not available, robust procedures that will apply to a broader class of distributions need to be developed. Savage and Sethuraman (1966) and Sethuraman (1970) proposed a Sequential Rank Order Probability Ratio Test (SROPRT) for general tests:
Observe random variables $x_i$ from an absolutely continuous distribution function $F_i(x; \theta)$, $i = 1, 2, \ldots, n$. To test the homogeneity hypothesis $H_0: F_1 = F_2 = \cdots = F_n = F$, $\forall\, n \ge 1$ versus $H_1$: not all $F_i$ are the same, calculate the probability under $H_0$ and $H_1$ of the vector of ranks $(R_1, R_2, \ldots, R_n)$ corresponding to the $x_i$'s, denoted $p_n^0$ and $p_n^1$ respectively. Under $H_0$ all possible permutations of the ranks are equally likely and thus $p_n^0 = (n!)^{-1}$. Under $H_1$,
$$p_n^1 = \int \cdots \int_{R_1 < \cdots < R_n} \prod_{i=1}^{n} f_i(x_i; \theta)\, dx_i \tag{1.3}$$
Then proceed as in framework (1.1) with $Z_n = \ln[p_n^1 / p_n^0]$.
For small drifts of $H_1$ from $H_0$ the probability of early stopping gets very small, thus requiring a large initial sample size $n_0$. The procedure is not genuinely nonparametric, as under $H_1$ the distribution of the ranks depends upon the underlying unknown distribution functions $F_i$, $i \ge 1$. Since the test is based on Lehmann alternatives (where one of the sampled populations has a distribution which is a specified power of other sampled distributions), it cannot be applied to the location problem. The OC and ASN functions of the SROPRT have not been adequately studied.
Sen (1973) considered the case of iid (independent and identically distributed) p-dimensional random vectors, $p \ge 1$, from an absolutely continuous distribution function $F(x)$, unknown but belonging to some suitable family $\mathcal{F}$ of distribution functions in $R^p$, the p-dimensional Euclidean space. To test a simple hypothesis for estimable parameters defined as functionals of the form $\lambda(F)$ with a test that terminates with probability one, Sen (1973) proposed a test based on suitable estimates of $\lambda(F)$ using differentiable statistical functions of von Mises (1947) and U-statistics of Hoeffding (1948). The test is asymptotically consistent and is efficient when compared to other sequential procedures. The application of the testing procedure is for testing functionals of different distributions and not for testing of functionals of the parameters of the distributions. The availability of estimates of $\lambda(F)$ is a concern of such a procedure.
Sen and Ghosh (1971, 1974) and Ghosh and Sen (1972, 1977) proposed a truly nonparametric procedure based on rank order statistics for analyzing different sequential problems. To obtain sequential confidence intervals for the median of an unknown symmetric distribution, Sen and Ghosh (1971) developed a procedure based on a general class of one-sample rank order statistics:
Let $\{x_i, i \ge 1\}$ be a sequence of independent and identically distributed (iid) random variables from the absolutely continuous distribution function $F_\theta(x) = F(x - \theta)$, where $F$ is symmetric about 0 and $\theta$ is the unknown parameter of interest. Define the one-sample rank order statistic $T_n(\mathbf{e}_n - d\mathbf{1}_n)$ (1.4) for all real $d$, where $J_n(u)$ is a suitably defined score function based on a score $J(u)$ and $R_{nk}(d)$ is the rank of $|x_k - d|$ among $|x_1 - d|, \ldots, |x_n - d|$.
In order to construct a $100(1-\alpha)\%$ confidence interval of a prescribed width $(\le 2c)$ for $\theta$, construct $I_{n^*} = \{\theta: \hat{\theta}_{L,n^*} \le \theta \le \hat{\theta}_{U,n^*}\}$, where $n^* = \min\{n \ge n_0: \hat{\theta}_{U,n} - \hat{\theta}_{L,n} \le 2c\}$ is the stopping variable. The bounds, independent of $c$, are defined by
$$\hat{\theta}_{L,n} = \sup\{a: T_n(\mathbf{e}_n - a\mathbf{1}_n) > T_{n,\alpha}^{(1)}\} \quad \text{and} \quad \hat{\theta}_{U,n} = \inf\{a: T_n(\mathbf{e}_n - a\mathbf{1}_n) < T_{n,\alpha}^{(2)}\},$$
where $T_n(u)$ is defined in (1.4) and $T_{n,\alpha}^{(1)}$ and $T_{n,\alpha}^{(2)}$ are such that $P_{\theta=0}\{T_{n,\alpha}^{(2)} \le T_n \le T_{n,\alpha}^{(1)}\} = 1 - \alpha_n \to 1 - \alpha$ as $n \to \infty$. Chow and Robbins (1965) developed a similar parametric procedure to the one just described. However, using normal scores, the asymptotic relative efficiency (ARE) of the proposed Sen and Ghosh (1971) procedure versus the Chow and Robbins (1965) procedure favored the nonparametric procedure when $F$ is not normal.
Sen and Ghosh (1974) extended the principle of rank order statistics to the one- and two-sample location testing problem. Motivated by the same underlying principle of the asymptotic SLRT of Bartlett (1946) and Cox (1963), they proposed a Sequential Rank Order Test (SROT) for testing $H_0: \theta = \theta_0$ vs $H_1: \theta = \theta_1 = \theta_0 + \Delta$ where $\Delta\,(> 0)$ and $\theta_0$ (usually taken as zero without loss of generality) are known:
With the setup of (1.4), given prescribed risks $(\alpha_1, \alpha_2)$, continue sampling if at stage $m$,
$$\left(\ln \frac{\alpha_2}{1 - \alpha_1}\right) \nu^2 < \Delta\, \hat{C}_m\, T_m\!\left(\mathbf{e}_m - \tfrac{1}{2}\Delta \mathbf{1}_m\right) < \left(\ln \frac{1 - \alpha_2}{\alpha_1}\right) \nu^2 \tag{1.5}$$
where $\nu^2 = \int_0^1 J^2(u)\, du$ and $\hat{C}_m$ estimates $C(F) = \int \frac{d}{dx} J[F(x) - F(-x)]\; d[F(x) - F(-x)]$.
The SROT terminates with probability one for square-integrable nondecreasing score functions $J(u)$, and it is asymptotically distribution free and consistent. Though it requires a large initial sample size $n_0(\Delta)$ in order to estimate $C(F)$ reasonably accurately with $\hat{C}_m$, its ARE with respect to Wald's (1947) SPRT and the Bartlett (1946) - Cox (1963) SLRT favors it using normal scores.
Sen (1977) uses the technique of jackknifing (c.f. Gray, et al. (1972)) U-statistics in modifying the sequential test for location:
Let $\theta_n^*$ and $V_n^*$ denote the appropriate jackknife estimator of $\theta$ and its variance at stage $n$ of sampling. Then for prescribed risks $(\alpha_1, \alpha_2)$ continue sampling if at stage $n$
$$\left(\ln \frac{\alpha_2}{1 - \alpha_1}\right) V_n^* < n \Delta \left[\theta_n^* - \frac{\theta_0 + \theta_1}{2}\right] < \left(\ln \frac{1 - \alpha_2}{\alpha_1}\right) V_n^* \tag{1.6}$$
In Ghosh and Sen (1977), sequential rank tests for the regression coefficient in a simple linear regression model are developed along the same lines as above:
The regression model can be expressed as
$$X_i = \beta_0 + \beta c_i + \varepsilon_i, \quad i \ge 1,$$
where the $\varepsilon_i$ are iid random variables from an absolutely continuous distribution $F(x)$, the $c_i$ are known constants, and the vector of parameters $\boldsymbol{\beta} = (\beta_0, \beta)'$ is unknown and $\beta_0$ is a nuisance parameter. In order to test $H_0: \beta = 0$ versus $H_1: \beta = \Delta\,(> 0)$, $\Delta$ known, the proposed SROT is based on the following rank order statistic at stage $n$:
$$T_n(b) = T(\mathbf{e}_n - b\,\mathbf{c}_n)$$
where
$b \in R$,
$\mathbf{e}_n = (X_1, \ldots, X_n)'$,
$\mathbf{c}_n = (c_1, \ldots, c_n)'$,
$c_{ni}^* = (c_i - \bar{c}_n)/C_n$, where $\bar{c}_n = n^{-1} \sum_{i=1}^{n} c_i$ and $C_n^2 = \sum_{i=1}^{n} (c_i - \bar{c}_n)^2$,
$J_n(u)$ = set of scores based on a suitable score function $J(u)$,
$R_{ni}(b)$ = rank of $X_i - b c_i$ among $X_1 - b c_1, \ldots, X_n - b c_n$.
Given prescribed risks $(\alpha_1, \alpha_2)$ of Type I and Type II errors, continue sampling if at stage $n$
[...]
where
$$\nu^2 = \int_0^1 J^2(u)\, du - \left[\int_0^1 J(u)\, du\right]^2, \qquad D_n = \frac{2 t_n^*}{\hat{\beta}_{U,n} - \hat{\beta}_{L,n}},$$
where $\hat{\beta}_{L,n} = \sup\{b: T_n(b) > t_n^*\}$, $\hat{\beta}_{U,n} = \inf\{b: T_n(b) < -t_n^*\}$ and $t_n^*$ is such that $P_0\{|T_n(0)| \le t_n^*\} = 1 - \alpha_n \to 1 - \alpha$.
As before, the test terminates with probability one and it requires an initial sample size $n_0(\Delta)$, such that as $\Delta \to 0$, $n_0(\Delta) \to \infty$ but $\Delta^2 n_0(\Delta) \to 0$. It is efficient when compared with the corresponding SPRT and SLRT when $F$ is known. The results utilized by Ghosh and Sen (1977) of certain properties for the linear rank statistics and the derived estimates of the regression coefficient were established by Ghosh and Sen (1972) and Sen and Ghosh (1972).
The techniques presented by Sen and Ghosh with respect to rank order statistics may be extended to the multiparameter problem and specifically when dealing with a function of these parameters. We note that these test procedures are suitable for local alternatives under certain asymptotic considerations just like the SLRT, and thus that it will be necessary to study how large $n$ must be in order for the asymptotic properties of the test to be applicable.
1.4 Time-sequential Problem
In time-sequential studies relating to clinical trials and life testing problems, one is interested in the shape of the survivorship function
$$S(t) = P(T > t) = \int_t^{\infty} f(u)\, du, \quad \text{for } t \ge 0,$$
where the random variable $T$ denotes the survival time and $f(t)$ is the unconditional probability density function (pdf) of time to failure, $t \ge 0$. The conditional failure probability density function or hazard rate $\lambda(t)$ is related to the survival function by
$$\lambda(t) = -\frac{d}{dt}\{\ln S(t)\} \quad \text{or} \quad S(t) = \exp\left\{-\int_0^t \lambda(u)\, du\right\} \tag{1.7}$$
The hazard rate is often used as a means of making inferences on the
survival function.
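As a minimal numerical sketch of relation (1.7), the following Python snippet integrates a hazard rate and compares exp{-∫λ} with a closed-form survival function. The Weibull hazard, the function names and the parameter values are illustrative assumptions and do not come from the text.

```python
import math

def weibull_hazard(t, shape, scale):
    """Hazard rate of a Weibull distribution (illustrative choice of lambda(t))."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def survival_from_hazard(hazard, t, steps=10_000):
    """S(t) = exp(-integral_0^t hazard(u) du), integral by the midpoint rule."""
    h = t / steps
    cumulative = sum(hazard((i + 0.5) * h) for i in range(steps)) * h
    return math.exp(-cumulative)

shape, scale, t = 1.5, 2.0, 3.0  # hypothetical values
numeric = survival_from_hazard(lambda u: weibull_hazard(u, shape, scale), t)
closed_form = math.exp(-((t / scale) ** shape))  # known Weibull survival function
print(numeric, closed_form)
```

The two printed values should agree to several decimal places, which is the content of (1.7) for this particular hazard.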
Estimation of the survival distribution is done differently as dictated by the form of the available information. In fixed-plan truncation or censoring schemes, life table techniques are used for grouped data (see Gross and Clark (1975)) and nonparametric tests based on linear rank order statistics can be used to make inferences on the survival (see review in Chatterjee and Sen (1973), Section 2).
Kaplan and Meier (1958) used a maximum likelihood approach in a
non-parametric setting to estimate the survival function.
If information
is available on each individual's time to failure, the "product-limit"
estimate of the survival function for a specified time t is the product
over all failure times less than or equal to t of the number alive just
after a particular failure time divided by the number alive just before.
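The product-limit rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 1/0 coding of failures and censored observations, and the toy data are hypothetical, and an observation censored at a failure time is treated as still at risk at that time.

```python
def product_limit(times, events):
    """Product-limit estimate of the survival function.

    times  : observed times (failure or censoring)
    events : 1 if the time is a failure, 0 if it is censored
    Returns a list of (failure_time, estimated S just after that time).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for (u, e) in data if u == t]
        deaths = sum(same_time)
        if deaths > 0:
            # (number alive just after t) / (number alive just before t)
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)
        i += len(same_time)
    return curve

curve = product_limit([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])  # toy data
print(curve)
```

For the toy data the estimate drops at the failure times 2, 3 and 5 and is unchanged at the censored time 8.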
Parametric methods based on maximum likelihood procedures arrive at
inferences on the survival through assumptions on the hazard function
and the relationship (1.7).
Standard distributions assumed for the
hazard function are the exponential, gamma, lognormal, Weibull and
Gompertz, and considerable effort has been devoted to finding procedures
for determining which of the parametric models best fits the data.
In life testing problems one often has available other information aside from the survival times or times to failure of the individuals
in the experiment.
The individuals cannot be assumed to be homogeneous
and one expects that their survival depends upon these factors or concomitant variables.
These covariables may either vary with time or
be baseline variables.
In certain situations, covariables are measured
at the time of failure.
One approach to analyzing the effect of covariables on survival
is to form categories of individuals based on the values of the covariables of interest.
Koch, Johnson and Tolley (1972) extended previous results of Mantel (1966) and allowed for comparison of survival over t years of r groups which may be formed on the basis of categorizing covariate values, when the actual form of the survival distribution is not of concern but one wants to test hypotheses concerning differences in survival of the r groups.
Regression approaches to handling covariates act on the hazard function. Letting $Z_i = (Z_{i1},\ldots,Z_{is})'$ be the vector of s covariables for the i-th individual, the hazard function for the i-th individual is assumed to be a function of t and $Z_i$, $\lambda(t, Z_i)$.
Feigl and Zelen (1965) considered a single non-time dependent covariable that entered the hazard under the general model
(1.8)
where $\lambda_0(t)$ is the "underlying hazard" and $\beta$ is an s-dimensional vector of unknown parameters.
In Feigl and Zelen (1965) as well as in Glasser (1967), $\lambda_0(t)$ was assumed constant under an exponential distributional assumption. Any parametric form for $\lambda_0(t)$ can be used in (1.8). A nonparametric approach to the underlying hazard was proposed by Cox (1972), who suggested the form
$$\lambda(t, Z_i) = \lambda_0(t)\exp(\beta' Z_i), \tag{1.9}$$
where $\lambda_0(t)$ is completely unspecified. The procedure is actually quasi-nonparametric since no distributional assumption need be made for $\lambda_0(t)$ while the form of incorporating the covariables is completely specified.
To allow estimation and hypothesis testing about $\beta$ under maximum likelihood techniques, Cox (1972) constructs a "partial likelihood" which does not involve $\lambda_0(t)$. To estimate the survival function, Cox (1972) makes the rather strong assumption that $\lambda_0(t) = 0$ except at values of t where failures actually occurred and arrives at a step-function estimate for the survival curve. The milder assumptions about $\lambda_0(t)$ made by Breslow (1972) and Holford (1976) eliminate problems with tied data and result in a continuous estimate for the survival curve. Cox (1975) defends his likelihood by showing that it can be considered a partial likelihood containing virtually all the information in the sample relevant to the estimation of $\beta$.
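To make concrete why the partial likelihood does not involve the underlying hazard, here is a hedged Python sketch of its logarithm for a single covariate. It assumes the standard no-ties form, in which each failure contributes exp(βz_i) divided by the sum of exp(βz_j) over the individuals still at risk; the function name and toy data are hypothetical.

```python
import math

def cox_partial_loglik(beta, times, events, covariates):
    """Log partial likelihood for one covariate, assuming no tied failure times."""
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 0:
            continue  # censored observations enter only through the risk sets
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        log_denominator = math.log(
            sum(math.exp(beta * covariates[j]) for j in risk_set)
        )
        loglik += beta * covariates[i] - log_denominator
    return loglik

times = [1.0, 2.0, 3.0, 4.0]      # toy data
events = [1, 1, 0, 1]             # 1 = failure, 0 = censored
z = [1.0, 0.0, 1.0, 0.0]
print(cox_partial_loglik(0.5, times, events, z))
# Any increasing transformation of the time scale leaves the value unchanged,
# since only the ordering of the failures is used.
print(cox_partial_loglik(0.5, [t ** 2 for t in times], events, z))
```

Only the ordering of the observed times enters the computation, which is why no assumption about the form of the underlying hazard is needed at this step.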
If the covariables are not time dependent, the models are said to be based on the "proportional hazards" (p.h.) assumption, and the ratio of hazard rates for two different individuals will be constant over time regardless of the form of the underlying hazard. Kalbfleisch (1974) stratifies the observations according to the time dependent covariables so that within a stratum the p.h. assumption holds. Cox (1972) and Taulbee (1977) propose methods that allow for testing of the p.h. assumption in a non-parametric and parametric setting respectively. Kalbfleisch and Prentice (1980) address the life testing problem under the proportional hazard assumption. Their main focus is on Cox's (1972) regression model (see (1.9)) and they study its use in incorporating covariables into the survivorship analysis and other ramifications. They only examine fixed plan truncation or censoring schemes, not sequential procedures.
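Due to the sequential nature of life testing experiments and the possibility of early termination of the experiment, progressively censored schemes (PCS) are examined. Chatterjee and Sen (1973) developed a class of linear rank statistics for testing under PCS:
Let $\{X_1,\ldots,X_N\}$ be a set of independent random variables from the absolutely continuous distributions $F_1,\ldots,F_N$ respectively, and let
$Z_{N,1} < \cdots < Z_{N,N}$ be their ordered statistics,
$R_{N1},\ldots,R_{NN}$ be their ranks (ties neglected in probability),
$S_{N1},\ldots,S_{NN}$ be the antiranks $\{R_{N S_{Ni}} = S_{N R_{Ni}} = i,\ 1 \le i \le N\}$,
$a_N(1),\ldots,a_N(N)$ be real scores and $C_1,\ldots,C_N$ regression constants, where
$$\bar C_N = N^{-1}\sum_{i=1}^{N} C_i \quad\text{and}\quad \bar a_N = N^{-1}\sum_{i=1}^{N} a_N(i).$$
Then let
$$a^*_N(k) = \begin{cases} \dfrac{1}{N-k}\displaystyle\sum_{i=k+1}^{N} a_N(i), & 0 \le k \le N-1, \\ 0, & k = N, \end{cases}$$
$$T_{N,k} = \begin{cases} 0, & k = 0, \\ \displaystyle\sum_{i=1}^{k} (C_{S_{Ni}} - \bar C_N)\,(a_N(i) - a^*_N(k)), & 1 \le k \le N, \end{cases}$$
$$C_N^2 = \sum_{i=1}^{N} (C_i - \bar C_N)^2,$$
$$A^2_{N,k} = \begin{cases} (N-1)^{-1}\Big\{\displaystyle\sum_{i=1}^{N} (a_N(i) - \bar a_N)^2 - \sum_{i=k+1}^{N} (a_N(i) - a^*_N(k))^2\Big\}, & k = 1,\ldots,N, \\ 0, & k = 0, \end{cases}$$
and define the statistics
$$D^+_{N,r} = \Big\{\max_{0 \le k \le r} T_{N,k}\Big\}\Big/ (C_N A_{N,r}) \quad\text{and}\quad D_{N,r} = \Big\{\max_{0 \le k \le r} |T_{N,k}|\Big\}\Big/ (C_N A_{N,r}) \tag{1.10}$$
for a specified r.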
Under a progressively censored scheme, one desires to test the null hypothesis $H_0: F_1 = F_2 = \cdots = F_N = F$ (unknown) against suitable alternatives. Chatterjee and Sen (1973) have shown that for testing $H_0$, PCS tests can be based on $D^+_{N,r}$ or $D_{N,r}$ and both these statistics are distribution-free.
Operationally, the test procedure is:
For a specified level of significance $\alpha_1$, $0 < \alpha_1 < 1$, determine the constant $D^+_{N,r,\alpha_1}$ (or $D_{N,r,\alpha_1}$) from
$$P[D^+_{N,r} > D^+_{N,r,\alpha_1} \mid H_0] \le \alpha_1 < P[D^+_{N,r} \ge D^+_{N,r,\alpha_1} \mid H_0]$$
(similarly for $D_{N,r,\alpha_1}$). The experiment is monitored from the beginning and the statistic $T_{N,k}$ is computed at each failure time $Z_{N,k}$. If for the first time for $k = M\,(\le r)$, $T_{N,M}$ (or $|T_{N,M}|$) exceeds $C_N A_{N,r} D^+_{N,r,\alpha_1}$ (or $C_N A_{N,r} D_{N,r,\alpha_1}$), experimentation is stopped at $Z_{N,M}$ and $H_0$ is rejected. If no such $M\,(\le r)$ exists, the experiment is terminated at $Z_{N,r}$ along with the acceptance of $H_0$.
(1.11)
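The monitoring rule (1.11) can be sketched as follows for the one-sided statistic. This is an illustration only: it assumes the progressively censored rank statistic takes the form T_{N,k} = Σ_{i≤k} (C_{S_i} − C̄)(a(i) − a*(k)), normalized by the root sum of squares of the C's times the variance term at r, and the function name, the scores, and the critical value 1.5 are hypothetical (in practice the critical value would come from tables or simulation).

```python
import math

def pcs_monitor(c, scores, failure_order, r, critical_value):
    """One-sided PCS monitoring: stop at the first k <= r where T_{N,k} is too large.

    c              : regression constants C_1..C_N
    scores         : scores a_N(1)..a_N(N)
    failure_order  : 0-based indices of the units in their order of failure
    critical_value : hypothetical critical value for the normalized statistic
    """
    n = len(c)
    c_bar = sum(c) / n
    a_bar = sum(scores) / n
    c_norm = math.sqrt(sum((ci - c_bar) ** 2 for ci in c))

    def a_star(k):
        # average of the scores not yet assigned after k failures (0 when k = N)
        return sum(scores[k:]) / (n - k) if k < n else 0.0

    total_ss = sum((a - a_bar) ** 2 for a in scores)
    tail_ss = sum((a - a_star(r)) ** 2 for a in scores[r:])
    a_nr = math.sqrt((total_ss - tail_ss) / (n - 1))
    threshold = c_norm * a_nr * critical_value

    t_k = 0.0
    for k in range(1, r + 1):
        t_k = sum((c[failure_order[i]] - c_bar) * (scores[i] - a_star(k))
                  for i in range(k))
        if t_k > threshold:
            return ("reject H0", k, t_k)
    return ("accept H0", r, t_k)

c = [1, 1, 0, 0]                      # two-sample indicator constants (toy data)
scores = [0.2, 0.4, 0.6, 0.8]         # Wilcoxon-type scores i/(N+1)
result = pcs_monitor(c, scores, failure_order=[2, 3, 0, 1], r=2, critical_value=1.5)
print(result)
```

In this toy run both early failures come from the second group, so the statistic crosses the assumed boundary at the second failure and the experiment stops before all four units have failed.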
Chatterjee and Sen (1973) examined the two-sample location and scale problems ((i) below) as well as the simple regression model ((ii) below):
(i) In (1.10), let $F_1, F_2, \ldots, F_N$ be such that the first $n_1$ observations are from the first sample, $F_1 = F_2 = \cdots = F_{n_1} = F$ and $F_{n_1+1} = F_{n_1+2} = \cdots = F_N = G$, where F(x) and G(x) are unknown. The location and scale alternatives are $H_L: G(x) = F(x-\theta)$, $\theta \ne 0$ and $H_S: G(x) = F(x/\delta)$, $\delta > 1$ respectively, where $\theta$ and $\delta$ are known.
(ii) In (1.10) let $F_i(x) = F(x - \beta_0 - \beta d_i)$, $1 \le i \le N$, where $\beta \ne 0$, the $d_i$ are known constants and $\beta_0$ is a nuisance parameter.
They also present the asymptotic distribution theory and performance of the proposed statistics and tests. Sen (1976) studied the power properties of these tests for local (contiguous) alternatives. Davis (1978) proposed a Wilcoxon type statistic for the two-sample problem. Majumdar and Sen (1978a) considered the extension to the staggered entry situation with random withdrawals. Majumdar and Sen (1978b) extend the theory to the multiple regression problem. Suitable Cramér-von Mises and Kolmogorov-Smirnov types of nonparametric statistics based on linear rank statistics are proposed and studied for the setting:
(iii) In (1.10) let $F_i(x) = F(x - \beta_0 - \beta' d_i)$, $1 \le i \le N$, where F is unknown, $\beta_0$ and $\beta' = (\beta_1,\ldots,\beta_p)$ are unknown, $d_i' = (d_{1i},\ldots,d_{pi})$ are regression constants, and one wants to test $H_0: \beta = 0$ vs $H_1: \beta \ne 0$.
(1.12)
Sinha (1979) utilizes a weighted empirical distribution function in the simple regression problem.
In the multiple regression setting, the vector of unknown parameters $\beta$ can contain covariates. The vector can be partitioned into $\beta' = [\beta_1', \beta_2']$ when $\beta_2$ contains the covariates. The test of interest becomes $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \ne 0$, where $\beta_0$ and $\beta_2$ are nuisance parameters that need to be dealt with. A setup similar to (1.9) is thus indicated, where part of $Z_i$ is non-stochastic for factor variables and part is stochastic for the concomitant variables.
Parametric procedures can also be incorporated into progressively
censored schemes using an extension of the Sequential Likelihood Ratio
Tests described in Section 1.2 to the case of non-independent, i.e.,
time dependent, observations.
1.5  Outline of Research Proposal
In the present work we develop suitable sequential testing procedures for functions of unknown parameters through extensions of the findings of Bartlett (1946), Cox (1963), Sen and Ghosh (1971,1974), Ghosh and Sen (1972,1977) and Chatterjee and Sen (1973).
In Chapter 2 we study parametric classical sequential procedures. In a multiparameter setting, Bartlett (1946) and Cox (1963) proposed a Sequential Likelihood Ratio Test (SLRT) to deal with nuisance parameters when the form of the underlying distribution is known. We extend the SLRT to the problem of testing a specified function of the unknown parameters of the underlying distribution. Possible functions of parameters that may be of interest are the correlation coefficient, the coefficient of variation, the ratio of two means, and the toxicity curve, for example. Eichhorn and Zacks (1973) looked at a parametric sequential search procedure for the optimal dosage, when the toxicity curve is determined by one parameter, under certain constraints. Their approach is a special case of our multiparameter setup. Our extension of Cox's (1963) test will be shown to have properties similar to those of Wald's (1947) Sequential Probability Ratio Test (SPRT). Our asymptotic testing procedure terminates with probability one, has an Operating Characteristic (OC) function that approximates that of the SPRT, and under suitable regularity conditions its Average Sample Number (ASN) function can be shown to be optimal. It is shown that the "two-sample problem" is a special case of the framework presented in this chapter.
In Chapter 3 we study time-sequential parametric procedures relating to clinical trials and life testing problems. We specifically address the problem of making inferences about the survival function in the presence of covariables in a progressively censored scheme (PCS) setting. Covariables may either vary with time, be baseline values, or be measured at the time of failure. We restrict ourselves to covariables measured at time of entry into the experiment, but we consider time varying covariables as well. By extending the maximum likelihood based SLRT to life testing problems, we are able to test functions of the vector of parameters $\beta$ from Cox's (1972) regression model hazard (1.9). Sen (1979a) develops appropriate invariance principles for Cox's (1972) test statistic $L^*_{nm}$ that would enable us to derive the properties of our sequential testing procedure for such a case. In general problems of testing functions of parameters, we utilize results of Sen (1976) and Anderson (1960) to examine the properties of the asymptotic test procedure.
In Chapter 4 we examine simulated values of the Operating
Characteristic and Average Sample Number functions for the maximum-likelihood-based testing procedures developed in Chapters 2 and 3.
We also
examine the applicability of the testing procedures to air pollution
monitoring and compliance testing.
In Chapter 5 we look at nonparametric sequential testing procedures. We extend the procedures developed by Sen and Ghosh (1971,1974) and Ghosh and Sen (1972,1977) to the problem of testing functions of parameters. Using a general class of one-sample rank order statistics, we extend the results of Sen and Ghosh to the multiparameter setting and develop suitable Sequential Rank Order Tests (SROT) for our problem. We derive the OC and ASN functions of our test procedure and examine its Asymptotic Relative Efficiency (ARE) with respect to the extended SLRT developed in Chapter 2. For time-dependent observations, we consider the extension of the results of Chatterjee and Sen (1973) and Majumdar and Sen (1978b) involving linear rank statistics to the multiple regression setting, where the vector of unknown parameters $\beta$ can contain covariables. One can partition $\beta$ of (1.12) into two vectors $\beta_1$ and $\beta_2$, where $\beta_1$ consists of the parameters of the non-stochastic $Z_i$'s for factor variables and $\beta_2$ consists of the parameters of the stochastic $Z_i$'s for the covariables.
The focus of this dissertation is on making sequential inferences
about a function of unknown parameters.
The usual approach taken is to
estimate the unknown parameters and then justify using the estimates to
make inferences about the function of interest.
A clear illustration of
such a situation arises in life testing problems in the presence of covariables.
One is really interested in making inferences about the
survival curve, but the common approach is to estimate the parameters
of the covariables and then look at the survival curve.
The proposed
procedures will enable us to look directly at the survival curve.
Other
applications include models for extending high dose level results from
accelerated life testing situations to low dose levels and inferences
about optimal dosage and time of exposure for a new drug with harmful
side effects.
In accelerated life testing problems we have no alternative but to estimate the function of the parameters, in this case the model relating high dose results to low dose levels.
Topics for further research are presented in Chapter 6.
CHAPTER 2
A SEQUENTIAL LIKELIHOOD RATIO TEST FOR GENERAL FUNCTIONALS
2.1  Introduction
Let $\{x_i, i \ge 1\}$ be a sequence of independent and identically distributed (iid) random variables from an absolutely continuous distribution function (df) $F(x;\theta)$ with probability density function (pdf) $f(x;\theta)$, $-\infty < x < \infty$. Assume that the form of F is known but the $m \times 1$ vector of parameters $\theta$ is unknown. Let $g(\theta)$ be a specified function of interest such that one desires to examine the following hypotheses:
$$H_0: g(\theta) = g_0, \qquad H_1: g(\theta) = g_1 = g_0 + \Delta, \tag{2.1}$$
where $g_0$ and $\Delta\,(\ne 0)$ are known constants.
The problem just formulated is often encountered in practice.
When dealing with distributions for which more than one parameter is unknown, one is usually interested in drawing inferences about a function
of these parameters.
As an example, consider an experiment where a new drug's toxicity curve is under examination. The distribution of the variable measuring toxicity will depend on the values of several covariables, where for simplicity we shall refer to just two: dosage and time of exposure. Given a set of values of dosage and time of exposure, the toxicity variable may have, say, a lognormal distribution with mean $\mu_{d,t}$ and variance $\sigma^2_{d,t}$, where d is a given dosage and t a given exposure time. The toxicity curve $g(\mu_{d,t},\sigma^2_{d,t})$ would be an appropriate function of interest and we would test whether $g(\mu_{d,t},\sigma^2_{d,t})$ is less than or equal to $g_{0,d,t}$, a toxicity level imposed by the appropriate regulatory agency for the given dosage and time of exposure, or for an appropriate range of values for the covariables.
Given the possible benefits of sequential procedures, we examine problems such as the example above in a sequential setup. There are two basic problems to deal with.
First, in statistical inference, one usually
has several parameters that are not of interest and are thus referred to
as nuisance parameters.
Second, in the case of multiple parameters,
one often desires statistical inference on a function of interest rather
than on the parameters themselves.
The first problem was tackled by
Cox (1963), who presented a heuristic argument for a Sequential Likelihood Ratio Test (SLRT) based on maximum-likelihood techniques for the
case of a single parameter of interest in the face of several nuisance
parameters.
Some statistical justification was presented in Breslow
(1969) for Cox's sequential testing procedure.
The second problem has
not yet been addressed by sequential procedures, and is the subject of
this chapter.
By generalizing the sequential testing procedure of Cox
(1963) to incorporate the second problem, we develop a Generalized
Sequential Likelihood Ratio Test (GSLRT) that deals with both problems
mentioned above.
We will provide, in our general setup, further statistical justification for Cox's SLRT. We will see that even though we are using maximum likelihood techniques to estimate values of parameters, we still have nuisance parameters in our test statistics, such as their variances.
After presenting some preliminary notions, we propose the test in Section 2.2. In Section 2.2 we also present theorems dealing with properties of the test. We will show that we can get a strongly consistent estimator for the variance of our statistic and that we can approximate our process by a standard Brownian motion process. In Section 2.3 we determine the termination probability of our sequential procedure. In Sections 2.4 and 2.5 we examine the operating characteristic (OC) and average sample number (ASN) functions of the test, respectively. In Section 2.6 we present a numerical example comparing the ASN of Wald's Binomial Sequential Test to the ASN function of Section 2.5. The two-sample comparison of distributions problem is presented as a special case of our framework in Section 2.7. The theory developed in this chapter will be utilized in Chapter 4 with an example using air pollution monitoring data.
2.2  Preliminary Notions and Proposed Test
2.2.1  Preliminary notions
Let the likelihood function at stage n be denoted by
$$L_n(X_n;\theta) = \prod_{i=1}^{n} f(x_i;\theta), \quad\text{where } X_n = (x_1, x_2, \ldots, x_n).$$
The maximum likelihood estimator at stage n of $\theta$ is the $m \times 1$ vector $\hat\theta_n$ satisfying
$$0_{m\times 1} = \frac{\partial \ln L_n(X_n;\theta)}{\partial\theta}\Big|_{\hat\theta_n}. \tag{2.2}$$
The following two lemmas are shown to hold.
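Lemma 2.2.1
If
$$E\Big\{\sup_{\theta^*:\,\|\theta^*-\theta\|<\delta}\Big\|\frac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} - \frac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big\|\Big\} \to 0, \quad \forall x, \text{ as } \delta \to 0,$$
then as $n \to \infty$,
$$\frac{\partial \ln L_n(X_n;\theta)}{\partial\theta'}\Big|_{\theta} = n(\hat\theta_n - \theta)'\Big\{-\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big\}(1 + o(1)), \text{ a.s.} \tag{2.3}$$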
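Lemma 2.2.2
Assume $E\Big[\dfrac{\partial \ln f(x;\theta)}{\partial\theta}\Big|_{\theta}\Big] = 0$ and $\dfrac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}$ is continuous in $\theta$, $\forall x$. Then
$$\frac{1}{n} I_n(\theta) = -\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}$$
converges almost surely to
$$I(\theta) = E\Big[\frac{\partial \ln f(x;\theta)}{\partial\theta}\,\frac{\partial \ln f(x;\theta)}{\partial\theta'}\Big|_{\theta}\Big],$$
where $I(\theta)$ is assumed to be positive definite.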
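A small simulation can illustrate the convergence stated in Lemma 2.2.2 in the scalar case. The sketch below assumes a Cauchy location model with unit scale (an illustrative choice, not one made in the text), for which the information is known to equal 1/2; the function names, seed and sample size are hypothetical.

```python
import math
import random

def neg_second_derivative(x, theta):
    """-d^2/dtheta^2 ln f(x; theta) for the Cauchy location density."""
    u = x - theta
    return (2.0 - 2.0 * u * u) / (1.0 + u * u) ** 2

def average_observed_information(sample, theta):
    """(1/n) I_n(theta): average of the negative second derivatives."""
    return sum(neg_second_derivative(x, theta) for x in sample) / len(sample)

rng = random.Random(0)          # fixed seed so the run is reproducible
theta = 1.0                     # hypothetical true location
# Inverse-CDF sampling of Cauchy(theta, 1) variates
sample = [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(200_000)]
estimate = average_observed_information(sample, theta)
print(estimate)  # should be close to the information I(theta) = 0.5
```

Because each summand is bounded, the average settles near 0.5 quickly even though the Cauchy distribution itself has no finite mean.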
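Proof of Lemma 2.2.1
Expanding (2.2) by the Mean Value Theorem around the true value $\theta$,
$$0'_{1\times m} = \frac{\partial \ln L_n(X_n;\theta)}{\partial\theta'}\Big|_{\theta} + (\hat\theta_n - \theta)'\,\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*}, \tag{2.4}$$
where $\theta^*$ satisfies $\|\theta^* - \theta\| \le \|\hat\theta_n - \theta\|$ and where $\|X\|$ denotes the Euclidean norm of X. Equation (2.4) can be rewritten as
$$\frac{\partial \ln L_n(X_n;\theta)}{\partial\theta'}\Big|_{\theta} = n(\hat\theta_n - \theta)'\Big[-\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta} - \Big(\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} - \frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big)\Big].$$
It suffices to show that the following quantity converges almost surely to 0:
$$\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} - \frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta} = \frac{1}{n}\sum_{i=1}^{n}\Big[\frac{\partial^2 \ln f(x_i;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} - \frac{\partial^2 \ln f(x_i;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big]. \tag{2.5}$$
Equation (2.5) can be bounded by its supremum
$$\frac{1}{n}\sum_{i=1}^{n}\ \sup_{\theta^*:\,\|\theta^*-\theta\|<\delta}\Big\|\frac{\partial^2 \ln f(x_i;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} - \frac{\partial^2 \ln f(x_i;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big\| \tag{2.6}$$
for $\theta^*$ close to $\theta$. Given the assumption in the statement of the Lemma, equation (2.6) is then the average of n independent and identically distributed (iid) random variables with asymptotic mean 0, and thus (2.5) converges almost surely (a.s.) to 0 by Khintchine's Law of Large Numbers.
QED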
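Proof of Lemma 2.2.2
Given the assumptions in the statement of the Lemma, $I(\theta)$ can be rewritten as
$$I(\theta) = -E\Big[\frac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big], \quad \forall x. \tag{2.7}$$
In order to show convergence of matrices, one shows convergence of the $(k,\ell)$th elements of the matrices:
$$I(\theta)_{[k,\ell]} = -E\Big[\frac{\partial^2 \ln f(x;\theta)}{\partial\theta_k\,\partial\theta_\ell}\Big|_{\theta}\Big] \tag{2.8}$$
and
$$\frac{1}{n} I_n(\theta)_{[k,\ell]} = -\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta_k\,\partial\theta_\ell}\Big|_{\theta}, \tag{2.9}$$
where $1 \le k, \ell \le m$ and $\theta' = (\theta_1,\ldots,\theta_m)$. Equation (2.9) can be rewritten as
$$\frac{1}{n}\sum_{i=1}^{n}\Big[-\frac{\partial^2 \ln f(x_i;\theta)}{\partial\theta_k\,\partial\theta_\ell}\Big|_{\theta}\Big],$$
the average of n independent and identically distributed random variables with finite expectation (2.8). Thus by Kolmogorov's Strong Law of Large Numbers, (2.9) converges almost surely to (2.8) and thus $\frac{1}{n} I_n(\theta)$ converges almost surely to (2.7).
QED
2.2.2  Development of Test Statistic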
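Cox (1963) suggested constructing the following statistic at the n-th stage of sampling:
$$Z_n = n\{g(\hat\theta_n) - \tfrac{1}{2}[g_0 + g_1]\}. \tag{2.10}$$
Let $g(\theta) = g_0 + \psi\Delta$ where $\psi \in [0,1]$ and rewrite (2.10) as
$$Z_n = n[g(\hat\theta_n) - g(\theta)] + n(\psi - \tfrac{1}{2})\Delta. \tag{2.11}$$
By the Mean Value Theorem,
$$n[g(\hat\theta_n) - g(\theta)] = n(\hat\theta_n - \theta)'\Big[\frac{\partial g(\theta)}{\partial\theta}\Big|_{\theta^*}\Big], \tag{2.12}$$
where $\theta^*$ again lies between $\hat\theta_n$ and $\theta$. Assuming $\partial g/\partial\theta$ to be continuous in some neighborhood of the true value $\theta$ and assuming the yet to be shown weak convergence of $\hat\theta_n$ to $\theta$, $\hat\theta_n$ and thus $\theta^*$ will be contained in such a neighborhood of $\theta$ with high probability for large n. By continuity,
$$\frac{\partial g(\theta)}{\partial\theta}\Big|_{\theta^*} - \frac{\partial g(\theta)}{\partial\theta}\Big|_{\theta}$$
is thus negligible, and $\frac{\partial g(\theta)}{\partial\theta}\big|_{\theta^*}$ can be considered to be the value at the true parameter $\theta$, say $\dot g(\theta) = \frac{\partial g(\theta)}{\partial\theta}\big|_{\theta}$. Equation (2.11) can now be rewritten as
$$Z_n = n(\hat\theta_n - \theta)'\,\dot g(\theta) + n(\psi - \tfrac{1}{2})\Delta \tag{2.13}$$
for large n.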
The construction of the test statistic is based on the process defined in (2.13). The calculation of boundary crossing probabilities and expectations for determining the average sample number function and the operating characteristic function for the process $Z_n$ is quite complicated and involved; known results for the standard continuous Wiener process (Anderson (1960)) are thus utilized once it is shown that the process $Z_n$ can be suitably approximated by it. A Wiener process X(t), t > 0 (often called a Brownian motion process) is a stochastic process with independent normal ($N(\mu,\sigma^2)$) increments and such that
$E[X(t)] = \mu t$,
$E[(X(t) - \mu t)^2] = \sigma^2 t$,
$E[X(s)X(t)] = \sigma^2 \min(s,t)$,
$X(s) - X(t)$ is independent of $X(t)$ for $s > t$.
The "standard Wiener process" W(t) has $\mu = 0$, $\sigma^2 = 1$. In order to claim that the process (2.13) can be suitably approximated by a Wiener process, some needed results are established.
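The moment properties of the standard Wiener process listed above can be checked by simulation. The sketch below builds W(s) and W(t) from independent normal increments and estimates E[W(s)W(t)] and E[W(t)^2]; the function name, seed, time points and number of replications are hypothetical choices made for illustration.

```python
import math
import random

def wiener_endpoints(s, t, rng):
    """Draw (W(s), W(t)) for s < t from independent normal increments."""
    w_s = rng.gauss(0.0, math.sqrt(s))            # W(s) ~ N(0, s)
    w_t = w_s + rng.gauss(0.0, math.sqrt(t - s))  # independent increment ~ N(0, t - s)
    return w_s, w_t

rng = random.Random(1)  # fixed seed so the run is reproducible
s, t, reps = 2.0, 5.0, 100_000
pairs = [wiener_endpoints(s, t, rng) for _ in range(reps)]
mean_product = sum(ws * wt for ws, wt in pairs) / reps  # estimates E[W(s)W(t)]
second_moment = sum(wt * wt for _, wt in pairs) / reps  # estimates E[W(t)^2]
print(mean_product, second_moment)  # should be near min(s, t) = 2 and t = 5
```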
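Theorem 2.1
Under the following regularity condition,
$$E\Big[\frac{\partial \ln f(x_i;\theta)}{\partial\theta}\Big|_{\theta}\Big] = 0, \quad \forall x_i,$$
the following relation holds almost surely (a.s.),
$$\frac{n(\hat\theta_n - \theta)'\,\dot g(\theta)}{\gamma} = W(n) + o(\sqrt{n}),$$
where $W(n)$ is the standard Wiener process and $\gamma^2 = \dot g(\theta)'\,I^{-1}(\theta)\,\dot g(\theta) < \infty$.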
Proof of Theorem 2.1
From Lemma 2.2.1,
$$\frac{\partial \ln L_n(X_n;\theta)}{\partial\theta'}\Big|_{\theta}\,I^{-1}(\theta)\,\dot g(\theta) = n(\hat\theta_n - \theta)'\Big\{-\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big\}\,I^{-1}(\theta)\,\dot g(\theta), \text{ a.s.} \tag{2.14}$$
The left hand side of (2.14) can be expressed as a sum of independent and identically distributed random variables $y_i$ as follows:
$$\sum_{i=1}^{n} y_i, \qquad y_i = \frac{\partial \ln f(x_i;\theta)}{\partial\theta'}\Big|_{\theta}\,I^{-1}(\theta)\,\dot g(\theta). \tag{2.15}$$
The random variable $y_i$ has zero expectation (under the assumed regularity condition) and
$$E(y_i^2) = \dot g(\theta)'\,I^{-1}(\theta)\,E\Big[\frac{\partial \ln f(x_i;\theta)}{\partial\theta}\Big|_{\theta}\,\frac{\partial \ln f(x_i;\theta)}{\partial\theta'}\Big|_{\theta}\Big]\,I^{-1}(\theta)\,\dot g(\theta) = \gamma^2.$$
The above second moment derivation follows from the definition of $I(\theta)$ and the assumption of positive definiteness. Equation (2.15) can be rewritten as the sum of n iid random variables $y_i^*$ with zero expectation and unit variance as follows:
$$\sum_{i=1}^{n} y_i^* = \sum_{i=1}^{n}\Big[\frac{\partial \ln f(x_i;\theta)}{\partial\theta'}\Big|_{\theta}\,\frac{I^{-1}(\theta)\,\dot g(\theta)}{\gamma}\Big]. \tag{2.16}$$
Using the Theorem established by Skorokhod (1965), there exists a sequence of nonnegative iid random variables $T_k$, $k \ge 1$, such that the sequence of partial sums
$$S^*_{(n)} = \Big\{1 \le k \le n:\ S^*_k = \sum_{i=1}^{k} y_i^*\Big\}$$
has the same joint distribution as the sequence of Wiener processes
$$W_{(n)} = \Big\{1 \le k \le n:\ W\Big(\sum_{i=1}^{k} T_i\Big)\Big\},$$
where W(t) is the standard Wiener process. Since the sequence of stopping times $\{T_i, i \ge 1\}$ consists of iid random variables with finite expectation $E(T_k) = \mathrm{Var}(y_k^*) = E(y_k^{*2}) = 1$, $\forall k \ge 1$, by Khintchine's Strong Law of Large Numbers, $\frac{1}{n}\sum_{i=1}^{n} T_i$ converges almost surely to one. Thus, since
$$W\Big(\sum_{i=1}^{n} T_i\Big) = W\Big(n \cdot \frac{1}{n}\sum_{i=1}^{n} T_i\Big),$$
as $n \to \infty$,
$$W\Big(\sum_{i=1}^{n} T_i\Big) - W(n) = o(\sqrt{n}), \text{ a.s.}, \tag{2.17}$$
where $o(\sqrt{n})$ is the negligible contribution of the standard deviation to the basic fluctuation of the process (since the standard deviation of the process at stage n is of order $\sqrt{n}$). Thus the joint distribution of the sequence of partial sums $S^*_{(n)}$ can be approximated by the new sequence
$$W_{(n)} = \{1 \le k \le n:\ W(k) + o(\sqrt{k})\}.$$
Combining (2.14) and (2.16),
$$\sum_{i=1}^{n} y_i^* = n(\hat\theta_n - \theta)'\Big[\Big\{-\frac{1}{n}\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big\}\,I^{-1}(\theta)\Big]\frac{\dot g(\theta)}{\gamma}, \text{ a.s.}, \tag{2.18}$$
where the quantity in brackets on the right hand side is equal almost surely to the identity matrix of rank m by Lemma 2.2.2. Since the partial sum at stage n, $\sum_{i=1}^{n} y_i^*$, has approximately the same distribution as $W(n) + o(\sqrt{n})$, almost surely, the right hand side of (2.18) is thus
$$\frac{n(\hat\theta_n - \theta)'\,\dot g(\theta)}{\gamma} = W(n) + o(\sqrt{n}), \text{ a.s.}$$
QED
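Lemma 2.2.3
The distribution of the vector $\sqrt{n}(\hat\theta_n - \theta)$ is asymptotically $N(0, I^{-1}(\theta))$ under the following regularity conditions:
(i) Assume $\dfrac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}$ is continuous $\forall x$
(ii) $E\Big[\dfrac{\partial \ln f(x;\theta)}{\partial\theta}\Big|_{\theta}\Big] = 0$, $\forall x$
(iii) $E\Big\{\sup_{\theta^*:\,\|\theta^*-\theta\|<\delta}\Big\|\dfrac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta^*} - \dfrac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}\Big|_{\theta}\Big\|\Big\} \to 0$, $\forall x$, as $\delta \to 0$
(iv) $I(\theta) = E\Big[\dfrac{\partial \ln f(x;\theta)}{\partial\theta}\,\dfrac{\partial \ln f(x;\theta)}{\partial\theta'}\Big] = -E\Big[\dfrac{\partial^2 \ln f(x;\theta)}{\partial\theta\,\partial\theta'}\Big]$ is positive definite.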
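Proof of Lemma 2.2.3
By Lemmas 2.2.1 and 2.2.2,
$$\sqrt{n}(\hat\theta_n - \theta)' = \frac{1}{\sqrt{n}}\,\frac{\partial \ln L_n(X_n;\theta)}{\partial\theta'}\Big|_{\theta}\,I^{-1}(\theta), \text{ a.s., for large n.}$$
It suffices to show that
$$\frac{1}{\sqrt{n}}\,\frac{\partial \ln L_n(X_n;\theta)}{\partial\theta}\Big|_{\theta} \sim N(0, I(\theta)).$$
It would then follow that $\sqrt{n}(\hat\theta_n - \theta) \sim N(0, I^{-1}(\theta))$. From the definition of the joint likelihood $L_n(X_n;\theta)$,
$$\frac{1}{\sqrt{n}}\,\frac{\partial \ln L_n(X_n;\theta)}{\partial\theta}\Big|_{\theta} = \frac{1}{\sqrt{n}}\,\frac{\partial \ln \prod_{i=1}^{n} f(x_i;\theta)}{\partial\theta}\Big|_{\theta} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{\partial \ln f(x_i;\theta)}{\partial\theta}\Big|_{\theta}. \tag{2.19}$$
Assumption (iii) implies that $\forall \varepsilon > 0$,
$$\int_{\|\eta\| > \varepsilon\sqrt{n}} \|\eta\|^2\,dF \to 0, \tag{2.20}$$
where F is the common distribution function of $\eta = \dfrac{\partial \ln f(x;\theta)}{\partial\theta}\Big|_{\theta}$. Combining (2.20) with assumptions (ii) and (iv), and utilizing the Multivariate Central Limit Theorem III of Puri and Sen (1971), (2.19) converges to the m-dimensional normal distribution with mean vector 0 and covariance matrix $I(\theta)$.
QED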
At the n-th stage of sampling, $Z_n$ as given in (2.13) is calculated. From Theorem 2.1, in order to approximate the process $Z_n$ by a standard Wiener process, $Z_n$ must be multiplied by the quantity $1/\gamma$. It will be shown that in order to calculate boundary crossing probabilities, $Z_n$ must be multiplied by $1/\gamma$ once more. In Section 2.5 it will be seen that for the range of n having statistical interest, $\Delta^2 n$ is O(1) as $\Delta \to 0$, and thus $\frac{1}{\gamma^2} Z_n$ is multiplied by $\Delta$, so that at the n-th stage of sampling, the process of interest is then
$$Z'_n = \frac{\Delta}{\gamma^2} Z_n = \frac{\Delta\, n(\hat\theta_n - \theta)'\,\dot g(\theta)}{\gamma^2} + \frac{\Delta^2 n(\psi - \tfrac{1}{2})}{\gamma^2}. \tag{2.21}$$
Since the value of the true parameter $\theta$ is unknown, $\dot g(\theta)$ and $\gamma^2$ are estimated by
$$\dot g(\hat\theta_n) = \frac{\partial g}{\partial\theta}\Big|_{\hat\theta_n} \tag{2.22}$$
and
$$\hat\gamma_n^2 = \dot g(\hat\theta_n)'\Big[\frac{1}{n} I_n(\hat\theta_n)\Big]^{-1}\dot g(\hat\theta_n), \tag{2.23}$$
respectively.
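As a hedged worked example of the plug-in estimate (2.23), take a normal model with parameters (μ, σ²) and let the function of interest be the coefficient of variation g = σ/μ. The sketch below assumes the standard normal-model information matrix diag(1/σ², 1/(2σ⁴)); the function names and data are hypothetical.

```python
import math

def gamma_squared(gradient, information):
    """gamma^2 = g_dot' I^{-1} g_dot for a 2x2 information matrix."""
    (a, b), (c, d) = information
    det = a * d - b * c
    inverse = ((d / det, -b / det), (-c / det, a / det))
    g1, g2 = gradient
    return (g1 * (inverse[0][0] * g1 + inverse[0][1] * g2)
            + g2 * (inverse[1][0] * g1 + inverse[1][1] * g2))

def cv_gamma_squared(sample):
    """Plug-in gamma_n^2 for g(mu, sigma^2) = sigma / mu under a normal model."""
    n = len(sample)
    mu = sum(sample) / n
    var = sum((x - mu) ** 2 for x in sample) / n      # MLE of sigma^2
    sigma = math.sqrt(var)
    # Gradient of sigma/mu with respect to (mu, sigma^2), evaluated at the MLE
    gradient = (-sigma / mu ** 2, 1.0 / (2.0 * sigma * mu))
    # Per-observation information of the normal model at the MLE
    information = ((1.0 / var, 0.0), (0.0, 1.0 / (2.0 * var ** 2)))
    return gamma_squared(gradient, information)

value = cv_gamma_squared([8.0, 12.0])   # toy data: mean 10, MLE variance 4
print(value)
```

For this model the result reduces to cv⁴ + cv²/2 with cv = σ̂/μ̂, which gives a quick hand check of the matrix computation.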
Lemma 2.2.4
Under suitably generalized assumptions given by Wald (1949) and for continuous $\dfrac{\partial g}{\partial\theta}\Big|_{\theta}$ and $I_n(\theta)$, the following hold:
(i) $\hat\theta_n \to \theta$, a.s.
(ii) $\dot g(\hat\theta_n) \to \dot g(\theta)$, a.s.
(iii) $\frac{1}{n} I_n(\hat\theta_n) \to I(\theta)$, a.s.
(iv) $\hat\gamma_n^2 \to \gamma^2$, a.s.
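Proof of Lemma 2.2.4
The multiparameter analogs of the necessary assumptions of Wald (1949) are
(A1) $F(x;\theta)$ is absolutely continuous $\forall\theta$, $-\infty < x < \infty$.
(A2) For sufficiently small $\rho$ and for sufficiently large r, the expected values (under the true parameter value) of $\ln f^*(x;\theta,\rho)$ and $\ln \phi^*(x;r)$ are finite, where $\forall\rho > 0$:
$$f^*(x;\theta,\rho) = \begin{cases} f(x;\theta,\rho) & \text{if } f(x;\theta,\rho) > 1 \\ 1 & \text{otherwise} \end{cases} \quad\text{and}\quad f(x;\theta,\rho) = \sup_{\theta':\,\|\theta'-\theta\| \le \rho} f(x;\theta'),$$
and where $\forall r > 0$:
$$\phi^*(x;r) = \begin{cases} \phi(x;r) & \text{if } \phi(x;r) > 1 \\ 1 & \text{otherwise} \end{cases} \quad\text{and}\quad \phi(x;r) = \sup_{\theta:\,\|\theta\| > r} f(x;\theta).$$
(A3) If $\lim_{K\to\infty} \theta_K = \theta$, then $\lim_{K\to\infty} f(x;\theta_K) = f(x;\theta)$, $\forall x$.
(A4) If $\theta_1 \ne \theta_0$, then $F(x;\theta_1) \ne F(x;\theta_0)$ for at least one x.
(A5) If $\lim_{K\to\infty} \|\theta_K\| = \infty$, then $\lim_{K\to\infty} f(x;\theta_K) = 0$ for any x except perhaps on a set with zero probability.
(A6) $E\{|\ln f(x;\theta)|\} < \infty$.
(A7) The parameter space $\Theta$ is a closed subset of $R^m$.
(A8) The function $f(x;\theta,\rho)$ is a measurable function of x for any $\theta \in \Theta$ and $\rho > 0$.
Under assumptions (A1)-(A8), if $\theta$ is the true parameter value,
$$P\{\lim_{K\to\infty} \hat\theta_K = \theta\} = 1,$$
and thus (i) is proved.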
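To prove (ii) it suffices to show that $\forall\varepsilon > 0$,
$$P\Big\{\Big\|\frac{\partial g}{\partial\theta}\Big|_{\hat\theta_m} - \frac{\partial g}{\partial\theta}\Big|_{\theta}\Big\| > \varepsilon \text{ for at least one } m \ge n\Big\} \to 0, \text{ as } n \to \infty. \tag{2.24}$$
Assuming $\partial g/\partial\theta$ to be continuous in some neighborhood of the true value $\theta$ and using (i), $\hat\theta_m$ will be contained in such a neighborhood for large m. That is, for $\delta > 0$ and small, $\exists m$ such that $\|\hat\theta_m - \theta\| < \delta$. This implies due to continuity that $\forall\varepsilon > 0$, $\exists n$ such that $\forall m \ge n$,
$$\Big\|\frac{\partial g}{\partial\theta}\Big|_{\hat\theta_m} - \frac{\partial g}{\partial\theta}\Big|_{\theta}\Big\| < \varepsilon,$$
and thus (2.24) is proved.
To prove (iii) it suffices to show that $\forall\varepsilon > 0$,
$$P\Big\{\Big\|\frac{1}{m} I_m(\hat\theta_m) - \frac{1}{m} I_m(\theta)\Big\| > \varepsilon \text{ for at least one } m \ge n\Big\} \to 0, \text{ as } n \to \infty, \tag{2.25}$$
since then (iii) is proven by Lemma 2.2.2 and a similar argument as used for (ii) above. Assuming continuity of second order partial derivatives for the probability density function $f(x;\theta)$ in some neighborhood of $\theta$, $I_n(\theta)$ is also continuous in such a neighborhood of the true value $\theta$. Again, from (i) above $\hat\theta_m$ will be contained in such a neighborhood for m large. That is, $\forall\delta > 0$, $\exists m$ such that $\|\hat\theta_m - \theta\| < \delta$. Due to continuity in $\theta$, $\forall\varepsilon > 0$, $\exists n$ such that $\forall m > n$,
$$\Big\|\frac{1}{m} I_m(\hat\theta_m) - \frac{1}{m} I_m(\theta)\Big\| < \varepsilon,$$
and thus (2.25) is proved.
To prove (iv), it suffices to show that $\forall\varepsilon > 0$,
$$P\{|\hat\gamma_m^2 - \gamma^2| > \varepsilon \text{ for at least one } m \ge n\} \to 0, \text{ as } n \to \infty. \tag{2.26}$$
From the assumed continuity of $\partial g/\partial\theta$ and of $I_n(\theta)$, $\gamma^2$ is a continuous function of $\theta$, especially in a small neighborhood of the true value $\theta$. Once again $\hat\theta_m$ will be contained in such a neighborhood for m large. That is, $\forall\delta > 0$, $\exists m$ such that $\|\hat\theta_m - \theta\| < \delta$. Thus, due to continuity, $\forall\varepsilon > 0$, $\exists n$ such that $\forall m > n$, $|\hat\gamma_m^2 - \gamma^2| < \varepsilon$, and thus (2.26) is proved.
QED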
The test statistic at the n-th stage of sampling is thus
$$Z^*_n = \frac{\Delta\, n\{g(\hat\theta_n) - \tfrac{1}{2}(g_0 + g_1)\}}{\hat\gamma_n^2}. \tag{2.27}$$
Combining (2.11), (2.13) and (2.22), equation (2.27) can be rewritten as
$$Z^*_n = \frac{\Delta\, n(\hat\theta_n - \theta)'\,\dot g(\hat\theta_n)}{\hat\gamma_n^2} + \frac{\Delta^2 n(\psi - \tfrac{1}{2})}{\hat\gamma_n^2}. \tag{2.28}$$
From Lemma 2.2.4 and the result proved in Theorem 2.1,
$$Z^*_n = \frac{\Delta}{\gamma} W(n) + \frac{\Delta^2 n(\psi - \tfrac{1}{2})}{\gamma^2} + o(\Delta\sqrt{n}), \text{ a.s. for large n.} \tag{2.29}$$
2.2.3  Test Procedure
Corresponding to given preassigned strength (a l ,a 2), where a l
and a
-e
are the probabilities of Type I and Type II errors respectively,
2
consider two positive numbers A and B such that 0 < B < 1 < A < 00 where
a
l-a 2
2
A = - - and B = - al
l-a l
40
and define a • InA and b
let
nO(~)
= InB.
As in Bartlett (1946) and Cox (1963),
denote an initial sample size of at least moderately large
" with reasonable accuracy.
size in order to be able to estimate y by Y
n
Thus, assume that nO(~) is such that
=
nO(~)
lim
00
lim ~2no(~)
and
~~O
= O.
(2.30)
~~O
This assumption does not affect the asymptotic form of the ASN function.
At the $n$th stage of sampling, calculate $Z_n^*$ as in (2.27) and use the following decision scheme:
   If $Z_n^* \le b$, stop sampling and accept $H_0$ at stage $n$;
   If $b < Z_n^* < a$, continue by sampling the $(n+1)$st observation;
   If $a \le Z_n^*$, stop sampling and accept $H_1$ at stage $n$.
The various properties of the above defined test procedure are
examined in Sections 2.3-2.5.
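
The decision scheme of Section 2.2.3 can be sketched in a few lines of Python. This is only an illustration: the function names `sprt_bounds` and `decide` are hypothetical, and the value of the statistic $Z_n^*$ is assumed to have been computed elsewhere from (2.27).

```python
import math

def sprt_bounds(alpha1, alpha2):
    """Return (a, b) = (ln A, ln B) for a test of strength (alpha1, alpha2)."""
    A = (1.0 - alpha2) / alpha1   # upper boundary constant, A > 1
    B = alpha2 / (1.0 - alpha1)   # lower boundary constant, 0 < B < 1
    return math.log(A), math.log(B)

def decide(z_star, a, b):
    """Apply the three-way decision rule to the current statistic Z*_n."""
    if z_star <= b:
        return "accept H0"
    if z_star >= a:
        return "accept H1"
    return "continue"
```

For equal error rates the boundaries are symmetric about zero, so a statistic of zero always leads to continued sampling.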
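2.3  Termination Probability of the Test Procedure

Let $N(\Delta) = \min\{n \ge n_0(\Delta): Z_n^* \notin (b,a)\}$ denote the stopping variable for the process $Z_n^*$. We want to determine the asymptotic value as $n \to \infty$ of
\[ P\{N(\Delta) > n \mid \theta, \Delta\} \tag{2.31} \]
for given fixed values of $\theta$ and $\Delta$. By definition, (2.31) equals
\[ P\left\{ b < \frac{\Delta m[g(\hat\theta_m) - \tfrac12(g_0 + g_1)]}{\hat\gamma_m^2} < a, \ \forall\, n_0(\Delta) \le m \le n \,\Big|\, \theta, \Delta \right\}, \tag{2.32} \]
which is bounded from above by
\[ P\left\{ b < \frac{\Delta n[g(\hat\theta_n) - \tfrac12(g_0 + g_1)]}{\hat\gamma_n^2} < a \,\Big|\, \theta, \Delta \right\}. \tag{2.33} \]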
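Using (2.28), equation (2.33) can be rewritten as
\[ P\left\{ \frac{b\,\hat\gamma_n^2}{\Delta\sqrt{n}\,\gamma} < \frac{\sqrt{n}(\hat\theta_n - \theta)'\dot g(\hat\theta_n)}{\gamma} + \frac{\sqrt{n}\,\Delta(\psi - \tfrac12)}{\gamma} < \frac{a\,\hat\gamma_n^2}{\Delta\sqrt{n}\,\gamma} \,\Big|\, \theta, \Delta \right\}, \tag{2.34} \]
where for fixed $\Delta$,
\[ \lim_{n\to\infty}\left[\frac{b\,\hat\gamma_n^2}{\Delta\sqrt{n}\,\gamma}\right] = 0 \quad\text{and}\quad \lim_{n\to\infty}\left[\frac{a\,\hat\gamma_n^2}{\Delta\sqrt{n}\,\gamma}\right] = 0. \tag{2.35} \]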
From Slutzky's Theorem, Lemma 2.2.3 and part (ii) of Lemma 2.2.4,
\[ \frac{\sqrt{n}(\hat\theta_n - \theta)'\dot g(\hat\theta_n)}{\gamma} = \frac{\sqrt{n}(\hat\theta_n - \theta)'\dot g(\theta)}{\gamma} \]
almost surely, and has a limiting standard normal distribution [N(0,1)]. Noticing that
\[ \lim_{n\to\infty} \frac{\sqrt{n}\,\Delta(\psi - \tfrac12)}{\gamma} = \begin{cases} 0 & \text{if } g(\theta) = \tfrac12(g_0 + g_1) \\ \infty & \text{if } g(\theta) \ne \tfrac12(g_0 + g_1), \end{cases} \]
the expression between the inequality signs in (2.34) has either a limiting standard normal distribution or it is not bounded, while the boundaries tend to zero as $n$ gets large. Thus, the probability expressed in (2.34) goes to zero as $n \to \infty$ and the asymptotic probability that the test terminates is one.
2.4  Operating Characteristic Function

The operating characteristic (OC) function of a sequential test was defined previously as $P\{\text{accept } H_0 \mid \theta, \Delta\}$. For the study of the OC and ASN functions in this and the following section, we confine ourselves to local alternatives. Thus, for theoretical purposes, let $\Delta \to 0$ be the case to consider. As $\Delta \to 0$, if $n$ is not large, then $\Delta n \to 0$ and it would be unlikely that $Z_n^* \notin (b,a)$; thus assume $n$ is large enough for $Z_n^*$ not to be negligible. Then for fixed $\theta$, as $\Delta \to 0$ and $n$ is large as assumed above,
\[ \lim_{\Delta\to0} P(\text{accept } H_0 \mid \theta, \Delta) = \begin{cases} 0 & \text{if } g(\theta) > g_0 \\ 1 & \text{if } g(\theta) < g_0. \end{cases} \]
In order to avoid this limiting degeneracy, assume as in (2.11) that $g(\theta) = g_0 + \Delta\psi$, where $\psi \in [0,1] = J$, and thus examine local alternatives around $g_0$. Let
\[ L(\psi, \Delta) = P(\text{accept } H_0 \mid g(\theta) = g_0 + \Delta\psi) \]
denote the OC function for the test procedure and $N(\Delta)$ the stopping variable as defined in Section 2.3. Define $n_0(\Delta)$ as in (2.30).
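Theorem 2.2

Assume the conditions in Lemma 2.2.4 hold so that $\hat\gamma_n^2 \to \gamma^2$, a.s. as $n \to \infty$, and assume $\gamma^2$ is continuous in $\theta$ around the true value of the parameter. Then, as $\Delta \to 0$,
\[ \lim_{\Delta\to0} L(\psi,\Delta) = L(\psi) = \begin{cases} (A^{1-2\psi} - 1)/(A^{1-2\psi} - B^{1-2\psi}), & \text{if } \psi \ne \tfrac12 \\ \ln A/(\ln A - \ln B), & \text{if } \psi = \tfrac12. \end{cases} \tag{2.36} \]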
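Proof of Theorem 2.2

Introduce the two events
\[ E_1(\Delta) = \{Z_{m_1}^* < b \text{ for some } m_1 \ge n_0(\Delta) \text{ before } Z_{m_2}^* > a \text{ for some } m_1 > m_2 \ge n_0(\Delta)\} \]
and
\[ E_2(\Delta,\eta) = \{|\hat\gamma_m^2/\gamma^2 - 1| < \eta, \ \forall m \ge n_0(\Delta)\}, \]
where $\eta > 0$ is arbitrary, and let $E_1^c(\Delta)$ and $E_2^c(\Delta,\eta)$ denote their respective complements. Let $P_\psi$ stand for a probability evaluated given $g(\theta) = g_0 + \psi\Delta$. By definition,
\[ L(\psi,\Delta) = P_\psi\{E_1(\Delta)\} = P_\psi\{E_1(\Delta)E_2(\Delta,\eta)\} + P_\psi\{E_1(\Delta)E_2^c(\Delta,\eta)\}, \tag{2.37} \]
using the Theorem of Total Probability.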
The second summand in (2.37) is bounded from above by $P_\psi\{E_2^c(\Delta,\eta)\}$ and from the strong convergence of $\hat\gamma_n^2$ to $\gamma^2$, this probability goes to 0, $\forall \Delta, \eta$, as $n \to \infty$. Thus the second summand is negligible and it suffices to focus on the probability of the intersection event $E_1(\Delta)E_2(\Delta,\eta)$, that is
\[ E_1(\Delta)E_2(\Delta,\eta) = \left\{ \frac{\Delta m_1[g(\hat\theta_{m_1}) - \tfrac12(g_0+g_1)]}{\gamma^2} < b \text{ for some } m_1 \ge n_0(\Delta) \text{ before } \frac{\Delta m_2[g(\hat\theta_{m_2}) - \tfrac12(g_0+g_1)]}{\gamma^2} > a \text{ for some } m_1 > m_2 \ge n_0(\Delta) \right\}. \tag{2.38} \]
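Denote by $Z_n^0 = \Delta n[g(\hat\theta_n) - \tfrac12(g_0+g_1)]/\gamma^2$ for $n \ge n_0(\Delta)$ and define the stopping variables
\[ N_{ij}(\Delta) = \min\{n \ge n_0(\Delta): Z_n^0 \notin ((1+(-1)^i\eta)b, \ (1+(-1)^j\eta)a)\}, \quad i,j = 1,2. \]
Let $L_{ij}^0(\psi,\Delta)$ denote the OC function of a parallel sequential test to $Z_n^*$ based on $\{Z_n^0, n \ge n_0(\Delta)\}$ with boundaries $(1+(-1)^i\eta)b$ and $(1+(-1)^j\eta)a$, $i,j = 1,2$. Then $\forall \epsilon > 0$, $\eta > 0$, $\exists \Delta_0 > 0$ such that $\forall\, 0 < \Delta < \Delta_0$,
\[ L_{21}^0(\psi,\Delta) - \epsilon \le L(\psi,\Delta) \le L_{12}^0(\psi,\Delta) + \epsilon, \quad \forall \psi \in J, \tag{2.39} \]
where from the notation given above, $L_{21}^0(\psi,\Delta)$ denotes the OC function of $Z_n^0$ with boundaries $((1+\eta)b, (1-\eta)a)$ and $L_{12}^0(\psi,\Delta)$ denotes the OC function of $Z_n^0$ with boundaries $((1-\eta)b, (1+\eta)a)$. Noting that $|(1+\eta)b| > |b|$ and $|(1-\eta)b| < |b|$ for $\eta > 0$, with equality for $\eta = 0$, the expression (2.39) follows.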
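Since $\epsilon$ and $\eta$ are arbitrary, from (2.39) it is clear that it suffices to show that
\[ \lim_{\eta\to0}\lim_{\Delta\to0} L_{ij}^0(\psi,\Delta) = L(\psi), \quad i,j = 1,2. \tag{2.40} \]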
Now introduce a sequence of right-continuous, non-decreasing and integer valued functions
\[ K_\Delta(t), \quad t > 0, \ \forall \Delta > 0, \tag{2.41} \]
together with the processes
\[ W_\Delta(t), \quad t \ge 0. \tag{2.42} \]
Assume that as $\Delta \to 0$, $\forall\, 0 < T < \infty$,
\[ \{W_\Delta(t), t \in [0,T]\} \to \{W(t), t \in [0,T]\}, \tag{2.43} \]
the standard continuous Wiener process. The intuitive interpretation for (2.41) - (2.43) follows from Theorem 2.1 and the desire to introduce a sequence that will enable an almost sure approximation of the integer valued Wiener process $W(n)$ by a continuous valued Wiener process, $W(t)$. One is then able to utilize the result for continuous standard Wiener processes $W(t)$ stated in Anderson (1960):
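\[ \lim_{T\to\infty} P\{W(t), t \in [0,T], \text{ first crosses the line } c^{-1}c_1 + c\,c_0 t \text{ before crossing the line } c^{-1}c_2 + c\,c_0 t\} = \begin{cases} (e^{2c_0c_2} - 1)/(e^{2c_0c_2} - e^{2c_0c_1}), & c_0 \ne 0 \\ c_2/(c_2 - c_1), & c_0 = 0. \end{cases} \tag{2.44} \]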
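Combining the definitions of $K_\Delta(t)$ and $Z_n^0$, define
\[ Z_{K_\Delta(t)}^0 = \frac{\Delta K_\Delta(t)[g(\hat\theta_{K_\Delta(t)}) - \tfrac12(g_0+g_1)]}{\gamma^2} = \frac{\Delta K_\Delta(t)[g(\hat\theta_{K_\Delta(t)}) - g(\theta)]}{\gamma^2} + \frac{\Delta^2 K_\Delta(t)(\psi - \tfrac12)}{\gamma^2}. \tag{2.45} \]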
From (2.42),
\[ Z_{K_\Delta(t)}^0 = \frac{1}{\gamma} W_\Delta(t) + \frac{(\psi - \tfrac12)\Delta^2 K_\Delta(t)}{\gamma^2} + o(1), \quad \text{a.s.}, \ \forall t \in [0,T]. \tag{2.46} \]
From (2.41), $\Delta^2 K_\Delta(t) = O(t)$ and $\lim_{\Delta\to0} \Delta^2 K_\Delta(t) = t$. Using (2.43), as $\Delta \to 0$,
\[ Z_{K_\Delta(t)}^0 = \frac{1}{\gamma} W(t) + \frac{(\psi - \tfrac12)t}{\gamma^2} + o(1), \quad \text{a.s.}, \ \forall t \in [0,T]. \]
Now let $c = 1/\gamma$, $c_0 = \tfrac12 - \psi$, $c_1 = (1+(-1)^i\eta)b$, $c_2 = (1+(-1)^j\eta)a$ in (2.44), for $0 < \eta < 1$ arbitrary. Then from (2.30), (2.43), (2.44) and (2.45), and letting $\Delta \to 0$, the OC function of $Z_{K_\Delta(t)}^0$ is found to be given by
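\[ L_{ij}^0(\psi,\Delta) \to L_{ij}^0(\psi) = \lim_{T\to\infty} P\Big\{W(t), t \in [0,T] \text{ first crosses the line } \gamma(1+(-1)^i\eta)b + \tfrac{(\frac12 - \psi)t}{\gamma} \text{ before crossing the line } \gamma(1+(-1)^j\eta)a + \tfrac{(\frac12 - \psi)t}{\gamma}\Big\} \]
\[ = \begin{cases} \dfrac{e^{2(\frac12 - \psi)(1+(-1)^j\eta)a} - 1}{e^{2(\frac12 - \psi)(1+(-1)^j\eta)a} - e^{2(\frac12 - \psi)(1+(-1)^i\eta)b}}, & \text{if } \psi \ne \tfrac12 \\[2ex] \dfrac{a(1+(-1)^j\eta)}{a(1+(-1)^j\eta) - b(1+(-1)^i\eta)}, & \text{if } \psi = \tfrac12. \end{cases} \tag{2.47} \]
The behavior of $Z_{K_\Delta(t)}^0$ is exactly like that of $Z_n^0$ and thus $W_\Delta(t)$ is its appropriate stochastic process approximation. The OC function given in (2.47) is thus the OC function of $Z_n^0$. From (2.47),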
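\[ L_{ij}^0(\psi) = \lim_{T\to\infty} P\Big\{\tfrac{1}{\gamma}W(t), t \in [0,T] \text{ first crosses the line } (1+(-1)^i\eta)b + \tfrac{(\frac12 - \psi)t}{\gamma^2} \text{ before crossing the line } (1+(-1)^j\eta)a + \tfrac{(\frac12 - \psi)t}{\gamma^2}\Big\} \]
\[ = \lim_{T\to\infty} P\Big\{\tfrac{1}{\gamma}W(t) + \tfrac{(\psi - \frac12)t}{\gamma^2} \text{ first crosses the line } (1+(-1)^i\eta)b \text{ before crossing the line } (1+(-1)^j\eta)a\Big\} = \text{OC function of } Z_n^0. \tag{2.48} \]
Now, $L(\psi) = \lim_{\eta\to0}\{\lim_{\Delta\to0} L_{ij}^0(\psi,\Delta)\} = \lim_{\eta\to0} L_{ij}^0(\psi)$ and thus from (2.47) and (2.48),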
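\[ L(\psi) = \lim_{\eta\to0} \begin{cases} \dfrac{e^{-2(\psi - \frac12)(1+(-1)^j\eta)a} - 1}{e^{-2(\psi - \frac12)(1+(-1)^j\eta)a} - e^{-2(\psi - \frac12)(1+(-1)^i\eta)b}}, & \psi \ne \tfrac12 \\[2ex] \dfrac{a(1+(-1)^j\eta)}{a(1+(-1)^j\eta) - b(1+(-1)^i\eta)}, & \psi = \tfrac12 \end{cases} \;=\; \begin{cases} \dfrac{e^{-2(\psi - \frac12)a} - 1}{e^{-2(\psi - \frac12)a} - e^{-2(\psi - \frac12)b}}, & \psi \ne \tfrac12 \\[2ex] a/(a-b), & \psi = \tfrac12, \end{cases} \]
and letting $e^a = A$ and $e^b = B$,
\[ L(\psi) = \begin{cases} (A^{1-2\psi} - 1)/(A^{1-2\psi} - B^{1-2\psi}), & \psi \ne \tfrac12 \\ \ln A/(\ln A - \ln B), & \psi = \tfrac12. \end{cases} \]
QED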
Note that the test is asymptotically distribution free since $L(\psi)$ is independent of $F$ and $\theta$, and that the test is asymptotically consistent with strength $(\alpha_1, \alpha_2)$ since $L(0) = 1 - \alpha_1$ and $L(1) = \alpha_2$. In order to compute the relative efficiency of the test procedure with respect to other sequential procedures, the average sample number (ASN) function is calculated in the following section.
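
As a numerical illustration of the limiting OC function (2.36), the following sketch evaluates $L(\psi)$ and can be used to confirm the endpoint values $L(0) = 1 - \alpha_1$ and $L(1) = \alpha_2$. The function name `oc_limit` and the tolerance used to detect $\psi = \tfrac12$ are choices made here for illustration, not part of the original development.

```python
import math

def oc_limit(psi, alpha1, alpha2):
    """Limiting OC function L(psi) of (2.36) for strength (alpha1, alpha2)."""
    a = math.log((1.0 - alpha2) / alpha1)   # a = ln A
    b = math.log(alpha2 / (1.0 - alpha1))   # b = ln B
    h = 1.0 - 2.0 * psi
    if abs(h) < 1e-12:                      # psi = 1/2: use the limiting form
        return a / (a - b)
    # (A^h - 1) / (A^h - B^h), written with exponentials of a*h and b*h
    return math.expm1(a * h) / (math.exp(a * h) - math.exp(b * h))
```

For example, `oc_limit(0.0, 0.05, 0.10)` evaluates the probability of accepting $H_0$ when $H_0$ holds, which should equal $1 - \alpha_1$.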
2.5  Average Sample Number Function

We want to get fairly general regularity conditions such that
\[ E_\psi[N(\Delta)] < \infty, \quad \forall \psi \in J, \]
where $N(\Delta)$ is as defined in Section 2.3. We thus make the following assumptions:
   (i) Assume $\exists \delta > 0$ s.t. $\forall \eta > 0$, $\exists k > 0$ and an integer $n^* = n^*(\delta,\eta,k)$ for which
       $P_\psi\{|\hat\gamma_n^2/\gamma^2 - 1| > \eta\} \le k n^{-1-\delta}$, $\forall n > n^*$.
   (ii) $n^{r/2} E_\psi[|g(\hat\theta_n) - g(\theta)|^r] \le c_r < \infty$, $\forall n \ge n_0(\Delta)$, for $c_r, r > 0$.
The following theorem then establishes the result for the asymptotic average sample number (ASN) function.
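Theorem 2.3

If $L(\psi)$ denotes the OC function of the test statistic, then the asymptotic ASN function of the test statistic is
\[ \lim_{\Delta\to0} \Delta^2 E_\psi[N(\Delta)] = \begin{cases} [bL(\psi) + a(1 - L(\psi))]\,\dfrac{\gamma^2}{\psi - \tfrac12}, & \psi \ne \tfrac12 \\[2ex] -ab\gamma^2, & \psi = \tfrac12. \end{cases} \tag{2.49} \]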
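Proof of Theorem 2.3

First look at the case when $\psi \ne \tfrac12$. For arbitrary $\epsilon > 0$, chosen small, define
\[ n_1(\Delta) = \max\{n: \Delta^2 n \le \epsilon\} \quad\text{and}\quad n_2(\Delta) = \min\{n: \Delta^2 n \ge \epsilon^{-1}\}. \tag{2.50} \]
Then,
\[ \Delta^2 E_\psi[N(\Delta)] = \Delta^2\Big[\sum_{n \le n_1(\Delta)} nP_\psi(N(\Delta)=n) + \sum_{n_1(\Delta) < n \le n_2(\Delta)} nP_\psi(N(\Delta)=n) + \sum_{n > n_2(\Delta)} nP_\psi(N(\Delta)=n)\Big]. \tag{2.51} \]
Now, $\Delta^2 \sum_{n \le n_1(\Delta)} nP_\psi(N(\Delta)=n) = \sum_{n \le n_1(\Delta)} \Delta^2 n P_\psi(N(\Delta)=n)$ and by the definition of $n_1(\Delta)$, it is bounded from above by $\sum_{n \le n_1(\Delta)} \epsilon P_\psi(N(\Delta)=n) = \epsilon P_\psi(N(\Delta) \le n_1(\Delta)) < \epsilon$. Thus,
\[ \Delta^2 \sum_{n \le n_1(\Delta)} nP_\psi(N(\Delta)=n) \le \epsilon, \tag{2.52} \]
and thus can be made negligible since $\epsilon$ can be chosen arbitrarily small.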
It is easily shown that
\[ \sum_{n>k} nP_\psi(N(\Delta)=n) = (k+1)P_\psi(N(\Delta) > k) + \sum_{n>k} P_\psi(N(\Delta) > n). \tag{2.53} \]
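
The tail-sum identity (2.53) can be checked numerically on any finitely supported distribution. The sketch below does so; the function name `tail_sum_identity` and the representation of the distribution as a list `pmf` with `pmf[n]` $= P(N = n)$ are assumptions made for illustration only.

```python
def tail_sum_identity(pmf, k):
    """Return (lhs, rhs) of (2.53) for a pmf on {0, 1, ..., len(pmf) - 1}."""
    nmax = len(pmf) - 1

    def tail(j):
        # P(N > j)
        return sum(pmf[j + 1:])

    lhs = sum(n * pmf[n] for n in range(k + 1, nmax + 1))
    rhs = (k + 1) * tail(k) + sum(tail(n) for n in range(k + 1, nmax + 1))
    return lhs, rhs
```

The identity follows from writing $n = \sum_{j=0}^{n-1} 1$ and interchanging the order of summation, which is the rearrangement the two sides of the function compute.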
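Now, using assumption (i), for $\Delta$ sufficiently small so that $n_1(\Delta) > n^*$ and $|\hat\gamma_n^2/\gamma^2 - 1| < \eta$, $\forall \psi \in J$, for all $\eta > 0$,
\[ P_\psi(N(\Delta) > n) \le P_\psi\{b\hat\gamma_n^2 < \Delta n[g(\hat\theta_n) - \tfrac12(g_0 + g_1)] < a\hat\gamma_n^2\} \]
\[ < P_\psi\{b\gamma^2(1+2\eta) < \Delta n[g(\hat\theta_n) - g(\theta) + (\psi - \tfrac12)\Delta] < a\gamma^2(1+2\eta)\} + O(n^{-1-\delta}). \tag{2.54} \]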
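Rewriting (2.54),
\[ P_\psi\{N(\Delta) > n\} < P_\psi\Big\{\frac{b\gamma^2(1+2\eta)}{\Delta\sqrt{n}} - \Delta\sqrt{n}(\psi - \tfrac12) < \sqrt{n}[g(\hat\theta_n) - g(\theta)] < \frac{a\gamma^2(1+2\eta)}{\Delta\sqrt{n}} - \Delta\sqrt{n}(\psi - \tfrac12)\Big\} + O(n^{-1-\delta}), \quad \forall \eta > 0. \tag{2.55} \]
Now define $\eta_1 = b\gamma^2(1+2\eta)/(\Delta\sqrt{n})$, $\eta_2 = a\gamma^2(1+2\eta)/(\Delta\sqrt{n})$ and $\rho_n = \Delta\sqrt{n}(\psi - \tfrac12)$, so that (2.55) becomes
\[ P_\psi\{\eta_1 - \rho_n < \sqrt{n}[g(\hat\theta_n) - g(\theta)] < \eta_2 - \rho_n\} + O(n^{-1-\delta}) \le P_\psi\{|\sqrt{n}[g(\hat\theta_n) - g(\theta)]| > c_n\}, \tag{2.56} \]
where $c_n = \tfrac12\Delta\sqrt{n}\,|\psi - \tfrac12| > 0$. By assumption (ii), $\forall c_n > 0$ and $n \ge n_2(\Delta)$, and using the Markov Inequality,
\[ P_\psi\{|\sqrt{n}[g(\hat\theta_n) - g(\theta)]| > c_n\} \le \frac{n^{r/2}E_\psi|g(\hat\theta_n) - g(\theta)|^r}{c_n^r} \le \frac{c_r}{c_n^r} = \frac{c_r}{[\tfrac12|\psi - \tfrac12|]^r \Delta^r n^{r/2}}. \tag{2.57} \]
From (2.56) and (2.57),
\[ P_\psi(N(\Delta) > n) \le P_\psi\{|\sqrt{n}[g(\hat\theta_n) - g(\theta)]| > \tfrac12|\psi - \tfrac12|\Delta\sqrt{n}\} \le k_r \Delta^{-r} n^{-r/2}, \quad \forall n \ge n_2(\Delta), \tag{2.58} \]
where $k_r = c_r/[\tfrac12|\psi - \tfrac12|]^r$.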
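From (2.53),
\[ \Delta^2 \sum_{n > n_2(\Delta)} nP_\psi(N(\Delta)=n) = \Delta^2[n_2(\Delta)+1]P_\psi\{N(\Delta) > n_2(\Delta)\} + \Delta^2 \sum_{n > n_2(\Delta)} P_\psi(N(\Delta) > n). \tag{2.59} \]
From (2.58),
\[ \Delta^2 \sum_{n > n_2(\Delta)} P_\psi(N(\Delta) > n) \le k_r \Delta^{2-r} \sum_{n > n_2(\Delta)} n^{-r/2}, \]
and since for $p > 1$, $\sum_{n>m} n^{-p} \le m^{1-p}/(p-1)$,
\[ k_r \Delta^{2-r} \sum_{n > n_2(\Delta)} n^{-r/2} \le \frac{k^*}{[\Delta^2 n_2(\Delta)]^{r/2-1}}. \tag{2.60} \]
Finally, from the definition of $n_2(\Delta)$,
\[ \Delta^2 \sum_{n > n_2(\Delta)} P_\psi(N(\Delta) > n) \le k^* \epsilon^{r/2-1} \quad \text{for some } k^* > 0. \tag{2.61} \]
From (2.58) and for large values of $n_2(\Delta)$, analogously to (2.60) and (2.61),
\[ \Delta^2(n_2(\Delta) + 1)P_\psi(N(\Delta) > n_2(\Delta)) \le 2\Delta^2 n_2(\Delta)\, k_r \Delta^{-r} n_2^{-r/2}(\Delta) = \frac{k_r^*}{[\Delta^2 n_2(\Delta)]^{r/2-1}} \le k_r^* \epsilon^{r/2-1} \quad \text{for some } k_r^* > 0. \tag{2.62} \]
From (2.59), (2.61) and (2.62), $\Delta^2 \sum_{n > n_2(\Delta)} nP_\psi(N(\Delta)=n)$ can be made arbitrarily small by an appropriate choice of $\epsilon$, for $r > 2$:
\[ \Delta^2 \sum_{n > n_2(\Delta)} nP_\psi(N(\Delta)=n) \le (k^* + k_r^*)\,\epsilon^{r/2-1}. \tag{2.63} \]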
Due to (2.52) and (2.63), it suffices to focus on the middle summand in (2.51). Replace $Z_n^*$ by
\[ \tilde Z_{n,i}(\psi) = \frac{\Delta\{n[g(\hat\theta_n) - g(\theta)] + n(\psi - \tfrac12)\Delta\}}{\gamma^2(1+(-1)^i\eta)}, \quad \forall n \ge n_0(\Delta), \ i = 1,2 \text{ and } \forall \psi \in J, \tag{2.64} \]
with corresponding stopping variables $\tilde N_i(\Delta)$, $i = 0,1,2$. (Let $\tilde N_0(\Delta) = N(\Delta)$ for convenience of notation.) The results (2.52) and (2.63) still follow for $\tilde Z_{n,i}(\psi)$, $i = 1,2$.
I
n 1 (6)< n~ n 2 (6)
Now, for i=O,l,2, using (2.53), it follows that
nP {N. (6)=n} =
~ 1
(n l (6) + 1) P~(Ni (6) > n 1 (6)) + I
P~(Ni(6) > n)
n>n (6)
l
-
= (n l (6) + 1) P~(Ni(6) > n 1 (6)) - (n2(~)+1) P~(Ni(~) > n 2 (6))
(2.65)
-
From (2.65), assumption (i) and the definitions of N.
1
(~),
i=1,2,
- o(nl(~)-O)
(2.66)
By the definition of
nl(~)'
lim n (6)=O and thus
l
t
using (2.66), in order
~-+-O
to prove the desired result, it suffices to show that
2
lim flim 6
n-+-O 6-+-0
n l (6)<
...
equals (2.49).
I
n~
n 2 (6)
n
P~(Ni (~)
= n)]
(2.67)
Now, given the setup used, assumption (i) and the convergence of the process to a standard Wiener Process, Theorem 2.2 also applies to $\{\tilde Z_{n,i}(\psi), n \ge n_0(\Delta)\}$, and for $\eta\,(>0)$ arbitrarily small, the OC function of the corresponding sequential procedure based on $\tilde N_i(\Delta)$, $i = 1,2$, will be arbitrarily close to $L(\psi)$, $\forall \psi \in J$, as $\Delta \to 0$. Assuming that
\[ n^{-1/2}\Big\{\max_{n_0(\Delta) < k \le n} k\,|g(\hat\theta_k) - g(\hat\theta_{k-1})|\Big\} \to 0 \text{ in } L_2 \text{ norm}, \tag{2.68} \]
insures that the asymptotic (as $\Delta \to 0$) excess of $\tilde Z_{\tilde N_i(\Delta),i}(\psi)$ over the boundaries of $a$ and $b$ is negligible for $n_1(\Delta) \le \tilde N_i(\Delta) \le n_2(\Delta)$, $i = 1,2$. Hence, for $\eta > 0$ arbitrarily small, as $\Delta \to 0$, for $i = 1,2$,
\[ E_\psi[\tilde Z_{\tilde N_i(\Delta),i}(\psi)] \to bL(\psi) + a(1 - L(\psi)). \tag{2.69} \]
Recalling (2.57) and (2.58), the probability that the process exceeds a given boundary involving $n$ is $O(n^{-r/2})$, and it follows that for $i = 1,2$,
\[ \liminf_{n\to\infty} \int_{[N>n]} |\tilde Z_{n,i}(\psi)|\,dP_\psi = 0. \tag{2.70} \]
Now, using (2.64),
2
1
b. new - 2)
"
- g(~)]
Z . (w) - 2
1· = --..,,2---1.,....·n,l
y (1+(-1) n)
y (1+(-1) n)
-
~n[g(~n)
and using Wald's Identity, for i=1,2,
(2.71)
From (2.69), as
~ ~
0,
~
b L(W) + a(l-L(w)) -
2
1
(W - 2)E
(N (~))
W
-i
y (1+(-1) n)
2
i
(2.72)
Combining (2.71) and (2.72), as 6 ~ 0, for i=1,2 and ~ ;
1,
(2.73)
The desired result for $\psi \ne \tfrac12$ is obtained as $\eta \to 0$.

For $\psi = \tfrac12$, the above proof does not work since in (2.55) both the upper and lower bounds for $\sqrt{n}[g(\hat\theta_n) - g(\theta)]$ converge to 0 since $\rho_n = 0$, and thus (2.58) may not be $O(n^{-r/2})$. Thus choose a sequence
\[ \{\psi_t = \tfrac12 + (-1)^t/t, \ t \ge 1\} \]
on which to evaluate $\xi(\psi_t, \gamma^2)$, where
\[ \xi(\psi, \gamma^2) = [bL(\psi) + a(1 - L(\psi))]\,\frac{\gamma^2}{\psi - \tfrac12}, \quad \psi \ne \tfrac12. \]
Thus, in order to calculate the value of the ASN function at $\psi = \tfrac12$ it suffices to calculate
\[ \lim_{t\to\infty} \xi(\psi_t, \gamma^2) = \lim_{\psi_t \to \frac12} \xi(\psi_t, \gamma^2). \tag{2.74} \]
Now, with repeated uses of L'Hospital's Rule, for $c = 1 - 2\psi_t$,
\[ \lim_{\psi_t\to\frac12} \xi(\psi_t,\gamma^2) = \gamma^2(b - a)\lim_{\psi_t\to\frac12} \frac{e^{(a+b)c}(2a - 2b) + 2be^{bc} - 2ae^{ac}}{[e^{ac} - e^{bc}]^2} \]
\[ = \gamma^2(b - a)\lim_{\psi_t\to\frac12} \frac{8(a+b)(a^2 - b^2)e^{(a+b)c} - 8a^3e^{ac} + 8b^3e^{bc}}{2[e^{ac} - e^{bc}][4a^2e^{ac} - 4b^2e^{bc}] + 2[-2ae^{ac} + 2be^{bc}]^2} = -ab\gamma^2. \]
QED
Thus, in order to compute the relative efficiency of the test procedure, one can use the result of Theorem 2.3 for comparison purposes with other sequential procedures having the same OC function as given in Section 2.4.
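
A short numerical sketch of the limiting ASN function (2.49) follows. It illustrates that the value $-ab\gamma^2$ at $\psi = \tfrac12$ is the continuous extension of the expression valid for $\psi \ne \tfrac12$. The function names and the argument `gamma2` (standing for $\gamma^2$) are hypothetical; $a = \ln A$ and $b = \ln B$ are taken as inputs.

```python
import math

def oc_limit(psi, a, b):
    """Limiting OC function L(psi) of (2.36), with a = ln A and b = ln B."""
    h = 1.0 - 2.0 * psi
    if abs(h) < 1e-12:
        return a / (a - b)
    return math.expm1(a * h) / (math.exp(a * h) - math.exp(b * h))

def asn_limit(psi, a, b, gamma2):
    """Limit of Delta^2 * E[N(Delta)] as Delta -> 0, following (2.49)."""
    if abs(psi - 0.5) < 1e-12:
        return -a * b * gamma2
    L = oc_limit(psi, a, b)
    return (b * L + a * (1.0 - L)) * gamma2 / (psi - 0.5)
```

With $\alpha_1 = \alpha_2 = 0.05$, so that $a = -b = \ln 19$, evaluating `asn_limit` slightly away from $\psi = 0.5$ gives a value very close to $a^2\gamma^2$, in agreement with the L'Hospital argument.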
2.6  Example of an ARE Calculation

2.6.1  Background

For purposes of illustration, consider a compliance testing situation in air pollution. In environmental monitoring of air pollution concentration, there is a need to develop a testing procedure that will enable a regulatory agency to sequentially test a polluter's compliance with existing air pollution standards. This numerical example is elaborated upon in Chapter 4, and the reader is directed there for more detailed background explanations. The standards are usually specified in terms of a maximum allowable hourly concentration value of a pollutant, say $x_0$, not to be exceeded more than $\ell$ times in one year. The distribution of the random variable $x$, denoting the measured hourly concentration of the given pollutant, depends on several parameters such as meteorological conditions, emission activity, time of day, and other covariables that can be described by a vector of unknown parameters $\beta$. Let us assume that the measured concentrations behave as if they were independent and identically distributed from $F(x;\theta,\beta)$, where $F$ is of a known form, and $\theta$ and $\beta$ are unknown parameters. A function of interest can be defined as $g(\theta,\beta,\ell) = (1 - \ell D^{-1})100$ percentile of the distribution function $F$, where $D = 8784$ for leap years or $D = 8760$ otherwise. In a compliance testing setup, one would be interested in testing
\[ H_0: g(\theta,\beta,\ell) = x_0 \quad\text{vs.}\quad H_1: g(\theta,\beta,\ell) = x_0 + \Delta, \tag{2.75} \]
where $\Delta > 0$ is known.
Sequentially there are two ways of testing the above hypothesis:
(i) Wald's Binomial Sequential Test (WBST)
(ii) Generalized Sequential Likelihood Ratio Test (GSLRT).
We present both tests and show that they have the same OC function.
We
then look at their asymptotic relative efficiency (ARE) as a means of
comparing the tests and examining the claim that the GSLRT will have
improved efficiency over the WBST.
2.6.2  Wald's Binomial Sequential Test (WBST)

Define the sequence of random variables $\{c_i, i \ge 1\}$ to be the following sequence of indicator variables:
\[ c_i = \begin{cases} 1 & \text{if } x_i \le x_0 \quad \text{(compliance or "success")} \\ 0 & \text{if } x_i > x_0 \quad \text{(non-compliance or "failure")}. \end{cases} \]
Let $p$ denote the proportion of observations less than or equal to $x_0$ and assume that the form of the d.f. $F$ is known in order to make comparisons with the GSLRT. We are thus interested in testing the following hypotheses:
\[ H_0: p = p_0 = 1 - \ell D^{-1} \quad\text{vs.}\quad H_1: p = p_1 = 1 - \ell D^{-1} + \Delta', \tag{2.76} \]
using a first order Taylor series approximation.
Define $\psi$ as before so that $p = F(x_0;\theta,\beta) = p_0 + \psi\Delta'$, where $\psi \in [0,1]$. The sequential test for a given strength $(\alpha_1,\alpha_2)$ for the hypothesis above is given by the test statistic at the $n$th stage
\[ Z_n^W = \Big(\sum_{i=1}^n c_i\Big)\ln\frac{p_1}{p_0} + \Big(n - \sum_{i=1}^n c_i\Big)\ln\frac{1-p_1}{1-p_0} \tag{2.77} \]
and the decision rule at the $n$th stage,
   If $Z_n^W \le b$, stop sampling and accept $H_0$;
   If $Z_n^W \ge a$, stop sampling and reject $H_0$;
   If $b < Z_n^W < a$, continue sampling the $(n+1)$st observation,                    (2.78)
where $a$ and $b$ are as defined in Section 2.2.3.
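
The WBST of (2.77) and (2.78) can be sketched as a simple loop over compliance indicators. In this illustration the function name `run_wbst`, the return convention, and the numerical values of `p0` and `p1` used in any call are hypothetical; only the statistic and the three-way rule are taken from the text.

```python
import math

def run_wbst(indicators, p0, p1, alpha1, alpha2):
    """Run Wald's binomial sequential test on 0/1 indicators c_1, c_2, ...

    Returns (decision, n), where n is the stage at which sampling stopped,
    or ("continue", n) if the data ran out before a boundary was reached.
    """
    a = math.log((1.0 - alpha2) / alpha1)   # upper boundary, ln A
    b = math.log(alpha2 / (1.0 - alpha1))   # lower boundary, ln B
    up = math.log(p1 / p0)                  # increment for a "success"
    down = math.log((1.0 - p1) / (1.0 - p0))  # increment for a "failure"
    successes = 0
    n = 0
    for n, c in enumerate(indicators, start=1):
        successes += c
        z = successes * up + (n - successes) * down   # Z_n^W of (2.77)
        if z <= b:
            return "accept H0", n
        if z >= a:
            return "reject H0", n
    return "continue", n
```

Because the statistic depends on the data only through the running count of successes, each stage costs constant time.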
2.6.3  Generalized Sequential Likelihood Ratio Test (GSLRT)

The formulation of the hypotheses given in (2.75) is testing the same alternatives as the ones in (2.76), given that $\Delta' = \Delta f(x_0;\theta,\beta)$. Once again, define $\psi \in [0,1]$ s.t. $g(\theta,\beta,\ell) = x_0 + \psi\Delta$. At stage $n$ of sampling, the sequential test for a given strength $(\alpha_1,\alpha_2)$ is given by the test statistic
(2. 79)
n
aZIn.1=II If(x.1_
;</»
--
a¢ d</>'
and the decision rule at stage $n$ as in (2.78) with $Z_n^*$ replacing $Z_n^W$.

2.6.4  Operating Characteristic Functions for Local Alternatives

The OC function of the GSLRT is given by (2.36). As $\Delta \to 0$, $\Delta' \to 0$ and the OC function of WBST is given by (Wald (1947))
\[ L^W = \frac{A^h - 1}{A^h - B^h}, \quad\text{where}\quad p = \left[1 - \Big(\frac{1-p_1}{1-p_0}\Big)^h\right] \div \left[\Big(\frac{p_1}{p_0}\Big)^h - \Big(\frac{1-p_1}{1-p_0}\Big)^h\right], \quad -\infty \le h \le \infty. \tag{2.80} \]
In order to compare the OC function of (2.36) with (2.80), one needs to look at them over the same range of values. The function $L(\psi)$ in (2.36) ranges from [0,1] and in order for $L^W(\psi)$ to range from [0,1], $h$ must range from [-1,1]. Letting $h = 1 - 2\psi$ in (2.80) provides the same range and $L^W(\psi) = L(\psi)$ for $\psi \in [0,1]$ except for $\psi = \tfrac12$. For $\psi = \tfrac12$, $h = 0$ and (see Wald (1947, p.95)):
\[ L^W(\tfrac12) = \frac{\ln\big(\frac{1-\alpha_2}{\alpha_1}\big)}{\ln\big(\frac{1-\alpha_2}{\alpha_1}\big) - \ln\big(\frac{\alpha_2}{1-\alpha_1}\big)} = \frac{\ln A}{\ln A - \ln B} = L(\tfrac12). \]
We thus have that $L^W(\psi) = L(\psi)$, $\forall \psi \in [0,1]$.
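2.6.5  Asymptotic Relative Efficiency (ARE)

The average sample number (ASN) function for the GSLRT is given by (2.49). Letting $p = p_0 + \psi\Delta'$, the expected sample stopping variable for WBST is
\[ E_\psi[n(\Delta)] = \frac{L(\psi)\ln\big(\frac{\alpha_2}{1-\alpha_1}\big) + [1 - L(\psi)]\ln\big(\frac{1-\alpha_2}{\alpha_1}\big)}{P}, \tag{2.81} \]
where
\[ P = p\ln\frac{p_1}{p_0} + (1-p)\ln\frac{1-p_1}{1-p_0}. \tag{2.82} \]
Thus, the asymptotic ASN function, as $\Delta \to 0$, for WBST is given by
\[ \lim_{\Delta\to0} \Delta^2 E_\psi[n(\Delta)] = \Big\{L(\psi)\ln\Big(\frac{\alpha_2}{1-\alpha_1}\Big) + [1 - L(\psi)]\ln\Big(\frac{1-\alpha_2}{\alpha_1}\Big)\Big\}\lim_{\Delta\to0}\frac{\Delta^2}{P}. \tag{2.83} \]
Now, expanding the logarithms in $P$ using a Taylor series expansion,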
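\[ P = \frac{\Delta'^2(\psi - \tfrac12)}{p_0(1-p_0)} + O(\Delta'^3). \tag{2.84} \]
From (2.84),
\[ \lim_{\Delta\to0}\frac{\Delta^2}{P} = \lim_{\Delta\to0}\frac{1}{\dfrac{f^2(x_0;\theta,\beta)(\psi - \tfrac12)}{F(x_0;\theta,\beta)[1 - F(x_0;\theta,\beta)]} + O(\Delta)} = \frac{F(x_0;\theta,\beta)[1 - F(x_0;\theta,\beta)]}{(\psi - \tfrac12)\,f^2(x_0;\theta,\beta)}. \tag{2.85} \]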
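With the use of L'Hospital's rule as done in the proof of Theorem 2.3, for $\psi = \tfrac12$,
\[ \lim_{\Delta\to0} \Delta^2 E_\psi[n(\Delta)] = -\ln\Big[\frac{1-\alpha_2}{\alpha_1}\Big]\ln\Big[\frac{\alpha_2}{1-\alpha_1}\Big]\frac{F(x_0;\theta,\beta)[1 - F(x_0;\theta,\beta)]}{f^2(x_0;\theta,\beta)}. \tag{2.86} \]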
The asymptotic relative efficiency (ARE) of the Generalized Sequential Likelihood Ratio Test (GSLRT) with respect to Wald's Binomial Sequential Test (WBST) is defined as the asymptotic ratio of their expected stopping times as $\Delta \to 0$, for all $\psi \in [0,1]$:
\[ ARE_{G/W} = \frac{F(x_0;\theta,\beta)[1 - F(x_0;\theta,\beta)]}{\gamma^2 f^2(x_0;\theta,\beta)}, \tag{2.87} \]
using (2.49), (2.83) and (2.85). We now take a closer look at (2.87) for various distributions.
(A) Normal distribution with
Let
-e = (~,cr2)
represent the distribution function of the
~(x)
~(x)
standard N(O,l) distribution and
_00
< x <
its density function,
00.
=
-1
iD
and denote by T the (l-aO)lOO percenaO
tile point of the standard normal distribution function.
g(~,i)
Letting the function
La )
' .
Y2 -_ v~2(1+~2 ~2
o
(2,75).
For
2
g(~,cr,i) =~
equal
. llCl
" t y,
SImp
F(xO;~'cr
2
f(xO;~'cr
~(x)
and
+
in
~(x)
)
2
)
denote the standard normal distribution function
and probability density function, respectively,
1
o
-
_00
< x <
o
~(T
a
) e
O
~
-~
H_O_
o
and
= ~O
Focus on the ratios
and
If
1e t x
o
+
L
)
aO
o
00,
one can re-
Under local alternatives, i.e. as
1 -
~ ~
~ ~ ~O'
0,
~(T
a )
O
R (x ) = ---,--,.--i- cr and R (x )
l O
¢(T )
2 0
ao
If $\alpha_0$ is small so that $\tau_{\alpha_0}$ is large, (2.87) can be approximated using the two following results (Kendall and Stuart, Vol. 1, p. 137):
\[ \frac{1 - \Phi(x)}{\phi(x)} \approx \frac{1}{x} \text{ for large } x > 0, \qquad \Phi(x) \approx 1 \text{ for large } x > 0. \]
Thus, for large $\tau_{\alpha_0}$,
\[ ARE_{G/W} \approx \frac{1}{(1 + \tfrac12\tau_{\alpha_0}^2)\,\tau_{\alpha_0}\,\phi(\tau_{\alpha_0})}. \tag{2.88} \]
The minimum value of (2.88) is attained at $\alpha_0 = \tfrac12$ because the expression (2.87) is symmetric in $F(x_0;\theta,\beta)$. Table 2.1 has values of (2.88) for various values of $\alpha_0$.
(B) Double exponential distribution with $\theta = (k,c)$

Let $F(x;\theta)$ denote the double exponential distribution function with probability density function $f(x;\theta) = k e^{-2k|x-c|}$, $k > 0$, $-\infty < x < \infty$. If the function $g(\theta,\ell)$ is the $(1-\alpha_0)100$ percentile point of the double exponential $(k,c)$ distribution: $g(k,c,\ell) = c - \ln(2\alpha_0)/(2k)$, then
\[ \gamma^2 = \frac{1}{8k^2}\big[\ln^2(2\alpha_0) + 2\ln(2\alpha_0) + 3\big]. \]
Again let $c - \ln(2\alpha_0)/(2k) \to x_0$ so that we are examining local alternatives, i.e. as $\Delta \to 0$ in (2.75). Then, evaluating (2.87),
\[ ARE_{G/W} = \frac{2[(1-\alpha_0)/\alpha_0]}{\ln^2(2\alpha_0) + 2\ln(2\alpha_0) + 3}. \tag{2.89} \]
The minimum value of (2.89) is attained when $\alpha_0 = \tfrac12$. See Table 2.1 for values of (2.89) for various values of $\alpha_0$.
(C) Cauchy distribution with $\theta = (a,b)$

Let $F(x;\theta)$ denote the Cauchy distribution function with probability density function
\[ f(x;\theta) = \frac{b}{\pi[(x-a)^2 + b^2]}, \quad -\infty < x < \infty. \]
If the function $g(\theta,\ell)$ is the $(1-\alpha_0)100$ percentile point of the Cauchy $(a,b)$ distribution: $g(a,b,\ell) = a + b\tan(\tfrac{\pi}{2} - \pi\alpha_0)$, then $\gamma^2 = 2b^2\sec^2(\tfrac{\pi}{2} - \pi\alpha_0)$. Again let $a + b\tan(\tfrac{\pi}{2} - \pi\alpha_0) \to x_0$ so that we are examining local alternatives, i.e., as $\Delta \to 0$ in (2.75). Then evaluating (2.87),
\[ ARE_{G/W} = \frac{(1-\alpha_0)\alpha_0}{2b^2\sec^2(\tfrac{\pi}{2} - \pi\alpha_0)\,\dfrac{b^2}{\pi^2\{[a + b\tan(\frac{\pi}{2} - \pi\alpha_0) - a]^2 + b^2\}^2}} = \frac{\pi^2}{2}\,\alpha_0(1-\alpha_0)\sec^2(\tfrac{\pi}{2} - \pi\alpha_0). \tag{2.90} \]
The minimum value of (2.90) is attained when $\alpha_0 = \tfrac12$. See Table 2.1 for values of (2.90) for various values of $\alpha_0$.
From Table 2.1, note that the only time $ARE_{G/W}$ is less than one is for the double exponential distribution and $\alpha_0$ close to 0.5. In practice $\alpha_0$ is closer to 0.001 (see Chapter 4 where $\ell = 9$) and thus the GSLRT is much more efficient than the WBST in terms of expected savings in sample size.
TABLE 2.1
Asymptotic Relative Efficiency of Generalized Sequential Likelihood Ratio Test with Respect to Wald's Binomial Sequential Test for (2.75), where $g(\theta,\ell)$ = $(1-\alpha_0)100$ Percentile of the Given Distributions

  alpha_0    N(mu, sigma^2)      Double Exponential (k,c)   Cauchy (a,b)
             Values of (2.88)    Values of (2.89)           Values of (2.90)
  -------    ----------------    ------------------------   ----------------
  0.5        pi/2 = 1.571        2/3 = 0.667                pi^2/8 = 1.234
  0.1        2.443               7.590                      4.651
  0.05       2.503               10.279                     9.579
  0.01       4.350               18.893                     49.519
  0.001      16.485              68.443                     500.006
  0.0001     83.009              341.798                    5134.548
  0 (1)      infinity            infinity                   infinity
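
The entries of Table 2.1 can be spot-checked with a short script. The sketch below codes the normal approximation (2.88), the double exponential expression (2.89) and the simplified Cauchy expression (2.90) as they are read here from a damaged copy of the text, so the formulas themselves should be treated as assumptions; the function names are hypothetical. The standard-library `statistics.NormalDist` class supplies the normal quantile and density.

```python
import math
from statistics import NormalDist

def are_normal_approx(alpha0):
    """Large-tau approximation (2.88): 1 / [(1 + tau^2/2) * tau * phi(tau)]."""
    std = NormalDist()
    tau = std.inv_cdf(1.0 - alpha0)   # (1 - alpha0)100 percentile of N(0,1)
    return 1.0 / ((1.0 + 0.5 * tau * tau) * tau * std.pdf(tau))

def are_double_exponential(alpha0):
    """Expression (2.89) for the double exponential (k, c) distribution."""
    u = math.log(2.0 * alpha0)
    return 2.0 * (1.0 - alpha0) / alpha0 / (u * u + 2.0 * u + 3.0)

def are_cauchy(alpha0):
    """Simplified form of (2.90): (pi^2 / 2) * alpha0 * (1 - alpha0) * sec^2."""
    sec2 = 1.0 / math.cos(math.pi / 2.0 - math.pi * alpha0) ** 2
    return 0.5 * math.pi ** 2 * alpha0 * (1.0 - alpha0) * sec2
```

Note that the normal approximation is only meant for small $\alpha_0$; at $\alpha_0 = 0.5$ the quantile $\tau_{\alpha_0}$ is zero and the function is undefined, which is why the table reports the exact value $\pi/2$ there.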
Note that the ARE of the GSLRT with respect to the WBST for non-normal heavy-tailed distributions is much better. There is a very important practical consideration which further supports the benefits of the GSLRT over a test such as the WBST. In actual practice in compliance testing, companies tend to be under the standard value of $x_0$ in their pollutant emission. Thus, the probability of detecting non-compliance is small and in order to establish with the WBST that such is the case, testing would have to go on for a fairly large number of times when non-compliance is the situation. The regulatory agencies such as the Environmental Protection Agency (EPA) would tend to not allow such experimentation to continue under too many non-compliance incidents. Under the GSLRT, one is recomputing $g(\hat\theta_n)$ at every stage of sampling and thus does not require too many non-compliance incidents to be able to reach such a decision. Our test is more valuable from this practical standpoint.
2.7  Two-Sample Problem - Comparison of Distributions

2.7.1  Introduction

Let $\{x_i, 1 \le i \le n\}$ be a sequence of independent random variables coming from either one of the following absolutely continuous distributions $F(x;\theta_1)$ and $F(x;\theta_2)$, with probability density functions $f(x;\theta_1)$ and $f(x;\theta_2)$ respectively, where the form of $F$ is known and the $m \times 1$ vector of unknown parameters $\theta_j$, $j = 1,2$, is such that
\[ \theta_j = \begin{cases} \theta - \delta, & j = 1, \text{ if the observation is from the first sample} \\ \theta + \delta, & j = 2, \text{ if the observation is from the second sample.} \end{cases} \]
View $\theta$ as the common unknown nuisance parameter and $\delta$ as the parameter of interest. In the context of comparison of distributions, assume one is once again interested in a given function of the displacement parameter $\delta$, say $g(\delta)$. Without loss of generality, we test the hypotheses
\[ H_0: g(\delta) = 0 \quad\text{vs.}\quad H_1: g(\delta) = \Delta, \tag{2.91} \]
where $\Delta > 0$ is known. Note that the general formulation above includes the special case of the classical location problem, i.e., testing whether the two samples are really from the same population or are shifted by $2\Delta$ in the location parameter; let $\delta' = (\delta_1, \delta_2, \ldots, \delta_m)$, where $\delta_j = 0$ for $j \ne i$, $1 \le i, j \le m$, $\delta_i$ stands for the location parameter, and $g(\delta) = \delta_i$. An example may help to clarify the above statement.
Example 2.1

Let $f(x;\theta)$ be the Weibull density function with shape and scale parameters $c$ and $b$. Then $\theta' = (c,b)$, $\delta' = (\delta_1,\delta_2)$, $\theta_1' = (c_1,b_1) = (c-\delta_1, b-\delta_2) = \theta' - \delta'$ and $\theta_2' = (c_2,b_2) = (c+\delta_1, b+\delta_2) = \theta' + \delta'$. The usual function of interest $g(\delta)$ assumes $\delta_1 = 0$ and is of the form $g(\delta) = \delta_2$; i.e. one is interested in examining differences in the scale parameter.
In the present section we show how the problem formulated above can be handled using the theory developed in this chapter and thus benefit from the optimal properties of the test procedure as discussed in Sections 2.4 and 2.5. Note that we do not address at all the problem of assigning observations to each sample, which is a design problem. At stage $n$ of sampling we have $n_1$ observations from the first sample and $n_2\,(= n - n_1)$ observations from the second sample. We do however require the allocation to be such that neither sample is consistently getting more observations than the other as $n \to \infty$ (an example would be a randomized allocation scheme so that $n_1/n \to \tfrac12$ and $n_2/n \to \tfrac12$ as $n \to \infty$).

The test statistic is developed in Section 2.7.2, where it is also shown that the two sample problem is just a special case of the GSLRT general setup. Section 2.7.3 briefly presents some calculations for two well-known distributions.
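2.7.2  Preliminary Notions and the Proposed Test

Let the likelihood function of the combined sample at stage $n$ be given by
\[ L_n(X_n;\theta,\delta) = \prod_{i=1}^n f(x_i;\theta + d\delta), \tag{2.92} \]
where $d = -1$ if the observation is from sample 1 and $d = 1$ if the observation is from sample 2. Proceed as suggested by Bartlett (1946) and Cox (1963) and calculate the maximum likelihood estimates of $\theta$ and $\delta$ at stage $n$, $\hat\theta_n$ and $\hat\delta_n$ respectively. Let the $2m \times 1$ vector
\[ \lambda_n = \begin{bmatrix} \partial \ln L_n(X_n;\theta,\delta)/\partial\theta \\ \partial \ln L_n(X_n;\theta,\delta)/\partial\delta \end{bmatrix}; \tag{2.93} \]
then $\hat\theta_n$ and $\hat\delta_n$ satisfy
\[ \lambda_n \big|_{\hat\theta_n,\hat\delta_n} = 0 \quad (2m \times 1). \tag{2.94} \]
Let the Fisher information matrix of the distribution $F(x;\theta + d\delta)$ be denoted by
\[ \mathbf{I}(\theta,\delta) = \begin{bmatrix} \mathbf{I}_{\theta\theta} & \mathbf{I}_{\theta\delta} \\ \mathbf{I}_{\delta\theta} & \mathbf{I}_{\delta\delta} \end{bmatrix} = E\begin{bmatrix} -\dfrac{\partial^2 \ln f(x;\theta+d\delta)}{\partial\theta\,\partial\theta'} & -\dfrac{\partial^2 \ln f(x;\theta+d\delta)}{\partial\theta\,\partial\delta'} \\[2ex] -\dfrac{\partial^2 \ln f(x;\theta+d\delta)}{\partial\delta\,\partial\theta'} & -\dfrac{\partial^2 \ln f(x;\theta+d\delta)}{\partial\delta\,\partial\delta'} \end{bmatrix} \quad (2m \times 2m), \tag{2.95} \]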
and its inverse by
I-l(e,o)
2mx2m- -
(2.96 )
Parallel results to Lemmas 2.2.1, 2.2.2 can be readily established
and thus obtain the following test statistic:
"012
2
where
o
~
nd
6n(0 -o)'g(o)
new - 2)6
=
n -2 - - + ------~--y
y2
(2.97)
and ware defined as in (2.11)-(2.13),6 is replacing e, and
where
(2.98)
Since the values of the parameters
e and
2
0 are unknown. ~ (0) and Y are
estimated by
o '"
g(o ) = ag
ao '"
- -n
- -n
0
(2.99)
and
'" "
o '"
"'2
'"
100 ] _I(e
_n .0_n )
Yn = g(o ) , [leS8
-n
-n
- -n
I68]
['fOO
. ~ (~n) •
0
'"
(2.100)
respectively. where
I
-n
,6 )
-lee
_n_n
"ee
I
I"eo
~n
-n
= "0 e "00
I
I
_n
-n
1
= - n
a2 In L (X ; 8.0)
n -n - ae ae'
a2 In L (X ;e,o)
n _n - ao ae'
a 2 In L (X ; e .0)
n -n - ae ao'
a 2 In L (X ; e, 0)
n -n - ao ao'
-1
e ,0 .
- -n -n
(2.101)
The test procedure is analogous to Section 2.23 and the results
of Section 2.3 - 2.5 hold for 2nd replacing 2~. We thus see that the
two sample problem can be handled by the GSLRT and thus benefit from its
demonstrated optimal properties.
We do note that the data may not be
available sequentially and thus the possibility of early termination of
experimentation would not be available.
2.7.3  Examples of Applications

The following two examples present calculations of (2.92) for two commonly used distributions.

Example 2.2

In the case of exponential distributions,
\[ f(x;\theta + d\delta) = \frac{1}{b}\exp\Big(\frac{-x}{b}\Big), \]
where $b = \theta - \delta$ for first sample observations and $b = \theta + \delta$ for second sample observations, $b > 0$. In this case, at the $n$th stage of sampling,
\[ L_n(X_n;\theta,\delta) = \prod_{k=1}^{n_1}\frac{1}{\theta-\delta}\exp\Big(\frac{-x_k}{\theta-\delta}\Big)\prod_{m=1}^{n_2}\frac{1}{\theta+\delta}\exp\Big(\frac{-x_m}{\theta+\delta}\Big). \]
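
For the exponential case of Example 2.2 the maximum likelihood estimates have a closed form, because $\theta - \delta$ and $\theta + \delta$ are simply the two sample means at the maximum. The sketch below codes the log-likelihood and these estimates; the function names and the sample values used in any call are hypothetical.

```python
import math

def log_likelihood(x1, x2, theta, delta):
    """Log of the two-sample exponential likelihood of Example 2.2."""
    b1, b2 = theta - delta, theta + delta   # scale of sample 1 and sample 2
    if b1 <= 0 or b2 <= 0:
        return -math.inf                    # outside the parameter space
    return (-len(x1) * math.log(b1) - sum(x1) / b1
            - len(x2) * math.log(b2) - sum(x2) / b2)

def mle(x1, x2):
    """Closed-form MLEs: theta - delta = mean(x1), theta + delta = mean(x2)."""
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    return (m1 + m2) / 2.0, (m2 - m1) / 2.0
```

Evaluating `log_likelihood` at the values returned by `mle` and at nearby points gives a quick check that the closed form does maximize the likelihood.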
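Example 2.3

In the case of Weibull distributions as in Example 2.1,
\[ f(x;\theta + d\delta) = \frac{c}{b}\,x^{c-1}\exp\Big(\frac{-x^c}{b}\Big), \]
where $(c,b)$ is defined in Example 2.1. The likelihood at stage $n$ of sampling is given by
\[ L_n(X_n;\theta,\delta) = \prod_{k=1}^{n_1}\frac{c_1}{b_1}\,x_k^{c_1-1}\exp\Big(\frac{-x_k^{c_1}}{b_1}\Big)\prod_{m=1}^{n_2}\frac{c_2}{b_2}\,x_m^{c_2-1}\exp\Big(\frac{-x_m^{c_2}}{b_2}\Big), \]
with $(c_1,b_1) = (c-\delta_1, b-\delta_2)$ and $(c_2,b_2) = (c+\delta_1, b+\delta_2)$.

2.8  Recapitulation of Results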
In this chapter we have developed the necessary theory to justify the results of the Sequential Likelihood Ratio Test (SLRT) developed by Cox (1963) as well as for the proposed Generalized Sequential Likelihood Ratio Test (GSLRT). We have examined the properties of the GSLRT as well as computed its asymptotic relative efficiency with respect to Wald's Binomial Sequential Test (WBST) for certain common distributions.

The emphasis in this chapter has been on the asymptotic case when $\Delta$ is small. The question arises as to how small $\Delta$ needs to be for the results in this chapter to apply. Simulations for finite $\Delta$ of the OC and ASN functions are presented in Chapter 4. The theory that has been developed for the asymptotic case cannot be used for a fixed $\Delta$ because then the test statistic does not possess the properties of the Sequential Probability Ratio Test and cannot be approximated by a standard Wiener process. We have therefore restricted ourselves to local alternatives, i.e. $\Delta \to 0$, in order to get a good enough approximation to the OC and ASN functions of interest.

In the next chapter we develop a parallel theory for the case when the observations are time dependent and thus not identically distributed.
CHAPTER 3
A PROGRESSIVELY CENSORED SEQUENTIAL
LIKELIHOOD RATIO TEST FOR GENERAL FUNCTIONALS

3.1  Introduction

Let $\{x_i, i \ge 1\}$, $F(x;\theta)$ and $g(\theta)$ be as defined in Section 2.1 and let (2.1) be the hypotheses of interest. In a life testing problem, the $x_i$ are nonnegative so that $F(0;\theta) = 0$, $\forall \theta \in \Theta \subseteq R^m$. Let $n$ be the fixed predetermined number of items to be tested. What is actually observed are the order statistics $Z_{n,1} < \cdots < Z_{n,n}$ corresponding to the $x_i$'s. By virtue of the assumed continuity of $F(x;\theta)$, ties among the $x_i$ and hence among the $Z_{n,i}$ can be neglected in probability.

As an illustration of this problem, consider the survival distribution of $n$ persons accidentally subjected to a chemical believed to be carcinogenic and assume that their survival follows a Weibull distribution. Thus the vector of unknown parameters $\theta$ consists of a scale parameter $\theta_1$ and a shape parameter $\theta_2$. Let the function $g(\theta)$ be the hazard function at point $x = t_0$:
\[ \lambda(t_0;\theta) = \frac{f(t_0;\theta)}{1 - F(t_0;\theta)}. \]
In a non-sequential approach, one would wait until all $n$ subjects failed to then test the null hypothesis, which could conceivably be a very lengthy wait. In a censored plan, for some fixed $r$ $(1 \le r \le n)$, the experiment is terminated at point $Z_{n,r}$ and the test for $H_0$ is then based on the set $(Z_{n,1}, \ldots, Z_{n,r})$. In a truncated plan, the experiment is continued for a predetermined length of time $T$ $(0 < T < \infty)$ and if $r^*$ $(0 \le r^* \le n)$ of the observations lie in the interval $[0,T]$, the test is based on $(Z_{n,1}, \ldots, Z_{n,r^*})$ if $r^* \ge 1$, and one works with the probability $[1 - F(T;\theta)]^n$ if $r^* = 0$. In a progressively censored scheme (PCS), the possibility of terminating the experiment prior to $Z_{n,r}$ is allowed through monitoring from the beginning. Thus, if for some $k^*\,(\le r)$, $(Z_{n,1}, \ldots, Z_{n,k^*})$ advocates a clear statistical decision in favor of either of the hypotheses, the experiment is stopped following $Z_{n,k^*}$. Thus both the "stopping number" $k^*$ and the "stopping time" $Z_{n,k^*}$ are stochastic variables. The values $n$ and $r$ are known fixed quantities.

Given the possible benefits of following a PCS, we examine the problem in such a setup. In the following sections of this chapter a generalized sequential likelihood ratio test is proposed along the lines presented in Sen (1976). Along with the preliminary notions, the test is proposed in Section 3.2. In Section 3.3 the termination probability of our sequential procedure is determined. Sections 3.4 and 3.5 examine the operating characteristic (OC) and average sample number (ASN) functions of the test.
3.2 Preliminary Notions and the Proposed Test

3.2.1 Notation and Assumed Regularity Conditions

The likelihood function at stage $k$ ($1 \le k \le n$), denoted by $L_\theta(Z^{(k)}, n)$, equals the joint probability density function of $Z^{(k)} = (Z_{n,1}, \dots, Z_{n,k})$:
$$L_\theta(Z^{(k)}, n) = \frac{n!}{(n-k)!} \prod_{i=1}^{k} f(Z_{n,i};\theta)\, [1 - F(Z_{n,k};\theta)]^{n-k}, \qquad 0 < Z_{n,1} < Z_{n,2} < \dots < Z_{n,k} < \infty. \tag{3.1}$$
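As a concrete reading of (3.1), the log-likelihood of the first $k$ order statistics can be evaluated directly. The sketch below assumes an exponential density $f(x;\theta) = \theta e^{-\theta x}$ purely for illustration; the function name and the data values are hypothetical.

```python
import math

def log_likelihood_stage_k(z, k, n, log_f, survival):
    """ln L of the first k order statistics z[0..k-1] out of n, following (3.1)."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(n - k + 1)   # ln(n!/(n-k)!)
    log_dens = sum(log_f(v) for v in z[:k])                    # sum of ln f(Z_{n,i})
    # (n-k) ln[1 - F(Z_{n,k})]; the term vanishes when k = n
    log_tail = (n - k) * math.log(survival(z[k - 1])) if k < n else 0.0
    return log_coeff + log_dens + log_tail

theta = 0.5                                   # illustrative rate parameter
log_f = lambda x: math.log(theta) - theta * x
survival = lambda x: math.exp(-theta * x)
z = [0.3, 0.9, 1.4, 2.2, 5.0]                 # made-up ordered failure times
print(log_likelihood_stage_k(z, 3, 5, log_f, survival))
```

The factor $[1 - F(Z_{n,k};\theta)]^{n-k}$ accounts for the $n-k$ items still surviving at the $k$th failure.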
Let $B_{n,k}$ denote the $\sigma$-field generated by $Z^{(k)}$, $k = 1, \dots, n$, where $B_{n,0}$ stands for the trivial $\sigma$-field. Note that $\forall n \ge 1$, $B_{n,k}$ is nondecreasing in $k$ ($0 \le k \le n$). Let $\lambda_{n,k}$ be the $m \times 1$ vector denoting the first derivative of the log-likelihood of $Z^{(k)}$ with respect to $\theta$:
$$\lambda_{n,k} = \begin{cases} \dfrac{\partial}{\partial\theta} \ln L_\theta(Z^{(k)}, n), & k = 1, \dots, n \\ 0, & k = 0. \end{cases} \tag{3.2}$$
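Now, given $B_{n,k-1}$, the conditional probability density function of $Z_{n,k}$, defined for $Z_{n,k} > Z_{n,k-1}$, is given by
$$q_\theta(Z_{n,k} \mid B_{n,k-1}) = \frac{(n-k+1)\, f(Z_{n,k};\theta)\, [1 - F(Z_{n,k};\theta)]^{n-k}}{[1 - F(Z_{n,k-1};\theta)]^{n-k+1}}, \tag{3.3}$$
so that, $\forall\, 1 \le k \le n$, (3.1) can be rewritten as
$$L_\theta(Z^{(k)}, n) = \prod_{i=1}^{k} q_\theta(Z_{n,i} \mid B_{n,i-1}). \tag{3.4}$$
Defining $\lambda^*_{n,i} = \frac{\partial}{\partial\theta} \ln q_\theta(Z_{n,i} \mid B_{n,i-1})$, (3.2) can be written as
$$\lambda_{n,k} = \sum_{i=1}^{k} \lambda^*_{n,i}, \qquad k = 1, \dots, n. \tag{3.5}$$
For every $\alpha \in (0,1)$ and $\theta \in \Theta$, define the positive definite matrix
$$J_\alpha(\theta) = \int_0^{F^{-1}(\alpha;\theta)} \frac{\partial}{\partial\theta} \ln f(x;\theta)\, \frac{\partial}{\partial\theta'} \ln f(x;\theta)\, dF(x;\theta) + \frac{1}{1-\alpha} \left[ \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial}{\partial\theta} \ln f(x;\theta)\, dF(x;\theta) \right] \left[ \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial}{\partial\theta'} \ln f(x;\theta)\, dF(x;\theta) \right], \tag{3.6}$$
and $\forall \theta \in \Theta$, define $\lim_{\alpha \to 1} J_\alpha(\theta)$ to be the positive definite matrix
$$J_1(\theta) = J(\theta) = \int_0^{\infty} \frac{\partial}{\partial\theta} \ln f(x;\theta)\, \frac{\partial}{\partial\theta'} \ln f(x;\theta)\, dF(x;\theta). \tag{3.7}$$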
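To see what (3.6) yields in a simple case, consider the scalar exponential density $f(x;\theta) = \theta e^{-\theta x}$, for which a direct calculation gives $J_\alpha(\theta) = \alpha/\theta^2$. The sketch below checks this by numerical integration. The exponential model, the Simpson-rule helper and the truncation point of the upper tail are assumptions made for illustration only.

```python
import math

def simpson(func, a, b, m=2000):
    """Composite Simpson rule with m (even) subintervals."""
    h = (b - a) / m
    total = func(a) + func(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def j_alpha_exponential(alpha, theta):
    """Evaluate (3.6) numerically for f(x; theta) = theta * exp(-theta * x)."""
    score = lambda x: 1.0 / theta - x              # d/dtheta ln f(x; theta)
    dens = lambda x: theta * math.exp(-theta * x)  # f(x; theta)
    q = -math.log(1.0 - alpha) / theta             # F^{-1}(alpha; theta)
    upper = q + 50.0 / theta                       # tail beyond this point is negligible
    first = simpson(lambda x: score(x) ** 2 * dens(x), 0.0, q)
    tail = simpson(lambda x: score(x) * dens(x), q, upper)
    return first + tail ** 2 / (1.0 - alpha)

print(j_alpha_exponential(0.6, 2.0), 0.6 / 2.0 ** 2)
```

The second term of (3.6) is what restores the information carried by the censored observations: without it the first integral alone would fall short of $\alpha/\theta^2$.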
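Now assume that the following regularity conditions hold:

(i) The censoring number $r$ can be related to $n$ by letting $r/n \to \alpha$ as $n \to \infty$, where $\alpha \in (0,1]$.

(ii) Assume the space $\Theta$ is a compact subset of the m-dimensional Euclidean space $R^m$ with a non-empty interior, and that the true value $\theta$ lies in the interior of $\Theta$.

(iii) Assume $f(x;\theta) > 0$, $\forall x \in R^+$, and is a continuously, twice differentiable function of $\theta$, $\forall \theta \in \Theta$.

(iv) Assume $\forall \theta \in \Theta$ that
$$\forall i \le m, \quad \left| \frac{\partial}{\partial\theta_i} \ln f(x;\theta) \right| \le u_1(x), \quad \text{where } \int_{-\infty}^{\infty} u_1(x)\, dF(x;\theta) < \infty, \tag{3.8}$$
and
$$\forall i,j \le m, \quad \left| \frac{\partial^2}{\partial\theta_i\, \partial\theta_j} \ln f(x;\theta) \right| \le u_2(x), \quad \text{where } \int_{-\infty}^{\infty} u_2(x)\, dF(x;\theta) < \infty.$$

3.2.2 Preliminary Results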
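Lemma 3.2.1

$\forall n \ge 1$, $\{\lambda_{n,k}, B_{n,k}, 0 \le k \le n\}$ is a martingale vector.

Proof of Lemma 3.2.1

Assumptions (iii) and (iv) ensure the integrability of $\lambda^*_{n,k}$ and the differentiability under the integral sign, so that $\forall\, 1 \le k \le n$,
$$E\{\lambda^*_{n,k} \mid B_{n,k-1}\} = \int_{Z_{n,k-1}}^{\infty} \frac{\partial}{\partial\theta} \ln q_\theta(z \mid B_{n,k-1})\; q_\theta(z \mid B_{n,k-1})\, dz = \int_{Z_{n,k-1}}^{\infty} \frac{1}{q_\theta(z \mid B_{n,k-1})} \frac{\partial}{\partial\theta} q_\theta(z \mid B_{n,k-1})\; q_\theta(z \mid B_{n,k-1})\, dz$$
$$= \frac{\partial}{\partial\theta} \int_{Z_{n,k-1}}^{\infty} q_\theta(z \mid B_{n,k-1})\, dz = \frac{\partial}{\partial\theta}(1) = 0. \tag{3.9}$$
Now, since $B_{n,k-1} \subset B_{n,k}$ and $\lambda_{n,k}$ is $B_{n,k}$-measurable, and since from assumption (iv), $\| E[\lambda_{n,k}] \| < \infty$, it suffices to show that $\lambda_{n,k-1} = E(\lambda_{n,k} \mid B_{n,k-1})$:
$$E(\lambda_{n,k} \mid B_{n,k-1}) = E\left[ \sum_{i=1}^{k} \lambda^*_{n,i} \,\Big|\, B_{n,k-1} \right] = E\left[ \sum_{i=1}^{k-1} \lambda^*_{n,i} \,\Big|\, B_{n,k-1} \right] + E(\lambda^*_{n,k} \mid B_{n,k-1}) = \sum_{i=1}^{k-1} \lambda^*_{n,i} + E(\lambda^*_{n,k} \mid B_{n,k-1}),$$
and using (3.5) and (3.9), the desired result follows.
QED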
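A quick Monte Carlo check of the mean-zero property (3.9) is possible in the exponential case, where $\lambda^*_{n,i} = 1/\theta - (n-i+1)(Z_{n,i} - Z_{n,i-1})$ with $Z_{n,0} = 0$. The sketch below only verifies the unconditional consequence $E[\lambda^*_{n,i}] = 0$. The exponential model, the sample sizes and the seed are illustrative assumptions.

```python
import random

def score_increments(z, theta):
    """lambda*_{n,i}, i = 1..n, for the density theta * exp(-theta * x)."""
    n = len(z)
    prev = 0.0                         # Z_{n,0} = 0
    out = []
    for i, zi in enumerate(z, start=1):
        out.append(1.0 / theta - (n - i + 1) * (zi - prev))
        prev = zi
    return out

def mean_increments(n, theta, reps, seed=0):
    """Average each increment over `reps` simulated ordered samples."""
    rng = random.Random(seed)
    sums = [0.0] * n
    for _ in range(reps):
        z = sorted(rng.expovariate(theta) for _ in range(n))
        for i, inc in enumerate(score_increments(z, theta)):
            sums[i] += inc
    return [s / reps for s in sums]

print([round(m, 3) for m in mean_increments(10, 1.0, 4000)])
```

Each printed average should be close to zero, up to simulation noise of order $1/\sqrt{\text{reps}}$.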
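Lemma 3.2.2

Under regularity condition (i),
$$\frac{1}{n} J_{n,r}(\theta) \to J_\alpha(\theta), \quad \text{as } n \to \infty, \ \forall \theta \in \Theta, \tag{3.10}$$
where
$$J_{n,k}(\theta) = \begin{cases} E_\theta[\lambda_{n,k}\, \lambda'_{n,k}], & k = 1, 2, \dots, n \\ 0, & k = 0. \end{cases} \tag{3.11}$$

Proof of Lemma 3.2.2

Consider first the case when $r = n$, and thus $\alpha = 1$. Under regularity conditions (iii) and (iv) that allow us to interchange differentiation with integration, and because we are dealing with the complete set of observations,
$$J_{n,n}(\theta) = -E_\theta\left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(n)}, n) \right] = E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \sum_{i=1}^{n} \ln f(x_i;\theta) \right] = \sum_{i=1}^{n} E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x_i;\theta) \right]. \tag{3.12}$$
Under regularity conditions (iii) and (iv), $J(\theta)$ equals $\int -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta)\, dF(x;\theta)$ and thus (3.12) equals $n J(\theta)$, so that $\frac{1}{n} J_{n,n}(\theta) = \frac{1}{n}\, n J(\theta) = J_1(\theta)$ and the claim is true for $r = n$.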
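Now consider the case $r < n$, and thus $\alpha < 1$. Under regularity conditions (i) - (iv),
$$J_{n,r}(\theta) = -E_\theta\left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(r)}, n) \right] = E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln\Big\{ [n(n-1) \cdots (n-r+1)] \prod_{i=1}^{r} f(z_i;\theta)\, [1 - F(z_r;\theta)]^{n-r} \Big\} \right]$$
$$= \sum_{i=1}^{r} E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(z_i;\theta) \right] + (n-r)\, E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(z_r;\theta)] \right]. \tag{3.13}$$
Dividing the first summand in (3.13) by $n$,
$$\frac{1}{n} \sum_{i=1}^{r} E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(z_i;\theta) \right] = \frac{1}{n} E_\theta\left\{ E\left[ \sum_{i=1}^{r} -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(z_i;\theta) \,\Big|\, z_{r+1} \right] \right\}. \tag{3.14}$$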
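Now, the probability density function of $Z_{n,r+1}$ ($1 \le r \le n-2$) is
$$f_{Z_{r+1}}(z;\theta) = \frac{n!}{r!\, (n-r-1)!}\, [F(z;\theta)]^{r}\, [1 - F(z;\theta)]^{n-r-1}\, f(z;\theta), \qquad 0 < z < \infty. \tag{3.15}$$
Interchanging integral signs, (3.14) equals
$$\frac{1}{n} \int_0^{\infty} \left[ \frac{r \int_0^{z} -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta)\, dF(x;\theta)}{F(z;\theta)} \right] f_{Z_{r+1}}(z;\theta)\, dz. \tag{3.16}$$
Using (3.15), (3.16) can be rewritten as
$$\int_0^{\infty} \left[ \int_0^{z} -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta)\, dF(x;\theta) \right] \left\{ \frac{(n-1)!}{(r-1)!\, (n-r-1)!}\, [F(z;\theta)]^{r-1}\, [1 - F(z;\theta)]^{n-r-1}\, f(z;\theta) \right\} dz. \tag{3.17}$$
Recognizing that the quantity within $\{\ \}$ in (3.17) is the probability density function of the $r$th order statistic from a sample of size $n-1$, $Z_{n-1,r}$, (3.17) is recognized as $E_\theta[\zeta(Z_{n-1,r})]$ where $\zeta(x) = \int_0^{x} -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(y;\theta)\, dF(y;\theta)$. Since $\frac{r}{n} \to \alpha$, so does $\frac{r}{n-1}$ as $n \to \infty$, and $Z_{n-1,r} \to \xi_\alpha = F^{-1}(\alpha;\theta)$. By the moment convergence of sample quantiles (Sen (1959)),
$$E_\theta[\zeta(Z_{n-1,r})] \to \int_0^{F^{-1}(\alpha;\theta)} \frac{1}{f(x;\theta)}\, \frac{\partial f(x;\theta)}{\partial\theta}\, \frac{\partial f(x;\theta)}{\partial\theta'}\, dx - \int_0^{F^{-1}(\alpha;\theta)} \frac{\partial^2}{\partial\theta\, \partial\theta'} f(x;\theta)\, dx. \tag{3.18}$$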
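Now, taking the second summand in (3.13) and dividing by $n$ and utilizing the moment convergence of sample quantiles, as $n \to \infty$,
$$\frac{n-r}{n}\, E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(z_r;\theta)] \right] \to (1-\alpha)\, E_\theta[\phi(Z_{n,r})], \quad \text{where } \phi(x) = -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(x;\theta)]. \tag{3.19}$$
Note that (3.19) converges to $(1-\alpha)\, \phi(\xi_\alpha)$. Now rewrite (3.19) as follows:
$$(1-\alpha)\, \phi(\xi_\alpha) = -(1-\alpha)\, \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(x;\theta)] \Big|_{x = \xi_\alpha} = -(1-\alpha)\, \frac{\partial}{\partial\theta} \left[ \frac{\int_x^{\infty} \frac{\partial}{\partial\theta'} f(y;\theta)\, dy}{1 - F(x;\theta)} \right]_{x = \xi_\alpha}$$
$$= \frac{1-\alpha}{(1-\alpha)^2} \left[ \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial}{\partial\theta} f(y;\theta)\, dy \right] \left[ \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial}{\partial\theta'} f(y;\theta)\, dy \right] - \frac{1-\alpha}{1-\alpha} \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial^2}{\partial\theta\, \partial\theta'} f(y;\theta)\, dy. \tag{3.20}$$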
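Combining (3.18) and (3.20),
$$\frac{1}{n} J_{n,r}(\theta) \to \int_0^{F^{-1}(\alpha;\theta)} \frac{1}{f(x;\theta)}\, \frac{\partial f(x;\theta)}{\partial\theta}\, \frac{\partial f(x;\theta)}{\partial\theta'}\, dx + \frac{1}{1-\alpha} \left[ \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial}{\partial\theta} f(y;\theta)\, dy \right] \left[ \int_{F^{-1}(\alpha;\theta)}^{\infty} \frac{\partial}{\partial\theta'} f(y;\theta)\, dy \right] - \int_0^{\infty} \frac{\partial^2}{\partial\theta\, \partial\theta'} f(y;\theta)\, dy. \tag{3.21}$$
Recognizing that the last summand in (3.21) equals $0$, (3.21) equals (3.6) and the proof is complete for $r < n$.
QED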
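The limit in (3.10) can be checked by simulation in the exponential case, where the score of the first $r$ order statistics is $r/\theta - [\sum_{i \le r} Z_{n,i} + (n-r) Z_{n,r}]$ and the limit is $J_\alpha(\theta) = \alpha/\theta^2$. The following is a rough sketch under that assumed model; the function names, sample sizes and seed are illustrative choices.

```python
import random

def censored_score(z, r, theta):
    """Score of the first r order statistics for the density theta*exp(-theta*x):
    r/theta - [sum_{i<=r} Z_{n,i} + (n - r) Z_{n,r}]."""
    n = len(z)
    return r / theta - (sum(z[:r]) + (n - r) * z[r - 1])

def scaled_information(n, r, theta, reps, seed=0):
    """Monte Carlo estimate of (1/n) E[score^2], i.e. of (1/n) J_{n,r}(theta)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        z = sorted(rng.expovariate(theta) for _ in range(n))
        total += censored_score(z, r, theta) ** 2
    return total / (reps * n)

# With r/n = 0.6 and theta = 1 the estimate should be near alpha/theta^2 = 0.6.
print(scaled_information(n=50, r=30, theta=1.0, reps=20000))
```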
The maximum likelihood estimator of $\theta$ at stage $k$, $1 \le k \le n$, is denoted by $\hat\theta_k$ and is the $m \times 1$ vector satisfying
$$0 = \lambda_{n,k} \Big|_{\hat\theta_k}. \tag{3.22}$$
The following lemmas are claimed to hold under regularity conditions (i) - (iv).
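For the exponential density $f(x;\theta) = \theta e^{-\theta x}$, equation (3.22) has the closed-form solution $\hat\theta_k = k / [\sum_{i \le k} Z_{n,i} + (n-k) Z_{n,k}]$. The sketch below uses this assumed model and made-up data to confirm that the score vanishes at $\hat\theta_k$; in general (3.22) has to be solved numerically.

```python
def total_time_on_test(z, k):
    """sum_{i<=k} Z_{n,i} + (n - k) Z_{n,k} for ordered data z of length n."""
    n = len(z)
    return sum(z[:k]) + (n - k) * z[k - 1]

def exponential_score(z, k, theta):
    """lambda_{n,k} for the density theta * exp(-theta * x)."""
    return k / theta - total_time_on_test(z, k)

def exponential_mle(z, k):
    """Root of the stage-k score equation (3.22) in the exponential case."""
    return k / total_time_on_test(z, k)

z = [0.3, 0.9, 1.4, 2.2, 5.0]          # made-up ordered failure times, n = 5
theta_hat = exponential_mle(z, 3)
print(theta_hat, exponential_score(z, 3, theta_hat))
```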
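Lemma 3.2.3

Let
$$\delta_n = \sup_{-\infty < x < \infty} | F_n(x;\theta) - F(x;\theta) | \to 0, \quad \text{a.s., as } n \to \infty,$$
where $F(x;\theta)$, $-\infty < x < \infty$, is an absolutely continuous distribution function, and $F_n(x;\theta)$ denotes the sample cumulative distribution function for a sample of size $n$. Then for bounded, integrable and continuous functions $u(x;\theta)$, $-\infty < x < \infty$,
$$\delta^*_n = \sup_{-\infty < a < \infty} \left| \int_{-\infty}^{a} u(x;\theta)\, dF_n(x;\theta) - \int_{-\infty}^{a} u(x;\theta)\, dF(x;\theta) \right| \to 0, \quad \text{a.s., as } n \to \infty. \tag{3.23}$$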
Proof of Lemma 3.2.3

Let $y = F(x;\theta)$, $0 < y < 1$, $-\infty < x < \infty$ and $p = F(a;\theta)$. If $F_n(x;\theta) = F_n(F^{-1}(y;\theta);\theta) = F^*_n(y;\theta)$, then
$$\delta^*_n = \sup_{0 < p < 1} \left| \int_0^{p} u(F^{-1}(y;\theta);\theta)\, dF^*_n(y;\theta) - \int_0^{p} u(F^{-1}(y;\theta);\theta)\, dy \right| \le \max_{0 \le j \le m^*} \left| \int_0^{a_j} u(F^{-1}(y;\theta);\theta)\, dF^*_n(y;\theta) - \int_0^{a_j} u(F^{-1}(y;\theta);\theta)\, dy \right| + \epsilon, \tag{3.24}$$
where the interval $[0,1]$ is partitioned into $m^*$ intervals $[a_0, a_1], [a_1, a_2], \dots, [a_{m^*-1}, a_{m^*}]$ such that $\frac{1}{m^*} < \epsilon$ for arbitrary $\epsilon > 0$ (chosen to be small). Using the definition of $F_n(x;\theta)$, (3.24) can be rewritten as
$$\max_{0 \le j \le m^*} \left| \frac{1}{n} \sum_{i=1}^{n} u(x_i;\theta)\, I(x_i \le a_j) - \int_0^{F(a_j;\theta)} u(x;\theta)\, dF(x;\theta) \right| + \epsilon, \tag{3.25}$$
where $I(*) = 1$ if $*$ is true and $= 0$ if $*$ is false. Recognizing that $\sum_{i=1}^{n} u(x_i;\theta)\, I(x_i \le a_j)$ is the sum of $n$ iid random variables with expectation $\int_0^{F(a_j;\theta)} u(x;\theta)\, dF(x)$, by Khintchine's Strong Law of Large Numbers, the quantity in $|\ |$ in (3.25) converges to zero almost surely as $n \to \infty$, $\forall\, 0 \le j \le m^*$. Since $\epsilon$ can be made arbitrarily small, (3.25) converges a.s. to zero as $n \to \infty$.
QED
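The uniform convergence asserted in (3.23) can be observed numerically. The sketch below takes $F$ to be the uniform distribution on $(0,1)$ and $u(x) = \cos x$, so that $\int_0^a u\, dF = \sin a$; these choices, the grid over $a$ and the seed are illustrative assumptions.

```python
import math
import random

def sup_integral_gap(sample, u, primitive, grid=100):
    """Largest value over a grid of a in [0, 1] of
    |(1/n) sum_i u(x_i) 1(x_i <= a) - int_0^a u(x) dx|
    for a uniform(0,1) sample, where primitive(a) = int_0^a u(x) dx."""
    xs = sorted(sample)
    n = len(xs)
    worst, running, idx = 0.0, 0.0, 0
    for j in range(grid + 1):
        a = j / grid
        while idx < n and xs[idx] <= a:    # accumulate u over sample points <= a
            running += u(xs[idx])
            idx += 1
        worst = max(worst, abs(running / n - primitive(a)))
    return worst

rng = random.Random(1)
for n in (100, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, sup_integral_gap(sample, math.cos, math.sin))
```

The printed gap should shrink as $n$ grows, roughly at rate $1/\sqrt{n}$.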
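Lemma 3.2.4

If
$$\lim_{\delta \to 0} E_\theta\left\{ \sup_{\theta^*: \|\theta^* - \theta\| < \delta} \left\| \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(X;\theta) \Big|_{\theta^*} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(X;\theta) \Big|_{\theta} \right\| \right\} = 0, \tag{3.26}$$
$\forall x$, then as $n \to \infty$,
$$\frac{\partial}{\partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} = k (\hat\theta_k - \theta)' \left\{ -\frac{1}{k} \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} \right\} (1 + o(1)), \quad \text{a.s.} \tag{3.27}$$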
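Proof of Lemma 3.2.4

Expanding (3.22) about the true value $\theta$,
$$0 = \lambda'_{n,k} \Big|_{\hat\theta_k} = \lambda'_{n,k} \Big|_{\theta} + (\hat\theta_k - \theta)'\, \frac{\partial}{\partial\theta} \lambda'_{n,k} \Big|_{\theta^*}, \tag{3.28}$$
where $\theta^*$: $\|\theta^* - \theta\| \le \|\hat\theta_k - \theta\|$. Rewriting (3.28) with the use of (3.2),
$$\frac{\partial}{\partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} = k (\hat\theta_k - \theta)' \left\{ -\frac{1}{k} \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} + Q \right\}, \tag{3.29}$$
where
$$Q = \frac{1}{k} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta^*} \right]. \tag{3.30}$$
In order to prove (3.27) it suffices to show that $Q = o(1)$, a.s. Using (3.3) and (3.4), (3.30) can be rewritten as
$$Q = \frac{1}{k} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \sum_{i=1}^{k} \ln q_\theta(Z_{n,i} \mid B_{n,i-1}) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \sum_{i=1}^{k} \ln q_\theta(Z_{n,i} \mid B_{n,i-1}) \Big|_{\theta^*} \right]$$
$$= \frac{1}{k} \sum_{i=1}^{k} \left\{ \frac{\partial^2}{\partial\theta\, \partial\theta'} \Big[ \ln f(Z_{n,i};\theta) + (n-i) \ln[1 - F(Z_{n,i};\theta)] - (n-i+1) \ln[1 - F(Z_{n,i-1};\theta)] \Big] \Big|_{\theta} \right.$$
$$\left. -\ \frac{\partial^2}{\partial\theta\, \partial\theta'} \Big[ \ln f(Z_{n,i};\theta) + (n-i) \ln[1 - F(Z_{n,i};\theta)] - (n-i+1) \ln[1 - F(Z_{n,i-1};\theta)] \Big] \Big|_{\theta^*} \right\}. \tag{3.31}$$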
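If $R_i = (n-i) \ln[1 - F(Z_{n,i};\theta)] - (n-i+1) \ln[1 - F(Z_{n,i-1};\theta)]$, (3.31) can be rewritten as
$$Q = \frac{1}{k} \sum_{i=1}^{k} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \Big|_{\theta^*} \right] + \frac{1}{k} \sum_{i=1}^{k} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} R_i \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} R_i \Big|_{\theta^*} \right]. \tag{3.32}$$
Now $\sum_{i=1}^{k} R_i = (n-k) \ln[1 - F(Z_{n,k};\theta)]$ due to cancelling out of terms, so that (3.32) can be rewritten as $Q = Q^{(1)} + Q^{(2)}$, where
$$Q^{(1)} = \frac{1}{k} \sum_{i=1}^{k} \left\{ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \Big|_{\theta^*} \right\}, \tag{3.33}$$
and
$$Q^{(2)} = \frac{n-k}{k} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)] \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)] \Big|_{\theta^*} \right]. \tag{3.34}$$
Define the following function
$$u(x_i;\theta) = \frac{1}{F_n(a;\theta)} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \Big|_{\theta^*} \right],$$
where $a = F_n^{-1}(\frac{k}{n};\theta)$ and $F_n(x;\theta)$ is the empirical distribution function for a sample of size $n$. Take $k$ s.t. $[n\epsilon] \le k \le [n\alpha]$ for some arbitrary $\epsilon > 0$, where $[s]$ denotes the greatest integer less than or equal to $s$, so that $k$ is bounded away from the origin. Thus, (3.33) can be rewritten as
$$Q^{(1)} = \frac{1}{n} \sum_{i=1}^{n F_n(a;\theta)} u(x_i;\theta). \tag{3.35}$$
As $n \to \infty$, (3.35) converges to the following Stieltjes integral
$$\int_0^{a} u(x;\theta)\, dF_n(x;\theta). \tag{3.36}$$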
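By Lemma 3.2.3 and assuming that $F_n(x;\theta) \to F(x;\theta)$, a.s., as $n \to \infty$, (3.36) converges to $\int_0^{a} u(x;\theta)\, dF(x;\theta)$, a.s. Thus, for $\delta > 0$,
$$\frac{1}{F(a;\theta)} \int_0^{a} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta) \Big|_{\theta^*} \right] dF(x;\theta) \le \frac{1}{F(a;\theta)} \int_0^{a} \sup_{\theta^*: \|\theta^* - \theta\| < \delta} \left\| \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta) \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(x;\theta) \Big|_{\theta^*} \right\| dF(x;\theta). \tag{3.37}$$
Using (3.26) and the fact that as $n \to \infty$, $\theta^*$ gets closer to $\theta$ (since $\|\theta^* - \theta\| < \|\hat\theta_k - \theta\|$), (3.37) is $o(1)$, a.s.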
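All that remains to be shown is that $Q^{(2)} = o(1)$, a.s. Since $Z_{n,k}$ is the $k$th order statistic from a sample of size $n$, by the convergence of sample quantiles, $Z_{n,k} \to \xi_p$, a.s., where $F^{-1}(p;\theta) = \xi_p$ and if $\frac{k}{n} \to p$. Thus, (3.34) can be rewritten as
$$Q^{(2)} = \frac{1 - \frac{k}{n}}{\frac{k}{n}} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)] \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)] \Big|_{\theta^*} \right] \to \frac{1-p}{p} \left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(\xi_p;\theta)] \Big|_{\theta} - \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(\xi_p;\theta)] \Big|_{\theta^*} \right]. \tag{3.38}$$
From the regularity assumptions (i) - (iv), $\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(x;\theta)]$ is continuous around $\xi_p$ in a neighborhood $\delta$ of the true value $\theta$. Since $\theta^*$ will get into this neighborhood as $n \to \infty$, $\forall \delta > 0$, the difference in (3.38) will be negligible due to continuity and thus $Q^{(2)} = o(1)$, a.s.
QED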
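Lemma 3.2.5

Under regularity conditions (ii) - (iv),
$$-\frac{1}{k} \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} \xrightarrow{P} \frac{1}{p} J_p(\theta), \tag{3.39}$$
where $\frac{k}{n} \to p$ as $n \to \infty$.

Proof of Lemma 3.2.5

By definition, for $k = 1, 2, \dots, n$,
$$\frac{1}{k} J_{n,k}(\theta) = -\frac{1}{k} E_\theta\left[ \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \right], \tag{3.40}$$
under the assumed regularity conditions. Writing out the definition of $L_\theta(Z^{(k)}, n)$, (3.40) becomes
$$\frac{1}{k} \sum_{i=1}^{k} \left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \right] - \frac{n-k}{k}\, \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)]. \tag{3.41}$$
Now, given $Z_{n,k+1}$, the $Z_{n,1}, \dots, Z_{n,k}$ are iid random variables and thus so are the random variables $-\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta)$, $i = 1, 2, \dots, k$. Utilizing Khintchine's Strong Law of Large Numbers, the first summand in (3.41) converges almost surely to
$$g(Z_{n,k+1}) = E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(y;\theta) \,\Big|\, Z_{n,k+1} \right], \quad y > 0. \tag{3.42}$$
By the stochastic convergence of sample quantiles and the assumption that $\frac{k}{n} \to p$, $\frac{k+1}{n} \to p + o(1)$, as $n \to \infty$, and $g(Z_{n,k+1}) \xrightarrow{P} g(\xi_p)$, where $\xi_p$ is the $p$th quantile of the $F$ distribution. Using the moment convergence of sample quantiles, $g(\xi_p) = E[g(Z_{n,k+1})] + o(1)$, and thus (3.42) becomes
$$E\left\{ E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(y;\theta) \,\Big|\, Z_{n,k+1} \right] \right\} = E_\theta\left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(y;\theta) \right], \quad y > 0. \tag{3.43}$$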
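Once again utilizing the stochastic convergence of sample quantiles, the second summand in (3.41) converges in probability to
$$-\frac{n-k}{k}\, \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(\xi_p;\theta)] = E_\theta\left\{ -\frac{n-k}{k}\, \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)] \right\} + o(1), \tag{3.44}$$
utilizing the moment convergence of sample quantiles. From (3.43) and (3.44),
$$-\frac{1}{k} \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln L_\theta(Z^{(k)}, n) \Big|_{\theta} \xrightarrow{P} E_\theta\left\{ \frac{1}{k} \sum_{i=1}^{k} \left[ -\frac{\partial^2}{\partial\theta\, \partial\theta'} \ln f(Z_{n,i};\theta) \right] \right\} - E_\theta\left\{ \frac{n-k}{k}\, \frac{\partial^2}{\partial\theta\, \partial\theta'} \ln[1 - F(Z_{n,k};\theta)] \right\}. \tag{3.45}$$
Under the assumed regularity conditions and using (3.11), (3.45) equals $\frac{1}{k} J_{n,k}(\theta)$. By Lemma 3.2.2, $\frac{1}{n} J_{n,k}(\theta) \to J_p(\theta)$ as $n \to \infty$, $\forall \theta \in \Theta$, when $\frac{k}{n} \to p$. Thus $\frac{1}{k} J_{n,k}(\theta) \to \frac{1}{p} J_p(\theta)$ as $n \to \infty$, $\forall \theta \in \Theta$, when $\frac{k}{n} \to p$, and thus from (3.45), (3.39) is true.
QED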
3.2.3 Development of the Test Statistic

As suggested by Cox (1963), at the $k$th stage of sampling, construct the statistic
$$Z^*_k = k \left[ g(\hat\theta_k) - \tfrac{1}{2}(g_0 + g_1) \right] \tag{3.46}$$
as in Chapter 2. Letting $g(\theta) = g_0 + \psi\Delta$, $\psi \in [0,1]$, (3.46) can be rewritten as
$$Z^*_k = k [g(\hat\theta_k) - g(\theta)] + k (\psi - \tfrac{1}{2}) \Delta. \tag{3.47}$$
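By the Mean Value Theorem,
$$k [g(\hat\theta_k) - g(\theta)] = k (\hat\theta_k - \theta)' \left[ \frac{\partial g(\theta)}{\partial\theta} \right]_{\theta^*}, \tag{3.48}$$
where $\theta^*$ is such that $\|\theta^* - \theta\| \le \|\hat\theta_k - \theta\|$. Assuming $\frac{\partial g(\theta)}{\partial\theta}$ to be continuous in some neighborhood of the true value $\theta$ and the yet to be shown weak convergence of $\hat\theta_k$ to $\theta$, for large $k$, $\hat\theta_k$ and thus $\theta^*$ will be contained in such a neighborhood of $\theta$ with high probability. By continuity, $\left| \frac{\partial g(\theta)}{\partial\theta} \big|_{\theta^*} - \frac{\partial g(\theta)}{\partial\theta} \big|_{\theta} \right|$ is negligible and thus $\frac{\partial g(\theta)}{\partial\theta} \big|_{\theta^*}$ can be considered to be the value at the true parameter $\theta$, say $\dot g(\theta) = \frac{\partial g(\theta)}{\partial\theta}$. Equation (3.47) can thus be rewritten as
$$Z^*_k = k (\hat\theta_k - \theta)'\, \dot g(\theta) + k (\psi - \tfrac{1}{2}) \Delta \tag{3.49}$$
for large $k$. We focus on the process defined by (3.49) for constructing our test statistic.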
As in Chapter 2, the calculation of boundary crossing probabilities for determining the average sample number function and the operating characteristic function for (3.49) is quite complicated for fixed values of $\Delta$. We thus would like to approximate the behavior of our test statistic by a suitable linear function of the standard Wiener process $W(T)$, $T > 0$. In order to show that such an approximation is reasonable in the asymptotic case when $\Delta \to 0$, in Chapter 2 we were able to prove Theorem 2.1 without the introduction of the sequence $k_\Delta(t)$ (see (2.41)). However, because of the censored nature of the observations, we are not able to do so here and we thus introduce a sequence of integer-valued, non-decreasing and right-continuous functions
$$k_n(t) = \max\{k: \tfrac{k}{n} \le \alpha t\}, \qquad 0 \le t \le 1, \tag{3.50}$$
where $\alpha$ is related to the censoring number $r$ by $\frac{r}{n} \to \alpha$ as $n \to \infty$ ($\alpha \in (0,1]$).
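The time change (3.50) is simply $k_n(t) = \lfloor n \alpha t \rfloor$. A minimal sketch follows; the function name and the small rounding guard are our own additions, not part of the text.

```python
import math

def k_n(t, n, alpha):
    """Largest integer k with k/n <= alpha * t, as in (3.50)."""
    # The tiny offset guards against floating-point products such as
    # n * alpha * t landing just below an integer.
    return math.floor(n * alpha * t + 1e-12)

n, alpha = 100, 0.5
print([k_n(t, n, alpha) for t in (0.0, 0.25, 0.5, 1.0)])
```

As $t$ runs over $[0,1]$, $k_n(t)$ steps through the stages $0, 1, \dots, \lfloor n\alpha \rfloor$, so that $t = 1$ corresponds to the censoring point.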
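Defining, $\forall\, 0 \le t \le 1$,
$$J_{\alpha t}(\theta) = \int_0^{F^{-1}(\alpha t;\theta)} \frac{\partial}{\partial\theta} \ln f(x;\theta)\, \frac{\partial}{\partial\theta'} \ln f(x;\theta)\, dF(x;\theta) + \frac{1}{1 - \alpha t} \left[ \int_{F^{-1}(\alpha t;\theta)}^{\infty} \frac{\partial}{\partial\theta} \ln f(x;\theta)\, dF(x;\theta) \right] \left[ \int_{F^{-1}(\alpha t;\theta)}^{\infty} \frac{\partial}{\partial\theta'} \ln f(x;\theta)\, dF(x;\theta) \right], \tag{3.51}$$
that is, (3.6) with $\alpha$ replaced by $\alpha t$, from Lemma 3.2.2, as $n \to \infty$,
$$\frac{1}{n} J_{n,k_n(t)}(\theta) \to J_{\alpha t}(\theta), \qquad \forall \theta \in \Theta, \ 0 < t \le 1. \tag{3.52}$$
Replacing the sequence $k$ by $k_n(t)$ in (3.49), we have at the $k_n(t)$ stage of sampling
$$Z^*_{k_n(t)} = k_n(t)\, (\hat\theta_{k_n(t)} - \theta)'\, \dot g(\theta) + k_n(t)\, (\psi - \tfrac{1}{2}) \Delta. \tag{3.53}$$
As in Chapter 2, in order to calculate boundary crossing probabilities, one needs to divide $Z^*_{k_n(t)}$ by a suitable normalizing constant $\gamma_t^2$, where
$$\gamma_t^2 = \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta), \qquad 0 < t \le 1. \tag{3.54}$$
In Section 3.4 it will be seen that for the range of $k_n(t)$ having statistical interest, $\frac{\Delta^2 k_n(t)}{\xi\alpha} = O(t)$ as $\Delta \to 0$ and we thus further multiply $Z^*_{k_n(t)}$ by $\frac{\Delta}{\xi\alpha}$, where $\xi$ is a known constant such that $\Delta^2 n \to \xi > 0$ as $\Delta \to 0$. At the $k_n(t)$th stage of sampling, we thus focus on the function
$$Z^*_{k_n(t)} = \frac{\Delta\, k_n(t)\, [g(\hat\theta_{k_n(t)}) - g(\theta)]}{\xi\alpha\, \gamma_t^2} + \frac{\Delta^2 k_n(t)\, (\psi - \tfrac{1}{2})}{\xi\alpha\, \gamma_t^2}. \tag{3.55}$$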
Since the true value of the parameter $\theta$ is unknown, $\dot g(\theta)$ and $\gamma_t^2$ are estimated by
$$\dot g(\hat\theta_{k_n(t)}) = \frac{\partial g}{\partial\theta} \Big|_{\hat\theta_{k_n(t)}} \tag{3.56}$$
and
(3.57)
respectively.
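Lemma 3.2.6 (Analogous to Lemma 2.2.4)

Under suitably generalized assumptions given by Wald (1949) (see (A1) - (A8) in Proof of Lemma 2.2.4) and for continuous functions $\frac{\partial g}{\partial\theta} \big|_{\theta}$ and $\gamma_t^2(\theta)$, as $n \to \infty$,

(i) $\hat\theta_{k_n(t)} \xrightarrow{P} \theta$,

(ii) $\frac{\partial g}{\partial\theta} \big|_{\hat\theta_{k_n(t)}} \xrightarrow{P} \frac{\partial g}{\partial\theta} \big|_{\theta}$,

(iii) $\hat\gamma_{t,n}^2 \xrightarrow{P} \gamma_t^2$.

Proof of Lemma 3.2.6

From assumptions (A1) - (A8) as in the proof of Lemma 2.2.4, $\forall \epsilon > 0$,
$$\max_{[n\epsilon] \le k \le n} \|\hat\theta_k - \theta\| \xrightarrow{P} 0, \quad \text{as } n \to \infty.$$
This expression can be rewritten as
$$\max_{\epsilon \le t \le 1} \|\hat\theta_{k_n(t)} - \theta\| \xrightarrow{P} 0, \qquad \forall \epsilon > 0. \tag{3.58}$$
Notice that (i) is proved by the definition of stochastic convergence. Expression (ii) follows immediately from (3.58) and the assumed continuity in $\theta$ of $\frac{\partial g}{\partial\theta} \big|_{\theta}$ in some neighborhood of the true value $\theta$. From Lemma 3.2.2, (3.50) and (3.52), as $n \to \infty$, $\forall t > 0$,
$$k_n(t)\, \dot g(\theta)'\, J_{n,k_n(t)}^{-1}(\theta)\, \dot g(\theta) \to \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) = \gamma_t^2. \tag{3.59}$$
From (3.59), notice that $\gamma_t^2$ is a continuous function of $\theta$. From (3.58) and an argument similar to the proof of (ii) above, (iii) follows.
QED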
From (3.27), replacing $k$ with $k_n(t)$,
(3.60)
where $Z^W$ is defined as
$$Z^W = \left\{ \frac{\partial}{\partial\theta'} \ln L_\theta(Z^{(k_n(t))}, n) \right\} \alpha t\, J_{\alpha t}^{-1}(\theta) \left[ \frac{\partial g}{\partial\theta} \Big|_{\theta} \right]. \tag{3.61}$$
Using (3.11), (3.49) and (3.52), the right-hand side of (3.60) can be rewritten as
(3.62)
We now examine some of the properties of $Z^W$. Using (3.4), (3.61) can be rewritten as
$$Z^W = \sum_{i=1}^{k_n(t)} \left[ \frac{\partial}{\partial\theta'} \ln q_\theta(Z_{n,i} \mid B_{n,i-1}) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta), \tag{3.63}$$
where $q_\theta(Z_{n,i} \mid B_{n,i-1})$ is the conditional p.d.f. of the $i$th order statistic given the $(i-1)$st order statistic. Conditional on the $(i-1)$st order statistic, the $i$th order statistic $Z_{n,i}$ ranges in values over $(Z_{n,i-1}, \infty)$ and thus for $Z_{n,0} = 0$,
$$\int_{Z_{n,i-1}}^{\infty} q_\theta(z \mid B_{n,i-1})\, dz = 1, \qquad \forall i = 1, \dots, n. \tag{3.64}$$
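Assume regularity conditions (iii) and (iv) apply to $q_\theta(z)$ as well as to $f(x;\theta)$, so that differentiating under the integral sign, $\forall i = 1, \dots, n$,
(3.65)
Taking the expected value of $Z^W$ as expressed in (3.63),
$$E[Z^W] = \sum_{i=1}^{k_n(t)} E_\theta\left[ \frac{\partial}{\partial\theta'} \ln q_\theta(Z_{n,i} \mid B_{n,i-1}) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta). \tag{3.66}$$
Now, from (3.65),
$$E_\theta\left[ \frac{\partial}{\partial\theta'} \ln q_\theta(Z_{n,i} \mid B_{n,i-1}) \right] = 0', \tag{3.67}$$
so that $E[Z^W] = 0$. Since $Z^W$ has expectation zero,
$$\mathrm{Var}_\theta[Z^W] = E_\theta[(Z^W)^2] = \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, E_\theta\left[ \frac{\partial}{\partial\theta} \ln L_\theta(Z^{(k_n(t))}, n)\, \frac{\partial}{\partial\theta'} \ln L_\theta(Z^{(k_n(t))}, n) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta). \tag{3.68}$$
By the definition of $J_{n,k_n(t)}(\theta)$, (3.68) becomes
$$\mathrm{Var}_\theta(Z^W) = \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \{ J_{n,k_n(t)}(\theta) \}\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) = \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta) \left\{ k_n(t)\, \frac{n}{k_n(t)}\, \frac{1}{n} J_{n,k_n(t)}(\theta) \right\} \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta). \tag{3.69}$$
From (3.52) and the definition of $k_n(t)$, as $n \to \infty$,
(3.70)
Now define the process
$$W_{n,k_n(t)} = \frac{\Delta\, \lambda'_{n,k_n(t)}\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t}, \qquad 0 < t \le 1. \tag{3.71}$$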
By (3.61) and the definition of $\lambda_{n,k_n(t)}$, (3.71) can be rewritten as
$$W_{n,k_n(t)} = \frac{\Delta\, Z^W}{\sqrt{\xi\alpha}\, \gamma_t}. \tag{3.72}$$
Thus, from (3.70),
(3.73)
The following theorem claims that $W_{n,k_n(t)}$ converges in distribution to the standard Wiener process as $n \to \infty$:

Theorem 3.1

Let
$$x_{n,i} = \frac{\Delta\, \lambda^{*\prime}_{n,i}\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t}, \qquad 1 \le i \le n,$$
so that using (3.5), $W_{n,k_n(t)} = \sum_{i=1}^{k_n(t)} x_{n,i}$, for $t \in (0,1]$, and $W_{n,k_n(t)} = 0$ for $t = 0$. By Theorem 3.2 of McLeish (1974), if

(a) $\max_{i \le k_n(t)} |x_{n,i}| \xrightarrow{L_2} 0$, and

(b) $\sum_{i=1}^{k_n(t)} x_{n,i}^2 \xrightarrow{P} t$, $\forall t \in [0,1]$,

then $W_{n,k_n(t)} \xrightarrow{D} W(t)$, $t \in [0,1]$, where $W(t)$ is the standard Wiener process defined on the unit interval.

Before proving Theorem 3.1, the results of the following two lemmas need to be established.
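Lemma 3.2.7

If $\{Y_i, i = 1, 2, \dots, n\}$ is a sequence of iid random variables and $E(Y_i^2) = \sigma_Y^2 < \infty$ $\forall i$, then
$$\max_{1 \le i \le n} \frac{|Y_i|}{\sqrt{n}} \xrightarrow{P} 0, \quad \text{as } n \to \infty.$$

Proof of Lemma 3.2.7

Let $S_k = \sum_{i=1}^{k} Y_i$. Then $Y_{k+1} = S_{k+1} - S_k$, $k = 0, 1, \dots, n$, where $S_0 = 0$. Thus, $|Y_{k+1}| \le |S_{k+1}| + |S_k| \le 2 \max_{j = k, k+1} |S_j|$. Therefore,
$$\max_{0 \le k \le n-1} |Y_{k+1}| \le 2 \max_{1 \le k \le n} |S_k|.$$
From Theorem 5.3.1 of Chung (1974), $\forall \epsilon > 0$,
$$P\left\{ \max_{1 \le k \le n} |S_k| > \epsilon \sqrt{n} \ \text{for some } n \ge N \right\} \to 0, \quad \text{as } N \to \infty. \tag{3.74}$$
From (3.74), $\forall \epsilon > 0$,
$$P\left\{ \max_{1 \le k \le n} |Y_k| > \epsilon \sqrt{n} \ \text{for some } n \ge N \right\} \to 0, \quad \text{as } N \to \infty,$$
and thus (since $\frac{|Y_k|}{\sqrt{n}}$ is bounded by $\epsilon$),
$$\max_{1 \le k \le n} \frac{|Y_k|}{\sqrt{n}} \xrightarrow{P} 0, \quad \text{as } n \to \infty.$$
QED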
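The statement of Lemma 3.2.7 is easy to observe numerically. The sketch below uses standard normal variables, which is an illustrative choice (any iid sequence with a finite second moment would do); the sample sizes and seed are likewise arbitrary.

```python
import math
import random

def max_over_root_n(n, rng):
    """max_{1<=i<=n} |Y_i| / sqrt(n) for iid standard normal Y_i."""
    return max(abs(rng.gauss(0.0, 1.0)) for _ in range(n)) / math.sqrt(n)

rng = random.Random(3)
for n in (100, 10000, 100000):
    print(n, round(max_over_root_n(n, rng), 4))
```

For normal variables the maximum grows only like $\sqrt{2 \ln n}$, so the ratio decays slightly more slowly than $1/\sqrt{n}$.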
Given $Z_{n,i-1}$, $i = 2, \dots, n$, $\{Z_{n,i}, \dots, Z_{n,n}\}$ can be viewed as $\{Z_{n,i-1} + h_1, \dots, Z_{n,i-1} + h_{n-i+1}\}$. The $h$'s form a sequence $\{h_j: 1 \le j \le n-i+1\}$ of iid nonnegative random variables with continuous probability density function $h_\theta(x)$, $x \in R^+$:
$$h_\theta(x) = \begin{cases} \dfrac{f(Z_{n,i-1} + x;\theta)}{1 - F(Z_{n,i-1};\theta)}, & \text{if } x > 0 \\ r(Z_{n,i-1};\theta), & \text{if } x = 0. \end{cases} \tag{3.75}$$
Define the function $b(x)$, $x \in R^+$, as
$$b(x) = \begin{cases} \left[ \dfrac{\partial}{\partial\theta'} \ln \dfrac{1 - F(Z_{n,i-1} + x;\theta)}{1 - F(Z_{n,i-1};\theta)} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta), & \text{if } x > 0 \\ 0, & \text{if } x = 0, \end{cases} \tag{3.76}$$
and define $h_{n-i+1,1} = \min(h_1, \dots, h_{n-i+1})$, $n - i > 1$. Given the definition of $b(x)$ in (3.76), for $n - i > 1$,
$$b(h_{n-i+1,1}) = b[\min(h_1, \dots, h_{n-i+1})].$$
Now, given $Z_{n,i-1}$, $Z_{n,i}$ has the same conditional distribution function as the minimum of $\{Z_{n,i-1} + h_1, \dots, Z_{n,i-1} + h_{n-i+1}\}$ and therefore
$$b(h_{n-i+1,1}) = b[\min(Z_{n,i-1} + h_1, \dots, Z_{n,i-1} + h_{n-i+1}) - Z_{n,i-1}] = b[Z_{n,i} - Z_{n,i-1}] = \left[ \frac{\partial}{\partial\theta'} \ln \frac{1 - F(Z_{n,i};\theta)}{1 - F(Z_{n,i-1};\theta)} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta).$$
Based on the above definitions, the result of the following Lemma is established.
Lemma 3.2.8

The above definitions of $h_\theta(x)$ and $b(x)$ satisfy the assumptions of Lemma 3.3 of Sen (1976) so that: (i) $E |(n-i+1)\, b(h_{n-i+1,1})|^{\phi} < \infty$, $\forall \phi$: $0 < \phi \le nd$ (where $d > 0$ is such that $E |b(h)|^d < \infty$) and (ii) $\forall$ fixed $a > 0$,
$$\lim_{n \to \infty} E\left\{ \big[ (n-i+1)\, | b(h_{n-i+1,1}) - b(0) | \big]^{a} \right\} = \Gamma(a+1) \left[ \frac{b_1(0)}{h_\theta(0)} \right]^{a}, \tag{3.77}$$
where $\Gamma$ denotes the gamma function.

Proof of Lemma 3.2.8

The assumptions used by Lemma 3.3 of Sen (1976) are the following:

(a) $\{h_j, j = 1, \dots, n-i+1\}$ is a sequence of iid nonnegative random variables with a continuous probability density function $h_\theta(x)$, $x \in R^+$, such that $0 < h_\theta(0) = \lim_{x \to 0} h_\theta(x) < \infty$.

(b) $b(x)$, $x \in R^+$, is such that in some neighborhood of the origin, $b(x)$ has a continuous first derivative $b_1(x)$ and $\lim_{x \to 0} b_1(x) = b_1(0) < \infty$.

(c) There exists some $d > 0$ such that $E\{|b(h)|^d\} < \infty$.
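Now, from the definition of $\{h_j: 1 \le j \le n-i+1\}$ and (3.75), $h_\theta(0) = r(Z_{n,i-1};\theta)$, the hazard function of the distribution function $F$ evaluated at the point $Z_{n,i-1}$, and is thus positive and finite. Again, by the definition of hazard function and from (3.76),
$$\lim_{x \to 0} b_1(x) = \lim_{x \to 0} \left[ \frac{\partial}{\partial\theta'} \left( \frac{-f(Z_{n,i-1} + x;\theta)}{1 - F(Z_{n,i-1} + x;\theta)} \right) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) = \lim_{x \to 0} \left[ -\frac{\partial}{\partial\theta'} r(Z_{n,i-1} + x;\theta) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)$$
$$= \left[ -\frac{\partial}{\partial\theta'} r(Z_{n,i-1};\theta) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) < \infty,$$
since $b_1(x)$ is continuous in $x$.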
Now, if d > 1
d
a
--1n[1-F(Z .]
l+x;e)] atJ t(e)g(8)
n,l-a - - [a8 '
-1
0
d
feZ . l+x;e)
n,l- dx
1-F(Z n,l. 1;8)
_
feZ
. 1+x;8)
n,l- dx
1-F(Zn,l. I.e)
,-
.,
a ]
1 0
d feZ 'l+ x ;e)
---, 1n[1-F(Z '_1;8)] atJ-t(e)g(e)
1 F~i1dx
n,l
-a - - '1' [ a~8
n,l-
-
.S)
(3.78)
The second summand in (3.78) exists $\forall d > 0$ since $E[h_\theta(x)] = 1$. The first summand in (3.78) exists for $d = 2$ due to regularity assumption (iv) of Section 3.2.1. Thus (a), (b) and (c) are satisfied and by Lemma 3.3 of Sen (1976), (i) and (ii) as stated in the Lemma are true.
QED
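Proof of Theorem 3.1

First check that $x_{n,i}$ satisfies (a). From the definition of $\lambda^*_{n,i}$ and (3.3),
$$x_{n,i} = \frac{\Delta \left[ \frac{\partial}{\partial\theta'} \ln f(Z_{n,i};\theta) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t} \tag{3.79a}$$
$$+\ \frac{\Delta \left[ (n-i)\, \frac{\partial}{\partial\theta'} \ln \frac{1 - F(Z_{n,i};\theta)}{1 - F(Z_{n,i-1};\theta)} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t} \tag{3.79b}$$
$$+\ \frac{\Delta \left[ -\frac{\partial}{\partial\theta'} \ln[1 - F(Z_{n,i-1};\theta)] \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t}, \qquad 1 \le i \le n. \tag{3.79c}$$
Now,
$$\max_{i \le k_n(t)} |x_{n,i}| \le \max_{i \le k_n(t)} |(3.79a)| + \max_{i \le k_n(t)} |(3.79b)| + \max_{i \le k_n(t)} |(3.79c)|$$
and it suffices to show that each individual summand converges to zero in $L_2$ norm. Now,
$$\max_{i \le k_n(t)} |(3.79a)| \le \max_{i \le n} |\Delta Y_i|,$$
where
$$Y_i = \frac{\left[ \frac{\partial}{\partial\theta'} \ln f(x_i;\theta) \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t}, \qquad i = 1, \dots, n.$$
Now, the $Y_i$ are iid random variables when considering the entire set of $n$ observations $\{x_i, 1 \le i \le n\}$ and from regularity assumption (iv), they have a finite second moment. As will be seen later, $\Delta = O(\frac{1}{\sqrt{n}})$, so from Lemma 3.2.7,
$$\max_{i \le n} |\Delta Y_i| = \max_{i \le n} |Y_i|\, O\big(\tfrac{1}{\sqrt{n}}\big) \xrightarrow{L_2} 0, \quad \text{as } n \to \infty,$$
and so does $\max_{i \le k_n(t)} |(3.79a)|$.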
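A similar argument applies to $\max_{i \le k_n(t)} |(3.79c)|$. Since $Z_{n,i-1} \in \{x_j: 1 \le j \le n\}$,
$$\max_{i \le k_n(t)} |(3.79c)| \le \max_{j \le n} \Delta |Y^*_j|,$$
where
$$Y^*_j = \frac{\left[ \frac{\partial}{\partial\theta'} \ln[1 - F(x_j;\theta)] \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\sqrt{\xi\alpha}\, \gamma_t}.$$
Again, the $Y^*_j$ are iid random variables with a finite second moment. Using Lemma 3.2.7, $\max_{i \le n} \Delta |Y^*_i| = \max_{i \le n} |Y^*_i|\, O(\frac{1}{\sqrt{n}}) \xrightarrow{L_2} 0$, as $n \to \infty$, and so does $\max_{i \le k_n(t)} |(3.79c)|$.

By definition of convergence in $L_2$ norm, to show that $\max_{i \le k_n(t)} |(3.79b)| \xrightarrow{L_2} 0$, as $n \to \infty$, it suffices to show that
$$E\left[ \max_{i \le k_n(t)} \left| \frac{\Delta (n-i)}{\sqrt{\xi\alpha}\, \gamma_t} \left[ \frac{\partial}{\partial\theta'} \ln \frac{1 - F(Z_{n,i};\theta)}{1 - F(Z_{n,i-1};\theta)} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) \right|^{m} \right] \tag{3.80}$$
converges to zero as $n \to \infty$ for some $m > 2$. Since $\max_i |x_i| \le \sum_i |x_i|$ for any $x_i$ and set $\{i\}$, (3.80) is less than or equal to
$$\sum_{i \le k_n(t)} \frac{\Delta^m (n-i)^m}{(\xi\alpha)^{m/2}\, \gamma_t^m}\, E\left| \left[ \frac{\partial}{\partial\theta'} \ln \frac{1 - F(Z_{n,i};\theta)}{1 - F(Z_{n,i-1};\theta)} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) \right|^{m}. \tag{3.81}$$
Letting $m = a$ in (3.77), and using the definition of $b(h_{n-i+1,1})$,
$$(n-i)^m\, E_\theta\left| \left[ \frac{\partial}{\partial\theta'} \ln \frac{1 - F(Z_{n,i};\theta)}{1 - F(Z_{n,i-1};\theta)} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta) \right|^{m} = \left( \frac{n-i}{n-i+1} \right)^{m} E_\theta\left[ (n-i+1)^m\, | b(h_{n-i+1,1}) - b(0) |^m \right] \tag{3.82}$$
and as $n \to \infty$, by Lemma 3.2.8, (3.82) converges to $\Gamma(m+1) \left[ \frac{b_1(0)}{h_\theta(0)} \right]^m < \infty$, $m > 2$, and is thus $O(1)$. Thus (3.81) converges as $n \to \infty$ to
$$\frac{\Delta^m}{(\xi\alpha)^{m/2}\, \gamma_t^m}\, k_n(t)\, O(1) = \frac{\Delta^{m-2}\, \Delta^2 k_n(t)}{(\xi\alpha)^{m/2}\, \gamma_t^m}\, O(1). \tag{3.83}$$
It will be shown later that as $n \to \infty$, $\Delta^2 k_n(t) \to \xi\alpha t$ and thus as $n \to \infty$, (3.83) becomes $\Delta^{m-2}\, O(1)$. Since $\Delta^2 n = O(1)$, when $n \to \infty$, $\Delta \to 0$, (3.83) converges to zero, for $2 < m \le 4$ (i.e. assume existence of up to the fourth moment of $b(h)$). Thus, $\max_{i \le k_n(t)} |(3.79b)| \xrightarrow{L_2} 0$ as $n \to \infty$, and $x_{n,i}$ satisfies condition (a).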
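It remains to show that $x_{n,i}$ satisfies condition (b). From the definition of $x_{n,i}$,
$$\sum_{i=1}^{k_n(t)} x_{n,i}^2 = \frac{\Delta^2\, \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta) \left[ \sum_{i=1}^{k_n(t)} \lambda^*_{n,i}\, \lambda^{*\prime}_{n,i} \right] \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\xi\alpha\, \gamma_t^2}. \tag{3.84}$$
By (3.2), (3.5), (3.11) and the result of Lemma 3.2.1, $x_{n,i}$ is a martingale difference array and thus
$$J_{n,k_n(t)}(\theta) = E(\lambda_{n,k_n(t)}\, \lambda'_{n,k_n(t)}) = E\left[ \sum_{i=1}^{k_n(t)} \lambda^*_{n,i} \sum_{j=1}^{k_n(t)} \lambda^{*\prime}_{n,j} \right] = E\left[ \sum_{i=1}^{k_n(t)} \lambda^*_{n,i}\, \lambda^{*\prime}_{n,i} + 2 \sum_{i < j} \lambda^*_{n,i}\, \lambda^{*\prime}_{n,j} \right] = E\left[ \sum_{i=1}^{k_n(t)} \lambda^*_{n,i}\, \lambda^{*\prime}_{n,i} \right]. \tag{3.85}$$
Assume that $\sum_{i=1}^{k_n(t)} \lambda^*_{n,i}\, \lambda^{*\prime}_{n,i} \xrightarrow{P} E\left[ \sum_{i=1}^{k_n(t)} \lambda^*_{n,i}\, \lambda^{*\prime}_{n,i} \right]$, for $k_n(t)$ large by the Central Limit Theorem, and thus from (3.84) and (3.85), as $n \to \infty$,
$$\sum_{i=1}^{k_n(t)} x_{n,i}^2 \xrightarrow{P} \frac{\Delta^2\, \dot g(\theta)'\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, J_{n,k_n(t)}(\theta)\, \alpha t\, J_{\alpha t}^{-1}(\theta)\, \dot g(\theta)}{\xi\alpha\, \gamma_t^2}. \tag{3.86}$$
From (3.59), (3.86) converges to
(3.87)
It will be shown later that as $n \to \infty$, $\Delta^2 k_n(t) \to \xi\alpha t$ and thus, from (3.86) and (3.87), $\sum_{i=1}^{k_n(t)} x_{n,i}^2 \xrightarrow{P} t$ and condition (b) is satisfied. By Theorem 3.2 of McLeish (1974),
$$W_{n,k_n(t)} \xrightarrow{D} W(t), \qquad t \in [0,1]. \tag{3.88}$$
QED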
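Theorem 3.1 can be illustrated by simulation in the scalar exponential model with $g(\theta) = \theta$. There $J_{\alpha t}(\theta) = \alpha t/\theta^2$, $\gamma_t^2 = \theta^2$ and $\Delta/\sqrt{\xi} = 1/\sqrt{n}$, so that, by our own reduction of the definition in the theorem, $x_{n,i} = \theta \lambda^*_{n,i} / \sqrt{n\alpha}$. The sketch below checks only that $\mathrm{Var}(W_{n,k_n(t)}) \approx t$; the model, sample sizes and seed are illustrative assumptions, not taken from the text.

```python
import math
import random

def w_at(z, t, alpha, theta):
    """W_{n,k_n(t)} for the exponential model with g(theta) = theta."""
    n = len(z)
    k = math.floor(n * alpha * t + 1e-12)            # k_n(t) as in (3.50)
    total, prev = 0.0, 0.0
    for i in range(1, k + 1):
        lam_star = 1.0 / theta - (n - i + 1) * (z[i - 1] - prev)
        total += theta * lam_star / math.sqrt(n * alpha)   # x_{n,i}
        prev = z[i - 1]
    return total

def variance_of_w(t, n, alpha, theta, reps, seed=0):
    """Sample variance of W_{n,k_n(t)} over `reps` simulated experiments."""
    rng = random.Random(seed)
    vals = []
    for _ in range(reps):
        z = sorted(rng.expovariate(theta) for _ in range(n))
        vals.append(w_at(z, t, alpha, theta))
    mean = sum(vals) / reps
    return sum((v - mean) ** 2 for v in vals) / (reps - 1)

for t in (0.5, 1.0):
    print(t, round(variance_of_w(t, n=200, alpha=0.5, theta=1.0, reps=3000), 3))
```

A variance close to $t$ at each time point is what one expects of a standard Wiener process on $[0,1]$; a fuller check would also look at the independence of increments.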
To summarize the results of this subsection, we first state that our test statistic at the $k_n(t)$ stage of sampling is given by
$$Z^*_{k_n(t)} = \frac{\Delta\, k_n(t) \left[ g(\hat\theta_{k_n(t)}) - \tfrac{1}{2}(g_0 + g_1) \right]}{\xi\alpha\, \hat\gamma_{t,n}^2}. \tag{3.89}$$
From (3.55), (3.60), (3.61) and (3.71), (3.89) can be rewritten as
$$Z^*_{k_n(t)} = \frac{1}{\sqrt{\xi\alpha}\, \hat\gamma_{t,n}}\, W_{n,k_n(t)} + \frac{\Delta^2 k_n(t)\, (\psi - \tfrac{1}{2})}{\xi\alpha\, \hat\gamma_{t,n}^2}, \qquad t \in [0,1]. \tag{3.90}$$
Notice from (3.89) and (3.90) that the test statistic involves a consistent estimator $\hat\gamma_{t,n}^2$ of $\gamma_t^2$. Now, $\gamma_t^2$ is an "updated" approximation at stage $t$ of the variance
$$\gamma^2 = \gamma^2(\theta) = \left[ \frac{\partial g}{\partial\theta} \Big|_{\theta} \right]' \{ \alpha J_\alpha^{-1}(\theta) \} \left[ \frac{\partial g}{\partial\theta} \Big|_{\theta} \right],$$
the "censored" variance. As $t$ approaches 1, $\gamma_t^2$ is closer to $\gamma^2(\theta)$. In the test statistic as given in (3.89) and (3.90), $\hat\gamma_{t,n}^2$ is used rather than $\gamma^2(\hat\theta_{k_n(t)})$ in order to have convergence of the process to a drifted Wiener process. However, as will be shown later, in order to be able to calculate boundary crossing probabilities of the process, it will be necessary to study the behavior of $Z^{**}_{k_n(t)}$, which equals $Z^*_{k_n(t)}$ with $\gamma^2(\hat\theta_{k_n(t)})$ instead of $\hat\gamma_{t,n}^2$ in the denominator, since $\gamma^2$ rather than $\gamma_t^2$ is the relevant variance statistic. Now, this substitution does not affect the termination probability of the sequential procedure, but will facilitate the calculation of the OC and ASN functions.
3.2.4 Test Procedure

Corresponding to given preassigned strength $(\alpha_1, \alpha_2)$, where $\alpha_1$ and $\alpha_2$ are the probabilities of Type I and Type II errors respectively, and fixed sample size $n$, consider numbers $a_n$ and $b_n$ such that $-\infty < b_n < 0 < a_n < \infty$ and such that given $(a_n, b_n)$, the OC function as determined in Section 3.4 is asymptotically consistent. The structure of the test procedure is essentially similar to the testing procedure of the Sequential Probability Ratio Test (SPRT) and the Generalized Sequential Likelihood Ratio Test (GSLRT) as discussed before. However, due to censoring at point $r$, certain modifications in the testing procedure are necessary. These modifications will be shown to contribute to the complexity of the OC and ASN functions. As far as affecting the testing procedure, it will be shown that $a_n$ and $b_n$ will not be the same as for the SPRT and GSLRT.

The test procedure requires an initial sample size dependent on $\Delta$, $n_0(\Delta)$, such as in (2.30) in order to be able to estimate $\gamma_t^2$ by $\hat\gamma_{t,n}^2$ or $\gamma^2$ by $\gamma^2(\hat\theta_{k_n(t)})$ with reasonable accuracy. At stage $n_0(\Delta) \le k_n(t) \le r$, calculate $Z^*_{k_n(t)}$ as in (3.89) or $Z^{**}_{k_n(t)}$ and follow the decision scheme:

If $Z^*_{k_n(t)} \ge a_n$, stop sampling and accept $H_1$ at stage $k_n(t)$;

If $Z^*_{k_n(t)} \le b_n$, stop sampling and accept $H_0$ at stage $k_n(t)$;

If $b_n < Z^*_{k_n(t)} < a_n$, then,

(i) if $k_n(t) < r$, continue sampling the $(k_n(t)+1)$st observation

(ii) if $k_n(t) = r$, stop sampling and accept or reject $H_0$ according to whether $Z^*_r$ is negative or positive, respectively.
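The decision scheme above translates directly into a small function. This is a sketch only: the function name, the string labels and the treatment of the boundary case $Z^*_r = 0$ (counted here as rejection of $H_0$) are our own assumptions, since the text does not address that case.

```python
def pcs_decision(z_star, a_n, b_n, k, r):
    """Decision at stage k <= r given the current statistic z_star and
    boundaries b_n < 0 < a_n. Returns 'accept H1', 'accept H0' or 'continue'."""
    if z_star >= a_n:
        return "accept H1"
    if z_star <= b_n:
        return "accept H0"
    if k < r:
        return "continue"              # sample the (k+1)st observation
    # k == r: forced decision by the sign of the statistic
    # (z_star == 0 is treated as rejection of H0, an assumption)
    return "accept H0" if z_star < 0 else "accept H1"

print(pcs_decision(0.2, 1.0, -1.0, 5, 10), pcs_decision(0.2, 1.0, -1.0, 10, 10))
```

The only difference from an untruncated two-boundary rule is the forced decision at stage $r$.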
We now proceed to examine the properties of such a sequential testing procedure. We note that the test procedure does not depend on the value of $\psi$ since its value is unknown for implementation purposes.

3.3 Termination Probability of the Test Procedure
In the case of a Progressively Censored Sequential Likelihood Ratio Test (PCSLRT), the quantities $n$ and $r$ are fixed and $k_n(t)$ is random. Let $k^*_n(t) = \min\{ k_n(t): n_0(\Delta) \le k_n(t) \le r,\ Z^*_{k_n(t)} \text{ advocates stopping} \}$ denote the stopping variable for the process $Z^*_{k_n(t)}$. The results of this section follow for $Z^{**}_{k_n(t)}$ as well (with $k^{**}_n(t)$ defined similarly to $k^*_n(t)$). Now, for theoretical purposes, let $r$ vary such that for any $n > 0$, $\frac{r}{n} = \alpha$, $\alpha \in (0,1]$. Thus for large $n$, $k^*_n(t)$ can be quite large. The choice of $n$ depends on the value of $\Delta$ since the smaller the value of $\Delta$ is, the larger $n$ is needed to be able to detect a difference between $H_0$ and $H_1$. Recall that $\Delta^2 n = \xi$ and thus appropriate values of $\xi$ such that the test is sensitive but sample size is within practical limits, need to be proposed. These will be studied using simulations in Chapter 4.

Now, by definition of $k^*_n(t)$, $k^*_n(t) \le r$. Under the case when $\Delta \to 0$, $n \to \infty$ and letting $\alpha = 1$ (i.e. $r = n$), we look at the worst possible case and examine the termination probability of the PCSLRT. We focus on the probability
(3.91)
Rewrite (3.91) as
[~~d
Yt <
t)
L\Yk n (t) (lj; -
an
+
mx
-------<--~Yt
(3.92)
L\/knlt)
where for fixed L\,
n [~~,n] ~ ] =
lim b Yt
n-.oo [ L\
y2
t
~l<":1t)
0
lim[ anYt
an d n-.oo
L\
n
[Y~,n]2 ~I~
k (t) ]
Y
t
=
a
n
l3.93)
and
lim
n-+cx>
[Mnltll~
~Yt
-i1] =
a if g(8)
= 21 lg 0+ gl)
1
00
if
g(~) >
2(gO+gl)
_00
if
g(~) <
i(gO+gl)
l3. 94)
Now notice from the result of Theorem 3.1 that the first summand within the inequality signs in (3.92) has a limiting standard normal distribution since $\sqrt{k_n(t)}$ times it equals $W_{n,k_n(t)}$ and thus behaves as a standard Wiener process as $n \to \infty$. From this result and (3.93) and (3.94), the expression between the inequality signs in (3.92) has either a limiting standard normal distribution or becomes unbounded, while the boundaries tend to 0 as $n \to \infty$. Thus, the probability in (3.91) goes to zero as $n \to \infty$, $\forall t > 0$, and thus the asymptotic termination probability of the PCSLRT in the case of no censoring is unity. Since $\alpha < 1$ is the usual case, the boundaries in (3.92) are smaller and the expression within the inequality signs is larger and thus termination is still with probability one. The extent to which termination is achieved before $k_n(t) = r$ is examined in Section 3.5 when we look at the average sample number (ASN) function.
3.4 Operating Characteristic Function
The operating characteristic (OC) function of a sequential test
is defined as $P\{\text{accept } H_0 \mid \theta, \Delta\}$. We confine ourselves to studying the
behavior of the OC and average sample number (ASN) functions under local
alternatives (i.e. $\Delta \to 0$). The behavior of the sequence $Z^{**}_{k_n(t)}$ is complex
and it is approximated using the properties of the Brownian Motion
process. In order for $n$ to be large enough so that the result of
Theorem 3.1 is applicable, since $\Delta^2 n = \xi$ and $\xi$ is a known constant ($\xi > 0$),
$\Delta$ is required to be small. Thus as $n \to \infty$, $\Delta \to 0$. In such a case, for
fixed $\theta$, the OC function has a degenerate limit as $\Delta \to 0$, and in order to
avoid this limiting degeneracy, we let
$$ g(\theta) = g_0 + \psi\Delta, \quad \text{where } \psi \in [0,1], $$
and denote by $L(\psi,\Delta) = P\{\text{accept } H_0 \mid g(\theta) = g_0 + \psi\Delta\}$ the OC function of
our process. We thus examine local alternatives about the point $g_0$.
Define the initial sample size $n_0(\Delta)$ as in (2.30) and the stopping variable
$k_n^{**}(t)$ as in Section 3.3. The following theorem then holds.
Theorem 3.2
Under convergence of y
2
A
(~k
(t)) to y
2
as n
~
00,
n
e
(2~-I)~a(a
n
-b )
n ~(-2'~ay(an -bn )
+
1 _ ~)~/y)
(-2
+
e-
(2~-1)~abn
e
e
(2~-1) ~a(b
•
Proof of
~orem
~
Let y (t)
__
~(21~aybn
1
(2 -
~)~/y) -
-a )
n n ~(-2~(an-bn) _ (} _ ~)~/y).
(3.95)
3.2
2 /-
•
= Y ~k (t)) and mtroduce the following events
n
and
1
.6r[g(e" ) - -2(gO+gl)]
·~r
"2
E,;a y (t)
<
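$< \eta, \ \forall t \in (0,1] \},$
where $\eta > 0$ is arbitrary. Let $E_3^c(\Delta,\eta)$ denote the complement of $E_3(\Delta,\eta)$
and $P_\psi$ denote a probability evaluated given $g(\theta) = g_0 + \psi\Delta$. Then, by
the Theorem of Total Probability,
$$
\begin{aligned}
L(\psi,\Delta) &= P_\psi\{E_1(\Delta) \cup E_2(\Delta)\} \\
&= P_\psi[\{E_1(\Delta) \cup E_2(\Delta)\} \cap E_3(\Delta,\eta)] + P_\psi[\{E_1(\Delta) \cup E_2(\Delta)\} \cap E_3^c(\Delta,\eta)].
\end{aligned}
\tag{3.96}
$$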
As briefly mentioned in Section 3.2.3, the OC function for the statistic
$Z^{**}_{k_n(t)}$ is evaluated. The variance $\gamma^2$ can be updated at point $k_n(t)$ using
$\gamma_t^2$. However, if $\gamma_t^2$ is used, the boundaries to cross for termination of
the sequential testing procedure will not be linear in $t$. In such a
case, the results of Anderson (1960) on boundary crossing probabilities
of the drifted Wiener process cannot be utilized. The calculation of
such probabilities for non-linear boundaries is quite complex and has
not yet been studied. By changing the test statistic from $Z^{*}_{k_n(t)}$ to
$Z^{**}_{k_n(t)}$, the properties of the test procedure are not altered, but the
calculations of OC and ASN functions are made feasible.
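We now claim that the second summand in (3.96) is negligible.
First notice that it is bounded from above by $P_\psi\{E_3^c(\Delta,\eta)\}$. Recall that
$\gamma^2$ is a known specified function of $\theta$, $\gamma^2(\theta)$, and it is continuous in
$\theta$. Recall from Lemma 3.2.6 that $\hat\theta_{k_n(t)} \xrightarrow{P} \theta$, that is,
$P\{|\hat\theta_{k_n(t)} - \theta| > \varepsilon \text{ for some } t \in [\eta^*,1]\} \to 0$, $\forall \eta^* > 0$, $\varepsilon > 0$.
Thus, if $\hat\gamma^2(t) = \gamma^2(\hat\theta_{k_n(t)})$, then
$P\{\sup_t |\hat\gamma^2(t) - \gamma^2| > \varepsilon\} \to 0$, $\forall \varepsilon > 0$, and thus $\hat\gamma^2(t) \xrightarrow{P} \gamma^2$. Under
convergence of $\hat\gamma^2(t)$ to $\gamma^2$, $P_\psi\{E_3^c(\Delta,\eta)\}$ is negligible, $\forall \eta > 0$, $\Delta > 0$,
as $n \to \infty$.
Focus now on the probability of the event $\{E_1(\Delta) \cup E_2(\Delta)\} \cap E_3(\Delta,\eta)$.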
Denote by
o
Zk
for r
(t) =
E;cx Y
2
n o(lI) , and define the stopping variables
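$$ N^{ij}(\Delta) = \min\{ r \ge k_n(t) \ge n_0(\Delta) : Z^o_{k_n(t)} \notin ((1+(-1)^i\eta)b_n,\ (1+(-1)^j\eta)a_n) \}, \quad \text{for } i,j = 1,2. $$
Let $L^o_{ij}(\psi,\Delta)$ denote the OC function of a parallel sequential
test to $Z^{**}_{k_n(t)}$ based on $\{Z^o_{k_n(t)},\ r \ge k_n(t) \ge n_0(\Delta)\}$, with boundaries
$(1+(-1)^i\eta)b_n$, $(1+(-1)^j\eta)a_n$, $i,j = 1,2$. Then $\forall \varepsilon > 0$, $\eta > 0$, $\exists \Delta_0 > 0$
such that $\forall\, 0 < \Delta < \Delta_0$,
$$ L^o_{21}(\psi,\Delta) - \varepsilon \le L(\psi,\Delta) \le L^o_{12}(\psi,\Delta) + \varepsilon, \quad \forall \psi \in [0,1]. \tag{3.97} $$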
Since $\varepsilon$ and $\eta$ are arbitrary, from (3.97) it is clear that it suffices to
show that
$$ \lim_{\eta \to 0} \{ \lim_{\Delta \to 0} L^o_{ij}(\psi,\Delta) \} = L(\psi), \quad \text{for } i,j = 1,2. \tag{3.98} $$
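Recall from (3.88) that $W_{n,k_n}(t) \xrightarrow{D} W(t)$, $t \in [0,1]$. From (3.55)
and the definition of $Z^o_{k_n(t)}$, using (3.60) and (3.72),
$$
\begin{aligned}
Z^o_{k_n(t)} &= \frac{\Delta k_n(t)[g(\hat\theta_{k_n(t)}) - g(\theta)]}{\xi\alpha\gamma^2} + \frac{\Delta^2 k_n(t)(\psi - \tfrac{1}{2})}{\xi\alpha\gamma^2} \\
&= \frac{\gamma_t}{\sqrt{\xi\alpha}\,\gamma^2} W_{n,k_n}(t) + \frac{\Delta^2 k_n(t)(\psi - \tfrac{1}{2})}{\xi\alpha\gamma^2} \\
&= \frac{\gamma_t}{\sqrt{\xi\alpha}\,\gamma^2} W(t) + \frac{\Delta^2 k_n(t)(\psi - \tfrac{1}{2})}{\xi\alpha\gamma^2} + o(1), \quad \text{a.s.}, \ \forall t \in [0,1].
\end{aligned}
\tag{3.99}
$$
From the definition of $k_n(t)$, $\lim_{n \to \infty} k_n(t)/n = \alpha t$ and
$\lim_{\Delta \to 0} \Delta^2 k_n(t) = \xi\alpha t$.
Thus, for $\Delta \to 0$, (3.99) becomes, for $\gamma_t$ a consistent update of $\gamma$,
$$ Z^o_{k_n(t)} = \frac{1}{\sqrt{\xi\alpha}\,\gamma} W(t) + \frac{(\psi - \tfrac{1}{2})\,t}{\gamma^2} + o(1), \quad \text{a.s.}, \ \forall t \in [0,1]. \tag{3.100} $$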
From Theorem 4.4 of Anderson (1960):
If $W(t)$ is a Wiener process with mean 0 and variance $t$, and
$c_1 > 0$ and $c_2 < 0$, then
$$
P\{ W(t),\ t \in [0,1], \text{ crosses the line } c_1+\delta_1 t \text{ before crossing the line } c_2+\delta_2 t, \text{ or } c_2+\delta_2 t < W(t) < c_1+\delta_1 t \ \forall t \in [0,1] \text{ and } W(1) > \rho \}
= 1 - \Phi(\rho) + \sum_{r=1}^{\infty} A_r, \tag{3.101}
$$
where $\Phi$ denotes the standard normal distribution function and
2
-2[r (c16l+c202)-r(r-l)clo2-r(r+l)c26l]
~(2r(c2-cl)+P)
- e
- e
-2[rc 2-(r-l)c l ] [r6 2-(r-l)ol]
~(2[rc2-(r-l)cl]-P)
2
-2[r (clol+c262)-r(r-l)c26l-r(r+l)clo2]
~(2r(c2-cl)-P).
+ e
Now, $\sum_{r=2}^{\infty} A_r$ consists of summands involving exponential functions raised
to powers less than or equal to minus eight, and these are thus negligible in
comparison to the terms of $A_1$. Thus (3.101) simplifies to
$$ 1 - \Phi(\rho) + A_1. \tag{3.102} $$
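Thus, as $\Delta \to 0$, $L^o_{ij}(\psi,\Delta) \to L^o_{ij}(\psi)$, where
$L^o_{ij}(\psi) = 1 - P\{\cdot\}$ and the probability is that of the event that
$Z^o_{k_n(t)} = \frac{1}{\sqrt{\xi\alpha}\,\gamma} W(t) + \frac{(\psi - \frac{1}{2})t}{\gamma^2}$
crosses the line $(1+(-1)^j\eta)a_n$ before crossing the line $(1+(-1)^i\eta)b_n$, or
lies between them $\forall t \in (0,1]$ and is $> 0$ at $t = 1$. Equivalently,
$$
L^o_{ij}(\psi) = 1 - P\Big\{ W(t) \text{ crosses the line } \sqrt{\xi\alpha}\,\gamma(1+(-1)^j\eta)a_n - (\psi - \tfrac{1}{2})\,t\sqrt{\xi\alpha}/\gamma \text{ before crossing the line } \sqrt{\xi\alpha}\,\gamma(1+(-1)^i\eta)b_n - (\psi - \tfrac{1}{2})\,t\sqrt{\xi\alpha}/\gamma, \text{ or lies between them } \forall t \in (0,1] \text{ and } W(1) > (\tfrac{1}{2} - \psi)\sqrt{\xi\alpha}/\gamma \Big\}, \tag{3.103}
$$
where the probability in (3.103) is given by (3.102) with
$c_1 = \sqrt{\xi\alpha}\,\gamma(1+(-1)^j\eta)a_n$, $c_2 = \sqrt{\xi\alpha}\,\gamma(1+(-1)^i\eta)b_n$
and $\delta_1 = \delta_2 = (\tfrac{1}{2} - \psi)\sqrt{\xi\alpha}/\gamma = \rho$.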
From (3.98),
o
L(W) = lim L.. (lJi)
n-+O 1J
(2lJi-l)~a(b
- e
n
-a )
1
n ~(-2~y(an -bn )-(-2 -W)~/y) .
(3.104 )
QED
Note that the OC function is not asymptotically distribution free
since it depends on the underlying distribution through $\gamma$. Due to strong
convergence of $\gamma^2(\hat\theta_{k_n(t)})$ to $\gamma^2$ as $n \to \infty$, a consistent estimator of $\gamma$ can
be obtained, which yields an asymptotically consistent procedure. For
prescribed $(\alpha_1,\alpha_2)$, one can solve for $a_n$ and $b_n$ in order to have an
asymptotically consistent procedure. In order for the procedure to be
asymptotically consistent of strength $(\alpha_1,\alpha_2)$,
l-a
1
= LCO)
+ e
_ e
·e
= ~(~/y)
-~aa
n~(-2~yan+t~/y)
- e
-~a(a -b )
-~a(b
n
n ~C-2~y(a -b )+~21
~a/y)
n n
n
-a )
n ~(-2~y(a -b ) - ~21
~a/y)
n
n
cx 2 = L(l) = ~(- ~/y) -e
and
~cxa
n~(-2~yan ~/y)
-b )
n n ~(-2~y(a -b ) - ~2l
~cx/y)
~cx(a
e
+
n
n
(3.105)
The $(a_n,b_n)$ that solve (3.105) make the procedure asymptotically consistent.
The solution of (3.105) is quite complicated, involving possibly
a two-dimensional iterative interpolation formula utilizing the
Newton-Raphson methodology. However, there are some special cases for which an
approximate closed form solution is obtainable. In most practical sequential
procedures, $\alpha_1 = \alpha_2 = \alpha^*$ and (as in Chapter 2 for $a$ and $b$) in such
a case, $b_n = -a_n$. The solution of (3.105) simplifies to solving for $a_n$ in:
1
~(- ~/y)-cx
*
1
~/y)
(3.106)
In order to solve (3.106) one can utilize the standard one-dimensional
Newton-Raphson technique and iteratively arrive at the value of $a_n$.
However, it is desired to express such a solution in closed form.
The closed form solution presented below may be used as an initial value
for an iterative procedure to arrive at the value of $a_n$. If $\sqrt{\xi\alpha}/\gamma$ is
small and $2\sqrt{\xi\alpha}\,\gamma a_n$ is large, the approximation given will be quite
satisfactory. Recall that for $x > 0$ large, $\Phi(-x) \approx \phi(x)/x$, where $\phi(x)$ is the
standard normal probability density function (see Kendall and Stuart,
Vol. 1 (1969)).
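The tail approximation $\Phi(-x) \approx \phi(x)/x$ can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `mills_tail_approx` is an illustrative choice and does not appear in the original work.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def mills_tail_approx(x: float) -> float:
    """Approximate the lower tail Phi(-x) by phi(x) / x, intended for large x > 0."""
    if x <= 0:
        raise ValueError("the approximation is only meant for x > 0")
    return _STD_NORMAL.pdf(x) / x


if __name__ == "__main__":
    for x in (1.0, 2.0, 4.0, 6.0):
        exact = _STD_NORMAL.cdf(-x)
        approx = mills_tail_approx(x)
        # The approximation overestimates the tail; the relative error
        # decreases as x grows.
        print(f"x={x:3.1f}  exact={exact:.3e}  approx={approx:.3e}  "
              f"rel.err={(approx - exact) / exact:.3f}")
```

The approximation is an upper bound on the tail, so it is reasonable only when the argument (here $2\sqrt{\xi\alpha}\,\gamma a_n$) is large, as the text requires.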
Then simplify (3.106) as
1
"2- a
*
1
::::-~--e
2~ya
1
+
2 2
1
n
nn
n
-8~ay
1
4~ya
-2~ay a
-e
2 2
a
n
n (e
+
"2 e
n _ e
-~aa
n)
-2~aa
n -e 2~aa n )
(e
ffTI
e
1
~aa
-2~ay
2 2
a
n
-~aa
n
2 2
-8t,;ay an -2~cxan
(3.1071
The expression in (3.107) can be further simplified by expanding $e^x$
using a second order Taylor series and thus obtain:
(3.108)
From (3.108),
(3.109)
From (3.109) one can obtain $a_n$ and $b_n$ such that the test procedure is
asymptotically consistent with strength $(\alpha^*,\alpha^*)$.
3.5 Average Sample Number (ASN) Function
The calculation of the ASN function is quite complicated due to
the fact of censoring, as was the case for the OC function in Section 3.4.
Define the following probabilities:
$$ P_1 = P\{ W(t) \ge c_1+\delta_1 t \text{ for a } t < 1 \text{ which is smaller than any } t \text{ for which } W(t) \le c_2+\delta_2 t \}, $$
$$ P_2 = P\{ W(t) \le c_2+\delta_2 t \text{ for a } t < 1 \text{ which is smaller than any } t \text{ for which } W(t) \ge c_1+\delta_1 t \}, $$
where $t^* \in [0,1]$ denotes the random stopping time. From the above
definitions,
$$ E(t^*) = \int_0^1 t \frac{dP_1}{dt}\,dt + \int_0^1 t \frac{dP_2}{dt}\,dt + 1 \cdot P_0. \tag{3.110} $$
Define the stopping number
$k_n^{**}(t) = \min\{ r \ge k_n(t) \ge n_0(\Delta) : Z^{**}_{k_n(t)} \text{ advocates stopping} \}$,
and recalling the definition of $k_n(t)$, the ASN function is then
$$ E[k_n^{**}(t)] = [n\alpha]\,E(t^*) = r\,E(t^*), \tag{3.111} $$
where $E(t^*)$ is given in (3.110). In order to evaluate $P_0$, $P_1$ and $P_2$,
the following results on boundary crossing probabilities of a drifted
Wiener process as given by Anderson (1960) are utilized. From Theorem
4.3 of Anderson (1960),
(3.112)
and
(3.113)
where
2
-2[s (clol+c202)-s(s-1)c201-s(s+1)cl02]
~(-01-(2s+l)cl+2sc2)
+ e
(3.114)
The powers of the exponential functions in $A_s$ for $s > 1$ are smaller than
minus eight and thus negligible in comparison to $A_1(c_1,c_2,\delta_1,\delta_2)$. Thus,
approximate (3.112) and (3.113) by
(3.115)
and
(3.116)
Now, given (3.114) and the convergence of $Z^*_{k_n}(t)$ to a drifted
Wiener process, $P_1$, $P_2$ and $P_0$ can be evaluated for our process with
$$ c_1 = \sqrt{\xi\alpha}\,\gamma a_n, \quad c_2 = \sqrt{\xi\alpha}\,\gamma b_n, \quad \delta_1 = \delta_2 = (\tfrac{1}{2} - \psi)\sqrt{\xi\alpha}/\gamma. \tag{3.117} $$
+ e
+e
-2~cx(t
-lji)bn [
4>
[-(~
-lji)/[ci
y
+~yb
[-(~ -lji)~
]
J)
-4> - - y - + I[ci"y (2bn -an )
n
-2~CX(~ -lji) (bn-an ) [4> [-(~ -lji)1[ci" +~y (2b
y
]
n
-3an ) -4>
[-(~ -lji)1[ci"
y
I
~(b
n
-2a)
n
J]} .
(3.118)
The integrals in (3.110) are given by Theorem 5.2 of Anderson (1960) as
and
I:
I:
00
dP
1
t _1 dt =0.I B (c l ,c 2 ,01,02)
dt
1 5=0 s
dP
2
-1
t dt dt =6.:
I
s=O
e
Bs (-c 2 ,-c l ,-02,-01)
J
] [
4>(-01+2(s+1)c 2-(2s+l)c l ) (2s+l)c l -2(s+1)C 2 ·
(3.119)
The powers of the exponential functions in $B_s$ for $s > 1$ are smaller than
minus eight and thus negligible in comparison to $B_0$ and $B_1$. The integrals
in (3.119) can thus be approximated by
-e
-2 [sc l - (5+1) c 2] [5°1- (5+1) °2]
2
00
(3.120)
(3.121)
(3.122)
and
[e
-46(c -C 2)
1
-20(2C
~(O+4c2-3cl)-e
2
-C
)
1
~(-O+4c2-3cl)]
(3c l -4c 2)
(3.123)
Using (3.117) and (3.122) - (3.123), one can evaluate (3.120) and (3.121)
for the case of $\psi \ne \tfrac{1}{2}$. Using the results of (3.120), (3.121) and (3.118),
evaluate (3.110) for $\psi \ne \tfrac{1}{2}$.
When $\psi = \tfrac{1}{2}$, the boundaries are two horizontal lines. The previous
result for $\psi \ne \tfrac{1}{2}$ cannot be used even with repeated use of L'Hospital's
rule since it is already an approximation. Instead, a result given by
Billingsley (1968) for a zero drift standard Wiener process can be used:
if $b \le 0 \le a$, then
$$ P\{ b < \inf_{0 \le t \le 1} W(t) \le \sup_{0 \le t \le 1} W(t) < a \} = \sum_{k=-\infty}^{\infty} (-1)^k P\{ b + k(a-b) < Z < a + k(a-b) \}, \tag{3.124} $$
where $Z$ is a standard $N(0,1)$ variate.
Now,
E[k*h)] =
(3.125)
n
Also,
(3.126)
and since
=
I/J
1
2'
from (3.100),
1 W (u)
P{k **(t) > j} = P {---n
_ff:::.
n
'-say
= P{b
n
~y
<
inf
O<u~j/r
P{b
n
~y~
~J
From (3.88) and as n
E[k**(t)]
n
~
~J
W (u).
n
Then (3.127) becomes
< inf W*Cv) < sup W*Cv) < a l(Iiy~}.
O<v<l n
- O<v<l n
n
~J
00,
(3.127)
(3.128)
(3.125) becomes
=
~ n
where r/n
~
(bn ,a n )," 0 < u -~ j/r}
W (u) < sup
W (u) < a ytay}.
n
n
O<u~/r n
Now let v=j/r and define W*Cv) =
n
E
~
where a ** =a
n
a
fo
[b
a and j/n
TF:
y~ya
~ya
P n
It
~
t.
and b ** =b
'Ia I'r.ya]
dt,
W(v) < sup W*(v) < n
O~v~l
- O<v<l
If
(3.129)
< inf
Using (3.124),
n
TF:
y~ya
and $\Phi$ is the standard normal d.f. The evaluation of (3.130) is simplified
in the case when $b_n = -a_n$ to
$$ E[k_n^{**}(t)] = n\alpha \int_0^1 \sum_{k=-\infty}^{\infty} (-1)^k \Big[ \Phi\Big(\frac{(2k+1)a^{**}}{\sqrt{t}}\Big) - \Phi\Big(\frac{(2k-1)a^{**}}{\sqrt{t}}\Big) \Big]\,dt. \tag{3.131} $$
The actual computation of (3.131) is quite complicated. In Chapter 4,
suitable approximations for practical purposes are presented for calculating
the ASN function for the PCSLRT.
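As an illustration only, the series of Billingsley (1968) in (3.124) and the integral appearing in (3.131) can be evaluated numerically. The sketch below assumes symmetric bounds ($b_n = -a_n$), truncates the infinite series at a fixed number of terms, and uses a plain midpoint rule; the function names, the truncation level and the grid size are assumptions made here, not part of the original procedure.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal distribution function


def stay_inside_probability(b: float, a: float, terms: int = 50) -> float:
    """P{b < inf W(t) <= sup W(t) < a} on [0, 1] for b <= 0 <= a, series (3.124).

    The series is truncated at |k| <= terms, which is adequate as long as
    the width a - b is not very small.
    """
    width = a - b
    total = 0.0
    for k in range(-terms, terms + 1):
        sign = -1.0 if k % 2 else 1.0
        total += sign * (_PHI(a + k * width) - _PHI(b + k * width))
    return total


def expected_stopping_fraction(a_star: float, grid: int = 2000) -> float:
    """Midpoint-rule value of the integral in (3.131) for bounds -a_star, a_star.

    Multiplying the result by n * alpha gives the ASN of (3.131).
    """
    total = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        scaled = a_star / t ** 0.5  # bounds rescaled by 1 / sqrt(t)
        total += stay_inside_probability(-scaled, scaled)
    return total / grid


if __name__ == "__main__":
    print(stay_inside_probability(-1.0, 1.0))
    print(expected_stopping_fraction(1.0))
```

Because the integrand is the probability of not having crossed either bound by time $t$, the returned value lies between 0 and 1 and increases with the bound.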
CHAPTER 4
SIMULATIONS AND PRACTICAL EXAMPLE
4.1 Introduction
As discussed in Chapter 2, the theory developed for the Generalized
Sequential Likelihood Ratio Test (GSLRT) was for the asymptotic case
when $\Delta \to 0$. In actual practice one encounters a fixed value of $\Delta$ in the
statement of the alternative hypothesis. In such a case the asymptotic
results of the operating characteristic (OC) and average sample number
(ASN) functions (Theorems 2.2, 2.3, and 3.2 and (3.131)) may not provide
a good approximation to the true functions. One is thus interested in
investigating how small the value of $\Delta$ has to be for asymptotic results
to apply reasonably well. However, the theory for fixed values of $\Delta$ is
quite complicated since one is unable to utilize the well-known properties
of the standard Wiener process in order to approximate the behavior of
the test statistic. It thus becomes necessary to investigate the properties
of the test statistic by simulations for fixed values of $\Delta$.
In the next two sections of this chapter we examine simulated
results of OC and ASN functions for fixed values of $\Delta$ and compare them
to the theoretical asymptotic approximations of Chapters 2 and 3. In
Section 4.4, we look at the application of the results of Chapter 2 to
the specific problem of compliance testing in air pollution monitoring.
4.2 Simulation Study Based on Chapter 2
Recall the notation established in Chapter 2. The value of the
test statistic $Z_n^*$ (see 2.27) depends on the distribution function
$F(x;\theta)$ and the choice of the function of interest $g(\theta)$. For illustration
purposes, let $F(x;\theta)$ be the normal distribution with mean $\mu$ and
standard deviation $\sigma$. Letting $\theta = (\mu,\sigma)'$, define
$g(\theta) = \mu + \tau_{\alpha_0}\sigma + \psi\Delta$, $\psi \in [0,1]$, where $\tau_{\alpha_0}$
is the $(1-\alpha_0)100$th percentile point of the standard normal
distribution. Consider values of 0.05 and 0.01 for $\alpha_0$. The hypothesis
of interest is
$$ H_0: g(\theta) = g_0, \ \text{i.e. } \psi = 0 $$
$$ H_1: g(\theta) = g_0 + \Delta, \ \text{i.e. } \psi = 1, \tag{4.1} $$
where $\Delta > 0$ and $g_0$ are known. Without loss of generality in performing
the simulations, take $(\mu,\sigma) = (0,1)$ and thus let $g_0 = \tau_{\alpha_0}$.
From example (A) of Section 2.6, for the above choice of $F(x;\theta)$
and $g(\theta)$, $\gamma^2 = \sigma^2(1 + \tfrac{1}{2}\tau_{\alpha_0}^2)$. Before describing the procedure to carry
out the simulations, there are several variables whose values need to be
specified. Prescribe (.05,.05) as the given strength $(\alpha_1,\alpha_2)$ for the
test. Eighteen points in the $(\Delta,\psi)$ plane are considered for the
simulations. The points arise from $\Delta$ = 0.02, 0.05, 0.10, 0.15, 0.20, 0.25 and
$\psi$ = 0, 0.5, 1.0. The choices of $\psi$ of 0, 0.5 and 1.0 give testing
situations of an alternative hypothesis exactly like the null hypothesis, an
alternative halfway in between the null $H_0$ and the alternative $H_1$ of
(4.1), and the testing situation of (4.1). The small values of $\Delta$ are
chosen on the basis of expecting that the closer to (0,0) in the $(\Delta,\psi)$
plane, the better the approximating asymptotic results will be. The
closer the alternative hypothesis being tested is to the null hypothesis,
the larger the sample size will tend to be since $\Delta^2 n = O(1)$, and thus the
better the chances of applicability of the asymptotic results. For each
$(\Delta,\psi)$ combination we performed $q$ = 1000, 500 or 250 simulated sequential
tests. The smaller number of simulations was dictated by computer time
limitations. The IMSL subroutine GGNML was used to generate the random
normal deviates. The actual procedure is outlined as follows:
(i) For each $(\Delta,\psi,\alpha,\beta,\alpha_0)$ combination, do the following steps
[(ii) - (iv)] $q$ times.
(ii) At the $n \ge n_0(\Delta)$ stage of sampling, compute (recall notation
in Chapter 2):
$$ \hat\theta_n = (\bar{x}_n, s_n) = \Big( \frac{1}{n}\sum_{i=1}^{n} x_i,\ \Big[ \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x}_n)^2 \Big]^{1/2} \Big) $$
and $-\frac{1}{n}\,\frac{\partial^2 \ln L_n(X_n;\theta)}{\partial\theta\,\partial\theta'}$ evaluated at $\hat\theta_n$.
(iii) Compute the value of the test statistic at stage $n$, $Z_n^*$ (see (2.27)).
(iv) Examine the decision rule:
If $Z_n^* \le b$, stop and accept $H_0$;
if $Z_n^* \ge a$, stop and reject $H_0$;
if $b < Z_n^* < a$, continue with sampling of $x_{n+1}$.
(v) After $q$ times, compute the average sample number
$\bar{n} = \big[\sum_{i=1}^{q} n_i\big] \cdot \frac{1}{q}$, where $n_i$ is the stopping value at the
$i$th simulation ($i = 1,2,\ldots,q$), and compute the empirical operating
characteristic function $k/q$, where $k$ is the number of simulations
out of $q$ that ended with accepting the null hypothesis.
Compare these values to the asymptotic results of
Theorems 2.2 and 2.3.
The results are presented in Table 4.2 and Table 4.3.
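Steps (i) - (v) can be sketched as a small Monte Carlo harness. This is an illustration under stated assumptions, not the program used for the tables: the form of the statistic (a Bartlett-Cox type stand-in for (2.27), which is not reproduced in this section), the Wald-type bounds $a$ and $b$, the default `n0`, and all function names are assumptions introduced here, and Python's `random.gauss` replaces the IMSL routine GGNML.

```python
import math
import random
from statistics import NormalDist


def simulate_sequential_test(delta, psi, alpha0=0.05, alpha1=0.05, alpha2=0.05,
                             n0=10, max_n=200_000, rng=None):
    """Run one simulated sequential test; return (stopping n, True if H0 accepted)."""
    rng = rng or random.Random()
    tau = NormalDist().inv_cdf(1.0 - alpha0)   # upper percentile point of N(0, 1)
    g0 = tau                                   # (mu, sigma) = (0, 1) under H0
    a = math.log((1.0 - alpha2) / alpha1)      # ASSUMED Wald-type upper bound
    b = math.log(alpha2 / (1.0 - alpha1))      # ASSUMED Wald-type lower bound
    total = total_sq = 0.0
    z = 0.0
    for n in range(1, max_n + 1):
        x = rng.gauss(psi * delta, 1.0)        # location shifted by psi * delta
        total += x
        total_sq += x * x
        if n < n0:
            continue                           # initial sample, no decision yet
        mean = total / n
        s2 = max(total_sq / n - mean * mean, 1e-12)   # MLE of the variance
        g_hat = mean + tau * math.sqrt(s2)
        gamma2_hat = s2 * (1.0 + 0.5 * tau * tau)     # gamma^2 = sigma^2 (1 + tau^2 / 2)
        # ASSUMED statistic, a stand-in for (2.27).
        z = n * delta * (g_hat - g0 - 0.5 * delta) / gamma2_hat
        if z <= b:
            return n, True                     # stop and accept H0
        if z >= a:
            return n, False                    # stop and reject H0
    return max_n, z < 0.0                      # truncated run, decided by sign


def empirical_oc_and_asn(delta, psi, q=200, seed=0, **kwargs):
    """Repeat the test q times; return (empirical OC, average sample number)."""
    rng = random.Random(seed)
    runs = [simulate_sequential_test(delta, psi, rng=rng, **kwargs) for _ in range(q)]
    asn = sum(n for n, _ in runs) / q
    oc = sum(1 for _, accepted in runs if accepted) / q
    return oc, asn


if __name__ == "__main__":
    for psi in (0.0, 0.5, 1.0):
        oc, asn = empirical_oc_and_asn(0.25, psi, q=200, seed=1)
        print(f"psi={psi:.1f}  OC={oc:.3f}  delta^2 * ASN={0.25 ** 2 * asn:.2f}")
```

Since the statistic and bounds are stand-ins, the output should not be expected to reproduce Table 4.2 or Table 4.3; the sketch is only meant to show the structure of the loop.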
As mentioned in (2.30), we require an initial sample size $n_0(\Delta)$
large enough in order to be able to estimate $\gamma$ by $\hat\gamma_n$ with reasonable
accuracy. According to Mahmoud (1973), $n_0(\Delta) = 1/\Delta$ would satisfy (2.30),
but for the purpose of our simulations we chose $n_0(\Delta) = \Delta^{-1-p_\Delta}$,
where $0 \le p_\Delta < 1$. For different values of $\Delta$, $n_0(\Delta)$ is presented in
Table 4.1.
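The rule $n_0(\Delta) = \Delta^{-1-p_\Delta}$ is simple to evaluate. The sketch below recomputes the initial sample sizes from the $(\Delta, p_\Delta)$ pairs as read from Table 4.1; rounding to the nearest integer is an assumption made here, since the text does not state how non-integer values were handled.

```python
def initial_sample_size(delta: float, p_delta: float) -> int:
    """n0(delta) = delta ** (-1 - p_delta), rounded to the nearest integer (assumed)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta is expected to lie in (0, 1)")
    if not 0.0 <= p_delta < 1.0:
        raise ValueError("p_delta is expected to lie in [0, 1)")
    return round(delta ** (-1.0 - p_delta))


# (delta, p_delta) pairs as read from Table 4.1.
TABLE_4_1 = [(0.02, 0.177), (0.05, 0.000), (0.10, 0.301),
             (0.15, 0.214), (0.20, 0.431), (0.25, 0.661)]

if __name__ == "__main__":
    for delta, p_delta in TABLE_4_1:
        print(f"delta={delta:.2f}  p={p_delta:.3f}  n0={initial_sample_size(delta, p_delta)}")
```

With $p_\Delta = 0$ the rule reduces to the value $1/\Delta$ attributed to Mahmoud (1973).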
TABLE 4.1
Initial Sample Sizes, Section 4.2

  $\Delta$     $p_\Delta$    $n_0(\Delta)$
  .02      .177      100
  .05      .000       20
  .10      .301       20
  .15      .214       10
  .20      .431       10
  .25      .661       10
TABLE 4.2
Chapter 2 Empirical OC and ASN Functions for Various Values of $\Delta$, $\psi$
$(\alpha_1,\alpha_2) = (.05,.05)$, $\alpha_0 = .05$, $\tau_{\alpha_0} = 1.645$

Columns: $\Delta$; $\psi$; number of simulations ($q$); approximate computer time
for 1000 simulations (min); observed OC; theoretical OC; observed
$\Delta^2 E(N)$ (s.e.); GSLRT theoretical $\Delta^2 E(N)$.

  $\Delta$   $\psi$     q     Time    Obs. OC  Theor. OC  Obs. $\Delta^2 E(N)$ (s.e.)  Theor. $\Delta^2 E(N)$
  .02   1.0    500   300.0    .054      .05       13.14 (.446)          12.47
        0.5    500   420.0    .502      .50       19.07 (.688)          20.40
        0.0    500   300.0    .948      .95       12.30 (.397)          12.47
  .05   1.0   1000    50.0    .071      .05       12.45 (.267)          12.47
        0.5   1000    75.0    .538      .50       20.29 (.527)          20.40
        0.0   1000    50.0    .960      .95       12.07 (.298)          12.47
  .10   1.0   1000    12.5    .057      .05       13.01 (.295)          12.47
        0.5   1000    20.0    .547      .50       20.42 (.539)          20.40
        0.0   1000    12.5    .964      .95       12.22 (.324)          12.47
  .15   1.0   1000     5.5    .111      .05       12.43 (.285)          12.47
        0.5   1000     9.0    .557      .50       20.47 (.561)          20.40
        0.0   1000     5.5    .972      .95       12.03 (.315)          12.47
  .20   1.0   1000     3.0    .132      .05       12.50 (.283)          12.47
        0.5   1000     5.5    .578      .50       20.88 (.582)          20.40
        0.0   1000     3.0    .979      .95       11.44 (.329)          12.47
  .25   1.0   1000     2.0    .148      .05       12.47 (.284)          12.47
        0.5   1000     3.5    .612      .50       20.26 (.585)          20.40
        0.0   1000     2.0    .988      .95       11.08 (.331)          12.47
TABLE 4.3
Chapter 2 Empirical OC and ASN Functions for Various Values of $\Delta$, $\psi$
$(\alpha_1,\alpha_2) = (.05,.05)$, $\alpha_0 = .01$, $\tau_{\alpha_0} = 2.326$

Columns: $\Delta$; $\psi$; number of simulations ($q$); approximate computer time
for 1000 simulations (min); observed OC; theoretical OC; observed
$\Delta^2 E(N)$ (s.e.); GSLRT theoretical $\Delta^2 E(N)$.

  $\Delta$   $\psi$     q     Time    Obs. OC  Theor. OC  Obs. $\Delta^2 E(N)$ (s.e.)  Theor. $\Delta^2 E(N)$
  .02   1.0    250   510.0    .052      .05       20.16 (.908)          19.62
        0.5    250   750.0    .500      .50       31.53 (1.535)         32.10
        0.0    250   510.0    .952      .95       19.68 (.867)          19.62
  .05   1.0   1000    75.0    .060      .05       20.27 (.477)          19.62
        0.5   1000   120.0    .523      .50       31.17 (.820)          32.10
        0.0   1000    75.0    .970      .95       19.51 (.451)          19.62
  .10   1.0   1000    20.0    .066      .05       20.05 (.430)          19.62
        0.5   1000    32.0    .536      .50       33.24 (.864)          32.10
        0.0   1000    20.0    .957      .95       19.32 (.447)          19.62
  .15   1.0   1000     8.5    .090      .05       19.81 (.431)          19.62
        0.5   1000    13.5    .550      .50       31.24 (.826)          32.10
        0.0   1000     8.5    .967      .95       19.05 (.486)          19.62
  .20   1.0   1000     5.0    .124      .05       19.08 (.436)          19.62
        0.5   1000     7.5    .564      .50       31.73 (.913)          32.10
        0.0   1000     5.0    .980      .95       18.73 (.525)          19.62
  .25   1.0   1000     3.0    .142      .05       19.48 (.425)          19.62
        0.5   1000     5.0    .591      .50       31.55 (.872)          32.10
        0.0   1000     3.0    .985      .95       18.19 (.531)          19.62
From Tables 4.2 and 4.3, the empirical values of the OC function
approach the theoretical values as $\Delta$ gets smaller. For $\Delta \le .1$ the
asymptotic results provide a reasonable approximation for the OC function.
Also note that the simulated values of the OC function are for the most
part conservative. A probable explanation for this is the fact that
the theoretical asymptotic values are approximating a discrete process
by a continuous one. One would expect that this would tend to make the
empirical ASN values smaller than their theoretical counterparts, but
there is quite a lot of variability in the random number generator
utilized (GGNML). However, a comparison of numbers generated by GGNML
with ones generated by the FORTRAN subroutine GAUSS showed GGNML to be
preferable. The discrepancies noted between theoretical and empirical
ASN values are not large when the standard errors are taken into account.
The standard errors of empirical ASN values are approximately 0.027 of
the empirical ASN value.
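The quoted ratio of roughly 0.027 can be checked against the tabulated values. The sketch below uses the eighteen (observed $\Delta^2 E(N)$, s.e.) pairs as read from Table 4.2 and averages the ratio of standard error to estimate; taking a simple unweighted mean over Table 4.2 only is an assumption made here about how the figure was obtained.

```python
# (observed delta^2 * E(N), standard error) pairs as read from Table 4.2,
# ordered by delta = .02, .05, .10, .15, .20, .25 and psi = 1.0, 0.5, 0.0.
TABLE_4_2_ASN = [
    (13.14, 0.446), (19.07, 0.688), (12.30, 0.397),
    (12.45, 0.267), (20.29, 0.527), (12.07, 0.298),
    (13.01, 0.295), (20.42, 0.539), (12.22, 0.324),
    (12.43, 0.285), (20.47, 0.561), (12.03, 0.315),
    (12.50, 0.283), (20.88, 0.582), (11.44, 0.329),
    (12.47, 0.284), (20.26, 0.585), (11.08, 0.331),
]


def mean_relative_standard_error(pairs):
    """Unweighted mean of s.e. / estimate over all table entries."""
    ratios = [se / estimate for estimate, se in pairs]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    print(f"mean s.e. / ASN = {mean_relative_standard_error(TABLE_4_2_ASN):.4f}")
```

The rows with fewer simulations ($\Delta$ = .02, $q$ = 500) have visibly larger ratios than the rest, as expected from the smaller number of replicates.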
Comparing Table 4.2 with Table 4.3, note that for $\alpha_0 = .01$ the
asymptotic approximations are closer to the empirical OC and ASN functions
for given $\Delta$ and $\psi$ than for $\alpha_0 = .05$. The farther apart the stopping
bounds are relative to $Z_n^*$, the larger the sample size is and thus the
better the asymptotic approximations become. Thus, taking $(\alpha_1,\alpha_2)$
elementwise smaller than (.05,.05) for a given $\Delta,\psi$ combination will also
improve the approximation of empirical results by asymptotic values as
the stopping bounds for the sequential test get farther apart. The
asymptotic theoretical results are thus achieved as $\Delta \to 0$ or $a \to \infty$
($b \to -\infty$).
In the simulations, the function $g(\theta)$ assumed that $g_0$ accounted
for a shift in location in $g(\theta)$. One can also have $g_0$ account for a
scale difference as well as both a location and scale difference in
$g(\theta)$. We anticipate that similar results to the ones obtained would be
observed. However, the calculations are quite complicated and would not
shed more light on the behavior of the test statistic.
4.3 Simulation Study Based On Chapter 3
Recall the notation in Chapter 3. The value of the test statistic
$Z_n^*$ (see 3.89) depends on the distribution function $F(x;\theta)$ and the
choice of the function of interest $g(\theta)$. For illustration purposes,
let $F(x;\theta)$ be the exponential distribution with mean $\theta$ and define
$g(\theta) = \theta \ln 2 + \psi\Delta$, $\psi \in [0,1]$, to be the median of the distribution.
The hypothesis of interest is
$$ H_0: g(\theta) = g_0, \ \text{i.e. } \psi = 0 $$
$$ H_1: g(\theta) = g_0 + \Delta, \ \text{i.e. } \psi = 1, \tag{4.2} $$
where $\Delta > 0$ and $g_0$ are known. Without loss of generality in performing
the simulations, take $\theta = 1$ and thus $g_0 = \ln 2$.
Suppose one wishes to test that the median lifetime of light
bulbs produced by a given manufacturer equals one month (=720 hours)
and started off with $n$ = 1000 bulbs in the experiment. Assuming that the
lifetime of light bulbs follows an exponential distribution with mean
$\theta = 1$ (month) under $H_0$, one would be interested in testing a hypothesis
of the form (4.2). The standard censored test would involve waiting
until the 500th light bulb failed in order to make a decision on (4.2).
In our procedure we take into account information on the underlying
distribution so that each order statistic contributes some information
and thus enables us to possibly terminate experimentation before observing
the 500th order statistic.
Generalization of this example to other distributions $F$ and
functions $g$ follows from this example. However, the computation of
simulated values becomes quite complicated and thus, for illustration purposes,
this simplified example is presented.
At the $k$th stage of the experiment, $0 < k \le 500$, compute $Z_k^*$ as
follows, given (3.89) and the definition of $g(\theta)$:
"
~(ek-l)
ln2+~
2
(~-.5)
(4.3)
= ---"="2--""=1-"'--~a(ln2)
In,k(e k )
where, for the exponential ($\theta$) distribution,
$$ \hat\theta_k = \Big( \sum_{i=1}^{k} Z_{n,i} + (n-k) Z_{n,k} \Big) \Big/ k \quad \text{and} \quad J_{n,k}(\theta) = k/\theta^2. \tag{4.4} $$
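The estimator in (4.4) is the total time on test divided by the number of observed failures, and it can be updated as each order statistic arrives. The sketch below is illustrative: the function names are introduced here, and Python's `random.expovariate` stands in for the IMSL routine used in the original simulations.

```python
import random


def censored_mle_path(order_stats, k_max):
    """Return [theta_hat_1, ..., theta_hat_kmax] from (4.4).

    order_stats must hold all n ordered lifetimes; only the first k are
    used at stage k, the remaining n - k items are censored at Z_{n,k}.
    """
    n = len(order_stats)
    running_sum = 0.0
    path = []
    for k in range(1, k_max + 1):
        z_k = order_stats[k - 1]
        running_sum += z_k
        total_time_on_test = running_sum + (n - k) * z_k
        path.append(total_time_on_test / k)
    return path


def information(k, theta):
    """J_{n,k}(theta) = k / theta**2 from (4.4)."""
    return k / theta ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    lifetimes = sorted(rng.expovariate(1.0) for _ in range(1000))  # theta = 1
    path = censored_mle_path(lifetimes, 500)
    print(f"theta_hat at k=60: {path[59]:.3f}, at k=500: {path[499]:.3f}")
```

Keeping a running sum avoids recomputing the partial sums at every stage, which matters when the statistic is evaluated at each of up to 500 failures.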
Specify the strength of the test to be $(\alpha_1,\alpha_2) = (.05,.05)$.
The asymptotic results of Chapter 3 apply when $n \to \infty$. However,
in progressively censored schemes, the value of $n$ is fixed. Thus, if
$\Delta$ is too small, $\Delta^2 n = \xi$ becomes small and the value of $Z_k^*$ gets large. In
order to satisfy the prescribed strength $(\alpha_1,\alpha_2)$, $\xi$ is required to be
reasonably large; and if $\xi$ is too small, the value of $n$ will not be large
enough for the asymptotic results of Chapter 3 to apply. The alternative
of increasing the value of $n$ (=1000) in our simulations is too costly in
view of the fact that one has to generate and order $n$ exponential random
deviates a total of 500 times. Thus, require $\Delta$ to not be small and
consider the following six points in the $(\Delta,\psi)$ plane, given by $\Delta$ = 0.10,
0.15 and $\psi$ = 0, 0.5, 1.0. We expect that $n$ = 1000 is reasonably large for
$\Delta$ of 0.10 and 0.15 and thus that the asymptotic results of Chapter 3
are close to the empirical results for such 'large' values of $\Delta$.
For each $\Delta,\psi$ combination, we performed $q$ = 500 simulated sequential
tests. The IMSL subroutine GGEXN was used to generate the exponential
deviates. The actual procedure is as follows:
(i) For each $(\Delta,\psi,\alpha^* = \alpha_1 = \alpha_2)$ combination, do the following steps
((ii) - (iv)) 500 times.
(ii) Generate $n$ = 1000 exponential ($\theta$ = 1) random deviates and obtain
their respective order statistics.
(iii) At stage $n_0(\Delta) \le k \le 500$, compute $Z_k^*$ as in (4.3) - (4.4).
(iv) Examine the decision rule given in Section 3.2.4, where $a_n$
is computed from (3.106).
(v) After 500 times, compute the empirical ASN and OC functions
and compare to the asymptotic results of Chapter 3.
Similarly to Table 4.1, Table 4.4 presents the values of $n_0(\Delta)$
for different values of $\Delta$. Table 4.4 also presents the value $a_n$ of the
upper boundary of the sequential test (see Section 3.2.4). Since
$\alpha_1 = \alpha_2 = \alpha^* = .05$, we utilized (3.106) to iteratively arrive at the value
of $a_n$. Since $\sqrt{\xi\alpha}/\gamma$ was not small, (3.109) was not utilized in this case.

TABLE 4.4
Initial Sample Sizes, Section 4.3
and Boundary of Test

  $\Delta$     $n_0(\Delta)$    $a_n$
  .10       60      .75
  .15       40      .25
The results are presented in Table 4.5.

TABLE 4.5
Chapter 3 Empirical OC and ASN Functions for Various $\Delta$, $\psi$ Combinations
$(\alpha_1,\alpha_2) = (.05,.05)$

  $\Delta$   $\psi$   Number of         Observed  Theoretical  Observed  Theoretical
                 simulations (q)   OC        OC           (ASN/n)   (ASN/n)
  .10   1.0       500          .084      .057         .318      .301
        0.5       500          .508      .500         .395      .396
        0.0       500          .940      .943         .283      .301
  .15   1.0       500          .100      .057         .116      .106
        0.5       500          .604      .500         .183      .199
        0.0       500          .982      .943         .107      .106
In order to obtain the theoretical values of the OC and ASN
functions, we utilized (3.95) and (3.110) - (3.111), (3.118), (3.120) - (3.123),
(3.131). For the given values of $a_n$ from Table 4.4, the theoretical
values of the OC function in Table 4.5 were calculated using
(3.95). For $\psi \ne \tfrac{1}{2}$, the calculation of the ASN values merely involves
substituting the values of $\xi$, $\psi$, $\alpha$, $a_n$ and $\gamma$ into the equations (3.110) -
(3.123) mentioned above. However, for illustration purposes, we discuss
certain computational simplifications.
In order to evaluate (3.118), note that for large values of $c_1$
(see notation of (3.117)), some of the terms in (3.118) become negligible.
To evaluate the integrals (3.120) - (3.121) the above statement
also provides a computational simplification. For our case, $b_n = -a_n$ and
(3.118) can be simplified to (see notation of (3.117))
Po
= ~CO+C1)-~CO-C1)
-2;aC! -1/I)a
-e
{
2
n[~(O-C1)-~CO-3C1)]
-4;aC! -1/I)a
+ e
2
nIOCO-SCl) - ~CO-3c1)]
(4.5)
and (3.120) and (3.121) simplify to
(4.6)
and
6c
+ SIe
<5
1
~C-o-Sc1) - e
-4c
<5
1 ~(o-Sc1)]
(4. 7)
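From (4.5) - (4.7), one can compute (3.110) - (3.111) for $\psi \ne \tfrac{1}{2}$. For
$\psi = \tfrac{1}{2}$, utilize (3.131) to evaluate the ASN function of the sequential
test. One can rewrite (3.131) as
$$ E[k_n^{**}(t)] = n\alpha - 4n\alpha \int_0^1 \sum_{k=0}^{\infty} (-1)^k\, \Phi^*\Big[ \frac{a^{**}(2k+1)}{\sqrt{t}} \Big]\,dt, \tag{4.8} $$
where $\Phi^*(x) = 1 - \Phi(x)$. Now, $\forall \rho > 0$, the integral
$\int_0^1 \Phi^*(\rho/\sqrt{t})\,dt$ has a closed form in terms of $\Phi^*(\rho)$ and $\phi(\rho)$, where
$\phi(x)$ is the standard normal density, and thus (4.8) becomes
(4.9)
Since $a^{**} > 0$, for large values of $k$, the terms of the infinite series in
(4.9) converge to zero.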
For a ** /Iii > .25, the terms for k > 4 are
negligible and thus (4.9) provides a suitable approximation for the
ASN function for $\psi = \tfrac{1}{2}$.
From Table 4.5, note that the asymptotic theoretical values of
the OC and ASN functions are better for $\Delta = .10$ than for $\Delta = .15$. Also
note that the closer the alternative hypothesis is to the null hypothesis
($\psi = 0$), the better the theoretical OC values are in approximating
the empirical results. The theoretical ASN/n values are suitable
approximations of the observed average sample numbers. We thus claim that the
asymptotic theoretical approximations for the OC and ASN functions are
achieved as $\Delta \to 0$.
4.4 Compliance Testing in Air Quality Monitoring
4.4.1 Introduction and Background
As an illustration of the procedure developed in Chapter 2, we
address a specific real problem.
At present the implementation of air
quality standards follows a simplistic approach:
a given maximum concentration value for a pollutant, averaged over a given
time period and
usually based on health effects data, is not to be exceeded more than
once within a given calendar year in order for compliance to be assessed.
A government publication by the National Institute for Occupational
Safety and Health (NIOSH, No. 77-173) discussed the lack of use of
statistical methodology for an occupational health setting: "it is
essential that the sampling of the . . . [industrial] environment should
be performed utilizing appropriate statistically based sampling plans
and statistical decision procedures so that the data can support the
decision making processes regarding compliance or noncompliance with
the mandatory health standards" (see page 15). In environmental monitoring
and compliance testing of air quality standards, there is a similar
need for the use of statistical methodology.
The Environmental
Protection Agency (EPA) has only recently begun to introduce statistical
techniques in their regulations as in the 1979 revision of the National
Ambient Air Quality Standard (NAAQS) for ozone, a photochemical oxidant.
The new standard for ozone is stated in the Federal Register as being
attained "when the expected number of days per calendar year with maximum hourly average concentrations above 0.12 ppm is equal to or less
than one" (see p. 8202, Vol. 44, No. 28).
By a given calendar day
exceeding the maximum concentration level it is meant that the maximum
hourly concentration over the 24 hour period for the day exceeds the
value of 0.12 ppm.
This so-called "daily interpretation" of the stand-
ard weakens the old standard under which if on a given day there were
more than one hourly concentration exceeding the standard, there were
that many violations, while under the new standard only one violation
is accounted.
The new standard, besides being statistically stated
rather than deterministically, also involved some use of risk assessment methodology in looking at the health effects of ozone and how they
could be related to the setting of the standard.
Consequently, in-
creased use of statistical techniques is to be expected by the EPA in
the future.
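The difference between the two ways of counting violations described above can be made concrete. The sketch below counts exceedances both per hour (the older reading) and per day (the "daily interpretation"); the function name, the data layout (one list of hourly values per day) and the sample numbers are illustrative assumptions, while the 0.12 ppm level is taken from the text.

```python
def count_exceedances(hourly_ppm_by_day, limit=0.12):
    """Return (hourly violations, daily violations) for a list of daily records.

    hourly_ppm_by_day: one list of hourly average concentrations per day.
    An hour counts when its concentration is above the limit; a day counts
    once when its maximum hourly concentration is above the limit.
    """
    hourly = sum(1 for day in hourly_ppm_by_day for c in day if c > limit)
    daily = sum(1 for day in hourly_ppm_by_day if day and max(day) > limit)
    return hourly, daily


if __name__ == "__main__":
    # Three hypothetical days with a few hourly readings each.
    days = [[0.05, 0.13, 0.14], [0.10, 0.11], [0.12, 0.125]]
    print(count_exceedances(days))
```

A day with several high hours contributes several violations under the hourly count but only one under the daily count, which is why the daily interpretation is the weaker requirement.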
As mentioned above, at present the air quality standards for
most major pollutants for hourly averaging times are stated in terms of
a maximum annual concentration level not to be exceeded more than once
in a year.
A report by EPA (Larsen (1971)) states that since data for
all 8760 hours in a year are not usually available and the accuracy of
the higher measurements might be questionable, the standard practical
protocol is to select the 0.1% frequency, i.e. the ninth highest value
in the year, as the test statistic of interest, rather than the second
highest value as stated by the standard.
A practical method for introducing statistical methodology in compliance
testing would be to utilize the above empirically derived frequency of
$(\frac{9}{8760} \times 100)\%$ as the focus of interest. At present the standard
practice is to compare the $(\frac{9}{8760} \times 100)\%$ value (i.e. the ninth largest
observation) to the standard maximum allowable concentration value, and if
it is larger, then a violation has occurred.
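The comparison just described amounts to picking the ninth largest hourly value and checking it against the standard. The following minimal sketch illustrates this; the function name and the synthetic data are assumptions for illustration, and only the rank (nine) and the 8760 hours per year come from the text.

```python
import heapq


def ninth_highest_violates(hourly_ppm, standard_ppm, rank=9):
    """Return (k-th largest hourly value, True if it exceeds the standard)."""
    if len(hourly_ppm) < rank:
        raise ValueError("need at least `rank` hourly observations")
    kth_largest = heapq.nlargest(rank, hourly_ppm)[-1]
    return kth_largest, kth_largest > standard_ppm


if __name__ == "__main__":
    # The empirical frequency corresponding to the ninth highest of 8760 hours.
    print(f"frequency = {9 / 8760 * 100:.4f} %")
    # Synthetic readings 0.001, 0.002, ..., 0.100 ppm (not real data).
    readings = [i / 1000 for i in range(1, 101)]
    print(ninth_highest_violates(readings, standard_ppm=0.09))
```

The rule uses a single order statistic and no distributional assumption, which is the limitation the formulation that follows is meant to address.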
The above procedure does not take into
account any distributional assumptions about pollutant concentrations
and it calls for expensive continuous monitoring.
We envision the following formulation for the problem: Let
$x$ = measured hourly concentration of given pollutant
$x_0$ = standard specified maximum hourly concentration (known)
$F(x;\theta,\beta)$ = hypothesized continuous distribution of $x$ of known form,
with unknown vectors of parameters $\theta$ and $\beta$
$\theta$ = unknown parameters determining the shape of $F$
$\beta$ = unknown parameters determining the effect of covariables
affecting $F$
$g(\theta,\beta,\alpha)$ = $(1-\alpha)100$ percentile of d.f. $F$.
Then, in a compliance testing setup, one would be interested in
testing
$$ H_0: g(\theta,\beta,\alpha_0) \le x_0 $$
$$ H_1: g(\theta,\beta,\alpha_0) > x_0, \quad 0 < \alpha_0 < 1. \tag{4.10} $$
Utilizing the empirically derived frequency of Larsen (1971) for our
illustrative purposes, $\alpha_0 = \frac{9}{8760}$ in (4.10).
Observations of air pollution concentrations measured at a
given site are not independent of each other since measurements taken
at adjacent hours tend to be close to each other.
The need for a suit-
able underlying probability model for air pollution data is discussed
by Ott and Mage (1976).
Earlier work by Larsen (1969,1971) claimed
that the distribution of concentrations of most air pollutants followed
a lognormal distribution.
This assumption has been used extensively
in the air pollution field (see Stern, et al. (1973)).
However, the
lack of substantive theoretical engineering justification for the claim
and the lack of fit of the model to various air pollution datasets (see
Mage and Ott (1978) and Ott and Mage (1976)) prompted Ott and Mage (1976)
to develop a new probability model for environmental data.
Since air
pollution data is highly serially correlated and usually one has close
to the entire population of interest (8760 hourly observations per year),
statistical goodness of fit tests of probability models are unsuitable
(see Mage and Ott (1978)).
Mage and Ott (1975) and Ott and Mage (1976)
tackle the problem from an engineering viewpoint. They treat the underlying equation of diffusion of atmospheric concentrations stochastically
and show that a 'censored' 3-parameter lognormal model (denoted LN3C)
is the best approximator to the time series of pollutant concentration
data.
Extensive application of the new model to air pollution data for
a broad range of air pollutants (see Mage and Ott (1975)
and Ott, et
al. (1979)) demonstrated the empirical preferability of the LN3C probability model over the previously widely used 2-parameter lognormal
model.
Based on the empirical evidence and the engineering theoretical
justification, for our illustrative purposes, one assumes that even
though the observations of air pollutant concentrations are serially
correlated, a random sample of the observations of a given year can be
treated as iid observations from a 'censored' 3-parameter lognormal
distribution.
An important problem that one encounters is whether the inclusion of $\beta$ into $F(x;\theta,\beta)$ and $g(\theta,\beta,\alpha)$ in (4.10) is fully clear.
It is well known that the distribution of the random variable $x$ is affected by several covariables such as meteorological conditions, emission activity, time of day, season of the year, and many others.
It is not entirely clear how they affect such distribution.
The functional relationship by which covariables affect the distribution of pollutant concentration is the central problem that mathematical modelling is concerned about.
As far as our formulation (4.10) is concerned, there are two methods of approaching this problem:
(a) Really have $F(x;\theta_k)$ as the distribution of $x$, where $\theta_k$ is one of several possible values of $\theta$ determined by being under the $k$th condition of the set of covariables. One would thus be concerned with knowing the values of the covariables in order to know what distribution $x$ has.
(b) Really have $F(x;\theta,\beta)$ as the distribution of $x$. One then need not be concerned with knowing the values of the covariables since they are already accounted for by knowing $F$.
In the case of ozone, the Federal Register seems to be implying
that method (b) is more appropriate.
In interpreting the probabilistic
nature of ozone concentrations, it states that the variations in maximum ozone concentrations from one time period to another "are mainly
due to the random nature of meteorological factors which affect the
formation and dispersion of ozone in the atmosphere" (see p. 8218).
The implication is that meteorological conditions and other covariables
are factors that determine that maximum ozone concentrations have the
distribution that one observes they have.
The applicability of this remark to other pollutants such as carbon monoxide (CO), oxides of nitrogen (NO$_x$) and non-methane hydrocarbons (NMHC) remains to be shown.
However, since we are not trying to model the hourly averages of maximum concentration levels of CO, NO$_x$ and NMHC, we need not be concerned with determining the functional relationships (by which covariables affect the distribution of pollutant concentration) that are implicitly stated by method (a).
In view of the lack of a satisfactory statistical model for method (a), we approach the problem of dealing with covariables given in method (b) and implied in the Federal Register as applicable to ozone concentrations.
We thus view the effect of the covariables as
the cause or reason for the observed variability and distribution of our
pollutant's maximum hourly averaged concentration, x.
Note that in actual
enforcement, the effect of meteorology and other covariables should be
taken into consideration since a violating concentration may arise from
causes not due to man's activities and thus should not be counted as such.
Assume that the necessary information about the distribution of the concentration of the pollutant is known and that the framework of (4.10) is of interest.
A natural question that arises is how large the
sample size must be in order not to be losing sensitivity due to too
small a sample and in order not to be inefficient by having too large a
sample.
It should be clear that there are some monitoring stations for
which any sampling schedule would not be appropriate under any circumstances and for which continuous and complete monitoring (8760 hours in
a year) is dictated.
These are the so-called "trend" stations which are
used for long term studies and are maintained by policy to enforce complete monitoring.
Statistical sampling techniques can be applied only
at special purpose monitoring stations and at compliance testing monitoring stations.
At such stations, the question of sample size is
answered by sequential procedures:
the sample size is determined by the
value of the stopping variable N of the given sequential test.
The
formulation of (4.10) can be tested sequentially under the theoretical
results of Chapter 2.
The use of sequential procedures will allow a
possible early termination of experimentation with a concurrent savings
in time and cost over continuous monitoring.
4.4.2  Description of Data
As an example of a possible method of introducing our sequential
statistical techniques to environmental compliance testing, we utilized
the data from a study conducted by the U.S. Air Force for testing the
modelling capabilities of the Air Quality Assessment Model (AQAM) (see
Yamartino, et al. (1980)).
The U.S. Air Force needs an analytical tool
to reliably predict the impact of their air bases on air quality, both
for assessing the significance of existing airbase emissions in relation
to National Ambient Air Quality Standards (NAAQS), and for estimating
the impact of proposed facilities as required by environmental impact
statements.
Thus, one of the purposes of the study was to evaluate the modelling capabilities of the AQAM by examining its overall predictive accuracy with complex emission sources typically found at airports.
Ambient air quality measurements of several pollutants were made at Williams Air Force Base (WAFB) near Phoenix, Arizona from June 1976 through June 1977 (13 months).
The various assumptions and problems
involved in the sampling procedure are discussed in detail in this report.
Data was collected continuously and reduced to hourly averages for several variables such as pollutant concentration, emission activity and
meteorological conditions.
The measurements obtained can be utilized to determine the
effects of local aircraft operations on air quality and it is to this
aspect of the study that we focus our attention.
The main emphasis of
Yamartino, et al. (1980) was towards the corroboration of predictions
of the AQAM model and the data was gathered with such a purpose in mind.
However, the data can also be utilized to illustrate a possible improvement in compliance testing methodology for air quality standards.
The choice of pollutants to study was based on the practical
consideration of the availability of a measuring station not substantially
affected by changes in wind direction and such that its location was
suitable for monitoring the given pollutant.
The prime emissions of
aircraft during taxiing operations are CO and NMHC and the proximity of
station #3 (1 out of 5) to the taxiing runway made it suitable to monitor emissions of taxiing aircraft (see Yamartino, et al. (1980)).
The variables chosen for our example were thus the concentrations of carbon monoxide (CO) and non-methane hydrocarbons (NMHC).
Table 4.6 presents the current ambient air quality standards for CO and
NMHC.
TABLE 4.6
Current National Ambient Air Quality Standards
(Environment Reporter (1979))

Pollutant                   Statement                                     Concentration Level
Carbon Monoxide             Maximum 1 hour concentration not to be        40 mg/m^3 (35 ppm)
                            exceeded more than once per year
Non-methane Hydrocarbons    Maximum 3 hour (6-9 a.m.) concentration       160 µg/m^3 (0.24 ppm)
                            not to be exceeded more than once per year

From Table 4.6, note that the NAAQS statement for NMHC does not
involve continuous monitoring.
However, the standard is in the process
of being updated and for our illustrative purposes, we assume that the
0.24 ppm figure is the maximum 1 hour concentration limit for all hours
of the year.
There are several sources of variability reflected in the data.
Some of the most important ones include:
(i) unknown background concentration of pollutants due to
the proximity of the city of Phoenix
(ii) measuring instrument variability
(a) the concentration levels for certain pollutants were
close to the 'noise range' of the instrument
(iii) diurnal time and seasonal time frames as they pertain
to aircraft and background emission activity and meteorological activity
The argument presented in Section 4.4.1 allows us to assume a
suitable 3-parameter lognormal distribution to explain the behavior of
a random sample of the data in spite of the unaccounted variability
sources mentioned above.
In actual compliance testing situations where modelling of the data is irrelevant, the data is taken as obtained and not adjusted due to unaccounted variation.
Exceptions (such as an unusually high level of pollutant concentration from Phoenix due to an accident) are accounted for on an individual basis but such implementation techniques are out of the focus of this illustrative example.
We thus use a random sample of the unadjusted data as given.
Detailed accuracy-of-the-data information is available in Yamartino, et al. (1980).
A graphical examination of the quantiles of the data provided sufficient evidence to use a 3-parameter lognormal distribution as a suitable distribution to illustrate the application of our sequential procedure to the compliance testing problem.
4.4.3  Statistical Framework and Test Procedure
Under the assumptions discussed in Section 4.4.1, assume that the random sample of observed pollutant concentrations consists of iid random variables from a 3-parameter lognormal distribution, with probability density function

    $$ f(x;\mu,\sigma,k) = \frac{1}{\sigma(x+k)\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{\ln(x+k)-\mu}{\sigma}\right)^2\right\}, \qquad x > -k, \tag{4.11} $$

where $k, \sigma > 0$ and $-\infty < \mu < \infty$.
Since concentrations of pollutants cannot be negative, the probability of the interval $(-k, 0]$ is usually assigned to the point $x = 0$, giving rise to the LN3C distribution of Mage and Ott (1975).
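The censoring step can be illustrated with a minimal Python sketch. The function names `ln3c_mass_at_zero` and `ln3c_cdf` are hypothetical; the only facts used are that $\ln(x+k)$ is $N(\mu,\sigma^2)$ under the 3-parameter lognormal model and that the probability of $(-k, 0]$ is moved to the point $x = 0$.

```python
import math
from statistics import NormalDist


def ln3c_mass_at_zero(mu, sigma, k):
    """P(X = 0) under LN3C: the 3-parameter lognormal probability of (-k, 0]."""
    return NormalDist(mu, sigma).cdf(math.log(k))


def ln3c_cdf(x, mu, sigma, k):
    """Distribution function of the censored concentration for any real x."""
    if x < 0:
        return 0.0
    # For x >= 0 the censored and uncensored distribution functions agree.
    return NormalDist(mu, sigma).cdf(math.log(x + k))
```

The distribution function jumps at zero by exactly the censored mass and coincides with the uncensored lognormal distribution function for positive concentrations.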
For the purpose of this example, a simplified 3-parameter lognormal distribution as in (4.11) is used for the positive observations, giving rise to the following likelihood function for a sample of size $n$:

    $$ L_n(X_n;\mu,\sigma,k) = \prod_{i=1}^{n} \frac{1}{\sigma(x_i+k)\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(\frac{\ln(x_i+k)-\mu}{\sigma}\right)^2\right\}, \qquad x_i > 0, \ 1 \le i \le n, \tag{4.12} $$

where $X_n = (x_1, x_2, \ldots, x_n)'$. Let $\theta = (\mu,\sigma,k)'$.
In order to carry out the sequential procedure as outlined in Section 2.2.3, calculate the maximum likelihood estimator of $\theta = (\mu,\sigma,k)'$ at the $n$th stage of sampling.
However, a closed form solution to the equation

    $$ 0 = \frac{\partial}{\partial\theta} \ln L_n(X_n;\theta)\Big|_{\theta=\hat\theta_n} = \frac{\partial}{\partial\theta}\left\{ -n \ln \sigma\sqrt{2\pi} + \left[\sum_{i=1}^{n} \ln(x_i+k)\right]\left(\frac{\mu}{\sigma^2} - 1\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \ln^2(x_i+k) - \frac{n\mu^2}{2\sigma^2} \right\}\Big|_{\theta=\hat\theta_n} \tag{4.13} $$

is not possible.
Iterative interpolation techniques such as MAXLIK (see Kaplan and Elston (1972)) would enable one to obtain $\hat\theta_n = (\hat\mu_n, \hat\sigma_n, \hat k_n)'$ by solving (4.13) above.
However, if the 'displacement' parameter $k$ is known, a closed form solution of (4.13) gives the following maximum likelihood estimators,

    $$ \hat\mu_n = \frac{1}{n}\sum_{i=1}^{n} \ln(x_i+\tilde k), \qquad \hat\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n} \ln^2(x_i+\tilde k) - \hat\mu_n^2, \tag{4.14} $$

where $\tilde k$ is a suitable value for the parameter $k$.
In actual implementation conditions, suitable values for $k$ are usually available from previous accumulated data of pollutant concentration or can be empirically obtained from a representative subsample of the data.
Since the purpose of this example is illustrative of the applicability of our sequential procedure, we calculated $\tilde k$ from the entire available information in one year (Note that although 13 months of information was available, only 12 months was utilized to estimate $k$ in order to avoid any seasonal bias).
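With the displacement held fixed, the estimation step reduces to the sample mean and variance of the shifted logarithms. The following is a minimal Python sketch under that assumption; the function names and the sample values used to exercise it are illustrative and are not taken from the WAFB data.

```python
import math


def ln3_mle_known_k(xs, k):
    """Closed-form MLEs (mu_hat, sigma_sq_hat) when the displacement k is fixed."""
    logs = [math.log(x + k) for x in xs]
    n = len(logs)
    mu_hat = sum(logs) / n
    # Mean of squared logs minus squared mean equals the MLE of the variance.
    sigma_sq_hat = sum(v * v for v in logs) / n - mu_hat ** 2
    return mu_hat, sigma_sq_hat


def ln3_loglik(xs, mu, sigma_sq, k):
    """Log-likelihood of positive observations under the 3-parameter lognormal."""
    logs = [math.log(x + k) for x in xs]
    n = len(logs)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma_sq)
            - sum(logs)
            - sum((v - mu) ** 2 for v in logs) / (2.0 * sigma_sq))
```

Evaluating `ln3_loglik` at the returned estimates and at nearby values of `mu` or `sigma_sq` gives a quick numerical check that the closed form is indeed the maximizer for the chosen `k`.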
One wishes to test the following hypothesis:

    $$ H_0: g(\theta) = \ln(x_0 + \tilde k) $$
    $$ H_1: g(\theta) = \ln(x_0 + \tilde k) + \Delta, \tag{4.15} $$

where $x_0$ is the NAAQS concentration level, $\Delta > 0$ is known and $g(\theta)$ is the $(1-\alpha_0)100$th percentile function of the $N(\mu,\sigma^2)$ distribution. Define $\phi \in [0,1]$ and $g(\theta) = \mu + \tau_{\alpha_0}\sigma + \phi\Delta$, where $\tau_{\alpha_0}$ is the $(1-\alpha_0)100$th percentile point of the standard normal distribution.
As in Section 2.6, the test statistic computed at stage $n$ is

    $$ Z_n^* = n\Delta\,\hat\gamma_n^{-2}\left[\hat\mu_n + \tau_{\alpha_0}\hat\sigma_n - \ln(x_0+\tilde k) + (\phi - \tfrac{1}{2})\Delta\right], \tag{4.16} $$

where $\hat\gamma_n^2 = \hat\sigma_n^2\,(1 + \tfrac{1}{2}\tau_{\alpha_0}^2)$ for the given choice of $g(\theta)$ and the normal distribution.
The test procedure is carried out as outlined in Section 2.2.3,
where the stopping point is $n^* \le 8760$.
The value of $\tau_{\alpha_0}$ was taken as 3.685, corresponding to the $(1-\alpha_0)100$th percentile point of the standard normal distribution.
The value of $\phi$ was taken as 1.0 since one is interested in the hypothesis (4.15).
In order to detect small changes from the standard, the choice of $\Delta$ was such that it was small and close to the accuracy of the measuring instruments.
Table 4.7 presents the values of constants used in the example.
TABLE 4.7
Constants for CO and NMHC Used to Demonstrate
Applicability of GSLRT to Compliance Testing

Pollutant    Initial Sample Size $n_0(\Delta)$    $\Delta$     $\tilde k$    Strength $(\alpha, \beta)$
CO           24 (hours)                           0.5 ppm      -.03          (.05, .05)
NMHC         24                                   0.005 ppm    .06           (.05, .05)

The results are presented in the next section.
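To make the flow of such a sequential compliance test concrete, here is a minimal Python sketch. It is not the procedure of Section 2.2.3: the stopping boundaries are assumed to take the usual Wald form $a = \ln\{(1-\beta)/\alpha\}$, $b = \ln\{\beta/(1-\alpha)\}$, the statistic is the Bartlett-Cox-type quantity $n\Delta[g(\hat\theta_n) - g_0 - \Delta/2]/\hat\gamma_n^2$ with $g_0 = \ln(x_0+k)$, and $\Delta$ is treated as a shift on the log scale. The function name and the input series are hypothetical.

```python
import math


def sequential_percentile_test(xs, x0, k, delta, tau, n0=24, alpha=0.05, beta=0.05):
    """Sample sequentially; return (stopping stage, decision)."""
    a = math.log((1.0 - beta) / alpha)   # upper boundary (assumed Wald form)
    b = math.log(beta / (1.0 - alpha))   # lower boundary (assumed Wald form)
    g0 = math.log(x0 + k)
    logs = []
    for n, x in enumerate(xs, start=1):
        logs.append(math.log(x + k))
        if n < n0:
            continue  # no decision before the initial sample size is reached
        mu = sum(logs) / n
        sigma_sq = sum((v - mu) ** 2 for v in logs) / n
        if sigma_sq == 0.0:
            continue  # statistic undefined while all observations coincide
        g_hat = mu + tau * math.sqrt(sigma_sq)         # estimated percentile
        gamma_sq = sigma_sq * (1.0 + 0.5 * tau ** 2)   # delta-method variance factor
        z = n * delta * (g_hat - g0 - 0.5 * delta) / gamma_sq
        if z >= a:
            return n, "reject H0"
        if z <= b:
            return n, "accept H0"
    return len(logs), "no decision"
```

A series lying far below the standard stops at the initial sample size with acceptance, which mirrors the behaviour reported for CO in the next section.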
4.4.4  Results and Considerations for Implementation of the Technique
The concentrations of CO observed were extremely low in comparison to the standard and thus the experiment was terminated at $n = n_0(\Delta)$ with acceptance of the null hypothesis of compliance and failure to provide us with useful information about the applicability of the procedure.
On the other hand, the value of the stopping variable for NMHC was $n^* = 454$, at which point $Z^*_{454}$ advocated discontinuation of sampling together with the acceptance of the null hypothesis of compliance.
The values of the non-zero NMHC concentrations ranged from 0.0001 to 1.4155 ppm with a median value of .0913 ppm.
Since the actual observations varied about the standard, this particular set of data was quite suitable to demonstrate the applicability of our GSLRT procedure for compliance testing.
After an equivalent of about 19 days of continuous monitoring, there was sufficient evidence to discontinue the experiment and conclude that the given polluter was complying with the emission standard.
The actual implementation of a sequential testing procedure for
compliance testing involves further considerations than the ones discussed so far.
For compliance testing purposes, one begins sampling every hour with the first day of the compliance testing year and stops at $n^* \ge n_0(\Delta)$ such that $Z^*_{n^*} \notin (b,a)$, where $\Delta$, $n_0(\Delta)$ are suitably determined constants and $Z_n^*$, $b$, $a$ are as given in Section 2.2.3.
Since there is also a seasonal pattern to emission activity and meteorological conditions, the sampling scheme can be suitably modified to include $q$ 'compliance testing periods' corresponding to $q$ (previously determined from past data) different sets of meteorological conditions and emission activities of the region.
Such considerations are beyond the scope of this study and rightfully delegated to environmental scientists.
Nevertheless, any such sampling scheme will still provide savings in cost over continuous year-round monitoring.
For our illustrative purposes we utilized the non-zero observations and the likelihood expression given in (4.12).
In actuality, a
more appropriate distribution to use would be the LN3C distribution of
Mage and Ott (1975).
The likelihood then takes into consideration the
large number of zero-valued observations and thus is more applicable to
the observed concentrations.
The extra information may provide for an
even earlier termination of the experiment.
One important application of a sequential testing procedure not previously mentioned is in testing new sources for filing of environmental impact statements.
Based on accumulated data from previous years
at a region where a new source is planning on being located, pick out
the i th out of q 'compliance testing period' with the worst compliance
record and test the effect of the new source on the compliance of all
sources for that period, as opposed to testing the effect of the new
source for a full year. Based on the result for the i th 'compliance
testing period', decide the fate of the new source.
An example of such
a situation would be assessing the impact of a (nuclear) power plant in
a given industrial region of the country.
If the increased demand for
energy currently in effect persists, the need for new power plants in
already highly industrialized areas will increase.
There is then a
need for an implementable standardized procedure that will accurately
assess the impact of a new plant in as short a time as possible and our
sequential procedure is a possibility.
Nonparametric sequential testing procedures for compliance
testing have not been considered due to the simplicity of assuming a
standard distribution (such as the LN3C) for most air pollutants and
thus keeping the actual testing protocol simple enough to be implemented
by local regulatory offices.
However, the benefits of utilizing robust procedures are studied in the next chapter.
For our illustrative purposes, we selected a random sample of
the year's non-zero observations to show the possible savings incurred
in reduced sample sizes by utilizing our sequential procedures.
In
actuality, such a scheme is not possible and a suitable means of dealing with the highly correlated observations is needed if a sequential
procedure is to be implemented.
Careful cost-benefit analysis of
savings incurred by reduced sampling schemes over continuous monitoring
will affect the decision of implementation.
The purpose of this example was to illustrate the use of our technique for available environmental monitoring data and to discuss some of the possible obstacles in actual implementation of sequential techniques in compliance testing policy decisions.
The high correlation between observations necessitated the use of a random sample scheme to eliminate correlations and thus enable us to illustrate the use of the developed sequential testing techniques for this data.
CHAPTER 5
NONPARAMETRIC SEQUENTIAL TESTING PROCEDURES
5.1  Introduction
In Chapters 2, 3, and 4 we developed sequential testing procedures for a function of unknown parameters when the form of the underlying distribution was known.
The knowledge of the underlying distribution was used to estimate via maximum likelihood techniques the $n$th stage value of the unknown parameters $\theta$ and to develop a generalized SLRT along the principles presented in Bartlett (1946) and Cox (1963).
In the case when the form of the underlying distribution is not known, alternative robust estimators of $\theta$ need to be considered.
In nonparametric sequential testing, there are basically two ways of developing suitable tests for underlying unknown parameters. If the underlying parameter $\theta$ can be expressed as a functional of the underlying distribution $F$, say $\theta = \theta(F)$, then suitable U-statistics of Hoeffding (1948) and differentiable statistical functions of von Mises (1947) can be used to estimate $\theta$ at the $n$th stage of sampling.
One could then consider arbitrary (subject to mild regularity conditions) functions of interest $g(\theta(F))$ and apply the jackknifing techniques discussed by Sen (1977) to an extension of the results of Chapter 2 in order to solve the problem.
The functional relationship $\theta = \theta(F)$ does not always exist.
The other approach applies to problems where the unknown parameters of interest are the location, scale, and regression parameters of the underlying distribution (similar situation to (i) - (iii), (1.12)).
In such a case, appropriate linear rank statistics can be utilized.
The study presented in Section 4.4 for the air pollution monitoring problem is an example where such statistics could be used.
Sen and Ghosh (1974) discussed the one- and two-sample location and scale testing problem, while Ghosh and Sen (1977) examined the simple linear regression problem.
In this chapter we focus on the multiple linear regression
problem as an extension of the previously considered problems, specifically addressing the testing of a function of the regression parameters,
$g(\beta)$, based on suitable rank order statistics as presented in Sen and
Ghosh (1971,1972,1974), Ghosh and Sen (1972,1977), Chatterjee and Sen
(1973) and Majumdar and Sen (1978b).
In Section 5.2 we extend the results of Sen and Ghosh (1972)
and Ghosh and Sen (1972,1977) to our multiparameter-function problem.
We show that the proposed Sequential Rank Order Test (SROT) retains the
OC function as given in (2.36) for the GSLRT and we examine the asymptotic relative efficiency (ARE) of the tests with respect to each other.
In Section 5.3 we address the problem of extending the results of Chatterjee and Sen (1973) and Majumdar and Sen (1978b) for time-dependent observations in a Progressive Censoring Scheme (PCS) situation to the case of testing a function of the unknown regression parameters in multiple regression.
The proposed SROT has important applications in testing functions of the parameters in Cox's (1972,1975) regression model.
5.2  SROT For a Function of Regression Coefficients in Multiple Linear Regression
We extend the simple linear regression framework of Ghosh and Sen (1977) to multiple linear regression as follows:

    Let $x_i = \beta_0 + \beta' c_i + e_i$, $i \ge 1$, be a sequence of independent random variables where $(\beta_0, \beta)$ are unknown parameters ($\beta$ is $q \times 1$), $c_i$ are known regression ($q$-dimensional) vector constants and the $e_i$ are iid random variables from an (unknown) absolutely continuous distribution function $F(x)$ with probability density function $f(x)$, $x \in (-\infty, \infty)$.   (5.1)

Varying the choice of the $c_i$ enables one to cover a broad range of practical problems such as the $k$-sample ($k > 1$) homogeneity problem as well as analysis of covariance.
We are interested in testing

    $$ H_0: g(\beta) = g_0 $$
    $$ H_1: g(\beta) = g_0 + \Delta, \tag{5.2} $$

where $\Delta > 0$ is known.
In order to estimate $\beta$ at the $n$th stage of sampling, we utilize the following linear rank order statistic as proposed by Sen and Puri (1980):

    $$ L_n(b) = \sum_{i=1}^{n} (c_i - \bar c_n)\, a_n(R_{ni}(b)), \tag{5.3} $$

where
    $\bar c_n = \frac{1}{n}\sum_{i=1}^{n} c_i$,
    $R_{ni}(b)$ = rank of $x_i - b'c_i$ among $x_\alpha - b'c_\alpha$, $\alpha = 1, \ldots, n$,
    $a_n(i)$ are non-decreasing scores,
    and $b' = (b_1, \ldots, b_q) \in R^q$.

The scores $a_n(i)$, $i = 1, 2, \ldots, n$, are considered to be generated by suitable score functions $\{J(u): 0 < u < 1\}$ as follows:

    Let $a_n(v) = a_n(i)$ for $i-1 < v \le i$, $i = 1, \ldots, n$. Then $a_n(i) = J(\frac{i}{n+1})$ or $a_n(i) = E[J(U_{ni})]$, where $U_{n1} < \cdots < U_{nn}$ are the order statistics from a sample of size $n$ from Unif(0,1).   (5.4)

The score functions $J(u)$ equal $\Psi^{-1}(u)$, $0 < u < 1$, where $\Psi$ is a cumulative distribution function satisfying certain regularity conditions (see Ghosh and Sen (1972)).
If $\Psi$ is the standard normal distribution N(0,1) or the uniform(0,1) distribution, (5.3) for the simple linear regression case ($q=1$) becomes the Normal Scores or the Wilcoxon linear rank statistic for $\beta$, respectively.
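The two special cases can be generated from the approximate-score rule $a_n(i) = J(i/(n+1))$ with a few lines of Python. The helper name `approximate_scores` is an illustrative assumption; `NormalDist().inv_cdf` from the standard library supplies the inverse of the N(0,1) distribution function.

```python
from statistics import NormalDist


def approximate_scores(n, inverse_cdf):
    """Scores a_n(i) = J(i / (n + 1)), with J the supplied inverse cdf."""
    return [inverse_cdf(i / (n + 1)) for i in range(1, n + 1)]


# The uniform(0,1) cdf has the identity as its inverse: Wilcoxon scores.
wilcoxon_scores = approximate_scores(4, lambda u: u)
# The N(0,1) cdf gives the normal scores.
normal_scores = approximate_scores(4, NormalDist().inv_cdf)
```

Both score sequences are non-decreasing in $i$, as required in (5.3), and the normal scores are symmetric about zero.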
The estimator $\hat\beta_n$ of $\beta$ at stage $n$ proposed by Sen and Puri (1980) is

    $$ \hat\beta_n: \ \|L_n(\hat\beta_n)\| = \min\{\|L_n(b)\| : b \in R^q\}, \tag{5.5} $$

where $\|\cdot\|$ denotes a suitable norm function.
Let $D_n^* = \{b \in R^q : \|L_n(b)\| \text{ is a minimum}\}$.
Under certain regularity conditions, $D_n^*$ is a closed convex set in probability (see Jurečková (1971)).
Thus, in view of such considerations, $\hat\beta_n$ is taken as the center of gravity of $D_n^*$.
The actual computation of $\hat\beta_n$ requires iterative procedures and may be quite complicated.
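Along the principles of the SLRT as discussed in Chapter 2, and extending the results of Ghosh and Sen (1977), consider the sequence of statistics:

    $$ Z_n = n\Delta D(F)\,[g(\hat\beta_n) - \tfrac{1}{2}(g_0 + g_1)] / \nu^2\gamma^2, $$

where

    $$ \nu^2 = \int_0^1 J^2(u)\,du - \left[\int_0^1 J(u)\,du\right]^2, $$
    $$ \gamma^2 = \left[\frac{\partial g}{\partial\beta}\right]' A^{-1} \left[\frac{\partial g}{\partial\beta}\right], $$
    $$ D(F) = \int_{-\infty}^{\infty} \{d/dx\, J(F(x))\}\,dF(x) = \int_0^1 J(u)\psi(u)\,du = \rho(J,\psi) A_\psi \nu, $$
    $$ \rho(J,\psi) = \left[\int_0^1 J(u)\psi(u)\,du\right] \Big/ A_\psi \nu, \qquad A_\psi^2 = \int_0^1 \psi^2(u)\,du, $$
    $$ \psi(u) = -f'(F^{-1}(u))/f(F^{-1}(u)), \qquad 0 < u < 1. \tag{5.6} $$

Define $C_n = \sum_{i=1}^{n} (c_i - \bar c_n)(c_i - \bar c_n)'$ and assume that

    $$ n^{-1} C_n \to A \quad \text{as } n \to \infty, \tag{5.7} $$

where $A$ is the positive definite matrix in the definition of $\gamma^2$ in (5.6).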
Rewrite $Z_n$ from (5.6) as follows,

    $$ Z_n = n\Delta D(F)(\hat\beta_n - \beta)' \frac{\partial g}{\partial\beta} \Big/ \nu^2\gamma^2 + n\Delta^2 D(F)(\phi - \tfrac{1}{2}) / \nu^2\gamma^2, \tag{5.8} $$

where $g(\beta) = g_0 + \phi\Delta$, $\phi \in [0,1]$ and where the Mean Value Theorem and continuity assumption of $\partial g/\partial\beta$ have been used as in Chapter 2 (see (2.12) - (2.13)).
Since the values of $\beta$ and $F$ are unknown, utilize $\hat\beta_n$ from (5.5) as an estimator of $\beta$ and the following consistent estimator of $D^2(F)$ (see Sen and Puri (1980)):
(B
(8
(B
(B
[L
)-L
_n- l / 2b)], [L
)-L
_n- l / 2b)]
_n _n -n -n
-n -n -n -n
-
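where $\tilde\beta_n = C_n^{-1} \sum_{i=1}^{n} (c_i - \bar c_n) x_i$ is the least squares estimator of $\beta$ at stage $n$.
At stage $n$, utilize

    $$ \hat\gamma_n^2 = \left[\frac{\partial g}{\partial\beta}\Big|_{\hat\beta_n}\right]' n\,C_n^{-1} \left[\frac{\partial g}{\partial\beta}\Big|_{\hat\beta_n}\right] $$

to estimate $\gamma^2$ as well.
The test procedure parallels Section 2.2.3 with $Z_n^*$ replaced by $Z_n$.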
Under assumptions stated in Ghosh and Sen (1977), the result of their Theorem 3.1 immediately extends to the present case (using an analogous argument to Section 2.3 as well as (5.16) below) and thus the termination probability of the proposed test is one.
Under $\beta = 0$, the vector of ranks $R_n = (R_{n1}, R_{n2}, \ldots, R_{nn})'$ assumes all possible permutations of $(1, 2, \ldots, n)$ with the equal probability $(1/n!)$ and the exact distribution of the test statistic can be derived by enumeration.
The task gets too complicated for $n$ large and we thus examine the asymptotic behavior of $Z_n$.
In order to derive the OC and ASN functions of the test procedure, we first establish that the tail sequence of $Z_n$ given in (5.8) behaves like a linearly drifted standard Wiener process as $n \to \infty$.
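Theorem 5.1
Under the regularity conditions of Lemma 4.5 of Jurečková (1971), Theorem 3.2 of Ghosh and Sen (1972) and Theorem 1.2 of Sen and Ghosh (1972),

    $$ \frac{n D(F)}{\nu\gamma} (\hat\beta_n - \beta)' \frac{\partial g}{\partial\beta} = W(n) + o(\sqrt{n}), \quad \text{a.s. as } n \to \infty, \tag{5.10} $$

where $\gamma^2 = \left[\frac{\partial g}{\partial\beta}\right]' A^{-1} \left[\frac{\partial g}{\partial\beta}\right]$ and $W(n)$ is a standard Wiener process defined on $[0, \infty)$.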
Proof of Theorem 5.1
Assume that under $\beta = 0$, the distribution of the ranks $R_{ni}$, $1 \le i \le n$, is independent of the underlying continuous distribution $F$. Then from Theorem 3.2 of Ghosh and Sen (1972) suitably extended to a multiparameter case, under $\beta = 0$ and for a given $K > 0$,

    $$ \sup_{\|b\| < K} |\omega(X_n, b)| \to 0, \quad \text{a.s. as } n \to \infty, \tag{5.11} $$

where $X_n = (x_1, x_2, \ldots, x_n)'$ and

    $$ \omega(X_n, b) = \frac{1}{\sqrt{n}} L_n(0) - \frac{1}{\sqrt{n}} L_n(n^{-1/2} b) - \frac{1}{n} C_n b\, D(F). \tag{5.12} $$

Thus from (5.11) and (5.12), under $\beta = 0$,

    $$ \frac{1}{\sqrt{n}} L_n(0) - \frac{1}{\sqrt{n}} L_n(n^{-1/2} b) - \frac{1}{n} C_n b\, D(F) = o(1), \quad \text{a.s.} \tag{5.13} $$

"-
From Lemma 4.5 of Jureckova (1971), ~ is bounded
Now let b = in(6 -6).
-
a.s.
-n -
in probability (actually, the same argument of the proof follows for almost sure convergence).
Due to invariance under translations of the
linear rank statistic L , the distribution of L for any fixed _6 is the
-n
-n
same as the distribution of Lunder 6=0.
-n
fixed 6; and using b = !:n(6 -6), for 6=0,
-n -
Thus (5.13) is valid for any
"-
11"
1"
L (0) - - L ((3 -(3) - - c (15 -(3) D(F) = 0(1), a.s . . (5.14)
in -n - Iii -n -n - !:n -n -n -
-
Now, using Lemma 4.5 of Jureckova (1971),
-! L
Iii
(B
-(3) can be made arbi-
-n -n -
trarily close to zero and thus (5.14) reduces to
-
1
Iii
= -! C (8 -8) D(F)
L (0)
-n -
Iii -n -n -
+
0(1), a.s.
-
(5.15)
Now, from Sen and Puri (1980), under $\beta = 0$,

    $$ n^{-1/2} L_n(0) \xrightarrow{D} N_q(0, \nu^2 A), \quad \text{as } n \to \infty. \tag{5.16} $$

For a given $n$, $E\,L_n(0) = 0$ and $\mathrm{Var}\,L_n(0) = \nu^2 C_n$, which supports (5.16) under the assumption (5.7). Now utilize Theorem 1.2 of Sen and Ghosh (1972) suitably extended to a multiparameter case to claim that, if

    $$ \xi_n = n^{-1/2}\, \nu^{-1} A^{-1/2} L_n(0), \tag{5.17} $$

then

    $$ \sqrt{n}\, \xi_n = \mathbf{W}(n) + o(\sqrt{n}), \quad \text{a.s. as } n \to \infty, \tag{5.18} $$

where $\mathbf{W}(n) = (W_1(n), W_2(n), \ldots, W_q(n))'$, and $W_i(n)$ are independent standard Wiener processes defined on $[0, \infty)$, $i = 1, 2, \ldots, q$. Now, from (5.15),
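under assumption (5.7),

    $$ n^{-1/2}\, \nu^{-1} A^{-1/2} L_n(0) = n^{1/2}\, \nu^{-1} A^{1/2} (\hat\beta_n - \beta) D(F) + o(1), \quad \text{a.s.} \tag{5.19} $$

Rewrite (5.19) as

    $$ \frac{1}{\gamma}\, n^{-1/2}\, \nu^{-1} L_n(0)' A^{-1/2} A^{-1/2} \frac{\partial g}{\partial\beta} = \frac{n^{1/2} D(F)}{\nu\gamma} (\hat\beta_n - \beta)' \frac{\partial g}{\partial\beta} + o(1), \quad \text{a.s.} \tag{5.20} $$

and recognizing that the left-hand side of (5.20) is $\frac{1}{\gamma}\, \xi_n' A^{-1/2} \frac{\partial g}{\partial\beta}$, by (5.18),

    $$ \frac{n D(F)}{\nu\gamma} (\hat\beta_n - \beta)' \frac{\partial g}{\partial\beta} = \frac{1}{\gamma}\, \mathbf{W}(n)' A^{-1/2} \frac{\partial g}{\partial\beta} + o(\sqrt{n}), \quad \text{a.s.} \tag{5.21} $$

Since a linear combination of independent Wiener processes is still a Wiener process,

    $$ \frac{1}{\gamma}\, \mathbf{W}(n)' A^{-1/2} \frac{\partial g}{\partial\beta} = W(n) \tag{5.22} $$

and thus from (5.21) and (5.22), (5.10) holds.
QED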
Using Theorem 5.1, the OC function of the test procedure then follows from Theorem 4.1 of Ghosh and Sen (1977) under the assumptions and conditions of Section 2.4 and is given by (2.36).
The ASN function of the test procedure follows from Theorem 5.1 of Ghosh and Sen (1977) and assumption (5.7).
Thus

    $$ \lim_{\Delta \to 0} \Delta^2 E_\phi(N) = \begin{cases} \{a[1 - L(\phi)] + b\,L(\phi)\}\, \nu^2\gamma^2 / D^2(F)(\phi - \tfrac{1}{2}), & \text{if } \phi \ne \tfrac{1}{2} \\ -\nu^2\gamma^2 ab / D^2(F), & \text{if } \phi = \tfrac{1}{2}, \end{cases} \tag{5.23} $$

where $L(\phi)$ is given by (2.36), $a$, $b$ are defined in Section 2.2.3 and $N = \min\{n \ge n_0(\Delta): Z_n \notin (b,a)\}$.
Now compare the proposed Sequential Rank Order Test (SROT) with the GSLRT based on maximum likelihood techniques of Chapter 2.
The asymptotic relative efficiency (ARE) of the SROT with respect to the GSLRT is given by the ratio of their ASN functions.

    $$ \mathrm{ARE}_{\mathrm{SROT,GSLRT}} = \frac{D^2(F)}{\nu^2}\, (\gamma_1^2 / \gamma_2^2), \tag{5.24} $$

where $\gamma_1^2 = \left[\frac{\partial g}{\partial\theta}\right]' I^{-1}(\theta) \left[\frac{\partial g}{\partial\theta}\right]$ (from Chapter 2) and $\gamma_2^2 = \left[\frac{\partial g}{\partial\beta}\right]' A^{-1} \left[\frac{\partial g}{\partial\beta}\right]$ (from (5.6)).
The behavior of the ratio $\gamma_1^2/\gamma_2^2$ depends on the regression coefficients $c_i$, $i = 1, \ldots, q$, the function of interest $g$ and the distribution function $F$'s information matrix.
In the case of simple linear regression, Sen and Ghosh (1974) show that (5.24), when $\Psi(u)$ is the N(0,1) distribution function, is greater than one for nonnormal $F$ and equals one for normal $F$.
Thus, the SROT based on the normal scores statistic is desirable over the GSLRT when $F$ is not necessarily normal.
Sen and Ghosh (1974) discuss in detail the behavior of (5.24).
5.3  SROT for a Function of Regression Coefficients of Multiple Regression Under Progressive Censoring
In this section we focus again on the multiple linear regression framework of (5.1) and the test of hypothesis of (5.2) under a life testing problem.
In life testing, the observations $x_i$, $i \ge 1$, are nonnegative so that $F(0) = 0$.
Let $n$ be the fixed number of items in the experiment and $Z_{n,1} < Z_{n,2} < \cdots < Z_{n,n}$ be the observed order statistics corresponding to the $x_i$'s.
Due to the assumed continuity of $F$, ties among the $Z_{n,i}$, $1 \le i \le n$, can be neglected in probability.
As in Majumdar and Sen (1978b), define the vectors of ranks and antiranks of $X_n = (x_1, x_2, \ldots, x_n)$ to be $R_n = (R_{n1}, R_{n2}, \ldots, R_{nn})$ and $S_n = (S_{n1}, S_{n2}, \ldots, S_{nn})$, respectively, so that neglecting ties, $R_{nS_{ni}} = S_{nR_{ni}} = i$, $1 \le i \le n$, and $x_i = Z_{n,R_{ni}}$, $1 \le i \le n$.
153
At stage $k \le n$ of the experiment, one has observed $\{Z_{n,i}, S_{ni};\ 1 \le i \le k\}$
and wishes to test (5.2) based on the accumulated statistical information.
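As an illustration only, the ranks, antiranks and the information available at stage k can be computed as in the following Python sketch. The failure times, the value of k and the function name `ranks_and_antiranks` are hypothetical and do not come from the original development; ties are assumed absent, as in the text.

```python
import numpy as np

def ranks_and_antiranks(x):
    """Return (order statistics Z, ranks R, antiranks S), 1-based, assuming no ties."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")      # order[j]: 0-based item giving the (j+1)-th smallest x
    z = x[order]                              # Z_{n,1} < ... < Z_{n,n}
    antiranks = order + 1                     # S_{nj}: item that produced Z_{n,j}
    ranks = np.empty(len(x), dtype=int)
    ranks[order] = np.arange(1, len(x) + 1)   # R_{ni}: rank of x_i
    return z, ranks, antiranks

x = np.array([2.7, 0.4, 1.9, 3.3, 0.9])       # hypothetical failure times
z, R, S = ranks_and_antiranks(x)

# Information available at stage k: the first k order statistics and antiranks.
k = 3
observed = list(zip(z[:k], S[:k]))
```

The identities $R_{nS_{ni}} = S_{nR_{ni}} = i$ and $x_i = Z_{n,R_{ni}}$ can be checked directly on `R`, `S` and `z`.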
Suitable linear rank statistics for testing the hypothesis
$$H_0^*: \boldsymbol{\beta} = \mathbf{0} \quad \text{vs.} \quad H_1^*: \boldsymbol{\beta} \ne \mathbf{0} \tag{5.25}$$
have been discussed by Majumdar and Sen (1978b). The statistics are
multiparameter extensions of the linear rank statistics developed by
Chatterjee and Sen (1973) and presented in (1.10).
The multiparameter extension is as follows:
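At the $k$th stage of experimentation, define the following linear
rank statistic based on the observed sequence $\{Z_{n,i}, S_{ni};\ 1 \le i \le k\}$:
$$\mathbf{T}_{nk} = \sum_{i=1}^{k} (\mathbf{c}_{S_{ni}} - \bar{\mathbf{c}}_n)\,[a_n(i) - a_n^*(k)] = \sum_{i=1}^{k} (\mathbf{c}_i - \bar{\mathbf{c}}_n)\,[a_n(R_{ni}) - a_n^*(R_{nk})], \tag{5.26}$$
where
$$\bar{\mathbf{c}}_n = \frac{1}{n}\sum_{i=1}^{n} \mathbf{c}_i, \qquad a_n^*(k) = \begin{cases} \dfrac{1}{n-k}\displaystyle\sum_{i=k+1}^{n} a_n(i), & 1 \le k \le n-1, \\ 0, & k = 0, n, \end{cases}$$
and the scores $a_n(i)$ are as given in (5.4).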
In order to simplify
the notation, without loss of generality, assume
$$\int_0^1 J(u)\,du = 0, \qquad A^2 = \int_0^1 J^2(u)\,du - \left[\int_0^1 J(u)\,du\right]^2 = 1,$$
$$\bar{a}_n = \frac{1}{n}\sum_{i=1}^{n} a_n(i) = 0 \quad \text{and} \quad A_n^2 = (n-1)^{-1}\sum_{i=1}^{n} [a_n(i) - \bar{a}_n]^2 = 1, \quad n \ge 2. \tag{5.27}$$
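Note that under $\boldsymbol{\beta} = \mathbf{0}$, the vector of ranks $\mathbf{R}_n$ is stochastically independent
of the vector of observations $\mathbf{X}_n$ and thus $\mathbf{R}_n$ assumes all possible
permutations of $(1, \ldots, n)$ equally likely. Thus,
$$E(\mathbf{T}_{nk} \mid \boldsymbol{\beta} = \mathbf{0}) = \mathbf{0}, \quad 0 \le k \le n, \qquad E(\mathbf{T}_{nk}\mathbf{T}_{nq}' \mid \boldsymbol{\beta} = \mathbf{0}) = A_{n,k \wedge q}^2\,\mathbf{C}_n, \quad 0 \le k, q \le n, \tag{5.28}$$
where $a \wedge b = \min(a,b)$,
$$\mathbf{C}_n = \sum_{i=1}^{n} (\mathbf{c}_i - \bar{\mathbf{c}}_n)(\mathbf{c}_i - \bar{\mathbf{c}}_n)' \quad \text{and} \quad A_{n,k}^2 = 1 - (n-1)^{-1}\sum_{i=k+1}^{n} [a_n(i) - a_n^*(k)]^2. \tag{5.29}$$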
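The moment formulas in (5.28) can be checked numerically for small n by enumerating all n! equally likely antirank vectors under the null hypothesis. The Python sketch below does this using the first expression for the statistic in (5.26). The regression constants are randomly generated and the scores are standardized Wilcoxon-type scores; both are assumptions made for illustration, since the scores of (5.4) are not reproduced here, and the function names are hypothetical.

```python
import itertools
import numpy as np

def censored_score(a, k):
    """a_n^*(k): mean of the scores a_n(k+1..n), with the convention 0 for k = 0 and k = n."""
    n = len(a)
    return a[k:].mean() if 1 <= k <= n - 1 else 0.0

def t_nk(c, antiranks, a, k):
    """T_nk = sum_{i<=k} (c_{S_ni} - cbar_n) [a_n(i) - a_n^*(k)] for an (n, q) array c."""
    idx = np.asarray(antiranks[:k], dtype=int) - 1   # 0-based items of the first k failures
    dev = c[idx] - c.mean(axis=0)                    # rows c_{S_ni} - cbar_n
    return dev.T @ (a[:k] - censored_score(a, k))

def a2_nk(a, k):
    """A^2_{n,k} as in (5.29), for scores normalized as in (5.27)."""
    n = len(a)
    return 1.0 - ((a[k:] - censored_score(a, k)) ** 2).sum() / (n - 1)

n, q = 5, 2
rng = np.random.default_rng(0)
c = rng.normal(size=(n, q))                  # hypothetical regression constants
raw = np.arange(1, n + 1) / (n + 1)          # Wilcoxon-type scores (an assumption)
a = (raw - raw.mean()) / raw.std(ddof=1)     # enforce mean 0 and A_n^2 = 1 as in (5.27)
dev_all = c - c.mean(axis=0)
C_n = dev_all.T @ dev_all

# Under beta = 0 every permutation of (1, ..., n) is an equally likely antirank vector.
perms = list(itertools.permutations(range(1, n + 1)))
T = np.array([[t_nk(c, s, a, k) for k in range(n + 1)] for s in perms])  # shape (n!, n+1, q)

mean_T = T.mean(axis=0)
max_err = 0.0
for k in range(n + 1):
    for m in range(n + 1):
        second = np.einsum("pi,pj->ij", T[:, k], T[:, m]) / len(perms)
        max_err = max(max_err, np.abs(second - a2_nk(a, min(k, m)) * C_n).max())
```

Here `mean_T` holds the exact null means of the statistic for k = 0, ..., n, and `max_err` is the largest absolute deviation between the enumerated second moments and the right-hand side of (5.28); both are expected to vanish up to rounding error.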
Based on (5.26)-(5.29), Majumdar and Sen (1978b) examine Cramér-
von Mises and Kolmogorov-Smirnov types of nonparametric statistics based
on $\mathbf{T}_{nk}$ for testing (5.25). However, the methods presented do not provide
an estimator of $\boldsymbol{\beta}$ at each stage of experimentation.
The techniques used so far in this work involved obtaining a
suitable estimator of the unknown parameters ($\boldsymbol{\theta}$ or $\boldsymbol{\beta}$) at every stage of
sampling. In Chapters 2, 3, and 4 we were able to utilize maximum likelihood
estimation, while in Section 5.2 we used a suitable linear rank statistic
$\mathbf{L}_n(\mathbf{b})$ and (5.5) in order to estimate $\boldsymbol{\beta}$.
A suitable rank statistic for the estimation of $\boldsymbol{\beta}$ at every stage
of experimentation is not generally available under PCS.
The possibility of utilizing the 'alignment principle' as done in Section 5.2
(see (5.3)) would give us the following statistic at the $k$th stage of
experimentation, $0 \le k \le r$, where $r$ is a given fixed positive integer at
which censoring takes place:
$$\mathbf{T}_{nk}(\mathbf{b}) = \sum_{i=1}^{k} (\mathbf{c}_i - \bar{\mathbf{c}}_n)\,[a_n(R_{ni}(\mathbf{b})) - a_n^*(R_{nk}(\mathbf{b}))], \tag{5.30}$$
where $R_{ni}(\mathbf{b})$, $1 \le i \le n$, is defined in (5.3).
However, the distribution of $\mathbf{T}_{nk}(\mathbf{b})$ will depend on the choice of $\mathbf{b}$ and the
$\mathbf{c}_i$, and so would the distribution of the estimator of $\boldsymbol{\beta}$.
For different choices of the $\mathbf{c}_i$, an
observation may be switched from getting an 'uncensored score' $a_n$ to
getting a 'censored' score $a_n^*$ and vice versa.
In the case of the $k$-sample location problem ($k \ge 2$) when the amount being
subtracted from each observation, $\mathbf{b}'\mathbf{c}_i$, is known, it is possible to derive the
distribution of the statistic (5.30) under $H_0^*$. In general, for varying $\mathbf{c}_i$ it is
not possible, and thus the problem of obtaining a suitable rank order estimator
of $\boldsymbol{\beta}$ under PCS is unsolved and is presented as a topic for further
research in the next chapter.
However, note that given a suitable rank
estimator of $\boldsymbol{\beta}$, the procedure developed in Section 5.2 can be readily
extended to time-dependent observations along the principles discussed
in Chapter 3.
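The switching between 'uncensored' and 'censored' scores described above can be seen in a small two-sample example. The Python sketch below assumes that the aligned ranks of (5.3) are the ranks of the residuals $x_i - \mathbf{b}'\mathbf{c}_i$, which is how the text describes the amount being subtracted; the data, the two trial values of b and the function names are hypothetical.

```python
import numpy as np

def aligned_ranks(x, c, b):
    """Ranks (1-based) of the aligned observations x_i - b'c_i, assuming no ties."""
    resid = np.asarray(x, dtype=float) - np.asarray(c, dtype=float) @ np.asarray(b, dtype=float)
    order = np.argsort(resid, kind="stable")
    ranks = np.empty(len(resid), dtype=int)
    ranks[order] = np.arange(1, len(resid) + 1)
    return ranks

def uncensored_items(x, c, b, k):
    """Labels (1-based) of the items whose aligned rank is at most k."""
    return {int(i) + 1 for i in np.flatnonzero(aligned_ranks(x, c, b) <= k)}

x = [1.0, 2.0, 3.0, 4.0]                 # hypothetical observations
c = [[0.0], [0.0], [1.0], [1.0]]         # two-sample design: items 3 and 4 in the second group
k = 2

items_b0 = uncensored_items(x, c, [0.0], k)   # no alignment
items_b1 = uncensored_items(x, c, [2.5], k)   # subtract 2.5 from the second group
```

With no alignment, items 1 and 2 have the two smallest aligned values. After subtracting 2.5 from the second group, item 3 moves into the first two positions and item 2 moves out, so the set of items receiving an uncensored score depends on b.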
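Sen (1979a) considers the Cox (1972) regression model (1.9) in a
PCS situation.
In order to test (5.25), Cox (1972) considered the following
test statistic at the censoring point $r$:
$$L_{nr}^* = \mathbf{U}_{nr}^{*\prime}\,\mathbf{J}_{nr}^{*-}\,\mathbf{U}_{nr}^*, \tag{5.31}$$
where
$$\mathbf{U}_{nr}^* = \frac{\partial}{\partial \boldsymbol{\beta}} \ln L_{nr}, \qquad \mathbf{J}_{nr}^* = -\frac{\partial^2}{\partial \boldsymbol{\beta}\,\partial \boldsymbol{\beta}'} \ln L_{nr},$$
$\ln L_{nr}$ denotes the 'partial log-likelihood' function
at stage $r$ (see equation (1.5) of Sen (1979a)), and
$\mathbf{A}^-$ denotes the generalized inverse of $\mathbf{A}$.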
Sen (1979a) tests (5.25) under PCS and thus looks at $L_{nk}^*$ for
$1 \le k \le r$. The testing procedure followed in Sen (1979a) is a Repeated
Significance Test (RST) whereby the experiment is terminated at stage
$k < r$ if $L_{nk}^*$ advocates rejection of $H_0^*$, or at stage $r$ by accepting $H_0^*$.
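The following Python sketch illustrates how a statistic of the form (5.31) and a repeated significance rule might be computed. It rests on several assumptions not stated in the text above: the score vector and information matrix are evaluated at $\boldsymbol{\beta} = \mathbf{0}$, there are no ties or withdrawals, the Moore-Penrose inverse is used as the generalized inverse, and the critical value is an arbitrary placeholder rather than a value from Sen (1979a). The data and function names are hypothetical.

```python
import numpy as np

def cox_score_statistics(c, antiranks, r):
    """Return [L*_n1, ..., L*_nr] for the Cox partial likelihood, evaluated at beta = 0."""
    c = np.asarray(c, dtype=float)
    n, q = c.shape
    at_risk = np.ones(n, dtype=bool)          # all n items are at risk before the first failure
    U = np.zeros(q)                           # score vector
    J = np.zeros((q, q))                      # observed information matrix
    stats = []
    for i in range(r):
        risk_c = c[at_risk]
        mean = risk_c.mean(axis=0)            # risk-set average of the covariates at beta = 0
        failed = antiranks[i] - 1             # 0-based label of the i-th failure
        U += c[failed] - mean
        centered = risk_c - mean
        J += centered.T @ centered / len(risk_c)
        stats.append(float(U @ np.linalg.pinv(J) @ U))
        at_risk[failed] = False
    return stats

def repeated_significance_test(stats, critical_value):
    """Stop at the first stage whose statistic exceeds the critical value, else accept at stage r."""
    for k, value in enumerate(stats, start=1):
        if value > critical_value:
            return k, "reject"
    return len(stats), "accept"

c = np.array([[0.0]] * 4 + [[1.0]] * 4)       # hypothetical two-group covariate
S = [5, 6, 7, 8, 1, 2, 3, 4]                  # hypothetical antiranks: second group fails first
stats = cox_score_statistics(c, S, r=6)
decision = repeated_significance_test(stats, critical_value=6.0)  # placeholder critical value
```

In practice the critical value would be chosen from the boundary-crossing probabilities of the limiting process so that the overall significance level is controlled; the constant used here only demonstrates the stopping logic.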
If a suitable rank order estimator of $\boldsymbol{\beta}$ were available, we would
be able to extend the methodology developed in Chapter 3 (as done for
Chapter 2 in Section 5.2) for testing a function of the vector of parameters
in the Cox (1972) regression model, say $g(\boldsymbol{\beta})$.
The results of Sen (1979a) would enable us to derive the proof of the
asymptotic convergence of our process to a suitable Wiener process and thus
to evaluate the OC and ASN functions of our test procedure.
CHAPTER 6
SUGGESTIONS FOR FURTHER RESEARCH
6.1 Theoretical Extensions
In the research conducted for this dissertation, we studied the
theoretical justification for a generalization of the Sequential Likelihood
Ratio Test (SLRT) suggested by Cox (1963) for the problem of testing a
function of unknown parameters.
In the case when the form of the underlying distribution is known, the use of
maximum likelihood techniques allowed us to develop a suitable sequential test
for iid as well as time-dependent observations and for a broad class of
functions of interest.
When the form of the underlying distribution is unknown, robust
generalizations of the sequential test based on suitable rank order statistics
were developed for iid observations and for testing a function of regression
parameters in multiple regression. For time-dependent observations in a
progressively censored scheme (PCS), satisfactory rank order estimators
for the unknown parameters in multiple regression are currently not available.
Further investigation of possible rank estimators for unknown
parameters under a PCS is needed.
Majumdar (1976) studied the asymptotic
behavior of certain rank order based statistics under a PCS for the case
of staggered entry observations (where the subjects in a life testing
problem do not all arrive at the beginning of the study).
The incorporation of the staggered entry problem into our sequential test is
a possibility for further investigation.
Finally, all the sequential testing
procedures developed can be extended to the case when there are k > 1
functions of interest to be tested.
This extension will require the use
of suitable Bessel functions (see Feller (1966)) to investigate the
stopping probabilities of the sequential test.
6.2 Further Applications
The applications of the developed sequential testing procedures
are quite numerous, and some have already been mentioned as examples in
previous chapters.
The potential savings in time and cost due to utilizing sequential procedures
are especially important in low-dose cancer
experiments and studies of chronic diseases, where time frames are usually
quite long and there is a need to determine effective measures as quickly
as possible.
By incorporating information on all observed order statistics into a
sequential decision as outlined in Chapter 3, improved
efficiency of the experiment, as measured by lower ASN values relative to other
sequential procedures, is quite possible.
From Section 4.4 we realize
that the implementation of a sequential testing procedure is possible
but can be complicated.
The utilization of the proposed sequential testing procedures in analyzing
other datasets can shed extra light on some
of the difficulties encountered in implementing a sequential testing
procedure.
A study of efficiency of the sequential testing procedure
that includes cost considerations (to measure the cost of extra computations
involved in sequential sampling) as well as possible savings from
reduced sample sizes may also help in studying the desirability of
implementing the suggested sequential testing procedures.
The simulation studies of Sections 4.2 and 4.3 can be extended
to provide tables of expected OC and ASN values for use by non-trained
personnel in implementing the sequential tests for a broad range of
distributions and functions of interest.
BIBLIOGRAPHY
Anderson, T. W. (1960). "A Modification of the Sequential Probability
Ratio Test to Reduce the Sample Size," The Annals of Mathematical Statistics 31, 165-197.
Anscombe, F. J. (1952). "Large Sample Theory of Sequential Estimation,"
Proceedings of the Cambridge Philosophical Society 48, 600-607.
Anscombe, F.J. (1953). "Sequential Estimation," Journal of the Royal
Statistical Society B 15, 1-29.
Bartlett, M.S. (1946). "The Large-Sample Theory of Sequential Tests,"
Proceedings of the Cambridge Philosophical Society 42, 239-244.
Billingsley, P. (1968). Convergence of Probability Measures, John Wiley
and Sons, New York.
Breslow, N. (1969). "On Large Sample Sequential Analysis with Applications to Survivorship Data," Journal of Applied Probability 6,
261-274.
Breslow, N. (1972). "Comment on D. R. Cox (1972) paper," Journal of the
Royal Statistical Society B 34, 216-217.
Chatterjee, S.K., and Sen, P.K. (1973). "Nonparametric Testing Under
Progressive Censoring," Calcutta Statistical Association Bulletin 22, 13-50.
Chow, Y.S., and Robbins, H. (1965). "On the Asymptotic Theory of Fixed-Width Sequential Confidence Intervals for the Mean," The Annals
of Mathematical Statistics 36, 457-462.
Chung, K.L. (1974). A Course in Probability Theory, Academic Press,
New York.
Cox, D.R. (1952). "Sequential Tests for Composite Hypotheses," Proceedings of the Cambridge Philosophical Society 48, 290-299.
Cox, D.R. (1963). "Large Sample Sequential Tests for Composite Hypotheses," Sankhya A 25, 5-12.
Cox, D.R. (1972). "Regression Models and Life-Tables," Journal of the
Royal Statistical Society B 34, 187-202.
Cox, D.R. (1975). "Partial Likelihood," Biometrika 62, 269-276.
Davis, C.E. (1978). "A Two Sample Wilcoxon Test for Progressively Censored Data," Communications in Statistics - Theory and Methods
A7, 389-398.
Dvoretzky, A., Kiefer, J., and Wolfowitz, J. (1953). "Sequential Decision Problems for Processes with Continuous Time Parameter
Testing Hypotheses," The Annals of Mathematical Statistics 24,
254-264.
Eichhorn, B.H., and Zacks, S. (1973). "Sequential Search for an Optimal
Dosage, I," Journal of the American Statistical Association 68,
594-598.
Federal Register (1979). "National Primary and Secondary Ambient Air
Quality Standards," Vol. 44, No. 28, 8202-8237.
Feigl, P., and Zelen, M. (1965). "Estimation of Exponential Survival
Probabilities with Concomitant Information," Biometrics 21,
826-838.
Feller, W. (1966). An Introduction to Probability Theory and Its Applications, Vol. II, John Wiley and Sons, New York.
Fraser, D.A.S. (1956). "Sufficient Statistics with Nuisance Parameters,"
The Annals of Mathematical Statistics 27, 838-842.
Gardiner, J.C., and Sen, P.K. (1978). "Asymptotic Normality of a Class
of Time-Sequential Statistics and Applications," Communications
in Statistics - Theory and Methods A7, 373-388.
Ghosh, B.K. (1970). Sequential Tests of Statistical Hypotheses, Addison-Wesley, Reading, Massachusetts.
Ghosh, M., and Sen, P.K. (1972). "On Bounded Length Confidence Interval
for the Regression Coefficient Based on a Class of Rank Statistics," Sankhya A 34, 33-52.
Ghosh, M., and Sen, P.K. (1977). "Sequential Rank Tests for Regression,"
Sankhya A 39, 45-62.
Glasser, M. (1967). "Exponential Survival with Covariance," Journal of
the American Statistical Association 62, 561-568.
Gray, H.L., Watkins, T.A., and Adams, J.E. (1972). "On the Jackknife
Statistic, Its Extensions and Its Relation to e_n-Transformations," The Annals of Mathematical Statistics 43, 1-30.
Gross, A.J. and Clark, V.A. (1975). Survival Distributions: Reliability
Applications in the Biomedical Sciences, John Wiley and Sons,
New York.
Hall, W.J., Wijsman, R.A., and Ghosh, J.K. (1965). "The Relationship
Between Sufficiency and Invariance with Applications in Sequential Analysis," The Annals of Mathematical Statistics 36, 575-614.
Hoeffding, W. (1948). "A Class of Statistics with Asymptotically Normal
Distribution," The Annals of Mathematical Statistics 19, 293-325.
Holford, T.R. (1976). "Life Tables with Concomitant Information," Biometrics 32, 587-597.
Jureckova, J. (1971). "Nonparametric Estimate of Regression Coefficients," The Annals of Mathematical Statistics 42, 1328-1338.
Kalbfleisch, J.D. (1974). "Some Efficiency Calculations for Survival
Distributions," Biometrika 61, 31-38.
Kalbfleisch, J.D., and Prentice, R.L. (1980). The Analysis of Failure
Time Data, John Wiley and Sons, New York.
Kaplan, E.B., and Elston, R.C. (1972). "A Subroutine Package for Maximum Likelihood Estimation (MAXLIK)," Institute of Statistics
Mimeo Series No. 823, The University of North Carolina, Chapel
Hill, North Carolina.
Kaplan, E.L., and Meier, P. (1958). "Nonparametric Estimation from
Incomplete Observations," Journal of the American Statistical
Association 53, 457-481.
Kendall, M., and Stuart, A. (1969). The Advanced Theory of Statistics,
Vol. 1, Third Edition, Macmillan, New York.
Kiefer, J., and Weiss, L. (1957). "Some Properties of Generalized Sequential Probability Ratio Tests," The Annals of Mathematical
Statistics 28, 57-75.
Koch, G.G., Johnson, W.D., and Tolley, H.D. (1972). "A Linear Models
Approach to the Analysis of Survival Data and Extent of Disease
in Multidimensional Contingency Tables," Journal of the American Statistical Association 67, 783-796.
Larsen, R.I. (1969). "A New Mathematical Model of Air Pollutant Concentration Averaging Time and Frequency," Journal of the Air Pollution Control Association 19, 24-30.
Larsen, R.I. (1971). "A Mathematical Model for Relating Air Quality
Measurements to Air Quality Standards," Publication No. AP-89,
U.S. Environmental Protection Agency, Research Triangle Park,
North Carolina.
Leidel, N.A., Busch, K.A., and Lynch, J.R. (1977). "Occupational Exposure Sampling Strategy Manual," U.S. Department of HEW, PHS,
CDC, NIOSH, No. 77-173.
Mage, D.T. (1980). "An Explicit Solution for SB Parameters Using Four
Percentile Points," Technometrics 22, 247-251.
Mage, D.T., and Ott, W.R. (1975). "An Improved Statistical Model for
Analyzing Air Pollution Concentration Data," Paper No. 75-51.4,
68th Annual Meeting of the Air Pollution Control Association,
Boston, Massachusetts.
Mage, D.T., and Ott, W.R. (1978). "Refinements of the Lognormal Probability Model for Analysis of Aerometric Data," Journal of the
Air Pollution Control Association 28, 796-798.
Mahmond, R.M. (1973). "Sequential Decision Procedures for Testing Hypotheses Concerning General Estimable Parameters," Ph.D. Dissertation, Department of Biostatistics, University of North Carolina,
Chapel Hill, North Carolina.
Majumdar, H. (1976). "Generalized Rank Tests for Progressive Censoring
Procedures," Ph.D. Dissertation, Department of Biostatistics,
University of North Carolina, Chapel Hill, North Carolina.
Majumdar, H., and Sen, P.K. (1978a). "Nonparametric Testing for Simple
Regression Under Progressive Censoring with Staggering Entry
and Random Withdrawal," Communications in Statistics - Theory
and Methods A7, 349-371.
Majumdar, H., and Sen, P.K. (1978b). "Nonparametric Tests for Multiple
Regression Under Progressive Censoring," Journal of Multivariate
Analysis 8, 73-95.
Mantel, N. (1966). "Evaluation of Survival Data and Two New Rank Order
Statistics Arising in its Consideration," Cancer Chemotherapy
Reports 50, 163-170.
McLeish, D.L. (1974). "Dependent Central Limit Theorems and Invariance
Principles," The Annals of Probability 2, 620-628.
Ott, W.R., and Mage, D.T. (1976). "A General Purpose Univariate Probability Model for Environmental Data Analysis," Computation and
Operations Research 3, 209-216.
Ott, W.R., Mage, D.T., and Randecker, V.W. (1979). "Testing the Validity
of the Lognormal Probability Model: Computer Analysis of Carbon
Monoxide Data from U.S. Cities," U.S. Environmental Protection
Agency, EPA-600/4-74-040, Research Triangle Park, North Carolina.
Puri, M.L., and Sen, P.K. (1971). Nonparametric Methods in Multivariate
Analysis, John Wiley and Sons, New York.
Samuel, E. (1970). "Randomized Sequential Tests. A Comparison Between
Curtailed Single-Sampling Plans and Sequential Probability
Ratio Tests," Journal of the American Statistical Association
65, 431-437.
Savage, I.R., and Sethuraman, J. (1966). "Stopping Time of a Rank-Order
Sequential Probability Ratio Test Based on Lehmann Alternatives,"
The Annals of Mathematical Statistics 37, 1154-1160.
Sen, P.K. (1959). "On the Moments of the Sample Quantiles," Calcutta
Statistical Association Bulletin 9, 1-19.
Sen, P.K. (1973). "Asymptotic Sequential Tests for Regular Functionals
of Distribution Functions," Theory of Probability and its Applications 18, 226-240.
Sen, P.K. (1974). "Almost Sure Behavior of U-statistics and Von Mises'
Differentiable Statistical Functions," The Annals of Statistics
2, 387-395.
Sen, P.K. (1976). "Weak Convergence of Progressively Censored Likelihood
Ratio Statistics and Its Role in Asymptotic Theory of Life
Testing," The Annals of Statistics 4, 1247-1257.
Sen, P.K. (1977). "Some Invariance Principles Relating to Jackknifing
and Their Role in Sequential Analysis," The Annals of Statistics
5, 316-329.
Sen, P.K. (1978a). "An Invariance Principle for Linear Combinations of
Order Statistics," Zeitschrift für Wahrscheinlichkeitstheorie
42, 327-340.
Sen, P. K. (1978b). "Time Sequential Statistical Procedures: A Preface,"
Communications in Statistics - Theory and Methods A7, 311-314.
Sen, P.K. (1979a). "The Cox Regression Model, Invariance Principles for
Some Induced Quantile Processes and Some Repeated Significance
Tests," Institute of Statistics Mimeo Series No. 1208, The
University of North Carolina, Chapel Hill, North Carolina.
Sen, P. K. (1979b). "Weak Convergence of Some Quantile Processes Arising
in Progressively Censored Tests," The Annals of Statistics 7,
414-431.
Sen, P.K., and Ghosh, M. (1971). "On Bounded Length Sequential Confidence
Intervals Based on One-sample Rank Order Statistics," The Annals
of Mathematical Statistics 42, 189-203.
Sen, P. K., and Ghosh, M. (1972). "On Strong Convergence of Regression
Rank Statistics," Sankhya A 34, 335-348.
Sen, P.K., and Ghosh, M. (1974). "Sequential Rank Tests for Location,"
The Annals of Statistics 2, 540-552.
Sen, P.K., and Puri, M.L. (1980). Nonparametric Methods in General
Linear Models, to be published by John Wiley and Sons, New York.
Sethuraman, J. (1970). "Stopping Time of a Rank-Order Sequential Probability Ratio Test Based on Lehmann Alternatives - II," The
Annals of Mathematical Statistics 41, 1322-1333.
Sinha, A.N. (1979). "Progressive Censoring Tests Based on Weighted
Empirical Distributions," Ph.D. Dissertation, Department of
Biostatistics, University of North Carolina, Chapel Hill,
North Carolina.
Skorokhod, A.V. (1965). Studies in the Theory of Random Processes,
Addison-Wesley, Reading, Massachusetts.
Stern, A.C., Wohlers, H.C., Boubel, R.W., and Lowry, W.P. (1973). Fundamentals of Air Pollution, Academic Press, New York.
Taulbee, J.D. (1977). "A General Model for the Hazard Rate with Covariables and Methods for Sample Size Determination for Cohort
Studies," Institute of Statistics Mimeo Series No. 1154, The
University of North Carolina, Chapel Hill, North Carolina.
Taulbee, J.D. (1979). "A General Model for the Hazard Rate with Covariables," Biometrics 35, 439-450.
Von Mises, R. (1947). "On the Asymptotic Distribution of Differentiable
Statistical Functions," The Annals of Mathematical Statistics
18, 309-348.
Wald, A. (1945). "Sequential Tests of Statistical Hypotheses," The
Annals of Mathematical Statistics 16, 117-186.
Wald, A. (1947). Sequential Analysis, John Wiley and Sons, New York.
Wald, A. (1949). "Note on the Consistency of the Maximum Likelihood
Estimate," The Annals of Mathematical Statistics 20, 595-601.
Weiss, L. (1953). "Testing One Simple Hypothesis Against Another,"
The Annals of Mathematical Statistics 24, 273-281.
Yamartino, R.J., et al. (1980). "Analyses for the Accuracy Definition
of the Air Quality Assessment Model (AQAM) at Williams AFB,"
Vol. 1, Argonne National Laboratory, Argonne, Illinois.