Oppenheimer, L. (1971). "Estimation of a mixture of exponentials for complete and censored samples." Thesis.

ESTIMATION OF A MIXTURE OF EXPONENTIALS
FOR COMPLETE AND CENSORED SAMPLES
by
Leonard Oppenheimer
Department of Biostatistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 771
September 1971
ESTIMATION OF A MIXTURE OF EXPONENTIALS FOR COMPLETE
AND CENSORED SAMPLES
by
Leonard Oppenheimer
A dissertation submitted to the faculty
of the University of North Carolina at
Chapel Hill in partial fulfillment of
the requirements for the degree of
Doctor of Philosophy in the Department
of Biostatistics
Chapel Hill
1971
Approved by:
Adviser
ABSTRACT
LEONARD OPPENHEIMER. Estimation of a Mixture of Exponentials for Complete and Censored Samples. (Under the direction of HERBERT A. DAVID.)

Maximum likelihood estimation of the parameters of a mixture of two exponentials, f(x) = (α₁/θ₁)exp(−x/θ₁) + (α₂/θ₂)exp(−x/θ₂), has been examined in detail for the complete sample case and also for various forms of censoring.
The censoring schemes included double censoring (types I and II), in which both tails of our sample were censored; a general form of type I censoring, where, unlike double censoring in which the censoring points for each item were the same, different items can have distinct censoring points; and progressive censoring (types I and II), in which the censoring occurred progressively in stages, with r_i randomly selected items being censored at the i-th stage.
The aim throughout has been to obtain practical operating procedures while maintaining a unified approach, with results appearing in similar form for the various cases considered.
A successive substitution iteration scheme for solving the maximum likelihood estimating equations was derived and examined in various situations by applying it to simulated data.
Although in most cases the iterative scheme increased the likelihood function at each iteration, for many parameter combinations (the most crucial factor being h = θ₂/θ₁ ≤ 1, θ₁ > θ₂) the rate of convergence was nevertheless too slow to be of practical value.
For those cases in which convergence was hard to obtain it was found that the asymptotic variance of the estimates became excessive and a single exponential provided an adequate representation for the mixture. Thus any attempt to obtain estimates for the mixture would be futile because of their large asymptotic variances; in these cases a computationally obtainable maximum likelihood estimate for the mean of a single exponential was calculated and statements were made only about the sample in toto.
The exact asymptotic covariance matrix was obtained by taking the inverse of the information matrix, which required numerical integration, and these integrals were tabled for several values of α₁, h, for the complete sample case. For the various cases of censoring, the same functions must be numerically integrated, with only the limits of integration changing. Obtaining the exact asymptotic covariance matrix for type II censoring resulted in numerical complications, which were easily avoided by obtaining an approximation to the asymptotic covariance matrix; the results were quite similar to those obtained for type I censoring.
In order to
check whether a single exponential provided an adequate fit for a mixture
of exponentials, for complete samples, graphical aids, analytic measures,
and appropriate statistical tests were utilized.
The results obtained for
complete samples will also hold for the different censoring schemes
considered.
Since, if a single exponential adequately represents the data, then so must a mixture, the operating procedure is first to check whether a single exponential is appropriate, by using either a chi-square goodness of fit test (complete samples or double censoring) or a hazard plotting procedure (a general form of type I censoring or progressive censoring).
If it is, then we can estimate the mean and make statements only about the sample as a whole.
If we reject the hypothesis that a single
exponential is adequate then we obtain moment estimates, or approximate
moment estimates, for the mixture, and use them as initial estimates in
the iterative process to estimate the three parameters.
ACKNOWLEDGMENTS
The author appreciates the assistance of his advisor, Dr. H. A.
David, who was always available for counsel, and thoughtful and
constructive guidance.
In addition, he would like to thank the other members of his advisory committee, Drs. S. E. Elmaghraby, R. R. Kuebler, P. A. Lachenbruch, and M. J. Symons, for their helpful criticisms and suggestions.

He also wishes to thank his parents, Mr. and Mrs. N. Oppenheimer, relatives, and friends for their multilateral support and encouragement throughout his studies.
Finally, the author expresses his appreciation to Mrs. Jo Ann
Beauchaine for her excellent job of typing this paper.
TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I.    INTRODUCTION AND REVIEW OF LITERATURE
      1.1  Introduction
      1.2  Censoring
           1.2.1  Double Censoring
           1.2.2  General Form of Type I Censoring
           1.2.3  Progressive Censoring
      1.3  Mixtures of Distributions
           1.3.1  Maximum Likelihood Estimates
           1.3.2  Method of Moments
           1.3.3  Graphical Methods

II.   COMPLETE SAMPLES
      2.1  Introduction
      2.2  Iterative Scheme
           2.2.1  Properties of the Procedure
      2.3  Asymptotic Covariance Matrix
      2.4  Single Exponential Versus a Mixture of Two Exponentials
           2.4.1  Graphical Observations
           2.4.2  Non-Centrality Parameter
           2.4.3  Appropriate Tests
      2.5  Implementation of the Procedure
           2.5.1  Initial Estimates
           2.5.2  Example

III.  DOUBLE CENSORING
      3.1  Introduction
      3.2  Iterative Scheme
           3.2.1  Type I
           3.2.2  Type II
      3.3  Asymptotic Covariance Matrix
           3.3.1  Type I
           3.3.2  Type II
      3.4  Implementation of the Procedure
           3.4.1  Initial Estimates
           3.4.2  Example

IV.   GENERAL FORM OF TYPE I CENSORING
      4.1  Introduction
      4.2  Iterative Scheme
      4.3  Asymptotic Covariance Matrix
      4.4  Implementation of the Procedure
           4.4.1  Initial Estimates
           4.4.2  Example
      4.5  A Further Generalization

V.    PROGRESSIVE CENSORING
      5.1  Introduction
      5.2  Iterative Scheme
           5.2.1  Type I
           5.2.2  Type II
      5.3  Asymptotic Covariance Matrix
           5.3.1  Type I
           5.3.2  Type II
      5.4  Implementation of the Procedure
           5.4.1  Initial Estimates
           5.4.2  Example

VI.   SUMMARY AND FUTURE WORK
      6.1  Summary
      6.2  Suggestions for Future Work

APPENDICES

I.    TABLES OF S_i, i=1,...,5
II.   INITIAL ESTIMATES FOR COMPLETE SAMPLES
III.  INITIAL ESTIMATES UNDER DOUBLE CENSORING
IV.   INITIAL ESTIMATES UNDER A GENERAL FORM OF TYPE I CENSORING

LIST OF REFERENCES
LIST OF TABLES

2.4.1   Values of d for α₁, h = .1, .2, ..., .9; θ₁ = 1.0
2.4.2   Values of α₁ which maximize d, given fixed h, θ₁ = 1.0
2.4.3   Values of λ/n for α₁, h = .1, .2, ..., .9; k = 10
2.4.4   Values of λ/n for α₁, h = .1, .2, ..., .9; k = 32
2.4.5   Values of λ/n for α₁, h = .1, .2, ..., .9; k = 62
2.4.6   Values of λ/n for α₁, h = .1, .2, ..., .9; k = 122
2.4.7   Maximum value of λ/n, which occurs when h = 0.0, α₁ fixed, for various values of k
2.4.8   Values of α₁ which maximize λ/n for h fixed and various values of k
2.4.9   Observations made with simulated data; n = 400
2.4.10  Observations made with simulated data; n = 1000
2.5.1   Details of the iterative process
3.4.1   Details of the iterative process
4.4.1   Details of the iterative process
5.4.1   Details of the iterative process

APPENDIX

AI.1  Values of S₁
AI.2  Values of S₂
AI.3  Values of S₃
AI.4  Values of S₄
AI.5  Values of S₅
LIST OF FIGURES

2.3.1  Var(α̂₁)/α₁² plotted against h for various values of α₁; n = 1
2.3.2  Var(θ̂₁)/θ₁² plotted against h for various values of α₁; n = 1
2.3.3  Var(θ̂₂)/θ₂² plotted against h for various values of α₁; n = 1
2.4.1  Single exponential versus a mixture of two exponentials; α₁ = .5, h = 1/10, θ₂ = 1
2.4.2  Single exponential versus a mixture of two exponentials; α₁ = .5, h = 1/5, θ₂ = 1
2.4.3  Single exponential versus a mixture of two exponentials; α₁ = .5, h = 1/3, θ₂ = 1
2.4.4  Single exponential versus a mixture of two exponentials; α₁ = .1, h = 1/3, θ₂ = 1
2.4.5  Single exponential versus a mixture of two exponentials; α₁ = .5, h = 1/2, θ₂ = 1
4.4.1  Hazard plot
5.4.1  Hazard plot
CHAPTER I
INTRODUCTION AND REVIEW OF LITERATURE
1.1 Introduction
This study will examine, in detail, maximum likelihood estimates of the parameters of a mixture of two exponentials, f(x) = (α₁/θ₁)exp(−x/θ₁) + (α₂/θ₂)exp(−x/θ₂). We shall not only consider the estimation problem for complete samples, but also obtain estimates under various forms of censoring. The aim throughout will be to derive practical operating procedures for obtaining the estimates. We will attempt, wherever possible, to maintain a unified approach and to obtain our results in a similar form for the various cases considered.
In the simplest life test we have one population and only one type of failure. For every item put on test we have the time it took for the failure to occur; we will denote the corresponding failure time density as f(x). Other, more general models are proposed and contrasted by Cox (1959). Thus we can also consider a life test in which the items in the single population are subject to more than one type of failure, which will be referred to as a competing risk situation. This model is realized in medical and actuarial work, where the estimation and comparison of death rates from a particular cause require correction for deaths from other causes; or in tensile strength testing, where there may be two or more types of failure, e.g., jaw breaks and fractures in the test specimen.
Another generalization arises if we consider a situation in which we have s distinct populations, and the probability that an item is in the i-th population is α_i, 0 ≤ α_i ≤ 1, Σ_{i=1}^{s} α_i = 1. An item which is in population i is subject only to the risk of failure of the i-th type, which can be represented by a failure time density of the form f_i(x). The density corresponding to this life test will be referred to as a mixture of s densities and expressed as f(x) = Σ_{i=1}^{s} α_i f_i(x).
We will assume that of the n items put on test, the exact failure time is known only for N of the sample specimens. The N observed failures will be denoted by x₁,...,x_N. The remaining (n−N) items are known to have failed in some intervals, finite or semi-infinite, on an appropriate time scale. Data which satisfy these requirements will be referred to as censored data, and those items for which the exact failure time is not known will be referred to as censored items.
In this study we shall confine our attention to just two populations (s=2), both having exponential failure time densities, i.e., f₁(x) = (1/θ₁)exp(−x/θ₁) and f₂(x) = (1/θ₂)exp(−x/θ₂). An exponential failure time density is appropriate when
an accidental cause of failure predominates over failure due to wear, and
hence an item is considered as good as new over its entire lifetime.
Bartholomew (1959) uses a mixture of exponentials to predict the length of
service distribution with regard to labor turnover, his two populations
being workers in newly created jobs and workers in well established jobs,
with the newly created jobs having a higher percentage of short-term
workers.
In a different setting, Bartlett (1953) has two types of unstable
particles mixed in unknown proportions and he wants to estimate the mean
life time of the particles from cloud chamber photographs, where the life
time for each particle follows an exponential density.
Some of the work that has been done under the heading of estimation with mixtures of distributions has been accomplished under the assumption that when an item fails we are able to determine to which population that item belongs. In many actual situations such an assumption may not be tenable. This assumption will essentially, for many estimating procedures, reduce the problem to that of estimating parameters for regular (i.e., not mixture) densities, for the complete sample case. However, when censoring is present the estimation problem is more difficult than that for regular densities, even when we make this assumption.
In
this study we shall confine ourselves to the less restrictive situation in
which we do not know to which population a failed item belongs.
In Section 1.2 we will define the various forms of censoring we
will be considering, namely, double censoring, a general form of type I
censoring, and progressive censoring, and review some of the pertinent
earlier work in these areas.
Similarly, Section 1.3 will deal with a
review of different estimation procedures which have been employed for
mixtures.
The remaining chapters will develop operating procedures for estimating the mixing proportions (α₁, α₂) and the two means (θ₁, θ₂) for complete samples and various forms of censoring, Chapter II dealing with complete samples, Chapter III with double censoring, Chapter IV with a general form of type I censoring, and Chapter V with progressive censoring.
1.2
Censoring
In the literature the terms truncation and censoring are sometimes used interchangeably. In order to avoid this confusion we shall define a truncated distribution as one formed from another "complete" distribution by cutting off, and ignoring, the part lying in some intervals, finite or semi-infinite. A truncated sample is then obtained by sampling from a truncated population, i.e., a population having an underlying failure time distribution which is a truncated distribution. Note that in a censored sample, although the exact values of the censored items are not known, the number of censored items and the range in which each must lie are known. However, even this information is not available in the case of a truncated sample.
1.2.1 Double Censoring

We shall denote the order statistics corresponding to the n observed failure times, x₁, x₂,...,x_n, as y₁ ≤ y₂ ≤ ... ≤ y_n. If only one tail of our sample is censored we shall call that single censoring, whereas if both tails are censored it will be denoted as double censoring. If we censor the r₁ items y_{n−r₁+1}, y_{n−r₁+2},...,y_n, this will be called censoring on the right, while if we censor the r₂ items y₁, y₂,...,y_{r₂}, this will be referred to as censoring on the left. The censoring we impose can take one of two forms: (1) observations above or below a certain fixed point may be omitted, or (2) a fixed number of the smallest or largest observations may be omitted. These two different ways of censoring will be referred to as type I and type II, respectively. Note that for type I censoring, the number of observations censored is a random variable, while for type II censoring, the point at which the censoring takes place is a random variable. Censoring on the left may arise because of limitations in the sensitivity of a measuring device or to save the expense of continual monitoring of the items, while censoring on the right may be implemented to reduce the amount
of time required to complete a life test or to keep the cost of the test to a minimum by reducing the number of expensive items which will fail.
For type I censoring, if T₁, T₂ are the left and right censoring points, then the joint density of the observed failures, given that N observed failures have occurred, is

    f(x₁,...,x_N | N) = ∏_{i=1}^{N} {f(x_i)/[F(T₂)−F(T₁)]}     (1.2.1)

and the probability of obtaining N observed failures, with r₁ items censored on the right and r₂ on the left (n = N+r₁+r₂), is

    [n!/(N! r₁! r₂!)] [F(T₁)]^{r₂} [F(T₂)−F(T₁)]^{N} [1−F(T₂)]^{r₁} .     (1.2.2)

Using (1.2.1) and (1.2.2) we obtain the following likelihood function:

    L = [n!/(N! r₁! r₂!)] [∏_{i=1}^{N} f(x_i)] [F(T₁)]^{r₂} [1−F(T₂)]^{r₁} .     (1.2.3)
For type II censoring, where we censor the r₁ smallest observations, denoted by x'₁ ≤ x'₂ ≤ ... ≤ x'_{r₁}, and the r₂ largest, denoted by x'_{n−r₂+1} ≤ x'_{n−r₂+2} ≤ ... ≤ x'_n, we obtain a likelihood function of the form

    L = [n!/(r₁! r₂!)] ∫...∫ f(x'₁)···f(x'_{r₁}) f(x'_{n−r₂+1})···f(x'_n) [∏_{i=1}^{N} f(y_i)] dx'₁···dx'_{r₁} dx'_{n−r₂+1}···dx'_n

      = [n!/(r₁! r₂!)] [∏_{i=1}^{N} f(y_i)] [F(y₁)]^{r₁} [1−F(y_N)]^{r₂} .     (1.2.4)
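For a single exponential with mean θ, F(x) = 1 − exp(−x/θ), so the type I double-censoring log-likelihood corresponding to (1.2.3) can be evaluated directly. The sketch below drops the constant n!/(N! r₁! r₂!); the data and censoring points are illustrative assumptions, not values from the text.

```python
import math

def exp_cdf(x, theta):
    """F(x) = 1 - exp(-x/theta) for an exponential with mean theta."""
    return 1.0 - math.exp(-x / theta)

def loglik_double_type1(failures, r1, r2, t1, t2, theta):
    """Log of (1.2.3) for an exponential mean theta, constant term dropped:
    observed failures lie between t1 and t2, r2 items are left-censored
    at t1, and r1 items are right-censored at t2."""
    ll = sum(-math.log(theta) - x / theta for x in failures)  # sum of log f(x_i)
    ll += r2 * math.log(exp_cdf(t1, theta))                   # log [F(t1)]^{r2}
    ll += r1 * math.log(1.0 - exp_cdf(t2, theta))             # log [1-F(t2)]^{r1}
    return ll
```

Maximizing this function in θ (e.g., by a one-dimensional search) gives the censored-sample maximum likelihood estimate discussed below.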
Thus, to obtain maximum likelihood estimates of the parameters, for specific underlying distributions, one must maximize the likelihood function. When the likelihood function is a differentiable function of its distribution parameters, the derivatives, with respect to the parameters, of the logarithm of the likelihood function yield the usual maximum likelihood estimating equations. The difficulty with this method arises in the attempts to solve the maximum likelihood estimating equations. Much of the work that has been done in this area has been in simplifying the methods of solving these equations by eliminating the use of special tables or finding iterative techniques which require a minimum of calculation and converge rapidly.
Hald (1949), Cohen (1950), and Gupta (1952) obtain maximum likelihood estimates. When the distribution function depends on location and scale only, Sarhan and Greenberg (1962) obtain unbiased estimates of the location and scale parameters which have minimum variance in the class of linear unbiased estimators, and also approximations to these estimators. Walsh (1956, 1958) presents two procedures for estimating the mean and variance for a wide variety of failure time densities subject to type II double censoring. Des Raj (1952, 1953) obtains moment estimates which yield the same estimating equations as presented by Cohen's (1950) maximum likelihood approach. For double censoring, and the other forms of censoring we will consider, all the parametric estimation procedures considered in the literature have been limited to single (not mixture) failure time densities.
1.2.2 General Form of Type I Censoring

Note that the previous form of type I censoring requires that the censoring points be the same for each item. We will now define a general form of type I censoring which allows each item to have distinct censoring points, i.e., for each item we either have an exact failure time or know that the failure occurred in some interval, finite or semi-infinite, where the intervals can differ from item to item. The most natural occurrence of this type of censoring results when the experimenter cannot start all items at the same time and he is working under a time limit. Thus, if the items are persons who have undergone some operation, which has been performed at various times T_i, and if we stop testing at time T, then the data we have are either exact times to death or knowledge that the person is still alive after (T−T_i) time units. Note that (T−T_i) can differ from person to person.
Let us define Bernoulli variables a_i, i=1,...,n, for each item, such that if we have an exact failure time for the i-th item we set a_i=1, whereas, if the failure has occurred in an interval, say (L_i, U_i), we set a_i=0. Note that if L_i=0 for all i, we are in the special case of censoring on the left, at the point U_i, and if U_i=∞ for all i, we have censoring on the right, at the point L_i. Hence the probability that a_i=1, i.e., the probability of getting an exact failure time for the i-th item, is 1−F(U_i)+F(L_i), and the probability that a_i=0, i.e., the probability that the i-th item is a censored item, is F(U_i)−F(L_i). If we rearrange the item numbers such that the first N (= Σ_{i=1}^{n} a_i) correspond to observed failures and the remaining (n−N) correspond to censored items, then the joint density of the observed failures, given that a particular set {a_i} has occurred, is

    f(x₁,...,x_N | {a_i}) = ∏_{i=1}^{N} {f(x_i)/[1−F(U_i)+F(L_i)]} .     (1.2.5)

The probability of obtaining this particular set {a_i} is

    ∏_{i=1}^{N} {1−F(U_i)+F(L_i)} ∏_{i=N+1}^{n} {F(U_i)−F(L_i)} .     (1.2.6)
Hence, from (1.2.5) and (1.2.6), the likelihood function is

    L = ∏_{i=1}^{N} f(x_i) ∏_{i=N+1}^{n} {F(U_i)−F(L_i)}     (1.2.7)

and the likelihood function in general can be written as

    L = ∏_{i=1}^{n} {[f(x_i)]^{a_i} [F(U_i)−F(L_i)]^{1−a_i}} .     (1.2.8)
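Equation (1.2.8) can be coded directly once a parametric form is chosen; the sketch below assumes a single exponential with mean θ, and the per-item intervals (L_i, U_i) in the example are illustrative assumptions.

```python
import math

def loglik_general_type1(exact_times, intervals, theta):
    """log L from (1.2.8) for an exponential with mean theta:
    exact_times are the x_i with a_i = 1; intervals are the (L_i, U_i)
    pairs for the censored items (a_i = 0).  U_i may be math.inf for
    censoring on the right."""
    F = lambda x: 1.0 - math.exp(-x / theta)                     # exponential CDF
    ll = sum(-math.log(theta) - x / theta for x in exact_times)  # sum of log f(x_i)
    ll += sum(math.log(F(u) - F(l)) for (l, u) in intervals)     # sum of log [F(U_i)-F(L_i)]
    return ll
```

Because each item carries its own interval, this one function covers left censoring (L_i = 0), right censoring (U_i = ∞), and the fully general case.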
Swan (1969) has considered estimating the mean and variance of a normal density under this type of censoring. Bartholomew (1957, 1963) considers a special case of this type of censoring, i.e., he is concerned only with censoring on the right, for which U_i=∞ for all i. He considers a single one-parameter exponential failure time density and obtains an explicit formula for the maximum likelihood estimate of the mean and the exact asymptotic variance, and also an approximation to it. Also given is the exact distribution of the maximum likelihood estimate of the mean, for which computations become laborious for large n. To resolve this difficulty, the asymptotic distribution, and also an approximation satisfactory for moderate sample size n, were derived.

Nelson (1969) utilizes a graphical procedure to obtain estimates of failure time distribution parameters, percentiles, predictions of the number of failures of unfailed items in specified periods of time, and tests of fit, for the exponential, Weibull, normal, log-normal, and extreme value distributions, as well as indicating nonparametric procedures, for a general form of type I censoring on the right. Section 4.4 will use Nelson's procedure to obtain a test of fit on a single exponential, and will outline his general method in detail.
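The hazard plotting idea behind Nelson's procedure can be sketched as follows: order all times on test, let each observed failure contribute a hazard increment 1/(number of items still at risk), and plot the cumulative hazard against time; for an exponential the cumulative hazard is t/θ, so the points should fall near a straight line through the origin. This is a sketch of the general idea only (details are in Section 4.4), and the data below are illustrative assumptions.

```python
def cumulative_hazard(times, failed):
    """Hazard-plot coordinates for right-censored data without ties:
    times holds every item's time on test, failed[i] is True for an
    observed failure.  Each failure adds 1/(items at risk) to the
    cumulative hazard."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    points, cum, at_risk = [], 0.0, len(times)
    for i in order:
        if failed[i]:
            cum += 1.0 / at_risk
            points.append((times[i], cum))
        at_risk -= 1          # failures and censorings both leave the risk set
    return points

pts = cumulative_hazard([2.0, 3.0, 5.0, 7.0], [True, False, True, True])
```

Plotting `pts` and judging linearity by eye is the graphical test of fit; the reciprocal of the fitted slope estimates the mean θ.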
Kaplan and Meier (1958) offer essentially three nonparametric estimators of what they call the survivorship function, 1−F(x), for censoring on the right only. We will use, and explain in some detail, what Kaplan and Meier call the reduced sample estimator [expression (4.4.12), Section 4.4.1] in connection with obtaining initial estimates for a general form of type I censoring on the right, and the product limit estimator [expression (5.3.13), Section 5.3.2] in connection with obtaining the approximate asymptotic covariance matrix for type II progressive censoring. The third estimator, which is called an actuarial estimator, is closely related to the product limit, but will not be used in what follows.
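The product limit estimator referred to above can be sketched in a few lines: at each observed failure the survival estimate is multiplied by (at risk − 1)/(at risk), while censored items simply leave the risk set. The data below are illustrative assumptions, and ties are not handled in this sketch.

```python
def product_limit(times, failed):
    """Kaplan-Meier product limit estimate of the survivorship function
    1 - F(x), evaluated just after each observed failure time
    (right-censored data, no tied times)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    surv, at_risk, steps = 1.0, len(times), []
    for i in order:
        if failed[i]:
            surv *= (at_risk - 1) / at_risk
            steps.append((times[i], surv))
        at_risk -= 1
    return steps

steps = product_limit([1.0, 2.0, 4.0, 6.0], [True, False, True, True])
```

Between failures the estimate is a step function; with no censoring it reduces to one minus the empirical distribution function.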
1.2.3 Progressive Censoring

We now have the censoring occurring progressively in k stages at times T_i, such that T_i > T_{i−1}, i=1,2,...,k, and at the i-th stage of censoring, r_i items, selected at random from the survivors, are withdrawn from the test. This will be referred to as type I progressive censoring. Type II progressive censoring will differ only in that the r_i withdrawals will be made when the i-th failure occurs. Progressive censoring arises when certain items must be removed, prior to failure, for use in related experiments, or when test facilities are limited, since the early censoring of a substantial number of sample items frees facilities for other tests.
If we define N_i to be the number of observed failures in the interval [T_{i−1}, T_i], i=1,...,k+1, where T₀=0, T_{k+1}=∞, N₀=0, and P_{N_i} to be the conditional probability of obtaining N_i failures in [T_{i−1}, T_i] given that we know N_j, r_j for j ≤ i−1, then the joint density of the N_i failures occurring in [T_{i−1}, T_i], given that we have N_i failures, is

    f(x_{S_{i−1}+1},...,x_{S_i} | N_i) = ∏_{j=S_{i−1}+1}^{S_i} {f(x_j)/[F(T_i)−F(T_{i−1})]} ,     (1.2.9)

where S_i = Σ_{ℓ=0}^{i} N_ℓ. Setting M_i = Σ_{j=1}^{i−1} (N_j + r_j), we have

    P_{N_i} = (n−M_i choose N_i) [F(T_i)−F(T_{i−1})]^{N_i} [1−F(T_i)]^{n−M_i−N_i} / [1−F(T_{i−1})]^{n−M_i}     (1.2.10)
and from (1.2.9),

    f(x₁,...,x_N | N₁,...,N_{k+1}) = ∏_{i=1}^{k+1} ∏_{j=S_{i−1}+1}^{S_i} {f(x_j)/[F(T_i)−F(T_{i−1})]} ;     (1.2.11)

we can then obtain the likelihood function for type I censoring as

    L = ∏_{i=1}^{k+1} (n−M_i choose N_i) [∏_{j=S_{i−1}+1}^{S_i} f(x_j)] [1−F(T_i)]^{n−M_i−N_i} / [1−F(T_{i−1})]^{n−M_i}

      = ∏_{i=1}^{k+1} (n−M_i choose N_i) [∏_{j=S_{i−1}+1}^{S_i} f(x_j)] [1−F(T_i)]^{r_i} .     (1.2.12)
If we define f_{ij}(y_i, y_j) as the joint density of the i-th and j-th ordered observed failure times and f_{i|j}(y_i | y_j) as the conditional density of the i-th ordered observed failure time given the j-th ordered observed failure time, then, since the likelihood function for type II censoring is just a function of the N ordered observed failure times, we can write it as follows:

    L = f₁(y₁) ∏_{j=2}^{N} f_{j|1,...,j−1}(y_j | y₁,...,y_{j−1}) .     (1.2.13)
But we know that f_{j|1,...,j−1}(y_j | y₁,...,y_{j−1}) is just the density of the smallest order statistic from a sample of size n − Σ_{ℓ=1}^{j−1} (r_ℓ + 1) drawn from a parent distribution truncated on the left at y_{j−1}, i.e., writing m_j = n − Σ_{ℓ=1}^{j−1} (r_ℓ + 1),

    f_{j|1,...,j−1}(y_j | y₁,...,y_{j−1}) = m_j f(y_j) [1−F(y_j)]^{m_j−1} / [1−F(y_{j−1})]^{m_j} .     (1.2.14)

Combining (1.2.13) and (1.2.14) we have

    L = ∏_{i=1}^{N} {[n − Σ_{j=1}^{i−1} (r_j + 1)] f(y_i) [1−F(y_i)]^{r_i}} .     (1.2.15)
A special case of type II progressive censoring occurs when the sample is partitioned into random groups of (r+1) elements, and each group is observed only to the time of the first failure within the group. Such a situation may arise when testing electronic tubes, when the tubes are partitioned into groups within a piece of electronic equipment, i.e., each piece of electronic apparatus contains (r+1) tubes of this specific type, and the failure of a single tube either prohibits further operation of the equipment or alters conditions for the remaining tubes and thus necessitates their censoring. When r_i = r for all i we obtain

    L = N! ∏_{i=1}^{N} {(r+1) f(y_i) [1−F(y_i)]^{r}} .     (1.2.16)
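The log of the progressive type II likelihood (1.2.15) is easy to evaluate for a given parent distribution; the sketch below assumes an exponential parent with mean θ, and the failure times and withdrawal counts in the test call are illustrative assumptions.

```python
import math

def loglik_progressive_type2(y, r, n, theta):
    """Log of (1.2.15) for an exponential with mean theta: y[i] is the
    i-th ordered observed failure, r[i] items are withdrawn at that
    failure, and n items start the test."""
    ll, removed = 0.0, 0
    for yi, ri in zip(y, r):
        m = n - removed                      # n - sum over j<i of (r_j + 1)
        ll += math.log(m)                    # log of the leading factor
        ll += -math.log(theta) - yi / theta  # log f(y_i)
        ll += ri * (-yi / theta)             # log [1 - F(y_i)]^{r_i}
        removed += ri + 1                    # the failure plus its r_i withdrawals
    return ll
```

For the exponential, each withdrawal at y_i simply contributes r_i·(−y_i/θ), which is why progressive censoring stays tractable in the chapters that follow.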
Cohen (1963, 1965), Herd (1956), and Iyer and Singh (1962) all obtain maximum likelihood estimates for a progressively censored sample. The two previously mentioned methods of hazard plotting by Nelson (1969) and the nonparametric work of Kaplan and Meier (1958), which were employed for a general form of type I censoring, are applicable also to the case of progressive censoring. Gajjar and Khatri (1969) also obtain maximum likelihood estimates for type I censoring when the parameters under consideration change at each stage of censoring. Such a generalization of progressive censoring may arise when the items at each stage are checked and the defects eliminated whenever possible.
1.3 Mixtures of Distributions

We have already defined a mixture of densities as f(x) = Σ_{i=1}^{s} α_i f_i(x), where α_i is the probability that an individual is in the i-th population, α_i ≥ 0, Σ α_i = 1, and f_i(x) is the failure time density for the i-th population. The s populations are distinct, and items in the i-th population are subject only to the risk of failure of the i-th type. In this section we will briefly review some of the work that has been done in estimating the parameters of the individual failure densities as well as the α_i, i=1,2,...,s. Further references on mixtures of distributions can be found in the survey article by Blischke (1963) on mixtures of discrete distributions and also in bibliographic guides to life testing by Mendenhall (1958), Govindarajulu (1964), and Buckland (1964), the last three references containing also additional work on censoring. A majority of the estimation procedures for mixtures can be classified into one of three general methods: maximum likelihood, method of moments, and graphical methods or methods using graphical aids. The remaining three sections will review the general techniques employed in each of these methods.
1.3.1 Maximum Likelihood Estimates

The use of the maximum likelihood estimation procedure will prove advantageous in what follows. One desirable feature is that the method of estimation generalizes easily as the number of populations increases or as we encounter various forms of censoring. This feature will enable us to maintain a unified approach, and also to obtain our results in a similar form for the various cases we will consider. Also, for complete samples, the estimators obtained will have well known and desired properties. In subsequent work, where we will want to check whether a single exponential provides an adequate representation for a mixture, one of the criteria we will use is a likelihood ratio test [see Section 2.4.3], which requires obtaining maximum likelihood estimates to implement the test.

Hasselblad (1969) obtains general successive substitutions iteration equations for obtaining estimates for finite mixtures of distributions from the exponential family, for complete samples. We will employ these equations for the special case of a mixture of two exponentials [Section 2.2] and examine them in detail.
Hasselblad assumed that the number of distributions is known, and that the mixture is composed of distributions of the same type, but with different parameter values. Thus the probability densities can be written as

    f_j(x) = H(x) C_j(θ_{1j},...,θ_{rj}) exp[θ_{1j}T_{1j}(x) + ... + θ_{rj}T_{rj}(x)], j=1,...,s,     (1.3.1)

and if f(x) = Σ_{j=1}^{s} α_j f_j(x) we obtain a log likelihood function of the form

    log L = Σ_{i=1}^{n} log f(x_i) .     (1.3.2)
Assuming C_j(θ_{1j},...,θ_{rj}) is differentiable, we obtain the following maximum likelihood estimating equations by taking derivatives of log L:

    ∂log L/∂θ_{mj} = Σ_{i=1}^{n} α_j [∂f_j(x_i)/∂θ_{mj}]/f(x_i) = 0, j=1,...,s; m=1,...,r,     (1.3.3)

    Σ_{i=1}^{n} [f_j(x_i)−f_s(x_i)]/f(x_i) = 0, j=1,...,s−1,     (1.3.4)

which yields

    Σ_{i=1}^{n} [f_j(x_i)T_{mj}(x_i)/f(x_i)] / Σ_{i=1}^{n} [f_j(x_i)/f(x_i)] = −∂log C_j/∂θ_{mj}     (1.3.5)

and

    Σ_{i=1}^{n} [f_j(x_i)/f(x_i)] = Σ_{i=1}^{n} [f_s(x_i)/f(x_i)] .     (1.3.6)

Using (1.3.6) and the relations f(x) = Σ_{j=1}^{s} α_j f_j(x), Σ_{j=1}^{s} α_j = 1, we obtain

    α̂_j = (1/n) Σ_{i=1}^{n} [α̂_j f_j(x_i)/f(x_i)] .     (1.3.7)
Equations (1.3.5) and (1.3.7) are the basis for the successive substitutions iteration scheme of Hasselblad's paper. Whenever the solution of the maximum likelihood equations for s distributions of the exponential family can be written in closed form, the successive substitutions iteration scheme can be applied in the following manner. If the r sets of s equations (1.3.5) can be solved for θ_{mj} and written as a function, say g_{mj}, of the weighted statistics,

    θ_{mj} = g_{mj}(t_{1j},...,t_{rj}),     (1.3.8)

where

    t_{ℓj} = Σ_{i=1}^{n} [f_j(x_i)T_{ℓj}(x_i)/f(x_i)] / Σ_{i=1}^{n} [f_j(x_i)/f(x_i)] ,

and if we denote the estimates of the parameters at the v-th iteration by θ^{(v)}, where the t_{ℓj}'s are evaluated using θ^{(v)}, then the (v+1)-th estimates are given by

    θ_{mj}^{(v+1)} = g_{mj}(t_{1j}^{(v)},...,t_{rj}^{(v)}), m=1,...,r; j=1,...,s.
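Specialized to a mixture of two exponentials, the case taken up in Section 2.2, the scheme alternates between the posterior weights α_j f_j(x_i)/f(x_i) and the updates (1.3.7)-(1.3.8), where for the exponential the sufficient statistic is x itself, so each new mean is a weighted sample mean. The sketch below is an illustration of this scheme under those assumptions; the data and starting values are invented for the example.

```python
import math

def successive_substitution(x, alpha, t1, t2, iters=200):
    """Successive substitutions for f(x) = (a/t1)exp(-x/t1) + ((1-a)/t2)exp(-x/t2):
    weight each observation by the posterior probability of population 1,
    then refresh the mixing proportion and the two means."""
    for _ in range(iters):
        w = []
        for xi in x:
            f1 = math.exp(-xi / t1) / t1
            f2 = math.exp(-xi / t2) / t2
            w.append(alpha * f1 / (alpha * f1 + (1 - alpha) * f2))
        s1 = sum(w)
        alpha = s1 / len(x)                                       # update (1.3.7)
        t1 = sum(wi * xi for wi, xi in zip(w, x)) / s1            # weighted mean, pop. 1
        t2 = sum((1 - wi) * xi for wi, xi in zip(w, x)) / (len(x) - s1)
    return alpha, t1, t2

est = successive_substitution([0.1, 0.2, 0.15, 0.12, 5.0, 6.0, 5.5, 6.5], 0.5, 0.2, 5.0)
```

Each sweep increases the likelihood in the cases examined here, but, as the later chapters show, the rate of convergence can be very slow when h = θ₂/θ₁ is near one.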
Three methods of obtaining an initial estimate were offered, the first being a modification of a technique used by Hasselblad (1966), the second based on moment estimators, and the third using an initial guess based on prior knowledge of the problem.

Although the asymptotic covariance matrix is alluded to, it is never explicitly provided. In general, Hasselblad states that, based on work with mixtures of Poisson distributions, the variance of the estimates will be quite large if the populations are not sufficiently distinct. However, for mixtures of exponentials no explicit results are given. In Chapter 2 we will examine in detail speed of convergence, asymptotic variances, and whether a single exponential can adequately represent a mixture, for various parameter combinations, and coordinate these three aspects of the estimation problem.
1.3.2 Method of Moments

The first attempt at estimation for mixtures was that by Karl Pearson (1894). He considered the case of a mixture of two normal populations and estimated the means, variances, and mixing proportion by equating the first five moments with their sample values. Unfortunately, moment estimators suffer from a variety of drawbacks. The method of estimation does not generalize conveniently as the number of populations increases or as various forms of censoring are introduced, but, in general, a different form of the set of estimating equations appears. Unlike the case of maximum likelihood estimation, where the resulting estimated mixture can be tested using a chi-square goodness of fit criterion, subtracting k degrees of freedom when k parameters are estimated, when the fitting is by moments the effective degrees of freedom will exceed the nominal number. The increase is the general effect of using an inefficient method of fitting. Unfortunately the amount of the increase is not known in this case.
Rider (1961) obtains moment estimates for a mixture of two
exponentials for complete samples. The calculations are easily
performed, involving the solution of a quadratic equation and then
substitution into two simple expressions [for more details see Section
2.5.1]. If the two means are not equal, the proposed estimators are
consistent, and the probability that the means are positive, and the
mixing proportion is between zero and one, approaches one as n (the
sample size) tends to infinity. Rider also obtains the asymptotic
variance of the means under the assumption that the mixing proportion is
known.

Since we can readily obtain explicit estimates, in future work we
will use the method of moments to obtain initial values for our
iterative procedure. For some forms of censoring the moment equations
become unduly complicated, i.e., they do not yield explicit solutions;
in these situations we will employ various assumptions to simplify the
resulting equations. Rider does not recommend the use of these
estimators when θ2/θ1 is close to one. In many cases, if this method is
used when θ2/θ1 is near one, nonsensical results are obtained, i.e.,
negative values for the means or mixing proportions greater than one or
negative.
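As an illustration of the computation involved, the sketch below works from the standard moment relations E[X^r] = r!(α1 θ1^r + α2 θ2^r), r = 1, 2, 3, reducing them to a quadratic whose roots are the component means; this is a standard reduction offered only as a sketch, not Rider's exact formulae, which appear in Section 2.5.1. All names are mine.

```python
import math

def solve_from_power_sums(p1, p2, p3):
    """Given p_r = alpha1*t1^r + alpha2*t2^r (r = 1, 2, 3), recover (alpha1, t1, t2).

    t1 + t2 = s and t1*t2 = q follow from the identities
    p2 = s*p1 - q and p3 = s*p2 - q*p1.
    """
    s = (p3 - p1 * p2) / (p2 - p1 * p1)
    q = s * p1 - p2
    disc = s * s - 4.0 * q
    if disc < 0:  # happens when theta2/theta1 is near one -- Rider's warning
        return None
    t1 = (s + math.sqrt(disc)) / 2.0
    t2 = (s - math.sqrt(disc)) / 2.0
    return (p1 - t2) / (t1 - t2), t1, t2

def moment_estimates(x):
    """Moment estimates for a mixture of two exponentials from a sample x."""
    n = len(x)
    return solve_from_power_sums(sum(x) / n,
                                 sum(v * v for v in x) / (2 * n),
                                 sum(v ** 3 for v in x) / (6 * n))
```

With the exact moments of the α1 = .5, θ1 = 10, θ2 = 1 mixture (p1 = 5.5, p2 = 50.5, p3 = 500.5) the reduction returns (.5, 10, 1); as θ2/θ1 approaches one the discriminant can go negative, the breakdown the text describes.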
1.3.3  Graphical Methods

In this section we will briefly consider methods which are
completely graphical [for example, Harding (1949)] and also methods
which make use of graphical aids to complement numerical computations
[for example, Preston (1953), Bhattacharyya (1967)]. The advantage of
graphical methods seems to be their wide applicability and the relative
computational ease of obtaining estimates. When dealing with mixtures,
in many cases the graphical methods function quite easily without the
supposition that the number of populations, s, is small, or even known.
Also, for single failure time densities, probability and hazard plotting
techniques are able to handle a wide variety of densities, subject to
various censoring patterns.

When considering mixtures, the graphical procedures seem to be
of little use unless the component populations are quite well separated,
so that the individual failure time densities can be analyzed
separately. In the case of exponentials, even when θ2/θ1 is quite
small, there is a substantial amount of overlap between the component
densities, and thus graphical procedures are of limited value. Other
difficulties with these methods are that the properties of the estimates
are not generally known, and the technique cannot be easily automated
but requires that the work be done by a skilled person.
CHAPTER II

COMPLETE SAMPLES

2.1  Introduction

The problem of estimation for a mixture of two exponentials,
for complete samples, has been considered by Hasselblad (1969) as a
special case of his general formulation for exponential families. He
provides the iterative equations required, and gives some general
comments concerning convergence of the procedure and the asymptotic
variances of the estimates obtained. We will look at this problem in
more depth, examining various aspects of it for differing parameter
combinations.

Section 2.2 will develop and discuss the performance of the
iteration scheme used to obtain the maximum likelihood estimates. The
asymptotic covariance matrix of these estimates, considered in Section
2.3, will involve expressions which require numerical integration. We
will examine these integrals in some detail, providing tables as an aid
to evaluating them. Also, we will offer graphs of the asymptotic
variances for various parameter combinations so that we may determine
when the maximum likelihood estimates will be reliable.

When dealing with mixtures of densities one should be aware
that a single density might describe the data adequately. Obviously,
if a single density fits the data, then a mixture will also. Section
2.4 compares a mixture of two exponentials and a single exponential
having the same mean, for various parameter combinations. The
comparison involves formulation of indicators of agreement between the
two densities, as well as pointing out appropriate tests which can be
applied. Section 2.5 will produce an operating procedure to follow in
order to implement the iterative scheme developed, and illustrates it on
some artificial data.

Throughout this chapter the aim will be to consolidate the
observations made in the various sections, i.e., if for some parameter
combination convergence is not easily attained, then we would want to
look at the magnitude of the asymptotic variance of these estimates as
well as checking if perhaps a single exponential would be an adequate
representation for the mixture.
2.2  Iterative Scheme

The required iterative equations can be obtained as a special
case of Hasselblad's general system of equations (1.3.5) and (1.3.7) by
setting H(x) = 1, r = 1, θ1i = θi, C1i(θi) = 1/θi, T1i(x) = x/θi²,
i = 1, 2. In order to facilitate comparison of the derivation for the
complete sample case with the ensuing derivations for censored samples
we will rederive the equations for the less general but notationally
simpler case of a mixture of two exponentials. Thus we will be working
with a failure time density of the form:

    f(x) = (α1/θ1) e^(-x/θ1) + (α2/θ2) e^(-x/θ2),     (2.2.1)

with the corresponding log likelihood function expressed as

    log L = Σ_{i=1}^n log f(xi).     (2.2.2)

The resulting maximum likelihood estimating equations are then as
follows:

    ∂logL/∂α1 = Σ_{i=1}^n {[(1/θ1) e^(-xi/θ1) - (1/θ2) e^(-xi/θ2)]/f(xi)} = 0,     (2.2.3)

    ∂logL/∂θj = Σ_{i=1}^n {αj ((xi-θj)/θj³) e^(-xi/θj)/f(xi)} = 0,  j = 1, 2.     (2.2.4)

A solution to this set of equations can be obtained by using a
successive substitution iteration scheme. Multiplying equation (2.2.3)
by α1 and adding Σ_{i=1}^n {(1/θ2) e^(-xi/θ2)/f(xi)} to both sides we
obtain

    n = Σ_{i=1}^n {(1/θ2) e^(-xi/θ2)/f(xi)},

and by once again referring to equation (2.2.3) we observe the
following:

    n = Σ_{i=1}^n {(1/θ1) e^(-xi/θ1)/f(xi)} = Σ_{i=1}^n {(1/θ2) e^(-xi/θ2)/f(xi)}.     (2.2.5)

Thus we can determine the successive substitution iteration scheme from
equations (2.2.5) and (2.2.4), which yields:

    α̂1^(v+1) = (α̂1^(v)/n) Σ_{i=1}^n {(1/θ̂1^(v)) e^(-xi/θ̂1^(v))/f^(v)(xi)},     (2.2.6)

    θ̂j^(v+1) = Σ_{i=1}^n [xi e^(-xi/θ̂j^(v))/f^(v)(xi)] / Σ_{i=1}^n [e^(-xi/θ̂j^(v))/f^(v)(xi)],
        j = 1, 2.     (2.2.7)
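A minimal sketch of updates (2.2.6) and (2.2.7) in code (the function names and the fixed iteration count are illustrative choices of mine, not part of the thesis):

```python
import math

def mixture_density(x, a1, t1, t2):
    """f(x) of equation (2.2.1), with alpha2 = 1 - alpha1."""
    return a1 / t1 * math.exp(-x / t1) + (1.0 - a1) / t2 * math.exp(-x / t2)

def successive_substitution(data, a1, t1, t2, iterations=200):
    """Iterate updates (2.2.6) and (2.2.7) from the given starting values."""
    n = len(data)
    for _ in range(iterations):
        f = [mixture_density(x, a1, t1, t2) for x in data]
        w1 = [math.exp(-x / t1) / fx for x, fx in zip(data, f)]
        w2 = [math.exp(-x / t2) / fx for x, fx in zip(data, f)]
        a1 = a1 / (n * t1) * sum(w1)                          # (2.2.6)
        t1 = sum(x * w for x, w in zip(data, w1)) / sum(w1)   # (2.2.7), j = 1
        t2 = sum(x * w for x, w in zip(data, w2)) / sum(w2)   # (2.2.7), j = 2
    return a1, t1, t2
```

All three updates in one pass use the weights computed from the v-th estimates, matching the simultaneous form of (2.2.6) and (2.2.7).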
2.2.1  Properties of the Procedure

The iterative procedure we have derived is referred to by a
variety of terms, some of which are successive substitution, linear
iteration, and functional iteration. It will be seen in succeeding
chapters that the iterative equations will easily generalize for the
forms of censoring considered. The observations made in this section
are based on limited experience with the iterative process using
artificial data. Two random uniform numbers were used to generate a
single random number, X, from a mixture of exponentials. If the first
random uniform number, Z1, was less than α1, the mixing proportion, this
led to the population with mean θ1; otherwise to the population with
mean θ2. Then X was obtained from the second random uniform number, Z2,
by setting X = -θ1 log(Z2) or X = -θ2 log(Z2) depending on whether
Z1 ≤ α1 or Z1 > α1.
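The two-uniform generation scheme just described can be sketched as follows (a minimal illustration; the function name, seed, and use of Python's generator are my own choices):

```python
import math
import random

def mixture_sample(n, alpha1, theta1, theta2, seed=1971):
    """Draw n failure times from the mixture alpha1*Exp(theta1) + alpha2*Exp(theta2).

    Each draw uses two uniforms, as in the text: Z1 picks the component,
    Z2 is inverted to give an exponential variate with the chosen mean.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z1, z2 = rng.random(), rng.random()
        mean = theta1 if z1 <= alpha1 else theta2
        sample.append(-mean * math.log(1.0 - z2))  # 1 - z2 avoids log(0)
    return sample
```

Since the mixture mean is α1θ1 + α2θ2, a large sample average should sit near that value.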
At the j-th iteration we calculated the estimated parameters
α̂1^(j), θ̂1^(j), θ̂2^(j), as well as the logarithm of the likelihood
function, and the quantities ∂logL/∂α1, ∂logL/∂θ1, ∂logL/∂θ2, evaluated
at α1 = α̂1^(j), θ1 = θ̂1^(j), θ2 = θ̂2^(j). We took α̂1^(j), θ̂1^(j),
θ̂2^(j) as our maximum likelihood estimates when ∂logL/∂α1, ∂logL/∂θ1,
∂logL/∂θ2 were all sufficiently small, say, all less than 0.01. The
logarithm of the likelihood function was then checked to make sure it
was a maximum.

It was found that the logarithm of the likelihood function
continually increased, but not very dramatically. (Exceptions to this
occurred occasionally when the log of the likelihood function oscillated
at the eighth digit near the end of the iterative procedure.) The
partial derivatives tended to increase and decrease more dramatically
than the log of the likelihood function, with slight changes in the
parameter values, at times, causing substantial changes in the partial
derivatives.
In order to get a rough idea of how well the iterative procedure
worked in various situations we used simulated data with (a) several
values of α1, the mixing proportion, (b) different values of h = θ2/θ1,
θ1 > θ2, the ratio of means, (c) various values of θ1 for a fixed h,
(d) several values of n, the number of observed failure times, and (e)
different starting values of our iterative procedure. The size of h was
the most critical factor. For h small enough convergence tended to take
place relatively rapidly even when the other factors were not favorable,
but when h approached one, convergence was difficult to attain
irrespective of the other factors. It was observed that a value of h
less than or equal to 1/5 was highly desirable, and convergence occurred
quite rapidly irrespective of the other factors. For h = 1/3 we started
to obtain some problems of convergence and speed of convergence for some
of the extreme cases of the other factors. When h = 1/2, even for
favorable cases of the other factors, convergence did not occur all the
time, and when it did, it was usually extremely slow.

The value of α1 doesn't seem to be too critical, with values of
α1 = 1/2 desirable. Problems occur when α1 approaches 0 or 1, in which
case larger values of n are required. For a given h, the value of θ1
doesn't seem to matter when convergence occurs in a reasonable number of
iterations. For h < 1/3 and .2 < α1 < .8 a sample size of 200 may be
sufficient; for less favorable conditions, a sample of size 1000 may be
required, and even this may not be large enough when conditions are
unfavorable (when h → 1).

Starting values are not too critical when h ≤ .2; using α1^(0) = .5,

    θ1^(0) = Σ_{i=1}^n (xi/n) + 1,     θ2^(0) = Σ_{i=1}^n (xi/n) - 1,

proved to be satisfactory. However, when h → 1 starting values become
extremely critical; unfortunately, no method is currently available for
producing good starting values under these conditions.
No optimality properties are claimed for the iterative procedure
used. However, to see how successive substitution compares with other
iterative methods, we also solved the maximum likelihood estimating
equations using Newton's method. From limited computational results we
noted that Newton's procedure converged rapidly when the successive
substitution scheme did, and when successive substitution didn't
converge or converged very slowly, the same was true of Newton's
procedure.
2.3  Asymptotic Covariance Matrix

The asymptotic covariance matrix can be written as the inverse
of the information matrix, where the negative of the information matrix
is defined as:

    | E{∂²logL/∂α1²}     E{∂²logL/∂α1∂θ1}   E{∂²logL/∂α1∂θ2} |
    | E{∂²logL/∂θ1∂α1}   E{∂²logL/∂θ1²}     E{∂²logL/∂θ1∂θ2} |
    | E{∂²logL/∂θ2∂α1}   E{∂²logL/∂θ2∂θ1}   E{∂²logL/∂θ2²}   |

We determine the second partials by differentiating the first partials,
equations (2.2.3), (2.2.4), obtaining:

    ∂²logL/∂α1² = -Σ_{i=1}^n {[(1/θ1) e^(-xi/θ1) - (1/θ2) e^(-xi/θ2)]²/f²(xi)},     (2.3.1)

    ∂²logL/∂θj∂α1 = (-1)^(j-1) Σ_{i=1}^n {(1/θk)((xi-θj)/θj³) e^(-xi/θj) e^(-xi/θk)/f²(xi)},
        j = 1, 2;  k ≠ j,     (2.3.2)

    ∂²logL/∂θ1∂θ2 = -Σ_{i=1}^n {α1 α2 ((xi-θ1)/θ1³)((xi-θ2)/θ2³) e^(-xi/θ1) e^(-xi/θ2)/f²(xi)},     (2.3.3)

    ∂²logL/∂θj² = Σ_{i=1}^n {(αj/θj⁵) e^(-xi/θj) [2θj² - 4θj xi + xi²]/f(xi)
        - αj² ((xi-θj)/θj³)² e^(-2xi/θj)/f²(xi)},  j = 1, 2.     (2.3.4)

The expected values of the above second partial derivatives will
be calculated by transforming the original integrals into a form more
suitable for numerical integration. Hill (1963) is also concerned with
this problem but deals with the less general case when θ1, θ2 are known
and only the mixing proportion is estimated. Unlike our numerical
integration approach, most of his work is devoted to obtaining an
infinite series expansion for the required integral, as well as
obtaining suitable approximations for various limiting parameter
combinations.

From (2.3.1) we have

    E{∂²logL/∂α1²} = -n ∫₀^∞ {[(1/θ1) e^(-x/θ1) - (1/θ2) e^(-x/θ2)]²/f(x)} dx,

but the squared terms can be reduced to the cross-product term by means
of the identities (1/θ1) e^(-x/θ1) = [f(x) - α2 (1/θ2) e^(-x/θ2)]/α1
and (1/θ2) e^(-x/θ2) = [f(x) - α1 (1/θ1) e^(-x/θ1)]/α2. If we let

    S1 = ∫₀^∞ {e^(-x/θ1) e^(-x/θ2)/(θ1 θ2 f(x))} dx,

we obtain

    E{∂²logL/∂α1²} = -(n/(α1 α2))[1 - S1].     (2.3.5)

Note that in the appendix we provide a means of calculating S1 and the
ensuing integrals S2,...,S5, and also give tables of these integrals
for several parameter values.
Similarly defining

    S2 = ∫₀^∞ {(x/θ1) e^(-x/θ1) e^(-x/θ2)/(θ1 θ2 f(x))} dx,

    S3 = ∫₀^∞ {(x/θ1)² e^(-x/θ1) e^(-x/θ2)/(θ1 θ2 f(x))} dx,

we obtain from (2.3.2)-(2.3.4)

    E{∂²logL/∂θ1∂α1} = (n/θ1)[S2 - S1],     (2.3.6)

    E{∂²logL/∂θ2∂α1} = -(n/θ2)[(1/h) S2 - S1],     (2.3.7)

    E{∂²logL/∂θ1∂θ2} = -(n α1 α2/θ2²)[S3 - (h+1) S2 + h S1],     (2.3.8)

    E{∂²logL/∂θ1²} = -(n α1/θ1²)[1 - α2(S3 - 2 S2 + S1)],     (2.3.9)

    E{∂²logL/∂θ2²} = -(n α2/θ2²)[1 - (α1/h²)(S3 - 2h S2 + h² S1)].     (2.3.10)

Thus the asymptotic covariance matrix can be written as the inverse of
nM, where M is the symmetric matrix with elements

    M(α1,α1) = (1 - S1)/(α1 α2),
    M(α1,θ1) = -(1/θ1)[S2 - S1],
    M(α1,θ2) = (1/θ2)[(1/h) S2 - S1],
    M(θ1,θ1) = (α1/θ1²)[1 - α2(S3 - 2 S2 + S1)],
    M(θ1,θ2) = (α1 α2/θ2²)[S3 - (h+1) S2 + h S1],
    M(θ2,θ2) = (α2/θ2²)[1 - (α1/h²)(S3 - 2h S2 + h² S1)].     (2.3.11)
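As a sketch of the numerical integration involved, the code below evaluates S1, S2, S3 by Simpson's rule on a truncated range; the truncation point and step count are pragmatic choices of mine, not the appendix's method.

```python
import math

def s_integrals(a1, t1, t2, upper=80.0, steps=4000):
    """Evaluate S1, S2, S3 of Section 2.3 by Simpson's rule on [0, upper].

    S_j = integral of (x/t1)^(j-1) e^(-x/t1) e^(-x/t2) / (t1*t2*f(x)) dx,
    j = 1, 2, 3, for the mixture density f of (2.2.1).
    """
    def f(x):
        return a1 / t1 * math.exp(-x / t1) + (1 - a1) / t2 * math.exp(-x / t2)

    h_step = upper / steps
    s = [0.0, 0.0, 0.0]
    for i in range(steps + 1):
        x = i * h_step
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        base = math.exp(-x / t1) * math.exp(-x / t2) / (t1 * t2 * f(x))
        for j in range(3):
            s[j] += weight * (x / t1) ** j * base
    return [v * h_step / 3.0 for v in s]
```

When θ1 = θ2 the mixture collapses to a single exponential and S1 = S2 = 1, S3 = 2, so the (α1, α1) element (1 - S1)/(α1 α2) of (2.3.11) vanishes; the mixing proportion carries no information, which is the h → 1 breakdown seen throughout this chapter.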
From the appendix the Si, i = 1,...,5, are seen to be functions
of only h and α1; we then note that the elements of the asymptotic
covariance matrix can be determined from h and α1 alone, by setting
θ1 = 1. For a fixed h, if instead of θ1 = 1 we have θ1 = k, we can
obtain the appropriate asymptotic covariance matrix for the new value of
θ1 by multiplying cov(θ̂1,α̂1), cov(θ̂2,α̂1) by k, multiplying
cov(θ̂1,θ̂2), var(θ̂1), var(θ̂2) by k², and leaving var(α̂1) unchanged.

Figures (2.3.1), (2.3.2), (2.3.3) may be used to compare the
asymptotic dispersions of the three estimated parameters. If we recall
the parameter combinations for which the iterative scheme converged
slowly, we see that they correspond to cases for which the variances of
the estimates are excessive. Thus for h = 1/3 and large or small values
of α1, convergence difficulties appeared, and when h = 1/2, even for α1
around .5, convergence was hard to obtain. Also we noted that values of
α1 around .5 were desirable, and for a given h, the magnitude of θ1 was
unimportant as far as convergence was concerned. Thus the cases for
which we had trouble obtaining estimates are of limited interest, for
maximum likelihood procedures, because of the large variances associated
with the estimates.
Figure 2.3.1  var(α̂1)/α1² plotted against h for various values of α1; n=1.

Figure 2.3.2  var(θ̂1)/θ1² plotted against h for various values of α1; n=1.

Figure 2.3.3  var(θ̂2)/θ2² plotted against h for various values of α1; n=1.

2.4  Single Exponential Versus a Mixture of Two Exponentials

In the earlier sections of this chapter we have found an
iterative scheme to obtain estimates for mixtures of two exponentials
and calculated the asymptotic variances of these estimates. For certain
parameter combinations the iterative scheme has proved to be
unsatisfactory, and the corresponding asymptotic variance of the
estimates has been excessive. The question arises whether in these
cases the mixture of exponentials perhaps may be adequately approximated
by a single exponential. In this section we will study the relationship
between a mixture of two exponentials and a single exponential, with the
same mean as the mixture: from a graphical standpoint; by examining the
non-centrality parameter, introduced in power considerations for
chi-square tests of goodness of fit, which can be regarded as a measure
of discrepancy between the two distributions; and we will also discuss
tests which can be applied to determine whether a single exponential or
a mixture is appropriate.
2.4.1  Graphical Observations

We will compare a mixture of two exponentials,

    f(x) = (α1/θ1) e^(-x/θ1) + (α2/θ2) e^(-x/θ2),

with a single exponential having the same mean as the mixture,

    g(x) = (1/(α1θ1 + α2θ2)) exp[-x/(α1θ1 + α2θ2)].

In figures (2.4.1) to (2.4.5) we note that as we approach the parameter
combinations for which our iterative procedure converged slowly, and for
which we obtained excessive asymptotic variances, the single
exponential, g(x), serves as an adequate representation for the mixture.

Figure 2.4.1  Single exponential g(x) = (1/5.5) e^(-x/5.5) versus the
mixture f(x) = (.5/10) e^(-x/10) + (.5/1) e^(-x/1); α1=.5, h=1/10, θ2=1.

Figure 2.4.2  Single exponential g(x) = (1/3) e^(-x/3) versus the
mixture f(x) = (.5/5) e^(-x/5) + (.5/1) e^(-x/1); α1=.5, h=1/5, θ2=1.

Figure 2.4.3  Single exponential g(x) = (1/2) e^(-x/2) versus the
mixture f(x) = (.5/3) e^(-x/3) + (.5/1) e^(-x/1); α1=.5, h=1/3, θ2=1.

Figure 2.4.4  Single exponential g(x) = (1/1.2) e^(-x/1.2) versus the
mixture f(x) = (.1/3) e^(-x/3) + (.9/1) e^(-x/1); α1=.1, h=1/3, θ2=1.

Figure 2.4.5  Single exponential g(x) = (1/1.5) e^(-x/1.5) versus the
mixture f(x) = (.5/2) e^(-x/2) + (.5/1) e^(-x/1); α1=.5, h=1/2, θ2=1.

Note that the greatest discrepancy between the two curves occurs
at the origin, and this distance becomes smaller as the single
exponential more adequately depicts the mixture. Thus in order to
obtain some analytic comparisons between single exponentials and
mixtures for various parameter values it seems natural to scrutinize
this distance. We will denote the distance by d,

    d = f(0) - g(0) = α1/θ1 + α2/θ2 - 1/(α1θ1 + α2θ2)
      = α1 α2 (1-h)²/[θ1 h (α1 + α2 h)],     (2.4.1)
from which we note:

(1)  d > 0, and d achieves its minimum value for α1 = 0, 1 or h = 1.

(2)  For a fixed α1, θ1, the value of d monotonically increases as
h decreases, with d being unbounded, since as h decreases the numerator
of equation (2.4.1) increases while the denominator decreases.

(3)  For a fixed h, θ1, the partial derivative of α1α2/[α1 + α2 h]
with respect to α1 yields [-α1² + α2² h]/[α1 + α2 h]², from which we
obtain that the value of d increases as α1 increases until it reaches a
maximum, and then decreases. The maximum is (1-h)²/[(√h + 1)² θ1 h] and
occurs when α1 = √h/(√h + 1). Thus as h → 1 the value of α1 at which
the maximum occurs approaches .5 and the maximum distance approaches 0,
while as h → 0 the value of α1 at which the maximum occurs approaches 0
and the maximum distance becomes unbounded.

(4)  For a given h, α1, we have that dθ1 is a constant.

Tables (2.4.1) and (2.4.2) reconfirm the above results as well
as providing some idea of the magnitude of d for various parameter
combinations.
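A quick numerical check of (2.4.1) and of the maximizing value α1 = √h/(√h + 1) can be sketched as follows (an illustration; the grid search and names are mine):

```python
import math

def d_distance(a1, h, t1=1.0):
    """Distance d = f(0) - g(0) of equation (2.4.1), with theta2 = h*theta1."""
    a2 = 1.0 - a1
    return a1 * a2 * (1.0 - h) ** 2 / (t1 * h * (a1 + a2 * h))

# locate the maximizing alpha1 for h = .5 by a fine grid search
h = 0.5
best_a1 = max((i / 10000 for i in range(1, 10000)),
              key=lambda a: d_distance(a, h))
print(round(best_a1, 3), round(d_distance(best_a1, h), 4))  # 0.414 0.1716
```

The grid search lands at α1 ≈ .414 = √.5/(√.5 + 1) with d ≈ .1716, matching the h = .5 row of Table 2.4.2.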
Table 2.4.1  Values of d for α1, h = .1, .2, ..., .9; θ1 = 1.0

 α1\h    .1      .2      .3      .4      .5      .6      .7      .8      .9
 .1    3.8368  1.0286   .3937   .1761   .0818   .0375   .0159   .0055   .0011
 .2    4.6285  1.4222   .5939   .2769   .1333   .0627   .0271   .0095   .0019
 .3    4.5973  1.5273   .6725   .3259   .1615   .0778   .0342   .0122   .0025
 .4    4.2261  1.4769   .6759   .3375   .1714   .0842   .0376   .0136   .0028
 .5    3.6818  1.3333   .6282   .3214   .1667   .0833   .0378   .0139   .0029
 .6    3.0375  1.1294   .5444   .2842   .1500   .0762   .0351   .0130   .0028
 .7    2.3301   .8842   .4342   .2305   .1235   .0636   .0297   .0112   .0024
 .8    1.5805   .6095   .3039   .1636   .0889   .0464   .0219   .0083   .0018
 .9     .8011   .3130   .1581   .0862   .0474   .0250   .0119   .0046   .0010

Table 2.4.2  Values of α1 which maximize d, given fixed h; θ1 = 1.0

  h      α1       d
 .1     .2403   4.6754
 .2     .3090   1.5279
 .3     .3539    .6818
 .4     .3874    .3377
 .5     .4142    .1716
 .6     .4365    .0847
 .7     .4555    .0381
 .8     .4721    .0139
 .9     .4868    .0029

2.4.2  Non-Centrality Parameter

Another measure of discrepancy between the single exponential
and a mixture of two exponentials can be obtained from the
non-centrality parameter which evolves from power considerations in a
chi-square test of goodness of fit. If we define P0i as the probability
that the single exponential is contained in the i-th interval
[v_{i-1}, vi), i = 1,...,k; v0 = 0, vk = ∞,
and P1i as the probability that the mixture is contained in the i-th
interval,

    P1i = ∫_{v_{i-1}}^{vi} f(x) dx,

then the non-centrality parameter relevant for power considerations in
a chi-square test of goodness of fit, that the sample is from a single
exponential against the alternative that it is from a mixture, is of
the form:

    λ = n[Σ_{i=1}^k (P1i²/P0i) - 1].     (2.4.2)

Since λ is a function of the P1i, P0i it can be used as an indicator of
agreement between the two distributions, and after examination it will
be found that the results agree, in essence, with the observations made
about d.

The value of λ will be a function of the manner in which we
select our vi. A natural procedure is to choose the vi such that
P0i = 1/k for all i. Writing θ̄ = α1θ1 + α2θ2 for the common mean, this
gives vi = -θ̄ log(1 - i/k), and hence, since θ̄/θ1 = α1 + α2 h and
θ̄/θ2 = (α1 + α2 h)/h, we obtain the following:

    P1i = α1[(1 - (i-1)/k)^(α1+α2h) - (1 - i/k)^(α1+α2h)]
        + α2[(1 - (i-1)/k)^((α1+α2h)/h) - (1 - i/k)^((α1+α2h)/h)].

Thus λ can be written as

    λ = n[k Σ_{i=1}^k P1i² - 1].     (2.4.3)
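The following sketch evaluates λ/n from (2.4.3) for given α1, h, k, using the equiprobable cells defined under the matched single exponential (the function name is mine):

```python
def noncentrality_per_obs(a1, h, k):
    """lambda/n of equation (2.4.3), taking theta1 = 1 and theta2 = h."""
    a2 = 1.0 - a1
    e1 = a1 + a2 * h   # exponent theta_bar/theta1
    e2 = e1 / h        # exponent theta_bar/theta2
    total = 0.0
    for i in range(1, k + 1):
        u_prev, u = 1.0 - (i - 1) / k, 1.0 - i / k
        p1i = a1 * (u_prev ** e1 - u ** e1) + a2 * (u_prev ** e2 - u ** e2)
        total += p1i * p1i
    return k * total - 1.0
```

For α1 = .5, h = .5, k = 10 this gives λ/n ≈ .00506, in agreement with Table 2.4.3; at h = 1 every cell probability equals 1/k and the value is exactly zero.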
From tables (2.4.3) to (2.4.8) and by analytic investigation we
can observe the following general trends appearing:

(1)  λ assumes its minimum value of 0 when either α1 = 0, 1 or
h = 1.

(2)  For a fixed α1, the value of λ increases as h decreases
[see table (2.4.7)].

(3)  For a fixed h, the value of λ increases as α1 increases until
it reaches a maximum and then it decreases. As the fixed h approaches
1, the value of α1 where the maximum occurs approaches .5, and as h
approaches 0, the value of α1 where the maximum occurs also approaches
0, but at a slower rate [see table (2.4.8)].

(4)  The value of λ is bounded by n(k-1), since Σ_{i=1}^k P1i² ≤ 1
implies

    n[k Σ_{i=1}^k P1i² - 1] ≤ n(k-1),

and λ assumes this value when h → 0, α1 → 0. Thus

    lim_{α1→0} lim_{h→0} n[k Σ_{i=1}^k P1i² - 1] = n(k-1).
Table 2.4.3  Values of λ/n for α1, h = .1, .2, ..., .9; k = 10

 α1\h    .1        .2        .3        .4        .5        .6        .7        .8        .9
 .1    .155080   .030816   .008513   .002619   .000807   .000227   .000052   .000008   .000000
 .2    .330965   .077754   .023441   .007587   .002410   .000693   .000161   .000024   .000001
 .3    .429439   .112751   .036148   .012133   .003945   .001152   .000271   .000041   .000002
 .4    .444344   .126085   .042580   .014763   .004903   .001454   .000346   .000053   .000003
 .5    .394521   .118228   .041728   .014921   .005061   .001523   .000366   .000056   .000003
 .6    .304543   .094945   .034752   .012792   .004432   .001355   .000330   .000051   .000003
 .7    .198818   .063874   .024077   .009100   .003220   .001000   .000246   .000039   .000002
 .8    .099814   .032851   .012677   .004906   .001771   .000559   .000139   .000022   .000001
 .9    .027628   .009282   .003650   .001442   .000531   .000170   .000043   .000007   .000000

Table 2.4.4  Values of λ/n for α1, h = .1, .2, ..., .9; k = 32

 α1\h    .1        .2        .3        .4        .5        .6        .7        .8        .9
 .1    .196639   .045156   .013753   .004470   .001422   .000403   .000091   .000013   .000001
 .2    .420309   .114556   .037604   .012745   .004135   .001194   .000275   .000041   .000002
 .3    .525115   .156977   .054988   .019444   .006500   .001921   .000452   .000068   .000003
 .4    .523797   .164506   .060620   .022332   .007707   .002337   .000562   .000086   .000004
 .5    .452128   .145710   .055706   .021276   .007575   .002357   .000579   .000090   .000005
 .6    .342559   .111810   .043849   .017255   .006322   .002018   .000507   .000081   .000004
 .7    .221452   .072674   .028991   .011681   .004390   .001436   .000369   .000060   .000003
 .8    .110920   .036458   .014704   .006033   .002317   .000775   .000203   .000034   .000002
 .9    .030818   .010127   .004112   .001710   .000669   .000228   .000061   .000010   .000001
Table 2.4.5  Values of λ/n for α1, h = .1, .2, ..., .9; k = 62

 α1\h    .1        .2        .3        .4        .5        .6        .7        .8        .9
 .1    .241321   .060931   .019135   .006231   .001937   .000535   .000118   .000017   .000001
 .2    .493586   .144615   .048938   .016691   .005378   .001531   .000347   .000050   .000002
 .3    .589122   .186062   .067245   .024132   .008098   .002386   .000557   .000083   .000004
 .4    .566478   .185467   .070388   .026455   .009240   .002818   .000678   .000104   .000005
 .5    .476146   .158239   .062090   .024242   .008780   .002768   .000686   .000108   .000005
 .6    .354392   .118207   .047379   .019050   .007122   .002314   .000590   .000095   .000005
 .7    .226569   .075421   .030614   .012580   .004823   .001612   .000422   .000070   .000004
 .8    .112757   .037371   .015274   .006374   .002498   .000854   .000229   .000039   .000002
 .9    .031225   .010298   .004222   .001781   .000710   .000248   .000068   .000012   .000001

Table 2.4.6  Values of λ/n for α1, h = .1, .2, ..., .9; k = 122

 α1\h    .1        .2        .3        .4        .5        .6        .7        .8        .9
 .1    .311901   .084359   .026535   .008433   .002535   .000676   .000144   .000020   .000001
 .2    .594039   .182701   .062435   .021116   .006690   .001867   .000415   .000059   .000003
 .3    .666506   .218563   .080153   .028847   .009640   .002817   .000651   .000096   .000005
 .4    .611838   .206346   .079594   .030200   .010597   .003235   .000777   .000119   .000006
 .5    .498393   .169394   .067510   .026689   .009765   .003101   .000772   .000122   .000006
 .6    .363795   .123291   .050086   .020402   .007724   .002538   .000654   .000106   .000006
 .7    .229971   .077361   .031741   .013199   .005132   .001737   .000461   .000077   .000004
 .8    .113749   .037940   .015631   .006589   .002615   .000907   .000247   .000042   .000002
 .9    .031401   .010391   .004285   .001822   .000724   .000260   .000072   .000013   .000001
Table 2.4.7  Maximum value of λ/n, which occurs when h → 0.0,
for α1 fixed and various values of k

  α1      k=10       k=32       k=62       k=122
 .001    8.9800    30.9361    60.8761    120.7562
 .01     8.8022    30.3664    59.7721    118.5835
 .05     8.0499    27.9415    55.0626    109.2989
 .1      7.1826    25.1001    49.5129     98.3104
 .2      5.6338    19.8833    39.2490     77.9029
 .3      4.2871    15.2277    30.0604     59.6385
 .4      3.1261    11.1612    22.0417     43.7364
 .5      2.1499     7.7180    15.2584     30.3002
 .6      1.3598     4.9134     9.7304     19.3460
 .7       .7546     2.7478     5.4536     10.8583
 .8       .3304     1.2140     2.4154      4.8164
 .9       .0813      .3017      .6019      1.2020
 .95      .0202      .0752      .1502      .3002
 .99      .0008      .0030      .0060      .0120
Table 2.4.8  Values of α1 which maximize λ/n, for h fixed
and various values of k

               k=10                 k=32                 k=62                 k=122
    h       α1     λ/n           α1     λ/n           α1     λ/n           α1     λ/n
 .000001  .00011  8.997633     .00033 30.977141     .00060 60.918585     .00109 120.703660
 .00001   .0009   8.980649     .003   30.804087     .0045  60.364270     .0081  118.736500
 .0001    .007    8.849945     .018   29.671565     .031   56.526897     .053   105.888560
 .001     .05     7.989613     .11    23.094980     .16    37.264288     .23     53.355204
 .01      .21     4.348328     .30     6.658117     .32     7.177619     .32      7.449146
 .05      .34     1.114176     .34     1.285145     .32     1.373033     .30      1.487082
 .1       .37      .447726     .35      .535516     .32      .592261     .30      .666506
 .2       .41      .126185     .37      .165521     .35      .190100     .32      .219137
 .3       .44      .043086     .40      .060620     .37      .070873     .34      .081773
 .4       .46      .015157     .42      .022416     .40      .026455     .37      .030398
 .5       .47      .005101     .44      .007810     .42      .009268     .40      .010597
 .6       .48      .001529     .45      .002396     .44      .002853     .42      .003249
 .7       .49      .000367     .47      .000584     .46      .000696     .45      .000791
 .8       .49      .000056     .48      .000091     .47      .000108     .47      .000123
 .9       .50      .000003     .49      .000005     .49      .000005     .48      .000006
2.4.3  Appropriate Tests

We will briefly discuss the likelihood ratio test and the
chi-square goodness of fit test, useful for testing whether we have a
single exponential or a mixture. Some computational observations
derived from performing these tests will also be provided.

If we test, using the likelihood ratio criterion, the null
hypothesis that we have a single exponential, f(x) = (1/θ) e^(-x/θ),
versus the alternative hypothesis that we have a mixture of two
exponentials, f(x) = (α1/θ1) e^(-x/θ1) + (α2/θ2) e^(-x/θ2), we will
use the following procedure:

(1)  Assuming we have a single exponential, obtain the maximum
likelihood estimate of θ, θ̂ = Σ_{i=1}^n xi/n, and calculate the
corresponding likelihood function using the estimate in place of the
true mean, i.e.,

    L(θ̂) = (1/θ̂^n) exp(-Σ_{i=1}^n xi/θ̂).

(2)  Assuming we have a mixture of exponentials, obtain the
maximum likelihood estimates of α1, θ1, θ2, i.e., α̂1, θ̂1, θ̂2, using
(2.2.6), (2.2.7), and calculate the corresponding likelihood function
using the estimates in place of the true parameters, i.e.,

    L(α̂1, θ̂1, θ̂2) = Π_{i=1}^n {(α̂1/θ̂1) e^(-xi/θ̂1) + (α̂2/θ̂2) e^(-xi/θ̂2)}.

(3)  Form the ratio ℓ = L(θ̂)/L(α̂1, θ̂1, θ̂2) and note that
asymptotically -2 log ℓ is distributed as a chi-square distribution
with 2 degrees of freedom. The two degrees of freedom used to test the
null hypothesis is equal to the difference in the number of parameters
estimated under the alternative and the null hypothesis.
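The three steps can be sketched in code, reusing the successive substitution updates for step (2); for two degrees of freedom the chi-square tail probability has the closed form exp(-x/2). All names are illustrative, and the thesis starting values θ̂ ± 1 assume the sample mean exceeds one.

```python
import math

def log_lik_mixture(data, a1, t1, t2):
    """Log of the mixture likelihood used in steps (1) and (2)."""
    return sum(math.log(a1 / t1 * math.exp(-x / t1)
                        + (1 - a1) / t2 * math.exp(-x / t2)) for x in data)

def likelihood_ratio_test(data, iterations=300):
    n = len(data)
    mean = sum(data) / n
    loglik_null = -n * math.log(mean) - n          # step (1): log L(theta_hat)
    a1, t1, t2 = 0.5, mean + 1.0, mean - 1.0       # thesis starting values
    for _ in range(iterations):                    # step (2): (2.2.6)-(2.2.7)
        f = [a1 / t1 * math.exp(-x / t1) + (1 - a1) / t2 * math.exp(-x / t2)
             for x in data]
        w1 = [math.exp(-x / t1) / fx for x, fx in zip(data, f)]
        w2 = [math.exp(-x / t2) / fx for x, fx in zip(data, f)]
        a1 = a1 / (n * t1) * sum(w1)
        t1 = sum(x * w for x, w in zip(data, w1)) / sum(w1)
        t2 = sum(x * w for x, w in zip(data, w2)) / sum(w2)
    stat = -2.0 * (loglik_null - log_lik_mixture(data, a1, t1, t2))  # step (3)
    return stat, math.exp(-stat / 2.0)             # chi-square(2) tail prob.
```

Applied to a sample from a well-separated mixture such as α1 = .5, θ1 = 10, θ2 = 1, the statistic should land in the range Table 2.4.9 reports, with a tail probability near zero.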
We can also check whether we have a single exponential or not by
using, instead of a likelihood ratio test, a chi-square goodness of fit
test. Following the recommendations of Kendall and Stuart (1960) we
obtain the following operating procedure:

(1)  Use approximately equi-probable intervals.

(2)  Determine the number of intervals, when n exceeds 200,
approximately by

    k = b {√2 (n-1)/(λp + Φ⁻¹(P0))}^(2/5),     (2.4.4)

with b between 2 and 4, λp defined by

    p = ∫_{λp}^∞ (1/√(2π)) e^(-x²/2) dx = 1 - Φ(λp),

and P0 the asymptotic power of the test. Some examples of the resulting
k for various parameter values are as follows:

    p = .01,  P0 = .95,  n = 200,   b = 2  →  k = 11
    p = .01,  P0 = .95,  n = 1000,  b = 2  →  k = 21
    p = .10,  P0 = .5,   n = 200,   b = 4  →  k = 35
    p = .10,  P0 = .5,   n = 1000,  b = 4  →  k = 66
    p = .25,  P0 = .5,   n = 2000,  b = 4  →  k = 112
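Formula (2.4.4) is easy to evaluate with the standard normal quantile function; a sketch (the helper name is mine):

```python
import math
from statistics import NormalDist

def num_intervals(p, power, n, b):
    """Number of equi-probable cells per (2.4.4):
    k = b * (sqrt(2) * (n-1) / (lambda_p + PHI^-1(P0)))^(2/5)."""
    lam_p = NormalDist().inv_cdf(1.0 - p)   # upper-tail point: 1 - PHI(lam_p) = p
    return b * (math.sqrt(2.0) * (n - 1)
                / (lam_p + NormalDist().inv_cdf(power))) ** 0.4

print(round(num_intervals(0.01, 0.95, 200, 2)))  # 11, as in the text
```

Rounding the result reproduces the five worked examples above.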
(3)  If we estimated θ from grouped data, with the interval
boundaries set at vi = -x̄ log(1 - i/k), we would then have a
likelihood function of the form

    L = Π_{i=1}^k [{1 - (i-1)/k}^(x̄/θ) - {1 - i/k}^(x̄/θ)]^(ni),

and would thus have to solve the following equation to obtain a maximum
likelihood estimate:

    Σ_{i=1}^k ni { -[1 - (i-1)/k]^(x̄/θ) log[1 - (i-1)/k]
                   + [1 - i/k]^(x̄/θ) log[1 - i/k] }
              / { [1 - (i-1)/k]^(x̄/θ) - [1 - i/k]^(x̄/θ) } = 0,
46
wheren i is the number of observed failures in
instead~
[Vi_l~Vi].
If we
use~
a maximum likelihood estimate from the ungrouped data and then
k
calculate the chi-square statistic~X2 = 2:
i=l
the distribution of X2 is bounded between a
we know that
variable~
and
ask becomes large these are so close together that the difference can be
2
exceeds the value of Xf~l,P we reject; if x is less
2
than the ,value of Xf-2 ~p ~ we accept ~ but if x is between ,the values of
ignored.
Thus if
x2
Xf-2~P and Xf-l~P~ we are unable to make a decision.
the probability that x
2
isbetweenXf_2 ~p and
To get some idea of
Xf-'-l~P under Ho ~ we have
the following upper and lower bounds for various values of k and p.
kill 32
k=lO
p=.l
k=62
k=122
.0344l~
.04690
.01987~
.02337
.01458~
.01636
•01059 ~ .01149
.'05
.01904~
.02792
.01l27~
.01383
.00835, .00966
•00611 ~ .• 00678
.01
.00443~
.00737
.00274~
.00362
.O0206~
.00252
•00153 ~ .00176
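Computing the statistic with equi-probable cells is straightforward once θ̂ is in hand; a sketch, where the boundary choice vi = -θ̂ log(1 - i/k) makes the fitted single exponential put probability 1/k in each cell (function and variable names are mine):

```python
import math

def chi_square_stat(data, theta_hat, k):
    """Pearson X^2 against a fitted exponential, using k equi-probable cells."""
    n = len(data)
    counts = [0] * k
    for x in data:
        # cell index from the fitted CDF: i = floor(k * (1 - exp(-x/theta_hat)))
        i = min(int(k * (1.0 - math.exp(-x / theta_hat))), k - 1)
        counts[i] += 1
    expected = n / k
    return sum((c - expected) ** 2 / expected for c in counts)
```

Under H0 the statistic behaves like a chi-square with between k-2 and k-1 degrees of freedom, so values far above k signal a mixture.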
We may also consider the power of the chi-square test using the
non-centrality parameter considered in Section 2.4.2, since the
distribution of X², when nP1i, and not nP0i, are the true expected
frequencies, is a non-central chi-square with k-1 degrees of freedom
and non-centrality parameter

    λ = n Σ_{i=1}^k (P1i - P0i)²/P0i,

which agrees with (2.4.2) since Σ P1i = Σ P0i = 1. Thus we have
available the distribution of X² when we have a mixture of exponentials
instead of the hypothesized single exponential. Approximations to the
non-central chi-square density may be found in Kendall and Stuart
(1960).

The power function can be used to answer the following questions:
(a) For a given n, k, p, what is the probability of establishing the
inadequacy of H0?  (b) For a given k, p, we may ask how many
observations are necessary to establish significance at the 100p
percent level with probability, say, .95.  (c) For a given k, n, p, we
may ask how large must λ be to be detected with probability, say, .50.
Note that a chi-square goodness of fit test was used with
continuous data, rather than a Kolmogorov-Smirnov test, because the
standard Kolmogorov-Smirnov tables are extremely conservative when
parameters have to be estimated. However, Lilliefors (1969) does obtain
critical values, when the mean in an exponential distribution is
estimated, from Monte Carlo calculations. His table is not as readily
available nor as extensive as the corresponding chi-square tables.
Also, the Kolmogorov-Smirnov procedure will not generalize to handle
the double censoring situation in the next chapter, while the
chi-square goodness of fit approach will do so easily.

For a variety of parameter combinations we will first offer
general observations based on limited numerical work, and then present,
in tables (2.4.9) and (2.4.10), actual results of the estimation
procedure and calculation of test statistics for simulated data. We
have restricted our numerical work to samples of size 400 and 1000.
One is not surprised to learn that the observations obtained confirm
the previous conclusions we have made, i.e., for h ≤ 1/3, α1 = .5 we
usually obtained highly significant results from both the likelihood
ratio and chi-square tests; for h = 1/3, α1 = .1 or h = 1/2, α1 = .5 we
have a transition phase, with the likelihood ratio having higher p
values than the chi-square test; for α1 = .1, h = 1/2 or h > 1/2 we
usually did not obtain significance at the 10 percent level for either
test.
Table 2.4.9.  Observations made with simulated data; n = 400

Generated parameters   Estimated mean      Estimated parameters     -2logΛ
 α      θ_1     θ_2    for a single exp.    α̂      θ̂_1     θ̂_2    (χ²_2)
.5     10.0     1.0        5.100            .50     9.19    1.07    138.631
.5      5.0     1.0        3.039            .50     5.13     .96     70.544
.5      3.0     1.0        1.889            .56     2.73     .82     22.067
.1      3.0     1.0        1.234            .11     3.06    1.01     12.586
.5      2.0     1.0        1.436            .50     1.84    1.04      1.948
.5    200.0   100.0      152.814            .41   233.56   96.36
.1      2.0     1.0        1.065            .13     1.41    1.01
.9      2.0     1.0        1.922            .91     1.94    1.76
.5      4.0     3.0        3.643            .50     3.71    3.57      0.0
.1      4.0     3.0        3.305            .10     3.66    3.26      0.0
.5     10.0     9.0        9.459            .50     9.66    9.26      0.0
.1     10.0     9.0        9.339            .10    10.22    9.24      0.0

[The χ² goodness-of-fit values, computed with 10, 32, and 62 intervals, are not legible in this copy.]
Table 2.4.10.  Observations made with simulated data; n = 1000

Generated parameters   Estimated mean      Estimated parameters      -2logΛ
 α      θ_1     θ_2    for a single exp.    α̂       θ̂_1      θ̂_2    (χ²_2)
.5     10.0     1.0        5.549           .498    10.030    1.111   371.0097
.5      5.0     1.0        3.025           .453     5.450    1.018   195.1308
.5      3.0     1.0        2.034           .469     3.297     .917    82.9423
.1      3.0     1.0        1.229           .047     4.039    1.089    27.0495
.5      2.0     1.0        1.481           .465     2.113     .930    18.9245
.5    200.0   100.0      152.328           .396   231.269  108.850    29.34
.1      2.0     1.0        1.161           .028     3.2807   1.109
.9      2.0     1.0        1.973           .863     2.137     .944     2.2360
.5      4.0     3.0        3.459           .500     3.498    3.419     0.0
.1      4.0     3.0        2.961           .098     3.366    2.917     0.0
.5     10.0     9.0       10.027           .497    11.437    8.633     0.4508
.1     10.0     9.0        9.186           .100     9.619    9.138     0.0

[The χ² goodness-of-fit values, computed with 10, 32, 62, and 122 intervals, are not legible in this copy.]
Tables (2.4.9) and (2.4.10) illustrate the above remarks with some examples worked out in detail. Note that in performing the chi-square goodness of fit test, we estimated the mean using ungrouped data, and hence the distribution of X² is bounded between a χ²_{k-1} and a χ²_{k-2} variable. In the tables we shall denote values significant at the 10, 5, and 1 percent levels by superscripts *, **, and ***, respectively. If X² is between the value of, say, χ²_{k-2,.10} and χ²_{k-1,.10} we will denote this as *.

2.5  Implementation of the Procedure
Given a set of failure times we first perform a chi-square goodness of fit test for a single exponential utilizing the recommendations set forth in Section 2.4.3, i.e., we use k approximately equi-probable intervals, k determined from formula (2.4.4), estimate the mean from ungrouped data, and note that the distribution of the statistic X² is bounded between a χ²_{k-1} and a χ²_{k-2} variable. If we fail to reject the hypothesis that a single exponential is adequate, we estimate the mean and then make statements only about the complete sample. If we reject the hypothesis, we estimate the three parameters using the iterative scheme proposed in Section 2.2.
2.5.1  Initial Estimates

We shall obtain initial estimates using a method of moments, for several cases. The cases will represent situations in which one or more parameters are already known approximately, and thus initial values do not have to be calculated for them. In Section 2.2.1 we observed, from computational experience, that for h < .2 the starting values are not too critical, and using

α^{(0)} = .5,   θ_1^{(0)} = Σ_{i=1}^{n} x_i/n + 1,   θ_2^{(0)} = Σ_{i=1}^{n} x_i/n - 1

proved to be satisfactory. However, since good starting values will reduce the number of iterations, and since the computations necessary to obtain these values are minimal, it will be advantageous to use them for all h.
Note that as h → 1, the initial estimates obtained may prove to be nonsensical, i.e., estimates for α_1 may be greater than 1 or negative, and estimates for θ_1, θ_2 may be negative. However, for these situations we might be dealing essentially with a single exponential, for which we would not need initial estimates, since we can obtain explicit maximum likelihood solutions.
If we note that

E(X) = α_1θ_1 + α_2θ_2                                    (2.5.1)

E(X²) = 2α_1θ_1² + 2α_2θ_2²                               (2.5.2)

E(X³) = 6α_1θ_1³ + 6α_2θ_2³                               (2.5.3)

and define m_l = (1/n) Σ_{i=1}^{n} x_i^l, we may estimate E(X), E(X²), E(X³) by m_1, m_2, m_3, respectively. Note that when two parameters are known we only require equation (2.5.1); similarly, when one parameter is known we require (2.5.1) and (2.5.2), and when all three parameters are unknown we require (2.5.1), (2.5.2), (2.5.3). Only the estimates for the most commonly occurring case, all three parameters unknown, will be derived here; the remaining cases will be dealt with in Appendix II.
The case when α_1, θ_1, θ_2 are all unknown has been treated in some detail by Rider (1961). From equation (2.5.1) we obtain

α_1* = (m_1 - θ_j*)/(θ_i* - θ_j*),   i ≠ j = 1,2.          (2.5.4)
Substituting (2.5.4) into (2.5.2), (2.5.3) leads to the following:

m_2 = 2[m_1(θ_1* + θ_2*) - θ_1*θ_2*]                               (2.5.5)

m_3 = 6[m_1(θ_1*² + θ_1*θ_2* + θ_2*²) - θ_1*θ_2*(θ_1* + θ_2*)]     (2.5.6)

Equation (2.5.5) may be solved for θ_i* (i = 1,2), the solution being

θ_i* = (m_2 - 2m_1θ_j*)/[2(m_1 - θ_j*)],   i ≠ j.                  (2.5.7)

Substituting θ_1* from (2.5.7) into (2.5.6) and simplifying yields

6(2m_1² - m_2)θ_j*² + 2(m_3 - 3m_1m_2)θ_j* + 3m_2² - 2m_1m_3 = 0.  (2.5.8)

Solving the quadratic equation (2.5.8) and substituting into (2.5.7), (2.5.4), we obtain two identical sets of estimates. Alternatively, we could just use the two roots of (2.5.8) as estimates for θ_1 and θ_2 and then substitute these values into (2.5.4) to estimate α_1.
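The alternative just described—taking the two roots of (2.5.8) as θ_1*, θ_2* and recovering α_1* from (2.5.4)—is mechanical enough to sketch in code. The function name below is our own, not the thesis's.

```python
import math

def moment_initial_estimates(m1, m2, m3):
    """Starting values (alpha1*, theta1*, theta2*) from the first three
    sample moments m1, m2, m3, via Rider's method of moments."""
    # quadratic (2.5.8): 6(2 m1^2 - m2) t^2 + 2(m3 - 3 m1 m2) t
    #                      + 3 m2^2 - 2 m1 m3 = 0
    a = 6.0 * (2.0 * m1 * m1 - m2)
    b = 2.0 * (m3 - 3.0 * m1 * m2)
    c = 3.0 * m2 * m2 - 2.0 * m1 * m3
    d = math.sqrt(b * b - 4.0 * a * c)
    roots = sorted([(-b - d) / (2.0 * a), (-b + d) / (2.0 * a)])
    th2, th1 = roots                       # larger root taken as theta1*
    # (2.5.4): alpha1* = (m1 - theta2*) / (theta1* - theta2*)
    alpha1 = (m1 - th2) / (th1 - th2)
    return alpha1, th1, th2
```

With the exact moments of the mixture α_1 = .5, θ_1 = 2, θ_2 = 1 (m_1 = 1.5, m_2 = 5, m_3 = 27) this returns (.5, 2, 1), as it should.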
2.5.2  Example

We generated 1000 random mixture-of-exponential numbers with parameters α_1 = .5, θ_1 = 10.0, θ_2 = 1.0. First we performed a chi-square goodness of fit test in order to see if the data could be adequately fitted by a single exponential. If we let b = 3, n = 1000, p = .05, P_0 = .95 in expression (2.4.4) then k = 34, i.e., we have 34 approximately equi-probable intervals. The resulting value of X² is 446.02, which is highly significant. Thus, we reject the null hypothesis that we have a single exponential and will proceed to estimate the parameters of the mixture of two exponentials.
The moment estimators obtained from (2.5.4) and (2.5.8) are α_1* = .5220, θ_1* = 10.1675, θ_2* = .7184 and were used as initial estimates in the iterative process, equations (2.2.6), (2.2.7). The details of the iterative scheme can be seen in Table (2.5.1). Note that not all the iterations are given in the table, but just enough to give some idea of how convergence proceeds. The likelihood ratio criterion for testing the null hypothesis of a single exponential versus the alternative of a mixture of two exponentials was also calculated and yielded -2logΛ = 401.7541, which is also highly significant. Expression (2.3.11) was used to obtain the exact asymptotic covariance matrix with S_2 = .4611, S_1 = .0874.
The resulting matrix is of the form:

var(α)       =  8.336 x 10^-4
cov(θ_1,α)   = -9.794 x 10^-3    var(θ_1)      =  4.002 x 10^-1
cov(θ_2,α)   = -1.562 x 10^-3    cov(θ_1,θ_2)  =  2.408 x 10^-2    var(θ_2) = 8.679 x 10^-3

Note that as the parameter combinations approach values for which we have convergence problems, the information matrix will become ill-conditioned and we will have trouble taking its inverse. An approximation to the asymptotic covariance matrix was also obtained by using the calculated values of the second partials (2.3.1) to (2.3.4) in place of the expected values of the second partials of the log of the likelihood function, in the previously defined information matrix. The resulting matrix is of the form:
Table 2.5.1.  Details of the iterative process

Iteration    α_1       θ_1       θ_2     ∂logL/∂α_1   ∂logL/∂θ_1   ∂logL/∂θ_2       logL
    0      .5220    10.1675    .7184     104.1983      -2.9560     115.9209    -2548.7410
    1      .5480     9.6099    .8508      -4.3856      -0.2186      41.0453    -2536.6442
    2      .5469     9.5729    .9163     -27.7187       0.4451      22.2380    -2534.5776
    3      .5400     9.6485    .9569     -31.3653       0.5782      15.1707    -2533.5789
    4      .5323     9.7496    .9866     -29.5154       0.5577      11.6893    -2532.8859
    5      .5249     9.8506   1.0106     -26.2671       0.4964       9.5364    -2532.3737
    6      .5184     9.9435   1.0308     -22.8985       0.4296       7.9763    -2531.9927
    8      .5077    10.0993   1.0632     -17.0489       0.3141       5.7271    -2531.4999
   10      .4998    10.2182   1.0876     -12.5874       0.2281       4.1550    -2531.2302
   12      .4939    10.3077   1.1058      -9.2668       0.1658       3.0243    -2531.0834
   14      .4896    10.3744   1.1193      -6.8117       0.1207       2.2051    -2531.0039
   16      .4865    10.4240   1.1293      -5.0017       0.0880       1.6097    -2530.9609
   18      .4841    10.4606   1.1367      -3.6700       0.0642       1.1761    -2530.9377
   20      .4824    10.4876   1.1422      -2.6914       0.0469       0.8598    -2530.9252
   24      .4802    10.5222   1.1492      -1.4459       0.0251       0.4602    -2530.9149
   28      .4791    10.5408   1.1529      -0.7760       0.0134       0.2465    -2530.9119
   32      .4785    10.5508   1.1550      -0.4161       0.0072       0.1322    -2530.9111
   36      .4782    10.5562   1.1561      -0.2230       0.0039       0.0709    -2530.9109
   40      .4780    10.5591   1.1566      -0.1193       0.0021       0.0380    -2530.9108
   44      .4779    10.5606   1.1570      -0.0637       0.0011       0.0204    -2530.9108
   48      .4778    10.5615   1.1571      -0.0339       0.0006       0.0109    -2530.9108
   52      .4778    10.5619   1.1572      -0.0179       0.0003       0.0059    -2530.9107
   53      .4778    10.5620   1.1572      -0.0152       0.0003       0.0050    -2530.9107
   54      .4778    10.5621   1.1572      -0.0129       0.0002       0.0043    -2530.9107
   55      .4778    10.5621   1.1572      -0.0109       0.0002       0.0037    -2530.9107
   56      .4778    10.5622   1.1573      -0.0093       0.0002       0.0031    -2530.9107
var(α)       =  1.308 x 10^-3
cov(θ_1,α)   = -1.583 x 10^-2    var(θ_1)      =  4.776 x 10^-1
cov(θ_2,α)   = -2.725 x 10^-3    cov(θ_1,θ_2)  =  3.940 x 10^-2    var(θ_2) = 1.173 x 10^-2
CHAPTER III

DOUBLE CENSORING

3.1  Introduction

In this chapter we will be concerned with samples in which either one or both tails are subject to censoring as defined in Section 1.2.1. Section 3.2 will contain a successive substitution iteration scheme for estimating the three parameters of the mixture of exponentials, under type I and type II double censoring. The exact asymptotic covariance matrix for these estimators will be obtained in Section 3.3. We will offer an operating procedure to implement the iterative scheme in Section 3.4 and illustrate it on some artificial data.

Note that the conclusions reached in the previous chapter for complete samples will also hold in the case of censoring, provided the censoring is not too extreme. Thus, the parameter combinations encountered in the complete sample case, for which convergence is difficult to attain, the variances of the estimates are excessive, and a single exponential is an adequate approximation to the mixture, will also cause difficulties for double censoring and the other general forms of censoring considered in the succeeding two chapters.
3.2  Iterative Scheme

3.2.1  Type I

We are considering a situation in which n items are put on test and only N failures are observed. The remaining (n-N) observations are censored, with r_1 being known to have failed before T_1, the left censoring point, and r_2 (r_1 + r_2 + N = n) being known to have failed after T_2, the right censoring point. Note that whereas T_1, T_2 are constants, r_1, r_2 are random variables. From expression (1.2.3) we obtain

logL = Σ_{i=1}^{N} log f(x_i) + r_1 log[1 - α_1 e^{-T_1/θ_1} - α_2 e^{-T_1/θ_2}] + r_2 log[α_1 e^{-T_2/θ_1} + α_2 e^{-T_2/θ_2}] + constant.   (3.2.2)
By taking partial derivatives of (3.2.2), we derive the following maximum likelihood estimating equations:

∂logL/∂α_1 = Σ_{i=1}^{N} {[(1/θ_1)e^{-x_i/θ_1} - (1/θ_2)e^{-x_i/θ_2}]/f(x_i)} - r_1[e^{-T_1/θ_1} - e^{-T_1/θ_2}]/F(T_1) + r_2[e^{-T_2/θ_1} - e^{-T_2/θ_2}]/[1-F(T_2)] = 0   (3.2.3)

∂logL/∂θ_j = Σ_{i=1}^{N} {α_j[(x_i-θ_j)/θ_j³] e^{-x_i/θ_j}/f(x_i)} - r_1 α_j (T_1/θ_j²) e^{-T_1/θ_j}/F(T_1) + r_2 α_j (T_2/θ_j²) e^{-T_2/θ_j}/[1-F(T_2)] = 0,   j = 1,2.   (3.2.4)
Multiplying equation (3.2.3) by α_1 and adding

Σ_{i=1}^{N} {(1/θ_2) e^{-x_i/θ_2}/f(x_i)} + r_1[1 - e^{-T_1/θ_2}]/F(T_1) + r_2 e^{-T_2/θ_2}/[1-F(T_2)]

to both sides yields

N + r_1 + r_2 = n = α_1{Σ_{i=1}^{N} (1/θ_1)e^{-x_i/θ_1}/f(x_i) + r_1[1-e^{-T_1/θ_1}]/F(T_1) + r_2 e^{-T_2/θ_1}/[1-F(T_2)]}
              + α_2{Σ_{i=1}^{N} (1/θ_2)e^{-x_i/θ_2}/f(x_i) + r_1[1-e^{-T_1/θ_2}]/F(T_1) + r_2 e^{-T_2/θ_2}/[1-F(T_2)]}.   (3.2.5)

Since from (3.2.3) we also have that the two bracketed quantities in (3.2.5) are equal,   (3.2.6)

we then obtain

α̂_1 = (1/n){Σ_{i=1}^{N} (α_1/θ_1)e^{-x_i/θ_1}/f(x_i) + r_1 α_1[1-e^{-T_1/θ_1}]/F(T_1) + r_2 α_1 e^{-T_2/θ_1}/[1-F(T_2)]}.   (3.2.7)

From (3.2.4) we note that

θ̂_j = {Σ_{i=1}^{N} x_i(α_j/θ_j)e^{-x_i/θ_j}/f(x_i) - r_1 α_j T_1 e^{-T_1/θ_j}/F(T_1) + r_2 α_j T_2 e^{-T_2/θ_j}/[1-F(T_2)]} / {Σ_{i=1}^{N} (α_j/θ_j)e^{-x_i/θ_j}/f(x_i)}.   (3.2.8)

Then (3.2.7), (3.2.8) will determine a successive substitution iteration scheme.
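A runnable sketch of one pass of this scheme follows, under our reading of (3.2.7), (3.2.8); the function name and argument conventions are ours, not the thesis's. With r_1 = r_2 = 0 it reduces to the complete-sample scheme of Chapter 2.

```python
import math

def iterate_once(x, a, t1, t2, r1=0, T1=None, r2=0, T2=None):
    """One successive-substitution update of (alpha1, theta1, theta2)
    under type I double censoring: x holds the N observed failure times,
    r1 items are censored on the left at T1, r2 on the right at T2."""
    n = len(x) + r1 + r2
    sw1 = sx1 = sw2 = sx2 = 0.0
    for xi in x:
        f = a * math.exp(-xi / t1) / t1 + (1 - a) * math.exp(-xi / t2) / t2
        w1 = a * math.exp(-xi / t1) / (t1 * f)   # component-1 weight of x_i
        sw1 += w1; sx1 += xi * w1
        sw2 += 1 - w1; sx2 += xi * (1 - w1)
    num_a, num1, num2 = sw1, sx1, sx2
    if r1:                                       # left-censoring terms
        F1 = 1 - a * math.exp(-T1 / t1) - (1 - a) * math.exp(-T1 / t2)
        num_a += r1 * a * (1 - math.exp(-T1 / t1)) / F1
        num1 -= r1 * a * T1 * math.exp(-T1 / t1) / F1
        num2 -= r1 * (1 - a) * T1 * math.exp(-T1 / t2) / F1
    if r2:                                       # right-censoring terms
        G = a * math.exp(-T2 / t1) + (1 - a) * math.exp(-T2 / t2)  # 1 - F(T2)
        num_a += r2 * a * math.exp(-T2 / t1) / G
        num1 += r2 * a * T2 * math.exp(-T2 / t1) / G
        num2 += r2 * (1 - a) * T2 * math.exp(-T2 / t2) / G
    return num_a / n, num1 / sw1, num2 / sw2
```

For a complete sample the update preserves the first moment exactly: after a single pass, α_1θ_1 + α_2θ_2 equals the sample mean.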
If we examine the maximum likelihood estimating equations (3.2.3), (3.2.4) when we have no observed failures, i.e., N = 0, we obtain (3.2.9), (3.2.10). From (3.2.10), j = 1,2, we have (3.2.11). But comparing (3.2.9) and (3.2.11) we see that we will have an inconsistent set of equations unless N > 0 whenever T_1 ≠ T_2.
3.2.2  Type II

For type II double censoring the points at which the censoring takes place, y_1, y_N, are random variables, while the numbers censored in each tail are constant. From (1.2.4) we obtain a log likelihood function of the same form. All the previous expressions (3.2.3) to (3.2.8), used to derive the successive substitution iterative scheme for type I censoring, also hold for type II censoring, if we replace x_i, T_1, T_2 by y_i, y_1, y_N. Also note that if we choose r_1, r_2 such that r_1 + r_2 < n, no inconsistencies arise in the maximum likelihood estimating equations.
3.3  Asymptotic Covariance Matrix

3.3.1  Type I

As in Section 2.3 the asymptotic covariance matrix can be written as the inverse of the information matrix. The requisite second partial derivatives are derived from the first partials (3.2.3), (3.2.4) and are as follows:
∂²logL/∂α_1² = -Σ_{i=1}^{N} {[(1/θ_1)e^{-x_i/θ_1} - (1/θ_2)e^{-x_i/θ_2}]²/f²(x_i)} - r_1[e^{-T_1/θ_1} - e^{-T_1/θ_2}]²/F²(T_1) - r_2[e^{-T_2/θ_1} - e^{-T_2/θ_2}]²/[1-F(T_2)]²   (3.3.1)

∂²logL/∂θ_j∂α_1 = (-1)^{j-1} Σ_{i=1}^{N} {(x_i-θ_j)e^{-x_i/θ_j}/[θ_j³ f(x_i)]} - Σ_{i=1}^{N} {α_j[(x_i-θ_j)/θ_j³]e^{-x_i/θ_j}[(1/θ_1)e^{-x_i/θ_1} - (1/θ_2)e^{-x_i/θ_2}]/f²(x_i)}
   - (-1)^{j-1} r_1(T_1/θ_j²)e^{-T_1/θ_j}/F(T_1) - r_1 α_j(T_1/θ_j²)e^{-T_1/θ_j}[e^{-T_1/θ_1}-e^{-T_1/θ_2}]/F²(T_1)
   + (-1)^{j-1} r_2(T_2/θ_j²)e^{-T_2/θ_j}/[1-F(T_2)] - r_2 α_j(T_2/θ_j²)e^{-T_2/θ_j}[e^{-T_2/θ_1}-e^{-T_2/θ_2}]/[1-F(T_2)]²,   j = 1,2   (3.3.2)

∂²logL/∂θ_1∂θ_2 = -Σ_{i=1}^{N} {α_1α_2[(x_i-θ_1)/θ_1³][(x_i-θ_2)/θ_2³] e^{-x_i/θ_1}e^{-x_i/θ_2}/f²(x_i)} - r_1 α_1α_2 (T_1²/θ_1²θ_2²) e^{-T_1/θ_1}e^{-T_1/θ_2}/F²(T_1) - r_2 α_1α_2 (T_2²/θ_1²θ_2²) e^{-T_2/θ_1}e^{-T_2/θ_2}/[1-F(T_2)]²   (3.3.3)

∂²logL/∂θ_j² = Σ_{i=1}^{N} { α_j e^{-x_i/θ_j}[x_i(x_i-θ_j)/θ_j² - 3(x_i-θ_j)/θ_j - 1]/[θ_j³ f(x_i)] - [α_j(x_i-θ_j)e^{-x_i/θ_j}/(θ_j³ f(x_i))]² }
   - r_1 α_j[T_1²/θ_j⁴ - 2T_1/θ_j³]e^{-T_1/θ_j}/F(T_1) - r_1{α_j(T_1/θ_j²)e^{-T_1/θ_j}/F(T_1)}²
   + r_2 α_j[T_2²/θ_j⁴ - 2T_2/θ_j³]e^{-T_2/θ_j}/[1-F(T_2)] - r_2{α_j(T_2/θ_j²)e^{-T_2/θ_j}/[1-F(T_2)]}²,   j = 1,2.   (3.3.4)
Recalling from Section 1.2.1, expressions (1.2.1), (1.2.2), that

f(x_1,…,x_N | N) = Π_{i=1}^{N} {f(x_i)/[F(T_2)-F(T_1)]}

h(N,r_1,r_2) = [n!/(r_1! r_2! N!)] [F(T_1)]^{r_1} [F(T_2)-F(T_1)]^{N} [1-F(T_2)]^{r_2},
then E[g(x_1,…,x_N,r_1,r_2,N)] = Σ_{N+r_1+r_2=n} E[g(x_1,…,x_N,r_1,r_2,N) | N,r_1,r_2] h(N,r_1,r_2), and E(r_1) = nF(T_1), E(r_2) = n[1-F(T_2)], E(N) = n[F(T_2)-F(T_1)]. We will use these results in determining the expected values of the second partial derivatives. In obtaining the expected value of -∂²logL/∂α_1², expression (3.3.1), we note the following:

E[Σ_{i=1}^{N} {[(1/θ_1)e^{-x_i/θ_1} - (1/θ_2)e^{-x_i/θ_2}]²/f²(x_i)} | N,r_1,r_2]

= N/[F(T_2)-F(T_1)] ∫_{T_1}^{T_2} [(1/θ_1)e^{-x/θ_1} - (1/θ_2)e^{-x/θ_2}]²/f(x) dx

= N/{α_1α_2[F(T_2)-F(T_1)]} [α_2{e^{-T_1/θ_1} - e^{-T_2/θ_1}} + α_1{e^{-T_1/θ_2} - e^{-T_2/θ_2}} - ∫_{T_1}^{T_2} {(1/θ_1θ_2)e^{-x/θ_1}e^{-x/θ_2}/f(x)} dx].

If we define

S_1(T_1,T_2) = ∫_{T_1}^{T_2} {(1/θ_1θ_2) e^{-x/θ_1} e^{-x/θ_2}/f(x)} dx

we then obtain the following expressions after applying the transformation w = e^{-x/θ_1}, z = w^{(1-h)/h}. Note that S_1(T_1,T_2) differs from the previously defined S_1, used for the complete sample case, only in the limits of integration. S_1^{(1)}(T_1,T_2) and S_1^{(2)}(T_1,T_2) are the forms convenient to perform the numerical integration when h ≤ 1/2 and h > 1/2, respectively. Similarly this will also be the case for S_2(T_1,T_2),…,S_5(T_1,T_2) in what follows. Thus we have that

E[-∂²logL/∂α_1²] = [n/(α_1α_2)][α_2{e^{-T_1/θ_1} - e^{-T_2/θ_1}} + α_1{e^{-T_1/θ_2} - e^{-T_2/θ_2}} - S_1(T_1,T_2)] + n[e^{-T_1/θ_1} - e^{-T_1/θ_2}]²/F(T_1) + n[e^{-T_2/θ_1} - e^{-T_2/θ_2}]²/[1-F(T_2)].   (3.3.5)
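Since S_1(T_1,T_2) must in practice be evaluated numerically, a direct quadrature sketch is given below. It is our own illustration: SciPy's `quad` on the defining integral stands in for the transformed, h-dependent forms used in the thesis, and the function name is ours.

```python
import math
from scipy.integrate import quad

def S1(T1, T2, alpha1, th1, th2):
    """S1(T1,T2): integral over (T1,T2) of
    e^(-x/th1) e^(-x/th2) / (th1 th2 f(x)) dx
    for the two-component exponential mixture density f."""
    def integrand(x):
        f = (alpha1 / th1) * math.exp(-x / th1) \
            + ((1 - alpha1) / th2) * math.exp(-x / th2)
        return math.exp(-x / th1) * math.exp(-x / th2) / (th1 * th2 * f)
    value, _err = quad(integrand, T1, T2)
    return value
```

As a check on the sketch, when α_1 = 1 the integrand collapses to (1/θ_2)e^{-x/θ_2}, so S_1(0,∞) tends to 1; the symmetric statement holds for α_1 = 0.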
Similarly, we obtain from (3.3.2) to (3.3.4)

E[-∂²logL/∂θ_1∂α_1] = (n/θ_1)[S_1(T_1,T_2) - S_2(T_1,T_2)] + (nT_1/θ_1²)e^{-T_1/θ_1}[1 - e^{-T_1/θ_2}]/F(T_1) - (nT_2/θ_1²)e^{-T_2/θ_1}e^{-T_2/θ_2}/[1-F(T_2)]   (3.3.6)

E[-∂²logL/∂θ_2∂α_1] = (n/θ_2)[S_2(T_1,T_2) - S_1(T_1,T_2)] - (nT_1/θ_2²)e^{-T_1/θ_2}[1 - e^{-T_1/θ_1}]/F(T_1) + (nT_2/θ_2²)e^{-T_2/θ_2}e^{-T_2/θ_1}/[1-F(T_2)]   (3.3.7)
E[-∂²logL/∂θ_1∂θ_2] = (nα_1α_2/θ_2²)[S_3(T_1,T_2) - (h+1)S_2(T_1,T_2) + hS_1(T_1,T_2)] + nα_1α_2 (T_1²/θ_1²θ_2²) e^{-T_1/θ_1}e^{-T_1/θ_2}/F(T_1) + nα_1α_2 (T_2²/θ_1²θ_2²) e^{-T_2/θ_1}e^{-T_2/θ_2}/[1-F(T_2)]   (3.3.8)

The diagonal terms E[-∂²logL/∂θ_1²] (3.3.9) and E[-∂²logL/∂θ_2²] (3.3.10) have the same structure: an integral term involving S_3(T_1,T_2), S_4(T_1,T_2), S_5(T_1,T_2), together with the censoring contributions n[α_j(T_1/θ_j²)e^{-T_1/θ_j}]²/F(T_1) + n[α_j(T_2/θ_j²)e^{-T_2/θ_j}]²/[1-F(T_2)], j = 1,2.

3.3.2  Type II
For type II censoring the expressions for the second partials (3.3.1) to (3.3.4) should have x_i, T_1, T_2 replaced by y_i, y_1, y_N, where we note that y_i is the (r_1+i)th observed order statistic in a sample of size n. If we attempt to find the expected values of these expressions we shall encounter some serious numerical complications; e.g., the first term in (3.3.1) will require N numerical integrations to evaluate it. Or, trying a different approach, we note that

E[Σ_{i=1}^{N} {(1/θ_1)e^{-y_i/θ_1} - (1/θ_2)e^{-y_i/θ_2}}²/f²(y_i)]

will require numerical integration of a triple integral.
Thus, since the exact expected values of the second partials cannot easily be evaluated, we will examine large sample approximations, when n → ∞ with p_1 = r_1/n, p_2 = r_2/n remaining fixed. We can define τ_1, τ_2 as:

∫_0^{τ_1} {(α̂_1/θ̂_1)e^{-x/θ̂_1} + (α̂_2/θ̂_2)e^{-x/θ̂_2}} dx = p_1   (3.3.11)

∫_{τ_2}^{∞} {(α̂_1/θ̂_1)e^{-x/θ̂_1} + (α̂_2/θ̂_2)e^{-x/θ̂_2}} dx = p_2   (3.3.12)

where α̂_1, θ̂_1, θ̂_2 are the maximum likelihood estimates. Using this asymptotic result we can obtain computationally manageable results. Thus
E[-∂²logL/∂α_1²] is given by (3.3.5) with T_1, T_2 replaced by τ_1, τ_2.   (3.3.13)

We note that equation (3.3.13) is equivalent to (3.3.5), developed for type I censoring, if we replace T_1, T_2 by τ_1, τ_2. This will also be true for the remaining terms of the asymptotic covariance matrix, (3.3.6) to (3.3.10).
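In practice τ_1, τ_2 are found by solving (3.3.11), (3.3.12) numerically. The sketch below does this with SciPy's brentq root finder (an assumption of ours, as are the function names); the closed form of the mixture distribution function makes the integrals trivial.

```python
import math
from scipy.optimize import brentq

def mixture_cdf(x, a, t1, t2):
    return 1.0 - a * math.exp(-x / t1) - (1.0 - a) * math.exp(-x / t2)

def tau_points(a, t1, t2, p1, p2, upper=1e7):
    """Solve F(tau1) = p1 and 1 - F(tau2) = p2 for the fitted mixture."""
    tau1 = 0.0 if p1 == 0 else brentq(
        lambda x: mixture_cdf(x, a, t1, t2) - p1, 0.0, upper)
    tau2 = brentq(
        lambda x: (1.0 - mixture_cdf(x, a, t1, t2)) - p2, 0.0, upper)
    return tau1, tau2
```

With the final iterates of the Section 3.4.2 example (α̂ = .5551, θ̂_1 = 12.2508, θ̂_2 = 1.0152) and p_2 = 100/800, this reproduces the value τ̃_2 = 18.2639 reported there.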
3.4  Implementation of the Procedure

We will follow essentially the same operating procedure as we have employed for the complete sample case. First we will decide, using a chi-square goodness of fit test, whether a single exponential gives an adequate representation for the data. Since the intervals (0,T_1), (T_2,∞) are predetermined we cannot guarantee equi-probable intervals; however, we will divide the segment (T_1,T_2) into (k-2) equi-probable intervals and then follow the recommendations set forth in Section 2.4.3. If we reject the null hypothesis that we have a single exponential then we proceed to estimate the parameters for the mixture; otherwise we just estimate the mean for a single exponential. Note that when we have censoring on the right only, θ̂ = (Σ_{i=1}^{N} x_i + r_2T_2)/N, but if we have double censoring we will require an iterative procedure to obtain θ̂.
3.4.1  Initial Estimates

We will again obtain initial estimates by employing a method of moments. In some situations, because of the complexity of the resulting equations, we will utilize various assumptions in order to facilitate the solution of the moment equations. The equations can be written as:

m_1 = (1/N) Σ_{i=1}^{N} x_i = E(X)     (3.4.1)

m_2 = (1/N) Σ_{i=1}^{N} x_i² = E(X²)   (3.4.2)

m_3 = (1/N) Σ_{i=1}^{N} x_i³ = E(X³)   (3.4.3)

where

E(X) = [α_1{(T_1+θ_1)e^{-T_1/θ_1} - (T_2+θ_1)e^{-T_2/θ_1}} + α_2{(T_1+θ_2)e^{-T_1/θ_2} - (T_2+θ_2)e^{-T_2/θ_2}}]/[F(T_2)-F(T_1)]   (3.4.4)

E(X²) = [α_1{(T_1²+2T_1θ_1+2θ_1²)e^{-T_1/θ_1} - (T_2²+2T_2θ_1+2θ_1²)e^{-T_2/θ_1}} + α_2{(T_1²+2T_1θ_2+2θ_2²)e^{-T_1/θ_2} - (T_2²+2T_2θ_2+2θ_2²)e^{-T_2/θ_2}}]/[F(T_2)-F(T_1)],   (3.4.5)

with E(X³) given by the corresponding third-order expression.
We can use r_1/n, N/n, r_2/n to estimate F(T_1), F(T_2)-F(T_1), 1-F(T_2), respectively; then the difficulty in obtaining explicit solutions is due to the presence of the exponential terms. Hence, we will be required to find estimates for e^{-T_1/θ_1}, e^{-T_1/θ_2}, e^{-T_2/θ_1}, e^{-T_2/θ_2}. From Chapter 2 we wouldn't expect to obtain estimates unless, say, θ_1 ≥ 2θ_2, and only if the censoring wasn't too extreme, e.g., T_2 ≥ 3θ_1. When we are in this situation we note that e^{-T_2/θ_2} ≈ .0025 and, since α_1 e^{-T_2/θ_1} + α_2 e^{-T_2/θ_2} ≈ r_2/n, α_1 e^{-T_2/θ_1} ≈ r_2/n if α_1 is not too small. Note that this assumption will be valid even for cases when the censoring is extreme, if the value of h is small enough; e.g., if θ_1 ≈ 10θ_2 and T_2 ≈ θ_1, we have that e^{-T_2/θ_2} ≈ .00005. However, for censoring on the left, i.e., obtaining estimates for e^{-T_1/θ_1}, e^{-T_1/θ_2}, because of the large overlap between the two populations for small failure time values, even when h is small, this ploy doesn't prove to be effective. Thus we are forced to restrict our attention to obtaining initial estimates when we only have censoring on the right, which is by far the most important special case of double censoring.
In this section we will derive initial estimates when nothing is known about the parameters, while Appendix III will contain estimates obtained when either one or two parameters can be approximated from prior knowledge. Referring to equations (3.4.1), (3.4.2), (3.4.3) and setting T_1 = 0, e^{-T_2/θ_1} = r_2/(α_1 n), e^{-T_2/θ_2} = 0, we obtain

m_1(1 - r_2/n) = α_1*θ_1* + α_2*θ_2* - (r_2/n)(T_2 + θ_1*)   (3.4.7)

m_2(1 - r_2/n) = 2α_1*θ_1*² + 2α_2*θ_2*² - (r_2/n)(T_2² + 2T_2θ_1* + 2θ_1*²)   (3.4.8)

m_3(1 - r_2/n) = 6α_1*θ_1*³ + 6α_2*θ_2*³ - (r_2/n)(T_2³ + 3T_2²θ_1* + 6T_2θ_1*² + 6θ_1*³)   (3.4.9)

From (3.4.7) we obtain

α_1* = [m_1(1 - r_2/n) + (r_2/n)T_2 + (r_2/n)θ_1* - θ_2*]/(θ_1* - θ_2*).   (3.4.10)

Now substituting (3.4.10) into (3.4.8), (3.4.9) yields equations (3.4.11), (3.4.12) in θ_1* and θ_2* alone. From (3.4.11) we have

θ_1* = {½[m_2(1 - r_2/n) + (r_2/n)T_2²] - θ_2*[m_1(1 - r_2/n) + (r_2/n)T_2]} / {θ_2*[(r_2/n) - 1] + m_1(1 - r_2/n)}   (3.4.13)
and using (3.4.13) in (3.4.12) gives, after simplification, a quadratic equation in θ_2*.   (3.4.14)

Thus the initial estimates are obtained by solving the quadratic (3.4.14) and, from computational experience, using the smaller root to estimate θ_2; estimates of θ_1 and α_1 are then obtained by referring to (3.4.13) and (3.4.10).
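Equivalently, the three moment equations can be handed to a numerical root finder, which avoids carrying the quadratic's lengthy coefficients. The sketch below is our own construction: it relies on our reading of (3.4.7)-(3.4.9) above, and scipy.optimize.fsolve and the function names are assumptions of the sketch.

```python
import math
from scipy.optimize import fsolve

def moment_equations(params, m1, m2, m3, rho, T2):
    """Residuals of (3.4.7)-(3.4.9) for right censoring (T1 = 0),
    with rho = r2/n; params = (alpha1, theta1, theta2)."""
    a, t1, t2 = params
    return [
        a * t1 + (1 - a) * t2 - rho * (T2 + t1) - m1 * (1 - rho),
        2 * a * t1**2 + 2 * (1 - a) * t2**2
            - rho * (T2**2 + 2 * T2 * t1 + 2 * t1**2) - m2 * (1 - rho),
        6 * a * t1**3 + 6 * (1 - a) * t2**3
            - rho * (T2**3 + 3 * T2**2 * t1 + 6 * T2 * t1**2 + 6 * t1**3)
            - m3 * (1 - rho),
    ]

def right_censored_initial_estimates(m1, m2, m3, rho, T2, start):
    """Solve the moment equations numerically from a starting guess."""
    return fsolve(moment_equations, start, args=(m1, m2, m3, rho, T2))
```

A reasonable starting guess is the complete-sample moment solution of Section 2.5.1.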
3.4.2  Example

We generated 800 random mixture-of-exponential numbers with parameters α_1 = .6, θ_1 = 12.0, θ_2 = 1.0, and type II censored the largest 100 observations. First we performed a chi-square goodness of fit test in order to see whether a single exponential adequately represented the data. To avoid an arbitrary choice for k, we again determined the number of intervals to use for the chi-square test from (2.4.4), even though the expression was derived for the case when we have equi-probable intervals. If we let b = 3, n = 800, p = .05, P_0 = .95 in expression (2.4.4) then k = 31. The resulting value of X² is 403.7747, which is highly significant. Thus, we reject the null hypothesis that we have a single exponential and will proceed to estimate the parameters of the mixture of two exponentials.
The moment estimates, obtained from (3.4.10), (3.4.13), (3.4.14), are α_1* = .4931, θ_1* = 13.7273, θ_2* = 1.3173 and were used as initial estimates in the iterative process, equations (3.2.7), (3.2.8), the details of which are presented in Table (3.4.1). Not all the iterations are given in the table, but just enough to give some idea of how convergence proceeds. The likelihood ratio criterion for testing the null hypothesis that we have a single exponential versus the alternative that we have a mixture of two exponentials was also calculated and yielded -2logΛ = 310.9971, which is highly significant.

τ̃_2 was calculated from (3.3.12) and found to be 18.2639; then the asymptotic covariance matrix was determined from (3.3.5) to (3.3.10) after using numerical integration to obtain S_1(0,18.2639) = .4138, S_2(0,18.2639) = .0592, S_3(0,18.2639) = .0139, S_4(0,18.2639) = .2463, S_5(0,18.2639) = .5435. The resulting matrix is of the form:
var(α̂)        =  1.527 x 10^-3
cov(θ̂_1,α̂)   = -2.740 x 10^-2    var(θ̂_1)      =  1.055
cov(θ̂_2,α̂)   = -5.334 x 10^-3    cov(θ̂_1,θ̂_2) =  1.164 x 10^-1    var(θ̂_2) = 3.137 x 10^-2

An approximation to the asymptotic covariance matrix was also obtained by using the values of the second partials (3.3.1) to (3.3.4) in place of the expected values of the second partials, and yielded:
Table 3.4.1.  Details of the iterative process

Iteration    α_1       θ_1       θ_2     ∂logL/∂α_1   ∂logL/∂θ_1   ∂logL/∂θ_2       logL
    0      .4931    13.7273   1.3173     15.9439       -.1498     -12.2277    -1861.6218
    1      .4981    13.6327   1.2645     19.6666       -.2466     - 9.3706    -1860.9421
    2      .5042    13.4817   1.2267     19.4279       -.2626     - 7.8301    -1860.4581
    3      .5103    13.3268   1.1966     17.8922       -.2505     - 6.8118    -1860.0846
    4      .5159    13.1846   1.1714     16.0382       -.2293     - 6.0166    -1859.7941
    5      .5209    13.0587   1.1499     14.2059       -.2063     - 5.3216    -1859.5689
    6      .5253    12.9488   1.1314     12.5061       -.1839     - 4.7051    -1859.3953
    7      .5292    12.8535   1.1154     10.9686       -.1629     - 4.1491    -1859.2621
    8      .5326    12.7710   1.1016      9.5945       -.1437     - 3.6484    -1859.1605
   10      .5382    12.6379   1.0794      7.2976       -.1108     - 2.7991    -1859.0246
   12      .5425    12.5387   1.0629      5.5173       -.0846     - 2.1295    -1858.9470
   14      .5457    12.4648   1.0506      4.1526       -.0642     - 1.6099    -1858.9030
   16      .5481    12.4098   1.0414      3.1150       -.0484     - 1.2115    -1858.8783
   18      .5499    12.3688   1.0347      2.3308       -.0364     -  .9086    -1858.8645
   20      .5512    12.3384   1.0296      1.7408       -.0272     -  .6797    -1858.8568
   24      .5529    12.2989   1.0231       .9673       -.0152     -  .3785    -1858.8501
   28      .5539    12.2772   1.0195       .5359       -.0084     -  .2099    -1858.8481
   32      .5545    12.2651   1.0175       .2965       -.0047     -  .1161    -1858.8474
   36      .5548    12.2585   1.0164       .1639       -.0026     -  .0642    -1858.8473
   40      .5549    12.2548   1.0158       .0906       -.0014     -  .0354    -1858.8472
   44      .5550    12.2528   1.0155       .0502       -.0008     -  .0196    -1858.8472
   48      .5551    12.2517   1.0153       .0278       -.0004     -  .0108    -1858.8472
   52      .5551    12.2511   1.0152       .0155       -.0002     -  .0060    -1858.8472
   53      .5551    12.2510   1.0152       .0134       -.0002     -  .0051    -1858.8472
   54      .5551    12.2509   1.0152       .0116       -.0002     -  .0044    -1858.8472
   55      .5551    12.2508   1.0152       .0101       -.0002     -  .0038    -1858.8472
   56      .5551    12.2508   1.0151       .0087       -.0001     -  .0033    -1858.8472
var(α̂)        =  1.070 x 10^-3
cov(θ̂_1,α̂)   = -1.724 x 10^-2    var(θ̂_1)      =  8.30 x 10^-1
cov(θ̂_2,α̂)   = -2.397 x 10^-3    cov(θ̂_1,θ̂_2) =  5.172 x 10^-2    var(θ̂_2) = 1.304 x 10^-2
CHAPTER VI

SUMMARY AND FUTURE WORK

6.1  Summary

In this study maximum likelihood estimation of the parameters of a mixture of two exponentials, f(x) = (α_1/θ_1)e^{-x/θ_1} + (α_2/θ_2)e^{-x/θ_2}, has been examined in detail. Also, we have considered the estimation procedure under various forms of censoring. The aim throughout has been to obtain practical operating procedures. We have attempted, wherever possible, to maintain a unified approach and to obtain our results in a similar form for the cases considered.
The complete sample case was considered in Chapter 2, with the successive substitution iteration scheme, for solving the maximum likelihood estimating equations, being derived in Section 2.2. Although in most cases the iterative scheme increased the likelihood function at each iteration, for many parameter combinations the rate of convergence was nevertheless too slow to be of practical value. For those cases in which convergence was hard to obtain it was found that the asymptotic variance of the estimates [Section 2.3] became excessive, and a single exponential provided an adequate representation for the mixture [Section 2.4]. Thus any attempt to obtain estimates would be futile because of their large asymptotic variances; in these cases an explicit maximum likelihood estimate for the mean of a single exponential was obtained and statements were made only about the sample in toto.
Taking partial derivatives of (4.2.1) we obtain the following maximum likelihood estimating equations:

∂logL/∂α_1 = Σ_{i=1}^{n} { a_i[(1/θ_1)e^{-x_i/θ_1} - (1/θ_2)e^{-x_i/θ_2}]/f(x_i) + (1-a_i)[(e^{-L_i/θ_1} - e^{-U_i/θ_1}) - (e^{-L_i/θ_2} - e^{-U_i/θ_2})]/[F(U_i)-F(L_i)] } = 0   (4.2.2)

∂logL/∂θ_j = Σ_{i=1}^{n} { a_i α_j[(x_i-θ_j)/θ_j³]e^{-x_i/θ_j}/f(x_i) + (1-a_i) α_j[(L_i/θ_j²)e^{-L_i/θ_j} - (U_i/θ_j²)e^{-U_i/θ_j}]/[F(U_i)-F(L_i)] } = 0,   j = 1,2.   (4.2.3)
Multiplying (4.2.2) by α_1 and adding

Σ_{i=1}^{n} { a_i(1/θ_2)e^{-x_i/θ_2}/f(x_i) + (1-a_i)[e^{-L_i/θ_2} - e^{-U_i/θ_2}]/[F(U_i)-F(L_i)] }

to both sides gives the following:

Σ_{i=1}^{n} {a_i + (1-a_i)} = n = α_1 Σ_{i=1}^{n} { a_i(1/θ_1)e^{-x_i/θ_1}/f(x_i) + (1-a_i)[e^{-L_i/θ_1} - e^{-U_i/θ_1}]/[F(U_i)-F(L_i)] }
   + α_2 Σ_{i=1}^{n} { a_i(1/θ_2)e^{-x_i/θ_2}/f(x_i) + (1-a_i)[e^{-L_i/θ_2} - e^{-U_i/θ_2}]/[F(U_i)-F(L_i)] }.   (4.2.4)

Since from (4.2.2) we have that

Σ_{i=1}^{n} { a_i(1/θ_2)e^{-x_i/θ_2}/f(x_i) + (1-a_i)[e^{-L_i/θ_2} - e^{-U_i/θ_2}]/[F(U_i)-F(L_i)] } = Σ_{i=1}^{n} { a_i(1/θ_1)e^{-x_i/θ_1}/f(x_i) + (1-a_i)[e^{-L_i/θ_1} - e^{-U_i/θ_1}]/[F(U_i)-F(L_i)] },   (4.2.5)
substituting (4.2.4) into (4.2.5) we then obtain

α̂_1 = (1/n) Σ_{i=1}^{n} { a_i(α_1/θ_1)e^{-x_i/θ_1}/f(x_i) + (1-a_i)α_1[e^{-L_i/θ_1} - e^{-U_i/θ_1}]/[F(U_i)-F(L_i)] }.   (4.2.6)

From (4.2.3) we note that

θ̂_j = [ Σ_{i=1}^{n} { a_i α_j x_i e^{-x_i/θ_j}/[θ_j f(x_i)] + (1-a_i)α_j[L_i e^{-L_i/θ_j} - U_i e^{-U_i/θ_j}]/[F(U_i)-F(L_i)] } ] / [ Σ_{i=1}^{n} a_i α_j e^{-x_i/θ_j}/[θ_j f(x_i)] ];   (4.2.7)

then (4.2.6), (4.2.7) will determine a successive substitution iteration scheme. When no observed failures occur, i.e., N = 0, (4.2.2) cannot be satisfied, and hence the maximum likelihood estimating equations are inconsistent unless N > 0.
4.3  Asymptotic Covariance Matrix

The second partials obtained from the first partials (4.2.2), (4.2.3) are as follows:

∂²logL/∂α_1² = -Σ_{i=1}^{n} { a_i[(1/θ_1)e^{-x_i/θ_1} - (1/θ_2)e^{-x_i/θ_2}]²/f²(x_i) + (1-a_i)[(e^{-L_i/θ_1} - e^{-U_i/θ_1}) - (e^{-L_i/θ_2} - e^{-U_i/θ_2})]²/[F(U_i)-F(L_i)]² },   (4.3.1)

with ∂²logL/∂θ_j∂α_1 (4.3.2), ∂²logL/∂θ_1∂θ_2 (4.3.3), and ∂²logL/∂θ_j² (4.3.4) obtained in the same manner as in Section 3.3.1, the censoring terms now being summed over the individual censoring intervals (L_i,U_i). Since for any function g(x_i) having a finite expectation,

E[Σ_{i=1}^{n} a_i g(x_i)] = Σ_{i=1}^{n} ∫_{x∉(L_i,U_i)} g(x) f(x) dx,
expressions (4.3.5), (4.3.6) give the expected values of the leading terms. If we define

S_{1i}(L_i,U_i) = ∫_{x_i∉(L_i,U_i)} [(1/θ_1θ_2) e^{-x_i/θ_1} e^{-x_i/θ_2}/f(x_i)] dx_i

we then obtain the following expressions after applying the transformations w_i = e^{-x_i/θ_1}, z_i = w_i^{(1-h)/h}. S_{1i}(L_i,U_i) differs from the S_1 used for the complete sample and double censoring cases, respectively, only in the limits of integration. S_{1i}^{(1)}(L_i,U_i), S_{1i}^{(2)}(L_i,U_i) are forms convenient to perform the numerical integration when h ≤ 1/2 and h > 1/2, respectively. Similarly, this will also be the case for S_{2i}(L_i,U_i),…,S_{5i}(L_i,U_i) in what follows.
From (4.3.5), (4.3.6) we then obtain (4.3.7). Similarly, we obtain from (4.3.2) to (4.3.4)

E[-∂²logL/∂θ_1∂α_1] = Σ_{i=1}^{n} { (1/θ_1)[S_{1i}(L_i,U_i) - S_{2i}(L_i,U_i)] - (1/θ_1²)[L_i e^{-L_i/θ_1} - U_i e^{-U_i/θ_1}][e^{-L_i/θ_2} - e^{-U_i/θ_2}]/[F(U_i)-F(L_i)] }   (4.3.8)

E[-∂²logL/∂θ_2∂α_1] = Σ_{i=1}^{n} { (1/θ_2)[S_{2i}(L_i,U_i) - S_{1i}(L_i,U_i)] + (1/θ_2²)[L_i e^{-L_i/θ_2} - U_i e^{-U_i/θ_2}][e^{-L_i/θ_1} - e^{-U_i/θ_1}]/[F(U_i)-F(L_i)] }   (4.3.9)

E[-∂²logL/∂θ_1∂θ_2] = Σ_{i=1}^{n} { (α_1α_2/θ_2²)[S_{3i}(L_i,U_i) - (h+1)S_{2i}(L_i,U_i) + hS_{1i}(L_i,U_i)] + (α_1α_2/θ_1²θ_2²)[L_i e^{-L_i/θ_1} - U_i e^{-U_i/θ_1}][L_i e^{-L_i/θ_2} - U_i e^{-U_i/θ_2}]/[F(U_i)-F(L_i)] }   (4.3.10)

E[-∂²logL/∂θ_1²] (4.3.11) and E[-∂²logL/∂θ_2²] (4.3.12) are obtained in the same manner, the integral terms involving S_{3i}(L_i,U_i), S_{4i}(L_i,U_i), S_{5i}(L_i,U_i).
4.4  Implementation of the Procedure

In this situation we can no longer employ a chi-square goodness of fit test to check whether a single exponential provides an adequate fit for the data. If we restrict our attention to the case of censoring on the right, i.e., U_i = ∞ for all i, with L_i the right censoring point, then Nelson's (1969) graphical procedure can be employed to check whether a single exponential adequately represents the data.

If we define the hazard function, h(x), of a density, f(x), as

h(x) = f(x)/[1-F(x)],   (4.4.1)

then it is the conditional probability of failure at time x, given that failure has not occurred before then. The cumulative hazard function can then be defined as

H(x) = ∫_0^x h(t) dt = -log[1-F(x)].   (4.4.2)

The principle underlying plotting on hazard paper is essentially the same as that underlying plotting on probability paper. In both cases, the aim is to approximate a theoretical distribution with a sample distribution, and to use the latter to make statements about the unknown theoretical distribution. The sample cumulative hazard function is
80
plotted by making the increase in the sample function at a failure time
equal to the observed conditional probability of failure,
!o~.,
one
divided by the number k of units in operation 'at the time of failure,
including the failed unit.
Then" the sample cumulative hazard function,
based on the sum of the observed conditional probabilities of failure, '
approximates the theoretical cumulative hazard function.
1 e -xl e ,x~O, the hazard
For the exponential density, f(x)=e
function is
hex)
which is constant over time.
= lIe
x>O
The cumulative hazard function is
x>O
H(x) ... x/e
which implies that x ... eH(x) , and thus the time to failure is a linear
function of the cumulative hazard valueo
Suppose that the failure data on n items consists of the failure
times for the N failed items and the running (censoring) times, L_i's,
for the unfailed items. We then order the n times in the sample, from
smallest to largest, without regard to whether they are censoring or
failure times. If some censoring and failure times have the same value,
they should be put into the list in random order. The hazard value
corresponding to each failure appearing in the list is then calculated
as 100 divided by the number, k, of units with a failure or censoring
time greater than or equal to that particular failure time. The hazard
value is the observed conditional probability of failure of a particular
failure time, i.e., the percentage 100(1/k) of the k items that ran
that length of time and then failed.
Next, for each failure time, calculate the corresponding cumulative
hazard value, which is the sum of its hazard value and the hazard values
of all preceding failure times. Note that cumulative hazard values can be
larger than 100 percent. The failure time is then plotted against the
cumulative hazard value. If the resulting plot is a reasonable
approximation to a straight line we will conclude that we are dealing
with a single exponential; the mean can then be found as the slope of the
line, or a maximum likelihood estimate given by Bartholomew (1957, 1963)
can be used, i.e.,

θ* = [Σ_{i=1}^N x_i + Σ' L_i]/N                                 (4.4.5)

where Σ' extends over the unfailed items.
If the plot is not a straight line then we will proceed to estimate the
parameters for a mixture of exponentials.
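The hazard-plot calculation just described can be sketched in a few lines. The following is a minimal illustration, not code from the thesis; the function and variable names are my own:

```python
# Sketch of the hazard-plotting calculation described above: `times`
# holds all n failure/censoring times, and failed[i] is True for an
# observed failure.  Ties between failures and censorings should be
# broken at random before calling this.
def cumulative_hazard_points(times, failed):
    order = sorted(range(len(times)), key=lambda i: times[i])
    n, cum, points = len(times), 0.0, []
    for rank, i in enumerate(order):
        k = n - rank                # units with a time >= this time
        if failed[i]:
            cum += 100.0 / k        # hazard value, in percent
            points.append((times[i], cum))
    return points                   # plot failure time vs cumulative hazard
```

For a single exponential the plotted points should then fall near a line through the origin of slope θ, since x = θH(x).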
4.4.1  Initial Estimates

The initial estimates we will obtain are not applicable for all
situations which can be covered by the iterative equations, but they do
cover the most important special case. We will consider general censoring
on the right (i.e., U_i = ∞ for all i) where the limits of observation
(L_i) are non-random and foreseeable. A case in point is a situation in
which monitoring of each item starts at different times and testing is
concluded for all items at the same time. Thus each item will have
either failed in a period of time less than L_i units or will have been
observed for L_i units and then censored.

The initial estimates will be obtained in a manner which is
analogous to that used for the case of double censoring. Corresponding
to (3.4.1), (3.4.2), (3.4.3) we have
Σ_{i=1}^n a_i x_i = E{Σ_{i=1}^n a_i x_i}                        (4.4.6)

Σ_{i=1}^n a_i x_i² = E{Σ_{i=1}^n a_i x_i²}                      (4.4.7)

Σ_{i=1}^n a_i x_i³ = E{Σ_{i=1}^n a_i x_i³}                      (4.4.8)

where a_i indicates an observed failure, and, for the censored mixture,

E{Σ a_i x_i} = Σ_{i=1}^n Σ_{j=1,2} α_j[θ_j − (L_i+θ_j)e^(-L_i/θ_j)]                  (4.4.9)

E{Σ a_i x_i²} = Σ_{i=1}^n Σ_{j=1,2} α_j[2θ_j² − (L_i²+2L_iθ_j+2θ_j²)e^(-L_i/θ_j)]    (4.4.10)

E{Σ a_i x_i³} = Σ_{i=1}^n Σ_{j=1,2} α_j[6θ_j³ − (L_i³+3L_i²θ_j+6L_iθ_j²+6θ_j³)e^(-L_i/θ_j)]   (4.4.11)
Whereas for double censoring we estimated 1−F(T_2) by n_2/n, we now
estimate 1−F(L_i) using a non-parametric estimator proposed by Kaplan
and Meier (1958). If we define n(L_i) and N(L_i) as

n(L_i) = the number of items observed and surviving at time L_i, where
if L_i does correspond to an observed failure or censoring point,
failures (but not censored items) at L_i itself are subtracted off;

N(L_i) = the number of items having observation limits L_j such that
L_j ≥ L_i;

then what Kaplan and Meier call the reduced sample estimate of P(x>L_i)
is defined as

Q_i = n(L_i)/N(L_i)                                             (4.4.12)

Since Q_i ≈ α1 e^(-L_i/θ1) + α2 e^(-L_i/θ2) we have, as before, if the
censoring is not too extreme and if θ1 ≥ 3θ2, that α1 e^(-L_i/θ1) ≈ Q_i
and e^(-L_i/θ2) ≈ 0.

The resulting algebraic manipulations necessary to obtain the initial
estimates are similar to those for double censoring, Section (3.4.1),
and we will just present the resulting equations necessary to estimate
the three parameters here; Appendix IV will contain the equations to
obtain estimates when either one or two parameters are approximated from
prior knowledge.
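A short sketch of the reduced-sample computation of Q_i follows; the item-level data layout and all names are illustrative assumptions, not the thesis's:

```python
# Reduced-sample estimate Q_i = n(L_i)/N(L_i) as defined above: x[j] is
# item j's observed time (failure or censoring), L[j] its observation
# limit, and failed[j] flags an observed failure.
def reduced_sample_estimate(Li, x, L, failed):
    # N(L_i): items whose observation limits are at least L_i
    N = sum(1 for j in range(len(x)) if L[j] >= Li)
    # n(L_i): of those, items surviving at L_i; a failure at exactly L_i
    # is subtracted off, but a censored item there is not
    n = sum(1 for j in range(len(x))
            if L[j] >= Li and (x[j] > Li or (x[j] == Li and not failed[j])))
    return n / N
```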
The resulting moment equations, (4.4.13), (4.4.14), and (4.4.15),
determine the three parameter estimates; as in Section 3.4.1, the last
of these is a cubic in θ1*, with coefficients built from sums of powers
of the a_i x_i and the L_i Q_i and from Σ(1−Q_i), and its root,
substituted into (4.4.13) and (4.4.14), yields α1* and θ2*.
4.4.2  Example

We generated 600 random mixture-of-exponential numbers with
L_1,…,L_200 = 30.0, L_201,…,L_400 = 40.0, L_401,…,L_600 = 50.0,
U_1,…,U_600 = ∞. Ten items were found to be censored at 30.0, five at
40.0, and two at 50.0. First we utilized a hazard plot,
Figure (4.4.1), in order to see whether a single exponential adequately
represented the data. Since the plot is clearly not a straight line we
will proceed to estimate the parameters for a mixture of two exponentials.
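The generation step can be sketched as follows. The mixture parameters shown are purely illustrative (the excerpt does not restate the true values used here), and the function name is hypothetical:

```python
import random

# Draw n items from the mixture f(x) = (a1/t1)e^(-x/t1) + (a2/t2)e^(-x/t2)
# and censor item i at its limit limits[i]; returns (time, failed) pairs.
# The parameter values below are illustrative, not the thesis's.
def draw_censored_mixture(n, a1, t1, t2, limits, seed=1):
    rng = random.Random(seed)
    data = []
    for i in range(n):
        theta = t1 if rng.random() < a1 else t2     # pick a component
        x = rng.expovariate(1.0 / theta)            # exponential, mean theta
        if x < limits[i]:
            data.append((x, True))                  # observed failure
        else:
            data.append((limits[i], False))         # censored at L_i
    return data

sample = draw_censored_mixture(600, 0.5, 15.0, 1.0, [50.0] * 600)
```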
The moment estimates, obtained from (4.4.13), (4.4.14), (4.4.15),
are α1* = .5287, θ1* = 13.2933, θ2* = .6262 and were used as initial
estimates in the iterative process, equations (4.2.6), (4.2.7), the
details of which are presented in Table (4.4.1). The likelihood ratio
test, that we have a single exponential versus the alternative that a
mixture is present, was also performed and yielded −2 log λ = 398.5143,
which is highly significant. The asymptotic covariance matrix was
determined from (4.3.7) to (4.3.12) after using numerical integration to
obtain:

S_1,1(30.0,∞)=.35685;   S_2,1(30.0,∞)=.053455;   S_3,1(30.0,∞)=.01272;
S_4,1(30.0,∞)=.41585;   S_5,1(30.0,∞)=.4974;
S_1,201(40.0,∞)=.35685; S_2,201(40.0,∞)=.053455; S_3,201(40.0,∞)=.01272;
S_4,201(40.0,∞)=.69841; S_5,201(40.0,∞)=.6285;
S_1,401(50.0,∞)=.35685; S_2,401(50.0,∞)=.053455; S_3,401(50.0,∞)=.01272;
S_4,401(50.0,∞)=1.02505; S_5,401(50.0,∞)=.74915
The resulting matrix is of the form (upper triangle of the symmetric
covariance matrix):

var(α1) = 1.028×10^-3   cov(θ1,α1) = -2.027×10^-2   cov(θ2,α1) = -2.203×10^-3
var(θ1) = 1.661         cov(θ1,θ2) = 6.755×10^-2    var(θ2) = 1.493×10^-2

Note that if the sample size is large and if the censoring points are
different for each item, then, since 5n numerical integrations are required
Figure 4.4.1  Hazard plot: sample cumulative hazard value versus
observed failure times.
Table 4.4.1  Details of the iterative process

Iteration    α1       θ1        θ2     ∂logL/∂α1  ∂logL/∂θ1  ∂logL/∂θ2      logL
    0      .5287   13.2933    .6262    -26.2596    -2.0191   121.7752   -1498.3598
    1      .5178   12.0784    .7912    -50.7128     .7874     44.8036   -1484.0230
    2      .4967   12.4871    .8841    -51.9245     .8392     25.2832   -1479.3834
    3      .4751   12.9753    .9469    -42.8973     .6739     16.9717   -1476.6668
    4      .4572   13.4162    .9936    -33.3068     .5039     12.1302   -1475.0492
    5      .4435   13.7804   1.0295    -25.2034     .3683      8.8429   -1474.1118
    6      .4331   14.0684   1.0570    -18.8214     .2672      6.4788   -1473.5812
    7      .4254   14.2904   1.0780    -13.9463     .1935      4.7488   -1473.2861
    8      .4197   14.4587   1.0939    -10.2822     .1402      3.4780   -1473.1240
    9      .4155   14.5849   1.1057     -7.5554     .1016      2.5448   -1473.0358
   10      .4125   14.6787   1.1146     -5.5388     .0738      1.8603   -1472.9881
   11      .4102   14.7481   1.1211     -4.0539     .0536      1.3590   -1472.9625
   12      .4086   14.7992   1.1259     -2.9637     .0390       .9923   -1472.9487
   13      .4074   14.8367   1.1294     -2.1649     .0284       .7242   -1472.9414
   14      .4065   14.8642   1.1320     -1.5805     .0206       .5284   -1472.9374
   15      .4059   14.8843   1.1339     -1.1533     .0150       .3854   -1472.9354
   16      .4054   14.8991   1.1353      -.8414     .0109       .2811   -1472.9342
   17      .4051   14.9098   1.1363      -.6136     .0080       .2050   -1472.9336
   18      .4048   14.9177   1.1371      -.4475     .0058       .1495   -1472.9333
   19      .4047   14.9234   1.1376      -.3262     .0042       .1090   -1472.9332
   20      .4045   14.9276   1.1380      -.2378     .0031       .0794   -1472.9331
   21      .4044   14.9306   1.1383      -.1733     .0022       .0579   -1472.9330
   22      .4044   14.9328   1.1385      -.1263     .0016       .0422   -1472.9330
   23      .4043   14.9344   1.1387      -.0920     .0012       .0308   -1472.9330
   24      .4043   14.9356   1.1388      -.0670     .0009       .0224   -1472.9330
   25      .4043   14.9365   1.1389      -.0488     .0006       .0164   -1472.9330
   26      .4042   14.9371   1.1389      -.0356     .0005       .0119   -1472.9330
   27      .4042   14.9376   1.1390      -.0259     .0003       .0087   -1472.9330
   28      .4042   14.9379   1.1390      -.0188     .0002       .0063   -1472.9330
   29      .4042   14.9382   1.1390      -.0137     .0002       .0046   -1472.9330
   30      .4042   14.9383   1.1390      -.0099     .0001       .0034   -1472.9330
to calculate the asymptotic covariance matrix, an approximation becomes
extremely useful. Using the values of the second partials (4.3.1) to
(4.3.4) in place of the expected values of the second partials yields:

var(α1) = 9.870×10^-4   cov(θ1,α1) = -1.749×10^-2   cov(θ2,α1) = -1.359×10^-3
var(θ1) = 1.536         cov(θ1,θ2) = 3.954×10^-2    var(θ2) = 8.551×10^-3

4.5  A Further Generalization
The general form of type I censoring can be further generalized into
a situation with each item having more than one censoring interval;
e.g., for the i-th item we either have an observed failure in the broken
interval

I = (0,∞) − (L_i1,U_i1) − (L_i2,U_i2) − … − (L_ik_i,U_ik_i)

for which we set a_i=1, a_ij=0, j=1,…,k_i; whereas if the item failed in
the ℓ-th censoring interval we set a_iℓ=1, a_ij=0 for j≠ℓ, and a_i=0.
Such a form of censoring can arise when items are continually monitored
on weekdays (or during the daytime) and when no one is around to monitor
them on weekends (or at night). Thus, if we get an exact failure time it
occurs on a weekday, and if we have a censored item we know on which
weekend it occurred.

The joint density of the observed failures given that a particular
set {a_i, a_ij} is observed is

Π_{i=1}^n {f(x_i)/[1 − Σ_{j=1}^{k_i} (F(U_ij)−F(L_ij))]}^a_i    (4.5.1)

and the probability of obtaining this particular set {a_i, a_ij} is
Π_{i=1}^n {[1 − Σ_{j=1}^{k_i} (F(U_ij)−F(L_ij))]^a_i Π_{j=1}^{k_i} [F(U_ij)−F(L_ij)]^a_ij}.   (4.5.2)

Hence, from (4.5.1) and (4.5.2) the likelihood function is given as

L = Π_{i=1}^n {f(x_i)^a_i Π_{j=1}^{k_i} [F(U_ij)−F(L_ij)]^a_ij}   (4.5.3)

Note if k_i=2, L_i1=0, U_i1=T_1, L_i2=T_2, U_i2=∞, for all i, then this
reduces to the special case of type I double censoring, previously
considered in Chapter 3.

The iterative equations and asymptotic covariance matrix can be
obtained from (4.5.3) by following the procedure set forth in Sections
4.2 and 4.3. The results, however, will be notationally complex and are
not presented here.
CHAPTER V
PROGRESSIVE CENSORING

5.1  Introduction

In this chapter we will deal with progressive censoring, as
defined in Section 1.2.3. Section 5.2 will contain a successive
substitution iteration scheme for estimating the three parameters under
type I and type II censoring. The exact asymptotic covariance matrix for
these estimates will be derived in Section 5.3. An operating procedure
to implement the iterative scheme will be given in Section 5.4, and will
be illustrated on some artificial data.
5.2  Iterative Scheme

5.2.1  Type I
Now we have the censoring occurring progressively in k stages at
times T_i, such that T_i > T_{i-1}, i=1,2,…,k, and at the i-th stage of
censoring r_i units, selected at random from the survivors, are withdrawn
from the test. From expression (1.2.12) the log likelihood can be
written as:

logL = Σ_{i=1}^k log C(n−M_i, r_i) + Σ_{i=1}^N log[(α1/θ1)e^(-x_i/θ1) + (α2/θ2)e^(-x_i/θ2)]
       + Σ_{i=1}^k r_i log[α1 e^(-T_i/θ1) + α2 e^(-T_i/θ2)]     (5.2.1)

where C(n−M_i, r_i) counts the ways of selecting the r_i withdrawn units
from the n−M_i survivors at the i-th stage. Taking partial derivatives
of (5.2.1) we obtain the following maximum likelihood estimating
equations:
∂logL/∂α1 = Σ_{i=1}^N {(1/θ1)e^(-x_i/θ1) − (1/θ2)e^(-x_i/θ2)}/f(x_i)
            + Σ_{i=1}^k r_i {e^(-T_i/θ1) − e^(-T_i/θ2)}/[1−F(T_i)] = 0   (5.2.2)

∂logL/∂θ_j = Σ_{i=1}^N α_j [(x_i−θ_j)/θ_j³] e^(-x_i/θ_j)/f(x_i)
             + Σ_{i=1}^k r_i α_j (T_i/θ_j²) e^(-T_i/θ_j)/[1−F(T_i)] = 0,   j=1,2.   (5.2.3)

Multiplying equation (5.2.2) by α1 and adding

Σ_{i=1}^N (1/θ2)e^(-x_i/θ2)/f(x_i) + Σ_{i=1}^k r_i e^(-T_i/θ2)/[1−F(T_i)]

to both sides yields the following:

Σ_{i=1}^N 1 + Σ_{i=1}^k r_i = n = Σ_{i=1}^N (1/θ2)e^(-x_i/θ2)/f(x_i)
                                  + Σ_{i=1}^k r_i e^(-T_i/θ2)/[1−F(T_i)].   (5.2.4)

Since, from (5.2.2), we also have that

Σ_{i=1}^N (1/θ1)e^(-x_i/θ1)/f(x_i) + Σ_{i=1}^k r_i e^(-T_i/θ1)/[1−F(T_i)]
  = Σ_{i=1}^N (1/θ2)e^(-x_i/θ2)/f(x_i) + Σ_{i=1}^k r_i e^(-T_i/θ2)/[1−F(T_i)]   (5.2.5)

we then obtain

α1 = (1/n){Σ_{i=1}^N (α1/θ1)e^(-x_i/θ1)/f(x_i) + Σ_{i=1}^k r_i α1 e^(-T_i/θ1)/[1−F(T_i)]}.   (5.2.6)
From (5.2.3) we note that

θ_j = {Σ_{i=1}^N (α_j/θ_j) x_i e^(-x_i/θ_j)/f(x_i) + Σ_{i=1}^k r_i α_j T_i e^(-T_i/θ_j)/[1−F(T_i)]}
      / {Σ_{i=1}^N (α_j/θ_j) e^(-x_i/θ_j)/f(x_i)},   j=1,2.     (5.2.7)

Then (5.2.6), (5.2.7) will determine a successive substitution iterative
scheme.
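The successive substitution scheme can be sketched as below. It is a minimal illustration under the update forms derived from the estimating equations above; the function and all variable names are my own, not the thesis's:

```python
import math

# Successive substitution for the two-exponential mixture under
# progressive type I censoring: x = observed failure times, T = stage
# censoring times, r = numbers withdrawn at each stage; (a1, t1, t2) are
# starting values.  A sketch, not the thesis's program.
def iterate(x, T, r, a1, t1, t2, steps=100):
    n = len(x) + sum(r)
    for _ in range(steps):
        a, t = (a1, 1.0 - a1), (t1, t2)
        f = [sum(a[j] / t[j] * math.exp(-xi / t[j]) for j in (0, 1)) for xi in x]
        S = [sum(a[j] * math.exp(-Ti / t[j]) for j in (0, 1)) for Ti in T]
        # posterior weight of component j for each failure / each survivor
        p = [[a[j] / t[j] * math.exp(-xi / t[j]) / fi for xi, fi in zip(x, f)]
             for j in (0, 1)]
        q = [[a[j] * math.exp(-Ti / t[j]) / Si for Ti, Si in zip(T, S)]
             for j in (0, 1)]
        # new mixing proportion, as in (5.2.6)
        a1 = (sum(p[0]) + sum(ri * qi for ri, qi in zip(r, q[0]))) / n
        # new component means, as in (5.2.7)
        t1, t2 = [(sum(wi * xi for wi, xi in zip(p[j], x))
                   + sum(ri * Ti * qi for ri, Ti, qi in zip(r, T, q[j])))
                  / sum(p[j]) for j in (0, 1)]
    return a1, t1, t2
```

Each pass re-weights every observation between the two components before updating the parameters, which is why the likelihood typically increases from one iteration to the next, as reported in the examples.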
If we examine the maximum likelihood estimating equations (5.2.2),
(5.2.3), when we have no observed failures, i.e., N=0, we obtain:

Σ_{i=1}^k α_j r_i (T_i/θ_j²) e^(-T_i/θ_j)/[1−F(T_i)] = 0,   j=1,2;

since not all T_i's and r_i's equal 0, the three equations are
inconsistent. To exclude this trivial case, we take Σ_{i=1}^k r_i < n.

5.2.2  Type II
For type II censoring the points at which the censoring takes place,
y_1,…,y_N, are random variables. The log likelihood is obtained from
(1.2.15) and is as follows:

logL = Σ_{i=1}^N {log[(α1/θ1)e^(-x_i/θ1) + (α2/θ2)e^(-x_i/θ2)]
       + r_i log[α1 e^(-y_i/θ1) + α2 e^(-y_i/θ2)]}

All the previous expressions (5.2.2) to (5.2.7), used to derive the
successive substitution iteration scheme for type I censoring, also hold
for type II censoring, if we replace T_i, k by y_i, N.
5.3  Asymptotic Covariance Matrix

5.3.1  Type I

The second partials obtained from the first partials (5.2.2),
(5.2.3) are as follows:
∂²logL/∂α1² = −Σ_{i=1}^N {(1/θ1)e^(-x_i/θ1) − (1/θ2)e^(-x_i/θ2)}²/f²(x_i)
              − Σ_{i=1}^k r_i {e^(-T_i/θ1) − e^(-T_i/θ2)}²/[1−F(T_i)]²   (5.3.1)

∂²logL/∂α1∂θ_j = (−1)^(j-1) {Σ_{i=1}^N [(x_i−θ_j)/θ_j³] e^(-x_i/θ_j)/f(x_i)
                              + Σ_{i=1}^k r_i (T_i/θ_j²) e^(-T_i/θ_j)/[1−F(T_i)]}
    − Σ_{i=1}^N α_j [(x_i−θ_j)/θ_j³] e^(-x_i/θ_j) {(1/θ1)e^(-x_i/θ1) − (1/θ2)e^(-x_i/θ2)}/f²(x_i)
    − Σ_{i=1}^k r_i α_j (T_i/θ_j²) e^(-T_i/θ_j) {e^(-T_i/θ1) − e^(-T_i/θ2)}/[1−F(T_i)]²,   j=1,2   (5.3.2)

∂²logL/∂θ1∂θ2 = −Σ_{i=1}^N α1α2 [(x_i−θ1)/θ1³][(x_i−θ2)/θ2³] e^(-x_i/θ1) e^(-x_i/θ2)/f²(x_i)
                − Σ_{i=1}^k r_i α1α2 [T_i²/(θ1²θ2²)] e^(-T_i/θ1) e^(-T_i/θ2)/[1−F(T_i)]²   (5.3.3)

∂²logL/∂θ_j² = Σ_{i=1}^N {α_j [(x_i²−4x_iθ_j+2θ_j²)/θ_j⁵] e^(-x_i/θ_j)/f(x_i)
                          − α_j² [(x_i−θ_j)/θ_j³]² e^(-2x_i/θ_j)/f²(x_i)}
    + Σ_{i=1}^k {r_i α_j e^(-T_i/θ_j)[T_i²/θ_j⁴ − 2T_i/θ_j³]/[1−F(T_i)]
                 − r_i [α_j (T_i/θ_j²) e^(-T_i/θ_j)]²/[1−F(T_i)]²},   j=1,2.   (5.3.4)
The expected values of the first summations are identical to the
complete sample case and the second summations are constants. Thus, by
referring to expressions (2.3.5) to (2.3.10) we obtain the following:
E[−∂²logL/∂α1²] = N(1−S1)/(α1α2) + Σ_{i=1}^k r_i {e^(-T_i/θ1) − e^(-T_i/θ2)}²/[1−F(T_i)]²   (5.3.5)

E[−∂²logL/∂θ1∂α1] = −(N/θ1)[S2 − S1] − Σ_{i=1}^k r_i (T_i/θ1²) e^(-T_i/θ1)/[1−F(T_i)]
    + Σ_{i=1}^k r_i α1 (T_i/θ1²) e^(-T_i/θ1){e^(-T_i/θ1) − e^(-T_i/θ2)}/[1−F(T_i)]²   (5.3.6)

E[−∂²logL/∂θ2∂α1] = (N/θ2)[(1/h)S2 − S1] + Σ_{i=1}^k r_i (T_i/θ2²) e^(-T_i/θ2)/[1−F(T_i)]
    + Σ_{i=1}^k r_i α2 (T_i/θ2²) e^(-T_i/θ2){e^(-T_i/θ1) − e^(-T_i/θ2)}/[1−F(T_i)]²   (5.3.7)

E[−∂²logL/∂θ1∂θ2] = (Nα1α2/θ2³)[S3 − (h+1)S2 + hS1]
    + Σ_{i=1}^k r_i α1α2 [T_i²/(θ1²θ2²)] e^(-T_i/θ1) e^(-T_i/θ2)/[1−F(T_i)]²   (5.3.8)

E[−∂²logL/∂θ1²] = (Nα1²/θ1²)S4 − Σ_{i=1}^k {r_i α1 e^(-T_i/θ1)[T_i²/θ1⁴ − 2T_i/θ1³]/[1−F(T_i)]
    − r_i [α1 (T_i/θ1²) e^(-T_i/θ1)]²/[1−F(T_i)]²}   (5.3.9)

E[−∂²logL/∂θ2²] = (Nα2²/θ2²)S5 − Σ_{i=1}^k {r_i α2 e^(-T_i/θ2)[T_i²/θ2⁴ − 2T_i/θ2³]/[1−F(T_i)]
    − r_i [α2 (T_i/θ2²) e^(-T_i/θ2)]²/[1−F(T_i)]²}   (5.3.10)
5.3.2  Type II

For type II censoring the expressions for the second partials,
(5.3.1) to (5.3.4), have T_i, k replaced by y_i, N, where we note that
y_i is the smallest observed order statistic in a sample of size
n − Σ_{j=1}^{i-1}(r_j+1), which is truncated on the left at y_{i-1}.
Since the density of the observed failures factors as a product over the
x_i, the expectations of the first sums of the second partials will be
the same as that obtained for type I censoring; however, the second sums
are no longer constants. In general, the expectations of the second sums
will be difficult to obtain. We will briefly indicate how the difficulty
arises and then offer an approximation to the second sums which will
alleviate the problem.
In order to obtain the expectations of the sums we first have to
find the densities of the ordered observed failure times. We will
accomplish this by determining the joint densities, referring to
(1.2.14), and then integrating out. The marginal densities are then
obtained by integrating out, e.g.,

f2(y2) = ∫_{y1=0}^{y2} f12(y1,y2) dy1
       = [n(n−r1−1)/(r1+1)] [1−F(y2)]^(n-r1-2) [1 − {1−F(y2)}^(r1+1)] f(y2)

and, at the fourth stage,

f4(y4) = n(n−r1−1)(n−r1−r2−2)(n−r1−r2−r3−3) f(y4) [1−F(y4)]^(n-r1-r2-r3-4)
         × { … − [1 − {1−F(y4)}^(r1+r2+r3+3)] / [(r1+1)(r1+r2+2)(r1+r2+r3+3)] },

where the omitted terms are of the same form, with exponents such as
r3+1 and r2+r3+2 and denominators built from products of (r1+1), (r2+1),
(r3+1), (r1+r2+2), (r2+r3+2). Thus, since the marginal densities are
becoming hard to handle, so will the resulting expectations.
Instead of proceeding this way, we will find an estimate for y_i,
say ỹ_i, and then the second summations will be constant, and analogous
to the situation for type I censoring. The only difference between the
exact expected values for type I censoring and the approximate expected
values for type II censoring will be that ỹ_i, N replaces T_i, k.
If we define R_i as a sample estimate for 1−F(y_i) then we can
obtain ỹ_i as follows:

α1 e^(-ỹ_i/θ1) + α2 e^(-ỹ_i/θ2) = R_i                           (5.3.11)
We will now briefly develop two different methods of obtaining R_i
which will produce similar results for large samples. The first of
these, the product-limit estimate conceived by Kaplan and Meier (1958),
can be obtained as follows:

(a) Divide the time scale into suitably chosen intervals, say,
(0,y1), (y1,y2), …, (y_{j-1},y_j), ….

(b) For each interval (y_{j-1},y_j), one estimates the conditional
probability, r_j = R_j/R_{j-1}, the probability of surviving beyond y_j
given that the item has survived beyond y_{j-1}.

(c) Then R_j, the proportion in the population surviving beyond
y_j, is determined by the product of the estimated r_i for all i ≤ j.
For the product limit estimate we have that

r_j = (n − Σ_{i=1}^{j-1} r_i − j) / (n − Σ_{i=1}^{j-1} r_i − [j−1])   (5.3.12)

and hence

R_ℓ = Π_{j=1}^ℓ r_j.                                            (5.3.13)
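As a concrete sketch, the product-limit value R_ℓ can be accumulated directly from the withdrawal counts; the function name and argument layout are illustrative:

```python
# Product-limit estimate (5.3.13) of 1 - F(y_ell): n items on test, with
# r[j] items withdrawn at random just after the (j+1)-th observed failure.
def product_limit(n, r, ell):
    R, removed = 1.0, 0            # removed = failures + withdrawals so far
    for j in range(1, ell + 1):
        at_risk = n - removed      # items in operation just before y_j
        R *= (at_risk - 1) / at_risk   # conditional survival (5.3.12)
        removed += 1 + r[j - 1]    # the failure itself plus r_j withdrawals
    return R
```

For example, with n=5 and r=(1,0), product_limit gives (4/5)(2/3) = 8/15 for ℓ=2.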
The second estimate, obtained by Herd (1956), is derived by making
a change of variable in the likelihood function (1.2.15). If we define
v_i = 1−F(y_i), v_1 > v_2 > … > v_N, the new likelihood function is of
the form

dL = Π_{i=1}^N {n − Σ_{j=1}^{i-1} (r_j+1)} v_i^(r_i) dv_i       (5.3.14)

and hence, since r_N = n − Σ_{j=1}^{N-1} r_j − N,

E(v_ℓ) = Π_{i=1}^N {n − Σ_{j=1}^{i-1} (r_j+1)} ∫_0^1 ∫_0^{v_1} … ∫_0^{v_{N-1}} v_ℓ Π_{i=1}^N {v_i^(r_i) dv_i}.

Integrating out v_N, …, v_{ℓ+1} and then v_ℓ, …, v_1 successively yields

E(v_ℓ) = Π_{i=1}^ℓ {n − Σ_{j=1}^{i-1} r_j − i + 1} / Π_{i=1}^ℓ {n − Σ_{j=1}^{i-1} r_j − i + 2}.   (5.3.15)

Then we can use E(v_ℓ) = E[1−F(y_ℓ)] as an estimate of 1−F(y_ℓ).
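Herd's expected-value estimate (5.3.15) is likewise a short product; a sketch with illustrative names:

```python
# Herd's (1956) estimate of 1 - F(y_ell): E(v_ell) from (5.3.15), with
# r[j] items withdrawn at the (j+1)-th failure out of n items on test.
def herd_estimate(n, r, ell):
    est = 1.0
    for i in range(1, ell + 1):
        prior = sum(r[:i - 1])                 # withdrawals before stage i
        est *= (n - prior - i + 1) / (n - prior - i + 2)
    return est
```

For n=5 and r=(1,0) this gives (5/6)(3/4) = 0.625 at ℓ=2, against the product-limit value 8/15; for large samples the two track each other closely, as noted above.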
5.4  Implementation of the Procedure

For this form of censoring we cannot employ a chi-square goodness
of fit test to check whether a single exponential provides an adequate
fit for the data, but we will once again follow Nelson's (1969) graphical
procedure, set forth in Section 4.4. If the resulting plot is a
straight line we can estimate the mean of the single exponential using
either graphical procedures or a maximum likelihood estimate, whereas if
the resulting plot isn't a straight line then we will proceed to
estimate the parameters for a mixture of exponentials.
5.4.1  Initial Estimates

Defining m̃1 = (1/N) Σ_{i=1}^N x_i, m̃2 = (1/N) Σ_{i=1}^N x_i²,
m̃3 = (1/N) Σ_{i=1}^N x_i³, then, replacing m1, m2, m3 in the moment
equations developed for the complete sample case, Section 2.5.1, by
m̃1, m̃2, m̃3, we will obtain initial estimates for the progressive
censoring case.
5.4.2  Example

We generated 600 random mixture-of-exponential numbers with
parameters α1=.5, θ1=8.0, θ2=1.0 and randomly selected 10 items at 2.0,
6 items at 4.0, 4 items at 6.0, 3 items at 8.0, and 2 items at 10.0 to
be censored. First, we made use of a hazard plot, Figure (5.4.1), to
check whether a single exponential provided an adequate fit. Since the
plot is clearly not a straight line we will proceed to estimate the
parameters for a mixture of two exponentials.

The moment estimates, obtained by applying the modifications
given in Section 5.4.1 to expressions (2.5.4), (2.5.7), (2.5.8), are
α1*=.4045, θ1*=6.9271, θ2*=1.1369 and were used as initial estimates in
the iterative process, equations (5.2.6), (5.2.7), the details of which
are presented in Table (5.4.1). Note that not all the iterations are
given in the table, but just enough to give some idea of how convergence
proceeds. The likelihood ratio test, that we have a single exponential
versus the alternative that a mixture is present, was also performed and
yielded −2 log λ = 243.5016, which is highly significant.

The asymptotic covariance matrix was determined from (5.3.5) to
(5.3.10) after using numerical integration to obtain S1=.5027, S2=.1016,
S3=.0344, S4=1.6220, S5=.8702, and yielded:
var(α1) = 1.562×10^-3   cov(θ1,α1) = -9.695×10^-3   cov(θ2,α1) = -2.362×10^-3
var(θ1) = 2.148×10^-1   cov(θ1,θ2) = 1.797×10^-2    var(θ2) = 9.318×10^-3
Figure 5.4.1  Hazard plot: sample cumulative hazard value versus
observed failure times.
Table 5.4.1  Details of the iterative process

Iteration    α1       θ1       θ2     ∂logL/∂α1  ∂logL/∂θ1  ∂logL/∂θ2      logL
    0      .4045   6.9271   1.1369    51.3084     2.7025    -16.0419   -1239.4973
    1      .4251   7.4834   1.0763    21.0867     -.2438    -13.8770   -1237.1912
    2      .4337   7.4260   1.0286    22.1086     -.3937    -10.7141   -1236.3996
    3      .4427   7.3366    .9945    20.9910     -.4243     -8.8342   -1235.8338
    4      .4514   7.2447    .9677    19.0939     -.4110     -7.5722   -1235.4030
    5      .4592   7.1590    .9458    17.0237     -.3813     -6.6115   -1235.0709
    6      .4663   7.0828    .9272    15.0529     -.3467     -5.8161   -1234.8147
    7      .4725   7.0160    .9113    13.2527     -.3119     -5.1283   -1234.6173
    8      .4780   6.9577    .8976    11.6407     -.2787     -4.5222   -1234.4656
    9      .4829   6.9070    .8858    10.2107     -.2481     -3.9846   -1234.3492
   10      .4871   6.8630    .8756     8.9483     -.2201     -3.5072   -1234.2599
   12      .4941   6.7915    .8591     6.8598     -.1720     -2.7088   -1234.1392
   14      .4995   6.7374    .8467     5.2499     -.1336     -2.0849   -1234.0687
   16      .5036   6.6965    .8373     4.0131     -.1032     -1.6005   -1234.0276
   18      .5067   6.6655    .8302     3.0651     -.0795     -1.2263   -1234.0036
   20      .5091   6.6420    .8248     2.3395     -.0610      -.9383   -1233.9897
   24      .5123   6.6106    .8177     1.3613     -.0358      -.5476   -1233.9769
   28      .5142   6.5924    .8135      .7912     -.0209      -.3189   -1233.9725
   32      .5153   6.5819    .8111      .4597     -.0122      -.1854   -1233.9711
   36      .5159   6.5758    .8097      .2670     -.0071      -.1077   -1233.9706
   40      .5163   6.5723    .8089      .1551     -.0041      -.0626   -1233.9704
   44      .5165   6.5702    .8085      .0901     -.0024      -.0363   -1233.9704
   48      .5166   6.5691    .8082      .0524     -.0014      -.0211   -1233.9703
   52      .5167   6.5684    .8080      .0305     -.0008      -.0122   -1233.9703
   56      .5167   6.5680    .8079      .0178     -.0005      -.0071   -1233.9703
   57      .5167   6.5679    .8079      .0155     -.0004      -.0062   -1233.9703
   58      .5167   6.5678    .8079      .0136     -.0004      -.0054   -1233.9703
   59      .5167   6.5678    .8079      .0119     -.0003      -.0047   -1233.9703
   60      .5168   6.5677    .8079      .0104     -.0003      -.0041   -1233.9703
   61      .5168   6.5677    .8079      .0091     -.0002      -.0036   -1233.9703
Using the values of the second partials (5.3.1) to (5.3.4) in place of
the expected values of the second partials gives:

var(α1) = 1.816×10^-3   cov(θ1,α1) = -1.389×10^-2   cov(θ2,α1) = -2.693×10^-3
var(θ1) = 2.916×10^-1   cov(θ1,θ2) = 2.443×10^-2    var(θ2) = 9.497×10^-3
CHAPTER VI
SUMMARY AND FUTURE WORK

6.1  Summary

In this study maximum likelihood estimation of the parameters of
a mixture of two exponentials, f(x) = (α1/θ1)e^(-x/θ1) + (α2/θ2)e^(-x/θ2),
has been examined in detail. Also, we have considered the estimation
procedure under various forms of censoring. The aim throughout has been
to obtain practical operating procedures. We have attempted, wherever
possible, to maintain a unified approach and to obtain our results in a
similar form for the cases considered.

The complete sample case was considered in Chapter 2, with the
successive substitution iteration scheme, for solving the maximum
likelihood estimating equations, being derived in Section 2.1. Although
in most cases the iterative scheme increased the likelihood function at
each iteration, for many parameter combinations the rate of convergence
was nevertheless too slow to be of practical value. For those cases in
which convergence was hard to obtain it was found that the asymptotic
variance of the estimates [Section 2.3] became excessive, and a single
exponential provided an adequate representation for the mixture
[Section 2.4]. Thus any attempt to obtain estimates would be futile
because of their large asymptotic variances; in these cases an explicit
maximum likelihood estimate for the mean of a single exponential was
obtained and statements were made only about the sample in toto.
The exact asymptotic covariance matrix was obtained by taking
the inverse of the information matrix, which required numerical
integration, and these integrals were tabled for several values of α1
and h (Appendix I). In order to check whether a single exponential
provided an adequate fit for a mixture of exponentials we utilized
graphical aids [Section 2.4.1], analytic measures [Section 2.4.2], and
applied statistical tests [Section 2.4.3]. Mixtures of exponentials and
single exponentials with the same mean as the mixture were plotted and
visually compared, for various parameter combinations. Noting by eye
that the greatest difference occurred at the origin, we analytically
examined this distance. Also examined analytically, as a measure of
discrepancy between the two densities, was the non-centrality parameter,
introduced for power considerations in chi-square goodness of fit
tests. Finally, likelihood ratio tests and chi-square goodness of fit
tests were employed in examining whether a single exponential adequately
represented a mixture.
In Section 2.2.1, in order to get a rough idea of how well the
iterative procedure worked in various situations, artificial data were
used with (a) several values of α1, the mixing proportion, (b) different
values of h = θ2/θ1, θ1 > θ2, the ratio of means, (c) various values of
θ1 for a fixed h, (d) several values of n, the number of observed
failures, and (e) different starting values of our iterative procedure.
The size of h was the most critical factor. For h small enough,
convergence tended to take place rapidly even when the other factors
were not favorable, but when h approached one, convergence was difficult
to attain irrespective of the other factors. The value of α1 was not
critical, with values of α1 around 1/2 desirable; problems did arise,
however, when α1 approaches 0 or 1. Under favorable conditions, samples
of size 200 were sufficient; in other circumstances samples of size 1000
were required and even this was not sufficient when conditions became
extremely unfavorable. α1=.5 and θ1, θ2 equal to the arithmetic mean
plus and minus one were used as starting values for small h, but as h
increases starting values become more critical and also harder to
obtain. For a given h, the magnitude of θ1 didn't seem to matter.
Since, if a single exponential adequately represents the data, then
so must a mixture, the operating procedure [Section 2.5] is then to
perform a chi-square goodness of fit test to check whether a single
exponential is appropriate. If it is, then we can estimate the mean, and
make statements about the sample as a whole. If we reject the hypothesis
that a single exponential is adequate then we obtain moment estimates
[Section 2.5.1] for the mixture and use them as initial estimates in the
iterative process to estimate the three parameters.

Estimates of the parameters of the mixture were also obtained
for various forms of censoring: double censoring (type I and type II)
[Chapter III], where both tails of our sample were censored; a general
form of type I censoring [Chapter IV], where unlike double censoring in
which the censoring points for each item were the same, different items
can have distinct censoring points; and progressive censoring (type I
and type II) [Chapter V], where the censoring occurred progressively in
stages with r_i randomly selected items being censored at the i-th stage.
The observations made for complete samples continued to hold when
censoring was present. Although the resulting iterative equations for
obtaining maximum likelihood estimates for the various cases differed,
the methods of derivation were essentially the same as for the complete
sample case, Sections 3.2, 4.2, 5.2.

The exact asymptotic covariance matrix was also provided for each
form of censoring, Sections 3.3, 4.3, 5.3. They all involved numerical
integration of the same function as required for the complete sample
situation, the only difference being in the limits of integration in the
double censoring and the general form of type I censoring cases. For
type II, double and progressive censoring [Sections 3.3.2, 5.3.2],
obtaining the exact asymptotic covariance matrix resulted in numerical
complications which were easily avoided by obtaining an approximation
to the asymptotic covariance matrix, quite similar to the results
obtained for type I censoring.

The general operating procedure employed for complete samples
was also followed for the case of censoring, Sections 3.4, 4.4, 5.4.
For censoring on the right, we checked on the adequacy of a single
exponential, under a general form of type I and progressive censoring,
by using the hazard plotting technique. Also, for double censoring and
a general form of type I censoring, on the right only, in order to
obtain explicit moment estimators we required certain additional
assumptions to simplify the resulting moment equations, Sections 3.4.2,
4.4.2.
6.2
Suggestions for Future Work
While mixtures of two exponentials readily appear in the real
world, it would be useful to further extend the generality of the underlying failure time density so that it may cover even more experimental
situations.
To this end we recommend the following extensions.
1) Consider situations for which we have a mixture of s
106
densities, s>2.
For the case when thefa:l,lure time density for each
population is an exponential and s>2, I feel that we will obtain ,essentia11y ,the same conclusions aswhens=2; that is, unless the s popu1ations are sufficiently distinct, convergence will be difficult to obtain,
variances will be excessive, and the mixture of s exponentials
adequately represented by a
2) Consider
m~xtureoft
situatio~s
wi1~
be
exponentials, t<s.
for which the failure time densities for
the individual populations are other, than exponential.
A likely candidate
for the failure time density would. be a Weibu11 density,
f(x)
a
Sa
= --
x
a-1 e-(xls)a ,
x>O, which reduces to the exponential case when
a=l.
3) In all our work we have dealt with continuous data; it would
also be desirable. to
o~tain
estimates from grouped data.
APPENDIX I

TABLES OF S_i, i=1,…,5

In order to perform the numerical integration to evaluate the
S_i we applied the transformations w = e^(-x/θ1) and z = w^((1-h)/h) to
S1,…,S4, and only w = e^(-x/θ2) to S5. The resulting forms of S_i then
are as follows:

S1^(1) = ∫_0^1 {w^((1-h)/h) / [α1h + α2 w^((1-h)/h)]} dw

S1^(2) = ∫_0^1 {[h/(1-h)] z^(h/(1-h)) / [α1h + α2 z]} dz

S2^(1) = ∫_0^1 {(−log w) w^((1-h)/h) / [α1h + α2 w^((1-h)/h)]} dw

S2^(2) = ∫_0^1 {(−log[z^(h/(1-h))]) [h/(1-h)] z^(h/(1-h)) / [α1h + α2 z]} dz

S3^(1) = ∫_0^1 {(log w)² w^((1-h)/h) / [α1h + α2 w^((1-h)/h)]} dw

S3^(2) = ∫_0^1 {(log[z^(h/(1-h))])² [h/(1-h)] z^(h/(1-h)) / [α1h + α2 z]} dz

S4^(1) = ∫_0^1 {h[(log w)² + 2 log w + 1] / [α1h + α2 w^((1-h)/h)]} dw

S4^(2) = ∫_0^1 {[h²/(1-h)] ([log(z^(h/(1-h)))]² + 2 log(z^(h/(1-h))) + 1) z^((2h-1)/(1-h)) / [α1h + α2 z]} dz

S5^(1) = ∫_0^1 {[(log w)² + 2 log w + 1] / [α1h w^(h-1) + α2]} dw

The numerical integration used to generate the tables was accomplished
by using Simpson's rule. Note that for h<1/2, S_i^(1) proved more
convenient, while for h≥1/2, S_i^(2) was preferable, and for h=1/2,
S_i^(1) and S_i^(2) are identical.
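As an illustration of the computation, S1^(1) can be evaluated with Simpson's rule in a few lines. This is a sketch, not the thesis's program; the function name and step count are my own:

```python
import math

# Simpson's-rule evaluation of S1(1) = integral from 0 to 1 of
# w^((1-h)/h) / [a1*h + a2*w^((1-h)/h)] dw, with a2 = 1 - a1, 0 < h < 1.
def S1(a1, h, m=1000):                 # m must be even
    a2, p = 1.0 - a1, (1.0 - h) / h
    def g(w):
        wp = w ** p
        return wp / (a1 * h + a2 * wp)
    step = 1.0 / m
    total = g(0.0) + g(1.0)            # w**p -> 0 at w = 0 since p > 0
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(i * step)
    return total * step / 3.0
```

For h = 1/2 and α1 = 1/2 the integral reduces to ∫_0^1 w/(1/4 + w/2) dw = 2 − log 3, which the rule reproduces to several decimals.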
The limiting values in the tables are calculated analytically
0
If ,the sequence of integrals is uniformly convergent, then we can bring
the limit sign inside the integraL
We can prove uniform convergence by
appealing to the Weierstrass M test,
io~",
M(x»O such that, (a)
converges then
J:
,I f (x,S) I~ M(x),
if we can find a function
Sl,::,S,::,S2' x>a and (b)
J:
M(x)dx
f(x,S)dx is uniformly convergent in 13 1::.13.::.13 2 "
For Sl we have when O<w<l, O<z< 1 that
for 0<01 .::.1, O<h<l
1
l-h
Jo1
[w11I 0I 1h 1 dw ... ,1- ,
OIl
J1 -1
0
01 2
1'
dw = 01 2
l
and thus si ) is uniformly convergent in O<h<l for OIl f::i,xed, 0'::'01 <1,
1
and is also uniformly convergent in 0<01 1,::,1,
h
0~0I1<1
for h fixed,O<h<L
h
. .
h
l-h
1
1-h
S1m11arlYl_h z
![OI l h+0I 2 z] '::'l-h z
lOll
O'::'~<l, 0<01 <1
h
1
1
l-h-z
01
2
1
O~h<l,
h
Jol
[h
I-h z
I-hi
[-hI-h
1
/o
a 1h] dz
h
109
1
1
I-h-- z
] dz
a2
2
and thus 8i ) is uniformly convergent in O<h<l for a 1 fixed
and is also convergent in
lim S (1) =
h+l
1
J1
O<al~l, O~al<l
-I-h
-I-h
for h fixed, O<h<lo
Hence,
for O<al~l
=1
lim {w h/[a h+a w h]} dw
1
2
O~al~l,
o h+l
h
lim 8 (2) = Jl
1
h+O
o
= Jl
h
I-h
l~m {l-h z
/[a l h+a 2 z]} dz
h+O
=0
0
-I-h
-1-h
lim {w h ![a h+a w h]} dw
1
2
1
=1
for O<al~l
for O<h<l
o a ....O
1
for O<h<l
We can proceed in a similar manner for 8 ,000,8 ,
2
5
for
O<w~l, O<z~l
Thus for 8 we have
2
that:
I-h
I-h
11![a h+a w11]
-(logw)w
1
2
< -
I-h
< -(logw)w
1
1
] dw = - [-10gw!a
o
2
a
J
h
lalh
for
O<al~l,
O<h<l
I-h
,
2
l
Jo
11lalh} dw =-h
-{(logw)w
h
h z I-h
- ( 1 og [l-h])_
z
I=h
h
a1
h
<
a 1h+a2~':~,
-(log[z
I-h
h
h
]I=h
I-h- 1
Z
O<h<l
a2
~h--hI-h I-h
-(log[z
])l-h z
.-h...h.-h...
I-h I)l-h-z-I-h
-(log[z
<
alh
O<h<l
110
h
f1 log[z 1-h ] h
- -1
a
1-h
a
2
-h- 1
I-h
1 ,
z
dz ... _.
a2
f1
- -1
a 1h
0
h
h
h z l-h dz
(log[z 1-h ])l-h
dz =0
forO<al-~l
dz =h for O<h<l
For S₃ we have, for 0 < w ≤ 1, 0 < z ≤ 1, that

    (log w)² w^((1-h)/h) / [α₁h + α₂w^((1-h)/h)] ≤ (log w)²/α₂ ,     ∫₀¹ (1/α₂)(log w)² dw = 2/α₂ ,   0 ≤ α₁ < 1, 0 < h < 1 ,

    (log w)² w^((1-h)/h) / [α₁h + α₂w^((1-h)/h)] ≤ (log w)² w^((1-h)/h)/(α₁h) ,     ∫₀¹ (log w)² w^((1-h)/h)/(α₁h) dw = 2h²/α₁ ,   0 < α₁ ≤ 1, 0 < h < 1 ,

and the corresponding bounds for the second representation,

    S₃^(2) = ∫₀¹ (log[z^(h/(1-h))])² [h/(1-h)] z^(h/(1-h)) / [α₁h + α₂z] dz ,

integrate to the same values 2/α₂ and 2h²/α₁. Hence

    lim_(h→1) S₃^(1) = ∫₀¹ lim_(h→1) { (log w)² w^((1-h)/h) / [α₁h + α₂w^((1-h)/h)] } dw = 2     for 0 < α₁ ≤ 1 ,

    lim_(h→0) S₃^(2) = ∫₀¹ lim_(h→0) { (log[z^(h/(1-h))])² [h/(1-h)] z^(h/(1-h)) / [α₁h + α₂z] } dz = 0 ,

    lim_(α₁→0) S₃^(1) = ∫₀¹ lim_(α₁→0) { (log w)² w^((1-h)/h) / [α₁h + α₂w^((1-h)/h)] } dw = 2     for 0 < h < 1 .
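Again a spot-check (ours): at α₁ = 1, S₃ reduces to (1/h)∫₀¹ (log w)² w^((1-h)/h) dw = 2h², which is the 1.0 row of Table A1.3:

```python
import math

def S3(a1, h, n=100000):
    # midpoint rule for S3 = integral_0^1 (log w)^2 w^((1-h)/h) / (a1*h + a2*w^((1-h)/h)) dw
    a2 = 1.0 - a1
    p = (1.0 - h) / h
    total = 0.0
    for k in range(n):
        w = (k + 0.5) / n
        wp = w ** p
        total += math.log(w) ** 2 * wp / (a1 * h + a2 * wp)
    return total / n

for h in (0.1, 0.5, 0.9):
    assert abs(S3(1.0, h) - 2.0 * h * h) < 2e-3
```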
For S₄ we have, for 0 < w ≤ 1, 0 < z ≤ 1, that

    h{(log w)² + 2 log w + 1} / [α₁h + α₂w^((1-h)/h)] ≤ {(log w)² + 2 log w + 1}/α₁ ,
        ∫₀¹ (1/α₁){(log w)² + 2 log w + 1} dw = 1/α₁ ,   0 < α₁ ≤ 1, 0 < h < 1 ,

    h{(log w)² + 2 log w + 1} / [α₁h + α₂w^((1-h)/h)] ≤ (h/α₂){(log w)² + 2 log w + 1} w^(-(1-h)/h) ,
        ∫₀¹ (h/α₂){(log w)² + 2 log w + 1} w^(-(1-h)/h) dw = (h/α₂)[h/(2h-1) - 2(h/(2h-1))² + 2(h/(2h-1))³]     for 1/2 < h < 1 ,

and the bounds for the second representation,

    S₄^(2) = ∫₀¹ [h²/(1-h)] {(log[z^(h/(1-h))])² + 2 log[z^(h/(1-h))] + 1} z^(h/(1-h)-1) / [α₁h + α₂z] dz ,

integrate to the same values. Hence

    lim_(h→1) S₄^(1) = ∫₀¹ lim_(h→1) h{(log w)² + 2 log w + 1} / [α₁h + α₂w^((1-h)/h)] dw = 1 ,

    lim_(h→0) S₄^(2) = ∫₀¹ lim_(h→0) [h²/(1-h)] {(log[z^(h/(1-h))])² + 2 log[z^(h/(1-h))] + 1} z^(h/(1-h)-1) / [α₁h + α₂z] dz = 0 ,

    lim_(α₁→0) S₄^(1) = ∫₀¹ lim_(α₁→0) h{(log w)² + 2 log w + 1} / [α₁h + α₂w^((1-h)/h)] dw
        = h[h/(2h-1) - 2(h/(2h-1))² + 2(h/(2h-1))³]     for 1/2 < h < 1

(for h ≤ 1/2 the limiting integral diverges), and

    lim_(α₁→1) S₄^(2) = 1     for 0 < h < 1 .
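The closed form just obtained for lim_(α₁→0) S₄ can be evaluated directly; its values are exactly the α₁ = 0.0 entries of Table A1.4 (the function name is ours):

```python
def S4_limit(h):
    # h*[r - 2r^2 + 2r^3] with r = h/(2h-1), valid for 1/2 < h < 1
    r = h / (2.0 * h - 1.0)
    return h * (r - 2.0 * r ** 2 + 2.0 * r ** 3)

assert round(S4_limit(0.6), 3) == 23.4
assert round(S4_limit(0.7), 3) == 4.441
assert round(S4_limit(0.8), 3) == 2.015
assert round(S4_limit(0.9), 3) == 1.297
assert abs(S4_limit(1.0) - 1.0) < 1e-12
```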
For S₅ we have, for 0 < w ≤ 1, that

    {(log w)² + 2 log w + 1} / [α₁h w^(h-1) + α₂] ≤ {(log w)² + 2 log w + 1}/α₂ ,
        ∫₀¹ (1/α₂){(log w)² + 2 log w + 1} dw = 1/α₂ ,   0 ≤ α₁ < 1, 0 < h < 1 ,

    {(log w)² + 2 log w + 1} / [α₁h w^(h-1) + α₂] ≤ {(log w)² + 2 log w + 1} w^(1-h)/(α₁h) ,
        ∫₀¹ (1/(α₁h)){(log w)² + 2 log w + 1} w^(1-h) dw = (1/(α₁h))[1/(2-h) - 2/(2-h)² + 2/(2-h)³] ,   0 < α₁ ≤ 1, 0 < h < 1 .

Thus

    lim_(h→1) S₅^(1) = ∫₀¹ lim_(h→1) {(log w)² + 2 log w + 1} / [α₁h w^(h-1) + α₂] dw = 1 ,

    lim_(h→0) S₅^(1) = ∫₀¹ lim_(h→0) {(log w)² + 2 log w + 1} / [α₁h w^(h-1) + α₂] dw = 1/α₂     for 0 ≤ α₁ < 1 ,

    lim_(α₁→1) S₅^(1) = ∫₀¹ lim_(α₁→1) {(log w)² + 2 log w + 1} / [α₁h w^(h-1) + α₂] dw
        = (1/h)[2/(2-h)³ - 2/(2-h)² + 1/(2-h)]     for 0 < h < 1 .
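The α₁ → 1 closed form for S₅ matches the 1.0 row of Table A1.5 (e.g. 2.639 at h = .1 and .741 at h = .5), and can also be checked against a direct quadrature of S₅ (both function names are ours):

```python
import math

def S5_limit(h):
    # (1/h)[2/(2-h)^3 - 2/(2-h)^2 + 1/(2-h)]
    d = 2.0 - h
    return (2.0 / d ** 3 - 2.0 / d ** 2 + 1.0 / d) / h

def S5(a1, h, n=100000):
    # midpoint rule for S5 = integral_0^1 {(log w)^2 + 2 log w + 1} / (a1*h*w^(h-1) + a2) dw
    a2 = 1.0 - a1
    total = 0.0
    for k in range(n):
        w = (k + 0.5) / n
        total += (math.log(w) + 1.0) ** 2 / (a1 * h * w ** (h - 1.0) + a2)
    return total / n

assert round(S5_limit(0.1), 3) == 2.639
assert abs(S5_limit(0.5) - 20.0 / 27.0) < 1e-12
assert abs(S5(1.0, 0.5) - S5_limit(0.5)) < 1e-3
```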
Table A1.1  Values of S₁  (rows: α₁; columns: h)

 α₁\h   0.0    .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
 0.0    0.0   1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
  .1    0.0   .425   .643   .781   .872   .929   .965   .984   .994   .999   1.0
  .2    0.0   .407   .614   .750   .843   .907   .949   .976   .991   .998   1.0
  .3    0.0   .409   .609   .741   .833   .898   .942   .971   .989   .998   1.0
  .4    0.0   .421   .618   .746   .834   .897   .940   .970   .988   .997   1.0
  .5    0.0   .443   .638   .760   .843   .901   .942   .970   .988   .997   1.0
  .6    0.0   .477   .668   .783   .859   .911   .948   .973   .989   .997   1.0
  .7    0.0   .527   .711   .815   .889   .926   .957   .977   .991   .998   1.0
  .8    0.0   .603   .772   .859   .911   .945   .968   .983   .993   .998   1.0
  .9    0.0   .732   .861   .919   .950   .970   .983   .990   .996   .999   1.0
 1.0    0.0   1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0

Table A1.2  Values of S₂  (rows: α₁; columns: h)

 α₁\h   0.0    .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
 0.0    0.0   1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
  .1    0.0   .107   .281   .460   .622   .755   .855   .922   .963   .987   1.0
  .2    0.0   .090   .236   .391   .541   .674   .785   .872   .934   .974   1.0
  .3    0.0   .083   .214   .355   .485   .624   .739   .834   .909   .963   1.0
  .4    0.0   .079   .200   .332   .464   .589   .704   .804   .887   .952   1.0
  .5    0.0   .077   .193   .317   .442   .563   .676   .779   .868   .942   1.0
  .6    0.0   .077   .188   .307   .427   .543   .655   .758   .852   .933   1.0
  .7    0.0   .078   .187   .301   .415   .528   .637   .740   .837   .924   1.0
  .8    0.0   .082   .188   .297   .407   .516   .622   .725   .823   .916   1.0
  .9    0.0   .088   .192   .297   .402   .507   .610   .712   .811   .908   1.0
 1.0    0.0   .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0

Table A1.3  Values of S₃  (rows: α₁; columns: h)

 α₁\h   0.0    .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
 0.0    0.0   2.0    2.0    2.0    2.0    2.0    2.0    2.0    2.0    2.0    2.0
  .1    0.0   .042   .198   .459   .787  1.132  1.445  1.689  1.852  1.947   2.0
  .2    0.0   .031   .149   .351   .619   .924  1.235  1.516  1.740  1.898   2.0
  .3    0.0   .027   .125   .296   .529   .805  1.102  1.393  1.650  1.854   2.0
  .4    0.0   .024   .111   .262   .470   .724  1.007  1.298  1.575  1.813   2.0
  .5    0.0   .022   .101   .237   .428   .663   .933  1.222  1.510  1.776   2.0
  .6    0.0   .021   .094   .220   .396   .617   .875  1.158  1.453  1.740   2.0
  .7    0.0   .020   .088   .206   .371   .579   .826  1.104  1.403  1.708   2.0
  .8    0.0   .020   .084   .195   .350   .545   .785  1.057  1.358  1.677   2.0
  .9    0.0   .019   .082   .187   .334   .522   .750  1.016  1.317  1.648   2.0
 1.0    0.0   .020   .080   .180   .320   .500   .720   .980  1.280  1.620   2.0

Table A1.4  Values of S₄  (rows: α₁; columns: h)

 α₁\h   0.0    .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
 0.0    0.0    ∞      ∞      ∞      ∞      ∞   23.400  4.441  2.015  1.297   1.0
  .1    0.0  7.735  7.481  7.111  6.270  5.043  3.700  2.533  1.720  1.250   1.0
  .2    0.0  3.969  3.835  3.729  3.481  3.069  2.547  2.007  1.543  1.209   1.0
  .3    0.0  2.704  2.616  2.570  2.464  2.274  2.010  1.708  1.415  1.173   1.0
  .4    0.0  2.069  2.008  1.986  1.936  1.838  1.691  1.510  1.317  1.141   1.0
  .5    0.0  1.688  1.647  1.637  1.614  1.562  1.477  1.366  1.239  1.112   1.0
  .6    0.0  1.437  1.410  1.408  1.399  1.373  1.325  1.257  1.174  1.085   1.0
  .7    0.0  1.261  1.248  1.249  1.248  1.236  1.213  1.171  1.120  1.061   1.0
  .8    0.0  1.135  1.130  1.135  1.138  1.135  1.123  1.102  1.074  1.039   1.0
  .9    0.0  1.047  1.049  1.054  1.058  1.058  1.054  1.046  1.034  1.019   1.0
 1.0    0.0   1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0

Table A1.5  Values of S₅  (rows: α₁; columns: h)

 α₁\h    0.0     .1     .2     .3     .4     .5     .6     .7     .8     .9    1.0
 0.0     1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
  .1    1.111   .840   .800   .798   .813   .841   .875   .911   .947   .977   1.0
  .2    1.250   .817   .754   .740   .749   .773   .810   .855   .906   .956   1.0
  .3    1.429   .824   .741   .715   .716   .734   .768   .815   .873   .937   1.0
  .4    1.667   .853   .748   .709   .700   .711   .739   .785   .846   .920   1.0
  .5    2.000   .904   .772   .717   .696   .698   .720   .762   .823   .904   1.0
  .6    2.500   .984   .814   .737   .702   .693   .707   .744   .804   .890   1.0
  .7    3.333  1.108   .879   .772   .717   .695   .700   .730   .788   .877   1.0
  .8    5.000  1.311   .978   .823   .742   .704   .697   .720   .774   .865   1.0
  .9   10.000  1.690  1.135   .899   .779   .719   .699   .713   .762   .854   1.0
 1.0      ∞    2.639  1.406  1.011   .830   .741   .705   .709   .752   .843   1.0
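The table entries can be regenerated by quadrature. A sketch (ours) for the w-form integrands of S₁, S₂, S₃, checked against three entries whose exact values are known:

```python
import math

def S(a1, h, moment, n=100000):
    # midpoint rule for the w-form integrals: moment = 0, 1, 2 gives S1, S2, S3,
    # with integrand (-log w)^moment * w^((1-h)/h) / (a1*h + a2*w^((1-h)/h))
    a2 = 1.0 - a1
    p = (1.0 - h) / h
    total = 0.0
    for k in range(n):
        w = (k + 0.5) / n
        wp = w ** p
        total += (-math.log(w)) ** moment * wp / (a1 * h + a2 * wp)
    return total / n

assert abs(S(0.5, 0.5, 0) - 0.901) < 5e-3   # Table A1.1, alpha1 = .5, h = .5
assert abs(S(1.0, 0.7, 1) - 0.7) < 5e-3     # Table A1.2, alpha1 = 1.0 row: S2 = h
assert abs(S(1.0, 0.9, 2) - 1.62) < 5e-3    # Table A1.3, alpha1 = 1.0 row: S3 = 2h^2
```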
APPENDIX II

INITIAL ESTIMATES FOR COMPLETE SAMPLES

From section 2.5.1 we have that the first two moment equations are

    m₁ = α₁θ₁ + α₂θ₂ ,                      (AII.1)

    m₂ = 2(α₁θ₁² + α₂θ₂²) .                 (AII.2)

1) Two parameters approximated from prior knowledge:

If θ₁ and θ₂ are approximated from prior knowledge, from (AII.1) we obtain

    αⱼ* = (m₁ - θₖ)/(θⱼ - θₖ) ,   j ≠ k = 1,2 ,

while if αⱼ and θₖ are approximated, from (AII.1)

    θⱼ* = (m₁ - αₖθₖ)/αⱼ ,   j ≠ k = 1,2 .

2) One parameter approximated from prior knowledge:

a) θᵢ known. From (AII.1)

    αᵢ* = (m₁ - θⱼ*)/(θᵢ - θⱼ*) ,   i ≠ j = 1,2 ,        (AII.3)

and utilizing (AII.3) in (AII.2) yields

    (θᵢ - m₁)θⱼ*² + (m₂/2 - θᵢ²)θⱼ* + θᵢ(m₁θᵢ - m₂/2) = 0 .    (AII.4)

Since θᵢ is one of the roots of (AII.4), the other root will give us θⱼ*, and from (AII.3) we obtain αᵢ*.

b) αᵢ known. From (AII.1) we obtain

    θᵢ* = (m₁ - αⱼθⱼ*)/αᵢ ,   i ≠ j = 1,2 ;              (AII.5)

substituting (AII.5) into (AII.2) results in

    αⱼθⱼ*² - 2αⱼm₁θⱼ* + (m₁² - αᵢm₂/2) = 0 ,             (AII.6)

and from (AII.6), (AII.5) we obtain two sets of initial estimates.
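A numerical illustration of case 2(a) (invented numbers, assuming the raw-moment equations m₁ = α₁θ₁ + α₂θ₂ and m₂ = 2(α₁θ₁² + α₂θ₂²)): with θᵢ known, the quadratic (AII.4) has θᵢ as one root, and its other root recovers θⱼ:

```python
# Hypothetical mixture: alpha1 = 0.4, theta1 = 1, theta2 = 5.
a1_true, t1, t2 = 0.4, 1.0, 5.0
m1 = a1_true * t1 + (1.0 - a1_true) * t2                     # first moment, 3.4
m2 = 2.0 * (a1_true * t1 ** 2 + (1.0 - a1_true) * t2 ** 2)   # second moment, 30.8

ti = t1  # suppose theta_1 is the component approximated from prior knowledge
# quadratic (AII.4): (ti - m1)x^2 + (m2/2 - ti^2)x + ti*(m1*ti - m2/2) = 0,
# obtained by eliminating alpha between the two moment equations; ti is one root.
A = ti - m1
C = ti * (m1 * ti - m2 / 2.0)
tj = C / (A * ti)                    # product of the two roots is C/A, one root is ti
alpha_i = (m1 - tj) / (ti - tj)      # (AII.3)
assert abs(tj - t2) < 1e-9
assert abs(alpha_i - a1_true) < 1e-9
```

The two roots of (AII.6) in case 2(b) play the same role, giving the two sets of initial estimates.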
APPENDIX III

INITIAL ESTIMATES UNDER DOUBLE CENSORING

Restricting our attention to censoring on the right, i.e., T₁ = 0, we have from section 3.4.1 the first two moment equations, (AIII.1) and (AIII.2).

1) Two parameters approximated from prior knowledge: with θ₁ and θ₂ approximated, from (AIII.1) we obtain αⱼ*, j = 1,2.

2) One parameter approximated from prior knowledge:

a) θᵢ known. Since

    α₁e^(-T₂/θ₁) + α₂e^(-T₂/θ₂) = r₂/n ,

we then obtain from (AIII.1), (AIII.2) an expression for αᵢ* in terms of θⱼ* (AIII.3), together with a second relation (AIII.4). Substituting (AIII.3) into (AIII.4) yields a quadratic in θⱼ* whose coefficients involve m₁, m₂, T₂, r₂/n, and e^(-T₂/θᵢ) (AIII.5). Thus we can solve (AIII.5) to find θⱼ* and substitute this value into (AIII.3) to find αᵢ*. Note that θᵢ is one of the roots of (AIII.5); the other root will give us θⱼ*.

b) αᵢ known. Since e^(-T₂/θ₁) ≈ 0, the relation above gives α₂e^(-T₂/θ₂) ≈ r₂/n, which implies

    θ̃₂ = T₂ / log(α₂n/r₂) ,

and from (AIII.1) with e^(-T₂/θ₁) ≈ 0 we obtain the corresponding initial estimate θ̃₁.
APPENDIX IV

INITIAL ESTIMATES UNDER A GENERAL FORM OF TYPE I CENSORING

1) Two parameters approximated from prior knowledge:

a) θ₁, θ₂ known: from the first moment equation we obtain αⱼ*, j = 1,2.

b) αⱼ, θⱼ known:

    θₖ* = (1/n) Σᵢ₌₁ⁿ { -Lᵢ / log([Qᵢ - αⱼe^(-Lᵢ/θⱼ)] / αₖ) } ,   j ≠ k = 1,2 .

2) One parameter approximated from prior knowledge:

a) θⱼ known:

    αⱼ* = Σᵢ₌₁ⁿ {aᵢxᵢ + LᵢQᵢ + θₖ(Qᵢ - 1)} / Σᵢ₌₁ⁿ {(θⱼ - θₖ)(1 - e^(-Lᵢ/θⱼ))} ,

and θₖ* is then obtained as a root of a quadratic formed from the second moment equation, with coefficients built from the sums Σ(1 - e^(-Lᵢ/θⱼ)), Σ(aᵢxᵢ² + Lᵢ²Qᵢ - 2θⱼLᵢQᵢ), Σ(Qᵢ - 1), and Σ{2θⱼ² - 2(Lᵢθⱼ + θⱼ²)e^(-Lᵢ/θⱼ)}.

b) αⱼ known: the corresponding initial estimates follow in the same manner from the two moment equations.