Johnson, N.L.; (1979)Extreme Sample Censoring Problems with Multiavariate Data-II."

EXfREl'£ SAWLE CEN&lRING ProIW1S
\ilTH MULTIVARIAlE DATA - II
TECHNICAL REPORT
N•L.
J()f-flSOO
JULY}
1979
U•S. AA'.1Y I£SEARCH OFFICE
I£SEARCH TRlJiNlLE PA~ ~RTH CAroLINA
DANi29-77-e-0035
DEPARWtNf OF SfATI Sf ICS
UNlVERSllY (f rllRTH CAroLINA AT CHAPEL HILL
CHL\PEL HI LL NORTH CAROLINA 27514
ApPROVED FOR PUBLIC RELEASE
DISTRIBUTION UNLIMITED
--
UNCLASSIFIED
SECURITY CLASSIFICATIOIoI OF THIS PAGE (1111","
n",,, F",,.,,,d)
r
READ INSTRUCTIONS
UE!'"ORI-: COMPLETIN<; FORM
REPORT DOCUMENTATION PAGE
1.
REPORT "lUMBER
4.
TITLE (and Subtll".)
GOVT ACCESSIOIol 1010.
3.
RECIPIEIoIT'S CATALOG NUMBER
5
TYPE OF REPORT tt PERIOD COVERED
Extreme Sample Censoring Problems with
Multivariate Data - II
TECHNICAL
6.
PERFORMIIoIG O"lG. REPORT NUMBER
Mimeo Series No. 1237
7.
8.
AU THOR(s)
9.
CONTRACT OR GRANT NUM13ER,·.)
DAAG29-77-C-0035
N.L. Jolmson
PERFORMIIoIG ORGANIZATIOIoI NAME liND ADDRE5S
10.
PROGRAM ELEMEIoIT. PROJECT. TASK
AREA tt WORK UNIT "lUMBERS
12.
P[PORT DATE
13.
"lUMBER OF PAGES
15.
SECURITY CLASS. (01 thi. ,,,pO")
Department of Statistics
University of North Carolina
Chapel Hill, North Carolina 27514
11.
COIoITROLLIIoIG OFFICE NAME AND ADDRESS
July 1979
Army Research Office
Research Triangle Park, NC
21
-14. MONITORING AGENCY NAME 81 ADDRESS(1f dllfe'M' f,<>m Con',nll;nll
Offire)
UNCLASSIFIED
'l5.
16.
--
DISTRIBUTION STATEMENT (of thl. npl''''')
Approved for Public Release:
17.
DECL ASSI FICA TlON DOWIoIGRADING
SCHEDULE
DISTRIBUTION STATEMENT (01 the ..b",It"
Distribution Unlimited
ent,.,e,J fn Block 20. II dllf.,ent f,om Report)
18. SUPPLEMENTARY NOTES
19. KEY WORDS (Continue on rever.e .ide if "'.,.,. •.••ry lind identllv by /IIock number)
Indirect censoring, Farlie-Gumbel-Morgenstern distribution, Likelihood ratio
tests, Order statistics, moments.
20.
ABSTRACT (Continue on reVer ... • Ide if nec" ....ry ..nd Identlly by block number)
Indirect censoring is defined as the effect on observed variables of
censoring on unobserved variables. Methods of testing for indirect
censoring are discussed, and exemplified, using a bivariate Far1ie-Gumbe1Morgenstern distribution.
DO
FORM
1 JAN 73
1473
EDITION OF , 1oI0V 65 IS OBSOLETE
UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (Whe"
D.t. En/er.d)
EXTRIME SAMPLE CENSORING PROBLEMS WIm MULTIVARIATE DATA - II
N.L. Johnson*
University of North Carolina at Chapel Hill
1.
Introduction.
Johnson (1978a) has given a survey of various problems which can arise
in testing for censoring of extreme values from univariate data.
When data
are multivariate, there is a much richer variety of possible problems; some
possibilities are described in Johnson (1975).
The present paper extends
these possibilities and indicates lines of attack on certain of the problems.
These are worked out in some detail for Far1ie-Gumbel-Morgenstern bivariate
~
distributions.
We suppose that observed values on rn characters X1 ,X 2, ••• ,Xm are available
for each of r individuals. We wish to investigate whether these represent
a complete random sample, or are the remainder of such a sample (original
size n>r) after some form of censoring of extreme values has been applied.
As in Johnson (1978a), we will restrict attention to random sampling
from large populations in which the joint distribution of X1' ••• '~ is
absolutely continuous, with joint probability density function (PDF)
We will denote the (unordered) observations on the i-th available individual
by
-X*
~
= (X1*..1 , ••• ,X*.)
rnl
*Research sponsored by the
(i=l, ••• , r) •
}~my Research
Office under Grant DAAG29-77-C-0035.
2
We also use the notation:
m
(i)
(ii)
Pre
n(X~.~x.)]
j"'1)1
J
... Fl2
(x ' ••• ,x)
• oem 1
m
(in particular
Pr[X~.~x]
J1
... FJ,(X))
for the ccnditional PDF of X* , oe. ,X* given Xb* ,.•. ,x..*
al
as
1
-Ot
ga
a·b
b (x , ••• ,x I~ ,... ,xb )
1··· S' 1··· t a 1
as °1
t
(in particular
and
(iii)
for the order statistics corresponding to
X~l' ••• 'X~
J
Jr
We also will focus on the foms of censoring accorded special attention
in Johnson (1978a):
(i)
fran above (exclusion of sr greatest values) or below (exclusion of
So
least values), and
(ii)
symmetrical -(exclusion of equal number of greatest and least values
to HO's or H 0
.r
sO'
(ii) corresponds to H
.
s,s
(s > 0) •
To indicate that the censor ing is applied to the variable Xj we use the
symbol H(j) •
sO,sr
2.
Problems.
Problems which will be discussed in the present report include:
(a)
Censoring is known to be possible on some one of a certain subset of
e
.
3
m' variables.
It is required to decide whether such censoring has occurred
(Sections 3 and 4).
In Section 3 we show that if m'
= 1,
the mivariate
techniques already developed can be used on the specified variable; they
provide the likelihood ratio test of the hypothesis of no censoring.
In
Section 4 we discuss the related problem of indirect censoring: detecting
whether there has been censoring on Xl when this variable is not observed
directly; only values of XZ' •••
city.)
case.
'Xm are
observed.
(We take m = 2 for simpli-
An Appendix sets out the application of the reuslts in a special
This constitutes the major part of this report.
In fact, the text of
the report might even be regarded as an introduction to the Appendix.
(b)
Censoring is mown to have occured on some one of a certain subset
of variables.
It is required to decide which is the censored variable
(Section 5).
(c)
We also give an introduction to problems arising in connection with
chain censoring.
This is censoring in a succession of stages.
At the first
stage the (sa1)+s~1)) individuals with the sal) least and s~l) greatest values
of Xl are removed; then those with the sa 2) least and
X2 among the remainder are removed, and so on.
s~2) greatest values of
(Section 6 - again with m = 2,
for reasons of s:i.n1>licity.)
3.
Censoring Possible
~ly
on One of m' (Observed) Specified Variables.
In the special case of (a), Section 2, with m' • I we suppose (without
loss of generality) that Xl is the possibly censored variable.
Since
f!*(!) = f~~(xl)g~~, ... ,~(~2, ... ,~I~I)
and the last term is the smnc \vhether H(l) s
sO' r
(1)
is valid or not, it follows that
4
the likelihood ratio
(2)
This shows that the likelihood ratio tests appropriate for univariate data,
described in Johnson (1978a) (see also Johnson (1966, 1971, 1972)) can be
applied to Xl' ignoring the observed values of all the remaining
variable~
X2, ••• ,\t. (For m = 2, this result is given in Jo}mson (1978a).)
As in Johnson (1966, 1971, 1972), application of this result requires a
knowledge of the population distribution of Xl (though not of X2, ••• ,\t).
The methods (described in Johnson (1978a,b)) of utilizing partial knowledge
of the distribution of Xl are, of course, also relevant here.
In fact, in
this case we really do not have a multivariate, but only a univariate problem,
so far as detection of censoring is concerned.
4.
Censoring Possible Only on One (Unobserved) Variable: Indirect Censoring.
A truly multivariate problem arises if we suppose that values of the
possibly censored variable (Xl) are not available.
How should the (observed)
values of (Xl.,
••• ,Xml.) (i=l, ••• ,r) be used to detect if there has been censor1
ing on
X1?
For
s~licity,
again, we consider the bivariate case (m = 2).
We first derive a likelihood ratio test.
As we shall see, there appear to be
considerable technical difficulties in applying this test in many natural
situations. Therefore, we also suggest some other procedures which may
sometimes be applied more easily.
The available data consist of the r observed values of XZ' denoted by
XiI'''. 'XZ*r·
Their joint PDF, if lI(l)
is valid, is
sO,sr
5
(1)
f Y .(2SZ 1Hs s)
.02
0' r
(3)
where
~
=min(x11 , ..• ,x lr);
u
= max(xll, ••• ,x1r).
Since
we also have
r
{.rr f 2 (x zi )}f
1=1
.
00
-00
·e
co
···f
s
sr
{Fl(~)} O{l-Fju)} r.rr g12(x1ilx2i)dxll ••• dxlr.
1=1
-00
(3' )
In particular
It follows that the likelihood ratio is
f .CX·IH(l)
,xZ 2 sO,sr
L
=
)
(1)
f~2a~IHo.0)
r
x
(remembering that XII
i~lg12(xli IxZi)dx u " .dxlr
mUl(xil,···,Xir); Xlr
Calculation of L frOOl the observed values
e
=
=
~~
max(Xil,··.,Xrr ))·
is usually quite difficult.
'When this is done, detenninution of the distribution of L (even when the
null hypothesis,
H~16
,
is valid) is likely to be even more difficult.
In the
6
Appcndix we usc a
Farlic-(~u1\l)(' I-Morgenstern
(see e.g. Johnson and Kotz (1975))
joint distribution for illustrative purposes.
Calculation of L is not very
difficult in this special case, but even here, the distribution of L is not
easily derived.
The conditional joint distribution of Xu and X
lr can be
derived from
(5)
but this expression is usually quite canplicated.
We note that the value of L (and so its distribution) is unchanged by any
monotonic increasing transformations of XI and
X~.
This means that we can
take, without loss of generality, each of the variables to have a standard
lDliform distribution (fi(x)
=1
for 0
~
~
x
1, i
= 1,Z).
However, the joint
PDF would then have to be that resulting from application of the appropriate
transformations to the original joint PDF.
The Appendix contains some
analyses appropriate to a bivariate Farlie-Gumbel-Morgenstern distribution
(e.g. Jolmson and Kotz (1975)) which does have standard uniform marginal
distributions.
A simpler criterion, suggested by the above analysis is
Ll
If
E[X~
= {FI(mt
s
E[X1IXh D }
s
0U-Fl(~ E[XlIX~i])} r •
(6)
XZ] is a monotonic increasing function of Xz then
s
L1
= {Fl(E[Xllmin(X~l' ••• 'X~r)])} O{l-FI(E[Xllmax(X~l'••• 'Xir)])}
s
r.
(7)
If E[Xll XZ] is a monotonic decreasing ftmction of XZ' then ''min'' and "max"
in
(7) are interchanged.
Another related criter ion, generally more diff:icul t to compute is
7
(8)
•
where Xi(t)' Xi(u) are independent with PDF's glZ(XIIXzCt))' glZ(xlIX~(u))
respectively and i = (t),(u) resepctive1y
E[X1IX~i]
min~ize
and maximize
with respect to i.
If E[X1IXz] is a monotonic increasing (decreasing) function of XZ' then
XZ(t)
= min(max)
(XZl, ••• ,Xir)
Xi(u) = max(min) (XZl,···,Xir) •
S.
Identification of Censored Variable.
of the m' variables Xl'X Z, ••• ,Xm, has been
censored (and that there has been no other form of censoring). We wish to
Suppose we know that some
~
decide which one of these variables is the one in respect to which censoring
·e
has occurred.
Using the argument in Section 3, the likelihood approach is straightforward,
if it can be asslIRed that (sO,sr) censoring has been used with sO,sr known.
We take each of the m' variables in order and calculate the appropriate (tmivariate)
likelihood ratio criterion for that variable.
the
U~kelihood
ratio is greatest.
{fh(~l)}
So
s
U·Fh(Xhr )} r
We choose that variable for which
This means that we choose X if
h
max ,[{Fj(Xj1 )}
J-1, ••• ,m
= ._
So
S
U-Fj(Xjr)} r]
(with same arbitrary rule for decidlllg ties - which, anyway, have zero probability
of occurrence).
We note that the same decision will be reached for all pairs of values
sO' sr for which sO/sr =
e has the same value. In particUlar, the same
decision will be reached for (i) censoring from above (8=0) with any sr
(ii)
below (a=co)
"
(iii) synmetrical censoring(a=l)
"
"
"
8
(Of course, the decisions for (1), (ii) and (iii) arc not, in general,
identical, though they might be.)
By analogy with the results in Johnson (1971), when the values of
So' sr are mlmolm, we would choose Xh
Fh(Xhl )
+
l-Fh(X}lr)
=
if
max
[f..,(X. l ) +
J
J'-1
- , ••• ,m , J
I-F. (X. )] •
J
Jr
Hasofer and Davis (1979) have considered a similar type of problem where
truncation, rather tItan censoring is applied to one of a number of.variables,
and it is desired to identify which of the variables is being truncated.
6.
Chain Censoring.
A further type of censoring of an essentially multivariate nature occurs
\'ihen two or more var iables, :i n sequence, are used for censoring.
For
example, from a sample of size n, with m variables Xl'XZ' ••• '~ measured on
each individual (i) the s~l) individuals with the least, and s~l) with the
greatest values of Xl are removed and then (ii) from the remainder the s~Z)
with the least, and s~Z) with the greatest values of Xz are removed, leaving
a set of
r
= n-s(I)
s(Z) _ s(l) _ s(Z)
0-0
r
r
values.
Problems arising from this type of situation include:
1)
Given that there has been chain censoring involving two specified
variables Xi and Xj ' which variable has been used for the first censoring
operation?
Z)
Given that Xi is the first and Xj the second (if any), is there evidence that second stage censoring has in fact been applied?
3)
Which variables have been used in censoring? (Given that just two
(or three or more) are used and possibly given an order of precedence such
that for any two specified variables it is known which would be used before
the other, if both were used.)
~
9
APPENDIX:
Detection of Indirect Censoring in Farlie-Gtunbel-Morgenstern
(FGM) Distributions_
Relevant Properties of FCM Distributions
A!.
Here we will illustrate calculations and use of the criteria introduced
in Section 4, using FGM population distributions with joint distribution
(O$xj:sl; j=l,2; 181<1).
(Al)
Each Xj has a marginal standard uniform distribution (Fj(X j ) = xj (O~xj~l)).
This distribution has been chosen for analytical convenience. It is not
claimed that the results will apply for other joint distributions, even after
transformation to make the marginals to be standard uniform.
However, there
are some speculative analogies which might be drawn.
From (Al) it follows that the joint PDF is
(AZ)
and the conditional PDF's are
=1 +
OCI-Zx2)-Cl-2xl)
(O~Xlsl)
&Zl(xzlx1) = 1 + 8(1-2x l )-(1-2x Z)
(O~xZ~l)
glZ(Xllxz)
}.
Hence
.
(u-t)[l+B(l-Zx~)(l-u-l)]
and so
r
=
Al.
(u-l)r n [l+8(l-Zx 2 ")(l-u-l)]
j=l
J
Derivation of L.
The conditional joint PDr of XII and X1r is therefore
(A3)
10
where ZZj = 1-2x 2j •
Fran (4),
_ (r+so+sr )!
- rls
's ,
, 0' r'
(r+sO+Sr)!
= rls Is
I
, O· r'
{r(r-I)
r
I
h__
J(r-2,sO,sr;h)e- Yh - 2a
2 r
h=O
h 2 h
Jer,sO,sr;h) ( +2 )6 Yh+2[
h=O
I
(AS)
where YO = 1; Yh = .
h
I"·I.
.~lZ~j.; Z~j
J1< ••• <Jh
£
1-
= 1-2X~j (h,j=l, •• .,r) and (with
1
a positive integer)
J(B,y,o;E:)
1
=
U
ff
o0
(u-t)8t Y(l-u)<'i e1 -u -R.)£dR..du
= ~ (_l)i(~)fl(l_u)~+c-i
i=O
£
1
0
IUR.y+i(u-R.)8dldg
0
,
'" . L0 (-1)1e~)B(y+i+l,8+1)Be8+y+i+2,0+£-i+l)
1
•
1-
If, also, 8, y and 0 are positive integers
~ (-1) i(~.)
',....:~;;..:-:..,:,.,....~~=
J(B,y,o;£) '" ,lO
1 ~
1'"
~!y! o!
(B+y+~+£+2)!
G(.t)
y,u;c
(A6)
where
(A7)
and a[b] = a(a+1) ••• (a+b-1) is the b-th ascending factorial of a.
e·
11
Formula (AS) can be written
(AS)
with
(If
£
< 0,
G(sO,sr;£) can be defined arbitrarily.)
We note that
so the first tenn on the right hand side of
(AB) is 1.
We also note that
G(s,O;h)
h! 1JIh (5+1) = (-l)Jx;(O,s;h)
II
(AIO)
where
~h(Y) = 1 -
[2]
If + ~ - ••• + (-1)
So, for censoring from below (sr
and for censoring from above
(sO
=
=
h
[h]
~.
(Al~)
0)
0)
( r+s r +1) [h1K(r ' 0 , Sr'-h)- = s[h]
r
.
(A13)
For symmetrical censoring (sO=sr=s) we have
G(s,sjh)
:II
o
{(k+l) [kJ (s+1) [kl
if h is odd
if h = 2k •
(Al4)
Hence, for symmetrical censoring
(r+2s+l) [h]K(r,s,s;h)
.
if h is odd
= {
0
(k+1) [k]s[k
(AlS)
]
if h = 2k •
12
SuJllnarizing, we have the following expressions for the likelihood ratios:r
s[h]
L = 1 + I (_l)r
0 [hJ~hYh
h=l
(r+so+l)
For detecting censoring from be10w:
r
"
"
"
"
L=1
above:
+
I
s [h]
r
h=l(r+s +1) [h)
r
"
ehy
h
(A17)
(Al8)
symmetrical censoring:
"
(A16)
In each case large values of the' statistic arc to be regarded as significant of censoring' of the relevant type.
Some nlDTlerical values for calculating
the coefficients of ehyh in (A16)-(Al8) are shown in Table 1.
AJ.
Moments of L.
From the general theory of testing hypotheses, we have
(5) ,
Zrs are mutually independent and each is distributed
Under H~lb'
, the
lDlifonnlyover the interval (-1,1) so, for all e
(
E[(ZJll)qIH ! ) ]
2
0 0
,
={
0
. -1
(q+l)
It follows that for any h, h'
E[Y IH(!)]
h 0,0
Cr
if q is odd
..
.
1£ q IS even •
(Al9)
h)
= 0 = E[YhYh ,IH(l)]
0,0
}
(A20)
and
var(Y IH(!)) = E[y2 Iu (l)] = (hr)(';')h
h
0,0
h
0,0
J
Hence when using the statistics (A16), (Al7) testing for censoring from
below or above (sO
= s,
sr = 0 or So
= 0,
sr
= 5)
13
(A2l)
while when testing for symmetrical censoring (sO=sr=s)
(1)
var(L\HO,O) =
\
{(k+1) [k]s[k]}2 r 1 Z Zk
•
()( . e )
k~r/Z (r+Zs+1)[2k]
2k ~
L.
(A22)
Approximate significrulCC limits for L may be obtained by supposing the
distribution lmder
A4.
H~:6
is approximately noma!.
Alternative Tests.
Since
=
f
1
(1-2x1){1 + eZ~(1-Zxl)}dx1
° ez*Z
-- 11
(A23)
it follows that
E[XrIX~] = }[l - ~(1-ZX~)]
=
i -*e(1-2X~)
(A24)
Hence, the simplified test statistic, Ll' Jefined in (6 ) is, for our FGM
distribution and with e > 0
(A25)
We note that if So (sr) = 0 (and e > 0), the critical region beCOIOOS
simply X2r «X »K, with an appropriate value for the constant K. This
Z1
would, of course, be the appropriate likelihood ratio test of the null hypothesis, with the alternative that Xz itself has been subjected to censoring
from above (below).
14
In the case of synlnctrk
l"(,lIsorill~
(sO,",,\.=s), the critical region
(for all s > 0) is of form
(A26)
or equivalently
(A26) ,
This can be compared with the critical regions
(for symnetrical censoring)
X21 (~-X2r)
(for general censoring - see Johnson (1971))
X
21 + (1-X 2r)
for likelihood ratio tests of
lIa 2,6.
The values in Table I suggest that useful tests might be constructed by
taking as test statistics the first terms only in the s\.DllJlB.tions in
(A16)-(A18).
This would lead to critical regions (which do not depend on 8).
For censoring from helow:
"
"
"
(A27)
above:
" symnetrical censoring:
(A28)
(A29)
Yz > C •
Since Yl = \:-lZZ*' = L:_l(1-2X~J')' (A27) and (A28) are equivalent to
LJJ
J..
r
I xi· :-
j=1
J
C'
(A27) ,
C'
(A2B) ,
r
LXi·J
j=l
respectively.
(The
<
signs of the inequalities would be reversed if
On the null hypothesis
e < 0.)
1I~~6 (no censoring) the X~j'S are mutually
15
independent stwldard uniform variables.
•
'11terefofc,cven for r as smull
as 5, the distributi9n of their sum is closely approximated by a normal
'00'
, h expccted va l
.
I r
'
d lstrl
tlon Wlt
ue I
'2" rd
an vanance
IT
Kotz (1972, p. 64)).
(e.g. J 0 1mson and
So we obtain an approximate significance level a.
by taking
C'
= 'Z1
I r
r + Aa.t'rr
in (A27)'
in (A28)'
where ~P'a.) = 1 - a..
Fran (A20)
E[Y2IHa~6] = 0; var(Y2IHa~6) =
ri- r(r-l).
(A30)
Assuming YZ has an approximately normal distribution tmder
H~l6'
, we obtain
an approximate significance level a. for the test for symmetrical censoring
(A29) by taking
C ='I\a, f(r-l)
18
The rnOlOOnts of the Yh's tmder H(}) s'
sO' r
may be evaluated by the following
steps:(i)
(ii)
~t's
find the conditional expected value, given
~t,
and
find the expected value of (i) when the joint distribution of the
is that of the (sO+I)-th, (sb+2)-th, ••• , (sO+r)-th order statistics
among (r+sO+s;) variables, each with PDF flex).
For (i) we use the result:
E[ (z*)q\x*] =
21
=
where
zt = 1 - 2Xt.
f
1
0
(1-2x)QU +OZ* (1-2x) hlx
I
{8 CQ +2)-lZr
if q is odd
(q+l)-l
if q is even
(A3l)
(cf (A23))
16
For (ii) we use the result (for ordered variables)
p .qi
(1)
1
P
i
[qi]
E[.II X1h .IHs ' 51] = ----~- II (sb+h .+ L q.-q.)
1-1
1 0' I'
[tqi] i=1
1
j-l) 1
(1'+5 +5;+1) 1
(A32)
0
In fact, in view of (A3l), and since, conditionally on
!!'
the Z2'''s are
n1Jtually independent, we need only the special case qi = 1 for all i,
{h
(s'+ho+i-l)}/(r+s'+s'+l)[P]
i=1 0 l O r
(A33)
We find
E[Y
1
I
IH(~)
,] = 1 e
E[l-2X*oIH(~) ,]
sO,sr
! j=l
1) 5 0 ,SI'
'"
1
~
~
e ~L E[1- 2X1h IH (1)
, 5']
50'
h=l
= ~[r
r
- 2
e·
$6+h
L 1'+5'+5'+1]
h=1
r(S;-5
I'
0
I'
0)
(A34)
• 3(r+sb+s~+1) e •
In particular
(A35)
Next,
So
Y2 '" ?<?,ZZj Z2j'
E[Z~jZ2j' IX!j,X!j'] = ~
and
) )
E[Y2IH~;) stl "'} e2~
0' r
r E[l-2(XiJo+xr·,)
) <J '
)
e2ZijZijl •
+ 4X!).Xi)·IIH(P 5']
.
sO' I'
17
=~
e2[(~)
-
r+s'~s'+l L ) (ZsO+h+h') +
h<h'
o r
•
x
4
(r+s'+s'+l)
o r
[2]
~<~, (sO+h) (sO+h'+l)] •
Perfonning the sUlJlDations (see, e.g. David and Johnson (1954, pp. 239-40)) we
obtain
r(r-l){(sO-S~)2+S0+S;} 2
= 18(r+sb+ s;+I)[Z]
(A36)
e.
We can use (A30) together with
E[Z~~IXj]' = ~
(for all X!j)
to calculate the variance of Y1 (=
(A37)
lJ:=1Z~J') under H~;) 5'. We have
_
0' r
r
= I U[Z*~IH(~) ,]
j=l
1
=
2J
+
3r
so,sr
+
2E[Y IH(~) ,]
2 so,sr
r(r-l){(s'-s,)2+ s '+s'}
0 r
0 r
e2
9(f+s +s;+1) [2]
O
and so, using (A28)
var(Y IH(l) ) = ~ r +
1 s' 5'
.)
0' r
=_1 r
3
+
r{(r-l)(r+sO+s'+l)(sO'+s') - (2r+s o
'+s'+1)(So'-s,)2}
r
r
r
r
9(r+s'+s'+1)2(r+s'+s'+2)
OrO r
(A38)
For censoring fram below (5 0=5, 5;=0) or above (sO=O, sr=s) this gives
var(YIIH~l6) = var(Y1 IH C1 \.= 1 r + rs{ (r-l)(r+s+l) - (2r+s+1)s} e2
,
•
D,s!
9(r+s+l)2(r+s+2)
= 1 r + rs{r(r-s~ • (5+1) 2} e2 •
3
9(r+s+l) (r+s+2)
(A39)
18
Note that as s
+ ~
(with r fixed)
•
Evaluation of variance of Y2, expected values of Yh (h > 2) and higher
manents and product-manents of all Y's, mder H(P s ' requires evaluation
sO' r
of quantities
r-E+1
r
p
(l
(T (r,s',s')=)T = 2 ... I E[ IT Z*, IH ,) ,]
P
0 r
P
. < ••• <'J 1.
'=1 1J 1' sO,s r
Jl
p
=
=
r-E+l
r
2 ••• ~
p
E[ IT (1-2X
)IH(~)
,]
sO,s r
. 1
1h.
h 1<••• < P 1=
1
r e
p-u+1
E
u
1
1. ... }' ~ (_2)u
~ E[ ~ X1h . IH~,~s']
h1<•.• <hp u-O
1 <••• <1
1
u a-I 1a 0 r
r-~+l
? ...
~
[]
~ (-2)u{(r+sb+s;+I) u}-
=
1
r-~+l
l
u-O
r
••• )
<...<\
h1
p-u+l
x
,
~
•••
~
u
(A40)
n (sO+h i +a-I) •
11<••• <lUa=1
a
(The term correspcnding to u = 0 is 1. The final multiple sun of products
can be expressed as a polynomial in So and r with coefficients calculated from
tables like those in David and Johnson (1954, pp. 239-240).)
For example
(1)
E[YIY2 IH, ,]
sO,sr
.. E[(
r
r-l r
(1)
L Z~,)( I l:
j=l J
r-l r
= E[,
,\
J< .J t
Z2·Z~·,)IH,
,]
j j'
J J
SO'Sr
(Z~J.+Z~,,)Z2'Z~.,
J
J
J
r-2 r-l r
+ 3 ~ .y.\
J < J r<J f;
z~.z~.,Z~,,,IH(~)s']
J
J
J
sO' r
1 r-1 r
(1)
1 3r - 2 r-l r
(1)
)'
E[Zl*J,+ZI*' , IH., ,] + ~ ~ ~ ~ E[Z*l, Z*l"Z*l,,,IHs ' s,]
j <jt
J
sO,sr
~r J< ]'<J"
J J J
0' r
= -9 e l
(A41)
.
19
and
•
+
r-3 r-2 r-l r
(1)
6 ~
~ ~ ~ E[Z~J.Z~J.,Z~J·",IHS' 5']
J < J'< J"<J'"
2
4
-- r(fS l ) + -rr
28 (r -2)T 2 + ---zr
20 T4°
.
0' r
(A42)
Values of
(ii)
So and
sr em
be interchanged by multiplying entries
by (_l)h.
(iii) A convenient formula for computation is (with sr > sO)
(SO+l)[i J
G(sO,sr;h)
=
h!. )'
idi/2
In particular, G(s,s+ljh)
=
-.......-r,-_.
with
k =
1
-r(h-l)
{ ~h
l.
1
0
seven •
-
(s+l)[i]
i!
,whence
h!(S+~~[kl
if h .is oJd
lOfh
(fi-Zl) !
1.
h! Li<h/2
(r+2s+2)[h]K(r,s,s+1;h) =
( s r -s 0) [h-2i]
=
(k+l)[h-k]CS+l)[k]
•
21
REFERENCES
•
David, F.N. and Johnson, N.L. (1954). Statistical treatment of censored
data. Part I: Ftndamenta1 fonnulae, Biometrika, 41, 228=240.
Hasofer, A.M•. and Davis, R.B. (1~70). On the identificati0J.10f the selection
var1able 1n Pearson's ~electlon model, Austral. J. StatIst., 21, 71-77.
Johnson, N.L. (1966) SaJll>le censoring, Proc. 12th Conf. Des. Exp., Amy Res.
Des. Testing, 403-424.
(1971). )Comparison of same tests of sample censoring of extreme
--v-a2"'lue-s-,-AU
.... stral. J. Statist., 13, 1-6.
_ _....-...--__..,.. (1972). Inferences on sample size:
Tiab. Estad.,. 23, 85-110.
(1975).
Extren~
Sequences of samples,
sample censoring problems with multivariate
-~aa""~ta---I-, Institute of Statistics, University of North Carolina, Mimeo
Series No. 1010.
-----
(1978a). Tests for censoring of extreme values (especially
WfiCn popwation distributions are incanp1etely defined, Proc. Amy Res.
Office Workshop on Robustness.
Johnson, N.L. and Kotz, S. (1972). Distributions in Statistics. Continuous
Univariate Distributions - 2, New York: Wiley.
(1975). On some generalized Farlie-Gunbe1Mbrgenstern distributions, Camm. Statist., !, 415-427.