ON GOODNESS-OF-FIT TESTS
BASED ON RAO-BLACKWELL DISTRIBUTION FUNCTION ESTIMATORS
by
Federico Jorge O'Reilly
Institute of Statistics
Mimeograph Series No. 788
Raleigh - December 1971
TABLE OF CONTENTS

1. INTRODUCTION AND SUMMARY
2. REVIEW OF LITERATURE
3. UNIFORM STRONG CONVERGENCE OF RAO-BLACKWELL DISTRIBUTION FUNCTION ESTIMATORS
   3.1 Introduction
   3.2 A Glivenko-Cantelli Type Result
   3.3 Strong Consistency of Quantile Estimators
4. GOODNESS-OF-FIT TESTS
   4.1 Statement of the Problem
   4.2 A Distribution-Free Test
   4.3 The Conditional Probability Integral Transformation
5. CHI-SQUARE TYPE STATISTICS
   5.1 Simple Goodness-of-Fit Problem
   5.2 Composite Goodness-of-Fit Problem
6. APPLICATIONS
   Test for Normality, $\mu$, $\sigma^2$ Unknown
   Test for Exponentiality
   Multivariate Normal Test
   Testing the Fit of a Regression Model
7. LIST OF REFERENCES
1. INTRODUCTION AND SUMMARY
Goodness-of-fit problems have long received considerable attention due to their importance in practice. Simple goodness-of-fit problems, i.e., those in which the null hypothesis consists of a single distribution function, have been widely studied for the univariate case. Distribution-free and asymptotically distribution-free tests are common in the literature.
The concept of distribution-freeness has been generalized here for the composite goodness-of-fit problem by requiring invariance of the distribution of the test statistic under choices of distribution functions in the null class. The tests so far proposed are clearly generalizations of the simple goodness-of-fit tests, using maximum likelihood estimation and, more recently, minimum variance unbiased estimation.

Under finite sample theory, some of the available tests for composite goodness-of-fit problems have been shown to be distribution-free; however, the distribution of the test statistic fails to be invariant for different choices of the null hypothesis class. This fact provided motivation for the present work.
In this thesis, the notion of distribution-freeness is analysed for composite goodness-of-fit problems. Sufficient conditions are established for a problem to admit a distribution-free test, and then the notion of class-freeness is introduced as the invariance of the distribution of a test statistic, not only under choices of particular members of the null class to be tested, but also under the choice of null classes.
Rao-Blackwell distribution function estimators are used throughout this work.
In Chapter 3, the asymptotic behaviour of Rao-Blackwell
estimators is studied.
A Glivenko-Cantelli type result is shown for
the Rao-Blackwell distribution function estimators, from which
strong consistency of quantile estimators is derived as an example
of immediate application.
In Chapter 4, the notions of distribution-freeness and class-freeness are studied. Sufficient conditions for a generalized Kolmogorov-Smirnov statistic to be distribution-free are obtained. A generalization of the probability integral transformation is given which, under certain conditions, reflected in the absolute continuity of the Rao-Blackwell distribution estimators of the null class to be tested, allows a subset of the sample to be transformed into independently distributed random variables on the unit interval.
In Chapter 5, the chi-square statistic is proposed as a test
statistic for the sample obtained by transforming the original
observations.
In this case, it is shown how the test can also be
carried out with the beginning sample (actually a subset of it),
with the difference that not only the cells are selected at random,
but for each observation of the sample, a different set of random
cells must be selected.
The distribution of the chi-square type
statistics proposed is identified as that of the exact chi-square
statistic for testing a simple goodness-of-fit problem with equiprobable cells.
In Chapter 6, some direct applications are given. Univariate normal and exponential fit problems are considered. One multivariate problem is analysed and a test for the fit of a regression model is given.

2. REVIEW OF LITERATURE
The problem of testing the goodness-of-fit of a given distribution or a given class of distribution functions by means of an independent sample is an old one. Perhaps the earliest test developed for this purpose is the chi-square goodness-of-fit test (usually denoted $\chi^2$), which has been widely used. Cochran [4] gives an excellent exposition of this test.

Later, the Kolmogorov-Smirnov and Cramér-von Mises statistics were introduced; these are based on stronger theoretical aspects and more sophisticated "distances" between distribution functions. Darling [5] discusses these tests and their properties.
In dealing with the problem of testing a class of distributions, i.e., a composite goodness-of-fit problem, under fixed selection of cells and under certain regularity conditions, a well known result is that the $\chi^2$ statistic, computed with maximum likelihood estimators (or minimum $\chi^2$ estimators), is asymptotically distribution-free.
If the composite problem has a null class whose members are identifiable by location and scale parameters $\theta_1$, $\theta_2$, then, with proper estimates $\hat\theta_1$, $\hat\theta_2$, the Kolmogorov-Smirnov statistic computed with the empirical distribution $F_n(x)$ and $F(x; \hat\theta_1, \hat\theta_2)$ is distribution-free (see David and Johnson [6]). Lilliefors [11], [12] simulated the distribution of the statistic under the null hypothesis for the normal and exponential cases.
Srinivasan [21] has shown that the Kolmogorov-Smirnov statistic, constructed with the minimum variance unbiased estimator (MVUE) $\tilde F$ and the usual empirical distribution $F_n$, is distribution-free for testing normality, and also for testing exponentiality.
Watson [22], [23] has shown that for a composite goodness-of-fit problem the $\chi^2$ statistic, computed with a random selection of cells, is asymptotically distribution-free in the case of location and scale parameters, and, in general, he identifies the asymptotic distribution as that of

$$\sum_{i=1}^{k-m-1} Z_i^2 \; + \sum_{i=k-m}^{k} \lambda_i Z_i^2,$$

where $Z_1, \ldots, Z_k$ are NID(0, 1), and $\lambda_{k-m}, \ldots, \lambda_k$ are some eigenvalues in (0, 1) which he identifies.
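Watson's limiting law can be checked by direct simulation. The sketch below is not part of the thesis; $k$, $m$, the eigenvalues, and the seed are hypothetical choices. The mean of the simulated variable is compared with its theoretical value $(k-m-1) + \sum \lambda_i$, since each $Z_i^2$ has mean 1.

```python
import random

def watson_limit_sample(k, m, lambdas, rng):
    """One draw of sum_{i=1}^{k-m-1} Z_i^2 + sum_{i=k-m}^{k} lambda_i Z_i^2,
    with the Z_i independent N(0, 1) and the lambda_i in (0, 1)."""
    assert len(lambdas) == m + 1          # the indices k-m, ..., k
    full = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k - m - 1))
    weighted = sum(lam * rng.gauss(0.0, 1.0) ** 2 for lam in lambdas)
    return full + weighted

rng = random.Random(0)
k, m = 5, 1
lambdas = [0.3, 0.7]                      # hypothetical eigenvalues in (0, 1)
draws = [watson_limit_sample(k, m, lambdas, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
print(round(mean, 2))                     # theoretical mean: (k - m - 1) + 0.3 + 0.7 = 4.0
```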
Moore [15] has extended Watson's results to the multivariate case. Like Watson, he uses ML estimators, and his assumptions contain the "regularity" assumptions for an estimation problem.

In this thesis, the papers used for reference in minimum variance unbiased distribution estimation are those of Sathe and Varde [18] for the univariate case, and Ghurye and Olkin [7] for the multivariate normal and related distributions.
3. UNIFORM STRONG CONVERGENCE OF RAO-BLACKWELL DISTRIBUTION FUNCTION ESTIMATORS

3.1 Introduction
Let $(R^\infty, \mathcal{B}^\infty, P^\infty)$ be the probability space corresponding to a sequence of independent observations $Y_1, \ldots, Y_n, \ldots$ of the random variable $Y$ distributed on the Borel line $(R, \mathcal{B})$ according to the probability measure $P$ whose associated distribution function is $F$. This probability space will be referred to as the independent sampling model.
For each $n \ge 1$ let $U_n: R^\infty \to R^n$ be the vector of order statistics of the first $n$ observations. Define, for each $x \in R$, $\delta_x: R^\infty \to \{0, 1\}$ by

$$\delta_x = \begin{cases} 1, & \text{if the first coordinate of the sample point is} \le x; \\ 0, & \text{elsewhere.} \end{cases}$$
Let $\mathcal{C}_n$ be the sub $\sigma$-algebra of $\mathcal{B}^\infty$ induced by $U_n$. The empirical distribution function evaluated at $x$, computed from the first $n$ terms of the sample, is

$$F_n(x) = E(\delta_x \mid \mathcal{C}_n).$$

The Glivenko-Cantelli Lemma says that, with probability 1,

$$\sup_{x \in R}\, \bigl| E(\delta_x \mid \mathcal{C}_n) - E(\delta_x) \bigr| \to 0 \quad \text{a.s. as } n \to \infty.$$
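The uniform convergence above can be watched numerically. The following sketch is illustrative only and not from the thesis; the Uniform(0, 1) parent, sample sizes, and seed are arbitrary choices. Since $F_n$ jumps at the order statistics, the supremum deviation is found by checking both sides of each jump.

```python
import random

rng = random.Random(1)
devs = {}
for n in (100, 10000):
    sample = sorted(rng.random() for _ in range(n))   # Uniform(0,1), so F(x) = x
    # F_n at the i-th order statistic is (i+1)/n, with left limit i/n, so the
    # supremum of |F_n - F| is attained at one side of some jump point.
    devs[n] = max(max(abs((i + 1) / n - x), abs(i / n - x))
                  for i, x in enumerate(sample))
print({n: round(d, 4) for n, d in devs.items()})
```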
For each $n \ge 1$, let $T_n: R^\infty \to R^{k_n}$ ($k_n$ some integer $\le n$) be such that:

3.1.1 $T_n$ is $\mathcal{B}^\infty$ measurable,

3.1.2 $T_n$ induces the $\sigma$-algebra $\mathcal{B}_n \subset \mathcal{C}_n$ (symmetry),

3.1.3 $\mathcal{B}_{n+1} \subset \sigma(\mathcal{B}_n, Y_{n+1})$, where $\sigma(\mathcal{B}_n, Y_{n+1})$ is the compound $\sigma$-algebra of $\mathcal{B}_n$ and that induced by the $(n+1)$st observation (sequentiality).
Definition

A sequence of statistics $(T_n)_{n \ge 1}$ defined on $(R^\infty, \mathcal{B}^\infty, P^\infty)$, for which 3.1.1 through 3.1.3 are satisfied, will be said to be a sequence of symmetric sequential statistics.
Berk [1] has shown that for a sequence of symmetric sequential statistics,

$$\text{(3.1.4)} \qquad E(Z \mid \mathcal{B}_n) = E(Z \mid \mathcal{B}_n, \mathcal{B}_{n+1}, \ldots) \quad \text{a.s.}$$

for any $Z$ that is $\sigma(Y_1, \ldots, Y_k)$ measurable ($k < n$) and integrable.

With 3.1.4 and the fact that $\bigl(\sigma(\mathcal{B}_n, \mathcal{B}_{n+1}, \ldots)\bigr)_{n \ge 1}$ is a monotonically contracting sequence of $\sigma$-algebras, he shows that $E(Z \mid \mathcal{B}_n) \to E(Z \mid \mathcal{B}_\infty)$ a.s., where $\mathcal{B}_\infty$ is the tail $\sigma$-algebra of the sequence $(\mathcal{B}_n)_{n \ge 1}$, which, by virtue of the Hewitt-Savage 0-1 law, is a.s. trivial; i.e., $E(Z \mid \mathcal{B}_n) \to E(Z)$ a.s.
Berk's result can be summarized as follows:

"Every Rao-Blackwell estimator based on a symmetric sequential statistic converges strongly to its expectation, whenever this is finite."

Here, for any r.v. $Z$, the Rao-Blackwell estimator of $E(Z)$ based on a sufficient $\sigma$-algebra $\mathcal{B}_n$ is $E(Z \mid \mathcal{B}_n)$.
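Berk's summarized result can be illustrated numerically. The sketch below is not part of the thesis; it assumes an Exp(1) parent, for which the Rao-Blackwell estimator of $F(x)$ given the sufficient statistic $T_n = \sum X_i$ has the closed form $1 - (1 - x/T_n)^{n-1}$, and the seed and sample sizes are arbitrary.

```python
import math
import random

def rb_estimate(x, sample):
    """Rao-Blackwell estimator E(delta_x | T_n) = 1 - (1 - x/T_n)^(n-1)
    for an exponential parent with unknown mean, T_n = sum of the sample."""
    n, t = len(sample), sum(sample)
    return 1.0 if x >= t else 1.0 - (1.0 - x / t) ** (n - 1)

rng = random.Random(8)
x = 1.0
target = 1.0 - math.exp(-1.0)   # F(1) = E(delta_1) for an Exp(1) parent
sample, ests = [], {}
for n in (10, 100, 10000):
    while len(sample) < n:
        sample.append(rng.expovariate(1.0))
    ests[n] = rb_estimate(x, sample)
print({n: round(e, 4) for n, e in ests.items()}, round(target, 4))
```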
3.2 A Glivenko-Cantelli Type Result

Loève [13], p. 409, states the following result:

Result 3.2.1

Let $(\mathcal{G}_n)_{n \ge 1}$ be a sequence of monotone sub $\sigma$-algebras and let $(Z_n)_{n \ge 1}$ be a sequence of r.v.'s converging to $Z$ a.s., such that

$$\sup_{n \ge 1} |Z_n| \in L_r \quad \text{for some } r \ge 1;$$

then,

$$E(Z_n \mid \mathcal{G}_m) \to E(Z \mid \mathcal{G}_\infty) \quad \text{a.s.}$$

as $m, n \to \infty$, where $\mathcal{G}_\infty$ is the tail $\sigma$-algebra of the sequence $(\mathcal{G}_n)_{n \ge 1}$.
Theorem 3.2

In the independent sampling model, let $(\mathcal{B}_n)_{n \ge 1}$ be sub $\sigma$-algebras induced by symmetric sequential statistics. Then,

$$\sup_{x \in R}\, \bigl| E(\delta_x \mid \mathcal{B}_n) - E(\delta_x) \bigr| \to 0 \quad \text{a.s. as } n \to \infty.$$
Proof

First note that for $(\mathcal{C}_n)_{n \ge 1}$, the sequence of sub $\sigma$-algebras induced by the order statistics vectors,

$$E(\delta_x \mid \mathcal{C}_n) = \frac{1}{n} \sum_{i=1}^{n} \delta_x^i,$$

where $\delta_x^i$ is defined for each $x \in R$ by

$$\delta_x^i = \begin{cases} 1, & \text{if the } i\text{th coordinate of the sample point is} \le x; \\ 0, & \text{elsewhere.} \end{cases}$$

Also, note that

$$E(\delta_x \mid \mathcal{C}_n) = E(\delta_x^i \mid \mathcal{C}_n), \quad \text{for } i = 1, \ldots, n;$$

therefore, since $\mathcal{B}_n \subset \mathcal{C}_n$, it follows that

$$E(\delta_x \mid \mathcal{B}_n) = E\bigl(E(\delta_x \mid \mathcal{C}_n) \mid \mathcal{B}_n\bigr),$$

from which $E(\delta_x \mid \mathcal{B}_n)$ can be thus expressed as

$$E(\delta_x \mid \mathcal{B}_n) = E(\delta_x^k \mid \mathcal{B}_n)$$

for any $k = 1, \ldots, n$.
Denote by $Z_n$ the $\mathcal{C}_{n-1}$ measurable random variable

$$Z_n = \sup_{x \in R}\, \bigl| E(\delta_x \mid \mathcal{C}_{n-1}) - E(\delta_x) \bigr|.$$

The sequence $(Z_n)_{n \ge 2}$ is non-negative, uniformly bounded by 1 and a.s. convergent to 0; therefore,

$$\sup_{n \ge 2} Z_n \in L_r, \quad \forall\, r \ge 1.$$

Also, note that

$$\sup_{x \in R}\, \bigl| E(\delta_x \mid \mathcal{B}_n) - E(\delta_x) \bigr| \le E(Z_{n+1} \mid \mathcal{B}_n) \quad \text{a.s.},$$

so it will suffice to show that

$$E(Z_{n+1} \mid \mathcal{B}_n) \to 0 \quad \text{a.s.}$$

Because of 3.1.4,

$$E(Z_{n+1} \mid \mathcal{B}_n) = E(Z_{n+1} \mid \mathcal{A}_n) \quad \text{a.s.},$$

where $\mathcal{A}_n = \sigma(\mathcal{B}_n, \mathcal{B}_{n+1}, \ldots)$ is monotonic with a.s. trivial limit; hence, by Result 3.2.1,

$$E(Z_{n+1} \mid \mathcal{A}_n) \to E(0) = 0 \quad \text{a.s.}$$
The random variable $E(\delta_x \mid \mathcal{B}_n)$ will be denoted $\tilde F_n(x)$ and referred to as the Rao-Blackwell distribution function estimator obtained by conditioning on $\mathcal{B}_n$ (or on $T_n$).
Note:

Multivariate generalizations of the Glivenko-Cantelli Lemma have been given in the literature. See, for example, Ranga Rao [16], Wolfowitz [24], [25], and Blum [3]. It is of interest to note that Theorem 3.2 would generalize also, whenever the conditions of symmetry and sequentiality are met. Among the multivariate versions of the Glivenko-Cantelli Lemma, Blum's version states the following:

$$\sup_{A \in \mathcal{A}}\, |\mu_n(A) - \mu(A)| \to 0 \quad \text{a.s.},$$

where $\mu_n$ is the empiric measure, $\mu$ must be absolutely continuous with respect to the corresponding Lebesgue measure in the Euclidean space $E$, and $\mathcal{A}$ is some collection of half-spaces. In his paper, he also conjectures the validity of the theorem when the requirement of absolute continuity of $\mu$ is dropped. If his conjecture is correct, so is its automatic generalization by virtue of Theorem 3.2.
3.3 Strong Consistency of Quantile Estimators

As an example of immediate application of Theorem 3.2, the strong consistency of quantile estimators obtained from Rao-Blackwell distribution function estimators is shown in this section, when the parent is assumed to be absolutely continuous and one-to-one.
Definition

For $\alpha \in (0, 1)$, the real number $x_\alpha$ will be said to be the $\alpha$th quantile of the distribution function $F$ if

$$x_\alpha = \inf\{y;\ F(y) \ge \alpha\}.$$

Let $\tilde F_n(x)$ be a Rao-Blackwell distribution function estimator of $F$, and define the quantile estimator $\tilde X_{n\alpha}$ to be the $\alpha$th quantile of $\tilde F_n(x)$ for each outcome of the sample.
First, it is shown that $\tilde X_{n\alpha}$ is not in general a Rao-Blackwell estimator, for which it suffices to show that

$$\tilde X_{(n+1)\alpha} \ne E(\tilde X_{n\alpha} \mid \mathcal{B}_{n+1})$$

in general. For this, let

$$f(x; \theta) = (1/\theta)\, I_{(0, \theta)}(x)$$

(where $I$ stands for an indicator function), where $\theta$ is unknown. $\tilde F_n(x)$, obtained by conditioning on the minimal sufficient statistic $X_{(n)}$ (the largest order statistic), is given by

$$\tilde F_n(x) = \begin{cases} (n-1)x / n X_{(n)}, & \text{for } 0 < x \le X_{(n)}; \\ 1, & \text{for } X_{(n)} < x. \end{cases}$$

Thus, the quantile estimator is given by

$$\tilde X_{n\alpha} = \begin{cases} \alpha n X_{(n)} / (n-1), & \text{for } 0 < \alpha < (n-1)/n; \\ X_{(n)}, & \text{for } (n-1)/n \le \alpha < 1, \end{cases}$$

and

$$E(\tilde X_{n\alpha}) = \begin{cases} \alpha n^2 \theta / (n^2 - 1), & \text{for } 0 < \alpha < (n-1)/n; \\ n\theta / (n+1), & \text{for } (n-1)/n \le \alpha < 1. \end{cases}$$

It is clear then that, for any $\alpha \in (0, 1)$, the quantile estimators are not Rao-Blackwell estimators; therefore, for strong consistency of quantiles, Berk's result can not be used.
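The Uniform(0, $\theta$) example above can be coded directly. The sketch below is illustrative only; $\theta$, $n$, and the seed are arbitrary choices, and the two functions implement $\tilde F_n$ and $\tilde X_{n\alpha}$ exactly as displayed.

```python
import random

def rb_cdf_uniform(x, x_max, n):
    """F_n~(x) for a Uniform(0, theta) parent, conditioning on X_(n) = x_max."""
    if x <= 0:
        return 0.0
    return (n - 1) * x / (n * x_max) if x <= x_max else 1.0

def rb_quantile_uniform(alpha, x_max, n):
    """alpha-th quantile of F_n~: inf{y : F_n~(y) >= alpha}."""
    if alpha < (n - 1) / n:
        return alpha * n * x_max / (n - 1)
    return x_max

theta, n = 2.0, 5000
rng = random.Random(2)
x_max = max(rng.uniform(0.0, theta) for _ in range(n))
est = rb_quantile_uniform(0.5, x_max, n)   # estimates the true median theta/2 = 1.0
print(round(est, 3))
```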
Theorem 3.3

Let $\tilde F_n$ be the Rao-Blackwell distribution function estimator of $F$, obtained by conditioning on a symmetric sequential statistic. If $F$ is absolutely continuous and one-to-one, then the quantile estimator $\tilde X_{n\alpha}$ converges almost surely to the quantile $x_\alpha$.

Proof

For any $\epsilon > 0$,

$$F(x_\alpha - \epsilon) < F(x_\alpha) < F(x_\alpha + \epsilon).$$

A necessary and sufficient condition for the almost sure convergence of $\tilde X_{n\alpha}$ to $x_\alpha$ is that for any $\epsilon > 0$ there exist a set $A_\epsilon$ and an integer $N_\epsilon$ such that

$$P(A_\epsilon) < \epsilon, \quad \text{and} \quad |\tilde X_{n\alpha} - x_\alpha| < \epsilon \ \text{for all } n \ge N_\epsilon \ \text{on } A_\epsilon^c.$$

Fix $\epsilon > 0$; $F$ is one-to-one, so there exists $\delta(\epsilon) > 0$ such that

$$F^{-1}(\alpha + 3\delta) - F^{-1}(\alpha - 3\delta) < \epsilon.$$

For this $\delta$, using Theorem 3.2 and the necessary and sufficient condition mentioned above, select the corresponding set $A_{\delta^*}$ for $\delta^* = \min\{\epsilon, \delta\}$, and an integer $N_{\delta^*}$, for which

$$P(A_{\delta^*}) < \epsilon,$$

and outside $A_{\delta^*}$,

$$\sup_{x \in R}\, |\tilde F_n(x) - F(x)| < \delta^* \quad \text{for all } n \ge N_{\delta^*};$$

in particular,

$$\sup_{x \in R}\, |\tilde F_n(x) - F(x)| < \delta \quad \text{on } A_{\delta^*}^c.$$

By its definition, $\tilde F_n(\tilde X_{n\alpha}) \ge \alpha$; thus,

$$F(\tilde X_{n\alpha}) > \alpha - \delta,$$

and on $A_{\delta^*}^c$ the maximum jump that $\tilde F_n(x)$ can have is clearly less than $2\delta$, for all $n \ge N_{\delta^*}$. Therefore,

$$\tilde F_n(\tilde X_{n\alpha} - 0) > \tilde F_n(\tilde X_{n\alpha}) - 2\delta,$$

hence,

$$\alpha - 3\delta < F(\tilde X_{n\alpha}) < \alpha + 3\delta,$$

which yields

$$|\tilde X_{n\alpha} - x_\alpha| < \epsilon \quad \text{on } A_{\delta^*}^c.$$

So let $N_\epsilon = N_{\delta^*}$ and $A_\epsilon = A_{\delta^*}$, which completes the proof.
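Theorem 3.3 can also be watched numerically. The sketch below is not from the thesis; it assumes an exponential parent with unknown mean $\mu$, whose Rao-Blackwell distribution estimator given $T_n = \sum X_i$ is $\tilde F_n(x) = 1 - (1 - x/T_n)^{n-1}$, and tracks the estimated median toward the true median $-\mu \ln(1/2)$ (the seed and sample sizes are arbitrary choices).

```python
import math
import random

def rb_quantile_exponential(alpha, total, n):
    """alpha-th quantile of F_n~(x) = 1 - (1 - x/T_n)^(n-1), 0 <= x <= T_n,
    the Rao-Blackwell estimator for an exponential parent with unknown mean."""
    return total * (1.0 - (1.0 - alpha) ** (1.0 / (n - 1)))

rng = random.Random(3)
mu = 2.0
x_alpha = -mu * math.log(0.5)            # true median, about 1.3863
sample, errors = [], {}
for n in (10, 100, 10000):
    while len(sample) < n:
        sample.append(rng.expovariate(1.0 / mu))
    errors[n] = abs(rb_quantile_exponential(0.5, sum(sample), n) - x_alpha)
print({n: round(e, 4) for n, e in errors.items()})
```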
4. GOODNESS-OF-FIT TESTS
4.1 Statement of the Problem
A goodness-of-fit problem arises when it is desired to test if a certain population produced a sample. The sample will be assumed to be independent. The population is identified with a probability measure, according to which the individual terms in the sample are distributed; or, more generally, the population is identified with any probability measure belonging to some well defined class.
Definition

The following structure will be referred to as a goodness-of-fit problem:

"Given an independent sample $X_1, \ldots, X_n$ of observations on the random variable $X$ (scalar or vector valued), whose probability measure is $P$, defined on the Euclidean space $(\Omega, \mathcal{A})$, it is desired to test

$$H_0: P \in \mathcal{P}, \quad \text{versus} \quad H_a: P \in \mathcal{Q} - \mathcal{P},$$

where $\mathcal{P}$, $\mathcal{Q}$ are classes of probability measures on $(\Omega, \mathcal{A})$ and $\mathcal{P} \subset \mathcal{Q}$."

Usually, $\mathcal{P}$ will be a parametric class, and $\mathcal{Q}$ will be unspecified. If strong assumptions can be made concerning the structure of $\mathcal{Q}$, it may happen that the goodness-of-fit problem is solvable by means of Neyman-Pearson theory.

A goodness-of-fit problem can be characterized as simple or composite according to the class $\mathcal{P}$; i.e., if $\mathcal{P}$ is a singleton class, the problem will be said to be simple; otherwise it will be said to be composite.
Also, the dimensionality of the random observations will characterize a problem as univariate or multivariate.

Because of the regularity of conditional probabilities in Euclidean spaces, throughout this work all conditional distributions will be assumed to exist everywhere.

All problems considered will be understood to be univariate goodness-of-fit problems unless otherwise stated. Specifically, a multivariate problem is dealt with in Chapter 6.

In Section 2 of this chapter, a distribution-free test, whose existence depends upon the particular class to be tested, is proposed for the composite goodness-of-fit problem. Sufficient conditions are given on the class for the existence of such a test.

In Section 3, a different approach is followed by way of generalizing the probability integral transformation. The notion of distribution-freeness is analysed, and class-freeness is defined. Sufficient conditions are then given, on a class, for class-free, distribution-free statistics to exist. The usual absolute continuity requirement for a simple goodness-of-fit problem to admit a distribution-free statistic is reconstructed as a particular case of the sufficient conditions obtained in this section.

4.2 A Distribution-Free Test

For purposes of completeness, the definition of a distribution-free statistic for a composite goodness-of-fit problem is given.
Definition

A distribution-free statistic, for a given composite goodness-of-fit problem whose null class is $\mathcal{P}$, is defined to be any measurable function of the observations whose distribution is the same regardless of which $P \in \mathcal{P}$ is true.

Note:

If the problem is simple, i.e., $\mathcal{P}$ is a singleton class, then the above definition does not particularize to the distribution-freeness of a statistic for the corresponding problem.
Let $\mathcal{P}$ be the null class of a composite goodness-of-fit problem and let $T_n$ (inducing $\mathcal{B}_n$) be the minimal sufficient statistic for $\mathcal{P}$. Denote by $F_n$ the empirical distribution function and by $\tilde F_n$ the Rao-Blackwell distribution function estimator obtained by conditioning on $\mathcal{B}_n$.

Definition

The generalized Kolmogorov-Smirnov statistic for the above problem is defined to be

$$\sup_{x \in R}\, |F_n(x) - \tilde F_n(x)|.$$

Srinivasan [21] has shown, for the normal and exponential goodness-of-fit problems, that the above statistic is distribution-free. He also simulated its distribution for different values of $n$ for these cases.
Next, sufficient conditions on a composite problem for which the generalized Kolmogorov-Smirnov statistic is distribution-free are derived. For this, let $(R^n, \mathcal{B}^n)$ be the $n$-dimensional product space of the sample $X_1, \ldots, X_n$; as before, $T_n$ (inducing $\mathcal{B}_n$) will denote the minimal sufficient statistic for $\mathcal{P}$ and $U_n$ (inducing $\mathcal{C}_n$) will denote the order statistics vector.
Definition

The class $\mathcal{P}$ of measures on the Borel line $(R, \mathcal{B})$ is said to admit an invariance transformation if there exists, for each $P \in \mathcal{P}$, a one to one function $g_P: R \to R^* \subset R$ which induces a probability space $(R^*, \mathcal{B}^*, P^*)$ that is the same for all $P \in \mathcal{P}$.
The above definition can be relaxed to allow $g_P$ to be essentially one to one, by which the following is meant. Suppose $g_P$ is onto $R^*$, and denote by $S_y^{-1}$ the inverse image of $\{y\}$, $y \in R^*$, i.e., $S_y^{-1} = \{x \in R;\ g_P(x) = y\}$. Also, assume there exists a criterion by means of which, for each $y \in R^*$, a "canonical" member of $S_y^{-1}$ can be chosen. Denote this canonical member by $y^{-1}$; then, if the set

$$\bigcup_{y \in R^*} \bigl(S_y^{-1} - \{y^{-1}\}\bigr)$$

is measurable $(\mathcal{B})$ and has $P$ measure 0, $g_P$ will be said to be essentially one to one.

An example of an essentially one to one function $g_P$ is the probability integral transformation for $P \ll \lambda$, i.e., dominated by the Lebesgue measure on $(R, \mathcal{B})$. In this case, the canonical element can be chosen as the infimum or supremum of the sets $S_y^{-1}$.

In what follows, when a class $\mathcal{P}$ is said to admit an invariance transformation, the functions $g_P$ will be assumed to be one to one. It can be shown that all results presented will also hold for essentially one to one functions.
Clearly, if $\mathcal{P}$ admits an invariance transformation, then the class $\mathcal{P}^{(n)}$ of product measures on $(R^n, \mathcal{B}^n)$ also admits an invariance transformation; for if $(g_P)_{P \in \mathcal{P}}$ is the set of one to one functions inducing $(R^*, \mathcal{B}^*, P^*)$, then the functions $g_P^n: R^n \to R_*^n$, defined for each $P^n \in \mathcal{P}^{(n)}$ by

$$g_P^n(x_1, \ldots, x_n) = \bigl(g_P(x_1), \ldots, g_P(x_n)\bigr),$$

are one to one and induce the probability space $(R_*^n, \mathcal{B}_*^n, P_*^n)$, which is the same for all $P^n \in \mathcal{P}^{(n)}$.

Let $\mathcal{B}_n$, $\mathcal{C}_n$ be the sub $\sigma$-algebras induced by the minimal sufficient statistic and the order statistics vector, respectively, for the family $\mathcal{P}$.

Condition 4.2.1

Using the above notation, $\mathcal{P}$ admits an invariance transformation, and the direct mappings of $\mathcal{B}_n$, $\mathcal{C}_n$ by means of $g_P^n$ (which are themselves sub $\sigma$-algebras of $\mathcal{B}_*^n$) are the same for all $P \in \mathcal{P}$. Denote these $\sigma$-algebras by $\mathcal{B}_n^*$, $\mathcal{C}_n^*$, respectively.
Lemma 4.2.2

Assume $\mathcal{P}$ meets Condition 4.2.1, and let $X$ be any quasi-integrable ($\mathcal{P}$) random variable on $(R^n, \mathcal{B}^n)$, with $X = X^* \circ g_P^n$. Then

$$E(X \mid \mathcal{B}_n) = E(X^* \mid \mathcal{B}_n^*) \circ g_P^n \quad \text{a.s.}$$

Proof

By definition, $E(X \mid \mathcal{B}_n)$ is $\mathcal{B}_n$ measurable; moreover, $(g_P^n)^{-1}(\mathcal{B}_n^*) = \mathcal{B}_n$; therefore, following Lemma 1 of Lehmann [10], p. 37, there exists a $\mathcal{B}_n^*$ measurable real-valued function $h$ such that

$$\text{(4.2.2.1)} \qquad E(X \mid \mathcal{B}_n) = h \circ g_P^n \quad \text{a.s.}$$

Using the definition of conditional expectations, and the fact that $B \in \mathcal{B}_n$ iff $B = (g_P^n)^{-1}(B^*)$ for some $B^* \in \mathcal{B}_n^*$,

$$\text{(4.2.2.2)} \qquad \int_B E(X \mid \mathcal{B}_n)\, dP^n = \int_B X\, dP^n = \int_{B^*} X^*\, dP_*^n = \int_{B^*} E(X^* \mid \mathcal{B}_n^*)\, dP_*^n,$$

the second equality being an application of the transformation theorem. From (4.2.2.1) and (4.2.2.2) it follows that

$$\int_{B^*} h\, dP_*^n = \int_{B^*} E(X^* \mid \mathcal{B}_n^*)\, dP_*^n \quad \text{for all } B^* \in \mathcal{B}_n^*,$$

which, by a.s. uniqueness of conditional expectations, means that

$$h = E(X^* \mid \mathcal{B}_n^*) \quad \text{a.s.}$$
Theorem 4.2.3

Consider the composite goodness-of-fit problem with null class $\mathcal{P}$. Assume

i) $\mathcal{P}$ meets Condition 4.2.1, and,

ii) $g_P$ is monotonic for each $P \in \mathcal{P}$.

Then the statistic

$$\sup_{x \in R}\, |F_n(x) - \tilde F_n(x)|$$

is distribution-free, where $F_n(x) = E(\delta_x \mid \mathcal{C}_n)$ and $\tilde F_n(x) = E(\delta_x \mid \mathcal{B}_n)$.

Proof

Fix $P \in \mathcal{P}$ arbitrarily, and consider the transformed probability space $(R_*^n, \mathcal{B}_*^n, P_*^n)$. In this space, due to the monotonicity of $g_P$, for $X = \delta_x$ in the original space, $X^*$ is given by $\delta_y^*$, $y = g_P(x)$, where

$$\delta_y^* = \begin{cases} \delta_{g_P(x)}, & \text{if } g_P \text{ is increasing; and,} \\ 1 - \delta_{g_P(x)}, & \text{if } g_P \text{ is decreasing.} \end{cases}$$

Applying Lemma 4.2.2,

$$\tilde F_n(x) = E(\delta_y^* \mid \mathcal{B}_n^*) \circ g_P^n, \qquad F_n(x) = E(\delta_y^* \mid \mathcal{C}_n^*) \circ g_P^n.$$

Therefore, the r.v. $|F_n(x) - \tilde F_n(x)|$ can be written as

$$\bigl| E(\delta_y^* \mid \mathcal{C}_n^*) - E(\delta_y^* \mid \mathcal{B}_n^*) \bigr| \circ g_P^n,$$

and thus the probability in the original space,

$$P^n\Bigl(\sup_{x \in R}\, |F_n(x) - \tilde F_n(x)| \le c\Bigr),$$

is equal to the probability

$$P_*^n\Bigl(\sup_{x \in R}\, \bigl| E(\delta_{g_P(x)}^* \mid \mathcal{C}_n^*) - E(\delta_{g_P(x)}^* \mid \mathcal{B}_n^*) \bigr| \le c\Bigr),$$

and clearly this last statement is equivalent to

$$P_*^n\Bigl(\sup_{y \in R^*}\, \bigl| E(\delta_y^* \mid \mathcal{C}_n^*) - E(\delta_y^* \mid \mathcal{B}_n^*) \bigr| \le c\Bigr),$$

since the supremum can be taken over the images $g_P(x)$. This last probability is the same for all $P \in \mathcal{P}$.
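The invariance can be seen by simulation for the location family N($\mu$, 1), $\mu$ unknown, where $g_P(x) = x - \mu$ is a standardizing function, the minimal sufficient statistic is $\bar X$, and the Rao-Blackwell estimator is $\tilde F_n(x) = \Phi\bigl((x - \bar X)/\sqrt{(n-1)/n}\bigr)$, since $X_1 \mid \bar X \sim N(\bar X, (n-1)/n)$. The sketch below is illustrative only (sample size, replication count, and seed are arbitrary); it checks that the mean of the generalized Kolmogorov-Smirnov statistic is essentially the same under two very different members of the null class.

```python
import math
import random

def phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gen_ks_stat(sample):
    """sup_x |F_n(x) - F_n~(x)| for the N(mu, 1) null class, mu unknown.
    F_n~(x) = Phi((x - xbar)/sqrt((n-1)/n)); since F_n~ is continuous and F_n
    is a step function, the sup is located by checking both sides of each jump."""
    n = len(sample)
    xbar = sum(sample) / n
    scale = math.sqrt((n - 1) / n)
    dev = 0.0
    for i, x in enumerate(sorted(sample)):
        f_tilde = phi((x - xbar) / scale)
        dev = max(dev, abs((i + 1) / n - f_tilde), abs(i / n - f_tilde))
    return dev

def mean_stat(mu, reps, n, rng):
    return sum(gen_ks_stat([rng.gauss(mu, 1.0) for _ in range(n)])
               for _ in range(reps)) / reps

rng = random.Random(4)
m0 = mean_stat(0.0, 400, 20, rng)
m7 = mean_stat(7.0, 400, 20, rng)
print(round(m0, 3), round(m7, 3))   # nearly equal: the statistic is distribution-free
```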
The two cases given by Srinivasan [21], viz., the normal and the exponential case, are immediate by application of the above result. The functions $g_P$ used in his cases are the "standardizing" functions. In his paper, however, there is a mistake in the claim that

$$\sup_{x \in R}\, |\tilde F_n(x) - F_n(x)|$$

is the same as

$$\max_{i=1,\ldots,n}\, |\tilde F_n(x_i) - F_n(x_i)|,$$

where $x_1, \ldots, x_n$ is the observed sample, since the supremum in the first expression need not be attained.
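The point is easy to see numerically: against a continuous distribution function, the empirical d.f. must be checked from both sides of each jump, since the supremum need not be attained at the sample values themselves. A small sketch (illustrative numbers only, with a continuous d.f. $F$ standing in for the continuous estimator):

```python
def sup_vs_naive(sample, F):
    """Compare sup_x |F_n(x) - F(x)| (checking the left limits i/n as well)
    with the naive max over the values |F_n(x_i) - F(x_i)| alone."""
    n = len(sample)
    xs = sorted(sample)
    naive = max(abs((i + 1) / n - F(x)) for i, x in enumerate(xs))
    sup = max(max(abs((i + 1) / n - F(x)), abs(i / n - F(x)))
              for i, x in enumerate(xs))
    return naive, sup

# A continuous d.f. F(x) = x on (0, 1) and a sample pushed toward 1:
naive, sup = sup_vs_naive([0.9, 0.95, 0.99], lambda x: x)
print(naive, sup)   # the naive max understates the supremum
```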
Note on Consistency

From Theorem 3.2, the generalized Kolmogorov-Smirnov statistic converges to 0 a.s. whenever the null hypothesis is true.

The question of sensitivity, or limiting behaviour under the alternative hypothesis (meaning any $P' \notin \mathcal{P}$), is in general a difficult one. It can be asserted, however, that if the minimal sufficient statistic for $\mathcal{P}$ is not also sufficient for $\mathcal{P} \cup \{P'\}$, then $\tilde F_n(x)$ can not be expected to be an adequate estimate of $P'(-\infty, x]$. Thus, under such alternatives, the generalized Kolmogorov-Smirnov statistic will be expected to be "larger".

With respect to the limiting behaviour of $\tilde F_n(x)$ under an alternative $P'$, no easy answer seems to be available. This problem, which in essence is the limiting behaviour of a conditional expectation when the probability measure was wrongly assumed, requires some attention.
4.3 The Conditional Probability Integral Transformation

Consider a simple goodness-of-fit problem whose null class corresponds to the distribution function $F_0$.

As remarked in Section 2, distribution-freeness of a statistic for a simple problem is not a particular case of that for a composite problem. Following the definition given by Birnbaum [2], in the simple problem a statistic is said to be distribution-free if the distribution of the test statistic is the same for different null classes.

Let $\mathfrak{M}$ be the collection of all singleton null classes whose corresponding distribution functions are absolutely continuous. A well known result is that the Kolmogorov-Smirnov statistic,

$$\sup_{x \in R}\, |F_n(x) - F_0(x)|,$$

is distribution-free for all absolutely continuous $F_0$ (i.e., with respect to $\mathfrak{M}$).

The above assertion can be shown very easily using the probability integral transformation.
For any absolutely continuous parent $F_0$, the random variables $F_0(X_1), \ldots, F_0(X_n)$ are independently and uniformly distributed in the $n$-dimensional unit hypercube. Obviously, such a transformation leaves invariant the $\sigma$-algebra induced by the order statistics, and thus, $F_0$ being monotonic, by Lemma 4.2.2,

$$F_n(x) = F_n^*(y) \circ F_0^n,$$

where $F_0^n: R^n \to (0, 1)^n$ is given by $\bigl(F_0(x_1), \ldots, F_0(x_n)\bigr)$, and $F_n^*(y)$ is the empirical distribution function computed from the transformed independent "sample" $F_0(X_1), \ldots, F_0(X_n)$ and evaluated at $y = F_0(x)$.

Therefore, by an argument similar to that used in Theorem 4.2.3,

$$\sup_{x \in R}\, |F_n(x) - F_0(x)|$$

is identically distributed with

$$\sup_{y \in (0, 1)}\, |F_n^*(y) - y|,$$

and this is true for all absolutely continuous $F_0$.
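The invariance is exact, not merely asymptotic, and can be seen by carrying one set of uniforms through two different inverse transformations. The sketch below is not part of the thesis; the Exp(1) and logistic parents, $n$, and the seed are arbitrary choices. The two Kolmogorov-Smirnov values coincide because the statistic only sees $F_0(X_i) = U_i$.

```python
import math
import random

def ks_stat(sample, F0):
    """sup_x |F_n(x) - F0(x)|, checking both sides of every jump of F_n."""
    n = len(sample)
    xs = sorted(sample)
    return max(max(abs((i + 1) / n - F0(x)), abs(i / n - F0(x)))
               for i, x in enumerate(xs))

rng = random.Random(5)
u = [rng.random() for _ in range(50)]

# The same underlying uniforms, carried to two different parents by F0^{-1}:
exp_sample = [-math.log(1.0 - v) for v in u]          # Exp(1),    F0(x) = 1 - e^{-x}
logi_sample = [math.log(v / (1.0 - v)) for v in u]    # logistic,  F0(x) = 1/(1 + e^{-x})

d_exp = ks_stat(exp_sample, lambda x: 1.0 - math.exp(-x))
d_logi = ks_stat(logi_sample, lambda x: 1.0 / (1.0 + math.exp(-x)))
print(round(d_exp, 6), round(d_logi, 6))
```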
In the remainder of this section, the composite problem will be studied. The main idea will be that of transforming the independent observations into U(0, 1) independent random variables. Unlike the simple problem, the relationship between the statistic computed in the transformed space and that computed in the beginning space will not be studied, except for a chi-square type statistic proposed in Chapter 5.

Let $\mathfrak{m}$ be a collection of null classes $\mathcal{P}$, each of which identifies a composite goodness-of-fit problem. First, the class-freeness property of a statistic is introduced.
Definition

A class-free ($\mathfrak{m}$) statistic for solving the composite goodness-of-fit problem whose null class is $\mathcal{P} \in \mathfrak{m}$ is a distribution-free ($\mathcal{P}$) statistic whose unique distribution is invariant under choices of classes $\mathcal{P} \in \mathfrak{m}$.

First, note that the definition of a distribution-free statistic for the simple problem is a particular case of the above definition, namely a class-free ($\mathfrak{M}$) statistic, where $\mathfrak{M}$ is the set of all singleton classes $\{P_0\}$ (usually where $P_0 \ll \lambda$ in the univariate case).

Let $\mathcal{P}$ be a null class. Consider the minimal sufficient statistic $T_n$ for $\mathcal{P}$ based on $n$ independent observations. Denote by $\mathcal{B}_n$ the $\sigma$-algebra induced by this statistic and let $\tilde F_n(x)$ be the corresponding Rao-Blackwell estimating distribution function. Denote by $\mathcal{F}$ the class of distribution functions corresponding to $\mathcal{P}$. A well known result is that:
Result 4.3.1

If $F \in \mathcal{F}$ is the parent and $F$ is not absolutely continuous, then with positive probability $\tilde F_n(x)$ will not be absolutely continuous.

Proof

Denote by $P$ the probability measure associated with $F$ and by $\tilde P_n$ the conditional probability measure associated with $\tilde F_n$. Since $P \not\ll \lambda$, there exists a Borel set $D$ for which $\lambda(D) = 0$ and $P(D) > 0$. Thus, since

$$E\bigl(\tilde P_n(D)\bigr) = P(D) > 0,$$

it follows that $\tilde P_n(D) > 0$ with positive probability.

From the above result it can be asserted that if $P \in \mathcal{P}$ is the true parent, and if $\tilde F_n$ is absolutely continuous a.s., then $P \ll \lambda$.
Theorem 4.3.2

Let $X_1, \ldots, X_n$ be an independent sample from the univariate parent $P \in \mathcal{P}$. Let $T_n$ be any sufficient statistic for $\mathcal{P}$ and $\tilde F_n$ the corresponding Rao-Blackwell distribution estimator of $F$. If $\tilde F_n$ is absolutely continuous a.s., then

$$\tilde F_n(X_i) \sim U(0, 1), \quad \text{for } i = 1, \ldots, n.$$

Proof

For fixed $T_n$, $\tilde F_n$ is the conditional distribution function of $X_i$; thus, by means of the probability integral transformation, the conditional distribution of $\tilde F_n(X_i)$ given $T_n$ is uniform in (0, 1) a.s.; i.e., $\tilde F_n(X_i) \sim U(0, 1)$.

Note that $\tilde F_n(X_1), \ldots, \tilde F_n(X_n)$ are not necessarily independent; in fact, for given $T_n$, the conditional distribution of $X_1, \ldots, X_n$ will, in general, be singular.
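Theorem 4.3.2 can be checked for the exponential class, where the Rao-Blackwell estimator given $T_n = \sum X_i$ has the closed form $\tilde F_n(x) = 1 - (1 - x/T_n)^{n-1}$. The sketch below is illustrative only ($n$, the replication count, and the seed are arbitrary choices); it verifies the marginal U(0, 1) behaviour of $\tilde F_n(X_1)$.

```python
import random

def rb_cdf_exponential(x, total, n):
    """Rao-Blackwell (MVUE) estimator of the exponential d.f. given T_n = sum(X_i):
    F_n~(x) = 1 - (1 - x/T_n)^(n-1) for 0 <= x <= T_n."""
    if x <= 0:
        return 0.0
    if x >= total:
        return 1.0
    return 1.0 - (1.0 - x / total) ** (n - 1)

rng = random.Random(6)
n, reps = 8, 4000
vals = []
for _ in range(reps):
    sample = [rng.expovariate(1.0) for _ in range(n)]
    vals.append(rb_cdf_exponential(sample[0], sum(sample), n))  # F_n~ at X_1

# Marginal uniformity: mean near 1/2, variance near 1/12
mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / reps
print(round(mean, 3), round(var, 3))
```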
28
In order to overcome this difficulty, a result from Rosenblatt
[17J, will be used.
In that paper it is shown that for a k-variate
absolutely continuous distribution function F(X , •.• , x ), the k
l
k
transformations:
are such that the kr.v.1s,
are uniformly and independently distributed in (0, l)k.
With the above multivariate probability integral transformation
and the conditional probability integral transformation, it will be
shown that under some conditions of absolute continuity one can obtain
a U(O,
1)
independent sample, but first, a characterization is given
for a random multivariate transformation.
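Rosenblatt's transformation can be sketched for a standard bivariate normal with correlation $\rho$, where $F(x_2 \mid x_1)$ is the normal d.f. with mean $\rho x_1$ and variance $1 - \rho^2$. This sketch is illustrative only; $\rho$, the sample size, and the seed are arbitrary choices.

```python
import math
import random

def phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rosenblatt_bivariate_normal(x1, x2, rho):
    """Rosenblatt's transformation u1 = F(x1), u2 = F(x2 | x1) for a
    standard bivariate normal with correlation rho."""
    u1 = phi(x1)
    u2 = phi((x2 - rho * x1) / math.sqrt(1.0 - rho ** 2))
    return u1, u2

rng = random.Random(7)
rho = 0.8
pairs = []
for _ in range(4000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x1 = z1
    x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2    # correlated pair
    pairs.append(rosenblatt_bivariate_normal(x1, x2, rho))

# u1, u2 should be independent U(0, 1): means near 1/2, covariance near 0
m1 = sum(u for u, _ in pairs) / len(pairs)
m2 = sum(v for _, v in pairs) / len(pairs)
cov = sum((u - m1) * (v - m2) for u, v in pairs) / len(pairs)
print(round(m1, 3), round(m2, 3), round(cov, 4))
```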
Let $F(x_1, x_2)$ be the conditional bivariate joint distribution of $X_1, X_2$ given some statistic $T$, i.e.,

$$F(x_1, x_2) = P(X_1 \le x_1,\ X_2 \le x_2 \mid T).$$

For each value of $T = t$, assume $F_t(x_1, x_2)$ is absolutely continuous, and let $F_t(x_2 \mid x_1)$ be the conditional distribution of $X_2$ obtained from $F_t(x_1, x_2)$ for given $X_1 = x_1$. For unspecified $t$, denote this random variable by $F(x_2 \mid x_1)$.
Lemma 4.3.3

$F(x_2 \mid x_1)$ coincides a.s. with the conditional distribution of $X_2$ given $X_1$ and $T$, evaluated at $X_1 = x_1$; i.e.,

$$F(x_2 \mid x_1) = E\bigl(I_{[X_2 \le x_2]} \mid T,\ X_1 = x_1\bigr) \quad \text{a.s.}$$

Proof

Denote by $F(x_1)$ the marginal d.f. of $F(x_1, x_2)$ computed for each $T = t$, i.e.,

$$F(x_1) = \lim_{x_2 \to \infty} F(x_1, x_2) = \lim_{x_2 \to \infty} E\bigl(I_{[X_1 \le x_1,\, X_2 \le x_2]} \mid T\bigr).$$

By the monotone convergence theorem for conditional expectations, the last term in the above expression is equal to $E(I_{[X_1 \le x_1]} \mid T)$. Therefore,

$$F(x_1) = E\bigl(I_{[X_1 \le x_1]} \mid T\bigr) \quad \text{a.s.}$$

From its definition, for each value of $T$, $F(x_2 \mid x_1)$ obeys the following relationship:

$$\text{(4.3.3.1)} \qquad F(x_1, x_2) = \int_{-\infty}^{x_1} F(x_2 \mid u)\, dP_{X_1}^T(u),$$

where $P_{X_1}^T$ is the measure associated with $F(x_1)$.

Now, $F(x_1, x_2)$, being a conditional expectation, must obey, for every $B_T \in \mathcal{B}(T)$, the following relationship:

$$\text{(4.3.3.2)} \qquad \int_{B_T} F(x_1, x_2)\, dP_T = \int_{B_T} I_{[X_1 \le x_1,\, X_2 \le x_2]}\, dP = P\bigl([X_1 \le x_1,\ X_2 \le x_2]\, B_T\bigr).$$

Similarly,

$$\text{(4.3.3.3)} \qquad \int_{B_T} \int_{-\infty}^{x_1} E\bigl(I_{[X_2 \le x_2]} \mid T,\ X_1\bigr)\, dP_{T, X_1} = P\bigl([X_1 \le x_1,\ X_2 \le x_2]\, B_T\bigr).$$

From (4.3.3.1), (4.3.3.2), (4.3.3.3) and the fact that

$$dP_{T, X_1} = dP_{X_1}^T\, dP_T,$$

where, obviously, $P_{X_1}^T(-\infty, u] = F(u)$, it follows that, for every $B_T \in \mathcal{B}(T)$ and any $x_1 \in R$,

$$\int_{B_T} \int_{-\infty}^{x_1} F(x_2 \mid u)\, dP_{X_1}^T(u)\, dP_T = \int_{B_T} \int_{-\infty}^{x_1} E\bigl(I_{[X_2 \le x_2]} \mid T,\ X_1 = u\bigr)\, dP_{X_1}^T(u)\, dP_T;$$

therefore,

$$F(x_2 \mid x_1) = E\bigl(I_{[X_2 \le x_2]} \mid T,\ X_1 = x_1\bigr) \quad \text{a.s.}$$
With the above characterization, the following theorem is
presented without a detailed description of the marginal and
conditional distribution of a joint distribution which itself will be
a Rao-Blackwell conditional distribution.
Clearly, Lemma 4.3.3
generalizes to the k-variate case.
Theorem 4.3.4

Let $X_1, \ldots, X_n$ be an independent sample from $P \in \mathcal{P}$. Let $\tilde F_n(x_1, \ldots, x_\ell)$, $\ell \le n$, be the joint conditional distribution of the first $\ell$ terms of the sample given some sufficient statistic $T_n = T_n(X_1, \ldots, X_n)$.

If $\tilde F_n(x_1, \ldots, x_\ell)$ is absolutely continuous a.s. $[\mathcal{P}]$, then the $\ell$ r.v.'s

$$U_1 = \tilde F_n(X_1), \quad U_2 = \tilde F_n(X_2 \mid X_1), \ \ldots, \ U_\ell = \tilde F_n(X_\ell \mid X_1, \ldots, X_{\ell-1})$$

are independent and uniformly distributed in $(0, 1)^\ell$.

Proof

For each value of $T_n$ (a.s.), the joint conditional distribution of $X_1, \ldots, X_\ell$ is absolutely continuous; therefore, by the multivariate probability integral transformation, the $\ell$ above proposed r.v.'s are conditionally distributed uniformly in the unit $\ell$-dimensional hypercube.

Since this conditional distribution does not depend on the particular value of $T_n$ a.s., the assertion of the theorem follows.
The absolute continuity (a.s. $[\mathcal{P}]$) requirement of the joint conditional distribution used in the above theorem, as well as the implicit assumption that $\ell > 0$, motivates the following definition:

Definition

Let $\mathcal{P}$ be a family of univariate probability measures and $X_1, \ldots, X_n$ an independent sample from some $P \in \mathcal{P}$. Let $T_n$ be a sufficient statistic for $\mathcal{P}$ based on $X_1, \ldots, X_n$. Denote by $\tilde F_n(x_1, \ldots, x_\ell)$ the joint conditional distribution function of the first $\ell$ terms of the sample given $T_n$.
The maximum $\ell$ $(= 0, 1, \ldots, n)$ for which $\tilde F_n(x_1, \ldots, x_\ell)$ is absolutely continuous a.s. $[\mathcal{P}]$ will be called the absolute continuity rank of $\mathcal{P}$ given $T_n$.

The arbitrariness of selecting the "first $\ell$ observations" disappears when $T_n$ is required to be symmetric (which is always the case for independent identically distributed elements in the sample).

Note that if $\mathcal{P}$ is not a subset of the class of all probability measures absolutely continuous with respect to Lebesgue measure, then the absolute continuity rank given any sufficient statistic must be 0, by virtue of Result 4.3.1.

The question of how large the absolute continuity rank of a class $\mathcal{P}$ can get, for alternative sufficient statistics, is answered with the following argument.
Seheult and Quesenberry [19] have shown that a necessary and sufficient condition for $\tilde F_n(x_1, \ldots, x_\ell)$ to be absolutely continuous is that there exists a $\sigma(T_n)$ measurable unbiased estimator of the density of $X_1, \ldots, X_\ell$. Note that if the absolute continuity rank of $\mathcal{P}$ given $T_n$ is $\ell > 0$, then the existing $\sigma(T_n)$ measurable unbiased estimator can be conditioned with $Z_n$, the minimal sufficient statistic for $\mathcal{P}$, thus yielding a $\sigma(Z_n)$ measurable unbiased density estimator. The absolute continuity rank of $\mathcal{P}$ given any sufficient statistic is, therefore, less than or equal to that given the minimal sufficient statistic.

In view of this result, when considering a class $\mathcal{P}$, the use of the minimal sufficient statistic will be understood in computing the Rao-Blackwell estimators.
33
Let lh be the collection of classes P for which the absolute
continuity rank (given the corresponding minimal sufficient statistic)
is .t(> 0).
Because of Theorem
4.3.4, any measurable function of the
.t transformed observations will be a class-free (rn) statistic.
Theorem
4.3.4, in essence, reduces the composite problem to a simple
problem.
For any problem whose null class is in rn, it is true that under
the null hypothesis the
.e
distributed on (0, 1).
The question arises as to the distribution of
transformed observations will be uniformly
these observations unCler alternatives, and the answer is the same as
that given at the end of Section 2 of this chapter.
It is to be
expected that if the transformations used are ufar u from the correct
ones, then the observations will not be uniformly distributed on
(0, 1).
In Chapter 3, the notion of sequential statistics was introduced; here, a slightly stronger notion is given.
Definition
For each n ≥ 1, let T_n(X_1, ..., X_n) be a statistic based on the independent sample X_1, ..., X_n; (T_n)_{n≥1} will be said to be doubly sequential if
    σ(T_n, X_n) = σ(T_{n-1}, X_n).
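As an illustration (not part of the original text's examples), the sample mean is doubly sequential: T_{n-1} = (n T_n - X_n)/(n-1) and T_n = ((n-1) T_{n-1} + X_n)/n, so the pairs (T_n, X_n) and (T_{n-1}, X_n) generate the same σ-field. A minimal numeric sketch of this two-way recovery:

```python
# Doubly sequential check for the running mean T_n = mean(X_1, ..., X_n):
# (T_n, X_n) determines T_{n-1} and vice versa, so the generated
# sigma-fields coincide.  Illustrative values only.
x = [2.0, 5.0, 11.0, 3.0]
n = len(x)
t_n = sum(x) / n                        # T_n
t_nm1 = sum(x[:-1]) / (n - 1)           # T_{n-1}
rec_nm1 = (n * t_n - x[-1]) / (n - 1)   # recover T_{n-1} from (T_n, X_n)
rec_n = ((n - 1) * t_nm1 + x[-1]) / n   # recover T_n from (T_{n-1}, X_n)
print(abs(rec_nm1 - t_nm1) < 1e-12, abs(rec_n - t_n) < 1e-12)  # True True
```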
Next, an interesting result is presented by means of which the transformations proposed in Theorem 4.3.4 take a very simple form, and for which convergence of the class-free statistic depends entirely upon the choice of the statistic and not on the random transformations.
Theorem 4.3.5
Let X_1, ..., X_n be an independent sample according to P ∈ Γ, let the minimal sufficient statistic T_n for Γ be doubly sequential, and let ℓ (> 0) be the absolute continuity rank for Γ (given T_n). Then the r.v.'s
    F̃_{n-ℓ+1}(X_{n-ℓ+1}), ..., F̃_n(X_n)
are uniformly distributed on the unit hypercube (0, 1)^ℓ, where
    F̃_j(x_j) := E(I[X_j ≤ x_j] | T_j), for j = n - ℓ + 1, ..., n.
Proof
Select the transformations of Theorem 4.3.4 in reverse order; a set of ℓ random variables is obtained which are uniformly distributed in (0, 1)^ℓ.
Thus it suffices to show that the jth variable, obtained by the transformation F̃_n(x_j | x_n, ..., x_{j+1}), corresponds to F̃_j(x_j). From Lemma 4.3.3,
    F̃_n(x_j | X_n, ..., X_{j+1}) = E(I[X_j ≤ x_j] | T_n, X_n, ..., X_{j+1}) a.s.
Now T_n being doubly sequential implies, by repeated application of the definition, that σ(T_n, X_n, ..., X_{j+1}) = σ(T_j, X_{j+1}, ..., X_n); since (X_j, T_j) is independent of (X_{j+1}, ..., X_n), the right hand side of the above equality is a.s. equal to F̃_j(x_j).
35
Corollary 1
=n
Under the assumptions of Theorem 4.3.5, let t
- v (where v is
constant) and suppose that y +1' ••• , Y are the transformed
v
n
observations corresponding to a sample of size n.
If a sample of size
\
(n + 1) is considered by adding an independent observation to the
original sample, then the corresponding transformed observations are
Yv +1' •• ", Y,
n Yn+1"
Because of the above result, even though there seems to be no
reason for setting the order of the transformations in any specific
way, under double sequentiality of the minimal sufficient statistic,
and under reverse order selection, the convergepce properties of the
test statistic used on the transformed observations turns out to be
clearly independent of the convergence properties of the
transformations.
It should be clear at this stage that multivariate goodness-of-fit problems can also be solved by transforming a subset of the sample into U(0, 1) random variables. To show this, let Γ be a family of k-variate probability measures.
For X_1, ..., X_n an independent sample of vectors distributed according to P ∈ Γ, assume that the absolute continuity rank of Γ (with respect to λ^k) given the minimal sufficient statistic is ℓ (> 0). For simplicity, assume also that the minimal sufficient statistic is doubly sequential.
Corollary 2
Under the above assumptions, there exist k·ℓ random transformations which map the random matrix [X_{n-ℓ+1}, ..., X_n] into k·ℓ independently and identically uniformly distributed r.v.'s on (0, 1).
Proof
Consider F̃_n(x_j | x_n, ..., x_{j+1}) for each j = n - ℓ + 1, ..., n; then consider the k transformations that are obtained from F̃_j(x_j), for each of the ℓ values of j.
5. CHI-SQUARE TYPE STATISTICS
In this chapter, some results concerning the chi-square goodness-of-fit statistic are presented. Lancaster [9], p. 3, says:
    "Pearson's contributions to statistical theory were numerous, but perhaps the greatest of them was the X² test of goodness-of-fit ..."
This chapter is an application of the theory developed in Chapter 4, for the case when the statistic chosen for the transformed observations is a chi-square statistic. It will be shown that in this important case, the test can be carried out in the original space, with random selection of cells.
Since exact distribution theory for finite samples is the main appeal of the transformations developed in Chapter 4, a review of recent work in the area of simple goodness-of-fit problems is given in the next section.
5.1 Simple Goodness-of-Fit Problem
Consider a simple goodness-of-fit problem whose null hypothesis is identified by the absolutely continuous distribution function F_0. Denote by F_n the empirical distribution function. The chi-square statistic X², for testing F_0, with k cells defined by the partition (C_{i-1}, C_i], i = 1, ..., k, where -∞ = C_0 < C_1 < ... < C_k = +∞, is defined to be:
    X² = n Σ_{i=1}^{k} {F_n(C_i) - F_n(C_{i-1}) - F_0(C_i) + F_0(C_{i-1})}² / {F_0(C_i) - F_0(C_{i-1})}.
For a given k and a given partition, the asymptotic distribution of X² is that of χ²(k-1). Different values of k and different selections of the partition yield, in general, different finite sample properties of the test.
The arbitrariness of k and {C_i}_{i=1}^{k-1} has induced considerable research. A wide variety of recommendations appear in the literature regarding the choice of cells and the number of them. These recommendations were developed mainly to achieve a good approximation to the limiting distribution.
Mann and Wald [14] introduced the idea of using equiprobable cells (under H_0) and, more recently, Good et al. [8] have mentioned that the selection of cells proposed by Mann and Wald yields a distribution-free statistic. These writers also computed the exact distribution of X² for this case.
It is easy to see that equiprobable cells yield a distribution-free statistic, since under these circumstances the statistic X² is identical to the random variable T_{n,k} given by:
    T_{n,k} = (k/n) Σ_{i=1}^{k} N_i² - n,
where (N_1, ..., N_k) have a joint multinomial distribution with p_i = 1/k, i = 1, ..., k, and Σ_{i=1}^{k} N_i = n.
Table 5.1
Critical points of the distribution of T_{n,k} for the closest significance levels above and below .05, with the exact tail probability given in parentheses beside each critical point; k = 2, ..., 10 and n = 2, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20. The limiting row (n = ∞) consists of the χ²(k-1) upper .05 points: 5.991, 7.815, 9.488, 11.070, 12.592, 14.067, 15.507, 16.919 for k = 3, ..., 10.
Since the distributions of T_{n,k} will be used in Section 2, these were obtained for some values of n and k, and a table of .05 critical points is exhibited.
These distributions are discrete, thus the .05 level is not necessarily achieved; therefore, the critical points for the closest significance levels above and below .05 appear together with the corresponding critical level.
The tabulations given by Good et al. [8] are actually those of a quadratic approximation to the T_{n,k} distribution. Zahn and Roberts [26] give the exact T_{n,k} distribution for n = k (cell expectations of one) for n = 1(1)25(5)40, 50.
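Because (N_1, ..., N_k) is multinomial with p_i = 1/k, the exact null distribution of T_{n,k} can be obtained for small n and k by direct enumeration of the count vectors. The following sketch (an illustration, not part of the original text) does this; it can be checked against the exact identity E(T_{n,k}) = k - 1.

```python
from math import factorial

def exact_tnk_distribution(n, k):
    """Exact null distribution of T_{n,k} = (k/n) * sum(N_i^2) - n,
    where (N_1, ..., N_k) is multinomial(n; 1/k, ..., 1/k).
    Returns {value of T: probability}."""
    def count_vectors(total, cells):
        # all nonnegative integer vectors of length `cells` summing to `total`
        if cells == 1:
            yield (total,)
            return
        for c in range(total + 1):
            for rest in count_vectors(total - c, cells - 1):
                yield (c,) + rest

    dist = {}
    for counts in count_vectors(n, k):
        coef = factorial(n)
        for c in counts:
            coef //= factorial(c)            # multinomial coefficient
        prob = coef / k ** n
        t = round((k / n) * sum(c * c for c in counts) - n, 10)
        dist[t] = dist.get(t, 0.0) + prob
    return dist

dist = exact_tnk_distribution(4, 3)
total = sum(dist.values())
mean = sum(t * p for t, p in dist.items())
print(round(total, 10), round(mean, 10))     # 1.0 2.0  (E(T) = k - 1)
```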
5.2 Composite Goodness-of-Fit Problem
Theorem 5.2
For the composite goodness-of-fit problem with null class P, let the absolute continuity rank of P given the minimal sufficient statistic be ℓ (> 0), and let p_1, ..., p_k be fixed positive probabilities, Σ_{j=1}^{k} p_j = 1. Then there exist ℓ random partitions of R, given by:
    {(C̃_{i-1,j}, C̃_{i,j}], i = 1, ..., k}, where -∞ = C̃_{0,j} < C̃_{1,j} < ... < C̃_{k,j} = +∞,
for j = n-ℓ+1, ..., n, for which the random vector (N_1, ..., N_k), whose ith component is given by
    N_i = Σ_{j=n-ℓ+1}^{n} I[C̃_{i-1,j} < X_j ≤ C̃_{i,j}], i = 1, ..., k,
has a multinomial distribution with probabilities p_1, ..., p_k; Σ_{i=1}^{k} N_i = ℓ.
Proof
The proof is an immediate application of Theorem 4.3.4, selecting the transformations in reverse order. For each j = n-ℓ+1, ..., n, let Y_j = F̃_n(x_j | x_n, ..., x_{j+1}) and define C̃_{ij} by:
    F̃_n(C̃_{ij} | x_n, ..., x_{j+1}) = Σ_{r=1}^{i} p_r, i = 1, ..., k-1.
Note that
    Y_j ∈ (Σ_{r=1}^{i-1} p_r, Σ_{r=1}^{i} p_r]  iff  C̃_{i-1,j} < X_j ≤ C̃_{i,j};
thus the result follows from the fact that Y_{n-ℓ+1}, ..., Y_n are distributed independently and uniformly on (0, 1).
Corollary 1
If in the above theorem p_j = 1/k, j = 1, ..., k, then the X² statistic,
    (k/ℓ) Σ_{i=1}^{k} N_i² - ℓ,
is distributed as T_{ℓ,k} of Section 1.
Corollary 2
If the minimal sufficient statistic in the above theorem is doubly sequential, the asymptotic distribution of X² defined in Corollary 1 is that of χ²(k-1).
Theorem 5.2 provides a distribution-free test, which is class-free with respect to the collection M_ℓ of classes P whose absolute continuity rank is ℓ. If the class-freeness notion is relaxed to allow the distribution of the test statistic to vary with ℓ within some known family of distributions, then, in this sense, Corollary 1 to Theorem 5.2 provides a class-free statistic with respect to the collection M of classes P whose absolute continuity rank is positive.
Watson [22], [23] has proposed the use of randomly selected cells in constructing a chi-square statistic for the composite case. Instead of using the MVUE for the distribution, he estimates the parameter θ (m-dimensional), which identifies the distribution, by θ̂, the MLE. The statistic proposed is
    X² = n Σ_{i=1}^{k} {F_n(Ĉ_i) - F_n(Ĉ_{i-1}) - F(Ĉ_i; θ̂) + F(Ĉ_{i-1}; θ̂)}² / {F(Ĉ_i; θ̂) - F(Ĉ_{i-1}; θ̂)},
where Ĉ_i, i = 1, ..., k, is defined by
    F(Ĉ_i; θ̂) = Σ_{j=1}^{i} p_j,
p_1, ..., p_k being positive prescribed probabilities.
With the above statistic, it is shown that the asymptotic distribution of X² is that of
    χ²(k-m-1) + Σ_{j=1}^{m} λ_j Z_j²,
where Z_1, ..., Z_m are NID(0, 1) and independent of χ²(k-m-1). The numbers λ_1, ..., λ_m are some eigenvalues in (0, 1) which are identified. In general, λ_1, ..., λ_m might depend upon θ. In the case of θ corresponding to location and scale parameters, it is shown that λ_1, ..., λ_m do not depend upon θ, thus yielding an asymptotically distribution-free statistic.
Moore [15] has extended Watson's results to the multivariate case.
6. APPLICATIONS
In this chapter the chi-square test proposed in Chapter 5 or, more generally, the random transformations proposed in Chapter 4, are illustrated.
As pointed out in Chapter 4, the random transformations can be obtained also for multivariate problems. The generalized Kolmogorov-Smirnov statistic, however, cannot be generalized to multivariate problems; see Simpson [20].
The examples discussed are straightforward applications of the results of Chapter 5 and the MVUE estimators given by Sathe and Varde [18] for the univariate distributions, and by Ghurye and Olkin [7] for the multivariate normal cases.
In Section 6.4, a test for the goodness-of-fit of a univariate multiple regression model is given. In this example, a slightly different setting occurs, in that the available independent sample is not formed by identically distributed r.v.'s.
6.1 Test for Normality, μ, σ² Unknown
For the normal class N(μ, σ²), with X̄_n the sample mean and S_n² = n⁻¹ Σ_{i=1}^{n} (X_i - X̄_n)², the pair (X̄_n, S_n²) is the minimal sufficient statistic; it is also a doubly sequential statistic.
The conditional distribution based on r observations, F̃_r(z), is given by:
    F̃_r(z) = 0, if z - X̄_r < -(r-1)^{1/2} S_r,
    F̃_r(z) = 1, if (r-1)^{1/2} S_r < z - X̄_r, and
    F̃_r(z) = G_{r-2}{(r-2)^{1/2} (z - X̄_r) / [(r-1) S_r² - (z - X̄_r)²]^{1/2}}, elsewhere,
where G_{r-2} is the Student's t-distribution with r-2 d.f.
Since F̃_r(z) is absolutely continuous for r ≥ 3, then for j = n - ℓ + 1, ..., n, using the double sequentiality property, the transformation for x_j is given by F̃_j(x_j). Thus ℓ = n - 2.
Let t_{i/k, j-2} be the (i/k)th percentile from a Student's t-distribution with j-2 d.f., for j = 3, ..., n. Obviously, this selection is for equiprobable cells.
The estimated quantiles C̃_{ij}, i = 1, ..., k-1, for the jth variable are given by the solutions to the equations:
    (j-2)^{1/2} (C̃_{ij} - X̄_j) / [(j-1) S_j² - (C̃_{ij} - X̄_j)²]^{1/2} = t_{i/k, j-2},
which gives:
    C̃_{ij} = ((j-1)/(j-2))^{1/2} t_{i/k, j-2} S_j / [1 + t²_{i/k, j-2}/(j-2)]^{1/2} + X̄_j.
Note the sequentiality of the computational procedure, in that X̄_3, S_3² are computed first, then X̄_4, S_4², etc.; also, if the cells are computed for a sample size n_0, and extra observations are then obtained, the formulas for the first cells are unaltered.
The case of testing normality when σ² is known (a case which may be of little practical importance) can be obtained in a straightforward manner. The expression for F̃_r(z) in this case is that of a N(X̄_r, (1 - 1/r)σ²) distribution function.
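The transformation F̃_j(x_j) and the cell boundaries above can be sketched numerically. The following code is an illustration, not part of the original text; it assumes SciPy's Student-t distribution for G_{r-2}, takes S_r² with divisor r as in the formulas above, and checks that the boundary formula inverts the estimated distribution function:

```python
import numpy as np
from scipy.stats import t as student_t

def rb_normal_cdf(z, x):
    """Rao-Blackwell estimate F~_r(z) from the first r = len(x) observations,
    with S_r^2 = (1/r) * sum (x_i - xbar_r)^2 (divisor r assumed here)."""
    x = np.asarray(x, dtype=float)
    r = len(x)
    xbar = x.mean()
    s2 = ((x - xbar) ** 2).mean()
    d = z - xbar
    bound = np.sqrt((r - 1) * s2)        # support of the conditional law
    if d <= -bound:
        return 0.0
    if d >= bound:
        return 1.0
    u = np.sqrt(r - 2) * d / np.sqrt((r - 1) * s2 - d * d)
    return float(student_t.cdf(u, df=r - 2))

def cell_boundary(i, k, x):
    """Equiprobable cell boundary C~_{ij} for the first j = len(x) observations."""
    x = np.asarray(x, dtype=float)
    j = len(x)
    xbar = x.mean()
    s = np.sqrt(((x - xbar) ** 2).mean())
    tq = student_t.ppf(i / k, df=j - 2)
    return np.sqrt((j - 1) / (j - 2)) * tq * s / np.sqrt(1 + tq ** 2 / (j - 2)) + xbar

x3 = [1.0, 2.0, 4.0]                     # illustrative observations
c = cell_boundary(1, 4, x3)              # first of k = 4 equiprobable cells
print(round(rb_normal_cdf(c, x3), 6))    # boundary maps back to i/k = 0.25
```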
6.2 Test for Exponentiality
It is desired to test the composite null hypothesis that the parent has density λe^{-λt}, λ > 0. The sample mean X̄_n is the minimal sufficient statistic and is clearly doubly sequential. The Rao-Blackwell estimating distribution function for r observations is
    F̃_r(z) = 1 - (1 - z/(r X̄_r))^{r-1}, 0 ≤ z ≤ r X̄_r,
which is defined for r ≥ 2.
The absolute continuity rank is thus n-1, and in this case no tables are needed to select the estimated quantiles C̃_{ij}, i = 1, ..., k-1, j = 2, ..., n. The C̃_{ij} are obtained by solving:
    1 - (1 - C̃_{ij}/(j X̄_j))^{j-1} = i/k,
which yields:
    C̃_{ij} = [1 - (1 - i/k)^{1/(j-1)}] j X̄_j.
Following the same steps as in the above examples, the random selection of cells can be easily obtained for the following cases given by Sathe and Varde [18]:
    Incomplete Gamma: x^{γ-1} e^{-x/θ} / (Γ(γ) θ^γ), γ known;
    Weibull: (p/θ) x^{p-1} e^{-x^p/θ}, p known.
In both cases, the absolute continuity rank is n-1, and in both cases, also, the corresponding minimal sufficient statistic is doubly sequential.
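For the exponential case the quantile formula is closed-form, so the whole cell selection can be written in a few lines. A sketch (illustrative, not from the original text; the value of X̄_j is assumed); the round-trip check confirms that C̃_{ij} inverts F̃_j:

```python
def rb_exp_cdf(z, j, xbar):
    """Rao-Blackwell distribution function for the exponential class:
    F~_j(z) = 1 - (1 - z/(j*xbar))^(j-1) on [0, j*xbar], defined for j >= 2."""
    if z <= 0.0:
        return 0.0
    if z >= j * xbar:
        return 1.0
    return 1.0 - (1.0 - z / (j * xbar)) ** (j - 1)

def exp_cell_boundary(i, k, j, xbar):
    """Estimated quantile C~_{ij} = [1 - (1 - i/k)^(1/(j-1))] * j * xbar."""
    return (1.0 - (1.0 - i / k) ** (1.0 / (j - 1))) * j * xbar

# k = 5 equiprobable cells from the first j = 4 observations; xbar illustrative:
xbar, j, k = 2.5, 4, 5
cells = [exp_cell_boundary(i, k, j, xbar) for i in range(1, k)]
print([round(rb_exp_cdf(c, j, xbar), 6) for c in cells])   # [0.2, 0.4, 0.6, 0.8]
```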
6.3 Multivariate Normal Test
Consider X_1, ..., X_n to be an independent sample of k-variate random vectors from a N(μ, V), where V is known.
The minimal sufficient statistic for the above family is the vector of sample means, X̄_n, which is clearly doubly sequential. Ghurye and Olkin give the distribution of any one term in the sample given X̄_n, which is a normal distribution with mean vector X̄_n and covariance matrix (1 - 1/n)V.
Because of the double sequentiality, it is clear that the absolute continuity rank for the family is n-1.
Let j = 2, ..., n be arbitrary, consider the sample mean vector X̄_j computed from the first j observations, and let x_j' = (x_{1j}, ..., x_{kj}) be the jth observed vector.
From the conditional distribution F̃_j(x_j), given by the N(X̄_j, (1 - 1/j)V) evaluated at x_j, the k conditional distributions are obtained as follows.
For any s = 2, ..., k, let σ_s² be the sth diagonal element of V, let V_{s-1} be the sub-matrix of V obtained by considering the first (s-1) rows and columns, and let v_{s-1} be the vector for which the following relation holds:
    V_s = [ V_{s-1}   v_{s-1} ]
          [ v_{s-1}'  σ_s²    ].
Then F̃_j(x_{1j}) is given by a N(X̄_{1j}, (1 - 1/j)σ_1²) evaluated at x_{1j}; and for s = 2, ..., k, the conditional distribution of x_{sj} is given by a
    N(μ_{sj}, (1 - 1/j) σ²_{s|1,...,s-1}),
where
    μ_{sj} = X̄_{sj} + v_{s-1}' V_{s-1}^{-1} ((x_{1j}, ..., x_{s-1,j})' - (X̄_{1j}, ..., X̄_{s-1,j})')
and
    σ²_{s|1,...,s-1} = σ_s² - v_{s-1}' V_{s-1}^{-1} v_{s-1}.
These are obtained as the usual conditional mean and variance.
If sufficiently accurate tables are available, the transformations can be carried out, obtaining (n-1)·k independent and uniformly distributed random variables in (0, 1). In the case of selecting a chi-square statistic for the transformed observations, only some prescribed percentiles of the normal tables will be needed. In this latter case, suppose M cells are to be selected with equal probability. The expression for each of the (n-1)·k·(M-1) percentile estimators is given by:
    C̃_{ijs} = ((1 - 1/j) σ²_{s|1,...,s-1})^{1/2} z_i + μ_{sj},
where z_i is the (i/M)th percentile of a N(0, 1); i = 1, ..., M-1; j = 2, ..., n; and s = 1, ..., k (for s = 1, μ_{1j} = X̄_{1j} and the conditional variance is σ_1²).
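The conditional mean and variance above are the standard partitioned-covariance formulas; a small sketch (illustrative, not part of the original text, with an assumed covariance matrix):

```python
import numpy as np

def conditional_params(V, s, x_prev, xbar):
    """Conditional mean mu_sj and variance sigma^2_{s|1,...,s-1} of coordinate s
    (1-based) given the observed first s-1 coordinates x_prev, for a
    N(xbar, V) vector; uses the partitioned-covariance formulas."""
    i = s - 1                       # 0-based index of coordinate s
    if i == 0:
        return float(xbar[0]), float(V[0, 0])
    Vs1 = V[:i, :i]                 # leading (s-1) x (s-1) block V_{s-1}
    v = V[:i, i]                    # cross-covariance vector v_{s-1}
    w = np.linalg.solve(Vs1, v)     # V_{s-1}^{-1} v_{s-1}
    mu = xbar[i] + w @ (np.asarray(x_prev, dtype=float) - xbar[:i])
    var = V[i, i] - v @ w
    return float(mu), float(var)

# illustrative 2-variate covariance with correlation 1/2:
V = np.array([[1.0, 0.5],
              [0.5, 1.0]])
xbar = np.array([0.0, 0.0])
mu, var = conditional_params(V, 2, x_prev=[2.0], xbar=xbar)
print(mu, var)                      # 1.0 0.75
```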
6.4 Testing the Fit of a Regression Model
Let Y_1, ..., Y_n be a sample of r.v.'s and let Y_n' = (Y_1, ..., Y_n). Consider the hypothesis:
    Y_n ~ N(X_n β, σ² I),    (6.4.1)
where X_n is some fixed n × p matrix (n > p) of full rank, and (β, σ²) are p + 1 unknown parameters. (6.4.1) is the well known univariate multiple regression model. In this section, a test is given for testing the hypothesis (6.4.1).
For the family N(X_n β, σ² I), clearly (Y_n' Y_n, X_n' Y_n) is the minimal sufficient statistic, or, equivalently, the statistic
    (X_n' Y_n, Y_n'(I - X_n (X_n' X_n)^{-1} X_n') Y_n).
Since the family is defined with the knowledge of X_n, then for x_i' the ith row of X_n, the statistic (Y_s' Y_s, X_s' Y_s, Y_{s+1}) is known iff (Y_{s+1}' Y_{s+1}, X_{s+1}' Y_{s+1}, Y_{s+1}) is known, which shows that the minimal sufficient statistic is doubly sequential.
Ghurye and Olkin [7], p. 1268, 4.2, give the MVUE of a related density which, in the particular case mentioned above, can be written as follows. The conditional density of y_n given
    t_n = X_n' Y_n  and  S_n = Y_n'(I - X_n (X_n' X_n)^{-1} X_n') Y_n
exists iff X_{n-1} is of full rank, and in that case it is proportional to
    ψ{1 - (y_n - x_n'(X_n' X_n)^{-1} t_n)² / [S_n (1 - x_n'(X_n' X_n)^{-1} x_n)]}^{(n-p-3)/2},
where
    ψ(z) = z, if z > 0; ψ(z) = 0, if z ≤ 0.
Therefore, for j = p + 2, ..., n, the conditional density of y_j given
    t_j = X_j' Y_j  and  S_j = Y_j'(I - X_j (X_j' X_j)^{-1} X_j') Y_j
exists if the first p+1 rows of the matrix X_n are linearly independent. It is assumed that this is the case. This result enables the writing of the conditional distribution of y_j, given t_j and S_j, as a Student's t-distribution function with j - p - 1 degrees of freedom evaluated at:
    U_j = (j-p-1)^{1/2} (y_j - x_j'(X_j' X_j)^{-1} t_j) / [S_j (1 - x_j'(X_j' X_j)^{-1} x_j) - (y_j - x_j'(X_j' X_j)^{-1} t_j)²]^{1/2}.
Under the assumption that the ordering of the observations is given as above, the conditional distribution of the jth observation (j = p + 2, ..., n), given S_j and t_j, is just G_{j-p-1}(U_j), for U_j ∈ (-∞, ∞), i.e., for y_j in the range where the argument of ψ above is positive.
As before, if accurate tables are available to achieve the transformation, any desired statistic can be chosen in the transformed space, using the n - p - 1 transformed U(0, 1) independent observations. If such tables are not available or, for some other reason, the chi-square statistic is chosen, the corresponding formulas for the selection of the quantiles C̃_{ij}, i = 1, ..., k-1 (where k equiprobable cells are to be selected), and j = p + 2, ..., n, are given by:
    C̃_{ij} = x_j'(X_j' X_j)^{-1} t_j + t_{i/k, j-p-1} [S_j (1 - x_j'(X_j' X_j)^{-1} x_j)]^{1/2} / [j - p - 1 + t²_{i/k, j-p-1}]^{1/2},
which clearly particularises to the formulas given for the normal example in Section 1 of this chapter for p = 1.
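The sequential statistic U_j can be computed directly from the running least-squares fit. A sketch (illustrative, not part of the original text); with p = 1 and a column of ones it reproduces the transform of the normal example:

```python
import numpy as np

def u_statistic(X, y, j, p):
    """U_j = sqrt(j-p-1) * d_j / sqrt(S_j * (1 - h_j) - d_j^2), where
    d_j = y_j - x_j' b_j is the residual of observation j under the fit b_j
    to the first j rows, S_j the residual sum of squares, and
    h_j = x_j' (X_j' X_j)^{-1} x_j the leverage of row j."""
    Xj = np.asarray(X, dtype=float)[:j]
    yj = np.asarray(y, dtype=float)[:j]
    b, *_ = np.linalg.lstsq(Xj, yj, rcond=None)
    resid = yj - Xj @ b
    S = float(resid @ resid)
    xrow = Xj[j - 1]
    h = float(xrow @ np.linalg.solve(Xj.T @ Xj, xrow))
    d = float(yj[j - 1] - xrow @ b)
    return np.sqrt(j - p - 1) * d / np.sqrt(S * (1 - h) - d * d)

# p = 1, intercept-only design: U_3 reduces to the Section 6.1 transform.
X = np.ones((3, 1))
y = np.array([1.0, 2.0, 4.0])
print(round(float(u_statistic(X, y, j=3, p=1)), 6))   # 2.886751 = 5/sqrt(3)
```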
7. LIST OF REFERENCES
[1] Berk, R. H. (1969). Strong consistency of certain sequential estimators. Annals of Mathematical Statistics, 40: 1492-1495.
[2] Birnbaum, Z. W. (1953). Distribution free tests of fit for continuous distribution functions. Annals of Mathematical Statistics, 24: 484-489.
[3] Blum, J. R. (1955). On the convergence of empiric distribution functions. Annals of Mathematical Statistics, 26: 527-529.
[4] Cochran, W. G. (1952). The X² test of goodness of fit. Annals of Mathematical Statistics, 23: 315-345.
[5] Darling, D. A. (1957). The Kolmogorov-Smirnov, Cramer-von Mises tests. Annals of Mathematical Statistics, 28: 823.
[6] David, F. N. and N. L. Johnson (1948). The probability integral transformation when parameters are estimated from the sample. Biometrika, 35: 182-192.
[7] Ghurye, S. G. and I. Olkin (1969). Unbiased estimation of some multivariate probability densities and related functions. Annals of Mathematical Statistics, 40: 1261-1271.
[8] Good, I. J., T. N. Gover and G. L. Mitchell (1970). Exact distributions for X² and for the likelihood-ratio statistic for the equiprobable multinomial distribution. Journal of the American Statistical Association, 65: 267-283.
[9] Lancaster, H. O. (1969). The Chi-squared Distribution. John Wiley and Sons, Inc.
[10] Lehmann, E. L. (1959). Testing Statistical Hypotheses. John Wiley and Sons, Inc.
[11] Lilliefors, H. W. (1967). On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62: 399-402.
[12] Lilliefors, H. W. (1969). On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown. Journal of the American Statistical Association, 64: 387-389.
[13] Loeve, M. (1963). Probability Theory. D. Van Nostrand Company, Inc.
[14] Mann, H. B. and A. Wald (1942). On the choice of the number of class intervals in the application of the chi-square test. Annals of Mathematical Statistics, 13: 306-317.
[15] Moore, D. S. (1971). A chi-square statistic with random cell boundaries. Annals of Mathematical Statistics, 42: 147-156.
[16] Ranga Rao, R. (1962). Relations between weak and uniform convergence of measures with applications. Annals of Mathematical Statistics, 33: 659.
[17] Rosenblatt, M. (1952). Remarks on a multivariate transformation. Annals of Mathematical Statistics, 23: 470-472.
[18] Sathe, Y. S. and S. D. Varde (1969). On minimum variance unbiased estimation of reliability. Annals of Mathematical Statistics, 40: 710-714.
[19] Seheult, A. H. and C. P. Quesenberry (1971). On unbiased estimation of density functions. Annals of Mathematical Statistics, 42: 1434-1438.
[20] Simpson, P. B. (1951). Note on the estimation of a bivariate distribution function. Annals of Mathematical Statistics, 22: 476-478.
[21] Srinivasan, R. (1970). An approach to testing the goodness of fit of incompletely specified distributions. Biometrika, 57, 3: 605.
[22] Watson, G. S. (1957). The X² goodness-of-fit test for normal distributions. Biometrika, 44: 336-348.
[23] Watson, G. S. (1958). On chi-square goodness-of-fit tests for continuous distributions. Journal of the Royal Statistical Society, Series B, 20: 44-61.
[24] Wolfowitz, J. (1954). Generalization of the theorem of Glivenko-Cantelli. Annals of Mathematical Statistics, 25: 131.
[25] Wolfowitz, J. (1960). Convergence of the empiric distribution function on half spaces. Contributions to Probability and Statistics, Essays in Honor of Harold Hotelling. Stanford University Press, 198.
[26] Zahn, D. A. and G. C. Roberts (1971). Exact X² criterion tables with cell expectations one: an application to Coleman's measure of consensus. Journal of the American Statistical Association, 66: 145-148.