Behboodian, J. (1971). "On the distribution of a symmetric statistic from a mixed population."

ON THE DISTRIBUTION
OF A SYMMETRIC STATISTIC
FROM A MIXED POPULATION
by
Javad Behboodian*
Department of Statistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 732
February, 1971

* And Pahlavi University, Shiraz.
ABSTRACT

The distribution of a symmetric statistic $T = g(X_1, X_2, \ldots, X_n)$, for a random sample from a mixed population with density $f(x) = p f_1(x) + q f_2(x)$, is a binomial mixture of the densities of the statistics $T_k = g(X_{k1}, X_{k2}, \ldots, X_{kn})$, $k = 0, 1, \ldots, n$, where the $X_{ki}$'s are independent with density $f_1(x)$ if $i \le k$ and density $f_2(x)$ if $i > k$. It is shown how to find the distributions of some important symmetric statistics like sample mean, sample variance, and order statistics by using the $T_k$'s. The results are applied to normal and exponential mixtures.
1. INTRODUCTION
Consider

$$ f(x) = p f_1(x) + q f_2(x), \tag{1} $$

where $f_1(x)$ and $f_2(x)$ are two probability density functions with $0 < p < 1$ and $q = 1 - p$. The function $f(x)$ is called the density function of a mixture of two densities $f_1(x)$ and $f_2(x)$ with mixing proportions $p$ and $q$. Finite mixtures of distributions often arise in various biological, psychological, and industrial applications, and have received some general attention recently [1,2,4].
The following probabilistic meaning of (1) is sometimes useful for studying finite mixtures of distributions. Let $\Omega_j$ be a population with density $f_j(x)$ and distribution $F_j(x)$, $j = 1, 2$. Let $X$ be a random variable with distribution $F(x)$, and let $D_j$ be the event that $X$ comes from population $\Omega_j$, with $P(D_1) = p$ and $P(D_2) = q$. Now

$$ P(X \le x) = P(D_1)\, P(X \le x \mid D_1) + P(D_2)\, P(X \le x \mid D_2), $$

or

$$ F(x) = p F_1(x) + q F_2(x) $$

for any real number $x$. This last equality says that $X$ comes from a mixed population whose density is given by (1).
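The two-population reading of (1) translates directly into a sampling recipe. The following Python sketch is only an illustration under stated assumptions: the function names `sample_mixture` and `mixture_cdf` are hypothetical, and the exponential components used to exercise it are an arbitrary choice, not taken from the paper.

```python
import math
import random

def sample_mixture(p, draw1, draw2, rng):
    """Draw one X: event D_1 (population 1) occurs with probability p, else D_2."""
    return draw1(rng) if rng.random() < p else draw2(rng)

def mixture_cdf(x, p, cdf1, cdf2):
    """F(x) = p * F_1(x) + q * F_2(x) with q = 1 - p."""
    return p * cdf1(x) + (1.0 - p) * cdf2(x)

def exp_cdf(rate):
    """CDF of an exponential distribution with the given rate (illustrative component)."""
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def empirical_cdf_at(x, p, rate1, rate2, size, rng):
    """Fraction of simulated mixture draws that are <= x."""
    draws = (sample_mixture(p,
                            lambda r: r.expovariate(rate1),
                            lambda r: r.expovariate(rate2),
                            rng)
             for _ in range(size))
    return sum(d <= x for d in draws) / size
```

With a large simulated sample, `empirical_cdf_at` should be close to `mixture_cdf` evaluated with the matching `exp_cdf` components, which is the equality $F(x) = pF_1(x) + qF_2(x)$ seen empirically.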
Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed as $X$ having density (1). Consider a symmetric function $t = g(x_1, x_2, \ldots, x_n)$ of real variables, and let $T = g(X_1, X_2, \ldots, X_n)$ be a statistic. It is clear that the distribution of the statistic $T$ is invariant under any permutation on the $X_i$'s; so we call $T$ a symmetric statistic of the random sample. Examples of such statistics are sample moments and order statistics.

This paper deals with the distribution of $T$ for a finite mixture in general and the distributions of some important statistics like sample mean, sample variance, and order statistics. The results are applied to normal and exponential mixtures as examples.
2. PROBABILITY DENSITY FUNCTION OF T

The distribution of $T$, particularly when the sample comes from a mixture of distributions, is usually very complicated. However, the following theorem may be helpful in some special cases.
THEOREM. Let $X_1, X_2, \ldots, X_n$ be a random sample from a mixture given by (1), and let $T = g(X_1, X_2, \ldots, X_n)$ be a symmetric statistic. Let $T_k = g(X_{k1}, X_{k2}, \ldots, X_{kn})$, $k = 0, 1, \ldots, n$, be a statistic for which the $X_{ki}$'s are independent with density $f_1(x)$ if $i \le k$ and density $f_2(x)$ if $i > k$. Then, we have

$$ f_T(t) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} f_{T_k}(t), \tag{2} $$

that is, the density of $T$ is a binomial mixture of the densities of the $T_k$'s.

PROOF:
The joint density of the random sample is

$$ f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \left[ p f_1(x_i) + q f_2(x_i) \right]. \tag{3} $$

Let $m$ be a partition of the set $\{1, 2, \ldots, n\}$ into two sets $A$ and $B$, and denote the set of all such partitions by $M$. For the sake of notational convenience, we assume that the subscripts $a$, $b$, and $m$ always run respectively in the sets $A$, $B$, and $M$. Expanding the right side of (3), we obtain

$$ f(x_1, x_2, \ldots, x_n) = \sum_{m} p^k q^{n-k} f_m(x_1, x_2, \ldots, x_n), \tag{4} $$

where $k$ is the number of elements in the set $A$ and

$$ f_m(x_1, x_2, \ldots, x_n) = \prod_{a} f_1(x_a) \prod_{b} f_2(x_b) \tag{5} $$

is the density of an $n$-dimensional random vector whose components are independent from each other; $k$ of them whose subscripts belong to $A$ have density $f_1(x)$ and the remaining ones have density $f_2(x)$. It follows from (4) that the joint density of the random sample is a mixture of the densities defined by (5).
Now, by (4), the characteristic function of $T$ becomes

$$ \phi_T(u) = \sum_{m} p^k q^{n-k} E_m\left[ \exp\left( i u\, g(X_1, X_2, \ldots, X_n) \right) \right], \tag{6} $$

where the expectation is taken with respect to the density (5). We observe, by the symmetry of $t = g(x_1, x_2, \ldots, x_n)$ and the structure of the density (5), that the above expectation is invariant under any permutation on $X_1, X_2, \ldots, X_n$ and its dependence on the partition $m$ is only through $k$, the number of elements in the set $A$. Therefore, for any of the $\binom{n}{k}$ different partitions which correspond to a fixed integer $k$, $0 \le k \le n$, the above expectation is the same as the characteristic function of the statistic $T_k = g(X_{k1}, X_{k2}, \ldots, X_{kn})$, where the $X_{ki}$'s are independent with density $f_1(x)$ if $i \le k$ and density $f_2(x)$ if $i > k$. Considering this simple observation, from (6), we have

$$ \phi_T(u) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} \phi_{T_k}(u). \tag{7} $$

Inverting (7) termwise, we obtain (2). The theorem is proved. □
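The theorem can be checked exactly in a small discrete analogue, where probability mass functions replace the densities $f_1$ and $f_2$ and all sample points can be enumerated. The sketch below is an illustration only; the discrete setting and the names `dist_T`, `dist_Tk`, and `binomial_mixture` are assumptions made for this example and do not appear in the paper.

```python
from collections import defaultdict
from itertools import product
from math import comb

def dist_T(n, p, f1, f2, g):
    """Exact pmf of T = g(X_1..X_n) for an i.i.d. sample from the pmf p*f1 + q*f2."""
    q = 1.0 - p
    support = sorted(set(f1) | set(f2))
    pmf = defaultdict(float)
    for xs in product(support, repeat=n):
        w = 1.0
        for x in xs:
            w *= p * f1.get(x, 0.0) + q * f2.get(x, 0.0)
        pmf[g(xs)] += w
    return dict(pmf)

def dist_Tk(n, k, f1, f2, g):
    """Exact pmf of T_k: the first k coordinates follow f1, the remaining n-k follow f2."""
    support = sorted(set(f1) | set(f2))
    pmf = defaultdict(float)
    for xs in product(support, repeat=n):
        w = 1.0
        for i, x in enumerate(xs):
            w *= (f1 if i < k else f2).get(x, 0.0)
        pmf[g(xs)] += w
    return dict(pmf)

def binomial_mixture(n, p, f1, f2, g):
    """Discrete analogue of the right side of (2): sum_k C(n,k) p^k q^(n-k) pmf(T_k)."""
    q = 1.0 - p
    pmf = defaultdict(float)
    for k in range(n + 1):
        weight = comb(n, k) * p**k * q**(n - k)
        for t, prob in dist_Tk(n, k, f1, f2, g).items():
            pmf[t] += weight * prob
    return dict(pmf)
```

For any symmetric `g` (for example `max`, or a sum of squares), `dist_T` and `binomial_mixture` should agree up to rounding. For a non-symmetric `g` they generally differ, which shows where the symmetry assumption enters.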
Intuitively, we can also derive the density of $T$ by using the probabilistic meaning of a finite mixture given in Section 1. For this purpose, let $E_k$, $k = 0, 1, \ldots, n$, be the event that exactly $k$ of the $X_i$'s have density $f_1(x)$ and the remaining ones density $f_2(x)$. Using conditional density, we have

$$ f_T(t) = \sum_{k=0}^{n} P(E_k)\, f_T(t \mid E_k), \tag{8} $$

where

$$ P(E_k) = \binom{n}{k} p^k q^{n-k} \tag{9} $$

and by the symmetry of $T$ one can show

$$ f_T(t \mid E_k) = f_{T_k}(t). \tag{10} $$

It should be noted that the generalization of (2) to a finite mixture of more than two densities is a multinomial mixture, which can be obtained by a similar argument.
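The conditional argument suggests a two-stage way of simulating $T$: first draw the number $K$ of observations coming from $f_1$, then compute $T_K$. The Python sketch below compares this with direct sampling from the mixture. The function names are hypothetical, and the samplers passed in for $f_1$ and $f_2$ are whatever the user chooses.

```python
import random

def draw_direct(n, p, draw1, draw2, g, rng):
    """T computed from n i.i.d. mixture draws (component chosen per observation)."""
    xs = [draw1(rng) if rng.random() < p else draw2(rng) for _ in range(n)]
    return g(xs)

def draw_two_stage(n, p, draw1, draw2, g, rng):
    """T via the events E_k: draw K ~ Binomial(n, p), then T_K from K draws of f1
    followed by n-K draws of f2. Valid only when g is symmetric."""
    k = sum(rng.random() < p for _ in range(n))
    xs = [draw1(rng) for _ in range(k)] + [draw2(rng) for _ in range(n - k)]
    return g(xs)
```

For a symmetric `g` such as `max` or the sample mean, repeated calls to the two functions should produce samples with the same distribution, so summary statistics such as their averages should agree up to simulation error.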
3. DISTRIBUTIONS OF SAMPLE MEAN AND SAMPLE VARIANCE
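To find the distributions of sample mean and sample variance, we first find the distributions of

$$ \bar{X}_k = \sum_{i=1}^{n} X_{ki}/n \quad \text{and} \quad S_k^2 = \sum_{i=1}^{n} (X_{ki} - \bar{X}_k)^2/n \tag{11} $$

by breaking $X_{k1}, X_{k2}, \ldots, X_{kn}$ into independent variables $X_{k1}, X_{k2}, \ldots, X_{kk}$ with common density $f_1(x)$ and independent variables $X_{k,k+1}, X_{k,k+2}, \ldots, X_{kn}$ with common density $f_2(x)$. Now, for a sample of size $k$ from $f_1(x)$, we set

$$ \bar{X}_{k1} = \sum_{i=1}^{k} X_{ki}/k, \qquad S_{k1}^2 = \sum_{i=1}^{k} (X_{ki} - \bar{X}_{k1})^2/k, \tag{12} $$

assuming that $\bar{X}_{k1} = 0$ if $k = 0$ and $S_{k1}^2 = 0$ if $k = 0$ or $k = 1$. Similarly, for a sample of size $n-k$ from $f_2(x)$, we set

$$ \bar{X}_{k2} = \sum_{i=k+1}^{n} X_{ki}/(n-k), \qquad S_{k2}^2 = \sum_{i=k+1}^{n} (X_{ki} - \bar{X}_{k2})^2/(n-k), \tag{13} $$

assuming that $\bar{X}_{k2} = 0$ if $k = n$ and $S_{k2}^2 = 0$ if $k = n-1$ or $k = n$. It follows from (11)-(13), by some simple computation, that

$$ n \bar{X}_k = k \bar{X}_{k1} + (n-k) \bar{X}_{k2} \tag{14} $$

and

$$ n S_k^2 = k S_{k1}^2 + (n-k) S_{k2}^2 + k(n-k)(\bar{X}_{k1} - \bar{X}_{k2})^2/n. \tag{15} $$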
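The identities (14) and (15) are purely algebraic, so they can be checked on arbitrary numbers. The following Python sketch does this with variances that use the divisor equal to the sample size, as in (11)-(13). The helper names are hypothetical.

```python
from math import isclose

def mean(xs):
    """Sample mean, with the convention that an empty sample has mean 0."""
    return sum(xs) / len(xs) if xs else 0.0

def var_n(xs):
    """Sample variance with divisor len(xs); 0 for an empty sample."""
    if not xs:
        return 0.0
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pooled(xs1, xs2):
    """Right sides of (14) and (15): n * mean and n * variance of the pooled sample."""
    k, n = len(xs1), len(xs1) + len(xs2)
    m1, m2 = mean(xs1), mean(xs2)
    n_mean = k * m1 + (n - k) * m2
    n_var = k * var_n(xs1) + (n - k) * var_n(xs2) + k * (n - k) * (m1 - m2) ** 2 / n
    return n_mean, n_var

def check_decomposition(xs1, xs2):
    """True if (14) and (15) reproduce the mean and variance of the pooled sample."""
    both = list(xs1) + list(xs2)
    n = len(both)
    n_mean, n_var = pooled(xs1, xs2)
    return (isclose(n_mean, n * mean(both), abs_tol=1e-12)
            and isclose(n_var, n * var_n(both), abs_tol=1e-12))
```

The third term in `n_var` is the between-group contribution. It vanishes when the two subsample means coincide or when one subsample is empty.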
Thus, if we can find the distributions of the sample mean and sample variance for $f_1(x)$ and $f_2(x)$, we may be able to find the distributions of $\bar{X}_k$ and $S_k^2$ and then, by using (2), the distributions of $\bar{X}$ and $S^2$.

EXAMPLE. Let (1) be a mixture of two normal densities with means $\mu_1$ and $\mu_2$ and common variance $\sigma^2 > 0$. It follows from (14) that $\bar{X}_k$ is distributed as a normal variable with mean

$$ \bar{\mu}_k = [k \mu_1 + (n-k) \mu_2]/n \tag{16} $$

and variance $\sigma^2/n$.
To obtain the distribution of $S_k^2$, we observe that $S_{k1}^2$, $S_{k2}^2$, and $\bar{X}_{k1} - \bar{X}_{k2}$ are independent from each other by normality of $f_1(x)$ and $f_2(x)$. It is also clear that $k S_{k1}^2/\sigma^2$ is $\chi^2(k-1)$, $(n-k) S_{k2}^2/\sigma^2$ is $\chi^2(n-k-1)$, and $k(n-k)(\bar{X}_{k1} - \bar{X}_{k2})^2/n\sigma^2$ is non-central chi-square $\chi^2(1, \lambda_k)$ with one degree of freedom and non-centrality parameter

$$ \lambda_k = k(n-k)(\mu_1 - \mu_2)^2/n\sigma^2. \tag{17} $$

By the reproductive property of chi-square random variables with respect to degrees of freedom and non-centrality parameter, we conclude that $n S_k^2/\sigma^2$ is distributed as $\chi^2(n-1, \lambda_k)$.
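A quick simulation can make the non-central chi-square claim plausible by comparing first moments. The sketch below assumes the convention in which a non-central chi-square variable with $\nu$ degrees of freedom and non-centrality $\lambda$ has mean $\nu + \lambda$, and it takes the non-centrality parameter to be $k(n-k)(\mu_1-\mu_2)^2/n\sigma^2$. Both are assumptions of this illustration, and the function names are hypothetical.

```python
import random

def noncentrality(n, k, mu1, mu2, sigma):
    """Assumed non-centrality parameter: k (n-k) (mu1 - mu2)^2 / (n sigma^2)."""
    return k * (n - k) * (mu1 - mu2) ** 2 / (n * sigma ** 2)

def scaled_variance(xs, sigma):
    """n * S^2 / sigma^2, where S^2 uses the divisor n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / sigma ** 2

def mean_scaled_variance(n, k, mu1, mu2, sigma, reps, rng):
    """Monte Carlo average of n * S_k^2 / sigma^2 over `reps` simulated samples,
    each made of k draws from N(mu1, sigma^2) and n-k draws from N(mu2, sigma^2)."""
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(mu1, sigma) for _ in range(k)]
        xs += [rng.gauss(mu2, sigma) for _ in range(n - k)]
        total += scaled_variance(xs, sigma)
    return total / reps
```

Under the stated convention, the value returned by `mean_scaled_variance` should be close to `(n - 1) + noncentrality(n, k, mu1, mu2, sigma)` for a large number of repetitions. Agreement of the means is a consistency check only and does not by itself establish the full distribution.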
It should be noted that when the normal densities $f_1(x)$ and $f_2(x)$ have different variances $\sigma_1^2$ and $\sigma_2^2$, the distribution of $\bar{X}_k$ is again normal with variance $k\sigma_1^2/n^2 + (n-k)\sigma_2^2/n^2$. But it can be shown that $S_k^2$ is a linear combination of two central and one non-central independent chi-square variables which cannot be summarized as one non-central chi-square variable. The distribution of $S_k^2$, when the variances are not equal, must be found by a series expansion [3].
In short, by using (2), for a random sample of size $n$ from a mixture of two normal distributions with common variance $\sigma^2$, we have:

(1) The distribution of $\bar{X}$ is a binomial mixture of $n+1$ normal distributions with common variance $\sigma^2/n$ and means defined by (16).

(2) The distribution of $n S^2/\sigma^2$ is a binomial mixture of $n+1$ non-central chi-square variables $\chi^2(n-1, \lambda_k)$, each with $n-1$ degrees of freedom and non-centrality parameters defined by (17).
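Result (1) gives a closed-form density for the sample mean that is easy to evaluate. The Python sketch below is one possible way to code it. The function names are hypothetical, and the component means are taken to be $[k\mu_1 + (n-k)\mu_2]/n$, as obtained from (14).

```python
from math import comb, exp, pi, sqrt

def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return exp(-(x - mu) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)

def sample_mean_pdf(x, n, p, mu1, mu2, sigma2):
    """Density of the sample mean for a two-component normal mixture with common
    variance sigma2: a binomial mixture of n+1 normal densities with variance sigma2/n."""
    q = 1.0 - p
    total = 0.0
    for k in range(n + 1):
        mean_k = (k * mu1 + (n - k) * mu2) / n   # component mean for exactly k draws from f1
        weight = comb(n, k) * p**k * q**(n - k)  # binomial weight from (2)
        total += weight * normal_pdf(x, mean_k, sigma2 / n)
    return total
```

For `n = 1` the function reduces to the mixture density (1) itself. For larger `n` the density can show up to `n + 1` modes when the component means are far apart relative to $\sigma/\sqrt{n}$.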
4. DISTRIBUTIONS OF ORDER STATISTICS
For the sake of simplicity, we just show how to find the distribution of $X_{(1)}$, the first order statistic for a mixture of two absolutely continuous distributions. The distributions of other order statistics can be obtained similarly.

First we find the distribution of $X_{(k1)}$, the first order statistic for the $X_{ki}$'s. The event $x < X_{(k1)} \le x + \Delta x$ may be realized as follows: $x < X_{ki} \le x + \Delta x$ for one of the $X_{ki}$ with density $f_1(x)$ and $X_{ki} > x + \Delta x$ for the remaining $n-1$ $X_{ki}$'s, in $k$ ways; or $x < X_{ki} \le x + \Delta x$ for one of the $X_{ki}$ with density $f_2(x)$ and $X_{ki} > x + \Delta x$ for the remaining $n-1$ $X_{ki}$'s, in $n-k$ ways. Combining these, we have the probability of the above event. Dividing the probability by $\Delta x$ and letting $\Delta x \to 0$, we obtain

$$ f_{X_{(k1)}}(x) = k f_1(x) [1 - F_1(x)]^{k-1} [1 - F_2(x)]^{n-k} + (n-k) f_2(x) [1 - F_1(x)]^{k} [1 - F_2(x)]^{n-k-1}. \tag{18} $$

Now, using (2), we have the density of $X_{(1)}$.

EXAMPLE. Let (1) be a mixture of two exponential densities

$$ f_j(x) = \alpha_j \exp(-\alpha_j x) $$

for $x > 0$ and zero elsewhere with $\alpha_j > 0$, $j = 1, 2$. From (18), by simple calculation, we obtain

$$ f_{X_{(k1)}}(x) = \beta_k \exp(-\beta_k x) \ \text{ for } x > 0, \qquad f_{X_{(k1)}}(x) = 0 \ \text{ elsewhere}, $$

where, for $k = 0, 1, \ldots, n$,

$$ \beta_k = k \alpha_1 + (n-k) \alpha_2. \tag{19} $$

Now, by using (2), we have the following result:

The distribution of the first order statistic for a random sample of size $n$ from a mixture of two exponential densities with parameters $\alpha_1$ and $\alpha_2$ is a binomial mixture of $n+1$ exponential densities with parameters defined by (19).
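The exponential example can be verified numerically, because the survival function of the minimum of an i.i.d. sample is simply the $n$-th power of the survival function of one observation. The Python sketch below assumes the rate parametrization $f_j(x) = \alpha_j e^{-\alpha_j x}$ and component rates $k\alpha_1 + (n-k)\alpha_2$. The function names are hypothetical.

```python
from math import comb, exp

def min_survival_direct(x, n, p, a1, a2):
    """P(X_(1) > x) = [p e^{-a1 x} + q e^{-a2 x}]^n for an i.i.d. mixture sample."""
    q = 1.0 - p
    return (p * exp(-a1 * x) + q * exp(-a2 * x)) ** n

def min_survival_mixture(x, n, p, a1, a2):
    """Same probability written as a binomial mixture of n+1 exponential
    survival functions with rates k*a1 + (n-k)*a2, k = 0..n."""
    q = 1.0 - p
    return sum(comb(n, k) * p**k * q**(n - k) * exp(-(k * a1 + (n - k) * a2) * x)
               for k in range(n + 1))

def min_density_mixture(x, n, p, a1, a2):
    """Density of X_(1) for x > 0: binomial mixture of exponential densities."""
    if x <= 0:
        return 0.0
    q = 1.0 - p
    total = 0.0
    for k in range(n + 1):
        rate = k * a1 + (n - k) * a2
        total += comb(n, k) * p**k * q**(n - k) * rate * exp(-rate * x)
    return total
```

The agreement of `min_survival_direct` and `min_survival_mixture` is just the binomial theorem applied to $[p e^{-\alpha_1 x} + q e^{-\alpha_2 x}]^n$, which gives an independent confirmation of the result for this special case.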
REFERENCES

[1] Boes, D. C., "On the estimation of mixing distributions," Annals of Mathematical Statistics, vol. 37, (1966), 177-188.

[2] Cox, D. R., "Notes on the analysis of mixed frequency distributions," British Journal of Mathematical and Statistical Psychology, vol. 19, (1966), 39-47.

[3] Press, S. J., "Linear combinations of non-central chi-square variates," Annals of Mathematical Statistics, vol. 37, (1966), 480-487.

[4] Thomas, E. A. C., "Distribution free tests for mixed probability distributions," Biometrika, vol. 56, (1969), 475-484.