
1. (Regular) Exponential Family
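The density function of a regular exponential family is:
$$f(x;\theta) = c(\theta)\,h(x)\exp\Big[\sum_{j=1}^{k} w_j(\theta)\,t_j(x)\Big], \qquad \theta = (\theta_1,\dots,\theta_k).$$
Example. Poisson(θ)
$$f(x;\theta) = \frac{e^{-\theta}\theta^{x}}{x!} = e^{-\theta}\,\frac{1}{x!}\exp\big[x\ln\theta\big], \qquad x = 0, 1, 2, \dots$$
Here $c(\theta) = e^{-\theta}$, $h(x) = 1/x!$, $w(\theta) = \ln\theta$ and $t(x) = x$.
Example. Normal. $N(\mu,\sigma^2)$ (both unknown).
$$f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{(x-\mu)^2}{2\sigma^2}\Big] = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{\mu^2}{2\sigma^2}\Big]\exp\Big[\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2\Big]$$
Here $w_1 = \mu/\sigma^2$, $t_1(x) = x$, $w_2 = -1/(2\sigma^2)$ and $t_2(x) = x^2$.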
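2. Theorem (Exponential family & sufficient statistic).
Let $X_1,\dots,X_n$ be a random sample from the regular exponential family. Then
$$T(X) = \Big(\sum_{i=1}^{n} t_1(X_i),\ \dots,\ \sum_{i=1}^{n} t_k(X_i)\Big)$$
is sufficient for $\theta = (\theta_1,\dots,\theta_k)$.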
Example. Poisson(θ)
Let $X_1,\dots,X_n$ be a random sample from Poisson(θ). Then
$$T(X) = \sum_{i=1}^{n} X_i$$
is sufficient for $\theta$.
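As a small numerical illustration of what sufficiency of $\sum X_i$ means in the Poisson case, the sketch below computes the conditional probability of a sample given its total and shows that it is the same for every θ. The data and the function names are hypothetical, chosen only for this illustration.

```python
import math

def poisson_pmf(x, rate):
    """Probability that a Poisson(rate) count equals x."""
    return math.exp(-rate) * rate**x / math.factorial(x)

def conditional_prob(sample, theta):
    """P(X = sample | sum(X) = sum(sample)) for an iid Poisson(theta) sample."""
    n, t = len(sample), sum(sample)
    joint = math.prod(poisson_pmf(x, theta) for x in sample)
    # The total of n iid Poisson(theta) counts is Poisson(n * theta).
    return joint / poisson_pmf(t, n * theta)

sample = (2, 0, 3)  # hypothetical data
for theta in (0.5, 1.0, 4.0):
    print(theta, conditional_prob(sample, theta))
```

Each line prints the same conditional probability (the multinomial value 10/243), so once the total is known the sample carries no further information about θ.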
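Example. Normal. $N(\mu,\sigma^2)$ (both unknown).
Let $X_1,\dots,X_n$ be a random sample from $N(\mu,\sigma^2)$. Then
$$T(X) = \Big(\sum_{i=1}^{n} X_i,\ \sum_{i=1}^{n} X_i^2\Big)$$
is sufficient for $\theta = (\mu,\sigma^2)$.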
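For the normal case, one way to see that $(\sum X_i, \sum X_i^2)$ captures everything the likelihood needs is to evaluate the log-likelihood from these two sums alone. The following is a minimal sketch; the function names and the two data sets are made up for illustration.

```python
import math

def loglik_from_data(xs, mu, sigma2):
    """Normal log-likelihood computed from the raw observations."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))

def loglik_from_stats(n, s1, s2, mu, sigma2):
    """Same log-likelihood using only n, s1 = sum(x) and s2 = sum(x^2)."""
    # sum (x - mu)^2 = s2 - 2*mu*s1 + n*mu^2
    ss = s2 - 2 * mu * s1 + n * mu ** 2
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

# Two different hypothetical samples sharing sum = 6 and sum of squares = 18.
a, b = (0, 3, 3), (1, 1, 4)
print(loglik_from_data(a, 1.5, 2.0), loglik_from_data(b, 1.5, 2.0))
print(loglik_from_stats(3, 6, 18, 1.5, 2.0))
```

The two samples differ, yet they yield the same likelihood for every choice of $(\mu, \sigma^2)$ because they agree on both sums.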
Exercise.
Apply the general exponential family result to all the standard
families discussed above such as binomial, Poisson, normal,
exponential, gamma.
A Non-Exponential Family Example.
Discrete uniform.
$$P(X = x) = \frac{1}{\theta}, \qquad x = 1, 2, \dots, \theta,$$
where $\theta$ is a positive integer.
Another Non-exponential Example.
iid
(
)
( )
Universal Cases.
$X_1,\dots,X_n$ are iid with density $f(x;\theta)$.
• The original data $X_1,\dots,X_n$ are always sufficient for $\theta$.
(They are trivial statistics, since they do not lead to any data reduction.)
• The order statistics $T = (X_{(1)},\dots,X_{(n)})$ are always sufficient for $\theta$.
(The dimension of the order statistics is $n$, the same as the dimension of the data. Still, this is a nontrivial reduction, as $n!$ different values of the data correspond to one value of $T$.)
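The reduction achieved by the order statistics can be seen with a tiny enumeration. The sketch below uses a made-up sample of three distinct values and counts how many orderings of the data collapse to the same sorted vector.

```python
from itertools import permutations

def order_statistics(sample):
    """Return the sample sorted in increasing order."""
    return tuple(sorted(sample))

data = (2.7, 0.4, 1.9)  # hypothetical sample with distinct values
orderings = set(permutations(data))
images = {order_statistics(p) for p in orderings}
print(len(orderings), len(images))  # prints "6 1"
```

With three distinct observations there are 3! = 6 data vectors, all mapped to the single value (0.4, 1.9, 2.7).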
3. Theorem (Rao-Blackwell)
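Let $X_1,\dots,X_n$ be a random sample from the population with pdf $f(x;\theta)$. Let $T(X)$ be a sufficient statistic for θ, and $U(X)$ be any unbiased estimator of θ.
Let $U^*(X) = E[U(X) \mid T]$, then
(1) $U^*(X)$ is an unbiased estimator of $\theta$,
(2) $U^*(X)$ is a function of T,
(3) $\mathrm{Var}(U^*) \le \mathrm{Var}(U)$ for every $\theta$, and $\mathrm{Var}(U^*) < \mathrm{Var}(U)$ for some $\theta$ unless $U^* = U$ with probability 1.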
Rao-Blackwell theorem tells us that in searching
for an unbiased estimator with the smallest
possible variance (i.e., the best estimator, also
called the uniformly minimum variance unbiased
estimator – UMVUE, which is also referred to as
simply the MVUE), we can restrict our search to
only unbiased functions of the sufficient statistic
T(X).
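To make the conditioning step concrete, the sketch below works through a Poisson case chosen here for illustration: start from the crude unbiased estimator $X_1$ and condition on the total $T = \sum X_i$. The function names are hypothetical. The conditional mean is computed by direct summation over the joint pmf and turns out to be $t/n$ whatever θ is, so conditioning replaces $X_1$ (variance θ) with the sample mean (variance θ/n).

```python
import math

def poisson_pmf(x, rate):
    """Probability that a Poisson(rate) count equals x."""
    return math.exp(-rate) * rate**x / math.factorial(x)

def rao_blackwell_x1(t, n, theta):
    """E[X_1 | T = t] for n >= 2 iid Poisson(theta) counts with total T."""
    # P(X_1 = x, T = t) = P(X_1 = x) * P(X_2 + ... + X_n = t - x)
    joint = [poisson_pmf(x, theta) * poisson_pmf(t - x, (n - 1) * theta)
             for x in range(t + 1)]
    return sum(x * p for x, p in enumerate(joint)) / sum(joint)

for theta in (0.3, 1.0, 5.0):
    print(theta, rao_blackwell_x1(7, 4, theta))
```

The value of θ passed in does not change the answer, which is what sufficiency of $T$ guarantees.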
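Proof: Make use of the following equations:
$$E(U) = E\big[E(U \mid T)\big]$$
$$\mathrm{Var}(U) = \mathrm{Var}\big[E(U \mid T)\big] + E\big[\mathrm{Var}(U \mid T)\big]$$
The first equation gives unbiasedness of $E[U \mid T]$; the second gives the variance inequality, since $E[\mathrm{Var}(U \mid T)] \ge 0$.
Note: The fact that $T(X)$ is a sufficient statistic for θ ensures that $E[U(X) \mid T]$ is a function of the sample only and, in particular, does not depend on θ.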
4. Transformation of Sufficient Statistics
1. If $T$ is sufficient for $\theta$ and $T = h(S)$ is a mathematical function of some other statistic $S$, then $S$ is also sufficient.
2. If $T$ is sufficient for $\theta$, and $S = h(T)$ with $h$ being one-to-one, then $S$ is also sufficient.
Remark: When one statistic is a function of the other statistic and vice versa, then they carry exactly the same amount of information.
Examples:
• If $\sum X_i$ is sufficient, so is $\bar{X}$.
• If $(\sum X_i, \sum X_i^2)$ are sufficient, so is $(\bar{X}, S^2)$.
• If $\sum X_i$ is sufficient, so is $(\sum X_i, \sum X_i^2)$, and so is any other statistic of which $\sum X_i$ is a function.
Examples of non-sufficiency.
Ex. $X_1, X_2$ iid Poisson($\theta$). $T = X_1 - X_2$ is not sufficient.
Ex. $X_1,\dots,X_n$ iid with pmf $f(x;\theta)$. $T = (X_1,\dots,X_{n-1})$ is not sufficient.
5. Minimal Sufficient Statistics
It is seen that different sufficient statistics are possible. Which
one is the "best"? Naturally, the one with the maximum
reduction.
• For $N(\mu,\sigma^2)$ with $\sigma^2$ known, $\bar{X}$ is a better sufficient statistic for $\mu$ than $(\bar{X}, S^2)$.
Definition:
$T$ is a minimal sufficient statistic if, given any other sufficient statistic $T'$, there is a function $c(\cdot)$ such that $T = c(T')$.
Equivalently, $T$ is minimal sufficient if, given any other sufficient statistic $T'$, whenever $x$ and $y$ are two data values such that $T'(x) = T'(y)$, then $T(x) = T(y)$.
Partition Interpretation for Minimal Sufficient Statistics:
• Any sufficient statistic introduces a partition on the sample
space.
• The partition of a minimal sufficient statistic is the coarsest.
• A minimal sufficient statistic has the smallest dimension among possible sufficient statistics. Often the dimension is equal to the number of free parameters (exceptions do exist).
Theorem (How to check minimal sufficiency).
A statistic T is minimal sufficient if the following property holds: for any two sample points x and y, the ratio $f(x;\theta)/f(y;\theta)$ does not depend on $\theta$ (i.e. $f(x;\theta)/f(y;\theta)$ is a constant function of $\theta$) if and only if $T(x) = T(y)$.
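The likelihood-ratio criterion can be probed numerically. The sketch below takes the Poisson model with $T = \sum x_i$ as an assumed example and checks, on a small grid of θ values, whether the ratio of the two joint likelihoods stays constant. A grid check is only suggestive, not a proof, and the helper names are hypothetical.

```python
import math

def poisson_likelihood(sample, theta):
    """Joint Poisson(theta) likelihood of an iid sample."""
    return math.prod(math.exp(-theta) * theta**x / math.factorial(x)
                     for x in sample)

def ratio_is_constant(x, y, thetas=(0.5, 1.0, 2.0, 5.0)):
    """True if f(x; theta) / f(y; theta) agrees across the theta grid."""
    ratios = [poisson_likelihood(x, th) / poisson_likelihood(y, th)
              for th in thetas]
    return all(math.isclose(r, ratios[0], rel_tol=1e-9) for r in ratios)

print(ratio_is_constant((1, 2, 3), (0, 0, 6)))  # equal totals
print(ratio_is_constant((1, 2, 3), (1, 2, 4)))  # different totals
```

The ratio is free of θ exactly when the two samples share the same total, matching the claim that $\sum X_i$ is minimal sufficient for the Poisson family.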
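6. Exponential Families & Minimal Sufficient Statistic:
For a random sample from the regular exponential family with probability density
$$f(x;\theta) = c(\theta)\,h(x)\exp\Big[\sum_{j=1}^{k} w_j(\theta)\,t_j(x)\Big],$$
where $\theta$ is k dimensional, the statistic
$$T(X) = \Big(\sum_{i=1}^{n} t_1(X_i),\ \dots,\ \sum_{i=1}^{n} t_k(X_i)\Big)$$
is minimal sufficient for $\theta$.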
Example. Poisson(θ)
Let $X_1,\dots,X_n$ be a random sample from Poisson(θ). Then
$$T(X) = \sum_{i=1}^{n} X_i$$
is minimal sufficient for $\theta$.
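Example. Normal. $N(\mu,\sigma^2)$ (both unknown).
Let $X_1,\dots,X_n$ be a random sample from $N(\mu,\sigma^2)$. Then
$$T(X) = \Big(\sum_{i=1}^{n} X_i,\ \sum_{i=1}^{n} X_i^2\Big)$$
is minimal sufficient for $\theta = (\mu,\sigma^2)$.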
Remarks:
• A minimal sufficient statistic is not unique. Any two are in one-to-one correspondence, so they are equivalent.
7. Complete Statistics
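Let a parametric family $\{f(x;\theta),\ \theta\in\Theta\}$ be given. Let $T$ be a statistic, with induced family of distributions $\{f_T(t;\theta),\ \theta\in\Theta\}$.
A statistic $T$ is complete for the family $\{f(x;\theta),\ \theta\in\Theta\}$, or equivalently the induced family $\{f_T(t;\theta),\ \theta\in\Theta\}$ is called complete, if $E_\theta(g(T)) = 0$ for all $\theta\in\Theta$ implies that $g(T) = 0$ with probability 1.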
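Example. Poisson(θ)
Let $X_1,\dots,X_n$ be a random sample from Poisson(θ). Then $T(X) = \sum_{i=1}^{n} X_i$ is minimal sufficient for $\theta$. Now we show that T is also complete.
We know that $T(X) = \sum_{i=1}^{n} X_i \sim \text{Poisson}(n\theta)$.
Consider any function $g(T)$. We have
$$E[g(T)] = \sum_{t=0}^{\infty} g(t)\,\frac{e^{-n\theta}(n\theta)^t}{t!} = e^{-n\theta}\sum_{t=0}^{\infty} \frac{g(t)\,n^t}{t!}\,\theta^t$$
Because $e^{-n\theta} \neq 0$, setting $E[g(T)] = 0$ for all $\theta > 0$ requires the power series in $\theta$ to vanish identically, so all the coefficients $g(t)\,n^t/t!$ must be zero, which implies $g(t) = 0$ for $t = 0, 1, 2, \dots$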
Example. Let $X_1,\dots,X_n$ be iid from Bin(1, θ). Show that $T = \sum_{i=1}^{n} X_i$ is a complete statistic. (*Please read our textbook for more examples, but the following result on the regular exponential family is the most important.)
8. Exponential Families & Complete Statistics
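Theorem. Let $X_1,\dots,X_n$ be iid observations from the regular exponential family, with the pdf
$$f(x;\theta) = c(\theta)\,h(x)\exp\Big[\sum_{j=1}^{k} w_j(\theta)\,t_j(x)\Big],$$
and $\theta = (\theta_1,\dots,\theta_k)$. Then
$$T(X) = \Big(\sum_{i=1}^{n} t_1(X_i),\ \dots,\ \sum_{i=1}^{n} t_k(X_i)\Big)$$
is complete if the parameter space $\{(w_1(\theta),\dots,w_k(\theta)) : \theta\in\Theta\}$ contains an open set in $\mathbb{R}^k$.
(This is only a sufficient condition, not a necessary condition.)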
Example.
∼
(
)
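Example. Poisson(θ);
$$f(x;\theta) = e^{-\theta}\,\frac{1}{x!}\exp\big[x\ln\theta\big]$$
As $\theta$ ranges over $(0,\infty)$, $w(\theta) = \ln\theta$ ranges over all of $\mathbb{R}$, which contains an open set, so $\sum X_i$ is complete.
Example. Normal. $N(\mu,\sigma^2)$ (both unknown).
$$f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{(x-\mu)^2}{2\sigma^2}\Big] = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{\mu^2}{2\sigma^2}\Big]\exp\Big[\frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2\Big]$$
As $(\mu,\sigma^2)$ ranges over $\mathbb{R}\times(0,\infty)$, the pair $\big(\mu/\sigma^2,\ -1/(2\sigma^2)\big)$ ranges over $\mathbb{R}\times(-\infty,0)$, which is open in $\mathbb{R}^2$, so $(\sum X_i, \sum X_i^2)$ is complete.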
Example.
∼
(
)
Example.
∼
(
)
, is not complete.
is complete.
Example. The family {Bin(2, p), p = 1/2 or p = 1/4} is not complete.
Example. The family {Bin(2,p), 0 < p < 1} is complete.
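The two-point Bin(2, p) family fails completeness because two linear constraints on the three values g(0), g(1), g(2) leave room for a nonzero solution. The sketch below verifies one such function, g = (1, -2, 3), found by solving the two equations by hand, using exact rational arithmetic. The helper name is hypothetical.

```python
from fractions import Fraction
from math import comb

def expected_g(g, p, n=2):
    """E_p[g(X)] for X ~ Bin(n, p), with g given as the list g(0), ..., g(n)."""
    return sum(g[x] * comb(n, x) * p**x * (1 - p) ** (n - x)
               for x in range(n + 1))

g = [1, -2, 3]  # nonzero function of X
for p in (Fraction(1, 2), Fraction(1, 4), Fraction(1, 3)):
    print(p, expected_g(g, p))
```

The expectation is exactly zero at p = 1/2 and p = 1/4 but not at p = 1/3. Over the full interval 0 < p < 1 the expectation is a quadratic polynomial in p, which can vanish identically only when g is zero, consistent with the second example.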
Properties of the Complete Statistics
(i) If $T$ is complete and $S = h(T)$, then $S$ is also complete.
(ii) If a statistic T is complete and sufficient, then any minimal
sufficient statistic is complete.
(iii) Trivial (constant) statistics are complete for any family.
9. Theorem (Lehmann-Scheffe). (Complete Sufficient Statistic and the Best Estimator)
If T is complete and sufficient, then $U = h(T)$ is the Best Estimator (also called UMVUE or MVUE) of its expectation.
Example. Poisson(θ)
Let $X_1,\dots,X_n$ be a random sample from Poisson(θ). Then $T(X) = \sum_{i=1}^{n} X_i$ is complete sufficient for $\theta$. Since
$$U = \frac{T(X)}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$$
is an unbiased estimator of θ, by the Lehmann-Scheffe theorem we know that U is a best estimator (UMVUE/MVUE) for θ.
Example. Let $X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} N(\mu,\sigma^2)$ be a random sample from the normal population where $\mu$ is assumed known. Please derive
(a) The maximum likelihood estimator for $\sigma^2$.
(b) Is the above MLE for $\sigma^2$ unbiased?
(c) Is the MLE a best estimator (UMVUE) for $\sigma^2$?
Solution:
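(a) The likelihood function is
$$L(\sigma^2) = \prod_{i=1}^{n} f(x_i;\sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{(x_i-\mu)^2}{2\sigma^2}\Big] = (2\pi\sigma^2)^{-n/2}\exp\Big[-\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}\Big]$$
The log likelihood function is
$$\ell(\sigma^2) = \ln L(\sigma^2) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}$$
Solving
$$\frac{d\ell}{d\sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^4} = 0$$
We obtain the MLE for $\sigma^2$:
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(X_i-\mu)^2}{n}$$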
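As a quick sanity check on part (a), the sketch below evaluates the normal log-likelihood with known mean at the closed-form estimate $\sum (x_i-\mu)^2/n$ and at nearby values of $\sigma^2$. The data and function names are hypothetical and serve only to illustrate that the closed form sits at the maximum.

```python
import math

def loglik(sigma2, xs, mu):
    """Normal log-likelihood in sigma^2 with the mean mu treated as known."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))

def mle_sigma2(xs, mu):
    """Closed-form MLE of sigma^2 when mu is known."""
    return sum((x - mu) ** 2 for x in xs) / len(xs)

xs, mu = [4.1, 5.3, 6.0, 4.6], 5.0  # hypothetical data and known mean
hat = mle_sigma2(xs, mu)
for s2 in (0.9 * hat, hat, 1.1 * hat):
    print(round(s2, 4), loglik(s2, xs, mu))
```

The middle line, evaluated at the closed-form estimate, has the largest log-likelihood of the three.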
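(b) Since
$$E[\hat{\sigma}^2] = \frac{1}{n}\sum_{i=1}^{n} E\big[(X_i-\mu)^2\big] = \frac{1}{n}\cdot n\sigma^2 = \sigma^2,$$
it is straightforward to verify that the MLE
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(X_i-\mu)^2}{n}$$
is an unbiased estimator for $\sigma^2$.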
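(c) Now we calculate the Cramer-Rao lower bound for the variance of an unbiased estimator for $\sigma^2$.
$$f(x;\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{(x-\mu)^2}{2\sigma^2}\Big]$$
$$\ln f(x;\sigma^2) = -\frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\sigma^2 - \frac{(x-\mu)^2}{2\sigma^2}$$
$$\frac{\partial \ln f}{\partial \sigma^2} = -\frac{1}{2\sigma^2} + \frac{(x-\mu)^2}{2\sigma^4}$$
$$\frac{\partial^2 \ln f}{\partial (\sigma^2)^2} = \frac{1}{2\sigma^4} - \frac{(x-\mu)^2}{\sigma^6}$$
$$E\Big[\frac{\partial^2 \ln f}{\partial (\sigma^2)^2}\Big] = \frac{1}{2\sigma^4} - \frac{\sigma^2}{\sigma^6} = -\frac{1}{2\sigma^4}$$
Thus the Cramer-Rao lower bound is:
$$\frac{1}{-n\,E\big[\partial^2 \ln f/\partial(\sigma^2)^2\big]} = \frac{2\sigma^4}{n}$$
Since $\sum_{i=1}^{n}(X_i-\mu)^2/\sigma^2 \sim \chi^2_n$ has variance $2n$, we have $\mathrm{Var}(\hat{\sigma}^2) = \frac{\sigma^4}{n^2}\cdot 2n = \frac{2\sigma^4}{n}$, which attains the bound.
Therefore we claim that the MLE is an efficient estimator for $\sigma^2$. Since the regularity conditions for the Cramer-Rao lower bound theorem hold here, we declare the MLE is also a best estimator (UMVUE) for $\sigma^2$.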
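For this model the Cramer-Rao bound works out to $2\sigma^4/n$, and a short Monte Carlo run can be used to see that the MLE with known mean has roughly that variance. The following is a sketch under assumed settings (n = 10, σ = 2, 20000 replications, fixed seed); the function name is hypothetical and the result is subject to simulation error.

```python
import random
import statistics

def simulate_mle_variance(n, mu, sigma, reps, seed=0):
    """Monte Carlo mean and variance of sum((X_i - mu)^2) / n."""
    rng = random.Random(seed)
    estimates = [
        sum((rng.gauss(mu, sigma) - mu) ** 2 for _ in range(n)) / n
        for _ in range(reps)
    ]
    return statistics.mean(estimates), statistics.variance(estimates)

n, sigma = 10, 2.0
mean_hat, var_hat = simulate_mle_variance(n, 0.0, sigma, reps=20000)
print("mean of estimates:", mean_hat, "target:", sigma ** 2)
print("variance of estimates:", var_hat, "bound:", 2 * sigma ** 4 / n)
```

The simulated mean should land near $\sigma^2 = 4$ and the simulated variance near the bound 3.2, up to Monte Carlo noise.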
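(c) Alternatively, we can derive that the MLE is the best estimator directly (rather than using the argument that an efficient estimator is also a best estimator) as follows:
The population pdf is:
$$f(x;\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[-\frac{(x-\mu)^2}{2\sigma^2}\Big] = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\Big[\Big(-\frac{1}{2\sigma^2}\Big)(x-\mu)^2\Big]$$
So it is a regular exponential family, with $w(\sigma^2) = -\frac{1}{2\sigma^2}$ and $t(x) = (x-\mu)^2$.
Thus $T(X) = \sum_{i=1}^{n}(X_i-\mu)^2$ is a complete & sufficient statistic (CSS) for $\sigma^2$.
It is easy to see that the MLE for $\sigma^2$ is a function of the complete & sufficient statistic as follows:
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(X_i-\mu)^2}{n} = \frac{T(X)}{n}$$
In addition, we know that the MLE is an unbiased estimator for $\sigma^2$, as we have done in part (b):
$$E(\hat{\sigma}^2) = E\Big[\frac{\sum_{i=1}^{n}(X_i-\mu)^2}{n}\Big] = \frac{1}{n}\sum_{i=1}^{n}E\big[(X_i-\mu)^2\big] = \frac{1}{n}\sum_{i=1}^{n}\sigma^2 = \sigma^2$$
Since the MLE $\hat{\sigma}^2$ is an unbiased estimator and a function of the CSS, we claim that $\hat{\sigma}^2$ is the best estimator (UMVUE) for $\sigma^2$ by the Lehmann-Scheffe Theorem.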
10. Theorem (Basu)
A complete sufficient statistic T for the parameter θ is independent of any ancillary statistic, that is, any statistic whose distribution does not depend on θ.
Example. Consider a random sample of size n from a normal distribution $N(\mu,\sigma^2)$.
Consider the MLEs
$$\hat{\mu} = \bar{X}, \qquad \hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n}$$
It is easy to verify that $\bar{X}$ is a complete sufficient statistic for $\mu$ for fixed values of $\sigma^2$. Also:
$$\frac{n\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-1}$$
which does not depend on $\mu$. It follows from the Basu Theorem that the two MLEs are independent of each other.
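A simulation cannot prove independence, but it can show one consequence of it: the sample correlation between the mean $\bar{X}$ and the variance estimate $\sum (X_i-\bar{X})^2/n$ should be near zero for normal data. The sketch below uses assumed settings (n = 5, μ = 1, σ = 2, 20000 replications, fixed seed) and a hypothetical function name.

```python
import random

def mean_variance_correlation(n, mu, sigma, reps, seed=0):
    """Sample correlation between X-bar and sum((X_i - X-bar)^2) / n."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / n)
    # Pearson correlation computed directly from its definition.
    mm = sum(means) / reps
    mv = sum(variances) / reps
    cov = sum((a - mm) * (b - mv) for a, b in zip(means, variances))
    sa = sum((a - mm) ** 2 for a in means)
    sb = sum((b - mv) ** 2 for b in variances)
    return cov / (sa * sb) ** 0.5

print(mean_variance_correlation(5, 1.0, 2.0, reps=20000))
```

A value close to zero is consistent with the independence given by Basu's theorem. Zero correlation alone would not establish independence, so this is only a plausibility check.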
Homework 7:
Due next Thursday, March 30, 2017.
Problems in our textbook:
6.3, 6.6, 6.17, 6.22, 6.30, 7.49, 7.50, 7.52 (a&b), 7.59, 7.60
Read our textbook:
Chapter 6: Sections 6.1 & 6.2
Chapter 7: Sections 7.3.3 and 7.5 (Miscellanea): 7.5.1 & 7.5.3