Stat 550 Notes 5
Reading: Chapter 1.6
I. Exponential Families
The binomial and normal models exhibited the interesting
feature that there is a natural minimal sufficient statistic
whose dimension is independent of the sample size. The
exponential family models are a general class of models
that exhibit this feature.
The class of exponential family models includes many of
the most widely used statistical models (e.g., binomial,
normal, gamma, Poisson, multinomial). Exponential
family models have an underlying structure with elegant
properties that we will discuss.
One-parameter exponential families: The family of
distributions of a model $\{P_\theta : \theta \in \Theta\}$ is said to be a
one-parameter exponential family if there exist real-valued
functions $\eta(\theta)$, $B(\theta)$, $T(x)$, $h(x)$ such that the pdf or pmf
may be written as
$$p(x \mid \theta) = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \qquad (0.1)$$
Comments:
(1) For an exponential family, the support of the
distribution (i.e., $\{x : p(x \mid \theta) > 0\}$) cannot depend on $\theta$.
Thus, $X_1, \ldots, X_n$ iid Uniform$(0, \theta)$ is not an exponential
family model.
(2) For an exponential family model, $T(x)$ is a sufficient
statistic by the factorization theorem.
(3) $\eta$, $B$, $T$ are not unique. For example, $\eta$ can be
multiplied by a nonzero constant $c$ and $T$ can be divided by the same
constant $c$.
Examples of one-parameter exponential family models:
(1) Poisson family.
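Let $X \sim \text{Poisson}(\theta)$, $\theta > 0$. Then for $x \in \{0, 1, 2, \ldots\}$,
$$p(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!} = \frac{1}{x!} \exp\{x \log \theta - \theta\}.$$
This is a one-parameter exponential family with
$\eta(\theta) = \log \theta$, $B(\theta) = \theta$, $T(x) = x$, $h(x) = 1/x!$.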
(2) Binomial family.
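Let $X \sim \text{Binomial}(n, \theta)$, $0 < \theta < 1$. Then for
$x \in \{0, 1, 2, \ldots, n\}$,
$$p(x \mid \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x} = \binom{n}{x} \exp\left[x \log \frac{\theta}{1 - \theta} + n \log(1 - \theta)\right].$$
This is a one-parameter exponential family with
$\eta(\theta) = \log \frac{\theta}{1 - \theta}$, $B(\theta) = -n \log(1 - \theta)$, $T(x) = x$, $h(x) = \binom{n}{x}$.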
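The two factorizations above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the parameter values (theta = 2.5 for the Poisson, n = 10 and theta = 0.3 for the binomial) are illustrative assumptions. It evaluates each pmf directly and through h(x) exp{eta(theta) T(x) - B(theta)} and reports the largest discrepancy.

```python
import math

def poisson_pmf(x, theta):
    # Direct Poisson pmf: theta^x e^{-theta} / x!
    return theta**x * math.exp(-theta) / math.factorial(x)

def poisson_expfam(x, theta):
    # Exponential family form with eta = log(theta), B = theta, T(x) = x, h(x) = 1/x!
    eta = math.log(theta)
    B = theta
    h = 1.0 / math.factorial(x)
    return h * math.exp(eta * x - B)

def binomial_pmf(x, n, theta):
    # Direct binomial pmf
    return math.comb(n, x) * theta**x * (1 - theta)**(n - x)

def binomial_expfam(x, n, theta):
    # Exponential family form with eta = log(theta/(1-theta)),
    # B = -n log(1-theta), T(x) = x, h(x) = C(n, x)
    eta = math.log(theta / (1 - theta))
    B = -n * math.log(1 - theta)
    h = math.comb(n, x)
    return h * math.exp(eta * x - B)

# Illustrative parameter values (assumptions, not from the notes)
max_gap_poisson = max(
    abs(poisson_pmf(x, 2.5) - poisson_expfam(x, 2.5)) for x in range(20)
)
max_gap_binomial = max(
    abs(binomial_pmf(x, 10, 0.3) - binomial_expfam(x, 10, 0.3)) for x in range(11)
)
print(max_gap_poisson, max_gap_binomial)
```

Both gaps should be at the level of floating-point rounding, since the two expressions are algebraically identical.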
Families of distributions obtained by taking iid samples
from one-parameter exponential families are themselves
one-parameter exponential families.
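Specifically, suppose $X \sim P_\theta$ and $\{P_\theta : \theta \in \Theta\}$ is an
exponential family. Then for $X_1, \ldots, X_n$ iid with common
distribution $P_\theta$,
$$p(x_1, \ldots, x_n \mid \theta) = \left[\prod_{i=1}^n h(x_i)\right] \exp\left\{\eta(\theta) \sum_{i=1}^n T(x_i) - n B(\theta)\right\}.$$
A sufficient statistic is $\sum_{i=1}^n T(x_i)$, and it is one dimensional
whatever the sample size $n$ is.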
For $X_1, \ldots, X_n$ iid Poisson$(\theta)$, the sufficient statistic
$\sum_{i=1}^n T(x_i) = \sum_{i=1}^n x_i$ has a Poisson$(n\theta)$ distribution and
hence has an exponential family model. It is generally true
that the sufficient statistic of an exponential family model
follows an exponential family model.
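The Poisson claim can be illustrated by convolving the Poisson pmf with itself. This is a rough sketch under stated assumptions: theta = 1.2, n = 3, and the truncation point `cutoff` are arbitrary illustrative choices, and `convolve` is a hypothetical helper written here, not a library function.

```python
import math

def poisson_pmf(x, theta):
    return theta**x * math.exp(-theta) / math.factorial(x)

def convolve(p, q):
    # pmf of the sum of two independent variables with pmfs p and q on 0, 1, 2, ...
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

theta, n, cutoff = 1.2, 3, 40  # illustrative values (assumptions)

single = [poisson_pmf(x, theta) for x in range(cutoff)]
total = single
for _ in range(n - 1):
    total = convolve(total, single)

# Entries of `total` with index below `cutoff` are unaffected by the truncation,
# so they can be compared with the Poisson(n * theta) pmf.
max_gap = max(abs(total[t] - poisson_pmf(t, n * theta)) for t in range(cutoff))
print(max_gap)
```

The comparison is restricted to t below `cutoff` because only those entries receive every contributing term of the convolution.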
Theorem 1.6.1: Let $\{P_\theta : \theta \in \Theta\}$ be a one-parameter
exponential family of discrete distributions:
$$p(x \mid \theta) = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\}.$$
Then the family of the distributions of the statistic $T(X)$ is
a one-parameter exponential family of discrete distributions
whose pmf may be written
$$h^*(t) \exp\{\eta(\theta) t - B(\theta)\}$$
for suitable $h^*$.
Proof: By definition,
$$P_\theta[T(X) = t] = \sum_{\{x : T(x) = t\}} p(x \mid \theta)
= \sum_{\{x : T(x) = t\}} h(x) \exp[\eta(\theta) T(x) - B(\theta)]
= \exp[\eta(\theta) t - B(\theta)] \left\{\sum_{\{x : T(x) = t\}} h(x)\right\}.$$
If we let $h^*(t) = \sum_{\{x : T(x) = t\}} h(x)$, the result follows.
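To make the construction of h* concrete, here is a small sketch for a case not worked in the notes: n iid Bernoulli trials, where h(x) = 1 for every sample point and T(x) is the number of successes. The choice n = 6 is an arbitrary assumption. Summing h over each level set {x : T(x) = t} should recover the binomial coefficient, which is the h of the binomial family.

```python
import itertools
import math
from collections import Counter

n = 6  # illustrative sample size (assumption)

# h_star[t] = sum of h(x) over all x in {0,1}^n with T(x) = sum(x) = t
h_star = Counter()
for x in itertools.product([0, 1], repeat=n):
    h_star[sum(x)] += 1  # h(x) = 1 for every Bernoulli sample point

for t in range(n + 1):
    print(t, h_star[t], math.comb(n, t))
```

Each row prints the same value twice, matching h(x) = C(n, x) in the binomial example.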
A similar theorem holds for continuous exponential
families.
A useful reparameterization of the exponential family
model is to index $\eta = \eta(\theta)$ as the parameter to yield
$$p(x \mid \eta) = h(x) \exp[\eta T(x) - A(\eta)], \qquad (0.2)$$
where $A(\eta) = \log \int h(x) \exp[\eta T(x)] \, dx$ in the continuous
case and the integral is replaced by a sum in the discrete
case. If $\theta \in \Theta$, then $A(\eta)$ with $\eta = \eta(\theta)$ must be finite. Let
$\mathcal{E} = \{\eta : |A(\eta)| < \infty\}$. The model given by (0.2) with $\eta$
ranging over $\mathcal{E}$ is called the canonical one-parameter
exponential family generated by $T$ and $h$. $\mathcal{E}$ is called the
natural parameter space and $T$ is called the natural
sufficient statistic. The canonical one-parameter
exponential family contains the one-parameter exponential
family (0.1) with parameter space $\Theta$, and $\mathcal{E}$ can be thought
of as the “biggest” possible parameter space for the
exponential family.
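Example 1: Let $X \sim \text{Poisson}(\theta)$, $\theta > 0$. Then for
$x \in \{0, 1, 2, \ldots\}$,
$$p(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!} = \frac{1}{x!} \exp\{x \log \theta - \theta\}. \qquad (0.3)$$
Letting $\eta = \log \theta$, we have
$$p(x \mid \eta) = \frac{1}{x!} \exp\{\eta x - \exp(\eta)\}, \quad x \in \{0, 1, 2, \ldots\}.$$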
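We have
$$A(\eta) = \log \sum_{x=0}^{\infty} \frac{1}{x!} e^{\eta x}
= \log \sum_{x=0}^{\infty} \frac{(e^{\eta})^x}{x!}
= \log \exp(e^{\eta}) = e^{\eta}.$$
Thus, $\mathcal{E} = \{\eta : |A(\eta)| < \infty\} = (-\infty, \infty)$.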
Note that if the parameter space were restricted, for example to $\theta > 1$,
then (0.3) would still be a one-parameter exponential family, but it would be
a strict subset of the canonical one-parameter exponential family
generated by $T$ and $h$ with natural parameter space
$\mathcal{E} = \{\eta : |A(\eta)| < \infty\} = (-\infty, \infty)$.
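The closed form for the Poisson log normalizer can be compared with a direct evaluation of the defining sum. The sketch below is illustrative only: the function name `log_partition_poisson`, the truncation at 200 terms, and the grid of eta values are assumptions chosen so that the omitted tail is negligible.

```python
import math

def log_partition_poisson(eta, terms=200):
    # A(eta) = log sum_{x >= 0} exp(eta * x) / x!, truncated after `terms` terms.
    # Terms are built recursively to avoid large factorials.
    total = 0.0
    term = 1.0  # x = 0 term: exp(0) / 0! = 1
    for x in range(terms):
        total += term
        term *= math.exp(eta) / (x + 1)  # ratio of the (x+1)-th term to the x-th
    return math.log(total)

etas = [-3.0, 0.0, 2.0]  # illustrative natural parameter values (assumption)
gaps = [abs(log_partition_poisson(e) - math.exp(e)) for e in etas]
print(gaps)
```

The sum is finite for every real eta, which is the numerical counterpart of the natural parameter space being the whole real line.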
A useful result about exponential families is the following
computational shortcut for moments of the natural
sufficient statistic:
Theorem 1.6.2: If $X$ is distributed according to (0.2) and
$\eta$ is an interior point of $\mathcal{E}$, then the moment-generating
function of $T(X)$ exists and is given by
$$M(s) = E[\exp(s T(X))] = \exp[A(s + \eta) - A(\eta)]$$
for $s$ in some neighborhood of 0.
Moreover,
$$E_\eta[T(X)] = A'(\eta), \qquad \mathrm{Var}_\eta[T(X)] = A''(\eta).$$
Proof: This is the proof for the continuous case.
$$M(s) = E(\exp(s T(X)))
= \int h(x) \exp[(s + \eta) T(x) - A(\eta)] \, dx$$
$$= \{\exp[A(s + \eta) - A(\eta)]\} \int h(x) \exp[(s + \eta) T(x) - A(s + \eta)] \, dx
= \exp[A(s + \eta) - A(\eta)]$$
because the last factor, being the integral of a density, is
one. The rest of the theorem follows from the moment
generating property of $M(s)$ (see Section A.12 of Bickel
and Doksum).
Comment on proof: In order for the moment generating
function (MGF) properties to hold, the MGF must exist (be
less than infinity) for s in some neighborhood of 0. The
proof that the MGF exists for s in some neighborhood of 0
relies on the fact that the natural parameter space $\mathcal{E}$ is an interval or $(-\infty, \infty)$, which is
established in Section 1.6.4.
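Theorem 1.6.2 can be sanity-checked on the binomial family. The sketch below is an illustration, not part of the notes: it assumes the canonical form A(eta) = n log(1 + e^eta), obtained by substituting theta = e^eta / (1 + e^eta) into B(theta) = -n log(1 - theta), and uses illustrative values n = 10, theta = 0.3, s = 0.2 and a finite-difference step of 1e-4.

```python
import math

n, theta = 10, 0.3                      # illustrative values (assumptions)
eta = math.log(theta / (1 - theta))     # natural parameter of the binomial

def A(e):
    # Binomial log normalizer in the natural parameter (derived, see lead-in)
    return n * math.log(1 + math.exp(e))

# Central finite differences approximate A'(eta) and A''(eta)
step = 1e-4
mean_fd = (A(eta + step) - A(eta - step)) / (2 * step)
var_fd = (A(eta + step) - 2 * A(eta) + A(eta - step)) / step**2

# MGF at s: direct expectation versus exp[A(s + eta) - A(eta)]
s = 0.2
mgf_direct = sum(
    math.comb(n, x) * theta**x * (1 - theta)**(n - x) * math.exp(s * x)
    for x in range(n + 1)
)
mgf_formula = math.exp(A(s + eta) - A(eta))

print(mean_fd, n * theta)
print(var_fd, n * theta * (1 - theta))
print(mgf_direct, mgf_formula)
```

The finite-difference values should agree with the familiar binomial mean n theta and variance n theta (1 - theta) up to discretization error.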
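Example 1 continued: Let $X \sim \text{Poisson}(\theta)$, $\theta > 0$.
The natural sufficient statistic is $T(X) = X$ and $\eta = \log \theta$,
$A(\eta) = e^{\eta}$. Thus, using Theorem 1.6.2,
$$E_\theta[X] = \frac{d}{d\eta} e^{\eta} \Big|_{\eta = \log \theta} = e^{\log \theta} = \theta,$$
$$\mathrm{Var}_\theta[X] = \frac{d^2}{d\eta^2} e^{\eta} \Big|_{\eta = \log \theta} = e^{\log \theta} = \theta.$$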
Example 2: Suppose $X_1, \ldots, X_n$ is a sample from a
population with pdf
$$p(x \mid \theta) = \frac{x}{\theta^2} \exp\left(-\frac{x^2}{2\theta^2}\right), \quad x > 0, \ \theta > 0.$$
This is known as the Rayleigh distribution. It is used to
model the density of time until failure for certain types of
equipment. The data comes from an exponential family:
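$$p(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^n \frac{x_i}{\theta^2} \exp\left(-\frac{x_i^2}{2\theta^2}\right)
= \left[\prod_{i=1}^n x_i\right] \exp\left(-\frac{1}{2\theta^2} \sum_{i=1}^n x_i^2 - n \log \theta^2\right).$$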
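Here
$$\eta = -\frac{1}{2\theta^2}, \quad B(\theta) = n \log \theta^2, \quad A(\eta) = -n \log(-2\eta), \quad T(x) = \sum_{i=1}^n x_i^2.$$
Therefore, the natural sufficient statistic $\sum_{i=1}^n X_i^2$ has mean
$A'(\eta) = -n/\eta = 2n\theta^2$ and variance $A''(\eta) = n/\eta^2 = 4n\theta^4$.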
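The Rayleigh moments can be checked by simulation. This is a rough sketch under stated assumptions: n = 5, theta = 1.5, the number of replications, and the seed are arbitrary illustrative choices, and Rayleigh draws are generated by inverting the cdf 1 - exp(-x^2 / (2 theta^2)), so that X^2 = -2 theta^2 log(U) for U uniform on (0, 1].

```python
import math
import random
import statistics

random.seed(0)                      # fixed seed for reproducibility (assumption)
n, theta, reps = 5, 1.5, 20000      # illustrative values (assumptions)

def sum_of_squares():
    # Draw X_1, ..., X_n iid Rayleigh(theta) by inverse cdf and return sum X_i^2.
    # 1 - random.random() lies in (0, 1], which keeps log() finite.
    return sum(-2 * theta**2 * math.log(1.0 - random.random()) for _ in range(n))

draws = [sum_of_squares() for _ in range(reps)]

sim_mean = statistics.fmean(draws)
sim_var = statistics.variance(draws)

# Values from Theorem 1.6.2: A'(eta) = 2 n theta^2, A''(eta) = 4 n theta^4
theory_mean = 2 * n * theta**2
theory_var = 4 * n * theta**4

print(sim_mean, theory_mean)
print(sim_var, theory_var)
```

The simulated values should be close to the theoretical ones, with Monte Carlo error shrinking as `reps` grows.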