September 10

ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Generalized Linear Model
Certain nonlinear models with a specific structure arise from using
linear modeling with a parent distribution in the exponential family.
If the linear part is replaced by a more general nonlinear specification,
the result is a special case of our general mean-variance specification
\[ E(Y \mid x) = f(x, \beta), \]
\[ \operatorname{var}(Y \mid x) = \sigma^2 g(\beta, \theta, x)^2. \]
Estimation may also be carried out using the GLS estimation
equations.
The (Scaled) Exponential Family
Y has a scaled exponential family distribution if its density (or
probability mass function) is of the form
\[ f(y; \xi, \sigma) = \exp\left\{ \frac{y\xi - b(\xi)}{\sigma^2} + c(y, \sigma) \right\}. \]
ξ is the canonical parameter, and σ is the scale parameter.
If $\sigma^2$ is known, this is the usual one-parameter exponential family
with canonical parameter $\xi$.
If $\sigma^2$ is unknown, it may or may not be the usual two-parameter
exponential family.
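As a quick check of the definition, the sketch below writes the Poisson distribution in scaled exponential family form, taking $\xi = \log\lambda$, $b(\xi) = e^\xi$, $\sigma^2 = 1$ and $c(y, \sigma) = -\log y!$, and compares it with the usual probability mass function. The function names are illustrative and not part of the course material.

```python
import math

def scaled_expfam_density(y, xi, sigma2, b, c):
    """Evaluate exp{(y*xi - b(xi)) / sigma^2 + c(y, sigma^2)}."""
    return math.exp((y * xi - b(xi)) / sigma2 + c(y, sigma2))

# Poisson pieces: b(xi) = exp(xi), c(y, .) = -log(y!), sigma^2 = 1
def poisson_b(xi):
    return math.exp(xi)

def poisson_c(y, sigma2):
    return -math.lgamma(y + 1)  # lgamma(y + 1) = log(y!)

def poisson_pmf(y, lam):
    """Textbook Poisson probability mass function."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

lam = 2.5
xi = math.log(lam)  # canonical parameter
for y in range(6):
    via_family = scaled_expfam_density(y, xi, 1.0, poisson_b, poisson_c)
    print(y, round(via_family, 6), round(poisson_pmf(y, lam), 6))
```

The two printed columns should agree for every y, up to rounding.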
Moments:
\[ E(Y) = b_\xi(\xi) = \frac{db(\xi)}{d\xi}, \]
\[ \operatorname{var}(Y) = \sigma^2 b_{\xi\xi}(\xi) = \sigma^2 \frac{d^2 b(\xi)}{d\xi^2}. \]
If $E(Y) = \mu = b_\xi(\xi)$, then $\xi = b_\xi^{-1}(\mu)$.
The function $b_\xi^{-1}(\cdot)$ is called the canonical link function, because it
links the canonical parameter $\xi$ to the mean $\mu$.
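The moment formulas can be checked numerically. The sketch below assumes the Poisson case ($b(\xi) = e^\xi$, $\sigma^2 = 1$), approximates the derivatives of $b$ by central differences, and compares them with the mean and variance obtained by summing over the probability mass function. The helper names and step sizes are illustrative choices.

```python
import math

def b(xi):
    return math.exp(xi)  # Poisson cumulant function

def derivative(fn, x, h=1e-5):
    """Central-difference approximation to the first derivative."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

def second_derivative(fn, x, h=1e-4):
    """Central-difference approximation to the second derivative."""
    return (fn(x + h) - 2 * fn(x) + fn(x - h)) / h ** 2

xi = math.log(3.0)

# Moments by direct summation; terms beyond y = 99 are negligible for mean 3
pmf = [math.exp(y * xi - b(xi) - math.lgamma(y + 1)) for y in range(100)]
mean = sum(y * p for y, p in enumerate(pmf))
var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))

b_xi = derivative(b, xi)           # should match the mean
b_xixi = second_derivative(b, xi)  # should match the variance (sigma^2 = 1)
canonical_link = math.log          # inverse of b_xi in the Poisson case

print(mean, b_xi)
print(var, b_xixi)
print(canonical_link(mean), xi)
```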
Also
\[ \operatorname{var}(Y) = \sigma^2 b_{\xi\xi}\left\{ b_\xi^{-1}(\mu) \right\} = \sigma^2 g(\mu)^2, \]
so the variance depends on the mean in a specific way.
Examples of the scaled exponential family:
Distribution              b(ξ)                  ξ(µ) = b_ξ^{-1}(µ)       g(µ)^2
Normal, σ^2 = 1           ξ^2 / 2               µ                        1
Poisson                   exp(ξ)                log µ                    µ
Gamma                     −log(−ξ)              −1/µ                     µ^2
Inverse Gaussian          −√(−2ξ)               −1/(2µ^2)                µ^3
Binomial                  log(1 + e^ξ)          log{µ/(1 − µ)}           µ(1 − µ)
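Each variance function follows from $g(\mu)^2 = b_{\xi\xi}\{b_\xi^{-1}(\mu)\}$. The sketch below checks this numerically for the five families, taking $\xi(\mu)$ as the exact inverse of $b_\xi$, signs and constants included (so $-1/\mu$ for the Gamma and $-1/(2\mu^2)$ for the inverse Gaussian). The dictionary layout and the test value of $\mu$ are illustrative assumptions.

```python
import math

def second_derivative(fn, x, h=1e-4):
    """Central-difference approximation to the second derivative."""
    return (fn(x + h) - 2.0 * fn(x) + fn(x - h)) / h ** 2

# name: (b(xi), xi(mu) as the inverse of b_xi, g(mu)^2)
families = {
    "normal": (lambda xi: xi ** 2 / 2.0, lambda mu: mu, lambda mu: 1.0),
    "poisson": (math.exp, math.log, lambda mu: mu),
    "gamma": (lambda xi: -math.log(-xi), lambda mu: -1.0 / mu,
              lambda mu: mu ** 2),
    "inverse_gaussian": (lambda xi: -math.sqrt(-2.0 * xi),
                         lambda mu: -1.0 / (2.0 * mu ** 2),
                         lambda mu: mu ** 3),
    "binomial": (lambda xi: math.log(1.0 + math.exp(xi)),
                 lambda mu: math.log(mu / (1.0 - mu)),
                 lambda mu: mu * (1.0 - mu)),
}

mu = 0.4  # a mean that is admissible for every family listed
errors = {}
for name, (b, xi_of_mu, g_squared) in families.items():
    xi = xi_of_mu(mu)
    # Compare the numerical b_xixi at xi(mu) with the tabulated g(mu)^2
    errors[name] = abs(second_derivative(b, xi) - g_squared(mu))
    print(name, errors[name])
```

Every reported discrepancy should be at the level of finite-difference error.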
Sufficiency
If $Y_1, Y_2, \dots, Y_n$ is a random sample from a member of this family,
the log-likelihood is
\[ \log L = \sum_{j=1}^{n} \left\{ \frac{Y_j \xi - b(\xi)}{\sigma^2} + c(Y_j, \sigma) \right\}
= \frac{1}{\sigma^2} \left[ \xi \sum_{j=1}^{n} Y_j - n b(\xi) \right] + \sum_{j=1}^{n} c(Y_j, \sigma), \]
so (if $\sigma^2$ is known) $\sum_j Y_j$ is sufficient for $\xi$.
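One way to see the sufficiency numerically: two samples with the same sum have log-likelihoods that differ only by a constant that does not involve $\xi$. The sketch below assumes the Poisson case with $\sigma^2 = 1$; the two samples are made-up numbers chosen only so that their sums match.

```python
import math

def poisson_loglik(sample, xi):
    """log L = xi * sum(Y_j) - n * b(xi) + sum c(Y_j), with b(xi) = exp(xi)."""
    n = len(sample)
    c_total = -sum(math.lgamma(y + 1) for y in sample)
    return xi * sum(sample) - n * math.exp(xi) + c_total

sample_a = [0, 1, 2, 7]  # made-up counts, sum = 10
sample_b = [2, 3, 2, 3]  # different made-up counts, same sum = 10

# The difference depends on the samples only through the c(Y_j) terms,
# so it takes the same value at every xi.
diffs = [poisson_loglik(sample_a, xi) - poisson_loglik(sample_b, xi)
         for xi in (-1.0, 0.0, 0.5, 2.0)]
print(diffs)
```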
Also, if $Y_1, Y_2, \dots, Y_n$ are independent, but in the distribution of $Y_j$,
$\xi$ is replaced by $\xi_j = x_j^T \beta$, the log-likelihood is
\[ \log L = \frac{1}{\sigma^2} \left[ \left( \sum_{j=1}^{n} Y_j x_j \right)^{T} \beta - \sum_{j=1}^{n} b(\xi_j) \right] + \sum_{j=1}^{n} c(Y_j, \sigma), \]
so now $\sum_j Y_j x_j$ is sufficient for $\beta$.
But note that
\[ E(Y_j \mid x_j) = \mu_j = b_\xi(\xi_j) = b_\xi\left( x_j^T \beta \right), \]
so this is a conventional linear model only if $b_\xi(\xi) = \xi$, i.e., for the
normal distribution.
Otherwise, it is a generalized linear model.
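With the canonical link, the likelihood equations are $\sum_j x_j \{Y_j - b_\xi(x_j^T\beta)\} = 0$, so the fitted means reproduce the sufficient statistic $\sum_j Y_j x_j$. The sketch below illustrates this for a Poisson model with an intercept and one covariate, using a plain Newton iteration. The data are made up for illustration, and the function name, starting values and iteration count are assumptions, not the course's fitting procedure.

```python
import math

def fit_poisson_canonical(xs, ys, iterations=25):
    """Newton's method for beta = (b0, b1), with xi_j = b0 + b1 * x_j."""
    b0 = math.log(sum(ys) / len(ys))  # start at the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score: sum_j x_j (Y_j - mu_j), where x_j = (1, x)
        u0 = sum(y - m for y, m in zip(ys, mus))
        u1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Information: sum_j mu_j x_j x_j^T (2 x 2, symmetric)
        i00 = sum(mus)
        i01 = sum(x * m for x, m in zip(xs, mus))
        i11 = sum(x * x * m for x, m in zip(xs, mus))
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + information^{-1} * score
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]  # made-up covariate values
ys = [1, 2, 2, 5, 7]            # made-up counts
b0, b1 = fit_poisson_canonical(xs, ys)
mus = [math.exp(b0 + b1 * x) for x in xs]

# Fitted means match both components of sum_j Y_j x_j
print(sum(mus), sum(ys))
print(sum(x * m for x, m in zip(xs, mus)), sum(x * y for x, y in zip(xs, ys)))
```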
Note that $b_\xi(\cdot)$ is determined by the distribution.
We can replace it by a different function,
\[ E(Y_j \mid x_j) = f\left( x_j^T \beta \right), \]
and it is still called a generalized linear model.
Because the link $f^{-1}(\cdot)$ is no longer the canonical link, we lose
sufficiency, which is not a big deal.
R and SAS support fitting these models with the link function chosen
from a list.
Example: Six Cities Wheezing data
Response: child wheezes at age 9 (0 or 1).
Predictor: mother’s smoking status (0 = none, 1 = moderate, 2 =
heavy).
Possible covariate: community (Portage or Kingston).
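Model:
\[ Y_j \sim \text{Bernoulli}(\mu_j). \]
Canonical link:
\[ \log\left( \frac{\mu_j}{1 - \mu_j} \right) = x_j^T \beta \]
or
\[ E(Y_j \mid x_j) = \mu_j = \frac{\exp\left( x_j^T \beta \right)}{1 + \exp\left( x_j^T \beta \right)}. \]
Logistic regression.
Alternative link: probit function, $\mu_j = \Phi\left( x_j^T \beta \right)$.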
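To compare the two links, the sketch below evaluates the inverse logit and the inverse probit on a few values of the linear predictor $\eta = x_j^T\beta$. It uses only the standard library, with $\Phi$ computed from the error function; the function names are illustrative, and no wheezing data are involved.

```python
import math

def inverse_logit(eta):
    """mu = exp(eta) / (1 + exp(eta)), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-eta))

def logit(mu):
    """Canonical link for the Bernoulli model."""
    return math.log(mu / (1.0 - mu))

def inverse_probit(eta):
    """Standard normal CDF Phi(eta), via the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(eta, round(inverse_logit(eta), 4), round(inverse_probit(eta), 4))
```

Both functions map the real line to (0, 1) and equal 0.5 at $\eta = 0$, but they are on different scales, so coefficients fitted under the two links are not directly comparable.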
Generalized Nonlinear Model
We may want a more general specification for the conditional mean:
\[ E(Y_j \mid x_j) = f(x_j, \beta). \]
This is consistent with the scaled exponential family if $\xi_j$ satisfies
\[ b_\xi(\xi_j) = f(x_j, \beta). \]
The mean-variance relationship is still determined by the distribution:
\[ \operatorname{var}(Y_j \mid x_j) = \sigma^2 g\{E(Y_j \mid x_j)\}^2 = \sigma^2 g\{f(x_j, \beta)\}^2. \]
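As a small illustration, the sketch below pairs a hypothetical nonlinear mean, $f(x, \beta) = \beta_1 \exp(-\beta_2 x)$, with the Poisson and Gamma variance functions. The mean function, parameter values and function names are assumptions made for the example only.

```python
import math

def mean_fn(x, beta):
    """Hypothetical nonlinear mean f(x, beta) = beta[0] * exp(-beta[1] * x)."""
    return beta[0] * math.exp(-beta[1] * x)

def conditional_moments(x, beta, sigma2, g_squared):
    """Return (E(Y|x), var(Y|x)) with var = sigma^2 * g{f(x, beta)}^2."""
    mu = mean_fn(x, beta)
    return mu, sigma2 * g_squared(mu)

def poisson_g2(mu):
    return mu        # g(mu)^2 = mu

def gamma_g2(mu):
    return mu ** 2   # g(mu)^2 = mu^2

beta = (10.0, 0.5)   # illustrative parameter values only
mu, var_poisson = conditional_moments(2.0, beta, 1.0, poisson_g2)
_, var_gamma = conditional_moments(2.0, beta, 0.25, gamma_g2)

# Poisson canonical parameter solving b_xi(xi) = exp(xi) = f(x, beta)
xi = math.log(mu)
print(mu, var_poisson, var_gamma, xi)
```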