October 8

ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Specifying the Variance Function
Recall the general mean-variance specification
E(Y |x) = f (x, β),
var(Y |x) = σ 2 g (β, θ, x)2 .
The form of f (x, β) may be suggested by subject matter theory, or
by exploration.
How can we decide on the form of g (β, θ, x)?
1 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
If a Generalized Linear Model seems appropriate:
the canonical link function is at least a starting point for f (x, β);
the associated variance function determines g (β, θ, x).
For most of the common instances of GLMs,
g (β, θ, x) = f (x, β)θ
for an appropriate power θ.
Otherwise, we study the residuals to gain knowledge about g (·).
2 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Residuals
Since
o
n
g (β, θ, x) = var(Y |x)/σ 2 = E [Y − f (x, β)]2 /σ 2 ,
a natural place to start is with the residuals rj = Yj − f xj , β̂ OLS
from an initial unweighted (OLS) fit.
Plot the residuals against:
Predicted values Ŷj = f xj , β̂ OLS , possibly log-transformed;
Covariates, elements of xj .
Non-constant variance often shows up as a wedge-shaped plot.
3 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Example
Pharmacokinetics of indomethacin.
Concentration Yj of indomethacin taken at n = 11 times points
xj (hours) post-dose.
The mean function
f (xj , β) = e β1 exp −e β2 xj + e β3 exp −e β4 xj
The variance tends to increase with the level of the response.
A popular choice is the power-of-the-mean model σ 2 f 2θ (xj , β),
with, say, θ = 1.
4 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
OLS and GLS fits
source("R-code/indo-GLS-nls.R")
The OLS fit (red line) and the GLS fit with θ = 1 (blue line) are
discernibly different.
5 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
OLS residuals
plot(fitted(nOLS), residuals(nOLS)^2, log = "x")
The squared OLS residuals vs the log of predicted values clearly
shows a wedge shape, showing a rather severe increase in variance
with level of the response.
6 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Studentized Residuals
Consider the (matrix form of the) linear model
E(Y) = Xβ,
var(Y) = σ 2 W−1 .
OLS estimator is
β̂ OLS = XT X
−1
XT Y
with fitted values
Ŷ = Xβ̂ OLS = X XT X
where H = X XT X
7 / 28
−1
−1
XT Y = HY,
XT is the hat matrix.
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
The residuals are
r = Y − Ŷ = (I − H)Y.
If W = I, i.e. the errors have the same variance, then
var(r) = σ 2 (I − H),
so the residuals do not in general have the same variance.
The studentized residuals are
bj =
r
pj
.
σ̂OLS 1 − hj,j
Here hj,j , the j th diagonal entry in the hat matrix H, measures the
leverage of the j th observation.
28
Specifying the Variance
The8 /plot
of bj gives a diagnostic
plotFunction
for non-constant versus
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Extend the notions of leverage and studentization, approximately, to
nonlinear mean models.
By linear approximation:
r ≈ {In − H(β)}{Y − f(β)}
where H(β) is the approximate hat matrix
H(β) = X(β){X(β)T X(β)}−1 X(β)T
and X(β) is the gradient matrix



X(β) = 

fβ (x1 , β)
fβ (x2 , β)
..
.
fβ (xn , β)
9 / 28
Specifying the Variance Function





ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Regard the diagonal elements of H(β) as approximate “leverage
values”.
We can define the studentized residuals in the same
way
as for the
linear model, with approximate leverage values hj,j β̂ .
10 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Remarks
In a nonlinear model, the ramifications of “design” will be not only
through the actual design points, but also through the behavior of
the function f at those points.
Unlike a linear model, a nonlinear model allows the changes in f at
different xj settings to be different, as the derivative of f depends on
both xj and β in general.
Depending on how f changes in different parts of the design space,
different observations exert different amounts of influence on the fit.
11 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
The ordinary residual plot and the studentized residual plot
source("R-code/nls2lm.R")
lOLS <- nls2lm(nOLS)
par(mfcol = c(2, 1))
plot(fitted(nOLS),
residuals(lOLS) / summary(lOLS)$sigma, log = "x")
plot(fitted(nOLS), rstandard(lOLS), log = "x")
They look similar in this example.
Note
rstandard() produces Studentized residuals as defined earlier;
rstudent() produces slightly different values (”Rstudent”).
12 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Transformations of studentized residuals
If g (β, θ, xj ) = exp{θf (xj , β)}, then
log[{varYj }1/2 ] = log σ + θf (xj , β).
So we plot log |rj | or log |bj | vs ŶOLS = f (xj , β̂ OLS ).
If g (β, θ, xj ) = f θ (xj , β), then
log[{varYj }1/2 ] = log σ + θ log f (xj , β).
So we plot log |rj | or log |bj | vs log(ŶOLS ) = log f (xj , β̂ OLS ).
13 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
plot(fitted(nOLS), log(abs(rstandard(lOLS))), log = "x")
plot(fitted(nOLS), abs(rstandard(lOLS))^(2/3), log = "x")
Log transformation and power transformation of the studentized
residuals vs log transformation of the predicted values. It confirms
that the power-of-the-mean model is reasonable.
14 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Graphical verification
Check graphically for evidence that the assumed variance model
accounts adequately for the form of the nonconstant variance.
Based on the original OLS fit, we decide on a tentative form for
g (β, θ, xj )2 and refit (e.g., by GLS-PL).
15 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Define weighted residuals
rweighted,j =
Yj − f xj , β̂
,
g β̂, θ̂, xj
and standardized weighted residuals
rweighted,j /σ̂.
16 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
To account for leverage, write as a vector,
n
o
rweighted = W1/2 Y − f β̂
where






1
1
1
W = diag
2 , 2 , . . . 2 .


 g β̂, θ̂, x1
g β̂, θ̂, x2
g β̂, θ̂, xn 
Write
Y∗ = W1/2 Y,
X∗ (β) = W1/2 X(β),
f ∗ (β) = W1/2 f(β).
17 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Then another linear approximation gives
∗
∗
rweighted = Y − f β̂
≈ {In − H∗ (β)} {Y∗ − f ∗ (β)} .
Here H∗ (β) is again an approximate hat matrix
−1 ∗ T
H∗ (β) = X∗ (β) X∗ (β)T X∗ (β)
X (β) .
Studentized weighted residuals are then
rweighted,j
bweighted,j = r
.
∗
σ̂ 1 − hj,j
β̂
18 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
The weighted residuals plots suggest that weighting according to the
conjectured variance model has taken appropriate account of the
nonconstant variance
19 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Final remarks about specifying variance function in this example:
It is important to “get the variance right”.
Consider estimation of the terminal half-life, a parameter of great
physical interest to pharmacologists.
The terminal half-life is the time that it takes the mean response in
the “second phase” of the curve to decrease by half and is useful in
determining appropriate dosing regimens. The terminal half-life is
given by log 2/e β4 hours here.
20 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Substituting the estimate of β4 yields an estimated half-life of 3.13
hours based on the OLS fit, and 3.96 hours based on the GLS-PL fit.
The difference in point estimates is nearly one hour, which in a
clinical sense is quite a big difference.
Of course, we have not yet obtained standard errors for these point
estimates, so whether or not this difference is of statistical
importance is not clear.
21 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
REML Estimation of Variance
Motivation
Plots based on ordinary residuals may be misleading due to failure to
account for “leverage”.
Will methods for estimation of variance parameters based on ordinary
residuals also be subject to the same problem?
Recall PL estimating equations for σ and θ:
"
# n
X
{Yj − f (xj , β)}2
1
−1 ×
= 0.
νθ (β, θ, xj )
σ 2 g (β, θ, xj )2
j=1
22 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
When plugging the true β,
X
n
n X
{Yj − f (xj , β)}2
1
1
×
=
.
νθ (β, θ, xj )
νθ (β, θ, xj )
σ 2 g (β, θ, xj )2
j=1
j=1
because
"
{Yj − f (xj , β)}2
E
σ 2 g (β, θ, xj )2
23 / 28
Specifying the Variance Function
#
= 1.
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
When plugging the estimated β̂,
n
o2
!
n
Yj − f xj , β̂
X
1
2 ×
ν
β̂,
θ,
x
θ
j
j=1
σ 2 g β̂, θ, xj
=
n
X
∗
(1 − hj,j
β̂ )
j=1
because
24 / 28
1
νθ β̂, θ, xj
n
o2 

 Yj − f xj , β̂
∗
E
2  ≈ 1 − hj,j β̂ .
σ 2 g β̂, θ, xj
Specifying the Variance Function
!
.
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Modified estimating equations:
n
o2
n
Y
−
f
x
,
β̂
X
j
j
2 ×
2
j=1
σ g β̂, θ, xj
=
1
νθ β̂, θ, xj
n
X
∗
1 − hj,j
!
1
νθ β̂, θ, xj
j=1


n−p
 n
X
=

∗
1 − hj,j
νθ

,
β̂, θ, xj 
j=1
where
Pn
25 / 28
j=1
∗
hj,j
= trace(H∗ (β)) = rank(H∗ (β)) = p.
Specifying the Variance Function
!
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Solution for σ 2 :
σ̂ 2 =
1
n−p
n
X
j=1
n
o2
Yj − f xj , β̂
2
g β̂, θ, xj
That is, the modified PL equations yield the bias-adjusted estimator
of σ 2 .
26 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Recall that the original PL estimating equations arise from
maximizing, for fixed β, the normal log-likelihood
n
o2
n
n
1 X Yj − f xj , β̂
X
PL β̂, θ, σ = −n log σ−
log g β̂, θ, xj −
2 .
2 j=1
2
j=1
σ g β̂, θ, xj
The modified equations arise in the same way from maximizing
T
1
PL β̂, θ, σ + p log σ − log det X β̂ W β̂, θ X β̂
2
27 / 28
Specifying the Variance Function
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Terminology
The extra terms can be viewed as “penalty terms” that impose a
restriction on the solution, hence
REML = REstricted ML.
Modified likelihoods of this form were first suggested based on the
marginal distribution of residuals from a linear fit, hence
REML = REsidual ML.
In some cases (linear mixed model, generalized linear model), they
may be derived from the conditional distribution of the observations,
conditioned on the sufficient statistic for β. This is a partial, or
restricted likelihood.
28 / 28
Specifying the Variance Function