August 27

ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Approach 5: Variance-Stabilizing Transformation
For certain distributions of Y , the variance is a known function of the
mean, and consequently a known function h(Y ) has an approximately
constant variance.
For example:
1 / 14
Distribution
Mean-variance
Poisson
Binomial
lognormal
σ2 = µ
σ 2 ∝ µ(1 − µ)
σ 2 ∝ µ2
Transformation h(Y )
√
Y√
−1
sin
Y
log Y
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Empirically, transformation often was found also to make the
dependence linear.
So for example counted data Y , where the Poisson distribution might
be relevant, might be modeled as
p E
Yj xj = xTj β,
p Yj xj = σ 2 .
var
Remarks
We may not be so lucky!
Even if we are, β is hard to interpret.
2 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Approach 6: Box-Cox Transformation
Let the data determine the transformation in a parametric family
h(Y , λ).
Box and Cox used the power family:
 λ
Y − 1
h(Y , λ) =
λ
log Y
λ 6= 0
λ = 0.
Assume linearity and normality: h (Yj , λ) ∼ N xTj β, σ 2 .
Estimate λ, β, and σ 2 by ML.
3 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Remarks
Same issues as before.
One transformation is assumed to achieve
linearity,
normality,
and constant variance.
β is hard to interpret.
4 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Approach 7: Transform Both Sides (TBS)
Suppose we have scientific reasons for a model f (x, β).
Assume that, conditionally on xj ,
h (Yj , λ) = h {f (xj , β) , λ} + ej
The deviations e1 , e2 , . . . , en satisfy the standard assumptions:
constant variance σ 2 ;
independence and normality.
Estimate β, λ, and σ 2 by ML.
5 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Still requires a strong assumption.
Still difficult to interpret.
But: if h is smooth and invertible and σ is small relative to f (x, β),
we can use a linear approximation:
Yj = h−1 [h {f (xj , β) , λ} + ej , λ]
1
≈ f (xj , β) +
× ej
hy {f (xj , β) , λ}
where
hy (y , λ) =
6 / 14
∂h(y , λ)
.
∂y
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
So to this order of approximation,
E (Yj | xj ) ≈ f (xj , β)
and
σ2
hy {f (xj , β) , λ}2
σ2
≈
.
hy {E (Yj ) , λ}2
var ( Yj | xj ) ≈
For the Box-Cox family, hy (y , λ) = y λ−1 , so
var ( Yj | xj ) ≈ σ 2 f (xj , β)2(1−λ) ≈ σ 2 E (Yj | xj )2(1−λ) .
7 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
So:
normality and constant variance on the transformed scale
are (approximately) equivalent to
normality and non-constant variance on the original scale,
with the variance given by a certain function of the mean.
Henceforth, we do not consider transformations. Instead, we model
variances, and sometimes distributions, explicitly.
8 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Approach 8: ML for an Assumed Model
E.g. Y is a count, which we could model as Poisson-distributed, with
a model for the mean:
E(Y |x) = f (x, β).
This implies a model for the variance,
var(Y |x) = E(Y |x) = f (x, β)
but we do not need it in likelihood-based inference.
9 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Approach 9: Generalized Least Squares (GLS)
Assume
E(Y |x) = f (x, β),
var(Y |x) = σ 2 g (β, θ, x)2 .
Here:
β is, as before, the vector of parameters in the mean function;
θ is a possible additional parameter in the structure of the
variance function;
σ 2 is a possible over-dispersion parameter.
10 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Ad hoc estimation scheme:
i
Get initial estimate of β using OLS;
ii
Get initial estimate of θ, if needed, and construct initial
estimated weights
ŵj =
iii
1
2
g β̂ OLS , θ̂, x
Re-estimate β using WLS, treating ŵj as fixed: solve the
estimating equation
n
X
j=1
11 / 14
ŵj {Yj − f (xj , β)} fβ (xj , β) = 0.
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Remarks
In step iii, equivalently, β̂ WLS minimizes
Sŵ (β) =
n
X
ŵj {Yj − f (xj , β)}2 .
j=1
As noted before, this is a computational alternative to solving the
estimating equation, but is not generally preferred.
The estimating-equation approach may be extended to cases where
no such objective function exists.
Could repeat step ii with β̂ WLS instead of β̂ OLS , then repeat step iii.
We could also iterate, perhaps to convergence.
12 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Approach 10: ML
Same moment structure as in Approach 9:
E(Y |x) = f (x, β),
var(Y |x) = σ 2 g (β, θ, x)2 .
Add an assumption about the distribution of Y |x:
Y = f (x, β) + σg (β, θ, x)Z
where Z has mean 0 and variance 1, such as N(0, 1), and use
likelihood methods.
When g (·) involves β, the ML estimating equation for β is not the
same as in GLS.
13 / 14
ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Comments on Approaches 5–10
Approaches 5–7 require (lucky) transformation, and the parameters
can be hard for interpretation.
Approaches 8–10 are the most interesting.
Depending on the form of the functions f (·) and g (·), these
approaches will sometimes yield the same estimating equations.
14 / 14