ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
Variance Parameters
Recall the general mean-variance specification
\[
E(Y \mid x) = f(x, \beta), \qquad
\mathrm{var}(Y \mid x) = \sigma^2 g(\beta, \theta, x)^2 .
\]
To a first-order approximation, the "folklore theorem" states that the
asymptotic distribution of β̂_GLS is unaffected by how θ is estimated.
To a second-order approximation, the asymptotic distribution of
β̂_GLS does depend on how well θ is estimated.
Note that estimation of σ plays no role in the properties of the GLS
estimator β̂_GLS.
Transformed residuals
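Define
\[
\epsilon_j = \frac{Y_j - f(x_j, \beta_0)}{\sigma_0 \, g(\beta_0, \theta_0, x_j)} .
\]
Without further assumptions,
\[
E(\epsilon_j \mid x_j) = 0, \qquad
\mathrm{var}(\epsilon_j \mid x_j) = E(\epsilon_j^2 \mid x_j) = 1 .
\]
We explore estimating σ based on $|\epsilon_j|^\lambda$ for various λ.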
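The standardization can be checked numerically. The sketch below is illustrative only: the exponential-decay mean, the power-of-the-mean variance function g = f^θ, the parameter values, and the normal errors are all assumptions that do not come from the notes.

```python
import math
import random

def f(x, beta):
    """Hypothetical mean function: exponential decay."""
    return beta[0] * math.exp(-beta[1] * x)

def g(beta, theta, x):
    """Hypothetical variance function: power of the mean, g = f^theta."""
    return f(x, beta) ** theta

def transformed_residuals(xs, ys, beta0, sigma0, theta0):
    """epsilon_j = (Y_j - f(x_j, beta0)) / (sigma0 * g(beta0, theta0, x_j))."""
    return [(y - f(x, beta0)) / (sigma0 * g(beta0, theta0, x))
            for x, y in zip(xs, ys)]

random.seed(762)
beta0, sigma0, theta0 = (10.0, 0.5), 0.1, 0.8   # assumed "true" values
xs = [random.uniform(0.0, 4.0) for _ in range(20000)]
ys = [f(x, beta0) + sigma0 * g(beta0, theta0, x) * random.gauss(0.0, 1.0)
      for x in xs]

eps = transformed_residuals(xs, ys, beta0, sigma0, theta0)
mean_eps = sum(eps) / len(eps)
var_eps = sum((e - mean_eps) ** 2 for e in eps) / (len(eps) - 1)
print(round(mean_eps, 3), round(var_eps, 3))  # close to 0 and 1
```

Although var(Y | x) changes with x in this setup, the transformed residuals have mean near 0 and variance near 1 at every design point.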
Recall key assumption
The relevant moments of $\epsilon_j$ do not depend on $x_j$ and are constant
for all j:
\[
E(|\epsilon_j|^{\lambda} \mid x_j) = E(|\epsilon_j|^{\lambda}) = \text{constant} \quad \forall j,
\]
\[
E(|\epsilon_j|^{2\lambda} \mid x_j) = E(|\epsilon_j|^{2\lambda}) = \text{constant} \quad \forall j.
\]
For λ = 2, the first requirement is automatically met, and similarly
for λ = 1 the second requirement is automatically met.
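Both special cases follow from the unit-variance property of the transformed residuals alone. A short sketch of the reasoning, using nothing beyond var(ε_j | x_j) = 1:

```latex
% lambda = 2: the first requirement is the unit-variance condition itself
E(|\epsilon_j|^{2} \mid x_j) = E(\epsilon_j^{2} \mid x_j) = 1 \quad \forall j .
% lambda = 1: the second requirement involves the same moment, since 2\lambda = 2
E(|\epsilon_j|^{2\lambda} \mid x_j)\Big|_{\lambda = 1} = E(\epsilon_j^{2} \mid x_j) = 1 \quad \forall j .
```

In each case the other requirement remains a genuine assumption: a constant fourth moment for λ = 2, and a constant first absolute moment for λ = 1.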
More general forms of estimating equations for θ
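Define η by
\[
e^{\lambda\eta} = \sigma^{\lambda} E|\epsilon_j|^{\lambda} .
\]
Identify $|Y_j - f(x_j, \beta)|^{\lambda}$ as the “response”.
For λ = 2, η = log σ and η is simply a reparameterization; but for
other λ it depends on the distribution of $\epsilon_j$:
\[
\eta = \log\sigma + \frac{1}{\lambda} \log\left\{ E|\epsilon_j|^{\lambda} \right\} .
\]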
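As a worked case, suppose that $\epsilon_j$ is standard normal and take λ = 1. Normality is assumed here purely for illustration; the notes do not require it. Then η differs from log σ by a known constant:

```latex
E|\epsilon_j| = \sqrt{2/\pi} \approx 0.798,
\qquad
\eta = \log\sigma + \log\sqrt{2/\pi} \approx \log\sigma - 0.226 .
```

Under a different distribution for $\epsilon_j$ the constant changes, which is why η is not a pure reparameterization of σ when λ ≠ 2.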
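Then θ and η may be estimated by solving the joint estimating
equations
\[
\sum_{j=1}^{n}
\frac{ |Y_j - f(x_j, \hat\beta)|^{\lambda} - e^{\lambda\eta} g(\hat\beta, \theta, x_j)^{\lambda} }
     { g(\hat\beta, \theta, x_j)^{2\lambda} }
\; g(\hat\beta, \theta, x_j)^{\lambda} \, \tau_\theta(\hat\beta, \theta, x_j) = 0
\]
where β̂ is held fixed.
We shall study the large sample distribution of $(\hat\theta^T, \hat\eta)^T$ for different
choices of λ.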
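A minimal sketch of solving these equations numerically follows. Several pieces are assumptions, not taken from this section: the variance function is the power-of-the-mean model g = f^θ with scalar θ, τ_θ is taken to be (1, ∂ log g/∂θ)^T = (1, log f)^T, and the true β is used in place of β̂. With these choices the first component gives e^{λη} as the average of |Y_j − f_j|^λ / g_j^λ, and the second component becomes a single equation in θ that can be solved by bisection.

```python
import math
import random

def solve_variance_params(xs, ys, f_hat, lam, lo=0.0, hi=2.0, tol=1e-10):
    """Solve the joint equations for (theta, eta) with beta-hat held fixed.

    f_hat(x) is the fitted mean f(x, beta-hat), assumed positive.
    Assumes g = f^theta, so tau_theta = (1, log f)^T (illustrative choice).
    """
    logf = [math.log(f_hat(x)) for x in xs]
    absres = [abs(y - f_hat(x)) for x, y in zip(xs, ys)]

    def scaled(theta):
        # u_j = |Y_j - f_j|^lam / g_j^lam with g_j = f_j^theta
        return [r ** lam * math.exp(-lam * theta * lf)
                for r, lf in zip(absres, logf)]

    def h(theta):
        # Second component after substituting e^(lam*eta) = mean(u)
        u = scaled(theta)
        ubar = sum(u) / len(u)
        return sum((uj - ubar) * lf for uj, lf in zip(u, logf))

    h_lo, h_hi = h(lo), h(hi)
    if h_lo * h_hi > 0:
        raise ValueError("no sign change on [lo, hi]; widen the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_lo * h_mid <= 0:
            hi = mid                  # root lies in [lo, mid]
        else:
            lo, h_lo = mid, h_mid     # root lies in [mid, hi]
    theta = 0.5 * (lo + hi)
    u = scaled(theta)
    eta = math.log(sum(u) / len(u)) / lam
    return theta, eta

# Simulated data under assumed values; f_hat uses the true beta for simplicity.
random.seed(1)
beta, sigma, theta_true = (10.0, 0.5), 0.1, 0.8
f_hat = lambda x: beta[0] * math.exp(-beta[1] * x)
xs = [random.uniform(0.0, 4.0) for _ in range(5000)]
ys = [f_hat(x) + sigma * f_hat(x) ** theta_true * random.gauss(0.0, 1.0)
      for x in xs]

theta_hat, eta_hat = solve_variance_params(xs, ys, f_hat, lam=1.0)
print(round(theta_hat, 3), round(eta_hat, 3))
```

Bisection is used instead of a Newton step so that the sketch never differentiates the summand, which is not differentiable everywhere when λ = 1.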
Consistency
This is an unbiased M-estimating equation, as long as β̂ is a
consistent estimator for β 0 , which we will assume.
Then θ̂ and η̂ are consistent estimators of θ 0 and η0 .
Also note that applying the usual M-estimator argument to deduce
the properties of $(\hat\theta^T, \hat\eta)^T$ requires the summand in the estimating
equations to be differentiable with respect to θ, η, and β̂.
This is not always true, e.g., when λ = 1.
Asymptotic distribution for λ = 2
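Complicated, depends on:
    whether β̂ is based on linear or quadratic estimating equations;
    the excess kurtosis κ.
Simplifies if either
    σ0 → 0;
    g(·) does not depend on β.
Then
\[
\sqrt{n}\,(\hat\theta - \theta_0) \stackrel{L}{\longrightarrow}
N\left( 0, \; \frac{2 + \kappa}{4} \, \Lambda_\theta \right) .
\]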
Asymptotic distribution for λ = 1
Complicated, with an additional technical difficulty because of
non-differentiability of |x|.
Simplifies if either
    σ0 → 0;
    g(·) does not depend on β, and $\epsilon_j \mid x_j$ has a symmetric
    distribution.
Then
\[
\sqrt{n}\,(\hat\theta - \theta_0) \stackrel{L}{\longrightarrow} N(0, \; c_1 \Lambda_\theta)
\]
where
\[
c_1 = \frac{\mathrm{var}(|\epsilon_j| \mid x_j)}{\{E(|\epsilon_j| \mid x_j)\}^2} .
\]
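As a check on the size of c1, suppose that $\epsilon_j \mid x_j$ is standard normal, so that $E|\epsilon_j| = \sqrt{2/\pi}$ and $E\epsilon_j^2 = 1$. Normality is an assumption made only for this illustration.

```latex
c_1 = \frac{1 - 2/\pi}{2/\pi} = \frac{\pi}{2} - 1 \approx 0.571 .
```

This is the λ = 1 analogue of the factor (2 + κ)/4 for λ = 2, which equals 0.5 under normality.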
Asymptotic distribution for general λ
Equally complicated, and same technical difficulty because of
non-differentiability.
Simplifies if $\epsilon_j \mid x_j$ has a symmetric distribution with common absolute
moments up to power 2λ, and either
    σ0 → 0;
    g(·) does not depend on β.
Then
\[
\sqrt{n}\,(\hat\theta - \theta_0) \stackrel{L}{\longrightarrow} N(0, \; c_\lambda \Lambda_\theta) .
\]
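Here,
\[
c_\lambda =
\begin{cases}
\dfrac{\mathrm{var}(|\epsilon_j|^{\lambda} \mid x_j)}
      {\lambda^2 \{E(|\epsilon_j|^{\lambda} \mid x_j)\}^2}, & \lambda \neq 0, \\[2ex]
\dfrac{\mathrm{var}(\log|\epsilon_j|^{2} \mid x_j)}{4}, & \lambda = 0.
\end{cases}
\]
So one can compare the asymptotic efficiency of two competing
methods by comparing $c_{\lambda_1}$ and $c_{\lambda_2}$.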
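The formula for c_λ is easy to evaluate when the absolute moments are known in closed form. The sketch below assumes standard normal $\epsilon_j$, for which E|Z|^p = 2^{p/2} Γ((p + 1)/2)/√π and var(log Z²) = π²/2. The function names are illustrative.

```python
import math

def abs_moment_normal(p):
    """E|Z|^p for Z ~ N(0, 1): 2^(p/2) * Gamma((p + 1)/2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)

def c_lambda_normal(lam):
    """c_lambda under standard normal errors (assumed for illustration)."""
    if lam == 0:
        # var(log Z^2) = pi^2 / 2 for Z ~ N(0, 1)
        return (math.pi ** 2 / 2.0) / 4.0
    m1 = abs_moment_normal(lam)        # E|Z|^lam
    m2 = abs_moment_normal(2.0 * lam)  # E|Z|^(2 lam)
    return (m2 - m1 ** 2) / (lam ** 2 * m1 ** 2)

def efficiency_vs_lambda2(lam):
    """Asymptotic efficiency of the lam-based estimator relative to lam = 2."""
    return c_lambda_normal(2.0) / c_lambda_normal(lam)

for lam in (1.0, 0.5, 0.0):
    print(lam, round(efficiency_vs_lambda2(lam), 3))
```

Under these assumptions the three printed efficiencies round to 0.876, 0.693 and 0.405, so λ = 2 is the most efficient choice when the errors really are normal.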
Asymptotic efficiency relative to λ = 2: c2 /cλ
Assume $\epsilon_j$ is N(0, 1) contaminated with fraction α of N(0, 9); i.e.,
(1 − α) × N(0, 1) + α × N(0, 9).

    α        λ = 1    λ = 2/3    λ = 1/2    λ = 1/3    λ = 0
    0.000    0.876    0.772      0.693      0.606      0.405
    0.001    0.948    0.841      0.756      0.662      0.440
    0.002    1.016    0.906      0.816      0.715      0.480
    0.010    1.439    1.334      1.216      1.075      0.720
    0.050    2.035    2.100      1.996      1.823      1.220

λ = 1 performs better than λ = 2 even for tiny levels of
contamination ($2 \times 10^{-3}$, or 2 observations per thousand).
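The λ = 1 column can be reproduced from the formula for c_λ, because the absolute moments of a normal scale mixture are mixtures of normal absolute moments. The following is a sketch for λ > 0 only; the function names are illustrative. No rescaling of $\epsilon_j$ to unit variance is needed, since c_λ is a ratio that is unchanged when $\epsilon_j$ is multiplied by a constant.

```python
import math

def abs_moment_contaminated(p, alpha, scale=3.0):
    """E|eps|^p for eps ~ (1 - alpha) N(0, 1) + alpha N(0, scale^2)."""
    base = 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)
    return base * ((1.0 - alpha) + alpha * scale ** p)

def c_lambda(lam, alpha):
    """c_lambda for lam > 0 under the contaminated normal."""
    m1 = abs_moment_contaminated(lam, alpha)
    m2 = abs_moment_contaminated(2.0 * lam, alpha)
    return (m2 - m1 ** 2) / (lam ** 2 * m1 ** 2)

def efficiency(lam, alpha):
    """Asymptotic efficiency relative to lam = 2, that is c_2 / c_lam."""
    return c_lambda(2.0, alpha) / c_lambda(lam, alpha)

for alpha in (0.0, 0.001, 0.002, 0.01, 0.05):
    print(alpha, round(efficiency(1.0, alpha), 3))
```

The efficiency of λ = 1 crosses 1 between α = 0.001 and α = 0.002, which is the point made about tiny levels of contamination.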
Assume $\epsilon_j$ is t-distributed with ν degrees of freedom.
To match the excess kurtosis in the contaminated normal table, use
ν = ∞, 35.78, 20.16, 7.68, 5.29.

    ν        λ = 1    λ = 2/3    λ = 1/2    λ = 1/3    λ = 0
    ∞        0.876    0.765      0.693      0.610      0.405
    35.78    0.921    0.813      0.741      0.655      0.439
    20.16    0.965    0.861      0.787      0.698      0.471
    7.68     1.270    1.191      1.111      1.002      0.695
    5.29     2.016    1.994      1.897      1.739      1.234

λ = 1 performs better than λ = 2 for ν < 16.
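The degrees of freedom quoted above can be recovered by equating the excess kurtosis of the t distribution, 6/(ν − 4) for ν > 4, to that of the contaminated normal. The sketch below does this and then evaluates c_2/c_λ for λ > 0 using the closed-form absolute moments of the t distribution. The function names are illustrative, and the code covers only α > 0 and ν > 4.

```python
import math

def matching_df(alpha):
    """nu such that 6/(nu - 4) equals the excess kurtosis of
    (1 - alpha) N(0, 1) + alpha N(0, 9); requires alpha > 0."""
    second = 1.0 + 8.0 * alpha            # E eps^2
    fourth = 3.0 * (1.0 + 80.0 * alpha)   # E eps^4
    kappa = fourth / second ** 2 - 3.0
    return 4.0 + 6.0 / kappa

def abs_moment_t(p, nu):
    """E|T|^p for Student t with nu degrees of freedom, valid for -1 < p < nu."""
    log_m = (0.5 * p * math.log(nu)
             + math.lgamma((p + 1.0) / 2.0)
             + math.lgamma((nu - p) / 2.0)
             - 0.5 * math.log(math.pi)
             - math.lgamma(nu / 2.0))
    return math.exp(log_m)

def efficiency_t(lam, nu):
    """c_2 / c_lam for 0 < lam <= 2 and nu > 4."""
    def c(l):
        m1, m2 = abs_moment_t(l, nu), abs_moment_t(2.0 * l, nu)
        return (m2 - m1 ** 2) / (l ** 2 * m1 ** 2)
    return c(2.0) / c(lam)

for alpha in (0.001, 0.002, 0.01, 0.05):
    nu = matching_df(alpha)
    print(alpha, round(nu, 2), round(efficiency_t(1.0, nu), 3))
```

As with c_λ in general, the ratio does not depend on the scale of the t distribution, so the variance ν/(ν − 2) does not need to be standardized to 1.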
Bottom line
Asymptotic distributions of θ̂ are complicated in general.
The “small σ0” simplification is quite useful and also relevant in
practice.
Using λ = 1 has good relative efficiency and requires estimating only
$E(|\epsilon_j| \mid x_j)$, which is not difficult.
In some circumstances, estimation of variance parameters is of
critical importance.
Two common such situations are prediction and calibration.
Prediction: find Y given x,
\[
\hat Y_0 = f(x_0, \hat\beta) .
\]
Calibration: find x given Y,
\[
\hat x_0 = f^{-1}(Y_0, \hat\beta) .
\]
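The two tasks are inverse to each other, as a small sketch makes concrete. The exponential-decay mean function and the value of β̂ below are hypothetical, chosen only because this f has a closed-form inverse in x.

```python
import math

def f(x, beta):
    """Hypothetical mean function f(x, beta) = beta1 * exp(-beta2 * x)."""
    return beta[0] * math.exp(-beta[1] * x)

def f_inverse(y, beta):
    """Inverse of f in x for fixed beta; requires y > 0 and beta2 != 0."""
    return -math.log(y / beta[0]) / beta[1]

beta_hat = (10.0, 0.5)             # assumed fitted values

y_pred = f(2.0, beta_hat)          # prediction: Y-hat_0 at x_0 = 2
x_calib = f_inverse(3.0, beta_hat) # calibration: x-hat_0 for observed Y_0 = 3
print(round(y_pred, 4), round(x_calib, 4))
```

When f has no closed-form inverse, the calibration step would instead solve f(x, β̂) = Y_0 numerically for x.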
Variance of prediction
\[
Y_0 - \hat Y_0 = Y_0 - f(x_0, \hat\beta)
\approx Y_0 - f(x_0, \beta_0) - f_\beta^T(x_0, \beta_0)\,(\hat\beta - \beta_0)
\]
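Since β̂ depends only on the training data, it is independent of Y0,
and thus
\[
\begin{aligned}
\mathrm{var}(Y_0 - \hat Y_0)
&\approx \mathrm{var}\{Y_0 - f(x_0, \beta_0)\}
 + f_\beta(x_0, \beta_0)^T \, \mathrm{var}(\hat\beta - \beta_0) \, f_\beta(x_0, \beta_0) \\
&= \sigma_0^2 \, g(\beta_0, \theta_0, x_0)^2
 + n^{-1} f_\beta(x_0, \beta_0)^T \, \mathrm{var}\{n^{1/2}(\hat\beta - \beta_0)\} \, f_\beta(x_0, \beta_0) \\
&\approx \sigma_0^2 \, g(\beta_0, \theta_0, x_0)^2
 + \sigma_0^2 f_\beta(x_0, \beta_0)^T \{n^{-1} \hat\Sigma\} f_\beta(x_0, \beta_0) .
\end{aligned}
\]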
The first term in the variance reflects the uncertainty due to variation
in Y0 , and regardless of how much data are collected, the inherent
variation in the response will always be there.
The second term reflects uncertainty due to fitting the model to the
training data, and it diminishes as more data are collected and used
to fit the model.
For large n the first term dominates the second, since the second term is
$O(n^{-1})$.
So the predominant source of error in prediction is the
inherent variation in the response.
One can do Wald-type inference for prediction based on this formula.
The result for calibration is similar.
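A sketch of the prediction-variance formula and a Wald-type interval follows. Everything numerical in it is hypothetical: the exponential-decay mean, the power-of-the-mean variance function g = f^θ, the fitted values, and the matrix standing in for Σ̂. The normal quantile 1.96 also assumes that Y0 is approximately normal, which the notes do not state.

```python
import math

def prediction_summary(x0, beta_hat, sigma_hat, theta_hat, Sigma_hat, n, z=1.96):
    """Plug-in version of sigma^2 g^2 + sigma^2 f_beta^T (Sigma/n) f_beta."""
    decay = math.exp(-beta_hat[1] * x0)
    y_hat = beta_hat[0] * decay                      # f(x0, beta-hat)
    grad = (decay, -beta_hat[0] * x0 * decay)        # f_beta(x0, beta-hat)
    inherent = sigma_hat ** 2 * (y_hat ** theta_hat) ** 2   # first term
    quad = sum(grad[a] * Sigma_hat[a][b] * grad[b]
               for a in range(2) for b in range(2))
    fitting = sigma_hat ** 2 * quad / n              # second term, O(1/n)
    half_width = z * math.sqrt(inherent + fitting)
    return y_hat, inherent, fitting, (y_hat - half_width, y_hat + half_width)

beta_hat, sigma_hat, theta_hat = (10.0, 0.5), 0.1, 0.8   # assumed fits
Sigma_hat = ((4.0, 0.1), (0.1, 0.02))                    # assumed 2 x 2 matrix

for n in (100, 1000):
    y_hat, inherent, fitting, interval = prediction_summary(
        2.0, beta_hat, sigma_hat, theta_hat, Sigma_hat, n)
    print(n, round(inherent, 5), round(fitting, 7), interval)
```

Increasing n from 100 to 1000 shrinks the fitting term tenfold and leaves the inherent term unchanged, so the interval width is driven almost entirely by σ̂ and θ̂ through the variance function.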