August 25

ST 762
Nonlinear Statistical Models for Univariate and Multivariate Response
General Setup for Nonlinear Models
Data are pairs (Yj, xj), j = 1, 2, . . . , n.
A regression model is a specification of the distribution of Y | x.
Most often, we focus on the first two moments,
E(Yj | xj) = f(xj, β),
var(Yj | xj) = σj².
Here x is (r × 1) and β is (p × 1), with possibly p ≠ r.
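As a concrete illustration of this setup, the sketch below simulates data whose conditional mean is a nonlinear function of a scalar x and whose variances σj² differ across observations. The mean function f(x, β) = β1(1 − exp(−β2 x)), the parameter values, and all names are assumptions chosen only to fix ideas; they are not from the notes.

```python
# Hypothetical illustration of the setup: E(Y_j | x_j) = f(x_j, beta) with
# f(x, beta) = beta1 * (1 - exp(-beta2 * x)) and unequal variances sigma_j^2.
import numpy as np

rng = np.random.default_rng(0)

def f(x, beta):
    """Nonlinear mean function E(Y | x) = f(x, beta)."""
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

beta_true = np.array([10.0, 0.7])          # p = 2 parameters
x = np.linspace(0.5, 8.0, 25)              # scalar covariate (r = 1)
sigma_j = 0.1 * f(x, beta_true)            # standard deviations vary with j
Y = f(x, beta_true) + sigma_j * rng.standard_normal(x.size)
```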
Inferential Approaches
Mostly, scientific focus is on β.
Objective is to estimate, and test hypotheses about, β.
The variances σj² may be unequal, and will need to be estimated or modeled, to ensure:
reasonably efficient estimates of β;
valid standard errors for, and tests about, β.
Approach 1: Ordinary Least Squares (OLS)
Choose β to minimize
S(β) = Σ_{j=1}^n {Yj − f(xj, β)}².
Motivation:
“sensible” way to find good values of β.
If we assume normality with a common variance σ², i.e. Yj ∼ N{f(xj, β), σ²}, this is equivalent to maximizing the log-likelihood
log L = −(n/2) log(2πσ²) − (1/(2σ²)) Σ_{j=1}^n {Yj − f(xj, β)}².
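A minimal OLS sketch, continuing with the simulated x and Y from the earlier hypothetical example: scipy.optimize.least_squares minimizes one half of the sum of squared residuals, which has the same minimizer as S(β). The starting values are illustrative assumptions.

```python
# Minimal OLS sketch: minimize S(beta) numerically for the hypothetical f.
import numpy as np
from scipy.optimize import least_squares

def f(x, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

def residuals(beta, x, Y):
    # least_squares minimizes 0.5 * sum(residuals**2), same minimizer as S(beta)
    return Y - f(x, beta)

beta0 = np.array([5.0, 1.0])               # illustrative starting values
fit_ols = least_squares(residuals, beta0, args=(x, Y))   # x, Y as simulated above
beta_hat_ols = fit_ols.x
```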
Remarks
The Gaussian likelihood above is based on the conditional distribution of Y | x. The full likelihood has a further factor, the marginal distribution of x1, x2, . . . , xn. If that marginal distribution involved β, we would want to use the full likelihood.
Minimizing S(β) leads to the estimating equation
Σ_{j=1}^n {Yj − f(xj, β)} fβ(xj, β) = 0   (p × 1)
where
fβ(x, β) = ( ∂f(x, β)/∂β1, ∂f(x, β)/∂β2, . . . , ∂f(x, β)/∂βp )^T.
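To make the estimating equation concrete, here is a sketch of its left-hand side for the same hypothetical mean function; at the OLS estimate it should be numerically zero. The partial derivatives coded in f_beta are specific to this illustrative f, not a general recipe.

```python
# Left-hand side of the OLS estimating equation for the hypothetical f.
import numpy as np

def f(x, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

def f_beta(x, beta):
    """(n, p) matrix whose j-th row is f_beta(x_j, beta)^T."""
    e = np.exp(-beta[1] * x)
    return np.column_stack([1.0 - e,           # d f / d beta1
                            beta[0] * x * e])  # d f / d beta2

def ols_score(beta, x, Y):
    """sum_j {Y_j - f(x_j, beta)} f_beta(x_j, beta), a (p,) vector."""
    return f_beta(x, beta).T @ (Y - f(x, beta))

# ols_score(beta_hat_ols, x, Y) should be close to the zero vector.
```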
Approach 2: Weighted Least Squares (WLS)
If we accept that the variances are unequal, but assume that we know
them up to a constant of proportionality:
σj² = σ²/wj,
with σ² unknown, but w1, w2, . . . , wn known (unlikely!);
Then we would minimize the weighted sum of squares
Sw(β) = Σ_{j=1}^n wj {Yj − f(xj, β)}².
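A WLS sketch continuing the same hypothetical example: scaling each residual by √wj makes least_squares minimize Sw(β). The particular weights below are an assumption made only for illustration.

```python
# Minimal WLS sketch: minimize S_w(beta) by scaling residuals by sqrt(w_j).
import numpy as np
from scipy.optimize import least_squares

def f(x, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

def weighted_residuals(beta, x, Y, w):
    return np.sqrt(w) * (Y - f(x, beta))

w = 1.0 / x                                # illustrative "known" weights
beta0 = np.array([5.0, 1.0])
fit_wls = least_squares(weighted_residuals, beta0, args=(x, Y, w))
beta_hat_wls = fit_wls.x
```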
The estimating equation becomes
Σ_{j=1}^n wj {Yj − f(xj, β)} fβ(xj, β) = 0.
Again, we find the same estimating equation by adopting the additional assumption of Gaussian distributions and maximizing the log-likelihood
log L = −(1/2) Σ_{j=1}^n log(2πσ²/wj) − (1/(2σ²)) Σ_{j=1}^n wj {Yj − f(xj, β)}².
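A step not written out above: maximizing this log-likelihood over σ² gives ∂ log L/∂σ² = −n/(2σ²) + Sw(β)/(2σ⁴) = 0, so σ̂²(β) = Sw(β)/n; substituting back leaves log L = −(n/2) log Sw(β) + constant, which is maximized in β exactly by minimizing Sw(β). The same argument with wj ≡ 1 covers OLS.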
Remarks
Note that, for both OLS and WLS, the estimating equation
Σ_{j=1}^n wj {Yj − f(xj, β)} fβ(xj, β) = 0
(with wj ≡ 1 for OLS) is:
linear in the Yj, which is convenient in developing the sampling theory of the estimators;
in general, nonlinear in the βs, which is inconvenient in finding numerical solutions.
MLEs may be found by direct numerical optimization of the sum of
squares, but solving the estimating equations is preferred.
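One standard way to solve this estimating equation numerically is a Gauss-Newton iteration; Gauss-Newton is not named above and is used here only as one common choice. The sketch below is for the same hypothetical f, with wj ≡ 1 giving OLS.

```python
# Gauss-Newton sketch for the weighted estimating equation
#   sum_j w_j {Y_j - f(x_j, beta)} f_beta(x_j, beta) = 0.
import numpy as np

def f(x, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

def f_beta(x, beta):
    e = np.exp(-beta[1] * x)
    return np.column_stack([1.0 - e, beta[0] * x * e])

def gauss_newton(x, Y, w, beta0, n_iter=25):
    beta = np.array(beta0, dtype=float)
    for _ in range(n_iter):
        F = f_beta(x, beta)                # (n, p) gradient matrix
        r = Y - f(x, beta)                 # (n,) residuals
        WF = w[:, None] * F                # rows scaled by w_j
        # Newton-type step: solve (F' W F) delta = F' W r and update beta.
        beta = beta + np.linalg.solve(F.T @ WF, WF.T @ r)
    return beta

# OLS: gauss_newton(x, Y, np.ones_like(Y), beta0); WLS: pass the known w.
```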
Approach 3: Plug-in estimates of weights
Suppose we have replicates: Yj,k = k-th observation at xj, k = 1, 2, . . . , rj.
If we assume that var(Yj,k) = σj², we can estimate it unbiasedly by
sj² = (1/(rj − 1)) Σ_{k=1}^{rj} (Yj,k − Ȳj)².
We could then plug these into Sw and minimize
Ss²(β) = Σ_{j=1}^n Σ_{k=1}^{rj} {Yj,k − f(xj, β)}²/sj².
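A sketch of this plug-in approach with simulated replicates, again under the hypothetical mean function; the group sizes, variances, and starting values are made-up choices for illustration only.

```python
# Plug-in weights from replicates: estimate sigma_j^2 by s_j^2, then
# minimize S_{s^2}(beta). Replicates are simulated from the hypothetical model.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)

def f(x, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

beta_true = np.array([10.0, 0.7])
x_uniq = np.array([1.0, 2.0, 4.0, 8.0])        # distinct x_j values
r = [3, 4, 3, 5]                               # r_j replicates at each x_j
sd = [0.3, 0.5, 0.4, 0.6]                      # illustrative true sigma_j
Y_rep = [f(xj, beta_true) + sj * rng.standard_normal(rj)
         for xj, rj, sj in zip(x_uniq, r, sd)]

s2 = np.array([y.var(ddof=1) for y in Y_rep])  # unbiased s_j^2

def plugin_residuals(beta, x_uniq, Y_rep, s2):
    # Stack (Y_jk - f(x_j, beta)) / s_j over all j, k; least_squares then
    # minimizes S_{s^2}(beta) up to the usual factor of 1/2.
    return np.concatenate([(y - f(xj, beta)) / np.sqrt(s2j)
                           for xj, y, s2j in zip(x_uniq, Y_rep, s2)])

fit = least_squares(plugin_residuals, np.array([5.0, 1.0]),
                    args=(x_uniq, Y_rep, s2))
beta_hat_plugin = fit.x
```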
Remarks
The criterion Ss²(β) can be rewritten as
Ss²(β) = Σ_{j=1}^n {Ȳj − f(xj, β)}²/(sj²/rj) + Σ_{j=1}^n (rj − 1),
depending only on the summary statistics Ȳj, sj², j = 1, 2, . . . , n.
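The rewrite follows from the within/between decomposition (a step not shown above): for each j,
Σ_{k=1}^{rj} {Yj,k − f(xj, β)}² = Σ_{k=1}^{rj} (Yj,k − Ȳj)² + rj {Ȳj − f(xj, β)}² = (rj − 1) sj² + rj {Ȳj − f(xj, β)}²;
dividing by sj² and summing over j gives the two terms above.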
The estimating equation is
Σ_{j=1}^n [{Ȳj − f(xj, β)}/(sj²/rj)] fβ(xj, β) = 0,
linear in Ȳj (but not linear in Yj,k).
If rj is small, sj² is a poor estimator of σj², so this may not work well.
We have not exploited any structure in the variation of σj².
Approach 4: Likelihood with replicates
Again, suppose we have replicates Yj,k at xj, k = 1, 2, . . . , rj.
As before, assuming Gaussian distributions leads to the log-likelihood
log L = −(1/2) Σ_{j=1}^n rj log(2πσj²) − Σ_{j=1}^n Σ_{k=1}^{rj} {Yj,k − f(xj, β)}²/(2σj²).
Differentiation wrt β and σ1², σ2², . . . , σn² yields equations for all unknowns.
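Writing out the σj² derivative (a step not shown above):
∂ log L/∂σj² = −rj/(2σj²) + Σ_{k=1}^{rj} {Yj,k − f(xj, β)}²/(2σj⁴) = 0,
which rearranges to σj² = (1/rj) Σ_{k=1}^{rj} {Yj,k − f(xj, β)}², the second estimating equation given below.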
Remarks
Again, the likelihood can be rewritten to depend only on the summary statistics Ȳj, sj², j = 1, 2, . . . , n.
The estimating equations are
Σ_{j=1}^n [{Ȳj − f(xj, β)}/(σj²/rj)] fβ(xj, β) = 0,
(1/rj) Σ_{k=1}^{rj} {Yj,k − f(xj, β)}² = σj²,   j = 1, 2, . . . , n.
These could be solved iteratively, given, say, initial estimates sj² (or 1) of σj².
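A sketch of this iterative scheme under the same hypothetical mean function: alternate a Gauss-Newton-type step for β, weighted by rj/σj², with the closed-form update for each σj². The function names and starting rule are assumptions for illustration, not a prescribed algorithm.

```python
# Iterative scheme for the replicate likelihood: alternate a beta update
# (weighted by r_j / sigma_j^2) with the closed-form sigma_j^2 update.
import numpy as np

def f(x, beta):
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

def f_beta(x, beta):
    e = np.exp(-beta[1] * x)
    return np.column_stack([1.0 - e, beta[0] * x * e])

def fit_replicate_mle(x_uniq, Y_rep, beta0, n_iter=50):
    """Y_rep[j] holds the r_j replicate responses observed at x_uniq[j]."""
    beta = np.array(beta0, dtype=float)
    r = np.array([len(y) for y in Y_rep])
    ybar = np.array([y.mean() for y in Y_rep])
    sigma2 = np.array([y.var(ddof=1) for y in Y_rep])   # start at s_j^2
    for _ in range(n_iter):
        # beta step: one Gauss-Newton step toward solving
        #   sum_j {ybar_j - f(x_j, beta)} / (sigma_j^2 / r_j) f_beta(x_j, beta) = 0
        w = r / sigma2
        F = f_beta(x_uniq, beta)
        resid = ybar - f(x_uniq, beta)
        WF = w[:, None] * F
        beta = beta + np.linalg.solve(F.T @ WF, WF.T @ resid)
        # sigma step: sigma_j^2 = (1/r_j) sum_k {Y_jk - f(x_j, beta)}^2
        sigma2 = np.array([np.mean((y - f(xj, beta)) ** 2)
                           for xj, y in zip(x_uniq, Y_rep)])
    return beta, sigma2

# e.g. fit_replicate_mle(x_uniq, Y_rep, np.array([5.0, 1.0])) with the
# replicate data simulated in the earlier plug-in sketch.
```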
Comments on Approaches 1–4
Approach 1 (OLS) is inefficient when variances are truly unequal.
Approach 2 (WLS) requires unlikely information about variance ratios.
Approaches 3 and 4 require replication, which may or may not be
feasible.
None exploits possible structure, such as the variance being a
function (known, or unknown but smooth) of the mean.