Truncated SVD, nonlinear regression

Instabilities of SVD
Small singular values -> the generalized-inverse solution m† is sensitive to even small amounts of noise
Small singular values may be indistinguishable from 0
Removing the small singular values stabilizes the solution ->
Truncated SVD (TSVD)
Condition number: cond(G) = s_1/s_k (ratio of largest to smallest retained singular value)
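A minimal numpy sketch of the idea (the matrix G, data vector d, and truncation level p are placeholders, not from the slides): form the generalized-inverse solution from only the p largest singular values.

```python
import numpy as np

# Minimal TSVD sketch: build the solution from only the p largest
# singular values (G, d, and p are placeholders).
def tsvd_solve(G, d, p):
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return Vt[:p].T @ ((U[:, :p].T @ d) / s[:p])

def effective_cond(G, p):
    # condition number of the truncated problem: s_1 / s_p
    s = np.linalg.svd(G, compute_uv=False)
    return s[0] / s[p - 1]
```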
TSVD
Example: removing instrument response
g(t) = g_0 t exp(-t/T_0)   (t ≥ 0)
g(t) = 0                   (t < 0)

v(t)=∫g(t-)mtrue()d (recorded acceleration)
-
Problem: deconvolving g(t) from v(t) to get mtrue
d = Gm, with
G_{i,j} = g(t_i - t_j) Δt = g_0 (t_i - t_j) exp[-(t_i - t_j)/T_0] Δt   (t_i ≥ t_j)
G_{i,j} = 0                                                            (t_i < t_j)
TSVD
Time interval [-5, 100] s
Δt = 0.5 s -> G with m = n = 210
Singular values range from 25.3 down to 0.017,
so cond(G) ≈ 1480
e.g., noise at the 1/1000 level is amplified enough to destabilize the solution
True signal: m_true(t) = exp[-(t-8)²/2²] + 0.5 exp[-(t-25)²/2²]
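A sketch of the discretization in numpy. The slides do not give g_0 or T_0, so g_0 = 1 and T_0 = 10 s are assumed here and the printed numbers are only illustrative.

```python
import numpy as np

# Discretize d = Gm for the instrument-response example.
# g0 = 1 and T0 = 10 s are assumed values, not given on the slides.
g0, T0, dt = 1.0, 10.0, 0.5
t = np.arange(-5.0, 100.0, dt)                 # 210 sample times
tij = t[:, None] - t[None, :]                  # t_i - t_j
G = np.where(tij >= 0, g0 * tij * np.exp(-tij / T0) * dt, 0.0)

# True signal: two Gaussian pulses
m_true = np.exp(-(t - 8.0)**2 / 2.0**2) + 0.5 * np.exp(-(t - 25.0)**2 / 2.0**2)
d_true = G @ m_true

s = np.linalg.svd(G, compute_uv=False)
print(G.shape, s[0] / s[-1])                   # size and condition number
```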
TSVD
d_true = G m_true
m = V S^-1 U^T d_true
TSVD
d_true = G m_true
m = V S^-1 U^T (d_true + η),  η = N(0, (0.05 V)²)
Solution fits the data perfectly, but is worthless…
TSVD
d_true = G m_true
m = V_p S_p^-1 U_p^T (d_true + η),  η = N(0, (0.05 V)²)
Solution for p = 26 (184 singular values removed!)
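A numpy sketch of the last three slides, reusing the assumed discretization above (g_0 = 1, T_0 = 10 s) and the 0.05 noise level from the slides:

```python
import numpy as np

# TSVD vs. full pseudoinverse for the deconvolution example
# (g0 = 1, T0 = 10 s assumed; noise std 0.05 as on the slides).
T0, dt = 10.0, 0.5
t = np.arange(-5.0, 100.0, dt)
tij = t[:, None] - t[None, :]
G = np.where(tij >= 0, tij * np.exp(-tij / T0) * dt, 0.0)
m_true = np.exp(-(t - 8)**2 / 2**2) + 0.5 * np.exp(-(t - 25)**2 / 2**2)

rng = np.random.default_rng(0)
d = G @ m_true + rng.normal(0.0, 0.05, size=t.size)    # d_true + eta

U, s, Vt = np.linalg.svd(G)
m_full = Vt.T @ ((U.T @ d) / s)                 # all 210 singular values: unstable
p = 26                                          # truncation level from the slides
m_tsvd = Vt[:p].T @ ((U[:, :p].T @ d) / s[:p])  # stabilized solution
```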
Nonlinear Regression
Linear regression: we now know how to handle this (e.g., LS)
Now assume a nonlinear system of m equations in m unknowns:
F(x) = 0 … what do we do?
We will try to find a sequence of vectors x_0, x_1, … that
converges toward a solution x*
Linearize (assuming F is continuously differentiable):
F(x_0 + Δx) ≈ F(x_0) + ∇F(x_0) Δx
where ∇F(x_0) is the Jacobian
Nonlinear Regression
Assume that the x puts us at the unknown solution x*:
F(x0+x)≈F(x0)+F(x0)x=F(x*)=0
-F(x0) ≈ F(x0)x = Newton’s Method!
F(x)=0, initial solution x0. Generate a sequence of solutions
x1, x2, …and stop if the sequence converges to a solution
with F(x)=0.
1. Solve -F(xk) ≈ F(xk)x (fx, using Gaussian elimination).
2. Let xk+1=xk+x.
3. let k=k+1
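A small sketch of the algorithm for an illustrative 2x2 system; the circle/parabola system and the starting point are made up for the example.

```python
import numpy as np

# Newton's method sketch for a small nonlinear system F(x) = 0
# (example system and starting point are illustrative assumptions).
def F(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0,   # unit circle
                     x[1] - x[0]**2])           # parabola

def jacobian(x):
    return np.array([[ 2*x[0], 2*x[1]],
                     [-2*x[0], 1.0   ]])

x = np.array([1.0, 1.0])                        # initial solution x_0
for k in range(20):
    dx = np.linalg.solve(jacobian(x), -F(x))    # solve ∇F(x_k) Δx = -F(x_k)
    x = x + dx                                  # x_{k+1} = x_k + Δx
    if np.linalg.norm(F(x)) < 1e-12:            # stop when F(x) ≈ 0
        break
print(x, k)
```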
Properties of Newton’s Method
If x_0 is close enough to x*, F(x) is continuously differentiable
in a neighborhood of x*, and the Jacobian ∇F(x*) is nonsingular,
Newton's method will converge to x*. The convergence is
quadratic:
||x_{k+1} - x*||_2 ≤ c ||x_k - x*||_2^2
Newton’s Method applied to a scalar function
Problem: Minimize f(x)
If f(x) is twice continuously differentiable:
f(x_0 + Δx) ≈ f(x_0) + ∇f(x_0)^T Δx + ½ Δx^T ∇²f(x_0) Δx
where ∇f(x_0) is the gradient
and ∇²f(x_0) is the Hessian
Newton’s Method applied to a scalar function
A necessary condition for x* to be a minimum of f(x) is that
∇f(x*) = 0. In the vicinity of x_0 we can approximate the gradient
as
∇f(x_0 + Δx) ≈ ∇f(x_0) + ∇²f(x_0) Δx (neglecting higher-order terms)
Setting the gradient to zero (assuming x_0 + Δx puts us at x*)
we get -∇f(x_0) ≈ ∇²f(x_0) Δx, which gives Newton's method for
minimizing f(x):
Given a twice differentiable function f(x) and an initial solution x_0,
generate a sequence of solutions x_1, x_2, … and stop if the
sequence converges to a solution with ∇f(x) = 0:
1. Solve ∇²f(x_k) Δx = -∇f(x_k).
2. Let x_{k+1} = x_k + Δx.
3. Let k = k + 1 and return to step 1.
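A sketch of the minimization variant on an illustrative convex test function; the function and starting point are assumptions, not from the slides.

```python
import numpy as np

# Newton's method sketch for minimizing
# f(x) = exp(x0 + x1 - 1) + x0^2 + x1^2 (illustrative test function).
def grad(x):                                   # ∇f
    e = np.exp(x[0] + x[1] - 1.0)
    return np.array([e + 2*x[0], e + 2*x[1]])

def hess(x):                                   # ∇²f (positive definite here)
    e = np.exp(x[0] + x[1] - 1.0)
    return np.array([[e + 2.0, e],
                     [e, e + 2.0]])

x = np.array([0.0, 0.0])                       # initial solution x_0
for k in range(50):
    dx = np.linalg.solve(hess(x), -grad(x))    # solve ∇²f(x_k) Δx = -∇f(x_k)
    x = x + dx                                 # x_{k+1} = x_k + Δx
    if np.linalg.norm(grad(x)) < 1e-10:        # stop when ∇f(x) ≈ 0
        break
print(x)
```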
Newton’s Method applied to a scalar function
This is the same as solving the nonlinear system of equations
∇f(x) = 0, and therefore:
If f(x) is twice continuously differentiable in a neighborhood of
x*, there is a constant λ such that
||∇²f(x) - ∇²f(y)||_2 ≤ λ ||x - y||_2
for every y in the neighborhood, ∇²f(x*) is positive
definite, and x_0 is close enough to x*, then Newton's
method will converge quadratically to x*
Newton’s Method applied to LS
Newton's method is not directly applicable to most nonlinear
regression and inverse problems (the number of model parameters
and the number of data points differ, and there is no exact solution
to G(m) = d). Instead we use it to minimize a nonlinear LS objective,
e.g. to fit a vector of n parameters to a data vector d:
f(m) = ∑_{i=1}^{m} [(G(m)_i - d_i)/σ_i]²
Let f_i(m) = (G(m)_i - d_i)/σ_i, i = 1, 2, …, m, and F(m) = [f_1(m) … f_m(m)]^T,
so that f(m) = ∑_{i=1}^{m} f_i(m)²
and the gradient is ∇f(m) = ∇[∑_{i=1}^{m} f_i(m)²]
Newton’s Method applied to LS
∇f(m)_j = ∑_{i=1}^{m} 2 f_i(m) ∂f_i(m)/∂m_j
∇f(m) = 2 J(m)^T F(m), where J(m) is the Jacobian, J_{i,j}(m) = ∂f_i(m)/∂m_j
∇²f(m) = ∇²[∑_{i=1}^{m} f_i(m)²] = ∑_{i=1}^{m} H^i(m), where H^i(m) is the Hessian of f_i(m)²
H^i_{j,k}(m) = ∂²[f_i(m)²]/(∂m_j ∂m_k) = 2 [∂f_i(m)/∂m_j · ∂f_i(m)/∂m_k + f_i(m) ∂²f_i(m)/(∂m_j ∂m_k)]
Newton’s Method applied to LS
∇²f(m) = 2 J(m)^T J(m) + Q(m), where
Q(m) = 2 ∑_{i=1}^{m} f_i(m) ∇²f_i(m)
The Gauss-Newton (GN) method ignores Q(m):
∇²f(m) ≈ 2 J(m)^T J(m), assuming the residuals f_i(m) will be reasonably
small as we approach m*. That is,
NM: solve ∇²f(x_k) Δx = -∇f(x_k)
with ∇f(m) = 2 J(m)^T F(m) and ∇²f(m) ≈ 2 J(m)^T J(m) becomes
J(m_k)^T J(m_k) Δm = -J(m_k)^T F(m_k)
Newton’s Method applied to LS
The Levenberg-Marquardt (LM) method uses
[J(m_k)^T J(m_k) + λI] Δm = -J(m_k)^T F(m_k)
λ -> 0: GN
λ -> large: steepest descent (SD) (moves down-gradient most rapidly).
SD provides slow but sure convergence.
Which value of λ to use? Small values when GN is working
well, larger values in problem areas. Start with a
small value of λ, then adjust it at every iteration.
Statistics of iterative methods
Cov(Ad) = A Cov(d) A^T (for d with a multivariate normal distribution)
Cov(m_L2) = (G^T G)^-1 G^T Cov(d) G (G^T G)^-1
If Cov(d) = σ²I: Cov(m_L2) = σ² (G^T G)^-1
However, we do not have a linear relationship between the data
and the estimated model parameters in nonlinear regression, so we
cannot use these formulas directly. Instead, linearize about m*:
F(m* + Δm) ≈ F(m*) + J(m*) Δm
Cov(m*) ≈ (J(m*)^T J(m*))^-1  (residuals weighted by the σ_i)
If the data standard deviation is unknown, estimate it from the residuals:
r_i = G(m*)_i - d_i
s = [∑_{i=1}^{m} r_i² / (m - n)]^{1/2}
Cov(m*) = s² (J(m*)^T J(m*))^-1
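A sketch of the covariance estimate at the solution of the illustrative exponential fit (Jacobian rows already weighted by 1/σ_i, so Cov(m*) ≈ (J^T J)^-1).

```python
import numpy as np

# Approximate model covariance from the Jacobian at m*
# (same illustrative exponential model; sigma assumed known).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 25)
sigma = 0.05
d = 2.0 * np.exp(-0.4 * t) + rng.normal(0.0, sigma, size=t.size)

def residuals(m):
    return (m[0] * np.exp(-m[1] * t) - d) / sigma

def jac(m):
    e = np.exp(-m[1] * t)
    return np.column_stack([e / sigma, -m[0] * t * e / sigma])

# Gauss-Newton iterations to reach m* (assumed to converge here)
m = np.array([1.0, 0.2])
for _ in range(50):
    J, F = jac(m), residuals(m)
    m = m + np.linalg.solve(J.T @ J, -J.T @ F)

J = jac(m)
cov_m = np.linalg.inv(J.T @ J)              # Cov(m*) ≈ (J^T J)^-1
stderr = np.sqrt(np.diag(cov_m))            # 1-sigma parameter uncertainties
print(m, stderr)
```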
Implementation Issues
1. Explicit (analytical) expressions for the derivatives
2. Finite-difference approximation of the derivatives (see the sketch below)
3. When to stop iterating?
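For item 2, a minimal forward-difference Jacobian sketch; the test function, model values, and step size h are illustrative assumptions.

```python
import numpy as np

# Forward-difference approximation of the Jacobian, for cases where
# analytical derivatives are unavailable (h is an assumed step size).
def fd_jacobian(F, m, h=1e-7):
    """Approximate J_{i,j} = dF_i/dm_j by forward differences."""
    F0 = np.asarray(F(m))
    J = np.empty((F0.size, m.size))
    for j in range(m.size):
        m_step = m.copy()
        m_step[j] += h
        J[:, j] = (np.asarray(F(m_step)) - F0) / h
    return J

# Example: residuals of the illustrative exponential model used above
t = np.linspace(0.0, 10.0, 25)
d = 2.0 * np.exp(-0.4 * t)
F = lambda m: m[0] * np.exp(-m[1] * t) - d
print(fd_jacobian(F, np.array([2.0, 0.4])))
```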