Determining the Regularization Parameters for the Solution of Ill-posed Least Squares: Using the χ²-distribution and Applying to Seismic Signals
Rosemary Renaut, joint work with Jodi Mead
Arizona State and Boise State
November 2007
Outline
1. Introduction: ill-posed least squares
   Regularization
   Some standard methods for parameter estimation
2. A statistically based method: the χ² method
   Background
   Algorithm
   Single variable Newton method
   Extension to general D: generalized Tikhonov
   Observations
3. Results
4. Conclusions
5. References
Least Squares Solutions of Overdetermined Ax = b
Problem
Find x which solves Ax = b: A ∈ R^{m×n}, b ∈ R^m, x ∈ R^n.
Classical Approach: Linear Least Squares
    x_{LS} = \arg\min_x \|Ax - b\|_2^2
• Orthogonal projection of b onto the range of A.
Dense A: form the QR decomposition of A and solve Rx = Q^T b directly.
Sparse A: use iterative techniques (CG, Krylov subspace methods, etc.).
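A minimal sketch of both routes in Python/NumPy (illustrative only; the sizes, matrix, and data below are invented, not from the talk):

    import numpy as np

    # Illustrative overdetermined system (invented sizes and data).
    rng = np.random.default_rng(0)
    m, n = 100, 10
    A = rng.standard_normal((m, n))
    b = A @ rng.standard_normal(n) + 1e-3 * rng.standard_normal(m)

    # Dense A: thin QR decomposition of A, then solve the triangular system R x = Q^T b.
    Q, R = np.linalg.qr(A)                  # Q is m x n, R is n x n upper triangular
    x_ls = np.linalg.solve(R, Q.T @ b)

    # Sparse A: an iterative solver (e.g. scipy.sparse.linalg.lsqr) would be used instead.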
Solution by Singular Value Decomposition: A = UΣV^T
1. If A is of rank r, then
       x_{LS} = \sum_{i=1}^{r} \frac{u_i^T b}{\sigma_i} \, v_i .
2. If r < n, then x_{LS} is the solution with minimum 2-norm.
3. Sensitivity: division by small σ_i in x_{LS} amplifies high-frequency components in b.
4. Sensitivity: the sensitivity of x_{LS} to changes in the model Ax = b is inversely proportional to σ_r.
5. Sensitivity: equivalently, x_{LS} is sensitive in proportion to the condition number of A (possibly squared).
x_{LS} is sensitive to changes in the right-hand side b when A is ill-conditioned.
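A sketch of the SVD solution and the amplification caused by small σ_i (Python/NumPy; the ill-conditioned A here is invented for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 50, 20
    # Invented ill-conditioned A: orthogonal factors with rapidly decaying singular values.
    U0, _ = np.linalg.qr(rng.standard_normal((m, m)))
    V0, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A = U0[:, :n] @ np.diag(10.0 ** (-np.arange(n))) @ V0.T
    b = A @ rng.standard_normal(n) + 1e-8 * rng.standard_normal(m)   # noisy right-hand side

    # x_LS = sum_{i<=r} (u_i^T b / sigma_i) v_i, truncated at the numerical rank r.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > 1e-10 * s[0]))        # numerical rank
    coeffs = (U[:, :r].T @ b) / s[:r]        # division by small sigma_i amplifies the noise in b
    x_ls = Vt[:r].T @ coeffs

Truncating at r is already a crude form of regularization; without it the terms with the smallest σ_i dominate the solution.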
Example of an Ill-Posed Problem: Integral Equations
    \int_\Omega \mathrm{input} \times \mathrm{system} \; d\Omega = \mathrm{output}
• Given noisy output, determine the input.
• General application: signal/image restoration.
• Signal degradation is modeled as a convolution
      b = a ⊗ x + n
• b is the blurred signal, x is the unknown signal.
• a is the point spread function (PSF), which is known.
• n is noise.
• Matrix formulation:
      b = Ax + n
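A sketch of how the convolution model becomes the matrix equation b = Ax + n (Python/NumPy; the signal, Gaussian PSF, and noise level are invented):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    t = np.arange(n)

    # Invented piecewise-constant signal and a normalized Gaussian PSF.
    x = (t > 50).astype(float) - 0.5 * (t > 120).astype(float)
    psf = np.exp(-0.5 * ((t - n / 2) / 4.0) ** 2)
    psf /= psf.sum()

    # Column k of A is the PSF centred at sample k, so A @ x is the (circular) convolution a ⊗ x.
    A = np.column_stack([np.roll(psf, k - n // 2) for k in range(n)])
    b = A @ x + 1e-3 * rng.standard_normal(n)   # blurred, noisy data b = A x + n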
Example of Convolution
b = a ⊗ x
Restoration of x with Noise Added, n
• Find x from b = a ⊗ x + n, given b and a, with n unknown.
• Assuming normally distributed n yields the estimator
      x_{LS} = \arg\min_x \|b - a \otimes x\|_2^2
• Reconstruction with n normally distributed, with mean 0 and variance 10^{-7}.
Regularization
• Add more information about the signal.
• Regularize:
      x_{LS}(\lambda) = \arg\min_x \{\|b - a \otimes x\|_2^2 + \lambda R(x)\},
  where R(x) is a regularization term.
• λ is a regularization parameter, which is unknown.
Notice that the solution x_{LS}(λ) depends on λ; it also depends on the choice of R.
Tikhonov Regularized Least Squares for Ax = b, with R(x) = ‖D(x − x_0)‖²_{W_x}
Formulation
Generalized Tikhonov regularization; the operator D acts on x:
    \hat{x} = \arg\min J(x) = \arg\min \{\|Ax - b\|^2_{W_b} + \|D(x - x_0)\|^2_{W_x}\}.    (1)
Assume N(A) ∩ N(D) = {0}.
The weighting matrix W_b is the inverse covariance matrix for the data b.
x_0 is a reference solution, often x_0 = 0.
Standard case: W_x = λI = I/σ_x², with σ_x² the variance in x, and D = I; then (1) becomes
    \hat{x}(\lambda) = \arg\min J(x) = \arg\min \{\|Ax - b\|^2_{W_b} + \lambda \|D(x - x_0)\|^2\}.    (2)
Question
What is the correct λ? Is the choice of λ important?
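For a fixed λ the minimizer of (2) is an ordinary (stacked) least squares problem; a minimal solver sketch (Python/NumPy, illustrative only; Wb_sqrt stands for any square root of W_b, e.g. a Cholesky factor):

    import numpy as np

    def tikhonov_solve(A, b, Wb_sqrt, D, lam, x0):
        """Minimize ||Ax - b||^2_{Wb} + lam * ||D(x - x0)||^2 for a single value of lam."""
        # Stack the weighted data equations on top of the scaled regularization equations.
        K = np.vstack([Wb_sqrt @ A, np.sqrt(lam) * D])
        rhs = np.concatenate([Wb_sqrt @ b, np.sqrt(lam) * (D @ x0)])
        xhat, *_ = np.linalg.lstsq(K, rhs, rcond=None)
        return xhat

The open question is how to pick λ; the χ² method described below instead determines λ = 1/σ_x² from the statistical properties of the minimum of the functional.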
1-D Original and Noisy Signal
Solution for Different Choices of λ
Another Example: Image Reconstruction, Shepp-Logan Phantom
[Figures: reconstruction without noise in the data, and with 0.1% noise in the data.]
Some Standard Approaches I: L-curve - Find the Corner
Let r(λ) = (A(λ) − A) b, with influence matrix A(λ) = A (A^T W_b A + λ D^T D)^{-1} A^T.
• Plot (log ‖Dx‖, log ‖r(λ)‖) and find the corner: it trades off the two contributions.
• Expensive: requires a range of λ values.
• The GSVD makes the calculations efficient.
• Not statistically based.
• Sometimes there is no corner.
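A sketch of tracing the L-curve over a λ grid, reusing the tikhonov_solve sketch given earlier (illustrative; the plotted residual here is the common weighted-residual variant rather than the exact r(λ) above, and the corner would still need to be located, e.g. by maximum curvature):

    import numpy as np

    def l_curve_points(A, b, Wb_sqrt, D, lambdas, x0):
        """Return (log ||D x(lam)||, log ||weighted residual(lam)||) for each lam."""
        pts = []
        for lam in lambdas:
            x = tikhonov_solve(A, b, Wb_sqrt, D, lam, x0)   # solver sketched above
            pts.append((np.log(np.linalg.norm(D @ x)),
                        np.log(np.linalg.norm(Wb_sqrt @ (A @ x - b)))))
        return np.array(pts)

In practice the GSVD replaces the repeated solves, which is what makes the sweep over λ affordable.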
Some Standard Approaches II: Generalized Cross-Validation (GCV)
Minimize the GCV function
    \frac{\|b - A x(\lambda)\|^2_{W_b}}{[\mathrm{trace}(I_m - A(\lambda))]^2},
which estimates the predictive risk.
• Multiple minima are possible.
• Expensive: requires a range of λ values.
• The GSVD makes the calculations efficient.
• Statistically based.
• Requires a minimum; the function is sometimes flat.
Some Standard Approaches III: Unbiased Predictive Risk Estimation (UPRE)
Minimize the expected value of the predictive risk, i.e. minimize the UPRE function
    \|b - A x(\lambda)\|^2_{W_b} + 2\,\mathrm{trace}(A(\lambda)) - m.
• Expensive: requires a range of λ values.
• The GSVD makes the calculations efficient.
• Statistically based.
• A minimum is needed.
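Both GCV and UPRE can be evaluated over a λ grid from a single SVD; a sketch for the simplest case D = I with whitened data, W_b = I (illustrative; the talk's setting carries general weights and D through the GSVD):

    import numpy as np

    def gcv_upre_curves(A, b, lambdas):
        """GCV and UPRE values for min ||Ax - b||^2 + lam ||x||^2 (D = I, whitened data)."""
        m = A.shape[0]
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        beta = U.T @ b
        b_perp2 = np.linalg.norm(b) ** 2 - np.linalg.norm(beta) ** 2  # part of b outside range(A)
        gcv, upre = [], []
        for lam in lambdas:
            f = s ** 2 / (s ** 2 + lam)                # filter factors; trace(A(lam)) = sum(f)
            res2 = np.sum(((1.0 - f) * beta) ** 2) + b_perp2
            gcv.append(res2 / (m - f.sum()) ** 2)      # GCV function
            upre.append(res2 + 2.0 * f.sum() - m)      # UPRE function
        return np.array(gcv), np.array(upre)

    # The chosen lambda is the grid point minimizing the respective function.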
Development
• The χ² method (Mead 2007)
• Its background
• A Newton algorithm
• Some examples
• Future work
General Result: Tikhonov (D = I) - the Cost Functional at its Minimum Is a χ² Random Variable
Theorem (Rao 1973; Tarantola; Mead 2007)
Let
    J(x) = (b - Ax)^T C_b^{-1} (b - Ax) + (x - x_0)^T C_x^{-1} (x - x_0),
where
• x and b are stochastic (they need not be normal),
• the components of r = b − A x_0 are iid (assume no components are zero),
• the matrices C_b = W_b^{-1} and C_x = W_x^{-1} are SPD.
Then, for large m, the minimum value of J is a random variable which follows a χ² distribution with m degrees of freedom.
Implications
The theorem implies
    m - \sqrt{2m}\, z_{\alpha/2} < J(\hat{x}) < m + \sqrt{2m}\, z_{\alpha/2}
for a (1 − α) confidence interval, with x̂ the solution.
Equivalently, when D = I,
    m - \sqrt{2m}\, z_{\alpha/2} < r^T (A C_x A^T + C_b)^{-1} r < m + \sqrt{2m}\, z_{\alpha/2}.
Note that no assumption is made on W_x: it is completely general.
Question
Can we use this result to obtain an efficient algorithm?
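The interval test itself is one line of code; a sketch (illustrative; z_{α/2} via scipy.stats.norm, and J_hat stands for the computed minimum value of J):

    import numpy as np
    from scipy.stats import norm

    def within_chi2_interval(J_hat, m, alpha=0.05):
        """Check m - sqrt(2m) z_{alpha/2} < J(x_hat) < m + sqrt(2m) z_{alpha/2}."""
        z = norm.ppf(1.0 - alpha / 2.0)        # z_{alpha/2}
        half_width = np.sqrt(2.0 * m) * z
        return (m - half_width) < J_hat < (m + half_width)

Essentially this is the convergence test for the scalar iteration below: σ_x is adjusted until the functional value falls inside this interval.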
Single Variable Approach: Seek an Efficient, Practical Algorithm
Let W_x = σ_x^{-2} I, where the regularization parameter is λ = 1/σ_x².
Use the SVD U_b Σ_b V_b^T = W_b^{1/2} A, with singular values σ_1 ≥ σ_2 ≥ ... ≥ σ_p, and define s = U_b^T W_b^{1/2} r.
Find σ_x such that
    m - \sqrt{2m}\, z_{\alpha/2} < s^T \mathrm{diag}\Big(\frac{1}{\sigma_i^2 \sigma_x^2 + 1}\Big) s < m + \sqrt{2m}\, z_{\alpha/2}.
Equivalently, find σ_x² such that
    F(\sigma_x) = s^T \mathrm{diag}\Big(\frac{1}{1 + \sigma_x^2 \sigma_i^2}\Big) s - m = 0.
Scalar root finding: Newton's method.
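A minimal sketch of the resulting single-variable Newton iteration for D = I (Python/NumPy; my own illustration of the idea, with F' obtained by differentiating F and using that F is even in σ_x):

    import numpy as np

    def chi2_newton_sigma(A, b, Wb_sqrt, x0, sigma0=1.0, tol=1e-8, maxit=50):
        """Newton iteration for F(sigma_x) = s^T diag(1/(sigma_i^2 sigma_x^2 + 1)) s - m = 0."""
        r = b - A @ x0
        U, svals, _ = np.linalg.svd(Wb_sqrt @ A, full_matrices=False)
        wr = Wb_sqrt @ r
        s = U.T @ wr                                      # components along the left singular vectors
        s_perp2 = np.linalg.norm(wr) ** 2 - np.linalg.norm(s) ** 2   # components with sigma_i = 0
        m = len(b)
        sigma = sigma0                                    # starting guess (invented)
        for _ in range(maxit):
            w = 1.0 / (svals ** 2 * sigma ** 2 + 1.0)
            F = np.sum(w * s ** 2) + s_perp2 - m
            dF = -2.0 * sigma * np.sum((svals * s * w) ** 2)
            if abs(F) < tol or abs(dF) < 1e-30:
                break
            sigma = abs(sigma - F / dF)                   # F is even, so work with sigma >= 0
        return sigma                                      # regularization parameter: lambda = 1/sigma**2

The χ² confidence interval above can replace the fixed tolerance as the stopping test.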
Extension to Generalized Tikhonov
Define
    \hat{x}_{GTik} = \arg\min J_D(x) = \arg\min \{\|Ax - b\|^2_{W_b} + \|D(x - x_0)\|^2_{W_x}\}.    (3)
Theorem
For large m, the minimum value of J_D is a random variable which follows a χ² distribution with m − n + p degrees of freedom (assuming that no components of r are zero).
Proof.
Use the generalized singular value decomposition of [W_b^{1/2} A; W_x^{1/2} D].
Goal: find W_x such that J_D is χ² with m − n + p degrees of freedom.
Newton Root Finding, W_x = σ_x^{-2} I_p
Let the GSVD of [W_b^{1/2} A; D] be
    A = U \begin{bmatrix} \Upsilon \\ 0_{(m-n) \times n} \end{bmatrix} X^T, \qquad D = V [M, \; 0_{p \times (n-p)}] X^T,
where the γ_i are the generalized singular values. Set
    \tilde{m} = m - n + p - \sum_{i=1}^{p} s_i^2 \, \delta_{\gamma_i 0} - \sum_{i=n+1}^{m} s_i^2, \qquad \tilde{s}_i = \frac{s_i}{\gamma_i^2 \sigma_x^2 + 1}, \; i = 1, \dots, p, \qquad t_i = \tilde{s}_i \gamma_i.
Find the root of
    \sum_{i=1}^{p} \Big(\frac{1}{\gamma_i^2 \sigma_x^2 + 1}\Big) s_i^2 + \sum_{i=n+1}^{m} s_i^2 = m,
that is, solve F = 0, where
    F(\sigma_x) = s^T \tilde{s} - \tilde{m} \qquad \text{and} \qquad F'(\sigma_x) = -2 \sigma_x \|t\|_2^2.
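A short sketch of F and F' in these GSVD variables (illustrative; it assumes the γ_i, the first p components s_i, and m̃ have already been formed as above, and omits the computation of the GSVD itself):

    import numpy as np

    def F_and_dF(sigma, gamma, s, m_tilde):
        """F(sigma) = s^T s_tilde - m_tilde and F'(sigma) = -2 sigma ||t||_2^2, in GSVD variables."""
        s_tilde = s / (gamma ** 2 * sigma ** 2 + 1.0)     # s_tilde_i = s_i / (gamma_i^2 sigma^2 + 1)
        t = gamma * s_tilde                               # t_i = gamma_i * s_tilde_i
        return np.dot(s, s_tilde) - m_tilde, -2.0 * sigma * np.dot(t, t)

    # Newton step: sigma <- sigma - F / dF, repeated until |F| lies inside the chi^2 interval.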
An Illustrative Example: phillips Fredholm Integral Equation (Hansen)
Example with 10% error:
• Add noise to b with standard deviation σ_{b_i} = 0.01 |b_i| + 0.1 b_{max}.
• Covariance matrix C_b = σ_b² I_m = W_b^{-1}, where σ_b² is the average of the σ_{b_i}².
• In the figure, − is the original b and ∗ is the noisy data.
An Illustrative Example: phillips Fredholm Integral Equation (Hansen)
Comparison with the new method. Compare solutions:
• + is the reference x_0; −− is the exact solution.
• o is the L-curve solution.
• Three other solutions: UPRE, GCV, and the χ² method (blue, magenta, black).
• Each method gives a different solution, but UPRE, GCV, and χ² are comparable.
Observations: Does the Method Converge?
• F is monotonically decreasing and even (left figure).
• Either the solution exists and is unique for positive σ,
• or no solution exists, i.e. F(0) < 0 (right figure).
• Theoretically, lim_{σ→∞} F > 0 is possible; this is equivalent to λ = 0, i.e. no regularization is needed.
Remark on F(0) < 0
• Notice that when F(0) < 0, m̃ is too big relative to J; equivalently, there are insufficient degrees of freedom.
• Notice that
      J(\hat{x}) = \|P^{1/2} s\|_2^2, \qquad P = \mathrm{diag}\big(1/((\gamma_i \sigma)^2 + 1), \; 0_{n-p}, \; I_{m-n}\big).
• In particular J(x̂(0)) = ‖P^{1/2}(0) s‖_2² = y, for some y. If y < m̃, set m̃ = floor(y).
• The theorem is revised to: m̃ = min{floor(J(0)), m − n + p}.
• Here m = 500, J(0) ≈ 39, F(0) ≈ −461. On the right, m̃ = 38.
Example: Seismic Signal Restoration
• Real data set of 48 signals of length 500.
• The point spread function is derived from the signals.
• Calculate the signal variance pointwise over all 48 signals.
• Compare restoration of the S-wave with derivative orders 0, 1, 2.
• The weighting matrices are I, σ_g^{-2} I, and diag(σ_{g_i}^{-2}): cases 1, 2, and 3 (see the sketch after this list).
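A sketch of forming the three weighting cases from the pointwise variance of the traces (illustrative; signals is a hypothetical 48 x 500 array, and the averaging used for case 2 is my assumption):

    import numpy as np

    def seismic_weightings(signals):
        """signals: hypothetical array of shape (48, 500). Returns the three weighting cases."""
        var_pointwise = signals.var(axis=0)         # pointwise variance sigma_{g_i}^2 over the 48 signals
        var_scalar = var_pointwise.mean()           # a single sigma_g^2 (assumed: the average)
        n = signals.shape[1]
        W1 = np.eye(n)                              # case 1: identity weighting
        W2 = np.eye(n) / var_scalar                 # case 2: sigma_g^{-2} I
        W3 = np.diag(1.0 / var_pointwise)           # case 3: diag(sigma_{g_i}^{-2})
        return W1, W2, W3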
Goals of the Analysis
• Identify seismic time arrivals accurately.
• Determine the existence of secondary structures: ScS, Scd, Sab.
Tikhonov Regularization
Observations:
• The reduced degrees of freedom are relevant!
• The degrees of freedom are found automatically.
• Cases 2 and 1 have different solutions.
• Case 3 gives greater contrast in the signal.
First and Second Order Derivative Restoration
Observations:
• Derivative smoothing is not desirable.
• Case 3 preserves the signal.
• λ increases with the derivative order.
• The solution is smoother for larger λ.
Comparison with L-curve and UPRE Solutions
Observations:
• The L-curve underestimates λ.
• UPRE and χ² are comparable for the DOF-limited χ².
• UPRE underestimates λ for the case 2 and case 3 weightings.
Conclusions
• The χ² Newton algorithm is cost effective: it converges in 5-10 iterations.
• It performs as well as (or better than) GCV and UPRE when statistical information is available.
• It should be the method of choice when statistical information is provided.
• The method can be adapted to find W_b if W_x is provided.
Future Work
• Analyse truncated expansions (TSVD and TGSVD), which reduce the degrees of freedom.
• Further theoretical analysis and simulations with other noise distributions; comparison with the new work of Rust & O'Leary 2007.
• Can the method be extended to nonlinear regularization terms (e.g. total variation)?
• Development of the nonlinear least squares formulation for general diagonal W_x.
• Efficient calculation of uncertainty information (the covariance matrix).
• Nonlinear problems?
THANK YOU!
Newton's Method Converges in 5-10 Iterations

  l   c_b   Iterations k: mean   std
  0   1     8.23e+00             6.64e-01
  0   2     8.31e+00             9.80e-01
  0   3     8.06e+00             1.06e+00
  1   1     4.92e+00             5.10e-01
  1   2     1.00e+01             1.16e+00
  1   3     1.00e+01             1.19e+00
  2   1     5.01e+00             8.90e-01
  2   2     8.29e+00             1.48e+00
  2   3     8.38e+00             1.50e+00

Table: Convergence characteristics for problem phillips with n = 40 over 500 runs.
Newton's Method Converges in 5-10 Iterations

  l   c_b   Iterations k: mean   std
  0   1     6.84e+00             1.28e+00
  0   2     8.81e+00             1.36e+00
  0   3     8.72e+00             1.46e+00
  1   1     6.05e+00             1.30e+00
  1   2     7.40e+00             7.68e-01
  1   3     7.17e+00             8.12e-01
  2   1     6.01e+00             1.40e+00
  2   2     7.28e+00             8.22e-01
  2   3     7.33e+00             8.66e-01

Table: Convergence characteristics for problem blur with n = 36 over 500 runs.
Estimating the Error and Predictive Risk

Error (mean over 500 runs):

  l   c_b   χ²         L-curve    GCV        UPRE
  0   2     4.37e-03   4.39e-03   4.21e-03   4.22e-03
  0   3     4.32e-03   4.42e-03   4.21e-03   4.22e-03
  1   2     4.35e-03   5.17e-03   4.30e-03   4.30e-03
  1   3     4.39e-03   5.05e-03   4.38e-03   4.37e-03
  2   2     4.50e-03   6.68e-03   4.39e-03   4.56e-03
  2   3     4.37e-03   6.66e-03   4.43e-03   4.54e-03

Table: Error characteristics for problem phillips with n = 60 over 500 runs with error-contaminated x_0. Relative errors larger than .009 removed.

The results are comparable.
Estimating the Error and Predictive Risk

Risk (mean over 500 runs):

  l   c_b   χ²         L-curve    GCV        UPRE
  0   2     3.78e-02   5.22e-02   3.15e-02   2.92e-02
  0   3     3.88e-02   5.10e-02   2.97e-02   2.90e-02
  1   2     3.94e-02   5.71e-02   3.02e-02   2.74e-02
  1   3     1.10e-01   5.90e-02   3.27e-02   2.79e-02
  2   2     3.41e-02   6.00e-02   3.35e-02   3.79e-02
  2   3     3.61e-02   5.98e-02   3.35e-02   3.82e-02

Table: Risk characteristics for problem phillips with n = 60 over 500 runs.

The χ² method does not give the best estimate of the risk.
Estimating the Error and Predictive Risk: Error Histogram
[Figure: error histogram for normal noise on the right-hand side, first order derivative, C_b = σ² I.]
Estimating the Error and Predictive Risk: Error Histogram
[Figure: error histogram for exponential noise on the right-hand side, first order derivative, C_b = σ² I.]
Some Solutions with No Prior Information x_0
Illustrated are solutions and error bars.
[Left: no statistical information; the solution is smoothed. Right: with statistical information, C_b = diag(σ_{b_i}²).]
Some Generalized Tikhonov Solutions: First Order Derivative
[Left: no statistical information. Right: C_b = diag(σ_{b_i}²).]
Some Generalized Tikhonov Solutions with Prior x_0: the Solution Is Not Smoothed
[Left: no statistical information. Right: C_b = diag(σ_{b_i}²).]
Some Generalized Tikhonov Solutions, x_0 = 0: Exponential Noise
[Left: no statistical information. Right: C_b = diag(σ_{b_i}²).]
Relationship to the Discrepancy Principle
• The discrepancy principle can also be implemented by a Newton method.
• It finds σ_x such that the regularized residual satisfies
      \sigma_b^2 = \frac{1}{m} \|b - A x(\sigma)\|_2^2.    (4)
• Consistent with our notation,
      \sum_{i=1}^{p} \Big(\frac{1}{\gamma_i^2 \sigma^2 + 1}\Big)^2 s_i^2 + \sum_{i=n+1}^{m} s_i^2 = m.    (5)
• The weight in the first sum is squared here; otherwise the functional is the same.
• But the discrepancy principle often oversmooths. What happens here?
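For comparison, a sketch of the discrepancy-principle functional in the same GSVD variables, showing the squared weight of (5) (illustrative; γ_i and s_i as before, and s_tail2 stands for the trailing sum of s_i² for i = n+1, ..., m):

    import numpy as np

    def discrepancy_F(sigma, gamma, s, s_tail2, m):
        """Left-hand side of (5) minus m; note the squared weight versus the chi^2 functional F."""
        w = 1.0 / (gamma ** 2 * sigma ** 2 + 1.0)
        return np.sum((w ** 2) * s ** 2) + s_tail2 - m

A root of this function, again found by Newton's method, gives the discrepancy-principle choice of σ.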
Major References
• Bennett, A., 2005, Inverse Modeling of the Ocean and Atmosphere, Cambridge University Press.
• Hansen, P. C., 1994, Regularization Tools: A Matlab Package for Analysis and Solution of Discrete Ill-posed Problems, Numerical Algorithms 6, 1-35.
• Mead, J., 2007, A priori weighting for parameter estimation, J. Inv. Ill-posed Problems, to appear.
• Rao, C. R., 1973, Linear Statistical Inference and its Applications, Wiley, New York.
• Tarantola, A., 2005, Inverse Problem Theory and Methods for Model Parameter Estimation, SIAM.
• Vogel, C. R., 2002, Computational Methods for Inverse Problems, SIAM, Frontiers in Applied Mathematics.
blur: Atmospheric Blur with Gaussian PSF (Hansen), Again with Noise
[Figures: solution on the left, degraded image on the right.]
Solutions using x_0 = 0, generalized Tikhonov with the second derivative, 5% noise.