Math 4/896: Seminar in Mathematics
Topic: Inverse Theory
Instructor: Thomas Shores, Department of Mathematics
Lecture 25, April 13, 2006
AvH 10

Topics: Chapter 6: Iterative Methods (A Brief Discussion); Chapter 7: Additional Regularization Techniques; Chapter 9: Nonlinear Regression
Image Recovery

Problem:
An image is blurred and we want to sharpen it. Let the intensity function $I_{\text{true}}(x,y)$ define the true image and $I_{\text{blurred}}(x,y)$ define the blurred image.
A typical model results from convolving the true image with a Gaussian point spread function:
$$I_{\text{blurred}}(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} I_{\text{true}}(x-u,\, y-v)\, \Psi(u,v)\, du\, dv,$$
where $\Psi(u,v) = e^{-(u^2+v^2)/(2\sigma^2)}$.
Think about discretizing this over an SVGA image (1024 × 768).
But the discretized matrix should be sparse!
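A minimal sketch, not from the lecture, of how such a blurring operator could be discretized as a sparse matrix in Python (the image size, sigma, and truncation radius below are my own illustrative assumptions): truncating the Gaussian PSF at a few standard deviations means each blurred pixel depends only on nearby true pixels, which is exactly why the discretized matrix is sparse.

```python
import numpy as np
from scipy import sparse

def gaussian_blur_matrix(nx, ny, sigma=2.0, cutoff=3.0):
    """Build a sparse matrix G so that G @ img.ravel() is the blurred image.

    The Gaussian PSF is truncated at `cutoff` standard deviations, so each
    row of G has only a few nonzeros, which is why G is sparse.
    """
    r = int(np.ceil(cutoff * sigma))              # truncation radius in pixels
    u = np.arange(-r, r + 1)
    psf = np.exp(-(u[:, None]**2 + u[None, :]**2) / (2.0 * sigma**2))
    psf /= psf.sum()                              # preserve total intensity

    rows, cols, vals = [], [], []
    for iy in range(ny):
        for ix in range(nx):
            row = iy * nx + ix
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    jx, jy = ix + dx, iy + dy
                    if 0 <= jx < nx and 0 <= jy < ny:
                        rows.append(row)
                        cols.append(jy * nx + jx)
                        vals.append(psf[dy + r, dx + r])
    return sparse.csr_matrix((vals, (rows, cols)), shape=(nx * ny, nx * ny))

# Even a small 64 x 64 test image gives a large matrix that is mostly zeros;
# a full 1024 x 768 image would make dense storage completely impractical.
G = gaussian_blur_matrix(64, 64, sigma=2.0)
print(G.shape, "nonzeros:", G.nnz)
```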
Sparse Matrices and Iterative Methods

Sparse Matrix:
A matrix with sufficiently many zeros that we should pay attention to them.

There are efficient ways of storing such matrices and doing linear algebra on them.
Given a problem Ax = b with A sparse, iterative methods become attractive because they usually only require storage of A, x and some auxiliary vectors, together with the saxpy, gaxpy, and dot algorithms (scalar a*x+y, general A*x+y, dot product).
Classical methods: Jacobi, Gauss-Seidel, Gauss-Seidel SOR, and conjugate gradient.
Methods especially useful for tomographic problems: Kaczmarz's method, ART (algebraic reconstruction technique).
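A small illustration of the storage point, assuming a simple SPD tridiagonal test matrix (my choice, not an example from the text): scipy.sparse stores only the nonzeros, and the conjugate gradient solver touches A only through matrix-vector products plus saxpy-style vector updates and dot products.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import cg

n = 10_000
# SPD tridiagonal test matrix: only about 3n nonzeros are ever stored.
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sparse.diags([off, main, off], offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

# Conjugate gradient needs only products A @ v (a sparse gaxpy), saxpy
# updates, and dot products on a handful of auxiliary vectors.
x, info = cg(A, b, maxiter=500)
print("cg info:", info, "residual norm:", np.linalg.norm(b - A @ x))
```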
Yet Another Regularization Idea

To regularize in the face of iteration:
Use the number of iteration steps taken as a regularization parameter.

Conjugate gradient methods are designed to work with SPD coefficient matrices A in the equation Ax = b. So in the unregularized least squares problem $G^T G m = G^T d$, take $A = G^T G$ and $b = G^T d$, resulting in the CGLS method, in which we avoid explicitly computing $G^T G$.
Key fact: in exact arithmetic, if we start at $m^{(0)} = 0$, then $\|m^{(k)}\|$ is monotone increasing in k and $\|G m^{(k)} - d\|$ is monotonically decreasing in k. So we can make an L-curve in terms of k.

Do Example 6.3 from the CD. Change the startup path to Examples/chap6/examp3.
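A hedged sketch of the CGLS idea just described (my own minimal implementation, not the textbook's code; the random test problem is an assumption): conjugate gradient is applied to the normal equations using only products with G and G^T, so G^T G is never formed, and the model and residual norms are recorded at each step so an L-curve in k can be drawn.

```python
import numpy as np

def cgls(G, d, niter):
    """CGLS: CG on the normal equations without forming G.T @ G explicitly.

    Returns the final iterate plus the model norms ||m^(k)|| and residual
    norms ||G m^(k) - d||, so an L-curve can be plotted against k.
    """
    m = np.zeros(G.shape[1])
    r = d - G @ m                       # data residual
    s = G.T @ r                         # residual of the normal equations
    p = s.copy()
    gamma = s @ s
    model_norms, resid_norms = [], []
    for k in range(niter):
        q = G @ p
        alpha = gamma / (q @ q)
        m += alpha * p
        r -= alpha * q
        s = G.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
        model_norms.append(np.linalg.norm(m))   # monotone increasing in k
        resid_norms.append(np.linalg.norm(r))   # monotone decreasing in k
    return m, model_norms, resid_norms

# Illustrative use on a randomly generated ill-conditioned system.
rng = np.random.default_rng(0)
G = rng.standard_normal((200, 100)) @ np.diag(1.0 / np.arange(1, 101) ** 2)
d = G @ np.ones(100) + 1e-3 * rng.standard_normal(200)
m, mn, rn = cgls(G, d, niter=30)
```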
Outline (Chapter 7: Additional Regularization Techniques)
7.1: Using Bounds as Constraints
7.2: Maximum Entropy Regularization
7.3: Total Variation
7.1: Using Bounds as Constraints

Regularization...Sort Of

Basic Idea:
Use prior knowledge about the nature of the solution to restrict it.

The most common restrictions are on the magnitude of the parameter values, which leads to the problem:
Minimize f(m) subject to l ≤ m ≤ u.
One could choose $f(m) = \|Gm - d\|_2$ (BVLS, bounded variables least squares).
One could choose $f(m) = c^T m$ with the additional constraint $\|Gm - d\|_2 \le \delta$.
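For the first choice above, a minimal sketch using SciPy's general-purpose bounded least squares routine (the test data and bounds are my assumptions; this is not the book's BVLS code):

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(1)
G = rng.standard_normal((50, 20))
m_true = np.clip(rng.standard_normal(20), 0.0, 1.0)
d = G @ m_true + 0.01 * rng.standard_normal(50)

# Bounded-variable least squares: minimize ||G m - d||_2 with 0 <= m <= 1.
res = lsq_linear(G, d, bounds=(0.0, 1.0))
print("within bounds:", np.all((res.x >= 0.0) & (res.x <= 1.0)))
```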
Example 3.3: Contaminant Transport

Problem:
Let C(x, t) be the concentration of a pollutant at point x in a linear stream at time t, where 0 ≤ x < ∞ and 0 ≤ t ≤ T. The defining model is
$$\frac{\partial C}{\partial t} = D \frac{\partial^2 C}{\partial x^2} - v \frac{\partial C}{\partial x},$$
$$C(0, t) = C_{\text{in}}(t), \qquad C(x, t) \to 0 \text{ as } x \to \infty, \qquad C(x, 0) = C_0(x).$$

Solution:
In the case that $C_0(x) \equiv 0$, the explicit solution is
$$C(x, T) = \int_0^T C_{\text{in}}(t)\, f(x, T - t)\, dt,$$
where
$$f(x, \tau) = \frac{x}{2\sqrt{\pi D \tau^3}}\, e^{-(x - v\tau)^2/(4D\tau)}.$$

Inverse Problem:
Given simultaneous measurements at time T, estimate the contaminant inflow history. That is, given data $d_i = C(x_i, T)$, i = 1, 2, ..., m, estimate $C_{\text{in}}(t)$, 0 ≤ t ≤ T.

Change the startup path to Examples/chap7/examp1 and execute examp.
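A sketch of how the forward map could be discretized (my own illustration; the values of D, v, T, the time grid, and the measurement locations are assumptions): applying a midpoint quadrature rule to the integral turns the inverse problem into a discrete linear system d = G m with m_j ≈ C_in(t_j).

```python
import numpy as np

D, v, T = 1.0, 1.0, 5.0             # assumed diffusion, velocity, final time

def kernel(x, tau):
    """f(x, tau) from the explicit solution above; tau > 0 assumed."""
    return x / (2.0 * np.sqrt(np.pi * D * tau**3)) * \
        np.exp(-(x - v * tau) ** 2 / (4.0 * D * tau))

# Midpoint rule in t and a handful of assumed measurement locations x_i.
n = 100
t = (np.arange(n) + 0.5) * (T / n)           # midpoints of the time grid
dt = T / n
x_meas = np.linspace(1.0, 10.0, 25)

# G[i, j] = f(x_i, T - t_j) * dt, so that d = G @ m with m_j ~ C_in(t_j).
G = np.array([[kernel(xi, T - tj) * dt for tj in t] for xi in x_meas])
print(G.shape)                               # (25, 100): underdetermined, as usual
```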
7.2: Maximum Entropy Regularization

A Better Idea (?)

Entropy:
$$E(m) = -\sum_{j=1}^{n} m_j \ln(w_j m_j), \quad w \text{ a vector of positive weights.}$$

Motivated by Shannon's information theory and Boltzmann's theory of entropy in statistical mechanics: a measure of uncertainty about which message or physical state will occur.
Shannon's entropy function for a probability distribution $\{p_i\}_{i=1}^{n}$ is
$$H(p) = -\sum_{i=1}^{n} p_i \ln(p_i).$$
Bayesian Maximum Entropy Principle: the least biased model is the one that maximizes entropy subject to constraints of testable information, such as bounds or average values of the parameters.
Maximum Entropy Regularization

Maximize Entropy:
That is, our version. So the problem looks like:
Maximize $-\sum_{j=1}^{n} m_j \ln(w_j m_j)$ subject to $\|Gm - d\|_2 \le \delta$ and $m \ge 0$.
In the absence of extra information, take $w_i = 1$.
Lagrange multipliers give:
Minimize $\|Gm - d\|_2^2 + \alpha^2 \sum_{j=1}^{n} m_j \ln(w_j m_j)$ subject to $m \ge 0$.

Change the startup path to Examples/chap7/examp2 and execute examp.
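A hedged sketch of the Lagrangian form above using a general-purpose bounded minimizer (not the CD's code; the test data, the value of alpha, and the weights are assumptions, and a small positive lower bound keeps the logarithm defined):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
G = np.abs(rng.standard_normal((40, 20)))
m_true = np.abs(rng.standard_normal(20))
d = G @ m_true + 0.01 * rng.standard_normal(40)

alpha, w = 0.1, np.ones(20)        # assumed regularization parameter and weights

def objective(m):
    # ||G m - d||_2^2 + alpha^2 * sum_j m_j ln(w_j m_j)
    return np.sum((G @ m - d) ** 2) + alpha**2 * np.sum(m * np.log(w * m))

eps = 1e-12                        # keep m strictly positive so the log is defined
res = minimize(objective, x0=np.full(20, 1.0), method="L-BFGS-B",
               bounds=[(eps, None)] * 20)
print("objective at solution:", res.fun)
```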
7.3: Total Variation

TV Regularization

We only consider total variation regularization from this section.

Regularization term:
$$TV(m) = \sum_{j=1}^{n-1} |m_{j+1} - m_j| = \|Lm\|_1,$$
where L is the matrix used in first order Tikhonov regularization.
The problem becomes: minimize $\|Gm - d\|_2^2 + \alpha \|Lm\|_1$.
Better yet: minimize $\|Gm - d\|_1 + \alpha \|Lm\|_1$.
Equivalently: minimize $\left\| \begin{bmatrix} G \\ \alpha L \end{bmatrix} m - \begin{bmatrix} d \\ 0 \end{bmatrix} \right\|_1$.
Now just use IRLS (iteratively reweighted least squares) to solve it and an L-curve of sorts to find the optimal α.
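A minimal IRLS sketch for the stacked 1-norm problem (my own, under the assumptions that L is the first-order difference matrix, alpha is fixed rather than chosen from an L-curve, and a small epsilon guards the weights): each pass solves a weighted least squares problem whose weights 1/|r_i| reproduce the 1-norm objective.

```python
import numpy as np

def irls_1norm(A, b, niter=30, eps=1e-6):
    """IRLS for min ||A m - b||_1 via a sequence of weighted least squares."""
    m = np.linalg.lstsq(A, b, rcond=None)[0]             # ordinary LS start
    for _ in range(niter):
        r = A @ m - b
        sw = np.sqrt(1.0 / np.maximum(np.abs(r), eps))   # sqrt of IRLS weights
        m = np.linalg.lstsq(A * sw[:, None], b * sw, rcond=None)[0]
    return m

# Assumed test problem: random G, a piecewise-constant model, first differences.
rng = np.random.default_rng(3)
n = 50
G = rng.standard_normal((80, n))
m_step = np.where(np.arange(n) < n // 2, 0.0, 1.0)       # step discontinuity
d = G @ m_step + 0.01 * rng.standard_normal(80)

L = np.eye(n, k=1)[:-1] - np.eye(n)[:-1]                 # (n-1) x n difference matrix
alpha = 1.0                                              # assumed; pick via an L-curve
A = np.vstack([G, alpha * L])
b = np.concatenate([d, np.zeros(n - 1)])
m_tv = irls_1norm(A, b)
```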
Total Variation

Key Property:
TV regularization doesn't smooth discontinuities as much as Tikhonov regularization does.

Change the startup path to Examples/chap7/examp3 and execute examp.
Outline (Chapter 9: Nonlinear Regression)
Newton's Method
Gauss-Newton and Levenberg-Marquardt Methods
Newton's Method

Basic Problems

Root Finding:
Solve the system of equations represented in vector form as $F(x) = 0$ for point(s) $x^*$ for which $F(x^*) = 0$. Here $F(x) = (f_1(x), \ldots, f_m(x))$ and $x = (x_1, \ldots, x_m)$.
Gradient notation:
$$\nabla f_j(x) = \left( \frac{\partial f_j}{\partial x_1}(x), \ldots, \frac{\partial f_j}{\partial x_m}(x) \right).$$
Jacobian notation:
$$\nabla F(x) = [\nabla f_1(x), \ldots, \nabla f_m(x)]^T = \left[ \frac{\partial f_i}{\partial x_j} \right]_{i,j=1,\ldots,m}.$$
Basic Problems

Optimization:
Find the minimum value of the scalar-valued function f(x), where x ranges over a feasible set Ω.
Set
$$F(x) = \nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_m}(x) \right).$$
Hessian of f:
$$\nabla(\nabla f(x)) \equiv \nabla^2 f(x) = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right].$$
Taylor Theorems

First Order
Suppose that $f: \mathbb{R}^n \to \mathbb{R}$ has continuous second partials and $x^*, x \in \mathbb{R}^n$. Then
$$f(x) = f(x^*) + \nabla f(x^*)^T (x - x^*) + O\left(\|x - x^*\|^2\right) \text{ as } x \to x^*.$$

Second Order
Suppose that $f: \mathbb{R}^n \to \mathbb{R}$ has continuous third partials and $x^*, x \in \mathbb{R}^n$. Then
$$f(x) = f(x^*) + \nabla f(x^*)^T (x - x^*) + \frac{1}{2}(x - x^*)^T \nabla^2 f(x^*) (x - x^*) + O\left(\|x - x^*\|^3\right) \text{ as } x \to x^*.$$

(See Appendix C for versions of Taylor's theorem with weaker hypotheses.)
Newton Algorithms

Root Finding
Input: F, ∇F, x^0, N_max
for k = 0, ..., N_max
    x^{k+1} = x^k − ∇F(x^k)^{−1} F(x^k)
    if x^{k+1}, x^k pass a convergence test
        return(x^{k+1})
    end
end
return(x^{N_max})
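A runnable version of the pseudocode above (a sketch; the convergence test, tolerance, and the circle/hyperbola test system are my assumptions), solving a linear system at each step rather than forming the inverse Jacobian:

```python
import numpy as np

def newton(F, JF, x0, nmax=50, tol=1e-10):
    """Newton's method for F(x) = 0, given the Jacobian JF(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(nmax):
        step = np.linalg.solve(JF(x), F(x))        # solve instead of inverting
        x_new = x - step
        if np.linalg.norm(x_new - x) <= tol * (1.0 + np.linalg.norm(x)):
            return x_new
        x = x_new
    return x                                       # last iterate if no convergence

# Assumed test problem: intersect a circle with a hyperbola.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0])
JF = lambda x: np.array([[2 * x[0], 2 * x[1]], [x[1], x[0]]])
print(newton(F, JF, [2.0, 0.5]))
```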
Convergence Result

Theorem
Let $x^*$ be a root of the equation $F(x) = 0$, where F, x are m-vectors, F has continuous first partials in some neighborhood of $x^*$, and $\nabla F(x^*)$ is non-singular. Then Newton's method yields a sequence of vectors that converges to $x^*$, provided that $x^0$ is sufficiently close to $x^*$. If, in addition, F has continuous second partials in some neighborhood of $x^*$, then the convergence is quadratic in the sense that for some constant $K > 0$,
$$\|x^{k+1} - x^*\| \le K \|x^k - x^*\|^2.$$
Newton for Optimization

Bright Idea:
We know from calculus that where f(x) has a local minimum, ∇f = 0.

So just let $F(x) = \nabla f(x)$ and use Newton's method. The result is the iteration formula
$$x^{k+1} = x^k - \nabla^2 f(x^k)^{-1} \nabla f(x^k).$$
We can turn this approach on its head: root finding is just a special case of optimization, i.e., solving $F(x) = 0$ is the same as minimizing $f(x) = \|F(x)\|^2$.
Downside of the root-finding point of view of optimization: saddle points and local maxima x also satisfy $\nabla f(x) = 0$.
Upside of the optimization view of root finding: if $F(x) = 0$ doesn't have a root, minimizing $f(x) = \|F(x)\|^2$ finds the next best solutions: least squares solutions!
In fact, the least squares problem for $\|Gm - d\|^2$ is optimization!
Remarks on Newton

About Newton:
This barely scratches the surface of optimization theory (take Math 4/833 if you can!!).

Far from a zero, Newton does not exhibit quadratic convergence. It is accelerated by a line search in the Newton direction $-\nabla F(x^k)^{-1} F(x^k)$ for a point that (approximately) minimizes a merit function like $m(x) = \|F(x)\|^2$.
Optimization is NOT a special case of root finding. There are special characteristics of the min f(x) problem that get lost if one only tries to find a zero of ∇f. For example, $-\nabla f$ is a search direction that leads to the method of steepest descent. This is not terribly efficient, but well understood.
There is an automatic merit function, namely f(x), in any search direction. Using this helps avoid saddle points and maxima.
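A small sketch illustrating the last two remarks (the quadratic test function and the Armijo backtracking parameters are assumptions, not from the lecture): steepest descent uses −∇f as the search direction and f itself as the merit function in a backtracking line search.

```python
import numpy as np

def steepest_descent(f, grad, x0, nmax=500, tol=1e-8):
    """Steepest descent with a backtracking (Armijo) line search on f."""
    x = np.asarray(x0, dtype=float)
    for _ in range(nmax):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = -g                                   # search direction
        t, c = 1.0, 1e-4
        # Backtrack until f decreases enough; f itself is the merit function.
        while f(x + t * d) > f(x) + c * t * (g @ d):
            t *= 0.5
        x = x + t * d
    return x

# Assumed test problem: an ill-conditioned quadratic, where steepest descent
# converges but slowly ("not terribly efficient, but well understood").
A = np.diag([1.0, 100.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
print(steepest_descent(f, grad, [1.0, 1.0]))
```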
Gauss-Newton and Levenberg-Marquardt

The Problem:
Given a function $F(x) = (f_1(x), \ldots, f_m(x))$, minimize
$$f(x) = \sum_{k=1}^{m} f_k(x)^2 = \|F(x)\|^2.$$

Newton's method can be very expensive, due to derivative evaluations.
For starters, one shows $\nabla f(x) = 2 (\nabla F(x))^T F(x)$.
Then $\nabla^2 f(x) = 2 (\nabla F(x))^T \nabla F(x) + Q(x)$, where $Q(x)$ contains all the second derivatives.
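A hedged sketch of where this derivation leads (my own minimal Levenberg-Marquardt loop, not the book's implementation; the exponential-fit test problem and the lambda update rule are assumptions): dropping Q(x) gives the Gauss-Newton approximation $2(\nabla F)^T \nabla F$ to the Hessian, and adding a damping term lambda*I gives Levenberg-Marquardt.

```python
import numpy as np

def levenberg_marquardt(F, J, x0, nmax=100, lam=1e-3, tol=1e-10):
    """Minimize ||F(x)||^2 using J^T J in place of the full Hessian (Q dropped).

    lam = 0 would give a plain Gauss-Newton step; lam > 0 damps the step
    (Levenberg-Marquardt), and lam is adjusted based on whether the step helps.
    """
    x = np.asarray(x0, dtype=float)
    cost = np.sum(F(x) ** 2)
    for _ in range(nmax):
        Jx, Fx = J(x), F(x)
        g = Jx.T @ Fx                                  # half the gradient of ||F||^2
        step = np.linalg.solve(Jx.T @ Jx + lam * np.eye(x.size), -g)
        x_new = x + step
        cost_new = np.sum(F(x_new) ** 2)
        if cost_new < cost:                            # accept and trust the model more
            x, cost, lam = x_new, cost_new, lam * 0.5
            if np.linalg.norm(step) < tol:
                break
        else:                                          # reject and damp more heavily
            lam *= 10.0
    return x

# Assumed test problem: fit y = a * exp(b t) to noisy data via residuals r_i.
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t) + 0.01 * np.random.default_rng(4).standard_normal(20)
F = lambda p: p[0] * np.exp(p[1] * t) - y
J = lambda p: np.column_stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)])
print(levenberg_marquardt(F, J, [1.0, 0.0]))
```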
Comment 5.