Training Many-Parameter Shape-from-Shading Models Using a Surface Database

Nazar Khan¹, Lam Tran², Marshall Tappen¹
¹ University of Central Florida
² University of California-San Diego
Workshop on 3-D Digital Imaging and Modeling, 2009
Introduction
- Shape-from-shading (SFS) attempts to reconstruct the shape of a 3D object from its shading in a 2D image.
- SFS methods tend to use few parameters to avoid hand-tuning many parameters.
- This limits the number of different cues that the SFS problem can exploit.
Basic SFS Model
- From the observed 2D image o, compute the optimal shape z* as
  z* = arg min_z E(z, o)
  where
  E(z, o) = Σ_{p=1}^{N_p} E_p
  is a sum of individual per-pixel energy functions E_p.
- Each of the functions E_1 ... E_{N_p} models the relationship between that pixel's intensity and possible surface gradients.
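As a minimal sketch of this structure (with an illustrative quadratic data term standing in for the actual shading energy, and a 1D "image" for brevity; this is not the paper's reflectance model):

```python
import numpy as np

o = np.array([0.2, 0.5, 0.9, 0.4])  # observed intensities at N_p pixels

def energy(z, o):
    """E(z, o) = sum_p E_p: per-pixel data terms plus neighbour smoothness."""
    data = (z - o) ** 2              # stand-in for the per-pixel shading term
    smooth = (z[1:] - z[:-1]) ** 2   # encourages a smooth surface
    return data.sum() + smooth.sum()

def energy_grad(z, o):
    g = 2.0 * (z - o)
    g[:-1] -= 2.0 * (z[1:] - z[:-1])
    g[1:] += 2.0 * (z[1:] - z[:-1])
    return g

# z* = arg min_z E(z, o) via plain gradient descent
z = np.zeros_like(o)
for _ in range(2000):
    z = z - 0.1 * energy_grad(z, o)
```

The reconstruction z trades off fidelity to the observed intensities against smoothness, exactly the structure the weighted model below modifies.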
Intensity-Gradient Relationship
[Figures: a hemisphere, its reflection map, and the reflection map in gradient space]
- A hemisphere contains all possible surface gradients.
- Therefore, a hemisphere rendered from a known illumination vector results in all possible image intensities for that illumination.
- There is a one-to-many relationship between intensity and gradients.
Intensity-Gradient Relationship
- Different intensities lead to different isophotes that can be modelled reliably or not so reliably.
- So why not exploit intensity-based confidence terms in the energy function?
Intensity-based Weights
- Since some intensities are worse than others, reduce their contribution towards the final surface:
  E(z, o) = Σ_{p=1}^{N_p} e^{w(I_p)} E_p
  where
  - I_p is the intensity at pixel p,
  - w(I_p) is the intensity-based weight, and
  - the exponentiation e^{(·)} ensures positive weightings.
- Potentially 256 parameters for grey-scale images.
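A sketch of the weighted energy, assuming the weights are stored as a 256-entry lookup table indexed by grey level (all variable names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(256)                    # one learnable weight per grey level
I = rng.integers(0, 256, size=100)   # intensities I_p at N_p = 100 pixels
E_p = rng.random(100)                # per-pixel energies (placeholders)

# E(z, o) = sum_p e^{w(I_p)} E_p  --  exponentiation keeps every weight positive
weights = np.exp(w[I])
E = float(np.sum(weights * E_p))
```

With all weights at zero, e^{w(I_p)} = 1 and the model reduces to the unweighted energy; learning moves the weights away from zero where an intensity is unreliable.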
Intensity-based Weights
- Too many parameters implies that hand-tuning is not an option.
- Contribution: show how multiple parameters can be automatically learned for the SFS problem.
Our Formulation for Learning – Step 1
- Define a loss function L(z, t) that measures the quality of a reconstruction z against the ground truth t. For example,
  - L(z, t) = ‖z − t‖², or
  - L(z, t) = Σ_{i=1}^{N_p} (1 − n_i^z · n_i^t),
  where n_i^z is the normalized normal vector to the surface z at pixel i.
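Both candidate losses can be sketched in a few lines; computing height-field normals via `np.gradient` and the smooth test surface below are illustrative assumptions:

```python
import numpy as np

def normals(z):
    """Unit surface normals of a height field z: n proportional to (-z_x, -z_y, 1)."""
    zx, zy = np.gradient(z)
    n = np.dstack([-zx, -zy, np.ones_like(z)])
    return n / np.linalg.norm(n, axis=2, keepdims=True)

def l2_loss(z, t):
    # L(z, t) = ||z - t||^2
    return np.sum((z - t) ** 2)

def normal_loss(z, t):
    # L(z, t) = sum_i (1 - n_i^z . n_i^t)
    return np.sum(1.0 - np.sum(normals(z) * normals(t), axis=2))

t = np.outer(np.hanning(16), np.hanning(16))  # smooth ground-truth bump
z = t + 0.01                                  # reconstruction shifted vertically
```

The two losses can disagree: the vertical shift leaves the normals unchanged, so `normal_loss(z, t)` is essentially zero while `l2_loss(z, t)` is not.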
Our Formulation for Learning – Step 2
- With the loss function defined, change the weight parameters w such that the optimal surface z* obtained through
  z*(w) = arg min_z E(z, o; w)
  minimizes the loss.
- That is, the optimal parameters are given by
  w* = arg min_w L(z*(w), t).
- Ideally, compute ∂L/∂w and perform gradient descent to obtain w*.
- But ...
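For intuition, here is a scalar toy version of this bilevel setup in which z*(w) has a closed form (the energy and numbers are invented for illustration, not taken from the paper):

```python
import numpy as np

o, t = 1.0, 0.8      # toy observation and ground truth

def z_star(w):
    # For E(z, o; w) = e^w (z - o)^2 + z^2, the minimiser is closed-form:
    s = np.exp(w)
    return s * o / (s + 1.0)

def loss(w):
    return (z_star(w) - t) ** 2       # L(z*(w), t)

def dloss_dw(w):
    s = np.exp(w)
    dz_dw = o * s / (s + 1.0) ** 2    # d z*(w) / d w
    return 2.0 * (z_star(w) - t) * dz_dw

# w* = arg min_w L(z*(w), t) via gradient descent on the weight
w = 0.0
for _ in range(500):
    w = w - 0.5 * dloss_dw(w)
```

Gradient descent on w drives the reconstructed z*(w) to match the ground truth; the difficulty addressed next is that, in the real model, z*(w) has no closed form.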
Our Formulation for Learning – Step 3
- The loss L(z*(w), t) depends on z*, which in turn depends on w.
- So ∂L/∂w = (∂L/∂z*)(∂z*/∂w).
- Problem:
  - ∂z*/∂w = ∂(arg min_z E(z, o; w))/∂w, but arg min is non-differentiable with respect to w.
- Solution:
  - use an upper bound Ê of E so that
  - the approximate minimum ẑ* becomes differentiable with respect to w.
Upper-bounding the Energy Function E
- E is of the form −log(Σ_i e^{−q_i}), where the q_i's are quadratic terms.
- Using Jensen's inequality,
  −log(Σ_i e^{−q_i}) ≤ −Σ_i log e^{−q_i} = Σ_i q_i.
- So the upper bound is quadratic, and it is also a tight upper bound.
1D Example
y = −log(Σ_{i∈A} exp(−(x − i)²)), where A = {−7, −3, 1, 4, 7}.
[Figure slides: plots of y(x) and successive quadratic upper bounds]
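The bound can be checked numerically for this example (a quick sanity check of the inequality above, nothing more):

```python
import numpy as np

A = np.array([-7.0, -3.0, 1.0, 4.0, 7.0])

def y(x):
    # y(x) = -log( sum_{i in A} exp(-(x - i)^2) )
    return -np.log(np.sum(np.exp(-(x - A) ** 2)))

def y_bound(x):
    # Quadratic upper bound: sum_i q_i with q_i = (x - i)^2
    return np.sum((x - A) ** 2)

xs = np.linspace(-10.0, 10.0, 401)
gap = np.array([y_bound(x) - y(x) for x in xs])
```

The gap is non-negative everywhere on the grid, confirming that the quadratic bound sits above y.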
Computing ∂ẑ*/∂w
- ẑ^n = f(ẑ^{n−1}, g(w)), where g(w) = e^w.
- Using the chain rule,
  ∂ẑ^n/∂w = (∂ẑ^n/∂ẑ^{n−1})(∂ẑ^{n−1}/∂w) + (∂ẑ^n/∂g(w))(∂g(w)/∂w),
  where the first term is the indirect effect and the second the direct effect.
- ∂ẑ^{n−1}/∂w expands the same way, and the recursion bottoms out at ∂ẑ^0/∂w = 0.
- This method of learning parameters is termed Variational Mode Learning¹.

¹ M. Tappen, Utilizing Variational Optimization to Learn Markov Random Fields, CVPR 2007.
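The unrolled chain rule can be sketched on a scalar toy update (the function f below is invented for illustration; only the recursion structure mirrors the slide). For this f the fixed point is ẑ = g(w), so ∂ẑ*/∂w should approach e^w:

```python
import numpy as np

def g(w):
    return np.exp(w)          # g(w) = e^w, so dg/dw = e^w

def f(z_prev, gw):
    # Illustrative update with fixed point z = gw (not the paper's update).
    return (z_prev + gw ** 2) / (1.0 + gw)

def z_and_grad(w, n_steps=30):
    z, dz_dw = 0.0, 0.0       # dz^0/dw = 0 starts the recursion
    gw = g(w)
    dg_dw = gw
    for _ in range(n_steps):
        df_dz = 1.0 / (1.0 + gw)                         # indirect path
        df_dg = (2.0 * gw * (1.0 + gw) - (z + gw ** 2)) / (1.0 + gw) ** 2  # direct path
        dz_dw = df_dz * dz_dw + df_dg * dg_dw            # chain rule, one step
        z = f(z, gw)
    return z, dz_dw

w = 0.3
z_hat, dz_dw = z_and_grad(w)

# Check the unrolled gradient against central finite differences.
eps = 1e-6
fd = (z_and_grad(w + eps)[0] - z_and_grad(w - eps)[0]) / (2.0 * eps)
```

Propagating the derivative alongside the iterate, rather than differentiating arg min directly, is the essence of the unrolling.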
Summary of the Method
- Define a loss function L.
- Optimize multiple parameters w so as to minimize the loss of the optimally reconstructed surface z*.
- Use the approximate ẑ* to compute ∂ẑ*/∂w and obtain ∂L/∂w.
- Train for w using a database of image-shape pairs.
Experiments on Synthetic Surfaces
- Training sets for SFS are not readily available.
- We generated smooth synthetic surfaces: 64 for training and 128 for testing.
- Fix the illumination vector and learn the weighting parameters w using the training set.
Experiments on Synthetic Surfaces
[Figures: ground-truth surfaces alongside reconstructions without and with learning]
Many Parameters vs. One Parameter
- Computed the optimal value of the smoothness parameter on the training set.
- Computed the testing loss using the optimal smoothness parameter.
- Computed the testing loss using the optimal smoothness parameter as well as the learned parameters.
- Result: a 29% decrease in loss.
- So many-parameter learning provides a benefit on top of using the optimal smoothness parameter value.
Experiments on Human Faces
- Database of 6 laser-scanned human faces.
- Leave-one-out cross-validation.
- Average improvement of 28% due to learning.
Experiments on Human Faces
[Figures: ground-truth faces alongside reconstructions without and with learning]
Limitations
- Quantitative improvement does not imply equal qualitative improvement.
  - Need perceptually more accurate 3D surface quality metrics.
- Vertical lighting is problematic.
- Simultaneous learning of the whole system, i.e. weights + isophote modelling parameters.
Conclusion
- Novel learning-based approach to SFS.
- Potential to enable significant innovations on SFS problems because of its ability to search over large parameter spaces in an automated fashion.
- Good results for synthetic surfaces with sufficient training data.
Thank You