Multiple View Geometry in Computer Vision

3D Urban Modeling
Marc Pollefeys
Jan-Michael Frahm, Philippos Mordohai
Fall 2006 / Comp 790-089
Tue & Thu 14:00-15:15
SN252
3D Urban Modeling
• Basics: sensor, techniques, …
• Video, LIDAR, GPS, IMU, …
• SfM, Stereo, …
• Robot Mapping, GIS, …
• Review state-of-the-art
• Video and LIDAR-based systems
• Projects to experiment
• Virtual UNC-Chapel Hill
• DARPA Urban Challenge
• …
3D Urban Modeling course schedule
(tentative)
Introduction
Video-based Urban 3D Capture
Sep. 12, 14
Cameras
Epi. Geom. and Triangulation
Sep. 19, 21
Feature Tracking & Matching
Stereo matching
(Sudipta)
(Philippos)
GPS/INS/LIDAR
Project proposals
Sep. 5, 7
Sep. 26, 28
(Brian)
(Jan-Michael)
Oct. 3, 5
Oct. 10, 12
Oct. 17, 19
Oct. 24, 26
(fall break)
Project Status Update
Oct. 31, Nov. 2
Nov. 7, 9
Nov. 14, 16
Nov. 21, 23
(Thanksgiving)
Nov. 28, 30
Dec. 5
Project demonstrations
Note: Dec. 3 is CVPR deadline
(classes ended)
Course material
Slides, notes, papers and references
see course webpage/wiki (later)
On-line “shape-from-video” tutorial:
http://www.cs.unc.edu/~marc/tutorial.pdf
http://www.cs.unc.edu/~marc/tutorial/
Epipolar geometry?
courtesy Frank Dellaert
Epipolar geometry
Underlying structure in set of matches for rigid scenes
[Figure: two views with camera centers C1, C2, 3D point M, image points m1, m2, epipoles e1, e2, epipolar lines l1, l2, rays L1, L2 and epipolar plane Π]
$m_2^T F m_1 = 0$
Fundamental matrix
(3x3 rank 2 matrix)
1. Computable from corresponding points
2. Simplifies matching
3. Allows to detect wrong matches
4. Related to calibration
Canonical representation:
$P = [I \mid 0], \quad P' = [[e']_\times F + e' v^T \mid \lambda e']$
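The canonical representation can be sketched in NumPy as follows. This is a minimal illustration that assumes the simplest choice v = 0 and λ = 1; the helper names `skew` and `canonical_cameras` are hypothetical and not part of the course material.

```python
import numpy as np

def skew(v):
    """Return the 3x3 cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def canonical_cameras(F):
    """Camera pair P = [I | 0], P' = [[e']_x F | e'] for a rank-2 F.

    Assumes v = 0 and lambda = 1 in the general canonical form.
    """
    # e' spans the left null space of F (e'^T F = 0): it is the left
    # singular vector belonging to the zero singular value.
    U, _, _ = np.linalg.svd(F)
    e2 = U[:, 2]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(e2) @ F, e2[:, None]])
    return P1, P2
```

Any 3D point projected with these two cameras gives an image pair that satisfies the epipolar constraint for F, which is a quick sanity check for the construction.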
Epipolar geometry
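[Figure: epipolar plane Π through C1, C2 and M; epipolar lines l1, l2 through epipoles e1, e2 and image points x1, x2; rays L1, L2]
$l_1 = e_1 \times x_1$
$\Pi = P_1^T l_1 = P_2^T l_2$
$l_2 = P_2^{T+} P_1^T l_1$ (+ denotes the pseudo-inverse)
$l_2^T x_2 = 0$
$x_2^T \underbrace{P_2^{T+} P_1^T [e_1]_\times}_{F} x_1 = 0$
Fundamental matrix
(3x3 rank 2 matrix)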
The projective reconstruction theorem
If a set of point correspondences in two views determine the
fundamental matrix uniquely, then the scene and cameras
may be reconstructed from these correspondences alone,
and any two such reconstructions from these
correspondences are projectively equivalent
allows reconstruction from pair of uncalibrated images!
Computation of F
• Linear (8-point)
• Minimal (7-point)
• Robust (RANSAC)
• Non-linear refinement (MLE, …)
• Practical approach
Epipolar geometry: basic equation
$x'^T F x = 0$
$x'x f_{11} + x'y f_{12} + x' f_{13} + y'x f_{21} + y'y f_{22} + y' f_{23} + x f_{31} + y f_{32} + f_{33} = 0$
separate known from unknown
$[x'x, x'y, x', y'x, y'y, y', x, y, 1]\,[f_{11}, f_{12}, f_{13}, f_{21}, f_{22}, f_{23}, f_{31}, f_{32}, f_{33}]^T = 0$
(data) (unknowns) (linear)
$\begin{bmatrix}
x'_1 x_1 & x'_1 y_1 & x'_1 & y'_1 x_1 & y'_1 y_1 & y'_1 & x_1 & y_1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x'_n x_n & x'_n y_n & x'_n & y'_n x_n & y'_n y_n & y'_n & x_n & y_n & 1
\end{bmatrix} f = 0$
$Af = 0$
the NOT normalized 8-point algorithm
$\begin{bmatrix}
x_1 x'_1 & y_1 x'_1 & x'_1 & x_1 y'_1 & y_1 y'_1 & y'_1 & x_1 & y_1 & 1 \\
x_2 x'_2 & y_2 x'_2 & x'_2 & x_2 y'_2 & y_2 y'_2 & y'_2 & x_2 & y_2 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x_n x'_n & y_n x'_n & x'_n & x_n y'_n & y_n y'_n & y'_n & x_n & y_n & 1
\end{bmatrix}
\begin{bmatrix} f_{11} \\ f_{12} \\ f_{13} \\ f_{21} \\ f_{22} \\ f_{23} \\ f_{31} \\ f_{32} \\ f_{33} \end{bmatrix} = 0$
Column magnitudes: ~10000 ~10000 ~100 ~10000 ~10000 ~100 ~100 ~100 1
Orders of magnitude difference between columns of data matrix!
→ least-squares yields poor results
the normalized 8-point algorithm
Transform image to ~[-1,1]x[-1,1]
$\begin{bmatrix} 2/700 & 0 & -1 \\ 0 & 2/500 & -1 \\ 0 & 0 & 1 \end{bmatrix}$
Image corners (0,0), (700,0), (0,500), (700,500) map to (-1,-1), (1,-1), (-1,1), (1,1)
normalized least squares yields good results
(Hartley, PAMI´97)
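A minimal NumPy sketch of the normalized linear estimate, under stated assumptions: the normalization maps a `width` x `height` image to roughly [-1,1]x[-1,1] as in the 700x500 example, and the function names are hypothetical. The rank-2 constraint is deliberately not enforced in this sketch.

```python
import numpy as np

def image_normalization(width, height):
    """Transform mapping [0,width]x[0,height] to [-1,1]x[-1,1]."""
    return np.array([[2.0 / width, 0.0, -1.0],
                     [0.0, 2.0 / height, -1.0],
                     [0.0, 0.0, 1.0]])

def eight_point(x1, x2, width, height):
    """Linear estimate of F with x2^T F x1 = 0 from n >= 8 matches.

    x1, x2: (n, 2) arrays of pixel coordinates in the first and second image.
    """
    T = image_normalization(width, height)
    h1 = np.column_stack([x1, np.ones(len(x1))]) @ T.T  # normalized x
    h2 = np.column_stack([x2, np.ones(len(x2))]) @ T.T  # normalized x'
    x, y = h1[:, 0], h1[:, 1]
    xp, yp = h2[:, 0], h2[:, 1]
    # One row per match: [x'x, x'y, x', y'x, y'y, y', x, y, 1]
    A = np.column_stack([xp * x, xp * y, xp, yp * x, yp * y, yp,
                         x, y, np.ones(len(x))])
    # f is the right singular vector with the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    Fn = Vt[-1].reshape(3, 3)
    # Undo the normalization: x'^T T^T Fn T x = 0
    F = T.T @ Fn @ T
    return F / np.linalg.norm(F)
```

The same transform is applied to both images here, which assumes both have the same size.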
the singularity constraint
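$e'^T F = 0, \quad F e = 0, \quad \det F = 0, \quad \operatorname{rank} F = 2$
SVD from linearly computed F matrix (rank 3)
$F = U \operatorname{diag}(\sigma_1, \sigma_2, \sigma_3) V^T = U_1\sigma_1V_1^T + U_2\sigma_2V_2^T + U_3\sigma_3V_3^T$
Compute closest rank-2 approximation
$\min \|F - F'\|_F$
$F' = U \operatorname{diag}(\sigma_1, \sigma_2, 0) V^T = U_1\sigma_1V_1^T + U_2\sigma_2V_2^T$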
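The closest rank-2 approximation in the Frobenius norm amounts to zeroing the smallest singular value. A short sketch, with the hypothetical function name `enforce_rank2`:

```python
import numpy as np

def enforce_rank2(F):
    """Closest rank-2 matrix to F in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(F)  # singular values sorted in descending order
    s[2] = 0.0                   # drop sigma_3
    return U @ np.diag(s) @ Vt
```

The Frobenius distance between F and the result equals the discarded singular value σ3.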
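the minimum case – 7 point correspondences
$\begin{bmatrix}
x'_1 x_1 & x'_1 y_1 & x'_1 & y'_1 x_1 & y'_1 y_1 & y'_1 & x_1 & y_1 & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
x'_7 x_7 & x'_7 y_7 & x'_7 & y'_7 x_7 & y'_7 y_7 & y'_7 & x_7 & y_7 & 1
\end{bmatrix} f = 0$
$A = U_{7\times7} \operatorname{diag}(\sigma_1, \ldots, \sigma_7, 0, 0) V_{9\times9}^T$
$\Rightarrow A [V_8 V_9] = 0_{7\times2}$ (e.g. $V^T V_8 = [0\,0\,0\,0\,0\,0\,0\,1\,0]^T$)
$x'^T_i (F_1 + \lambda F_2) x_i = 0, \quad \forall i = 1 \ldots 7$
one parameter family of solutions
but $F_1 + \lambda F_2$ not automatically rank 2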
the minimum case – impose rank 2
[Figure: F1, F2 and the solution F7pts]
(obtain 1 or 3 solutions)
$\det(F_1 + \lambda F_2) = a_3\lambda^3 + a_2\lambda^2 + a_1\lambda + a_0 = 0$ (cubic equation)
$\det(F_1 + \lambda F_2) = \det F_2 \det(F_2^{-1}F_1 + \lambda I) = 0$ (using $\det AB = \det A \cdot \det B$)
Compute possible λ as the negated eigenvalues of $F_2^{-1}F_1$
(only real solutions are potential solutions)
Minimal solution for calibrated cameras: 5-point
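The rank-2 step of the 7-point method can be sketched as below. Instead of the eigenvalue route, this sketch recovers the cubic's coefficients by sampling the determinant at four values of λ and then takes the real roots; that is an implementation choice made here for brevity. The function name `rank2_lambdas` is hypothetical.

```python
import numpy as np

def rank2_lambdas(F1, F2):
    """Real values of lambda for which det(F1 + lambda * F2) = 0."""
    # det(F1 + lambda*F2) is a cubic in lambda, so four samples determine it.
    samples = np.array([-1.0, 0.0, 1.0, 2.0])
    dets = [np.linalg.det(F1 + lam * F2) for lam in samples]
    coeffs = np.polyfit(samples, dets, 3)  # [a3, a2, a1, a0]
    roots = np.roots(coeffs)
    # Only real roots are potential solutions (1 or 3 of them).
    return sorted(r.real for r in roots if abs(r.imag) < 1e-9)
```

Each returned λ gives one candidate F = F1 + λF2, which would then be checked against the remaining matches.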
Robust estimation
• What if set of matches contains gross outliers?
(to keep things simple let’s consider line fitting first)
RANSAC
(RANdom SAmple Consensus)
Objective
Robust fit of model to data set S which contains outliers
Algorithm
(i) Randomly select a sample of s data points from S and
instantiate the model from this subset.
(ii) Determine the set of data points Si which are within a
distance threshold t of the model. The set Si is the
consensus set of samples and defines the inliers of S.
(iii) If the size of Si is greater than some threshold T, re-estimate the model using all the points in Si and terminate
(iv) If the size of Si is less than T, select a new subset and
repeat the above.
(v) After N trials the largest consensus set Si is selected, and
the model is re-estimated using all the points in the
subset Si
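Steps (i) to (v) can be sketched for the line-fitting case as follows. This is an illustrative sketch only: the function names, the total least-squares refit and the explicit random generator argument are assumptions made here, not part of the slides.

```python
import numpy as np

def fit_line(points):
    """Total least-squares line (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]  # direction of least variance
    return np.append(normal, -normal @ centroid)

def ransac_line(points, t, T, N, rng):
    """Robust line fit; returns (line, boolean inlier mask)."""
    best = np.zeros(len(points), dtype=bool)
    for _ in range(N):
        # (i) sample s = 2 points and instantiate the model
        sample = points[rng.choice(len(points), size=2, replace=False)]
        line = fit_line(sample)
        # (ii) consensus set: points within distance t of the line
        inliers = np.abs(points @ line[:2] + line[2]) < t
        # (iii) large enough consensus set: re-estimate and terminate
        if inliers.sum() > T:
            return fit_line(points[inliers]), inliers
        # (iv) otherwise remember the largest set and try a new sample
        if inliers.sum() > best.sum():
            best = inliers
    # (v) after N trials, re-estimate from the largest consensus set
    return fit_line(points[best]), best
```

A call might look like `ransac_line(points, t=0.1, T=45, N=500, rng=np.random.default_rng(1))` for a point set with about 50 expected inliers.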
Distance threshold
Choose t so probability for inlier is α (e.g. 0.95)
• Often empirically
• Zero-mean Gaussian noise σ then $d_\perp^2$ follows $\chi_m^2$ distribution with m=codimension of model
(dimension+codimension=dimension space)

Codimension   Model     t²
1             line, F   3.84σ²
2             H, P      5.99σ²
3             T         7.81σ²
How many samples?
Choose N so that, with probability p, at least one
random sample is free from outliers. e.g. p=0.99
1  1  e 
s N
 1 p

N  log 1  p  / log 1  1  e 
s

proportion of outliers e
s
2
3
4
5
6
7
8
5%
2
3
3
4
4
4
5
10% 20% 25% 30% 40% 50%
3
5
6
7
11
17
4
7
9
11
19
35
5
9
13
17
34
72
6
12
17
26
57
146
7
16
24
37
97
293
8
20
33
54 163 588
9
26
44
78 272 1177
Note: Assumes that inliers allow to identify other inliers
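The sample-count formula is easy to evaluate directly. A small sketch (the function name is hypothetical) that reproduces entries of the table for p=0.99:

```python
import math

def ransac_num_samples(p, e, s):
    """Smallest N such that, with probability p, at least one of N
    random samples of size s is free from outliers (outlier ratio e)."""
    if e <= 0.0:
        return 1  # no outliers: a single sample suffices
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - (1.0 - e) ** s))

print(ransac_num_samples(0.99, 0.50, 8))  # 8-point samples, 50% outliers
print(ransac_num_samples(0.99, 0.05, 2))  # line fitting, 5% outliers
```

The count grows quickly with the sample size s, which is one reason to prefer minimal solvers inside RANSAC.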
Acceptable consensus set?
• Typically, terminate when inlier ratio reaches
expected ratio of inliers
$T = (1 - e)n$
Adaptively determining
the number of samples
e is often unknown a priori, so pick worst case,
i.e. 0% inliers, and adapt if more inliers are found, e.g.
80% inliers would yield e=0.2
• N=∞, sample_count=0
• While N > sample_count repeat
  • Choose a sample and count the number of inliers
  • Set e = 1 - (number of inliers)/(total number of points)
  • Recompute N from e: $N = \log(1 - p) / \log(1 - (1 - e)^s)$
  • Increment the sample_count by 1
• Terminate
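The adaptive loop can be sketched as follows. The callback `count_inliers`, the `max_iters` safety cap and the use of the best inlier count seen so far when updating e are assumptions of this sketch, not details given on the slide.

```python
import math
import numpy as np

def adaptive_ransac_loop(n_points, s, count_inliers, p=0.99, rng=None,
                         max_iters=100000):
    """Sample until the adaptively recomputed N is reached.

    count_inliers(sample_indices) -> number of inliers of the model
    fitted to that sample (hypothetical, supplied by the caller).
    Returns (number of samples drawn, best inlier count).
    """
    rng = rng or np.random.default_rng()
    N, sample_count, best = math.inf, 0, 0
    while N > sample_count and sample_count < max_iters:
        sample = rng.choice(n_points, size=s, replace=False)
        best = max(best, count_inliers(sample))
        e = 1.0 - best / n_points          # current outlier ratio estimate
        w = (1.0 - e) ** s                 # chance of an all-inlier sample
        if w >= 1.0:
            N = 0
        elif w > 0.0:
            N = math.log(1.0 - p) / math.log(1.0 - w)
        sample_count += 1
    return sample_count, best
```

With 80% inliers and s=2, N drops to about 4.5 as soon as one good sample is found, so the loop stops after very few samples.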
Other robust algorithms
• RANSAC maximizes number of inliers
• LMedS minimizes median error
[Figure: inlier percentile (50%, 100%) versus residual (pixels), with a residual of 1.25 marked]
• Not recommended: case deletion,
iterative least-squares, etc.
Non-linear refinement
Geometric distance
Gold standard
Symmetric epipolar distance
Gold standard
Maximum Likelihood Estimation (= least-squares for Gaussian noise)
$\sum_i d(x_i, \hat{x}_i)^2 + d(x'_i, \hat{x}'_i)^2$ subject to $\hat{x}'^T F \hat{x} = 0$
Initialize: normalized 8-point, (P,P‘) from F, reconstruct Xi
Parameterize:
$P = [I \mid 0], \quad P' = [M \mid t], \quad X_i$
$\hat{x}_i = P X_i, \quad \hat{x}'_i = P' X_i$
(overparametrized)
Minimize cost using Levenberg-Marquardt
(preferably sparse LM, e.g. see H&Z)
Gold standard
Alternative, minimal parametrization (with a=1)
(note (x,y,1) and (x',y',1) are epipoles)
problems:
• a=0 → pick largest of a,b,c,d to fix to 1
• epipole at infinity → pick largest of x,y,w and of x',y',w'
4x3x3=36 parametrizations!
reparametrize at every iteration, to be sure
Symmetric epipolar error
$\sum_i d(x'_i, F x_i)^2 + d(x_i, F^T x'_i)^2$
$= \sum_i (x'^T_i F x_i)^2 \left( \frac{1}{(F x_i)_1^2 + (F x_i)_2^2} + \frac{1}{(x'^T_i F)_1^2 + (x'^T_i F)_2^2} \right)$
Some experiments:
Residual error:
$\sum_i d(x'_i, F x_i)^2 + d(x_i, F^T x'_i)^2$
(for all points!)
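The residual can be evaluated per match with a few lines of NumPy. This sketch assumes inhomogeneous pixel coordinates as (n, 2) arrays and the convention x'^T F x = 0; the function name is hypothetical.

```python
import numpy as np

def symmetric_epipolar_error(F, x1, x2):
    """Per-match d(x', F x)^2 + d(x, F^T x')^2 for (n, 2) point arrays."""
    h1 = np.column_stack([x1, np.ones(len(x1))])
    h2 = np.column_stack([x2, np.ones(len(x2))])
    Fx1 = h1 @ F.T   # row i is the epipolar line F x_i in image 2
    Ftx2 = h2 @ F    # row i is the epipolar line F^T x'_i in image 1
    algebraic = np.sum(h2 * Fx1, axis=1)  # x'^T F x
    d1 = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2
    d2 = Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return algebraic ** 2 * (1.0 / d1 + 1.0 / d2)

# Rectified pair: epipolar lines are horizontal, so the error is 2*(y - y')^2
F = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
print(symmetric_epipolar_error(F, np.array([[10.0, 20.0]]),
                               np.array([[15.0, 23.0]])))
```

Summing the returned values over all matches gives the residual used to compare the estimation methods.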
Recommendations:
1. Do not use unnormalized algorithms
2. Quick and easy to implement: 8-point normalized
3. Better: enforce rank-2 constraint during minimization
4. Best: Maximum Likelihood Estimation
(minimal parameterization, sparse implementation)
Automatic computation of F
Step 1. Extract features
Step 2. Compute a set of potential matches
Step 3. do
  Step 3.1 select minimal sample (i.e. 7 matches)   (generate hypothesis)
  Step 3.2 compute solution(s) for F                (generate hypothesis)
  Step 3.3 determine inliers                        (verify hypothesis)
until Γ(#inliers,#samples) ≥ 95%
Step 4. Compute F based on all inliers
Step 5. Look for additional matches
Step 6. Refine F based on all correct matches
$\Gamma = 1 - \left(1 - \left(\frac{\#inliers}{\#matches}\right)^7\right)^{\#samples}$

#inliers   90%   80%   70%   60%   50%
#samples   5     13    35    106   382
Abort verification early
OOOOOIOOIOOOO
I I I O I I I I O O I O I I I I I I O I I I I I I O I O O I I I I…
OOOOOOO
OOOOIOOOO
Assume percentage of inliers p, what is the probability P to
pick n or more wrong samples out of m?
stop if P<0.05 (sample most probably contains outlier)
(P = cumulative binomial distribution function (m-n, m, p))
To avoid problems, this also requires verifying the matches in random order!
(but we already have a random sampler anyway)
(inspired by Chum and Matas, BMVC 2002)
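The early-abort test can be sketched with the standard library alone. The function names and the default threshold argument are hypothetical; the rule itself follows the text: with inlier ratio p, seeing n or more wrong matches among m checked ones means at most m-n inliers.

```python
from math import comb

def binomial_cdf(k, m, p):
    """P(at most k successes in m trials with success probability p)."""
    return sum(comb(m, i) * p ** i * (1 - p) ** (m - i) for i in range(k + 1))

def should_abort(n_wrong, m, p_inlier, alpha=0.05):
    """True if n_wrong or more wrong matches out of m is too unlikely
    for a hypothesis computed from an outlier-free sample."""
    return binomial_cdf(m - n_wrong, m, p_inlier) < alpha

# 11 wrong matches among the first 13 verified, assuming 50% inliers
print(should_abort(11, 13, 0.5))
```

For the example call the probability is 92/8192, about 0.011, so verification of that hypothesis would stop early.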
Finding more matches
restrict search range to neighborhood of epipolar line
(e.g. 1.5 pixels)
relax disparity restriction (along epipolar line)
Degenerate cases:
• Degenerate cases
• Planar scene
• Pure rotation
• No unique solution
• Remaining DOF filled by noise
• Use simpler model (e.g. homography)
• Solution 1: Model selection
(Torr et al., ICCV´98, Kanatani, Akaike)
• Compare H and F according to expected residual error
(compensate for model complexity)
• Solution 2: RANSAC
• Compare H and F according to inlier count
(see next slide)
RANSAC for quasi-degenerate cases
• Full model
(8pts, 1D solution)
(accept inliers to solution F)
• Planar model
(6pts, 3D solution)
(accept inliers to solution F1,F2&F3)
Accept if large number of remaining inliers
• Plane+parallax model
(plane+2pts)
closest rank-6 of Anx9
for all plane inliers
Sample for out of plane points among outliers
More problems:
• Absence of sufficient features (no texture)
• Repeated structure ambiguity
• Robust matcher also finds
support for wrong hypothesis
• solution: detect repetition
(Schaffalitzky and Zisserman,
BMVC‘98)
RANSAC for ambiguous matching
• Include multiple candidate matches in set
of potential matches
• Select according to matching probability
(~ matching score)
• Helps for repeated structures or scenes
with similar features as it avoids an early
commitment
(Tordoff and Murray ECCV02)
two-view geometry
geometric relations between two views are fully
described by the recovered 3x3 matrix F
Triangulation
[Figure: rays L1 and L2, back-projected from camera centers C1, C2 through image points x1, x2, meet at the 3D point X]
- calibration
- correspondences
Triangulation
• Backprojection
• Triangulation
Iterative least-squares
• Maximum Likelihood Triangulation
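As a starting point for the methods listed above, a plain linear (DLT) triangulation is sketched below. Note that this is neither the iterative least-squares nor the maximum likelihood method; it is a simpler technique swapped in for illustration, and the function name is hypothetical.

```python
import numpy as np

def triangulate_linear(P1, P2, x1, x2):
    """Linear triangulation of one point from two 3x4 cameras.

    x1, x2: image points (x, y) in the first and second view.
    Returns the inhomogeneous 3D point.
    """
    # Each view gives two linear equations in the homogeneous point X:
    # x * (P[2] . X) - (P[0] . X) = 0 and y * (P[2] . X) - (P[1] . X) = 0
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

With noise-free projections the two back-projected rays intersect and the result is exact; with noise it minimizes an algebraic error rather than the reprojection error.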
Optimal 3D point in epipolar plane
• Given an epipolar plane, find best 3D point for (m1,m2)
[Figure: image points m1, m2 and their closest points m1´, m2´ on the epipolar lines l1, l2]
Select closest points (m1´,m2´) on epipolar lines
Obtain 3D point through exact triangulation
Guarantees minimal reprojection error (given this epipolar
plane)
Non-iterative optimal solution
• Reconstruct matches in projective frame
by minimizing the reprojection error
$D(m_1, P_1 M)^2 + D(m_2, P_2 M)^2$ (3DOF)
• Non-iterative method
(Hartley and Sturm, CVIU´97)
Determine the epipolar plane for reconstruction
$D(m_1, l_1(\alpha))^2 + D(m_2, l_2(\alpha))^2$ (polynomial of degree 6) (1DOF)
Reconstruct optimal point from selected epipolar plane
Note: only works for two views