Estimation

We are going to be interested in solving e.g. the following estimation problems:

2D homography. Given a point set xi in P² and corresponding points x′i in P², find the homography h such that h(xi) = x′i.

Camera projection. Given a point set Xi in P³ and corresponding points xi in P², find the mapping P³ → P².

The fundamental matrix. Given a point set xi in one image and corresponding points x′i in a second image, find the fundamental matrix F between the images. The fundamental matrix is a singular 3 × 3 matrix F that satisfies x′iᵀ F xi = 0 for all i.

Basic questions

What is required for an exact, unique solution, i.e. how many corresponding points xi ↔ x′i are needed? A homography H has 8 degrees of freedom. Each point pair gives 2 independent equations x′i = Hxi. Thus we need at least 4 points for an exact solution.

How can we use more data to improve the solution? What is meant by "better"? We need to define a metric. What metrics are simple to calculate? Which are theoretically best?

How do we handle low-quality data, i.e. outliers?
Exact solution, the Direct Linear Transform
Study the problem of determining a homography H : P² → P² from point correspondences xi ↔ x′i.

The transformation is given by x′i = Hxi. Since x′i and Hxi are parallel vectors in R³, this may be rewritten as

  x′i × Hxi = 0.

Let hjᵀ be the jth row of H and x′i = (x′i, y′i, w′i)ᵀ. Then we may write

  Hxi = ( h1ᵀxi, h2ᵀxi, h3ᵀxi )ᵀ

and

  x′i × Hxi = ( y′i h3ᵀxi − w′i h2ᵀxi,
                w′i h1ᵀxi − x′i h3ᵀxi,
                x′i h2ᵀxi − y′i h1ᵀxi )ᵀ.
Exact solution, the Direct Linear Transform
The cross product above is linear in the elements of H. It is of the form A′i h = 0, where A′i is a 3 × 9 matrix and h is a 9-vector with the row-wise elements of H:

  [  0ᵀ        −w′i xiᵀ    y′i xiᵀ ] [ h1 ]
  [  w′i xiᵀ    0ᵀ        −x′i xiᵀ ] [ h2 ]  =  0,
  [ −y′i xiᵀ    x′i xiᵀ     0ᵀ     ] [ h3 ]

where

  h = ( h1ᵀ, h2ᵀ, h3ᵀ )ᵀ,   H = [ h1  h2  h3 ]
                                [ h4  h5  h6 ]
                                [ h7  h8  h9 ],

i.e. h contains the elements h1, ..., h9 of H in row-wise order.

The equation A′i h = 0 is linear in h. Each system A′i h = 0 contains 2 linearly independent equations, i.e. one row can be removed. Removing the third row gives

                                     [ h1 ]
  [ 0ᵀ        −w′i xiᵀ    y′i xiᵀ ]  [ h2 ]  =  0,
  [ w′i xiᵀ    0ᵀ        −x′i xiᵀ ]  [ h3 ]

or Ai h = 0, where Ai is a 2 × 9 matrix. If any of the points x′i is an ideal point, i.e. w′i = 0, another row has to be removed instead.

The equation is valid for all homogeneous representations (x′i, y′i, w′i)ᵀ of x′i, e.g. for w′i = 1.

Each point pair produces 2 equations in the elements of H. With 4 point pairs, the matrix A′ becomes 12 × 9 and the matrix A 8 × 9. Both matrices have rank 8, i.e. they have a one-dimensional null-space. The solution h can be determined from the null-space of A.
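As a sanity check of the construction above, the 2 × 9 matrix Ai can be built numerically and verified to annihilate h for exact correspondences. A minimal NumPy sketch (the function name dlt_rows is ours):

```python
import numpy as np

def dlt_rows(x, xp):
    """Build the 2x9 DLT matrix A_i for one correspondence x <-> x'.

    x, xp are homogeneous 3-vectors; the rows follow the slide's
    [0^T, -w' x^T, y' x^T; w' x^T, 0^T, -x' x^T] layout.
    """
    xs, ys, ws = xp          # the components x', y', w' of the second point
    z = np.zeros(3)
    return np.array([
        np.concatenate([z, -ws * x,  ys * x]),
        np.concatenate([ws * x,  z, -xs * x]),
    ])

# Sanity check: for points generated by a known H, A_i h = 0.
H = np.array([[1.0, 0.2, 3.0],
              [0.1, 0.9, -2.0],
              [0.001, 0.0, 1.0]])
x = np.array([10.0, 20.0, 1.0])
xp = H @ x
h = H.ravel()                # row-wise elements of H
print(dlt_rows(x, xp) @ h)   # ~ [0, 0]
```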
DLT — Over-determined solution (SVD)

If we have more than 4 point pairs, the equation Ah = 0 becomes over-determined. Without errors in the points ("noise"), the rank of A will still be 8. With noise, the rank will be 9 and the only solution of Ah = 0 is h = 0, i.e. undefined.

One solution to this problem is to add a constraint on h, e.g. ‖h‖ = 1. The problem then becomes

  min over h of ‖Ah‖  subject to  ‖h‖ = 1,

or

  min over h of ‖Ah‖ / ‖h‖.

Study the singular value decomposition (SVD) of A,

  A = UDVᵀ.

The matrix D is diagonal and contains the non-negative singular values of A, sorted in descending order. The matrices U and V are orthogonal. The solution of the minimization problem is the right singular vector vn corresponding to the smallest singular value.
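The SVD solution can be sketched in a few lines of NumPy; homography_dlt is our name for the routine, and the example data assume a known H applied to the unit square:

```python
import numpy as np

def homography_dlt(x, xp):
    """Estimate H from n >= 4 correspondences with the (unnormalized) DLT.

    x, xp: 3 x n arrays of homogeneous points. Returns the 3x3 H whose
    row-wise elements are the right singular vector of A belonging to
    the smallest singular value.
    """
    rows = []
    for i in range(x.shape[1]):
        xi = x[:, i]
        xs, ys, ws = xp[:, i]
        z = np.zeros(3)
        rows.append(np.concatenate([z, -ws * xi, ys * xi]))
        rows.append(np.concatenate([ws * xi, z, -xs * xi]))
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    h = Vt[-1]                      # right singular vector, smallest sigma
    return h.reshape(3, 3)

# Four exact correspondences mapping the unit square by a known H.
H_true = np.array([[2.0, 0.0, 1.0],
                   [0.0, 1.0, -1.0],
                   [0.0, 0.0, 1.0]])
x = np.array([[0, 1, 1, 0], [0, 0, 1, 1], [1, 1, 1, 1]], dtype=float)
xp = H_true @ x
H = homography_dlt(x, xp)
H /= H[2, 2]                        # fix the arbitrary scale for comparison
print(np.allclose(H, H_true))       # True
```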
Example 1

For the points

  xi = [ −1  −1   0   1   1 ]         x′i = [ −0.99  −1   0   1   1 ]
       [ −1   1   0  −1   1 ]   and         [ −1      1   0  −1   1 ]
       [  1   1   1   1   1 ]               [  1      1   1   1   1 ],

i.e. x′i = xi except for a small perturbation of the first point, we get a 10 × 9 matrix A built as above and the solution

  h = ( 0.997, −0.003, 0.002, −0.000, 1.000, −0.002, −0.000, −0.002, 1 )ᵀ

or

  H = [  0.997  −0.003   0.002 ]
      [ −0.000   1.000  −0.002 ]
      [ −0.000  −0.002   1     ],

i.e. close to the identity, as expected.

[Figure: the point configuration.]

Example 2

For the same configuration shifted and scaled to image-like coordinates,

  xi = [ 500  500  600  700  700 ]         x′i = [ 501  500  600  700  700 ]
       [ 500  700  600  500  700 ]   and         [ 500  700  600  500  700 ]
       [  1    1    1    1    1  ]               [  1    1    1    1    1  ],

again with only the first point perturbed (by the same relative amount), we get

  h = ( 0.970, −0.018, 16.030, −0.006, 0.963, 12.741, −0.000, −0.000, 1.000 )ᵀ

or

  H = [  0.970  −0.018  16.030 ]
      [ −0.006   0.963  12.741 ]
      [ −0.000  −0.000   1.000 ],

i.e. a much larger deviation from the identity, although the relative perturbation is the same.

[Figure: the point configuration.]
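Example 1 can be reproduced numerically. A sketch, assuming the point configuration read off the slide (unit-square corners plus the origin, with the first point perturbed):

```python
import numpy as np

# Example 1's configuration: the unit-square corners plus the origin,
# with the first point perturbed from (-1, -1) to (-0.99, -1).
x = np.array([[-1, -1, 0,  1, 1],
              [-1,  1, 0, -1, 1],
              [ 1,  1, 1,  1, 1]], dtype=float)
xp = x.copy()
xp[0, 0] = -0.99

rows = []
for i in range(5):
    xi = x[:, i]
    xs, ys, ws = xp[:, i]
    z = np.zeros(3)
    rows.append(np.concatenate([z, -ws * xi, ys * xi]))
    rows.append(np.concatenate([ws * xi, z, -xs * xi]))
A = np.array(rows)                    # the 10 x 9 DLT matrix
h = np.linalg.svd(A)[2][-1]
H = (h / h[-1]).reshape(3, 3)         # scale so that H33 = 1
print(np.round(H, 3))                 # close to the identity
```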
Inhomogeneous solution

If we can fix one of the elements in h we can remove that element and solve for the 8 remaining. If we e.g. assume that h9 = H33 = 1, the point equations become

  [ 0       0       0       −xi w′i  −yi w′i  −wi w′i   xi y′i   yi y′i ]        [ −wi y′i ]
  [ xi w′i  yi w′i  wi w′i   0        0        0       −xi x′i  −yi x′i ] h̃  =  [  wi x′i ] ,

where h̃ contains the first 8 elements of h.

With 4 point pairs we get an equation Mh̃ = b, where M is 8 × 8, that can be solved exactly. With more than 4 point pairs we may solve

  min over h̃ of ‖Mh̃ − b‖

with a least-squares method.

Observe that this method works poorly if the correct solution has H33 = 0.

Solutions from lines and points

A line correspondence li ↔ l′i also gives 2 equations in the elements of H, so a similar problem may be formulated from e.g. 4 line pairs, or 2 point pairs and 2 line pairs.
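A sketch of the inhomogeneous solve in NumPy (homography_inhom is our name); the 2n × 8 system is built exactly as above and handed to a linear least-squares routine:

```python
import numpy as np

def homography_inhom(x, xp):
    """Solve for H with H33 fixed to 1 (inhomogeneous formulation).

    x, xp: 3 x n homogeneous points. Builds the 2n x 8 system
    M htilde = b from the slide and solves it with least squares.
    """
    M, b = [], []
    for i in range(x.shape[1]):
        xi, yi, wi = x[:, i]
        xs, ys, ws = xp[:, i]
        z = [0.0, 0.0, 0.0]
        M.append(z + [-xi * ws, -yi * ws, -wi * ws] + [xi * ys, yi * ys])
        b.append(-wi * ys)
        M.append([xi * ws, yi * ws, wi * ws] + z + [-xi * xs, -yi * xs])
        b.append(wi * xs)
    htilde = np.linalg.lstsq(np.array(M), np.array(b), rcond=None)[0]
    return np.append(htilde, 1.0).reshape(3, 3)

# Exact data from a known H with H33 = 1 is recovered exactly.
H_true = np.array([[1.0, 0.1, 2.0], [0.2, 1.5, -1.0], [0.01, 0.02, 1.0]])
x = np.array([[0, 1, 1, 0], [0, 0, 1, 1], [1, 1, 1, 1]], dtype=float)
H = homography_inhom(x, H_true @ x)
```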
Algebraic distance

The DLT algorithm minimizes ‖ε‖ = ‖Ah‖. Each point pair xi ↔ x′i contributes an error vector εi, called the algebraic error vector associated with the point pair xi ↔ x′i and the homography H. The norm of εi is called the algebraic distance dalg and is

  dalg(x′i, Hxi)² = ‖εi‖² = ‖ [ 0ᵀ        −w′i xiᵀ    y′i xiᵀ ] h ‖² .
                            ‖ [ w′i xiᵀ    0ᵀ        −x′i xiᵀ ]   ‖

In general, dalg for two vectors x1 and x2 is defined as

  dalg(x1, x2)² = a1² + a2²,  where  a = ( a1, a2, a3 )ᵀ = x1 × x2.

Given a set of point correspondences xi ↔ x′i the total error becomes

  ‖Ah‖² = ‖ε‖² = Σi ‖εi‖² = Σi dalg(x′i, Hxi)².

The algebraic distance is easy to minimize, but difficult to interpret geometrically. Furthermore, it is transformation dependent, and calculations based on the algebraic distance should be normalized.

Normalized DLT for 2D homographies

Given n ≥ 4 point pairs {xi ↔ x′i}, determine the homography H such that x′i = Hxi:

Determine the similarity transformation T such that the points {x̃i = Txi} have their center of gravity at the origin and a mean distance of √2 to the origin.

Determine the similarity transformation T′ such that the points {x̃′i = T′x′i} have their center of gravity at the origin and a mean distance of √2 to the origin.

Determine the homography H̃ for the point correspondences {x̃i ↔ x̃′i}.

Re-normalize such that H = T′⁻¹ H̃ T.
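The normalization step can be sketched as follows (function names are ours; the points are assumed to have w = 1):

```python
import numpy as np

def normalize_points(x):
    """Similarity T that moves the points' center of gravity to the
    origin and makes their mean distance to it sqrt(2) (w = 1 assumed)."""
    c = x[:2].mean(axis=1)
    d = np.linalg.norm(x[:2] - c[:, None], axis=0).mean()
    s = np.sqrt(2) / d
    return np.array([[s, 0, -s * c[0]],
                     [0, s, -s * c[1]],
                     [0, 0, 1.0]])

def dlt(x, xp):
    """Plain SVD-based DLT from the earlier slides."""
    rows = []
    for i in range(x.shape[1]):
        xi = x[:, i]
        xs, ys, ws = xp[:, i]
        z = np.zeros(3)
        rows.append(np.concatenate([z, -ws * xi, ys * xi]))
        rows.append(np.concatenate([ws * xi, z, -xs * xi]))
    return np.linalg.svd(np.array(rows))[2][-1].reshape(3, 3)

def homography_dlt_normalized(x, xp):
    """Normalized DLT: estimate Htilde in normalized coordinates,
    then undo the normalizations."""
    T, Tp = normalize_points(x), normalize_points(xp)
    Ht = dlt(T @ x, Tp @ xp)
    return np.linalg.inv(Tp) @ Ht @ T
```

With exact correspondences the result agrees with the plain DLT; with noisy, pixel-scale coordinates the normalized version is markedly more stable, which is the point of the perturbation example above.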
Perturbation sensitivity

Assume we want to use a grid pattern (x, y) ∈ [500, 900]² as a reference coordinate system for the transformation between two images. Assume the images are approximately the same, i.e. H ≈ I. How much will small measurement errors affect the estimation of another point p = (400, 400, 1)ᵀ?

Result of 100 Monte Carlo simulations where H was determined from point correspondences {xi ↔ x′i}, and where the points x′i were perturbed with white noise of standard deviation σ = 0.1 pixels:

[Figure: the setup; the distribution of Hp for unnormalized estimation of H; the distribution of Hp for normalized estimation of H.]

Geometrical distance

We will now study a few error measures based on the geometric distance between measured and estimated point coordinates.

Use the notation x for a measured coordinate, x̂ for an estimated coordinate, and x̄ for the true coordinate of a point. An estimated homography is denoted Ĥ.
Errors in one image only

If we have errors in one image only, an appropriate error measure is the Euclidean distance between the measured points x′i and the transformed exact points Hx̄i. This is called the transfer error and is denoted

  Σi d(x′i, Hx̄i)²,

where d(x, y) is the Euclidean distance between the Cartesian points represented by x and y.

[Figure: the transfer error d′ measured in image 2 after mapping by H.]

Errors in both images

If we have measurement errors in both images we need to take both errors into account. One solution is to sum the geometric error from the forward transformation H and the backward transformation H⁻¹. This is called the symmetric transfer error

  Σi d(xi, H⁻¹x′i)² + d(x′i, Hxi)².

An alternative solution is to require a perfect matching and sum the errors in both images. This is called the reprojection error

  Σi d(xi, x̂i)² + d(x′i, x̂′i)²  subject to  x̂′i = Hx̂i, ∀i.

[Figure: the distances d and d′ in image 1 and image 2.]
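The transfer errors are straightforward to compute. A NumPy sketch (function names are ours):

```python
import numpy as np

def d2(x, y):
    """Squared Euclidean distance between the Cartesian points
    represented by the homogeneous vectors x and y."""
    return float(np.sum((x[:2] / x[2] - y[:2] / y[2]) ** 2))

def symmetric_transfer_error(H, x, xp):
    """sum_i d(x_i, H^-1 x'_i)^2 + d(x'_i, H x_i)^2 for 3 x n points."""
    Hinv = np.linalg.inv(H)
    return sum(d2(x[:, i], Hinv @ xp[:, i]) + d2(xp[:, i], H @ x[:, i])
               for i in range(x.shape[1]))

# Exact correspondences give zero error; a perturbation shows up in
# both terms of the sum.
H = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]])
x = np.array([[0, 1, 1, 0], [0, 0, 1, 1], [1, 1, 1, 1]], dtype=float)
xp = H @ x
print(symmetric_transfer_error(H, x, xp))   # 0.0
xp[0, 0] += 0.1                             # perturb one measured point
```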
Statistical error

If we assume the measurement error is Gaussian distributed with variance σ², we may describe the measured coordinates as x = x̄ + δx, where the error δx is normally distributed with variance σ².

Furthermore, if we assume the errors are independent, the probability density function (pdf) for a point measurement x given the true point x̄ is

  Pr(x) = (1 / (2πσ²)) e^(−d(x, x̄)² / (2σ²)).

In the case of errors in one image only we are interested in the probability of observing the correspondences {x̄i ↔ x′i}. If the observations are independent the pdf becomes

  Pr({x′i} | H) = Πi (1 / (2πσ²)) e^(−d(x′i, Hx̄i)² / (2σ²)),

i.e. the probability that we will observe {x′i} given that H is the true homography.

Maximum likelihood estimates

If we take the logarithm we get the log-likelihood function

  log Pr({x′i} | H) = −(1 / (2σ²)) Σi d(x′i, Hx̄i)² + c,

where c is a constant. The maximum likelihood estimate (MLE) of the homography, Ĥ, maximizes the log-likelihood function and minimizes

  Σi d(x′i, Hx̄i)²,

i.e. the geometric transfer error.

For errors in both images we get the pdf for the true correspondences {x̄i ↔ Hx̄i = x̄′i} as

  Pr({xi, x′i} | H, {x̄i}) = Πi (1 / (2πσ²)) e^(−(d(xi, x̄i)² + d(x′i, Hx̄i)²) / (2σ²)),

whose MLE consists of both a homography Ĥ and point correspondences {x̂i ↔ x̂′i} and minimizes

  Σi d(xi, x̂i)² + d(x′i, x̂′i)²,

where x̂′i = Ĥx̂i, i.e. the reprojection error.

Mahalanobis distance

If we know the covariance matrix Σ for our observations we get the MLE by minimizing the Mahalanobis distance

  ‖X − X̄‖²_Σ = (X − X̄)ᵀ Σ⁻¹ (X − X̄).

If the errors in the two images are independent the corresponding error measure becomes

  ‖X − X̄‖²_Σ + ‖X′ − X̄′‖²_Σ′,

where Σ and Σ′ are the covariance matrices for the measurements in the two images.

A special case is if the measurements are independent but with different variances. Then the covariance matrix Σ becomes diagonal.

Iterative minimization

To minimize a geometric distance an iterative method is often needed. If an inhomogeneous formulation is possible an unconstrained algorithm may be used, e.g. Gauss-Newton. Otherwise a constrained algorithm, e.g. SQP, is the best choice.

For the transfer error the vector of unknowns is h and the objective function becomes Σi d(x′i, Hx̄i)², i.e. the residual function is

  r(h) = ( r1(h), r2(h), ..., rn(h) )ᵀ,

where

  ri(h) = [ (h11 x̄i + h12 ȳi + h13 w̄i) / (h31 x̄i + h32 ȳi + h33 w̄i) − x′i ]
          [ (h21 x̄i + h22 ȳi + h23 w̄i) / (h31 x̄i + h32 ȳi + h33 w̄i) − y′i ].

For a homogeneous formulation a normalization constraint on h is necessary, e.g. h11² + ... + h33² − 1 = hᵀh − 1 = 0.
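As a sketch of the unconstrained case, the transfer-error residual above can be handed to a standard least-squares solver; here we use SciPy's least_squares (which implements trust-region and Levenberg-Marquardt methods rather than plain Gauss-Newton), and the data are our own illustration:

```python
import numpy as np
from scipy.optimize import least_squares

def transfer_residuals(h, x, xp):
    """Stacked residuals r_i(h): the mapped Cartesian point minus the
    measured x'_i, exactly as on the slide."""
    H = h.reshape(3, 3)
    y = H @ x                                 # 3 x n mapped points
    return np.concatenate([y[0] / y[2] - xp[0] / xp[2],
                           y[1] / y[2] - xp[1] / xp[2]])

# Refine a slightly-off initial estimate toward the exact solution.
H_true = np.array([[1.0, 0.1, 2.0], [0.0, 1.2, -1.0], [0.001, 0.0, 1.0]])
x = np.array([[0, 1, 1, 0, 0.5], [0, 0, 1, 1, 0.5], [1, 1, 1, 1, 1]],
             dtype=float)
xp = H_true @ x
h0 = H_true.ravel() + 0.05                    # perturbed starting point
res = least_squares(transfer_residuals, h0, args=(x, xp))
H = res.x.reshape(3, 3) / res.x[8]            # undo the arbitrary scale
```

Note that the homogeneous parametrization leaves the residuals invariant to a scaling of h, so the Jacobian is rank-deficient along that direction; the trust-region solver tolerates this, but fixing the scale (or using the inhomogeneous formulation) avoids it.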
Iterative minimization

For the reprojection error we have to estimate x̂′i and x̂i in addition to h.

The components of the constraint x̂′i = Hx̂i have to be normalized. For instance the implicit constraint ŵi = 1 may be used together with hᵀh − 1 = 0. Then the residual function becomes

  ri(h) = ( x̂i − xi,
            ŷi − yi,
            x̂′i/ŵ′i − x′i,
            ŷ′i/ŵ′i − y′i )ᵀ

with constraints

  H ( x̂i, ŷi, 1 )ᵀ − ( x̂′i, ŷ′i, ŵ′i )ᵀ = 0,
  hᵀh − 1 = 0.

Robust estimation

How do we handle observations with large errors (outliers)? One way is to use the Random Sample Consensus (RANSAC) algorithm. Given a model and a data set S containing outliers:

Pick s data points randomly from the set S and calculate the model from these points. For a line, pick 2 points.

Determine the consensus set Si of the sample, i.e. the set of points within t units from the model. The set Si defines the inliers in S.

If the number of inliers is larger than a threshold T, recalculate the model based on all points in Si and terminate.

Otherwise, repeat with a new random subset.

After N tries, choose the largest consensus set Si, recalculate the model based on all points in Si, and terminate.

[Figure: points a–d with candidate lines A–D and their consensus sets.]
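The RANSAC steps above can be sketched for the line case (s = 2); all names and the total-least-squares refit are our choices:

```python
import numpy as np

def ransac_line(pts, t, N=100, T=None, rng=None):
    """RANSAC for a 2D line (s = 2): sample 2 points, count inliers
    within distance t, keep the largest consensus set, and refit the
    line to it with total least squares.

    pts: n x 2 array. Returns (a, b, c) with ax + by + c = 0 and the
    inlier mask."""
    rng = rng or np.random.default_rng(0)
    n, best = len(pts), None
    for _ in range(N):
        p, q = pts[rng.choice(n, 2, replace=False)]
        d = q - p
        nn = np.hypot(*d)
        if nn == 0:
            continue                          # degenerate sample
        nrm = np.array([-d[1], d[0]]) / nn    # unit normal of the line
        inliers = np.abs((pts - p) @ nrm) < t
        if best is None or inliers.sum() > best.sum():
            best = inliers
        if T is not None and best.sum() >= T:
            break                             # consensus set large enough
    # Refit: total least squares on the consensus set.
    c = pts[best].mean(axis=0)
    _, _, Vt = np.linalg.svd(pts[best] - c)
    a, b = Vt[-1]                             # normal of the fitted line
    return (a, b, -a * c[0] - b * c[1]), best

# 20 points on y = x plus three gross outliers.
xs = np.linspace(0, 10, 20)
pts = np.vstack([np.c_[xs, xs], [[0.0, 9.0], [2.0, 8.0], [9.0, 0.0]]])
(a, b, c), inliers = ransac_line(pts, t=0.1)
```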
How to choose the distance limit t?

If we assume the distance d from the model is normally distributed with standard deviation σ, the limit t may be chosen as t² = Fm⁻¹(α) σ², where Fm is the cumulative distribution function for the χ² distribution with m degrees of freedom. Such a measurement satisfies d² < t² with probability α.

A few examples, for α = 0.95:

  Degrees of freedom   Model                       t²
  1                    line, fundamental matrix    3.84σ²
  2                    homography, camera matrix   5.99σ²
  3                    trifocal tensor             7.81σ²

How many samples N?

The number of samples N should be chosen such that the probability of having picked at least one sample without outliers is p. Assume w is the probability for an inlier, i.e. ε = 1 − w is the probability for an outlier. Then we need at least N samples of s points each, where (1 − wˢ)ᴺ = 1 − p, or

  N = log(1 − p) / log(1 − (1 − ε)ˢ).

How to choose an acceptable size T of the consensus set?

A rule of thumb is to terminate if the size of the consensus set is equal to the expected number of inliers in the set, i.e. for n points

  T = (1 − ε) n.
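Both rules are easy to evaluate. A sketch (the χ² quantile needs SciPy; ransac_trials is our name):

```python
import math
from scipy.stats import chi2

def ransac_trials(eps, s, p=0.99):
    """Number of samples N from (1 - (1 - eps)^s)^N = 1 - p."""
    return math.ceil(math.log(1 - p) / math.log(1 - (1 - eps) ** s))

# Homography (s = 4, m = 2 degrees of freedom) with 20% outliers:
print(ransac_trials(0.2, 4))           # 9 samples
print(round(chi2.ppf(0.95, 2), 2))     # 5.99 -> t^2 = 5.99 sigma^2
```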
Adaptive RANSAC

It is possible to estimate T and N dynamically. Given a point set of n points:

  N = ∞, i = 0.
  Repeat while i < N:
    Pick a subset of s elements and count the number of inliers k.
    Let ε = 1 − k/n.
    Let N = log(1 − p) / log(1 − (1 − ε)ˢ) for e.g. p = 0.99.
    i = i + 1.

This is called the adaptive RANSAC algorithm.
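The adaptive loop can be sketched directly from the pseudocode above; leaving the model-specific fitting to a caller-supplied function is our design choice:

```python
import math
import numpy as np

def adaptive_ransac_trials(pts, sample_and_count, s, p=0.99, rng=None):
    """Adaptive RANSAC loop: re-estimate the outlier ratio eps from the
    best consensus set seen so far and update N accordingly.

    sample_and_count(subset) fits a model to an s-subset and returns
    the number of inliers among all n points (model-specific).
    Returns the best inlier count and the number of iterations used."""
    rng = rng or np.random.default_rng(0)
    n, N, i, best = len(pts), math.inf, 0, 0
    while i < N:
        subset = pts[rng.choice(n, s, replace=False)]
        k = sample_and_count(subset)
        best = max(best, k)
        eps = 1 - best / n
        if eps <= 0:
            break                  # all points are inliers; no more trials
        N = math.log(1 - p) / math.log(1 - (1 - eps) ** s)
        i += 1
    return best, i
```

As soon as a sample with many inliers is found, ε drops and N shrinks, so the loop terminates quickly on easy data.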