
Rigid Motion Estimation using Mixtures of Projected Gaussians
Wendelin Feiten1 , Muriel Lang2 and Sandra Hirche2
Abstract— Modeling the position and orientation in three-dimensional space is important in many applications. In
robotics, the position and orientation of objects as well as
the rigid motions of robots are derived from sensor data that
are uncertain. The uncertainties of these sensor data result in
position and orientation uncertainties that can be very widely
spread or have several peaks.
In this paper we describe a class of probability density
functions (pdf ) on the group of rigid motions that allows for
modeling wide-spread and multi-modal pdf and offers most of
the operations that are available for the mixtures of Gaussians
on Euclidean space. The use of this class of pdf is illustrated
with an example from robotic perception.
I. INTRODUCTION
In many applications, the position and orientation of
objects in the three-dimensional (3D) space or the motion of
objects through the 3D space need to be modeled. Various
approaches to this task have been explored in the past in
different areas. In the estimation of the motion of airplanes,
generally the rotation is described by the three angles of
roll, pitch and yaw. In computer graphics, the rotation in
3D is generally described by unit quaternions. In robotics,
the Rodrigues vector is often used as a parameterization
of the 3Dl rotation. These rotation representations are then
combined with the translation to model the rigid motion,
which is equivalent to the position and orientation in space.
The standard representation in mathematics and also in
robotics is the rotation matrix, in robotics generally enhanced
with an additional column representing the translation.
In most cases where position and orientation are derived
from measurements, they are not known with certainty, but
depend on random observations. This means that in addition
to the parameterization of the rigid motion a probability
density function (pdf ) over these parameters needs to be
given.
The contribution of this paper is a new class of pdf for the
probabilistic representation of rigid motion. In order to cover
a wide range of applications, the following requirements
should be satisfied by such a class of pdf :
Expressive Power: The class should be able to approximate any pdf.
Information Fusion: For two pdf from the class pertaining to the same pose, there should be an algorithm to
calculate the resulting pose estimate as a pdf from this class
that best explains both prior estimates.
1 W. Feiten is with Siemens Corporate Technology, Siemens AG, D-81739
Munich [email protected]
2 M. Lang and S. Hirche are with the Institute for Information-Oriented Control, Faculty of Electrical Engineering and Information Technology, Technische Universität München, D-80290 Munich
[email protected]
Information Propagation: For two pdf pertaining to two
subsequent motions, there should be an algorithm to calculate
the pdf of the composition of the two motions.
Efficiency: The algorithms should be fast enough and the
space requirements should be small enough to allow for the
use of the pdf in real-time applications.
In section II we will investigate how these requirements
are addressed in different approaches and draw our conclusions as to the concepts we want to combine and extend in
our approach. The resulting class of pdf together with the
algorithms according to the requirements is then developed
in the subsequent sections. In section III we explain how the
Gaussian distribution in R3 is used to represent a peaked
pdf on the rotations and how this approach generalizes
to represent also widely spread or multimodal pdf using
mixture distributions (we call this the Mixture of Projected
Gaussians, or MPG). We then extend this from rotations
to the rigid motions in section IV. Section V describes
the information fusion and the uncertainty propagation with
the mixture distributions on the rigid motions. To keep the
number of mixture elements small, similar elements are
merged and negligible ones are disregarded. Background
noise is introduced in section VI, along with an approach to keep
the computation time reasonable. In section VII we
then illustrate the use of this class of pdf in the field of
robotic perception and conclude in section VIII with a short
summary of the properties of the new class of pdf and the
potential further development.
This presentation of our new class of pdf on rigid motions
builds on our previous publications [1], [2], [3] and gives an
overview of the approach.
II. RELATED WORK
Representing the position and orientation of objects in
three-dimensional space, or equivalently, representing the
rigid motion, has been under investigation for so long
that giving a complete outline of the history is beyond the
scope of this paper. Acknowledging the much larger body of work,
and giving credit to it, we report here the work that we used
to develop our approach. Along with a short assessment of
the crucial ingredients of the respective work, we explain
which of them we decided to integrate in our approach and
why. The first design decision is, which parameterization of
the rotations and of the rigid motions to use.
Rotation matrices are well understood and easy to visualize. Also, extended with the translation vector as an
additional column and an additional row (0, 0, 0, 1) to a
homogeneous transformation matrix, they represent the rigid
motions in a way that composition of motions is done by
matrix multiplication. However, the rigid motions have inherently only six degrees of freedom, and this representation
has twelve parameters. As six equality constraints are needed
on the parameter space to determine the valid subset of
parameters, we decided against this representation.
Euler angles, for example the parameter set roll, pitch,
yaw, need no equality constraints to reduce the dimension,
because the rotation has three degrees of freedom. If the values for the Euler angles become large, the parameterization
can become singular. There is an orientation for which the
parameters cannot be uniquely determined (gimbal lock).
Since we want to be able to deal with a wide range of
orientations, we decided against Euler angles.
Rodrigues vectors, like Euler angles, use only three
parameters for the representation of the orientation. Hence,
no equality constraints are needed, and a pdf can be induced
on the rotations from a pdf on R3 (see [4]). But for Rodrigues
vectors there is no efficient way, to the best of our knowledge, to determine the parameters of a composition of rigid
motions from the parameters for each individual motion. This
makes it technically more difficult to determine the pdf of
the composition from the pdf of the two components.
The unit quaternions represent the rotations using four
parameters, so one equality constraint is needed to reduce
the dimension (see [5]). The composition of rotations corresponds to multiplication of unit quaternions, and there is no
singularity (except for one symmetry that is easy to handle).
They are in wide use in the computer graphics community
(see [6]). Further, the rigid motions can be described by
dual quaternions (explained below). Dual quaternions need
only one equality constraint and are unique for a rigid motion
(except for the easy-to-handle symmetry in one quaternion).
Their advantage is that the composition of rigid motions
corresponds to the multiplication of dual quaternions, so that
there is a closed form for the parameters of the composition
(see [6]). Goddard and Abidi [7], [8] use
this formalism to capture the correlation between rotation
and translation, and also use the derivatives to estimate the
dynamical state of a system. The dual quaternions seem to
us to be the best match with our requirements. This takes us
to the decision which distribution types to use.
A very common way to describe an arbitrary
pdf is with a sample set. The expressive
power is high, composition is easy, but this comes at a price.
In high dimensions (and for wide spread distributions, six
dimensions is high) the sample sets need to be very large
in order to be representative of the pdf. Also, the fusion of
two particle sets is not straightforward and computationally
expensive. So we decide to keep the sample set description
as a fallback option, but to develop a representation that
is still general, but more efficient in terms of memory and
computation time.
Since we want to use the same operations on the pdf of our
class that are available for the Gaussians and the mixtures of
Gaussians, we specifically look at those distributions that are
derived from the Gaussian distributions in Euclidean space.
Basically we distinguish two different approaches.
In the first approach, the pdf is defined by a Gaussian in R4 with zero mean, taking the values on the unit
sphere S3 ⊂ R4 and renormalizing with the value of the
integral over the sphere. This is also known as the Bingham
distribution. This type of distribution has been used by [9]
and by [10]. One convenient property of this distribution
is that the values for antipodal points on the sphere are
identical. This makes sense because antipodal points (more
precisely the unit quaternions corresponding to antipodal
points) represent the same rotation. However, two objections
need to be made. The first one is that calculating the integral
is very expensive (see [11]). Glover [10] therefore computes
a lookup table of normalizing constant approximations using
standard floating point arithmetic. The second objection is
that the distribution resulting from a composition of motions
is not a Bingham distribution, so an additional approximation
step is needed to stay in the Bingham distributions.
In the second approach, the unit sphere is treated as a
manifold, and is in turn (locally) parameterized by suitable
tangent spaces. The value of the pdf on the unit sphere is
defined by the value of a Gaussian on the corresponding
tangent space. This approach has for example been taken by
Choe [12] and Goddard [7]. Neither approach meets
all of our requirements: Choe does not take the translations
into account, and Goddard uses only one Gaussian on a single
tangent space. This is a suitable approach in case the initial
estimate and the information to be fused are highly certain,
but we need to be more general.
This requirement can be met by using mixture distributions. Mardia et al. [13] use a mixture of bivariate von Mises
distributions to describe a general distribution to specify the
secondary structure of proteins. They use a pdf on the torus,
not on the sphere, but the technique carries over.
Finally, we adopt the common technique of approximating the pdf of a non-linear function
of random variables with Gaussian distributions. We use
the linearization based on the Jacobian of the non-linear
function. Alternatively, the approach from unscented Kalman
filters used by Kraft [14] could also be carried over to our
approach.
Combining the selected techniques from the related work
and adding a fair bit (as described below) we arrive at the
following approach:
The rigid motions are parameterized with dual quaternions. The rotation part of the dual quaternion is locally
parameterized by a tangent plane at a tangent point; the
bijection between tangent plane and unit sphere is given
by the central projection. The pdf on the dual quaternions
is a mixture distribution. Each element of the mixture is
derived from a Gaussian on the 6D Euclidean space that
is the Cartesian product of the space of translation vectors
and the tangent plane.
This approach allows for carrying over many operations
from Gaussian mixtures to our new class of pdf. Because
they are derived from the central projection of Gaussians we
call them Mixtures of Projected Gaussians.
III. MODELING ROTATION UNCERTAINTIES
For didactic reasons, and because modeling the uncertainty
of orientation in 3D is an interesting topic in itself, we first
restrict ourselves to investigate the rotation part of a rigid
motion. We start with a short review of the parameterization
of the rotations by unit quaternions.
A. Rotation Represented by Unit Quaternions
A quaternion is a number q = a + bi + cj + dk,
where i, j, k are the three imaginary units and a, b, c, d ∈ R.
Hence we can identify the quaternions H with R⁴, and the
unit quaternions with the 3D unit sphere S³ ⊂ R⁴.
The unit quaternion
$$q = \cos(\theta/2) + \sin(\theta/2)(ui + vj + wk) \tag{1}$$
represents the rotation by θ around the unit-length vector $(u, v, w)^\top$ in the following way.
A point $p = (x, y, z)^\top \in \mathbb{R}^3$ is represented as the imaginary quaternion
$$q_p = xi + yj + zk. \tag{2}$$
The rotated point p′ is represented by the imaginary quaternion
$$q_{p'} = q * q_p * \bar{q}, \tag{3}$$
where $\bar{q} = a - bi - cj - dk$ is the conjugate of q and ∗ is
the quaternion multiplication. From (3) it follows that q
and −q both represent the same orientation. This is the
reason why we prefer a probability density function on S³
that is antipodally symmetric. To avoid confusion about which
rotation quaternion to use, we always choose the one that
ensures the shortest rotation, i.e. the one in the same hemisphere,
except for 180° rotations, where both rotation quaternions lie
on the equator.
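As a concrete illustration of Eqs. (1)–(3), quaternion rotation of a point can be sketched in a few lines of NumPy. This is a sketch for illustration only; the function names are ours, not from the paper:

```python
import numpy as np

def quat_mult(q1, q2):
    """Hamilton product of two quaternions stored as arrays [a, b, c, d]."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ])

def quat_conj(q):
    """Conjugate of q = a + bi + cj + dk."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def axis_angle_to_quat(axis, theta):
    """Unit quaternion of Eq. (1): rotation by theta about a unit axis (u, v, w)."""
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    return np.concatenate(([np.cos(theta / 2)], np.sin(theta / 2) * u))

def rotate_point(q, p):
    """Rotate p via Eq. (3): the imaginary part of q * q_p * conj(q)."""
    q_p = np.concatenate(([0.0], p))  # imaginary quaternion of Eq. (2)
    return quat_mult(quat_mult(q, q_p), quat_conj(q))[1:]
```

Rotating (1, 0, 0) by 90° about the z-axis yields (0, 1, 0), and q and −q produce the same rotated point, which is exactly the antipodal symmetry discussed above.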
B. Probability Density Function on S³
We derive the pdf on S³ from multivariate Gaussian
distributions N(µ, Σ) on a tangent space to S³. The tangent
space is in our case 3D, and it is determined by the tangent
point p ∈ S³ and a base for the tangent space. For the point
p₀ = (1, 0, 0, 0)⊤ ∈ S³ we define the base of the tangent space
to be
$$B_0 = \{b_{01}, b_{02}, b_{03}\} = \{(0,1,0,0)^\top, (0,0,1,0)^\top, (0,0,0,1)^\top\}. \tag{4}$$
Points on S³ can be interpreted as quaternions. The canonical
base [2] at any other point p ∈ S³ is given by
$$B_p = \{\,p * b_{01} * p,\; p * b_{02} * p,\; p * b_{03} * p\,\}. \tag{5}$$
This defines a tangent space T S_p = T S(p, B_p) that is isomorphic to the Euclidean space R³. A Gaussian distribution N(µ, Σ) on R³ trivially induces a Gaussian distribution
on T S(p, B_p), and that distribution induces a distribution
on S³ via the central projection
$$\Pi_p : TS_p \to S^3, \qquad \Pi_p(p_t) = \left\{ \frac{p_t}{\|p_t\|},\; -\frac{p_t}{\|p_t\|} \right\}. \tag{6}$$
Each point on the tangent space T S_p is projected to two
opposite points on the sphere, as central projection means
projection along a straight line through the center of the
sphere. Hence, this line crosses the tangent space in one
point and the sphere in two points. This fits the topology
of unit quaternions perfectly. Points on the sphere that are
orthogonal to the tangent point are defined to be projected
to the compact closure of the tangent space.
The central projection Π_p is used to define a probability
distribution on S³ according to a probability distribution
on T S_p. A projected Gaussian [1]
$$\mathcal{PG} := \mathcal{N}(p, \mu, \Sigma) \text{ on } S^3 \tag{7}$$
can be derived from a Gaussian distribution N(µ, Σ) on R³
with the projection Π_p. To obtain a continuous completion of
the pdf on S³, we define the density to be zero for all points
on the equator of p. Up to a normalizing factor C, the value
of the projected Gaussian at the projected point Π_p(q) is the
same as the Gaussian kernel at point q,
$$\mathcal{N}(p, \mu, \Sigma)(\Pi_p(q)) := C\,\mathcal{N}(\mu, \Sigma)(q). \tag{8}$$
The normalizing factor C is found by taking the integral of
this projected pdf over S³ [3]. In subsection VI-C we give
an efficient algorithm for an approximation to this integral
of arbitrary precision. The projected Gaussian density pg is
a probability density function over the rotations in 3D, or
equivalently over the orientations on S³. Figure 1 shows a
projected Gaussian density on the sphere S² projected from
the tangent plane T S_NorthPole ≃ R² with the north pole as
point of contact.
Fig. 1. Projected Gaussian density pg on the sphere S², obtained by central
projection Π_NorthPole of a 2D Gaussian on the tangent space T S_NorthPole.
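The central projection of Eq. (6) and the unnormalized projected-Gaussian value of Eq. (8) can be sketched as follows. As a simplifying assumption we fix the tangent point at p₀ = (1, 0, 0, 0), so the tangent space is simply {1} × R³; for a general p the canonical base of Eq. (5) would be used instead:

```python
import numpy as np

def project_to_sphere(t):
    """Central projection Pi_p of Eq. (6) at p0 = (1, 0, 0, 0): a tangent-space
    point t in R^3 maps to an antipodal pair {q, -q} on S^3."""
    v = np.concatenate(([1.0], t))  # the point p0 + tangent offset in R^4
    q = v / np.linalg.norm(v)
    return q, -q

def back_project(q):
    """Inverse projection at p0: central projection of q (or -q) back onto
    the tangent space {1} x R^3. Undefined on the equator of p0."""
    if abs(q[0]) < 1e-12:
        raise ValueError("equator point: density is defined as zero there")
    return q[1:] / q[0]

def projected_gaussian_unnormalized(q, mu, cov):
    """Value of the projected Gaussian of Eq. (8) at q, up to the normalizing
    factor C: the Gaussian kernel evaluated at the back-projected point."""
    d = back_project(q) - mu
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)))
```

Because q and −q back-project to the same tangent-space point, the density is automatically antipodally symmetric, matching the quaternion double cover.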
C. Mixture Distribution
The distribution described above is peaked, i.e. it has one
orientation with maximal probability density, and away from
this maximum the probability decreases rapidly.
Since we need to be far more general to fulfill the
requirement of expressive power, we use the analogue of the
mixture of Gaussians in Euclidean space, i.e. we extend
our class of distributions to combinations of kernels as
introduced above.
A mixture of projected Gaussians (MPG) is defined as
$$\mathcal{M} = \sum_{i=1}^{n} \lambda_i\, \mathcal{PG}_i, \tag{9}$$
where PG_i are the projected Gaussian kernels and λ_i is the
weighting factor with ∑_i λ_i = 1, λ_i ≥ 0 ∀i and n ∈ N [1]. With
the back projection Π_p⁻¹ : S³ → T S_p from sphere to tangent
space, this mixture Π_p⁻¹(M) is identical to a mixture of
Gaussians in the Euclidean space.
A MPG is a pdf on S3 that we use to estimate orientations
parameterized by unit quaternions.
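Drawing samples from an MPG illustrates the definition in Eq. (9): pick a kernel by its weight, draw from its Gaussian on the tangent space, and project to S³. This sketch uses the left-translated base {p ∗ b₀ᵢ} as the tangent basis, which is one possible choice consistent with the construction above; the paper's canonical base of Eq. (5) may differ in detail:

```python
import numpy as np

def _qmult(q1, q2):
    """Hamilton product of quaternions stored as [a, b, c, d]."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def sample_mpg(components, n, seed=0):
    """Draw n unit quaternions from an MPG (Eq. (9)).
    components: list of (weight, tangent point p in S^3, mean mu in R^3,
    covariance 3x3); the weights must sum to 1."""
    rng = np.random.default_rng(seed)
    weights = np.array([w for w, _, _, _ in components])
    out = []
    for _ in range(n):
        _, p, mu, cov = components[rng.choice(len(components), p=weights)]
        t = rng.multivariate_normal(mu, cov)     # sample on the tangent space
        basis = [_qmult(p, e) for e in np.eye(4)[1:]]  # left-translated B_0
        v = p + sum(ti * bi for ti, bi in zip(t, basis))
        out.append(v / np.linalg.norm(v))        # central projection to S^3
    return np.array(out)
```

Every returned sample is a unit quaternion; with a single narrow kernel at p₀ = (1, 0, 0, 0) the samples cluster around p₀, as expected for a peaked kernel.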
IV. MODELING RIGID MOTION UNCERTAINTIES
The results of the previous section III are extended from
rotations to rigid motions in this section. The special Euclidean group SE(3) is the group of rigid motions in R3 .
A rigid motion is equivalent to a pose consisting of a
3D orientation and a 3D position. First, a parameterization
for the rigid motions is needed. Then, the distribution on
the special orthogonal group SO(3) of rotations has to be
extended to SE(3). We use the dual quaternions to represent
the rigid motion on S3 × R3 .
A. Rigid Motion Represented by Dual Quaternions
The ring of the dual quaternions is defined as
$$\mathbb{H}_D = \{\, dq \mid dq = q_{real} + \varepsilon \cdot q_{dual},\; q_{real}, q_{dual} \in \mathbb{H} \,\}, \tag{10}$$
where ε is a dual unit, having the property ε² = 0 [15].
As introduced above, a unit quaternion q_r represents a 3D
rotation. A 3D translation is represented by the imaginary
quaternion q_t = xi + yj + zk. Together they define a rigid
motion (q_r, q_t) in 6D. Hence, isomorphisms exist between
the representations
$$SE(3) \simeq (q_r, q_t) \simeq S^3 \times \mathbb{R}^3 \tag{11}$$
for rigid motions. The dual quaternion representing the rigid
motion (q_r, q_t) can be calculated explicitly by
$$dq := q_r + \varepsilon \cdot \tfrac{1}{2}\, q_t * q_r, \tag{12}$$
where ∗ denotes the quaternion multiplication. To transform a point p ∈ R³ by a rigid motion parameterized by
a dual quaternion dq, the point needs to be represented
by an imaginary quaternion q_p as in (2). Then the transformed point p′ is represented by the imaginary quaternion $dq ** q_p ** \overline{dq}$, where ∗∗ is the dual quaternion multiplication and $\overline{dq} = \bar{q}_{real} + \varepsilon \cdot \bar{q}_{dual}$ is the conjugate of dq.
B. Probability Density on S³ × R³
The projected Gaussians, introduced in section III-B to
define a pdf on S³, can easily be extended to a pdf on S³ × R³
and by (11) to a pdf on the rigid motions. The central
projection Π_p needs to be extended to handle not only
the rotation part, but also the translation part. We
investigate the Cartesian set product of the tangent space T S_p
and the space of translations R³. It is isomorphic to the
Euclidean space R⁶,
$$TS_p \times \mathbb{R}^3 \simeq \mathbb{R}^6. \tag{13}$$
We define a new central projection from the Cartesian set
product T S_p × R³ to the Cartesian set product S³ × R³ of the
sphere with the space of translations. In the new projection,
the mapping of the first three components from the tangent
space T S_p to the hypersphere S³ equals the central
projection mapping defined in section III-B. The mapping
of the last three components in the translation space R³ is
defined to be the identity. We denote the extended central
projection with Π_p as before,
$$\Pi_p : \underbrace{TS_p \times \mathbb{R}^3}_{\simeq\, \mathbb{R}^6} \to \underbrace{S^3 \times \mathbb{R}^3}_{\simeq\, SE(3)}. \tag{14}$$
The projected Gaussian can similarly be extended from
a pdf on S³ to S³ × R³. Again, everything remains the same,
but a Gaussian distribution N(µ, Σ) on R⁶ is projected by
the extended central projection. Accordingly, the projected
Gaussian
$$\mathcal{PG} := \mathcal{N}(p, \mu, \Sigma) \text{ on } S^3 \times \mathbb{R}^3 \tag{15}$$
is now defined on the special Euclidean group. Still it holds
that, up to a normalizing factor C, the value of the projected
Gaussian at the projected point Π_p(q) is the same as the
Gaussian kernel at point q,
$$\mathcal{N}(p, \mu, \Sigma)(\Pi_p(q)) := C \cdot \mathcal{N}(\mu, \Sigma)(q). \tag{16}$$
On the extended projected Gaussians, the definition of
a mixture distribution stays exactly the same as in section III-C. The MPG is defined as
$$\mathcal{M} = \sum_{i=1}^{n} \lambda_i\, \mathcal{PG}_i, \tag{17}$$
where PG_i are the projected Gaussian kernels and λ_i the
weighting factors [1]. Now, a MPG is a pdf on S³ × R³
that can be used to estimate poses in the special Euclidean
group of rigid motions. Its advantage compared to a single
projected Gaussian is its wide applicability.
We do not distinguish between the notations on the special
orthogonal group and the special Euclidean group, because
in the general case, the restriction to pure orientation is
obtained directly by setting the translation to 0.
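The dual quaternion algebra of Eqs. (10)–(12) is mechanical enough to sketch directly: a pose is stored as a (real, dual) quaternion pair, and ε² = 0 makes the product rule simple. The helper names below are ours, not from the paper:

```python
import numpy as np

def qmult(q1, q2):
    """Hamilton product of quaternions stored as [a, b, c, d]."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def dual_quat(q_r, t):
    """Eq. (12): dq = q_r + eps * (1/2) q_t * q_r, stored as (real, dual)."""
    q_t = np.concatenate(([0.0], t))  # imaginary quaternion of the translation
    return (q_r, 0.5 * qmult(q_t, q_r))

def dq_mult(dq1, dq2):
    """Dual quaternion product: since eps^2 = 0 (Eq. (10)),
    (r1 + eps d1)(r2 + eps d2) = r1 r2 + eps (r1 d2 + d1 r2)."""
    r1, d1 = dq1
    r2, d2 = dq2
    return (qmult(r1, r2), qmult(r1, d2) + qmult(d1, r2))

def dq_to_pose(dq):
    """Recover (q_r, t): inverting Eq. (12) gives q_t = 2 q_dual * conj(q_real)."""
    r, d = dq
    r_conj = np.array([r[0], -r[1], -r[2], -r[3]])
    return r, 2.0 * qmult(d, r_conj)[1:]
```

Composing two pure translations (1, 0, 0) and (0, 2, 0) with `dq_mult` and reading the result back with `dq_to_pose` yields the translation (1, 2, 0), the closed-form composition the text refers to.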
V. OPERATIONS ON MIXTURES OF PROJECTED GAUSSIANS
The fusion of information is an important requirement for
many applications using probability densities. We explain
two basic operations on MPG, the fusion and the propagation
of uncertainty.
A. Information Fusion
Pose information encoded in two MPGs can be fused
in two steps. First, the projected Gaussians have to be
transferred to a common tangent space, and second, each PG_i
from one mixture M₁ needs to be fused with every PG_j
from the other mixture M₂. The first step is performed by
double projection of the Gaussian kernels PG_i and PG_j
from the tangent spaces T S_{p_i} and T S_{p_j} to the common
one T S_{p_{ij}}, i.e. $\Pi^{-1}_{p_{ij}}(\Pi_{p_i}(\mathcal{PG}_i))$ and $\Pi^{-1}_{p_{ij}}(\Pi_{p_j}(\mathcal{PG}_j))$. In
case the tangent points already represent the same orientation, p_i ∈ {p_j, −p_j}, set p_{ij} = p_i. For all p_i ∉ {p_j, −p_j},
the new tangent point p_{ij} is calculated as the weighted
midpoint on the sphere S³ between the two former tangent
points p_i and p_j,
$$p_{ij} = \frac{\lambda_i p_i + \lambda_j p_j}{\|\lambda_i p_i + \lambda_j p_j\|}. \tag{18}$$
Recall, that the weights of the projected Gaussians PG i
and PG j are given by λi and λ j , which express the reliability
of the information content of the projected Gaussians.
The fusion of two projected Gaussian densities itself is similar to the fusion of two Gaussian densities ϕ₁ ∼ N(µ₁, Σ₁) and ϕ₂ ∼ N(µ₂, Σ₂) on Rⁿ pertaining
to the same phenomenon [7],
$$\varphi_3 \sim \mathcal{N}(\mu_3, \Sigma_3), \quad \mu_3 = (\Sigma_1 + \Sigma_2)^{-1}(\Sigma_1 \mu_2 + \Sigma_2 \mu_1), \quad \Sigma_3 = \left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)^{-1}. \tag{19}$$
We can generalize this to projected Gaussian densities pg₁ ∼ N(p₁, µ₁, Σ₁) and pg₂ ∼ N(p₂, µ₂, Σ₂) only if
the tangent spaces are reasonably close, i.e. the angle
between the tangent points should be small enough that
the transfer of projected Gaussians between tangent
spaces does not cause too large distortions. The new projected Gaussian kernels PG_{ij} that come about by successive fusion get new weighting coefficients to evaluate
the plausibility that the Gaussian kernels to be fused
pertain to the same phenomenon. The coefficients are calculated
as λ_{ij} := λ_i λ_j α_{ij} γ_{ij}, where
$$\alpha_{ij} = e^{-a \arccos\left((p_i^\top p_j)^2\right)} \tag{20}$$
penalizes projected Gaussians with detached tangent points,
because that makes it difficult to share a tangent space. The
factor a ∈ R determines the penalty increase for a growing
angle between the tangent points. The factor γ_{ij} is calculated
as
$$\gamma_{ij} = 1 - (\mu_i - \mu_j)(\Sigma_i + \Sigma_j)^{-1}(\mu_i - \mu_j)^\top \tag{21}$$
and expresses whether the transferred kernels are compatible
according to a coincidence measure based on the Mahalanobis distance. That means, even when the projected
Gaussians can share a tangent space, they still can be
incompatible, because differences of the µs and the Σs can
make it unlikely that both distributions describe the same
phenomenon.
Generally the rotation part of µ in PG is not zero. Since
it is advantageous to work with projected Gaussians that
have zero mean in the corresponding Gaussian on the tangent
space, we approximate the projected Gaussian kernel by one
that is transferred to the according tangent point p by central
projection to the sphere and then reprojection to the new
tangent space. The centered projected Gaussian kernel is
called PG⁰ := N(p, 0, Σ).
B. Propagation of Uncertainty
Deriving the pdf of a composition of rigid motions from
the pdf of the components is important in robotics. For
instance, the resulting uncertainty in a kinematic chain needs
to be estimated based on the joint uncertainties. In perception, the cumulated effect of sensor uncertainty and feature
uncertainty on the pose of an object needs to be estimated.
To calculate this propagation in our framework,
we define the composition of two projected Gaussians PG₁ := N(p₁, 0, Σ₁) and PG₂ := N(p₂, 0, Σ₂)
as
$$\mathcal{PG}_3 := \mathcal{N}\!\left(p_1 * p_2,\; 0,\; J_\gamma \Sigma_\gamma J_\gamma^\top\right), \tag{22}$$
where J_γ is a Jacobian matrix, defined below, and Σ_γ is the
block diagonal matrix of Σ₂ and Σ₁. In formula (22), the
Jacobian is
$$J_\gamma = \left.\frac{\partial f(\varphi_2, \varphi_1)}{\partial(\varphi_2, \varphi_1)}\right|_{(0,0)},$$
where f(ϕ₂, ϕ₁) denotes the reprojection of the product ∗∗ of the dual quaternions representing the transformations that are encoded in Π_{p₁}(ϕ₁)
and Π_{p₂}(ϕ₂), i.e.
$$f(\varphi_2, \varphi_1) = \Pi^{-1}_{p_3}\!\left(\Pi_{p_2}(\varphi_2) ** \Pi_{p_1}(\varphi_1)\right). \tag{23}$$
This composition formula allows us to propagate uncertainty,
encoded in a projected Gaussian, to another 6D pose by an
uncertain rigid motion.
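The tangent-space steps of the fusion, Eqs. (18), (19) and (21), translate directly into NumPy. This is a sketch of the per-kernel computations only (the transfer between tangent spaces and the α coefficient are omitted), with function names of our own choosing:

```python
import numpy as np

def fuse_gaussians(mu1, S1, mu2, S2):
    """Fusion of Eq. (19) on a shared tangent space:
    mu3 = (S1+S2)^{-1} (S1 mu2 + S2 mu1),  S3 = (S1^{-1} + S2^{-1})^{-1}."""
    mu3 = np.linalg.solve(S1 + S2, S1 @ mu2 + S2 @ mu1)
    S3 = np.linalg.inv(np.linalg.inv(S1) + np.linalg.inv(S2))
    return mu3, S3

def tangent_midpoint(p_i, p_j, lam_i, lam_j):
    """Weighted midpoint of Eq. (18) giving the common tangent point."""
    v = lam_i * p_i + lam_j * p_j
    return v / np.linalg.norm(v)

def compatibility(mu_i, mu_j, S_i, S_j):
    """gamma_ij of Eq. (21): Mahalanobis-based coincidence measure."""
    d = mu_i - mu_j
    return 1.0 - float(d @ np.linalg.solve(S_i + S_j, d))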
VI. EFFICIENCY ISSUES
In this section we present three techniques for saving
memory and processing time.
A disadvantage of information fusion is that the number
of mixture elements is rapidly growing. For a mixture M1
having n1 elements, and another mixture M2 having n2
elements, the fused mixture M12 has n1 n2 elements. To keep
the calculation time reasonably small, we replace similar
projected Gaussians by a new projected Gaussian that contains the information of the old ones. We call this procedure
merge.
In some applications, there is background noise that can
hardly be modeled efficiently by a small number of Gaussian
kernels. An example is the estimation of an object pose
from a feature on the object, where the feature might be
mismatched, i.e. not pertain to the object. Then we want to
model the degree of reliability in the weights that we give
to the projected Gaussians. But the weights of the mixture
elements must sum to 1. Hence, we add a component to
the mixture that expresses the uncertainty, that the feature
might be an outlier. This uncertainty is best modeled by a
component which we call unit distribution, as we have no
information about the true pose of the outlier.
A third issue, explained in this section, is an efficient
approximation of the renormalization factor.
A. Merge
Reducing the number of mixture elements by merging can
be realized iteratively. All pairs of projected Gaussians in
the mixture {PG i , PG j }, i 6= j, are tested for similarity
according to the Kullback-Leibler discrimination (KLD).
According to [16], this is the ideal cost function for Gaussian
mixture reduction. For a reliable comparison, each pair
needs to be transferred to a common tangent space T S pi j
as in V-A. Instead of calculating the KLD directly, which
is computationally expensive, we define a dissimilarity measure B((λ_i, PG_i), (λ_j, PG_j)) analogous to the one proposed in [17], but for MPG instead of regular Gaussian mixtures.
The dissimilarity measure B is an upper bound for the (non-symmetric) Kullback-Leibler discrimination of (PG_i, PG_j)
and of (PG_j, PG_i). The two mixture elements which are the
closest according to the dissimilarity measure B are replaced
by their moment-preserving merge (λ_{ij}, PG_{ij}), with
λ_{ij} = λ_i + λ_j, p_{ij} defined as in (18), and
$$\mu_{ij} = \frac{1}{\lambda_{ij}}\left(\lambda_i \mu_i + \lambda_j \mu_j\right), \qquad
\Sigma_{ij} = \frac{1}{\lambda_{ij}}\left(\lambda_i \Sigma_i + \lambda_j \Sigma_j\right) + \frac{\lambda_i \lambda_j}{\lambda_{ij}^2}\,(\mu_i - \mu_j)(\mu_i - \mu_j)^\top. \tag{24}$$
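The moment-preserving merge of Eq. (24) is a one-liner per moment; a sketch for two kernels already transferred to a shared tangent space (the Σ term follows the standard moment-preserving merge, which the garbled original appears to match):

```python
import numpy as np

def merge_kernels(lam_i, mu_i, S_i, lam_j, mu_j, S_j):
    """Moment-preserving merge of Eq. (24): the merged kernel keeps the
    mixture's weight, mean, and covariance (including the spread between
    the two means)."""
    lam = lam_i + lam_j
    mu = (lam_i * mu_i + lam_j * mu_j) / lam
    d = (mu_i - mu_j).reshape(-1, 1)
    S = (lam_i * S_i + lam_j * S_j) / lam + (lam_i * lam_j / lam**2) * (d @ d.T)
    return lam, mu, S
```

One can verify the "moment-preserving" property numerically: the merged (µ, Σ) reproduce the first and second moments of the two-component mixture exactly.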
B. Background Noise
The unit distribution U is introduced as first component
to the mixture
$$\mathcal{M} = \lambda_0\, \mathcal{U} + \sum_{i=1}^{n} \lambda_i\, \mathcal{PG}_i, \tag{25}$$
where ∑_i λ_i = 1. On compact sets the distribution U would
be the uniform distribution, but in our case we have to
define a distribution over SE(3) = S³ × R³. To emphasize the
definition of the distribution on SE(3) = S³ × R³, we call U
the unit distribution. It is characterized by the fusion with other
kernels according to [3]
$$\mathcal{U} \times \mathcal{PG} := \mathcal{PG}, \qquad \mathcal{PG} \times \mathcal{U} := \mathcal{PG}, \qquad \mathcal{U} \times \mathcal{U} := \mathcal{U}. \tag{26}$$
The fusion of pose information is not affected by the
introduction of the unit distribution, as it has the properties
of a unit element. If the unit distributions of both mixtures
are fused, U remains unchanged and gets the sum of the
two former weights as new weight. In case a unit
distribution U₀ with weight λ₀ and an arbitrary projected
Gaussian kernel PG_k with weight λ_k are fused, we obtain PG_k
with the new weight λ_{0k} = λ₀ λ_k. On iterative fusion, the unit
distribution weight converges to 0 exponentially fast with the
number of steps.
C. Renormalization
In section III-B we mention a renormalization constant C
that makes the difference between a Gaussian and a projected
Gaussian. To calculate this factor C, we need to integrate the
projection over S³ × R³, which is equivalent to calculating
$$I = \int_{\mathbb{R}^6} f(u,v,w)\, g(u,v,w,x,y,z)\; d(u,v,w,x,y,z), \tag{27}$$
where g is the density of a Gaussian in R⁶ and
$$f(u,v,w) := \frac{1}{(1 + u^2 + v^2 + w^2)^2} \tag{28}$$
is the volume correction term in the integration by substitution. We denote p := (u, v, w, x, y, z) and use the parameters
Σ and µ of the Gaussian density g,
$$I = C_g \int_{\mathbb{R}^6} f(u,v,w)\, e^{-0.5\, (p - \mu)\, \Sigma^{-1} (p - \mu)^\top}\, dp, \tag{29}$$
where
$$C_g = \frac{1}{(2\pi)^3 \sqrt{\det \Sigma}}. \tag{30}$$
This integral can either be calculated via Monte Carlo
integration or using sparse grids methods [18]. Monte Carlo
methods are either slow or provide no exact results. Sparse
grids methods overcome the curse of dimensionality to some
extent, but have problems with Gaussian densities and are
also computationally expensive [19]. Therefore, we take
advantage of the special structure of the integrand and adopt
a different approach. The factor f is a bell-shaped curve
depending only on the square of the radius,
$$f(u,v,w) = \frac{1}{(1 + r^2)^2} \quad \text{with } r^2 := u^2 + v^2 + w^2. \tag{31}$$
It can be shown that f can be approximated to arbitrary
precision with exponential functions,
$$\max_{r}\; \left| \frac{1}{(1 + r^2)^2} - \sum_{i=1}^{n} a_i\, e^{-b_i r^2} \right| < \varepsilon, \qquad a_i, b_i \in \mathbb{R}. \tag{32}$$
Letting R be the 6 × 6 matrix with the first three diagonal
elements 1 and all other entries 0, instead of the original
integral (29), we now calculate two integrals of the type I(b)
to approximate I:
$$I(b) = \int_{\mathbb{R}^6} e^{-b r^2}\, e^{-0.5\, (p - \mu)\, \Sigma^{-1} (p - \mu)^\top}\, dp
= \int_{\mathbb{R}^6} e^{-b\, p R p^\top}\, e^{-0.5\, (p - \mu)\, \Sigma^{-1} (p - \mu)^\top}\, dp
= \int_{\mathbb{R}^6} e^{-0.5 \left[\, p\, (2bR)\, p^\top + (p - \mu)\, \Sigma^{-1} (p - \mu)^\top \right]}\, dp.$$
The sum of quadratic terms is again a quadratic term. So in
order to complete the square for the sum
$$S := p\, (2bR)\, p^\top + (p - \mu)\, \Sigma^{-1} (p - \mu)^\top, \tag{33}$$
we find the minimum of the sum, i.e. the 'mean
value' µ′ = (2bΣR + Id)⁻¹ µ. Using this new µ′ in the
quadratic form (33) above, we find the constant term
$$C_Q = \mu'\, (2bR)\, \mu'^\top + (\mu' - \mu)\, \Sigma^{-1} (\mu' - \mu)^\top, \tag{34}$$
such that
$$S = (p - \mu')\left(2bR + \Sigma^{-1}\right)(p - \mu')^\top + C_Q. \tag{35}$$
With the shorthand notation $\Sigma_b^{-1} := 2bR + \Sigma^{-1}$, the approximate integral becomes
$$I(b) = e^{-0.5\, C_Q} \int_{\mathbb{R}^6} e^{-0.5\, (p - \mu')\, \Sigma_b^{-1} (p - \mu')^\top}\, dp.$$
The integrand is a multivariate normal distribution up to the
normalizing factor of a Gaussian in R⁶, so the value of the
integral is the inverse of that factor, and
$$I(b) = e^{-0.5\, C_Q}\, (2\pi)^3 \sqrt{\det(\Sigma_b)}. \tag{36}$$
From (30) and (36) we get a closed-form approximation
for C at little computational cost, with n = 2 in (32):
$$C \cong C_g \left( a_1 I(b_1) + a_2 I(b_2) \right). \tag{37}$$
The approximation coefficients a_i and b_i are the same for
each projection Π_p and can be calculated offline. Using this
method, the calculation for the fusion is 50 times faster than
using Monte Carlo with 2000 samples, which yields a similar
accuracy.
Fig. 3. Flags determine the full 6D pose and replace drawing coordinate
systems at each sample.
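The closed-form integral I(b) of Eq. (36) and the renormalization of Eq. (37) can be sketched as follows. The offline-fitted coefficients (a_i, b_i) of Eq. (32) are not given in the text, so they appear here as parameters:

```python
import numpy as np

def I_of_b(b, mu, cov):
    """Closed-form Gaussian integral I(b) of Eq. (36) in R^6, where R has
    ones in the first three diagonal entries (the rotation block)."""
    R = np.zeros((6, 6))
    R[0, 0] = R[1, 1] = R[2, 2] = 1.0
    cov_inv = np.linalg.inv(cov)
    Sb = np.linalg.inv(2 * b * R + cov_inv)          # Sigma_b of the text
    mu_p = np.linalg.solve(2 * b * cov @ R + np.eye(6), mu)  # mu' minimizer
    CQ = mu_p @ (2 * b * R) @ mu_p + (mu_p - mu) @ cov_inv @ (mu_p - mu)
    return np.exp(-0.5 * CQ) * (2 * np.pi) ** 3 * np.sqrt(np.linalg.det(Sb))

def renormalization(mu, cov, coeffs):
    """C of Eq. (37) with n exponential terms approximating the volume
    correction f; coeffs is a list of offline-fitted pairs (a_i, b_i)."""
    Cg = 1.0 / ((2 * np.pi) ** 3 * np.sqrt(np.linalg.det(cov)))  # Eq. (30)
    return Cg * sum(a * I_of_b(b, mu, cov) for a, b in coeffs)
```

A useful sanity check: with a single term a₁ = 1, b₁ = 0 the correction factor f is replaced by 1, so the formula must reproduce the unit integral of the normalized Gaussian.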
VII. ILLUSTRATION OF THE MPG INFORMATION FUSION
To demonstrate the framework, we simulate a robotic application. Here we consider the 6D pose estimation problem
for a box-shaped object, using the class of probability
distributions Mixtures of Projected Gaussians.
Fig. 4. left: Independent object pose estimates using an edge feature
(red flags) and a surface feature (blue flags). right: Resulting object pose
information after fusion of the red and blue mixtures from the left image.
A. Viewing Areas and Visualization of 6D Pose
The goal of this subsection is to illustrate the assignment
of pdfs to the features. If, for example, an edge of an object
is detected in a camera image, there are only certain object
poses under which this edge is visible at all. This is illustrated
in Figure 2 to the left, where the possible camera poses are
outlined with a dashed line. Similarly, a point feature like
the letter F, see Figure 2 to the right, is only visible under
certain poses. These viewing cylinder segments and cones
are interpreted as widely spread pdfs describing the object
pose, see section IV. The pose estimate is narrowed down
by fusing these pdfs using the operators described above.
Fig. 2. Viewing cylinder segment of camera positions on the left, and
viewing cone on the right, for detecting the front left object edge as an
outer edge and the surface feature ’F’, respectively.
In the following, the pdf describing a 6D pose is visualized
by drawing a number of samples from the distribution.
Figure 3 shows the visualization of a 6D pose; each flag
represents a sample from the estimated probability distribution.
It is placed at the 3D point of the sample and its orientation
equals the orientation of the sample. That is to say, the flag
pole equals the z-axis and its pennant points in the direction of
the x-axis of the right-handed coordinate system.
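How a flag could be derived from a sample is sketched below; the (w, x, y, z) quaternion convention and the function names are our assumptions, not the paper's implementation:

```python
import numpy as np

def quat_to_rot(q):
    # Unit quaternion (w, x, y, z) -> 3x3 rotation matrix.
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def flag(position, quaternion):
    # The flag pole is the sample's z-axis, the pennant its x-axis.
    R = quat_to_rot(quaternion)
    return position, R[:, 2], R[:, 0]

pos, pole, pennant = flag(np.zeros(3), np.array([1.0, 0.0, 0.0, 0.0]))
print(pole, pennant)   # [0. 0. 1.] [1. 0. 0.]
```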
B. Pose Estimation Process
In this subsection we illustrate the 6D pose estimation process using the operations on MPG introduced in section V.
In Figure 4 to the left, the red flags depict the possible object
poses resulting from all short edges of the box, and the
blue flags depict the possible object poses resulting from
a point feature on the surface of the box. Note that the
short edge could not be identified unambiguously, hence
the object poses resulting from all eight short edges have
to be considered and we obtain 80 kernels, which together
represent the possible object poses. The resulting pdf is
thus widely spread as shown by the red flags. We assume
that the point feature is unique for the object. This means
the corresponding pose pdf is more focused and has fewer
kernels, as depicted by the blue flags. Fusing the object
pose distributions from both features, see subsection V-A,
the impossible poses are eliminated. The ruled-out poses
include all those where the edge and the point feature cannot
be detected simultaneously, or where the point feature indicates
that the object is upright while the edge indicates that it is
upside down. The mixture resulting from the fusion of both
estimates is shown to the right of Figure 4. The objects are
drawn at samples from the resulting mixture. Even though
this mixture is already more peaked, the uncertainty of the
6D pose still spans several centimeters and several degrees.
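In Euclidean approximation (ignoring the projection onto the quaternion manifold and the renormalization of section VI), the fusion of two mixtures reduces to pairwise products of Gaussian kernels. A minimal sketch with illustrative means and covariances:

```python
import numpy as np

def fuse_kernels(m1, S1, m2, S2):
    # The product of two Gaussians is an unnormalized Gaussian; the
    # scale factor c down-weights pairs whose means disagree, which is
    # how mutually impossible pose hypotheses are suppressed.
    K = np.linalg.inv(S1 + S2)
    S = S1 @ K @ S2                    # fused covariance
    m = S2 @ K @ m1 + S1 @ K @ m2      # fused mean
    d = m1 - m2
    c = np.exp(-0.5 * d @ K @ d) / np.sqrt(np.linalg.det(2.0 * np.pi * (S1 + S2)))
    return c, m, S

# Two agreeing unit-covariance kernels: the fused mean is the midpoint.
c, m, S = fuse_kernels(np.zeros(2), np.eye(2), np.array([2.0, 0.0]), np.eye(2))
print(m)   # [1. 0.]
```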
The estimate is sharpened by including more features
in the fusion process, taking into account the efficiency
considerations of section VI. In the example shown in Figure 5
we fused seven features. Only six of them are actually features
of the object: four edge features and two surface features on
the front side of the box. One feature is a mismatch which is
not on the box but part of the background noise.
Fusing the pdf of these features provides a peaked probability
distribution for the object pose.
Fig. 5. Seven features used for estimating the pose of the box object.
Three features are surface features, whereof one is an outlier in the
background, and four features are edge features.
Fig. 6. Estimation of the object pose after fusion of all seven features,
visualized by green samples. left: For easy comparison, the object pose
information from an edge feature and a surface feature is additionally
visualized. right: Visualization of the target as several samples drawn
from the final distribution.
Figure 6 shows in both images (in green) the resulting
probability distribution. The estimate accuracy has now
increased such that 95% of the probability mass lies within a
bounding box of less than 1 centimeter side length and within
a rotation of less than 10◦ . The outlier included in the fusion
process has no significant influence on this result. The
possibility of each feature being an outlier is captured by the
unit distribution introduced in section VI-B, which makes the
fusion process robust with respect to outliers.
VIII. C ONCLUSION
In this paper we describe a class of pdf on the rigid
motions which is well suited for problems in robotics and
robotic perception, but can also be used in other applications.
This class of pdf has many useful properties. It has a
large expressive power, being able to describe wide-spread
pdf and to model the correlation between position and
orientation. Further, it has closed form expressions for the
fusion of information and for the propagation of uncertainty.
The computations are made efficient by special approximation techniques for renormalization and for modeling of
background noise. Since the class is based on mixtures of
Gaussian distributions, more extensions are possible. We
have implemented sampling from the distribution as well as
fitting a distribution from this class to a sample set using a
variant of the EM algorithm. Further, techniques have been
adapted from the mixtures of Gaussians to fuse base elements
or to drop small base elements. A full implementation of
the class of pdf and of the operations has been made in
Python, and the approach has been validated in simulation.
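As an illustration of the sampling operation, a minimal sketch that draws from a single kernel in a 6D tangent space and maps the rotational part to a unit quaternion via the exponential map; this chart and all values are our assumptions, not necessarily the implementation's:

```python
import numpy as np

rng = np.random.default_rng(7)

def exp_quat(v):
    # Rotation-vector exponential map: R^3 -> unit quaternion (w, x, y, z).
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta / 2.0)],
                           np.sin(theta / 2.0) * v / theta))

def sample_kernel(mean6, cov6):
    # Draw a 6D tangent-space sample: 3 translation + 3 rotation parameters.
    s = rng.multivariate_normal(mean6, cov6)
    return s[:3], exp_quat(s[3:])

pos, quat = sample_kernel(np.zeros(6), 0.01 * np.eye(6))
```

Sampling from the full mixture then only requires choosing a kernel with probability proportional to its weight before drawing.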
A re-implementation in C++ is under way. The code will be
made available open source as soon as we finish an easy-to-handle
toolbox.
ACKNOWLEDGMENT
This work was funded in parts under the ARTEMIS Joint
Undertaking as part of the project R3-COP, from the German
Federal Ministry of Education and Research (BMBF) under
grant no. 01IS10004E, and was supported in part within
the DFG excellence initiative research cluster Cognition for
Technical Systems - CoTeSys and the European FP7 research
project "WEARHAP - Wearable Haptics for Humans and
Robots”.
R EFERENCES
[1] Wendelin Feiten, Pradeep Atwal, Robert Eidenberger, and Thilo
Grundmann. 6D Pose Uncertainty in Robotic Perception. In Torsten
Kröger and Friedrich M. Wahl, editors, Advances in Robotics Research, pages 89–98. Springer Berlin Heidelberg, 2009.
[2] Wendelin Feiten and Muriel Lang. MPG - A Framework for Reasoning
on 6 DOF Pose Uncertainty. Corporate Technology, Intelligent
Systems & Control, Siemens AG, D-80200 Munich, 2011. ICRA
Workshop on Manipulation Under Uncertainty.
[3] Muriel Lang and Wendelin Feiten. MPG - Fast Forward Reasoning
on 6 DOF Pose Uncertainty. In 7th German Conference on Robotics
(ROBOTIK 2012), Munich, Germany, May 2012.
[4] Christoph Hertzberg, René Wagner, Udo Frese, and Lutz Schröder.
Integrating generic sensor fusion algorithms with sound state representations through encapsulation of manifolds. Information Fusion,
14(1):57–77, 2013.
[5] J. Stuelpnagel. On the parameterization of the three-dimensional
rotation group. In NASA Technical Documents. NASA, January 1948.
[6] L. Kavan, S. Collins, C. O’Sullivan, and J. Zara. Dual quaternions for
rigid transformation blending. Technical Report TCD-CS-2006-46,
Trinity College Dublin, 2009.
[7] James Samuel Goddard. Pose And Motion Estimation From Vision
Using Dual Quaternion-Based Extended Kalman Filtering. PhD thesis,
University of Tennessee, Knoxville, 1997.
[8] J. S. Goddard and M. A. Abidi. Pose and motion estimation using
dual quaternion-based extended Kalman filtering. SPIE Conf. on Three-Dimensional Image Capture and Applications, 3313:189–200, January
1998.
[9] Seth Teller and Matthew E. Antone. Robust camera pose recovery
using stochastic geometry. Technical report, Proceedings of the AOIS2001, 2001.
[10] Jared Glover, Gary Bradski, and Radu Bogdan Rusu. Monte Carlo pose
estimation with quaternion kernels and the Bingham distribution. Los
Angeles, USA, 2011. Submitted to Robotics: Science and Systems
(RSS 2011).
[11] J. J. Love. Bingham statistics. In Encyclopedia of Geomagnetism &
Paleomagnetism, pages 45–47. Springer, Dordrecht, The Netherlands,
2007.
[12] Su Bang Choe. Statistical analysis of orientation trajectories via
quaternions with applications to human motion. PhD thesis, University of Michigan, 2006.
[13] K. V. Mardia, C. C. Taylor, and G. K. Subramaniam. Protein bioinformatics and mixtures of bivariate von Mises distributions for angular
data. Biometrics, 63(2):505–512, June 2007.
[14] E. Kraft. A quaternion-based unscented Kalman filter for orientation
tracking. Information Fusion, 1:47–54, 2003.
[15] Erhan Ata and Yusuf Yayli. Dual unitary matrices and unit dual
quaternions. Differential Geometry - Dynamical Systems, 10:1–12,
2008.
[16] Jason L. Williams. Gaussian Mixture Reduction for Tracking Multiple
Maneuvering Targets in Clutter. Storming Media, Air Force Inst of
Tech Wright-Patterson AFB OH School of Engineering and Management, 2003.
[17] Andrew R. Runnalls. A Kullback-Leibler Approach to Gaussian
Mixture Reduction. IEEE Transactions on Aerospace and Electronic
Systems, 43(3):989–999, 2007.
[18] Hans-Joachim Bungartz and Michael Griebel. Sparse grids. Acta
Numerica, 13:147–269, May 2004.
[19] Dirk Pflüger. Spatially Adaptive Sparse Grids for High-Dimensional
Problems. Verlag Dr. Hut, München, August 2010.