Rigid Motion Estimation using Mixtures of Projected Gaussians Wendelin Feiten1 , Muriel Lang2 and Sandra Hirche2 Abstract— Modeling the position and orientation in threedimensional space is important in many applications. In robotics, the position and orientation of objects as well as the rigid motions of robots are derived from sensor data that are uncertain. The uncertainties of these sensor data result in position and orientation uncertainties that can be very widely spread or have several peaks. In this paper we describe a class of probability density functions (pdf ) on the group of rigid motions that allows for modeling wide-spread and multi-modal pdf and offers most of the operations that are available for the mixtures of Gaussians on Euclidean space. The use of this class of pdf is illustrated with an example from robotic perception. I. I NTRODUCTION In many applications, the position and orientation of objects in the three-dimensional (3D) space or the motion of objects through the 3D space need to be modeled. Various approaches to this task have been explored in the past in different areas. In the estimation of the motion of airplanes, generally the rotation is described by the three angles of roll, pitch and yaw. In computer graphics, the rotation in 3D is generally described by unit quaternions. In robotics, the Rodrigues vector is often used as a parameterization of the 3Dl rotation. These rotation representations are then combined with the translation to model the rigid motion, which is equivalent to the position and orientation in space. The standard representation in mathematics and also in robotics is the rotation matrix, in robotics generally enhanced with an additional column representing the translation. In most cases, where position and orientation are derived from measurements, this is not a certain information, but it depends on random observations. This means that in addition to the parameterization of the rigid motion a probability density function (pdf ) over these parameters needs to be given. The contribution of this paper is a new class of pdf for the probabilistic representation of rigid motion. In order to cover a wide range of applications, the following requirements should be satisfied by such a class of pdf : Expressive Power: The class should be able to approximate any pdf. Information Fusion: For two pdf from the class pertaining to the same pose, there should be an algorithm to calculate the resulting pose estimate as a pdf from this class that best explains both prior estimates. 1 W. Feiten is with Siemens Corporate Technology, Siemens AG, D-81739 Munich [email protected] 2 M. Lang and S. Hirche are with the Institute for InformationOriented Control, Faculty of Electrical Engineering and Information Technology, Technische Universität München, D-80290 Munich [email protected] Information Propagation: For two pdf pertaining to two subsequent motions, there should be an algorithm to calculate the pdf of the composition of the two motions. Efficiency: The algorithms should be fast enough and the space requirements should be small enough to allow for the use of the pdf in real time applications. In section II we will investigate how these requirements are addressed in different approaches and draw our conclusions as to the concepts we want to combine and extend in our approach. The resulting class of pdf together with the algorithms according to the requirements is then developed in the subsequent sections. In section III we explain how the Gaussian distribution in R3 is used to represent a peaked pdf on the rotations and how this approach generalizes to represent also widely spread or multimodal pdf using mixture distributions (we call this the Mixtures of projected Gaussians or MPG). We then extend this from rotations to the rigid motions in section IV. Section V describes the information fusion and the uncertainty propagation with the mixture distributions on the rigid motions. To keep the number of mixture elements small, similar elements are merged and negligible ones are disregarded. A background noise is introduced in section VI and also an approach to keep the computation time at a reasonable size. In section VII we then illustrate the use of this class of pdf in the field of robotic perception and conclude in section VIII with a short summary of the properties of the new class of pdf and the potential further development. This presentation of our new class of pdf on rigid motions builds on our previous publications [1], [2], [3] and gives an overview of the approach. II. R ELATED W ORK Representing the position and orientation of objects in three-dimensional space, or equivalently, representing the rigid motion, has been under investigation for such a long time that giving a complete outline of the history seems inadequate. Admitting that much more work has been done, and giving credit to it, we report here the work that we used to develop our approach. Along with a short assessment of the crucial ingredients of the respective work, we explain which of them we decided to integrate in our approach and why. The first design decision is, which parameterization of the rotations and of the rigid motions to use. Rotation matrices are well understood and easy to visualize. Also, extended with the translation vector as an additional column and an additional row (0, 0, 0, 1) to a homogeneous transformation matrix, they represent the rigid motions in a way that composition of motions is done by matrix multiplication. However, the rigid motions have inherently only six degrees of freedom, and this representation has twelve parameters. As six equality constraints are needed on the parameter space to determine the valid subset of parameter, we decided against this representation. Euler angles, for example the parameter set roll, pitch, yaw, need no equality constraints to reduce the dimension, because the rotation has three degrees of freedom. If the values for the Euler angles become large, the parameterization can become singular. There is an orientation for which the parameters can not be uniquely determined (gimbal lock). Since we want to be able to deal with a wide range of orientations, we decided against Euler angles. Rodrigues vectors, like Euler angles, use only three parameters for the representation of the orientation. Hence, no equality constraints are needed, and a pdf can be induced on the rotations from a pdf on R3 (see [4]). But for Rodrigues vectors there is no efficient way, to the best of our knowledge, to determine the parameters of a composition of rigid motions from the parameters for each individual motion. This makes it technically more difficult to determine the pdf of the composition from the pdf of the two components. The unit quaternions represent the rotations using four parameters, so one equality constraint is needed to reduce the dimension (see [5]). The composition of rotations corresponds to multiplication of unit quaternions, and there is no singularity (except for one symmetry that is easy to handle). They are in wide use in the computer graphics community (see [6]). Further, the rigid motions can be described by dual quaternions (they will be explained below). The dual quaternions only need one equality constraint, are unique for a rigid motion (except the easy to handle symmetry in one quaternion) and have a closed form for the composition of rigid motions. Their advantage is that the composition of rigid motions corresponds to the multiplication of dual quaternions, so that there is a closed form for the parameters of the composition (see [6]). Goddard and Abidi [7], [8] use this formalism to capture the correlation between rotation and translation, and also use the derivatives to estimate the dynamical state of a system. The dual quaternions seem to us to be the best match with our requirements. This takes us to the decision which distribution types to use. A very common distribution type to describe arbitrary pdf is the description with a sample set. The expressive power is high, composition is easy, but this comes at a price. In high dimensions (and for wide spread distributions, six dimensions is high) the sample sets need to be very large in order to be representative of the pdf. Also, the fusion of two particle sets is not straight forward and computationally expensive. So we decide to keep the sample set description as a fallback option, but to develop a representation that is still general, but more efficient in terms of memory and computation time. Since we want to use the same operations on the pdf of our class that are available for the Gaussians and the mixtures of Gaussians, we specifically look at those distributions that are derived from the Gaussian distributions in Euclidean space. Basically we distinguish two different approaches. In the first approach, the pdf is defined by a Gaussian in R4 with zero mean, taking the values on the unit sphere S3 ⊂ R4 and renormalizing with the value of the integral over the sphere. This is also known as the Bingham distribution. This type of distribution has been used by [9] and by [10]. One convenient property of this distribution is that the values for antipodal points on the sphere are identical. This makes sense because antipodal points (more precisely the unit quaternions corresponding to antipodal points) represent the same rotation. However, two objections need to be made. The first one is that calculating the integral is very expensive (see [11]). Glover [10] therefore computes a lookup table of normalizing constant approximations using standard floating point arithmetic. The second objection is that the distribution resulting from a composition of motions is not a Bingham distribution, so an additional approximation step is needed to stay in the Bingham distributions. In the second approach, the unit sphere is treated as a manifold, and is in turn (locally) parameterized by suitable tangent spaces. The value of the pdf on the unit sphere is defined by the value of a Gaussian on the corresponding tangent space. This approach has for example been taken by Choe [12] and Goddard [7]. Both approaches do not meet all of our requirements: Choe does not take the translations into account, and Goddard uses one Gaussian on one tangent space only. This is a suitable approach in case that the initial estimate and the information to be fused is highly certain, but we need to be more general. This requirement can be met by using mixture distributions. Mardia et al. [13] use a mixture of bivariate von Mises distributions to describe a general distribution to specify the secondary structure of proteins. They use a pdf on the torus, not on the sphere, but the technique carries over. Finally, we carry to our approach the common technique of approximating the pdf of a non-linear function of random variables with Gaussian distributions. We use the linearization based on the Jacobian of the non-linear function. Alternatively, the approach from unscented Kalman filters used by Kraft [14] could also be carried over to our approach. Combining the selected techniques from the related work and adding a fair bit (as described below) we arrive at the following approach: The rigid motions are parameterized with dual quaternions. The rotation part of the dual quaternion is locally parameterized by a tangent plane at a tangent point, the bijection between tangent plane and unit sphere is given by the central projection. The pdf on the dual quaternions is a mixture distribution. Each element of the mixture is derived from a Gaussian on the 6D Euclidean space that is the Cartesian product of the space of translation vectors and the tangent plane. This approach allows for carrying over many operations from Gaussian mixtures to our new class of pdf. Because they are derived from the central projection of Gaussians we call them Mixtures of Projected Gaussians. III. M ODELING ROTATION U NCERTAINTIES For didactic reasons, and because modeling the uncertainty of orientation in 3D is an interesting topic in itself, we first restrict ourselves to investigate the rotation part of a rigid motion. We start with a short review of the parameterization of the rotations by unit quaternions. A. Rotation Represented by Unit Quaternions A quaternion is a number q = a + bi + c j + dk, where i, j, k are three imaginary parts and a, b, c, d ∈ R. Hence we can identify the quaternions H with R4 , and the unit quaternions with the 3D unit sphere S3 ⊂ R4 . The unit quaternion q = cos(θ /2) + sin(θ /2)(ui + v j + wk) (1) represents the rotation by θ around the unit length vector (u, v, w)> in the following way. A point p = (x, y, z)> ∈ R3 is represented as the imaginary quaternion qp = xi + y j + zk. (2) The rotated point p0 is represented by the imaginary quaternion (3) q p0 = q ∗ qp ∗ q, where q = a − bi − c j − dk is the conjugate of q and ∗ is the quaternion multiplication. From (3) it follows that q and − q both represent the same orientation. This is the reason, why we prefer a probability density function on S3 that is antipodal symmetric. To avoid confusion about which rotation quaternion to use, we always choose the one, which ensures shortest rotation, i.e. the one in the same hemisphere, except for 180◦ rotations, where both rotation quaternions are on the equator. Each point on the tangent space T S p is projected to two opposite points on the sphere, as central projection means projection along a straight line through the center of the sphere. Hence, this line crosses the tangent space in one point and the sphere in two points. This fits perfectly the topology of unit quaternions. Points on the sphere, that are orthogonal to the tangent point are defined to be projected to the compact closure of the tangent space. The central projection Π p is used to define a probability distribution on S3 according to a probability distribution on T S p . A projected Gaussian [1] PG := N ( p, µ, Σ) on S3 (7) can be derived from a Gaussian distribution N (µ, Σ) on R3 with the projection Π p . To obtain a continuous completion of the pdf on S3 , we define the density to be zero for all points on the equator of p. Up to a normalizing factor C, the value of the projected Gaussian at the projected point Π p ( q) is the same as the Gaussian kernel at point q, N ( p, µ, Σ)(Π p ( q)) := CN (µ, Σ)( q). (8) The normalizing factor C is found by taking the integral of this projected pdf over S3 [3]. In the subsection VI-C we give an efficient algorithm for an approximation to this integral of arbitrary precision. The projected Gaussian density pg is a probability density function over the rotations in 3D, or equivalently over the orientations on S3 . Figure 1 shows a B. Probability Density Function on S3 We derive the pdf on S3 from the multivariate Gaussian distributions N (µ, Σ) on a tangent space to S3 . The tangent space is in our case 3D, and it is determined by the tangent point p ∈ S3 and a base for the tangent space. For the point p0 = (1, 0, 0, 0)> ∈ S3 we define the base of the tangent space to be B0 = { b01 , b02 , b03 } = {(0, 1, 0, 0)> , (0, 0, 1, 0)> , (0, 0, 0, 1)> }. (4) Points on S3 can be interpreted as quaternions. The canonical base [2] at any other point p ∈ S3 is given by B p = { p ∗ b01 ∗ p, p ∗ b02 ∗ p, p ∗ b03 ∗ p}. (5) This defines a tangent space T S p = T S( p, B p ) that is isomorphic to the Euclidean space R3 . A Gaussian distribution N (µ, Σ) on R3 trivially induces a Gaussian distribution on T S( p, B p ), and that distribution induces a distribution on S3 via the central projection Π p : T S p → S3 pt pt Π p ( pt ) = {− , }. k pt k k pt k (6) Fig. 1. Projected Gaussian density pg on the sphere S2 , obtained by central projection ΠNorth Pole of a 2D Gaussian on the tangent space T SNorth Pole . projected Gaussian density on the sphere S2 projected from the tangent plane T SNorth Pole ' R2 with the north pole as point of contact. C. Mixture Distribution The distribution described above is peaked, i.e. it has one orientation with maximal probability density, and away from this maximum the probability decreases rapidly. Since we need to be a lot more general, to fulfill the requirement expressive power, we use the analogon of the mixture of Gaussians in Euclidean space, i.e. we extend our class of distributions to combinations of kernels as introduced above. A mixture of projected Gaussians (MPG) is defined as n M = ∑ λi PG i , (9) i=1 where PG i are the projected Gaussian kernels and λi is the weighting factor with ∑i λi = 1, λi ≥ 0 ∀i and n ∈ N [1]. With the back projection Π−1 p : S3 → T S p , from sphere to tangent space, this mixture Π−1 p (M ) is identical to a mixture of Gaussians in the Euclidean space. A MPG is a pdf on S3 that we use to estimate orientations parameterized by unit quaternions. IV. M ODELING R IGID M OTION U NCERTAINTIES The results of the previous section III are extended from rotations to rigid motions in this section. The special Euclidean group SE(3) is the group of rigid motions in R3 . A rigid motion is equivalent to a pose consisting of a 3D orientation and a 3D position. First, a parameterization for the rigid motions is needed. Then, the distribution on the special orthogonal group SO(3) of rotations has to be extended to SE(3). We use the dual quaternions to represent the rigid motion on S3 × R3 . A. Rigid Motion Represented by Dual Quaternions The ring of the dual quaternions is defined as ε2 where ε is a dual unit, having the property = 0 [15]. As introduces above, a unit quaternion qr represents a 3D rotation. A 3D translation is represented by the imaginary quaternion qt = xi + y j + zk. Together they define a rigid motion ( qr , qt ) in 6D. Hence, isomorphisms exist between the representations (11) for rigid motions. The dual quaternion representing the rigid motion ( qr , qt ), can be calculated explicitly by 1 (12) dq := qr + ε · qt ∗ qr , 2 where ∗ denotes the quaternion multiplication. To transform a point p ∈ R3 by a rigid motion parameterized by a dual quaternion dq, the point needs to be represented by an imaginary quaternion qp as in (2). Then the transformed point p0 is represented by the imaginary quaternion dq ∗ ∗ q p0 ∗ ∗ dq, where ∗∗ is the dual quaternion multiplication and dq = qreal + ε · qdual is the conjugate of dq. B. Probability Density on S3 × R3 The projected Gaussians, introduced in section III-B to define a pdf on S3 can easily be extended to a pdf on S3 ×R3 and by (11) to a pdf on the rigid motions. The central projection Π p needs to be extended to not only handle the rotation parameter, but also the translation part. We investigate the Cartesian set product of the tangent space T S p and the space of translations R3 . It is isomorphic to the Euclidean space R6 , T S p × R 3 ' R6 . Π p : T S p × R3 → S 3 × R3 . | {z } | {z } (13) (14) ' SE(3) ' R6 The projected Gaussian can similarly be extended from a pdf on S3 to S3 × R3 . Again, everything remains the same, but a Gaussian distribution N (µ, Σ) on R6 is projected by the extended central projection. Accordingly, the projected Gaussian PG := N ( p, µ, Σ) on S3 × R3 (15) is now defined on the special Euclidean group. Still it holds, that up to a normalizing factor C, the value of the projected Gaussian at the projected point Π p ( q) is the same as the Gaussian kernel at point q, N ( p, µ, Σ)(Π p ( q)) := C · N (µ, Σ)( q). HD = { dq | dq = qreal + ε · qdual & qreal , qdual ∈ H}, (10) SE(3) ' ( qr , qt ) ' S3 × R3 We define a new central projection from the Cartesian set product T S p × R3 to the Cartesian set product S3 × R3 of the sphere with the space of translations. In the new projection, the mapping of the first three components from the tangent space T S p to the hypersphere S3 are equal to the central projection mapping defined in section III-B. The mapping of the last three components in the translation space R3 is defined to be the identity. We denote the extended central projection with Π p as before, (16) On the extended projected Gaussians, the definition of a mixture distribution stays exactly the same as in section III-C. The MPG is defined as n M = ∑ λi PG i , (17) i=1 where PG i are the projected Gaussian kernels and λi the weighting factors [1]. Now, a MPG is a pdf on S3 × R3 that can be used to estimate poses in the special Euclidean group of rigid motions. Its advantage compared to a single projected Gaussian is its wide applicability. We do not distinguish between the notations on the special orthogonal group and the special Euclidean group, because in the general case, the restriction to purely orientation is directly obtained by setting the translation 0. V. O PERATIONS O N M IXTURES O F P ROJECTED G AUSSIANS The fusion of information is an important requirement for many applications using probability densities. We explain two basic operations on MPG, the fusion and the propagation of uncertainty. A. Information Fusion Pose information encoded in two MPGs, can be fused in two steps. First, the projected Gaussians have to be transfered to a common tangent space and second, each PG i from one mixture M1 needs to be fused with every PG j from the other mixture M2 . The first step is performed by double projection of the Gaussian kernels PG i and PG j from the tangent spaces T S pi and T S p j to the common −1 one T S pi j , i.e. Π−1 pi j (Π pi (PG i )) and Π pi j (Π p j (PG j )). In case the tangent points anyway represent the same orientation, pi ∈ { p j , − p j }, set pi j = pi . For all pi ∈ / { p j , − p j }, the new tangent point pi j is calculated as the weighted midpoint on the sphere S3 between the two former tangent points pi and p j , pi j = λi pi + λ j p j . k(λi pi + λ j p j )k (18) Recall, that the weights of the projected Gaussians PG i and PG j are given by λi and λ j , which express the reliability of the information content of the projected Gaussians. The fusion of two projected Gaussian densities itself is similar to the fusion of two Gaussian densities ϕ1 ∼ N (µ1 , Σ1 ) and ϕ2 ∼ N (µ2 , Σ2 ) on Rn pertaining to the same phenomenon [7], (19) We can generalize this to projected Gaussian densities pg1 ∼ N ( p1 , µ1 , Σ1 ) and pg2 ∼ N ( p2 , µ2 , Σ2 ) only if the tangent spaces are reasonably close, i.e. the angle between the tangent points should be small enough such that the transfer of projected Gaussians between tangent spaces does not cause too big distortions. The new projected Gaussian kernels PG i j that come about by successive fusion get new weighting coefficients to evaluate the plausibility, whether the Gaussian kernels to be fused pertain the same phenomenon. The coefficients are calculated as λi j := λi λ j αi j γi j , where > 2 αi j = e(−a arccos(( pi p j ) )) (20) penalizes projected Gaussians with detached tangent points, because that makes it difficult to share a tangent space. The factor a ∈ R determines the penalty increase for a growing angle between the tangent points. The factor γi j is calculated as γi j = 1 − (µi − µ j )(Σi + Σ j )−1 (µi − µ j )> Deriving the pdf of a composition of rigid motions from the pdf of the components is important in robotics. For instance the resulting uncertainty in a kinematic chain needs to be estimated based on the joint uncertainties. In perception, the cumulated effect of sensor uncertainty and feature uncertainty on the pose of an object needs to be estimated. To calculate this propagation in our framework, we define the composition of two projected Gaussians PG 1 := N ( p1 , 0, Σ1 ) and PG 2 := N ( p2 , 0, Σ2 ) as PG 3 := N p1 ∗ p2 , 0, Jγ Σγ Jγ> , (22) where Jγ is a Jacobian matrix, defined below, and Σγ is the block diagonal matrix of Σ2 and Σ1 . In formula (22), the (ϕ2 ,ϕ1 ) , where f (ϕ2 , ϕ1 ) denotes the Jacobian is Jγ = ∂∂f(ϕ ,ϕ ) 2 ϕ3 ∼ N (µ3 , Σ3 ) µ3 = (Σ1 + Σ2 )−1 (Σ1 µ2 + Σ2 µ1 ) −1 −1 Σ3 = Σ−1 . 1 + Σ2 B. Propagation of Uncertainty (21) and expresses whether the transferred kernels are compatible according to a coincidence measure, based on the Mahalanobis distance. That means, even when the projected Gaussians can share a tangent space, they still can be incompatible, because differences of the µs and the Σs can make it unlikely that both distributions describe the same phenomenon. Generally the rotation part of µ in PG is not zero. Since it is advantageous to work with projected Gaussians that have zero mean in the corresponding Gaussian on the tangent space, we approximate the projected Gaussian kernel by one that is transfered to the according tangent point p by central projecting to the sphere and then reprojecting to the new tangent space. The centered projected Gaussian kernel is called PG 0 := N ( p, 0, Σ). 1 (0,0) reprojection of the product ∗∗ of the dual quaternions, representing the transformations, that are encoded in Π p1 (ϕ1 ) and Π p2 (ϕ2 ), i.e. f (ϕ2 , ϕ1 ) = Π−1 p3 (Π p2 (ϕ2 ) ∗ ∗ Π p1 (ϕ1 )) (23) This composition formula allows us to propagate uncertainty, encoded in a projected Gaussian, to another 6D pose by an uncertain rigid motion. VI. E FFICIENCY I SSUES In this section we present three issues for saving memory space and processing time. A disadvantage of information fusion is that the number of mixture elements is rapidly growing. For a mixture M1 having n1 elements, and another mixture M2 having n2 elements, the fused mixture M12 has n1 n2 elements. To keep the calculation time reasonably small, we replace similar projected Gaussians by a new projected Gaussian, that contains the information of the old ones. We call this procedure merge. In some applications, there is background noise that can hardly be modeled efficiently by a small number of Gaussian kernels. An example is the estimation of an object pose from a feature on the object, where the feature might be mismatched, i.e. not pertain to the object. Then we want to model the degree of reliability in the weights that we give to the projected Gaussians. But the weights of the mixture elements must sum to 1. Hence, we add a component to the mixture that expresses the uncertainty, that the feature might be an outlier. This uncertainty is best modeled by a component which we call unit distribution, as we have no information about the true pose of the outlier. A third issue, explained in this section, is an efficient approximation of the renormalization factor. A. Merge Reducing the number of mixture elements by merging can be realized iteratively. All pairs of projected Gaussians in the mixture {PG i , PG j }, i 6= j, are tested for similarity according to the Kullback-Leibler discrimination (KLD). According to [16], this is the ideal cost function for Gaussian mixture reduction. For a reliable comparison, each pair needs to be transferred to a common tangent space T S pi j as in V-A. Instead of calculating the KLD directly, which is computationally expensive, we define a dissimilarity measure B((λi , PG i ), (λ j , PG j )) analog to the one [17] proposes, but for MPG instead of regular Gaussian mixtures. The dissimilarity measure B is an upper bound for the (nonsymmetric) Kullback-Leibler discrimination of (PG i , PG j ) and (PG j , PG i ). The two mixture elements, which are the closest, according to the dissimilarity measure B, are replaced through their moment-preserving merge (λi j , PG i j ), with λi j = λi + λ j and pi j defined as in (18) 1 (λi µi + λ j µ j ) µi j = λi j (24) λ λ 1 i j Σi j = λ i Σi + λ j Σ j + (µi − µ j )(µi − µ j )> . λi j λi j is the volume correction term in the integration by substitution. We denote p := (u, v, w, x, y, z) and use the parameters Σ and µ of the Gaussian density g. Z I = Cg R6 f (u, v, w) e−0.5( p where Cg = n M = λ0 U + ∑ λi PG i , (25) i=1 where ∑i λi = 1. On compact sets the distribution U would be the uniform distribution, but in our case we have to define a distribution over SE(3) = S3 × R3 . To emphasize the definition of the distribution on SE(3) = S3 × R3 , we call U unit distribution. It is characterized by the fusion with other kernels according to [3] U × PG := PG (26) U × U := U . C. Renormalization In section III-B we mention a renormalization constant C, that makes the difference between a Gaussian and a projected Gaussian. To calculate this factor C, we need to integrate the projection over S3 × R3 , what is equivalent to calculating Z R6 f (u, v, w)g (u, v, w, x, y, z) d (u, v, w, x, y, z) , where g is the density of a Gaussian in f (u, v, w) := R6 (1 + u2 + v2 + w2 )2 1 . √ (2π) det Σ (29) (30) 3 1 with r2 := u2 + v2 + w2 . (1 + r2 )2 (31) Letting R be the 6 × 6 matrix with the first three diagonal elements 1 and all other entries 0, instead of the original integral (29), we now calculate two integrals of the type I(b), to approximate I I(b) = R6 = R6 Z = R6 2 e−br e−0.5( p e−b pR p > −µ > Σ−1 > −0.5 ) ( p> −µ ) d p > −1 Σ ( p> −µ ) ( p> −µ ) d p h i > −0.5 p(2bR) p> +( p> −µ ) Σ−1 ( p> −µ ) e dp The sum of quadratic terms is again a quadratic term. So in order to complete the square for the sum > S := p (2bR) p> + p> − µ Σ−1 p> − µ , (33) we find the minimum of the sum, i.e. the ’mean value’ µ0 = (2bΣR + Id)−1 µ. Using this new µ0 in the quadratic form (33) above, we find the constant term CQ = µ0> (2bR) µ0 + (µ0 − µ)> Σ−1 (µ0 − µ) , such that > S = p > − µ0 2bR + Σ−1 p> − µ0 +CQ . (34) (35) (27) −1 With the shorthand notation Σ−1 b := 2bR + Σ , the approximate integral becomes and 1 ( p> −µ ) d p, It can be shown that f can be approximated to arbitrary precision with exponential functions n 2 1 −bi r max − a e (32) <ε i ∑ r, ai , bi ∈R (1 + r 2 )2 i=1 Z The fusion of pose information is not affected by the introduction of the unit distribution, as it has the properties of a unit element. If the unit distributions of both mixtures are fused, U remains unchanged and gets the sum of the two former weights as new weight. In case that a unit distribution U0 with weight λ0 and an arbitrary projected Gaussian kernel PGk with weight λk is fused, we obtain PGk with the new weight λ0k = λ0 λk . On iterative fusion, the unit distribution weight converges to 0 exponentially fast with the number of steps. I= f (u, v, w) = Z PG × U := PG ) This integral can either be calculated via Monte Carlo integration or using sparse grids methods [18]. Monte Carlo methods are either slow or provide no exact results. Sparse grids methods overcome the curse of dimensionality to some extent, but have problems with Gaussian densities and are also computationally expensive [19]. Therefore, we take advantage of the special structure of the integrand and adopt a different approach. The factor f is a bell shaped curve depending only on the square of the radius B. Background Noise The unit distribution U is introduced as first component to the mixture > −µ > Σ−1 (28) I(b) = e−0.5CQ Z R6 e−0.5( p > −µ 0 > −1 Σb ) ( p> −µ0 ) d p. The integrand is a multivariate normal distribution up to the normalizing factor of a Gaussian in R6 , so the value of the integral is the inverse of that factor, and p (36) I(b) = e−0.5CQ (2π)3 det (Σb ). From (30) and (36) we get a closed form approximation for C at little computational cost, with n = 2 in (32) C∼ = Cg (a1 I(b1 ) + a2 I(b2 )) . (37) Fig. 3. Flags determine the full 6D Pose and replace drawing coordinate systems at each sample. The approximation coefficients ai and bi are the same for each projection Π p and can be calculated offline. Using this method, the calculation time for fusion is 50 times faster than using Monte Carlo with 2000 samples, what corresponds to a similar accuracy. VII. I LLUSTRATION OF THE MPG INFORMATION FUSION To demonstrate the framework, we simulate a robotic application. Here we consider the 6D pose estimation problem for an box shaped object, using the class of probability distributions mixtures of projected Gaussians. Fig. 4. left: Independent object pose estimations using an edge feature (red flags) and a surface feature (blue flags). right: Resulting object pose information, after fusion of the red and blue mixtures in left image. B. Pose Estimation Process A. Viewing Areas and Visualization of 6D Pose The goal of this subsection is to illustrate the assignment of pdf s to the features. If for example an edge of an object is detected in a camera image, there are only certain object poses under which this edge is visible at all. This is illustrated in Figure 2 to the left, where the possible camera poses are outlined with a dashed line. Similarly, a point feature like the letter F, see Figure 2 to the right, is only visible under certain poses. These viewing cylinder segments and cones are interpreted as widely spread pdf describing the object pose, see section IV. The pose estimate is narrowed down by fusing these pdf using the operators described above. Fig. 2. Viewing cylinder segment of camera positions one the left and viewing cone on the right to detect front left object edge as outer edge, respectively surface feature ’F’. In the following, the pdf describing a 6D pose is visualized by drawing a number of samples from the distribution. Figure 3 shows the visualization of a 6D pose; each flag represents a sample from the estimated probability distribution. It is placed at the 3D point of the sample and its orientation equals the orientation of the sample. That is to say, the flag pole equals the z-axis and its pennant looks in direction of the x-axis of the right-handed coordinate system. In this subsection we illustrate the 6D pose estimation process using the operations on MPG introduced in section V. In Figure 4 to the left, the red flags depict the possible object poses resulting from all short edges of the box, and the blue flags depict the possible object poses resulting from a point feature on the surface of the box. Note that the short edge could not be identified unambiguously, hence the object poses resulting from all eight short edges have to be considered and we obtain 80 kernels, that together represent the possible object poses. The resulting pdf is thus widely spread as shown by the red flags. We assume that the point feature is unique for the object. This means the corresponding pose pdf is more focused and has fewer kernels, as depicted by the blue flags. Fusing the object pose distributions from both features, see subsection V-A, the impossible poses are eliminated. The ruled out poses include all those, where the edge and the point feature cannot be detected simultaneously, or the point feature says that the object is upright meanwhile the edge says that it is upside down. The mixture resulting from the fusion of both estimates is shown to the right of Figure 4. The objects are drawn at samples from the resulting mixture. Even though it already is more peaked, the uncertainty over the 6D pose ranges in several centimeters and degrees. The estimate is sharpened by including more features in the fusion process, respecting the efficiency issues in section VI. In the example shown in Figure 5 we fused seven features. Only six of them actually are features of the object, four edge features and two surface features on the front side of the box. One feature is a mismatch which is not on the box but part of the background noise. Fusing the pdf of these features provides a peaked probability distribution for the object. Figure 6 shows in both toolbox. Fig. 5. Seven features used for estimating the pose of the box object. Three Features are surface features, whereof one is an outlier in the background and four features are edge features. Fig. 6. Estimation of the object pose after fusion of all seven features, visualized by green samples. left: For easy comparison, additionally the object pose information from an edge and a surface feature are visualized. right: Visualization of target as several samples, drawn from final distribution. images (in green) the resulting probability distribution. Now the estimate accuracy raised to include 95% of the probability mass in a bounding box of less than 1 centimeter side length and rotation smaller than 10◦ . The outlier included in the fusion process has no significant influence to this result. The possibility of each feature to be an outlier is captured in the unit distribution, introduced in section VI-B, which makes the fusion process robust with respect to outliers. VIII. C ONCLUSION In this paper we describe a class of pdf on the rigid motions which is well suited for problems in robotics and robotic perception, but can also be used in other applications. This class of pdf has many useful properties. It has a large expressive power, being able to describe wide spread pdf and to model the correlation between position and orientation. Further, it has closed form expressions for the fusion of information and for the propagation of uncertainty. The computations are made efficient by special approximation techniques for renormalization and for modeling of background noise. Since the class is based on mixtures of Gaussian distributions, more extensions are possible. We have implemented sampling from the distribution as well as fitting a distribution from this class to a sample set using a variant of the EM algorithm. Further, techniques have been adapted from the mixtures of Gaussians to fuse base elements or to drop small base elements. A full implementation of the class of pdf and of the operations has been made in Python, and the approach has been validated in simulation. A re-implementation in C++ is under way. The code will be available open source as soon as we finish an easy to handle ACKNOWLEDGMENT This work was funded in parts under the ARTEMIS Joint Undertaking as part of the project R3-COP, from the German Federal Ministry of Education and Research (BMBF) under grant no. 01IS10004E, and was supported in part within the DFG excellence initiative research cluster Cognition for Technical Systems - CoTeSys and the European FP7 research project ”WEARHAP - Wearable Haptics for Humans and Robots”. R EFERENCES [1] Wendelin Feiten, Pradeep Atwal, Robert Eidenberger, and Thilo Grundmann. 6D Pose Uncertainty in Robotic Perception. In Torsten Kröger and Friedrich M. Wahl, editors, Advances in Robotics Research, pages 89–98. Springer Berlin Heidelberg, 2009. [2] Wendelin Feiten and Muriel Lang. MPG - A Framework for Reasoning on 6 DOF Pose Uncertainty. Corporate Technology, Intelligent Systems & Control, Siemens AG, D-80200 Munich, 2011. ICRA Workshop on Manipulation Under Uncertainty. [3] Muriel Lang and Wendelin Feiten. MPG - Fast Forward Reasoning on 6 DOF Pose Uncertainty. In 7th German Conference on Robotics (ROBOTIK 2012), Munich, Germany, May 2012. [4] Christoph Hertzberg, Ren Wagner, Udo Frese, and Lutz Schröder. Integrating generic sensor fusion algorithms with sound state representations through encapsulation of manifolds. Information Fusion, 14(1):57–77, 2013. [5] J. Stuelpnagel. On the parameterization of the three-dimensional rotation group. In NASA Technical Documents. NASA, January 1948. [6] L Kavan, S Collins, Carol O’Sullivan, and J Zara. Dual quaternions for rigid transformation blending. Technical report TCDCS200646 Trinity College Dublin, 2009. [7] James Samuel Goddard. Pose And Motion Estimation From Vision Using Dual Quaternion-Based Extended Kalman Filtering. PhD thesis, University of Tennessee, Knoxville, 1997. [8] J. S. Goddard and M. A. Abidi. Pose and motion estimation using dual quaternion-based extended kalman filtering. SPIE Conf. on ThreeDimensional Image Capture and Applications, 3313:189–200, January 1998. [9] Seth Teller and Matthew E. Antone. Robust camera pose recovery using stochastic geometry. Technical report, Proceedings of the AOIS2001, 2001. [10] Jared Glover, Gary Bradski, and Radu Bogdan Rusu. Monte carlo pose estimation with quaternion kernels and the bingham distribution. Los Angeles, USA, 2011. Submitted to Robotics: Science and Systems (RSS 2011). [11] J. J. Love. Bingham statistics. In Encyclopedia of Geomagnetism & Paleomagnetism, pages 45–47. Springer, Dordrecht, The Netherlands, 2007. [12] Su Bang Choe. Statistical analysis of orientation trajectories via quaternions with applications to human motion. PhD thesis, UNIVERSITY OF MICHIGAN, 2006. [13] K. V. Mardia, C. C. Taylor, and G.K. Subramaniam. Protein bioinformatics and mixtures of bivariate von mises distributions for angular data. Biometrics, 63(2):505–512, June 2007. [14] E. Kraft. A quaternion-based unscented kalman filter for orientation tracking. Information Fusion, 1:47–54, 2003. [15] Erhan Ata and Yusuf Yayli. Dual unitary matrices and unit dual quaternions. Differential Geometry - Dynamical Systems, 10:1–12, 2008. [16] Jason L. Williams. Gaussian Mixture Reduction for Tracking Multiple Maneuvering Targets in Clutter. Storming Media, Air Force Inst of Tech Wright-Patterson AFB OH School of Engineering and Management, 2003. [17] Andrew R. Runnalls. A Kullback-Leibler Approach to Gaussian Mixture Reduction. IEEE Transactions on Aerospace and Electronic Systems, 43(3):989–999, 2007. [18] Hans-Joachim Bungartz and Michael Griebel. Sparse grids. Acta Numerica, 13:147–269, May 2004. [19] Dirk Pflüger. Spatially Adaptive Sparse Grids for High-Dimensional Problems. Verlag Dr. Hut, München, August 2010.
© Copyright 2026 Paperzz