Parametric Geons: A Discrete Set of Shapes with Parameterized Attributes

Kenong Wu and Martin D. Levine
Center for Intelligent Machines & Dept. of Electrical Engineering
McGill University, 3480 University Street
Montreal, Quebec, Canada H3A 2A7
[email protected] [email protected]

ABSTRACT

We propose parametric geons as a volumetric description of object components for qualitative object recognition. Parametric geons are seven qualitative shape types defined by parameterized equations which control the size and the degree of tapering and bending. The models provide global shape constraints which make model recovery procedures robust against noise and minor variations in object shape. The surface characteristics of parametric geons are discussed. The properties of parametric geons and conventional geon models are compared. Experiments fitting parametric geons to multiview data using stochastic optimization were performed. Results show that unique descriptions of single-part objects with minor shape variations can be obtained with the parametric geon models.

1. INTRODUCTION

A major problem of interest in computer vision is the derivation of geometrical models as shape descriptions of 3D objects from input images. For object recognition, we usually require a manageable number of classes of geometrical models in order to support efficient matching. In this paper, we restrict the discussion to volumetric primitive-based descriptions which can represent a 3D object in terms of the human notion of parts and the spatial relations between them. In particular, we propose a new set of geometrical models to characterize object parts for qualitative object recognition. The problem of actually segmenting a specific object into parts is currently under investigation.

Biederman has proposed geons (short for geometrical ions) - a finite set of distinct volumetric shapes - as qualitative shape models for parts [1].
This geon-based description is interesting because it has perceptual salience and reflects the structure of the world. Evidently the shape of object parts in the world varies in many ways, but the number of geon shapes is limited. Therefore, the key issue in a geon-based object representation is how to approximate the shape of object parts by geons. However, the original geon definition offered by Biederman was based on the attributes of generalized cones manifested by 2D line drawings. These are frequently difficult to detect, and thus the first approaches in the literature were limited to recovering geons from data of perfect geon-like objects [2, 3]. The problem of shape approximation was not addressed.

Our study of geon-based representations has led us to a definition of a new set of volumetric primitives called parametric geons, albeit defined for range (3D) imagery and not line drawings. The definition imposes a global shape constraint which is important for the shape approximation process. In addition, the object description contains both qualitative shape information and quantitative size and deformation information.

Starting with a review of object part models in Section 2, we then derive the parametric geon models in Section 3. We illustrate the surface properties of parametric geons and compare them with Biederman's geons in Section 4. The model recovery procedure is briefly described in Section 5. We discuss the experimental results in Section 6, followed by some concluding remarks.

2. VOLUMETRIC PRIMITIVES

Volumetric primitives carry information about the spatial distribution of a shape [4] and represent the most intuitive decomposition of an object into parts. They can be categorized as qualitative and quantitative (parametric) models. Qualitative models do not rely on a fine metric and provide distinctive shape characteristics which are useful for symbolic object recognition. Geons are a typical example of qualitative models.
According to Biederman's theory of "Recognition by Components" (RBC) [1], geons are 36 volumetric component shapes described in terms of four qualitative attributes of generalized cones [5]. It is claimed that these properties can be readily detected by an analysis of relatively perfect 2D line drawings. The object components can be differentiated on the basis of perceptual attributes manifested in a 2D image that are largely independent of viewing position and degradation. Psychological experimentation [6] and recent computational frameworks [2, 3, 7] have provided support for the descriptive power of such geon-based descriptions.

Nearly all of the work on this subject has focused on the recovery of geon models from complete line drawings which depict perfect geon-like objects. In the case of machine vision, however, "clean" or complete line drawings of objects usually cannot be obtained due to the color and texture of the object surfaces or poor lighting conditions. Because of this, and also for practical considerations, some research has focused on data obtained from laser rangefinders [8, 9]. All of the previous methods have created their part descriptions in a bottom-up fashion, inferring global properties by aggregating local features. This type of approach is not robust when object features do not fully satisfy the definition of the geon features. We also observe that geons are simple and regular volumes, but objects actually appear in a variety of shapes. Thus an alternative type of object shape approximation is desirable. In this paper we propose that this can be furnished by a model's global shape constraint. Such constraints will be shown to restrict the models to a particular shape family, no matter how the input data vary. This will significantly assist the process of shape approximation.

In contrast to qualitative models, quantitative models provide metrics or parameters to control model shapes and attributes continuously.
A generalized cone is the volume swept out according to a rule by an arbitrary planar shape (the cross-section) moving along a 3D curve (the axis) [5]. The axis, the cross section and the sweeping rule are parameterized individually. This formalism has been accepted as a useful volumetric primitive for a wide variety of shapes. However, since generalized cones are not characterized by global shape constraints, and 3D global shape is usually locally underdetermined, this approach can sometimes be very error-prone. In addition, generalized cones are not unique. A large number of descriptions can correspond to one volumetric shape, depending on how the axis is selected.

Hyperquadrics [10] and fourth order polynomials [11] employ parametric equations and can describe a large number of volumetric shapes. However, the parameters obtained are not intuitively related to the object shapes. The number of degrees of freedom associated with these two models weakens their uniqueness in describing the individual object classes.

Terzopoulos et al. [12] have proposed the symmetry-seeking deformable model, which was constructed from generalized splines. This physically-based model is active in the sense that the model continuously reacts to external forces produced by the image data. Model fitting is performed by applying forces to models in space so that the shape of the model's projection onto the image plane is consistent with an object of interest. This model is powerful as a means of describing the fine detail of irregular objects. However, the solution depends on the initial estimate and does not provide unique information about the volumetric shape.

Pentland [13] has proposed another physically-based model inspired by modal analysis. The modal representation yields mode parameters which do possess intuitive interpretations of the object shapes.
However, without higher modes and special care taken with respect to the correspondence between data points and nodes, it is difficult to represent objects with sharp surface discontinuities. The result is also not unique with respect to different initial conditions.

Apparently, the most popular parametric models for parts are superellipsoids, a parameterized family of closed surfaces [14, 15]. Superellipsoids and their normals are defined parametrically as follows [15]:

$$ \vec{x}(\eta, \omega) = \begin{bmatrix} x(\eta, \omega) \\ y(\eta, \omega) \\ z(\eta, \omega) \end{bmatrix} = \begin{bmatrix} a_1 \cos^{\varepsilon_1}\eta \, \cos^{\varepsilon_2}\omega \\ a_2 \cos^{\varepsilon_1}\eta \, \sin^{\varepsilon_2}\omega \\ a_3 \sin^{\varepsilon_1}\eta \end{bmatrix} \tag{1} $$

$$ \vec{n}(\eta, \omega) = \begin{bmatrix} n_x(\eta, \omega) \\ n_y(\eta, \omega) \\ n_z(\eta, \omega) \end{bmatrix} = \begin{bmatrix} \frac{1}{a_1} \cos^{2-\varepsilon_1}\eta \, \cos^{2-\varepsilon_2}\omega \\ \frac{1}{a_2} \cos^{2-\varepsilon_1}\eta \, \sin^{2-\varepsilon_2}\omega \\ \frac{1}{a_3} \sin^{2-\varepsilon_1}\eta \end{bmatrix} \tag{2} $$

$$ -\frac{\pi}{2} \le \eta \le \frac{\pi}{2}, \qquad -\pi \le \omega < \pi. $$

Here $\eta$ is a north-south parameter, like latitude, and $\omega$ is an east-west parameter, like longitude. $\varepsilon_1$ is the "squareness" parameter in the north-south direction; $\varepsilon_2$ is the "squareness" parameter in the east-west direction. $a_1$, $a_2$, $a_3$ are scale parameters along the $x$, $y$, $z$ axes, respectively. Superellipsoids can also be expressed in the form of an implicit equation as follows [15]:

$$ \left( \left( \frac{x}{a_1} \right)^{2/\varepsilon_2} + \left( \frac{y}{a_2} \right)^{2/\varepsilon_2} \right)^{\varepsilon_2/\varepsilon_1} + \left( \frac{z}{a_3} \right)^{2/\varepsilon_1} = 1 \tag{3} $$

The advantage of the superellipsoid model is that its definition provides a global shape constraint and its parameters embody compact shape information. During the model recovery procedure, shape approximation is accomplished by restrictively adapting and molding the model to the object shape. The shape constraints can be used to reduce the influence of missing data, image noise and minor variations in object shape. In this way, an approximate shape description of objects can be obtained efficiently, thereby bypassing some of the common error-prone processing steps such as building point-by-point descriptions of lines and surfaces [16, 17, 18].

The only previously reported attempt to examine the discriminative ability of superellipsoids is due to Raja and Jain [19].
They explored the recovery of geons from single-view range images by classifying the actual parameters of globally-deformed superellipsoids. It was found that the estimated parameters were extremely sensitive to viewpoint, noise and objects with coarse surfaces. Their results illustrated that superellipsoid models, as employed by the authors, have very weak discriminative power. There are basically two reasons for their poor results. First, superellipsoids are nonunique and cause uncertainties in the estimated model parameters, especially when representing noisy and partially-viewed data [20]. Certain parameters in globally-deformed superellipsoids tend to interact with each other in ways that make the model difficult to control. Second, the error measure used in their approach is not proportional to sensor error and is biased by the position of the data. Also, it is unclear whether their optimization method - an iterative gradient descent method perturbed by random noise - converges to the global minimum.

From the above discussion, it is evident that a model which provides unique shape information and global shape constraints is absolutely necessary for the shape approximation task. Both of these aspects were taken into consideration when deriving the parametric geon models discussed below.

3. PARAMETRIC GEONS

3.1 Shape Types

Similar to Biederman's geons, the class of parametric geons consists of a finite set of distinct shapes. We believe that these shapes should reflect the essential geometry of objects in the real world. Seven volumetric shapes were chosen, primarily motivated by the art of sculpture, perhaps the most traditional framework for 3D object representation. One of the most obvious features of sculptured objects is that they consist of a configuration of solids with different shapes and sizes which are joined together but which we can perceive as distinct units.
The individual volume is the fundamental unit in our perception of sculptural form, as indeed it is in our perception of fully 3D solid form in general [21]. From a sculptor's point of view, all sculptures are composed of variations of five basic forms: the cube, the sphere, the cone, the pyramid and the cylinder [22, 23]. Another important belief in the world of sculpture is that each form we know originated either as a straight line or a curve [23]. Straightness and curvature are significant features for the main axis of elongated objects and were employed in defining geon properties [1]. Therefore we also apply these two properties to the elongated cube and cylinder, yielding two additional shape types. By generalizing the five primitive shapes used in sculptural art and adding the two curved elongated primitives, we arrive at the following seven shapes for parametric geons: the ellipsoid, the cylinder (see Footnote 1), the cuboid, the tapered cylinder, the tapered cuboid, the curved cylinder and the curved cuboid.

Footnote 1: Actually this can be a cylindrical shape with an elliptical cross-section.

3.2 Formulation

The parametric forms of these seven shapes are derived from superellipsoid equations (1), (2) and (3) by (i) specifying the shape parameters, $\varepsilon_1$ and $\varepsilon_2$, and (ii) applying tapering and bending deformations.

[Figure 1: Tapering deformation. (a) Downward tapering along the z axis; (b) Invalid tapering deformation with $K_x, K_y > 1$.]

3.2.1 Implicit Equations of the Three Basic Shapes

Since $\varepsilon_1$ and $\varepsilon_2$ control the degree of "roundness" or "squareness" of superellipsoids in two orthogonal directions respectively, three of the parametric geons can be derived from superellipsoids as follows.

Given $\varepsilon_1 = \varepsilon_2 = 1$ in (3), the equation of an ellipsoid is

$$ \left( \frac{x}{a_1} \right)^{2} + \left( \frac{y}{a_2} \right)^{2} + \left( \frac{z}{a_3} \right)^{2} = 1. \tag{4} $$

Given $\varepsilon_1 = 0.1$ (see Footnote 2) and $\varepsilon_2 = 1$, the equation of a cylinder is given by

$$ \left( \left( \frac{x}{a_1} \right)^{2} + \left( \frac{y}{a_2} \right)^{2} \right)^{10} + \left( \frac{z}{a_3} \right)^{20} = 1. \tag{5} $$

Given $\varepsilon_1 = \varepsilon_2 = 0.1$, the equation of a cuboid is

$$ \left( \frac{x}{a_1} \right)^{20} + \left( \frac{y}{a_2} \right)^{20} + \left( \frac{z}{a_3} \right)^{20} = 1. \tag{6} $$

Footnote 2: Superellipsoid shape changes smoothly with $\varepsilon_1$ and $\varepsilon_2$. We choose $\varepsilon_1 = 0.1$ for a cylinder, based on computational robustness and the perceptual acceptance of its shape. The same reasoning applies to the cuboid.

3.2.2 Implicit Equations of Tapered Shapes

Two assumptions are made regarding the tapering deformation: (i) tapering deformation is performed along the z axis; (ii) the tapering rate is linear with respect to z. Although this linearity assumption is sometimes violated for real objects, our model is only designed to approximate tapered object parts. Based on these assumptions, tapering deformation is given by

$$ \begin{cases} X = \left( \dfrac{K_x}{a_3} z + 1 \right) x \\[4pt] Y = \left( \dfrac{K_y}{a_3} z + 1 \right) y \end{cases} \tag{7} $$

where $X$ and $Y$ are the transformed coordinates of the primitives after tapering is applied to the coordinates $x$ and $y$; the $z$ coordinate is unchanged ($Z = z$). $K_x$, $K_y$ are tapering parameters in the $x$ and $y$ coordinates. To permit downward tapering only and avoid invalid tapering (see Figure 1), we impose the constraints $0 \le K_x \le 1$ and $0 \le K_y \le 1$. Upward tapering can be accomplished by a rotation operation. By substituting Equation (7) into Equations (5) and (6), respectively, we obtain implicit equations for a tapered cylinder and a tapered cuboid as follows:

$$ \left( \left( \frac{X}{a_1 \left( \frac{K_x}{a_3} Z + 1 \right)} \right)^{2} + \left( \frac{Y}{a_2 \left( \frac{K_y}{a_3} Z + 1 \right)} \right)^{2} \right)^{10} + \left( \frac{Z}{a_3} \right)^{20} = 1 \tag{8} $$

$$ \left( \frac{X}{a_1 \left( \frac{K_x}{a_3} Z + 1 \right)} \right)^{20} + \left( \frac{Y}{a_2 \left( \frac{K_y}{a_3} Z + 1 \right)} \right)^{20} + \left( \frac{Z}{a_3} \right)^{20} = 1 \tag{9} $$

[Figure 2: Bending deformation in the xz plane. Axis y is perpendicular to this plane, projecting into the paper. The shaded area delimits the original primitive. The thick line depicts the curved primitive. O is the center of bending curvature, at distance $1/\kappa$ from the axis, and $\theta$ is the bending angle. Point $(x_0, z_0)$ is transformed into coordinate $(X_0, Z_0)$ by the bending operation.]

3.2.3 Implicit Equations of Curved Shapes

We use a simple bending operation which corresponds to a circular section, as shown in Figure 2. The reason for choosing such a simple deformation is that only one parameter - the curvature $\kappa$ of the circular section - is used to describe the bending feature. Although many curved object parts do not have constant curvature, we can still adequately approximate curved object parts using this qualitative shape model. The bending operation is applied along the z axis in the positive x direction. There is no torsional deformation. The operation transforms vectors $(x, y, z)$ into vectors $(X, Y, Z)$. The equations describing the bending deformation are given by (see Figure 2):

$$ \begin{cases} X = \kappa^{-1} - \cos\theta \, (\kappa^{-1} - x) \\ Y = y \\ Z = (\kappa^{-1} - x) \sin\theta \end{cases} \tag{10} $$

Here $\theta = z\kappa$ is the bending angle. The inverse transformation is given by

$$ \begin{cases} x = \kappa^{-1} - \sqrt{Z^2 + (\kappa^{-1} - X)^2} \\ y = Y \\ z = \kappa^{-1}\theta = \kappa^{-1} \arctan \dfrac{Z}{\kappa^{-1} - X} \end{cases} \tag{11} $$

The equations for curved cylinders and cuboids, as given in (12) and (13), can be obtained by substituting Equation (11) into Equations (5) and (6):

$$ \left( \left( \frac{\kappa^{-1} - \sqrt{Z^2 + (\kappa^{-1} - X)^2}}{a_1} \right)^{2} + \left( \frac{Y}{a_2} \right)^{2} \right)^{10} + \left( \frac{\kappa^{-1} \arctan \frac{Z}{\kappa^{-1} - X}}{a_3} \right)^{20} = 1 \tag{12} $$

$$ \left( \frac{\kappa^{-1} - \sqrt{Z^2 + (\kappa^{-1} - X)^2}}{a_1} \right)^{20} + \left( \frac{Y}{a_2} \right)^{20} + \left( \frac{\kappa^{-1} \arctan \frac{Z}{\kappa^{-1} - X}}{a_3} \right)^{20} = 1 \tag{13} $$

The seven typical shapes of parametric geons are illustrated in Figure 3. Since these seven shapes are also defined quantitatively, their variations can represent a variety of different shapes, as shown in Figure 4.
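The deformations of Sections 3.2.2 and 3.2.3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, only the standard library is used, and `atan2` is substituted for the arctangent of Equation (11) as a convenience. The sketch evaluates the left-hand side of Equation (3), applies the tapering of Equation (7) and the bending of Equation (10) together with their inverses, and builds the tapered and curved cylinder functions of Equations (8) and (12) by undoing the deformation before evaluating Equation (5).

```python
import math

def superellipsoid_f(x, y, z, a1, a2, a3, e1, e2):
    """Left-hand side of implicit Equation (3); it equals 1 on the surface."""
    xy = abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)
    return xy ** (e2 / e1) + abs(z / a3) ** (2.0 / e1)

def taper(x, y, z, kx, ky, a3):
    """Forward tapering of Equation (7); z is left unchanged."""
    return (kx / a3 * z + 1.0) * x, (ky / a3 * z + 1.0) * y, z

def untaper(X, Y, Z, kx, ky, a3):
    """Inverse tapering. Undefined where a scale factor reaches zero
    (for example K = 1 at Z = -a3)."""
    return X / (kx / a3 * Z + 1.0), Y / (ky / a3 * Z + 1.0), Z

def bend(x, y, z, kappa):
    """Forward bending of Equation (10), with theta = z * kappa."""
    theta = z * kappa
    r = 1.0 / kappa - x  # distance from the center of bending curvature
    return 1.0 / kappa - math.cos(theta) * r, y, r * math.sin(theta)

def unbend(X, Y, Z, kappa):
    """Inverse bending of Equation (11). atan2 is used instead of arctan
    (a choice made for this sketch) so the angle stays defined as
    1/kappa - X approaches zero."""
    inv = 1.0 / kappa
    theta = math.atan2(Z, inv - X)
    return inv - math.hypot(Z, inv - X), Y, theta / kappa

def tapered_cylinder_f(X, Y, Z, a1, a2, a3, kx, ky):
    """Equation (8): undo the taper, then evaluate the cylinder of Equation (5)."""
    x, y, z = untaper(X, Y, Z, kx, ky, a3)
    return superellipsoid_f(x, y, z, a1, a2, a3, 0.1, 1.0)

def curved_cylinder_f(X, Y, Z, a1, a2, a3, kappa):
    """Equation (12): undo the bend, then evaluate the cylinder of Equation (5)."""
    x, y, z = unbend(X, Y, Z, kappa)
    return superellipsoid_f(x, y, z, a1, a2, a3, 0.1, 1.0)
```

Because each deformed shape is evaluated by mapping a point back to the undeformed frame, a point and its deformed image give the same implicit function value, which is one way to read the substitution used to obtain Equations (8), (9), (12) and (13).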
[Figure 3: The seven parametric geons. The ellipsoid ($\varepsilon_1 = 1$, $\varepsilon_2 = 1$) becomes the cylinder ($\varepsilon_1 = 0.1$, $\varepsilon_2 = 1$) when $\varepsilon_1$ is set to 0.1, and the cylinder becomes the cuboid ($\varepsilon_1 = 0.1$, $\varepsilon_2 = 0.1$) when $\varepsilon_2$ is set to 0.1. Tapering and bending of the cylinder yield the tapered cylinder and the curved cylinder; tapering and bending of the cuboid yield the tapered cuboid and the curved cuboid.]

3.2.4 Normal Equations

A normal vector at a particular point on the surface of the parametric geons can be computed by differentiating their implicit equations. Let an implicit equation of a parametric geon be defined as a mapping $M$ such that $M: g(\vec{x}, \vec{a}) = 0$, where $\vec{x} = \{x, y, z\}^T$ defines the surface points and $\vec{a}$ is a parameter vector. Then the gradient vector

$$ \vec{n}_m = \left( \frac{\partial g(\vec{x}, \vec{a})}{\partial x}, \; \frac{\partial g(\vec{x}, \vec{a})}{\partial y}, \; \frac{\partial g(\vec{x}, \vec{a})}{\partial z} \right) \tag{14} $$

defines the normal vector to a parametric geon at point $\vec{x}$.

The alternative and simpler approach to computing normals for tapered and curved primitives is to apply a transformation to the normal vectors of the three regular shapes. If the transformation is expressed by the equation

$$ \vec{X} = \vec{F}(\vec{x}) \tag{15} $$

where $\vec{X}$ is the transformed point of $\vec{x}$, computation of the normals of deformed primitives requires using the inverse transpose of the Jacobian matrix of the deformation function as follows [24]:

$$ \vec{n}_m^{\vec{X}} = B \, \vec{n}_m^{\vec{x}} \tag{16} $$

where $B = (\det J) J^{-1T}$ and $J$ denotes the Jacobian matrix whose $i$th column is obtained by the partial derivative of $\vec{F}(\vec{x})$ with respect to the $i$th component of $\vec{x}$ as follows:

$$ J(\vec{x}) = \left\{ \frac{\partial \vec{F}(\vec{x})}{\partial x}, \; \frac{\partial \vec{F}(\vec{x})}{\partial y}, \; \frac{\partial \vec{F}(\vec{x})}{\partial z} \right\} \tag{17} $$

The determinant of $J$ can be ignored because only the direction of the normals is important. The normal transformation matrix for tapered primitives can be obtained by applying Equation (17) to (7) as follows (the matrix is written up to the scalar factor $\det J$):

$$ J^{-1T} = \begin{pmatrix} \frac{k_y}{a_3} z + 1 & 0 & 0 \\ 0 & \frac{k_x}{a_3} z + 1 & 0 \\ -\left( \frac{k_y}{a_3} z + 1 \right) \frac{k_x}{a_3} x & -\left( \frac{k_x}{a_3} z + 1 \right) \frac{k_y}{a_3} y & \left( \frac{k_x}{a_3} z + 1 \right) \left( \frac{k_y}{a_3} z + 1 \right) \end{pmatrix} \tag{18} $$

[Figure 4: Some other examples of parametric geon shapes. The number shown near each component indicates the shape type: 1-ellipsoid, 2-cylinder, 3-cuboid, 4-tapered cylinder, 5-tapered cuboid, 6-curved cylinder, 7-curved cuboid.]

The normal transformation matrix for curved primitives can be obtained by applying Equation (17) to (10) as follows (again up to the scalar factor $\det J$):

$$ J^{-1T} = \begin{pmatrix} \kappa(\kappa^{-1} - x) \cos\theta & 0 & \sin\theta \\ 0 & \kappa(\kappa^{-1} - x) & 0 \\ -\kappa(\kappa^{-1} - x) \sin\theta & 0 & \cos\theta \end{pmatrix} \tag{19} $$

By knowing the normal vectors for the regular primitives, one can multiply them by either (18) or (19) to obtain the normal vectors for tapered and curved primitives, respectively. A more detailed discussion of the global deformation of solid shapes can be found in [24, 18].

4. CHARACTERISTICS OF PARAMETRIC GEONS

4.1 Surface Properties

Other constraints on 3D volumetric shapes can be derived from the properties of object surfaces. For example, a cube is only composed of planar surfaces; a sphere is only composed of curved surfaces; a cylinder has both planar and curved surfaces. Object surface information measuring local geometrical properties can be characterized by differential properties [25]. The determination of simple volumetric shapes is often based on these surface features [26]. Here, we illustrate the surface properties of parametric geons in terms of surface curvatures, as shown in Figure 5.

[Figure 5: Local surface geometry.]

In this illustration, $T_P(M)$ is the tangent plane of a surface $M$ in $E^3$ at a point $P$. $U(P)$ is the normal vector of $M$ at $P$ and $u$ is a particular tangent vector to $M$ at $P$. A normal plane $S$, containing $U(P)$ and $u$, intersects the surface $M$, resulting in a curve (or normal section) $b$ which is a function of $u$. The curvature of $b$ is referred to as the normal curvature $k(u)$ associated with the direction $u$. If $k(u) > 0$, the normal section $b$ is bent toward $U(P)$. The maximum and minimum values of the normal curvature $k(u)$ of $M$ at $P$ are called the principal curvatures of $M$ at $P$ and are denoted by $k_1$ and $k_2$, respectively [27]. Information about the principal curvatures can also be expressed in terms of the Gaussian curvature $K$ and mean curvature $H$: $K = k_1 k_2$ and $H = (k_1 + k_2)/2$. $K$ and $H$ are known to be invariant to changes in translation and rotation of object surfaces [25].

Table 1: Parametric geons and the signs of their Gaussian, mean and principal curvatures. We assume that the normals are pointing toward the outside of the primitives.

type               K          H          k1         k2
ellipsoid          +          -          -          -
cylinder           0          -, 0       0          -, 0
cuboid             0          0          0          0
tapered cylinder   0          -, 0       0          -, 0
tapered cuboid     0          0          0          0
curved cylinder    -, 0, +    -, 0, +    -, 0, +    -, 0
curved cuboid      0          -, 0, +    0, +       -, 0

Table 1 shows the curvature signs associated with each of the defined parametric geons. For example, a cylinder has $K = 0$ and $k_1 = 0$ for all surface points. Also, $H < 0$, $k_2 < 0$ for the side face of the cylinder, and $H = 0$, $k_2 = 0$ for the top and bottom surfaces. Note that there is no difference in the curvature signs between the regular and tapered versions of the primitives. This implies that curvature sign information has a restricted potential for primitive discrimination for a subset of the parametric geons. Curvature information derived from visible surfaces in a single view can cause more ambiguity with regard to inferring the parametric geons.

4.2 Comparing with Geons

Table 2: Difference of qualitative properties between parametric geons and Biederman's original geons.

ATTRIBUTES                  PARAMETRIC GEONS             GEONS
cross sectional shape       symmetrical                  symmetrical, asymmetrical
cross sectional size        constant, expanding          constant, expanding, expanding & contracting
combination of properties   either tapering or bending   both tapering and bending

The major distinction between parametric geons and the conventional geons of Biederman is that the latter are defined in terms of certain local attributes, which do not provide global shape constraints. In contrast, parametric geons are defined in terms of different parametric equations, which do provide such constraints.
In addition, geons are described in strictly qualitative terms. However, parametric geon descriptions simultaneously supply both qualitative and quantitative characterizations of object parts. The input information is also different for these two volumetric primitives. Two-dimensional line drawings from single-view intensity images are used for deriving geons. On the other hand, parametric geon recovery necessitates 3D data obtained by shape-from-x approaches [28], laser rangefinders or stereo vision.

The geometrical differences between these two sets of primitives are given in Table 2. Certain qualitative properties of the parametric geons are simplified in comparison with the original geons of Biederman. For example, an asymmetrical cross section is not used in defining any of the parametric geons. The assumption that all parametric geons are symmetrical with respect to their major axes is adopted in accordance with the well-known human perceptual tendency toward phenomenological simplicity and regularity [29]. Formulating asymmetrical primitives requires more sophisticated parametric models which may lead to model nonuniqueness. Symmetrical primitives have also been employed in the variations of the original geons discussed by other researchers [9, 30]. We do not explicitly allow the cross-sectional size to expand and contract along the axis. However, the ellipsoid model implies this kind of deformation. Furthermore, we do not permit tapering and bending to occur simultaneously. This greatly simplifies the attributes of the parametric primitives and, in turn, avoids interaction between tapering and bending parameters. This restriction, applied to either tapering or bending of primitives, was also invoked by Dickinson et al. [30].

5. RECOVERING PARAMETRIC GEONS

Our goal is to derive parametric geon models of a single object part whose shape does not necessarily conform to a particular parametric geon.
From the psychological point of view [1], human memory of a slightly irregular form would be coded as the closest regularized neighbor of that form. By analogy with this concept, we fit all parametric geons to the input data and select the best model according to the minimum fitting residual.

5.1 Fitting as Optimization

The fitting procedure is performed by searching for a specific parameter set $\vec{a}_0$ such that a two-term objective function

$$ E(\vec{a}) = d_1(\vec{a}) + \lambda \, d_2(\vec{a}) \tag{20} $$

is minimized. Here $\vec{a}$ is the vector of parametric geon parameters, which includes three scale, three translation and three rotation parameters, as well as two tapering parameters and one bending parameter. $\vec{a}$ is different for each model. The first term, $d_1$, is defined as the sum of the Euclidean distances from each data point to the model surface along a line passing through the origin of the model [31]. $d_2$, in the second term, is the sum of the squared differences between the object's and the model's normals at each corresponding position, with correspondence defined in the same way as for $d_1$. $\lambda$ is a user-defined parameter controlling the contribution of the second term to the entire objective function. The first term measures the distance between object and model surfaces and the second term measures the orientation difference of object and model surface normals. Details of this objective function are described in [32].

This objective function has a few deep local minima, caused by inappropriate orientations of the model, and many shallow local minima, caused by noise and minor changes in object shape. In order to obtain the best fit of a model to an object, we need to find those model parameters corresponding to the global minimum of the objective function. To accomplish this, we employ a stochastic optimization algorithm, Very Fast Simulated Re-annealing (VFSR) [33].
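The distance term $d_1$ and the minimum-residual selection rule can be sketched in a few lines of Python. This is an illustration under strong simplifying assumptions, not the procedure of [32]: the data are already expressed in the model frame, the scale parameters are taken as known, only the three undeformed shapes are considered, the normal term $d_2$ is omitted, and no optimizer (VFSR or otherwise) is run. The function names are hypothetical. The radial distance uses the fact that the left-hand side $f$ of Equation (3) satisfies $f(\beta \vec{x}) = \beta^{2/\varepsilon_1} f(\vec{x})$, so the ray from the origin through a data point meets the surface at $\beta = f^{-\varepsilon_1/2}$.

```python
import math

def superellipsoid_f(p, a, e1, e2):
    """Left-hand side of implicit Equation (3) for point p and scales a."""
    x, y, z = p
    xy = abs(x / a[0]) ** (2.0 / e2) + abs(y / a[1]) ** (2.0 / e2)
    return xy ** (e2 / e1) + abs(z / a[2]) ** (2.0 / e1)

def radial_distance(p, a, e1, e2):
    """Distance from p to the surface along the ray through the model origin.
    p must not be the origin itself."""
    beta = superellipsoid_f(p, a, e1, e2) ** (-e1 / 2.0)
    return math.sqrt(sum(c * c for c in p)) * abs(1.0 - beta)

def d1(points, a, e1, e2):
    """Sum of radial distances over all data points (first term of Eq. 20)."""
    return sum(radial_distance(p, a, e1, e2) for p in points)

# Shape parameters (e1, e2) of the three basic shapes, Equations (4)-(6).
SHAPES = {"ellipsoid": (1.0, 1.0), "cylinder": (0.1, 1.0), "cuboid": (0.1, 0.1)}

def select_model(points, a):
    """Evaluate every candidate shape and keep the one with the smallest residual."""
    residuals = {name: d1(points, a, e1, e2) for name, (e1, e2) in SHAPES.items()}
    return min(residuals, key=residuals.get), residuals

def ellipsoid_points(a, n=12):
    """Synthetic, noise-free surface samples of an ellipsoid via Equation (1)."""
    pts = []
    for i in range(1, n):
        eta = -math.pi / 2 + math.pi * i / n
        for j in range(2 * n):
            omega = -math.pi + math.pi * j / n
            pts.append((a[0] * math.cos(eta) * math.cos(omega),
                        a[1] * math.cos(eta) * math.sin(omega),
                        a[2] * math.sin(eta)))
    return pts
```

With samples generated by `ellipsoid_points((1.0, 2.0, 3.0))`, `select_model` returns the ellipsoid, since its residual is zero up to rounding while the cylinder and cuboid residuals are clearly positive. In the full procedure each residual would instead be the minimum of Equation (20) found by the optimizer over all parameters of that model.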
As an improved version of simulated annealing [34], VFSR permits an annealing schedule which decreases exponentially in annealing time and is dramatically faster than traditional (Boltzmann) annealing, whose annealing schedule decreases logarithmically. The re-annealing property permits adaptation to the changing sensitivities in a multidimensional parameter space. After parametric geons are fitted to the 3D data, the best model for the object is selected according to the minimum fitting residual.

5.2 Multiview Range Data

Several researchers have demonstrated that parametric model recovery using single-view data is extremely sensitive to viewpoint and that additional data from other views can significantly improve model estimation [17, 19, 20]. In this study, multiple-view range data are obtained with a laser rangefinder which scans objects supported by a turntable. The registration among images taken from different views is obtained by a method described in [35]. The redundant data appearing in more than one view are detected and removed. Finally, all non-redundant data are converted to a common 3D coordinate system and expressed as a sequence of 3D data points [36].

6. EXPERIMENTS

The following experiments were conducted to investigate the discriminative properties of parametric geons. We are interested in examining the residual differences among all fitted models, especially when the object data contain noise and the object shapes are not the exact shapes of the parametric geons. All objects used in the experiments are single-part objects which lack sharp concavities.

[Figure 6: Four bananas used in the experiments.]

[Figure 7: Fitted models superimposed on the range data, panels (a) through (g). The numbers in the bottom right hand corners indicate the value of the absolute fitting residuals (values as extracted, panel assignment not recoverable: 42.822, 38.229, 37.876, 41.395, 39.840, 11.397, 40.727). In (c), (e) and (f) some of the object data cannot be seen because the object models are opaque and the data are located inside the volume.]

We conducted three sets of experiments using synthetic data, range data of geon-like objects and range data of imperfect geon-like objects. Here we only report the results from the third set of experiments. In this case, eleven real bananas were used as the objects. Obviously their shapes cannot simply be depicted by any of the parametric geons. Figure 6 shows four of these bananas. Some had stems at their ends and relatively sharp surface variations. In some bananas, the curvature of the main axis changed slightly at the top and significantly at the bottom. No banana's cross section was perfectly symmetrical. The apparently noisy surfaces shown in the figure were due to the rangefinder's sampling error. This was because the bananas had to be placed far from the rangefinder in order for them to fit within its scanning field of view.

Four different views were taken for each banana. Simple thresholding was performed to remove the supporting plane and other background data. Surface normals were computed by a least-squares fitting procedure. After multiview integration, the data were fitted to each of the seven models. Figure 7 shows the results of fitting the seven parametric geons to the range data. The lighter shaded volumes are the models obtained by the fitting procedure and the darker sparse spots indicate the input data. (a) through (g) illustrate models of the ellipsoid, the cylinder, the cuboid, the tapered cylinder, the tapered cuboid, the curved cylinder and the curved cuboid superimposed on one set of banana data, respectively. The algorithm selected the curved cylinder shown in (f) as the best model for all of the bananas. Clearly this result is consistent with our intuition of the banana's actual shape.

Table 3 presents the average fitting residuals, standard deviations, and maximum and minimum fitting residuals for all of the bananas. Since the absolute fitting residuals were affected by the size of each banana, and the sizes were all different, we cannot compare the absolute fitting residuals across different bananas. Thus, the residuals were normalized by the minimum residual obtained from the same banana as follows:

$$ E^n_{ij} = \frac{E_{ij}}{E_{min,j}}, \quad i = 1, \ldots, 7, \qquad E_{min,j} = \min_i \{ E_{ij} \} \tag{21} $$

$E^n$ is the relative (normalized) residual value, while $i$ and $j$ are the indices of models and objects, respectively. The minimum relative residual is equal to one and the rest of the residuals are greater than one. Table 3 indicates how different, on average, the minimum relative residual is from the other residuals.

Table 3: Average, maximum and minimum fitting residuals, standard deviations for eleven bananas. The symbols elli, cyld, cubd, tcyld, tcubd, ccyld and ccubd indicate a model of an ellipsoid, cylinder, cuboid, tapered cylinder, tapered cuboid, curved cylinder and curved cuboid, respectively.

MODELS               elli      cyld      cubd      tcyld     tcubd     ccyld   ccubd
Mean                 3.255     2.889     3.851     3.324     3.611     1.000   2.987
Standard deviation   0.118     0.092     0.152     0.232     0.139     0.000   0.149
Maximum residual     4.00111   3.48881   4.71736   5.0178    4.32779   1.000   3.80155
Minimum residual     2.65569   2.45822   3.10204   2.46402   3.07318   1.000   2.38523

The results show that the best model for all of the bananas was the curved cylinder, which gave the smallest average residual value. Thus, parametric geon models and the presented recovery procedures demonstrate robust behavior and uniquely represent the banana shapes even though they are characterized by minor variations.

7. CONCLUSIONS

In this paper, we have introduced parametric geons as volumetric primitives for qualitative object recognition. The seven parametric geon shape types are regular, simple and symmetrical volumes. As well, these shapes are consistent with the basic forms used by sculptors. In addition to the shape label, parametric geons also carry quantitative information which could be used for more detailed object discrimination.
This model is designed to describe volumetric objects or object parts whose shapes do not vary signicantly from parametric geons. We note that there exists a trade-o between the scope and uniqueness of primitive shapes when proposing a particular object description. If a specic model can represent a large number of shapes in ne detail, it must necessarily employ many parameters and be very sensitive to these details. However, object details are easily distorted by noise and are often due to minor shape variations which should not be considered as a factor for discrimination. Therefore, the noise and minor shape changes might result in nonunique object descriptions. In our case, we gain discriminative power by restricting the descriptive power. Global shape constraints, as dened by the parametric geon equations, eectively restrict the solutions of the model recovery procedure to the parametric geons. Nevertheless, the approach does adequately compensate for noise and minor shape texture. Experimental results show that a unique description of single-part objects can be robustly computed, even though the data contain noise and the objects do not exactly conform to the shape of the parametric geons. Acknowledgements 11 The authors would like to thanks Dr. Lester Ingber for the VFSR computer code and Gerard Blais and Gilbert Soucy for technical help. M. D. Levine would like to thank the Canadian Institute for Advanced Research and PRECARN for its support. This work was partially supported by a Natural Sciences and Engineering Research Council of Canada Strategic Grant and an FCAR Grant from the Province of Quebec. References [1] I. Biederman. Human image understanding: Recent research and a theory. Computer Vision, Graphics, and Image Processing, 32:29{73, 1985. [2] R. Bergevin and M. D. Levine. Generic object recognition: Building and matching coarse descriptions from line drawings. 
IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(1):19-36, January 1993.
[3] S. J. Dickinson, A. P. Pentland, and A. Rosenfeld. 3D shape recovery using distributed aspect matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):174-198, February 1992.
[4] D. Marr and H. K. Nishihara. Representation and recognition of spatial organization of three-dimensional shapes. Proceedings of the Royal Society, B200:269-294, 1978.
[5] T. O. Binford. Visual perception by computer. In IEEE Conference on Systems and Control, Miami, FL, 1971.
[6] I. Biederman and E. Cooper. Priming contour-deleted images: Evidence for intermediate representations in visual object recognition. Cognitive Psychology, 23:394-419, 1991.
[7] R. C. Munck-Fairwood and L. Du. Shape using volumetric primitives. Image and Vision Computing, 11(6):364-371, July 1993.
[8] Q. L. Nguyen and M. D. Levine. 3D object representation in range images using geons. In 11th International Conference on Pattern Recognition, The Hague, Netherlands, August 1992.
[9] N. S. Raja and A. K. Jain. Obtaining generic parts from range data using a multi-view representation. In Proceedings of SPIE Conference on Application of Artificial Intelligence: Machine Vision & Robotics, Orlando, April 1992.
[10] S. Han, D. B. Goldgof, and K. Bowyer. Using hyperquadrics for shape recovery from range data. In Proceedings of Fourth International Conference on Computer Vision, pages 492-496, Berlin, Germany, May 1993. IEEE Computer Society Press.
[11] D. Keren, D. Cooper, and J. Subrahmonia. Describing complicated objects by implicit polynomials. LEMS Technical Report #102, Division of Engineering, Brown University, Providence, RI, USA, 1992.
[12] D. Terzopoulos, A. Witkin, and M. Kass. Symmetry-seeking models for 3D object reconstruction. International Journal of Computer Vision, 1(3):211-221, 1987.
[13] A. P. Pentland. Closed-form solutions for physically based shape modeling and recognition.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:715-729, 1991.
[14] M. Gardner. The superellipse: A curve that lies between the ellipse and the rectangle. Scientific American, 213:222-234, 1965.
[15] A. H. Barr. Superquadrics and angle-preserving transformations. IEEE Computer Graphics and Applications, 1:11-23, 1981.
[16] A. P. Pentland. Recognition by parts. In The First International Conference on Computer Vision, pages 8-11, London, June 1987.
[17] T. Boult and A. Gross. Recovery of superquadrics from depth information. In Proceedings of the AAAI Workshop on Spatial Reasoning and Multisensor Integration, pages 128-137. American Association for Artificial Intelligence, 1987.
[18] F. Solina and R. Bajcsy. Recovery of parametric models from range images: The case for superquadrics with global deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(2):131-147, 1990.
[19] N. S. Raja and A. K. Jain. Recognizing geons from superquadrics fitted to range data. Image and Vision Computing, 10(3):179-190, April 1992.
[20] P. Whaite and F. P. Ferrie. From uncertainty to visual exploration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(10):1038-1049, October 1991.
[21] L. R. Rogers. Sculpture. Oxford University Press, 1969.
[22] B. Putnam. The Sculptor's Way. Farrar & Rinehart, Inc., 1939.
[23] W. Zorach. Zorach Explains Sculpture: What It Means and How It Is Made. Tudor Publishing Company, New York, 1960.
[24] A. H. Barr. Global and local deformations of solid primitives. Computer Graphics, 18(3):21-30, 1984.
[25] P. J. Besl and R. C. Jain. Invariant surface characteristics for three dimensional object recognition in range images. Computer Vision, Graphics, and Image Processing, 33(1):33-88, 1986.
[26] F. Ferrie and M. D. Levine. Deriving coarse 3D models of objects. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 345-353, Ann Arbor, Michigan, June 1988.
[27] B. O'Neill.
Elementary Differential Geometry. Academic Press, New York and London, 1966.
[28] Y. Aloimonos and A. Rosenfeld. Visual recovery. In S. C. Shapiro, editor, Encyclopedia of Artificial Intelligence, volume 2, pages 1665-1687. John Wiley & Sons, Inc., 1987.
[29] G. Hatfield and W. Epstein. The status of the minimum principle in the theoretical analysis of visual perception. Psychological Bulletin, 97(2):155-186, 1985.
[30] S. J. Dickinson, A. P. Pentland, and A. Rosenfeld. A representation for qualitative 3D object recognition integrating object-centered and viewer-centered models. In K. N. Leibovic, editor, Vision: A Convergence of Disciplines. Springer Verlag, New York, 1990.
[31] A. D. Gross and T. E. Boult. Error of fit measures for recovering parametric solids. In Proceedings, 2nd International Conference on Computer Vision, pages 690-694, Tampa, Florida, 1988. IEEE Computer Society Press.
[32] K. Wu and M. D. Levine. Recovering parametric geons from multiview range data. In IEEE Conference on Computer Vision & Pattern Recognition, Seattle, June 1994. IEEE Computer Society. To appear.
[33] L. Ingber. Very fast simulated re-annealing. Mathematical and Computer Modelling, 12(8):967-973, 1989.
[34] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671-680, May 1983.
[35] G. Blais and M. D. Levine. Registering multiview range data to create 3D computer objects. Technical Report TR-CIM-93-16, Center for Intelligent Machines, McGill University, Montreal, Quebec, Canada, October 1993.
[36] K. Wu and M. D. Levine. 3-D object representation using parametric geons. Technical Report TR-CIM-93-13, Center for Intelligent Machines, McGill University, Montreal, Quebec, Canada, September 1993.