Parametric Geons: A Discrete Set of Shapes with Parameterized Attributes
Kenong Wu and Martin D. Levine
Center for Intelligent Machines & Dept. of Electrical Engineering
McGill University, 3480 University Street
Montreal, Quebec, Canada H3A 2A7
[email protected] [email protected]
ABSTRACT
We propose parametric geons as a volumetric description of object components for qualitative object recognition.
Parametric geons are seven qualitative shape types defined by parameterized equations which control the size and
degree of tapering and bending. The models provide global shape constraints which make model recovery procedures robust against noise and minor variations in object shape. The surface characteristics of parametric geons are
discussed. The properties of parametric geons and conventional geon models are compared. Experiments fitting parametric geons to multiview data using stochastic optimization were performed. Results show that unique descriptions
of single-part objects with minor shape variations can be obtained with the parametric geon models.
1. INTRODUCTION
A major problem of interest in computer vision is the derivation of geometrical models as shape descriptions
of 3D objects from input images. For object recognition, we usually require a manageable number of classes of
geometrical models in order to support efficient matching. In this paper, we restrict the discussion to volumetric
primitive-based descriptions which can represent a 3D object in terms of the human notion of parts and the spatial
relations between them. In particular, we propose a new set of geometrical models to characterize object parts
for qualitative object recognition. The problem of actually segmenting a specific object into parts is currently
under investigation.
Biederman has proposed geons (short for geometrical ions) - a finite set of distinct volumetric shapes - as
qualitative shape models for parts [1]. This geon-based description is interesting because it has perceptual salience
and reflects the structure of the world. Evidently the shapes of object parts in the world vary in many ways, but
the number of geon shapes is limited. Therefore, the key issue in a geon-based object representation is how to
approximate the shape of object parts by geons. However, the original geon definition offered by Biederman
was based on the attributes of generalized cones manifested by 2D line drawings. These are frequently difficult
to detect, and thus the first approaches in the literature were limited to recovering geons from data of perfect
geon-like objects [2, 3]. The problem of shape approximation was not addressed.
Our study of geon-based representations has led us to a definition of a new set of volumetric primitives called
parametric geons, albeit defined for range (3D) imagery and not line drawings. The definition imposes a global
shape constraint which is important for the shape approximation process. In addition, the object description
contains both qualitative shape as well as quantitative size and deformation information.
Starting with a review of object part models in Section 2, we then derive the parametric geon models in
Section 3. We illustrate the surface properties of parametric geons and compare them with Biederman's geons in
Section 4. The model recovery procedure is briefly described in Section 5. We discuss the experimental results in
Section 6, followed by some concluding remarks.
2. VOLUMETRIC PRIMITIVES
Volumetric primitives carry information about the spatial distribution of a shape [4] and represent the most
intuitive decomposition of an object into parts. They can be categorized as qualitative and quantitative (parametric) models. Qualitative models do not rely on a fine metric and provide distinctive shape characteristics
which are useful for symbolic object recognition. Geons are a typical example of a qualitative model. According
to Biederman's theory of "Recognition by Components" (RBC) [1], geons are 36 volumetric component shapes
described in terms of four qualitative attributes of generalized cones [5]. It is claimed that these properties can be
readily detected by an analysis of relatively perfect 2D line drawings. The object components can be differentiated
on the basis of perceptual attributes manifested in a 2D image that are largely independent of viewing position
and degradation. Psychological experimentation [6] and recent computational frameworks [2, 3, 7] have provided
support for the descriptive power of such geon-based descriptions.
Nearly all of the work on this subject has focused on the recovery of geon models from complete line drawings
which depict perfect geon-like objects. In the case of machine vision, however, "clean" or complete line drawings of
objects usually cannot be obtained due to the color and texture of the object surfaces or poor lighting conditions.
Because of this, and also for practical considerations, some research has focused on data obtained from laser
rangefinders [8, 9]. All of the previous methods have created their part descriptions in a bottom-up fashion,
inferring global properties by aggregating local features. This type of approach is not robust when object features
do not fully satisfy the definition of the geon features. We also observe that geons are simple and regular volumes,
but objects actually appear in a variety of shapes. Thus an alternative type of object shape approximation is
desirable. In this paper we propose that this can be furnished by a model's global shape constraint. Such
constraints will be shown to restrict the models to a particular shape family, no matter how the input data vary.
This will significantly assist the process of shape approximation.
In contrast to qualitative models, quantitative models provide metrics or parameters to control model shapes
and attributes continuously. A generalized cone is the volume swept out according to a rule by an arbitrary
planar shape (the cross-section) moving along a 3D curve (the axis) [5]. The axis, the cross section and the
sweeping rule are parameterized individually. This formalism has been accepted as a useful volumetric primitive
for a wide variety of shapes. However, since generalized cones are not characterized by global shape constraints,
and usually 3D global shape is locally underdetermined, this approach can sometimes be very error-prone. In
addition, generalized cones are not unique. There exists a large number of descriptions, corresponding to one
volumetric shape, depending on how the axis is selected. Hyperquadrics [10] and fourth order polynomials [11]
employ parametric equations and can describe a large number of volumetric shapes. However, the parameters
obtained are not intuitively related to the object shapes. The number of degrees of freedom associated with
these two models weakens their uniqueness in describing the individual object classes. Terzopoulos et al. [12]
have proposed the symmetry-seeking deformable model which was constructed from generalized splines. This
physically-based model is active in the sense that the model continuously reacts to external forces produced
by the image data. Model fitting is performed by applying forces to the model in space so that the shape of its
projection onto the image plane is consistent with an object of interest. This model is powerful as a means
of describing the fine detail of irregular objects. However, the solution depends on the initial estimation and
does not provide unique information about the volumetric shape. Pentland [13] has proposed to use another
physically-based model inspired by modal analysis. The modal representation yields mode parameters which do
possess intuitive interpretations of the object shapes. However, without higher modes and special care taken
with respect to the correspondence between data points and nodes, it is difficult to represent objects with sharp
surface discontinuities. The result is also not unique with respect to different initial conditions.
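Apparently, the most popular parametric models for parts are superellipsoids, a parameterized family of closed
surfaces [14, 15]. Superellipsoids and their normals are defined parametrically as follows [15]:

$$
\mathbf{x}(\eta, \omega) =
\begin{bmatrix} x(\eta, \omega) \\ y(\eta, \omega) \\ z(\eta, \omega) \end{bmatrix} =
\begin{bmatrix}
a_1 \cos^{\epsilon_1}\eta \, \cos^{\epsilon_2}\omega \\
a_2 \cos^{\epsilon_1}\eta \, \sin^{\epsilon_2}\omega \\
a_3 \sin^{\epsilon_1}\eta
\end{bmatrix}
\tag{1}
$$

$$
\mathbf{n}(\eta, \omega) =
\begin{bmatrix} n_x(\eta, \omega) \\ n_y(\eta, \omega) \\ n_z(\eta, \omega) \end{bmatrix} =
\begin{bmatrix}
\frac{1}{a_1} \cos^{2-\epsilon_1}\eta \, \cos^{2-\epsilon_2}\omega \\
\frac{1}{a_2} \cos^{2-\epsilon_1}\eta \, \sin^{2-\epsilon_2}\omega \\
\frac{1}{a_3} \sin^{2-\epsilon_1}\eta
\end{bmatrix}
\tag{2}
$$

$$
-\frac{\pi}{2} \le \eta \le \frac{\pi}{2}, \qquad -\pi \le \omega < \pi .
$$

Here $\eta$ is a north-south parameter, like latitude, and $\omega$ is an east-west parameter, like longitude. $\epsilon_1$ is the
"squareness" parameter in the north-south direction; $\epsilon_2$ is the "squareness" parameter in the east-west direction.
$a_1, a_2, a_3$ are scale parameters along the x, y, z axes, respectively. Superellipsoids can also be expressed in the
form of an implicit equation as follows [15]:

$$
\left( \left(\frac{x}{a_1}\right)^{2/\epsilon_2} + \left(\frac{y}{a_2}\right)^{2/\epsilon_2} \right)^{\epsilon_2/\epsilon_1}
+ \left(\frac{z}{a_3}\right)^{2/\epsilon_1} = 1
\tag{3}
$$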
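As an illustration of how Equations (1) and (3) relate, the following minimal Python sketch samples a surface point from the parametric form and evaluates the implicit form at that point. The function names and the signed-power helper `spow` are assumptions introduced here for illustration; they do not come from the paper. The signed power is one common way to keep fractional exponents well defined when the cosine or sine is negative.

```python
import math

def spow(base, exponent):
    """Signed power sign(base) * |base|**exponent (illustrative helper)."""
    return math.copysign(abs(base) ** exponent, base)

def superellipsoid_point(eta, omega, a1, a2, a3, eps1, eps2):
    """Surface point of Equation (1) for latitude eta and longitude omega."""
    x = a1 * spow(math.cos(eta), eps1) * spow(math.cos(omega), eps2)
    y = a2 * spow(math.cos(eta), eps1) * spow(math.sin(omega), eps2)
    z = a3 * spow(math.sin(eta), eps1)
    return x, y, z

def superellipsoid_implicit(x, y, z, a1, a2, a3, eps1, eps2):
    """Left-hand side of Equation (3); it equals 1 on the surface."""
    xy = abs(x / a1) ** (2.0 / eps2) + abs(y / a2) ** (2.0 / eps2)
    return xy ** (eps2 / eps1) + abs(z / a3) ** (2.0 / eps1)

# A sampled surface point should satisfy the implicit equation.
point = superellipsoid_point(0.3, -2.0, 2.0, 3.0, 4.0, 0.1, 1.0)
value = superellipsoid_implicit(*point, 2.0, 3.0, 4.0, 0.1, 1.0)
```

Values of the implicit function below 1 indicate points inside the surface and values above 1 indicate points outside it.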
The advantage of the superellipsoid model is that its definition provides a global shape constraint and its
parameters embody compact shape information. During the model recovery procedure, shape approximation is
accomplished by restrictively adapting and molding the model to the object shape. The shape constraints can
be used to reduce the influence of missing data, image noise and minor variations in object shape. In this way,
an approximate shape description of objects can be obtained efficiently, thereby bypassing some of the common
error-prone processing steps such as building point-by-point descriptions of lines and surfaces [16, 17, 18].
The only previously reported attempt to examine the discriminative ability of superellipsoids is due to Raja
and Jain [19]. They explored the recovery of geons from single-view range images by classifying the actual
parameters of globally-deformed superellipsoids. It was found that the estimated parameters were extremely
sensitive to viewpoint, noise and objects with coarse surfaces. Their results illustrated that superellipsoid models,
as employed by the authors, have very weak discriminative power. There are basically two reasons for their poor
results. First, superellipsoids are nonunique and cause uncertainties in the estimated model parameters, especially
when representing noisy and partially-viewed data [20]. Certain parameters in globally-deformed superellipsoids
tend to interact with each other in ways that make the model difficult to control. Second, the error measure used
in their approach is not proportional to sensor error and is biased by the position of the data. Also, it is unclear whether
their optimization method - an iterative gradient descent method perturbed by random noise - converges to the
global minimum.
From the above discussion, it is evident that a model which provides unique shape information and global
shape constraints is absolutely necessary for the shape approximation task. Both of these aspects were taken into
consideration when deriving the parametric geon models discussed below.
3. PARAMETRIC GEONS
3.1 Shape Types
Similar to Biederman's geons, the class of parametric geons consists of a finite set of distinct shapes. We believe
that these shapes should reflect the essential geometry of objects in the real world. Seven volumetric shapes
were chosen, primarily motivated by the art of sculpturing, perhaps the most traditional framework for 3D object
representation. One of the most obvious features of sculptured objects is that they consist of a configuration of
solids with different shapes and sizes which are joined together but which we can perceive as distinct units. The
individual volume is the fundamental unit in our perception of sculptural form, as indeed it is in our perception
of fully 3D solid form in general [21]. From a sculptor's point of view, all sculptures are composed of variations
of five basic forms: the cube, the sphere, the cone, the pyramid and the cylinder [22, 23]. Another important
belief in the world of sculpture is that each form we know originated either as a straight line or a curve [23].
Straightness and curvature are significant features for the main axis of elongated objects and were employed in
defining geon properties [1]. Therefore we also apply these two properties to the elongated cube and cylinder,
yielding two additional shape types. By generalizing the five primitive shapes used in sculptural art and adding
the two curved elongated primitives, we arrive at the following seven shapes for parametric geons: the ellipsoid,
the cylinder (footnote 1), the cuboid, the tapered cylinder, the tapered cuboid, the curved cylinder and the curved cuboid.
3.2 Formulation
The parametric forms of these seven shapes are derived from superellipsoid equations (1), (2) and (3) by (i)
specifying the shape parameters, $\epsilon_1$ and $\epsilon_2$, and (ii) applying tapering and bending deformations.
Footnote 1: Actually this can be a cylindrical shape with an elliptical cross-section.
Figure 1: Tapering deformation. (a) Downward tapering along the z axis; (b) Invalid tapering deformation with $K_x, K_y > 1$.
3.2.1 Implicit Equations of the Three Basic Shapes
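Since $\epsilon_1$ and $\epsilon_2$ control the degree of "roundness" or "squareness" of superellipsoids in two orthogonal directions
respectively, three of the parametric geons can be derived from superellipsoids as follows:
Given $\epsilon_1 = \epsilon_2 = 1$ in (3), the equation of an ellipsoid is

$$
\left(\frac{x}{a_1}\right)^{2} + \left(\frac{y}{a_2}\right)^{2} + \left(\frac{z}{a_3}\right)^{2} = 1.
\tag{4}
$$

Given $\epsilon_1 = 0.1$ (footnote 2) and $\epsilon_2 = 1$, the equation of a cylinder is given by

$$
\left( \left(\frac{x}{a_1}\right)^{2} + \left(\frac{y}{a_2}\right)^{2} \right)^{10} + \left(\frac{z}{a_3}\right)^{20} = 1.
\tag{5}
$$

Given $\epsilon_1 = \epsilon_2 = 0.1$, the equation of a cuboid is

$$
\left(\frac{x}{a_1}\right)^{20} + \left(\frac{y}{a_2}\right)^{20} + \left(\frac{z}{a_3}\right)^{20} = 1.
\tag{6}
$$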
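To make the effect of the exponents concrete, the sketch below evaluates the left-hand sides of Equations (4), (5) and (6) at the same point. The function names and the `classify` helper (with its tolerance) are illustrative assumptions, not part of the paper.

```python
def ellipsoid_value(x, y, z, a1, a2, a3):
    """Left-hand side of Equation (4)."""
    return (x / a1) ** 2 + (y / a2) ** 2 + (z / a3) ** 2

def cylinder_value(x, y, z, a1, a2, a3):
    """Left-hand side of Equation (5)."""
    return ((x / a1) ** 2 + (y / a2) ** 2) ** 10 + (z / a3) ** 20

def cuboid_value(x, y, z, a1, a2, a3):
    """Left-hand side of Equation (6)."""
    return (x / a1) ** 20 + (y / a2) ** 20 + (z / a3) ** 20

def classify(value, tol=1e-9):
    """Label a point from the value of the implicit function (assumed tolerance)."""
    if abs(value - 1.0) <= tol:
        return "surface"
    return "inside" if value < 1.0 else "outside"

scales = (2.0, 3.0, 4.0)
# A point at 90% of every scale: close to a corner of the bounding box.
corner = (0.9 * scales[0], 0.9 * scales[1], 0.9 * scales[2])
labels = {
    "ellipsoid": classify(ellipsoid_value(*corner, *scales)),
    "cylinder": classify(cylinder_value(*corner, *scales)),
    "cuboid": classify(cuboid_value(*corner, *scales)),
}
```

A point near a corner of the bounding box lies inside the cuboid but outside the ellipsoid and the cylinder, which is the qualitative difference that the choice of $\epsilon_1$ and $\epsilon_2$ is meant to produce.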
3.2.2 Implicit Equations of Tapered Shapes
Two assumptions are made regarding the tapering deformation: (i) tapering deformation is performed along
the z axis; (ii) the tapering rate is linear with respect to z. Although this linearity assumption is sometimes
violated for real objects, our model is only designed to approximate tapered object parts. Based on these
assumptions, tapering deformation is given by

$$
\begin{cases}
X = \left(\dfrac{K_x}{a_3} z + 1\right) x \\[2mm]
Y = \left(\dfrac{K_y}{a_3} z + 1\right) y
\end{cases}
\tag{7}
$$

where X and Y are the transformed coordinates of the primitives after tapering is applied to the coordinates x
and y. $K_x, K_y$ are tapering parameters in the x and y coordinates. To permit downward tapering only and avoid
invalid tapering (see Figure 1), we impose the constraints $0 \le K_x \le 1$ and $0 \le K_y \le 1$. Upward tapering can be
accomplished by a rotation operation. By substituting Equation (7) into Equations (5) and (6), respectively, we
obtain implicit equations for a tapered cylinder and a tapered cuboid as follows:
$$
\left( \left( \frac{X}{a_1 \left(\frac{K_x}{a_3} Z + 1\right)} \right)^{2}
+ \left( \frac{Y}{a_2 \left(\frac{K_y}{a_3} Z + 1\right)} \right)^{2} \right)^{10}
+ \left(\frac{Z}{a_3}\right)^{20} = 1
\tag{8}
$$

$$
\left( \frac{X}{a_1 \left(\frac{K_x}{a_3} Z + 1\right)} \right)^{20}
+ \left( \frac{Y}{a_2 \left(\frac{K_y}{a_3} Z + 1\right)} \right)^{20}
+ \left(\frac{Z}{a_3}\right)^{20} = 1
\tag{9}
$$

Footnote 2: The superellipsoid shape changes smoothly with $\epsilon_1$ and $\epsilon_2$. We choose $\epsilon_1 = 0.1$ for a cylinder, based on computational robustness and
the perceptual acceptance of its shape. The same reasoning applies to the cuboid.
Figure 2: Bending deformation in the xz plane. Axis y is perpendicular to this plane, projecting into the paper. The shaded area
delimits the original primitive. The thick line depicts the curved primitive. O is the center of bending curvature and $\theta$ is the bending
angle. Point $(x_0, z_0)$ is transformed into coordinate $(X_0, Z_0)$ by the bending operation.
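The substitution that leads from Equation (7) to Equation (8) can be checked numerically. In the sketch below, a point on the untapered cylinder of Equation (5) is tapered with Equation (7) and then tested against Equation (8). All function names are illustrative assumptions; the code also assumes that the tapering factors are nonzero at the evaluated height, which fails only in the limiting case $K = 1$ at $Z = -a_3$.

```python
import math

def taper(x, y, z, a3, kx, ky):
    """Tapering deformation of Equation (7); z is left unchanged."""
    fx = kx / a3 * z + 1.0
    fy = ky / a3 * z + 1.0
    return fx * x, fy * y, z

def tapered_cylinder_value(X, Y, Z, a1, a2, a3, kx, ky):
    """Left-hand side of Equation (8); assumes nonzero tapering factors."""
    fx = kx / a3 * Z + 1.0
    fy = ky / a3 * Z + 1.0
    return ((X / (a1 * fx)) ** 2 + (Y / (a2 * fy)) ** 2) ** 10 + (Z / a3) ** 20

def tapered_cuboid_value(X, Y, Z, a1, a2, a3, kx, ky):
    """Left-hand side of Equation (9); assumes nonzero tapering factors."""
    fx = kx / a3 * Z + 1.0
    fy = ky / a3 * Z + 1.0
    return (X / (a1 * fx)) ** 20 + (Y / (a2 * fy)) ** 20 + (Z / a3) ** 20

def cylinder_surface_point(t, z, a1, a2, a3):
    """A point that satisfies Equation (5) exactly, at angle t and height z."""
    r = (1.0 - (z / a3) ** 20) ** (1.0 / 20.0)
    return a1 * r * math.cos(t), a2 * r * math.sin(t), z

a1, a2, a3, kx, ky = 2.0, 1.5, 3.0, 0.5, 0.25
x, y, z = cylinder_surface_point(0.7, 1.2, a1, a2, a3)
X, Y, Z = taper(x, y, z, a3, kx, ky)
value = tapered_cylinder_value(X, Y, Z, a1, a2, a3, kx, ky)
```

With the tapering parameters set to zero the tapered functions reduce to Equations (5) and (6).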
3.2.3 Implicit Equations of Curved Shapes
We use a simple bending operation which corresponds to a circular section, as shown in Figure 2. The reason
for choosing such a simple deformation is that only one parameter - the curvature $\kappa$ of the circular section - is
used to describe the bending feature. Although many curved object parts do not have constant curvature, we can
still amply approximate curved object parts using this qualitative shape model. The bending operation is applied
along the z axis in the positive x direction. There is no torsional deformation. The operation transforms vectors
(x, y, z) into vectors (X, Y, Z). The equations describing the bending deformation are given by (see Figure 2):

$$
\begin{cases}
X = \kappa^{-1} - \cos\theta \, (\kappa^{-1} - x) \\
Y = y \\
Z = (\kappa^{-1} - x) \sin\theta
\end{cases}
\tag{10}
$$

Here $\theta = z\kappa$ is the bending angle. The inverse transformation is given by

$$
\begin{cases}
x = \kappa^{-1} - \sqrt{Z^2 + (\kappa^{-1} - X)^2} \\
y = Y \\
z = \kappa^{-1}\theta = \kappa^{-1} \arctan \dfrac{Z}{\kappa^{-1} - X}
\end{cases}
\tag{11}
$$

The equations for curved cylinders and cuboids, as given in (12) and (13), can be obtained by substituting
Equation (11) into Equations (5) and (6):

$$
\left( \left( \frac{\kappa^{-1} - \sqrt{Z^2 + (\kappa^{-1} - X)^2}}{a_1} \right)^{2}
+ \left( \frac{Y}{a_2} \right)^{2} \right)^{10}
+ \left( \frac{\kappa^{-1} \arctan \frac{Z}{\kappa^{-1} - X}}{a_3} \right)^{20} = 1
\tag{12}
$$

$$
\left( \frac{\kappa^{-1} - \sqrt{Z^2 + (\kappa^{-1} - X)^2}}{a_1} \right)^{20}
+ \left( \frac{Y}{a_2} \right)^{20}
+ \left( \frac{\kappa^{-1} \arctan \frac{Z}{\kappa^{-1} - X}}{a_3} \right)^{20} = 1
\tag{13}
$$
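A short numerical check of Equations (10) to (12) is sketched below. The function names are illustrative assumptions. The inverse uses `math.atan2` in place of the arctangent of a ratio; the two agree whenever $\kappa^{-1} - X > 0$, which holds for bending angles below $\pi/2$ in magnitude, and the code further assumes $\kappa > 0$ and $x < \kappa^{-1}$.

```python
import math

def bend(x, y, z, kappa):
    """Bending deformation of Equation (10) with curvature kappa > 0."""
    theta = kappa * z
    radius = 1.0 / kappa - x
    return 1.0 / kappa - math.cos(theta) * radius, y, radius * math.sin(theta)

def unbend(X, Y, Z, kappa):
    """Inverse bending of Equation (11), written with atan2 (an assumption)."""
    radius = math.sqrt(Z * Z + (1.0 / kappa - X) ** 2)
    theta = math.atan2(Z, 1.0 / kappa - X)
    return 1.0 / kappa - radius, Y, theta / kappa

def curved_cylinder_value(X, Y, Z, a1, a2, a3, kappa):
    """Left-hand side of Equation (12): Equation (5) after unbending."""
    x, y, z = unbend(X, Y, Z, kappa)
    return ((x / a1) ** 2 + (y / a2) ** 2) ** 10 + (z / a3) ** 20

a1, a2, a3, kappa = 1.0, 0.8, 3.0, 0.2
# A point that satisfies Equation (5) exactly.
z0 = 2.0
r = (1.0 - (z0 / a3) ** 20) ** (1.0 / 20.0)
x0, y0 = a1 * r * math.cos(2.1), a2 * r * math.sin(2.1)

bent = bend(x0, y0, z0, kappa)
restored = unbend(*bent, kappa)
value = curved_cylinder_value(*bent, a1, a2, a3, kappa)
```

The round trip recovers the original point, and the bent point satisfies the curved cylinder equation.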
The seven typical shapes of parametric geons are illustrated in Figure 3. Since these seven shapes are also
defined quantitatively, their variations can represent a variety of different shapes, as shown in Figure 4.
Figure 3: The seven parametric geons. [Diagram: the ellipsoid ($\epsilon_1 = 1$, $\epsilon_2 = 1$) yields the cylinder ($\epsilon_1 = 0.1$, $\epsilon_2 = 1$) by setting $\epsilon_1 = 0.1$, which in turn yields the cuboid ($\epsilon_1 = 0.1$, $\epsilon_2 = 0.1$) by setting $\epsilon_2 = 0.1$; tapering and bending turn the cylinder into the tapered cylinder and curved cylinder, and the cuboid into the tapered cuboid and curved cuboid.]
3.2.4 Normal Equations
A normal vector at a particular point on the surface of the parametric geons can be computed by differentiating
their implicit equations. Let an implicit equation of a parametric geon be defined as a mapping M such that

$$
M : g(\vec{x}, \vec{a}) = 0
$$

where $\vec{x} = \{x, y, z\}^T$ defines the surface points and $\vec{a}$ is a parameter vector. Then the gradient vector

$$
\vec{n}_m = \left\{ \frac{\partial g(\vec{x}, \vec{a})}{\partial x}, \;
\frac{\partial g(\vec{x}, \vec{a})}{\partial y}, \;
\frac{\partial g(\vec{x}, \vec{a})}{\partial z} \right\}
\tag{14}
$$

defines the normal vector to a parametric geon at point $\vec{x}$.
The alternative and simpler approach to computing normals for tapered and curved primitives is to apply
a transformation to the normal vectors of the three regular shapes. If the transformation is expressed by the
equation

$$
\vec{X} = \vec{F}(\vec{x})
\tag{15}
$$

where $\vec{X}$ is the transformed point of $\vec{x}$, computation of the normals of deformed primitives requires using the
inverse transpose of the Jacobian matrix of the deformation function as follows [24]:

$$
\vec{n}_m^{\vec{X}} = B \, \vec{n}_m^{\vec{x}}
\tag{16}
$$

where $B = (\det J) J^{-1T}$ and J denotes the Jacobian matrix whose ith column is obtained by the partial derivative
of $\vec{F}(\vec{x})$ with respect to the ith component in $\vec{x}$ as follows:

$$
J(\vec{x}) = \left\{ \frac{\partial \vec{F}(\vec{x})}{\partial x}, \;
\frac{\partial \vec{F}(\vec{x})}{\partial y}, \;
\frac{\partial \vec{F}(\vec{x})}{\partial z} \right\}
\tag{17}
$$
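The determinant of J is a scalar factor and can be ignored because only the direction of the normals is important;
the matrix below is written up to this factor. The normal
transformation matrix for tapered primitives can be obtained by applying Equation (17) to (7) as follows:

$$
J^{-1T} =
\begin{pmatrix}
\frac{K_y}{a_3} z + 1 & 0 & 0 \\
0 & \frac{K_x}{a_3} z + 1 & 0 \\
-\left(\frac{K_y}{a_3} z + 1\right) \frac{K_x}{a_3} x &
-\left(\frac{K_x}{a_3} z + 1\right) \frac{K_y}{a_3} y &
\left(\frac{K_x}{a_3} z + 1\right) \left(\frac{K_y}{a_3} z + 1\right)
\end{pmatrix}
\tag{18}
$$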
Figure 4: Some other examples of parametric geon shapes. The number shown near each component indicates the shape type:
1-ellipsoid, 2-cylinder, 3-cuboid, 4-tapered cylinder, 5-tapered cuboid, 6-curved cylinder, 7-curved cuboid.
Figure 5: Local surface geometry.
The normal transformation matrix for curved primitives can be obtained by applying Equation (17) to (10) as
follows:

$$
J^{-1T} =
\begin{pmatrix}
\kappa(\kappa^{-1} - x) \cos\theta & 0 & \sin\theta \\
0 & \kappa(\kappa^{-1} - x) & 0 \\
-\kappa(\kappa^{-1} - x) \sin\theta & 0 & \cos\theta
\end{pmatrix}
\tag{19}
$$

By knowing the normal vectors for the regular primitives, one can multiply them by either (18) or (19) to obtain
the normal vectors for tapered and curved primitives, respectively. A more detailed discussion of the global
deformation of solid shapes can be found in [24, 18].
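The normal transformation for tapered primitives can be verified numerically. The sketch below applies the matrix of Equation (18) to the gradient of the cylinder function of Equation (5) and compares its direction with a finite-difference gradient of the tapered implicit function. The function names, the central-difference step and the test point are assumptions chosen for illustration only.

```python
import math

def cylinder_value(x, y, z, a1, a2, a3):
    """Left-hand side of Equation (5)."""
    return ((x / a1) ** 2 + (y / a2) ** 2) ** 10 + (z / a3) ** 20

def cylinder_gradient(x, y, z, a1, a2, a3):
    """Analytic gradient of cylinder_value, i.e. Equation (14)."""
    u = (x / a1) ** 2 + (y / a2) ** 2
    return (20.0 * u ** 9 * x / a1 ** 2,
            20.0 * u ** 9 * y / a2 ** 2,
            20.0 * (z / a3) ** 19 / a3)

def taper_normal(n, x, y, z, a3, kx, ky):
    """Multiply a normal of the untapered shape by the matrix of Equation (18)."""
    fx = kx / a3 * z + 1.0
    fy = ky / a3 * z + 1.0
    nx, ny, nz = n
    return (fy * nx,
            fx * ny,
            -fy * kx / a3 * x * nx - fx * ky / a3 * y * ny + fx * fy * nz)

def tapered_value(X, Y, Z, a1, a2, a3, kx, ky):
    """Cylinder function evaluated after undoing the tapering of Equation (7)."""
    fx = kx / a3 * Z + 1.0
    fy = ky / a3 * Z + 1.0
    return cylinder_value(X / fx, Y / fy, Z, a1, a2, a3)

def numerical_gradient(func, p, h=1e-6):
    """Central-difference gradient of func at the 3D point p."""
    grad = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((func(*hi) - func(*lo)) / (2.0 * h))
    return tuple(grad)

def unit(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

a1, a2, a3, kx, ky = 2.0, 1.5, 3.0, 0.4, 0.7
x, y, z = 1.0, 0.8, 2.5
fx, fy = kx / a3 * z + 1.0, ky / a3 * z + 1.0
tapered_point = (fx * x, fy * y, z)

analytic = unit(taper_normal(cylinder_gradient(x, y, z, a1, a2, a3), x, y, z, a3, kx, ky))
numeric = unit(numerical_gradient(
    lambda X, Y, Z: tapered_value(X, Y, Z, a1, a2, a3, kx, ky), tapered_point))
```

The two unit vectors agree to within the finite-difference error, which is consistent with ignoring the determinant when only the direction of the normal matters.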
4. CHARACTERISTICS OF PARAMETRIC GEONS
4.1 Surface Properties
Other constraints on 3D volumetric shapes can be derived from the properties of object surfaces. For example, a
cube is only composed of planar surfaces; a sphere is only composed of curved surfaces; a cylinder has both planar
and curved surfaces. Object surface information measuring local geometrical properties can be characterized
by differential properties [25]. The determination of simple volumetric shapes is often based on these surface
features [26].
Here, we illustrate the surface properties of parametric geons in terms of surface curvatures, as shown in
Figure 5. In this illustration, $T_P(M)$ is the tangent plane of a surface M in $E^3$ at a point P. U(P) is the normal
vector of M at P and u is a particular tangent vector to M at P. A normal plane S, containing U(P) and u,
intersects the surface M, resulting in a curve (or normal section) b which is a function of u. The curvature of b
is referred to as the normal curvature k(u) associated with the direction u. If k(u) > 0, the normal section b is
bent toward U(P). The maximum and minimum values of the normal curvature k(u) of M at P are called the
principal curvatures of M at P and are denoted by $k_1$ and $k_2$, respectively [27]. Information about the principal
curvatures can also be expressed in terms of the Gaussian curvature K and mean curvature H: $K = k_1 k_2$ and
$H = (k_1 + k_2)/2$. K and H are known to be invariant to changes in translation and rotation of object surfaces [25].
Table 1 shows the curvature signs associated with each of the defined parametric geons. For example, a cylinder
has K = 0 and $k_1 = 0$ for all surface points. Also, H < 0, $k_2 < 0$ for the side face of the cylinder, and H = 0, $k_2 = 0$
for the top and bottom surfaces. Note that there is no difference in the curvature signs between the regular and
tapered versions of the primitives. This implies that curvature sign information has only a restricted potential for
discriminating among a subset of the parametric geons. Curvature information derived from the visible surfaces
in a single view can cause even more ambiguity when inferring the parametric geons.

type              K          H          k1         k2
ellipsoid         +          -          -          -
cylinder          0          -, 0       0          -, 0
cuboid            0          0          0          0
tapered cylinder  0          -, 0       0          -, 0
tapered cuboid    0          0          0          0
curved cylinder   -, 0, +    -, 0, +    -, 0, +    -, 0
curved cuboid     0          -, 0, +    0, +       -, 0

Table 1: Parametric geons and the signs of their Gaussian, mean and principal curvatures. We assume that the
normals are pointing toward the outside of the primitives.

ATTRIBUTES                   PARAMETRIC GEONS             GEONS
cross sectional shape        symmetrical                  symmetrical, asymmetrical
cross sectional size         constant, expanding          constant, expanding, expanding & contracting
combination of properties    either tapering or bending   both tapering and bending

Table 2: Difference of qualitative properties between parametric geons and Biederman's original geons.
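The sign conventions behind Table 1 can be illustrated with a few lines of Python. The sketch computes $K = k_1 k_2$ and $H = (k_1 + k_2)/2$ from two principal curvatures and reports their signs. The function names and the zero tolerance are illustrative assumptions; the sample curvatures assume outward-pointing normals, so that convex surfaces have negative normal curvature.

```python
def sign_label(value, tol=1e-12):
    """Return '+', '-' or '0' for a curvature value (tolerance is an assumption)."""
    if value > tol:
        return "+"
    if value < -tol:
        return "-"
    return "0"

def curvature_signs(k_a, k_b):
    """Signs of K, H, k1 and k2, where k1 is the larger principal curvature."""
    k1, k2 = max(k_a, k_b), min(k_a, k_b)
    gaussian = k1 * k2
    mean = 0.5 * (k1 + k2)
    return {"K": sign_label(gaussian), "H": sign_label(mean),
            "k1": sign_label(k1), "k2": sign_label(k2)}

radius = 2.0
cylinder_side = curvature_signs(0.0, -1.0 / radius)        # straight along the axis
sphere = curvature_signs(-1.0 / radius, -1.0 / radius)     # ellipsoid-like patch
planar_face = curvature_signs(0.0, 0.0)                    # cuboid face
```

The side face of a cylinder gives K = 0, H < 0, $k_1 = 0$ and $k_2 < 0$, and a sphere gives K > 0 with negative H, $k_1$ and $k_2$, in agreement with the cylinder and ellipsoid rows of Table 1.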
4.2 Comparing with Geons
The major distinction between parametric geons and the conventional geons of Biederman is that the latter are
defined in terms of certain local attributes, which do not provide global shape constraints. In contrast, parametric
geons are defined in terms of different parametric equations, which do provide such constraints. In addition, geons
are described in strictly qualitative terms, whereas parametric geon descriptions simultaneously supply both
qualitative and quantitative characterizations of object parts.
The input information is also different for these two volumetric primitives. Two-dimensional line drawings
from single-view intensity images are used for deriving geons. On the other hand, parametric geon recovery
necessitates 3D data obtained by shape-from-x approaches [28], laser rangefinders or stereo vision.
The geometrical differences between these two sets of primitives are given in Table 2. Certain qualitative
properties of the parametric geons are simplified in comparison with the original geons of Biederman. For
example, an asymmetrical cross section is not used in defining any of the parametric geons. The assumption that
all parametric geons are symmetrical with respect to their major axes is adopted in accordance with the well-known
human perceptual tendency toward phenomenological simplicity and regularity [29]. Formulating asymmetrical
primitives requires more sophisticated parametric models which may lead to model nonuniqueness. Symmetrical
primitives have also been employed in the variations of the original geons discussed by other researchers [9, 30]. We
do not explicitly allow the cross-sectional size to expand and contract along the axis. However, the ellipsoid model
implies this kind of deformation. Furthermore, we do not permit tapering and bending to occur simultaneously.
This greatly simplifies the attributes of the parametric primitives, and in turn, avoids interaction between tapering
and bending parameters. This restriction, applied to either tapering or bending of primitives, was also invoked
by Dickinson et al. [30].
5. RECOVERING PARAMETRIC GEONS
Our goal is to derive parametric geon models of a single object part whose shape does not necessarily conform
to a particular parametric geon. From the psychological point of view [1], human memory of a slightly irregular
form would be coded as the closest regularized neighbor of that form. In an analogy to this concept, we
simultaneously fit all parametric geons to the input data and select the best model according to the minimum
fitting residual.
5.1 Fitting as Optimization
The fitting procedure is performed by searching for a specific parameter set $\vec{a}_0$ such that a two-term objective
function

$$
E(\vec{a}) = d_1(\vec{a}) + \lambda \, d_2(\vec{a})
\tag{20}
$$

is minimized. Here $\vec{a}$ is the vector of parametric geon parameters, which includes three scale, three translation,
and three rotation parameters, as well as two tapering parameters and one bending parameter where applicable. $\vec{a}$ is different for each model. The
first term, $d_1$, is defined as the sum of the Euclidean distances from each data point to the model surface along a
line passing through the origin of the model [31]. $d_2$, in the second term, is the sum of the squared differences
between the object's and model's normals at each corresponding position, defined in the same way as for $d_1$. $\lambda$ is a user-defined parameter controlling the contribution of the second term to the entire objective function. The
first term measures the distance between object and model surfaces and the second term measures the orientation
difference of object and model surface normals. Details of this objective function are described in [32].
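A minimal sketch of a two-term objective in the spirit of Equation (20) is given below, restricted to an axis-aligned ellipsoid centered at the origin so that the radial intersection has a closed form. This is an illustration under stated assumptions, not the authors' implementation (which is described in [32]): the function names, the weight name `lam`, and the omission of translation, rotation and deformation parameters are all choices made here.

```python
import math

def radial_fit_terms(points, normals, a1, a2, a3):
    """Return (d1, d2) for an origin-centered, axis-aligned ellipsoid.

    d1 sums the distances from each data point to the surface along the ray
    through the model origin; d2 sums squared differences between the data
    normals and the unit model normals at the ray/surface intersections.
    Data points are assumed not to coincide with the origin.
    """
    d1 = 0.0
    d2 = 0.0
    for (x, y, z), n_data in zip(points, normals):
        f = (x / a1) ** 2 + (y / a2) ** 2 + (z / a3) ** 2
        beta = 1.0 / math.sqrt(f)            # beta * (x, y, z) lies on the surface
        d1 += math.sqrt(x * x + y * y + z * z) * abs(1.0 - beta)
        g = (beta * x / a1 ** 2, beta * y / a2 ** 2, beta * z / a3 ** 2)
        length = math.sqrt(sum(c * c for c in g))
        d2 += sum((nd - c / length) ** 2 for nd, c in zip(n_data, g))
    return d1, d2

def objective(points, normals, scales, lam):
    """Two-term objective d1 + lam * d2, mirroring the form of Equation (20)."""
    d1, d2 = radial_fit_terms(points, normals, *scales)
    return d1 + lam * d2

def sample_ellipsoid(a1, a2, a3, angles):
    """Noise-free points and unit normals on an ellipsoid (synthetic test data)."""
    points, normals = [], []
    for eta, omega in angles:
        p = (a1 * math.cos(eta) * math.cos(omega),
             a2 * math.cos(eta) * math.sin(omega),
             a3 * math.sin(eta))
        g = (p[0] / a1 ** 2, p[1] / a2 ** 2, p[2] / a3 ** 2)
        length = math.sqrt(sum(c * c for c in g))
        points.append(p)
        normals.append(tuple(c / length for c in g))
    return points, normals

angles = [(0.2, 0.5), (-0.9, 2.0), (1.1, -1.3), (0.0, 3.0)]
pts, nrm = sample_ellipsoid(2.0, 3.0, 4.0, angles)
e_true = objective(pts, nrm, (2.0, 3.0, 4.0), lam=5.0)
e_wrong = objective(pts, nrm, (2.0, 3.0, 5.0), lam=5.0)
```

For noise-free data the objective vanishes at the true scales and grows when a scale is perturbed. A full implementation would first map the data into the model frame and undo any tapering or bending.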
This objective function has a few deep local minima, caused by inappropriate orientations of the model, and
many shallow local minima, caused by noise and minor changes in object shape. In order to obtain the best
fit of a model to an object, we need to find those model parameters corresponding to the global minimum of
the objective function. To accomplish this, we employ a stochastic optimization algorithm, Very Fast Simulated
Re-annealing (VFSR) [33]. As an improved version of simulated annealing [34], VFSR permits an annealing
schedule which decreases exponentially in annealing time and is dramatically faster than traditional (Boltzmann)
annealing whose annealing schedule decreases logarithmically. The re-annealing property permits adaptation to
the changing sensitivities in a multidimensional parameter space.
After parametric geons are fitted to the 3D data, the best model for the object is selected according to the
minimum fitting residual.
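The role of stochastic optimization can be illustrated with a generic simulated annealing loop. The sketch below is a plain annealer with an exponentially decreasing temperature, used here as a stand-in: it is not VFSR, it has no re-annealing step, and its proposal distribution, schedule constants and function names are assumptions chosen only for illustration.

```python
import math
import random

def anneal(objective, x0, step, n_iter=2000, t0=1.0, c=0.005, seed=0):
    """Generic simulated annealing (not VFSR); returns the best point and value."""
    rng = random.Random(seed)
    current = list(x0)
    e_current = objective(current)
    best, e_best = list(current), e_current
    for k in range(n_iter):
        temperature = t0 * math.exp(-c * k)      # exponential cooling schedule
        candidate = [v + rng.uniform(-step, step) for v in current]
        e_candidate = objective(candidate)
        delta = e_candidate - e_current
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best

def bumpy(p):
    """A 1D test function with several local minima (hypothetical example)."""
    return (p[0] - 1.0) ** 2 + 0.5 * math.sin(8.0 * p[0])

start = [4.0]
solution, value = anneal(bumpy, start, step=0.5)
```

Accepting occasional uphill moves at high temperature is what lets such a search leave the shallow local minima caused by noise, at the price of many more objective evaluations than a gradient method.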
5.2 Multiview Range Data
Several researchers have demonstrated that parametric model recovery using single-view data is extremely
sensitive to viewpoint and additional data from other views can significantly improve model estimation [17, 19, 20].
In this study, multiple-view range data are obtained with a laser rangefinder which scans objects supported by a
turntable. The registration among images taken from different views is obtained by a method described in [35].
The redundant data appearing in more than one view are detected and removed. Finally, all non-redundant data
are converted to a common 3D coordinate system and expressed as a sequence of 3D data points [36].
6. EXPERIMENTS
The following experiments were conducted to investigate the discriminative properties of parametric geons.
We are interested in examining the residual differences among all fitted models, especially when the object data
contain noise and the object shapes are not the exact shapes of the parametric geons. All objects used in the
experiments are single-part objects which lack sharp concavities. We conducted three sets of experiments using
synthetic data, range data of geon-like objects and range data of imperfect geon-like objects. Here we only report
the results from the third set of experiments.
Figure 6: Four bananas used in the experiments.
Figure 7: Fitted models superimposed on the range data. The numbers in bottom right hand corners indicate the value of the
absolute fitting residuals. In (c), (e) and (f) some of the object data cannot be seen because the object models are opaque and the data
are located inside the volume. [Panels (a) to (g); panel residual values: 42.822, 38.229, 37.876, 41.395, 39.840, 11.397, 40.727, not necessarily in panel order.]
In this case, eleven real bananas were used as the objects. Obviously their shapes cannot simply be depicted by
any of the parametric geons. Figure 6 shows four of these bananas. Some had stems at their ends and relatively
sharp surface variations. In some bananas, the curvature of the main axis changed slightly at the top and
significantly at the bottom. No banana's cross section was perfectly symmetrical. The apparently noisy surfaces
shown in the figure were due to the rangefinder's sampling error. This was because the bananas had to be placed
far from the rangefinder in order for them to fit within its scanning field of view. Four different views were
taken for each banana. Simple thresholding was performed to remove the supporting plane and other background
data. Surface normals were computed by a least-squares fitting procedure. After multiview integration, the data
were fitted to each of the seven models.
Figure 7 shows the results of fitting the seven parametric geons to the range data. The lighter shaded volumes
are the models obtained by the fitting procedure and the darker sparse spots indicate the input data. (a) through
(g) illustrate models of the ellipsoid, the cylinder, the cuboid, the tapered cylinder, the tapered cuboid, the curved
cylinder and the curved cuboid superimposed on one set of banana data, respectively. The algorithm selected the
curved cylinder shown in (f) as the best model for all of the bananas. Clearly this result is consistent with our
intuition of the banana's actual shape. Table 3 presents the average fitting residuals, standard deviations, and
maximum and minimum fitting residuals for all of the bananas. Since absolute fitting residuals were affected by
the banana's size, and the sizes were all different, we cannot directly compare the fitting residuals across bananas.
Thus, the residuals were normalized by the minimum residual obtained from the same banana as follows:

$$
E^n_{ij} = E_{ij} / E_{\min,j}, \quad i = 1, \ldots, 7, \qquad E_{\min,j} = \min_i \{E_{ij}\}
\tag{21}
$$

$E^n$ is the relative (normalized) residual value, while i and j are the indices of models and objects, respectively.
The minimum relative residual is equal to one and the rest of the residuals are greater than one. Table 3 indicates
how different, on average, the minimum relative residual is from the other residuals. The results show that
the best model for all of the bananas was the curved cylinder, which gave the smallest average residual value.
Thus, parametric geon models and the presented recovery procedures demonstrate robust behavior and uniquely
represent the banana shapes even though they are characterized by minor variations.

MODELS               elli      cyld      cubd      tcyld     tcubd     ccyld     ccubd
Mean                 3.255     2.889     3.851     3.324     3.611     1.000     2.987
Standard deviation   0.118     0.092     0.152     0.232     0.139     0.000     0.149
Maximum residual     4.00111   3.48881   4.71736   5.0178    4.32779   1.000     3.80155
Minimum residual     2.65569   2.45822   3.10204   2.46402   3.07318   1.000     2.38523

Table 3: Average, maximum and minimum fitting residuals, standard deviations for eleven bananas. The symbols, elli, cyld, cubd,
tcyld, tcubd, ccyld and ccubd, indicate a model of an ellipsoid, cylinder, cuboid, tapered cylinder, tapered cuboid, curved cylinder,
curved cuboid, respectively.
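The normalization of Equation (21) and the minimum-residual selection rule can be sketched as follows. The residual values in the example are hypothetical numbers invented for illustration, not measurements from the experiments, and the function names are assumptions.

```python
MODEL_NAMES = ("elli", "cyld", "cubd", "tcyld", "tcubd", "ccyld", "ccubd")

def normalize_residuals(residuals):
    """Equation (21): divide each residual of one object by the smallest one.

    Assumes all residuals are positive.
    """
    e_min = min(residuals)
    return [e / e_min for e in residuals]

def best_model(residuals, names=MODEL_NAMES):
    """Name of the model with the minimum fitting residual."""
    i_best = min(range(len(residuals)), key=residuals.__getitem__)
    return names[i_best]

# Hypothetical absolute residuals of the seven models for a single object.
absolute = [30.0, 24.0, 36.0, 27.0, 33.0, 12.0, 25.2]
relative = normalize_residuals(absolute)
winner = best_model(absolute)
```

After normalization the winning model always has a relative residual of one, so the relative residuals of the other models can be averaged across objects of different sizes, as is done in Table 3.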
7. CONCLUSIONS
In this paper, we have introduced parametric geons as volumetric primitives for qualitative object recognition.
The seven parametric geon shape types are regular, simple and symmetrical volumes. As well, these shapes are
consistent with the basic forms used by sculptors. In addition to the shape label, parametric geons also carry
quantitative information which could be used for more detailed object discrimination.
This model is designed to describe volumetric objects or object parts whose shapes do not vary significantly
from parametric geons. We note that there exists a trade-off between the scope and uniqueness of primitive
shapes when proposing a particular object description. If a specific model can represent a large number of shapes
in fine detail, it must necessarily employ many parameters and be very sensitive to these details. However, object
details are easily distorted by noise and are often due to minor shape variations which should not be considered
as a factor for discrimination. Therefore, the noise and minor shape changes might result in nonunique object
descriptions. In our case, we gain discriminative power by restricting the descriptive power.
Global shape constraints, as defined by the parametric geon equations, effectively restrict the solutions of the
model recovery procedure to the parametric geons. Nevertheless, the approach does adequately compensate for
noise and minor shape texture. Experimental results show that a unique description of single-part objects can be
robustly computed, even though the data contain noise and the objects do not exactly conform to the shape of
the parametric geons.
Acknowledgements
The authors would like to thank Dr. Lester Ingber for the VFSR computer code and Gerard Blais and Gilbert
Soucy for technical help. M. D. Levine would like to thank the Canadian Institute for Advanced Research and
PRECARN for its support. This work was partially supported by a Natural Sciences and Engineering Research
Council of Canada Strategic Grant and an FCAR Grant from the Province of Quebec.
References
[1] I. Biederman. Human image understanding: Recent research and a theory. Computer Vision, Graphics, and
Image Processing, 32:29-73, 1985.
[2] R. Bergevin and M. D. Levine. Generic object recognition: Building and matching coarse descriptions from
line drawings. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(1):19-36, January 1993.
[3] S. J. Dickinson, A. P. Pentland, and A. Rosenfeld. 3D shape recovery using distributed aspect matching.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):174-198, February 1992.
[4] D. Marr and H. K. Nishihara. Representation and recognition of spatial organization of three-dimensional
shapes. Proceedings of the Royal Society, B200:269-294, 1978.
[5] T. O. Binford. Visual perception by computer. In IEEE Conference on Systems and Control, Miami, FL,
1971.
[6] I. Biederman and E. Cooper. Priming contour-deleted images: Evidence for intermediate representations in
visual object recognition. Cognitive Psychology, 23:394-419, 1991.
[7] R. C. Munck-Fairwood and L. Du. Shape using volumetric primitives. Image & Vision Computing,
11(6):364-371, July 1993.
[8] Q. L. Nguyen and M. D. Levine. 3D object representation in range images using geons. In 11th International
Conference on Pattern Recognition, Hague, Netherlands, August 1992.
[9] N. S. Raja and A. K. Jain. Obtaining generic parts from range data using a multi-view representation.
In Proceedings of SPIE Conference on Application of Artificial Intelligence: Machine Vision & Robotics,
Orlando, April 1992.
[10] S. Han, D. B. Goldgof, and K. Bowyer. Using hyperquadrics for shape recovery from range data. In
Proceedings of Fourth International Conference on Computer Vision, pages 492-496, Berlin, Germany, May
1993. IEEE Computer Society Press.
[11] D. Keren, D. Cooper, and J. Subrahmonia. Describing complicated objects by implicit polynomials.
LEMS Technical Report #102, Division of Engineering, Brown University, Providence, RI, USA, 1992.
[12] D. Terzopoulos, A. Witkin, and M. Kass. Symmetry-seeking models for 3D object reconstruction. International Journal of Computer Vision, 1(3):211-221, 1987.
[13] A. P. Pentland. Closed-form solutions for physically based shape modeling and recognition. IEEE Transactions
on Pattern Analysis and Machine Intelligence, 13:715-729, 1991.
[14] M. Gardner. The superellipse: A curve that lies between the ellipse and the rectangle. Scientific American,
213:222-234, 1965.
[15] A. H. Barr. Superquadrics and angle-preserving transformations. IEEE Computer Graphics and Applications,
1:11-23, 1981.
[16] A. P. Pentland. Recognition by parts. In The First International Conference on Computer Vision, pages
8-11, London, June 1987.
[17] T. Boult and A. Gross. Recovering superquadrics from depth information. In Proceedings of the AAAI Workshop on Spatial Reasoning and Multisensor Integration, pages 128-137. American Association for Artificial
Intelligence, 1987.
[18] F. Solina and R. Bajcsy. Recovery of parametric models from range images: the case for superquadrics with
global deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(2):131-147, 1990.
[19] N. S. Raja and A. K. Jain. Recognizing geons from superquadrics fitted to range data. Image and Vision
Computing, 10(3):179-190, April 1992.
[20] P. Whaite and F. P. Ferrie. From uncertainty to visual exploration. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 13(10):1038-1049, October 1991.
[21] L. R. Rogers. Sculpture. Oxford University Press, 1969.
[22] B. Putnam. The Sculptor's Way. Farrar & Rinehart, Inc., 1939.
[23] W. Zorach. Zorach Explains Sculpture: What It Means and How It Is Made. Tudor Publishing Company,
New York, 1960.
[24] A. H. Barr. Global and local deformations of solid primitives. Computer Graphics, 18(3):21-30, 1984.
[25] P. J. Besl and R. C. Jain. Invariant surface characteristics for three dimensional object recognition in range
images. Computer Vision, Graphics, and Image Processing, 33(1):33-88, 1986.
[26] F. Ferrie and M. D. Levine. Deriving coarse 3D models of objects. In IEEE Computer Society Conference
on Computer Vision and Pattern Recognition, pages 345-353, Ann Arbor, Michigan, June 1988.
[27] B. O'Neill. Elementary Differential Geometry. Academic Press, New York and London, 1966.
[28] Y. Aloimonos and A. Rosenfeld. Visual recovery. In S. C. Shapiro, editor, Encyclopedia of Artificial Intelligence, volume 2, pages 1665-1687. John Wiley & Sons, Inc., 1987.
[29] G. Hatfield and W. Epstein. The status of the minimum principle in the theoretical analysis of visual
perception. Psychological Bulletin, 97(2):155-186, 1985.
[30] S. J. Dickinson, A. P. Pentland, and A. Rosenfeld. A representation for qualitative 3D object recognition
integrating object-centered and viewer-centered models. In K. N. Leibovic, editor, Vision: A Convergence
of Disciplines. Springer Verlag, New York, 1990.
[31] A. D. Gross and T. E. Boult. Error of fit measures for recovering parametric solids. In Proceedings, 2nd
International Conference on Computer Vision, pages 690-694, Tampa, Florida, 1988. Computer Society of
the IEEE, IEEE Computer Society Press.
[32] K. Wu and M. D. Levine. Recovering parametric geons from multiview range data. In IEEE Conference on
Computer Vision & Pattern Recognition, Seattle, June 1994. IEEE Computer Society. To appear.
[33] L. Ingber. Very fast simulated re-annealing. Mathematical and Computer Modelling, 12(8):967-973, 1989.
[34] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi. Optimization by simulated annealing. Science,
220(4598):671-680, May 1983.
[35] G. Blais and M. D. Levine. Registering multiview range data to create 3D computer objects. Technical Report
TR-CIM-93-16, Center for Intelligent Machines, McGill University, Montreal, Quebec, Canada, October 1993.
[36] K. Wu and M. D. Levine. 3-D object representation using parametric geons. Technical Report TR-CIM-93-13,
Center for Intelligent Machines, McGill University, Montreal, Quebec, Canada, September 1993.