Three-Dimensional Interpretation of Quadrilaterals

Gang Xu*† and Saburo Tsuji*
*Department of Control Engineering, Osaka University
Toyonaka, Osaka 560, Japan
also AI Department, ATR Communication Systems Research Laboratories
Seika-cho, Soraku-gun, Kyoto 619-02, Japan
Abstract
Quadrilaterals are figures with which everybody becomes familiar at a very early stage of education. By studying these seemingly simple figures we can obtain some insights into the nature of the general problem of interpreting image contours. This paper discusses in detail how quadrilaterals are interpreted three-dimensionally, and draws feasible inferences about the general properties of the human system of processing line drawings. First, the rectangularity regularity is proposed to be the prime constraint in the visual interpretation of quadrilaterals. The subjective "image center" and focal length (finite and infinite) are determined together with rectangle orientation. Secondly, interpretation of quadrilaterals as faces of a rectangular polyhedron is examined at both the geometrical level and the perceptual level. Finally, the gravity regularity is proposed to derive constraints on the rectangle orientation by analyzing the relation among the camera, the ground and the rectangles supported by the ground.
1. Introduction
An image is a two-dimensional projection of a three-dimensional scene, and a contour image represents significant changes in surface shape, reflectance and illumination, which are reflected directly or indirectly in the 2-D image. A contour image can be a line drawing, drawn by human hands, which only includes topologically well-defined, semantically significant, but not necessarily positionally accurate contours; or it can be generated by computer from a real image, which may include many noise edges, if not processed very carefully.
Line drawing is probably the most abstract and efficient means of describing our 3-dimensional world in a 2-dimensional manner. Thus it is often used in human communications and is potentially very useful in human-machine interfaces. Although the contours alone do not provide sufficient constraints on the surfaces, humans seem not to have any difficulty in recovering 3-D shapes from the 2-D contours. The task is so effortless for our eyes that we rarely pause to ask ourselves how we do it. As we try to answer, however, we realize that it is a difficult question.
† The work was done at Osaka University. Dr. Gang Xu's address from this autumn is Center of Information Science, Peking University, Beijing, China
1610
Vision and Robotics
A number of papers have been published that try to answer this "shape from contour" question. Among them are [Barrow & Tenenbaum, 1981], [Kanade, 1981], [Barnard, 1983] and [Brady & Yuille, 1984]. All the papers listed above generally deal with only closed contours and assume that the image contours are, globally or locally, projections of planar space curves. Ellipses are interpreted as circles by additional assumptions of uniformity of curvature [Barrow & Tenenbaum, 1981], maximum entropy [Barnard, 1983] and maximum compactness [Brady & Yuille, 1984], and quadrilaterals or parallelograms are interpreted as rectangles by additional assumptions of maximum symmetry [Kanade, 1981; Barnard, 1983]. Reviewing the papers, it is not difficult to find that the approaches are based on the psychological facts that ellipses and quadrilaterals (including parallelograms) are perceived as circles and rectangles, respectively. The approaches differ only in how the facts are accounted for and in what specific criteria they choose to achieve the predetermined aims.
The psychology of line drawing perception was first studied by the Gestalt school. They propose prägnanz, or figural goodness, to be the criterion on which human perception is based. The circle and rectangle interpretations are preferred because they are the most beautiful interpretations among the possible. Unfortunately, however, the Gestalt psychologists could not give a theoretical account of the term prägnanz from a standpoint of information processing, and figural goodness remains to be judged mainly by human eyes (note that a recent progress is an attempt to characterize prägnanz by transformational invariance [Palmer, 1983]). The lack of a definition of prägnanz leaves room for proposal of various specific criteria. The uniformity of curvature, entropy, compactness and symmetry criteria are developed in, and thus well suited to, the specific cases, but they are not necessarily universal and may not apply to other cases.
Under the circumstances, two general paradigms are available. The first is the quantitative paradigm, in which a universal criterion is developed and interpretations are selected by maximizing or minimizing that criterion, as done in [Barrow & Tenenbaum, 1981], [Barnard, 1983] and [Brady & Yuille, 1984]. The second is the qualitative paradigm, in which specific figural configurations that have definite interpretations are searched for and interpreted. Examples are [Stevens, 1981, 1986], in which parallel (curved) contours are interpreted as lines of curvature on a cylindrical surface, [Xu & Tsuji, 1987a,b], in which a closed boundary is segmented and interpreted as four lines of curvature if certain conditions are satisfied, and [Barnard & Pentland, 1983], in which elliptic arcs are directly interpreted as circular arcs. The causal relation between the interpreted and the interpretation is referred to as regularity. The task is thus to discover regularities, or causal relations, and then to apply them to specific interpretation processes. We consider that the qualitative paradigm is more advantageous because (1) a line drawing is generally only qualitative, not quantitative, especially when it is hand-drawn; and (2) while interpretations are qualitatively stable, they are not always quantitatively stable.
2. The Rectangularity Regularity and Rectangle Orientation
2.1 The rectangularity regularity
Human visual perception, as a part of the brain, is the product of millions of years of evolution. As a consequence, various regularities of nature have been embedded into the vision system. It is these natural regularities that are the secrets of human vision (and of human perception at large). They fill in the gap inherent in the mapping from two-dimensionality onto three-dimensionality. Only by understanding them can we really understand human vision and further develop any computer vision systems. (See [Pentland, 1986] for a general discussion on the role that natural regularities play in visual perception, and [Ullman, 1979a,b; Reuman & Hoffman, 1986] for the role that natural regularities play in the visual perception of motion.)
On the other hand, human visual perception is not simply a copy of external regularities; it has its own internal structure, which functions in its own right. One indication of such an internal structure is the perceptual system's preference for prägnanz, or simple, regular forms over complex, irregular ones. It is unfair to attribute this property to regularities of the external world. Circles are preferred over ellipses [Barrow & Tenenbaum, 1981; Brady & Yuille, 1984] not because we regularly see circles in our environment, but mainly because the internal structure of the perceptual system appreciates the simplicity or figural goodness of the circle interpretation. This kind of regularity (if one would also like to call it a regularity) is a subjective regularity, in contrast to external natural regularities. (In fact, the preference for simplicity is not the privilege of perception; all the phases of cognition show this tendency [Kanizsa, 1979, p. 238].)
All quadrilaterals can be ordered, according to the degree of regularity, as: rectangle, parallelogram, trapezium and generic quadrilateral (Fig. 1). It is generally difficult to define degree of regularity universally, but here it is intuitive and we do not try to give a theoretical definition [Palmer, 1983]. It is observed that a single quadrilateral tends to be always perceived as a rectangle¹. To put it another way, a quadrilateral in the image, whatever its degree of regularity is, is interpreted to be a rectangle in space, the most regular interpretation among the possible, and the 2-D irregularity is thought of as being caused by the projection. It is from this observation that the rectangularity regularity is generalized. This regularity, as discussed in the last paragraph, is a subjective regularity resulting from the internal structure of the perceptual system.
The rectangularity regularity has long been recognized as being of importance. Barnard (1983) computes the planar orientation of a rectangle from its image, a quadrilateral, under perspective projection. Kanatani (1986) discusses how to compute spatial orientations of the faces of a rectangular trihedral polyhedron. Xu and Tsuji (1987a,b) extend this regularity to curved surfaces and propose the LOC (line of curvature) regularity to recover the shape of curved surfaces. In this paper we give a unified and detailed account of interpreting quadrilaterals in the image as rectangles in space under both orthographic and perspective projections.
3D orientation of a quadrilateral is determined by incorporating the rectangularity regularity. The coordinate system we assume, as shown in Fig. 2, is different from those we usually use, in that the coordinate origin is located on the image plane and in that the z-axis is independent of the focal point, which has the coordinates (x0, y0, −f). Both the "image center" (x0, y0) and the focal length f are then to be determined in the process of interpretation. While the camera system is an objective one if the image is a real one, it is a subjective one if the image is a hand-drawn figure.
What is known is a quadrilateral in the image, and what is to be known is the orientation of the rectangle that projects that quadrilateral, and the location of the focal point, but not the distance or size of the rectangle.
Let the four corners of the quadrilateral be A, B, C and D, as shown in Fig. 3. Extending the segments AB and CD, we have the intersection P (x1, y1). Since AB and CD are the images of two parallels, P is the vanishing point of the parallels. Similarly, extending the segments BC and DA, we have the intersection Q (x2, y2), which is also a vanishing point, of the other group of parallels. P and/or Q approach infinity if the corresponding image segments are parallel.
A straightforward demonstration of the concept of vanishing point is that the line connecting the focal point and a vanishing point is parallel to the parallel lines that give rise to that vanishing point. As shown in Fig. 4, we know that PF is parallel to A'B' and C'D', and QF is parallel to B'C' and D'A'. Since A'B'C'D' is a rectangle, PF is perpendicular to QF. The plane determined by PF and QF is parallel to the plane on which the rectangle lies, and thus the orientations of the two planes are identical.
The segments PF and QF can be expressed in vector form as (x1 − x0, y1 − y0, f) and (x2 − x0, y2 − y0, f), respectively. From the perpendicularity, the inner product of the two vectors is zero. By this equation F can be either determined or at least constrained. Once F is determined, the plane normal of the rectangle (let it be called n) can also be determined as the outer product of PF and QF. In the following, we discuss three cases of PF and QF: (1) neither P nor Q approaches infinity; (2) either P or Q approaches infinity; and (3) both P and Q approach infinity.
(Case 1) If neither P nor Q approaches infinity, then the quadrilateral is a generic one. From the perpendicularity of PF and QF we have
$$(x_1 - x_0)(x_2 - x_0) + (y_1 - y_0)(y_2 - y_0) + f^2 = 0.$$
This equation describes a hemisphere with a diameter PQ, on which F is constrained to lie, as shown in Fig. 5. For f to have a solution, the point O (x0, y0, 0) must satisfy the following inequality:
$$(x_1 - x_0)(x_2 - x_0) + (y_1 - y_0)(y_2 - y_0) < 0.$$
This means that O must lie inside the circle with diameter PQ on
the image plane. When O reaches the midpoint of PQ, all of PO, QO and FO become radii of the hemisphere, and f has the maximal value, half the length of PQ. To determine F completely, however, we have to first know where the attention is oriented; i.e., where O is. If there are three or more quadrilaterals in a real image, then F can be determined as the intersection point of the hemispheres corresponding to the quadrilaterals (see Section 4 for a special case). A prerequisite to this solution is that the hemispheres do have a common point; i.e., the image is a real one. However, as in hand-drawn figures, quadrilaterals are usually produced by individual attentions. Consequently, they should be, and can only be, perceived separately. One reasonable choice of O for each quadrilateral is the intersection of the two diagonals, which is the image of the centroid of the corresponding rectangle in space, provided that this intersection is within the circle with a diameter PQ. As mentioned above, the maximal value for f is half the length of PQ. Trivially, the closer to a parallelogram the quadrilateral is, the greater f is. On the other hand, humans prefer long focal lengths in the perception of figures. This answers why a parallelogram is more often used to represent a rectangle, and why a parallelogram is more easily perceived by our eyes as a rectangle.
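The Case 1 procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the image center O is taken to be the intersection of the diagonals (the choice suggested above for a single quadrilateral), and the corners are given in the order A, B, C, D around the quadrilateral.

```python
import math

def line_intersection(p1, p2, p3, p4, eps=1e-12):
    """Intersection of line p1p2 with line p3p4, or None if (nearly) parallel."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < eps:
        return None  # the intersection "approaches infinity" (Case 2 or 3)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def interpret_generic_quadrilateral(A, B, C, D):
    """Return (P, Q, O, f, n) for a generic quadrilateral (hypothetical helper).

    P, Q: vanishing points; O: assumed image center (diagonal intersection);
    f: focal length from PF . QF = 0; n: unit normal, the outer product PF x QF.
    """
    P = line_intersection(A, B, C, D)   # AB meets CD
    Q = line_intersection(B, C, D, A)   # BC meets DA
    if P is None or Q is None:
        raise ValueError("not Case 1: a vanishing point is at infinity")
    O = line_intersection(A, C, B, D)   # assumption: O is where the diagonals cross
    inner = (P[0] - O[0]) * (Q[0] - O[0]) + (P[1] - O[1]) * (Q[1] - O[1])
    if inner >= 0:
        raise ValueError("O is not inside the circle with diameter PQ")
    f = math.sqrt(-inner)
    pf = (P[0] - O[0], P[1] - O[1], f)
    qf = (Q[0] - O[0], Q[1] - O[1], f)
    n = (pf[1] * qf[2] - pf[2] * qf[1],
         pf[2] * qf[0] - pf[0] * qf[2],
         pf[0] * qf[1] - pf[1] * qf[0])
    length = math.sqrt(sum(c * c for c in n))
    return P, Q, O, f, tuple(c / length for c in n)

# Image of a space rectangle with edge directions (1, 0, 1) and (1, 1, -1),
# seen with focal length 1 and its center on the optical axis.
A, B, C, D = (-0.4, -0.2), (0.0, -1 / 7), (0.4, 0.2), (0.0, 1 / 3)
P, Q, O, f, n = interpret_generic_quadrilateral(A, B, C, D)
```

For this example the recovered normal should be parallel to the outer product of the two edge directions, (-1, 2, 1), up to sign; the sign ambiguity is inherent, since the order of PF and QF in the outer product is arbitrary.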
(Case 2) If one of P and Q approaches infinity, then the quadrilateral is a trapezium. Without loss of generality, suppose that P approaches infinity, while Q does not. PO (x1 − x0, y1 − y0) can be expressed by (a, b) as P approaches infinity. Adding the z-component, PF can be expressed by (a, b, c), where a, b and c are all constants. c approaches 0 if f does not approach infinity, and c may still be 0 even if f does approach infinity. Can f approach infinity? Suppose that it does. Then clearly BC is parallel to DA, and their intersection Q also approaches infinity. This leads to a contradiction. Thus f cannot approach infinity and c equals zero.
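As a worked step under the definitions above (a sketch derived only from the vectors already stated, not a reproduction of the authors' own equations), substituting PF = (a, b, 0) and QF = (x2 − x0, y2 − y0, f) into the perpendicularity condition and the outer product gives:

```latex
\begin{aligned}
\overrightarrow{PF}\cdot\overrightarrow{QF}
  &= a\,(x_2 - x_0) + b\,(y_2 - y_0) + 0\cdot f = 0,\\
\mathbf{n} \;\propto\; \overrightarrow{PF}\times\overrightarrow{QF}
  &= \bigl(b f,\; -a f,\; a\,(y_2 - y_0) - b\,(x_2 - x_0)\bigr).
\end{aligned}
```

Read this way, the first line confines O to the line through Q perpendicular to the direction (a, b) and, taken alone, places no bound on f; the second line shows that the normal n still depends on f through its first two components.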
where (a, b) is the orientation vector, and (x1, y1) and (x2, y2) are the coordinates of the other two intersection points. The difference of the first two equations means that the parallel edges are perpendicular to the line linking the other two intersection points. The focal point is not completely determined, but constrained to lie on a semicircle, whose two end points are the two intersection points, and the plane on which the semicircle lies is perpendicular to the image plane.
If two of the three intersection points approach infinity and the other one does not, then one of the three quadrilaterals must be a rectangle. As discussed in Section 2, if two vanishing points approach infinity, then the corresponding image quadrilateral is a parallelogram or a rectangle. If the parallelogram is to be interpreted as a rectangle in space, then the projection must be orthographic. If the projection is orthographic, then all rectangles in space are projected as parallelograms in the image. Thus, if two of the three intersection points approach infinity and their corresponding quadrilateral is not a rectangle, then the other one must also approach infinity. As described in Section 2, if a space rectangle is parallel to the image plane, then it is projected as an image rectangle under the perspective projection. If it is a face of a rectangular polyhedron, then the other two visible faces are projected as generic quadrilaterals, with the corresponding vanishing points not approaching infinity.
If all the three intersection points approach infinity, then the projection is orthographic. The condition for a rectangular polyhedron interpretation is that the following three equations have a common solution for c1, c2 and c3.
The condition for a solution of c1, c2 and c3 is that all the three inner products are negative; that is, all the three angles around the central corner are greater than 90 degrees. It is impossible that two of the three inner products are positive and the other one is negative, because otherwise one of the angles around the central corner would be greater than 180 degrees, reducing three quadrilaterals to two.
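The solvability condition can be illustrated with a short Python sketch. It rests on an assumption about the form of the three equations, which is not preserved in this text: each edge leaving the central corner is taken to have image vector (a_i, b_i) and unknown depth component c_i, and the three space vectors (a_i, b_i, c_i) are required to be mutually perpendicular, so that c_i c_j = −(a_i a_j + b_i b_j). The function name `box_depths` is hypothetical.

```python
import math

def box_depths(e1, e2, e3):
    """Depth components (c1, c2, c3) of three mutually perpendicular edges.

    e1, e2, e3 are the 2-D image vectors of the edges at the central corner
    under orthographic projection. Returns None when the three inner products
    are not all negative, i.e. when some angle is 90 degrees or less.
    The mirrored solution (-c1, -c2, -c3) is equally valid.
    """
    p12 = e1[0] * e2[0] + e1[1] * e2[1]
    p23 = e2[0] * e3[0] + e2[1] * e3[1]
    p31 = e3[0] * e1[0] + e3[1] * e1[1]
    if not (p12 < 0 and p23 < 0 and p31 < 0):
        return None
    # From c1*c2 = -p12, c2*c3 = -p23, c3*c1 = -p31:
    c1 = math.sqrt(-p12 * p31 / p23)
    c2 = math.sqrt(-p12 * p23 / p31)
    c3 = math.sqrt(-p23 * p31 / p12)
    return c1, c2, c3

# Orthographic image of the perpendicular edges (2, 0, 1), (-1, 2, 2), (-2, -5, 4).
depths = box_depths((2.0, 0.0), (-1.0, 2.0), (-2.0, -5.0))

# One angle of exactly 90 degrees: no mathematical solution exists.
no_solution = box_depths((1.0, 0.0), (0.0, 1.0), (-1.0, -1.0))
```

The two-fold sign ambiguity in the returned depths corresponds to the familiar depth reversal of a line-drawn box.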
While interpreting figures at the geometrical level is strictly governed by mathematics, it must possess a certain degree of flexibility at the perceptual level; otherwise it would not work on hand-drawn figures in a humanlike way [Sugihara, 1986]. Both the condition for a real object interpretation and the condition for a rectangular polyhedron interpretation derived at the geometrical level have to be changed. First, in a hand-drawn figure, it is too strict a condition that the three edges of each group, when extended, meet at a common point. Also, the parallel edges are only nearly parallel. Secondly, as given in Fig. 8, even when one of the three angles around the central corner is exactly 90 degrees, the figure is still perceived as a rectangular polyhedron (the interpretation is impossible mathematically). A feasible explanation for this fact is that the individual quadrilaterals are first interpreted separately (as rectangles) and then integrated as faces of a rectangular polyhedron in a less strict way than at the geometrical level. This kind of perception seems quite ubiquitous. When drawing a man or an animal, children usually put together a frontal view of the face and a side view of the body. But the flexibility has its limit; if one of the three angles around the central corner is less than 90 degrees, the figure is no longer perceived as a rectangular polyhedron (Fig. 9), but as an ordinary parallelepiped [Perkins, 1983; Kanade & Kender, 1983].
Lastly, although the rectangular polyhedron interpretation is mathematically correct if none of the three intersection points of the edge groups approaches infinity, it does not always agree with human perception if the orthocenter of the triangle formed by the three intersection points is far away from the figure itself.
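The role of the orthocenter can be made concrete with a small Python sketch. It relies on a standard fact of projective geometry rather than on a formula preserved in this text: for three mutually perpendicular edge groups, applying the perpendicularity condition of Section 2 to each pair of vanishing points forces the image center O to be the orthocenter of the vanishing-point triangle, after which f follows from any one pair. The function names are hypothetical.

```python
import math

def orthocenter(v1, v2, v3, eps=1e-12):
    """Orthocenter H of triangle v1 v2 v3.

    Solves (H - v1) . (v2 - v3) = 0 and (H - v2) . (v3 - v1) = 0 by Cramer's rule.
    """
    a11, a12 = v2[0] - v3[0], v2[1] - v3[1]
    a21, a22 = v3[0] - v1[0], v3[1] - v1[1]
    b1 = v1[0] * a11 + v1[1] * a12
    b2 = v2[0] * a21 + v2[1] * a22
    det = a11 * a22 - a12 * a21
    if abs(det) < eps:
        raise ValueError("degenerate triangle of intersection points")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

def camera_from_three_vanishing_points(v1, v2, v3):
    """Return (O, f), or None if no real focal length exists (hypothetical helper)."""
    O = orthocenter(v1, v2, v3)
    inner = (v1[0] - O[0]) * (v2[0] - O[0]) + (v1[1] - O[1]) * (v2[1] - O[1])
    if inner >= 0:
        return None  # O is not inside the circle with diameter v1 v2
    return O, math.sqrt(-inner)

# Vanishing points of the perpendicular directions (2, 0, 1), (-1, 2, 2), (-2, -5, 4)
# for a camera with image center (3, -2) and focal length 1.
V1, V2, V3 = (5.0, -2.0), (2.5, -1.0), (2.5, -3.25)
result = camera_from_three_vanishing_points(V1, V2, V3)
```

A perceptual check of the kind described above could then compare the distance from the returned O to the figure against the figure's size; the threshold for "far away" is not specified in the text and would have to be chosen empirically.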
4. The Gravity Regularity
Everything, including the perceiver itself, is attracted by gravity. As a consequence, objects must be supported by something. The support is usually perceived to be the ground, perpendicular to the direction of gravity, if no evidence indicates otherwise. The gravity regularity is generalized from this universal fact. Unlike the rectangularity regularity, it is a natural regularity objectively existing in the external world. So far it has attracted only a little attention. Kanade and Kender (1983) analyze skewed symmetry under gravity. Recently, Sedgwick (1987) reports a production system that generates an interpretation of the environment based on linear perspective information and contact relations between surfaces and the ground. Tsuji et al. (1986) also report a mobile robot that perceives and navigates in an indoor environment with a horizontal flat floor and objects standing vertically on the floor.
Fig. 9 If one of the three angles around the central corner is less than 90 degrees, then the figure is no longer perceived as a rectangular polyhedron.
To perceive the world is, in essence, to perceive the relations among the perceiver, the ground and the objects on the ground. By introducing the ground, the relation between the perceiver and the rectangles reduces to the sum of the relation between the perceiver and the ground, and the relation between the ground and the rectangles supported by it. All these relations can be described in either a viewer-centered representation or a world-centered representation based on the ground.
Let the normal of the ground plane be expressed by n_g in the viewer-centered coordinate system. n_g actually implies the relation between the perceiver and the ground. It is not difficult to find by introspection that we usually assume the following relation to the ground. Suppose that the camera is originally so set that the optical axis of the camera and the horizontal axis of the image plane are parallel to the ground, as humans look forward while keeping two eyes horizontal. Rotate the camera around the horizontal axis of the image plane by an angle of α (see Fig. 10), as humans look some feet ahead on to the road. Then the normal of the ground plane is projected to be upright onto the image under orthographic projection; i.e.,
$$n_g = (0, 1, -\tan\alpha). \qquad (12)$$
To keep n_g upright everywhere, which is desirable, it is necessary to assume the orthographic projection, if α ≠ 0.
Because of the planarity of the rectangle and the linearity of its sides, there exist only three kinds of contact relations between the ground and a rectangle: (1) the whole rectangle contacts the ground; (2) one of the four sides contacts the ground; and (3) only one of the four corners contacts the ground. When the whole rectangle contacts the ground, the orientation of the rectangle and that of the ground are identical. All the four sides are perpendicular to the normal of the ground. When only one side contacts the ground, that side is perpendicular to the normal of the ground. If the rectangle does not stand vertically, it is interpreted to be prevented from falling by something else behind it. When only one of the corners contacts the ground, it is most likely that we perceive the rectangle standing vertically.
Which contact relation is perceived is largely dependent on the assumption of the perceiver's posture. The full contact relation is perceived only if the upper and lower corners are obtuse angles much greater than 90 degrees (Fig. 11a), i.e., the rectangle is remarkably slanted towards the sky, because we are not used to looking downward (see [Stevens, 1981] for an analysis of the relation between the image angle of two orthogonal space vectors and the orientation of their outer product). The corner contact relation is perceived if one of the diagonals, of which the midpoint is the centroid, is vertical in the image, and the upper and lower corners are acute angles (Fig. 11b), again because of the posture assumed by the perceiver. The one side contact relation is perceived if the full contact and the one corner contact relations are not. The side that has the smaller angle to the horizontal axis is most likely perceived to contact the ground plane, because we prefer interpretations that are less slanted from the image plane (Fig. 11c, 11d).
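How the ground normal constrains a contacting side can be sketched as follows. This is an illustration under assumptions rather than the authors' procedure: projection is orthographic, the sign convention for the tilt angle follows n_g = (0, 1, −tan α) as given above, and the function names are hypothetical. A side lying on the ground with image direction (u, v) must be perpendicular to n_g, which fixes its depth component w through v − w tan α = 0.

```python
import math

def ground_normal(alpha):
    """Ground normal in viewer-centered coordinates for camera tilt alpha (radians)."""
    return (0.0, 1.0, -math.tan(alpha))

def lift_ground_side(u, v, alpha, eps=1e-12):
    """Space direction (u, v, w) of an image side assumed to contact the ground.

    Solves (u, v, w) . n_g = 0 for w. With alpha = 0 the ground is seen edge-on,
    so only horizontal image sides (v = 0) can lie on it and w is undetermined.
    """
    t = math.tan(alpha)
    if abs(t) < eps:
        raise ValueError("alpha = 0: depth of a ground side is not determined")
    return (u, v, v / t)

alpha = math.radians(45.0)      # assumed tilt, for illustration only
n_g = ground_normal(alpha)
side = lift_ground_side(1.0, 0.5, alpha)
inner = sum(s * g for s, g in zip(side, n_g))  # should vanish up to rounding
```

Once one side is lifted this way, the rectangularity regularity supplies the remaining constraint: the adjacent side must be perpendicular to it in space.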
We do not intend to claim the completeness of the
analysis, because perception of one contact relation is not
necessarily exclusive of another and the conditions are not
Before concluding this section, we have one point to note. The orthographic projection is necessary for the vertical-to-vertical mapping, but at the same time it does not exclude perspective projection for other purposes. The rectangles can still be projected as non-parallelograms. To perceive the local shape of an object, the perspective information is required; whereas to perceive the more global relation between the perceiver and the ground, the orthographic projection is required. It is of real interest to observe this human flexibility to switch between orthographic and perspective projections.
5. Conclusions
We have proposed the rectangularity regularity to be employed in the visual interpretation of quadrilaterals. By this subjective regularity resulting from the internal structure of human perception, quadrilaterals in the image are interpreted as rectangles in space, with the rectangle orientation and the focal point being completely determined or partially constrained. The projection is perceived at the same time. As the second part, we have examined interpreting quadrilaterals as faces of a rectangular polyhedron at both the geometrical level and the perceptual level. Interpretations at the two levels may be fairly different. Finally we have analyzed the relations among the perceiver (camera), the ground and the rectangles supported by the ground, and proposed the gravity regularity to derive constraints on the rectangle orientation. Through studying these seemingly simple quadrilaterals we have obtained some insights into the nature of the general problem of interpreting image contours: the rectangularity regularity is just a specific example of the perceptual system's preference for regular forms over irregular forms; while interpreting figures at the geometrical level strictly obeys mathematics, interpreting figures at the perceptual level is much more flexible; and the gravity regularity can be widely applied in different forms.
References
Barnard, S. (1983) Interpreting perspective images, Artificial Intelligence 21, pp. 435-462
Barnard, S. and Pentland, A. (1983) Three dimensional shape from line drawings, Proc. 8th Int. Joint Conf. on Artificial Intelligence, pp. 1061-1063
Barrow, H. and Tenenbaum, J. (1981) Interpreting line drawings as three-dimensional surfaces, Artificial Intelligence 17, pp. 75-116
Brady, M. and Yuille, A. (1984) An extremum principle for shape from contour, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 6, pp. 288-301
Hoffman, D. and Richards, W. (1986) Parts of recognition, From Pixels to Predicates, ed. by Pentland, A., Ablex
Huffman, D. (1971) Impossible objects as nonsense sentences, Machine Intelligence 6, ed. by Meltzer, B. and Michie, D., Edinburgh University Press
Kanade, T. and Kender, J. (1983) Mapping image properties into shape constraints: Skewed symmetry, affine-transformable patterns, and the shape-from-texture paradigm, Human and Machine Vision, ed. by Beck, Hope and Rosenfeld, Academic Press
Kanatani, K. (1986) The constraints on images of rectangular polyhedra, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 8, pp. 456-463
Kanizsa, G. (1979) Organization in Vision: Essays on Gestalt Perception, Praeger
Mackworth, A. (1973) Interpreting pictures of polyhedral scenes, Artificial Intelligence 4, pp. 121-137
Palmer, S. (1983) The psychology of perceptual organization: A transformational approach, Human and Machine Vision, ed. by Beck, Hope and Rosenfeld, Academic Press
Pentland, A. (1986) Introduction: From Pixels to Predicates, From Pixels to Predicates, ed. by Pentland, A., Ablex
Perkins, D. (1983) Why the human perceiver is a bad machine, Human and Machine Vision, ed. by Beck, Hope and Rosenfeld, Academic Press
Reuman, S. and Hoffman, D. (1986) Regularities of nature: The interpretation of visual motion, From Pixels to Predicates, ed. by Pentland, A., Ablex
Sedgwick, H. (1987) Layout2: A production system modeling visual perspective information, Proc. 1st Int. Conf. on Computer Vision, pp. 662-666
Stevens, K. (1981) The visual interpretation of surface contours, Artificial Intelligence 17, pp. 47-73
Stevens, K. (1986) Inferring shape from contours across surfaces, From Pixels to Predicates, ed. by Pentland, A., Ablex
Struik, D. (1961) Differential Geometry, Addison-Wesley
Sugihara, K. (1986) Machine Interpretation of Line Drawings, MIT Press
Tsuji, S., Zheng, J. and Asada, M. (1987) Stereo vision of a mobile robot: World constraints for image matching and interpretation, Proc. IEEE Int. Conf. Robotics and Automation, pp. 1194-1199
Turner, K. (1974) Computer perception of curved objects using a television camera, Ph.D. Dissertation, Edinburgh University
Ullman, S. (1979a) The Interpretation of Visual Motion, MIT Press
Ullman, S. (1979b) The interpretation of structure from motion, Proc. of the Royal Society, London, B203, pp. 405-426
Xu, G. and Tsuji, S. (1987a) Inferring surfaces from boundaries, Proc. 1st Int. Conf. on Computer Vision, pp. 716-720
Xu, G. and Tsuji, S. (1987b) Recovering surface shape from boundary, Proc. 10th Int. Joint Conf. on Artificial Intelligence, pp. 731-733