Scene Augmentation via the
Fusion of Industrial Drawings and Uncalibrated Images
With A View To Marker-less Calibration
Nassir Navab, Benedicte Bascle, Mirko Appel, and Echeyde Cubillo
Industrial Augmented Reality Group
Imaging & Visualization Department
Siemens Corporate Research, Princeton, NJ, USA
Abstract
The application presented in this paper is to augment
uncalibrated images of factories with industrial drawings.
Industrial drawings are among the most important
documents used during the lifetime of industrial
environments. They are the only common documents used
during design, installation, monitoring and control,
maintenance, update and finally dismantling of industrial
units. Leading traditional industries towards the full use
of virtual and augmented reality technology is impossible
unless industrial drawings are integrated into our
systems. Here we provide the missing link between
industrial drawings and digital images of industrial sites.
On the one hand, this could enable us to calibrate the
cameras and build the 3D model of the scene without
using any calibration markers. On the other hand, it
brings industrial drawings, floor maps, images and 3D
models into one unified framework. This provides a solid
foundation for building efficient enhanced virtual
industrial environments.
The augmented scene is obtained by perspective warping
of an industrial drawing of the factory onto its floor,
wherever the floor is visible. The visibility of the floor is
determined using probabilistic reasoning over a set of
cues including (1) floor color/intensity and (2) image
warping and differencing between an uncalibrated
stereoscopic image pair using the ground plane
homography. Experimental results illustrate the
approach.
1. Introduction
Image Augmentation enhances a user’s perception of the
real world. It displays information that is not readily
available in the scene. Therefore it can make a task easier
for the user to perform. The applications are numerous
[1]. In particular, industrial applications have been
described by several authors. For example, a user can
point at a part of an engine model and the augmented
reality (AR) system displays its name (AR group at
ECRC 1992-1996) [12]. A group at Boeing is developing
an AR system to guide a technician in building a wire
bundle for an airplane’s electrical system [2]. Feiner et al.
built a laser printer maintenance application [3]. It has
also been shown that AR can serve as a planning tool
for teleoperation of a robot [8][4].
In this article we present a new industrial application of
image augmentation. It consists of overlaying industrial
drawings on real images. The drawings can be 2D floor
plans, electrical wiring plans or any other type of
drawing related to the factory layout. In the remainder of
this paper we will use the terms “industrial drawings” and
“floor plans” interchangeably. This application of image
augmentation is useful for the following reasons:
• Industrial drawings contain vital structural
information about a factory. In addition, they are
often computerized and hyper-linked to various
databases of information about the factory. This
makes industrial drawings a primary tool in accessing
information about a factory. Therefore they are used
very frequently, for instance for inventory,
maintenance and planning. However, reading floor
plans requires some experience, and relating them to
reality is not straightforward. Our system makes this
task easier by augmenting real images of the factory
with the corresponding floor charts. These augmented
images provide an intuitive and visual tool for
navigation in the databases of the factory and thus for
factory management. The applications include
maintenance. For instance, a factory worker who
spots a defective pipe can load the augmented image
of the room, visually find the pipe in the image and
look up the inventory number of the pipe on the floor
below the pipe. From this number, additional
information about the pipe can also be drawn quickly
from the factory database. The problem can then be
reported precisely and concisely.
Fig. 1 - A view of the software interface
• The augmented images also provide an aid to
fiducial-free calibration of the cameras. Calibration is
a necessary first step to performing a 3D
reconstruction of the factory. Calibration is
traditionally achieved by using a calibration grid or
markers. However, putting markers into a factory
room and removing them afterwards is time-consuming.
It also requires a special trip to the
factory and cannot be achieved from already available
marker-free images of the factory. This is why
fiducial-free calibration is desirable. However,
calibrating a camera from a “natural scene”, the so-called
self-calibration, has a drawback. It does not
provide all the information necessary for full 3D
reconstruction of objects, but only for 3D
reconstruction up to a 3D affine or perspective
transformation. To obtain Euclidean reconstruction,
additional metric constraints are needed. In some
systems built by the computer vision community, the
user provides this by hand. In the industrial
environment, such information is readily available
from industrial drawings. We only need to relate this
to images. This is what our image augmentation
application achieves.
The outline of our approach for augmenting real images
with industrial drawings is detailed in section 2 and
illustrated by an example. Section 3 shows how the
homographic warping between an industrial drawing and
the factory floor plane in an image is calculated. Section 4
explains how the system segments the factory floor plane
into visible and occluded regions in the image. This
allows the system to superimpose the industrial drawing
only on the visible parts of the floor in the image. Section
5 shows experimental results on a set of images.
This image augmentation algorithm has been integrated
into an industrial system called CyliCon. The goal of this
system is to provide factories with tools for the calibration
of image sets, for the 3D reconstruction of factories, and
for industrial augmented reality applications. Some aspects
of this system have already been presented in previous
papers (see [9] and [10]). For instance, the system
provides tools for the 3D reconstruction of industrial
pipelines from calibrated images. Virtual pipes can be
added to the 3D factory model. The 3D VRML model can
be browsed in 3D. The images can be augmented with the
projections of the 3D model. To implement the
augmentation of images by industrial drawings, we have
taken advantage of existing functionality of the CyliCon
package, such as the browser for image databases and the
interface for image manipulation and feature input (points,
lines, cylinders).
Fig. 2 – The CAD drawing is warped to the current
camera viewpoint using the homography between the
CAD model and the floor plane in the image.
2. System Description
The outline of the system for augmenting images with
industrial drawings is the following:
• Figure 1 shows a view of the software interface. The
user has access to uncalibrated images of the factory.
He/she also has access to the top view and, if available,
side views of the industrial drawings of the same
factory. Note that the industrial drawings can come in
a variety of formats, including object-oriented CAD
files or simple bitmap scans of paper drawings.
• The user chooses an industrial drawing, and an image
on which to superimpose it.
To do this, the system must compute the planar
transformation, or homography, which maps the
industrial drawing onto the floor in the image. For an
uncalibrated image, this transformation is uniquely
determined by a minimum of 4 point or line
correspondences in general position. To get those
correspondences, the
system asks the user to identify a few corresponding
points or linear features that are present both in the
industrial drawing and in the image. From these
correspondences, the system estimates the warping
necessary to superimpose the industrial drawing onto
the floor of the factory in the image. The details of
the calculation are given in section 3. Figure 2 shows
an example of a CAD drawing warped to the floor of
the image seen in fig. 1.
• The image and the warped industrial drawing can
then be combined into a single view. However, doing
so by simple linear combination makes the objects that
occlude the floor look transparent (see fig. 3). For the
result to look realistic, the floor plan chart must be
overlaid only onto the visible parts of the floor, and
not onto other objects. In order to do this, we estimate
the probability that each pixel of the image belongs to
the floor. The calculation is described in section 4.
This results in a probability map (see fig. 4).
Fig. 3 – A simple fusion of the image and the warped
CAD drawing by linear combination. This makes the
objects that occlude the floor look impossibly
transparent.
• Using this probability mask, the factory image and the
industrial drawing warped onto the floor can be
combined non-linearly. Figure 5 shows an example of
the resulting augmented image.
Fig. 4 – Ground plane probability map constructed for
the image of Fig. 1: Brighter pixels belong to the floor
with a higher probability.
• Once this process has been done for one factory
image, augmenting other images of the factory can be
done in two ways. The first way is to repeat the
process, i.e. warp the industrial drawing to the floor of
the new image. The second way, which can be
automated, uses the transitivity of planar
transformations: the drawing already warped to the
floor of one image is warped to the next image using
the floor homography between the two images.
Fig. 5 – Intelligent fusion of the original image and the
CAD drawing, based on the floor probability map.
The augmented image looks more realistic than the
linear combination shown in fig. 3.
• If the factory images are not calibrated, then using the
metric information given by the industrial drawing
enables us to calibrate the cameras. These calibrated
images can then be used for as-built 3D reconstruction
of the industrial site.
• Augmenting factory images with industrial drawings
results in a real-time mapping between the two
documents. If the user draws a segment on the
industrial drawing, the system can immediately draw
the corresponding segment on the image, and vice versa.
This is illustrated by figure 6. This is an aid to
reading and updating floor maps. Planned additions
to the factory can also be visualized simultaneously
on the industrial drawing and the image.
• As discussed earlier, the mapping between an
industrial drawing and a camera image provides
metric information about the scene. This can help
calibrate the cameras without placing any fiducials in
the scene. In the absence of markers in the scene, the
current implementation uses point feature
correspondences between images and industrial
drawings.
• Once the images are calibrated, they can be
augmented, both by the industrial drawing and by the
3D reconstruction of the pipes (see [9] and [10] for
details). Figure 7 shows an example of double
augmentation. In this example, the 3D reconstruction
was obtained after camera calibration using markers
placed in the scene. Please note that the augmentation
by industrial drawing does not require calibration and
was done independently of the calibration. A
potential application of the double augmentation is to
check whether or not an old industrial drawing is up
to date with the current 3D layout of the factory. The
industrial drawing could then be updated if needed.
Another application is to check whether planned
changes to the factory are feasible or not. Figure 8
shows an example. To simulate the addition of a new
pipe, a virtual pipe is added to the 3D model of the
factory. Then it is projected into the images. The user
can then visually check for potential collisions of the
new pipes with existing pipes, both in the industrial
drawing and in the augmented image.
• Side views can also be used for scene augmentation
and are projected onto the walls of the factory.
Although no example of this is shown in this paper, it
is a straightforward extension of the algorithm. One
key advantage of projecting both the top and side
views of an industrial drawing into an image is that the
image provides a visual cue to the spatial
relationship between the side and top views, which is
often difficult to grasp. This will be added to our
system in the near future.
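The two-way propagation of user input and the transitivity of planar transformations mentioned in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `map_points` and `compose` and the numerical values of the two homographies are hypothetical and do not come from the CyliCon system.

```python
import numpy as np

def map_points(H, pts):
    """Map 2D points (one (x, y) pair per row) through a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    # Divide by the third coordinate to remove the unknown scale factor.
    return mapped[:, :2] / mapped[:, 2:3]

def compose(H_drawing_to_img1, H_img1_to_img2):
    """Drawing -> image 2, obtained by chaining the two floor homographies."""
    return np.asarray(H_img1_to_img2) @ np.asarray(H_drawing_to_img1)

# Hypothetical homographies, for illustration only.
H_d1 = np.array([[0.8, 0.1, 40.0], [0.0, 0.6, 25.0], [0.0, 0.001, 1.0]])
H_12 = np.array([[1.1, 0.0, -15.0], [0.05, 1.0, 4.0], [0.0005, 0.0, 1.0]])

# A segment drawn by the user on the industrial drawing.
segment_in_drawing = np.array([[10.0, 20.0], [60.0, 20.0]])
segment_in_img1 = map_points(H_d1, segment_in_drawing)
segment_in_img2 = map_points(compose(H_d1, H_12), segment_in_drawing)

# The inverse homography propagates input from the image back to the drawing.
back_in_drawing = map_points(np.linalg.inv(H_d1), segment_in_img1)
```

Because homographies compose by matrix multiplication, a second image only needs its floor homography to the first image, not a new set of correspondences with the drawing.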
Fig. 6 – The user’s interactions with the industrial drawing (for instance drawing 2 segments highlighting a pipe
with a given label in the industrial drawing) and the image are automatically propagated from one to the other,
using the homographic mapping of the ground plane.
Fig. 7 – Double augmentation of a real factory image
by virtual reality: the image is augmented both by an
industrial drawing and by the 3D reconstruction of
some of the pipes of the scene.
Fig. 8 – The double augmented image can be used to
plan modifications to the existing factory. In this
example, adding a new pipe (with 2 segments) to the
factory has been simulated. Such a simulation allows
the user to check for possible collisions between
existing pipes and planned add-ons.
3. Homographic warping between the floor of an
image and an industrial drawing
As mentioned in the previous section, our approach
requires that the industrial drawing is warped to the floor
of the current image. This is done using a planar
perspective transformation, also called a homography.
The mapping of the floor from one stereoscopic image to
the next is also represented by a homography.
Such transformations have been widely used in the
computer vision community to recover information about
3D structure. For instance, [11], [14], [16], [15], [13] and
[6] use homographic mappings of planar surfaces between
stereoscopic images to recover the 3D structure of a
scene. Ground plane transformations have also been used
for the purpose of obstacle detection [7].
The homographic mapping of the floor from one
stereoscopic image to another is represented by a matrix
$H_{3\times 3}$. If $(x_1, x_2)^T$ is a point on the floor of the first image,
and $(x_1', x_2')^T$ the corresponding point on the second image,
we have:
$$s \begin{pmatrix} x_1' \\ x_2' \\ 1 \end{pmatrix} = H_{3\times 3} \begin{pmatrix} x_1 \\ x_2 \\ 1 \end{pmatrix}.$$
A similar formula
describes the homographic mapping between an industrial
drawing and the floor of an image. In these formulae, s is
an unknown scale factor. Because of this scale factor, the
homography H only has 8 independent parameters.
Therefore, the homography H is uniquely determined by
four point correspondences between the two images,
provided that no three of the points are collinear.
Additional point correspondences can be used to obtain a
more accurate least-squares estimation. Note that line
correspondences can also be used to compute the
homography [16].
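As an illustration of the least-squares estimation mentioned above, the following sketch uses the standard direct linear transformation (DLT): each correspondence contributes two linear equations in the nine entries of H, and the solution is the right singular vector associated with the smallest singular value. The paper does not state which solver the system uses, so this is an assumption; the function name `estimate_homography` is hypothetical. For noisy data, normalizing the coordinates before solving usually improves conditioning; that step is omitted here for brevity.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H such that s * (x', y', 1)^T = H (x, y, 1)^T.

    src, dst: arrays of shape (N, 2) holding N >= 4 matched points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2 or len(src) < 4:
        raise ValueError("need at least four matched 2D points")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two equations per correspondence, obtained by eliminating s.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The null vector of A (or its least-squares approximation) gives H.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Fix the free scale; this assumes the bottom-right entry is non-zero.
    return H / H[2, 2]
```

With exactly four correspondences the system has a one-dimensional null space and the fit is exact; with more points the same code returns the algebraic least-squares solution.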
To determine the homography of the ground plane
between two camera images, the system queries the user
for at least four point correspondences between points on
the floor in the two images. Similarly, the user needs to
provide four point correspondences to completely define
the homographic warping of an industrial drawing to the
floor of an image.
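Once the homography is known, the warping itself can be sketched by inverse mapping: for every pixel of the output image, the inverse homography gives the location in the drawing from which to copy a value. The sketch below is a minimal nearest-neighbour version; the function name `warp_drawing` and the choice of nearest-neighbour sampling are assumptions made for illustration, not a description of the actual implementation.

```python
import numpy as np

def warp_drawing(drawing, H, out_shape):
    """Warp `drawing` into an image of shape out_shape = (rows, cols).

    H maps drawing coordinates (x, y) to image coordinates.
    Returns the warped drawing and a mask of pixels covered by it.
    """
    drawing = np.asarray(drawing)
    rows, cols = out_shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    targets = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)]).astype(float)
    # Inverse mapping: find where each output pixel comes from in the drawing.
    src = np.linalg.inv(H) @ targets
    w = src[2]
    finite = np.abs(w) > 1e-12          # skip points mapped to infinity
    safe_w = np.where(finite, w, 1.0)
    sx = np.rint(src[0] / safe_w)       # nearest-neighbour sampling
    sy = np.rint(src[1] / safe_w)
    valid = (finite & (sx >= 0) & (sx <= drawing.shape[1] - 1)
             & (sy >= 0) & (sy <= drawing.shape[0] - 1))
    warped = np.zeros((rows * cols,) + drawing.shape[2:], dtype=drawing.dtype)
    warped[valid] = drawing[sy[valid].astype(int), sx[valid].astype(int)]
    out = warped.reshape((rows, cols) + drawing.shape[2:])
    return out, valid.reshape(rows, cols)
```

Inverse mapping is used rather than pushing drawing pixels forward because it leaves no holes in the output; bilinear interpolation could replace the rounding step if smoother results are needed.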
4. Probabilistic segmentation of the ground plane
into visible and occluded regions
Once the industrial drawing has been warped onto the
factory image, the warped drawing needs to be
superimposed onto the image wherever the floor is visible.
This is done probabilistically. At each point of the image,
the probability that the point may correspond to the floor
is calculated. Then the floor chart is overlaid on the image
with more or less attenuation depending on this
probability, as illustrated in section 2 by fig. 4 and 5.
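The probability-weighted overlay can be illustrated with a per-pixel blend in which the floor probability acts as the opacity of the drawing. The exact blending function used by the system is not given in the text, so the linear weighting below and the function name `blend_with_floor_probability` are assumptions.

```python
import numpy as np

def blend_with_floor_probability(image, warped_drawing, floor_prob):
    """Per-pixel blend: out = p * drawing + (1 - p) * image.

    image, warped_drawing: arrays of identical shape (gray or color).
    floor_prob: 2D array of floor probabilities in [0, 1].
    """
    image = np.asarray(image, dtype=float)
    warped_drawing = np.asarray(warped_drawing, dtype=float)
    p = np.clip(np.asarray(floor_prob, dtype=float), 0.0, 1.0)
    if image.ndim == 3:
        # Broadcast the probability over the color channels.
        p = p[..., None]
    return p * warped_drawing + (1.0 - p) * image
```

Where the probability is close to zero the original image is left untouched, so objects that occlude the floor stay opaque.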
To estimate this probability, two approaches are
combined: ground plane segmentation and color or
intensity-based segmentation.
• Ground plane segmentation consists in calculating the
homographic mapping of the floor between two
stereoscopic images (see previous section for details).
This homography is used to map one image to the
other. The difference between the two should be zero
on the floor and non-zero everywhere else. Therefore
this value is an indicator of the probability that a
point belongs to the floor. However, this method is
not a hundred percent reliable. For instance, parts of
large homogeneous objects might get wrongly labeled
as floor. Also some parts of the floor that are visible
in one image but occluded in the other might not be
recognized as floor.
• That is why color or intensity-based segmentation is
used to validate or invalidate the information
provided by ground plane segmentation. Color-based
segmentation is useful in industrial environments
because they are often color-coded. However, there
are also often instances when the floor has a very
noisy hue and the best way to characterize it is by its
average gray level. This is what is done in the current
implementation.
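A minimal sketch of how the two cues above could be combined into a per-pixel floor score is given here. It assumes a Gaussian model for each cue: the gray level is compared with the mean and variance of a user-outlined floor patch, and the ground-plane difference is compared with zero. The function name `floor_probability`, the parameter `sigma_d` and the small variance floor are illustrative assumptions, and the difference image is assumed to have been computed beforehand as a scalar magnitude per pixel.

```python
import numpy as np

def floor_probability(gray, diff, floor_rect, sigma_d=1.0):
    """Combine the gray-level cue and the ground-plane difference cue.

    gray:       2D array, gray level of the image to segment.
    diff:       2D array, magnitude of the difference between the image
                and the second image warped by the floor homography.
    floor_rect: (row0, row1, col0, col1), a "typical" floor patch.
    sigma_d:    assumed standard deviation of `diff` on the floor.
    """
    gray = np.asarray(gray, dtype=float)
    diff = np.asarray(diff, dtype=float)
    r0, r1, c0, c1 = floor_rect
    patch = gray[r0:r1, c0:c1]
    mean_g = patch.mean()
    var_g = max(patch.var(), 1e-6)  # guard against a perfectly flat patch
    # Gray-level cue: high when the pixel looks like the floor patch.
    p_gray = np.exp(-(gray - mean_g) ** 2 / (2.0 * var_g))
    # Ground-plane cue: high when the two views agree after warping.
    p_diff = np.exp(-diff ** 2 / (2.0 * sigma_d ** 2))
    return p_gray * p_diff
```

Multiplying the two terms means a pixel is kept as floor only when both cues agree, which is how the intensity cue can invalidate false positives of the ground-plane cue on large homogeneous objects.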
The implementation is as follows:
• Let $I_1$ be the factory image that must be segmented into
floor regions (noted $\Phi$) and non-floor regions, in order to
correctly superimpose an industrial drawing on it. In
general, $I_1$ is a color image.
• Let $I_2$ be a second image of the factory, taken from a
different viewpoint (and also a color image).
• The user is asked to click at least four points on the
factory floor in image $I_1$, and the corresponding points
in image $I_2$. These correspondences are then used to
estimate the floor-to-floor (or ground plane) homography
$H_\Phi$ between the two images. The calculation is done as
discussed in section 3. Using $H_\Phi$, image $I_2$ is warped to
image $I_1$. Then the difference image
$$I_d = \tilde{I}_1 - H_\Phi(\tilde{I}_2)$$
is calculated in the (R, G, B) color space. The ~ denotes
that, in order to calculate $I_d$, $I_1$
and $I_2$ are normalized with respect to lighting changes by
subtracting the local average intensity (or gray level) from
the intensity component of the HSI representation of the
color at each pixel.
• The user is also asked to outline a rectangular region
of the floor in image $I_1$. This region should be
“typical” of the floor appearance in the image. From this,
the mean gray level of the floor $\bar{G}_1$ and its variance
$\sigma_{G_1}^2$ are estimated. These will be used for intensity-based
segmentation of the floor.
• Given a point $x \in I_1$, the probability that it belongs to
the floor can be written as a conditional probability
$P(x \in \Phi \mid G_1(x), I_d(x))$ dependent on the measures
$G_1(x)$ and $I_d(x)$. Since the facts that $x \in \Phi$ and
$x \notin \Phi$ are mutually exclusive, the floor probability can
be rewritten as follows:
$$P(x \in \Phi \mid G_1(x), I_d(x)) = \frac{P(G_1(x), I_d(x) \mid x \in \Phi) \cdot P(x \in \Phi)}{P(G_1(x), I_d(x) \mid x \in \Phi) \cdot P(x \in \Phi) + P(G_1(x), I_d(x) \mid x \notin \Phi) \cdot P(x \notin \Phi)}$$
• Since no a-priori knowledge is available about the
density of floor pixels in the images, the a-priori
probabilities of the floor and non-floor points are set as
follows:
$$P(x \in \Phi) = P(x \notin \Phi) = 0.5$$
Neglecting the normalizing denominator, this results in:
$$P(x \in \Phi \mid G_1(x), I_d(x)) = P(G_1(x), I_d(x) \mid x \in \Phi).$$
• In first approximation, we assume that the measurements
$G_1(x)$ and $I_d(x)$ are conditionally independent, i.e.
$$P(G_1(x), I_d(x) \mid x \in \Phi) = P(G_1(x) \mid x \in \Phi) \cdot P(I_d(x) \mid x \in \Phi)$$
This assumption is reasonable in view of the fact that
$G_1(x)$ measures the image gray level at pixel $x$, whereas
$I_d(x)$ measures the difference of colors between
$\tilde{I}_1(x)$ and $\tilde{I}_2(H_\Phi^{-1}(x))$, independently of the local
average gray level. From this, we have:
$$P(x \in \Phi \mid G_1(x), I_d(x)) = P(G_1(x) \mid x \in \Phi) \cdot P(I_d(x) \mid x \in \Phi)$$
• We assume a Gaussian distribution for $G_1(x)$ around
its mean $\bar{G}_1$. We also assume a Gaussian distribution for
$I_d(x)$ around the zero value that it should have for floor
pixels. The floor probability then becomes:
$$P(x \in \Phi \mid G_1(x), I_d(x)) = \exp\left(-\frac{(G_1(x) - \bar{G}_1)^2}{2\sigma_{G_1}^2}\right) \cdot \exp\left(-\frac{I_d^2(x)}{2}\right)$$
This formula shows how ground plane segmentation and
intensity-based segmentation cooperate to estimate the
floor probability.
• Estimating $P(x \in \Phi \mid G_1(x), I_d(x))$ at each point of
image $I_1$ gives the system a floor probability map. The
map is then despeckled to eliminate noisy points.
Morphological dilation is also applied to fill in gaps. An
example of the resulting probability map is shown by fig.
4 in section 2.
• Using this, the warped industrial drawing and the
factory image can be combined at each point
proportionally to the probability that the point lies on the
floor.
5. Experimental Results
Fig. 9 presents a set of images overlaid with the factory
top view drawing. Fig. 10 shows some of these images
with 3D reconstructed pipes superimposed.
6. Conclusion
This article presents an industrial system for the
augmentation of factory images by industrial drawings.
Using this system, floor plan charts are overlaid onto real
images by perspective warping and proportionally to the
probability that a point belongs to the floor. The algorithm
does not require prior calibration of the cameras. The
result is an easy-to-grasp and easy-to-use multi-modality
description of the factory. Applications include camera
calibration, 3D as-built reconstruction, maintenance
assistance, and factory update.
7. References
[1] R.T. Azuma, “A Survey of Augmented Reality”, In
Presence: Teleoperators and Virtual Environments 6, 4,
p. 355-385, August 97.
[2] Boeing, WWW page =
http://esto.sysplan.com/ESTO/Displays/HMDTDS/Factsheets/Boeing.html, July 94.
[3] S. Feiner, B. McIntyre, D. Seligmann, “Knowledge-based
Augmented Reality”, In Communic. of the ACM
36, 7, p. 52-62, July 93.
[4] W.S. Kim, “Virtual Reality Calibration and
Preview/Predictive Displays for Telerobotics”, In
Presence: Teleoperators and Virtual Environments 5, 2,
p. 173-190, Spring 96.
[5] G. Klinker, D. Stricker and D. Reiners, “Augmented
Reality: A Balancing Act between High Quality and
Real-time Constraints”, In Proc. of Mixed Reality Workshop,
February 99, Yokohama, Japan.
[6] R. Kumar, P. Anandan, and K. Hanna, “Direct
recovery of shape from multiple views: a parallax based
approach”, In Proc. 10th Int'l Conf. Pattern Recog., Israel,
October 1994.
[7] F. Li and M. Brady, “Modelling the Ground Plane
Transformation for Real-time Obstacle Detection”, To
appear in the international journal of Computer Vision
and Image Understanding.
[8] P. Milgram, S. Zhai, S. Drascic, J.J. Grodski,
“Applications of Augmented Reality for Human-Robot
Communication”, In Proc. of Int. Conf. on Intelligent
Robotics and Systems, p. 1467-1472, Japan, July 1993.
[9] N. Navab, N. Craft, S. Bauer and A. Bani-Hashemi,
“CyliCon: Software package for 3D reconstruction of industrial
pipelines”, Proceedings of IEEE Workshop on Applications of
Computer Vision, October 1998, Princeton, NJ, USA.
[10] N. Navab, E. Cubillo and N. Craft, “Pipeline
Designer: Industrial Augmented Reality for Design and
Update”, In Proc. of Mixed Reality Workshop, February
99, Yokohama, Japan.
[11] N. Navab and S. Mann, “Recovery of Relative Affine
Structure Using the Motion Flow Field of a Rigid Planar
Patch”, Mustererkennung 1994, Tagungsband. Informatik
Xpress 5. Pages 186-196. Vienna, Austria, September
1994.
[12] E. Rose, D. Breen, K. Ahlers, C. Crampton, M.
Tuceryan, R. Whitaker, D. Greer, “Annotating Real-World
Objects Using Augmented Reality”, In Proc. of
Computer Graphics Intern. ’95, Leeds, UK, p. 357-370,
June 95.
[13] H.S. Sawhney, “3d geometry from planar parallax”,
In Proc. IEEE Conf. Comput. Vision Pattern Recog., pages
929-934, Seattle, Washington, June 1994.
[14] A. Shashua and N. Navab, “Relative Affine Structure:
Canonical Model for 3D from 2D Geometry and
Applications”, IEEE Transactions on Pattern Analysis and
Machine Intelligence (PAMI), September 1996.
[15] T. Vieville, C. Zeller, L. Robert, “Recovering motion
and structure from a set of planar patches in an
uncalibrated image sequence”, In Proc. of ICPR’94,
Jerusalem, Israel, Oct. 94.
[16] Zhengyou Zhang, “A Flexible New Technique for
Camera Calibration”, Technical Report MSR-TR-98-71,
December 1998, Microsoft Research.
Fig. 9 – A set of factory images before (left) and after
(right) augmentation by the factory floor plan chart.
Fig. 10 – Image augmented by industrial drawing and
3D reconstructed pipes