
Real-Time Video View Morphing (sketches_0201)
This sketch describes a real-time virtual video camera application
based on view morphing. This system takes video input from
multiple cameras aimed at the same subject from different angles.
After performing real-time pattern matching, the system generates
synthetic views for a virtual camera that can pan between any two
real views. The approach of this sketch differs from the more
common “depth from stereo” method for generating virtual views
in that it does not attempt to reconstruct the 3D structure of the
original scene. Instead, it takes two 2D images and directly
generates the 2D output by performing only planar operations. At
the heart of the system are algorithms and data structures that
support the fast inter-image correlation needed for the completely
automated, real-time view morphing.
Overview
The overall process consists of the following phases:
i. Capture synchronized images
ii. Segment out the subject
iii. Establish a correlation between corresponding parts of the images
iv. View morph and display.
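The four phases above form a per-frame loop. The following is a minimal, runnable Python sketch of that loop; the function names are hypothetical, plain nested lists stand in for camera frames, and each phase is stubbed with a trivial placeholder (the original sketch publishes no code):

```python
# Hypothetical sketch of the four-phase loop; the real system replaces
# these stubs with the NCP and extended MIP-MAP machinery described below.

def capture_synchronized_images(left_cam, right_cam):
    # Phase i: grab one frame from each camera at the same instant.
    return left_cam(), right_cam()

def segment_subject(image):
    # Phase ii: placeholder segmentation -- keep pixels above a threshold.
    return [[px if px > 16 else 0 for px in row] for row in image]

def correlate(left, right):
    # Phase iii: placeholder correlation -- pair up scan lines by index.
    return list(zip(left, right))

def view_morph(pairs, alpha):
    # Phase iv: blend corresponding scan lines; alpha=0.5 is the halfway view.
    return [[(1 - alpha) * a + alpha * b for a, b in zip(la, lb)]
            for la, lb in pairs]

def virtual_camera_frame(left_cam, right_cam, alpha=0.5):
    l, r = capture_synchronized_images(left_cam, right_cam)
    pairs = correlate(segment_subject(l), segment_subject(r))
    return view_morph(pairs, alpha)
```

For example, morphing two toy 2x2 "cameras" halfway: `virtual_camera_frame(lambda: [[100, 0], [40, 200]], lambda: [[60, 0], [80, 100]], 0.5)` averages the corresponding segmented pixels.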
Of the four phases above, only the third is not straightforward.
Establishing the correspondence between the two images involves
identifying identical regions. Realistic-looking view morphing has
been demonstrated in the past by using manual identification of
the inter-image correspondences [Seitz and Dyer 1996]. To meet
the needs of real-time streaming video, a completely automated
solution that runs on the order of tens of milliseconds is needed.
Normalized Cylindrical Projection
Exploiting the properties of epipolar geometry, the images of the
two data streams (Figs. A and C) are first reprojected so that scan-line
algorithms can be used to find correlations. Blindly trying to
establish correspondences between two views separated by a
significant angle is difficult due to the large difference in
orientation (see Fig. B). To simplify the matching of similar
regions, the two images are first transformed by a simple
geometric operation. Working from the assumption that a cross
section of the human head is closer to a circle than to a straight line,
each scan line is distorted as if it were first projected onto a
semicircle and then laid flat. A second step normalizes the scan
lines so that they fill a rectangular grid. Fig. D shows a typical
image after this “Normalized Cylindrical Projection” (NCP)
transformation has been performed. The effectiveness of the NCP
in improving the starting point for matching is seen in Fig. E. In
this view-morphed image, the original images were simply
transformed by the NCP, shifted linearly in proportion to the
angle between the cameras, and then transformed from NCP
space back to normal image space.
Extended MIP-MAPs
The NCP transformation is just the starting point from which to
perform a correspondence analysis between the two images.
Performing pattern matching on the full-resolution images is too
costly to meet the time constraint. A multi-resolution approach
was taken, and a data structure was developed that extends the
MIP-MAP [Williams 1983]. Standard MIP-MAPs do not produce
good results when the pixels to be rendered are not scaled by
similar amounts in both the X and Y directions. To overcome this
limitation, we created an extended MIP-MAP object that
contained a full composition of reduced images at all power-of-two
X and Y resolutions. This provided for the quick rendering of
accurate images that are severely reduced in the Y dimension
while being stretched in the X direction. Fig. F is an example of
the extended MIP-MAP operation applied to the results of an
NCP transformation. The extended MIP-MAP can be constructed
in linear time with respect to the number of pixels.
Conclusion
Fig. G shows a typical generated view at the angle halfway
between the two real views. Using the data structures described in
this sketch, the real-time matching and view morphing of two
video streams runs at 18 frames per second on a 1.2 GHz
Athlon PC.
Fig. A. Left view. B. Naive combined view. C. Right view.
D. Normalized Cylindrical Projection. E. Intermediate view
morph. F. Extended MIP-MAP. G. Final view morph.
References
SEITZ, S., AND DYER, C. 1996. View Morphing. In Proceedings of
ACM SIGGRAPH 96, 21-30.
WILLIAMS, L. 1983. Pyramidal Parametrics. In Computer Graphics
(Proceedings of ACM SIGGRAPH 83), vol. 17, no. 3, 1-11.
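The sketch gives no formulas for the NCP, but the description above (project each scan line onto a semicircle, lay it flat, then normalize to a rectangular grid) suggests an arc-length resampling applied per scan line. A minimal Python sketch under that assumption follows; all names are hypothetical, and for simplicity the whole scan line stands in for the segmented subject span:

```python
import math

def ncp_scanline(line, width_out=None):
    # One plausible NCP warp for a single scan line (an assumption; the
    # original sketch publishes no formulas).  The visible span is treated
    # as the orthographic image of a semicircular cross section; resampling
    # by arc length "lays the semicircle flat", and emitting a fixed-width
    # output normalizes every scan line to a rectangular grid.
    n = len(line)                          # input width (assumed >= 2)
    m = width_out or n                     # normalized output width
    out = []
    for j in range(m):
        s = 2.0 * j / (m - 1) - 1.0        # unrolled (arc-length) coord in [-1, 1]
        u = math.sin(s * math.pi / 2.0)    # back to flat image coord in [-1, 1]
        x = (u + 1.0) / 2.0 * (n - 1)      # fractional source pixel position
        i0 = int(math.floor(x))
        i1 = min(i0 + 1, n - 1)
        t = x - i0
        out.append((1 - t) * line[i0] + t * line[i1])  # linear resample
    return out

def ncp_image(img, width_out=None):
    # Apply the warp independently to every scan line.
    return [ncp_scanline(row, width_out) for row in img]
```

Because `sin` changes fastest near the ends of the span, this stretches the edges of the subject (the parts seen most obliquely) and compresses the center, which is what makes the two NCP images line up well enough for scan-line matching.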
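The extended MIP-MAP construction can also be sketched in a few lines. This is an illustrative Python version, not the original implementation: it assumes power-of-two image dimensions and plain nested lists for images. Reducing X and Y independently yields a level for every (1/2^i, 1/2^j) resolution pair, and since each level is one halving pass away from a neighbor, total work is bounded by W*H * (sum 1/2^i) * (sum 1/2^j) < 4*W*H, i.e. linear in the pixel count:

```python
def halve_x(img):
    # Average horizontally adjacent pixel pairs (width assumed even).
    return [[(row[2 * i] + row[2 * i + 1]) / 2 for i in range(len(row) // 2)]
            for row in img]

def halve_y(img):
    # Average vertically adjacent pixel pairs (height assumed even).
    return [[(a + b) / 2 for a, b in zip(r0, r1)]
            for r0, r1 in zip(img[0::2], img[1::2])]

def extended_mipmap(img):
    # Build reductions at ALL power-of-two X and Y resolutions, indexed
    # by (x_level, y_level).  Unlike a standard MIP-MAP, X and Y are
    # reduced independently, so e.g. full X resolution at 1/8 Y
    # resolution is available -- the case needed after an NCP warp.
    h, w = len(img), len(img[0])           # dimensions assumed powers of two
    levels = {(0, 0): img}
    i = 0
    while (w >> i) > 1:                    # reduce along X at full Y
        levels[(i + 1, 0)] = halve_x(levels[(i, 0)])
        i += 1
    for (i, _) in list(levels):            # then reduce each X level along Y
        j = 0
        while (h >> j) > 1:
            levels[(i, j + 1)] = halve_y(levels[(i, j)])
            j += 1
    return levels
```

At render time the level whose X and Y reduction factors independently best match the target scaling can be selected, which is why this structure handles the "severely reduced in Y while stretched in X" case that defeats a standard square-footprint MIP-MAP.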
Karl Timm – University of Illinois at Chicago
GE Medical Systems, [email protected]