DETECTION AND REMOVAL OF BINOCULAR LUSTER IN COMPRESSED 3D IMAGES
Can Bal
Ankit K. Jain
Truong Q. Nguyen
University of California, San Diego. ECE Dept. La Jolla, CA 92092 USA
ABSTRACT
Binocular luster is an extremely salient effect seen in 3D when
an object in each stereo image exhibits a different contrast polarity relative to the background. The object appears to shimmer, a phenomenon seen in nature and on 3D displays, which
the Human Visual System rapidly detects. This binocular luster is also induced by compression of stereo imagery where
corresponding blocks are quantized to different values. In this
paper, we discuss the psychovisual background of binocular
luster, introduce the “shine” artifact induced by compression,
and present an algorithm for detection and removal of shine
from JPEG compressed stereo images while preserving the
original bitrate.
Index Terms— 3D, stereo, artifact, metric, compression.
1. INTRODUCTION
Binocular luster is an effect seen when an object appears
darker in one eye’s view and lighter in the other eye’s view,
relative to the background [1]. The object, when fused in 3D,
appears to have a shiny, lustrous appearance. This is a naturally occurring phenomenon, where the incident light from
the sun illuminates an object which strongly reflects light
into one eye but not the other. Binocular luster is primarily
responsible for the shiny appearance of metallic objects such
as cars [2].
The luster effect is also seen on 3D displays when contrast differences exist between the two images of a stereo pair.
For instance, luster may occur as an annoying artifact in compressed stereo images, improperly synthesized virtual images,
or as a result of any processing algorithm that does not specifically account for it. The detection of binocular luster by the
human eye is very rapid, making the luster artifact extremely
salient. In fact, the detection of binocular luster by the visual system is the most powerful process for any exclusively
binocular task [2].
Thus, binocular luster must be accounted for in many areas of stereo image processing. However, to our knowledge,
this paper is the first to report detection or processing methods for binocular luster. Our work has important implications
for 3D saliency detection, compression, enhancement, quality assessment, and view synthesis. The paper is organized as
follows: in Section 2, we discuss 3D artifact studies and some
works in the field of Psychology; in Section 3, we discuss the
cause of binocular luster with respect to compression; in Section 4, we introduce our algorithm for detecting luster and our
algorithm to remove it; and we conclude in Section 5.
2. RELATED WORK
Binocular luster has long been known in psychology but has remained relatively unstudied [2]. Recently, the authors of [3]
measured luster as a function of duration, size, and luminance
differences, and Ludwig et al. [4] reported the temporal integration of binocular luster stimuli. However, to our knowledge, there has been only one previous study in the image
processing community that recognizes binocular luster as an
image artifact [5]. In this work, the authors investigate the
effect of luster on the overall viewing comfort of 3D displays. They define luster in terms of a color difference between corresponding portions of a stereo image, and refer to
it as “metallic luster”. They induce metallic luster by decreasing the color depth of the image, which coarsely quantizes the
colors and causes rivalry between the two views. The results
in [5] indicate that luster is one of the most important factors
that impact viewing comfort.
Other work has been done in artifact analysis for 3D images and video, such as the effect of JPEG compression of
stereo images on 3D image quality [6], and subjective tests
for quality of 2D+depth compressed videos [7]. The effect of
Multiview Video Coding (MVC) compression of views and
depth maps is investigated in [8] specifically for view synthesis. Similarly, [9] analyzes the effect of compression on view
synthesis. These works do not address binocular luster and
would be enhanced by its consideration.
3. BINOCULAR LUSTER
Human 3D vision is enabled by seeing slightly different views
of the same scene. The so-called cyclopean image is then
formed in the brain by fusing corresponding objects from the
two views. However, it is possible to force the eyes to see two
distinct images, which results in a sensation called “binocular
rivalry” [1]. Under this condition, the viewer perceives the
two images alternating between one another. One such stimulus occurs when the fused regions are bounded by contours of
opposite contrast, or when the two views differ substantially
in luminance [1, 3]. When this happens, the rivalry produces
a region of variable brightness, and this effect is known as
“binocular luster” in the psychology literature. An illustration
demonstrating this effect is shown in Fig. 1. In this example,
the smaller square is lighter than the background in both of
the views and no rivalry occurs. However, the larger square
has a different contrast polarity in each view, which causes
it to appear as if it is shining.
Fig. 1. Illustration of binocular luster.
In 3D image processing, binocular luster may appear as a compression or processing artifact. In block-based compression, corresponding blocks in a stereo pair can sometimes be quantized to different levels. Depending on the regions surrounding these blocks, they may have opposing contrast polarities in the two views, which causes binocular luster. In view synthesis, when occluded regions are erroneously filled, the luster effect can be observed when the synthesized views are fused with one another. These mismatches are immediately noticeable to the viewer: the affected region appears to light up, and thus we call it the “shine artifact”.
The shine artifact due to compression of 3D images occurs under coarse compression, when blocking artifacts start to appear. In this paper, we focus on binocular luster caused by JPEG compression of grayscale stereo images and introduce novel algorithms for detecting and treating the shine artifact. The implementation of such algorithms is made difficult by the fact that luster has been investigated on uniform surrounds, whereas in natural images this is not usually the case. It is still an open question whether luster is due to local contrast or to the space-averaged contrast with the surround. Further, how people perceive contrast in color is not well understood, which is why we treat grayscale images in this study. Nevertheless, we employ the contrast polarity model in our algorithms and achieve good performance.
4. DETECTION AND REMOVAL OF SHINE ARTIFACT
We first present an algorithm to detect the shine artifact in JPEG images, which is based on the low-level psychological understanding of binocular luster. Then, a processing algorithm is presented to reduce the shine artifact by adjusting the luminances of certain blocks in one image. The motivation is to detect these blocks at the encoder side and to process the offending blocks before transmission. This allows us to use the disparity map for detection and does not require any processing at the decoder side. By adjusting the block luminances in increments of the JPEG quantization step size, we ensure that the processing changes are not erased by quantization and that the bitrate is preserved. For our experiments, we use the Middlebury Stereo Database [10].
4.1. Detection
Our shine artifact detection algorithm can be used for visual quality assessment of stereo images as well as for processing the artifact. The block diagram of our detection algorithm is shown in Fig. 2.
[Block diagram. Inputs: Disparity Map, Compressed Left View, Compressed Right View. Stages: Block Level Disparity Estimation, Block Level View Synthesis, Contrast Polarity Calculation, Find Opposing Polarities, Median Filter. Output: Shine Detection.]
Fig. 2. Block diagram of shine artifact detection algorithm.
The algorithm is based on the contrast polarity model used to explain binocular luster and therefore requires knowledge of the block correspondences between the views of a stereo pair. The disparity map corresponding to one of the stereo images is converted to a block-level disparity map by dividing it into blocks and taking the median of all the disparity values within each block. Using the block-level disparity map, the blocks are mapped onto the other view.
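The block-level reduction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `block_disparity` is hypothetical, the map is a plain list of rows, and both dimensions are assumed to be multiples of the block size.

```python
from statistics import median

def block_disparity(disparity, block=8):
    """Reduce a per-pixel disparity map (list of rows) to one value per block.

    Each output value is the median of all disparities inside the
    corresponding block x block tile. Assumes both image dimensions
    are multiples of `block`.
    """
    rows, cols = len(disparity), len(disparity[0])
    out = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            tile = [disparity[r][c]
                    for r in range(r0, r0 + block)
                    for c in range(c0, c0 + block)]
            out_row.append(median(tile))
        out.append(out_row)
    return out

# 8x16 map: left tile is 4 everywhere except one outlier, right tile is 10.
demo = [[4] * 8 + [10] * 8 for _ in range(8)]
demo[0][0] = 60  # a single outlier does not move the median
block_map = block_disparity(demo)
```

Taking the median rather than the mean keeps a few mismatched disparity values inside a block from shifting the block correspondence.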
The contrast polarities are then calculated for the synthesized view and the corresponding original view along the block boundaries. For horizontally neighboring blocks, each row of pixels spanning the two blocks (JPEG uses 8×8 blocks, which gives 16-pixel-long rows) is weighted by f[k], the filter given in (1), and the resulting products are summed to yield the contrast polarity for that row. For vertically neighboring blocks, the same process is repeated with f[k]^T.

$$
f[k] =
\begin{cases}
\alpha^{|k-\beta|} & \text{if } 0 \le k \le \beta \\
-\alpha^{|k-\beta|} & \text{if } -\beta \le k < 0 \\
0 & \text{otherwise}
\end{cases}
\tag{1}
$$
The design of f[k] is based on the observation that pixels along block boundaries contribute more to the shine artifact. We set α = 0.8 empirically and chose β = 4 so that no pixel value is used in the contrast polarity calculation of more than one boundary, while no information available to the detector is disregarded.
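To illustrate how a signed, boundary-weighted sum yields a contrast polarity, the sketch below uses a simplified antisymmetric weighting: magnitude α^d, where d is the distance in pixels from the boundary, negative on one side and positive on the other. This weighting is an assumption chosen so that a flat row gives zero polarity; it is a stand-in for the filter in (1), not a transcription of it, and the names `boundary_weights` and `row_polarity` are hypothetical.

```python
def boundary_weights(alpha=0.8, beta=4):
    """Antisymmetric weights for the 2*beta pixels straddling a block boundary.

    ASSUMPTION: the weight magnitude is alpha**d, with d = 0 for the pixel
    touching the boundary; pixels before the boundary get a negative sign.
    """
    after = [alpha ** d for d in range(beta)]   # d = 0 .. beta-1
    before = [-w for w in reversed(after)]      # mirrored and negated
    return before + after

def row_polarity(row, weights):
    """Weighted sum over the pixels centred on the boundary of a 16-pixel row."""
    half = len(weights) // 2
    mid = len(row) // 2                         # boundary lies between mid-1 and mid
    window = row[mid - half: mid + half]
    return sum(w * p for w, p in zip(weights, window))

weights = boundary_weights()
dark_to_light = [50] * 8 + [150] * 8            # left block darker than right block
light_to_dark = [150] * 8 + [50] * 8            # left block lighter than right block
p_a = row_polarity(dark_to_light, weights)      # positive polarity
p_b = row_polarity(light_to_dark, weights)      # negative polarity
```

If the same boundary produced `p_a` in one view and `p_b` in the other, the signs would oppose, which is the condition the contrast polarity model associates with luster.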
Fig. 3. Shine detection using the algorithm in Fig. 2: (a) right compressed view; (b) left compressed view; (c) right compressed view with shine detection. Figs. 3(a)–3(b) are set up for crosseye viewing.
The contrast polarity calculations around the block boundaries form the contrast polarity maps. The sign of each value in a map indicates the polarity, and the magnitude indicates the strength of the contrast at that pixel location. We compare the contrast polarity maps of the synthesized view and the corresponding original view to find the boundaries that have opposing polarities. We form a detection map, which equals the difference between the two contrast polarity maps wherever the polarities are opposite and is zero elsewhere. Finally, we apply a spatial 1D median filter of length 9 (block size + 1) along the vertical and horizontal directions of the detection map to remove false detections. This ensures that the detected shine is consistent along the boundaries of a block.
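The comparison and filtering steps described above can be sketched in one dimension, along a single line of boundary samples. The code is illustrative only: the function names are hypothetical, the sign of the difference (first map minus second) and the zero padding at the ends of the median filter are assumptions.

```python
from statistics import median

def detection_values(pol_a, pol_b):
    """Difference of two polarity sequences where their signs oppose, else 0."""
    return [a - b if a * b < 0 else 0 for a, b in zip(pol_a, pol_b)]

def median_filter_1d(values, length=9):
    """Sliding median of odd length; the ends are zero padded (an assumption)."""
    half = length // 2
    padded = [0] * half + list(values) + [0] * half
    return [median(padded[i:i + length]) for i in range(len(values))]

# Eight consecutive opposing samples (one full block edge), then one isolated one.
pol_a = [5] * 8 + [0] * 8 + [5] + [0] * 7
pol_b = [-5] * 8 + [0] * 8 + [-5] + [0] * 7
raw = detection_values(pol_a, pol_b)
filtered = median_filter_1d(raw)
```

In this toy input the run of eight opposing samples survives the length-9 filter, while the isolated sample is suppressed, which matches the stated purpose of keeping only detections that are consistent along a block edge.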
Results show that the proposed algorithm can detect blocks that exhibit a significant amount of shine and yields very few false detections when used with lightly compressed images. A detection example is shown in Fig. 3. We have circled some visually apparent shining blocks on the compressed images for comparison with the shine detection.
4.2. Removal
Using the result of Section 4.1, we can process the stereo images in order to remove the shine artifacts. Based on the contrast polarity model, we need to process only one of the two stereo images to ensure that the contrast polarities of corresponding blocks are not in opposition. As a convention, we generate the detections for, and process, the right image. The algorithm works in an iterative manner; the block diagram of one iteration is depicted in Fig. 4.
[Block diagram. Blocks: Input Image, Shine Detector, Block Selection, Luminance Adjustment, Shine Detector, Comparison, Output Image.]
Fig. 4. Block diagram of an iteration of the shine artifact removal algorithm.
To initialize the algorithm, the detected shine on the right stereo image is used to decide which blocks to process. The luminance of these blocks is adjusted to remove the shine. However, for weaker detections, where it is unclear how to adjust the luminance, the shine is not always removed. Therefore, to check whether the artifact has been removed, the processed image is passed back into the detector. In the final stage, the detections on the processed image are compared to the initial detections, and for the blocks that do not exhibit less shine, the luminance values are restored to their original values.
To choose which blocks to process, a boolean map of detections is generated around the block boundaries by finding the nonzero values of the detection map. Next, the number of nonzero elements of the boolean map is calculated for each block, providing a measure of the strength of detection. If this number exceeds a threshold, we process that block. The threshold is chosen by considering a strong detection, which would have 1’s along three border edges of the 8 × 8 block. The sum over such a border would be 22 (three edges of 8 pixels each, with the two shared corner pixels counted once), but to account for imperfect detections, we lower the threshold to 18.
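The block selection rule can be sketched as a per-block count of nonzero detections. This is a minimal illustration, assuming that detections are stored at the border pixels of the block they belong to and that the map dimensions are multiples of the block size; `select_blocks` is a hypothetical name.

```python
def select_blocks(detection_map, block=8, threshold=18):
    """Return (block_row, block_col) pairs whose nonzero count exceeds threshold."""
    rows, cols = len(detection_map), len(detection_map[0])
    selected = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            count = sum(1
                        for r in range(r0, r0 + block)
                        for c in range(c0, c0 + block)
                        if detection_map[r][c] != 0)
            if count > threshold:
                selected.append((r0 // block, c0 // block))
    return selected

# Two neighboring 8x8 blocks in an 8x16 detection map.
demo = [[0] * 16 for _ in range(8)]
for i in range(8):
    demo[0][i] = 1       # top edge of the left block
    demo[7][i] = 1       # bottom edge of the left block
    demo[i][0] = 1       # left edge of the left block
for i in range(8, 16):
    demo[0][i] = 1       # only the top edge of the right block
chosen = select_blocks(demo)
```

The left block accumulates 22 nonzero entries and is selected; the right block has only 8 and is left untouched.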
Matching the contrast polarity of corresponding blocks
involves adjusting only the luminance value, or the DC coefficient, of the detected blocks. Since JPEG quantizes DCT
coefficients using a fixed quantization table, we calculate the
quantization step size for the DC coefficient.
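As a rough illustration of the step size calculation, the sketch below assumes the encoder scales the example luminance table of the JPEG standard (DC entry 16) with the quality convention used by libjpeg; the paper does not state which table or quality setting was used, so both are assumptions. Because the DC coefficient of an 8 × 8 DCT block equals 8 times the block mean, one quantization step of size q shifts the mean luminance of the block by q/8.

```python
def dc_quant_step(quality, base_dc=16):
    """DC quantization step for an IJG-style quality setting in 1..100.

    ASSUMPTION: the base table entry is scaled the way libjpeg scales it.
    """
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    step = (base_dc * scale + 50) // 100
    return max(1, min(255, step))      # baseline JPEG keeps entries in 1..255

def mean_shift_per_step(step, block=8):
    """Change in block-mean luminance when the quantized DC moves by one level."""
    return step / block

step_q50 = dc_quant_step(50)                      # 16
shift_q50 = mean_shift_per_step(step_q50)         # 2.0 gray levels
```

Adjusting the DC coefficient in whole multiples of this step means the modified value is already a valid quantized level, so requantization leaves it unchanged.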
The DC coefficient of a block to be processed is increased or decreased by one step size per iteration, which allows us to gradually converge to the best luminance value. In addition, the altered DC values do not change the overall bitrate significantly, since they are increased for some blocks and decreased for others. The direction of the adjustment is sometimes ambiguous due to weak detections or cases the model does not explain. Therefore, we always increase or always decrease within an iteration and alternate the direction between iterations. Since we check after each iteration whether an adjustment has reduced the shine, this scheme does not have an adverse effect on the final result. The artifact removal process stops if an iteration does not change the total amount of shine in the image or when the number of iterations exceeds 20. In our tests, all the test images converged in fewer than 8 iterations.
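The iteration described above can be sketched as a loop that adjusts, re-detects, and reverts. This is a sketch under stated assumptions rather than the authors' code: `shine_of` is a hypothetical stand-in for re-running the detector and returning one nonnegative shine value per block, blocks are held in a dictionary, and the loop stops after one unchanged pass in each direction, which is one possible reading of how the stopping rule interacts with the alternating direction.

```python
def remove_shine(dc, shine_of, selected, step, max_iter=20):
    """Nudge the DC values of selected blocks by one step per pass.

    dc       : dict mapping block id -> DC value (a modified copy is returned)
    shine_of : callable(dc) -> dict of block id -> shine strength (hypothetical)
    selected : block ids chosen for processing
    step     : DC quantization step size
    """
    dc = dict(dc)                       # work on a copy
    direction = 1                       # +1 raises DC this pass, -1 lowers it
    unchanged_passes = 0
    for _ in range(max_iter):
        before = shine_of(dc)
        trial = dict(dc)
        for b in selected:
            trial[b] += direction * step
        after = shine_of(trial)
        for b in selected:
            if after[b] < before[b]:    # keep only adjustments that reduce shine
                dc[b] = trial[b]
        total_before = sum(before.values())
        total_after = sum(shine_of(dc).values())
        unchanged_passes = unchanged_passes + 1 if total_after == total_before else 0
        if unchanged_passes == 2:       # neither direction helped any block
            break
        direction = -direction          # alternate the direction between passes
    return dc

# Toy detector: the shine of a block is its distance from a target DC value.
target = {"a": 32, "b": 32}

def toy_shine(dc):
    return {b: abs(dc[b] - target[b]) for b in dc}

result = remove_shine({"a": 0, "b": 64}, toy_shine, ["a", "b"], step=16)
```

In the toy run, block "a" needs its DC raised and block "b" needs it lowered; alternating the direction lets both converge, and adjustments in the wrong direction are reverted because they do not reduce the measured shine.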
Our method using this model gives visually pleasing results with minimal change to the input image. Fig. 5 shows the compressed left and processed right images for the detections in Fig. 3(c). For more results of the detection and the removal algorithms, please refer to
http://videoprocessing.ucsd.edu/~canbal.
Fig. 5. Compressed stereo pair, processed for shine removal: (a) right processed view; (b) left compressed view; (c) right compressed view with shine detection. Figs. 5(a)–5(b) are set up for crosseye viewing.
5. CONCLUSION
In this paper we present an algorithm for detecting and removing the
shine artifact from JPEG compressed stereo images without introducing additional bitrate. For this we focused on calculating shine
along the block boundaries and removing the artifact by adjusting
the luminances of the blocks. Our method removes the artifacts with
significant shine without a major effect on the bitrate of the encoded
JPEG images.
The shine artifact is not limited to compression and may appear as a
result of processing, enhancement or view synthesis. In these cases,
it is not trivial to detect shine automatically as the effect of texture
and color on binocular luster is still not well understood. Since it
is not known how the eye interprets contrast across a nonuniform
background, a better model of contrast would improve detection and
processing results.
Therefore, one of our plans for the future is to conduct low-level
psychological tests to develop a better model of binocular luster.
We aim to extend this work to color images and H.264 compressed
videos, and investigate how motion affects the visibility of luster.
In addition, we plan to generate an image/video dataset that includes
various types and levels of shine. This database will be paired with
subjective assessments indicating which blocks or regions of the images exhibit shine. We also plan to use our findings in 3D metric development and in developing a model for saliency in stereo.
6. REFERENCES
[1] I. P. Howard, Seeing in Depth. I. Porteous, 2002.
[2] C. Tyler, “Binocular vision,” in Duane’s Foundations of Clinical Ophthalmology, vol. 2, W. Tasman and E. Jaeger, Eds. Philadelphia: J.B. Lippincott Co., 2004.
[3] M. Formankiewicz and J. Mollon, “The psychophysics of detecting binocular discrepancies of luminance,” Vision Research, vol. 49, no. 15, pp. 1929–1938, 2009.
[4] I. Ludwig, W. Pieper, and H. Lachnit, “Temporal integration of monocular images separated in time: Stereopsis, stereoacuity, and binocular luster,” Perception & Psychophysics, vol. 69, pp. 92–102, 2007.
[5] F. L. Kooi and A. Toet, “Visual comfort of binocular and 3-D displays,” in Encyclopedia of Optical Engineering, 2004, pp. 1–14.
[6] P. Seuntiens, L. Meesters, and W. IJsselsteijn, “Perceived quality of compressed stereoscopic images: Effects of symmetric and asymmetric JPEG coding and camera separation,” ACM Trans. Appl. Percept., vol. 3, no. 2, pp. 95–109, 2006.
[7] C. Hewage, S. Worrall, S. Dogan, S. Villette, and A. Kondoz, “Quality evaluation of color plus depth map-based stereoscopic video,” IEEE Journal of Selected Topics in Signal Processing, vol. 3, no. 2, pp. 304–318, Apr. 2009.
[8] K. Klimaszewski, K. Wegner, and M. Domanski, “Distortions of synthesized views caused by compression of views and depth maps,” in 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, 2009, pp. 1–4.
[9] P. Merkle, Y. Morvan, A. Smolic, D. Farin, K. Muller, P. de With, and T. Wiegand, “The effect of depth compression on multiview rendering quality,” in 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video, 2008, pp. 245–248.
[10] D. Scharstein and C. Pal, “Learning conditional random fields for stereo,” in IEEE Conference on Computer Vision and Pattern Recognition, June 2007, pp. 1–8.